TOP ACTIVATIONS
MAX = 0.097

a colorful party with all her friends , and everyone was
He was brave and gave Sue some gum . They both
around his neck , giving himself a giant bear hug .
and colourful dresses with bows and rib bons . Every day
was very happy and gave Brown ie a big hug .
Tom loved Sara and gave her good food , toys and
pond and I will give you a bath ." So ,
bright room with all of her toys . One day ,
ate tasty sandwiches and drank sweet lemon ade . After their
all day and ate y ummy food . The sun was
a blanket and spread y ummy foods on it .
bottles that their mom gave them . The perfume bottles were
with dolls and ate y ummy cookies . When it was
her doll Lily and gave her a pink dress and a
with his cat and give her food . His cat is
two glasses and handed Tim my one . The juice looked
, Lily wanted to give Brown ie a big hug .
flowers . We can give them to people we love ,"
man who always gave Tim my a l oll ipop after
with lots of pretty dresses and suits . Tim my used

MIN ACTIVATIONS
MIN = -0.103

a loud noise from the living room . He ran to
noise coming from a nearby alley way . Ell a was
noise and came down the stair . She saw Lily and
Mom my coming into the living room . "
Mom my coming into the living room . "
loud noise came from the living room . Billy was scared
a big box in the living room . He thought it
They go back to the living room and pick up the
noise and came to the living room . They saw the
lively sound coming from the living room . She followed the
Tom went back to the living room and played with their
hear Mom coming down the stairs . They try to hide
to go back to the living room and do some yoga
a noise coming from the living room . She wanted to
mom my coming from the living room . She quickly got
't touch anything !" the mill er says . He slams
their mom came into the living room with a tray of
and comes back to the living room . She sees Ben
loud noise came from the living room . It was Daddy
one she met in the alley that day . <|endoftext|> Once

INTERVAL 0.047 - 0.097
CONTAINS 1.633%

and then open it with your fingers . Don 't let
scared . She even showed him that there was nothing under
asked her to clean up her toys and put them away
and wrapped his arms around him . " Be
world , so he left his home early in the morning

INTERVAL -0.003 - 0.047
CONTAINS 53.352%

day on anytime Gabe looked at the rock it made him
. They were happy and full . <|endoftext|> Tom my loved
he found the right spell . He said the spell out
building her creation ! It looked very pretty in a bright
it hurt his ears . The bunny flew low

INTERVAL -0.053 - -0.003
CONTAINS 44.529%

a red bow tie . He thought he looked very handsome
and the little girl was enjoying her very modern picnic with
bank and stretched out in the sun . She looked so
“ Wow , Bernie ! ” shouted one of the
the flower . But when she touched it , she felt

INTERVAL -0.103 - -0.053
CONTAINS 0.485%

. He crawled into the hole and disappeared .
my new calendar on the wall ?" he asked , trying
They crawled through the hole and entered the temple .
. She was very bored living in the small field near
But then the wind began to blow , and the
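
The four INTERVAL headings above look like equal-width bins spanning the reported MIN and MAX. A minimal sketch of that reading follows. It assumes four equal bins and rounding to three decimals; the helper name `interval_edges` is hypothetical and is not taken from whatever tool produced this dump.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced bin edges from lo to hi, rounded to 3 decimals."""
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


# MIN and MAX as reported in the headers of this dump.
edges = interval_edges(-0.103, 0.097)

# CONTAINS percentages copied from the four INTERVAL headings, lowest interval first.
shares = [0.485, 44.529, 53.352, 1.633]

for low, high, share in zip(edges[:-1], edges[1:], shares):
    print(f"INTERVAL {low:.3f} - {high:.3f} CONTAINS {share:.3f}%")

# The listed shares should account for nearly all token positions.
total_share = sum(shares)
print(f"total = {total_share:.3f}%")
```

Under these assumptions the edges match the interval bounds listed above, and the shares add up to just under 100%. The small shortfall is consistent with each percentage being rounded to three decimals.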
Token a
Token ID257
Feature activation+3.77e-02

Pos logit contributions
ouses+0.046
 fries+0.043
 veggies+0.042
heddar+0.042
affles+0.042
Neg logit contributions
h-0.020
 never-0.019
pe-0.017
 clim-0.017
't-0.016
Token colorful
Token ID20239
Feature activation+2.34e-02

Pos logit contributions
ouses+0.070
 veggies+0.068
 fries+0.068
affles+0.061
 groceries+0.060
Neg logit contributions
h-0.061
 clim-0.061
 hill-0.060
pe-0.060
 poster-0.059
Token party
Token ID2151
Feature activation-5.18e-03

Pos logit contributions
 fries+0.039
ouses+0.039
 veggies+0.039
 meals+0.035
 groceries+0.034
Neg logit contributions
pe-0.044
stack-0.043
h-0.042
 clim-0.040
roads-0.039
Token with
Token ID351
Feature activation+3.97e-02

Pos logit contributions
pe+0.009
stack+0.008
 collapse+0.008
h+0.008
roads+0.008
Neg logit contributions
ouses-0.010
 fries-0.010
 veggies-0.010
 meals-0.010
 spaghetti-0.009
Token all
Token ID477
Feature activation+2.37e-02

Pos logit contributions
ouses+0.086
 fries+0.079
 veggies+0.078
ory+0.076
affles+0.076
Neg logit contributions
h-0.053
 never-0.047
 clim-0.046
pe-0.046
 hon-0.045
Token her
Token ID607
Feature activation+9.68e-02

Pos logit contributions
 fries+0.049
ouses+0.048
 veggies+0.048
 meals+0.045
 groceries+0.044
Neg logit contributions
pe-0.036
stack-0.035
h-0.035
 collapse-0.032
 clim-0.032
Token friends
Token ID2460
Feature activation+4.45e-02

Pos logit contributions
 fries+0.196
urger+0.183
heddar+0.182
ouses+0.181
uggets+0.181
Neg logit contributions
 slide-0.117
 only-0.115
 switch-0.115
 hill-0.113
,-0.113
Token,
Token ID11
Feature activation+3.40e-02

Pos logit contributions
 fries+0.084
ouses+0.084
 veggies+0.083
 meals+0.077
 spaghetti+0.076
Neg logit contributions
pe-0.075
stack-0.073
 collapse-0.072
h-0.070
roads-0.069
Token and
Token ID290
Feature activation+4.07e-02

Pos logit contributions
ouses+0.074
 fries+0.072
 veggies+0.072
 meals+0.067
affles+0.066
Neg logit contributions
stack-0.046
pe-0.046
h-0.043
 clim-0.042
 hill-0.041
Token everyone
Token ID2506
Feature activation+2.58e-02

Pos logit contributions
 fries+0.085
ouses+0.084
 veggies+0.084
 meals+0.078
 groceries+0.077
Neg logit contributions
pe-0.061
stack-0.059
h-0.058
 collapse-0.054
roads-0.053
Token was
Token ID373
Feature activation-1.21e-02

Pos logit contributions
 fries+0.051
ouses+0.051
 veggies+0.050
 meals+0.047
 groceries+0.046
Neg logit contributions
pe-0.041
stack-0.040
h-0.038
 collapse-0.037
roads-0.037
Token He
Token ID679
Feature activation+1.14e-02

Pos logit contributions
 fries+0.045
ouses+0.044
 veggies+0.044
 meals+0.041
 groceries+0.041
Neg logit contributions
pe-0.032
stack-0.031
h-0.030
 collapse-0.028
 clim-0.028
Token was
Token ID373
Feature activation+4.46e-03

Pos logit contributions
ouses+0.028
 fries+0.027
 groceries+0.027
 meals+0.026
 veggies+0.026
Neg logit contributions
h-0.013
pe-0.012
 never-0.012
 clim-0.011
stack-0.011
Token brave
Token ID14802
Feature activation+1.46e-02

Pos logit contributions
 fries+0.009
ouses+0.009
 veggies+0.009
 meals+0.008
 groceries+0.008
Neg logit contributions
pe-0.007
stack-0.007
h-0.007
 clim-0.007
 collapse-0.007
Token and
Token ID290
Feature activation+3.35e-02

Pos logit contributions
 fries+0.028
 veggies+0.028
ouses+0.028
 meals+0.026
 groceries+0.026
Neg logit contributions
stack-0.025
pe-0.025
h-0.023
 collapse-0.022
roads-0.022
Token gave
Token ID2921
Feature activation+4.23e-02

Pos logit contributions
ouses+0.068
 fries+0.066
 veggies+0.066
 meals+0.063
 groceries+0.062
Neg logit contributions
pe-0.050
h-0.049
stack-0.048
 clim-0.046
 collapse-0.045
Token Sue
Token ID26089
Feature activation+9.51e-02

Pos logit contributions
ouses+0.110
 fries+0.109
 veggies+0.105
 meals+0.101
affles+0.100
Neg logit contributions
h-0.044
pe-0.043
stack-0.040
 clim-0.038
 collapse-0.037
Token some
Token ID617
Feature activation+7.34e-02

Pos logit contributions
ouses+0.250
 fries+0.243
affles+0.230
 veggies+0.220
ory+0.220
Neg logit contributions
 clim-0.074
 never-0.071
 slide-0.071
pe-0.070
h-0.069
Token gum
Token ID27142
Feature activation+4.21e-02

Pos logit contributions
ouses+0.143
uggets+0.127
urger+0.127
pieces+0.118
 journals+0.118
Neg logit contributions
,-0.108
 be-0.107
 hit-0.106
 wob-0.104
 push-0.102
Token.
Token ID13
Feature activation+1.99e-02

Pos logit contributions
 fries+0.103
 veggies+0.102
ouses+0.101
 meals+0.097
affles+0.096
Neg logit contributions
pe-0.046
h-0.043
stack-0.039
 clim-0.037
oh-0.033
Token They
Token ID1119
Feature activation+1.29e-02

Pos logit contributions
 fries+0.038
ouses+0.038
 veggies+0.037
 meals+0.035
 groceries+0.034
Neg logit contributions
pe-0.033
stack-0.032
h-0.032
 collapse-0.030
roads-0.030
Token both
Token ID1111
Feature activation+2.93e-02

Pos logit contributions
ouses+0.029
 fries+0.028
 veggies+0.028
 meals+0.027
 groceries+0.027
Neg logit contributions
h-0.017
stack-0.016
pe-0.016
 never-0.015
 clim-0.015
Token around
Token ID1088
Feature activation+1.40e-02

Pos logit contributions
 fries+0.134
 veggies+0.127
ouses+0.125
affles+0.120
 meals+0.118
Neg logit contributions
pe-0.079
h-0.077
stack-0.074
 clim-0.068
roads-0.067
Token his
Token ID465
Feature activation+4.88e-02

Pos logit contributions
ouses+0.034
 fries+0.034
 veggies+0.032
affles+0.031
heddar+0.031
Neg logit contributions
pe-0.016
h-0.015
stack-0.015
 hill-0.013
 crash-0.013
Token neck
Token ID7393
Feature activation+2.52e-02

Pos logit contributions
 fries+0.106
ouses+0.102
affles+0.099
imes+0.098
ory+0.097
Neg logit contributions
pe-0.071
stack-0.069
 hill-0.068
 bridge-0.064
h-0.059
Token,
Token ID11
Feature activation+2.81e-02

Pos logit contributions
 fries+0.056
 veggies+0.055
ouses+0.054
 meals+0.051
affles+0.051
Neg logit contributions
pe-0.035
h-0.033
stack-0.033
roads-0.029
 clim-0.029
Token giving
Token ID3501
Feature activation+5.08e-02

Pos logit contributions
 fries+0.057
 veggies+0.057
ouses+0.056
 meals+0.052
 groceries+0.052
Neg logit contributions
pe-0.044
stack-0.042
h-0.042
 collapse-0.039
 clim-0.039
Token himself
Token ID2241
Feature activation+9.34e-02

Pos logit contributions
ouses+0.143
affles+0.137
 fries+0.135
heddar+0.134
ittens+0.130
Neg logit contributions
h-0.031
 never-0.031
pe-0.027
 hon-0.021
 reverse-0.021
Token a
Token ID257
Feature activation+5.57e-02

Pos logit contributions
 fries+0.201
ouses+0.194
 veggies+0.192
affles+0.189
heddar+0.183
Neg logit contributions
pe-0.117
 clim-0.110
h-0.104
stack-0.101
 never-0.097
Token giant
Token ID6175
Feature activation+2.78e-02

Pos logit contributions
 fries+0.106
 veggies+0.106
ouses+0.103
affles+0.101
uggets+0.098
Neg logit contributions
h-0.089
 flap-0.087
 switch-0.087
pe-0.085
 crash-0.084
Token bear
Token ID6842
Feature activation+4.56e-02

Pos logit contributions
ouses+0.051
 fries+0.051
 veggies+0.050
uggets+0.047
affles+0.047
Neg logit contributions
pe-0.046
stack-0.044
 clim-0.043
h-0.043
 bridge-0.042
Token hug
Token ID16225
Feature activation+5.75e-02

Pos logit contributions
 veggies+0.097
 fries+0.097
ouses+0.096
 meals+0.090
 groceries+0.089
Neg logit contributions
pe-0.065
stack-0.063
h-0.060
 clim-0.058
roads-0.054
Token.
Token ID13
Feature activation+1.89e-02

Pos logit contributions
ouses+0.129
 fries+0.129
 veggies+0.128
 meals+0.119
affles+0.118
Neg logit contributions
pe-0.077
h-0.073
stack-0.073
 clim-0.065
roads-0.064
Token and
Token ID290
Feature activation+7.11e-02

Pos logit contributions
 fries+0.127
ouses+0.119
uggets+0.119
heddar+0.118
 veggies+0.116
Neg logit contributions
h-0.069
 switch-0.062
 flash-0.057
pe-0.057
 flap-0.056
Token colourful
Token ID44799
Feature activation+5.18e-02

Pos logit contributions
 fries+0.147
 veggies+0.138
heddar+0.136
 meals+0.134
uggets+0.133
Neg logit contributions
h-0.111
pe-0.100
 never-0.099
 switch-0.093
 hon-0.091
Token dresses
Token ID27309
Feature activation+3.29e-02

Pos logit contributions
 fries+0.096
heddar+0.092
uggets+0.089
 veggies+0.089
ory+0.086
Neg logit contributions
h-0.086
 switch-0.080
pe-0.079
 flap-0.073
 hon-0.073
Token with
Token ID351
Feature activation+4.66e-02

Pos logit contributions
 fries+0.071
 veggies+0.070
ouses+0.069
 meals+0.065
 groceries+0.064
Neg logit contributions
pe-0.048
stack-0.046
h-0.046
 clim-0.042
roads-0.041
Token bows
Token ID37469
Feature activation+5.12e-02

Pos logit contributions
ouses+0.101
heddar+0.099
 fries+0.099
affles+0.097
urger+0.097
Neg logit contributions
h-0.056
 switch-0.052
 hon-0.050
 never-0.050
 zo-0.048
Token and
Token ID290
Feature activation+9.31e-02

Pos logit contributions
 fries+0.114
 veggies+0.112
ouses+0.109
 meals+0.106
 groceries+0.103
Neg logit contributions
pe-0.072
h-0.069
stack-0.068
 clim-0.062
roads-0.059
Token rib
Token ID12183
Feature activation+4.08e-02

Pos logit contributions
uggets+0.160
 fries+0.159
heddar+0.159
affles+0.157
 veggies+0.157
Neg logit contributions
 hon-0.146
pe-0.145
 zo-0.143
h-0.143
 flap-0.142
Tokenbons
Token ID23461
Feature activation+4.38e-02

Pos logit contributions
 fries+0.081
 veggies+0.077
ouses+0.076
 spaghetti+0.075
 groceries+0.074
Neg logit contributions
h-0.067
pe-0.066
 never-0.057
esh-0.057
 hon-0.056
Token.
Token ID13
Feature activation+3.46e-02

Pos logit contributions
 fries+0.089
ouses+0.088
 veggies+0.087
 meals+0.081
 groceries+0.081
Neg logit contributions
pe-0.067
stack-0.066
h-0.064
 collapse-0.061
roads-0.060
Token Every
Token ID3887
Feature activation-1.61e-02

Pos logit contributions
 fries+0.089
ouses+0.086
 veggies+0.084
 meals+0.080
urger+0.079
Neg logit contributions
stack-0.037
h-0.037
pe-0.037
 clim-0.032
 hill-0.032
Token day
Token ID1110
Feature activation-6.70e-03

Pos logit contributions
stack+0.032
pe+0.032
 collapse+0.029
roads+0.028
h+0.028
Neg logit contributions
 fries-0.025
 veggies-0.025
ouses-0.025
 meals-0.024
 spaghetti-0.024
Token was
Token ID373
Feature activation-3.79e-03

Pos logit contributions
 fries+0.019
ouses+0.019
 veggies+0.019
 groceries+0.018
 meals+0.018
Neg logit contributions
pe-0.012
stack-0.012
h-0.012
 clim-0.011
 collapse-0.011
Token very
Token ID845
Feature activation+1.09e-02

Pos logit contributions
pe+0.007
stack+0.007
h+0.006
roads+0.006
 collapse+0.006
Neg logit contributions
 fries-0.007
ouses-0.007
 veggies-0.007
 meals-0.006
 spaghetti-0.006
Token happy
Token ID3772
Feature activation+1.47e-02

Pos logit contributions
 fries+0.018
ouses+0.018
 veggies+0.018
 meals+0.016
 groceries+0.016
Neg logit contributions
pe-0.021
stack-0.021
h-0.020
 collapse-0.020
roads-0.020
Token and
Token ID290
Feature activation+2.84e-02

Pos logit contributions
 fries+0.029
 veggies+0.028
ouses+0.028
 meals+0.026
 groceries+0.026
Neg logit contributions
pe-0.024
stack-0.024
h-0.023
 collapse-0.022
roads-0.022
Token gave
Token ID2921
Feature activation+3.93e-02

Pos logit contributions
ouses+0.056
 fries+0.056
 veggies+0.055
 groceries+0.052
 meals+0.051
Neg logit contributions
pe-0.044
stack-0.044
h-0.043
 collapse-0.040
 clim-0.040
Token Brown
Token ID4373
Feature activation+9.26e-02

Pos logit contributions
 fries+0.092
ouses+0.091
 veggies+0.088
affles+0.087
 meals+0.083
Neg logit contributions
pe-0.050
stack-0.048
h-0.047
 clim-0.044
 collapse-0.042
Tokenie
Token ID494
Feature activation+8.23e-02

Pos logit contributions
 fries+0.240
ouses+0.227
 veggies+0.224
affles+0.221
 meals+0.217
Neg logit contributions
h-0.090
pe-0.087
man-0.073
oh-0.071
 clim-0.070
Token a
Token ID257
Feature activation+5.10e-02

Pos logit contributions
ouses+0.216
affles+0.210
 fries+0.209
ittens+0.205
 veggies+0.200
Neg logit contributions
 clim-0.060
 never-0.058
pe-0.056
 flap-0.056
h-0.051
Token big
Token ID1263
Feature activation+1.15e-02

Pos logit contributions
ouses+0.101
 veggies+0.100
 fries+0.098
affles+0.097
uggets+0.092
Neg logit contributions
 switch-0.076
 flap-0.074
h-0.073
pe-0.073
stack-0.072
Token hug
Token ID16225
Feature activation+4.24e-02

Pos logit contributions
 fries+0.020
 veggies+0.019
ouses+0.019
 spaghetti+0.018
 meals+0.018
Neg logit contributions
pe-0.022
stack-0.021
h-0.020
roads-0.020
 collapse-0.020
Token.
Token ID13
Feature activation+2.38e-02

Pos logit contributions
 fries+0.090
ouses+0.090
 veggies+0.090
 meals+0.083
 groceries+0.083
Neg logit contributions
pe-0.061
stack-0.060
h-0.058
 clim-0.054
 collapse-0.054
Token Tom
Token ID4186
Feature activation+2.07e-02

Pos logit contributions
 fries+0.018
ouses+0.018
 veggies+0.018
 meals+0.017
 groceries+0.017
Neg logit contributions
pe-0.010
stack-0.010
h-0.009
 collapse-0.009
 clim-0.009
Token loved
Token ID6151
Feature activation+1.53e-02

Pos logit contributions
 fries+0.041
 veggies+0.040
ouses+0.040
 meals+0.037
 groceries+0.037
Neg logit contributions
pe-0.033
stack-0.032
h-0.032
 collapse-0.030
roads-0.030
Token Sara
Token ID24799
Feature activation+5.35e-02

Pos logit contributions
ouses+0.034
 fries+0.031
 veggies+0.030
ory+0.029
affles+0.028
Neg logit contributions
pe-0.021
stack-0.021
h-0.021
 clim-0.020
 hill-0.019
Token and
Token ID290
Feature activation+4.30e-02

Pos logit contributions
ouses+0.129
 fries+0.128
 veggies+0.125
affles+0.119
 groceries+0.118
Neg logit contributions
pe-0.066
h-0.057
stack-0.055
 clim-0.054
 never-0.052
Token gave
Token ID2921
Feature activation+4.11e-02

Pos logit contributions
 fries+0.089
 veggies+0.086
ouses+0.084
 meals+0.080
 spaghetti+0.079
Neg logit contributions
pe-0.068
stack-0.066
h-0.064
 collapse-0.062
roads-0.061
Token her
Token ID607
Feature activation+9.26e-02

Pos logit contributions
ouses+0.106
 veggies+0.097
 fries+0.097
affles+0.095
heddar+0.092
Neg logit contributions
h-0.040
pe-0.039
stack-0.037
 never-0.036
 clim-0.033
Token good
Token ID922
Feature activation+3.75e-02

Pos logit contributions
ouses+0.241
affles+0.232
 fries+0.227
 veggies+0.218
ittens+0.217
Neg logit contributions
 slide-0.078
 flap-0.071
 switch-0.068
 hill-0.066
pe-0.066
Token food
Token ID2057
Feature activation+4.64e-02

Pos logit contributions
ouses+0.064
 fries+0.061
 veggies+0.061
heddar+0.058
affles+0.058
Neg logit contributions
pe-0.062
 clim-0.062
 hill-0.059
 zo-0.059
 bridge-0.059
Token,
Token ID11
Feature activation+4.55e-02

Pos logit contributions
 fries+0.110
 veggies+0.109
ouses+0.109
affles+0.107
 groceries+0.101
Neg logit contributions
pe-0.052
stack-0.047
h-0.043
 clim-0.043
 hill-0.039
Token toys
Token ID14958
Feature activation+4.18e-02

Pos logit contributions
ouses+0.111
 veggies+0.101
 fries+0.100
affles+0.099
heddar+0.098
Neg logit contributions
pe-0.053
h-0.049
stack-0.048
 never-0.048
 hon-0.047
Token and
Token ID290
Feature activation+6.48e-02

Pos logit contributions
 fries+0.087
ouses+0.086
 veggies+0.085
 meals+0.080
 groceries+0.079
Neg logit contributions
pe-0.063
stack-0.061
h-0.060
 collapse-0.056
roads-0.056
Token pond
Token ID16723
Feature activation-3.39e-02

Pos logit contributions
pe+0.020
h+0.019
stack+0.019
 collapse+0.018
 clim+0.018
Neg logit contributions
 fries-0.034
 veggies-0.034
ouses-0.033
 meals-0.032
 groceries-0.032
Token and
Token ID290
Feature activation+1.57e-02

Pos logit contributions
pe+0.065
 collapse+0.063
h+0.061
stack+0.060
 clim+0.059
Neg logit contributions
 fries-0.055
ouses-0.055
 veggies-0.055
 meals-0.052
 spaghetti-0.052
Token I
Token ID314
Feature activation+4.81e-03

Pos logit contributions
 fries+0.033
 veggies+0.032
ouses+0.030
 meals+0.029
 spaghetti+0.029
Neg logit contributions
pe-0.025
h-0.024
stack-0.023
roads-0.022
 clim-0.022
Token will
Token ID481
Feature activation-2.48e-02

Pos logit contributions
 fries+0.009
ouses+0.009
 veggies+0.009
 meals+0.009
 favorites+0.009
Neg logit contributions
pe-0.008
stack-0.008
h-0.007
roads-0.007
 collapse-0.007
Token give
Token ID1577
Feature activation+4.70e-02

Pos logit contributions
roads+0.045
pe+0.043
stack+0.042
h+0.041
 clim+0.040
Neg logit contributions
 fries-0.047
 veggies-0.044
 spaghetti-0.039
 meals-0.039
ory-0.039
Token you
Token ID345
Feature activation+9.22e-02

Pos logit contributions
ouses+0.140
 fries+0.129
affles+0.128
 favorites+0.126
urger+0.125
Neg logit contributions
oh-0.029
h-0.029
 never-0.027
't-0.026
pe-0.026
Token a
Token ID257
Feature activation+5.40e-02

Pos logit contributions
ouses+0.212
ittens+0.185
 fries+0.183
ongo+0.181
affles+0.177
Neg logit contributions
h-0.104
 never-0.093
pe-0.092
 clim-0.090
 be-0.090
Token bath
Token ID7837
Feature activation+2.06e-02

Pos logit contributions
ouses+0.112
 veggies+0.107
 fries+0.101
affles+0.099
 meals+0.098
Neg logit contributions
h-0.078
 switch-0.077
 flap-0.077
 crash-0.075
 bridge-0.074
Token."
Token ID526
Feature activation+1.38e-02

Pos logit contributions
 fries+0.043
 veggies+0.043
ouses+0.043
 meals+0.040
 groceries+0.039
Neg logit contributions
pe-0.030
stack-0.029
h-0.029
 collapse-0.027
 clim-0.027
Token So
Token ID1406
Feature activation-1.45e-02

Pos logit contributions
ouses+0.031
 fries+0.030
affles+0.030
 veggies+0.030
 groceries+0.029
Neg logit contributions
stack-0.018
pe-0.016
h-0.015
 hill-0.015
bank-0.014
Token,
Token ID11
Feature activation-9.02e-03

Pos logit contributions
pe+0.015
stack+0.014
h+0.013
roads+0.013
 collapse+0.012
Neg logit contributions
 fries-0.037
ouses-0.037
 veggies-0.037
 meals-0.035
 groceries-0.034
Token bright
Token ID6016
Feature activation+5.38e-04

Pos logit contributions
 fries+0.144
ouses+0.135
 veggies+0.134
ory+0.127
uggets+0.127
Neg logit contributions
pe-0.072
stack-0.070
 hill-0.069
 clim-0.068
 bridge-0.068
Token room
Token ID2119
Feature activation-1.27e-02

Pos logit contributions
 fries+0.001
ouses+0.001
 veggies+0.001
 meals+0.001
 groceries+0.001
Neg logit contributions
pe-0.001
h-0.001
stack-0.001
 clim-0.001
roads-0.001
Token with
Token ID351
Feature activation+4.68e-02

Pos logit contributions
pe+0.023
stack+0.023
 collapse+0.022
roads+0.021
h+0.020
Neg logit contributions
ouses-0.022
 veggies-0.022
 fries-0.022
 meals-0.021
 spaghetti-0.021
Token all
Token ID477
Feature activation+2.60e-02

Pos logit contributions
ouses+0.127
 fries+0.124
ory+0.124
urger+0.123
heddar+0.121
Neg logit contributions
h-0.035
 switch-0.034
 race-0.029
 crash-0.028
 slide-0.028
Token of
Token ID286
Feature activation+2.16e-02

Pos logit contributions
 fries+0.053
 veggies+0.052
ouses+0.052
 meals+0.048
 groceries+0.048
Neg logit contributions
pe-0.041
stack-0.039
h-0.038
 collapse-0.036
 clim-0.036
Token her
Token ID607
Feature activation+9.17e-02

Pos logit contributions
ouses+0.054
 fries+0.052
urger+0.049
ory+0.049
 veggies+0.049
Neg logit contributions
h-0.023
pe-0.023
 switch-0.021
 crash-0.020
stack-0.020
Token toys
Token ID14958
Feature activation+3.52e-02

Pos logit contributions
 fries+0.214
urger+0.212
ory+0.197
uggets+0.194
heddar+0.191
Neg logit contributions
 switch-0.114
 p-0.104
 only-0.103
 hall-0.101
 hill-0.100
Token.
Token ID13
Feature activation+4.14e-02

Pos logit contributions
ouses+0.069
 fries+0.069
 veggies+0.068
 meals+0.064
 groceries+0.063
Neg logit contributions
pe-0.057
stack-0.056
h-0.054
 collapse-0.053
roads-0.051
Token One
Token ID1881
Feature activation+7.40e-03

Pos logit contributions
 fries+0.110
ouses+0.108
 veggies+0.105
 meals+0.101
affles+0.100
Neg logit contributions
h-0.039
pe-0.038
stack-0.038
 hill-0.033
 clim-0.032
Token day
Token ID1110
Feature activation-3.86e-03

Pos logit contributions
 fries+0.020
ouses+0.019
 veggies+0.019
 groceries+0.017
 meals+0.017
Neg logit contributions
h-0.008
 clim-0.007
pe-0.007
 hill-0.007
stack-0.006
Token,
Token ID11
Feature activation+1.00e-02

Pos logit contributions
h+0.002
pe+0.002
 clim+0.002
stack+0.001
 collapse+0.001
Neg logit contributions
 fries-0.012
 veggies-0.012
ouses-0.012
 meals-0.011
 groceries-0.011
Token ate
Token ID15063
Feature activation+3.40e-02

Pos logit contributions
 fries+0.003
 veggies+0.003
ouses+0.003
 meals+0.002
 spaghetti+0.002
Neg logit contributions
pe-0.003
h-0.002
stack-0.002
roads-0.002
 collapse-0.002
Token tasty
Token ID25103
Feature activation+6.24e-02

Pos logit contributions
ouses+0.069
 fries+0.066
 veggies+0.064
affles+0.060
heddar+0.059
Neg logit contributions
pe-0.054
stack-0.053
h-0.051
 collapse-0.048
 clim-0.048
Token sandwiches
Token ID29777
Feature activation+4.83e-02

Pos logit contributions
ouses+0.074
uggets+0.062
 fries+0.062
ory+0.059
ingu+0.059
Neg logit contributions
pe-0.142
h-0.139
 hill-0.139
 clim-0.135
 slide-0.134
Token and
Token ID290
Feature activation+6.33e-02

Pos logit contributions
 fries+0.101
 veggies+0.099
ouses+0.099
 meals+0.094
 groceries+0.092
Neg logit contributions
pe-0.071
stack-0.069
h-0.068
 collapse-0.066
roads-0.065
Token drank
Token ID24070
Feature activation+4.70e-02

Pos logit contributions
ouses+0.118
 fries+0.108
 veggies+0.106
heddar+0.104
 groceries+0.104
Neg logit contributions
pe-0.103
stack-0.102
h-0.099
 never-0.096
 clim-0.095
Token sweet
Token ID6029
Feature activation+9.17e-02

Pos logit contributions
ouses+0.094
 fries+0.089
 veggies+0.086
affles+0.083
 favorites+0.082
Neg logit contributions
pe-0.075
stack-0.072
h-0.072
 collapse-0.068
 clim-0.065
Token lemon
Token ID18873
Feature activation+4.49e-02

Pos logit contributions
ouses+0.132
ocket+0.117
uggets+0.114
ingu+0.111
ongo+0.104
Neg logit contributions
h-0.180
,-0.171
pe-0.166
 slide-0.164
 hill-0.162
Tokenade
Token ID671
Feature activation+3.39e-02

Pos logit contributions
 fries+0.064
ouses+0.062
 veggies+0.060
 favorites+0.056
 groceries+0.054
Neg logit contributions
pe-0.096
h-0.094
stack-0.088
 clim-0.086
 collapse-0.084
Token.
Token ID13
Feature activation+1.16e-02

Pos logit contributions
 fries+0.069
ouses+0.068
 veggies+0.068
 meals+0.064
 groceries+0.063
Neg logit contributions
pe-0.052
stack-0.051
h-0.049
roads-0.047
 collapse-0.047
Token After
Token ID2293
Feature activation-1.47e-02

Pos logit contributions
 fries+0.024
ouses+0.024
 veggies+0.023
 groceries+0.022
affles+0.022
Neg logit contributions
stack-0.018
pe-0.017
h-0.015
roads-0.015
 collapse-0.015
Token their
Token ID511
Feature activation+1.65e-02

Pos logit contributions
pe+0.022
stack+0.022
h+0.021
 collapse+0.020
 clim+0.019
Neg logit contributions
ouses-0.030
 fries-0.030
 veggies-0.030
 groceries-0.027
 meals-0.027
Token all
Token ID477
Feature activation+1.29e-02

Pos logit contributions
pe+0.006
stack+0.006
h+0.006
 collapse+0.006
 clim+0.006
Neg logit contributions
 fries-0.008
 veggies-0.008
ouses-0.008
 meals-0.007
 groceries-0.007
Token day
Token ID1110
Feature activation+2.16e-02

Pos logit contributions
ouses+0.018
 favorite+0.018
 veggies+0.018
 favourite+0.017
 meals+0.017
Neg logit contributions
stack-0.026
pe-0.025
 collapse-0.025
roads-0.023
h-0.023
Token and
Token ID290
Feature activation+3.67e-02

Pos logit contributions
 fries+0.039
ouses+0.039
 veggies+0.038
 meals+0.036
 groceries+0.035
Neg logit contributions
pe-0.038
stack-0.037
h-0.036
 collapse-0.035
roads-0.035
Token ate
Token ID15063
Feature activation+4.61e-02

Pos logit contributions
 fries+0.064
 veggies+0.063
ouses+0.060
 meals+0.058
 spaghetti+0.056
Neg logit contributions
pe-0.067
stack-0.065
h-0.064
roads-0.064
 collapse-0.063
Token y
Token ID331
Feature activation+5.30e-02

Pos logit contributions
ouses+0.092
 fries+0.084
 veggies+0.081
ory+0.077
affles+0.077
Neg logit contributions
pe-0.077
stack-0.075
h-0.072
 hill-0.070
 clim-0.069
Tokenummy
Token ID13513
Feature activation+9.16e-02

Pos logit contributions
ouses+0.088
 fries+0.082
 veggies+0.077
shirt+0.075
 granddaughter+0.075
Neg logit contributions
h-0.099
pe-0.088
 collapse-0.086
 clim-0.084
oh-0.081
Token food
Token ID2057
Feature activation+5.42e-02

Pos logit contributions
ouses+0.133
uggets+0.118
ingu+0.112
ory+0.109
ocket+0.107
Neg logit contributions
h-0.185
pe-0.181
 clim-0.179
 slide-0.176
 hill-0.174
Token.
Token ID13
Feature activation+3.32e-02

Pos logit contributions
 fries+0.116
ouses+0.115
 veggies+0.114
 meals+0.106
 groceries+0.106
Neg logit contributions
pe-0.078
stack-0.075
h-0.073
 clim-0.068
 collapse-0.068
Token The
Token ID383
Feature activation-1.02e-02

Pos logit contributions
 fries+0.073
ouses+0.073
 veggies+0.072
 groceries+0.068
 meals+0.067
Neg logit contributions
pe-0.046
stack-0.046
h-0.043
roads-0.040
 hill-0.040
Token sun
Token ID4252
Feature activation-2.28e-03

Pos logit contributions
pe+0.016
stack+0.015
h+0.015
 collapse+0.014
 clim+0.014
Neg logit contributions
 fries-0.021
ouses-0.021
 veggies-0.021
 meals-0.019
 groceries-0.019
Token was
Token ID373
Feature activation+1.54e-02

Pos logit contributions
pe+0.003
h+0.003
stack+0.003
roads+0.003
 collapse+0.003
Neg logit contributions
 fries-0.005
 veggies-0.005
ouses-0.004
 meals-0.004
 spaghetti-0.004
Token a
Token ID257
Feature activation+2.91e-02

Pos logit contributions
 fries+0.025
 veggies+0.025
ouses+0.024
 groceries+0.023
 meals+0.023
Neg logit contributions
pe-0.024
stack-0.023
h-0.022
 collapse-0.021
roads-0.021
Token blanket
Token ID18447
Feature activation-1.65e-03

Pos logit contributions
 fries+0.055
ouses+0.055
 veggies+0.055
 meals+0.051
affles+0.050
Neg logit contributions
pe-0.047
h-0.046
stack-0.046
 hill-0.045
 bridge-0.044
Token and
Token ID290
Feature activation+5.38e-02

Pos logit contributions
 collapse+0.003
pe+0.003
roads+0.003
h+0.003
 clim+0.003
Neg logit contributions
 fries-0.002
ouses-0.002
 veggies-0.002
 spaghetti-0.002
 sandwiches-0.002
Token spread
Token ID4104
Feature activation+3.31e-02

Pos logit contributions
ouses+0.113
 fries+0.110
 veggies+0.109
 meals+0.103
 groceries+0.102
Neg logit contributions
stack-0.077
pe-0.076
h-0.073
 clim-0.069
 collapse-0.068
Token y
Token ID331
Feature activation+5.32e-02

Pos logit contributions
ouses+0.072
affles+0.067
 veggies+0.066
 fries+0.066
heddar+0.065
Neg logit contributions
pe-0.043
stack-0.040
h-0.040
 flap-0.040
 hon-0.039
Tokenummy
Token ID13513
Feature activation+9.15e-02

Pos logit contributions
ouses+0.089
 fries+0.083
 veggies+0.079
 poems+0.076
 journals+0.075
Neg logit contributions
h-0.101
pe-0.089
 collapse-0.088
 clim-0.086
oh-0.081
Token foods
Token ID9013
Feature activation+6.13e-02

Pos logit contributions
ouses+0.143
ingu+0.132
uggets+0.131
ongo+0.124
enny+0.122
Neg logit contributions
h-0.162
 slide-0.161
 flap-0.160
pe-0.159
 hon-0.158
Token on
Token ID319
Feature activation-1.86e-03

Pos logit contributions
ouses+0.133
 veggies+0.132
 fries+0.132
affles+0.124
 meals+0.123
Neg logit contributions
pe-0.086
h-0.081
stack-0.081
 clim-0.073
 collapse-0.070
Token it
Token ID340
Feature activation+2.31e-02

Pos logit contributions
pe+0.002
stack+0.002
h+0.002
 collapse+0.002
 clim+0.002
Neg logit contributions
 fries-0.004
ouses-0.004
 veggies-0.004
 meals-0.004
 groceries-0.004
Token.
Token ID13
Feature activation+3.14e-02

Pos logit contributions
 fries+0.047
ouses+0.046
 veggies+0.046
 meals+0.043
 groceries+0.043
Neg logit contributions
pe-0.035
stack-0.035
h-0.034
 collapse-0.032
roads-0.032
Token\n
Token ID198
Feature activation+1.62e-02

Pos logit contributions
ouses+0.069
 veggies+0.067
 fries+0.067
 meals+0.064
affles+0.062
Neg logit contributions
stack-0.042
pe-0.042
h-0.040
roads-0.038
 collapse-0.037
Token bottles
Token ID14096
Feature activation+2.26e-02

Pos logit contributions
 fries+0.077
ouses+0.077
 veggies+0.077
 meals+0.071
 groceries+0.071
Neg logit contributions
pe-0.048
stack-0.047
h-0.046
 clim-0.042
 collapse-0.041
Token that
Token ID326
Feature activation+2.48e-02

Pos logit contributions
 fries+0.047
ouses+0.046
 veggies+0.046
 meals+0.043
 groceries+0.042
Neg logit contributions
pe-0.034
stack-0.033
h-0.033
 collapse-0.031
roads-0.031
Token their
Token ID511
Feature activation+3.99e-02

Pos logit contributions
 fries+0.052
 veggies+0.051
ouses+0.051
 meals+0.048
 groceries+0.047
Neg logit contributions
pe-0.037
stack-0.036
h-0.036
roads-0.033
 collapse-0.033
Token mom
Token ID1995
Feature activation+2.65e-02

Pos logit contributions
ouses+0.075
 fries+0.072
 veggies+0.072
affles+0.068
heddar+0.068
Neg logit contributions
stack-0.064
 hill-0.058
pe-0.058
 switch-0.057
 never-0.057
Token gave
Token ID2921
Feature activation+4.53e-02

Pos logit contributions
 fries+0.057
ouses+0.057
 veggies+0.056
 meals+0.053
 groceries+0.053
Neg logit contributions
pe-0.037
h-0.036
stack-0.035
 clim-0.033
 collapse-0.031
Token them
Token ID606
Feature activation+9.13e-02

Pos logit contributions
ouses+0.110
 fries+0.105
 veggies+0.101
heddar+0.100
affles+0.098
Neg logit contributions
pe-0.052
h-0.048
stack-0.045
 never-0.042
bank-0.041
Token.
Token ID13
Feature activation+1.40e-02

Pos logit contributions
ouses+0.207
affles+0.207
 fries+0.207
 veggies+0.206
heddar+0.196
Neg logit contributions
pe-0.090
 never-0.083
h-0.082
 first-0.078
 switch-0.073
Token The
Token ID383
Feature activation-8.18e-03

Pos logit contributions
ouses+0.032
 fries+0.032
 veggies+0.031
 meals+0.030
affles+0.029
Neg logit contributions
pe-0.018
stack-0.017
h-0.017
roads-0.016
 hill-0.016
Token perfume
Token ID40094
Feature activation+2.10e-02

Pos logit contributions
pe+0.016
stack+0.015
h+0.015
 switch+0.015
 hill+0.014
Neg logit contributions
 fries-0.013
ouses-0.013
 veggies-0.013
affles-0.012
urger-0.012
Token bottles
Token ID14096
Feature activation+1.96e-02

Pos logit contributions
 fries+0.049
ouses+0.049
 veggies+0.048
 groceries+0.045
 meals+0.045
Neg logit contributions
pe-0.026
stack-0.025
h-0.024
 collapse-0.022
 clim-0.021
Token were
Token ID547
Feature activation+8.96e-03

Pos logit contributions
 fries+0.039
ouses+0.039
 veggies+0.038
 meals+0.036
 groceries+0.035
Neg logit contributions
pe-0.031
stack-0.030
h-0.029
 collapse-0.027
roads-0.027
Token with
Token ID351
Feature activation+4.14e-02

Pos logit contributions
pe+0.018
stack+0.017
h+0.017
 clim+0.016
roads+0.015
Neg logit contributions
 fries-0.031
 veggies-0.030
ouses-0.030
 groceries-0.027
ory-0.027
Token dolls
Token ID36062
Feature activation+4.39e-02

Pos logit contributions
 fries+0.095
ouses+0.094
 veggies+0.092
ory+0.090
urger+0.089
Neg logit contributions
 zo-0.047
 never-0.046
 switch-0.045
 hon-0.045
h-0.045
Token and
Token ID290
Feature activation+6.49e-02

Pos logit contributions
 fries+0.100
 veggies+0.097
ouses+0.096
 meals+0.089
 groceries+0.089
Neg logit contributions
pe-0.059
stack-0.057
h-0.056
 clim-0.052
 hill-0.049
Token ate
Token ID15063
Feature activation+5.08e-02

Pos logit contributions
ouses+0.128
 fries+0.119
 veggies+0.116
ory+0.115
affles+0.112
Neg logit contributions
stack-0.102
pe-0.101
 hon-0.100
h-0.097
 crash-0.097
Token y
Token ID331
Feature activation+5.77e-02

Pos logit contributions
ouses+0.089
urger+0.078
ory+0.078
 fries+0.074
ocket+0.071
Neg logit contributions
h-0.089
 hon-0.086
pe-0.086
 zo-0.085
stack-0.084
Tokenummy
Token ID13513
Feature activation+9.13e-02

Pos logit contributions
ouses+0.092
 fries+0.090
 veggies+0.083
 favourite+0.082
ocado+0.080
Neg logit contributions
h-0.109
pe-0.092
 clim-0.092
 collapse-0.090
bank-0.085
Token cookies
Token ID14746
Feature activation+4.96e-02

Pos logit contributions
ouses+0.133
uggets+0.133
ory+0.132
ingu+0.126
ocket+0.120
Neg logit contributions
h-0.165
 slide-0.157
pe-0.156
 p-0.156
 zo-0.156
Token.
Token ID13
Feature activation+1.62e-02

Pos logit contributions
 fries+0.109
 veggies+0.108
ouses+0.107
 meals+0.100
 groceries+0.099
Neg logit contributions
pe-0.069
stack-0.066
h-0.065
 clim-0.059
roads-0.058
Token When
Token ID1649
Feature activation-1.88e-03

Pos logit contributions
 fries+0.033
ouses+0.033
 veggies+0.033
 meals+0.030
 groceries+0.030
Neg logit contributions
pe-0.024
stack-0.024
h-0.023
 collapse-0.022
 clim-0.022
Token it
Token ID340
Feature activation-1.70e-02

Pos logit contributions
pe+0.003
stack+0.003
h+0.003
roads+0.003
 clim+0.003
Neg logit contributions
 fries-0.004
 veggies-0.004
ouses-0.004
 meals-0.004
 groceries-0.004
Token was
Token ID373
Feature activation-9.32e-04

Pos logit contributions
pe+0.027
stack+0.027
roads+0.024
h+0.023
 clim+0.023
Neg logit contributions
 veggies-0.033
 fries-0.033
ouses-0.032
 meals-0.031
 spaghetti-0.030
Token her
Token ID607
Feature activation+5.22e-02

Pos logit contributions
ouses+0.110
 fries+0.101
 veggies+0.100
 meals+0.096
ory+0.094
Neg logit contributions
pe-0.055
h-0.055
stack-0.054
 crash-0.050
 clim-0.050
Token doll
Token ID3654
Feature activation+3.24e-02

Pos logit contributions
ouses+0.122
 fries+0.122
 veggies+0.119
affles+0.114
 meals+0.112
Neg logit contributions
 zo-0.060
 switch-0.059
 clim-0.058
 slide-0.055
stack-0.055
Token Lily
Token ID20037
Feature activation+5.24e-02

Pos logit contributions
 fries+0.088
ouses+0.088
 veggies+0.087
 meals+0.083
 groceries+0.082
Neg logit contributions
pe-0.027
stack-0.027
h-0.025
 clim-0.022
 collapse-0.020
Token and
Token ID290
Feature activation+6.16e-02

Pos logit contributions
 fries+0.111
ouses+0.111
 veggies+0.110
 meals+0.103
 groceries+0.102
Neg logit contributions
pe-0.076
stack-0.073
h-0.071
 collapse-0.070
roads-0.068
Token gave
Token ID2921
Feature activation+5.61e-02

Pos logit contributions
ouses+0.132
 fries+0.129
 veggies+0.129
 meals+0.123
 groceries+0.121
Neg logit contributions
pe-0.080
h-0.079
stack-0.079
 clim-0.076
 hon-0.072
Token her
Token ID607
Feature activation+9.11e-02

Pos logit contributions
ouses+0.167
urger+0.163
 fries+0.161
affles+0.161
heddar+0.157
Neg logit contributions
h-0.024
 never-0.023
 only-0.020
't-0.019
 flap-0.019
Token a
Token ID257
Feature activation+6.65e-02

Pos logit contributions
ouses+0.221
 fries+0.218
affles+0.215
 veggies+0.213
urger+0.206
Neg logit contributions
 switch-0.092
 slide-0.085
 zo-0.084
 flap-0.081
 poster-0.081
Token pink
Token ID11398
Feature activation+4.61e-02

Pos logit contributions
 veggies+0.137
ouses+0.133
 fries+0.133
 meals+0.127
affles+0.127
Neg logit contributions
 switch-0.097
 flap-0.092
 poster-0.092
 slide-0.088
h-0.088
Token dress
Token ID6576
Feature activation+3.20e-02

Pos logit contributions
ouses+0.093
 fries+0.093
 veggies+0.092
 meals+0.086
affles+0.084
Neg logit contributions
pe-0.071
stack-0.069
h-0.068
 clim-0.063
 hill-0.062
Token and
Token ID290
Feature activation+8.58e-02

Pos logit contributions
 veggies+0.075
 fries+0.074
ouses+0.072
 meals+0.070
 groceries+0.069
Neg logit contributions
pe-0.040
h-0.040
stack-0.038
 clim-0.034
roads-0.032
Token a
Token ID257
Feature activation+5.87e-02

Pos logit contributions
ouses+0.199
affles+0.195
ory+0.195
urger+0.191
heddar+0.191
Neg logit contributions
 zo-0.079
 hon-0.079
 flap-0.076
 switch-0.075
 never-0.074
Token with
Token ID351
Feature activation+3.51e-02

Pos logit contributions
pe+0.008
stack+0.008
h+0.008
roads+0.007
 clim+0.007
Neg logit contributions
 fries-0.013
ouses-0.013
 veggies-0.013
 meals-0.012
 groceries-0.012
Token his
Token ID465
Feature activation+6.40e-02

Pos logit contributions
ouses+0.088
 fries+0.080
 veggies+0.079
ory+0.078
urger+0.077
Neg logit contributions
pe-0.037
h-0.033
 hon-0.032
 bridge-0.030
 zo-0.030
Token cat
Token ID3797
Feature activation+2.72e-02

Pos logit contributions
 fries+0.128
ouses+0.128
 veggies+0.122
urger+0.114
ory+0.113
Neg logit contributions
stack-0.102
pe-0.102
 hill-0.096
 bridge-0.094
 switch-0.093
Token and
Token ID290
Feature activation+5.72e-02

Pos logit contributions
 fries+0.061
ouses+0.061
 veggies+0.060
 groceries+0.056
 meals+0.056
Neg logit contributions
pe-0.037
stack-0.035
h-0.034
 clim-0.032
roads-0.031
Token give
Token ID1577
Feature activation+4.97e-02

Pos logit contributions
 fries+0.124
ouses+0.123
 veggies+0.122
 meals+0.114
 groceries+0.113
Neg logit contributions
pe-0.081
stack-0.078
h-0.076
 collapse-0.071
 clim-0.071
Token her
Token ID607
Feature activation+9.08e-02

Pos logit contributions
ouses+0.131
affles+0.118
ittens+0.117
heddar+0.116
 fries+0.113
Neg logit contributions
 never-0.037
pe-0.037
 hon-0.036
h-0.036
 flap-0.032
Token food
Token ID2057
Feature activation+5.07e-02

Pos logit contributions
ouses+0.206
 fries+0.197
 veggies+0.189
affles+0.188
ory+0.188
Neg logit contributions
pe-0.105
 flap-0.104
 switch-0.102
 slide-0.102
 bridge-0.101
Token.
Token ID13
Feature activation+2.61e-02

Pos logit contributions
 fries+0.119
ouses+0.119
 veggies+0.117
affles+0.113
 meals+0.109
Neg logit contributions
pe-0.059
stack-0.054
h-0.051
 clim-0.049
 hill-0.046
Token His
Token ID2399
Feature activation+3.36e-02

Pos logit contributions
 fries+0.053
 veggies+0.052
ouses+0.052
 groceries+0.048
 meals+0.048
Neg logit contributions
pe-0.041
stack-0.040
h-0.039
 collapse-0.037
roads-0.036
Token cat
Token ID3797
Feature activation+3.96e-03

Pos logit contributions
ouses+0.073
 fries+0.072
 veggies+0.071
affles+0.066
ory+0.065
Neg logit contributions
stack-0.048
pe-0.047
 switch-0.042
 zo-0.042
 bridge-0.042
Token is
Token ID318
Feature activation-9.31e-03

Pos logit contributions
 fries+0.009
 veggies+0.009
ouses+0.008
 meals+0.008
 spaghetti+0.008
Neg logit contributions
stack-0.005
pe-0.005
h-0.005
roads-0.005
 collapse-0.005
Token two
Token ID734
Feature activation+3.73e-02

Pos logit contributions
ouses+0.011
 fries+0.010
 veggies+0.010
affles+0.010
ory+0.009
Neg logit contributions
pe-0.006
stack-0.006
h-0.006
 collapse-0.005
 switch-0.005
Token glasses
Token ID15232
Feature activation+2.88e-02

Pos logit contributions
ouses+0.070
 fries+0.067
uggets+0.063
 veggies+0.062
affles+0.062
Neg logit contributions
 hill-0.059
pe-0.058
 bridge-0.058
 switch-0.056
h-0.056
Token and
Token ID290
Feature activation+5.91e-02

Pos logit contributions
 fries+0.054
ouses+0.052
 veggies+0.052
 spaghetti+0.049
 groceries+0.049
Neg logit contributions
pe-0.049
stack-0.047
 collapse-0.046
h-0.046
roads-0.045
Token handed
Token ID10158
Feature activation+4.66e-02

Pos logit contributions
ouses+0.121
 fries+0.112
 veggies+0.112
affles+0.109
 meals+0.107
Neg logit contributions
stack-0.084
pe-0.083
h-0.082
 clim-0.077
 collapse-0.077
Token Tim
Token ID5045
Feature activation+7.50e-02

Pos logit contributions
ouses+0.113
 fries+0.107
 veggies+0.104
affles+0.101
 meals+0.099
Neg logit contributions
pe-0.058
stack-0.056
h-0.056
 collapse-0.050
 clim-0.048
Tokenmy
Token ID1820
Feature activation+9.07e-02

Pos logit contributions
ouses+0.200
 fries+0.200
 veggies+0.189
 meals+0.185
affles+0.180
Neg logit contributions
h-0.066
pe-0.062
oh-0.051
stack-0.050
 clim-0.047
Token one
Token ID530
Feature activation+7.27e-02

Pos logit contributions
ouses+0.212
 fries+0.204
 veggies+0.196
affles+0.195
 favorites+0.185
Neg logit contributions
pe-0.104
h-0.099
 clim-0.085
stack-0.085
oh-0.083
Token.
Token ID13
Feature activation+1.40e-02

Pos logit contributions
ouses+0.179
 fries+0.173
 veggies+0.172
affles+0.168
 meals+0.163
Neg logit contributions
stack-0.078
pe-0.075
h-0.074
 hill-0.069
 switch-0.069
Token The
Token ID383
Feature activation+4.04e-03

Pos logit contributions
 fries+0.029
 veggies+0.029
ouses+0.028
 meals+0.027
 groceries+0.026
Neg logit contributions
pe-0.021
h-0.020
stack-0.020
 collapse-0.019
roads-0.019
Token juice
Token ID13135
Feature activation+1.93e-02

Pos logit contributions
ouses+0.007
 veggies+0.006
 fries+0.006
affles+0.006
uggets+0.005
Neg logit contributions
stack-0.007
pe-0.007
 switch-0.007
 hill-0.007
h-0.007
Token looked
Token ID3114
Feature activation+2.07e-02

Pos logit contributions
 fries+0.040
 veggies+0.039
ouses+0.038
 meals+0.036
 spaghetti+0.036
Neg logit contributions
pe-0.030
h-0.028
stack-0.028
roads-0.027
 clim-0.027
Token,
Token ID11
Feature activation+1.19e-02

Pos logit contributions
h+0.001
pe+0.001
stack+0.001
 clim+0.001
 collapse+0.001
Neg logit contributions
 fries-0.006
 veggies-0.006
ouses-0.006
 meals-0.005
 groceries-0.005
Token Lily
Token ID20037
Feature activation+6.64e-03

Pos logit contributions
ouses+0.028
 fries+0.027
 veggies+0.026
affles+0.026
 meals+0.025
Neg logit contributions
stack-0.015
h-0.015
pe-0.014
 clim-0.013
 collapse-0.013
Token wanted
Token ID2227
Feature activation+2.07e-02

Pos logit contributions
 fries+0.014
ouses+0.014
 veggies+0.014
 meals+0.013
 groceries+0.013
Neg logit contributions
pe-0.009
stack-0.009
h-0.009
 clim-0.008
 collapse-0.008
Token to
Token ID284
Feature activation-1.85e-02

Pos logit contributions
ouses+0.051
 fries+0.047
affles+0.047
 veggies+0.046
ory+0.045
Neg logit contributions
h-0.023
pe-0.023
 never-0.022
stack-0.021
 clim-0.021
Token give
Token ID1577
Feature activation+3.82e-02

Pos logit contributions
pe+0.028
stack+0.027
h+0.026
 collapse+0.024
roads+0.024
Neg logit contributions
 fries-0.039
 veggies-0.038
ouses-0.038
 meals-0.036
 groceries-0.035
Token Brown
Token ID4373
Feature activation+9.06e-02

Pos logit contributions
ouses+0.103
affles+0.098
 fries+0.094
 veggies+0.092
ocado+0.091
Neg logit contributions
pe-0.034
h-0.032
 never-0.031
stack-0.031
't-0.030
Tokenie
Token ID494
Feature activation+7.91e-02

Pos logit contributions
 fries+0.228
ouses+0.224
 veggies+0.209
affles+0.209
imes+0.203
Neg logit contributions
h-0.085
pe-0.084
man-0.069
 clim-0.064
 never-0.063
Token a
Token ID257
Feature activation+5.28e-02

Pos logit contributions
ouses+0.196
 fries+0.186
affles+0.182
 veggies+0.180
uggets+0.175
Neg logit contributions
 never-0.072
 clim-0.071
pe-0.070
h-0.068
 flap-0.064
Token big
Token ID1263
Feature activation+7.06e-03

Pos logit contributions
ouses+0.103
 veggies+0.100
 fries+0.099
affles+0.094
 meals+0.093
Neg logit contributions
 switch-0.082
pe-0.080
 flap-0.079
h-0.079
 clim-0.079
Token hug
Token ID16225
Feature activation+3.33e-02

Pos logit contributions
 fries+0.012
 veggies+0.012
 spaghetti+0.011
ouses+0.011
 groceries+0.011
Neg logit contributions
pe-0.013
stack-0.013
h-0.013
roads-0.013
 collapse-0.012
Token.
Token ID13
Feature activation+3.18e-02

Pos logit contributions
 fries+0.073
ouses+0.072
 veggies+0.072
 meals+0.067
 groceries+0.066
Neg logit contributions
pe-0.046
stack-0.045
h-0.044
 clim-0.041
 collapse-0.040
Token flowers
Token ID12734
Feature activation+7.96e-03

Pos logit contributions
pe+0.007
h+0.007
stack+0.006
 collapse+0.006
 bridge+0.006
Neg logit contributions
 fries-0.008
ouses-0.007
 veggies-0.007
affles-0.007
heddar-0.007
Token.
Token ID13
Feature activation+3.82e-03

Pos logit contributions
 fries+0.017
 veggies+0.017
ouses+0.016
 meals+0.015
 groceries+0.015
Neg logit contributions
pe-0.012
stack-0.011
h-0.011
 collapse-0.011
 clim-0.010
Token We
Token ID775
Feature activation+2.30e-02

Pos logit contributions
 fries+0.008
ouses+0.008
 veggies+0.008
 meals+0.007
 groceries+0.007
Neg logit contributions
pe-0.006
stack-0.005
h-0.005
 collapse-0.005
roads-0.005
Token can
Token ID460
Feature activation-1.45e-02

Pos logit contributions
 veggies+0.045
 fries+0.045
ouses+0.044
 meals+0.041
 favorites+0.041
Neg logit contributions
pe-0.036
h-0.035
stack-0.035
roads-0.034
 collapse-0.033
Token give
Token ID1577
Feature activation+5.09e-02

Pos logit contributions
pe+0.022
roads+0.022
h+0.021
stack+0.021
 clim+0.020
Neg logit contributions
 veggies-0.029
 fries-0.029
ouses-0.027
 favourite-0.026
 meals-0.026
Token them
Token ID606
Feature activation+9.06e-02

Pos logit contributions
ouses+0.124
 fries+0.122
affles+0.117
 veggies+0.114
heddar+0.114
Neg logit contributions
pe-0.053
h-0.051
stack-0.046
 never-0.045
 switch-0.044
Token to
Token ID284
Feature activation+6.93e-03

Pos logit contributions
ouses+0.186
 fries+0.178
affles+0.176
ittens+0.171
 veggies+0.169
Neg logit contributions
pe-0.118
h-0.112
 bridge-0.105
 never-0.104
 flap-0.104
Token people
Token ID661
Feature activation+3.86e-02

Pos logit contributions
ouses+0.016
 fries+0.015
 veggies+0.015
heddar+0.014
affles+0.014
Neg logit contributions
pe-0.009
h-0.008
stack-0.008
 collapse-0.008
 bridge-0.007
Token we
Token ID356
Feature activation+4.68e-02

Pos logit contributions
 fries+0.089
 veggies+0.088
ouses+0.088
 meals+0.083
affles+0.083
Neg logit contributions
pe-0.048
h-0.044
stack-0.042
 clim-0.038
roads-0.037
Token love
Token ID1842
Feature activation+2.48e-02

Pos logit contributions
ouses+0.107
 fries+0.104
 veggies+0.100
 meals+0.098
 groceries+0.097
Neg logit contributions
pe-0.054
h-0.053
stack-0.050
 clim-0.049
 flap-0.048
Token,"
Token ID553
Feature activation+2.39e-02

Pos logit contributions
 fries+0.059
ouses+0.058
 veggies+0.055
affles+0.054
heddar+0.053
Neg logit contributions
pe-0.030
h-0.029
stack-0.027
 hill-0.024
 never-0.023
Token man
Token ID582
Feature activation-1.42e-03

Pos logit contributions
stack+0.025
pe+0.025
 collapse+0.024
roads+0.023
h+0.023
Neg logit contributions
 fries-0.037
 veggies-0.036
ouses-0.035
 meals-0.034
 groceries-0.033
Token who
Token ID508
Feature activation+1.95e-02

Pos logit contributions
stack+0.002
pe+0.002
 collapse+0.002
roads+0.002
h+0.002
Neg logit contributions
 fries-0.003
ouses-0.003
 veggies-0.003
 meals-0.002
 spaghetti-0.002
Token always
Token ID1464
Feature activation+2.62e-03

Pos logit contributions
 fries+0.038
 veggies+0.037
ouses+0.037
 meals+0.034
 groceries+0.033
Neg logit contributions
pe-0.032
stack-0.032
h-0.031
 collapse-0.030
roads-0.030
Token gave
Token ID2921
Feature activation+3.93e-02

Pos logit contributions
 fries+0.005
 veggies+0.005
ouses+0.005
 meals+0.004
 spaghetti+0.004
Neg logit contributions
stack-0.005
pe-0.005
roads-0.005
 collapse-0.004
 bridge-0.004
Token Tim
Token ID5045
Feature activation+7.46e-02

Pos logit contributions
ouses+0.096
 fries+0.090
affles+0.085
 veggies+0.084
heddar+0.083
Neg logit contributions
pe-0.046
h-0.043
 never-0.043
stack-0.039
 collapse-0.039
Tokenmy
Token ID1820
Feature activation+9.05e-02

Pos logit contributions
ouses+0.190
 fries+0.187
 veggies+0.179
 meals+0.172
affles+0.170
Neg logit contributions
pe-0.073
h-0.072
 clim-0.061
oh-0.059
 never-0.056
Token a
Token ID257
Feature activation+4.97e-02

Pos logit contributions
ouses+0.185
 fries+0.179
 veggies+0.176
affles+0.173
ittens+0.162
Neg logit contributions
pe-0.122
h-0.117
 clim-0.113
 never-0.107
oh-0.101
Token l
Token ID300
Feature activation+1.00e-02

Pos logit contributions
ouses+0.097
 veggies+0.095
 fries+0.093
 meals+0.091
affles+0.089
Neg logit contributions
h-0.077
 clim-0.073
pe-0.073
 switch-0.071
 flap-0.068
Tokenoll
Token ID692
Feature activation+5.08e-02

Pos logit contributions
 veggies+0.008
 fries+0.008
ouses+0.007
 meals+0.007
 groceries+0.006
Neg logit contributions
pe-0.028
h-0.028
 clim-0.027
stack-0.027
roads-0.026
Tokenipop
Token ID42800
Feature activation+5.37e-02

Pos logit contributions
 veggies+0.057
 fries+0.056
 meals+0.055
ouses+0.054
 groceries+0.048
Neg logit contributions
pe-0.124
h-0.122
 clim-0.117
stack-0.111
oh-0.107
Token after
Token ID706
Feature activation-6.47e-04

Pos logit contributions
ouses+0.122
 veggies+0.122
 fries+0.121
 meals+0.115
affles+0.114
Neg logit contributions
pe-0.066
h-0.064
stack-0.063
 clim-0.060
 hill-0.053
Token with
Token ID351
Feature activation+4.12e-02

Pos logit contributions
 fries+0.046
 veggies+0.046
ouses+0.046
 meals+0.043
 groceries+0.042
Neg logit contributions
pe-0.035
stack-0.034
h-0.033
 clim-0.031
 collapse-0.031
Token lots
Token ID6041
Feature activation+6.85e-02

Pos logit contributions
ouses+0.089
affles+0.080
 fries+0.080
 veggies+0.079
ory+0.078
Neg logit contributions
pe-0.055
h-0.054
 hon-0.052
 never-0.051
 zo-0.050
Token of
Token ID286
Feature activation+4.36e-02

Pos logit contributions
 fries+0.150
ouses+0.150
 veggies+0.148
uggets+0.142
 meals+0.140
Neg logit contributions
h-0.089
pe-0.083
 clim-0.082
 zo-0.074
 bridge-0.073
Token pretty
Token ID2495
Feature activation+5.81e-02

Pos logit contributions
ouses+0.089
urger+0.083
ory+0.083
heddar+0.083
uggets+0.081
Neg logit contributions
 switch-0.059
h-0.055
 flash-0.055
 building-0.054
 hon-0.054
Token dresses
Token ID27309
Feature activation+4.72e-02

Pos logit contributions
 fries+0.099
uggets+0.095
ouses+0.094
oll+0.092
heddar+0.091
Neg logit contributions
h-0.099
pe-0.091
 bridge-0.089
 switch-0.087
 crash-0.084
Token and
Token ID290
Feature activation+9.05e-02

Pos logit contributions
 fries+0.104
 veggies+0.103
ouses+0.102
 meals+0.097
 groceries+0.096
Neg logit contributions
pe-0.065
h-0.062
stack-0.061
 clim-0.056
 collapse-0.055
Token suits
Token ID14803
Feature activation+4.69e-02

Pos logit contributions
ouses+0.144
ory+0.140
affles+0.136
heddar+0.135
uggets+0.133
Neg logit contributions
 hon-0.156
 zo-0.153
 crash-0.153
pe-0.151
h-0.149
Token.
Token ID13
Feature activation+2.70e-02

Pos logit contributions
 fries+0.099
 veggies+0.099
ouses+0.099
 meals+0.093
 groceries+0.091
Neg logit contributions
pe-0.067
stack-0.065
h-0.065
 collapse-0.059
 clim-0.059
Token Tim
Token ID5045
Feature activation+2.08e-02

Pos logit contributions
ouses+0.060
 fries+0.059
 veggies+0.058
 meals+0.055
 groceries+0.054
Neg logit contributions
pe-0.037
stack-0.036
h-0.034
 clim-0.032
 collapse-0.032
Tokenmy
Token ID1820
Feature activation+8.47e-03

Pos logit contributions
 fries+0.045
ouses+0.044
 veggies+0.044
 meals+0.041
 groceries+0.041
Neg logit contributions
pe-0.030
h-0.029
stack-0.028
 clim-0.027
 collapse-0.026
Token used
Token ID973
Feature activation+2.72e-02

Pos logit contributions
 fries+0.016
ouses+0.016
 veggies+0.016
 meals+0.015
 groceries+0.015
Neg logit contributions
pe-0.014
stack-0.014
h-0.013
 collapse-0.013
roads-0.013
Token a
Token ID257
Feature activation-2.55e-02

Pos logit contributions
ouses+0.015
 fries+0.014
 veggies+0.014
 meals+0.013
 groceries+0.013
Neg logit contributions
pe-0.010
h-0.009
stack-0.009
 collapse-0.009
 clim-0.008
Token loud
Token ID7812
Feature activation-5.60e-02

Pos logit contributions
stack+0.044
 clim+0.040
pe+0.039
eday+0.039
roads+0.038
Neg logit contributions
 fries-0.047
 veggies-0.043
 meals-0.042
 spaghetti-0.041
 delicious-0.041
Token noise
Token ID7838
Feature activation-4.58e-02

Pos logit contributions
stack+0.100
roads+0.087
 clim+0.085
eday+0.085
 cliff+0.079
Neg logit contributions
 fries-0.095
 spaghetti-0.087
 faces-0.084
 poems-0.083
 meals-0.080
Token from
Token ID422
Feature activation-3.18e-02

Pos logit contributions
stack+0.099
pe+0.089
 collapse+0.084
bank+0.083
roads+0.083
Neg logit contributions
 fries-0.064
 spaghetti-0.062
 veggies-0.059
 meals-0.057
 groceries-0.055
Token the
Token ID262
Feature activation-3.76e-02

Pos logit contributions
pe+0.054
h+0.053
oh+0.048
stack+0.047
 clim+0.045
Neg logit contributions
 veggies-0.060
 fries-0.059
 groceries-0.059
 spaghetti-0.057
 meals-0.055
Token living
Token ID2877
Feature activation-1.03e-01

Pos logit contributions
h+0.057
pe+0.052
 clim+0.048
stack+0.047
roads+0.046
Neg logit contributions
 veggies-0.078
 groceries-0.078
 fries-0.077
 spaghetti-0.074
 meals-0.073
Token room
Token ID2119
Feature activation-6.09e-02

Pos logit contributions
h+0.216
pe+0.215
stack+0.207
 clim+0.203
 collapse+0.201
Neg logit contributions
 meals-0.135
 veggies-0.135
 groceries-0.129
ouses-0.129
 sandwiches-0.125
Token.
Token ID13
Feature activation-2.53e-02

Pos logit contributions
stack+0.137
 clim+0.134
pe+0.129
roads+0.128
Ro+0.127
Neg logit contributions
 fries-0.065
 spaghetti-0.059
 sandwiches-0.058
 meals-0.054
 groceries-0.053
Token He
Token ID679
Feature activation-2.46e-02

Pos logit contributions
pe+0.046
h+0.045
stack+0.045
roads+0.042
 clim+0.042
Neg logit contributions
 fries-0.044
 veggies-0.043
ouses-0.042
 spaghetti-0.040
 meals-0.040
Token ran
Token ID4966
Feature activation-4.17e-02

Pos logit contributions
stack+0.037
pe+0.036
h+0.033
roads+0.033
 collapse+0.032
Neg logit contributions
 fries-0.052
 veggies-0.050
ouses-0.050
 meals-0.046
 spaghetti-0.045
Token to
Token ID284
Feature activation-2.47e-02

Pos logit contributions
pe+0.089
stack+0.086
h+0.083
roads+0.080
 collapse+0.079
Neg logit contributions
 fries-0.059
 veggies-0.059
ouses-0.058
 groceries-0.057
 spaghetti-0.054
Token noise
Token ID7838
Feature activation-4.01e-02

Pos logit contributions
stack+0.068
roads+0.063
pe+0.061
 collapse+0.059
eday+0.058
Neg logit contributions
 fries-0.072
 spaghetti-0.067
 veggies-0.067
 meals-0.066
 peanut-0.064
Token coming
Token ID2406
Feature activation-3.38e-02

Pos logit contributions
stack+0.076
pe+0.072
 collapse+0.064
roads+0.064
bank+0.063
Neg logit contributions
 fries-0.065
 spaghetti-0.063
 veggies-0.061
 meals-0.059
ouses-0.058
Token from
Token ID422
Feature activation-1.82e-02

Pos logit contributions
pe+0.067
stack+0.066
roads+0.063
bank+0.058
h+0.058
Neg logit contributions
 veggies-0.046
 groceries-0.045
 fries-0.045
 spaghetti-0.045
 meals-0.044
Token a
Token ID257
Feature activation-1.29e-02

Pos logit contributions
pe+0.032
h+0.031
stack+0.028
oh+0.028
 collapse+0.028
Neg logit contributions
 veggies-0.032
 groceries-0.032
 fries-0.031
 spaghetti-0.031
 meals-0.030
Token nearby
Token ID6716
Feature activation-6.13e-02

Pos logit contributions
pe+0.021
stack+0.020
h+0.020
esh+0.019
 clim+0.018
Neg logit contributions
 fries-0.025
 veggies-0.023
 groceries-0.023
 spaghetti-0.023
 meals-0.022
Token alley
Token ID23978
Feature activation-1.03e-01

Pos logit contributions
pe+0.089
h+0.084
stack+0.081
esh+0.079
 collapse+0.077
Neg logit contributions
 fries-0.127
 spaghetti-0.120
 veggies-0.120
 peanut-0.117
 meals-0.116
Tokenway
Token ID1014
Feature activation-8.36e-02

Pos logit contributions
stack+0.215
 sixty+0.209
 clim+0.204
eday+0.201
Ali+0.199
Neg logit contributions
 peanuts-0.095
ouses-0.089
ly-0.087
 groceries-0.087
 fries-0.082
Token.
Token ID13
Feature activation+1.47e-02

Pos logit contributions
stack+0.181
 clim+0.177
pe+0.172
 ha+0.168
eday+0.166
Neg logit contributions
ory-0.067
ouses-0.063
 groceries-0.061
s-0.059
 veggies-0.059
Token Ell
Token ID7122
Feature activation+2.60e-02

Pos logit contributions
 fries+0.030
ouses+0.030
 veggies+0.030
 meals+0.028
 groceries+0.028
Neg logit contributions
pe-0.022
stack-0.022
h-0.021
roads-0.020
 collapse-0.020
Tokena
Token ID64
Feature activation-8.76e-03

Pos logit contributions
 fries+0.057
 veggies+0.057
ouses+0.055
 groceries+0.053
 meals+0.053
Neg logit contributions
h-0.037
pe-0.037
stack-0.034
 clim-0.033
oh-0.032
Token was
Token ID373
Feature activation-1.03e-02

Pos logit contributions
stack+0.015
pe+0.014
roads+0.013
 collapse+0.013
h+0.013
Neg logit contributions
 fries-0.016
ouses-0.016
 veggies-0.016
ory-0.015
 meals-0.015
Token noise
Token ID7838
Feature activation-3.45e-02

Pos logit contributions
stack+0.062
pe+0.060
 clim+0.059
h+0.058
roads+0.056
Neg logit contributions
 fries-0.075
 veggies-0.073
 spaghetti-0.070
 groceries-0.070
 sandwiches-0.068
Token and
Token ID290
Feature activation+6.01e-03

Pos logit contributions
stack+0.072
pe+0.068
roads+0.063
 clim+0.062
bank+0.061
Neg logit contributions
 fries-0.052
 veggies-0.050
 spaghetti-0.050
ouses-0.048
 groceries-0.047
Token came
Token ID1625
Feature activation-3.32e-02

Pos logit contributions
 fries+0.012
 veggies+0.012
ouses+0.011
 groceries+0.011
 spaghetti+0.011
Neg logit contributions
stack-0.009
pe-0.009
roads-0.008
 clim-0.008
h-0.008
Token down
Token ID866
Feature activation-4.42e-02

Pos logit contributions
pe+0.060
stack+0.059
h+0.056
roads+0.054
 collapse+0.054
Neg logit contributions
ouses-0.058
 veggies-0.058
 fries-0.058
 groceries-0.054
 meals-0.053
Token the
Token ID262
Feature activation-3.99e-02

Pos logit contributions
stack+0.079
pe+0.078
roads+0.070
h+0.069
 clim+0.068
Neg logit contributions
 fries-0.077
ouses-0.076
 veggies-0.076
 meals-0.073
 groceries-0.073
Token stair
Token ID18761
Feature activation-1.02e-01

Pos logit contributions
h+0.061
pe+0.058
stack+0.057
 clim+0.056
 collapse+0.052
Neg logit contributions
 veggies-0.079
 groceries-0.078
 fries-0.078
 spaghetti-0.075
 sandwiches-0.075
Token.
Token ID13
Feature activation-3.00e-02

Pos logit contributions
stack+0.245
AAAA+0.232
esh+0.231
 waterfall+0.229
pe+0.227
Neg logit contributions
 groceries-0.065
 spaghetti-0.065
 sandwiches-0.065
 favorite-0.058
 poems-0.058
Token She
Token ID1375
Feature activation-2.54e-02

Pos logit contributions
pe+0.035
stack+0.034
h+0.033
 collapse+0.030
 clim+0.030
Neg logit contributions
 fries-0.073
ouses-0.072
 veggies-0.071
 meals-0.067
 groceries-0.066
Token saw
Token ID2497
Feature activation-6.75e-03

Pos logit contributions
pe+0.043
stack+0.043
roads+0.039
h+0.039
 collapse+0.038
Neg logit contributions
 fries-0.047
 veggies-0.046
ouses-0.046
 groceries-0.043
 meals-0.042
Token Lily
Token ID20037
Feature activation+2.67e-02

Pos logit contributions
pe+0.008
stack+0.008
h+0.008
 collapse+0.008
roads+0.007
Neg logit contributions
ouses-0.016
 fries-0.016
 veggies-0.015
 meals-0.015
affles-0.014
Token and
Token ID290
Feature activation+3.37e-02

Pos logit contributions
 spaghetti+0.036
 veggies+0.035
 fries+0.035
 sandwiches+0.035
 groceries+0.034
Neg logit contributions
stack-0.056
pe-0.052
roads-0.051
 clim-0.049
 cliff-0.048
Token Mom
Token ID11254
Feature activation+1.81e-02

Pos logit contributions
pe+0.004
stack+0.003
h+0.003
 collapse+0.003
 clim+0.003
Neg logit contributions
 fries-0.005
ouses-0.005
 veggies-0.005
 meals-0.005
 groceries-0.005
Tokenmy
Token ID1820
Feature activation+2.86e-02

Pos logit contributions
 fries+0.034
 veggies+0.033
 spaghetti+0.032
 groceries+0.031
ouses+0.031
Neg logit contributions
stack-0.031
pe-0.029
 clim-0.027
 cliff-0.026
roads-0.026
Token coming
Token ID2406
Feature activation-2.43e-02

Pos logit contributions
 fries+0.052
 veggies+0.050
 spaghetti+0.048
ouses+0.047
 groceries+0.047
Neg logit contributions
stack-0.050
pe-0.047
roads-0.043
 clim-0.043
 cliff-0.042
Token into
Token ID656
Feature activation-3.91e-02

Pos logit contributions
pe+0.045
stack+0.045
roads+0.041
h+0.040
 clim+0.039
Neg logit contributions
 veggies-0.041
 fries-0.041
ouses-0.040
 groceries-0.039
 meals-0.037
Token the
Token ID262
Feature activation-4.07e-02

Pos logit contributions
pe+0.055
stack+0.055
h+0.052
 clim+0.049
roads+0.048
Neg logit contributions
 fries-0.084
 veggies-0.084
ouses-0.081
 groceries-0.080
 meals-0.079
Token living
Token ID2877
Feature activation-9.93e-02

Pos logit contributions
pe+0.068
h+0.068
stack+0.061
 clim+0.060
 collapse+0.057
Neg logit contributions
 veggies-0.078
 fries-0.078
 groceries-0.078
 spaghetti-0.076
 meals-0.075
Token room
Token ID2119
Feature activation-5.73e-02

Pos logit contributions
pe+0.196
stack+0.195
h+0.187
 collapse+0.184
 clim+0.181
Neg logit contributions
ouses-0.161
 veggies-0.161
 meals-0.158
 fries-0.153
 groceries-0.151
Token.
Token ID13
Feature activation-1.41e-02

Pos logit contributions
stack+0.131
pe+0.127
 clim+0.121
esh+0.120
h+0.116
Neg logit contributions
 spaghetti-0.062
 fries-0.060
 sandwiches-0.060
ouses-0.059
 groceries-0.058
Token\n
Token ID198
Feature activation-4.94e-03

Pos logit contributions
pe+0.019
stack+0.019
h+0.018
 collapse+0.017
roads+0.016
Neg logit contributions
 fries-0.031
ouses-0.031
 veggies-0.031
 meals-0.029
 groceries-0.028
Token\n
Token ID198
Feature activation-1.59e-02

Pos logit contributions
pe+0.008
stack+0.008
h+0.008
 collapse+0.007
roads+0.007
Neg logit contributions
 fries-0.010
 veggies-0.009
ouses-0.009
 meals-0.009
 groceries-0.009
Token"
Token ID1
Feature activation-2.85e-02

Pos logit contributions
pe+0.025
stack+0.025
h+0.023
 clim+0.023
roads+0.022
Neg logit contributions
ouses-0.032
 fries-0.031
 veggies-0.031
 groceries-0.029
 meals-0.028
Token Mom
Token ID11254
Feature activation+1.81e-02

Pos logit contributions
ouses+0.006
 fries+0.006
 veggies+0.006
 meals+0.005
 groceries+0.005
Neg logit contributions
pe-0.004
stack-0.004
h-0.004
 collapse-0.003
 clim-0.003
Tokenmy
Token ID1820
Feature activation+3.08e-02

Pos logit contributions
 fries+0.034
 veggies+0.033
 spaghetti+0.032
ouses+0.032
 groceries+0.031
Neg logit contributions
stack-0.030
pe-0.029
 clim-0.026
roads-0.026
 cliff-0.026
Token coming
Token ID2406
Feature activation-1.97e-02

Pos logit contributions
 fries+0.057
 veggies+0.056
ouses+0.053
 spaghetti+0.053
 groceries+0.052
Neg logit contributions
stack-0.052
pe-0.050
roads-0.045
 clim-0.045
h-0.044
Token into
Token ID656
Feature activation-3.60e-02

Pos logit contributions
pe+0.035
stack+0.034
h+0.031
roads+0.031
 clim+0.031
Neg logit contributions
 fries-0.036
 veggies-0.035
ouses-0.035
 groceries-0.033
 meals-0.033
Token the
Token ID262
Feature activation-4.86e-02

Pos logit contributions
pe+0.047
stack+0.046
h+0.045
 collapse+0.041
 clim+0.041
Neg logit contributions
 fries-0.082
ouses-0.081
 veggies-0.081
 meals-0.076
 groceries-0.075
Token living
Token ID2877
Feature activation-9.93e-02

Pos logit contributions
pe+0.076
h+0.074
stack+0.073
 clim+0.068
 collapse+0.067
Neg logit contributions
 fries-0.101
 veggies-0.100
ouses-0.096
 meals-0.094
 groceries-0.093
Token room
Token ID2119
Feature activation-4.93e-02

Pos logit contributions
stack+0.180
pe+0.179
h+0.168
 collapse+0.165
 clim+0.163
Neg logit contributions
 veggies-0.181
ouses-0.179
 fries-0.176
 meals-0.174
 groceries-0.166
Token.
Token ID13
Feature activation-1.18e-02

Pos logit contributions
stack+0.108
pe+0.105
 clim+0.098
esh+0.097
h+0.096
Neg logit contributions
 fries-0.062
 spaghetti-0.061
 sandwiches-0.059
ouses-0.058
 veggies-0.058
Token\n
Token ID198
Feature activation-5.46e-03

Pos logit contributions
pe+0.016
stack+0.016
h+0.015
 collapse+0.014
roads+0.014
Neg logit contributions
 fries-0.026
ouses-0.026
 veggies-0.026
 meals-0.024
affles-0.024
Token\n
Token ID198
Feature activation-1.76e-02

Pos logit contributions
pe+0.007
h+0.007
stack+0.007
 collapse+0.006
roads+0.006
Neg logit contributions
 fries-0.013
 veggies-0.013
ouses-0.012
 meals-0.012
 groceries-0.012
Token"
Token ID1
Feature activation-2.71e-02

Pos logit contributions
pe+0.024
stack+0.023
h+0.023
 collapse+0.021
 clim+0.021
Neg logit contributions
 fries-0.039
ouses-0.038
 veggies-0.038
 meals-0.036
 groceries-0.035
Token loud
Token ID7812
Feature activation-6.49e-02

Pos logit contributions
pe+0.052
stack+0.051
h+0.049
roads+0.047
 collapse+0.047
Neg logit contributions
 fries-0.051
 veggies-0.049
ouses-0.048
 meals-0.045
 spaghetti-0.044
Token noise
Token ID7838
Feature activation-5.12e-02

Pos logit contributions
stack+0.107
roads+0.104
eday+0.099
 collapse+0.096
 never+0.093
Neg logit contributions
 fries-0.121
 spaghetti-0.111
 peanut-0.103
 veggies-0.101
 peanuts-0.098
Token came
Token ID1625
Feature activation-2.92e-02

Pos logit contributions
stack+0.077
pe+0.075
h+0.067
roads+0.066
 clim+0.065
Neg logit contributions
 fries-0.108
 veggies-0.100
 spaghetti-0.097
 meals-0.095
ouses-0.094
Token from
Token ID422
Feature activation-2.88e-02

Pos logit contributions
stack+0.059
roads+0.058
pe+0.058
 poster+0.053
 salon+0.052
Neg logit contributions
 veggies-0.039
 spaghetti-0.037
 groceries-0.037
 fries-0.036
 meals-0.035
Token the
Token ID262
Feature activation-3.52e-02

Pos logit contributions
h+0.051
pe+0.050
oh+0.046
 clim+0.045
zag+0.044
Neg logit contributions
 veggies-0.051
 groceries-0.051
 spaghetti-0.050
 fries-0.050
 meals-0.047
Token living
Token ID2877
Feature activation-9.73e-02

Pos logit contributions
pe+0.051
stack+0.049
h+0.049
 collapse+0.045
 clim+0.045
Neg logit contributions
 fries-0.075
 veggies-0.074
ouses-0.073
 meals-0.069
 groceries-0.069
Token room
Token ID2119
Feature activation-5.51e-02

Pos logit contributions
pe+0.174
h+0.170
stack+0.165
 clim+0.157
 collapse+0.157
Neg logit contributions
 veggies-0.178
 fries-0.177
 meals-0.171
ouses-0.167
 spaghetti-0.167
Token.
Token ID13
Feature activation-8.81e-03

Pos logit contributions
stack+0.105
pe+0.103
 clim+0.101
roads+0.100
h+0.099
Neg logit contributions
 fries-0.085
 spaghetti-0.076
 veggies-0.076
ouses-0.074
 sandwiches-0.073
Token Billy
Token ID15890
Feature activation+5.13e-03

Pos logit contributions
pe+0.015
h+0.015
stack+0.015
 clim+0.014
 collapse+0.014
Neg logit contributions
 fries-0.016
 veggies-0.016
ouses-0.016
 meals-0.015
 groceries-0.014
Token was
Token ID373
Feature activation-2.13e-02

Pos logit contributions
 fries+0.009
 veggies+0.009
ouses+0.009
 spaghetti+0.008
ory+0.008
Neg logit contributions
stack-0.009
pe-0.009
roads-0.008
 collapse-0.008
h-0.008
Token scared
Token ID12008
Feature activation-1.40e-02

Pos logit contributions
pe+0.040
stack+0.038
h+0.037
roads+0.035
 hill+0.034
Neg logit contributions
 fries-0.036
 veggies-0.033
 spaghetti-0.032
 delicious-0.032
 favourite-0.032
Token a
Token ID257
Feature activation+2.86e-03

Pos logit contributions
ouses+0.006
 veggies+0.006
 fries+0.006
 meals+0.006
 favorites+0.005
Neg logit contributions
pe-0.004
stack-0.004
h-0.004
 collapse-0.003
roads-0.003
Token big
Token ID1263
Feature activation-3.97e-02

Pos logit contributions
 fries+0.005
 veggies+0.005
ouses+0.005
 meals+0.004
 groceries+0.004
Neg logit contributions
pe-0.005
stack-0.005
h-0.005
roads-0.005
 collapse-0.005
Token box
Token ID3091
Feature activation-4.38e-02

Pos logit contributions
pe+0.068
h+0.067
stack+0.063
roads+0.061
esh+0.059
Neg logit contributions
 fries-0.077
 spaghetti-0.075
 veggies-0.074
 peanuts-0.074
 peanut-0.073
Token in
Token ID287
Feature activation-4.11e-02

Pos logit contributions
stack+0.093
pe+0.089
bank+0.087
 Zo+0.085
h+0.084
Neg logit contributions
 spaghetti-0.054
 sandwiches-0.053
 fries-0.052
 veggies-0.051
 groceries-0.049
Token the
Token ID262
Feature activation-2.86e-02

Pos logit contributions
pe+0.066
stack+0.065
h+0.065
 clim+0.059
bank+0.056
Neg logit contributions
 veggies-0.081
 fries-0.079
 groceries-0.078
 spaghetti-0.076
 meals-0.075
Token living
Token ID2877
Feature activation-9.72e-02

Pos logit contributions
pe+0.035
stack+0.034
h+0.032
 collapse+0.030
 clim+0.030
Neg logit contributions
 fries-0.068
ouses-0.067
 veggies-0.067
 meals-0.063
 groceries-0.063
Token room
Token ID2119
Feature activation-4.94e-02

Pos logit contributions
pe+0.170
h+0.164
stack+0.163
 clim+0.155
 collapse+0.153
Neg logit contributions
ouses-0.182
 veggies-0.180
 fries-0.174
 meals-0.173
 groceries-0.168
Token.
Token ID13
Feature activation-7.10e-03

Pos logit contributions
stack+0.105
pe+0.100
 flap+0.099
 clim+0.096
h+0.096
Neg logit contributions
 fries-0.065
 spaghetti-0.061
 veggies-0.061
ouses-0.060
 sandwiches-0.058
Token He
Token ID679
Feature activation-1.51e-03

Pos logit contributions
pe+0.012
stack+0.011
h+0.011
 clim+0.010
 collapse+0.010
Neg logit contributions
 fries-0.014
 veggies-0.014
ouses-0.014
 meals-0.013
 groceries-0.013
Token thought
Token ID1807
Feature activation+3.33e-02

Pos logit contributions
pe+0.002
stack+0.002
h+0.002
 collapse+0.002
roads+0.002
Neg logit contributions
 fries-0.003
ouses-0.003
 veggies-0.003
 meals-0.003
 groceries-0.003
Token it
Token ID340
Feature activation+1.35e-02

Pos logit contributions
ouses+0.076
 fries+0.076
 veggies+0.075
 meals+0.070
 groceries+0.069
Neg logit contributions
pe-0.043
stack-0.041
h-0.041
 collapse-0.039
 clim-0.038
TokenThey
Token ID2990
Feature activation-4.21e-02

Pos logit contributions
stack+0.043
pe+0.043
h+0.041
 collapse+0.039
 clim+0.039
Neg logit contributions
 fries-0.048
ouses-0.047
 veggies-0.047
 meals-0.043
 groceries-0.042
Token go
Token ID467
Feature activation-3.79e-02

Pos logit contributions
pe+0.078
h+0.076
stack+0.074
roads+0.070
bank+0.070
Neg logit contributions
 fries-0.075
 veggies-0.070
ouses-0.064
ory-0.064
 spaghetti-0.063
Token back
Token ID736
Feature activation-7.35e-03

Pos logit contributions
pe+0.078
stack+0.075
h+0.072
 collapse+0.072
 clim+0.069
Neg logit contributions
ouses-0.059
 fries-0.057
 veggies-0.056
 groceries-0.055
 meals-0.052
Token to
Token ID284
Feature activation-3.56e-02

Pos logit contributions
pe+0.015
stack+0.014
h+0.014
 collapse+0.013
roads+0.013
Neg logit contributions
ouses-0.012
 veggies-0.012
 fries-0.012
 groceries-0.011
 meals-0.011
Token the
Token ID262
Feature activation-3.74e-02

Pos logit contributions
pe+0.048
stack+0.045
h+0.045
roads+0.041
 collapse+0.041
Neg logit contributions
 fries-0.081
 veggies-0.080
ouses-0.079
 meals-0.075
 groceries-0.074
Token living
Token ID2877
Feature activation-9.71e-02

Pos logit contributions
h+0.058
pe+0.057
stack+0.055
 clim+0.052
 collapse+0.051
Neg logit contributions
 veggies-0.076
 fries-0.076
ouses-0.073
 groceries-0.073
 meals-0.072
Token room
Token ID2119
Feature activation-4.83e-02

Pos logit contributions
stack+0.190
pe+0.189
h+0.181
 collapse+0.179
 clim+0.173
Neg logit contributions
ouses-0.160
 veggies-0.157
 meals-0.155
 fries-0.146
 favorites-0.143
Token and
Token ID290
Feature activation+1.17e-02

Pos logit contributions
stack+0.100
pe+0.098
 collapse+0.094
h+0.093
 clim+0.092
Neg logit contributions
ouses-0.068
 fries-0.066
 meals-0.064
 veggies-0.064
 spaghetti-0.063
Token pick
Token ID2298
Feature activation+1.28e-02

Pos logit contributions
 fries+0.024
 veggies+0.024
ouses+0.022
 meals+0.022
 groceries+0.022
Neg logit contributions
pe-0.019
h-0.019
stack-0.017
 clim-0.016
roads-0.016
Token up
Token ID510
Feature activation+2.66e-02

Pos logit contributions
ouses+0.034
 fries+0.033
 veggies+0.032
urger+0.031
affles+0.031
Neg logit contributions
 clim-0.010
pe-0.010
stack-0.009
 switch-0.009
h-0.009
Token the
Token ID262
Feature activation+8.07e-03

Pos logit contributions
ouses+0.069
 fries+0.067
 veggies+0.066
affles+0.063
urger+0.061
Neg logit contributions
pe-0.025
stack-0.022
h-0.022
 collapse-0.022
 clim-0.021
Token noise
Token ID7838
Feature activation-3.29e-02

Pos logit contributions
pe+0.050
stack+0.048
h+0.046
 collapse+0.043
 clim+0.043
Neg logit contributions
 fries-0.087
ouses-0.087
 veggies-0.086
 meals-0.081
 groceries-0.080
Token and
Token ID290
Feature activation+5.72e-03

Pos logit contributions
stack+0.069
pe+0.063
 clim+0.059
roads+0.059
bank+0.057
Neg logit contributions
 fries-0.051
 veggies-0.048
 spaghetti-0.047
ouses-0.046
 meals-0.045
Token came
Token ID1625
Feature activation-3.37e-02

Pos logit contributions
 fries+0.011
 veggies+0.011
ouses+0.010
 groceries+0.010
 meals+0.010
Neg logit contributions
stack-0.010
pe-0.009
roads-0.008
zag-0.008
h-0.008
Token to
Token ID284
Feature activation-3.75e-02

Pos logit contributions
pe+0.067
stack+0.065
h+0.062
roads+0.060
 collapse+0.060
Neg logit contributions
ouses-0.053
 veggies-0.053
 fries-0.053
 groceries-0.050
 meals-0.049
Token the
Token ID262
Feature activation-3.31e-02

Pos logit contributions
pe+0.050
stack+0.049
h+0.047
roads+0.044
 clim+0.043
Neg logit contributions
 fries-0.084
 veggies-0.083
ouses-0.082
 meals-0.078
 groceries-0.078
Token living
Token ID2877
Feature activation-9.69e-02

Pos logit contributions
pe+0.040
stack+0.038
h+0.037
 collapse+0.034
 clim+0.034
Neg logit contributions
 fries-0.079
ouses-0.078
 veggies-0.078
 meals-0.073
 groceries-0.073
Token room
Token ID2119
Feature activation-4.70e-02

Pos logit contributions
pe+0.177
stack+0.173
h+0.171
 collapse+0.162
 clim+0.161
Neg logit contributions
 veggies-0.167
ouses-0.165
 fries-0.165
 meals-0.163
 groceries-0.157
Token.
Token ID13
Feature activation-3.57e-02

Pos logit contributions
stack+0.106
pe+0.101
 clim+0.099
h+0.094
roads+0.094
Neg logit contributions
 fries-0.055
ouses-0.055
 groceries-0.053
 sandwiches-0.052
 meals-0.052
Token They
Token ID1119
Feature activation-2.08e-02

Pos logit contributions
pe+0.058
stack+0.056
h+0.055
 clim+0.051
roads+0.051
Neg logit contributions
 fries-0.071
ouses-0.070
 veggies-0.070
 meals-0.065
 groceries-0.064
Token saw
Token ID2497
Feature activation-7.37e-03

Pos logit contributions
pe+0.038
stack+0.036
h+0.035
roads+0.034
 clim+0.034
Neg logit contributions
 fries-0.037
 veggies-0.036
ouses-0.036
 groceries-0.033
 meals-0.033
Token the
Token ID262
Feature activation-8.36e-03

Pos logit contributions
stack+0.005
 collapse+0.005
pe+0.005
 crash+0.005
h+0.005
Neg logit contributions
ouses-0.021
 fries-0.020
 veggies-0.020
urger-0.019
affles-0.019
Token lively
Token ID29696
Feature activation-2.78e-02

Pos logit contributions
pe+0.011
stack+0.011
h+0.011
 clim+0.010
roads+0.010
Neg logit contributions
 fries-0.016
 veggies-0.016
ouses-0.015
 meals-0.015
 spaghetti-0.014
Token sound
Token ID2128
Feature activation-2.47e-02

Pos logit contributions
stack+0.040
pe+0.037
roads+0.036
 clim+0.035
 collapse+0.034
Neg logit contributions
 fries-0.060
 veggies-0.058
 meals-0.056
 spaghetti-0.055
ouses-0.054
Token coming
Token ID2406
Feature activation-3.17e-02

Pos logit contributions
stack+0.040
pe+0.040
 collapse+0.035
h+0.035
roads+0.035
Neg logit contributions
 fries-0.047
 veggies-0.046
ouses-0.045
 spaghetti-0.044
 meals-0.043
Token from
Token ID422
Feature activation-2.05e-02

Pos logit contributions
pe+0.056
stack+0.055
roads+0.053
h+0.050
 clim+0.048
Neg logit contributions
 fries-0.053
 veggies-0.053
 groceries-0.051
 spaghetti-0.050
ouses-0.050
Token the
Token ID262
Feature activation-2.80e-02

Pos logit contributions
pe+0.033
h+0.032
stack+0.031
 clim+0.029
 collapse+0.029
Neg logit contributions
 veggies-0.040
 fries-0.039
 groceries-0.038
ouses-0.037
 meals-0.037
Token living
Token ID2877
Feature activation-9.69e-02

Pos logit contributions
pe+0.039
stack+0.038
h+0.037
 collapse+0.034
 clim+0.034
Neg logit contributions
 fries-0.062
 veggies-0.061
ouses-0.060
 meals-0.057
 groceries-0.056
Token room
Token ID2119
Feature activation-5.03e-02

Pos logit contributions
pe+0.177
h+0.172
stack+0.169
 collapse+0.165
 clim+0.164
Neg logit contributions
 fries-0.169
 veggies-0.169
 meals-0.166
ouses-0.163
 groceries-0.156
Token.
Token ID13
Feature activation+7.05e-03

Pos logit contributions
pe+0.102
stack+0.101
 clim+0.095
h+0.092
roads+0.092
Neg logit contributions
 fries-0.072
ouses-0.068
 spaghetti-0.066
 veggies-0.066
 meals-0.065
Token She
Token ID1375
Feature activation-1.63e-02

Pos logit contributions
 fries+0.017
 veggies+0.017
ouses+0.017
affles+0.016
 meals+0.016
Neg logit contributions
stack-0.008
pe-0.007
h-0.007
 collapse-0.007
 clim-0.006
Token followed
Token ID3940
Feature activation-3.52e-03

Pos logit contributions
stack+0.027
pe+0.026
roads+0.025
 collapse+0.024
h+0.024
Neg logit contributions
 fries-0.032
 veggies-0.031
ouses-0.031
 meals-0.028
ory-0.028
Token the
Token ID262
Feature activation-1.89e-02

Pos logit contributions
pe+0.004
stack+0.004
h+0.004
 collapse+0.003
 clim+0.003
Neg logit contributions
ouses-0.009
 fries-0.009
 veggies-0.009
affles-0.008
 meals-0.008
Token Tom
Token ID4186
Feature activation-1.47e-03

Pos logit contributions
ouses+0.023
 fries+0.022
 veggies+0.022
 meals+0.021
affles+0.021
Neg logit contributions
stack-0.006
pe-0.006
 collapse-0.006
h-0.006
 clim-0.006
Token went
Token ID1816
Feature activation-4.52e-02

Pos logit contributions
pe+0.002
stack+0.002
h+0.002
roads+0.002
 collapse+0.002
Neg logit contributions
 fries-0.003
 veggies-0.003
ouses-0.003
 meals-0.003
 groceries-0.003
Token back
Token ID736
Feature activation+5.58e-04

Pos logit contributions
pe+0.077
stack+0.075
h+0.074
 clim+0.070
 collapse+0.069
Neg logit contributions
 fries-0.085
 veggies-0.084
ouses-0.083
 meals-0.077
 groceries-0.075
Token to
Token ID284
Feature activation-3.17e-02

Pos logit contributions
 fries+0.001
 veggies+0.001
ouses+0.001
 meals+0.001
affles+0.001
Neg logit contributions
pe-0.001
stack-0.001
h-0.001
 clim-0.001
 collapse-0.001
Token the
Token ID262
Feature activation-2.39e-02

Pos logit contributions
pe+0.050
stack+0.047
h+0.047
roads+0.044
 collapse+0.043
Neg logit contributions
 fries-0.066
 veggies-0.065
ouses-0.063
 meals-0.061
 groceries-0.060
Token living
Token ID2877
Feature activation-9.66e-02

Pos logit contributions
pe+0.034
h+0.033
stack+0.031
 collapse+0.029
 clim+0.029
Neg logit contributions
 fries-0.052
 veggies-0.052
ouses-0.051
 groceries-0.050
 meals-0.049
Token room
Token ID2119
Feature activation-3.86e-02

Pos logit contributions
stack+0.177
pe+0.177
 collapse+0.170
h+0.169
 clim+0.160
Neg logit contributions
ouses-0.172
 veggies-0.169
 meals-0.164
 fries-0.162
 groceries-0.156
Token and
Token ID290
Feature activation+1.70e-02

Pos logit contributions
pe+0.075
stack+0.074
 collapse+0.070
h+0.070
roads+0.069
Neg logit contributions
ouses-0.062
 fries-0.060
 veggies-0.059
 meals-0.058
 spaghetti-0.056
Token played
Token ID2826
Feature activation-9.34e-03

Pos logit contributions
 fries+0.035
ouses+0.034
 veggies+0.034
 meals+0.032
 groceries+0.031
Neg logit contributions
pe-0.027
stack-0.026
h-0.025
 collapse-0.024
roads-0.023
Token with
Token ID351
Feature activation+2.59e-02

Pos logit contributions
pe+0.011
stack+0.011
h+0.011
 clim+0.010
roads+0.010
Neg logit contributions
 fries-0.022
 veggies-0.022
ouses-0.022
 groceries-0.020
 meals-0.020
Token their
Token ID511
Feature activation+3.39e-02

Pos logit contributions
ouses+0.072
 fries+0.065
 veggies+0.064
urger+0.064
affles+0.064
Neg logit contributions
pe-0.018
 collapse-0.017
 hon-0.017
 switch-0.017
 only-0.016
Token hear
Token ID3285
Feature activation+2.40e-03

Pos logit contributions
pe+0.060
h+0.057
stack+0.056
roads+0.053
bank+0.052
Neg logit contributions
 fries-0.073
 veggies-0.069
ouses-0.065
 groceries-0.063
 spaghetti-0.063
Token Mom
Token ID11254
Feature activation-9.74e-04

Pos logit contributions
ouses+0.006
 veggies+0.006
 fries+0.006
affles+0.005
heddar+0.005
Neg logit contributions
 collapse-0.003
pe-0.002
h-0.002
 crash-0.002
 switch-0.002
Token coming
Token ID2406
Feature activation-3.45e-02

Pos logit contributions
stack+0.002
pe+0.001
 clim+0.001
 cliff+0.001
roads+0.001
Neg logit contributions
 fries-0.002
 veggies-0.002
ouses-0.002
 spaghetti-0.002
 meals-0.002
Token down
Token ID866
Feature activation-4.19e-02

Pos logit contributions
pe+0.064
stack+0.062
h+0.058
roads+0.058
 clim+0.056
Neg logit contributions
 veggies-0.058
 fries-0.057
ouses-0.057
 groceries-0.055
 meals-0.054
Token the
Token ID262
Feature activation-3.53e-02

Pos logit contributions
pe+0.072
stack+0.070
h+0.065
 clim+0.064
roads+0.063
Neg logit contributions
 fries-0.077
 veggies-0.076
 groceries-0.074
 meals-0.073
ouses-0.073
Token stairs
Token ID16046
Feature activation-9.64e-02

Pos logit contributions
h+0.058
pe+0.056
stack+0.054
 clim+0.052
roads+0.050
Neg logit contributions
 veggies-0.070
 fries-0.067
 groceries-0.066
 spaghetti-0.066
ouses-0.065
Token.
Token ID13
Feature activation-2.53e-02

Pos logit contributions
stack+0.238
pe+0.229
AAAA+0.227
Ro+0.226
esh+0.220
Neg logit contributions
 groceries-0.069
 sandwiches-0.059
 spaghetti-0.055
 slices-0.046
bs-0.046
Token They
Token ID1119
Feature activation-2.95e-02

Pos logit contributions
pe+0.036
stack+0.036
h+0.035
 collapse+0.032
roads+0.032
Neg logit contributions
 fries-0.055
ouses-0.054
 veggies-0.054
 meals-0.050
 groceries-0.049
Token try
Token ID1949
Feature activation-1.47e-02

Pos logit contributions
pe+0.055
h+0.052
stack+0.051
mill+0.051
bank+0.050
Neg logit contributions
 fries-0.046
 veggies-0.045
bs-0.043
 groceries-0.043
 sweets-0.041
Token to
Token ID284
Feature activation-4.53e-02

Pos logit contributions
stack+0.033
pe+0.031
roads+0.030
 clim+0.029
 collapse+0.029
Neg logit contributions
 fries-0.020
 veggies-0.019
 groceries-0.019
ouses-0.019
 spaghetti-0.018
Token hide
Token ID7808
Feature activation-2.95e-03

Pos logit contributions
stack+0.073
pe+0.073
h+0.070
roads+0.067
bank+0.065
Neg logit contributions
 veggies-0.086
 groceries-0.083
 fries-0.083
 spaghetti-0.077
 sandwiches-0.076
Token to
Token ID284
Feature activation-3.94e-02

Pos logit contributions
ouses+0.111
 fries+0.109
 veggies+0.105
affles+0.101
 favorites+0.101
Neg logit contributions
pe-0.063
 never-0.062
h-0.060
stack-0.058
 collapse-0.057
Token go
Token ID467
Feature activation-2.78e-02

Pos logit contributions
roads+0.065
stack+0.063
pe+0.060
 clim+0.059
 site+0.058
Neg logit contributions
 fries-0.073
 veggies-0.071
 spaghetti-0.065
ouses-0.065
 sweets-0.064
Token back
Token ID736
Feature activation+2.75e-03

Pos logit contributions
stack+0.057
pe+0.056
 collapse+0.051
h+0.051
 clim+0.051
Neg logit contributions
ouses-0.044
 groceries-0.042
 fries-0.041
 veggies-0.040
 meals-0.037
Token to
Token ID284
Feature activation-3.73e-02

Pos logit contributions
 fries+0.005
ouses+0.005
 veggies+0.005
 meals+0.004
 groceries+0.004
Neg logit contributions
pe-0.005
stack-0.005
h-0.005
 collapse-0.004
roads-0.004
Token the
Token ID262
Feature activation-2.15e-02

Pos logit contributions
pe+0.065
stack+0.061
h+0.059
roads+0.058
 clim+0.057
Neg logit contributions
 fries-0.071
 veggies-0.070
 groceries-0.067
 meals-0.067
 spaghetti-0.067
Token living
Token ID2877
Feature activation-9.64e-02

Pos logit contributions
pe+0.029
stack+0.028
h+0.027
 collapse+0.025
 clim+0.025
Neg logit contributions
 fries-0.049
ouses-0.048
 veggies-0.048
 meals-0.045
 groceries-0.044
Token room
Token ID2119
Feature activation-4.13e-02

Pos logit contributions
pe+0.158
stack+0.156
h+0.150
 collapse+0.143
 clim+0.141
Neg logit contributions
ouses-0.190
 veggies-0.189
 fries-0.186
 meals-0.181
 groceries-0.175
Token and
Token ID290
Feature activation+1.11e-02

Pos logit contributions
stack+0.082
pe+0.080
roads+0.076
h+0.075
 collapse+0.075
Neg logit contributions
ouses-0.064
 fries-0.060
 veggies-0.059
 meals-0.059
 groceries-0.058
Token do
Token ID466
Feature activation-1.84e-02

Pos logit contributions
 fries+0.022
 veggies+0.022
ouses+0.022
 groceries+0.021
 meals+0.021
Neg logit contributions
pe-0.018
h-0.017
stack-0.016
 clim-0.015
roads-0.015
Token some
Token ID617
Feature activation+3.35e-02

Pos logit contributions
pe+0.030
h+0.030
 collapse+0.029
stack+0.028
 clim+0.027
Neg logit contributions
 fries-0.035
 veggies-0.034
ouses-0.034
 meals-0.031
 groceries-0.031
Token yoga
Token ID20351
Feature activation-1.64e-02

Pos logit contributions
ouses+0.056
 fries+0.056
 veggies+0.054
 meals+0.049
ory+0.049
Neg logit contributions
pe-0.063
h-0.061
stack-0.060
 collapse-0.058
 clim-0.058
Token a
Token ID257
Feature activation-4.68e-03

Pos logit contributions
ouses+0.049
 fries+0.049
 veggies+0.048
affles+0.044
ory+0.044
Neg logit contributions
h-0.030
pe-0.029
 collapse-0.028
 clim-0.027
stack-0.026
Token noise
Token ID7838
Feature activation-4.00e-02

Pos logit contributions
stack+0.008
pe+0.008
h+0.007
roads+0.007
 clim+0.007
Neg logit contributions
 fries-0.009
 veggies-0.009
ouses-0.008
 meals-0.008
 spaghetti-0.008
Token coming
Token ID2406
Feature activation-3.22e-02

Pos logit contributions
stack+0.079
pe+0.075
 collapse+0.065
bank+0.065
roads+0.064
Neg logit contributions
 fries-0.062
 spaghetti-0.062
 veggies-0.059
 meals-0.057
ouses-0.056
Token from
Token ID422
Feature activation-1.55e-02

Pos logit contributions
pe+0.060
stack+0.058
roads+0.054
h+0.052
bank+0.051
Neg logit contributions
 veggies-0.052
 fries-0.051
 groceries-0.051
ouses-0.050
 spaghetti-0.050
Token the
Token ID262
Feature activation-1.82e-02

Pos logit contributions
pe+0.027
h+0.026
stack+0.025
 collapse+0.024
 clim+0.024
Neg logit contributions
 veggies-0.028
 groceries-0.027
 fries-0.027
 spaghetti-0.026
 meals-0.026
Token living
Token ID2877
Feature activation-9.63e-02

Pos logit contributions
pe+0.024
h+0.023
stack+0.022
 clim+0.021
 collapse+0.020
Neg logit contributions
 fries-0.042
 veggies-0.041
ouses-0.040
 groceries-0.039
 meals-0.039
Token room
Token ID2119
Feature activation-5.09e-02

Pos logit contributions
pe+0.185
h+0.175
stack+0.173
 collapse+0.168
 clim+0.163
Neg logit contributions
 veggies-0.160
ouses-0.158
 meals-0.156
 fries-0.156
 groceries-0.152
Token.
Token ID13
Feature activation+2.17e-02

Pos logit contributions
pe+0.112
stack+0.111
 clim+0.102
 flap+0.099
roads+0.098
Neg logit contributions
 fries-0.061
ouses-0.059
 spaghetti-0.058
 groceries-0.055
 sandwiches-0.055
Token She
Token ID1375
Feature activation-1.07e-02

Pos logit contributions
 fries+0.053
ouses+0.052
 veggies+0.052
affles+0.049
 meals+0.049
Neg logit contributions
h-0.024
pe-0.024
stack-0.024
 collapse-0.021
 clim-0.021
Token wanted
Token ID2227
Feature activation+1.66e-02

Pos logit contributions
stack+0.017
pe+0.016
roads+0.015
 collapse+0.015
h+0.014
Neg logit contributions
 fries-0.021
 veggies-0.021
ouses-0.021
 meals-0.019
 spaghetti-0.019
Token to
Token ID284
Feature activation-2.79e-02

Pos logit contributions
ouses+0.034
 fries+0.033
 veggies+0.033
 meals+0.030
affles+0.030
Neg logit contributions
pe-0.026
h-0.025
stack-0.025
 collapse-0.024
 clim-0.023
Token mom
Token ID1995
Feature activation+2.60e-03

Pos logit contributions
ouses+0.004
 fries+0.004
 veggies+0.004
 meals+0.003
 groceries+0.003
Neg logit contributions
pe-0.003
 collapse-0.003
stack-0.003
h-0.003
 crash-0.003
Tokenmy
Token ID1820
Feature activation+9.89e-03

Pos logit contributions
 fries+0.005
 veggies+0.005
ouses+0.005
 meals+0.005
 groceries+0.005
Neg logit contributions
pe-0.004
stack-0.004
h-0.004
 clim-0.004
roads-0.004
Token coming
Token ID2406
Feature activation-3.53e-02

Pos logit contributions
 fries+0.020
 veggies+0.019
ouses+0.019
 meals+0.018
 groceries+0.018
Neg logit contributions
stack-0.016
pe-0.016
h-0.014
roads-0.014
 clim-0.014
Token from
Token ID422
Feature activation-2.47e-02

Pos logit contributions
pe+0.070
stack+0.068
roads+0.064
h+0.063
 clim+0.061
Neg logit contributions
 veggies-0.053
 groceries-0.052
 fries-0.051
ouses-0.051
 meals-0.049
Token the
Token ID262
Feature activation-3.59e-02

Pos logit contributions
pe+0.040
h+0.039
stack+0.036
 clim+0.035
zag+0.034
Neg logit contributions
 veggies-0.048
 groceries-0.048
 fries-0.047
 spaghetti-0.045
 meals-0.045
Token living
Token ID2877
Feature activation-9.62e-02

Pos logit contributions
pe+0.051
stack+0.050
h+0.048
 collapse+0.045
roads+0.045
Neg logit contributions
 fries-0.077
ouses-0.077
 veggies-0.076
 meals-0.071
 groceries-0.070
Token room
Token ID2119
Feature activation-5.89e-02

Pos logit contributions
pe+0.178
stack+0.171
h+0.166
 collapse+0.161
 clim+0.158
Neg logit contributions
 veggies-0.172
ouses-0.168
 fries-0.167
 meals-0.165
 spaghetti-0.161
Token.
Token ID13
Feature activation-2.46e-02

Pos logit contributions
stack+0.125
pe+0.124
 clim+0.122
roads+0.117
esh+0.115
Neg logit contributions
 spaghetti-0.071
 fries-0.071
 sandwiches-0.069
 meals-0.068
 groceries-0.067
Token She
Token ID1375
Feature activation-2.87e-02

Pos logit contributions
pe+0.036
stack+0.034
h+0.033
 clim+0.032
roads+0.031
Neg logit contributions
 fries-0.052
ouses-0.052
 veggies-0.052
 meals-0.049
 groceries-0.048
Token quickly
Token ID2952
Feature activation-2.87e-02

Pos logit contributions
stack+0.045
pe+0.044
roads+0.041
h+0.040
 collapse+0.039
Neg logit contributions
 fries-0.058
 veggies-0.056
ouses-0.056
 meals-0.051
 groceries-0.051
Token got
Token ID1392
Feature activation-2.51e-02

Pos logit contributions
stack+0.047
pe+0.046
roads+0.042
h+0.041
 clim+0.039
Neg logit contributions
 fries-0.058
 veggies-0.056
ouses-0.055
 groceries-0.052
 spaghetti-0.051
Token't
Token ID470
Feature activation-2.09e-02

Pos logit contributions
stack+0.007
pe+0.006
bank+0.006
 clim+0.006
roads+0.005
Neg logit contributions
 fries-0.014
 veggies-0.014
 groceries-0.014
 spaghetti-0.014
ouses-0.014
Token touch
Token ID3638
Feature activation-9.08e-03

Pos logit contributions
stack+0.033
pe+0.032
roads+0.029
bank+0.027
h+0.027
Neg logit contributions
 fries-0.044
 veggies-0.043
 spaghetti-0.040
ouses-0.039
 meals-0.039
Token anything
Token ID1997
Feature activation+2.93e-02

Pos logit contributions
stack+0.014
pe+0.014
h+0.013
 clim+0.012
 collapse+0.012
Neg logit contributions
 fries-0.019
 veggies-0.019
 groceries-0.018
ouses-0.018
 spaghetti-0.017
Token!"
Token ID2474
Feature activation-4.33e-04

Pos logit contributions
 fries+0.054
 veggies+0.053
ouses+0.052
 spaghetti+0.050
 groceries+0.050
Neg logit contributions
stack-0.051
pe-0.049
h-0.046
roads-0.044
 collapse-0.043
Token the
Token ID262
Feature activation-5.52e-02

Pos logit contributions
pe+0.001
stack+0.001
h+0.001
 clim+0.001
 collapse+0.001
Neg logit contributions
 fries-0.001
 veggies-0.001
ouses-0.001
 meals-0.001
 groceries-0.001
Token mill
Token ID3939
Feature activation-9.59e-02

Pos logit contributions
stack+0.080
h+0.078
 collapse+0.074
pe+0.073
 clim+0.072
Neg logit contributions
 fries-0.116
 veggies-0.114
 groceries-0.114
ouses-0.106
 peanuts-0.106
Tokener
Token ID263
Feature activation-6.23e-02

Pos logit contributions
stack+0.175
 clim+0.154
und+0.153
h+0.151
 blur+0.146
Neg logit contributions
 fries-0.152
 sandwiches-0.140
 veggies-0.130
ouses-0.128
led-0.127
Token says
Token ID1139
Feature activation-1.06e-02

Pos logit contributions
stack+0.135
 hill+0.129
 cliff+0.128
 slope+0.125
 staircase+0.124
Neg logit contributions
 fries-0.069
ouses-0.060
ages-0.057
bs-0.055
 faces-0.054
Token.
Token ID13
Feature activation-2.92e-02

Pos logit contributions
stack+0.020
pe+0.019
roads+0.019
 clim+0.018
 hill+0.018
Neg logit contributions
 fries-0.017
ouses-0.017
 veggies-0.017
 meals-0.016
 spaghetti-0.016
Token He
Token ID679
Feature activation-3.07e-02

Pos logit contributions
stack+0.057
pe+0.056
h+0.056
 clim+0.055
roads+0.054
Neg logit contributions
 fries-0.046
ouses-0.043
 veggies-0.042
 spaghetti-0.040
 groceries-0.039
Token slams
Token ID45846
Feature activation-6.18e-02

Pos logit contributions
stack+0.046
h+0.045
pe+0.044
roads+0.040
 collapse+0.039
Neg logit contributions
 fries-0.067
 veggies-0.064
ouses-0.060
 favorites-0.059
 spaghetti-0.059
Token their
Token ID511
Feature activation-1.22e-02

Pos logit contributions
pe+0.043
h+0.042
stack+0.041
roads+0.039
 collapse+0.038
Neg logit contributions
 fries-0.042
 veggies-0.041
ouses-0.038
 meals-0.038
 spaghetti-0.038
Token mom
Token ID1995
Feature activation-1.49e-02

Pos logit contributions
pe+0.019
 bridge+0.018
 zo+0.018
stack+0.018
 poster+0.018
Neg logit contributions
ouses-0.024
 veggies-0.023
 fries-0.022
 groceries-0.022
 meals-0.021
Token came
Token ID1625
Feature activation-2.92e-02

Pos logit contributions
stack+0.021
pe+0.021
roads+0.020
 collapse+0.019
 clim+0.019
Neg logit contributions
 fries-0.032
 veggies-0.030
ouses-0.029
 spaghetti-0.029
 meals-0.027
Token into
Token ID656
Feature activation-4.63e-02

Pos logit contributions
stack+0.057
pe+0.056
 collapse+0.052
roads+0.051
h+0.050
Neg logit contributions
 veggies-0.045
ouses-0.044
 fries-0.043
 groceries-0.043
 spaghetti-0.042
Token the
Token ID262
Feature activation-3.85e-02

Pos logit contributions
stack+0.071
pe+0.071
h+0.066
 clim+0.066
 collapse+0.063
Neg logit contributions
 fries-0.093
 veggies-0.092
 meals-0.088
 spaghetti-0.088
 groceries-0.087
Token living
Token ID2877
Feature activation-9.58e-02

Pos logit contributions
pe+0.059
h+0.056
stack+0.056
 clim+0.052
 collapse+0.052
Neg logit contributions
 fries-0.080
 veggies-0.079
ouses-0.078
 meals-0.075
 spaghetti-0.074
Token room
Token ID2119
Feature activation-6.03e-02

Pos logit contributions
stack+0.184
pe+0.180
 collapse+0.176
 clim+0.171
h+0.167
Neg logit contributions
ouses-0.159
 veggies-0.157
 meals-0.153
 fries-0.151
 spaghetti-0.145
Token with
Token ID351
Feature activation+1.70e-02

Pos logit contributions
stack+0.131
pe+0.124
 collapse+0.123
 clim+0.121
esh+0.119
Neg logit contributions
 fries-0.077
 spaghetti-0.075
 veggies-0.071
ouses-0.071
 meals-0.069
Token a
Token ID257
Feature activation+1.50e-02

Pos logit contributions
ouses+0.038
 fries+0.036
 veggies+0.036
affles+0.034
 meals+0.033
Neg logit contributions
pe-0.023
stack-0.023
h-0.022
 collapse-0.021
roads-0.020
Token tray
Token ID26473
Feature activation-3.08e-02

Pos logit contributions
 fries+0.026
 veggies+0.025
ouses+0.024
 meals+0.023
 spaghetti+0.023
Neg logit contributions
pe-0.028
stack-0.028
h-0.027
roads-0.026
 clim-0.026
Token of
Token ID286
Feature activation+8.82e-03

Pos logit contributions
pe+0.060
stack+0.058
 collapse+0.057
h+0.057
 clim+0.056
Neg logit contributions
 fries-0.047
 spaghetti-0.045
 veggies-0.043
ouses-0.043
 sandwiches-0.042
Token and
Token ID290
Feature activation+2.17e-03

Pos logit contributions
stack+0.092
pe+0.082
bank+0.077
roads+0.077
 clim+0.076
Neg logit contributions
 spaghetti-0.046
 fries-0.046
 veggies-0.044
 sandwiches-0.041
 meals-0.041
Token comes
Token ID2058
Feature activation-2.80e-02

Pos logit contributions
 fries+0.004
 veggies+0.004
 spaghetti+0.004
 groceries+0.004
 vegetables+0.004
Neg logit contributions
stack-0.004
pe-0.003
zag-0.003
h-0.003
roads-0.003
Token back
Token ID736
Feature activation-1.17e-02

Pos logit contributions
pe+0.056
stack+0.055
h+0.052
roads+0.051
 collapse+0.050
Neg logit contributions
 fries-0.044
 veggies-0.044
ouses-0.043
 groceries-0.041
 meals-0.041
Token to
Token ID284
Feature activation-3.64e-02

Pos logit contributions
pe+0.025
stack+0.024
roads+0.022
h+0.022
bank+0.021
Neg logit contributions
ouses-0.016
 groceries-0.016
 meals-0.015
 veggies-0.015
 spaghetti-0.015
Token the
Token ID262
Feature activation-3.41e-02

Pos logit contributions
stack+0.057
pe+0.057
h+0.054
roads+0.052
 clim+0.051
Neg logit contributions
 fries-0.073
 veggies-0.072
 meals-0.068
 groceries-0.068
ouses-0.067
Token living
Token ID2877
Feature activation-9.57e-02

Pos logit contributions
h+0.051
pe+0.050
stack+0.046
 clim+0.044
roads+0.043
Neg logit contributions
 veggies-0.072
 fries-0.072
ouses-0.069
 groceries-0.068
 meals-0.067
Token room
Token ID2119
Feature activation-5.44e-02

Pos logit contributions
pe+0.178
stack+0.175
h+0.173
 collapse+0.160
 clim+0.159
Neg logit contributions
 veggies-0.167
ouses-0.164
 fries-0.161
 meals-0.161
 spaghetti-0.153
Token.
Token ID13
Feature activation-3.31e-02

Pos logit contributions
stack+0.129
pe+0.120
 clim+0.117
roads+0.114
Ro+0.113
Neg logit contributions
ouses-0.055
 fries-0.054
 sandwiches-0.054
 spaghetti-0.054
 meals-0.052
Token She
Token ID1375
Feature activation-3.07e-02

Pos logit contributions
pe+0.042
stack+0.040
h+0.039
 collapse+0.036
 clim+0.036
Neg logit contributions
 fries-0.077
ouses-0.076
 veggies-0.076
 meals-0.072
 groceries-0.071
Token sees
Token ID7224
Feature activation-1.31e-02

Pos logit contributions
pe+0.058
stack+0.057
h+0.053
storm+0.050
 clim+0.050
Neg logit contributions
 fries-0.050
 veggies-0.049
 groceries-0.046
ouses-0.046
 favorites-0.045
Token Ben
Token ID3932
Feature activation+2.46e-02

Pos logit contributions
pe+0.012
h+0.011
stack+0.011
 collapse+0.011
 crash+0.011
Neg logit contributions
ouses-0.034
 veggies-0.032
 fries-0.032
 meals-0.031
urger-0.030
Token loud
Token ID7812
Feature activation-6.22e-02

Pos logit contributions
pe+0.043
stack+0.042
h+0.041
roads+0.039
 collapse+0.039
Neg logit contributions
 fries-0.041
 veggies-0.039
ouses-0.038
 meals-0.037
 spaghetti-0.036
Token noise
Token ID7838
Feature activation-4.66e-02

Pos logit contributions
stack+0.106
roads+0.102
 collapse+0.095
eday+0.095
 never+0.093
Neg logit contributions
 fries-0.114
 spaghetti-0.106
 veggies-0.096
 poems-0.094
 peanut-0.094
Token came
Token ID1625
Feature activation-2.21e-02

Pos logit contributions
stack+0.070
pe+0.069
h+0.061
roads+0.058
 clim+0.058
Neg logit contributions
 fries-0.099
 veggies-0.094
ouses-0.089
 spaghetti-0.089
 meals-0.088
Token from
Token ID422
Feature activation-2.47e-02

Pos logit contributions
stack+0.047
pe+0.047
roads+0.046
bank+0.042
 poster+0.040
Neg logit contributions
 veggies-0.027
 spaghetti-0.027
 groceries-0.027
 sandwiches-0.025
 meals-0.025
Token the
Token ID262
Feature activation-3.00e-02

Pos logit contributions
pe+0.045
h+0.044
oh+0.041
 clim+0.040
stack+0.038
Neg logit contributions
 groceries-0.042
 veggies-0.042
 spaghetti-0.041
 meals-0.039
 fries-0.039
Token living
Token ID2877
Feature activation-9.56e-02

Pos logit contributions
pe+0.044
h+0.042
stack+0.042
 clim+0.039
 collapse+0.038
Neg logit contributions
 fries-0.065
 veggies-0.064
ouses-0.062
 meals-0.060
 groceries-0.059
Token room
Token ID2119
Feature activation-5.12e-02

Pos logit contributions
pe+0.179
stack+0.176
h+0.168
 clim+0.166
 collapse+0.166
Neg logit contributions
 veggies-0.164
 meals-0.161
 fries-0.159
ouses-0.157
 spaghetti-0.152
Token.
Token ID13
Feature activation+2.04e-03

Pos logit contributions
stack+0.104
pe+0.103
 clim+0.097
roads+0.097
h+0.093
Neg logit contributions
 fries-0.073
 spaghetti-0.068
ouses-0.065
 veggies-0.065
 sandwiches-0.064
Token It
Token ID632
Feature activation-9.39e-03

Pos logit contributions
ouses+0.004
 fries+0.004
 veggies+0.004
 meals+0.004
 groceries+0.003
Neg logit contributions
pe-0.003
stack-0.003
h-0.003
 collapse-0.003
 clim-0.003
Token was
Token ID373
Feature activation-1.61e-02

Pos logit contributions
stack+0.018
pe+0.017
 clim+0.016
h+0.016
roads+0.016
Neg logit contributions
 fries-0.015
 veggies-0.014
 spaghetti-0.014
 delicious-0.013
 meals-0.013
Token Daddy
Token ID26926
Feature activation+2.21e-02

Pos logit contributions
pe+0.030
stack+0.028
h+0.028
roads+0.027
 clim+0.026
Neg logit contributions
 fries-0.029
 spaghetti-0.028
 veggies-0.027
 groceries-0.026
 meals-0.026
Token one
Token ID530
Feature activation+1.98e-02

Pos logit contributions
stack+0.033
h+0.033
pe+0.032
 collapse+0.030
oh+0.030
Neg logit contributions
 veggies-0.030
 fries-0.029
ouses-0.029
 groceries-0.029
 favourite-0.028
Token she
Token ID673
Feature activation-8.01e-03

Pos logit contributions
 fries+0.038
 veggies+0.037
ouses+0.037
 meals+0.035
 groceries+0.034
Neg logit contributions
pe-0.032
stack-0.032
h-0.030
 collapse-0.029
roads-0.028
Token met
Token ID1138
Feature activation+8.58e-03

Pos logit contributions
pe+0.013
stack+0.013
h+0.012
 collapse+0.012
 clim+0.012
Neg logit contributions
 fries-0.015
ouses-0.015
 veggies-0.015
 meals-0.014
 groceries-0.014
Token in
Token ID287
Feature activation-4.03e-02

Pos logit contributions
 fries+0.018
ouses+0.017
 veggies+0.017
 meals+0.016
 groceries+0.016
Neg logit contributions
pe-0.013
stack-0.013
h-0.012
 collapse-0.012
 clim-0.011
Token the
Token ID262
Feature activation-2.61e-02

Pos logit contributions
pe+0.066
stack+0.064
h+0.063
 clim+0.060
 collapse+0.058
Neg logit contributions
 fries-0.076
 veggies-0.074
ouses-0.073
 meals-0.073
 groceries-0.070
Token alley
Token ID23978
Feature activation-9.55e-02

Pos logit contributions
pe+0.040
h+0.039
stack+0.039
 collapse+0.035
 clim+0.035
Neg logit contributions
 fries-0.053
 veggies-0.053
ouses-0.052
 meals-0.050
 groceries-0.049
Token that
Token ID326
Feature activation-3.20e-02

Pos logit contributions
stack+0.180
 collapse+0.171
 flap+0.171
 clim+0.166
pe+0.165
Neg logit contributions
ouses-0.136
 fries-0.119
 veggies-0.116
ory-0.115
 favourite-0.115
Token day
Token ID1110
Feature activation-1.09e-02

Pos logit contributions
pe+0.054
stack+0.052
h+0.051
 collapse+0.050
 clim+0.048
Neg logit contributions
 veggies-0.056
 fries-0.055
 meals-0.055
ouses-0.055
 spaghetti-0.052
Token.
Token ID13
Feature activation-8.54e-03

Pos logit contributions
pe+0.017
stack+0.017
h+0.015
 collapse+0.015
roads+0.015
Neg logit contributions
 fries-0.022
ouses-0.022
 veggies-0.021
 meals-0.020
 groceries-0.020
Token<|endoftext|>
Token ID50256
Feature activation-2.25e-02

Pos logit contributions
pe+0.014
stack+0.014
h+0.013
 collapse+0.013
 clim+0.013
Neg logit contributions
 fries-0.016
ouses-0.016
 veggies-0.016
 meals-0.015
 groceries-0.015
TokenOnce
Token ID7454
Feature activation+4.08e-02

Pos logit contributions
pe+0.038
stack+0.037
h+0.036
 collapse+0.035
 clim+0.034
Neg logit contributions
 fries-0.041
ouses-0.041
 veggies-0.041
 meals-0.038
 groceries-0.038
Token and
Token ID290
Feature activation+1.78e-02

Pos logit contributions
 fries+0.016
 veggies+0.015
 spaghetti+0.014
 groceries+0.014
ouses+0.014
Neg logit contributions
pe-0.014
h-0.013
stack-0.013
 clim-0.013
roads-0.013
Token then
Token ID788
Feature activation-9.09e-03

Pos logit contributions
 fries+0.038
 veggies+0.038
ouses+0.036
 meals+0.035
 groceries+0.035
Neg logit contributions
pe-0.026
stack-0.025
h-0.024
roads-0.023
 clim-0.023
Token open
Token ID1280
Feature activation-1.48e-02

Pos logit contributions
pe+0.014
stack+0.014
h+0.013
roads+0.013
 collapse+0.012
Neg logit contributions
 fries-0.019
 veggies-0.019
ouses-0.018
 groceries-0.017
 spaghetti-0.017
Token it
Token ID340
Feature activation+9.60e-03

Pos logit contributions
pe+0.022
stack+0.021
h+0.020
 collapse+0.020
 clim+0.020
Neg logit contributions
 veggies-0.031
 fries-0.031
ouses-0.030
 groceries-0.029
 meals-0.029
Token with
Token ID351
Feature activation+1.29e-02

Pos logit contributions
 veggies+0.018
ouses+0.018
 fries+0.018
 spaghetti+0.017
 groceries+0.017
Neg logit contributions
pe-0.015
stack-0.015
 collapse-0.014
 clim-0.014
h-0.014
Token your
Token ID534
Feature activation+5.20e-02

Pos logit contributions
 fries+0.024
 veggies+0.023
ouses+0.022
 meals+0.021
 groceries+0.021
Neg logit contributions
pe-0.023
stack-0.023
h-0.022
roads-0.021
 clim-0.021
Token fingers
Token ID9353
Feature activation+4.01e-02

Pos logit contributions
ouses+0.093
 fries+0.083
 veggies+0.082
heddar+0.080
affles+0.078
Neg logit contributions
pe-0.096
h-0.087
stack-0.087
 flap-0.085
 bridge-0.085
Token.
Token ID13
Feature activation+2.45e-03

Pos logit contributions
ouses+0.079
 fries+0.079
 veggies+0.079
 groceries+0.073
 meals+0.071
Neg logit contributions
stack-0.063
pe-0.062
 clim-0.058
roads-0.058
 collapse-0.058
Token Don
Token ID2094
Feature activation+1.60e-02

Pos logit contributions
 fries+0.005
ouses+0.005
 veggies+0.005
 meals+0.004
 groceries+0.004
Neg logit contributions
pe-0.004
stack-0.004
h-0.004
 collapse-0.004
 clim-0.004
Token't
Token ID470
Feature activation-2.01e-02

Pos logit contributions
 veggies+0.041
 fries+0.041
 groceries+0.040
ouses+0.040
 spaghetti+0.039
Neg logit contributions
stack-0.014
pe-0.013
 clim-0.012
bank-0.011
 collapse-0.011
Token let
Token ID1309
Feature activation+1.17e-02

Pos logit contributions
stack+0.032
pe+0.032
roads+0.031
esh+0.027
 site+0.027
Neg logit contributions
 fries-0.040
 veggies-0.040
 spaghetti-0.036
 peanut-0.036
ouses-0.036
Token scared
Token ID12008
Feature activation-4.18e-03

Pos logit contributions
stack+0.018
pe+0.018
h+0.017
roads+0.016
 collapse+0.016
Neg logit contributions
 fries-0.027
 veggies-0.027
ouses-0.027
 meals-0.025
 groceries-0.025
Token.
Token ID13
Feature activation+1.93e-02

Pos logit contributions
pe+0.007
stack+0.007
h+0.007
roads+0.006
 collapse+0.006
Neg logit contributions
 fries-0.008
 veggies-0.008
ouses-0.008
 meals-0.007
 groceries-0.007
Token She
Token ID1375
Feature activation-1.25e-02

Pos logit contributions
 fries+0.040
ouses+0.040
 veggies+0.039
 meals+0.037
 groceries+0.036
Neg logit contributions
pe-0.029
stack-0.028
h-0.028
 collapse-0.026
 clim-0.026
Token even
Token ID772
Feature activation+1.30e-02

Pos logit contributions
pe+0.018
stack+0.018
h+0.018
 clim+0.016
 collapse+0.016
Neg logit contributions
 fries-0.026
ouses-0.026
 veggies-0.026
 meals-0.024
 groceries-0.024
Token showed
Token ID3751
Feature activation+2.89e-02

Pos logit contributions
 fries+0.029
ouses+0.029
 veggies+0.028
 meals+0.027
 groceries+0.026
Neg logit contributions
pe-0.017
stack-0.017
h-0.017
 clim-0.015
 collapse-0.015
Token him
Token ID683
Feature activation+5.86e-02

Pos logit contributions
ouses+0.066
 fries+0.063
 veggies+0.061
 meals+0.058
affles+0.057
Neg logit contributions
pe-0.039
h-0.038
stack-0.038
 collapse-0.036
 clim-0.036
Token that
Token ID326
Feature activation+6.48e-03

Pos logit contributions
ouses+0.141
 fries+0.139
 veggies+0.135
ory+0.129
 groceries+0.126
Neg logit contributions
pe-0.067
stack-0.063
 collapse-0.061
 clim-0.060
h-0.058
Token there
Token ID612
Feature activation+7.73e-03

Pos logit contributions
 fries+0.013
 veggies+0.013
ouses+0.013
 meals+0.012
 spaghetti+0.012
Neg logit contributions
pe-0.010
stack-0.010
h-0.010
 clim-0.009
 collapse-0.009
Token was
Token ID373
Feature activation-2.39e-02

Pos logit contributions
 veggies+0.015
 fries+0.015
ouses+0.014
 meals+0.014
 groceries+0.013
Neg logit contributions
pe-0.013
h-0.013
stack-0.012
 clim-0.011
roads-0.011
Token nothing
Token ID2147
Feature activation+1.89e-02

Pos logit contributions
pe+0.038
stack+0.037
h+0.037
roads+0.034
 clim+0.034
Neg logit contributions
 fries-0.047
 veggies-0.047
ouses-0.046
 meals-0.044
 groceries-0.043
Token under
Token ID739
Feature activation-1.73e-02

Pos logit contributions
 fries+0.037
ouses+0.036
 veggies+0.036
 meals+0.034
 groceries+0.033
Neg logit contributions
pe-0.031
stack-0.030
h-0.030
 collapse-0.028
 clim-0.028
Token asked
Token ID1965
Feature activation+2.19e-02

Pos logit contributions
 fries+0.012
 veggies+0.012
ouses+0.012
 meals+0.011
 groceries+0.011
Neg logit contributions
pe-0.009
stack-0.009
h-0.008
 collapse-0.008
roads-0.008
Token her
Token ID607
Feature activation+3.50e-02

Pos logit contributions
ouses+0.055
 fries+0.055
 veggies+0.052
heddar+0.049
 favorites+0.049
Neg logit contributions
h-0.027
pe-0.021
 clim-0.019
 collapse-0.019
stack-0.019
Token to
Token ID284
Feature activation-2.11e-02

Pos logit contributions
 fries+0.074
ouses+0.072
 veggies+0.071
affles+0.066
 meals+0.066
Neg logit contributions
pe-0.050
h-0.050
 clim-0.047
stack-0.047
 collapse-0.044
Token clean
Token ID3424
Feature activation-2.46e-02

Pos logit contributions
pe+0.037
stack+0.036
h+0.034
roads+0.033
 clim+0.032
Neg logit contributions
 fries-0.041
 veggies-0.040
ouses-0.039
 meals-0.037
 groceries-0.037
Token up
Token ID510
Feature activation+1.33e-02

Pos logit contributions
pe+0.036
stack+0.035
h+0.034
 collapse+0.032
roads+0.032
Neg logit contributions
 fries-0.052
ouses-0.052
 veggies-0.051
 meals-0.048
 groceries-0.047
Token her
Token ID607
Feature activation+4.93e-02

Pos logit contributions
 fries+0.027
ouses+0.027
 veggies+0.027
 meals+0.025
 groceries+0.024
Neg logit contributions
pe-0.021
stack-0.020
h-0.020
 collapse-0.018
 clim-0.018
Token toys
Token ID14958
Feature activation+1.46e-02

Pos logit contributions
 fries+0.108
ouses+0.096
 veggies+0.096
ory+0.094
heddar+0.092
Neg logit contributions
pe-0.070
 switch-0.069
h-0.065
 hill-0.065
 bridge-0.063
Token and
Token ID290
Feature activation+3.18e-02

Pos logit contributions
ouses+0.028
 veggies+0.028
 groceries+0.027
 fries+0.027
 spaghetti+0.026
Neg logit contributions
pe-0.024
stack-0.023
 collapse-0.022
 clim-0.022
h-0.022
Token put
Token ID1234
Feature activation+2.41e-02

Pos logit contributions
 fries+0.066
 veggies+0.065
ouses+0.064
 groceries+0.060
 meals+0.060
Neg logit contributions
pe-0.050
stack-0.048
h-0.047
roads-0.044
 clim-0.044
Token them
Token ID606
Feature activation+6.92e-02

Pos logit contributions
ouses+0.057
 fries+0.057
 veggies+0.055
 meals+0.053
affles+0.053
Neg logit contributions
pe-0.027
h-0.027
stack-0.024
 switch-0.024
roads-0.022
Token away
Token ID1497
Feature activation+3.18e-02

Pos logit contributions
 fries+0.167
 veggies+0.158
ouses+0.156
 meals+0.149
heddar+0.145
Neg logit contributions
pe-0.086
h-0.082
stack-0.078
 collapse-0.074
 clim-0.072
Token and
Token ID290
Feature activation+2.95e-02

Pos logit contributions
 fries+0.003
 veggies+0.003
ouses+0.003
 meals+0.003
 spaghetti+0.003
Neg logit contributions
pe-0.003
stack-0.003
 collapse-0.003
h-0.003
roads-0.003
Token wrapped
Token ID12908
Feature activation+6.58e-03

Pos logit contributions
 fries+0.062
 veggies+0.060
ouses+0.060
 meals+0.056
 groceries+0.055
Neg logit contributions
pe-0.045
stack-0.044
h-0.043
 collapse-0.040
roads-0.040
Token his
Token ID465
Feature activation+6.05e-02

Pos logit contributions
ouses+0.015
 fries+0.015
 veggies+0.014
 meals+0.014
affles+0.014
Neg logit contributions
pe-0.009
stack-0.009
h-0.008
roads-0.008
 collapse-0.008
Token arms
Token ID5101
Feature activation+3.39e-02

Pos logit contributions
ouses+0.128
 fries+0.122
 veggies+0.116
ory+0.115
heddar+0.114
Neg logit contributions
pe-0.100
stack-0.087
 flap-0.085
 hill-0.084
h-0.082
Token around
Token ID1088
Feature activation+1.00e-02

Pos logit contributions
 fries+0.065
ouses+0.064
 veggies+0.064
 meals+0.059
 groceries+0.059
Neg logit contributions
pe-0.055
stack-0.055
h-0.052
 collapse-0.050
roads-0.050
Token him
Token ID683
Feature activation+5.21e-02

Pos logit contributions
ouses+0.023
 fries+0.022
 veggies+0.022
 meals+0.020
affles+0.020
Neg logit contributions
pe-0.014
stack-0.013
h-0.013
 collapse-0.012
 clim-0.012
Token.
Token ID13
Feature activation+4.32e-03

Pos logit contributions
 fries+0.110
ouses+0.108
 veggies+0.108
 meals+0.101
 groceries+0.100
Neg logit contributions
pe-0.077
stack-0.075
h-0.073
 collapse-0.069
 clim-0.068
Token\n
Token ID198
Feature activation+3.25e-03

Pos logit contributions
 fries+0.009
 veggies+0.009
ouses+0.009
 meals+0.008
 groceries+0.008
Neg logit contributions
pe-0.007
stack-0.006
h-0.006
 collapse-0.006
 clim-0.006
Token\n
Token ID198
Feature activation-1.37e-02

Pos logit contributions
 fries+0.007
 veggies+0.006
ouses+0.006
 meals+0.006
 groceries+0.006
Neg logit contributions
pe-0.005
stack-0.005
h-0.005
 collapse-0.005
roads-0.005
Token"
Token ID1
Feature activation-1.28e-02

Pos logit contributions
stack+0.021
pe+0.019
 collapse+0.018
 clim+0.018
h+0.017
Neg logit contributions
ouses-0.029
 fries-0.028
 veggies-0.027
 groceries-0.025
 spaghetti-0.025
TokenBe
Token ID3856
Feature activation-1.95e-02

Pos logit contributions
pe+0.019
stack+0.019
h+0.018
 collapse+0.017
 clim+0.017
Neg logit contributions
 fries-0.027
ouses-0.027
 veggies-0.027
 meals-0.025
 groceries-0.025
Token world
Token ID995
Feature activation-1.60e-02

Pos logit contributions
 fries+0.024
ouses+0.023
 veggies+0.023
 meals+0.022
 groceries+0.022
Neg logit contributions
pe-0.013
stack-0.012
h-0.012
 collapse-0.011
 clim-0.011
Token,
Token ID11
Feature activation+6.55e-03

Pos logit contributions
stack+0.030
 collapse+0.029
pe+0.029
h+0.028
 clim+0.026
Neg logit contributions
 veggies-0.026
ouses-0.025
 fries-0.025
 meals-0.025
 spaghetti-0.025
Token so
Token ID523
Feature activation+1.28e-02

Pos logit contributions
ouses+0.015
 fries+0.015
 veggies+0.015
 meals+0.014
 groceries+0.014
Neg logit contributions
pe-0.008
stack-0.008
h-0.008
roads-0.007
 clim-0.007
Token he
Token ID339
Feature activation-4.60e-03

Pos logit contributions
ouses+0.033
 veggies+0.032
 fries+0.032
affles+0.031
heddar+0.030
Neg logit contributions
pe-0.013
h-0.012
stack-0.012
 hill-0.011
 never-0.011
Token left
Token ID1364
Feature activation+6.73e-03

Pos logit contributions
pe+0.006
stack+0.006
h+0.006
 clim+0.006
 collapse+0.006
Neg logit contributions
 fries-0.010
ouses-0.010
 veggies-0.010
 meals-0.009
 groceries-0.009
Token his
Token ID465
Feature activation+4.73e-02

Pos logit contributions
 fries+0.013
 veggies+0.012
ouses+0.012
 groceries+0.012
 meals+0.012
Neg logit contributions
pe-0.011
stack-0.011
h-0.011
 collapse-0.011
 clim-0.011
Token home
Token ID1363
Feature activation+2.37e-02

Pos logit contributions
 fries+0.098
ouses+0.096
 veggies+0.094
affles+0.088
heddar+0.086
Neg logit contributions
pe-0.073
stack-0.072
 hill-0.066
h-0.065
 bridge-0.064
Token early
Token ID1903
Feature activation-2.29e-02

Pos logit contributions
 fries+0.047
ouses+0.047
 veggies+0.047
 meals+0.044
 groceries+0.044
Neg logit contributions
pe-0.037
stack-0.036
h-0.035
 collapse-0.033
roads-0.033
Token in
Token ID287
Feature activation-3.10e-02

Pos logit contributions
 collapse+0.044
stack+0.042
pe+0.042
h+0.040
 clim+0.040
Neg logit contributions
 fries-0.038
ouses-0.037
 veggies-0.036
 spaghetti-0.035
 meals-0.034
Token the
Token ID262
Feature activation-5.00e-03

Pos logit contributions
pe+0.053
stack+0.053
h+0.051
 clim+0.049
oh+0.047
Neg logit contributions
 fries-0.058
 veggies-0.057
 meals-0.055
 groceries-0.054
 spaghetti-0.054
Token morning
Token ID3329
Feature activation-3.22e-02

Pos logit contributions
pe+0.008
stack+0.008
h+0.008
 collapse+0.008
 clim+0.008
Neg logit contributions
 fries-0.009
 veggies-0.009
ouses-0.009
 meals-0.009
 groceries-0.009
Token day
Token ID1110
Feature activation-4.32e-03

Pos logit contributions
pe+0.109
stack+0.108
 collapse+0.107
roads+0.103
h+0.102
Neg logit contributions
 veggies-0.083
 meals-0.082
ouses-0.082
 spaghetti-0.081
 fries-0.079
Token on
Token ID319
Feature activation-2.81e-02

Pos logit contributions
pe+0.007
stack+0.007
 collapse+0.007
h+0.007
roads+0.007
Neg logit contributions
ouses-0.008
 fries-0.008
 veggies-0.008
 meals-0.007
 groceries-0.007
Token anytime
Token ID17949
Feature activation+6.06e-03

Pos logit contributions
pe+0.033
stack+0.033
 collapse+0.032
h+0.032
 clim+0.032
Neg logit contributions
 fries-0.065
 veggies-0.065
ouses-0.064
 meals-0.063
 groceries-0.061
Token Gabe
Token ID45513
Feature activation-5.61e-04

Pos logit contributions
 fries+0.013
ouses+0.013
 veggies+0.013
 meals+0.012
 groceries+0.012
Neg logit contributions
pe-0.009
h-0.009
stack-0.008
 clim-0.008
roads-0.008
Token looked
Token ID3114
Feature activation-1.28e-02

Pos logit contributions
stack+0.001
pe+0.001
h+0.001
 collapse+0.001
roads+0.001
Neg logit contributions
 fries-0.001
 veggies-0.001
ouses-0.001
 meals-0.001
 spaghetti-0.001
Token at
Token ID379
Feature activation+9.07e-04

Pos logit contributions
h+0.015
 clim+0.015
pe+0.013
roads+0.013
stack+0.012
Neg logit contributions
 fries-0.030
ouses-0.029
 veggies-0.028
affles-0.027
uggets-0.026
Token the
Token ID262
Feature activation-1.84e-02

Pos logit contributions
ouses+0.002
 fries+0.002
 veggies+0.002
 meals+0.002
affles+0.002
Neg logit contributions
pe-0.001
stack-0.001
h-0.001
 clim-0.001
 collapse-0.001
Token rock
Token ID3881
Feature activation-3.27e-02

Pos logit contributions
pe+0.024
stack+0.023
h+0.022
 collapse+0.021
roads+0.021
Neg logit contributions
 fries-0.042
 veggies-0.042
ouses-0.042
 meals-0.039
 groceries-0.039
Token it
Token ID340
Feature activation+2.73e-03

Pos logit contributions
pe+0.044
stack+0.043
 collapse+0.043
h+0.042
roads+0.040
Neg logit contributions
 veggies-0.069
 fries-0.068
ouses-0.067
 meals-0.066
 spaghetti-0.064
Token made
Token ID925
Feature activation+5.70e-03

Pos logit contributions
 veggies+0.006
ouses+0.005
 fries+0.005
 meals+0.005
 spaghetti+0.005
Neg logit contributions
pe-0.004
stack-0.004
h-0.004
 collapse-0.004
roads-0.004
Token him
Token ID683
Feature activation+3.62e-02

Pos logit contributions
ouses+0.012
 fries+0.012
 veggies+0.012
 groceries+0.011
 meals+0.011
Neg logit contributions
pe-0.008
h-0.008
stack-0.008
 clim-0.007
 collapse-0.007
Token.
Token ID13
Feature activation-6.57e-03

Pos logit contributions
 fries+0.023
ouses+0.023
 veggies+0.023
 meals+0.021
 groceries+0.021
Neg logit contributions
pe-0.016
stack-0.016
h-0.015
 collapse-0.014
 clim-0.014
Token They
Token ID1119
Feature activation+2.09e-03

Pos logit contributions
stack+0.005
pe+0.005
 crash+0.005
 collapse+0.005
 zo+0.005
Neg logit contributions
 fries-0.017
 veggies-0.017
affles-0.017
ouses-0.017
urger-0.016
Token were
Token ID547
Feature activation+6.67e-04

Pos logit contributions
ouses+0.004
 fries+0.004
 veggies+0.004
 groceries+0.004
 meals+0.004
Neg logit contributions
pe-0.003
stack-0.003
h-0.003
 clim-0.003
 collapse-0.003
Token happy
Token ID3772
Feature activation+2.05e-02

Pos logit contributions
ouses+0.001
 fries+0.001
 veggies+0.001
 groceries+0.001
ory+0.001
Neg logit contributions
pe-0.001
h-0.001
stack-0.001
 never-0.001
 clim-0.001
Token and
Token ID290
Feature activation+1.86e-02

Pos logit contributions
 fries+0.045
ouses+0.045
 veggies+0.044
 groceries+0.041
 meals+0.041
Neg logit contributions
pe-0.028
h-0.027
stack-0.027
 clim-0.025
 collapse-0.025
Token full
Token ID1336
Feature activation+1.22e-02

Pos logit contributions
ouses+0.033
 veggies+0.033
 fries+0.032
 groceries+0.032
affles+0.031
Neg logit contributions
pe-0.031
stack-0.030
h-0.030
 never-0.029
 collapse-0.028
Token.
Token ID13
Feature activation-8.09e-03

Pos logit contributions
ouses+0.028
 fries+0.027
 veggies+0.027
 meals+0.025
 groceries+0.025
Neg logit contributions
pe-0.016
h-0.015
stack-0.015
 collapse-0.013
 clim-0.013
Token<|endoftext|>
Token ID50256
Feature activation-3.18e-02

Pos logit contributions
stack+0.006
pe+0.006
 zo+0.006
 crash+0.006
't+0.006
Neg logit contributions
 fries-0.021
 veggies-0.021
affles-0.021
ouses-0.021
urger-0.020
TokenTom
Token ID13787
Feature activation+2.54e-02

Pos logit contributions
pe+0.047
stack+0.046
h+0.045
roads+0.042
 collapse+0.042
Neg logit contributions
 fries-0.067
ouses-0.066
 veggies-0.066
 meals-0.061
 groceries-0.060
Tokenmy
Token ID1820
Feature activation+1.49e-02

Pos logit contributions
 fries+0.053
ouses+0.052
 veggies+0.052
 meals+0.048
 groceries+0.048
Neg logit contributions
pe-0.038
stack-0.037
h-0.036
 collapse-0.034
 clim-0.034
Token loved
Token ID6151
Feature activation+3.00e-02

Pos logit contributions
 fries+0.027
 veggies+0.027
ouses+0.026
 meals+0.024
 spaghetti+0.024
Neg logit contributions
pe-0.027
stack-0.027
h-0.025
 collapse-0.025
roads-0.025
Token he
Token ID339
Feature activation-4.60e-03

Pos logit contributions
pe+0.017
h+0.017
stack+0.016
 clim+0.015
roads+0.015
Neg logit contributions
 fries-0.018
 veggies-0.017
 spaghetti-0.016
ouses-0.016
 meals-0.016
Token found
Token ID1043
Feature activation+4.45e-03

Pos logit contributions
pe+0.007
stack+0.007
h+0.007
roads+0.007
 collapse+0.007
Neg logit contributions
 fries-0.009
ouses-0.009
 veggies-0.009
 meals-0.008
 groceries-0.008
Token the
Token ID262
Feature activation+1.32e-03

Pos logit contributions
 fries+0.009
 veggies+0.009
ouses+0.009
 meals+0.008
 groceries+0.008
Neg logit contributions
pe-0.007
stack-0.007
h-0.007
 collapse-0.006
roads-0.006
Token right
Token ID826
Feature activation-3.16e-02

Pos logit contributions
 fries+0.003
 veggies+0.003
ouses+0.003
 meals+0.002
 groceries+0.002
Neg logit contributions
pe-0.002
stack-0.002
h-0.002
roads-0.002
 collapse-0.002
Token spell
Token ID4822
Feature activation-4.63e-03

Pos logit contributions
h+0.051
stack+0.050
pe+0.048
 collapse+0.047
roads+0.046
Neg logit contributions
 veggies-0.060
 meals-0.056
 fries-0.056
ouses-0.056
 favourite-0.054
Token.
Token ID13
Feature activation+2.45e-03

Pos logit contributions
stack+0.007
pe+0.007
h+0.007
 collapse+0.007
 clim+0.007
Neg logit contributions
 fries-0.009
ouses-0.009
 veggies-0.009
 meals-0.008
 spaghetti-0.008
Token He
Token ID679
Feature activation-9.55e-03

Pos logit contributions
 fries+0.005
ouses+0.005
 veggies+0.005
 meals+0.005
 groceries+0.005
Neg logit contributions
pe-0.004
stack-0.004
h-0.004
 collapse-0.003
 clim-0.003
Token said
Token ID531
Feature activation+2.79e-02

Pos logit contributions
pe+0.014
stack+0.014
h+0.014
roads+0.013
 collapse+0.013
Neg logit contributions
 fries-0.020
 veggies-0.019
ouses-0.019
 meals-0.018
 groceries-0.017
Token the
Token ID262
Feature activation-2.39e-02

Pos logit contributions
 fries+0.057
 veggies+0.055
ouses+0.053
 meals+0.051
 spaghetti+0.050
Neg logit contributions
stack-0.043
roads-0.042
pe-0.042
h-0.040
 collapse-0.040
Token spell
Token ID4822
Feature activation-7.20e-03

Pos logit contributions
stack+0.037
pe+0.036
h+0.035
roads+0.034
 collapse+0.034
Neg logit contributions
 veggies-0.047
 fries-0.046
ouses-0.044
 meals-0.043
 poems-0.043
Token out
Token ID503
Feature activation-1.16e-02

Pos logit contributions
stack+0.011
h+0.011
roads+0.011
pe+0.010
 collapse+0.010
Neg logit contributions
 fries-0.014
ouses-0.014
 veggies-0.013
 meals-0.013
 favourite-0.012
Token building
Token ID2615
Feature activation-2.29e-02

Pos logit contributions
stack+0.041
pe+0.040
roads+0.038
 clim+0.036
h+0.036
Neg logit contributions
 veggies-0.051
ouses-0.049
 fries-0.048
 meals-0.048
 groceries-0.047
Token her
Token ID607
Feature activation+3.45e-02

Pos logit contributions
pe+0.036
stack+0.036
h+0.033
 clim+0.032
roads+0.032
Neg logit contributions
 veggies-0.045
 fries-0.044
ouses-0.044
 groceries-0.043
 meals-0.043
Token creation
Token ID6282
Feature activation+6.71e-04

Pos logit contributions
 fries+0.072
ouses+0.069
 veggies+0.068
affles+0.063
 meals+0.063
Neg logit contributions
pe-0.054
stack-0.052
 bridge-0.049
 collapse-0.049
 hill-0.049
Token!
Token ID0
Feature activation+6.46e-03

Pos logit contributions
 veggies+0.001
ouses+0.001
 meals+0.001
 spaghetti+0.001
 groceries+0.001
Neg logit contributions
pe-0.001
stack-0.001
h-0.001
 collapse-0.001
roads-0.001
Token It
Token ID632
Feature activation-5.31e-03

Pos logit contributions
 fries+0.014
ouses+0.014
 veggies+0.014
 meals+0.013
 groceries+0.013
Neg logit contributions
pe-0.009
stack-0.009
h-0.009
 collapse-0.008
 clim-0.008
Token looked
Token ID3114
Feature activation+7.50e-03

Pos logit contributions
stack+0.011
pe+0.010
roads+0.010
h+0.010
 inconvenience+0.009
Neg logit contributions
 veggies-0.008
 delicious-0.008
ouses-0.008
 favourite-0.007
 meals-0.007
Token very
Token ID845
Feature activation+2.37e-02

Pos logit contributions
ouses+0.018
 fries+0.017
affles+0.017
 veggies+0.017
uggets+0.016
Neg logit contributions
h-0.009
 collapse-0.008
 clim-0.008
 never-0.008
pe-0.008
Token pretty
Token ID2495
Feature activation+3.73e-02

Pos logit contributions
 fries+0.040
ouses+0.039
 veggies+0.038
 meals+0.035
 groceries+0.035
Neg logit contributions
h-0.044
pe-0.044
 collapse-0.042
 clim-0.041
stack-0.041
Token in
Token ID287
Feature activation-2.32e-02

Pos logit contributions
 veggies+0.077
 fries+0.077
ouses+0.076
 meals+0.072
 groceries+0.071
Neg logit contributions
pe-0.056
stack-0.056
h-0.053
roads-0.051
 collapse-0.051
Token a
Token ID257
Feature activation+9.19e-03

Pos logit contributions
pe+0.037
stack+0.036
h+0.034
roads+0.033
 clim+0.033
Neg logit contributions
 veggies-0.047
 fries-0.045
 groceries-0.044
 spaghetti-0.044
 meals-0.044
Token bright
Token ID6016
Feature activation-2.17e-02

Pos logit contributions
 fries+0.017
ouses+0.017
 veggies+0.017
 meals+0.016
 groceries+0.016
Neg logit contributions
pe-0.015
stack-0.015
h-0.015
 collapse-0.014
 clim-0.014
Token it
Token ID340
Feature activation-1.31e-02

Pos logit contributions
 fries+0.015
 veggies+0.015
ouses+0.014
 meals+0.014
 spaghetti+0.013
Neg logit contributions
pe-0.013
h-0.012
stack-0.012
 collapse-0.011
 clim-0.011
Token hurt
Token ID5938
Feature activation-2.07e-02

Pos logit contributions
stack+0.024
pe+0.022
h+0.021
roads+0.020
 poster+0.020
Neg logit contributions
ouses-0.020
 veggies-0.020
 favourite-0.020
 favorite-0.020
 fries-0.019
Token his
Token ID465
Feature activation+3.21e-02

Pos logit contributions
pe+0.034
stack+0.034
h+0.033
roads+0.032
 clim+0.031
Neg logit contributions
 fries-0.038
 veggies-0.038
ouses-0.037
 meals-0.036
 groceries-0.035
Token ears
Token ID11368
Feature activation+1.44e-02

Pos logit contributions
 fries+0.064
ouses+0.063
 veggies+0.063
 meals+0.057
 groceries+0.057
Neg logit contributions
pe-0.052
stack-0.050
h-0.048
 collapse-0.046
 clim-0.046
Token.
Token ID13
Feature activation+5.72e-03

Pos logit contributions
 fries+0.024
 veggies+0.024
ouses+0.024
 spaghetti+0.022
 groceries+0.022
Neg logit contributions
stack-0.025
pe-0.025
roads-0.024
h-0.023
 clim-0.023
Token\n
Token ID198
Feature activation+8.63e-03

Pos logit contributions
 fries+0.011
ouses+0.011
 veggies+0.011
 meals+0.010
 groceries+0.010
Neg logit contributions
pe-0.010
stack-0.009
h-0.009
 collapse-0.009
 clim-0.009
Token\n
Token ID198
Feature activation-4.14e-03

Pos logit contributions
 fries+0.017
 veggies+0.016
ouses+0.016
 meals+0.015
 groceries+0.015
Neg logit contributions
pe-0.014
stack-0.014
h-0.014
 collapse-0.013
roads-0.013
TokenThe
Token ID464
Feature activation-2.86e-02

Pos logit contributions
stack+0.007
pe+0.007
h+0.007
 collapse+0.006
 clim+0.006
Neg logit contributions
ouses-0.007
 fries-0.007
 veggies-0.007
 meals-0.007
 spaghetti-0.007
Token bunny
Token ID44915
Feature activation-1.70e-02

Pos logit contributions
pe+0.045
stack+0.045
h+0.045
 collapse+0.041
 clim+0.041
Neg logit contributions
 fries-0.057
 veggies-0.056
ouses-0.055
 groceries-0.052
 meals-0.052
Token flew
Token ID13112
Feature activation-3.02e-02

Pos logit contributions
stack+0.029
 collapse+0.028
h+0.028
pe+0.028
roads+0.027
Neg logit contributions
 fries-0.030
 veggies-0.029
ouses-0.028
 meals-0.027
 favourite-0.026
Token low
Token ID1877
Feature activation-1.66e-02

Pos logit contributions
pe+0.052
stack+0.051
 collapse+0.048
roads+0.048
h+0.047
Neg logit contributions
 veggies-0.049
 fries-0.048
ouses-0.048
 favourite-0.047
 groceries-0.047
Token a
Token ID257
Feature activation+2.93e-02

Pos logit contributions
 fries+0.124
 veggies+0.123
ouses+0.122
 meals+0.119
affles+0.116
Neg logit contributions
h-0.073
pe-0.072
stack-0.069
 clim-0.066
 hon-0.064
Token red
Token ID2266
Feature activation+1.80e-02

Pos logit contributions
 fries+0.053
 veggies+0.053
ouses+0.053
 meals+0.049
 groceries+0.048
Neg logit contributions
pe-0.051
h-0.049
stack-0.049
 clim-0.045
 bridge-0.045
Token bow
Token ID9563
Feature activation+1.01e-02

Pos logit contributions
 fries+0.030
ouses+0.029
 veggies+0.029
 meals+0.027
 groceries+0.026
Neg logit contributions
pe-0.035
stack-0.034
h-0.033
 collapse-0.032
 clim-0.031
Token tie
Token ID9839
Feature activation+3.10e-03

Pos logit contributions
 fries+0.020
ouses+0.019
 veggies+0.019
 meals+0.017
 groceries+0.017
Neg logit contributions
pe-0.016
stack-0.016
h-0.016
 collapse-0.015
roads-0.015
Token.
Token ID13
Feature activation+1.57e-02

Pos logit contributions
 fries+0.006
ouses+0.006
 veggies+0.006
 meals+0.006
 groceries+0.006
Neg logit contributions
pe-0.005
stack-0.005
h-0.005
 collapse-0.005
 clim-0.004
Token He
Token ID679
Feature activation-6.17e-03

Pos logit contributions
 fries+0.035
ouses+0.035
 veggies+0.034
affles+0.032
 meals+0.032
Neg logit contributions
stack-0.020
pe-0.019
h-0.018
roads-0.018
 hill-0.017
Token thought
Token ID1807
Feature activation+3.80e-02

Pos logit contributions
pe+0.009
stack+0.009
h+0.009
 collapse+0.008
roads+0.008
Neg logit contributions
 fries-0.013
ouses-0.013
 veggies-0.013
 meals-0.012
 groceries-0.012
Token he
Token ID339
Feature activation+1.98e-02

Pos logit contributions
 fries+0.084
ouses+0.083
 veggies+0.082
 meals+0.077
 groceries+0.076
Neg logit contributions
pe-0.052
h-0.050
stack-0.050
 collapse-0.046
 clim-0.046
Token looked
Token ID3114
Feature activation-5.23e-03

Pos logit contributions
 fries+0.045
 veggies+0.044
ouses+0.044
 groceries+0.042
 meals+0.042
Neg logit contributions
pe-0.026
h-0.025
stack-0.024
 clim-0.023
 collapse-0.022
Token very
Token ID845
Feature activation+1.57e-02

Pos logit contributions
h+0.006
 never+0.006
 clim+0.006
 hon+0.005
 switch+0.005
Neg logit contributions
 fries-0.012
ouses-0.012
 veggies-0.012
affles-0.011
 groceries-0.011
Token handsome
Token ID22665
Feature activation+2.20e-02

Pos logit contributions
 fries+0.026
 veggies+0.025
ouses+0.025
 groceries+0.023
 meals+0.023
Neg logit contributions
h-0.030
pe-0.030
 clim-0.028
 collapse-0.028
stack-0.027
Token and
Token ID290
Feature activation+6.80e-03

Pos logit contributions
 fries+0.001
 veggies+0.001
ouses+0.001
 meals+0.001
 spaghetti+0.001
Neg logit contributions
pe-0.001
stack-0.001
h-0.001
roads-0.001
 clim-0.001
Token the
Token ID262
Feature activation-1.59e-02

Pos logit contributions
 fries+0.015
 veggies+0.014
ouses+0.014
 meals+0.013
 groceries+0.013
Neg logit contributions
pe-0.010
stack-0.009
h-0.009
roads-0.009
 clim-0.009
Token little
Token ID1310
Feature activation-1.79e-02

Pos logit contributions
pe+0.028
stack+0.027
 hill+0.026
 bridge+0.025
 switch+0.024
Neg logit contributions
ouses-0.030
 fries-0.027
 veggies-0.026
affles-0.025
uggets-0.025
Token girl
Token ID2576
Feature activation-1.78e-02

Pos logit contributions
pe+0.029
stack+0.029
h+0.028
 clim+0.027
 collapse+0.026
Neg logit contributions
 fries-0.035
ouses-0.034
 veggies-0.034
 meals-0.031
 groceries-0.031
Token was
Token ID373
Feature activation-2.60e-02

Pos logit contributions
pe+0.028
stack+0.028
h+0.027
roads+0.026
 collapse+0.026
Neg logit contributions
 fries-0.036
 veggies-0.035
ouses-0.034
 meals-0.032
 spaghetti-0.031
Token enjoying
Token ID13226
Feature activation-6.79e-03

Pos logit contributions
pe+0.052
stack+0.051
roads+0.048
 hill+0.048
h+0.048
Neg logit contributions
 fries-0.040
 veggies-0.038
ouses-0.038
 favourite-0.036
 meals-0.035
Token her
Token ID607
Feature activation+4.15e-02

Pos logit contributions
pe+0.010
stack+0.010
h+0.009
 collapse+0.009
 clim+0.009
Neg logit contributions
 fries-0.014
 veggies-0.014
ouses-0.014
 meals-0.014
 groceries-0.013
Token very
Token ID845
Feature activation+3.15e-02

Pos logit contributions
ouses+0.066
 fries+0.062
 veggies+0.059
heddar+0.054
ory+0.054
Neg logit contributions
pe-0.086
stack-0.082
 hill-0.077
 bridge-0.076
h-0.076
Token modern
Token ID3660
Feature activation+2.61e-03

Pos logit contributions
 fries+0.046
ouses+0.045
 veggies+0.045
 meals+0.041
 groceries+0.040
Neg logit contributions
pe-0.067
stack-0.065
h-0.064
 collapse-0.062
 clim-0.061
Token picnic
Token ID35715
Feature activation+3.62e-04

Pos logit contributions
ouses+0.004
 fries+0.004
 veggies+0.004
 groceries+0.003
 meals+0.003
Neg logit contributions
pe-0.006
stack-0.005
h-0.005
 clim-0.005
 collapse-0.005
Token with
Token ID351
Feature activation+1.79e-02

Pos logit contributions
 fries+0.001
 veggies+0.001
ouses+0.001
 meals+0.001
 groceries+0.001
Neg logit contributions
pe-0.001
stack-0.001
h-0.001
 collapse-0.001
roads-0.001
Token bank
Token ID3331
Feature activation-3.92e-02

Pos logit contributions
 collapse+0.081
pe+0.080
h+0.078
stack+0.078
 clim+0.073
Neg logit contributions
 veggies-0.111
 fries-0.110
 meals-0.106
ouses-0.105
 sandwiches-0.102
Token and
Token ID290
Feature activation+2.09e-02

Pos logit contributions
stack+0.066
pe+0.066
h+0.064
 collapse+0.063
 clim+0.061
Neg logit contributions
 fries-0.071
ouses-0.070
 veggies-0.069
 groceries-0.064
 meals-0.064
Token stretched
Token ID19110
Feature activation+1.32e-03

Pos logit contributions
 fries+0.042
ouses+0.042
 veggies+0.042
 meals+0.039
 groceries+0.038
Neg logit contributions
pe-0.033
stack-0.032
h-0.031
 collapse-0.029
 clim-0.029
Token out
Token ID503
Feature activation+1.05e-02

Pos logit contributions
 fries+0.002
ouses+0.002
 veggies+0.002
 meals+0.002
 groceries+0.002
Neg logit contributions
pe-0.002
stack-0.002
h-0.002
roads-0.002
 collapse-0.002
Token in
Token ID287
Feature activation-3.60e-02

Pos logit contributions
 fries+0.019
 veggies+0.019
ouses+0.018
 meals+0.018
 groceries+0.017
Neg logit contributions
pe-0.018
stack-0.018
h-0.017
roads-0.017
 clim-0.016
Token the
Token ID262
Feature activation-1.72e-02

Pos logit contributions
pe+0.058
stack+0.058
h+0.055
 clim+0.051
roads+0.051
Neg logit contributions
 veggies-0.070
 fries-0.069
 meals-0.067
 groceries-0.066
ouses-0.065
Token sun
Token ID4252
Feature activation-1.74e-02

Pos logit contributions
pe+0.026
stack+0.025
h+0.024
roads+0.023
 collapse+0.023
Neg logit contributions
ouses-0.036
 fries-0.035
 veggies-0.035
 meals-0.032
affles-0.032
Token.
Token ID13
Feature activation+1.20e-02

Pos logit contributions
pe+0.027
stack+0.027
h+0.026
 collapse+0.025
roads+0.025
Neg logit contributions
 fries-0.035
 veggies-0.034
ouses-0.034
 meals-0.032
 groceries-0.031
Token She
Token ID1375
Feature activation-5.94e-03

Pos logit contributions
 fries+0.027
ouses+0.026
 veggies+0.026
 meals+0.025
affles+0.024
Neg logit contributions
stack-0.016
pe-0.016
h-0.015
 clim-0.014
roads-0.014
Token looked
Token ID3114
Feature activation+6.86e-04

Pos logit contributions
pe+0.009
stack+0.009
h+0.008
 collapse+0.008
roads+0.008
Neg logit contributions
 fries-0.012
 veggies-0.012
ouses-0.012
 meals-0.011
 groceries-0.011
Token so
Token ID523
Feature activation+2.90e-02

Pos logit contributions
ouses+0.002
 fries+0.002
 veggies+0.001
affles+0.001
 meals+0.001
Neg logit contributions
pe-0.001
stack-0.001
h-0.001
 clim-0.001
roads-0.001
Token�
Token ID241
Feature activation-1.39e-02

Pos logit contributions
stack+0.029
pe+0.028
h+0.028
 clim+0.028
 collapse+0.027
Neg logit contributions
 fries-0.035
ouses-0.034
 veggies-0.034
 meals-0.032
 favourite-0.031
TokenWow
Token ID22017
Feature activation+1.30e-02

Pos logit contributions
pe+0.021
stack+0.021
h+0.020
 collapse+0.019
roads+0.019
Neg logit contributions
 fries-0.028
ouses-0.028
 veggies-0.028
 meals-0.026
 groceries-0.026
Token,
Token ID11
Feature activation+1.36e-03

Pos logit contributions
ouses+0.025
 veggies+0.024
 favourite+0.023
 fries+0.022
 favorite+0.022
Neg logit contributions
stack-0.021
roads-0.021
pe-0.020
bank-0.019
 collapse-0.018
Token Bernie
Token ID10477
Feature activation+1.56e-02

Pos logit contributions
 fries+0.002
 veggies+0.002
 meals+0.002
ouses+0.002
 favourite+0.002
Neg logit contributions
h-0.002
pe-0.002
 collapse-0.002
stack-0.002
roads-0.002
Token!
Token ID0
Feature activation+1.05e-02

Pos logit contributions
 fries+0.032
ouses+0.031
 veggies+0.030
 meals+0.028
ory+0.028
Neg logit contributions
stack-0.024
 collapse-0.023
pe-0.023
roads-0.022
h-0.021
Tokenâ
Token ID22940
Feature activation-2.04e-02

Pos logit contributions
 fries+0.019
ouses+0.019
 veggies+0.018
 meals+0.017
 spaghetti+0.016
Neg logit contributions
pe-0.018
h-0.018
stack-0.017
 collapse-0.017
roads-0.017
Token€
Token ID26391
Feature activation+1.44e-02

Pos logit contributions
stack+0.031
 clim+0.030
 collapse+0.030
 cliff+0.029
 bridge+0.029
Neg logit contributions
ouses-0.037
 veggies-0.035
 fries-0.034
 meals-0.033
shirt-0.033
Token shouted
Token ID17293
Feature activation+6.75e-03

Pos logit contributions
ouses+0.027
 fries+0.025
 veggies+0.025
 meals+0.024
 favorites+0.023
Neg logit contributions
pe-0.025
stack-0.025
h-0.024
 collapse-0.023
roads-0.023
Token one
Token ID530
Feature activation+2.48e-02

Pos logit contributions
 fries+0.015
ouses+0.015
 veggies+0.014
 meals+0.013
 spaghetti+0.013
Neg logit contributions
pe-0.010
stack-0.009
h-0.009
roads-0.009
 collapse-0.008
Token of
Token ID286
Feature activation-2.05e-02

Pos logit contributions
 fries+0.047
 veggies+0.046
ouses+0.046
 meals+0.043
 groceries+0.042
Neg logit contributions
pe-0.042
stack-0.041
h-0.039
 collapse-0.037
roads-0.037
Token the
Token ID262
Feature activation-8.84e-03

Pos logit contributions
pe+0.033
stack+0.032
h+0.032
 collapse+0.030
 clim+0.030
Neg logit contributions
 fries-0.039
 veggies-0.038
ouses-0.036
 meals-0.035
 spaghetti-0.035
Token the
Token ID262
Feature activation-6.76e-03

Pos logit contributions
ouses+0.012
 fries+0.012
 veggies+0.012
 meals+0.011
affles+0.011
Neg logit contributions
pe-0.005
h-0.005
stack-0.005
 collapse-0.004
roads-0.004
Token flower
Token ID15061
Feature activation-9.00e-03

Pos logit contributions
pe+0.009
stack+0.009
h+0.009
 flap+0.008
 hill+0.008
Neg logit contributions
ouses-0.015
 fries-0.015
 veggies-0.015
 meals-0.014
affles-0.014
Token.
Token ID13
Feature activation-1.17e-02

Pos logit contributions
pe+0.015
stack+0.015
h+0.014
roads+0.014
 clim+0.014
Neg logit contributions
 fries-0.016
 veggies-0.016
ouses-0.016
 meals-0.015
 spaghetti-0.015
Token But
Token ID887
Feature activation-1.84e-02

Pos logit contributions
pe+0.017
stack+0.016
h+0.015
 collapse+0.015
 clim+0.014
Neg logit contributions
 fries-0.025
ouses-0.025
 veggies-0.025
 meals-0.023
 groceries-0.023
Token when
Token ID618
Feature activation-1.60e-02

Pos logit contributions
pe+0.028
stack+0.027
h+0.026
roads+0.025
bank+0.024
Neg logit contributions
 fries-0.037
 veggies-0.037
ouses-0.036
 groceries-0.035
 spaghetti-0.035
Token she
Token ID673
Feature activation-1.75e-02

Pos logit contributions
pe+0.031
stack+0.029
h+0.027
bank+0.027
roads+0.026
Neg logit contributions
 veggies-0.027
 fries-0.027
 groceries-0.027
ouses-0.026
 spaghetti-0.025
Token touched
Token ID12615
Feature activation-2.67e-02

Pos logit contributions
pe+0.023
stack+0.022
h+0.022
 collapse+0.020
 clim+0.020
Neg logit contributions
 fries-0.040
ouses-0.039
 veggies-0.039
 meals-0.037
 groceries-0.037
Token it
Token ID340
Feature activation-6.66e-03

Pos logit contributions
pe+0.037
stack+0.036
h+0.035
 clim+0.033
 collapse+0.033
Neg logit contributions
 fries-0.058
 veggies-0.058
ouses-0.057
 groceries-0.054
 meals-0.054
Token,
Token ID11
Feature activation-1.74e-02

Pos logit contributions
stack+0.010
pe+0.009
roads+0.009
bank+0.008
 clim+0.008
Neg logit contributions
 veggies-0.013
 fries-0.013
 groceries-0.013
 spaghetti-0.013
ouses-0.012
Token she
Token ID673
Feature activation-4.13e-03

Pos logit contributions
pe+0.028
stack+0.027
h+0.027
roads+0.025
 clim+0.024
Neg logit contributions
 veggies-0.034
 fries-0.034
ouses-0.033
 groceries-0.032
 meals-0.031
Token felt
Token ID2936
Feature activation-1.21e-02

Pos logit contributions
pe+0.005
stack+0.005
h+0.005
 collapse+0.005
 clim+0.005
Neg logit contributions
 fries-0.009
ouses-0.009
 veggies-0.009
 meals-0.009
 groceries-0.009
Token.
Token ID13
Feature activation-1.21e-02

Pos logit contributions
h+0.007
pe+0.007
 clim+0.007
stack+0.007
roads+0.007
Neg logit contributions
 fries-0.011
ouses-0.011
 veggies-0.011
affles-0.010
 meals-0.010
Token He
Token ID679
Feature activation-2.21e-02

Pos logit contributions
pe+0.020
stack+0.020
h+0.020
 collapse+0.019
 clim+0.018
Neg logit contributions
 fries-0.023
ouses-0.023
 veggies-0.022
 meals-0.021
 groceries-0.021
Token crawled
Token ID45668
Feature activation-2.85e-02

Pos logit contributions
pe+0.035
stack+0.035
h+0.034
 collapse+0.033
roads+0.032
Neg logit contributions
 fries-0.045
 veggies-0.043
ouses-0.043
 meals-0.039
 favorites-0.039
Token into
Token ID656
Feature activation-2.11e-02

Pos logit contributions
pe+0.044
stack+0.043
h+0.042
roads+0.039
 collapse+0.039
Neg logit contributions
 fries-0.059
ouses-0.058
 veggies-0.057
 meals-0.053
 groceries-0.052
Token the
Token ID262
Feature activation-1.66e-02

Pos logit contributions
pe+0.027
stack+0.026
h+0.026
 clim+0.024
 collapse+0.024
Neg logit contributions
 fries-0.049
 veggies-0.048
ouses-0.048
 meals-0.045
 groceries-0.045
Token hole
Token ID7604
Feature activation-5.89e-02

Pos logit contributions
pe+0.024
stack+0.023
h+0.023
 collapse+0.021
 clim+0.021
Neg logit contributions
 fries-0.036
 veggies-0.035
ouses-0.035
 meals-0.033
 groceries-0.033
Token and
Token ID290
Feature activation+1.75e-02

Pos logit contributions
stack+0.117
 collapse+0.115
roads+0.113
pe+0.112
 clim+0.112
Neg logit contributions
 veggies-0.078
 sandwiches-0.078
 fries-0.077
 spaghetti-0.076
ouses-0.076
Token disappeared
Token ID12120
Feature activation-9.93e-03

Pos logit contributions
 fries+0.035
 veggies+0.035
ouses+0.032
 meals+0.032
 spaghetti+0.031
Neg logit contributions
pe-0.028
h-0.028
stack-0.024
 clim-0.024
roads-0.024
Token.
Token ID13
Feature activation-4.80e-03

Pos logit contributions
pe+0.017
stack+0.017
h+0.017
roads+0.016
 clim+0.016
Neg logit contributions
 veggies-0.018
ouses-0.018
 fries-0.017
 meals-0.016
 groceries-0.016
Token\n
Token ID198
Feature activation+2.65e-03

Pos logit contributions
pe+0.008
stack+0.007
h+0.007
 collapse+0.007
 clim+0.007
Neg logit contributions
 fries-0.010
ouses-0.009
 veggies-0.009
 meals-0.009
 groceries-0.009
Token\n
Token ID198
Feature activation-1.01e-02

Pos logit contributions
 fries+0.005
 veggies+0.005
ouses+0.005
 meals+0.005
 groceries+0.005
Neg logit contributions
pe-0.004
stack-0.004
h-0.004
 collapse-0.004
roads-0.004
Token my
Token ID616
Feature activation+3.34e-02

Pos logit contributions
pe+0.003
stack+0.003
h+0.003
 clim+0.003
 collapse+0.003
Neg logit contributions
 fries-0.004
 veggies-0.004
ouses-0.004
 groceries-0.004
 meals-0.004
Token new
Token ID649
Feature activation+5.19e-03

Pos logit contributions
 fries+0.063
ouses+0.062
 veggies+0.060
affles+0.056
 meals+0.055
Neg logit contributions
pe-0.057
stack-0.055
h-0.052
 bridge-0.051
 hill-0.051
Token calendar
Token ID11845
Feature activation-1.53e-02

Pos logit contributions
 veggies+0.009
ouses+0.009
 fries+0.009
 groceries+0.009
 meals+0.009
Neg logit contributions
pe-0.009
stack-0.009
h-0.009
 collapse-0.008
roads-0.008
Token on
Token ID319
Feature activation-2.48e-02

Pos logit contributions
pe+0.022
stack+0.022
h+0.021
 collapse+0.020
 clim+0.020
Neg logit contributions
 veggies-0.031
ouses-0.030
 fries-0.030
 groceries-0.029
 spaghetti-0.029
Token the
Token ID262
Feature activation-2.17e-02

Pos logit contributions
pe+0.038
stack+0.036
h+0.035
 clim+0.034
 collapse+0.033
Neg logit contributions
 veggies-0.051
 fries-0.050
 groceries-0.049
 spaghetti-0.049
 meals-0.048
Token wall
Token ID3355
Feature activation-6.60e-02

Pos logit contributions
h+0.032
pe+0.032
stack+0.029
 collapse+0.029
 clim+0.029
Neg logit contributions
 veggies-0.045
 fries-0.045
 groceries-0.043
ouses-0.042
 meals-0.042
Token?"
Token ID1701
Feature activation-1.20e-02

Pos logit contributions
stack+0.100
pe+0.099
 clim+0.094
 collapse+0.094
esh+0.088
Neg logit contributions
ouses-0.125
 veggies-0.120
 spaghetti-0.119
 fries-0.118
 groceries-0.114
Token he
Token ID339
Feature activation-3.56e-02

Pos logit contributions
pe+0.020
stack+0.019
h+0.019
 collapse+0.018
 clim+0.018
Neg logit contributions
 fries-0.024
 veggies-0.023
ouses-0.022
 groceries-0.021
 meals-0.021
Token asked
Token ID1965
Feature activation+1.18e-02

Pos logit contributions
stack+0.063
pe+0.057
 collapse+0.057
roads+0.055
 hill+0.054
Neg logit contributions
 fries-0.068
 veggies-0.066
ouses-0.065
 spaghetti-0.060
 poems-0.059
Token,
Token ID11
Feature activation-2.58e-02

Pos logit contributions
 fries+0.026
ouses+0.025
 veggies+0.025
 meals+0.024
 groceries+0.023
Neg logit contributions
pe-0.017
stack-0.016
h-0.016
 collapse-0.015
 clim-0.015
Token trying
Token ID2111
Feature activation-1.16e-02

Pos logit contributions
roads+0.044
stack+0.044
pe+0.042
 collapse+0.042
oh+0.041
Neg logit contributions
 fries-0.046
 veggies-0.045
 spaghetti-0.042
ouses-0.040
uggets-0.039
Token\n
Token ID198
Feature activation-1.61e-02

Pos logit contributions
pe+0.008
stack+0.008
h+0.008
 collapse+0.008
 clim+0.008
Neg logit contributions
 fries-0.009
ouses-0.009
 veggies-0.009
 meals-0.008
 groceries-0.008
TokenThey
Token ID2990
Feature activation-2.43e-02

Pos logit contributions
stack+0.032
 collapse+0.029
 clim+0.028
pe+0.028
roads+0.027
Neg logit contributions
ouses-0.026
 veggies-0.023
 fries-0.023
 poems-0.022
uggets-0.021
Token crawled
Token ID45668
Feature activation-2.61e-02

Pos logit contributions
pe+0.040
stack+0.039
h+0.038
roads+0.038
 collapse+0.037
Neg logit contributions
 fries-0.048
 veggies-0.045
ouses-0.045
ory-0.042
 meals-0.041
Token through
Token ID832
Feature activation-1.61e-02

Pos logit contributions
pe+0.039
stack+0.038
h+0.037
 clim+0.034
 collapse+0.034
Neg logit contributions
 fries-0.056
ouses-0.055
 veggies-0.054
 meals-0.050
affles-0.050
Token the
Token ID262
Feature activation-1.49e-02

Pos logit contributions
pe+0.020
stack+0.020
h+0.019
 collapse+0.018
roads+0.017
Neg logit contributions
 fries-0.037
 veggies-0.037
ouses-0.037
 meals-0.035
 groceries-0.034
Token hole
Token ID7604
Feature activation-5.74e-02

Pos logit contributions
pe+0.022
stack+0.022
h+0.021
 clim+0.020
 collapse+0.020
Neg logit contributions
 fries-0.031
 veggies-0.031
ouses-0.031
 meals-0.029
 groceries-0.029
Token and
Token ID290
Feature activation+2.24e-02

Pos logit contributions
stack+0.107
roads+0.102
pe+0.101
 clim+0.100
 collapse+0.100
Neg logit contributions
 sandwiches-0.087
ouses-0.086
 veggies-0.085
 favorite-0.084
 fries-0.083
Token entered
Token ID5982
Feature activation-1.91e-02

Pos logit contributions
 fries+0.045
 veggies+0.044
ouses+0.043
 meals+0.041
 spaghetti+0.040
Neg logit contributions
pe-0.035
h-0.033
stack-0.032
roads-0.031
 clim-0.030
Token the
Token ID262
Feature activation-1.09e-02

Pos logit contributions
pe+0.022
stack+0.021
h+0.020
 collapse+0.019
roads+0.019
Neg logit contributions
 fries-0.047
ouses-0.047
 veggies-0.046
affles-0.044
 meals-0.043
Token temple
Token ID12505
Feature activation-5.03e-02

Pos logit contributions
pe+0.013
stack+0.013
h+0.013
 collapse+0.012
 clim+0.012
Neg logit contributions
 fries-0.025
 veggies-0.025
ouses-0.025
 meals-0.024
 groceries-0.023
Token.
Token ID13
Feature activation-9.07e-03

Pos logit contributions
pe+0.091
stack+0.090
h+0.087
 clim+0.084
 collapse+0.083
Neg logit contributions
ouses-0.084
 veggies-0.081
 fries-0.079
 meals-0.078
 sandwiches-0.077
Token.
Token ID13
Feature activation+2.86e-02

Pos logit contributions
 fries+0.051
ouses+0.050
 veggies+0.049
 meals+0.046
 spaghetti+0.045
Neg logit contributions
pe-0.043
stack-0.043
 collapse-0.042
roads-0.042
h-0.042
Token She
Token ID1375
Feature activation+9.94e-03

Pos logit contributions
 veggies+0.051
ouses+0.048
 meals+0.046
 groceries+0.046
 fries+0.046
Neg logit contributions
pe-0.051
stack-0.048
h-0.048
roads-0.048
 collapse-0.047
Token was
Token ID373
Feature activation+1.43e-02

Pos logit contributions
 fries+0.018
ouses+0.018
 veggies+0.018
 meals+0.017
 groceries+0.016
Neg logit contributions
pe-0.017
stack-0.017
h-0.016
 collapse-0.016
roads-0.016
Token very
Token ID845
Feature activation+7.88e-03

Pos logit contributions
 fries+0.026
 veggies+0.025
ouses+0.025
 meals+0.023
 groceries+0.023
Neg logit contributions
pe-0.025
stack-0.025
h-0.024
 collapse-0.024
roads-0.023
Token bored
Token ID17533
Feature activation-1.44e-02

Pos logit contributions
 fries+0.012
ouses+0.012
 veggies+0.012
 meals+0.011
 groceries+0.010
Neg logit contributions
stack-0.016
pe-0.016
h-0.015
 collapse-0.015
roads-0.015
Token living
Token ID2877
Feature activation-6.47e-02

Pos logit contributions
pe+0.026
stack+0.026
h+0.024
 collapse+0.023
roads+0.023
Neg logit contributions
 fries-0.025
 veggies-0.025
ouses-0.025
 meals-0.024
 spaghetti-0.024
Token in
Token ID287
Feature activation-3.48e-02

Pos logit contributions
pe+0.100
stack+0.098
h+0.093
 collapse+0.090
roads+0.088
Neg logit contributions
 veggies-0.132
ouses-0.131
 fries-0.131
 meals-0.124
 groceries-0.122
Token the
Token ID262
Feature activation-1.30e-02

Pos logit contributions
stack+0.061
pe+0.061
h+0.054
 clim+0.054
zag+0.052
Neg logit contributions
 veggies-0.063
 meals-0.062
 groceries-0.061
 spaghetti-0.061
 fries-0.060
Token small
Token ID1402
Feature activation-4.84e-02

Pos logit contributions
pe+0.018
stack+0.018
h+0.017
 collapse+0.016
 clim+0.016
Neg logit contributions
 fries-0.028
 veggies-0.028
ouses-0.028
 meals-0.026
 groceries-0.026
Token field
Token ID2214
Feature activation-6.17e-02

Pos logit contributions
stack+0.072
pe+0.072
 collapse+0.068
h+0.068
 clim+0.064
Neg logit contributions
 fries-0.101
 veggies-0.101
ouses-0.100
 meals-0.095
 groceries-0.093
Token near
Token ID1474
Feature activation-1.12e-02

Pos logit contributions
pe+0.120
 collapse+0.116
stack+0.116
 crash+0.110
 zo+0.110
Neg logit contributions
ouses-0.085
 veggies-0.085
 meals-0.084
 spaghetti-0.083
 favourite-0.082
Token\n
Token ID198
Feature activation+4.17e-03

Pos logit contributions
 fries+0.028
 veggies+0.028
ouses+0.027
 meals+0.026
 groceries+0.025
Neg logit contributions
pe-0.022
h-0.021
stack-0.021
roads-0.019
 collapse-0.019
TokenBut
Token ID1537
Feature activation-1.53e-02

Pos logit contributions
ouses+0.008
 fries+0.007
 veggies+0.007
 meals+0.007
affles+0.007
Neg logit contributions
stack-0.007
pe-0.007
 collapse-0.007
h-0.006
 clim-0.006
Token then
Token ID788
Feature activation-2.95e-02

Pos logit contributions
pe+0.024
 collapse+0.022
h+0.022
roads+0.022
stack+0.022
Neg logit contributions
 fries-0.030
 veggies-0.029
ouses-0.028
 favourite-0.028
 groceries-0.027
Token the
Token ID262
Feature activation-3.97e-02

Pos logit contributions
pe+0.038
stack+0.038
h+0.035
 collapse+0.033
roads+0.033
Neg logit contributions
 fries-0.068
 veggies-0.068
ouses-0.067
 meals-0.064
 groceries-0.063
Token wind
Token ID2344
Feature activation-5.50e-02

Pos logit contributions
pe+0.064
stack+0.062
h+0.061
 collapse+0.057
 clim+0.057
Neg logit contributions
 fries-0.079
ouses-0.078
 veggies-0.078
 meals-0.072
 groceries-0.071
Token began
Token ID2540
Feature activation-5.80e-02

Pos logit contributions
stack+0.084
pe+0.081
roads+0.079
 clim+0.077
h+0.076
Neg logit contributions
 fries-0.105
 veggies-0.100
ouses-0.098
ory-0.093
 sandwiches-0.089
Token to
Token ID284
Feature activation-5.35e-02

Pos logit contributions
stack+0.122
pe+0.121
roads+0.116
 clim+0.112
h+0.111
Neg logit contributions
 veggies-0.084
 fries-0.082
ouses-0.081
 meals-0.079
 groceries-0.076
Token blow
Token ID6611
Feature activation-2.94e-02

Pos logit contributions
stack+0.094
roads+0.089
pe+0.086
 clim+0.086
oh+0.080
Neg logit contributions
 veggies-0.089
 fries-0.082
 delicious-0.081
 vegetables-0.078
 spaghetti-0.077
Token,
Token ID11
Feature activation-6.45e-03

Pos logit contributions
pe+0.047
stack+0.046
h+0.043
 clim+0.042
roads+0.041
Neg logit contributions
 veggies-0.057
 fries-0.057
ouses-0.056
 meals-0.054
 groceries-0.053
Token and
Token ID290
Feature activation+4.92e-03

Pos logit contributions
pe+0.012
stack+0.012
roads+0.012
 clim+0.011
h+0.011
Neg logit contributions
 fries-0.011
 veggies-0.010
 spaghetti-0.010
 groceries-0.009
 meals-0.009
Token the
Token ID262
Feature activation-2.87e-02

Pos logit contributions
 fries+0.010
 veggies+0.010
ouses+0.009
 groceries+0.009
 meals+0.009
Neg logit contributions
pe-0.008
stack-0.008
h-0.007
roads-0.007
 clim-0.007