TOP ACTIVATIONS
MAX = 10.494

there was a great l oll ipop . It was pink
time there was a l oll ipop . It was pink
little girl who loved l oll ip ops . One day
Lily . She loved l oll ip ops , especially the
Lily . She loved l oll ip ops and would always
Lily . She loved l oll ip ops and always had
Lily . She loved l oll ip ops and always had
there was a nice l oll ipop . It was round
named Lily who loved l oll ip ops . She would
as she held the l oll ipop in her hand .
to walk and give l oll ip ops to children ."
John . John loved l oll ip ops ! One day
going to get a l oll ipop for being so organized
Tess . She loved l oll ip ops . One day
Lily . She loved l oll ip ops and ate them
thought it was a l oll ipop . She wanted to
Lily . She loved l oll ip ops and always had
gave him a fancy l oll ipop as a reward .
who really wanted a l oll ipop . Every time she
Can we have a l oll ipop , mom ?" Ben

MIN ACTIVATIONS
MIN = -9.333

Julie could fly with no constraints . Julie learned how to
the pigeon all those weeks ago . <|endoftext|> Mike and his
. He says " L adies and gentlemen , boys and
, following the squirrel 's directions . But when he got
They said , " L adies and gentlemen , welcome to
and they all followed her directions . Everyone said they had
- ups , sat - ups and jumping j acks every
Daisy had planted it weeks ago , but now she could
and then a loud " CR ASH !" echoed around the
the wardrobe a few days ago , so she opened it
born only a few days ago . The deer felt sorry
table , making the candle topple over . Everyone quickly rushed
our country a long time ago . He helped many people
big fire alarm , so firefighters were available to come help
looked out the window . Trees and houses flew by quickly
better than yours . Fire fighters save people from fires ."
my had told him long ago . The bird spread out
her hand . CR ASH ! The tower fell
ancestors who sang it long ago . Would you like to
do , a few days ago . He started to applaud

INTERVAL 5.537 - 10.494
CONTAINS 0.009%

What ’ s all this ? ”
" O w , ow , ow ! Ben ,
. She loved l oll ip ops and always had one
. She grabbed a l oll ipop from her bag and
That ’ s a rope , John

INTERVAL 0.580 - 5.537
CONTAINS 31.436%

Snow y . Max would say , " I hate cats
Then , Lily remembered something her mom had told her
sack . It belonged to his neighbor , Mrs . Smith
ives Sam and hugs him back . They play with other
The youth spent the day sharing the jokes and making people

INTERVAL -4.376 - 0.580
CONTAINS 68.476%

very enthusiastic . She ran over to them and asked if
a big tree in the forest . One day , Sp
find it . L ily was sad because she
worry , I 'll help you find him . Let 's
. He felt the water around him and heard the bubbles

INTERVAL -9.333 - -4.376
CONTAINS 0.079%

their snack from their back packs and gave some to the
water balloon with all their might . But they missed the
room and hugged herself . Nobody knew the secret but she
“ Now Annie , ”
. That ’ s
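
The four INTERVAL ranges listed above look consistent with splitting the span from MIN (-9.333) to MAX (10.494) into four equal-width bins. The sketch below recomputes the boundaries under that assumption. The helper name `equal_width_edges` and the bin count of 4 are illustrative and not taken from the tool that produced this dump, and the last printed digit can differ from the listing because the MIN and MAX shown here are already rounded.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 bin edges from hi down to lo, evenly spaced."""
    width = (hi - lo) / n_bins
    return [hi - i * width for i in range(n_bins + 1)]


# MIN and MAX as printed in the dump (rounded to three decimals there).
edges = equal_width_edges(-9.333, 10.494, 4)

# Print the bins from highest to lowest, mirroring the order of the listing.
for upper, lower in zip(edges, edges[1:]):
    print(f"INTERVAL {lower:.3f} - {upper:.3f}")
```

Under this reading, the CONTAINS percentages are the share of tokens whose activation falls in each bin, which would explain why the two outer bins hold well under 1% each.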
Token there
Token ID612
Feature activation+1.47e+00

Pos logit contributions
iflower+0.033
pron+0.028
affle+0.027
opus+0.027
orn+0.027
Neg logit contributions
 stations-0.016
 levers-0.014
 footsteps-0.013
 speakers-0.013
 towels-0.013
Token was
Token ID373
Feature activation+1.28e+00

Pos logit contributions
iflower+0.058
pron+0.050
opus+0.049
affle+0.048
heddar+0.047
Neg logit contributions
 stations-0.031
 levers-0.027
 themselves-0.026
 footsteps-0.025
 towels-0.025
Token a
Token ID257
Feature activation+1.85e+00

Pos logit contributions
iflower+0.058
opus+0.050
pron+0.050
orn+0.049
oppy+0.048
Neg logit contributions
 stations-0.022
 levers-0.018
 speakers-0.017
 footsteps-0.016
 towels-0.015
Token great
Token ID1049
Feature activation+3.03e+00

Pos logit contributions
iflower+0.072
orn+0.065
zed+0.064
wl+0.063
ons+0.062
Neg logit contributions
 stations-0.026
 like-0.025
 turn-0.024
 heads-0.024
 towels-0.024
Token l
Token ID300
Feature activation+1.99e+00

Pos logit contributions
iflower+0.104
pron+0.081
orn+0.081
opus+0.080
oppy+0.079
Neg logit contributions
 stations-0.074
 towels-0.064
 levers-0.063
 footsteps-0.063
 heads-0.063
Tokenoll
Token ID692
Feature activation+1.05e+01

Pos logit contributions
iflower+0.068
pron+0.063
corn+0.058
lf+0.054
opus+0.052
Neg logit contributions
 stations-0.041
ks-0.038
 speakers-0.037
hole-0.036
 footsteps-0.034
Tokenipop
Token ID42800
Feature activation+4.98e+00

Pos logit contributions
iflower+0.203
Buzz+0.191
Mic+0.174
pron+0.168
heddar+0.157
Neg logit contributions
 permission-0.315
 attention-0.295
 coats-0.284
 keys-0.282
hole-0.275
Token.
Token ID13
Feature activation+4.90e-01

Pos logit contributions
iflower+0.214
pron+0.185
oppy+0.184
opus+0.184
orn+0.182
Neg logit contributions
 stations-0.082
 speakers-0.067
 levers-0.066
 footsteps-0.061
 towels-0.058
Token It
Token ID632
Feature activation+2.46e+00

Pos logit contributions
iflower+0.019
opus+0.016
pron+0.015
leton+0.015
affle+0.015
Neg logit contributions
 stations-0.011
 levers-0.010
 footsteps-0.010
 speakers-0.010
 extingu-0.009
Token was
Token ID373
Feature activation-1.13e+00

Pos logit contributions
iflower+0.101
opus+0.099
oppy+0.093
pron+0.091
leton+0.091
Neg logit contributions
 like-0.032
 and-0.025
 sill-0.023
 heads-0.021
 each-0.020
Token pink
Token ID11398
Feature activation+1.39e+00

Pos logit contributions
 stations+0.017
 levers+0.015
 speakers+0.015
 footsteps+0.014
 towels+0.014
Neg logit contributions
iflower-0.048
opus-0.046
leton-0.042
oppy-0.042
orn-0.041
Token time
Token ID640
Feature activation+1.00e-01

Pos logit contributions
iflower+0.074
opus+0.059
pron+0.057
orn+0.057
oppy+0.055
Neg logit contributions
 stations-0.078
 levers-0.071
 speakers-0.069
 footsteps-0.068
 towels-0.067
Token there
Token ID612
Feature activation+6.61e-01

Pos logit contributions
iflower+0.004
pron+0.003
opus+0.003
orn+0.003
affle+0.003
Neg logit contributions
 stations-0.002
 levers-0.002
 footsteps-0.002
 speakers-0.002
 towels-0.002
Token was
Token ID373
Feature activation+5.86e-01

Pos logit contributions
iflower+0.027
pron+0.023
opus+0.022
affle+0.022
heddar+0.021
Neg logit contributions
 stations-0.014
 levers-0.012
 themselves-0.012
 footsteps-0.012
 towels-0.011
Token a
Token ID257
Feature activation+1.15e+00

Pos logit contributions
iflower+0.025
pron+0.021
orn+0.021
opus+0.021
oppy+0.021
Neg logit contributions
 stations-0.011
 levers-0.010
 speakers-0.009
 footsteps-0.009
 heads-0.008
Token l
Token ID300
Feature activation+1.56e+00

Pos logit contributions
iflower+0.045
orn+0.040
zed+0.039
wl+0.038
affle+0.038
Neg logit contributions
 stations-0.018
 speakers-0.017
 footsteps-0.017
 towels-0.017
 turn-0.017
Tokenoll
Token ID692
Feature activation+1.02e+01

Pos logit contributions
iflower+0.054
pron+0.049
corn+0.045
lf+0.043
opus+0.041
Neg logit contributions
 stations-0.032
ks-0.030
 speakers-0.030
hole-0.029
 footsteps-0.028
Tokenipop
Token ID42800
Feature activation+4.69e+00

Pos logit contributions
iflower+0.230
Buzz+0.186
pron+0.178
Mic+0.178
heddar+0.169
Neg logit contributions
 permission-0.300
 attention-0.290
 levers-0.282
 coats-0.279
 keys-0.276
Token.
Token ID13
Feature activation+8.11e-02

Pos logit contributions
iflower+0.199
pron+0.172
opus+0.170
oppy+0.169
orn+0.168
Neg logit contributions
 stations-0.084
 levers-0.069
 speakers-0.069
 footsteps-0.064
 towels-0.061
Token It
Token ID632
Feature activation+2.11e+00

Pos logit contributions
iflower+0.003
opus+0.002
pron+0.002
oppy+0.002
orn+0.002
Neg logit contributions
 stations-0.002
 levers-0.002
 speakers-0.002
 footsteps-0.002
hole-0.002
Token was
Token ID373
Feature activation-1.36e+00

Pos logit contributions
iflower+0.088
opus+0.085
oppy+0.079
pron+0.079
leton+0.077
Neg logit contributions
 like-0.028
 and-0.022
 heads-0.019
 each-0.019
 sill-0.018
Token pink
Token ID11398
Feature activation+1.17e+00

Pos logit contributions
 stations+0.023
 levers+0.021
 speakers+0.020
 footsteps+0.019
 towels+0.019
Neg logit contributions
iflower-0.057
opus-0.053
leton-0.049
oppy-0.048
orn-0.048
Token little
Token ID1310
Feature activation+1.94e+00

Pos logit contributions
iflower+0.045
orn+0.042
wl+0.041
live+0.040
zed+0.040
Neg logit contributions
 stop-0.015
 stations-0.015
 turn-0.014
 steering-0.014
 louder-0.014
Token girl
Token ID2576
Feature activation+2.91e+00

Pos logit contributions
iflower+0.070
pron+0.058
affle+0.057
orn+0.056
oppy+0.056
Neg logit contributions
 stations-0.046
 levers-0.041
 speakers-0.040
 towels-0.039
 footsteps-0.038
Token who
Token ID508
Feature activation+2.48e+00

Pos logit contributions
iflower+0.090
opus+0.072
pron+0.071
orn+0.070
affle+0.069
Neg logit contributions
 stations-0.090
 levers-0.080
 speakers-0.080
 footsteps-0.077
 towels-0.075
Token loved
Token ID6151
Feature activation-1.08e+00

Pos logit contributions
iflower+0.093
opus+0.084
pron+0.077
affle+0.076
oppy+0.075
Neg logit contributions
 stations-0.055
 speakers-0.048
 like-0.047
 levers-0.047
 footsteps-0.045
Token l
Token ID300
Feature activation+1.87e+00

Pos logit contributions
 stations+0.021
 levers+0.018
 speakers+0.017
 footsteps+0.016
 towels+0.015
Neg logit contributions
iflower-0.046
pron-0.040
opus-0.040
orn-0.039
oppy-0.039
Tokenoll
Token ID692
Feature activation+1.01e+01

Pos logit contributions
iflower+0.054
pron+0.047
lf+0.042
opus+0.041
leton+0.039
Neg logit contributions
 stations-0.055
 speakers-0.051
 permission-0.050
ks-0.050
 footsteps-0.050
Tokenip
Token ID541
Feature activation+5.83e+00

Pos logit contributions
iflower+0.167
pron+0.128
Buzz+0.126
opus+0.109
Mic+0.107
Neg logit contributions
 permission-0.375
 stations-0.363
 attention-0.362
 levers-0.356
 coats-0.356
Tokenops
Token ID2840
Feature activation+2.74e-01

Pos logit contributions
iflower+0.147
pron+0.119
orn+0.111
opus+0.111
oppy+0.109
Neg logit contributions
 stations-0.202
 levers-0.184
 speakers-0.179
 heads-0.177
 towels-0.174
Token.
Token ID13
Feature activation-8.43e-02

Pos logit contributions
iflower+0.011
orn+0.009
pron+0.009
opus+0.009
ip+0.008
Neg logit contributions
 stations-0.007
 levers-0.006
 speakers-0.006
 footsteps-0.006
 heads-0.005
Token One
Token ID1881
Feature activation-1.05e+00

Pos logit contributions
 stations+0.002
 levers+0.002
 speakers+0.002
 footsteps+0.001
 heads+0.001
Neg logit contributions
iflower-0.003
opus-0.003
pron-0.003
orn-0.003
oppy-0.003
Token day
Token ID1110
Feature activation-5.63e-01

Pos logit contributions
 stations+0.026
 footsteps+0.023
 towels+0.023
 attention+0.023
 like+0.023
Neg logit contributions
iflower-0.034
opus-0.029
oppy-0.028
pron-0.027
affle-0.027
Token Lily
Token ID20037
Feature activation+2.18e+00

Pos logit contributions
iflower+0.120
opus+0.103
orn+0.100
affle+0.096
pron+0.093
Neg logit contributions
 stations-0.071
 footsteps-0.062
 towels-0.061
 heads-0.060
 speakers-0.059
Token.
Token ID13
Feature activation+2.56e-01

Pos logit contributions
iflower+0.085
orn+0.072
pron+0.069
oppy+0.068
opus+0.067
Neg logit contributions
 stations-0.051
 levers-0.047
 speakers-0.043
 footsteps-0.043
 towels-0.040
Token She
Token ID1375
Feature activation+1.99e+00

Pos logit contributions
iflower+0.012
opus+0.010
pron+0.010
affle+0.010
orn+0.010
Neg logit contributions
 stations-0.004
 footsteps-0.004
 speakers-0.003
 towels-0.003
 levers-0.003
Token loved
Token ID6151
Feature activation-5.37e-01

Pos logit contributions
iflower+0.082
opus+0.075
oppy+0.072
affle+0.071
heddar+0.070
Neg logit contributions
 stations-0.026
 like-0.025
 speakers-0.024
 heads-0.023
 been-0.021
Token l
Token ID300
Feature activation+1.78e+00

Pos logit contributions
 stations+0.009
 levers+0.007
 speakers+0.007
 footsteps+0.006
 towels+0.006
Neg logit contributions
iflower-0.024
opus-0.021
oppy-0.021
orn-0.021
pron-0.021
Tokenoll
Token ID692
Feature activation+9.91e+00

Pos logit contributions
iflower+0.055
pron+0.049
lf+0.046
opus+0.043
leton+0.042
Neg logit contributions
 stations-0.046
ks-0.043
 permission-0.043
 like-0.043
 themselves-0.042
Tokenip
Token ID541
Feature activation+5.62e+00

Pos logit contributions
iflower+0.161
pron+0.141
Buzz+0.137
opus+0.116
va+0.116
Neg logit contributions
 permission-0.354
 attention-0.342
 stations-0.337
 coats-0.334
 towels-0.329
Tokenops
Token ID2840
Feature activation+1.09e-01

Pos logit contributions
iflower+0.141
pron+0.120
opus+0.110
orn+0.108
oppy+0.106
Neg logit contributions
 stations-0.187
 levers-0.168
 heads-0.168
 towels-0.163
 themselves-0.163
Token,
Token ID11
Feature activation-7.45e-03

Pos logit contributions
iflower+0.005
pron+0.004
opus+0.004
orn+0.004
oppy+0.004
Neg logit contributions
 stations-0.002
 levers-0.002
 speakers-0.002
 footsteps-0.002
 heads-0.002
Token especially
Token ID2592
Feature activation-3.32e-01

Pos logit contributions
 stations+0.000
 levers+0.000
 speakers+0.000
 footsteps+0.000
 towels+0.000
Neg logit contributions
iflower-0.000
opus-0.000
pron-0.000
orn-0.000
oppy-0.000
Token the
Token ID262
Feature activation-3.44e-01

Pos logit contributions
 stations+0.007
 levers+0.006
 themselves+0.005
 attention+0.005
 heads+0.005
Neg logit contributions
iflower-0.013
opus-0.012
oppy-0.012
pron-0.011
affle-0.011
Token Lily
Token ID20037
Feature activation+2.28e+00

Pos logit contributions
iflower+0.118
opus+0.101
orn+0.098
affle+0.095
pron+0.091
Neg logit contributions
 stations-0.070
 footsteps-0.062
 towels-0.061
 heads-0.060
 speakers-0.059
Token.
Token ID13
Feature activation+2.50e-01

Pos logit contributions
iflower+0.087
orn+0.073
pron+0.070
oppy+0.069
lf+0.067
Neg logit contributions
 stations-0.054
 levers-0.051
 speakers-0.047
 footsteps-0.047
 towels-0.043
Token She
Token ID1375
Feature activation+2.08e+00

Pos logit contributions
iflower+0.011
opus+0.010
pron+0.009
affle+0.009
orn+0.009
Neg logit contributions
 stations-0.004
 footsteps-0.004
 speakers-0.004
 towels-0.003
 levers-0.003
Token loved
Token ID6151
Feature activation-4.64e-01

Pos logit contributions
iflower+0.085
opus+0.078
oppy+0.074
affle+0.073
orn+0.072
Neg logit contributions
 stations-0.032
 like-0.028
 speakers-0.028
 heads-0.027
 been-0.024
Token l
Token ID300
Feature activation+1.77e+00

Pos logit contributions
 stations+0.008
 levers+0.006
 speakers+0.006
 footsteps+0.006
 towels+0.005
Neg logit contributions
iflower-0.021
opus-0.018
oppy-0.018
pron-0.018
orn-0.018
Tokenoll
Token ID692
Feature activation+9.79e+00

Pos logit contributions
iflower+0.057
pron+0.050
lf+0.045
opus+0.044
leton+0.042
Neg logit contributions
 stations-0.046
ks-0.043
 permission-0.043
 like-0.043
 speakers-0.042
Tokenip
Token ID541
Feature activation+5.56e+00

Pos logit contributions
iflower+0.153
Buzz+0.136
pron+0.132
va+0.108
opus+0.108
Neg logit contributions
 permission-0.353
 attention-0.344
 coats-0.336
 stations-0.332
 keys-0.330
Tokenops
Token ID2840
Feature activation+3.34e-03

Pos logit contributions
iflower+0.140
pron+0.119
opus+0.109
orn+0.107
oppy+0.105
Neg logit contributions
 stations-0.185
 levers-0.166
 heads-0.165
 speakers-0.161
 themselves-0.160
Token and
Token ID290
Feature activation+1.54e+00

Pos logit contributions
iflower+0.000
pron+0.000
opus+0.000
orn+0.000
oppy+0.000
Neg logit contributions
 stations-0.000
 levers-0.000
 speakers-0.000
 footsteps-0.000
 heads-0.000
Token would
Token ID561
Feature activation-2.00e+00

Pos logit contributions
iflower+0.054
opus+0.046
orn+0.044
oppy+0.044
pron+0.044
Neg logit contributions
 stations-0.040
 levers-0.036
 speakers-0.034
 heads-0.033
 towels-0.033
Token always
Token ID1464
Feature activation-7.28e-01

Pos logit contributions
 stations+0.053
 levers+0.047
 speakers+0.045
 footsteps+0.044
 heads+0.044
Neg logit contributions
iflower-0.070
opus-0.059
orn-0.057
pron-0.057
oppy-0.057
Token Lily
Token ID20037
Feature activation+1.98e+00

Pos logit contributions
iflower+0.109
opus+0.092
orn+0.089
affle+0.086
pron+0.085
Neg logit contributions
 stations-0.076
 footsteps-0.068
 speakers-0.067
 towels-0.066
 heads-0.064
Token.
Token ID13
Feature activation+1.96e-01

Pos logit contributions
iflower+0.075
orn+0.063
pron+0.060
oppy+0.060
lf+0.058
Neg logit contributions
 stations-0.047
 levers-0.044
 speakers-0.041
 footsteps-0.041
 towels-0.038
Token She
Token ID1375
Feature activation+1.88e+00

Pos logit contributions
iflower+0.009
opus+0.008
pron+0.007
affle+0.007
orn+0.007
Neg logit contributions
 stations-0.003
 footsteps-0.003
 speakers-0.003
 towels-0.003
 levers-0.003
Token loved
Token ID6151
Feature activation-6.22e-01

Pos logit contributions
iflower+0.075
opus+0.069
oppy+0.066
affle+0.065
pron+0.065
Neg logit contributions
 stations-0.029
 speakers-0.026
 like-0.026
 heads-0.025
 levers-0.023
Token l
Token ID300
Feature activation+1.77e+00

Pos logit contributions
 stations+0.010
 levers+0.008
 speakers+0.008
 footsteps+0.007
 towels+0.007
Neg logit contributions
iflower-0.028
opus-0.025
orn-0.024
oppy-0.024
pron-0.024
Tokenoll
Token ID692
Feature activation+9.78e+00

Pos logit contributions
iflower+0.054
pron+0.047
lf+0.042
opus+0.042
leton+0.039
Neg logit contributions
 stations-0.050
 permission-0.046
 footsteps-0.046
 speakers-0.046
 towels-0.045
Tokenip
Token ID541
Feature activation+5.60e+00

Pos logit contributions
iflower+0.157
pron+0.129
Buzz+0.126
opus+0.109
va+0.101
Neg logit contributions
 permission-0.366
 stations-0.361
 attention-0.357
 levers-0.353
 coats-0.349
Tokenops
Token ID2840
Feature activation+1.19e-01

Pos logit contributions
iflower+0.139
pron+0.116
opus+0.108
orn+0.107
oppy+0.104
Neg logit contributions
 stations-0.191
 levers-0.173
 heads-0.170
 speakers-0.167
 towels-0.167
Token and
Token ID290
Feature activation+1.56e+00

Pos logit contributions
iflower+0.005
pron+0.004
orn+0.004
opus+0.004
oppy+0.004
Neg logit contributions
 stations-0.003
 levers-0.002
 speakers-0.002
 footsteps-0.002
 heads-0.002
Token always
Token ID1464
Feature activation-2.12e-01

Pos logit contributions
iflower+0.052
opus+0.043
pron+0.042
orn+0.042
oppy+0.041
Neg logit contributions
 stations-0.044
 levers-0.040
 speakers-0.038
 footsteps-0.037
 towels-0.036
Token had
Token ID550
Feature activation+1.37e+00

Pos logit contributions
 stations+0.006
 levers+0.006
 speakers+0.006
 footsteps+0.005
 towels+0.005
Neg logit contributions
iflower-0.007
pron-0.006
opus-0.005
orn-0.005
oppy-0.005
Token Lily
Token ID20037
Feature activation+2.23e+00

Pos logit contributions
iflower+0.119
opus+0.102
orn+0.098
affle+0.095
pron+0.094
Neg logit contributions
 stations-0.069
 footsteps-0.060
 towels-0.059
 heads-0.059
 speakers-0.058
Token.
Token ID13
Feature activation+7.60e-02

Pos logit contributions
iflower+0.085
orn+0.072
pron+0.068
oppy+0.067
opus+0.066
Neg logit contributions
 stations-0.053
 levers-0.050
 footsteps-0.046
 speakers-0.046
 towels-0.043
Token She
Token ID1375
Feature activation+2.01e+00

Pos logit contributions
iflower+0.003
opus+0.003
pron+0.003
affle+0.003
orn+0.003
Neg logit contributions
 stations-0.001
 footsteps-0.001
 speakers-0.001
 levers-0.001
 towels-0.001
Token loved
Token ID6151
Feature activation-4.98e-01

Pos logit contributions
iflower+0.081
opus+0.075
oppy+0.072
affle+0.070
pron+0.070
Neg logit contributions
 stations-0.030
 like-0.027
 speakers-0.027
 heads-0.024
 been-0.024
Token l
Token ID300
Feature activation+1.78e+00

Pos logit contributions
 stations+0.008
 levers+0.007
 speakers+0.006
 footsteps+0.006
 towels+0.006
Neg logit contributions
iflower-0.022
opus-0.020
oppy-0.019
pron-0.019
orn-0.019
Tokenoll
Token ID692
Feature activation+9.74e+00

Pos logit contributions
iflower+0.057
pron+0.050
lf+0.046
opus+0.045
corn+0.043
Neg logit contributions
 stations-0.045
ks-0.042
 permission-0.042
 like-0.041
 themselves-0.041
Tokenip
Token ID541
Feature activation+5.47e+00

Pos logit contributions
iflower+0.159
Buzz+0.138
pron+0.136
opus+0.115
Mic+0.113
Neg logit contributions
 permission-0.346
 attention-0.338
 coats-0.330
 stations-0.324
 towels-0.321
Tokenops
Token ID2840
Feature activation-1.27e-01

Pos logit contributions
iflower+0.135
pron+0.114
opus+0.106
orn+0.104
oppy+0.103
Neg logit contributions
 stations-0.185
 levers-0.166
 heads-0.165
 themselves-0.161
 towels-0.161
Token and
Token ID290
Feature activation+1.63e+00

Pos logit contributions
 stations+0.003
 levers+0.002
 speakers+0.002
 footsteps+0.002
 heads+0.002
Neg logit contributions
iflower-0.005
pron-0.004
orn-0.004
opus-0.004
oppy-0.004
Token always
Token ID1464
Feature activation-2.05e-01

Pos logit contributions
iflower+0.056
opus+0.047
orn+0.046
oppy+0.045
pron+0.045
Neg logit contributions
 stations-0.045
 levers-0.040
 speakers-0.038
 footsteps-0.037
 heads-0.037
Token had
Token ID550
Feature activation+1.33e+00

Pos logit contributions
 stations+0.006
 levers+0.006
 speakers+0.005
 footsteps+0.005
 towels+0.005
Neg logit contributions
iflower-0.007
pron-0.005
opus-0.005
orn-0.005
oppy-0.005
Token there
Token ID612
Feature activation+2.79e-01

Pos logit contributions
iflower+0.035
opus+0.030
affle+0.030
pron+0.030
heddar+0.029
Neg logit contributions
 stations-0.018
 levers-0.016
 themselves-0.015
 footsteps-0.015
hole-0.015
Token was
Token ID373
Feature activation+3.61e-02

Pos logit contributions
iflower+0.011
pron+0.010
opus+0.009
affle+0.009
heddar+0.009
Neg logit contributions
 stations-0.006
 levers-0.005
 themselves-0.005
 footsteps-0.005
 towels-0.005
Token a
Token ID257
Feature activation+7.71e-01

Pos logit contributions
iflower+0.002
pron+0.001
opus+0.001
orn+0.001
oppy+0.001
Neg logit contributions
 stations-0.001
 levers-0.001
 speakers-0.001
 footsteps-0.000
 heads-0.000
Token nice
Token ID3621
Feature activation+2.16e+00

Pos logit contributions
iflower+0.031
orn+0.028
zed+0.027
wl+0.027
live+0.026
Neg logit contributions
 stations-0.011
 stop-0.011
 turn-0.011
 footsteps-0.011
 louder-0.010
Token l
Token ID300
Feature activation+1.17e+00

Pos logit contributions
iflower+0.078
orn+0.062
affle+0.061
oppy+0.061
opus+0.060
Neg logit contributions
 stations-0.048
 footsteps-0.044
 towels-0.043
 speakers-0.043
 extingu-0.042
Tokenoll
Token ID692
Feature activation+9.70e+00

Pos logit contributions
iflower+0.041
pron+0.036
corn+0.032
lf+0.031
opus+0.030
Neg logit contributions
 stations-0.026
 speakers-0.024
ks-0.023
 footsteps-0.023
hole-0.023
Tokenipop
Token ID42800
Feature activation+4.19e+00

Pos logit contributions
iflower+0.225
Buzz+0.185
Mic+0.177
pron+0.176
heddar+0.165
Neg logit contributions
 permission-0.280
 attention-0.267
 stations-0.260
 coats-0.256
 levers-0.256
Token.
Token ID13
Feature activation-2.74e-01

Pos logit contributions
iflower+0.182
pron+0.157
opus+0.156
orn+0.154
oppy+0.154
Neg logit contributions
 stations-0.076
 levers-0.062
 speakers-0.060
 footsteps-0.056
 towels-0.053
Token It
Token ID632
Feature activation+1.94e+00

Pos logit contributions
 stations+0.008
 levers+0.007
 speakers+0.007
 footsteps+0.007
hole+0.006
Neg logit contributions
iflower-0.010
opus-0.008
oppy-0.008
pron-0.008
orn-0.007
Token was
Token ID373
Feature activation-1.62e+00

Pos logit contributions
iflower+0.079
opus+0.074
oppy+0.069
pron+0.069
affle+0.067
Neg logit contributions
 like-0.029
 stations-0.024
 heads-0.024
 and-0.023
 each-0.022
Token round
Token ID2835
Feature activation+8.92e-01

Pos logit contributions
 stations+0.027
 levers+0.023
 speakers+0.022
 towels+0.021
 footsteps+0.021
Neg logit contributions
iflower-0.070
opus-0.065
oppy-0.060
orn-0.060
leton-0.059
Token named
Token ID3706
Feature activation+2.65e+00

Pos logit contributions
iflower+0.080
opus+0.064
pron+0.064
orn+0.062
affle+0.061
Neg logit contributions
 stations-0.084
 levers-0.075
 speakers-0.074
 footsteps-0.072
 towels-0.070
Token Lily
Token ID20037
Feature activation+1.70e+00

Pos logit contributions
iflower+0.101
opus+0.084
orn+0.083
affle+0.080
pron+0.077
Neg logit contributions
 stations-0.067
 footsteps-0.060
 speakers-0.058
 towels-0.058
 heads-0.056
Token who
Token ID508
Feature activation+1.96e+00

Pos logit contributions
iflower+0.064
orn+0.054
pron+0.052
oppy+0.051
lf+0.050
Neg logit contributions
 stations-0.041
 levers-0.039
 speakers-0.035
 footsteps-0.035
 towels-0.033
Token loved
Token ID6151
Feature activation-9.75e-01

Pos logit contributions
iflower+0.081
opus+0.078
inkle+0.069
affle+0.069
pron+0.068
Neg logit contributions
 like-0.031
 stations-0.028
 want-0.025
 hope-0.025
 respect-0.023
Token l
Token ID300
Feature activation+1.71e+00

Pos logit contributions
 stations+0.018
 levers+0.015
 speakers+0.014
 footsteps+0.013
 towels+0.013
Neg logit contributions
iflower-0.043
opus-0.037
pron-0.037
orn-0.036
oppy-0.036
Tokenoll
Token ID692
Feature activation+9.68e+00

Pos logit contributions
iflower+0.048
pron+0.041
opus+0.037
lf+0.036
leton+0.034
Neg logit contributions
 stations-0.054
 levers-0.049
 speakers-0.049
 footsteps-0.048
 permission-0.048
Tokenip
Token ID541
Feature activation+5.31e+00

Pos logit contributions
iflower+0.142
pron+0.114
Buzz+0.104
opus+0.092
va+0.086
Neg logit contributions
 permission-0.378
 stations-0.374
 levers-0.366
 attention-0.365
 heads-0.353
Tokenops
Token ID2840
Feature activation-6.97e-02

Pos logit contributions
iflower+0.128
pron+0.104
opus+0.096
orn+0.095
oppy+0.093
Neg logit contributions
 stations-0.188
 levers-0.172
 speakers-0.167
 heads-0.165
 towels-0.162
Token.
Token ID13
Feature activation-1.07e-01

Pos logit contributions
 stations+0.002
 levers+0.002
 speakers+0.001
 footsteps+0.001
 heads+0.001
Neg logit contributions
iflower-0.003
pron-0.002
orn-0.002
opus-0.002
affle-0.002
Token She
Token ID1375
Feature activation+1.45e+00

Pos logit contributions
 stations+0.002
 levers+0.002
 speakers+0.002
 footsteps+0.002
 towels+0.002
Neg logit contributions
iflower-0.005
opus-0.004
orn-0.004
pron-0.004
oppy-0.004
Token would
Token ID561
Feature activation-2.57e+00

Pos logit contributions
iflower+0.060
opus+0.056
orn+0.051
oppy+0.051
affle+0.050
Neg logit contributions
 stations-0.024
 heads-0.021
 each-0.020
 coats-0.020
 argued-0.019
Token as
Token ID355
Feature activation+4.66e-01

Pos logit contributions
 stations+0.041
 levers+0.034
 speakers+0.033
 elbows+0.031
 brake+0.031
Neg logit contributions
iflower-0.054
pron-0.046
oppy-0.044
opus-0.044
orn-0.042
Token she
Token ID673
Feature activation+8.04e-01

Pos logit contributions
iflower+0.016
opus+0.013
pron+0.013
orn+0.013
affle+0.012
Neg logit contributions
 stations-0.013
 levers-0.012
 speakers-0.011
 footsteps-0.011
 towels-0.011
Token held
Token ID2714
Feature activation+2.47e+00

Pos logit contributions
iflower+0.030
opus+0.025
pron+0.025
orn+0.024
oppy+0.024
Neg logit contributions
 stations-0.020
 levers-0.018
 speakers-0.017
 footsteps-0.016
 towels-0.016
Token the
Token ID262
Feature activation+9.50e-02

Pos logit contributions
iflower+0.102
opus+0.089
pron+0.086
orn+0.085
affle+0.084
Neg logit contributions
 stations-0.050
 levers-0.044
 speakers-0.042
 footsteps-0.041
 towels-0.039
Token l
Token ID300
Feature activation+9.92e-01

Pos logit contributions
iflower+0.003
opus+0.002
oppy+0.002
pron+0.002
orn+0.002
Neg logit contributions
 stations-0.003
 levers-0.003
 speakers-0.002
 footsteps-0.002
 towels-0.002
Tokenoll
Token ID692
Feature activation+9.50e+00

Pos logit contributions
iflower+0.026
pron+0.020
opus+0.020
lf+0.019
leton+0.018
Neg logit contributions
 stations-0.030
hole-0.029
 levers-0.029
 permission-0.029
 speakers-0.028
Tokenipop
Token ID42800
Feature activation+4.16e+00

Pos logit contributions
iflower+0.216
opus+0.179
Buzz+0.177
pron+0.169
leton+0.165
Neg logit contributions
 attention-0.295
 permission-0.293
 levers-0.278
bars-0.277
hole-0.270
Token in
Token ID287
Feature activation-2.68e-01

Pos logit contributions
iflower+0.140
pron+0.116
orn+0.108
oppy+0.108
opus+0.108
Neg logit contributions
 stations-0.128
 levers-0.111
 speakers-0.110
 footsteps-0.105
 buses-0.098
Token her
Token ID607
Feature activation+4.85e-01

Pos logit contributions
 stations+0.009
 levers+0.007
hole+0.007
 buses+0.007
bars+0.007
Neg logit contributions
iflower-0.008
pron-0.007
orn-0.006
oppy-0.006
on-0.006
Token hand
Token ID1021
Feature activation+7.88e-01

Pos logit contributions
iflower+0.021
pron+0.018
opus+0.018
orn+0.017
oppy+0.017
Neg logit contributions
 stations-0.010
 levers-0.008
 speakers-0.008
 footsteps-0.007
 towels-0.007
Token.
Token ID13
Feature activation-7.22e-03

Pos logit contributions
iflower+0.025
pron+0.022
ip+0.021
on+0.020
orn+0.019
Neg logit contributions
 stations-0.025
 levers-0.021
 speakers-0.020
 footsteps-0.019
 buses-0.019
Token to
Token ID284
Feature activation+2.93e+00

Pos logit contributions
 stations+0.022
 levers+0.019
 speakers+0.019
 towels+0.018
 permission+0.018
Neg logit contributions
iflower-0.030
opus-0.025
pron-0.025
orn-0.025
oppy-0.024
Token walk
Token ID2513
Feature activation+4.93e-01

Pos logit contributions
iflower+0.114
opus+0.096
pron+0.094
orn+0.092
oppy+0.091
Neg logit contributions
 stations-0.066
 levers-0.058
 speakers-0.056
 footsteps-0.054
 towels-0.054
Token and
Token ID290
Feature activation+8.76e-01

Pos logit contributions
iflower+0.018
opus+0.015
pron+0.015
orn+0.015
oppy+0.015
Neg logit contributions
 stations-0.012
 levers-0.011
 speakers-0.010
 footsteps-0.010
 towels-0.010
Token give
Token ID1577
Feature activation+3.04e+00

Pos logit contributions
iflower+0.030
pron+0.025
opus+0.025
orn+0.024
oppy+0.024
Neg logit contributions
 stations-0.024
 levers-0.022
 speakers-0.020
 footsteps-0.020
 towels-0.019
Token l
Token ID300
Feature activation+1.01e+00

Pos logit contributions
iflower+0.109
pron+0.096
opus+0.095
affle+0.089
lf+0.089
Neg logit contributions
 stations-0.071
 levers-0.064
 permission-0.061
 speakers-0.061
 heads-0.061
Tokenoll
Token ID692
Feature activation+9.50e+00

Pos logit contributions
iflower+0.026
pron+0.021
lf+0.020
opus+0.020
leton+0.019
Neg logit contributions
 stations-0.034
 levers-0.032
 speakers-0.031
 permission-0.031
 towels-0.030
Tokenip
Token ID541
Feature activation+5.42e+00

Pos logit contributions
iflower+0.174
pron+0.142
lf+0.138
Buzz+0.131
opus+0.127
Neg logit contributions
 permission-0.343
 attention-0.338
 levers-0.311
 coats-0.309
bars-0.308
Tokenops
Token ID2840
Feature activation-1.83e-02

Pos logit contributions
iflower+0.128
pron+0.106
opus+0.096
orn+0.096
lf+0.094
Neg logit contributions
 stations-0.177
 levers-0.168
 heads-0.166
 towels-0.165
 attention-0.165
Token to
Token ID284
Feature activation+1.66e+00

Pos logit contributions
 stations+0.000
 levers+0.000
 speakers+0.000
 footsteps+0.000
 heads+0.000
Neg logit contributions
iflower-0.001
pron-0.001
opus-0.001
orn-0.001
oppy-0.001
Token children
Token ID1751
Feature activation-1.50e+00

Pos logit contributions
iflower+0.056
opus+0.047
pron+0.046
orn+0.045
oppy+0.044
Neg logit contributions
 stations-0.046
 levers-0.040
 speakers-0.039
 footsteps-0.039
 heads-0.038
Token."
Token ID526
Feature activation+7.88e-02

Pos logit contributions
 stations+0.039
 levers+0.035
 footsteps+0.033
 speakers+0.033
 heads+0.030
Neg logit contributions
iflower-0.054
pron-0.045
opus-0.044
oppy-0.044
orn-0.043
Token John
Token ID1757
Feature activation+2.45e+00

Pos logit contributions
iflower+0.049
opus+0.041
pron+0.040
oppy+0.040
orn+0.039
Neg logit contributions
 stations-0.041
 levers-0.036
 speakers-0.035
 footsteps-0.033
 heads-0.033
Token.
Token ID13
Feature activation-2.70e-02

Pos logit contributions
iflower+0.100
pron+0.084
orn+0.084
opus+0.083
oppy+0.082
Neg logit contributions
 stations-0.053
 levers-0.046
 speakers-0.044
 footsteps-0.043
 towels-0.039
Token John
Token ID1757
Feature activation+1.50e+00

Pos logit contributions
 stations+0.001
 levers+0.001
 speakers+0.001
 footsteps+0.001
 heads+0.001
Neg logit contributions
iflower-0.001
pron-0.001
opus-0.001
orn-0.001
affle-0.001
Token loved
Token ID6151
Feature activation-7.56e-01

Pos logit contributions
iflower+0.051
opus+0.044
pron+0.043
oppy+0.042
orn+0.042
Neg logit contributions
 stations-0.039
 levers-0.035
 speakers-0.034
 heads-0.033
 towels-0.032
Token l
Token ID300
Feature activation+1.59e+00

Pos logit contributions
 stations+0.015
 levers+0.013
 speakers+0.012
 footsteps+0.012
 heads+0.011
Neg logit contributions
iflower-0.031
pron-0.027
opus-0.027
orn-0.026
oppy-0.026
Tokenoll
Token ID692
Feature activation+9.44e+00

Pos logit contributions
iflower+0.044
pron+0.037
lf+0.033
opus+0.033
leton+0.031
Neg logit contributions
 stations-0.049
 levers-0.045
 speakers-0.045
 footsteps-0.045
 permission-0.044
Tokenip
Token ID541
Feature activation+5.36e+00

Pos logit contributions
iflower+0.143
pron+0.112
Buzz+0.107
opus+0.090
orn+0.089
Neg logit contributions
 stations-0.358
 permission-0.356
 levers-0.353
 attention-0.350
 coats-0.342
Tokenops
Token ID2840
Feature activation-7.92e-02

Pos logit contributions
iflower+0.130
pron+0.104
opus+0.098
orn+0.097
oppy+0.096
Neg logit contributions
 stations-0.192
 levers-0.176
 speakers-0.170
 heads-0.169
 towels-0.166
Token!
Token ID0
Feature activation-3.05e-01

Pos logit contributions
 stations+0.002
 levers+0.002
 speakers+0.002
 footsteps+0.002
 heads+0.001
Neg logit contributions
iflower-0.003
pron-0.003
orn-0.003
opus-0.003
affle-0.003
Token One
Token ID1881
Feature activation-1.20e+00

Pos logit contributions
 stations+0.009
 levers+0.008
 speakers+0.008
 footsteps+0.008
 towels+0.008
Neg logit contributions
iflower-0.010
opus-0.008
pron-0.008
orn-0.008
oppy-0.007
Token day
Token ID1110
Feature activation-7.13e-01

Pos logit contributions
 stations+0.030
 footsteps+0.027
 speakers+0.027
 levers+0.026
 towels+0.026
Neg logit contributions
iflower-0.039
opus-0.033
oppy-0.031
pron-0.031
affle-0.031
Token going
Token ID1016
Feature activation+1.40e-01

Pos logit contributions
 stations+0.009
 levers+0.008
 speakers+0.007
 footsteps+0.007
 heads+0.007
Neg logit contributions
iflower-0.015
pron-0.013
opus-0.013
oppy-0.012
orn-0.012
Token to
Token ID284
Feature activation+2.83e-01

Pos logit contributions
iflower+0.005
pron+0.004
opus+0.004
orn+0.004
affle+0.004
Neg logit contributions
 stations-0.004
 levers-0.003
 footsteps-0.003
 speakers-0.003
 heads-0.003
Token get
Token ID651
Feature activation+2.32e+00

Pos logit contributions
iflower+0.009
pron+0.008
opus+0.007
orn+0.007
lf+0.007
Neg logit contributions
 stations-0.008
 levers-0.007
 speakers-0.007
hole-0.007
 heads-0.007
Token a
Token ID257
Feature activation+1.35e+00

Pos logit contributions
iflower+0.090
pron+0.077
orn+0.073
oppy+0.072
opus+0.072
Neg logit contributions
 stations-0.055
 levers-0.046
 speakers-0.044
 footsteps-0.043
 heads-0.043
Token l
Token ID300
Feature activation+1.14e+00

Pos logit contributions
iflower+0.040
orn+0.031
oppy+0.031
opus+0.031
affle+0.030
Neg logit contributions
 stations-0.041
 levers-0.038
 speakers-0.037
 towels-0.037
 footsteps-0.036
Tokenoll
Token ID692
Feature activation+9.43e+00

Pos logit contributions
iflower+0.029
pron+0.021
opus+0.020
oppy+0.020
lf+0.019
Neg logit contributions
 stations-0.038
 speakers-0.036
hole-0.035
 levers-0.035
 footsteps-0.034
Tokenipop
Token ID42800
Feature activation+3.40e+00

Pos logit contributions
iflower+0.211
Buzz+0.175
opus+0.157
pron+0.155
Mic+0.154
Neg logit contributions
 attention-0.298
 permission-0.297
 levers-0.288
hole-0.279
 stations-0.277
Token for
Token ID329
Feature activation-1.44e-01

Pos logit contributions
iflower+0.126
pron+0.108
opus+0.102
ip+0.100
orn+0.100
Neg logit contributions
 stations-0.093
 levers-0.079
 footsteps-0.076
 speakers-0.076
 heads-0.072
Token being
Token ID852
Feature activation-5.21e-01

Pos logit contributions
 stations+0.004
 levers+0.004
 speakers+0.004
 footsteps+0.003
 heads+0.003
Neg logit contributions
iflower-0.005
pron-0.004
opus-0.004
orn-0.004
oppy-0.004
Token so
Token ID523
Feature activation+2.11e+00

Pos logit contributions
 stations+0.010
 levers+0.009
 speakers+0.009
 permission+0.008
 towels+0.008
Neg logit contributions
iflower-0.021
opus-0.019
orn-0.018
oppy-0.018
affle-0.017
Token organized
Token ID8389
Feature activation-6.12e-01

Pos logit contributions
iflower+0.064
orn+0.052
opus+0.051
pron+0.050
oppy+0.047
Neg logit contributions
 stations-0.065
 levers-0.061
 speakers-0.058
 footsteps-0.057
 heads-0.056
Token Tess
Token ID39412
Feature activation+3.31e+00

Pos logit contributions
iflower+0.093
opus+0.076
orn+0.073
pron+0.073
affle+0.072
Neg logit contributions
 stations-0.091
 levers-0.081
 speakers-0.080
 footsteps-0.079
 towels-0.077
Token.
Token ID13
Feature activation-7.66e-02

Pos logit contributions
iflower+0.124
orn+0.102
pron+0.100
oppy+0.100
lf+0.096
Neg logit contributions
 stations-0.078
 levers-0.077
 speakers-0.070
 footsteps-0.069
 elbows-0.067
Token She
Token ID1375
Feature activation+1.56e+00

Pos logit contributions
 stations+0.002
 levers+0.002
 speakers+0.002
 footsteps+0.002
 heads+0.002
Neg logit contributions
iflower-0.002
pron-0.002
orn-0.002
oppy-0.002
opus-0.002
Token loved
Token ID6151
Feature activation-8.01e-01

Pos logit contributions
iflower+0.061
opus+0.053
oppy+0.051
pron+0.051
orn+0.050
Neg logit contributions
 stations-0.030
 speakers-0.027
 levers-0.026
 heads-0.025
 like-0.025
Token l
Token ID300
Feature activation+1.39e+00

Pos logit contributions
 stations+0.016
 levers+0.014
 speakers+0.013
 footsteps+0.012
hole+0.012
Neg logit contributions
iflower-0.034
pron-0.029
opus-0.029
orn-0.028
oppy-0.028
Tokenoll
Token ID692
Feature activation+9.42e+00

Pos logit contributions
iflower+0.039
pron+0.032
opus+0.029
lf+0.029
orn+0.026
Neg logit contributions
 stations-0.045
 speakers-0.041
 levers-0.041
 footsteps-0.041
 permission-0.040
Tokenip
Token ID541
Feature activation+5.25e+00

Pos logit contributions
iflower+0.140
pron+0.108
Buzz+0.099
opus+0.087
orn+0.081
Neg logit contributions
 stations-0.373
 permission-0.370
 levers-0.363
 attention-0.361
 heads-0.353
Tokenops
Token ID2840
Feature activation-2.42e-01

Pos logit contributions
iflower+0.125
pron+0.099
opus+0.093
orn+0.091
oppy+0.089
Neg logit contributions
 stations-0.192
 levers-0.176
 speakers-0.171
 heads-0.169
 footsteps-0.167
Token.
Token ID13
Feature activation-5.27e-01

Pos logit contributions
 stations+0.006
 levers+0.005
 speakers+0.005
 footsteps+0.005
 heads+0.004
Neg logit contributions
iflower-0.010
pron-0.008
orn-0.008
opus-0.008
oppy-0.008
Token One
Token ID1881
Feature activation-1.20e+00

Pos logit contributions
 stations+0.017
 levers+0.015
 speakers+0.014
 footsteps+0.014
 heads+0.014
Neg logit contributions
iflower-0.015
orn-0.012
pron-0.012
oppy-0.012
opus-0.012
Token day
Token ID1110
Feature activation-8.68e-01

Pos logit contributions
 stations+0.034
 footsteps+0.031
 levers+0.031
 speakers+0.031
 towels+0.029
Neg logit contributions
iflower-0.037
opus-0.030
oppy-0.029
pron-0.028
affle-0.028
Token Lily
Token ID20037
Feature activation+1.75e+00

Pos logit contributions
iflower+0.105
opus+0.090
orn+0.087
affle+0.085
pron+0.081
Neg logit contributions
 stations-0.059
 footsteps-0.053
 speakers-0.051
 towels-0.050
 themselves-0.049
Token.
Token ID13
Feature activation-9.02e-02

Pos logit contributions
iflower+0.068
orn+0.058
pron+0.056
oppy+0.055
opus+0.054
Neg logit contributions
 stations-0.040
 levers-0.037
 footsteps-0.034
 speakers-0.034
 towels-0.032
Token She
Token ID1375
Feature activation+1.58e+00

Pos logit contributions
 stations+0.002
 footsteps+0.001
 speakers+0.001
 levers+0.001
 towels+0.001
Neg logit contributions
iflower-0.004
opus-0.004
pron-0.003
orn-0.003
affle-0.003
Token loved
Token ID6151
Feature activation-7.66e-01

Pos logit contributions
iflower+0.064
opus+0.058
oppy+0.057
orn+0.055
affle+0.054
Neg logit contributions
 stations-0.024
 speakers-0.023
 like-0.021
 heads-0.020
 levers-0.020
Token l
Token ID300
Feature activation+1.48e+00

Pos logit contributions
 stations+0.012
 levers+0.010
 speakers+0.010
 footsteps+0.009
 towels+0.008
Neg logit contributions
iflower-0.035
opus-0.031
oppy-0.030
orn-0.030
pron-0.030
Tokenoll
Token ID692
Feature activation+9.41e+00

Pos logit contributions
iflower+0.044
pron+0.038
lf+0.034
opus+0.033
leton+0.031
Neg logit contributions
 stations-0.043
 permission-0.039
 levers-0.039
 speakers-0.039
 footsteps-0.039
Tokenip
Token ID541
Feature activation+5.14e+00

Pos logit contributions
iflower+0.154
Buzz+0.124
pron+0.122
opus+0.103
Mic+0.098
Neg logit contributions
 permission-0.345
 stations-0.339
 attention-0.337
 levers-0.333
 coats-0.329
Tokenops
Token ID2840
Feature activation-1.44e-01

Pos logit contributions
iflower+0.131
pron+0.111
orn+0.101
opus+0.100
oppy+0.099
Neg logit contributions
 stations-0.168
 levers-0.153
 heads-0.151
 speakers-0.147
 themselves-0.145
Token and
Token ID290
Feature activation+1.22e+00

Pos logit contributions
 stations+0.003
 levers+0.003
 speakers+0.003
 footsteps+0.002
 heads+0.002
Neg logit contributions
iflower-0.006
pron-0.005
orn-0.005
opus-0.005
oppy-0.005
Token ate
Token ID15063
Feature activation+2.17e+00

Pos logit contributions
iflower+0.041
opus+0.034
pron+0.034
orn+0.033
oppy+0.033
Neg logit contributions
 stations-0.034
 levers-0.031
 speakers-0.029
 footsteps-0.029
 towels-0.028
Token them
Token ID606
Feature activation-7.30e-01

Pos logit contributions
iflower+0.089
opus+0.078
pron+0.076
oppy+0.076
orn+0.076
Neg logit contributions
 stations-0.044
 levers-0.038
 speakers-0.035
 towels-0.034
 footsteps-0.033
Token thought
Token ID1807
Feature activation+3.42e-02

Pos logit contributions
iflower+0.031
pron+0.026
affle+0.025
oppy+0.025
orn+0.025
Neg logit contributions
 stations-0.030
 levers-0.026
 speakers-0.025
 footsteps-0.025
 extingu-0.024
Token it
Token ID340
Feature activation+3.20e+00

Pos logit contributions
iflower+0.001
opus+0.001
pron+0.001
orn+0.001
oppy+0.001
Neg logit contributions
 stations-0.001
 levers-0.001
 speakers-0.001
 footsteps-0.001
 towels-0.001
Token was
Token ID373
Feature activation+2.28e-01

Pos logit contributions
iflower+0.099
opus+0.079
pron+0.079
orn+0.078
oppy+0.077
Neg logit contributions
 stations-0.099
 levers-0.090
 speakers-0.087
 footsteps-0.085
 towels-0.082
Token a
Token ID257
Feature activation+1.35e+00

Pos logit contributions
iflower+0.007
opus+0.006
pron+0.006
orn+0.006
oppy+0.006
Neg logit contributions
 stations-0.007
 levers-0.006
 speakers-0.006
 footsteps-0.006
 towels-0.006
Token l
Token ID300
Feature activation+1.69e+00

Pos logit contributions
iflower+0.040
opus+0.032
orn+0.032
oppy+0.031
pron+0.030
Neg logit contributions
 stations-0.041
 levers-0.038
 footsteps-0.036
 speakers-0.036
 towels-0.036
Tokenoll
Token ID692
Feature activation+9.38e+00

Pos logit contributions
iflower+0.047
pron+0.039
lf+0.038
opus+0.035
Buzz+0.033
Neg logit contributions
 stations-0.046
 levers-0.045
 permission-0.045
hole-0.045
 footsteps-0.043
Tokenipop
Token ID42800
Feature activation+4.58e+00

Pos logit contributions
iflower+0.222
Buzz+0.177
opus+0.171
pron+0.163
Mic+0.162
Neg logit contributions
 attention-0.289
 permission-0.284
 levers-0.280
 extingu-0.268
hole-0.268
Token.
Token ID13
Feature activation-1.05e-01

Pos logit contributions
iflower+0.167
pron+0.139
opus+0.139
ip+0.137
orn+0.135
Neg logit contributions
 stations-0.120
 levers-0.103
 speakers-0.102
 footsteps-0.093
 elbows-0.090
Token She
Token ID1375
Feature activation+1.34e+00

Pos logit contributions
 stations+0.003
 levers+0.003
 speakers+0.003
 towels+0.002
 extingu+0.002
Neg logit contributions
iflower-0.003
oppy-0.003
opus-0.003
orn-0.003
pron-0.003
Token wanted
Token ID2227
Feature activation+2.41e-01

Pos logit contributions
iflower+0.045
pron+0.038
opus+0.037
oppy+0.037
orn+0.036
Neg logit contributions
 stations-0.038
 levers-0.033
 speakers-0.032
 footsteps-0.031
 extingu-0.030
Token to
Token ID284
Feature activation+1.69e+00

Pos logit contributions
iflower+0.010
pron+0.008
opus+0.008
orn+0.008
oppy+0.008
Neg logit contributions
 stations-0.005
 levers-0.005
 speakers-0.004
 footsteps-0.004
 heads-0.004
Token Lily
Token ID20037
Feature activation+1.69e+00

Pos logit contributions
iflower+0.103
opus+0.088
orn+0.085
affle+0.082
pron+0.079
Neg logit contributions
 stations-0.060
 footsteps-0.055
 towels-0.053
 speakers-0.051
 heads-0.051
Token.
Token ID13
Feature activation-1.53e-01

Pos logit contributions
iflower+0.063
orn+0.053
pron+0.051
oppy+0.050
lf+0.049
Neg logit contributions
 stations-0.041
 levers-0.038
 speakers-0.035
 footsteps-0.035
 towels-0.033
Token She
Token ID1375
Feature activation+1.55e+00

Pos logit contributions
 stations+0.003
 footsteps+0.002
 speakers+0.002
 levers+0.002
 towels+0.002
Neg logit contributions
iflower-0.007
opus-0.006
pron-0.005
orn-0.005
affle-0.005
Token loved
Token ID6151
Feature activation-8.86e-01

Pos logit contributions
iflower+0.063
opus+0.056
oppy+0.054
orn+0.053
affle+0.053
Neg logit contributions
 stations-0.025
 speakers-0.023
 heads-0.021
 like-0.021
 levers-0.020
Token l
Token ID300
Feature activation+1.39e+00

Pos logit contributions
 stations+0.015
 levers+0.012
 speakers+0.012
 footsteps+0.011
 towels+0.010
Neg logit contributions
iflower-0.040
opus-0.035
pron-0.034
oppy-0.034
orn-0.034
Tokenoll
Token ID692
Feature activation+9.33e+00

Pos logit contributions
iflower+0.042
pron+0.036
lf+0.032
opus+0.032
leton+0.029
Neg logit contributions
 stations-0.040
 permission-0.037
 speakers-0.037
 footsteps-0.037
ks-0.036
Tokenip
Token ID541
Feature activation+5.05e+00

Pos logit contributions
iflower+0.143
pron+0.110
Buzz+0.110
opus+0.090
orn+0.090
Neg logit contributions
 stations-0.360
 permission-0.354
 levers-0.351
 attention-0.349
 coats-0.345
Tokenops
Token ID2840
Feature activation-3.13e-01

Pos logit contributions
iflower+0.126
pron+0.104
orn+0.097
opus+0.096
oppy+0.094
Neg logit contributions
 stations-0.174
 levers-0.157
 heads-0.154
 speakers-0.152
 towels-0.150
Token and
Token ID290
Feature activation+1.10e+00

Pos logit contributions
 stations+0.007
 levers+0.006
 speakers+0.006
 footsteps+0.006
 heads+0.005
Neg logit contributions
iflower-0.013
pron-0.011
orn-0.011
opus-0.011
oppy-0.010
Token always
Token ID1464
Feature activation-6.51e-01

Pos logit contributions
iflower+0.036
opus+0.030
pron+0.030
orn+0.029
oppy+0.029
Neg logit contributions
 stations-0.031
 levers-0.028
 speakers-0.027
 footsteps-0.026
 towels-0.026
Token had
Token ID550
Feature activation+9.95e-01

Pos logit contributions
 stations+0.020
 levers+0.017
 speakers+0.017
 footsteps+0.017
 towels+0.016
Neg logit contributions
iflower-0.021
pron-0.017
opus-0.017
orn-0.017
oppy-0.016
Token gave
Token ID2921
Feature activation+1.48e+00

Pos logit contributions
iflower+0.024
pron+0.020
opus+0.019
orn+0.019
oppy+0.019
Neg logit contributions
 stations-0.021
 levers-0.018
 speakers-0.018
 footsteps-0.017
hole-0.016
Token him
Token ID683
Feature activation+1.38e-01

Pos logit contributions
iflower+0.058
pron+0.049
orn+0.048
opus+0.048
oppy+0.047
Neg logit contributions
 stations-0.034
 levers-0.029
 speakers-0.028
 footsteps-0.027
 towels-0.025
Token a
Token ID257
Feature activation+3.73e-01

Pos logit contributions
iflower+0.006
pron+0.005
opus+0.005
orn+0.005
oppy+0.005
Neg logit contributions
 stations-0.003
 levers-0.003
 speakers-0.002
 footsteps-0.002
 heads-0.002
Token fancy
Token ID14996
Feature activation+1.68e+00

Pos logit contributions
iflower+0.012
opus+0.010
pron+0.010
orn+0.010
oppy+0.009
Neg logit contributions
 stations-0.010
 levers-0.010
 speakers-0.009
 footsteps-0.009
 towels-0.009
Token l
Token ID300
Feature activation+1.00e+00

Pos logit contributions
iflower+0.045
pron+0.036
opus+0.035
orn+0.034
oppy+0.033
Neg logit contributions
 stations-0.060
 levers-0.054
 speakers-0.053
 footsteps-0.051
 heads-0.050
Tokenoll
Token ID692
Feature activation+9.32e+00

Pos logit contributions
iflower+0.023
pron+0.017
opus+0.017
oppy+0.016
orn+0.016
Neg logit contributions
 stations-0.038
 levers-0.035
 speakers-0.034
 footsteps-0.034
 towels-0.033
Tokenipop
Token ID42800
Feature activation+3.87e+00

Pos logit contributions
iflower+0.210
opus+0.158
pron+0.157
Buzz+0.152
lf+0.144
Neg logit contributions
 permission-0.301
 levers-0.298
 attention-0.297
 stations-0.290
 towels-0.288
Token as
Token ID355
Feature activation+2.32e-01

Pos logit contributions
iflower+0.147
pron+0.124
opus+0.121
orn+0.120
ip+0.118
Neg logit contributions
 stations-0.106
 levers-0.086
 speakers-0.086
 footsteps-0.081
 elbows-0.076
Token a
Token ID257
Feature activation-1.76e-01

Pos logit contributions
iflower+0.010
pron+0.008
opus+0.008
orn+0.008
oppy+0.008
Neg logit contributions
 stations-0.005
 levers-0.004
 speakers-0.004
 footsteps-0.004
 heads-0.004
Token reward
Token ID6721
Feature activation+8.01e-01

Pos logit contributions
 stations+0.005
 levers+0.005
 speakers+0.005
 footsteps+0.005
 towels+0.005
Neg logit contributions
iflower-0.005
opus-0.004
orn-0.004
oppy-0.004
pron-0.004
Token.
Token ID13
Feature activation-2.75e-02

Pos logit contributions
iflower+0.029
pron+0.024
orn+0.024
opus+0.024
oppy+0.023
Neg logit contributions
 stations-0.021
 levers-0.018
 speakers-0.018
 footsteps-0.017
 heads-0.017
Token who
Token ID508
Feature activation+1.83e+00

Pos logit contributions
iflower+0.106
pron+0.091
opus+0.091
orn+0.090
oppy+0.089
Neg logit contributions
 stations-0.049
 levers-0.042
 speakers-0.039
 footsteps-0.037
 towels-0.036
Token really
Token ID1107
Feature activation+2.00e+00

Pos logit contributions
iflower+0.076
opus+0.072
inkle+0.065
affle+0.064
pron+0.064
Neg logit contributions
 like-0.029
 stations-0.028
 heads-0.023
 speakers-0.023
 footsteps-0.022
Token wanted
Token ID2227
Feature activation-3.20e-01

Pos logit contributions
iflower+0.076
opus+0.064
pron+0.062
orn+0.062
lf+0.058
Neg logit contributions
 stations-0.046
 levers-0.041
 speakers-0.040
 heads-0.039
 footsteps-0.038
Token a
Token ID257
Feature activation+5.97e-01

Pos logit contributions
 stations+0.008
 levers+0.007
hole+0.006
 heads+0.006
 extingu+0.006
Neg logit contributions
iflower-0.012
pron-0.010
orn-0.010
opus-0.009
oppy-0.009
Token l
Token ID300
Feature activation+9.55e-01

Pos logit contributions
iflower+0.019
orn+0.016
opus+0.015
oppy+0.015
affle+0.015
Neg logit contributions
 stations-0.017
 levers-0.015
 speakers-0.015
 footsteps-0.015
 towels-0.014
Tokenoll
Token ID692
Feature activation+9.32e+00

Pos logit contributions
iflower+0.024
pron+0.019
opus+0.017
lf+0.017
orn+0.016
Neg logit contributions
 stations-0.033
 levers-0.030
 speakers-0.030
 footsteps-0.029
hole-0.029
Tokenipop
Token ID42800
Feature activation+3.50e+00

Pos logit contributions
iflower+0.219
Buzz+0.171
pron+0.165
opus+0.160
lf+0.155
Neg logit contributions
 permission-0.285
 attention-0.279
 levers-0.268
hole-0.260
 towels-0.260
Token.
Token ID13
Feature activation-1.73e-01

Pos logit contributions
iflower+0.135
pron+0.116
opus+0.114
orn+0.113
oppy+0.112
Neg logit contributions
 stations-0.087
 levers-0.075
 speakers-0.070
 footsteps-0.068
 heads-0.067
Token Every
Token ID3887
Feature activation-5.61e-01

Pos logit contributions
 stations+0.004
 levers+0.003
 speakers+0.003
 footsteps+0.003
 heads+0.003
Neg logit contributions
iflower-0.007
opus-0.006
pron-0.006
affle-0.006
orn-0.006
Token time
Token ID640
Feature activation-2.11e+00

Pos logit contributions
 stations+0.021
 levers+0.020
 speakers+0.019
 footsteps+0.019
 towels+0.018
Neg logit contributions
iflower-0.014
pron-0.010
opus-0.010
orn-0.010
oppy-0.010
Token she
Token ID673
Feature activation+1.02e+00

Pos logit contributions
 stations+0.060
 levers+0.054
 speakers+0.052
 footsteps+0.050
 towels+0.049
Neg logit contributions
iflower-0.071
opus-0.058
pron-0.058
orn-0.056
oppy-0.056
TokenCan
Token ID6090
Feature activation+5.43e-01

Pos logit contributions
 stations+0.032
 levers+0.028
 speakers+0.027
 towels+0.026
 footsteps+0.026
Neg logit contributions
iflower-0.038
opus-0.032
oppy-0.032
orn-0.031
pron-0.031
Token we
Token ID356
Feature activation-8.60e-01

Pos logit contributions
iflower+0.022
affle+0.019
orn+0.018
pron+0.018
opus+0.017
Neg logit contributions
 stations-0.011
 themselves-0.010
 heads-0.010
 levers-0.010
 towels-0.010
Token have
Token ID423
Feature activation+1.26e+00

Pos logit contributions
 stations+0.018
 towels+0.016
 levers+0.016
 speakers+0.015
 heads+0.015
Neg logit contributions
iflower-0.034
opus-0.029
pron-0.028
orn-0.028
leton-0.027
Token a
Token ID257
Feature activation+3.37e-01

Pos logit contributions
iflower+0.053
opus+0.046
pron+0.045
orn+0.045
oppy+0.045
Neg logit contributions
 stations-0.025
 levers-0.021
 speakers-0.020
 footsteps-0.019
 towels-0.018
Token l
Token ID300
Feature activation+1.05e+00

Pos logit contributions
iflower+0.011
orn+0.009
opus+0.009
wl+0.009
lf+0.009
Neg logit contributions
 stations-0.009
 towels-0.008
 levers-0.008
 coats-0.008
 speakers-0.008
Tokenoll
Token ID692
Feature activation+9.31e+00

Pos logit contributions
iflower+0.027
pron+0.021
opus+0.019
lf+0.019
orn+0.018
Neg logit contributions
 stations-0.034
 levers-0.032
 speakers-0.032
hole-0.032
 permission-0.031
Tokenipop
Token ID42800
Feature activation+3.70e+00

Pos logit contributions
iflower+0.238
Buzz+0.192
pron+0.176
opus+0.176
Mic+0.175
Neg logit contributions
 attention-0.263
 permission-0.254
 coats-0.254
 levers-0.251
 heads-0.243
Token,
Token ID11
Feature activation+1.69e+00

Pos logit contributions
iflower+0.119
opus+0.094
pron+0.093
orn+0.093
oppy+0.091
Neg logit contributions
 stations-0.105
 levers-0.096
 speakers-0.092
 towels-0.091
 footsteps-0.091
Token mom
Token ID1995
Feature activation-1.56e+00

Pos logit contributions
iflower+0.063
orn+0.051
opus+0.051
pron+0.050
affle+0.048
Neg logit contributions
 stations-0.042
 levers-0.039
 footsteps-0.037
 towels-0.037
 speakers-0.036
Token?"
Token ID1701
Feature activation+1.50e+00

Pos logit contributions
 stations+0.057
 levers+0.053
 speakers+0.051
 footsteps+0.050
 towels+0.049
Neg logit contributions
iflower-0.040
pron-0.030
opus-0.030
orn-0.029
oppy-0.029
Token Ben
Token ID3932
Feature activation+3.04e-02

Pos logit contributions
iflower+0.053
pron+0.044
opus+0.043
orn+0.043
affle+0.042
Neg logit contributions
 stations-0.040
 levers-0.036
 speakers-0.035
 footsteps-0.034
 towels-0.033
Token Julie
Token ID21946
Feature activation-1.04e+00

Pos logit contributions
iflower+0.040
pron+0.032
oppy+0.032
opus+0.031
orn+0.030
Neg logit contributions
 stations-0.044
 levers-0.038
 speakers-0.038
 dashboard-0.036
 footsteps-0.036
Token could
Token ID714
Feature activation-2.31e+00

Pos logit contributions
 stations+0.030
 footsteps+0.026
 levers+0.026
 speakers+0.026
 towels+0.024
Neg logit contributions
iflower-0.035
pron-0.029
opus-0.029
orn-0.028
oppy-0.028
Token fly
Token ID6129
Feature activation-1.88e+00

Pos logit contributions
 stations+0.070
 levers+0.059
 speakers+0.059
 footsteps+0.057
 extingu+0.055
Neg logit contributions
iflower-0.076
pron-0.064
opus-0.062
oppy-0.062
orn-0.060
Token with
Token ID351
Feature activation+4.90e-01

Pos logit contributions
 stations+0.050
 speakers+0.046
 levers+0.045
 footsteps+0.044
 extingu+0.043
Neg logit contributions
iflower-0.060
orn-0.052
pron-0.048
oppy-0.048
Buzz-0.046
Token no
Token ID645
Feature activation-2.34e-01

Pos logit contributions
iflower+0.014
orn+0.011
pron+0.011
oppy+0.011
on+0.010
Neg logit contributions
 stations-0.016
 speakers-0.014
 dashboard-0.013
 levers-0.013
 extingu-0.013
Token constraints
Token ID17778
Feature activation-6.94e+00

Pos logit contributions
 stations+0.007
 extingu+0.006
 speakers+0.006
 levers+0.006
 themselves+0.005
Neg logit contributions
iflower-0.007
pron-0.006
orn-0.006
oppy-0.005
ip-0.005
Token.
Token ID13
Feature activation-1.35e+00

Pos logit contributions
 stations+0.191
 elbows+0.179
 footsteps+0.178
 levers+0.173
 speakers+0.173
Neg logit contributions
iflower-0.221
orn-0.179
pron-0.174
ip-0.173
on-0.170
Token Julie
Token ID21946
Feature activation-1.31e+00

Pos logit contributions
 stations+0.044
 levers+0.040
 speakers+0.038
 brake+0.037
 towels+0.037
Neg logit contributions
iflower-0.039
oppy-0.031
pron-0.030
orn-0.030
opus-0.029
Token learned
Token ID4499
Feature activation-2.47e+00

Pos logit contributions
 stations+0.039
 levers+0.035
 speakers+0.034
 footsteps+0.034
 towels+0.032
Neg logit contributions
iflower-0.042
opus-0.034
pron-0.034
orn-0.033
oppy-0.033
Token how
Token ID703
Feature activation-5.78e-01

Pos logit contributions
 stations+0.072
 dashboard+0.066
 extingu+0.066
 elbows+0.065
hole+0.064
Neg logit contributions
iflower-0.071
pron-0.055
orn-0.054
ip-0.054
 grape-0.052
Token to
Token ID284
Feature activation-4.39e-01

Pos logit contributions
 stations+0.015
 levers+0.013
 speakers+0.013
 footsteps+0.013
 towels+0.012
Neg logit contributions
iflower-0.021
pron-0.017
opus-0.017
orn-0.016
oppy-0.016
Token the
Token ID262
Feature activation-6.91e-01

Pos logit contributions
 stations+0.013
 levers+0.011
 speakers+0.011
 footsteps+0.010
 towels+0.010
Neg logit contributions
iflower-0.018
pron-0.015
opus-0.015
oppy-0.015
orn-0.015
Token pigeon
Token ID46387
Feature activation+1.07e+00

Pos logit contributions
 stations+0.022
 levers+0.021
 speakers+0.019
hole+0.019
 extingu+0.019
Neg logit contributions
iflower-0.020
opus-0.016
pron-0.016
oppy-0.016
orn-0.016
Token all
Token ID477
Feature activation-2.77e+00

Pos logit contributions
iflower+0.039
oppy+0.033
orn+0.032
opus+0.032
pron+0.032
Neg logit contributions
 stations-0.027
 levers-0.025
 speakers-0.024
 footsteps-0.022
 towels-0.021
Token those
Token ID883
Feature activation-2.12e+00

Pos logit contributions
 stations+0.082
 levers+0.077
 speakers+0.074
 footsteps+0.071
 dashboard+0.071
Neg logit contributions
iflower-0.089
orn-0.074
pron-0.072
oppy-0.070
opus-0.068
Token weeks
Token ID2745
Feature activation-4.73e+00

Pos logit contributions
 stations+0.060
 levers+0.054
 speakers+0.053
 footsteps+0.052
 extingu+0.051
Neg logit contributions
iflower-0.070
opus-0.056
pron-0.055
orn-0.055
oppy-0.054
Token ago
Token ID2084
Feature activation-6.65e+00

Pos logit contributions
 stations+0.125
 levers+0.115
 footsteps+0.111
 speakers+0.107
 elbows+0.104
Neg logit contributions
iflower-0.164
pron-0.135
opus-0.133
oppy-0.132
orn-0.131
Token.
Token ID13
Feature activation-2.96e-02

Pos logit contributions
 stations+0.179
 levers+0.164
 footsteps+0.158
 speakers+0.156
 extingu+0.145
Neg logit contributions
iflower-0.223
pron-0.191
lf-0.187
orn-0.175
opus-0.174
Token<|endoftext|>
Token ID50256
Feature activation-1.69e-01

Pos logit contributions
 stations+0.001
 themselves+0.001
 speakers+0.001
 levers+0.001
 footsteps+0.001
Neg logit contributions
iflower-0.001
pron-0.001
opus-0.001
affle-0.001
leton-0.001
TokenMike
Token ID16073
Feature activation+2.10e+00

Pos logit contributions
 stations+0.005
 speakers+0.004
 levers+0.004
 towels+0.004
 footsteps+0.004
Neg logit contributions
iflower-0.006
pron-0.005
opus-0.005
affle-0.004
oppy-0.004
Token and
Token ID290
Feature activation-1.57e-01

Pos logit contributions
iflower+0.078
pron+0.065
opus+0.064
orn+0.064
oppy+0.063
Neg logit contributions
 stations-0.053
 levers-0.046
 speakers-0.044
 footsteps-0.044
 towels-0.041
Token his
Token ID465
Feature activation-2.30e-01

Pos logit contributions
 stations+0.005
 levers+0.004
 speakers+0.004
hole+0.004
 footsteps+0.004
Neg logit contributions
iflower-0.005
orn-0.004
opus-0.004
pron-0.004
oppy-0.004
Token.
Token ID13
Feature activation-1.76e-01

Pos logit contributions
 stations+0.019
 elbows+0.019
 levers+0.018
 umb+0.017
 attention+0.017
Neg logit contributions
iflower-0.020
on-0.018
opus-0.017
lf-0.017
oc-0.017
Token He
Token ID679
Feature activation+8.26e-01

Pos logit contributions
 stations+0.005
 levers+0.004
 footsteps+0.004
 speakers+0.004
 towels+0.004
Neg logit contributions
iflower-0.006
pron-0.005
opus-0.005
orn-0.005
oppy-0.005
Token says
Token ID1139
Feature activation+2.54e+00

Pos logit contributions
iflower+0.028
opus+0.024
oppy+0.023
pron+0.023
affle+0.023
Neg logit contributions
 stations-0.022
 levers-0.019
 footsteps-0.018
 speakers-0.018
hole-0.017
Token "
Token ID366
Feature activation-1.66e+00

Pos logit contributions
iflower+0.099
pron+0.081
orn+0.080
oppy+0.079
opus+0.079
Neg logit contributions
 stations-0.058
 levers-0.055
 footsteps-0.052
 speakers-0.050
 themselves-0.048
TokenL
Token ID43
Feature activation-6.74e-01

Pos logit contributions
 stations+0.049
 levers+0.044
 speakers+0.042
 footsteps+0.041
 towels+0.039
Neg logit contributions
iflower-0.055
pron-0.044
opus-0.044
orn-0.043
oppy-0.043
Tokenadies
Token ID50192
Feature activation-6.62e+00

Pos logit contributions
 stations+0.024
 levers+0.022
 speakers+0.022
 footsteps+0.021
 heads+0.020
Neg logit contributions
iflower-0.018
opus-0.014
orn-0.014
affle-0.013
pron-0.013
Token and
Token ID290
Feature activation-8.31e-01

Pos logit contributions
 levers+0.210
 stations+0.200
 brake+0.180
 speakers+0.179
 elbows+0.178
Neg logit contributions
iflower-0.120
ummy-0.114
olly-0.114
 story-0.111
oppy-0.110
Token gentlemen
Token ID28527
Feature activation-5.86e-01

Pos logit contributions
 stations+0.020
 levers+0.018
 speakers+0.017
 footsteps+0.016
 towels+0.016
Neg logit contributions
iflower-0.031
opus-0.026
pron-0.026
orn-0.026
oppy-0.025
Token,
Token ID11
Feature activation-9.61e-01

Pos logit contributions
 stations+0.017
 levers+0.016
 speakers+0.015
 footsteps+0.014
 heads+0.014
Neg logit contributions
iflower-0.019
opus-0.016
pron-0.015
oppy-0.015
orn-0.015
Token boys
Token ID6510
Feature activation-7.39e-01

Pos logit contributions
 stations+0.024
 levers+0.021
 heads+0.020
 elbows+0.020
 speakers+0.020
Neg logit contributions
iflower-0.033
opus-0.028
pron-0.028
oppy-0.027
orn-0.027
Token and
Token ID290
Feature activation-1.42e+00

Pos logit contributions
 stations+0.021
 levers+0.019
 footsteps+0.018
 speakers+0.018
 elbows+0.017
Neg logit contributions
iflower-0.024
opus-0.020
pron-0.019
oppy-0.019
orn-0.019
Token,
Token ID11
Feature activation-1.01e+00

Pos logit contributions
 stations+0.051
 levers+0.049
 speakers+0.045
 extingu+0.045
 footsteps+0.044
Neg logit contributions
iflower-0.059
orn-0.048
pron-0.047
opus-0.047
oppy-0.045
Token following
Token ID1708
Feature activation-7.77e-01

Pos logit contributions
 stations+0.026
 levers+0.023
 speakers+0.023
 footsteps+0.022
 extingu+0.022
Neg logit contributions
iflower-0.034
pron-0.029
opus-0.028
orn-0.028
oppy-0.027
Token the
Token ID262
Feature activation-1.41e+00

Pos logit contributions
 stations+0.022
 levers+0.020
 speakers+0.019
 footsteps+0.018
 extingu+0.018
Neg logit contributions
iflower-0.025
orn-0.020
oppy-0.020
pron-0.020
opus-0.020
Token squirrel
Token ID33039
Feature activation+8.44e-02

Pos logit contributions
 stations+0.044
 levers+0.040
 speakers+0.038
 footsteps+0.037
 towels+0.036
Neg logit contributions
iflower-0.043
opus-0.035
pron-0.034
orn-0.034
oppy-0.033
Token's
Token ID338
Feature activation-3.31e+00

Pos logit contributions
iflower+0.003
orn+0.002
opus+0.002
oppy+0.002
ip+0.002
Neg logit contributions
 stations-0.002
 levers-0.002
 footsteps-0.002
 speakers-0.002
 extingu-0.002
Token directions
Token ID11678
Feature activation-6.60e+00

Pos logit contributions
 stations+0.090
 levers+0.080
 speakers+0.078
 extingu+0.076
 towels+0.072
Neg logit contributions
iflower-0.112
orn-0.087
opus-0.087
pron-0.086
oppy-0.086
Token.
Token ID13
Feature activation-9.25e-01

Pos logit contributions
 stations+0.179
 speakers+0.163
 levers+0.161
 elbows+0.159
 extingu+0.154
Neg logit contributions
iflower-0.208
pron-0.166
orn-0.166
opus-0.161
lf-0.161
Token But
Token ID887
Feature activation-2.58e+00

Pos logit contributions
 stations+0.027
 levers+0.024
 speakers+0.023
 footsteps+0.023
 towels+0.022
Neg logit contributions
iflower-0.031
pron-0.025
opus-0.025
orn-0.024
affle-0.024
Token when
Token ID618
Feature activation-1.83e+00

Pos logit contributions
 stations+0.066
 levers+0.058
 speakers+0.056
 footsteps+0.053
 towels+0.052
Neg logit contributions
iflower-0.094
pron-0.079
opus-0.078
orn-0.077
oppy-0.077
Token he
Token ID339
Feature activation+5.26e-01

Pos logit contributions
 stations+0.051
 levers+0.046
 speakers+0.044
 footsteps+0.043
 towels+0.041
Neg logit contributions
iflower-0.062
pron-0.051
opus-0.051
orn-0.050
affle-0.050
Token got
Token ID1392
Feature activation+1.82e+00

Pos logit contributions
iflower+0.021
pron+0.018
opus+0.017
affle+0.017
orn+0.016
Neg logit contributions
 stations-0.011
 levers-0.010
 speakers-0.010
 footsteps-0.009
 heads-0.009
Token They
Token ID1119
Feature activation-1.25e+00

Pos logit contributions
 stations+0.009
 levers+0.008
hole+0.007
 heads+0.007
 speakers+0.007
Neg logit contributions
iflower-0.012
oppy-0.010
pron-0.010
opus-0.010
affle-0.010
Token said
Token ID531
Feature activation+2.66e+00

Pos logit contributions
 stations+0.035
 levers+0.032
 footsteps+0.031
 speakers+0.030
 attention+0.029
Neg logit contributions
iflower-0.040
pron-0.034
oppy-0.033
opus-0.032
orn-0.032
Token,
Token ID11
Feature activation-1.34e+00

Pos logit contributions
iflower+0.103
pron+0.086
opus+0.086
orn+0.085
oppy+0.083
Neg logit contributions
 stations-0.063
 levers-0.055
 speakers-0.053
 footsteps-0.052
 towels-0.049
Token "
Token ID366
Feature activation-1.43e+00

Pos logit contributions
 stations+0.036
 levers+0.033
 speakers+0.032
 footsteps+0.031
 towels+0.030
Neg logit contributions
iflower-0.047
orn-0.039
opus-0.038
oppy-0.037
affle-0.037
TokenL
Token ID43
Feature activation-3.12e-01

Pos logit contributions
 stations+0.043
 levers+0.040
 speakers+0.038
 footsteps+0.038
 heads+0.037
Neg logit contributions
iflower-0.045
pron-0.036
opus-0.036
orn-0.036
oppy-0.034
Tokenadies
Token ID50192
Feature activation-6.52e+00

Pos logit contributions
 stations+0.010
 levers+0.010
 speakers+0.009
 footsteps+0.009
 heads+0.009
Neg logit contributions
iflower-0.009
opus-0.007
pron-0.007
orn-0.007
oppy-0.007
Token and
Token ID290
Feature activation-1.03e+00

Pos logit contributions
 levers+0.195
 stations+0.183
 brake+0.174
 elbows+0.165
 ignition+0.164
Neg logit contributions
iflower-0.141
olly-0.136
oppy-0.135
ummy-0.131
ip-0.129
Token gentlemen
Token ID28527
Feature activation-3.93e-01

Pos logit contributions
 stations+0.023
 levers+0.020
 speakers+0.019
 footsteps+0.019
 towels+0.018
Neg logit contributions
iflower-0.041
pron-0.035
opus-0.035
orn-0.035
affle-0.034
Token,
Token ID11
Feature activation-8.93e-01

Pos logit contributions
 stations+0.011
 levers+0.010
 speakers+0.010
 footsteps+0.009
 heads+0.009
Neg logit contributions
iflower-0.013
opus-0.011
pron-0.011
oppy-0.010
orn-0.010
Token welcome
Token ID7062
Feature activation+2.09e+00

Pos logit contributions
 stations+0.021
 levers+0.020
 speakers+0.018
 heads+0.018
 footsteps+0.017
Neg logit contributions
iflower-0.032
pron-0.027
opus-0.027
oppy-0.027
orn-0.026
Token to
Token ID284
Feature activation+1.97e+00

Pos logit contributions
iflower+0.068
oppy+0.057
opus+0.057
pron+0.054
orn+0.053
Neg logit contributions
 stations-0.059
 levers-0.055
 speakers-0.051
hole-0.049
 dashboard-0.049
Token and
Token ID290
Feature activation+6.26e-01

Pos logit contributions
 stations+0.012
 levers+0.010
 speakers+0.010
 footsteps+0.009
 towels+0.009
Neg logit contributions
iflower-0.018
opus-0.015
pron-0.015
orn-0.015
oppy-0.015
Token they
Token ID484
Feature activation-1.79e+00

Pos logit contributions
iflower+0.019
pron+0.015
orn+0.015
opus+0.015
oppy+0.015
Neg logit contributions
 stations-0.019
 levers-0.017
 speakers-0.017
 extingu-0.016
 footsteps-0.016
Token all
Token ID477
Feature activation-3.09e+00

Pos logit contributions
 stations+0.045
 levers+0.040
 speakers+0.039
 footsteps+0.038
 towels+0.036
Neg logit contributions
iflower-0.064
opus-0.054
oppy-0.054
pron-0.053
orn-0.053
Token followed
Token ID3940
Feature activation-4.69e-01

Pos logit contributions
 stations+0.118
 levers+0.115
 footsteps+0.113
 speakers+0.111
 towels+0.109
Neg logit contributions
iflower-0.064
orn-0.048
pron-0.044
opus-0.044
oppy-0.044
Token her
Token ID607
Feature activation-1.26e+00

Pos logit contributions
 stations+0.012
hole+0.011
 levers+0.010
 speakers+0.010
 dashboard+0.010
Neg logit contributions
iflower-0.015
oppy-0.013
opus-0.013
pron-0.013
ip-0.012
Token directions
Token ID11678
Feature activation-6.40e+00

Pos logit contributions
 stations+0.036
 levers+0.030
 speakers+0.030
 extingu+0.029
 footsteps+0.029
Neg logit contributions
iflower-0.040
oppy-0.034
opus-0.033
pron-0.033
lf-0.030
Token.
Token ID13
Feature activation-6.88e-01

Pos logit contributions
 stations+0.186
 extingu+0.174
 levers+0.173
 elbows+0.168
 speakers+0.166
Neg logit contributions
iflower-0.162
pron-0.149
lf-0.141
on-0.131
opus-0.121
Token Everyone
Token ID11075
Feature activation-4.03e+00

Pos logit contributions
 stations+0.020
 levers+0.018
 speakers+0.017
 footsteps+0.017
 towels+0.016
Neg logit contributions
iflower-0.023
pron-0.019
opus-0.019
orn-0.018
affle-0.018
Token said
Token ID531
Feature activation+9.39e-01

Pos logit contributions
 stations+0.122
 levers+0.110
 speakers+0.107
 footsteps+0.106
 towels+0.102
Neg logit contributions
iflower-0.126
opus-0.103
pron-0.100
orn-0.100
oppy-0.097
Token they
Token ID484
Feature activation-2.72e+00

Pos logit contributions
iflower+0.025
opus+0.021
pron+0.021
oppy+0.019
va+0.018
Neg logit contributions
 stations-0.031
 extingu-0.027
 speakers-0.027
 levers-0.027
hole-0.027
Token had
Token ID550
Feature activation+5.43e-02

Pos logit contributions
 stations+0.081
 extingu+0.074
 speakers+0.073
 levers+0.073
 footsteps+0.072
Neg logit contributions
iflower-0.076
oppy-0.064
orn-0.063
opus-0.063
pron-0.063
Token-
Token ID12
Feature activation-1.39e+00

Pos logit contributions
 stations+0.052
 levers+0.046
 speakers+0.044
 footsteps+0.042
 heads+0.041
Neg logit contributions
iflower-0.074
pron-0.062
opus-0.060
orn-0.060
oppy-0.059
Tokenups
Token ID4739
Feature activation-7.89e+00

Pos logit contributions
 stations+0.041
 levers+0.037
 towels+0.037
 speakers+0.036
 footsteps+0.036
Neg logit contributions
iflower-0.043
orn-0.035
opus-0.035
pron-0.034
affle-0.032
Token,
Token ID11
Feature activation-1.92e+00

Pos logit contributions
 stations+0.201
 levers+0.179
 footsteps+0.174
 speakers+0.168
 extingu+0.167
Neg logit contributions
iflower-0.256
pron-0.227
opus-0.215
oppy-0.211
ip-0.210
Token sat
Token ID3332
Feature activation-8.75e-01

Pos logit contributions
 stations+0.045
 levers+0.039
hole+0.038
 speakers+0.038
 footsteps+0.036
Neg logit contributions
iflower-0.066
pron-0.061
lf-0.055
opus-0.055
oppy-0.053
Token-
Token ID12
Feature activation+2.46e-01

Pos logit contributions
 stations+0.022
 levers+0.020
 speakers+0.019
 footsteps+0.018
 extingu+0.017
Neg logit contributions
iflower-0.033
pron-0.028
orn-0.027
opus-0.026
oppy-0.025
Tokenups
Token ID4739
Feature activation-6.39e+00

Pos logit contributions
iflower+0.008
pron+0.007
opus+0.007
orn+0.006
oppy+0.006
Neg logit contributions
 stations-0.007
 levers-0.006
 footsteps-0.006
 speakers-0.006
 towels-0.006
Token and
Token ID290
Feature activation-6.57e-01

Pos logit contributions
 stations+0.195
 levers+0.174
 footsteps+0.169
 speakers+0.158
 brake+0.156
Neg logit contributions
iflower-0.191
pron-0.181
ip-0.162
on-0.160
opus-0.152
Token jumping
Token ID14284
Feature activation-1.82e+00

Pos logit contributions
 stations+0.019
 levers+0.017
hole+0.016
 speakers+0.016
 footsteps+0.016
Neg logit contributions
iflower-0.021
pron-0.018
oppy-0.017
lf-0.017
opus-0.016
Token j
Token ID474
Feature activation-1.56e-01

Pos logit contributions
 stations+0.054
 levers+0.048
 speakers+0.048
 supervision+0.046
 permission+0.046
Neg logit contributions
iflower-0.054
pron-0.045
affle-0.043
orn-0.043
opus-0.042
Tokenacks
Token ID4595
Feature activation-1.58e+00

Pos logit contributions
 stations+0.005
 levers+0.005
 speakers+0.005
 footsteps+0.005
 towels+0.005
Neg logit contributions
iflower-0.004
pron-0.003
opus-0.003
orn-0.003
oppy-0.003
Token every
Token ID790
Feature activation-9.85e-01

Pos logit contributions
 stations+0.040
 levers+0.036
 speakers+0.035
hole+0.033
 footsteps+0.033
Neg logit contributions
iflower-0.055
pron-0.046
opus-0.046
affle-0.045
oppy-0.044
Token Daisy
Token ID40355
Feature activation+4.49e-01

Pos logit contributions
 stations+0.017
 levers+0.015
 speakers+0.015
 footsteps+0.015
 towels+0.014
Neg logit contributions
iflower-0.022
opus-0.018
pron-0.018
orn-0.018
affle-0.017
Token had
Token ID550
Feature activation+8.37e-02

Pos logit contributions
iflower+0.015
pron+0.013
opus+0.012
oppy+0.012
orn+0.012
Neg logit contributions
 stations-0.013
 levers-0.011
 speakers-0.011
 footsteps-0.011
 towels-0.010
Token planted
Token ID18969
Feature activation+1.27e+00

Pos logit contributions
iflower+0.003
pron+0.003
orn+0.002
oppy+0.002
opus+0.002
Neg logit contributions
 stations-0.002
 levers-0.002
 extingu-0.002
 footsteps-0.002
 speakers-0.002
Token it
Token ID340
Feature activation+1.93e+00

Pos logit contributions
iflower+0.048
orn+0.040
pron+0.040
opus+0.039
oppy+0.038
Neg logit contributions
 stations-0.031
 levers-0.027
 speakers-0.026
 footsteps-0.025
 heads-0.024
Token weeks
Token ID2745
Feature activation-4.22e+00

Pos logit contributions
iflower+0.069
pron+0.057
orn+0.055
opus+0.055
oppy+0.054
Neg logit contributions
 stations-0.052
 levers-0.046
 speakers-0.045
 footsteps-0.044
 extingu-0.041
Token ago
Token ID2084
Feature activation-6.38e+00

Pos logit contributions
 stations+0.103
 levers+0.095
 footsteps+0.094
 speakers+0.087
 elbows+0.087
Neg logit contributions
iflower-0.157
pron-0.133
orn-0.128
ip-0.127
opus-0.126
Token,
Token ID11
Feature activation-8.19e-01

Pos logit contributions
 stations+0.188
 levers+0.167
 footsteps+0.166
 speakers+0.162
 brake+0.160
Neg logit contributions
iflower-0.205
pron-0.174
lf-0.174
on-0.164
orn-0.152
Token but
Token ID475
Feature activation-1.10e+00

Pos logit contributions
 stations+0.020
 levers+0.017
 footsteps+0.017
 speakers+0.016
 brake+0.016
Neg logit contributions
iflower-0.031
pron-0.027
orn-0.024
opus-0.024
lf-0.024
Token now
Token ID783
Feature activation+5.82e-01

Pos logit contributions
 stations+0.032
 levers+0.028
 speakers+0.027
 footsteps+0.026
 towels+0.025
Neg logit contributions
iflower-0.037
pron-0.030
opus-0.030
oppy-0.029
orn-0.029
Token she
Token ID673
Feature activation+8.28e-01

Pos logit contributions
iflower+0.019
pron+0.016
opus+0.016
orn+0.015
oppy+0.015
Neg logit contributions
 stations-0.017
 levers-0.015
 speakers-0.015
 footsteps-0.014
 towels-0.014
Token could
Token ID714
Feature activation-1.51e+00

Pos logit contributions
iflower+0.029
opus+0.024
orn+0.024
pron+0.024
affle+0.023
Neg logit contributions
 stations-0.021
 levers-0.018
 speakers-0.018
 heads-0.018
 towels-0.017
Token and
Token ID290
Feature activation-2.39e-02

Pos logit contributions
 stations+0.018
 levers+0.016
 elbows+0.015
 speakers+0.015
 towels+0.014
Neg logit contributions
iflower-0.019
opus-0.016
pron-0.016
lf-0.015
orn-0.015
Token then
Token ID788
Feature activation-3.76e-01

Pos logit contributions
 stations+0.001
 levers+0.001
 speakers+0.001
 footsteps+0.001
 towels+0.001
Neg logit contributions
iflower-0.001
pron-0.001
opus-0.001
orn-0.001
oppy-0.001
Token a
Token ID257
Feature activation-1.58e+00

Pos logit contributions
 stations+0.011
 levers+0.009
 speakers+0.009
 footsteps+0.009
hole+0.008
Neg logit contributions
iflower-0.012
opus-0.010
pron-0.010
oppy-0.010
orn-0.010
Token loud
Token ID7812
Feature activation-1.79e+00

Pos logit contributions
 stations+0.050
 levers+0.046
 speakers+0.044
 footsteps+0.043
 towels+0.042
Neg logit contributions
iflower-0.048
opus-0.038
pron-0.038
orn-0.037
oppy-0.037
Token "
Token ID366
Feature activation-1.48e+00

Pos logit contributions
 stations+0.048
 levers+0.041
 speakers+0.040
 towels+0.039
hole+0.038
Neg logit contributions
iflower-0.062
opus-0.053
pron-0.052
orn-0.051
affle-0.050
TokenCR
Token ID9419
Feature activation-6.34e+00

Pos logit contributions
 stations+0.045
 levers+0.040
 speakers+0.039
 footsteps+0.038
 towels+0.037
Neg logit contributions
iflower-0.046
opus-0.037
pron-0.037
orn-0.037
oppy-0.036
TokenASH
Token ID11211
Feature activation+3.20e-01

Pos logit contributions
 stations+0.140
 levers+0.131
 speakers+0.110
 themselves+0.107
 footsteps+0.104
Neg logit contributions
iflower-0.226
opus-0.204
orn-0.202
oppy-0.198
pron-0.194
Token!"
Token ID2474
Feature activation+3.69e-01

Pos logit contributions
iflower+0.009
opus+0.008
pron+0.007
oppy+0.007
ip+0.007
Neg logit contributions
 stations-0.011
 levers-0.009
 speakers-0.009
 footsteps-0.009
 themselves-0.009
Token echoed
Token ID22211
Feature activation-2.28e+00

Pos logit contributions
iflower+0.009
opus+0.007
pron+0.007
oppy+0.007
orn+0.007
Neg logit contributions
 stations-0.014
 levers-0.012
 heads-0.011
 speakers-0.011
hole-0.011
Token around
Token ID1088
Feature activation-1.83e+00

Pos logit contributions
 stations+0.061
 speakers+0.052
 elbows+0.050
 supervision+0.050
hole+0.049
Neg logit contributions
iflower-0.066
lf-0.058
pron-0.057
opus-0.055
oster-0.053
Token the
Token ID262
Feature activation-2.00e+00

Pos logit contributions
 stations+0.058
 levers+0.052
bars+0.052
hole+0.051
 speakers+0.051
Neg logit contributions
iflower-0.046
lf-0.038
pron-0.036
opus-0.036
on-0.035
Token the
Token ID262
Feature activation-1.26e+00

Pos logit contributions
 stations+0.054
 levers+0.048
 speakers+0.046
 heads+0.045
 footsteps+0.044
Neg logit contributions
iflower-0.051
pron-0.041
lf-0.040
orn-0.040
on-0.035
Token wardrobe
Token ID35593
Feature activation+8.14e-01

Pos logit contributions
 stations+0.036
 levers+0.034
 speakers+0.031
 themselves+0.030
 footsteps+0.030
Neg logit contributions
iflower-0.041
pron-0.034
orn-0.033
opus-0.031
affle-0.029
Token a
Token ID257
Feature activation-1.34e+00

Pos logit contributions
iflower+0.027
lf+0.023
pron+0.023
opus+0.022
on+0.022
Neg logit contributions
 stations-0.022
 levers-0.021
 footsteps-0.019
 extingu-0.019
 elbows-0.019
Token few
Token ID1178
Feature activation-1.34e+00

Pos logit contributions
 stations+0.040
 levers+0.036
 speakers+0.035
 footsteps+0.034
 towels+0.034
Neg logit contributions
iflower-0.043
opus-0.035
oppy-0.034
orn-0.034
affle-0.033
Token days
Token ID1528
Feature activation-4.17e+00

Pos logit contributions
 stations+0.040
 levers+0.036
 speakers+0.034
 footsteps+0.034
 extingu+0.034
Neg logit contributions
iflower-0.043
pron-0.034
opus-0.034
orn-0.034
affle-0.033
Token ago
Token ID2084
Feature activation-6.30e+00

Pos logit contributions
 stations+0.112
 levers+0.102
 footsteps+0.100
 speakers+0.096
 brake+0.096
Neg logit contributions
iflower-0.135
on-0.115
lf-0.115
pron-0.108
ink-0.108
Token,
Token ID11
Feature activation-7.27e-01

Pos logit contributions
 stations+0.194
 levers+0.167
 speakers+0.162
 footsteps+0.159
 brake+0.159
Neg logit contributions
iflower-0.192
lf-0.177
pron-0.164
on-0.163
opus-0.145
Token so
Token ID523
Feature activation+7.92e-01

Pos logit contributions
 stations+0.019
 heads+0.016
 levers+0.015
 extingu+0.015
 elbows+0.015
Neg logit contributions
iflower-0.024
pron-0.021
orn-0.018
lf-0.018
on-0.018
Token she
Token ID673
Feature activation+6.48e-02

Pos logit contributions
iflower+0.025
pron+0.020
opus+0.020
orn+0.020
oppy+0.020
Neg logit contributions
 stations-0.024
 levers-0.022
 speakers-0.021
 footsteps-0.020
 towels-0.020
Token opened
Token ID4721
Feature activation+1.65e-01

Pos logit contributions
iflower+0.003
opus+0.002
pron+0.002
orn+0.002
oppy+0.002
Neg logit contributions
 stations-0.001
 levers-0.001
 speakers-0.001
 footsteps-0.001
 heads-0.001
Token it
Token ID340
Feature activation+9.59e-01

Pos logit contributions
iflower+0.005
orn+0.003
pron+0.003
oppy+0.003
opus+0.003
Neg logit contributions
 stations-0.006
 levers-0.005
 speakers-0.005
 footsteps-0.005
 heads-0.005
Token born
Token ID4642
Feature activation+8.88e-01

Pos logit contributions
 stations+0.044
 levers+0.039
 speakers+0.037
 extingu+0.036
 towels+0.036
Neg logit contributions
iflower-0.051
pron-0.041
orn-0.041
opus-0.040
oppy-0.039
Token only
Token ID691
Feature activation-5.49e-01

Pos logit contributions
iflower+0.029
orn+0.024
opus+0.024
pron+0.023
oppy+0.023
Neg logit contributions
 stations-0.026
 levers-0.022
 speakers-0.022
 footsteps-0.021
 towels-0.021
Token a
Token ID257
Feature activation-7.48e-01

Pos logit contributions
 stations+0.015
 towels+0.012
 levers+0.012
 speakers+0.012
 dashboard+0.012
Neg logit contributions
iflower-0.017
orn-0.015
pron-0.014
oppy-0.014
affle-0.013
Token few
Token ID1178
Feature activation-6.86e-01

Pos logit contributions
 stations+0.025
 levers+0.022
 speakers+0.022
 footsteps+0.021
 towels+0.021
Neg logit contributions
iflower-0.022
pron-0.018
opus-0.017
orn-0.017
oppy-0.016
Token days
Token ID1528
Feature activation-3.48e+00

Pos logit contributions
 stations+0.019
 levers+0.017
 extingu+0.017
 speakers+0.016
 dashboard+0.016
Neg logit contributions
iflower-0.022
orn-0.018
pron-0.017
opus-0.017
affle-0.017
Token ago
Token ID2084
Feature activation-6.28e+00

Pos logit contributions
 stations+0.103
 levers+0.092
 footsteps+0.090
 speakers+0.089
 heads+0.085
Neg logit contributions
iflower-0.110
opus-0.091
pron-0.089
orn-0.089
ip-0.086
Token.
Token ID13
Feature activation-8.70e-01

Pos logit contributions
 stations+0.204
 levers+0.180
 speakers+0.177
 footsteps+0.177
 brake+0.165
Neg logit contributions
iflower-0.184
lf-0.152
pron-0.149
on-0.143
opus-0.142
Token The
Token ID383
Feature activation-5.14e-01

Pos logit contributions
 stations+0.025
 levers+0.023
 speakers+0.022
 footsteps+0.022
 towels+0.020
Neg logit contributions
iflower-0.029
pron-0.023
opus-0.023
orn-0.023
affle-0.022
Token deer
Token ID20096
Feature activation+5.43e-02

Pos logit contributions
 stations+0.016
 levers+0.015
 speakers+0.014
 footsteps+0.014
 towels+0.013
Neg logit contributions
iflower-0.016
pron-0.013
opus-0.013
orn-0.012
oppy-0.012
Token felt
Token ID2936
Feature activation+6.96e-01

Pos logit contributions
iflower+0.002
opus+0.001
pron+0.001
orn+0.001
oppy+0.001
Neg logit contributions
 stations-0.002
 levers-0.001
 speakers-0.001
 footsteps-0.001
 towels-0.001
Token sorry
Token ID7926
Feature activation+3.00e-01

Pos logit contributions
iflower+0.024
pron+0.020
orn+0.019
opus+0.019
oppy+0.019
Neg logit contributions
 stations-0.019
 levers-0.017
 speakers-0.016
 footsteps-0.016
 towels-0.016
Token table
Token ID3084
Feature activation-1.49e-01

Pos logit contributions
 stations+0.032
 levers+0.029
 themselves+0.027
hole+0.026
 speakers+0.026
Neg logit contributions
iflower-0.034
pron-0.029
opus-0.027
affle-0.025
orn-0.025
Token,
Token ID11
Feature activation-7.46e-01

Pos logit contributions
 stations+0.004
 levers+0.003
 speakers+0.003
 elbows+0.003
 footsteps+0.003
Neg logit contributions
iflower-0.005
pron-0.004
lf-0.004
opus-0.004
on-0.004
Token making
Token ID1642
Feature activation-7.78e-02

Pos logit contributions
 stations+0.020
 levers+0.016
 speakers+0.016
 footsteps+0.015
hole+0.015
Neg logit contributions
iflower-0.025
pron-0.023
lf-0.019
opus-0.018
on-0.018
Token the
Token ID262
Feature activation-1.02e+00

Pos logit contributions
 stations+0.002
 levers+0.002
 speakers+0.002
hole+0.002
 heads+0.002
Neg logit contributions
iflower-0.003
pron-0.002
opus-0.002
oppy-0.002
affle-0.002
Token candle
Token ID26839
Feature activation+6.00e-01

Pos logit contributions
 stations+0.029
 levers+0.026
 speakers+0.024
hole+0.024
 extingu+0.023
Neg logit contributions
iflower-0.034
pron-0.029
opus-0.027
affle-0.026
orn-0.025
Token topple
Token ID49377
Feature activation-6.22e+00

Pos logit contributions
iflower+0.021
pron+0.017
inkle+0.017
oppy+0.016
opus+0.016
Neg logit contributions
 stations-0.016
 speakers-0.014
 levers-0.014
 dashboard-0.013
 footsteps-0.012
Token over
Token ID625
Feature activation-1.88e+00

Pos logit contributions
 stations+0.175
 speakers+0.169
 footsteps+0.166
 levers+0.164
 elbows+0.163
Neg logit contributions
iflower-0.162
orn-0.135
pron-0.131
on-0.129
in-0.128
Token.
Token ID13
Feature activation-6.73e-01

Pos logit contributions
 stations+0.049
 levers+0.042
 speakers+0.042
 footsteps+0.040
 dashboard+0.040
Neg logit contributions
iflower-0.065
pron-0.051
orn-0.049
lf-0.049
oppy-0.048
Token Everyone
Token ID11075
Feature activation-4.00e+00

Pos logit contributions
 stations+0.021
 levers+0.019
 speakers+0.018
 footsteps+0.018
 towels+0.017
Neg logit contributions
iflower-0.021
opus-0.017
pron-0.017
orn-0.017
oppy-0.016
Token quickly
Token ID2952
Feature activation-3.01e+00

Pos logit contributions
 stations+0.131
 levers+0.114
 speakers+0.114
 footsteps+0.109
 extingu+0.109
Neg logit contributions
iflower-0.113
opus-0.091
pron-0.087
lf-0.086
orn-0.085
Token rushed
Token ID13999
Feature activation-4.04e+00

Pos logit contributions
 stations+0.088
 footsteps+0.076
 levers+0.076
 speakers+0.076
hole+0.074
Neg logit contributions
iflower-0.089
pron-0.072
lf-0.072
opus-0.071
ip-0.070
Token our
Token ID674
Feature activation-5.94e-01

Pos logit contributions
iflower+0.019
orn+0.015
pron+0.015
opus+0.015
affle+0.015
Neg logit contributions
 stations-0.016
 levers-0.014
 speakers-0.013
hole-0.013
 footsteps-0.013
Token country
Token ID1499
Feature activation-6.00e-01

Pos logit contributions
 stations+0.019
 levers+0.017
 speakers+0.016
 footsteps+0.016
 towels+0.015
Neg logit contributions
iflower-0.018
opus-0.015
pron-0.014
oppy-0.014
orn-0.014
Token a
Token ID257
Feature activation-6.75e-01

Pos logit contributions
 stations+0.015
 levers+0.014
 footsteps+0.013
 speakers+0.013
 elbows+0.012
Neg logit contributions
iflower-0.021
pron-0.018
opus-0.018
orn-0.018
lf-0.017
Token long
Token ID890
Feature activation-4.62e-01

Pos logit contributions
 stations+0.022
 levers+0.020
 speakers+0.020
 footsteps+0.019
 towels+0.019
Neg logit contributions
iflower-0.019
opus-0.015
pron-0.015
orn-0.015
oppy-0.015
Token time
Token ID640
Feature activation-2.04e+00

Pos logit contributions
 stations+0.015
 levers+0.014
 footsteps+0.014
 speakers+0.013
 towels+0.013
Neg logit contributions
iflower-0.013
pron-0.011
orn-0.010
opus-0.010
lf-0.010
Token ago
Token ID2084
Feature activation-6.20e+00

Pos logit contributions
 stations+0.054
 levers+0.049
 footsteps+0.048
 elbows+0.048
 heads+0.046
Neg logit contributions
iflower-0.064
pron-0.053
opus-0.053
lf-0.052
oppy-0.050
Token.
Token ID13
Feature activation-7.90e-01

Pos logit contributions
 stations+0.185
 levers+0.165
 footsteps+0.164
 speakers+0.157
 brake+0.149
Neg logit contributions
iflower-0.197
lf-0.167
pron-0.163
on-0.155
opus-0.154
Token He
Token ID679
Feature activation+1.09e+00

Pos logit contributions
 stations+0.025
 levers+0.022
 speakers+0.022
 footsteps+0.021
 towels+0.020
Neg logit contributions
iflower-0.024
pron-0.020
opus-0.019
orn-0.019
oppy-0.019
Token helped
Token ID4193
Feature activation-8.99e-01

Pos logit contributions
iflower+0.034
opus+0.027
pron+0.027
orn+0.027
oppy+0.026
Neg logit contributions
 stations-0.033
 levers-0.030
 speakers-0.029
 footsteps-0.029
 towels-0.028
Token many
Token ID867
Feature activation-9.83e-01

Pos logit contributions
 stations+0.024
 levers+0.021
 speakers+0.020
 footsteps+0.019
hole+0.019
Neg logit contributions
iflower-0.032
opus-0.026
pron-0.026
orn-0.026
oppy-0.025
Token people
Token ID661
Feature activation-3.95e+00

Pos logit contributions
 stations+0.027
 levers+0.024
 extingu+0.024
 speakers+0.023
 footsteps+0.023
Neg logit contributions
iflower-0.032
opus-0.025
orn-0.025
pron-0.025
lf-0.024
Token big
Token ID1263
Feature activation+4.64e-01

Pos logit contributions
 stations+0.041
 levers+0.037
 speakers+0.036
 footsteps+0.035
 towels+0.034
Neg logit contributions
iflower-0.037
pron-0.029
opus-0.029
orn-0.028
oppy-0.028
Token fire
Token ID2046
Feature activation-2.77e+00

Pos logit contributions
iflower+0.014
pron+0.011
opus+0.011
orn+0.011
affle+0.010
Neg logit contributions
 stations-0.015
 levers-0.014
 speakers-0.013
 footsteps-0.013
 themselves-0.013
Token alarm
Token ID10436
Feature activation-2.28e+00

Pos logit contributions
 stations+0.073
 footsteps+0.064
 levers+0.064
 elbows+0.063
 speakers+0.063
Neg logit contributions
iflower-0.094
pron-0.077
opus-0.076
orn-0.073
lf-0.071
Token,
Token ID11
Feature activation-1.25e+00

Pos logit contributions
 stations+0.060
 levers+0.050
 elbows+0.049
 footsteps+0.049
 permission+0.048
Neg logit contributions
iflower-0.079
pron-0.069
opus-0.064
lf-0.064
orn-0.062
Token so
Token ID523
Feature activation+1.18e+00

Pos logit contributions
 stations+0.031
 levers+0.025
hole+0.025
 heads+0.025
 elbows+0.024
Neg logit contributions
iflower-0.042
pron-0.039
lf-0.034
opus-0.031
orn-0.031
Token firefighters
Token ID22611
Feature activation-6.20e+00

Pos logit contributions
iflower+0.036
pron+0.028
opus+0.028
orn+0.027
oppy+0.027
Neg logit contributions
 stations-0.038
 levers-0.034
 speakers-0.033
 footsteps-0.032
 towels-0.031
Token were
Token ID547
Feature activation-1.62e+00

Pos logit contributions
 stations+0.182
 elbows+0.161
 footsteps+0.161
 speakers+0.157
 levers+0.154
Neg logit contributions
iflower-0.174
pron-0.154
opus-0.148
on-0.144
lf-0.143
Token available
Token ID1695
Feature activation+5.07e-01

Pos logit contributions
 stations+0.044
 levers+0.038
 speakers+0.037
 heads+0.036
 footsteps+0.036
Neg logit contributions
iflower-0.053
pron-0.046
lf-0.041
affle-0.041
opus-0.039
Token to
Token ID284
Feature activation+1.23e+00

Pos logit contributions
iflower+0.016
pron+0.013
lf+0.012
on+0.012
orn+0.012
Neg logit contributions
 stations-0.014
 levers-0.013
 elbows-0.013
 heads-0.013
 speakers-0.012
Token come
Token ID1282
Feature activation-5.44e-01

Pos logit contributions
iflower+0.039
pron+0.034
lf+0.032
oppy+0.031
affle+0.031
Neg logit contributions
 stations-0.032
 levers-0.029
hole-0.028
 speakers-0.028
 heads-0.027
Token help
Token ID1037
Feature activation-1.15e+00

Pos logit contributions
 stations+0.014
 levers+0.012
 speakers+0.012
 elbows+0.011
 heads+0.011
Neg logit contributions
iflower-0.018
pron-0.015
oppy-0.015
inkle-0.014
orn-0.014
Token looked
Token ID3114
Feature activation+2.06e-01

Pos logit contributions
 stations+0.064
 levers+0.055
 footsteps+0.053
 speakers+0.053
 extingu+0.053
Neg logit contributions
iflower-0.073
opus-0.062
pron-0.060
oppy-0.059
orn-0.058
Token out
Token ID503
Feature activation-3.96e-02

Pos logit contributions
iflower+0.006
pron+0.006
inkle+0.005
orn+0.005
affle+0.005
Neg logit contributions
 stations-0.006
 levers-0.005
 speakers-0.005
 extingu-0.005
 brake-0.005
Token the
Token ID262
Feature activation-1.24e+00

Pos logit contributions
 stations+0.001
 dashboard+0.001
 speakers+0.001
hole+0.001
 elbows+0.001
Neg logit contributions
iflower-0.001
inkle-0.001
pron-0.001
 grape-0.001
oppy-0.001
Token window
Token ID4324
Feature activation-2.94e+00

Pos logit contributions
 stations+0.026
 levers+0.021
 themselves+0.021
 argued+0.020
 footsteps+0.019
Neg logit contributions
iflower-0.051
pron-0.045
orn-0.042
opus-0.042
affle-0.040
Token.
Token ID13
Feature activation-9.59e-01

Pos logit contributions
 stations+0.099
 elbows+0.081
 brake+0.080
 levers+0.080
 speakers+0.077
Neg logit contributions
iflower-0.075
pron-0.069
on-0.065
lf-0.063
inkle-0.057
Token Trees
Token ID34925
Feature activation-6.20e+00

Pos logit contributions
 stations+0.024
 levers+0.022
 speakers+0.021
 footsteps+0.021
 towels+0.020
Neg logit contributions
iflower-0.035
pron-0.029
opus-0.029
orn-0.028
affle-0.028
Token and
Token ID290
Feature activation-1.14e+00

Pos logit contributions
 stations+0.188
 speakers+0.164
 footsteps+0.163
 elbows+0.160
 permission+0.159
Neg logit contributions
iflower-0.173
pron-0.141
orn-0.140
lf-0.140
opus-0.138
Token houses
Token ID7777
Feature activation-4.56e+00

Pos logit contributions
hole+0.027
 extingu+0.027
 stations+0.027
 dashboard+0.025
 permission+0.024
Neg logit contributions
iflower-0.034
lf-0.029
pron-0.028
Buzz-0.026
 grape-0.026
Token flew
Token ID13112
Feature activation-1.74e+00

Pos logit contributions
 stations+0.134
 speakers+0.119
 footsteps+0.116
 elbows+0.113
 permission+0.113
Neg logit contributions
iflower-0.116
lf-0.104
pron-0.102
on-0.100
inkle-0.093
Token by
Token ID416
Feature activation-3.08e+00

Pos logit contributions
 stations+0.047
 speakers+0.044
 elbows+0.043
 supervision+0.043
 footsteps+0.042
Neg logit contributions
iflower-0.045
pron-0.037
orn-0.037
Buzz-0.036
ork-0.036
Token quickly
Token ID2952
Feature activation-2.68e+00

Pos logit contributions
 stations+0.080
bars+0.078
 levers+0.073
 dashboard+0.073
 speakers+0.073
Neg logit contributions
iflower-0.086
lf-0.070
inkle-0.068
ork-0.065
 grape-0.063
Token better
Token ID1365
Feature activation+2.41e+00

Pos logit contributions
iflower+0.013
opus+0.011
pron+0.011
orn+0.010
oppy+0.010
Neg logit contributions
 stations-0.010
 levers-0.009
 speakers-0.008
 footsteps-0.008
 towels-0.008
Token than
Token ID621
Feature activation+1.79e+00

Pos logit contributions
iflower+0.093
opus+0.076
pron+0.076
oppy+0.076
ip+0.072
Neg logit contributions
 stations-0.057
 levers-0.050
 speakers-0.049
 footsteps-0.048
hole-0.045
Token yours
Token ID12431
Feature activation+1.32e+00

Pos logit contributions
iflower+0.056
pron+0.046
oppy+0.044
opus+0.044
affle+0.044
Neg logit contributions
 stations-0.054
 levers-0.050
 speakers-0.047
hole-0.046
 footsteps-0.045
Token.
Token ID13
Feature activation+4.53e-01

Pos logit contributions
iflower+0.052
opus+0.043
pron+0.043
orn+0.043
oppy+0.042
Neg logit contributions
 stations-0.030
 levers-0.027
 speakers-0.025
 footsteps-0.025
 towels-0.023
Token Fire
Token ID3764
Feature activation-1.93e+00

Pos logit contributions
iflower+0.016
pron+0.013
opus+0.013
orn+0.013
oppy+0.013
Neg logit contributions
 stations-0.012
 levers-0.011
 speakers-0.010
 footsteps-0.010
 towels-0.010
Tokenfighters
Token ID17114
Feature activation-6.12e+00

Pos logit contributions
 stations+0.059
 levers+0.054
 speakers+0.053
 footsteps+0.052
 elbows+0.049
Neg logit contributions
iflower-0.060
opus-0.048
orn-0.047
pron-0.047
affle-0.046
Token save
Token ID3613
Feature activation-2.39e-01

Pos logit contributions
 stations+0.145
 levers+0.129
 footsteps+0.126
 speakers+0.124
 elbows+0.113
Neg logit contributions
iflower-0.224
opus-0.188
pron-0.183
orn-0.183
oppy-0.183
Token people
Token ID661
Feature activation-3.27e+00

Pos logit contributions
 stations+0.006
 levers+0.005
 speakers+0.005
 footsteps+0.005
hole+0.005
Neg logit contributions
iflower-0.009
pron-0.007
oppy-0.007
opus-0.007
orn-0.007
Token from
Token ID422
Feature activation-3.85e-01

Pos logit contributions
 stations+0.073
 levers+0.066
 footsteps+0.065
 speakers+0.062
 elbows+0.058
Neg logit contributions
iflower-0.128
opus-0.106
pron-0.105
oppy-0.102
orn-0.101
Token fires
Token ID12252
Feature activation-3.52e+00

Pos logit contributions
 stations+0.010
 levers+0.009
 speakers+0.009
 footsteps+0.008
hole+0.008
Neg logit contributions
iflower-0.013
pron-0.011
opus-0.011
oppy-0.011
orn-0.011
Token."
Token ID526
Feature activation+7.69e-01

Pos logit contributions
 levers+0.095
 stations+0.095
 footsteps+0.092
 elbows+0.090
 speakers+0.089
Neg logit contributions
iflower-0.106
opus-0.086
pron-0.084
orn-0.084
lf-0.082
Tokenmy
Token ID1820
Feature activation-1.57e+00

Pos logit contributions
 stations+0.066
 levers+0.059
 speakers+0.058
 footsteps+0.057
 towels+0.054
Neg logit contributions
iflower-0.059
pron-0.048
orn-0.046
opus-0.046
oppy-0.044
Token had
Token ID550
Feature activation-2.45e-01

Pos logit contributions
 stations+0.052
 levers+0.045
 speakers+0.045
 footsteps+0.045
 towels+0.042
Neg logit contributions
iflower-0.047
pron-0.038
opus-0.036
orn-0.036
oppy-0.035
Token told
Token ID1297
Feature activation-8.22e-01

Pos logit contributions
 stations+0.008
 levers+0.007
 footsteps+0.007
 speakers+0.007
 extingu+0.006
Neg logit contributions
iflower-0.007
pron-0.006
orn-0.006
on-0.005
lf-0.005
Token him
Token ID683
Feature activation-2.24e+00

Pos logit contributions
 stations+0.023
 speakers+0.021
 levers+0.021
 extingu+0.019
hole+0.019
Neg logit contributions
iflower-0.026
orn-0.022
pron-0.022
oppy-0.021
lf-0.021
Token long
Token ID890
Feature activation-1.37e+00

Pos logit contributions
 stations+0.060
 levers+0.052
 speakers+0.051
 extingu+0.048
 heads+0.048
Neg logit contributions
iflower-0.077
oppy-0.064
pron-0.063
opus-0.062
orn-0.062
Token ago
Token ID2084
Feature activation-6.11e+00

Pos logit contributions
 stations+0.035
 speakers+0.029
 levers+0.029
 extingu+0.029
 themselves+0.028
Neg logit contributions
iflower-0.045
pron-0.040
affle-0.039
orn-0.038
oppy-0.037
Token.
Token ID13
Feature activation-7.70e-01

Pos logit contributions
 stations+0.169
 footsteps+0.146
 levers+0.146
 speakers+0.144
 extingu+0.137
Neg logit contributions
iflower-0.202
lf-0.179
pron-0.178
on-0.166
orn-0.153
Token The
Token ID383
Feature activation-4.13e-01

Pos logit contributions
 stations+0.022
 levers+0.020
 speakers+0.019
 footsteps+0.019
 towels+0.018
Neg logit contributions
iflower-0.026
opus-0.021
pron-0.021
orn-0.020
oppy-0.020
Token bird
Token ID6512
Feature activation+1.16e+00

Pos logit contributions
 stations+0.013
 levers+0.011
hole+0.011
 speakers+0.011
 themselves+0.011
Neg logit contributions
iflower-0.012
pron-0.010
opus-0.010
orn-0.010
affle-0.010
Token spread
Token ID4104
Feature activation+1.07e-01

Pos logit contributions
iflower+0.041
pron+0.035
opus+0.035
orn+0.034
oppy+0.034
Neg logit contributions
 stations-0.030
 levers-0.026
 speakers-0.025
 footsteps-0.024
 heads-0.024
Token out
Token ID503
Feature activation-6.79e-01

Pos logit contributions
iflower+0.004
orn+0.003
pron+0.003
oppy+0.003
opus+0.003
Neg logit contributions
 stations-0.003
 speakers-0.003
 levers-0.002
hole-0.002
 dashboard-0.002
Token her
Token ID607
Feature activation-1.88e+00

Pos logit contributions
 stations+0.008
 levers+0.006
 speakers+0.006
 footsteps+0.006
hole+0.006
Neg logit contributions
iflower-0.007
pron-0.006
opus-0.006
oppy-0.006
orn-0.006
Token hand
Token ID1021
Feature activation-1.86e+00

Pos logit contributions
 stations+0.058
 levers+0.052
 speakers+0.049
 footsteps+0.047
 extingu+0.046
Neg logit contributions
iflower-0.060
pron-0.050
oppy-0.048
opus-0.047
affle-0.046
Token.
Token ID13
Feature activation-1.59e+00

Pos logit contributions
 stations+0.052
 levers+0.044
 speakers+0.041
 footsteps+0.041
 elbows+0.038
Neg logit contributions
iflower-0.062
pron-0.058
ip-0.051
orn-0.051
oppy-0.049
Token
Token ID198
Feature activation-8.20e-02

Pos logit contributions
 stations+0.044
 levers+0.039
 speakers+0.038
 footsteps+0.037
 towels+0.035
Neg logit contributions
iflower-0.055
opus-0.045
pron-0.044
orn-0.044
oppy-0.043
Token
Token ID198
Feature activation+2.33e-01

Pos logit contributions
 stations+0.002
 levers+0.002
 speakers+0.002
 footsteps+0.002
 themselves+0.002
Neg logit contributions
iflower-0.003
opus-0.002
pron-0.002
affle-0.002
orn-0.002
TokenCR
Token ID9419
Feature activation-6.11e+00

Pos logit contributions
iflower+0.007
opus+0.005
orn+0.005
pron+0.005
affle+0.005
Neg logit contributions
 stations-0.007
 levers-0.007
 speakers-0.007
 footsteps-0.006
 towels-0.006
TokenASH
Token ID11211
Feature activation+3.75e-01

Pos logit contributions
 stations+0.138
 levers+0.133
 themselves+0.109
 speakers+0.103
 dashboard+0.100
Neg logit contributions
iflower-0.205
on-0.189
orn-0.189
pron-0.187
opus-0.186
Token!
Token ID0
Feature activation-1.59e-01

Pos logit contributions
iflower+0.009
opus+0.009
ip+0.008
oppy+0.008
ink+0.007
Neg logit contributions
 stations-0.013
 levers-0.012
 speakers-0.011
 elbows-0.011
 themselves-0.011
Token The
Token ID383
Feature activation-7.90e-01

Pos logit contributions
 stations+0.004
 levers+0.004
 speakers+0.004
 footsteps+0.004
 towels+0.004
Neg logit contributions
iflower-0.005
opus-0.004
pron-0.004
orn-0.004
affle-0.004
Token tower
Token ID10580
Feature activation-1.02e+00

Pos logit contributions
 stations+0.025
 levers+0.022
 speakers+0.021
hole+0.021
 themselves+0.020
Neg logit contributions
iflower-0.024
pron-0.020
opus-0.020
orn-0.019
affle-0.019
Token fell
Token ID3214
Feature activation-2.12e+00

Pos logit contributions
 stations+0.032
 levers+0.028
 speakers+0.028
 footsteps+0.028
 permission+0.027
Neg logit contributions
iflower-0.030
opus-0.024
orn-0.024
pron-0.024
lf-0.023
Token ancestors
Token ID18668
Feature activation-3.85e-01

Pos logit contributions
 stations+0.026
 levers+0.023
 speakers+0.022
 footsteps+0.022
 towels+0.021
Neg logit contributions
iflower-0.025
opus-0.020
pron-0.020
oppy-0.019
orn-0.019
Token who
Token ID508
Feature activation+8.52e-01

Pos logit contributions
 stations+0.010
 levers+0.009
 footsteps+0.009
 speakers+0.008
 elbows+0.008
Neg logit contributions
iflower-0.014
opus-0.012
pron-0.012
orn-0.011
oppy-0.011
Token sang
Token ID25889
Feature activation-1.45e+00

Pos logit contributions
iflower+0.028
opus+0.023
pron+0.023
orn+0.022
oppy+0.022
Neg logit contributions
 stations-0.024
 levers-0.022
 speakers-0.021
 footsteps-0.021
 towels-0.020
Token it
Token ID340
Feature activation+1.37e+00

Pos logit contributions
 stations+0.033
 levers+0.029
 speakers+0.027
 footsteps+0.027
 towels+0.025
Neg logit contributions
iflower-0.056
pron-0.048
opus-0.047
orn-0.046
oppy-0.046
Token long
Token ID890
Feature activation-2.09e+00

Pos logit contributions
iflower+0.051
pron+0.042
opus+0.041
orn+0.041
oppy+0.041
Neg logit contributions
 stations-0.035
 levers-0.031
 speakers-0.029
 footsteps-0.029
 towels-0.028
Token ago
Token ID2084
Feature activation-6.08e+00

Pos logit contributions
 stations+0.054
 levers+0.050
 footsteps+0.048
 extingu+0.046
 elbows+0.045
Neg logit contributions
iflower-0.073
pron-0.061
orn-0.058
opus-0.057
affle-0.057
Token.
Token ID13
Feature activation-7.99e-01

Pos logit contributions
 stations+0.162
 levers+0.149
 footsteps+0.143
 speakers+0.137
 extingu+0.133
Neg logit contributions
iflower-0.217
pron-0.185
lf-0.176
opus-0.165
orn-0.164
Token Would
Token ID10928
Feature activation+1.64e+00

Pos logit contributions
 stations+0.023
 levers+0.020
 speakers+0.020
 footsteps+0.019
 heads+0.018
Neg logit contributions
iflower-0.027
pron-0.022
opus-0.022
orn-0.022
affle-0.021
Token you
Token ID345
Feature activation-1.07e+00

Pos logit contributions
iflower+0.067
affle+0.061
pron+0.061
oppy+0.060
orn+0.059
Neg logit contributions
 stations-0.028
 themselves-0.025
 speakers-0.025
 footsteps-0.025
 permission-0.023
Token like
Token ID588
Feature activation-1.23e+00

Pos logit contributions
 stations+0.018
 levers+0.014
 footsteps+0.013
 speakers+0.013
 heads+0.012
Neg logit contributions
iflower-0.049
pron-0.044
orn-0.042
oppy-0.042
opus-0.042
Token to
Token ID284
Feature activation+1.43e+00

Pos logit contributions
 stations+0.025
 levers+0.022
 speakers+0.020
 footsteps+0.020
 heads+0.019
Neg logit contributions
iflower-0.051
pron-0.043
opus-0.043
orn-0.043
oppy-0.042
Token do
Token ID466
Feature activation+6.01e-01

Pos logit contributions
iflower+0.002
opus+0.002
orn+0.002
pron+0.002
lf+0.002
Neg logit contributions
 stations-0.002
 levers-0.002
 speakers-0.001
 heads-0.001
 towels-0.001
Token,
Token ID11
Feature activation-6.94e-02

Pos logit contributions
iflower+0.022
pron+0.018
opus+0.018
orn+0.018
oppy+0.017
Neg logit contributions
 stations-0.015
 levers-0.013
 speakers-0.013
 heads-0.012
 footsteps-0.012
Token a
Token ID257
Feature activation-8.54e-01

Pos logit contributions
 stations+0.002
 heads+0.002
 levers+0.001
 speakers+0.001
 extingu+0.001
Neg logit contributions
iflower-0.002
pron-0.002
orn-0.002
on-0.002
lf-0.002
Token few
Token ID1178
Feature activation-7.36e-01

Pos logit contributions
 stations+0.027
 levers+0.025
 speakers+0.024
 footsteps+0.023
 towels+0.022
Neg logit contributions
iflower-0.026
opus-0.020
orn-0.020
oppy-0.020
pron-0.020
Token days
Token ID1528
Feature activation-3.53e+00

Pos logit contributions
 stations+0.021
 levers+0.018
 speakers+0.018
 extingu+0.017
 towels+0.017
Neg logit contributions
iflower-0.025
orn-0.020
pron-0.020
opus-0.020
affle-0.019
Token ago
Token ID2084
Feature activation-6.07e+00

Pos logit contributions
 stations+0.103
 footsteps+0.089
 speakers+0.089
 levers+0.089
 heads+0.088
Neg logit contributions
iflower-0.113
on-0.093
lf-0.092
orn-0.090
opus-0.090
Token.
Token ID13
Feature activation-3.00e-02

Pos logit contributions
 stations+0.198
 speakers+0.171
 footsteps+0.170
 levers+0.169
 brake+0.158
Neg logit contributions
iflower-0.178
lf-0.158
on-0.149
pron-0.145
orn-0.134
Token He
Token ID679
Feature activation+1.42e+00

Pos logit contributions
 stations+0.001
 levers+0.001
 speakers+0.001
 footsteps+0.001
 towels+0.001
Neg logit contributions
iflower-0.001
pron-0.001
opus-0.001
orn-0.001
affle-0.001
Token started
Token ID2067
Feature activation-3.47e+00

Pos logit contributions
iflower+0.050
pron+0.041
opus+0.041
orn+0.040
oppy+0.040
Neg logit contributions
 stations-0.038
 levers-0.034
 speakers-0.033
 footsteps-0.032
 towels-0.031
Token to
Token ID284
Feature activation-6.20e-02

Pos logit contributions
 stations+0.093
 levers+0.076
 speakers+0.076
hole+0.075
 dashboard+0.074
Neg logit contributions
iflower-0.117
pron-0.098
orn-0.093
oppy-0.092
opus-0.092
Token applaud
Token ID22517
Feature activation-4.13e-01

Pos logit contributions
 stations+0.002
 speakers+0.001
 levers+0.001
hole+0.001
 towels+0.001
Neg logit contributions
iflower-0.002
pron-0.002
orn-0.002
oppy-0.002
lf-0.002
Token
Token ID129
Feature activation-6.88e-01

Pos logit contributions
iflower+0.080
pron+0.065
opus+0.065
orn+0.064
affle+0.063
Neg logit contributions
 stations-0.067
 levers-0.060
 speakers-0.058
 footsteps-0.056
 towels-0.054
Token
Token ID241
Feature activation-4.08e+00

Pos logit contributions
 stations+0.021
 levers+0.019
 speakers+0.018
 heads+0.017
 footsteps+0.017
Neg logit contributions
iflower-0.022
orn-0.018
pron-0.018
opus-0.018
oppy-0.018
TokenWhat
Token ID2061
Feature activation+2.62e+00

Pos logit contributions
 stations+0.147
 footsteps+0.135
 heads+0.132
 levers+0.132
 extingu+0.131
Neg logit contributions
iflower-0.100
pron-0.081
tto-0.079
orn-0.077
opus-0.074
Tokenâ
Token ID22940
Feature activation-1.19e+00

Pos logit contributions
iflower+0.094
pron+0.078
opus+0.078
orn+0.077
oppy+0.076
Neg logit contributions
 stations-0.068
 levers-0.060
 speakers-0.058
 footsteps-0.056
 towels-0.054
Token
Token ID26391
Feature activation+3.90e+00

Pos logit contributions
 stations+0.035
 levers+0.032
 speakers+0.031
 footsteps+0.030
hole+0.029
Neg logit contributions
iflower-0.037
pron-0.030
opus-0.030
oppy-0.029
orn-0.029
Token
Token ID8151
Feature activation+6.03e+00

Pos logit contributions
iflower+0.137
orn+0.110
pron+0.109
affle+0.106
opus+0.106
Neg logit contributions
 stations-0.100
 speakers-0.091
 levers-0.090
 footsteps-0.086
 towels-0.084
Tokens
Token ID82
Feature activation-8.12e-01

Pos logit contributions
iflower+0.214
pron+0.176
affle+0.167
orn+0.167
opus+0.163
Neg logit contributions
 stations-0.142
 levers-0.134
hole-0.129
 speakers-0.127
 towels-0.124
Token all
Token ID477
Feature activation-2.64e-01

Pos logit contributions
 stations+0.024
 levers+0.021
 speakers+0.021
 footsteps+0.020
 towels+0.019
Neg logit contributions
iflower-0.027
opus-0.022
pron-0.022
orn-0.021
oppy-0.021
Token this
Token ID428
Feature activation+1.89e+00

Pos logit contributions
 stations+0.008
 levers+0.007
 speakers+0.007
 footsteps+0.007
 heads+0.007
Neg logit contributions
iflower-0.008
opus-0.006
orn-0.006
pron-0.006
oppy-0.006
Token?
Token ID30
Feature activation+2.81e-01

Pos logit contributions
iflower+0.057
pron+0.047
oppy+0.045
opus+0.045
orn+0.045
Neg logit contributions
 stations-0.059
 levers-0.054
 speakers-0.051
 footsteps-0.051
 heads-0.048
Tokenâ
Token ID22940
Feature activation-1.53e+00

Pos logit contributions
iflower+0.009
opus+0.007
pron+0.007
orn+0.007
oppy+0.007
Neg logit contributions
 stations-0.008
 levers-0.007
 speakers-0.007
 footsteps-0.007
 heads-0.007
Token
Token ID198
Feature activation+8.84e-01

Pos logit contributions
iflower+0.019
pron+0.016
opus+0.016
affle+0.015
orn+0.015
Neg logit contributions
 stations-0.013
 levers-0.012
 themselves-0.011
 speakers-0.011
 footsteps-0.011
Token"
Token ID1
Feature activation-1.43e+00

Pos logit contributions
iflower+0.025
pron+0.019
opus+0.019
orn+0.019
affle+0.018
Neg logit contributions
 stations-0.029
 levers-0.027
 speakers-0.026
 themselves-0.025
 footsteps-0.025
TokenO
Token ID46
Feature activation+1.70e+00

Pos logit contributions
 stations+0.044
 levers+0.040
 speakers+0.039
 footsteps+0.038
 towels+0.038
Neg logit contributions
iflower-0.044
opus-0.035
pron-0.035
oppy-0.035
orn-0.035
Tokenw
Token ID86
Feature activation+1.18e+00

Pos logit contributions
iflower+0.050
pron+0.043
opus+0.041
affle+0.039
orn+0.039
Neg logit contributions
 stations-0.050
 levers-0.045
 heads-0.042
hole-0.042
 speakers-0.042
Token,
Token ID11
Feature activation+7.36e-01

Pos logit contributions
iflower+0.044
pron+0.038
opus+0.037
orn+0.036
affle+0.035
Neg logit contributions
 stations-0.026
 levers-0.024
 footsteps-0.022
 heads-0.022
 speakers-0.022
Token ow
Token ID12334
Feature activation+6.23e+00

Pos logit contributions
iflower+0.021
pron+0.016
opus+0.016
orn+0.016
oppy+0.016
Neg logit contributions
 stations-0.025
 levers-0.023
 speakers-0.022
 footsteps-0.021
 towels-0.021
Token,
Token ID11
Feature activation+1.01e+00

Pos logit contributions
iflower+0.244
pron+0.221
opus+0.206
affle+0.205
orn+0.204
Neg logit contributions
 stations-0.119
 levers-0.108
 heads-0.100
 towels-0.097
 attention-0.097
Token ow
Token ID12334
Feature activation+6.03e+00

Pos logit contributions
iflower+0.027
opus+0.021
pron+0.021
orn+0.021
oppy+0.020
Neg logit contributions
 stations-0.036
 levers-0.032
 speakers-0.031
 footsteps-0.030
 towels-0.029
Token!
Token ID0
Feature activation-2.13e-02

Pos logit contributions
iflower+0.230
pron+0.210
affle+0.196
opus+0.194
orn+0.190
Neg logit contributions
 stations-0.124
 levers-0.112
 heads-0.100
 towels-0.099
 footsteps-0.098
Token Ben
Token ID3932
Feature activation+9.42e-01

Pos logit contributions
 stations+0.001
 levers+0.001
 footsteps+0.001
 speakers+0.001
 towels+0.000
Neg logit contributions
iflower-0.001
pron-0.001
orn-0.001
opus-0.001
affle-0.001
Token,
Token ID11
Feature activation-3.81e-01

Pos logit contributions
iflower+0.033
pron+0.028
opus+0.028
orn+0.027
affle+0.027
Neg logit contributions
 stations-0.024
 levers-0.022
 speakers-0.021
 footsteps-0.020
 heads-0.020
Token.
Token ID13
Feature activation+1.96e-01

Pos logit contributions
iflower+0.075
orn+0.063
pron+0.060
oppy+0.060
lf+0.058
Neg logit contributions
 stations-0.047
 levers-0.044
 speakers-0.041
 footsteps-0.041
 towels-0.038
Token She
Token ID1375
Feature activation+1.88e+00

Pos logit contributions
iflower+0.009
opus+0.008
pron+0.007
affle+0.007
orn+0.007
Neg logit contributions
 stations-0.003
 footsteps-0.003
 speakers-0.003
 towels-0.003
 levers-0.003
Token loved
Token ID6151
Feature activation-6.22e-01

Pos logit contributions
iflower+0.075
opus+0.069
oppy+0.066
affle+0.065
pron+0.065
Neg logit contributions
 stations-0.029
 speakers-0.026
 like-0.026
 heads-0.025
 levers-0.023
Token l
Token ID300
Feature activation+1.77e+00

Pos logit contributions
 stations+0.010
 levers+0.008
 speakers+0.008
 footsteps+0.007
 towels+0.007
Neg logit contributions
iflower-0.028
opus-0.025
orn-0.024
oppy-0.024
pron-0.024
Tokenoll
Token ID692
Feature activation+9.78e+00

Pos logit contributions
iflower+0.054
pron+0.047
lf+0.042
opus+0.042
leton+0.039
Neg logit contributions
 stations-0.050
 permission-0.046
 footsteps-0.046
 speakers-0.046
 towels-0.045
Tokenip
Token ID541
Feature activation+5.60e+00

Pos logit contributions
iflower+0.157
pron+0.129
Buzz+0.126
opus+0.109
va+0.101
Neg logit contributions
 permission-0.366
 stations-0.361
 attention-0.357
 levers-0.353
 coats-0.349
Tokenops
Token ID2840
Feature activation+1.19e-01

Pos logit contributions
iflower+0.139
pron+0.116
opus+0.108
orn+0.107
oppy+0.104
Neg logit contributions
 stations-0.191
 levers-0.173
 heads-0.170
 speakers-0.167
 towels-0.167
Token and
Token ID290
Feature activation+1.56e+00

Pos logit contributions
iflower+0.005
pron+0.004
orn+0.004
opus+0.004
oppy+0.004
Neg logit contributions
 stations-0.003
 levers-0.002
 speakers-0.002
 footsteps-0.002
 heads-0.002
Token always
Token ID1464
Feature activation-2.12e-01

Pos logit contributions
iflower+0.052
opus+0.043
pron+0.042
orn+0.042
oppy+0.041
Neg logit contributions
 stations-0.044
 levers-0.040
 speakers-0.038
 footsteps-0.037
 towels-0.036
Token had
Token ID550
Feature activation+1.37e+00

Pos logit contributions
 stations+0.006
 levers+0.006
 speakers+0.006
 footsteps+0.005
 towels+0.005
Neg logit contributions
iflower-0.007
pron-0.006
opus-0.005
orn-0.005
oppy-0.005
Token one
Token ID530
Feature activation-6.61e-01

Pos logit contributions
iflower+0.052
opus+0.044
pron+0.043
orn+0.043
oppy+0.042
Neg logit contributions
 stations-0.033
 levers-0.029
 speakers-0.027
 footsteps-0.026
 towels-0.026
Token.
Token ID13
Feature activation-2.69e-01

Pos logit contributions
iflower+0.083
pron+0.070
opus+0.070
orn+0.069
oppy+0.069
Neg logit contributions
 stations-0.042
 levers-0.036
 speakers-0.034
 footsteps-0.032
 towels-0.031
Token She
Token ID1375
Feature activation+9.37e-01

Pos logit contributions
 stations+0.006
 levers+0.005
 heads+0.005
 speakers+0.005
 footsteps+0.005
Neg logit contributions
iflower-0.011
opus-0.009
pron-0.009
orn-0.009
affle-0.009
Token grabbed
Token ID13176
Feature activation+7.89e-01

Pos logit contributions
iflower+0.038
opus+0.032
pron+0.032
orn+0.032
oppy+0.031
Neg logit contributions
 stations-0.019
 levers-0.018
 speakers-0.016
 heads-0.016
 towels-0.015
Token a
Token ID257
Feature activation-4.01e-01

Pos logit contributions
iflower+0.032
pron+0.028
opus+0.028
orn+0.027
oppy+0.027
Neg logit contributions
 stations-0.017
 levers-0.014
 speakers-0.013
 footsteps-0.013
 towels-0.012
Token l
Token ID300
Feature activation+2.39e-02

Pos logit contributions
 stations+0.012
 levers+0.011
 speakers+0.010
 footsteps+0.010
 towels+0.010
Neg logit contributions
iflower-0.013
opus-0.010
oppy-0.010
pron-0.010
orn-0.010
Tokenoll
Token ID692
Feature activation+8.62e+00

Pos logit contributions
iflower+0.001
opus+0.000
pron+0.000
orn+0.000
oppy+0.000
Neg logit contributions
 stations-0.001
 levers-0.001
 speakers-0.001
 footsteps-0.001
 towels-0.001
Tokenipop
Token ID42800
Feature activation+2.59e+00

Pos logit contributions
iflower+0.191
opus+0.138
pron+0.138
Buzz+0.127
orn+0.126
Neg logit contributions
 levers-0.298
 stations-0.298
 attention-0.287
 permission-0.285
 towels-0.284
Token from
Token ID422
Feature activation-9.30e-01

Pos logit contributions
iflower+0.091
ip+0.076
pron+0.075
orn+0.074
on+0.073
Neg logit contributions
 stations-0.079
 speakers-0.064
 levers-0.062
 footsteps-0.062
 announcements-0.058
Token her
Token ID607
Feature activation-7.99e-01

Pos logit contributions
 stations+0.023
 levers+0.020
 speakers+0.019
 footsteps+0.018
 towels+0.017
Neg logit contributions
iflower-0.035
pron-0.029
opus-0.029
orn-0.029
oppy-0.029
Token bag
Token ID6131
Feature activation+2.75e+00

Pos logit contributions
 stations+0.022
 levers+0.019
 speakers+0.018
 footsteps+0.018
 themselves+0.017
Neg logit contributions
iflower-0.028
pron-0.023
opus-0.023
oppy-0.023
orn-0.023
Token and
Token ID290
Feature activation+3.34e-01

Pos logit contributions
iflower+0.085
pron+0.075
on+0.073
orn+0.073
ip+0.072
Neg logit contributions
 stations-0.089
 levers-0.076
 speakers-0.072
 footsteps-0.072
 elbows-0.069
Token
Token ID129
Feature activation-7.31e-01

Pos logit contributions
iflower+0.074
affle+0.061
orn+0.060
pron+0.057
opus+0.055
Neg logit contributions
 stations-0.077
 levers-0.071
 speakers-0.071
hole-0.067
 towels-0.066
Token
Token ID241
Feature activation-4.33e+00

Pos logit contributions
 stations+0.023
 levers+0.020
 speakers+0.020
 heads+0.019
 footsteps+0.019
Neg logit contributions
iflower-0.023
opus-0.018
pron-0.018
oppy-0.018
orn-0.018
TokenThat
Token ID2504
Feature activation+2.92e+00

Pos logit contributions
 stations+0.153
 heads+0.138
 footsteps+0.136
 levers+0.134
 speakers+0.130
Neg logit contributions
iflower-0.114
pron-0.095
opus-0.092
orn-0.087
tto-0.084
Tokenâ
Token ID22940
Feature activation-1.20e+00

Pos logit contributions
iflower+0.081
opus+0.066
orn+0.063
pron+0.062
ip+0.059
Neg logit contributions
 stations-0.096
 footsteps-0.093
 elbows-0.088
 extingu-0.087
 levers-0.087
Token
Token ID26391
Feature activation+3.65e+00

Pos logit contributions
 stations+0.035
 levers+0.032
 speakers+0.031
 footsteps+0.030
hole+0.029
Neg logit contributions
iflower-0.038
oppy-0.030
pron-0.030
opus-0.030
orn-0.029
Token
Token ID8151
Feature activation+5.76e+00

Pos logit contributions
iflower+0.118
orn+0.094
affle+0.094
pron+0.091
opus+0.089
Neg logit contributions
 stations-0.102
 speakers-0.095
 levers-0.093
 towels-0.088
hole-0.087
Tokens
Token ID82
Feature activation-6.47e-01

Pos logit contributions
iflower+0.223
pron+0.188
orn+0.182
opus+0.180
affle+0.177
Neg logit contributions
 stations-0.123
 levers-0.111
 speakers-0.104
hole-0.103
 towels-0.101
Token a
Token ID257
Feature activation+6.35e-01

Pos logit contributions
 stations+0.020
 levers+0.018
 speakers+0.018
 footsteps+0.017
 towels+0.016
Neg logit contributions
iflower-0.020
opus-0.016
pron-0.016
orn-0.015
oppy-0.015
Token rope
Token ID17182
Feature activation+1.09e+00

Pos logit contributions
iflower+0.017
pron+0.013
opus+0.013
orn+0.013
oppy+0.013
Neg logit contributions
 stations-0.022
 levers-0.020
 speakers-0.020
 footsteps-0.019
 towels-0.019
Token,
Token ID11
Feature activation+1.13e-01

Pos logit contributions
iflower+0.037
opus+0.031
pron+0.030
orn+0.030
lf+0.029
Neg logit contributions
 stations-0.030
 footsteps-0.027
 levers-0.026
 elbows-0.026
 speakers-0.025
Token John
Token ID1757
Feature activation+9.17e-01

Pos logit contributions
iflower+0.003
pron+0.003
opus+0.003
affle+0.003
orn+0.003
Neg logit contributions
 stations-0.004
 levers-0.003
 speakers-0.003
 footsteps-0.003
 elbows-0.003
Token Snow
Token ID7967
Feature activation+9.43e-01

Pos logit contributions
iflower+0.045
pron+0.038
opus+0.038
orn+0.037
oppy+0.036
Neg logit contributions
 stations-0.036
 levers-0.032
 speakers-0.030
 footsteps-0.030
 heads-0.029
Tokeny
Token ID88
Feature activation-8.48e-03

Pos logit contributions
iflower+0.037
pron+0.031
opus+0.031
orn+0.030
affle+0.030
Neg logit contributions
 stations-0.021
 levers-0.018
 speakers-0.017
 footsteps-0.017
 heads-0.016
Token.
Token ID13
Feature activation-7.90e-01

Pos logit contributions
 stations+0.000
 levers+0.000
 extingu+0.000
 speakers+0.000
 towels+0.000
Neg logit contributions
iflower-0.000
opus-0.000
oppy-0.000
orn-0.000
pron-0.000
Token Max
Token ID5436
Feature activation-8.26e-02

Pos logit contributions
 stations+0.022
 levers+0.020
 speakers+0.019
 footsteps+0.019
 heads+0.018
Neg logit contributions
iflower-0.028
opus-0.022
pron-0.022
orn-0.022
oppy-0.021
Token would
Token ID561
Feature activation-2.90e+00

Pos logit contributions
 stations+0.002
 heads+0.002
 levers+0.002
 speakers+0.002
 footsteps+0.001
Neg logit contributions
iflower-0.003
pron-0.003
orn-0.003
opus-0.003
affle-0.003
Token say
Token ID910
Feature activation+2.15e+00

Pos logit contributions
 stations+0.081
 levers+0.071
 speakers+0.067
hole+0.067
 extingu+0.063
Neg logit contributions
iflower-0.096
pron-0.085
oppy-0.080
affle-0.079
opus-0.079
Token,
Token ID11
Feature activation-2.01e+00

Pos logit contributions
iflower+0.091
pron+0.078
orn+0.077
oppy+0.075
opus+0.075
Neg logit contributions
 stations-0.041
 levers-0.037
 footsteps-0.036
 speakers-0.034
 louder-0.032
Token "
Token ID366
Feature activation-2.02e+00

Pos logit contributions
 stations+0.067
 levers+0.061
 speakers+0.059
 footsteps+0.057
 towels+0.056
Neg logit contributions
iflower-0.057
pron-0.046
opus-0.045
orn-0.044
oppy-0.043
TokenI
Token ID40
Feature activation-7.43e-01

Pos logit contributions
 stations+0.061
 levers+0.055
 speakers+0.053
 footsteps+0.051
 heads+0.050
Neg logit contributions
iflower-0.065
opus-0.052
pron-0.052
orn-0.051
oppy-0.051
Token hate
Token ID5465
Feature activation-1.36e+00

Pos logit contributions
 stations+0.018
 levers+0.016
 speakers+0.015
 footsteps+0.015
 towels+0.014
Neg logit contributions
iflower-0.028
pron-0.023
opus-0.023
orn-0.023
oppy-0.022
Token cats
Token ID11875
Feature activation-2.02e+00

Pos logit contributions
 stations+0.038
 levers+0.034
 speakers+0.032
hole+0.032
 footsteps+0.031
Neg logit contributions
iflower-0.046
opus-0.038
orn-0.037
pron-0.036
oppy-0.036
Token
Token ID198
Feature activation+3.37e-01

Pos logit contributions
 stations+0.001
 themselves+0.001
 footsteps+0.001
 speakers+0.001
 towels+0.001
Neg logit contributions
iflower-0.002
opus-0.001
pron-0.001
affle-0.001
atmeal-0.001
TokenThen
Token ID6423
Feature activation-2.17e+00

Pos logit contributions
iflower+0.011
opus+0.009
affle+0.009
oppy+0.009
pron+0.009
Neg logit contributions
 stations-0.009
 speakers-0.008
 themselves-0.008
 levers-0.008
 footsteps-0.007
Token,
Token ID11
Feature activation-5.72e-01

Pos logit contributions
 stations+0.063
 levers+0.056
 speakers+0.054
 footsteps+0.051
hole+0.051
Neg logit contributions
iflower-0.071
oppy-0.058
pron-0.058
opus-0.058
orn-0.057
Token Lily
Token ID20037
Feature activation-3.99e-01

Pos logit contributions
 stations+0.017
 levers+0.015
 speakers+0.015
 footsteps+0.014
 heads+0.014
Neg logit contributions
iflower-0.018
pron-0.015
opus-0.015
oppy-0.015
orn-0.014
Token remembered
Token ID12086
Feature activation-1.62e+00

Pos logit contributions
 stations+0.012
 levers+0.010
 footsteps+0.010
 speakers+0.010
 towels+0.009
Neg logit contributions
iflower-0.013
opus-0.011
pron-0.011
oppy-0.011
orn-0.011
Token something
Token ID1223
Feature activation+1.66e+00

Pos logit contributions
 stations+0.051
 levers+0.045
hole+0.043
 speakers+0.042
 dashboard+0.042
Neg logit contributions
iflower-0.051
pron-0.043
oppy-0.041
lf-0.040
opus-0.039
Token her
Token ID607
Feature activation+3.10e-01

Pos logit contributions
iflower+0.055
pron+0.045
opus+0.044
orn+0.044
oppy+0.043
Neg logit contributions
 stations-0.049
 levers-0.043
 speakers-0.042
 footsteps-0.040
 heads-0.040
Token mom
Token ID1995
Feature activation-2.06e+00

Pos logit contributions
iflower+0.008
pron+0.007
oppy+0.007
opus+0.006
orn+0.006
Neg logit contributions
 stations-0.011
 levers-0.010
 speakers-0.009
 extingu-0.009
 footsteps-0.009
Token had
Token ID550
Feature activation-2.27e-01

Pos logit contributions
 stations+0.064
 levers+0.057
 speakers+0.055
 footsteps+0.054
 towels+0.052
Neg logit contributions
iflower-0.065
pron-0.053
opus-0.051
orn-0.051
oppy-0.050
Token told
Token ID1297
Feature activation-7.40e-01

Pos logit contributions
 stations+0.008
 levers+0.007
 extingu+0.006
 footsteps+0.006
 speakers+0.006
Neg logit contributions
iflower-0.007
pron-0.006
oppy-0.005
orn-0.005
opus-0.005
Token her
Token ID607
Feature activation-1.78e+00

Pos logit contributions
 stations+0.021
 levers+0.020
 speakers+0.019
 dashboard+0.018
 extingu+0.018
Neg logit contributions
iflower-0.023
pron-0.019
oppy-0.019
orn-0.018
lf-0.018
Token sack
Token ID23704
Feature activation+6.30e-02

Pos logit contributions
 stations+0.055
 levers+0.049
 footsteps+0.047
 speakers+0.046
hole+0.046
Neg logit contributions
iflower-0.068
pron-0.057
opus-0.055
orn-0.054
affle-0.053
Token.
Token ID13
Feature activation-1.34e+00

Pos logit contributions
iflower+0.002
pron+0.002
orn+0.002
opus+0.002
on+0.002
Neg logit contributions
 stations-0.002
 levers-0.002
 footsteps-0.001
 speakers-0.001
 elbows-0.001
Token It
Token ID632
Feature activation+8.47e-01

Pos logit contributions
 stations+0.038
 levers+0.034
 speakers+0.033
 footsteps+0.032
 towels+0.031
Neg logit contributions
iflower-0.045
pron-0.036
opus-0.036
orn-0.036
oppy-0.035
Token belonged
Token ID19611
Feature activation-1.75e+00

Pos logit contributions
iflower+0.026
pron+0.021
opus+0.021
orn+0.021
oppy+0.021
Neg logit contributions
 stations-0.026
 levers-0.023
 speakers-0.023
 footsteps-0.022
 towels-0.021
Token to
Token ID284
Feature activation+2.44e+00

Pos logit contributions
 stations+0.036
 levers+0.032
 speakers+0.030
 footsteps+0.029
 heads+0.027
Neg logit contributions
iflower-0.072
pron-0.061
orn-0.061
opus-0.061
oppy-0.060
Token his
Token ID465
Feature activation+9.59e-01

Pos logit contributions
iflower+0.091
pron+0.076
opus+0.075
orn+0.074
oppy+0.074
Neg logit contributions
 stations-0.060
 levers-0.054
 speakers-0.051
 footsteps-0.048
 heads-0.047
Token neighbor
Token ID4780
Feature activation-1.29e+00

Pos logit contributions
iflower+0.029
pron+0.023
opus+0.023
orn+0.023
affle+0.022
Neg logit contributions
 stations-0.031
 levers-0.028
 speakers-0.027
 footsteps-0.026
 towels-0.026
Token,
Token ID11
Feature activation-5.08e-01

Pos logit contributions
 stations+0.030
 levers+0.026
 speakers+0.024
 footsteps+0.024
 towels+0.022
Neg logit contributions
iflower-0.050
pron-0.042
opus-0.042
oppy-0.042
orn-0.041
Token Mrs
Token ID9074
Feature activation-6.06e-01

Pos logit contributions
 stations+0.015
 levers+0.013
 speakers+0.013
 heads+0.012
 footsteps+0.012
Neg logit contributions
iflower-0.016
pron-0.013
opus-0.013
oppy-0.013
orn-0.013
Token.
Token ID13
Feature activation-1.65e+00

Pos logit contributions
 stations+0.011
 levers+0.010
 speakers+0.009
 footsteps+0.009
 towels+0.008
Neg logit contributions
iflower-0.026
opus-0.022
pron-0.022
orn-0.022
oppy-0.022
Token Smith
Token ID4176
Feature activation+3.61e-01

Pos logit contributions
 stations+0.040
 levers+0.035
 speakers+0.034
 footsteps+0.034
 extingu+0.032
Neg logit contributions
iflower-0.062
opus-0.051
pron-0.050
orn-0.050
affle-0.050
Tokenives
Token ID1083
Feature activation+1.95e-01

Pos logit contributions
 stations+0.064
 levers+0.056
 speakers+0.054
 footsteps+0.051
 towels+0.050
Neg logit contributions
iflower-0.108
opus-0.092
orn-0.092
pron-0.091
oppy-0.091
Token Sam
Token ID3409
Feature activation-9.73e-01

Pos logit contributions
iflower+0.007
orn+0.006
oppy+0.006
opus+0.006
pron+0.006
Neg logit contributions
 stations-0.005
 levers-0.004
 speakers-0.004
hole-0.004
 footsteps-0.004
Token and
Token ID290
Feature activation-2.18e-01

Pos logit contributions
 stations+0.026
 levers+0.023
 speakers+0.022
 footsteps+0.021
hole+0.020
Neg logit contributions
iflower-0.034
opus-0.028
orn-0.028
oppy-0.028
pron-0.028
Token hugs
Token ID40657
Feature activation-5.36e-01

Pos logit contributions
 stations+0.006
hole+0.006
 levers+0.006
 speakers+0.005
 dashboard+0.005
Neg logit contributions
iflower-0.007
oppy-0.006
opus-0.005
orn-0.005
affle-0.005
Token him
Token ID683
Feature activation+6.78e-01

Pos logit contributions
 stations+0.016
hole+0.015
 speakers+0.015
 levers+0.015
 dashboard+0.014
Neg logit contributions
iflower-0.015
orn-0.013
oppy-0.013
pron-0.012
affle-0.012
Token back
Token ID736
Feature activation+1.37e+00

Pos logit contributions
iflower+0.029
pron+0.025
opus+0.025
orn+0.024
oppy+0.024
Neg logit contributions
 stations-0.013
 levers-0.011
 speakers-0.011
 footsteps-0.010
 towels-0.010
Token.
Token ID13
Feature activation+2.45e-01

Pos logit contributions
iflower+0.053
pron+0.045
oppy+0.044
opus+0.044
orn+0.043
Neg logit contributions
 stations-0.032
 levers-0.027
 speakers-0.026
 footsteps-0.025
hole-0.024
Token They
Token ID1119
Feature activation-9.41e-01

Pos logit contributions
iflower+0.009
pron+0.008
opus+0.008
orn+0.007
oppy+0.007
Neg logit contributions
 stations-0.006
 levers-0.005
 speakers-0.005
 footsteps-0.005
 towels-0.005
Token play
Token ID711
Feature activation+6.90e-01

Pos logit contributions
 stations+0.021
 levers+0.018
 speakers+0.018
 footsteps+0.016
hole+0.016
Neg logit contributions
iflower-0.036
oppy-0.031
orn-0.030
pron-0.030
opus-0.030
Token with
Token ID351
Feature activation+1.84e+00

Pos logit contributions
iflower+0.027
opus+0.023
pron+0.022
orn+0.022
oppy+0.022
Neg logit contributions
 stations-0.017
 levers-0.014
 footsteps-0.014
 speakers-0.014
 towels-0.013
Token other
Token ID584
Feature activation-1.84e+00

Pos logit contributions
iflower+0.063
pron+0.051
oppy+0.051
opus+0.050
orn+0.050
Neg logit contributions
 stations-0.048
 levers-0.043
 speakers-0.042
hole-0.041
 footsteps-0.040
Token The
Token ID383
Feature activation-3.38e-01

Pos logit contributions
 stations+0.008
 levers+0.007
 speakers+0.007
 footsteps+0.007
 towels+0.007
Neg logit contributions
iflower-0.009
pron-0.007
opus-0.007
orn-0.007
oppy-0.007
Token youth
Token ID6205
Feature activation-2.67e-01

Pos logit contributions
 stations+0.010
 levers+0.009
 speakers+0.008
 footsteps+0.008
hole+0.008
Neg logit contributions
iflower-0.011
opus-0.009
pron-0.009
orn-0.009
oppy-0.009
Token spent
Token ID3377
Feature activation-4.17e-01

Pos logit contributions
 stations+0.008
 levers+0.007
 speakers+0.007
 footsteps+0.006
 towels+0.006
Neg logit contributions
iflower-0.009
opus-0.007
pron-0.007
orn-0.007
oppy-0.007
Token the
Token ID262
Feature activation-7.90e-01

Pos logit contributions
 stations+0.010
 levers+0.009
 speakers+0.009
 footsteps+0.008
 towels+0.008
Neg logit contributions
iflower-0.015
opus-0.013
pron-0.013
orn-0.013
oppy-0.013
Token day
Token ID1110
Feature activation-1.06e+00

Pos logit contributions
 stations+0.026
 levers+0.024
 speakers+0.023
 footsteps+0.022
 towels+0.022
Neg logit contributions
iflower-0.023
pron-0.018
opus-0.018
orn-0.018
oppy-0.018
Token sharing
Token ID7373
Feature activation+1.22e+00

Pos logit contributions
 stations+0.030
 levers+0.027
 speakers+0.025
 footsteps+0.025
 heads+0.024
Neg logit contributions
iflower-0.036
opus-0.030
pron-0.029
oppy-0.028
orn-0.028
Token the
Token ID262
Feature activation-8.10e-01

Pos logit contributions
iflower+0.042
pron+0.036
opus+0.035
orn+0.034
affle+0.033
Neg logit contributions
 stations-0.033
 levers-0.029
 speakers-0.028
 footsteps-0.027
hole-0.027
Token jokes
Token ID14532
Feature activation-1.52e+00

Pos logit contributions
 stations+0.026
 levers+0.023
 speakers+0.022
 footsteps+0.022
hole+0.021
Neg logit contributions
iflower-0.025
pron-0.020
opus-0.020
orn-0.019
affle-0.019
Token and
Token ID290
Feature activation+9.97e-01

Pos logit contributions
 stations+0.039
 levers+0.033
 speakers+0.032
 elbows+0.031
 footsteps+0.031
Neg logit contributions
iflower-0.055
opus-0.047
pron-0.046
orn-0.044
on-0.044
Token making
Token ID1642
Feature activation+4.50e-01

Pos logit contributions
iflower+0.031
lf+0.025
pron+0.025
opus+0.024
orn+0.024
Neg logit contributions
 stations-0.028
hole-0.026
 levers-0.025
 speakers-0.025
 extingu-0.024
Token people
Token ID661
Feature activation-3.25e+00

Pos logit contributions
iflower+0.015
opus+0.012
pron+0.012
oppy+0.012
orn+0.012
Neg logit contributions
 stations-0.012
 levers-0.011
 speakers-0.011
hole-0.010
 footsteps-0.010
Token very
Token ID845
Feature activation+1.56e+00

Pos logit contributions
 stations+0.043
 levers+0.039
 speakers+0.038
 footsteps+0.037
 towels+0.036
Neg logit contributions
iflower-0.040
opus-0.031
pron-0.031
orn-0.031
oppy-0.030
Token enthusiastic
Token ID21848
Feature activation-1.97e+00

Pos logit contributions
iflower+0.051
orn+0.042
pron+0.041
opus+0.041
affle+0.039
Neg logit contributions
 stations-0.047
 levers-0.043
 speakers-0.041
 footsteps-0.040
 heads-0.040
Token.
Token ID13
Feature activation-5.54e-01

Pos logit contributions
 stations+0.056
 levers+0.055
 elbows+0.052
 brake+0.051
 extingu+0.051
Neg logit contributions
iflower-0.057
lf-0.049
pron-0.048
affle-0.045
opus-0.044
Token She
Token ID1375
Feature activation+7.60e-01

Pos logit contributions
 stations+0.014
 levers+0.012
 speakers+0.012
 footsteps+0.012
 towels+0.011
Neg logit contributions
iflower-0.020
opus-0.017
pron-0.017
orn-0.017
oppy-0.016
Token ran
Token ID4966
Feature activation-2.45e+00

Pos logit contributions
iflower+0.028
pron+0.023
opus+0.023
orn+0.023
oppy+0.022
Neg logit contributions
 stations-0.019
 levers-0.017
 speakers-0.016
 footsteps-0.016
 towels-0.015
Token over
Token ID625
Feature activation-1.85e+00

Pos logit contributions
 stations+0.060
 levers+0.054
 elbows+0.051
 footsteps+0.050
 speakers+0.050
Neg logit contributions
iflower-0.081
pron-0.071
opus-0.071
leton-0.070
affle-0.069
Token to
Token ID284
Feature activation-5.24e-01

Pos logit contributions
 stations+0.047
 levers+0.042
 speakers+0.039
 footsteps+0.039
 dashboard+0.039
Neg logit contributions
iflower-0.060
orn-0.052
pron-0.051
opus-0.051
ip-0.051
Token them
Token ID606
Feature activation-2.21e+00

Pos logit contributions
 stations+0.015
 levers+0.014
hole+0.013
 elbows+0.013
 extingu+0.013
Neg logit contributions
iflower-0.014
pron-0.013
orn-0.012
opus-0.012
oppy-0.012
Token and
Token ID290
Feature activation-3.83e-01

Pos logit contributions
 stations+0.060
 levers+0.051
 elbows+0.049
 speakers+0.049
 footsteps+0.046
Neg logit contributions
iflower-0.074
opus-0.062
pron-0.061
ip-0.061
lf-0.060
Token asked
Token ID1965
Feature activation+3.73e-01

Pos logit contributions
 stations+0.011
 levers+0.010
 footsteps+0.010
 speakers+0.010
hole+0.009
Neg logit contributions
iflower-0.012
pron-0.010
opus-0.010
oppy-0.010
orn-0.009
Token if
Token ID611
Feature activation-1.62e-01

Pos logit contributions
iflower+0.015
opus+0.013
pron+0.013
orn+0.012
oppy+0.012
Neg logit contributions
 stations-0.008
 levers-0.007
 speakers-0.007
 footsteps-0.007
 towels-0.006
Token a
Token ID257
Feature activation+7.48e-02

Pos logit contributions
 stations+0.002
 levers+0.001
 speakers+0.001
 footsteps+0.001
 themselves+0.001
Neg logit contributions
iflower-0.005
opus-0.005
pron-0.005
affle-0.004
ip-0.004
Token big
Token ID1263
Feature activation+1.65e+00

Pos logit contributions
mm+0.003
iflower+0.003
affle+0.003
atmeal+0.003
affles+0.002
Neg logit contributions
 window-0.001
 stations-0.001
 speakers-0.001
 h-0.001
 heads-0.001
Token tree
Token ID5509
Feature activation+1.83e+00

Pos logit contributions
iflower+0.060
pron+0.051
opus+0.050
affle+0.049
tto+0.049
Neg logit contributions
 stations-0.039
 heads-0.033
 speakers-0.033
 footsteps-0.032
 towels-0.032
Token in
Token ID287
Feature activation-5.82e-01

Pos logit contributions
iflower+0.064
pron+0.053
opus+0.052
orn+0.052
oppy+0.051
Neg logit contributions
 stations-0.050
 levers-0.045
 speakers-0.042
 footsteps-0.042
 towels-0.040
Token the
Token ID262
Feature activation-3.30e-01

Pos logit contributions
 stations+0.012
 levers+0.010
 speakers+0.010
 footsteps+0.010
 themselves+0.009
Neg logit contributions
iflower-0.024
opus-0.021
pron-0.021
affle-0.020
ip-0.020
Token forest
Token ID8222
Feature activation+1.43e-01

Pos logit contributions
 stations+0.009
 speakers+0.008
 footsteps+0.008
 heads+0.007
 levers+0.007
Neg logit contributions
iflower-0.011
opus-0.009
pron-0.009
oppy-0.009
affle-0.009
Token.
Token ID13
Feature activation-2.63e-01

Pos logit contributions
iflower+0.005
orn+0.004
oppy+0.004
pron+0.004
lf+0.004
Neg logit contributions
 stations-0.003
 levers-0.003
 speakers-0.003
 extingu-0.003
 footsteps-0.003
Token One
Token ID1881
Feature activation-1.01e+00

Pos logit contributions
 stations+0.006
 footsteps+0.005
 speakers+0.005
 levers+0.005
 sill+0.005
Neg logit contributions
iflower-0.011
pron-0.009
opus-0.009
leton-0.008
tto-0.008
Token day
Token ID1110
Feature activation-8.57e-01

Pos logit contributions
 stations+0.023
 like+0.021
 footsteps+0.020
 sill+0.020
 eye+0.019
Neg logit contributions
iflower-0.035
opus-0.030
pron-0.030
affle-0.029
oppy-0.028
Token,
Token ID11
Feature activation+1.16e-01

Pos logit contributions
 stations+0.013
 levers+0.012
 footsteps+0.012
 speakers+0.011
 towels+0.011
Neg logit contributions
iflower-0.038
pron-0.034
orn-0.032
opus-0.031
affle-0.031
Token Sp
Token ID1338
Feature activation-1.72e-01

Pos logit contributions
iflower+0.004
pron+0.003
orn+0.003
opus+0.003
affle+0.003
Neg logit contributions
 stations-0.003
 levers-0.003
 footsteps-0.003
 speakers-0.003
 themselves-0.003
Token find
Token ID1064
Feature activation+2.37e+00

Pos logit contributions
 stations+0.040
 levers+0.036
 speakers+0.034
 footsteps+0.032
 towels+0.031
Neg logit contributions
iflower-0.058
orn-0.048
oppy-0.048
opus-0.048
pron-0.048
Token it
Token ID340
Feature activation+1.61e+00

Pos logit contributions
iflower+0.074
pron+0.059
opus+0.059
orn+0.058
oppy+0.057
Neg logit contributions
 stations-0.073
 levers-0.066
 speakers-0.063
 footsteps-0.062
 towels-0.060
Token.
Token ID13
Feature activation-4.24e-01

Pos logit contributions
iflower+0.063
orn+0.053
pron+0.052
oppy+0.052
opus+0.051
Neg logit contributions
 stations-0.037
 levers-0.033
 speakers-0.032
 footsteps-0.030
 heads-0.028
Token\n
Token ID198
Feature activation-3.43e-01

Pos logit contributions
 stations+0.009
 footsteps+0.008
 themselves+0.008
 levers+0.008
 speakers+0.008
Neg logit contributions
iflower-0.017
opus-0.015
affle-0.014
pron-0.014
orn-0.013
Token\n
Token ID198
Feature activation+8.11e-02

Pos logit contributions
 stations+0.006
 themselves+0.006
 footsteps+0.005
 levers+0.005
hole+0.005
Neg logit contributions
iflower-0.014
opus-0.012
pron-0.012
affle-0.012
atmeal-0.011
TokenL
Token ID43
Feature activation-8.80e-01

Pos logit contributions
iflower+0.003
opus+0.002
affle+0.002
pron+0.002
orn+0.002
Neg logit contributions
 stations-0.002
 speakers-0.002
 levers-0.002
 footsteps-0.002
 themselves-0.002
Tokenily
Token ID813
Feature activation-5.80e-01

Pos logit contributions
 stations+0.026
 levers+0.022
 footsteps+0.022
 speakers+0.022
 towels+0.021
Neg logit contributions
iflower-0.029
opus-0.023
pron-0.023
orn-0.022
oppy-0.022
Token was
Token ID373
Feature activation-1.11e+00

Pos logit contributions
 stations+0.017
 levers+0.015
 speakers+0.015
 footsteps+0.014
 extingu+0.014
Neg logit contributions
iflower-0.019
orn-0.015
oppy-0.015
opus-0.015
pron-0.015
Token sad
Token ID6507
Feature activation-2.07e-01

Pos logit contributions
 stations+0.033
 levers+0.030
 speakers+0.028
 footsteps+0.027
 towels+0.027
Neg logit contributions
iflower-0.036
pron-0.029
opus-0.029
oppy-0.028
orn-0.028
Token because
Token ID780
Feature activation+1.74e-01

Pos logit contributions
 stations+0.006
 levers+0.005
 speakers+0.005
 dashboard+0.005
 brake+0.005
Neg logit contributions
iflower-0.007
oppy-0.006
orn-0.006
opus-0.006
ip-0.006
Token she
Token ID673
Feature activation+1.90e+00

Pos logit contributions
iflower+0.006
opus+0.005
pron+0.005
affle+0.005
orn+0.005
Neg logit contributions
 stations-0.004
 levers-0.004
 speakers-0.004
 footsteps-0.004
 themselves-0.004
Token worry
Token ID5490
Feature activation+2.04e+00

Pos logit contributions
 stations+0.033
 levers+0.028
 speakers+0.026
 towels+0.023
 footsteps+0.023
Neg logit contributions
iflower-0.065
opus-0.056
oppy-0.056
pron-0.055
orn-0.055
Token,
Token ID11
Feature activation+8.85e-01

Pos logit contributions
iflower+0.086
pron+0.074
orn+0.073
opus+0.072
affle+0.070
Neg logit contributions
 stations-0.036
 footsteps-0.034
 levers-0.032
 heads-0.031
 towels-0.030
Token I
Token ID314
Feature activation-1.77e-01

Pos logit contributions
iflower+0.030
opus+0.024
pron+0.024
affle+0.024
orn+0.024
Neg logit contributions
 stations-0.024
 levers-0.022
 footsteps-0.022
 speakers-0.021
 towels-0.021
Token'll
Token ID1183
Feature activation-2.19e+00

Pos logit contributions
 stations+0.005
 levers+0.004
 speakers+0.004
 footsteps+0.004
 towels+0.004
Neg logit contributions
iflower-0.006
pron-0.005
opus-0.005
orn-0.005
affle-0.005
Token help
Token ID1037
Feature activation+2.89e-01

Pos logit contributions
 stations+0.055
 levers+0.048
 speakers+0.046
 footsteps+0.044
 towels+0.042
Neg logit contributions
iflower-0.082
pron-0.068
opus-0.068
orn-0.067
oppy-0.067
Token you
Token ID345
Feature activation-1.39e+00

Pos logit contributions
iflower+0.010
pron+0.008
opus+0.008
orn+0.008
affle+0.008
Neg logit contributions
 stations-0.008
 levers-0.007
 speakers-0.007
 footsteps-0.007
 heads-0.007
Token find
Token ID1064
Feature activation+1.76e+00

Pos logit contributions
 stations+0.039
 levers+0.035
 speakers+0.034
 footsteps+0.032
 elbows+0.030
Neg logit contributions
iflower-0.048
pron-0.039
oppy-0.039
opus-0.039
orn-0.038
Token him
Token ID683
Feature activation-5.72e-01

Pos logit contributions
iflower+0.058
pron+0.048
opus+0.047
orn+0.046
oppy+0.046
Neg logit contributions
 stations-0.051
 levers-0.046
 speakers-0.044
 footsteps-0.043
 towels-0.042
Token.
Token ID13
Feature activation-1.38e-01

Pos logit contributions
 stations+0.015
 levers+0.013
 speakers+0.013
 footsteps+0.012
 towels+0.012
Neg logit contributions
iflower-0.021
pron-0.017
opus-0.017
orn-0.017
oppy-0.017
Token Let
Token ID3914
Feature activation-7.62e-01

Pos logit contributions
 stations+0.003
 levers+0.003
 speakers+0.003
 footsteps+0.003
 extingu+0.003
Neg logit contributions
iflower-0.005
pron-0.004
opus-0.004
orn-0.004
affle-0.004
Token's
Token ID338
Feature activation-1.85e+00

Pos logit contributions
 stations+0.016
 heads+0.013
 levers+0.013
 towels+0.013
 footsteps+0.013
Neg logit contributions
iflower-0.031
pron-0.027
opus-0.026
affle-0.026
oppy-0.025
Token.
Token ID13
Feature activation+4.40e-01

Pos logit contributions
 stations+0.038
 elbows+0.033
 footsteps+0.032
 speakers+0.032
 levers+0.031
Neg logit contributions
iflower-0.030
on-0.025
ip-0.025
pron-0.024
orn-0.023
Token He
Token ID679
Feature activation+1.29e+00

Pos logit contributions
iflower+0.016
opus+0.013
pron+0.013
orn+0.013
affle+0.013
Neg logit contributions
 stations-0.011
 levers-0.010
 footsteps-0.010
 speakers-0.010
 towels-0.009
Token felt
Token ID2936
Feature activation+9.81e-01

Pos logit contributions
iflower+0.044
opus+0.036
oppy+0.035
pron+0.035
orn+0.035
Neg logit contributions
 stations-0.035
 levers-0.030
 footsteps-0.030
 speakers-0.030
hole-0.028
Token the
Token ID262
Feature activation+4.95e-01

Pos logit contributions
iflower+0.033
orn+0.026
pron+0.026
oppy+0.026
opus+0.025
Neg logit contributions
 stations-0.028
 levers-0.025
 speakers-0.024
 heads-0.023
 footsteps-0.023
Token water
Token ID1660
Feature activation+1.07e+00

Pos logit contributions
iflower+0.015
pron+0.012
opus+0.012
orn+0.012
affle+0.012
Neg logit contributions
 stations-0.015
packs-0.013
hole-0.013
 speakers-0.013
 levers-0.013
Token around
Token ID1088
Feature activation-5.11e-01

Pos logit contributions
iflower+0.034
orn+0.027
opus+0.027
pron+0.027
oppy+0.024
Neg logit contributions
 stations-0.033
 speakers-0.029
 levers-0.028
 footsteps-0.027
 permission-0.026
Token him
Token ID683
Feature activation-2.35e-01

Pos logit contributions
 stations+0.015
 levers+0.013
 speakers+0.013
hole+0.012
 extingu+0.012
Neg logit contributions
iflower-0.017
orn-0.014
pron-0.013
oppy-0.013
lf-0.013
Token and
Token ID290
Feature activation+1.39e+00

Pos logit contributions
 stations+0.007
 speakers+0.006
 levers+0.006
 extingu+0.006
 umb+0.005
Neg logit contributions
iflower-0.008
orn-0.006
oppy-0.006
pron-0.006
opus-0.006
Token heard
Token ID2982
Feature activation-8.28e-01

Pos logit contributions
iflower+0.038
pron+0.029
lf+0.028
orn+0.028
oppy+0.027
Neg logit contributions
 stations-0.045
hole-0.039
 extingu-0.038
 levers-0.038
 speakers-0.038
Token the
Token ID262
Feature activation-1.30e+00

Pos logit contributions
 stations+0.023
hole+0.022
 levers+0.020
 heads+0.020
 extingu+0.020
Neg logit contributions
iflower-0.025
orn-0.020
opus-0.019
pron-0.019
affle-0.019
Token bubbles
Token ID25037
Feature activation-1.76e+00

Pos logit contributions
 stations+0.035
hole+0.031
 levers+0.030
 speakers+0.029
packs+0.029
Neg logit contributions
iflower-0.044
opus-0.037
pron-0.035
affle-0.034
orn-0.034
Token their
Token ID511
Feature activation-6.51e-01

Pos logit contributions
iflower+0.012
pron+0.010
oppy+0.010
orn+0.010
affle+0.010
Neg logit contributions
 stations-0.009
 levers-0.008
hole-0.008
 speakers-0.008
 dashboard-0.007
Token snack
Token ID26906
Feature activation-3.32e-02

Pos logit contributions
 stations+0.018
 levers+0.016
 speakers+0.015
 footsteps+0.015
 towels+0.015
Neg logit contributions
iflower-0.023
opus-0.019
pron-0.019
orn-0.018
oppy-0.018
Token from
Token ID422
Feature activation+8.32e-02

Pos logit contributions
 stations+0.001
 levers+0.001
 speakers+0.001
 footsteps+0.001
 elbows+0.001
Neg logit contributions
iflower-0.001
pron-0.001
orn-0.001
ip-0.001
lf-0.001
Token their
Token ID511
Feature activation-6.98e-01

Pos logit contributions
iflower+0.003
pron+0.002
orn+0.002
opus+0.002
oppy+0.002
Neg logit contributions
 stations-0.002
 levers-0.002
 speakers-0.002
 footsteps-0.002
hole-0.002
Token back
Token ID736
Feature activation+1.21e+00

Pos logit contributions
 stations+0.019
 levers+0.017
 speakers+0.016
 footsteps+0.016
 towels+0.016
Neg logit contributions
iflower-0.024
opus-0.020
pron-0.020
orn-0.020
oppy-0.019
Tokenpacks
Token ID32377
Feature activation-4.52e+00

Pos logit contributions
iflower+0.046
orn+0.039
oppy+0.039
pron+0.037
opus+0.037
Neg logit contributions
 stations-0.028
 levers-0.026
 speakers-0.023
 footsteps-0.022
 heads-0.022
Token and
Token ID290
Feature activation+1.92e-01

Pos logit contributions
 stations+0.121
 levers+0.118
 footsteps+0.111
 elbows+0.105
 speakers+0.103
Neg logit contributions
iflower-0.149
pron-0.127
orn-0.125
on-0.124
lf-0.121
Token gave
Token ID2921
Feature activation+1.64e+00

Pos logit contributions
iflower+0.006
pron+0.005
oppy+0.005
orn+0.005
lf+0.005
Neg logit contributions
 stations-0.006
 levers-0.005
 speakers-0.005
hole-0.005
 footsteps-0.005
Token some
Token ID617
Feature activation-9.61e-01

Pos logit contributions
iflower+0.062
pron+0.051
orn+0.051
oppy+0.051
opus+0.051
Neg logit contributions
 stations-0.039
 levers-0.034
 speakers-0.033
 footsteps-0.031
hole-0.030
Token to
Token ID284
Feature activation+1.69e+00

Pos logit contributions
 stations+0.034
 levers+0.030
 speakers+0.029
hole+0.028
 footsteps+0.028
Neg logit contributions
iflower-0.026
pron-0.021
affle-0.020
orn-0.020
opus-0.020
Token the
Token ID262
Feature activation-6.23e-01

Pos logit contributions
iflower+0.060
pron+0.050
oppy+0.049
opus+0.049
orn+0.048
Neg logit contributions
 stations-0.044
 levers-0.039
 speakers-0.037
hole-0.036
 footsteps-0.035
Token water
Token ID1660
Feature activation-2.30e-01

Pos logit contributions
 stations+0.025
 footsteps+0.022
 levers+0.022
 speakers+0.021
 extingu+0.021
Neg logit contributions
iflower-0.024
pron-0.020
opus-0.020
orn-0.019
affle-0.017
Token balloon
Token ID21190
Feature activation+8.93e-01

Pos logit contributions
 stations+0.007
 speakers+0.006
 footsteps+0.006
 levers+0.006
hole+0.005
Neg logit contributions
iflower-0.007
orn-0.006
ip-0.006
opus-0.005
on-0.005
Token with
Token ID351
Feature activation+4.08e-01

Pos logit contributions
iflower+0.026
orn+0.021
ip+0.020
on+0.019
opus+0.019
Neg logit contributions
 stations-0.027
 speakers-0.026
 footsteps-0.024
 levers-0.022
 permission-0.022
Token all
Token ID477
Feature activation-1.79e+00

Pos logit contributions
iflower+0.012
on+0.010
orn+0.009
Buzz+0.009
 grape+0.009
Neg logit contributions
 stations-0.012
hole-0.010
 speakers-0.010
bars-0.010
 spine-0.009
Token their
Token ID511
Feature activation-7.08e-01

Pos logit contributions
 stations+0.054
 speakers+0.050
 footsteps+0.050
 levers+0.048
 elbows+0.047
Neg logit contributions
iflower-0.051
orn-0.042
pron-0.039
on-0.039
affle-0.038
Token might
Token ID1244
Feature activation-4.75e+00

Pos logit contributions
 stations+0.016
 footsteps+0.013
 speakers+0.013
 extingu+0.012
 levers+0.012
Neg logit contributions
iflower-0.028
oppy-0.023
opus-0.023
pron-0.023
orn-0.022
Token.
Token ID13
Feature activation-2.84e-01

Pos logit contributions
 stations+0.145
 speakers+0.129
 levers+0.121
 footsteps+0.120
 buses+0.118
Neg logit contributions
iflower-0.136
inkle-0.117
orn-0.111
on-0.109
pron-0.109
Token But
Token ID887
Feature activation-2.26e+00

Pos logit contributions
 stations+0.007
 levers+0.007
 speakers+0.006
 footsteps+0.006
 towels+0.006
Neg logit contributions
iflower-0.010
opus-0.008
pron-0.008
orn-0.008
oppy-0.008
Token they
Token ID484
Feature activation-2.44e+00

Pos logit contributions
 stations+0.063
 levers+0.055
 speakers+0.055
 footsteps+0.052
 towels+0.051
Neg logit contributions
iflower-0.075
pron-0.062
opus-0.062
oppy-0.061
orn-0.060
Token missed
Token ID6825
Feature activation+7.79e-02

Pos logit contributions
 stations+0.071
 speakers+0.062
 levers+0.061
 footsteps+0.061
 extingu+0.057
Neg logit contributions
iflower-0.079
opus-0.066
oppy-0.065
orn-0.065
pron-0.065
Token the
Token ID262
Feature activation-3.93e-01

Pos logit contributions
iflower+0.003
oppy+0.002
pron+0.002
opus+0.002
lf+0.002
Neg logit contributions
 stations-0.002
 levers-0.002
 speakers-0.002
hole-0.002
 dashboard-0.002
Token room
Token ID2119
Feature activation-1.26e+00

Pos logit contributions
 stations+0.013
 levers+0.012
 speakers+0.011
 footsteps+0.010
 towels+0.010
Neg logit contributions
iflower-0.020
pron-0.016
opus-0.016
oppy-0.016
orn-0.016
Token and
Token ID290
Feature activation-1.84e-01

Pos logit contributions
 stations+0.041
 levers+0.037
 extingu+0.035
 brake+0.034
 speakers+0.034
Neg logit contributions
iflower-0.035
pron-0.032
lf-0.030
opus-0.028
on-0.026
Token hugged
Token ID41130
Feature activation+2.19e-01

Pos logit contributions
 stations+0.005
 extingu+0.005
 levers+0.005
 towels+0.005
 speakers+0.005
Neg logit contributions
iflower-0.006
pron-0.005
lf-0.004
orn-0.004
oppy-0.004
Token herself
Token ID5223
Feature activation+7.79e-01

Pos logit contributions
iflower+0.006
pron+0.005
orn+0.005
lf+0.005
oppy+0.004
Neg logit contributions
 stations-0.007
 levers-0.006
 speakers-0.006
 extingu-0.006
hole-0.006
Token.
Token ID13
Feature activation-5.78e-01

Pos logit contributions
iflower+0.026
pron+0.022
oppy+0.020
opus+0.020
orn+0.020
Neg logit contributions
 stations-0.023
 levers-0.021
 speakers-0.020
 footsteps-0.019
 elbows-0.018
Token Nobody
Token ID15658
Feature activation-4.44e+00

Pos logit contributions
 stations+0.017
 levers+0.015
 speakers+0.014
 footsteps+0.014
 towels+0.014
Neg logit contributions
iflower-0.019
pron-0.016
opus-0.016
orn-0.015
oppy-0.015
Token knew
Token ID2993
Feature activation-8.77e-01

Pos logit contributions
 stations+0.143
 levers+0.129
 speakers+0.124
 footsteps+0.123
 towels+0.120
Neg logit contributions
iflower-0.132
pron-0.106
orn-0.106
opus-0.105
oppy-0.101
Token the
Token ID262
Feature activation-6.07e-01

Pos logit contributions
 stations+0.025
 speakers+0.022
 levers+0.022
 extingu+0.022
 heads+0.021
Neg logit contributions
iflower-0.029
orn-0.023
oppy-0.022
pron-0.022
opus-0.021
Token secret
Token ID3200
Feature activation-5.61e-01

Pos logit contributions
 stations+0.018
 levers+0.016
 speakers+0.015
 towels+0.015
 themselves+0.015
Neg logit contributions
iflower-0.020
pron-0.016
orn-0.016
opus-0.016
affle-0.015
Token but
Token ID475
Feature activation-1.27e+00

Pos logit contributions
 stations+0.019
 levers+0.017
 speakers+0.016
 elbows+0.016
 extingu+0.016
Neg logit contributions
iflower-0.016
pron-0.012
orn-0.012
lf-0.012
opus-0.012
Token she
Token ID673
Feature activation+1.70e-01

Pos logit contributions
 stations+0.039
 levers+0.035
 speakers+0.033
 extingu+0.032
 footsteps+0.032
Neg logit contributions
iflower-0.040
pron-0.033
orn-0.032
opus-0.031
oppy-0.031
Token\n
Token ID198
Feature activation-2.43e-02

Pos logit contributions
 stations+0.021
 levers+0.019
 speakers+0.018
 footsteps+0.018
 towels+0.017
Neg logit contributions
iflower-0.028
opus-0.023
pron-0.023
orn-0.023
affle-0.022
Token\n
Token ID198
Feature activation+2.87e-01

Pos logit contributions
 stations+0.001
 speakers+0.000
 themselves+0.000
 levers+0.000
 towels+0.000
Neg logit contributions
iflower-0.001
opus-0.001
affle-0.001
pron-0.001
atmeal-0.001
Tokenâ
Token ID22940
Feature activation-2.10e+00

Pos logit contributions
iflower+0.010
affle+0.008
orn+0.008
opus+0.008
oppy+0.007
Neg logit contributions
 stations-0.008
 speakers-0.007
 levers-0.007
 themselves-0.007
 towels-0.007
Token
Token ID26391
Feature activation+2.68e+00

Pos logit contributions
 stations+0.066
 levers+0.060
 speakers+0.057
 footsteps+0.056
 towels+0.055
Neg logit contributions
iflower-0.065
pron-0.053
opus-0.052
orn-0.052
oppy-0.050
Token
Token ID129
Feature activation-1.17e+00

Pos logit contributions
iflower+0.084
pron+0.067
orn+0.067
opus+0.067
affle+0.066
Neg logit contributions
 stations-0.081
 levers-0.074
 speakers-0.072
 footsteps-0.069
 towels-0.067
Token
Token ID241
Feature activation-4.66e+00

Pos logit contributions
 stations+0.036
 levers+0.033
 speakers+0.032
 footsteps+0.031
 heads+0.031
Neg logit contributions
iflower-0.036
opus-0.029
pron-0.029
orn-0.029
oppy-0.028
TokenNow
Token ID3844
Feature activation-1.30e+00

Pos logit contributions
 stations+0.164
 levers+0.149
 footsteps+0.147
 heads+0.147
 towels+0.142
Neg logit contributions
iflower-0.120
pron-0.099
orn-0.093
tto-0.092
opus-0.091
Token Annie
Token ID25860
Feature activation+1.97e+00

Pos logit contributions
 stations+0.040
 levers+0.036
 speakers+0.035
 footsteps+0.034
 heads+0.033
Neg logit contributions
iflower-0.039
oppy-0.032
pron-0.032
opus-0.032
orn-0.030
Token,
Token ID11
Feature activation-3.87e-01

Pos logit contributions
iflower+0.069
pron+0.058
opus+0.057
orn+0.056
affle+0.055
Neg logit contributions
 stations-0.052
 levers-0.046
 speakers-0.044
 footsteps-0.043
 heads-0.042
Tokenâ
Token ID22940
Feature activation-1.74e+00

Pos logit contributions
 stations+0.012
 levers+0.011
 speakers+0.010
 footsteps+0.010
 heads+0.010
Neg logit contributions
iflower-0.012
pron-0.010
opus-0.010
oppy-0.009
orn-0.009
Token
Token ID26391
Feature activation+2.92e+00

Pos logit contributions
 stations+0.051
 levers+0.045
 speakers+0.044
 footsteps+0.043
 heads+0.041
Neg logit contributions
iflower-0.057
opus-0.046
pron-0.046
oppy-0.045
orn-0.045
Token.
Token ID13
Feature activation-1.45e+00

Pos logit contributions
 stations+0.082
 levers+0.069
 speakers+0.067
 brake+0.062
 elbows+0.061
Neg logit contributions
iflower-0.098
pron-0.086
opus-0.078
oppy-0.077
orn-0.075
Token
Token ID6184
Feature activation-9.20e-02

Pos logit contributions
 stations+0.041
 levers+0.037
 speakers+0.036
 footsteps+0.035
 towels+0.034
Neg logit contributions
iflower-0.049
opus-0.039
pron-0.039
orn-0.039
oppy-0.038
Token
Token ID95
Feature activation-8.60e-01

Pos logit contributions
 stations+0.003
 levers+0.003
 speakers+0.003
 footsteps+0.003
 dashboard+0.002
Neg logit contributions
iflower-0.003
orn-0.002
pron-0.002
oppy-0.002
affle-0.002
Token
Token ID26391
Feature activation+2.59e+00

Pos logit contributions
 stations+0.027
 levers+0.024
 speakers+0.024
 footsteps+0.023
 heads+0.022
Neg logit contributions
iflower-0.026
opus-0.021
pron-0.021
oppy-0.021
orn-0.020
Token
Token ID129
Feature activation-9.76e-01

Pos logit contributions
iflower+0.082
orn+0.066
pron+0.065
affle+0.064
opus+0.064
Neg logit contributions
 stations-0.075
 levers-0.069
 speakers-0.068
hole-0.064
 footsteps-0.064
Token
Token ID241
Feature activation-4.50e+00

Pos logit contributions
 stations+0.030
 levers+0.027
 speakers+0.027
 heads+0.025
 footsteps+0.025
Neg logit contributions
iflower-0.030
orn-0.024
pron-0.024
opus-0.024
oppy-0.024
TokenThat
Token ID2504
Feature activation+2.52e+00

Pos logit contributions
 stations+0.159
 levers+0.145
 footsteps+0.144
 heads+0.143
 towels+0.139
Neg logit contributions
iflower-0.115
pron-0.093
opus-0.091
orn-0.090
tto-0.083
Tokenâ
Token ID22940
Feature activation-1.82e+00

Pos logit contributions
iflower+0.074
opus+0.060
pron+0.060
orn+0.057
affle+0.053
Neg logit contributions
 stations-0.081
 footsteps-0.074
 levers-0.073
 extingu-0.072
 elbows-0.071
Token
Token ID26391
Feature activation+3.08e+00

Pos logit contributions
 stations+0.053
 levers+0.048
 speakers+0.046
 footsteps+0.045
 heads+0.044
Neg logit contributions
iflower-0.059
pron-0.047
opus-0.047
oppy-0.047
orn-0.046
Token
Token ID8151
Feature activation+3.46e+00

Pos logit contributions
iflower+0.095
orn+0.075
pron+0.074
affle+0.073
opus+0.071
Neg logit contributions
 stations-0.092
 levers-0.085
 speakers-0.084
hole-0.078
 towels-0.077
Tokens
Token ID82
Feature activation-7.92e-01

Pos logit contributions
iflower+0.111
opus+0.090
pron+0.090
orn+0.088
oppy+0.087
Neg logit contributions
 stations-0.103
 levers-0.092
 speakers-0.089
 footsteps-0.087
 towels-0.084