TOP ACTIVATIONS
MAX = 6.575

Tim my 's great big er aser . <|endoftext|> Once upon
Tim my 's great big er aser . <|endoftext|> Once upon
don 't forget your new er aser . You never know
" I lost my lucky er aser . I can 't
all day with the lucky er aser . <|endoftext|> Once upon
for helping me fix my er aser ." Billy smiled and
" Oh no , my er aser is broken !" Billy
't mean to break your er aser . I just wanted
She gave Lily the er aser . She helped her
yours ! That 's my er aser !" But
mom showed Amy the incredible er asers . She was so
took turns sliding with the er aser . They laughed and
. She was holding another er aser . She said ,
can help you with your er aser ." George
room and gave back the er aser . From that day
We did not share the er aser ." The
the little girl and the er aser could play again .
sad because she loved her er aser . Suddenly , Mitt
the floor and hugged the er aser tight . The er
not want to share her er aser . She liked it

MIN ACTIVATIONS
MIN = -5.123

fun . Lily realized that pic n ics were not so
pick oranges . One day Lib by 's parents decided to
Jane hopped on the sees aw and before long she was
children running around , families pic n icking , couples st
went on adventures , had pic n ics , and even
even see other people having pic n ics .
a fountain . Near the fountain stood a little girl .
the slide and the sees aw . They had a lot
the slide and the sees aw . They had a lot
." The moral of the story is that it
bake . One day , Lib by 's mom my brought
roof . They would have pic n ics and play games
The next day Lib by went to the kitchen
One day , Sof ia was playing with her
it . S asha looked out of her
the slide and the sees aw . They also liked to
the slide and the sees aw . They laugh and shout
the c rows were c aw ing in unison to make
so happy that he was able to cross with the help
a swing , a sees aw and a climbing frame .

INTERVAL 3.651 - 6.575
CONTAINS 0.013%

said he lost his original toy car . Lily told him
to play with his new toy car . He makes it
Tom , here is your laser . Are you okay ?"
Mary was only three - years - old and loved playing
nice , like your favorite toy or your best friend .

INTERVAL 0.726 - 3.651
CONTAINS 16.993%

mom my smiled and said : " It is a safe
three years old and she loved going to the park .
go away , Anna ! You are a bad girl !"
said , " I understand , Lily . You have a
who was clever and she chased it up a tree .

INTERVAL -2.198 - 0.726
CONTAINS 82.562%

and explore . One day Ranger found something very special in
sad . But then he felt the warm sun on his
Tom felt bad . He knew he was wrong
went so high that it almost touched the clouds . Tim
mom were having so much fun together . <|endoftext|> Once upon

INTERVAL -5.123 - -2.198
CONTAINS 0.431%

It then hopped to a different place and scattered some more
a big pole in the park . It was smooth and
fly . It flew high up in the sky and could
chased it until she was able to catch it in her
the air . It flew higher and higher until it reached
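The four INTERVAL ranges above appear to be equal-width bins spanning MIN to MAX. A minimal sketch, assuming that binning rule (the dashboard's actual method is not stated here, and `interval_edges` is a hypothetical helper name), recomputes the edges for comparison with the listed ones and sums the CONTAINS percentages:

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges from lo to hi (assumed binning rule)."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]

# MIN and MAX as reported in the listing
edges = interval_edges(-5.123, 6.575)

# Edges as printed in the INTERVAL headers, lowest first
listed = [-5.123, -2.198, 0.726, 3.651, 6.575]
max_gap = max(abs(e - l) for e, l in zip(edges, listed))

# CONTAINS percentages, lowest interval first
shares = [0.431, 82.562, 16.993, 0.013]
total_share = sum(shares)

print(edges, max_gap, total_share)
```

Under this reading, the two middle intervals together hold over 99% of token positions, and the top interval (where the "er" tokens of "eraser" land) holds only 0.013%.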
Token Tim
Token ID5045
Feature activation+2.80e-01

Pos logit contributions
 robots+0.015
 rockets+0.014
 Amen+0.014
 moral+0.014
 Mechan+0.014
Neg logit contributions
ill-0.013
line-0.012
cious-0.011
air-0.011
 Sophia-0.011
Tokenmy
Token ID1820
Feature activation+7.76e-01

Pos logit contributions
 robots+0.007
 Amen+0.007
 rockets+0.007
 moral+0.007
 Mechan+0.007
Neg logit contributions
ill-0.008
line-0.007
air-0.007
cious-0.007
 Sophia-0.007
Token's
Token ID338
Feature activation+1.08e+00

Pos logit contributions
 robots+0.028
 rockets+0.026
 puzzles+0.026
 trucks+0.026
 novels+0.026
Neg logit contributions
ill-0.015
line-0.014
cious-0.013
air-0.013
 Sophia-0.013
Token great
Token ID1049
Feature activation+1.33e+00

Pos logit contributions
 Amen+0.025
 rockets+0.022
 robots+0.022
 novels+0.022
 nurses+0.022
Neg logit contributions
ill-0.034
line-0.032
air-0.031
cious-0.031
 Sophia-0.030
Token big
Token ID1263
Feature activation+2.83e+00

Pos logit contributions
 Amen+0.024
 robots+0.021
 rockets+0.021
 trucks+0.020
 nurses+0.020
Neg logit contributions
ill-0.048
air-0.045
 Sophia-0.044
line-0.044
cious-0.044
Token er
Token ID1931
Feature activation+6.58e+00

Pos logit contributions
 Amen+0.046
 robots+0.042
 rockets+0.041
 nurses+0.040
 novels+0.040
Neg logit contributions
ill-0.106
air-0.098
 Sophia-0.098
line-0.097
cious-0.096
Tokenaser
Token ID6005
Feature activation+3.18e+00

Pos logit contributions
 Amen+0.151
 moral+0.143
 trucks+0.142
 rockets+0.140
 robots+0.140
Neg logit contributions
ill-0.217
air-0.196
line-0.194
ax-0.185
cious-0.184
Token.
Token ID13
Feature activation+1.94e-01

Pos logit contributions
 robots+0.109
 moral+0.108
 Amen+0.107
 Mechan+0.105
 rockets+0.104
Neg logit contributions
ill-0.061
line-0.056
cious-0.053
air-0.051
 Sophia-0.049
Token<|endoftext|>
Token ID50256
Feature activation+8.98e-01

Pos logit contributions
 robots+0.006
 rockets+0.006
 puzzles+0.006
 trucks+0.006
 novels+0.006
Neg logit contributions
ill-0.004
line-0.004
cious-0.004
air-0.004
 Sophia-0.004
TokenOnce
Token ID7454
Feature activation+1.26e+00

Pos logit contributions
 robots+0.021
 rockets+0.020
 moral+0.020
 puzzles+0.019
 novels+0.019
Neg logit contributions
line-0.027
cious-0.027
ill-0.026
air-0.025
 Sophia-0.025
Token upon
Token ID2402
Feature activation+1.70e+00

Pos logit contributions
 robots+0.037
 Mechan+0.035
 nurses+0.034
 puzzles+0.034
 Amen+0.034
Neg logit contributions
ill-0.034
line-0.031
 Sophia-0.030
air-0.027
ax-0.027
Token Tim
Token ID5045
Feature activation+2.04e-01

Pos logit contributions
 robots+0.016
 Amen+0.016
 moral+0.015
 Mechan+0.015
 rockets+0.015
Neg logit contributions
ill-0.013
line-0.012
 Sophia-0.012
cious-0.012
air-0.012
Tokenmy
Token ID1820
Feature activation+7.66e-01

Pos logit contributions
 robots+0.005
 Amen+0.005
 rockets+0.005
 moral+0.005
 Mechan+0.005
Neg logit contributions
ill-0.006
line-0.005
air-0.005
cious-0.005
 Sophia-0.005
Token's
Token ID338
Feature activation+1.05e+00

Pos logit contributions
 robots+0.028
 rockets+0.026
 puzzles+0.026
 trucks+0.026
 moral+0.026
Neg logit contributions
ill-0.014
line-0.013
cious-0.012
air-0.012
 Sophia-0.012
Token great
Token ID1049
Feature activation+1.19e+00

Pos logit contributions
 Amen+0.027
 novels+0.022
 nurses+0.022
 rockets+0.022
 robots+0.022
Neg logit contributions
ill-0.030
line-0.029
air-0.028
cious-0.027
 Sophia-0.026
Token big
Token ID1263
Feature activation+2.59e+00

Pos logit contributions
 Amen+0.026
 robots+0.022
 rockets+0.022
 trucks+0.021
 nurses+0.021
Neg logit contributions
ill-0.039
air-0.037
line-0.035
 Sophia-0.035
cious-0.034
Token er
Token ID1931
Feature activation+6.39e+00

Pos logit contributions
 Amen+0.050
 rockets+0.042
 nurses+0.042
 novels+0.042
 robots+0.042
Neg logit contributions
ill-0.089
air-0.083
 Sophia-0.082
line-0.082
cious-0.080
Tokenaser
Token ID6005
Feature activation+3.10e+00

Pos logit contributions
 Amen+0.156
 moral+0.146
 trucks+0.143
 rockets+0.142
 robots+0.140
Neg logit contributions
ill-0.203
air-0.184
line-0.183
ax-0.176
cious-0.169
Token.
Token ID13
Feature activation+2.34e-01

Pos logit contributions
 robots+0.107
 moral+0.107
 Amen+0.106
 Mechan+0.104
oral+0.103
Neg logit contributions
ill-0.057
line-0.053
cious-0.051
air-0.048
 Sophia-0.046
Token<|endoftext|>
Token ID50256
Feature activation+8.23e-01

Pos logit contributions
 robots+0.008
 rockets+0.007
 puzzles+0.007
 trucks+0.007
 novels+0.007
Neg logit contributions
ill-0.005
line-0.005
cious-0.005
air-0.004
 Sophia-0.004
TokenOnce
Token ID7454
Feature activation+1.28e+00

Pos logit contributions
 robots+0.019
 rockets+0.017
 moral+0.017
 puzzles+0.017
 novels+0.017
Neg logit contributions
line-0.026
cious-0.025
ill-0.025
air-0.024
 Sophia-0.023
Token upon
Token ID2402
Feature activation+1.62e+00

Pos logit contributions
 robots+0.038
 Mechan+0.035
 nurses+0.035
 puzzles+0.035
 trucks+0.035
Neg logit contributions
ill-0.035
line-0.031
 Sophia-0.030
air-0.028
ax-0.028
Token don
Token ID836
Feature activation-1.04e-01

Pos logit contributions
ill+0.011
line+0.010
cious+0.010
air+0.010
 Sophia+0.010
Neg logit contributions
 robots-0.010
 rockets-0.009
 puzzles-0.009
 trucks-0.009
 moral-0.009
Token't
Token ID470
Feature activation+2.38e-01

Pos logit contributions
line+0.003
ill+0.003
air+0.002
 Sophia+0.002
cious+0.002
Neg logit contributions
 robots-0.003
 nurses-0.003
 rockets-0.003
 trucks-0.003
 novels-0.003
Token forget
Token ID6044
Feature activation-3.83e-01

Pos logit contributions
 robots+0.008
 rockets+0.008
 puzzles+0.008
 novels+0.008
 trucks+0.008
Neg logit contributions
ill-0.005
line-0.005
cious-0.004
air-0.004
 Sophia-0.004
Token your
Token ID534
Feature activation-2.20e-01

Pos logit contributions
ill+0.009
line+0.008
 Sophia+0.008
air+0.008
cious+0.008
Neg logit contributions
 robots-0.012
 Mechan-0.011
 Amen-0.011
 moral-0.011
 rockets-0.011
Token new
Token ID649
Feature activation+8.66e-01

Pos logit contributions
 Sophia+0.005
ill+0.005
air+0.005
 mind+0.005
cious+0.005
Neg logit contributions
 Amen-0.006
 nurses-0.006
 Mechan-0.006
cut-0.006
 robots-0.005
Token er
Token ID1931
Feature activation+6.33e+00

Pos logit contributions
 Amen+0.023
 nurses+0.022
cut+0.020
 Mechan+0.020
 rockets+0.020
Neg logit contributions
ill-0.022
air-0.021
 Sophia-0.021
cious-0.020
 mind-0.019
Tokenaser
Token ID6005
Feature activation+2.93e+00

Pos logit contributions
 Amen+0.164
 trucks+0.160
 rockets+0.154
 cartoons+0.151
 moral+0.151
Neg logit contributions
ill-0.186
air-0.172
line-0.165
ax-0.158
ass-0.155
Token.
Token ID13
Feature activation-7.38e-02

Pos logit contributions
 robots+0.099
 Amen+0.094
 moral+0.094
 Mechan+0.094
 rockets+0.093
Neg logit contributions
ill-0.061
line-0.056
cious-0.052
air-0.052
 Sophia-0.050
Token You
Token ID921
Feature activation+1.25e+00

Pos logit contributions
ill+0.002
line+0.002
cious+0.002
air+0.002
 Sophia+0.002
Neg logit contributions
 robots-0.002
 Amen-0.002
 rockets-0.002
 moral-0.002
 novels-0.002
Token never
Token ID1239
Feature activation+7.65e-01

Pos logit contributions
 robots+0.038
 rockets+0.035
 trucks+0.035
 novels+0.034
 moral+0.034
Neg logit contributions
ill-0.031
line-0.029
air-0.028
cious-0.028
 Sophia-0.027
Token know
Token ID760
Feature activation+7.08e-02

Pos logit contributions
 robots+0.024
 rockets+0.023
 novels+0.023
 Amen+0.023
 trucks+0.023
Neg logit contributions
ill-0.017
line-0.016
air-0.016
 Sophia-0.015
cious-0.015
Token "
Token ID366
Feature activation-2.87e-01

Pos logit contributions
 robots+0.018
 puzzles+0.018
 rockets+0.018
 novels+0.017
 nurses+0.017
Neg logit contributions
line-0.013
ill-0.013
 Sophia-0.012
cious-0.012
air-0.011
TokenI
Token ID40
Feature activation+9.24e-01

Pos logit contributions
ill+0.005
line+0.005
cious+0.005
air+0.005
 Sophia+0.004
Neg logit contributions
 robots-0.010
 rockets-0.010
 puzzles-0.010
 cartoons-0.010
 novels-0.010
Token lost
Token ID2626
Feature activation+2.28e+00

Pos logit contributions
 robots+0.030
 rockets+0.029
 novels+0.029
 trucks+0.029
 automobiles+0.028
Neg logit contributions
ill-0.021
line-0.019
cious-0.018
air-0.018
 Sophia-0.018
Token my
Token ID616
Feature activation+9.74e-02

Pos logit contributions
 robots+0.065
 moral+0.062
 Amen+0.062
 rockets+0.061
 Mechan+0.061
Neg logit contributions
ill-0.056
line-0.053
air-0.052
 Sophia-0.051
cious-0.050
Token lucky
Token ID9670
Feature activation+1.45e+00

Pos logit contributions
 Amen+0.002
 rockets+0.002
 nurses+0.002
oral+0.002
 novels+0.002
Neg logit contributions
ill-0.003
 Sophia-0.003
line-0.003
air-0.003
cious-0.003
Token er
Token ID1931
Feature activation+6.32e+00

Pos logit contributions
 Amen+0.029
 rockets+0.026
 robots+0.025
 novels+0.025
 nurses+0.025
Neg logit contributions
ill-0.051
air-0.048
 Sophia-0.047
line-0.047
cious-0.047
Tokenaser
Token ID6005
Feature activation+3.60e+00

Pos logit contributions
 Amen+0.161
 rockets+0.152
 trucks+0.152
 moral+0.147
 nurses+0.147
Neg logit contributions
ill-0.191
air-0.172
line-0.170
ax-0.163
it-0.158
Token.
Token ID13
Feature activation+8.88e-01

Pos logit contributions
oral+0.142
 nurses+0.138
 moral+0.138
 Amen+0.137
 robots+0.136
Neg logit contributions
ill-0.050
line-0.043
cious-0.041
air-0.040
 Sophia-0.033
Token I
Token ID314
Feature activation+9.17e-01

Pos logit contributions
 robots+0.030
 rockets+0.029
 puzzles+0.029
 chores+0.029
 moral+0.029
Neg logit contributions
ill-0.017
air-0.016
line-0.015
 Sophia-0.015
cious-0.015
Token can
Token ID460
Feature activation+6.42e-01

Pos logit contributions
 robots+0.030
 rockets+0.029
 novels+0.029
 trucks+0.028
 automobiles+0.028
Neg logit contributions
ill-0.021
line-0.018
air-0.018
cious-0.018
 Sophia-0.017
Token't
Token ID470
Feature activation+1.30e+00

Pos logit contributions
 robots+0.019
 novels+0.018
 trucks+0.018
 rockets+0.018
 automobiles+0.018
Neg logit contributions
ill-0.016
line-0.015
air-0.014
cious-0.013
 Sophia-0.013
Token all
Token ID477
Feature activation+1.11e+00

Pos logit contributions
 robots+0.000
 puzzles+0.000
 rockets+0.000
 moral+0.000
 novels+0.000
Neg logit contributions
ill-0.000
cious-0.000
line-0.000
air-0.000
ax-0.000
Token day
Token ID1110
Feature activation-6.96e-01

Pos logit contributions
 robots+0.017
 moral+0.015
 trucks+0.015
 rockets+0.015
 puzzles+0.014
Neg logit contributions
ill-0.045
line-0.044
cious-0.044
air-0.042
 Sophia-0.042
Token with
Token ID351
Feature activation-9.03e-01

Pos logit contributions
ill+0.023
 Sophia+0.021
air+0.021
ax+0.020
line+0.020
Neg logit contributions
 robots-0.016
 puzzles-0.016
 rockets-0.016
 puzzle-0.015
 trucks-0.014
Token the
Token ID262
Feature activation+1.20e+00

Pos logit contributions
line+0.030
cious+0.029
ill+0.029
air+0.029
ax+0.027
Neg logit contributions
 robots-0.022
 rockets-0.020
 puzzles-0.020
 trucks-0.020
 magnets-0.020
Token lucky
Token ID9670
Feature activation+5.75e-01

Pos logit contributions
 Amen+0.021
 robots+0.021
 rockets+0.020
 novels+0.019
 trucks+0.019
Neg logit contributions
ill-0.042
line-0.040
 Sophia-0.040
air-0.039
cious-0.039
Token er
Token ID1931
Feature activation+6.23e+00

Pos logit contributions
 robots+0.008
 Amen+0.008
 rockets+0.008
 trucks+0.008
 novels+0.007
Neg logit contributions
ill-0.022
line-0.021
air-0.021
 Sophia-0.021
cious-0.021
Tokenaser
Token ID6005
Feature activation+3.11e+00

Pos logit contributions
 Amen+0.132
 robots+0.127
 rockets+0.125
 trucks+0.124
 moral+0.121
Neg logit contributions
ill-0.210
line-0.196
air-0.192
ax-0.183
cious-0.182
Token.
Token ID13
Feature activation-4.57e-01

Pos logit contributions
 robots+0.101
 rockets+0.096
 puzzles+0.094
 trucks+0.093
 moral+0.093
Neg logit contributions
ill-0.069
line-0.065
cious-0.062
air-0.062
 Sophia-0.061
Token<|endoftext|>
Token ID50256
Feature activation+7.47e-01

Pos logit contributions
ill+0.012
line+0.012
cious+0.012
air+0.011
 Sophia+0.011
Neg logit contributions
 robots-0.013
 rockets-0.012
 moral-0.012
 trucks-0.012
 novels-0.011
TokenOnce
Token ID7454
Feature activation+7.34e-01

Pos logit contributions
 robots+0.015
 rockets+0.014
 moral+0.014
 puzzles+0.014
 novels+0.013
Neg logit contributions
line-0.025
cious-0.025
air-0.024
ill-0.024
 Sophia-0.022
Token upon
Token ID2402
Feature activation+1.54e+00

Pos logit contributions
 robots+0.020
 rockets+0.019
 puzzles+0.019
 Mechan+0.019
 nurses+0.019
Neg logit contributions
ill-0.020
line-0.019
 Sophia-0.018
air-0.018
cious-0.017
Token for
Token ID329
Feature activation+1.98e-01

Pos logit contributions
 robots+0.014
 rockets+0.013
 puzzles+0.013
 novels+0.013
 trucks+0.013
Neg logit contributions
ill-0.015
line-0.014
 Sophia-0.014
cious-0.014
air-0.013
Token helping
Token ID5742
Feature activation-1.16e+00

Pos logit contributions
 robots+0.005
 novels+0.005
 rockets+0.005
 automobiles+0.005
 moral+0.005
Neg logit contributions
line-0.005
ill-0.005
 Sophia-0.005
cious-0.004
air-0.004
Token me
Token ID502
Feature activation+4.48e-01

Pos logit contributions
ill+0.028
line+0.027
cious+0.026
air+0.025
 Sophia+0.025
Neg logit contributions
 robots-0.035
 rockets-0.033
 puzzles-0.033
 trucks-0.033
 moral-0.032
Token fix
Token ID4259
Feature activation+1.43e+00

Pos logit contributions
 robots+0.014
 rockets+0.013
 puzzles+0.013
 trucks+0.013
 moral+0.013
Neg logit contributions
ill-0.011
line-0.010
air-0.010
cious-0.009
 Sophia-0.009
Token my
Token ID616
Feature activation+1.09e+00

Pos logit contributions
 robots+0.045
 Amen+0.043
 rockets+0.043
 moral+0.043
 Mechan+0.043
Neg logit contributions
ill-0.031
line-0.030
air-0.029
 Sophia-0.028
cious-0.027
Token er
Token ID1931
Feature activation+6.20e+00

Pos logit contributions
 Amen+0.028
 nurses+0.024
 rockets+0.023
 novels+0.023
 Mechan+0.023
Neg logit contributions
air-0.030
cious-0.030
ill-0.030
 Sophia-0.029
line-0.028
Tokenaser
Token ID6005
Feature activation+3.57e+00

Pos logit contributions
 Amen+0.158
 rockets+0.147
 trucks+0.147
 robots+0.143
 nurses+0.142
Neg logit contributions
ill-0.194
air-0.180
line-0.169
ax-0.163
atter-0.161
Token."
Token ID526
Feature activation-2.32e-01

Pos logit contributions
 robots+0.123
 Amen+0.121
 moral+0.119
oral+0.118
 Mechan+0.118
Neg logit contributions
ill-0.069
line-0.066
air-0.060
cious-0.057
 Sophia-0.054
Token Billy
Token ID15890
Feature activation+1.67e+00

Pos logit contributions
ill+0.005
line+0.005
 Sophia+0.005
air+0.005
cious+0.005
Neg logit contributions
 robots-0.007
 puzzles-0.007
 rockets-0.007
 nurses-0.007
 novels-0.007
Token smiled
Token ID13541
Feature activation+6.42e-01

Pos logit contributions
 robots+0.052
 novels+0.051
 rockets+0.051
 chores+0.049
 automobiles+0.049
Neg logit contributions
ill-0.041
 Sophia-0.036
air-0.035
ax-0.033
cious-0.033
Token and
Token ID290
Feature activation+4.20e-01

Pos logit contributions
 robots+0.018
 rockets+0.017
 moral+0.017
 magnets+0.016
 Amen+0.016
Neg logit contributions
ill-0.017
cious-0.017
line-0.017
air-0.016
 Sophia-0.016
Token "
Token ID366
Feature activation-6.87e-01

Pos logit contributions
 robots+0.014
 rockets+0.013
 puzzles+0.013
 novels+0.013
 moral+0.013
Neg logit contributions
ill-0.015
line-0.015
cious-0.014
air-0.014
 Sophia-0.014
TokenOh
Token ID5812
Feature activation+5.10e-01

Pos logit contributions
ill+0.020
cious+0.019
line+0.019
 Sophia+0.019
air+0.018
Neg logit contributions
 robots-0.017
 moral-0.017
 Amen-0.016
 nurses-0.016
 Mechan-0.015
Token no
Token ID645
Feature activation+1.99e+00

Pos logit contributions
 robots+0.016
 puzzles+0.015
 rockets+0.015
 moral+0.015
 cartoons+0.015
Neg logit contributions
ill-0.011
line-0.010
 Sophia-0.010
cious-0.010
air-0.009
Token,
Token ID11
Feature activation+1.39e+00

Pos logit contributions
 robots+0.069
 rockets+0.065
 puzzles+0.064
 moral+0.064
 trucks+0.064
Neg logit contributions
ill-0.041
line-0.038
cious-0.036
air-0.036
 Sophia-0.035
Token my
Token ID616
Feature activation+1.24e-01

Pos logit contributions
oral+0.046
 robots+0.046
 rockets+0.044
 novels+0.044
 Amen+0.043
Neg logit contributions
line-0.024
ill-0.022
 Sophia-0.022
 Sandy-0.020
 Dan-0.020
Token er
Token ID1931
Feature activation+6.15e+00

Pos logit contributions
 Amen+0.003
 rockets+0.002
 robots+0.002
 nurses+0.002
 novels+0.002
Neg logit contributions
ill-0.004
air-0.004
line-0.004
 Sophia-0.004
cious-0.004
Tokenaser
Token ID6005
Feature activation+3.37e+00

Pos logit contributions
 Amen+0.147
 rockets+0.139
 trucks+0.134
 moral+0.134
 robots+0.133
Neg logit contributions
ill-0.196
air-0.177
line-0.174
ax-0.170
 Sophia-0.164
Token is
Token ID318
Feature activation+1.10e-01

Pos logit contributions
 robots+0.107
 Amen+0.106
 rockets+0.104
 nurses+0.102
 moral+0.102
Neg logit contributions
ill-0.074
air-0.066
cious-0.064
line-0.062
 Sophia-0.061
Token broken
Token ID5445
Feature activation+2.78e+00

Pos logit contributions
oral+0.003
 Amen+0.003
 robots+0.003
 nurses+0.003
 novels+0.003
Neg logit contributions
ill-0.003
 Sophia-0.002
line-0.002
air-0.002
ax-0.002
Token!"
Token ID2474
Feature activation-3.25e-01

Pos logit contributions
 robots+0.103
 Amen+0.103
oral+0.102
 nurses+0.101
 moral+0.101
Neg logit contributions
ill-0.047
line-0.040
air-0.036
 Sophia-0.031
cious-0.031
Token Billy
Token ID15890
Feature activation+1.45e+00

Pos logit contributions
ill+0.009
line+0.009
cious+0.008
 Sophia+0.008
air+0.008
Neg logit contributions
 robots-0.008
 puzzles-0.008
 rockets-0.008
 novels-0.008
 moral-0.008
Token't
Token ID470
Feature activation+1.06e+00

Pos logit contributions
 nurses+0.054
 robots+0.053
 trucks+0.053
 rockets+0.052
cut+0.051
Neg logit contributions
air-0.030
ill-0.029
line-0.025
 Sophia-0.025
oh-0.024
Token mean
Token ID1612
Feature activation+1.46e-01

Pos logit contributions
 robots+0.028
 rockets+0.026
 puzzles+0.026
 trucks+0.026
 moral+0.025
Neg logit contributions
ill-0.030
line-0.029
cious-0.027
air-0.027
 Sophia-0.027
Token to
Token ID284
Feature activation+1.20e+00

Pos logit contributions
 robots+0.005
 rockets+0.005
 Mechan+0.005
 Amen+0.005
 moral+0.005
Neg logit contributions
ill-0.003
line-0.003
air-0.003
 Sophia-0.003
cious-0.003
Token break
Token ID2270
Feature activation+1.33e+00

Pos logit contributions
 robots+0.035
 rockets+0.034
 novels+0.034
 Amen+0.034
oral+0.034
Neg logit contributions
ill-0.029
line-0.027
air-0.027
 Sophia-0.025
cious-0.024
Token your
Token ID534
Feature activation+3.79e-01

Pos logit contributions
 robots+0.045
 Amen+0.045
 Mechan+0.044
 rockets+0.044
 moral+0.043
Neg logit contributions
ill-0.026
line-0.024
air-0.024
 Sophia-0.023
cious-0.021
Token er
Token ID1931
Feature activation+6.12e+00

Pos logit contributions
 Amen+0.008
 nurses+0.007
 rockets+0.007
 novels+0.007
 robots+0.007
Neg logit contributions
air-0.012
ill-0.012
cious-0.011
 Sophia-0.011
line-0.011
Tokenaser
Token ID6005
Feature activation+3.61e+00

Pos logit contributions
 Amen+0.144
 rockets+0.136
 robots+0.133
 trucks+0.131
 moral+0.128
Neg logit contributions
ill-0.197
air-0.186
line-0.179
ax-0.172
 Sophia-0.167
Token.
Token ID13
Feature activation+1.83e-02

Pos logit contributions
 Amen+0.147
 robots+0.146
 Mechan+0.143
oral+0.141
 rockets+0.140
Neg logit contributions
ill-0.051
line-0.045
air-0.040
 Sophia-0.032
cious-0.031
Token I
Token ID314
Feature activation+8.78e-01

Pos logit contributions
 robots+0.001
 rockets+0.001
 puzzles+0.001
 trucks+0.001
 moral+0.001
Neg logit contributions
ill-0.000
line-0.000
air-0.000
cious-0.000
 Sophia-0.000
Token just
Token ID655
Feature activation-4.42e-01

Pos logit contributions
 robots+0.028
 rockets+0.027
 novels+0.026
 trucks+0.026
 puzzles+0.026
Neg logit contributions
ill-0.021
air-0.019
cious-0.018
line-0.018
 Sophia-0.018
Token wanted
Token ID2227
Feature activation+1.29e+00

Pos logit contributions
ill+0.007
 seen+0.006
e+0.006
air+0.006
line+0.005
Neg logit contributions
oral-0.017
 rockets-0.016
 robots-0.016
 automobiles-0.016
 novels-0.016
Token\n
Token ID198
Feature activation-3.39e-01

Pos logit contributions
 robots+0.015
 rockets+0.014
 trucks+0.014
 Mechan+0.014
 puzzles+0.014
Neg logit contributions
ill-0.006
line-0.006
air-0.006
 Sophia-0.006
cious-0.005
TokenShe
Token ID3347
Feature activation+2.19e+00

Pos logit contributions
ill+0.008
line+0.008
air+0.007
 Sophia+0.007
cious+0.007
Neg logit contributions
 robots-0.010
 rockets-0.010
 trucks-0.010
 Mechan-0.010
 Amen-0.010
Token gave
Token ID2921
Feature activation-1.06e+00

Pos logit contributions
 robots+0.068
 rockets+0.066
 Amen+0.066
 novels+0.064
 automobiles+0.064
Neg logit contributions
ill-0.052
line-0.047
 Sophia-0.047
air-0.046
cious-0.043
Token Lily
Token ID20037
Feature activation-6.51e-01

Pos logit contributions
ill+0.025
line+0.024
cious+0.023
air+0.023
 Sophia+0.023
Neg logit contributions
 robots-0.033
 rockets-0.031
 puzzles-0.030
 moral-0.030
 trucks-0.030
Token the
Token ID262
Feature activation+4.70e-01

Pos logit contributions
ill+0.020
line+0.019
cious+0.019
air+0.018
 Sophia+0.018
Neg logit contributions
 robots-0.016
 puzzles-0.015
 rockets-0.014
 novels-0.014
 moral-0.014
Token er
Token ID1931
Feature activation+6.10e+00

Pos logit contributions
 Amen+0.010
 rockets+0.010
 robots+0.009
 trucks+0.009
 novels+0.009
Neg logit contributions
ill-0.016
 Sophia-0.015
line-0.014
air-0.014
cious-0.014
Tokenaser
Token ID6005
Feature activation+2.63e+00

Pos logit contributions
 Amen+0.141
 rockets+0.140
 trucks+0.140
 robots+0.137
 cartoons+0.131
Neg logit contributions
ill-0.194
air-0.181
line-0.174
 Sophia-0.164
ax-0.164
Token.
Token ID13
Feature activation-7.74e-01

Pos logit contributions
 robots+0.088
 rockets+0.083
 puzzles+0.081
 trucks+0.081
 moral+0.081
Neg logit contributions
ill-0.056
line-0.053
cious-0.050
air-0.050
 Sophia-0.049
Token She
Token ID1375
Feature activation+9.36e-01

Pos logit contributions
ill+0.016
line+0.016
 Sophia+0.015
air+0.014
cious+0.014
Neg logit contributions
 robots-0.026
 rockets-0.025
 trucks-0.024
 moral-0.024
 puzzles-0.024
Token helped
Token ID4193
Feature activation-5.50e-02

Pos logit contributions
 Amen+0.033
 robots+0.033
 automobiles+0.031
 rockets+0.031
 novels+0.031
Neg logit contributions
ill-0.020
 Sophia-0.017
air-0.015
ax-0.015
line-0.015
Token her
Token ID607
Feature activation+3.81e-01

Pos logit contributions
ill+0.001
line+0.001
cious+0.001
air+0.001
 Sophia+0.001
Neg logit contributions
 robots-0.002
 rockets-0.001
 puzzles-0.001
 trucks-0.001
 moral-0.001
Token yours
Token ID12431
Feature activation-8.07e-01

Pos logit contributions
 robots+0.010
 rockets+0.009
 moral+0.009
 Amen+0.009
 trucks+0.009
Neg logit contributions
ill-0.007
line-0.006
 Sophia-0.006
air-0.006
cious-0.006
Token!
Token ID0
Feature activation+5.76e-01

Pos logit contributions
ill+0.017
line+0.016
cious+0.015
air+0.015
 Sophia+0.015
Neg logit contributions
 robots-0.027
 rockets-0.026
 puzzles-0.025
 moral-0.025
 trucks-0.025
Token That
Token ID1320
Feature activation+7.74e-01

Pos logit contributions
 robots+0.018
 rockets+0.017
 puzzles+0.016
 moral+0.016
 trucks+0.016
Neg logit contributions
ill-0.014
line-0.013
cious-0.013
air-0.012
 Sophia-0.012
Token's
Token ID338
Feature activation-7.58e-01

Pos logit contributions
 Amen+0.033
 rockets+0.032
 novels+0.032
 robots+0.032
 nurses+0.031
Neg logit contributions
ill-0.008
oh-0.008
air-0.007
's-0.007
cious-0.006
Token my
Token ID616
Feature activation+9.09e-03

Pos logit contributions
ill+0.019
 Sophia+0.018
line+0.017
air+0.017
cious+0.017
Neg logit contributions
 robots-0.022
 rockets-0.021
 Amen-0.020
 Mechan-0.020
oral-0.020
Token er
Token ID1931
Feature activation+6.10e+00

Pos logit contributions
 Amen+0.000
 rockets+0.000
 novels+0.000
cut+0.000
 nurses+0.000
Neg logit contributions
ill-0.000
air-0.000
 Sophia-0.000
line-0.000
cious-0.000
Tokenaser
Token ID6005
Feature activation+2.49e+00

Pos logit contributions
 Amen+0.152
 rockets+0.144
 moral+0.141
 trucks+0.141
 nurses+0.140
Neg logit contributions
ill-0.188
air-0.169
line-0.161
ax-0.159
 Sophia-0.153
Token!"
Token ID2474
Feature activation-8.97e-01

Pos logit contributions
 robots+0.093
 Amen+0.091
 moral+0.091
 nurses+0.090
 rockets+0.089
Neg logit contributions
ill-0.042
cious-0.036
air-0.036
line-0.035
 Sophia-0.033
Token\n
Token ID198
Feature activation-1.96e-02

Pos logit contributions
ill+0.022
line+0.021
cious+0.020
air+0.020
 Sophia+0.019
Neg logit contributions
 robots-0.026
 moral-0.025
 rockets-0.025
 puzzles-0.024
 novels-0.024
Token\n
Token ID198
Feature activation-6.55e-01

Pos logit contributions
ill+0.000
line+0.000
cious+0.000
air+0.000
 Sophia+0.000
Neg logit contributions
 robots-0.001
 rockets-0.001
 moral-0.001
 puzzles-0.001
 trucks-0.001
TokenBut
Token ID1537
Feature activation+2.44e-01

Pos logit contributions
ill+0.021
line+0.018
cious+0.017
 Sophia+0.017
 Maggie+0.017
Neg logit contributions
 moral-0.015
 robots-0.014
 novels-0.014
 trucks-0.014
oral-0.013
Token mom
Token ID1995
Feature activation+6.03e-01

Pos logit contributions
 Amen+0.013
 rockets+0.012
 Mechan+0.012
 robots+0.012
cut+0.012
Neg logit contributions
ill-0.012
 Sophia-0.011
cious-0.010
 Sandy-0.010
air-0.010
Token showed
Token ID3751
Feature activation-1.92e-01

Pos logit contributions
 robots+0.017
 rockets+0.016
 novels+0.016
 puzzles+0.016
 moral+0.016
Neg logit contributions
ill-0.017
line-0.015
air-0.015
 Sophia-0.015
cious-0.014
Token Amy
Token ID14235
Feature activation-1.59e+00

Pos logit contributions
ill+0.005
line+0.004
 Sophia+0.004
air+0.004
cious+0.004
Neg logit contributions
 robots-0.006
 rockets-0.005
 Mechan-0.005
 moral-0.005
 trucks-0.005
Token the
Token ID262
Feature activation+4.55e-01

Pos logit contributions
ill+0.051
line+0.049
cious+0.047
air+0.047
 Sophia+0.047
Neg logit contributions
 robots-0.036
 rockets-0.033
 puzzles-0.032
 trucks-0.032
 moral-0.032
Token incredible
Token ID8082
Feature activation+9.30e-01

Pos logit contributions
 Amen+0.010
 rockets+0.009
 novels+0.009
 robots+0.009
 nurses+0.008
Neg logit contributions
ill-0.015
line-0.014
cious-0.014
 Sophia-0.014
air-0.014
Token er
Token ID1931
Feature activation+6.07e+00

Pos logit contributions
 Amen+0.014
 robots+0.014
 rockets+0.014
 novels+0.013
 nurses+0.013
Neg logit contributions
ill-0.035
line-0.033
air-0.033
cious-0.033
 Sophia-0.033
Tokenasers
Token ID19865
Feature activation+2.82e+00

Pos logit contributions
 Amen+0.144
 rockets+0.136
 moral+0.134
 robots+0.132
 trucks+0.131
Neg logit contributions
ill-0.194
line-0.177
air-0.176
ax-0.167
 Sophia-0.162
Token.
Token ID13
Feature activation+1.91e-01

Pos logit contributions
 robots+0.091
 rockets+0.086
 puzzles+0.085
 trucks+0.084
 moral+0.084
Neg logit contributions
ill-0.062
line-0.059
cious-0.056
air-0.055
 Sophia-0.055
Token She
Token ID1375
Feature activation+8.87e-01

Pos logit contributions
 robots+0.006
 rockets+0.005
 puzzles+0.005
 moral+0.005
 trucks+0.005
Neg logit contributions
ill-0.005
line-0.004
cious-0.004
air-0.004
 Sophia-0.004
Token was
Token ID373
Feature activation-3.07e-01

Pos logit contributions
 Amen+0.029
 robots+0.029
 novels+0.028
 rockets+0.028
 automobiles+0.027
Neg logit contributions
ill-0.021
 Sophia-0.019
cious-0.017
air-0.017
line-0.017
Token so
Token ID523
Feature activation-1.01e+00

Pos logit contributions
ill+0.008
line+0.007
cious+0.007
air+0.007
 Sophia+0.007
Neg logit contributions
 robots-0.009
 rockets-0.009
 puzzles-0.008
 moral-0.008
 trucks-0.008
Token took
Token ID1718
Feature activation+1.18e-01

Pos logit contributions
 robots+0.058
 Amen+0.057
 rockets+0.055
 novels+0.055
 Mechan+0.053
Neg logit contributions
ill-0.046
line-0.043
cious-0.041
 Sophia-0.040
air-0.040
Token turns
Token ID4962
Feature activation-8.63e-01

Pos logit contributions
 robots+0.003
 puzzles+0.003
 trucks+0.003
 rockets+0.003
 novels+0.003
Neg logit contributions
line-0.004
ill-0.004
cious-0.003
air-0.003
 Sophia-0.003
Token sliding
Token ID22292
Feature activation+8.18e-02

Pos logit contributions
ill+0.026
cious+0.026
air+0.024
line+0.024
 Sophia+0.024
Neg logit contributions
 robots-0.020
 trucks-0.020
 rockets-0.019
 puzzles-0.019
 automobiles-0.018
Token with
Token ID351
Feature activation-9.40e-02

Pos logit contributions
 robots+0.003
 rockets+0.003
 moral+0.003
 Amen+0.003
 puzzles+0.003
Neg logit contributions
ill-0.002
line-0.002
air-0.001
 Sophia-0.001
cious-0.001
Token the
Token ID262
Feature activation+1.11e+00

Pos logit contributions
ill+0.002
line+0.002
cious+0.002
air+0.002
 Sophia+0.002
Neg logit contributions
 robots-0.003
 rockets-0.003
 trucks-0.002
 puzzles-0.002
 moral-0.002
Token er
Token ID1931
Feature activation+6.02e+00

Pos logit contributions
 Amen+0.017
 robots+0.016
 rockets+0.016
 novels+0.015
 nurses+0.015
Neg logit contributions
ill-0.042
line-0.041
air-0.040
cious-0.040
 Sophia-0.039
Tokenaser
Token ID6005
Feature activation+2.58e+00

Pos logit contributions
 Amen+0.124
 robots+0.123
 rockets+0.121
 trucks+0.119
 moral+0.115
Neg logit contributions
ill-0.207
air-0.193
line-0.192
 Sophia-0.180
cious-0.179
Token.
Token ID13
Feature activation-7.65e-02

Pos logit contributions
 Amen+0.087
 robots+0.086
 Mechan+0.082
 moral+0.082
 nurses+0.082
Neg logit contributions
ill-0.054
line-0.051
air-0.047
cious-0.046
 Sophia-0.045
Token They
Token ID1119
Feature activation+1.92e-01

Pos logit contributions
ill+0.002
line+0.002
air+0.002
cious+0.002
 Sophia+0.002
Neg logit contributions
 robots-0.002
 rockets-0.002
 moral-0.002
 puzzles-0.002
 magnets-0.002
Token laughed
Token ID13818
Feature activation-5.52e-02

Pos logit contributions
 robots+0.006
 rockets+0.005
 novels+0.005
 Amen+0.005
 moral+0.005
Neg logit contributions
ill-0.005
line-0.005
cious-0.004
air-0.004
 Sophia-0.004
Token and
Token ID290
Feature activation-1.90e-01

Pos logit contributions
ill+0.001
line+0.001
cious+0.001
air+0.001
 Sophia+0.001
Neg logit contributions
 robots-0.002
 rockets-0.002
 puzzles-0.002
 trucks-0.002
 moral-0.002
Token.
Token ID13
Feature activation-1.04e+00

Pos logit contributions
 robots+0.027
 trucks+0.025
 puzzles+0.024
 rockets+0.024
 novels+0.023
Neg logit contributions
ill-0.025
line-0.024
air-0.023
cious-0.023
ax-0.022
Token She
Token ID1375
Feature activation+2.87e-02

Pos logit contributions
ill+0.031
cious+0.029
line+0.029
ax+0.028
air+0.027
Neg logit contributions
 robots-0.027
 automobiles-0.024
 novels-0.024
 rockets-0.023
 trucks-0.023
Token was
Token ID373
Feature activation+1.20e-02

Pos logit contributions
 robots+0.001
 rockets+0.001
 moral+0.001
 puzzles+0.001
 trucks+0.001
Neg logit contributions
ill-0.001
line-0.001
cious-0.001
air-0.001
 Sophia-0.001
Token holding
Token ID4769
Feature activation+5.37e-01

Pos logit contributions
 robots+0.000
 cartoons+0.000
 puzzles+0.000
 rockets+0.000
 moral+0.000
Neg logit contributions
cious-0.000
air-0.000
line-0.000
ill-0.000
ive-0.000
Token another
Token ID1194
Feature activation+4.25e-01

Pos logit contributions
 robots+0.012
 puzzles+0.011
 novels+0.011
 magnets+0.011
 rockets+0.011
Neg logit contributions
line-0.018
ill-0.017
cious-0.017
air-0.016
ipples-0.015
Token er
Token ID1931
Feature activation+5.98e+00

Pos logit contributions
 rockets+0.009
 robots+0.009
 Amen+0.009
 trucks+0.008
 novels+0.008
Neg logit contributions
ill-0.014
line-0.013
air-0.013
 Sophia-0.013
cious-0.013
Tokenaser
Token ID6005
Feature activation+2.70e+00

Pos logit contributions
 Amen+0.128
 trucks+0.127
 rockets+0.127
 robots+0.126
 moral+0.123
Neg logit contributions
ill-0.202
air-0.184
line-0.182
ax-0.173
 Sophia-0.171
Token.
Token ID13
Feature activation-9.57e-01

Pos logit contributions
 robots+0.098
 Amen+0.095
 Mechan+0.095
 moral+0.094
 rockets+0.094
Neg logit contributions
ill-0.049
line-0.045
cious-0.042
air-0.041
 Sophia-0.040
Token She
Token ID1375
Feature activation+7.32e-03

Pos logit contributions
ill+0.026
cious+0.025
line+0.024
air+0.024
ax+0.022
Neg logit contributions
 robots-0.027
 novels-0.024
 rockets-0.024
 automobiles-0.024
 puzzles-0.023
Token said
Token ID531
Feature activation-1.32e+00

Pos logit contributions
 robots+0.000
 rockets+0.000
 puzzles+0.000
 novels+0.000
 trucks+0.000
Neg logit contributions
ill-0.000
line-0.000
 Sophia-0.000
cious-0.000
air-0.000
Token,
Token ID11
Feature activation+7.38e-02

Pos logit contributions
ill+0.032
line+0.031
air+0.029
cious+0.029
 Sophia+0.027
Neg logit contributions
 robots-0.040
 rockets-0.037
 puzzles-0.037
 novels-0.036
 trucks-0.036
Token can
Token ID460
Feature activation+3.03e-01

Pos logit contributions
 robots+0.013
 rockets+0.012
 novels+0.012
 trucks+0.012
 puzzles+0.012
Neg logit contributions
ill-0.008
line-0.007
air-0.007
cious-0.007
 Sophia-0.007
Token help
Token ID1037
Feature activation+6.23e-01

Pos logit contributions
 robots+0.009
 novels+0.009
 rockets+0.008
 trucks+0.008
 puzzles+0.008
Neg logit contributions
ill-0.008
line-0.007
air-0.007
cious-0.006
 Sophia-0.006
Token you
Token ID345
Feature activation+1.20e+00

Pos logit contributions
 robots+0.020
 rockets+0.019
 moral+0.018
 puzzles+0.018
 trucks+0.018
Neg logit contributions
ill-0.014
line-0.013
air-0.013
cious-0.013
 Sophia-0.012
Token with
Token ID351
Feature activation+8.07e-01

Pos logit contributions
 robots+0.038
 rockets+0.035
 novels+0.035
 trucks+0.035
 moral+0.034
Neg logit contributions
ill-0.029
line-0.027
air-0.026
cious-0.025
 Sophia-0.024
Token your
Token ID534
Feature activation+4.00e-01

Pos logit contributions
oral+0.028
 Amen+0.027
 Mechan+0.027
 moral+0.026
 robots+0.026
Neg logit contributions
air-0.013
ill-0.013
line-0.012
 Sophia-0.012
cious-0.011
Token er
Token ID1931
Feature activation+5.98e+00

Pos logit contributions
 Amen+0.011
 nurses+0.009
cut+0.009
oral+0.009
 Mechan+0.008
Neg logit contributions
ill-0.010
air-0.010
cious-0.010
line-0.010
 Sophia-0.010
Tokenaser
Token ID6005
Feature activation+3.43e+00

Pos logit contributions
 Amen+0.150
 trucks+0.141
 rockets+0.139
 robots+0.135
 moral+0.135
Neg logit contributions
ill-0.186
air-0.172
ax-0.164
line-0.163
ass-0.151
Token."
Token ID526
Feature activation+3.93e-01

Pos logit contributions
 Amen+0.123
 robots+0.121
 moral+0.119
 Mechan+0.119
oral+0.119
Neg logit contributions
ill-0.062
line-0.057
air-0.052
cious-0.048
 Sophia-0.046
Token\n
Token ID198
Feature activation+8.53e-01

Pos logit contributions
 robots+0.011
 rockets+0.010
 novels+0.010
 puzzles+0.010
 Amen+0.010
Neg logit contributions
ill-0.011
line-0.010
cious-0.010
air-0.010
 Sophia-0.009
Token\n
Token ID198
Feature activation+2.75e-01

Pos logit contributions
 robots+0.031
 rockets+0.029
 trucks+0.029
 puzzles+0.029
 nurses+0.028
Neg logit contributions
ill-0.016
line-0.014
 Sophia-0.014
air-0.014
cious-0.014
TokenGeorge
Token ID20191
Feature activation+2.30e-01

Pos logit contributions
 robots+0.008
 rockets+0.007
 trucks+0.007
 puzzles+0.007
 moral+0.007
Neg logit contributions
ill-0.007
line-0.007
cious-0.007
air-0.007
 Sophia-0.007
Token room
Token ID2119
Feature activation+4.94e-01

Pos logit contributions
 Amen+0.012
 rockets+0.011
 robots+0.011
 nurses+0.011
cut+0.011
Neg logit contributions
ill-0.010
air-0.010
cious-0.010
 Sophia-0.010
line-0.009
Token and
Token ID290
Feature activation+4.18e-01

Pos logit contributions
 robots+0.016
 rockets+0.015
 puzzles+0.014
 trucks+0.014
 moral+0.014
Neg logit contributions
ill-0.011
line-0.011
cious-0.011
air-0.010
 Sophia-0.010
Token gave
Token ID2921
Feature activation-5.57e-01

Pos logit contributions
 robots+0.013
 Amen+0.013
 rockets+0.013
 novels+0.013
 automobiles+0.013
Neg logit contributions
ill-0.009
 Sophia-0.009
line-0.008
 Sandy-0.008
cious-0.007
Token back
Token ID736
Feature activation-2.08e-01

Pos logit contributions
ill+0.011
 Sophia+0.010
line+0.009
air+0.009
 Sandy+0.008
Neg logit contributions
 robots-0.019
 Mechan-0.019
 rockets-0.019
 trucks-0.018
 Amen-0.018
Token the
Token ID262
Feature activation+5.89e-01

Pos logit contributions
ill+0.005
line+0.005
 Sophia+0.005
air+0.005
cious+0.005
Neg logit contributions
 robots-0.006
 rockets-0.006
 Mechan-0.006
 moral-0.006
 Amen-0.006
Token er
Token ID1931
Feature activation+5.97e+00

Pos logit contributions
 Amen+0.015
 rockets+0.014
 robots+0.013
 novels+0.013
 cartoons+0.012
Neg logit contributions
ill-0.017
line-0.016
 Sophia-0.016
cious-0.016
air-0.016
Tokenaser
Token ID6005
Feature activation+3.13e+00

Pos logit contributions
 rockets+0.135
 robots+0.135
 trucks+0.134
 Amen+0.133
 cartoons+0.130
Neg logit contributions
ill-0.191
line-0.178
air-0.176
ax-0.168
it-0.161
Token.
Token ID13
Feature activation+7.31e-01

Pos logit contributions
 robots+0.132
 rockets+0.126
 Amen+0.126
 trucks+0.125
 moral+0.124
Neg logit contributions
ill-0.039
line-0.033
air-0.032
 Sophia-0.030
cious-0.030
Token From
Token ID3574
Feature activation+6.68e-01

Pos logit contributions
 robots+0.023
 rockets+0.022
 puzzles+0.022
 trucks+0.022
 nurses+0.021
Neg logit contributions
ill-0.017
line-0.016
 Sophia-0.015
air-0.015
cious-0.014
Token that
Token ID326
Feature activation+9.65e-01

Pos logit contributions
 Mechan+0.023
 robots+0.023
 Amen+0.022
 rockets+0.021
 puzzles+0.021
Neg logit contributions
 Sophia-0.012
line-0.012
ill-0.012
air-0.012
ax-0.011
Token day
Token ID1110
Feature activation+6.69e-01

Pos logit contributions
 robots+0.029
 rockets+0.028
 trucks+0.027
 nurses+0.027
 automobiles+0.027
Neg logit contributions
ill-0.022
line-0.020
cious-0.019
 Sophia-0.019
air-0.019
Token We
Token ID775
Feature activation-1.83e-01

Pos logit contributions
ill+0.010
line+0.009
cious+0.009
air+0.009
 Sophia+0.009
Neg logit contributions
 robots-0.009
 rockets-0.009
 moral-0.009
 puzzles-0.009
 Amen-0.008
Token did
Token ID750
Feature activation-4.48e-01

Pos logit contributions
ill+0.004
line+0.003
cious+0.003
air+0.003
 Sophia+0.003
Neg logit contributions
 robots-0.006
 rockets-0.006
 novels-0.006
 Amen-0.006
 cartoons-0.006
Token not
Token ID407
Feature activation+2.95e-01

Pos logit contributions
ill+0.009
 Sophia+0.009
line+0.008
air+0.008
cious+0.008
Neg logit contributions
 robots-0.014
 Mechan-0.014
 nurses-0.014
 Amen-0.014
 trucks-0.013
Token share
Token ID2648
Feature activation-6.28e-01

Pos logit contributions
 robots+0.009
 rockets+0.009
 novels+0.008
 trucks+0.008
 puzzles+0.008
Neg logit contributions
ill-0.007
line-0.007
air-0.006
 Sophia-0.006
cious-0.006
Token the
Token ID262
Feature activation+1.05e+00

Pos logit contributions
ill+0.014
line+0.013
air+0.012
 Sophia+0.012
cious+0.012
Neg logit contributions
 robots-0.020
 rockets-0.019
 Mechan-0.019
 Amen-0.019
 moral-0.019
Token er
Token ID1931
Feature activation+5.96e+00

Pos logit contributions
 Amen+0.028
 nurses+0.025
 rockets+0.024
 robots+0.024
 novels+0.024
Neg logit contributions
ill-0.028
line-0.028
air-0.026
cious-0.026
 Sophia-0.026
Tokenaser
Token ID6005
Feature activation+3.01e+00

Pos logit contributions
 robots+0.117
 Amen+0.114
 rockets+0.113
 trucks+0.111
 moral+0.107
Neg logit contributions
ill-0.210
line-0.199
air-0.196
cious-0.184
 Sophia-0.182
Token."
Token ID526
Feature activation-9.20e-01

Pos logit contributions
 robots+0.103
 rockets+0.098
 Mechan+0.098
 nurses+0.097
 Amen+0.097
Neg logit contributions
ill-0.062
line-0.057
air-0.053
 Sophia-0.051
cious-0.051
Token\n
Token ID198
Feature activation+1.71e-01

Pos logit contributions
ill+0.025
cious+0.024
line+0.024
air+0.023
 Sophia+0.022
Neg logit contributions
 robots-0.025
 moral-0.024
 rockets-0.023
 puzzles-0.023
 novels-0.023
Token\n
Token ID198
Feature activation-6.35e-01

Pos logit contributions
 robots+0.006
 puzzles+0.006
 rockets+0.006
 nurses+0.006
 trucks+0.006
Neg logit contributions
ill-0.003
line-0.003
air-0.003
 Sophia-0.003
cious-0.002
TokenThe
Token ID464
Feature activation+8.04e-01

Pos logit contributions
ill+0.015
line+0.014
cious+0.014
air+0.014
 Sophia+0.013
Neg logit contributions
 robots-0.020
 rockets-0.019
 puzzles-0.018
 trucks-0.018
 moral-0.018
Token the
Token ID262
Feature activation+1.04e+00

Pos logit contributions
 robots+0.029
 puzzles+0.028
 rockets+0.027
 moral+0.027
 trucks+0.027
Neg logit contributions
ill-0.026
line-0.025
cious-0.024
air-0.023
 Sophia-0.022
Token little
Token ID1310
Feature activation+4.00e-01

Pos logit contributions
 Amen+0.028
 rockets+0.028
 novels+0.026
 robots+0.025
 trucks+0.025
Neg logit contributions
ill-0.028
 Sophia-0.027
line-0.027
cious-0.026
air-0.025
Token girl
Token ID2576
Feature activation-1.20e-01

Pos logit contributions
 robots+0.010
 rockets+0.009
 puzzles+0.009
 moral+0.009
 trucks+0.009
Neg logit contributions
ill-0.012
line-0.012
cious-0.011
air-0.011
 Sophia-0.011
Token and
Token ID290
Feature activation-4.88e-01

Pos logit contributions
ill+0.003
air+0.003
line+0.003
cious+0.003
 Sophia+0.003
Neg logit contributions
 robots-0.004
 rockets-0.003
 novels-0.003
 trucks-0.003
 puzzles-0.003
Token the
Token ID262
Feature activation+1.20e+00

Pos logit contributions
ill+0.010
 Sophia+0.009
line+0.009
air+0.008
cious+0.008
Neg logit contributions
 robots-0.016
 Amen-0.016
 Mechan-0.016
 rockets-0.016
oral-0.016
Token er
Token ID1931
Feature activation+5.95e+00

Pos logit contributions
 Amen+0.031
 rockets+0.029
 novels+0.029
 trucks+0.026
 robots+0.026
Neg logit contributions
ill-0.033
 Sophia-0.032
line-0.032
cious-0.032
air-0.031
Tokenaser
Token ID6005
Feature activation+2.93e+00

Pos logit contributions
 Amen+0.149
 rockets+0.143
 trucks+0.141
 cartoons+0.137
 robots+0.136
Neg logit contributions
ill-0.188
air-0.168
line-0.167
ax-0.157
it-0.156
Token could
Token ID714
Feature activation+2.00e-01

Pos logit contributions
 Amen+0.091
 robots+0.091
 novels+0.088
 moral+0.085
 rockets+0.084
Neg logit contributions
ill-0.069
line-0.063
air-0.059
cious-0.059
 Sophia-0.056
Token play
Token ID711
Feature activation+1.22e+00

Pos logit contributions
 robots+0.007
 novels+0.007
 rockets+0.007
 Mechan+0.007
 trucks+0.007
Neg logit contributions
ill-0.004
line-0.003
 Sophia-0.003
air-0.003
cious-0.003
Token again
Token ID757
Feature activation-1.05e+00

Pos logit contributions
 robots+0.030
 puzzles+0.029
 rockets+0.028
 novels+0.026
 cartoons+0.026
Neg logit contributions
ill-0.038
line-0.036
cious-0.036
air-0.036
 Sophia-0.034
Token.
Token ID13
Feature activation+2.15e-01

Pos logit contributions
line+0.027
ill+0.027
cious+0.025
air+0.025
 Sophia+0.024
Neg logit contributions
 robots-0.030
 puzzles-0.029
 rockets-0.028
 moral-0.027
 novels-0.027
Token sad
Token ID6507
Feature activation+1.47e+00

Pos logit contributions
 robots+0.022
 rockets+0.021
 trucks+0.021
 Mechan+0.021
 puzzles+0.020
Neg logit contributions
line-0.021
ill-0.021
 Sophia-0.020
air-0.020
cious-0.020
Token because
Token ID780
Feature activation+1.37e+00

Pos logit contributions
 robots+0.049
 rockets+0.047
 puzzles+0.046
 trucks+0.046
 Mechan+0.046
Neg logit contributions
ill-0.031
line-0.030
air-0.027
 Sophia-0.027
cious-0.026
Token she
Token ID673
Feature activation+5.50e-01

Pos logit contributions
 robots+0.044
 Mechan+0.043
 rockets+0.042
 Amen+0.041
 trucks+0.041
Neg logit contributions
ill-0.030
line-0.029
 Sophia-0.028
cious-0.025
air-0.025
Token loved
Token ID6151
Feature activation+8.40e-01

Pos logit contributions
 robots+0.012
 rockets+0.011
 puzzles+0.011
 trucks+0.011
 moral+0.011
Neg logit contributions
ill-0.018
line-0.017
cious-0.016
air-0.016
 Sophia-0.016
Token her
Token ID607
Feature activation+7.28e-01

Pos logit contributions
 Amen+0.025
 robots+0.025
 Mechan+0.024
 moral+0.024
 rockets+0.024
Neg logit contributions
ill-0.019
line-0.018
 Sophia-0.018
cious-0.017
air-0.017
Token er
Token ID1931
Feature activation+5.94e+00

Pos logit contributions
 robots+0.012
 rockets+0.012
 Amen+0.011
 novels+0.011
 trucks+0.011
Neg logit contributions
ill-0.027
line-0.025
cious-0.025
 Sophia-0.025
air-0.025
Tokenaser
Token ID6005
Feature activation+3.28e+00

Pos logit contributions
 Amen+0.125
 rockets+0.123
 robots+0.122
 trucks+0.122
 moral+0.117
Neg logit contributions
ill-0.202
air-0.184
line-0.183
ax-0.174
cious-0.172
Token.
Token ID13
Feature activation+5.00e-01

Pos logit contributions
 Mechan+0.124
 robots+0.122
 Amen+0.121
oral+0.120
 rockets+0.120
Neg logit contributions
ill-0.051
line-0.043
cious-0.040
air-0.039
 Sophia-0.037
Token Suddenly
Token ID24975
Feature activation+1.30e+00

Pos logit contributions
 robots+0.017
 rockets+0.016
 puzzles+0.016
 trucks+0.016
 moral+0.016
Neg logit contributions
ill-0.010
line-0.010
 Sophia-0.009
air-0.009
cious-0.009
Token,
Token ID11
Feature activation+8.88e-01

Pos logit contributions
 robots+0.047
 Amen+0.045
 rockets+0.045
 moral+0.044
 trucks+0.044
Neg logit contributions
ill-0.023
line-0.021
 Sophia-0.021
cious-0.021
air-0.020
Token Mitt
Token ID16627
Feature activation+1.29e+00

Pos logit contributions
 Amen+0.033
 robots+0.033
oral+0.033
 Mechan+0.032
 rockets+0.032
Neg logit contributions
 Sophia-0.014
 Sandy-0.013
line-0.013
ill-0.012
 Sasha-0.011
Token the
Token ID262
Feature activation+8.21e-01

Pos logit contributions
ill+0.012
cious+0.012
line+0.012
air+0.011
ax+0.010
Neg logit contributions
 robots-0.012
 moral-0.011
 puzzles-0.011
 cartoons-0.011
 rockets-0.011
Token floor
Token ID4314
Feature activation-6.05e-02

Pos logit contributions
 Amen+0.020
 rockets+0.019
 robots+0.019
 novels+0.019
 trucks+0.018
Neg logit contributions
ill-0.024
line-0.024
air-0.023
cious-0.022
 Sophia-0.022
Token and
Token ID290
Feature activation-2.54e-01

Pos logit contributions
ill+0.001
line+0.001
cious+0.001
air+0.001
 Sophia+0.001
Neg logit contributions
 robots-0.002
 rockets-0.002
 puzzles-0.002
 trucks-0.002
 moral-0.002
Token hugged
Token ID41130
Feature activation+3.83e-01

Pos logit contributions
ill+0.006
line+0.006
 Sophia+0.006
air+0.006
cious+0.006
Neg logit contributions
 robots-0.008
 Amen-0.007
 rockets-0.007
 moral-0.007
 novels-0.007
Token the
Token ID262
Feature activation+1.49e+00

Pos logit contributions
 robots+0.009
 moral+0.008
 nurses+0.008
 puzzles+0.008
 novels+0.008
Neg logit contributions
ill-0.012
cious-0.012
line-0.011
air-0.011
ipples-0.010
Token er
Token ID1931
Feature activation+5.94e+00

Pos logit contributions
 rockets+0.030
 Amen+0.030
 robots+0.030
 novels+0.029
 trucks+0.028
Neg logit contributions
ill-0.049
line-0.048
air-0.046
cious-0.046
 Sophia-0.046
Tokenaser
Token ID6005
Feature activation+3.07e+00

Pos logit contributions
 robots+0.112
 rockets+0.108
 trucks+0.105
 Amen+0.102
 moral+0.101
Neg logit contributions
ill-0.215
line-0.205
air-0.201
cious-0.194
 Sophia-0.192
Token tight
Token ID5381
Feature activation+1.22e+00

Pos logit contributions
 robots+0.107
 rockets+0.101
 trucks+0.100
 Mechan+0.100
 puzzles+0.098
Neg logit contributions
ill-0.062
line-0.057
cious-0.054
air-0.054
 Sophia-0.053
Token.
Token ID13
Feature activation+6.34e-02

Pos logit contributions
 robots+0.039
 rockets+0.037
 puzzles+0.037
 trucks+0.036
 novels+0.036
Neg logit contributions
ill-0.027
line-0.026
cious-0.024
air-0.024
 Sophia-0.024
Token The
Token ID383
Feature activation+1.43e+00

Pos logit contributions
 robots+0.002
 rockets+0.002
 moral+0.002
 puzzles+0.002
 novels+0.002
Neg logit contributions
ill-0.002
line-0.002
cious-0.001
air-0.001
 Sophia-0.001
Token er
Token ID1931
Feature activation+5.49e+00

Pos logit contributions
 Amen+0.033
 rockets+0.032
 novels+0.029
 robots+0.029
 trucks+0.029
Neg logit contributions
ill-0.044
line-0.043
 Sophia-0.042
air-0.042
cious-0.040
Token not
Token ID407
Feature activation-1.82e-01

Pos logit contributions
ill+0.006
line+0.006
 Sophia+0.006
air+0.005
cious+0.005
Neg logit contributions
 robots-0.006
 rockets-0.006
 trucks-0.005
 Mechan-0.005
 puzzles-0.005
Token want
Token ID765
Feature activation-7.12e-01

Pos logit contributions
cious+0.005
air+0.005
line+0.005
ill+0.005
 Sophia+0.005
Neg logit contributions
 robots-0.005
 moral-0.004
 nurses-0.004
 chores-0.004
 cartoons-0.004
Token to
Token ID284
Feature activation+4.20e-01

Pos logit contributions
ill+0.016
 Sophia+0.014
line+0.014
air+0.014
cious+0.013
Neg logit contributions
 robots-0.023
 Amen-0.022
 Mechan-0.022
 rockets-0.022
 moral-0.021
Token share
Token ID2648
Feature activation-7.39e-01

Pos logit contributions
 robots+0.012
 puzzles+0.011
 moral+0.011
 trucks+0.011
 rockets+0.010
Neg logit contributions
ill-0.011
cious-0.011
line-0.011
 Sophia-0.010
air-0.010
Token her
Token ID607
Feature activation+1.76e+00

Pos logit contributions
cious+0.020
ill+0.020
line+0.019
air+0.019
 Sophia+0.017
Neg logit contributions
 robots-0.021
 puzzles-0.020
 novels-0.019
 rockets-0.019
 trucks-0.019
Token er
Token ID1931
Feature activation+5.93e+00

Pos logit contributions
 Amen+0.035
 rockets+0.031
 robots+0.030
 nurses+0.029
 novels+0.029
Neg logit contributions
ill-0.061
air-0.057
line-0.057
 Sophia-0.056
cious-0.056
Tokenaser
Token ID6005
Feature activation+3.25e+00

Pos logit contributions
 Amen+0.123
 robots+0.121
 rockets+0.121
 trucks+0.118
 nurses+0.113
Neg logit contributions
ill-0.204
air-0.190
line-0.186
cious-0.176
 Sophia-0.175
Token.
Token ID13
Feature activation-3.68e-01

Pos logit contributions
 robots+0.114
 rockets+0.109
 Mechan+0.107
 Amen+0.107
 trucks+0.106
Neg logit contributions
ill-0.064
line-0.059
air-0.055
cious-0.054
 Sophia-0.054
Token She
Token ID1375
Feature activation+8.29e-01

Pos logit contributions
ill+0.010
cious+0.009
line+0.009
air+0.009
 Sophia+0.008
Neg logit contributions
 robots-0.010
 moral-0.010
 rockets-0.009
 novels-0.009
 puzzles-0.009
Token liked
Token ID8288
Feature activation+5.72e-01

Pos logit contributions
 robots+0.024
 rockets+0.023
 novels+0.022
 puzzles+0.022
 Amen+0.022
Neg logit contributions
ill-0.021
line-0.019
cious-0.019
air-0.019
 Sophia-0.019
Token it
Token ID340
Feature activation+7.47e-01

Pos logit contributions
 Amen+0.021
 Mechan+0.019
oral+0.019
 moral+0.018
cut+0.018
Neg logit contributions
ill-0.011
 Sophia-0.011
line-0.010
air-0.010
 Sandy-0.010
Token fun
Token ID1257
Feature activation-4.19e-01

Pos logit contributions
 robots+0.004
 puzzles+0.003
 rockets+0.003
 moral+0.003
 novels+0.003
Neg logit contributions
ill-0.006
line-0.005
cious-0.005
air-0.005
 Sophia-0.005
Token.
Token ID13
Feature activation+3.30e-01

Pos logit contributions
ill+0.008
line+0.008
cious+0.008
air+0.007
 Sophia+0.007
Neg logit contributions
 robots-0.014
 rockets-0.014
 moral-0.013
 trucks-0.013
 puzzles-0.013
Token Lily
Token ID20037
Feature activation-1.09e+00

Pos logit contributions
 robots+0.010
 rockets+0.009
 puzzles+0.009
 trucks+0.009
 Mechan+0.009
Neg logit contributions
ill-0.008
line-0.008
 Sophia-0.008
cious-0.007
air-0.007
Token realized
Token ID6939
Feature activation+7.51e-01

Pos logit contributions
ill+0.025
line+0.024
cious+0.023
 Sophia+0.022
air+0.022
Neg logit contributions
 robots-0.034
 rockets-0.033
 novels-0.032
 puzzles-0.032
 trucks-0.032
Token that
Token ID326
Feature activation+2.96e-01

Pos logit contributions
 Mechan+0.027
 robots+0.026
 Amen+0.026
 rockets+0.024
 trucks+0.024
Neg logit contributions
 Sophia-0.015
line-0.014
 Sandy-0.014
ill-0.014
 Sasha-0.013
Token pic
Token ID8301
Feature activation-4.29e+00

Pos logit contributions
 robots+0.009
 Mechan+0.008
 Amen+0.008
 rockets+0.008
 trucks+0.008
Neg logit contributions
ill-0.007
line-0.007
 Sophia-0.007
cious-0.006
 Sandy-0.006
Tokenn
Token ID77
Feature activation-1.94e+00

Pos logit contributions
ill+0.117
line+0.114
cious+0.102
air+0.098
 Sophia+0.098
Neg logit contributions
 moral-0.121
 puzzles-0.111
oral-0.110
 chores-0.108
 robots-0.107
Tokenics
Token ID873
Feature activation-1.82e+00

Pos logit contributions
ill+0.048
line+0.046
cious+0.045
air+0.044
 Sophia+0.043
Neg logit contributions
 robots-0.057
 rockets-0.053
 moral-0.053
 puzzles-0.052
 trucks-0.052
Token were
Token ID547
Feature activation-4.33e-02

Pos logit contributions
ill+0.055
line+0.053
cious+0.051
air+0.051
 Sophia+0.050
Neg logit contributions
 robots-0.044
 rockets-0.041
 puzzles-0.041
 moral-0.040
 trucks-0.040
Token not
Token ID407
Feature activation-5.35e-01

Pos logit contributions
ill+0.001
line+0.001
 Sophia+0.001
cious+0.001
air+0.001
Neg logit contributions
 robots-0.001
 rockets-0.001
 trucks-0.001
 Mechan-0.001
 puzzles-0.001
Token so
Token ID523
Feature activation-4.68e-01

Pos logit contributions
ill+0.014
line+0.013
 Sophia+0.013
cious+0.013
air+0.012
Neg logit contributions
 robots-0.015
 rockets-0.014
 trucks-0.014
 moral-0.014
 puzzles-0.014
Token pick
Token ID2298
Feature activation-8.25e-01

Pos logit contributions
 robots+0.011
 rockets+0.011
 moral+0.011
oral+0.011
 nurses+0.011
Neg logit contributions
ill-0.007
line-0.007
cious-0.006
air-0.006
 Sophia-0.006
Token oranges
Token ID48389
Feature activation+4.54e-01

Pos logit contributions
ill+0.028
line+0.027
cious+0.027
air+0.024
 Sophia+0.024
Neg logit contributions
 robots-0.018
 automobiles-0.017
 puzzles-0.017
 magazines-0.017
 novels-0.017
Token.
Token ID13
Feature activation-9.71e-01

Pos logit contributions
 robots+0.014
 rockets+0.014
 puzzles+0.014
 trucks+0.013
 moral+0.013
Neg logit contributions
ill-0.010
line-0.010
cious-0.010
air-0.009
 Sophia-0.009
Token One
Token ID1881
Feature activation-9.32e-01

Pos logit contributions
ill+0.032
cious+0.031
line+0.030
air+0.029
 Sophia+0.027
Neg logit contributions
 robots-0.020
 novels-0.020
 moral-0.019
 automobiles-0.018
 rockets-0.018
Token day
Token ID1110
Feature activation-1.75e+00

Pos logit contributions
ill+0.038
line+0.038
cious+0.036
air+0.036
 Sophia+0.035
Neg logit contributions
 moral-0.012
 robots-0.011
 trucks-0.010
 novels-0.010
 automobiles-0.009
Token Lib
Token ID7980
Feature activation-4.24e+00

Pos logit contributions
ill+0.055
line+0.052
air+0.049
ipples+0.048
cious+0.047
Neg logit contributions
 novels-0.039
 puzzles-0.037
 robots-0.037
 moral-0.037
 trucks-0.036
Tokenby
Token ID1525
Feature activation-8.40e-01

Pos logit contributions
ill+0.107
line+0.106
cious+0.100
air+0.098
 Sophia+0.097
Neg logit contributions
 robots-0.120
 rockets-0.114
 moral-0.113
 puzzles-0.113
 novels-0.111
Token's
Token ID338
Feature activation-1.33e+00

Pos logit contributions
ill+0.019
line+0.017
cious+0.017
air+0.017
 Sophia+0.017
Neg logit contributions
 robots-0.027
 rockets-0.026
 novels-0.025
 trucks-0.025
 Amen-0.025
Token parents
Token ID3397
Feature activation-4.81e-01

Pos logit contributions
ill+0.033
line+0.032
 Sophia+0.032
cious+0.030
 Sandy+0.030
Neg logit contributions
 Amen-0.038
 robots-0.036
 rockets-0.035
 novels-0.034
 puzzles-0.034
Token decided
Token ID3066
Feature activation-9.39e-01

Pos logit contributions
ill+0.014
line+0.013
cious+0.012
air+0.012
 Sophia+0.012
Neg logit contributions
 robots-0.013
 rockets-0.012
 puzzles-0.012
 trucks-0.011
 novels-0.011
Token to
Token ID284
Feature activation-1.04e+00

Pos logit contributions
ill+0.025
line+0.023
air+0.023
cious+0.023
 Sophia+0.022
Neg logit contributions
 robots-0.026
 rockets-0.024
 puzzles-0.024
 trucks-0.024
 moral-0.024
Token Jane
Token ID12091
Feature activation-2.97e+00

Pos logit contributions
ill+0.045
line+0.043
 Sophia+0.041
cious+0.040
air+0.040
Neg logit contributions
 robots-0.030
 rockets-0.029
 puzzles-0.027
 trucks-0.026
 novels-0.025
Token hopped
Token ID45230
Feature activation-2.86e-01

Pos logit contributions
ill+0.075
line+0.068
 Sophia+0.067
cious+0.067
air+0.066
Neg logit contributions
 robots-0.091
 rockets-0.085
 puzzles-0.083
 moral-0.083
 Mechan-0.082
Token on
Token ID319
Feature activation-7.61e-01

Pos logit contributions
ill+0.006
line+0.006
 Sophia+0.005
air+0.005
cious+0.005
Neg logit contributions
 robots-0.010
 rockets-0.009
 Mechan-0.009
 trucks-0.009
 puzzles-0.009
Token the
Token ID262
Feature activation-1.28e+00

Pos logit contributions
ill+0.019
line+0.018
 Sophia+0.018
air+0.017
cious+0.017
Neg logit contributions
 robots-0.022
 rockets-0.021
 Mechan-0.021
 Amen-0.021
 trucks-0.021
Token sees
Token ID7224
Feature activation+2.18e-02

Pos logit contributions
line+0.029
ill+0.028
cious+0.028
air+0.028
 Sophia+0.027
Neg logit contributions
 robots-0.038
 Amen-0.037
 Mechan-0.037
oral-0.036
 rockets-0.036
Tokenaw
Token ID707
Feature activation-4.21e+00

Pos logit contributions
 Amen+0.001
 Mechan+0.001
 robots+0.001
 moral+0.001
oral+0.001
Neg logit contributions
ill-0.001
 Sophia-0.001
line-0.001
air-0.001
cious-0.001
Token and
Token ID290
Feature activation-1.19e+00

Pos logit contributions
ill+0.099
cious+0.098
air+0.097
 Sophia+0.094
 Maggie+0.092
Neg logit contributions
 robots-0.100
 trucks-0.100
 rockets-0.095
ahs-0.094
 moral-0.092
Token before
Token ID878
Feature activation-8.95e-01

Pos logit contributions
ill+0.031
line+0.030
cious+0.029
air+0.028
 Sophia+0.028
Neg logit contributions
 robots-0.033
 rockets-0.031
 puzzles-0.031
 moral-0.031
 trucks-0.031
Token long
Token ID890
Feature activation+1.30e-01

Pos logit contributions
ill+0.023
line+0.022
 Sophia+0.021
cious+0.020
air+0.020
Neg logit contributions
 robots-0.026
 rockets-0.025
 moral-0.024
 puzzles-0.024
 trucks-0.024
Token she
Token ID673
Feature activation-6.02e-02

Pos logit contributions
 robots+0.004
 rockets+0.004
 Amen+0.004
 Mechan+0.004
 moral+0.004
Neg logit contributions
ill-0.003
line-0.003
 Sophia-0.003
cious-0.002
air-0.002
Token was
Token ID373
Feature activation-3.12e-01

Pos logit contributions
ill+0.002
line+0.001
air+0.001
cious+0.001
 Sophia+0.001
Neg logit contributions
 robots-0.002
 rockets-0.002
 puzzles-0.002
 trucks-0.002
 novels-0.002
Token children
Token ID1751
Feature activation+1.51e-01

Pos logit contributions
ill+0.039
line+0.037
 Sophia+0.037
cious+0.036
air+0.035
Neg logit contributions
 robots-0.028
 Amen-0.026
 Mechan-0.026
 moral-0.025
 rockets-0.025
Token running
Token ID2491
Feature activation+2.82e-01

Pos logit contributions
 robots+0.004
 rockets+0.004
 puzzles+0.004
 trucks+0.004
 novels+0.004
Neg logit contributions
ill-0.004
line-0.004
air-0.004
cious-0.004
 Sophia-0.004
Token around
Token ID1088
Feature activation-9.94e-02

Pos logit contributions
 robots+0.010
 rockets+0.009
 moral+0.009
 puzzles+0.009
 trucks+0.009
Neg logit contributions
ill-0.006
line-0.005
cious-0.005
air-0.005
 Sophia-0.005
Token,
Token ID11
Feature activation-5.88e-01

Pos logit contributions
ill+0.002
line+0.002
cious+0.002
air+0.002
 Sophia+0.002
Neg logit contributions
 robots-0.003
 rockets-0.003
 puzzles-0.003
 trucks-0.003
 moral-0.003
Token families
Token ID4172
Feature activation-6.07e-01

Pos logit contributions
ill+0.017
air+0.016
line+0.015
cious+0.015
 Sophia+0.015
Neg logit contributions
 robots-0.016
 trucks-0.015
 rockets-0.015
 puzzles-0.014
 moral-0.014
Token pic
Token ID8301
Feature activation-4.19e+00

Pos logit contributions
ill+0.014
line+0.014
cious+0.013
air+0.013
 Sophia+0.013
Neg logit contributions
 robots-0.019
 rockets-0.018
 trucks-0.017
 puzzles-0.017
 moral-0.017
Tokenn
Token ID77
Feature activation-1.50e+00

Pos logit contributions
ill+0.134
line+0.119
 Sophia+0.112
ive+0.110
ax+0.108
Neg logit contributions
 moral-0.096
 robots-0.093
 trucks-0.093
lf-0.092
 puzzles-0.092
Tokenicking
Token ID7958
Feature activation-4.68e-02

Pos logit contributions
ill+0.048
line+0.046
cious+0.045
air+0.045
 Sophia+0.045
Neg logit contributions
 robots-0.033
 rockets-0.030
 trucks-0.029
 moral-0.029
 puzzles-0.029
Token,
Token ID11
Feature activation-9.17e-01

Pos logit contributions
ill+0.001
line+0.001
cious+0.001
air+0.001
 Sophia+0.001
Neg logit contributions
 robots-0.002
 Mechan-0.001
 moral-0.001
 Amen-0.001
 rockets-0.001
Token couples
Token ID11886
Feature activation-9.12e-01

Pos logit contributions
ill+0.023
line+0.022
air+0.021
cious+0.021
 Sophia+0.020
Neg logit contributions
 robots-0.027
 rockets-0.026
 trucks-0.025
 puzzles-0.025
 moral-0.025
Token st
Token ID336
Feature activation-1.22e-01

Pos logit contributions
ill+0.022
line+0.021
cious+0.020
air+0.020
 Sophia+0.020
Neg logit contributions
 robots-0.027
 rockets-0.025
 moral-0.025
 Amen-0.025
 novels-0.025
Token went
Token ID1816
Feature activation+1.63e-02

Pos logit contributions
 robots+0.017
 rockets+0.016
 novels+0.016
 Amen+0.016
 moral+0.016
Neg logit contributions
ill-0.018
line-0.017
cious-0.016
 Sophia-0.016
air-0.016
Token on
Token ID319
Feature activation-9.60e-01

Pos logit contributions
 robots+0.000
 rockets+0.000
 puzzles+0.000
 trucks+0.000
 automobiles+0.000
Neg logit contributions
ill-0.001
line-0.000
air-0.000
cious-0.000
 Sophia-0.000
Token adventures
Token ID17545
Feature activation-3.56e-02

Pos logit contributions
ill+0.033
line+0.032
cious+0.031
air+0.031
 Sophia+0.030
Neg logit contributions
 robots-0.019
 rockets-0.018
 moral-0.017
 puzzles-0.017
 trucks-0.017
Token,
Token ID11
Feature activation-9.76e-01

Pos logit contributions
ill+0.001
line+0.001
air+0.001
cious+0.001
 Sophia+0.001
Neg logit contributions
 robots-0.001
 rockets-0.001
 puzzles-0.001
 trucks-0.001
 novels-0.001
Token had
Token ID550
Feature activation-8.23e-01

Pos logit contributions
air+0.030
ill+0.030
line+0.027
cious+0.027
 Sophia+0.027
Neg logit contributions
 robots-0.025
 rockets-0.024
 trucks-0.024
 puzzles-0.023
 unic-0.021
Token pic
Token ID8301
Feature activation-4.15e+00

Pos logit contributions
line+0.034
ill+0.033
air+0.033
cious+0.031
 Sophia+0.030
Neg logit contributions
 robots-0.012
 puzzles-0.011
 rockets-0.011
 trucks-0.010
 magnets-0.009
Tokenn
Token ID77
Feature activation-1.28e+00

Pos logit contributions
ill+0.106
line+0.104
 Sophia+0.095
air+0.094
cious+0.092
Neg logit contributions
 moral-0.116
 robots-0.112
 puzzles-0.111
 rockets-0.108
 chores-0.108
Tokenics
Token ID873
Feature activation-1.52e+00

Pos logit contributions
ill+0.041
line+0.039
cious+0.038
air+0.037
 Sophia+0.037
Neg logit contributions
 robots-0.029
 rockets-0.027
 puzzles-0.026
 moral-0.026
 trucks-0.026
Token,
Token ID11
Feature activation-1.04e+00

Pos logit contributions
ill+0.039
line+0.039
air+0.038
cious+0.037
 Sophia+0.036
Neg logit contributions
 robots-0.043
 rockets-0.041
 puzzles-0.041
 novels-0.039
 trucks-0.039
Token and
Token ID290
Feature activation-9.01e-01

Pos logit contributions
ill+0.028
air+0.028
line+0.027
cious+0.026
 Sophia+0.025
Neg logit contributions
 robots-0.030
 rockets-0.029
 puzzles-0.028
 trucks-0.027
 magnets-0.026
Token even
Token ID772
Feature activation-2.08e-01

Pos logit contributions
ill+0.028
line+0.026
air+0.026
cious+0.025
 Sophia+0.025
Neg logit contributions
 robots-0.022
 rockets-0.021
 puzzles-0.020
 trucks-0.020
 moral-0.019
Token even
Token ID772
Feature activation+1.39e-01

Pos logit contributions
ill+0.020
line+0.018
 Sophia+0.017
air+0.016
cious+0.016
Neg logit contributions
 robots-0.039
 Mechan-0.037
 rockets-0.037
 nurses-0.036
 novels-0.036
Token see
Token ID766
Feature activation-3.00e-01

Pos logit contributions
 robots+0.004
 puzzles+0.004
 rockets+0.004
 trucks+0.004
 moral+0.004
Neg logit contributions
ill-0.003
line-0.003
cious-0.003
air-0.003
 Sophia-0.003
Token other
Token ID584
Feature activation-1.10e+00

Pos logit contributions
ill+0.008
line+0.007
 Sophia+0.007
air+0.007
cious+0.007
Neg logit contributions
 robots-0.009
 rockets-0.008
 moral-0.008
 puzzles-0.008
 Mechan-0.008
Token people
Token ID661
Feature activation-1.06e+00

Pos logit contributions
ill+0.035
line+0.035
cious+0.033
air+0.033
 Sophia+0.032
Neg logit contributions
 robots-0.026
 puzzles-0.024
 rockets-0.024
 trucks-0.024
 novels-0.023
Token having
Token ID1719
Feature activation+2.78e-01

Pos logit contributions
ill+0.029
cious+0.027
air+0.027
 Sophia+0.027
line+0.026
Neg logit contributions
 robots-0.028
 rockets-0.027
 trucks-0.026
 puzzles-0.026
 cartoons-0.025
Token pic
Token ID8301
Feature activation-4.13e+00

Pos logit contributions
 robots+0.006
 rockets+0.006
 puzzles+0.006
 trucks+0.006
 novels+0.006
Neg logit contributions
ill-0.009
line-0.008
air-0.008
cious-0.008
 Sophia-0.008
Tokenn
Token ID77
Feature activation-1.10e+00

Pos logit contributions
ill+0.122
line+0.119
 Sophia+0.110
air+0.107
ive+0.103
Neg logit contributions
 moral-0.099
 chores-0.097
 puzzles-0.095
 robots-0.089
oral-0.088
Tokenics
Token ID873
Feature activation-1.64e+00

Pos logit contributions
ill+0.037
line+0.036
cious+0.035
air+0.035
 Sophia+0.034
Neg logit contributions
 robots-0.023
 rockets-0.021
 puzzles-0.020
 trucks-0.020
 moral-0.020
Token.
Token ID13
Feature activation-2.28e-01

Pos logit contributions
ill+0.042
air+0.042
line+0.042
cious+0.041
 Sophia+0.039
Neg logit contributions
 robots-0.044
 puzzles-0.043
 rockets-0.043
 trucks-0.041
 novels-0.041
Token
Token ID198
Feature activation-1.61e-01

Pos logit contributions
line+0.006
cious+0.006
ill+0.006
air+0.006
 Sophia+0.005
Neg logit contributions
 robots-0.006
 moral-0.005
 magnets-0.005
 rockets-0.005
 puzzles-0.005
Token
Token ID198
Feature activation-6.80e-01

Pos logit contributions
ill+0.004
line+0.004
 Sophia+0.004
cious+0.004
air+0.004
Neg logit contributions
 robots-0.005
 rockets-0.005
 trucks-0.005
 puzzles-0.005
 novels-0.005
Token a
Token ID257
Feature activation+9.92e-01

Pos logit contributions
ill+0.004
 Sophia+0.004
cious+0.004
line+0.004
air+0.004
Neg logit contributions
 robots-0.005
 trucks-0.004
 Mechan-0.004
 rockets-0.004
 nurses-0.004
Token fountain
Token ID29420
Feature activation-3.06e+00

Pos logit contributions
 robots+0.024
 rockets+0.022
 puzzles+0.021
 trucks+0.021
 moral+0.021
Neg logit contributions
ill-0.030
line-0.029
cious-0.028
air-0.028
 Sophia-0.028
Token.
Token ID13
Feature activation+1.09e-01

Pos logit contributions
ill+0.075
line+0.069
cious+0.068
 Sophia+0.067
air+0.067
Neg logit contributions
 robots-0.091
 rockets-0.086
 trucks-0.085
 moral-0.084
 puzzles-0.084
Token Near
Token ID20173
Feature activation-1.63e+00

Pos logit contributions
 robots+0.003
 rockets+0.003
 novels+0.003
 puzzles+0.003
 magnets+0.003
Neg logit contributions
line-0.003
ill-0.003
cious-0.003
air-0.003
ax-0.003
Token the
Token ID262
Feature activation-1.50e+00

Pos logit contributions
ill+0.039
air+0.035
 Sophia+0.035
line+0.035
cious+0.032
Neg logit contributions
 robots-0.051
 Amen-0.048
 Mechan-0.048
 moral-0.048
 rockets-0.047
Token fountain
Token ID29420
Feature activation-4.10e+00

Pos logit contributions
ill+0.043
cious+0.040
air+0.037
ax+0.036
 Sophia+0.036
Neg logit contributions
 moral-0.041
 rockets-0.037
 robots-0.036
 trucks-0.036
oral-0.035
Token stood
Token ID6204
Feature activation-6.26e-01

Pos logit contributions
ill+0.103
cious+0.096
line+0.095
air+0.095
 Sophia+0.092
Neg logit contributions
 robots-0.117
 rockets-0.112
 trucks-0.110
 moral-0.110
 puzzles-0.108
Token a
Token ID257
Feature activation-9.12e-01

Pos logit contributions
ill+0.017
line+0.015
cious+0.015
air+0.015
 Sophia+0.015
Neg logit contributions
 robots-0.018
 rockets-0.017
 moral-0.017
 puzzles-0.016
 trucks-0.016
Token little
Token ID1310
Feature activation-8.82e-01

Pos logit contributions
ill+0.024
line+0.023
cious+0.022
air+0.022
 Sophia+0.022
Neg logit contributions
 robots-0.026
 rockets-0.024
 puzzles-0.024
 trucks-0.024
 novels-0.023
Token girl
Token ID2576
Feature activation-2.66e-01

Pos logit contributions
ill+0.026
line+0.024
cious+0.023
 Sophia+0.023
air+0.023
Neg logit contributions
 robots-0.023
 rockets-0.021
 trucks-0.021
 puzzles-0.021
 moral-0.020
Token.
Token ID13
Feature activation-4.94e-02

Pos logit contributions
ill+0.009
line+0.009
cious+0.008
air+0.008
 Sophia+0.008
Neg logit contributions
 robots-0.005
 rockets-0.005
 puzzles-0.005
 trucks-0.005
 moral-0.005
Token the
Token ID262
Feature activation-9.80e-01

Pos logit contributions
line+0.021
ill+0.021
 Sophia+0.019
cious+0.017
air+0.017
Neg logit contributions
 robots-0.032
 Mechan-0.032
 Amen-0.032
oral-0.032
 moral-0.031
Token slide
Token ID10649
Feature activation-1.79e+00

Pos logit contributions
line+0.022
ill+0.022
 Sophia+0.021
cious+0.020
air+0.020
Neg logit contributions
 Amen-0.030
 Mechan-0.029
 robots-0.028
oral-0.027
 novels-0.027
Token and
Token ID290
Feature activation-1.49e+00

Pos logit contributions
ill+0.039
line+0.037
air+0.034
 Sophia+0.033
cious+0.033
Neg logit contributions
 robots-0.059
 rockets-0.056
 Amen-0.056
 moral-0.055
 puzzles-0.055
Token the
Token ID262
Feature activation-1.08e+00

Pos logit contributions
line+0.030
 Sophia+0.030
ill+0.030
air+0.025
 Sandy+0.024
Neg logit contributions
 Amen-0.050
oral-0.050
 Mechan-0.050
 robots-0.050
 moral-0.048
Token sees
Token ID7224
Feature activation+1.89e-01

Pos logit contributions
line+0.022
ill+0.021
 Sophia+0.021
cious+0.020
air+0.019
Neg logit contributions
 Amen-0.036
 Mechan-0.034
 robots-0.033
oral-0.033
 novels-0.032
Tokenaw
Token ID707
Feature activation-4.09e+00

Pos logit contributions
 Amen+0.006
 robots+0.006
 Mechan+0.006
oral+0.006
 moral+0.006
Neg logit contributions
ill-0.004
 Sophia-0.004
line-0.004
air-0.004
cious-0.003
Token.
Token ID13
Feature activation-5.39e-01

Pos logit contributions
ill+0.107
cious+0.106
 Sophia+0.104
 Maggie+0.101
air+0.101
Neg logit contributions
 trucks-0.097
 robots-0.092
 rockets-0.090
 automobiles-0.086
ahs-0.085
Token They
Token ID1119
Feature activation-5.26e-01

Pos logit contributions
ill+0.012
line+0.011
cious+0.011
air+0.011
 Sophia+0.011
Neg logit contributions
 robots-0.017
 rockets-0.017
 puzzles-0.016
 moral-0.016
 trucks-0.016
Token had
Token ID550
Feature activation-8.04e-01

Pos logit contributions
ill+0.011
line+0.010
 Sophia+0.009
air+0.009
cious+0.009
Neg logit contributions
 robots-0.018
 rockets-0.017
 novels-0.017
 Mechan-0.017
 puzzles-0.017
Token a
Token ID257
Feature activation-1.64e+00

Pos logit contributions
ill+0.026
line+0.025
air+0.024
cious+0.024
 Sophia+0.023
Neg logit contributions
 robots-0.019
 rockets-0.017
 puzzles-0.017
 trucks-0.017
 moral-0.017
Token lot
Token ID1256
Feature activation-6.20e-01

Pos logit contributions
ill+0.051
line+0.049
 Sophia+0.049
cious+0.049
air+0.047
Neg logit contributions
 robots-0.038
 rockets-0.034
 novels-0.033
 Amen-0.033
 puzzles-0.033
Token the
Token ID262
Feature activation-1.25e+00

Pos logit contributions
ill+0.026
line+0.024
cious+0.024
air+0.024
 Sophia+0.023
Neg logit contributions
 robots-0.031
 rockets-0.029
 trucks-0.029
 puzzles-0.029
 moral-0.028
Token slide
Token ID10649
Feature activation-1.95e+00

Pos logit contributions
line+0.025
ill+0.024
 Sophia+0.022
cious+0.022
air+0.020
Neg logit contributions
 Amen-0.043
 Mechan-0.041
 robots-0.039
oral-0.038
 nurses-0.037
Token and
Token ID290
Feature activation-1.06e+00

Pos logit contributions
ill+0.043
line+0.041
cious+0.040
air+0.039
 Sophia+0.038
Neg logit contributions
 robots-0.063
 rockets-0.059
 trucks-0.059
 puzzles-0.058
 moral-0.058
Token the
Token ID262
Feature activation-1.24e+00

Pos logit contributions
ill+0.023
line+0.022
 Sophia+0.021
cious+0.020
air+0.020
Neg logit contributions
 robots-0.035
 Mechan-0.033
 Amen-0.033
 rockets-0.033
oral-0.032
Token sees
Token ID7224
Feature activation+2.68e-01

Pos logit contributions
line+0.024
ill+0.022
 Sophia+0.021
cious+0.021
air+0.019
Neg logit contributions
 Amen-0.043
 Mechan-0.041
 robots-0.040
oral-0.039
 novels-0.038
Tokenaw
Token ID707
Feature activation-4.04e+00

Pos logit contributions
 Amen+0.011
 robots+0.011
 Mechan+0.011
 rockets+0.010
 nurses+0.010
Neg logit contributions
ill-0.003
line-0.003
 Sophia-0.003
air-0.003
cious-0.003
Token.
Token ID13
Feature activation-4.80e-01

Pos logit contributions
ill+0.118
cious+0.116
 Sophia+0.115
iza+0.115
 Maggie+0.115
Neg logit contributions
 trucks-0.085
 rockets-0.077
ahs-0.075
 automobiles-0.074
 robots-0.073
Token They
Token ID1119
Feature activation-5.11e-01

Pos logit contributions
ill+0.012
line+0.012
cious+0.011
air+0.011
 Sophia+0.011
Neg logit contributions
 robots-0.014
 rockets-0.013
 trucks-0.013
 moral-0.013
 puzzles-0.013
Token had
Token ID550
Feature activation-8.38e-01

Pos logit contributions
ill+0.012
line+0.011
 Sophia+0.011
cious+0.010
air+0.010
Neg logit contributions
 robots-0.017
 novels-0.016
 rockets-0.016
 Mechan-0.016
 Amen-0.015
Token a
Token ID257
Feature activation-1.48e+00

Pos logit contributions
ill+0.028
line+0.027
air+0.026
cious+0.025
 Sophia+0.025
Neg logit contributions
 robots-0.018
 rockets-0.017
 puzzles-0.017
 trucks-0.017
 moral-0.017
Token lot
Token ID1256
Feature activation-4.37e-01

Pos logit contributions
ill+0.047
line+0.045
cious+0.045
 Sophia+0.045
air+0.043
Neg logit contributions
 robots-0.034
 rockets-0.030
 novels-0.029
 trucks-0.029
 puzzles-0.029
Token."
Token ID526
Feature activation-1.59e+00

Pos logit contributions
 robots+0.017
 Mechan+0.016
 moral+0.016
 Amen+0.016
 rockets+0.016
Neg logit contributions
ill-0.010
line-0.010
air-0.010
cious-0.009
 Sophia-0.009
Token
Token ID198
Feature activation-1.48e+00

Pos logit contributions
ill+0.045
cious+0.041
air+0.041
line+0.040
iza+0.038
Neg logit contributions
 robots-0.038
 moral-0.036
 automobiles-0.036
 trucks-0.035
oral-0.035
Token
Token ID198
Feature activation-2.11e+00

Pos logit contributions
ill+0.037
line+0.035
cious+0.034
air+0.034
 Sophia+0.033
Neg logit contributions
 robots-0.043
 rockets-0.041
 puzzles-0.040
 trucks-0.040
 moral-0.040
TokenThe
Token ID464
Feature activation+8.20e-01

Pos logit contributions
ill+0.048
line+0.043
air+0.040
 Sophia+0.040
cious+0.040
Neg logit contributions
 robots-0.067
 moral-0.066
 rockets-0.063
 trucks-0.062
oral-0.062
Token moral
Token ID6573
Feature activation+1.71e-02

Pos logit contributions
 robots+0.023
 puzzles+0.022
 Mechan+0.022
 novels+0.022
 rockets+0.021
Neg logit contributions
line-0.017
cious-0.017
 Sophia-0.016
 distance-0.016
ill-0.016
Token of
Token ID286
Feature activation-4.03e+00

Pos logit contributions
 robots+0.000
 rockets+0.000
 trucks+0.000
 puzzles+0.000
 Mechan+0.000
Neg logit contributions
ill-0.000
line-0.000
air-0.000
 Sophia-0.000
cious-0.000
Token the
Token ID262
Feature activation-1.24e+00

Pos logit contributions
ill+0.118
line+0.113
cious+0.107
air+0.106
 Sophia+0.101
Neg logit contributions
 robots-0.100
 moral-0.100
 rockets-0.095
 Amen-0.094
 trucks-0.093
Token story
Token ID1621
Feature activation-5.33e-01

Pos logit contributions
ill+0.046
line+0.045
cious+0.044
air+0.043
 Sophia+0.043
Neg logit contributions
 robots-0.021
 rockets-0.019
 puzzles-0.019
 trucks-0.018
 moral-0.018
Token is
Token ID318
Feature activation-7.05e-02

Pos logit contributions
ill+0.011
air+0.011
 Sophia+0.010
line+0.010
cious+0.009
Neg logit contributions
 robots-0.017
 rockets-0.016
 Mechan-0.016
 cartoons-0.016
 magnets-0.016
Token that
Token ID326
Feature activation+6.65e-01

Pos logit contributions
ill+0.002
line+0.002
cious+0.002
air+0.002
 Sophia+0.002
Neg logit contributions
 robots-0.002
 rockets-0.002
 moral-0.001
 puzzles-0.001
 trucks-0.001
Token it
Token ID340
Feature activation-9.05e-01

Pos logit contributions
 robots+0.022
 Mechan+0.020
 rockets+0.020
 Amen+0.020
 novels+0.020
Neg logit contributions
ill-0.013
line-0.013
cious-0.012
 Sophia-0.012
air-0.012
Token bake
Token ID28450
Feature activation-6.45e-01

Pos logit contributions
ill+0.008
 Sophia+0.008
line+0.008
 Maggie+0.007
air+0.007
Neg logit contributions
 Mechan-0.016
 robots-0.016
oral-0.016
 rockets-0.015
 novels-0.015
Token.
Token ID13
Feature activation+3.47e-01

Pos logit contributions
ill+0.016
line+0.015
cious+0.014
air+0.014
 Sophia+0.014
Neg logit contributions
 robots-0.019
 rockets-0.018
 puzzles-0.018
 moral-0.018
 trucks-0.017
Token One
Token ID1881
Feature activation-2.37e-01

Pos logit contributions
 robots+0.012
 rockets+0.011
 puzzles+0.011
 moral+0.011
 Mechan+0.011
Neg logit contributions
ill-0.007
 Sophia-0.007
line-0.007
air-0.006
cious-0.006
Token day
Token ID1110
Feature activation+4.54e-01

Pos logit contributions
ill+0.006
line+0.006
 Sophia+0.006
 Maggie+0.006
air+0.005
Neg logit contributions
 robots-0.007
 Mechan-0.007
 rockets-0.006
 Amen-0.006
 automobiles-0.006
Token,
Token ID11
Feature activation-8.24e-02

Pos logit contributions
 robots+0.015
 rockets+0.015
 moral+0.014
 puzzles+0.014
 trucks+0.014
Neg logit contributions
ill-0.009
line-0.009
 Sophia-0.008
cious-0.008
air-0.008
Token Lib
Token ID7980
Feature activation-4.02e+00

Pos logit contributions
ill+0.002
line+0.002
 Sophia+0.002
cious+0.002
air+0.002
Neg logit contributions
 robots-0.002
 moral-0.002
 rockets-0.002
 Amen-0.002
 Mechan-0.002
Tokenby
Token ID1525
Feature activation-1.80e-01

Pos logit contributions
ill+0.102
line+0.100
cious+0.094
air+0.092
 Sophia+0.091
Neg logit contributions
 robots-0.114
 rockets-0.109
 moral-0.108
 puzzles-0.107
 trucks-0.106
Token's
Token ID338
Feature activation-8.08e-01

Pos logit contributions
ill+0.004
cious+0.004
air+0.003
 Sophia+0.003
line+0.003
Neg logit contributions
 robots-0.006
 rockets-0.006
 automobiles-0.006
 novels-0.006
 cartoons-0.006
Token mom
Token ID1995
Feature activation-2.51e-02

Pos logit contributions
ill+0.017
 Sophia+0.017
cious+0.016
line+0.016
air+0.015
Neg logit contributions
 Amen-0.025
 Mechan-0.024
 robots-0.023
 rockets-0.022
 trucks-0.021
Tokenmy
Token ID1820
Feature activation+1.09e+00

Pos logit contributions
ill+0.001
line+0.000
 Sophia+0.000
ax+0.000
air+0.000
Neg logit contributions
 rockets-0.001
 robots-0.001
 novels-0.001
 puzzles-0.001
 automobiles-0.001
Token brought
Token ID3181
Feature activation-1.67e+00

Pos logit contributions
 robots+0.037
 novels+0.036
 rockets+0.036
 puzzles+0.035
 automobiles+0.034
Neg logit contributions
ill-0.024
line-0.021
ax-0.020
 Sophia-0.020
cious-0.019
Token roof
Token ID9753
Feature activation+1.85e-01

Pos logit contributions
ill+0.003
line+0.003
cious+0.003
air+0.003
 Sophia+0.003
Neg logit contributions
 robots-0.003
 rockets-0.002
 puzzles-0.002
 trucks-0.002
 moral-0.002
Token.
Token ID13
Feature activation-2.73e-01

Pos logit contributions
 robots+0.005
 rockets+0.005
 trucks+0.004
 puzzles+0.004
 automobiles+0.004
Neg logit contributions
cious-0.005
 Sophia-0.005
ill-0.005
air-0.005
line-0.005
Token They
Token ID1119
Feature activation+1.29e-01

Pos logit contributions
ill+0.007
line+0.007
cious+0.007
air+0.007
 Sophia+0.006
Neg logit contributions
 robots-0.008
 rockets-0.007
 puzzles-0.007
 moral-0.007
 trucks-0.007
Token would
Token ID561
Feature activation-1.63e-01

Pos logit contributions
 robots+0.003
 rockets+0.003
 novels+0.003
 moral+0.003
 Amen+0.003
Neg logit contributions
ill-0.004
line-0.003
cious-0.003
 Sophia-0.003
air-0.003
Token have
Token ID423
Feature activation-1.22e+00

Pos logit contributions
ill+0.004
line+0.004
air+0.004
 Sophia+0.004
cious+0.004
Neg logit contributions
 robots-0.005
 rockets-0.004
 novels-0.004
 Mechan-0.004
 moral-0.004
Token pic
Token ID8301
Feature activation-4.00e+00

Pos logit contributions
line+0.050
ill+0.048
air+0.047
cious+0.045
atter+0.043
Neg logit contributions
 puzzles-0.019
 robots-0.019
 rockets-0.019
 novels-0.017
 magnets-0.016
Tokenn
Token ID77
Feature activation-1.16e+00

Pos logit contributions
ill+0.100
line+0.099
 Sophia+0.089
cious+0.087
air+0.087
Neg logit contributions
 moral-0.114
 puzzles-0.111
 robots-0.109
 chores-0.107
 rockets-0.106
Tokenics
Token ID873
Feature activation-1.71e+00

Pos logit contributions
ill+0.038
line+0.037
cious+0.036
air+0.036
 Sophia+0.035
Neg logit contributions
 robots-0.025
 rockets-0.023
 puzzles-0.022
 trucks-0.022
 moral-0.022
Token and
Token ID290
Feature activation-4.14e-01

Pos logit contributions
line+0.054
air+0.054
ill+0.053
cious+0.052
ax+0.049
Neg logit contributions
 puzzles-0.038
 robots-0.038
 rockets-0.037
 novels-0.036
 trucks-0.034
Token play
Token ID711
Feature activation+1.01e+00

Pos logit contributions
ill+0.010
line+0.010
cious+0.009
 Sophia+0.009
air+0.009
Neg logit contributions
 robots-0.012
 rockets-0.011
 moral-0.011
 puzzles-0.011
 novels-0.011
Token games
Token ID1830
Feature activation-1.21e+00

Pos logit contributions
 puzzles+0.022
 robots+0.022
 rockets+0.021
 novels+0.021
 cartoons+0.020
Neg logit contributions
ill-0.034
line-0.034
air-0.033
cious-0.032
ax-0.032
Token
Token ID198
Feature activation-1.24e+00

Pos logit contributions
 robots+0.046
 rockets+0.043
 moral+0.043
 novels+0.043
 puzzles+0.043
Neg logit contributions
ill-0.042
line-0.041
cious-0.040
air-0.039
 Sophia-0.038
Token
Token ID198
Feature activation-1.84e+00

Pos logit contributions
ill+0.033
line+0.031
cious+0.030
air+0.029
 Sophia+0.028
Neg logit contributions
 robots-0.033
 moral-0.032
 novels-0.031
 rockets-0.031
 puzzles-0.031
TokenThe
Token ID464
Feature activation+5.40e-01

Pos logit contributions
ill+0.044
line+0.040
cious+0.037
air+0.036
 Sophia+0.035
Neg logit contributions
 moral-0.056
 robots-0.055
 novels-0.053
 puzzles-0.053
oral-0.052
Token next
Token ID1306
Feature activation-2.16e+00

Pos logit contributions
 Mechan+0.017
 rockets+0.016
 robots+0.016
 infected+0.016
 nurses+0.016
Neg logit contributions
 Sophia-0.010
cious-0.009
ill-0.009
air-0.009
line-0.009
Token day
Token ID1110
Feature activation-2.59e+00

Pos logit contributions
ill+0.070
line+0.067
air+0.066
 Sophia+0.066
cious+0.065
Neg logit contributions
 robots-0.048
 rockets-0.045
 puzzles-0.043
 trucks-0.042
 magnets-0.041
Token Lib
Token ID7980
Feature activation-3.99e+00

Pos logit contributions
ill+0.087
line+0.087
cious+0.084
ipples+0.084
air+0.083
Neg logit contributions
 moral-0.043
 novels-0.043
oral-0.039
 puzzles-0.038
 robots-0.036
Tokenby
Token ID1525
Feature activation-2.14e-01

Pos logit contributions
line+0.106
ill+0.105
cious+0.099
air+0.096
 Sophia+0.095
Neg logit contributions
 robots-0.107
 moral-0.102
 rockets-0.102
 puzzles-0.100
 novels-0.099
Token went
Token ID1816
Feature activation-3.04e-01

Pos logit contributions
ill+0.005
cious+0.004
air+0.004
line+0.004
 Sophia+0.004
Neg logit contributions
 robots-0.007
 rockets-0.007
 Amen-0.007
 trucks-0.007
 novels-0.007
Token to
Token ID284
Feature activation-5.22e-01

Pos logit contributions
ill+0.009
cious+0.009
line+0.008
air+0.008
 Sophia+0.007
Neg logit contributions
 robots-0.008
 trucks-0.007
 puzzles-0.007
 rockets-0.007
 moral-0.007
Token the
Token ID262
Feature activation-5.25e-01

Pos logit contributions
ill+0.013
line+0.012
 Sophia+0.012
air+0.012
cious+0.011
Neg logit contributions
 robots-0.016
 rockets-0.015
 moral-0.014
 trucks-0.014
 puzzles-0.014
Token kitchen
Token ID9592
Feature activation-1.53e+00

Pos logit contributions
ill+0.013
 Sophia+0.012
line+0.012
air+0.012
cious+0.012
Neg logit contributions
 robots-0.015
 rockets-0.015
 Amen-0.014
 trucks-0.014
 Mechan-0.014
Token
Token ID198
Feature activation-1.41e+00

Pos logit contributions
line+0.037
cious+0.037
ill+0.037
air+0.036
ax+0.034
Neg logit contributions
 robots-0.028
 moral-0.027
 novels-0.026
 puzzles-0.026
 rockets-0.025
Token
Token ID198
Feature activation-1.95e+00

Pos logit contributions
ill+0.036
line+0.036
cious+0.035
air+0.034
 Sophia+0.032
Neg logit contributions
 robots-0.038
 moral-0.037
 rockets-0.036
 puzzles-0.035
 novels-0.035
TokenOne
Token ID3198
Feature activation-1.23e+00

Pos logit contributions
ill+0.071
line+0.070
cious+0.068
air+0.067
 Sophia+0.065
Neg logit contributions
 robots-0.033
 moral-0.031
 rockets-0.029
 puzzles-0.029
 novels-0.029
Token day
Token ID1110
Feature activation-2.02e+00

Pos logit contributions
ill+0.049
line+0.048
cious+0.046
air+0.045
 Sophia+0.045
Neg logit contributions
 moral-0.018
 robots-0.018
 novels-0.015
 rockets-0.015
 trucks-0.015
Token,
Token ID11
Feature activation-1.95e+00

Pos logit contributions
line+0.064
ill+0.063
air+0.060
ax+0.058
cious+0.057
Neg logit contributions
 moral-0.040
 robots-0.039
 trucks-0.038
 puzzles-0.038
 novels-0.037
Token Sof
Token ID41427
Feature activation-3.98e+00

Pos logit contributions
ill+0.050
line+0.047
air+0.046
cious+0.045
 Sophia+0.042
Neg logit contributions
 robots-0.055
 moral-0.052
 puzzles-0.052
 rockets-0.052
 trucks-0.051
Tokenia
Token ID544
Feature activation-1.53e+00

Pos logit contributions
line+0.100
ill+0.092
cious+0.090
air+0.086
 Sophia+0.082
Neg logit contributions
 moral-0.113
 robots-0.108
oral-0.105
 rockets-0.105
 puzzles-0.104
Token was
Token ID373
Feature activation-1.00e-01

Pos logit contributions
ill+0.030
cious+0.027
line+0.026
air+0.026
 Sophia+0.025
Neg logit contributions
 robots-0.054
 rockets-0.052
 Amen-0.051
 novels-0.051
 Mechan-0.051
Token playing
Token ID2712
Feature activation+2.32e+00

Pos logit contributions
ill+0.003
line+0.003
cious+0.003
air+0.003
 Sophia+0.003
Neg logit contributions
 robots-0.002
 rockets-0.002
 puzzles-0.002
 trucks-0.002
 moral-0.002
Token with
Token ID351
Feature activation+1.16e-01

Pos logit contributions
 robots+0.084
 rockets+0.081
 Amen+0.080
 trucks+0.079
 moral+0.079
Neg logit contributions
ill-0.041
line-0.038
 Sophia-0.035
air-0.034
cious-0.034
Token her
Token ID607
Feature activation-6.22e-02

Pos logit contributions
oral+0.004
 Amen+0.004
 Mechan+0.004
 moral+0.004
 robots+0.004
Neg logit contributions
ill-0.002
 Sophia-0.002
 Sandy-0.002
line-0.002
 Maggie-0.002
Token it
Token ID340
Feature activation+5.43e-01

Pos logit contributions
 robots+0.039
 rockets+0.037
 puzzles+0.037
 trucks+0.036
 novels+0.036
Neg logit contributions
ill-0.034
line-0.031
cious-0.030
air-0.029
 Sophia-0.029
Token.
Token ID13
Feature activation-2.03e+00

Pos logit contributions
 robots+0.019
 rockets+0.018
 Mechan+0.018
 trucks+0.018
 moral+0.017
Neg logit contributions
ill-0.011
line-0.010
cious-0.009
air-0.009
 Sophia-0.009
Token
Token ID220
Feature activation+1.90e+00

Pos logit contributions
ill+0.055
line+0.051
cious+0.047
air+0.047
e+0.046
Neg logit contributions
 moral-0.052
oral-0.048
 Sometimes-0.048
 novels-0.045
ossible-0.045
Token
Token ID198
Feature activation-1.41e+00

Pos logit contributions
 robots+0.055
 rockets+0.052
 moral+0.051
 puzzles+0.051
 novels+0.050
Neg logit contributions
ill-0.046
line-0.045
cious-0.044
air-0.043
 Sophia-0.041
Token
Token ID198
Feature activation-2.78e+00

Pos logit contributions
ill+0.042
line+0.041
cious+0.040
air+0.039
 Sophia+0.036
Neg logit contributions
 moral-0.031
oral-0.029
 robots-0.028
 Amen-0.028
 novels-0.027
TokenS
Token ID50
Feature activation-3.97e+00

Pos logit contributions
ill+0.072
line+0.067
air+0.061
cious+0.057
owing+0.055
Neg logit contributions
 moral-0.083
oral-0.078
 chores-0.070
 novels-0.069
 Amen-0.068
Tokenasha
Token ID14715
Feature activation-1.06e+00

Pos logit contributions
line+0.101
cious+0.085
ill+0.081
 Mil+0.076
air+0.075
Neg logit contributions
oral-0.129
 moral-0.117
ossible-0.106
 chores-0.099
 novels-0.096
Token looked
Token ID3114
Feature activation+6.44e-01

Pos logit contributions
ill+0.026
air+0.024
cious+0.024
line+0.023
 Sophia+0.023
Neg logit contributions
 robots-0.033
 rockets-0.031
 novels-0.031
 trucks-0.030
 puzzles-0.030
Token out
Token ID503
Feature activation+6.45e-01

Pos logit contributions
 robots+0.024
 Mechan+0.023
 Amen+0.022
 rockets+0.022
 trucks+0.022
Neg logit contributions
ill-0.011
line-0.010
 Sophia-0.010
air-0.009
cious-0.009
Token of
Token ID286
Feature activation+6.98e-01

Pos logit contributions
 robots+0.021
 Mechan+0.020
 Amen+0.020
 trucks+0.020
 rockets+0.020
Neg logit contributions
ill-0.014
line-0.013
 Sophia-0.013
air-0.013
cious-0.012
Token her
Token ID607
Feature activation-1.29e+00

Pos logit contributions
 robots+0.023
 Mechan+0.023
 Amen+0.022
 rockets+0.022
oral+0.022
Neg logit contributions
ill-0.014
 Sophia-0.014
line-0.012
air-0.012
 Sandy-0.012
Token the
Token ID262
Feature activation-1.12e+00

Pos logit contributions
ill+0.023
air+0.022
cious+0.022
line+0.022
 Sophia+0.020
Neg logit contributions
 robots-0.030
 trucks-0.028
 rockets-0.028
 puzzles-0.028
 automobiles-0.026
Token slide
Token ID10649
Feature activation-1.98e+00

Pos logit contributions
line+0.024
ill+0.023
 Sophia+0.022
cious+0.022
air+0.021
Neg logit contributions
 Amen-0.037
 robots-0.035
 Mechan-0.035
 novels-0.033
 rockets-0.033
Token and
Token ID290
Feature activation-1.26e+00

Pos logit contributions
ill+0.046
line+0.043
cious+0.043
air+0.042
 Sophia+0.041
Neg logit contributions
 robots-0.061
 rockets-0.058
 trucks-0.058
 puzzles-0.057
 moral-0.056
Token the
Token ID262
Feature activation-1.14e+00

Pos logit contributions
ill+0.032
line+0.030
cious+0.029
air+0.028
 Sophia+0.028
Neg logit contributions
 robots-0.037
 rockets-0.035
 puzzles-0.035
 moral-0.034
 trucks-0.034
Token sees
Token ID7224
Feature activation+2.54e-01

Pos logit contributions
line+0.023
ill+0.022
 Sophia+0.021
cious+0.020
air+0.019
Neg logit contributions
 Amen-0.038
 robots-0.037
 Mechan-0.036
 novels-0.035
 rockets-0.035
Tokenaw
Token ID707
Feature activation-3.96e+00

Pos logit contributions
 robots+0.011
 Amen+0.011
 Mechan+0.010
 rockets+0.010
 puzzles+0.010
Neg logit contributions
ill-0.003
line-0.003
 Sophia-0.003
air-0.002
cious-0.002
Token.
Token ID13
Feature activation-3.17e-01

Pos logit contributions
ill+0.107
cious+0.106
 Sophia+0.104
 Maggie+0.104
iza+0.103
Neg logit contributions
 trucks-0.087
ahs-0.082
 robots-0.082
 rockets-0.080
 automobiles-0.079
Token They
Token ID1119
Feature activation-1.67e-01

Pos logit contributions
ill+0.008
line+0.008
cious+0.007
air+0.007
 Sophia+0.007
Neg logit contributions
 robots-0.009
 rockets-0.009
 puzzles-0.009
 trucks-0.009
 moral-0.009
Token also
Token ID635
Feature activation-5.04e-01

Pos logit contributions
ill+0.004
line+0.004
 Sophia+0.003
cious+0.003
air+0.003
Neg logit contributions
 robots-0.005
 novels-0.005
 rockets-0.005
 Mechan-0.005
 Amen-0.005
Token liked
Token ID8288
Feature activation-8.68e-01

Pos logit contributions
ill+0.011
line+0.011
 Sophia+0.010
cious+0.010
air+0.009
Neg logit contributions
 robots-0.017
 rockets-0.016
 novels-0.016
 Amen-0.016
 magnets-0.016
Token to
Token ID284
Feature activation-1.77e-01

Pos logit contributions
ill+0.020
line+0.019
 Sophia+0.018
cious+0.018
air+0.018
Neg logit contributions
 robots-0.027
 rockets-0.026
 moral-0.026
 Amen-0.025
 Mechan-0.025
Token the
Token ID262
Feature activation-1.43e+00

Pos logit contributions
ill+0.029
cious+0.027
line+0.027
air+0.027
 Sophia+0.025
Neg logit contributions
 robots-0.033
 trucks-0.031
 rockets-0.031
 puzzles-0.031
 moral-0.030
Token slide
Token ID10649
Feature activation-2.08e+00

Pos logit contributions
line+0.027
ill+0.026
cious+0.023
 Sophia+0.023
air+0.022
Neg logit contributions
 Amen-0.051
 robots-0.049
 Mechan-0.049
 nurses-0.047
 novels-0.046
Token and
Token ID290
Feature activation-1.10e+00

Pos logit contributions
ill+0.045
line+0.042
air+0.039
cious+0.039
 Sophia+0.039
Neg logit contributions
 robots-0.069
 rockets-0.066
 puzzles-0.064
 moral-0.064
 novels-0.064
Token the
Token ID262
Feature activation-1.29e+00

Pos logit contributions
ill+0.028
line+0.027
cious+0.026
air+0.026
 Sophia+0.025
Neg logit contributions
 robots-0.032
 rockets-0.030
 puzzles-0.030
 trucks-0.030
 moral-0.029
Token sees
Token ID7224
Feature activation+4.90e-01

Pos logit contributions
line+0.024
ill+0.023
 Sophia+0.021
cious+0.020
air+0.020
Neg logit contributions
 Amen-0.047
 robots-0.044
 Mechan-0.044
 novels-0.042
 nurses-0.042
Tokenaw
Token ID707
Feature activation-3.94e+00

Pos logit contributions
 robots+0.023
 Amen+0.022
 Mechan+0.022
 rockets+0.021
 nurses+0.021
Neg logit contributions
ill-0.004
line-0.003
 Sophia-0.003
air-0.003
cious-0.003
Token.
Token ID13
Feature activation-3.13e-01

Pos logit contributions
ill+0.116
cious+0.113
 Sophia+0.112
 Maggie+0.112
iza+0.111
Neg logit contributions
 trucks-0.086
 rockets-0.080
 robots-0.078
 automobiles-0.078
 cars-0.075
Token They
Token ID1119
Feature activation-2.20e-01

Pos logit contributions
ill+0.009
line+0.009
cious+0.009
air+0.009
 Sophia+0.009
Neg logit contributions
 robots-0.008
 rockets-0.007
 puzzles-0.007
 moral-0.007
 trucks-0.007
Token laugh
Token ID6487
Feature activation-3.17e-01

Pos logit contributions
ill+0.005
line+0.004
 Sophia+0.004
air+0.004
cious+0.004
Neg logit contributions
 robots-0.008
 rockets-0.007
 novels-0.007
 puzzles-0.007
 Mechan-0.007
Token and
Token ID290
Feature activation-6.32e-01

Pos logit contributions
ill+0.005
line+0.005
 Sophia+0.004
cious+0.004
air+0.004
Neg logit contributions
 robots-0.012
 Mechan-0.011
 Amen-0.011
 nurses-0.011
 moral-0.011
Token shout
Token ID9191
Feature activation-6.82e-01

Pos logit contributions
ill+0.016
line+0.015
air+0.014
cious+0.014
 Sophia+0.014
Neg logit contributions
 robots-0.019
 rockets-0.018
 trucks-0.018
 puzzles-0.018
 moral-0.017
Token the
Token ID262
Feature activation+2.32e-01

Pos logit contributions
ill+0.017
line+0.015
air+0.014
 Sophia+0.014
cious+0.014
Neg logit contributions
 robots-0.021
 Mechan-0.020
 rockets-0.020
 novels-0.020
 Amen-0.019
Token c
Token ID269
Feature activation-6.37e-01

Pos logit contributions
 Mechan+0.007
 Amen+0.007
 robots+0.006
 novels+0.006
 rockets+0.006
Neg logit contributions
ill-0.005
 Sophia-0.004
air-0.004
cious-0.004
line-0.004
Tokenrows
Token ID8516
Feature activation-1.78e+00

Pos logit contributions
ill+0.016
line+0.015
cious+0.015
 Sophia+0.014
air+0.014
Neg logit contributions
 robots-0.019
 rockets-0.018
 puzzles-0.017
 moral-0.017
 novels-0.017
Token were
Token ID547
Feature activation-3.56e-01

Pos logit contributions
ill+0.045
line+0.043
cious+0.041
air+0.041
 Sophia+0.040
Neg logit contributions
 robots-0.051
 rockets-0.048
 puzzles-0.047
 moral-0.047
 trucks-0.047
Token c
Token ID269
Feature activation-1.99e-01

Pos logit contributions
ill+0.009
line+0.008
cious+0.008
 Sophia+0.008
air+0.008
Neg logit contributions
 robots-0.011
 rockets-0.010
 trucks-0.010
 puzzles-0.010
 novels-0.010
Tokenaw
Token ID707
Feature activation-3.91e+00

Pos logit contributions
ill+0.006
line+0.006
 Sophia+0.006
cious+0.005
air+0.005
Neg logit contributions
 robots-0.005
 rockets-0.004
 puzzles-0.004
 novels-0.004
 moral-0.004
Tokening
Token ID278
Feature activation-1.15e+00

Pos logit contributions
ill+0.089
air+0.083
 Sophia+0.082
 Maggie+0.077
line+0.076
Neg logit contributions
 robots-0.102
 trucks-0.099
 automobiles-0.097
 rockets-0.097
ahs-0.095
Token in
Token ID287
Feature activation-9.85e-01

Pos logit contributions
ill+0.024
line+0.022
air+0.020
cious+0.020
 Sophia+0.020
Neg logit contributions
 robots-0.039
 Mechan-0.037
 rockets-0.037
 moral-0.037
 Amen-0.037
Token unison
Token ID49418
Feature activation-1.87e+00

Pos logit contributions
ill+0.027
line+0.027
cious+0.025
air+0.025
 Sophia+0.024
Neg logit contributions
 robots-0.026
 rockets-0.024
 puzzles-0.024
 moral-0.024
 trucks-0.024
Token to
Token ID284
Feature activation-3.91e-01

Pos logit contributions
ill+0.041
line+0.040
cious+0.038
air+0.038
 Sophia+0.036
Neg logit contributions
 robots-0.058
 rockets-0.055
 puzzles-0.054
 moral-0.054
 trucks-0.054
Token make
Token ID787
Feature activation-7.54e-01

Pos logit contributions
ill+0.009
line+0.008
cious+0.008
air+0.008
 Sophia+0.008
Neg logit contributions
 robots-0.012
 rockets-0.012
 puzzles-0.012
 moral-0.011
 trucks-0.011
Token so
Token ID523
Feature activation-1.39e+00

Pos logit contributions
ill+0.031
air+0.029
line+0.029
cious+0.028
 Sophia+0.026
Neg logit contributions
 robots-0.026
 rockets-0.024
 trucks-0.023
 moral-0.023
 puzzles-0.023
Token happy
Token ID3772
Feature activation-2.22e+00

Pos logit contributions
ill+0.044
line+0.040
cious+0.040
air+0.040
 Sophia+0.038
Neg logit contributions
 robots-0.029
 moral-0.029
 rockets-0.027
 trucks-0.026
 Amen-0.026
Token that
Token ID326
Feature activation+1.69e-01

Pos logit contributions
ill+0.065
line+0.061
cious+0.060
air+0.060
 Sophia+0.057
Neg logit contributions
 robots-0.056
 rockets-0.053
 trucks-0.052
 puzzles-0.050
 novels-0.048
Token he
Token ID339
Feature activation-3.54e-01

Pos logit contributions
 robots+0.004
 rockets+0.003
 moral+0.003
 trucks+0.003
 puzzles+0.003
Neg logit contributions
ill-0.006
air-0.005
cious-0.005
line-0.005
 Sophia-0.005
Token was
Token ID373
Feature activation-5.71e-01

Pos logit contributions
ill+0.010
line+0.009
air+0.009
cious+0.009
 Sophia+0.009
Neg logit contributions
 robots-0.009
 rockets-0.009
 trucks-0.009
 moral-0.008
 puzzles-0.008
Token able
Token ID1498
Feature activation-3.89e+00

Pos logit contributions
air+0.017
ill+0.017
cious+0.016
line+0.015
 Sophia+0.015
Neg logit contributions
 robots-0.013
 rockets-0.012
 moral-0.012
 Amen-0.012
 trucks-0.011
Token to
Token ID284
Feature activation-7.38e-01

Pos logit contributions
ill+0.120
line+0.116
air+0.113
cious+0.108
 Sophia+0.104
Neg logit contributions
 robots-0.092
 rockets-0.089
 puzzles-0.085
 moral-0.084
 trucks-0.083
Token cross
Token ID3272
Feature activation-8.55e-01

Pos logit contributions
ill+0.020
cious+0.019
line+0.019
 Sophia+0.018
air+0.018
Neg logit contributions
 robots-0.019
 moral-0.019
 chores-0.018
 trucks-0.018
 rockets-0.018
Token with
Token ID351
Feature activation-4.84e-01

Pos logit contributions
ill+0.023
cious+0.022
line+0.022
air+0.021
 Sophia+0.021
Neg logit contributions
 robots-0.023
 rockets-0.022
 trucks-0.022
 puzzles-0.022
 moral-0.021
Token the
Token ID262
Feature activation-2.94e-01

Pos logit contributions
ill+0.014
line+0.014
cious+0.013
air+0.013
 Sophia+0.013
Neg logit contributions
 robots-0.012
 rockets-0.011
 trucks-0.011
 puzzles-0.011
 moral-0.011
Token help
Token ID1037
Feature activation-4.44e-01

Pos logit contributions
ill+0.007
line+0.007
cious+0.006
air+0.006
 Sophia+0.006
Neg logit contributions
 robots-0.009
 rockets-0.009
 novels-0.008
 puzzles-0.008
 trucks-0.008
Token a
Token ID257
Feature activation-8.30e-01

Pos logit contributions
ill+0.036
line+0.034
cious+0.032
air+0.032
 Sophia+0.032
Neg logit contributions
 robots-0.041
 rockets-0.039
 puzzles-0.038
 trucks-0.038
 moral-0.038
Token swing
Token ID9628
Feature activation-1.94e+00

Pos logit contributions
line+0.019
ill+0.019
cious+0.018
 Sophia+0.018
air+0.017
Neg logit contributions
 robots-0.025
 Amen-0.024
 rockets-0.023
 nurses-0.023
 novels-0.023
Token,
Token ID11
Feature activation-1.74e+00

Pos logit contributions
ill+0.036
line+0.033
 Sophia+0.029
air+0.029
cious+0.027
Neg logit contributions
 robots-0.072
 Amen-0.071
 Mechan-0.069
 rockets-0.068
 nurses-0.068
Token a
Token ID257
Feature activation-7.73e-01

Pos logit contributions
ill+0.039
line+0.037
 Sophia+0.035
cious+0.035
air+0.035
Neg logit contributions
 robots-0.056
 rockets-0.053
 Amen-0.052
 Mechan-0.052
 moral-0.052
Token sees
Token ID7224
Feature activation+4.11e-02

Pos logit contributions
line+0.017
ill+0.016
cious+0.016
 Sophia+0.015
air+0.015
Neg logit contributions
 robots-0.025
 Amen-0.024
 rockets-0.023
 nurses-0.023
 novels-0.023
Tokenaw
Token ID707
Feature activation-3.88e+00

Pos logit contributions
 robots+0.002
 Amen+0.002
 Mechan+0.002
 nurses+0.002
 rockets+0.002
Neg logit contributions
ill-0.001
line-0.000
air-0.000
 Sophia-0.000
cious-0.000
Token and
Token ID290
Feature activation-1.20e+00

Pos logit contributions
ill+0.111
cious+0.107
 Maggie+0.107
iza+0.106
 Sophia+0.102
Neg logit contributions
 trucks-0.089
 robots-0.081
 automobiles-0.080
 rockets-0.077
 puzzles-0.076
Token a
Token ID257
Feature activation-7.47e-01

Pos logit contributions
ill+0.029
line+0.028
 Sophia+0.027
cious+0.027
air+0.027
Neg logit contributions
 robots-0.035
 rockets-0.033
 moral-0.033
 Amen-0.033
 Mechan-0.033
Token climbing
Token ID14281
Feature activation-6.33e-01

Pos logit contributions
line+0.017
ill+0.017
cious+0.016
 Sophia+0.016
air+0.015
Neg logit contributions
 robots-0.023
 rockets-0.021
 Amen-0.021
 novels-0.021
 nurses-0.021
Token frame
Token ID5739
Feature activation-3.31e-01

Pos logit contributions
ill+0.014
line+0.013
air+0.013
 Sophia+0.012
cious+0.011
Neg logit contributions
 Amen-0.021
 robots-0.020
 Mechan-0.020
 nurses-0.019
 moral-0.019
Token.
Token ID13
Feature activation-1.04e+00

Pos logit contributions
ill+0.007
line+0.006
air+0.006
 Sophia+0.006
cious+0.005
Neg logit contributions
 robots-0.012
 rockets-0.011
 moral-0.011
 Amen-0.011
 Mechan-0.011
Token said
Token ID531
Feature activation-1.14e+00

Pos logit contributions
ill+0.003
 Sophia+0.002
air+0.002
line+0.002
cious+0.002
Neg logit contributions
 robots-0.004
 automobiles-0.004
 Mechan-0.004
 novels-0.004
 rockets-0.004
Token he
Token ID339
Feature activation-3.81e-01

Pos logit contributions
ill+0.026
line+0.025
 Sophia+0.024
cious+0.024
air+0.023
Neg logit contributions
 robots-0.036
 rockets-0.034
 puzzles-0.034
 trucks-0.034
 Mechan-0.033
Token lost
Token ID2626
Feature activation+1.36e+00

Pos logit contributions
ill+0.009
cious+0.008
 Sophia+0.008
air+0.008
line+0.008
Neg logit contributions
 robots-0.012
 rockets-0.011
 automobiles-0.011
 puzzles-0.011
 cartoons-0.011
Token his
Token ID465
Feature activation-1.95e-01

Pos logit contributions
 robots+0.042
 moral+0.042
oral+0.042
 Mechan+0.041
 Amen+0.041
Neg logit contributions
ill-0.030
line-0.029
 Sophia-0.027
air-0.027
cious-0.026
Token original
Token ID2656
Feature activation+1.02e+00

Pos logit contributions
cious+0.004
 back+0.004
 away+0.004
ill+0.004
line+0.004
Neg logit contributions
 Mechan-0.006
 Amen-0.006
oral-0.006
 robots-0.005
 nurses-0.005
Token toy
Token ID13373
Feature activation+3.66e+00

Pos logit contributions
 Amen+0.028
 robots+0.028
oral+0.027
 Mechan+0.027
 nurses+0.026
Neg logit contributions
ill-0.023
line-0.022
air-0.022
cious-0.022
 Sophia-0.021
Token car
Token ID1097
Feature activation+1.93e+00

Pos logit contributions
 Amen+0.148
oral+0.148
 moral+0.141
 Mechan+0.139
 nurses+0.134
Neg logit contributions
line-0.047
ill-0.043
 back-0.041
 away-0.039
air-0.036
Token.
Token ID13
Feature activation+2.77e-01

Pos logit contributions
 Amen+0.075
oral+0.074
 moral+0.073
 robots+0.073
 Mechan+0.071
Neg logit contributions
ill-0.030
line-0.030
air-0.025
cious-0.023
 Sophia-0.021
Token Lily
Token ID20037
Feature activation+1.26e-01

Pos logit contributions
 robots+0.009
 rockets+0.009
 puzzles+0.009
 moral+0.009
 trucks+0.008
Neg logit contributions
ill-0.006
 Sophia-0.006
line-0.006
air-0.005
cious-0.005
Token told
Token ID1297
Feature activation+2.09e-01

Pos logit contributions
 rockets+0.004
 robots+0.004
 novels+0.004
 automobiles+0.004
 trucks+0.004
Neg logit contributions
ill-0.003
 Sophia-0.002
air-0.002
line-0.002
cious-0.002
Token him
Token ID683
Feature activation-1.02e-01

Pos logit contributions
 robots+0.007
 Mechan+0.007
 moral+0.007
 rockets+0.007
 Amen+0.007
Neg logit contributions
ill-0.004
 Sophia-0.004
line-0.004
air-0.004
 Sandy-0.004
Token to
Token ID284
Feature activation-1.23e-01

Pos logit contributions
 Amen+0.005
oral+0.005
 Mechan+0.005
 moral+0.005
 infected+0.005
Neg logit contributions
line-0.001
 Sophia-0.001
ill-0.001
air-0.001
 Dave-0.001
Token play
Token ID711
Feature activation+2.13e+00

Pos logit contributions
ill+0.003
line+0.003
 Sophia+0.002
air+0.002
cious+0.002
Neg logit contributions
 robots-0.004
 rockets-0.004
 Mechan-0.004
 moral-0.004
 novels-0.004
Token with
Token ID351
Feature activation+6.21e-01

Pos logit contributions
 robots+0.076
 Mechan+0.072
 nurses+0.072
 trucks+0.072
 rockets+0.072
Neg logit contributions
ill-0.037
line-0.036
air-0.032
cious-0.031
 Sophia-0.031
Token his
Token ID465
Feature activation+3.34e-01

Pos logit contributions
 Amen+0.022
oral+0.021
 Mechan+0.021
 moral+0.020
 nurses+0.019
Neg logit contributions
 Sophia-0.011
ill-0.011
air-0.010
 Sandy-0.010
 Dan-0.010
Token new
Token ID649
Feature activation+5.78e-01

Pos logit contributions
 Amen+0.011
oral+0.010
 Mechan+0.009
 nurses+0.009
 infected+0.009
Neg logit contributions
 Sophia-0.006
cious-0.006
 Sandy-0.006
 pond-0.006
 spot-0.006
Token toy
Token ID13373
Feature activation+3.66e+00

Pos logit contributions
 Amen+0.017
 infected+0.017
 Mechan+0.016
oral+0.016
 nurses+0.016
Neg logit contributions
 spot-0.010
cious-0.010
 place-0.010
line-0.010
ill-0.010
Token car
Token ID1097
Feature activation+2.25e+00

Pos logit contributions
 Amen+0.111
 Mechan+0.095
oral+0.092
 swollen+0.092
 moral+0.088
Neg logit contributions
line-0.078
 back-0.074
per-0.073
al-0.072
 away-0.072
Token.
Token ID13
Feature activation-3.10e-01

Pos logit contributions
 Amen+0.100
 Mechan+0.096
oral+0.096
 moral+0.094
 nurses+0.091
Neg logit contributions
line-0.022
ill-0.019
air-0.017
e-0.014
per-0.013
Token He
Token ID679
Feature activation+1.28e+00

Pos logit contributions
cious+0.009
ill+0.009
line+0.009
air+0.009
 Sophia+0.008
Neg logit contributions
 robots-0.008
 rockets-0.007
 magnets-0.007
 puzzles-0.007
 trucks-0.007
Token makes
Token ID1838
Feature activation-4.91e-01

Pos logit contributions
oral+0.042
 Amen+0.041
 novels+0.040
 nurses+0.040
 cartoons+0.039
Neg logit contributions
ill-0.022
line-0.022
air-0.022
e-0.021
cious-0.020
Token it
Token ID340
Feature activation+6.68e-01

Pos logit contributions
line+0.010
air+0.009
ill+0.009
 Sophia+0.009
cious+0.008
Neg logit contributions
 Amen-0.016
oral-0.016
 Mechan-0.015
 robots-0.015
 moral-0.015
Token Tom
Token ID4186
Feature activation-1.98e-01

Pos logit contributions
 robots+0.008
 rockets+0.008
 Amen+0.008
 novels+0.008
oral+0.008
Neg logit contributions
ill-0.005
 Sophia-0.004
line-0.004
air-0.004
 Sandy-0.004
Token,
Token ID11
Feature activation+8.77e-01

Pos logit contributions
ill+0.005
line+0.005
cious+0.004
air+0.004
 Sophia+0.004
Neg logit contributions
 robots-0.006
 trucks-0.005
 rockets-0.005
 puzzles-0.005
 moral-0.005
Token here
Token ID994
Feature activation-8.05e-01

Pos logit contributions
 robots+0.030
oral+0.030
 automobiles+0.030
 Amen+0.030
 novels+0.030
Neg logit contributions
ill-0.015
line-0.015
 Sophia-0.013
air-0.013
 Sandy-0.012
Token is
Token ID318
Feature activation+3.57e-01

Pos logit contributions
ill+0.022
line+0.021
air+0.020
 Sophia+0.020
cious+0.020
Neg logit contributions
 robots-0.022
 rockets-0.020
 puzzles-0.020
 moral-0.020
 trucks-0.020
Token your
Token ID534
Feature activation+2.23e-02

Pos logit contributions
 Mechan+0.012
 Amen+0.012
 robots+0.012
oral+0.012
 nurses+0.011
Neg logit contributions
 Sophia-0.007
ill-0.006
 Dan-0.006
 Sandy-0.006
air-0.006
Token laser
Token ID12855
Feature activation+4.06e+00

Pos logit contributions
 nurses+0.000
 Amen+0.000
 novels+0.000
oral+0.000
 puzzles+0.000
Neg logit contributions
ill-0.001
line-0.001
air-0.001
 back-0.001
 Sandy-0.001
Token.
Token ID13
Feature activation+2.34e-01

Pos logit contributions
 nurses+0.134
oral+0.132
 moral+0.128
 Amen+0.128
 novels+0.124
Neg logit contributions
ill-0.080
air-0.071
line-0.070
cious-0.064
 back-0.064
Token Are
Token ID4231
Feature activation-7.79e-01

Pos logit contributions
 robots+0.009
 rockets+0.008
 puzzles+0.008
 trucks+0.008
 moral+0.008
Neg logit contributions
ill-0.004
line-0.004
cious-0.004
air-0.004
 Sophia-0.003
Token you
Token ID345
Feature activation+1.20e-02

Pos logit contributions
ill+0.015
 Sophia+0.014
line+0.014
cious+0.013
air+0.013
Neg logit contributions
 robots-0.027
 rockets-0.025
 Mechan-0.025
 Amen-0.025
 moral-0.025
Token okay
Token ID8788
Feature activation-1.96e-01

Pos logit contributions
 robots+0.000
 rockets+0.000
 puzzles+0.000
 novels+0.000
 trucks+0.000
Neg logit contributions
ill-0.000
line-0.000
cious-0.000
air-0.000
 Sophia-0.000
Token?"
Token ID1701
Feature activation-7.64e-01

Pos logit contributions
ill+0.008
line+0.008
air+0.008
cious+0.008
 Sophia+0.008
Neg logit contributions
 robots-0.003
 rockets-0.002
 puzzles-0.002
 trucks-0.002
 moral-0.002
Token Mary
Token ID5335
Feature activation-1.77e+00

Pos logit contributions
line+0.030
iza+0.027
cious+0.027
air+0.027
mer+0.026
Neg logit contributions
 novels-0.028
 robots-0.027
 trucks-0.027
 moral-0.027
 rockets-0.026
Token was
Token ID373
Feature activation-1.56e-01

Pos logit contributions
ill+0.054
line+0.052
cious+0.050
air+0.050
 Sophia+0.049
Neg logit contributions
 robots-0.043
 rockets-0.039
 puzzles-0.038
 trucks-0.038
 moral-0.038
Token only
Token ID691
Feature activation-8.67e-01

Pos logit contributions
ill+0.004
line+0.004
cious+0.004
 Sophia+0.004
air+0.004
Neg logit contributions
 robots-0.004
 rockets-0.004
 trucks-0.004
 moral-0.004
 puzzles-0.004
Token three
Token ID1115
Feature activation+1.74e+00

Pos logit contributions
ill+0.032
line+0.032
air+0.030
cious+0.029
ax+0.028
Neg logit contributions
 robots-0.016
 nurses-0.015
 puzzles-0.014
 novels-0.013
 brother-0.013
Token-
Token ID12
Feature activation+1.18e+00

Pos logit contributions
 robots+0.049
 rockets+0.046
 puzzles+0.045
 trucks+0.045
 moral+0.044
Neg logit contributions
ill-0.047
line-0.044
cious-0.042
air-0.042
 Sophia-0.042
Tokenyears
Token ID19002
Feature activation+3.66e+00

Pos logit contributions
 robots+0.028
 rockets+0.027
 moral+0.026
 trucks+0.026
 puzzles+0.025
Neg logit contributions
ill-0.037
line-0.035
cious-0.034
air-0.034
 Sophia-0.033
Token-
Token ID12
Feature activation+1.51e+00

Pos logit contributions
 robots+0.122
 rockets+0.116
 puzzles+0.114
 moral+0.114
 trucks+0.113
Neg logit contributions
ill-0.077
line-0.072
cious-0.068
air-0.068
 Sophia-0.067
Tokenold
Token ID727
Feature activation+1.11e+00

Pos logit contributions
 robots+0.033
 novels+0.030
 puzzles+0.030
 rockets+0.029
 nurses+0.029
Neg logit contributions
ill-0.048
line-0.047
cious-0.043
 Sophia-0.043
air-0.043
Token and
Token ID290
Feature activation-8.80e-02

Pos logit contributions
 robots+0.031
 trucks+0.028
 novels+0.028
 puzzles+0.027
 rockets+0.027
Neg logit contributions
line-0.030
ill-0.030
air-0.028
cious-0.027
ipples-0.026
Token loved
Token ID6151
Feature activation+1.42e+00

Pos logit contributions
ill+0.003
air+0.003
line+0.003
 Sophia+0.003
cious+0.003
Neg logit contributions
 robots-0.001
 puzzles-0.001
 rockets-0.001
 novels-0.001
 trucks-0.001
Token playing
Token ID2712
Feature activation+3.00e+00

Pos logit contributions
oral+0.054
 moral+0.053
 Amen+0.052
 Mechan+0.050
cut+0.050
Neg logit contributions
line-0.017
 Sophia-0.016
cious-0.016
ill-0.015
air-0.014
Token nice
Token ID3621
Feature activation-5.48e-01

Pos logit contributions
 robots+0.014
 rockets+0.013
 Mechan+0.013
 moral+0.013
 novels+0.012
Neg logit contributions
line-0.010
ill-0.010
cious-0.010
air-0.010
 Sophia-0.009
Token,
Token ID11
Feature activation+1.46e-01

Pos logit contributions
ill+0.010
line+0.009
 Sophia+0.009
air+0.009
cious+0.008
Neg logit contributions
 Mechan-0.019
 robots-0.019
 Amen-0.018
 moral-0.018
oral-0.018
Token like
Token ID588
Feature activation+1.59e-01

Pos logit contributions
 robots+0.005
 rockets+0.005
 Mechan+0.005
 puzzles+0.005
 trucks+0.005
Neg logit contributions
ill-0.003
line-0.003
 Sophia-0.003
air-0.003
cious-0.003
Token your
Token ID534
Feature activation-8.81e-01

Pos logit contributions
 Mechan+0.005
oral+0.005
 Amen+0.005
 robots+0.005
 moral+0.005
Neg logit contributions
 Sophia-0.003
ill-0.003
 Maggie-0.003
 Sandy-0.003
line-0.003
Token favorite
Token ID4004
Feature activation+1.15e+00

Pos logit contributions
ill+0.021
 Sophia+0.021
cious+0.019
 Maggie+0.019
air+0.018
Neg logit contributions
 Mechan-0.025
 robots-0.023
oral-0.022
 automobiles-0.022
 Amen-0.022
Token toy
Token ID13373
Feature activation+3.71e+00

Pos logit contributions
 Mechan+0.035
oral+0.033
 robots+0.031
 nurses+0.031
 automobiles+0.031
Neg logit contributions
ill-0.023
air-0.021
cious-0.021
 Sophia-0.021
e-0.020
Token or
Token ID393
Feature activation+8.48e-01

Pos logit contributions
oral+0.150
 Mechan+0.145
 Amen+0.145
 moral+0.142
nam+0.134
Neg logit contributions
 back-0.033
ill-0.032
line-0.027
cious-0.025
's-0.024
Token your
Token ID534
Feature activation-5.72e-01

Pos logit contributions
oral+0.027
 Mechan+0.025
 Amen+0.024
 moral+0.023
cut+0.022
Neg logit contributions
 Sophia-0.018
cious-0.017
line-0.016
ill-0.016
 Dan-0.016
Token best
Token ID1266
Feature activation+3.29e-01

Pos logit contributions
ill+0.013
 Sophia+0.013
cious+0.013
air+0.012
line+0.012
Neg logit contributions
 Mechan-0.016
oral-0.015
 nurses-0.015
 Amen-0.015
 robots-0.014
Token friend
Token ID1545
Feature activation-1.74e-01

Pos logit contributions
 Mechan+0.012
 robots+0.011
oral+0.011
 nurses+0.011
 puzzles+0.011
Neg logit contributions
ill-0.005
line-0.005
cious-0.004
air-0.004
 Sophia-0.004
Token.
Token ID13
Feature activation+5.61e-01

Pos logit contributions
ill+0.004
line+0.004
cious+0.003
air+0.003
 Sophia+0.003
Neg logit contributions
 robots-0.006
 rockets-0.005
 moral-0.005
 Mechan-0.005
 puzzles-0.005
Token mom
Token ID1995
Feature activation+1.32e+00

Pos logit contributions
cut+0.030
 Amen+0.029
 Mechan+0.029
 nurses+0.029
 unic+0.028
Neg logit contributions
 Sophia-0.016
ill-0.016
 Dan-0.014
 back-0.014
air-0.013
Tokenmy
Token ID1820
Feature activation+1.81e+00

Pos logit contributions
 robots+0.040
 rockets+0.040
 puzzles+0.040
 novels+0.039
 chores+0.039
Neg logit contributions
ill-0.034
air-0.027
 Sophia-0.027
line-0.027
ax-0.026
Token smiled
Token ID13541
Feature activation+8.92e-01

Pos logit contributions
 robots+0.044
 rockets+0.041
 puzzles+0.041
 novels+0.040
 trucks+0.039
Neg logit contributions
ill-0.056
 Sophia-0.051
line-0.051
air-0.051
cious-0.050
Token and
Token ID290
Feature activation+6.59e-01

Pos logit contributions
 robots+0.025
 rockets+0.024
 moral+0.023
 novels+0.023
 magnets+0.023
Neg logit contributions
ill-0.023
cious-0.023
line-0.022
air-0.021
 Sophia-0.021
Token said
Token ID531
Feature activation-1.05e+00

Pos logit contributions
 robots+0.014
 rockets+0.013
 trucks+0.013
 puzzles+0.013
 moral+0.013
Neg logit contributions
ill-0.022
line-0.021
cious-0.020
air-0.020
 Sophia-0.020
Token:
Token ID25
Feature activation+1.09e+00

Pos logit contributions
ill+0.027
line+0.026
cious+0.025
air+0.025
 Sophia+0.024
Neg logit contributions
 robots-0.030
 rockets-0.028
 puzzles-0.027
 moral-0.027
 trucks-0.027
Token "
Token ID366
Feature activation-7.89e-01

Pos logit contributions
 robots+0.029
 rockets+0.027
 puzzles+0.027
 trucks+0.026
 novels+0.026
Neg logit contributions
ill-0.031
line-0.030
cious-0.028
 Sophia-0.028
air-0.028
TokenIt
Token ID1026
Feature activation+2.20e-01

Pos logit contributions
ill+0.017
line+0.016
cious+0.016
 Sophia+0.016
air+0.016
Neg logit contributions
 robots-0.025
 rockets-0.024
 moral-0.023
 trucks-0.023
 puzzles-0.023
Token is
Token ID318
Feature activation+1.06e-01

Pos logit contributions
 robots+0.007
 rockets+0.007
 moral+0.007
 novels+0.007
 puzzles+0.007
Neg logit contributions
ill-0.004
line-0.004
air-0.004
cious-0.004
 Sophia-0.004
Token a
Token ID257
Feature activation+1.21e+00

Pos logit contributions
 robots+0.003
 Amen+0.003
oral+0.003
 rockets+0.003
 Mechan+0.003
Neg logit contributions
ill-0.002
line-0.002
 Sophia-0.002
air-0.002
cious-0.002
Token safe
Token ID3338
Feature activation-1.54e+00

Pos logit contributions
 robots+0.033
 rockets+0.031
 puzzles+0.030
 magnets+0.030
 novels+0.030
Neg logit contributions
line-0.031
ill-0.031
air-0.030
cious-0.029
 Sophia-0.028
Token three
Token ID1115
Feature activation+1.32e+00

Pos logit contributions
ill+0.012
 Sophia+0.011
line+0.011
cious+0.011
air+0.011
Neg logit contributions
 robots-0.013
 rockets-0.013
 trucks-0.012
 moral-0.012
 puzzles-0.012
Token years
Token ID812
Feature activation+1.49e-01

Pos logit contributions
 robots+0.036
 rockets+0.034
 puzzles+0.033
 trucks+0.033
 moral+0.033
Neg logit contributions
ill-0.036
line-0.035
cious-0.033
air-0.033
 Sophia-0.032
Token old
Token ID1468
Feature activation+1.56e+00

Pos logit contributions
 robots+0.004
 Mechan+0.004
 Amen+0.003
 trucks+0.003
 rockets+0.003
Neg logit contributions
ill-0.004
line-0.004
air-0.004
 Sophia-0.004
cious-0.004
Token and
Token ID290
Feature activation-5.63e-01

Pos logit contributions
 robots+0.049
 rockets+0.045
 puzzles+0.045
 trucks+0.045
 novels+0.045
Neg logit contributions
ill-0.038
line-0.036
cious-0.033
air-0.033
 Sophia-0.033
Token she
Token ID673
Feature activation+2.67e-01

Pos logit contributions
ill+0.017
line+0.017
cious+0.017
 Sophia+0.016
air+0.016
Neg logit contributions
 robots-0.013
 rockets-0.012
 trucks-0.012
 moral-0.012
 puzzles-0.012
Token loved
Token ID6151
Feature activation+8.48e-01

Pos logit contributions
 robots+0.006
 rockets+0.006
 trucks+0.005
 puzzles+0.005
 moral+0.005
Neg logit contributions
ill-0.009
line-0.008
cious-0.008
air-0.008
 Sophia-0.008
Token going
Token ID1016
Feature activation-1.17e+00

Pos logit contributions
oral+0.033
 moral+0.032
 Mechan+0.032
 Amen+0.031
cut+0.030
Neg logit contributions
 Sophia-0.010
line-0.009
 Maggie-0.009
ill-0.009
air-0.009
Token to
Token ID284
Feature activation-2.35e-01

Pos logit contributions
ill+0.022
line+0.022
air+0.019
cious+0.019
 Sophia+0.019
Neg logit contributions
 robots-0.039
 Amen-0.038
 moral-0.038
 Mechan-0.038
 rockets-0.037
Token the
Token ID262
Feature activation-6.52e-01

Pos logit contributions
ill+0.006
line+0.006
 Sophia+0.005
cious+0.005
air+0.005
Neg logit contributions
 robots-0.007
 rockets-0.006
 moral-0.006
 trucks-0.006
 puzzles-0.006
Token park
Token ID3952
Feature activation-1.80e+00

Pos logit contributions
line+0.015
ill+0.015
 Sophia+0.014
air+0.014
cious+0.014
Neg logit contributions
 robots-0.020
 rockets-0.019
 Amen-0.019
 trucks-0.018
 magnets-0.018
Token.
Token ID13
Feature activation+7.28e-01

Pos logit contributions
ill+0.041
line+0.039
cious+0.037
air+0.037
 Sophia+0.036
Neg logit contributions
 robots-0.057
 rockets-0.054
 puzzles-0.054
 trucks-0.053
 novels-0.053
Token go
Token ID467
Feature activation-6.26e-01

Pos logit contributions
 robots+0.023
 rockets+0.022
 novels+0.022
 Amen+0.022
 puzzles+0.021
Neg logit contributions
ill-0.017
line-0.015
air-0.014
cious-0.014
 Sophia-0.014
Token away
Token ID1497
Feature activation+5.39e-01

Pos logit contributions
ill+0.012
line+0.011
cious+0.011
air+0.011
 Sophia+0.011
Neg logit contributions
 robots-0.022
 rockets-0.021
 trucks-0.021
 puzzles-0.021
 moral-0.020
Token,
Token ID11
Feature activation+5.75e-01

Pos logit contributions
 robots+0.019
 rockets+0.018
 novels+0.018
 puzzles+0.018
 trucks+0.018
Neg logit contributions
ill-0.011
line-0.010
air-0.009
 Sophia-0.009
cious-0.009
Token Anna
Token ID11735
Feature activation-7.69e-01

Pos logit contributions
 robots+0.019
 rockets+0.018
 puzzles+0.017
 moral+0.017
 trucks+0.017
Neg logit contributions
ill-0.013
line-0.012
cious-0.012
air-0.011
 Sophia-0.011
Token!
Token ID0
Feature activation+9.78e-01

Pos logit contributions
ill+0.014
line+0.013
air+0.013
cious+0.012
 Sophia+0.012
Neg logit contributions
 robots-0.028
 rockets-0.027
 moral-0.026
 puzzles-0.026
 novels-0.026
Token You
Token ID921
Feature activation+1.39e+00

Pos logit contributions
 robots+0.030
 rockets+0.028
 puzzles+0.028
 moral+0.028
 trucks+0.027
Neg logit contributions
ill-0.024
line-0.022
air-0.021
 Sophia-0.021
cious-0.021
Token are
Token ID389
Feature activation-8.67e-01

Pos logit contributions
 robots+0.047
 Amen+0.046
 rockets+0.046
 novels+0.045
 moral+0.044
Neg logit contributions
ill-0.027
line-0.024
air-0.023
cious-0.022
 Sophia-0.022
Token a
Token ID257
Feature activation+9.71e-01

Pos logit contributions
ill+0.027
line+0.026
cious+0.024
air+0.024
 Sophia+0.024
Neg logit contributions
 robots-0.022
 rockets-0.020
 puzzles-0.020
 trucks-0.019
 moral-0.019
Token bad
Token ID2089
Feature activation+8.58e-01

Pos logit contributions
 robots+0.026
 rockets+0.024
 trucks+0.024
 novels+0.024
 puzzles+0.024
Neg logit contributions
ill-0.025
line-0.024
cious-0.024
air-0.024
 Sophia-0.023
Token girl
Token ID2576
Feature activation-3.00e-01

Pos logit contributions
 robots+0.020
 puzzles+0.020
 rockets+0.020
 novels+0.019
 magazines+0.019
Neg logit contributions
ill-0.026
line-0.024
 Sophia-0.023
air-0.023
cious-0.022
Token!"
Token ID2474
Feature activation-6.50e-01

Pos logit contributions
ill+0.006
line+0.006
air+0.006
cious+0.006
 Sophia+0.005
Neg logit contributions
 robots-0.010
 rockets-0.010
 moral-0.010
 puzzles-0.009
 trucks-0.009
Token said
Token ID531
Feature activation-7.80e-01

Pos logit contributions
 robots+0.002
 rockets+0.002
 novels+0.002
 Mechan+0.002
oral+0.002
Neg logit contributions
 Sophia-0.001
line-0.001
ill-0.001
 Sasha-0.001
 Sandy-0.001
Token,
Token ID11
Feature activation+5.07e-01

Pos logit contributions
ill+0.015
line+0.015
cious+0.014
 Sophia+0.014
air+0.013
Neg logit contributions
 robots-0.027
 rockets-0.026
 trucks-0.026
 puzzles-0.025
 Mechan-0.025
Token "
Token ID366
Feature activation+8.66e-02

Pos logit contributions
 robots+0.013
 rockets+0.012
 puzzles+0.012
 trucks+0.012
 novels+0.012
Neg logit contributions
ill-0.014
line-0.014
cious-0.013
air-0.013
 Sophia-0.013
TokenI
Token ID40
Feature activation-7.21e-02

Pos logit contributions
 robots+0.003
 rockets+0.003
 puzzles+0.003
 trucks+0.003
 novels+0.002
Neg logit contributions
ill-0.002
line-0.002
air-0.002
cious-0.002
 Sophia-0.002
Token understand
Token ID1833
Feature activation+9.93e-01

Pos logit contributions
ill+0.002
line+0.002
cious+0.002
air+0.002
 Sophia+0.002
Neg logit contributions
 robots-0.002
 moral-0.002
 Amen-0.002
 puzzles-0.002
 Mechan-0.002
Token,
Token ID11
Feature activation+1.31e+00

Pos logit contributions
 robots+0.032
 rockets+0.030
 puzzles+0.030
 moral+0.029
 trucks+0.029
Neg logit contributions
ill-0.023
line-0.021
cious-0.020
air-0.020
 Sophia-0.020
Token Lily
Token ID20037
Feature activation+5.46e-01

Pos logit contributions
 robots+0.039
 rockets+0.038
 puzzles+0.036
 Amen+0.036
 novels+0.036
Neg logit contributions
ill-0.032
line-0.030
 Sophia-0.029
air-0.028
cious-0.028
Token.
Token ID13
Feature activation+5.55e-01

Pos logit contributions
 robots+0.018
 rockets+0.017
 puzzles+0.017
 trucks+0.017
 moral+0.017
Neg logit contributions
ill-0.012
line-0.011
cious-0.011
air-0.011
 Sophia-0.010
Token You
Token ID921
Feature activation+1.24e+00

Pos logit contributions
 robots+0.015
 rockets+0.014
 Amen+0.014
 novels+0.014
 trucks+0.014
Neg logit contributions
ill-0.015
line-0.015
cious-0.014
air-0.014
 Sophia-0.014
Token have
Token ID423
Feature activation-9.55e-01

Pos logit contributions
 robots+0.037
 rockets+0.035
 novels+0.035
 trucks+0.035
 puzzles+0.034
Neg logit contributions
ill-0.030
line-0.028
air-0.027
cious-0.027
 Sophia-0.026
Token a
Token ID257
Feature activation-4.11e-01

Pos logit contributions
ill+0.019
 Sophia+0.018
cious+0.017
line+0.017
air+0.017
Neg logit contributions
 robots-0.032
 Amen-0.031
 rockets-0.031
 trucks-0.030
 moral-0.030
Token who
Token ID508
Feature activation-6.34e-01

Pos logit contributions
ill+0.009
line+0.008
 Sophia+0.008
cious+0.008
air+0.008
Neg logit contributions
 robots-0.010
 rockets-0.010
 trucks-0.010
 puzzles-0.010
 moral-0.009
Token was
Token ID373
Feature activation-2.48e-01

Pos logit contributions
ill+0.016
line+0.015
cious+0.015
 Sophia+0.015
air+0.015
Neg logit contributions
 robots-0.018
 rockets-0.017
 moral-0.017
 novels-0.016
 Amen-0.016
Token clever
Token ID14169
Feature activation+5.99e-01

Pos logit contributions
ill+0.008
line+0.008
air+0.007
cious+0.007
 Sophia+0.007
Neg logit contributions
 robots-0.006
 rockets-0.005
 puzzles-0.005
 trucks-0.005
 moral-0.005
Token and
Token ID290
Feature activation-2.33e-01

Pos logit contributions
 robots+0.020
 rockets+0.019
 moral+0.019
 puzzles+0.019
 trucks+0.019
Neg logit contributions
ill-0.012
line-0.012
cious-0.011
air-0.011
 Sophia-0.011
Token she
Token ID673
Feature activation+5.51e-01

Pos logit contributions
ill+0.006
line+0.006
cious+0.005
 Sophia+0.005
air+0.005
Neg logit contributions
 robots-0.007
 rockets-0.006
 Mechan-0.006
 novels-0.006
 puzzles-0.006
Token chased
Token ID26172
Feature activation+7.35e-01

Pos logit contributions
 robots+0.017
 novels+0.017
 chores+0.016
 rockets+0.016
oral+0.016
Neg logit contributions
cious-0.012
ill-0.012
air-0.011
 Sophia-0.011
line-0.010
Token it
Token ID340
Feature activation+8.00e-01

Pos logit contributions
 robots+0.021
 moral+0.019
 puzzles+0.019
 trucks+0.019
 nurses+0.019
Neg logit contributions
ill-0.020
line-0.019
cious-0.017
air-0.016
 Sophia-0.016
Token up
Token ID510
Feature activation-1.41e+00

Pos logit contributions
 robots+0.030
 Mechan+0.029
 rockets+0.028
 Amen+0.028
 nurses+0.028
Neg logit contributions
ill-0.014
line-0.013
air-0.012
 Sophia-0.012
cious-0.011
Token a
Token ID257
Feature activation-4.59e-01

Pos logit contributions
ill+0.032
line+0.030
cious+0.029
air+0.029
 Sophia+0.028
Neg logit contributions
 robots-0.045
 rockets-0.042
 puzzles-0.042
 trucks-0.041
 moral-0.041
Token tree
Token ID5509
Feature activation-6.65e-01

Pos logit contributions
ill+0.013
line+0.012
 Sophia+0.012
air+0.012
cious+0.012
Neg logit contributions
 robots-0.012
 rockets-0.011
 puzzles-0.011
 moral-0.011
 trucks-0.011
Token.
Token ID13
Feature activation-4.38e-01

Pos logit contributions
ill+0.014
line+0.013
cious+0.013
air+0.013
 Sophia+0.012
Neg logit contributions
 robots-0.022
 rockets-0.021
 puzzles-0.021
 trucks-0.020
 moral-0.020
Token and
Token ID290
Feature activation-2.14e-01

Pos logit contributions
ill+0.031
line+0.029
 Sophia+0.028
air+0.028
cious+0.028
Neg logit contributions
 robots-0.039
 rockets-0.038
 puzzles-0.037
 trucks-0.037
 moral-0.036
Token explore
Token ID7301
Feature activation-1.26e-01

Pos logit contributions
ill+0.005
line+0.005
cious+0.005
air+0.005
 Sophia+0.005
Neg logit contributions
 robots-0.007
 rockets-0.006
 puzzles-0.006
 trucks-0.006
 moral-0.006
Token.
Token ID13
Feature activation-1.67e-01

Pos logit contributions
ill+0.004
cious+0.004
line+0.004
 Sophia+0.004
air+0.004
Neg logit contributions
 robots-0.003
 rockets-0.003
 puzzles-0.003
 novels-0.003
 trucks-0.003
Token One
Token ID1881
Feature activation-6.99e-01

Pos logit contributions
ill+0.006
line+0.005
cious+0.005
air+0.005
 Sophia+0.005
Neg logit contributions
 robots-0.004
 rockets-0.003
 trucks-0.003
 novels-0.003
 moral-0.003
Token day
Token ID1110
Feature activation-6.31e-01

Pos logit contributions
ill+0.029
line+0.028
cious+0.027
 Sophia+0.027
air+0.027
Neg logit contributions
 moral-0.009
 robots-0.009
 rockets-0.008
 trucks-0.007
 novels-0.007
Token Ranger
Token ID21888
Feature activation-1.82e+00

Pos logit contributions
ill+0.015
line+0.014
air+0.013
cious+0.013
 Sophia+0.013
Neg logit contributions
 robots-0.020
 rockets-0.019
 puzzles-0.018
 trucks-0.018
 novels-0.018
Token found
Token ID1043
Feature activation-1.13e-01

Pos logit contributions
ill+0.053
line+0.051
air+0.048
 Sophia+0.047
cious+0.047
Neg logit contributions
 robots-0.045
 rockets-0.042
 trucks-0.042
 puzzles-0.041
 moral-0.041
Token something
Token ID1223
Feature activation-2.14e+00

Pos logit contributions
ill+0.004
line+0.004
air+0.004
cious+0.004
 Sophia+0.003
Neg logit contributions
 robots-0.002
 rockets-0.002
 puzzles-0.002
 trucks-0.002
 novels-0.002
Token very
Token ID845
Feature activation-7.54e-01

Pos logit contributions
ill+0.064
line+0.061
cious+0.059
air+0.058
 Sophia+0.058
Neg logit contributions
 robots-0.053
 rockets-0.050
 puzzles-0.049
 moral-0.048
 trucks-0.048
Token special
Token ID2041
Feature activation-8.48e-01

Pos logit contributions
ill+0.024
line+0.023
cious+0.022
air+0.021
 Sophia+0.021
Neg logit contributions
 robots-0.018
 moral-0.017
 rockets-0.016
 puzzles-0.016
 trucks-0.016
Token in
Token ID287
Feature activation-6.51e-01

Pos logit contributions
ill+0.019
line+0.018
cious+0.018
air+0.017
 Sophia+0.017
Neg logit contributions
 robots-0.027
 rockets-0.025
 puzzles-0.025
 trucks-0.025
 moral-0.025
Token sad
Token ID6507
Feature activation+1.82e-01

Pos logit contributions
ill+0.022
line+0.021
cious+0.020
air+0.020
 Sophia+0.020
Neg logit contributions
 robots-0.018
 rockets-0.017
 moral-0.017
 puzzles-0.017
 trucks-0.016
Token.
Token ID13
Feature activation+2.58e-02

Pos logit contributions
 robots+0.006
 rockets+0.005
 moral+0.005
 puzzles+0.005
 trucks+0.005
Neg logit contributions
ill-0.004
line-0.004
cious-0.004
air-0.004
 Sophia-0.004
Token But
Token ID887
Feature activation+5.54e-01

Pos logit contributions
 robots+0.001
 rockets+0.001
 trucks+0.001
 puzzles+0.001
 moral+0.001
Neg logit contributions
ill-0.001
line-0.001
cious-0.001
air-0.001
 Sophia-0.001
Token then
Token ID788
Feature activation+1.13e+00

Pos logit contributions
 robots+0.011
 rockets+0.010
 moral+0.010
 puzzles+0.010
 trucks+0.010
Neg logit contributions
ill-0.019
line-0.019
cious-0.018
air-0.018
 Sophia-0.018
Token he
Token ID339
Feature activation+3.92e-01

Pos logit contributions
 robots+0.034
 automobiles+0.032
 rockets+0.032
 Mechan+0.032
 cartoons+0.031
Neg logit contributions
ill-0.025
 Sophia-0.025
cious-0.024
line-0.024
air-0.023
Token felt
Token ID2936
Feature activation+2.84e-01

Pos logit contributions
 robots+0.013
 novels+0.013
 rockets+0.012
 Amen+0.012
 automobiles+0.012
Neg logit contributions
ill-0.009
cious-0.008
line-0.008
 Sophia-0.008
air-0.007
Token the
Token ID262
Feature activation-4.99e-01

Pos logit contributions
 robots+0.007
 rockets+0.007
 puzzles+0.007
 trucks+0.007
 moral+0.007
Neg logit contributions
ill-0.008
line-0.008
cious-0.007
air-0.007
 Sophia-0.007
Token warm
Token ID5814
Feature activation-1.78e+00

Pos logit contributions
ill+0.012
line+0.011
 Sophia+0.011
cious+0.011
air+0.011
Neg logit contributions
 robots-0.015
 trucks-0.014
 rockets-0.014
 novels-0.014
 Mechan-0.014
Token sun
Token ID4252
Feature activation-1.83e+00

Pos logit contributions
ill+0.047
cious+0.045
line+0.043
ax+0.043
air+0.042
Neg logit contributions
 moral-0.049
 robots-0.048
 rockets-0.045
 puzzles-0.044
 cartoons-0.044
Token on
Token ID319
Feature activation-6.17e-01

Pos logit contributions
ill+0.061
 Sophia+0.059
cious+0.058
air+0.058
line+0.056
Neg logit contributions
 robots-0.038
 rockets-0.037
 trucks-0.035
 puzzles-0.034
 magnets-0.034
Token his
Token ID465
Feature activation-1.06e+00

Pos logit contributions
cious+0.024
ill+0.023
line+0.023
air+0.021
 Sophia+0.021
Neg logit contributions
 robots-0.010
 moral-0.010
 cartoons-0.009
 rockets-0.009
 novels-0.009
Token\n
Token ID198
Feature activation+1.47e-03

Pos logit contributions
ill+0.010
 Sophia+0.010
line+0.009
air+0.009
 Maggie+0.009
Neg logit contributions
 robots-0.019
 rockets-0.018
 Mechan-0.018
 puzzles-0.018
 trucks-0.018
Token\n
Token ID198
Feature activation-2.90e-01

Pos logit contributions
 robots+0.000
 rockets+0.000
 trucks+0.000
 puzzles+0.000
 Mechan+0.000
Neg logit contributions
ill-0.000
 Sophia-0.000
line-0.000
air-0.000
cious-0.000
TokenTom
Token ID13787
Feature activation-5.02e-01

Pos logit contributions
ill+0.006
line+0.006
 Sophia+0.005
air+0.005
cious+0.005
Neg logit contributions
 robots-0.010
 rockets-0.010
 trucks-0.009
 puzzles-0.009
 novels-0.009
Token felt
Token ID2936
Feature activation+7.13e-01

Pos logit contributions
ill+0.012
 Sophia+0.010
air+0.010
cious+0.010
line+0.010
Neg logit contributions
 robots-0.016
 rockets-0.016
 puzzles-0.015
 novels-0.015
 Mechan-0.015
Token bad
Token ID2089
Feature activation+1.01e+00

Pos logit contributions
 robots+0.021
 Mechan+0.021
 rockets+0.021
 trucks+0.021
 automobiles+0.020
Neg logit contributions
 Sophia-0.016
line-0.016
ill-0.015
air-0.015
cious-0.015
Token.
Token ID13
Feature activation-6.52e-01

Pos logit contributions
 robots+0.031
 rockets+0.029
 puzzles+0.029
 trucks+0.029
 moral+0.028
Neg logit contributions
ill-0.024
line-0.023
cious-0.022
air-0.022
 Sophia-0.021
Token He
Token ID679
Feature activation+5.65e-01

Pos logit contributions
ill+0.017
line+0.016
cious+0.015
air+0.015
 Sophia+0.015
Neg logit contributions
 robots-0.019
 rockets-0.018
 puzzles-0.018
 trucks-0.017
 moral-0.017
Token knew
Token ID2993
Feature activation+1.06e+00

Pos logit contributions
 automobiles+0.022
 rockets+0.022
 robots+0.021
 cartoons+0.021
 novels+0.021
Neg logit contributions
 reached-0.008
ill-0.007
 Sophia-0.007
cious-0.007
 found-0.007
Token he
Token ID339
Feature activation-5.49e-01

Pos logit contributions
 Mechan+0.035
 Amen+0.034
 robots+0.033
 rockets+0.032
 automobiles+0.032
Neg logit contributions
 Sophia-0.023
ill-0.023
 Sandy-0.021
line-0.021
 Dan-0.021
Token was
Token ID373
Feature activation+4.18e-01

Pos logit contributions
ill+0.016
line+0.015
cious+0.014
air+0.014
 Sophia+0.014
Neg logit contributions
 robots-0.014
 rockets-0.014
 puzzles-0.013
 trucks-0.013
 novels-0.013
Token wrong
Token ID2642
Feature activation-8.18e-01

Pos logit contributions
 robots+0.010
 rockets+0.010
 moral+0.009
 puzzles+0.009
 trucks+0.009
Neg logit contributions
ill-0.013
line-0.012
air-0.012
cious-0.012
 Sophia-0.011
Token went
Token ID1816
Feature activation+6.13e-01

Pos logit contributions
ill+0.004
line+0.003
air+0.003
cious+0.003
 Sophia+0.003
Neg logit contributions
 robots-0.005
 novels-0.005
 Amen-0.005
 magazines-0.005
 nurses-0.005
Token so
Token ID523
Feature activation+3.70e-01

Pos logit contributions
 robots+0.021
 nurses+0.021
 Amen+0.021
 moral+0.021
 novels+0.021
Neg logit contributions
ill-0.011
line-0.011
air-0.010
 Sophia-0.009
cious-0.009
Token high
Token ID1029
Feature activation-3.64e-01

Pos logit contributions
 robots+0.012
 Amen+0.012
 Mechan+0.012
oral+0.012
 nurses+0.012
Neg logit contributions
ill-0.007
line-0.007
air-0.007
 Sophia-0.007
cious-0.006
Token that
Token ID326
Feature activation+5.61e-01

Pos logit contributions
ill+0.008
line+0.008
air+0.007
 Sophia+0.007
cious+0.007
Neg logit contributions
 robots-0.011
 rockets-0.011
 moral-0.011
 puzzles-0.011
 novels-0.011
Token it
Token ID340
Feature activation-4.58e-02

Pos logit contributions
 robots+0.016
 rockets+0.015
 puzzles+0.015
 moral+0.015
 trucks+0.015
Neg logit contributions
ill-0.014
line-0.014
cious-0.013
air-0.013
 Sophia-0.013
Token almost
Token ID2048
Feature activation+2.01e-01

Pos logit contributions
ill+0.001
line+0.001
air+0.001
cious+0.001
 Sophia+0.001
Neg logit contributions
 robots-0.001
 novels-0.001
 Amen-0.001
 rockets-0.001
 puzzles-0.001
Token touched
Token ID12615
Feature activation-3.71e-01

Pos logit contributions
 robots+0.006
 Amen+0.006
 novels+0.006
 rockets+0.006
 puzzles+0.006
Neg logit contributions
ill-0.005
line-0.005
air-0.004
 Sophia-0.004
cious-0.004
Token the
Token ID262
Feature activation+2.99e-01

Pos logit contributions
ill+0.009
line+0.009
cious+0.009
air+0.008
 Sophia+0.008
Neg logit contributions
 robots-0.011
 rockets-0.010
 trucks-0.010
 puzzles-0.010
 moral-0.010
Token clouds
Token ID15114
Feature activation-1.35e+00

Pos logit contributions
 Amen+0.008
 robots+0.008
oral+0.008
 nurses+0.008
 Mechan+0.008
Neg logit contributions
line-0.007
ill-0.007
air-0.007
cious-0.006
 distance-0.006
Token.
Token ID13
Feature activation+3.55e-01

Pos logit contributions
ill+0.030
cious+0.028
line+0.028
 Sophia+0.027
air+0.027
Neg logit contributions
 robots-0.042
 rockets-0.040
 trucks-0.039
 puzzles-0.039
 moral-0.039
Token Tim
Token ID5045
Feature activation-4.28e-01

Pos logit contributions
 robots+0.011
 rockets+0.011
 puzzles+0.010
 moral+0.010
 trucks+0.010
Neg logit contributions
ill-0.008
line-0.008
 Sophia-0.007
air-0.007
cious-0.007
Token mom
Token ID1995
Feature activation+4.20e-01

Pos logit contributions
ill+0.014
 Sophia+0.014
cious+0.013
air+0.013
line+0.013
Neg logit contributions
 Mechan-0.013
 Amen-0.013
 robots-0.012
oral-0.012
cut-0.012
Token were
Token ID547
Feature activation-1.38e-01

Pos logit contributions
 robots+0.011
 rockets+0.010
 trucks+0.010
 puzzles+0.010
 moral+0.010
Neg logit contributions
ill-0.012
line-0.012
cious-0.011
air-0.011
 Sophia-0.011
Token having
Token ID1719
Feature activation+1.03e+00

Pos logit contributions
ill+0.004
air+0.004
line+0.004
cious+0.004
ive+0.004
Neg logit contributions
 robots-0.003
 rockets-0.003
 puzzles-0.003
 cartoons-0.003
 moral-0.003
Token so
Token ID523
Feature activation+4.12e-01

Pos logit contributions
 robots+0.028
 rockets+0.026
 trucks+0.026
 moral+0.026
 puzzles+0.026
Neg logit contributions
ill-0.028
line-0.026
 Sophia-0.026
cious-0.025
air-0.025
Token much
Token ID881
Feature activation+3.55e-01

Pos logit contributions
 robots+0.010
 rockets+0.009
 puzzles+0.009
 moral+0.009
 trucks+0.009
Neg logit contributions
ill-0.013
line-0.012
cious-0.012
air-0.012
 Sophia-0.011
Token fun
Token ID1257
Feature activation-1.62e-01

Pos logit contributions
 robots+0.008
 rockets+0.007
 puzzles+0.007
 moral+0.007
 novels+0.007
Neg logit contributions
ill-0.012
line-0.011
cious-0.011
air-0.011
 Sophia-0.010
Token together
Token ID1978
Feature activation-7.88e-01

Pos logit contributions
ill+0.004
line+0.004
cious+0.004
air+0.004
 Sophia+0.004
Neg logit contributions
 robots-0.005
 rockets-0.005
 puzzles-0.005
 trucks-0.004
 moral-0.004
Token.
Token ID13
Feature activation-8.62e-01

Pos logit contributions
ill+0.020
line+0.020
air+0.019
cious+0.019
ax+0.018
Neg logit contributions
 robots-0.022
 puzzles-0.021
 rockets-0.021
 trucks-0.020
 novels-0.020
Token<|endoftext|>
Token ID50256
Feature activation+4.31e-01

Pos logit contributions
ill+0.025
line+0.024
cious+0.023
air+0.022
ax+0.022
Neg logit contributions
 robots-0.021
 moral-0.021
 rockets-0.020
 automobiles-0.020
 Amen-0.020
TokenOnce
Token ID7454
Feature activation+7.69e-01

Pos logit contributions
 robots+0.009
 rockets+0.008
 puzzles+0.008
 moral+0.008
 novels+0.008
Neg logit contributions
line-0.014
cious-0.014
ill-0.014
air-0.013
 Sophia-0.013
Token upon
Token ID2402
Feature activation+1.25e+00

Pos logit contributions
 robots+0.020
 rockets+0.019
 puzzles+0.018
 trucks+0.018
 Mechan+0.018
Neg logit contributions
ill-0.023
line-0.021
 Sophia-0.020
air-0.020
cious-0.020
Token It
Token ID632
Feature activation-8.43e-01

Pos logit contributions
ill+0.006
line+0.006
air+0.005
 Sophia+0.005
cious+0.005
Neg logit contributions
 robots-0.008
 rockets-0.008
 puzzles-0.008
 trucks-0.008
 moral-0.008
Token then
Token ID788
Feature activation+7.39e-01

Pos logit contributions
ill+0.021
line+0.019
cious+0.019
 Sophia+0.019
air+0.019
Neg logit contributions
 robots-0.025
 rockets-0.024
 novels-0.023
 automobiles-0.023
 puzzles-0.023
Token hopped
Token ID45230
Feature activation-1.02e-01

Pos logit contributions
 robots+0.022
 rockets+0.021
 novels+0.021
 Amen+0.021
 puzzles+0.020
Neg logit contributions
ill-0.018
line-0.017
 Sophia-0.017
cious-0.016
air-0.016
Token to
Token ID284
Feature activation-6.57e-02

Pos logit contributions
ill+0.002
line+0.002
cious+0.002
air+0.002
 Sophia+0.002
Neg logit contributions
 robots-0.004
 moral-0.003
 rockets-0.003
 puzzles-0.003
 trucks-0.003
Token a
Token ID257
Feature activation-4.59e-01

Pos logit contributions
ill+0.002
line+0.002
cious+0.002
air+0.002
 Sophia+0.002
Neg logit contributions
 robots-0.002
 rockets-0.002
 moral-0.002
 puzzles-0.002
 trucks-0.002
Token different
Token ID1180
Feature activation-2.43e+00

Pos logit contributions
ill+0.011
line+0.010
cious+0.010
air+0.010
 Sophia+0.009
Neg logit contributions
 robots-0.014
 rockets-0.013
 puzzles-0.013
 trucks-0.013
 moral-0.013
Token place
Token ID1295
Feature activation-2.38e+00

Pos logit contributions
ill+0.044
line+0.041
 Sophia+0.040
cious+0.039
air+0.038
Neg logit contributions
 robots-0.086
 Mechan-0.082
 rockets-0.081
 Amen-0.080
 novels-0.080
Token and
Token ID290
Feature activation+1.06e-01

Pos logit contributions
ill+0.056
line+0.055
cious+0.054
air+0.053
 Sophia+0.051
Neg logit contributions
 robots-0.070
 rockets-0.066
 puzzles-0.066
 trucks-0.065
 moral-0.064
Token scattered
Token ID16830
Feature activation+1.12e+00

Pos logit contributions
 robots+0.003
 rockets+0.003
 novels+0.003
 Mechan+0.003
 automobiles+0.003
Neg logit contributions
ill-0.002
line-0.002
cious-0.002
 Sophia-0.002
air-0.002
Token some
Token ID617
Feature activation-1.01e+00

Pos logit contributions
 robots+0.033
 rockets+0.031
 puzzles+0.031
 trucks+0.031
 moral+0.031
Neg logit contributions
ill-0.028
line-0.026
cious-0.025
air-0.025
 Sophia-0.025
Token more
Token ID517
Feature activation-1.14e+00

Pos logit contributions
ill+0.026
line+0.024
air+0.023
cious+0.023
 Sophia+0.023
Neg logit contributions
 robots-0.029
 Mechan-0.028
 Amen-0.028
 rockets-0.027
 moral-0.026
Token a
Token ID257
Feature activation-4.05e-01

Pos logit contributions
 Amen+0.048
 Mechan+0.045
oral+0.045
 robots+0.044
 moral+0.043
Neg logit contributions
 Sophia-0.030
ill-0.028
 Sandy-0.027
line-0.025
 Sasha-0.024
Token big
Token ID1263
Feature activation+5.19e-03

Pos logit contributions
line+0.010
ill+0.010
cious+0.010
 Sophia+0.010
air+0.010
Neg logit contributions
 robots-0.011
 rockets-0.010
 Amen-0.010
 trucks-0.010
 novels-0.010
Token pole
Token ID16825
Feature activation-1.41e+00

Pos logit contributions
 robots+0.000
 rockets+0.000
 puzzles+0.000
 trucks+0.000
 moral+0.000
Neg logit contributions
ill-0.000
line-0.000
cious-0.000
air-0.000
 Sophia-0.000
Token in
Token ID287
Feature activation-5.43e-01

Pos logit contributions
ill+0.033
line+0.031
cious+0.029
air+0.029
 Sophia+0.029
Neg logit contributions
 robots-0.044
 rockets-0.042
 puzzles-0.041
 moral-0.041
 trucks-0.041
Token the
Token ID262
Feature activation-1.64e+00

Pos logit contributions
ill+0.015
line+0.015
cious+0.015
air+0.014
ax+0.013
Neg logit contributions
 robots-0.014
 puzzles-0.013
 rockets-0.013
 moral-0.013
 trucks-0.013
Token park
Token ID3952
Feature activation-2.87e+00

Pos logit contributions
ill+0.059
cious+0.054
owing+0.053
line+0.053
 Sophia+0.050
Neg logit contributions
 moral-0.030
 trucks-0.030
 robots-0.030
 puzzle-0.029
 rockets-0.028
Token.
Token ID13
Feature activation+1.12e-01

Pos logit contributions
cious+0.080
ill+0.077
air+0.075
ax+0.074
 Sophia+0.071
Neg logit contributions
 robots-0.069
 trucks-0.066
 moral-0.066
 automobiles-0.063
 puzzles-0.063
Token It
Token ID632
Feature activation-9.38e-01

Pos logit contributions
 robots+0.003
 puzzles+0.003
 novels+0.003
 magnets+0.003
 rockets+0.003
Neg logit contributions
ill-0.003
cious-0.003
line-0.003
air-0.003
 Sophia-0.003
Token was
Token ID373
Feature activation-9.06e-01

Pos logit contributions
ill+0.020
line+0.018
air+0.018
 Sophia+0.017
cious+0.017
Neg logit contributions
 robots-0.031
 novels-0.030
 nurses-0.030
 rockets-0.029
 chores-0.029
Token smooth
Token ID7209
Feature activation-1.69e-01

Pos logit contributions
ill+0.020
 Sophia+0.020
line+0.019
 Sandy+0.018
cious+0.018
Neg logit contributions
 Amen-0.028
 robots-0.027
oral-0.027
 Mechan-0.027
 rockets-0.026
Token and
Token ID290
Feature activation-8.72e-02

Pos logit contributions
ill+0.004
line+0.004
cious+0.004
 Sophia+0.004
air+0.004
Neg logit contributions
 robots-0.005
 rockets-0.005
 puzzles-0.005
 trucks-0.005
 novels-0.005
Token fly
Token ID6129
Feature activation-6.69e-01

Pos logit contributions
 robots+0.000
 Mechan+0.000
 automobiles+0.000
 trucks+0.000
 novels+0.000
Neg logit contributions
ill-0.000
line-0.000
cious-0.000
air-0.000
 Sophia-0.000
Token.
Token ID13
Feature activation+4.97e-02

Pos logit contributions
ill+0.016
line+0.015
cious+0.014
air+0.014
 Sophia+0.014
Neg logit contributions
 robots-0.021
 rockets-0.020
 puzzles-0.020
 trucks-0.019
 moral-0.019
Token It
Token ID632
Feature activation-5.37e-01

Pos logit contributions
 robots+0.001
 rockets+0.001
 novels+0.001
 moral+0.001
 puzzles+0.001
Neg logit contributions
ill-0.001
cious-0.001
line-0.001
air-0.001
 Sophia-0.001
Token flew
Token ID13112
Feature activation-6.51e-01

Pos logit contributions
ill+0.013
line+0.013
cious+0.013
air+0.012
 Sophia+0.012
Neg logit contributions
 robots-0.016
 novels-0.015
 automobiles-0.015
 rockets-0.014
 puzzles-0.014
Token high
Token ID1029
Feature activation-4.71e-01

Pos logit contributions
ill+0.016
line+0.015
cious+0.014
air+0.014
 Sophia+0.014
Neg logit contributions
 robots-0.020
 rockets-0.019
 puzzles-0.018
 trucks-0.018
 moral-0.018
Token up
Token ID510
Feature activation-2.40e+00

Pos logit contributions
ill+0.008
line+0.007
air+0.007
cious+0.007
 Sophia+0.007
Neg logit contributions
 robots-0.017
 Mechan-0.017
 nurses-0.017
 Amen-0.017
 moral-0.017
Token in
Token ID287
Feature activation-7.74e-01

Pos logit contributions
ill+0.051
line+0.046
air+0.044
 Sophia+0.044
cious+0.042
Neg logit contributions
 robots-0.080
 Amen-0.076
 nurses-0.076
 Mechan-0.076
 moral-0.075
Token the
Token ID262
Feature activation-9.89e-01

Pos logit contributions
ill+0.021
line+0.020
cious+0.020
air+0.019
 Sophia+0.019
Neg logit contributions
 robots-0.022
 rockets-0.021
 puzzles-0.020
 trucks-0.020
 moral-0.020
Token sky
Token ID6766
Feature activation-2.28e+00

Pos logit contributions
ill+0.030
line+0.029
cious+0.028
 Sophia+0.027
ax+0.026
Neg logit contributions
 robots-0.024
 rockets-0.023
 moral-0.023
 puzzles-0.022
 trucks-0.021
Token and
Token ID290
Feature activation-4.03e-01

Pos logit contributions
cious+0.067
air+0.064
ill+0.064
ax+0.064
line+0.063
Neg logit contributions
 rockets-0.052
 robots-0.051
 puzzles-0.047
 trucks-0.046
 puzzle-0.044
Token could
Token ID714
Feature activation+2.21e-01

Pos logit contributions
ill+0.010
line+0.010
cious+0.009
air+0.009
 Sophia+0.009
Neg logit contributions
 robots-0.012
 rockets-0.011
 puzzles-0.011
 moral-0.011
 trucks-0.011
Token chased
Token ID26172
Feature activation+8.58e-01

Pos logit contributions
ill+0.013
air+0.012
cious+0.012
ax+0.011
line+0.011
Neg logit contributions
 robots-0.010
 trucks-0.010
 puzzles-0.009
 nurses-0.009
 rockets-0.009
Token it
Token ID340
Feature activation+6.67e-01

Pos logit contributions
 robots+0.020
 puzzles+0.020
 moral+0.019
 novels+0.019
 automobiles+0.019
Neg logit contributions
ill-0.028
ipples-0.025
line-0.025
cious-0.024
adow-0.023
Token until
Token ID1566
Feature activation+1.90e+00

Pos logit contributions
 robots+0.022
 rockets+0.021
 puzzles+0.020
 trucks+0.020
 moral+0.020
Neg logit contributions
ill-0.014
line-0.014
cious-0.013
air-0.013
 Sophia-0.013
Token she
Token ID673
Feature activation+1.85e-01

Pos logit contributions
 robots+0.051
 puzzles+0.049
 moral+0.049
 rockets+0.049
 nurses+0.049
Neg logit contributions
ill-0.052
cious-0.047
air-0.046
line-0.045
ax-0.043
Token was
Token ID373
Feature activation-1.95e-01

Pos logit contributions
 robots+0.004
 moral+0.004
 brother+0.004
 trucks+0.004
 puzzles+0.004
Neg logit contributions
line-0.005
cious-0.005
air-0.005
ill-0.005
 Sophia-0.004
Token able
Token ID1498
Feature activation-2.84e+00

Pos logit contributions
cious+0.006
air+0.006
ill+0.006
line+0.005
aire+0.005
Neg logit contributions
 robots-0.005
 rockets-0.004
 puzzles-0.004
 cartoons-0.004
 moral-0.004
Token to
Token ID284
Feature activation-1.06e+00

Pos logit contributions
line+0.095
ill+0.095
air+0.094
cious+0.090
iza+0.088
Neg logit contributions
 robots-0.056
 rockets-0.055
 puzzles-0.055
 trucks-0.049
 moral-0.049
Token catch
Token ID4929
Feature activation-9.29e-01

Pos logit contributions
cious+0.031
ill+0.031
air+0.029
aney+0.028
line+0.028
Neg logit contributions
 moral-0.024
 chores-0.023
 trucks-0.023
 robots-0.023
 puzzles-0.021
Token it
Token ID340
Feature activation+8.55e-02

Pos logit contributions
ill+0.027
cious+0.025
line+0.025
air+0.023
ax+0.022
Neg logit contributions
 robots-0.023
 puzzles-0.023
 moral-0.022
 rockets-0.022
 cartoons-0.022
Token in
Token ID287
Feature activation-3.61e-03

Pos logit contributions
 robots+0.003
 rockets+0.002
 puzzles+0.002
 trucks+0.002
 moral+0.002
Neg logit contributions
ill-0.002
line-0.002
cious-0.002
air-0.002
 Sophia-0.002
Token her
Token ID607
Feature activation-4.77e-01

Pos logit contributions
ill+0.000
line+0.000
cious+0.000
air+0.000
 Sophia+0.000
Neg logit contributions
 robots-0.000
 rockets-0.000
 puzzles-0.000
 moral-0.000
 trucks-0.000
Token the
Token ID262
Feature activation-5.62e-01

Pos logit contributions
 robots+0.012
 rockets+0.011
 moral+0.011
 Amen+0.011
 Mechan+0.011
Neg logit contributions
ill-0.009
line-0.008
air-0.008
 Sophia-0.008
cious-0.008
Token air
Token ID1633
Feature activation-2.12e+00

Pos logit contributions
ill+0.013
line+0.012
air+0.012
cious+0.012
 Sophia+0.012
Neg logit contributions
 robots-0.018
 rockets-0.016
 trucks-0.016
 puzzles-0.016
 Mechan-0.016
Token.
Token ID13
Feature activation+4.22e-01

Pos logit contributions
ill+0.046
line+0.043
cious+0.042
air+0.042
 Sophia+0.041
Neg logit contributions
 robots-0.068
 rockets-0.064
 puzzles-0.063
 trucks-0.063
 moral-0.062
Token It
Token ID632
Feature activation-5.04e-01

Pos logit contributions
 robots+0.013
 rockets+0.012
 puzzles+0.012
 moral+0.012
 trucks+0.012
Neg logit contributions
ill-0.010
line-0.009
 Sophia-0.009
air-0.009
cious-0.009
Token flew
Token ID13112
Feature activation-7.23e-01

Pos logit contributions
ill+0.013
line+0.012
cious+0.011
air+0.011
 Sophia+0.011
Neg logit contributions
 robots-0.015
 novels-0.014
 rockets-0.014
 puzzles-0.014
 Amen-0.014
Token higher
Token ID2440
Feature activation-2.26e+00

Pos logit contributions
ill+0.016
line+0.016
cious+0.015
air+0.015
 Sophia+0.014
Neg logit contributions
 robots-0.023
 rockets-0.022
 puzzles-0.021
 trucks-0.021
 moral-0.021
Token and
Token ID290
Feature activation+5.20e-01

Pos logit contributions
ill+0.059
line+0.058
cious+0.058
 Sophia+0.054
air+0.054
Neg logit contributions
 robots-0.061
 rockets-0.060
 trucks-0.058
 puzzles-0.058
 magnets-0.055
Token higher
Token ID2440
Feature activation-1.38e+00

Pos logit contributions
 robots+0.016
 rockets+0.016
 trucks+0.016
 moral+0.016
 puzzles+0.016
Neg logit contributions
ill-0.011
cious-0.010
air-0.010
line-0.009
 Sophia-0.009
Token until
Token ID1566
Feature activation+1.88e+00

Pos logit contributions
ill+0.035
cious+0.035
line+0.034
air+0.033
 Sophia+0.032
Neg logit contributions
 robots-0.036
 rockets-0.036
 trucks-0.036
 puzzles-0.034
 magnets-0.032
Token it
Token ID340
Feature activation-3.50e-01

Pos logit contributions
 robots+0.051
 rockets+0.048
 puzzles+0.047
 trucks+0.046
 moral+0.046
Neg logit contributions
ill-0.052
line-0.049
cious-0.048
air-0.047
 Sophia-0.046
Token reached
Token ID4251
Feature activation-3.97e-01

Pos logit contributions
line+0.010
ill+0.010
cious+0.010
 Sophia+0.010
air+0.010
Neg logit contributions
 moral-0.007
 robots-0.007
 rockets-0.007
 puzzles-0.006
 Mechan-0.006