TOP ACTIVATIONS
MAX = 0.363

!" Mor ris smiled and replied , "
's go on the fer ris wheel and see the whole
He went on the fer ris wheel , the roller coaster
saw the great big Fer ris wheel and all the other
they went on the fer ris wheel together , and Lily
day ever ! <|endoftext|> Mor ris was very excited . Today
Lily loved the big Fer ris wheel the most . She
that she had done the correct thing and been fit .
are both wrong . The correct answer is fifteen cand ies
tried to teach Mike the correct words .
put them back in the correct order and put them away
on n apping had been correct . <|endoftext|> Once upon a
pieces carefully to make the correct matches . After some time
came up . With each correct guess , she felt a
said , " That 's correct ! You win the game
and her toys in the correct places . One day ,
say , " You are correct . Those feathers belong to
bear , much to his dismay . The bear
The family was dismay ed . What could they
looked up , they were dismay ed to see an empty

MIN ACTIVATIONS
MIN = -0.412

the sunshine . Every day Boom would go to a big
One day as Boom was walking in the field
bright and colorful and so Boom thought it was fun .
heard a loud noise . Boom ! It was thunder .
! Then suddenly Boom heard a sound coming from
. From that day onwards Boom always enjoyed visiting the magical
a cheerful tune and when Boom looked closer she saw some
pet als opened wider , Boom saw a pretty flower blo
there was a cow named Boom . Boom was a very
a cow named Boom . Boom was a very happy cow
. Pow ! Zap ! Boom !" Ben watched
" S of ish t icated !" She laughs . Ben
put it on his night stand . He did not want
glowing rock on his night stand before he went to bed
stood on her t ipt oes to reach the spoon .
their popcorn while it r ained outside . Lily was happy
wet because it had r ained . Tim wanted to use
, especially when it r ained . One day , Lily
even more about how volcan oes work . She thought it
loved to play make - bel ieve . Today she was

INTERVAL 0.169 - 0.363
CONTAINS 1.436%

try to fly higher than ever before . Twe ety fl
s help and they show ered her with love and
is fun to jump on tight springs , he must be
When she arrived at the park , she started
, " Wow ! That must have been a long time

INTERVAL -0.024 - 0.169
CONTAINS 67.466%

smiled . It was a really great day . <|endoftext|> Once
Today was her birthday , and she was very excited .
decided to give it away . He went to a little
smiled and waved at her . " Hello ,
birds singing in the morning and she always smiled and waved

INTERVAL -0.218 - -0.024
CONTAINS 30.656%

you ?" The boy said , " No ,
clear . It had big eyes , a long nose and
letters . Let 's try ." Lily says .
You 're a natural . Maybe one day you 'll be
looked at each other and smiled . They liked the same

INTERVAL -0.412 - -0.218
CONTAINS 0.442%

made the little pe b ble his special treasure and kept
who gave me this doll ?" she says .
They added pine cones and feathers , leaves and feathers .
Albert then remembered the thunder , and he knew it
could make such a nice sound . She liked to pretend
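
The four INTERVAL blocks above look like equal-width bins spanning MIN to MAX, and their CONTAINS percentages sum to 100%. The sketch below checks both observations against the displayed, three-decimal values. The equal-width assumption and the helper names (`equal_width_edges`, `check_intervals`) are illustrative and not part of the original dashboard.

```python
# Values copied from the section headers above (shown there to three decimals).
ACT_MIN, ACT_MAX = -0.412, 0.363
N_BINS = 4  # assumption: the four INTERVAL blocks are equal-width bins

# Displayed (low, high, percent) triples, highest interval first.
DISPLAYED = [
    (0.169, 0.363, 1.436),
    (-0.024, 0.169, 67.466),
    (-0.218, -0.024, 30.656),
    (-0.412, -0.218, 0.442),
]


def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 evenly spaced edges from lo to hi."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def check_intervals(displayed, lo, hi, n_bins, tol=1e-3):
    """True if every displayed edge lies within tol of an equal-width edge."""
    edges = equal_width_edges(lo, hi, n_bins)
    shown = sorted({e for low, high, _ in displayed for e in (low, high)})
    return len(shown) == len(edges) and all(
        abs(a - b) <= tol for a, b in zip(shown, edges)
    )


edges = equal_width_edges(ACT_MIN, ACT_MAX, N_BINS)
total_percent = sum(p for _, _, p in DISPLAYED)

print([round(e, 4) for e in edges])
print(check_intervals(DISPLAYED, ACT_MIN, ACT_MAX, N_BINS))
print(round(total_percent, 3))
```

The tolerance of 1e-3 reflects the three-decimal display precision; the underlying unrounded MIN and MAX are not available here.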
Token!"
Token ID2474
Feature activation-4.49e-02

Pos logit contributions
 dismay+0.119
aleb+0.109
 purchase+0.105
 correct+0.104
ris+0.103
Neg logit contributions
 Boom-0.055
icated-0.055
bows-0.054
stand-0.052
oes-0.050
Token
Token ID220
Feature activation-4.04e-03

Pos logit contributions
 Boom+0.062
icated+0.054
stand+0.050
fl+0.050
bows+0.048
Neg logit contributions
ris-0.066
 dismay-0.065
 correct-0.064
ait-0.060
 purchase-0.060
Token
Token ID198
Feature activation-9.90e-03

Pos logit contributions
 Boom+0.005
icated+0.004
stand+0.004
ained+0.004
ric+0.004
Neg logit contributions
 dismay-0.006
ris-0.006
 correct-0.006
 genius-0.006
 purchase-0.006
Token
Token ID198
Feature activation-1.05e-02

Pos logit contributions
 Boom+0.013
icated+0.012
stand+0.011
 elbows+0.011
bel+0.011
Neg logit contributions
 dismay-0.014
 correct-0.014
ris-0.013
 happily-0.013
 assistant-0.013
TokenMor
Token ID20044
Feature activation+2.80e-01

Pos logit contributions
 Boom+0.014
icated+0.013
Rain+0.012
Br+0.012
stand+0.012
Neg logit contributions
 dismay-0.015
ris-0.015
 correct-0.015
 purchase-0.014
ait-0.014
Tokenris
Token ID2442
Feature activation+3.63e-01

Pos logit contributions
 dismay+0.404
 correct+0.369
 purchase+0.350
 Whoever+0.344
 security+0.340
Neg logit contributions
icated-0.377
 Boom-0.368
oes-0.361
bows-0.343
ems-0.332
Token smiled
Token ID13541
Feature activation-1.02e-01

Pos logit contributions
 correct+0.579
 dismay+0.559
 genius+0.524
 assistant+0.516
 certainly+0.515
Neg logit contributions
 Boom-0.383
fl-0.343
icated-0.335
bye-0.335
llers-0.333
Token and
Token ID290
Feature activation+1.68e-02

Pos logit contributions
 Boom+0.131
icated+0.116
stand+0.107
bows+0.107
bel+0.106
Neg logit contributions
ris-0.148
 correct-0.145
 dismay-0.143
 Whoever-0.136
 purchase-0.132
Token replied
Token ID8712
Feature activation-8.83e-03

Pos logit contributions
ris+0.028
 correct+0.027
 dismay+0.027
ait+0.027
 purchase+0.026
Neg logit contributions
 Boom-0.018
fl-0.016
bye-0.015
icated-0.014
drops-0.014
Token,
Token ID11
Feature activation-4.24e-02

Pos logit contributions
 Boom+0.009
icated+0.007
fl+0.007
bel+0.007
bows+0.007
Neg logit contributions
ris-0.016
 dismay-0.015
 correct-0.015
aleb-0.014
 purchase-0.014
Token "
Token ID366
Feature activation+1.53e-02

Pos logit contributions
 Boom+0.045
fl+0.033
stand+0.032
icated+0.032
Rain+0.031
Neg logit contributions
ris-0.078
 dismay-0.073
aleb-0.073
 correct-0.072
ait-0.071
Token's
Token ID338
Feature activation+6.11e-02

Pos logit contributions
ris+0.062
 correct+0.057
 dismay+0.057
 assistant+0.056
 happily+0.055
Neg logit contributions
 Boom-0.062
icated-0.057
stand-0.052
fl-0.049
weet-0.049
Token go
Token ID467
Feature activation-1.70e-02

Pos logit contributions
ris+0.086
 correct+0.084
 dismay+0.083
 assistant+0.080
 happily+0.080
Neg logit contributions
 Boom-0.078
icated-0.075
stand-0.068
ric-0.066
drops-0.065
Token on
Token ID319
Feature activation-2.10e-02

Pos logit contributions
 Boom+0.021
icated+0.019
oes+0.018
stand+0.018
drops+0.018
Neg logit contributions
 dismay-0.026
 correct-0.024
 passer-0.023
 assistant-0.023
ris-0.023
Token the
Token ID262
Feature activation+5.27e-03

Pos logit contributions
 Boom+0.032
icated+0.031
stand+0.028
bows+0.028
bel+0.028
Neg logit contributions
 correct-0.025
ris-0.025
 dismay-0.025
 happily-0.024
 Whoever-0.022
Token fer
Token ID11354
Feature activation+7.77e-02

Pos logit contributions
ris+0.007
 dismay+0.007
 correct+0.007
 happily+0.007
 assistant+0.006
Neg logit contributions
 Boom-0.007
icated-0.007
stand-0.006
 elbows-0.006
bows-0.006
Tokenris
Token ID2442
Feature activation+3.62e-01

Pos logit contributions
 dismay+0.046
 correct+0.044
 happily+0.041
 certainly+0.035
 assistant+0.034
Neg logit contributions
icated-0.161
 Boom-0.159
oes-0.152
stand-0.150
bel-0.147
Token wheel
Token ID7825
Feature activation+1.40e-02

Pos logit contributions
 dismay+0.454
 correct+0.452
ris+0.450
 happily+0.440
 assistant+0.415
Neg logit contributions
 Boom-0.522
icated-0.498
stand-0.449
bows-0.441
 elbows-0.440
Token and
Token ID290
Feature activation+1.57e-02

Pos logit contributions
ris+0.019
 correct+0.019
 dismay+0.019
 happily+0.019
 assistant+0.018
Neg logit contributions
 Boom-0.019
icated-0.018
stand-0.016
 elbows-0.015
bows-0.015
Token see
Token ID766
Feature activation-4.46e-02

Pos logit contributions
ris+0.019
 dismay+0.019
 correct+0.019
 happily+0.018
 assistant+0.017
Neg logit contributions
 Boom-0.024
icated-0.023
stand-0.020
 elbows-0.020
bows-0.020
Token the
Token ID262
Feature activation+4.07e-03

Pos logit contributions
 Boom+0.065
icated+0.063
stand+0.055
bows+0.054
 elbows+0.054
Neg logit contributions
 dismay-0.056
ris-0.055
 correct-0.055
 happily-0.053
 assistant-0.051
Token whole
Token ID2187
Feature activation+5.37e-03

Pos logit contributions
 dismay+0.005
 correct+0.005
ris+0.005
 happily+0.005
 assistant+0.004
Neg logit contributions
 Boom-0.006
icated-0.006
stand-0.005
bows-0.005
 elbows-0.005
Token He
Token ID679
Feature activation+2.69e-02

Pos logit contributions
 dismay+0.016
ris+0.016
 correct+0.016
 purchase+0.015
 genius+0.015
Neg logit contributions
 Boom-0.014
icated-0.011
stand-0.010
oes-0.010
bows-0.010
Token went
Token ID1816
Feature activation-1.50e-02

Pos logit contributions
 dismay+0.042
 correct+0.041
 genius+0.038
 purchase+0.037
 assistant+0.037
Neg logit contributions
 Boom-0.031
icated-0.028
bel-0.026
 keyboards-0.024
oes-0.024
Token on
Token ID319
Feature activation-2.19e-02

Pos logit contributions
 Boom+0.018
icated+0.017
stand+0.015
bel+0.015
oes+0.015
Neg logit contributions
 dismay-0.023
 correct-0.022
ris-0.022
 happily-0.021
 certainly-0.021
Token the
Token ID262
Feature activation+1.10e-03

Pos logit contributions
 Boom+0.028
icated+0.025
stand+0.024
 elbows+0.023
ained+0.022
Neg logit contributions
ris-0.031
 correct-0.031
 certainly-0.030
 dismay-0.030
 assistant-0.030
Token fer
Token ID11354
Feature activation+7.87e-02

Pos logit contributions
ris+0.001
 dismay+0.001
 happily+0.001
 correct+0.001
 assistant+0.001
Neg logit contributions
 Boom-0.002
icated-0.002
stand-0.001
bows-0.001
 elbows-0.001
Tokenris
Token ID2442
Feature activation+3.62e-01

Pos logit contributions
 dismay+0.046
 correct+0.044
 happily+0.037
 certainly+0.034
 assistant+0.034
Neg logit contributions
icated-0.163
 Boom-0.161
oes-0.156
stand-0.153
bel-0.153
Token wheel
Token ID7825
Feature activation+1.40e-02

Pos logit contributions
ris+0.455
 dismay+0.454
 correct+0.453
 happily+0.442
 assistant+0.416
Neg logit contributions
 Boom-0.521
icated-0.498
stand-0.447
bows-0.441
 elbows-0.440
Token,
Token ID11
Feature activation-4.65e-02

Pos logit contributions
ris+0.017
 happily+0.017
 correct+0.017
 dismay+0.017
 assistant+0.016
Neg logit contributions
 Boom-0.021
icated-0.020
 elbows-0.018
stand-0.018
bows-0.017
Token the
Token ID262
Feature activation+7.40e-04

Pos logit contributions
 Boom+0.065
icated+0.061
stand+0.056
drops+0.054
bel+0.054
Neg logit contributions
 dismay-0.061
ris-0.061
 correct-0.060
 happily-0.057
 assistant-0.056
Token roller
Token ID24471
Feature activation-4.43e-02

Pos logit contributions
 correct+0.001
 dismay+0.001
ris+0.001
 happily+0.001
 assistant+0.001
Neg logit contributions
 Boom-0.001
icated-0.001
ained-0.001
bows-0.001
stand-0.001
Token coaster
Token ID42450
Feature activation-8.07e-02

Pos logit contributions
icated+0.063
 Boom+0.061
 elbows+0.056
ained+0.055
oes+0.054
Neg logit contributions
ris-0.058
 happily-0.058
 correct-0.056
 Whoever-0.054
 dismay-0.053
Token saw
Token ID2497
Feature activation-9.04e-03

Pos logit contributions
 correct+0.111
 dismay+0.109
ris+0.108
 assistant+0.105
 genius+0.104
Neg logit contributions
 Boom-0.077
stand-0.067
icated-0.067
llers-0.059
bel-0.059
Token the
Token ID262
Feature activation+7.10e-04

Pos logit contributions
oes+0.016
drops+0.016
 Boom+0.016
icated+0.016
stand+0.015
Neg logit contributions
 dismay-0.008
 happily-0.008
 surprise-0.008
 correct-0.007
 exactly-0.007
Token great
Token ID1049
Feature activation+1.52e-01

Pos logit contributions
ris+0.001
 dismay+0.001
 correct+0.001
 happily+0.001
ait+0.001
Neg logit contributions
 Boom-0.001
icated-0.001
bows-0.001
stand-0.001
ained-0.001
Token big
Token ID1263
Feature activation+3.66e-02

Pos logit contributions
ris+0.219
 dismay+0.208
 correct+0.203
 certainly+0.198
ait+0.197
Neg logit contributions
 Boom-0.199
icated-0.190
bows-0.177
 songs-0.168
stand-0.164
Token Fer
Token ID12880
Feature activation+1.26e-01

Pos logit contributions
ris+0.049
 dismay+0.048
 correct+0.048
 happily+0.047
 assistant+0.045
Neg logit contributions
 Boom-0.050
icated-0.048
stand-0.043
bows-0.042
 elbows-0.041
Tokenris
Token ID2442
Feature activation+3.62e-01

Pos logit contributions
 correct+0.051
 dismay+0.048
 happily+0.036
 certainly+0.036
 surprise+0.027
Neg logit contributions
icated-0.287
 Boom-0.279
oes-0.276
stand-0.272
bel-0.268
Token wheel
Token ID7825
Feature activation+1.53e-02

Pos logit contributions
 dismay+0.493
 correct+0.486
 certainly+0.467
 happily+0.453
ris+0.447
Neg logit contributions
 Boom-0.484
icated-0.436
stand-0.419
bel-0.409
bows-0.404
Token and
Token ID290
Feature activation+1.13e-02

Pos logit contributions
 dismay+0.024
 correct+0.022
ris+0.022
aleb+0.021
 happily+0.021
Neg logit contributions
 Boom-0.018
icated-0.017
bows-0.017
stand-0.016
drops-0.016
Token all
Token ID477
Feature activation+1.52e-01

Pos logit contributions
 dismay+0.014
 correct+0.013
ris+0.013
 happily+0.013
 assistant+0.012
Neg logit contributions
 Boom-0.017
icated-0.016
stand-0.015
bows-0.014
 elbows-0.014
Token the
Token ID262
Feature activation+2.31e-04

Pos logit contributions
ris+0.212
 dismay+0.212
 correct+0.207
 assistant+0.201
aleb+0.195
Neg logit contributions
 Boom-0.201
icated-0.189
ained-0.169
oes-0.166
bows-0.166
Token other
Token ID584
Feature activation+1.04e-03

Pos logit contributions
ris+0.000
 dismay+0.000
 correct+0.000
 happily+0.000
 assistant+0.000
Neg logit contributions
 Boom-0.000
icated-0.000
bows-0.000
stand-0.000
drops-0.000
Token they
Token ID484
Feature activation+4.03e-02

Pos logit contributions
 correct+0.144
ris+0.144
 dismay+0.143
 assistant+0.134
aleb+0.132
Neg logit contributions
 Boom-0.108
icated-0.097
stand-0.088
bows-0.085
drops-0.085
Token went
Token ID1816
Feature activation-1.22e-02

Pos logit contributions
 dismay+0.059
 correct+0.059
 assistant+0.054
 purchase+0.053
 passer+0.053
Neg logit contributions
 Boom-0.052
icated-0.049
drops-0.044
bel-0.042
oes-0.040
Token on
Token ID319
Feature activation-2.04e-02

Pos logit contributions
 Boom+0.012
oes+0.010
line+0.010
drops+0.010
stand+0.009
Neg logit contributions
 dismay-0.022
 correct-0.021
 certainly-0.021
 passer-0.020
 genius-0.020
Token the
Token ID262
Feature activation+5.81e-03

Pos logit contributions
 Boom+0.022
 elbows+0.019
 drums+0.018
stand+0.017
icated+0.017
Neg logit contributions
 certainly-0.033
 assistant-0.032
 correct-0.032
ait-0.031
ris-0.031
Token fer
Token ID11354
Feature activation+7.83e-02

Pos logit contributions
 correct+0.006
 dismay+0.006
ris+0.006
 happily+0.006
 Whoever+0.006
Neg logit contributions
 Boom-0.009
icated-0.009
ained-0.008
stand-0.008
bows-0.008
Tokenris
Token ID2442
Feature activation+3.62e-01

Pos logit contributions
 dismay+0.036
 correct+0.034
 happily+0.029
 assistant+0.024
 certainly+0.024
Neg logit contributions
icated-0.172
 Boom-0.170
oes-0.164
bel-0.161
stand-0.161
Token wheel
Token ID7825
Feature activation+1.47e-02

Pos logit contributions
 dismay+0.469
 correct+0.466
ris+0.457
 happily+0.451
 assistant+0.429
Neg logit contributions
 Boom-0.506
icated-0.480
stand-0.435
bows-0.425
 elbows-0.424
Token together
Token ID1978
Feature activation+1.52e-01

Pos logit contributions
 dismay+0.022
 correct+0.021
ris+0.020
 certainly+0.019
 happily+0.019
Neg logit contributions
 Boom-0.018
icated-0.017
stand-0.016
bows-0.016
oes-0.016
Token,
Token ID11
Feature activation-4.31e-02

Pos logit contributions
 dismay+0.270
ris+0.269
 correct+0.269
aleb+0.268
 passer+0.266
Neg logit contributions
stand-0.109
line-0.104
 Boom-0.104
iece-0.100
bows-0.096
Token and
Token ID290
Feature activation+1.61e-02

Pos logit contributions
 Boom+0.046
drops+0.042
line+0.042
icated+0.040
stand+0.039
Neg logit contributions
 dismay-0.069
ris-0.068
aleb-0.066
 correct-0.066
 assistant-0.064
Token Lily
Token ID20037
Feature activation-6.60e-03

Pos logit contributions
 correct+0.021
 dismay+0.021
ris+0.020
 assistant+0.019
 certainly+0.019
Neg logit contributions
 Boom-0.023
icated-0.021
drops-0.019
stand-0.019
bows-0.019
Token day
Token ID1110
Feature activation-7.59e-02

Pos logit contributions
Con+0.203
 Whoever+0.198
ults+0.193
athed+0.193
aleb+0.192
Neg logit contributions
 sound-0.121
 songs-0.115
 weather-0.105
bye-0.105
fl-0.103
Token ever
Token ID1683
Feature activation+1.69e-01

Pos logit contributions
icated+0.085
 Boom+0.084
bye+0.082
stand+0.079
line+0.076
Neg logit contributions
 dismay-0.112
ris-0.109
 assistant-0.107
 correct-0.106
Con-0.102
Token!
Token ID0
Feature activation+1.25e-01

Pos logit contributions
 dismay+0.212
 correct+0.205
 assistant+0.195
ris+0.193
 Whoever+0.186
Neg logit contributions
icated-0.251
 Boom-0.244
stand-0.221
bows-0.218
fl-0.217
Token<|endoftext|>
Token ID50256
Feature activation-1.67e-01

Pos logit contributions
 correct+0.183
 dismay+0.179
ris+0.178
 happily+0.172
 assistant+0.171
Neg logit contributions
 Boom-0.163
icated-0.148
stand-0.135
ained-0.129
ric-0.129
TokenMor
Token ID20044
Feature activation+2.79e-01

Pos logit contributions
 Boom+0.210
icated+0.177
stand+0.166
ained+0.163
fl+0.157
Neg logit contributions
 dismay-0.258
 correct-0.255
ris-0.248
 assistant-0.244
 purchase-0.243
Tokenris
Token ID2442
Feature activation+3.61e-01

Pos logit contributions
 dismay+0.397
 correct+0.395
 assistant+0.376
 genius+0.370
 certainly+0.363
Neg logit contributions
icated-0.352
 Boom-0.352
oes-0.316
ric-0.310
bye-0.303
Token was
Token ID373
Feature activation+2.26e-02

Pos logit contributions
 correct+0.569
 dismay+0.552
 genius+0.523
 assistant+0.520
 certainly+0.508
Neg logit contributions
 Boom-0.391
fl-0.358
icated-0.342
stand-0.337
bye-0.333
Token very
Token ID845
Feature activation-3.77e-02

Pos logit contributions
ris+0.030
 dismay+0.029
 correct+0.028
 purchase+0.028
 Whoever+0.027
Neg logit contributions
 Boom-0.032
icated-0.028
ained-0.028
stand-0.025
 elbows-0.025
Token excited
Token ID6568
Feature activation+4.39e-02

Pos logit contributions
 Boom+0.049
bel+0.045
icated+0.041
bye+0.039
Rain+0.038
Neg logit contributions
ris-0.056
 dismay-0.050
 Whoever-0.050
 correct-0.049
ait-0.047
Token.
Token ID13
Feature activation+9.25e-03

Pos logit contributions
 dismay+0.060
ris+0.060
 correct+0.059
 happily+0.057
 assistant+0.054
Neg logit contributions
 Boom-0.059
icated-0.057
stand-0.051
bows-0.050
oes-0.049
Token Today
Token ID6288
Feature activation-3.79e-02

Pos logit contributions
 dismay+0.014
 correct+0.013
ris+0.013
 happily+0.013
 exactly+0.012
Neg logit contributions
 Boom-0.012
icated-0.011
oes-0.010
stand-0.010
bows-0.010
Token Lily
Token ID20037
Feature activation-6.24e-03

Pos logit contributions
ris+0.022
 correct+0.022
 dismay+0.021
 happily+0.020
 assistant+0.020
Neg logit contributions
 Boom-0.026
icated-0.024
stand-0.022
ained-0.021
bows-0.021
Token loved
Token ID6151
Feature activation-1.18e-01

Pos logit contributions
 Boom+0.008
icated+0.008
stand+0.007
bows+0.007
ained+0.007
Neg logit contributions
 dismay-0.009
ris-0.009
 correct-0.009
 happily-0.008
 assistant-0.008
Token the
Token ID262
Feature activation+7.56e-03

Pos logit contributions
 Boom+0.162
icated+0.155
 elbows+0.139
stand+0.139
bows+0.138
Neg logit contributions
 dismay-0.152
 correct-0.151
 happily-0.150
ris-0.150
 assistant-0.139
Token big
Token ID1263
Feature activation+3.76e-02

Pos logit contributions
ris+0.009
 dismay+0.009
 correct+0.009
 happily+0.009
 Whoever+0.008
Neg logit contributions
 Boom-0.011
icated-0.011
stand-0.010
bows-0.010
ained-0.009
Token Fer
Token ID12880
Feature activation+1.20e-01

Pos logit contributions
 dismay+0.048
 correct+0.048
ris+0.048
 happily+0.047
 assistant+0.044
Neg logit contributions
 Boom-0.053
icated-0.051
stand-0.045
bows-0.045
 elbows-0.045
Tokenris
Token ID2442
Feature activation+3.57e-01

Pos logit contributions
 dismay+0.093
 correct+0.092
 happily+0.082
 certainly+0.076
 assistant+0.075
Neg logit contributions
 Boom-0.229
icated-0.227
oes-0.213
ained-0.212
bel-0.212
Token wheel
Token ID7825
Feature activation+1.01e-02

Pos logit contributions
 dismay+0.514
 correct+0.504
 happily+0.470
 certainly+0.467
 assistant+0.463
Neg logit contributions
 Boom-0.450
icated-0.422
stand-0.399
ric-0.385
bows-0.384
Token the
Token ID262
Feature activation+3.93e-03

Pos logit contributions
ris+0.012
 correct+0.012
 happily+0.012
 dismay+0.012
 assistant+0.011
Neg logit contributions
 Boom-0.015
icated-0.014
 elbows-0.013
stand-0.013
ained-0.013
Token most
Token ID749
Feature activation-4.18e-02

Pos logit contributions
 dismay+0.005
ris+0.005
 correct+0.005
 happily+0.005
 assistant+0.004
Neg logit contributions
 Boom-0.006
icated-0.006
stand-0.005
bows-0.005
 elbows-0.005
Token.
Token ID13
Feature activation+1.51e-02

Pos logit contributions
icated+0.051
 Boom+0.049
 songs+0.048
bel+0.046
bows+0.046
Neg logit contributions
 dismay-0.060
 correct-0.057
 Whoever-0.055
ris-0.053
 assistant-0.053
Token She
Token ID1375
Feature activation+3.89e-02

Pos logit contributions
 dismay+0.019
 correct+0.019
ris+0.019
 happily+0.018
 assistant+0.017
Neg logit contributions
 Boom-0.023
icated-0.021
stand-0.019
bows-0.019
ained-0.019
Token that
Token ID326
Feature activation-7.09e-03

Pos logit contributions
 dismay+0.131
 correct+0.129
ris+0.129
 Whoever+0.123
aleb+0.118
Neg logit contributions
 Boom-0.120
icated-0.117
bows-0.108
stand-0.108
drops-0.105
Token she
Token ID673
Feature activation+2.37e-02

Pos logit contributions
 Boom+0.008
icated+0.008
stand+0.007
bows+0.007
 rhy+0.007
Neg logit contributions
ris-0.011
aleb-0.010
 dismay-0.010
 Whoever-0.010
 correct-0.010
Token had
Token ID550
Feature activation-4.58e-03

Pos logit contributions
 dismay+0.042
ris+0.041
ait+0.040
 correct+0.039
 Whoever+0.039
Neg logit contributions
 Boom-0.020
icated-0.019
drops-0.018
stand-0.017
bel-0.017
Token done
Token ID1760
Feature activation-7.67e-02

Pos logit contributions
 Boom+0.006
icated+0.005
stand+0.005
bows+0.005
 songs+0.005
Neg logit contributions
ris-0.006
 dismay-0.006
ait-0.006
 Whoever-0.006
 assistant-0.006
Token the
Token ID262
Feature activation+7.48e-03

Pos logit contributions
 Boom+0.093
icated+0.089
stand+0.084
bows+0.080
ained+0.077
Neg logit contributions
ris-0.113
 dismay-0.109
 correct-0.107
 assistant-0.103
 Whoever-0.103
Token correct
Token ID3376
Feature activation+3.46e-01

Pos logit contributions
ris+0.010
 Whoever+0.009
ait+0.009
 happily+0.009
 dismay+0.008
Neg logit contributions
 Boom-0.010
icated-0.010
 songs-0.009
 sound-0.009
fl-0.009
Token thing
Token ID1517
Feature activation-1.69e-02

Pos logit contributions
 Whoever+0.492
ris+0.461
Mor+0.438
aleb+0.435
ait+0.435
Neg logit contributions
fl-0.406
 songs-0.397
 sound-0.397
 Boom-0.396
bows-0.395
Token and
Token ID290
Feature activation+2.01e-02

Pos logit contributions
bows+0.017
icated+0.016
 Boom+0.016
stand+0.016
oes+0.015
Neg logit contributions
 dismay-0.028
ris-0.027
 Whoever-0.026
 purchase-0.026
 correct-0.025
Token been
Token ID587
Feature activation-4.22e-03

Pos logit contributions
ris+0.028
 dismay+0.026
 correct+0.026
 Whoever+0.025
aleb+0.025
Neg logit contributions
 Boom-0.026
icated-0.025
stand-0.023
bows-0.023
bel-0.023
Token fit
Token ID4197
Feature activation+9.70e-02

Pos logit contributions
 Boom+0.006
stand+0.005
icated+0.005
bel+0.005
 elbows+0.005
Neg logit contributions
ris-0.005
 Whoever-0.005
 dismay-0.005
aleb-0.005
 purchase-0.005
Token.
Token ID13
Feature activation+1.75e-02

Pos logit contributions
 dismay+0.146
ris+0.144
 Whoever+0.141
 correct+0.135
Mor+0.135
Neg logit contributions
 Boom-0.102
icated-0.099
ained-0.098
stand-0.097
bows-0.095
Token are
Token ID389
Feature activation-5.65e-02

Pos logit contributions
 happily+0.010
aleb+0.010
 assistant+0.010
ris+0.010
 addition+0.010
Neg logit contributions
 Boom-0.008
icated-0.007
 rhy-0.006
drops-0.006
stand-0.006
Token both
Token ID1111
Feature activation+1.36e-01

Pos logit contributions
icated+0.060
 Boom+0.056
stand+0.048
 ger+0.048
 rhy+0.046
Neg logit contributions
ris-0.097
aleb-0.090
 dismay-0.090
ait-0.090
 purchase-0.086
Token wrong
Token ID2642
Feature activation-4.01e-02

Pos logit contributions
ris+0.188
 dismay+0.182
aleb+0.177
 Whoever+0.176
 happily+0.166
Neg logit contributions
icated-0.180
 Boom-0.171
 ger-0.153
stand-0.151
bows-0.151
Token.
Token ID13
Feature activation+5.77e-03

Pos logit contributions
 Boom+0.056
icated+0.054
stand+0.049
bows+0.048
 elbows+0.048
Neg logit contributions
 dismay-0.051
ris-0.051
 correct-0.051
 happily-0.050
 assistant-0.047
Token The
Token ID383
Feature activation-4.27e-02

Pos logit contributions
 dismay+0.008
ris+0.008
 happily+0.008
 correct+0.007
 purchase+0.007
Neg logit contributions
 Boom-0.008
icated-0.008
stand-0.007
oes-0.007
bows-0.007
Token correct
Token ID3376
Feature activation+3.44e-01

Pos logit contributions
 Boom+0.061
icated+0.058
stand+0.053
bows+0.052
 elbows+0.052
Neg logit contributions
ris-0.055
 dismay-0.054
 correct-0.053
 happily-0.053
 Whoever-0.050
Token answer
Token ID3280
Feature activation+3.84e-02

Pos logit contributions
 dismay+0.456
ris+0.441
 Whoever+0.439
 happily+0.436
 correct+0.430
Neg logit contributions
 Boom-0.450
icated-0.428
bows-0.423
fl-0.403
stand-0.401
Token is
Token ID318
Feature activation+5.83e-02

Pos logit contributions
 dismay+0.052
ris+0.051
 correct+0.051
 happily+0.049
 Whoever+0.047
Neg logit contributions
 Boom-0.052
icated-0.050
stand-0.045
bows-0.044
oes-0.044
Token fifteen
Token ID17280
Feature activation+5.07e-02

Pos logit contributions
ris+0.077
 dismay+0.075
 happily+0.071
 correct+0.071
 Whoever+0.070
Neg logit contributions
 Boom-0.080
icated-0.078
stand-0.069
oes-0.066
ric-0.066
Token cand
Token ID2658
Feature activation-1.04e-01

Pos logit contributions
 dismay+0.070
ris+0.069
 correct+0.068
 happily+0.066
 Whoever+0.065
Neg logit contributions
 Boom-0.065
icated-0.063
stand-0.056
bows-0.056
 elbows-0.055
Tokenies
Token ID444
Feature activation-2.44e-01

Pos logit contributions
icated+0.024
 Boom+0.017
bows+0.015
oes+0.012
bel+0.010
Neg logit contributions
 dismay-0.262
 correct-0.251
ris-0.247
 Whoever-0.247
 happily-0.241
Token tried
Token ID3088
Feature activation+7.82e-02

Pos logit contributions
ris+0.105
 dismay+0.103
 correct+0.101
 happily+0.097
 Whoever+0.096
Neg logit contributions
 Boom-0.109
icated-0.105
bows-0.093
stand-0.092
ained-0.091
Token to
Token ID284
Feature activation-4.39e-02

Pos logit contributions
 dismay+0.106
 correct+0.106
ris+0.105
 happily+0.104
 assistant+0.097
Neg logit contributions
 Boom-0.105
icated-0.099
stand-0.088
bows-0.087
 elbows-0.087
Token teach
Token ID4545
Feature activation+6.73e-02

Pos logit contributions
 Boom+0.060
icated+0.057
stand+0.052
bows+0.051
 elbows+0.051
Neg logit contributions
ris-0.058
 dismay-0.057
 correct-0.057
 happily-0.056
 assistant-0.053
Token Mike
Token ID4995
Feature activation+1.60e-01

Pos logit contributions
ris+0.092
 correct+0.090
 dismay+0.089
 assistant+0.085
 happily+0.085
Neg logit contributions
 Boom-0.089
icated-0.087
stand-0.078
 songs-0.075
ric-0.074
Token the
Token ID262
Feature activation+6.26e-03

Pos logit contributions
 happily+0.184
 correct+0.182
ris+0.182
 dismay+0.178
 assistant+0.168
Neg logit contributions
 Boom-0.249
icated-0.231
 elbows-0.220
stand-0.214
drops-0.211
Token correct
Token ID3376
Feature activation+3.44e-01

Pos logit contributions
ris+0.008
 dismay+0.008
 happily+0.007
 correct+0.007
 Whoever+0.007
Neg logit contributions
 Boom-0.009
icated-0.009
bows-0.008
stand-0.008
oes-0.008
Token words
Token ID2456
Feature activation-1.30e-01

Pos logit contributions
ris+0.462
 dismay+0.452
 correct+0.438
 happily+0.436
 Whoever+0.434
Neg logit contributions
 Boom-0.450
icated-0.424
bows-0.396
 songs-0.387
oes-0.381
Token.
Token ID13
Feature activation+1.77e-02

Pos logit contributions
 Boom+0.152
icated+0.151
bows+0.135
stand+0.134
oes+0.133
Neg logit contributions
 dismay-0.198
ris-0.190
 correct-0.181
aleb-0.178
 assistant-0.174
Token
Token ID220
Feature activation-2.52e-03

Pos logit contributions
 correct+0.028
 dismay+0.028
 genius+0.027
 surprise+0.027
 happily+0.026
Neg logit contributions
 Boom-0.020
ric-0.016
icated-0.016
stand-0.016
oes-0.015
Token
Token ID198
Feature activation-9.38e-03

Pos logit contributions
 Boom+0.003
icated+0.003
stand+0.003
ained+0.003
ric+0.003
Neg logit contributions
 dismay-0.003
 correct-0.003
ris-0.003
 happily-0.003
 assistant-0.003
Token
Token ID198
Feature activation-1.01e-02

Pos logit contributions
 Boom+0.012
icated+0.011
stand+0.010
ained+0.010
ric+0.010
Neg logit contributions
 dismay-0.013
 correct-0.013
ris-0.013
 happily-0.012
 assistant-0.012
Token put
Token ID1234
Feature activation+8.49e-02

Pos logit contributions
 Boom+0.058
icated+0.055
stand+0.051
 elbows+0.050
drops+0.050
Neg logit contributions
ris-0.052
 dismay-0.051
 correct-0.050
 happily-0.048
 assistant-0.047
Token them
Token ID606
Feature activation+5.28e-02

Pos logit contributions
ris+0.112
 dismay+0.110
 correct+0.106
 assistant+0.102
 happily+0.100
Neg logit contributions
 Boom-0.123
icated-0.111
stand-0.104
 elbows-0.102
drops-0.101
Token back
Token ID736
Feature activation+1.19e-02

Pos logit contributions
ris+0.077
 dismay+0.076
 correct+0.072
 Whoever+0.071
 assistant+0.070
Neg logit contributions
 Boom-0.066
icated-0.061
 elbows-0.058
bel-0.057
drops-0.057
Token in
Token ID287
Feature activation+4.04e-02

Pos logit contributions
 dismay+0.015
ris+0.015
 correct+0.015
 Whoever+0.014
 assistant+0.014
Neg logit contributions
 Boom-0.017
icated-0.016
stand-0.015
ained-0.014
 elbows-0.014
Token the
Token ID262
Feature activation+9.16e-03

Pos logit contributions
ris+0.063
aleb+0.062
 Whoever+0.062
ager+0.061
 assistant+0.060
Neg logit contributions
stand-0.043
 Boom-0.042
icated-0.039
 elbows-0.039
 keyboards-0.034
Token correct
Token ID3376
Feature activation+3.44e-01

Pos logit contributions
ris+0.012
 dismay+0.012
 happily+0.012
 Whoever+0.011
 correct+0.011
Neg logit contributions
 Boom-0.013
icated-0.012
stand-0.011
 elbows-0.011
drops-0.010
Token order
Token ID1502
Feature activation+2.22e-01

Pos logit contributions
ris+0.452
 dismay+0.428
 Whoever+0.423
 happily+0.421
 correct+0.414
Neg logit contributions
 Boom-0.468
icated-0.430
stand-0.422
 elbows-0.406
oes-0.394
Token and
Token ID290
Feature activation+1.95e-02

Pos logit contributions
 dismay+0.310
ris+0.304
 correct+0.297
 happily+0.286
 Whoever+0.280
Neg logit contributions
 Boom-0.286
icated-0.275
stand-0.257
bows-0.246
drops-0.244
Token put
Token ID1234
Feature activation+8.70e-02

Pos logit contributions
ris+0.025
 dismay+0.024
 correct+0.024
 happily+0.023
 Whoever+0.022
Neg logit contributions
 Boom-0.028
icated-0.026
stand-0.025
 elbows-0.024
drops-0.024
Token them
Token ID606
Feature activation+5.44e-02

Pos logit contributions
ris+0.118
 dismay+0.116
 correct+0.109
 assistant+0.108
 passer+0.105
Neg logit contributions
 Boom-0.122
icated-0.108
stand-0.105
 elbows-0.102
drops-0.100
Token away
Token ID1497
Feature activation+9.36e-02

Pos logit contributions
 dismay+0.076
ris+0.076
 Whoever+0.071
 correct+0.071
 passer+0.070
Neg logit contributions
 Boom-0.070
icated-0.065
bel-0.063
stand-0.061
drops-0.061
Token on
Token ID319
Feature activation-1.95e-02

Pos logit contributions
 Boom+0.016
icated+0.015
stand+0.014
bows+0.014
drops+0.014
Neg logit contributions
ris-0.016
 dismay-0.016
 correct-0.015
 happily-0.014
 Whoever-0.014
Token n
Token ID299
Feature activation+6.80e-02

Pos logit contributions
 Boom+0.023
icated+0.023
stand+0.022
 elbows+0.020
llers+0.020
Neg logit contributions
ris-0.029
 correct-0.027
 assistant-0.027
 dismay-0.027
 Whoever-0.027
Tokenapping
Token ID5912
Feature activation-4.71e-02

Pos logit contributions
 correct+0.113
 dismay+0.107
 Whoever+0.107
ris+0.104
 passer+0.103
Neg logit contributions
 Boom-0.065
ained-0.065
ric-0.064
bel-0.063
stand-0.061
Token had
Token ID550
Feature activation-5.61e-03

Pos logit contributions
 Boom+0.068
icated+0.065
stand+0.059
bows+0.058
 elbows+0.058
Neg logit contributions
ris-0.059
 dismay-0.058
 correct-0.058
 happily-0.055
 Whoever-0.054
Token been
Token ID587
Feature activation-4.77e-03

Pos logit contributions
 Boom+0.008
icated+0.008
stand+0.007
bows+0.007
ained+0.007
Neg logit contributions
ris-0.007
 dismay-0.007
 correct-0.007
 happily-0.007
 Whoever-0.007
Token correct
Token ID3376
Feature activation+3.44e-01

Pos logit contributions
 Boom+0.007
icated+0.006
stand+0.006
bows+0.006
ained+0.006
Neg logit contributions
ris-0.006
 dismay-0.006
 correct-0.006
 happily-0.006
 Whoever-0.006
Token.
Token ID13
Feature activation+1.90e-02

Pos logit contributions
 Whoever+0.503
 dismay+0.501
ris+0.499
 correct+0.479
aleb+0.466
Neg logit contributions
 Boom-0.370
icated-0.365
stand-0.338
bows-0.337
 songs-0.332
Token<|endoftext|>
Token ID50256
Feature activation-1.67e-01

Pos logit contributions
 correct+0.032
 dismay+0.031
 assistant+0.030
ris+0.030
 genius+0.030
Neg logit contributions
 Boom-0.021
icated-0.018
stand-0.017
ric-0.016
oes-0.015
TokenOnce
Token ID7454
Feature activation+8.31e-02

Pos logit contributions
 Boom+0.213
icated+0.183
stand+0.171
ained+0.168
bows+0.163
Neg logit contributions
 dismay-0.252
 correct-0.249
ris-0.242
 assistant-0.238
 purchase-0.234
Token upon
Token ID2402
Feature activation-4.76e-02

Pos logit contributions
 correct+0.123
 dismay+0.122
ris+0.122
 happily+0.117
 assistant+0.115
Neg logit contributions
 Boom-0.101
icated-0.095
stand-0.083
oes-0.082
ained-0.082
Token a
Token ID257
Feature activation-3.50e-02

Pos logit contributions
 Boom+0.080
drops+0.078
oes+0.076
icated+0.075
ric+0.074
Neg logit contributions
 happily-0.050
 dismay-0.045
 correct-0.045
 certainly-0.043
ris-0.042
Token pieces
Token ID5207
Feature activation+5.01e-02

Pos logit contributions
ris+0.002
 dismay+0.002
aleb+0.002
 happily+0.002
ait+0.002
Neg logit contributions
 Boom-0.002
icated-0.002
bows-0.002
 noises-0.001
stand-0.001
Token carefully
Token ID7773
Feature activation+1.87e-01

Pos logit contributions
 dismay+0.078
ris+0.073
aleb+0.072
 assistant+0.070
 correct+0.070
Neg logit contributions
 Boom-0.056
icated-0.055
stand-0.051
bel-0.051
oes-0.051
Token to
Token ID284
Feature activation-4.59e-02

Pos logit contributions
 dismay+0.271
ris+0.265
 correct+0.257
aleb+0.253
 assistant+0.246
Neg logit contributions
 Boom-0.230
icated-0.216
stand-0.207
bows-0.205
drops-0.199
Token make
Token ID787
Feature activation-8.32e-02

Pos logit contributions
 Boom+0.053
icated+0.052
stand+0.048
ric+0.044
bows+0.044
Neg logit contributions
ris-0.069
 dismay-0.067
 correct-0.065
aleb-0.064
 assistant-0.064
Token the
Token ID262
Feature activation+2.37e-03

Pos logit contributions
 Boom+0.100
icated+0.092
 noises+0.090
 songs+0.083
 sound+0.082
Neg logit contributions
ris-0.129
aleb-0.118
 dismay-0.118
 passer-0.116
irst-0.115
Token correct
Token ID3376
Feature activation+3.43e-01

Pos logit contributions
ris+0.004
 dismay+0.003
aleb+0.003
ait+0.003
 exactly+0.003
Neg logit contributions
 Boom-0.003
icated-0.003
bows-0.003
 songs-0.003
 noises-0.002
Token matches
Token ID7466
Feature activation+1.07e-01

Pos logit contributions
aleb+0.544
ris+0.542
 dismay+0.535
ait+0.527
Mor+0.518
Neg logit contributions
 Boom-0.347
bows-0.330
icated-0.322
 songs-0.320
 noises-0.319
Token.
Token ID13
Feature activation+9.23e-03

Pos logit contributions
ris+0.158
 dismay+0.156
aleb+0.153
 Whoever+0.144
 correct+0.143
Neg logit contributions
icated-0.124
 Boom-0.124
bows-0.116
stand-0.112
bel-0.112
Token After
Token ID2293
Feature activation+3.91e-02

Pos logit contributions
 dismay+0.013
ris+0.013
 correct+0.013
 assistant+0.012
 happily+0.012
Neg logit contributions
 Boom-0.012
icated-0.011
stand-0.010
oes-0.009
llers-0.009
Token some
Token ID617
Feature activation-4.62e-02

Pos logit contributions
ris+0.060
 dismay+0.059
 correct+0.059
aleb+0.058
 assistant+0.056
Neg logit contributions
 Boom-0.046
icated-0.045
stand-0.040
llers-0.037
 songs-0.036
Token time
Token ID640
Feature activation+9.50e-02

Pos logit contributions
 Boom+0.056
icated+0.054
 rhy+0.050
stand+0.050
 songs+0.050
Neg logit contributions
aleb-0.067
ris-0.065
 dismay-0.064
eller-0.063
 assistant-0.063
Token came
Token ID1625
Feature activation-4.45e-03

Pos logit contributions
 Boom+0.013
icated+0.012
bows+0.012
 rhy+0.011
stand+0.011
Neg logit contributions
ris-0.016
 dismay-0.015
 happily-0.015
 correct-0.015
aleb-0.015
Token up
Token ID510
Feature activation-2.53e-03

Pos logit contributions
 Boom+0.006
icated+0.005
stand+0.005
oes+0.005
fl+0.005
Neg logit contributions
ris-0.006
 dismay-0.006
 correct-0.006
 Whoever-0.006
aleb-0.006
Token.
Token ID13
Feature activation+1.48e-02

Pos logit contributions
 Boom+0.003
icated+0.003
bel+0.003
stand+0.003
ained+0.003
Neg logit contributions
ris-0.004
 Whoever-0.004
 dismay-0.004
aleb-0.004
 assistant-0.003
Token With
Token ID2080
Feature activation+8.56e-02

Pos logit contributions
 dismay+0.021
 correct+0.021
ris+0.021
 happily+0.020
aleb+0.020
Neg logit contributions
 Boom-0.019
icated-0.016
stand-0.016
bows-0.016
ained-0.015
Token each
Token ID1123
Feature activation+8.03e-02

Pos logit contributions
ris+0.129
 Whoever+0.119
 correct+0.117
 assistant+0.117
 dismay+0.116
Neg logit contributions
 Boom-0.096
icated-0.095
 songs-0.087
stand-0.085
 rhy-0.084
Token correct
Token ID3376
Feature activation+3.43e-01

Pos logit contributions
 pic+0.115
eller+0.115
ait+0.113
ris+0.111
aleb+0.111
Neg logit contributions
 Boom-0.087
icated-0.085
stand-0.084
 rhy-0.081
 songs-0.080
Token guess
Token ID4724
Feature activation+1.63e-01

Pos logit contributions
ris+0.538
 dismay+0.514
 pic+0.506
+0.503
 Whoever+0.502
Neg logit contributions
 Boom-0.355
 songs-0.316
 noises-0.312
icated-0.310
ment-0.305
Token,
Token ID11
Feature activation-3.98e-02

Pos logit contributions
 dismay+0.250
ris+0.250
 correct+0.234
aleb+0.231
 happily+0.229
Neg logit contributions
 Boom-0.185
icated-0.177
bows-0.166
stand-0.158
 songs-0.156
Token she
Token ID673
Feature activation+2.07e-02

Pos logit contributions
 Boom+0.045
icated+0.039
stand+0.039
line+0.039
bel+0.038
Neg logit contributions
ris-0.062
 dismay-0.060
aleb-0.059
 correct-0.059
 assistant-0.056
Token felt
Token ID2936
Feature activation-1.87e-01

Pos logit contributions
ait+0.033
 dismay+0.033
 correct+0.032
aleb+0.032
 purchase+0.032
Neg logit contributions
 Boom-0.019
 rhy-0.018
icated-0.017
bel-0.017
fl-0.016
Token a
Token ID257
Feature activation-3.07e-02

Pos logit contributions
 Boom+0.233
icated+0.213
stand+0.199
bel+0.193
ained+0.192
Neg logit contributions
ris-0.271
 dismay-0.256
 correct-0.248
 Whoever-0.245
aleb-0.245
Token said
Token ID531
Feature activation-1.13e-02

Pos logit contributions
 correct+0.024
ris+0.024
 dismay+0.023
 purchase+0.022
 buyers+0.022
Neg logit contributions
 Boom-0.010
line-0.007
drops-0.007
bel-0.006
icated-0.006
Token,
Token ID11
Feature activation-4.61e-02

Pos logit contributions
 Boom+0.013
icated+0.012
stand+0.011
bows+0.010
bel+0.010
Neg logit contributions
ris-0.018
 dismay-0.018
 correct-0.018
 happily-0.016
 Whoever-0.016
Token "
Token ID366
Feature activation+1.35e-02

Pos logit contributions
 Boom+0.061
icated+0.054
stand+0.052
drops+0.050
llers+0.048
Neg logit contributions
ris-0.068
 dismay-0.065
 correct-0.064
aleb-0.060
 purchase-0.059
TokenThat
Token ID2504
Feature activation+9.19e-02

Pos logit contributions
 dismay+0.019
 correct+0.018
 happily+0.018
ris+0.018
 assistant+0.018
Neg logit contributions
 Boom-0.018
icated-0.018
Rain-0.015
fl-0.015
bows-0.015
Token's
Token ID338
Feature activation+6.09e-02

Pos logit contributions
ris+0.118
 dismay+0.117
 happily+0.113
 correct+0.111
 Whoever+0.106
Neg logit contributions
 Boom-0.133
icated-0.127
bows-0.114
fl-0.112
stand-0.110
Token correct
Token ID3376
Feature activation+3.42e-01

Pos logit contributions
ris+0.081
 dismay+0.077
 happily+0.076
 Whoever+0.074
ait+0.074
Neg logit contributions
 Boom-0.084
icated-0.083
ric-0.071
ained-0.071
bows-0.070
Token!
Token ID0
Feature activation+1.22e-01

Pos logit contributions
 dismay+0.473
ris+0.458
 Whoever+0.446
 happily+0.439
 correct+0.437
Neg logit contributions
 Boom-0.442
icated-0.434
bows-0.391
fl-0.379
 songs-0.368
Token You
Token ID921
Feature activation+3.66e-02

Pos logit contributions
 dismay+0.177
 happily+0.172
 correct+0.169
ris+0.168
 security+0.164
Neg logit contributions
 Boom-0.168
icated-0.151
oes-0.132
stand-0.129
ric-0.127
Token win
Token ID1592
Feature activation+1.86e-01

Pos logit contributions
 dismay+0.056
aleb+0.052
 happily+0.051
 purchase+0.051
 assistant+0.051
Neg logit contributions
 Boom-0.043
icated-0.042
bows-0.037
 rhy-0.036
ems-0.034
Token the
Token ID262
Feature activation+1.96e-03

Pos logit contributions
 correct+0.218
ris+0.215
 dismay+0.215
 happily+0.210
 certainly+0.194
Neg logit contributions
 Boom-0.287
icated-0.270
 elbows-0.250
stand-0.249
drops-0.246
Token game
Token ID983
Feature activation+7.79e-02

Pos logit contributions
 dismay+0.002
ris+0.002
ait+0.002
 happily+0.002
aleb+0.002
Neg logit contributions
 Boom-0.003
icated-0.003
bows-0.003
stand-0.002
 rhy-0.002
Token and
Token ID290
Feature activation+1.03e-02

Pos logit contributions
 Boom+0.044
icated+0.042
stand+0.039
bows+0.038
 elbows+0.038
Neg logit contributions
ris-0.043
 dismay-0.043
 correct-0.042
 happily-0.040
 Whoever-0.039
Token her
Token ID607
Feature activation+3.93e-02

Pos logit contributions
 correct+0.011
 happily+0.011
 dismay+0.011
ris+0.010
 purchase+0.010
Neg logit contributions
 Boom-0.017
icated-0.017
bows-0.015
ric-0.015
stand-0.015
Token toys
Token ID14958
Feature activation-7.29e-02

Pos logit contributions
ris+0.054
 correct+0.051
 dismay+0.051
 Whoever+0.049
 happily+0.049
Neg logit contributions
 Boom-0.054
icated-0.050
 elbows-0.046
stand-0.046
bel-0.045
Token in
Token ID287
Feature activation+3.77e-02

Pos logit contributions
 Boom+0.112
icated+0.107
stand+0.096
 elbows+0.095
bows+0.094
Neg logit contributions
 correct-0.086
ris-0.085
 dismay-0.085
 happily-0.084
 assistant-0.078
Token the
Token ID262
Feature activation-3.04e-04

Pos logit contributions
ris+0.058
 Whoever+0.055
aleb+0.054
 assistant+0.053
ager+0.053
Neg logit contributions
stand-0.043
 Boom-0.040
 elbows-0.039
icated-0.038
 ger-0.036
Token correct
Token ID3376
Feature activation+3.42e-01

Pos logit contributions
 Boom+0.000
stand+0.000
icated+0.000
 elbows+0.000
ained+0.000
Neg logit contributions
ris-0.000
 Whoever-0.000
 happily-0.000
 dismay-0.000
 certainly-0.000
Token places
Token ID4113
Feature activation-5.31e-02

Pos logit contributions
 Whoever+0.462
ris+0.449
aleb+0.437
Mor+0.436
 certainly+0.436
Neg logit contributions
 Boom-0.408
stand-0.397
 ger-0.392
 songs-0.385
bows-0.373
Token.
Token ID13
Feature activation+9.54e-03

Pos logit contributions
 Boom+0.062
icated+0.058
stand+0.058
bel+0.056
bows+0.056
Neg logit contributions
 dismay-0.079
ris-0.077
 correct-0.075
 Whoever-0.072
aleb-0.071
Token One
Token ID1881
Feature activation+9.20e-02

Pos logit contributions
ris+0.013
 correct+0.012
 dismay+0.012
 assistant+0.012
 security+0.012
Neg logit contributions
 Boom-0.014
icated-0.012
oes-0.012
ained-0.011
bows-0.011
Token day
Token ID1110
Feature activation-7.54e-02

Pos logit contributions
ris+0.167
 correct+0.167
 dismay+0.166
 Whoever+0.162
 happily+0.161
Neg logit contributions
 Boom-0.079
icated-0.074
ained-0.063
oes-0.062
stand-0.061
Token,
Token ID11
Feature activation-4.81e-02

Pos logit contributions
 Boom+0.090
icated+0.082
stand+0.074
bows+0.073
 elbows+0.072
Neg logit contributions
ris-0.115
 correct-0.115
 dismay-0.113
 happily-0.108
 assistant-0.107
Token say
Token ID910
Feature activation-7.86e-02

Pos logit contributions
 correct+0.074
ris+0.074
 dismay+0.073
 happily+0.069
 assistant+0.068
Neg logit contributions
 Boom-0.066
icated-0.062
stand-0.056
ric-0.055
bows-0.054
Token,
Token ID11
Feature activation-4.69e-02

Pos logit contributions
 Boom+0.102
icated+0.091
stand+0.086
bows+0.083
bel+0.081
Neg logit contributions
ris-0.113
 dismay-0.112
 correct-0.110
 happily-0.105
 Whoever-0.102
Token "
Token ID366
Feature activation+1.33e-02

Pos logit contributions
 Boom+0.075
icated+0.073
 elbows+0.066
bows+0.066
ained+0.066
Neg logit contributions
 dismay-0.050
 correct-0.050
 happily-0.050
ris-0.049
 Whoever-0.046
TokenYou
Token ID1639
Feature activation-9.02e-03

Pos logit contributions
 dismay+0.022
 correct+0.022
ris+0.022
 happily+0.022
 assistant+0.021
Neg logit contributions
 Boom-0.014
icated-0.013
fl-0.011
Rain-0.011
oes-0.011
Token are
Token ID389
Feature activation-5.74e-02

Pos logit contributions
 Boom+0.010
icated+0.009
bows+0.008
 rhy+0.008
rings+0.008
Neg logit contributions
 dismay-0.014
aleb-0.014
 happily-0.014
 purchase-0.014
ris-0.014
Token correct
Token ID3376
Feature activation+3.41e-01

Pos logit contributions
 Boom+0.061
icated+0.055
 sound+0.049
weet+0.048
stand+0.048
Neg logit contributions
ris-0.097
aleb-0.091
 Whoever-0.089
ait-0.089
-0.088
Token.
Token ID13
Feature activation+9.78e-03

Pos logit contributions
ris+0.415
 dismay+0.415
 correct+0.406
 happily+0.396
 Whoever+0.393
Neg logit contributions
 Boom-0.491
icated-0.470
stand-0.426
bows-0.419
fl-0.416
Token Those
Token ID5845
Feature activation+9.95e-02

Pos logit contributions
 dismay+0.014
 happily+0.014
ris+0.014
 correct+0.014
 purchase+0.013
Neg logit contributions
 Boom-0.013
icated-0.012
stand-0.010
oes-0.010
ric-0.010
Token feathers
Token ID27400
Feature activation-2.28e-01

Pos logit contributions
 correct+0.143
 dismay+0.141
ris+0.141
 happily+0.137
 assistant+0.131
Neg logit contributions
 Boom-0.126
icated-0.119
stand-0.106
ained-0.104
 elbows-0.104
Token belong
Token ID5594
Feature activation-1.32e-02

Pos logit contributions
 Boom+0.281
icated+0.272
bows+0.241
stand+0.237
drops+0.234
Neg logit contributions
 dismay-0.333
ris-0.328
 correct-0.323
 happily-0.318
 assistant-0.305
Token to
Token ID284
Feature activation-4.73e-02

Pos logit contributions
 Boom+0.016
icated+0.015
bows+0.013
bel+0.013
oes+0.013
Neg logit contributions
 dismay-0.020
ris-0.019
 correct-0.019
 Whoever-0.018
 happily-0.018
Token bear
Token ID6842
Feature activation+3.89e-02

Pos logit contributions
 dismay+0.003
 correct+0.003
ris+0.003
 happily+0.003
 assistant+0.003
Neg logit contributions
 Boom-0.004
icated-0.003
stand-0.003
bows-0.003
ained-0.003
Token,
Token ID11
Feature activation-4.51e-02

Pos logit contributions
 dismay+0.064
ris+0.062
aleb+0.061
 correct+0.061
 Whoever+0.060
Neg logit contributions
stand-0.036
icated-0.034
 Boom-0.034
fl-0.033
ring-0.033
Token much
Token ID881
Feature activation+9.67e-03

Pos logit contributions
 Boom+0.055
icated+0.053
stand+0.049
drops+0.048
 elbows+0.048
Neg logit contributions
ris-0.067
 dismay-0.063
 correct-0.063
aleb-0.061
 assistant-0.061
Token to
Token ID284
Feature activation-4.52e-02

Pos logit contributions
ris+0.014
 dismay+0.014
 correct+0.014
aleb+0.013
 Whoever+0.013
Neg logit contributions
 Boom-0.012
icated-0.011
stand-0.011
drops-0.010
bows-0.010
Token his
Token ID465
Feature activation+3.08e-02

Pos logit contributions
 Boom+0.056
icated+0.055
stand+0.052
ric+0.051
weet+0.050
Neg logit contributions
ris-0.064
 correct-0.059
 assistant-0.059
aleb-0.058
 dismay-0.058
Token dismay
Token ID29793
Feature activation+3.40e-01

Pos logit contributions
 Whoever+0.040
ris+0.039
 correct+0.037
ait+0.036
aleb+0.035
Neg logit contributions
 elbows-0.040
 Boom-0.039
 songs-0.039
 skin-0.038
fl-0.037
Token.
Token ID13
Feature activation+1.03e-02

Pos logit contributions
 happily+0.413
ris+0.401
 correct+0.397
 dismay+0.394
 assistant+0.369
Neg logit contributions
 Boom-0.508
icated-0.487
stand-0.436
ained-0.432
 elbows-0.432
Token\n
Token ID198
Feature activation-1.15e-02

Pos logit contributions
ris+0.016
 dismay+0.016
 correct+0.015
aleb+0.014
 assistant+0.014
Neg logit contributions
 Boom-0.013
icated-0.011
stand-0.010
oes-0.010
bows-0.010
Token\n
Token ID198
Feature activation-1.11e-02

Pos logit contributions
 Boom+0.018
icated+0.018
drops+0.017
bows+0.017
 elbows+0.017
Neg logit contributions
ris-0.013
 happily-0.013
 dismay-0.012
 correct-0.012
 Whoever-0.011
TokenThe
Token ID464
Feature activation-7.97e-02

Pos logit contributions
 Boom+0.013
icated+0.013
stand+0.011
ained+0.011
bel+0.011
Neg logit contributions
ris-0.017
 dismay-0.017
 correct-0.017
 happily-0.016
 assistant-0.016
Token bear
Token ID6842
Feature activation+3.92e-02

Pos logit contributions
 Boom+0.130
icated+0.126
stand+0.116
bows+0.115
ained+0.115
Neg logit contributions
 dismay-0.087
 correct-0.086
 happily-0.084
ris-0.083
 assistant-0.078
Token\n
Token ID198
Feature activation-1.39e-02

Pos logit contributions
 dismay+0.008
ris+0.008
 correct+0.008
 happily+0.008
 Whoever+0.008
Neg logit contributions
 Boom-0.010
icated-0.009
 elbows-0.009
stand-0.009
bows-0.008
Token\n
Token ID198
Feature activation-1.44e-02

Pos logit contributions
icated+0.023
 Boom+0.023
drops+0.022
 elbows+0.021
bows+0.021
Neg logit contributions
ris-0.015
 happily-0.014
 dismay-0.014
 correct-0.014
 Whoever-0.013
TokenThe
Token ID464
Feature activation-8.14e-02

Pos logit contributions
 Boom+0.018
icated+0.017
stand+0.015
ained+0.014
bows+0.014
Neg logit contributions
 dismay-0.022
 correct-0.021
ris-0.021
 happily-0.021
 assistant-0.020
Token family
Token ID1641
Feature activation+3.26e-02

Pos logit contributions
 Boom+0.114
icated+0.109
stand+0.097
 elbows+0.095
bows+0.095
Neg logit contributions
ris-0.106
 correct-0.104
 dismay-0.102
 happily-0.100
 Whoever-0.097
Token was
Token ID373
Feature activation+2.29e-02

Pos logit contributions
 correct+0.052
ris+0.050
 dismay+0.050
 purchase+0.049
aleb+0.049
Neg logit contributions
 Boom-0.033
icated-0.029
bel-0.026
 sound-0.026
drops-0.026
Token dismay
Token ID29793
Feature activation+3.39e-01

Pos logit contributions
ris+0.042
 correct+0.039
aleb+0.039
 dismay+0.039
 Whoever+0.038
Neg logit contributions
 Boom-0.020
icated-0.017
ained-0.017
stand-0.016
bel-0.016
Tokened
Token ID276
Feature activation-1.10e-01

Pos logit contributions
 happily+0.445
ris+0.430
 dismay+0.422
 correct+0.420
 assistant+0.399
Neg logit contributions
 Boom-0.472
icated-0.449
stand-0.415
 elbows-0.408
ric-0.401
Token.
Token ID13
Feature activation+6.76e-03

Pos logit contributions
 Boom+0.139
icated+0.131
 elbows+0.120
bel+0.119
ained+0.119
Neg logit contributions
ris-0.154
 correct-0.149
 dismay-0.148
 Whoever-0.143
 certainly-0.141
Token What
Token ID1867
Feature activation-8.62e-02

Pos logit contributions
ris+0.009
 dismay+0.008
 correct+0.008
 happily+0.008
 assistant+0.008
Neg logit contributions
 Boom-0.010
icated-0.009
stand-0.008
bows-0.008
ained-0.008
Token could
Token ID714
Feature activation-7.27e-02

Pos logit contributions
 Boom+0.107
icated+0.101
 rhy+0.085
ained+0.085
bel+0.084
Neg logit contributions
 correct-0.127
 dismay-0.126
ris-0.126
 purchase-0.120
 assistant-0.120
Token they
Token ID484
Feature activation+3.64e-02

Pos logit contributions
 Boom+0.081
icated+0.079
llers+0.071
stand+0.070
bel+0.070
Neg logit contributions
ris-0.121
 correct-0.109
 dismay-0.106
 happily-0.106
 assistant-0.103
Token looked
Token ID3114
Feature activation+1.55e-02

Pos logit contributions
 Boom+0.168
icated+0.144
drops+0.137
bel+0.132
line+0.132
Neg logit contributions
 correct-0.258
 assistant-0.245
 dismay-0.244
ris-0.243
aleb-0.241
Token up
Token ID510
Feature activation-4.66e-03

Pos logit contributions
 dismay+0.017
 correct+0.017
 happily+0.017
ris+0.017
 assistant+0.015
Neg logit contributions
 Boom-0.025
icated-0.024
stand-0.022
bows-0.022
 elbows-0.021
Token,
Token ID11
Feature activation-4.32e-02

Pos logit contributions
 Boom+0.005
icated+0.004
bel+0.004
stand+0.004
ained+0.004
Neg logit contributions
ris-0.007
 correct-0.007
 dismay-0.007
 assistant-0.007
 certainly-0.007
Token they
Token ID484
Feature activation+4.19e-02

Pos logit contributions
 Boom+0.057
icated+0.051
stand+0.044
 keyboards+0.044
bows+0.043
Neg logit contributions
ris-0.066
 correct-0.062
 dismay-0.061
aleb-0.059
 assistant-0.059
Token were
Token ID547
Feature activation-4.24e-02

Pos logit contributions
 correct+0.065
ris+0.063
 dismay+0.063
 assistant+0.061
 purchase+0.060
Neg logit contributions
 Boom-0.048
icated-0.044
bel-0.039
drops-0.038
stand-0.038
Token dismay
Token ID29793
Feature activation+3.38e-01

Pos logit contributions
 Boom+0.055
ained+0.048
stand+0.047
 elbows+0.046
drops+0.046
Neg logit contributions
ris-0.060
 purchase-0.056
aleb-0.056
 Whoever-0.055
 correct-0.055
Tokened
Token ID276
Feature activation-1.07e-01

Pos logit contributions
ris+0.472
 dismay+0.471
 correct+0.470
 happily+0.470
 assistant+0.437
Neg logit contributions
 Boom-0.433
icated-0.415
stand-0.371
 elbows-0.363
bows-0.361
Token to
Token ID284
Feature activation-4.70e-02

Pos logit contributions
 Boom+0.124
icated+0.110
stand+0.106
 elbows+0.106
drops+0.105
Neg logit contributions
ris-0.161
 correct-0.153
 dismay-0.152
aleb-0.149
 assistant-0.148
Token see
Token ID766
Feature activation-4.13e-02

Pos logit contributions
 Boom+0.062
icated+0.057
 elbows+0.054
ric+0.053
stand+0.053
Neg logit contributions
ris-0.067
 correct-0.063
 assistant-0.062
aleb-0.061
 dismay-0.059
Token an
Token ID281
Feature activation+2.70e-02

Pos logit contributions
 Boom+0.052
icated+0.049
stand+0.043
ric+0.042
weet+0.041
Neg logit contributions
ris-0.061
 correct-0.058
 dismay-0.057
 assistant-0.056
 happily-0.056
Token empty
Token ID6565
Feature activation+7.58e-02

Pos logit contributions
 dismay+0.037
 correct+0.037
ris+0.037
 happily+0.036
 assistant+0.034
Neg logit contributions
 Boom-0.036
icated-0.034
stand-0.031
bows-0.030
 elbows-0.030
Token the
Token ID262
Feature activation+1.58e-02

Pos logit contributions
 dismay+0.054
 correct+0.054
ris+0.054
 happily+0.053
 assistant+0.050
Neg logit contributions
 Boom-0.052
icated-0.050
stand-0.044
bows-0.044
 elbows-0.044
Token sunshine
Token ID34488
Feature activation-1.92e-01

Pos logit contributions
 correct+0.021
 dismay+0.021
 happily+0.020
 surprise+0.019
ris+0.019
Neg logit contributions
 Boom-0.022
icated-0.022
llers-0.020
bel-0.020
irds-0.019
Token.
Token ID13
Feature activation+2.86e-02

Pos logit contributions
 Boom+0.303
icated+0.288
 elbows+0.258
stand+0.254
ained+0.252
Neg logit contributions
 happily-0.220
 correct-0.217
ris-0.214
 dismay-0.214
 assistant-0.196
Token Every
Token ID3887
Feature activation+7.24e-02

Pos logit contributions
ris+0.037
 correct+0.037
 dismay+0.037
 happily+0.034
 assistant+0.034
Neg logit contributions
 Boom-0.041
icated-0.038
stand-0.035
ained-0.034
weet-0.033
Token day
Token ID1110
Feature activation-7.71e-02

Pos logit contributions
ris+0.136
 correct+0.133
 dismay+0.131
 assistant+0.128
 Whoever+0.128
Neg logit contributions
 Boom-0.061
icated-0.055
stand-0.047
ained-0.046
oes-0.043
Token Boom
Token ID31824
Feature activation-4.12e-01

Pos logit contributions
 Boom+0.097
icated+0.086
stand+0.081
bows+0.077
drops+0.075
Neg logit contributions
ris-0.115
 dismay-0.114
 correct-0.113
 happily-0.106
 assistant-0.105
Token would
Token ID561
Feature activation-5.49e-02

Pos logit contributions
 Boom+0.578
icated+0.563
 elbows+0.553
ained+0.526
stand+0.514
Neg logit contributions
 happily-0.510
ris-0.494
 correct-0.482
 dismay-0.476
 certainly-0.461
Token go
Token ID467
Feature activation-2.22e-02

Pos logit contributions
 Boom+0.071
icated+0.068
stand+0.063
drops+0.060
bows+0.060
Neg logit contributions
ris-0.077
 dismay-0.074
 correct-0.074
 assistant-0.070
 happily-0.070
Token to
Token ID284
Feature activation-4.78e-02

Pos logit contributions
 Boom+0.033
icated+0.032
 elbows+0.028
stand+0.028
bows+0.028
Neg logit contributions
 correct-0.027
 happily-0.027
ris-0.027
 dismay-0.026
 assistant-0.025
Token a
Token ID257
Feature activation-3.52e-02

Pos logit contributions
 Boom+0.065
icated+0.062
stand+0.056
 elbows+0.054
bows+0.054
Neg logit contributions
ris-0.064
 dismay-0.064
 correct-0.064
 happily-0.062
 assistant-0.059
Token big
Token ID1263
Feature activation+3.24e-02

Pos logit contributions
 Boom+0.053
icated+0.051
stand+0.048
weet+0.045
Rain+0.045
Neg logit contributions
ris-0.041
 correct-0.040
 dismay-0.040
 happily-0.038
 assistant-0.037
Token\n
Token ID198
Feature activation-1.01e-02

Pos logit contributions
 dismay+0.022
 correct+0.022
ris+0.022
 happily+0.021
 assistant+0.021
Neg logit contributions
 Boom-0.024
icated-0.023
stand-0.020
oes-0.020
ained-0.020
Token\n
Token ID198
Feature activation-1.14e-02

Pos logit contributions
 Boom+0.013
icated+0.012
stand+0.011
ained+0.011
oes+0.011
Neg logit contributions
 correct-0.014
 dismay-0.014
ris-0.014
 happily-0.013
 assistant-0.013
TokenOne
Token ID3198
Feature activation+5.29e-02

Pos logit contributions
 Boom+0.016
icated+0.015
Rain+0.014
stand+0.013
Br+0.013
Neg logit contributions
 correct-0.015
 dismay-0.015
ris-0.015
 happily-0.014
 assistant-0.014
Token day
Token ID1110
Feature activation-7.95e-02

Pos logit contributions
 dismay+0.083
ris+0.083
 correct+0.083
 happily+0.082
 assistant+0.077
Neg logit contributions
 Boom-0.060
icated-0.056
stand-0.050
 elbows-0.050
bows-0.049
Token as
Token ID355
Feature activation-4.24e-02

Pos logit contributions
 Boom+0.099
icated+0.090
stand+0.080
bows+0.078
 elbows+0.077
Neg logit contributions
ris-0.119
 correct-0.119
 dismay-0.119
 happily-0.113
 assistant-0.110
Token Boom
Token ID31824
Feature activation-4.11e-01

Pos logit contributions
 Boom+0.055
icated+0.051
bel+0.044
Rain+0.043
ained+0.042
Neg logit contributions
 correct-0.063
ris-0.061
 dismay-0.060
 assistant-0.058
aleb-0.057
Token was
Token ID373
Feature activation+2.67e-02

Pos logit contributions
 Boom+0.537
icated+0.522
 elbows+0.514
ained+0.477
stand+0.477
Neg logit contributions
ris-0.549
 happily-0.546
 dismay-0.527
 correct-0.507
 certainly-0.489
Token walking
Token ID6155
Feature activation+9.19e-02

Pos logit contributions
 happily+0.031
 dismay+0.030
 correct+0.030
ris+0.028
 certainly+0.027
Neg logit contributions
icated-0.041
 Boom-0.040
bows-0.037
oes-0.036
 elbows-0.036
Token in
Token ID287
Feature activation+3.51e-02

Pos logit contributions
 happily+0.112
 correct+0.108
 dismay+0.107
ris+0.105
 assistant+0.101
Neg logit contributions
icated-0.137
 Boom-0.135
stand-0.122
ric-0.121
llers-0.121
Token the
Token ID262
Feature activation+1.55e-04

Pos logit contributions
ris+0.059
 correct+0.057
aleb+0.057
 certainly+0.057
 assistant+0.056
Neg logit contributions
 Boom-0.038
stand-0.032
icated-0.030
 elbows-0.030
 boom-0.027
Token field
Token ID2214
Feature activation+1.25e-01

Pos logit contributions
 correct+0.000
 dismay+0.000
 happily+0.000
ris+0.000
 surprise+0.000
Neg logit contributions
icated-0.000
 Boom-0.000
bel-0.000
bows-0.000
 elbows-0.000
Token bright
Token ID6016
Feature activation-7.99e-02

Pos logit contributions
ris+0.036
 dismay+0.034
 correct+0.033
aleb+0.033
 happily+0.033
Neg logit contributions
 Boom-0.032
icated-0.030
ained-0.028
stand-0.027
ric-0.027
Token and
Token ID290
Feature activation+1.22e-02

Pos logit contributions
 Boom+0.097
icated+0.090
bows+0.084
drops+0.081
fl+0.081
Neg logit contributions
 dismay-0.118
ris-0.117
 correct-0.116
 Whoever-0.111
 assistant-0.109
Token colorful
Token ID20239
Feature activation-2.77e-02

Pos logit contributions
 dismay+0.016
 correct+0.016
ris+0.016
 happily+0.016
 assistant+0.015
Neg logit contributions
 Boom-0.017
icated-0.016
stand-0.014
bows-0.014
 elbows-0.014
Token and
Token ID290
Feature activation+1.24e-02

Pos logit contributions
 Boom+0.031
icated+0.030
bows+0.029
drops+0.028
fl+0.028
Neg logit contributions
 dismay-0.043
ris-0.042
 correct-0.041
 Whoever-0.040
ait-0.040
Token so
Token ID523
Feature activation-5.12e-02

Pos logit contributions
ris+0.017
 dismay+0.017
 correct+0.017
 happily+0.016
 assistant+0.016
Neg logit contributions
 Boom-0.017
icated-0.015
bows-0.014
stand-0.014
drops-0.014
Token Boom
Token ID31824
Feature activation-4.11e-01

Pos logit contributions
 Boom+0.069
icated+0.064
bows+0.059
bel+0.058
drops+0.057
Neg logit contributions
ris-0.071
 dismay-0.069
 correct-0.067
aleb-0.067
 assistant-0.065
Token thought
Token ID1807
Feature activation+3.85e-03

Pos logit contributions
 Boom+0.593
icated+0.572
 elbows+0.548
stand+0.526
ained+0.522
Neg logit contributions
ris-0.500
 happily-0.495
 dismay-0.484
 correct-0.472
 certainly-0.450
Token it
Token ID340
Feature activation+3.25e-03

Pos logit contributions
ris+0.005
 dismay+0.005
aleb+0.005
 correct+0.005
 happily+0.005
Neg logit contributions
 Boom-0.005
icated-0.005
stand-0.004
bel-0.004
bows-0.004
Token was
Token ID373
Feature activation+2.63e-02

Pos logit contributions
ris+0.005
aleb+0.005
 dismay+0.005
ait+0.005
 purchase+0.005
Neg logit contributions
 Boom-0.004
icated-0.003
bows-0.003
fl-0.003
drops-0.003
Token fun
Token ID1257
Feature activation+1.89e-02

Pos logit contributions
ris+0.041
aleb+0.038
 dismay+0.037
 assistant+0.037
ait+0.036
Neg logit contributions
 Boom-0.033
icated-0.031
ric-0.027
ained-0.027
bows-0.026
Token.
Token ID13
Feature activation+1.34e-02

Pos logit contributions
 dismay+0.030
ris+0.029
 correct+0.029
aleb+0.028
 assistant+0.028
Neg logit contributions
 Boom-0.021
icated-0.021
bows-0.019
stand-0.018
fl-0.018
Token heard
Token ID2982
Feature activation-6.02e-02

Pos logit contributions
 correct+0.059
ris+0.058
 dismay+0.057
 assistant+0.056
 purchase+0.055
Neg logit contributions
 Boom-0.043
icated-0.039
drops-0.035
 elbows-0.034
ric-0.034
Token a
Token ID257
Feature activation-3.26e-02

Pos logit contributions
icated+0.096
oes+0.095
 elbows+0.094
drops+0.093
stand+0.092
Neg logit contributions
 dismay-0.063
 happily-0.061
 correct-0.057
 surprise-0.056
 security-0.054
Token loud
Token ID7812
Feature activation-7.85e-02

Pos logit contributions
 Boom+0.041
icated+0.039
ained+0.034
stand+0.034
bows+0.034
Neg logit contributions
 dismay-0.047
 correct-0.046
ris-0.046
 happily-0.045
 assistant-0.043
Token noise
Token ID7838
Feature activation-2.07e-01

Pos logit contributions
 Boom+0.070
icated+0.070
ained+0.064
 elbows+0.060
oes+0.059
Neg logit contributions
 happily-0.135
 dismay-0.134
 correct-0.133
ris-0.129
 Whoever-0.128
Token.
Token ID13
Feature activation+1.44e-02

Pos logit contributions
 Boom+0.309
icated+0.296
stand+0.267
bows+0.264
 elbows+0.264
Neg logit contributions
 dismay-0.249
ris-0.249
 correct-0.248
 happily-0.243
 assistant-0.227
Token Boom
Token ID31824
Feature activation-4.10e-01

Pos logit contributions
 dismay+0.018
 correct+0.018
ris+0.018
 happily+0.017
 Whoever+0.016
Neg logit contributions
 Boom-0.021
icated-0.020
 elbows-0.018
bows-0.018
stand-0.018
Token!
Token ID0
Feature activation+1.26e-01

Pos logit contributions
 Boom+0.762
icated+0.755
 elbows+0.742
oes+0.723
ained+0.722
Neg logit contributions
 happily-0.307
ris-0.301
 dismay-0.283
 correct-0.272
 Whoever-0.266
Token It
Token ID632
Feature activation-3.20e-03

Pos logit contributions
 dismay+0.156
ris+0.151
 correct+0.151
 Whoever+0.150
 happily+0.150
Neg logit contributions
icated-0.172
bel-0.162
 elbows-0.161
 Boom-0.158
bows-0.157
Token was
Token ID373
Feature activation+2.77e-02

Pos logit contributions
icated+0.005
 Boom+0.005
stand+0.005
 elbows+0.005
bows+0.005
Neg logit contributions
 dismay-0.003
 happily-0.003
 correct-0.003
ris-0.003
 certainly-0.003
Token thunder
Token ID18355
Feature activation-2.50e-01

Pos logit contributions
 dismay+0.040
 correct+0.040
 happily+0.039
ris+0.037
 surprise+0.036
Neg logit contributions
icated-0.035
 Boom-0.034
stand-0.030
bows-0.030
oes-0.030
Token.
Token ID13
Feature activation+1.55e-02

Pos logit contributions
 Boom+0.395
icated+0.383
 elbows+0.349
stand+0.345
ained+0.342
Neg logit contributions
ris-0.272
 dismay-0.270
 correct-0.270
 happily-0.269
 Whoever-0.245
Token!
Token ID0
Feature activation+1.25e-01

Pos logit contributions
 dismay+0.099
ris+0.096
 correct+0.093
 Whoever+0.090
aleb+0.089
Neg logit contributions
icated-0.063
 Boom-0.062
bows-0.058
stand-0.058
fl-0.056
Token\n
Token ID198
Feature activation-1.12e-02

Pos logit contributions
ris+0.167
 correct+0.166
 dismay+0.165
 happily+0.159
 assistant+0.153
Neg logit contributions
 Boom-0.174
icated-0.158
stand-0.146
ained-0.141
bows-0.141
Token\n
Token ID198
Feature activation-1.13e-02

Pos logit contributions
 Boom+0.015
icated+0.014
stand+0.013
ained+0.012
bel+0.012
Neg logit contributions
 correct-0.015
 dismay-0.015
ris-0.015
 happily-0.014
 assistant-0.014
TokenThen
Token ID6423
Feature activation-1.36e-01

Pos logit contributions
 Boom+0.012
icated+0.011
stand+0.010
Rain+0.010
bel+0.010
Neg logit contributions
ris-0.018
 correct-0.018
 dismay-0.018
 happily-0.017
 assistant-0.017
Token suddenly
Token ID6451
Feature activation-7.40e-02

Pos logit contributions
 Boom+0.181
icated+0.164
stand+0.150
 elbows+0.149
bows+0.148
Neg logit contributions
ris-0.192
 correct-0.189
 dismay-0.186
 happily-0.179
aleb-0.174
Token Boom
Token ID31824
Feature activation-4.08e-01

Pos logit contributions
 Boom+0.101
icated+0.100
oes+0.088
stand+0.088
ric+0.088
Neg logit contributions
 dismay-0.097
 happily-0.096
 correct-0.096
ris-0.095
 assistant-0.088
Token heard
Token ID2982
Feature activation-5.94e-02

Pos logit contributions
 Boom+0.601
icated+0.595
 elbows+0.571
ained+0.557
oes+0.554
Neg logit contributions
 happily-0.473
ris-0.470
 dismay-0.457
 correct-0.435
 certainly-0.418
Token a
Token ID257
Feature activation-3.35e-02

Pos logit contributions
icated+0.088
 Boom+0.088
 elbows+0.083
ric+0.082
oes+0.082
Neg logit contributions
 dismay-0.069
 happily-0.067
 correct-0.066
ris-0.065
 surprise-0.061
Token sound
Token ID2128
Feature activation-2.86e-01

Pos logit contributions
 Boom+0.039
icated+0.036
stand+0.032
bel+0.030
bows+0.030
Neg logit contributions
ris-0.053
 correct-0.052
 dismay-0.052
 happily-0.050
 assistant-0.048
Token coming
Token ID2406
Feature activation+3.98e-03

Pos logit contributions
 Boom+0.453
icated+0.451
 elbows+0.418
bows+0.410
ained+0.410
Neg logit contributions
 happily-0.309
 dismay-0.306
ris-0.303
 correct-0.297
 surprise-0.271
Token from
Token ID422
Feature activation-6.54e-02

Pos logit contributions
 correct+0.007
ris+0.007
 dismay+0.007
aleb+0.006
 purchase+0.006
Neg logit contributions
 Boom-0.004
icated-0.004
drops-0.003
fl-0.003
oes-0.003
Token.
Token ID13
Feature activation+1.28e-02

Pos logit contributions
ris+0.186
 dismay+0.184
aleb+0.178
 correct+0.178
 Whoever+0.174
Neg logit contributions
 Boom-0.146
icated-0.139
stand-0.135
bows-0.135
bel-0.131
Token From
Token ID3574
Feature activation+1.04e-01

Pos logit contributions
ris+0.017
 dismay+0.016
 correct+0.016
 happily+0.015
 assistant+0.015
Neg logit contributions
 Boom-0.019
icated-0.017
stand-0.016
bows-0.015
ained-0.015
Token that
Token ID326
Feature activation-1.39e-02

Pos logit contributions
 happily+0.131
 dismay+0.130
 correct+0.129
ris+0.128
 certainly+0.117
Neg logit contributions
 Boom-0.152
icated-0.145
stand-0.134
bows-0.131
 elbows-0.131
Token day
Token ID1110
Feature activation-7.72e-02

Pos logit contributions
 Boom+0.011
icated+0.011
stand+0.009
fl+0.009
oes+0.008
Neg logit contributions
ris-0.027
 dismay-0.026
 correct-0.025
 assistant-0.025
 happily-0.025
Token onwards
Token ID31719
Feature activation+7.39e-02

Pos logit contributions
 Boom+0.102
icated+0.087
ained+0.082
stand+0.082
 elbows+0.078
Neg logit contributions
ris-0.110
 dismay-0.108
 correct-0.107
 assistant-0.105
 purchase-0.104
Token Boom
Token ID31824
Feature activation-4.08e-01

Pos logit contributions
ris+0.136
 correct+0.128
 dismay+0.127
aleb+0.122
 Whoever+0.121
Neg logit contributions
 Boom-0.070
fl-0.056
icated-0.054
drops-0.052
stand-0.051
Token always
Token ID1464
Feature activation+7.73e-02

Pos logit contributions
 Boom+0.580
icated+0.553
stand+0.498
bows+0.493
drops+0.484
Neg logit contributions
 dismay-0.522
 correct-0.519
ris-0.519
 happily-0.503
 assistant-0.477
Token enjoyed
Token ID8359
Feature activation+9.79e-04

Pos logit contributions
 correct+0.099
 dismay+0.099
ris+0.098
 happily+0.096
 assistant+0.090
Neg logit contributions
 Boom-0.110
icated-0.105
stand-0.094
bows-0.093
 elbows-0.093
Token visiting
Token ID10013
Feature activation+4.58e-02

Pos logit contributions
ris+0.001
 dismay+0.001
 correct+0.001
 assistant+0.001
 Whoever+0.001
Neg logit contributions
 Boom-0.001
icated-0.001
stand-0.001
fl-0.001
 songs-0.001
Token the
Token ID262
Feature activation+1.23e-03

Pos logit contributions
ris+0.063
 dismay+0.061
 correct+0.060
 Whoever+0.057
 assistant+0.056
Neg logit contributions
 Boom-0.062
icated-0.059
stand-0.052
bows-0.052
 songs-0.051
Token magical
Token ID10883
Feature activation+1.27e-01

Pos logit contributions
ris+0.002
 dismay+0.002
 Whoever+0.002
 happily+0.002
 correct+0.002
Neg logit contributions
 Boom-0.002
icated-0.002
fl-0.001
bows-0.001
 songs-0.001
Token a
Token ID257
Feature activation-3.33e-02

Pos logit contributions
ris+0.037
 correct+0.036
 dismay+0.036
 happily+0.035
 Whoever+0.034
Neg logit contributions
 Boom-0.037
icated-0.035
stand-0.032
 elbows-0.031
ained-0.031
Token cheerful
Token ID37999
Feature activation-4.86e-02

Pos logit contributions
 Boom+0.050
icated+0.047
stand+0.043
bows+0.041
drops+0.041
Neg logit contributions
ris-0.040
 correct-0.040
 dismay-0.039
 happily-0.038
 Whoever-0.036
Token tune
Token ID14009
Feature activation-2.14e-01

Pos logit contributions
 Boom+0.062
icated+0.059
stand+0.052
 elbows+0.052
bows+0.052
Neg logit contributions
 dismay-0.069
ris-0.068
 correct-0.068
 happily-0.067
 assistant-0.064
Token and
Token ID290
Feature activation+1.40e-02

Pos logit contributions
 Boom+0.307
icated+0.292
stand+0.267
bows+0.261
drops+0.258
Neg logit contributions
 dismay-0.271
 correct-0.271
ris-0.270
 happily-0.258
 assistant-0.247
Token when
Token ID618
Feature activation-1.81e-01

Pos logit contributions
ris+0.018
 correct+0.018
 dismay+0.018
 happily+0.017
 assistant+0.016
Neg logit contributions
 Boom-0.020
icated-0.019
stand-0.017
bows-0.017
drops-0.017
Token Boom
Token ID31824
Feature activation-4.08e-01

Pos logit contributions
 Boom+0.239
icated+0.206
 elbows+0.186
bows+0.185
stand+0.185
Neg logit contributions
ris-0.275
 correct-0.259
aleb-0.257
 dismay-0.250
 assistant-0.245
Token looked
Token ID3114
Feature activation+1.68e-02

Pos logit contributions
 Boom+0.651
icated+0.636
 elbows+0.599
ained+0.586
stand+0.582
Neg logit contributions
ris-0.432
 happily-0.423
 dismay-0.420
 correct-0.417
 Whoever-0.385
Token closer
Token ID5699
Feature activation+6.43e-02

Pos logit contributions
 happily+0.019
 correct+0.018
 dismay+0.018
ris+0.016
 exactly+0.016
Neg logit contributions
icated-0.028
 Boom-0.027
bows-0.025
stand-0.025
ric-0.025
Token she
Token ID673
Feature activation+1.78e-02

Pos logit contributions
ris+0.096
 dismay+0.095
 correct+0.094
 assistant+0.089
 happily+0.088
Neg logit contributions
 Boom-0.080
icated-0.072
bows-0.065
stand-0.065
drops-0.065
Token saw
Token ID2497
Feature activation-1.13e-02

Pos logit contributions
 dismay+0.022
 correct+0.022
ris+0.022
 happily+0.022
 certainly+0.020
Neg logit contributions
 Boom-0.026
icated-0.025
stand-0.022
bows-0.022
ained-0.022
Token some
Token ID617
Feature activation-4.61e-02

Pos logit contributions
 Boom+0.018
icated+0.017
ained+0.016
 elbows+0.016
oes+0.016
Neg logit contributions
 dismay-0.012
 correct-0.012
 happily-0.012
ris-0.012
 certainly-0.011
Token pet
Token ID4273
Feature activation+1.29e-01

Pos logit contributions
 correct+0.004
 dismay+0.004
ris+0.004
 happily+0.004
 assistant+0.004
Neg logit contributions
 Boom-0.003
icated-0.003
ained-0.003
ric-0.003
 elbows-0.003
Tokenals
Token ID874
Feature activation-1.29e-01

Pos logit contributions
 correct+0.213
 dismay+0.209
ris+0.207
 purchase+0.190
 Whoever+0.190
Neg logit contributions
 Boom-0.147
icated-0.120
stand-0.115
drops-0.114
fl-0.113
Token opened
Token ID4721
Feature activation+1.11e-01

Pos logit contributions
 Boom+0.157
icated+0.139
drops+0.132
stand+0.132
bel+0.131
Neg logit contributions
ris-0.191
 correct-0.189
 dismay-0.186
 assistant-0.179
aleb-0.178
Token wider
Token ID10595
Feature activation-4.87e-02

Pos logit contributions
ris+0.176
 correct+0.166
 dismay+0.165
 assistant+0.158
aleb+0.157
Neg logit contributions
 Boom-0.132
icated-0.110
fl-0.107
bows-0.106
stand-0.106
Token,
Token ID11
Feature activation-4.64e-02

Pos logit contributions
 Boom+0.059
icated+0.055
stand+0.050
bows+0.049
drops+0.049
Neg logit contributions
ris-0.072
 correct-0.071
 dismay-0.071
 happily-0.068
 assistant-0.067
Token Boom
Token ID31824
Feature activation-4.07e-01

Pos logit contributions
 Boom+0.064
icated+0.056
bows+0.051
stand+0.050
drops+0.049
Neg logit contributions
ris-0.067
 correct-0.065
 dismay-0.064
aleb-0.062
 assistant-0.061
Token saw
Token ID2497
Feature activation-1.09e-02

Pos logit contributions
 Boom+0.689
icated+0.675
 elbows+0.635
ained+0.627
oes+0.621
Neg logit contributions
 happily-0.392
ris-0.390
 dismay-0.383
 correct-0.383
 certainly-0.351
Token a
Token ID257
Feature activation-3.34e-02

Pos logit contributions
 Boom+0.016
icated+0.016
 elbows+0.015
ained+0.014
stand+0.014
Neg logit contributions
 dismay-0.013
 correct-0.013
 happily-0.013
ris-0.012
 certainly-0.012
Token pretty
Token ID2495
Feature activation-5.60e-02

Pos logit contributions
 Boom+0.052
icated+0.051
ained+0.046
 elbows+0.046
oes+0.045
Neg logit contributions
 dismay-0.038
 correct-0.037
ris-0.037
 happily-0.037
 assistant-0.034
Token flower
Token ID15061
Feature activation+6.20e-02

Pos logit contributions
 Boom+0.092
icated+0.089
 elbows+0.083
ained+0.082
oes+0.081
Neg logit contributions
 happily-0.060
 dismay-0.059
 correct-0.058
ris-0.057
 assistant-0.053
Token blo
Token ID24924
Feature activation+2.17e-02

Pos logit contributions
ris+0.089
 dismay+0.088
 correct+0.086
aleb+0.084
 Whoever+0.083
Neg logit contributions
 Boom-0.074
icated-0.071
stand-0.070
bows-0.069
drops-0.068
Token there
Token ID612
Feature activation+6.90e-02

Pos logit contributions
 happily+0.125
 dismay+0.125
ris+0.124
 correct+0.124
 Whoever+0.114
Neg logit contributions
 Boom-0.132
icated-0.128
 elbows-0.115
stand-0.115
bows-0.115
Token was
Token ID373
Feature activation+2.30e-02

Pos logit contributions
 correct+0.099
ris+0.098
 dismay+0.095
 assistant+0.094
 Whoever+0.091
Neg logit contributions
 Boom-0.089
icated-0.081
ained-0.072
bows-0.071
stand-0.071
Token a
Token ID257
Feature activation-3.62e-02

Pos logit contributions
 happily+0.028
 correct+0.028
 dismay+0.028
ris+0.027
 certainly+0.025
Neg logit contributions
 Boom-0.034
icated-0.032
stand-0.030
drops-0.030
oes-0.030
Token cow
Token ID9875
Feature activation+1.20e-01

Pos logit contributions
 Boom+0.050
icated+0.046
stand+0.041
Rain+0.041
 songs+0.041
Neg logit contributions
 correct-0.048
ris-0.048
 dismay-0.047
 Whoever-0.043
 certainly-0.043
Token named
Token ID3706
Feature activation+2.84e-03

Pos logit contributions
ris+0.133
 happily+0.133
 dismay+0.129
 correct+0.128
 assistant+0.120
Neg logit contributions
 Boom-0.194
icated-0.185
 elbows-0.170
stand-0.167
bows-0.166
Token Boom
Token ID31824
Feature activation-4.07e-01

Pos logit contributions
ris+0.004
 dismay+0.004
 correct+0.004
 happily+0.003
 assistant+0.003
Neg logit contributions
 Boom-0.004
icated-0.004
stand-0.004
weet-0.004
bows-0.004
Token.
Token ID13
Feature activation+8.27e-03

Pos logit contributions
 elbows+0.621
 Boom+0.615
icated+0.601
ained+0.582
stand+0.559
Neg logit contributions
 happily-0.466
ris-0.461
 dismay-0.429
 Whoever-0.394
 assistant-0.387
Token Boom
Token ID31824
Feature activation-4.07e-01

Pos logit contributions
 correct+0.010
ris+0.010
 Whoever+0.010
 happily+0.010
 dismay+0.010
Neg logit contributions
 elbows-0.011
bel-0.011
stand-0.011
bery-0.010
icated-0.010
Token was
Token ID373
Feature activation+2.23e-02

Pos logit contributions
 elbows+0.765
rings+0.690
ained+0.686
icated+0.681
Br+0.680
Neg logit contributions
 happily-0.371
ris-0.337
 Whoever-0.293
 dismay-0.289
 surprise-0.280
Token a
Token ID257
Feature activation-3.70e-02

Pos logit contributions
 happily+0.023
 correct+0.022
 only+0.021
 just+0.020
 dismay+0.020
Neg logit contributions
icated-0.037
 fiber-0.037
irds-0.037
rings-0.036
aid-0.035
Token very
Token ID845
Feature activation-3.96e-02

Pos logit contributions
 Boom+0.054
icated+0.052
stand+0.047
 elbows+0.047
bows+0.046
Neg logit contributions
 dismay-0.045
ris-0.045
 correct-0.045
 happily-0.044
 assistant-0.041
Token a
Token ID257
Feature activation-3.62e-02

Pos logit contributions
 happily+0.028
 correct+0.028
 dismay+0.028
ris+0.027
 certainly+0.025
Neg logit contributions
 Boom-0.034
icated-0.032
stand-0.030
drops-0.030
oes-0.030
Token cow
Token ID9875
Feature activation+1.20e-01

Pos logit contributions
 Boom+0.050
icated+0.046
stand+0.041
Rain+0.041
 songs+0.041
Neg logit contributions
 correct-0.048
ris-0.048
 dismay-0.047
 Whoever-0.043
 certainly-0.043
Token named
Token ID3706
Feature activation+2.84e-03

Pos logit contributions
ris+0.133
 happily+0.133
 dismay+0.129
 correct+0.128
 assistant+0.120
Neg logit contributions
 Boom-0.194
icated-0.185
 elbows-0.170
stand-0.167
bows-0.166
Token Boom
Token ID31824
Feature activation-4.07e-01

Pos logit contributions
ris+0.004
 dismay+0.004
 correct+0.004
 happily+0.003
 assistant+0.003
Neg logit contributions
 Boom-0.004
icated-0.004
stand-0.004
weet-0.004
bows-0.004
Token.
Token ID13
Feature activation+8.27e-03

Pos logit contributions
 elbows+0.621
 Boom+0.615
icated+0.601
ained+0.582
stand+0.559
Neg logit contributions
 happily-0.466
ris-0.461
 dismay-0.429
 Whoever-0.394
 assistant-0.387
Token Boom
Token ID31824
Feature activation-4.07e-01

Pos logit contributions
 correct+0.010
ris+0.010
 Whoever+0.010
 happily+0.010
 dismay+0.010
Neg logit contributions
 elbows-0.011
bel-0.011
stand-0.011
bery-0.010
icated-0.010
Token was
Token ID373
Feature activation+2.23e-02

Pos logit contributions
 elbows+0.765
rings+0.690
ained+0.686
icated+0.681
Br+0.680
Neg logit contributions
 happily-0.371
ris-0.337
 Whoever-0.293
 dismay-0.289
 surprise-0.280
Token a
Token ID257
Feature activation-3.70e-02

Pos logit contributions
 happily+0.023
 correct+0.022
 only+0.021
 just+0.020
 dismay+0.020
Neg logit contributions
icated-0.037
 fiber-0.037
irds-0.037
rings-0.036
aid-0.035
Token very
Token ID845
Feature activation-3.96e-02

Pos logit contributions
 Boom+0.054
icated+0.052
stand+0.047
 elbows+0.047
bows+0.046
Neg logit contributions
 dismay-0.045
ris-0.045
 correct-0.045
 happily-0.044
 assistant-0.041
Token happy
Token ID3772
Feature activation-2.81e-02

Pos logit contributions
 Boom+0.048
icated+0.039
drops+0.039
bel+0.038
bows+0.037
Neg logit contributions
ris-0.062
 dismay-0.060
 correct-0.059
aleb-0.058
 Whoever-0.057
Token cow
Token ID9875
Feature activation+1.19e-01

Pos logit contributions
 Boom+0.043
icated+0.039
bows+0.037
stand+0.036
fl+0.036
Neg logit contributions
 correct-0.035
ris-0.035
 dismay-0.034
 Whoever-0.031
 happily-0.030
Token.
Token ID13
Feature activation+1.86e-02

Pos logit contributions
 correct+0.010
 happily+0.010
ris+0.009
 dismay+0.009
 Whoever+0.009
Neg logit contributions
 Boom-0.011
icated-0.010
 elbows-0.010
stand-0.009
bows-0.009
Token Pow
Token ID14120
Feature activation-1.47e-01

Pos logit contributions
ris+0.024
 dismay+0.024
 correct+0.024
 happily+0.024
 assistant+0.022
Neg logit contributions
 Boom-0.027
icated-0.025
stand-0.022
 elbows-0.022
oes-0.021
Token!
Token ID0
Feature activation+1.35e-01

Pos logit contributions
 Boom+0.213
icated+0.204
 elbows+0.187
ained+0.184
stand+0.183
Neg logit contributions
 happily-0.184
 correct-0.182
 dismay-0.181
ris-0.181
 assistant-0.168
Token Zap
Token ID36079
Feature activation-1.68e-01

Pos logit contributions
ris+0.173
 dismay+0.173
 correct+0.173
 happily+0.169
 assistant+0.158
Neg logit contributions
 Boom-0.195
icated-0.184
stand-0.166
bows-0.160
oes-0.159
Token!
Token ID0
Feature activation+1.36e-01

Pos logit contributions
 Boom+0.239
icated+0.229
stand+0.206
 elbows+0.203
bows+0.203
Neg logit contributions
ris-0.215
 dismay-0.214
 correct-0.214
 happily-0.209
 assistant-0.197
Token Boom
Token ID31824
Feature activation-4.06e-01

Pos logit contributions
 dismay+0.173
ris+0.173
 correct+0.173
 happily+0.168
 assistant+0.159
Neg logit contributions
 Boom-0.194
icated-0.184
stand-0.166
bows-0.163
 elbows-0.161
Token!"
Token ID2474
Feature activation-3.85e-02

Pos logit contributions
 Boom+0.604
icated+0.588
 elbows+0.566
oes+0.551
bel+0.550
Neg logit contributions
 happily-0.464
ris-0.458
 correct-0.452
 dismay-0.449
 Whoever-0.424
Token\n
Token ID198
Feature activation-7.80e-03

Pos logit contributions
icated+0.061
 Boom+0.060
ained+0.056
bows+0.056
ric+0.056
Neg logit contributions
 happily-0.041
 dismay-0.041
 correct-0.040
ris-0.039
 Whoever-0.037
Token\n
Token ID198
Feature activation-8.08e-03

Pos logit contributions
 Boom+0.012
icated+0.011
bows+0.010
drops+0.010
ained+0.010
Neg logit contributions
 dismay-0.009
ris-0.009
 happily-0.009
 correct-0.009
 Whoever-0.008
TokenBen
Token ID11696
Feature activation-6.13e-02

Pos logit contributions
 Boom+0.009
icated+0.009
stand+0.008
Rain+0.008
 elbows+0.008
Neg logit contributions
 correct-0.013
ris-0.012
 dismay-0.012
 happily-0.012
 purchase-0.012
Token watched
Token ID7342
Feature activation-1.13e-02

Pos logit contributions
 Boom+0.092
icated+0.089
stand+0.081
ained+0.080
 elbows+0.080
Neg logit contributions
 dismay-0.072
 correct-0.072
ris-0.072
 happily-0.071
 assistant-0.065
Token "
Token ID366
Feature activation+1.09e-02

Pos logit contributions
 Boom+0.073
icated+0.071
stand+0.064
 elbows+0.064
bows+0.063
Neg logit contributions
 correct-0.051
 dismay-0.051
 happily-0.050
ris-0.050
 assistant-0.046
TokenS
Token ID50
Feature activation+6.39e-03

Pos logit contributions
 correct+0.010
 Whoever+0.010
 dismay+0.009
ris+0.009
 certainly+0.009
Neg logit contributions
icated-0.019
 Boom-0.018
ric-0.017
ained-0.017
stand-0.017
Tokenof
Token ID1659
Feature activation-2.42e-02

Pos logit contributions
 dismay+0.008
 correct+0.008
 happily+0.008
ris+0.008
 Whoever+0.007
Neg logit contributions
 Boom-0.009
icated-0.009
bows-0.008
oes-0.008
stand-0.008
Tokenish
Token ID680
Feature activation+5.89e-02

Pos logit contributions
icated+0.027
 Boom+0.026
bows+0.022
stand+0.022
oes+0.021
Neg logit contributions
 dismay-0.040
ris-0.039
 correct-0.038
 happily-0.038
 assistant-0.037
Tokent
Token ID83
Feature activation-1.06e-01

Pos logit contributions
ris+0.073
 dismay+0.073
 correct+0.071
 happily+0.070
 Whoever+0.066
Neg logit contributions
 Boom-0.086
icated-0.083
stand-0.074
bows-0.073
ric-0.073
Tokenicated
Token ID3474
Feature activation-3.81e-01

Pos logit contributions
 Boom+0.157
drops+0.149
 elbows+0.147
bel+0.143
Br+0.141
Neg logit contributions
 correct-0.124
 dismay-0.119
 happily-0.118
 Whoever-0.118
ris-0.116
Token!"
Token ID2474
Feature activation-4.96e-02

Pos logit contributions
 Boom+0.511
icated+0.496
stand+0.442
bows+0.430
drops+0.418
Neg logit contributions
 dismay-0.533
ris-0.531
 correct-0.517
 happily-0.495
 assistant-0.486
Token She
Token ID1375
Feature activation+3.57e-02

Pos logit contributions
 Boom+0.081
icated+0.080
ric+0.072
bows+0.072
ained+0.072
Neg logit contributions
 happily-0.052
 dismay-0.052
 correct-0.051
ris-0.049
 Whoever-0.046
Token laughs
Token ID22051
Feature activation-6.61e-02

Pos logit contributions
ris+0.057
ait+0.052
 dismay+0.050
 purchase+0.050
 assistant+0.049
Neg logit contributions
 Boom-0.045
icated-0.038
bows-0.037
 noises-0.037
 rhy-0.036
Token.
Token ID13
Feature activation+6.38e-04

Pos logit contributions
 Boom+0.111
icated+0.110
ric+0.102
oes+0.101
stand+0.101
Neg logit contributions
 happily-0.069
 dismay-0.064
 correct-0.064
ris-0.059
 assistant-0.057
Token Ben
Token ID3932
Feature activation+2.30e-02

Pos logit contributions
ris+0.001
 correct+0.001
 dismay+0.001
 happily+0.001
 assistant+0.001
Neg logit contributions
 Boom-0.001
icated-0.001
stand-0.001
bows-0.001
 elbows-0.001
Token put
Token ID1234
Feature activation+9.05e-02

Pos logit contributions
ris+0.029
 dismay+0.029
 correct+0.028
 happily+0.027
 assistant+0.026
Neg logit contributions
 Boom-0.035
icated-0.033
stand-0.030
 elbows-0.030
bows-0.030
Token it
Token ID340
Feature activation+1.30e-02

Pos logit contributions
ris+0.121
 dismay+0.119
 correct+0.116
 assistant+0.111
 happily+0.111
Neg logit contributions
 Boom-0.128
icated-0.117
stand-0.109
 elbows-0.106
bows-0.104
Token on
Token ID319
Feature activation-1.78e-02

Pos logit contributions
 correct+0.015
 dismay+0.015
 happily+0.015
ris+0.015
 assistant+0.013
Neg logit contributions
 Boom-0.020
icated-0.019
stand-0.018
bows-0.017
 elbows-0.017
Token his
Token ID465
Feature activation+4.04e-02

Pos logit contributions
 Boom+0.018
stand+0.018
icated+0.017
 ger+0.014
 songs+0.014
Neg logit contributions
ris-0.029
aleb-0.028
 dismay-0.028
 assistant-0.028
 genius-0.027
Token night
Token ID1755
Feature activation-1.46e-01

Pos logit contributions
 correct+0.057
 dismay+0.057
 happily+0.056
ris+0.055
 assistant+0.052
Neg logit contributions
 Boom-0.054
icated-0.051
ained-0.045
bows-0.045
oes-0.043
Tokenstand
Token ID1481
Feature activation-3.38e-01

Pos logit contributions
 Boom+0.144
icated+0.139
ained+0.115
oes+0.111
ric+0.107
Neg logit contributions
 happily-0.258
 correct-0.256
 dismay-0.248
ris-0.246
 surprise-0.235
Token.
Token ID13
Feature activation+2.72e-02

Pos logit contributions
 Boom+0.457
icated+0.438
stand+0.415
bows+0.402
bel+0.393
Neg logit contributions
 dismay-0.445
 correct-0.439
ris-0.437
 happily-0.412
 Whoever-0.407
Token He
Token ID679
Feature activation+3.56e-02

Pos logit contributions
ris+0.037
 correct+0.037
 dismay+0.037
 happily+0.035
 assistant+0.034
Neg logit contributions
 Boom-0.038
icated-0.034
stand-0.032
ained-0.029
weet-0.029
Token did
Token ID750
Feature activation-1.13e-02

Pos logit contributions
ris+0.050
 dismay+0.049
 correct+0.049
 assistant+0.047
 Whoever+0.047
Neg logit contributions
 Boom-0.044
icated-0.042
stand-0.038
 elbows-0.036
bel-0.036
Token not
Token ID407
Feature activation-3.89e-02

Pos logit contributions
 Boom+0.014
icated+0.014
stand+0.013
bows+0.012
 elbows+0.012
Neg logit contributions
ris-0.016
 dismay-0.016
 correct-0.016
 happily-0.015
 assistant-0.015
Token want
Token ID765
Feature activation-1.19e-02

Pos logit contributions
 Boom+0.051
icated+0.048
stand+0.042
ained+0.042
 elbows+0.042
Neg logit contributions
 dismay-0.054
 correct-0.054
ris-0.054
 happily-0.053
 assistant-0.050
Token glowing
Token ID21377
Feature activation-9.49e-02

Pos logit contributions
 correct+0.002
 dismay+0.002
 happily+0.002
 surprise+0.002
ris+0.002
Neg logit contributions
icated-0.004
 Boom-0.004
bows-0.004
llers-0.004
ained-0.004
Token rock
Token ID3881
Feature activation-1.87e-02

Pos logit contributions
icated+0.176
 Boom+0.173
ained+0.163
bows+0.163
 elbows+0.162
Neg logit contributions
 happily-0.088
 correct-0.080
 surprise-0.076
 dismay-0.076
ris-0.074
Token on
Token ID319
Feature activation-2.27e-02

Pos logit contributions
 Boom+0.028
icated+0.027
stand+0.024
bows+0.024
 elbows+0.024
Neg logit contributions
 correct-0.022
ris-0.022
 dismay-0.022
 happily-0.022
 assistant-0.020
Token his
Token ID465
Feature activation+3.05e-02

Pos logit contributions
 Boom+0.035
icated+0.033
stand+0.030
bows+0.030
 elbows+0.030
Neg logit contributions
ris-0.027
 correct-0.027
 dismay-0.026
 happily-0.025
 assistant-0.024
Token night
Token ID1755
Feature activation-1.47e-01

Pos logit contributions
 dismay+0.039
 correct+0.039
 happily+0.039
ris+0.037
 purchase+0.036
Neg logit contributions
 Boom-0.045
icated-0.043
ained-0.040
bows-0.040
drops-0.038
Tokenstand
Token ID1481
Feature activation-3.37e-01

Pos logit contributions
icated+0.176
 Boom+0.166
ained+0.156
 elbows+0.144
ric+0.142
Neg logit contributions
 happily-0.231
 correct-0.226
ris-0.222
 surprise-0.221
 dismay-0.214
Token before
Token ID878
Feature activation-1.83e-01

Pos logit contributions
 Boom+0.437
icated+0.417
stand+0.399
bows+0.385
ained+0.373
Neg logit contributions
 dismay-0.465
 correct-0.460
ris-0.452
 happily-0.427
 Whoever-0.425
Token he
Token ID339
Feature activation-1.97e-03

Pos logit contributions
 Boom+0.277
icated+0.270
oes+0.252
bows+0.251
bel+0.247
Neg logit contributions
 happily-0.215
 dismay-0.213
 correct-0.202
ris-0.199
 purchase-0.189
Token went
Token ID1816
Feature activation-1.30e-02

Pos logit contributions
 Boom+0.003
icated+0.002
stand+0.002
bows+0.002
 elbows+0.002
Neg logit contributions
 dismay-0.003
ris-0.003
 correct-0.003
 happily-0.003
 assistant-0.003
Token to
Token ID284
Feature activation-4.72e-02

Pos logit contributions
 Boom+0.019
icated+0.018
bows+0.016
stand+0.016
 elbows+0.016
Neg logit contributions
 correct-0.016
 dismay-0.016
ris-0.016
 happily-0.016
 assistant-0.015
Token bed
Token ID3996
Feature activation-1.07e-01

Pos logit contributions
 Boom+0.054
icated+0.051
stand+0.045
 elbows+0.043
ric+0.042
Neg logit contributions
ris-0.074
 correct-0.073
 dismay-0.072
 happily-0.070
 assistant-0.069
Token stood
Token ID6204
Feature activation-5.90e-02

Pos logit contributions
 dismay+0.067
 correct+0.066
ris+0.066
 happily+0.062
 assistant+0.062
Neg logit contributions
 Boom-0.066
icated-0.063
bows-0.056
stand-0.056
drops-0.055
Token on
Token ID319
Feature activation-2.13e-02

Pos logit contributions
 Boom+0.089
icated+0.086
stand+0.077
bows+0.076
 elbows+0.075
Neg logit contributions
 correct-0.070
 dismay-0.070
 happily-0.069
ris-0.069
 assistant-0.063
Token her
Token ID607
Feature activation+4.65e-02

Pos logit contributions
 Boom+0.028
icated+0.025
stand+0.024
 elbows+0.023
bows+0.022
Neg logit contributions
ris-0.031
 correct-0.029
 dismay-0.029
 assistant-0.029
aleb-0.029
Token t
Token ID256
Feature activation+6.40e-02

Pos logit contributions
 happily+0.064
 correct+0.063
 dismay+0.063
ris+0.058
 surprise+0.057
Neg logit contributions
 Boom-0.066
icated-0.064
ained-0.057
bows-0.055
oes-0.053
Tokenipt
Token ID10257
Feature activation+4.63e-02

Pos logit contributions
ris+0.103
 happily+0.100
 dismay+0.096
 assistant+0.093
aleb+0.091
Neg logit contributions
 Boom-0.080
icated-0.072
 keyboards-0.065
bows-0.063
drops-0.063
Tokenoes
Token ID3028
Feature activation-3.35e-01

Pos logit contributions
ris+0.070
 happily+0.068
 correct+0.064
 dismay+0.062
 assistant+0.061
Neg logit contributions
icated-0.058
 Boom-0.056
 elbows-0.053
 keyboards-0.052
 ger-0.051
Token to
Token ID284
Feature activation-4.87e-02

Pos logit contributions
 Boom+0.438
icated+0.405
stand+0.373
bows+0.367
drops+0.364
Neg logit contributions
 dismay-0.466
ris-0.465
 correct-0.454
 assistant-0.425
 happily-0.424
Token reach
Token ID3151
Feature activation+1.35e-01

Pos logit contributions
 Boom+0.059
icated+0.055
stand+0.053
weet+0.049
 elbows+0.047
Neg logit contributions
ris-0.077
aleb-0.070
 correct-0.069
 assistant-0.068
 dismay-0.068
Token the
Token ID262
Feature activation+9.90e-03

Pos logit contributions
ris+0.196
 dismay+0.187
 correct+0.179
 assistant+0.179
aleb+0.178
Neg logit contributions
 Boom-0.171
icated-0.159
stand-0.148
oes-0.141
 elbows-0.138
Token spoon
Token ID24556
Feature activation+9.02e-02

Pos logit contributions
ris+0.012
 dismay+0.012
 correct+0.012
 happily+0.012
 assistant+0.011
Neg logit contributions
 Boom-0.015
icated-0.014
stand-0.013
bows-0.013
 elbows-0.012
Token.
Token ID13
Feature activation+1.69e-02

Pos logit contributions
 dismay+0.134
ris+0.131
 correct+0.127
aleb+0.120
 Whoever+0.120
Neg logit contributions
 Boom-0.107
icated-0.104
stand-0.097
drops-0.097
oes-0.096
Token their
Token ID511
Feature activation+7.32e-02

Pos logit contributions
 happily+0.039
 dismay+0.035
 correct+0.035
ris+0.032
 exactly+0.031
Neg logit contributions
 Boom-0.063
icated-0.060
bows-0.057
stand-0.056
drops-0.056
Token popcorn
Token ID38914
Feature activation+1.37e-02

Pos logit contributions
 correct+0.091
 dismay+0.091
 happily+0.091
ris+0.085
 surprise+0.083
Neg logit contributions
 Boom-0.109
icated-0.103
ained-0.094
bows-0.093
stand-0.092
Token while
Token ID981
Feature activation-1.47e-01

Pos logit contributions
 correct+0.017
ris+0.017
 dismay+0.017
 happily+0.017
 assistant+0.015
Neg logit contributions
 Boom-0.020
icated-0.019
stand-0.017
 elbows-0.017
bows-0.017
Token it
Token ID340
Feature activation+7.12e-03

Pos logit contributions
 Boom+0.191
icated+0.181
stand+0.170
 elbows+0.164
bel+0.157
Neg logit contributions
ris-0.216
 correct-0.207
 dismay-0.200
 Whoever-0.190
 assistant-0.189
Token r
Token ID374
Feature activation-5.58e-03

Pos logit contributions
 correct+0.009
 dismay+0.009
ris+0.009
 happily+0.009
 assistant+0.008
Neg logit contributions
 Boom-0.010
icated-0.010
stand-0.009
 elbows-0.009
bows-0.009
Tokenained
Token ID1328
Feature activation-3.34e-01

Pos logit contributions
icated+0.005
 Boom+0.004
 rhy+0.004
 keyboards+0.004
stand+0.004
Neg logit contributions
 happily-0.011
ris-0.010
ait-0.009
 correct-0.009
 dismay-0.009
Token outside
Token ID2354
Feature activation-3.88e-02

Pos logit contributions
 Boom+0.559
icated+0.542
 elbows+0.496
llers+0.487
ric+0.483
Neg logit contributions
 happily-0.361
 correct-0.338
ris-0.336
 dismay-0.322
 Whoever-0.292
Token.
Token ID13
Feature activation+2.15e-02

Pos logit contributions
 Boom+0.053
icated+0.050
stand+0.046
bows+0.045
drops+0.045
Neg logit contributions
ris-0.052
 dismay-0.051
 correct-0.051
 happily-0.048
 assistant-0.047
Token Lily
Token ID20037
Feature activation-7.53e-03

Pos logit contributions
ris+0.027
 correct+0.027
 dismay+0.027
 happily+0.026
 assistant+0.025
Neg logit contributions
 Boom-0.032
icated-0.030
stand-0.027
ained-0.026
bows-0.026
Token was
Token ID373
Feature activation+2.86e-02

Pos logit contributions
 Boom+0.012
icated+0.011
oes+0.010
stand+0.010
bows+0.010
Neg logit contributions
 happily-0.009
 dismay-0.009
 correct-0.009
ris-0.009
 certainly-0.008
Token happy
Token ID3772
Feature activation-3.32e-02

Pos logit contributions
 dismay+0.041
 happily+0.041
 correct+0.040
ris+0.037
 certainly+0.037
Neg logit contributions
 Boom-0.038
icated-0.038
oes-0.033
bows-0.033
stand-0.032
Token wet
Token ID9583
Feature activation-1.14e-01

Pos logit contributions
ris+0.025
 dismay+0.025
 correct+0.024
 happily+0.024
aleb+0.023
Neg logit contributions
 Boom-0.023
icated-0.021
ained-0.020
stand-0.019
bel-0.019
Token because
Token ID780
Feature activation-7.08e-02

Pos logit contributions
 Boom+0.182
icated+0.174
ric+0.160
stand+0.159
 elbows+0.159
Neg logit contributions
 correct-0.126
 happily-0.125
ris-0.123
 dismay-0.121
 assistant-0.111
Token it
Token ID340
Feature activation+1.37e-03

Pos logit contributions
 Boom+0.101
icated+0.095
stand+0.084
ained+0.084
oes+0.081
Neg logit contributions
ris-0.093
 dismay-0.091
 correct-0.091
 happily-0.088
 assistant-0.085
Token had
Token ID550
Feature activation-1.43e-02

Pos logit contributions
ris+0.002
 dismay+0.002
 correct+0.002
 happily+0.002
 assistant+0.002
Neg logit contributions
 Boom-0.002
icated-0.002
stand-0.002
bows-0.002
drops-0.002
Token r
Token ID374
Feature activation-1.09e-02

Pos logit contributions
 Boom+0.021
icated+0.020
bows+0.018
ric+0.018
oes+0.018
Neg logit contributions
 correct-0.018
 dismay-0.017
 happily-0.017
ris-0.017
 certainly-0.016
Tokenained
Token ID1328
Feature activation-3.34e-01

Pos logit contributions
 Boom+0.010
icated+0.010
 keyboards+0.009
stand+0.009
 rhy+0.009
Neg logit contributions
ris-0.019
ait-0.017
 correct-0.017
 happily-0.017
 dismay-0.017
Token.
Token ID13
Feature activation+1.10e-02

Pos logit contributions
 Boom+0.563
icated+0.547
 elbows+0.530
llers+0.513
 keyboards+0.508
Neg logit contributions
 happily-0.344
 correct-0.322
ris-0.317
 dismay-0.287
 exactly-0.283
Token Tim
Token ID5045
Feature activation+4.69e-02

Pos logit contributions
 dismay+0.011
 correct+0.011
ris+0.011
 happily+0.011
 assistant+0.010
Neg logit contributions
 Boom-0.018
icated-0.018
stand-0.016
bows-0.016
 elbows-0.016
Token wanted
Token ID2227
Feature activation+1.05e-02

Pos logit contributions
 happily+0.059
ris+0.058
 dismay+0.058
 correct+0.057
 assistant+0.053
Neg logit contributions
 Boom-0.069
icated-0.066
stand-0.060
bows-0.060
bel-0.059
Token to
Token ID284
Feature activation-4.91e-02

Pos logit contributions
 dismay+0.013
 correct+0.013
 happily+0.013
ris+0.013
 surprise+0.012
Neg logit contributions
 Boom-0.015
icated-0.015
bows-0.014
drops-0.014
 elbows-0.013
Token use
Token ID779
Feature activation+1.68e-01

Pos logit contributions
 Boom+0.068
icated+0.064
stand+0.059
drops+0.056
 elbows+0.056
Neg logit contributions
ris-0.065
 correct-0.064
 dismay-0.064
 happily-0.062
 assistant-0.060
Token,
Token ID11
Feature activation-5.30e-02

Pos logit contributions
 Boom+0.067
icated+0.063
stand+0.057
bows+0.056
 elbows+0.056
Neg logit contributions
ris-0.062
 dismay-0.062
 correct-0.061
 assistant-0.057
 happily-0.057
Token especially
Token ID2592
Feature activation-6.92e-02

Pos logit contributions
 Boom+0.085
icated+0.079
 elbows+0.075
ained+0.074
bows+0.071
Neg logit contributions
 happily-0.061
 correct-0.058
 dismay-0.057
 Whoever-0.055
 certainly-0.054
Token when
Token ID618
Feature activation-1.90e-01

Pos logit contributions
 Boom+0.119
icated+0.116
stand+0.113
ric+0.112
 keyboards+0.107
Neg logit contributions
 happily-0.078
 correct-0.060
 pic-0.058
 dismay-0.058
 addition-0.058
Token it
Token ID340
Feature activation+1.73e-03

Pos logit contributions
 Boom+0.272
icated+0.258
stand+0.232
ained+0.230
 elbows+0.230
Neg logit contributions
ris-0.248
 correct-0.241
 dismay-0.240
 happily-0.228
 assistant-0.226
Token r
Token ID374
Feature activation-1.20e-02

Pos logit contributions
 correct+0.002
 dismay+0.002
 happily+0.002
ris+0.002
 certainly+0.002
Neg logit contributions
 Boom-0.003
icated-0.003
stand-0.002
 elbows-0.002
bows-0.002
Tokenained
Token ID1328
Feature activation-3.34e-01

Pos logit contributions
icated+0.009
stand+0.008
 Boom+0.008
 keyboards+0.007
 rhy+0.007
Neg logit contributions
ris-0.024
 happily-0.023
 correct-0.022
-0.021
 dismay-0.021
Token.
Token ID13
Feature activation+7.56e-03

Pos logit contributions
icated+0.602
 Boom+0.590
 elbows+0.576
llers+0.569
 keyboards+0.564
Neg logit contributions
 happily-0.315
 correct-0.278
ris-0.252
 certainly-0.229
 Whoever-0.228
Token One
Token ID1881
Feature activation+9.21e-02

Pos logit contributions
 dismay+0.007
ris+0.007
 correct+0.007
 happily+0.007
 assistant+0.006
Neg logit contributions
 Boom-0.013
icated-0.013
stand-0.012
bows-0.012
 elbows-0.012
Token day
Token ID1110
Feature activation-8.38e-02

Pos logit contributions
 dismay+0.126
 happily+0.124
ris+0.123
 correct+0.123
 certainly+0.114
Neg logit contributions
 Boom-0.123
icated-0.120
 elbows-0.114
stand-0.109
llers-0.107
Token,
Token ID11
Feature activation-5.13e-02

Pos logit contributions
icated+0.138
ric+0.136
 Boom+0.130
oes+0.129
llers+0.125
Neg logit contributions
 happily-0.096
 dismay-0.084
 correct-0.083
ris-0.078
 surprise-0.077
Token Lily
Token ID20037
Feature activation-1.56e-02

Pos logit contributions
 Boom+0.083
icated+0.082
ric+0.075
oes+0.074
 elbows+0.073
Neg logit contributions
 happily-0.056
 dismay-0.055
 correct-0.054
ris-0.052
 Whoever-0.050
Token even
Token ID772
Feature activation-1.61e-02

Pos logit contributions
 Boom+0.016
icated+0.015
 elbows+0.014
stand+0.014
drops+0.014
Neg logit contributions
 correct-0.013
 dismay-0.013
 happily-0.013
ris-0.013
 assistant-0.012
Token more
Token ID517
Feature activation-7.79e-04

Pos logit contributions
 Boom+0.022
 elbows+0.021
ric+0.020
icated+0.020
drops+0.020
Neg logit contributions
 happily-0.023
 correct-0.022
 dismay-0.022
ris-0.021
 surprise-0.020
Token about
Token ID546
Feature activation-4.24e-02

Pos logit contributions
 Boom+0.001
icated+0.001
stand+0.001
bows+0.001
ained+0.001
Neg logit contributions
 dismay-0.001
ris-0.001
 correct-0.001
 happily-0.001
 Whoever-0.001
Token how
Token ID703
Feature activation-2.85e-02

Pos logit contributions
icated+0.053
 Boom+0.052
stand+0.046
bows+0.046
weet+0.046
Neg logit contributions
ris-0.063
 dismay-0.059
 correct-0.059
aleb-0.058
 assistant-0.058
Token volcan
Token ID17516
Feature activation+5.62e-02

Pos logit contributions
 Boom+0.041
icated+0.040
stand+0.036
bows+0.035
ained+0.035
Neg logit contributions
ris-0.035
 dismay-0.035
 correct-0.035
 happily-0.034
 assistant-0.032
Tokenoes
Token ID3028
Feature activation-3.33e-01

Pos logit contributions
ris+0.109
 happily+0.101
 certainly+0.100
 correct+0.100
 exactly+0.096
Neg logit contributions
 elbows-0.045
Br-0.041
 Boom-0.040
stand-0.037
 keyboards-0.034
Token work
Token ID670
Feature activation+8.04e-02

Pos logit contributions
 Boom+0.406
icated+0.397
bows+0.361
stand+0.350
oes+0.343
Neg logit contributions
 dismay-0.493
ris-0.492
 correct-0.470
 assistant-0.464
aleb-0.454
Token.
Token ID13
Feature activation+1.57e-02

Pos logit contributions
 dismay+0.108
ris+0.105
 correct+0.103
 happily+0.099
 assistant+0.098
Neg logit contributions
 Boom-0.109
icated-0.105
bows-0.096
stand-0.095
ained-0.093
Token She
Token ID1375
Feature activation+3.99e-02

Pos logit contributions
 correct+0.021
 dismay+0.021
ris+0.021
 happily+0.020
 assistant+0.020
Neg logit contributions
 Boom-0.022
icated-0.020
stand-0.018
ained-0.018
bows-0.018
Token thought
Token ID1807
Feature activation+5.42e-03

Pos logit contributions
 dismay+0.061
 correct+0.061
ris+0.059
ait+0.059
 Whoever+0.058
Neg logit contributions
 Boom-0.043
icated-0.041
bows-0.038
 rhy-0.037
llers-0.035
Token it
Token ID340
Feature activation+6.32e-03

Pos logit contributions
ris+0.008
 dismay+0.008
 correct+0.007
aleb+0.007
 assistant+0.007
Neg logit contributions
 Boom-0.007
icated-0.007
stand-0.006
bows-0.006
bel-0.006
Token loved
Token ID6151
Feature activation-1.33e-01

Pos logit contributions
 Boom+0.214
icated+0.205
stand+0.183
 elbows+0.182
bows+0.180
Neg logit contributions
 correct-0.188
 dismay-0.188
ris-0.188
 happily-0.187
 assistant-0.172
Token to
Token ID284
Feature activation-5.88e-02

Pos logit contributions
 Boom+0.197
icated+0.185
 elbows+0.173
ained+0.171
stand+0.170
Neg logit contributions
 happily-0.162
 dismay-0.155
 correct-0.155
ris-0.150
 surprise-0.145
Token play
Token ID711
Feature activation-5.44e-02

Pos logit contributions
 Boom+0.094
icated+0.088
ained+0.085
 elbows+0.082
drops+0.080
Neg logit contributions
 happily-0.067
 correct-0.065
 dismay-0.064
 surprise-0.062
 purchase-0.062
Token make
Token ID787
Feature activation-8.95e-02

Pos logit contributions
 Boom+0.074
icated+0.068
stand+0.063
bows+0.062
bel+0.061
Neg logit contributions
 dismay-0.077
ris-0.076
 correct-0.074
 assistant-0.069
 happily-0.069
Token-
Token ID12
Feature activation+3.61e-02

Pos logit contributions
 Boom+0.103
icated+0.097
stand+0.085
bows+0.083
 elbows+0.082
Neg logit contributions
ris-0.140
 dismay-0.139
 correct-0.138
 happily-0.134
 assistant-0.129
Tokenbel
Token ID6667
Feature activation-3.32e-01

Pos logit contributions
ris+0.054
 happily+0.052
 Whoever+0.051
 correct+0.050
 assistant+0.049
Neg logit contributions
 elbows-0.039
 Boom-0.039
icated-0.036
ained-0.035
 spine-0.034
Tokenieve
Token ID12311
Feature activation+6.11e-02

Pos logit contributions
 Boom+0.562
icated+0.539
stand+0.498
drops+0.495
oes+0.494
Neg logit contributions
 correct-0.336
ris-0.329
 dismay-0.326
 happily-0.324
 assistant-0.296
Token.
Token ID13
Feature activation+7.56e-03

Pos logit contributions
 dismay+0.097
ris+0.094
 correct+0.092
aleb+0.090
 Whoever+0.088
Neg logit contributions
 Boom-0.071
icated-0.066
bel-0.065
bows-0.063
 songs-0.061
Token Today
Token ID6288
Feature activation-4.89e-02

Pos logit contributions
 dismay+0.008
 correct+0.007
ris+0.007
 happily+0.007
 assistant+0.007
Neg logit contributions
 Boom-0.013
icated-0.012
stand-0.011
bows-0.011
ained-0.011
Token she
Token ID673
Feature activation+1.02e-02

Pos logit contributions
 Boom+0.060
icated+0.056
stand+0.051
bows+0.051
ained+0.049
Neg logit contributions
ris-0.073
 dismay-0.072
 correct-0.071
 happily-0.068
aleb-0.067
Token was
Token ID373
Feature activation+1.52e-02

Pos logit contributions
 dismay+0.016
 correct+0.016
ris+0.015
 purchase+0.015
aleb+0.015
Neg logit contributions
 Boom-0.011
icated-0.010
bows-0.009
 rhy-0.009
drops-0.009
Token try
Token ID1949
Feature activation+8.30e-02

Pos logit contributions
 Boom+0.050
icated+0.050
stand+0.046
 rhy+0.045
oes+0.044
Neg logit contributions
ris-0.062
 dismay-0.059
 correct-0.058
ait-0.057
 assistant-0.056
Token to
Token ID284
Feature activation-4.17e-02

Pos logit contributions
 happily+0.103
 dismay+0.102
 correct+0.102
ris+0.097
 purchase+0.092
Neg logit contributions
 Boom-0.123
icated-0.116
drops-0.106
bows-0.106
bel-0.106
Token fly
Token ID6129
Feature activation-2.50e-02

Pos logit contributions
 Boom+0.055
icated+0.054
stand+0.050
 elbows+0.047
 rhy+0.047
Neg logit contributions
ris-0.056
 dismay-0.054
 correct-0.053
 assistant-0.051
 Whoever-0.050
Token higher
Token ID2440
Feature activation-1.15e-01

Pos logit contributions
 Boom+0.042
icated+0.042
stand+0.039
llers+0.038
 elbows+0.038
Neg logit contributions
 happily-0.028
 correct-0.025
 dismay-0.023
ris-0.023
 exactly-0.021
Token than
Token ID621
Feature activation-7.56e-02

Pos logit contributions
 Boom+0.177
icated+0.169
stand+0.153
 elbows+0.151
ained+0.149
Neg logit contributions
 happily-0.136
 correct-0.134
ris-0.134
 dismay-0.132
 assistant-0.120
Token ever
Token ID1683
Feature activation+1.70e-01

Pos logit contributions
 Boom+0.117
icated+0.111
stand+0.101
 elbows+0.100
bows+0.100
Neg logit contributions
 correct-0.087
ris-0.086
 dismay-0.086
 happily-0.086
 assistant-0.078
Token before
Token ID878
Feature activation-1.78e-01

Pos logit contributions
 happily+0.247
 correct+0.244
ris+0.241
 dismay+0.239
 certainly+0.218
Neg logit contributions
 Boom-0.220
icated-0.200
 elbows-0.187
stand-0.183
bel-0.182
Token.
Token ID13
Feature activation+1.94e-02

Pos logit contributions
 Boom+0.232
icated+0.219
stand+0.204
 elbows+0.197
ained+0.194
Neg logit contributions
ris-0.251
 correct-0.246
 dismay-0.246
 assistant-0.237
aleb-0.229
Token Twe
Token ID24205
Feature activation-1.06e-01

Pos logit contributions
ris+0.028
 dismay+0.028
 correct+0.027
 happily+0.025
aleb+0.025
Neg logit contributions
 Boom-0.026
icated-0.023
bows-0.022
stand-0.021
drops-0.021
Tokenety
Token ID2963
Feature activation-2.08e-01

Pos logit contributions
 Boom+0.096
icated+0.080
stand+0.072
 elbows+0.067
bows+0.066
Neg logit contributions
ris-0.202
 happily-0.196
 dismay-0.193
 correct-0.191
aleb-0.184
Token fl
Token ID781
Feature activation+3.88e-02

Pos logit contributions
 Boom+0.294
icated+0.280
stand+0.252
bows+0.249
 elbows+0.247
Neg logit contributions
 dismay-0.265
 correct-0.264
ris-0.264
 happily-0.255
 assistant-0.243
Token
Token ID8151
Feature activation-1.12e-01

Pos logit contributions
 Boom+0.218
stand+0.203
icated+0.199
bows+0.196
llers+0.191
Neg logit contributions
 correct-0.251
 dismay-0.251
ris-0.241
 assistant-0.231
 happily-0.229
Tokens
Token ID82
Feature activation-7.16e-02

Pos logit contributions
 Boom+0.127
stand+0.126
icated+0.122
oes+0.114
line+0.114
Neg logit contributions
 dismay-0.165
 correct-0.164
 surprise-0.156
 purchase-0.153
 buyers-0.152
Token help
Token ID1037
Feature activation+6.23e-02

Pos logit contributions
icated+0.091
 Boom+0.091
stand+0.083
bows+0.081
ric+0.080
Neg logit contributions
ris-0.098
 dismay-0.095
 Whoever-0.092
 correct-0.091
aleb-0.089
Token and
Token ID290
Feature activation+1.20e-02

Pos logit contributions
ris+0.105
 Whoever+0.099
 dismay+0.097
aleb+0.096
 correct+0.095
Neg logit contributions
stand-0.061
icated-0.057
 Boom-0.053
 sound-0.053
bows-0.052
Token they
Token ID484
Feature activation+3.53e-02

Pos logit contributions
ris+0.018
aleb+0.016
 dismay+0.016
 Whoever+0.016
 correct+0.016
Neg logit contributions
icated-0.014
 Boom-0.014
bel-0.013
 sound-0.013
 songs-0.013
Token show
Token ID905
Feature activation+1.88e-01

Pos logit contributions
 dismay+0.051
 Whoever+0.050
 correct+0.050
ris+0.050
aleb+0.049
Neg logit contributions
 Boom-0.039
icated-0.039
 rhy-0.037
bel-0.035
bye-0.034
Tokenered
Token ID1068
Feature activation+8.49e-02

Pos logit contributions
ris+0.295
 Whoever+0.264
+0.260
 scaven+0.259
 dismay+0.257
Neg logit contributions
 Boom-0.207
icated-0.200
 songs-0.196
stand-0.192
 noises-0.179
Token her
Token ID607
Feature activation+3.74e-02

Pos logit contributions
ris+0.142
+0.135
 Whoever+0.135
aleb+0.133
 assistant+0.133
Neg logit contributions
 Boom-0.080
 songs-0.070
bel-0.070
fl-0.069
 noises-0.069
Token with
Token ID351
Feature activation+7.17e-03

Pos logit contributions
ris+0.048
 Whoever+0.047
aleb+0.044
 dismay+0.042
ait+0.042
Neg logit contributions
bel-0.048
 Boom-0.048
fl-0.048
bows-0.047
stand-0.047
Token love
Token ID1842
Feature activation-1.57e-02

Pos logit contributions
ris+0.012
 Whoever+0.012
ait+0.011
 assistant+0.011
aleb+0.011
Neg logit contributions
 songs-0.006
icated-0.006
 Boom-0.006
 skin-0.006
 elbows-0.005
Token and
Token ID290
Feature activation+1.26e-02

Pos logit contributions
bows+0.015
icated+0.015
fl+0.015
 songs+0.015
 Boom+0.014
Neg logit contributions
 Whoever-0.025
 dismay-0.025
ris-0.025
 correct-0.024
Mor-0.023
Token is
Token ID318
Feature activation+6.54e-02

Pos logit contributions
ris+0.016
 dismay+0.015
ait+0.014
aleb+0.014
 correct+0.014
Neg logit contributions
 Boom-0.014
icated-0.013
fl-0.012
bows-0.012
oes-0.011
Token fun
Token ID1257
Feature activation+2.48e-02

Pos logit contributions
ris+0.084
 dismay+0.081
 correct+0.080
 happily+0.078
 assistant+0.076
Neg logit contributions
 Boom-0.093
icated-0.089
ained-0.081
stand-0.080
bows-0.079
Token to
Token ID284
Feature activation-4.17e-02

Pos logit contributions
ris+0.042
 dismay+0.042
 correct+0.039
 assistant+0.039
aleb+0.039
Neg logit contributions
 Boom-0.025
icated-0.024
bows-0.022
stand-0.021
ained-0.020
Token jump
Token ID4391
Feature activation+2.62e-02

Pos logit contributions
 Boom+0.054
icated+0.053
stand+0.048
bows+0.048
ric+0.047
Neg logit contributions
ris-0.057
 correct-0.056
 dismay-0.055
 assistant-0.054
 happily-0.052
Token on
Token ID319
Feature activation-1.80e-02

Pos logit contributions
 correct+0.032
 happily+0.032
 dismay+0.031
ris+0.031
 certainly+0.029
Neg logit contributions
 Boom-0.038
icated-0.037
stand-0.033
 elbows-0.033
ric-0.033
Token tight
Token ID5381
Feature activation+1.77e-01

Pos logit contributions
 Boom+0.023
icated+0.021
stand+0.019
 elbows+0.019
 ger+0.018
Neg logit contributions
ris-0.027
 assistant-0.025
 dismay-0.025
 correct-0.024
 Whoever-0.024
Token springs
Token ID24747
Feature activation+9.55e-03

Pos logit contributions
 dismay+0.238
ris+0.238
 correct+0.237
 happily+0.231
 assistant+0.219
Neg logit contributions
 Boom-0.239
icated-0.227
stand-0.204
bows-0.200
 elbows-0.200
Token,
Token ID11
Feature activation-3.96e-02

Pos logit contributions
 dismay+0.012
ris+0.012
 correct+0.012
 happily+0.012
 assistant+0.011
Neg logit contributions
 Boom-0.013
icated-0.013
stand-0.011
bows-0.011
 elbows-0.011
Token he
Token ID339
Feature activation+3.22e-03

Pos logit contributions
 Boom+0.047
icated+0.046
stand+0.039
drops+0.039
 elbows+0.038
Neg logit contributions
ris-0.061
 assistant-0.057
 dismay-0.056
 correct-0.056
aleb-0.055
Token must
Token ID1276
Feature activation+2.61e-01

Pos logit contributions
ris+0.005
 dismay+0.005
ait+0.005
 correct+0.005
 purchase+0.005
Neg logit contributions
 Boom-0.003
icated-0.003
ric-0.003
drops-0.003
ained-0.003
Token be
Token ID307
Feature activation-2.45e-02

Pos logit contributions
ris+0.374
 dismay+0.365
 assistant+0.355
aleb+0.353
 genius+0.347
Neg logit contributions
icated-0.292
ric-0.287
stand-0.282
 Boom-0.279
oes-0.268
Token
Token ID198
Feature activation-1.19e-02

Pos logit contributions
 Boom+0.009
icated+0.008
stand+0.007
ained+0.007
bows+0.007
Neg logit contributions
 dismay-0.008
 correct-0.008
ris-0.008
 happily-0.008
 assistant-0.008
Token
Token ID198
Feature activation-1.25e-02

Pos logit contributions
 Boom+0.018
icated+0.018
 elbows+0.017
drops+0.016
bows+0.016
Neg logit contributions
ris-0.014
 happily-0.013
 dismay-0.013
 correct-0.013
 Whoever-0.012
TokenWhen
Token ID2215
Feature activation-2.27e-02

Pos logit contributions
 Boom+0.019
icated+0.019
stand+0.017
 elbows+0.017
bows+0.017
Neg logit contributions
ris-0.014
 dismay-0.014
 correct-0.014
 happily-0.014
 assistant-0.013
Token she
Token ID673
Feature activation+1.67e-02

Pos logit contributions
 Boom+0.033
icated+0.030
fl+0.026
oes+0.026
bows+0.026
Neg logit contributions
ris-0.031
 dismay-0.030
 correct-0.030
aleb-0.028
 happily-0.028
Token arrived
Token ID5284
Feature activation+9.47e-02

Pos logit contributions
 correct+0.032
 assistant+0.031
 dismay+0.030
aleb+0.029
ris+0.029
Neg logit contributions
 Boom-0.014
stand-0.012
icated-0.010
bel-0.010
 rhy-0.009
Token at
Token ID379
Feature activation+1.85e-01

Pos logit contributions
ris+0.122
 correct+0.116
 dismay+0.113
 Whoever+0.111
 certainly+0.110
Neg logit contributions
 Boom-0.138
icated-0.121
bel-0.119
 elbows-0.117
stand-0.117
Token the
Token ID262
Feature activation+3.64e-04

Pos logit contributions
ris+0.262
 correct+0.257
 dismay+0.252
 Whoever+0.244
 certainly+0.244
Neg logit contributions
 Boom-0.240
icated-0.229
stand-0.205
bows-0.197
 elbows-0.196
Token park
Token ID3952
Feature activation+1.07e-01

Pos logit contributions
ris+0.000
 dismay+0.000
 happily+0.000
 correct+0.000
 Whoever+0.000
Neg logit contributions
 Boom-0.001
icated-0.001
ained-0.001
bows-0.001
stand-0.001
Token,
Token ID11
Feature activation-4.77e-02

Pos logit contributions
 dismay+0.146
 correct+0.146
ris+0.145
 happily+0.144
 assistant+0.135
Neg logit contributions
 Boom-0.141
icated-0.136
stand-0.119
 elbows-0.119
bows-0.117
Token she
Token ID673
Feature activation+1.68e-02

Pos logit contributions
 Boom+0.076
icated+0.074
stand+0.068
 elbows+0.067
bows+0.066
Neg logit contributions
 correct-0.052
 dismay-0.052
ris-0.051
 happily-0.050
 assistant-0.046
Token started
Token ID2067
Feature activation-5.58e-02

Pos logit contributions
 dismay+0.021
ris+0.021
 correct+0.021
 happily+0.020
 assistant+0.019
Neg logit contributions
 Boom-0.024
icated-0.023
stand-0.021
bows-0.021
 elbows-0.021
Token,
Token ID11
Feature activation-4.30e-02

Pos logit contributions
 Boom+0.011
icated+0.011
stand+0.010
 elbows+0.010
ained+0.010
Neg logit contributions
 correct-0.011
 dismay-0.011
 happily-0.011
ris-0.011
 assistant-0.010
Token "
Token ID366
Feature activation+1.63e-02

Pos logit contributions
 Boom+0.065
icated+0.061
stand+0.056
bows+0.054
 elbows+0.054
Neg logit contributions
ris-0.052
 dismay-0.052
 correct-0.051
 happily-0.049
 assistant-0.047
TokenWow
Token ID22017
Feature activation-1.49e-01

Pos logit contributions
 dismay+0.024
 correct+0.024
ris+0.024
 happily+0.023
 assistant+0.022
Neg logit contributions
 Boom-0.020
icated-0.019
stand-0.017
bows-0.017
ained-0.016
Token!
Token ID0
Feature activation+1.24e-01

Pos logit contributions
 Boom+0.152
icated+0.136
bows+0.113
stand+0.106
ong+0.106
Neg logit contributions
 dismay-0.267
ris-0.266
 passer-0.251
 correct-0.247
 assistant-0.247
Token That
Token ID1320
Feature activation+9.57e-02

Pos logit contributions
 dismay+0.164
ris+0.161
 correct+0.161
 happily+0.159
 assistant+0.148
Neg logit contributions
 Boom-0.177
icated-0.166
stand-0.147
bows-0.145
oes-0.143
Token must
Token ID1276
Feature activation+2.60e-01

Pos logit contributions
 correct+0.105
ris+0.105
 dismay+0.105
 happily+0.101
 assistant+0.095
Neg logit contributions
 Boom-0.153
icated-0.147
stand-0.135
 elbows-0.133
bows-0.132
Token have
Token ID423
Feature activation+7.54e-02

Pos logit contributions
 dismay+0.373
 happily+0.372
ris+0.372
aleb+0.354
 assistant+0.349
Neg logit contributions
 Boom-0.328
icated-0.317
stand-0.279
bows-0.267
oes-0.266
Token been
Token ID587
Feature activation-4.92e-03

Pos logit contributions
ris+0.101
 dismay+0.100
 correct+0.098
 happily+0.097
 assistant+0.094
Neg logit contributions
 Boom-0.105
icated-0.100
stand-0.088
bows-0.087
ained-0.085
Token a
Token ID257
Feature activation-3.16e-02

Pos logit contributions
 Boom+0.007
icated+0.007
stand+0.006
bows+0.006
ained+0.006
Neg logit contributions
 dismay-0.006
ris-0.006
 correct-0.006
 happily-0.006
 assistant-0.006
Token long
Token ID890
Feature activation-3.97e-02

Pos logit contributions
 Boom+0.045
icated+0.042
 sound+0.039
 boom+0.039
 songs+0.038
Neg logit contributions
ait-0.040
aleb-0.037
ris-0.036
 dismay-0.035
 correct-0.034
Token time
Token ID640
Feature activation+9.41e-02

Pos logit contributions
 Boom+0.064
icated+0.062
stand+0.057
bows+0.056
 elbows+0.055
Neg logit contributions
 dismay-0.041
 correct-0.041
ris-0.041
 happily-0.040
 certainly-0.038
Token smiled
Token ID13541
Feature activation-1.01e-01

Pos logit contributions
 dismay+0.026
 correct+0.025
ris+0.025
 happily+0.025
 assistant+0.023
Neg logit contributions
 Boom-0.026
icated-0.025
stand-0.023
bows-0.022
 elbows-0.022
Token.
Token ID13
Feature activation+1.68e-02

Pos logit contributions
 Boom+0.138
icated+0.128
bows+0.117
bel+0.116
stand+0.116
Neg logit contributions
ris-0.137
 correct-0.133
 dismay-0.133
 Whoever-0.125
 happily-0.121
Token It
Token ID632
Feature activation+4.12e-03

Pos logit contributions
ris+0.024
 dismay+0.024
 correct+0.024
 assistant+0.022
aleb+0.022
Neg logit contributions
 Boom-0.022
icated-0.020
stand-0.018
bows-0.018
ained-0.017
Token was
Token ID373
Feature activation+3.22e-02

Pos logit contributions
ris+0.007
 dismay+0.006
aleb+0.006
ait+0.006
 purchase+0.006
Neg logit contributions
 Boom-0.004
fl-0.004
icated-0.004
bows-0.004
 sound-0.004
Token a
Token ID257
Feature activation-2.75e-02

Pos logit contributions
ris+0.042
 dismay+0.041
 correct+0.041
 happily+0.040
 assistant+0.039
Neg logit contributions
 Boom-0.045
icated-0.042
ained-0.038
stand-0.038
bows-0.038
Token really
Token ID1107
Feature activation+8.76e-02

Pos logit contributions
 Boom+0.041
icated+0.039
stand+0.036
bows+0.035
ric+0.035
Neg logit contributions
ris-0.031
 dismay-0.031
 correct-0.031
 happily-0.029
 Whoever-0.029
Token great
Token ID1049
Feature activation+1.54e-01

Pos logit contributions
 dismay+0.109
ris+0.108
aleb+0.100
 Whoever+0.099
ait+0.099
Neg logit contributions
 Boom-0.127
icated-0.114
bel-0.107
drops-0.107
bows-0.106
Token day
Token ID1110
Feature activation-7.19e-02

Pos logit contributions
ris+0.227
 dismay+0.225
 correct+0.223
ait+0.220
aleb+0.219
Neg logit contributions
 Boom-0.166
 songs-0.161
icated-0.158
 sound-0.157
bows-0.157
Token.
Token ID13
Feature activation+1.67e-02

Pos logit contributions
stand+0.081
 Boom+0.079
icated+0.078
drops+0.074
ring+0.073
Neg logit contributions
 dismay-0.108
 correct-0.102
ris-0.100
aleb-0.100
 assistant-0.098
Token<|endoftext|>
Token ID50256
Feature activation-1.63e-01

Pos logit contributions
ris+0.025
 correct+0.025
 dismay+0.025
 assistant+0.024
 happily+0.023
Neg logit contributions
 Boom-0.021
icated-0.019
stand-0.017
ained-0.016
bows-0.016
TokenOnce
Token ID7454
Feature activation+8.75e-02

Pos logit contributions
 Boom+0.218
icated+0.194
stand+0.179
ained+0.175
bows+0.171
Neg logit contributions
 dismay-0.233
 correct-0.231
ris-0.227
 assistant-0.218
 happily-0.216
Token Today
Token ID6288
Feature activation-4.05e-02

Pos logit contributions
 happily+0.008
 correct+0.007
 dismay+0.007
ris+0.007
 Whoever+0.006
Neg logit contributions
icated-0.017
 Boom-0.016
bel-0.016
ric-0.016
 elbows-0.016
Token was
Token ID373
Feature activation+2.19e-02

Pos logit contributions
 Boom+0.058
icated+0.057
 elbows+0.053
bel+0.051
bows+0.051
Neg logit contributions
 dismay-0.050
 happily-0.050
ris-0.049
 correct-0.048
 certainly-0.045
Token her
Token ID607
Feature activation+3.63e-02

Pos logit contributions
 happily+0.017
 only+0.016
 dismay+0.015
 surprise+0.015
 correct+0.015
Neg logit contributions
llers-0.043
ric-0.043
iece-0.042
ller-0.042
icated-0.042
Token birthday
Token ID10955
Feature activation-8.05e-02

Pos logit contributions
ris+0.054
 dismay+0.053
 correct+0.052
 happily+0.051
 Whoever+0.049
Neg logit contributions
 Boom-0.045
icated-0.042
stand-0.038
bows-0.038
ained-0.037
Token,
Token ID11
Feature activation-5.04e-02

Pos logit contributions
 Boom+0.080
stand+0.072
icated+0.069
bows+0.068
 songs+0.067
Neg logit contributions
 dismay-0.138
 correct-0.136
ris-0.131
 Whoever-0.131
aleb-0.125
Token and
Token ID290
Feature activation+8.46e-03

Pos logit contributions
 Boom+0.067
icated+0.063
stand+0.057
weet+0.053
bows+0.053
Neg logit contributions
ris-0.075
 dismay-0.072
 correct-0.072
 assistant-0.066
 happily-0.066
Token she
Token ID673
Feature activation+1.43e-02

Pos logit contributions
 happily+0.007
 correct+0.007
 dismay+0.007
 certainly+0.006
 surprise+0.006
Neg logit contributions
 Boom-0.016
ric-0.016
icated-0.015
oes-0.015
bows-0.015
Token was
Token ID373
Feature activation+2.23e-02

Pos logit contributions
 happily+0.017
 dismay+0.016
 correct+0.016
ris+0.016
 certainly+0.015
Neg logit contributions
 Boom-0.022
icated-0.022
oes-0.020
 elbows-0.020
ric-0.020
Token very
Token ID845
Feature activation-4.13e-02

Pos logit contributions
ris+0.032
 dismay+0.031
 correct+0.031
 happily+0.030
 assistant+0.029
Neg logit contributions
 Boom-0.029
icated-0.027
stand-0.024
bows-0.024
ained-0.024
Token excited
Token ID6568
Feature activation+4.14e-02

Pos logit contributions
 Boom+0.051
icated+0.048
stand+0.042
bel+0.042
ained+0.042
Neg logit contributions
ris-0.062
 dismay-0.060
 correct-0.060
 happily-0.057
 Whoever-0.056
Token.
Token ID13
Feature activation+8.59e-03

Pos logit contributions
 happily+0.050
 correct+0.049
 dismay+0.048
ris+0.048
 assistant+0.045
Neg logit contributions
 Boom-0.063
icated-0.058
 elbows-0.056
ric-0.054
drops-0.053
Token decided
Token ID3066
Feature activation-4.83e-02

Pos logit contributions
 Boom+0.002
bel+0.002
pee+0.002
drops+0.002
Snow+0.002
Neg logit contributions
 correct-0.003
 purchase-0.003
ris-0.003
 dismay-0.003
 horror-0.003
Token to
Token ID284
Feature activation-4.41e-02

Pos logit contributions
 Boom+0.061
icated+0.057
stand+0.051
bows+0.050
 elbows+0.050
Neg logit contributions
ris-0.070
 dismay-0.070
 correct-0.070
 happily-0.068
 assistant-0.065
Token give
Token ID1577
Feature activation+8.71e-02

Pos logit contributions
 sound+0.042
stand+0.041
oes+0.040
 skin+0.040
 feel+0.040
Neg logit contributions
ris-0.068
ement-0.066
 genius-0.065
joy-0.063
Con-0.063
Token it
Token ID340
Feature activation+8.04e-03

Pos logit contributions
ris+0.142
 correct+0.136
 dismay+0.134
 certainly+0.132
 genius+0.131
Neg logit contributions
 Boom-0.094
icated-0.076
bye-0.072
fl-0.071
stand-0.070
Token away
Token ID1497
Feature activation+9.39e-02

Pos logit contributions
 dismay+0.012
ris+0.012
 correct+0.012
 happily+0.011
 Whoever+0.011
Neg logit contributions
 Boom-0.010
icated-0.009
stand-0.008
ained-0.008
oes-0.008
Token.
Token ID13
Feature activation+1.05e-02

Pos logit contributions
 dismay+0.146
ris+0.144
 correct+0.142
 Whoever+0.139
aleb+0.137
Neg logit contributions
 Boom-0.101
icated-0.099
stand-0.089
bel-0.086
bows-0.085
Token He
Token ID679
Feature activation+2.86e-02

Pos logit contributions
 dismay+0.016
ris+0.016
 correct+0.015
 genius+0.015
 purchase+0.015
Neg logit contributions
 Boom-0.014
icated-0.011
stand-0.011
oes-0.010
bows-0.010
Token went
Token ID1816
Feature activation-1.35e-02

Pos logit contributions
 correct+0.048
igsaw+0.048
+0.046
ihu+0.045
 genius+0.045
Neg logit contributions
 felt-0.024
 Boom-0.022
bel-0.022
 sang-0.021
 rang-0.021
Token to
Token ID284
Feature activation-4.39e-02

Pos logit contributions
 Boom+0.013
stand+0.012
bel+0.011
oes+0.011
drops+0.011
Neg logit contributions
 dismay-0.023
 certainly-0.023
 correct-0.022
ris-0.022
aleb-0.022
Token a
Token ID257
Feature activation-3.21e-02

Pos logit contributions
 Boom+0.037
icated+0.033
stand+0.032
 sound+0.032
 feel+0.027
Neg logit contributions
ris-0.082
 certainly-0.079
ement-0.079
aleb-0.077
 genius-0.076
Token little
Token ID1310
Feature activation+6.09e-02

Pos logit contributions
 Boom+0.044
stand+0.043
icated+0.041
bel+0.040
Rain+0.039
Neg logit contributions
ris-0.040
 dismay-0.037
 correct-0.037
 certainly-0.037
aleb-0.036
Token smiled
Token ID13541
Feature activation-1.06e-01

Pos logit contributions
 correct+0.043
 dismay+0.042
ris+0.042
 purchase+0.040
aleb+0.039
Neg logit contributions
 Boom-0.038
icated-0.034
bel-0.030
stand-0.030
bows-0.029
Token and
Token ID290
Feature activation+1.58e-02

Pos logit contributions
 Boom+0.172
icated+0.168
stand+0.153
 elbows+0.152
bows+0.150
Neg logit contributions
 happily-0.116
 dismay-0.112
 correct-0.111
ris-0.109
 assistant-0.102
Token waved
Token ID26834
Feature activation-1.51e-02

Pos logit contributions
ris+0.021
 dismay+0.020
 correct+0.020
 happily+0.019
 assistant+0.019
Neg logit contributions
 Boom-0.023
icated-0.021
stand-0.019
bows-0.019
 elbows-0.019
Token at
Token ID379
Feature activation+1.83e-01

Pos logit contributions
 Boom+0.022
icated+0.019
bel+0.018
 songs+0.018
stand+0.017
Neg logit contributions
ris-0.021
 correct-0.020
 dismay-0.020
 certainly-0.019
 purchase-0.018
Token her
Token ID607
Feature activation+4.26e-02

Pos logit contributions
ris+0.222
 correct+0.222
 dismay+0.221
 happily+0.212
 assistant+0.203
Neg logit contributions
 Boom-0.274
icated-0.261
stand-0.237
 elbows-0.232
bows-0.232
Token.
Token ID13
Feature activation+1.63e-02

Pos logit contributions
 happily+0.048
 dismay+0.048
 correct+0.048
ris+0.047
 assistant+0.044
Neg logit contributions
 Boom-0.067
icated-0.064
ained-0.059
 elbows-0.059
stand-0.058
Token
Token ID198
Feature activation-8.59e-03

Pos logit contributions
 dismay+0.020
 correct+0.020
ris+0.020
 happily+0.020
 assistant+0.018
Neg logit contributions
 Boom-0.024
icated-0.023
stand-0.021
bows-0.020
 elbows-0.020
Token
Token ID198
Feature activation-8.98e-03

Pos logit contributions
icated+0.015
 Boom+0.014
drops+0.014
bows+0.014
 elbows+0.014
Neg logit contributions
 happily-0.009
ris-0.008
 dismay-0.008
 correct-0.008
 Whoever-0.007
Token"
Token ID1
Feature activation+4.89e-02

Pos logit contributions
 Boom+0.013
icated+0.012
stand+0.011
bows+0.011
 elbows+0.011
Neg logit contributions
 dismay-0.012
ris-0.011
 correct-0.011
 happily-0.011
 assistant-0.011
TokenHello
Token ID15496
Feature activation-2.00e-01

Pos logit contributions
 correct+0.079
 happily+0.079
 dismay+0.078
ris+0.078
 Whoever+0.075
Neg logit contributions
icated-0.048
 Boom-0.048
ric-0.045
ained-0.044
 elbows-0.042
Token,
Token ID11
Feature activation-4.12e-02

Pos logit contributions
 Boom+0.230
icated+0.209
bows+0.180
stand+0.176
fl+0.169
Neg logit contributions
 correct-0.323
 dismay-0.323
ris-0.319
 happily-0.301
 assistant-0.296
Token birds
Token ID10087
Feature activation-1.86e-01

Pos logit contributions
ris+0.065
 correct+0.063
 dismay+0.063
 happily+0.061
 assistant+0.059
Neg logit contributions
 Boom-0.073
icated-0.069
stand-0.063
bows-0.061
 elbows-0.061
Token singing
Token ID13777
Feature activation-1.78e-01

Pos logit contributions
 Boom+0.260
icated+0.249
stand+0.224
bows+0.221
 elbows+0.219
Neg logit contributions
 dismay-0.239
ris-0.239
 correct-0.239
 happily-0.230
 assistant-0.219
Token in
Token ID287
Feature activation+3.64e-02

Pos logit contributions
 Boom+0.285
icated+0.283
 elbows+0.256
ric+0.255
oes+0.254
Neg logit contributions
 happily-0.205
 correct-0.187
 dismay-0.184
ris-0.179
 assistant-0.169
Token the
Token ID262
Feature activation+4.11e-04

Pos logit contributions
ris+0.057
 correct+0.053
 assistant+0.053
 Whoever+0.052
aleb+0.051
Neg logit contributions
 Boom-0.045
icated-0.041
stand-0.039
weet-0.038
 songs-0.038
Token morning
Token ID3329
Feature activation-1.27e-01

Pos logit contributions
 correct+0.001
 dismay+0.001
 happily+0.001
ris+0.001
 surprise+0.001
Neg logit contributions
icated-0.000
 Boom-0.000
ric-0.000
ained-0.000
bel-0.000
Token and
Token ID290
Feature activation+1.15e-02

Pos logit contributions
icated+0.213
 Boom+0.210
 elbows+0.196
ric+0.192
ained+0.191
Neg logit contributions
 happily-0.132
 correct-0.126
ris-0.120
 dismay-0.119
 surprise-0.112
Token she
Token ID673
Feature activation+1.74e-02

Pos logit contributions
 happily+0.014
 dismay+0.013
 correct+0.013
ris+0.012
 certainly+0.012
Neg logit contributions
 Boom-0.017
icated-0.017
stand-0.016
 elbows-0.015
ained-0.015
Token always
Token ID1464
Feature activation+7.77e-02

Pos logit contributions
 dismay+0.026
 correct+0.026
ris+0.026
 happily+0.024
 assistant+0.024
Neg logit contributions
 Boom-0.021
icated-0.020
stand-0.018
bows-0.017
ained-0.017
Token smiled
Token ID13541
Feature activation-1.06e-01

Pos logit contributions
 correct+0.100
 happily+0.100
 dismay+0.100
ris+0.100
 certainly+0.092
Neg logit contributions
 Boom-0.109
icated-0.104
stand-0.094
 elbows-0.092
ained-0.091
Token and
Token ID290
Feature activation+1.18e-02

Pos logit contributions
 Boom+0.153
icated+0.146
stand+0.132
bows+0.131
ained+0.129
Neg logit contributions
ris-0.133
 dismay-0.132
 correct-0.132
 happily-0.127
 assistant-0.121
Token waved
Token ID26834
Feature activation-1.50e-02

Pos logit contributions
 happily+0.016
 correct+0.015
 dismay+0.015
ris+0.015
 certainly+0.014
Neg logit contributions
 Boom-0.017
icated-0.016
stand-0.014
oes-0.014
 elbows-0.014
Token you
Token ID345
Feature activation+4.33e-03

Pos logit contributions
 happily+0.008
 correct+0.008
 dismay+0.007
 Whoever+0.007
 surprise+0.007
Neg logit contributions
 Boom-0.017
bows-0.017
drops-0.017
ained-0.016
bel-0.016
Token?"
Token ID1701
Feature activation-2.43e-01

Pos logit contributions
 happily+0.007
ris+0.006
 hiding+0.006
 correct+0.006
 surprise+0.006
Neg logit contributions
 elbows-0.006
drops-0.005
 Boom-0.005
bows-0.005
ages-0.005
Token
Token ID198
Feature activation-1.27e-02

Pos logit contributions
icated+0.421
ained+0.395
 Boom+0.384
drops+0.381
 elbows+0.377
Neg logit contributions
 happily-0.266
 dismay-0.227
 Whoever-0.225
ris-0.224
 correct-0.216
Token
Token ID198
Feature activation-1.32e-02

Pos logit contributions
icated+0.025
drops+0.024
bows+0.023
 elbows+0.023
 Boom+0.023
Neg logit contributions
 happily-0.012
ris-0.010
 dismay-0.009
 Whoever-0.009
 correct-0.008
TokenThe
Token ID464
Feature activation-8.03e-02

Pos logit contributions
 Boom+0.018
icated+0.017
ained+0.016
ric+0.016
bows+0.015
Neg logit contributions
 happily-0.018
ris-0.018
 correct-0.018
 dismay-0.018
 Whoever-0.017
Token boy
Token ID2933
Feature activation-2.99e-02

Pos logit contributions
ric+0.142
 elbows+0.140
icated+0.137
rings+0.133
ained+0.132
Neg logit contributions
 happily-0.075
 surprise-0.074
 correct-0.072
 dismay-0.069
 assistant-0.068
Token said
Token ID531
Feature activation-1.25e-02

Pos logit contributions
 Boom+0.037
icated+0.033
oes+0.028
stand+0.028
bel+0.028
Neg logit contributions
 correct-0.045
ris-0.044
 dismay-0.044
 purchase-0.042
ait-0.041
Token,
Token ID11
Feature activation-4.60e-02

Pos logit contributions
icated+0.019
 Boom+0.019
ained+0.018
ric+0.018
 elbows+0.017
Neg logit contributions
 happily-0.016
 correct-0.014
 dismay-0.014
 assistant-0.014
 genius-0.013
Token "
Token ID366
Feature activation+1.32e-02

Pos logit contributions
icated+0.077
 Boom+0.077
 elbows+0.072
ained+0.071
bows+0.070
Neg logit contributions
 happily-0.046
 correct-0.044
 dismay-0.043
 Whoever-0.041
ris-0.041
TokenNo
Token ID2949
Feature activation+7.11e-04

Pos logit contributions
 happily+0.011
 Whoever+0.010
 correct+0.010
ris+0.010
 dismay+0.010
Neg logit contributions
 Boom-0.024
icated-0.024
ric-0.024
ained-0.024
 elbows-0.023
Token,
Token ID11
Feature activation-4.64e-02

Pos logit contributions
 dismay+0.001
 correct+0.001
ris+0.001
 happily+0.001
 assistant+0.001
Neg logit contributions
 Boom-0.001
icated-0.001
stand-0.001
bel-0.001
ained-0.001
Token clear
Token ID1598
Feature activation-3.04e-02

Pos logit contributions
 Boom+0.046
icated+0.042
bel+0.040
 sound+0.039
weet+0.038
Neg logit contributions
ris-0.060
 Whoever-0.057
 dismay-0.055
aleb-0.055
 correct-0.054
Token.
Token ID13
Feature activation+1.36e-02

Pos logit contributions
 Boom+0.028
icated+0.028
bows+0.027
fl+0.026
drops+0.026
Neg logit contributions
 dismay-0.050
ris-0.050
 Whoever-0.048
Mor-0.048
 correct-0.047
Token It
Token ID632
Feature activation-2.93e-03

Pos logit contributions
ris+0.021
 correct+0.021
 dismay+0.021
 purchase+0.021
 genius+0.020
Neg logit contributions
 Boom-0.018
icated-0.014
stand-0.013
weet-0.012
ric-0.012
Token had
Token ID550
Feature activation-8.84e-03

Pos logit contributions
 Boom+0.003
icated+0.002
fl+0.002
 rhy+0.002
 sound+0.002
Neg logit contributions
ris-0.005
ait-0.005
 purchase-0.005
aleb-0.005
 correct-0.005
Token big
Token ID1263
Feature activation+3.27e-02

Pos logit contributions
 Boom+0.012
icated+0.012
stand+0.011
bows+0.011
 elbows+0.010
Neg logit contributions
ris-0.011
 dismay-0.011
 correct-0.011
 happily-0.011
 assistant-0.010
Token eyes
Token ID2951
Feature activation-9.85e-02

Pos logit contributions
 correct+0.047
 dismay+0.047
ris+0.046
 happily+0.046
 assistant+0.043
Neg logit contributions
 Boom-0.041
icated-0.039
stand-0.035
ained-0.034
oes-0.034
Token,
Token ID11
Feature activation-4.41e-02

Pos logit contributions
 Boom+0.153
icated+0.149
 elbows+0.135
oes+0.131
ained+0.131
Neg logit contributions
 happily-0.113
ris-0.112
 correct-0.112
 dismay-0.111
 certainly-0.102
Token a
Token ID257
Feature activation-3.51e-02

Pos logit contributions
 Boom+0.065
icated+0.062
oes+0.056
ained+0.056
 elbows+0.055
Neg logit contributions
 correct-0.054
 dismay-0.054
 happily-0.053
ris-0.052
 certainly-0.049
Token long
Token ID890
Feature activation-4.20e-02

Pos logit contributions
 Boom+0.052
icated+0.050
stand+0.045
bows+0.044
 elbows+0.044
Neg logit contributions
ris-0.042
 dismay-0.042
 correct-0.042
 happily-0.041
 assistant-0.038
Token nose
Token ID9686
Feature activation-1.37e-01

Pos logit contributions
 Boom+0.053
icated+0.050
stand+0.044
bows+0.043
 elbows+0.043
Neg logit contributions
 dismay-0.060
 correct-0.060
ris-0.060
 happily-0.059
 assistant-0.056
Token and
Token ID290
Feature activation+1.41e-02

Pos logit contributions
 Boom+0.191
icated+0.184
 elbows+0.162
stand+0.160
ained+0.160
Neg logit contributions
 correct-0.177
 dismay-0.177
ris-0.176
 happily-0.175
 assistant-0.163
Token letters
Token ID7475
Feature activation-1.21e-01

Pos logit contributions
 Boom+0.061
icated+0.057
 elbows+0.056
ained+0.055
ric+0.055
Neg logit contributions
 correct-0.057
ris-0.056
 happily-0.055
 surprise-0.054
 dismay-0.053
Token.
Token ID13
Feature activation+1.36e-02

Pos logit contributions
 Boom+0.192
icated+0.180
 elbows+0.168
ained+0.163
stand+0.162
Neg logit contributions
 correct-0.137
ris-0.136
 happily-0.136
 dismay-0.128
 assistant-0.126
Token Let
Token ID3914
Feature activation+1.55e-02

Pos logit contributions
 dismay+0.018
ris+0.018
 correct+0.018
 happily+0.018
 assistant+0.016
Neg logit contributions
 Boom-0.019
icated-0.018
stand-0.016
bows-0.016
oes-0.015
Token's
Token ID338
Feature activation+6.27e-02

Pos logit contributions
ris+0.017
 dismay+0.017
 correct+0.016
 happily+0.016
 assistant+0.015
Neg logit contributions
 Boom-0.026
icated-0.024
stand-0.022
 elbows-0.022
bows-0.022
Token try
Token ID1949
Feature activation+7.93e-02

Pos logit contributions
ris+0.083
 dismay+0.082
 correct+0.082
 happily+0.078
 assistant+0.077
Neg logit contributions
 Boom-0.083
icated-0.079
stand-0.074
bows-0.072
oes-0.071
Token."
Token ID526
Feature activation-1.25e-01

Pos logit contributions
 happily+0.093
 correct+0.091
ris+0.089
 dismay+0.087
 Whoever+0.084
Neg logit contributions
 Boom-0.119
icated-0.113
 elbows-0.111
ained-0.110
drops-0.110
Token Lily
Token ID20037
Feature activation-3.99e-03

Pos logit contributions
 Boom+0.176
icated+0.160
stand+0.148
bows+0.140
 elbows+0.139
Neg logit contributions
ris-0.168
 correct-0.167
 dismay-0.167
 happily-0.157
 certainly-0.155
Token says
Token ID1139
Feature activation+2.26e-02

Pos logit contributions
 Boom+0.005
icated+0.004
bows+0.004
 rhy+0.004
 noises+0.004
Neg logit contributions
 dismay-0.006
ris-0.006
 purchase-0.006
ait-0.006
 correct-0.006
Token.
Token ID13
Feature activation+1.33e-02

Pos logit contributions
 correct+0.026
 dismay+0.026
ris+0.026
 happily+0.026
 assistant+0.024
Neg logit contributions
 Boom-0.034
icated-0.033
stand-0.030
bows-0.029
ained-0.029
Token
Token ID198
Feature activation-7.99e-03

Pos logit contributions
ris+0.019
 dismay+0.019
 correct+0.019
 happily+0.017
 purchase+0.017
Neg logit contributions
 Boom-0.018
icated-0.016
stand-0.015
bows-0.015
 elbows-0.014
Token
Token ID198
Feature activation-7.93e-03

Pos logit contributions
icated+0.014
 Boom+0.014
drops+0.014
bows+0.014
ring+0.013
Neg logit contributions
 happily-0.008
ris-0.007
 dismay-0.006
 correct-0.006
 Whoever-0.006
Token You
Token ID921
Feature activation+3.76e-02

Pos logit contributions
 dismay+0.160
ris+0.159
 correct+0.158
 happily+0.155
 assistant+0.145
Neg logit contributions
 Boom-0.188
icated-0.178
stand-0.159
bows-0.157
oes-0.155
Token're
Token ID821
Feature activation-1.13e-01

Pos logit contributions
 dismay+0.059
 correct+0.056
ris+0.056
 happily+0.056
 assistant+0.054
Neg logit contributions
 Boom-0.045
icated-0.043
bows-0.036
stand-0.034
drops-0.034
Token a
Token ID257
Feature activation-3.27e-02

Pos logit contributions
 Boom+0.124
icated+0.108
ric+0.096
 sound+0.092
ained+0.090
Neg logit contributions
ris-0.195
 dismay-0.180
 purchase-0.176
aleb-0.174
ait-0.174
Token natural
Token ID3288
Feature activation+1.51e-01

Pos logit contributions
 Boom+0.042
icated+0.037
 songs+0.037
 boom+0.036
 sound+0.035
Neg logit contributions
ait-0.044
 dismay-0.043
 Whoever-0.043
ris-0.043
 purchase-0.041
Token.
Token ID13
Feature activation+1.55e-02

Pos logit contributions
 dismay+0.215
 correct+0.211
ris+0.199
 Whoever+0.195
 happily+0.192
Neg logit contributions
 Boom-0.199
icated-0.184
bows-0.182
drops-0.169
fl-0.169
Token Maybe
Token ID6674
Feature activation-4.26e-02

Pos logit contributions
 correct+0.018
ris+0.018
 dismay+0.018
 happily+0.018
 assistant+0.017
Neg logit contributions
 Boom-0.023
icated-0.022
stand-0.020
 elbows-0.020
bows-0.020
Token one
Token ID530
Feature activation+7.90e-02

Pos logit contributions
 Boom+0.060
icated+0.057
stand+0.051
oes+0.049
 elbows+0.049
Neg logit contributions
ris-0.056
 dismay-0.056
 correct-0.055
 happily-0.054
 assistant-0.051
Token day
Token ID1110
Feature activation-7.60e-02

Pos logit contributions
 dismay+0.140
 correct+0.136
ris+0.134
 happily+0.133
 assistant+0.130
Neg logit contributions
 Boom-0.077
icated-0.074
oes-0.060
stand-0.058
bows-0.057
Token you
Token ID345
Feature activation+5.97e-03

Pos logit contributions
 Boom+0.103
icated+0.098
stand+0.088
bows+0.085
oes+0.085
Neg logit contributions
ris-0.103
 dismay-0.102
 correct-0.101
 happily-0.098
 assistant-0.095
Token'll
Token ID1183
Feature activation+1.00e-01

Pos logit contributions
 dismay+0.009
 correct+0.008
 assistant+0.008
aleb+0.008
 purchase+0.008
Neg logit contributions
 Boom-0.007
icated-0.007
bows-0.006
 rhy-0.006
drops-0.006
Token be
Token ID307
Feature activation-2.70e-02

Pos logit contributions
 dismay+0.148
aleb+0.147
ris+0.143
 assistant+0.143
 correct+0.142
Neg logit contributions
 Boom-0.109
icated-0.106
stand-0.106
ric-0.101
 rhy-0.098
Token looked
Token ID3114
Feature activation+1.86e-02

Pos logit contributions
 Boom+0.010
icated+0.010
bows+0.008
drops+0.008
 elbows+0.008
Neg logit contributions
 correct-0.011
ris-0.011
 dismay-0.011
 happily-0.010
 assistant-0.010
Token at
Token ID379
Feature activation+1.82e-01

Pos logit contributions
ris+0.018
 dismay+0.017
 correct+0.017
 purchase+0.015
 Whoever+0.015
Neg logit contributions
 Boom-0.034
icated-0.031
ained-0.029
drops-0.028
bows-0.028
Token each
Token ID1123
Feature activation+8.07e-02

Pos logit contributions
 dismay+0.186
 happily+0.179
ris+0.179
 correct+0.178
 Whoever+0.161
Neg logit contributions
 Boom-0.307
icated-0.298
stand-0.278
bows-0.275
bel-0.271
Token other
Token ID584
Feature activation-3.76e-04

Pos logit contributions
 correct+0.119
ris+0.117
 dismay+0.116
 happily+0.113
 certainly+0.113
Neg logit contributions
 Boom-0.098
icated-0.096
stand-0.085
oes-0.084
weet-0.082
Token and
Token ID290
Feature activation+1.55e-02

Pos logit contributions
 Boom+0.000
icated+0.000
stand+0.000
bows+0.000
 elbows+0.000
Neg logit contributions
ris-0.001
 correct-0.001
 dismay-0.001
 certainly-0.000
 happily-0.000
Token smiled
Token ID13541
Feature activation-1.05e-01

Pos logit contributions
 happily+0.021
 dismay+0.020
 correct+0.020
ris+0.019
 assistant+0.018
Neg logit contributions
 Boom-0.022
icated-0.021
stand-0.019
ained-0.019
oes-0.019
Token.
Token ID13
Feature activation+1.39e-02

Pos logit contributions
 Boom+0.169
icated+0.166
stand+0.152
 elbows+0.148
bows+0.148
Neg logit contributions
 happily-0.116
 dismay-0.113
 correct-0.111
ris-0.109
 assistant-0.102
Token They
Token ID1119
Feature activation+3.44e-02

Pos logit contributions
 dismay+0.014
 correct+0.014
 happily+0.013
ris+0.013
 Whoever+0.013
Neg logit contributions
icated-0.023
 Boom-0.023
stand-0.021
 elbows-0.021
bel-0.021
Token liked
Token ID8288
Feature activation-9.04e-02

Pos logit contributions
 correct+0.050
ris+0.050
 dismay+0.050
 happily+0.048
 assistant+0.047
Neg logit contributions
 Boom-0.043
icated-0.040
bows-0.035
stand-0.035
 elbows-0.035
Token the
Token ID262
Feature activation+4.38e-03

Pos logit contributions
 Boom+0.130
icated+0.123
stand+0.112
bows+0.110
 elbows+0.110
Neg logit contributions
 dismay-0.114
 correct-0.114
ris-0.113
 happily-0.111
 assistant-0.104
Token same
Token ID976
Feature activation-5.98e-02

Pos logit contributions
 correct+0.005
 dismay+0.005
ris+0.005
 surprise+0.004
 happily+0.004
Neg logit contributions
 Boom-0.007
icated-0.007
bel-0.006
 elbows-0.006
ric-0.006
Token made
Token ID925
Feature activation-1.97e-02

Pos logit contributions
 Boom+0.017
icated+0.015
bel+0.014
llers+0.013
stand+0.013
Neg logit contributions
 correct-0.023
 dismay-0.022
ris-0.022
 purchase-0.021
ait-0.021
Token the
Token ID262
Feature activation+3.71e-03

Pos logit contributions
 Boom+0.033
icated+0.032
oes+0.030
bows+0.030
drops+0.030
Neg logit contributions
 happily-0.020
 correct-0.020
 dismay-0.020
ris-0.019
 certainly-0.017
Token little
Token ID1310
Feature activation+6.06e-02

Pos logit contributions
 correct+0.002
 dismay+0.002
 happily+0.002
ris+0.002
 certainly+0.002
Neg logit contributions
 Boom-0.008
icated-0.008
stand-0.007
 keyboards-0.007
ained-0.007
Token pe
Token ID613
Feature activation+1.38e-01

Pos logit contributions
 dismay+0.034
 correct+0.032
 happily+0.031
ris+0.030
 surprise+0.026
Neg logit contributions
 Boom-0.132
icated-0.129
ained-0.125
stand-0.124
 keyboards-0.121
Tokenb
Token ID65
Feature activation-2.37e-01

Pos logit contributions
ris+0.260
 correct+0.259
 dismay+0.256
 happily+0.255
 exactly+0.244
Neg logit contributions
 Boom-0.120
icated-0.109
stand-0.092
bows-0.089
ained-0.085
Tokenble
Token ID903
Feature activation-2.38e-01

Pos logit contributions
 Boom+0.201
icated+0.194
stand+0.167
 elbows+0.158
 keyboards+0.158
Neg logit contributions
 happily-0.437
 correct-0.434
ris-0.431
 dismay-0.426
 certainly-0.422
Token his
Token ID465
Feature activation+3.19e-02

Pos logit contributions
 Boom+0.305
icated+0.297
stand+0.275
bel+0.272
bows+0.263
Neg logit contributions
ris-0.328
 dismay-0.318
 correct-0.313
aleb-0.300
 Whoever-0.299
Token special
Token ID2041
Feature activation+1.55e-01

Pos logit contributions
 dismay+0.036
 happily+0.035
 correct+0.034
ris+0.034
 purchase+0.032
Neg logit contributions
 Boom-0.053
icated-0.051
stand-0.047
ained-0.047
bows-0.047
Token treasure
Token ID14068
Feature activation+1.63e-01

Pos logit contributions
 dismay+0.178
ris+0.176
 correct+0.176
 happily+0.176
 assistant+0.161
Neg logit contributions
 Boom-0.247
icated-0.238
ained-0.218
 elbows-0.214
stand-0.214
Token and
Token ID290
Feature activation+1.35e-02

Pos logit contributions
 dismay+0.256
ait+0.240
aleb+0.239
 Whoever+0.238
ris+0.233
Neg logit contributions
bows-0.171
stand-0.167
 sound-0.166
bel-0.166
 songs-0.163
Token kept
Token ID4030
Feature activation+2.98e-02

Pos logit contributions
ris+0.017
 correct+0.017
 dismay+0.017
 happily+0.016
 assistant+0.016
Neg logit contributions
 Boom-0.019
icated-0.018
stand-0.016
 elbows-0.016
bows-0.016
Token who
Token ID508
Feature activation+1.16e-01

Pos logit contributions
 Boom+0.065
icated+0.063
ained+0.060
bows+0.058
 elbows+0.058
Neg logit contributions
 correct-0.044
 dismay-0.043
ris-0.042
 happily-0.041
 surprise-0.040
Token gave
Token ID2921
Feature activation+2.26e-02

Pos logit contributions
 dismay+0.166
 happily+0.158
ris+0.157
 correct+0.152
 assistant+0.149
Neg logit contributions
icated-0.152
 Boom-0.149
bows-0.135
stand-0.131
 rhy-0.127
Token me
Token ID502
Feature activation+3.67e-02

Pos logit contributions
 correct+0.021
 dismay+0.021
 happily+0.021
ris+0.019
 surprise+0.018
Neg logit contributions
 Boom-0.039
icated-0.038
drops-0.036
ained-0.036
bows-0.036
Token this
Token ID428
Feature activation+6.66e-02

Pos logit contributions
 correct+0.037
 happily+0.036
ris+0.035
 dismay+0.034
 assistant+0.033
Neg logit contributions
 Boom-0.061
icated-0.059
 elbows-0.057
drops-0.056
ained-0.056
Token doll
Token ID3654
Feature activation+3.29e-02

Pos logit contributions
 correct+0.075
ris+0.074
 dismay+0.074
 happily+0.073
 assistant+0.068
Neg logit contributions
 Boom-0.105
icated-0.101
 elbows-0.091
stand-0.091
ained-0.090
Token?"
Token ID1701
Feature activation-2.40e-01

Pos logit contributions
 correct+0.042
 happily+0.041
ris+0.041
 assistant+0.039
 dismay+0.039
Neg logit contributions
 Boom-0.048
icated-0.045
 elbows-0.042
ained-0.041
 keyboards-0.040
Token she
Token ID673
Feature activation+2.32e-02

Pos logit contributions
 Boom+0.381
icated+0.342
stand+0.331
 elbows+0.318
bel+0.311
Neg logit contributions
ris-0.294
 correct-0.278
 dismay-0.269
 happily-0.254
 certainly-0.252
Token says
Token ID1139
Feature activation+2.32e-02

Pos logit contributions
ris+0.026
 dismay+0.024
 correct+0.024
 happily+0.023
 purchase+0.022
Neg logit contributions
 Boom-0.039
icated-0.036
stand-0.033
 elbows-0.033
bows-0.032
Token.
Token ID13
Feature activation+9.96e-03

Pos logit contributions
 happily+0.026
 correct+0.025
 dismay+0.025
ris+0.024
 assistant+0.023
Neg logit contributions
 Boom-0.038
icated-0.037
ained-0.033
ric-0.033
bows-0.033
Token
Token ID198
Feature activation-9.31e-03

Pos logit contributions
 dismay+0.012
 correct+0.012
ris+0.012
 happily+0.012
 assistant+0.011
Neg logit contributions
 Boom-0.015
icated-0.014
stand-0.013
bows-0.013
 elbows-0.013
Token
Token ID198
Feature activation-9.43e-03

Pos logit contributions
icated+0.015
 Boom+0.015
drops+0.014
bows+0.014
ained+0.013
Neg logit contributions
 happily-0.010
 dismay-0.010
ris-0.010
 correct-0.010
 Whoever-0.009
Token They
Token ID1119
Feature activation+3.14e-02

Pos logit contributions
 correct+0.022
 dismay+0.021
ris+0.021
 security+0.021
 assistant+0.020
Neg logit contributions
 Boom-0.018
icated-0.015
stand-0.014
ric-0.014
ained-0.014
Token added
Token ID2087
Feature activation+2.24e-02

Pos logit contributions
 dismay+0.048
 correct+0.047
ris+0.044
 assistant+0.044
 Whoever+0.043
Neg logit contributions
icated-0.036
 Boom-0.036
drops-0.031
 rhy-0.030
bel-0.030
Token pine
Token ID20161
Feature activation+4.37e-02

Pos logit contributions
 correct+0.027
 happily+0.027
ris+0.026
 dismay+0.026
 surprise+0.024
Neg logit contributions
 Boom-0.033
icated-0.033
stand-0.029
 elbows-0.029
ained-0.029
Token cones
Token ID47314
Feature activation+6.27e-02

Pos logit contributions
ris+0.056
 correct+0.054
 happily+0.054
 dismay+0.052
 assistant+0.050
Neg logit contributions
 Boom-0.062
icated-0.060
stand-0.054
 elbows-0.054
ained-0.053
Token and
Token ID290
Feature activation+1.40e-02

Pos logit contributions
 dismay+0.092
 correct+0.088
ris+0.087
 happily+0.083
 assistant+0.083
Neg logit contributions
 Boom-0.078
icated-0.075
stand-0.068
bows-0.068
oes-0.067
Token feathers
Token ID27400
Feature activation-2.26e-01

Pos logit contributions
 happily+0.017
ris+0.017
 correct+0.017
 dismay+0.017
 Whoever+0.015
Neg logit contributions
 Boom-0.020
icated-0.020
ained-0.018
stand-0.018
bows-0.017
Token,
Token ID11
Feature activation-4.40e-02

Pos logit contributions
 Boom+0.355
icated+0.343
 elbows+0.316
stand+0.306
ained+0.299
Neg logit contributions
 happily-0.263
ris-0.257
 correct-0.252
 dismay-0.237
 certainly-0.236
Token leaves
Token ID5667
Feature activation-5.21e-02

Pos logit contributions
 Boom+0.058
icated+0.055
drops+0.050
bows+0.050
stand+0.050
Neg logit contributions
 dismay-0.062
ris-0.061
 correct-0.060
 happily-0.057
 assistant-0.057
Token and
Token ID290
Feature activation+1.55e-02

Pos logit contributions
 Boom+0.063
icated+0.060
stand+0.056
bows+0.056
drops+0.055
Neg logit contributions
 dismay-0.078
 correct-0.075
ris-0.075
 happily-0.070
 assistant-0.070
Token feathers
Token ID27400
Feature activation-2.27e-01

Pos logit contributions
 correct+0.020
 happily+0.019
ris+0.018
 dismay+0.018
 surprise+0.017
Neg logit contributions
 Boom-0.022
icated-0.021
ained-0.021
stand-0.019
oes-0.019
Token.
Token ID13
Feature activation+1.14e-02

Pos logit contributions
 Boom+0.365
icated+0.349
 elbows+0.319
stand+0.311
ained+0.306
Neg logit contributions
 happily-0.254
ris-0.250
 correct-0.248
 dismay-0.233
 certainly-0.228
Token
Token ID198
Feature activation-1.17e-02

Pos logit contributions
 Boom+0.013
icated+0.012
stand+0.011
ained+0.010
ric+0.010
Neg logit contributions
 correct-0.017
 dismay-0.017
ris-0.017
 happily-0.016
 assistant-0.016
TokenAlbert
Token ID42590
Feature activation+1.20e-01

Pos logit contributions
 Boom+0.016
icated+0.015
Br+0.014
Rain+0.013
bel+0.013
Neg logit contributions
 correct-0.015
 dismay-0.015
ris-0.015
 assistant-0.015
 purchase-0.014
Token then
Token ID788
Feature activation-7.20e-02

Pos logit contributions
 correct+0.166
 dismay+0.163
ris+0.161
 Whoever+0.153
 assistant+0.152
Neg logit contributions
 Boom-0.150
icated-0.147
stand-0.129
fl-0.127
ric-0.127
Token remembered
Token ID12086
Feature activation+5.65e-04

Pos logit contributions
 Boom+0.091
icated+0.082
stand+0.074
bel+0.074
weet+0.071
Neg logit contributions
ris-0.103
 correct-0.101
 dismay-0.099
 Whoever-0.097
-0.094
Token the
Token ID262
Feature activation+3.31e-03

Pos logit contributions
ris+0.001
 correct+0.001
 dismay+0.001
 happily+0.001
 assistant+0.001
Neg logit contributions
 Boom-0.001
icated-0.001
stand-0.001
 elbows-0.001
bows-0.001
Token thunder
Token ID18355
Feature activation-2.46e-01

Pos logit contributions
ris+0.004
 dismay+0.004
 happily+0.004
 correct+0.004
 Whoever+0.004
Neg logit contributions
 Boom-0.005
icated-0.004
bows-0.004
drops-0.004
stand-0.004
Token,
Token ID11
Feature activation-4.42e-02

Pos logit contributions
 Boom+0.324
icated+0.303
bows+0.280
stand+0.277
drops+0.275
Neg logit contributions
 dismay-0.345
ris-0.342
 correct-0.338
 happily-0.323
 assistant-0.315
Token and
Token ID290
Feature activation+1.61e-02

Pos logit contributions
 Boom+0.064
icated+0.061
stand+0.055
 elbows+0.055
bows+0.054
Neg logit contributions
 correct-0.055
 dismay-0.055
ris-0.054
 happily-0.054
 certainly-0.050
Token he
Token ID339
Feature activation-3.01e-03

Pos logit contributions
ris+0.020
 dismay+0.020
 correct+0.020
 happily+0.019
 assistant+0.018
Neg logit contributions
 Boom-0.023
icated-0.022
stand-0.020
bows-0.020
 elbows-0.020
Token knew
Token ID2993
Feature activation+1.93e-02

Pos logit contributions
 Boom+0.003
icated+0.003
bel+0.003
fl+0.002
 rhy+0.002
Neg logit contributions
 correct-0.005
 dismay-0.005
ris-0.005
 purchase-0.005
-0.005
Token it
Token ID340
Feature activation+7.02e-03

Pos logit contributions
ris+0.031
 dismay+0.030
 Whoever+0.030
 correct+0.030
aleb+0.030
Neg logit contributions
 Boom-0.019
icated-0.018
stand-0.016
ric-0.015
 sound-0.015
Token could
Token ID714
Feature activation-7.17e-02

Pos logit contributions
 Boom+0.117
icated+0.112
stand+0.100
 elbows+0.100
ained+0.097
Neg logit contributions
ris-0.115
 correct-0.115
 happily-0.114
 dismay-0.113
 certainly-0.106
Token make
Token ID787
Feature activation-8.54e-02

Pos logit contributions
 Boom+0.096
icated+0.091
stand+0.082
bows+0.080
 elbows+0.080
Neg logit contributions
 dismay-0.097
ris-0.097
 correct-0.097
 happily-0.094
 assistant-0.089
Token such
Token ID884
Feature activation+8.19e-02

Pos logit contributions
 Boom+0.121
icated+0.114
stand+0.104
bows+0.102
 elbows+0.100
Neg logit contributions
ris-0.114
 dismay-0.110
 correct-0.109
 happily-0.105
 assistant-0.103
Token a
Token ID257
Feature activation-3.74e-02

Pos logit contributions
 correct+0.101
 dismay+0.101
 happily+0.099
ris+0.099
 assistant+0.091
Neg logit contributions
 Boom-0.120
icated-0.115
stand-0.105
oes-0.103
 elbows-0.102
Token nice
Token ID3621
Feature activation+5.91e-04

Pos logit contributions
 Boom+0.055
icated+0.052
stand+0.048
bows+0.047
 elbows+0.046
Neg logit contributions
ris-0.045
 dismay-0.045
 correct-0.045
 happily-0.043
 assistant-0.042
Token sound
Token ID2128
Feature activation-2.84e-01

Pos logit contributions
 happily+0.001
 dismay+0.001
ris+0.001
 correct+0.001
 surprise+0.001
Neg logit contributions
 Boom-0.001
icated-0.001
 elbows-0.001
ained-0.001
stand-0.000
Token.
Token ID13
Feature activation+6.35e-03

Pos logit contributions
 Boom+0.383
icated+0.363
stand+0.336
bows+0.328
drops+0.323
Neg logit contributions
 dismay-0.388
ris-0.387
 correct-0.380
 happily-0.362
 assistant-0.355
Token She
Token ID1375
Feature activation+3.63e-02

Pos logit contributions
 happily+0.007
 correct+0.007
 dismay+0.006
ris+0.006
 Whoever+0.006
Neg logit contributions
 Boom-0.010
icated-0.010
bel-0.010
 elbows-0.009
drops-0.009
Token liked
Token ID8288
Feature activation-9.36e-02

Pos logit contributions
ris+0.048
 happily+0.048
 dismay+0.047
 correct+0.047
 assistant+0.044
Neg logit contributions
 Boom-0.050
icated-0.048
stand-0.044
bows-0.043
 elbows-0.043
Token to
Token ID284
Feature activation-4.75e-02

Pos logit contributions
 Boom+0.135
icated+0.130
 elbows+0.118
ained+0.118
bows+0.117
Neg logit contributions
 happily-0.116
 dismay-0.115
 correct-0.114
ris-0.112
 Whoever-0.105
Token pretend
Token ID16614
Feature activation+3.62e-02

Pos logit contributions
 Boom+0.071
icated+0.068
ained+0.061
stand+0.060
 elbows+0.060
Neg logit contributions
 correct-0.058
ris-0.057
 dismay-0.057
 happily-0.057
 purchase-0.052