TOP ACTIVATIONS
MAX = 0.196

she got rid of the intr uders . She did not
balloon . A str id held onto her balloon
it was , and the elderly lady handed him a large
big ostr ich . The ostr ich liked to swim .
On the way , A str id saw a pond filled
John went to visit his grand pa . Grand pa showed
share her love with the elderly lady .
doing hopping around ?" the elderly rabbit said with a smile
help ." The ge ese gave Lily and Tom
boy came to visit the wise man . The wise man
refrigerator ! The elderly woman was shocked but in
play . The ostr ich was glad when it
time there was a blue ostr ich that lived in the
They went to visit the elderly neighbour and she said ,
couldn 't see what the ostr ich was hiding , but
okay . The elderly man smiled weak ly and
dog . The elderly man and his dog would
stayed away from the other ge ese . "
friend ." The ostr ich heard Ben 's voice
that day on , the elderly lady and the little boy

MIN ACTIVATIONS
MIN = -0.118

She searches and searches and searches . She makes a big
just remember that bells are never a bad thing and that
again . She searches and searches and searches . She makes
knew that his thumb was never going to be nice again
: " Del ay is never a waste of time ."
the calendar . It looks even better than before ."
He knows it will be even more perfect than his picture
to him . He was never lazy again ! <|endoftext|> Jane
the chess pieces . They searched and searched , but the
big , blue button was never used again and it became
And Freddy was never helpless again . He flew
She wanted to make them look more real .
too . His hold was never too tight , he understood
story is that it is never okay to lie . Lies
pieces . They searched and searched , but the chess pieces
them . She made them look nice again . L ila
happy because his guitar sounded even better when he played it
. And the kitchen was never gloomy anymore . <|endoftext|> Once
, the dark cane was never lonely . It always had
But Tim 's drum is never the same . He is

INTERVAL 0.117 - 0.196
CONTAINS 1.485%

was a big tree in the middle of the forest .
went back home with the icy pigeon . <|endoftext|> Bob was
Gr anny caught him at the bottom . The little boy
They take turns going on the slide . They laugh and
pocket and spin it in the sun . He would watch

INTERVAL 0.039 - 0.117
CONTAINS 15.752%

, " OK , let 's plan a fun game ."
there was a lucky cloth . The lucky cloth was very
delicious . The moral of this story is
, there was a little girl named Amy . Amy liked
was determined to have a good day . So her mother

INTERVAL -0.040 - 0.039
CONTAINS 63.126%

avocado . She walked up to him and said , "
hungry and wanted a big meal . But the cook was
I can find the most things ." So they started counting
environment . The moral of the story is that
you on an adventure !" The potato was excited

INTERVAL -0.118 - -0.040
CONTAINS 19.637%

my to go home . He waved at the hairy dog
her hand so that it would stay together .
" Yes , we can lift the window ."
his best friend . He hoped that he would find his
their adventures . They then ate a y ummy dinner together
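
NOTE ON THE INTERVALS
The four INTERVAL ranges above look like an equal-width split of the span from MIN (-0.118) to MAX (0.196): the width would be (0.196 - (-0.118)) / 4 = 0.0785, which reproduces the listed edges to within rounding, and the four CONTAINS percentages sum to 100%. The sketch below shows that binning under this assumption. The function names are hypothetical and it is not necessarily how the dashboard itself computes these values.

```python
def equal_width_bins(lo, hi, n_bins=4):
    """Return n_bins (low, high) intervals of equal width covering [lo, hi]."""
    width = (hi - lo) / n_bins
    return [(lo + i * width, lo + (i + 1) * width) for i in range(n_bins)]


def bin_percentages(activations, lo, hi, n_bins=4):
    """Percentage of activations falling in each bin, lowest bin first."""
    if not activations:
        return [0.0] * n_bins
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for a in activations:
        # Clamp the index so values at (or outside) the edges land in an end bin.
        idx = int((a - lo) / width)
        idx = max(0, min(idx, n_bins - 1))
        counts[idx] += 1
    total = len(activations)
    return [100.0 * c / total for c in counts]


# MIN and MAX as reported in the dump above (assumed to be the bin limits).
bins = equal_width_bins(-0.118, 0.196)
for low, high in reversed(bins):  # highest interval first, as in the dump
    print(f"INTERVAL {low:.3f} - {high:.3f}")
```

The printed edges may differ from the dump in the third decimal place, since the dump's own rounding rule is not known. Passing a list of per-token activations to bin_percentages would give values comparable to the CONTAINS lines.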
Token she
Token ID673
Feature activation-4.82e-02

Pos logit contributions
t+0.042
 don+0.033
't+0.033
 cannot+0.032
 fell+0.030
Neg logit contributions
 orchestra-0.067
 amusement-0.063
 films-0.063
 elephants-0.063
 lions-0.062
Token got
Token ID1392
Feature activation-5.62e-02

Pos logit contributions
t+0.057
't+0.041
 don+0.039
 cannot+0.039
w+0.038
Neg logit contributions
 orchestra-0.116
 elephant-0.110
 elephants-0.108
 lions-0.108
 monkeys-0.108
Token rid
Token ID5755
Feature activation+3.28e-02

Pos logit contributions
t+0.067
 don+0.050
't+0.049
 cannot+0.048
w+0.045
Neg logit contributions
 orchestra-0.136
 elephants-0.130
 films-0.130
 amusement-0.129
 elephant-0.129
Token of
Token ID286
Feature activation+2.42e-02

Pos logit contributions
 orchestra+0.062
 amusement+0.058
 films+0.058
 elephants+0.057
 lions+0.057
Neg logit contributions
t-0.059
 don-0.049
't-0.048
 cannot-0.048
w-0.045
Token the
Token ID262
Feature activation+1.19e-01

Pos logit contributions
 orchestra+0.057
 sym+0.055
 woodland+0.053
 melody+0.053
 amusement+0.053
Neg logit contributions
t-0.029
 don-0.027
 it-0.026
 cannot-0.026
't-0.025
Token intr
Token ID9913
Feature activation+1.96e-01

Pos logit contributions
 orchestra+0.155
latable+0.153
 amusement+0.151
 unison+0.148
 sym+0.145
Neg logit contributions
t-0.263
 don-0.249
 cannot-0.242
 fell-0.232
w-0.231
Tokenuders
Token ID48739
Feature activation+8.15e-03

Pos logit contributions
 orchestra+0.275
 amusement+0.264
 sym+0.244
 melody+0.243
 cartoon+0.240
Neg logit contributions
t-0.400
 cannot-0.368
't-0.366
 don-0.361
 not-0.359
Token.
Token ID13
Feature activation+1.53e-02

Pos logit contributions
 orchestra+0.018
 amusement+0.017
 films+0.017
 elephants+0.017
 lions+0.017
Neg logit contributions
t-0.012
 don-0.010
't-0.009
 cannot-0.009
 fell-0.009
Token She
Token ID1375
Feature activation-3.64e-02

Pos logit contributions
 orchestra+0.036
 films+0.034
 melody+0.034
 tunes+0.034
 amusement+0.033
Neg logit contributions
t-0.021
 cannot-0.017
 don-0.017
't-0.017
 not-0.015
Token did
Token ID750
Feature activation-5.14e-02

Pos logit contributions
t+0.049
't+0.037
 don+0.037
 cannot+0.035
w+0.034
Neg logit contributions
 orchestra-0.084
 elephants-0.080
 elephant-0.080
 amusement-0.080
 lions-0.080
Token not
Token ID407
Feature activation-4.68e-02

Pos logit contributions
t+0.040
w+0.021
 don+0.020
't+0.020
 cannot+0.019
Neg logit contributions
 orchestra-0.147
 amusement-0.142
 films-0.142
 lions-0.141
 monkeys-0.141
Token balloon
Token ID21190
Feature activation+8.25e-03

Pos logit contributions
 orchestra+0.208
 films+0.197
 amusement+0.196
 woodland+0.195
 elephants+0.190
Neg logit contributions
t-0.165
 don-0.134
't-0.134
 cannot-0.132
 fell-0.126
Token.
Token ID13
Feature activation+3.43e-02

Pos logit contributions
 orchestra+0.017
 elephant+0.017
 elephants+0.017
 lions+0.017
 monkeys+0.017
Neg logit contributions
t-0.012
 don-0.010
't-0.010
 cannot-0.009
w-0.008
Token
Token ID198
Feature activation+1.18e-02

Pos logit contributions
 orchestra+0.075
 amusement+0.071
 melody+0.071
 films+0.070
 elephants+0.070
Neg logit contributions
t-0.050
't-0.042
 don-0.041
 cannot-0.040
 they-0.039
Token
Token ID198
Feature activation+1.54e-02

Pos logit contributions
 orchestra+0.029
 elephants+0.028
 amusement+0.028
 films+0.028
 elephant+0.028
Neg logit contributions
t-0.014
't-0.011
 don-0.010
 cannot-0.010
 they-0.009
TokenA
Token ID32
Feature activation+6.01e-02

Pos logit contributions
 orchestra+0.035
 elephants+0.034
 films+0.033
 lions+0.033
 elephant+0.033
Neg logit contributions
t-0.021
't-0.017
 don-0.017
 cannot-0.016
w-0.015
Tokenstr
Token ID2536
Feature activation+1.82e-01

Pos logit contributions
 orchestra+0.087
latable+0.086
opl+0.084
 pasture+0.083
 amusement+0.083
Neg logit contributions
t-0.127
 they-0.107
 don-0.107
't-0.105
 fell-0.104
Tokenid
Token ID312
Feature activation+8.04e-02

Pos logit contributions
 orchestra+0.242
 films+0.235
 amusement+0.222
 movies+0.221
 cartoon+0.210
Neg logit contributions
t-0.433
't-0.383
 don-0.371
w-0.370
 cannot-0.360
Token held
Token ID2714
Feature activation-2.24e-02

Pos logit contributions
 orchestra+0.179
 films+0.173
 amusement+0.168
 sym+0.167
 elephants+0.166
Neg logit contributions
t-0.117
 don-0.093
't-0.091
 cannot-0.090
 fell-0.087
Token onto
Token ID4291
Feature activation+4.82e-04

Pos logit contributions
t+0.031
 don+0.026
't+0.026
 cannot+0.025
 fell+0.023
Neg logit contributions
 orchestra-0.050
 films-0.048
 amusement-0.047
 sym-0.047
 lions-0.047
Token her
Token ID607
Feature activation+8.68e-02

Pos logit contributions
 orchestra+0.001
 films+0.001
 sym+0.001
 amusement+0.001
 tunes+0.001
Neg logit contributions
t-0.001
't-0.001
 don-0.001
 it-0.001
 cannot-0.001
Token balloon
Token ID21190
Feature activation+3.10e-03

Pos logit contributions
 orchestra+0.141
 sym+0.129
 lions+0.129
 films+0.129
 ze+0.129
Neg logit contributions
t-0.159
 don-0.149
 cannot-0.146
 it-0.145
 not-0.145
Token it
Token ID340
Feature activation-3.51e-02

Pos logit contributions
t+0.002
 don+0.001
't+0.001
 cannot+0.001
 they+0.001
Neg logit contributions
 orchestra-0.005
 melody-0.005
 amusement-0.005
 sym-0.005
 films-0.005
Token was
Token ID373
Feature activation-1.17e-02

Pos logit contributions
t+0.043
 don+0.033
't+0.032
 cannot+0.032
w+0.029
Neg logit contributions
 orchestra-0.084
 amusement-0.080
 elephants-0.080
 films-0.080
 lions-0.080
Token,
Token ID11
Feature activation-1.61e-02

Pos logit contributions
t+0.019
 don+0.015
't+0.015
 cannot+0.015
w+0.014
Neg logit contributions
 orchestra-0.024
 films-0.022
 amusement-0.022
 lions-0.022
 elephants-0.022
Token and
Token ID290
Feature activation-1.61e-02

Pos logit contributions
t+0.023
 don+0.020
't+0.019
 cannot+0.019
 they+0.018
Neg logit contributions
 orchestra-0.035
 amusement-0.033
 woodland-0.033
 films-0.033
 elephants-0.033
Token the
Token ID262
Feature activation+1.25e-01

Pos logit contributions
t+0.020
 don+0.015
't+0.014
 cannot+0.014
w+0.013
Neg logit contributions
 orchestra-0.039
 amusement-0.037
 elephants-0.037
 films-0.037
 elephant-0.037
Token elderly
Token ID13830
Feature activation+1.75e-01

Pos logit contributions
latable+0.154
 orchestra+0.151
 films+0.146
 sym+0.144
 amusement+0.139
Neg logit contributions
t-0.273
 don-0.252
't-0.243
 they-0.241
 fell-0.240
Token lady
Token ID10846
Feature activation+5.28e-02

Pos logit contributions
latable+0.263
 films+0.252
 sym+0.246
 unison+0.244
 ig+0.238
Neg logit contributions
t-0.379
 they-0.342
't-0.323
 ever-0.312
 it-0.311
Token handed
Token ID10158
Feature activation-3.54e-02

Pos logit contributions
 orchestra+0.119
 films+0.114
 amusement+0.113
 elephants+0.112
 lions+0.112
Neg logit contributions
t-0.073
 don-0.059
't-0.057
 cannot-0.057
 fell-0.052
Token him
Token ID683
Feature activation+4.68e-03

Pos logit contributions
t+0.047
 don+0.037
't+0.036
 cannot+0.036
 fell+0.033
Neg logit contributions
 orchestra-0.081
 amusement-0.077
 elephants-0.077
 films-0.077
 lions-0.077
Token a
Token ID257
Feature activation+6.36e-02

Pos logit contributions
 orchestra+0.011
 melody+0.010
 films+0.010
 sym+0.010
 elephants+0.010
Neg logit contributions
t-0.006
 don-0.005
 cannot-0.005
't-0.005
 they-0.005
Token large
Token ID1588
Feature activation+1.03e-01

Pos logit contributions
 orchestra+0.100
 films+0.094
 lions+0.093
 elephants+0.093
 monkeys+0.093
Neg logit contributions
t-0.131
 don-0.113
't-0.111
 cannot-0.111
 fell-0.105
Token big
Token ID1263
Feature activation+1.24e-01

Pos logit contributions
opl+0.173
 orchestra+0.170
 amusement+0.167
 rhythms+0.162
child+0.161
Neg logit contributions
 don-0.165
 they-0.156
t-0.155
 not-0.149
 fell-0.148
Token ostr
Token ID42387
Feature activation+1.65e-01

Pos logit contributions
 unison+0.203
 tunes+0.195
 rhythms+0.195
 orchestra+0.191
pill+0.190
Neg logit contributions
 don-0.210
 can-0.202
 they-0.199
 it-0.198
 fell-0.197
Tokenich
Token ID488
Feature activation+7.36e-02

Pos logit contributions
 orchestra+0.283
 films+0.278
 amusement+0.273
 rhythms+0.263
 movies+0.260
Neg logit contributions
t-0.340
't-0.309
w-0.292
 not-0.277
 ever-0.273
Token.
Token ID13
Feature activation+5.32e-02

Pos logit contributions
 orchestra+0.167
 amusement+0.163
latable+0.157
 amaz+0.156
 unison+0.156
Neg logit contributions
t-0.110
't-0.092
 never-0.092
 fell-0.088
 not-0.087
Token The
Token ID383
Feature activation+1.10e-01

Pos logit contributions
 orchestra+0.086
 amusement+0.078
 melody+0.077
 films+0.077
 sym+0.076
Neg logit contributions
t-0.110
't-0.097
 don-0.096
 cannot-0.094
 fell-0.089
Token ostr
Token ID42387
Feature activation+1.73e-01

Pos logit contributions
 orchestra+0.062
 films+0.052
 amusement+0.049
 sym+0.045
 tunes+0.045
Neg logit contributions
t-0.337
 don-0.310
't-0.308
 cannot-0.305
 never-0.299
Tokenich
Token ID488
Feature activation+5.81e-02

Pos logit contributions
 orchestra+0.232
 films+0.220
 amusement+0.218
 movies+0.207
 melody+0.205
Neg logit contributions
t-0.411
't-0.366
 don-0.351
 cannot-0.350
w-0.350
Token liked
Token ID8288
Feature activation+7.62e-03

Pos logit contributions
 orchestra+0.125
 elephant+0.120
 lions+0.119
 elephants+0.119
 jungle+0.119
Neg logit contributions
t-0.085
 don-0.068
't-0.067
 cannot-0.066
w-0.061
Token to
Token ID284
Feature activation+2.63e-02

Pos logit contributions
 orchestra+0.022
 amusement+0.021
 films+0.021
 elephants+0.021
 woodland+0.021
Neg logit contributions
t-0.006
 don-0.004
't-0.003
 cannot-0.003
 fell-0.003
Token swim
Token ID9422
Feature activation-3.07e-02

Pos logit contributions
 orchestra+0.054
 amusement+0.051
 woodland+0.050
 films+0.050
 sym+0.050
Neg logit contributions
t-0.040
 don-0.036
't-0.035
 cannot-0.035
 fell-0.033
Token.
Token ID13
Feature activation+3.57e-02

Pos logit contributions
t+0.042
't+0.034
 don+0.033
 cannot+0.032
w+0.030
Neg logit contributions
 orchestra-0.066
 jungle-0.065
 lions-0.065
 films-0.065
 monkeys-0.064
TokenOn
Token ID2202
Feature activation-3.93e-02

Pos logit contributions
 orchestra+0.033
 amusement+0.032
 films+0.031
 elephants+0.031
 lions+0.031
Neg logit contributions
t-0.030
 don-0.025
 cannot-0.025
't-0.025
w-0.023
Token the
Token ID262
Feature activation+1.29e-01

Pos logit contributions
t+0.054
 they+0.045
't+0.045
 soon+0.043
 not+0.043
Neg logit contributions
 orchestra-0.089
 woodland-0.083
 elephants-0.083
 tunes-0.083
 melody-0.083
Token way
Token ID835
Feature activation-2.89e-02

Pos logit contributions
 orchestra+0.231
 alien+0.219
 films+0.216
 sym+0.216
 rhythms+0.215
Neg logit contributions
t-0.238
't-0.205
 they-0.199
 soon-0.192
 fell-0.191
Token,
Token ID11
Feature activation-5.57e-04

Pos logit contributions
t+0.048
 don+0.042
 cannot+0.042
w+0.040
't+0.039
Neg logit contributions
 orchestra-0.052
 elephant-0.050
 woodland-0.050
 amusement-0.049
 monkeys-0.049
Token A
Token ID317
Feature activation+5.39e-02

Pos logit contributions
t+0.001
 don+0.000
't+0.000
 they+0.000
 cannot+0.000
Neg logit contributions
 orchestra-0.001
 films-0.001
 amusement-0.001
 melody-0.001
 sym-0.001
Tokenstr
Token ID2536
Feature activation+1.73e-01

Pos logit contributions
 orchestra+0.090
 films+0.089
latable+0.084
opl+0.084
 rhythms+0.083
Neg logit contributions
t-0.102
 don-0.090
 they-0.087
 soon-0.087
 fell-0.086
Tokenid
Token ID312
Feature activation+7.61e-02

Pos logit contributions
 orchestra+0.249
 films+0.235
 amusement+0.229
 movies+0.222
 melody+0.221
Neg logit contributions
t-0.390
't-0.339
 don-0.334
 cannot-0.329
w-0.321
Token saw
Token ID2497
Feature activation+1.31e-02

Pos logit contributions
 orchestra+0.168
 films+0.161
 amusement+0.158
 elephants+0.157
 lions+0.156
Neg logit contributions
t-0.112
 don-0.089
't-0.088
 cannot-0.085
 fell-0.082
Token a
Token ID257
Feature activation+7.32e-02

Pos logit contributions
 orchestra+0.032
 films+0.029
 melody+0.029
 sym+0.029
 tunes+0.028
Neg logit contributions
t-0.017
 they-0.015
 it-0.014
 don-0.014
't-0.013
Token pond
Token ID16723
Feature activation+1.26e-02

Pos logit contributions
 films+0.119
 orchestra+0.118
 tunes+0.111
 rhythms+0.110
latable+0.108
Neg logit contributions
t-0.137
 don-0.128
 fell-0.121
 it-0.119
't-0.118
Token filled
Token ID5901
Feature activation+3.66e-02

Pos logit contributions
 orchestra+0.028
 films+0.027
 elephants+0.027
 amusement+0.026
 woodland+0.026
Neg logit contributions
t-0.018
 don-0.014
't-0.014
 cannot-0.014
 fell-0.013
Token John
Token ID1757
Feature activation+3.69e-02

Pos logit contributions
t+0.022
 cannot+0.020
 don+0.020
 will+0.018
 never+0.018
Neg logit contributions
 woodland-0.024
 orchestra-0.024
 lions-0.023
 films-0.023
 jungle-0.023
Token went
Token ID1816
Feature activation-2.05e-02

Pos logit contributions
 orchestra+0.080
 jungle+0.078
 ig+0.078
 elephant+0.077
 lions+0.077
Neg logit contributions
t-0.053
't-0.042
 don-0.040
 cannot-0.039
 not-0.036
Token to
Token ID284
Feature activation+4.30e-02

Pos logit contributions
t+0.023
 don+0.018
 cannot+0.017
't+0.016
w+0.015
Neg logit contributions
 orchestra-0.051
 films-0.049
 lions-0.049
 elephant-0.049
 elephants-0.049
Token visit
Token ID3187
Feature activation-4.61e-03

Pos logit contributions
 orchestra+0.104
 melody+0.099
 sym+0.099
 elephants+0.098
 tunes+0.098
Neg logit contributions
't-0.045
 don-0.044
 it-0.043
t-0.042
 they-0.041
Token his
Token ID465
Feature activation+1.11e-01

Pos logit contributions
t+0.007
 don+0.006
't+0.005
 cannot+0.005
 they+0.005
Neg logit contributions
 orchestra-0.010
 elephants-0.009
 monkeys-0.009
 lions-0.009
 elephant-0.009
Token grand
Token ID4490
Feature activation+1.72e-01

Pos logit contributions
 orchestra+0.221
 melody+0.214
ki+0.212
 tunes+0.211
 sym+0.208
Neg logit contributions
 not-0.158
 they-0.151
't-0.150
 cannot-0.149
 soon-0.148
Tokenpa
Token ID8957
Feature activation+4.47e-02

Pos logit contributions
 sym+0.242
Whe+0.239
 rhythms+0.235
 echoing+0.234
 orchestra+0.229
Neg logit contributions
t-0.294
 to-0.285
 never-0.284
 soon-0.284
 ever-0.281
Token.
Token ID13
Feature activation+4.63e-02

Pos logit contributions
 orchestra+0.093
 elephants+0.087
 monkeys+0.087
 amusement+0.087
 melody+0.086
Neg logit contributions
t-0.070
't-0.058
 don-0.057
 cannot-0.056
 they-0.054
Token Grand
Token ID5675
Feature activation+1.18e-01

Pos logit contributions
 orchestra+0.080
 amusement+0.075
 films+0.075
 elephants+0.074
 elephant+0.074
Neg logit contributions
t-0.088
 don-0.075
 cannot-0.074
w-0.073
't-0.073
Tokenpa
Token ID8957
Feature activation+1.30e-02

Pos logit contributions
 orchestra+0.158
 elephants+0.144
 films+0.143
 amusement+0.143
 lions+0.142
Neg logit contributions
t-0.274
 don-0.238
't-0.238
 cannot-0.235
w-0.225
Token showed
Token ID3751
Feature activation+2.49e-03

Pos logit contributions
 orchestra+0.030
 elephant+0.029
 amusement+0.029
 elephants+0.029
 lions+0.029
Neg logit contributions
t-0.017
't-0.013
 don-0.013
 cannot-0.012
w-0.012
Token share
Token ID2648
Feature activation-2.71e-02

Pos logit contributions
 orchestra+0.021
 elephants+0.020
 films+0.020
 lions+0.020
 monkeys+0.020
Neg logit contributions
t-0.013
 don-0.011
't-0.011
 cannot-0.010
 fell-0.009
Token her
Token ID607
Feature activation+1.01e-01

Pos logit contributions
t+0.030
 don+0.025
't+0.025
 cannot+0.024
 it+0.023
Neg logit contributions
 orchestra-0.066
 elephants-0.063
 woodland-0.063
 ig-0.063
 lions-0.063
Token love
Token ID1842
Feature activation-8.66e-04

Pos logit contributions
 orchestra+0.157
 elephants+0.145
 lions+0.145
 ze+0.144
 monkeys+0.142
Neg logit contributions
t-0.206
 don-0.189
 cannot-0.182
't-0.177
 not-0.174
Token with
Token ID351
Feature activation+1.35e-02

Pos logit contributions
t+0.001
't+0.001
 don+0.001
 cannot+0.001
w+0.001
Neg logit contributions
 orchestra-0.002
 films-0.002
 amusement-0.002
 elephant-0.002
 elephants-0.002
Token the
Token ID262
Feature activation+1.23e-01

Pos logit contributions
 orchestra+0.032
 amusement+0.030
 sym+0.030
 films+0.030
 lions+0.030
Neg logit contributions
t-0.017
 don-0.014
't-0.014
 cannot-0.014
 not-0.013
Token elderly
Token ID13830
Feature activation+1.72e-01

Pos logit contributions
 orchestra+0.187
latable+0.183
 sym+0.177
 films+0.176
 unison+0.173
Neg logit contributions
t-0.240
 don-0.235
 cannot-0.217
 will-0.216
 ever-0.214
Token lady
Token ID10846
Feature activation+5.10e-02

Pos logit contributions
latable+0.258
 unison+0.246
 sym+0.239
 films+0.236
 ig+0.235
Neg logit contributions
t-0.370
 they-0.335
 ever-0.330
 it-0.314
 don-0.313
Token.
Token ID13
Feature activation+2.90e-02

Pos logit contributions
 orchestra+0.125
 amusement+0.119
 sym+0.117
latable+0.117
 films+0.117
Neg logit contributions
t-0.063
 ever-0.051
 soon-0.051
 don-0.051
 they-0.050
Token
Token ID220
Feature activation-1.68e-02

Pos logit contributions
 orchestra+0.074
 films+0.070
 amusement+0.069
 elephants+0.069
 melody+0.069
Neg logit contributions
t-0.034
't-0.026
 don-0.026
 cannot-0.024
 they-0.023
Token
Token ID198
Feature activation+1.38e-02

Pos logit contributions
t+0.021
't+0.016
 don+0.016
 cannot+0.015
 fell+0.013
Neg logit contributions
 orchestra-0.041
 amusement-0.039
 films-0.039
 elephants-0.039
 lions-0.038
Token
Token ID198
Feature activation+1.79e-02

Pos logit contributions
 orchestra+0.032
 amusement+0.031
 films+0.030
 elephants+0.030
 lions+0.030
Neg logit contributions
t-0.018
 don-0.014
't-0.014
 cannot-0.014
w-0.012
Token doing
Token ID1804
Feature activation-1.90e-02

Pos logit contributions
t+0.016
 don+0.012
 cannot+0.012
't+0.012
w+0.012
Neg logit contributions
 orchestra-0.017
 elephant-0.016
 lions-0.016
 jungle-0.016
 elephants-0.016
Token hopping
Token ID47153
Feature activation-1.09e-02

Pos logit contributions
t+0.024
 don+0.017
 cannot+0.017
't+0.017
w+0.016
Neg logit contributions
 orchestra-0.046
 films-0.044
 amusement-0.044
 lions-0.043
 elephant-0.043
Token around
Token ID1088
Feature activation-2.26e-02

Pos logit contributions
t+0.014
 don+0.011
't+0.011
 cannot+0.011
 fell+0.010
Neg logit contributions
 orchestra-0.025
 amusement-0.024
 films-0.024
 elephants-0.024
 lions-0.024
Token?"
Token ID1701
Feature activation+2.92e-02

Pos logit contributions
t+0.028
 don+0.023
't+0.022
 cannot+0.022
w+0.019
Neg logit contributions
 orchestra-0.054
 amusement-0.051
 elephants-0.051
 films-0.051
 lions-0.050
Token the
Token ID262
Feature activation+1.30e-01

Pos logit contributions
 orchestra+0.066
latable+0.065
 amusement+0.064
 sym+0.063
 films+0.063
Neg logit contributions
t-0.038
't-0.034
 they-0.034
 don-0.034
 it-0.033
Token elderly
Token ID13830
Feature activation+1.72e-01

Pos logit contributions
 orchestra+0.183
latable+0.183
 amusement+0.181
 sym+0.176
opl+0.171
Neg logit contributions
t-0.256
't-0.247
 don-0.246
 it-0.241
 fell-0.239
Token rabbit
Token ID22746
Feature activation+6.02e-02

Pos logit contributions
latable+0.293
 unison+0.278
 amusement+0.272
 films+0.269
 sym+0.269
Neg logit contributions
t-0.331
 they-0.308
 it-0.304
't-0.304
 fell-0.291
Token said
Token ID531
Feature activation-5.68e-03

Pos logit contributions
 orchestra+0.142
 amusement+0.137
 films+0.136
 elephants+0.134
 lions+0.134
Neg logit contributions
t-0.076
 don-0.062
 cannot-0.060
't-0.060
 fell-0.057
Token with
Token ID351
Feature activation+2.69e-03

Pos logit contributions
t+0.008
't+0.006
 don+0.006
 cannot+0.006
 they+0.005
Neg logit contributions
 orchestra-0.013
 amusement-0.013
 elephants-0.013
 films-0.013
 jungle-0.013
Token a
Token ID257
Feature activation+5.61e-02

Pos logit contributions
 orchestra+0.006
 films+0.005
 lions+0.005
 elephants+0.005
 amusement+0.005
Neg logit contributions
t-0.004
 cannot-0.003
 don-0.003
't-0.003
 not-0.003
Token smile
Token ID8212
Feature activation-1.31e-02

Pos logit contributions
 orchestra+0.093
 amusement+0.087
 films+0.087
 elephants+0.087
 lions+0.086
Neg logit contributions
t-0.111
 don-0.096
't-0.095
 cannot-0.094
 fell-0.089
Token help
Token ID1037
Feature activation-2.31e-02

Pos logit contributions
t+0.004
 don+0.003
't+0.003
 cannot+0.003
w+0.003
Neg logit contributions
 orchestra-0.006
 amusement-0.006
 films-0.006
 elephants-0.006
 lions-0.006
Token."
Token ID526
Feature activation+1.75e-02

Pos logit contributions
t+0.029
 don+0.024
't+0.023
 cannot+0.022
 soon+0.021
Neg logit contributions
 orchestra-0.056
 amusement-0.053
 sym-0.053
 elephants-0.053
 woodland-0.053
Token
Token ID198
Feature activation+1.09e-02

Pos logit contributions
 orchestra+0.040
 films+0.038
 amusement+0.038
 elephants+0.038
 lions+0.038
Neg logit contributions
t-0.024
 don-0.018
't-0.018
 cannot-0.018
w-0.017
Token
Token ID198
Feature activation+1.42e-02

Pos logit contributions
 orchestra+0.028
 amusement+0.026
 elephants+0.026
 films+0.026
 elephant+0.026
Neg logit contributions
t-0.012
't-0.010
 don-0.009
 cannot-0.009
 they-0.008
TokenThe
Token ID464
Feature activation+1.10e-01

Pos logit contributions
 orchestra+0.038
 elephants+0.037
 lions+0.036
 films+0.036
 elephant+0.036
Neg logit contributions
t-0.014
't-0.010
 don-0.010
 cannot-0.009
 they-0.009
Token ge
Token ID4903
Feature activation+1.71e-01

Pos logit contributions
latable+0.163
 orchestra+0.156
 sym+0.151
ki+0.147
 pasture+0.145
Neg logit contributions
t-0.220
 don-0.211
 cannot-0.201
 they-0.200
 will-0.195
Tokenese
Token ID2771
Feature activation+4.88e-02

Pos logit contributions
 orchestra+0.214
 films+0.193
 amusement+0.181
 alien+0.174
 sym+0.171
Neg logit contributions
t-0.436
w-0.406
 don-0.390
 cannot-0.383
't-0.378
Token gave
Token ID2921
Feature activation-4.16e-02

Pos logit contributions
 orchestra+0.104
 elephants+0.099
 lions+0.099
 amusement+0.099
 films+0.099
Neg logit contributions
t-0.073
 don-0.058
't-0.058
 cannot-0.057
w-0.053
Token Lily
Token ID20037
Feature activation+6.28e-03

Pos logit contributions
t+0.045
 don+0.037
 cannot+0.035
't+0.033
 they+0.033
Neg logit contributions
 orchestra-0.108
 amusement-0.102
 sym-0.101
 melody-0.099
 woodland-0.099
Token and
Token ID290
Feature activation-2.38e-02

Pos logit contributions
 orchestra+0.016
 sym+0.015
 amusement+0.014
 melody+0.014
 woodland+0.014
Neg logit contributions
t-0.007
 don-0.006
 cannot-0.006
 they-0.006
 not-0.006
Token Tom
Token ID4186
Feature activation+1.60e-02

Pos logit contributions
t+0.032
 they+0.031
 don+0.031
 fell+0.029
 soon+0.028
Neg logit contributions
 orchestra-0.052
 sym-0.048
 melody-0.048
 woodland-0.048
 films-0.048
Token boy
Token ID2933
Feature activation+4.79e-02

Pos logit contributions
 amusement+0.143
 orchestra+0.141
latable+0.136
opl+0.134
 films+0.132
Neg logit contributions
t-0.207
 don-0.197
 they-0.189
 cannot-0.184
 fell-0.184
Token came
Token ID1625
Feature activation-1.71e-02

Pos logit contributions
 orchestra+0.067
 elephant+0.063
 elephants+0.063
 lions+0.063
 monkeys+0.062
Neg logit contributions
t-0.106
't-0.091
 don-0.091
 cannot-0.090
w-0.086
Token to
Token ID284
Feature activation+2.61e-02

Pos logit contributions
t+0.018
 don+0.013
't+0.013
 cannot+0.013
w+0.011
Neg logit contributions
 orchestra-0.044
 films-0.042
 amusement-0.042
 elephants-0.042
 lions-0.042
Token visit
Token ID3187
Feature activation-1.36e-02

Pos logit contributions
 orchestra+0.062
 sym+0.059
 melody+0.058
umm+0.057
ika+0.057
Neg logit contributions
 don-0.030
t-0.027
 it-0.027
't-0.026
 they-0.025
Token the
Token ID262
Feature activation+1.46e-01

Pos logit contributions
t+0.019
 don+0.016
't+0.016
 cannot+0.016
 soon+0.015
Neg logit contributions
 orchestra-0.030
 amusement-0.028
 elephants-0.028
 films-0.028
 sym-0.028
Token wise
Token ID10787
Feature activation+1.71e-01

Pos logit contributions
 orchestra+0.211
ika+0.206
 films+0.199
 unison+0.198
latable+0.197
Neg logit contributions
t-0.289
 don-0.274
 will-0.263
 soon-0.261
 cannot-0.260
Token man
Token ID582
Feature activation+3.21e-02

Pos logit contributions
 unison+0.272
ika+0.270
 orchestra+0.270
 films+0.265
 sym+0.262
Neg logit contributions
t-0.335
 don-0.302
 can-0.281
 it-0.278
 fell-0.276
Token.
Token ID13
Feature activation+3.29e-02

Pos logit contributions
 orchestra+0.066
 amusement+0.062
 films+0.062
 elephants+0.062
 woodland+0.061
Neg logit contributions
t-0.051
 don-0.042
't-0.041
 cannot-0.041
 fell-0.038
Token The
Token ID383
Feature activation+1.14e-01

Pos logit contributions
 orchestra+0.077
 melody+0.075
 tunes+0.074
 rhythms+0.072
 amusement+0.071
Neg logit contributions
t-0.042
't-0.037
 don-0.035
 cannot-0.034
 they-0.034
Token wise
Token ID10787
Feature activation+1.51e-01

Pos logit contributions
 orchestra+0.182
 sym+0.174
ika+0.169
 unison+0.168
 films+0.168
Neg logit contributions
t-0.222
 don-0.215
 cannot-0.201
 can-0.192
 not-0.191
Token man
Token ID582
Feature activation+2.03e-02

Pos logit contributions
 orchestra+0.227
 unison+0.226
 sym+0.224
 films+0.219
 melody+0.210
Neg logit contributions
t-0.318
 don-0.286
 cannot-0.268
 fell-0.263
't-0.259
Token refrigerator
Token ID30500
Feature activation+3.22e-02

Pos logit contributions
 orchestra+0.213
 amusement+0.201
 films+0.199
 elephants+0.198
 niece+0.197
Neg logit contributions
t-0.210
 don-0.179
't-0.177
 ever-0.172
 cannot-0.172
Token!
Token ID0
Feature activation+4.19e-02

Pos logit contributions
 orchestra+0.065
 elephants+0.062
 elephant+0.062
 lions+0.062
 amusement+0.062
Neg logit contributions
t-0.050
 don-0.041
't-0.041
 cannot-0.040
w-0.038
Token
Token ID198
Feature activation+1.53e-02

Pos logit contributions
 orchestra+0.088
 films+0.083
 amusement+0.083
 melody+0.083
 elephants+0.082
Neg logit contributions
t-0.065
't-0.055
 don-0.054
 cannot-0.052
 they-0.049
Token
Token ID198
Feature activation+1.80e-02

Pos logit contributions
 orchestra+0.036
 elephants+0.035
 films+0.035
 elephant+0.034
 melody+0.034
Neg logit contributions
t-0.020
't-0.016
 don-0.015
 cannot-0.015
 soon-0.014
TokenThe
Token ID464
Feature activation+1.26e-01

Pos logit contributions
 orchestra+0.039
 elephants+0.037
 films+0.037
 lions+0.037
 elephant+0.037
Neg logit contributions
t-0.028
't-0.023
 don-0.022
 cannot-0.021
 soon-0.020
Token elderly
Token ID13830
Feature activation+1.69e-01

Pos logit contributions
 orchestra+0.175
 sym+0.164
 amusement+0.162
 rhythms+0.158
latable+0.158
Neg logit contributions
t-0.272
 don-0.247
't-0.239
 cannot-0.238
 soon-0.229
Token woman
Token ID2415
Feature activation+3.64e-02

Pos logit contributions
 films+0.237
 unison+0.237
 sym+0.236
latable+0.236
 orchestra+0.232
Neg logit contributions
t-0.376
't-0.325
 ever-0.319
 they-0.317
 don-0.315
Token was
Token ID373
Feature activation-1.56e-02

Pos logit contributions
 orchestra+0.083
 films+0.079
 amusement+0.079
 elephants+0.078
 lions+0.078
Neg logit contributions
t-0.049
 don-0.039
't-0.038
 cannot-0.038
 fell-0.036
Token shocked
Token ID11472
Feature activation-2.82e-03

Pos logit contributions
t+0.022
't+0.018
 cannot+0.017
 don+0.017
 not+0.017
Neg logit contributions
 orchestra-0.035
 films-0.033
 elephants-0.033
 tunes-0.033
 melody-0.033
Token but
Token ID475
Feature activation-2.55e-02

Pos logit contributions
t+0.004
 don+0.003
't+0.003
 cannot+0.003
 they+0.003
Neg logit contributions
 orchestra-0.007
 films-0.006
 amusement-0.006
 elephants-0.006
 lions-0.006
Token in
Token ID287
Feature activation+1.98e-02

Pos logit contributions
t+0.033
't+0.025
 don+0.024
 cannot+0.024
w+0.023
Neg logit contributions
 orchestra-0.060
 amusement-0.057
 elephants-0.057
 elephant-0.057
 lions-0.057
Token play
Token ID711
Feature activation-1.28e-02

Pos logit contributions
 orchestra+0.077
 amusement+0.072
 films+0.072
 elephants+0.072
 lions+0.072
Neg logit contributions
t-0.063
 don-0.052
't-0.051
 cannot-0.051
w-0.047
Token.
Token ID13
Feature activation+3.01e-02

Pos logit contributions
t+0.019
 don+0.015
't+0.015
 cannot+0.015
w+0.014
Neg logit contributions
 orchestra-0.027
 films-0.026
 lions-0.026
 monkeys-0.026
 amusement-0.026
Token
Token ID198
Feature activation+1.22e-02

Pos logit contributions
 orchestra+0.054
 amusement+0.050
 films+0.049
 melody+0.049
 elephants+0.049
Neg logit contributions
t-0.058
't-0.050
 don-0.050
 cannot-0.049
 they-0.046
Token
Token ID198
Feature activation+1.74e-02

Pos logit contributions
 orchestra+0.023
 woodland+0.023
 jungle+0.022
 lions+0.022
 films+0.022
Neg logit contributions
t-0.018
 don-0.016
 cannot-0.016
w-0.015
 wouldn-0.015
TokenThe
Token ID464
Feature activation+1.12e-01

Pos logit contributions
 orchestra+0.037
 films+0.035
 elephants+0.035
 amusement+0.034
 lions+0.034
Neg logit contributions
t-0.028
't-0.023
 don-0.023
 cannot-0.022
 fell-0.020
Token ostr
Token ID42387
Feature activation+1.69e-01

Pos logit contributions
 orchestra+0.069
 films+0.056
 amusement+0.055
 melody+0.051
 sym+0.051
Neg logit contributions
t-0.341
 don-0.312
't-0.309
 cannot-0.306
 fell-0.298
Tokenich
Token ID488
Feature activation+5.06e-02

Pos logit contributions
 orchestra+0.208
 films+0.193
 amusement+0.191
 movies+0.182
 melody+0.180
Neg logit contributions
t-0.417
't-0.370
 don-0.363
 cannot-0.359
w-0.353
Token was
Token ID373
Feature activation-6.82e-03

Pos logit contributions
 orchestra+0.118
 lions+0.113
 elephant+0.113
 elephants+0.113
 monkeys+0.113
Neg logit contributions
t-0.064
't-0.050
 don-0.049
 cannot-0.048
w-0.044
Token glad
Token ID9675
Feature activation-1.70e-02

Pos logit contributions
t+0.010
 don+0.008
't+0.008
 cannot+0.008
w+0.007
Neg logit contributions
 orchestra-0.015
 films-0.014
 elephants-0.014
 amusement-0.014
 lions-0.014
Token when
Token ID618
Feature activation-3.12e-02

Pos logit contributions
t+0.018
 don+0.013
't+0.012
 cannot+0.012
w+0.011
Neg logit contributions
 orchestra-0.045
 amusement-0.043
 films-0.042
 elephants-0.042
 lions-0.042
Token it
Token ID340
Feature activation-3.59e-02

Pos logit contributions
t+0.042
 cannot+0.030
't+0.030
 don+0.030
w+0.029
Neg logit contributions
 orchestra-0.071
 lions-0.071
 monkeys-0.071
 elephants-0.070
 jungle-0.070
Token time
Token ID640
Feature activation+2.76e-03

Pos logit contributions
 orchestra+0.233
 amusement+0.219
 elephants+0.217
 films+0.214
 elephant+0.213
Neg logit contributions
t-0.175
 don-0.144
 cannot-0.143
't-0.140
 fell-0.132
Token there
Token ID612
Feature activation+3.12e-03

Pos logit contributions
 woodland+0.004
 lions+0.004
 ig+0.004
 elephant+0.004
 monkeys+0.004
Neg logit contributions
t-0.005
w-0.004
 cannot-0.004
 wouldn-0.004
 will-0.004
Token was
Token ID373
Feature activation+5.51e-02

Pos logit contributions
 orchestra+0.007
 ig+0.007
 woodland+0.007
 lions+0.007
 amusement+0.007
Neg logit contributions
t-0.004
 cannot-0.003
 don-0.003
't-0.003
w-0.003
Token a
Token ID257
Feature activation+1.01e-01

Pos logit contributions
 orchestra+0.131
 amusement+0.122
 melody+0.122
 elephants+0.121
 tunes+0.120
Neg logit contributions
t-0.072
't-0.059
 don-0.056
 cannot-0.056
 they-0.054
Token blue
Token ID4171
Feature activation+1.27e-01

Pos logit contributions
 orchestra+0.180
 amusement+0.177
opl+0.176
 rhythms+0.170
 tunes+0.167
Neg logit contributions
 don-0.170
t-0.165
 they-0.165
 fell-0.157
 soon-0.156
Token ostr
Token ID42387
Feature activation+1.69e-01

Pos logit contributions
 orchestra+0.204
 rhythms+0.202
 sym+0.198
opl+0.190
 amaz+0.188
Neg logit contributions
t-0.216
 they-0.211
 on-0.208
 can-0.203
 it-0.202
Tokenich
Token ID488
Feature activation+7.92e-02

Pos logit contributions
 orchestra+0.292
 films+0.280
 amusement+0.279
 rhythms+0.267
 movies+0.262
Neg logit contributions
t-0.343
't-0.317
w-0.303
 not-0.283
 ever-0.280
Token that
Token ID326
Feature activation+3.01e-02

Pos logit contributions
 orchestra+0.187
 amusement+0.182
 amaz+0.174
latable+0.174
 films+0.172
Neg logit contributions
t-0.113
 never-0.097
 could-0.095
 fell-0.095
 they-0.094
Token lived
Token ID5615
Feature activation+1.86e-02

Pos logit contributions
 orchestra+0.056
 elephants+0.054
 elephant+0.054
 lions+0.053
 woodland+0.053
Neg logit contributions
t-0.053
 don-0.044
 cannot-0.043
't-0.043
w-0.040
Token in
Token ID287
Feature activation+6.68e-02

Pos logit contributions
 orchestra+0.051
 amusement+0.049
 films+0.048
 elephants+0.048
 melody+0.047
Neg logit contributions
t-0.018
't-0.013
 don-0.013
 cannot-0.012
w-0.011
Token the
Token ID262
Feature activation+1.53e-01

Pos logit contributions
 orchestra+0.170
 sym+0.158
 amusement+0.155
ote+0.155
ki+0.154
Neg logit contributions
t-0.070
't-0.065
 don-0.063
 it-0.061
 not-0.060
TokenThey
Token ID2990
Feature activation-5.09e-02

Pos logit contributions
 orchestra+0.031
 films+0.029
 elephants+0.029
 lions+0.029
 ig+0.029
Neg logit contributions
t-0.025
't-0.021
 don-0.020
 soon-0.020
 never-0.019
Token went
Token ID1816
Feature activation-6.31e-02

Pos logit contributions
t+0.062
 don+0.048
't+0.047
 cannot+0.046
w+0.041
Neg logit contributions
 orchestra-0.123
 amusement-0.117
 films-0.117
 elephants-0.117
 lions-0.116
Token to
Token ID284
Feature activation+8.09e-03

Pos logit contributions
t+0.073
 don+0.059
 cannot+0.054
w+0.054
't+0.053
Neg logit contributions
 orchestra-0.155
 elephants-0.150
 films-0.149
 amusement-0.149
 lions-0.148
Token visit
Token ID3187
Feature activation-2.99e-02

Pos logit contributions
 orchestra+0.020
 tunes+0.019
 sym+0.019
 melody+0.019
 elephants+0.018
Neg logit contributions
 don-0.009
t-0.009
't-0.008
 they-0.008
 it-0.008
Token the
Token ID262
Feature activation+1.27e-01

Pos logit contributions
t+0.040
 don+0.035
 soon+0.034
 they+0.034
't+0.033
Neg logit contributions
 orchestra-0.069
 sym-0.064
 melody-0.064
 films-0.063
 elephants-0.063
Token elderly
Token ID13830
Feature activation+1.69e-01

Pos logit contributions
 orchestra+0.146
 tunes+0.144
 films+0.142
 unison+0.139
 melody+0.135
Neg logit contributions
t-0.289
 soon-0.277
 don-0.269
 not-0.262
 cannot-0.261
Token neighbour
Token ID12250
Feature activation+6.95e-02

Pos logit contributions
 unison+0.261
 sym+0.251
 films+0.249
 orchestra+0.248
 rhythms+0.241
Neg logit contributions
t-0.361
 they-0.318
't-0.301
 soon-0.301
 don-0.296
Token and
Token ID290
Feature activation-2.41e-02

Pos logit contributions
 orchestra+0.154
 melody+0.145
 sym+0.144
 films+0.143
 amusement+0.142
Neg logit contributions
t-0.102
 don-0.085
 they-0.082
 cannot-0.082
't-0.081
Token she
Token ID673
Feature activation-3.49e-02

Pos logit contributions
t+0.031
 don+0.024
't+0.024
 cannot+0.023
w+0.022
Neg logit contributions
 orchestra-0.056
 amusement-0.054
 elephants-0.054
 elephant-0.054
 films-0.053
Token said
Token ID531
Feature activation-8.81e-03

Pos logit contributions
t+0.046
 don+0.035
't+0.035
 cannot+0.034
w+0.033
Neg logit contributions
 orchestra-0.080
 amusement-0.077
 elephants-0.077
 elephant-0.076
 lions-0.076
Token,
Token ID11
Feature activation-3.61e-02

Pos logit contributions
t+0.013
 cannot+0.010
 don+0.010
't+0.010
w+0.010
Neg logit contributions
 orchestra-0.018
 lions-0.018
 films-0.018
 monkeys-0.018
 elephants-0.018
Token couldn
Token ID3521
Feature activation-3.49e-02

Pos logit contributions
 orchestra+0.081
 films+0.078
 amusement+0.077
 elephants+0.076
 lions+0.076
Neg logit contributions
t-0.045
 don-0.036
 cannot-0.035
't-0.034
 fell-0.034
Token't
Token ID470
Feature activation-1.56e-02

Pos logit contributions
t+0.017
w+0.014
 don+0.013
ged+0.011
 cannot+0.010
Neg logit contributions
 orchestra-0.096
 cartoon-0.094
 movies-0.093
 elephant-0.093
 jungle-0.093
Token see
Token ID766
Feature activation-2.14e-02

Pos logit contributions
t+0.026
't+0.021
 don+0.021
 cannot+0.021
w+0.021
Neg logit contributions
 orchestra-0.030
 amusement-0.028
 elephant-0.028
 melody-0.028
 woodland-0.028
Token what
Token ID644
Feature activation-2.43e-03

Pos logit contributions
t+0.028
 don+0.024
 cannot+0.023
't+0.022
 they+0.022
Neg logit contributions
 orchestra-0.049
 sym-0.047
 melody-0.046
 amusement-0.046
 woodland-0.046
Token the
Token ID262
Feature activation+1.28e-01

Pos logit contributions
t+0.002
 don+0.002
't+0.002
 cannot+0.001
w+0.001
Neg logit contributions
 orchestra-0.007
 amusement-0.006
 elephants-0.006
 films-0.006
 lions-0.006
Token ostr
Token ID42387
Feature activation+1.69e-01

Pos logit contributions
 orchestra+0.169
latable+0.165
 sym+0.156
ika+0.155
ki+0.153
Neg logit contributions
t-0.272
 don-0.259
 cannot-0.244
 will-0.241
 ever-0.238
Tokenich
Token ID488
Feature activation+5.90e-02

Pos logit contributions
 orchestra+0.243
 films+0.237
 amusement+0.228
 movies+0.220
 melody+0.217
Neg logit contributions
t-0.387
't-0.343
w-0.333
 don-0.328
 cannot-0.327
Token was
Token ID373
Feature activation-5.53e-03

Pos logit contributions
 orchestra+0.148
 amusement+0.141
 films+0.141
 elephants+0.141
 lions+0.140
Neg logit contributions
t-0.066
 don-0.049
't-0.048
 cannot-0.047
w-0.042
Token hiding
Token ID11816
Feature activation-4.99e-03

Pos logit contributions
t+0.010
 cannot+0.008
 don+0.008
't+0.008
w+0.007
Neg logit contributions
 orchestra-0.011
 elephant-0.010
 lions-0.010
 jungle-0.010
 elephants-0.010
Token,
Token ID11
Feature activation-2.40e-02

Pos logit contributions
t+0.007
't+0.006
 don+0.006
 cannot+0.005
w+0.005
Neg logit contributions
 orchestra-0.011
 amusement-0.010
 elephant-0.010
 lions-0.010
 ig-0.010
Token but
Token ID475
Feature activation-1.67e-02

Pos logit contributions
t+0.036
 don+0.029
't+0.029
 cannot+0.028
 fell+0.026
Neg logit contributions
 orchestra-0.051
 films-0.049
 amusement-0.049
 elephants-0.048
 lions-0.048
Token okay
Token ID8788
Feature activation+4.25e-04

Pos logit contributions
t+0.023
 don+0.018
't+0.017
 cannot+0.017
w+0.015
Neg logit contributions
 orchestra-0.045
 amusement-0.043
 films-0.042
 elephants-0.042
 lions-0.042
Token.
Token ID13
Feature activation+2.01e-02

Pos logit contributions
 orchestra+0.001
 films+0.001
 amusement+0.001
 elephants+0.001
 lions+0.001
Neg logit contributions
t-0.001
 don-0.001
't-0.001
 cannot-0.001
w-0.001
Token
Token ID198
Feature activation+6.99e-03

Pos logit contributions
 orchestra+0.042
 amusement+0.039
 films+0.039
 melody+0.038
 elephants+0.038
Neg logit contributions
t-0.033
't-0.027
 don-0.027
 cannot-0.026
 they-0.025
Token
Token ID198
Feature activation+1.28e-02

Pos logit contributions
 orchestra+0.015
 amusement+0.014
 films+0.014
 woodland+0.014
 monkeys+0.014
Neg logit contributions
t-0.010
 don-0.008
 cannot-0.008
't-0.007
w-0.007
TokenThe
Token ID464
Feature activation+1.11e-01

Pos logit contributions
 orchestra+0.027
 amusement+0.026
 elephant+0.026
 monkeys+0.026
 films+0.026
Neg logit contributions
t-0.018
 don-0.015
 cannot-0.015
't-0.014
 fell-0.014
Token elderly
Token ID13830
Feature activation+1.69e-01

Pos logit contributions
 orchestra+0.173
 sym+0.170
latable+0.161
 unison+0.159
 films+0.157
Neg logit contributions
t-0.221
 don-0.203
 cannot-0.201
't-0.200
 not-0.196
Token man
Token ID582
Feature activation+1.57e-02

Pos logit contributions
 sym+0.292
 unison+0.290
latable+0.281
 melody+0.277
 films+0.275
Neg logit contributions
t-0.325
 they-0.311
't-0.275
 it-0.271
 we-0.265
Token smiled
Token ID13541
Feature activation-3.14e-02

Pos logit contributions
 orchestra+0.036
 films+0.034
 amusement+0.034
 elephants+0.034
 lions+0.034
Neg logit contributions
t-0.021
 don-0.017
't-0.016
 cannot-0.016
 fell-0.015
Token weak
Token ID4939
Feature activation+2.47e-02

Pos logit contributions
t+0.044
 don+0.035
't+0.034
 cannot+0.034
 fell+0.031
Neg logit contributions
 orchestra-0.071
 amusement-0.067
 films-0.067
 elephants-0.067
 lions-0.067
Tokenly
Token ID306
Feature activation-2.84e-02

Pos logit contributions
 orchestra+0.050
 amusement+0.047
 films+0.047
 sym+0.047
 woodland+0.047
Neg logit contributions
t-0.042
 don-0.034
 cannot-0.033
't-0.032
 fell-0.032
Token and
Token ID290
Feature activation-4.87e-02

Pos logit contributions
t+0.035
 don+0.028
 cannot+0.027
't+0.026
 they+0.025
Neg logit contributions
 orchestra-0.069
 amusement-0.065
 films-0.065
 sym-0.065
 woodland-0.064
Token dog
Token ID3290
Feature activation+5.58e-02

Pos logit contributions
 orchestra+0.173
 films+0.159
 amusement+0.158
 sym+0.153
 melody+0.153
Neg logit contributions
t-0.255
 don-0.237
 cannot-0.228
't-0.221
 fell-0.220
Token.
Token ID13
Feature activation+2.93e-02

Pos logit contributions
 orchestra+0.127
 amusement+0.121
 films+0.121
 elephants+0.121
 lions+0.120
Neg logit contributions
t-0.075
 don-0.059
't-0.058
 cannot-0.057
w-0.052
Token
Token ID198
Feature activation+9.63e-03

Pos logit contributions
 orchestra+0.065
 melody+0.061
 amusement+0.061
 tunes+0.061
 films+0.060
Neg logit contributions
t-0.042
't-0.036
 don-0.035
 cannot-0.034
 they-0.033
Token
Token ID198
Feature activation+1.39e-02

Pos logit contributions
 orchestra+0.022
 amusement+0.021
 films+0.021
 lions+0.021
 elephants+0.021
Neg logit contributions
t-0.013
 don-0.010
't-0.010
 cannot-0.010
w-0.009
TokenThe
Token ID464
Feature activation+1.23e-01

Pos logit contributions
 orchestra+0.031
 amusement+0.029
 films+0.029
 elephants+0.029
 lions+0.029
Neg logit contributions
t-0.020
 don-0.016
't-0.016
 cannot-0.015
w-0.014
Token elderly
Token ID13830
Feature activation+1.69e-01

Pos logit contributions
 orchestra+0.173
 sym+0.160
latable+0.155
 films+0.153
ika+0.150
Neg logit contributions
t-0.258
 don-0.248
 cannot-0.235
 they-0.231
 soon-0.226
Token man
Token ID582
Feature activation+2.39e-02

Pos logit contributions
latable+0.295
 sym+0.294
 unison+0.287
 films+0.284
 orchestra+0.281
Neg logit contributions
t-0.318
 they-0.298
't-0.268
 ever-0.266
 we-0.265
Token and
Token ID290
Feature activation-6.90e-03

Pos logit contributions
 orchestra+0.054
 amusement+0.052
 films+0.051
 elephants+0.051
 lions+0.051
Neg logit contributions
t-0.033
 don-0.026
't-0.025
 cannot-0.025
w-0.023
Token his
Token ID465
Feature activation+8.05e-02

Pos logit contributions
 they+0.009
 don+0.009
t+0.008
 it+0.008
 soon+0.008
Neg logit contributions
 orchestra-0.015
 sym-0.015
latable-0.015
 melody-0.015
 amusement-0.014
Token dog
Token ID3290
Feature activation+5.16e-02

Pos logit contributions
 orchestra+0.133
 films+0.121
 sym+0.120
 melody+0.118
latable+0.116
Neg logit contributions
t-0.147
 don-0.143
 cannot-0.139
 ever-0.136
 they-0.133
Token would
Token ID561
Feature activation-6.29e-02

Pos logit contributions
 orchestra+0.122
 amusement+0.115
 films+0.115
 melody+0.115
 woodland+0.113
Neg logit contributions
t-0.065
 don-0.054
 cannot-0.052
't-0.051
 fell-0.048
Token stayed
Token ID9658
Feature activation-2.71e-02

Pos logit contributions
t+0.011
 don+0.009
't+0.009
 cannot+0.009
w+0.008
Neg logit contributions
 orchestra-0.014
 amusement-0.013
 films-0.013
 elephants-0.013
 lions-0.013
Token away
Token ID1497
Feature activation-3.13e-03

Pos logit contributions
t+0.042
't+0.033
 don+0.031
 cannot+0.031
w+0.030
Neg logit contributions
 orchestra-0.058
 films-0.055
 sym-0.055
 woodland-0.054
 amusement-0.054
Token from
Token ID422
Feature activation+5.72e-02

Pos logit contributions
t+0.005
't+0.004
 don+0.004
 cannot+0.004
w+0.003
Neg logit contributions
 orchestra-0.007
 amusement-0.006
 films-0.006
 lions-0.006
 elephants-0.006
Token the
Token ID262
Feature activation+1.31e-01

Pos logit contributions
 orchestra+0.119
 amusement+0.113
 films+0.113
 elephants+0.113
 lions+0.112
Neg logit contributions
t-0.089
 don-0.072
't-0.071
 cannot-0.070
w-0.065
Token other
Token ID584
Feature activation+1.10e-01

Pos logit contributions
 orchestra+0.169
latable+0.161
 films+0.155
ika+0.153
 amusement+0.152
Neg logit contributions
t-0.284
 don-0.269
 cannot-0.257
't-0.253
 will-0.252
Token ge
Token ID4903
Feature activation+1.69e-01

Pos logit contributions
 orchestra+0.153
latable+0.150
 sym+0.141
 Labrador+0.138
 melody+0.138
Neg logit contributions
t-0.225
 don-0.217
 ever-0.216
 cannot-0.208
 soon-0.207
Tokenese
Token ID2771
Feature activation+5.74e-02

Pos logit contributions
 orchestra+0.202
 amusement+0.176
 films+0.175
latable+0.167
 alien+0.162
Neg logit contributions
t-0.436
w-0.383
 ever-0.374
't-0.373
 never-0.371
Token.
Token ID13
Feature activation+2.88e-02

Pos logit contributions
 orchestra+0.125
 amusement+0.119
 films+0.118
 elephants+0.117
 woodland+0.116
Neg logit contributions
t-0.084
 don-0.068
't-0.067
 cannot-0.066
 fell-0.061
Token
Token ID198
Feature activation+1.15e-02

Pos logit contributions
 orchestra+0.064
 amusement+0.060
 melody+0.060
 tunes+0.059
 films+0.059
Neg logit contributions
t-0.042
't-0.036
 don-0.035
 cannot-0.034
 they-0.033
Token
Token ID198
Feature activation+1.69e-02

Pos logit contributions
 orchestra+0.028
 elephants+0.026
 amusement+0.026
 films+0.026
 elephant+0.026
Neg logit contributions
t-0.015
't-0.012
 don-0.012
 cannot-0.011
 soon-0.010
Token"
Token ID1
Feature activation+4.25e-03

Pos logit contributions
 orchestra+0.035
 films+0.033
 elephants+0.033
 amusement+0.032
 lions+0.032
Neg logit contributions
t-0.027
 don-0.023
't-0.023
 cannot-0.022
w-0.021
Token friend
Token ID1545
Feature activation+1.79e-02

Pos logit contributions
 orchestra+0.120
 films+0.117
latable+0.114
 sym+0.114
ki+0.111
Neg logit contributions
 don-0.145
t-0.142
 will-0.139
 ever-0.137
 soon-0.135
Token."
Token ID526
Feature activation+2.03e-02

Pos logit contributions
 orchestra+0.043
 amusement+0.041
 woodland+0.041
 films+0.041
 sym+0.041
Neg logit contributions
t-0.022
 don-0.018
't-0.017
 cannot-0.017
 soon-0.016
Token
Token ID198
Feature activation+5.17e-03

Pos logit contributions
 orchestra+0.047
 amusement+0.044
 films+0.044
 elephants+0.044
 lions+0.044
Neg logit contributions
t-0.028
 don-0.022
't-0.022
 cannot-0.021
 fell-0.019
Token
Token ID198
Feature activation+1.01e-02

Pos logit contributions
 orchestra+0.014
 elephants+0.013
 amusement+0.013
 films+0.013
 elephant+0.013
Neg logit contributions
t-0.005
't-0.004
 don-0.004
 cannot-0.004
 they-0.004
TokenThe
Token ID464
Feature activation+1.12e-01

Pos logit contributions
 orchestra+0.022
 films+0.021
 elephants+0.021
 amusement+0.020
 elephant+0.020
Neg logit contributions
t-0.015
 don-0.013
't-0.013
 cannot-0.012
w-0.012
Token ostr
Token ID42387
Feature activation+1.68e-01

Pos logit contributions
latable+0.104
 orchestra+0.100
 films+0.098
 tunes+0.094
 programmes+0.092
Neg logit contributions
t-0.280
 don-0.273
 never-0.268
 soon-0.266
 not-0.263
Tokenich
Token ID488
Feature activation+4.61e-02

Pos logit contributions
 films+0.250
 orchestra+0.249
 amusement+0.238
 movies+0.234
latable+0.230
Neg logit contributions
t-0.367
't-0.327
w-0.323
 don-0.317
 cannot-0.317
Token heard
Token ID2982
Feature activation-1.71e-02

Pos logit contributions
 orchestra+0.111
 films+0.107
 amusement+0.106
 woodland+0.103
 elephants+0.103
Neg logit contributions
t-0.058
 don-0.047
 cannot-0.045
 fell-0.044
't-0.043
Token Ben
Token ID3932
Feature activation+1.98e-02

Pos logit contributions
 it+0.019
 they+0.018
t+0.018
't+0.017
 don+0.016
Neg logit contributions
 orchestra-0.043
 sym-0.041
 rhythms-0.040
 tunes-0.040
 jungle-0.040
Token's
Token ID338
Feature activation+7.28e-02

Pos logit contributions
 orchestra+0.041
 amusement+0.039
 elephants+0.039
 films+0.039
 lions+0.038
Neg logit contributions
t-0.031
 don-0.026
't-0.025
 cannot-0.025
 fell-0.023
Token voice
Token ID3809
Feature activation-2.62e-02

Pos logit contributions
 orchestra+0.111
 sym+0.104
 amusement+0.103
 woodland+0.103
latable+0.102
Neg logit contributions
t-0.143
 not-0.133
 cannot-0.132
't-0.131
 ever-0.125
Token that
Token ID326
Feature activation+1.51e-02

Pos logit contributions
 orchestra+0.028
 films+0.027
 woodland+0.026
 elephants+0.026
 lions+0.026
Neg logit contributions
t-0.014
 don-0.011
't-0.011
 cannot-0.010
 soon-0.010
Token day
Token ID1110
Feature activation-3.88e-02

Pos logit contributions
 elephant+0.023
 jungle+0.023
 orchestra+0.023
 woodland+0.022
 ig+0.022
Neg logit contributions
t-0.030
 cannot-0.026
 don-0.026
't-0.026
ged-0.024
Token on
Token ID319
Feature activation+7.68e-03

Pos logit contributions
t+0.047
 don+0.037
 cannot+0.037
 not+0.032
use+0.031
Neg logit contributions
 elephant-0.079
 lions-0.078
 monkeys-0.078
 orchestra-0.078
 ze-0.078
Token,
Token ID11
Feature activation+6.41e-03

Pos logit contributions
 orchestra+0.017
 woodland+0.016
 films+0.016
 jungle+0.016
 elephant+0.016
Neg logit contributions
t-0.011
 don-0.008
 cannot-0.008
't-0.008
w-0.008
Token the
Token ID262
Feature activation+1.19e-01

Pos logit contributions
 orchestra+0.015
 amusement+0.014
 melody+0.014
 sym+0.014
 unison+0.014
Neg logit contributions
t-0.009
 don-0.008
 they-0.008
't-0.007
 soon-0.007
Token elderly
Token ID13830
Feature activation+1.68e-01

Pos logit contributions
 orchestra+0.190
latable+0.177
 films+0.175
 amusement+0.169
 sym+0.168
Neg logit contributions
 don-0.234
t-0.227
 soon-0.208
 cannot-0.207
 can-0.207
Token lady
Token ID10846
Feature activation+4.20e-02

Pos logit contributions
 unison+0.271
 films+0.262
latable+0.259
 orchestra+0.244
 amusement+0.243
Neg logit contributions
t-0.350
 they-0.313
 ever-0.305
't-0.299
 don-0.295
Token and
Token ID290
Feature activation-7.23e-03

Pos logit contributions
 orchestra+0.098
 elephants+0.093
 amusement+0.093
 films+0.093
 lions+0.093
Neg logit contributions
t-0.054
 don-0.042
't-0.042
 cannot-0.041
w-0.037
Token the
Token ID262
Feature activation+1.19e-01

Pos logit contributions
 they+0.009
 don+0.009
t+0.008
 soon+0.008
 it+0.008
Neg logit contributions
 orchestra-0.017
latable-0.016
 sym-0.016
 unison-0.016
 melody-0.016
Token little
Token ID1310
Feature activation+1.08e-01

Pos logit contributions
latable+0.180
 films+0.176
 orchestra+0.175
 sym+0.167
 programmes+0.166
Neg logit contributions
 don-0.234
t-0.215
 will-0.209
 soon-0.208
 can-0.207
Token boy
Token ID2933
Feature activation+4.10e-02

Pos logit contributions
opl+0.212
 sym+0.210
 orchestra+0.209
 films+0.208
latable+0.207
Neg logit contributions
 don-0.165
 they-0.164
 got-0.150
 will-0.150
 it-0.149
Token She
Token ID1375
Feature activation-4.73e-02

Pos logit contributions
 orchestra+0.000
 woodland+0.000
 films+0.000
 jungle+0.000
 elephants+0.000
Neg logit contributions
t-0.000
 don-0.000
w-0.000
 cannot-0.000
't-0.000
Token searches
Token ID15455
Feature activation-1.08e-01

Pos logit contributions
t+0.086
't+0.068
w+0.064
 don+0.064
 ever+0.060
Neg logit contributions
 orchestra-0.089
 amusement-0.085
 jungle-0.085
 lions-0.084
 films-0.084
Token and
Token ID290
Feature activation-5.53e-02

Pos logit contributions
t+0.208
't+0.175
w+0.174
uch+0.152
 don+0.151
Neg logit contributions
 ways-0.178
 films-0.176
 elephants-0.175
 orchestra-0.175
 moon-0.174
Token searches
Token ID15455
Feature activation-1.17e-01

Pos logit contributions
t+0.121
't+0.096
w+0.091
 ever+0.084
 don+0.083
Neg logit contributions
 orchestra-0.090
 ig-0.087
 moon-0.087
 amusement-0.086
 films-0.084
Token and
Token ID290
Feature activation-6.16e-02

Pos logit contributions
t+0.234
w+0.206
't+0.198
uch+0.176
 don+0.171
Neg logit contributions
 ways-0.184
 moon-0.180
 elephants-0.177
 orchestra-0.177
 lions-0.176
Token searches
Token ID15455
Feature activation-1.18e-01

Pos logit contributions
t+0.131
't+0.106
w+0.099
ve+0.089
ged+0.088
Neg logit contributions
 orchestra-0.100
 moon-0.098
 ig-0.098
 amusement-0.097
 jungle-0.095
Token.
Token ID13
Feature activation-2.52e-03

Pos logit contributions
t+0.244
w+0.215
't+0.209
 don+0.184
 cannot+0.182
Neg logit contributions
 ways-0.174
 moon-0.172
 orchestra-0.171
 lions-0.171
 elephants-0.171
Token She
Token ID1375
Feature activation-4.83e-02

Pos logit contributions
t+0.004
 don+0.004
w+0.004
't+0.004
 cannot+0.003
Neg logit contributions
 orchestra-0.005
 woodland-0.004
 jungle-0.004
 lions-0.004
 films-0.004
Token makes
Token ID1838
Feature activation-2.72e-02

Pos logit contributions
t+0.091
't+0.071
w+0.067
ged+0.064
 don+0.063
Neg logit contributions
 orchestra-0.087
 ig-0.086
 jungle-0.085
 amusement-0.083
 moon-0.083
Token a
Token ID257
Feature activation+4.44e-02

Pos logit contributions
t+0.032
 don+0.028
 not+0.027
 cannot+0.027
't+0.027
Neg logit contributions
 orchestra-0.065
 elephants-0.062
 sym-0.062
 amusement-0.062
 elephant-0.061
Token big
Token ID1263
Feature activation+7.71e-02

Pos logit contributions
 orchestra+0.075
 elephants+0.074
ican+0.072
 monkeys+0.071
 lions+0.071
Neg logit contributions
t-0.076
 cannot-0.072
 don-0.072
 not-0.070
 will-0.069
Token just
Token ID655
Feature activation-3.52e-02

Pos logit contributions
t+0.038
w+0.028
't+0.027
 cannot+0.027
ged+0.025
Neg logit contributions
 orchestra-0.067
 films-0.065
 amusement-0.064
 melody-0.064
 woodland-0.063
Token remember
Token ID3505
Feature activation-5.44e-02

Pos logit contributions
t+0.054
't+0.040
w+0.038
 cannot+0.037
 fell+0.035
Neg logit contributions
 orchestra-0.075
 elephant-0.073
 melody-0.072
 elephants-0.071
 amusement-0.071
Token that
Token ID326
Feature activation-2.41e-02

Pos logit contributions
t+0.072
 cannot+0.052
't+0.052
 don+0.051
w+0.049
Neg logit contributions
 orchestra-0.124
 lions-0.121
 elephants-0.120
 amusement-0.120
 woodland-0.119
Token bells
Token ID30987
Feature activation-2.63e-02

Pos logit contributions
t+0.032
 don+0.025
't+0.025
 cannot+0.024
w+0.022
Neg logit contributions
 orchestra-0.056
 amusement-0.053
 films-0.053
 elephants-0.053
 lions-0.053
Token are
Token ID389
Feature activation-3.91e-02

Pos logit contributions
t+0.029
't+0.021
ged+0.020
w+0.020
 cannot+0.019
Neg logit contributions
 orchestra-0.060
 elephant-0.058
 elephants-0.058
 lions-0.057
 melody-0.057
Token never
Token ID1239
Feature activation-1.18e-01

Pos logit contributions
t+0.063
't+0.049
w+0.048
 cannot+0.048
 don+0.048
Neg logit contributions
 orchestra-0.078
 elephants-0.075
 elephant-0.075
 lions-0.075
 monkeys-0.074
Token a
Token ID257
Feature activation+4.98e-02

Pos logit contributions
t+0.183
't+0.147
 don+0.146
 cannot+0.144
w+0.135
Neg logit contributions
 orchestra-0.247
 elephants-0.234
 amusement-0.233
 lions-0.232
 films-0.232
Token bad
Token ID2089
Feature activation+6.73e-02

Pos logit contributions
 orchestra+0.081
 films+0.077
 amusement+0.076
 elephants+0.075
 lions+0.075
Neg logit contributions
t-0.100
 don-0.086
't-0.085
 cannot-0.084
 fell-0.079
Token thing
Token ID1517
Feature activation+3.29e-03

Pos logit contributions
 orchestra+0.114
 films+0.107
 amusement+0.107
 elephants+0.106
 lions+0.105
Neg logit contributions
t-0.131
 don-0.113
't-0.111
 cannot-0.109
 fell-0.104
Token and
Token ID290
Feature activation-3.40e-02

Pos logit contributions
 orchestra+0.007
 elephant+0.007
 lions+0.007
 elephants+0.007
 monkeys+0.007
Neg logit contributions
t-0.004
 don-0.003
 cannot-0.003
w-0.003
't-0.003
Token that
Token ID326
Feature activation-2.01e-02

Pos logit contributions
t+0.051
't+0.039
w+0.037
 cannot+0.037
 don+0.036
Neg logit contributions
 orchestra-0.072
 elephants-0.069
 elephant-0.069
 lions-0.068
 melody-0.068
Token again
Token ID757
Feature activation-7.87e-02

Pos logit contributions
t+0.208
't+0.174
w+0.171
 don+0.153
 cannot+0.146
Neg logit contributions
 orchestra-0.184
 elephants-0.178
 films-0.178
 ways-0.178
 moon-0.177
Token.
Token ID13
Feature activation+2.57e-05

Pos logit contributions
t+0.114
 don+0.095
 cannot+0.092
't+0.092
 soon+0.089
Neg logit contributions
 orchestra-0.171
 elephants-0.161
 films-0.161
 amusement-0.161
 lions-0.160
Token She
Token ID1375
Feature activation-4.73e-02

Pos logit contributions
 orchestra+0.000
 woodland+0.000
 films+0.000
 jungle+0.000
 elephants+0.000
Neg logit contributions
t-0.000
 don-0.000
w-0.000
 cannot-0.000
't-0.000
Token searches
Token ID15455
Feature activation-1.08e-01

Pos logit contributions
t+0.086
't+0.068
w+0.064
 don+0.064
 ever+0.060
Neg logit contributions
 orchestra-0.089
 amusement-0.085
 jungle-0.085
 lions-0.084
 films-0.084
Token and
Token ID290
Feature activation-5.53e-02

Pos logit contributions
t+0.208
't+0.175
w+0.174
uch+0.152
 don+0.151
Neg logit contributions
 ways-0.178
 films-0.176
 elephants-0.175
 orchestra-0.175
 moon-0.174
Token searches
Token ID15455
Feature activation-1.17e-01

Pos logit contributions
t+0.121
't+0.096
w+0.091
 ever+0.084
 don+0.083
Neg logit contributions
 orchestra-0.090
 ig-0.087
 moon-0.087
 amusement-0.086
 films-0.084
Token and
Token ID290
Feature activation-6.16e-02

Pos logit contributions
t+0.234
w+0.206
't+0.198
uch+0.176
 don+0.171
Neg logit contributions
 ways-0.184
 moon-0.180
 elephants-0.177
 orchestra-0.177
 lions-0.176
Token searches
Token ID15455
Feature activation-1.18e-01

Pos logit contributions
t+0.131
't+0.106
w+0.099
ve+0.089
ged+0.088
Neg logit contributions
 orchestra-0.100
 moon-0.098
 ig-0.098
 amusement-0.097
 jungle-0.095
Token.
Token ID13
Feature activation-2.52e-03

Pos logit contributions
t+0.244
w+0.215
't+0.209
 don+0.184
 cannot+0.182
Neg logit contributions
 ways-0.174
 moon-0.172
 orchestra-0.171
 lions-0.171
 elephants-0.171
Token She
Token ID1375
Feature activation-4.83e-02

Pos logit contributions
t+0.004
 don+0.004
w+0.004
't+0.004
 cannot+0.003
Neg logit contributions
 orchestra-0.005
 woodland-0.004
 jungle-0.004
 lions-0.004
 films-0.004
Token makes
Token ID1838
Feature activation-2.72e-02

Pos logit contributions
t+0.091
't+0.071
w+0.067
ged+0.064
 don+0.063
Neg logit contributions
 orchestra-0.087
 ig-0.086
 jungle-0.085
 amusement-0.083
 moon-0.083
Token knew
Token ID2993
Feature activation-4.26e-02

Pos logit contributions
t+0.029
't+0.022
 don+0.022
 cannot+0.021
w+0.020
Neg logit contributions
 orchestra-0.055
 elephant-0.054
 amusement-0.053
 woodland-0.053
 lions-0.052
Token that
Token ID326
Feature activation-4.09e-02

Pos logit contributions
t+0.040
 don+0.029
't+0.028
 cannot+0.027
 they+0.025
Neg logit contributions
 orchestra-0.117
 films-0.111
 amusement-0.110
 sym-0.110
 elephants-0.110
Token his
Token ID465
Feature activation+6.75e-02

Pos logit contributions
t+0.047
 don+0.035
't+0.034
 cannot+0.034
w+0.030
Neg logit contributions
 orchestra-0.102
 amusement-0.098
 films-0.098
 elephants-0.097
 lions-0.097
Token thumb
Token ID15683
Feature activation-1.49e-02

Pos logit contributions
 orchestra+0.079
 amusement+0.071
 melody+0.071
 films+0.070
 woodland+0.070
Neg logit contributions
t-0.168
't-0.146
 don-0.145
 cannot-0.144
w-0.139
Token was
Token ID373
Feature activation-4.82e-02

Pos logit contributions
t+0.019
't+0.016
 don+0.015
 cannot+0.015
w+0.015
Neg logit contributions
 elephant-0.030
 orchestra-0.030
 tunes-0.030
 cartoon-0.030
 melody-0.030
Token never
Token ID1239
Feature activation-1.17e-01

Pos logit contributions
t+0.079
't+0.062
 don+0.061
 cannot+0.060
w+0.060
Neg logit contributions
 orchestra-0.091
 elephant-0.091
 films-0.089
 ig-0.089
 lions-0.089
Token going
Token ID1016
Feature activation-4.94e-02

Pos logit contributions
t+0.178
't+0.143
 don+0.141
 cannot+0.139
w+0.131
Neg logit contributions
 orchestra-0.247
 amusement-0.234
 films-0.233
 elephants-0.233
 elephant-0.232
Token to
Token ID284
Feature activation-1.15e-02

Pos logit contributions
t+0.066
w+0.062
 don+0.058
 cannot+0.055
't+0.049
Neg logit contributions
 elephant-0.100
 monkeys-0.100
 elephants-0.099
 jungle-0.098
 lions-0.097
Token be
Token ID307
Feature activation-1.35e-02

Pos logit contributions
t+0.016
 don+0.013
 cannot+0.012
't+0.012
w+0.012
Neg logit contributions
 orchestra-0.026
 amusement-0.025
 films-0.024
 elephants-0.024
 elephant-0.024
Token nice
Token ID3621
Feature activation-8.97e-03

Pos logit contributions
t+0.020
 don+0.016
't+0.016
 cannot+0.016
 fell+0.015
Neg logit contributions
 orchestra-0.029
 amusement-0.027
 films-0.027
 elephants-0.027
 lions-0.027
Token again
Token ID757
Feature activation-6.64e-02

Pos logit contributions
t+0.013
 don+0.010
't+0.010
 cannot+0.010
w+0.009
Neg logit contributions
 orchestra-0.020
 amusement-0.019
 lions-0.019
 elephants-0.019
 films-0.019
Token:
Token ID25
Feature activation-3.19e-02

Pos logit contributions
 orchestra+0.008
 amusement+0.007
 films+0.007
 elephants+0.007
 lions+0.007
Neg logit contributions
t-0.006
 don-0.005
't-0.005
 cannot-0.005
w-0.005
Token "
Token ID366
Feature activation-2.49e-02

Pos logit contributions
t+0.057
 cannot+0.046
 don+0.046
w+0.046
't+0.045
Neg logit contributions
 orchestra-0.059
 films-0.056
 lions-0.056
 elephants-0.056
 movies-0.055
TokenDel
Token ID13856
Feature activation+9.37e-02

Pos logit contributions
t+0.038
 don+0.033
 cannot+0.033
't+0.031
 fell+0.031
Neg logit contributions
 orchestra-0.049
 melody-0.046
 tunes-0.046
 elephants-0.046
 monkeys-0.046
Tokenay
Token ID323
Feature activation+2.62e-02

Pos logit contributions
 orchestra+0.134
 elephants+0.125
 films+0.124
 woodland+0.123
 amusement+0.123
Neg logit contributions
t-0.213
't-0.183
 cannot-0.176
 don-0.175
w-0.171
Token is
Token ID318
Feature activation-4.85e-02

Pos logit contributions
 orchestra+0.056
 elephant+0.054
 lions+0.053
 films+0.053
 elephants+0.053
Neg logit contributions
t-0.038
 don-0.030
 cannot-0.030
't-0.029
 fell-0.028
Token never
Token ID1239
Feature activation-1.16e-01

Pos logit contributions
t+0.068
 don+0.054
't+0.053
 cannot+0.052
w+0.048
Neg logit contributions
 orchestra-0.109
 amusement-0.104
 films-0.103
 elephants-0.103
 lions-0.103
Token a
Token ID257
Feature activation+4.91e-02

Pos logit contributions
t+0.165
 don+0.142
 cannot+0.132
't+0.131
 not+0.125
Neg logit contributions
 orchestra-0.252
 films-0.241
 amusement-0.238
 woodland-0.238
 elephants-0.238
Token waste
Token ID7030
Feature activation+2.03e-02

Pos logit contributions
 orchestra+0.098
 films+0.093
 amusement+0.093
 elephants+0.092
 lions+0.092
Neg logit contributions
t-0.080
 don-0.068
 cannot-0.066
't-0.065
 soon-0.061
Token of
Token ID286
Feature activation+4.04e-02

Pos logit contributions
 orchestra+0.043
 elephants+0.041
 films+0.041
 amusement+0.041
 lions+0.041
Neg logit contributions
t-0.030
 don-0.024
't-0.024
 cannot-0.024
w-0.022
Token time
Token ID640
Feature activation-5.00e-02

Pos logit contributions
 orchestra+0.076
 amusement+0.072
 films+0.072
 lions+0.071
 elephants+0.071
Neg logit contributions
t-0.073
't-0.058
 don-0.057
 cannot-0.057
w-0.057
Token."
Token ID526
Feature activation+1.15e-02

Pos logit contributions
t+0.076
 don+0.062
 cannot+0.062
't+0.060
w+0.060
Neg logit contributions
 orchestra-0.103
 lions-0.097
 films-0.097
 amusement-0.096
 elephants-0.096
Token the
Token ID262
Feature activation+1.15e-01

Pos logit contributions
t+0.028
 don+0.026
 it+0.025
 cannot+0.024
't+0.024
Neg logit contributions
 orchestra-0.055
 woodland-0.052
 sym-0.052
 Labrador-0.052
 melody-0.051
Token calendar
Token ID11845
Feature activation-9.47e-03

Pos logit contributions
latable+0.127
 Labrador+0.124
 woodland+0.123
 ig+0.123
 orchestra+0.122
Neg logit contributions
t-0.271
 don-0.249
 cannot-0.247
't-0.246
 will-0.239
Token.
Token ID13
Feature activation+2.40e-03

Pos logit contributions
t+0.014
 don+0.012
't+0.012
 cannot+0.011
w+0.011
Neg logit contributions
 orchestra-0.020
 films-0.019
 lions-0.019
 elephant-0.019
 elephants-0.019
Token It
Token ID632
Feature activation-7.24e-02

Pos logit contributions
 orchestra+0.005
 amusement+0.005
 films+0.005
 elephants+0.005
 lions+0.005
Neg logit contributions
t-0.004
 don-0.003
't-0.003
 cannot-0.003
w-0.003
Token looks
Token ID3073
Feature activation-8.26e-02

Pos logit contributions
t+0.111
ged+0.093
w+0.088
 don+0.086
't+0.084
Neg logit contributions
 elephant-0.144
 moon-0.141
 elephants-0.140
 orchestra-0.140
 lions-0.140
Token even
Token ID772
Feature activation-1.16e-01

Pos logit contributions
t+0.184
't+0.157
w+0.154
 cannot+0.145
 don+0.144
Neg logit contributions
 orchestra-0.121
 films-0.113
 movies-0.112
 amusement-0.112
 elephants-0.112
Token better
Token ID1365
Feature activation-5.89e-02

Pos logit contributions
t+0.261
't+0.226
 cannot+0.214
 don+0.213
w+0.208
Neg logit contributions
 orchestra-0.168
 films-0.155
 elephants-0.155
 movies-0.152
 moon-0.152
Token than
Token ID621
Feature activation-4.57e-02

Pos logit contributions
t+0.113
't+0.098
ged+0.092
w+0.092
 cannot+0.090
Neg logit contributions
 lions-0.093
 elderly-0.089
 moon-0.089
 monkeys-0.089
 jungle-0.089
Token before
Token ID878
Feature activation-2.51e-02

Pos logit contributions
t+0.074
 cannot+0.059
 don+0.059
't+0.059
w+0.056
Neg logit contributions
 orchestra-0.089
 lions-0.088
 elephants-0.088
 monkeys-0.086
 elephant-0.086
Token."
Token ID526
Feature activation-2.54e-03

Pos logit contributions
t+0.043
 don+0.034
 cannot+0.033
w+0.032
't+0.032
Neg logit contributions
 films-0.048
 orchestra-0.048
 lions-0.047
 moon-0.046
 jungle-0.046
Token\n
Token ID198
Feature activation-6.64e-03

Pos logit contributions
t+0.004
w+0.004
 don+0.003
't+0.003
 cannot+0.003
Neg logit contributions
 orchestra-0.005
 films-0.005
 jungle-0.005
 monkeys-0.005
 movies-0.005
Token He
Token ID679
Feature activation-6.20e-02

Pos logit contributions
 orchestra+0.004
 amusement+0.004
 films+0.004
 elephants+0.004
 lions+0.004
Neg logit contributions
t-0.003
 don-0.002
't-0.002
 cannot-0.002
w-0.002
Token knows
Token ID4206
Feature activation-5.08e-02

Pos logit contributions
t+0.100
't+0.080
 don+0.079
 cannot+0.076
w+0.073
Neg logit contributions
 orchestra-0.127
 amusement-0.120
 elephants-0.120
 films-0.120
 lions-0.119
Token it
Token ID340
Feature activation-7.51e-02

Pos logit contributions
t+0.068
 don+0.053
't+0.052
 cannot+0.051
w+0.047
Neg logit contributions
 orchestra-0.117
 amusement-0.112
 films-0.112
 elephants-0.112
 lions-0.112
Token will
Token ID481
Feature activation-9.38e-02

Pos logit contributions
t+0.106
't+0.080
 don+0.080
ged+0.078
w+0.077
Neg logit contributions
 orchestra-0.160
 lions-0.159
 cartoon-0.158
 films-0.158
 movies-0.157
Token be
Token ID307
Feature activation-3.23e-02

Pos logit contributions
ged+0.161
t+0.161
w+0.149
ed+0.139
itates+0.129
Neg logit contributions
 moon-0.157
 orchestra-0.153
 melody-0.149
 amusement-0.148
 elephant-0.147
Token even
Token ID772
Feature activation-1.15e-01

Pos logit contributions
t+0.063
't+0.052
 cannot+0.049
 don+0.049
w+0.047
Neg logit contributions
 orchestra-0.056
 lions-0.055
 films-0.055
 elephants-0.055
 monkeys-0.054
Token more
Token ID517
Feature activation-7.75e-03

Pos logit contributions
t+0.229
't+0.193
 cannot+0.181
 don+0.178
w+0.177
Neg logit contributions
 orchestra-0.195
 elephants-0.184
 amusement-0.184
 moon-0.182
 melody-0.182
Token perfect
Token ID2818
Feature activation+2.63e-04

Pos logit contributions
t+0.017
't+0.015
 cannot+0.014
 don+0.014
w+0.014
Neg logit contributions
 orchestra-0.011
 amusement-0.010
 films-0.010
 jungle-0.010
 moon-0.010
Token than
Token ID621
Feature activation-3.53e-02

Pos logit contributions
 orchestra+0.000
 lions+0.000
 moon+0.000
 films+0.000
 elephants+0.000
Neg logit contributions
t-0.000
't-0.000
w-0.000
 don-0.000
 cannot-0.000
Token his
Token ID465
Feature activation+7.79e-02

Pos logit contributions
t+0.054
 don+0.042
 cannot+0.042
w+0.041
't+0.040
Neg logit contributions
 orchestra-0.073
 elephants-0.072
 lions-0.072
 monkeys-0.071
 elephant-0.071
Token picture
Token ID4286
Feature activation-1.12e-02

Pos logit contributions
 orchestra+0.102
 films+0.092
 amusement+0.091
 woodland+0.091
 monkeys+0.091
Neg logit contributions
t-0.178
 don-0.161
 cannot-0.159
't-0.156
 ever-0.151
Token to
Token ID284
Feature activation-1.54e-02

Pos logit contributions
t+0.074
 don+0.059
't+0.057
 cannot+0.057
 soon+0.056
Neg logit contributions
 orchestra-0.121
 amusement-0.115
 woodland-0.113
 ig-0.113
 elephant-0.113
Token him
Token ID683
Feature activation-6.13e-03

Pos logit contributions
t+0.024
 don+0.022
't+0.021
 cannot+0.021
 not+0.020
Neg logit contributions
 orchestra-0.030
 sym-0.029
 Labrador-0.028
 amusement-0.028
 elephants-0.028
Token.
Token ID13
Feature activation+9.30e-03

Pos logit contributions
t+0.008
 don+0.007
 cannot+0.006
't+0.006
 soon+0.006
Neg logit contributions
 orchestra-0.014
 amusement-0.014
 elephants-0.013
 films-0.013
 monkeys-0.013
Token He
Token ID679
Feature activation-6.33e-02

Pos logit contributions
 orchestra+0.021
 films+0.020
 amusement+0.020
 melody+0.020
 elephants+0.020
Neg logit contributions
t-0.013
't-0.010
 don-0.010
 cannot-0.010
 they-0.009
Token was
Token ID373
Feature activation-2.61e-02

Pos logit contributions
t+0.080
't+0.060
 don+0.060
 cannot+0.057
w+0.054
Neg logit contributions
 orchestra-0.150
 elephant-0.143
 elephants-0.143
 amusement-0.142
 jungle-0.142
Token never
Token ID1239
Feature activation-1.14e-01

Pos logit contributions
t+0.037
 don+0.030
't+0.029
 cannot+0.029
 not+0.027
Neg logit contributions
 orchestra-0.058
 amusement-0.055
 films-0.055
 elephants-0.055
 woodland-0.054
Token lazy
Token ID16931
Feature activation+1.82e-02

Pos logit contributions
t+0.177
 don+0.152
 cannot+0.146
't+0.141
 fell+0.139
Neg logit contributions
 orchestra-0.236
 films-0.226
 amusement-0.224
 elephants-0.222
 lions-0.221
Token again
Token ID757
Feature activation-6.95e-02

Pos logit contributions
 orchestra+0.040
 amusement+0.038
 films+0.038
 elephants+0.038
 lions+0.038
Neg logit contributions
t-0.026
 don-0.021
't-0.020
 cannot-0.020
w-0.019
Token!
Token ID0
Feature activation+1.16e-02

Pos logit contributions
t+0.120
't+0.098
 don+0.098
 cannot+0.097
w+0.093
Neg logit contributions
 orchestra-0.132
 amusement-0.125
 elephants-0.124
 films-0.124
 lions-0.123
Token<|endoftext|>
Token ID50256
Feature activation+3.60e-02

Pos logit contributions
 orchestra+0.028
 films+0.026
 amusement+0.026
 elephants+0.026
 lions+0.026
Neg logit contributions
t-0.015
 don-0.011
't-0.011
 cannot-0.011
 fell-0.010
TokenJane
Token ID41083
Feature activation+1.15e-02

Pos logit contributions
 orchestra+0.087
 amusement+0.082
 films+0.081
 melody+0.081
 elephants+0.081
Neg logit contributions
t-0.047
't-0.036
 don-0.035
 cannot-0.034
 fell-0.031
Token the
Token ID262
Feature activation+1.01e-01

Pos logit contributions
t+0.069
 don+0.055
't+0.054
 cannot+0.053
w+0.049
Neg logit contributions
 orchestra-0.110
 amusement-0.104
 films-0.104
 elephants-0.104
 lions-0.104
Token chess
Token ID19780
Feature activation+6.42e-04

Pos logit contributions
 orchestra+0.158
 elephants+0.145
 films+0.145
 lions+0.144
 amusement+0.143
Neg logit contributions
t-0.214
 don-0.184
 cannot-0.183
't-0.183
 fell-0.170
Token pieces
Token ID5207
Feature activation-2.98e-02

Pos logit contributions
 orchestra+0.001
 amusement+0.001
 films+0.001
 lions+0.001
 elephants+0.001
Neg logit contributions
t-0.001
 don-0.001
't-0.001
 cannot-0.001
w-0.001
Token.
Token ID13
Feature activation+2.67e-03

Pos logit contributions
t+0.050
w+0.047
't+0.045
 don+0.044
 cannot+0.043
Neg logit contributions
 orchestra-0.048
 elephant-0.047
 lions-0.046
 elephants-0.045
 amusement-0.045
Token They
Token ID1119
Feature activation-5.68e-02

Pos logit contributions
 orchestra+0.006
 amusement+0.005
 elephants+0.005
 films+0.005
 melody+0.005
Neg logit contributions
t-0.004
't-0.003
 don-0.003
 cannot-0.003
 they-0.003
Token searched
Token ID16499
Feature activation-1.14e-01

Pos logit contributions
t+0.079
w+0.065
't+0.061
iffs+0.056
 don+0.051
Neg logit contributions
 elephant-0.121
 orchestra-0.118
 alien-0.116
 ig-0.115
 youngest-0.114
Token and
Token ID290
Feature activation-5.89e-02

Pos logit contributions
t+0.204
w+0.172
't+0.169
 don+0.146
 cannot+0.144
Neg logit contributions
 orchestra-0.206
 elephants-0.205
 elephant-0.204
 lions-0.203
 films-0.202
Token searched
Token ID16499
Feature activation-1.13e-01

Pos logit contributions
t+0.092
w+0.087
't+0.075
 cannot+0.068
 don+0.068
Neg logit contributions
 elephant-0.108
 ig-0.106
 woodland-0.105
 orchestra-0.105
 alien-0.104
Token,
Token ID11
Feature activation-5.28e-02

Pos logit contributions
t+0.204
w+0.177
't+0.169
 cannot+0.151
 don+0.150
Neg logit contributions
 elephants-0.198
 orchestra-0.198
 lions-0.196
 elephant-0.196
 monkeys-0.195
Token but
Token ID475
Feature activation-3.59e-02

Pos logit contributions
t+0.099
w+0.086
't+0.082
 cannot+0.080
 don+0.079
Neg logit contributions
 orchestra-0.090
 elephant-0.085
 lions-0.085
 elephants-0.085
 monkeys-0.084
Token the
Token ID262
Feature activation+9.26e-02

Pos logit contributions
t+0.037
w+0.034
use+0.028
't+0.025
iffs+0.025
Neg logit contributions
 woodland-0.082
 lions-0.080
 elephant-0.079
 monkeys-0.079
 animal-0.079
Token big
Token ID1263
Feature activation+1.16e-01

Pos logit contributions
 orchestra+0.186
 elephants+0.181
 lions+0.181
latable+0.180
 monkeys+0.179
Neg logit contributions
t-0.154
 don-0.133
't-0.129
 cannot-0.126
 soon-0.121
Token,
Token ID11
Feature activation+2.91e-03

Pos logit contributions
 orchestra+0.178
 lions+0.170
 amusement+0.170
 elephants+0.170
 films+0.169
Neg logit contributions
t-0.236
 don-0.210
't-0.205
 cannot-0.203
 fell-0.195
Token blue
Token ID4171
Feature activation+8.39e-02

Pos logit contributions
 pasture+0.004
 enclosure+0.004
latable+0.004
 orchestra+0.004
 ze+0.004
Neg logit contributions
t-0.006
 don-0.005
 cannot-0.005
 it-0.005
 they-0.005
Token button
Token ID4936
Feature activation-3.65e-02

Pos logit contributions
 orchestra+0.105
 amusement+0.104
 lions+0.103
 elephants+0.101
 monkeys+0.101
Neg logit contributions
t-0.203
't-0.178
 don-0.176
 cannot-0.172
 fell-0.168
Token was
Token ID373
Feature activation-3.02e-02

Pos logit contributions
t+0.045
 don+0.034
't+0.034
 cannot+0.033
w+0.030
Neg logit contributions
 orchestra-0.088
 amusement-0.083
 films-0.083
 elephants-0.083
 lions-0.083
Token never
Token ID1239
Feature activation-1.14e-01

Pos logit contributions
t+0.044
 don+0.035
't+0.034
 cannot+0.034
w+0.032
Neg logit contributions
 orchestra-0.066
 amusement-0.063
 elephants-0.062
 films-0.062
 lions-0.062
Token used
Token ID973
Feature activation-5.71e-02

Pos logit contributions
t+0.171
 don+0.140
't+0.136
 cannot+0.135
 fell+0.127
Neg logit contributions
 orchestra-0.243
 films-0.232
 amusement-0.231
 elephants-0.229
 lions-0.229
Token again
Token ID757
Feature activation-6.08e-02

Pos logit contributions
t+0.066
 don+0.049
't+0.048
 cannot+0.048
w+0.044
Neg logit contributions
 orchestra-0.141
 amusement-0.134
 elephants-0.134
 elephant-0.133
 lions-0.133
Token and
Token ID290
Feature activation-3.72e-02

Pos logit contributions
t+0.101
 don+0.082
't+0.082
 cannot+0.080
w+0.076
Neg logit contributions
 orchestra-0.121
 amusement-0.114
 elephants-0.113
 films-0.113
 lions-0.113
Token it
Token ID340
Feature activation-6.46e-02

Pos logit contributions
t+0.061
 cannot+0.049
w+0.048
't+0.048
 don+0.048
Neg logit contributions
 orchestra-0.075
 amusement-0.070
 melody-0.070
 films-0.070
 woodland-0.069
Token became
Token ID2627
Feature activation-3.13e-02

Pos logit contributions
t+0.078
 don+0.061
't+0.060
 cannot+0.058
w+0.056
Neg logit contributions
 orchestra-0.151
 elephant-0.144
 woodland-0.143
 melody-0.143
 elephants-0.143
Token\n
Token ID198
Feature activation-5.84e-03

Pos logit contributions
 orchestra+0.016
 woodland+0.015
 jungle+0.015
 elephant+0.015
 monkeys+0.015
Neg logit contributions
t-0.017
 don-0.014
 cannot-0.014
't-0.014
w-0.014
Token\n
Token ID198
Feature activation-9.99e-05

Pos logit contributions
t+0.012
 don+0.011
 cannot+0.010
w+0.010
't+0.010
Neg logit contributions
 woodland-0.008
 jungle-0.008
 monkeys-0.008
 orchestra-0.008
 sym-0.008
TokenAnd
Token ID1870
Feature activation-7.81e-03

Pos logit contributions
t+0.000
 don+0.000
 cannot+0.000
 fell+0.000
 wouldn+0.000
Neg logit contributions
 woodland-0.000
 jungle-0.000
 monkeys-0.000
 sym-0.000
 melody-0.000
Token Freddy
Token ID45437
Feature activation+1.67e-02

Pos logit contributions
t+0.013
w+0.011
 cannot+0.011
't+0.010
 don+0.010
Neg logit contributions
 amusement-0.014
 elephant-0.014
 orchestra-0.014
 woodland-0.014
 monkeys-0.014
Token was
Token ID373
Feature activation-2.83e-02

Pos logit contributions
 orchestra+0.040
 amusement+0.038
 elephants+0.038
 films+0.038
 lions+0.038
Neg logit contributions
t-0.021
 don-0.016
't-0.016
 cannot-0.016
w-0.014
Token never
Token ID1239
Feature activation-1.14e-01

Pos logit contributions
t+0.042
 don+0.033
 cannot+0.032
't+0.032
w+0.031
Neg logit contributions
 orchestra-0.061
 amusement-0.058
 elephant-0.058
 woodland-0.058
 lions-0.058
Token helpless
Token ID21144
Feature activation-4.78e-03

Pos logit contributions
t+0.177
 don+0.150
 cannot+0.143
't+0.142
 fell+0.137
Neg logit contributions
 orchestra-0.238
 films-0.227
 amusement-0.225
 elephants-0.223
 lions-0.222
Token again
Token ID757
Feature activation-6.87e-02

Pos logit contributions
t+0.007
 don+0.006
't+0.006
 cannot+0.006
w+0.005
Neg logit contributions
 orchestra-0.010
 amusement-0.010
 films-0.010
 elephants-0.010
 lions-0.010
Token.
Token ID13
Feature activation+4.57e-03

Pos logit contributions
t+0.113
 don+0.093
't+0.092
 cannot+0.091
w+0.085
Neg logit contributions
 orchestra-0.137
 amusement-0.129
 films-0.129
 elephants-0.129
 lions-0.128
Token He
Token ID679
Feature activation-5.89e-02

Pos logit contributions
 orchestra+0.008
 woodland+0.008
 jungle+0.008
 elephant+0.008
 elephants+0.008
Neg logit contributions
t-0.008
 don-0.007
 cannot-0.007
't-0.007
w-0.007
Token flew
Token ID13112
Feature activation-5.10e-02

Pos logit contributions
t+0.079
't+0.060
 don+0.056
 cannot+0.055
w+0.053
Neg logit contributions
 orchestra-0.131
 elephant-0.130
 jungle-0.128
 woodland-0.128
 melody-0.126
Token She
Token ID1375
Feature activation-3.79e-02

Pos logit contributions
 orchestra+0.034
 amusement+0.033
 elephants+0.033
 films+0.033
 lions+0.033
Neg logit contributions
t-0.023
 don-0.018
't-0.018
 cannot-0.018
w-0.017
Token wanted
Token ID2227
Feature activation-6.06e-02

Pos logit contributions
t+0.051
't+0.040
 don+0.037
 cannot+0.035
w+0.034
Neg logit contributions
 orchestra-0.085
 elephant-0.084
 elephants-0.080
 moon-0.080
 lions-0.080
Token to
Token ID284
Feature activation-1.06e-02

Pos logit contributions
t+0.046
't+0.029
 don+0.028
 cannot+0.027
 they+0.021
Neg logit contributions
 orchestra-0.175
 amusement-0.168
 films-0.167
 elephants-0.166
 woodland-0.166
Token make
Token ID787
Feature activation-2.34e-02

Pos logit contributions
t+0.016
't+0.014
 don+0.014
 cannot+0.013
 not+0.012
Neg logit contributions
 orchestra-0.022
 lions-0.022
 elephants-0.022
 films-0.022
 ze-0.021
Token them
Token ID606
Feature activation-4.48e-02

Pos logit contributions
 it+0.021
 they+0.018
 not+0.018
't+0.015
 don+0.014
Neg logit contributions
 Labrador-0.064
latable-0.062
 orchestra-0.062
ki-0.061
 amusement-0.061
Token look
Token ID804
Feature activation-1.14e-01

Pos logit contributions
t+0.067
 don+0.056
't+0.055
 cannot+0.054
 they+0.049
Neg logit contributions
 orchestra-0.095
 amusement-0.090
 elephants-0.090
 woodland-0.090
 lions-0.090
Token more
Token ID517
Feature activation-1.26e-02

Pos logit contributions
t+0.247
't+0.213
w+0.206
 don+0.204
 cannot+0.203
Neg logit contributions
 orchestra-0.169
 films-0.158
 amusement-0.154
 melody-0.153
 elephants-0.152
Token real
Token ID1103
Feature activation+2.79e-02

Pos logit contributions
t+0.027
 don+0.023
't+0.023
 cannot+0.023
w+0.022
Neg logit contributions
 orchestra-0.019
 films-0.018
 amusement-0.017
 elephants-0.017
 lions-0.017
Token.
Token ID13
Feature activation+1.44e-02

Pos logit contributions
 orchestra+0.061
 amusement+0.059
 woodland+0.058
 films+0.058
 elephants+0.057
Neg logit contributions
t-0.040
 don-0.033
 cannot-0.032
't-0.031
 ever-0.030
Token\n
Token ID198
Feature activation+2.15e-03

Pos logit contributions
 orchestra+0.030
 amusement+0.029
 woodland+0.029
 elephants+0.029
 monkeys+0.029
Neg logit contributions
t-0.022
 don-0.017
't-0.017
 cannot-0.017
w-0.016
Token\n
Token ID198
Feature activation+8.30e-03

Pos logit contributions
 orchestra+0.004
 amusement+0.004
 woodland+0.004
 lions+0.004
 jungle+0.004
Neg logit contributions
t-0.003
 don-0.003
 cannot-0.003
 wouldn-0.003
't-0.003
Token too
Token ID1165
Feature activation-5.06e-02

Pos logit contributions
t+0.033
't+0.028
w+0.028
 cannot+0.028
 don+0.027
Neg logit contributions
 orchestra-0.033
 elephant-0.032
 amusement-0.032
 lions-0.032
 elephants-0.032
Token.
Token ID13
Feature activation+7.96e-03

Pos logit contributions
t+0.101
w+0.092
't+0.090
 cannot+0.088
 don+0.085
Neg logit contributions
 elderly-0.061
 woodland-0.060
 jungle-0.060
 elephant-0.059
 orchestra-0.058
Token His
Token ID2399
Feature activation+3.78e-02

Pos logit contributions
 orchestra+0.017
 amusement+0.016
 melody+0.016
 films+0.016
 elephants+0.015
Neg logit contributions
t-0.013
't-0.011
 don-0.010
 they-0.010
 cannot-0.010
Token hold
Token ID1745
Feature activation-1.81e-02

Pos logit contributions
 orchestra+0.061
 elephants+0.056
 films+0.055
 monkeys+0.054
 lions+0.054
Neg logit contributions
t-0.076
 don-0.072
 cannot-0.069
 not-0.067
't-0.066
Token was
Token ID373
Feature activation-3.69e-02

Pos logit contributions
t+0.033
 don+0.029
't+0.028
 cannot+0.028
w+0.026
Neg logit contributions
 orchestra-0.031
 woodland-0.029
 elephant-0.029
 films-0.029
 amusement-0.029
Token never
Token ID1239
Feature activation-1.13e-01

Pos logit contributions
t+0.059
 don+0.047
't+0.047
 cannot+0.046
w+0.044
Neg logit contributions
 orchestra-0.076
 amusement-0.072
 films-0.072
 elephants-0.071
 lions-0.071
Token too
Token ID1165
Feature activation-4.30e-02

Pos logit contributions
t+0.180
 don+0.152
 cannot+0.146
't+0.145
 fell+0.138
Neg logit contributions
 orchestra-0.233
 films-0.221
 amusement-0.220
 elephants-0.219
 lions-0.218
Token tight
Token ID5381
Feature activation-2.79e-02

Pos logit contributions
t+0.080
 don+0.067
't+0.066
 cannot+0.066
 fell+0.062
Neg logit contributions
 orchestra-0.078
 films-0.073
 amusement-0.073
 elephants-0.073
 lions-0.073
Token,
Token ID11
Feature activation-3.31e-02

Pos logit contributions
t+0.039
w+0.035
 cannot+0.034
 don+0.032
't+0.032
Neg logit contributions
 orchestra-0.055
 elephant-0.054
 amusement-0.053
 lions-0.053
 tunes-0.053
Token he
Token ID339
Feature activation-6.38e-02

Pos logit contributions
t+0.052
't+0.042
 don+0.042
 cannot+0.041
w+0.040
Neg logit contributions
 orchestra-0.068
 amusement-0.064
 elephants-0.064
 elephant-0.064
 lions-0.064
Token understood
Token ID7247
Feature activation-5.72e-02

Pos logit contributions
t+0.067
't+0.050
 don+0.047
w+0.046
 cannot+0.046
Neg logit contributions
 orchestra-0.156
 elephant-0.153
 jungle-0.150
 melody-0.150
 tunes-0.149
Token story
Token ID1621
Feature activation-3.16e-02

Pos logit contributions
 orchestra+0.156
 elephants+0.144
 sym+0.143
 films+0.141
 woodland+0.141
Neg logit contributions
t-0.242
't-0.213
 don-0.210
 cannot-0.202
 not-0.195
Token is
Token ID318
Feature activation-2.39e-02

Pos logit contributions
ged+0.050
t+0.050
 don+0.043
't+0.043
w+0.042
Neg logit contributions
 orchestra-0.049
 animal-0.048
 jungle-0.047
 cartoon-0.047
 elephant-0.047
Token that
Token ID326
Feature activation-2.66e-02

Pos logit contributions
t+0.036
 don+0.030
 cannot+0.029
 fell+0.029
ged+0.029
Neg logit contributions
 orchestra-0.046
 woodland-0.045
 elephant-0.044
 lions-0.044
 amusement-0.044
Token it
Token ID340
Feature activation-7.11e-02

Pos logit contributions
t+0.031
't+0.021
 cannot+0.021
w+0.020
 don+0.019
Neg logit contributions
 orchestra-0.065
 melody-0.063
 films-0.063
 amusement-0.063
 tunes-0.063
Token is
Token ID318
Feature activation-4.57e-02

Pos logit contributions
t+0.138
ged+0.117
w+0.113
't+0.111
 don+0.108
Neg logit contributions
 woodland-0.101
 elephant-0.101
 orchestra-0.100
 films-0.097
 animal-0.097
Token never
Token ID1239
Feature activation-1.13e-01

Pos logit contributions
t+0.096
ged+0.080
't+0.080
w+0.079
 fell+0.078
Neg logit contributions
 elephant-0.060
 woodland-0.060
 orchestra-0.059
 animal-0.059
 melody-0.059
Token okay
Token ID8788
Feature activation-1.08e-02

Pos logit contributions
t+0.180
 don+0.152
 cannot+0.146
't+0.145
 fell+0.136
Neg logit contributions
 orchestra-0.235
 films-0.221
 amusement-0.220
 elephants-0.219
 lions-0.217
Token to
Token ID284
Feature activation-5.77e-03

Pos logit contributions
t+0.012
w+0.010
't+0.010
 cannot+0.009
ged+0.009
Neg logit contributions
 niece-0.023
 lions-0.023
 elephant-0.023
 animal-0.022
 orchestra-0.022
Token lie
Token ID6486
Feature activation-6.52e-02

Pos logit contributions
t+0.009
 don+0.007
't+0.007
 cannot+0.007
w+0.007
Neg logit contributions
 orchestra-0.012
 amusement-0.011
 elephants-0.011
 films-0.011
 elephant-0.011
Token.
Token ID13
Feature activation+8.53e-03

Pos logit contributions
t+0.095
't+0.078
 cannot+0.072
 don+0.072
w+0.069
Neg logit contributions
 orchestra-0.131
 woodland-0.125
 lions-0.125
 elephants-0.125
 amusement-0.124
Token Lies
Token ID35337
Feature activation-3.48e-02

Pos logit contributions
 orchestra+0.016
 films+0.016
 amusement+0.016
 woodland+0.015
 elephants+0.015
Neg logit contributions
t-0.014
 don-0.012
 cannot-0.011
't-0.011
w-0.011
Token pieces
Token ID5207
Feature activation-2.98e-02

Pos logit contributions
 orchestra+0.001
 amusement+0.001
 films+0.001
 lions+0.001
 elephants+0.001
Neg logit contributions
t-0.001
 don-0.001
't-0.001
 cannot-0.001
w-0.001
Token.
Token ID13
Feature activation+2.67e-03

Pos logit contributions
t+0.050
w+0.047
't+0.045
 don+0.044
 cannot+0.043
Neg logit contributions
 orchestra-0.048
 elephant-0.047
 lions-0.046
 elephants-0.045
 amusement-0.045
Token They
Token ID1119
Feature activation-5.68e-02

Pos logit contributions
 orchestra+0.006
 amusement+0.005
 elephants+0.005
 films+0.005
 melody+0.005
Neg logit contributions
t-0.004
't-0.003
 don-0.003
 cannot-0.003
 they-0.003
Token searched
Token ID16499
Feature activation-1.14e-01

Pos logit contributions
t+0.079
w+0.065
't+0.061
iffs+0.056
 don+0.051
Neg logit contributions
 elephant-0.121
 orchestra-0.118
 alien-0.116
 ig-0.115
 youngest-0.114
Token and
Token ID290
Feature activation-5.89e-02

Pos logit contributions
t+0.204
w+0.172
't+0.169
 don+0.146
 cannot+0.144
Neg logit contributions
 orchestra-0.206
 elephants-0.205
 elephant-0.204
 lions-0.203
 films-0.202
Token searched
Token ID16499
Feature activation-1.13e-01

Pos logit contributions
t+0.092
w+0.087
't+0.075
 cannot+0.068
 don+0.068
Neg logit contributions
 elephant-0.108
 ig-0.106
 woodland-0.105
 orchestra-0.105
 alien-0.104
Token,
Token ID11
Feature activation-5.28e-02

Pos logit contributions
t+0.204
w+0.177
't+0.169
 cannot+0.151
 don+0.150
Neg logit contributions
 elephants-0.198
 orchestra-0.198
 lions-0.196
 elephant-0.196
 monkeys-0.195
Token but
Token ID475
Feature activation-3.59e-02

Pos logit contributions
t+0.099
w+0.086
't+0.082
 cannot+0.080
 don+0.079
Neg logit contributions
 orchestra-0.090
 elephant-0.085
 lions-0.085
 elephants-0.085
 monkeys-0.084
Token the
Token ID262
Feature activation+9.26e-02

Pos logit contributions
t+0.037
w+0.034
use+0.028
't+0.025
iffs+0.025
Neg logit contributions
 woodland-0.082
 lions-0.080
 elephant-0.079
 monkeys-0.079
 animal-0.079
Token chess
Token ID19780
Feature activation-1.59e-03

Pos logit contributions
 orchestra+0.152
 films+0.139
 sym+0.137
ika+0.137
 elephants+0.137
Neg logit contributions
t-0.192
 cannot-0.168
't-0.166
 don-0.166
 not-0.159
Token pieces
Token ID5207
Feature activation-2.69e-02

Pos logit contributions
t+0.002
 don+0.002
't+0.002
 cannot+0.002
w+0.002
Neg logit contributions
 orchestra-0.003
 amusement-0.003
 jungle-0.003
 films-0.003
 lions-0.003
Token them
Token ID606
Feature activation-4.22e-02

Pos logit contributions
t+0.062
 don+0.050
't+0.048
 cannot+0.047
 they+0.043
Neg logit contributions
 orchestra-0.115
 amusement-0.110
 woodland-0.108
 elephants-0.108
 films-0.108
Token.
Token ID13
Feature activation+8.50e-03

Pos logit contributions
t+0.074
't+0.062
w+0.061
 cannot+0.060
 don+0.059
Neg logit contributions
 orchestra-0.073
 films-0.072
 tunes-0.072
 movies-0.071
 cartoon-0.070
Token She
Token ID1375
Feature activation-5.07e-02

Pos logit contributions
 orchestra+0.017
 films+0.016
 elephants+0.016
 amusement+0.016
 lions+0.016
Neg logit contributions
t-0.014
 don-0.012
't-0.011
 cannot-0.011
w-0.011
Token made
Token ID925
Feature activation-4.76e-02

Pos logit contributions
t+0.066
't+0.050
 don+0.050
 cannot+0.049
w+0.044
Neg logit contributions
 orchestra-0.119
 amusement-0.113
 elephants-0.112
 films-0.112
 lions-0.112
Token them
Token ID606
Feature activation-5.21e-02

Pos logit contributions
t+0.045
 it+0.042
 don+0.039
 not+0.038
 they+0.038
Neg logit contributions
 orchestra-0.123
 Labrador-0.119
 woodland-0.118
 sym-0.118
 amusement-0.118
Token look
Token ID804
Feature activation-1.13e-01

Pos logit contributions
t+0.088
 don+0.073
't+0.072
 cannot+0.071
w+0.066
Neg logit contributions
 orchestra-0.102
 amusement-0.096
 films-0.096
 elephants-0.095
 lions-0.095
Token nice
Token ID3621
Feature activation-2.95e-02

Pos logit contributions
t+0.216
't+0.182
 don+0.177
 cannot+0.176
w+0.173
Neg logit contributions
 orchestra-0.198
 films-0.186
 amusement-0.185
 elephants-0.183
 lions-0.183
Token again
Token ID757
Feature activation-8.40e-02

Pos logit contributions
t+0.049
't+0.041
 don+0.040
 cannot+0.039
w+0.038
Neg logit contributions
 orchestra-0.057
 lions-0.055
 elephants-0.055
 films-0.055
 elephant-0.055
Token.
Token ID13
Feature activation+8.18e-03

Pos logit contributions
t+0.134
 don+0.110
't+0.108
 cannot+0.107
w+0.099
Neg logit contributions
 orchestra-0.174
 amusement-0.164
 films-0.164
 elephants-0.163
 lions-0.162
Token L
Token ID406
Feature activation+7.58e-02

Pos logit contributions
 orchestra+0.017
 films+0.016
 elephants+0.016
 amusement+0.016
 lions+0.016
Neg logit contributions
t-0.013
 don-0.010
't-0.010
 cannot-0.010
w-0.009
Tokenila
Token ID10102
Feature activation+2.86e-02

Pos logit contributions
 orchestra+0.138
 amusement+0.129
 films+0.128
 lions+0.128
 elephants+0.128
Neg logit contributions
t-0.138
 don-0.116
't-0.115
 cannot-0.114
 fell-0.106
Token happy
Token ID3772
Feature activation-7.16e-03

Pos logit contributions
t+0.041
 don+0.033
't+0.033
 cannot+0.032
 fell+0.030
Neg logit contributions
 orchestra-0.060
 films-0.057
 amusement-0.056
 elephants-0.056
 lions-0.056
Token because
Token ID780
Feature activation-3.39e-02

Pos logit contributions
t+0.008
 don+0.006
't+0.006
 cannot+0.006
w+0.006
Neg logit contributions
 orchestra-0.018
 films-0.017
 amusement-0.017
 elephants-0.017
 lions-0.017
Token his
Token ID465
Feature activation+6.74e-02

Pos logit contributions
t+0.039
 don+0.030
 cannot+0.030
w+0.028
't+0.028
Neg logit contributions
 orchestra-0.082
 amusement-0.079
 elephant-0.079
 lions-0.079
 ig-0.079
Token guitar
Token ID10047
Feature activation-1.52e-02

Pos logit contributions
 orchestra+0.066
 elephants+0.060
 amusement+0.059
 monkeys+0.059
 lions+0.058
Neg logit contributions
t-0.178
 don-0.161
 cannot-0.159
't-0.159
 fell-0.152
Token sounded
Token ID14846
Feature activation-4.55e-02

Pos logit contributions
t+0.019
 don+0.014
't+0.014
 cannot+0.014
w+0.013
Neg logit contributions
 orchestra-0.036
 amusement-0.035
 elephants-0.034
 films-0.034
 elephant-0.034
Token even
Token ID772
Feature activation-1.13e-01

Pos logit contributions
t+0.076
't+0.065
w+0.065
 cannot+0.063
 don+0.062
Neg logit contributions
 orchestra-0.085
 amusement-0.081
 elephants-0.080
 melody-0.080
 sym-0.080
Token better
Token ID1365
Feature activation-5.24e-02

Pos logit contributions
t+0.212
't+0.181
 cannot+0.173
 don+0.170
w+0.162
Neg logit contributions
 orchestra-0.201
 melody-0.190
 amusement-0.188
 elephants-0.185
 films-0.184
Token when
Token ID618
Feature activation-3.90e-02

Pos logit contributions
t+0.092
't+0.086
 cannot+0.078
w+0.077
 don+0.076
Neg logit contributions
 tunes-0.081
 animal-0.079
 jungle-0.078
 orchestra-0.076
 niece-0.076
Token he
Token ID339
Feature activation-4.97e-02

Pos logit contributions
t+0.045
 cannot+0.032
 don+0.032
't+0.031
w+0.029
Neg logit contributions
 orchestra-0.096
 lions-0.093
 films-0.093
 elephants-0.093
 monkeys-0.093
Token played
Token ID2826
Feature activation-2.21e-02

Pos logit contributions
t+0.061
't+0.047
 don+0.045
 cannot+0.043
w+0.040
Neg logit contributions
 orchestra-0.116
 jungle-0.111
 amusement-0.110
 ze-0.110
 elephants-0.110
Token it
Token ID340
Feature activation-6.63e-02

Pos logit contributions
t+0.027
 don+0.022
't+0.022
 cannot+0.021
 soon+0.019
Neg logit contributions
 orchestra-0.053
 elephants-0.051
 amusement-0.050
 lions-0.050
 monkeys-0.050
Token.
Token ID13
Feature activation+9.38e-03

Pos logit contributions
t+0.077
 don+0.061
't+0.060
 cannot+0.059
w+0.055
Neg logit contributions
 orchestra-0.126
 amusement-0.120
 films-0.120
 elephants-0.120
 lions-0.120
Token And
Token ID843
Feature activation-1.81e-02

Pos logit contributions
 orchestra+0.020
 amusement+0.019
 elephants+0.019
 elephant+0.019
 monkeys+0.018
Neg logit contributions
t-0.015
't-0.013
 cannot-0.012
 don-0.012
 they-0.011
Token the
Token ID262
Feature activation+8.65e-02

Pos logit contributions
t+0.023
't+0.018
 cannot+0.017
 don+0.017
w+0.017
Neg logit contributions
 orchestra-0.042
 films-0.040
 amusement-0.040
 lions-0.040
 movies-0.040
Token kitchen
Token ID9592
Feature activation-1.92e-02

Pos logit contributions
 orchestra+0.145
 films+0.135
 elephants+0.134
 lions+0.134
 pasture+0.133
Neg logit contributions
t-0.177
 don-0.155
 cannot-0.154
't-0.149
 not-0.145
Token was
Token ID373
Feature activation-3.05e-02

Pos logit contributions
t+0.023
 don+0.018
't+0.017
 cannot+0.017
w+0.015
Neg logit contributions
 orchestra-0.046
 amusement-0.044
 films-0.044
 elephants-0.044
 lions-0.044
Token never
Token ID1239
Feature activation-1.13e-01

Pos logit contributions
t+0.045
 don+0.036
't+0.036
 cannot+0.035
w+0.033
Neg logit contributions
 orchestra-0.067
 amusement-0.063
 films-0.063
 elephants-0.063
 lions-0.062
Token gloomy
Token ID46400
Feature activation+2.25e-02

Pos logit contributions
t+0.175
 don+0.147
 cannot+0.141
't+0.140
 fell+0.135
Neg logit contributions
 orchestra-0.235
 films-0.224
 amusement-0.223
 elephants-0.221
 lions-0.220
Token anymore
Token ID7471
Feature activation-4.72e-02

Pos logit contributions
 orchestra+0.046
 amusement+0.044
 elephants+0.043
 lions+0.043
 films+0.043
Neg logit contributions
t-0.035
 don-0.029
 cannot-0.029
't-0.029
w-0.028
Token.
Token ID13
Feature activation+1.24e-02

Pos logit contributions
t+0.080
't+0.065
 don+0.065
 cannot+0.065
w+0.064
Neg logit contributions
 orchestra-0.090
 amusement-0.085
 lions-0.084
 elephants-0.084
 elephant-0.084
Token<|endoftext|>
Token ID50256
Feature activation+3.85e-02

Pos logit contributions
 orchestra+0.031
 elephants+0.030
 elephant+0.030
 amusement+0.030
 monkeys+0.030
Neg logit contributions
t-0.015
't-0.011
 cannot-0.011
 don-0.011
 they-0.009
TokenOnce
Token ID7454
Feature activation-1.57e-04

Pos logit contributions
 orchestra+0.096
 amusement+0.090
 elephants+0.090
 melody+0.090
 films+0.090
Neg logit contributions
t-0.047
't-0.036
 don-0.035
 cannot-0.034
 fell-0.031
Token,
Token ID11
Feature activation-2.29e-02

Pos logit contributions
t+0.115
't+0.087
w+0.087
 cannot+0.086
 don+0.081
Neg logit contributions
 orchestra-0.185
 amusement-0.180
 lions-0.177
 movies-0.176
 elephants-0.176
Token the
Token ID262
Feature activation+8.91e-02

Pos logit contributions
t+0.034
't+0.026
 cannot+0.026
 don+0.025
w+0.025
Neg logit contributions
 orchestra-0.049
 amusement-0.048
 films-0.047
 lions-0.047
 woodland-0.047
Token dark
Token ID3223
Feature activation+6.96e-02

Pos logit contributions
 orchestra+0.143
 elephants+0.134
 monkeys+0.133
 films+0.132
 ze+0.132
Neg logit contributions
t-0.179
 don-0.162
 cannot-0.157
't-0.154
 fell-0.150
Token cane
Token ID33009
Feature activation-1.69e-02

Pos logit contributions
 orchestra+0.099
 films+0.093
 elephants+0.092
 lions+0.091
 monkeys+0.091
Neg logit contributions
t-0.154
 don-0.135
't-0.133
 cannot-0.133
 fell-0.129
Token was
Token ID373
Feature activation-3.15e-02

Pos logit contributions
t+0.021
 don+0.016
't+0.016
 cannot+0.015
w+0.015
Neg logit contributions
 orchestra-0.040
 elephant-0.038
 amusement-0.038
 jungle-0.037
 woodland-0.037
Token never
Token ID1239
Feature activation-1.13e-01

Pos logit contributions
t+0.051
 don+0.042
 cannot+0.040
't+0.040
w+0.039
Neg logit contributions
 orchestra-0.064
 amusement-0.060
 elephant-0.060
 jungle-0.060
 ig-0.060
Token lonely
Token ID21757
Feature activation-5.76e-03

Pos logit contributions
t+0.175
 don+0.144
't+0.141
 cannot+0.140
 fell+0.130
Neg logit contributions
 orchestra-0.235
 films-0.223
 amusement-0.222
 elephants-0.221
 lions-0.221
Token.
Token ID13
Feature activation+1.29e-02

Pos logit contributions
t+0.008
 don+0.007
't+0.006
 cannot+0.006
 fell+0.006
Neg logit contributions
 orchestra-0.013
 films-0.012
 amusement-0.012
 elephants-0.012
 lions-0.012
Token It
Token ID632
Feature activation-6.24e-02

Pos logit contributions
 orchestra+0.030
 elephants+0.028
 melody+0.028
 lions+0.028
 films+0.028
Neg logit contributions
t-0.019
't-0.015
 don-0.015
 cannot-0.014
 they-0.013
Token always
Token ID1464
Feature activation-7.06e-02

Pos logit contributions
t+0.078
't+0.059
 don+0.058
w+0.056
 cannot+0.056
Neg logit contributions
 orchestra-0.146
 woodland-0.140
 elephant-0.139
 moon-0.137
 jungle-0.137
Token had
Token ID550
Feature activation-6.80e-02

Pos logit contributions
t+0.107
't+0.089
 cannot+0.076
w+0.073
 don+0.071
Neg logit contributions
 orchestra-0.151
 woodland-0.145
 moon-0.143
 amusement-0.142
 sym-0.142
Token But
Token ID887
Feature activation-8.94e-03

Pos logit contributions
 orchestra+0.026
 melody+0.024
 tunes+0.024
 films+0.024
 elephants+0.024
Neg logit contributions
t-0.015
 don-0.013
 they-0.012
't-0.012
 cannot-0.012
Token Tim
Token ID5045
Feature activation+2.70e-02

Pos logit contributions
t+0.012
w+0.009
 cannot+0.009
use+0.009
't+0.008
Neg logit contributions
 lions-0.019
 elephant-0.019
 monkeys-0.019
 ig-0.018
 elephants-0.018
Token's
Token ID338
Feature activation+6.87e-02

Pos logit contributions
 orchestra+0.055
 jungle+0.054
 lions+0.054
 amusement+0.053
 elephant+0.053
Neg logit contributions
t-0.041
't-0.031
 don-0.030
w-0.029
 ever-0.029
Token drum
Token ID13026
Feature activation-1.94e-02

Pos logit contributions
 elephants+0.129
 orchestra+0.127
 sym+0.124
 woodland+0.124
 monkeys+0.124
Neg logit contributions
t-0.120
 don-0.105
 cannot-0.105
 not-0.104
't-0.099
Token is
Token ID318
Feature activation-3.71e-02

Pos logit contributions
t+0.025
't+0.019
 don+0.019
 cannot+0.018
w+0.018
Neg logit contributions
 orchestra-0.043
 lions-0.042
 elephant-0.042
 monkeys-0.042
 elephants-0.042
Token never
Token ID1239
Feature activation-1.13e-01

Pos logit contributions
t+0.053
 don+0.043
't+0.042
w+0.040
 cannot+0.040
Neg logit contributions
 orchestra-0.078
 elephant-0.075
 lions-0.075
 films-0.075
 amusement-0.075
Token the
Token ID262
Feature activation+8.84e-02

Pos logit contributions
t+0.171
 don+0.142
't+0.137
 cannot+0.137
 fell+0.129
Neg logit contributions
 orchestra-0.238
 amusement-0.226
 films-0.225
 elephants-0.224
 woodland-0.223
Token same
Token ID976
Feature activation+1.94e-02

Pos logit contributions
 orchestra+0.187
 films+0.182
 woodland+0.180
child+0.180
latable+0.179
Neg logit contributions
t-0.126
 don-0.111
't-0.105
 not-0.104
 will-0.103
Token.
Token ID13
Feature activation+1.07e-02

Pos logit contributions
 orchestra+0.046
ika+0.044
 elephants+0.044
 films+0.044
 sym+0.044
Neg logit contributions
t-0.025
 ever-0.020
 don-0.020
 soon-0.020
't-0.018
Token He
Token ID679
Feature activation-4.67e-02

Pos logit contributions
 orchestra+0.024
 melody+0.023
 elephants+0.023
 amusement+0.022
 films+0.022
Neg logit contributions
t-0.016
't-0.013
 don-0.013
 cannot-0.013
 not-0.012
Token is
Token ID318
Feature activation-2.08e-02

Pos logit contributions
t+0.061
't+0.047
 don+0.046
 cannot+0.044
w+0.042
Neg logit contributions
 orchestra-0.108
 amusement-0.103
 lions-0.103
 films-0.103
 elephants-0.103
Token was
Token ID373
Feature activation+6.89e-03

Pos logit contributions
t+0.077
 don+0.057
't+0.057
w+0.057
 cannot+0.056
Neg logit contributions
 orchestra-0.132
 woodland-0.130
 lions-0.129
 elephant-0.129
 monkeys-0.129
Token a
Token ID257
Feature activation+5.53e-02

Pos logit contributions
 orchestra+0.018
 melody+0.016
 sym+0.016
ki+0.016
 films+0.016
Neg logit contributions
t-0.007
 not-0.007
 never-0.006
 they-0.006
't-0.006
Token big
Token ID1263
Feature activation+9.02e-02

Pos logit contributions
 orchestra+0.093
 films+0.091
child+0.089
 tunes+0.085
 amusement+0.084
Neg logit contributions
t-0.100
 don-0.092
't-0.088
 cannot-0.087
 fell-0.086
Token tree
Token ID5509
Feature activation+6.73e-03

Pos logit contributions
 orchestra+0.151
 films+0.146
 youngest+0.142
 tunes+0.141
child+0.140
Neg logit contributions
t-0.156
 don-0.150
 cannot-0.143
 it-0.140
 fell-0.140
Token in
Token ID287
Feature activation+3.92e-02

Pos logit contributions
 orchestra+0.016
 films+0.015
 amusement+0.015
 elephants+0.015
 lions+0.015
Neg logit contributions
t-0.009
 don-0.007
't-0.007
 cannot-0.007
 fell-0.006
Token the
Token ID262
Feature activation+1.19e-01

Pos logit contributions
 orchestra+0.084
 films+0.079
 sym+0.077
 lions+0.077
 melody+0.077
Neg logit contributions
t-0.059
't-0.050
 don-0.050
 cannot-0.049
 soon-0.046
Token middle
Token ID3504
Feature activation+9.78e-03

Pos logit contributions
 films+0.227
 orchestra+0.208
 cartoons+0.204
 programmes+0.194
child+0.194
Neg logit contributions
 never-0.211
t-0.208
 will-0.198
 fell-0.195
 got-0.192
Token of
Token ID286
Feature activation+7.19e-02

Pos logit contributions
 orchestra+0.021
 films+0.020
 elephants+0.019
 melody+0.019
 amusement+0.019
Neg logit contributions
t-0.016
't-0.013
 cannot-0.012
 don-0.012
 ever-0.012
Token the
Token ID262
Feature activation+1.15e-01

Pos logit contributions
 orchestra+0.161
 sym+0.151
 films+0.149
 melody+0.149
 tunes+0.147
Neg logit contributions
t-0.096
 don-0.084
't-0.084
 cannot-0.080
 it-0.080
Token forest
Token ID8222
Feature activation-1.28e-02

Pos logit contributions
 films+0.175
 orchestra+0.169
 niece+0.160
child+0.159
 cartoons+0.157
Neg logit contributions
t-0.239
 never-0.227
 soon-0.218
 will-0.216
't-0.214
Token.
Token ID13
Feature activation+1.20e-02

Pos logit contributions
t+0.019
 don+0.015
't+0.015
 cannot+0.015
 fell+0.013
Neg logit contributions
 orchestra-0.028
 films-0.027
 amusement-0.026
 elephants-0.026
 lions-0.026
Token went
Token ID1816
Feature activation-5.64e-02

Pos logit contributions
 orchestra+0.068
 amusement+0.065
 elephants+0.065
 films+0.064
 lions+0.064
Neg logit contributions
t-0.032
 don-0.024
't-0.023
 cannot-0.023
w-0.021
Token back
Token ID736
Feature activation-2.82e-03

Pos logit contributions
t+0.053
't+0.038
 cannot+0.036
 don+0.036
 they+0.034
Neg logit contributions
 orchestra-0.152
 amusement-0.145
 melody-0.145
 films-0.144
 elephants-0.144
Token home
Token ID1363
Feature activation-3.88e-02

Pos logit contributions
t+0.003
 don+0.002
't+0.002
 cannot+0.002
w+0.002
Neg logit contributions
 orchestra-0.007
 amusement-0.007
 films-0.007
 elephants-0.007
 lions-0.007
Token with
Token ID351
Feature activation+1.98e-02

Pos logit contributions
t+0.061
't+0.050
 don+0.050
 cannot+0.049
w+0.048
Neg logit contributions
 orchestra-0.079
 amusement-0.074
 films-0.074
 woodland-0.073
 lions-0.073
Token the
Token ID262
Feature activation+1.23e-01

Pos logit contributions
 orchestra+0.050
 sym+0.047
 melody+0.047
ote+0.046
 Labrador+0.046
Neg logit contributions
 they-0.021
t-0.019
 it-0.019
 soon-0.018
 don-0.018
Token icy
Token ID30284
Feature activation+1.19e-01

Pos logit contributions
 orchestra+0.198
 films+0.185
 sym+0.182
 tunes+0.180
ki+0.179
Neg logit contributions
t-0.238
 don-0.224
 cannot-0.213
 will-0.204
't-0.204
Token pigeon
Token ID46387
Feature activation+4.91e-02

Pos logit contributions
 orchestra+0.182
 films+0.167
 amusement+0.167
 elephants+0.163
 lions+0.162
Neg logit contributions
t-0.249
 don-0.218
 fell-0.211
 cannot-0.208
 ever-0.207
Token.
Token ID13
Feature activation+2.95e-02

Pos logit contributions
 orchestra+0.113
 amusement+0.108
 films+0.106
 elephants+0.104
 melody+0.104
Neg logit contributions
t-0.068
 don-0.054
 they-0.053
 fell-0.052
 cannot-0.052
Token<|endoftext|>
Token ID50256
Feature activation+4.77e-02

Pos logit contributions
 orchestra+0.069
 melody+0.067
 tunes+0.066
 amusement+0.065
 ig+0.065
Neg logit contributions
t-0.039
't-0.034
 they-0.032
 don-0.031
 cannot-0.030
TokenBob
Token ID18861
Feature activation+4.48e-02

Pos logit contributions
 orchestra+0.109
 amusement+0.103
 films+0.102
 elephants+0.102
 elephant+0.101
Neg logit contributions
t-0.066
't-0.052
 don-0.052
 cannot-0.051
 fell-0.047
Token was
Token ID373
Feature activation+2.37e-02

Pos logit contributions
 orchestra+0.094
 elephant+0.092
 jungle+0.092
 woodland+0.091
 ig+0.090
Neg logit contributions
t-0.066
't-0.052
 don-0.050
 cannot-0.049
w-0.045
Token Gr
Token ID1902
Feature activation+1.24e-01

Pos logit contributions
t+0.011
 don+0.009
't+0.009
 cannot+0.009
w+0.008
Neg logit contributions
 orchestra-0.017
 amusement-0.016
 films-0.016
 elephants-0.016
 lions-0.016
Tokenanny
Token ID7737
Feature activation+2.60e-02

Pos logit contributions
 orchestra+0.169
 amusement+0.162
 elephants+0.160
 monkeys+0.158
 lions+0.156
Neg logit contributions
t-0.284
 don-0.246
't-0.244
 cannot-0.241
w-0.234
Token caught
Token ID4978
Feature activation+1.06e-02

Pos logit contributions
 orchestra+0.059
 amusement+0.056
 films+0.056
 elephants+0.056
 lions+0.056
Neg logit contributions
t-0.035
 don-0.027
't-0.027
 cannot-0.027
w-0.024
Token him
Token ID683
Feature activation+4.74e-03

Pos logit contributions
 orchestra+0.023
 amusement+0.022
 jungle+0.022
 sym+0.021
 woodland+0.021
Neg logit contributions
t-0.015
 don-0.014
 it-0.013
't-0.013
 they-0.013
Token at
Token ID379
Feature activation+3.39e-02

Pos logit contributions
 orchestra+0.011
 elephants+0.011
 amusement+0.011
 films+0.011
 woodland+0.011
Neg logit contributions
t-0.006
 don-0.005
't-0.005
 cannot-0.005
 fell-0.004
Token the
Token ID262
Feature activation+1.36e-01

Pos logit contributions
 orchestra+0.070
 amusement+0.066
 films+0.066
 elephants+0.065
 lions+0.065
Neg logit contributions
t-0.054
 don-0.044
't-0.044
 cannot-0.043
w-0.040
Token bottom
Token ID4220
Feature activation+1.14e-02

Pos logit contributions
 orchestra+0.167
ika+0.159
latable+0.158
 films+0.157
 woodland+0.156
Neg logit contributions
t-0.288
 don-0.268
 will-0.266
't-0.265
 never-0.261
Token.
Token ID13
Feature activation+2.46e-02

Pos logit contributions
 orchestra+0.024
 amusement+0.023
 films+0.023
 woodland+0.022
 elephants+0.022
Neg logit contributions
t-0.018
't-0.014
 don-0.014
 cannot-0.014
 they-0.013
Token The
Token ID383
Feature activation+1.09e-01

Pos logit contributions
 orchestra+0.054
 amusement+0.052
 melody+0.051
 lions+0.051
 elephants+0.051
Neg logit contributions
t-0.036
't-0.030
 don-0.029
 cannot-0.028
 they-0.028
Token little
Token ID1310
Feature activation+9.77e-02

Pos logit contributions
latable+0.157
 orchestra+0.148
 sym+0.141
 amusement+0.140
 pasture+0.140
Neg logit contributions
t-0.218
 don-0.206
 cannot-0.199
 soon-0.195
't-0.194
Token boy
Token ID2933
Feature activation+3.09e-02

Pos logit contributions
latable+0.171
opl+0.161
ote+0.158
 pasture+0.155
 amusement+0.154
Neg logit contributions
t-0.174
 they-0.172
 soon-0.171
 don-0.170
 it-0.166
Token They
Token ID1119
Feature activation-5.09e-02

Pos logit contributions
 orchestra+0.025
 elephant+0.024
 elephants+0.024
 amusement+0.024
 woodland+0.023
Neg logit contributions
t-0.014
 don-0.012
't-0.011
 cannot-0.011
 they-0.011
Token take
Token ID1011
Feature activation-3.09e-02

Pos logit contributions
t+0.067
 don+0.051
't+0.051
 cannot+0.050
w+0.046
Neg logit contributions
 orchestra-0.119
 amusement-0.114
 films-0.113
 elephants-0.113
 lions-0.112
Token turns
Token ID4962
Feature activation-7.14e-02

Pos logit contributions
t+0.038
 don+0.029
't+0.028
 cannot+0.028
 fell+0.026
Neg logit contributions
 orchestra-0.076
 amusement-0.072
 elephants-0.072
 elephant-0.071
 woodland-0.071
Token going
Token ID1016
Feature activation-5.42e-02

Pos logit contributions
t+0.107
 don+0.087
 cannot+0.086
't+0.084
w+0.083
Neg logit contributions
 orchestra-0.150
 lions-0.144
 amusement-0.143
 monkeys-0.143
 elephants-0.142
Token on
Token ID319
Feature activation-1.90e-02

Pos logit contributions
t+0.073
w+0.068
 don+0.064
 cannot+0.063
use+0.057
Neg logit contributions
 monkeys-0.110
 elephants-0.109
 orchestra-0.109
 playground-0.109
 elephant-0.109
Token the
Token ID262
Feature activation+1.22e-01

Pos logit contributions
t+0.022
 it+0.020
 they+0.019
 don+0.019
't+0.019
Neg logit contributions
 orchestra-0.047
 woodland-0.044
 alien-0.043
 elephant-0.043
 sym-0.043
Token slide
Token ID10649
Feature activation-2.38e-02

Pos logit contributions
 orchestra+0.212
 films+0.206
 niece+0.200
 alien+0.197
 woodland+0.197
Neg logit contributions
t-0.241
't-0.207
 don-0.205
 cannot-0.200
 fell-0.194
Token.
Token ID13
Feature activation+1.14e-02

Pos logit contributions
t+0.037
 don+0.030
't+0.030
 cannot+0.029
w+0.027
Neg logit contributions
 orchestra-0.050
 amusement-0.047
 films-0.047
 elephants-0.047
 lions-0.047
Token They
Token ID1119
Feature activation-5.14e-02

Pos logit contributions
 orchestra+0.026
 elephant+0.024
 elephants+0.024
 amusement+0.024
 monkeys+0.024
Neg logit contributions
t-0.016
 don-0.013
't-0.013
 cannot-0.013
 they-0.012
Token laugh
Token ID6487
Feature activation-3.09e-02

Pos logit contributions
t+0.068
't+0.052
 don+0.052
 cannot+0.050
w+0.047
Neg logit contributions
 orchestra-0.119
 amusement-0.114
 elephants-0.113
 films-0.113
 elephant-0.112
Token and
Token ID290
Feature activation-5.87e-02

Pos logit contributions
t+0.041
 don+0.033
't+0.032
 cannot+0.032
 they+0.029
Neg logit contributions
 orchestra-0.071
 amusement-0.067
 woodland-0.067
 elephants-0.067
 elephant-0.066
Token pocket
Token ID10000
Feature activation-2.19e-02

Pos logit contributions
 orchestra+0.168
 elephants+0.156
 monkeys+0.153
 lions+0.153
 sym+0.153
Neg logit contributions
t-0.173
 don-0.161
 cannot-0.161
 fell-0.152
 will-0.147
Token and
Token ID290
Feature activation-4.12e-02

Pos logit contributions
t+0.033
 don+0.027
't+0.027
 cannot+0.027
w+0.025
Neg logit contributions
 orchestra-0.046
 amusement-0.043
 elephants-0.043
 elephant-0.043
 films-0.043
Token spin
Token ID7906
Feature activation-3.92e-02

Pos logit contributions
t+0.068
't+0.054
 don+0.052
 cannot+0.052
w+0.051
Neg logit contributions
 orchestra-0.083
 amusement-0.079
 elephant-0.078
 elephants-0.078
 films-0.078
Token it
Token ID340
Feature activation-6.30e-02

Pos logit contributions
t+0.034
 don+0.023
't+0.022
 cannot+0.021
 fell+0.018
Neg logit contributions
 orchestra-0.109
 amusement-0.105
 films-0.105
 elephants-0.105
 lions-0.104
Token in
Token ID287
Feature activation+1.48e-02

Pos logit contributions
t+0.097
 don+0.082
w+0.081
't+0.080
 cannot+0.079
Neg logit contributions
 orchestra-0.123
 melody-0.117
 tunes-0.117
 elephant-0.116
 monkeys-0.115
Token the
Token ID262
Feature activation+1.23e-01

Pos logit contributions
 orchestra+0.033
 elephants+0.031
ote+0.031
 niece+0.031
 films+0.031
Neg logit contributions
t-0.018
 soon-0.017
 it-0.017
 don-0.016
't-0.016
Token sun
Token ID4252
Feature activation+1.50e-03

Pos logit contributions
 films+0.195
ami+0.192
 orchestra+0.192
ote+0.189
 lions+0.188
Neg logit contributions
 will-0.222
 can-0.215
 fell-0.205
 got-0.203
 on-0.200
Token.
Token ID13
Feature activation+2.34e-02

Pos logit contributions
 orchestra+0.004
 amusement+0.004
 woodland+0.003
 films+0.003
 elephants+0.003
Neg logit contributions
t-0.002
 they-0.001
 soon-0.001
't-0.001
 don-0.001
Token He
Token ID679
Feature activation-5.32e-02

Pos logit contributions
 orchestra+0.059
 amusement+0.056
 melody+0.056
 films+0.054
 tunes+0.054
Neg logit contributions
t-0.029
't-0.023
 don-0.022
 cannot-0.021
 they-0.020
Token would
Token ID561
Feature activation-6.01e-02

Pos logit contributions
t+0.069
't+0.053
 don+0.053
 cannot+0.051
w+0.047
Neg logit contributions
 orchestra-0.124
 elephants-0.118
 amusement-0.118
 films-0.117
 lions-0.117
Token watch
Token ID2342
Feature activation-4.49e-02

Pos logit contributions
t+0.101
 don+0.079
't+0.079
 cannot+0.079
w+0.078
Neg logit contributions
 orchestra-0.119
 amusement-0.112
 elephant-0.111
 elephants-0.111
 melody-0.110
Token,
Token ID11
Feature activation-4.70e-02

Pos logit contributions
t+0.042
 cannot+0.036
w+0.035
't+0.035
 don+0.033
Neg logit contributions
 orchestra-0.047
 lions-0.047
 elephant-0.047
 monkeys-0.046
 films-0.046
Token "
Token ID366
Feature activation-1.58e-02

Pos logit contributions
t+0.098
w+0.090
 cannot+0.087
 don+0.083
 fell+0.081
Neg logit contributions
 orchestra-0.064
 monkeys-0.064
 elephant-0.063
 lions-0.063
 elephants-0.063
TokenOK
Token ID11380
Feature activation-5.86e-02

Pos logit contributions
t+0.024
 cannot+0.020
't+0.020
 don+0.020
 fell+0.019
Neg logit contributions
 orchestra-0.033
 amusement-0.031
 elephants-0.031
 melody-0.031
 monkeys-0.031
Token,
Token ID11
Feature activation-2.56e-02

Pos logit contributions
t+0.086
 cannot+0.067
't+0.066
 don+0.065
w+0.064
Neg logit contributions
 orchestra-0.127
 amusement-0.121
 elephant-0.120
 films-0.119
 elephants-0.119
Token let
Token ID1309
Feature activation-3.87e-02

Pos logit contributions
t+0.039
't+0.030
 don+0.030
 cannot+0.030
w+0.028
Neg logit contributions
 orchestra-0.055
 amusement-0.052
 elephants-0.052
 films-0.052
 lions-0.051
Token's
Token ID338
Feature activation+4.28e-02

Pos logit contributions
t+0.064
't+0.055
 don+0.054
 cannot+0.052
 fell+0.050
Neg logit contributions
 orchestra-0.080
 elephants-0.076
 amusement-0.074
 sym-0.073
 films-0.073
Token plan
Token ID1410
Feature activation-8.60e-02

Pos logit contributions
 orchestra+0.090
 amusement+0.085
 films+0.085
 elephants+0.085
 lions+0.084
Neg logit contributions
t-0.066
 don-0.053
't-0.053
 cannot-0.052
w-0.048
Token a
Token ID257
Feature activation+5.26e-02

Pos logit contributions
t+0.126
 don+0.102
't+0.101
 cannot+0.099
w+0.091
Neg logit contributions
 orchestra-0.187
 amusement-0.177
 films-0.177
 elephants-0.176
 lions-0.175
Token fun
Token ID1257
Feature activation-8.29e-03

Pos logit contributions
 orchestra+0.086
 amusement+0.081
 elephants+0.081
 films+0.080
 lions+0.079
Neg logit contributions
t-0.105
 don-0.091
't-0.089
 cannot-0.087
 they-0.087
Token game
Token ID983
Feature activation-1.89e-02

Pos logit contributions
t+0.018
 don+0.016
't+0.016
 cannot+0.015
 they+0.015
Neg logit contributions
 orchestra-0.012
 elephants-0.011
 amusement-0.011
 woodland-0.011
 monkeys-0.011
Token."
Token ID526
Feature activation+1.18e-02

Pos logit contributions
t+0.029
 don+0.024
't+0.023
 cannot+0.023
w+0.022
Neg logit contributions
 orchestra-0.039
 lions-0.037
 elephants-0.037
 amusement-0.037
 films-0.037
Token there
Token ID612
Feature activation-4.73e-03

Pos logit contributions
 elephant+0.101
 lions+0.100
 woodland+0.099
 ig+0.099
 moon+0.098
Neg logit contributions
t-0.051
't-0.036
 cannot-0.036
 don-0.035
w-0.031
Token was
Token ID373
Feature activation+4.29e-02

Pos logit contributions
t+0.007
 don+0.005
 cannot+0.005
't+0.005
w+0.005
Neg logit contributions
 orchestra-0.010
 woodland-0.010
 ig-0.010
 amusement-0.010
 jungle-0.010
Token a
Token ID257
Feature activation+9.23e-02

Pos logit contributions
 orchestra+0.112
 melody+0.104
ki+0.104
 tunes+0.103
 rhythms+0.102
Neg logit contributions
t-0.044
't-0.038
 they-0.035
 not-0.034
 it-0.034
Token lucky
Token ID9670
Feature activation+9.44e-02

Pos logit contributions
opl+0.163
 orchestra+0.161
 amusement+0.157
 rhythms+0.154
child+0.151
Neg logit contributions
 don-0.159
 they-0.151
t-0.149
 not-0.143
 fell-0.143
Token cloth
Token ID16270
Feature activation+2.29e-03

Pos logit contributions
opl+0.145
 orchestra+0.142
 rhythms+0.140
pieces+0.137
 unison+0.136
Neg logit contributions
 don-0.175
t-0.175
 they-0.173
 it-0.168
 we-0.167
Token.
Token ID13
Feature activation+5.02e-02

Pos logit contributions
 orchestra+0.006
 amusement+0.005
 sym+0.005
 films+0.005
 woodland+0.005
Neg logit contributions
t-0.003
't-0.002
 they-0.002
 ever-0.002
 fell-0.002
Token The
Token ID383
Feature activation+1.05e-01

Pos logit contributions
 orchestra+0.108
 elephants+0.100
 films+0.100
 amusement+0.099
 melody+0.099
Neg logit contributions
t-0.077
't-0.066
 don-0.065
 cannot-0.063
 fell-0.057
Token lucky
Token ID9670
Feature activation+9.68e-02

Pos logit contributions
 orchestra+0.137
 moon+0.135
 melody+0.129
 woodland+0.124
 tunes+0.124
Neg logit contributions
t-0.245
 cannot-0.208
't-0.206
 don-0.204
 ever-0.199
Token cloth
Token ID16270
Feature activation-1.14e-02

Pos logit contributions
 orchestra+0.129
 moon+0.127
 melody+0.122
 woodland+0.119
 elephant+0.117
Neg logit contributions
t-0.222
 cannot-0.193
't-0.192
 don-0.186
w-0.179
Token was
Token ID373
Feature activation-1.77e-02

Pos logit contributions
t+0.014
 don+0.011
't+0.011
 cannot+0.011
w+0.010
Neg logit contributions
 orchestra-0.026
 elephant-0.026
 elephants-0.025
 lions-0.025
 films-0.025
Token very
Token ID845
Feature activation-1.68e-02

Pos logit contributions
t+0.033
 don+0.029
use+0.028
 cannot+0.028
w+0.027
Neg logit contributions
 ig-0.030
 elephant-0.029
 moon-0.029
 elderly-0.028
 orchestra-0.028
Token delicious
Token ID12625
Feature activation+3.46e-02

Pos logit contributions
t+0.058
 don+0.047
 cannot+0.047
't+0.047
w+0.045
Neg logit contributions
 orchestra-0.064
 amusement-0.060
 elephant-0.060
 films-0.060
 elephants-0.060
Token.
Token ID13
Feature activation+1.38e-02

Pos logit contributions
 orchestra+0.084
 woodland+0.079
 films+0.078
 amusement+0.077
 ze+0.076
Neg logit contributions
t-0.045
 ever-0.037
 they-0.035
 don-0.035
 soon-0.035
Token
Token ID220
Feature activation-2.21e-02

Pos logit contributions
 orchestra+0.026
 amusement+0.024
 films+0.024
 elephants+0.024
 lions+0.024
Neg logit contributions
t-0.024
 don-0.020
't-0.020
 cannot-0.020
w-0.019
Token
Token ID198
Feature activation+3.18e-03

Pos logit contributions
t+0.029
 don+0.022
't+0.022
 cannot+0.022
w+0.020
Neg logit contributions
 orchestra-0.051
 films-0.049
 amusement-0.049
 elephants-0.049
 lions-0.049
Token
Token ID198
Feature activation+9.24e-03

Pos logit contributions
 orchestra+0.006
 films+0.006
 woodland+0.006
 sym+0.006
 melody+0.006
Neg logit contributions
t-0.005
 don-0.004
 cannot-0.004
w-0.004
't-0.004
TokenThe
Token ID464
Feature activation+1.00e-01

Pos logit contributions
 orchestra+0.015
 films+0.014
 melody+0.014
 woodland+0.014
 unison+0.014
Neg logit contributions
t-0.018
 don-0.016
 cannot-0.016
 fell-0.015
't-0.015
Token moral
Token ID6573
Feature activation-6.90e-04

Pos logit contributions
 orchestra+0.214
 sym+0.198
 films+0.198
 elephants+0.196
 woodland+0.195
Neg logit contributions
t-0.152
 don-0.130
 cannot-0.124
't-0.123
 not-0.118
Token of
Token ID286
Feature activation+2.41e-02

Pos logit contributions
t+0.001
 don+0.001
't+0.001
 cannot+0.001
 fell+0.001
Neg logit contributions
 orchestra-0.001
 amusement-0.001
 films-0.001
 elephants-0.001
 lions-0.001
Token this
Token ID428
Feature activation+4.76e-02

Pos logit contributions
 orchestra+0.049
 amusement+0.046
 elephants+0.046
 monkeys+0.046
 lions+0.045
Neg logit contributions
t-0.039
't-0.032
 don-0.032
 cannot-0.031
w-0.029
Token story
Token ID1621
Feature activation-3.42e-02

Pos logit contributions
 orchestra+0.061
 amusement+0.056
 elephants+0.056
 films+0.055
 lions+0.055
Neg logit contributions
t-0.114
 don-0.099
't-0.099
 cannot-0.098
w-0.094
Token is
Token ID318
Feature activation-2.67e-02

Pos logit contributions
t+0.055
ged+0.050
w+0.046
 fell+0.045
 don+0.044
Neg logit contributions
 orchestra-0.060
 cartoon-0.060
 elephant-0.059
 films-0.059
 jungle-0.058
Token,
Token ID11
Feature activation+3.61e-02

Pos logit contributions
t+0.011
w+0.010
 cannot+0.009
 wouldn+0.009
 don+0.009
Neg logit contributions
 ig-0.012
 woodland-0.012
 lions-0.011
 elephant-0.011
 monkeys-0.011
Token there
Token ID612
Feature activation-1.32e-02

Pos logit contributions
 elephant+0.077
 ig+0.077
 lions+0.075
 rug+0.075
 animal+0.075
Neg logit contributions
t-0.047
't-0.034
use-0.033
 cannot-0.031
w-0.031
Token was
Token ID373
Feature activation+3.15e-02

Pos logit contributions
t+0.019
 don+0.015
 cannot+0.015
't+0.015
w+0.014
Neg logit contributions
 orchestra-0.027
 ig-0.027
 amusement-0.027
 jungle-0.027
 woodland-0.027
Token a
Token ID257
Feature activation+8.29e-02

Pos logit contributions
 orchestra+0.084
 melody+0.078
 tunes+0.078
ki+0.078
 elephants+0.077
Neg logit contributions
t-0.031
't-0.026
 not-0.025
 they-0.024
 don-0.024
Token little
Token ID1310
Feature activation+9.26e-02

Pos logit contributions
 orchestra+0.148
opl+0.145
 amusement+0.143
 tunes+0.140
child+0.139
Neg logit contributions
 don-0.141
t-0.135
 they-0.134
 not-0.129
 fell-0.127
Token girl
Token ID2576
Feature activation+5.70e-02

Pos logit contributions
 orchestra+0.144
 amusement+0.142
 sym+0.137
opl+0.137
 unison+0.136
Neg logit contributions
t-0.171
 don-0.170
 they-0.169
 fell-0.162
 we-0.161
Token named
Token ID3706
Feature activation+3.46e-02

Pos logit contributions
 orchestra+0.105
 amusement+0.099
 elephants+0.096
 sym+0.096
 films+0.095
Neg logit contributions
t-0.103
 don-0.089
't-0.088
 cannot-0.087
 fell-0.086
Token Amy
Token ID14235
Feature activation+3.70e-02

Pos logit contributions
 orchestra+0.057
 amusement+0.053
 films+0.053
 elephants+0.053
 lions+0.053
Neg logit contributions
t-0.069
 don-0.059
't-0.058
 cannot-0.058
w-0.055
Token.
Token ID13
Feature activation+4.38e-02

Pos logit contributions
 orchestra+0.072
 amusement+0.068
 films+0.068
 elephants+0.067
 lions+0.067
Neg logit contributions
t-0.063
 don-0.052
't-0.052
 cannot-0.051
w-0.048
Token Amy
Token ID14235
Feature activation+2.43e-02

Pos logit contributions
 orchestra+0.067
 amusement+0.062
 films+0.062
 elephants+0.062
 lions+0.061
Neg logit contributions
t-0.093
 don-0.080
't-0.079
 cannot-0.079
w-0.075
Token liked
Token ID8288
Feature activation+6.33e-03

Pos logit contributions
 orchestra+0.050
 elephant+0.048
 jungle+0.048
 films+0.047
 amusement+0.047
Neg logit contributions
t-0.037
't-0.030
 don-0.029
 cannot-0.028
w-0.026
Token was
Token ID373
Feature activation-6.46e-03

Pos logit contributions
t+0.007
 don+0.005
't+0.005
 cannot+0.005
 fell+0.005
Neg logit contributions
 orchestra-0.014
 amusement-0.013
 films-0.013
 elephants-0.013
 lions-0.013
Token determined
Token ID5295
Feature activation-4.83e-02

Pos logit contributions
t+0.009
't+0.008
 don+0.008
 cannot+0.007
 not+0.007
Neg logit contributions
 orchestra-0.014
 elephants-0.014
 films-0.014
 amusement-0.014
 elephant-0.013
Token to
Token ID284
Feature activation-9.51e-03

Pos logit contributions
t+0.058
w+0.056
 don+0.052
 cannot+0.050
't+0.048
Neg logit contributions
 orchestra-0.098
 amusement-0.098
 playground-0.096
 melody-0.095
 jungle-0.094
Token have
Token ID423
Feature activation-4.61e-02

Pos logit contributions
t+0.013
 don+0.011
't+0.010
 cannot+0.010
w+0.010
Neg logit contributions
 orchestra-0.021
 amusement-0.020
 films-0.020
 elephants-0.020
 lions-0.020
Token a
Token ID257
Feature activation+6.53e-02

Pos logit contributions
t+0.079
 don+0.064
't+0.063
 cannot+0.063
w+0.060
Neg logit contributions
 orchestra-0.090
 amusement-0.086
 woodland-0.084
 films-0.084
 elephants-0.084
Token good
Token ID922
Feature activation+6.36e-02

Pos logit contributions
 orchestra+0.104
 films+0.097
 elephants+0.097
 lions+0.096
 amusement+0.096
Neg logit contributions
t-0.135
 don-0.117
't-0.114
 cannot-0.114
 fell-0.109
Token day
Token ID1110
Feature activation-3.40e-02

Pos logit contributions
 orchestra+0.104
 woodland+0.097
 amusement+0.097
 films+0.097
 melody+0.095
Neg logit contributions
t-0.126
't-0.109
 cannot-0.107
 don-0.106
w-0.103
Token.
Token ID13
Feature activation+1.81e-02

Pos logit contributions
t+0.048
 don+0.038
't+0.037
 cannot+0.037
 ever+0.034
Neg logit contributions
 orchestra-0.076
 films-0.072
 elephants-0.072
 lions-0.071
 monkeys-0.071
Token So
Token ID1406
Feature activation-9.75e-03

Pos logit contributions
 orchestra+0.032
 woodland+0.031
 amusement+0.030
 niece+0.030
 sym+0.029
Neg logit contributions
t-0.032
w-0.029
 don-0.028
 cannot-0.027
't-0.027
Token her
Token ID607
Feature activation+7.80e-02

Pos logit contributions
t+0.017
w+0.015
 cannot+0.015
iffs+0.014
itates+0.013
Neg logit contributions
 niece-0.017
 orchestra-0.017
 amusement-0.016
 granddaughter-0.016
 melody-0.016
Token mother
Token ID2802
Feature activation+6.95e-03

Pos logit contributions
 orchestra+0.151
 elephants+0.149
 films+0.148
 lions+0.146
 monkeys+0.144
Neg logit contributions
t-0.134
 don-0.113
 fell-0.112
 not-0.111
 soon-0.110
Token avocado
Token ID40377
Feature activation+3.57e-02

Pos logit contributions
 unison+0.139
 sym+0.138
 orchestra+0.135
 woodland+0.134
 films+0.134
Neg logit contributions
t-0.196
 cannot-0.181
 not-0.179
't-0.175
 don-0.169
Token.
Token ID13
Feature activation+3.12e-02

Pos logit contributions
 orchestra+0.082
 amusement+0.078
 films+0.078
 elephants+0.078
 lions+0.077
Neg logit contributions
t-0.048
 don-0.038
't-0.037
 cannot-0.037
 fell-0.034
Token She
Token ID1375
Feature activation-2.73e-02

Pos logit contributions
 orchestra+0.069
 amusement+0.065
 elephants+0.064
 lions+0.064
 melody+0.064
Neg logit contributions
t-0.047
't-0.038
 don-0.038
 cannot-0.037
 they-0.035
Token walked
Token ID6807
Feature activation-3.68e-02

Pos logit contributions
t+0.035
 don+0.026
't+0.026
 cannot+0.025
w+0.024
Neg logit contributions
 orchestra-0.064
 elephant-0.062
 elephants-0.061
 monkeys-0.061
 lions-0.061
Token up
Token ID510
Feature activation-1.98e-02

Pos logit contributions
t+0.052
 don+0.042
't+0.041
 cannot+0.040
w+0.039
Neg logit contributions
 orchestra-0.078
 elephant-0.078
 elephants-0.076
 amusement-0.076
 monkeys-0.076
Token to
Token ID284
Feature activation+1.64e-02

Pos logit contributions
t+0.020
 don+0.015
't+0.014
 cannot+0.014
w+0.012
Neg logit contributions
 orchestra-0.051
 amusement-0.049
 elephants-0.049
 films-0.049
 lions-0.049
Token him
Token ID683
Feature activation+7.90e-03

Pos logit contributions
 orchestra+0.040
 amusement+0.038
 films+0.038
 elephants+0.038
 lions+0.038
Neg logit contributions
t-0.020
 don-0.015
't-0.015
 cannot-0.015
w-0.013
Token and
Token ID290
Feature activation-3.66e-02

Pos logit contributions
 orchestra+0.017
 elephant+0.016
 films+0.016
 amusement+0.016
 elephants+0.016
Neg logit contributions
t-0.012
 don-0.010
't-0.009
 cannot-0.009
w-0.009
Token said
Token ID531
Feature activation-2.18e-02

Pos logit contributions
t+0.056
't+0.044
 cannot+0.043
 don+0.043
w+0.041
Neg logit contributions
 orchestra-0.077
 elephant-0.074
 elephants-0.074
 monkeys-0.073
 lions-0.073
Token,
Token ID11
Feature activation-4.75e-02

Pos logit contributions
t+0.034
 cannot+0.027
w+0.027
't+0.026
 don+0.026
Neg logit contributions
 orchestra-0.043
 lions-0.043
 monkeys-0.043
 elephants-0.042
 elephant-0.042
Token "
Token ID366
Feature activation-7.03e-03

Pos logit contributions
t+0.096
w+0.088
 cannot+0.083
 don+0.080
 ever+0.078
Neg logit contributions
 orchestra-0.070
 elephant-0.068
 lions-0.067
 monkeys-0.067
 elephants-0.067
Token hungry
Token ID14720
Feature activation+1.00e-02

Pos logit contributions
t+0.028
't+0.024
w+0.023
 don+0.022
 cannot+0.022
Neg logit contributions
 orchestra-0.023
 elderly-0.022
 jungle-0.022
 elephant-0.021
 films-0.021
Token and
Token ID290
Feature activation-4.24e-02

Pos logit contributions
 orchestra+0.025
 sym+0.023
 films+0.023
 woodland+0.023
 melody+0.023
Neg logit contributions
t-0.013
 don-0.010
't-0.009
 cannot-0.009
 ever-0.009
Token wanted
Token ID2227
Feature activation-5.17e-02

Pos logit contributions
t+0.060
w+0.048
't+0.045
 cannot+0.042
 don+0.041
Neg logit contributions
 elephant-0.091
 orchestra-0.089
 monkeys-0.088
 jungle-0.088
 elephants-0.087
Token a
Token ID257
Feature activation+6.07e-02

Pos logit contributions
t+0.050
 don+0.036
't+0.034
 cannot+0.034
w+0.031
Neg logit contributions
 orchestra-0.135
 lions-0.133
 elephants-0.133
 monkeys-0.132
 amusement-0.132
Token big
Token ID1263
Feature activation+1.03e-01

Pos logit contributions
 orchestra+0.086
 elephants+0.080
 elephant+0.080
 films+0.080
 lions+0.080
Neg logit contributions
t-0.134
't-0.116
 don-0.115
 cannot-0.114
w-0.111
Token meal
Token ID9799
Feature activation-6.08e-03

Pos logit contributions
 orchestra+0.155
 amusement+0.141
 films+0.141
 elephants+0.140
 woodland+0.140
Neg logit contributions
t-0.221
 don-0.194
 cannot-0.190
't-0.190
 fell-0.181
Token.
Token ID13
Feature activation+2.18e-02

Pos logit contributions
t+0.009
 don+0.007
't+0.007
 cannot+0.007
w+0.007
Neg logit contributions
 orchestra-0.013
 amusement-0.012
 elephants-0.012
 lions-0.012
 films-0.012
Token But
Token ID887
Feature activation-8.62e-03

Pos logit contributions
 orchestra+0.046
 elephants+0.043
 amusement+0.043
 elephant+0.043
 films+0.043
Neg logit contributions
t-0.033
 don-0.027
't-0.026
 cannot-0.026
w-0.025
Token the
Token ID262
Feature activation+1.10e-01

Pos logit contributions
t+0.016
 cannot+0.013
't+0.013
w+0.012
 don+0.012
Neg logit contributions
 elephant-0.015
 monkeys-0.015
 ig-0.015
 amusement-0.015
 elephants-0.015
Token cook
Token ID4255
Feature activation+1.65e-02

Pos logit contributions
 orchestra+0.154
 sym+0.140
 amusement+0.139
 melody+0.138
 films+0.136
Neg logit contributions
t-0.252
 cannot-0.222
 don-0.222
't-0.219
 not-0.211
Token was
Token ID373
Feature activation-1.88e-02

Pos logit contributions
 orchestra+0.040
 amusement+0.039
 films+0.038
 elephants+0.038
 lions+0.038
Neg logit contributions
t-0.020
 don-0.015
't-0.015
 cannot-0.014
w-0.013
Token I
Token ID314
Feature activation-3.33e-02

Pos logit contributions
t+0.056
't+0.039
 don+0.039
 cannot+0.038
w+0.035
Neg logit contributions
 orchestra-0.140
 elephants-0.135
 lions-0.135
 amusement-0.134
 monkeys-0.134
Token can
Token ID460
Feature activation-6.32e-02

Pos logit contributions
t+0.034
w+0.027
ged+0.025
't+0.024
 they+0.022
Neg logit contributions
 elephant-0.078
 orchestra-0.076
 elephants-0.074
 jungle-0.074
 ig-0.073
Token find
Token ID1064
Feature activation-3.95e-02

Pos logit contributions
t+0.104
w+0.085
 don+0.083
 cannot+0.081
ged+0.080
Neg logit contributions
 orchestra-0.122
 elephant-0.118
 amusement-0.115
 elephants-0.115
 monkeys-0.114
Token the
Token ID262
Feature activation+1.14e-01

Pos logit contributions
t+0.053
 don+0.043
't+0.042
 cannot+0.042
w+0.038
Neg logit contributions
 orchestra-0.090
 amusement-0.085
 films-0.085
 melody-0.084
 woodland-0.084
Token most
Token ID749
Feature activation+4.63e-02

Pos logit contributions
 orchestra+0.184
 films+0.176
 amusement+0.173
 sym+0.171
 woodland+0.170
Neg logit contributions
t-0.228
 don-0.200
't-0.198
 cannot-0.196
 will-0.186
Token things
Token ID1243
Feature activation-3.04e-02

Pos logit contributions
 orchestra+0.079
 amusement+0.074
 films+0.074
 woodland+0.073
 melody+0.072
Neg logit contributions
t-0.088
 don-0.076
 cannot-0.074
't-0.074
 ever-0.070
Token."
Token ID526
Feature activation+2.33e-02

Pos logit contributions
t+0.039
w+0.032
ged+0.031
't+0.029
 cannot+0.029
Neg logit contributions
 elephants-0.064
 elephant-0.063
 orchestra-0.063
 lions-0.063
 monkeys-0.062
Token So
Token ID1406
Feature activation-8.81e-03

Pos logit contributions
 orchestra+0.057
 amusement+0.054
 films+0.053
 melody+0.053
 lions+0.052
Neg logit contributions
t-0.031
't-0.025
 don-0.024
 cannot-0.023
 they-0.023
Token they
Token ID484
Feature activation-5.52e-02

Pos logit contributions
t+0.016
 don+0.013
't+0.013
 cannot+0.013
w+0.012
Neg logit contributions
 orchestra-0.016
 amusement-0.015
 films-0.015
 elephants-0.015
 lions-0.015
Token started
Token ID2067
Feature activation-5.01e-02

Pos logit contributions
t+0.081
't+0.064
 don+0.063
 cannot+0.062
w+0.060
Neg logit contributions
 orchestra-0.118
 elephant-0.115
 elephants-0.113
 amusement-0.112
 monkeys-0.111
Token counting
Token ID14143
Feature activation-3.12e-02

Pos logit contributions
t+0.066
't+0.054
 don+0.053
 cannot+0.052
 they+0.050
Neg logit contributions
 orchestra-0.115
 amusement-0.108
 melody-0.107
 woodland-0.107
 films-0.107
Token environment
Token ID2858
Feature activation+2.06e-03

Pos logit contributions
 orchestra+0.179
 films+0.164
 ze+0.161
 amusement+0.161
 melody+0.158
Neg logit contributions
t-0.244
 don-0.221
 cannot-0.213
't-0.211
 will-0.199
Token.
Token ID13
Feature activation+1.34e-02

Pos logit contributions
 orchestra+0.004
 amusement+0.004
 films+0.004
 elephants+0.004
 lions+0.004
Neg logit contributions
t-0.003
 don-0.003
't-0.003
 cannot-0.002
 fell-0.002
Token
Token ID198
Feature activation+2.06e-03

Pos logit contributions
 orchestra+0.032
 amusement+0.030
 elephants+0.030
 elephant+0.030
 monkeys+0.030
Neg logit contributions
t-0.017
't-0.014
 don-0.014
 cannot-0.013
 they-0.012
Token
Token ID198
Feature activation+9.03e-03

Pos logit contributions
 orchestra+0.004
 elephants+0.004
 amusement+0.004
 films+0.004
 elephant+0.004
Neg logit contributions
t-0.004
't-0.003
 don-0.003
 cannot-0.003
 fell-0.003
TokenThe
Token ID464
Feature activation+1.01e-01

Pos logit contributions
 orchestra+0.017
 elephants+0.016
 films+0.015
 elephant+0.015
 amusement+0.015
Neg logit contributions
t-0.017
't-0.014
 don-0.014
 cannot-0.013
w-0.013
Token moral
Token ID6573
Feature activation-3.25e-03

Pos logit contributions
 orchestra+0.204
 films+0.192
 sym+0.191
latable+0.191
smith+0.189
Neg logit contributions
t-0.158
 don-0.140
't-0.134
 cannot-0.130
 soon-0.125
Token of
Token ID286
Feature activation+2.33e-02

Pos logit contributions
t+0.006
 don+0.005
't+0.005
 cannot+0.005
 fell+0.005
Neg logit contributions
 orchestra-0.005
 amusement-0.005
 films-0.005
 lions-0.005
 woodland-0.005
Token the
Token ID262
Feature activation+1.08e-01

Pos logit contributions
 orchestra+0.049
 elephants+0.046
 amusement+0.046
 monkeys+0.046
 lions+0.045
Neg logit contributions
t-0.036
't-0.030
 don-0.030
 cannot-0.029
w-0.027
Token story
Token ID1621
Feature activation-3.18e-02

Pos logit contributions
 orchestra+0.158
 elephants+0.144
 sym+0.144
 niece+0.143
 amusement+0.142
Neg logit contributions
t-0.237
't-0.211
 don-0.209
 cannot-0.201
 not-0.193
Token is
Token ID318
Feature activation-2.26e-02

Pos logit contributions
t+0.049
ged+0.046
w+0.041
 don+0.041
't+0.041
Neg logit contributions
 orchestra-0.056
 jungle-0.053
 cartoon-0.053
 films-0.053
 lions-0.053
Token that
Token ID326
Feature activation-2.77e-02

Pos logit contributions
t+0.035
 don+0.028
w+0.028
 fell+0.028
ged+0.028
Neg logit contributions
 orchestra-0.043
 woodland-0.042
 films-0.041
 amusement-0.041
 elephant-0.041
Token you
Token ID345
Feature activation-2.68e-02

Pos logit contributions
t+0.034
 don+0.025
't+0.024
 cannot+0.024
 fell+0.021
Neg logit contributions
 orchestra-0.074
 elephants-0.072
 amusement-0.071
 lions-0.071
 films-0.071
Token on
Token ID319
Feature activation-2.04e-02

Pos logit contributions
t+0.035
't+0.027
 don+0.027
 cannot+0.026
 fell+0.024
Neg logit contributions
 orchestra-0.060
 lions-0.059
 amusement-0.059
 elephants-0.059
 monkeys-0.058
Token an
Token ID281
Feature activation+1.04e-01

Pos logit contributions
t+0.031
't+0.026
 don+0.026
 cannot+0.025
 soon+0.023
Neg logit contributions
 orchestra-0.043
 films-0.041
 elephants-0.041
 amusement-0.040
 elephant-0.040
Token adventure
Token ID8855
Feature activation-3.64e-02

Pos logit contributions
 orchestra+0.143
 sym+0.137
 niece+0.135
 elephants+0.133
 films+0.132
Neg logit contributions
t-0.200
 soon-0.189
 not-0.186
 never-0.185
 ever-0.185
Token!"
Token ID2474
Feature activation+2.15e-02

Pos logit contributions
t+0.051
 don+0.040
't+0.040
 cannot+0.039
w+0.036
Neg logit contributions
 orchestra-0.082
 amusement-0.078
 films-0.077
 elephants-0.077
 lions-0.077
Token
Token ID198
Feature activation+4.68e-03

Pos logit contributions
 orchestra+0.050
 amusement+0.047
 films+0.046
 elephants+0.045
 lions+0.045
Neg logit contributions
t-0.031
 don-0.026
't-0.025
 they-0.025
 fell-0.025
Token
Token ID198
Feature activation+8.91e-03

Pos logit contributions
 orchestra+0.011
 amusement+0.011
 elephants+0.011
 films+0.011
 elephant+0.011
Neg logit contributions
t-0.006
 don-0.004
't-0.004
 cannot-0.004
 fell-0.004
TokenThe
Token ID464
Feature activation+9.76e-02

Pos logit contributions
 orchestra+0.019
 films+0.018
 elephant+0.018
 elephants+0.018
 lions+0.018
Neg logit contributions
t-0.014
 don-0.012
't-0.012
 soon-0.011
 never-0.011
Token potato
Token ID21219
Feature activation+2.86e-02

Pos logit contributions
 orchestra+0.125
 amusement+0.110
 sym+0.110
 elephants+0.107
 niece+0.106
Neg logit contributions
t-0.230
 don-0.211
't-0.205
 cannot-0.203
 they-0.196
Token was
Token ID373
Feature activation-2.44e-02

Pos logit contributions
 orchestra+0.069
 films+0.066
 amusement+0.066
 elephants+0.065
 lions+0.065
Neg logit contributions
t-0.035
 don-0.027
't-0.026
 cannot-0.026
 fell-0.025
Token excited
Token ID6568
Feature activation-2.04e-02

Pos logit contributions
t+0.037
 don+0.030
't+0.030
 cannot+0.030
 fell+0.028
Neg logit contributions
 orchestra-0.052
 amusement-0.049
 films-0.049
 elephants-0.049
 lions-0.048
Tokenmy
Token ID1820
Feature activation-1.71e-03

Pos logit contributions
 orchestra+0.088
 films+0.083
 amusement+0.082
 sym+0.081
 elephants+0.081
Neg logit contributions
t-0.066
 don-0.053
't-0.051
 cannot-0.051
 soon-0.049
Token to
Token ID284
Feature activation+1.51e-02

Pos logit contributions
t+0.002
 don+0.001
't+0.001
 cannot+0.001
w+0.001
Neg logit contributions
 orchestra-0.004
 amusement-0.004
 elephant-0.004
 lions-0.004
 films-0.004
Token go
Token ID467
Feature activation-4.77e-02

Pos logit contributions
 orchestra+0.037
 elephants+0.036
 lions+0.036
 films+0.035
 monkeys+0.035
Neg logit contributions
t-0.017
 don-0.014
't-0.014
 cannot-0.013
 fell-0.012
Token home
Token ID1363
Feature activation-4.36e-02

Pos logit contributions
t+0.079
 don+0.067
 cannot+0.064
w+0.064
't+0.063
Neg logit contributions
 orchestra-0.094
 amusement-0.090
 elephants-0.087
 elephant-0.087
 lions-0.087
Token.
Token ID13
Feature activation+1.74e-02

Pos logit contributions
t+0.072
 don+0.060
't+0.059
 cannot+0.059
w+0.056
Neg logit contributions
 orchestra-0.086
 amusement-0.081
 films-0.080
 elephants-0.080
 lions-0.080
Token He
Token ID679
Feature activation-4.90e-02

Pos logit contributions
 orchestra+0.037
 films+0.035
 melody+0.034
 lions+0.034
 elephants+0.034
Neg logit contributions
t-0.028
't-0.024
 don-0.023
 cannot-0.023
 they-0.022
Token waved
Token ID26834
Feature activation-4.33e-02

Pos logit contributions
t+0.065
't+0.050
 don+0.049
 cannot+0.047
w+0.046
Neg logit contributions
 orchestra-0.113
 amusement-0.108
 elephant-0.107
 woodland-0.107
 elephants-0.107
Token at
Token ID379
Feature activation+7.11e-03

Pos logit contributions
t+0.084
 don+0.071
't+0.071
 cannot+0.070
w+0.066
Neg logit contributions
 orchestra-0.074
 amusement-0.070
 films-0.069
 elephants-0.069
 lions-0.068
Token the
Token ID262
Feature activation+1.09e-01

Pos logit contributions
 orchestra+0.016
 sym+0.016
 films+0.015
 amusement+0.015
 melody+0.015
Neg logit contributions
t-0.010
 don-0.008
't-0.008
 it-0.008
 they-0.008
Token hairy
Token ID43873
Feature activation+1.51e-01

Pos logit contributions
 orchestra+0.200
 sym+0.192
 unison+0.180
 alien+0.179
 rhythms+0.176
Neg logit contributions
t-0.203
 cannot-0.178
 don-0.171
't-0.169
 not-0.168
Token dog
Token ID3290
Feature activation+3.35e-02

Pos logit contributions
 orchestra+0.209
 sym+0.181
 amusement+0.174
 rhythms+0.173
 unison+0.171
Neg logit contributions
t-0.355
 don-0.313
 cannot-0.309
't-0.309
w-0.302
Token her
Token ID607
Feature activation+7.58e-02

Pos logit contributions
t+0.002
 don+0.001
't+0.001
 cannot+0.001
 it+0.001
Neg logit contributions
 orchestra-0.004
 films-0.004
 amusement-0.004
 sym-0.004
 jungle-0.004
Token hand
Token ID1021
Feature activation-3.44e-02

Pos logit contributions
 orchestra+0.126
 amusement+0.114
 woodland+0.111
 films+0.110
 sym+0.109
Neg logit contributions
t-0.155
 don-0.139
 cannot-0.138
't-0.134
 ever-0.130
Token so
Token ID523
Feature activation-1.90e-02

Pos logit contributions
t+0.051
 don+0.041
't+0.041
 cannot+0.040
 fell+0.038
Neg logit contributions
 orchestra-0.075
 films-0.071
 amusement-0.071
 woodland-0.070
 elephants-0.070
Token that
Token ID326
Feature activation-3.83e-02

Pos logit contributions
t+0.026
 don+0.019
't+0.019
 cannot+0.019
w+0.018
Neg logit contributions
 orchestra-0.042
 lions-0.041
 monkeys-0.040
 elephants-0.040
 woodland-0.040
Token it
Token ID340
Feature activation-6.73e-02

Pos logit contributions
t+0.044
't+0.029
 cannot+0.028
 don+0.027
w+0.027
Neg logit contributions
 orchestra-0.096
 amusement-0.093
 moon-0.093
 lions-0.093
 movies-0.092
Token would
Token ID561
Feature activation-7.90e-02

Pos logit contributions
t+0.083
't+0.065
 don+0.062
w+0.059
 cannot+0.057
Neg logit contributions
 moon-0.147
 elephant-0.146
 orchestra-0.144
 monkeys-0.144
 elephants-0.144
Token stay
Token ID2652
Feature activation-4.67e-02

Pos logit contributions
t+0.130
w+0.112
ged+0.111
 cannot+0.100
 don+0.098
Neg logit contributions
 moon-0.141
 orchestra-0.139
 elephant-0.139
 melody-0.138
 monkeys-0.137
Token together
Token ID1978
Feature activation-7.04e-02

Pos logit contributions
t+0.078
't+0.063
 cannot+0.060
 don+0.060
w+0.057
Neg logit contributions
 orchestra-0.093
 films-0.090
 amusement-0.089
 melody-0.088
 ig-0.088
Token.
Token ID13
Feature activation+1.28e-02

Pos logit contributions
t+0.114
 don+0.093
't+0.093
 cannot+0.092
w+0.089
Neg logit contributions
 orchestra-0.140
 amusement-0.132
 films-0.132
 lions-0.131
 elephants-0.131
Token
Token ID198
Feature activation-9.69e-04

Pos logit contributions
 orchestra+0.031
 amusement+0.030
 elephants+0.030
 films+0.030
 lions+0.029
Neg logit contributions
t-0.016
't-0.013
 don-0.012
 cannot-0.012
 they-0.011
Token
Token ID198
Feature activation+6.17e-03

Pos logit contributions
t+0.001
 don+0.001
 cannot+0.001
w+0.001
't+0.001
Neg logit contributions
 orchestra-0.002
 woodland-0.002
 films-0.002
 monkeys-0.002
 jungle-0.002
Token "
Token ID366
Feature activation-2.94e-02

Pos logit contributions
t+0.113
w+0.102
 cannot+0.098
 don+0.095
't+0.090
Neg logit contributions
 orchestra-0.085
 elephant-0.083
 elephants-0.082
 monkeys-0.081
 lions-0.080
TokenYes
Token ID5297
Feature activation-2.28e-02

Pos logit contributions
t+0.056
 cannot+0.052
 don+0.051
 fell+0.051
't+0.051
Neg logit contributions
 orchestra-0.043
 melody-0.043
 tunes-0.042
 elephants-0.040
 amusement-0.040
Token,
Token ID11
Feature activation-2.75e-02

Pos logit contributions
t+0.033
 cannot+0.026
 don+0.025
w+0.025
't+0.025
Neg logit contributions
 orchestra-0.048
 elephant-0.046
 films-0.045
 monkeys-0.045
 amusement-0.045
Token we
Token ID356
Feature activation-5.20e-02

Pos logit contributions
t+0.028
't+0.020
 don+0.020
 cannot+0.020
w+0.018
Neg logit contributions
 orchestra-0.072
 amusement-0.069
 films-0.068
 elephants-0.068
 lions-0.068
Token can
Token ID460
Feature activation-6.63e-02

Pos logit contributions
ged+0.068
t+0.067
iffs+0.062
w+0.058
mar+0.054
Neg logit contributions
 elephant-0.095
 moon-0.092
 engine-0.090
 ig-0.089
 yard-0.087
Token lift
Token ID10303
Feature activation-4.33e-02

Pos logit contributions
t+0.110
w+0.097
ged+0.094
 cannot+0.088
 don+0.084
Neg logit contributions
 ig-0.113
 elephant-0.112
 moon-0.109
 cartoon-0.107
 orchestra-0.107
Token the
Token ID262
Feature activation+1.11e-01

Pos logit contributions
t+0.061
't+0.046
 don+0.044
 cannot+0.044
 fell+0.040
Neg logit contributions
 orchestra-0.097
 films-0.094
 lions-0.094
 elephants-0.094
 amusement-0.093
Token window
Token ID4324
Feature activation-1.97e-02

Pos logit contributions
 orchestra+0.124
 amusement+0.121
 films+0.117
latable+0.117
 Labrador+0.116
Neg logit contributions
t-0.273
 don-0.247
 cannot-0.247
't-0.246
 fell-0.238
Token."
Token ID526
Feature activation+5.13e-03

Pos logit contributions
t+0.027
 don+0.022
't+0.021
 cannot+0.021
w+0.020
Neg logit contributions
 orchestra-0.044
 amusement-0.042
 films-0.042
 elephants-0.042
 lions-0.042
Token
Token ID220
Feature activation-2.58e-02

Pos logit contributions
 orchestra+0.012
 amusement+0.011
 elephants+0.011
 lions+0.011
 films+0.011
Neg logit contributions
t-0.007
 don-0.006
't-0.006
 cannot-0.006
 they-0.006
Token
Token ID198
Feature activation+2.29e-03

Pos logit contributions
t+0.037
 don+0.029
 cannot+0.029
't+0.028
w+0.028
Neg logit contributions
 orchestra-0.054
 woodland-0.052
 films-0.052
 lions-0.052
 monkeys-0.052
Token his
Token ID465
Feature activation+5.10e-02

Pos logit contributions
t+0.040
 don+0.032
't+0.031
 cannot+0.031
 fell+0.028
Neg logit contributions
 orchestra-0.068
 amusement-0.065
 films-0.064
 elephants-0.064
 woodland-0.064
Token best
Token ID1266
Feature activation+2.16e-02

Pos logit contributions
 orchestra+0.082
 amusement+0.076
 films+0.074
 sym+0.073
 melody+0.073
Neg logit contributions
t-0.104
 don-0.094
 cannot-0.092
't-0.090
 not-0.089
Token friend
Token ID1545
Feature activation+1.57e-02

Pos logit contributions
 orchestra+0.027
 amusement+0.024
 films+0.024
 elephants+0.024
 woodland+0.023
Neg logit contributions
t-0.053
 don-0.046
't-0.046
 cannot-0.045
 ever-0.044
Token.
Token ID13
Feature activation+1.90e-02

Pos logit contributions
 orchestra+0.033
 amusement+0.031
 films+0.031
 elephants+0.031
 woodland+0.031
Neg logit contributions
t-0.024
 don-0.020
 cannot-0.019
't-0.019
 fell-0.018
Token He
Token ID679
Feature activation-5.37e-02

Pos logit contributions
 orchestra+0.049
 amusement+0.046
 melody+0.046
 tunes+0.046
 films+0.045
Neg logit contributions
t-0.022
't-0.018
 don-0.017
 cannot-0.017
 they-0.016
Token hoped
Token ID10719
Feature activation-5.95e-02

Pos logit contributions
t+0.073
 don+0.056
't+0.056
 cannot+0.054
w+0.051
Neg logit contributions
 orchestra-0.123
 amusement-0.117
 elephants-0.117
 films-0.116
 lions-0.116
Token that
Token ID326
Feature activation-3.45e-02

Pos logit contributions
t+0.067
 don+0.050
't+0.049
 cannot+0.048
 they+0.043
Neg logit contributions
 orchestra-0.151
 amusement-0.144
 films-0.143
 elephants-0.142
 woodland-0.142
Token he
Token ID339
Feature activation-6.57e-02

Pos logit contributions
t+0.043
 don+0.035
't+0.033
 cannot+0.033
 they+0.031
Neg logit contributions
 orchestra-0.083
 amusement-0.078
 films-0.077
 woodland-0.077
 sym-0.077
Token would
Token ID561
Feature activation-7.60e-02

Pos logit contributions
t+0.075
't+0.056
w+0.056
ged+0.046
 don+0.039
Neg logit contributions
 elephants-0.150
 elephant-0.150
 lions-0.149
 monkeys-0.147
 jungle-0.147
Token find
Token ID1064
Feature activation-4.70e-02

Pos logit contributions
t+0.118
w+0.096
ged+0.088
 cannot+0.085
't+0.083
Neg logit contributions
 orchestra-0.156
 amusement-0.151
 elephant-0.150
 melody-0.148
 jungle-0.147
Token his
Token ID465
Feature activation+6.95e-02

Pos logit contributions
t+0.070
 don+0.058
 cannot+0.056
't+0.056
w+0.052
Neg logit contributions
 orchestra-0.101
 amusement-0.095
 films-0.094
 elephants-0.094
 lions-0.093
Token their
Token ID511
Feature activation+9.08e-02

Pos logit contributions
t+0.055
 don+0.046
't+0.045
 cannot+0.044
 they+0.041
Neg logit contributions
 orchestra-0.077
 amusement-0.073
 elephants-0.072
 films-0.072
 lions-0.072
Token adventures
Token ID17545
Feature activation-4.42e-02

Pos logit contributions
 orchestra+0.162
 elephants+0.152
 monkeys+0.151
 lions+0.148
 ze+0.147
Neg logit contributions
t-0.170
 don-0.151
 cannot-0.146
 ever-0.138
't-0.138
Token.
Token ID13
Feature activation+1.01e-02

Pos logit contributions
t+0.071
 don+0.058
't+0.057
 cannot+0.057
w+0.053
Neg logit contributions
 orchestra-0.090
 amusement-0.085
 films-0.085
 elephants-0.084
 lions-0.084
Token They
Token ID1119
Feature activation-6.11e-02

Pos logit contributions
 orchestra+0.022
 amusement+0.021
 elephants+0.021
 lions+0.021
 monkeys+0.021
Neg logit contributions
t-0.015
't-0.012
 don-0.012
 cannot-0.012
 they-0.011
Token then
Token ID788
Feature activation-8.65e-02

Pos logit contributions
t+0.075
 don+0.058
't+0.056
 cannot+0.056
 fell+0.051
Neg logit contributions
 orchestra-0.147
 films-0.140
 amusement-0.140
 elephants-0.139
 lions-0.139
Token ate
Token ID15063
Feature activation-4.35e-02

Pos logit contributions
t+0.105
't+0.079
 don+0.077
 cannot+0.076
w+0.068
Neg logit contributions
 orchestra-0.211
 amusement-0.204
 elephants-0.201
 lions-0.201
 films-0.201
Token a
Token ID257
Feature activation+5.15e-02

Pos logit contributions
t+0.061
 don+0.052
 cannot+0.049
't+0.048
 they+0.047
Neg logit contributions
 orchestra-0.097
 elephants-0.089
 amusement-0.089
 elephant-0.089
 woodland-0.089
Token y
Token ID331
Feature activation+8.41e-02

Pos logit contributions
 orchestra+0.089
 elephants+0.085
 lions+0.083
 monkeys+0.083
 elephant+0.082
Neg logit contributions
t-0.099
 don-0.083
't-0.080
 fell-0.079
 cannot-0.079
Tokenummy
Token ID13513
Feature activation+7.10e-02

Pos logit contributions
 orchestra+0.121
 elephants+0.113
 lions+0.110
 monkeys+0.110
 ze+0.109
Neg logit contributions
t-0.191
't-0.162
 don-0.160
 cannot-0.158
 ever-0.156
Token dinner
Token ID8073
Feature activation-2.67e-02

Pos logit contributions
 orchestra+0.116
 elephants+0.107
 amusement+0.106
 lions+0.106
 films+0.106
Neg logit contributions
t-0.143
 don-0.123
't-0.121
 cannot-0.120
 fell-0.115
Token together
Token ID1978
Feature activation-5.88e-02

Pos logit contributions
t+0.036
 don+0.029
't+0.029
 cannot+0.028
w+0.026
Neg logit contributions
 orchestra-0.060
 amusement-0.057
 films-0.057
 lions-0.057
 elephants-0.057