TOP ACTIVATIONS
MAX = 4.275

the cars stopped and the pedestrian light turned green . Then
you love are all important factors which can make life more
" We are the weapon creators ! We can stop the
! AND YOU ARE E ATING ALL THE S UG AR
dragon with my stick - sword ." He swings the stick
" ZE B RA K ILL ER " and his elephant
C leaning is important , and I
stepped into an ant - hill and could not get out
. They say " m arch , march , march "
. She started to " compl ain " that it wasn
. Alt ogether , Sam and Mom my
mysterious . Talking to himself , the tiger
became known as the best cra fter around . <|endoftext|> Once
with her toys and doing arts and crafts . One day
. They say " m arch , march , march "
looked at the egg - du ck and then at each
ised and studied hard -- mem or izing the answers and
â O tty was so excited that his
to share the egg - du ck and try to fit
even practiced yoga and martial arts together . Joe

MIN ACTIVATIONS
MIN = -3.795

flashlight , and a magn ifying glass . As
, handing him the magn ifying glass . She looked around
When Daisy put the magn ifying glass down , the cater
examine it with the magn ifying glass ?" "
a flashlight and a magn ifying glass to examine their finds
, next to her magn ifying glass . She looked at
said , holding the magn ifying glass over a black ant
. Lily had a magn ifying glass that she found in
. They had a magn ifying glass and a small box
the watch with the magn ifying glass . He saw the
She had a big magn ifying glass that made things look
borrowed her mom 's magn ifying glass to take a closer
the necklace and the magn ifying glass and started to examine
hat and grabbed her magn ifying glass . She was ready
He pointed his magn ifying glass at different classmates and
He took out a magn ifying glass and a notebook and
She takes out a magn ifying glass . She looks at
cried , dropping the magn ifying glass . "
and handed him the magn ifying glass . He
and put on your p aj amas . Then we can

INTERVAL 2.257 - 4.275
CONTAINS 0.082%

said , That was a very
and said Thank you ! I
softly , Hello ? Who â
mom my replied , We need
". The end . <|endoftext|> Once upon a

INTERVAL 0.240 - 2.257
CONTAINS 23.474%

talk . They talked and talked and soon enough the truth
. They put on their shoes and jackets and go to
The family returned to the structure with happy smiles and loving
few of the smallest ones to put on her shelf .
Tom , and he ran out of the kitchen .

INTERVAL -1.777 - 0.240
CONTAINS 76.394%

. She was happy that she could help someone in need
her drink . Lily was a reliable friend . She always
thought the spear would be useful . She held the spear
said , " L ily , you can 't keep the
time , there was a hero . He was brave and

INTERVAL -3.795 - -1.777
CONTAINS 0.050%

the tooth and recorded a copy of the song .
outside in the woods and follow the trail of leaves .
to be back in the quiet , blue sky . She
upon a time there were two friends called Sam and Jack
the woods and noticed how quiet it was . The silence
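
The four INTERVAL blocks above appear to be equal-width bins spanning MIN to MAX. This is inferred from the printed bounds (-3.795, -1.777, 0.240, 2.257, 4.275) and is not stated by the dump itself. Below is a minimal sketch of how such a summary could be computed, assuming NumPy and a flat array of per-token activations. The function name `interval_summary` is hypothetical and is not taken from the tool that produced this output.

```python
import numpy as np


def interval_summary(activations, n_bins=4):
    """Split [min, max] into equal-width intervals and return
    (lower, upper, percent_of_tokens) rows, highest interval first."""
    acts = np.asarray(activations, dtype=float)
    # n_bins equal-width intervals need n_bins + 1 edges.
    edges = np.linspace(acts.min(), acts.max(), n_bins + 1)
    # np.histogram treats the last bin as closed, so the max value is counted.
    counts, _ = np.histogram(acts, bins=edges)
    shares = 100.0 * counts / acts.size
    return [
        (float(edges[i]), float(edges[i + 1]), float(shares[i]))
        for i in reversed(range(n_bins))
    ]


# Edges implied by the reported MIN and MAX (assumption: equal-width bins).
reported_edges = np.linspace(-3.795, 4.275, 5)

# Toy data for illustration only; these are not activations from the dump.
toy_rows = interval_summary([0, 1, 2, 3, 4, 4, 4, 4])
```

Under this assumption the computed edges agree with the printed interval bounds to within rounding, and the four CONTAINS percentages (0.082, 23.474, 76.394, 0.050) sum to 100.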
Token the
Token ID262
Feature activation+1.85e-01

Pos logit contributions
uddled+0.054
uffed+0.051
ist+0.050
arest+0.049
ows+0.048
Neg logit contributions
 his-0.023
 night-0.017
 day-0.015
 end-0.014
 morning-0.014
Token cars
Token ID5006
Feature activation+3.56e-01

Pos logit contributions
 his+0.013
 day+0.012
 night+0.012
 end+0.012
 morning+0.011
Neg logit contributions
uddled-0.026
uffed-0.025
ist-0.024
itting-0.023
ended-0.023
Token stopped
Token ID5025
Feature activation-9.27e-01

Pos logit contributions
 his+0.029
 night+0.023
 day+0.022
 end+0.021
 him+0.019
Neg logit contributions
arest-0.047
uddled-0.046
uffed-0.045
itting-0.045
ider-0.045
Token and
Token ID290
Feature activation-2.54e-02

Pos logit contributions
uddled+0.141
uffed+0.133
arest+0.129
ows+0.120
ider+0.119
Neg logit contributions
 his-0.076
 night-0.051
 home-0.048
 him-0.048
 himself-0.048
Token the
Token ID262
Feature activation+1.62e-01

Pos logit contributions
uddled+0.004
arest+0.003
itting+0.003
ower+0.003
uffed+0.003
Neg logit contributions
 his-0.002
 night-0.002
 himself-0.001
 day-0.001
 him-0.001
Token pedestrian
Token ID22382
Feature activation+4.27e+00

Pos logit contributions
 his+0.012
 night+0.010
 day+0.010
 end+0.010
 jungle+0.009
Neg logit contributions
uddled-0.023
uffed-0.022
ist-0.020
itting-0.020
arest-0.020
Token light
Token ID1657
Feature activation-5.00e-01

Pos logit contributions
 his+0.341
 night+0.260
 end+0.246
 day+0.241
 himself+0.228
Neg logit contributions
uddled-0.612
uffed-0.577
op-0.541
arest-0.536
ve-0.530
Token turned
Token ID2900
Feature activation-8.65e-01

Pos logit contributions
arest+0.068
uddled+0.067
uffed+0.066
ve+0.064
ider+0.063
Neg logit contributions
 his-0.039
 night-0.031
 day-0.030
 way-0.027
 food-0.026
Token green
Token ID4077
Feature activation+5.67e-01

Pos logit contributions
uddled+0.118
arest+0.113
uffed+0.109
ider+0.108
ute+0.106
Neg logit contributions
 his-0.073
 night-0.052
 the-0.046
 day-0.046
 him-0.046
Token.
Token ID13
Feature activation-2.12e-02

Pos logit contributions
 his+0.041
 night+0.033
 day+0.031
 food+0.029
 jungle+0.029
Neg logit contributions
uddled-0.081
uffed-0.079
arest-0.077
ve-0.074
outing-0.073
Token Then
Token ID3244
Feature activation-1.14e-01

Pos logit contributions
uddled+0.003
uffed+0.003
ist+0.003
arest+0.003
itting+0.003
Neg logit contributions
 his-0.001
 night-0.001
 day-0.001
 himself-0.001
 restaurants-0.001
Token you
Token ID345
Feature activation-8.48e-01

Pos logit contributions
uddled+0.082
uffed+0.080
op+0.074
ist+0.073
itting+0.073
Neg logit contributions
 his-0.046
 day-0.031
 night-0.031
 himself-0.029
 way-0.029
Token love
Token ID1842
Feature activation+3.14e-02

Pos logit contributions
uddled+0.123
uffed+0.116
ist+0.107
ower+0.107
outing+0.106
Neg logit contributions
 his-0.063
 night-0.051
 end-0.047
 restaurants-0.046
 day-0.046
Token are
Token ID389
Feature activation-3.92e-01

Pos logit contributions
 his+0.002
 night+0.002
 day+0.002
 end+0.002
 himself+0.002
Neg logit contributions
uddled-0.005
uffed-0.005
arest-0.004
ider-0.004
outing-0.004
Token all
Token ID477
Feature activation-2.46e-01

Pos logit contributions
uddled+0.058
uffed+0.055
itting+0.051
ist+0.051
arest+0.050
Neg logit contributions
 his-0.029
 night-0.021
 day-0.021
 himself-0.019
 meal-0.019
Token important
Token ID1593
Feature activation-1.76e-01

Pos logit contributions
uddled+0.033
uffed+0.031
ist+0.030
itting+0.029
arest+0.029
Neg logit contributions
 his-0.019
 night-0.015
 restaurants-0.014
 day-0.014
 himself-0.013
Token factors
Token ID5087
Feature activation+4.24e+00

Pos logit contributions
uddled+0.024
uffed+0.024
arest+0.022
ist+0.022
itting+0.022
Neg logit contributions
 his-0.014
 night-0.010
 himself-0.010
 day-0.010
 restaurants-0.009
Token which
Token ID543
Feature activation+9.94e-02

Pos logit contributions
 his+0.337
 night+0.232
 day+0.223
 himself+0.218
 end+0.211
Neg logit contributions
uddled-0.635
uffed-0.609
arest-0.550
ve-0.545
ider-0.545
Token can
Token ID460
Feature activation+9.31e-03

Pos logit contributions
 his+0.007
 night+0.005
 day+0.005
 restaurants+0.005
 meal+0.005
Neg logit contributions
uddled-0.015
uffed-0.014
ist-0.014
itting-0.013
arest-0.013
Token make
Token ID787
Feature activation-3.52e-01

Pos logit contributions
 his+0.001
 end+0.000
 himself+0.000
 search+0.000
 night+0.000
Neg logit contributions
uddled-0.002
uffed-0.001
itting-0.001
outing-0.001
ower-0.001
Token life
Token ID1204
Feature activation-1.48e+00

Pos logit contributions
uddled+0.049
uffed+0.046
ider+0.044
ist+0.042
itting+0.042
Neg logit contributions
 his-0.030
 night-0.022
 restaurants-0.021
 day-0.021
 himself-0.021
Token more
Token ID517
Feature activation+7.13e-01

Pos logit contributions
uddled+0.219
uffed+0.210
itting+0.194
ist+0.193
ider+0.192
Neg logit contributions
 his-0.105
 night-0.077
 restaurants-0.072
 day-0.071
 meal-0.069
Token "
Token ID366
Feature activation-8.70e-01

Pos logit contributions
 his+0.007
 night+0.005
 restaurants+0.005
 himself+0.005
 day+0.005
Neg logit contributions
uddled-0.013
uffed-0.012
ist-0.012
arest-0.012
itting-0.012
TokenWe
Token ID1135
Feature activation+6.05e-01

Pos logit contributions
uddled+0.103
uffed+0.095
arest+0.093
ist+0.093
itting+0.091
Neg logit contributions
 his-0.079
 restaurants-0.063
 night-0.062
 meal-0.060
 peanuts-0.059
Token are
Token ID389
Feature activation-1.10e-01

Pos logit contributions
 his+0.056
 night+0.044
 end+0.042
 day+0.041
 restaurants+0.041
Neg logit contributions
uddled-0.079
uffed-0.072
ower-0.067
outing-0.067
ist-0.066
Token the
Token ID262
Feature activation+2.00e-01

Pos logit contributions
uddled+0.012
uffed+0.012
ist+0.011
ve+0.011
arest+0.011
Neg logit contributions
 his-0.009
 restaurants-0.008
 night-0.008
 agreement-0.007
 meal-0.007
Token weapon
Token ID4282
Feature activation-5.23e-02

Pos logit contributions
 his+0.019
 night+0.015
 restaurants+0.015
 day+0.014
 himself+0.014
Neg logit contributions
uddled-0.024
uffed-0.022
ist-0.021
itting-0.021
arest-0.021
Token creators
Token ID16294
Feature activation+3.96e+00

Pos logit contributions
uddled+0.007
uffed+0.007
itting+0.006
outing+0.006
arest+0.006
Neg logit contributions
 his-0.004
 night-0.003
 day-0.003
 dinosaurs-0.003
 bear-0.003
Token!
Token ID0
Feature activation-1.69e-01

Pos logit contributions
 his+0.335
 night+0.268
 day+0.255
 end+0.255
 bear+0.236
Neg logit contributions
uddled-0.538
uffed-0.517
outing-0.471
arest-0.470
itting-0.470
Token We
Token ID775
Feature activation-4.54e-02

Pos logit contributions
uddled+0.021
uffed+0.019
ist+0.018
ve+0.018
itting+0.018
Neg logit contributions
 his-0.016
 night-0.013
 himself-0.012
 day-0.012
 restaurants-0.012
Token can
Token ID460
Feature activation+1.91e-01

Pos logit contributions
uddled+0.006
uffed+0.006
ower+0.005
outing+0.005
ended+0.005
Neg logit contributions
 his-0.004
 night-0.003
 end-0.003
 search-0.003
 day-0.003
Token stop
Token ID2245
Feature activation-9.98e-01

Pos logit contributions
 his+0.013
 end+0.009
 himself+0.008
 night+0.008
 search+0.008
Neg logit contributions
uddled-0.030
uffed-0.028
itting-0.027
outing-0.026
ist-0.026
Token the
Token ID262
Feature activation+1.59e-01

Pos logit contributions
uddled+0.134
uffed+0.125
ist+0.119
itting+0.118
arest+0.118
Neg logit contributions
 his-0.084
 night-0.063
 restaurants-0.059
 himself-0.059
 day-0.059
Token!
Token ID0
Feature activation+1.43e-01

Pos logit contributions
 his+0.129
 night+0.107
 saf+0.105
 bears+0.101
 restaurants+0.100
Neg logit contributions
uddled-0.169
uffed-0.158
ended-0.151
arest-0.149
ider-0.147
Token AND
Token ID5357
Feature activation+2.08e+00

Pos logit contributions
 his+0.009
 night+0.007
 day+0.007
 Lily+0.007
 restaurants+0.007
Neg logit contributions
uddled-0.020
uffed-0.019
ist-0.019
arest-0.018
outing-0.018
Token YOU
Token ID7013
Feature activation+4.54e-01

Pos logit contributions
 his+0.168
 end+0.148
 day+0.148
 the+0.147
 night+0.141
Neg logit contributions
uddled-0.246
ider-0.225
uffed-0.220
ist-0.219
ended-0.218
Token ARE
Token ID15986
Feature activation-1.52e-01

Pos logit contributions
 his+0.040
 night+0.032
 end+0.031
 saf+0.031
 battle+0.030
Neg logit contributions
uddled-0.059
uffed-0.054
ended-0.052
ower-0.051
itting-0.050
Token E
Token ID412
Feature activation-4.50e-01

Pos logit contributions
uddled+0.016
uffed+0.015
ist+0.014
itting+0.014
arest+0.014
Neg logit contributions
 his-0.016
 night-0.013
 restaurants-0.013
 himself-0.013
 day-0.013
TokenATING
Token ID33881
Feature activation+3.55e+00

Pos logit contributions
uddled+0.043
uffed+0.042
outing+0.036
ended+0.035
ist+0.033
Neg logit contributions
 his-0.051
 night-0.048
 day-0.046
 morning-0.044
 restaurants-0.043
Token ALL
Token ID11096
Feature activation+1.37e+00

Pos logit contributions
 his+0.345
 night+0.277
 saf+0.273
 end+0.272
 home+0.271
Neg logit contributions
uddled-0.392
uffed-0.354
itting-0.332
ist-0.331
outing-0.330
Token THE
Token ID3336
Feature activation+2.27e+00

Pos logit contributions
 his+0.120
 night+0.097
 day+0.093
 morning+0.087
 end+0.086
Neg logit contributions
uddled-0.187
uffed-0.169
ider-0.164
itting-0.157
ist-0.157
Token S
Token ID311
Feature activation-2.06e-01

Pos logit contributions
 his+0.211
 night+0.188
 day+0.183
 morning+0.173
 restaurants+0.172
Neg logit contributions
uddled-0.253
uffed-0.237
ist-0.229
outing-0.226
arest-0.225
TokenUG
Token ID7340
Feature activation+2.67e+00

Pos logit contributions
uffed+0.020
uddled+0.020
outing+0.018
arest+0.017
ended+0.017
Neg logit contributions
 his-0.021
 night-0.019
 end-0.018
 day-0.018
 morning-0.018
TokenAR
Token ID1503
Feature activation-5.60e-01

Pos logit contributions
 his+0.239
 night+0.185
 day+0.177
 jungle+0.176
 himself+0.175
Neg logit contributions
uddled-0.327
uffed-0.310
arest-0.305
itting-0.291
outing-0.291
Token dragon
Token ID10441
Feature activation-1.85e-01

Pos logit contributions
 his+0.011
 restaurants+0.009
 night+0.008
 himself+0.008
 agreement+0.008
Neg logit contributions
uddled-0.019
uffed-0.018
ist-0.017
arest-0.017
itting-0.017
Token with
Token ID351
Feature activation-2.20e-01

Pos logit contributions
uddled+0.023
uffed+0.022
ist+0.021
itting+0.021
arest+0.020
Neg logit contributions
 his-0.016
 night-0.012
 restaurants-0.012
 day-0.012
 meal-0.011
Token my
Token ID616
Feature activation-6.12e-01

Pos logit contributions
uddled+0.030
uffed+0.029
ist+0.028
arest+0.027
itting+0.027
Neg logit contributions
 his-0.016
 night-0.012
 restaurants-0.011
 day-0.011
 meal-0.011
Token stick
Token ID4859
Feature activation-3.11e-01

Pos logit contributions
uddled+0.078
uffed+0.074
ist+0.072
arest+0.070
ve+0.070
Neg logit contributions
 his-0.047
 restaurants-0.037
 night-0.036
 himself-0.033
 agreement-0.032
Token-
Token ID12
Feature activation+8.44e-01

Pos logit contributions
uddled+0.045
uffed+0.043
itting+0.040
arest+0.040
ended+0.039
Neg logit contributions
 his-0.024
 night-0.017
 day-0.016
 himself-0.016
 end-0.015
Tokensword
Token ID30553
Feature activation+3.54e+00

Pos logit contributions
 his+0.094
 night+0.077
 restaurants+0.075
 day+0.074
 himself+0.074
Neg logit contributions
uddled-0.086
uffed-0.079
ist-0.075
itting-0.073
arest-0.072
Token."
Token ID526
Feature activation+3.60e-01

Pos logit contributions
 his+0.270
 night+0.195
 day+0.184
 way+0.172
 himself+0.170
Neg logit contributions
uddled-0.513
uffed-0.486
ended-0.473
arest-0.473
itting-0.467
Token He
Token ID679
Feature activation-5.95e-02

Pos logit contributions
 his+0.029
 meal+0.027
 restaurants+0.027
 night+0.026
 battle+0.026
Neg logit contributions
uddled-0.041
uffed-0.040
op-0.037
ows-0.035
Because-0.035
Token swings
Token ID26728
Feature activation+2.26e-01

Pos logit contributions
uddled+0.007
uffed+0.006
ist+0.006
itting+0.006
arest+0.006
Neg logit contributions
 his-0.006
 night-0.005
 restaurants-0.005
 day-0.004
 himself-0.004
Token the
Token ID262
Feature activation+1.36e-01

Pos logit contributions
 his+0.011
 restaurants+0.009
 meal+0.008
 night+0.008
 peanuts+0.007
Neg logit contributions
uddled-0.034
uffed-0.032
ist-0.031
ows-0.030
op-0.030
Token stick
Token ID4859
Feature activation-2.69e-01

Pos logit contributions
 his+0.012
 restaurants+0.010
 night+0.009
 himself+0.009
 agreement+0.009
Neg logit contributions
uddled-0.016
uffed-0.015
ist-0.014
arest-0.014
op-0.014
Token "
Token ID366
Feature activation-7.87e-01

Pos logit contributions
uddled+0.045
uffed+0.044
itting+0.042
ower+0.042
outing+0.041
Neg logit contributions
 his-0.027
 night-0.019
 day-0.017
 himself-0.017
 him-0.016
TokenZE
Token ID21211
Feature activation+4.84e-01

Pos logit contributions
uddled+0.097
uffed+0.093
ows+0.085
ist+0.085
outing+0.084
Neg logit contributions
 his-0.073
 night-0.059
 day-0.056
 himself-0.055
 morning-0.052
TokenB
Token ID33
Feature activation+1.54e+00

Pos logit contributions
 his+0.047
 night+0.036
 day+0.036
 himself+0.034
 restaurants+0.034
Neg logit contributions
uddled-0.059
uffed-0.056
itting-0.052
outing-0.052
ider-0.051
TokenRA
Token ID3861
Feature activation+8.69e-02

Pos logit contributions
 his+0.153
 night+0.126
 day+0.125
 himself+0.124
 saf+0.122
Neg logit contributions
uddled-0.164
uffed-0.152
arest-0.141
outing-0.137
ist-0.136
TokenK
Token ID42
Feature activation+1.96e-01

Pos logit contributions
 his+0.009
 night+0.008
 restaurants+0.007
 saf+0.007
 day+0.007
Neg logit contributions
uddled-0.009
uffed-0.008
ist-0.008
itting-0.008
arest-0.008
TokenILL
Token ID8267
Feature activation+3.51e+00

Pos logit contributions
 his+0.019
 night+0.016
 day+0.015
 saf+0.015
 restaurants+0.014
Neg logit contributions
uddled-0.023
uffed-0.021
arest-0.020
ist-0.019
ows-0.019
TokenER
Token ID1137
Feature activation-7.98e-01

Pos logit contributions
 his+0.302
 night+0.222
 himself+0.215
 day+0.210
 restaurants+0.204
Neg logit contributions
uddled-0.464
uffed-0.431
ist-0.413
arest-0.409
itting-0.408
Token"
Token ID1
Feature activation-2.31e-01

Pos logit contributions
uddled+0.096
itting+0.091
uffed+0.090
ist+0.088
outing+0.088
Neg logit contributions
 his-0.076
 himself-0.058
 night-0.053
 day-0.053
 peanuts-0.051
Token and
Token ID290
Feature activation-1.43e-02

Pos logit contributions
uddled+0.032
uffed+0.031
outing+0.029
arest+0.029
ist+0.029
Neg logit contributions
 his-0.020
 himself-0.015
 night-0.014
 day-0.014
 morning-0.013
Token his
Token ID465
Feature activation+2.88e-01

Pos logit contributions
uddled+0.002
arest+0.002
itting+0.002
uffed+0.002
outing+0.002
Neg logit contributions
 his-0.001
 night-0.001
 himself-0.001
 day-0.001
 restaurants-0.001
Token elephant
Token ID20950
Feature activation-1.97e-01

Pos logit contributions
 his+0.022
 night+0.016
 day+0.015
 jungle+0.015
 saf+0.015
Neg logit contributions
uddled-0.041
uffed-0.039
ist-0.037
itting-0.036
arest-0.036
Token
Token ID95
Feature activation+2.17e+00

Pos logit contributions
 his+0.022
 himself+0.018
 morning+0.018
 end+0.018
 saf+0.017
Neg logit contributions
uddled-0.024
uffed-0.021
ist-0.021
as-0.021
itting-0.020
Token
Token ID26391
Feature activation+1.05e+00

Pos logit contributions
 his+0.197
 himself+0.154
 night+0.153
 jungle+0.150
 morning+0.150
Neg logit contributions
uddled-0.262
uffed-0.239
ist-0.221
itting-0.220
ve-0.218
Token
Token ID129
Feature activation+2.97e+00

Pos logit contributions
 his+0.102
 restaurants+0.087
 jungle+0.084
 night+0.083
 day+0.081
Neg logit contributions
uddled-0.115
ist-0.106
uffed-0.103
outing-0.100
arest-0.099
Token
Token ID241
Feature activation+1.35e+00

Pos logit contributions
 his+0.294
 himself+0.226
 night+0.225
 day+0.217
 agreement+0.216
Neg logit contributions
uddled-0.346
uffed-0.316
ist-0.306
itting-0.300
arest-0.296
TokenC
Token ID34
Feature activation+1.62e+00

Pos logit contributions
 his+0.136
 day+0.109
 himself+0.107
 night+0.104
 morning+0.099
Neg logit contributions
uddled-0.167
uffed-0.155
ist-0.155
ows-0.144
ushing-0.142
Tokenleaning
Token ID25909
Feature activation+3.45e+00

Pos logit contributions
 his+0.204
 night+0.182
 end+0.177
 day+0.175
 morning+0.175
Neg logit contributions
uddled-0.125
uffed-0.116
ider-0.101
ist-0.101
arest-0.098
Token is
Token ID318
Feature activation+2.59e-01

Pos logit contributions
 his+0.297
 himself+0.217
 night+0.211
 him+0.200
 day+0.194
Neg logit contributions
uddled-0.489
ider-0.477
uttering-0.452
ended-0.440
itting-0.436
Token important
Token ID1593
Feature activation+2.45e-01

Pos logit contributions
 his+0.020
 himself+0.013
 day+0.012
 night+0.012
 food+0.012
Neg logit contributions
uddled-0.040
uffed-0.038
itting-0.037
arest-0.036
ist-0.036
Token,
Token ID11
Feature activation+2.44e-01

Pos logit contributions
 his+0.019
 day+0.013
 night+0.013
 himself+0.013
 food+0.012
Neg logit contributions
uddled-0.037
uffed-0.036
arest-0.033
outing-0.033
itting-0.033
Token and
Token ID290
Feature activation+3.49e-02

Pos logit contributions
 his+0.020
 day+0.014
 night+0.013
 himself+0.013
 peanuts+0.012
Neg logit contributions
uddled-0.036
uffed-0.034
arest-0.033
ist-0.032
itting-0.032
Token I
Token ID314
Feature activation+1.82e-01

Pos logit contributions
 his+0.003
 day+0.002
 night+0.002
 himself+0.002
 restaurants+0.002
Neg logit contributions
uddled-0.005
uffed-0.005
ist-0.004
itting-0.004
ows-0.004
Token stepped
Token ID10764
Feature activation-8.01e-01

Pos logit contributions
 his+0.063
 end+0.049
 night+0.049
 search+0.046
 forest+0.046
Neg logit contributions
uddled-0.111
uffed-0.106
ist-0.105
arest-0.104
ider-0.103
Token into
Token ID656
Feature activation+5.65e-01

Pos logit contributions
uddled+0.103
uffed+0.096
ist+0.093
arest+0.091
itting+0.091
Neg logit contributions
 his-0.068
 night-0.052
 restaurants-0.050
 day-0.049
 himself-0.049
Token an
Token ID281
Feature activation+4.35e-01

Pos logit contributions
 his+0.038
 night+0.034
 restaurants+0.033
 day+0.033
 jungle+0.032
Neg logit contributions
uddled-0.066
uffed-0.062
ist-0.061
ve-0.061
op-0.060
Token ant
Token ID1885
Feature activation+1.39e-01

Pos logit contributions
 his+0.040
 meal+0.037
 restaurants+0.036
 meals+0.035
 salads+0.035
Neg logit contributions
ist-0.034
uddled-0.033
ushing-0.032
a-0.032
ows-0.031
Token-
Token ID12
Feature activation+9.56e-01

Pos logit contributions
 his+0.011
 night+0.009
 restaurants+0.008
 day+0.008
 jungle+0.008
Neg logit contributions
uddled-0.019
uffed-0.017
ist-0.017
arest-0.016
itting-0.016
Tokenhill
Token ID12639
Feature activation+3.43e+00

Pos logit contributions
 his+0.100
 restaurants+0.085
 meal+0.084
 night+0.084
 agreement+0.083
Neg logit contributions
uddled-0.098
uffed-0.090
ist-0.088
itting-0.085
op-0.085
Token and
Token ID290
Feature activation-1.82e-01

Pos logit contributions
 his+0.186
 him+0.090
 himself+0.087
 night+0.076
 day+0.073
Neg logit contributions
uddled-0.572
arest-0.560
uffed-0.554
keys-0.514
ve-0.509
Token could
Token ID714
Feature activation-5.91e-02

Pos logit contributions
uddled+0.022
uffed+0.021
ist+0.020
itting+0.020
arest+0.020
Neg logit contributions
 his-0.018
 night-0.013
 himself-0.012
 restaurants-0.012
 Freddy-0.012
Token not
Token ID407
Feature activation+1.52e-01

Pos logit contributions
uddled+0.008
uffed+0.008
ist+0.008
itting+0.007
arest+0.007
Neg logit contributions
 his-0.005
 search-0.004
 himself-0.004
 end-0.003
 Freddy-0.003
Token get
Token ID651
Feature activation-2.94e-02

Pos logit contributions
 his+0.011
 night+0.009
 restaurants+0.009
 peanuts+0.008
 day+0.008
Neg logit contributions
uddled-0.019
ist-0.018
uffed-0.018
arest-0.018
ve-0.018
Token out
Token ID503
Feature activation+1.93e-01

Pos logit contributions
uddled+0.004
uffed+0.003
ve+0.003
ist+0.003
a+0.003
Neg logit contributions
 his-0.002
 restaurants-0.002
 night-0.002
 agreement-0.002
 end-0.002
Token.
Token ID13
Feature activation+5.97e-02

Pos logit contributions
uddled+0.061
uffed+0.058
itting+0.054
arest+0.054
ve+0.054
Neg logit contributions
 his-0.031
 night-0.022
 day-0.022
 end-0.021
 himself-0.021
Token They
Token ID1119
Feature activation+4.59e-01

Pos logit contributions
 his+0.004
 night+0.003
 restaurants+0.003
 day+0.003
 himself+0.003
Neg logit contributions
uddled-0.009
uffed-0.008
ist-0.008
arest-0.008
ve-0.008
Token say
Token ID910
Feature activation-5.83e-02

Pos logit contributions
 his+0.032
 night+0.023
 end+0.022
 search+0.022
 restaurants+0.021
Neg logit contributions
uddled-0.072
uffed-0.066
itting-0.064
ower-0.063
ended-0.063
Token "
Token ID366
Feature activation-8.55e-01

Pos logit contributions
uddled+0.008
uffed+0.007
ist+0.007
itting+0.007
arest+0.007
Neg logit contributions
 his-0.005
 night-0.003
 restaurants-0.003
 day-0.003
 himself-0.003
Tokenm
Token ID76
Feature activation+5.05e-02

Pos logit contributions
uddled+0.099
uffed+0.093
ist+0.087
itting+0.084
arest+0.084
Neg logit contributions
 his-0.083
 night-0.068
 restaurants-0.065
 day-0.065
 himself-0.063
Tokenarch
Token ID998
Feature activation+3.38e+00

Pos logit contributions
 his+0.004
 night+0.003
 restaurants+0.003
 himself+0.003
 day+0.003
Neg logit contributions
uddled-0.007
uffed-0.006
ist-0.006
arest-0.006
itting-0.006
Token,
Token ID11
Feature activation+1.26e-01

Pos logit contributions
 his+0.289
 end+0.219
 night+0.215
 day+0.208
 him+0.197
Neg logit contributions
uddled-0.462
uffed-0.429
ended-0.399
ider-0.398
arest-0.398
Token march
Token ID9960
Feature activation-2.55e-01

Pos logit contributions
 his+0.010
 night+0.007
 restaurants+0.007
 day+0.007
 end+0.007
Neg logit contributions
uddled-0.017
uffed-0.016
ist-0.016
arest-0.015
itting-0.015
Token,
Token ID11
Feature activation+1.16e-01

Pos logit contributions
uddled+0.036
uffed+0.034
ended+0.032
arest+0.032
outing+0.031
Neg logit contributions
 his-0.021
 night-0.016
 day-0.015
 end-0.015
 home-0.014
Token march
Token ID9960
Feature activation-2.64e-01

Pos logit contributions
 his+0.008
 night+0.006
 end+0.006
 day+0.006
 restaurants+0.006
Neg logit contributions
uddled-0.017
uffed-0.016
ist-0.015
arest-0.015
itting-0.015
Token"
Token ID1
Feature activation-2.92e-01

Pos logit contributions
uddled+0.036
uffed+0.035
ended+0.033
arest+0.032
outing+0.032
Neg logit contributions
 his-0.022
 night-0.017
 day-0.016
 end-0.016
 home-0.015
Token.
Token ID13
Feature activation-8.71e-02

Pos logit contributions
uddled+0.000
uffed+0.000
arest+0.000
itting+0.000
ist+0.000
Neg logit contributions
 his-0.000
 day-0.000
 night-0.000
 way-0.000
 himself-0.000
Token She
Token ID1375
Feature activation-4.93e-01

Pos logit contributions
uddled+0.010
uffed+0.009
ist+0.009
itting+0.009
arest+0.009
Neg logit contributions
 his-0.008
 night-0.007
 restaurants-0.006
 day-0.006
 meal-0.006
Token started
Token ID2067
Feature activation-8.01e-03

Pos logit contributions
uddled+0.063
uffed+0.062
itting+0.062
arest+0.062
ower+0.061
Neg logit contributions
 his-0.045
 night-0.032
 day-0.030
 himself-0.029
 end-0.029
Token to
Token ID284
Feature activation+7.29e-02

Pos logit contributions
uddled+0.001
uffed+0.001
ist+0.001
itting+0.001
op+0.001
Neg logit contributions
 his-0.001
 night-0.001
 restaurants-0.001
 meal-0.000
 day-0.000
Token "
Token ID366
Feature activation-1.02e+00

Pos logit contributions
 his+0.006
 night+0.004
 restaurants+0.004
 himself+0.004
 day+0.004
Neg logit contributions
uddled-0.010
uffed-0.009
ist-0.009
arest-0.009
itting-0.009
Tokencompl
Token ID23855
Feature activation+3.35e+00

Pos logit contributions
uddled+0.114
uffed+0.106
ist+0.101
arest+0.098
itting+0.098
Neg logit contributions
 his-0.102
 night-0.082
 restaurants-0.080
 day-0.078
 himself-0.078
Tokenain
Token ID391
Feature activation-4.74e-01

Pos logit contributions
 his+0.346
 night+0.279
 himself+0.273
 him+0.260
 restaurants+0.260
Neg logit contributions
uddled-0.371
uffed-0.327
ider-0.321
arest-0.321
ist-0.307
Token"
Token ID1
Feature activation-5.96e-01

Pos logit contributions
uddled+0.061
uffed+0.057
arest+0.054
itting+0.053
ider+0.052
Neg logit contributions
 his-0.046
 himself-0.034
 bears-0.032
 night-0.031
 bear-0.031
Token that
Token ID326
Feature activation-7.32e-01

Pos logit contributions
uddled+0.082
uffed+0.080
arest+0.076
itting+0.075
ist+0.074
Neg logit contributions
 his-0.047
 day-0.036
 night-0.036
 himself-0.034
 morning-0.031
Token it
Token ID340
Feature activation-3.05e-01

Pos logit contributions
uddled+0.101
uffed+0.094
arest+0.088
ider+0.087
ist+0.086
Neg logit contributions
 his-0.067
 day-0.051
 night-0.050
 end-0.047
 way-0.046
Token wasn
Token ID2492
Feature activation-5.84e-01

Pos logit contributions
uddled+0.043
arest+0.042
uffed+0.041
uttering+0.040
itting+0.039
Neg logit contributions
 his-0.026
 day-0.018
 night-0.017
 end-0.017
 one-0.016
Token.
Token ID13
Feature activation+1.47e-01

Pos logit contributions
uddled+0.015
uffed+0.014
arest+0.014
itting+0.013
op+0.013
Neg logit contributions
 his-0.008
 himself-0.005
 night-0.005
 day-0.005
 him-0.005
Token
Token ID220
Feature activation+6.59e-01

Pos logit contributions
 his+0.010
 restaurants+0.008
 meal+0.008
 night+0.008
 bears+0.008
Neg logit contributions
uddled-0.020
uffed-0.019
itting-0.018
op-0.018
ower-0.017
Token
Token ID198
Feature activation+2.18e-01

Pos logit contributions
 his+0.039
 himself+0.025
 him+0.019
 night+0.019
 Freddy+0.019
Neg logit contributions
uddled-0.104
ist-0.100
arest-0.098
ider-0.095
uffed-0.094
Token
Token ID198
Feature activation+2.86e-01

Pos logit contributions
 his+0.017
 restaurants+0.014
 night+0.014
 meal+0.013
 peanuts+0.013
Neg logit contributions
uddled-0.028
uffed-0.026
itting-0.025
ist-0.025
arest-0.025
TokenAlt
Token ID29161
Feature activation+1.26e+00

Pos logit contributions
 his+0.030
 night+0.025
 restaurants+0.025
 meal+0.023
 peanuts+0.023
Neg logit contributions
uddled-0.030
uffed-0.028
ist-0.026
itting-0.026
arest-0.026
Tokenogether
Token ID8236
Feature activation+3.34e+00

Pos logit contributions
 his+0.123
 night+0.103
 day+0.101
 himself+0.100
 end+0.098
Neg logit contributions
uddled-0.141
uffed-0.132
ist-0.123
ve-0.121
arest-0.120
Token,
Token ID11
Feature activation+2.99e-01

Pos logit contributions
 his+0.279
 himself+0.177
 night+0.176
 day+0.169
 bear+0.166
Neg logit contributions
uddled-0.473
arest-0.459
itting-0.451
uffed-0.434
outing-0.431
Token Sam
Token ID3409
Feature activation-4.14e-01

Pos logit contributions
 his+0.020
 night+0.012
 day+0.012
 himself+0.012
 end+0.011
Neg logit contributions
uddled-0.048
uffed-0.045
arest-0.044
ist-0.043
ider-0.043
Token and
Token ID290
Feature activation+1.00e-01

Pos logit contributions
uddled+0.056
ower+0.054
arest+0.054
itting+0.054
ider+0.053
Neg logit contributions
 his-0.038
 night-0.027
 day-0.026
 himself-0.025
 way-0.025
Token Mom
Token ID11254
Feature activation-2.96e-01

Pos logit contributions
 restaurants+0.003
 pizz+0.003
 jungle+0.003
 saf+0.003
 peanuts+0.003
Neg logit contributions
uddled-0.014
uffed-0.014
ist-0.013
op-0.013
a-0.013
Tokenmy
Token ID1820
Feature activation+7.03e-01

Pos logit contributions
uddled+0.056
arest+0.053
uffed+0.053
ider+0.053
ower+0.051
Neg logit contributions
 his-0.009
 night-0.004
 day-0.003
 end-0.001
 morning-0.001
Token mysterious
Token ID11428
Feature activation-2.62e-01

Pos logit contributions
 his+0.001
 night+0.001
 day+0.001
 himself+0.001
 restaurants+0.001
Neg logit contributions
uddled-0.002
uffed-0.002
ist-0.002
itting-0.002
arest-0.002
Token.
Token ID13
Feature activation+1.91e-01

Pos logit contributions
uddled+0.038
uffed+0.036
arest+0.034
itting+0.034
ve+0.034
Neg logit contributions
 his-0.020
 night-0.015
 day-0.014
 himself-0.014
 food-0.013
Token
Token ID220
Feature activation+5.39e-01

Pos logit contributions
 his+0.014
 night+0.010
 restaurants+0.010
 end+0.010
 day+0.010
Neg logit contributions
uddled-0.026
uffed-0.025
ist-0.024
itting-0.023
arest-0.023
Token
Token ID198
Feature activation+1.98e-01

Pos logit contributions
 his+0.034
 himself+0.023
 night+0.020
 restaurants+0.019
 day+0.019
Neg logit contributions
uddled-0.083
ist-0.078
uffed-0.077
arest-0.076
ider-0.074
Token
Token ID198
Feature activation+2.82e-01

Pos logit contributions
 his+0.014
 night+0.010
 himself+0.009
 restaurants+0.009
 day+0.009
Neg logit contributions
uddled-0.029
uffed-0.027
ist-0.026
itting-0.026
arest-0.025
TokenTalking
Token ID45904
Feature activation+3.32e+00

Pos logit contributions
 his+0.031
 himself+0.025
 night+0.024
 restaurants+0.024
 end+0.024
Neg logit contributions
uddled-0.030
uffed-0.028
ist-0.027
itting-0.025
ower-0.025
Token to
Token ID284
Feature activation+4.25e-01

Pos logit contributions
 his+0.281
 night+0.220
 day+0.197
 himself+0.193
 morning+0.186
Neg logit contributions
uddled-0.460
uffed-0.423
arest-0.421
ower-0.410
ider-0.405
Token himself
Token ID2241
Feature activation+7.49e-01

Pos logit contributions
 his+0.022
 night+0.014
 restaurants+0.014
 peanuts+0.013
 meal+0.013
Neg logit contributions
uddled-0.066
uffed-0.063
ist-0.062
arest-0.060
itting-0.060
Token,
Token ID11
Feature activation+2.70e-01

Pos logit contributions
 his+0.051
 himself+0.031
 day+0.027
 night+0.026
 home+0.025
Neg logit contributions
uddled-0.126
uffed-0.115
ower-0.112
itting-0.111
ist-0.111
Token the
Token ID262
Feature activation+4.22e-01

Pos logit contributions
 his+0.016
 himself+0.008
 night+0.008
 day+0.007
 end+0.006
Neg logit contributions
uddled-0.046
uffed-0.043
ist-0.043
arest-0.042
itting-0.042
Token tiger
Token ID26241
Feature activation+9.78e-01

Pos logit contributions
 his+0.026
 night+0.018
 restaurants+0.017
 day+0.016
 himself+0.016
Neg logit contributions
uddled-0.064
uffed-0.061
ist-0.059
itting-0.058
arest-0.057
Token became
Token ID2627
Feature activation+4.29e-01

Pos logit contributions
 his+0.134
 night+0.108
 day+0.101
 end+0.096
 himself+0.091
Neg logit contributions
uddled-0.207
outing-0.206
ower-0.203
uffed-0.203
itting-0.198
Token known
Token ID1900
Feature activation+4.04e-01

Pos logit contributions
 his+0.026
 restaurants+0.021
 night+0.020
 meal+0.020
 saf+0.019
Neg logit contributions
uddled-0.058
uffed-0.055
ist-0.053
itting-0.053
a-0.052
Token as
Token ID355
Feature activation-1.04e-01

Pos logit contributions
 his+0.030
 himself+0.020
 night+0.020
 end+0.019
 day+0.019
Neg logit contributions
uddled-0.060
uffed-0.056
op-0.052
ist-0.052
itting-0.052
Token the
Token ID262
Feature activation+1.90e-01

Pos logit contributions
uddled+0.016
uffed+0.015
ist+0.015
arest+0.015
op+0.014
Neg logit contributions
 his-0.006
 night-0.004
 restaurants-0.004
 himself-0.004
 day-0.004
Token best
Token ID1266
Feature activation-7.11e-02

Pos logit contributions
 his+0.018
 night+0.014
 restaurants+0.014
 day+0.013
 himself+0.013
Neg logit contributions
uddled-0.022
uffed-0.021
ist-0.020
itting-0.020
arest-0.019
Token cra
Token ID15671
Feature activation+3.31e+00

Pos logit contributions
uddled+0.010
uffed+0.009
arest+0.009
outing+0.008
ist+0.008
Neg logit contributions
 his-0.006
 night-0.005
 day-0.005
 end-0.004
 peanuts-0.004
Tokenfter
Token ID637
Feature activation-3.42e-01

Pos logit contributions
 his+0.337
 end+0.292
 day+0.286
 night+0.283
 restaurants+0.282
Neg logit contributions
uddled-0.348
uffed-0.315
ower-0.288
ows-0.286
op-0.285
Token around
Token ID1088
Feature activation-2.92e-02

Pos logit contributions
uddled+0.049
uffed+0.047
arest+0.044
outing+0.043
ist+0.043
Neg logit contributions
 his-0.027
 night-0.020
 day-0.020
 himself-0.018
 home-0.017
Token.
Token ID13
Feature activation+6.44e-02

Pos logit contributions
uddled+0.005
uffed+0.004
arest+0.004
ist+0.004
ower+0.004
Neg logit contributions
 his-0.002
 night-0.001
 day-0.001
 himself-0.001
 end-0.001
Token<|endoftext|>
Token ID50256
Feature activation-2.50e-01

Pos logit contributions
 his+0.005
 restaurants+0.004
 meal+0.004
 peanuts+0.004
 salads+0.004
Neg logit contributions
uddled-0.008
uffed-0.008
itting-0.007
op-0.007
ended-0.007
TokenOnce
Token ID7454
Feature activation-3.82e-01

Pos logit contributions
uddled+0.034
uffed+0.032
ist+0.030
itting+0.030
arest+0.030
Neg logit contributions
 his-0.019
 night-0.014
 restaurants-0.014
 day-0.013
 himself-0.013
Token with
Token ID351
Feature activation-2.01e-01

Pos logit contributions
 his+0.054
 night+0.046
 restaurants+0.044
 end+0.043
 agreement+0.042
Neg logit contributions
uddled-0.068
uffed-0.062
ist-0.061
op-0.059
itting-0.059
Token her
Token ID607
Feature activation+1.36e-01

Pos logit contributions
uddled+0.026
uffed+0.025
op+0.023
ist+0.023
ower+0.023
Neg logit contributions
 his-0.012
 night-0.011
 end-0.011
 agreement-0.011
 distance-0.010
Token toys
Token ID14958
Feature activation+2.39e-01

Pos logit contributions
 his+0.009
 night+0.006
 day+0.006
 peanuts+0.005
 meal+0.005
Neg logit contributions
uddled-0.021
uffed-0.020
ist-0.020
arest-0.019
itting-0.019
Token and
Token ID290
Feature activation-4.59e-01

Pos logit contributions
 his+0.019
 night+0.014
 day+0.014
 end+0.014
 restaurants+0.012
Neg logit contributions
uddled-0.034
uffed-0.033
arest-0.031
ist-0.030
ended-0.030
Token doing
Token ID1804
Feature activation-7.11e-01

Pos logit contributions
uddled+0.051
uffed+0.045
ist+0.043
ve+0.043
op+0.043
Neg logit contributions
 his-0.033
 agreement-0.030
 Prince-0.028
 Freddy-0.028
 night-0.028
Token arts
Token ID10848
Feature activation+3.31e+00

Pos logit contributions
uddled+0.091
uffed+0.085
ist+0.082
ve+0.080
itting+0.080
Neg logit contributions
 his-0.052
 night-0.046
 day-0.042
 end-0.041
 agreement-0.040
Token and
Token ID290
Feature activation-4.73e-01

Pos logit contributions
 his+0.285
 night+0.216
 restaurants+0.213
 day+0.206
 himself+0.204
Neg logit contributions
uddled-0.422
uffed-0.404
arest-0.377
ist-0.376
itting-0.375
Token crafts
Token ID28229
Feature activation+5.14e-01

Pos logit contributions
uddled+0.050
uffed+0.045
ist+0.044
ve+0.043
op+0.043
Neg logit contributions
 his-0.049
 night-0.040
 restaurants-0.039
 Prince-0.039
 agreement-0.038
Token.
Token ID13
Feature activation-2.66e-01

Pos logit contributions
 his+0.042
 night+0.031
 restaurants+0.030
 himself+0.030
 day+0.030
Neg logit contributions
uddled-0.068
uffed-0.065
arest-0.061
ist-0.061
itting-0.061
Token One
Token ID1881
Feature activation+3.47e-01

Pos logit contributions
uddled+0.034
itting+0.033
uffed+0.032
op+0.031
Because+0.030
Neg logit contributions
 his-0.020
 battle-0.015
 robbers-0.015
 restaurants-0.015
 peanuts-0.015
Token day
Token ID1110
Feature activation-8.66e-02

Pos logit contributions
 his+0.012
 restaurants+0.007
 himself+0.007
 peanuts+0.006
 salads+0.006
Neg logit contributions
uddled-0.058
itting-0.055
op-0.054
ist-0.054
uffed-0.054
Token.
Token ID13
Feature activation-6.80e-02

Pos logit contributions
uddled+0.040
uffed+0.037
ist+0.036
ows+0.036
op+0.036
Neg logit contributions
 his-0.035
 restaurants-0.030
 jungle-0.029
 night-0.029
 battle-0.029
Token They
Token ID1119
Feature activation+3.02e-01

Pos logit contributions
uddled+0.008
uffed+0.008
itting+0.007
ower+0.007
op+0.007
Neg logit contributions
 his-0.006
 restaurants-0.005
 night-0.005
 himself-0.005
 peanuts-0.005
Token say
Token ID910
Feature activation-1.76e-01

Pos logit contributions
 his+0.020
 night+0.015
 end+0.015
 search+0.014
 agreement+0.013
Neg logit contributions
uddled-0.049
uffed-0.045
ended-0.044
ower-0.044
itting-0.043
Token "
Token ID366
Feature activation-1.01e+00

Pos logit contributions
ist+0.026
uddled+0.026
uffed+0.025
ve+0.024
arest+0.024
Neg logit contributions
 his-0.014
 night-0.008
 day-0.008
 himself-0.008
 morning-0.007
Tokenm
Token ID76
Feature activation-1.70e-01

Pos logit contributions
uffed+0.111
uddled+0.109
ist+0.097
ows+0.097
ushing+0.094
Neg logit contributions
 his-0.100
 night-0.092
 day-0.090
 morning-0.084
 evening-0.084
Tokenarch
Token ID998
Feature activation+3.30e+00

Pos logit contributions
uddled+0.022
uffed+0.021
ist+0.020
arest+0.020
itting+0.020
Neg logit contributions
 his-0.014
 night-0.011
 himself-0.010
 day-0.010
 end-0.010
Token,
Token ID11
Feature activation+2.04e-02

Pos logit contributions
 his+0.253
 end+0.188
 night+0.185
 day+0.175
 home+0.170
Neg logit contributions
uddled-0.475
uffed-0.448
arest-0.425
ended-0.423
ider-0.419
Token march
Token ID9960
Feature activation-3.68e-01

Pos logit contributions
 his+0.002
 end+0.002
 day+0.002
 night+0.002
 agreement+0.001
Neg logit contributions
uddled-0.003
uffed-0.002
ist-0.002
ve-0.002
arest-0.002
Token,
Token ID11
Feature activation+1.60e-02

Pos logit contributions
uddled+0.052
uffed+0.051
ended+0.049
outing+0.048
arest+0.048
Neg logit contributions
 his-0.028
 night-0.022
 day-0.021
 home-0.020
 end-0.020
Token march
Token ID9960
Feature activation-3.75e-01

Pos logit contributions
 his+0.001
 night+0.001
 day+0.001
 end+0.001
 winter+0.001
Neg logit contributions
uddled-0.002
uffed-0.002
itting-0.002
ve-0.002
arest-0.002
Token"
Token ID1
Feature activation-4.75e-01

Pos logit contributions
uddled+0.053
uffed+0.052
ended+0.050
outing+0.049
arest+0.048
Neg logit contributions
 his-0.029
 night-0.023
 day-0.022
 end-0.021
 home-0.020
Token looked
Token ID3114
Feature activation+4.57e-01

Pos logit contributions
 his+0.007
 end+0.006
 him+0.005
 himself+0.005
 bear+0.005
Neg logit contributions
arest-0.009
ute-0.008
uddled-0.008
ower-0.008
uffed-0.008
Token at
Token ID379
Feature activation-2.77e-02

Pos logit contributions
 his+0.048
 night+0.038
 day+0.036
 agreement+0.035
 end+0.035
Neg logit contributions
uddled-0.050
uffed-0.047
arest-0.046
ist-0.045
op-0.044
Token the
Token ID262
Feature activation+2.25e-01

Pos logit contributions
uddled+0.004
uffed+0.003
ist+0.003
itting+0.003
arest+0.003
Neg logit contributions
 his-0.002
 night-0.002
 restaurants-0.002
 day-0.002
 meal-0.002
Token egg
Token ID5935
Feature activation+9.91e-01

Pos logit contributions
 his+0.017
 night+0.014
 end+0.014
 day+0.013
 meal+0.013
Neg logit contributions
uddled-0.031
uffed-0.031
ist-0.029
itting-0.029
arest-0.028
Token-
Token ID12
Feature activation+8.86e-01

Pos logit contributions
 his+0.077
 night+0.057
 end+0.056
 day+0.055
 himself+0.051
Neg logit contributions
uddled-0.140
uffed-0.136
arest-0.132
ve-0.129
itting-0.126
Tokendu
Token ID646
Feature activation+3.30e+00

Pos logit contributions
 his+0.109
 night+0.090
 himself+0.088
 restaurants+0.086
 day+0.086
Neg logit contributions
uddled-0.080
uffed-0.074
arest-0.068
ist-0.068
itting-0.067
Tokenck
Token ID694
Feature activation+3.71e-01

Pos logit contributions
 his+0.346
 himself+0.263
 end+0.259
 night+0.257
 day+0.254
Neg logit contributions
uddled-0.377
uffed-0.329
arest-0.306
ended-0.303
itting-0.303
Token and
Token ID290
Feature activation-3.52e-02

Pos logit contributions
 his+0.035
 night+0.025
 end+0.024
 him+0.024
 bears+0.023
Neg logit contributions
uddled-0.047
arest-0.047
uffed-0.045
ider-0.044
ower-0.044
Token then
Token ID788
Feature activation-1.42e-01

Pos logit contributions
uddled+0.004
arest+0.004
uffed+0.004
itting+0.004
op+0.004
Neg logit contributions
 his-0.004
 night-0.003
 himself-0.002
 end-0.002
 restaurants-0.002
Token at
Token ID379
Feature activation-6.86e-02

Pos logit contributions
uddled+0.019
arest+0.018
uffed+0.017
ist+0.017
itting+0.017
Neg logit contributions
 his-0.014
 night-0.009
 the-0.009
 one-0.009
 him-0.008
Token each
Token ID1123
Feature activation+3.42e-01

Pos logit contributions
uddled+0.011
uffed+0.011
itting+0.010
ist+0.010
arest+0.010
Neg logit contributions
 his-0.004
 night-0.002
 end-0.002
 day-0.002
 himself-0.002
Tokenised
Token ID1417
Feature activation-1.78e-01

Pos logit contributions
uddled+0.037
uffed+0.031
ider+0.031
ower+0.030
itting+0.030
Neg logit contributions
 his-0.061
 himself-0.051
 night-0.047
 him-0.047
 agreement-0.046
Token and
Token ID290
Feature activation+4.41e-03

Pos logit contributions
uddled+0.027
uffed+0.025
arest+0.024
ider+0.023
ower+0.023
Neg logit contributions
 his-0.014
 night-0.010
 day-0.009
 himself-0.009
 end-0.008
Token studied
Token ID9713
Feature activation+1.96e-01

Pos logit contributions
 his+0.000
 night+0.000
 day+0.000
 himself+0.000
 restaurants+0.000
Neg logit contributions
uddled-0.001
uffed-0.001
itting-0.001
arest-0.001
outing-0.001
Token hard
Token ID1327
Feature activation+1.03e-01

Pos logit contributions
 his+0.015
 himself+0.009
 night+0.009
 day+0.008
 him+0.008
Neg logit contributions
uddled-0.031
uffed-0.029
ower-0.027
itting-0.027
arest-0.027
Token--
Token ID438
Feature activation+1.35e+00

Pos logit contributions
 his+0.008
 night+0.006
 day+0.006
 himself+0.006
 restaurants+0.006
Neg logit contributions
uddled-0.015
uffed-0.014
arest-0.013
ist-0.013
itting-0.013
Tokenmem
Token ID11883
Feature activation+3.30e+00

Pos logit contributions
 his+0.167
 himself+0.137
 him+0.129
 day+0.127
 restaurants+0.125
Neg logit contributions
uddled-0.132
uffed-0.120
ider-0.112
arest-0.108
ower-0.105
Tokenor
Token ID273
Feature activation-4.53e-01

Pos logit contributions
 his+0.371
 end+0.319
 agreement+0.305
 night+0.296
 himself+0.291
Neg logit contributions
uddled-0.319
uts-0.282
ider-0.275
uffed-0.271
arest-0.270
Tokenizing
Token ID2890
Feature activation-6.81e-01

Pos logit contributions
uddled+0.050
uffed+0.045
arest+0.043
itting+0.043
outing+0.042
Neg logit contributions
 his-0.051
 himself-0.042
 night-0.038
 him-0.037
 day-0.037
Token the
Token ID262
Feature activation+2.94e-01

Pos logit contributions
uddled+0.107
uffed+0.100
arest+0.098
ider+0.097
ve+0.094
Neg logit contributions
 his-0.049
 way-0.033
 day-0.033
 night-0.032
 himself-0.031
Token answers
Token ID7429
Feature activation+3.98e-01

Pos logit contributions
 his+0.022
 night+0.017
 restaurants+0.016
 day+0.016
 end+0.015
Neg logit contributions
uddled-0.041
uffed-0.038
ist-0.037
itting-0.036
arest-0.036
Token and
Token ID290
Feature activation-3.35e-02

Pos logit contributions
 his+0.032
 night+0.020
 himself+0.020
 day+0.020
 him+0.020
Neg logit contributions
uddled-0.062
uffed-0.057
ower-0.053
arest-0.053
ider-0.052
Tokenâ
Token ID22940
Feature activation+7.14e-01

Pos logit contributions
uddled+0.132
uffed+0.122
ist+0.119
arest+0.115
itting+0.114
Neg logit contributions
 his-0.071
 night-0.049
 himself-0.048
 restaurants-0.048
 day-0.047
Token
Token ID26391
Feature activation+9.07e-01

Pos logit contributions
 his+0.055
 night+0.044
 day+0.043
 morning+0.040
 end+0.040
Neg logit contributions
uddled-0.098
uffed-0.092
op-0.084
itting-0.084
ve-0.084
Token
Token ID198
Feature activation+2.18e-01

Pos logit contributions
 his+0.065
 restaurants+0.048
 night+0.046
 jungle+0.045
 day+0.044
Neg logit contributions
uddled-0.129
uffed-0.120
ist-0.118
arest-0.116
itting-0.114
Token
Token ID198
Feature activation+2.82e-01

Pos logit contributions
 his+0.015
 night+0.010
 day+0.010
 himself+0.010
 restaurants+0.010
Neg logit contributions
uddled-0.032
uffed-0.030
ist-0.030
itting-0.028
ower-0.028
TokenO
Token ID46
Feature activation+3.97e-01

Pos logit contributions
 his+0.029
 day+0.022
 himself+0.022
 night+0.022
 agreement+0.022
Neg logit contributions
uddled-0.032
uffed-0.030
ist-0.029
ows-0.028
ower-0.028
Tokentty
Token ID42852
Feature activation+3.30e+00

Pos logit contributions
 his+0.045
 day+0.041
 night+0.041
 morning+0.038
 himself+0.038
Neg logit contributions
uddled-0.032
uffed-0.031
uttering-0.029
ifying-0.029
arest-0.028
Token was
Token ID373
Feature activation+4.69e-02

Pos logit contributions
 his+0.267
 night+0.203
 day+0.199
 morning+0.180
 end+0.180
Neg logit contributions
itting-0.459
uttering-0.454
uddled-0.447
arest-0.447
uffed-0.443
Token so
Token ID523
Feature activation-4.52e-01

Pos logit contributions
 his+0.004
 night+0.002
 himself+0.002
 day+0.002
 morning+0.002
Neg logit contributions
uddled-0.007
uffed-0.006
ist-0.006
arest-0.006
itting-0.006
Token excited
Token ID6568
Feature activation+3.47e-01

Pos logit contributions
uddled+0.072
ist+0.067
uffed+0.064
ider+0.063
ows+0.061
Neg logit contributions
 his-0.036
 day-0.024
 night-0.023
 himself-0.022
 morning-0.020
Token that
Token ID326
Feature activation-7.84e-02

Pos logit contributions
 his+0.028
 day+0.020
 night+0.019
 himself+0.018
 him+0.018
Neg logit contributions
uddled-0.051
uffed-0.049
arest-0.048
ended-0.045
ist-0.045
Token his
Token ID465
Feature activation+3.23e-01

Pos logit contributions
uddled+0.012
uffed+0.012
ist+0.011
arest+0.011
itting+0.011
Neg logit contributions
 his-0.005
 night-0.004
 day-0.003
 himself-0.003
 restaurants-0.003
Token to
Token ID284
Feature activation+1.78e-01

Pos logit contributions
 his+0.021
 night+0.011
 day+0.011
 himself+0.011
 home+0.011
Neg logit contributions
uffed-0.045
ist-0.045
arest-0.045
uddled-0.044
itting-0.043
Token share
Token ID2648
Feature activation+2.58e-01

Pos logit contributions
 his+0.011
 end+0.008
 search+0.007
 night+0.007
 day+0.007
Neg logit contributions
uddled-0.029
uffed-0.027
arest-0.026
itting-0.026
ve-0.025
Token the
Token ID262
Feature activation+2.97e-01

Pos logit contributions
 his+0.017
 night+0.012
 restaurants+0.011
 day+0.011
 himself+0.010
Neg logit contributions
uddled-0.038
uffed-0.037
ist-0.035
itting-0.035
arest-0.035
Token egg
Token ID5935
Feature activation+9.73e-01

Pos logit contributions
 his+0.027
 night+0.021
 restaurants+0.020
 day+0.020
 meal+0.020
Neg logit contributions
uddled-0.037
uffed-0.035
ist-0.033
itting-0.032
arest-0.032
Token-
Token ID12
Feature activation+9.50e-01

Pos logit contributions
 his+0.091
 night+0.071
 restaurants+0.069
 day+0.067
 himself+0.067
Neg logit contributions
uddled-0.117
uffed-0.110
ist-0.104
itting-0.103
arest-0.103
Tokendu
Token ID646
Feature activation+3.28e+00

Pos logit contributions
 his+0.113
 night+0.093
 himself+0.091
 restaurants+0.091
 day+0.089
Neg logit contributions
uddled-0.089
uffed-0.083
ist-0.078
itting-0.077
arest-0.075
Tokenck
Token ID694
Feature activation+4.32e-01

Pos logit contributions
 his+0.368
 night+0.292
 himself+0.290
 peanuts+0.288
 end+0.287
Neg logit contributions
uddled-0.339
uffed-0.305
itting-0.280
arest-0.275
ider-0.275
Token and
Token ID290
Feature activation-1.07e-02

Pos logit contributions
 his+0.040
 night+0.029
 him+0.028
 bears+0.027
 himself+0.027
Neg logit contributions
uddled-0.056
arest-0.055
uffed-0.054
ider-0.052
itting-0.051
Token try
Token ID1949
Feature activation-1.11e+00

Pos logit contributions
uddled+0.002
uffed+0.001
ist+0.001
itting+0.001
arest+0.001
Neg logit contributions
 his-0.001
 night-0.001
 restaurants-0.001
 day-0.001
 himself-0.001
Token to
Token ID284
Feature activation+9.64e-02

Pos logit contributions
uddled+0.150
uffed+0.142
ist+0.136
arest+0.134
itting+0.133
Neg logit contributions
 his-0.090
 night-0.065
 restaurants-0.063
 himself-0.062
 day-0.061
Token fit
Token ID4197
Feature activation-1.35e-01

Pos logit contributions
 his+0.007
 end+0.005
 himself+0.005
 night+0.004
 search+0.004
Neg logit contributions
uddled-0.015
uffed-0.014
itting-0.014
op-0.013
ider-0.013
Token even
Token ID772
Feature activation+1.20e+00

Pos logit contributions
 his+0.029
 night+0.023
 day+0.022
 restaurants+0.020
 end+0.020
Neg logit contributions
uddled-0.039
uffed-0.036
ider-0.036
ower-0.035
arest-0.035
Token practiced
Token ID19893
Feature activation-1.23e-01

Pos logit contributions
 his+0.119
 night+0.096
 restaurants+0.095
 himself+0.091
 meal+0.091
Neg logit contributions
uddled-0.132
uffed-0.123
ist-0.118
itting-0.114
arest-0.113
Token yoga
Token ID20351
Feature activation-4.08e-01

Pos logit contributions
uddled+0.017
uffed+0.016
arest+0.016
ist+0.015
itting+0.015
Neg logit contributions
 his-0.010
 night-0.007
 himself-0.007
 day-0.007
 restaurants-0.007
Token and
Token ID290
Feature activation-2.38e-01

Pos logit contributions
uddled+0.058
uffed+0.057
ended+0.055
outing+0.054
arest+0.054
Neg logit contributions
 his-0.030
 day-0.021
 night-0.021
 himself-0.021
 morning-0.018
Token martial
Token ID15618
Feature activation-6.95e-01

Pos logit contributions
uddled+0.030
uffed+0.029
ist+0.027
arest+0.027
itting+0.027
Neg logit contributions
 his-0.021
 night-0.016
 restaurants-0.016
 day-0.015
 himself-0.015
Token arts
Token ID10848
Feature activation+3.27e+00

Pos logit contributions
uddled+0.080
ider+0.078
uffed+0.077
arest+0.074
outing+0.073
Neg logit contributions
 his-0.068
 him-0.050
 himself-0.049
 day-0.049
 night-0.049
Token together
Token ID1978
Feature activation-7.96e-01

Pos logit contributions
 his+0.270
 himself+0.189
 night+0.181
 day+0.179
 restaurants+0.166
Neg logit contributions
uddled-0.454
uffed-0.449
arest-0.431
ended-0.418
itting-0.417
Token.
Token ID13
Feature activation-7.80e-02

Pos logit contributions
uddled+0.119
uffed+0.113
arest+0.105
itting+0.102
ider+0.102
Neg logit contributions
 his-0.062
 night-0.043
 day-0.043
 himself-0.041
 end-0.038
Token
Token ID198
Feature activation-6.01e-02

Pos logit contributions
uddled+0.010
uffed+0.010
itting+0.009
ist+0.009
op+0.009
Neg logit contributions
 his-0.006
 restaurants-0.004
 night-0.004
 end-0.004
 peanuts-0.004
Token
Token ID198
Feature activation+4.88e-03

Pos logit contributions
uddled+0.008
uffed+0.008
ist+0.007
itting+0.007
arest+0.007
Neg logit contributions
 his-0.005
 restaurants-0.004
 night-0.004
 peanuts-0.003
 meal-0.003
TokenJoe
Token ID19585
Feature activation+3.65e-01

Pos logit contributions
 his+0.001
 night+0.000
 restaurants+0.000
 day+0.000
 meal+0.000
Neg logit contributions
uddled-0.000
uffed-0.000
ist-0.000
itting-0.000
arest-0.000
Token flashlight
Token ID38371
Feature activation-5.07e-01

Pos logit contributions
uddled+0.048
uffed+0.045
ist+0.043
itting+0.042
arest+0.042
Neg logit contributions
 his-0.029
 night-0.022
 day-0.021
 meal-0.020
 restaurants-0.020
Token,
Token ID11
Feature activation-3.29e-01

Pos logit contributions
uddled+0.075
uffed+0.072
arest+0.069
itting+0.068
outing+0.068
Neg logit contributions
 his-0.038
 night-0.030
 day-0.029
 bear-0.026
 way-0.025
Token and
Token ID290
Feature activation-2.42e-01

Pos logit contributions
uddled+0.051
uffed+0.049
arest+0.046
ve+0.046
itting+0.046
Neg logit contributions
 his-0.026
 himself-0.016
 night-0.016
 food-0.015
 day-0.015
Token a
Token ID257
Feature activation-2.69e-01

Pos logit contributions
uddled+0.038
uffed+0.036
itting+0.034
ower+0.034
ist+0.034
Neg logit contributions
 his-0.018
 night-0.011
 restaurants-0.011
 himself-0.011
 day-0.011
Token magn
Token ID7842
Feature activation-5.54e-01

Pos logit contributions
uddled+0.037
uffed+0.035
ist+0.033
itting+0.032
arest+0.032
Neg logit contributions
 his-0.022
 night-0.016
 day-0.016
 restaurants-0.015
 meal-0.015
Tokenifying
Token ID4035
Feature activation-3.79e+00

Pos logit contributions
uddled+0.052
liv+0.044
Because+0.043
ower+0.042
ended+0.040
Neg logit contributions
 his-0.064
 night-0.057
 him-0.056
 home-0.054
 dragons-0.051
Token glass
Token ID5405
Feature activation-4.80e-01

Pos logit contributions
uddled+0.505
uffed+0.475
ist+0.457
itting+0.448
arest+0.444
Neg logit contributions
 his-0.306
 night-0.233
 restaurants-0.224
 day-0.215
 meal-0.213
Token.
Token ID13
Feature activation-2.49e-01

Pos logit contributions
uddled+0.071
uffed+0.068
itting+0.065
arest+0.064
ended+0.063
Neg logit contributions
 his-0.035
 night-0.026
 day-0.024
 food-0.023
 meal-0.023
Token
Token ID198
Feature activation-4.71e-02

Pos logit contributions
uddled+0.035
uffed+0.032
ist+0.032
arest+0.031
itting+0.031
Neg logit contributions
 his-0.019
 night-0.014
 restaurants-0.013
 himself-0.013
 day-0.013
Token
Token ID198
Feature activation+1.13e-01

Pos logit contributions
uddled+0.007
uffed+0.006
ist+0.006
itting+0.006
arest+0.006
Neg logit contributions
 his-0.003
 night-0.002
 restaurants-0.002
 day-0.002
 himself-0.002
TokenAs
Token ID1722
Feature activation+1.19e+00

Pos logit contributions
 his+0.011
 day+0.008
 night+0.008
 end+0.008
 himself+0.008
Neg logit contributions
uddled-0.014
uffed-0.013
ist-0.012
ows-0.011
ower-0.011
Token,
Token ID11
Feature activation-7.46e-02

Pos logit contributions
uddled+0.067
uffed+0.063
arest+0.062
ist+0.062
itting+0.062
Neg logit contributions
 his-0.033
 himself-0.021
 night-0.021
 day-0.020
 end-0.019
Token handing
Token ID22786
Feature activation+1.43e-01

Pos logit contributions
uddled+0.011
arest+0.011
uffed+0.011
ist+0.010
itting+0.010
Neg logit contributions
 his-0.005
 himself-0.003
 night-0.003
 day-0.003
 bear-0.003
Token him
Token ID683
Feature activation-3.32e-02

Pos logit contributions
 restaurants+0.010
 jungle+0.010
 den+0.010
 battles+0.009
 agreement+0.009
Neg logit contributions
uddled-0.015
uffed-0.014
op-0.013
a-0.013
ist-0.013
Token the
Token ID262
Feature activation+5.37e-03

Pos logit contributions
uddled+0.004
uffed+0.004
ist+0.004
a+0.004
ows+0.004
Neg logit contributions
 his-0.002
 restaurants-0.002
 night-0.002
 jungle-0.002
 day-0.002
Token magn
Token ID7842
Feature activation-4.21e-01

Pos logit contributions
 his+0.000
 night+0.000
 day+0.000
 food+0.000
 meal+0.000
Neg logit contributions
uddled-0.001
uffed-0.001
ist-0.001
itting-0.001
ended-0.001
Tokenifying
Token ID4035
Feature activation-3.60e+00

Pos logit contributions
uddled+0.039
ower+0.031
Because+0.031
uffed+0.030
liv+0.030
Neg logit contributions
 his-0.049
 night-0.042
 him-0.041
 dragons-0.041
 dinosaurs-0.040
Token glass
Token ID5405
Feature activation-4.15e-01

Pos logit contributions
uddled+0.465
uffed+0.438
ist+0.420
itting+0.412
arest+0.409
Neg logit contributions
 his-0.300
 night-0.230
 restaurants-0.224
 day-0.214
 meal-0.212
Token.
Token ID13
Feature activation+1.10e-01

Pos logit contributions
uddled+0.057
uffed+0.054
itting+0.052
arest+0.052
ve+0.051
Neg logit contributions
 his-0.034
 night-0.025
 day-0.023
 himself-0.023
 peanuts-0.023
Token She
Token ID1375
Feature activation-1.77e-01

Pos logit contributions
 his+0.009
 night+0.007
 restaurants+0.006
 day+0.006
 himself+0.006
Neg logit contributions
uddled-0.015
uffed-0.014
ist-0.013
arest-0.013
itting-0.013
Token looked
Token ID3114
Feature activation+3.15e-01

Pos logit contributions
itting+0.025
arest+0.024
ower+0.024
uttering+0.023
ute+0.022
Neg logit contributions
 his-0.016
 bear-0.010
 bears-0.009
 search-0.009
 way-0.009
Token around
Token ID1088
Feature activation-1.73e-02

Pos logit contributions
 his+0.032
 night+0.025
 day+0.023
 end+0.023
 himself+0.023
Neg logit contributions
uddled-0.037
uffed-0.034
arest-0.033
op-0.033
ist-0.033
Token When
Token ID1649
Feature activation+5.74e-01

Pos logit contributions
 his+0.051
 night+0.035
 restaurants+0.034
 himself+0.034
 day+0.033
Neg logit contributions
uddled-0.097
uffed-0.091
ist-0.089
arest-0.087
itting-0.086
Token Daisy
Token ID40355
Feature activation-3.14e-01

Pos logit contributions
 his+0.048
 night+0.037
 restaurants+0.036
 day+0.034
 himself+0.034
Neg logit contributions
uddled-0.075
uffed-0.070
ist-0.068
itting-0.066
arest-0.066
Token put
Token ID1234
Feature activation+6.79e-02

Pos logit contributions
uddled+0.045
ower+0.043
uffed+0.042
itting+0.042
arest+0.042
Neg logit contributions
 his-0.025
 night-0.020
 bear-0.017
 search-0.017
 day-0.017
Token the
Token ID262
Feature activation-2.38e-02

Pos logit contributions
 his+0.005
 night+0.004
 restaurants+0.004
 day+0.003
 meal+0.003
Neg logit contributions
uddled-0.009
uffed-0.009
ist-0.008
itting-0.008
arest-0.008
Token magn
Token ID7842
Feature activation-5.33e-01

Pos logit contributions
uddled+0.003
uffed+0.003
itting+0.003
ist+0.003
ended+0.003
Neg logit contributions
 his-0.002
 night-0.002
 day-0.001
 end-0.001
 peanuts-0.001
Tokenifying
Token ID4035
Feature activation-3.51e+00

Pos logit contributions
uddled+0.050
Because+0.041
liv+0.040
ower+0.039
uffed+0.038
Neg logit contributions
 his-0.062
 night-0.054
 home-0.052
 him-0.050
 over-0.050
Token glass
Token ID5405
Feature activation-3.17e-01

Pos logit contributions
uddled+0.475
uffed+0.447
ist+0.430
itting+0.422
arest+0.418
Neg logit contributions
 his-0.277
 night-0.208
 restaurants-0.200
 day-0.192
 meal-0.190
Token down
Token ID866
Feature activation-8.43e-01

Pos logit contributions
uddled+0.047
uffed+0.044
itting+0.043
arest+0.042
ve+0.041
Neg logit contributions
 his-0.024
 night-0.017
 day-0.016
 meal-0.015
 himself-0.015
Token,
Token ID11
Feature activation+1.46e-01

Pos logit contributions
uddled+0.132
uffed+0.122
itting+0.118
arest+0.117
ist+0.116
Neg logit contributions
 his-0.062
 night-0.042
 day-0.039
 himself-0.038
 end-0.037
Token the
Token ID262
Feature activation+6.06e-02

Pos logit contributions
 his+0.015
 night+0.012
 restaurants+0.011
 day+0.011
 himself+0.011
Neg logit contributions
uddled-0.017
uffed-0.016
ist-0.015
itting-0.015
arest-0.015
Token cater
Token ID20825
Feature activation+5.55e-03

Pos logit contributions
 his+0.004
 night+0.004
 day+0.003
 end+0.003
 meal+0.003
Neg logit contributions
uddled-0.009
uffed-0.008
itting-0.008
ist-0.008
arest-0.008
Token examine
Token ID10716
Feature activation-6.61e-01

Pos logit contributions
uddled+0.051
uffed+0.047
outing+0.044
itting+0.043
ower+0.043
Neg logit contributions
 his-0.019
 search-0.014
 end-0.012
 night-0.012
 play-0.011
Token it
Token ID340
Feature activation-2.00e-01

Pos logit contributions
uddled+0.096
uffed+0.090
itting+0.086
ist+0.086
arest+0.084
Neg logit contributions
 his-0.048
 night-0.034
 restaurants-0.032
 himself-0.031
 day-0.031
Token with
Token ID351
Feature activation-2.43e-01

Pos logit contributions
uddled+0.027
uffed+0.026
itting+0.024
ended+0.023
ower+0.023
Neg logit contributions
 his-0.018
 night-0.013
 end-0.013
 day-0.013
 himself-0.012
Token the
Token ID262
Feature activation+7.70e-02

Pos logit contributions
uddled+0.037
uffed+0.035
itting+0.034
ist+0.033
ider+0.033
Neg logit contributions
 his-0.017
 night-0.012
 restaurants-0.010
 himself-0.010
 peanuts-0.010
Token magn
Token ID7842
Feature activation-3.17e-01

Pos logit contributions
 his+0.006
 night+0.005
 day+0.005
 end+0.005
 restaurants+0.005
Neg logit contributions
uddled-0.011
uffed-0.010
ist-0.009
itting-0.009
ended-0.009
Tokenifying
Token ID4035
Feature activation-3.46e+00

Pos logit contributions
uddled+0.028
liv+0.022
uffed+0.022
Because+0.021
ower+0.020
Neg logit contributions
 his-0.037
 night-0.036
 Prince-0.033
 home-0.033
 dinosaurs-0.033
Token glass
Token ID5405
Feature activation-5.52e-01

Pos logit contributions
uddled+0.406
uffed+0.378
ist+0.364
arest+0.357
itting+0.353
Neg logit contributions
 his-0.308
 restaurants-0.242
 night-0.238
 himself-0.233
 day-0.230
Token?"
Token ID1701
Feature activation-2.88e-01

Pos logit contributions
uddled+0.073
uffed+0.070
itting+0.065
ended+0.064
arest+0.064
Neg logit contributions
 his-0.048
 night-0.039
 day-0.038
 meal-0.036
 end-0.036
Token
Token ID198
Feature activation-1.30e-01

Pos logit contributions
uddled+0.034
uffed+0.033
op+0.031
ended+0.030
Because+0.030
Neg logit contributions
 his-0.022
 restaurants-0.020
 night-0.019
 end-0.019
 meal-0.019
Token
Token ID198
Feature activation-7.55e-02

Pos logit contributions
uddled+0.019
uffed+0.018
ist+0.017
itting+0.017
arest+0.017
Neg logit contributions
 his-0.009
 night-0.006
 day-0.006
 himself-0.006
 restaurants-0.006
Token"
Token ID1
Feature activation-5.37e-01

Pos logit contributions
uddled+0.009
uffed+0.009
ist+0.009
arest+0.008
itting+0.008
Neg logit contributions
 his-0.007
 himself-0.005
 day-0.005
 end-0.005
 night-0.005
Token a
Token ID257
Feature activation-1.51e-01

Pos logit contributions
uddled+0.024
uffed+0.023
ist+0.022
itting+0.022
arest+0.022
Neg logit contributions
 his-0.013
 night-0.010
 restaurants-0.010
 end-0.009
 agreement-0.009
Token flashlight
Token ID38371
Feature activation-3.55e-01

Pos logit contributions
uddled+0.018
uffed+0.017
ist+0.016
arest+0.016
ve+0.016
Neg logit contributions
 his-0.014
 restaurants-0.011
 himself-0.011
 night-0.010
 agreement-0.010
Token and
Token ID290
Feature activation-2.01e-01

Pos logit contributions
uddled+0.051
arest+0.049
uffed+0.049
ve+0.047
ended+0.046
Neg logit contributions
 his-0.028
 night-0.021
 day-0.021
 bear-0.019
 saf-0.018
Token a
Token ID257
Feature activation-1.61e-01

Pos logit contributions
uddled+0.032
uffed+0.031
itting+0.030
ist+0.030
outing+0.030
Neg logit contributions
 his-0.013
 night-0.008
 restaurants-0.008
 food-0.007
 day-0.007
Token magn
Token ID7842
Feature activation-2.69e-01

Pos logit contributions
uddled+0.021
uffed+0.019
ist+0.019
itting+0.018
arest+0.018
Neg logit contributions
 his-0.013
 night-0.010
 restaurants-0.010
 day-0.010
 meal-0.010
Tokenifying
Token ID4035
Feature activation-3.45e+00

Pos logit contributions
uddled+0.022
ended+0.018
ower+0.018
uffed+0.017
Because+0.017
Neg logit contributions
 his-0.033
 night-0.029
 him-0.028
 dragons-0.028
 bear-0.028
Token glass
Token ID5405
Feature activation-4.22e-01

Pos logit contributions
uddled+0.373
uffed+0.344
ist+0.333
arest+0.328
op+0.323
Neg logit contributions
 his-0.312
 restaurants-0.254
 himself-0.250
 robbers-0.248
 agreement-0.247
Token to
Token ID284
Feature activation-4.49e-02

Pos logit contributions
uddled+0.059
uffed+0.056
arest+0.053
ended+0.052
itting+0.052
Neg logit contributions
 his-0.033
 night-0.025
 day-0.024
 meal-0.023
 peanuts-0.023
Token examine
Token ID10716
Feature activation-5.62e-01

Pos logit contributions
uddled+0.006
uffed+0.006
ist+0.006
itting+0.006
arest+0.005
Neg logit contributions
 his-0.003
 night-0.002
 restaurants-0.002
 day-0.002
 himself-0.002
Token their
Token ID511
Feature activation-3.89e-01

Pos logit contributions
uddled+0.080
uffed+0.076
ist+0.072
itting+0.072
arest+0.071
Neg logit contributions
 his-0.041
 night-0.029
 restaurants-0.028
 himself-0.027
 day-0.027
Token finds
Token ID7228
Feature activation+8.05e-01

Pos logit contributions
uddled+0.052
uffed+0.048
ist+0.047
itting+0.046
ower+0.046
Neg logit contributions
 his-0.029
 restaurants-0.022
 night-0.020
 himself-0.020
 Freddy-0.020
Token,
Token ID11
Feature activation-1.29e-01

Pos logit contributions
uddled+0.031
uffed+0.030
arest+0.029
itting+0.029
ve+0.028
Neg logit contributions
 his-0.017
 day-0.012
 night-0.012
 himself-0.011
 end-0.011
Token next
Token ID1306
Feature activation-5.66e-01

Pos logit contributions
uddled+0.018
arest+0.017
uffed+0.017
ve+0.017
op+0.017
Neg logit contributions
 his-0.011
 night-0.007
 himself-0.007
 search-0.007
 day-0.006
Token to
Token ID284
Feature activation+9.76e-02

Pos logit contributions
uddled+0.072
uffed+0.067
ist+0.065
itting+0.064
arest+0.063
Neg logit contributions
 his-0.048
 restaurants-0.036
 night-0.036
 peanuts-0.034
 himself-0.034
Token her
Token ID607
Feature activation+2.17e-01

Pos logit contributions
 his+0.005
 night+0.003
 restaurants+0.003
 peanuts+0.003
 meal+0.003
Neg logit contributions
uddled-0.015
uffed-0.015
ist-0.014
itting-0.014
arest-0.014
Token magn
Token ID7842
Feature activation-3.94e-01

Pos logit contributions
 his+0.016
 night+0.012
 day+0.011
 restaurants+0.011
 end+0.011
Neg logit contributions
uddled-0.031
uffed-0.030
arest-0.029
itting-0.029
ist-0.028
Tokenifying
Token ID4035
Feature activation-3.43e+00

Pos logit contributions
uddled+0.033
Because+0.031
ower+0.031
keys+0.029
arest+0.027
Neg logit contributions
 his-0.043
 home-0.038
 him-0.038
 night-0.037
 dragons-0.036
Token glass
Token ID5405
Feature activation-4.32e-01

Pos logit contributions
uddled+0.424
uffed+0.396
ist+0.380
itting+0.371
arest+0.370
Neg logit contributions
 his-0.293
 night-0.227
 restaurants-0.225
 day-0.215
 himself-0.215
Token.
Token ID13
Feature activation-1.08e-01

Pos logit contributions
uddled+0.060
uffed+0.058
arest+0.057
itting+0.055
ve+0.055
Neg logit contributions
 his-0.035
 night-0.026
 day-0.024
 food-0.023
 meal-0.023
Token She
Token ID1375
Feature activation-4.13e-01

Pos logit contributions
uddled+0.013
uffed+0.012
ist+0.012
arest+0.012
itting+0.012
Neg logit contributions
 his-0.010
 night-0.008
 restaurants-0.007
 day-0.007
 himself-0.007
Token looked
Token ID3114
Feature activation+3.15e-01

Pos logit contributions
ower+0.057
arest+0.055
itting+0.054
uddled+0.054
ended+0.053
Neg logit contributions
 his-0.034
 search-0.025
 night-0.024
 dragons-0.022
 end-0.022
Token at
Token ID379
Feature activation-4.49e-02

Pos logit contributions
 his+0.031
 night+0.024
 restaurants+0.023
 day+0.023
 end+0.022
Neg logit contributions
uddled-0.037
uffed-0.035
ist-0.033
arest-0.033
itting-0.032
Token said
Token ID531
Feature activation-7.66e-01

Pos logit contributions
 his+0.038
 night+0.031
 restaurants+0.031
 day+0.030
 himself+0.030
Neg logit contributions
uddled-0.030
uffed-0.027
ist-0.025
arest-0.025
itting-0.025
Token,
Token ID11
Feature activation+4.33e-02

Pos logit contributions
uddled+0.105
uffed+0.098
arest+0.095
ist+0.094
itting+0.094
Neg logit contributions
 his-0.061
 night-0.043
 himself-0.041
 day-0.040
 restaurants-0.040
Token holding
Token ID4769
Feature activation+1.08e+00

Pos logit contributions
 his+0.004
 night+0.003
 restaurants+0.003
 day+0.003
 meal+0.003
Neg logit contributions
uddled-0.005
uffed-0.005
ist-0.004
itting-0.004
arest-0.004
Token the
Token ID262
Feature activation+1.50e-01

Pos logit contributions
 his+0.071
 restaurants+0.067
 night+0.063
 meal+0.061
 jungle+0.061
Neg logit contributions
uddled-0.138
uffed-0.136
ist-0.128
op-0.125
itting-0.123
Token magn
Token ID7842
Feature activation-2.10e-01

Pos logit contributions
 his+0.012
 night+0.009
 restaurants+0.008
 day+0.008
 end+0.008
Neg logit contributions
uddled-0.020
uffed-0.019
ist-0.018
itting-0.018
arest-0.018
Tokenifying
Token ID4035
Feature activation-3.40e+00

Pos logit contributions
uddled+0.016
uffed+0.013
Because+0.012
ower+0.012
ended+0.011
Neg logit contributions
 his-0.028
 night-0.024
 dragons-0.023
 bear-0.023
 mushroom-0.023
Token glass
Token ID5405
Feature activation-3.72e-01

Pos logit contributions
uddled+0.385
uffed+0.358
ist+0.343
op+0.337
arest+0.337
Neg logit contributions
 his-0.304
 restaurants-0.245
 night-0.239
 himself-0.235
 day-0.234
Token over
Token ID625
Feature activation+5.22e-03

Pos logit contributions
uddled+0.050
uffed+0.047
arest+0.045
itting+0.044
ist+0.044
Neg logit contributions
 his-0.031
 night-0.023
 day-0.022
 end-0.021
 himself-0.021
Token a
Token ID257
Feature activation-1.39e-01

Pos logit contributions
 his+0.000
 restaurants+0.000
 night+0.000
 meal+0.000
 day+0.000
Neg logit contributions
uddled-0.001
uffed-0.001
ist-0.001
itting-0.001
ve-0.001
Token black
Token ID2042
Feature activation+4.97e-01

Pos logit contributions
uddled+0.018
uffed+0.017
ist+0.016
arest+0.016
itting+0.016
Neg logit contributions
 his-0.011
 restaurants-0.009
 night-0.008
 himself-0.008
 day-0.008
Token ant
Token ID1885
Feature activation+1.24e-01

Pos logit contributions
 his+0.042
 restaurants+0.034
 night+0.033
 himself+0.031
 peanuts+0.031
Neg logit contributions
uddled-0.060
ist-0.055
uffed-0.055
op-0.054
itting-0.053
Token.
Token ID13
Feature activation-5.80e-02

Pos logit contributions
uddled+0.107
uffed+0.101
arest+0.095
itting+0.093
ist+0.092
Neg logit contributions
 his-0.059
 night-0.042
 day-0.039
 himself-0.039
 peanuts-0.037
Token Lily
Token ID20037
Feature activation+2.97e-01

Pos logit contributions
uffed+0.007
uddled+0.006
ows+0.006
op+0.006
itting+0.006
Neg logit contributions
 his-0.004
 battle-0.004
 night-0.004
 restaurants-0.004
 saf-0.004
Token had
Token ID550
Feature activation-1.87e-01

Pos logit contributions
 his+0.027
 night+0.020
 day+0.019
 search+0.019
 jungle+0.018
Neg logit contributions
uddled-0.038
arest-0.036
uffed-0.036
itting-0.035
ower-0.035
Token a
Token ID257
Feature activation-1.66e-01

Pos logit contributions
uddled+0.029
uffed+0.028
ist+0.027
itting+0.026
ower+0.026
Neg logit contributions
 his-0.013
 night-0.008
 himself-0.008
 restaurants-0.008
 day-0.008
Token magn
Token ID7842
Feature activation-2.40e-01

Pos logit contributions
uddled+0.021
uffed+0.020
ist+0.019
itting+0.019
arest+0.018
Neg logit contributions
 his-0.014
 night-0.011
 day-0.010
 restaurants-0.010
 meal-0.010
Tokenifying
Token ID4035
Feature activation-3.40e+00

Pos logit contributions
uddled+0.020
liv+0.017
ower+0.016
Because+0.016
ended+0.015
Neg logit contributions
 his-0.030
 him-0.025
 night-0.025
 dinosaurs-0.025
 bear-0.024
Token glass
Token ID5405
Feature activation-4.49e-01

Pos logit contributions
uddled+0.391
uffed+0.365
ist+0.351
arest+0.345
itting+0.340
Neg logit contributions
 his-0.299
 restaurants-0.240
 night-0.232
 himself-0.230
 agreement-0.228
Token that
Token ID326
Feature activation-6.89e-01

Pos logit contributions
uddled+0.060
uffed+0.057
arest+0.054
ve+0.053
itting+0.053
Neg logit contributions
 his-0.039
 night-0.029
 day-0.029
 peanuts-0.028
 end-0.028
Token she
Token ID673
Feature activation-3.40e-01

Pos logit contributions
uddled+0.082
uffed+0.075
ist+0.074
itting+0.073
arest+0.072
Neg logit contributions
 his-0.073
 night-0.056
 day-0.054
 himself-0.054
 restaurants-0.052
Token found
Token ID1043
Feature activation+7.60e-01

Pos logit contributions
uddled+0.045
uffed+0.042
itting+0.041
ower+0.041
outing+0.040
Neg logit contributions
 his-0.030
 night-0.022
 search-0.022
 day-0.021
 himself-0.021
Token in
Token ID287
Feature activation-3.48e-02

Pos logit contributions
 his+0.067
 himself+0.049
 night+0.049
 day+0.047
 restaurants+0.046
Neg logit contributions
uddled-0.099
uffed-0.093
ist-0.089
arest-0.089
itting-0.088
Token.
Token ID13
Feature activation-8.58e-02

Pos logit contributions
 his+0.033
 night+0.023
 day+0.022
 peanuts+0.021
 bears+0.020
Neg logit contributions
uddled-0.075
uffed-0.073
arest-0.068
itting-0.067
ended-0.066
Token They
Token ID1119
Feature activation+2.69e-01

Pos logit contributions
uddled+0.011
uffed+0.010
itting+0.010
ist+0.010
ower+0.010
Neg logit contributions
 his-0.007
 restaurants-0.006
 night-0.005
 meal-0.005
 dragons-0.005
Token had
Token ID550
Feature activation-2.07e-01

Pos logit contributions
 his+0.024
 night+0.017
 search+0.017
 end+0.016
 day+0.016
Neg logit contributions
uddled-0.037
uffed-0.034
arest-0.034
ower-0.033
ider-0.033
Token a
Token ID257
Feature activation-1.43e-01

Pos logit contributions
uddled+0.031
uffed+0.030
ist+0.029
arest+0.028
itting+0.028
Neg logit contributions
 his-0.014
 night-0.009
 restaurants-0.009
 himself-0.009
 day-0.009
Token magn
Token ID7842
Feature activation-2.07e-01

Pos logit contributions
uddled+0.019
uffed+0.018
ist+0.017
itting+0.017
arest+0.017
Neg logit contributions
 his-0.012
 night-0.009
 restaurants-0.009
 day-0.008
 meal-0.008
Tokenifying
Token ID4035
Feature activation-3.39e+00

Pos logit contributions
uddled+0.018
ower+0.016
ended+0.016
liv+0.015
keys+0.015
Neg logit contributions
 his-0.022
 him-0.019
 home-0.019
 dinosaurs-0.019
 night-0.019
Token glass
Token ID5405
Feature activation-4.25e-01

Pos logit contributions
uddled+0.396
uffed+0.368
ist+0.355
arest+0.349
itting+0.347
Neg logit contributions
 his-0.292
 restaurants-0.228
 himself-0.222
 night-0.220
 robbers-0.220
Token and
Token ID290
Feature activation-1.92e-01

Pos logit contributions
uddled+0.056
uffed+0.055
arest+0.052
ended+0.051
ve+0.051
Neg logit contributions
 his-0.036
 day-0.028
 night-0.028
 peanuts-0.027
 end-0.027
Token a
Token ID257
Feature activation-1.83e-01

Pos logit contributions
uddled+0.032
uffed+0.031
itting+0.030
ist+0.030
outing+0.029
Neg logit contributions
 his-0.012
 food-0.007
 peanuts-0.007
 restaurants-0.007
 night-0.006
Token small
Token ID1402
Feature activation-7.82e-01

Pos logit contributions
uddled+0.024
uffed+0.023
ist+0.022
itting+0.021
arest+0.021
Neg logit contributions
 his-0.015
 night-0.012
 day-0.011
 restaurants-0.011
 meal-0.011
Token box
Token ID3091
Feature activation+6.16e-01

Pos logit contributions
uddled+0.099
uffed+0.093
ist+0.090
itting+0.088
arest+0.088
Neg logit contributions
 his-0.064
 restaurants-0.049
 night-0.048
 himself-0.046
 day-0.045
Token the
Token ID262
Feature activation+2.75e-01

Pos logit contributions
 meal+0.002
 restaurants+0.002
 salads+0.002
 pizz+0.002
 saf+0.002
Neg logit contributions
uddled-0.005
uffed-0.005
ist-0.005
Because-0.004
outing-0.004
Token watch
Token ID2342
Feature activation-3.89e-01

Pos logit contributions
 his+0.020
 night+0.018
 day+0.017
 end+0.017
 jungle+0.016
Neg logit contributions
uddled-0.039
uffed-0.038
ist-0.035
ended-0.035
itting-0.034
Token with
Token ID351
Feature activation+6.92e-03

Pos logit contributions
uddled+0.054
uffed+0.052
ve+0.050
arest+0.050
itting+0.048
Neg logit contributions
 his-0.031
 night-0.024
 day-0.023
 end-0.022
 morning-0.021
Token the
Token ID262
Feature activation+2.25e-01

Pos logit contributions
 his+0.000
 restaurants+0.000
 night+0.000
 jungle+0.000
 day+0.000
Neg logit contributions
uddled-0.001
uffed-0.001
ist-0.001
arest-0.001
itting-0.001
Token magn
Token ID7842
Feature activation-2.09e-01

Pos logit contributions
 his+0.017
 night+0.014
 day+0.014
 end+0.013
 food+0.013
Neg logit contributions
uddled-0.032
uffed-0.030
ist-0.028
ended-0.028
itting-0.027
Tokenifying
Token ID4035
Feature activation-3.37e+00

Pos logit contributions
uddled+0.019
Because+0.015
ower+0.015
uffed+0.015
liv+0.014
Neg logit contributions
 his-0.023
 night-0.020
 dragons-0.020
 bear-0.020
 mushroom-0.019
Token glass
Token ID5405
Feature activation-3.24e-01

Pos logit contributions
uddled+0.414
uffed+0.387
ist+0.371
arest+0.363
itting+0.362
Neg logit contributions
 his-0.290
 night-0.224
 restaurants-0.224
 himself-0.214
 day-0.214
Token.
Token ID13
Feature activation+8.99e-02

Pos logit contributions
uddled+0.047
uffed+0.045
arest+0.044
itting+0.043
ve+0.043
Neg logit contributions
 his-0.024
 night-0.017
 day-0.017
 end-0.016
 food-0.016
Token He
Token ID679
Feature activation+2.79e-02

Pos logit contributions
 his+0.005
 night+0.003
 restaurants+0.003
 day+0.003
 himself+0.003
Neg logit contributions
uddled-0.014
uffed-0.013
ist-0.013
itting-0.012
arest-0.012
Token saw
Token ID2497
Feature activation+2.83e-01

Pos logit contributions
 his+0.002
 night+0.001
 search+0.001
 himself+0.001
 bear+0.001
Neg logit contributions
uddled-0.004
arest-0.004
itting-0.004
ower-0.004
ist-0.004
Token the
Token ID262
Feature activation+2.27e-01

Pos logit contributions
 his+0.020
 restaurants+0.017
 night+0.017
 meal+0.016
 agreement+0.016
Neg logit contributions
uddled-0.037
uffed-0.035
ist-0.032
arest-0.031
ended-0.031
Token She
Token ID1375
Feature activation-5.13e-01

Pos logit contributions
uddled+0.014
uffed+0.013
itting+0.012
op+0.012
ist+0.012
Neg logit contributions
 his-0.008
 night-0.006
 restaurants-0.006
 meal-0.006
 robbers-0.006
Token had
Token ID550
Feature activation-1.24e-01

Pos logit contributions
arest+0.069
uddled+0.068
uffed+0.068
ower+0.066
ended+0.066
Neg logit contributions
 his-0.044
 night-0.034
 day-0.033
 end-0.030
 search-0.030
Token a
Token ID257
Feature activation-8.69e-02

Pos logit contributions
uddled+0.019
uffed+0.018
ist+0.018
arest+0.017
itting+0.017
Neg logit contributions
 his-0.008
 night-0.006
 restaurants-0.005
 day-0.005
 himself-0.005
Token big
Token ID1263
Feature activation+2.25e-01

Pos logit contributions
uddled+0.011
uffed+0.010
ist+0.010
itting+0.010
arest+0.010
Neg logit contributions
 his-0.007
 night-0.006
 restaurants-0.006
 himself-0.005
 day-0.005
Token magn
Token ID7842
Feature activation-1.89e-01

Pos logit contributions
 his+0.020
 himself+0.016
 restaurants+0.015
 night+0.014
 peanuts+0.014
Neg logit contributions
uddled-0.026
uffed-0.025
ist-0.024
itting-0.024
op-0.024
Tokenifying
Token ID4035
Feature activation-3.37e+00

Pos logit contributions
uddled+0.017
liv+0.014
ended+0.014
ower+0.014
arest+0.013
Neg logit contributions
 his-0.020
 him-0.017
 night-0.017
 dinosaurs-0.017
 dragons-0.017
Token glass
Token ID5405
Feature activation-4.06e-01

Pos logit contributions
uddled+0.396
uffed+0.369
ist+0.357
itting+0.348
arest+0.348
Neg logit contributions
 his-0.295
 restaurants-0.231
 night-0.225
 himself-0.224
 robbers-0.219
Token that
Token ID326
Feature activation-6.42e-01

Pos logit contributions
uddled+0.056
uffed+0.054
arest+0.052
ve+0.051
ended+0.050
Neg logit contributions
 his-0.033
 day-0.026
 night-0.026
 end-0.024
 meal-0.024
Token made
Token ID925
Feature activation+1.85e-01

Pos logit contributions
uddled+0.083
uffed+0.077
arest+0.075
ider+0.075
ist+0.075
Neg logit contributions
 his-0.062
 night-0.050
 day-0.048
 search-0.044
 restaurants-0.043
Token things
Token ID1243
Feature activation+2.82e-01

Pos logit contributions
 his+0.010
 night+0.006
 himself+0.005
 day+0.005
 food+0.005
Neg logit contributions
uddled-0.031
uffed-0.029
arest-0.028
ider-0.028
ist-0.028
Token look
Token ID804
Feature activation+1.97e-01

Pos logit contributions
 his+0.022
 night+0.014
 himself+0.014
 search+0.014
 day+0.013
Neg logit contributions
uddled-0.044
uffed-0.041
liv-0.038
ider-0.038
itting-0.038
Token borrowed
Token ID22546
Feature activation+6.02e-01

Pos logit contributions
uddled+0.057
uffed+0.054
ist+0.052
itting+0.052
arest+0.051
Neg logit contributions
 his-0.048
 night-0.037
 himself-0.036
 day-0.035
 end-0.035
Token her
Token ID607
Feature activation+2.52e-01

Pos logit contributions
 his+0.040
 restaurants+0.039
 night+0.035
 jungle+0.034
 saf+0.033
Neg logit contributions
uddled-0.073
uffed-0.071
ve-0.068
ist-0.066
op-0.066
Token mom
Token ID1995
Feature activation-2.44e-01

Pos logit contributions
 his+0.020
 night+0.015
 restaurants+0.015
 day+0.014
 himself+0.014
Neg logit contributions
uddled-0.033
uffed-0.031
ist-0.030
itting-0.029
arest-0.029
Token's
Token ID338
Feature activation+7.36e-01

Pos logit contributions
uddled+0.034
uffed+0.032
ist+0.031
itting+0.031
op+0.030
Neg logit contributions
 his-0.017
 restaurants-0.012
 night-0.012
 agreement-0.011
 himself-0.011
Token magn
Token ID7842
Feature activation-3.07e-01

Pos logit contributions
 his+0.059
 night+0.045
 restaurants+0.044
 day+0.042
 himself+0.041
Neg logit contributions
uddled-0.097
uffed-0.091
ist-0.088
itting-0.086
arest-0.085
Tokenifying
Token ID4035
Feature activation-3.35e+00

Pos logit contributions
uddled+0.024
Because+0.020
ower+0.020
arest+0.020
keys+0.020
Neg logit contributions
 his-0.037
 Freddy-0.031
 him-0.031
 search-0.031
 home-0.031
Token glass
Token ID5405
Feature activation-5.60e-01

Pos logit contributions
uddled+0.412
uffed+0.386
ist+0.369
itting+0.361
arest+0.360
Neg logit contributions
 his-0.286
 night-0.223
 restaurants-0.221
 day-0.212
 himself-0.210
Token to
Token ID284
Feature activation-7.00e-02

Pos logit contributions
uddled+0.075
uffed+0.071
arest+0.069
itting+0.067
ist+0.066
Neg logit contributions
 his-0.047
 night-0.034
 peanuts-0.034
 day-0.033
 meal-0.032
Token take
Token ID1011
Feature activation-8.85e-01

Pos logit contributions
uddled+0.010
uffed+0.010
ist+0.009
itting+0.009
arest+0.009
Neg logit contributions
 his-0.005
 night-0.003
 restaurants-0.003
 day-0.003
 himself-0.003
Token a
Token ID257
Feature activation-3.08e-01

Pos logit contributions
uddled+0.105
ist+0.103
ve+0.102
uffed+0.100
op+0.099
Neg logit contributions
 his-0.056
 restaurants-0.056
 jungle-0.048
 dinosaurs-0.048
 robbers-0.047
Token closer
Token ID5699
Feature activation+1.73e-01

Pos logit contributions
uddled+0.034
arest+0.033
ve+0.033
op+0.033
uffed+0.032
Neg logit contributions
 his-0.028
 restaurants-0.023
 himself-0.022
 dinosaurs-0.021
 salads-0.021
Token the
Token ID262
Feature activation+2.02e-01

Pos logit contributions
 his+0.018
 night+0.012
 restaurants+0.012
 day+0.011
 meal+0.011
Neg logit contributions
uddled-0.054
uffed-0.051
ist-0.050
itting-0.049
arest-0.049
Token necklace
Token ID38987
Feature activation+7.74e-01

Pos logit contributions
 his+0.017
 night+0.013
 day+0.013
 peanuts+0.013
 restaurants+0.013
Neg logit contributions
uddled-0.026
uffed-0.024
ist-0.024
itting-0.023
arest-0.023
Token and
Token ID290
Feature activation+1.73e-02

Pos logit contributions
 his+0.068
 night+0.052
 day+0.050
 end+0.047
 himself+0.047
Neg logit contributions
uddled-0.099
uffed-0.096
arest-0.091
itting-0.089
ve-0.088
Token the
Token ID262
Feature activation+1.87e-01

Pos logit contributions
 his+0.002
 night+0.001
 restaurants+0.001
 himself+0.001
 peanuts+0.001
Neg logit contributions
uddled-0.002
uffed-0.002
itting-0.002
ist-0.002
arest-0.002
Token magn
Token ID7842
Feature activation-2.78e-01

Pos logit contributions
 his+0.015
 night+0.012
 restaurants+0.011
 day+0.011
 peanuts+0.011
Neg logit contributions
uddled-0.024
uffed-0.023
ist-0.022
itting-0.022
arest-0.021
Tokenifying
Token ID4035
Feature activation-3.34e+00

Pos logit contributions
uddled+0.021
Because+0.020
ower+0.020
liv+0.017
keys+0.017
Neg logit contributions
 his-0.032
 night-0.029
 home-0.028
 him-0.028
 end-0.027
Token glass
Token ID5405
Feature activation-3.41e-01

Pos logit contributions
uddled+0.380
uffed+0.350
ist+0.331
arest+0.329
op+0.322
Neg logit contributions
 his-0.300
 restaurants-0.244
 night-0.233
 robbers-0.233
 agreement-0.233
Token and
Token ID290
Feature activation-5.60e-02

Pos logit contributions
uddled+0.048
uffed+0.047
arest+0.046
itting+0.045
ve+0.044
Neg logit contributions
 his-0.025
 night-0.018
 day-0.017
 peanuts-0.017
 food-0.016
Token started
Token ID2067
Feature activation-4.41e-02

Pos logit contributions
uddled+0.007
uffed+0.007
itting+0.006
ist+0.006
arest+0.006
Neg logit contributions
 his-0.005
 night-0.004
 restaurants-0.004
 himself-0.004
 day-0.004
Token to
Token ID284
Feature activation+8.65e-02

Pos logit contributions
uddled+0.005
uffed+0.005
ist+0.004
itting+0.004
op+0.004
Neg logit contributions
 his-0.004
 restaurants-0.003
 night-0.003
 end-0.003
 agreement-0.003
Token examine
Token ID10716
Feature activation-4.38e-01

Pos logit contributions
 his+0.007
 night+0.005
 restaurants+0.005
 day+0.005
 peanuts+0.005
Neg logit contributions
uddled-0.011
uffed-0.010
ist-0.010
arest-0.010
itting-0.010
Token hat
Token ID6877
Feature activation-7.85e-02

Pos logit contributions
 his+0.022
 night+0.017
 day+0.016
 meal+0.016
 restaurants+0.016
Neg logit contributions
uddled-0.037
uffed-0.035
ist-0.033
itting-0.033
arest-0.033
Token and
Token ID290
Feature activation-1.91e-01

Pos logit contributions
uddled+0.010
uffed+0.010
arest+0.010
itting+0.009
ve+0.009
Neg logit contributions
 his-0.007
 night-0.005
 day-0.005
 himself-0.005
 end-0.005
Token grabbed
Token ID13176
Feature activation-4.52e-01

Pos logit contributions
uddled+0.024
uffed+0.023
ist+0.022
arest+0.021
itting+0.021
Neg logit contributions
 his-0.016
 night-0.012
 restaurants-0.012
 day-0.011
 himself-0.011
Token her
Token ID607
Feature activation+2.32e-01

Pos logit contributions
uddled+0.064
uffed+0.061
ist+0.059
arest+0.058
ve+0.057
Neg logit contributions
 his-0.026
 restaurants-0.022
 jungle-0.020
 night-0.020
 saf-0.019
Token magn
Token ID7842
Feature activation-2.55e-01

Pos logit contributions
 his+0.019
 night+0.014
 day+0.014
 meal+0.013
 end+0.013
Neg logit contributions
uddled-0.032
uffed-0.030
itting-0.029
ist-0.028
arest-0.028
Tokenifying
Token ID4035
Feature activation-3.32e+00

Pos logit contributions
uddled+0.020
Because+0.017
arest+0.016
ower+0.016
keys+0.016
Neg logit contributions
 his-0.030
 home-0.027
 night-0.026
 search-0.026
 him-0.025
Token glass
Token ID5405
Feature activation-4.73e-01

Pos logit contributions
uddled+0.394
uffed+0.368
ist+0.350
arest+0.342
itting+0.340
Neg logit contributions
 his-0.288
 restaurants-0.229
 night-0.226
 day-0.219
 himself-0.218
Token.
Token ID13
Feature activation-1.69e-01

Pos logit contributions
uddled+0.064
uffed+0.060
arest+0.058
itting+0.057
ist+0.056
Neg logit contributions
 his-0.038
 night-0.029
 day-0.027
 peanuts-0.026
 meal-0.026
Token She
Token ID1375
Feature activation-3.94e-01

Pos logit contributions
uddled+0.021
uffed+0.020
ist+0.019
arest+0.018
itting+0.018
Neg logit contributions
 his-0.015
 night-0.012
 restaurants-0.011
 day-0.011
 himself-0.011
Token was
Token ID373
Feature activation-2.59e-01

Pos logit contributions
itting+0.052
ower+0.050
arest+0.050
uttering+0.049
uddled+0.049
Neg logit contributions
 his-0.036
 night-0.025
 bear-0.024
 search-0.023
 dragons-0.023
Token ready
Token ID3492
Feature activation-1.05e+00

Pos logit contributions
uddled+0.036
uffed+0.033
ist+0.032
itting+0.032
arest+0.032
Neg logit contributions
 his-0.020
 night-0.015
 restaurants-0.014
 himself-0.014
 day-0.014
Token\n
Token ID198
Feature activation+6.77e-02

Pos logit contributions
uddled+0.000
uffed+0.000
ist+0.000
arest+0.000
itting+0.000
Neg logit contributions
 his-0.000
 night-0.000
 restaurants-0.000
 peanuts-0.000
 meal-0.000
TokenHe
Token ID1544
Feature activation+5.23e-01

Pos logit contributions
 his+0.007
 night+0.006
 restaurants+0.006
 day+0.006
 meal+0.006
Neg logit contributions
uddled-0.007
uffed-0.007
ist-0.006
arest-0.006
itting-0.006
Token pointed
Token ID6235
Feature activation+4.47e-01

Pos logit contributions
 his+0.050
 himself+0.032
 bear+0.032
 search+0.031
 dragons+0.030
Neg logit contributions
uddled-0.066
itting-0.066
ower-0.066
arest-0.066
ute-0.063
Token his
Token ID465
Feature activation+1.00e-01

Pos logit contributions
 his+0.039
 night+0.029
 restaurants+0.028
 himself+0.027
 day+0.027
Neg logit contributions
uddled-0.058
uffed-0.054
ist-0.052
arest-0.051
itting-0.051
Token magn
Token ID7842
Feature activation-1.70e-01

Pos logit contributions
 his+0.009
 night+0.007
 day+0.006
 restaurants+0.006
 meal+0.006
Neg logit contributions
uddled-0.013
uffed-0.012
ist-0.012
itting-0.011
arest-0.011
Tokenifying
Token ID4035
Feature activation-3.30e+00

Pos logit contributions
uddled+0.015
Because+0.013
liv+0.013
ower+0.012
arest+0.012
Neg logit contributions
 his-0.020
 him-0.016
 search-0.015
 bear-0.015
 night-0.015
Token glass
Token ID5405
Feature activation-3.57e-01

Pos logit contributions
uddled+0.428
uffed+0.402
ist+0.387
itting+0.379
arest+0.378
Neg logit contributions
 his-0.275
 night-0.210
 restaurants-0.205
 day-0.195
 meal-0.194
Token at
Token ID379
Feature activation-1.15e-01

Pos logit contributions
uddled+0.050
arest+0.047
uffed+0.046
itting+0.046
ve+0.045
Neg logit contributions
 his-0.030
 night-0.020
 bear-0.020
 peanuts-0.019
 day-0.019
Token different
Token ID1180
Feature activation-6.54e-02

Pos logit contributions
uddled+0.018
uffed+0.017
ist+0.017
itting+0.016
ve+0.016
Neg logit contributions
 his-0.005
 restaurants-0.003
 night-0.003
 peanuts-0.003
 meal-0.003
Token classmates
Token ID28999
Feature activation-4.27e-01

Pos logit contributions
uddled+0.009
uffed+0.008
ist+0.008
itting+0.008
arest+0.008
Neg logit contributions
 his-0.005
 night-0.004
 restaurants-0.004
 day-0.003
 meal-0.003
Token and
Token ID290
Feature activation+2.68e-01

Pos logit contributions
uddled+0.059
uffed+0.056
arest+0.052
itting+0.051
ower+0.051
Neg logit contributions
 his-0.036
 night-0.026
 restaurants-0.025
 himself-0.024
 day-0.024
Token He
Token ID679
Feature activation+1.32e-01

Pos logit contributions
 his+0.030
 night+0.026
 peanuts+0.026
 end+0.025
 restaurants+0.025
Neg logit contributions
uddled-0.045
uffed-0.044
ist-0.041
op-0.041
ows-0.040
Token took
Token ID1718
Feature activation-3.42e-01

Pos logit contributions
 his+0.012
 night+0.009
 himself+0.009
 day+0.009
 dragons+0.008
Neg logit contributions
uddled-0.016
itting-0.015
uffed-0.015
arest-0.015
ower-0.015
Token out
Token ID503
Feature activation+3.25e-01

Pos logit contributions
uddled+0.046
uffed+0.045
ist+0.045
arest+0.043
ve+0.043
Neg logit contributions
 his-0.017
 restaurants-0.016
 peanuts-0.015
 meal-0.015
 jungle-0.014
Token a
Token ID257
Feature activation+2.76e-02

Pos logit contributions
 his+0.013
 night+0.009
 restaurants+0.009
 day+0.008
 meal+0.008
Neg logit contributions
uddled-0.052
uffed-0.050
ist-0.049
itting-0.047
ve-0.047
Token magn
Token ID7842
Feature activation-1.69e-01

Pos logit contributions
 his+0.002
 night+0.002
 restaurants+0.002
 day+0.002
 himself+0.002
Neg logit contributions
uddled-0.003
uffed-0.003
ist-0.003
itting-0.003
arest-0.003
Tokenifying
Token ID4035
Feature activation-3.30e+00

Pos logit contributions
uddled+0.015
ower+0.013
keys+0.012
Because+0.012
ended+0.012
Neg logit contributions
 his-0.020
 night-0.017
 armor-0.016
 him-0.016
 dragons-0.016
Token glass
Token ID5405
Feature activation-2.71e-01

Pos logit contributions
uddled+0.427
uffed+0.402
ist+0.385
itting+0.377
arest+0.376
Neg logit contributions
 his-0.273
 night-0.210
 restaurants-0.205
 day-0.195
 meal-0.193
Token and
Token ID290
Feature activation-3.16e-03

Pos logit contributions
uddled+0.036
uffed+0.035
arest+0.033
itting+0.033
ve+0.032
Neg logit contributions
 his-0.023
 night-0.017
 day-0.017
 meal-0.016
 peanuts-0.016
Token a
Token ID257
Feature activation-2.36e-02

Pos logit contributions
uddled+0.000
uffed+0.000
itting+0.000
outing+0.000
ower+0.000
Neg logit contributions
 his-0.000
 night-0.000
 himself-0.000
 restaurants-0.000
 day-0.000
Token notebook
Token ID20922
Feature activation+4.69e-01

Pos logit contributions
uddled+0.003
uffed+0.003
ist+0.003
itting+0.003
outing+0.003
Neg logit contributions
 his-0.002
 night-0.001
 day-0.001
 meal-0.001
 end-0.001
Token and
Token ID290
Feature activation-4.10e-02

Pos logit contributions
 his+0.037
 night+0.027
 day+0.026
 himself+0.025
 end+0.024
Neg logit contributions
uddled-0.065
uffed-0.063
itting-0.060
outing-0.060
arest-0.059
Token She
Token ID1375
Feature activation-1.47e-01

Pos logit contributions
uddled+0.011
uffed+0.010
arest+0.010
ist+0.010
itting+0.009
Neg logit contributions
 his-0.005
 himself-0.003
 night-0.003
 day-0.003
 Prince-0.003
Token takes
Token ID2753
Feature activation-6.21e-01

Pos logit contributions
uddled+0.019
ower+0.018
uffed+0.018
arest+0.018
itting+0.018
Neg logit contributions
 his-0.013
 end-0.009
 search-0.009
 night-0.009
 bear-0.009
Token out
Token ID503
Feature activation+3.63e-01

Pos logit contributions
uddled+0.080
uffed+0.078
ist+0.076
op+0.075
ve+0.074
Neg logit contributions
 his-0.041
 restaurants-0.034
 jungle-0.033
 meal-0.032
 peanuts-0.032
Token a
Token ID257
Feature activation+1.86e-02

Pos logit contributions
 his+0.022
 night+0.015
 restaurants+0.014
 day+0.013
 himself+0.013
Neg logit contributions
uddled-0.056
uffed-0.053
ist-0.051
arest-0.050
itting-0.050
Token magn
Token ID7842
Feature activation-1.72e-01

Pos logit contributions
 his+0.002
 night+0.001
 restaurants+0.001
 himself+0.001
 day+0.001
Neg logit contributions
uddled-0.002
uffed-0.002
ist-0.002
arest-0.002
itting-0.002
Tokenifying
Token ID4035
Feature activation-3.26e+00

Pos logit contributions
uddled+0.015
ower+0.012
Because+0.012
uffed+0.011
liv+0.011
Neg logit contributions
 his-0.021
 him-0.018
 night-0.018
 dragons-0.017
 himself-0.017
Token glass
Token ID5405
Feature activation-1.85e-01

Pos logit contributions
uddled+0.422
uffed+0.397
ist+0.381
itting+0.373
arest+0.371
Neg logit contributions
 his-0.270
 night-0.207
 restaurants-0.202
 day-0.193
 meal-0.191
Token.
Token ID13
Feature activation-1.18e-01

Pos logit contributions
uddled+0.025
uffed+0.024
itting+0.023
arest+0.022
ve+0.022
Neg logit contributions
 his-0.016
 night-0.012
 end-0.011
 day-0.011
 meal-0.011
Token She
Token ID1375
Feature activation-1.89e-01

Pos logit contributions
uddled+0.018
arest+0.016
ist+0.016
uffed+0.016
ve+0.016
Neg logit contributions
 his-0.008
 himself-0.005
 night-0.005
 -0.005
 him-0.005
Token looks
Token ID3073
Feature activation-2.96e-01

Pos logit contributions
uddled+0.025
ower+0.024
uffed+0.024
itting+0.024
arest+0.024
Neg logit contributions
 his-0.015
 search-0.011
 night-0.011
 dragons-0.011
 end-0.011
Token at
Token ID379
Feature activation+1.40e-01

Pos logit contributions
uddled+0.032
uffed+0.030
ist+0.029
itting+0.028
arest+0.028
Neg logit contributions
 his-0.030
 night-0.024
 restaurants-0.024
 day-0.023
 himself-0.023
Token cried
Token ID16896
Feature activation+7.21e-01

Pos logit contributions
 his+0.031
 night+0.021
 restaurants+0.020
 himself+0.020
 search+0.020
Neg logit contributions
uddled-0.058
uffed-0.054
arest-0.053
itting-0.052
ist-0.051
Token,
Token ID11
Feature activation+3.30e-01

Pos logit contributions
 his+0.062
 restaurants+0.055
 jungle+0.052
 night+0.051
 meal+0.051
Neg logit contributions
uddled-0.084
uffed-0.079
ist-0.077
op-0.075
itting-0.074
Token dropping
Token ID12047
Feature activation+9.73e-02

Pos logit contributions
 his+0.030
 himself+0.021
 night+0.021
 agreement+0.020
 end+0.020
Neg logit contributions
uddled-0.043
uffed-0.040
arest-0.039
itting-0.039
ist-0.038
Token the
Token ID262
Feature activation+2.69e-01

Pos logit contributions
 his+0.005
 restaurants+0.005
 night+0.005
 jungle+0.004
 meal+0.004
Neg logit contributions
uddled-0.013
uffed-0.013
ist-0.012
op-0.012
itting-0.012
Token magn
Token ID7842
Feature activation-8.19e-02

Pos logit contributions
 his+0.021
 night+0.016
 day+0.016
 restaurants+0.015
 end+0.015
Neg logit contributions
uddled-0.037
uffed-0.035
ist-0.033
itting-0.032
arest-0.032
Tokenifying
Token ID4035
Feature activation-3.26e+00

Pos logit contributions
uddled+0.007
Because+0.006
uffed+0.006
arest+0.005
ower+0.005
Neg logit contributions
 his-0.009
 night-0.009
 bear-0.008
 dragons-0.008
 end-0.008
Token glass
Token ID5405
Feature activation-2.92e-01

Pos logit contributions
uddled+0.392
uffed+0.366
ist+0.350
arest+0.342
itting+0.341
Neg logit contributions
 his-0.283
 restaurants-0.220
 night-0.218
 himself-0.212
 day-0.210
Token.
Token ID13
Feature activation-1.87e-01

Pos logit contributions
uddled+0.042
uffed+0.040
arest+0.038
itting+0.038
ve+0.037
Neg logit contributions
 his-0.023
 night-0.017
 day-0.016
 end-0.015
 bear-0.015
Token
Token ID198
Feature activation-1.21e-01

Pos logit contributions
uddled+0.029
ist+0.029
arest+0.028
uffed+0.027
itting+0.027
Neg logit contributions
 his-0.011
 Freddy-0.006
 -0.006
 himself-0.006
 Lily-0.006
Token
Token ID198
Feature activation-6.27e-02

Pos logit contributions
uddled+0.019
ist+0.018
uffed+0.018
itting+0.018
ider+0.017
Neg logit contributions
 his-0.007
 himself-0.004
 night-0.004
 him-0.004
 day-0.004
Token"
Token ID1
Feature activation-3.74e-01

Pos logit contributions
uddled+0.008
uffed+0.008
ist+0.008
itting+0.008
ows+0.007
Neg logit contributions
 his-0.005
 himself-0.004
 robbers-0.004
 end-0.004
 day-0.003
Token and
Token ID290
Feature activation-1.12e-01

Pos logit contributions
uddled+0.104
uffed+0.103
arest+0.095
itting+0.094
plets+0.092
Neg logit contributions
 his-0.053
 night-0.035
 day-0.035
 end-0.035
 himself-0.033
Token handed
Token ID10158
Feature activation-2.02e-01

Pos logit contributions
uddled+0.014
uffed+0.013
itting+0.013
ower+0.013
ist+0.013
Neg logit contributions
 his-0.011
 night-0.008
 himself-0.008
 Freddy-0.008
 day-0.007
Token him
Token ID683
Feature activation+1.72e-01

Pos logit contributions
uddled+0.024
uffed+0.022
a+0.022
ve+0.021
op+0.021
Neg logit contributions
 jungle-0.014
 his-0.013
 restaurants-0.013
 saf-0.013
 meal-0.012
Token the
Token ID262
Feature activation+2.49e-01

Pos logit contributions
 his+0.010
 night+0.008
 restaurants+0.008
 day+0.008
 agreement+0.008
Neg logit contributions
uddled-0.023
uffed-0.022
ist-0.022
op-0.021
ows-0.021
Token magn
Token ID7842
Feature activation-2.24e-01

Pos logit contributions
 his+0.023
 night+0.019
 end+0.018
 day+0.018
 peanuts+0.018
Neg logit contributions
uddled-0.030
uffed-0.028
ist-0.027
itting-0.026
op-0.026
Tokenifying
Token ID4035
Feature activation-3.24e+00

Pos logit contributions
uddled+0.020
liv+0.016
ower+0.016
Because+0.016
uffed+0.016
Neg logit contributions
 his-0.025
 night-0.023
 home-0.022
 dragons-0.022
 end-0.021
Token glass
Token ID5405
Feature activation-4.59e-01

Pos logit contributions
uddled+0.386
uffed+0.360
ist+0.345
arest+0.339
itting+0.336
Neg logit contributions
 his-0.288
 restaurants-0.225
 night-0.223
 himself-0.216
 day-0.214
Token.
Token ID13
Feature activation+1.51e-01

Pos logit contributions
uddled+0.060
uffed+0.057
itting+0.054
arest+0.054
ist+0.053
Neg logit contributions
 his-0.040
 night-0.030
 himself-0.028
 day-0.028
 peanuts-0.028
Token
Token ID198
Feature activation+1.76e-01

Pos logit contributions
 his+0.011
 night+0.009
 restaurants+0.008
 day+0.008
 meal+0.008
Neg logit contributions
uddled-0.020
uffed-0.019
ist-0.018
itting-0.018
arest-0.018
Token
Token ID198
Feature activation+2.20e-01

Pos logit contributions
 his+0.014
 night+0.010
 restaurants+0.010
 day+0.010
 meal+0.010
Neg logit contributions
uddled-0.023
uffed-0.022
ist-0.021
itting-0.021
arest-0.021
TokenHe
Token ID1544
Feature activation+6.21e-01

Pos logit contributions
 his+0.026
 night+0.021
 restaurants+0.021
 himself+0.021
 end+0.020
Neg logit contributions
uddled-0.021
uffed-0.020
ist-0.019
itting-0.018
arest-0.018
Token and
Token ID290
Feature activation-4.41e-01

Pos logit contributions
uddled+0.080
itting+0.078
arest+0.077
uffed+0.076
keys+0.074
Neg logit contributions
 his-0.042
 night-0.030
 himself-0.030
 home-0.027
 day-0.026
Token put
Token ID1234
Feature activation-4.48e-01

Pos logit contributions
uddled+0.065
itting+0.062
uffed+0.060
ist+0.059
arest+0.058
Neg logit contributions
 his-0.034
 night-0.026
 day-0.023
 himself-0.023
 search-0.022
Token on
Token ID319
Feature activation-8.77e-01

Pos logit contributions
uddled+0.069
uffed+0.065
itting+0.063
ist+0.062
arest+0.062
Neg logit contributions
 his-0.030
 night-0.020
 himself-0.018
 day-0.018
 restaurants-0.017
Token your
Token ID534
Feature activation-1.46e+00

Pos logit contributions
uddled+0.136
uffed+0.129
ist+0.124
itting+0.124
arest+0.123
Neg logit contributions
 his-0.054
 night-0.035
 restaurants-0.033
 himself-0.032
 day-0.031
Token p
Token ID279
Feature activation-1.11e+00

Pos logit contributions
uddled+0.197
uffed+0.186
itting+0.180
ist+0.179
arest+0.174
Neg logit contributions
 his-0.118
 night-0.089
 restaurants-0.084
 day-0.084
 meal-0.084
Tokenaj
Token ID1228
Feature activation-3.23e+00

Pos logit contributions
keys+0.082
arest+0.082
uddled+0.080
ver+0.074
ended+0.074
Neg logit contributions
 his-0.133
 home-0.120
 peanuts-0.117
 saf-0.114
 night-0.113
Tokenamas
Token ID17485
Feature activation-1.08e+00

Pos logit contributions
uddled+0.350
arest+0.347
itting+0.343
uffed+0.333
ended+0.323
Neg logit contributions
 his-0.299
 himself-0.244
 restaurants-0.241
 meal-0.235
 night-0.230
Token.
Token ID13
Feature activation-4.22e-01

Pos logit contributions
uddled+0.177
uffed+0.173
arest+0.162
keys+0.161
itting+0.160
Neg logit contributions
 his-0.067
 night-0.050
 home-0.044
 day-0.042
 himself-0.038
Token Then
Token ID3244
Feature activation-3.99e-01

Pos logit contributions
uddled+0.052
uffed+0.049
ist+0.047
itting+0.046
arest+0.046
Neg logit contributions
 his-0.037
 night-0.028
 restaurants-0.028
 day-0.027
 meal-0.026
Token we
Token ID356
Feature activation+1.68e-02

Pos logit contributions
uddled+0.051
uffed+0.047
ist+0.045
itting+0.045
arest+0.044
Neg logit contributions
 his-0.036
 night-0.028
 restaurants-0.027
 day-0.026
 himself-0.026
Token can
Token ID460
Feature activation+6.49e-02

Pos logit contributions
 his+0.001
 night+0.001
 end+0.001
 search+0.001
 day+0.001
Neg logit contributions
uddled-0.002
uffed-0.002
ower-0.002
arest-0.002
itting-0.002
Token said
Token ID531
Feature activation-3.11e-01

Pos logit contributions
 his+0.002
 night+0.001
 day+0.001
 himself+0.001
 Freddy+0.001
Neg logit contributions
uddled-0.002
ider-0.002
itting-0.002
outing-0.002
uffed-0.002
Token,
Token ID11
Feature activation+1.54e-01

Pos logit contributions
uddled+0.050
ist+0.048
uffed+0.047
itting+0.046
arest+0.046
Neg logit contributions
 his-0.019
 night-0.010
 himself-0.010
 Freddy-0.010
 day-0.010
Token
Token ID6184
Feature activation+9.41e-02

Pos logit contributions
 his+0.008
 +0.005
 himself+0.004
 day+0.004
 night+0.004
Neg logit contributions
uddled-0.025
ist-0.024
arest-0.024
uffed-0.024
itting-0.023
Token
Token ID95
Feature activation+2.02e+00

Pos logit contributions
 his+0.009
 saf+0.008
 morning+0.007
 end+0.007
 night+0.007
Neg logit contributions
uddled-0.009
as-0.008
uffed-0.008
a-0.008
keys-0.008
Token
Token ID26391
Feature activation+8.66e-01

Pos logit contributions
 his+0.172
 night+0.142
 day+0.139
 morning+0.137
 agreement+0.137
Neg logit contributions
uddled-0.239
uffed-0.229
outing-0.210
keys-0.206
op-0.205
Token
Token ID129
Feature activation+2.75e+00

Pos logit contributions
 his+0.081
 +0.070
 restaurants+0.068
 night+0.068
 jungle+0.068
Neg logit contributions
uddled-0.088
ist-0.085
outing-0.083
arest-0.083
uffed-0.081
Token
Token ID241
Feature activation+1.23e+00

Pos logit contributions
 his+0.292
 night+0.235
 himself+0.231
 restaurants+0.227
 day+0.226
Neg logit contributions
uddled-0.291
uffed-0.269
ist-0.257
itting-0.253
arest-0.250
TokenThat
Token ID2504
Feature activation+1.92e-01

Pos logit contributions
 his+0.121
 day+0.100
 night+0.097
 himself+0.095
 morning+0.090
Neg logit contributions
uddled-0.152
uffed-0.145
ist-0.141
ows-0.133
ushing-0.131
Token was
Token ID373
Feature activation-1.94e-01

Pos logit contributions
 his+0.014
 night+0.012
 day+0.012
 way+0.011
 morning+0.010
Neg logit contributions
uddled-0.029
uffed-0.028
ider-0.027
ended-0.027
outing-0.026
Token a
Token ID257
Feature activation-2.84e-02

Pos logit contributions
uddled+0.030
uffed+0.028
ist+0.027
op+0.026
itting+0.026
Neg logit contributions
 his-0.014
 night-0.009
 day-0.009
 himself-0.009
 end-0.008
Token very
Token ID845
Feature activation+7.94e-01

Pos logit contributions
uddled+0.003
uffed+0.003
ist+0.003
arest+0.003
itting+0.003
Neg logit contributions
 his-0.002
 restaurants-0.002
 night-0.002
 himself-0.002
 peanuts-0.002
Token and
Token ID290
Feature activation-1.22e-01

Pos logit contributions
uffed+0.003
uddled+0.003
arest+0.003
ve+0.003
itting+0.003
Neg logit contributions
 his-0.002
 night-0.001
 day-0.001
 himself-0.001
 morning-0.001
Token said
Token ID531
Feature activation-5.30e-01

Pos logit contributions
uddled+0.015
itting+0.014
uffed+0.014
ist+0.014
outing+0.014
Neg logit contributions
 his-0.013
 night-0.009
 himself-0.009
 day-0.009
 meal-0.008
Token
Token ID6184
Feature activation+3.51e-01

Pos logit contributions
uddled+0.081
ist+0.077
uffed+0.077
itting+0.075
outing+0.075
Neg logit contributions
 his-0.039
 himself-0.023
 Freddy-0.021
 night-0.021
 day-0.020
Token
Token ID95
Feature activation+2.14e+00

Pos logit contributions
 his+0.040
 saf+0.035
 Freddy+0.034
 himself+0.033
 jungle+0.033
Neg logit contributions
uddled-0.032
uffed-0.029
as-0.027
ve-0.025
itting-0.025
Token
Token ID26391
Feature activation+7.42e-01

Pos logit contributions
 his+0.208
 himself+0.169
 jungle+0.168
 night+0.165
 restaurants+0.163
Neg logit contributions
uddled-0.244
uffed-0.226
ve-0.206
itting-0.204
arest-0.204
Token
Token ID129
Feature activation+2.48e+00

Pos logit contributions
 his+0.076
 jungle+0.062
 restaurants+0.061
 saf+0.060
 search+0.059
Neg logit contributions
uddled-0.076
ist-0.071
arest-0.069
uffed-0.069
outing-0.068
Token
Token ID241
Feature activation+1.22e+00

Pos logit contributions
 his+0.260
 night+0.208
 restaurants+0.205
 himself+0.202
 day+0.199
Neg logit contributions
uddled-0.269
uffed-0.248
ist-0.237
itting-0.232
arest-0.231
TokenThank
Token ID10449
Feature activation+2.76e-02

Pos logit contributions
 his+0.119
 day+0.095
 himself+0.093
 night+0.092
 morning+0.087
Neg logit contributions
uddled-0.156
uffed-0.145
ist-0.144
ows-0.134
ushing-0.133
Token you
Token ID345
Feature activation-7.43e-01

Pos logit contributions
 his+0.002
 night+0.002
 day+0.002
 himself+0.002
 end+0.002
Neg logit contributions
uddled-0.004
uffed-0.003
ist-0.003
outing-0.003
itting-0.003
Token!
Token ID0
Feature activation+1.46e-01

Pos logit contributions
uddled+0.103
uffed+0.097
arest+0.090
ist+0.090
itting+0.090
Neg logit contributions
 his-0.061
 night-0.045
 day-0.042
 end-0.042
 himself-0.041
Token I
Token ID314
Feature activation-7.40e-02

Pos logit contributions
 his+0.012
 night+0.009
 restaurants+0.009
 himself+0.009
 day+0.009
Neg logit contributions
uddled-0.019
uffed-0.018
ist-0.017
arest-0.017
itting-0.017
Token softly
Token ID26625
Feature activation-3.13e-01

Pos logit contributions
 his+0.009
 himself+0.006
 night+0.005
 home+0.005
 him+0.005
Neg logit contributions
uddled-0.018
uffed-0.017
arest-0.016
ist-0.016
ve-0.016
Token,
Token ID11
Feature activation+2.76e-01

Pos logit contributions
uddled+0.046
uffed+0.043
arest+0.042
ist+0.040
itting+0.040
Neg logit contributions
 his-0.024
 night-0.016
 himself-0.015
 day-0.015
 home-0.014
Token
Token ID6184
Feature activation+3.14e-01

Pos logit contributions
 his+0.019
 himself+0.011
 +0.010
 night+0.010
 agreement+0.010
Neg logit contributions
uddled-0.043
arest-0.040
ist-0.040
uffed-0.040
itting-0.039
Token
Token ID95
Feature activation+2.13e+00

Pos logit contributions
 his+0.029
 saf+0.027
 jungle+0.026
 himself+0.024
 end+0.023
Neg logit contributions
uddled-0.034
uffed-0.032
as-0.030
itting-0.029
arest-0.029
Token
Token ID26391
Feature activation+9.53e-01

Pos logit contributions
 his+0.198
 jungle+0.165
 night+0.163
 himself+0.159
 end+0.156
Neg logit contributions
uddled-0.250
uffed-0.233
itting-0.211
outing-0.210
arest-0.208
Token
Token ID129
Feature activation+2.76e+00

Pos logit contributions
 his+0.088
 jungle+0.076
 night+0.075
 restaurants+0.075
 search+0.073
Neg logit contributions
uddled-0.102
uffed-0.094
ist-0.094
arest-0.090
outing-0.090
Token
Token ID241
Feature activation+1.30e+00

Pos logit contributions
 his+0.289
 night+0.228
 himself+0.228
 restaurants+0.221
 jungle+0.220
Neg logit contributions
uddled-0.303
uffed-0.279
ist-0.266
itting-0.263
arest-0.260
TokenHello
Token ID15496
Feature activation+3.27e-01

Pos logit contributions
 his+0.135
 himself+0.107
 night+0.106
 day+0.105
 winter+0.099
Neg logit contributions
uddled-0.160
ist-0.148
uffed-0.148
ushing-0.136
ows-0.132
Token?
Token ID30
Feature activation-7.45e-01

Pos logit contributions
 his+0.023
 night+0.015
 bear+0.015
 day+0.014
 restaurants+0.014
Neg logit contributions
uddled-0.050
uffed-0.048
arest-0.046
outing-0.044
ended-0.044
Token Who
Token ID5338
Feature activation+3.94e-01

Pos logit contributions
uddled+0.112
uffed+0.102
ist+0.101
arest+0.098
ower+0.097
Neg logit contributions
 his-0.053
 himself-0.034
 night-0.034
 Prince-0.033
 restaurants-0.033
Tokenâ
Token ID22940
Feature activation+8.87e-01

Pos logit contributions
 his+0.034
 night+0.023
 restaurants+0.022
 dragons+0.022
 him+0.022
Neg logit contributions
uddled-0.059
uffed-0.055
ower-0.055
outing-0.053
ider-0.052
Token mom
Token ID1995
Feature activation+2.01e-02

Pos logit contributions
uddled+0.049
uffed+0.046
ist+0.045
itting+0.045
arest+0.045
Neg logit contributions
 his-0.026
 night-0.018
 day-0.018
 himself-0.017
 meal-0.017
Tokenmy
Token ID1820
Feature activation+3.67e-01

Pos logit contributions
 his+0.002
 bear+0.001
 night+0.001
my+0.001
 end+0.001
Neg logit contributions
uddled-0.002
arest-0.002
ower-0.002
uttering-0.002
ute-0.002
Token replied
Token ID8712
Feature activation-8.10e-01

Pos logit contributions
 his+0.038
 bear+0.027
 him+0.026
 night+0.026
 bears+0.026
Neg logit contributions
arest-0.044
uddled-0.044
ower-0.042
uttering-0.042
itting-0.042
Token,
Token ID11
Feature activation+1.32e-01

Pos logit contributions
uddled+0.115
uffed+0.109
arest+0.107
ist+0.105
itting+0.104
Neg logit contributions
 his-0.063
 himself-0.041
 night-0.040
 day-0.038
 end-0.037
Token
Token ID6184
Feature activation+6.96e-01

Pos logit contributions
 his+0.010
 himself+0.006
 night+0.006
 day+0.006
 Freddy+0.006
Neg logit contributions
uddled-0.019
uffed-0.018
arest-0.018
ist-0.018
itting-0.017
Token
Token ID95
Feature activation+2.34e+00

Pos logit contributions
 his+0.074
 saf+0.062
 himself+0.059
 jungle+0.058
 night+0.058
Neg logit contributions
uddled-0.071
uffed-0.065
as-0.059
itting-0.058
ve-0.055
Token
Token ID26391
Feature activation+7.80e-01

Pos logit contributions
 his+0.224
 himself+0.180
 night+0.177
 jungle+0.175
 day+0.172
Neg logit contributions
uddled-0.274
uffed-0.256
itting-0.230
outing-0.228
arest-0.224
Token
Token ID129
Feature activation+2.68e+00

Pos logit contributions
 his+0.052
 restaurants+0.037
 jungle+0.035
 +0.034
 saf+0.034
Neg logit contributions
uddled-0.110
ist-0.103
uffed-0.102
arest-0.101
outing-0.099
Token
Token ID241
Feature activation+1.25e+00

Pos logit contributions
 his+0.290
 night+0.233
 himself+0.228
 restaurants+0.228
 day+0.224
Neg logit contributions
uddled-0.283
uffed-0.261
ist-0.248
itting-0.244
arest-0.242
TokenWe
Token ID1135
Feature activation+7.14e-01

Pos logit contributions
 his+0.112
 day+0.086
 himself+0.083
 night+0.082
 morning+0.075
Neg logit contributions
uddled-0.168
uffed-0.161
ist-0.158
ushing-0.148
ows-0.146
Token need
Token ID761
Feature activation+3.19e-01

Pos logit contributions
 his+0.057
 night+0.048
 end+0.045
 care+0.044
 search+0.043
Neg logit contributions
uddled-0.106
ider-0.096
uffed-0.093
ower-0.092
outing-0.091
Token".
Token ID1911
Feature activation+4.88e-01

Pos logit contributions
uddled+0.017
uffed+0.016
itting+0.015
arest+0.015
outing+0.015
Neg logit contributions
 his-0.010
 night-0.008
 day-0.008
 end-0.007
 home-0.007
Token
Token ID220
Feature activation+7.18e-01

Pos logit contributions
 his+0.030
 night+0.021
 restaurants+0.020
 day+0.019
 himself+0.019
Neg logit contributions
uddled-0.073
uffed-0.069
ist-0.067
itting-0.066
arest-0.066
Token
Token ID198
Feature activation+3.16e-01

Pos logit contributions
 his+0.035
 himself+0.021
 +0.019
 Freddy+0.015
 home+0.014
Neg logit contributions
uddled-0.117
ist-0.115
arest-0.111
ider-0.111
outing-0.109
Token
Token ID198
Feature activation+3.64e-01

Pos logit contributions
 his+0.022
 night+0.015
 himself+0.015
 day+0.014
 restaurants+0.014
Neg logit contributions
uddled-0.046
uffed-0.044
ist-0.043
itting-0.042
ider-0.041
TokenThe
Token ID464
Feature activation-7.59e-02

Pos logit contributions
 his+0.038
 himself+0.029
 end+0.029
 agreement+0.029
 day+0.029
Neg logit contributions
uddled-0.040
uffed-0.040
ist-0.037
ower-0.036
ows-0.036
Token end
Token ID886
Feature activation+2.46e+00

Pos logit contributions
uddled+0.010
uffed+0.010
ist+0.010
itting+0.010
uts+0.009
Neg logit contributions
 his-0.006
 end-0.005
 day-0.004
 night-0.004
 agreement-0.004
Token.
Token ID13
Feature activation+4.48e-01

Pos logit contributions
 his+0.169
 night+0.119
 day+0.115
 end+0.101
 himself+0.101
Neg logit contributions
uddled-0.383
uffed-0.370
arest-0.341
outing-0.337
itting-0.335
Token<|endoftext|>
Token ID50256
Feature activation+9.85e-02

Pos logit contributions
 his+0.036
 restaurants+0.028
 night+0.028
 peanuts+0.026
 meal+0.026
Neg logit contributions
uddled-0.057
uffed-0.054
ist-0.051
itting-0.051
arest-0.050
TokenOnce
Token ID7454
Feature activation-5.64e-03

Pos logit contributions
 his+0.008
 night+0.006
 restaurants+0.006
 day+0.005
 meal+0.005
Neg logit contributions
uddled-0.013
uffed-0.012
ist-0.012
itting-0.011
arest-0.011
Token upon
Token ID2402
Feature activation+1.24e-01

Pos logit contributions
uddled+0.001
uffed+0.001
ist+0.001
arest+0.001
ider+0.001
Neg logit contributions
 his-0.001
 night-0.000
 day-0.000
 restaurants-0.000
 end-0.000
Token a
Token ID257
Feature activation+1.19e-01

Pos logit contributions
 his+0.009
 restaurants+0.007
 night+0.007
 peanuts+0.006
 day+0.006
Neg logit contributions
uddled-0.017
uffed-0.016
ist-0.015
arest-0.015
itting-0.015
Token talk
Token ID1561
Feature activation-1.29e-01

Pos logit contributions
 his+0.005
 night+0.004
 restaurants+0.003
 end+0.003
 day+0.003
Neg logit contributions
uddled-0.007
uffed-0.007
arest-0.006
itting-0.006
ve-0.006
Token.
Token ID13
Feature activation+1.08e-02

Pos logit contributions
uddled+0.019
uffed+0.019
arest+0.017
op+0.017
ower+0.017
Neg logit contributions
 his-0.010
 night-0.007
 day-0.006
 himself-0.006
 dragons-0.006
Token They
Token ID1119
Feature activation+3.71e-01

Pos logit contributions
 his+0.001
 night+0.001
 restaurants+0.001
 day+0.001
 himself+0.001
Neg logit contributions
uddled-0.001
uffed-0.001
ist-0.001
itting-0.001
arest-0.001
Token talked
Token ID6619
Feature activation+4.22e-01

Pos logit contributions
 his+0.032
 night+0.025
 day+0.023
 end+0.023
 dragons+0.022
Neg logit contributions
uddled-0.051
uffed-0.048
ower-0.047
arest-0.047
ider-0.047
Token and
Token ID290
Feature activation-1.06e-01

Pos logit contributions
 his+0.039
 night+0.029
 day+0.028
 himself+0.027
 dragons+0.026
Neg logit contributions
uddled-0.054
uffed-0.053
arest-0.050
op-0.049
ended-0.049
Token talked
Token ID6619
Feature activation+4.25e-01

Pos logit contributions
uddled+0.010
uffed+0.010
ist+0.009
itting+0.009
ve+0.009
Neg logit contributions
 his-0.011
 restaurants-0.009
 night-0.009
 peanuts-0.009
 meal-0.009
Token and
Token ID290
Feature activation-8.80e-02

Pos logit contributions
 his+0.037
 night+0.027
 day+0.026
 himself+0.025
 dragons+0.024
Neg logit contributions
uddled-0.057
uffed-0.056
arest-0.052
itting-0.051
ended-0.051
Token soon
Token ID2582
Feature activation-8.25e-02

Pos logit contributions
uddled+0.009
uffed+0.009
ist+0.009
itting+0.008
ve+0.008
Neg logit contributions
 his-0.008
 restaurants-0.007
 night-0.007
 peanuts-0.007
 meal-0.007
Token enough
Token ID1576
Feature activation-3.79e-01

Pos logit contributions
uddled+0.011
uffed+0.010
ist+0.010
itting+0.009
arest+0.009
Neg logit contributions
 his-0.007
 night-0.005
 restaurants-0.005
 day-0.005
 himself-0.005
Token the
Token ID262
Feature activation+1.58e-01

Pos logit contributions
uddled+0.050
uffed+0.047
ist+0.046
itting+0.045
arest+0.044
Neg logit contributions
 his-0.030
 night-0.022
 restaurants-0.022
 day-0.021
 himself-0.021
Token truth
Token ID3872
Feature activation+8.35e-01

Pos logit contributions
 his+0.011
 night+0.009
 day+0.009
 restaurants+0.008
 meal+0.008
Neg logit contributions
uddled-0.022
uffed-0.021
ist-0.020
itting-0.020
arest-0.020
Token.
Token ID13
Feature activation-5.16e-02

Pos logit contributions
uddled+0.017
uffed+0.016
ist+0.015
itting+0.015
arest+0.015
Neg logit contributions
 his-0.011
 night-0.009
 restaurants-0.009
 day-0.008
 meal-0.008
Token They
Token ID1119
Feature activation+5.56e-01

Pos logit contributions
uddled+0.007
uffed+0.006
ist+0.006
itting+0.006
arest+0.006
Neg logit contributions
 his-0.004
 night-0.003
 restaurants-0.003
 day-0.003
 meal-0.003
Token put
Token ID1234
Feature activation+7.12e-02

Pos logit contributions
 his+0.040
 search+0.029
 end+0.028
 night+0.027
 agreement+0.027
Neg logit contributions
uddled-0.084
uffed-0.078
ower-0.074
itting-0.074
ended-0.073
Token on
Token ID319
Feature activation-3.07e-01

Pos logit contributions
 his+0.005
 restaurants+0.005
 jungle+0.005
 night+0.005
 agreement+0.005
Neg logit contributions
uddled-0.008
uffed-0.008
ist-0.008
ows-0.007
op-0.007
Token their
Token ID511
Feature activation-1.22e-01

Pos logit contributions
uddled+0.038
uffed+0.036
ist+0.034
op+0.034
ows+0.033
Neg logit contributions
 his-0.020
 restaurants-0.018
 night-0.018
 agreement-0.017
 end-0.017
Token shoes
Token ID10012
Feature activation+8.44e-01

Pos logit contributions
uddled+0.014
uffed+0.013
ist+0.012
arest+0.012
ve+0.012
Neg logit contributions
 his-0.011
 night-0.009
 restaurants-0.009
 day-0.009
 himself-0.009
Token and
Token ID290
Feature activation+5.19e-02

Pos logit contributions
 his+0.065
 night+0.045
 home+0.043
 day+0.043
 himself+0.042
Neg logit contributions
uddled-0.122
uffed-0.118
itting-0.113
arest-0.113
ended-0.110
Token jackets
Token ID35129
Feature activation+1.00e+00

Pos logit contributions
 his+0.005
 night+0.004
 restaurants+0.004
 day+0.003
 himself+0.003
Neg logit contributions
uddled-0.006
uffed-0.006
ist-0.006
itting-0.006
arest-0.006
Token and
Token ID290
Feature activation-6.15e-03

Pos logit contributions
 his+0.076
 night+0.050
 home+0.048
 day+0.046
 him+0.045
Neg logit contributions
arest-0.150
uddled-0.150
ended-0.148
uffed-0.147
itting-0.145
Token go
Token ID467
Feature activation+1.81e-01

Pos logit contributions
uddled+0.001
uffed+0.001
ist+0.001
itting+0.001
arest+0.001
Neg logit contributions
 his-0.001
 night-0.000
 restaurants-0.000
 day-0.000
 himself-0.000
Token to
Token ID284
Feature activation+7.94e-02

Pos logit contributions
 his+0.015
 restaurants+0.012
 night+0.012
 meal+0.011
 peanuts+0.011
Neg logit contributions
uddled-0.023
uffed-0.021
ist-0.020
arest-0.020
itting-0.020
TokenThe
Token ID464
Feature activation-6.04e-02

Pos logit contributions
 his+0.047
 day+0.037
 himself+0.036
 end+0.036
 night+0.036
Neg logit contributions
uddled-0.049
uffed-0.048
ist-0.046
ows-0.044
ower-0.043
Token family
Token ID1641
Feature activation-3.43e-02

Pos logit contributions
uddled+0.008
uffed+0.008
ist+0.007
arest+0.007
itting+0.007
Neg logit contributions
 his-0.004
 restaurants-0.003
 himself-0.003
 night-0.003
 peanuts-0.003
Token returned
Token ID4504
Feature activation-4.24e-02

Pos logit contributions
ower+0.004
uddled+0.004
ider+0.004
arest+0.004
uffed+0.004
Neg logit contributions
 his-0.003
 night-0.002
 day-0.002
 end-0.002
 bear-0.002
Token to
Token ID284
Feature activation+4.17e-01

Pos logit contributions
uddled+0.007
uffed+0.007
arest+0.007
ider+0.007
ist+0.006
Neg logit contributions
 his-0.003
 home-0.001
 night-0.001
 himself-0.001
 day-0.001
Token the
Token ID262
Feature activation+4.29e-01

Pos logit contributions
 his+0.023
 night+0.015
 restaurants+0.014
 day+0.013
 peanuts+0.013
Neg logit contributions
uddled-0.065
uffed-0.062
ist-0.060
itting-0.059
arest-0.059
Token structure
Token ID4645
Feature activation+2.85e-01

Pos logit contributions
 his+0.032
 night+0.023
 restaurants+0.023
 himself+0.021
 day+0.021
Neg logit contributions
uddled-0.059
uffed-0.056
ist-0.054
itting-0.053
arest-0.053
Token with
Token ID351
Feature activation+1.68e-01

Pos logit contributions
 his+0.024
 day+0.018
 night+0.018
 search+0.016
 end+0.016
Neg logit contributions
uddled-0.039
uffed-0.038
arest-0.036
ower-0.035
ve-0.035
Token happy
Token ID3772
Feature activation+3.38e-01

Pos logit contributions
 his+0.010
 night+0.007
 restaurants+0.006
 day+0.006
 meal+0.006
Neg logit contributions
uddled-0.026
uffed-0.025
ist-0.024
ider-0.024
itting-0.024
Token smiles
Token ID21845
Feature activation+1.20e-02

Pos logit contributions
 his+0.023
 night+0.017
 day+0.016
 restaurants+0.015
 meal+0.015
Neg logit contributions
uddled-0.050
uffed-0.049
ist-0.046
arest-0.046
itting-0.046
Token and
Token ID290
Feature activation+2.70e-01

Pos logit contributions
 his+0.001
 night+0.001
 day+0.001
 end+0.001
 home+0.001
Neg logit contributions
uffed-0.002
uddled-0.002
arest-0.002
outing-0.002
op-0.002
Token loving
Token ID14442
Feature activation-3.63e-01

Pos logit contributions
 his+0.021
 night+0.016
 day+0.015
 himself+0.013
 morning+0.013
Neg logit contributions
uddled-0.041
outing-0.038
ist-0.038
uffed-0.038
itting-0.037
Token few
Token ID1178
Feature activation-5.15e-01

Pos logit contributions
 his+0.007
 night+0.006
 day+0.005
 restaurants+0.005
 meal+0.005
Neg logit contributions
uddled-0.014
uffed-0.013
ist-0.012
itting-0.012
arest-0.012
Token of
Token ID286
Feature activation+1.93e-01

Pos logit contributions
uddled+0.072
uffed+0.068
ist+0.065
itting+0.065
arest+0.065
Neg logit contributions
 his-0.038
 night-0.027
 restaurants-0.027
 day-0.026
 peanuts-0.025
Token the
Token ID262
Feature activation+3.40e-01

Pos logit contributions
 his+0.012
 night+0.008
 restaurants+0.008
 day+0.007
 himself+0.007
Neg logit contributions
uddled-0.030
uffed-0.028
ist-0.027
arest-0.027
itting-0.027
Token smallest
Token ID18197
Feature activation-4.38e-01

Pos logit contributions
 his+0.024
 restaurants+0.017
 night+0.017
 peanuts+0.016
 himself+0.016
Neg logit contributions
uddled-0.047
uffed-0.044
ist-0.043
itting-0.042
arest-0.042
Token ones
Token ID3392
Feature activation+8.64e-01

Pos logit contributions
uddled+0.063
uffed+0.060
arest+0.057
ist+0.056
ower+0.056
Neg logit contributions
 his-0.032
 night-0.024
 day-0.023
 restaurants-0.022
 end-0.022
Token to
Token ID284
Feature activation+2.47e-01

Pos logit contributions
 his+0.057
 day+0.034
 himself+0.034
 him+0.033
 night+0.032
Neg logit contributions
uddled-0.142
uffed-0.136
ider-0.128
arest-0.127
plets-0.126
Token put
Token ID1234
Feature activation-2.74e-02

Pos logit contributions
 his+0.020
 himself+0.013
 search+0.013
 end+0.012
 night+0.012
Neg logit contributions
uddled-0.039
uffed-0.036
itting-0.033
ider-0.033
ve-0.033
Token on
Token ID319
Feature activation-3.43e-01

Pos logit contributions
uddled+0.004
uffed+0.004
itting+0.004
arest+0.004
ist+0.004
Neg logit contributions
 his-0.002
 night-0.001
 himself-0.001
 day-0.001
 restaurants-0.001
Token her
Token ID607
Feature activation+4.04e-01

Pos logit contributions
uddled+0.059
uffed+0.055
itting+0.053
arest+0.053
ider+0.052
Neg logit contributions
 his-0.020
 night-0.010
 himself-0.010
 day-0.009
 restaurants-0.009
Token shelf
Token ID18316
Feature activation+9.05e-01

Pos logit contributions
 his+0.033
 night+0.025
 day+0.024
 restaurants+0.024
 himself+0.023
Neg logit contributions
uddled-0.054
uffed-0.051
ist-0.049
itting-0.049
arest-0.048
Token.
Token ID13
Feature activation+1.48e-01

Pos logit contributions
 his+0.059
 night+0.039
 day+0.038
 home+0.037
 himself+0.036
Neg logit contributions
uddled-0.143
uffed-0.135
arest-0.130
itting-0.127
ve-0.125
Token Tom
Token ID4186
Feature activation+8.57e-02

Pos logit contributions
uddled+0.061
uffed+0.058
ist+0.055
op+0.054
itting+0.054
Neg logit contributions
 his-0.032
 restaurants-0.027
 night-0.026
 meal-0.024
 jungle-0.024
Token,
Token ID11
Feature activation+1.74e-01

Pos logit contributions
 his+0.007
 night+0.006
 restaurants+0.006
 day+0.005
 meal+0.005
Neg logit contributions
uddled-0.011
uffed-0.010
ist-0.010
itting-0.009
arest-0.009
Token and
Token ID290
Feature activation+9.88e-02

Pos logit contributions
 his+0.015
 agreement+0.010
 night+0.010
 himself+0.010
 restaurants+0.010
Neg logit contributions
uddled-0.023
uffed-0.022
arest-0.021
itting-0.021
ist-0.021
Token he
Token ID339
Feature activation+4.22e-02

Pos logit contributions
 his+0.009
 night+0.006
 himself+0.006
 day+0.006
 restaurants+0.006
Neg logit contributions
uddled-0.013
uffed-0.012
itting-0.012
ist-0.012
arest-0.012
Token ran
Token ID4966
Feature activation+1.81e-01

Pos logit contributions
 his+0.004
 bear+0.003
 bears+0.003
 night+0.003
 way+0.003
Neg logit contributions
itting-0.005
ower-0.005
ended-0.005
arest-0.005
uddled-0.005
Token out
Token ID503
Feature activation+3.57e-01

Pos logit contributions
 restaurants+0.014
 his+0.014
 salads+0.014
 meal+0.013
 jungle+0.013
Neg logit contributions
uddled-0.020
uffed-0.019
ve-0.019
ows-0.018
ist-0.018
Token of
Token ID286
Feature activation+1.17e-01

Pos logit contributions
 restaurants+0.041
 peanuts+0.040
 jungle+0.040
 his+0.039
 meal+0.039
Neg logit contributions
uddled-0.024
uffed-0.023
ist-0.022
op-0.021
ve-0.020
Token the
Token ID262
Feature activation+2.52e-01

Pos logit contributions
oc+0.007
 guys+0.007
 saf+0.007
 yak+0.007
 battles+0.007
Neg logit contributions
uffed-0.010
uddled-0.010
ist-0.009
apy-0.009
oms-0.009
Token kitchen
Token ID9592
Feature activation-8.93e-02

Pos logit contributions
 his+0.018
 restaurants+0.013
 night+0.013
 peanuts+0.013
 himself+0.013
Neg logit contributions
uddled-0.034
uffed-0.032
ist-0.031
itting-0.030
arest-0.030
Token.
Token ID13
Feature activation+2.02e-03

Pos logit contributions
uddled+0.012
uffed+0.011
ist+0.011
arest+0.011
itting+0.011
Neg logit contributions
 his-0.007
 night-0.006
 restaurants-0.005
 day-0.005
 meal-0.005
Token
Token ID198
Feature activation+8.24e-02

Pos logit contributions
 his+0.000
 night+0.000
 restaurants+0.000
 meal+0.000
 end+0.000
Neg logit contributions
uddled-0.000
uffed-0.000
ist-0.000
itting-0.000
arest-0.000
Token.
Token ID13
Feature activation+2.45e-04

Pos logit contributions
 his+0.000
 night+0.000
 restaurants+0.000
 day+0.000
 himself+0.000
Neg logit contributions
uddled-0.000
uffed-0.000
ist-0.000
arest-0.000
itting-0.000
Token She
Token ID1375
Feature activation-2.53e-01

Pos logit contributions
 his+0.000
 restaurants+0.000
 night+0.000
 meal+0.000
 peanuts+0.000
Neg logit contributions
uddled-0.000
uffed-0.000
ist-0.000
itting-0.000
arest-0.000
Token was
Token ID373
Feature activation-5.39e-02

Pos logit contributions
uddled+0.033
uffed+0.032
arest+0.032
ower+0.031
itting+0.031
Neg logit contributions
 his-0.023
 night-0.017
 day-0.016
 end-0.015
 way-0.015
Token happy
Token ID3772
Feature activation+2.13e-01

Pos logit contributions
uddled+0.007
uffed+0.007
ist+0.007
arest+0.006
itting+0.006
Neg logit contributions
 his-0.005
 night-0.003
 day-0.003
 himself-0.003
 restaurants-0.003
Token that
Token ID326
Feature activation-3.24e-01

Pos logit contributions
 his+0.021
 night+0.016
 day+0.016
 restaurants+0.015
 himself+0.015
Neg logit contributions
uddled-0.025
uffed-0.024
ist-0.023
arest-0.023
itting-0.022
Token she
Token ID673
Feature activation-9.26e-02

Pos logit contributions
uddled+0.038
uffed+0.035
ist+0.033
itting+0.033
arest+0.032
Neg logit contributions
 his-0.030
 restaurants-0.024
 night-0.023
 meal-0.023
 peanuts-0.023
Token could
Token ID714
Feature activation+2.75e-01

Pos logit contributions
outing+0.012
uddled+0.012
ower+0.012
arest+0.012
uffed+0.012
Neg logit contributions
 his-0.008
 night-0.006
 day-0.006
 bear-0.006
 Benny-0.005
Token help
Token ID1037
Feature activation+5.40e-01

Pos logit contributions
 his+0.020
 himself+0.014
 end+0.012
 search+0.012
 play+0.011
Neg logit contributions
uddled-0.045
uffed-0.043
outing-0.042
itting-0.042
plets-0.041
Token someone
Token ID2130
Feature activation-8.59e-01

Pos logit contributions
 his+0.031
 night+0.022
 restaurants+0.021
 day+0.019
 meal+0.019
Neg logit contributions
uddled-0.082
uffed-0.077
ist-0.075
itting-0.074
arest-0.073
Token in
Token ID287
Feature activation+3.05e-01

Pos logit contributions
uddled+0.119
uffed+0.114
ider+0.104
arest+0.103
itting+0.103
Neg logit contributions
 his-0.078
 day-0.057
 night-0.055
 search-0.054
 himself-0.054
Token need
Token ID761
Feature activation+4.19e-01

Pos logit contributions
 his+0.017
 night+0.012
 restaurants+0.012
 himself+0.011
 meal+0.010
Neg logit contributions
uddled-0.046
uffed-0.044
ist-0.042
itting-0.042
arest-0.041
Token her
Token ID607
Feature activation+1.18e-01

Pos logit contributions
uddled+0.090
uffed+0.085
ist+0.084
op+0.081
ower+0.080
Neg logit contributions
 his-0.040
 restaurants-0.031
 night-0.030
 jungle-0.029
 robbers-0.029
Token drink
Token ID4144
Feature activation+3.31e-01

Pos logit contributions
 his+0.009
 restaurants+0.007
 night+0.007
 himself+0.007
 agreement+0.007
Neg logit contributions
uddled-0.015
uffed-0.014
ist-0.014
op-0.013
ower-0.013
Token.
Token ID13
Feature activation-2.35e-01

Pos logit contributions
 his+0.025
 night+0.019
 restaurants+0.019
 day+0.018
 himself+0.018
Neg logit contributions
uddled-0.045
uffed-0.042
ist-0.040
itting-0.040
arest-0.039
Token Lily
Token ID20037
Feature activation+1.54e-01

Pos logit contributions
uddled+0.027
uffed+0.025
itting+0.024
Because+0.023
ows+0.023
Neg logit contributions
 his-0.020
 battle-0.018
 robbers-0.016
 restaurants-0.016
 meal-0.016
Token was
Token ID373
Feature activation-3.96e-01

Pos logit contributions
 his+0.015
 night+0.012
 day+0.012
 meal+0.011
 restaurants+0.011
Neg logit contributions
uddled-0.018
uffed-0.017
arest-0.017
itting-0.016
ist-0.016
Token a
Token ID257
Feature activation-2.00e-01

Pos logit contributions
uddled+0.052
uffed+0.050
ist+0.047
itting+0.046
arest+0.046
Neg logit contributions
 his-0.030
 night-0.023
 restaurants-0.023
 meal-0.022
 peanuts-0.022
Token reliable
Token ID9314
Feature activation-9.09e-01

Pos logit contributions
uddled+0.025
uffed+0.024
ist+0.023
itting+0.022
arest+0.022
Neg logit contributions
 his-0.017
 night-0.014
 restaurants-0.013
 day-0.013
 meal-0.013
Token friend
Token ID1545
Feature activation-4.18e-01

Pos logit contributions
uddled+0.130
uffed+0.130
arest+0.123
ist+0.119
outing+0.118
Neg logit contributions
 his-0.064
 day-0.051
 night-0.051
 end-0.047
 meal-0.047
Token.
Token ID13
Feature activation-1.51e-01

Pos logit contributions
uddled+0.055
uffed+0.053
arest+0.050
ist+0.049
ve+0.048
Neg logit contributions
 his-0.035
 night-0.027
 day-0.026
 restaurants-0.025
 end-0.025
Token She
Token ID1375
Feature activation-5.04e-01

Pos logit contributions
uddled+0.016
uffed+0.015
itting+0.015
ows+0.014
op+0.014
Neg logit contributions
 his-0.014
 battle-0.012
 night-0.012
 restaurants-0.011
 meal-0.011
Token always
Token ID1464
Feature activation-2.11e-01

Pos logit contributions
uddled+0.061
arest+0.061
uffed+0.061
ower+0.060
itting+0.060
Neg logit contributions
 his-0.045
 night-0.035
 day-0.034
 bear-0.033
 bears-0.032
Token thought
Token ID1807
Feature activation+3.14e-01

Pos logit contributions
uddled+0.145
arest+0.143
itting+0.139
uffed+0.139
ower+0.136
Neg logit contributions
 his-0.087
 himself-0.057
 search-0.056
 bear-0.056
 night-0.054
Token the
Token ID262
Feature activation-1.75e-01

Pos logit contributions
 his+0.022
 night+0.012
 himself+0.012
 day+0.012
 him+0.011
Neg logit contributions
uddled-0.052
uffed-0.048
ower-0.047
ider-0.047
outing-0.047
Token spear
Token ID17514
Feature activation+9.58e-03

Pos logit contributions
uddled+0.023
uffed+0.022
ist+0.021
arest+0.021
itting+0.021
Neg logit contributions
 his-0.012
 restaurants-0.009
 night-0.008
 agreement-0.008
 meal-0.008
Token would
Token ID561
Feature activation+1.36e-01

Pos logit contributions
 his+0.001
 day+0.001
 night+0.001
 bear+0.001
 himself+0.001
Neg logit contributions
uddled-0.001
uffed-0.001
itting-0.001
arest-0.001
ower-0.001
Token be
Token ID307
Feature activation-1.28e+00

Pos logit contributions
 his+0.011
 night+0.008
 himself+0.008
 end+0.007
 day+0.007
Neg logit contributions
uddled-0.020
uffed-0.019
itting-0.018
ist-0.017
op-0.017
Token useful
Token ID4465
Feature activation+1.25e-02

Pos logit contributions
uddled+0.164
uffed+0.156
ist+0.151
arest+0.146
itting+0.146
Neg logit contributions
 his-0.097
 restaurants-0.077
 night-0.076
 meal-0.071
 day-0.071
Token.
Token ID13
Feature activation-2.98e-01

Pos logit contributions
 his+0.001
 night+0.001
 himself+0.001
 day+0.001
 restaurants+0.001
Neg logit contributions
uddled-0.002
uffed-0.002
arest-0.002
ist-0.002
itting-0.002
Token She
Token ID1375
Feature activation-6.25e-01

Pos logit contributions
uddled+0.038
uffed+0.036
ist+0.035
itting+0.034
arest+0.034
Neg logit contributions
 his-0.024
 night-0.019
 restaurants-0.018
 meal-0.017
 day-0.017
Token held
Token ID2714
Feature activation-1.94e-01

Pos logit contributions
uddled+0.083
itting+0.079
uffed+0.079
arest+0.079
ower+0.078
Neg logit contributions
 his-0.055
 night-0.037
 search-0.036
 himself-0.036
 day-0.036
Token the
Token ID262
Feature activation-5.48e-04

Pos logit contributions
uddled+0.025
uffed+0.025
ve+0.023
ist+0.023
op+0.023
Neg logit contributions
 his-0.013
 restaurants-0.012
 night-0.011
 meal-0.010
 jungle-0.010
Token spear
Token ID17514
Feature activation+1.89e-01

Pos logit contributions
uddled+0.000
uffed+0.000
ist+0.000
arest+0.000
ve+0.000
Neg logit contributions
 his-0.000
 restaurants-0.000
 night-0.000
 day-0.000
 meal-0.000
Token said
Token ID531
Feature activation-5.03e-01

Pos logit contributions
uddled+0.009
uffed+0.009
arest+0.008
itting+0.008
ist+0.008
Neg logit contributions
 his-0.007
 night-0.005
 restaurants-0.005
 himself-0.005
 peanuts-0.005
Token,
Token ID11
Feature activation-3.38e-02

Pos logit contributions
uddled+0.079
ist+0.075
uffed+0.075
itting+0.073
arest+0.073
Neg logit contributions
 his-0.035
 himself-0.020
 Freddy-0.019
 him-0.019
 night-0.018
Token "
Token ID366
Feature activation-8.75e-01

Pos logit contributions
uddled+0.005
arest+0.005
ist+0.005
uffed+0.005
itting+0.005
Neg logit contributions
 his-0.002
 himself-0.001
 -0.001
 day-0.001
 Freddy-0.001
TokenL
Token ID43
Feature activation+4.60e-01

Pos logit contributions
uddled+0.108
uffed+0.104
ows+0.097
ist+0.096
ower+0.094
Neg logit contributions
 his-0.080
 night-0.063
 day-0.061
 himself-0.059
 end-0.058
Tokenily
Token ID813
Feature activation-2.38e-02

Pos logit contributions
 his+0.043
 night+0.035
 end+0.034
 day+0.034
 peanuts+0.034
Neg logit contributions
uddled-0.052
uffed-0.050
arest-0.046
ower-0.045
op-0.044
Token,
Token ID11
Feature activation-6.73e-02

Pos logit contributions
uddled+0.004
arest+0.004
uffed+0.004
outing+0.004
ider+0.004
Neg logit contributions
 his-0.001
 night-0.001
 day-0.001
 bear-0.001
 way-0.001
Token you
Token ID345
Feature activation-6.26e-01

Pos logit contributions
uddled+0.009
uffed+0.009
arest+0.008
ist+0.008
ower+0.008
Neg logit contributions
 his-0.006
 night-0.005
 day-0.005
 end-0.005
 himself-0.004
Token can
Token ID460
Feature activation+2.19e-01

Pos logit contributions
uddled+0.088
ower+0.080
uffed+0.079
itting+0.077
ider+0.077
Neg logit contributions
 his-0.052
 night-0.042
 end-0.039
 search-0.038
 bear-0.037
Token't
Token ID470
Feature activation+7.19e-02

Pos logit contributions
 his+0.017
 search+0.013
 end+0.013
 him+0.013
 himself+0.013
Neg logit contributions
uddled-0.034
uffed-0.031
outing-0.031
itting-0.030
ower-0.030
Token keep
Token ID1394
Feature activation-2.98e-01

Pos logit contributions
 his+0.004
 end+0.003
 night+0.003
 search+0.003
 himself+0.003
Neg logit contributions
uddled-0.012
uffed-0.011
itting-0.011
outing-0.010
arest-0.010
Token the
Token ID262
Feature activation+1.32e-01

Pos logit contributions
uddled+0.039
uffed+0.038
ve+0.036
ist+0.035
op+0.035
Neg logit contributions
 his-0.020
 restaurants-0.016
 night-0.015
 agreement-0.015
 end-0.015
Token time
Token ID640
Feature activation-3.24e-01

Pos logit contributions
uddled+0.059
uffed+0.056
ist+0.055
itting+0.054
arest+0.054
Neg logit contributions
 his-0.022
 restaurants-0.015
 night-0.014
 himself-0.014
 peanuts-0.013
Token,
Token ID11
Feature activation-5.62e-01

Pos logit contributions
uddled+0.044
uffed+0.041
ist+0.039
arest+0.039
itting+0.038
Neg logit contributions
 his-0.026
 night-0.019
 restaurants-0.018
 day-0.018
 himself-0.017
Token there
Token ID612
Feature activation-2.65e-01

Pos logit contributions
uddled+0.073
uffed+0.066
arest+0.064
ist+0.063
ended+0.063
Neg logit contributions
 his-0.052
 night-0.039
 day-0.039
 end-0.036
 himself-0.036
Token was
Token ID373
Feature activation-6.69e-01

Pos logit contributions
uddled+0.031
uffed+0.029
ist+0.028
arest+0.027
itting+0.027
Neg logit contributions
 his-0.026
 night-0.021
 restaurants-0.020
 day-0.020
 end-0.019
Token a
Token ID257
Feature activation-3.71e-01

Pos logit contributions
uddled+0.079
uffed+0.073
itting+0.071
arest+0.070
ve+0.069
Neg logit contributions
 his-0.053
 restaurants-0.046
 agreement-0.043
 end-0.042
 robbers-0.042
Token hero
Token ID4293
Feature activation-2.41e-02

Pos logit contributions
uddled+0.044
uffed+0.041
ist+0.040
itting+0.039
arest+0.039
Neg logit contributions
 his-0.034
 night-0.027
 restaurants-0.026
 himself-0.025
 day-0.025
Token.
Token ID13
Feature activation-4.95e-01

Pos logit contributions
uddled+0.003
uffed+0.003
ist+0.003
itting+0.003
arest+0.003
Neg logit contributions
 his-0.002
 night-0.002
 restaurants-0.002
 day-0.002
 himself-0.002
Token He
Token ID679
Feature activation-5.54e-01

Pos logit contributions
uffed+0.050
uddled+0.049
itting+0.048
ended+0.047
uts+0.047
Neg logit contributions
 his-0.037
 peanuts-0.036
 end-0.035
 meal-0.035
 salads-0.034
Token was
Token ID373
Feature activation-5.59e-01

Pos logit contributions
uddled+0.058
uffed+0.053
ist+0.051
itting+0.049
arest+0.049
Neg logit contributions
 his-0.059
 night-0.049
 restaurants-0.048
 day-0.046
 meal-0.046
Token brave
Token ID14802
Feature activation-1.07e+00

Pos logit contributions
uddled+0.065
uffed+0.065
ve+0.061
a+0.059
itting+0.059
Neg logit contributions
 his-0.041
 restaurants-0.036
 salads-0.035
 agreement-0.035
 meal-0.035
Token and
Token ID290
Feature activation-4.75e-01

Pos logit contributions
uddled+0.116
uffed+0.108
ist+0.104
itting+0.101
op+0.099
Neg logit contributions
 his-0.106
 restaurants-0.087
 night-0.086
 meal-0.083
 peanuts-0.082
Token the
Token ID262
Feature activation-3.69e-02

Pos logit contributions
 his+0.040
 night+0.023
 himself+0.021
 restaurants+0.020
 day+0.020
Neg logit contributions
uddled-0.110
uffed-0.104
ist-0.100
itting-0.100
arest-0.100
Token tooth
Token ID16162
Feature activation+3.77e-01

Pos logit contributions
uddled+0.005
uffed+0.005
itting+0.005
ist+0.005
arest+0.005
Neg logit contributions
 his-0.003
 night-0.002
 day-0.002
 meal-0.002
 peanuts-0.002
Token and
Token ID290
Feature activation-2.14e-01

Pos logit contributions
 his+0.034
 night+0.026
 himself+0.024
 bear+0.024
 day+0.024
Neg logit contributions
uddled-0.049
uffed-0.047
itting-0.045
outing-0.043
arest-0.042
Token recorded
Token ID6264
Feature activation-1.02e+00

Pos logit contributions
uddled+0.027
uffed+0.025
itting+0.024
ist+0.024
arest+0.024
Neg logit contributions
 his-0.020
 night-0.015
 restaurants-0.014
 himself-0.014
 peanuts-0.014
Token a
Token ID257
Feature activation-1.90e-01

Pos logit contributions
uddled+0.158
uffed+0.149
itting+0.144
ist+0.143
arest+0.143
Neg logit contributions
 his-0.064
 night-0.041
 himself-0.039
 restaurants-0.038
 day-0.036
Token copy
Token ID4866
Feature activation-1.78e+00

Pos logit contributions
uddled+0.022
uffed+0.020
ist+0.020
arest+0.020
ve+0.019
Neg logit contributions
 his-0.017
 restaurants-0.014
 night-0.014
 himself-0.013
 peanuts-0.013
Token of
Token ID286
Feature activation-2.98e-01

Pos logit contributions
uddled+0.250
uffed+0.235
arest+0.222
ve+0.218
ower+0.216
Neg logit contributions
 his-0.159
 night-0.110
 himself-0.109
 day-0.107
 end-0.104
Token the
Token ID262
Feature activation-1.38e-01

Pos logit contributions
uddled+0.047
uffed+0.044
ist+0.043
itting+0.042
arest+0.042
Neg logit contributions
 his-0.017
 night-0.011
 restaurants-0.011
 day-0.010
 himself-0.010
Token song
Token ID3496
Feature activation-9.73e-01

Pos logit contributions
uddled+0.018
uffed+0.017
ist+0.017
arest+0.016
itting+0.016
Neg logit contributions
 his-0.011
 restaurants-0.008
 night-0.008
 himself-0.007
 day-0.007
Token.
Token ID13
Feature activation-9.62e-02

Pos logit contributions
uddled+0.151
uffed+0.148
arest+0.138
ve+0.136
outing+0.136
Neg logit contributions
 his-0.068
 night-0.050
 day-0.048
 end-0.040
 morning-0.040
Token (space)
Token ID220
Feature activation+3.71e-01

Pos logit contributions
uddled+0.014
uffed+0.013
ist+0.012
arest+0.012
itting+0.012
Neg logit contributions
 his-0.007
 night-0.005
 restaurants-0.005
 day-0.005
 himself-0.004
Token outside
Token ID2354
Feature activation+3.30e-01

Pos logit contributions
uddled+0.028
uffed+0.025
ist+0.024
op+0.024
ve+0.023
Neg logit contributions
 his-0.022
 night-0.019
 restaurants-0.018
 agreement-0.018
 end-0.018
Token in
Token ID287
Feature activation-4.59e-02

Pos logit contributions
 restaurants+0.035
 agreement+0.035
 his+0.035
 meal+0.034
 peanuts+0.034
Neg logit contributions
uddled-0.023
op-0.022
ist-0.021
ider-0.021
uffed-0.021
Token the
Token ID262
Feature activation+6.46e-02

Pos logit contributions
op+0.003
Because+0.003
ended+0.003
arest+0.003
apy+0.003
Neg logit contributions
 Prince-0.004
 pizz-0.004
 ending-0.004
 servant-0.004
 Moral-0.003
Token woods
Token ID16479
Feature activation+3.17e-02

Pos logit contributions
 his+0.004
 agreement+0.004
 pizz+0.004
 battle+0.004
 restaurants+0.004
Neg logit contributions
uddled-0.008
itting-0.007
ower-0.007
op-0.007
uffed-0.007
Token and
Token ID290
Feature activation-4.48e-01

Pos logit contributions
 his+0.003
 restaurants+0.003
 night+0.002
 peanuts+0.002
 meal+0.002
Neg logit contributions
uddled-0.003
ist-0.003
uffed-0.003
itting-0.003
ider-0.003
Token follow
Token ID1061
Feature activation-1.81e+00

Pos logit contributions
uddled+0.047
uffed+0.044
ist+0.043
op+0.043
ve+0.042
Neg logit contributions
 his-0.030
 agreement-0.028
 pizz-0.027
 peanuts-0.026
 Prince-0.026
Token the
Token ID262
Feature activation+4.65e-02

Pos logit contributions
uddled+0.232
uffed+0.230
ows+0.226
ist+0.225
ute+0.222
Neg logit contributions
 night-0.057
 jungle-0.053
 day-0.053
 meal-0.053
 End-0.052
Token trail
Token ID8025
Feature activation+6.28e-01

Pos logit contributions
 his+0.003
 night+0.002
 restaurants+0.002
 himself+0.002
 meal+0.002
Neg logit contributions
uddled-0.007
uffed-0.006
ist-0.006
itting-0.006
arest-0.006
Token of
Token ID286
Feature activation-1.33e-01

Pos logit contributions
 his+0.051
 night+0.037
 day+0.036
 himself+0.035
 restaurants+0.035
Neg logit contributions
uddled-0.085
uffed-0.080
ist-0.076
itting-0.075
arest-0.075
Token leaves
Token ID5667
Feature activation-5.27e-01

Pos logit contributions
uddled+0.017
uffed+0.016
ows+0.015
ve+0.015
itting+0.015
Neg logit contributions
 his-0.007
 night-0.006
 end-0.006
 meal-0.006
 pizz-0.006
Token.
Token ID13
Feature activation-2.40e-01

Pos logit contributions
uddled+0.073
uffed+0.069
arest+0.066
ist+0.066
itting+0.065
Neg logit contributions
 his-0.042
 night-0.030
 himself-0.029
 day-0.029
 restaurants-0.028
Token to
Token ID284
Feature activation+1.73e-01

Pos logit contributions
uddled+0.027
uffed+0.027
arest+0.027
ist+0.026
op+0.025
Neg logit contributions
 his-0.020
 night-0.015
 himself-0.014
 day-0.014
 him-0.014
Token be
Token ID307
Feature activation-1.02e+00

Pos logit contributions
 his+0.013
 night+0.010
 restaurants+0.010
 day+0.009
 himself+0.009
Neg logit contributions
uddled-0.024
uffed-0.022
ist-0.021
itting-0.021
arest-0.021
Token back
Token ID736
Feature activation+6.82e-01

Pos logit contributions
uddled+0.132
uffed+0.124
ist+0.119
arest+0.117
itting+0.117
Neg logit contributions
 his-0.088
 night-0.065
 himself-0.063
 restaurants-0.062
 day-0.061
Token in
Token ID287
Feature activation+8.91e-02

Pos logit contributions
 his+0.053
 night+0.037
 himself+0.035
 day+0.035
 home+0.034
Neg logit contributions
uddled-0.098
uffed-0.095
arest-0.090
ist-0.089
itting-0.088
Token the
Token ID262
Feature activation+2.47e-01

Pos logit contributions
 his+0.005
 restaurants+0.004
 peanuts+0.003
 salads+0.003
 night+0.003
Neg logit contributions
uddled-0.013
uffed-0.012
ist-0.012
itting-0.012
arest-0.012
Token quiet
Token ID5897
Feature activation-1.91e+00

Pos logit contributions
 his+0.015
 night+0.012
 day+0.011
 end+0.010
 restaurants+0.010
Neg logit contributions
uddled-0.038
uffed-0.037
ist-0.035
ve-0.034
itting-0.034
Token,
Token ID11
Feature activation+5.51e-02

Pos logit contributions
uddled+0.321
uffed+0.313
ve+0.294
arest+0.293
ist+0.289
Neg logit contributions
 his-0.097
 night-0.075
 day-0.070
 end-0.062
 way-0.061
Token blue
Token ID4171
Feature activation+3.16e-01

Pos logit contributions
 his+0.004
 night+0.003
 day+0.003
 way+0.003
 winter+0.003
Neg logit contributions
uddled-0.008
uffed-0.008
arest-0.008
ist-0.007
op-0.007
Token sky
Token ID6766
Feature activation-1.75e-01

Pos logit contributions
 his+0.021
 night+0.019
 end+0.017
 day+0.017
 morning+0.016
Neg logit contributions
uffed-0.047
uddled-0.046
arest-0.044
ist-0.044
ve-0.044
Token.
Token ID13
Feature activation+7.22e-02

Pos logit contributions
uffed+0.028
uddled+0.028
arest+0.028
op+0.026
ve+0.026
Neg logit contributions
 his-0.011
 night-0.008
 day-0.008
 end-0.006
 morning-0.006
Token She
Token ID1375
Feature activation-2.78e-01

Pos logit contributions
 his+0.006
 night+0.005
 restaurants+0.005
 day+0.005
 himself+0.004
Neg logit contributions
uddled-0.009
uffed-0.009
ist-0.008
arest-0.008
itting-0.008
Token upon
Token ID2402
Feature activation-2.15e-01

Pos logit contributions
uddled+0.034
uffed+0.031
ist+0.030
arest+0.030
ider+0.029
Neg logit contributions
 his-0.029
 night-0.024
 day-0.023
 restaurants-0.022
 end-0.022
Token a
Token ID257
Feature activation-1.40e-01

Pos logit contributions
uddled+0.029
uffed+0.027
ist+0.026
arest+0.026
itting+0.025
Neg logit contributions
 his-0.016
 restaurants-0.012
 night-0.012
 peanuts-0.011
 meal-0.011
Token time
Token ID640
Feature activation-2.78e-02

Pos logit contributions
uddled+0.021
uffed+0.020
ist+0.020
itting+0.020
arest+0.019
Neg logit contributions
 his-0.008
 restaurants-0.005
 night-0.005
 himself-0.005
 peanuts-0.005
Token there
Token ID612
Feature activation-9.99e-02

Pos logit contributions
uddled+0.004
uffed+0.003
ist+0.003
arest+0.003
itting+0.003
Neg logit contributions
 his-0.002
 night-0.002
 restaurants-0.002
 day-0.002
 himself-0.002
Token were
Token ID547
Feature activation-5.11e-01

Pos logit contributions
uddled+0.012
uffed+0.011
ist+0.011
arest+0.011
ider+0.011
Neg logit contributions
 his-0.010
 night-0.007
 day-0.007
 restaurants-0.007
 himself-0.007
Token two
Token ID734
Feature activation-1.84e+00

Pos logit contributions
uddled+0.061
uffed+0.056
itting+0.054
arest+0.053
ve+0.053
Neg logit contributions
 his-0.045
 restaurants-0.035
 night-0.034
 agreement-0.034
 himself-0.034
Token friends
Token ID2460
Feature activation-3.09e-01

Pos logit contributions
uddled+0.261
uffed+0.248
ist+0.238
arest+0.235
itting+0.234
Neg logit contributions
 his-0.130
 night-0.093
 restaurants-0.091
 day-0.087
 peanuts-0.086
Token called
Token ID1444
Feature activation-7.56e-02

Pos logit contributions
uddled+0.037
uffed+0.034
ist+0.033
itting+0.032
arest+0.032
Neg logit contributions
 his-0.028
 night-0.023
 restaurants-0.023
 day-0.021
 meal-0.021
Token Sam
Token ID3409
Feature activation-1.01e+00

Pos logit contributions
uddled+0.011
uffed+0.010
ist+0.010
itting+0.010
arest+0.009
Neg logit contributions
 his-0.005
 night-0.004
 restaurants-0.004
 day-0.003
 end-0.003
Token and
Token ID290
Feature activation-6.06e-01

Pos logit contributions
uddled+0.131
uffed+0.126
arest+0.121
itting+0.117
ider+0.116
Neg logit contributions
 his-0.092
 night-0.067
 day-0.064
 himself-0.064
 end-0.063
Token Jack
Token ID3619
Feature activation-8.83e-01

Pos logit contributions
uddled+0.087
uffed+0.083
ist+0.080
itting+0.079
arest+0.078
Neg logit contributions
 his-0.037
 night-0.027
 restaurants-0.027
 end-0.025
 meal-0.025
Token the
Token ID262
Feature activation+1.70e-01

Pos logit contributions
uddled+0.002
Because+0.002
itting+0.002
op+0.002
ended+0.002
Neg logit contributions
 pizz-0.001
 cuc-0.001
 lettuce-0.001
 salads-0.001
 peanuts-0.001
Token woods
Token ID16479
Feature activation+1.80e-01

Pos logit contributions
 his+0.012
 restaurants+0.009
 salads+0.008
 agreement+0.008
 meal+0.008
Neg logit contributions
uddled-0.023
uffed-0.021
itting-0.021
ist-0.021
arest-0.020
Token and
Token ID290
Feature activation-2.72e-01

Pos logit contributions
 his+0.016
 night+0.012
 restaurants+0.012
 day+0.012
 himself+0.011
Neg logit contributions
uddled-0.022
uffed-0.021
ist-0.020
itting-0.020
arest-0.019
Token noticed
Token ID6810
Feature activation+7.75e-02

Pos logit contributions
uddled+0.031
uffed+0.029
ist+0.027
itting+0.027
ve+0.026
Neg logit contributions
 his-0.026
 restaurants-0.020
 night-0.020
 meal-0.020
 agreement-0.020
Token how
Token ID703
Feature activation-3.82e-02

Pos logit contributions
 his+0.005
 night+0.003
 himself+0.003
 restaurants+0.003
 day+0.003
Neg logit contributions
uddled-0.012
uffed-0.011
ist-0.011
arest-0.011
ve-0.011
Token quiet
Token ID5897
Feature activation-2.05e+00

Pos logit contributions
uddled+0.005
ist+0.005
uffed+0.005
ider+0.005
arest+0.005
Neg logit contributions
 his-0.003
 night-0.003
 day-0.002
 restaurants-0.002
 morning-0.002
Token it
Token ID340
Feature activation-1.73e-01

Pos logit contributions
uddled+0.308
uffed+0.298
arest+0.283
ve+0.283
itting+0.283
Neg logit contributions
 his-0.146
 night-0.103
 day-0.095
 restaurants-0.092
 himself-0.089
Token was
Token ID373
Feature activation-3.87e-01

Pos logit contributions
uddled+0.022
uffed+0.021
arest+0.020
itting+0.020
ist+0.020
Neg logit contributions
 his-0.016
 night-0.013
 day-0.012
 end-0.011
 search-0.011
Token.
Token ID13
Feature activation+1.05e-03

Pos logit contributions
uddled+0.056
uffed+0.055
arest+0.051
ist+0.050
ve+0.050
Neg logit contributions
 his-0.030
 night-0.022
 day-0.021
 end-0.020
 way-0.019
Token The
Token ID383
Feature activation-4.82e-01

Pos logit contributions
 his+0.000
 restaurants+0.000
 night+0.000
 meal+0.000
 end+0.000
Neg logit contributions
uddled-0.000
uffed-0.000
op-0.000
itting-0.000
ower-0.000
Token silence
Token ID9550
Feature activation-1.62e+00

Pos logit contributions
uddled+0.074
uffed+0.071
ist+0.068
itting+0.067
arest+0.067
Neg logit contributions
 his-0.028
 night-0.019
 restaurants-0.018
 day-0.017
 himself-0.016