TOP ACTIVATIONS
MAX = 13.637

clown s juggling and ac rob ats doing flips . But
of fun watching the ac rob ats and clown s .
thought they looked like ac rob ats . She enjoyed the
elephants , lions and ac rob ats . They are amazed
clown s and the ac rob ats and the lions and
saw clown s and ac rob ats and lions and elephants
very still so the ha ird ress er could do a
his mom and the ha ird ress er . " I
chair and allowed the ha ird ress er to give her
in and let the ha ird ress er go to work
called it a " we ird rock ." Tim my got
am ready !" The ha ird ress er put a cape
happy and thanked the ha ird ress er . As they
the animals , the ac rob ats and the clown s
like it and the ha ird ress er said it was
the chair and the ha ird ress er began to cut
his mom left the ha ird ress er , happy with
's turn . The ha ird ress er asked her what
hair salon . The ha ird ress er was very kind
different today ." The ha ird ress er smiled and said

MIN ACTIVATIONS
MIN = -10.995

We have different likes and disl ikes . We can be
sad . He said " Fear not , I can help
. It said , " Fear not . I am with
and dangerous to explore the woods alone . With her pole
her mom my said about earthquakes and thought , " If
to always be prepared for earthquakes and other dangers in the
They said , " L adies and gentlemen , welcome to
. He also learned that earthquakes can be very scary ,
. He says " L adies and gentlemen , boys and
also learned that even though earthquakes can be scary , it
anymore . She knew that earthquakes could sometimes be scary ,
the branch all around . Up to the sky and down
and Ben did not like earthquakes . They heard
she felt safe from any earthquakes . <|endoftext|> Once upon a
but then she realised that earthquakes don 't happen very often
? You know I hate earthquakes ." Paul felt bad and
was over . They knew earthquakes were exciting but can sometimes
its engine thunder ing . Up , it went into the
he stayed still during the earthquakes . He was also thankful
and her mum explained that earthquakes happen when the ground below

INTERVAL 7.479 - 13.637
CONTAINS 0.047%

ed and hugged her blue bag tightly . She went back
the cookies by using fresh butter ." They went back to
mom brought home a big bag of popcorn . Lily was
too . She opened her bag and took out a handful
was a happy , smooth butter . The butter looked all

INTERVAL 1.321 - 7.479
CONTAINS 20.963%

rake ! What is a rake ? he thought to himself
a toy par rot that said funny things .
She sat for a few more minutes and then decided it
Emma who loved to play chess . Every day , she
resist , however , adding that the coin was too beautiful

INTERVAL -4.837 - 1.321
CONTAINS 76.819%

and sharp teeth . " No , I want
. Every day when she woke up , Sus ie had
are friends . You have to take care of each other
holding a wand . " Look at me ,
who had a big castle . One day , the king

INTERVAL -10.995 - -4.837
CONTAINS 2.170%

child made beautiful music . <|endoftext|> One day , a small
but he could not reach him . Lily 's mom called
play with . <|endoftext|> Once upon a time , there was
day . One day , when she was playing , she
thought of all the wonderful things she had learned and couldn
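
The four INTERVAL ranges above are consistent with splitting the span from MIN to MAX into four equal-width bins. The dump does not say how the bins were chosen, so this is an assumption. A minimal sketch under that assumption reproduces the listed bounds; `equal_width_edges` is a hypothetical helper, not part of any dashboard library:

```python
def equal_width_edges(lo: float, hi: float, n_bins: int) -> list[float]:
    """Return n_bins + 1 evenly spaced edges from lo to hi.

    Hypothetical helper: equal-width binning is an assumption here,
    not something the dump states.
    """
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# MIN and MAX as reported in the MIN ACTIVATIONS / TOP ACTIVATIONS headers.
edges = equal_width_edges(-10.995, 13.637, 4)
rounded = [round(e, 3) for e in edges]

# CONTAINS percentages as listed, from the highest interval to the lowest.
shares = [0.047, 20.963, 76.819, 2.170]
total = sum(shares)

print(rounded)
print(f"{total:.3f}")
```

Under this reading each bin is 6.158 wide. The listed shares sum to just under 100%, which is what per-bin rounding to three decimals would be expected to give.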
Token clown
Token ID25573
Feature activation-1.71e+00

Pos logit contributions
ires+0.036
erv+0.035
 gut+0.033
adoes+0.033
 confusion+0.032
Neg logit contributions
rob-0.044
 batch-0.042
bar-0.039
ip-0.038
ah-0.034
Tokens
Token ID82
Feature activation+6.60e-01

Pos logit contributions
ires+0.060
erv+0.059
 gut+0.058
adoes+0.058
 confusion+0.057
Neg logit contributions
rob-0.044
 batch-0.043
bar-0.039
ip-0.038
ah-0.034
Token juggling
Token ID48097
Feature activation+9.47e-01

Pos logit contributions
 batch+0.021
rob+0.021
bar+0.020
ip+0.019
ah+0.018
Neg logit contributions
ires-0.018
erv-0.018
adoes-0.017
 gut-0.017
 confusion-0.016
Token and
Token ID290
Feature activation-5.54e-01

Pos logit contributions
rob+0.035
 batch+0.034
bar+0.032
ip+0.031
ah+0.028
Neg logit contributions
ires-0.022
erv-0.021
 confusion-0.021
 gut-0.021
adoes-0.021
Token ac
Token ID936
Feature activation+4.41e+00

Pos logit contributions
 confusion+0.012
ires+0.012
 stops+0.012
adoes+0.012
erv+0.011
Neg logit contributions
 batch-0.020
rob-0.020
ip-0.019
bar-0.018
P-0.017
Tokenrob
Token ID22609
Feature activation+1.36e+01

Pos logit contributions
 batch+0.050
bar+0.033
 loaf+0.024
 french+0.022
 million+0.022
Neg logit contributions
ires-0.208
iring-0.200
erv-0.197
uted-0.196
adoes-0.195
Tokenats
Token ID1381
Feature activation-6.31e-02

Pos logit contributions
rob+0.375
 batch+0.369
bar+0.353
ip+0.309
tail+0.302
Neg logit contributions
ires-0.450
erv-0.429
adoes-0.405
 confusion-0.401
 gut-0.397
Token doing
Token ID1804
Feature activation+1.89e+00

Pos logit contributions
ires+0.002
erv+0.001
 gut+0.001
adoes+0.001
 confusion+0.001
Neg logit contributions
rob-0.002
 batch-0.002
bar-0.002
ip-0.002
tail-0.002
Token flips
Token ID45971
Feature activation-3.68e+00

Pos logit contributions
rob+0.072
 batch+0.070
bar+0.067
ip+0.065
ird+0.059
Neg logit contributions
ires-0.041
erv-0.039
adoes-0.038
 confusion-0.038
 gut-0.038
Token.
Token ID13
Feature activation+2.14e-01

Pos logit contributions
ires+0.100
 gut+0.097
erv+0.097
 confusion+0.094
adoes+0.092
Neg logit contributions
rob-0.122
 batch-0.119
bar-0.112
ip-0.107
ah-0.101
Token But
Token ID887
Feature activation-1.66e-02

Pos logit contributions
rob+0.007
 batch+0.007
bar+0.007
ip+0.006
ah+0.006
Neg logit contributions
ires-0.006
erv-0.006
 confusion-0.005
adoes-0.005
 gut-0.005
Token of
Token ID286
Feature activation+2.73e-01

Pos logit contributions
rob+0.096
 batch+0.095
bar+0.087
ip+0.083
ird+0.072
Neg logit contributions
ires-0.121
adoes-0.121
 gut-0.118
erv-0.118
 confusion-0.116
Token fun
Token ID1257
Feature activation-2.07e+00

Pos logit contributions
rob+0.010
 batch+0.010
bar+0.010
ip+0.010
 million+0.009
Neg logit contributions
ires-0.006
 confusion-0.006
erv-0.006
 gut-0.005
adoes-0.005
Token watching
Token ID4964
Feature activation-3.80e+00

Pos logit contributions
ires+0.048
erv+0.045
adoes+0.044
 gut+0.043
 confusion+0.043
Neg logit contributions
rob-0.077
 batch-0.075
bar-0.072
ip-0.069
ah-0.064
Token the
Token ID262
Feature activation+2.87e-01

Pos logit contributions
ires+0.097
erv+0.093
 gut+0.091
 confusion+0.090
adoes+0.089
Neg logit contributions
rob-0.134
 batch-0.130
bar-0.121
ip-0.118
ah-0.107
Token ac
Token ID936
Feature activation+4.19e+00

Pos logit contributions
rob+0.009
 batch+0.009
bar+0.009
ip+0.008
ah+0.008
Neg logit contributions
ires-0.008
erv-0.008
 gut-0.007
 confusion-0.007
adoes-0.007
Tokenrob
Token ID22609
Feature activation+1.31e+01

Pos logit contributions
 batch+0.021
bar+0.015
rob+0.005
 million+0.002
ip+0.002
Neg logit contributions
ires-0.225
iring-0.213
erv-0.209
adoes-0.208
 gut-0.207
Tokenats
Token ID1381
Feature activation-3.42e-01

Pos logit contributions
rob+0.348
 batch+0.342
bar+0.330
ip+0.292
 million+0.284
Neg logit contributions
ires-0.436
erv-0.408
 gut-0.389
adoes-0.383
iring-0.382
Token and
Token ID290
Feature activation-8.09e-01

Pos logit contributions
ires+0.008
erv+0.007
adoes+0.007
 gut+0.007
 confusion+0.007
Neg logit contributions
rob-0.013
 batch-0.013
ip-0.012
bar-0.011
P-0.011
Token clown
Token ID25573
Feature activation-2.03e+00

Pos logit contributions
ires+0.020
erv+0.019
 gut+0.019
 confusion+0.019
adoes+0.018
Neg logit contributions
rob-0.029
 batch-0.028
bar-0.026
ip-0.026
ah-0.024
Tokens
Token ID82
Feature activation+3.00e-01

Pos logit contributions
ires+0.056
erv+0.055
adoes+0.052
 gut+0.052
 confusion+0.051
Neg logit contributions
rob-0.065
 batch-0.063
bar-0.056
ip-0.054
P-0.051
Token.
Token ID13
Feature activation+1.03e-01

Pos logit contributions
rob+0.009
 batch+0.009
bar+0.009
ip+0.008
ah+0.008
Neg logit contributions
ires-0.009
erv-0.008
 gut-0.008
adoes-0.008
 confusion-0.008
Token thought
Token ID1807
Feature activation-1.93e+00

Pos logit contributions
rob+0.024
 batch+0.024
bar+0.022
ip+0.021
ah+0.019
Neg logit contributions
ires-0.022
erv-0.021
 gut-0.020
 confusion-0.020
adoes-0.020
Token they
Token ID484
Feature activation+2.49e+00

Pos logit contributions
erv+0.064
ires+0.063
 confusion+0.061
adoes+0.061
 gut+0.060
Neg logit contributions
 batch-0.052
rob-0.049
ip-0.047
bar-0.045
ah-0.038
Token looked
Token ID3114
Feature activation+8.86e-01

Pos logit contributions
rob+0.062
 batch+0.060
bar+0.055
ip+0.052
ah+0.047
Neg logit contributions
ires-0.090
erv-0.086
 gut-0.085
adoes-0.084
 confusion-0.083
Token like
Token ID588
Feature activation-2.68e+00

Pos logit contributions
 batch+0.022
rob+0.019
ip+0.018
 million+0.018
bar+0.017
Neg logit contributions
ires-0.030
 gut-0.029
adoes-0.029
erv-0.029
 confusion-0.028
Token ac
Token ID936
Feature activation+4.18e+00

Pos logit contributions
ires+0.077
erv+0.075
 gut+0.072
 confusion+0.072
adoes+0.070
Neg logit contributions
rob-0.084
 batch-0.084
bar-0.074
ip-0.074
P-0.066
Tokenrob
Token ID22609
Feature activation+1.31e+01

Pos logit contributions
 batch+0.044
bar+0.037
 loaf+0.021
 french+0.021
 gob+0.020
Neg logit contributions
ires-0.200
iring-0.190
adoes-0.189
erv-0.189
 gut-0.186
Tokenats
Token ID1381
Feature activation-3.88e-01

Pos logit contributions
rob+0.337
 batch+0.324
bar+0.303
ip+0.276
tail+0.249
Neg logit contributions
ires-0.457
erv-0.442
 gut-0.425
adoes-0.424
 confusion-0.423
Token.
Token ID13
Feature activation+1.70e-01

Pos logit contributions
ires+0.008
erv+0.008
adoes+0.008
 gut+0.008
 confusion+0.008
Neg logit contributions
rob-0.015
 batch-0.015
bar-0.014
ip-0.013
P-0.012
Token She
Token ID1375
Feature activation+1.88e+00

Pos logit contributions
 batch+0.005
ip+0.004
rob+0.004
bar+0.004
tail+0.004
Neg logit contributions
 gut-0.005
ires-0.005
 confusion-0.005
erv-0.005
iring-0.005
Token enjoyed
Token ID8359
Feature activation+8.95e-01

Pos logit contributions
rob+0.060
 batch+0.058
bar+0.054
ip+0.052
ah+0.047
Neg logit contributions
ires-0.054
erv-0.052
 gut-0.051
 confusion-0.050
adoes-0.050
Token the
Token ID262
Feature activation+5.79e-01

Pos logit contributions
rob+0.030
 batch+0.030
bar+0.028
ip+0.027
ah+0.024
Neg logit contributions
ires-0.024
erv-0.023
 gut-0.023
 confusion-0.022
adoes-0.022
Token elephants
Token ID25368
Feature activation-1.79e+00

Pos logit contributions
ires+0.001
erv+0.001
 gut+0.001
adoes+0.001
 confusion+0.001
Neg logit contributions
rob-0.002
 batch-0.001
bar-0.001
ip-0.001
ah-0.001
Token,
Token ID11
Feature activation-4.13e-02

Pos logit contributions
ires+0.051
erv+0.050
 gut+0.049
adoes+0.048
 confusion+0.048
Neg logit contributions
rob-0.058
 batch-0.055
bar-0.052
ip-0.051
ah-0.046
Token lions
Token ID31420
Feature activation-1.03e+00

Pos logit contributions
ires+0.001
erv+0.001
 gut+0.001
iring+0.001
adoes+0.001
Neg logit contributions
rob-0.001
 batch-0.001
bar-0.001
ip-0.001
ah-0.001
Token and
Token ID290
Feature activation-7.65e-01

Pos logit contributions
ires+0.033
erv+0.033
 gut+0.032
 confusion+0.031
adoes+0.031
Neg logit contributions
rob-0.030
 batch-0.028
bar-0.026
ip-0.026
ah-0.023
Token ac
Token ID936
Feature activation+4.18e+00

Pos logit contributions
ires+0.020
erv+0.019
 gut+0.018
 confusion+0.018
adoes+0.018
Neg logit contributions
rob-0.027
 batch-0.026
bar-0.024
ip-0.024
ah-0.021
Tokenrob
Token ID22609
Feature activation+1.29e+01

Pos logit contributions
 batch+0.026
bar+0.017
rob+0.010
ip+0.004
 million+0.002
Neg logit contributions
ires-0.223
adoes-0.213
erv-0.212
 gut-0.210
iring-0.207
Tokenats
Token ID1381
Feature activation-4.38e-01

Pos logit contributions
rob+0.348
 batch+0.341
bar+0.321
ip+0.287
 million+0.273
Neg logit contributions
ires-0.435
erv-0.417
adoes-0.398
 confusion-0.397
 gut-0.395
Token.
Token ID13
Feature activation-3.06e-02

Pos logit contributions
ires+0.010
erv+0.010
adoes+0.010
 gut+0.010
 confusion+0.010
Neg logit contributions
rob-0.016
 batch-0.016
bar-0.015
ip-0.014
ah-0.013
Token They
Token ID1119
Feature activation+1.44e+00

Pos logit contributions
ires+0.001
erv+0.001
 gut+0.001
 confusion+0.001
adoes+0.001
Neg logit contributions
rob-0.001
 batch-0.001
bar-0.001
ip-0.001
tail-0.001
Token are
Token ID389
Feature activation+1.47e+00

Pos logit contributions
rob+0.046
 batch+0.044
bar+0.042
ip+0.039
ah+0.038
Neg logit contributions
ires-0.041
erv-0.039
adoes-0.038
 gut-0.037
iring-0.037
Token amazed
Token ID24562
Feature activation-9.55e-01

Pos logit contributions
rob+0.050
 batch+0.048
bar+0.045
ip+0.044
ah+0.040
Neg logit contributions
ires-0.039
erv-0.038
 gut-0.037
 confusion-0.037
adoes-0.037
Token clown
Token ID25573
Feature activation-2.11e+00

Pos logit contributions
rob+0.010
 batch+0.010
bar+0.009
ip+0.008
ah+0.008
Neg logit contributions
ires-0.008
erv-0.007
 gut-0.007
adoes-0.007
 confusion-0.007
Tokens
Token ID82
Feature activation+1.63e-01

Pos logit contributions
ires+0.059
erv+0.058
 gut+0.057
adoes+0.056
 confusion+0.054
Neg logit contributions
rob-0.068
 batch-0.067
bar-0.063
ip-0.061
c-0.056
Token and
Token ID290
Feature activation-1.22e+00

Pos logit contributions
 batch+0.005
rob+0.005
bar+0.005
ip+0.004
c+0.004
Neg logit contributions
ires-0.005
erv-0.005
 gut-0.004
adoes-0.004
iring-0.004
Token the
Token ID262
Feature activation+2.60e-01

Pos logit contributions
ires+0.030
 confusion+0.029
erv+0.029
 gut+0.029
adoes+0.029
Neg logit contributions
rob-0.044
 batch-0.042
bar-0.039
ip-0.039
ah-0.035
Token ac
Token ID936
Feature activation+4.07e+00

Pos logit contributions
rob+0.009
 batch+0.009
bar+0.008
ip+0.008
ah+0.007
Neg logit contributions
ires-0.007
erv-0.006
 gut-0.006
 confusion-0.006
adoes-0.006
Tokenrob
Token ID22609
Feature activation+1.29e+01

Pos logit contributions
 batch+0.040
bar+0.023
 french+0.020
ip+0.015
 gob+0.015
Neg logit contributions
ires-0.198
iring-0.188
adoes-0.188
 gut-0.186
erv-0.183
Tokenats
Token ID1381
Feature activation-6.31e-01

Pos logit contributions
rob+0.350
 batch+0.349
bar+0.323
ip+0.296
tail+0.282
Neg logit contributions
ires-0.426
erv-0.405
 gut-0.385
 confusion-0.384
adoes-0.382
Token and
Token ID290
Feature activation-1.28e+00

Pos logit contributions
ires+0.016
erv+0.015
 gut+0.015
adoes+0.014
 confusion+0.014
Neg logit contributions
rob-0.022
 batch-0.022
bar-0.021
ip-0.020
ah-0.018
Token the
Token ID262
Feature activation+2.14e-01

Pos logit contributions
 confusion+0.027
adoes+0.026
ires+0.026
 gut+0.026
 stops+0.025
Neg logit contributions
rob-0.050
 batch-0.048
bar-0.046
ip-0.046
P-0.042
Token lions
Token ID31420
Feature activation-1.10e+00

Pos logit contributions
rob+0.008
 batch+0.008
ip+0.007
bar+0.007
eb+0.006
Neg logit contributions
ires-0.005
 confusion-0.005
 gut-0.005
erv-0.005
adoes-0.005
Token and
Token ID290
Feature activation-1.29e+00

Pos logit contributions
ires+0.026
erv+0.025
 gut+0.024
 confusion+0.024
adoes+0.024
Neg logit contributions
rob-0.040
 batch-0.039
bar-0.036
ip-0.035
ah-0.032
Token saw
Token ID2497
Feature activation-1.54e+00

Pos logit contributions
rob+0.009
 batch+0.009
bar+0.008
ip+0.008
ah+0.007
Neg logit contributions
ires-0.011
erv-0.011
adoes-0.011
 gut-0.011
 confusion-0.010
Token clown
Token ID25573
Feature activation-2.16e+00

Pos logit contributions
erv+0.046
ires+0.044
 gut+0.044
adoes+0.042
 confusion+0.040
Neg logit contributions
rob-0.049
 batch-0.048
bar-0.043
ip-0.040
 peanut-0.038
Tokens
Token ID82
Feature activation+1.86e-01

Pos logit contributions
ires+0.079
 gut+0.078
erv+0.077
 confusion+0.076
adoes+0.075
Neg logit contributions
rob-0.053
 batch-0.051
ip-0.046
bar-0.046
ah-0.040
Token and
Token ID290
Feature activation-1.10e+00

Pos logit contributions
 batch+0.006
rob+0.006
bar+0.005
ip+0.005
 gob+0.005
Neg logit contributions
ires-0.005
erv-0.005
 gut-0.005
adoes-0.005
 confusion-0.004
Token ac
Token ID936
Feature activation+4.18e+00

Pos logit contributions
ires+0.027
 confusion+0.026
erv+0.026
adoes+0.026
 gut+0.025
Neg logit contributions
rob-0.039
 batch-0.038
bar-0.036
ip-0.035
ah-0.032
Tokenrob
Token ID22609
Feature activation+1.28e+01

Pos logit contributions
 batch+0.046
bar+0.034
 million+0.020
ird+0.019
rob+0.019
Neg logit contributions
ires-0.202
adoes-0.194
iring-0.192
erv-0.190
 gut-0.190
Tokenats
Token ID1381
Feature activation-4.04e-01

Pos logit contributions
rob+0.340
 batch+0.331
bar+0.307
ip+0.281
tail+0.259
Neg logit contributions
ires-0.437
erv-0.423
 confusion-0.406
 gut-0.405
adoes-0.404
Token and
Token ID290
Feature activation-1.13e+00

Pos logit contributions
ires+0.009
erv+0.009
 gut+0.008
 confusion+0.008
adoes+0.008
Neg logit contributions
rob-0.016
 batch-0.015
bar-0.014
ip-0.014
ah-0.013
Token lions
Token ID31420
Feature activation-9.89e-01

Pos logit contributions
ires+0.028
 confusion+0.027
adoes+0.027
erv+0.027
 gut+0.027
Neg logit contributions
rob-0.040
 batch-0.039
bar-0.036
ip-0.035
eb-0.032
Token and
Token ID290
Feature activation-1.13e+00

Pos logit contributions
ires+0.026
erv+0.025
 gut+0.024
 confusion+0.024
adoes+0.024
Neg logit contributions
rob-0.035
 batch-0.033
bar-0.031
ip-0.030
tail-0.028
Token elephants
Token ID25368
Feature activation-1.61e+00

Pos logit contributions
ires+0.030
erv+0.029
 gut+0.028
 confusion+0.028
adoes+0.028
Neg logit contributions
rob-0.038
 batch-0.038
bar-0.035
ip-0.034
ah-0.031
Token very
Token ID845
Feature activation+4.40e+00

Pos logit contributions
ires+0.068
erv+0.066
 gut+0.064
 confusion+0.064
adoes+0.064
Neg logit contributions
rob-0.084
 batch-0.081
bar-0.075
ip-0.073
ah-0.066
Token still
Token ID991
Feature activation-4.82e-01

Pos logit contributions
 batch+0.158
rob+0.154
bar+0.148
ip+0.147
 million+0.131
Neg logit contributions
adoes-0.100
ires-0.100
erv-0.099
 gut-0.096
 confusion-0.096
Token so
Token ID523
Feature activation+6.04e+00

Pos logit contributions
ires+0.011
erv+0.010
 confusion+0.010
 gut+0.010
adoes+0.010
Neg logit contributions
rob-0.019
 batch-0.018
bar-0.017
ip-0.016
ah-0.015
Token the
Token ID262
Feature activation+5.16e-01

Pos logit contributions
rob+0.214
 batch+0.212
bar+0.198
ip+0.196
ah+0.176
Neg logit contributions
ires-0.148
erv-0.145
adoes-0.140
 gut-0.138
 confusion-0.137
Token ha
Token ID387
Feature activation+5.55e+00

Pos logit contributions
rob+0.016
 batch+0.015
bar+0.014
ip+0.014
eb+0.012
Neg logit contributions
ires-0.015
 gut-0.015
erv-0.015
 confusion-0.015
adoes-0.014
Tokenird
Token ID1447
Feature activation+1.27e+01

Pos logit contributions
rob+0.089
 batch+0.087
bar+0.064
ip+0.053
 million+0.049
Neg logit contributions
erv-0.245
ires-0.237
 gut-0.232
 confusion-0.229
adoes-0.225
Tokenress
Token ID601
Feature activation+4.08e+00

Pos logit contributions
rob+0.358
 batch+0.333
 million+0.261
 gob+0.258
P+0.257
Neg logit contributions
erv-0.424
ires-0.415
 gut-0.414
iring-0.397
 confusion-0.380
Tokener
Token ID263
Feature activation+1.94e-01

Pos logit contributions
rob+0.156
 batch+0.154
bar+0.138
ip+0.135
 million+0.131
Neg logit contributions
 gut-0.085
ires-0.085
erv-0.082
adoes-0.081
iring-0.077
Token could
Token ID714
Feature activation+1.47e+00

Pos logit contributions
rob+0.006
 batch+0.006
bar+0.005
ip+0.005
 million+0.005
Neg logit contributions
ires-0.005
erv-0.005
 gut-0.005
adoes-0.005
 confusion-0.005
Token do
Token ID466
Feature activation+2.89e+00

Pos logit contributions
rob+0.057
 batch+0.054
ip+0.050
bar+0.049
P+0.045
Neg logit contributions
ires-0.033
 gut-0.032
erv-0.032
 confusion-0.031
adoes-0.030
Token a
Token ID257
Feature activation-4.99e-01

Pos logit contributions
rob+0.103
 batch+0.099
bar+0.092
ip+0.089
P+0.082
Neg logit contributions
ires-0.072
 gut-0.068
erv-0.068
 confusion-0.067
adoes-0.067
Token his
Token ID465
Feature activation-2.20e+00

Pos logit contributions
adoes+0.027
 footsteps+0.026
 gut+0.025
ires+0.025
 confusion+0.023
Neg logit contributions
rob-0.077
 batch-0.073
ip-0.072
bar-0.070
 million-0.068
Token mom
Token ID1995
Feature activation-7.07e-01

Pos logit contributions
ires+0.054
erv+0.054
adoes+0.052
 confusion+0.049
iring+0.047
Neg logit contributions
 batch-0.078
rob-0.072
bar-0.070
ip-0.069
ah-0.062
Token and
Token ID290
Feature activation-8.77e-01

Pos logit contributions
ires+0.017
erv+0.015
 gut+0.015
 confusion+0.015
iring+0.015
Neg logit contributions
rob-0.028
 batch-0.024
P-0.023
ww-0.022
bar-0.022
Token the
Token ID262
Feature activation-1.46e-01

Pos logit contributions
 gut+0.020
 stops+0.020
 confusion+0.019
adoes+0.019
ires+0.019
Neg logit contributions
rob-0.033
 batch-0.030
bar-0.028
ip-0.028
ah-0.027
Token ha
Token ID387
Feature activation+5.40e+00

Pos logit contributions
ires+0.004
erv+0.004
adoes+0.003
 gut+0.003
iring+0.003
Neg logit contributions
rob-0.005
 batch-0.005
bar-0.005
ip-0.004
 million-0.004
Tokenird
Token ID1447
Feature activation+1.26e+01

Pos logit contributions
rob+0.120
 batch+0.117
 gob+0.089
 million+0.086
 loaf+0.086
Neg logit contributions
erv-0.206
ires-0.190
iring-0.185
 gut-0.181
ever-0.181
Tokenress
Token ID601
Feature activation+3.88e+00

Pos logit contributions
rob+0.403
 batch+0.361
 gob+0.315
ww+0.315
www+0.311
Neg logit contributions
erv-0.376
iring-0.368
 gut-0.365
ires-0.363
 upon-0.331
Tokener
Token ID263
Feature activation+3.82e-02

Pos logit contributions
rob+0.153
 batch+0.152
 million+0.136
ip+0.133
bar+0.132
Neg logit contributions
ires-0.074
 gut-0.074
erv-0.070
iring-0.069
adoes-0.068
Token.
Token ID13
Feature activation-2.85e-02

Pos logit contributions
rob+0.001
 batch+0.001
ip+0.001
bar+0.001
 million+0.001
Neg logit contributions
ires-0.001
 gut-0.001
erv-0.001
adoes-0.001
 confusion-0.001
Token "
Token ID366
Feature activation+5.49e-02

Pos logit contributions
ires+0.001
erv+0.001
adoes+0.001
 confusion+0.001
 gut+0.001
Neg logit contributions
rob-0.001
 batch-0.001
bar-0.001
ip-0.001
ah-0.001
TokenI
Token ID40
Feature activation+1.19e+00

Pos logit contributions
 batch+0.002
rob+0.002
ip+0.002
bar+0.002
 loaf+0.002
Neg logit contributions
ires-0.001
erv-0.001
 confusion-0.001
adoes-0.001
 gut-0.001
Token chair
Token ID5118
Feature activation+3.82e+00

Pos logit contributions
rob+0.006
ip+0.005
 batch+0.005
bar+0.005
eb+0.005
Neg logit contributions
 confusion-0.004
 gut-0.003
 footsteps-0.003
ires-0.003
adoes-0.003
Token and
Token ID290
Feature activation-6.12e-01

Pos logit contributions
rob+0.152
 batch+0.140
ip+0.130
bar+0.129
P+0.122
Neg logit contributions
ires-0.079
 gut-0.077
erv-0.076
adoes-0.075
 confusion-0.074
Token allowed
Token ID3142
Feature activation+2.42e+00

Pos logit contributions
ires+0.017
erv+0.016
 gut+0.016
 confusion+0.016
adoes+0.016
Neg logit contributions
rob-0.020
 batch-0.019
bar-0.018
ip-0.017
ah-0.016
Token the
Token ID262
Feature activation+2.58e-01

Pos logit contributions
rob+0.093
 batch+0.086
bar+0.082
ip+0.080
ah+0.073
Neg logit contributions
ires-0.058
erv-0.055
 confusion-0.055
adoes-0.054
 gut-0.053
Token ha
Token ID387
Feature activation+5.52e+00

Pos logit contributions
rob+0.008
 batch+0.008
bar+0.007
ip+0.007
ah+0.006
Neg logit contributions
ires-0.008
erv-0.007
adoes-0.007
 gut-0.007
 confusion-0.007
Tokenird
Token ID1447
Feature activation+1.26e+01

Pos logit contributions
rob+0.117
 batch+0.106
bar+0.083
ip+0.074
 million+0.073
Neg logit contributions
erv-0.223
ires-0.214
 gut-0.209
 confusion-0.205
adoes-0.201
Tokenress
Token ID601
Feature activation+3.96e+00

Pos logit contributions
rob+0.391
 batch+0.343
P+0.285
 gob+0.284
 million+0.282
Neg logit contributions
erv-0.405
ires-0.398
 gut-0.391
iring-0.377
 confusion-0.361
Tokener
Token ID263
Feature activation+8.36e-02

Pos logit contributions
rob+0.149
 batch+0.143
bar+0.129
ip+0.126
 million+0.124
Neg logit contributions
ires-0.089
 gut-0.086
erv-0.086
adoes-0.085
 confusion-0.080
Token to
Token ID284
Feature activation+9.49e-01

Pos logit contributions
rob+0.003
 batch+0.003
bar+0.003
ip+0.002
ah+0.002
Neg logit contributions
ires-0.002
erv-0.002
 gut-0.002
 confusion-0.002
adoes-0.002
Token give
Token ID1577
Feature activation+1.35e+00

Pos logit contributions
rob+0.037
 batch+0.035
ip+0.033
bar+0.032
 million+0.030
Neg logit contributions
ires-0.020
 gut-0.020
 confusion-0.020
erv-0.020
adoes-0.019
Token her
Token ID607
Feature activation-1.81e+00

Pos logit contributions
rob+0.051
 batch+0.048
bar+0.046
ip+0.043
ah+0.041
Neg logit contributions
ires-0.033
erv-0.031
 confusion-0.031
adoes-0.031
 gut-0.030
Token in
Token ID287
Feature activation+5.37e-01

Pos logit contributions
ires+0.008
erv+0.008
 gut+0.007
adoes+0.007
 confusion+0.007
Neg logit contributions
rob-0.009
 batch-0.009
bar-0.008
ip-0.008
ah-0.007
Token and
Token ID290
Feature activation-2.71e-01

Pos logit contributions
rob+0.022
 batch+0.021
bar+0.020
ip+0.020
ah+0.018
Neg logit contributions
 confusion-0.011
erv-0.011
ires-0.011
 gut-0.010
adoes-0.010
Token let
Token ID1309
Feature activation+2.76e+00

Pos logit contributions
erv+0.006
 confusion+0.006
 gut+0.006
ires+0.006
 stops+0.006
Neg logit contributions
rob-0.010
 batch-0.009
bar-0.008
ip-0.008
 loaf-0.008
Token the
Token ID262
Feature activation+5.72e-01

Pos logit contributions
rob+0.109
 batch+0.099
bar+0.094
ip+0.091
P+0.083
Neg logit contributions
ires-0.063
erv-0.062
adoes-0.061
 confusion-0.061
 gut-0.060
Token ha
Token ID387
Feature activation+5.53e+00

Pos logit contributions
rob+0.020
 batch+0.019
bar+0.017
ip+0.017
eb+0.015
Neg logit contributions
 confusion-0.015
erv-0.015
 gut-0.015
ires-0.015
 footsteps-0.014
Tokenird
Token ID1447
Feature activation+1.26e+01

Pos logit contributions
rob+0.092
 batch+0.083
bar+0.057
 gob+0.052
 million+0.050
Neg logit contributions
erv-0.249
ires-0.231
 gut-0.225
 confusion-0.225
iring-0.221
Tokenress
Token ID601
Feature activation+4.03e+00

Pos logit contributions
rob+0.374
 batch+0.332
 gob+0.265
 million+0.264
ip+0.260
Neg logit contributions
erv-0.425
ires-0.412
 gut-0.404
iring-0.396
 confusion-0.382
Tokener
Token ID263
Feature activation+2.13e-01

Pos logit contributions
rob+0.148
 batch+0.143
bar+0.131
ip+0.128
 million+0.122
Neg logit contributions
ires-0.095
erv-0.092
 gut-0.090
adoes-0.089
 confusion-0.087
Token go
Token ID467
Feature activation-1.57e-01

Pos logit contributions
rob+0.007
 batch+0.007
bar+0.007
ip+0.006
ah+0.006
Neg logit contributions
ires-0.006
erv-0.005
 gut-0.005
adoes-0.005
 confusion-0.005
Token to
Token ID284
Feature activation+1.09e+00

Pos logit contributions
ires+0.003
 gut+0.003
erv+0.003
 confusion+0.003
adoes+0.003
Neg logit contributions
rob-0.006
 batch-0.006
bar-0.005
ip-0.005
 million-0.005
Token work
Token ID670
Feature activation-1.29e+00

Pos logit contributions
rob+0.043
 batch+0.040
bar+0.038
ip+0.038
ah+0.035
Neg logit contributions
ires-0.023
 confusion-0.023
erv-0.023
 gut-0.023
adoes-0.021
Token called
Token ID1444
Feature activation+2.70e+00

Pos logit contributions
ires+0.013
 confusion+0.013
adoes+0.013
 gut+0.013
erv+0.013
Neg logit contributions
rob-0.018
 batch-0.017
ip-0.015
bar-0.015
P-0.014
Token it
Token ID340
Feature activation-4.25e-01

Pos logit contributions
rob+0.102
 batch+0.093
ip+0.086
bar+0.084
P+0.081
Neg logit contributions
ires-0.068
adoes-0.066
 confusion-0.065
 gut-0.065
erv-0.065
Token a
Token ID257
Feature activation-4.34e-01

Pos logit contributions
ires+0.012
erv+0.012
 gut+0.011
 confusion+0.011
adoes+0.011
Neg logit contributions
rob-0.014
 batch-0.013
bar-0.012
ip-0.012
ah-0.011
Token "
Token ID366
Feature activation+4.63e-01

Pos logit contributions
 confusion+0.010
 gut+0.010
erv+0.010
 stops+0.010
ires+0.009
Neg logit contributions
rob-0.018
 batch-0.015
ip-0.015
bar-0.014
tail-0.014
Tokenwe
Token ID732
Feature activation+4.70e-01

Pos logit contributions
rob+0.014
 batch+0.014
ip+0.012
bar+0.011
 loaf+0.011
Neg logit contributions
ires-0.013
erv-0.013
adoes-0.013
iring-0.012
 confusion-0.012
Tokenird
Token ID1447
Feature activation+1.26e+01

Pos logit contributions
rob+0.018
 batch+0.017
bar+0.015
ip+0.014
 million+0.014
Neg logit contributions
ires-0.011
erv-0.011
 confusion-0.010
adoes-0.010
 gut-0.009
Token rock
Token ID3881
Feature activation+8.46e-01

Pos logit contributions
rob+0.368
 batch+0.328
 million+0.262
 gob+0.259
ip+0.259
Neg logit contributions
ires-0.416
erv-0.413
 gut-0.399
adoes-0.396
iring-0.387
Token."
Token ID526
Feature activation+2.10e+00

Pos logit contributions
rob+0.028
 batch+0.027
bar+0.025
ip+0.024
ah+0.022
Neg logit contributions
ires-0.024
erv-0.023
 gut-0.022
adoes-0.022
 confusion-0.022
Token Tim
Token ID5045
Feature activation-3.41e-01

Pos logit contributions
rob+0.082
 batch+0.075
ip+0.069
bar+0.067
ah+0.065
Neg logit contributions
erv-0.049
ires-0.047
 confusion-0.046
adoes-0.046
 gut-0.044
Tokenmy
Token ID1820
Feature activation-1.39e+00

Pos logit contributions
erv+0.010
 confusion+0.010
ires+0.010
 gut+0.010
adoes+0.010
Neg logit contributions
rob-0.011
 batch-0.010
bar-0.009
ip-0.008
 loaf-0.008
Token got
Token ID1392
Feature activation+5.35e-01

Pos logit contributions
erv+0.034
ires+0.034
 confusion+0.033
 gut+0.033
adoes+0.032
Neg logit contributions
rob-0.050
 batch-0.048
bar-0.044
ip-0.043
tail-0.039
Token am
Token ID716
Feature activation+2.06e+00

Pos logit contributions
rob+0.045
 batch+0.044
bar+0.041
ip+0.039
ah+0.036
Neg logit contributions
ires-0.036
erv-0.035
 gut-0.034
 confusion-0.033
adoes-0.033
Token ready
Token ID3492
Feature activation+2.53e+00

Pos logit contributions
rob+0.078
 batch+0.074
bar+0.070
ip+0.066
eb+0.061
Neg logit contributions
ires-0.049
erv-0.048
 gut-0.047
 confusion-0.044
adoes-0.044
Token!"
Token ID2474
Feature activation+2.82e+00

Pos logit contributions
rob+0.091
 batch+0.084
bar+0.079
ip+0.075
P+0.069
Neg logit contributions
ires-0.066
erv-0.064
 gut-0.062
 confusion-0.059
adoes-0.059
Token The
Token ID383
Feature activation-4.32e-03

Pos logit contributions
rob+0.112
 batch+0.100
bar+0.097
ah+0.088
ip+0.087
Neg logit contributions
erv-0.063
ires-0.058
 confusion-0.054
adoes-0.053
 stops-0.053
Token ha
Token ID387
Feature activation+5.51e+00

Pos logit contributions
ires+0.000
erv+0.000
adoes+0.000
 gut+0.000
 confusion+0.000
Neg logit contributions
rob-0.000
 batch-0.000
bar-0.000
ip-0.000
ah-0.000
Tokenird
Token ID1447
Feature activation+1.26e+01

Pos logit contributions
rob+0.146
 batch+0.139
bar+0.117
ip+0.109
 million+0.106
Neg logit contributions
erv-0.190
ires-0.178
 confusion-0.173
 gut-0.171
iring-0.166
Tokenress
Token ID601
Feature activation+4.00e+00

Pos logit contributions
rob+0.370
 batch+0.332
 million+0.270
P+0.269
ip+0.261
Neg logit contributions
erv-0.410
 gut-0.396
ires-0.395
iring-0.378
 confusion-0.372
Tokener
Token ID263
Feature activation+7.59e-02

Pos logit contributions
rob+0.154
 batch+0.149
bar+0.136
ip+0.132
 million+0.128
Neg logit contributions
ires-0.083
erv-0.082
 gut-0.081
adoes-0.080
 confusion-0.077
Token put
Token ID1234
Feature activation+3.83e+00

Pos logit contributions
rob+0.002
 batch+0.002
bar+0.002
ip+0.002
tail+0.002
Neg logit contributions
erv-0.002
ires-0.002
 gut-0.002
 confusion-0.002
adoes-0.002
Token a
Token ID257
Feature activation-5.87e-01

Pos logit contributions
rob+0.135
 batch+0.131
bar+0.121
ip+0.115
tail+0.111
Neg logit contributions
ires-0.091
 confusion-0.089
adoes-0.087
erv-0.082
 gut-0.081
Token cape
Token ID36023
Feature activation+1.28e+00

Pos logit contributions
 confusion+0.013
 gut+0.013
erv+0.013
 little+0.012
 stops+0.012
Neg logit contributions
rob-0.023
 batch-0.020
bar-0.019
ip-0.019
eb-0.018
Token happy
Token ID3772
Feature activation-1.39e+00

Pos logit contributions
rob+0.156
 batch+0.155
bar+0.145
ip+0.142
ah+0.130
Neg logit contributions
ires-0.085
erv-0.082
adoes-0.080
 gut-0.078
 confusion-0.077
Token and
Token ID290
Feature activation-7.77e-01

Pos logit contributions
ires+0.034
erv+0.032
 gut+0.031
 confusion+0.031
adoes+0.031
Neg logit contributions
rob-0.050
 batch-0.050
bar-0.046
ip-0.045
ah-0.041
Token thanked
Token ID26280
Feature activation-1.71e+00

Pos logit contributions
ires+0.019
 gut+0.019
 confusion+0.019
erv+0.019
 stops+0.018
Neg logit contributions
rob-0.028
 batch-0.026
bar-0.024
ip-0.023
ah-0.022
Token the
Token ID262
Feature activation+2.78e-01

Pos logit contributions
 gut+0.027
 footsteps+0.026
ires+0.025
 confusion+0.025
adoes+0.024
Neg logit contributions
rob-0.075
 batch-0.072
bar-0.070
ip-0.069
 million-0.066
Token ha
Token ID387
Feature activation+5.42e+00

Pos logit contributions
rob+0.008
 batch+0.007
bar+0.007
ip+0.007
ah+0.006
Neg logit contributions
ires-0.009
erv-0.009
 gut-0.009
 confusion-0.009
adoes-0.008
Tokenird
Token ID1447
Feature activation+1.26e+01

Pos logit contributions
rob+0.101
 batch+0.097
bar+0.068
 gob+0.067
 loaf+0.066
Neg logit contributions
erv-0.223
ires-0.209
 gut-0.208
iring-0.204
 confusion-0.198
Tokenress
Token ID601
Feature activation+3.94e+00

Pos logit contributions
rob+0.399
 batch+0.356
 gob+0.294
 million+0.293
P+0.293
Neg logit contributions
 gut-0.392
erv-0.386
iring-0.378
ires-0.378
 confusion-0.342
Tokener
Token ID263
Feature activation+3.19e-02

Pos logit contributions
rob+0.159
 batch+0.156
bar+0.140
 million+0.140
 loaf+0.136
Neg logit contributions
 gut-0.075
ires-0.072
erv-0.069
iring-0.068
adoes-0.065
Token.
Token ID13
Feature activation+8.64e-02

Pos logit contributions
rob+0.001
 batch+0.001
bar+0.001
ip+0.001
P+0.001
Neg logit contributions
ires-0.001
 gut-0.001
erv-0.001
adoes-0.001
 confusion-0.001
Token As
Token ID1081
Feature activation-6.01e+00

Pos logit contributions
rob+0.003
 batch+0.003
bar+0.003
ip+0.003
ah+0.003
Neg logit contributions
ires-0.002
erv-0.002
 gut-0.002
adoes-0.002
 confusion-0.002
Token they
Token ID484
Feature activation+2.51e+00

Pos logit contributions
 gut+0.118
ires+0.117
 confusion+0.116
 earthquakes+0.112
 curiosity+0.109
Neg logit contributions
rob-0.242
 batch-0.227
bar-0.207
tail-0.198
ip-0.198
Token the
Token ID262
Feature activation+7.25e-02

Pos logit contributions
ires+0.038
erv+0.034
adoes+0.034
 confusion+0.034
 gut+0.034
Neg logit contributions
rob-0.062
 batch-0.061
bar-0.059
ip-0.056
tail-0.053
Token animals
Token ID4695
Feature activation-5.42e+00

Pos logit contributions
rob+0.003
 batch+0.003
bar+0.002
ip+0.002
ah+0.002
Neg logit contributions
erv-0.002
ires-0.002
 gut-0.002
 confusion-0.002
adoes-0.002
Token,
Token ID11
Feature activation-6.24e-01

Pos logit contributions
ires+0.115
erv+0.114
 gut+0.109
adoes+0.108
 confusion+0.108
Neg logit contributions
rob-0.211
 batch-0.207
bar-0.193
ip-0.190
tail-0.174
Token the
Token ID262
Feature activation+5.45e-02

Pos logit contributions
ires+0.016
erv+0.016
 gut+0.015
adoes+0.015
 confusion+0.015
Neg logit contributions
rob-0.022
 batch-0.021
bar-0.019
ip-0.019
ah-0.017
Token ac
Token ID936
Feature activation+3.91e+00

Pos logit contributions
rob+0.002
 batch+0.002
bar+0.002
ip+0.002
ah+0.002
Neg logit contributions
erv-0.001
ires-0.001
 gut-0.001
adoes-0.001
 confusion-0.001
Tokenrob
Token ID22609
Feature activation+1.26e+01

Pos logit contributions
 batch+0.029
bar+0.021
ird+0.012
 million+0.010
 french+0.009
Neg logit contributions
ires-0.203
iring-0.192
adoes-0.187
 gut-0.186
 stops-0.181
Tokenats
Token ID1381
Feature activation-6.60e-01

Pos logit contributions
rob+0.336
 batch+0.332
bar+0.316
ip+0.280
 million+0.271
Neg logit contributions
ires-0.422
erv-0.395
 gut-0.381
adoes-0.379
 confusion-0.374
Token and
Token ID290
Feature activation-1.32e+00

Pos logit contributions
ires+0.016
erv+0.015
 gut+0.015
adoes+0.014
 confusion+0.014
Neg logit contributions
rob-0.024
 batch-0.024
bar-0.022
ip-0.021
ah-0.019
Token the
Token ID262
Feature activation+2.52e-02

Pos logit contributions
ires+0.031
erv+0.030
 gut+0.029
 confusion+0.029
adoes+0.029
Neg logit contributions
rob-0.049
 batch-0.048
bar-0.045
ip-0.044
ah-0.040
Token clown
Token ID25573
Feature activation-2.36e+00

Pos logit contributions
rob+0.001
 batch+0.001
bar+0.001
ip+0.001
tail+0.001
Neg logit contributions
ires-0.001
erv-0.001
 gut-0.001
 confusion-0.001
adoes-0.000
Tokens
Token ID82
Feature activation-2.84e-01

Pos logit contributions
ires+0.062
erv+0.059
 stops+0.056
iring+0.055
adoes+0.055
Neg logit contributions
rob-0.079
 batch-0.077
bar-0.071
ip-0.065
P-0.064
Token like
Token ID588
Feature activation-2.67e+00

Pos logit contributions
erv+0.069
ires+0.069
 confusion+0.067
 gut+0.066
adoes+0.065
Neg logit contributions
rob-0.134
 batch-0.129
ip-0.118
bar-0.117
 loaf-0.107
Token it
Token ID340
Feature activation-9.26e-01

Pos logit contributions
ires+0.078
erv+0.076
 confusion+0.073
 gut+0.073
adoes+0.073
Neg logit contributions
rob-0.083
 batch-0.081
bar-0.075
ip-0.073
ah-0.065
Token and
Token ID290
Feature activation-8.82e-01

Pos logit contributions
ires+0.025
erv+0.024
 gut+0.023
 confusion+0.023
adoes+0.023
Neg logit contributions
rob-0.031
 batch-0.031
bar-0.029
ip-0.028
ah-0.025
Token the
Token ID262
Feature activation-8.15e-02

Pos logit contributions
adoes+0.022
ires+0.022
 gut+0.022
 confusion+0.022
erv+0.021
Neg logit contributions
rob-0.032
 batch-0.029
ip-0.027
bar-0.027
P-0.025
Token ha
Token ID387
Feature activation+5.33e+00

Pos logit contributions
 confusion+0.002
 gut+0.002
erv+0.002
ires+0.002
 footsteps+0.002
Neg logit contributions
rob-0.003
 batch-0.002
ip-0.002
bar-0.002
eb-0.002
Tokenird
Token ID1447
Feature activation+1.25e+01

Pos logit contributions
rob+0.094
 batch+0.089
bar+0.071
ip+0.060
 loaf+0.057
Neg logit contributions
erv-0.226
ires-0.219
 gut-0.214
 confusion-0.213
adoes-0.209
Tokenress
Token ID601
Feature activation+3.81e+00

Pos logit contributions
rob+0.370
 batch+0.329
 gob+0.271
P+0.265
 million+0.265
Neg logit contributions
erv-0.408
 gut-0.397
ires-0.393
iring-0.383
 confusion-0.369
Tokener
Token ID263
Feature activation-2.11e-01

Pos logit contributions
rob+0.146
 batch+0.144
bar+0.128
ip+0.125
 million+0.123
Neg logit contributions
ires-0.079
 gut-0.078
adoes-0.077
erv-0.076
 confusion-0.072
Token said
Token ID531
Feature activation+3.76e+00

Pos logit contributions
ires+0.006
erv+0.006
 gut+0.006
adoes+0.006
 confusion+0.006
Neg logit contributions
rob-0.007
 batch-0.006
bar-0.006
ip-0.006
tail-0.005
Token it
Token ID340
Feature activation-9.22e-01

Pos logit contributions
rob+0.128
 batch+0.123
bar+0.113
ip+0.110
ah+0.100
Neg logit contributions
ires-0.101
erv-0.097
 gut-0.096
 confusion-0.094
adoes-0.094
Token was
Token ID373
Feature activation-9.81e-02

Pos logit contributions
ires+0.026
erv+0.025
 gut+0.025
 confusion+0.025
adoes+0.024
Neg logit contributions
rob-0.029
 batch-0.029
bar-0.027
ip-0.026
ah-0.023
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
rob+0.226
 batch+0.211
bar+0.209
ip+0.192
 million+0.186
Neg logit contributions
ires-0.106
 gut-0.104
 confusion-0.103
erv-0.101
adoes-0.095
Token chair
Token ID5118
Feature activation+3.85e+00

Pos logit contributions
rob+0.010
 batch+0.010
ip+0.009
bar+0.009
eb+0.008
Neg logit contributions
 gut-0.008
ires-0.008
 confusion-0.008
erv-0.008
 footsteps-0.007
Token and
Token ID290
Feature activation-7.52e-01

Pos logit contributions
rob+0.191
 batch+0.173
ip+0.167
P+0.165
ww+0.164
Neg logit contributions
erv-0.044
ires-0.041
 gut-0.040
 confusion-0.038
adoes-0.034
Token the
Token ID262
Feature activation+2.50e-01

Pos logit contributions
erv+0.018
 confusion+0.017
ires+0.017
 stops+0.017
 gut+0.017
Neg logit contributions
rob-0.027
 batch-0.025
bar-0.023
ip-0.022
ah-0.021
Token ha
Token ID387
Feature activation+5.43e+00

Pos logit contributions
rob+0.007
 batch+0.006
bar+0.006
ip+0.005
eb+0.005
Neg logit contributions
ires-0.008
erv-0.008
 gut-0.008
 confusion-0.008
adoes-0.008
Tokenird
Token ID1447
Feature activation+1.25e+01

Pos logit contributions
rob+0.078
 batch+0.074
bar+0.053
 gob+0.042
ip+0.041
Neg logit contributions
erv-0.251
ires-0.239
 confusion-0.234
 gut-0.233
iring-0.227
Tokenress
Token ID601
Feature activation+3.99e+00

Pos logit contributions
rob+0.379
 batch+0.337
 gob+0.283
ww+0.277
 million+0.276
Neg logit contributions
erv-0.404
 gut-0.385
ires-0.383
iring-0.369
 confusion-0.356
Tokener
Token ID263
Feature activation+4.67e-02

Pos logit contributions
rob+0.156
 batch+0.153
bar+0.138
ip+0.133
 million+0.133
Neg logit contributions
 gut-0.080
ires-0.080
erv-0.079
adoes-0.076
 confusion-0.074
Token began
Token ID2540
Feature activation-2.10e+00

Pos logit contributions
rob+0.002
 batch+0.001
 million+0.001
bar+0.001
ip+0.001
Neg logit contributions
erv-0.001
 gut-0.001
ires-0.001
 confusion-0.001
adoes-0.001
Token to
Token ID284
Feature activation+9.53e-01

Pos logit contributions
erv+0.040
 confusion+0.040
 gut+0.039
ires+0.038
adoes+0.036
Neg logit contributions
rob-0.090
 batch-0.079
bar-0.073
ip-0.073
ah-0.071
Token cut
Token ID2005
Feature activation+4.15e+00

Pos logit contributions
rob+0.037
 batch+0.035
ip+0.032
bar+0.032
eb+0.029
Neg logit contributions
 confusion-0.021
ires-0.020
erv-0.020
 gut-0.020
adoes-0.019
Token his
Token ID465
Feature activation-2.00e+00

Pos logit contributions
 confusion+0.016
 gut+0.016
adoes+0.016
 stops+0.015
 footsteps+0.015
Neg logit contributions
rob-0.037
 batch-0.034
bar-0.032
ip-0.032
P-0.030
Token mom
Token ID1995
Feature activation-4.37e-01

Pos logit contributions
ires+0.038
erv+0.037
 gut+0.036
 confusion+0.035
adoes+0.034
Neg logit contributions
rob-0.084
 batch-0.082
bar-0.077
ip-0.075
ah-0.070
Token left
Token ID1364
Feature activation+9.16e-01

Pos logit contributions
erv+0.009
 stops+0.009
 confusion+0.009
 footsteps+0.008
ires+0.008
Neg logit contributions
rob-0.017
 batch-0.016
bar-0.014
 loaf-0.013
ip-0.013
Token the
Token ID262
Feature activation+2.07e-01

Pos logit contributions
rob+0.036
 batch+0.034
bar+0.032
ip+0.031
ah+0.030
Neg logit contributions
ires-0.020
 confusion-0.020
 gut-0.020
erv-0.019
adoes-0.019
Token ha
Token ID387
Feature activation+5.39e+00

Pos logit contributions
rob+0.006
 batch+0.006
bar+0.005
ip+0.005
ah+0.005
Neg logit contributions
ires-0.007
erv-0.006
 gut-0.006
 confusion-0.006
adoes-0.006
Tokenird
Token ID1447
Feature activation+1.25e+01

Pos logit contributions
rob+0.093
 batch+0.085
bar+0.064
 gob+0.053
ip+0.053
Neg logit contributions
erv-0.233
ires-0.219
 gut-0.212
 confusion-0.211
ever-0.209
Tokenress
Token ID601
Feature activation+3.94e+00

Pos logit contributions
rob+0.391
 batch+0.347
P+0.290
 gob+0.289
ww+0.288
Neg logit contributions
erv-0.389
 gut-0.379
ires-0.375
iring-0.368
 confusion-0.337
Tokener
Token ID263
Feature activation+4.32e-02

Pos logit contributions
rob+0.154
 batch+0.149
bar+0.136
ip+0.134
P+0.131
Neg logit contributions
 gut-0.079
ires-0.077
erv-0.075
adoes-0.073
iring-0.071
Token,
Token ID11
Feature activation+3.01e-02

Pos logit contributions
rob+0.002
 batch+0.002
ip+0.001
bar+0.001
P+0.001
Neg logit contributions
ires-0.001
erv-0.001
 gut-0.001
adoes-0.001
 confusion-0.001
Token happy
Token ID3772
Feature activation-1.72e+00

Pos logit contributions
rob+0.001
 batch+0.001
bar+0.001
ip+0.001
ah+0.001
Neg logit contributions
ires-0.001
erv-0.001
 gut-0.001
 confusion-0.001
adoes-0.001
Token with
Token ID351
Feature activation+1.59e+00

Pos logit contributions
ires+0.047
erv+0.046
 gut+0.045
 confusion+0.044
adoes+0.044
Neg logit contributions
rob-0.057
 batch-0.057
bar-0.051
ip-0.051
ah-0.045
Token's
Token ID338
Feature activation+9.09e-01

Pos logit contributions
rob+0.027
 batch+0.025
bar+0.022
ip+0.021
P+0.020
Neg logit contributions
erv-0.026
ires-0.025
 gut-0.025
 confusion-0.024
adoes-0.024
Token turn
Token ID1210
Feature activation-2.31e+00

Pos logit contributions
 batch+0.027
rob+0.026
bar+0.026
ip+0.025
tail+0.023
Neg logit contributions
ires-0.027
erv-0.026
adoes-0.026
 confusion-0.025
 gut-0.024
Token.
Token ID13
Feature activation+7.21e-02

Pos logit contributions
ires+0.058
erv+0.056
 gut+0.054
 confusion+0.054
adoes+0.053
Neg logit contributions
rob-0.082
 batch-0.080
bar-0.075
ip-0.073
ah-0.066
Token The
Token ID383
Feature activation+3.28e-01

Pos logit contributions
rob+0.003
 batch+0.003
bar+0.002
ah+0.002
ip+0.002
Neg logit contributions
ires-0.002
erv-0.002
adoes-0.001
 confusion-0.001
 stops-0.001
Token ha
Token ID387
Feature activation+5.35e+00

Pos logit contributions
 batch+0.011
rob+0.011
bar+0.009
ip+0.009
ird+0.009
Neg logit contributions
ires-0.009
erv-0.009
adoes-0.009
 gut-0.008
uting-0.008
Tokenird
Token ID1447
Feature activation+1.24e+01

Pos logit contributions
rob+0.078
 batch+0.073
bar+0.057
ip+0.047
 gob+0.040
Neg logit contributions
erv-0.244
ires-0.237
 confusion-0.232
 gut-0.231
adoes-0.229
Tokenress
Token ID601
Feature activation+3.86e+00

Pos logit contributions
rob+0.347
 batch+0.319
ip+0.253
P+0.248
 million+0.247
Neg logit contributions
erv-0.420
ires-0.410
 gut-0.409
iring-0.394
 confusion-0.388
Tokener
Token ID263
Feature activation-3.96e-02

Pos logit contributions
rob+0.137
 batch+0.134
bar+0.124
ip+0.120
tail+0.110
Neg logit contributions
ires-0.096
erv-0.093
adoes-0.091
 gut-0.091
 confusion-0.089
Token asked
Token ID1965
Feature activation+1.49e+00

Pos logit contributions
ires+0.001
erv+0.001
 gut+0.001
adoes+0.001
 confusion+0.001
Neg logit contributions
rob-0.001
 batch-0.001
bar-0.001
ip-0.001
tail-0.001
Token her
Token ID607
Feature activation-2.00e+00

Pos logit contributions
rob+0.051
 batch+0.049
bar+0.045
ip+0.044
ah+0.040
Neg logit contributions
ires-0.040
erv-0.039
 gut-0.038
 confusion-0.037
adoes-0.037
Token what
Token ID644
Feature activation-2.97e+00

Pos logit contributions
ires+0.044
erv+0.042
 gut+0.041
 confusion+0.040
adoes+0.040
Neg logit contributions
rob-0.079
 batch-0.076
bar-0.071
ip-0.069
ah-0.064
Token hair
Token ID4190
Feature activation+2.51e+00

Pos logit contributions
rob+0.015
 batch+0.015
bar+0.014
ip+0.013
tail+0.012
Neg logit contributions
ires-0.014
erv-0.014
adoes-0.013
 confusion-0.013
 gut-0.013
Token salon
Token ID46608
Feature activation+4.95e+00

Pos logit contributions
rob+0.107
 batch+0.094
ip+0.087
bar+0.087
P+0.086
Neg logit contributions
erv-0.050
ires-0.047
 gut-0.046
adoes-0.044
 confusion-0.043
Token.
Token ID13
Feature activation+1.99e-01

Pos logit contributions
rob+0.212
 batch+0.192
bar+0.183
ip+0.180
P+0.176
Neg logit contributions
erv-0.092
ires-0.090
 gut-0.089
adoes-0.084
 confusion-0.083
Token The
Token ID383
Feature activation+3.18e-01

Pos logit contributions
 batch+0.006
rob+0.006
bar+0.006
ip+0.006
tail+0.005
Neg logit contributions
ires-0.006
erv-0.005
 gut-0.005
 confusion-0.005
adoes-0.005
Token ha
Token ID387
Feature activation+5.35e+00

Pos logit contributions
rob+0.009
 batch+0.009
bar+0.008
ip+0.008
ah+0.007
Neg logit contributions
ires-0.010
erv-0.010
 gut-0.010
 confusion-0.010
adoes-0.010
Tokenird
Token ID1447
Feature activation+1.24e+01

Pos logit contributions
rob+0.063
 batch+0.056
bar+0.042
ip+0.032
 gob+0.023
Neg logit contributions
erv-0.261
ires-0.257
 gut-0.250
 confusion-0.250
adoes-0.247
Tokenress
Token ID601
Feature activation+3.89e+00

Pos logit contributions
rob+0.355
 batch+0.323
 gob+0.258
bar+0.255
ip+0.254
Neg logit contributions
erv-0.414
ires-0.406
 gut-0.400
iring-0.384
 confusion-0.377
Tokener
Token ID263
Feature activation+7.05e-03

Pos logit contributions
rob+0.135
 batch+0.131
bar+0.122
ip+0.118
ah+0.107
Neg logit contributions
ires-0.101
erv-0.097
 gut-0.095
adoes-0.094
 confusion-0.094
Token was
Token ID373
Feature activation-6.65e-02

Pos logit contributions
rob+0.000
 batch+0.000
bar+0.000
ip+0.000
ah+0.000
Neg logit contributions
ires-0.000
erv-0.000
 gut-0.000
 confusion-0.000
adoes-0.000
Token very
Token ID845
Feature activation+3.69e+00

Pos logit contributions
 confusion+0.002
erv+0.002
 gut+0.002
ires+0.002
adoes+0.002
Neg logit contributions
rob-0.003
 batch-0.002
bar-0.002
ip-0.002
eb-0.002
Token kind
Token ID1611
Feature activation-3.21e-01

Pos logit contributions
rob+0.164
 batch+0.153
bar+0.147
ip+0.142
ah+0.135
Neg logit contributions
ires-0.067
erv-0.064
 gut-0.062
 confusion-0.061
adoes-0.058
Token different
Token ID1180
Feature activation-8.99e-01

Pos logit contributions
ires+0.015
 confusion+0.014
 gut+0.014
adoes+0.014
erv+0.014
Neg logit contributions
rob-0.031
 batch-0.029
bar-0.027
ip-0.027
eb-0.025
Token today
Token ID1909
Feature activation-1.35e+00

Pos logit contributions
ires+0.025
erv+0.024
 gut+0.023
 confusion+0.023
adoes+0.023
Neg logit contributions
rob-0.030
 batch-0.029
bar-0.027
ip-0.026
ah-0.023
Token."
Token ID526
Feature activation+1.56e+00

Pos logit contributions
ires+0.032
erv+0.032
adoes+0.030
 confusion+0.030
 gut+0.030
Neg logit contributions
rob-0.051
 batch-0.046
bar-0.044
ip-0.043
P-0.039
Token The
Token ID383
Feature activation-1.04e-01

Pos logit contributions
rob+0.057
 batch+0.051
bar+0.049
ip+0.046
ah+0.045
Neg logit contributions
erv-0.039
ires-0.038
 confusion-0.036
 gut-0.036
adoes-0.035
Token ha
Token ID387
Feature activation+5.21e+00

Pos logit contributions
ires+0.003
 gut+0.003
erv+0.003
 confusion+0.003
adoes+0.003
Neg logit contributions
rob-0.003
 batch-0.003
bar-0.003
ip-0.003
tail-0.002
Tokenird
Token ID1447
Feature activation+1.24e+01

Pos logit contributions
rob+0.080
 batch+0.073
bar+0.058
ip+0.047
 loaf+0.042
Neg logit contributions
erv-0.235
ires-0.226
 confusion-0.224
 gut-0.222
iring-0.214
Tokenress
Token ID601
Feature activation+3.82e+00

Pos logit contributions
rob+0.367
 batch+0.324
 million+0.263
P+0.262
 gob+0.259
Neg logit contributions
erv-0.403
 gut-0.395
ires-0.387
iring-0.372
 confusion-0.370
Tokener
Token ID263
Feature activation-1.81e-01

Pos logit contributions
rob+0.144
 batch+0.140
bar+0.129
ip+0.124
 million+0.119
Neg logit contributions
ires-0.084
erv-0.082
 gut-0.081
adoes-0.080
 confusion-0.077
Token smiled
Token ID13541
Feature activation-4.02e+00

Pos logit contributions
ires+0.005
erv+0.005
 gut+0.005
 confusion+0.005
adoes+0.005
Neg logit contributions
rob-0.006
 batch-0.005
bar-0.005
ip-0.005
tail-0.004
Token and
Token ID290
Feature activation-8.78e-01

Pos logit contributions
ires+0.095
erv+0.088
adoes+0.085
 gut+0.085
 confusion+0.084
Neg logit contributions
 batch-0.145
rob-0.137
bar-0.137
ip-0.133
ah-0.122
Token said
Token ID531
Feature activation+3.76e+00

Pos logit contributions
 confusion+0.021
 stops+0.021
 gut+0.020
erv+0.020
ires+0.020
Neg logit contributions
rob-0.031
 batch-0.027
bar-0.027
ip-0.024
ction-0.024
Token We
Token ID775
Feature activation+1.36e+00

Pos logit contributions
ires+0.017
erv+0.016
 gut+0.016
 confusion+0.016
adoes+0.015
Neg logit contributions
rob-0.020
 batch-0.020
bar-0.019
ip-0.018
tail-0.016
Token have
Token ID423
Feature activation+3.34e+00

Pos logit contributions
rob+0.048
 batch+0.047
bar+0.043
ip+0.042
ah+0.038
Neg logit contributions
ires-0.034
erv-0.033
 gut-0.032
 confusion-0.032
adoes-0.032
Token different
Token ID1180
Feature activation-1.62e+00

Pos logit contributions
rob+0.122
 batch+0.116
bar+0.108
ip+0.105
ah+0.097
Neg logit contributions
ires-0.083
erv-0.081
 gut-0.078
 confusion-0.078
adoes-0.078
Token likes
Token ID7832
Feature activation-3.86e+00

Pos logit contributions
ires+0.035
erv+0.035
 gut+0.034
 confusion+0.032
iring+0.031
Neg logit contributions
rob-0.063
 batch-0.061
bar-0.057
ip-0.054
c-0.052
Token and
Token ID290
Feature activation-1.22e+00

Pos logit contributions
ires+0.094
erv+0.087
adoes+0.087
 gut+0.085
 confusion+0.084
Neg logit contributions
rob-0.144
 batch-0.135
bar-0.130
ip-0.124
ah-0.117
Token disl
Token ID19621
Feature activation-9.38e+00

Pos logit contributions
ires+0.030
erv+0.030
 gut+0.028
 confusion+0.027
adoes+0.027
Neg logit contributions
rob-0.043
 batch-0.042
bar-0.040
ip-0.038
tail-0.036
Tokenikes
Token ID7938
Feature activation-9.24e-01

Pos logit contributions
 gut+0.249
 confusion+0.245
 curiosity+0.206
 fear+0.203
 footsteps+0.202
Neg logit contributions
rob-0.360
 batch-0.315
bar-0.289
ip-0.280
eb-0.268
Token.
Token ID13
Feature activation-6.05e-01

Pos logit contributions
ires+0.022
erv+0.021
adoes+0.020
 gut+0.020
 confusion+0.020
Neg logit contributions
rob-0.035
 batch-0.034
bar-0.031
ip-0.030
ah-0.028
Token We
Token ID775
Feature activation+1.36e+00

Pos logit contributions
ires+0.016
erv+0.015
 gut+0.015
 confusion+0.015
adoes+0.015
Neg logit contributions
rob-0.021
 batch-0.020
bar-0.019
ip-0.018
ah-0.017
Token can
Token ID460
Feature activation-7.43e-01

Pos logit contributions
rob+0.048
 batch+0.047
bar+0.044
ip+0.042
ah+0.039
Neg logit contributions
ires-0.034
erv-0.033
 gut-0.032
 confusion-0.032
adoes-0.032
Token be
Token ID307
Feature activation+7.95e-01

Pos logit contributions
ires+0.020
erv+0.019
 gut+0.018
 confusion+0.018
adoes+0.018
Neg logit contributions
rob-0.025
 batch-0.025
bar-0.023
ip-0.022
ah-0.020
Token sad
Token ID6507
Feature activation-1.98e+00

Pos logit contributions
 batch+0.203
bar+0.196
rob+0.195
ip+0.187
 million+0.175
Neg logit contributions
ires-0.127
adoes-0.126
erv-0.125
 gut-0.117
 confusion-0.113
Token.
Token ID13
Feature activation+2.76e-01

Pos logit contributions
ires+0.052
erv+0.051
 gut+0.049
 confusion+0.049
adoes+0.049
Neg logit contributions
rob-0.066
 batch-0.065
bar-0.060
ip-0.059
ah-0.053
Token He
Token ID679
Feature activation+4.96e-01

Pos logit contributions
rob+0.011
 batch+0.010
bar+0.009
robat+0.009
 loaf+0.009
Neg logit contributions
erv-0.006
ires-0.006
 gut-0.006
 confusion-0.006
iring-0.005
Token said
Token ID531
Feature activation+3.91e+00

Pos logit contributions
rob+0.016
 batch+0.015
bar+0.013
ip+0.013
 loaf+0.012
Neg logit contributions
 confusion-0.015
erv-0.014
 gut-0.014
ires-0.014
 stops-0.014
Token "
Token ID366
Feature activation+1.95e-01

Pos logit contributions
rob+0.116
 batch+0.113
bar+0.106
ip+0.102
ah+0.091
Neg logit contributions
ires-0.121
erv-0.117
 confusion-0.112
adoes-0.111
 gut-0.111
TokenFear
Token ID37798
Feature activation-9.34e+00

Pos logit contributions
rob+0.007
 batch+0.007
bar+0.006
ip+0.006
ah+0.005
Neg logit contributions
ires-0.005
erv-0.005
 gut-0.005
adoes-0.005
 confusion-0.005
Token not
Token ID407
Feature activation-4.61e+00

Pos logit contributions
ires+0.220
 confusion+0.216
erv+0.212
adoes+0.206
 gut+0.198
Neg logit contributions
bar-0.319
 batch-0.316
ip-0.295
rob-0.292
tail-0.278
Token,
Token ID11
Feature activation+5.48e-01

Pos logit contributions
ires+0.157
erv+0.146
 confusion+0.144
 gut+0.143
overed+0.136
Neg logit contributions
 batch-0.120
bar-0.111
rob-0.104
 million-0.098
ah-0.094
Token I
Token ID314
Feature activation+1.76e+00

Pos logit contributions
 batch+0.018
rob+0.018
bar+0.017
ip+0.017
ah+0.015
Neg logit contributions
ires-0.015
erv-0.014
 confusion-0.013
adoes-0.013
 gut-0.013
Token can
Token ID460
Feature activation-4.09e-01

Pos logit contributions
rob+0.052
 batch+0.051
bar+0.048
ip+0.047
ah+0.043
Neg logit contributions
ires-0.054
adoes-0.051
erv-0.051
 gut-0.049
 confusion-0.049
Token help
Token ID1037
Feature activation-6.37e-01

Pos logit contributions
ires+0.011
erv+0.011
adoes+0.011
 confusion+0.010
 gut+0.010
Neg logit contributions
rob-0.013
 batch-0.013
bar-0.013
ip-0.012
tail-0.011
Token.
Token ID13
Feature activation+9.94e-02

Pos logit contributions
ires+0.068
erv+0.065
 gut+0.063
adoes+0.063
 confusion+0.062
Neg logit contributions
rob-0.100
 batch-0.097
bar-0.091
ip-0.088
ah-0.081
Token It
Token ID632
Feature activation-1.61e-01

Pos logit contributions
rob+0.003
 batch+0.003
bar+0.003
ip+0.003
 million+0.002
Neg logit contributions
ires-0.003
erv-0.003
 gut-0.003
adoes-0.003
 confusion-0.003
Token said
Token ID531
Feature activation+3.93e+00

Pos logit contributions
ires+0.004
erv+0.004
 confusion+0.004
 gut+0.004
adoes+0.004
Neg logit contributions
rob-0.006
 batch-0.005
bar-0.005
ip-0.005
tail-0.004
Token,
Token ID11
Feature activation+2.37e-01

Pos logit contributions
rob+0.098
bar+0.098
 batch+0.098
ip+0.090
eb+0.082
Neg logit contributions
ires-0.131
erv-0.128
 confusion-0.122
adoes-0.116
 gut-0.113
Token "
Token ID366
Feature activation+1.52e-01

Pos logit contributions
rob+0.008
 batch+0.007
bar+0.007
ip+0.007
 million+0.006
Neg logit contributions
ires-0.007
erv-0.006
adoes-0.006
 gut-0.006
 confusion-0.006
TokenFear
Token ID37798
Feature activation-9.19e+00

Pos logit contributions
rob+0.005
 batch+0.005
bar+0.005
ip+0.004
ah+0.004
Neg logit contributions
ires-0.004
erv-0.004
 gut-0.004
adoes-0.004
 confusion-0.004
Token not
Token ID407
Feature activation-4.65e+00

Pos logit contributions
ires+0.245
 confusion+0.238
erv+0.235
adoes+0.229
 gut+0.209
Neg logit contributions
bar-0.293
 batch-0.260
tail-0.250
rob-0.249
ah-0.246
Token.
Token ID13
Feature activation-2.57e-01

Pos logit contributions
ires+0.175
erv+0.158
overed+0.157
 confusion+0.153
 gut+0.148
Neg logit contributions
bar-0.100
 batch-0.092
 million-0.088
ah-0.082
rob-0.081
Token I
Token ID314
Feature activation+1.78e+00

Pos logit contributions
ires+0.009
erv+0.008
adoes+0.008
 confusion+0.008
iring+0.008
Neg logit contributions
bar-0.007
rob-0.006
 batch-0.006
ird-0.005
ip-0.005
Token am
Token ID716
Feature activation+2.30e+00

Pos logit contributions
rob+0.047
 batch+0.045
bar+0.043
ip+0.041
ah+0.039
Neg logit contributions
ires-0.061
adoes-0.057
erv-0.057
iring-0.054
 gut-0.054
Token with
Token ID351
Feature activation+1.78e+00

Pos logit contributions
rob+0.080
 batch+0.080
bar+0.076
ip+0.072
 million+0.068
Neg logit contributions
ires-0.057
erv-0.054
adoes-0.051
 confusion-0.051
 gut-0.051
Token and
Token ID290
Feature activation-8.21e-01

Pos logit contributions
ires+0.061
erv+0.059
 confusion+0.058
 gut+0.057
iring+0.053
Neg logit contributions
rob-0.053
bar-0.051
 batch-0.051
ip-0.045
c-0.043
Token dangerous
Token ID4923
Feature activation-1.14e+00

Pos logit contributions
ires+0.019
erv+0.018
 gut+0.018
 confusion+0.018
adoes+0.018
Neg logit contributions
rob-0.031
 batch-0.030
bar-0.028
ip-0.027
ah-0.025
Token to
Token ID284
Feature activation+9.33e-01

Pos logit contributions
ires+0.030
erv+0.030
 gut+0.029
 confusion+0.028
adoes+0.028
Neg logit contributions
rob-0.039
 batch-0.038
bar-0.035
ip-0.034
tail-0.031
Token explore
Token ID7301
Feature activation-4.31e+00

Pos logit contributions
rob+0.038
 batch+0.037
ip+0.034
bar+0.033
 loaf+0.030
Neg logit contributions
ires-0.019
 gut-0.019
 confusion-0.019
adoes-0.018
erv-0.018
Token the
Token ID262
Feature activation+4.93e-01

Pos logit contributions
ires+0.116
erv+0.115
 gut+0.108
 confusion+0.107
adoes+0.106
Neg logit contributions
rob-0.143
 batch-0.140
bar-0.128
ip-0.124
c-0.115
Tokenwoods
Token ID39493
Feature activation-9.08e+00

Pos logit contributions
rob+0.015
 batch+0.015
bar+0.014
c+0.013
 million+0.013
Neg logit contributions
erv-0.014
ires-0.014
adoes-0.013
 gut-0.013
 confusion-0.013
Token alone
Token ID3436
Feature activation-3.30e+00

Pos logit contributions
ires+0.228
erv+0.221
 gut+0.213
 confusion+0.212
adoes+0.211
Neg logit contributions
rob-0.320
 batch-0.313
bar-0.291
ip-0.283
tail-0.258
Token.
Token ID13
Feature activation+1.95e-02

Pos logit contributions
ires+0.087
erv+0.083
 gut+0.081
 confusion+0.080
adoes+0.080
Neg logit contributions
rob-0.112
 batch-0.110
bar-0.102
ip-0.099
ah-0.090
Token With
Token ID2080
Feature activation+6.13e-01

Pos logit contributions
rob+0.001
 batch+0.001
bar+0.001
ip+0.001
ah+0.001
Neg logit contributions
ires-0.000
erv-0.000
 confusion-0.000
 gut-0.000
adoes-0.000
Token her
Token ID607
Feature activation-1.89e+00

Pos logit contributions
rob+0.025
 batch+0.023
bar+0.023
ip+0.021
 million+0.021
Neg logit contributions
ires-0.013
 confusion-0.013
 gut-0.012
 curiosity-0.012
 footsteps-0.011
Token pole
Token ID16825
Feature activation+2.39e+00

Pos logit contributions
ires+0.046
erv+0.044
adoes+0.042
iring+0.040
uting+0.039
Neg logit contributions
 batch-0.069
rob-0.066
bar-0.063
ip-0.062
tail-0.057
Token her
Token ID607
Feature activation-1.87e+00

Pos logit contributions
ires+0.063
 confusion+0.063
 gut+0.062
erv+0.062
adoes+0.061
Neg logit contributions
rob-0.113
 batch-0.102
bar-0.100
ip-0.094
ah-0.089
Token mom
Token ID1995
Feature activation-3.92e-01

Pos logit contributions
ires+0.048
erv+0.047
adoes+0.046
 gut+0.044
uting+0.044
Neg logit contributions
 batch-0.063
rob-0.058
ip-0.055
bar-0.054
ah-0.048
Tokenmy
Token ID1820
Feature activation-1.38e+00

Pos logit contributions
ires+0.013
erv+0.013
 confusion+0.013
 gut+0.012
adoes+0.012
Neg logit contributions
rob-0.010
 batch-0.010
bar-0.009
ip-0.008
tail-0.008
Token said
Token ID531
Feature activation+3.79e+00

Pos logit contributions
ires+0.049
erv+0.047
 gut+0.046
adoes+0.046
 confusion+0.046
Neg logit contributions
rob-0.035
 batch-0.034
bar-0.030
ip-0.029
ah-0.025
Token about
Token ID546
Feature activation-2.77e+00

Pos logit contributions
 batch+0.123
rob+0.121
bar+0.114
ip+0.114
ird+0.101
Neg logit contributions
ires-0.103
erv-0.098
adoes-0.094
 confusion-0.092
 gut-0.092
Token earthquakes
Token ID29781
Feature activation-9.01e+00

Pos logit contributions
ires+0.074
erv+0.072
 gut+0.069
 confusion+0.069
adoes+0.068
Neg logit contributions
rob-0.094
 batch-0.092
bar-0.085
ip-0.082
ah-0.074
Token and
Token ID290
Feature activation-8.20e-01

Pos logit contributions
ires+0.237
erv+0.235
 gut+0.231
 confusion+0.220
adoes+0.214
Neg logit contributions
 batch-0.297
rob-0.297
bar-0.275
ip-0.265
c-0.247
Token thought
Token ID1807
Feature activation-1.81e+00

Pos logit contributions
ires+0.022
erv+0.022
 gut+0.021
adoes+0.020
 confusion+0.020
Neg logit contributions
 batch-0.027
rob-0.026
bar-0.025
ip-0.023
c-0.021
Token,
Token ID11
Feature activation+3.24e-01

Pos logit contributions
erv+0.055
ires+0.054
 confusion+0.052
 gut+0.052
adoes+0.051
Neg logit contributions
 batch-0.053
rob-0.050
ip-0.048
bar-0.047
 million-0.041
Token "
Token ID366
Feature activation+1.73e-01

Pos logit contributions
rob+0.012
 batch+0.012
bar+0.011
ip+0.011
ah+0.010
Neg logit contributions
ires-0.008
erv-0.007
 gut-0.007
 confusion-0.007
adoes-0.007
TokenIf
Token ID1532
Feature activation-3.05e+00

Pos logit contributions
rob+0.006
 batch+0.006
bar+0.006
ip+0.005
ah+0.005
Neg logit contributions
ires-0.004
erv-0.004
 gut-0.004
 confusion-0.004
adoes-0.004
Token to
Token ID284
Feature activation+9.23e-01

Pos logit contributions
 batch+0.031
rob+0.030
bar+0.028
ip+0.027
 million+0.024
Neg logit contributions
ires-0.031
erv-0.030
 confusion-0.028
 gut-0.028
adoes-0.028
Token always
Token ID1464
Feature activation+2.55e-03

Pos logit contributions
rob+0.034
 batch+0.033
ip+0.030
bar+0.029
 million+0.027
Neg logit contributions
 gut-0.022
ires-0.022
erv-0.021
 confusion-0.021
adoes-0.021
Token be
Token ID307
Feature activation+9.51e-01

Pos logit contributions
rob+0.000
 batch+0.000
bar+0.000
ip+0.000
 million+0.000
Neg logit contributions
ires-0.000
erv-0.000
 gut-0.000
 confusion-0.000
adoes-0.000
Token prepared
Token ID5597
Feature activation+1.93e+00

Pos logit contributions
rob+0.039
 batch+0.037
bar+0.035
ip+0.034
ah+0.032
Neg logit contributions
ires-0.019
erv-0.019
 gut-0.018
adoes-0.018
 confusion-0.018
Token for
Token ID329
Feature activation+4.36e+00

Pos logit contributions
 batch+0.051
rob+0.046
bar+0.045
ip+0.044
c+0.038
Neg logit contributions
ires-0.063
erv-0.063
 confusion-0.061
adoes-0.059
 gut-0.058
Token earthquakes
Token ID29781
Feature activation-9.00e+00

Pos logit contributions
rob+0.157
 batch+0.152
bar+0.142
ip+0.138
ah+0.126
Neg logit contributions
ires-0.109
erv-0.105
 gut-0.102
 confusion-0.102
adoes-0.101
Token and
Token ID290
Feature activation-7.29e-01

Pos logit contributions
ires+0.242
erv+0.233
 confusion+0.233
 gut+0.225
adoes+0.221
Neg logit contributions
rob-0.293
 batch-0.290
bar-0.271
ip-0.262
c-0.241
Token other
Token ID584
Feature activation+1.61e+00

Pos logit contributions
ires+0.021
erv+0.021
 gut+0.020
 confusion+0.019
adoes+0.019
Neg logit contributions
rob-0.022
 batch-0.022
bar-0.021
ip-0.020
tail-0.018
Token dangers
Token ID16169
Feature activation-6.30e+00

Pos logit contributions
rob+0.061
 batch+0.061
bar+0.057
ip+0.055
c+0.054
Neg logit contributions
ires-0.034
erv-0.032
 gut-0.028
adoes-0.028
overed-0.028
Token in
Token ID287
Feature activation+5.15e-01

Pos logit contributions
ires+0.186
erv+0.184
 confusion+0.178
 gut+0.174
adoes+0.171
Neg logit contributions
rob-0.201
 batch-0.184
bar-0.179
ip-0.171
c-0.157
Token the
Token ID262
Feature activation+5.72e-01

Pos logit contributions
rob+0.017
 batch+0.017
bar+0.016
ip+0.015
 million+0.014
Neg logit contributions
ires-0.014
erv-0.014
 gut-0.013
adoes-0.013
 confusion-0.013
Token They
Token ID1119
Feature activation+1.64e+00

Pos logit contributions
 batch+0.001
rob+0.001
ip+0.001
bar+0.001
tail+0.001
Neg logit contributions
 gut-0.001
erv-0.001
ires-0.001
adoes-0.001
 confusion-0.001
Token said
Token ID531
Feature activation+3.87e+00

Pos logit contributions
rob+0.054
 batch+0.052
bar+0.048
ip+0.046
ww+0.043
Neg logit contributions
erv-0.044
ires-0.043
 confusion-0.043
 gut-0.041
 stops-0.040
Token,
Token ID11
Feature activation-7.15e-02

Pos logit contributions
 batch+0.113
rob+0.108
ip+0.104
bar+0.100
 loaf+0.088
Neg logit contributions
erv-0.118
ires-0.118
 confusion-0.114
 gut-0.110
adoes-0.110
Token "
Token ID366
Feature activation+1.75e-01

Pos logit contributions
ires+0.002
erv+0.002
 confusion+0.002
 gut+0.002
adoes+0.002
Neg logit contributions
 batch-0.002
rob-0.002
bar-0.002
ip-0.002
ah-0.002
TokenL
Token ID43
Feature activation+1.74e+00

Pos logit contributions
 batch+0.006
rob+0.006
ip+0.006
bar+0.006
 gob+0.005
Neg logit contributions
ires-0.004
erv-0.004
 confusion-0.004
Up-0.004
adoes-0.004
Tokenadies
Token ID50192
Feature activation-8.99e+00

Pos logit contributions
rob+0.053
 batch+0.052
ip+0.050
bar+0.048
ird+0.045
Neg logit contributions
 gut-0.055
 confusion-0.054
adoes-0.050
 curiosity-0.048
erv-0.047
Token and
Token ID290
Feature activation-9.32e-01

Pos logit contributions
ires+0.204
 gut+0.200
 confusion+0.191
erv+0.188
adoes+0.186
Neg logit contributions
 batch-0.329
rob-0.322
bar-0.302
ip-0.294
ah-0.280
Token gentlemen
Token ID28527
Feature activation-4.91e-02

Pos logit contributions
ires+0.025
 confusion+0.024
erv+0.023
 footsteps+0.023
adoes+0.023
Neg logit contributions
rob-0.033
 batch-0.031
ip-0.028
bar-0.027
ah-0.025
Token,
Token ID11
Feature activation-2.86e-01

Pos logit contributions
ires+0.001
erv+0.001
 gut+0.001
 confusion+0.001
adoes+0.001
Neg logit contributions
rob-0.002
 batch-0.002
bar-0.001
ip-0.001
ah-0.001
Token welcome
Token ID7062
Feature activation-7.70e-01

Pos logit contributions
ires+0.007
erv+0.007
 gut+0.007
 confusion+0.007
adoes+0.007
Neg logit contributions
rob-0.010
 batch-0.010
bar-0.009
ip-0.009
ah-0.008
Token to
Token ID284
Feature activation+8.89e-01

Pos logit contributions
ires+0.023
 gut+0.022
erv+0.022
 confusion+0.022
adoes+0.021
Neg logit contributions
 batch-0.023
rob-0.023
ip-0.020
bar-0.020
eb-0.018
Token.
Token ID13
Feature activation+1.28e-01

Pos logit contributions
ires+0.039
erv+0.037
 gut+0.036
adoes+0.036
 confusion+0.035
Neg logit contributions
rob-0.063
 batch-0.061
bar-0.057
ip-0.056
ah-0.051
Token He
Token ID679
Feature activation+5.45e-01

Pos logit contributions
rob+0.005
 batch+0.005
bar+0.004
ip+0.004
ah+0.004
Neg logit contributions
ires-0.003
erv-0.003
 gut-0.003
adoes-0.003
 confusion-0.003
Token also
Token ID635
Feature activation+3.58e+00

Pos logit contributions
rob+0.019
 batch+0.018
bar+0.017
ip+0.015
tail+0.015
Neg logit contributions
erv-0.014
 gut-0.014
ires-0.014
 confusion-0.013
adoes-0.013
Token learned
Token ID4499
Feature activation-4.62e-01

Pos logit contributions
rob+0.131
 batch+0.124
bar+0.117
ip+0.111
eb+0.107
Neg logit contributions
 confusion-0.085
erv-0.081
 gut-0.080
 stops-0.080
 curiosity-0.076
Token that
Token ID326
Feature activation+2.42e+00

Pos logit contributions
ires+0.015
erv+0.015
 confusion+0.015
adoes+0.014
 gut+0.014
Neg logit contributions
rob-0.013
 batch-0.012
bar-0.012
ip-0.011
ah-0.009
Token earthquakes
Token ID29781
Feature activation-8.97e+00

Pos logit contributions
 batch+0.067
rob+0.067
bar+0.063
ip+0.057
 million+0.053
Neg logit contributions
ires-0.077
erv-0.077
 confusion-0.071
 gut-0.071
adoes-0.071
Token can
Token ID460
Feature activation-4.09e-01

Pos logit contributions
erv+0.285
ires+0.272
 gut+0.270
 confusion+0.261
 footsteps+0.256
Neg logit contributions
rob-0.232
bar-0.224
 batch-0.222
ip-0.199
c-0.194
Token be
Token ID307
Feature activation+9.73e-01

Pos logit contributions
ires+0.012
erv+0.011
 gut+0.011
adoes+0.011
 confusion+0.011
Neg logit contributions
rob-0.013
 batch-0.013
bar-0.012
ip-0.011
ah-0.010
Token very
Token ID845
Feature activation+3.81e+00

Pos logit contributions
 batch+0.025
bar+0.024
rob+0.024
c+0.023
ip+0.022
Neg logit contributions
ires-0.031
erv-0.029
overed-0.027
 gut-0.027
 footsteps-0.026
Token scary
Token ID14343
Feature activation-1.77e+00

Pos logit contributions
 batch+0.080
bar+0.077
ip+0.076
 million+0.075
ww+0.070
Neg logit contributions
erv-0.130
 footsteps-0.118
 confusion-0.116
overed-0.116
adoes-0.114
Token,
Token ID11
Feature activation+2.32e-01

Pos logit contributions
erv+0.055
ires+0.055
 confusion+0.053
 gut+0.051
overed+0.049
Neg logit contributions
rob-0.043
 batch-0.042
bar-0.041
ip-0.039
 million-0.037
Token.
Token ID13
Feature activation-2.00e-01

Pos logit contributions
rob+0.018
 batch+0.017
ip+0.016
bar+0.016
P+0.014
Neg logit contributions
ires-0.011
erv-0.010
 confusion-0.010
 gut-0.010
adoes-0.010
Token He
Token ID679
Feature activation+1.54e-01

Pos logit contributions
 gut+0.006
ires+0.006
adoes+0.006
erv+0.006
 confusion+0.006
Neg logit contributions
 batch-0.005
rob-0.005
bar-0.005
ip-0.005
tail-0.004
Token says
Token ID1139
Feature activation+3.91e+00

Pos logit contributions
rob+0.005
 batch+0.005
ip+0.004
bar+0.004
ah+0.004
Neg logit contributions
erv-0.005
ires-0.004
adoes-0.004
iring-0.004
 gut-0.004
Token "
Token ID366
Feature activation-1.05e-01

Pos logit contributions
rob+0.136
 batch+0.133
ip+0.124
bar+0.123
ah+0.110
Neg logit contributions
ires-0.100
erv-0.099
 confusion-0.096
adoes-0.094
 gut-0.094
TokenL
Token ID43
Feature activation+1.61e+00

Pos logit contributions
ires+0.002
erv+0.002
 confusion+0.002
adoes+0.002
 gut+0.002
Neg logit contributions
 batch-0.004
rob-0.004
ip-0.004
bar-0.004
 loaf-0.003
Tokenadies
Token ID50192
Feature activation-8.97e+00

Pos logit contributions
rob+0.054
 batch+0.049
ah+0.049
ip+0.047
bar+0.047
Neg logit contributions
 gut-0.047
 confusion-0.046
adoes-0.044
erv-0.040
 footsteps-0.040
Token and
Token ID290
Feature activation-1.16e+00

Pos logit contributions
ires+0.193
 gut+0.184
erv+0.178
 confusion+0.177
adoes+0.173
Neg logit contributions
rob-0.338
 batch-0.336
bar-0.319
ip-0.304
ah-0.293
Token gentlemen
Token ID28527
Feature activation-8.66e-02

Pos logit contributions
ires+0.032
 confusion+0.031
 gut+0.030
erv+0.030
 footsteps+0.029
Neg logit contributions
rob-0.040
 batch-0.037
ip-0.033
bar-0.033
ah-0.030
Token,
Token ID11
Feature activation-4.47e-01

Pos logit contributions
ires+0.002
erv+0.002
 gut+0.002
 confusion+0.002
adoes+0.002
Neg logit contributions
rob-0.003
 batch-0.003
bar-0.003
ip-0.003
ah-0.002
Token boys
Token ID6510
Feature activation+2.30e-01

Pos logit contributions
ires+0.011
erv+0.011
 gut+0.010
 confusion+0.010
adoes+0.010
Neg logit contributions
rob-0.016
 batch-0.016
bar-0.014
ip-0.014
ah-0.013
Token and
Token ID290
Feature activation-1.16e+00

Pos logit contributions
rob+0.007
 batch+0.007
bar+0.007
ip+0.006
tail+0.006
Neg logit contributions
ires-0.006
erv-0.006
 gut-0.006
adoes-0.006
 confusion-0.006
Token also
Token ID635
Feature activation+3.66e+00

Pos logit contributions
rob+0.014
 batch+0.013
bar+0.013
ip+0.012
tail+0.011
Neg logit contributions
ires-0.012
erv-0.012
 gut-0.012
 confusion-0.012
adoes-0.011
Token learned
Token ID4499
Feature activation-2.46e-01

Pos logit contributions
rob+0.134
 batch+0.124
bar+0.120
ip+0.114
eb+0.106
Neg logit contributions
 confusion-0.091
ires-0.089
 gut-0.087
erv-0.086
 stops-0.084
Token that
Token ID326
Feature activation+2.58e+00

Pos logit contributions
ires+0.008
erv+0.008
 confusion+0.008
adoes+0.008
 gut+0.008
Neg logit contributions
rob-0.006
 batch-0.006
bar-0.006
ip-0.005
P-0.004
Token even
Token ID772
Feature activation+2.32e+00

Pos logit contributions
 batch+0.073
rob+0.072
bar+0.067
ip+0.062
 million+0.057
Neg logit contributions
ires-0.081
erv-0.081
 gut-0.075
adoes-0.075
 confusion-0.075
Token though
Token ID996
Feature activation-4.46e+00

Pos logit contributions
rob+0.083
 batch+0.082
bar+0.075
ip+0.075
 million+0.069
Neg logit contributions
erv-0.054
ires-0.052
 gut-0.051
 confusion-0.050
adoes-0.049
Token earthquakes
Token ID29781
Feature activation-8.96e+00

Pos logit contributions
erv+0.136
ires+0.134
 gut+0.126
 confusion+0.125
adoes+0.123
Neg logit contributions
rob-0.128
 batch-0.127
bar-0.120
ip-0.112
 million-0.103
Token can
Token ID460
Feature activation-1.40e-01

Pos logit contributions
erv+0.291
 gut+0.286
ires+0.284
 footsteps+0.266
 confusion+0.261
Neg logit contributions
rob-0.222
 batch-0.221
bar-0.209
ip-0.193
c-0.187
Token be
Token ID307
Feature activation+1.24e+00

Pos logit contributions
ires+0.004
erv+0.004
 gut+0.003
adoes+0.003
 confusion+0.003
Neg logit contributions
rob-0.004
 batch-0.004
bar-0.004
ip-0.004
 million-0.004
Token scary
Token ID14343
Feature activation-1.52e+00

Pos logit contributions
 batch+0.037
bar+0.035
rob+0.034
 million+0.033
c+0.033
Neg logit contributions
ires-0.035
erv-0.033
 gut-0.032
 confusion-0.030
iring-0.030
Token,
Token ID11
Feature activation+6.91e-01

Pos logit contributions
erv+0.054
ires+0.054
 gut+0.050
 confusion+0.050
iring+0.048
Neg logit contributions
 batch-0.030
bar-0.029
rob-0.028
ip-0.027
 million-0.027
Token it
Token ID340
Feature activation-4.63e-01

Pos logit contributions
 batch+0.019
rob+0.018
bar+0.018
ip+0.016
 million+0.015
Neg logit contributions
erv-0.022
ires-0.022
 gut-0.021
 confusion-0.020
adoes-0.020
Token anymore
Token ID7471
Feature activation-3.69e+00

Pos logit contributions
ires+0.054
 gut+0.052
adoes+0.052
erv+0.050
 confusion+0.049
Neg logit contributions
 batch-0.045
bar-0.040
ip-0.040
 million-0.038
rob-0.037
Token.
Token ID13
Feature activation+1.30e-01

Pos logit contributions
ires+0.101
erv+0.095
 gut+0.093
adoes+0.093
 confusion+0.092
Neg logit contributions
rob-0.121
 batch-0.121
bar-0.110
ip-0.108
tail-0.098
Token She
Token ID1375
Feature activation+1.79e+00

Pos logit contributions
rob+0.004
 batch+0.004
bar+0.004
ip+0.004
ah+0.004
Neg logit contributions
ires-0.003
erv-0.003
 gut-0.003
 confusion-0.003
adoes-0.003
Token knew
Token ID2993
Feature activation+3.91e-03

Pos logit contributions
rob+0.059
 batch+0.055
bar+0.052
ip+0.049
tail+0.044
Neg logit contributions
erv-0.051
ires-0.050
 confusion-0.050
 gut-0.049
adoes-0.048
Token that
Token ID326
Feature activation+2.63e+00

Pos logit contributions
rob+0.000
 batch+0.000
bar+0.000
ip+0.000
ah+0.000
Neg logit contributions
ires-0.000
 gut-0.000
erv-0.000
 confusion-0.000
adoes-0.000
Token earthquakes
Token ID29781
Feature activation-8.94e+00

Pos logit contributions
rob+0.094
 batch+0.089
bar+0.084
ip+0.082
ah+0.075
Neg logit contributions
ires-0.067
erv-0.065
 confusion-0.064
 gut-0.064
adoes-0.063
Token could
Token ID714
Feature activation+1.57e+00

Pos logit contributions
ires+0.306
erv+0.306
 gut+0.298
 confusion+0.280
adoes+0.275
Neg logit contributions
 batch-0.215
rob-0.211
bar-0.195
ip-0.182
c-0.178
Token sometimes
Token ID3360
Feature activation-2.68e+00

Pos logit contributions
rob+0.046
 batch+0.046
bar+0.044
ip+0.038
tail+0.038
Neg logit contributions
ires-0.047
erv-0.047
adoes-0.043
 gut-0.043
 confusion-0.042
Token be
Token ID307
Feature activation+1.14e+00

Pos logit contributions
ires+0.075
erv+0.073
 gut+0.068
adoes+0.067
 confusion+0.067
Neg logit contributions
 batch-0.085
rob-0.083
bar-0.077
ip-0.074
tail-0.071
Token scary
Token ID14343
Feature activation-1.64e+00

Pos logit contributions
 batch+0.036
bar+0.033
c+0.033
rob+0.033
 million+0.031
Neg logit contributions
ires-0.032
erv-0.030
 gut-0.028
iring-0.027
adoes-0.027
Token,
Token ID11
Feature activation+2.65e-01

Pos logit contributions
erv+0.050
ires+0.050
 gut+0.047
 confusion+0.046
iring+0.044
Neg logit contributions
 batch-0.043
rob-0.041
bar-0.040
ip-0.038
 million-0.037
Token the
Token ID262
Feature activation-7.07e-01

Pos logit contributions
ires+0.062
erv+0.060
 gut+0.059
adoes+0.059
 confusion+0.058
Neg logit contributions
rob-0.081
 batch-0.079
bar-0.073
ip-0.071
ah-0.064
Token branch
Token ID8478
Feature activation+1.49e+00

Pos logit contributions
ires+0.020
erv+0.020
 gut+0.020
 confusion+0.019
adoes+0.019
Neg logit contributions
rob-0.022
 batch-0.022
bar-0.020
ip-0.019
ah-0.017
Token all
Token ID477
Feature activation+2.37e+00

Pos logit contributions
rob+0.064
 batch+0.060
ip+0.054
 french+0.053
bar+0.052
Neg logit contributions
erv-0.027
ires-0.027
 gut-0.025
adoes-0.024
 confusion-0.023
Token around
Token ID1088
Feature activation-3.28e+00

Pos logit contributions
 batch+0.084
rob+0.082
ip+0.075
bar+0.073
ird+0.072
Neg logit contributions
ires-0.060
 gut-0.053
 confusion-0.053
erv-0.053
adoes-0.052
Token.
Token ID13
Feature activation-2.56e-01

Pos logit contributions
ires+0.072
 gut+0.068
erv+0.068
 confusion+0.067
adoes+0.066
Neg logit contributions
rob-0.127
 batch-0.124
bar-0.117
ip-0.115
ird-0.104
Token Up
Token ID3205
Feature activation-8.93e+00

Pos logit contributions
ires+0.007
erv+0.006
 gut+0.006
 confusion+0.006
adoes+0.006
Neg logit contributions
rob-0.009
 batch-0.009
bar-0.008
ip-0.008
ah-0.007
Token to
Token ID284
Feature activation+6.44e-01

Pos logit contributions
ires+0.221
erv+0.213
 gut+0.207
 confusion+0.206
adoes+0.205
Neg logit contributions
rob-0.322
 batch-0.312
bar-0.294
ip-0.285
ah-0.262
Token the
Token ID262
Feature activation+4.68e-03

Pos logit contributions
 batch+0.025
rob+0.025
ip+0.024
bar+0.023
ird+0.022
Neg logit contributions
ires-0.013
 gut-0.012
adoes-0.012
 confusion-0.012
 stops-0.011
Token sky
Token ID6766
Feature activation-3.20e+00

Pos logit contributions
rob+0.000
 batch+0.000
ip+0.000
ird+0.000
bar+0.000
Neg logit contributions
 stops-0.000
 confusion-0.000
 gut-0.000
 footsteps-0.000
ires-0.000
Token and
Token ID290
Feature activation-7.76e-01

Pos logit contributions
ires+0.080
erv+0.076
 gut+0.075
 confusion+0.074
adoes+0.074
Neg logit contributions
rob-0.114
 batch-0.111
bar-0.106
ip-0.101
ah-0.094
Token down
Token ID866
Feature activation+1.15e+00

Pos logit contributions
ires+0.020
erv+0.019
 gut+0.019
 confusion+0.019
adoes+0.019
Neg logit contributions
rob-0.027
 batch-0.026
bar-0.024
ip-0.024
ah-0.022
Token and
Token ID290
Feature activation-8.68e-01

Pos logit contributions
 batch+0.032
rob+0.031
ip+0.031
bar+0.029
c+0.028
Neg logit contributions
ires-0.030
iring-0.026
adoes-0.025
erv-0.025
 footsteps-0.025
Token Ben
Token ID3932
Feature activation+2.96e-01

Pos logit contributions
ires+0.028
adoes+0.028
erv+0.028
 gut+0.027
iring+0.027
Neg logit contributions
rob-0.023
ip-0.022
 batch-0.022
bar-0.021
ah-0.019
Token did
Token ID750
Feature activation+1.68e+00

Pos logit contributions
rob+0.007
 batch+0.007
bar+0.007
ip+0.007
 gob+0.006
Neg logit contributions
ires-0.011
adoes-0.009
erv-0.009
iring-0.009
 gut-0.009
Token not
Token ID407
Feature activation-4.66e+00

Pos logit contributions
 batch+0.035
tail+0.033
ip+0.030
 gob+0.029
 +0.028
Neg logit contributions
ires-0.059
iring-0.057
erv-0.053
adoes-0.051
ments-0.050
Token like
Token ID588
Feature activation-2.47e+00

Pos logit contributions
ires+0.120
erv+0.109
 gut+0.107
adoes+0.107
 confusion+0.105
Neg logit contributions
rob-0.166
 batch-0.158
bar-0.152
ip-0.148
ah-0.144
Token earthquakes
Token ID29781
Feature activation-8.89e+00

Pos logit contributions
erv+0.089
ires+0.087
etermined+0.084
 confusion+0.082
iring+0.081
Neg logit contributions
 batch-0.056
bar-0.044
rob-0.041
 peanut-0.041
ip-0.041
Token.
Token ID13
Feature activation-4.29e-02

Pos logit contributions
ires+0.254
erv+0.249
 gut+0.247
 confusion+0.240
adoes+0.230
Neg logit contributions
rob-0.280
 batch-0.274
bar-0.253
ip-0.244
 million-0.226
Token
Token ID198
Feature activation+4.22e-01

Pos logit contributions
ires+0.001
 gut+0.001
erv+0.001
adoes+0.001
 confusion+0.001
Neg logit contributions
 batch-0.001
rob-0.001
ip-0.001
bar-0.001
 million-0.001
Token
Token ID198
Feature activation+3.86e-01

Pos logit contributions
rob+0.014
 batch+0.014
bar+0.013
ip+0.012
ah+0.011
Neg logit contributions
ires-0.011
erv-0.011
 gut-0.011
 confusion-0.011
adoes-0.011
TokenThey
Token ID2990
Feature activation+1.01e+00

Pos logit contributions
rob+0.013
 batch+0.013
bar+0.012
ip+0.012
ah+0.010
Neg logit contributions
ires-0.010
erv-0.010
 gut-0.010
 confusion-0.010
adoes-0.010
Token heard
Token ID2982
Feature activation-1.49e+00

Pos logit contributions
rob+0.032
 batch+0.031
bar+0.029
ip+0.028
ah+0.026
Neg logit contributions
ires-0.029
erv-0.028
 gut-0.027
adoes-0.027
 confusion-0.027
Token she
Token ID673
Feature activation+1.69e+00

Pos logit contributions
ires+0.013
erv+0.013
 gut+0.013
 confusion+0.013
adoes+0.013
Neg logit contributions
rob-0.017
 batch-0.016
bar-0.015
ip-0.015
ah-0.014
Token felt
Token ID2936
Feature activation+2.27e+00

Pos logit contributions
rob+0.056
 batch+0.054
bar+0.050
ip+0.048
tail+0.043
Neg logit contributions
ires-0.046
erv-0.046
 gut-0.045
 confusion-0.044
adoes-0.044
Token safe
Token ID3338
Feature activation+1.47e+00

Pos logit contributions
 batch+0.050
bar+0.045
rob+0.044
ip+0.044
tail+0.042
Neg logit contributions
ires-0.082
erv-0.080
adoes-0.078
iring-0.072
uting-0.071
Token from
Token ID422
Feature activation+3.64e-01

Pos logit contributions
rob+0.048
 batch+0.047
bar+0.044
ip+0.042
tail+0.039
Neg logit contributions
ires-0.041
erv-0.039
 confusion-0.038
 gut-0.038
adoes-0.037
Token any
Token ID597
Feature activation+1.95e-01

Pos logit contributions
rob+0.012
 batch+0.012
bar+0.011
ip+0.011
ah+0.010
Neg logit contributions
ires-0.010
erv-0.009
 gut-0.009
 confusion-0.009
adoes-0.009
Token earthquakes
Token ID29781
Feature activation-8.83e+00

Pos logit contributions
rob+0.007
 batch+0.006
bar+0.006
ip+0.006
ah+0.005
Neg logit contributions
ires-0.005
erv-0.005
adoes-0.005
 gut-0.005
 confusion-0.005
Token.
Token ID13
Feature activation+2.76e-01

Pos logit contributions
ires+0.245
erv+0.238
 gut+0.235
 confusion+0.227
adoes+0.226
Neg logit contributions
rob-0.286
 batch-0.282
bar-0.263
ip-0.252
c-0.230
Token<|endoftext|>
Token ID50256
Feature activation-5.09e+00

Pos logit contributions
rob+0.011
 batch+0.010
bar+0.010
ip+0.009
ah+0.009
Neg logit contributions
ires-0.007
erv-0.006
 gut-0.006
 confusion-0.006
adoes-0.006
TokenOnce
Token ID7454
Feature activation-1.25e+00

Pos logit contributions
ires+0.123
erv+0.120
 confusion+0.114
 gut+0.112
adoes+0.111
Neg logit contributions
rob-0.182
 batch-0.180
bar-0.170
ip-0.163
ah-0.149
Token upon
Token ID2402
Feature activation-7.61e+00

Pos logit contributions
ires+0.014
erv+0.013
 gut+0.012
adoes+0.012
 confusion+0.012
Neg logit contributions
rob-0.061
 batch-0.061
bar-0.058
ip-0.057
ah-0.053
Token a
Token ID257
Feature activation-8.03e-01

Pos logit contributions
ires+0.239
 gut+0.233
adoes+0.231
erv+0.217
iring+0.214
Neg logit contributions
rob-0.223
 batch-0.218
bar-0.189
tail-0.185
P-0.184
Token but
Token ID475
Feature activation+7.18e-01

Pos logit contributions
ires+0.034
erv+0.033
 gut+0.032
adoes+0.032
 confusion+0.032
Neg logit contributions
 batch-0.043
rob-0.043
bar-0.039
ip-0.038
ah-0.035
Token then
Token ID788
Feature activation+5.72e-01

Pos logit contributions
 batch+0.020
rob+0.018
bar+0.018
ip+0.018
 gob+0.016
Neg logit contributions
ires-0.022
adoes-0.022
erv-0.021
 gut-0.020
iring-0.020
Token she
Token ID673
Feature activation+1.72e+00

Pos logit contributions
rob+0.020
 batch+0.019
bar+0.018
ip+0.017
ah+0.016
Neg logit contributions
ires-0.015
erv-0.015
 confusion-0.015
 gut-0.014
adoes-0.014
Token realised
Token ID19169
Feature activation+9.53e-01

Pos logit contributions
rob+0.062
 batch+0.058
bar+0.051
ip+0.050
 loaf+0.046
Neg logit contributions
 confusion-0.045
erv-0.045
 gut-0.044
ires-0.042
 stops-0.041
Token that
Token ID326
Feature activation+2.55e+00

Pos logit contributions
 batch+0.026
rob+0.025
bar+0.023
ip+0.023
c+0.020
Neg logit contributions
ires-0.031
erv-0.030
adoes-0.029
 confusion-0.028
 gut-0.028
Token earthquakes
Token ID29781
Feature activation-8.81e+00

Pos logit contributions
 batch+0.077
rob+0.075
bar+0.070
ip+0.066
c+0.062
Neg logit contributions
ires-0.075
erv-0.074
 gut-0.070
adoes-0.069
 confusion-0.068
Token don
Token ID836
Feature activation-4.62e+00

Pos logit contributions
erv+0.284
ires+0.283
 gut+0.278
 confusion+0.258
adoes+0.256
Neg logit contributions
 batch-0.228
rob-0.223
bar-0.211
ip-0.194
c-0.189
Token't
Token ID470
Feature activation-3.26e+00

Pos logit contributions
ires+0.130
erv+0.127
 gut+0.124
 confusion+0.123
adoes+0.122
Neg logit contributions
rob-0.148
 batch-0.144
bar-0.135
ip-0.129
ah-0.117
Token happen
Token ID1645
Feature activation-2.13e+00

Pos logit contributions
ires+0.097
erv+0.090
 gut+0.086
adoes+0.085
iring+0.083
Neg logit contributions
bar-0.097
 batch-0.093
rob-0.089
ah-0.083
 million-0.082
Token very
Token ID845
Feature activation+3.99e+00

Pos logit contributions
erv+0.066
ires+0.065
 confusion+0.063
adoes+0.063
 gut+0.062
Neg logit contributions
 batch-0.058
rob-0.053
bar-0.052
ip-0.052
c-0.046
Token often
Token ID1690
Feature activation-1.36e+00

Pos logit contributions
 batch+0.112
ip+0.105
bar+0.103
rob+0.100
 million+0.097
Neg logit contributions
erv-0.115
ires-0.104
adoes-0.102
 confusion-0.102
 gut-0.099
Token?
Token ID30
Feature activation+3.14e-01

Pos logit contributions
rob+0.057
 batch+0.057
bar+0.052
ip+0.050
ah+0.045
Neg logit contributions
ires-0.050
erv-0.049
 confusion-0.048
 gut-0.048
adoes-0.046
Token You
Token ID921
Feature activation-1.24e-01

Pos logit contributions
rob+0.011
 batch+0.011
bar+0.010
ip+0.010
ah+0.009
Neg logit contributions
ires-0.008
erv-0.008
 gut-0.007
adoes-0.007
 confusion-0.007
Token know
Token ID760
Feature activation+1.93e-01

Pos logit contributions
ires+0.004
erv+0.004
 gut+0.004
adoes+0.004
 confusion+0.004
Neg logit contributions
rob-0.004
 batch-0.003
bar-0.003
ip-0.003
ah-0.003
Token I
Token ID314
Feature activation+1.79e+00

Pos logit contributions
 batch+0.006
rob+0.006
bar+0.006
ip+0.006
ird+0.005
Neg logit contributions
ires-0.005
erv-0.005
 confusion-0.005
 gut-0.005
adoes-0.004
Token hate
Token ID5465
Feature activation-3.48e+00

Pos logit contributions
rob+0.050
 batch+0.048
bar+0.045
ip+0.044
ah+0.041
Neg logit contributions
ires-0.059
adoes-0.056
erv-0.055
 gut-0.054
 confusion-0.053
Token earthquakes
Token ID29781
Feature activation-8.80e+00

Pos logit contributions
ires+0.097
erv+0.094
 gut+0.091
 confusion+0.090
adoes+0.090
Neg logit contributions
rob-0.115
 batch-0.111
bar-0.103
ip-0.100
ah-0.090
Token."
Token ID526
Feature activation+2.04e+00

Pos logit contributions
ires+0.243
erv+0.237
 confusion+0.232
 gut+0.232
adoes+0.222
Neg logit contributions
rob-0.286
 batch-0.279
bar-0.260
ip-0.249
ah-0.228
Token Paul
Token ID3362
Feature activation-1.21e+00

Pos logit contributions
rob+0.071
 batch+0.069
bar+0.064
ip+0.063
 loaf+0.058
Neg logit contributions
ires-0.052
erv-0.051
 gut-0.049
 confusion-0.049
adoes-0.049
Token felt
Token ID2936
Feature activation+2.48e+00

Pos logit contributions
erv+0.032
ires+0.032
 confusion+0.032
 gut+0.031
adoes+0.030
Neg logit contributions
rob-0.041
 batch-0.039
bar-0.036
ip-0.035
tail-0.032
Token bad
Token ID2089
Feature activation+1.77e+00

Pos logit contributions
rob+0.085
 batch+0.085
bar+0.079
ip+0.076
tail+0.071
Neg logit contributions
ires-0.062
erv-0.060
adoes-0.058
 gut-0.055
iring-0.054
Token and
Token ID290
Feature activation-6.30e-01

Pos logit contributions
rob+0.065
 batch+0.063
bar+0.059
ip+0.057
ah+0.052
Neg logit contributions
ires-0.043
erv-0.041
 gut-0.040
 confusion-0.040
adoes-0.040
Token was
Token ID373
Feature activation+2.46e-01

Pos logit contributions
ires+0.018
erv+0.017
 gut+0.017
adoes+0.016
iring+0.016
Neg logit contributions
rob-0.012
 batch-0.011
bar-0.010
ip-0.009
ah-0.009
Token over
Token ID625
Feature activation-1.21e+00

Pos logit contributions
rob+0.009
 batch+0.008
bar+0.008
ip+0.008
ah+0.007
Neg logit contributions
ires-0.006
erv-0.006
 gut-0.006
 confusion-0.006
adoes-0.006
Token.
Token ID13
Feature activation+2.02e-01

Pos logit contributions
ires+0.033
erv+0.031
adoes+0.031
 gut+0.030
 confusion+0.029
Neg logit contributions
rob-0.040
 batch-0.040
bar-0.037
ip-0.036
ah-0.032
Token They
Token ID1119
Feature activation+1.76e+00

Pos logit contributions
rob+0.007
 batch+0.007
bar+0.007
ip+0.006
ah+0.006
Neg logit contributions
ires-0.005
erv-0.005
adoes-0.005
 gut-0.005
 confusion-0.005
Token knew
Token ID2993
Feature activation-2.24e-03

Pos logit contributions
rob+0.065
 batch+0.063
bar+0.054
ip+0.053
 loaf+0.051
Neg logit contributions
 confusion-0.044
erv-0.042
 gut-0.042
ires-0.039
 curiosity-0.039
Token earthquakes
Token ID29781
Feature activation-8.79e+00

Pos logit contributions
ires+0.000
erv+0.000
 gut+0.000
 confusion+0.000
adoes+0.000
Neg logit contributions
rob-0.000
 batch-0.000
bar-0.000
ip-0.000
ah-0.000
Token were
Token ID547
Feature activation+2.12e+00

Pos logit contributions
ires+0.300
erv+0.297
 gut+0.285
 confusion+0.271
adoes+0.269
Neg logit contributions
rob-0.215
 batch-0.215
bar-0.196
ip-0.181
c-0.177
Token exciting
Token ID7895
Feature activation-2.48e+00

Pos logit contributions
 batch+0.056
rob+0.050
bar+0.050
 million+0.047
ip+0.045
Neg logit contributions
erv-0.065
ires-0.065
 gut-0.059
adoes-0.058
 confusion-0.057
Token but
Token ID475
Feature activation+8.37e-01

Pos logit contributions
erv+0.075
ires+0.074
 confusion+0.069
 gut+0.068
iring+0.065
Neg logit contributions
 batch-0.068
rob-0.066
ip-0.057
bar-0.057
c-0.055
Token can
Token ID460
Feature activation-2.80e-01

Pos logit contributions
 batch+0.024
rob+0.023
bar+0.022
ip+0.020
 million+0.019
Neg logit contributions
erv-0.026
ires-0.025
adoes-0.024
 gut-0.023
 confusion-0.023
Token sometimes
Token ID3360
Feature activation-2.61e+00

Pos logit contributions
ires+0.008
erv+0.008
 confusion+0.007
adoes+0.007
 gut+0.007
Neg logit contributions
rob-0.009
 batch-0.008
bar-0.008
ip-0.007
ah-0.007
Token its
Token ID663
Feature activation-1.63e-01

Pos logit contributions
ires+0.028
 confusion+0.027
erv+0.027
adoes+0.027
 gut+0.026
Neg logit contributions
rob-0.039
 batch-0.038
bar-0.035
ip-0.035
ah-0.032
Token engine
Token ID3113
Feature activation-4.37e-01

Pos logit contributions
ires+0.004
erv+0.004
 gut+0.004
 confusion+0.004
adoes+0.004
Neg logit contributions
rob-0.006
 batch-0.006
bar-0.005
ip-0.005
ah-0.005
Token thunder
Token ID18355
Feature activation-5.65e+00

Pos logit contributions
ires+0.013
erv+0.012
 gut+0.012
 confusion+0.012
adoes+0.012
Neg logit contributions
rob-0.014
 batch-0.013
bar-0.012
ip-0.012
ah-0.011
Tokening
Token ID278
Feature activation-5.59e-01

Pos logit contributions
ires+0.166
erv+0.159
 gut+0.156
 confusion+0.154
adoes+0.151
Neg logit contributions
rob-0.179
 batch-0.172
bar-0.161
ip-0.154
ah-0.141
Token.
Token ID13
Feature activation-6.33e-01

Pos logit contributions
ires+0.015
erv+0.014
 gut+0.014
 confusion+0.014
adoes+0.014
Neg logit contributions
rob-0.019
 batch-0.019
bar-0.017
ip-0.017
ah-0.015
Token Up
Token ID3205
Feature activation-8.79e+00

Pos logit contributions
ires+0.015
erv+0.015
 confusion+0.014
adoes+0.014
 gut+0.014
Neg logit contributions
rob-0.024
 batch-0.022
bar-0.021
ip-0.020
ah-0.019
Token,
Token ID11
Feature activation+2.37e-01

Pos logit contributions
ires+0.216
erv+0.209
 gut+0.202
 confusion+0.201
adoes+0.200
Neg logit contributions
rob-0.318
 batch-0.309
bar-0.290
ip-0.282
ah-0.258
Token it
Token ID340
Feature activation-7.11e-01

Pos logit contributions
bar+0.007
rob+0.007
 batch+0.007
ip+0.006
c+0.006
Neg logit contributions
erv-0.007
ires-0.007
 confusion-0.006
 gut-0.006
iring-0.006
Token went
Token ID1816
Feature activation+8.57e-01

Pos logit contributions
ires+0.020
erv+0.019
 confusion+0.018
 gut+0.018
adoes+0.018
Neg logit contributions
rob-0.024
 batch-0.022
bar-0.022
ip-0.021
ah-0.019
Token into
Token ID656
Feature activation+5.64e-01

Pos logit contributions
rob+0.028
 batch+0.027
bar+0.026
ip+0.025
ah+0.023
Neg logit contributions
ires-0.024
erv-0.023
 confusion-0.023
 gut-0.022
adoes-0.022
Token the
Token ID262
Feature activation+3.74e-01

Pos logit contributions
rob+0.022
 batch+0.021
ip+0.020
bar+0.020
ah+0.018
Neg logit contributions
ires-0.012
erv-0.012
 confusion-0.012
 gut-0.012
adoes-0.012
Token he
Token ID339
Feature activation+6.07e-01

Pos logit contributions
rob+0.092
 batch+0.089
bar+0.080
ip+0.080
tail+0.072
Neg logit contributions
ires-0.070
 confusion-0.069
 gut-0.069
erv-0.068
adoes-0.066
Token stayed
Token ID9658
Feature activation-2.48e+00

Pos logit contributions
rob+0.016
 batch+0.015
bar+0.014
ip+0.013
ah+0.011
Neg logit contributions
ires-0.021
erv-0.021
 gut-0.020
 confusion-0.020
adoes-0.020
Token still
Token ID991
Feature activation-7.25e-01

Pos logit contributions
ires+0.066
erv+0.064
adoes+0.060
 gut+0.059
 confusion+0.059
Neg logit contributions
rob-0.083
 batch-0.082
bar-0.077
ip-0.074
c-0.068
Token during
Token ID1141
Feature activation-4.41e-01

Pos logit contributions
ires+0.017
erv+0.017
 gut+0.016
adoes+0.016
 confusion+0.016
Neg logit contributions
rob-0.027
 batch-0.026
bar-0.024
ip-0.024
ah-0.022
Token the
Token ID262
Feature activation+6.97e-01

Pos logit contributions
ires+0.010
 confusion+0.010
 gut+0.010
erv+0.009
adoes+0.009
Neg logit contributions
rob-0.017
 batch-0.016
bar-0.015
ip-0.015
ah-0.014
Token earthquakes
Token ID29781
Feature activation-8.78e+00

Pos logit contributions
 batch+0.018
rob+0.018
bar+0.016
ip+0.015
ah+0.014
Neg logit contributions
ires-0.024
erv-0.024
adoes-0.023
 gut-0.023
iring-0.022
Token.
Token ID13
Feature activation+3.92e-01

Pos logit contributions
ires+0.236
erv+0.228
 gut+0.222
adoes+0.218
 confusion+0.217
Neg logit contributions
rob-0.295
 batch-0.287
bar-0.272
ip-0.259
ah-0.238
Token He
Token ID679
Feature activation+7.66e-01

Pos logit contributions
rob+0.014
 batch+0.013
bar+0.012
ip+0.012
ah+0.011
Neg logit contributions
ires-0.010
erv-0.010
 gut-0.010
 confusion-0.009
adoes-0.009
Token was
Token ID373
Feature activation+2.69e-01

Pos logit contributions
rob+0.025
 batch+0.024
bar+0.022
ip+0.021
tail+0.020
Neg logit contributions
ires-0.021
 gut-0.021
 confusion-0.020
erv-0.020
adoes-0.019
Token also
Token ID635
Feature activation+3.83e+00

Pos logit contributions
rob+0.011
 batch+0.010
ip+0.009
bar+0.009
tail+0.008
Neg logit contributions
 gut-0.006
 confusion-0.006
ires-0.006
erv-0.006
adoes-0.005
Token thankful
Token ID25535
Feature activation+9.57e-01

Pos logit contributions
rob+0.146
 batch+0.140
bar+0.129
ip+0.127
ah+0.117
Neg logit contributions
ires-0.088
erv-0.086
 confusion-0.086
 gut-0.085
adoes-0.081
Token and
Token ID290
Feature activation-7.13e-01

Pos logit contributions
rob+0.019
 batch+0.018
bar+0.017
ip+0.016
ah+0.015
Neg logit contributions
ires-0.011
erv-0.011
 gut-0.010
 confusion-0.010
adoes-0.010
Token her
Token ID607
Feature activation-1.85e+00

Pos logit contributions
 confusion+0.017
ires+0.017
erv+0.016
 gut+0.016
adoes+0.016
Neg logit contributions
rob-0.028
 batch-0.024
ip-0.023
bar-0.023
ah-0.021
Token mum
Token ID25682
Feature activation-2.26e-01

Pos logit contributions
ires+0.050
erv+0.047
adoes+0.046
 gut+0.043
iring+0.043
Neg logit contributions
 batch-0.061
rob-0.057
ip-0.055
bar-0.054
ah-0.050
Token explained
Token ID4893
Feature activation+4.07e+00

Pos logit contributions
 confusion+0.007
erv+0.007
 gut+0.007
ires+0.007
 stops+0.006
Neg logit contributions
rob-0.007
 batch-0.006
bar-0.006
ip-0.005
P-0.005
Token that
Token ID326
Feature activation+2.50e+00

Pos logit contributions
 batch+0.115
rob+0.111
bar+0.104
ip+0.102
ah+0.089
Neg logit contributions
ires-0.128
erv-0.127
 confusion-0.119
 gut-0.119
adoes-0.118
Token earthquakes
Token ID29781
Feature activation-8.78e+00

Pos logit contributions
 batch+0.069
rob+0.067
bar+0.062
ip+0.059
c+0.054
Neg logit contributions
ires-0.080
erv-0.080
 gut-0.074
 confusion-0.073
adoes-0.072
Token happen
Token ID1645
Feature activation-2.11e+00

Pos logit contributions
erv+0.272
ires+0.270
 gut+0.263
 confusion+0.248
iring+0.243
Neg logit contributions
 batch-0.238
rob-0.233
bar-0.221
ip-0.206
c-0.199
Token when
Token ID618
Feature activation-5.50e+00

Pos logit contributions
erv+0.061
ires+0.060
 confusion+0.058
 gut+0.056
adoes+0.056
Neg logit contributions
 batch-0.061
rob-0.058
ip-0.056
bar-0.056
c-0.051
Token the
Token ID262
Feature activation+7.49e-01

Pos logit contributions
erv+0.166
ires+0.161
 gut+0.151
 confusion+0.148
adoes+0.148
Neg logit contributions
 batch-0.157
rob-0.150
bar-0.138
ip-0.137
 million-0.128
Token ground
Token ID2323
Feature activation-1.08e+00

Pos logit contributions
 batch+0.023
rob+0.021
bar+0.021
 million+0.020
 peanut+0.018
Neg logit contributions
ires-0.021
erv-0.021
 gut-0.019
iring-0.018
uting-0.018
Token below
Token ID2174
Feature activation-4.02e+00

Pos logit contributions
ires+0.032
erv+0.032
 gut+0.029
 confusion+0.028
iring+0.027
Neg logit contributions
 batch-0.031
bar-0.029
rob-0.028
c-0.028
ip-0.027
Tokened
Token ID276
Feature activation-7.04e-01

Pos logit contributions
ires+0.081
erv+0.081
 confusion+0.073
adoes+0.070
 gut+0.068
Neg logit contributions
rob-0.214
 batch-0.211
bar-0.199
ip-0.188
 loaf-0.178
Token and
Token ID290
Feature activation-1.20e+00

Pos logit contributions
ires+0.015
erv+0.014
 gut+0.014
 confusion+0.014
adoes+0.014
Neg logit contributions
rob-0.028
 batch-0.027
bar-0.025
ip-0.025
ah-0.023
Token hugged
Token ID41130
Feature activation-8.48e-04

Pos logit contributions
ires+0.034
erv+0.034
 gut+0.034
 confusion+0.034
adoes+0.033
Neg logit contributions
rob-0.039
 batch-0.036
bar-0.034
ip-0.033
ah-0.030
Token her
Token ID607
Feature activation-2.07e+00

Pos logit contributions
ires+0.000
 gut+0.000
adoes+0.000
 confusion+0.000
erv+0.000
Neg logit contributions
rob-0.000
 batch-0.000
bar-0.000
ip-0.000
 million-0.000
Token blue
Token ID4171
Feature activation+9.20e-01

Pos logit contributions
ires+0.049
erv+0.047
adoes+0.045
 gut+0.044
 confusion+0.044
Neg logit contributions
rob-0.076
 batch-0.076
ip-0.070
bar-0.069
tail-0.063
Token bag
Token ID6131
Feature activation+7.67e+00

Pos logit contributions
rob+0.025
 batch+0.023
bar+0.022
ip+0.021
 million+0.019
Neg logit contributions
ires-0.030
erv-0.030
 gut-0.029
 confusion-0.029
adoes-0.029
Token tightly
Token ID17707
Feature activation+3.69e+00

Pos logit contributions
rob+0.332
 batch+0.298
ip+0.273
bar+0.270
P+0.264
Neg logit contributions
erv-0.150
ires-0.147
 gut-0.140
iring-0.134
 confusion-0.134
Token.
Token ID13
Feature activation+2.11e-01

Pos logit contributions
rob+0.146
 batch+0.139
bar+0.130
ip+0.126
P+0.117
Neg logit contributions
ires-0.080
erv-0.078
 gut-0.075
 confusion-0.074
adoes-0.074
Token She
Token ID1375
Feature activation+1.58e+00

Pos logit contributions
rob+0.008
 batch+0.007
bar+0.007
ip+0.006
ah+0.006
Neg logit contributions
ires-0.005
erv-0.005
adoes-0.005
 gut-0.005
 confusion-0.005
Token went
Token ID1816
Feature activation+7.99e-01

Pos logit contributions
rob+0.062
 batch+0.057
bar+0.055
ip+0.051
 loaf+0.049
Neg logit contributions
 confusion-0.034
 gut-0.033
erv-0.033
 stops-0.031
ires-0.031
Token back
Token ID736
Feature activation+1.05e+00

Pos logit contributions
rob+0.029
 batch+0.028
bar+0.026
ip+0.025
ah+0.024
Neg logit contributions
ires-0.019
erv-0.018
 gut-0.018
 confusion-0.018
adoes-0.017
Token the
Token ID262
Feature activation+3.30e-01

Pos logit contributions
ires+0.043
adoes+0.043
 earthquakes+0.038
 gut+0.038
 confusion+0.038
Neg logit contributions
rob-0.079
bar-0.075
 batch-0.073
ip-0.069
ah-0.068
Token cookies
Token ID14746
Feature activation+4.49e+00

Pos logit contributions
rob+0.010
 batch+0.009
bar+0.009
ip+0.008
eb+0.008
Neg logit contributions
ires-0.010
adoes-0.010
 confusion-0.010
erv-0.010
 gut-0.010
Token by
Token ID416
Feature activation-2.96e+00

Pos logit contributions
rob+0.179
 batch+0.162
bar+0.155
ip+0.150
ah+0.146
Neg logit contributions
ires-0.097
erv-0.092
adoes-0.091
 gut-0.084
iring-0.082
Token using
Token ID1262
Feature activation+2.70e+00

Pos logit contributions
ires+0.074
adoes+0.072
 confusion+0.072
 gut+0.067
 earthquakes+0.067
Neg logit contributions
rob-0.112
bar-0.097
ip-0.095
 batch-0.091
ah-0.090
Token fresh
Token ID4713
Feature activation+4.37e+00

Pos logit contributions
rob+0.114
bar+0.100
 batch+0.095
ip+0.093
ah+0.093
Neg logit contributions
ires-0.054
adoes-0.054
 confusion-0.051
 earthquakes-0.049
erv-0.048
Token butter
Token ID9215
Feature activation+8.13e+00

Pos logit contributions
rob+0.150
bar+0.139
 batch+0.134
ip+0.125
ah+0.121
Neg logit contributions
ires-0.117
adoes-0.114
erv-0.109
 confusion-0.109
 stops-0.104
Token."
Token ID526
Feature activation+1.89e+00

Pos logit contributions
rob+0.314
 batch+0.293
bar+0.275
ip+0.255
ah+0.250
Neg logit contributions
ires-0.184
adoes-0.179
erv-0.173
 gut-0.161
iring-0.155
Token They
Token ID1119
Feature activation+1.70e+00

Pos logit contributions
rob+0.070
 batch+0.063
bar+0.062
ip+0.057
ah+0.056
Neg logit contributions
erv-0.047
ires-0.046
 confusion-0.043
adoes-0.042
 gut-0.041
Token went
Token ID1816
Feature activation+9.59e-01

Pos logit contributions
rob+0.071
 batch+0.059
bar+0.056
ww+0.056
www+0.053
Neg logit contributions
 confusion-0.034
erv-0.032
 stops-0.030
 little-0.030
 curiosity-0.029
Token back
Token ID736
Feature activation+1.26e+00

Pos logit contributions
rob+0.041
 batch+0.037
bar+0.036
 million+0.034
ah+0.034
Neg logit contributions
ires-0.016
 gut-0.016
erv-0.015
adoes-0.015
 stops-0.014
Token to
Token ID284
Feature activation+1.13e+00

Pos logit contributions
rob+0.044
 batch+0.042
bar+0.039
ip+0.038
ah+0.035
Neg logit contributions
ires-0.033
erv-0.032
 gut-0.031
 confusion-0.031
adoes-0.031
Token mom
Token ID1995
Feature activation-7.61e-01

Pos logit contributions
 batch+0.026
ip+0.023
rob+0.021
bar+0.021
 peanut+0.020
Neg logit contributions
ires-0.022
 confusion-0.021
adoes-0.021
erv-0.021
 gut-0.020
Token brought
Token ID3181
Feature activation+2.44e+00

Pos logit contributions
erv+0.020
 stops+0.020
ires+0.019
 little+0.018
 confusion+0.018
Neg logit contributions
rob-0.026
 batch-0.022
bar-0.022
tail-0.022
ip-0.020
Token home
Token ID1363
Feature activation+1.14e+00

Pos logit contributions
rob+0.117
bar+0.101
 batch+0.094
ah+0.093
ip+0.091
Neg logit contributions
ires-0.040
erv-0.038
 confusion-0.037
 footsteps-0.035
 up-0.035
Token a
Token ID257
Feature activation-5.94e-01

Pos logit contributions
rob+0.041
 batch+0.041
bar+0.038
ip+0.037
ah+0.033
Neg logit contributions
ires-0.028
erv-0.026
 gut-0.026
adoes-0.026
 confusion-0.026
Token big
Token ID1263
Feature activation+2.61e+00

Pos logit contributions
erv+0.011
ires+0.011
 confusion+0.011
 gut+0.011
 stops+0.010
Neg logit contributions
rob-0.025
bar-0.022
 batch-0.021
ip-0.021
eb-0.020
Token bag
Token ID6131
Feature activation+7.48e+00

Pos logit contributions
rob+0.079
bar+0.067
ip+0.061
 batch+0.059
eb+0.057
Neg logit contributions
 confusion-0.078
 gut-0.077
ires-0.076
erv-0.075
 stops-0.075
Token of
Token ID286
Feature activation+4.76e-01

Pos logit contributions
rob+0.335
 batch+0.291
ip+0.275
bar+0.266
ah+0.265
Neg logit contributions
ires-0.135
erv-0.133
iring-0.120
adoes-0.107
 gut-0.104
Token popcorn
Token ID38914
Feature activation+4.21e+00

Pos logit contributions
rob+0.015
bar+0.015
 batch+0.013
ip+0.013
ah+0.013
Neg logit contributions
ires-0.013
 confusion-0.013
 gut-0.012
adoes-0.012
erv-0.012
Token.
Token ID13
Feature activation+3.01e-01

Pos logit contributions
rob+0.177
 batch+0.161
bar+0.159
ip+0.151
 million+0.150
Neg logit contributions
ires-0.076
erv-0.073
adoes-0.068
iring-0.063
 gut-0.062
Token Lily
Token ID20037
Feature activation+8.04e-01

Pos logit contributions
rob+0.012
bar+0.010
 batch+0.010
ip+0.009
ah+0.009
Neg logit contributions
ires-0.007
erv-0.007
adoes-0.007
 confusion-0.007
 gut-0.006
Token was
Token ID373
Feature activation+4.89e-02

Pos logit contributions
rob+0.030
 batch+0.028
bar+0.027
ip+0.025
ah+0.024
Neg logit contributions
ires-0.019
erv-0.019
 gut-0.018
 confusion-0.018
adoes-0.018
Token too
Token ID1165
Feature activation-1.42e+00

Pos logit contributions
ires+0.090
erv+0.086
 gut+0.083
 confusion+0.082
adoes+0.081
Neg logit contributions
rob-0.160
 batch-0.151
bar-0.142
ip-0.136
 million-0.126
Token.
Token ID13
Feature activation-3.59e-01

Pos logit contributions
ires+0.039
adoes+0.038
erv+0.037
 gut+0.037
 confusion+0.037
Neg logit contributions
 batch-0.046
rob-0.045
ip-0.041
bar-0.041
ah-0.037
Token She
Token ID1375
Feature activation+1.48e+00

Pos logit contributions
ires+0.011
 gut+0.010
erv+0.010
adoes+0.010
 confusion+0.010
Neg logit contributions
rob-0.011
 batch-0.011
bar-0.010
ip-0.010
tail-0.009
Token opened
Token ID4721
Feature activation-2.62e+00

Pos logit contributions
rob+0.052
 batch+0.049
bar+0.046
ip+0.044
tail+0.040
Neg logit contributions
erv-0.038
ires-0.038
 confusion-0.037
 gut-0.036
adoes-0.035
Token her
Token ID607
Feature activation-2.05e+00

Pos logit contributions
ires+0.069
erv+0.067
 gut+0.065
 confusion+0.064
adoes+0.064
Neg logit contributions
rob-0.091
 batch-0.088
bar-0.082
ip-0.079
ah-0.072
Token bag
Token ID6131
Feature activation+7.54e+00

Pos logit contributions
ires+0.059
 gut+0.057
 confusion+0.057
erv+0.055
adoes+0.054
Neg logit contributions
rob-0.073
 batch-0.062
bar-0.060
ip-0.055
ah-0.054
Token and
Token ID290
Feature activation-8.93e-01

Pos logit contributions
rob+0.344
 batch+0.332
bar+0.311
ip+0.306
ah+0.288
Neg logit contributions
ires-0.118
erv-0.111
 gut-0.104
 confusion-0.103
adoes-0.102
Token took
Token ID1718
Feature activation+2.22e+00

Pos logit contributions
ires+0.025
 confusion+0.024
erv+0.024
adoes+0.024
 gut+0.024
Neg logit contributions
rob-0.029
 batch-0.027
bar-0.026
ip-0.024
ah-0.023
Token out
Token ID503
Feature activation-8.54e-01

Pos logit contributions
rob+0.093
 batch+0.087
bar+0.082
ip+0.077
ah+0.074
Neg logit contributions
ires-0.045
erv-0.042
 confusion-0.042
adoes-0.041
 gut-0.039
Token a
Token ID257
Feature activation-5.34e-01

Pos logit contributions
ires+0.020
erv+0.019
 confusion+0.019
 gut+0.019
adoes+0.018
Neg logit contributions
rob-0.033
 batch-0.031
bar-0.029
ip-0.028
ah-0.026
Token handful
Token ID10089
Feature activation+7.95e+00

Pos logit contributions
 little+0.013
 confusion+0.013
 stops+0.012
 name+0.012
 hello+0.012
Neg logit contributions
rob-0.021
bar-0.017
 batch-0.016
ip-0.015
tail-0.015
Token was
Token ID373
Feature activation-5.41e-01

Pos logit contributions
ires+0.064
erv+0.062
adoes+0.060
 gut+0.060
 confusion+0.058
Neg logit contributions
rob-0.068
 batch-0.066
bar-0.060
ip-0.059
ah-0.054
Token a
Token ID257
Feature activation-1.06e+00

Pos logit contributions
ires+0.014
 gut+0.014
adoes+0.013
erv+0.013
 confusion+0.013
Neg logit contributions
rob-0.019
 batch-0.019
bar-0.017
ip-0.016
P-0.016
Token happy
Token ID3772
Feature activation-2.01e+00

Pos logit contributions
ires+0.015
erv+0.014
 gut+0.013
adoes+0.013
 confusion+0.013
Neg logit contributions
rob-0.049
 batch-0.048
bar-0.046
ip-0.045
ah-0.042
Token,
Token ID11
Feature activation-4.79e-01

Pos logit contributions
ires+0.043
erv+0.041
 gut+0.040
 confusion+0.039
adoes+0.039
Neg logit contributions
rob-0.080
 batch-0.077
bar-0.073
ip-0.071
ah-0.065
Token smooth
Token ID7209
Feature activation+2.96e+00

Pos logit contributions
ires+0.012
erv+0.011
 gut+0.011
adoes+0.011
 confusion+0.011
Neg logit contributions
rob-0.017
 batch-0.017
bar-0.016
ip-0.015
ah-0.014
Token butter
Token ID9215
Feature activation+7.77e+00

Pos logit contributions
rob+0.102
 batch+0.097
bar+0.091
ip+0.088
tail+0.080
Neg logit contributions
ires-0.077
erv-0.075
 gut-0.073
 confusion-0.072
adoes-0.072
Token.
Token ID13
Feature activation-7.50e-01

Pos logit contributions
rob+0.315
 batch+0.313
bar+0.294
ip+0.276
ird+0.270
Neg logit contributions
ires-0.138
erv-0.128
adoes-0.124
iring-0.122
 gut-0.119
Token The
Token ID383
Feature activation-4.12e-02

Pos logit contributions
ires+0.020
erv+0.019
 gut+0.019
 confusion+0.019
adoes+0.019
Neg logit contributions
rob-0.026
 batch-0.025
bar-0.023
ip-0.022
ah-0.020
Token butter
Token ID9215
Feature activation+7.76e+00

Pos logit contributions
ires+0.001
erv+0.001
 confusion+0.001
 gut+0.001
adoes+0.001
Neg logit contributions
rob-0.001
 batch-0.001
bar-0.001
ip-0.001
ah-0.001
Token looked
Token ID3114
Feature activation+4.72e-01

Pos logit contributions
rob+0.307
 batch+0.296
bar+0.282
ip+0.266
tail+0.259
Neg logit contributions
ires-0.158
erv-0.139
adoes-0.139
 stops-0.136
 gut-0.132
Token all
Token ID477
Feature activation+2.29e+00

Pos logit contributions
rob+0.015
 batch+0.014
bar+0.013
ip+0.012
ird+0.011
Neg logit contributions
ires-0.015
erv-0.014
 confusion-0.013
 gut-0.013
adoes-0.013
Token rake
Token ID47885
Feature activation+1.84e+00

Pos logit contributions
 gut+0.015
ires+0.015
 confusion+0.015
erv+0.015
adoes+0.014
Neg logit contributions
rob-0.045
 batch-0.042
bar-0.041
ip-0.039
eb-0.037
Token!
Token ID0
Feature activation-8.87e-02

Pos logit contributions
rob+0.066
 batch+0.063
ip+0.057
bar+0.056
tail+0.053
Neg logit contributions
ires-0.044
adoes-0.043
 confusion-0.042
 stops-0.042
erv-0.042
Token What
Token ID1867
Feature activation-1.75e+00

Pos logit contributions
ires+0.002
erv+0.002
 confusion+0.002
 gut+0.002
adoes+0.002
Neg logit contributions
rob-0.003
 batch-0.003
bar-0.003
ip-0.003
ah-0.003
Token is
Token ID318
Feature activation-2.47e-01

Pos logit contributions
ires+0.055
adoes+0.053
 gut+0.053
 confusion+0.052
erv+0.052
Neg logit contributions
rob-0.049
 batch-0.047
ip-0.045
bar-0.043
on-0.041
Token a
Token ID257
Feature activation-6.54e-01

Pos logit contributions
ires+0.007
erv+0.007
 gut+0.007
 confusion+0.007
adoes+0.007
Neg logit contributions
rob-0.008
 batch-0.008
bar-0.007
ip-0.007
ah-0.006
Token rake
Token ID47885
Feature activation+1.83e+00

Pos logit contributions
ires+0.015
erv+0.015
 gut+0.014
 confusion+0.014
adoes+0.014
Neg logit contributions
rob-0.024
 batch-0.024
bar-0.022
ip-0.021
ah-0.020
Token?
Token ID30
Feature activation+4.64e-01

Pos logit contributions
rob+0.075
 batch+0.070
ip+0.064
bar+0.061
ah+0.060
Neg logit contributions
ires-0.036
iring-0.033
adoes-0.033
erv-0.032
 stops-0.032
Token he
Token ID339
Feature activation+3.36e-01

Pos logit contributions
rob+0.019
 batch+0.019
bar+0.017
ip+0.017
ah+0.016
Neg logit contributions
erv-0.009
ires-0.009
 confusion-0.008
adoes-0.008
 gut-0.008
Token thought
Token ID1807
Feature activation-1.91e+00

Pos logit contributions
rob+0.011
 batch+0.011
bar+0.010
ip+0.010
ah+0.009
Neg logit contributions
ires-0.009
erv-0.009
 gut-0.009
 confusion-0.009
adoes-0.008
Token to
Token ID284
Feature activation+8.78e-01

Pos logit contributions
ires+0.060
erv+0.059
 confusion+0.057
adoes+0.057
 gut+0.057
Neg logit contributions
rob-0.055
 batch-0.054
bar-0.050
ip-0.049
ah-0.043
Token himself
Token ID2241
Feature activation-3.41e+00

Pos logit contributions
rob+0.034
 batch+0.033
ip+0.030
bar+0.030
ah+0.027
Neg logit contributions
ires-0.020
erv-0.019
 gut-0.019
 confusion-0.019
adoes-0.018
Token a
Token ID257
Feature activation-8.03e-01

Pos logit contributions
rob+0.145
 batch+0.142
bar+0.133
ip+0.131
ah+0.119
Neg logit contributions
ires-0.074
erv-0.071
 confusion-0.070
 gut-0.069
adoes-0.067
Token toy
Token ID13373
Feature activation+2.70e+00

Pos logit contributions
ires+0.018
erv+0.017
 gut+0.017
 confusion+0.017
adoes+0.016
Neg logit contributions
rob-0.031
 batch-0.029
bar-0.028
ip-0.027
tail-0.025
Token par
Token ID1582
Feature activation+1.14e+00

Pos logit contributions
rob+0.095
 batch+0.092
bar+0.082
ip+0.082
c+0.076
Neg logit contributions
ires-0.071
erv-0.067
 gut-0.065
adoes-0.064
 confusion-0.061
Tokenrot
Token ID10599
Feature activation+3.33e-01

Pos logit contributions
rob+0.031
bar+0.029
 batch+0.028
ip+0.027
ah+0.025
Neg logit contributions
 confusion-0.038
ires-0.037
 gut-0.036
erv-0.036
adoes-0.036
Token that
Token ID326
Feature activation+2.13e+00

Pos logit contributions
rob+0.013
 batch+0.012
bar+0.011
ip+0.010
 million+0.010
Neg logit contributions
ires-0.007
erv-0.007
adoes-0.007
 gut-0.007
iring-0.006
Token said
Token ID531
Feature activation+3.53e+00

Pos logit contributions
rob+0.066
 batch+0.064
bar+0.059
ip+0.056
tail+0.051
Neg logit contributions
ires-0.063
erv-0.061
 confusion-0.059
adoes-0.059
 gut-0.059
Token funny
Token ID8258
Feature activation-1.37e+00

Pos logit contributions
rob+0.130
 batch+0.126
bar+0.118
ip+0.115
ah+0.105
Neg logit contributions
ires-0.085
erv-0.081
 gut-0.079
 confusion-0.078
adoes-0.078
Token things
Token ID1243
Feature activation-5.98e+00

Pos logit contributions
ires+0.032
 confusion+0.031
erv+0.031
 gut+0.031
adoes+0.029
Neg logit contributions
rob-0.049
 batch-0.049
bar-0.045
ip-0.045
ah-0.042
Token.
Token ID13
Feature activation-4.76e-01

Pos logit contributions
 confusion+0.183
ires+0.182
 gut+0.181
erv+0.180
adoes+0.174
Neg logit contributions
rob-0.166
 batch-0.157
bar-0.151
ip-0.151
ah-0.135
Token\n
Token ID198
Feature activation-2.47e-01

Pos logit contributions
ires+0.012
erv+0.011
 confusion+0.010
 gut+0.010
adoes+0.010
Neg logit contributions
rob-0.018
 batch-0.017
bar-0.017
ip-0.015
tail-0.015
Token\n
Token ID198
Feature activation-2.16e-01

Pos logit contributions
ires+0.006
erv+0.006
 confusion+0.006
 gut+0.006
adoes+0.005
Neg logit contributions
rob-0.009
 batch-0.008
bar-0.008
ip-0.008
tail-0.007
TokenShe
Token ID3347
Feature activation+1.13e+00

Pos logit contributions
rob+0.027
 batch+0.026
bar+0.024
ip+0.023
ah+0.021
Neg logit contributions
ires-0.015
erv-0.014
 gut-0.014
 confusion-0.014
adoes-0.013
Token sat
Token ID3332
Feature activation-2.59e+00

Pos logit contributions
rob+0.040
 batch+0.037
bar+0.035
ip+0.033
tail+0.030
Neg logit contributions
ires-0.029
 confusion-0.029
erv-0.029
 gut-0.028
 stops-0.027
Token for
Token ID329
Feature activation+4.58e+00

Pos logit contributions
ires+0.070
erv+0.069
 confusion+0.065
 gut+0.065
adoes+0.065
Neg logit contributions
rob-0.087
 batch-0.085
bar-0.079
ip-0.077
tail-0.069
Token a
Token ID257
Feature activation-7.57e-01

Pos logit contributions
rob+0.200
 batch+0.195
bar+0.188
ip+0.184
ah+0.170
Neg logit contributions
 confusion-0.073
ires-0.071
 gut-0.071
 footsteps-0.070
erv-0.067
Token few
Token ID1178
Feature activation+6.30e+00

Pos logit contributions
erv+0.018
 confusion+0.018
 gut+0.017
ires+0.016
 stops+0.016
Neg logit contributions
rob-0.028
 batch-0.025
bar-0.025
ip-0.023
eb-0.022
Token more
Token ID517
Feature activation+1.74e+00

Pos logit contributions
rob+0.200
 batch+0.191
bar+0.181
ip+0.172
ah+0.155
Neg logit contributions
ires-0.181
 confusion-0.176
erv-0.175
 gut-0.174
adoes-0.168
Token minutes
Token ID2431
Feature activation+2.45e+00

Pos logit contributions
rob+0.061
 batch+0.059
bar+0.055
ip+0.054
ah+0.049
Neg logit contributions
ires-0.045
erv-0.044
 gut-0.042
adoes-0.042
 confusion-0.041
Token and
Token ID290
Feature activation-8.65e-01

Pos logit contributions
rob+0.098
 batch+0.095
bar+0.089
ip+0.087
ah+0.080
Neg logit contributions
ires-0.051
erv-0.049
 gut-0.047
 confusion-0.047
adoes-0.046
Token then
Token ID788
Feature activation+7.93e-01

Pos logit contributions
ires+0.024
erv+0.023
 gut+0.023
 confusion+0.023
adoes+0.023
Neg logit contributions
rob-0.028
 batch-0.027
bar-0.025
ip-0.024
ah-0.022
Token decided
Token ID3066
Feature activation-5.97e-02

Pos logit contributions
rob+0.026
 batch+0.025
bar+0.023
ip+0.022
ah+0.020
Neg logit contributions
ires-0.022
erv-0.021
 confusion-0.021
 gut-0.021
adoes-0.021
Token it
Token ID340
Feature activation-4.41e-01

Pos logit contributions
ires+0.002
erv+0.002
 confusion+0.002
 gut+0.002
adoes+0.002
Neg logit contributions
rob-0.002
 batch-0.002
bar-0.002
ip-0.002
tail-0.001
Token Emma
Token ID18966
Feature activation+1.46e-01

Pos logit contributions
rob+0.047
 batch+0.043
bar+0.042
ip+0.039
ah+0.037
Neg logit contributions
ires-0.035
erv-0.034
 confusion-0.033
adoes-0.032
 gut-0.032
Token who
Token ID508
Feature activation+2.07e+00

Pos logit contributions
rob+0.006
 batch+0.006
bar+0.006
ip+0.005
 loaf+0.005
Neg logit contributions
erv-0.003
ires-0.003
 confusion-0.002
 gut-0.002
adoes-0.002
Token loved
Token ID6151
Feature activation-4.97e+00

Pos logit contributions
rob+0.080
 batch+0.078
bar+0.074
ip+0.071
ah+0.065
Neg logit contributions
ires-0.046
erv-0.045
 gut-0.043
 confusion-0.043
adoes-0.042
Token to
Token ID284
Feature activation+6.95e-01

Pos logit contributions
ires+0.106
adoes+0.100
erv+0.097
 gut+0.096
 earthquakes+0.095
Neg logit contributions
rob-0.195
bar-0.181
 batch-0.178
ip-0.170
ah-0.165
Token play
Token ID711
Feature activation-3.34e+00

Pos logit contributions
rob+0.024
 batch+0.024
bar+0.022
ip+0.021
ah+0.019
Neg logit contributions
ires-0.018
erv-0.017
 gut-0.017
 confusion-0.017
adoes-0.016
Token chess
Token ID19780
Feature activation+1.62e+00

Pos logit contributions
ires+0.075
erv+0.072
 gut+0.069
adoes+0.069
 confusion+0.069
Neg logit contributions
rob-0.128
 batch-0.125
bar-0.117
ip-0.114
ah-0.105
Token.
Token ID13
Feature activation-7.13e-01

Pos logit contributions
rob+0.059
 batch+0.057
bar+0.054
ip+0.052
ah+0.048
Neg logit contributions
ires-0.039
erv-0.038
 gut-0.036
adoes-0.036
 confusion-0.036
Token Every
Token ID3887
Feature activation-1.36e+00

Pos logit contributions
ires+0.022
 gut+0.021
erv+0.021
adoes+0.021
 confusion+0.021
Neg logit contributions
 batch-0.021
rob-0.020
bar-0.018
ip-0.018
tail-0.016
Token day
Token ID1110
Feature activation+2.31e-01

Pos logit contributions
ires+0.038
erv+0.038
 confusion+0.036
 gut+0.036
 stops+0.035
Neg logit contributions
rob-0.044
bar-0.040
 batch-0.039
ip-0.036
ird-0.034
Token,
Token ID11
Feature activation-3.56e-01

Pos logit contributions
rob+0.008
 batch+0.008
bar+0.007
ip+0.007
ah+0.007
Neg logit contributions
ires-0.006
erv-0.006
 gut-0.006
 confusion-0.006
adoes-0.005
Token she
Token ID673
Feature activation+1.17e+00

Pos logit contributions
ires+0.011
erv+0.011
 gut+0.011
 confusion+0.011
adoes+0.011
Neg logit contributions
 batch-0.010
rob-0.010
bar-0.009
ip-0.009
tail-0.008
Token resist
Token ID4180
Feature activation-9.76e-01

Pos logit contributions
ires+0.077
erv+0.075
 gut+0.074
 confusion+0.073
adoes+0.071
Neg logit contributions
rob-0.125
 batch-0.121
bar-0.112
ip-0.110
ah-0.099
Token,
Token ID11
Feature activation+3.72e-01

Pos logit contributions
ires+0.027
erv+0.027
adoes+0.026
 gut+0.025
 confusion+0.025
Neg logit contributions
rob-0.031
 batch-0.031
bar-0.028
ip-0.027
ah-0.025
Token however
Token ID2158
Feature activation-2.70e-01

Pos logit contributions
rob+0.015
 batch+0.014
bar+0.013
ip+0.013
P+0.012
Neg logit contributions
ires-0.008
erv-0.008
 confusion-0.008
 gut-0.008
adoes-0.008
Token,
Token ID11
Feature activation+3.72e-01

Pos logit contributions
ires+0.008
erv+0.007
adoes+0.007
 gut+0.007
 confusion+0.007
Neg logit contributions
rob-0.009
 batch-0.008
bar-0.008
ip-0.008
ah-0.007
Token adding
Token ID4375
Feature activation+1.39e+00

Pos logit contributions
rob+0.014
 batch+0.013
bar+0.012
ip+0.012
ah+0.011
Neg logit contributions
ires-0.009
erv-0.009
 gut-0.008
adoes-0.008
 confusion-0.008
Token that
Token ID326
Feature activation+2.59e+00

Pos logit contributions
rob+0.047
 batch+0.047
bar+0.043
ip+0.042
ah+0.038
Neg logit contributions
ires-0.036
erv-0.035
 gut-0.034
adoes-0.034
 confusion-0.033
Token the
Token ID262
Feature activation+4.74e-01

Pos logit contributions
rob+0.087
 batch+0.084
bar+0.078
ip+0.076
ah+0.069
Neg logit contributions
ires-0.070
erv-0.068
 gut-0.066
 confusion-0.066
adoes-0.065
Token coin
Token ID10752
Feature activation+2.39e+00

Pos logit contributions
rob+0.016
 batch+0.016
bar+0.014
ip+0.014
ah+0.013
Neg logit contributions
ires-0.013
erv-0.012
 gut-0.012
 confusion-0.012
adoes-0.012
Token was
Token ID373
Feature activation+2.35e-01

Pos logit contributions
rob+0.069
 batch+0.068
bar+0.066
ip+0.063
ah+0.056
Neg logit contributions
ires-0.073
erv-0.071
 gut-0.069
 confusion-0.069
adoes-0.068
Token too
Token ID1165
Feature activation-1.27e+00

Pos logit contributions
rob+0.009
 batch+0.008
bar+0.007
ip+0.007
tail+0.007
Neg logit contributions
ires-0.006
erv-0.006
 confusion-0.006
 gut-0.006
adoes-0.006
Token beautiful
Token ID4950
Feature activation-6.66e-01

Pos logit contributions
ires+0.035
adoes+0.034
erv+0.033
 gut+0.032
 footsteps+0.032
Neg logit contributions
 batch-0.041
ip-0.039
rob-0.038
bar-0.038
 million-0.035
Token and
Token ID290
Feature activation-9.18e-01

Pos logit contributions
rob+0.117
 batch+0.113
bar+0.104
ip+0.102
ah+0.093
Neg logit contributions
ires-0.078
erv-0.075
 gut-0.072
 confusion-0.072
adoes-0.072
Token sharp
Token ID7786
Feature activation+2.62e+00

Pos logit contributions
ires+0.021
erv+0.020
 gut+0.019
 confusion+0.019
adoes+0.019
Neg logit contributions
rob-0.035
 batch-0.034
bar-0.032
ip-0.031
ah-0.029
Token teeth
Token ID9941
Feature activation-1.80e-02

Pos logit contributions
rob+0.102
 batch+0.099
bar+0.093
ip+0.091
ah+0.084
Neg logit contributions
ires-0.058
erv-0.055
 gut-0.053
 confusion-0.053
adoes-0.053
Token.
Token ID13
Feature activation-2.32e-01

Pos logit contributions
ires+0.000
erv+0.000
 gut+0.000
adoes+0.000
 confusion+0.000
Neg logit contributions
rob-0.001
 batch-0.001
bar-0.001
ip-0.001
ah-0.001
Token\n
Token ID198
Feature activation+4.03e-01

Pos logit contributions
ires+0.007
erv+0.006
 gut+0.006
adoes+0.006
 confusion+0.006
Neg logit contributions
rob-0.007
 batch-0.007
ip-0.007
bar-0.007
eb-0.006
Token\n
Token ID198
Feature activation+3.85e-01

Pos logit contributions
rob+0.013
 batch+0.013
bar+0.012
ip+0.012
ah+0.010
Neg logit contributions
ires-0.011
erv-0.011
 gut-0.011
adoes-0.011
 confusion-0.011
Token"
Token ID1
Feature activation-3.98e-01

Pos logit contributions
rob+0.011
 batch+0.011
bar+0.010
ip+0.010
P+0.010
Neg logit contributions
ires-0.012
erv-0.012
adoes-0.012
 confusion-0.011
 gut-0.011
TokenNo
Token ID2949
Feature activation+1.76e+00

Pos logit contributions
ires+0.011
erv+0.011
 gut+0.011
adoes+0.011
 confusion+0.010
Neg logit contributions
rob-0.013
 batch-0.013
bar-0.012
ip-0.011
P-0.011
Token,
Token ID11
Feature activation+1.39e-02

Pos logit contributions
rob+0.076
 batch+0.070
bar+0.069
ip+0.066
P+0.066
Neg logit contributions
 gut-0.031
 stops-0.030
 confusion-0.030
adoes-0.029
erv-0.029
Token I
Token ID314
Feature activation+1.61e+00

Pos logit contributions
rob+0.001
 batch+0.000
bar+0.000
ip+0.000
c+0.000
Neg logit contributions
ires-0.000
 gut-0.000
 confusion-0.000
erv-0.000
adoes-0.000
Token want
Token ID765
Feature activation-2.13e+00

Pos logit contributions
rob+0.069
 batch+0.067
bar+0.063
ip+0.062
ah+0.058
Neg logit contributions
ires-0.030
erv-0.028
 gut-0.027
adoes-0.027
 confusion-0.026
Token.
Token ID13
Feature activation+2.57e-01

Pos logit contributions
ires+0.010
erv+0.010
adoes+0.010
 gut+0.010
 confusion+0.009
Neg logit contributions
rob-0.019
 batch-0.019
ip-0.017
bar-0.017
 million-0.016
Token Every
Token ID3887
Feature activation-9.53e-01

Pos logit contributions
rob+0.009
 batch+0.008
bar+0.008
ip+0.008
ah+0.007
Neg logit contributions
ires-0.007
erv-0.007
 confusion-0.006
 gut-0.006
adoes-0.006
Token day
Token ID1110
Feature activation+9.52e-01

Pos logit contributions
 confusion+0.025
 stops+0.023
ires+0.022
 gut+0.022
 curiosity+0.022
Neg logit contributions
rob-0.035
bar-0.030
ip-0.029
 batch-0.028
ird-0.026
Token when
Token ID618
Feature activation-5.27e+00

Pos logit contributions
rob+0.033
 batch+0.032
bar+0.029
ip+0.028
ah+0.026
Neg logit contributions
ires-0.025
erv-0.024
 gut-0.023
 confusion-0.023
adoes-0.023
Token she
Token ID673
Feature activation+1.88e+00

Pos logit contributions
ires+0.155
erv+0.151
 confusion+0.149
 gut+0.149
adoes+0.145
Neg logit contributions
rob-0.166
 batch-0.157
bar-0.147
ip-0.141
ah-0.127
Token woke
Token ID19092
Feature activation-1.56e-01

Pos logit contributions
rob+0.058
 batch+0.057
bar+0.052
ip+0.050
tail+0.045
Neg logit contributions
ires-0.054
erv-0.054
 confusion-0.052
 gut-0.052
adoes-0.051
Token up
Token ID510
Feature activation-3.78e+00

Pos logit contributions
erv+0.003
ires+0.003
 confusion+0.003
 gut+0.003
adoes+0.003
Neg logit contributions
 batch-0.006
rob-0.006
ip-0.006
bar-0.006
eb-0.005
Token,
Token ID11
Feature activation+8.28e-01

Pos logit contributions
ires+0.123
erv+0.121
adoes+0.115
 gut+0.115
 confusion+0.109
Neg logit contributions
 batch-0.105
rob-0.100
bar-0.095
ip-0.092
 million-0.081
Token Sus
Token ID8932
Feature activation+1.33e+00

Pos logit contributions
rob+0.023
 batch+0.023
bar+0.021
ip+0.020
 million+0.018
Neg logit contributions
ires-0.027
erv-0.026
 gut-0.025
adoes-0.025
 confusion-0.025
Tokenie
Token ID494
Feature activation+5.29e-02

Pos logit contributions
rob+0.034
 batch+0.033
bar+0.029
ip+0.028
tail+0.026
Neg logit contributions
ires-0.047
erv-0.046
adoes-0.044
 gut-0.044
 confusion-0.044
Token had
Token ID550
Feature activation+4.58e+00

Pos logit contributions
rob+0.002
 batch+0.002
bar+0.001
ip+0.001
tail+0.001
Neg logit contributions
ires-0.002
erv-0.001
 gut-0.001
 confusion-0.001
adoes-0.001
Token are
Token ID389
Feature activation+1.49e+00

Pos logit contributions
ires+0.007
erv+0.007
 gut+0.007
adoes+0.007
 confusion+0.007
Neg logit contributions
rob-0.006
 batch-0.005
bar-0.005
ip-0.005
ird-0.004
Token friends
Token ID2460
Feature activation-1.86e+00

Pos logit contributions
rob+0.051
 batch+0.051
bar+0.047
ip+0.046
 million+0.042
Neg logit contributions
ires-0.036
erv-0.036
 gut-0.035
adoes-0.034
 confusion-0.034
Token.
Token ID13
Feature activation-2.73e-01

Pos logit contributions
ires+0.049
erv+0.048
 gut+0.047
 confusion+0.046
adoes+0.046
Neg logit contributions
rob-0.064
 batch-0.062
bar-0.058
ip-0.056
ah-0.051
Token You
Token ID921
Feature activation-2.18e-01

Pos logit contributions
ires+0.008
erv+0.007
 gut+0.007
 confusion+0.007
adoes+0.007
Neg logit contributions
rob-0.009
 batch-0.008
bar-0.008
ip-0.008
tail-0.007
Token have
Token ID423
Feature activation+3.59e+00

Pos logit contributions
ires+0.007
erv+0.007
 gut+0.007
adoes+0.007
 confusion+0.007
Neg logit contributions
rob-0.006
 batch-0.006
bar-0.005
ip-0.005
ird-0.005
Token to
Token ID284
Feature activation+9.51e-01

Pos logit contributions
rob+0.131
 batch+0.125
bar+0.116
ip+0.113
ah+0.104
Neg logit contributions
ires-0.090
erv-0.088
 gut-0.085
 confusion-0.085
adoes-0.084
Token take
Token ID1011
Feature activation+1.49e+00

Pos logit contributions
rob+0.038
 batch+0.036
ip+0.034
bar+0.034
 million+0.031
Neg logit contributions
ires-0.020
 gut-0.020
 confusion-0.019
erv-0.019
adoes-0.019
Token care
Token ID1337
Feature activation-2.66e+00

Pos logit contributions
rob+0.054
 batch+0.053
bar+0.049
ip+0.048
tail+0.044
Neg logit contributions
ires-0.036
erv-0.034
 gut-0.033
 confusion-0.033
adoes-0.032
Token of
Token ID286
Feature activation+6.78e-01

Pos logit contributions
 confusion+0.093
erv+0.091
adoes+0.089
ires+0.088
 gut+0.088
Neg logit contributions
 batch-0.063
bar-0.059
rob-0.059
ip-0.058
 million-0.048
Token each
Token ID1123
Feature activation+5.94e+00

Pos logit contributions
rob+0.021
 batch+0.021
bar+0.019
ip+0.018
tail+0.016
Neg logit contributions
erv-0.019
ires-0.019
 confusion-0.018
 gut-0.018
adoes-0.018
Token other
Token ID584
Feature activation+1.75e+00

Pos logit contributions
rob+0.229
 batch+0.220
bar+0.215
ip+0.202
tail+0.188
Neg logit contributions
ires-0.134
erv-0.127
 gut-0.124
 confusion-0.121
adoes-0.119
Token holding
Token ID4769
Feature activation+3.02e+00

Pos logit contributions
ires+0.001
erv+0.001
 gut+0.001
 confusion+0.001
adoes+0.001
Neg logit contributions
rob-0.002
 batch-0.001
bar-0.001
ip-0.001
ah-0.001
Token a
Token ID257
Feature activation-5.12e-01

Pos logit contributions
rob+0.098
 batch+0.096
bar+0.090
ip+0.089
tail+0.081
Neg logit contributions
ires-0.083
erv-0.081
 gut-0.078
adoes-0.077
 confusion-0.075
Token wand
Token ID11569
Feature activation+6.80e-01

Pos logit contributions
 confusion+0.012
erv+0.012
 gut+0.012
ires+0.012
 stops+0.011
Neg logit contributions
rob-0.019
 batch-0.018
bar-0.017
ip-0.016
tail-0.015
Token.
Token ID13
Feature activation+3.23e-02

Pos logit contributions
rob+0.027
 batch+0.026
bar+0.024
ip+0.023
 million+0.021
Neg logit contributions
ires-0.015
erv-0.014
 gut-0.014
 confusion-0.014
adoes-0.013
Token\n
Token ID198
Feature activation+4.81e-01

Pos logit contributions
rob+0.001
 batch+0.001
bar+0.001
ip+0.001
ah+0.001
Neg logit contributions
ires-0.001
erv-0.001
 gut-0.001
adoes-0.001
 confusion-0.001
Token\n
Token ID198
Feature activation+4.66e-01

Pos logit contributions
rob+0.015
 batch+0.014
bar+0.013
ip+0.013
eb+0.011
Neg logit contributions
ires-0.014
erv-0.014
adoes-0.014
 gut-0.014
 confusion-0.014
Token"
Token ID1
Feature activation-3.01e-01

Pos logit contributions
rob+0.015
 batch+0.014
bar+0.013
ip+0.013
P+0.012
Neg logit contributions
ires-0.014
erv-0.013
 gut-0.013
 confusion-0.013
adoes-0.013
TokenLook
Token ID8567
Feature activation+1.20e+00

Pos logit contributions
ires+0.008
erv+0.007
 gut+0.007
 confusion+0.007
adoes+0.007
Neg logit contributions
rob-0.011
 batch-0.010
bar-0.010
ip-0.009
ah-0.009
Token at
Token ID379
Feature activation-1.10e+00

Pos logit contributions
rob+0.042
 batch+0.041
bar+0.038
ip+0.037
ah+0.034
Neg logit contributions
ires-0.031
erv-0.030
 gut-0.029
 confusion-0.028
adoes-0.028
Token me
Token ID502
Feature activation-4.25e+00

Pos logit contributions
ires+0.028
adoes+0.026
erv+0.026
 gut+0.026
 confusion+0.025
Neg logit contributions
rob-0.040
 batch-0.038
bar-0.035
ip-0.035
ird-0.032
Token,
Token ID11
Feature activation-2.08e-03

Pos logit contributions
ires+0.095
erv+0.094
 gut+0.089
adoes+0.086
iring+0.085
Neg logit contributions
rob-0.167
 batch-0.155
bar-0.142
ip-0.137
P-0.132
Token who
Token ID508
Feature activation+2.15e+00

Pos logit contributions
erv+0.049
ires+0.048
iring+0.043
adoes+0.041
 gut+0.040
Neg logit contributions
rob-0.122
 batch-0.120
bar-0.113
ip-0.108
 french-0.103
Token had
Token ID550
Feature activation+3.72e+00

Pos logit contributions
 batch+0.069
rob+0.069
bar+0.064
ip+0.061
 loaf+0.054
Neg logit contributions
ires-0.058
erv-0.056
 confusion-0.055
 gut-0.055
adoes-0.055
Token a
Token ID257
Feature activation-1.04e+00

Pos logit contributions
 batch+0.176
rob+0.175
ip+0.164
bar+0.161
tail+0.152
Neg logit contributions
ires-0.043
erv-0.043
 confusion-0.040
adoes-0.039
 gut-0.037
Token big
Token ID1263
Feature activation+2.17e+00

Pos logit contributions
 gut+0.018
 confusion+0.017
ires+0.017
 little+0.016
erv+0.016
Neg logit contributions
rob-0.046
bar-0.042
 batch-0.041
ip-0.040
eb-0.038
Token castle
Token ID16669
Feature activation-1.21e+00

Pos logit contributions
rob+0.080
bar+0.070
ip+0.070
 batch+0.068
P+0.064
Neg logit contributions
 confusion-0.051
 gut-0.049
 stops-0.049
 curiosity-0.047
ires-0.046
Token.
Token ID13
Feature activation-4.75e-01

Pos logit contributions
ires+0.021
erv+0.020
adoes+0.019
iring+0.018
 stops+0.018
Neg logit contributions
rob-0.051
 batch-0.050
ip-0.046
bar-0.046
eb-0.043
Token One
Token ID1881
Feature activation-8.54e-01

Pos logit contributions
ires+0.013
erv+0.013
 gut+0.013
 confusion+0.012
adoes+0.012
Neg logit contributions
rob-0.016
 batch-0.015
bar-0.014
ip-0.014
ah-0.012
Token day
Token ID1110
Feature activation+2.62e-01

Pos logit contributions
ires+0.031
erv+0.030
 gut+0.030
 confusion+0.030
adoes+0.029
Neg logit contributions
rob-0.021
 batch-0.020
bar-0.018
ip-0.017
tail-0.015
Token,
Token ID11
Feature activation-1.99e-01

Pos logit contributions
rob+0.006
 batch+0.006
ip+0.006
bar+0.005
ah+0.005
Neg logit contributions
ires-0.010
 confusion-0.009
 gut-0.009
erv-0.009
adoes-0.009
Token the
Token ID262
Feature activation+2.03e-01

Pos logit contributions
erv+0.007
ires+0.007
 gut+0.007
 confusion+0.007
adoes+0.006
Neg logit contributions
 batch-0.005
rob-0.005
bar-0.004
ip-0.004
 loaf-0.004
Token king
Token ID5822
Feature activation-2.96e+00

Pos logit contributions
rob+0.005
 batch+0.005
bar+0.005
ip+0.004
 peanut+0.004
Neg logit contributions
ires-0.007
erv-0.007
adoes-0.007
 confusion-0.007
 gut-0.006
Token child
Token ID1200
Feature activation+3.30e+00

Pos logit contributions
rob+0.015
 batch+0.014
bar+0.013
ip+0.012
ah+0.012
Neg logit contributions
ires-0.013
erv-0.012
adoes-0.012
 gut-0.012
iring-0.011
Token made
Token ID925
Feature activation+4.15e+00

Pos logit contributions
rob+0.110
 batch+0.106
bar+0.093
ip+0.087
 loaf+0.086
Neg logit contributions
 confusion-0.085
 gut-0.085
erv-0.084
 stops-0.084
ires-0.082
Token beautiful
Token ID4950
Feature activation-5.88e-01

Pos logit contributions
rob+0.154
 batch+0.149
bar+0.140
ip+0.136
ah+0.125
Neg logit contributions
ires-0.098
erv-0.094
 gut-0.091
 confusion-0.091
adoes-0.090
Token music
Token ID2647
Feature activation-2.57e+00

Pos logit contributions
ires+0.012
erv+0.011
 gut+0.010
 confusion+0.010
adoes+0.010
Neg logit contributions
rob-0.025
 batch-0.024
ip-0.022
bar-0.022
tail-0.021
Token.
Token ID13
Feature activation+2.09e-01

Pos logit contributions
ires+0.068
erv+0.066
 gut+0.064
adoes+0.064
 confusion+0.063
Neg logit contributions
rob-0.088
 batch-0.084
bar-0.079
ip-0.076
ah-0.069
Token<|endoftext|>
Token ID50256
Feature activation-5.27e+00

Pos logit contributions
rob+0.008
 batch+0.008
bar+0.007
ip+0.007
P+0.007
Neg logit contributions
ires-0.005
erv-0.005
 confusion-0.004
adoes-0.004
 gut-0.004
TokenOne
Token ID3198
Feature activation-1.38e+00

Pos logit contributions
ires+0.122
erv+0.118
 confusion+0.112
 gut+0.110
adoes+0.109
Neg logit contributions
 batch-0.193
rob-0.193
bar-0.181
ip-0.173
tail-0.158
Token day
Token ID1110
Feature activation+3.09e-01

Pos logit contributions
ires+0.047
 confusion+0.045
erv+0.045
 gut+0.045
adoes+0.043
Neg logit contributions
rob-0.039
 batch-0.035
bar-0.032
ip-0.030
tail-0.027
Token,
Token ID11
Feature activation-3.26e-01

Pos logit contributions
rob+0.007
 batch+0.007
ip+0.007
bar+0.007
P+0.006
Neg logit contributions
ires-0.011
 gut-0.011
adoes-0.011
erv-0.010
 confusion-0.010
Token a
Token ID257
Feature activation-1.04e+00

Pos logit contributions
 gut+0.010
ires+0.010
adoes+0.010
erv+0.010
 perspectives+0.009
Neg logit contributions
 batch-0.009
rob-0.009
bar-0.008
ah-0.008
ip-0.008
Token small
Token ID1402
Feature activation+4.26e-01

Pos logit contributions
ires+0.016
erv+0.015
adoes+0.015
 confusion+0.014
 gut+0.014
Neg logit contributions
rob-0.047
 batch-0.046
ip-0.043
bar-0.043
ah-0.041
Token but
Token ID475
Feature activation+5.37e-01

Pos logit contributions
ires+0.004
erv+0.004
adoes+0.003
 gut+0.003
iring+0.003
Neg logit contributions
rob-0.005
 batch-0.005
bar-0.004
ip-0.004
 gob-0.004
Token he
Token ID339
Feature activation+1.95e-01

Pos logit contributions
rob+0.018
 batch+0.018
bar+0.016
ip+0.016
ah+0.014
Neg logit contributions
ires-0.014
 gut-0.014
 confusion-0.014
adoes-0.013
erv-0.013
Token could
Token ID714
Feature activation+1.09e+00

Pos logit contributions
rob+0.005
 batch+0.005
ip+0.005
bar+0.005
ah+0.004
Neg logit contributions
ires-0.006
adoes-0.006
erv-0.006
 confusion-0.006
 gut-0.006
Token not
Token ID407
Feature activation-4.93e+00

Pos logit contributions
rob+0.038
 batch+0.037
bar+0.036
ip+0.034
tail+0.034
Neg logit contributions
ires-0.027
erv-0.026
adoes-0.026
 gut-0.023
 confusion-0.023
Token reach
Token ID3151
Feature activation-1.47e+00

Pos logit contributions
ires+0.161
adoes+0.152
erv+0.143
iring+0.141
 gut+0.136
Neg logit contributions
rob-0.137
bar-0.129
 batch-0.127
tail-0.124
ip-0.120
Token him
Token ID683
Feature activation-5.76e+00

Pos logit contributions
ires+0.026
 gut+0.024
 confusion+0.024
 footsteps+0.023
erv+0.022
Neg logit contributions
rob-0.063
 batch-0.063
bar-0.057
ird-0.055
ip-0.054
Token.
Token ID13
Feature activation-5.76e-02

Pos logit contributions
ires+0.141
erv+0.136
 gut+0.132
 confusion+0.131
adoes+0.131
Neg logit contributions
rob-0.209
 batch-0.203
bar-0.189
ip-0.184
ah-0.168
Token Lily
Token ID20037
Feature activation+7.93e-01

Pos logit contributions
ires+0.001
erv+0.001
 gut+0.001
 confusion+0.001
adoes+0.001
Neg logit contributions
rob-0.002
 batch-0.002
bar-0.002
ip-0.002
ah-0.002
Token's
Token ID338
Feature activation+6.87e-01

Pos logit contributions
rob+0.022
 batch+0.021
ip+0.020
bar+0.019
 gob+0.018
Neg logit contributions
ires-0.028
erv-0.025
adoes-0.025
iring-0.024
 gut-0.024
Token mom
Token ID1995
Feature activation-3.65e-01

Pos logit contributions
 batch+0.025
rob+0.024
bar+0.023
ip+0.022
tail+0.021
Neg logit contributions
ires-0.017
erv-0.016
adoes-0.015
 gut-0.014
iring-0.014
Token called
Token ID1444
Feature activation+1.98e+00

Pos logit contributions
ires+0.011
erv+0.010
 gut+0.010
 confusion+0.010
adoes+0.010
Neg logit contributions
rob-0.012
 batch-0.011
bar-0.010
ip-0.010
ah-0.009
Token play
Token ID711
Feature activation-3.44e+00

Pos logit contributions
ires+0.009
 gut+0.007
 confusion+0.007
 footsteps+0.007
adoes+0.007
Neg logit contributions
rob-0.021
 batch-0.019
ip-0.019
bar-0.018
 million-0.017
Token with
Token ID351
Feature activation+1.78e+00

Pos logit contributions
ires+0.067
adoes+0.063
erv+0.062
 gut+0.060
 confusion+0.057
Neg logit contributions
rob-0.140
 batch-0.136
bar-0.129
ip-0.128
ah-0.118
Token.
Token ID13
Feature activation-6.59e-01

Pos logit contributions
rob+0.064
 batch+0.062
bar+0.058
ip+0.056
ah+0.052
Neg logit contributions
ires-0.044
erv-0.043
 gut-0.041
adoes-0.041
 confusion-0.041
Token<|endoftext|>
Token ID50256
Feature activation-4.81e+00

Pos logit contributions
ires+0.015
erv+0.015
 confusion+0.014
adoes+0.013
 gut+0.013
Neg logit contributions
rob-0.026
 batch-0.024
ip-0.022
bar-0.022
ah-0.020
TokenOnce
Token ID7454
Feature activation-1.10e+00

Pos logit contributions
ires+0.130
erv+0.126
 gut+0.123
 confusion+0.121
adoes+0.121
Neg logit contributions
rob-0.161
 batch-0.156
bar-0.145
ip-0.141
ah-0.127
Token upon
Token ID2402
Feature activation-7.93e+00

Pos logit contributions
ires+0.012
erv+0.012
adoes+0.012
 gut+0.011
 confusion+0.011
Neg logit contributions
rob-0.053
 batch-0.053
bar-0.050
ip-0.050
ah-0.047
Token a
Token ID257
Feature activation-1.24e+00

Pos logit contributions
ires+0.247
 gut+0.241
adoes+0.239
erv+0.228
 confusion+0.223
Neg logit contributions
rob-0.237
 batch-0.221
bar-0.203
P-0.197
tail-0.194
Token time
Token ID640
Feature activation-2.81e+00

Pos logit contributions
ires+0.032
erv+0.031
 gut+0.030
adoes+0.030
 confusion+0.030
Neg logit contributions
rob-0.043
 batch-0.042
bar-0.039
ip-0.038
ah-0.035
Token,
Token ID11
Feature activation-1.08e-01

Pos logit contributions
adoes+0.110
ires+0.109
 gut+0.109
 confusion+0.108
erv+0.104
Neg logit contributions
rob-0.054
 batch-0.053
bar-0.052
ip-0.051
on-0.044
Token there
Token ID612
Feature activation-2.12e+00

Pos logit contributions
ires+0.002
erv+0.002
 gut+0.002
 confusion+0.002
adoes+0.002
Neg logit contributions
rob-0.004
 batch-0.004
bar-0.004
ip-0.004
ah-0.003
Token was
Token ID373
Feature activation-1.41e-01

Pos logit contributions
ires+0.065
erv+0.063
adoes+0.063
 gut+0.062
 confusion+0.060
Neg logit contributions
rob-0.063
 batch-0.061
bar-0.056
ip-0.054
eb-0.050
Token day
Token ID1110
Feature activation+2.10e-01

Pos logit contributions
 confusion+0.041
erv+0.039
ires+0.039
 earthquakes+0.039
 stops+0.038
Neg logit contributions
rob-0.059
bar-0.052
ip-0.050
 batch-0.049
eb-0.045
Token.
Token ID13
Feature activation-7.03e-01

Pos logit contributions
rob+0.010
 batch+0.009
bar+0.009
ah+0.009
ip+0.008
Neg logit contributions
ires-0.003
erv-0.003
iring-0.002
 confusion-0.002
adoes-0.002
Token One
Token ID1881
Feature activation-8.02e-01

Pos logit contributions
ires+0.018
erv+0.017
 gut+0.017
 confusion+0.016
adoes+0.016
Neg logit contributions
rob-0.025
 batch-0.024
bar-0.023
ip-0.021
 million-0.020
Token day
Token ID1110
Feature activation+1.95e-01

Pos logit contributions
ires+0.032
erv+0.032
adoes+0.031
 gut+0.031
 confusion+0.031
Neg logit contributions
 batch-0.016
rob-0.016
bar-0.014
ip-0.013
ah-0.011
Token,
Token ID11
Feature activation-3.06e-01

Pos logit contributions
rob+0.005
ip+0.005
 batch+0.005
P+0.005
bar+0.004
Neg logit contributions
ires-0.006
erv-0.006
 gut-0.006
 confusion-0.006
adoes-0.006
Token when
Token ID618
Feature activation-5.98e+00

Pos logit contributions
ires+0.010
erv+0.010
 gut+0.009
 confusion+0.009
adoes+0.009
Neg logit contributions
rob-0.009
 batch-0.008
bar-0.008
ip-0.008
tail-0.007
Token she
Token ID673
Feature activation+1.20e+00

Pos logit contributions
ires+0.182
erv+0.176
 gut+0.173
 confusion+0.172
adoes+0.171
Neg logit contributions
rob-0.182
 batch-0.176
bar-0.163
ip-0.155
ah-0.140
Token was
Token ID373
Feature activation-3.96e-01

Pos logit contributions
 batch+0.031
rob+0.031
ip+0.027
bar+0.026
c+0.026
Neg logit contributions
ires-0.043
adoes-0.041
 gut-0.041
erv-0.039
iring-0.038
Token playing
Token ID2712
Feature activation-3.55e+00

Pos logit contributions
ires+0.013
adoes+0.013
 gut+0.012
erv+0.012
 confusion+0.012
Neg logit contributions
 batch-0.011
rob-0.010
ip-0.010
bar-0.010
ird-0.009
Token,
Token ID11
Feature activation-1.34e-01

Pos logit contributions
ires+0.099
erv+0.097
 confusion+0.093
 gut+0.091
adoes+0.091
Neg logit contributions
rob-0.115
 batch-0.115
bar-0.104
ip-0.101
tail-0.094
Token she
Token ID673
Feature activation+1.23e+00

Pos logit contributions
ires+0.005
erv+0.005
 gut+0.004
 confusion+0.004
adoes+0.004
Neg logit contributions
rob-0.003
 batch-0.003
ip-0.003
bar-0.003
P-0.003
Token thought
Token ID1807
Feature activation-1.93e+00

Pos logit contributions
rob+0.059
 batch+0.055
bar+0.052
ip+0.048
 loaf+0.045
Neg logit contributions
 confusion-0.045
erv-0.044
ires-0.044
 gut-0.043
 stops-0.042
Token of
Token ID286
Feature activation+3.14e-01

Pos logit contributions
ires+0.050
erv+0.049
 gut+0.047
adoes+0.047
 confusion+0.047
Neg logit contributions
rob-0.066
 batch-0.065
bar-0.060
ip-0.058
ah-0.053
Token all
Token ID477
Feature activation+2.53e+00

Pos logit contributions
rob+0.011
 batch+0.011
bar+0.010
ip+0.010
ah+0.009
Neg logit contributions
ires-0.008
erv-0.008
 gut-0.007
adoes-0.007
 confusion-0.007
Token the
Token ID262
Feature activation+4.38e-01

Pos logit contributions
rob+0.096
 batch+0.090
bar+0.088
ip+0.084
P+0.077
Neg logit contributions
ires-0.060
 confusion-0.056
erv-0.056
 gut-0.056
adoes-0.055
Token wonderful
Token ID7932
Feature activation-7.63e-01

Pos logit contributions
rob+0.017
 batch+0.017
bar+0.015
ip+0.015
ah+0.014
Neg logit contributions
ires-0.009
erv-0.009
adoes-0.009
 gut-0.009
iring-0.008
Token things
Token ID1243
Feature activation-5.83e+00

Pos logit contributions
ires+0.015
erv+0.015
 gut+0.014
adoes+0.014
iring+0.013
Neg logit contributions
 batch-0.031
rob-0.031
ip-0.028
bar-0.027
ah-0.026
Token she
Token ID673
Feature activation+1.72e+00

Pos logit contributions
ires+0.173
erv+0.169
 gut+0.165
adoes+0.163
 confusion+0.162
Neg logit contributions
rob-0.178
 batch-0.174
bar-0.160
ip-0.156
ah-0.139
Token had
Token ID550
Feature activation+4.25e+00

Pos logit contributions
rob+0.050
 batch+0.049
bar+0.047
ip+0.043
 million+0.038
Neg logit contributions
ires-0.054
erv-0.052
 confusion-0.051
 gut-0.050
adoes-0.048
Token learned
Token ID4499
Feature activation-3.91e-01

Pos logit contributions
rob+0.122
 batch+0.117
bar+0.107
ip+0.103
ah+0.092
Neg logit contributions
ires-0.137
erv-0.133
 gut-0.129
 confusion-0.129
adoes-0.128
Token and
Token ID290
Feature activation-8.78e-01

Pos logit contributions
ires+0.010
erv+0.010
 gut+0.010
 confusion+0.010
adoes+0.010
Neg logit contributions
rob-0.013
 batch-0.013
bar-0.011
ip-0.011
c-0.010
Token couldn
Token ID3521
Feature activation-1.60e+00

Pos logit contributions
ires+0.025
 confusion+0.024
 gut+0.024
erv+0.024
adoes+0.023
Neg logit contributions
rob-0.029
 batch-0.027
bar-0.025
ip-0.025
ah-0.023