TOP ACTIVATIONS
MAX = 8.964

When Daisy put the magn ifying glass down , the cater
He brought his magn ifying glass and looked closely .
the watch with the magn ifying glass . He saw the
He pointed his magn ifying glass at different classmates and
cried , dropping the magn ifying glass . "
said , holding the magn ifying glass over a black ant
, handing him the magn ifying glass . She looked around
and handed him the magn ifying glass . He
hat and grabbed her magn ifying glass . She was ready
As she was carrying the potatoes , she noticed that some
examine it with the magn ifying glass ?" "
the necklace and the magn ifying glass and started to examine
." So she put the potatoes into the pot and the
help . He took the potatoes from the bag and put
As she was picking the herbs , a huge lion came
too . He took the potatoes and the onion to the
! I 'll put the potatoes in the pot ." So
the necklace and the magn ifying glass and told him that
borrowed her mom 's magn ifying glass to take a closer
too much . She put Mitt ens down gently and sat

MIN ACTIVATIONS
MIN = -6.457

, a boy on a bike . She decided to sing
. " Of course ," the lady says .
and dad to a big park . There was a lake
a big house near a university . She liked to watch
and says , " Of course , you can play here
, " Yes , of course . We can come back
and says , " Of course , you can . I
teacher says , " Of course , you can . You
and said , " Of course , we forgive you .
a big place called a university . He learned lots of
and said , " Of course I forgive you , Sara
. " Of course ," Ben says . "
?" " Of course , we can play together
. " Of course , my sweet ies .
Ben said , " Of course we can be friends .
mom says , " Of course , Ben . You are
man says , " Of course . Come on the stage
?" " Of course , I forgive you ,
and said , " Of course you can . We are
. " Of course , sweet ie . But

INTERVAL 5.108 - 8.964
CONTAINS 0.014%

oven . She placed the vegetables on top of the stove
Mr . Lee put the sheep on a table and turned
into Sara and spills her coffee on Sara 's shirt .
! They both enjoyed the salty treats together . <|endoftext|> Once
kitchen . She put the bread in the to aster and

INTERVAL 1.253 - 5.108
CONTAINS 10.527%

ster continued to search for food , never worrying about getting
she stepped outside with the egg , she heard that same
. He was too busy catching frogs . The crocod ile
. Mitt ens was the happiest kitten in the world .
The lizard climbed on top of the rock and looked around

INTERVAL -2.602 - 1.253
CONTAINS 87.783%

to play with his old phone . It is big and
was scared of the snake , so he ran inside to
when something bad happened . It started to rain and the
the sweater was stained . Tim and Jen learned
not like dolls . She says , " E w ,

INTERVAL -6.457 - -2.602
CONTAINS 1.675%

is no monster in your room . It was just a
and went to Anna 's room . They played happily and
box . It was a mystery . Tom said
waiting for him at the front door . She was happy
" It 's okay , Lily . It 's just a
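
The four INTERVAL ranges above look like equal-width bins spanning MIN to MAX. That is an inference from the printed numbers, not something the dump states. The sketch below is a minimal check under that assumption: the helper name `equal_width_edges` is hypothetical, and the inputs are the rounded values as printed, so small differences in the third decimal are expected.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 evenly spaced edges from lo to hi (assumed binning)."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# MIN and MAX as printed in the dump (already rounded to 3 decimals).
edges = equal_width_edges(-6.457, 8.964, 4)

# Interval boundaries as printed in the INTERVAL lines.
listed = [-6.457, -2.602, 1.253, 5.108, 8.964]

# Largest gap between the recomputed and the listed boundaries.
max_gap = max(abs(a - b) for a, b in zip(edges, listed))

# Shares printed on the CONTAINS lines, in percent, top interval first.
shares = [0.014, 10.527, 87.783, 1.675]
total = sum(shares)

print(f"recomputed edges: {[round(e, 3) for e in edges]}")
print(f"max gap vs listed: {max_gap:.5f}")
print(f"sum of shares: {total:.3f}%")
```

If the assumption holds, the recomputed edges agree with the listed ones to within rounding, and the CONTAINS shares add up to roughly 100%.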
TOKEN DETAILS

Token When
Token ID1649
Feature activation+7.51e-01

Pos logit contributions
opsy+0.034
ulent+0.032
 herbal+0.032
apa+0.032
cious+0.031
Neg logit contributions
 poster-0.042
 elevator-0.039
 door-0.038
 bus-0.038
 doors-0.038
Token Daisy
Token ID40355
Feature activation+2.41e+00

Pos logit contributions
opsy+0.041
 herbal+0.039
apa+0.038
cious+0.038
 Cic+0.038
Neg logit contributions
 poster-0.032
 elevator-0.028
 door-0.028
 bus-0.027
 doors-0.027
Token put
Token ID1234
Feature activation+4.73e-02

Pos logit contributions
opsy+0.127
ulent+0.123
 herbal+0.121
cious+0.118
bowl+0.113
Neg logit contributions
 poster-0.112
 bus-0.105
 door-0.105
 elevator-0.103
 doors-0.100
Token the
Token ID262
Feature activation+2.24e+00

Pos logit contributions
opsy+0.002
 herbal+0.002
 Mitt+0.002
 Louie+0.002
bowl+0.002
Neg logit contributions
 poster-0.003
 officer-0.002
 bus-0.002
 rules-0.002
 inconvenience-0.002
Token magn
Token ID7842
Feature activation+3.41e+00

Pos logit contributions
opsy+0.124
 herbal+0.120
cious+0.114
ulent+0.113
bowl+0.112
Neg logit contributions
 poster-0.089
 doors-0.079
 bus-0.077
 officer-0.077
 door-0.077
Tokenifying
Token ID4035
Feature activation+8.96e+00

Pos logit contributions
opsy+0.051
 herbal+0.039
ulent+0.037
cious+0.035
apa+0.031
Neg logit contributions
 poster-0.291
 door-0.282
 elevator-0.279
 bus-0.278
 doors-0.277
Token glass
Token ID5405
Feature activation+1.73e+00

Pos logit contributions
opsy+0.422
 herbal+0.398
ifying+0.387
ulent+0.384
bowl+0.381
Neg logit contributions
 poster-0.457
 bus-0.430
 officer-0.424
 elevator-0.422
 doors-0.412
Token down
Token ID866
Feature activation+2.07e+00

Pos logit contributions
opsy+0.085
bowl+0.083
 herbal+0.083
apa+0.080
cr+0.079
Neg logit contributions
 poster-0.083
 bus-0.076
 elevator-0.074
 officer-0.073
 rules-0.070
Token,
Token ID11
Feature activation+1.56e+00

Pos logit contributions
opsy+0.098
 herbal+0.094
apa+0.093
 Madd+0.089
ulent+0.089
Neg logit contributions
 poster-0.104
 bus-0.094
 officer-0.091
 elevator-0.091
 rules-0.087
Token the
Token ID262
Feature activation+1.74e-02

Pos logit contributions
opsy+0.098
ulent+0.093
 herbal+0.093
cious+0.092
apa+0.090
Neg logit contributions
 poster-0.057
 door-0.052
 elevator-0.052
 bus-0.051
 doors-0.050
Token cater
Token ID20825
Feature activation+3.15e+00

Pos logit contributions
opsy+0.001
 herbal+0.001
ulent+0.001
cious+0.001
apa+0.001
Neg logit contributions
 poster-0.001
 elevator-0.001
 bus-0.001
 door-0.001
 doors-0.001
Token
Token ID198
Feature activation+1.27e+00

Pos logit contributions
opsy+0.064
 herbal+0.058
ulent+0.058
apa+0.058
cious+0.057
Neg logit contributions
 poster-0.064
 elevator-0.059
 bus-0.059
 door-0.059
 doors-0.058
TokenHe
Token ID1544
Feature activation-2.50e-01

Pos logit contributions
opsy+0.074
 herbal+0.070
ulent+0.070
apa+0.066
 fingertips+0.066
Neg logit contributions
 poster-0.052
 bus-0.051
 lady-0.049
 door-0.049
 elevator-0.048
Token brought
Token ID3181
Feature activation+1.42e+00

Pos logit contributions
 poster+0.013
 door+0.012
 bus+0.012
 elevator+0.012
 doors+0.012
Neg logit contributions
opsy-0.012
ulent-0.011
 herbal-0.011
cious-0.011
apa-0.010
Token his
Token ID465
Feature activation+1.96e+00

Pos logit contributions
opsy+0.094
ulent+0.089
 herbal+0.089
cious+0.089
apa+0.085
Neg logit contributions
 poster-0.047
 door-0.046
 elevator-0.045
 doors-0.044
 bus-0.043
Token magn
Token ID7842
Feature activation+3.02e+00

Pos logit contributions
opsy+0.105
ulent+0.102
 herbal+0.099
cious+0.098
apa+0.097
Neg logit contributions
 poster-0.090
 door-0.086
 elevator-0.083
 bus-0.083
 doors-0.081
Tokenifying
Token ID4035
Feature activation+8.04e+00

Pos logit contributions
opsy+0.061
 herbal+0.058
 Cic+0.050
cious+0.050
 colony+0.045
Neg logit contributions
 door-0.236
 poster-0.233
 doors-0.226
 bus-0.221
 elevator-0.220
Token glass
Token ID5405
Feature activation+1.21e+00

Pos logit contributions
opsy+0.394
 herbal+0.373
ulent+0.366
apa+0.355
cious+0.351
Neg logit contributions
 poster-0.399
 bus-0.372
 elevator-0.371
 doors-0.358
 door-0.355
Token and
Token ID290
Feature activation+1.01e+00

Pos logit contributions
opsy+0.067
bowl+0.064
 herbal+0.064
ulent+0.063
apa+0.063
Neg logit contributions
 poster-0.053
 elevator-0.048
 bus-0.047
 doors-0.044
 officer-0.044
Token looked
Token ID3114
Feature activation+1.27e-01

Pos logit contributions
opsy+0.056
ulent+0.052
ifying+0.052
 herbal+0.051
umble+0.050
Neg logit contributions
 doors-0.046
 door-0.045
 bus-0.044
 elevator-0.043
 poster-0.043
Token closely
Token ID7173
Feature activation+3.46e-01

Pos logit contributions
opsy+0.006
 herbal+0.006
ulent+0.006
cious+0.006
apa+0.006
Neg logit contributions
 poster-0.007
 elevator-0.006
 bus-0.006
 door-0.006
 doors-0.006
Token.
Token ID13
Feature activation+1.14e+00

Pos logit contributions
opsy+0.022
ulent+0.021
 herbal+0.021
cious+0.021
apa+0.021
Neg logit contributions
 poster-0.012
 elevator-0.011
 bus-0.011
 door-0.011
 doors-0.010
Token the
Token ID262
Feature activation+5.03e-01

Pos logit contributions
opsy+0.016
 Mitt+0.015
 Cic+0.015
 Madd+0.014
 Sammy+0.014
Neg logit contributions
 poster-0.017
 bus-0.015
 door-0.015
 elevator-0.015
 doors-0.014
Token watch
Token ID2342
Feature activation+1.66e+00

Pos logit contributions
opsy+0.029
 herbal+0.027
ulent+0.027
cious+0.027
 Cic+0.025
Neg logit contributions
 poster-0.020
 elevator-0.019
 bus-0.018
 doors-0.018
 door-0.018
Token with
Token ID351
Feature activation-7.21e-02

Pos logit contributions
opsy+0.105
ulent+0.099
 herbal+0.098
cious+0.097
apa+0.096
Neg logit contributions
 poster-0.061
 door-0.056
 elevator-0.056
 bus-0.055
 doors-0.053
Token the
Token ID262
Feature activation+6.17e-01

Pos logit contributions
 poster+0.003
 elevator+0.003
 bus+0.003
 door+0.003
 doors+0.003
Neg logit contributions
opsy-0.004
ulent-0.004
 herbal-0.004
cious-0.004
apa-0.004
Token magn
Token ID7842
Feature activation+2.88e+00

Pos logit contributions
opsy+0.034
 herbal+0.032
ulent+0.032
cious+0.032
apa+0.030
Neg logit contributions
 poster-0.026
 elevator-0.024
 bus-0.023
 doors-0.023
 door-0.023
Tokenifying
Token ID4035
Feature activation+7.76e+00

Pos logit contributions
opsy+0.048
 herbal+0.040
cious+0.035
ulent+0.033
 Cic+0.032
Neg logit contributions
 poster-0.241
 door-0.238
 elevator-0.230
 bus-0.230
 doors-0.229
Token glass
Token ID5405
Feature activation+1.18e+00

Pos logit contributions
opsy+0.323
 herbal+0.316
ifying+0.302
ulent+0.298
 Bro+0.292
Neg logit contributions
 poster-0.415
 bus-0.399
 elevator-0.391
 officer-0.391
 doors-0.380
Token.
Token ID13
Feature activation+2.72e-01

Pos logit contributions
opsy+0.070
 herbal+0.067
cious+0.066
ulent+0.066
apa+0.065
Neg logit contributions
 poster-0.044
 elevator-0.040
 bus-0.039
 doors-0.037
 door-0.037
Token He
Token ID679
Feature activation-2.65e-01

Pos logit contributions
opsy+0.014
ulent+0.013
 herbal+0.013
cious+0.013
apa+0.013
Neg logit contributions
 poster-0.013
 bus-0.012
 elevator-0.012
 door-0.012
 doors-0.011
Token saw
Token ID2497
Feature activation-2.39e-02

Pos logit contributions
 poster+0.011
 door+0.011
 bus+0.011
 doors+0.011
 elevator+0.010
Neg logit contributions
opsy-0.015
 herbal-0.014
ulent-0.014
cious-0.014
apa-0.014
Token the
Token ID262
Feature activation+8.25e-01

Pos logit contributions
 poster+0.001
 bus+0.001
 door+0.001
 elevator+0.001
 doors+0.001
Neg logit contributions
opsy-0.001
ulent-0.001
 herbal-0.001
cious-0.001
apa-0.001
Token
Token ID198
Feature activation-1.28e-01

Pos logit contributions
 poster+0.012
 bus+0.012
 elevator+0.012
 door+0.011
 doors+0.011
Neg logit contributions
opsy-0.009
apa-0.008
Carl-0.008
cious-0.008
ulent-0.008
TokenHe
Token ID1544
Feature activation-1.19e+00

Pos logit contributions
 poster+0.007
 door+0.006
 elevator+0.006
 bus+0.006
 doors+0.006
Neg logit contributions
opsy-0.006
cious-0.006
Carl-0.006
ulent-0.006
apa-0.006
Token pointed
Token ID6235
Feature activation+2.32e+00

Pos logit contributions
 poster+0.064
 elevator+0.059
 bus+0.059
 rules+0.058
 door+0.058
Neg logit contributions
opsy-0.052
 herbal-0.049
apa-0.048
cious-0.048
ulent-0.048
Token his
Token ID465
Feature activation+1.04e+00

Pos logit contributions
opsy+0.118
apa+0.114
ulent+0.111
 herbal+0.110
cious+0.110
Neg logit contributions
 poster-0.111
 elevator-0.099
 bus-0.098
 door-0.094
 officer-0.092
Token magn
Token ID7842
Feature activation+2.73e+00

Pos logit contributions
opsy+0.056
 herbal+0.053
ulent+0.053
cious+0.052
apa+0.052
Neg logit contributions
 poster-0.047
 elevator-0.043
 bus-0.043
 door-0.042
 doors-0.041
Tokenifying
Token ID4035
Feature activation+7.72e+00

Pos logit contributions
opsy+0.043
 herbal+0.036
cious+0.031
ulent+0.030
 Cic+0.028
Neg logit contributions
 poster-0.229
 door-0.228
 doors-0.222
 elevator-0.219
 bus-0.217
Token glass
Token ID5405
Feature activation+9.90e-01

Pos logit contributions
opsy+0.354
 herbal+0.343
ifying+0.324
ulent+0.323
apa+0.323
Neg logit contributions
 poster-0.389
 bus-0.370
 elevator-0.365
 officer-0.357
 doors-0.349
Token at
Token ID379
Feature activation+1.98e-02

Pos logit contributions
opsy+0.056
 herbal+0.054
ulent+0.053
apa+0.053
bowl+0.053
Neg logit contributions
 poster-0.040
 bus-0.036
 elevator-0.036
 officer-0.033
 doors-0.033
Token different
Token ID1180
Feature activation-1.38e+00

Pos logit contributions
opsy+0.001
apa+0.001
 Louie+0.001
 Cic+0.001
bowl+0.001
Neg logit contributions
 poster-0.001
 bus-0.001
 elevator-0.001
 officer-0.001
 door-0.001
Token classmates
Token ID28999
Feature activation+2.09e-01

Pos logit contributions
 poster+0.050
 door+0.045
 elevator+0.045
 bus+0.045
 doors+0.044
Neg logit contributions
opsy-0.088
ulent-0.083
 herbal-0.083
cious-0.082
apa-0.081
Token and
Token ID290
Feature activation-7.33e-01

Pos logit contributions
opsy+0.014
 herbal+0.013
ulent+0.013
cious+0.013
apa+0.013
Neg logit contributions
 poster-0.007
 door-0.006
 elevator-0.006
 bus-0.006
 doors-0.006
Token cried
Token ID16896
Feature activation-7.09e-01

Pos logit contributions
 poster+0.048
 door+0.046
 doors+0.045
 elevator+0.044
 bus+0.044
Neg logit contributions
opsy-0.074
 herbal-0.071
ulent-0.070
cious-0.068
apa-0.067
Token,
Token ID11
Feature activation-2.93e-01

Pos logit contributions
 door+0.016
 doors+0.014
 poster+0.014
 bus+0.014
 elevator+0.014
Neg logit contributions
opsy-0.054
cious-0.053
 herbal-0.052
ulent-0.052
bowl-0.051
Token dropping
Token ID12047
Feature activation+1.98e+00

Pos logit contributions
 poster+0.013
 door+0.013
 bus+0.013
 elevator+0.012
 doors+0.012
Neg logit contributions
opsy-0.016
 herbal-0.015
ulent-0.015
apa-0.015
cious-0.015
Token the
Token ID262
Feature activation+7.94e-01

Pos logit contributions
opsy+0.121
ulent+0.114
 herbal+0.113
cious+0.112
apa+0.110
Neg logit contributions
 poster-0.078
 door-0.073
 elevator-0.072
 bus-0.071
 doors-0.070
Token magn
Token ID7842
Feature activation+2.81e+00

Pos logit contributions
opsy+0.040
ulent+0.038
 herbal+0.037
cious+0.037
apa+0.036
Neg logit contributions
 poster-0.040
 door-0.037
 elevator-0.037
 bus-0.037
 doors-0.036
Tokenifying
Token ID4035
Feature activation+7.64e+00

Pos logit contributions
opsy+0.044
 herbal+0.038
cious+0.034
ulent+0.031
 Cic+0.029
Neg logit contributions
 poster-0.235
 door-0.234
 elevator-0.226
 doors-0.225
 bus-0.224
Token glass
Token ID5405
Feature activation+1.36e+00

Pos logit contributions
opsy+0.333
 herbal+0.324
ifying+0.306
ulent+0.305
 Cic+0.305
Neg logit contributions
 poster-0.399
 bus-0.378
 elevator-0.373
 officer-0.366
 doors-0.359
Token.
Token ID13
Feature activation-3.12e-02

Pos logit contributions
opsy+0.073
bowl+0.071
 herbal+0.070
ifying+0.070
cr+0.069
Neg logit contributions
 poster-0.057
 bus-0.050
 elevator-0.050
 doors-0.047
 officer-0.047
Token
Token ID198
Feature activation+1.92e-01

Pos logit contributions
 poster+0.001
 door+0.001
 elevator+0.001
 bus+0.001
 doors+0.001
Neg logit contributions
opsy-0.002
ulent-0.002
 herbal-0.002
cious-0.002
apa-0.002
Token
Token ID198
Feature activation+2.20e-01

Pos logit contributions
opsy+0.010
ulent+0.010
 herbal+0.010
cious+0.010
apa+0.009
Neg logit contributions
 poster-0.009
 door-0.008
 elevator-0.008
 bus-0.008
 doors-0.008
Token"
Token ID1
Feature activation-6.78e-01

Pos logit contributions
opsy+0.016
 herbal+0.016
ulent+0.015
bowl+0.015
cious+0.015
Neg logit contributions
 poster-0.006
 door-0.005
 bus-0.005
 elevator-0.005
 lady-0.005
Token said
Token ID531
Feature activation-8.40e-01

Pos logit contributions
 poster+0.047
 door+0.044
 elevator+0.043
 bus+0.043
 doors+0.042
Neg logit contributions
opsy-0.103
ulent-0.098
 herbal-0.097
ifying-0.095
cious-0.095
Token,
Token ID11
Feature activation-3.40e-01

Pos logit contributions
 door+0.021
 poster+0.021
 lady+0.019
 bus+0.018
 elevator+0.018
Neg logit contributions
opsy-0.063
 herbal-0.060
ulent-0.060
ifying-0.058
cious-0.057
Token holding
Token ID4769
Feature activation+1.82e+00

Pos logit contributions
 poster+0.016
 door+0.016
 bus+0.015
 lady+0.015
 elevator+0.015
Neg logit contributions
opsy-0.018
apa-0.017
 herbal-0.017
ulent-0.016
bowl-0.016
Token the
Token ID262
Feature activation+6.38e-01

Pos logit contributions
opsy+0.124
ulent+0.122
 herbal+0.119
cious+0.118
apa+0.115
Neg logit contributions
 poster-0.056
 elevator-0.055
 door-0.054
 doors-0.052
 bus-0.050
Token magn
Token ID7842
Feature activation+2.63e+00

Pos logit contributions
opsy+0.031
ulent+0.029
 herbal+0.029
cious+0.029
apa+0.028
Neg logit contributions
 poster-0.033
 elevator-0.030
 door-0.030
 bus-0.030
 doors-0.030
Tokenifying
Token ID4035
Feature activation+7.50e+00

Pos logit contributions
opsy+0.040
 herbal+0.035
cious+0.031
ulent+0.028
 Cic+0.027
Neg logit contributions
 poster-0.220
 door-0.218
 elevator-0.210
 doors-0.209
 bus-0.209
Token glass
Token ID5405
Feature activation+8.74e-01

Pos logit contributions
opsy+0.331
 herbal+0.318
ulent+0.306
ifying+0.297
 Cic+0.296
Neg logit contributions
 poster-0.396
 bus-0.377
 elevator-0.373
 officer-0.366
 doors-0.360
Token over
Token ID625
Feature activation+9.17e-01

Pos logit contributions
opsy+0.051
 herbal+0.049
ulent+0.048
cious+0.048
bowl+0.048
Neg logit contributions
 poster-0.035
 elevator-0.031
 bus-0.031
 doors-0.029
 officer-0.029
Token a
Token ID257
Feature activation-4.63e-01

Pos logit contributions
opsy+0.058
ulent+0.054
 herbal+0.054
cious+0.054
apa+0.053
Neg logit contributions
 poster-0.034
 door-0.031
 elevator-0.031
 bus-0.031
 doors-0.030
Token black
Token ID2042
Feature activation-2.22e-01

Pos logit contributions
 poster+0.020
 door+0.018
 bus+0.018
 elevator+0.018
 doors+0.017
Neg logit contributions
opsy-0.027
ulent-0.025
 herbal-0.025
cious-0.025
apa-0.024
Token ant
Token ID1885
Feature activation+5.50e-01

Pos logit contributions
 poster+0.009
 elevator+0.009
 bus+0.008
 doors+0.008
 door+0.008
Neg logit contributions
opsy-0.012
 herbal-0.012
ulent-0.012
cious-0.012
bowl-0.011
Token,
Token ID11
Feature activation-5.28e-01

Pos logit contributions
 poster+0.017
 door+0.015
 elevator+0.015
 bus+0.015
 doors+0.015
Neg logit contributions
opsy-0.030
 herbal-0.029
ulent-0.029
cious-0.028
apa-0.028
Token handing
Token ID22786
Feature activation+1.90e-01

Pos logit contributions
 poster+0.022
 elevator+0.020
 door+0.020
 bus+0.020
 doors+0.020
Neg logit contributions
opsy-0.031
ulent-0.029
 herbal-0.029
cious-0.028
apa-0.028
Token him
Token ID683
Feature activation+1.27e+00

Pos logit contributions
opsy+0.011
ulent+0.010
 herbal+0.010
cious+0.010
apa+0.010
Neg logit contributions
 poster-0.008
 door-0.008
 elevator-0.008
 bus-0.008
 doors-0.008
Token the
Token ID262
Feature activation+5.15e-01

Pos logit contributions
opsy+0.080
ulent+0.076
 herbal+0.075
cious+0.074
apa+0.073
Neg logit contributions
 poster-0.047
 elevator-0.043
 door-0.043
 bus-0.043
 doors-0.041
Token magn
Token ID7842
Feature activation+2.46e+00

Pos logit contributions
opsy+0.028
 herbal+0.027
ulent+0.026
cious+0.026
apa+0.025
Neg logit contributions
 poster-0.020
 doors-0.019
 elevator-0.018
 door-0.018
 officer-0.018
Tokenifying
Token ID4035
Feature activation+7.43e+00

Pos logit contributions
opsy+0.015
ulent+0.008
 herbal+0.005
cious+0.005
apa+0.003
Neg logit contributions
 poster-0.231
 elevator-0.223
 bus-0.222
 door-0.220
 doors-0.219
Token glass
Token ID5405
Feature activation+9.47e-01

Pos logit contributions
opsy+0.285
 herbal+0.281
ifying+0.281
 Bro+0.266
bowl+0.258
Neg logit contributions
 poster-0.406
 bus-0.400
 officer-0.400
 elevator-0.390
 doors-0.381
Token.
Token ID13
Feature activation+1.10e-01

Pos logit contributions
cr+0.049
bowl+0.047
opsy+0.046
 herbal+0.045
apa+0.045
Neg logit contributions
 poster-0.042
 elevator-0.038
 bus-0.037
 doors-0.036
 officer-0.034
Token She
Token ID1375
Feature activation-1.37e-01

Pos logit contributions
opsy+0.006
ulent+0.006
 herbal+0.006
cious+0.006
apa+0.006
Neg logit contributions
 poster-0.005
 door-0.004
 bus-0.004
 elevator-0.004
 lady-0.004
Token looked
Token ID3114
Feature activation+4.02e-01

Pos logit contributions
 poster+0.007
 elevator+0.006
 door+0.006
 bus+0.006
 doors+0.006
Neg logit contributions
opsy-0.007
 herbal-0.007
ulent-0.007
cious-0.007
apa-0.006
Token around
Token ID1088
Feature activation+3.55e-01

Pos logit contributions
opsy+0.016
 herbal+0.015
ulent+0.014
cious+0.014
apa+0.014
Neg logit contributions
 poster-0.023
 elevator-0.021
 bus-0.021
 rules-0.021
 door-0.021
Token and
Token ID290
Feature activation-1.44e+00

Pos logit contributions
 poster+0.047
 elevator+0.045
 doors+0.041
 stairs+0.040
 bus+0.040
Neg logit contributions
-0.035
cious-0.035
opsy-0.035
apa-0.033
ulent-0.032
Token handed
Token ID10158
Feature activation-6.72e-01

Pos logit contributions
 poster+0.072
 elevator+0.067
 bus+0.066
 door+0.066
 doors+0.065
Neg logit contributions
opsy-0.072
 herbal-0.067
ulent-0.067
cious-0.066
apa-0.065
Token him
Token ID683
Feature activation+6.97e-01

Pos logit contributions
 poster+0.028
 elevator+0.025
 bus+0.025
 door+0.025
 doors+0.024
Neg logit contributions
opsy-0.039
 herbal-0.037
ulent-0.037
cious-0.036
apa-0.036
Token the
Token ID262
Feature activation+5.10e-01

Pos logit contributions
opsy+0.043
ulent+0.040
 herbal+0.040
cious+0.040
apa+0.039
Neg logit contributions
 poster-0.027
 door-0.025
 elevator-0.025
 bus-0.024
 doors-0.024
Token magn
Token ID7842
Feature activation+2.47e+00

Pos logit contributions
opsy+0.023
ulent+0.023
cious+0.022
 herbal+0.022
cr+0.022
Neg logit contributions
iren-0.025
 elevator-0.024
 bus-0.023
 poster-0.022
round-0.021
Tokenifying
Token ID4035
Feature activation+7.31e+00

Pos logit contributions
opsy+0.032
ulent+0.026
ifying+0.022
 herbal+0.022
cious+0.022
Neg logit contributions
 poster-0.214
 elevator-0.208
 bus-0.208
 door-0.203
 lady-0.201
Token glass
Token ID5405
Feature activation+8.13e-01

Pos logit contributions
opsy+0.271
 herbal+0.270
ifying+0.266
 Bro+0.253
ulent+0.249
Neg logit contributions
 poster-0.409
 bus-0.409
 elevator-0.400
 officer-0.398
 belts-0.377
Token.
Token ID13
Feature activation-4.66e-01

Pos logit contributions
cr+0.042
opsy+0.041
bowl+0.041
 herbal+0.041
apa+0.040
Neg logit contributions
 poster-0.036
 elevator-0.033
 bus-0.033
 officer-0.030
 doors-0.030
Token
Token ID198
Feature activation+7.60e-02

Pos logit contributions
 poster+0.023
 door+0.021
 elevator+0.021
 bus+0.021
 doors+0.020
Neg logit contributions
opsy-0.024
ulent-0.023
 herbal-0.023
cious-0.022
apa-0.022
Token
Token ID198
Feature activation+6.07e-02

Pos logit contributions
opsy+0.004
 herbal+0.004
ulent+0.004
apa+0.004
cious+0.004
Neg logit contributions
 poster-0.004
 elevator-0.004
 bus-0.003
 door-0.003
 doors-0.003
TokenHe
Token ID1544
Feature activation-1.02e+00

Pos logit contributions
opsy+0.003
ulent+0.003
 herbal+0.003
cious+0.003
apa+0.003
Neg logit contributions
 poster-0.003
 door-0.003
 bus-0.003
 elevator-0.003
 lady-0.003
Token hat
Token ID6877
Feature activation+2.02e+00

Pos logit contributions
opsy+0.030
 herbal+0.029
ulent+0.028
cious+0.027
apa+0.026
Neg logit contributions
 poster-0.023
 elevator-0.021
 door-0.021
 bus-0.021
 officer-0.020
Token and
Token ID290
Feature activation+2.44e-01

Pos logit contributions
opsy+0.121
ulent+0.114
 herbal+0.113
cious+0.113
bowl+0.112
Neg logit contributions
 poster-0.078
 elevator-0.071
 bus-0.067
 doors-0.066
 door-0.066
Token grabbed
Token ID13176
Feature activation-8.96e-01

Pos logit contributions
opsy+0.013
ulent+0.013
 herbal+0.012
cious+0.012
apa+0.012
Neg logit contributions
 poster-0.011
 bus-0.011
 door-0.011
 elevator-0.010
 doors-0.010
Token her
Token ID607
Feature activation+6.20e-01

Pos logit contributions
 poster+0.041
 bus+0.037
 elevator+0.036
 door+0.036
 doors+0.034
Neg logit contributions
opsy-0.049
 herbal-0.046
ulent-0.046
apa-0.046
cious-0.045
Token magn
Token ID7842
Feature activation+2.12e+00

Pos logit contributions
opsy+0.035
 herbal+0.033
ulent+0.032
cious+0.032
apa+0.031
Neg logit contributions
 poster-0.027
 elevator-0.025
 bus-0.025
 door-0.025
 doors-0.024
Tokenifying
Token ID4035
Feature activation+7.09e+00

Pos logit contributions
opsy+0.035
 herbal+0.031
cious+0.028
 Cic+0.026
ulent+0.025
Neg logit contributions
 poster-0.173
 door-0.172
 doors-0.167
 bus-0.165
 elevator-0.164
Token glass
Token ID5405
Feature activation+4.82e-01

Pos logit contributions
opsy+0.295
 herbal+0.289
ulent+0.271
apa+0.270
ifying+0.266
Neg logit contributions
 poster-0.395
 bus-0.372
 elevator-0.368
 officer-0.365
 doors-0.352
Token.
Token ID13
Feature activation-3.87e-03

Pos logit contributions
opsy+0.027
bowl+0.026
 herbal+0.025
apa+0.025
cr+0.025
Neg logit contributions
 poster-0.021
 elevator-0.019
 bus-0.018
 officer-0.017
 doors-0.017
Token She
Token ID1375
Feature activation+7.17e-04

Pos logit contributions
 poster+0.000
 door+0.000
 elevator+0.000
 bus+0.000
 doors+0.000
Neg logit contributions
opsy-0.000
ulent-0.000
 herbal-0.000
cious-0.000
apa-0.000
Token was
Token ID373
Feature activation+4.41e-01

Pos logit contributions
opsy+0.000
ulent+0.000
 herbal+0.000
cious+0.000
apa+0.000
Neg logit contributions
 poster-0.000
 bus-0.000
 door-0.000
 elevator-0.000
 doors-0.000
Token ready
Token ID3492
Feature activation+1.75e+00

Pos logit contributions
opsy+0.022
 herbal+0.021
ulent+0.021
cious+0.020
apa+0.020
Neg logit contributions
 poster-0.022
 bus-0.020
 elevator-0.020
 door-0.020
 doors-0.019
TokenAs
Token ID1722
Feature activation+7.06e-01

Pos logit contributions
opsy+0.072
ulent+0.069
apa+0.067
 herbal+0.067
cious+0.066
Neg logit contributions
 poster-0.066
 door-0.062
 bus-0.062
 elevator-0.061
 lady-0.061
Token she
Token ID673
Feature activation+1.99e+00

Pos logit contributions
opsy+0.046
ulent+0.045
cious+0.045
 herbal+0.044
apa+0.044
Neg logit contributions
 poster-0.024
 door-0.023
 bus-0.022
 elevator-0.022
 lady-0.022
Token was
Token ID373
Feature activation+1.59e+00

Pos logit contributions
ulent+0.126
cious+0.122
opsy+0.120
apa+0.119
istered+0.118
Neg logit contributions
 elevator-0.065
 doors-0.063
 bus-0.063
 poster-0.062
 door-0.062
Token carrying
Token ID6872
Feature activation+2.35e+00

Pos logit contributions
opsy+0.082
cious+0.077
ulent+0.076
 herbal+0.076
apa+0.075
Neg logit contributions
 poster-0.074
 elevator-0.073
 bus-0.073
 door-0.073
 doors-0.069
Token the
Token ID262
Feature activation+1.61e+00

Pos logit contributions
opsy+0.151
ulent+0.143
 herbal+0.142
cious+0.141
apa+0.138
Neg logit contributions
 poster-0.084
 door-0.077
 elevator-0.077
 bus-0.075
 doors-0.075
Token potatoes
Token ID18821
Feature activation+7.03e+00

Pos logit contributions
opsy+0.085
ulent+0.080
 herbal+0.078
cious+0.078
apa+0.076
Neg logit contributions
 poster-0.076
 door-0.073
 bus-0.071
 elevator-0.070
 doors-0.070
Token,
Token ID11
Feature activation+1.62e+00

Pos logit contributions
opsy+0.455
ulent+0.435
 herbal+0.429
cious+0.427
apa+0.411
Neg logit contributions
 poster-0.244
 door-0.242
 elevator-0.228
 doors-0.222
 bus-0.222
Token she
Token ID673
Feature activation+5.42e-01

Pos logit contributions
opsy+0.108
ulent+0.104
 herbal+0.102
cious+0.102
apa+0.100
Neg logit contributions
 poster-0.054
 elevator-0.050
 door-0.050
 bus-0.049
 doors-0.048
Token noticed
Token ID6810
Feature activation+7.71e-01

Pos logit contributions
opsy+0.029
 herbal+0.029
ulent+0.029
cious+0.027
ifying+0.027
Neg logit contributions
 door-0.024
 doors-0.024
 poster-0.023
 bus-0.023
 man-0.022
Token that
Token ID326
Feature activation+1.08e+00

Pos logit contributions
opsy+0.049
ulent+0.046
 herbal+0.046
cious+0.045
apa+0.045
Neg logit contributions
 poster-0.029
 door-0.026
 elevator-0.026
 bus-0.026
 doors-0.025
Token some
Token ID617
Feature activation+8.12e-01

Pos logit contributions
opsy+0.066
ulent+0.063
 herbal+0.063
cious+0.062
apa+0.061
Neg logit contributions
 poster-0.042
 door-0.038
 elevator-0.038
 bus-0.038
 doors-0.037
Token examine
Token ID10716
Feature activation-1.11e+00

Pos logit contributions
opsy+0.024
ulent+0.023
 herbal+0.023
cious+0.023
apa+0.022
Neg logit contributions
 poster-0.024
 door-0.023
 bus-0.023
 elevator-0.022
 doors-0.022
Token it
Token ID340
Feature activation+2.30e-01

Pos logit contributions
 poster+0.048
 bus+0.046
 officer+0.044
 elevator+0.043
 lady+0.043
Neg logit contributions
opsy-0.057
ulent-0.055
bowl-0.055
 Bro-0.055
 herbal-0.054
Token with
Token ID351
Feature activation-9.73e-01

Pos logit contributions
opsy+0.017
 herbal+0.016
bowl+0.016
ulent+0.016
comb+0.015
Neg logit contributions
 poster-0.006
 elevator-0.005
 bus-0.005
 officer-0.005
 rules-0.005
Token the
Token ID262
Feature activation-1.66e-01

Pos logit contributions
 poster+0.035
 elevator+0.031
 bus+0.030
 officer+0.029
 door+0.028
Neg logit contributions
opsy-0.060
ulent-0.060
apa-0.058
 herbal-0.058
cr-0.056
Token magn
Token ID7842
Feature activation+2.25e+00

Pos logit contributions
 poster+0.008
 elevator+0.007
 bus+0.007
 officer+0.007
 doors+0.007
Neg logit contributions
opsy-0.008
 herbal-0.008
ulent-0.008
cious-0.008
 Cic-0.008
Tokenifying
Token ID4035
Feature activation+7.02e+00

Pos logit contributions
ifying+0.009
opsy+0.006
ulent+0.003
 herbal-0.003
bowl-0.004
Neg logit contributions
 poster-0.214
 elevator-0.211
 bus-0.210
 lady-0.207
 rules-0.207
Token glass
Token ID5405
Feature activation+7.18e-01

Pos logit contributions
opsy+0.277
 herbal+0.277
ifying+0.260
ulent+0.257
 Bro+0.254
Neg logit contributions
 poster-0.387
 bus-0.373
 officer-0.369
 elevator-0.366
 doors-0.354
Token?"
Token ID1701
Feature activation-2.31e+00

Pos logit contributions
bowl+0.044
cr+0.044
 herbal+0.041
opsy+0.041
comb+0.040
Neg logit contributions
 poster-0.021
 elevator-0.020
 officer-0.019
 bus-0.019
 doors-0.018
Token
Token ID198
Feature activation+7.35e-02

Pos logit contributions
 poster+0.112
 elevator+0.103
 door+0.101
 bus+0.101
 doors+0.099
Neg logit contributions
opsy-0.122
 herbal-0.112
ulent-0.110
apa-0.109
cious-0.109
Token
Token ID198
Feature activation+1.92e-01

Pos logit contributions
opsy+0.004
 herbal+0.004
ulent+0.004
cious+0.003
apa+0.003
Neg logit contributions
 poster-0.004
 elevator-0.003
 bus-0.003
 door-0.003
 doors-0.003
Token"
Token ID1
Feature activation-9.25e-01

Pos logit contributions
opsy+0.013
ulent+0.012
 herbal+0.012
cious+0.012
apa+0.012
Neg logit contributions
 poster-0.006
 bus-0.006
 door-0.006
 elevator-0.005
 lady-0.005
Token the
Token ID262
Feature activation+4.07e-01

Pos logit contributions
opsy+0.059
 herbal+0.056
 Cic+0.054
apa+0.054
ulent+0.053
Neg logit contributions
 poster-0.065
 bus-0.056
 elevator-0.056
 officer-0.056
 door-0.055
Token necklace
Token ID38987
Feature activation-9.45e-02

Pos logit contributions
opsy+0.022
 herbal+0.021
ulent+0.020
cious+0.020
bowl+0.020
Neg logit contributions
 poster-0.017
 elevator-0.016
 bus-0.015
 officer-0.015
iren-0.015
Token and
Token ID290
Feature activation-2.74e-01

Pos logit contributions
 poster+0.003
 elevator+0.003
 bus+0.003
 doors+0.003
 door+0.003
Neg logit contributions
opsy-0.006
 herbal-0.006
ulent-0.006
cious-0.006
apa-0.005
Token the
Token ID262
Feature activation-9.46e-01

Pos logit contributions
 poster+0.014
 door+0.013
 bus+0.013
 elevator+0.013
 doors+0.013
Neg logit contributions
opsy-0.014
ulent-0.013
 herbal-0.013
cious-0.012
apa-0.012
Token magn
Token ID7842
Feature activation+1.82e+00

Pos logit contributions
 poster+0.040
 elevator+0.038
iren+0.035
 bus+0.035
?"+0.035
Neg logit contributions
opsy-0.050
 herbal-0.047
cious-0.047
ulent-0.046
cr-0.045
Tokenifying
Token ID4035
Feature activation+7.00e+00

Pos logit contributions
opsy+0.031
 herbal+0.026
ulent+0.024
cious+0.024
apa+0.022
Neg logit contributions
 poster-0.150
 door-0.145
 elevator-0.143
 bus-0.143
 doors-0.142
Token glass
Token ID5405
Feature activation+3.18e-01

Pos logit contributions
opsy+0.297
 herbal+0.282
ifying+0.273
ulent+0.269
bowl+0.262
Neg logit contributions
 poster-0.381
 bus-0.362
 elevator-0.357
 officer-0.356
 doors-0.344
Token and
Token ID290
Feature activation-4.66e-01

Pos logit contributions
opsy+0.018
 herbal+0.017
cr+0.017
bowl+0.017
cious+0.017
Neg logit contributions
 poster-0.013
 elevator-0.012
 bus-0.012
 officer-0.011
 doors-0.011
Token started
Token ID2067
Feature activation+6.76e-01

Pos logit contributions
 poster+0.022
 door+0.022
 doors+0.021
 bus+0.021
 elevator+0.021
Neg logit contributions
opsy-0.024
ulent-0.023
 herbal-0.022
cious-0.022
apa-0.022
Token to
Token ID284
Feature activation-1.76e-01

Pos logit contributions
opsy+0.044
ulent+0.041
 herbal+0.041
cious+0.041
apa+0.040
Neg logit contributions
 poster-0.024
 door-0.022
 elevator-0.022
 bus-0.022
 doors-0.021
Token examine
Token ID10716
Feature activation-1.09e+00

Pos logit contributions
 poster+0.009
 bus+0.008
 door+0.008
 elevator+0.008
 doors+0.008
Neg logit contributions
opsy-0.009
apa-0.008
cious-0.008
ulent-0.008
 herbal-0.008
Token."
Token ID526
Feature activation-1.01e+00

Pos logit contributions
opsy+0.150
bowl+0.148
 herbal+0.146
apa+0.144
cr+0.141
Neg logit contributions
 poster-0.086
 officer-0.072
 elevator-0.071
 bus-0.070
 doors-0.068
Token So
Token ID1406
Feature activation+2.95e-01

Pos logit contributions
 poster+0.046
 door+0.045
 bus+0.043
 elevator+0.043
 lady+0.042
Neg logit contributions
opsy-0.053
ulent-0.051
 herbal-0.050
cious-0.050
apa-0.047
Token she
Token ID673
Feature activation-1.11e-01

Pos logit contributions
opsy+0.019
ulent+0.019
 herbal+0.018
cious+0.018
bowl+0.017
Neg logit contributions
 poster-0.010
 door-0.010
 doors-0.009
 elevator-0.009
 bus-0.009
Token put
Token ID1234
Feature activation-5.58e-01

Pos logit contributions
 poster+0.005
 door+0.005
 bus+0.005
 elevator+0.005
 doors+0.005
Neg logit contributions
opsy-0.006
ulent-0.005
 herbal-0.005
cious-0.005
ifying-0.005
Token the
Token ID262
Feature activation+1.61e+00

Pos logit contributions
 poster+0.023
 bus+0.021
 door+0.021
 elevator+0.020
 officer+0.020
Neg logit contributions
opsy-0.033
 herbal-0.031
ulent-0.030
apa-0.030
bowl-0.030
Token potatoes
Token ID18821
Feature activation+6.89e+00

Pos logit contributions
opsy+0.052
ulent+0.047
 herbal+0.046
cious+0.045
apa+0.044
Neg logit contributions
 poster-0.108
 door-0.104
 bus-0.103
 elevator-0.103
 doors-0.101
Token into
Token ID656
Feature activation-2.23e-01

Pos logit contributions
opsy+0.374
ulent+0.351
 herbal+0.350
cious+0.345
apa+0.339
Neg logit contributions
 poster-0.312
 door-0.289
 elevator-0.288
 bus-0.287
 doors-0.280
Token the
Token ID262
Feature activation-2.81e-01

Pos logit contributions
 poster+0.008
 door+0.007
 elevator+0.007
 doors+0.007
 bus+0.007
Neg logit contributions
opsy-0.014
ulent-0.014
cious-0.014
 herbal-0.013
apa-0.013
Token pot
Token ID1787
Feature activation+2.70e+00

Pos logit contributions
 poster+0.015
 elevator+0.014
 bus+0.014
 door+0.014
 doors+0.014
Neg logit contributions
opsy-0.013
 herbal-0.012
ulent-0.012
cious-0.011
apa-0.011
Token and
Token ID290
Feature activation-2.50e-01

Pos logit contributions
opsy+0.178
 herbal+0.168
ulent+0.168
cious+0.166
apa+0.164
Neg logit contributions
 poster-0.092
 elevator-0.082
 bus-0.081
 door-0.081
 doors-0.078
Token the
Token ID262
Feature activation-1.31e+00

Pos logit contributions
 door+0.011
 poster+0.011
 doors+0.011
 elevator+0.011
 bus+0.011
Neg logit contributions
opsy-0.014
ulent-0.013
cious-0.012
 herbal-0.012
ithering-0.012
Token help
Token ID1037
Feature activation-4.57e-01

Pos logit contributions
opsy+0.019
cious+0.018
ulent+0.018
apa+0.017
 herbal+0.017
Neg logit contributions
?"-0.014
 poster-0.014
 door-0.014
 bus-0.013
 elevator-0.013
Token.
Token ID13
Feature activation+2.83e-02

Pos logit contributions
 door+0.012
 poster+0.011
 lady+0.011
 doors+0.011
 rules+0.010
Neg logit contributions
opsy-0.034
ulent-0.033
cious-0.033
ifying-0.033
 herbal-0.032
Token He
Token ID679
Feature activation-5.61e-01

Pos logit contributions
opsy+0.001
 herbal+0.001
ulent+0.001
cious+0.001
apa+0.001
Neg logit contributions
 poster-0.001
 elevator-0.001
 bus-0.001
 door-0.001
 doors-0.001
Token took
Token ID1718
Feature activation+6.06e-01

Pos logit contributions
 poster+0.027
 door+0.026
 elevator+0.026
 bus+0.026
 doors+0.025
Neg logit contributions
opsy-0.028
ulent-0.027
cious-0.026
 herbal-0.026
ifying-0.025
Token the
Token ID262
Feature activation+8.77e-01

Pos logit contributions
opsy+0.039
ulent+0.037
cious+0.037
 herbal+0.036
apa+0.035
Neg logit contributions
 poster-0.021
 elevator-0.020
 door-0.020
 doors-0.020
 bus-0.020
Token potatoes
Token ID18821
Feature activation+6.87e+00

Pos logit contributions
opsy+0.031
ulent+0.028
cious+0.027
 herbal+0.027
apa+0.026
Neg logit contributions
 poster-0.057
 door-0.055
 elevator-0.054
 bus-0.054
 doors-0.053
Token from
Token ID422
Feature activation+1.09e+00

Pos logit contributions
opsy+0.463
ulent+0.447
cious+0.447
 herbal+0.436
 Cic+0.419
Neg logit contributions
 poster-0.220
 door-0.215
 elevator-0.197
 bus-0.196
 doors-0.190
Token the
Token ID262
Feature activation-5.09e-01

Pos logit contributions
opsy+0.068
 herbal+0.064
ulent+0.063
cious+0.063
apa+0.062
Neg logit contributions
 poster-0.041
 door-0.037
 elevator-0.037
 bus-0.037
 doors-0.036
Token bag
Token ID6131
Feature activation+7.62e-02

Pos logit contributions
 poster+0.023
 door+0.022
 elevator+0.022
 bus+0.021
 doors+0.021
Neg logit contributions
opsy-0.028
ulent-0.026
 herbal-0.026
cious-0.025
apa-0.025
Token and
Token ID290
Feature activation-2.90e-01

Pos logit contributions
opsy+0.005
 herbal+0.004
ulent+0.004
apa+0.004
bowl+0.004
Neg logit contributions
 poster-0.003
 elevator-0.003
 bus-0.003
 doors-0.002
 door-0.002
Token put
Token ID1234
Feature activation-5.52e-01

Pos logit contributions
 poster+0.013
 door+0.013
 elevator+0.013
 doors+0.013
 bus+0.013
Neg logit contributions
opsy-0.016
ulent-0.015
cious-0.014
 herbal-0.014
apa-0.013
Token As
Token ID1081
Feature activation+4.35e-01

Pos logit contributions
opsy+0.042
ulent+0.039
 herbal+0.039
cious+0.038
apa+0.037
Neg logit contributions
 poster-0.043
 door-0.041
 elevator-0.040
 bus-0.040
 doors-0.039
Token she
Token ID673
Feature activation+1.58e+00

Pos logit contributions
opsy+0.030
ulent+0.030
 herbal+0.029
cious+0.029
apa+0.029
Neg logit contributions
 poster-0.013
 door-0.012
 bus-0.012
 lady-0.012
 doors-0.011
Token was
Token ID373
Feature activation+1.12e+00

Pos logit contributions
ulent+0.093
ifying+0.093
apa+0.092
istered+0.092
cious+0.091
Neg logit contributions
 poster-0.056
 doors-0.053
 lady-0.052
 door-0.051
 elevator-0.051
Token picking
Token ID10868
Feature activation+3.14e+00

Pos logit contributions
ifying+0.057
opsy+0.056
 fingertips+0.054
cious+0.053
Carl+0.052
Neg logit contributions
 poster-0.051
 door-0.050
 elevator-0.049
?"-0.049
 bus-0.048
Token the
Token ID262
Feature activation+2.21e+00

Pos logit contributions
opsy+0.205
cious+0.201
ulent+0.199
 fingertips+0.195
ifying+0.193
Neg logit contributions
 poster-0.097
 door-0.097
 doors-0.094
 elevator-0.091
 old-0.090
Token herbs
Token ID25411
Feature activation+6.86e+00

Pos logit contributions
opsy+0.095
apa+0.091
ulent+0.089
 fingertips+0.089
cious+0.088
Neg logit contributions
 door-0.122
 poster-0.121
 old-0.117
 lady-0.115
 doors-0.114
Token,
Token ID11
Feature activation+1.14e+00

Pos logit contributions
opsy+0.450
cious+0.430
ulent+0.430
 herbal+0.417
apa+0.409
Neg logit contributions
 poster-0.233
 door-0.229
 doors-0.207
 elevator-0.203
?"-0.198
Token a
Token ID257
Feature activation-5.87e-01

Pos logit contributions
opsy+0.075
ulent+0.071
 herbal+0.070
cious+0.070
apa+0.069
Neg logit contributions
 poster-0.039
 door-0.036
 elevator-0.036
 bus-0.035
 doors-0.035
Token huge
Token ID3236
Feature activation-1.09e+00

Pos logit contributions
 poster+0.024
 door+0.023
 bus+0.022
 elevator+0.022
 doors+0.021
Neg logit contributions
opsy-0.034
ulent-0.033
 herbal-0.032
cious-0.032
apa-0.032
Token lion
Token ID18744
Feature activation-6.97e-01

Pos logit contributions
 poster+0.050
 elevator+0.047
 doors+0.046
 bus+0.046
 rules+0.044
Neg logit contributions
opsy-0.055
 herbal-0.053
ulent-0.052
 Cic-0.050
cious-0.049
Token came
Token ID1625
Feature activation+3.01e-02

Pos logit contributions
 poster+0.030
 bus+0.029
 door+0.029
 elevator+0.028
 doors+0.027
Neg logit contributions
opsy-0.041
 herbal-0.038
ulent-0.037
apa-0.037
cious-0.035
Token too
Token ID1165
Feature activation-1.80e+00

Pos logit contributions
opsy+0.310
ulent+0.294
 herbal+0.294
apa+0.290
cious+0.290
Neg logit contributions
 poster-0.202
 elevator-0.182
 bus-0.180
 door-0.177
 doors-0.174
Token.
Token ID13
Feature activation+2.50e-01

Pos logit contributions
 poster+0.067
 elevator+0.058
 bus+0.058
 door+0.057
 doors+0.055
Neg logit contributions
opsy-0.115
cious-0.109
ulent-0.109
 herbal-0.109
apa-0.108
Token He
Token ID679
Feature activation-5.45e-01

Pos logit contributions
opsy+0.011
ulent+0.011
 herbal+0.010
cious+0.010
apa+0.010
Neg logit contributions
 poster-0.014
 bus-0.013
 elevator-0.013
 door-0.013
 doors-0.012
Token took
Token ID1718
Feature activation+8.16e-01

Pos logit contributions
 poster+0.027
 elevator+0.026
 door+0.026
 bus+0.026
 doors+0.025
Neg logit contributions
opsy-0.027
ulent-0.025
 herbal-0.025
cious-0.024
apa-0.023
Token the
Token ID262
Feature activation+9.08e-01

Pos logit contributions
opsy+0.051
ulent+0.048
 herbal+0.048
apa+0.047
cious+0.047
Neg logit contributions
 poster-0.031
 bus-0.027
 door-0.027
 elevator-0.027
 doors-0.026
Token potatoes
Token ID18821
Feature activation+6.82e+00

Pos logit contributions
opsy+0.040
ulent+0.037
 herbal+0.037
cious+0.036
apa+0.035
Neg logit contributions
 poster-0.051
 door-0.048
 elevator-0.048
 bus-0.048
 doors-0.047
Token and
Token ID290
Feature activation+4.07e-01

Pos logit contributions
opsy+0.393
 herbal+0.369
ulent+0.368
apa+0.365
cious+0.362
Neg logit contributions
 poster-0.285
 elevator-0.260
 bus-0.258
 door-0.254
 doors-0.251
Token the
Token ID262
Feature activation-9.29e-01

Pos logit contributions
opsy+0.021
ulent+0.020
 herbal+0.019
cious+0.019
apa+0.018
Neg logit contributions
 poster-0.019
 doors-0.019
 elevator-0.019
 bus-0.019
 door-0.019
Token onion
Token ID21670
Feature activation+3.22e+00

Pos logit contributions
 poster+0.049
 elevator+0.045
 bus+0.045
 door+0.044
 doors+0.044
Neg logit contributions
opsy-0.044
 herbal-0.041
ulent-0.040
apa-0.040
cious-0.040
Token to
Token ID284
Feature activation-6.08e-01

Pos logit contributions
opsy+0.184
ulent+0.173
 herbal+0.172
cious+0.168
 Cic+0.164
Neg logit contributions
 poster-0.136
 door-0.131
 elevator-0.128
 bus-0.127
 doors-0.125
Token the
Token ID262
Feature activation-1.35e+00

Pos logit contributions
 poster+0.024
 elevator+0.023
 door+0.023
 bus+0.022
 doors+0.022
Neg logit contributions
opsy-0.036
ulent-0.035
cious-0.035
 herbal-0.034
apa-0.033
Token!
Token ID0
Feature activation-1.03e+00

Pos logit contributions
 poster+0.045
 elevator+0.042
 bus+0.040
 doors+0.039
 officer+0.038
Neg logit contributions
opsy-0.042
apa-0.039
cious-0.038
ulent-0.037
 herbal-0.037
Token I
Token ID314
Feature activation+6.95e-02

Pos logit contributions
 poster+0.042
 door+0.039
 bus+0.038
 elevator+0.038
 doors+0.037
Neg logit contributions
opsy-0.061
 herbal-0.058
cious-0.057
ulent-0.057
apa-0.055
Token'll
Token ID1183
Feature activation-1.56e-01

Pos logit contributions
opsy+0.003
ulent+0.003
 herbal+0.003
cious+0.003
apa+0.003
Neg logit contributions
 poster-0.003
 door-0.003
 bus-0.003
 elevator-0.003
 doors-0.003
Token put
Token ID1234
Feature activation-7.31e-01

Pos logit contributions
 poster+0.008
 door+0.007
 bus+0.007
 elevator+0.007
 doors+0.007
Neg logit contributions
opsy-0.008
ulent-0.007
 herbal-0.007
cious-0.007
apa-0.007
Token the
Token ID262
Feature activation+1.40e+00

Pos logit contributions
 poster+0.030
 bus+0.027
 door+0.026
 elevator+0.026
 officer+0.026
Neg logit contributions
opsy-0.043
 herbal-0.042
ulent-0.041
bowl-0.041
apa-0.041
Token potatoes
Token ID18821
Feature activation+6.81e+00

Pos logit contributions
opsy+0.044
ulent+0.039
 herbal+0.039
cious+0.038
apa+0.036
Neg logit contributions
 poster-0.096
 door-0.091
 elevator-0.091
 bus-0.091
 doors-0.089
Token in
Token ID287
Feature activation-3.22e-01

Pos logit contributions
opsy+0.317
bowl+0.300
 herbal+0.297
ulent+0.292
cr+0.290
Neg logit contributions
 poster-0.359
 elevator-0.327
 bus-0.325
 officer-0.319
 doors-0.314
Token the
Token ID262
Feature activation-6.78e-01

Pos logit contributions
 poster+0.013
 bus+0.012
 elevator+0.012
 officer+0.011
 door+0.011
Neg logit contributions
opsy-0.019
 herbal-0.019
ulent-0.018
bowl-0.018
apa-0.018
Token pot
Token ID1787
Feature activation+2.40e+00

Pos logit contributions
 poster+0.033
 bus+0.029
 elevator+0.029
 door+0.029
 officer+0.029
Neg logit contributions
opsy-0.034
 herbal-0.033
apa-0.032
ulent-0.031
cious-0.031
Token."
Token ID526
Feature activation-1.01e+00

Pos logit contributions
opsy+0.150
bowl+0.148
 herbal+0.146
apa+0.144
cr+0.141
Neg logit contributions
 poster-0.086
 officer-0.072
 elevator-0.071
 bus-0.070
 doors-0.068
Token So
Token ID1406
Feature activation+2.95e-01

Pos logit contributions
 poster+0.046
 door+0.045
 bus+0.043
 elevator+0.043
 lady+0.042
Neg logit contributions
opsy-0.053
ulent-0.051
 herbal-0.050
cious-0.050
apa-0.047
Token the
Token ID262
Feature activation+1.80e-01

Pos logit contributions
opsy+0.070
ulent+0.066
 herbal+0.066
cious+0.065
apa+0.064
Neg logit contributions
 poster-0.039
 elevator-0.036
 door-0.035
 bus-0.035
 doors-0.034
Token necklace
Token ID38987
Feature activation-1.52e-01

Pos logit contributions
 herbal+0.008
opsy+0.008
ulent+0.008
cious+0.008
+0.008
Neg logit contributions
iren-0.008
 elevator-0.008
 bus-0.008
round-0.008
 poster-0.007
Token and
Token ID290
Feature activation-5.12e-01

Pos logit contributions
 poster+0.006
 elevator+0.005
 bus+0.005
 doors+0.005
 door+0.005
Neg logit contributions
opsy-0.009
cious-0.009
 herbal-0.008
ulent-0.008
comb-0.008
Token the
Token ID262
Feature activation-1.12e+00

Pos logit contributions
 poster+0.025
 elevator+0.023
 bus+0.022
 door+0.022
 rules+0.022
Neg logit contributions
opsy-0.025
 herbal-0.024
cious-0.024
 Cic-0.023
ulent-0.023
Token magn
Token ID7842
Feature activation+1.91e+00

Pos logit contributions
 elevator+0.045
iren+0.045
 poster+0.044
 bus+0.042
?"+0.042
Neg logit contributions
opsy-0.057
 herbal-0.055
cious-0.054
ulent-0.054
cr-0.052
Tokenifying
Token ID4035
Feature activation+6.74e+00

Pos logit contributions
opsy+0.022
ulent+0.018
ifying+0.017
cious+0.015
 herbal+0.014
Neg logit contributions
 poster-0.167
 elevator-0.162
 bus-0.162
 door-0.158
 lady-0.158
Token glass
Token ID5405
Feature activation+3.22e-01

Pos logit contributions
opsy+0.262
 herbal+0.255
ifying+0.250
ulent+0.239
 Bro+0.236
Neg logit contributions
 poster-0.377
 bus-0.371
 elevator-0.364
 officer-0.361
 rules-0.345
Token and
Token ID290
Feature activation-6.43e-01

Pos logit contributions
cr+0.016
opsy+0.016
 herbal+0.016
bowl+0.016
cious+0.016
Neg logit contributions
 poster-0.014
 elevator-0.013
 bus-0.013
 officer-0.012
 doors-0.011
Token told
Token ID1297
Feature activation-1.15e+00

Pos logit contributions
 poster+0.031
 elevator+0.028
 bus+0.028
 door+0.028
 doors+0.027
Neg logit contributions
opsy-0.033
 herbal-0.031
ulent-0.031
cious-0.031
apa-0.030
Token him
Token ID683
Feature activation-9.11e-01

Pos logit contributions
 poster+0.053
 elevator+0.048
 bus+0.048
 door+0.048
 doors+0.046
Neg logit contributions
opsy-0.062
 herbal-0.058
apa-0.057
ulent-0.057
cious-0.057
Token that
Token ID326
Feature activation-7.17e-01

Pos logit contributions
 poster+0.036
 elevator+0.032
 bus+0.031
 door+0.031
 doors+0.030
Neg logit contributions
opsy-0.053
cious-0.052
apa-0.051
 herbal-0.051
ulent-0.050
Token borrowed
Token ID22546
Feature activation+4.09e-01

Pos logit contributions
opsy+0.005
ulent+0.005
 herbal+0.005
cious+0.004
apa+0.004
Neg logit contributions
 poster-0.004
 door-0.004
 bus-0.004
 doors-0.004
 elevator-0.004
Token her
Token ID607
Feature activation+9.55e-01

Pos logit contributions
opsy+0.030
cious+0.028
apa+0.027
 herbal+0.027
ulent+0.027
Neg logit contributions
 poster-0.011
 door-0.011
 doors-0.010
 lady-0.010
 elevator-0.010
Token mom
Token ID1995
Feature activation+1.19e+00

Pos logit contributions
apa+0.050
opsy+0.050
umped+0.049
ulent+0.049
amy+0.049
Neg logit contributions
 door-0.044
 poster-0.041
 old-0.038
 doors-0.037
 lady-0.037
Token's
Token ID338
Feature activation+2.58e+00

Pos logit contributions
opsy+0.091
 herbal+0.087
istered+0.086
ulent+0.086
apy+0.085
Neg logit contributions
 door-0.026
 poster-0.025
 lady-0.024
 bus-0.024
 elevator-0.023
Token magn
Token ID7842
Feature activation+1.83e+00

Pos logit contributions
opsy+0.144
apa+0.132
Carl+0.131
cious+0.131
ulent+0.129
Neg logit contributions
 door-0.121
 poster-0.111
 doors-0.107
 rules-0.107
 old-0.103
Tokenifying
Token ID4035
Feature activation+6.69e+00

Pos logit contributions
opsy+0.037
 herbal+0.034
cious+0.029
 Cic+0.029
apa+0.028
Neg logit contributions
 door-0.143
 poster-0.140
 doors-0.136
 elevator-0.131
 bus-0.131
Token glass
Token ID5405
Feature activation+3.66e-01

Pos logit contributions
opsy+0.338
 herbal+0.316
ulent+0.315
cious+0.308
apa+0.304
Neg logit contributions
 poster-0.328
 elevator-0.305
 bus-0.304
 door-0.302
 doors-0.296
Token to
Token ID284
Feature activation+1.87e-01

Pos logit contributions
opsy+0.022
 herbal+0.021
ulent+0.021
cious+0.020
apa+0.020
Neg logit contributions
 poster-0.014
 elevator-0.013
 bus-0.013
 doors-0.012
 door-0.012
Token take
Token ID1011
Feature activation+4.29e-01

Pos logit contributions
opsy+0.010
cious+0.010
apa+0.010
umped+0.010
ulent+0.010
Neg logit contributions
 poster-0.008
 door-0.008
 bus-0.008
 lady-0.007
 doors-0.007
Token a
Token ID257
Feature activation+2.28e-01

Pos logit contributions
opsy+0.029
cious+0.026
ulent+0.026
 herbal+0.026
ifying+0.026
Neg logit contributions
 door-0.014
 poster-0.013
 doors-0.013
 elevator-0.013
 lady-0.012
Token closer
Token ID5699
Feature activation-6.02e-01

Pos logit contributions
opsy+0.013
ulent+0.012
cious+0.012
apa+0.012
 herbal+0.012
Neg logit contributions
 poster-0.010
 door-0.009
 bus-0.009
 elevator-0.008
 lady-0.008
Token too
Token ID1165
Feature activation-1.56e+00

Pos logit contributions
opsy+0.301
ulent+0.281
apa+0.280
 herbal+0.279
cious+0.277
Neg logit contributions
 door-0.087
 poster-0.076
 rules-0.071
 doors-0.071
 bus-0.069
Token much
Token ID881
Feature activation+3.08e-01

Pos logit contributions
 door+0.040
 poster+0.038
 elevator+0.036
 doors+0.036
 bus+0.036
Neg logit contributions
opsy-0.114
 herbal-0.109
ulent-0.107
cious-0.106
apa-0.106
Token.
Token ID13
Feature activation+9.10e-01

Pos logit contributions
opsy+0.022
 herbal+0.021
cious+0.021
ulent+0.021
apa+0.020
Neg logit contributions
 poster-0.009
 door-0.008
 bus-0.008
 elevator-0.008
 doors-0.008
Token She
Token ID1375
Feature activation+4.27e-01

Pos logit contributions
aults+0.055
opsy+0.053
ulent+0.052
ifying+0.051
cious+0.050
Neg logit contributions
 door-0.037
 lady-0.033
 bus-0.033
 doors-0.033
 old-0.031
Token put
Token ID1234
Feature activation+3.45e-01

Pos logit contributions
ulent+0.023
 herbal+0.023
ifying+0.022
opsy+0.022
cious+0.022
Neg logit contributions
 bus-0.018
 doors-0.018
 elevator-0.018
 door-0.017
?"-0.017
Token Mitt
Token ID16627
Feature activation+6.69e+00

Pos logit contributions
opsy+0.021
cious+0.021
ulent+0.020
 herbal+0.020
apa+0.020
Neg logit contributions
 poster-0.013
 door-0.012
 elevator-0.012
 doors-0.012
 bus-0.012
Tokenens
Token ID641
Feature activation+5.31e+00

Pos logit contributions
ifying+0.401
apa+0.383
 fingertips+0.381
opsy+0.381
aults+0.377
Neg logit contributions
 lady-0.242
 man-0.236
 poster-0.223
 bus-0.222
 too-0.213
Token down
Token ID866
Feature activation+1.88e+00

Pos logit contributions
opsy+0.339
apa+0.320
 herbal+0.317
ulent+0.311
cious+0.309
Neg logit contributions
 door-0.188
 poster-0.184
 doors-0.183
 bus-0.181
 elevator-0.179
Token gently
Token ID15165
Feature activation+2.49e-01

Pos logit contributions
opsy+0.099
 herbal+0.094
ulent+0.092
 Cic+0.089
cious+0.088
Neg logit contributions
 poster-0.088
 bus-0.076
 elevator-0.075
 door-0.073
 doors-0.073
Token and
Token ID290
Feature activation-2.80e-01

Pos logit contributions
opsy+0.016
ulent+0.015
 herbal+0.015
cious+0.015
apa+0.014
Neg logit contributions
 poster-0.009
 door-0.008
 elevator-0.008
 bus-0.008
 doors-0.008
Token sat
Token ID3332
Feature activation-7.94e-02

Pos logit contributions
 bus+0.012
 poster+0.012
 lady+0.012
 door+0.012
 doors+0.012
Neg logit contributions
opsy-0.015
ulent-0.015
cious-0.014
 herbal-0.014
ifying-0.014
Token,
Token ID11
Feature activation-1.12e+00

Pos logit contributions
 poster+0.082
 elevator+0.074
 doors+0.072
 bus+0.071
 door+0.070
Neg logit contributions
opsy-0.130
ulent-0.124
cious-0.123
 herbal-0.122
bowl-0.121
Token a
Token ID257
Feature activation-2.53e+00

Pos logit contributions
 poster+0.038
 bus+0.035
 door+0.035
 elevator+0.034
 doors+0.033
Neg logit contributions
opsy-0.074
 herbal-0.070
apa-0.069
ulent-0.069
cious-0.068
Token boy
Token ID2933
Feature activation-3.16e+00

Pos logit contributions
 poster+0.114
 door+0.106
 bus+0.105
 elevator+0.104
 doors+0.101
Neg logit contributions
opsy-0.139
ulent-0.130
 herbal-0.130
cious-0.128
apa-0.127
Token on
Token ID319
Feature activation-1.76e+00

Pos logit contributions
 poster+0.152
 bus+0.143
 door+0.141
 elevator+0.141
 doors+0.136
Neg logit contributions
opsy-0.163
 herbal-0.153
ulent-0.150
apa-0.148
cious-0.146
Token a
Token ID257
Feature activation-2.86e+00

Pos logit contributions
 poster+0.081
 elevator+0.073
 bus+0.072
 door+0.072
 doors+0.071
Neg logit contributions
opsy-0.091
 herbal-0.085
ulent-0.084
cious-0.083
 Cic-0.082
Token bike
Token ID7161
Feature activation-6.24e+00

Pos logit contributions
 poster+0.118
 elevator+0.109
 door+0.106
 bus+0.105
 doors+0.105
Neg logit contributions
opsy-0.166
 herbal-0.157
ulent-0.156
cious-0.156
apa-0.150
Token.
Token ID13
Feature activation-9.86e-01

Pos logit contributions
 poster+0.278
 elevator+0.271
 doors+0.259
 rules+0.251
 door+0.249
Neg logit contributions
-0.291
bowl-0.284
cious-0.273
cr-0.271
ulent-0.270
Token She
Token ID1375
Feature activation-7.28e-01

Pos logit contributions
 poster+0.053
 elevator+0.049
 door+0.048
 bus+0.048
 doors+0.047
Neg logit contributions
opsy-0.046
ulent-0.043
 herbal-0.042
cious-0.042
apa-0.042
Token decided
Token ID3066
Feature activation+9.50e-01

Pos logit contributions
 poster+0.038
 elevator+0.036
 door+0.035
 bus+0.035
 doors+0.034
Neg logit contributions
opsy-0.035
ulent-0.032
cious-0.032
 herbal-0.032
apa-0.031
Token to
Token ID284
Feature activation-5.83e-01

Pos logit contributions
cious+0.039
opsy+0.036
+0.034
 Dee+0.034
ulent+0.034
Neg logit contributions
 poster-0.052
 elevator-0.051
 door-0.045
 doors-0.045
loo-0.045
Token sing
Token ID1702
Feature activation+6.24e-01

Pos logit contributions
 poster+0.031
 elevator+0.029
 door+0.029
 bus+0.029
 doors+0.029
Neg logit contributions
opsy-0.027
ulent-0.025
 herbal-0.025
cious-0.024
apa-0.024
Token.
Token ID13
Feature activation-1.21e-01

Pos logit contributions
 poster+0.062
 elevator+0.057
 door+0.057
 bus+0.057
 doors+0.055
Neg logit contributions
opsy-0.080
ulent-0.075
 herbal-0.075
cious-0.074
apa-0.073
Token\n
Token ID198
Feature activation+1.93e-01

Pos logit contributions
 poster+0.006
 elevator+0.006
 bus+0.006
 door+0.006
 doors+0.006
Neg logit contributions
opsy-0.005
apa-0.005
cious-0.005
ulent-0.005
 herbal-0.005
Token\n
Token ID198
Feature activation+8.99e-02

Pos logit contributions
opsy+0.009
apa+0.008
Carl+0.008
 herbal+0.008
cious+0.008
Neg logit contributions
 poster-0.011
 elevator-0.010
 bus-0.010
 door-0.010
 doors-0.010
Token"
Token ID1
Feature activation-9.88e-01

Pos logit contributions
opsy+0.005
cious+0.005
apa+0.005
ulent+0.005
 herbal+0.005
Neg logit contributions
 poster-0.004
 door-0.004
 elevator-0.004
 bus-0.004
 doors-0.003
TokenOf
Token ID5189
Feature activation-1.70e+00

Pos logit contributions
 poster+0.051
 door+0.049
 elevator+0.048
 bus+0.048
 doors+0.048
Neg logit contributions
opsy-0.048
cious-0.044
 herbal-0.044
ulent-0.043
apa-0.043
Token course
Token ID1781
Feature activation-6.11e+00

Pos logit contributions
 poster+0.075
 door+0.072
 bus+0.072
 elevator+0.067
 stairs+0.067
Neg logit contributions
opsy-0.093
-0.092
apa-0.090
ulent-0.085
 Cic-0.084
Token,"
Token ID553
Feature activation-1.77e+00

Pos logit contributions
 poster+0.331
 elevator+0.305
 bus+0.291
 doors+0.285
 door+0.281
Neg logit contributions
-0.281
apa-0.260
cious-0.257
opsy-0.257
ulent-0.252
Token the
Token ID262
Feature activation-1.77e+00

Pos logit contributions
 poster+0.084
 elevator+0.079
 bus+0.076
 door+0.076
 doors+0.074
Neg logit contributions
opsy-0.091
apa-0.085
 Cic-0.085
cious-0.082
 herbal-0.082
Token lady
Token ID10846
Feature activation-2.19e+00

Pos logit contributions
 poster+0.062
 doors+0.054
 elevator+0.054
 bus+0.053
?"+0.052
Neg logit contributions
opsy-0.110
 herbal-0.101
apa-0.100
cious-0.099
ulent-0.096
Token says
Token ID1139
Feature activation-2.24e+00

Pos logit contributions
 poster+0.087
 elevator+0.080
 bus+0.079
 door+0.078
 rules+0.076
Neg logit contributions
opsy-0.127
apa-0.122
 herbal-0.117
cious-0.117
ulent-0.117
Token.
Token ID13
Feature activation-1.28e+00

Pos logit contributions
 poster+0.077
 elevator+0.069
 bus+0.068
 door+0.067
 doors+0.065
Neg logit contributions
opsy-0.147
ulent-0.141
 herbal-0.139
cious-0.139
apa-0.137
Token and
Token ID290
Feature activation-4.51e-01

Pos logit contributions
opsy+0.035
ulent+0.033
 herbal+0.033
cious+0.032
Carl+0.031
Neg logit contributions
 poster-0.014
 door-0.013
 bus-0.013
 elevator-0.013
 doors-0.012
Token dad
Token ID9955
Feature activation-9.78e-01

Pos logit contributions
 poster+0.020
 elevator+0.018
 rules+0.018
 door+0.018
 bus+0.017
Neg logit contributions
opsy-0.025
apa-0.023
cious-0.023
 herbal-0.022
 Cic-0.022
Token to
Token ID284
Feature activation-1.34e+00

Pos logit contributions
 poster+0.029
 door+0.029
 bus+0.028
 elevator+0.027
 doors+0.026
Neg logit contributions
opsy-0.070
 herbal-0.066
ulent-0.065
Carl-0.063
cious-0.063
Token a
Token ID257
Feature activation-2.00e+00

Pos logit contributions
 poster+0.043
 bus+0.040
 elevator+0.040
 door+0.039
 lady+0.037
Neg logit contributions
opsy-0.091
cious-0.088
ulent-0.088
 herbal-0.086
apa-0.086
Token big
Token ID1263
Feature activation-2.09e+00

Pos logit contributions
 poster+0.078
 elevator+0.071
 door+0.071
 bus+0.070
 doors+0.070
Neg logit contributions
opsy-0.120
 herbal-0.113
ulent-0.112
cious-0.111
apa-0.109
Token park
Token ID3952
Feature activation-6.09e+00

Pos logit contributions
 poster+0.082
 doors+0.076
 elevator+0.075
 rules+0.074
 door+0.074
Neg logit contributions
opsy-0.117
 herbal-0.111
ulent-0.107
bowl-0.105
cious-0.105
Token.
Token ID13
Feature activation+1.64e-01

Pos logit contributions
 poster+0.237
 elevator+0.211
 doors+0.207
 bus+0.206
 door+0.204
Neg logit contributions
opsy-0.351
cious-0.341
 herbal-0.340
ulent-0.331
bowl-0.323
Token There
Token ID1318
Feature activation+1.47e-01

Pos logit contributions
opsy+0.008
ulent+0.008
 herbal+0.008
cious+0.008
apa+0.007
Neg logit contributions
 poster-0.008
 door-0.008
 bus-0.008
 elevator-0.008
 doors-0.007
Token was
Token ID373
Feature activation-2.31e-01

Pos logit contributions
ulent+0.008
opsy+0.008
ifying+0.007
 herbal+0.007
cious+0.007
Neg logit contributions
 poster-0.007
 elevator-0.007
 door-0.007
 bus-0.007
 lady-0.007
Token a
Token ID257
Feature activation-9.53e-01

Pos logit contributions
 poster+0.009
 elevator+0.009
 door+0.008
 bus+0.008
 doors+0.008
Neg logit contributions
opsy-0.014
ulent-0.013
 herbal-0.013
cious-0.013
apa-0.013
Token lake
Token ID13546
Feature activation-2.48e+00

Pos logit contributions
 poster+0.038
 elevator+0.035
 door+0.035
 bus+0.034
 doors+0.034
Neg logit contributions
opsy-0.056
 herbal-0.053
ulent-0.052
cious-0.052
apa-0.050
Token a
Token ID257
Feature activation-5.34e-01

Pos logit contributions
 poster+0.017
 elevator+0.016
 door+0.016
 bus+0.015
 doors+0.015
Neg logit contributions
opsy-0.035
apa-0.034
ulent-0.033
cious-0.033
 herbal-0.033
Token big
Token ID1263
Feature activation-1.03e+00

Pos logit contributions
 poster+0.020
 door+0.019
 bus+0.018
 elevator+0.018
 lady+0.017
Neg logit contributions
opsy-0.033
ulent-0.033
apa-0.032
cious-0.032
 herbal-0.032
Token house
Token ID2156
Feature activation-3.38e+00

Pos logit contributions
 poster+0.042
 elevator+0.038
 door+0.038
 bus+0.038
 doors+0.037
Neg logit contributions
opsy-0.060
 herbal-0.057
ulent-0.056
cious-0.056
apa-0.054
Token near
Token ID1474
Feature activation+3.53e-01

Pos logit contributions
 poster+0.193
 officer+0.178
 bus+0.176
 doors+0.175
 rules+0.174
Neg logit contributions
opsy-0.126
 herbal-0.123
ulent-0.122
cious-0.119
bowl-0.117
Token a
Token ID257
Feature activation-1.27e+00

Pos logit contributions
opsy+0.020
 herbal+0.019
ulent+0.019
cious+0.019
apa+0.019
Neg logit contributions
 poster-0.015
 bus-0.014
 elevator-0.013
 door-0.013
 doors-0.013
Token university
Token ID6403
Feature activation-6.06e+00

Pos logit contributions
 poster+0.055
 elevator+0.051
 doors+0.050
 bus+0.049
 door+0.049
Neg logit contributions
opsy-0.070
 herbal-0.065
cious-0.063
ulent-0.063
apa-0.061
Token.
Token ID13
Feature activation+5.08e-01

Pos logit contributions
 poster+0.295
iren+0.287
 bus+0.278
 officer+0.275
 elevator+0.275
Neg logit contributions
-0.220
cious-0.214
 colony-0.208
e-0.193
comb-0.184
Token She
Token ID1375
Feature activation+3.06e-01

Pos logit contributions
opsy+0.023
 herbal+0.021
ulent+0.021
cious+0.021
 Cic+0.020
Neg logit contributions
 poster-0.028
 bus-0.026
 elevator-0.026
 door-0.026
 doors-0.025
Token liked
Token ID8288
Feature activation+1.08e+00

Pos logit contributions
opsy+0.010
cious+0.009
 herbal+0.008
 roasted+0.008
 Cic+0.008
Neg logit contributions
 poster-0.021
 door-0.019
 rules-0.019
 elevator-0.019
 doors-0.019
Token to
Token ID284
Feature activation+4.39e-01

Pos logit contributions
 Cic+0.048
 herbal+0.046
opsy+0.046
cious+0.046
ulent+0.045
Neg logit contributions
 poster-0.053
 officer-0.048
 elevator-0.048
 door-0.046
 bus-0.046
Token watch
Token ID2342
Feature activation+7.98e-01

Pos logit contributions
opsy+0.013
 herbal+0.013
 hop+0.012
cious+0.012
 gra+0.012
Neg logit contributions
 poster-0.028
 door-0.026
 elevator-0.025
 doors-0.025
 officer-0.025
Token and
Token ID290
Feature activation-1.13e+00

Pos logit contributions
 poster+0.094
 elevator+0.091
 bus+0.085
 doors+0.084
 door+0.082
Neg logit contributions
opsy-0.108
cious-0.104
ulent-0.104
 herbal-0.102
ifying-0.102
Token says
Token ID1139
Feature activation-2.31e+00

Pos logit contributions
 poster+0.047
 door+0.043
 elevator+0.043
 rules+0.043
 bus+0.042
Neg logit contributions
opsy-0.062
apa-0.058
 herbal-0.058
 Cic-0.057
cious-0.057
Token,
Token ID11
Feature activation-1.15e+00

Pos logit contributions
 poster+0.107
 elevator+0.097
 door+0.097
 bus+0.096
 doors+0.095
Neg logit contributions
opsy-0.120
cious-0.117
apa-0.115
ulent-0.114
 herbal-0.113
Token "
Token ID366
Feature activation-1.24e+00

Pos logit contributions
 poster+0.059
 elevator+0.057
 doors+0.054
 bus+0.053
 door+0.052
Neg logit contributions
opsy-0.052
ulent-0.050
cious-0.049
 herbal-0.049
apa-0.049
TokenOf
Token ID5189
Feature activation-1.73e+00

Pos logit contributions
 door+0.070
 poster+0.069
 elevator+0.069
 doors+0.068
 bus+0.067
Neg logit contributions
opsy-0.053
Carl-0.050
ulent-0.049
cious-0.048
 herbal-0.048
Token course
Token ID1781
Feature activation-6.05e+00

Pos logit contributions
 poster+0.069
 door+0.067
 bus+0.065
 doors+0.062
 elevator+0.059
Neg logit contributions
opsy-0.101
apa-0.095
-0.094
ulent-0.094
 herbal-0.094
Token,
Token ID11
Feature activation-1.54e+00

Pos logit contributions
 poster+0.329
 elevator+0.298
 bus+0.290
 doors+0.290
 door+0.283
Neg logit contributions
opsy-0.264
-0.263
cious-0.261
apa-0.255
ulent-0.251
Token you
Token ID345
Feature activation-6.50e-01

Pos logit contributions
 poster+0.068
 elevator+0.060
 doors+0.059
 bus+0.058
 door+0.057
Neg logit contributions
opsy-0.084
apa-0.080
 herbal-0.079
ulent-0.078
-0.078
Token can
Token ID460
Feature activation-1.01e+00

Pos logit contributions
 poster+0.033
 elevator+0.030
 bus+0.030
 door+0.030
 doors+0.029
Neg logit contributions
opsy-0.032
ulent-0.030
 herbal-0.030
cious-0.030
apa-0.030
Token play
Token ID711
Feature activation-1.30e+00

Pos logit contributions
 poster+0.054
 elevator+0.051
 door+0.050
 bus+0.050
 doors+0.050
Neg logit contributions
opsy-0.047
ulent-0.043
 herbal-0.042
cious-0.041
apa-0.041
Token here
Token ID994
Feature activation-7.81e-01

Pos logit contributions
 poster+0.063
 bus+0.060
 door+0.060
 officer+0.060
 doors+0.060
Neg logit contributions
opsy-0.058
 herbal-0.058
Carl-0.057
bowl-0.057
ulent-0.056
Token,
Token ID11
Feature activation-7.26e-01

Pos logit contributions
 poster+0.013
 elevator+0.012
 bus+0.012
 door+0.012
 doors+0.012
Neg logit contributions
opsy-0.014
ulent-0.013
cious-0.013
 herbal-0.013
apa-0.013
Token "
Token ID366
Feature activation-1.26e+00

Pos logit contributions
 poster+0.032
 elevator+0.030
 bus+0.029
 door+0.029
 doors+0.029
Neg logit contributions
opsy-0.040
ulent-0.038
 herbal-0.038
cious-0.037
apa-0.036
TokenYes
Token ID5297
Feature activation-1.95e+00

Pos logit contributions
 elevator+0.070
 poster+0.069
 door+0.069
 doors+0.068
 bus+0.066
Neg logit contributions
opsy-0.053
cious-0.051
Carl-0.050
ulent-0.050
 herbal-0.049
Token,
Token ID11
Feature activation-1.69e+00

Pos logit contributions
 poster+0.120
 elevator+0.109
 bus+0.107
 officer+0.105
 doors+0.101
Neg logit contributions
apa-0.074
opsy-0.072
-0.067
ulent-0.063
 herbal-0.063
Token of
Token ID286
Feature activation-1.65e-01

Pos logit contributions
 poster+0.064
 elevator+0.060
 bus+0.058
 door+0.056
 doors+0.056
Neg logit contributions
opsy-0.097
 herbal-0.096
ulent-0.094
apa-0.092
cious-0.091
Token course
Token ID1781
Feature activation-6.04e+00

Pos logit contributions
 door+0.007
 officer+0.007
 poster+0.007
 bus+0.007
 doors+0.007
Neg logit contributions
 herbal-0.008
e-0.008
opsy-0.008
-0.008
amy-0.008
Token.
Token ID13
Feature activation-1.88e+00

Pos logit contributions
 poster+0.260
 elevator+0.240
 bus+0.237
 door+0.226
 officer+0.223
Neg logit contributions
opsy-0.346
apa-0.327
cious-0.326
ulent-0.326
 herbal-0.323
Token We
Token ID775
Feature activation+5.07e-01

Pos logit contributions
 poster+0.083
 door+0.077
 elevator+0.077
 bus+0.076
 doors+0.074
Neg logit contributions
opsy-0.104
 herbal-0.097
ulent-0.097
cious-0.095
apa-0.094
Token can
Token ID460
Feature activation-4.19e-01

Pos logit contributions
opsy+0.024
ulent+0.023
 herbal+0.023
cious+0.022
apa+0.022
Neg logit contributions
 poster-0.026
 elevator-0.025
 door-0.025
 bus-0.024
 doors-0.024
Token come
Token ID1282
Feature activation+3.04e-01

Pos logit contributions
 poster+0.024
 elevator+0.023
 door+0.022
 bus+0.022
 doors+0.022
Neg logit contributions
opsy-0.017
ulent-0.015
cious-0.015
apa-0.014
 herbal-0.014
Token back
Token ID736
Feature activation-1.15e+00

Pos logit contributions
opsy+0.016
ulent+0.015
cious+0.015
 herbal+0.015
bowl+0.014
Neg logit contributions
 poster-0.013
 elevator-0.013
 rules-0.012
 bus-0.012
 officer-0.012
Token and
Token ID290
Feature activation-8.66e-01

Pos logit contributions
 poster+0.083
 elevator+0.081
 bus+0.075
 doors+0.072
 door+0.070
Neg logit contributions
opsy-0.099
ulent-0.095
cious-0.094
 herbal-0.094
umped-0.092
Token says
Token ID1139
Feature activation-1.90e+00

Pos logit contributions
 poster+0.030
 elevator+0.028
 door+0.028
 bus+0.027
 doors+0.027
Neg logit contributions
opsy-0.053
 herbal-0.051
 Cic-0.050
ulent-0.050
cious-0.050
Token,
Token ID11
Feature activation-9.62e-01

Pos logit contributions
 poster+0.075
 elevator+0.068
 bus+0.068
 door+0.066
 doors+0.065
Neg logit contributions
opsy-0.113
cious-0.110
ulent-0.110
apa-0.108
 herbal-0.108
Token "
Token ID366
Feature activation-1.28e+00

Pos logit contributions
 poster+0.046
 elevator+0.045
 doors+0.043
 bus+0.042
 door+0.042
Neg logit contributions
opsy-0.047
ulent-0.046
cious-0.045
 herbal-0.044
 Cic-0.044
TokenOf
Token ID5189
Feature activation-1.82e+00

Pos logit contributions
 elevator+0.079
 door+0.077
 doors+0.076
 poster+0.074
?"+0.073
Neg logit contributions
cious-0.044
Carl-0.044
opsy-0.042
-0.042
ulent-0.042
Token course
Token ID1781
Feature activation-6.03e+00

Pos logit contributions
 poster+0.077
 bus+0.074
 door+0.073
 elevator+0.068
 stairs+0.067
Neg logit contributions
opsy-0.096
-0.093
apa-0.091
 Bro-0.091
Carl-0.090
Token,
Token ID11
Feature activation-1.48e+00

Pos logit contributions
 poster+0.332
 elevator+0.304
 bus+0.294
 doors+0.284
 officer+0.281
Neg logit contributions
opsy-0.269
apa-0.264
-0.264
cious-0.259
ulent-0.249
Token you
Token ID345
Feature activation-6.44e-01

Pos logit contributions
 poster+0.069
 elevator+0.060
iren+0.059
 doors+0.058
 bus+0.058
Neg logit contributions
opsy-0.077
 herbal-0.073
apa-0.072
 Madd-0.072
 Cic-0.071
Token can
Token ID460
Feature activation-9.28e-01

Pos logit contributions
 poster+0.037
 elevator+0.035
 bus+0.035
 doors+0.033
 lady+0.033
Neg logit contributions
-0.025
apa-0.025
opsy-0.025
ulent-0.024
're-0.024
Token.
Token ID13
Feature activation-1.51e+00

Pos logit contributions
 poster+0.045
 elevator+0.043
 bus+0.042
 doors+0.041
 door+0.041
Neg logit contributions
opsy-0.046
ulent-0.044
cious-0.043
 herbal-0.042
apa-0.042
Token I
Token ID314
Feature activation+2.46e-01

Pos logit contributions
 poster+0.074
 door+0.069
 elevator+0.069
 bus+0.068
 doors+0.067
Neg logit contributions
opsy-0.077
ulent-0.072
 herbal-0.072
cious-0.071
apa-0.069
Token teacher
Token ID4701
Feature activation-2.10e+00

Pos logit contributions
 poster+0.013
 elevator+0.012
 bus+0.012
 door+0.012
 doors+0.011
Neg logit contributions
opsy-0.018
 herbal-0.017
cious-0.017
ulent-0.017
apa-0.017
Token says
Token ID1139
Feature activation-1.79e+00

Pos logit contributions
 poster+0.069
 rules+0.068
 bus+0.067
 door+0.066
 elevator+0.066
Neg logit contributions
-0.116
apa-0.111
 herbal-0.109
cious-0.108
 Cic-0.108
Token,
Token ID11
Feature activation-9.08e-01

Pos logit contributions
 poster+0.087
 bus+0.082
 elevator+0.082
 door+0.080
 doors+0.079
Neg logit contributions
opsy-0.087
ulent-0.084
cious-0.084
 herbal-0.083
apa-0.082
Token "
Token ID366
Feature activation-1.30e+00

Pos logit contributions
 poster+0.041
 elevator+0.039
 bus+0.038
 door+0.037
 doors+0.037
Neg logit contributions
opsy-0.047
ulent-0.045
cious-0.045
 herbal-0.045
 Cic-0.044
TokenOf
Token ID5189
Feature activation-2.02e+00

Pos logit contributions
 door+0.080
 elevator+0.080
 doors+0.078
 poster+0.077
 bus+0.076
Neg logit contributions
Carl-0.047
opsy-0.046
cious-0.045
Fred-0.043
-0.042
Token course
Token ID1781
Feature activation-6.00e+00

Pos logit contributions
 poster+0.077
 door+0.075
 bus+0.073
 elevator+0.069
 doors+0.068
Neg logit contributions
opsy-0.117
Carl-0.111
 herbal-0.109
apa-0.108
 Bro-0.108
Token,
Token ID11
Feature activation-1.58e+00

Pos logit contributions
 poster+0.324
 elevator+0.300
 bus+0.293
 door+0.285
 doors+0.285
Neg logit contributions
opsy-0.271
apa-0.260
-0.257
cious-0.257
ulent-0.248
Token you
Token ID345
Feature activation-8.18e-01

Pos logit contributions
 poster+0.070
iren+0.061
 door+0.059
 elevator+0.059
 doors+0.057
Neg logit contributions
opsy-0.078
-0.076
 Cic-0.076
 Madd-0.074
apa-0.073
Token can
Token ID460
Feature activation-1.04e+00

Pos logit contributions
 poster+0.042
 bus+0.040
 elevator+0.040
 doors+0.038
 door+0.038
Neg logit contributions
opsy-0.037
ulent-0.035
apa-0.035
 herbal-0.035
bowl-0.034
Token.
Token ID13
Feature activation-1.77e+00

Pos logit contributions
 poster+0.050
 elevator+0.048
 bus+0.046
 door+0.046
 doors+0.046
Neg logit contributions
opsy-0.053
ulent-0.050
 herbal-0.049
cious-0.049
apa-0.047
Token You
Token ID921
Feature activation-2.86e-01

Pos logit contributions
 poster+0.085
 elevator+0.078
 door+0.078
 bus+0.078
 doors+0.076
Neg logit contributions
opsy-0.092
 herbal-0.086
ulent-0.086
cious-0.085
apa-0.083
Token and
Token ID290
Feature activation-1.31e+00

Pos logit contributions
 poster+0.022
 elevator+0.020
 bus+0.019
 door+0.019
 doors+0.019
Neg logit contributions
opsy-0.036
 herbal-0.034
ulent-0.034
cious-0.033
apa-0.033
Token said
Token ID531
Feature activation-9.08e-01

Pos logit contributions
 poster+0.057
 elevator+0.052
 door+0.052
 bus+0.052
 doors+0.050
Neg logit contributions
opsy-0.074
 herbal-0.070
ulent-0.069
cious-0.069
apa-0.068
Token,
Token ID11
Feature activation-1.07e+00

Pos logit contributions
 poster+0.042
 bus+0.039
 elevator+0.039
 door+0.038
 doors+0.038
Neg logit contributions
opsy-0.049
ulent-0.046
 herbal-0.046
cious-0.045
apa-0.045
Token "
Token ID366
Feature activation-1.35e+00

Pos logit contributions
 poster+0.046
 elevator+0.043
 bus+0.042
 door+0.042
 doors+0.041
Neg logit contributions
opsy-0.060
ulent-0.057
 herbal-0.056
cious-0.056
apa-0.055
TokenOf
Token ID5189
Feature activation-1.88e+00

Pos logit contributions
 poster+0.076
 elevator+0.075
 door+0.074
 doors+0.073
 bus+0.071
Neg logit contributions
opsy-0.058
Carl-0.054
ulent-0.053
 herbal-0.053
cious-0.052
Token course
Token ID1781
Feature activation-5.99e+00

Pos logit contributions
 poster+0.077
 door+0.071
 bus+0.071
 elevator+0.066
 doors+0.066
Neg logit contributions
opsy-0.109
Carl-0.103
apa-0.103
ulent-0.102
 herbal-0.102
Token,
Token ID11
Feature activation-1.54e+00

Pos logit contributions
 poster+0.315
 elevator+0.286
 bus+0.279
 doors+0.271
 door+0.268
Neg logit contributions
opsy-0.288
apa-0.278
cious-0.272
-0.269
ulent-0.268
Token we
Token ID356
Feature activation-5.82e-01

Pos logit contributions
 poster+0.067
 elevator+0.059
 doors+0.057
 bus+0.056
 door+0.056
Neg logit contributions
opsy-0.088
 herbal-0.082
apa-0.082
ulent-0.081
 Cic-0.080
Token forgive
Token ID20927
Feature activation-1.69e+00

Pos logit contributions
 poster+0.022
 door+0.021
 elevator+0.021
 bus+0.020
 doors+0.020
Neg logit contributions
opsy-0.036
ulent-0.034
cious-0.034
 herbal-0.034
apa-0.032
Token you
Token ID345
Feature activation-5.06e-01

Pos logit contributions
 poster+0.038
 elevator+0.032
 door+0.032
 bus+0.031
 doors+0.030
Neg logit contributions
opsy-0.132
ulent-0.126
 herbal-0.126
cious-0.125
apa-0.124
Token.
Token ID13
Feature activation-1.78e+00

Pos logit contributions
 poster+0.019
 elevator+0.017
 bus+0.017
 door+0.016
 doors+0.015
Neg logit contributions
opsy-0.032
ulent-0.031
 herbal-0.031
cious-0.030
apa-0.030
Token a
Token ID257
Feature activation-2.27e-01

Pos logit contributions
opsy+0.045
 herbal+0.042
 Cic+0.041
ulent+0.041
apa+0.041
Neg logit contributions
 poster-0.040
 elevator-0.036
 bus-0.036
 door-0.035
 doors-0.035
Token big
Token ID1263
Feature activation-5.55e-01

Pos logit contributions
 poster+0.009
 elevator+0.009
 doors+0.008
 door+0.008
 bus+0.008
Neg logit contributions
opsy-0.013
 herbal-0.012
ulent-0.012
cious-0.012
bowl-0.012
Token place
Token ID1295
Feature activation-2.16e+00

Pos logit contributions
 poster+0.021
 doors+0.020
 belts+0.020
 rules+0.020
 officer+0.020
Neg logit contributions
opsy-0.030
 herbal-0.029
 colony-0.028
bowl-0.028
ulent-0.027
Token called
Token ID1444
Feature activation+9.34e-01

Pos logit contributions
 poster+0.106
 officer+0.093
 doors+0.092
 elevator+0.092
 bus+0.091
Neg logit contributions
bowl-0.092
cr-0.091
opsy-0.089
 herbal-0.088
comb-0.087
Token a
Token ID257
Feature activation-1.16e+00

Pos logit contributions
opsy+0.046
ulent+0.045
 herbal+0.045
apa+0.044
cious+0.044
Neg logit contributions
 poster-0.045
 doors-0.039
 elevator-0.039
 bus-0.039
 door-0.039
Token university
Token ID6403
Feature activation-5.98e+00

Pos logit contributions
 poster+0.047
 elevator+0.043
 doors+0.042
 bus+0.042
 door+0.042
Neg logit contributions
opsy-0.067
 herbal-0.064
ulent-0.062
cious-0.062
apa-0.061
Token.
Token ID13
Feature activation+6.62e-01

Pos logit contributions
 poster+0.264
 yourselves+0.241
 elevator+0.234
 doors+0.228
iren+0.227
Neg logit contributions
-0.285
cious-0.261
comb-0.237
cr-0.234
ifying-0.234
Token He
Token ID679
Feature activation-1.79e-01

Pos logit contributions
opsy+0.025
ulent+0.024
 herbal+0.024
apa+0.023
cious+0.023
Neg logit contributions
 poster-0.041
 bus-0.036
 elevator-0.036
 door-0.035
 doors-0.035
Token learned
Token ID4499
Feature activation-2.90e-02

Pos logit contributions
 poster+0.011
 rules+0.010
 elevator+0.010
 door+0.010
 bus+0.009
Neg logit contributions
opsy-0.007
cious-0.006
apa-0.006
 herbal-0.006
ulent-0.006
Token lots
Token ID6041
Feature activation-7.00e-01

Pos logit contributions
 poster+0.001
 bus+0.001
 elevator+0.001
 lady+0.001
 door+0.001
Neg logit contributions
 herbal-0.001
opsy-0.001
apa-0.001
ulent-0.001
-0.001
Token of
Token ID286
Feature activation+3.59e-01

Pos logit contributions
 poster+0.036
 elevator+0.032
 bus+0.032
 door+0.031
 doors+0.031
Neg logit contributions
opsy-0.032
 herbal-0.031
ulent-0.030
cious-0.030
apa-0.029
Token and
Token ID290
Feature activation-8.70e-01

Pos logit contributions
 poster+0.018
 elevator+0.016
 bus+0.016
 door+0.015
 doors+0.015
Neg logit contributions
opsy-0.028
ulent-0.027
 herbal-0.027
apa-0.026
cious-0.026
Token said
Token ID531
Feature activation-8.92e-01

Pos logit contributions
 poster+0.039
 elevator+0.035
 door+0.035
 bus+0.035
 doors+0.034
Neg logit contributions
opsy-0.048
 herbal-0.045
ulent-0.045
cious-0.044
apa-0.044
Token,
Token ID11
Feature activation-9.40e-01

Pos logit contributions
 poster+0.043
 bus+0.040
 elevator+0.039
 door+0.039
 doors+0.038
Neg logit contributions
opsy-0.046
apa-0.044
 herbal-0.043
ulent-0.043
cious-0.043
Token "
Token ID366
Feature activation-1.37e+00

Pos logit contributions
 poster+0.041
 elevator+0.038
 bus+0.037
 door+0.037
 doors+0.037
Neg logit contributions
opsy-0.052
ulent-0.050
 herbal-0.049
cious-0.049
apa-0.048
TokenOf
Token ID5189
Feature activation-1.95e+00

Pos logit contributions
 poster+0.079
 elevator+0.079
 door+0.078
 doors+0.076
 bus+0.076
Neg logit contributions
opsy-0.056
Carl-0.052
cious-0.050
 herbal-0.050
ulent-0.049
Token course
Token ID1781
Feature activation-5.97e+00

Pos logit contributions
 poster+0.073
 bus+0.068
 door+0.068
 elevator+0.063
 doors+0.062
Neg logit contributions
opsy-0.118
apa-0.110
 herbal-0.110
Carl-0.109
 Cic-0.108
Token I
Token ID314
Feature activation-7.37e-01

Pos logit contributions
 poster+0.323
 elevator+0.297
 bus+0.290
 doors+0.281
 door+0.279
Neg logit contributions
opsy-0.277
apa-0.265
cious-0.257
-0.255
ulent-0.251
Token forgive
Token ID20927
Feature activation-1.68e+00

Pos logit contributions
 poster+0.034
 door+0.032
 bus+0.032
 elevator+0.031
 lady+0.031
Neg logit contributions
opsy-0.040
ulent-0.038
 herbal-0.037
cious-0.037
apa-0.035
Token you
Token ID345
Feature activation-5.88e-01

Pos logit contributions
 poster+0.056
 elevator+0.052
 bus+0.051
 door+0.049
 doors+0.045
Neg logit contributions
opsy-0.111
ulent-0.106
cious-0.105
 herbal-0.104
apa-0.104
Token,
Token ID11
Feature activation-1.56e+00

Pos logit contributions
 poster+0.026
 elevator+0.024
 bus+0.023
 door+0.022
 doors+0.022
Neg logit contributions
opsy-0.033
ulent-0.031
 herbal-0.031
apa-0.031
cious-0.031
Token Sara
Token ID24799
Feature activation-2.04e+00

Pos logit contributions
 poster+0.068
 elevator+0.062
iren+0.061
 doors+0.058
 bus+0.054
Neg logit contributions
opsy-0.082
 Mitt-0.080
 Cic-0.077
-0.077
 Madd-0.077
Token.
Token ID13
Feature activation-1.24e+00

Pos logit contributions
 poster+0.059
 door+0.056
 doors+0.053
 elevator+0.052
 bus+0.050
Neg logit contributions
opsy-0.206
ulent-0.199
 herbal-0.199
cious-0.196
bowl-0.192
Token
Token ID198
Feature activation-6.70e-01

Pos logit contributions
 poster+0.061
 elevator+0.058
 bus+0.057
 door+0.056
 doors+0.055
Neg logit contributions
opsy-0.062
 herbal-0.058
ulent-0.058
cious-0.057
apa-0.056
Token
Token ID198
Feature activation-6.15e-01

Pos logit contributions
 poster+0.035
 elevator+0.033
 bus+0.033
 door+0.033
 doors+0.032
Neg logit contributions
opsy-0.032
apa-0.029
 herbal-0.029
ulent-0.028
cious-0.028
Token"
Token ID1
Feature activation-1.35e+00

Pos logit contributions
 poster+0.020
 elevator+0.018
 door+0.018
 bus+0.018
 doors+0.017
Neg logit contributions
opsy-0.041
ulent-0.039
 herbal-0.039
cious-0.039
apa-0.038
TokenOf
Token ID5189
Feature activation-1.96e+00

Pos logit contributions
 poster+0.071
 door+0.070
 elevator+0.069
 doors+0.068
 bus+0.068
Neg logit contributions
opsy-0.060
 herbal-0.057
Carl-0.056
cious-0.056
ulent-0.056
Token course
Token ID1781
Feature activation-5.97e+00

Pos logit contributions
 poster+0.076
 bus+0.073
 door+0.071
 elevator+0.069
 stairs+0.067
Neg logit contributions
opsy-0.113
apa-0.107
-0.105
 herbal-0.104
ulent-0.104
Token,"
Token ID553
Feature activation-2.17e+00

Pos logit contributions
 poster+0.291
 elevator+0.273
 bus+0.268
 doors+0.256
 door+0.252
Neg logit contributions
opsy-0.299
apa-0.291
-0.286
cious-0.280
ulent-0.280
Token Ben
Token ID3932
Feature activation-4.12e-01

Pos logit contributions
 poster+0.103
 elevator+0.095
 door+0.094
 bus+0.094
 doors+0.092
Neg logit contributions
opsy-0.116
ulent-0.110
 herbal-0.109
cious-0.108
apa-0.105
Token says
Token ID1139
Feature activation-2.74e+00

Pos logit contributions
 elevator+0.017
 poster+0.016
 bus+0.015
 doors+0.015
 door+0.015
Neg logit contributions
-0.023
opsy-0.021
 herbal-0.021
 Cic-0.021
apa-0.021
Token.
Token ID13
Feature activation-1.71e+00

Pos logit contributions
 poster+0.118
 bus+0.108
 elevator+0.108
 door+0.097
 doors+0.097
Neg logit contributions
opsy-0.148
ulent-0.145
apa-0.145
 herbal-0.142
cious-0.139
Token "
Token ID366
Feature activation-9.42e-01

Pos logit contributions
 poster+0.100
 elevator+0.097
 bus+0.094
 doors+0.090
 officer+0.085
Neg logit contributions
opsy-0.064
 Cic-0.062
ulent-0.062
 herbal-0.061
Carl-0.061
Token?"
Token ID1701
Feature activation-1.92e+00

Pos logit contributions
 poster+0.079
 elevator+0.074
 bus+0.070
 officer+0.070
 doors+0.069
Neg logit contributions
opsy-0.172
bowl-0.164
-0.163
 herbal-0.160
cr-0.158
Token
Token ID198
Feature activation-3.17e-01

Pos logit contributions
 poster+0.100
 elevator+0.093
 door+0.093
 bus+0.092
 doors+0.091
Neg logit contributions
opsy-0.094
apa-0.085
 herbal-0.085
ulent-0.084
cious-0.083
Token
Token ID198
Feature activation-2.23e-01

Pos logit contributions
 poster+0.017
 elevator+0.016
 door+0.016
 bus+0.016
 doors+0.016
Neg logit contributions
opsy-0.015
apa-0.014
 herbal-0.013
Carl-0.013
ulent-0.013
Token"
Token ID1
Feature activation-1.22e+00

Pos logit contributions
 poster+0.011
 door+0.010
 elevator+0.010
 bus+0.010
 doors+0.010
Neg logit contributions
opsy-0.011
ulent-0.010
 herbal-0.010
cious-0.010
apa-0.010
TokenOf
Token ID5189
Feature activation-1.69e+00

Pos logit contributions
 poster+0.063
 door+0.061
 elevator+0.060
 doors+0.060
 bus+0.059
Neg logit contributions
opsy-0.057
 herbal-0.054
ulent-0.053
cious-0.053
Carl-0.052
Token course
Token ID1781
Feature activation-5.97e+00

Pos logit contributions
 poster+0.078
 bus+0.076
 stairs+0.076
 door+0.075
 elevator+0.073
Neg logit contributions
opsy-0.087
-0.087
apa-0.081
ulent-0.081
e-0.079
Token,
Token ID11
Feature activation-1.33e+00

Pos logit contributions
 poster+0.299
 elevator+0.281
 doors+0.259
 bus+0.257
iren+0.253
Neg logit contributions
-0.306
cious-0.278
apa-0.278
opsy-0.277
ulent-0.276
Token we
Token ID356
Feature activation-3.63e-01

Pos logit contributions
 poster+0.064
 elevator+0.058
 doors+0.056
 door+0.054
 bus+0.053
Neg logit contributions
opsy-0.068
apa-0.063
 herbal-0.063
 Cic-0.063
ulent-0.063
Token can
Token ID460
Feature activation-8.27e-01

Pos logit contributions
 poster+0.014
 door+0.014
 elevator+0.013
 bus+0.013
 doors+0.013
Neg logit contributions
opsy-0.022
ulent-0.021
 herbal-0.021
cious-0.021
apa-0.020
Token play
Token ID711
Feature activation-1.23e+00

Pos logit contributions
 poster+0.037
 elevator+0.034
 bus+0.034
 door+0.034
 doors+0.033
Neg logit contributions
opsy-0.046
ulent-0.043
 herbal-0.043
cious-0.042
apa-0.042
Token together
Token ID1978
Feature activation-2.62e+00

Pos logit contributions
 poster+0.060
 bus+0.059
 door+0.059
 elevator+0.059
 doors+0.057
Neg logit contributions
opsy-0.055
 herbal-0.054
ulent-0.053
bowl-0.053
Carl-0.052
Token.
Token ID13
Feature activation-9.17e-01

Pos logit contributions
 poster+0.029
 bus+0.027
 elevator+0.027
 door+0.027
 doors+0.026
Neg logit contributions
opsy-0.028
apa-0.027
cious-0.026
ulent-0.026
 herbal-0.026
Token
Token ID198
Feature activation-2.19e-01

Pos logit contributions
 poster+0.049
 elevator+0.045
 bus+0.045
 doors+0.043
 door+0.043
Neg logit contributions
opsy-0.042
 herbal-0.039
apa-0.038
ulent-0.038
cious-0.038
Token
Token ID198
Feature activation-1.68e-01

Pos logit contributions
 poster+0.011
 elevator+0.010
 bus+0.010
 door+0.010
 doors+0.010
Neg logit contributions
opsy-0.011
apa-0.010
 herbal-0.010
ulent-0.010
cious-0.010
Token"
Token ID1
Feature activation-1.23e+00

Pos logit contributions
 poster+0.006
 elevator+0.006
 door+0.006
 bus+0.006
 doors+0.006
Neg logit contributions
opsy-0.010
 herbal-0.010
ulent-0.010
cious-0.010
apa-0.010
TokenOf
Token ID5189
Feature activation-1.77e+00

Pos logit contributions
 poster+0.065
 door+0.064
 elevator+0.064
 doors+0.063
 bus+0.062
Neg logit contributions
opsy-0.056
 herbal-0.053
ulent-0.052
cious-0.052
Carl-0.051
Token course
Token ID1781
Feature activation-5.96e+00

Pos logit contributions
 poster+0.071
 door+0.067
 bus+0.067
 elevator+0.064
 doors+0.062
Neg logit contributions
opsy-0.101
ulent-0.095
-0.095
apa-0.095
Carl-0.094
Token,
Token ID11
Feature activation-1.51e+00

Pos logit contributions
 poster+0.346
 elevator+0.321
 bus+0.313
 doors+0.302
 door+0.300
Neg logit contributions
opsy-0.246
-0.244
apa-0.237
cious-0.232
ulent-0.226
Token my
Token ID616
Feature activation-1.51e+00

Pos logit contributions
 poster+0.069
 elevator+0.059
 bus+0.057
iren+0.056
 doors+0.056
Neg logit contributions
opsy-0.078
-0.076
ulent-0.074
apa-0.073
 Mitt-0.072
Token sweet
Token ID6029
Feature activation+1.48e+00

Pos logit contributions
 poster+0.080
 doors+0.074
 door+0.071
 elevator+0.071
 bus+0.069
Neg logit contributions
opsy-0.072
apa-0.067
cious-0.064
 herbal-0.064
ulent-0.062
Tokenies
Token ID444
Feature activation+1.49e+00

Pos logit contributions
opsy+0.078
apa+0.070
 herbal+0.068
umble+0.065
oppy+0.064
Neg logit contributions
 poster-0.073
 doors-0.065
 elevator-0.065
 belts-0.060
 bus-0.060
Token.
Token ID13
Feature activation-1.88e+00

Pos logit contributions
opsy+0.080
 herbal+0.075
comb+0.074
bowl+0.074
apa+0.074
Neg logit contributions
 poster-0.064
 doors-0.061
 elevator-0.060
 door-0.058
 bus-0.058
Token Ben
Token ID3932
Feature activation-4.86e-02

Pos logit contributions
 poster+0.072
 elevator+0.068
 door+0.068
 bus+0.068
 doors+0.066
Neg logit contributions
opsy-0.058
ulent-0.054
 herbal-0.053
cious-0.053
apa-0.052
Token said
Token ID531
Feature activation-8.36e-01

Pos logit contributions
 poster+0.002
 bus+0.002
 door+0.002
 elevator+0.002
 doors+0.002
Neg logit contributions
opsy-0.003
ulent-0.003
 herbal-0.003
cious-0.002
apa-0.002
Token,
Token ID11
Feature activation-8.24e-01

Pos logit contributions
 poster+0.040
 elevator+0.037
 bus+0.037
 door+0.036
 doors+0.035
Neg logit contributions
opsy-0.043
ulent-0.041
 herbal-0.041
cious-0.041
apa-0.040
Token "
Token ID366
Feature activation-1.27e+00

Pos logit contributions
 poster+0.035
 elevator+0.033
 door+0.032
 bus+0.032
 doors+0.032
Neg logit contributions
opsy-0.046
ulent-0.044
cious-0.044
 herbal-0.044
apa-0.042
TokenOf
Token ID5189
Feature activation-2.06e+00

Pos logit contributions
 poster+0.062
 elevator+0.059
 door+0.059
 bus+0.058
 doors+0.057
Neg logit contributions
opsy-0.065
ulent-0.061
cious-0.061
 herbal-0.061
Carl-0.058
Token course
Token ID1781
Feature activation-5.96e+00

Pos logit contributions
 poster+0.071
 door+0.067
 bus+0.066
 elevator+0.065
 doors+0.063
Neg logit contributions
opsy-0.130
 herbal-0.121
ulent-0.120
apa-0.120
cious-0.118
Token we
Token ID356
Feature activation-6.69e-01

Pos logit contributions
 poster+0.297
 elevator+0.279
 bus+0.270
 doors+0.261
 door+0.259
Neg logit contributions
opsy-0.300
cious-0.284
apa-0.283
ulent-0.281
 herbal-0.275
Token can
Token ID460
Feature activation-1.02e+00

Pos logit contributions
 poster+0.027
 door+0.025
 bus+0.025
 elevator+0.025
 doors+0.024
Neg logit contributions
opsy-0.040
ulent-0.038
 herbal-0.038
cious-0.037
apa-0.036
Token be
Token ID307
Feature activation-1.95e+00

Pos logit contributions
 poster+0.033
 elevator+0.030
 bus+0.029
 door+0.029
 doors+0.028
Neg logit contributions
opsy-0.070
ulent-0.066
 herbal-0.066
cious-0.065
apa-0.064
Token friends
Token ID2460
Feature activation+1.10e+00

Pos logit contributions
 poster+0.096
 elevator+0.089
 door+0.088
 bus+0.088
 doors+0.086
Neg logit contributions
opsy-0.101
 herbal-0.094
cious-0.093
ulent-0.093
bowl-0.089
Token.
Token ID13
Feature activation-1.42e+00

Pos logit contributions
opsy+0.060
ulent+0.059
+0.058
cious+0.058
bowl+0.058
Neg logit contributions
 elevator-0.046
 poster-0.046
 bus-0.043
 doors-0.043
 door-0.042
Token mom
Token ID1995
Feature activation-3.50e-01

Pos logit contributions
opsy+0.068
 fingertips+0.060
apa+0.059
 herbal+0.059
ulent+0.058
Neg logit contributions
 poster-0.072
 elevator-0.063
 officer-0.062
 bus-0.060
?"-0.060
Token says
Token ID1139
Feature activation-1.74e+00

Pos logit contributions
 yours+0.013
 poster+0.012
 bus+0.012
 elevator+0.012
 rules+0.012
Neg logit contributions
-0.017
 Cic-0.017
opsy-0.017
apa-0.017
 herbal-0.016
Token,
Token ID11
Feature activation-6.27e-01

Pos logit contributions
 poster+0.076
 elevator+0.070
 bus+0.070
 door+0.070
 doors+0.068
Neg logit contributions
opsy-0.098
ulent-0.092
 herbal-0.092
cious-0.091
apa-0.090
Token "
Token ID366
Feature activation-1.12e+00

Pos logit contributions
 poster+0.027
 elevator+0.025
 bus+0.025
 door+0.025
 doors+0.024
Neg logit contributions
opsy-0.035
ulent-0.033
cious-0.033
 herbal-0.033
apa-0.032
TokenOf
Token ID5189
Feature activation-2.09e+00

Pos logit contributions
 elevator+0.061
 poster+0.061
 door+0.059
 doors+0.058
 bus+0.058
Neg logit contributions
opsy-0.049
Carl-0.046
 herbal-0.046
cious-0.046
ulent-0.046
Token course
Token ID1781
Feature activation-5.96e+00

Pos logit contributions
 poster+0.072
 bus+0.066
 door+0.066
 elevator+0.064
 doors+0.062
Neg logit contributions
opsy-0.134
ulent-0.125
 herbal-0.125
apa-0.124
Carl-0.124
Token,
Token ID11
Feature activation-1.45e+00

Pos logit contributions
 poster+0.307
 elevator+0.287
 bus+0.279
 doors+0.270
 door+0.268
Neg logit contributions
opsy-0.294
apa-0.273
cious-0.270
ulent-0.269
 herbal-0.266
Token Ben
Token ID3932
Feature activation-9.07e-01

Pos logit contributions
 poster+0.068
 elevator+0.064
 bus+0.062
 door+0.061
 doors+0.061
Neg logit contributions
opsy-0.077
 herbal-0.071
ulent-0.071
apa-0.070
 Cic-0.069
Token.
Token ID13
Feature activation-1.92e+00

Pos logit contributions
 poster+0.037
 elevator+0.035
 bus+0.034
 doors+0.033
 door+0.032
Neg logit contributions
opsy-0.054
cious-0.051
 herbal-0.051
ulent-0.050
apa-0.050
Token You
Token ID921
Feature activation-4.23e-01

Pos logit contributions
 poster+0.074
 door+0.069
 elevator+0.067
 bus+0.067
 doors+0.065
Neg logit contributions
opsy-0.116
 herbal-0.110
ulent-0.108
cious-0.108
apa-0.105
Token are
Token ID389
Feature activation-1.07e+00

Pos logit contributions
 poster+0.023
 elevator+0.022
 bus+0.022
 door+0.021
 doors+0.021
Neg logit contributions
opsy-0.018
ulent-0.017
bowl-0.017
 herbal-0.017
apa-0.017
Token man
Token ID582
Feature activation-1.77e+00

Pos logit contributions
 yours+0.034
 yourselves+0.026
 ours+0.025
 my+0.023
 poster+0.023
Neg logit contributions
opsy-0.061
 herbal-0.059
cious-0.058
apa-0.057
 Louie-0.057
Token says
Token ID1139
Feature activation-1.71e+00

Pos logit contributions
 poster+0.060
 door+0.056
 elevator+0.056
 bus+0.054
 rules+0.054
Neg logit contributions
opsy-0.104
 herbal-0.099
apa-0.098
-0.098
 Cic-0.098
Token,
Token ID11
Feature activation-6.50e-01

Pos logit contributions
 poster+0.075
 elevator+0.069
 door+0.069
 bus+0.069
 doors+0.067
Neg logit contributions
opsy-0.095
ulent-0.090
 herbal-0.089
cious-0.088
apa-0.087
Token "
Token ID366
Feature activation-1.15e+00

Pos logit contributions
 poster+0.027
 elevator+0.025
 door+0.024
 bus+0.024
 doors+0.024
Neg logit contributions
opsy-0.038
ulent-0.036
 herbal-0.036
cious-0.035
apa-0.035
TokenOf
Token ID5189
Feature activation-2.04e+00

Pos logit contributions
 door+0.067
 elevator+0.067
 poster+0.066
 doors+0.065
 bus+0.063
Neg logit contributions
cious-0.043
Carl-0.043
opsy-0.043
ulent-0.042
-0.042
Token course
Token ID1781
Feature activation-5.95e+00

Pos logit contributions
 poster+0.085
 door+0.082
 bus+0.081
 stairs+0.076
 elevator+0.076
Neg logit contributions
opsy-0.107
Carl-0.103
-0.102
e-0.101
 Bro-0.101
Token.
Token ID13
Feature activation-1.76e+00

Pos logit contributions
 poster+0.286
 elevator+0.260
 bus+0.250
 doors+0.241
 door+0.240
Neg logit contributions
opsy-0.306
cious-0.298
apa-0.297
-0.293
ulent-0.290
Token Come
Token ID7911
Feature activation+7.00e-01

Pos logit contributions
 poster+0.082
 elevator+0.076
 door+0.076
 bus+0.076
 doors+0.074
Neg logit contributions
opsy-0.094
ulent-0.088
 herbal-0.087
cious-0.086
apa-0.085
Token on
Token ID319
Feature activation-3.27e-01

Pos logit contributions
opsy+0.040
ulent+0.037
 herbal+0.037
cious+0.037
apa+0.036
Neg logit contributions
 poster-0.030
 elevator-0.028
 door-0.027
 bus-0.027
 doors-0.027
Token the
Token ID262
Feature activation-1.76e+00

Pos logit contributions
 poster+0.016
 elevator+0.015
 bus+0.015
 doors+0.014
 door+0.014
Neg logit contributions
opsy-0.017
apa-0.015
 herbal-0.015
ulent-0.015
cious-0.015
Token stage
Token ID3800
Feature activation-3.52e+00

Pos logit contributions
 poster+0.069
?"+0.065
 elevator+0.064
 doors+0.062
 officer+0.061
Neg logit contributions
 herbal-0.087
cious-0.086
opsy-0.080
aults-0.080
ulent-0.077
Token?"
Token ID1701
Feature activation-3.24e+00

Pos logit contributions
 poster+0.026
 elevator+0.024
 doors+0.022
 door+0.022
 bus+0.022
Neg logit contributions
opsy-0.066
ulent-0.063
 herbal-0.063
bowl-0.062
apa-0.060
Token
Token ID198
Feature activation-9.33e-01

Pos logit contributions
 poster+0.130
 door+0.119
 bus+0.116
 elevator+0.115
 lady+0.115
Neg logit contributions
opsy-0.191
ulent-0.185
 herbal-0.184
cious-0.183
apa-0.178
Token
Token ID198
Feature activation-8.01e-01

Pos logit contributions
 poster+0.038
 door+0.035
 bus+0.034
 elevator+0.034
 lady+0.033
Neg logit contributions
opsy-0.054
ulent-0.053
 herbal-0.052
cious-0.052
apa-0.050
Token"
Token ID1
Feature activation-1.65e+00

Pos logit contributions
 poster+0.042
 lady+0.038
 bus+0.037
 door+0.037
 man+0.037
Neg logit contributions
opsy-0.038
ulent-0.037
 herbal-0.037
cious-0.036
apa-0.035
TokenOf
Token ID5189
Feature activation-2.10e+00

Pos logit contributions
 elevator+0.091
 poster+0.090
 door+0.089
 doors+0.088
 bus+0.087
Neg logit contributions
opsy-0.068
Carl-0.064
 herbal-0.063
ulent-0.062
cious-0.061
Token course
Token ID1781
Feature activation-5.94e+00

Pos logit contributions
 poster+0.070
 bus+0.065
 elevator+0.064
 door+0.064
 doors+0.062
Neg logit contributions
opsy-0.138
ulent-0.128
apa-0.127
 herbal-0.126
 Cic-0.125
Token,
Token ID11
Feature activation-1.54e+00

Pos logit contributions
 poster+0.334
 elevator+0.320
 bus+0.306
 doors+0.304
 door+0.294
Neg logit contributions
opsy-0.258
apa-0.243
-0.233
ulent-0.233
cious-0.229
Token I
Token ID314
Feature activation-8.85e-01

Pos logit contributions
 poster+0.073
 elevator+0.071
 doors+0.069
iren+0.067
 bus+0.066
Neg logit contributions
opsy-0.078
 Mitt-0.074
 Lucky-0.074
 Po-0.074
 Bro-0.071
Token forgive
Token ID20927
Feature activation-1.76e+00

Pos logit contributions
 poster+0.036
 door+0.033
 bus+0.033
 elevator+0.032
 doors+0.031
Neg logit contributions
opsy-0.053
ulent-0.050
 herbal-0.050
cious-0.050
apa-0.048
Token you
Token ID345
Feature activation-5.49e-01

Pos logit contributions
 elevator+0.065
 poster+0.063
 bus+0.063
 door+0.059
 doors+0.055
Neg logit contributions
opsy-0.108
ulent-0.103
-0.102
Carl-0.101
 Cic-0.100
Token,
Token ID11
Feature activation-1.44e+00

Pos logit contributions
 poster+0.029
 elevator+0.027
 bus+0.026
 doors+0.024
 door+0.024
Neg logit contributions
ulent-0.025
opsy-0.025
-0.025
apa-0.024
 herbal-0.023
Token and
Token ID290
Feature activation-1.33e+00

Pos logit contributions
 poster+0.044
 elevator+0.040
 bus+0.039
 door+0.039
 doors+0.038
Neg logit contributions
opsy-0.089
 herbal-0.085
ulent-0.085
cious-0.084
apa-0.083
Token said
Token ID531
Feature activation-1.02e+00

Pos logit contributions
 poster+0.046
 bus+0.042
 elevator+0.042
 door+0.041
 lady+0.040
Neg logit contributions
opsy-0.087
ulent-0.084
 herbal-0.083
cious-0.082
bowl-0.080
Token,
Token ID11
Feature activation-9.14e-01

Pos logit contributions
 poster+0.045
 bus+0.041
 elevator+0.041
 door+0.041
 doors+0.040
Neg logit contributions
opsy-0.057
ulent-0.053
 herbal-0.053
cious-0.053
apa-0.052
Token "
Token ID366
Feature activation-1.11e+00

Pos logit contributions
 poster+0.042
 elevator+0.039
 bus+0.038
 door+0.038
 doors+0.038
Neg logit contributions
opsy-0.048
ulent-0.046
cious-0.046
 herbal-0.046
apa-0.044
TokenOf
Token ID5189
Feature activation-1.94e+00

Pos logit contributions
 poster+0.061
 door+0.061
 elevator+0.060
 doors+0.059
 bus+0.058
Neg logit contributions
opsy-0.049
cious-0.046
 herbal-0.046
ulent-0.046
Carl-0.045
Token course
Token ID1781
Feature activation-5.94e+00

Pos logit contributions
 poster+0.076
 door+0.073
 bus+0.073
 elevator+0.067
 doors+0.066
Neg logit contributions
opsy-0.112
 herbal-0.106
apa-0.105
ulent-0.105
Carl-0.105
Token you
Token ID345
Feature activation-5.67e-01

Pos logit contributions
 poster+0.315
 elevator+0.288
 bus+0.283
 door+0.275
 doors+0.273
Neg logit contributions
opsy-0.283
cious-0.270
apa-0.269
ulent-0.262
 herbal-0.258
Token can
Token ID460
Feature activation-1.06e+00

Pos logit contributions
 poster+0.028
 bus+0.025
 elevator+0.025
 door+0.025
 doors+0.024
Neg logit contributions
opsy-0.029
ulent-0.028
 herbal-0.027
apa-0.027
cious-0.027
Token.
Token ID13
Feature activation-1.49e+00

Pos logit contributions
 poster+0.040
 elevator+0.036
 bus+0.036
 door+0.035
 doors+0.034
Neg logit contributions
opsy-0.067
ulent-0.063
 herbal-0.063
cious-0.062
apa-0.061
Token We
Token ID775
Feature activation+6.74e-01

Pos logit contributions
 poster+0.070
 elevator+0.065
 door+0.064
 bus+0.064
 doors+0.062
Neg logit contributions
opsy-0.078
 herbal-0.072
ulent-0.072
cious-0.071
apa-0.069
Token are
Token ID389
Feature activation-8.06e-01

Pos logit contributions
opsy+0.033
ulent+0.031
 herbal+0.031
cious+0.030
apa+0.030
Neg logit contributions
 poster-0.034
 door-0.032
 elevator-0.032
 bus-0.032
 doors-0.031
Token.
Token ID13
Feature activation-8.80e-01

Pos logit contributions
 poster+0.060
 door+0.056
 doors+0.053
 elevator+0.052
 lady+0.051
Neg logit contributions
opsy-0.185
 herbal-0.182
ulent-0.181
cious-0.178
bowl-0.176
Token\n
Token ID198
Feature activation-3.74e-01

Pos logit contributions
 poster+0.040
 door+0.037
 elevator+0.036
 bus+0.036
 doors+0.035
Neg logit contributions
opsy-0.048
ulent-0.046
 herbal-0.045
cious-0.045
apa-0.044
Token\n
Token ID198
Feature activation-3.33e-01

Pos logit contributions
 poster+0.019
 elevator+0.018
 bus+0.018
 door+0.018
 doors+0.017
Neg logit contributions
opsy-0.019
apa-0.017
 herbal-0.017
ulent-0.017
cious-0.017
Token"
Token ID1
Feature activation-1.15e+00

Pos logit contributions
 poster+0.009
 door+0.008
 bus+0.008
 elevator+0.008
 doors+0.008
Neg logit contributions
opsy-0.024
ulent-0.023
 herbal-0.023
cious-0.022
apa-0.022
TokenOf
Token ID5189
Feature activation-1.86e+00

Pos logit contributions
 door+0.063
 poster+0.063
 elevator+0.062
 doors+0.062
 bus+0.060
Neg logit contributions
opsy-0.048
 herbal-0.046
cious-0.045
Carl-0.045
ulent-0.044
Token course
Token ID1781
Feature activation-5.93e+00

Pos logit contributions
 poster+0.071
 door+0.066
 bus+0.066
 elevator+0.063
 stairs+0.063
Neg logit contributions
opsy-0.110
apa-0.103
ulent-0.101
 herbal-0.100
 Cic-0.099
Token,
Token ID11
Feature activation-1.46e+00

Pos logit contributions
 poster+0.310
 elevator+0.289
 bus+0.280
 doors+0.272
 door+0.268
Neg logit contributions
opsy-0.282
apa-0.270
cious-0.265
-0.262
ulent-0.261
Token sweet
Token ID6029
Feature activation+8.93e-01

Pos logit contributions
 poster+0.055
 elevator+0.048
 bus+0.046
 doors+0.045
 door+0.044
Neg logit contributions
opsy-0.095
apa-0.087
 herbal-0.086
 Cic-0.085
ulent-0.085
Tokenie
Token ID494
Feature activation-9.46e-01

Pos logit contributions
opsy+0.046
apa+0.042
 herbal+0.040
umble+0.038
 Cic+0.038
Neg logit contributions
 poster-0.046
 elevator-0.042
 doors-0.041
 bus-0.040
 officer-0.040
Token.
Token ID13
Feature activation-1.74e+00

Pos logit contributions
 poster+0.048
 elevator+0.045
 doors+0.045
 bus+0.044
 door+0.041
Neg logit contributions
opsy-0.044
cious-0.043
apa-0.042
-0.042
comb-0.041
Token But
Token ID887
Feature activation-1.01e+00

Pos logit contributions
 poster+0.080
 door+0.074
 elevator+0.074
 bus+0.073
 doors+0.072
Neg logit contributions
opsy-0.094
 herbal-0.088
ulent-0.087
cious-0.086
apa-0.085
Token oven
Token ID14361
Feature activation-9.17e-01

Pos logit contributions
 poster+0.073
 bus+0.066
 elevator+0.066
 door+0.065
 doors+0.063
Neg logit contributions
opsy-0.083
 herbal-0.078
ulent-0.077
cious-0.077
apa-0.076
Token.
Token ID13
Feature activation+6.36e-01

Pos logit contributions
 poster+0.029
 elevator+0.026
 bus+0.026
 door+0.025
 doors+0.024
Neg logit contributions
opsy-0.062
 herbal-0.059
ulent-0.059
cious-0.058
apa-0.058
Token She
Token ID1375
Feature activation+1.90e-01

Pos logit contributions
opsy+0.027
ulent+0.024
 herbal+0.024
cious+0.024
apa+0.023
Neg logit contributions
 poster-0.037
 door-0.036
 elevator-0.035
 bus-0.035
 doors-0.035
Token placed
Token ID4624
Feature activation-4.44e-01

Pos logit contributions
opsy+0.009
ulent+0.008
 herbal+0.008
cious+0.008
ifying+0.008
Neg logit contributions
 poster-0.010
 elevator-0.009
 door-0.009
 bus-0.009
 doors-0.009
Token the
Token ID262
Feature activation+1.93e+00

Pos logit contributions
 poster+0.017
 bus+0.016
 door+0.015
 elevator+0.015
 doors+0.014
Neg logit contributions
opsy-0.027
 herbal-0.026
ulent-0.026
apa-0.025
cious-0.025
Token vegetables
Token ID13701
Feature activation+6.11e+00

Pos logit contributions
opsy+0.100
ulent+0.093
 herbal+0.093
cious+0.091
apa+0.090
Neg logit contributions
 poster-0.094
 door-0.088
 elevator-0.088
 bus-0.087
 doors-0.085
Token on
Token ID319
Feature activation+1.77e+00

Pos logit contributions
opsy+0.347
ulent+0.328
cious+0.323
 herbal+0.322
apa+0.313
Neg logit contributions
 poster-0.259
 door-0.251
 elevator-0.243
 doors-0.240
 bus-0.238
Token top
Token ID1353
Feature activation-1.75e+00

Pos logit contributions
opsy+0.111
ulent+0.105
 herbal+0.105
cious+0.104
apa+0.102
Neg logit contributions
 poster-0.066
 door-0.060
 elevator-0.060
 bus-0.059
 doors-0.058
Token of
Token ID286
Feature activation+2.39e+00

Pos logit contributions
 poster+0.082
 officer+0.072
 bus+0.070
 elevator+0.069
 rules+0.069
Neg logit contributions
opsy-0.090
 herbal-0.086
apa-0.085
bowl-0.084
ulent-0.082
Token the
Token ID262
Feature activation+5.63e-01

Pos logit contributions
opsy+0.159
ulent+0.152
cious+0.152
 herbal+0.150
apa+0.146
Neg logit contributions
 poster-0.078
 door-0.073
 elevator-0.072
 doors-0.070
 bus-0.069
Token stove
Token ID27635
Feature activation+2.67e+00

Pos logit contributions
opsy+0.026
 herbal+0.026
ulent+0.024
apa+0.024
cious+0.024
Neg logit contributions
 poster-0.029
 officer-0.025
 bus-0.025
 elevator-0.024
 lady-0.024
TokenMr
Token ID5246
Feature activation+1.32e+00

Pos logit contributions
opsy+0.022
ulent+0.020
 herbal+0.020
cious+0.020
apa+0.020
Neg logit contributions
 poster-0.015
 door-0.013
 elevator-0.013
 bus-0.013
 doors-0.013
Token.
Token ID13
Feature activation-3.44e-01

Pos logit contributions
opsy+0.090
ulent+0.085
cious+0.085
 herbal+0.084
apa+0.083
Neg logit contributions
 poster-0.042
 elevator-0.038
 bus-0.038
 door-0.038
 doors-0.036
Token Lee
Token ID5741
Feature activation-9.31e-01

Pos logit contributions
 poster+0.013
 door+0.012
 bus+0.012
 lady+0.011
 elevator+0.011
Neg logit contributions
opsy-0.022
 herbal-0.021
ulent-0.021
cious-0.021
apa-0.020
Token put
Token ID1234
Feature activation-2.17e-01

Pos logit contributions
 poster+0.043
 door+0.040
 elevator+0.040
 bus+0.040
 doors+0.039
Neg logit contributions
opsy-0.050
ulent-0.047
 herbal-0.047
cious-0.046
apa-0.045
Token the
Token ID262
Feature activation+1.36e+00

Pos logit contributions
 poster+0.010
 door+0.009
 bus+0.009
 elevator+0.009
 rules+0.008
Neg logit contributions
opsy-0.011
 herbal-0.011
ulent-0.011
apa-0.010
bowl-0.010
Token sheep
Token ID15900
Feature activation+5.35e+00

Pos logit contributions
opsy+0.047
 herbal+0.043
cious+0.041
ulent+0.041
apa+0.038
Neg logit contributions
 poster-0.088
 elevator-0.083
 door-0.082
 doors-0.082
 bus-0.081
Token on
Token ID319
Feature activation+1.41e+00

Pos logit contributions
opsy+0.318
ulent+0.296
 herbal+0.296
cious+0.291
apa+0.290
Neg logit contributions
 poster-0.216
 door-0.205
 bus-0.204
 elevator-0.204
 doors-0.199
Token a
Token ID257
Feature activation-9.13e-01

Pos logit contributions
opsy+0.074
ulent+0.070
 herbal+0.070
apa+0.069
cious+0.068
Neg logit contributions
 poster-0.064
 bus-0.057
 elevator-0.057
 door-0.055
 doors-0.054
Token table
Token ID3084
Feature activation-2.65e+00

Pos logit contributions
 poster+0.039
 elevator+0.036
 doors+0.035
 door+0.034
 bus+0.034
Neg logit contributions
opsy-0.052
 herbal-0.049
cious-0.048
ulent-0.048
apa-0.047
Token and
Token ID290
Feature activation-1.01e+00

Pos logit contributions
 poster+0.124
 elevator+0.105
 bus+0.104
 officer+0.103
 doors+0.097
Neg logit contributions
ulent-0.129
cious-0.128
cr-0.126
 herbal-0.125
comb-0.125
Token turned
Token ID2900
Feature activation+1.19e-01

Pos logit contributions
 poster+0.050
 elevator+0.047
 door+0.047
 bus+0.047
 doors+0.046
Neg logit contributions
opsy-0.052
ulent-0.048
 herbal-0.048
cious-0.047
apa-0.046
Token into
Token ID656
Feature activation-2.46e-01

Pos logit contributions
 poster+0.001
 elevator+0.001
 bus+0.001
 door+0.001
 doors+0.001
Neg logit contributions
opsy-0.001
ulent-0.001
 herbal-0.001
cious-0.001
apa-0.001
Token Sara
Token ID24799
Feature activation+4.43e-01

Pos logit contributions
 poster+0.010
 door+0.009
 elevator+0.009
 bus+0.009
 doors+0.009
Neg logit contributions
opsy-0.015
ulent-0.014
 herbal-0.014
cious-0.014
apa-0.013
Token and
Token ID290
Feature activation-1.69e-01

Pos logit contributions
opsy+0.023
 herbal+0.022
ulent+0.022
Carl+0.022
apa+0.022
Neg logit contributions
 door-0.021
 poster-0.021
 bus-0.019
 doors-0.019
 lady-0.019
Token spills
Token ID42123
Feature activation+1.90e+00

Pos logit contributions
 poster+0.007
 door+0.007
 elevator+0.007
 bus+0.007
 doors+0.006
Neg logit contributions
opsy-0.010
 herbal-0.009
ulent-0.009
cious-0.009
apa-0.009
Token her
Token ID607
Feature activation+1.28e+00

Pos logit contributions
opsy+0.094
 herbal+0.092
ulent+0.089
apa+0.087
 Cic+0.086
Neg logit contributions
 poster-0.094
 bus-0.082
 elevator-0.080
 door-0.078
 officer-0.077
Token coffee
Token ID6891
Feature activation+5.23e+00

Pos logit contributions
opsy+0.065
ulent+0.064
apa+0.060
Carl+0.060
umped+0.059
Neg logit contributions
 poster-0.061
 door-0.060
 bus-0.058
 elevator-0.057
 doors-0.056
Token on
Token ID319
Feature activation+9.30e-01

Pos logit contributions
opsy+0.307
 herbal+0.294
cious+0.287
ulent+0.286
bowl+0.281
Neg logit contributions
 poster-0.207
 elevator-0.183
 bus-0.183
 doors-0.176
 door-0.176
Token Sara
Token ID24799
Feature activation-1.54e-01

Pos logit contributions
opsy+0.053
 herbal+0.050
ulent+0.049
cious+0.049
apa+0.048
Neg logit contributions
 poster-0.040
 bus-0.036
 elevator-0.035
 door-0.035
 doors-0.034
Token's
Token ID338
Feature activation+2.41e+00

Pos logit contributions
 door+0.008
 bus+0.007
 poster+0.007
 doors+0.007
 elevator+0.007
Neg logit contributions
opsy-0.008
 herbal-0.007
Carl-0.007
ulent-0.007
 Cic-0.007
Token shirt
Token ID10147
Feature activation+1.12e-01

Pos logit contributions
opsy+0.124
ulent+0.118
 herbal+0.115
Carl+0.112
apa+0.112
Neg logit contributions
 door-0.113
 poster-0.112
 bus-0.110
 elevator-0.107
 doors-0.106
Token.
Token ID13
Feature activation+4.34e-01

Pos logit contributions
opsy+0.006
cious+0.006
 herbal+0.006
cr+0.006
comb+0.006
Neg logit contributions
 poster-0.005
 elevator-0.004
 bus-0.004
 doors-0.004
 officer-0.004
Token!
Token ID0
Feature activation+1.32e-01

Pos logit contributions
opsy+0.208
 herbal+0.195
ulent+0.195
cious+0.192
apa+0.191
Neg logit contributions
 poster-0.105
 door-0.097
 elevator-0.095
 bus-0.094
 doors-0.091
Token They
Token ID1119
Feature activation+1.32e+00

Pos logit contributions
opsy+0.007
ulent+0.006
 herbal+0.006
cious+0.006
apa+0.006
Neg logit contributions
 poster-0.006
 door-0.006
 elevator-0.006
 bus-0.006
 doors-0.006
Token both
Token ID1111
Feature activation-1.44e+00

Pos logit contributions
opsy+0.064
 herbal+0.062
ulent+0.062
cious+0.060
ithering+0.058
Neg logit contributions
 bus-0.063
 poster-0.063
 door-0.062
 elevator-0.061
 lady-0.061
Token enjoyed
Token ID8359
Feature activation+1.53e+00

Pos logit contributions
 bus+0.065
 poster+0.065
 elevator+0.064
 lady+0.062
 door+0.062
Neg logit contributions
opsy-0.074
ulent-0.070
 herbal-0.070
ithering-0.067
apa-0.065
Token the
Token ID262
Feature activation+9.17e-01

Pos logit contributions
opsy+0.102
apa+0.097
ulent+0.096
 herbal+0.096
cious+0.095
Neg logit contributions
 poster-0.049
 door-0.043
 bus-0.043
 elevator-0.043
 doors-0.043
Token salty
Token ID36021
Feature activation+6.01e+00

Pos logit contributions
opsy+0.041
ulent+0.038
cious+0.037
 herbal+0.037
apa+0.036
Neg logit contributions
 poster-0.051
 door-0.048
 bus-0.048
 elevator-0.048
 doors-0.047
Token treats
Token ID18432
Feature activation+2.66e+00

Pos logit contributions
opsy+0.300
ulent+0.280
 herbal+0.278
cious+0.275
apa+0.270
Neg logit contributions
 poster-0.299
 door-0.279
 elevator-0.278
 bus-0.277
 doors-0.271
Token together
Token ID1978
Feature activation-2.54e+00

Pos logit contributions
opsy+0.177
ulent+0.167
 herbal+0.166
apa+0.163
cious+0.162
Neg logit contributions
 poster-0.088
 door-0.082
 bus-0.079
 elevator-0.079
 lady-0.077
Token.
Token ID13
Feature activation+1.74e-01

Pos logit contributions
 poster+0.071
 door+0.066
 bus+0.062
 elevator+0.061
 doors+0.060
Neg logit contributions
opsy-0.182
 herbal-0.172
ulent-0.172
cious-0.169
apa-0.168
Token<|endoftext|>
Token ID50256
Feature activation-1.29e+00

Pos logit contributions
opsy+0.011
ulent+0.011
cious+0.010
 herbal+0.010
aults+0.010
Neg logit contributions
 poster-0.006
 door-0.006
 lady-0.006
 elevator-0.006
 bus-0.005
TokenOnce
Token ID7454
Feature activation+7.85e-01

Pos logit contributions
 poster+0.050
 door+0.046
 elevator+0.046
 bus+0.045
 doors+0.044
Neg logit contributions
opsy-0.078
ulent-0.074
 herbal-0.073
cious-0.073
apa-0.071
Token kitchen
Token ID9592
Feature activation-4.30e-02

Pos logit contributions
 poster+0.052
 door+0.049
 elevator+0.049
 bus+0.048
 doors+0.047
Neg logit contributions
opsy-0.067
ulent-0.063
 herbal-0.062
cious-0.062
apa-0.061
Token.
Token ID13
Feature activation+4.95e-01

Pos logit contributions
 poster+0.002
 elevator+0.001
 bus+0.001
 officer+0.001
 doors+0.001
Neg logit contributions
opsy-0.003
 herbal-0.003
cious-0.002
ulent-0.002
apa-0.002
Token She
Token ID1375
Feature activation+1.08e-01

Pos logit contributions
opsy+0.029
ulent+0.026
cious+0.026
 herbal+0.026
apa+0.026
Neg logit contributions
 door-0.022
 elevator-0.021
 bus-0.021
 poster-0.021
 doors-0.020
Token put
Token ID1234
Feature activation+7.70e-02

Pos logit contributions
opsy+0.005
 herbal+0.005
ulent+0.005
cious+0.005
apa+0.005
Neg logit contributions
 poster-0.005
 door-0.005
 elevator-0.005
 bus-0.005
 doors-0.005
Token the
Token ID262
Feature activation+1.95e+00

Pos logit contributions
opsy+0.004
 herbal+0.004
apa+0.004
ulent+0.004
bowl+0.004
Neg logit contributions
 poster-0.004
 bus-0.003
 door-0.003
 elevator-0.003
 rules-0.003
Token bread
Token ID8509
Feature activation+5.59e+00

Pos logit contributions
opsy+0.089
 herbal+0.082
ulent+0.081
cious+0.080
apa+0.079
Neg logit contributions
 poster-0.106
 bus-0.098
 elevator-0.098
 door-0.096
 officer-0.096
Token in
Token ID287
Feature activation+4.42e-02

Pos logit contributions
opsy+0.329
ulent+0.315
 herbal+0.313
cious+0.308
apa+0.298
Neg logit contributions
 door-0.219
 poster-0.216
 doors-0.208
 elevator-0.206
 bus-0.206
Token the
Token ID262
Feature activation-1.45e-01

Pos logit contributions
opsy+0.003
 herbal+0.002
ulent+0.002
cious+0.002
apa+0.002
Neg logit contributions
 poster-0.002
 bus-0.002
 elevator-0.002
 door-0.002
 doors-0.002
Token to
Token ID284
Feature activation+6.24e-01

Pos logit contributions
 poster+0.006
 door+0.005
 elevator+0.005
 bus+0.005
 doors+0.005
Neg logit contributions
opsy-0.009
ulent-0.008
 herbal-0.008
cious-0.008
apa-0.008
Tokenaster
Token ID1603
Feature activation-3.08e-01

Pos logit contributions
opsy+0.032
ulent+0.030
cious+0.030
 herbal+0.029
apa+0.029
Neg logit contributions
 poster-0.030
 door-0.029
 elevator-0.028
 bus-0.028
 doors-0.028
Token and
Token ID290
Feature activation-5.32e-01

Pos logit contributions
 poster+0.012
 elevator+0.011
 bus+0.011
 door+0.011
 doors+0.010
Neg logit contributions
opsy-0.018
ulent-0.017
 herbal-0.017
cious-0.017
bowl-0.017
Tokenster
Token ID1706
Feature activation+2.06e+00

Pos logit contributions
opsy+0.055
cious+0.049
apa+0.049
ulent+0.048
 herbal+0.047
Neg logit contributions
 poster-0.082
 elevator-0.078
 bus-0.077
 door-0.077
 doors-0.076
Token continued
Token ID3767
Feature activation+6.37e-01

Pos logit contributions
opsy+0.093
 herbal+0.085
apa+0.084
cious+0.083
bowl+0.082
Neg logit contributions
 poster-0.111
 elevator-0.104
 bus-0.103
 door-0.102
 doors-0.100
Token to
Token ID284
Feature activation+1.51e-01

Pos logit contributions
opsy+0.037
 herbal+0.035
ulent+0.035
cious+0.034
apa+0.034
Neg logit contributions
 poster-0.026
 elevator-0.024
 bus-0.023
 door-0.023
 doors-0.023
Token search
Token ID2989
Feature activation-1.80e+00

Pos logit contributions
opsy+0.007
ulent+0.007
cious+0.007
 herbal+0.007
apa+0.006
Neg logit contributions
 poster-0.008
 elevator-0.008
 door-0.007
 bus-0.007
 doors-0.007
Token for
Token ID329
Feature activation-9.49e-02

Pos logit contributions
 poster+0.072
 bus+0.062
 elevator+0.061
 door+0.059
 doors+0.057
Neg logit contributions
opsy-0.107
 herbal-0.103
apa-0.102
ulent-0.101
bowl-0.099
Token food
Token ID2057
Feature activation+3.30e+00

Pos logit contributions
 poster+0.004
 elevator+0.004
 bus+0.004
 door+0.004
 doors+0.004
Neg logit contributions
opsy-0.005
ulent-0.005
 herbal-0.005
cious-0.005
apa-0.005
Token,
Token ID11
Feature activation+1.02e+00

Pos logit contributions
opsy+0.208
ulent+0.196
 herbal+0.196
cious+0.194
apa+0.191
Neg logit contributions
 poster-0.121
 elevator-0.109
 door-0.108
 bus-0.108
 doors-0.105
Token never
Token ID1239
Feature activation-2.89e+00

Pos logit contributions
opsy+0.061
 herbal+0.057
ulent+0.057
cious+0.056
apa+0.055
Neg logit contributions
 poster-0.042
 door-0.039
 bus-0.038
 elevator-0.038
 doors-0.037
Token worrying
Token ID18916
Feature activation+4.26e-01

Pos logit contributions
 poster+0.111
 door+0.108
 bus+0.105
 doors+0.101
 elevator+0.099
Neg logit contributions
opsy-0.180
 herbal-0.167
apa-0.165
ulent-0.165
cious-0.163
Token about
Token ID546
Feature activation+2.82e-01

Pos logit contributions
opsy+0.023
 herbal+0.022
ulent+0.022
cious+0.022
apa+0.021
Neg logit contributions
 poster-0.019
 elevator-0.017
 bus-0.017
 door-0.017
 doors-0.017
Token getting
Token ID1972
Feature activation+1.31e+00

Pos logit contributions
opsy+0.017
 herbal+0.016
ulent+0.016
cious+0.015
apa+0.015
Neg logit contributions
 poster-0.012
 elevator-0.011
 bus-0.010
 door-0.010
 doors-0.010
Token she
Token ID673
Feature activation+5.79e-01

Pos logit contributions
 poster+0.027
 door+0.025
 elevator+0.024
 bus+0.024
 doors+0.024
Neg logit contributions
opsy-0.050
ulent-0.048
cious-0.047
 herbal-0.047
apa-0.047
Token stepped
Token ID10764
Feature activation+7.00e-01

Pos logit contributions
opsy+0.033
ulent+0.032
apa+0.032
 herbal+0.031
cious+0.031
Neg logit contributions
 poster-0.024
 door-0.023
 bus-0.022
 lady-0.022
 doors-0.022
Token outside
Token ID2354
Feature activation-1.18e-01

Pos logit contributions
opsy+0.039
ulent+0.036
 herbal+0.036
cious+0.036
apa+0.035
Neg logit contributions
 poster-0.032
 elevator-0.029
 door-0.029
 bus-0.029
 doors-0.028
Token with
Token ID351
Feature activation-4.65e-01

Pos logit contributions
 poster+0.005
 elevator+0.005
 door+0.005
 bus+0.004
 doors+0.004
Neg logit contributions
opsy-0.007
 herbal-0.006
ulent-0.006
cious-0.006
apa-0.006
Token the
Token ID262
Feature activation+4.43e-01

Pos logit contributions
 door+0.015
 poster+0.015
 elevator+0.014
 doors+0.014
 bus+0.013
Neg logit contributions
opsy-0.031
cious-0.030
ulent-0.030
 herbal-0.029
apa-0.029
Token egg
Token ID5935
Feature activation+3.03e+00

Pos logit contributions
opsy+0.025
ulent+0.023
 herbal+0.023
cious+0.023
apa+0.023
Neg logit contributions
 poster-0.020
 door-0.018
 elevator-0.018
 bus-0.018
 doors-0.017
Token,
Token ID11
Feature activation+8.30e-01

Pos logit contributions
opsy+0.197
ulent+0.192
 herbal+0.188
cious+0.186
Carl+0.180
Neg logit contributions
 door-0.107
 poster-0.104
 doors-0.099
 bus-0.095
 elevator-0.095
Token she
Token ID673
Feature activation+1.99e-01

Pos logit contributions
opsy+0.055
ulent+0.053
 herbal+0.052
cious+0.052
apa+0.051
Neg logit contributions
 poster-0.028
 door-0.025
 elevator-0.025
 bus-0.025
 doors-0.024
Token heard
Token ID2982
Feature activation+3.37e-01

Pos logit contributions
 herbal+0.010
opsy+0.010
ulent+0.010
cious+0.009
 Cic+0.009
Neg logit contributions
 door-0.009
 poster-0.009
 doors-0.009
 bus-0.009
 elevator-0.009
Token that
Token ID326
Feature activation+5.62e-01

Pos logit contributions
opsy+0.020
ulent+0.019
 herbal+0.019
cious+0.018
apa+0.018
Neg logit contributions
 poster-0.014
 elevator-0.013
 door-0.013
 bus-0.013
 doors-0.012
Token same
Token ID976
Feature activation-4.37e-01

Pos logit contributions
opsy+0.035
 herbal+0.033
ulent+0.033
cious+0.033
apa+0.032
Neg logit contributions
 poster-0.021
 elevator-0.019
 door-0.019
 bus-0.019
 doors-0.019
Token.
Token ID13
Feature activation-4.85e-01

Pos logit contributions
 poster+0.006
 door+0.006
 bus+0.006
 rules+0.005
 elevator+0.005
Neg logit contributions
opsy-0.030
 herbal-0.029
ulent-0.028
apa-0.028
cious-0.028
Token He
Token ID679
Feature activation-5.88e-01

Pos logit contributions
 poster+0.022
 door+0.020
 elevator+0.020
 bus+0.020
 doors+0.020
Neg logit contributions
opsy-0.027
 herbal-0.025
ulent-0.025
cious-0.025
apa-0.024
Token was
Token ID373
Feature activation+1.93e-01

Pos logit contributions
 poster+0.026
 bus+0.026
 elevator+0.025
 door+0.025
 doors+0.025
Neg logit contributions
opsy-0.032
ulent-0.030
 herbal-0.030
apa-0.029
cious-0.029
Token too
Token ID1165
Feature activation-1.56e+00

Pos logit contributions
opsy+0.010
ulent+0.010
 herbal+0.010
cious+0.010
apa+0.009
Neg logit contributions
 poster-0.009
 door-0.008
 elevator-0.008
 bus-0.008
 doors-0.008
Token busy
Token ID8179
Feature activation+3.08e-01

Pos logit contributions
 poster+0.064
 door+0.059
 elevator+0.059
 bus+0.059
 doors+0.057
Neg logit contributions
opsy-0.092
ulent-0.087
 herbal-0.087
cious-0.086
apa-0.084
Token catching
Token ID16508
Feature activation+1.78e+00

Pos logit contributions
opsy+0.015
ulent+0.014
 herbal+0.014
cious+0.014
bowl+0.013
Neg logit contributions
 poster-0.015
 elevator-0.014
 bus-0.014
 door-0.014
 doors-0.014
Token frogs
Token ID37475
Feature activation+3.12e+00

Pos logit contributions
opsy+0.118
 herbal+0.111
ulent+0.110
cious+0.110
apa+0.109
Neg logit contributions
 poster-0.058
 door-0.057
 doors-0.055
 elevator-0.054
 bus-0.053
Token.
Token ID13
Feature activation-1.81e-01

Pos logit contributions
opsy+0.227
 herbal+0.215
ulent+0.213
cious+0.211
apa+0.210
Neg logit contributions
 poster-0.088
 door-0.080
 bus-0.079
 elevator-0.076
 doors-0.073
Token The
Token ID383
Feature activation-4.46e-01

Pos logit contributions
 poster+0.008
 door+0.007
 elevator+0.007
 bus+0.007
 doors+0.007
Neg logit contributions
opsy-0.010
 herbal-0.010
ulent-0.010
cious-0.010
apa-0.009
Token crocod
Token ID37565
Feature activation+1.45e+00

Pos logit contributions
 poster+0.028
 elevator+0.026
 bus+0.026
 door+0.026
 doors+0.026
Neg logit contributions
opsy-0.016
 herbal-0.015
ulent-0.014
cious-0.014
bowl-0.014
Tokenile
Token ID576
Feature activation+1.09e+00

Pos logit contributions
opsy+0.081
ulent+0.077
 herbal+0.077
cious+0.076
apa+0.075
Neg logit contributions
 poster-0.064
 elevator-0.059
 door-0.059
 bus-0.058
 doors-0.056
Token.
Token ID13
Feature activation+5.77e-01

Pos logit contributions
opsy+0.005
ulent+0.005
 herbal+0.005
cious+0.005
apa+0.005
Neg logit contributions
 poster-0.002
 door-0.002
 bus-0.002
 doors-0.002
 rules-0.002
Token Mitt
Token ID16627
Feature activation+4.85e+00

Pos logit contributions
cious+0.027
aults+0.027
ulent+0.027
opsy+0.026
 herbal+0.025
Neg logit contributions
 bus-0.026
 poster-0.026
 lady-0.026
?"-0.025
 car-0.025
Tokenens
Token ID641
Feature activation+3.17e+00

Pos logit contributions
opsy+0.242
ulent+0.238
ifying+0.235
cious+0.232
 herbal+0.231
Neg logit contributions
 poster-0.226
 bus-0.213
 elevator-0.208
 lady-0.208
 man-0.204
Token was
Token ID373
Feature activation+2.83e-01

Pos logit contributions
opsy+0.164
ulent+0.156
 herbal+0.153
apa+0.150
cious+0.149
Neg logit contributions
 poster-0.152
 bus-0.140
 elevator-0.139
 door-0.138
 lady-0.135
Token the
Token ID262
Feature activation+6.91e-02

Pos logit contributions
opsy+0.017
ulent+0.016
apa+0.016
 herbal+0.016
cious+0.016
Neg logit contributions
 poster-0.010
 bus-0.010
 elevator-0.010
 door-0.010
 lady-0.010
Token happiest
Token ID49414
Feature activation+3.10e+00

Pos logit contributions
opsy+0.003
cious+0.003
ulent+0.003
 herbal+0.003
apa+0.003
Neg logit contributions
 poster-0.004
 bus-0.004
 door-0.004
 elevator-0.004
 lady-0.004
Token kitten
Token ID42143
Feature activation+2.28e+00

Pos logit contributions
opsy+0.193
 herbal+0.181
ulent+0.181
cious+0.179
apa+0.176
Neg logit contributions
 poster-0.118
 elevator-0.107
 door-0.107
 bus-0.106
 doors-0.104
Token in
Token ID287
Feature activation-1.54e+00

Pos logit contributions
opsy+0.131
ulent+0.123
 herbal+0.123
apa+0.120
cious+0.120
Neg logit contributions
 poster-0.095
 door-0.088
 bus-0.088
 elevator-0.088
 lady-0.084
Token the
Token ID262
Feature activation-1.43e+00

Pos logit contributions
 poster+0.061
 elevator+0.056
 bus+0.055
 door+0.055
 doors+0.054
Neg logit contributions
opsy-0.094
 herbal-0.088
ulent-0.088
cious-0.086
apa-0.085
Token world
Token ID995
Feature activation-1.50e+00

Pos logit contributions
 poster+0.064
 officer+0.058
 bus+0.057
 belts+0.057
 doors+0.057
Neg logit contributions
opsy-0.076
 colony-0.072
 herbal-0.070
bowl-0.067
-0.066
Token.
Token ID13
Feature activation+2.96e-01

Pos logit contributions
 poster+0.063
 doors+0.058
 door+0.058
 elevator+0.058
 bus+0.056
Neg logit contributions
opsy-0.080
cious-0.079
ulent-0.078
 herbal-0.077
bowl-0.075
Token The
Token ID383
Feature activation+1.62e-01

Pos logit contributions
opsy+0.049
ulent+0.046
 herbal+0.045
cious+0.045
apa+0.043
Neg logit contributions
 poster-0.053
 door-0.050
 elevator-0.049
 bus-0.049
 doors-0.048
Token lizard
Token ID42406
Feature activation+1.73e+00

Pos logit contributions
opsy+0.009
 herbal+0.009
ulent+0.008
cious+0.008
 Cic+0.008
Neg logit contributions
 poster-0.007
 elevator-0.006
 doors-0.006
 bus-0.006
 officer-0.006
Token climbed
Token ID19952
Feature activation+1.04e+00

Pos logit contributions
opsy+0.091
ulent+0.086
 herbal+0.086
apa+0.083
cious+0.082
Neg logit contributions
 poster-0.081
 door-0.077
 bus-0.076
 elevator-0.076
 doors-0.075
Token on
Token ID319
Feature activation+1.77e+00

Pos logit contributions
opsy+0.049
 herbal+0.045
ulent+0.045
 Cic+0.044
cr+0.044
Neg logit contributions
 poster-0.054
 bus-0.048
 elevator-0.048
 officer-0.048
 rules-0.046
Token top
Token ID1353
Feature activation-1.91e+00

Pos logit contributions
opsy+0.094
ulent+0.090
 herbal+0.088
cious+0.087
 Cic+0.086
Neg logit contributions
 poster-0.080
 bus-0.075
 officer-0.074
 elevator-0.073
 door-0.069
Token of
Token ID286
Feature activation+2.30e+00

Pos logit contributions
 poster+0.100
 elevator+0.090
 bus+0.090
 officer+0.090
 rules+0.085
Neg logit contributions
opsy-0.085
 herbal-0.079
bowl-0.078
umble-0.077
ulent-0.077
Token the
Token ID262
Feature activation+6.37e-01

Pos logit contributions
opsy+0.131
ulent+0.123
 herbal+0.122
cious+0.119
 Cic+0.119
Neg logit contributions
 poster-0.098
 elevator-0.090
 bus-0.090
 door-0.088
 officer-0.087
Token rock
Token ID3881
Feature activation-3.76e-01

Pos logit contributions
opsy+0.041
 herbal+0.039
ulent+0.038
cious+0.038
bowl+0.037
Neg logit contributions
 poster-0.023
 elevator-0.020
 bus-0.019
 officer-0.019
 doors-0.019
Token and
Token ID290
Feature activation+2.32e-01

Pos logit contributions
 poster+0.016
 elevator+0.014
 bus+0.014
 officer+0.014
 doors+0.014
Neg logit contributions
opsy-0.020
cious-0.020
ulent-0.020
bowl-0.020
 herbal-0.019
Token looked
Token ID3114
Feature activation+2.75e-02

Pos logit contributions
opsy+0.012
ulent+0.011
 herbal+0.011
cious+0.011
apa+0.011
Neg logit contributions
 poster-0.011
 door-0.011
 elevator-0.011
 bus-0.011
 doors-0.011
Token around
Token ID1088
Feature activation+7.82e-02

Pos logit contributions
opsy+0.001
 herbal+0.001
ulent+0.001
cious+0.001
apa+0.001
Neg logit contributions
 poster-0.001
 elevator-0.001
 bus-0.001
 door-0.001
 officer-0.001
Token to
Token ID284
Feature activation+3.84e-01

Pos logit contributions
opsy+0.050
 herbal+0.047
ulent+0.047
cious+0.046
apa+0.046
Neg logit contributions
 poster-0.030
 elevator-0.028
 door-0.027
 bus-0.027
 doors-0.027
Token play
Token ID711
Feature activation-7.58e-01

Pos logit contributions
opsy+0.017
ulent+0.016
 herbal+0.016
cious+0.016
apa+0.015
Neg logit contributions
 poster-0.021
 bus-0.020
 elevator-0.020
 door-0.020
 doors-0.019
Token with
Token ID351
Feature activation-7.96e-01

Pos logit contributions
 poster+0.032
 door+0.029
 elevator+0.029
 bus+0.029
 doors+0.028
Neg logit contributions
opsy-0.045
ulent-0.042
cious-0.042
 herbal-0.042
apa-0.041
Token his
Token ID465
Feature activation+1.83e-02

Pos logit contributions
 poster+0.027
 elevator+0.025
 door+0.025
 bus+0.025
 doors+0.024
Neg logit contributions
opsy-0.053
ulent-0.050
 herbal-0.050
cious-0.050
apa-0.049
Token old
Token ID1468
Feature activation-7.82e-01

Pos logit contributions
opsy+0.001
ulent+0.001
 herbal+0.001
cious+0.001
apa+0.001
Neg logit contributions
 poster-0.001
 elevator-0.001
 door-0.001
 bus-0.001
 doors-0.001
Token phone
Token ID3072
Feature activation-5.06e-01

Pos logit contributions
 poster+0.032
 door+0.030
 elevator+0.030
 bus+0.029
 doors+0.029
Neg logit contributions
opsy-0.044
 herbal-0.042
cious-0.041
ulent-0.041
bowl-0.040
Token.
Token ID13
Feature activation+7.30e-01

Pos logit contributions
 poster+0.019
 elevator+0.016
 door+0.016
 bus+0.016
 doors+0.015
Neg logit contributions
opsy-0.031
 herbal-0.029
ulent-0.029
cious-0.029
bowl-0.029
Token It
Token ID632
Feature activation-3.45e-01

Pos logit contributions
+0.022
 Bro+0.022
 herbal+0.021
bowl+0.020
apa+0.019
Neg logit contributions
 poster-0.049
 door-0.042
 doors-0.041
 officer-0.040
 bus-0.040
Token is
Token ID318
Feature activation-6.28e-01

Pos logit contributions
 poster+0.012
 door+0.012
 elevator+0.011
 rules+0.011
 doors+0.011
Neg logit contributions
-0.020
 herbal-0.020
opsy-0.019
bowl-0.019
 Cic-0.019
Token big
Token ID1263
Feature activation+4.34e-01

Pos logit contributions
 poster+0.026
 elevator+0.023
 door+0.023
 bus+0.023
 doors+0.023
Neg logit contributions
opsy-0.037
 herbal-0.035
ulent-0.035
cious-0.034
apa-0.034
Token and
Token ID290
Feature activation-9.83e-02

Pos logit contributions
 herbal+0.017
ulent+0.017
bowl+0.017
+0.017
opsy+0.016
Neg logit contributions
 poster-0.022
 elevator-0.020
 doors-0.019
 rules-0.019
 officer-0.019
Token was
Token ID373
Feature activation+2.08e-01

Pos logit contributions
ulent+0.022
opsy+0.022
ifying+0.022
 herbal+0.021
apa+0.021
Neg logit contributions
 poster-0.013
 door-0.013
 man-0.013
 bus-0.012
 lady-0.012
Token scared
Token ID12008
Feature activation+1.88e+00

Pos logit contributions
apa+0.015
opsy+0.014
ifying+0.014
comb+0.014
ete+0.014
Neg logit contributions
 old-0.007
 door-0.007
 too-0.007
 lady-0.006
 bus-0.006
Token of
Token ID286
Feature activation+1.42e+00

Pos logit contributions
opsy+0.109
ulent+0.102
 herbal+0.102
cious+0.099
bowl+0.099
Neg logit contributions
 poster-0.081
 elevator-0.072
 bus-0.071
 door-0.070
 doors-0.068
Token the
Token ID262
Feature activation-1.23e-01

Pos logit contributions
opsy+0.081
ulent+0.077
 herbal+0.077
cious+0.076
 Cic+0.074
Neg logit contributions
 poster-0.059
 bus-0.053
 door-0.053
 elevator-0.052
 doors-0.051
Token snake
Token ID17522
Feature activation+2.09e+00

Pos logit contributions
 poster+0.005
 door+0.005
 elevator+0.005
 bus+0.005
 doors+0.005
Neg logit contributions
opsy-0.007
ulent-0.007
apa-0.007
cious-0.007
 herbal-0.007
Token,
Token ID11
Feature activation+7.00e-01

Pos logit contributions
opsy+0.160
 herbal+0.152
apa+0.150
ulent+0.150
cious+0.148
Neg logit contributions
 poster-0.052
 door-0.050
 bus-0.049
 elevator-0.046
 doors-0.043
Token so
Token ID523
Feature activation-1.52e-01

Pos logit contributions
opsy+0.034
 herbal+0.032
cious+0.032
ulent+0.032
 Cic+0.032
Neg logit contributions
 poster-0.035
 elevator-0.031
 door-0.030
 bus-0.030
 doors-0.030
Token he
Token ID339
Feature activation+2.36e-01

Pos logit contributions
 poster+0.005
 door+0.005
 elevator+0.005
 bus+0.005
 doors+0.005
Neg logit contributions
opsy-0.010
 herbal-0.009
ulent-0.009
cious-0.009
apa-0.009
Token ran
Token ID4966
Feature activation+9.24e-01

Pos logit contributions
opsy+0.011
ulent+0.010
 herbal+0.010
cious+0.010
apa+0.010
Neg logit contributions
 poster-0.013
 elevator-0.012
 door-0.012
 bus-0.012
 doors-0.012
Token inside
Token ID2641
Feature activation-6.52e-01

Pos logit contributions
opsy+0.036
Carl+0.036
 Mitt+0.035
ulent+0.034
 herbal+0.034
Neg logit contributions
 poster-0.052
 elevator-0.047
 bus-0.045
 doors-0.044
 door-0.043
Token to
Token ID284
Feature activation+4.37e-01

Pos logit contributions
 poster+0.029
 elevator+0.027
 bus+0.026
 doors+0.026
 officer+0.025
Neg logit contributions
opsy-0.034
 herbal-0.033
bowl-0.032
 Cic-0.031
ulent-0.031
Token when
Token ID618
Feature activation+2.03e+00

Pos logit contributions
 poster+0.033
 door+0.030
 bus+0.030
 elevator+0.029
 doors+0.028
Neg logit contributions
opsy-0.077
ulent-0.074
 herbal-0.073
cious-0.073
apa-0.072
Token something
Token ID1223
Feature activation+1.22e+00

Pos logit contributions
opsy+0.128
 herbal+0.119
ulent+0.119
cious+0.118
apa+0.116
Neg logit contributions
 poster-0.075
 bus-0.067
 elevator-0.066
 door-0.066
 doors-0.063
Token bad
Token ID2089
Feature activation+1.38e+00

Pos logit contributions
opsy+0.063
ulent+0.059
 herbal+0.058
cious+0.058
apa+0.057
Neg logit contributions
 poster-0.059
 door-0.055
 elevator-0.054
 bus-0.054
 doors-0.053
Token happened
Token ID3022
Feature activation+1.03e+00

Pos logit contributions
opsy+0.059
 herbal+0.054
ulent+0.053
cious+0.052
apa+0.051
Neg logit contributions
 poster-0.078
 elevator-0.073
 door-0.073
 bus-0.073
 doors-0.072
Token.
Token ID13
Feature activation+1.15e+00

Pos logit contributions
opsy+0.067
ulent+0.064
 herbal+0.064
cious+0.063
apa+0.061
Neg logit contributions
 poster-0.036
 elevator-0.033
 bus-0.032
 door-0.032
 doors-0.031
Token It
Token ID632
Feature activation+1.66e-01

Pos logit contributions
opsy+0.052
ulent+0.048
 herbal+0.047
cious+0.047
apa+0.045
Neg logit contributions
 poster-0.063
 door-0.060
 elevator-0.060
 bus-0.059
 doors-0.058
Token started
Token ID2067
Feature activation+1.76e+00

Pos logit contributions
opsy+0.008
ulent+0.008
 herbal+0.008
cious+0.008
apa+0.008
Neg logit contributions
 poster-0.008
 door-0.008
 elevator-0.008
 bus-0.008
 doors-0.008
Token to
Token ID284
Feature activation+8.51e-01

Pos logit contributions
opsy+0.127
apa+0.120
cious+0.119
ulent+0.119
 herbal+0.118
Neg logit contributions
 poster-0.052
 door-0.049
 bus-0.046
 doors-0.045
 elevator-0.045
Token rain
Token ID6290
Feature activation-3.20e-01

Pos logit contributions
opsy+0.043
 herbal+0.041
ulent+0.040
cious+0.039
 Cic+0.038
Neg logit contributions
 poster-0.041
 elevator-0.038
 door-0.038
 bus-0.038
 doors-0.037
Token and
Token ID290
Feature activation+2.15e-01

Pos logit contributions
 poster+0.013
 elevator+0.011
 bus+0.011
 door+0.011
 doors+0.011
Neg logit contributions
opsy-0.019
 herbal-0.018
ulent-0.018
cious-0.018
bowl-0.017
Token the
Token ID262
Feature activation-7.57e-01

Pos logit contributions
opsy+0.013
 herbal+0.012
ulent+0.012
cious+0.012
apa+0.012
Neg logit contributions
 poster-0.008
 bus-0.007
 elevator-0.007
 door-0.007
 doors-0.007
Token the
Token ID262
Feature activation-4.57e-01

Pos logit contributions
opsy+0.013
 herbal+0.012
apa+0.012
 roasted+0.012
apy+0.012
Neg logit contributions
 poster-0.014
 bus-0.012
 elevator-0.012
 officer-0.012
 rules-0.012
Token sweater
Token ID35685
Feature activation+1.53e+00

Pos logit contributions
 poster+0.017
?"+0.015
 elevator+0.015
iren+0.015
 bus+0.015
Neg logit contributions
 herbal-0.025
opsy-0.024
ifying-0.023
ulent-0.023
cious-0.022
Token was
Token ID373
Feature activation+1.18e-02

Pos logit contributions
opsy+0.062
cious+0.058
 herbal+0.058
ulent+0.056
apa+0.055
Neg logit contributions
 poster-0.088
 bus-0.080
 elevator-0.080
 door-0.078
 doors-0.077
Token stained
Token ID34568
Feature activation+1.56e+00

Pos logit contributions
opsy+0.001
 herbal+0.001
ifying+0.001
ulent+0.001
 roasted+0.001
Neg logit contributions
 poster-0.001
 elevator-0.001
 bus-0.001
 rules-0.000
 officer-0.000
Token.
Token ID13
Feature activation+6.49e-02

Pos logit contributions
opsy+0.089
 herbal+0.088
ulent+0.086
cious+0.084
apa+0.084
Neg logit contributions
 poster-0.063
 bus-0.055
 elevator-0.054
 door-0.051
 officer-0.050
Token\n
Token ID198
Feature activation+5.58e-01

Pos logit contributions
opsy+0.003
 herbal+0.003
apa+0.003
ulent+0.003
cious+0.003
Neg logit contributions
 poster-0.003
 bus-0.003
 elevator-0.003
 door-0.003
 doors-0.003
Token\n
Token ID198
Feature activation+5.22e-01

Pos logit contributions
opsy+0.027
 herbal+0.025
apa+0.025
ulent+0.025
cious+0.025
Neg logit contributions
 poster-0.029
 bus-0.027
 elevator-0.027
 door-0.026
 doors-0.026
TokenTim
Token ID14967
Feature activation-9.40e-01

Pos logit contributions
opsy+0.028
 herbal+0.026
ulent+0.026
cious+0.026
apa+0.026
Neg logit contributions
 poster-0.024
 door-0.023
 elevator-0.022
 bus-0.022
 doors-0.022
Token and
Token ID290
Feature activation-4.37e-01

Pos logit contributions
 poster+0.037
 elevator+0.033
 bus+0.033
 door+0.032
 doors+0.032
Neg logit contributions
opsy-0.058
cious-0.054
ulent-0.053
 herbal-0.053
apa-0.053
Token Jen
Token ID13364
Feature activation-9.62e-01

Pos logit contributions
 poster+0.021
 elevator+0.019
 door+0.019
 bus+0.019
 doors+0.019
Neg logit contributions
opsy-0.023
 herbal-0.022
ulent-0.022
cious-0.021
apa-0.021
Token learned
Token ID4499
Feature activation-6.19e-01

Pos logit contributions
 poster+0.055
 door+0.053
 bus+0.053
 elevator+0.053
 doors+0.052
Neg logit contributions
opsy-0.041
 herbal-0.038
ulent-0.038
cious-0.036
apa-0.035
Token not
Token ID407
Feature activation-1.63e+00

Pos logit contributions
 elevator+0.003
 bus+0.003
 officer+0.003
 poster+0.003
 door+0.003
Neg logit contributions
opsy-0.002
 herbal-0.002
cious-0.002
-0.002
aults-0.002
Token like
Token ID588
Feature activation+2.09e-01

Pos logit contributions
 poster+0.117
 elevator+0.114
 door+0.112
 doors+0.112
 bus+0.105
Neg logit contributions
cious-0.039
umble-0.039
opsy-0.038
 Cic-0.035
cr-0.035
Token dolls
Token ID36062
Feature activation+1.56e-02

Pos logit contributions
opsy+0.011
 herbal+0.010
 Cic+0.010
cious+0.010
ulent+0.010
Neg logit contributions
 poster-0.009
 elevator-0.008
 bus-0.008
 officer-0.008
 door-0.007
Token.
Token ID13
Feature activation-4.61e-01

Pos logit contributions
opsy+0.001
cious+0.001
ulent+0.001
 herbal+0.001
bowl+0.001
Neg logit contributions
 poster-0.001
 elevator-0.001
 door-0.000
 bus-0.000
 doors-0.000
Token She
Token ID1375
Feature activation-3.80e-01

Pos logit contributions
 poster+0.020
 door+0.019
 elevator+0.019
 doors+0.018
 bus+0.018
Neg logit contributions
opsy-0.026
ulent-0.024
 herbal-0.024
cious-0.024
bowl-0.023
Token says
Token ID1139
Feature activation-8.73e-01

Pos logit contributions
 poster+0.015
 door+0.013
 elevator+0.013
 bus+0.013
 rules+0.012
Neg logit contributions
opsy-0.023
 herbal-0.021
cious-0.021
ulent-0.021
apa-0.021
Token,
Token ID11
Feature activation-7.13e-02

Pos logit contributions
 poster+0.041
 elevator+0.038
 bus+0.038
 door+0.037
 doors+0.036
Neg logit contributions
opsy-0.045
 herbal-0.043
cious-0.043
ulent-0.043
apa-0.042
Token "
Token ID366
Feature activation-7.80e-01

Pos logit contributions
 poster+0.003
 elevator+0.003
 bus+0.003
 door+0.003
 doors+0.003
Neg logit contributions
opsy-0.004
ulent-0.004
 herbal-0.004
cious-0.004
apa-0.003
TokenE
Token ID36
Feature activation-2.01e+00

Pos logit contributions
 poster+0.038
 door+0.035
 elevator+0.035
 bus+0.035
 doors+0.034
Neg logit contributions
opsy-0.040
ulent-0.038
 herbal-0.038
cious-0.037
apa-0.036
Tokenw
Token ID86
Feature activation+9.40e-01

Pos logit contributions
 poster+0.097
 door+0.093
 bus+0.092
 elevator+0.091
 doors+0.091
Neg logit contributions
opsy-0.106
cious-0.098
ulent-0.096
 herbal-0.096
apa-0.092
Token,
Token ID11
Feature activation-7.84e-01

Pos logit contributions
opsy+0.055
ulent+0.052
 herbal+0.052
cious+0.051
apa+0.050
Neg logit contributions
 poster-0.038
 door-0.035
 elevator-0.035
 bus-0.035
 doors-0.034
Token is
Token ID318
Feature activation-1.34e+00

Pos logit contributions
 poster+0.007
 door+0.007
 elevator+0.007
 bus+0.007
 doors+0.006
Neg logit contributions
opsy-0.006
ulent-0.006
 herbal-0.005
cious-0.005
apa-0.005
Token no
Token ID645
Feature activation-4.13e-01

Pos logit contributions
 poster+0.061
 elevator+0.055
 door+0.055
 bus+0.055
 doors+0.053
Neg logit contributions
opsy-0.073
ulent-0.069
 herbal-0.069
cious-0.067
apa-0.067
Token monster
Token ID9234
Feature activation-6.72e-01

Pos logit contributions
 poster+0.019
 elevator+0.017
 officer+0.016
 belts+0.016
 lady+0.016
Neg logit contributions
 herbal-0.019
ulent-0.019
opsy-0.019
bowl-0.018
 roasted-0.018
Token in
Token ID287
Feature activation-2.68e+00

Pos logit contributions
 poster+0.028
 elevator+0.026
 door+0.025
 bus+0.025
 doors+0.025
Neg logit contributions
opsy-0.038
ulent-0.037
cious-0.036
 herbal-0.036
apa-0.035
Token your
Token ID534
Feature activation-1.09e+00

Pos logit contributions
 poster+0.108
 bus+0.100
 elevator+0.098
 lady+0.095
 officer+0.094
Neg logit contributions
opsy-0.145
ifying-0.142
bowl-0.140
 herbal-0.140
ulent-0.138
Token room
Token ID2119
Feature activation-2.88e+00

Pos logit contributions
 poster+0.043
 elevator+0.039
 bus+0.038
 officer+0.038
 lady+0.037
Neg logit contributions
opsy-0.063
 herbal-0.060
cious-0.060
ulent-0.059
bowl-0.058
Token.
Token ID13
Feature activation-6.97e-01

Pos logit contributions
 poster+0.119
 elevator+0.102
 bus+0.100
 officer+0.096
 door+0.095
Neg logit contributions
opsy-0.155
cious-0.153
 herbal-0.151
bowl-0.150
ulent-0.150
Token It
Token ID632
Feature activation-6.35e-01

Pos logit contributions
 poster+0.027
 door+0.027
 bus+0.026
 elevator+0.026
 doors+0.026
Neg logit contributions
opsy-0.042
 herbal-0.039
cious-0.038
ulent-0.038
aults-0.037
Token was
Token ID373
Feature activation-2.13e-01

Pos logit contributions
 poster+0.034
 door+0.032
 elevator+0.032
 bus+0.032
 doors+0.031
Neg logit contributions
opsy-0.029
ulent-0.027
 herbal-0.027
cious-0.027
apa-0.026
Token just
Token ID655
Feature activation-7.42e-01

Pos logit contributions
 poster+0.009
 elevator+0.008
 bus+0.008
 door+0.008
 doors+0.008
Neg logit contributions
opsy-0.012
 herbal-0.012
ulent-0.012
cious-0.011
apa-0.011
Token a
Token ID257
Feature activation-9.58e-01

Pos logit contributions
 poster+0.034
 elevator+0.030
 bus+0.030
 door+0.030
 doors+0.029
Neg logit contributions
opsy-0.039
 herbal-0.038
ulent-0.038
cious-0.037
apa-0.036
Token and
Token ID290
Feature activation-1.38e+00

Pos logit contributions
opsy+0.028
 herbal+0.027
ulent+0.027
cious+0.026
apa+0.026
Neg logit contributions
 poster-0.013
 door-0.011
 bus-0.011
 elevator-0.011
 doors-0.011
Token went
Token ID1816
Feature activation-4.75e-01

Pos logit contributions
 poster+0.062
 door+0.058
 bus+0.058
 elevator+0.058
 doors+0.056
Neg logit contributions
opsy-0.075
ulent-0.070
 herbal-0.070
cious-0.069
apa-0.068
Token to
Token ID284
Feature activation-2.85e-01

Pos logit contributions
 poster+0.021
 elevator+0.019
 doors+0.018
 bus+0.017
 door+0.017
Neg logit contributions
opsy-0.025
 herbal-0.024
bowl-0.023
ulent-0.023
apa-0.023
Token Anna
Token ID11735
Feature activation-1.15e+00

Pos logit contributions
 poster+0.013
 door+0.012
 bus+0.012
 elevator+0.012
 doors+0.012
Neg logit contributions
opsy-0.016
ulent-0.015
cious-0.015
 herbal-0.015
apa-0.014
Token's
Token ID338
Feature activation+1.91e+00

Pos logit contributions
 poster+0.051
 door+0.049
 bus+0.048
 elevator+0.046
 doors+0.046
Neg logit contributions
opsy-0.064
ulent-0.060
 herbal-0.060
cious-0.059
apa-0.057
Token room
Token ID2119
Feature activation-2.69e+00

Pos logit contributions
opsy+0.112
 herbal+0.107
ulent+0.104
cious+0.102
bowl+0.102
Neg logit contributions
 poster-0.076
 elevator-0.069
 bus-0.067
 doors-0.065
 door-0.065
Token.
Token ID13
Feature activation-7.47e-01

Pos logit contributions
 poster+0.121
 elevator+0.104
 bus+0.100
 doors+0.098
 officer+0.098
Neg logit contributions
opsy-0.138
 herbal-0.136
ulent-0.134
apa-0.133
cious-0.132
Token They
Token ID1119
Feature activation+6.54e-01

Pos logit contributions
 poster+0.039
 door+0.037
 bus+0.036
 elevator+0.036
 doors+0.035
Neg logit contributions
opsy-0.036
ulent-0.033
 herbal-0.033
cious-0.033
apa-0.032
Token played
Token ID2826
Feature activation+8.33e-01

Pos logit contributions
opsy+0.034
ulent+0.033
 herbal+0.032
cious+0.032
apa+0.031
Neg logit contributions
 poster-0.030
 bus-0.030
 elevator-0.030
 door-0.029
 doors-0.029
Token happily
Token ID18177
Feature activation-2.56e+00

Pos logit contributions
opsy+0.043
 herbal+0.041
ulent+0.040
cious+0.040
bowl+0.039
Neg logit contributions
 poster-0.039
 elevator-0.036
 bus-0.036
 door-0.035
 doors-0.035
Token and
Token ID290
Feature activation-6.64e-01

Pos logit contributions
 poster+0.110
 door+0.102
 bus+0.101
 elevator+0.101
 doors+0.098
Neg logit contributions
opsy-0.148
 herbal-0.138
ulent-0.138
apa-0.135
cious-0.135
Token box
Token ID3091
Feature activation-1.81e+00

Pos logit contributions
opsy+0.011
 herbal+0.011
ulent+0.011
cious+0.011
bowl+0.010
Neg logit contributions
 poster-0.004
 bus-0.003
 elevator-0.003
 officer-0.003
 doors-0.003
Token.
Token ID13
Feature activation+4.28e-01

Pos logit contributions
 poster+0.063
 bus+0.056
 elevator+0.056
 door+0.054
 doors+0.052
Neg logit contributions
opsy-0.115
 herbal-0.109
ulent-0.109
cious-0.108
apa-0.107
Token It
Token ID632
Feature activation-3.92e-01

Pos logit contributions
opsy+0.024
ulent+0.023
 herbal+0.023
cious+0.022
apa+0.022
Neg logit contributions
 poster-0.019
 door-0.018
 elevator-0.017
 bus-0.017
 doors-0.017
Token was
Token ID373
Feature activation+2.47e-01

Pos logit contributions
 poster+0.018
 door+0.017
 elevator+0.017
 doors+0.016
 bus+0.016
Neg logit contributions
opsy-0.022
ulent-0.021
cious-0.020
 herbal-0.020
apa-0.020
Token a
Token ID257
Feature activation-4.37e-01

Pos logit contributions
opsy+0.015
 herbal+0.014
ulent+0.014
cious+0.014
apa+0.013
Neg logit contributions
 poster-0.010
 elevator-0.009
 bus-0.009
 door-0.009
 doors-0.009
Token mystery
Token ID10715
Feature activation-2.71e+00

Pos logit contributions
 poster+0.017
 elevator+0.016
 doors+0.016
 bus+0.016
 door+0.015
Neg logit contributions
opsy-0.025
 herbal-0.024
ulent-0.023
cious-0.023
apa-0.022
Token.
Token ID13
Feature activation+5.32e-01

Pos logit contributions
 poster+0.096
 bus+0.083
 elevator+0.080
 yourselves+0.080
four+0.080
Neg logit contributions
 herbal-0.149
comb-0.144
opsy-0.143
-0.142
cious-0.141
Token\n
Token ID198
Feature activation+9.94e-01

Pos logit contributions
opsy+0.030
ulent+0.028
 herbal+0.028
cious+0.028
apa+0.027
Neg logit contributions
 poster-0.023
 door-0.022
 elevator-0.022
 bus-0.021
 doors-0.021
Token\n
Token ID198
Feature activation+9.65e-01

Pos logit contributions
opsy+0.049
 herbal+0.045
ulent+0.044
cious+0.044
apa+0.044
Neg logit contributions
 poster-0.050
 elevator-0.047
 bus-0.046
 door-0.046
 doors-0.045
TokenTom
Token ID13787
Feature activation-7.83e-01

Pos logit contributions
opsy+0.064
ulent+0.060
 herbal+0.060
apa+0.059
cious+0.058
Neg logit contributions
 poster-0.033
 bus-0.031
 elevator-0.030
 door-0.030
 doors-0.029
Token said
Token ID531
Feature activation+2.93e-01

Pos logit contributions
 poster+0.032
 door+0.030
 elevator+0.030
 bus+0.030
 doors+0.029
Neg logit contributions
opsy-0.046
ulent-0.043
 herbal-0.043
cious-0.042
apa-0.042
Token waiting
Token ID4953
Feature activation+1.12e+00

Pos logit contributions
opsy+0.055
 herbal+0.051
ulent+0.051
cious+0.050
apa+0.050
Neg logit contributions
 poster-0.039
 door-0.038
 bus-0.037
 elevator-0.036
 doors-0.036
Token for
Token ID329
Feature activation-8.21e-01

Pos logit contributions
opsy+0.051
ifying+0.051
icking+0.050
 herbal+0.049
apa+0.048
Neg logit contributions
 poster-0.054
 officer-0.045
 elevator-0.044
iren-0.043
 rules-0.043
Token him
Token ID683
Feature activation-2.81e-01

Pos logit contributions
 poster+0.042
 elevator+0.038
 bus+0.037
 door+0.037
 doors+0.036
Neg logit contributions
opsy-0.041
ulent-0.038
 herbal-0.038
apa-0.037
cious-0.037
Token at
Token ID379
Feature activation-9.20e-01

Pos logit contributions
 poster+0.010
 elevator+0.009
 door+0.009
 bus+0.009
 doors+0.008
Neg logit contributions
opsy-0.018
ulent-0.017
 herbal-0.017
cious-0.017
apa-0.017
Token the
Token ID262
Feature activation-1.20e+00

Pos logit contributions
 poster+0.047
 officer+0.045
 bus+0.044
 belts+0.044
iren+0.043
Neg logit contributions
 Mitt-0.035
 herbal-0.033
opsy-0.033
 roasted-0.033
 Bro-0.032
Token front
Token ID2166
Feature activation-2.62e+00

Pos logit contributions
 poster+0.045
 doors+0.040
 elevator+0.039
 bus+0.039
 officer+0.038
Neg logit contributions
opsy-0.070
 herbal-0.068
ulent-0.065
cious-0.064
 colony-0.063
Token door
Token ID3420
Feature activation-3.91e+00

Pos logit contributions
 poster+0.109
 officer+0.089
ep+0.089
 elevator+0.088
 yourselves+0.086
Neg logit contributions
 colony-0.136
-0.135
 herbal-0.132
apa-0.131
opsy-0.131
Token.
Token ID13
Feature activation+1.37e-01

Pos logit contributions
 poster+0.170
 elevator+0.145
 bus+0.140
 rules+0.135
 doors+0.134
Neg logit contributions
opsy-0.201
cious-0.197
apa-0.197
ulent-0.197
-0.196
Token She
Token ID1375
Feature activation-1.39e-01

Pos logit contributions
opsy+0.007
ulent+0.006
 herbal+0.006
cious+0.006
apa+0.006
Neg logit contributions
 poster-0.007
 door-0.006
 elevator-0.006
 bus-0.006
 doors-0.006
Token was
Token ID373
Feature activation+4.10e-01

Pos logit contributions
 poster+0.007
 elevator+0.007
 door+0.006
 bus+0.006
 doors+0.006
Neg logit contributions
opsy-0.007
 herbal-0.006
ulent-0.006
cious-0.006
apa-0.006
Token happy
Token ID3772
Feature activation+8.58e-01

Pos logit contributions
opsy+0.021
ulent+0.020
 herbal+0.020
cious+0.020
apa+0.019
Neg logit contributions
 poster-0.020
 bus-0.018
 elevator-0.018
 door-0.018
 doors-0.017
Token "
Token ID366
Feature activation-9.56e-01

Pos logit contributions
 poster+0.017
 elevator+0.016
 bus+0.016
 door+0.016
 doors+0.015
Neg logit contributions
opsy-0.024
ulent-0.023
 herbal-0.022
cious-0.022
apa-0.022
TokenIt
Token ID1026
Feature activation-7.35e-01

Pos logit contributions
 poster+0.049
 bus+0.047
 elevator+0.047
 door+0.047
 doors+0.045
Neg logit contributions
opsy-0.047
 herbal-0.043
cious-0.042
ulent-0.042
apa-0.040
Token's
Token ID338
Feature activation+1.50e+00

Pos logit contributions
 poster+0.041
 door+0.039
 elevator+0.038
?"+0.038
 bus+0.037
Neg logit contributions
opsy-0.034
ulent-0.033
cious-0.031
ifying-0.031
apa-0.030
Token okay
Token ID8788
Feature activation+3.03e+00

Pos logit contributions
opsy+0.053
cious+0.050
ulent+0.049
 herbal+0.049
apa+0.047
Neg logit contributions
 poster-0.094
 door-0.093
 elevator-0.091
?"-0.090
 doors-0.090
Token,
Token ID11
Feature activation-9.14e-01

Pos logit contributions
opsy+0.152
 herbal+0.137
ulent+0.137
apa+0.136
 Cic+0.134
Neg logit contributions
 poster-0.154
 elevator-0.145
 bus-0.142
 doors-0.136
 officer-0.134
Token Lily
Token ID20037
Feature activation-2.72e+00

Pos logit contributions
 poster+0.039
 elevator+0.037
 bus+0.036
 doors+0.035
 door+0.034
Neg logit contributions
opsy-0.050
 herbal-0.046
 Mitt-0.046
 Cic-0.045
ulent-0.044
Token.
Token ID13
Feature activation-1.88e+00

Pos logit contributions
 poster+0.115
 elevator+0.106
 bus+0.104
 doors+0.101
 door+0.097
Neg logit contributions
opsy-0.154
ulent-0.145
cious-0.145
apa-0.144
 herbal-0.142
Token It
Token ID632
Feature activation-1.32e+00

Pos logit contributions
 poster+0.061
 door+0.055
 elevator+0.054
 doors+0.051
?"+0.051
Neg logit contributions
opsy-0.128
ulent-0.121
cious-0.120
 herbal-0.120
apa-0.117
Token's
Token ID338
Feature activation+1.15e+00

Pos logit contributions
 poster+0.075
 door+0.070
 elevator+0.070
 bus+0.068
 doors+0.067
Neg logit contributions
opsy-0.060
ulent-0.057
cious-0.055
apa-0.053
 herbal-0.053
Token just
Token ID655
Feature activation-1.51e+00

Pos logit contributions
opsy+0.049
ulent+0.045
cious+0.045
 herbal+0.045
apa+0.043
Neg logit contributions
 poster-0.066
 door-0.063
 elevator-0.062
 bus-0.062
 doors-0.061
Token a
Token ID257
Feature activation-1.47e+00

Pos logit contributions
 poster+0.066
 bus+0.061
 door+0.061
 elevator+0.061
 doors+0.059
Neg logit contributions
opsy-0.085
 herbal-0.080
ulent-0.080
cious-0.079
apa-0.077