TOP ACTIVATIONS
MAX = 5.806

." " Can we learn her language ?" Lily
. " Can we try some ?" Lily asks
. " Can we help you with the sign
ara says , " Can we eat it ?"
" Sure , you can each choose one balloon ," mom
. " Can we have some ?" Tom asked
. " Can we eat it ?" Lily asked
." " Can we open it ?" Ben asks
." " Can we fly with it ?" Tom
!" " Can we go and see ?" Lily
. " Can we make it longer and higher
. " Can we play with it ?" Sam
. " Can we go outside and play in
. " Can we slide down the snow ,
. " Can we feed it ?" Sam asks
" What is it ?" Anna asks .
Ben said . " Can we swim in the water ,
sweet ie . You can each choose one balloon ," Mom
. " Can we feed it ?" Lily asked
Tom says . " Can we see the photos ?"

MIN ACTIVATIONS
MIN = -6.886

She had a big magn ifying glass that made things look
it she had a magn ifying glass , a tiny torch
. Lily had a magn ifying glass that she found in
When Daisy put the magn ifying glass down , the cater
examine it with the magn ifying glass ?" "
a flashlight and a magn ifying glass to examine their finds
. They had a magn ifying glass and a small box
flashlight , and a magn ifying glass . As
hat and grabbed her magn ifying glass . She was ready
the necklace and the magn ifying glass and told him that
and handed him the magn ifying glass . He
the necklace and the magn ifying glass and started to examine
cried , dropping the magn ifying glass . "
borrowed her mom 's magn ifying glass to take a closer
, handing him the magn ifying glass . She looked around
the watch with the magn ifying glass . He saw the
He took out a magn ifying glass and a notebook and
He pointed his magn ifying glass at different classmates and
said , holding the magn ifying glass over a black ant
She took out a magn ifying glass and looked very carefully

INTERVAL 2.633 - 5.806
CONTAINS 0.934%

something delicious with it ". Tom nodded and
his mom , " Can I keep it ?"
. I will watch cartoons all day and all night .
. <|endoftext|> Once upon a time , there was a little
The sk unk thanked the box for the exciting ride

INTERVAL -0.540 - 2.633
CONTAINS 69.268%

others were mean and made fun of the ugly pupil ,
huge shadow . She looked up and saw a huge dog
not want to sell their games . They love all their
water . But their moms say no . They say the
Squirrel sh ivered and said " I 'm so scared !

INTERVAL -3.713 - -0.540
CONTAINS 29.772%

They both smiled and laughed , and started to
still loved playing in the snow . <|endoftext|> Once upon a
got tough . One day , they were walking
was only three years old and loved everyone . One day
, he remembered leaving it on top of his dress er

INTERVAL -6.886 - -3.713
CONTAINS 0.025%

nice . Like our t eddy bears ." L
garden that are not good to eat . They can make
Lily noticed that her t eddy bear 's arm was torn
. They found his t eddy bear on his bed .
She could still read her diary while she was playing !
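
NOTE ON INTERVALS
The four INTERVAL ranges above are consistent with splitting the span from MIN (-6.886) to MAX (5.806) into four equal-width bins of width 3.173. The source does not say how the bins were chosen, so equal-width binning is an assumption here. The sketch below reproduces the edges under that assumption; the function and variable names are hypothetical and are not taken from the tool that produced this listing.

```python
def interval_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced edges from lo to hi, rounded to 3 decimals."""
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


# MIN and MAX as reported in the listing.
MIN_ACT, MAX_ACT = -6.886, 5.806

# Assumption: four equal-width bins (not stated in the source).
edges = interval_edges(MIN_ACT, MAX_ACT, 4)

# Pair consecutive edges and list the highest interval first, as the listing does.
intervals = list(zip(edges[:-1], edges[1:]))[::-1]

# CONTAINS percentages as reported, highest interval first.
shares = [0.934, 69.268, 29.772, 0.025]
total_share = round(sum(shares), 3)

for (low, high), share in zip(intervals, shares):
    print(f"INTERVAL {low:.3f} - {high:.3f}  CONTAINS {share:.3f}%")
print(f"TOTAL {total_share:.3f}%")
```

The reported shares add up to 99.999%, so the four intervals appear to cover the whole activation range, with the small shortfall presumably due to rounding.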
Token."
Token ID526
Feature activation+1.08e+00

Pos logit contributions
 stranger+0.020
guard+0.020
 ambulance+0.019
 answer+0.019
 cop+0.018
Neg logit contributions
icol-0.015
und-0.014
text-0.014
bed-0.014
ra-0.014
Token
Token ID198
Feature activation+5.09e-01

Pos logit contributions
 lady+0.023
 stranger+0.023
 driver+0.023
 man+0.022
 ambulance+0.022
Neg logit contributions
icol-0.029
ifying-0.027
ra-0.027
bed-0.026
text-0.026
Token
Token ID198
Feature activation+7.68e-01

Pos logit contributions
 stranger+0.014
guard+0.013
 answer+0.013
 ambulance+0.013
 cop+0.013
Neg logit contributions
icol-0.010
text-0.009
und-0.009
bed-0.009
ifying-0.009
Token"
Token ID1
Feature activation+7.82e-02

Pos logit contributions
 stranger+0.018
 answer+0.018
guard+0.018
 ambulance+0.017
 seller+0.017
Neg logit contributions
icol-0.017
text-0.016
und-0.016
bed-0.016
ra-0.015
TokenCan
Token ID6090
Feature activation+4.91e-01

Pos logit contributions
 ambulance+0.002
 answer+0.002
 stranger+0.002
guard+0.002
old+0.002
Neg logit contributions
icol-0.001
und-0.001
bed-0.001
ra-0.001
ifying-0.001
Token we
Token ID356
Feature activation+5.81e+00

Pos logit contributions
 stranger+0.016
 answer+0.016
 ambulance+0.015
guard+0.015
 pastor+0.015
Neg logit contributions
icol-0.006
 P-0.006
ra-0.006
bed-0.005
text-0.005
Token learn
Token ID2193
Feature activation+1.15e+00

Pos logit contributions
guard+0.129
 stranger+0.122
 pastor+0.121
vern+0.117
old+0.116
Neg logit contributions
ra-0.140
ifying-0.127
 P-0.125
 M-0.123
bed-0.122
Token her
Token ID607
Feature activation-3.20e-01

Pos logit contributions
 stranger+0.029
guard+0.029
 pastor+0.028
 seller+0.027
 ambulance+0.027
Neg logit contributions
 P-0.020
und-0.020
 M-0.020
icol-0.019
text-0.019
Token language
Token ID3303
Feature activation+5.54e-01

Pos logit contributions
icol+0.007
text+0.007
bed+0.007
ra+0.007
und+0.007
Neg logit contributions
guard-0.007
 stranger-0.007
 ambulance-0.007
 pastor-0.006
 officer-0.006
Token?"
Token ID1701
Feature activation+1.53e+00

Pos logit contributions
 stranger+0.013
guard+0.013
 pastor+0.013
 ambulance+0.013
 seller+0.012
Neg logit contributions
bed-0.010
und-0.010
text-0.010
icol-0.010
ra-0.010
Token Lily
Token ID20037
Feature activation+5.64e-01

Pos logit contributions
 stranger+0.045
 answer+0.043
guard+0.043
 ambulance+0.042
 cop+0.042
Neg logit contributions
icol-0.030
ifying-0.027
text-0.027
und-0.027
bed-0.026
Token.
Token ID13
Feature activation+1.11e+00

Pos logit contributions
icol+0.001
text+0.001
und+0.001
bed+0.001
ra+0.001
Neg logit contributions
 stranger-0.001
guard-0.001
 answer-0.001
 ambulance-0.001
 cop-0.001
Token
Token ID198
Feature activation+8.27e-01

Pos logit contributions
 stranger+0.027
guard+0.025
 answer+0.025
 cop+0.025
 lady+0.025
Neg logit contributions
icol-0.027
text-0.025
und-0.024
ra-0.024
bed-0.024
Token
Token ID198
Feature activation+9.77e-01

Pos logit contributions
 stranger+0.021
guard+0.020
 answer+0.020
 ambulance+0.020
 driver+0.019
Neg logit contributions
icol-0.019
ifying-0.017
bed-0.016
und-0.016
text-0.016
Token"
Token ID1
Feature activation-4.47e-01

Pos logit contributions
 stranger+0.020
guard+0.020
 answer+0.019
 ambulance+0.019
 cop+0.018
Neg logit contributions
icol-0.026
text-0.024
und-0.024
bed-0.024
ifying-0.023
TokenCan
Token ID6090
Feature activation+6.77e-01

Pos logit contributions
icol+0.008
text+0.007
bed+0.007
ra+0.007
ifying+0.007
Neg logit contributions
 ambulance-0.012
 answer-0.012
 stranger-0.012
guard-0.012
 officer-0.011
Token we
Token ID356
Feature activation+5.71e+00

Pos logit contributions
 stranger+0.018
guard+0.018
 answer+0.018
 ambulance+0.017
 seller+0.017
Neg logit contributions
icol-0.014
text-0.012
und-0.012
bed-0.012
ifying-0.012
Token try
Token ID1949
Feature activation+1.89e+00

Pos logit contributions
 stranger+0.123
guard+0.122
 ambulance+0.117
 pastor+0.114
 cop+0.113
Neg logit contributions
icol-0.136
ifying-0.132
ra-0.129
text-0.129
 P-0.129
Token some
Token ID617
Feature activation+2.73e+00

Pos logit contributions
 stranger+0.050
guard+0.049
 answer+0.048
 ambulance+0.048
 seller+0.046
Neg logit contributions
icol-0.038
text-0.035
und-0.035
bed-0.034
ifying-0.033
Token?"
Token ID1701
Feature activation+1.53e+00

Pos logit contributions
 stranger+0.067
guard+0.065
 cop+0.064
 officer+0.064
 firefighter+0.063
Neg logit contributions
icol-0.056
text-0.054
 P-0.053
bed-0.052
 M-0.052
Token Lily
Token ID20037
Feature activation+1.18e+00

Pos logit contributions
 stranger+0.045
 answer+0.044
guard+0.043
 lady+0.043
 man+0.042
Neg logit contributions
icol-0.030
und-0.028
ifying-0.027
text-0.026
bed-0.026
Token asks
Token ID7893
Feature activation+8.09e-01

Pos logit contributions
guard+0.028
 stranger+0.027
vern+0.025
 ambulance+0.025
 cop+0.024
Neg logit contributions
icol-0.026
 M-0.024
 P-0.024
bed-0.024
und-0.024
Token.
Token ID13
Feature activation+2.01e+00

Pos logit contributions
 stranger+0.030
guard+0.029
 pastor+0.028
 ambulance+0.028
 answer+0.028
Neg logit contributions
icol-0.027
bed-0.026
 P-0.026
text-0.025
und-0.025
Token
Token ID198
Feature activation+1.41e+00

Pos logit contributions
 stranger+0.058
guard+0.058
 answer+0.056
 ambulance+0.056
 pastor+0.054
Neg logit contributions
icol-0.035
text-0.032
und-0.031
bed-0.031
ifying-0.029
Token
Token ID198
Feature activation+1.34e+00

Pos logit contributions
 stranger+0.040
guard+0.040
 answer+0.039
 ambulance+0.038
 cop+0.037
Neg logit contributions
icol-0.026
text-0.023
und-0.023
bed-0.022
ifying-0.022
Token"
Token ID1
Feature activation+1.11e-01

Pos logit contributions
 stranger+0.031
 answer+0.030
 ambulance+0.029
guard+0.029
 pastor+0.028
Neg logit contributions
icol-0.031
und-0.030
text-0.030
bed-0.029
ra-0.029
TokenCan
Token ID6090
Feature activation+8.74e-01

Pos logit contributions
 ambulance+0.003
 answer+0.003
 owl+0.003
 stranger+0.003
 dentist+0.003
Neg logit contributions
bed-0.002
 Neither-0.002
ra-0.002
 P-0.002
sc-0.002
Token we
Token ID356
Feature activation+5.70e+00

Pos logit contributions
 stranger+0.027
 answer+0.027
guard+0.027
 ambulance+0.026
 pastor+0.026
Neg logit contributions
icol-0.013
text-0.012
bed-0.011
ra-0.011
und-0.011
Token help
Token ID1037
Feature activation+1.60e+00

Pos logit contributions
 stranger+0.125
guard+0.123
 pastor+0.116
 ambulance+0.111
vern+0.109
Neg logit contributions
ra-0.140
ifying-0.132
 M-0.128
 P-0.127
bed-0.126
Token you
Token ID345
Feature activation+3.11e+00

Pos logit contributions
 stranger+0.050
guard+0.048
 seller+0.047
 pastor+0.047
 ambulance+0.047
Neg logit contributions
bed-0.023
ra-0.022
 P-0.021
 M-0.021
icol-0.021
Token with
Token ID351
Feature activation+1.38e+00

Pos logit contributions
 stranger+0.083
guard+0.079
 pastor+0.077
 officer+0.073
 gentleman+0.072
Neg logit contributions
bed-0.065
und-0.062
ra-0.060
 P-0.057
 M-0.056
Token the
Token ID262
Feature activation+1.98e+00

Pos logit contributions
 stranger+0.042
guard+0.041
 answer+0.040
 ambulance+0.040
 seller+0.039
Neg logit contributions
icol-0.022
text-0.020
und-0.020
bed-0.020
ra-0.019
Token sign
Token ID1051
Feature activation+2.43e+00

Pos logit contributions
guard+0.052
 stranger+0.050
old+0.049
 seller+0.047
vern+0.046
Neg logit contributions
ra-0.033
icol-0.033
 hands-0.032
ifying-0.030
 P-0.030
Tokenara
Token ID3301
Feature activation+1.62e-01

Pos logit contributions
 stranger+0.003
 ambulance+0.003
guard+0.003
 cop+0.003
 answer+0.003
Neg logit contributions
icol-0.002
ra-0.002
text-0.002
und-0.002
bed-0.001
Token says
Token ID1139
Feature activation+6.54e-02

Pos logit contributions
guard+0.004
 stranger+0.004
 ambulance+0.004
vern+0.004
 cop+0.004
Neg logit contributions
icol-0.003
ra-0.003
text-0.003
bed-0.003
und-0.003
Token,
Token ID11
Feature activation+1.43e-01

Pos logit contributions
 stranger+0.002
guard+0.002
 ambulance+0.002
 answer+0.002
 pastor+0.002
Neg logit contributions
icol-0.001
bed-0.001
text-0.001
und-0.001
 P-0.001
Token "
Token ID366
Feature activation+4.00e-01

Pos logit contributions
 stranger+0.003
guard+0.003
 answer+0.003
 ambulance+0.003
 cop+0.003
Neg logit contributions
icol-0.003
text-0.003
und-0.003
bed-0.003
ifying-0.003
TokenCan
Token ID6090
Feature activation+8.56e-01

Pos logit contributions
 stranger+0.010
guard+0.010
 answer+0.009
 seller+0.009
 ambulance+0.009
Neg logit contributions
icol-0.009
text-0.009
und-0.008
bed-0.008
ifying-0.008
Token we
Token ID356
Feature activation+5.69e+00

Pos logit contributions
 stranger+0.023
guard+0.023
 answer+0.022
 ambulance+0.022
 cop+0.021
Neg logit contributions
icol-0.018
text-0.016
und-0.016
bed-0.015
ifying-0.015
Token eat
Token ID4483
Feature activation+1.20e+00

Pos logit contributions
guard+0.122
 stranger+0.122
 pastor+0.115
 ambulance+0.113
 cop+0.108
Neg logit contributions
icol-0.138
ra-0.136
 P-0.135
bed-0.133
ifying-0.133
Token it
Token ID340
Feature activation+3.00e+00

Pos logit contributions
 stranger+0.032
guard+0.032
 answer+0.031
 ambulance+0.031
 seller+0.030
Neg logit contributions
icol-0.024
text-0.022
und-0.022
bed-0.021
ifying-0.021
Token?"
Token ID1701
Feature activation+2.12e+00

Pos logit contributions
 stranger+0.071
guard+0.069
 pastor+0.067
 ambulance+0.065
 officer+0.064
Neg logit contributions
bed-0.065
icol-0.065
 P-0.064
und-0.063
text-0.062
Token
Token ID198
Feature activation+1.08e+00

Pos logit contributions
 stranger+0.051
 answer+0.050
guard+0.048
 ambulance+0.048
 lady+0.048
Neg logit contributions
icol-0.050
ifying-0.046
text-0.045
und-0.045
ra-0.044
Token
Token ID198
Feature activation+1.14e+00

Pos logit contributions
 stranger+0.032
guard+0.032
 answer+0.031
 ambulance+0.031
 seller+0.030
Neg logit contributions
icol-0.018
text-0.017
und-0.016
bed-0.016
ra-0.015
Token"
Token ID1
Feature activation-2.82e-01

Pos logit contributions
 stranger+0.026
 answer+0.026
 ambulance+0.025
guard+0.024
 pastor+0.024
Neg logit contributions
ra-0.020
bed-0.020
text-0.019
icol-0.019
und-0.019
TokenSure
Token ID19457
Feature activation-3.86e-01

Pos logit contributions
sc+0.004
 P+0.004
bed+0.004
ifying+0.004
 Neither+0.004
Neg logit contributions
 ambulance-0.008
 answer-0.007
 owl-0.007
 stranger-0.007
 firefighter-0.007
Token,
Token ID11
Feature activation+3.42e-01

Pos logit contributions
icol+0.010
und+0.010
text+0.010
bed+0.010
ifying+0.009
Neg logit contributions
 stranger-0.008
guard-0.008
 answer-0.008
 ambulance-0.007
 cop-0.007
Token you
Token ID345
Feature activation+1.81e+00

Pos logit contributions
 stranger+0.008
 answer+0.008
guard+0.008
 ambulance+0.007
 lady+0.007
Neg logit contributions
icol-0.009
und-0.009
ifying-0.008
bed-0.008
text-0.008
Token can
Token ID460
Feature activation+1.04e+00

Pos logit contributions
 stranger+0.038
 answer+0.037
guard+0.036
 ambulance+0.036
 cop+0.035
Neg logit contributions
icol-0.048
und-0.045
text-0.044
bed-0.044
keeping-0.042
Token each
Token ID1123
Feature activation+5.68e+00

Pos logit contributions
 stranger+0.028
 pastor+0.026
guard+0.026
 ambulance+0.026
 firefighter+0.026
Neg logit contributions
ra-0.022
bed-0.021
und-0.019
 hands-0.019
 P-0.019
Token choose
Token ID3853
Feature activation+9.07e-01

Pos logit contributions
guard+0.126
 stranger+0.124
 ambulance+0.122
vern+0.121
 pastor+0.120
Neg logit contributions
ra-0.121
bed-0.114
 hands-0.114
text-0.114
und-0.109
Token one
Token ID530
Feature activation+1.21e+00

Pos logit contributions
 stranger+0.026
 ambulance+0.025
guard+0.025
 firefighter+0.025
 pastor+0.025
Neg logit contributions
ra-0.014
ifying-0.013
bed-0.013
 P-0.013
icol-0.012
Token balloon
Token ID21190
Feature activation-2.47e-01

Pos logit contributions
 ambulance+0.030
 stranger+0.030
 pastor+0.030
 officer+0.029
guard+0.029
Neg logit contributions
ra-0.022
text-0.021
bed-0.021
ifying-0.020
 hands-0.020
Token,"
Token ID553
Feature activation+2.94e-01

Pos logit contributions
icol+0.005
ra+0.005
bed+0.005
text+0.005
und+0.005
Neg logit contributions
 stranger-0.006
 ambulance-0.006
 pastor-0.006
 answer-0.006
guard-0.006
Token mom
Token ID1995
Feature activation+2.89e+00

Pos logit contributions
 stranger+0.008
guard+0.008
 answer+0.008
 ambulance+0.008
 seller+0.008
Neg logit contributions
icol-0.006
und-0.005
text-0.005
bed-0.005
ifying-0.005
Token.
Token ID13
Feature activation+1.31e+00

Pos logit contributions
 stranger+0.004
guard+0.004
 answer+0.003
 ambulance+0.003
 cop+0.003
Neg logit contributions
icol-0.004
text-0.004
bed-0.004
und-0.004
ra-0.004
Token
Token ID198
Feature activation+9.84e-01

Pos logit contributions
 stranger+0.034
guard+0.033
 answer+0.032
 cop+0.032
 ambulance+0.032
Neg logit contributions
icol-0.029
und-0.026
text-0.026
bed-0.025
ra-0.025
Token
Token ID198
Feature activation+1.02e+00

Pos logit contributions
 stranger+0.027
guard+0.027
 answer+0.026
 ambulance+0.026
 cop+0.025
Neg logit contributions
icol-0.019
text-0.017
und-0.017
bed-0.017
ifying-0.017
Token"
Token ID1
Feature activation-7.81e-02

Pos logit contributions
 answer+0.025
 stranger+0.025
 ambulance+0.025
guard+0.024
 ring+0.023
Neg logit contributions
icol-0.021
text-0.020
und-0.020
ra-0.019
bed-0.019
TokenCan
Token ID6090
Feature activation+7.24e-01

Pos logit contributions
icol+0.001
ifying+0.001
und+0.001
text+0.001
ra+0.001
Neg logit contributions
 ambulance-0.002
 answer-0.002
 stranger-0.002
guard-0.002
 owl-0.002
Token we
Token ID356
Feature activation+5.66e+00

Pos logit contributions
 stranger+0.021
guard+0.020
 answer+0.020
 ambulance+0.020
 seller+0.019
Neg logit contributions
icol-0.013
text-0.012
und-0.012
bed-0.012
ifying-0.011
Token have
Token ID423
Feature activation+1.48e+00

Pos logit contributions
guard+0.133
 stranger+0.130
 pastor+0.130
 ambulance+0.123
vern+0.119
Neg logit contributions
ra-0.125
ifying-0.122
 P-0.119
 M-0.118
und-0.116
Token some
Token ID617
Feature activation+1.90e+00

Pos logit contributions
guard+0.047
 stranger+0.047
 pastor+0.045
 ambulance+0.045
 answer+0.045
Neg logit contributions
icol-0.019
 P-0.018
text-0.018
bed-0.017
und-0.017
Token?"
Token ID1701
Feature activation+1.38e+00

Pos logit contributions
 pastor+0.050
 officer+0.050
guard+0.050
 professor+0.049
 stranger+0.049
Neg logit contributions
 hands-0.033
 P-0.032
 M-0.032
und-0.032
 pic-0.032
Token Tom
Token ID4186
Feature activation+1.61e+00

Pos logit contributions
 stranger+0.043
guard+0.042
 answer+0.042
 ambulance+0.041
 cop+0.040
Neg logit contributions
icol-0.023
und-0.021
text-0.020
bed-0.020
ifying-0.019
Token asked
Token ID1965
Feature activation+4.75e-01

Pos logit contributions
guard+0.039
 stranger+0.038
 ambulance+0.037
 pastor+0.035
 answer+0.035
Neg logit contributions
bed-0.033
icol-0.033
und-0.032
text-0.031
ra-0.031
Token.
Token ID13
Feature activation+1.73e+00

Pos logit contributions
 stranger+0.008
guard+0.007
 answer+0.007
 ambulance+0.007
 cop+0.007
Neg logit contributions
icol-0.009
text-0.009
und-0.009
bed-0.009
ifying-0.008
Token
Token ID198
Feature activation+1.34e+00

Pos logit contributions
 stranger+0.039
guard+0.037
 answer+0.037
 cop+0.036
 ambulance+0.036
Neg logit contributions
icol-0.044
text-0.041
und-0.041
ra-0.039
bed-0.039
Token
Token ID198
Feature activation+1.44e+00

Pos logit contributions
 stranger+0.035
guard+0.035
 answer+0.034
 ambulance+0.034
 cop+0.033
Neg logit contributions
icol-0.028
text-0.025
und-0.025
ifying-0.024
bed-0.024
Token"
Token ID1
Feature activation-8.16e-02

Pos logit contributions
 stranger+0.032
 answer+0.031
guard+0.031
 ambulance+0.031
 pastor+0.029
Neg logit contributions
icol-0.035
text-0.033
und-0.033
bed-0.032
ra-0.032
TokenCan
Token ID6090
Feature activation+1.06e+00

Pos logit contributions
icol+0.002
text+0.001
und+0.001
bed+0.001
ifying+0.001
Neg logit contributions
 stranger-0.002
guard-0.002
 answer-0.002
 ambulance-0.002
 cop-0.002
Token we
Token ID356
Feature activation+5.64e+00

Pos logit contributions
guard+0.027
 stranger+0.026
 answer+0.025
 ambulance+0.025
 seller+0.025
Neg logit contributions
icol-0.025
und-0.022
text-0.022
keeping-0.021
bed-0.021
Token eat
Token ID4483
Feature activation+9.37e-01

Pos logit contributions
guard+0.138
 stranger+0.136
 ambulance+0.126
 pastor+0.126
vern+0.124
Neg logit contributions
ra-0.124
ifying-0.120
 M-0.116
bed-0.113
und-0.113
Token it
Token ID340
Feature activation+3.16e+00

Pos logit contributions
 stranger+0.030
guard+0.030
 answer+0.029
 ambulance+0.029
 cop+0.028
Neg logit contributions
icol-0.014
text-0.012
und-0.012
bed-0.012
ra-0.011
Token?"
Token ID1701
Feature activation+1.98e+00

Pos logit contributions
 stranger+0.080
guard+0.077
 pastor+0.075
 officer+0.074
 ambulance+0.073
Neg logit contributions
bed-0.064
icol-0.062
und-0.062
 P-0.061
 M-0.060
Token Lily
Token ID20037
Feature activation+1.06e+00

Pos logit contributions
 stranger+0.056
 lady+0.055
 man+0.054
 answer+0.054
 driver+0.054
Neg logit contributions
icol-0.044
und-0.041
ifying-0.040
bed-0.039
text-0.038
Token asked
Token ID1965
Feature activation-1.44e-01

Pos logit contributions
guard+0.024
 stranger+0.024
 ambulance+0.023
 answer+0.022
 pastor+0.022
Neg logit contributions
icol-0.024
bed-0.023
und-0.022
 P-0.022
text-0.022
Token."
Token ID526
Feature activation+9.41e-01

Pos logit contributions
guard+0.021
 stranger+0.021
 answer+0.020
 ambulance+0.020
 seller+0.019
Neg logit contributions
icol-0.021
text-0.019
und-0.019
bed-0.019
ra-0.018
Token
Token ID198
Feature activation+5.99e-01

Pos logit contributions
 stranger+0.021
 lady+0.020
guard+0.020
 answer+0.020
 man+0.020
Neg logit contributions
icol-0.023
ifying-0.022
text-0.021
ra-0.021
und-0.021
Token
Token ID198
Feature activation+8.22e-01

Pos logit contributions
 stranger+0.017
guard+0.017
 answer+0.016
 ambulance+0.016
 cop+0.016
Neg logit contributions
icol-0.011
text-0.010
und-0.010
bed-0.010
ifying-0.010
Token"
Token ID1
Feature activation-2.24e-01

Pos logit contributions
 answer+0.022
 stranger+0.022
 ambulance+0.021
guard+0.021
 seller+0.020
Neg logit contributions
und-0.015
icol-0.015
text-0.015
bed-0.014
ra-0.014
TokenCan
Token ID6090
Feature activation+7.50e-01

Pos logit contributions
icol+0.004
text+0.004
und+0.004
bed+0.004
ifying+0.004
Neg logit contributions
 stranger-0.006
 ambulance-0.006
 answer-0.006
guard-0.006
 officer-0.006
Token we
Token ID356
Feature activation+5.60e+00

Pos logit contributions
 stranger+0.021
guard+0.021
 answer+0.020
 ambulance+0.020
 seller+0.019
Neg logit contributions
icol-0.015
und-0.013
text-0.013
bed-0.013
ifying-0.012
Token open
Token ID1280
Feature activation+1.27e+00

Pos logit contributions
 stranger+0.126
 pastor+0.122
guard+0.120
 ambulance+0.116
 cop+0.112
Neg logit contributions
ra-0.126
ifying-0.125
icol-0.121
 M-0.120
 P-0.120
Token it
Token ID340
Feature activation+3.24e+00

Pos logit contributions
guard+0.033
 stranger+0.033
 answer+0.032
 ambulance+0.032
 cop+0.030
Neg logit contributions
icol-0.028
text-0.025
und-0.024
bed-0.023
keeping-0.022
Token?"
Token ID1701
Feature activation+1.29e+00

Pos logit contributions
 stranger+0.067
guard+0.064
 pastor+0.062
 ambulance+0.062
 answer+0.062
Neg logit contributions
icol-0.082
und-0.079
bed-0.078
text-0.077
ra-0.075
Token Ben
Token ID3932
Feature activation+1.27e+00

Pos logit contributions
 stranger+0.036
 answer+0.035
guard+0.035
 ambulance+0.034
 cop+0.034
Neg logit contributions
icol-0.026
text-0.023
und-0.023
ifying-0.023
bed-0.022
Token asks
Token ID7893
Feature activation+1.48e+00

Pos logit contributions
guard+0.029
 stranger+0.028
 ambulance+0.026
vern+0.026
 cop+0.025
Neg logit contributions
icol-0.029
 P-0.027
 M-0.027
bed-0.027
 Po-0.027
Token."
Token ID526
Feature activation+1.44e+00

Pos logit contributions
 stranger+0.028
 answer+0.028
guard+0.028
 ambulance+0.027
 seller+0.026
Neg logit contributions
icol-0.032
text-0.029
und-0.028
bed-0.028
ifying-0.027
Token
Token ID198
Feature activation+8.38e-01

Pos logit contributions
 stranger+0.041
guard+0.041
 answer+0.040
 ambulance+0.040
 cop+0.038
Neg logit contributions
icol-0.026
text-0.023
und-0.023
bed-0.023
ra-0.022
Token
Token ID198
Feature activation+9.91e-01

Pos logit contributions
 stranger+0.025
guard+0.025
 answer+0.024
 ambulance+0.024
 seller+0.023
Neg logit contributions
icol-0.014
text-0.013
und-0.013
bed-0.012
ra-0.012
Token"
Token ID1
Feature activation-1.09e-01

Pos logit contributions
 answer+0.027
 stranger+0.027
guard+0.026
 ambulance+0.025
 pastor+0.025
Neg logit contributions
und-0.019
icol-0.018
ra-0.017
bed-0.017
text-0.017
TokenCan
Token ID6090
Feature activation+6.98e-01

Pos logit contributions
icol+0.002
und+0.002
ifying+0.002
text+0.002
 P+0.002
Neg logit contributions
 ambulance-0.003
 answer-0.003
 stranger-0.003
guard-0.003
 owl-0.003
Token we
Token ID356
Feature activation+5.60e+00

Pos logit contributions
 stranger+0.021
guard+0.020
 answer+0.020
 ambulance+0.020
 pastor+0.019
Neg logit contributions
icol-0.011
text-0.010
bed-0.010
und-0.010
ra-0.010
Token fly
Token ID6129
Feature activation+1.26e+00

Pos logit contributions
 pastor+0.132
 stranger+0.127
 ambulance+0.125
old+0.121
guard+0.119
Neg logit contributions
ra-0.124
bed-0.117
ifying-0.116
 P-0.116
 hands-0.110
Token with
Token ID351
Feature activation+1.54e-01

Pos logit contributions
 stranger+0.031
guard+0.030
 ambulance+0.030
 pastor+0.030
 answer+0.030
Neg logit contributions
icol-0.026
und-0.025
bed-0.025
text-0.024
ifying-0.024
Token it
Token ID340
Feature activation+2.25e+00

Pos logit contributions
 stranger+0.004
guard+0.004
 answer+0.004
 ambulance+0.004
 seller+0.004
Neg logit contributions
icol-0.003
text-0.003
und-0.003
bed-0.003
ifying-0.003
Token?"
Token ID1701
Feature activation+1.55e+00

Pos logit contributions
 stranger+0.049
guard+0.047
 pastor+0.047
 ambulance+0.045
 answer+0.045
Neg logit contributions
icol-0.050
bed-0.050
und-0.050
 P-0.048
text-0.047
Token Tom
Token ID4186
Feature activation+1.05e+00

Pos logit contributions
guard+0.049
 stranger+0.049
 ambulance+0.047
 answer+0.046
 seller+0.045
Neg logit contributions
icol-0.022
 P-0.021
bed-0.021
und-0.020
 M-0.019
Token!"
Token ID2474
Feature activation+2.00e+00

Pos logit contributions
icol+0.020
text+0.018
und+0.018
bed+0.018
ra+0.017
Neg logit contributions
 stranger-0.031
guard-0.031
 ambulance-0.030
 answer-0.030
 pastor-0.029
Token
Token ID198
Feature activation+1.17e+00

Pos logit contributions
 stranger+0.046
 lady+0.045
 man+0.044
 answer+0.044
 driver+0.043
Neg logit contributions
icol-0.055
ifying-0.050
und-0.049
ra-0.049
text-0.049
Token
Token ID198
Feature activation+1.24e+00

Pos logit contributions
 stranger+0.030
guard+0.029
 answer+0.029
 ambulance+0.028
 driver+0.028
Neg logit contributions
icol-0.026
ifying-0.023
text-0.023
und-0.023
bed-0.022
Token"
Token ID1
Feature activation+8.73e-02

Pos logit contributions
 stranger+0.027
 answer+0.026
guard+0.026
 ambulance+0.026
 pastor+0.025
Neg logit contributions
icol-0.030
text-0.029
und-0.029
bed-0.028
ra-0.028
TokenCan
Token ID6090
Feature activation+1.01e+00

Pos logit contributions
 stranger+0.002
guard+0.002
 answer+0.002
 ambulance+0.002
 seller+0.002
Neg logit contributions
icol-0.002
text-0.002
und-0.002
bed-0.002
ifying-0.001
Token we
Token ID356
Feature activation+5.59e+00

Pos logit contributions
 stranger+0.031
guard+0.030
 answer+0.030
 ambulance+0.030
 seller+0.029
Neg logit contributions
icol-0.016
text-0.014
bed-0.014
und-0.014
ra-0.013
Token go
Token ID467
Feature activation+1.52e+00

Pos logit contributions
 stranger+0.119
 pastor+0.116
guard+0.113
 seller+0.110
 ambulance+0.108
Neg logit contributions
text-0.128
 P-0.127
icol-0.125
ifying-0.124
ra-0.123
Token and
Token ID290
Feature activation+1.49e+00

Pos logit contributions
 stranger+0.038
guard+0.036
 pastor+0.036
 seller+0.035
 ambulance+0.035
Neg logit contributions
icol-0.032
bed-0.030
und-0.029
text-0.029
ra-0.028
Token see
Token ID766
Feature activation+1.26e+00

Pos logit contributions
 stranger+0.028
guard+0.027
 seller+0.026
vern+0.026
 ambulance+0.026
Neg logit contributions
icol-0.039
ifying-0.037
 P-0.036
bed-0.036
text-0.036
Token?"
Token ID1701
Feature activation+2.04e+00

Pos logit contributions
 stranger+0.031
guard+0.031
 answer+0.030
 ambulance+0.029
 seller+0.029
Neg logit contributions
icol-0.027
text-0.025
und-0.025
bed-0.024
ifying-0.024
Token Lily
Token ID20037
Feature activation+8.17e-01

Pos logit contributions
 stranger+0.060
 answer+0.058
guard+0.058
 lady+0.057
 ambulance+0.057
Neg logit contributions
icol-0.040
und-0.036
text-0.035
ifying-0.035
bed-0.035
Token.
Token ID13
Feature activation+8.63e-01

Pos logit contributions
 stranger+0.006
guard+0.006
 answer+0.006
 ambulance+0.005
 seller+0.005
Neg logit contributions
icol-0.007
text-0.007
und-0.007
bed-0.007
ifying-0.007
Token
Token ID198
Feature activation+1.11e+00

Pos logit contributions
 stranger+0.019
 driver+0.018
 lady+0.018
guard+0.018
 answer+0.018
Neg logit contributions
icol-0.022
text-0.020
bed-0.019
und-0.019
ifying-0.019
Token
Token ID198
Feature activation+9.43e-01

Pos logit contributions
 stranger+0.027
guard+0.026
 answer+0.026
 ambulance+0.025
 lady+0.025
Neg logit contributions
icol-0.026
ifying-0.023
bed-0.023
text-0.022
ra-0.022
Token"
Token ID1
Feature activation+1.06e-01

Pos logit contributions
 stranger+0.020
guard+0.020
 answer+0.019
 ambulance+0.019
 cop+0.018
Neg logit contributions
icol-0.024
text-0.022
und-0.022
bed-0.022
ifying-0.021
TokenCan
Token ID6090
Feature activation+7.91e-01

Pos logit contributions
 answer+0.003
 ambulance+0.003
 stranger+0.003
guard+0.003
 bus+0.002
Neg logit contributions
icol-0.002
und-0.002
text-0.002
ra-0.002
bed-0.002
Token we
Token ID356
Feature activation+5.58e+00

Pos logit contributions
 stranger+0.024
guard+0.023
 answer+0.023
 ambulance+0.023
 pastor+0.022
Neg logit contributions
icol-0.013
text-0.012
bed-0.011
und-0.011
ra-0.011
Token make
Token ID787
Feature activation+1.94e+00

Pos logit contributions
 stranger+0.115
 pastor+0.115
guard+0.113
 ambulance+0.109
 firefighter+0.105
Neg logit contributions
ra-0.133
 P-0.129
 M-0.125
bed-0.124
ifying-0.124
Token it
Token ID340
Feature activation+3.65e+00

Pos logit contributions
 stranger+0.044
 answer+0.044
 ambulance+0.044
guard+0.043
 cop+0.042
Neg logit contributions
icol-0.046
text-0.043
und-0.042
ifying-0.042
bed-0.040
Token longer
Token ID2392
Feature activation+2.32e+00

Pos logit contributions
 stranger+0.096
 pastor+0.092
guard+0.090
 officer+0.085
 gentleman+0.085
Neg logit contributions
 P-0.073
bed-0.071
 M-0.068
ra-0.067
icol-0.067
Token and
Token ID290
Feature activation+1.70e+00

Pos logit contributions
 stranger+0.057
guard+0.054
 pastor+0.053
 officer+0.052
 ambulance+0.051
Neg logit contributions
icol-0.048
bed-0.048
und-0.047
 P-0.046
ra-0.045
Token higher
Token ID2440
Feature activation+2.00e+00

Pos logit contributions
 stranger+0.033
guard+0.032
 ambulance+0.031
 answer+0.030
 pastor+0.030
Neg logit contributions
icol-0.045
bed-0.042
ra-0.042
und-0.041
ifying-0.041
Token.
Token ID13
Feature activation+1.50e+00

Pos logit contributions
icol+0.002
bed+0.002
text+0.002
und+0.002
ra+0.002
Neg logit contributions
 stranger-0.002
guard-0.002
 ambulance-0.002
 answer-0.002
 cop-0.001
Token
Token ID198
Feature activation+1.22e+00

Pos logit contributions
 stranger+0.034
guard+0.033
 answer+0.033
 ambulance+0.032
 cop+0.032
Neg logit contributions
icol-0.037
text-0.034
und-0.033
ra-0.033
ifying-0.033
Token
Token ID198
Feature activation+1.28e+00

Pos logit contributions
 stranger+0.037
guard+0.036
 answer+0.036
 ambulance+0.035
 seller+0.035
Neg logit contributions
icol-0.020
text-0.018
und-0.018
bed-0.017
ra-0.016
Token"
Token ID1
Feature activation-8.52e-02

Pos logit contributions
 answer+0.033
 stranger+0.033
guard+0.032
 ambulance+0.032
 pastor+0.031
Neg logit contributions
icol-0.026
und-0.025
bed-0.025
text-0.024
ra-0.024
TokenCan
Token ID6090
Feature activation+9.66e-01

Pos logit contributions
icol+0.002
text+0.002
und+0.002
bed+0.002
ifying+0.002
Neg logit contributions
 stranger-0.002
guard-0.002
 answer-0.002
 ambulance-0.002
 cop-0.002
Token we
Token ID356
Feature activation+5.56e+00

Pos logit contributions
guard+0.026
 stranger+0.025
 answer+0.025
 ambulance+0.024
 seller+0.024
Neg logit contributions
icol-0.020
und-0.018
text-0.018
keeping-0.017
bed-0.017
Token play
Token ID711
Feature activation+1.06e+00

Pos logit contributions
 stranger+0.112
guard+0.108
 pastor+0.104
 ambulance+0.102
 cop+0.101
Neg logit contributions
icol-0.144
bed-0.139
ra-0.138
 P-0.137
text-0.137
Token with
Token ID351
Feature activation+1.31e+00

Pos logit contributions
 stranger+0.028
guard+0.028
 answer+0.027
 ambulance+0.027
 cop+0.026
Neg logit contributions
icol-0.022
text-0.020
und-0.020
ifying-0.019
keeping-0.019
Token it
Token ID340
Feature activation+2.36e+00

Pos logit contributions
 stranger+0.034
guard+0.034
 ambulance+0.033
 answer+0.033
 cop+0.032
Neg logit contributions
icol-0.028
text-0.025
und-0.024
bed-0.023
keeping-0.023
Token?"
Token ID1701
Feature activation+1.55e+00

Pos logit contributions
 stranger+0.050
guard+0.048
 answer+0.047
 ambulance+0.047
 pastor+0.046
Neg logit contributions
icol-0.058
bed-0.055
und-0.055
text-0.055
ra-0.053
Token Sam
Token ID3409
Feature activation+1.17e+00

Pos logit contributions
guard+0.052
 stranger+0.052
 ambulance+0.050
 answer+0.050
 seller+0.049
Neg logit contributions
icol-0.019
 P-0.018
bed-0.017
text-0.017
und-0.017
Token.
Token ID13
Feature activation+1.69e+00

Pos logit contributions
icol+0.006
bed+0.005
text+0.005
und+0.005
ra+0.005
Neg logit contributions
 stranger-0.006
guard-0.006
 ambulance-0.005
 answer-0.005
 pastor-0.005
Token
Token ID198
Feature activation+1.04e+00

Pos logit contributions
 stranger+0.049
guard+0.049
 ambulance+0.047
 answer+0.047
 pastor+0.045
Neg logit contributions
icol-0.029
text-0.027
bed-0.026
und-0.026
 P-0.025
Token
Token ID198
Feature activation+1.11e+00

Pos logit contributions
 stranger+0.034
guard+0.033
 answer+0.033
 ambulance+0.032
 seller+0.032
Neg logit contributions
icol-0.014
text-0.014
und-0.013
bed-0.013
 P-0.012
Token"
Token ID1
Feature activation-2.68e-01

Pos logit contributions
 answer+0.031
 stranger+0.030
 ambulance+0.029
guard+0.029
 officer+0.028
Neg logit contributions
ra-0.020
und-0.019
bed-0.019
text-0.019
icol-0.018
TokenCan
Token ID6090
Feature activation+7.39e-01

Pos logit contributions
sc+0.004
text+0.004
icol+0.004
 Neither+0.004
bed+0.004
Neg logit contributions
 ambulance-0.007
 answer-0.007
 owl-0.007
 firefighter-0.007
 dentist-0.007
Token we
Token ID356
Feature activation+5.56e+00

Pos logit contributions
 stranger+0.022
guard+0.022
 answer+0.022
 ambulance+0.021
 cop+0.021
Neg logit contributions
icol-0.012
text-0.011
bed-0.011
und-0.011
ra-0.010
Token go
Token ID467
Feature activation+1.74e+00

Pos logit contributions
 stranger+0.117
guard+0.116
 pastor+0.109
 officer+0.105
 ambulance+0.105
Neg logit contributions
ra-0.134
icol-0.132
 M-0.132
ifying-0.130
text-0.129
Token outside
Token ID2354
Feature activation+2.63e+00

Pos logit contributions
 stranger+0.045
 pastor+0.043
guard+0.042
 seller+0.041
 cop+0.041
Neg logit contributions
icol-0.035
bed-0.034
und-0.033
text-0.032
 P-0.032
Token and
Token ID290
Feature activation+1.35e+00

Pos logit contributions
guard+0.053
 stranger+0.052
 answer+0.051
 ambulance+0.050
 cop+0.047
Neg logit contributions
icol-0.074
text-0.068
und-0.065
bed-0.064
ifying-0.064
Token play
Token ID711
Feature activation+4.95e-01

Pos logit contributions
guard+0.024
 stranger+0.024
 ambulance+0.022
 answer+0.022
 seller+0.022
Neg logit contributions
icol-0.038
bed-0.035
ifying-0.035
und-0.035
 M-0.034
Token in
Token ID287
Feature activation+3.25e-01

Pos logit contributions
 stranger+0.011
guard+0.011
 answer+0.011
 ambulance+0.010
 pastor+0.010
Neg logit contributions
icol-0.012
und-0.011
bed-0.011
text-0.011
ra-0.010
Token.
Token ID13
Feature activation+1.35e+00

Pos logit contributions
 stranger+0.020
guard+0.020
 ambulance+0.019
 answer+0.019
 cop+0.019
Neg logit contributions
icol-0.022
bed-0.021
text-0.020
und-0.020
ra-0.020
Token
Token ID198
Feature activation+1.04e+00

Pos logit contributions
 stranger+0.037
guard+0.037
 answer+0.036
 ambulance+0.036
 cop+0.035
Neg logit contributions
icol-0.026
text-0.024
und-0.023
bed-0.023
ifying-0.022
Token
Token ID198
Feature activation+1.13e+00

Pos logit contributions
 stranger+0.034
guard+0.033
 answer+0.032
 ambulance+0.032
 seller+0.032
Neg logit contributions
icol-0.014
text-0.014
und-0.013
bed-0.013
 P-0.012
Token"
Token ID1
Feature activation-2.77e-01

Pos logit contributions
 stranger+0.032
 answer+0.032
guard+0.031
 ambulance+0.030
 officer+0.030
Neg logit contributions
ra-0.019
bed-0.018
und-0.018
text-0.018
icol-0.017
TokenCan
Token ID6090
Feature activation+8.24e-01

Pos logit contributions
icol+0.004
 P+0.004
 Neither+0.004
sc+0.004
ra+0.004
Neg logit contributions
 answer-0.008
 ambulance-0.007
 owl-0.007
 stranger-0.007
old-0.007
Token we
Token ID356
Feature activation+5.56e+00

Pos logit contributions
 stranger+0.026
guard+0.025
 answer+0.025
 ambulance+0.025
 cop+0.024
Neg logit contributions
icol-0.012
text-0.011
bed-0.011
und-0.011
ra-0.011
Token slide
Token ID10649
Feature activation+3.24e+00

Pos logit contributions
 stranger+0.129
 pastor+0.129
guard+0.124
 ambulance+0.120
vern+0.120
Neg logit contributions
ra-0.114
 hands-0.110
 M-0.110
ifying-0.109
text-0.109
Token down
Token ID866
Feature activation+1.40e+00

Pos logit contributions
 stranger+0.082
guard+0.079
 answer+0.079
 ambulance+0.077
 pastor+0.076
Neg logit contributions
icol-0.067
und-0.063
text-0.063
bed-0.062
ifying-0.060
Token the
Token ID262
Feature activation-1.36e+00

Pos logit contributions
guard+0.031
 stranger+0.030
 answer+0.029
 ambulance+0.029
 cop+0.027
Neg logit contributions
icol-0.038
text-0.033
bed-0.033
und-0.032
ra-0.032
Token snow
Token ID6729
Feature activation-1.25e+00

Pos logit contributions
icol+0.035
text+0.034
und+0.033
bed+0.032
ifying+0.031
Neg logit contributions
 stranger-0.027
guard-0.026
 answer-0.026
 ambulance-0.026
 cop-0.025
Token,
Token ID11
Feature activation+2.01e+00

Pos logit contributions
icol+0.039
text+0.036
ifying+0.035
und+0.035
bed+0.035
Neg logit contributions
guard-0.021
 stranger-0.021
 ambulance-0.020
 answer-0.019
 seller-0.019
Token.
Token ID13
Feature activation+1.12e+00

Pos logit contributions
 stranger+0.014
guard+0.014
 answer+0.014
 ambulance+0.014
 cop+0.013
Neg logit contributions
icol-0.017
text-0.016
bed-0.016
und-0.016
ra-0.015
Token
Token ID198
Feature activation+5.60e-01

Pos logit contributions
 stranger+0.027
 answer+0.026
 ambulance+0.026
guard+0.026
 lady+0.025
Neg logit contributions
icol-0.026
und-0.024
text-0.024
ra-0.023
ifying-0.023
Token
Token ID198
Feature activation+5.73e-01

Pos logit contributions
 stranger+0.014
guard+0.014
 answer+0.014
 ambulance+0.013
 cop+0.013
Neg logit contributions
icol-0.012
ifying-0.011
text-0.011
und-0.011
bed-0.011
Token"
Token ID1
Feature activation-3.43e-02

Pos logit contributions
 stranger+0.012
guard+0.012
 answer+0.011
 ambulance+0.011
 cop+0.011
Neg logit contributions
icol-0.015
text-0.014
und-0.014
bed-0.014
ifying-0.014
TokenCan
Token ID6090
Feature activation+7.77e-01

Pos logit contributions
icol+0.001
bed+0.001
text+0.001
ifying+0.001
 P+0.001
Neg logit contributions
 ambulance-0.001
 answer-0.001
 stranger-0.001
guard-0.001
old-0.001
Token we
Token ID356
Feature activation+5.55e+00

Pos logit contributions
 stranger+0.022
guard+0.022
 answer+0.021
 ambulance+0.021
 seller+0.021
Neg logit contributions
icol-0.015
text-0.013
und-0.013
bed-0.013
ifying-0.012
Token feed
Token ID3745
Feature activation+2.08e+00

Pos logit contributions
guard+0.118
 stranger+0.117
 pastor+0.115
 ambulance+0.113
vern+0.111
Neg logit contributions
ifying-0.132
ra-0.130
 M-0.128
 P-0.127
icol-0.123
Token it
Token ID340
Feature activation+3.65e+00

Pos logit contributions
 stranger+0.065
guard+0.063
 pastor+0.062
 ambulance+0.062
 cop+0.062
Neg logit contributions
bed-0.028
text-0.028
icol-0.028
 P-0.028
und-0.027
Token?"
Token ID1701
Feature activation+9.95e-01

Pos logit contributions
 stranger+0.100
 pastor+0.096
guard+0.096
 officer+0.094
 ambulance+0.092
Neg logit contributions
 P-0.067
bed-0.066
 M-0.064
 Po-0.062
text-0.061
Token Sam
Token ID3409
Feature activation+1.17e+00

Pos logit contributions
 stranger+0.033
 answer+0.033
guard+0.033
 ambulance+0.033
 cop+0.032
Neg logit contributions
icol-0.014
und-0.012
text-0.012
ifying-0.011
keeping-0.011
Token asks
Token ID7893
Feature activation+1.02e+00

Pos logit contributions
guard+0.030
 stranger+0.028
vern+0.028
 firefighter+0.026
 ambulance+0.026
Neg logit contributions
 M-0.023
 P-0.022
icol-0.022
 Po-0.021
bed-0.021
Token
Token ID198
Feature activation+6.07e-01

Pos logit contributions
 lady+0.014
 stranger+0.014
 man+0.013
 veterinarian+0.013
guard+0.013
Neg logit contributions
icol-0.019
ifying-0.018
ra-0.017
text-0.017
bed-0.017
Token
Token ID198
Feature activation+6.92e-01

Pos logit contributions
 stranger+0.015
guard+0.015
 answer+0.014
 ambulance+0.014
 lady+0.014
Neg logit contributions
icol-0.014
ifying-0.012
bed-0.012
text-0.012
und-0.012
Token"
Token ID1
Feature activation-2.23e-01

Pos logit contributions
 stranger+0.017
guard+0.017
 answer+0.017
 ambulance+0.016
 seller+0.016
Neg logit contributions
icol-0.015
text-0.014
und-0.014
bed-0.014
ra-0.013
TokenWhat
Token ID2061
Feature activation-2.14e-01

Pos logit contributions
icol+0.004
text+0.004
und+0.004
bed+0.004
ra+0.004
Neg logit contributions
 stranger-0.006
guard-0.006
 ambulance-0.006
 answer-0.006
 cop-0.006
Token is
Token ID318
Feature activation+2.51e+00

Pos logit contributions
text+0.004
icol+0.004
und+0.004
ra+0.004
 P+0.004
Neg logit contributions
guard-0.006
 stranger-0.006
 ambulance-0.005
 officer-0.005
 seller-0.005
Token it
Token ID340
Feature activation+5.54e+00

Pos logit contributions
 stranger+0.062
 answer+0.061
guard+0.059
 ambulance+0.059
 seller+0.058
Neg logit contributions
icol-0.055
text-0.055
und-0.051
ifying-0.049
bed-0.049
Token?"
Token ID1701
Feature activation+1.05e+00

Pos logit contributions
guard+0.132
 stranger+0.130
 officer+0.125
 ambulance+0.121
 cop+0.120
Neg logit contributions
 M-0.119
 P-0.119
icol-0.113
 Po-0.112
bed-0.111
Token Anna
Token ID11735
Feature activation+8.44e-01

Pos logit contributions
 stranger+0.029
 answer+0.028
guard+0.028
 ambulance+0.027
 cop+0.027
Neg logit contributions
icol-0.022
text-0.020
und-0.019
ifying-0.019
bed-0.019
Token asks
Token ID7893
Feature activation+6.63e-01

Pos logit contributions
guard+0.016
 stranger+0.016
 ambulance+0.015
 cop+0.014
 answer+0.014
Neg logit contributions
icol-0.023
und-0.022
bed-0.022
text-0.022
ra-0.021
Token.
Token ID13
Feature activation+1.89e+00

Pos logit contributions
 stranger+0.015
guard+0.015
 ambulance+0.014
 answer+0.014
 seller+0.014
Neg logit contributions
icol-0.016
bed-0.015
text-0.015
und-0.014
ra-0.014
Token
Token ID198
Feature activation+1.32e+00

Pos logit contributions
 stranger+0.048
guard+0.047
 answer+0.047
 ambulance+0.046
 cop+0.045
Neg logit contributions
icol-0.041
text-0.038
und-0.037
bed-0.036
ifying-0.035
Token Ben
Token ID3932
Feature activation+5.00e-01

Pos logit contributions
 stranger+0.001
 lady+0.001
 man+0.001
 answer+0.001
guard+0.001
Neg logit contributions
icol-0.001
und-0.000
ifying-0.000
bed-0.000
keeping-0.000
Token said
Token ID531
Feature activation+4.56e-01

Pos logit contributions
 stranger+0.008
 answer+0.008
guard+0.008
 seller+0.007
 ambulance+0.007
Neg logit contributions
icol-0.015
ifying-0.015
und-0.014
text-0.014
bed-0.014
Token.
Token ID13
Feature activation+1.37e+00

Pos logit contributions
 stranger+0.010
guard+0.009
 answer+0.009
 ambulance+0.009
 cop+0.009
Neg logit contributions
icol-0.012
text-0.011
und-0.011
bed-0.011
ra-0.010
Token "
Token ID366
Feature activation+5.95e-01

Pos logit contributions
 stranger+0.034
guard+0.034
 answer+0.033
 ambulance+0.033
 cop+0.032
Neg logit contributions
icol-0.030
text-0.028
und-0.027
bed-0.027
ra-0.026
TokenCan
Token ID6090
Feature activation+6.55e-01

Pos logit contributions
 stranger+0.015
guard+0.015
 answer+0.015
 ambulance+0.014
 cop+0.014
Neg logit contributions
icol-0.013
text-0.012
und-0.012
bed-0.011
ra-0.011
Token we
Token ID356
Feature activation+5.53e+00

Pos logit contributions
 stranger+0.018
guard+0.018
 answer+0.018
 ambulance+0.017
 seller+0.017
Neg logit contributions
icol-0.013
und-0.012
text-0.011
bed-0.011
ifying-0.011
Token swim
Token ID9422
Feature activation+1.64e+00

Pos logit contributions
 stranger+0.119
guard+0.116
 pastor+0.114
 ambulance+0.113
 cop+0.107
Neg logit contributions
ra-0.131
icol-0.131
ifying-0.128
 P-0.128
und-0.126
Token in
Token ID287
Feature activation-1.29e-01

Pos logit contributions
guard+0.038
 stranger+0.037
 answer+0.037
 ambulance+0.034
 driver+0.034
Neg logit contributions
icol-0.042
text-0.038
und-0.036
keeping-0.036
bed-0.036
Token the
Token ID262
Feature activation+4.77e-01

Pos logit contributions
icol+0.004
bed+0.003
text+0.003
ra+0.003
und+0.003
Neg logit contributions
guard-0.003
 answer-0.002
 stranger-0.002
 lady-0.002
 ambulance-0.002
Token water
Token ID1660
Feature activation+2.80e+00

Pos logit contributions
 stranger+0.009
guard+0.009
 answer+0.009
 ambulance+0.009
 seller+0.008
Neg logit contributions
icol-0.013
text-0.012
und-0.012
bed-0.012
ra-0.012
Token,
Token ID11
Feature activation+2.44e+00

Pos logit contributions
guard+0.047
 stranger+0.046
 answer+0.044
 ambulance+0.043
 seller+0.041
Neg logit contributions
icol-0.087
text-0.082
bed-0.078
und-0.078
ra-0.077
Token sweet
Token ID6029
Feature activation+1.70e+00

Pos logit contributions
icol+0.010
und+0.010
ifying+0.009
keeping+0.009
text+0.009
Neg logit contributions
 answer-0.006
guard-0.006
 stranger-0.006
 lady-0.006
 ambulance-0.006
Tokenie
Token ID494
Feature activation-3.19e-01

Pos logit contributions
 stranger+0.040
guard+0.039
 answer+0.038
 cop+0.037
 ambulance+0.037
Neg logit contributions
icol-0.038
text-0.036
und-0.035
ra-0.033
 P-0.033
Token.
Token ID13
Feature activation-5.97e-01

Pos logit contributions
icol+0.009
text+0.008
und+0.008
bed+0.008
ifying+0.008
Neg logit contributions
 stranger-0.006
guard-0.006
 answer-0.006
 ambulance-0.006
 seller-0.006
Token You
Token ID921
Feature activation+3.29e-01

Pos logit contributions
icol+0.011
text+0.011
und+0.010
keeping+0.010
bed+0.009
Neg logit contributions
 driver-0.016
 answer-0.016
 cop-0.016
 ambulance-0.016
 seller-0.016
Token can
Token ID460
Feature activation+6.68e-01

Pos logit contributions
 stranger+0.008
guard+0.008
 ambulance+0.008
 pastor+0.008
 answer+0.008
Neg logit contributions
icol-0.007
text-0.007
ra-0.007
und-0.006
ifying-0.006
Token each
Token ID1123
Feature activation+5.52e+00

Pos logit contributions
 stranger+0.018
guard+0.017
 ambulance+0.017
 cop+0.016
 pastor+0.016
Neg logit contributions
ra-0.014
bed-0.012
e-0.011
ifying-0.011
und-0.011
Token choose
Token ID3853
Feature activation+3.04e-01

Pos logit contributions
guard+0.122
 stranger+0.120
 ambulance+0.116
vern+0.116
 pastor+0.113
Neg logit contributions
ra-0.121
bed-0.117
text-0.116
und-0.114
 hands-0.111
Token one
Token ID530
Feature activation+6.34e-01

Pos logit contributions
 stranger+0.009
guard+0.008
 ambulance+0.008
 firefighter+0.008
 cop+0.008
Neg logit contributions
ifying-0.005
ra-0.004
bed-0.004
icol-0.004
 P-0.004
Token balloon
Token ID21190
Feature activation-8.96e-01

Pos logit contributions
 stranger+0.015
 ambulance+0.015
guard+0.015
 pastor+0.015
 officer+0.015
Neg logit contributions
ra-0.012
text-0.012
bed-0.012
icol-0.011
ifying-0.011
Token,"
Token ID553
Feature activation+5.54e-02

Pos logit contributions
icol+0.018
bed+0.017
ra+0.017
text+0.017
und+0.017
Neg logit contributions
 stranger-0.022
 ambulance-0.021
 answer-0.021
guard-0.021
 pastor-0.021
Token Mom
Token ID11254
Feature activation+1.65e+00

Pos logit contributions
 stranger+0.002
guard+0.002
 answer+0.002
 ambulance+0.002
 cop+0.002
Neg logit contributions
icol-0.001
text-0.001
und-0.001
bed-0.001
ifying-0.001
Token.
Token ID13
Feature activation+1.02e-01

Pos logit contributions
icol+0.011
text+0.011
und+0.011
ifying+0.010
ra+0.010
Neg logit contributions
 stranger-0.008
guard-0.007
 answer-0.007
 ambulance-0.007
 seller-0.007
Token
Token ID198
Feature activation+2.46e-01

Pos logit contributions
 stranger+0.002
 lady+0.002
 veterinarian+0.002
 cop+0.002
 man+0.002
Neg logit contributions
icol-0.003
ra-0.003
text-0.003
ifying-0.003
und-0.003
Token
Token ID198
Feature activation+3.60e-01

Pos logit contributions
 stranger+0.006
guard+0.006
 answer+0.005
 ambulance+0.005
 lady+0.005
Neg logit contributions
icol-0.006
ifying-0.005
bed-0.005
ra-0.005
text-0.005
Token"
Token ID1
Feature activation-5.77e-02

Pos logit contributions
 stranger+0.008
guard+0.008
 answer+0.008
 ambulance+0.008
 cop+0.007
Neg logit contributions
icol-0.009
text-0.008
und-0.008
bed-0.008
ifying-0.008
TokenCan
Token ID6090
Feature activation+5.50e-01

Pos logit contributions
icol+0.001
text+0.001
ifying+0.001
bed+0.001
und+0.001
Neg logit contributions
 answer-0.002
 ambulance-0.002
 stranger-0.002
guard-0.002
old-0.001
Token we
Token ID356
Feature activation+5.51e+00

Pos logit contributions
 stranger+0.016
guard+0.016
 answer+0.016
 ambulance+0.015
 cop+0.015
Neg logit contributions
icol-0.010
text-0.009
und-0.009
bed-0.008
ra-0.008
Token feed
Token ID3745
Feature activation+2.22e+00

Pos logit contributions
 stranger+0.108
guard+0.107
 ambulance+0.102
 pastor+0.101
 cop+0.099
Neg logit contributions
icol-0.141
ra-0.136
text-0.135
 M-0.134
ifying-0.133
Token it
Token ID340
Feature activation+3.82e+00

Pos logit contributions
 stranger+0.069
guard+0.068
 pastor+0.066
 ambulance+0.066
 answer+0.066
Neg logit contributions
bed-0.030
icol-0.029
text-0.029
und-0.029
 P-0.029
Token?"
Token ID1701
Feature activation+7.95e-01

Pos logit contributions
 stranger+0.104
 pastor+0.103
guard+0.101
 officer+0.098
old+0.098
Neg logit contributions
 P-0.070
bed-0.068
 M-0.067
 Po-0.066
 Chick-0.061
Token Lily
Token ID20037
Feature activation+8.62e-01

Pos logit contributions
 stranger+0.022
 lady+0.021
 answer+0.021
 cop+0.021
guard+0.021
Neg logit contributions
icol-0.017
ifying-0.015
text-0.015
und-0.015
bed-0.014
Token asked
Token ID1965
Feature activation-2.09e-02

Pos logit contributions
guard+0.019
 stranger+0.019
 ambulance+0.018
 answer+0.018
 cop+0.017
Neg logit contributions
icol-0.021
bed-0.019
und-0.019
text-0.019
ra-0.018
Token Tom
Token ID4186
Feature activation+7.18e-01

Pos logit contributions
 stranger+0.030
guard+0.030
 answer+0.028
 ambulance+0.028
 seller+0.028
Neg logit contributions
icol-0.017
text-0.015
bed-0.015
und-0.015
 P-0.014
Token says
Token ID1139
Feature activation+4.94e-01

Pos logit contributions
 stranger+0.017
guard+0.017
 ambulance+0.015
vern+0.015
 pastor+0.015
Neg logit contributions
icol-0.016
text-0.016
 P-0.015
 M-0.015
und-0.015
Token.
Token ID13
Feature activation+1.52e+00

Pos logit contributions
 stranger+0.010
guard+0.010
 answer+0.010
 ambulance+0.010
 cop+0.009
Neg logit contributions
icol-0.013
text-0.012
und-0.012
bed-0.012
ra-0.011
Token "
Token ID366
Feature activation+7.41e-01

Pos logit contributions
 stranger+0.040
guard+0.040
 answer+0.039
 ambulance+0.038
 seller+0.037
Neg logit contributions
icol-0.031
text-0.028
und-0.028
bed-0.028
ifying-0.027
TokenCan
Token ID6090
Feature activation+6.12e-01

Pos logit contributions
 stranger+0.018
guard+0.018
 answer+0.017
 cop+0.017
 driver+0.017
Neg logit contributions
icol-0.018
text-0.015
und-0.015
bed-0.015
ifying-0.014
Token we
Token ID356
Feature activation+5.50e+00

Pos logit contributions
 stranger+0.017
guard+0.017
 answer+0.016
 ambulance+0.016
 cop+0.016
Neg logit contributions
icol-0.012
text-0.011
und-0.011
bed-0.011
ifying-0.010
Token see
Token ID766
Feature activation+1.72e+00

Pos logit contributions
 stranger+0.117
guard+0.115
 pastor+0.111
 ambulance+0.109
vern+0.103
Neg logit contributions
ra-0.131
ifying-0.128
icol-0.126
und-0.124
text-0.123
Token the
Token ID262
Feature activation+1.58e+00

Pos logit contributions
 stranger+0.044
guard+0.043
 answer+0.043
 ambulance+0.043
 cop+0.041
Neg logit contributions
icol-0.038
text-0.034
und-0.033
bed-0.033
ifying-0.032
Token photos
Token ID5205
Feature activation+1.65e+00

Pos logit contributions
guard+0.025
 stranger+0.025
 answer+0.024
 ambulance+0.023
 seller+0.023
Neg logit contributions
icol-0.046
text-0.043
ifying-0.043
und-0.042
ra-0.042
Token?"
Token ID1701
Feature activation+2.30e+00

Pos logit contributions
 stranger+0.038
 ambulance+0.036
 pastor+0.036
 answer+0.036
guard+0.036
Neg logit contributions
und-0.035
ifying-0.034
icol-0.033
text-0.033
bed-0.032
Token
Token ID198
Feature activation+1.48e+00

Pos logit contributions
 stranger+0.062
guard+0.060
 answer+0.059
 ambulance+0.058
 cop+0.057
Neg logit contributions
icol-0.049
text-0.044
und-0.044
bed-0.043
ifying-0.043
Token She
Token ID1375
Feature activation+9.27e-03

Pos logit contributions
 stranger+0.001
guard+0.001
 answer+0.001
 ambulance+0.001
 seller+0.001
Neg logit contributions
icol-0.001
text-0.001
und-0.001
bed-0.001
ra-0.001
Token had
Token ID550
Feature activation-1.78e+00

Pos logit contributions
 stranger+0.000
guard+0.000
 ambulance+0.000
 cop+0.000
 answer+0.000
Neg logit contributions
icol-0.000
bed-0.000
ra-0.000
text-0.000
und-0.000
Token a
Token ID257
Feature activation-1.54e+00

Pos logit contributions
icol+0.048
text+0.045
und+0.044
bed+0.044
ifying+0.044
Neg logit contributions
 stranger-0.038
 answer-0.037
 pastor-0.034
 bus-0.033
 ambulance-0.033
Token big
Token ID1263
Feature activation-1.17e+00

Pos logit contributions
icol+0.039
text+0.036
und+0.036
bed+0.036
ra+0.035
Neg logit contributions
 stranger-0.033
guard-0.032
 answer-0.031
 ambulance-0.031
 cop-0.030
Token magn
Token ID7842
Feature activation-2.35e+00

Pos logit contributions
icol+0.031
text+0.028
bed+0.028
und+0.027
ra+0.027
Neg logit contributions
guard-0.024
 stranger-0.023
 ambulance-0.022
 answer-0.022
 pastor-0.022
Tokenifying
Token ID4035
Feature activation-6.89e+00

Pos logit contributions
icol+0.018
text+0.014
keeping+0.012
bed+0.010
und+0.010
Neg logit contributions
guard-0.090
 stranger-0.090
 answer-0.090
 seller-0.087
 ambulance-0.086
Token glass
Token ID5405
Feature activation-8.76e-01

Pos logit contributions
icol+0.162
text+0.150
bed+0.148
und+0.148
ra+0.143
Neg logit contributions
 stranger-0.159
guard-0.155
 answer-0.155
 ambulance-0.151
 seller-0.146
Token that
Token ID326
Feature activation-9.03e-01

Pos logit contributions
icol+0.022
text+0.021
und+0.020
bed+0.020
ifying+0.020
Neg logit contributions
 stranger-0.019
guard-0.018
 answer-0.018
 ambulance-0.018
 cop-0.017
Token made
Token ID925
Feature activation-3.12e-01

Pos logit contributions
icol+0.020
text+0.019
und+0.019
bed+0.019
ra+0.018
Neg logit contributions
 stranger-0.022
guard-0.021
 ambulance-0.021
 answer-0.021
 cop-0.020
Token things
Token ID1243
Feature activation+1.74e+00

Pos logit contributions
icol+0.008
text+0.007
und+0.007
bed+0.007
ifying+0.007
Neg logit contributions
 stranger-0.007
 answer-0.007
guard-0.006
 ambulance-0.006
 seller-0.006
Token look
Token ID804
Feature activation-1.25e+00

Pos logit contributions
 stranger+0.044
 ambulance+0.043
guard+0.043
 cop+0.042
 pastor+0.042
Neg logit contributions
icol-0.034
bed-0.034
ifying-0.033
ra-0.033
 P-0.032
Token it
Token ID340
Feature activation+2.16e+00

Pos logit contributions
icol+0.023
und+0.020
text+0.020
ifying+0.020
bed+0.020
Neg logit contributions
 stranger-0.019
 answer-0.018
guard-0.018
 ambulance-0.017
 lady-0.017
Token she
Token ID673
Feature activation-2.06e-01

Pos logit contributions
 stranger+0.053
guard+0.052
 ambulance+0.050
 answer+0.049
 pastor+0.049
Neg logit contributions
icol-0.048
und-0.045
bed-0.045
text-0.045
ra-0.043
Token had
Token ID550
Feature activation-2.16e+00

Pos logit contributions
ifying+0.006
icol+0.006
keeping+0.005
und+0.005
text+0.005
Neg logit contributions
 answer-0.004
 driver-0.004
 stand-0.004
 seller-0.004
guard-0.004
Token a
Token ID257
Feature activation-1.73e+00

Pos logit contributions
icol+0.053
bed+0.049
und+0.048
ifying+0.048
text+0.047
Neg logit contributions
 stranger-0.049
 answer-0.048
guard-0.045
 lady-0.045
 ambulance-0.045
Token magn
Token ID7842
Feature activation-2.25e+00

Pos logit contributions
icol+0.045
text+0.040
und+0.040
bed+0.040
ra+0.040
Neg logit contributions
guard-0.036
 stranger-0.035
 answer-0.034
 ambulance-0.034
 pastor-0.033
Tokenifying
Token ID4035
Feature activation-6.81e+00

Pos logit contributions
icol+0.018
text+0.014
keeping+0.011
und+0.011
bed+0.011
Neg logit contributions
 stranger-0.086
guard-0.086
 answer-0.086
 seller-0.083
 ambulance-0.082
Token glass
Token ID5405
Feature activation-5.06e-01

Pos logit contributions
icol+0.156
text+0.144
und+0.142
bed+0.142
ra+0.137
Neg logit contributions
 stranger-0.162
guard-0.159
 answer-0.156
 ambulance-0.154
 seller-0.149
Token,
Token ID11
Feature activation+3.26e-01

Pos logit contributions
icol+0.012
text+0.012
und+0.012
bed+0.011
ifying+0.011
Neg logit contributions
 stranger-0.011
guard-0.011
 answer-0.010
 ambulance-0.010
 pastor-0.010
Token a
Token ID257
Feature activation-1.65e+00

Pos logit contributions
 answer+0.007
 stranger+0.007
 owl+0.006
 cop+0.006
 lady+0.006
Neg logit contributions
icol-0.009
text-0.009
und-0.008
ifying-0.008
ra-0.008
Token tiny
Token ID7009
Feature activation-8.47e-01

Pos logit contributions
icol+0.043
ifying+0.038
und+0.038
text+0.038
ra+0.038
Neg logit contributions
guard-0.034
 stranger-0.033
 ambulance-0.032
 answer-0.032
 pastor-0.031
Token torch
Token ID28034
Feature activation+2.36e-01

Pos logit contributions
icol+0.021
ra+0.019
bed+0.019
und+0.019
ifying+0.019
Neg logit contributions
guard-0.018
 stranger-0.018
 pastor-0.017
 ambulance-0.017
 answer-0.017
Token.
Token ID13
Feature activation+4.51e-01

Pos logit contributions
icol+0.030
und+0.028
bed+0.028
text+0.028
ifying+0.027
Neg logit contributions
 stranger-0.032
guard-0.031
 ambulance-0.030
 answer-0.030
 pastor-0.029
Token Lily
Token ID20037
Feature activation-1.36e+00

Pos logit contributions
 driver+0.008
 answer+0.008
 man+0.007
 stranger+0.007
 lady+0.007
Neg logit contributions
icol-0.014
ra-0.013
und-0.013
keeping-0.013
bed-0.012
Token had
Token ID550
Feature activation-1.45e+00

Pos logit contributions
icol+0.025
bed+0.024
text+0.023
ra+0.023
und+0.023
Neg logit contributions
 stranger-0.038
 ambulance-0.037
guard-0.037
 answer-0.036
 pastor-0.035
Token a
Token ID257
Feature activation-1.56e+00

Pos logit contributions
icol+0.038
text+0.035
und+0.035
bed+0.034
ifying+0.034
Neg logit contributions
 stranger-0.031
 answer-0.031
 cop-0.029
 driver-0.029
 bus-0.029
Token magn
Token ID7842
Feature activation-2.57e+00

Pos logit contributions
icol+0.041
ra+0.037
bed+0.037
und+0.037
ifying+0.036
Neg logit contributions
guard-0.032
 stranger-0.031
 ambulance-0.030
 answer-0.029
 pastor-0.029
Tokenifying
Token ID4035
Feature activation-6.80e+00

Pos logit contributions
icol+0.017
text+0.013
und+0.010
keeping+0.010
bed+0.009
Neg logit contributions
guard-0.101
 answer-0.101
 stranger-0.101
 seller-0.098
 ambulance-0.097
Token glass
Token ID5405
Feature activation-1.08e+00

Pos logit contributions
icol+0.156
text+0.144
und+0.143
bed+0.142
ra+0.136
Neg logit contributions
 stranger-0.161
guard-0.157
 answer-0.156
 ambulance-0.153
 seller-0.148
Token that
Token ID326
Feature activation-1.01e+00

Pos logit contributions
icol+0.027
text+0.025
und+0.025
bed+0.025
ifying+0.024
Neg logit contributions
 stranger-0.023
guard-0.023
 answer-0.022
 ambulance-0.022
 cop-0.021
Token she
Token ID673
Feature activation-5.15e-01

Pos logit contributions
icol+0.027
text+0.025
und+0.025
bed+0.024
ifying+0.024
Neg logit contributions
 stranger-0.020
guard-0.020
 answer-0.020
 ambulance-0.019
 seller-0.019
Token found
Token ID1043
Feature activation-4.93e-01

Pos logit contributions
icol+0.012
text+0.011
und+0.011
bed+0.011
ra+0.010
Neg logit contributions
 stranger-0.012
guard-0.012
 ambulance-0.012
 answer-0.012
 cop-0.011
Token in
Token ID287
Feature activation-1.21e+00

Pos logit contributions
icol+0.013
text+0.012
und+0.011
ifying+0.011
bed+0.011
Neg logit contributions
 answer-0.011
 stranger-0.011
guard-0.010
 seller-0.010
 ambulance-0.010
Token When
Token ID1649
Feature activation-9.88e-01

Pos logit contributions
 stranger+0.006
 driver+0.006
 cop+0.006
 ambulance+0.006
guard+0.006
Neg logit contributions
icol-0.007
text-0.006
bed-0.006
ra-0.006
und-0.006
Token Daisy
Token ID40355
Feature activation-1.12e+00

Pos logit contributions
icol+0.029
text+0.027
und+0.027
bed+0.026
ifying+0.026
Neg logit contributions
 stranger-0.019
guard-0.018
 answer-0.017
 ambulance-0.017
 cop-0.016
Token put
Token ID1234
Feature activation+3.06e-01

Pos logit contributions
icol+0.026
text+0.024
und+0.023
bed+0.023
ifying+0.023
Neg logit contributions
 stranger-0.027
guard-0.026
 answer-0.026
 ambulance-0.025
 seller-0.024
Token the
Token ID262
Feature activation+1.79e+00

Pos logit contributions
 answer+0.008
 stranger+0.008
 ambulance+0.007
 lady+0.007
 man+0.007
Neg logit contributions
icol-0.007
text-0.007
und-0.006
ifying-0.006
keeping-0.006
Token magn
Token ID7842
Feature activation-1.84e+00

Pos logit contributions
guard+0.036
 stranger+0.035
 answer+0.033
 pastor+0.033
 ambulance+0.033
Neg logit contributions
icol-0.044
ifying-0.042
ra-0.042
text-0.041
bed-0.040
Tokenifying
Token ID4035
Feature activation-6.74e+00

Pos logit contributions
icol+0.019
text+0.014
keeping+0.011
und+0.011
 pic+0.011
Neg logit contributions
guard-0.068
 stranger-0.067
 answer-0.067
 seller-0.065
 ambulance-0.064
Token glass
Token ID5405
Feature activation-5.92e-01

Pos logit contributions
icol+0.165
text+0.153
bed+0.151
und+0.145
keeping+0.143
Neg logit contributions
 stranger-0.149
guard-0.145
 answer-0.143
 seller-0.138
 ambulance-0.137
Token down
Token ID866
Feature activation+8.54e-02

Pos logit contributions
icol+0.015
text+0.014
und+0.014
bed+0.013
ra+0.013
Neg logit contributions
 stranger-0.013
guard-0.013
 answer-0.012
 ambulance-0.012
 seller-0.012
Token,
Token ID11
Feature activation+4.29e-01

Pos logit contributions
 stranger+0.002
guard+0.002
 answer+0.002
 lady+0.002
 ambulance+0.002
Neg logit contributions
icol-0.002
text-0.002
und-0.002
bed-0.002
ifying-0.002
Token the
Token ID262
Feature activation-1.99e-01

Pos logit contributions
 stranger+0.006
 answer+0.005
guard+0.005
 man+0.005
 lady+0.005
Neg logit contributions
icol-0.016
text-0.015
ra-0.014
und-0.014
ifying-0.014
Token cater
Token ID20825
Feature activation-1.18e+00

Pos logit contributions
icol+0.006
text+0.005
und+0.005
bed+0.005
ifying+0.005
Neg logit contributions
 stranger-0.004
guard-0.004
 answer-0.004
 ambulance-0.004
 cop-0.003
Token examine
Token ID10716
Feature activation+9.10e-01

Pos logit contributions
 pastor+0.098
guard+0.098
 stranger+0.096
 ambulance+0.093
 cop+0.087
Neg logit contributions
ra-0.095
 M-0.090
ifying-0.089
icol-0.089
 P-0.088
Token it
Token ID340
Feature activation+2.15e+00

Pos logit contributions
 answer+0.024
 stranger+0.024
guard+0.023
 ambulance+0.023
 ring+0.023
Neg logit contributions
icol-0.020
text-0.018
und-0.017
keeping-0.016
ifying-0.016
Token with
Token ID351
Feature activation+5.07e-01

Pos logit contributions
 stranger+0.048
guard+0.046
 ambulance+0.044
 pastor+0.044
 cop+0.043
Neg logit contributions
icol-0.050
bed-0.050
und-0.048
ra-0.047
 P-0.047
Token the
Token ID262
Feature activation+1.22e+00

Pos logit contributions
 stranger+0.014
guard+0.013
 answer+0.013
 ambulance+0.013
 cop+0.013
Neg logit contributions
icol-0.010
text-0.009
und-0.009
bed-0.009
ifying-0.009
Token magn
Token ID7842
Feature activation-1.92e+00

Pos logit contributions
guard+0.026
 stranger+0.025
 pastor+0.023
 ambulance+0.023
old+0.022
Neg logit contributions
icol-0.030
ra-0.029
 P-0.028
 M-0.028
 Po-0.027
Tokenifying
Token ID4035
Feature activation-6.63e+00

Pos logit contributions
icol+0.017
text+0.014
keeping+0.011
und+0.011
bed+0.011
Neg logit contributions
guard-0.072
 answer-0.071
 stranger-0.071
 ambulance-0.068
 seller-0.068
Token glass
Token ID5405
Feature activation-6.54e-01

Pos logit contributions
icol+0.153
text+0.141
bed+0.140
und+0.139
ra+0.133
Neg logit contributions
 stranger-0.157
guard-0.154
 answer-0.153
 ambulance-0.149
 seller-0.145
Token?"
Token ID1701
Feature activation+1.86e+00

Pos logit contributions
icol+0.018
text+0.017
und+0.016
bed+0.016
ifying+0.016
Neg logit contributions
 stranger-0.013
guard-0.012
 answer-0.012
 ambulance-0.012
 seller-0.011
Token
Token ID198
Feature activation+5.87e-01

Pos logit contributions
 stranger+0.043
 lady+0.042
guard+0.042
 driver+0.042
 man+0.042
Neg logit contributions
icol-0.049
ifying-0.045
und-0.043
keeping-0.043
text-0.043
Token
Token ID198
Feature activation+6.98e-01

Pos logit contributions
guard+0.012
 stranger+0.012
 driver+0.012
 answer+0.012
 lady+0.012
Neg logit contributions
icol-0.016
keeping-0.014
ifying-0.014
ra-0.014
text-0.013
Token"
Token ID1
Feature activation-6.04e-01

Pos logit contributions
 stranger+0.016
guard+0.016
 answer+0.015
 ambulance+0.015
 seller+0.014
Neg logit contributions
icol-0.017
text-0.016
und-0.016
bed-0.015
ifying-0.015
Token a
Token ID257
Feature activation-1.01e+00

Pos logit contributions
icol+0.042
ifying+0.039
bed+0.038
und+0.038
ra+0.037
Neg logit contributions
 stranger-0.026
 answer-0.026
 bus-0.024
 stand-0.023
 ambulance-0.023
Token flashlight
Token ID38371
Feature activation-4.93e-01

Pos logit contributions
icol+0.027
text+0.025
und+0.025
bed+0.024
ra+0.024
Neg logit contributions
 stranger-0.020
guard-0.020
 answer-0.019
 ambulance-0.019
 pastor-0.018
Token and
Token ID290
Feature activation-4.03e-01

Pos logit contributions
icol+0.011
und+0.010
text+0.010
bed+0.010
ifying+0.010
Neg logit contributions
 stranger-0.012
guard-0.011
 answer-0.011
 ambulance-0.011
 pastor-0.011
Token a
Token ID257
Feature activation-1.72e+00

Pos logit contributions
icol+0.010
und+0.009
text+0.009
bed+0.009
ifying+0.009
Neg logit contributions
 stranger-0.009
 answer-0.009
guard-0.009
 ambulance-0.009
 cop-0.008
Token magn
Token ID7842
Feature activation-1.79e+00

Pos logit contributions
icol+0.045
ifying+0.040
bed+0.039
text+0.039
ra+0.039
Neg logit contributions
guard-0.036
 stranger-0.034
 answer-0.033
 ambulance-0.033
 pastor-0.032
Tokenifying
Token ID4035
Feature activation-6.58e+00

Pos logit contributions
icol+0.013
text+0.010
und+0.008
bed+0.008
keeping+0.007
Neg logit contributions
guard-0.070
 stranger-0.070
 answer-0.070
 ambulance-0.068
 seller-0.067
Token glass
Token ID5405
Feature activation-6.98e-01

Pos logit contributions
icol+0.158
text+0.146
bed+0.146
und+0.145
ra+0.140
Neg logit contributions
 stranger-0.149
guard-0.145
 answer-0.144
 ambulance-0.141
 seller-0.136
Token to
Token ID284
Feature activation-8.04e-01

Pos logit contributions
icol+0.017
text+0.016
und+0.016
bed+0.016
ifying+0.016
Neg logit contributions
 stranger-0.015
guard-0.015
 answer-0.014
 ambulance-0.014
 cop-0.014
Token examine
Token ID10716
Feature activation+2.96e-01

Pos logit contributions
bed+0.019
icol+0.019
ra+0.018
und+0.018
 M+0.018
Neg logit contributions
guard-0.018
 stranger-0.018
 pastor-0.017
 ambulance-0.017
 cop-0.016
Token their
Token ID511
Feature activation-6.16e-01

Pos logit contributions
 answer+0.006
 stranger+0.006
guard+0.005
 ambulance+0.005
 man+0.005
Neg logit contributions
icol-0.009
text-0.008
und-0.008
bed-0.008
ra-0.007
Token finds
Token ID7228
Feature activation-2.33e-02

Pos logit contributions
icol+0.017
text+0.016
und+0.016
bed+0.015
ifying+0.015
Neg logit contributions
 stranger-0.012
 answer-0.012
 ambulance-0.012
guard-0.012
 seller-0.011
Token.
Token ID13
Feature activation-2.44e-01

Pos logit contributions
icol+0.002
und+0.002
bed+0.002
text+0.002
ifying+0.002
Neg logit contributions
 stranger-0.002
guard-0.002
 ambulance-0.002
 answer-0.002
 cop-0.002
Token They
Token ID1119
Feature activation+6.68e-01

Pos logit contributions
icol+0.006
text+0.006
ra+0.005
keeping+0.005
ifying+0.005
Neg logit contributions
 driver-0.005
 answer-0.005
 ambulance-0.005
 stranger-0.005
 seller-0.005
Token had
Token ID550
Feature activation-1.95e+00

Pos logit contributions
 ambulance+0.018
guard+0.018
 stranger+0.018
 cop+0.017
 answer+0.017
Neg logit contributions
bed-0.012
icol-0.012
ra-0.012
und-0.011
 P-0.011
Token a
Token ID257
Feature activation-1.48e+00

Pos logit contributions
icol+0.055
und+0.050
text+0.050
ifying+0.049
bed+0.049
Neg logit contributions
 stranger-0.040
 answer-0.039
 stand-0.035
 seller-0.035
 wife-0.035
Token magn
Token ID7842
Feature activation-2.20e+00

Pos logit contributions
icol+0.035
text+0.032
und+0.031
bed+0.031
ra+0.030
Neg logit contributions
guard-0.034
 stranger-0.034
 answer-0.033
 ambulance-0.033
 pastor-0.032
Tokenifying
Token ID4035
Feature activation-6.57e+00

Pos logit contributions
icol+0.014
text+0.010
keeping+0.008
bed+0.008
und+0.008
Neg logit contributions
guard-0.088
 stranger-0.087
 answer-0.086
 seller-0.084
 ambulance-0.084
Token glass
Token ID5405
Feature activation-7.35e-01

Pos logit contributions
icol+0.157
text+0.145
bed+0.144
und+0.143
ra+0.139
Neg logit contributions
 stranger-0.150
guard-0.146
 answer-0.145
 ambulance-0.142
 seller-0.137
Token and
Token ID290
Feature activation-9.39e-01

Pos logit contributions
icol+0.019
text+0.018
und+0.017
bed+0.017
ra+0.017
Neg logit contributions
 stranger-0.015
guard-0.015
 answer-0.015
 ambulance-0.015
 seller-0.014
Token a
Token ID257
Feature activation-1.65e+00

Pos logit contributions
icol+0.023
text+0.022
und+0.021
bed+0.020
ifying+0.020
Neg logit contributions
 stranger-0.021
 answer-0.021
guard-0.020
 seller-0.019
 ambulance-0.019
Token small
Token ID1402
Feature activation-8.21e-01

Pos logit contributions
icol+0.041
ifying+0.036
ra+0.036
bed+0.035
und+0.035
Neg logit contributions
guard-0.037
 ambulance-0.034
 stranger-0.034
 answer-0.034
 pastor-0.033
Token box
Token ID3091
Feature activation+3.28e-01

Pos logit contributions
icol+0.020
ra+0.018
bed+0.018
ifying+0.018
und+0.017
Neg logit contributions
guard-0.018
 stranger-0.018
 pastor-0.017
 ambulance-0.017
 answer-0.016
Token flashlight
Token ID38371
Feature activation-1.39e+00

Pos logit contributions
icol+0.041
text+0.037
und+0.037
ifying+0.036
bed+0.036
Neg logit contributions
guard-0.028
 stranger-0.028
 answer-0.027
 ambulance-0.026
 pastor-0.025
Token,
Token ID11
Feature activation-1.30e+00

Pos logit contributions
icol+0.033
text+0.030
und+0.030
bed+0.030
ifying+0.029
Neg logit contributions
 stranger-0.032
guard-0.031
 answer-0.030
 ambulance-0.030
 cop-0.029
Token and
Token ID290
Feature activation+9.25e-02

Pos logit contributions
icol+0.034
text+0.031
und+0.031
bed+0.030
ra+0.030
Neg logit contributions
 stranger-0.028
 answer-0.028
 ambulance-0.026
guard-0.026
 cop-0.026
Token a
Token ID257
Feature activation-1.91e+00

Pos logit contributions
 stranger+0.002
 answer+0.002
guard+0.002
 ambulance+0.002
 cop+0.002
Neg logit contributions
icol-0.002
text-0.002
und-0.002
bed-0.002
ifying-0.002
Token magn
Token ID7842
Feature activation-1.99e+00

Pos logit contributions
icol+0.052
text+0.047
und+0.047
ifying+0.047
bed+0.047
Neg logit contributions
guard-0.036
 stranger-0.035
 answer-0.034
 ambulance-0.034
 pastor-0.033
Tokenifying
Token ID4035
Feature activation-6.51e+00

Pos logit contributions
icol+0.018
text+0.015
bed+0.013
und+0.012
 P+0.011
Neg logit contributions
 stranger-0.074
guard-0.074
 answer-0.074
 ambulance-0.072
 seller-0.071
Token glass
Token ID5405
Feature activation-6.11e-01

Pos logit contributions
icol+0.154
text+0.144
bed+0.142
und+0.139
ra+0.136
Neg logit contributions
 stranger-0.149
guard-0.145
 answer-0.143
 ambulance-0.140
 seller-0.137
Token.
Token ID13
Feature activation-4.94e-01

Pos logit contributions
icol+0.016
text+0.015
und+0.014
bed+0.014
ra+0.014
Neg logit contributions
 stranger-0.013
guard-0.013
 answer-0.012
 ambulance-0.012
 seller-0.012
Token
Token ID198
Feature activation+6.33e-01

Pos logit contributions
icol+0.014
text+0.013
ra+0.013
bed+0.013
und+0.013
Neg logit contributions
 veterinarian-0.009
 lady-0.009
 doctor-0.009
 driver-0.009
 cop-0.009
Token
Token ID198
Feature activation+7.33e-01

Pos logit contributions
 stranger+0.016
guard+0.015
 answer+0.015
 ambulance+0.015
 cop+0.014
Neg logit contributions
icol-0.014
ifying-0.012
bed-0.012
text-0.012
ra-0.012
TokenAs
Token ID1722
Feature activation-3.07e-01

Pos logit contributions
 stranger+0.016
guard+0.016
 answer+0.016
 ambulance+0.016
 seller+0.015
Neg logit contributions
icol-0.018
text-0.017
und-0.016
bed-0.016
ra-0.016
Token hat
Token ID6877
Feature activation-1.58e+00

Pos logit contributions
icol+0.031
text+0.029
bed+0.029
und+0.028
ifying+0.028
Neg logit contributions
 stranger-0.024
guard-0.024
 answer-0.023
 ambulance-0.023
 pastor-0.022
Token and
Token ID290
Feature activation-4.86e-01

Pos logit contributions
icol+0.039
text+0.036
und+0.035
bed+0.034
ifying+0.034
Neg logit contributions
 stranger-0.035
guard-0.034
 answer-0.034
 ambulance-0.034
 seller-0.033
Token grabbed
Token ID13176
Feature activation+4.85e-01

Pos logit contributions
icol+0.013
und+0.013
ifying+0.012
text+0.012
ra+0.011
Neg logit contributions
 stranger-0.010
 answer-0.009
 lady-0.009
guard-0.009
 man-0.009
Token her
Token ID607
Feature activation-1.70e+00

Pos logit contributions
 answer+0.009
 stranger+0.008
 lady+0.008
 bar+0.008
 ambulance+0.008
Neg logit contributions
icol-0.015
und-0.014
text-0.014
keeping-0.014
™-0.013
Token magn
Token ID7842
Feature activation-2.42e+00

Pos logit contributions
icol+0.045
und+0.042
text+0.042
bed+0.040
ra+0.039
Neg logit contributions
 stranger-0.035
 answer-0.034
guard-0.034
 ambulance-0.033
 cop-0.032
Tokenifying
Token ID4035
Feature activation-6.42e+00

Pos logit contributions
icol+0.025
text+0.020
 P+0.018
 pic+0.017
keeping+0.017
Neg logit contributions
guard-0.083
 answer-0.082
 stranger-0.081
 seller-0.080
 ambulance-0.079
Token glass
Token ID5405
Feature activation-9.98e-01

Pos logit contributions
icol+0.160
bed+0.154
text+0.148
und+0.147
ra+0.142
Neg logit contributions
 stranger-0.138
 answer-0.138
guard-0.133
 ambulance-0.131
 lady-0.129
Token.
Token ID13
Feature activation+1.03e-01

Pos logit contributions
icol+0.027
text+0.025
und+0.024
bed+0.024
ifying+0.023
Neg logit contributions
 stranger-0.021
guard-0.020
 answer-0.020
 ambulance-0.019
 seller-0.019
Token She
Token ID1375
Feature activation-8.82e-02

Pos logit contributions
 veterinarian+0.002
 answer+0.002
 cop+0.002
 stranger+0.002
 doctor+0.002
Neg logit contributions
icol-0.003
text-0.003
und-0.003
ra-0.003
ifying-0.003
Token was
Token ID373
Feature activation+5.18e-01

Pos logit contributions
icol+0.002
und+0.002
ifying+0.002
text+0.002
keeping+0.002
Neg logit contributions
 stranger-0.002
guard-0.002
 answer-0.002
 seller-0.002
 driver-0.002
Token ready
Token ID3492
Feature activation+4.14e-01

Pos logit contributions
 answer+0.009
 stranger+0.009
 ambulance+0.009
guard+0.008
 pastor+0.008
Neg logit contributions
icol-0.016
und-0.015
text-0.015
ifying-0.014
bed-0.014
Token the
Token ID262
Feature activation+9.66e-01

Pos logit contributions
 stranger+0.022
guard+0.022
 ambulance+0.021
 answer+0.021
 pastor+0.021
Neg logit contributions
icol-0.015
text-0.014
und-0.014
bed-0.014
 P-0.014
Token necklace
Token ID38987
Feature activation-1.52e+00

Pos logit contributions
 answer+0.018
 stranger+0.018
guard+0.017
 ambulance+0.017
 seller+0.016
Neg logit contributions
icol-0.028
und-0.026
text-0.026
bed-0.025
ifying-0.025
Token and
Token ID290
Feature activation+8.37e-01

Pos logit contributions
icol+0.031
bed+0.029
text+0.029
und+0.029
ra+0.028
Neg logit contributions
 stranger-0.038
guard-0.037
 ambulance-0.037
 pastor-0.036
 answer-0.036
Token the
Token ID262
Feature activation+1.34e-01

Pos logit contributions
 stranger+0.018
guard+0.017
 answer+0.017
 seller+0.016
 ambulance+0.016
Neg logit contributions
icol-0.022
text-0.021
und-0.020
ifying-0.020
bed-0.020
Token magn
Token ID7842
Feature activation-1.51e+00

Pos logit contributions
 stranger+0.002
 answer+0.002
guard+0.002
 ambulance+0.002
 seller+0.002
Neg logit contributions
icol-0.004
und-0.004
text-0.004
bed-0.004
ifying-0.004
Tokenifying
Token ID4035
Feature activation-6.40e+00

Pos logit contributions
icol+0.018
text+0.013
 pic+0.013
 P+0.012
 Neither+0.012
Neg logit contributions
guard-0.052
 answer-0.051
 stranger-0.051
 seller-0.050
 ring-0.049
Token glass
Token ID5405
Feature activation-8.21e-01

Pos logit contributions
icol+0.164
bed+0.154
text+0.153
und+0.150
keeping+0.144
Neg logit contributions
 stranger-0.135
guard-0.131
 answer-0.131
 lady-0.125
 seller-0.125
Token and
Token ID290
Feature activation+2.93e-01

Pos logit contributions
icol+0.021
text+0.020
und+0.020
bed+0.019
ifying+0.019
Neg logit contributions
 stranger-0.017
guard-0.017
 answer-0.016
 ambulance-0.016
 seller-0.015
Token told
Token ID1297
Feature activation+4.06e-01

Pos logit contributions
 stranger+0.007
guard+0.007
 answer+0.006
 seller+0.006
 ambulance+0.006
Neg logit contributions
icol-0.007
text-0.007
und-0.007
ifying-0.007
bed-0.006
Token him
Token ID683
Feature activation+6.26e-01

Pos logit contributions
 stranger+0.009
 answer+0.009
guard+0.009
 ambulance+0.008
 pastor+0.008
Neg logit contributions
icol-0.010
und-0.010
text-0.010
ifying-0.009
bed-0.009
Token that
Token ID326
Feature activation+2.35e-01

Pos logit contributions
 stranger+0.015
guard+0.015
 ambulance+0.015
 answer+0.014
 cop+0.014
Neg logit contributions
icol-0.014
text-0.013
und-0.013
bed-0.012
ra-0.012
Token and
Token ID290
Feature activation+5.28e-01

Pos logit contributions
icol+0.025
und+0.023
text+0.023
ra+0.023
bed+0.023
Neg logit contributions
guard-0.034
 stranger-0.034
 pastor-0.033
 ambulance-0.033
 firefighter-0.032
Token handed
Token ID10158
Feature activation+9.54e-01

Pos logit contributions
guard+0.011
 stranger+0.011
 seller+0.011
 lady+0.011
 answer+0.011
Neg logit contributions
icol-0.014
ifying-0.013
text-0.013
und-0.013
bed-0.012
Token him
Token ID683
Feature activation+1.82e-02

Pos logit contributions
 answer+0.020
 stranger+0.020
guard+0.019
 seller+0.018
 man+0.018
Neg logit contributions
icol-0.028
und-0.025
text-0.024
ifying-0.024
bed-0.023
Token the
Token ID262
Feature activation+2.20e-01

Pos logit contributions
 stranger+0.001
guard+0.001
 ambulance+0.001
 pastor+0.001
 answer+0.001
Neg logit contributions
icol-0.000
text-0.000
 P-0.000
bed-0.000
und-0.000
Token magn
Token ID7842
Feature activation-1.76e+00

Pos logit contributions
 answer+0.004
 stranger+0.004
guard+0.004
 ring+0.004
 seller+0.004
Neg logit contributions
icol-0.006
und-0.006
text-0.006
bed-0.006
ifying-0.006
Tokenifying
Token ID4035
Feature activation-6.39e+00

Pos logit contributions
icol+0.022
text+0.017
 pic+0.016
 Neither+0.015
und+0.015
Neg logit contributions
guard-0.060
 answer-0.058
 stranger-0.058
 seller-0.057
 ring-0.057
Token glass
Token ID5405
Feature activation-9.72e-01

Pos logit contributions
icol+0.162
bed+0.153
text+0.149
und+0.147
keeping+0.140
Neg logit contributions
 stranger-0.139
guard-0.135
 answer-0.135
 lady-0.128
 seller-0.127
Token.
Token ID13
Feature activation+3.98e-01

Pos logit contributions
icol+0.025
text+0.023
und+0.023
bed+0.023
ifying+0.022
Neg logit contributions
 stranger-0.020
guard-0.020
 answer-0.019
 ambulance-0.019
 seller-0.018
Token
Token ID198
Feature activation+2.98e-03

Pos logit contributions
 stranger+0.009
 answer+0.009
 lady+0.009
 seller+0.009
 cop+0.009
Neg logit contributions
icol-0.010
text-0.009
und-0.009
ra-0.009
keeping-0.009
Token
Token ID198
Feature activation+3.14e-01

Pos logit contributions
 stranger+0.000
guard+0.000
 answer+0.000
 man+0.000
 lady+0.000
Neg logit contributions
icol-0.000
ifying-0.000
ra-0.000
bed-0.000
und-0.000
TokenHe
Token ID1544
Feature activation+2.06e-01

Pos logit contributions
 stranger+0.007
 answer+0.007
guard+0.007
 ambulance+0.007
 cop+0.007
Neg logit contributions
icol-0.007
text-0.007
und-0.007
bed-0.007
ra-0.006
Token the
Token ID262
Feature activation+1.06e+00

Pos logit contributions
 stranger+0.042
 answer+0.041
guard+0.040
 ambulance+0.039
 cop+0.037
Neg logit contributions
icol-0.055
und-0.050
text-0.049
bed-0.049
ifying-0.047
Token necklace
Token ID38987
Feature activation-1.06e+00

Pos logit contributions
 stranger+0.020
guard+0.019
 answer+0.019
 ambulance+0.019
 seller+0.018
Neg logit contributions
icol-0.030
und-0.028
text-0.028
bed-0.028
ifying-0.027
Token and
Token ID290
Feature activation+2.51e-01

Pos logit contributions
icol+0.024
text+0.023
und+0.022
bed+0.022
ra+0.021
Neg logit contributions
 stranger-0.025
guard-0.025
 answer-0.024
 ambulance-0.024
 cop-0.023
Token the
Token ID262
Feature activation-1.06e-01

Pos logit contributions
 stranger+0.005
 answer+0.005
guard+0.005
 seller+0.005
 lady+0.005
Neg logit contributions
icol-0.007
und-0.007
ifying-0.006
text-0.006
ra-0.006
Token magn
Token ID7842
Feature activation-1.32e+00

Pos logit contributions
icol+0.003
und+0.003
text+0.003
bed+0.003
ifying+0.003
Neg logit contributions
 stranger-0.002
guard-0.002
 answer-0.002
 ambulance-0.002
 seller-0.002
Tokenifying
Token ID4035
Feature activation-6.38e+00

Pos logit contributions
icol+0.016
text+0.012
 pic+0.011
 P+0.011
keeping+0.011
Neg logit contributions
guard-0.045
 answer-0.044
 stranger-0.044
 seller-0.043
 ring-0.043
Token glass
Token ID5405
Feature activation-5.13e-01

Pos logit contributions
icol+0.156
bed+0.146
text+0.145
und+0.143
ra+0.138
Neg logit contributions
 stranger-0.140
guard-0.136
 answer-0.134
 ambulance-0.130
 seller-0.129
Token and
Token ID290
Feature activation-1.81e-01

Pos logit contributions
icol+0.014
text+0.013
und+0.013
bed+0.012
ifying+0.012
Neg logit contributions
 stranger-0.011
guard-0.010
 answer-0.010
 ambulance-0.010
 seller-0.010
Token started
Token ID2067
Feature activation+9.49e-01

Pos logit contributions
icol+0.005
und+0.005
text+0.005
ifying+0.005
ra+0.004
Neg logit contributions
 stranger-0.004
 answer-0.004
 lady-0.003
guard-0.003
 stand-0.003
Token to
Token ID284
Feature activation+8.31e-01

Pos logit contributions
 answer+0.016
 stranger+0.016
guard+0.015
 stand+0.015
 ambulance+0.015
Neg logit contributions
icol-0.028
text-0.028
und-0.027
ifying-0.026
bed-0.025
Token examine
Token ID10716
Feature activation+1.08e+00

Pos logit contributions
 stranger+0.016
guard+0.016
 ambulance+0.015
 pastor+0.015
 cop+0.015
Neg logit contributions
icol-0.023
bed-0.021
und-0.021
ra-0.021
text-0.021
Token cried
Token ID16896
Feature activation+8.88e-01

Pos logit contributions
 stranger+0.010
guard+0.010
 answer+0.009
 ambulance+0.009
 cop+0.009
Neg logit contributions
icol-0.013
text-0.012
und-0.012
bed-0.012
ifying-0.011
Token,
Token ID11
Feature activation+2.60e-01

Pos logit contributions
guard+0.023
 stranger+0.023
 seller+0.022
 pastor+0.022
 vendor+0.021
Neg logit contributions
icol-0.018
bed-0.018
 P-0.017
 M-0.017
 Po-0.016
Token dropping
Token ID12047
Feature activation+1.68e+00

Pos logit contributions
 stranger+0.006
guard+0.006
 answer+0.006
 ambulance+0.006
 seller+0.005
Neg logit contributions
icol-0.006
text-0.006
und-0.006
bed-0.006
ifying-0.005
Token the
Token ID262
Feature activation+9.64e-01

Pos logit contributions
 stranger+0.045
guard+0.045
 answer+0.043
 ambulance+0.043
 pastor+0.043
Neg logit contributions
icol-0.032
text-0.029
und-0.029
bed-0.029
ra-0.028
Token magn
Token ID7842
Feature activation-1.74e+00

Pos logit contributions
guard+0.024
old+0.022
 pastor+0.022
 officer+0.022
 stranger+0.021
Neg logit contributions
ra-0.020
 hands-0.019
 M-0.019
icol-0.018
ifying-0.018
Tokenifying
Token ID4035
Feature activation-6.19e+00

Pos logit contributions
icol+0.017
text+0.012
 P+0.010
 pic+0.010
keeping+0.010
Neg logit contributions
guard-0.065
 stranger-0.065
 answer-0.064
 ambulance-0.063
 cop-0.061
Token glass
Token ID5405
Feature activation-9.04e-01

Pos logit contributions
icol+0.156
bed+0.146
text+0.142
und+0.139
ra+0.136
Neg logit contributions
 stranger-0.138
guard-0.134
 answer-0.133
 ambulance-0.129
 lady-0.127
Token.
Token ID13
Feature activation+1.09e+00

Pos logit contributions
icol+0.025
text+0.023
und+0.022
bed+0.022
ra+0.021
Neg logit contributions
 stranger-0.019
guard-0.018
 answer-0.018
 ambulance-0.017
 seller-0.017
Token
Token ID198
Feature activation+5.49e-01

Pos logit contributions
 stranger+0.020
 lady+0.019
 cop+0.019
 veterinarian+0.018
 man+0.018
Neg logit contributions
icol-0.031
text-0.031
ifying-0.029
keeping-0.029
ra-0.029
Token
Token ID198
Feature activation+9.20e-01

Pos logit contributions
 stranger+0.016
guard+0.016
 answer+0.015
 ambulance+0.015
 seller+0.015
Neg logit contributions
icol-0.010
text-0.009
und-0.009
bed-0.009
ra-0.008
Token"
Token ID1
Feature activation-5.49e-01

Pos logit contributions
 answer+0.023
 stranger+0.022
guard+0.021
 ambulance+0.021
 pastor+0.020
Neg logit contributions
icol-0.020
und-0.020
bed-0.019
text-0.019
ra-0.019
Token borrowed
Token ID22546
Feature activation-9.43e-01

Pos logit contributions
icol+0.004
bed+0.004
text+0.004
und+0.004
ra+0.004
Neg logit contributions
guard-0.006
 ambulance-0.006
 stranger-0.006
 answer-0.006
 pastor-0.005
Token her
Token ID607
Feature activation-1.29e+00

Pos logit contributions
icol+0.023
text+0.021
und+0.020
ifying+0.020
keeping+0.019
Neg logit contributions
 answer-0.023
 stranger-0.023
 owl-0.021
 lady-0.021
guard-0.020
Token mom
Token ID1995
Feature activation-1.00e+00

Pos logit contributions
icol+0.036
und+0.034
text+0.034
ifying+0.032
keeping+0.031
Neg logit contributions
 answer-0.026
 stranger-0.025
 owl-0.023
guard-0.023
 ambulance-0.023
Token's
Token ID338
Feature activation-1.09e+00

Pos logit contributions
icol+0.030
text+0.027
und+0.027
ifying+0.026
bed+0.026
Neg logit contributions
 stranger-0.018
 answer-0.018
guard-0.017
 ambulance-0.017
 seller-0.016
Token magn
Token ID7842
Feature activation-2.61e+00

Pos logit contributions
icol+0.026
text+0.024
bed+0.024
und+0.024
ra+0.023
Neg logit contributions
 stranger-0.025
guard-0.025
 ambulance-0.024
 answer-0.023
 pastor-0.023
Tokenifying
Token ID4035
Feature activation-6.11e+00

Pos logit contributions
icol+0.026
text+0.022
keeping+0.018
™+0.017
 pic+0.016
Neg logit contributions
 answer-0.091
 stranger-0.090
guard-0.090
 seller-0.087
 ring-0.085
Token glass
Token ID5405
Feature activation-1.20e+00

Pos logit contributions
icol+0.142
text+0.132
bed+0.130
und+0.128
ra+0.123
Neg logit contributions
 stranger-0.144
 answer-0.140
guard-0.140
 ambulance-0.135
 seller-0.132
Token to
Token ID284
Feature activation-1.26e+00

Pos logit contributions
icol+0.031
text+0.029
und+0.028
bed+0.028
ifying+0.027
Neg logit contributions
 stranger-0.026
guard-0.025
 answer-0.025
 ambulance-0.024
 seller-0.023
Token take
Token ID1011
Feature activation-1.74e+00

Pos logit contributions
icol+0.033
und+0.031
bed+0.031
text+0.031
ra+0.031
Neg logit contributions
guard-0.025
 stranger-0.025
 ambulance-0.024
 pastor-0.023
 answer-0.023
Token a
Token ID257
Feature activation-1.29e+00

Pos logit contributions
icol+0.053
ifying+0.051
text+0.050
und+0.049
™+0.047
Neg logit contributions
 answer-0.031
 stranger-0.028
 owl-0.026
 veterinarian-0.025
 ambulance-0.025
Token closer
Token ID5699
Feature activation-1.34e+00

Pos logit contributions
icol+0.038
text+0.036
und+0.035
bed+0.035
ra+0.034
Neg logit contributions
 stranger-0.023
guard-0.022
 answer-0.022
 ambulance-0.021
 cop-0.020
Token,
Token ID11
Feature activation-5.72e-01

Pos logit contributions
icol+0.029
und+0.027
text+0.027
ifying+0.026
bed+0.025
Neg logit contributions
 stranger-0.022
guard-0.021
 answer-0.021
 ambulance-0.020
 seller-0.020
Token handing
Token ID22786
Feature activation+1.43e+00

Pos logit contributions
icol+0.014
text+0.013
und+0.013
bed+0.013
ra+0.012
Neg logit contributions
 stranger-0.013
guard-0.013
 answer-0.012
 ambulance-0.012
 cop-0.012
Token him
Token ID683
Feature activation+6.84e-01

Pos logit contributions
 stranger+0.027
 answer+0.026
 ambulance+0.026
 cop+0.026
guard+0.026
Neg logit contributions
icol-0.040
text-0.039
und-0.038
ifying-0.037
bed-0.036
Token the
Token ID262
Feature activation+1.34e+00

Pos logit contributions
 stranger+0.019
guard+0.019
 ambulance+0.018
 answer+0.018
 pastor+0.018
Neg logit contributions
icol-0.012
text-0.012
und-0.011
bed-0.011
ra-0.011
Token magn
Token ID7842
Feature activation-1.44e+00

Pos logit contributions
guard+0.025
 stranger+0.023
old+0.022
 pastor+0.021
 officer+0.021
Neg logit contributions
icol-0.034
ra-0.033
 hands-0.032
ifying-0.032
 M-0.031
Tokenifying
Token ID4035
Feature activation-6.09e+00

Pos logit contributions
icol+0.011
text+0.007
bed+0.005
und+0.005
keeping+0.005
Neg logit contributions
guard-0.058
 stranger-0.057
 answer-0.056
 ambulance-0.055
 seller-0.055
Token glass
Token ID5405
Feature activation-4.44e-01

Pos logit contributions
icol+0.145
text+0.134
bed+0.134
und+0.130
ra+0.126
Neg logit contributions
 stranger-0.142
guard-0.138
 answer-0.136
 ambulance-0.132
 seller-0.130
Token.
Token ID13
Feature activation+4.44e-01

Pos logit contributions
icol+0.011
text+0.010
und+0.010
bed+0.010
ra+0.010
Neg logit contributions
 stranger-0.009
guard-0.009
 answer-0.009
 ambulance-0.009
 cop-0.009
Token She
Token ID1375
Feature activation+1.57e-01

Pos logit contributions
 cop+0.009
 stranger+0.009
 veterinarian+0.009
 lady+0.009
 driver+0.008
Neg logit contributions
icol-0.011
text-0.011
ifying-0.011
ra-0.011
keeping-0.011
Token looked
Token ID3114
Feature activation+4.90e-01

Pos logit contributions
 stranger+0.003
guard+0.003
 answer+0.003
 ambulance+0.003
 seller+0.003
Neg logit contributions
icol-0.004
text-0.004
und-0.004
ifying-0.004
bed-0.004
Token around
Token ID1088
Feature activation+1.41e+00

Pos logit contributions
 stranger+0.011
 answer+0.011
guard+0.011
 ambulance+0.011
 cop+0.010
Neg logit contributions
icol-0.012
text-0.011
und-0.011
keeping-0.011
bed-0.010
Token the
Token ID262
Feature activation+2.53e+00

Pos logit contributions
 stranger+0.036
 answer+0.035
guard+0.035
 ambulance+0.034
 cop+0.033
Neg logit contributions
icol-0.039
text-0.036
ifying-0.036
ra-0.034
und-0.034
Token watch
Token ID2342
Feature activation+2.63e-01

Pos logit contributions
guard+0.050
 stranger+0.046
vern+0.046
old+0.046
 ambulance+0.045
Neg logit contributions
icol-0.064
ra-0.062
ifying-0.060
bed-0.059
 M-0.058
Token with
Token ID351
Feature activation+1.43e+00

Pos logit contributions
 stranger+0.006
guard+0.006
 answer+0.006
 ambulance+0.005
 seller+0.005
Neg logit contributions
icol-0.007
text-0.006
und-0.006
bed-0.006
ifying-0.006
Token the
Token ID262
Feature activation+2.16e+00

Pos logit contributions
 stranger+0.033
 answer+0.033
guard+0.032
 ambulance+0.031
 cop+0.031
Neg logit contributions
icol-0.034
text-0.032
und-0.031
bed-0.031
ifying-0.030
Token magn
Token ID7842
Feature activation-9.49e-01

Pos logit contributions
guard+0.042
 stranger+0.038
vern+0.037
 ambulance+0.036
old+0.036
Neg logit contributions
icol-0.055
ra-0.055
 M-0.052
ifying-0.051
 Po-0.051
Tokenifying
Token ID4035
Feature activation-6.08e+00

Pos logit contributions
icol+0.006
text+0.004
und+0.003
bed+0.003
keeping+0.003
Neg logit contributions
guard-0.038
 stranger-0.038
 answer-0.038
 ambulance-0.037
 seller-0.037
Token glass
Token ID5405
Feature activation-7.45e-02

Pos logit contributions
icol+0.145
bed+0.135
text+0.134
und+0.131
ra+0.127
Neg logit contributions
 stranger-0.140
guard-0.137
 answer-0.135
 ambulance-0.132
 seller-0.129
Token.
Token ID13
Feature activation+4.82e-01

Pos logit contributions
icol+0.002
text+0.002
und+0.002
bed+0.002
ifying+0.002
Neg logit contributions
 stranger-0.002
guard-0.002
 answer-0.002
 ambulance-0.002
 cop-0.001
Token He
Token ID679
Feature activation+4.42e-01

Pos logit contributions
 stranger+0.009
 answer+0.009
 ambulance+0.009
 cop+0.009
guard+0.009
Neg logit contributions
icol-0.013
text-0.013
und-0.012
ifying-0.012
bed-0.012
Token saw
Token ID2497
Feature activation+6.25e-01

Pos logit contributions
 stranger+0.009
guard+0.009
 ambulance+0.009
 answer+0.009
 cop+0.009
Neg logit contributions
icol-0.011
text-0.010
bed-0.010
und-0.010
ra-0.010
Token the
Token ID262
Feature activation+1.38e+00

Pos logit contributions
 stranger+0.016
guard+0.016
 answer+0.015
 ambulance+0.015
 cop+0.015
Neg logit contributions
icol-0.013
text-0.012
und-0.012
bed-0.012
ra-0.011
Token He
Token ID679
Feature activation+2.64e-01

Pos logit contributions
 stranger+0.019
guard+0.019
 answer+0.019
 ambulance+0.019
 cop+0.018
Neg logit contributions
icol-0.014
text-0.013
und-0.013
bed-0.013
ifying-0.012
Token took
Token ID1718
Feature activation-2.10e-01

Pos logit contributions
 stranger+0.007
guard+0.006
 answer+0.006
 ambulance+0.006
 cop+0.006
Neg logit contributions
icol-0.006
text-0.005
und-0.005
bed-0.005
ra-0.005
Token out
Token ID503
Feature activation+1.90e+00

Pos logit contributions
icol+0.006
ifying+0.005
und+0.005
text+0.005
bed+0.005
Neg logit contributions
 answer-0.005
 stranger-0.004
 cop-0.004
 ambulance-0.004
 driver-0.004
Token a
Token ID257
Feature activation-2.99e-01

Pos logit contributions
 stranger+0.044
 answer+0.042
guard+0.042
 ambulance+0.041
 cop+0.040
Neg logit contributions
icol-0.047
text-0.043
und-0.042
bed-0.042
ra-0.041
Token magn
Token ID7842
Feature activation-1.72e+00

Pos logit contributions
icol+0.008
und+0.007
ra+0.007
ifying+0.007
text+0.007
Neg logit contributions
guard-0.006
 stranger-0.006
 ambulance-0.006
 pastor-0.006
 answer-0.006
Tokenifying
Token ID4035
Feature activation-6.07e+00

Pos logit contributions
icol+0.014
text+0.011
keeping+0.009
 P+0.009
bed+0.009
Neg logit contributions
guard-0.065
 stranger-0.065
 answer-0.064
 ambulance-0.063
 cop-0.062
Token glass
Token ID5405
Feature activation-5.33e-01

Pos logit contributions
icol+0.149
bed+0.142
text+0.141
und+0.135
keeping+0.133
Neg logit contributions
 stranger-0.135
guard-0.131
 answer-0.130
 ambulance-0.126
 seller-0.123
Token and
Token ID290
Feature activation-3.60e-01

Pos logit contributions
icol+0.013
text+0.012
und+0.012
bed+0.012
ifying+0.011
Neg logit contributions
 stranger-0.012
guard-0.012
 answer-0.011
 ambulance-0.011
 cop-0.011
Token a
Token ID257
Feature activation-1.39e+00

Pos logit contributions
icol+0.009
text+0.008
und+0.008
ifying+0.008
bed+0.008
Neg logit contributions
 stranger-0.008
guard-0.008
 answer-0.008
 ambulance-0.008
 seller-0.008
Token notebook
Token ID20922
Feature activation-1.82e+00

Pos logit contributions
icol+0.036
ifying+0.032
ra+0.032
und+0.031
bed+0.031
Neg logit contributions
guard-0.029
 stranger-0.028
 ambulance-0.027
 answer-0.027
 pastor-0.026
Token and
Token ID290
Feature activation-2.13e-01

Pos logit contributions
icol+0.043
text+0.040
und+0.039
bed+0.039
ra+0.038
Neg logit contributions
 stranger-0.041
guard-0.040
 ambulance-0.039
 answer-0.039
 pastor-0.038
Token
Token ID198
Feature activation+3.72e-01

Pos logit contributions
 stranger+0.003
guard+0.003
 answer+0.003
 cop+0.003
 driver+0.003
Neg logit contributions
icol-0.003
ifying-0.003
text-0.003
bed-0.003
und-0.003
TokenHe
Token ID1544
Feature activation+4.05e-01

Pos logit contributions
 answer+0.009
guard+0.009
 stranger+0.009
 ambulance+0.009
 seller+0.009
Neg logit contributions
icol-0.007
und-0.007
text-0.007
ra-0.007
bed-0.007
Token pointed
Token ID6235
Feature activation+8.27e-01

Pos logit contributions
 stranger+0.009
guard+0.009
 answer+0.009
 ambulance+0.009
 cop+0.009
Neg logit contributions
icol-0.010
text-0.009
und-0.009
bed-0.009
ra-0.008
Token his
Token ID465
Feature activation-1.55e+00

Pos logit contributions
 stranger+0.020
guard+0.020
 ambulance+0.019
 answer+0.019
 pastor+0.018
Neg logit contributions
icol-0.019
text-0.017
und-0.017
bed-0.017
ra-0.016
Token magn
Token ID7842
Feature activation-2.50e+00

Pos logit contributions
icol+0.038
text+0.036
und+0.036
bed+0.035
ifying+0.033
Neg logit contributions
 stranger-0.034
 answer-0.034
guard-0.033
 ambulance-0.033
 cop-0.032
Tokenifying
Token ID4035
Feature activation-5.93e+00

Pos logit contributions
icol+0.025
text+0.020
 pic+0.017
keeping+0.017
 P+0.017
Neg logit contributions
 answer-0.090
guard-0.089
 stranger-0.089
 seller-0.086
 ring-0.085
Token glass
Token ID5405
Feature activation-1.44e+00

Pos logit contributions
icol+0.143
bed+0.134
text+0.132
und+0.128
ra+0.123
Neg logit contributions
 stranger-0.137
 answer-0.133
guard-0.132
 ambulance-0.127
 seller-0.125
Token at
Token ID379
Feature activation+7.18e-02

Pos logit contributions
icol+0.035
text+0.032
und+0.031
bed+0.031
ifying+0.030
Neg logit contributions
 stranger-0.033
guard-0.033
 answer-0.032
 ambulance-0.031
 seller-0.030
Token different
Token ID1180
Feature activation+1.36e+00

Pos logit contributions
 answer+0.002
 stranger+0.001
 man+0.001
guard+0.001
 lady+0.001
Neg logit contributions
icol-0.002
text-0.002
ifying-0.002
und-0.002
™-0.002
Token classmates
Token ID28999
Feature activation+4.71e-02

Pos logit contributions
 stranger+0.029
 answer+0.029
guard+0.028
 ambulance+0.027
 seller+0.027
Neg logit contributions
icol-0.035
text-0.033
und-0.032
bed-0.031
ifying-0.031
Token and
Token ID290
Feature activation-4.30e-02

Pos logit contributions
 stranger+0.001
guard+0.001
 answer+0.001
 ambulance+0.001
 cop+0.001
Neg logit contributions
icol-0.001
text-0.001
und-0.001
bed-0.001
ra-0.001
Token said
Token ID531
Feature activation-4.25e-01

Pos logit contributions
 stranger+0.002
guard+0.002
 ambulance+0.002
 answer+0.002
 pastor+0.002
Neg logit contributions
icol-0.002
bed-0.002
text-0.002
und-0.002
ra-0.002
Token,
Token ID11
Feature activation+5.08e-01

Pos logit contributions
icol+0.010
text+0.010
bed+0.010
und+0.010
ra+0.009
Neg logit contributions
 stranger-0.009
guard-0.009
 answer-0.009
 ambulance-0.009
 cop-0.009
Token holding
Token ID4769
Feature activation+8.27e-01

Pos logit contributions
 stranger+0.012
guard+0.012
 answer+0.011
 ambulance+0.011
 cop+0.011
Neg logit contributions
icol-0.012
bed-0.011
text-0.011
und-0.011
ra-0.010
Token the
Token ID262
Feature activation+1.57e+00

Pos logit contributions
 answer+0.017
 stranger+0.017
 cop+0.016
 ambulance+0.016
 man+0.016
Neg logit contributions
icol-0.022
text-0.021
und-0.021
keeping-0.020
ifying-0.020
Token magn
Token ID7842
Feature activation-1.29e+00

Pos logit contributions
guard+0.036
 pastor+0.033
 stranger+0.033
 officer+0.032
old+0.032
Neg logit contributions
ra-0.035
icol-0.035
ifying-0.034
bed-0.034
 hands-0.034
Tokenifying
Token ID4035
Feature activation-5.81e+00

Pos logit contributions
icol+0.010
text+0.008
keeping+0.006
 P+0.005
und+0.005
Neg logit contributions
 stranger-0.050
guard-0.050
 answer-0.049
 ambulance-0.048
 seller-0.048
Token glass
Token ID5405
Feature activation-4.18e-01

Pos logit contributions
icol+0.134
text+0.124
bed+0.123
und+0.122
ra+0.117
Neg logit contributions
 stranger-0.138
guard-0.135
 answer-0.133
 ambulance-0.130
 seller-0.127
Token over
Token ID625
Feature activation+1.24e-01

Pos logit contributions
icol+0.011
text+0.010
und+0.010
bed+0.010
ra+0.009
Neg logit contributions
 stranger-0.009
guard-0.009
 answer-0.009
 ambulance-0.008
 seller-0.008
Token a
Token ID257
Feature activation-1.82e+00

Pos logit contributions
 stranger+0.003
guard+0.002
 answer+0.002
 lady+0.002
 man+0.002
Neg logit contributions
icol-0.004
text-0.003
und-0.003
ifying-0.003
™-0.003
Token black
Token ID2042
Feature activation-7.81e-01

Pos logit contributions
icol+0.045
ifying+0.042
und+0.041
bed+0.041
ra+0.040
Neg logit contributions
guard-0.039
 ambulance-0.037
 stranger-0.036
 pastor-0.036
 answer-0.035
Token ant
Token ID1885
Feature activation-2.54e-01

Pos logit contributions
icol+0.020
bed+0.018
und+0.018
ra+0.018
text+0.018
Neg logit contributions
guard-0.016
 stranger-0.016
 pastor-0.015
 ambulance-0.015
 answer-0.015
Token She
Token ID1375
Feature activation+1.92e-01

Pos logit contributions
 stranger+0.005
 answer+0.004
 veterinarian+0.004
 cop+0.004
guard+0.004
Neg logit contributions
icol-0.008
text-0.008
ifying-0.008
ra-0.008
und-0.008
Token took
Token ID1718
Feature activation-5.49e-01

Pos logit contributions
 stranger+0.005
guard+0.005
 ambulance+0.005
 answer+0.005
 pastor+0.005
Neg logit contributions
icol-0.004
bed-0.004
und-0.004
text-0.004
ra-0.004
Token out
Token ID503
Feature activation+2.00e+00

Pos logit contributions
icol+0.014
ifying+0.013
und+0.013
text+0.012
keeping+0.012
Neg logit contributions
 answer-0.012
 stranger-0.012
 ambulance-0.011
guard-0.011
 cop-0.011
Token a
Token ID257
Feature activation-9.87e-02

Pos logit contributions
 stranger+0.048
guard+0.047
 answer+0.047
 ambulance+0.046
 cop+0.044
Neg logit contributions
icol-0.046
text-0.042
und-0.041
bed-0.041
ra-0.040
Token magn
Token ID7842
Feature activation-1.54e+00

Pos logit contributions
icol+0.002
ifying+0.002
und+0.002
ra+0.002
 M+0.002
Neg logit contributions
guard-0.002
 pastor-0.002
 ambulance-0.002
 stranger-0.002
 seller-0.002
Tokenifying
Token ID4035
Feature activation-5.80e+00

Pos logit contributions
icol+0.011
text+0.009
keeping+0.007
 P+0.007
und+0.006
Neg logit contributions
guard-0.059
 stranger-0.059
 answer-0.059
 ambulance-0.057
 seller-0.057
Token glass
Token ID5405
Feature activation-2.36e-01

Pos logit contributions
icol+0.140
text+0.131
bed+0.130
und+0.127
ra+0.123
Neg logit contributions
 stranger-0.132
guard-0.128
 answer-0.128
 ambulance-0.123
 seller-0.120
Token and
Token ID290
Feature activation-5.61e-01

Pos logit contributions
icol+0.006
text+0.006
und+0.006
bed+0.005
ifying+0.005
Neg logit contributions
 stranger-0.005
guard-0.005
 answer-0.005
 ambulance-0.005
 pastor-0.005
Token looked
Token ID3114
Feature activation+1.82e-01

Pos logit contributions
icol+0.014
text+0.013
und+0.013
ifying+0.013
bed+0.012
Neg logit contributions
 stranger-0.012
 answer-0.012
guard-0.012
 seller-0.011
 ambulance-0.011
Token very
Token ID845
Feature activation+5.55e-01

Pos logit contributions
 stranger+0.004
 answer+0.004
guard+0.004
 ambulance+0.004
 seller+0.004
Neg logit contributions
icol-0.004
text-0.004
und-0.004
bed-0.004
ra-0.004
Token carefully
Token ID7773
Feature activation-4.68e-01

Pos logit contributions
guard+0.013
 ambulance+0.012
 stranger+0.012
 officer+0.012
 cop+0.012
Neg logit contributions
icol-0.013
bed-0.012
 M-0.012
ra-0.012
und-0.012
Token something
Token ID1223
Feature activation+1.83e+00

Pos logit contributions
 answer+0.006
 stranger+0.006
 ambulance+0.006
 owl+0.006
 doctor+0.006
Neg logit contributions
icol-0.007
ifying-0.007
text-0.007
und-0.006
keeping-0.006
Token delicious
Token ID12625
Feature activation-5.82e-01

Pos logit contributions
 stranger+0.036
guard+0.035
 ambulance+0.034
 answer+0.033
 cop+0.033
Neg logit contributions
icol-0.048
text-0.045
und-0.045
bed-0.044
ifying-0.044
Token with
Token ID351
Feature activation-2.18e-01

Pos logit contributions
icol+0.013
text+0.012
und+0.011
bed+0.011
ifying+0.011
Neg logit contributions
 stranger-0.015
guard-0.015
 answer-0.015
 ambulance-0.014
 seller-0.014
Token it
Token ID340
Feature activation+6.69e-02

Pos logit contributions
icol+0.005
text+0.004
und+0.004
bed+0.004
ra+0.004
Neg logit contributions
 stranger-0.006
 answer-0.006
guard-0.006
 ambulance-0.005
 seller-0.005
Token".
Token ID1911
Feature activation+1.11e+00

Pos logit contributions
 stranger+0.002
 answer+0.002
guard+0.002
 ambulance+0.002
 seller+0.002
Neg logit contributions
icol-0.001
text-0.001
und-0.001
bed-0.001
ra-0.001
Token
Token ID220
Feature activation+2.84e+00

Pos logit contributions
 driver+0.023
 lady+0.023
 man+0.023
 officer+0.023
 stranger+0.023
Neg logit contributions
icol-0.029
ra-0.027
bed-0.026
 Chick-0.025
ifying-0.025
Token
Token ID198
Feature activation-8.28e-02

Pos logit contributions
 stranger+0.066
 answer+0.065
guard+0.065
 ambulance+0.064
 driver+0.063
Neg logit contributions
icol-0.066
ifying-0.058
text-0.058
bed-0.058
ra-0.057
Token
Token ID198
Feature activation+2.22e-01

Pos logit contributions
icol+0.002
ifying+0.002
bed+0.002
ra+0.002
text+0.002
Neg logit contributions
 stranger-0.002
guard-0.002
 lady-0.002
 driver-0.002
 man-0.002
TokenTom
Token ID13787
Feature activation-9.21e-01

Pos logit contributions
 stranger+0.004
guard+0.004
 answer+0.004
 ambulance+0.004
 cop+0.004
Neg logit contributions
icol-0.006
text-0.006
bed-0.006
und-0.006
ifying-0.006
Token nodded
Token ID14464
Feature activation-1.21e-01

Pos logit contributions
icol+0.022
text+0.020
ifying+0.020
und+0.020
bed+0.019
Neg logit contributions
 stranger-0.021
guard-0.020
 answer-0.020
 ambulance-0.020
 seller-0.019
Token and
Token ID290
Feature activation+1.96e-01

Pos logit contributions
icol+0.003
text+0.003
und+0.003
ifying+0.003
keeping+0.003
Neg logit contributions
 stranger-0.003
 answer-0.002
guard-0.002
 ambulance-0.002
 seller-0.002
Token his
Token ID465
Feature activation+8.83e-01

Pos logit contributions
guard+0.002
 stranger+0.002
 pastor+0.002
 ambulance+0.002
 seller+0.002
Neg logit contributions
bed-0.002
icol-0.002
 P-0.002
 M-0.002
text-0.002
Token mom
Token ID1995
Feature activation+5.53e-01

Pos logit contributions
guard+0.024
 stranger+0.023
 ambulance+0.022
 pastor+0.022
 answer+0.022
Neg logit contributions
icol-0.016
 M-0.015
 hands-0.015
 P-0.015
ra-0.015
Token,
Token ID11
Feature activation+8.29e-01

Pos logit contributions
 stranger+0.013
guard+0.013
 ambulance+0.013
 answer+0.012
 pastor+0.012
Neg logit contributions
icol-0.012
bed-0.011
und-0.011
 P-0.011
text-0.011
Token "
Token ID366
Feature activation+7.06e-01

Pos logit contributions
guard+0.020
 stranger+0.020
 ambulance+0.019
 answer+0.019
 pastor+0.019
Neg logit contributions
icol-0.017
bed-0.017
 P-0.016
und-0.016
text-0.016
TokenCan
Token ID6090
Feature activation+6.70e-01

Pos logit contributions
 stranger+0.016
guard+0.016
 ambulance+0.015
 answer+0.015
 cop+0.015
Neg logit contributions
icol-0.018
text-0.017
und-0.016
bed-0.016
ra-0.015
Token I
Token ID314
Feature activation+4.63e+00

Pos logit contributions
 stranger+0.019
guard+0.019
 answer+0.019
 ambulance+0.019
 pastor+0.018
Neg logit contributions
icol-0.012
text-0.011
bed-0.010
und-0.010
ra-0.010
Token keep
Token ID1394
Feature activation+1.09e+00

Pos logit contributions
 pastor+0.109
 stranger+0.103
 ambulance+0.099
guard+0.099
 officer+0.097
Neg logit contributions
 P-0.102
ifying-0.102
ra-0.102
 M-0.100
text-0.098
Token it
Token ID340
Feature activation+2.38e+00

Pos logit contributions
 stranger+0.028
 answer+0.028
 ambulance+0.027
guard+0.027
 owl+0.027
Neg logit contributions
icol-0.024
text-0.023
und-0.022
ifying-0.021
keeping-0.020
Token?"
Token ID1701
Feature activation+1.45e+00

Pos logit contributions
 stranger+0.055
guard+0.053
 pastor+0.053
 ambulance+0.051
 answer+0.050
Neg logit contributions
icol-0.054
bed-0.051
und-0.050
text-0.050
 P-0.049
Token
Token ID220
Feature activation+3.02e+00

Pos logit contributions
 stranger+0.028
 answer+0.028
 ambulance+0.026
guard+0.025
 seller+0.025
Neg logit contributions
icol-0.041
text-0.039
ifying-0.039
und-0.038
ra-0.037
Token
Token ID198
Feature activation-1.42e-01

Pos logit contributions
 stranger+0.071
 answer+0.071
guard+0.070
 ambulance+0.069
 seller+0.067
Neg logit contributions
icol-0.070
ifying-0.062
text-0.061
ra-0.060
und-0.060
Token.
Token ID13
Feature activation-7.03e-01

Pos logit contributions
 stranger+0.033
guard+0.033
 ambulance+0.031
 answer+0.031
 seller+0.031
Neg logit contributions
icol-0.031
text-0.030
und-0.030
bed-0.029
ra-0.029
Token I
Token ID314
Feature activation+2.01e+00

Pos logit contributions
icol+0.015
und+0.013
rily+0.013
bed+0.013
keeping+0.013
Neg logit contributions
 cop-0.018
 ambulance-0.018
 answer-0.018
 veterinarian-0.018
 driver-0.017
Token will
Token ID481
Feature activation+5.90e-01

Pos logit contributions
 stranger+0.042
guard+0.040
 ambulance+0.039
 pastor+0.038
 answer+0.038
Neg logit contributions
icol-0.050
und-0.049
ra-0.049
text-0.048
bed-0.047
Token watch
Token ID2342
Feature activation+6.28e-01

Pos logit contributions
guard+0.012
 stranger+0.012
 seller+0.011
 ambulance+0.011
 pastor+0.010
Neg logit contributions
icol-0.015
ra-0.015
bed-0.014
und-0.014
 P-0.014
Token cartoons
Token ID30070
Feature activation-8.38e-01

Pos logit contributions
 ambulance+0.016
 answer+0.016
 stranger+0.016
guard+0.015
 cop+0.015
Neg logit contributions
icol-0.014
text-0.013
und-0.013
keeping-0.012
bed-0.012
Token all
Token ID477
Feature activation+2.81e+00

Pos logit contributions
icol+0.016
und+0.015
text+0.015
bed+0.015
ra+0.015
Neg logit contributions
 stranger-0.022
guard-0.022
 answer-0.021
 ambulance-0.021
 pastor-0.020
Token day
Token ID1110
Feature activation-2.25e-01

Pos logit contributions
 stranger+0.067
guard+0.067
 answer+0.066
 ambulance+0.065
 cop+0.062
Neg logit contributions
icol-0.067
text-0.061
bed-0.059
und-0.058
ra-0.057
Token and
Token ID290
Feature activation-2.52e-01

Pos logit contributions
icol+0.006
text+0.006
und+0.006
bed+0.006
ifying+0.006
Neg logit contributions
 stranger-0.004
 answer-0.004
guard-0.004
 man-0.004
 ambulance-0.004
Token all
Token ID477
Feature activation+2.59e+00

Pos logit contributions
icol+0.006
text+0.006
und+0.006
bed+0.006
ifying+0.006
Neg logit contributions
 stranger-0.005
guard-0.005
 answer-0.005
 ambulance-0.005
 cop-0.005
Token night
Token ID1755
Feature activation-1.19e+00

Pos logit contributions
 stranger+0.067
guard+0.066
 answer+0.065
 ambulance+0.064
 cop+0.062
Neg logit contributions
icol-0.056
text-0.051
und-0.050
bed-0.049
ra-0.048
Token.
Token ID13
Feature activation-1.02e+00

Pos logit contributions
icol+0.030
text+0.029
bed+0.027
und+0.027
ra+0.027
Neg logit contributions
guard-0.025
 stranger-0.025
 answer-0.025
 ambulance-0.024
 man-0.023
Token.
Token ID13
Feature activation-1.81e-01

Pos logit contributions
 stranger+0.025
 ambulance+0.024
guard+0.024
 pastor+0.024
 cop+0.023
Neg logit contributions
icol-0.032
 P-0.030
text-0.030
bed-0.030
 M-0.030
Token<|endoftext|>
Token ID50256
Feature activation-1.69e+00

Pos logit contributions
icol+0.005
ra+0.005
bed+0.005
ifying+0.005
rily+0.005
Neg logit contributions
 veterinarian-0.003
 stranger-0.003
 answer-0.003
 doctor-0.003
 cop-0.003
TokenOnce
Token ID7454
Feature activation-7.53e-01

Pos logit contributions
icol+0.056
ra+0.050
ifying+0.049
bed+0.048
rily+0.047
Neg logit contributions
 answer-0.028
 stranger-0.027
guard-0.027
 man-0.026
 lady-0.025
Token upon
Token ID2402
Feature activation-1.56e+00

Pos logit contributions
icol+0.019
text+0.017
und+0.017
bed+0.017
ifying+0.017
Neg logit contributions
 stranger-0.016
guard-0.016
 answer-0.016
 ambulance-0.015
 cop-0.015
Token a
Token ID257
Feature activation-1.48e+00

Pos logit contributions
icol+0.034
text+0.031
und+0.031
bed+0.030
ra+0.029
Neg logit contributions
 stranger-0.039
guard-0.039
 answer-0.038
 ambulance-0.037
 cop-0.036
Token time
Token ID640
Feature activation+2.78e+00

Pos logit contributions
icol+0.028
keeping+0.027
bed+0.026
ifying+0.026
und+0.025
Neg logit contributions
 stranger-0.041
 answer-0.040
 ambulance-0.038
guard-0.038
 cop-0.038
Token,
Token ID11
Feature activation+4.25e-01

Pos logit contributions
 stranger+0.054
guard+0.053
 answer+0.053
 ambulance+0.051
 cop+0.049
Neg logit contributions
icol-0.077
text-0.069
ifying-0.069
und-0.069
keeping-0.068
Token there
Token ID612
Feature activation-2.58e-01

Pos logit contributions
 answer+0.010
 stranger+0.009
guard+0.009
 cop+0.009
 driver+0.009
Neg logit contributions
icol-0.012
keeping-0.010
text-0.010
und-0.009
ra-0.009
Token was
Token ID373
Feature activation+2.63e-01

Pos logit contributions
icol+0.007
ifying+0.006
keeping+0.006
ra+0.006
text+0.006
Neg logit contributions
 stranger-0.005
 answer-0.005
guard-0.005
 man-0.005
 stand-0.005
Token a
Token ID257
Feature activation-3.23e-01

Pos logit contributions
 stranger+0.005
 answer+0.005
 cop+0.004
 ambulance+0.004
 veterinarian+0.004
Neg logit contributions
icol-0.008
und-0.007
ra-0.007
ifying-0.007
keeping-0.007
Token little
Token ID1310
Feature activation+1.88e-01

Pos logit contributions
icol+0.006
 P+0.006
 M+0.006
text+0.006
 Po+0.006
Neg logit contributions
guard-0.008
 answer-0.008
 stranger-0.008
 ambulance-0.008
old-0.007
Token
Token ID198
Feature activation-4.42e-01

Pos logit contributions
icol+0.015
ifying+0.013
ra+0.013
text+0.013
bed+0.013
Neg logit contributions
 stranger-0.015
guard-0.014
 answer-0.014
 man-0.013
 lady-0.013
TokenThe
Token ID464
Feature activation+6.16e-01

Pos logit contributions
icol+0.009
text+0.008
und+0.008
bed+0.008
ifying+0.008
Neg logit contributions
 stranger-0.012
guard-0.011
 answer-0.011
 ambulance-0.011
 cop-0.011
Token sk
Token ID1341
Feature activation+8.83e-01

Pos logit contributions
 stranger+0.013
guard+0.013
 answer+0.012
 ambulance+0.012
 pastor+0.012
Neg logit contributions
icol-0.015
text-0.014
bed-0.014
ra-0.014
und-0.014
Tokenunk
Token ID2954
Feature activation+3.81e-01

Pos logit contributions
guard+0.020
 ambulance+0.020
 answer+0.020
 stranger+0.020
 seller+0.019
Neg logit contributions
icol-0.021
text-0.018
 P-0.017
ifying-0.017
und-0.017
Token thanked
Token ID26280
Feature activation-4.44e-01

Pos logit contributions
 stranger+0.009
guard+0.009
 answer+0.009
 ambulance+0.008
 driver+0.008
Neg logit contributions
icol-0.009
text-0.008
ifying-0.008
und-0.008
bed-0.008
Token the
Token ID262
Feature activation+2.69e+00

Pos logit contributions
icol+0.011
ifying+0.010
text+0.010
und+0.010
bed+0.009
Neg logit contributions
 answer-0.011
 stranger-0.011
guard-0.010
 ambulance-0.010
 owl-0.010
Token box
Token ID3091
Feature activation+1.23e+00

Pos logit contributions
guard+0.053
 stranger+0.053
 pastor+0.050
old+0.048
 ambulance+0.048
Neg logit contributions
icol-0.067
ra-0.065
bed-0.063
 P-0.063
und-0.062
Token for
Token ID329
Feature activation+2.87e-01

Pos logit contributions
 stranger+0.030
guard+0.029
 ambulance+0.029
 pastor+0.029
 answer+0.028
Neg logit contributions
icol-0.025
bed-0.024
text-0.024
und-0.024
ra-0.023
Token the
Token ID262
Feature activation+1.06e+00

Pos logit contributions
 stranger+0.007
guard+0.007
 answer+0.006
 ambulance+0.006
 cop+0.006
Neg logit contributions
icol-0.007
text-0.006
und-0.006
bed-0.006
ifying-0.006
Token exciting
Token ID7895
Feature activation+7.00e-01

Pos logit contributions
 stranger+0.016
guard+0.016
 answer+0.015
 ambulance+0.015
 seller+0.014
Neg logit contributions
icol-0.033
text-0.031
und-0.031
bed-0.031
ra-0.030
Token ride
Token ID6594
Feature activation-4.01e-01

Pos logit contributions
 stranger+0.011
guard+0.011
 ambulance+0.011
 answer+0.011
 seller+0.010
Neg logit contributions
icol-0.021
text-0.020
und-0.020
bed-0.019
ra-0.019
Token others
Token ID1854
Feature activation+5.24e-01

Pos logit contributions
icol+0.026
text+0.023
und+0.023
bed+0.023
ifying+0.022
Neg logit contributions
 stranger-0.029
guard-0.029
 answer-0.028
 ambulance-0.028
 cop-0.027
Token were
Token ID547
Feature activation-2.69e-01

Pos logit contributions
 stranger+0.014
 ambulance+0.014
 answer+0.014
guard+0.014
 cop+0.014
Neg logit contributions
und-0.009
icol-0.009
text-0.009
bed-0.009
ra-0.008
Token mean
Token ID1612
Feature activation+6.47e-02

Pos logit contributions
und+0.006
bed+0.006
icol+0.006
 P+0.006
ra+0.006
Neg logit contributions
 stranger-0.006
guard-0.006
 ambulance-0.005
 seller-0.005
 cop-0.005
Token and
Token ID290
Feature activation-1.11e+00

Pos logit contributions
 stranger+0.001
guard+0.001
 answer+0.001
 ambulance+0.001
 cop+0.001
Neg logit contributions
icol-0.002
text-0.002
bed-0.002
und-0.002
ifying-0.002
Token made
Token ID925
Feature activation-4.09e-01

Pos logit contributions
icol+0.028
und+0.027
 P+0.026
bed+0.026
ifying+0.026
Neg logit contributions
 stranger-0.021
 ambulance-0.021
guard-0.021
 answer-0.020
 seller-0.020
Token fun
Token ID1257
Feature activation+1.21e+00

Pos logit contributions
icol+0.011
text+0.010
und+0.010
ifying+0.010
bed+0.010
Neg logit contributions
 stranger-0.008
 answer-0.008
guard-0.008
 ambulance-0.008
 cop-0.008
Token of
Token ID286
Feature activation-9.39e-01

Pos logit contributions
 stranger+0.027
guard+0.027
 answer+0.026
 ambulance+0.026
 cop+0.025
Neg logit contributions
icol-0.029
text-0.027
und-0.027
bed-0.027
ra-0.026
Token the
Token ID262
Feature activation+1.03e+00

Pos logit contributions
icol+0.022
text+0.020
und+0.020
bed+0.020
ra+0.019
Neg logit contributions
 stranger-0.022
guard-0.022
 answer-0.021
 ambulance-0.021
 seller-0.020
Token ugly
Token ID13400
Feature activation+4.51e-01

Pos logit contributions
 stranger+0.018
guard+0.018
 ambulance+0.017
 answer+0.017
 seller+0.016
Neg logit contributions
icol-0.030
text-0.028
und-0.028
bed-0.028
ra-0.027
Token pupil
Token ID34047
Feature activation-8.51e-01

Pos logit contributions
guard+0.010
 pastor+0.009
 ambulance+0.009
 firefighter+0.009
 seller+0.009
Neg logit contributions
icol-0.011
 P-0.010
bed-0.010
text-0.010
 Po-0.009
Token,
Token ID11
Feature activation-6.51e-01

Pos logit contributions
icol+0.020
text+0.019
und+0.019
bed+0.018
ra+0.018
Neg logit contributions
 stranger-0.019
guard-0.019
 answer-0.019
 ambulance-0.018
 cop-0.018
Token huge
Token ID3236
Feature activation+5.56e-01

Pos logit contributions
old+0.000
guard+0.000
 answer+0.000
vern+0.000
 seller+0.000
Neg logit contributions
icol-0.000
ifying-0.000
 M-0.000
und-0.000
 P-0.000
Token shadow
Token ID9082
Feature activation+7.77e-01

Pos logit contributions
guard+0.013
vern+0.012
old+0.012
 pastor+0.012
 stranger+0.012
Neg logit contributions
icol-0.013
und-0.012
bed-0.011
ra-0.011
 M-0.011
Token.
Token ID13
Feature activation+3.16e-01

Pos logit contributions
 stranger+0.021
guard+0.021
 answer+0.020
 ambulance+0.020
 pastor+0.020
Neg logit contributions
icol-0.015
und-0.014
text-0.014
bed-0.013
ifying-0.013
Token She
Token ID1375
Feature activation+6.56e-01

Pos logit contributions
 answer+0.008
 stranger+0.008
 ambulance+0.007
 cop+0.007
 lady+0.007
Neg logit contributions
icol-0.007
bed-0.007
text-0.007
und-0.006
ra-0.006
Token looked
Token ID3114
Feature activation+1.06e+00

Pos logit contributions
 stranger+0.016
guard+0.016
 answer+0.015
 ambulance+0.015
 seller+0.015
Neg logit contributions
icol-0.015
text-0.014
und-0.013
bed-0.013
ra-0.013
Token up
Token ID510
Feature activation+9.87e-01

Pos logit contributions
guard+0.027
 stranger+0.025
 ambulance+0.024
 pastor+0.024
 seller+0.024
Neg logit contributions
icol-0.023
 P-0.020
und-0.020
bed-0.020
text-0.020
Token and
Token ID290
Feature activation+5.50e-01

Pos logit contributions
 stranger+0.024
guard+0.024
 ambulance+0.024
 answer+0.023
 pastor+0.023
Neg logit contributions
icol-0.021
und-0.019
bed-0.019
text-0.019
ra-0.019
Token saw
Token ID2497
Feature activation+6.34e-01

Pos logit contributions
 stranger+0.010
guard+0.010
 answer+0.010
 ambulance+0.010
 cop+0.009
Neg logit contributions
icol-0.015
text-0.014
und-0.014
bed-0.014
ra-0.014
Token a
Token ID257
Feature activation-5.42e-01

Pos logit contributions
 stranger+0.016
guard+0.016
 answer+0.016
 ambulance+0.015
 cop+0.015
Neg logit contributions
icol-0.014
text-0.013
und-0.012
bed-0.012
ifying-0.012
Token huge
Token ID3236
Feature activation+6.73e-02

Pos logit contributions
icol+0.014
ifying+0.012
und+0.012
 M+0.012
 P+0.012
Neg logit contributions
guard-0.011
old-0.011
 answer-0.011
 ambulance-0.010
 stranger-0.010
Token dog
Token ID3290
Feature activation-1.67e-02

Pos logit contributions
guard+0.002
 pastor+0.001
 stranger+0.001
vern+0.001
old+0.001
Neg logit contributions
icol-0.002
und-0.001
bed-0.001
ra-0.001
 M-0.001
Token not
Token ID407
Feature activation+1.02e+00

Pos logit contributions
icol+0.018
text+0.017
und+0.016
ifying+0.016
bed+0.016
Neg logit contributions
 stranger-0.013
 answer-0.013
guard-0.013
 ambulance-0.013
 cop-0.012
Token want
Token ID765
Feature activation+1.06e+00

Pos logit contributions
guard+0.021
 seller+0.020
 ambulance+0.019
 stranger+0.019
 operator+0.019
Neg logit contributions
keeping-0.020
ra-0.020
e-0.020
 hands-0.019
 P-0.019
Token to
Token ID284
Feature activation+6.27e-02

Pos logit contributions
 stranger+0.025
guard+0.025
 ambulance+0.024
 seller+0.024
 answer+0.024
Neg logit contributions
icol-0.023
 P-0.021
text-0.021
 M-0.021
bed-0.021
Token sell
Token ID3677
Feature activation+8.61e-01

Pos logit contributions
 stranger+0.001
guard+0.001
 ambulance+0.001
 pastor+0.001
 firefighter+0.001
Neg logit contributions
icol-0.002
ra-0.002
bed-0.002
 P-0.002
keeping-0.002
Token their
Token ID511
Feature activation+8.52e-01

Pos logit contributions
 stranger+0.023
guard+0.023
 ambulance+0.022
 pastor+0.022
 answer+0.022
Neg logit contributions
icol-0.015
text-0.014
bed-0.014
 P-0.014
und-0.014
Token games
Token ID1830
Feature activation-1.31e-01

Pos logit contributions
guard+0.019
 stranger+0.019
 ambulance+0.018
 pastor+0.018
 officer+0.018
Neg logit contributions
icol-0.019
text-0.018
ra-0.018
 hands-0.018
bed-0.017
Token.
Token ID13
Feature activation+3.32e-01

Pos logit contributions
bed+0.002
 P+0.002
keeping+0.002
ifying+0.002
text+0.002
Neg logit contributions
 pastor-0.003
 stranger-0.003
 ambulance-0.003
guard-0.003
 firefighter-0.003
Token They
Token ID1119
Feature activation+1.10e+00

Pos logit contributions
 cop+0.008
 driver+0.008
 ambulance+0.008
 stranger+0.008
 man+0.008
Neg logit contributions
icol-0.007
text-0.007
bed-0.007
und-0.007
ra-0.007
Token love
Token ID1842
Feature activation+1.84e-01

Pos logit contributions
guard+0.024
 stranger+0.022
vern+0.021
 pastor+0.021
 ambulance+0.021
Neg logit contributions
ra-0.027
 hands-0.025
icol-0.025
und-0.025
ifying-0.024
Token all
Token ID477
Feature activation+2.81e+00

Pos logit contributions
 stranger+0.004
guard+0.004
 answer+0.004
 ambulance+0.004
 seller+0.004
Neg logit contributions
icol-0.004
text-0.004
und-0.004
bed-0.004
ra-0.004
Token their
Token ID511
Feature activation-4.16e-01

Pos logit contributions
 stranger+0.074
guard+0.073
 answer+0.071
 ambulance+0.071
 seller+0.069
Neg logit contributions
icol-0.057
text-0.053
und-0.052
bed-0.051
ifying-0.049
Token water
Token ID1660
Feature activation+1.38e+00

Pos logit contributions
 stranger+0.025
guard+0.025
 cop+0.024
 answer+0.024
 ambulance+0.024
Neg logit contributions
icol-0.031
text-0.029
ifying-0.029
und-0.028
ra-0.028
Token.
Token ID13
Feature activation+2.60e-01

Pos logit contributions
 stranger+0.030
guard+0.030
 answer+0.029
 ambulance+0.029
 seller+0.028
Neg logit contributions
icol-0.035
text-0.033
und-0.032
bed-0.031
ra-0.031
Token But
Token ID887
Feature activation+1.93e+00

Pos logit contributions
 lady+0.005
 stranger+0.005
 nod+0.005
 veterinarian+0.005
 man+0.005
Neg logit contributions
icol-0.008
keeping-0.007
und-0.007
bed-0.006
text-0.006
Token their
Token ID511
Feature activation+3.61e-01

Pos logit contributions
 stranger+0.047
guard+0.047
 answer+0.046
 ambulance+0.045
 seller+0.045
Neg logit contributions
icol-0.047
und-0.042
text-0.041
keeping-0.040
ifying-0.038
Token moms
Token ID34015
Feature activation+1.67e+00

Pos logit contributions
guard+0.007
 stranger+0.006
 answer+0.006
 ambulance+0.006
 cop+0.006
Neg logit contributions
icol-0.010
ra-0.009
text-0.009
bed-0.009
 hands-0.009
Token say
Token ID910
Feature activation+1.34e-01

Pos logit contributions
guard+0.031
 stranger+0.030
 pastor+0.028
 firefighter+0.027
 cop+0.027
Neg logit contributions
und-0.043
ra-0.042
icol-0.040
 P-0.040
bed-0.039
Token no
Token ID645
Feature activation+1.80e-01

Pos logit contributions
 stranger+0.004
guard+0.004
 answer+0.004
 ambulance+0.004
 cop+0.004
Neg logit contributions
icol-0.002
text-0.002
bed-0.002
und-0.002
ra-0.002
Token.
Token ID13
Feature activation-5.85e-01

Pos logit contributions
 seller+0.005
 firefighter+0.005
 stranger+0.005
guard+0.005
vern+0.005
Neg logit contributions
 M-0.003
 P-0.003
 hands-0.003
icol-0.002
bed-0.002
Token They
Token ID1119
Feature activation+1.04e+00

Pos logit contributions
icol+0.015
keeping+0.013
und+0.013
bed+0.012
ra+0.012
Neg logit contributions
 stranger-0.014
 ambulance-0.014
 lady-0.014
 answer-0.014
 veterinarian-0.013
Token say
Token ID910
Feature activation+8.04e-02

Pos logit contributions
guard+0.020
 stranger+0.020
 cop+0.019
 pastor+0.018
 firefighter+0.018
Neg logit contributions
ra-0.026
text-0.025
und-0.025
icol-0.025
ifying-0.024
Token the
Token ID262
Feature activation+1.29e+00

Pos logit contributions
 stranger+0.002
guard+0.002
 answer+0.002
 ambulance+0.002
 cop+0.002
Neg logit contributions
icol-0.001
text-0.001
und-0.001
bed-0.001
ifying-0.001
Token Squirrel
Token ID44259
Feature activation+9.99e-01

Pos logit contributions
guard+0.034
 pastor+0.031
old+0.031
 answer+0.030
 ambulance+0.029
Neg logit contributions
icol-0.064
 P-0.062
text-0.060
bed-0.060
ra-0.060
Token sh
Token ID427
Feature activation+3.06e-01

Pos logit contributions
 stranger+0.024
guard+0.023
 ambulance+0.023
 answer+0.023
 pastor+0.023
Neg logit contributions
icol-0.022
und-0.021
bed-0.021
text-0.020
ra-0.020
Tokenivered
Token ID6396
Feature activation+1.39e-01

Pos logit contributions
 stranger+0.008
guard+0.008
 answer+0.008
 ambulance+0.007
 cop+0.007
Neg logit contributions
icol-0.007
und-0.006
bed-0.006
text-0.006
ra-0.006
Token and
Token ID290
Feature activation-3.55e-01

Pos logit contributions
 stranger+0.003
guard+0.003
 answer+0.003
 ambulance+0.003
 cop+0.003
Neg logit contributions
icol-0.003
text-0.003
und-0.003
bed-0.003
ifying-0.003
Token said
Token ID531
Feature activation-5.69e-01

Pos logit contributions
icol+0.010
text+0.010
ifying+0.010
und+0.010
keeping+0.009
Neg logit contributions
 stranger-0.006
guard-0.006
 answer-0.006
 seller-0.006
 ambulance-0.006
Token "
Token ID366
Feature activation-2.86e-01

Pos logit contributions
icol+0.013
bed+0.012
text+0.012
und+0.012
ra+0.012
Neg logit contributions
 stranger-0.013
guard-0.013
 ambulance-0.013
 answer-0.013
 cop-0.012
TokenI
Token ID40
Feature activation+1.94e+00

Pos logit contributions
icol+0.007
text+0.006
und+0.006
bed+0.006
ra+0.006
Neg logit contributions
 stranger-0.007
guard-0.007
 answer-0.006
 ambulance-0.006
 seller-0.006
Token'm
Token ID1101
Feature activation+1.07e+00

Pos logit contributions
 stranger+0.044
 ambulance+0.043
guard+0.042
 cop+0.041
 pastor+0.041
Neg logit contributions
icol-0.045
und-0.044
text-0.043
ra-0.043
bed-0.041
Token so
Token ID523
Feature activation+2.11e+00

Pos logit contributions
 stranger+0.022
guard+0.022
 answer+0.022
 ambulance+0.021
 cop+0.020
Neg logit contributions
icol-0.028
text-0.026
und-0.026
bed-0.025
ifying-0.025
Token scared
Token ID12008
Feature activation+1.07e+00

Pos logit contributions
guard+0.043
 stranger+0.043
 ambulance+0.041
 answer+0.041
 cop+0.040
Neg logit contributions
icol-0.053
und-0.051
bed-0.050
text-0.049
ifying-0.049
Token!
Token ID0
Feature activation+5.43e-01

Pos logit contributions
 stranger+0.025
guard+0.024
 answer+0.023
 ambulance+0.023
 cop+0.022
Neg logit contributions
icol-0.029
text-0.026
bed-0.025
ra-0.025
und-0.025
Token
Token ID220
Feature activation+2.45e+00

Pos logit contributions
 stranger+0.017
 man+0.017
 seller+0.017
 answer+0.017
 lady+0.017
Neg logit contributions
icol-0.017
ra-0.016
ifying-0.015
text-0.015
und-0.014
Token
Token ID198
Feature activation-5.59e-01

Pos logit contributions
 answer+0.057
guard+0.057
 stranger+0.056
 ambulance+0.055
 seller+0.055
Neg logit contributions
icol-0.057
ifying-0.050
text-0.050
bed-0.050
ra-0.048
TokenThey
Token ID2990
Feature activation+5.74e-01

Pos logit contributions
icol+0.013
ifying+0.012
ra+0.011
text+0.011
bed+0.011
Neg logit contributions
 stranger-0.013
guard-0.013
 answer-0.013
 seller-0.013
 driver-0.012
Token both
Token ID1111
Feature activation+1.86e+00

Pos logit contributions
 stranger+0.014
 answer+0.014
guard+0.013
 seller+0.013
 ambulance+0.013
Neg logit contributions
icol-0.013
und-0.012
text-0.012
bed-0.011
keeping-0.011
Token smiled
Token ID13541
Feature activation-6.14e-01

Pos logit contributions
 stranger+0.042
guard+0.041
 answer+0.041
 ambulance+0.039
 seller+0.039
Neg logit contributions
icol-0.045
und-0.041
text-0.041
bed-0.041
ifying-0.039
Token and
Token ID290
Feature activation-1.37e+00

Pos logit contributions
icol+0.015
text+0.014
und+0.014
bed+0.014
ra+0.014
Neg logit contributions
 stranger-0.013
 answer-0.013
guard-0.013
 ambulance-0.012
 seller-0.012
Token laughed
Token ID13818
Feature activation+1.60e-01

Pos logit contributions
icol+0.037
und+0.036
ifying+0.035
icer+0.035
ASH+0.034
Neg logit contributions
 stranger-0.025
 wife-0.025
 man-0.024
 answer-0.024
 lady-0.024
Token,
Token ID11
Feature activation-8.13e-01

Pos logit contributions
 stranger+0.003
 answer+0.003
guard+0.003
 ambulance+0.003
 seller+0.003
Neg logit contributions
icol-0.004
text-0.004
und-0.004
ra-0.004
keeping-0.004
Token and
Token ID290
Feature activation-1.20e+00

Pos logit contributions
icol+0.022
und+0.022
text+0.022
ifying+0.021
ra+0.021
Neg logit contributions
 stranger-0.016
 lady-0.015
 driver-0.015
 answer-0.015
 man-0.015
Token started
Token ID2067
Feature activation+6.20e-01

Pos logit contributions
icol+0.031
text+0.030
ifying+0.030
und+0.030
icer+0.029
Neg logit contributions
 stranger-0.025
 man-0.023
 lady-0.023
 driver-0.023
 wife-0.023
Token to
Token ID284
Feature activation+2.60e-01

Pos logit contributions
 stranger+0.011
 answer+0.011
 bus+0.011
 pastor+0.010
 ambulance+0.010
Neg logit contributions
icol-0.017
text-0.017
und-0.016
ifying-0.016
ra-0.016
Token still
Token ID991
Feature activation+4.27e-01

Pos logit contributions
icol+0.009
text+0.008
und+0.008
bed+0.008
ra+0.008
Neg logit contributions
 stranger-0.010
guard-0.010
 answer-0.010
 ambulance-0.010
 cop-0.010
Token loved
Token ID6151
Feature activation-3.47e-01

Pos logit contributions
 stranger+0.009
guard+0.009
 answer+0.009
 ambulance+0.009
 cop+0.009
Neg logit contributions
icol-0.011
text-0.010
und-0.010
bed-0.009
ra-0.009
Token playing
Token ID2712
Feature activation+6.90e-01

Pos logit contributions
icol+0.009
text+0.008
ifying+0.008
und+0.008
ra+0.008
Neg logit contributions
 stranger-0.008
 ambulance-0.008
guard-0.008
 answer-0.008
 pastor-0.007
Token in
Token ID287
Feature activation-6.85e-01

Pos logit contributions
 stranger+0.018
guard+0.018
 ambulance+0.017
 answer+0.017
 cop+0.017
Neg logit contributions
icol-0.015
text-0.013
und-0.013
bed-0.013
ifying-0.013
Token the
Token ID262
Feature activation-5.78e-02

Pos logit contributions
icol+0.019
text+0.017
ifying+0.016
ra+0.016
und+0.016
Neg logit contributions
 answer-0.015
 stranger-0.015
guard-0.014
 ambulance-0.014
 lady-0.014
Token snow
Token ID6729
Feature activation-1.80e+00

Pos logit contributions
icol+0.002
text+0.001
und+0.001
bed+0.001
ifying+0.001
Neg logit contributions
 stranger-0.001
guard-0.001
 answer-0.001
 seller-0.001
 ambulance-0.001
Token.
Token ID13
Feature activation-3.85e-01

Pos logit contributions
icol+0.052
text+0.050
ifying+0.047
und+0.046
ra+0.046
Neg logit contributions
guard-0.034
 stranger-0.033
 ambulance-0.032
 answer-0.031
 seller-0.031
Token<|endoftext|>
Token ID50256
Feature activation-1.26e+00

Pos logit contributions
icol+0.011
ra+0.011
ifying+0.011
rily+0.011
aults+0.010
Neg logit contributions
 veterinarian-0.007
old-0.006
 doctor-0.006
 stranger-0.006
 lady-0.006
TokenOnce
Token ID7454
Feature activation-6.89e-01

Pos logit contributions
icol+0.041
ra+0.036
ifying+0.035
rily+0.035
bed+0.035
Neg logit contributions
 answer-0.021
 stranger-0.021
guard-0.021
 man-0.019
 lady-0.019
Token upon
Token ID2402
Feature activation-1.62e+00

Pos logit contributions
icol+0.017
text+0.016
und+0.016
bed+0.016
ifying+0.015
Neg logit contributions
 stranger-0.015
guard-0.015
 answer-0.014
 ambulance-0.014
 cop-0.014
Token a
Token ID257
Feature activation-1.66e+00

Pos logit contributions
icol+0.033
text+0.031
und+0.030
bed+0.030
ifying+0.029
Neg logit contributions
 stranger-0.042
guard-0.041
 answer-0.040
 ambulance-0.040
 seller-0.039
Token got
Token ID1392
Feature activation+8.04e-02

Pos logit contributions
icol+0.035
und+0.032
bed+0.032
ifying+0.030
text+0.030
Neg logit contributions
 stranger-0.057
guard-0.057
 answer-0.055
 ambulance-0.054
 pastor-0.054
Token tough
Token ID5802
Feature activation+9.91e-01

Pos logit contributions
 stranger+0.002
guard+0.002
 ambulance+0.002
 cop+0.002
 pastor+0.002
Neg logit contributions
icol-0.002
 P-0.002
ifying-0.002
bed-0.002
text-0.002
Token.
Token ID13
Feature activation-4.55e-01

Pos logit contributions
 stranger+0.021
guard+0.021
 seller+0.019
 answer+0.019
 ambulance+0.019
Neg logit contributions
icol-0.025
und-0.023
text-0.023
bed-0.023
ra-0.022
Token
Token ID198
Feature activation-1.05e+00

Pos logit contributions
icol+0.010
text+0.009
und+0.009
bed+0.009
ra+0.009
Neg logit contributions
 stranger-0.012
guard-0.011
 answer-0.011
 ambulance-0.011
 cop-0.011
Token
Token ID198
Feature activation-9.87e-01

Pos logit contributions
icol+0.026
ifying+0.023
ra+0.022
bed+0.022
text+0.022
Neg logit contributions
 stranger-0.025
guard-0.024
 answer-0.024
 ambulance-0.023
 man-0.022
TokenOne
Token ID3198
Feature activation-6.37e-01

Pos logit contributions
und+0.022
text+0.021
icol+0.020
 P+0.020
ifying+0.019
Neg logit contributions
guard-0.022
 answer-0.022
 stranger-0.022
 ambulance-0.021
 pastor-0.020
Token day
Token ID1110
Feature activation+2.08e-01

Pos logit contributions
 M+0.009
text+0.009
icol+0.008
 P+0.008
ifying+0.008
Neg logit contributions
guard-0.018
 officer-0.018
 ambulance-0.018
 seller-0.018
vern-0.018
Token,
Token ID11
Feature activation-7.73e-01

Pos logit contributions
 stranger+0.005
guard+0.005
 answer+0.005
 ambulance+0.005
 pastor+0.005
Neg logit contributions
icol-0.005
text-0.004
bed-0.004
und-0.004
ra-0.004
Token they
Token ID484
Feature activation+9.52e-01

Pos logit contributions
icol+0.018
text+0.016
und+0.016
bed+0.016
ifying+0.015
Neg logit contributions
 stranger-0.019
guard-0.019
 answer-0.018
 ambulance-0.018
 cop-0.018
Token were
Token ID547
Feature activation+5.17e-01

Pos logit contributions
 stranger+0.022
guard+0.022
 answer+0.021
 ambulance+0.021
 seller+0.020
Neg logit contributions
icol-0.023
text-0.021
und-0.020
bed-0.020
ra-0.020
Token walking
Token ID6155
Feature activation+2.25e-01

Pos logit contributions
guard+0.013
 stranger+0.013
 answer+0.013
 ambulance+0.012
 seller+0.012
Neg logit contributions
icol-0.010
bed-0.010
und-0.010
text-0.010
ra-0.009
Token was
Token ID373
Feature activation+4.32e-01

Pos logit contributions
 stranger+0.000
guard+0.000
 answer+0.000
 ambulance+0.000
 cop+0.000
Neg logit contributions
icol-0.000
text-0.000
und-0.000
bed-0.000
ifying-0.000
Token only
Token ID691
Feature activation+5.55e-01

Pos logit contributions
 answer+0.008
 stranger+0.008
 ambulance+0.008
 cop+0.008
 veterinarian+0.007
Neg logit contributions
icol-0.013
text-0.012
und-0.012
keeping-0.012
ifying-0.011
Token three
Token ID1115
Feature activation+1.95e-02

Pos logit contributions
 stranger+0.013
guard+0.013
 ambulance+0.013
 answer+0.013
 seller+0.012
Neg logit contributions
icol-0.012
text-0.011
und-0.011
bed-0.011
ra-0.011
Token years
Token ID812
Feature activation+1.28e-01

Pos logit contributions
 stranger+0.001
guard+0.001
 ambulance+0.001
 answer+0.001
 cop+0.001
Neg logit contributions
icol-0.000
text-0.000
und-0.000
bed-0.000
ifying-0.000
Token old
Token ID1468
Feature activation+2.93e-02

Pos logit contributions
 communicate+0.002
guard+0.002
 seller+0.002
 stranger+0.002
 answer+0.002
Neg logit contributions
icol-0.004
 M-0.004
text-0.004
 Po-0.004
 P-0.004
Token and
Token ID290
Feature activation-6.37e-01

Pos logit contributions
guard+0.001
 stranger+0.001
 answer+0.001
 ambulance+0.001
 pastor+0.001
Neg logit contributions
icol-0.001
bed-0.001
text-0.001
und-0.001
 P-0.001
Token loved
Token ID6151
Feature activation-5.56e-01

Pos logit contributions
icol+0.017
text+0.016
und+0.016
bed+0.016
ifying+0.015
Neg logit contributions
 stranger-0.013
guard-0.012
 answer-0.012
 ambulance-0.012
 cop-0.012
Token everyone
Token ID2506
Feature activation+2.73e+00

Pos logit contributions
icol+0.014
text+0.013
und+0.013
bed+0.013
ra+0.012
Neg logit contributions
 stranger-0.012
guard-0.012
 answer-0.011
 ambulance-0.011
 seller-0.011
Token.
Token ID13
Feature activation-6.95e-01

Pos logit contributions
 ambulance+0.070
guard+0.069
 stranger+0.068
 cop+0.067
 answer+0.067
Neg logit contributions
bed-0.048
text-0.048
 M-0.047
icol-0.047
und-0.046
Token One
Token ID1881
Feature activation-6.61e-01

Pos logit contributions
icol+0.020
keeping+0.020
ifying+0.019
ra+0.019
und+0.019
Neg logit contributions
 answer-0.013
 stranger-0.011
 veterinarian-0.011
 seller-0.011
 nod-0.011
Token day
Token ID1110
Feature activation+4.72e-01

Pos logit contributions
 M+0.009
text+0.009
icol+0.009
 P+0.008
bed+0.008
Neg logit contributions
guard-0.020
 officer-0.020
 seller-0.019
 ambulance-0.019
 stranger-0.019
Token,
Token ID11
Feature activation-1.21e+00

Pos logit contributions
icol+0.001
text+0.001
bed+0.001
ra+0.001
 P+0.001
Neg logit contributions
guard-0.001
 stranger-0.001
 seller-0.001
 ambulance-0.001
 pastor-0.001
Token he
Token ID339
Feature activation+2.17e-01

Pos logit contributions
icol+0.030
text+0.028
und+0.027
bed+0.027
ra+0.026
Neg logit contributions
 stranger-0.028
 answer-0.027
 ambulance-0.027
guard-0.026
 cop-0.026
Token remembered
Token ID12086
Feature activation-1.77e+00

Pos logit contributions
 stranger+0.005
guard+0.005
 answer+0.005
 ambulance+0.005
 seller+0.004
Neg logit contributions
icol-0.005
text-0.005
und-0.005
bed-0.005
ifying-0.005
Token leaving
Token ID4305
Feature activation+1.29e+00

Pos logit contributions
icol+0.043
text+0.040
und+0.039
bed+0.039
ifying+0.038
Neg logit contributions
 stranger-0.040
guard-0.039
 answer-0.038
 ambulance-0.038
 cop-0.036
Token it
Token ID340
Feature activation+1.79e+00

Pos logit contributions
 stranger+0.030
 answer+0.029
guard+0.029
 ambulance+0.028
 cop+0.027
Neg logit contributions
icol-0.031
text-0.029
und-0.028
bed-0.028
ifying-0.027
Token on
Token ID319
Feature activation-1.78e+00

Pos logit contributions
 stranger+0.042
guard+0.041
 ambulance+0.040
 answer+0.040
 pastor+0.039
Neg logit contributions
icol-0.040
text-0.037
und-0.037
bed-0.037
ifying-0.035
Token top
Token ID1353
Feature activation+1.87e+00

Pos logit contributions
icol+0.048
text+0.045
ifying+0.043
und+0.042
bed+0.042
Neg logit contributions
 stranger-0.039
 answer-0.038
guard-0.036
 ambulance-0.035
 cop-0.035
Token of
Token ID286
Feature activation-9.65e-01

Pos logit contributions
 stranger+0.032
guard+0.032
 answer+0.030
 ambulance+0.030
 cop+0.028
Neg logit contributions
icol-0.056
text-0.052
bed-0.051
und-0.051
ra-0.050
Token his
Token ID465
Feature activation-2.14e+00

Pos logit contributions
icol+0.023
text+0.022
und+0.021
ifying+0.021
bed+0.021
Neg logit contributions
 stranger-0.023
 answer-0.022
guard-0.022
 ambulance-0.021
 cop-0.021
Token dress
Token ID6576
Feature activation-2.11e-01

Pos logit contributions
icol+0.056
text+0.053
und+0.052
bed+0.051
ra+0.051
Neg logit contributions
 stranger-0.043
guard-0.042
 answer-0.040
 ambulance-0.040
 seller-0.039
Tokener
Token ID263
Feature activation+6.59e-01

Pos logit contributions
icol+0.005
text+0.005
und+0.005
bed+0.005
ra+0.005
Neg logit contributions
 stranger-0.005
guard-0.004
 answer-0.004
 ambulance-0.004
 seller-0.004
Token nice
Token ID3621
Feature activation-1.00e+00

Pos logit contributions
icol+0.050
text+0.047
und+0.047
bed+0.046
ra+0.045
Neg logit contributions
 stranger-0.030
guard-0.029
 answer-0.029
 ambulance-0.028
 cop-0.027
Token.
Token ID13
Feature activation-7.89e-01

Pos logit contributions
icol+0.023
text+0.021
und+0.021
bed+0.021
ifying+0.020
Neg logit contributions
 stranger-0.024
guard-0.024
 answer-0.023
 ambulance-0.023
 seller-0.022
Token Like
Token ID4525
Feature activation+7.31e-01

Pos logit contributions
icol+0.019
und+0.018
keeping+0.018
text+0.017
zed+0.017
Neg logit contributions
 veterinarian-0.018
 answer-0.017
 driver-0.016
 ambulance-0.016
 cop-0.016
Token our
Token ID674
Feature activation+3.03e-01

Pos logit contributions
 stranger+0.018
guard+0.018
 answer+0.017
 ambulance+0.017
 cop+0.017
Neg logit contributions
icol-0.016
text-0.015
und-0.015
bed-0.014
ifying-0.014
Token t
Token ID256
Feature activation+4.26e-01

Pos logit contributions
 stranger+0.006
 answer+0.006
 ambulance+0.005
guard+0.005
 lady+0.005
Neg logit contributions
icol-0.009
und-0.009
text-0.008
ifying-0.008
bed-0.008
Tokeneddy
Token ID21874
Feature activation-4.08e+00

Pos logit contributions
guard+0.010
 stranger+0.010
 ambulance+0.010
 answer+0.010
 seller+0.009
Neg logit contributions
icol-0.010
text-0.009
und-0.009
bed-0.009
ifying-0.008
Token bears
Token ID13062
Feature activation-1.14e+00

Pos logit contributions
icol+0.143
und+0.136
ifying+0.134
text+0.133
 Neither+0.126
Neg logit contributions
 ambulance-0.051
 stranger-0.050
 answer-0.049
 owl-0.049
guard-0.048
Token."
Token ID526
Feature activation+1.17e+00

Pos logit contributions
icol+0.022
und+0.021
text+0.021
bed+0.020
ra+0.020
Neg logit contributions
 stranger-0.030
guard-0.029
 ambulance-0.028
 answer-0.028
 pastor-0.028
Token
Token ID198
Feature activation+6.66e-01

Pos logit contributions
 lady+0.022
 stranger+0.022
 man+0.022
 doctor+0.022
 veterinarian+0.021
Neg logit contributions
icol-0.035
ifying-0.033
ra-0.033
und-0.031
text-0.031
Token
Token ID198
Feature activation+7.56e-01

Pos logit contributions
 stranger+0.016
guard+0.015
 answer+0.015
 ambulance+0.015
 man+0.015
Neg logit contributions
icol-0.016
ifying-0.014
ra-0.014
und-0.014
bed-0.014
TokenL
Token ID43
Feature activation-6.48e-01

Pos logit contributions
 stranger+0.019
guard+0.019
 ambulance+0.018
 cop+0.018
 answer+0.018
Neg logit contributions
icol-0.017
text-0.015
und-0.015
bed-0.015
ifying-0.015
Token garden
Token ID11376
Feature activation-9.07e-01

Pos logit contributions
icol+0.038
text+0.035
bed+0.035
und+0.035
ra+0.034
Neg logit contributions
 stranger-0.025
 answer-0.023
guard-0.023
 ambulance-0.023
 cop-0.022
Token that
Token ID326
Feature activation-1.28e+00

Pos logit contributions
icol+0.026
text+0.024
und+0.023
bed+0.023
ra+0.022
Neg logit contributions
guard-0.017
 stranger-0.017
 answer-0.017
 ambulance-0.017
 seller-0.016
Token are
Token ID389
Feature activation-1.39e+00

Pos logit contributions
icol+0.034
text+0.031
und+0.031
bed+0.031
ra+0.030
Neg logit contributions
 stranger-0.026
guard-0.025
 answer-0.025
 ambulance-0.025
 cop-0.024
Token not
Token ID407
Feature activation-2.65e+00

Pos logit contributions
icol+0.033
und+0.030
text+0.030
bed+0.030
ifying+0.030
Neg logit contributions
 stranger-0.031
guard-0.031
 ambulance-0.029
 answer-0.029
 pastor-0.029
Token good
Token ID922
Feature activation-1.70e+00

Pos logit contributions
icol+0.059
und+0.056
text+0.056
bed+0.055
ra+0.054
Neg logit contributions
 stranger-0.061
guard-0.061
 ambulance-0.057
 pastor-0.057
 seller-0.056
Token to
Token ID284
Feature activation-3.87e+00

Pos logit contributions
icol+0.044
text+0.041
und+0.041
bed+0.040
ifying+0.039
Neg logit contributions
 stranger-0.036
guard-0.035
 answer-0.034
 ambulance-0.034
 cop-0.032
Token eat
Token ID4483
Feature activation-2.77e+00

Pos logit contributions
und+0.088
ra+0.087
 P+0.086
 M+0.085
bed+0.085
Neg logit contributions
guard-0.085
 stranger-0.081
 pastor-0.079
 ambulance-0.077
 firefighter-0.077
Token.
Token ID13
Feature activation-1.46e+00

Pos logit contributions
icol+0.079
text+0.071
und+0.070
bed+0.069
ifying+0.068
Neg logit contributions
 stranger-0.053
guard-0.053
 answer-0.052
 ambulance-0.051
 seller-0.048
Token They
Token ID1119
Feature activation-7.51e-01

Pos logit contributions
icol+0.032
bed+0.029
rily+0.028
und+0.028
text+0.027
Neg logit contributions
 ambulance-0.035
 veterinarian-0.035
 answer-0.035
 cop-0.035
 doctor-0.034
Token can
Token ID460
Feature activation-1.07e-01

Pos logit contributions
ra+0.016
text+0.016
und+0.016
icol+0.015
bed+0.015
Neg logit contributions
 stranger-0.018
guard-0.017
 cop-0.017
 ambulance-0.017
 pastor-0.017
Token make
Token ID787
Feature activation-3.34e-01

Pos logit contributions
ra+0.002
bed+0.002
und+0.002
 P+0.002
 M+0.002
Neg logit contributions
 stranger-0.003
old-0.002
 cop-0.002
guard-0.002
 gentleman-0.002
Token Lily
Token ID20037
Feature activation-1.17e+00

Pos logit contributions
icol+0.012
text+0.011
und+0.011
keeping+0.011
ifying+0.010
Neg logit contributions
 stranger-0.009
 lady-0.009
 cop-0.009
 man-0.009
 doctor-0.008
Token noticed
Token ID6810
Feature activation+8.02e-01

Pos logit contributions
icol+0.031
ifying+0.030
und+0.028
text+0.027
rily+0.027
Neg logit contributions
 stranger-0.024
 ambulance-0.023
 man-0.023
guard-0.023
 answer-0.023
Token that
Token ID326
Feature activation+6.70e-01

Pos logit contributions
 stranger+0.020
guard+0.020
 answer+0.020
 seller+0.019
 ambulance+0.019
Neg logit contributions
icol-0.017
bed-0.016
und-0.016
text-0.015
 P-0.015
Token her
Token ID607
Feature activation-2.23e-01

Pos logit contributions
 stranger+0.015
guard+0.015
 answer+0.014
 ambulance+0.014
 seller+0.014
Neg logit contributions
icol-0.016
text-0.015
und-0.015
bed-0.015
ifying-0.015
Token t
Token ID256
Feature activation+3.53e-01

Pos logit contributions
icol+0.007
und+0.007
text+0.007
bed+0.006
ifying+0.006
Neg logit contributions
 stranger-0.004
 answer-0.003
 ambulance-0.003
guard-0.003
 driver-0.003
Tokeneddy
Token ID21874
Feature activation-3.82e+00

Pos logit contributions
 stranger+0.009
guard+0.009
 answer+0.009
 ambulance+0.009
 cop+0.008
Neg logit contributions
icol-0.007
text-0.007
und-0.007
bed-0.007
ra-0.006
Token bear
Token ID6842
Feature activation-2.33e-01

Pos logit contributions
icol+0.141
und+0.134
ifying+0.134
text+0.133
keeping+0.127
Neg logit contributions
 owl-0.042
 answer-0.040
 stranger-0.040
 ambulance-0.039
 driver-0.038
Token's
Token ID338
Feature activation-1.70e+00

Pos logit contributions
icol+0.004
text+0.004
bed+0.004
und+0.004
ra+0.004
Neg logit contributions
 stranger-0.007
guard-0.006
 answer-0.006
 ambulance-0.006
 pastor-0.006
Token arm
Token ID3211
Feature activation-1.28e+00

Pos logit contributions
icol+0.046
text+0.044
und+0.044
bed+0.041
ifying+0.040
Neg logit contributions
 stranger-0.034
 answer-0.033
 ambulance-0.032
guard-0.032
 driver-0.031
Token was
Token ID373
Feature activation-1.01e+00

Pos logit contributions
icol+0.028
text+0.026
und+0.025
bed+0.025
ra+0.024
Neg logit contributions
 stranger-0.032
guard-0.031
 answer-0.031
 ambulance-0.030
 cop-0.029
Token torn
Token ID12445
Feature activation-9.84e-01

Pos logit contributions
icol+0.025
text+0.023
und+0.023
bed+0.023
ifying+0.023
Neg logit contributions
 stranger-0.022
guard-0.022
 answer-0.021
 ambulance-0.021
 pastor-0.020
Token.
Token ID13
Feature activation+7.15e-01

Pos logit contributions
 stranger+0.029
guard+0.028
 answer+0.028
 ambulance+0.027
 cop+0.026
Neg logit contributions
icol-0.034
text-0.031
und-0.031
bed-0.031
ra-0.030
Token They
Token ID1119
Feature activation+1.09e+00

Pos logit contributions
 driver+0.014
 cop+0.013
 lady+0.013
 answer+0.013
 man+0.013
Neg logit contributions
icol-0.019
text-0.019
ra-0.018
bed-0.018
keeping-0.018
Token found
Token ID1043
Feature activation+8.21e-01

Pos logit contributions
 stranger+0.028
guard+0.027
 ambulance+0.026
 answer+0.026
 cop+0.025
Neg logit contributions
icol-0.023
text-0.021
und-0.021
bed-0.021
ra-0.021
Token his
Token ID465
Feature activation-6.52e-01

Pos logit contributions
 stranger+0.018
guard+0.017
 ambulance+0.017
 answer+0.017
 seller+0.016
Neg logit contributions
icol-0.021
text-0.019
und-0.019
ifying-0.019
bed-0.019
Token t
Token ID256
Feature activation-1.96e-01

Pos logit contributions
icol+0.017
text+0.016
bed+0.016
und+0.015
ra+0.015
Neg logit contributions
 stranger-0.013
guard-0.013
 answer-0.012
 ambulance-0.012
 pastor-0.012
Tokeneddy
Token ID21874
Feature activation-4.10e+00

Pos logit contributions
icol+0.004
text+0.003
bed+0.003
und+0.003
ra+0.003
Neg logit contributions
 stranger-0.006
 answer-0.005
guard-0.005
 ambulance-0.005
 seller-0.005
Token bear
Token ID6842
Feature activation-7.65e-01

Pos logit contributions
icol+0.135
text+0.127
und+0.126
ifying+0.123
bed+0.121
Neg logit contributions
 stranger-0.058
guard-0.056
 ambulance-0.055
 answer-0.055
 cop-0.052
Token on
Token ID319
Feature activation-1.26e+00

Pos logit contributions
icol+0.017
text+0.016
und+0.016
bed+0.016
ra+0.015
Neg logit contributions
 stranger-0.019
guard-0.018
 answer-0.018
 ambulance-0.018
 pastor-0.017
Token his
Token ID465
Feature activation-3.01e+00

Pos logit contributions
icol+0.032
text+0.030
und+0.028
ifying+0.028
bed+0.027
Neg logit contributions
 stranger-0.028
 answer-0.028
 cop-0.027
guard-0.027
 lady-0.027
Token bed
Token ID3996
Feature activation-1.33e+00

Pos logit contributions
icol+0.067
text+0.063
bed+0.062
und+0.059
ra+0.059
Neg logit contributions
guard-0.073
 stranger-0.073
 answer-0.069
 ambulance-0.068
 pastor-0.067
Token.
Token ID13
Feature activation+5.12e-01

Pos logit contributions
icol+0.040
text+0.038
und+0.036
ifying+0.035
ra+0.035
Neg logit contributions
 stranger-0.023
guard-0.023
 answer-0.023
 ambulance-0.022
 driver-0.022
Token She
Token ID1375
Feature activation+1.65e-01

Pos logit contributions
icol+0.010
ra+0.010
text+0.009
bed+0.009
und+0.009
Neg logit contributions
 stranger-0.008
 veterinarian-0.008
 cop-0.008
 driver-0.008
 ambulance-0.008
Token could
Token ID714
Feature activation-1.42e-01

Pos logit contributions
 stranger+0.004
guard+0.003
 answer+0.003
 driver+0.003
 seller+0.003
Neg logit contributions
icol-0.004
text-0.004
ifying-0.004
und-0.004
keeping-0.004
Token still
Token ID991
Feature activation+2.35e+00

Pos logit contributions
icol+0.003
bed+0.003
ra+0.003
text+0.003
und+0.003
Neg logit contributions
guard-0.003
 stranger-0.003
 ambulance-0.003
 pastor-0.003
 seller-0.003
Token read
Token ID1100
Feature activation-1.43e+00

Pos logit contributions
guard+0.051
 stranger+0.050
 ambulance+0.047
 pastor+0.046
 seller+0.046
Neg logit contributions
icol-0.058
ra-0.054
text-0.054
bed-0.054
und-0.053
Token her
Token ID607
Feature activation-1.41e+00

Pos logit contributions
icol+0.032
text+0.029
und+0.029
ifying+0.028
ra+0.027
Neg logit contributions
 stranger-0.036
 answer-0.035
guard-0.034
 ambulance-0.034
 cop-0.033
Token diary
Token ID26339
Feature activation-4.27e+00

Pos logit contributions
icol+0.041
und+0.039
text+0.039
bed+0.038
ifying+0.037
Neg logit contributions
 stranger-0.025
 answer-0.024
guard-0.024
 ambulance-0.024
 cop-0.023
Token while
Token ID981
Feature activation+7.12e-02

Pos logit contributions
icol+0.103
text+0.095
und+0.093
ifying+0.091
bed+0.089
Neg logit contributions
 stranger-0.099
 answer-0.097
guard-0.095
 ambulance-0.093
 seller-0.091
Token she
Token ID673
Feature activation-1.20e+00

Pos logit contributions
 stranger+0.001
 answer+0.001
guard+0.001
 ambulance+0.001
 cop+0.001
Neg logit contributions
icol-0.002
text-0.002
und-0.002
ifying-0.002
ra-0.002
Token was
Token ID373
Feature activation-8.81e-01

Pos logit contributions
icol+0.029
text+0.026
und+0.026
ifying+0.025
bed+0.025
Neg logit contributions
 stranger-0.027
guard-0.027
 answer-0.027
 ambulance-0.026
 seller-0.026
Token playing
Token ID2712
Feature activation+1.37e-01

Pos logit contributions
icol+0.023
text+0.021
und+0.020
bed+0.020
ifying+0.020
Neg logit contributions
 stranger-0.019
 answer-0.019
 ambulance-0.018
guard-0.018
 cop-0.017
Token!
Token ID0
Feature activation+1.70e-01

Pos logit contributions
 stranger+0.004
guard+0.004
 answer+0.004
 ambulance+0.004
 cop+0.003
Neg logit contributions
icol-0.003
text-0.003
und-0.002
bed-0.002
ifying-0.002