TOP ACTIVATIONS
MAX = 0.117

night . They both felt a sense of pride and pleasure
back the village to its former state . With the help
. Jan ie shared her creations with all her family and
. They all shared a big piece of the enormous cake
things better . She felt a great sense of accomplishment for
the sweater . Alec felt a warm feeling in his heart
M agg ie hugged Mom my and said , " I
drank the soup and felt a little better . She sn
S oph ie felt a bit bitter , but was
walk . She still felt a bit w obb ly ,
on it . Sally felt a little better , but she
M um and dad hugged Maria and they all went home
calendar soon ." Amy felt a little better after hearing these
happy ! She hugged her mum and together they went to
was shining . Lucy felt a bit scared but with Jake
her jam . She felt a great sense of accomplishment so
sweet treat and shared a big , happy hug . <|endoftext|>
it . After they finished cleaning the shell , they celebrated
of you !" Ruth felt a special feeling in her heart
yourself . <|endoftext|> Once upon a time , there lived a

MIN ACTIVATIONS
MIN = -0.129

saw a big , red thing on the ground . He
mom and saw a terrible thing . A man had a
One day a very unusual thing happened . He pulled on
saw a strange and unknown thing . She asked her mum
One day , a strange thing happened . A small boy
a man with a real gun . He is angry .
, they see a new thing in the park . It
small , round , green thing on the ground . It
day , a shiny new vehicle came to town . Everyone
Then , a bad thing happened . A big car
day , a new green truck came to play . The
, they see a new thing in the gym . It
, they see a real train outside . It is big
they see a big red thing on a green plant .
when he saw a big vehicle coming down the road .
named Tim found a toy motorcycle . It was red and
they see a big white thing on the grass . It
, they notice a big truck on the road . The
saw a long , green thing in the water . He
bus met a small green car . The car liked to

INTERVAL 0.055 - 0.117
CONTAINS 1.024%

on her way with a big smile on her face .
's important to respect them ." M illy smiled
found . Her mom hugged her and praised her for behaving
Aunt . She felt dizz y with happiness to know that
. Tom and Sam are happy . They mail

INTERVAL -0.006 - 0.055
CONTAINS 59.922%

juice maker . It is shiny and colorful . It has
like Tommy and a c udd ly friend like her t
and smiled . " Ok Mum ,
Jane : " Look Mike , we did it
, Tommy . I didn 't mean to scare you .

INTERVAL -0.068 - -0.006
CONTAINS 38.697%

. " Do you want to play with me in the
the boat too close to the cliff . They were so
. They did not mind getting filthy . They had fun
named Lily who loved to play with water . She liked
family . They lived in a big house and loved to

INTERVAL -0.129 - -0.068
CONTAINS 0.357%

, they found a new box of blocks in the play
. We could get hurt or lost ." Tim
hit a tree . The ball broke into many pieces and
didn 't see the big hole in the ground . He
was a long , green snake named Sam . Sam lived
Token night
Token ID1755
Feature activation+5.20e-03

Pos logit contributions
 parked+0.002
 driving+0.002
 smoking+0.002
 stole+0.002
 disturbed+0.002
Neg logit contributions
 satisfaction-0.002
 pleasure-0.002
 accomplishment-0.002
 delight-0.002
 fascination-0.002
Token.
Token ID13
Feature activation+3.74e-02

Pos logit contributions
 satisfaction+0.012
 pleasure+0.012
 accomplishment+0.011
 delight+0.011
 fascination+0.011
Neg logit contributions
 parked-0.007
 driving-0.006
 smoking-0.006
 stole-0.006
 disturbed-0.006
Token They
Token ID1119
Feature activation+4.51e-02

Pos logit contributions
 accomplishment+0.079
 satisfaction+0.078
 fascination+0.075
 pleasure+0.075
 decision+0.075
Neg logit contributions
 parked-0.048
 driving-0.045
 smoking-0.044
 stole-0.043
 CA-0.040
Token both
Token ID1111
Feature activation+4.34e-02

Pos logit contributions
 pleasure+0.107
 accomplishment+0.105
 satisfaction+0.104
 decision+0.099
 accomplishments+0.099
Neg logit contributions
 parked-0.056
 stole-0.051
 knocked-0.047
 broke-0.047
 stuck-0.046
Token felt
Token ID2936
Feature activation+5.82e-02

Pos logit contributions
 accomplishment+0.097
 pleasure+0.096
 satisfaction+0.093
 accomplishments+0.090
 fascination+0.090
Neg logit contributions
 parked-0.057
 stole-0.054
 broke-0.051
 knocked-0.051
 disturbed-0.049
Token a
Token ID257
Feature activation+1.17e-01

Pos logit contributions
aman+0.128
 accomplishment+0.124
 fascination+0.122
 accomplishments+0.121
oph+0.119
Neg logit contributions
 parked-0.071
 disturbed-0.066
 stuck-0.064
 stole-0.064
 hurt-0.063
Token sense
Token ID2565
Feature activation+3.04e-02

Pos logit contributions
 deeds+0.238
annah+0.227
ishment+0.227
ently+0.224
heart+0.223
Neg logit contributions
 hurt-0.140
 knocked-0.128
 parked-0.127
 wrecked-0.125
 broke-0.121
Token of
Token ID286
Feature activation+5.34e-02

Pos logit contributions
 satisfaction+0.081
 pleasure+0.081
 fascination+0.079
 accomplishment+0.078
 delight+0.076
Neg logit contributions
 parked-0.025
 driving-0.024
 smoking-0.023
 disturbed-0.021
 stole-0.021
Token pride
Token ID11293
Feature activation+3.38e-02

Pos logit contributions
oph+0.058
aman+0.054
aire+0.051
heart+0.047
ya+0.045
Neg logit contributions
 driving-0.123
 parked-0.123
 stuck-0.118
 stole-0.116
 thrown-0.115
Token and
Token ID290
Feature activation+6.25e-02

Pos logit contributions
 pleasure+0.065
 satisfaction+0.064
 fascination+0.062
 accomplishment+0.062
 delight+0.060
Neg logit contributions
 parked-0.054
 driving-0.053
 smoking-0.051
 stole-0.049
 broke-0.049
Token pleasure
Token ID9476
Feature activation+2.22e-02

Pos logit contributions
aman+0.114
oph+0.105
 pleasure+0.105
 difference+0.103
 fascination+0.102
Neg logit contributions
 parked-0.096
 stole-0.093
 driving-0.090
 broke-0.090
 disturbed-0.090
Token back
Token ID736
Feature activation+3.04e-02

Pos logit contributions
 Deal+0.172
 fascination+0.170
aman+0.166
 accomplishment+0.166
 grin+0.165
Neg logit contributions
 guns-0.068
 smoking-0.067
 disturbed-0.066
 driving-0.065
 stuck-0.064
Token the
Token ID262
Feature activation+3.18e-02

Pos logit contributions
 satisfaction+0.073
 accomplishment+0.073
 pleasure+0.073
 fascination+0.071
 delight+0.069
Neg logit contributions
 parked-0.033
 driving-0.032
 stole-0.030
 smoking-0.030
 disturbed-0.029
Token village
Token ID7404
Feature activation+6.01e-02

Pos logit contributions
 satisfaction+0.062
 fascination+0.061
 pleasure+0.060
 accomplishment+0.059
 delight+0.059
Neg logit contributions
 parked-0.048
 stole-0.047
 smoking-0.046
 disturbed-0.045
 knocked-0.045
Token to
Token ID284
Feature activation+3.22e-02

Pos logit contributions
 satisfaction+0.133
 pleasure+0.130
 fascination+0.129
 accomplishment+0.127
 delight+0.124
Neg logit contributions
 parked-0.076
 driving-0.074
 stole-0.068
 stuck-0.068
 smoking-0.067
Token its
Token ID663
Feature activation+2.69e-02

Pos logit contributions
 pleasure+0.071
 satisfaction+0.071
 accomplishment+0.070
 fascination+0.069
 delight+0.067
Neg logit contributions
 driving-0.038
 smoking-0.038
 parked-0.037
 stole-0.033
 disturbed-0.033
Token former
Token ID1966
Feature activation+1.13e-01

Pos logit contributions
 satisfaction+0.045
 pleasure+0.044
 accomplishment+0.043
 fascination+0.042
 delight+0.041
Neg logit contributions
 parked-0.048
 driving-0.046
 smoking-0.045
 stole-0.044
 stuck-0.043
Token state
Token ID1181
Feature activation+5.09e-02

Pos logit contributions
 fascination+0.154
 pleasure+0.148
oph+0.141
 satisfaction+0.141
 goodness+0.136
Neg logit contributions
 driving-0.187
 parked-0.180
 guarded-0.180
 driven-0.176
 smoking-0.175
Token.
Token ID13
Feature activation+3.60e-02

Pos logit contributions
 satisfaction+0.124
 pleasure+0.123
 accomplishment+0.120
 fascination+0.119
 delight+0.117
Neg logit contributions
 parked-0.055
 driving-0.052
 smoking-0.049
 blocking-0.044
 disturbed-0.043
Token With
Token ID2080
Feature activation+4.91e-02

Pos logit contributions
 accomplishment+0.084
 satisfaction+0.082
 fascination+0.079
 pleasure+0.078
 decision+0.078
Neg logit contributions
 parked-0.041
 driving-0.039
 smoking-0.038
 stole-0.035
 CA-0.034
Token the
Token ID262
Feature activation+4.29e-02

Pos logit contributions
 pleasure+0.108
 decision+0.107
 difference+0.107
 satisfaction+0.106
 accomplishment+0.106
Neg logit contributions
 parked-0.054
 driving-0.050
 smoking-0.048
 stuck-0.048
 knocked-0.046
Token help
Token ID1037
Feature activation+4.24e-02

Pos logit contributions
 pleasure+0.083
 fascination+0.083
 satisfaction+0.081
oph+0.080
aman+0.079
Neg logit contributions
 smoking-0.058
 hurt-0.058
 stole-0.057
 broke-0.055
 stuck-0.055
Token.
Token ID13
Feature activation+2.72e-02

Pos logit contributions
 satisfaction+0.055
 pleasure+0.054
 accomplishment+0.053
 fascination+0.052
 delight+0.052
Neg logit contributions
 parked-0.027
 driving-0.025
 smoking-0.023
 stole-0.023
 broke-0.022
Token Jan
Token ID2365
Feature activation+5.32e-02

Pos logit contributions
 satisfaction+0.056
 accomplishment+0.056
 pleasure+0.054
 fascination+0.052
 delight+0.052
Neg logit contributions
 parked-0.038
 driving-0.036
 smoking-0.035
 stole-0.034
 knocked-0.033
Tokenie
Token ID494
Feature activation+6.58e-02

Pos logit contributions
 pleasure+0.126
 satisfaction+0.123
 accomplishment+0.116
 fascination+0.115
 delight+0.115
Neg logit contributions
 parked-0.067
 stole-0.061
 broke-0.060
 disturbed-0.060
 stuck-0.058
Token shared
Token ID4888
Feature activation+4.56e-02

Pos logit contributions
 pleasure+0.170
 satisfaction+0.167
 accomplishment+0.163
 fascination+0.160
 relief+0.156
Neg logit contributions
 parked-0.067
 knocked-0.059
 broke-0.059
 stole-0.058
 driving-0.057
Token her
Token ID607
Feature activation+5.49e-02

Pos logit contributions
 pleasure+0.109
 satisfaction+0.106
 accomplishment+0.104
 fascination+0.101
 difference+0.098
Neg logit contributions
 parked-0.050
 driving-0.048
 smoking-0.045
 stole-0.044
 broke-0.043
Token creations
Token ID28443
Feature activation+1.13e-01

Pos logit contributions
 fascination+0.108
 satisfaction+0.106
oph+0.104
aman+0.104
heart+0.100
Neg logit contributions
 parked-0.073
 driving-0.072
 broke-0.069
 stuck-0.067
 hurt-0.067
Token with
Token ID351
Feature activation+4.13e-02

Pos logit contributions
 pleasure+0.284
 satisfaction+0.279
 fascination+0.277
 Deal+0.269
 disbelief+0.266
Neg logit contributions
 parked-0.101
 on-0.098
 driving-0.095
 to-0.084
 called-0.083
Token all
Token ID477
Feature activation+5.88e-02

Pos logit contributions
 pleasure+0.090
 accomplishment+0.088
 satisfaction+0.087
 decision+0.086
 fascination+0.085
Neg logit contributions
 driving-0.045
 parked-0.045
 disturbed-0.044
 smoking-0.043
 hurt-0.042
Token her
Token ID607
Feature activation+5.52e-02

Pos logit contributions
 pleasure+0.150
 accomplishment+0.145
 satisfaction+0.144
 fascination+0.141
 Deal+0.141
Neg logit contributions
 parked-0.044
 driving-0.042
 smoking-0.036
 blocking-0.034
 called-0.034
Token family
Token ID1641
Feature activation+6.78e-02

Pos logit contributions
 fascination+0.110
 satisfaction+0.109
aman+0.103
 pleasure+0.103
oph+0.102
Neg logit contributions
 broke-0.071
 hurt-0.069
 cars-0.069
 stuck-0.069
 stole-0.069
Token and
Token ID290
Feature activation+2.85e-02

Pos logit contributions
 satisfaction+0.141
 pleasure+0.139
 fascination+0.138
 accomplishment+0.135
 Deal+0.133
Neg logit contributions
 parked-0.093
 driving-0.091
 stuck-0.082
 smoking-0.080
 sitting-0.078
Token.
Token ID13
Feature activation+3.82e-02

Pos logit contributions
 satisfaction+0.056
 pleasure+0.056
 accomplishment+0.054
 fascination+0.053
 delight+0.052
Neg logit contributions
 parked-0.033
 driving-0.031
 smoking-0.030
 stole-0.029
 knocked-0.028
Token They
Token ID1119
Feature activation+3.05e-02

Pos logit contributions
 satisfaction+0.072
 accomplishment+0.072
 pleasure+0.069
 fascination+0.068
 decision+0.067
Neg logit contributions
 parked-0.057
 driving-0.053
 smoking-0.052
 knocked-0.051
 stole-0.051
Token all
Token ID477
Feature activation+3.05e-02

Pos logit contributions
 pleasure+0.068
 satisfaction+0.066
 accomplishment+0.066
 fascination+0.062
 difference+0.062
Neg logit contributions
 parked-0.043
 stole-0.038
 knocked-0.037
 driving-0.037
 broke-0.036
Token shared
Token ID4888
Feature activation+6.76e-02

Pos logit contributions
 pleasure+0.069
 accomplishment+0.067
 satisfaction+0.066
 fascination+0.063
 accomplishments+0.063
Neg logit contributions
 parked-0.043
 stole-0.038
 knocked-0.037
 broke-0.036
 disturbed-0.036
Token a
Token ID257
Feature activation+7.72e-02

Pos logit contributions
 pleasure+0.168
 accomplishment+0.163
 satisfaction+0.161
 fascination+0.156
 accomplishments+0.155
Neg logit contributions
 parked-0.058
 driving-0.053
 smoking-0.052
 on-0.052
 stole-0.051
Token big
Token ID1263
Feature activation+1.12e-01

Pos logit contributions
ishment+0.149
 pleasure+0.144
 satisfaction+0.142
 accomplishment+0.140
heart+0.140
Neg logit contributions
 parked-0.096
 driving-0.091
 CA-0.089
 stole-0.088
saw-0.088
Token piece
Token ID3704
Feature activation+5.97e-02

Pos logit contributions
ishment+0.204
 former+0.199
 surroundings+0.198
 gut+0.197
esty+0.193
Neg logit contributions
 popped-0.137
 driving-0.132
 hit-0.132
 broke-0.130
 parked-0.124
Token of
Token ID286
Feature activation+5.87e-02

Pos logit contributions
 satisfaction+0.162
 pleasure+0.162
 fascination+0.155
 accomplishment+0.153
 delight+0.149
Neg logit contributions
 parked-0.047
 driving-0.044
 smoking-0.039
 stuck-0.037
 blocking-0.036
Token the
Token ID262
Feature activation+4.48e-02

Pos logit contributions
aire+0.117
 pleasure+0.114
oph+0.111
 fascination+0.111
iper+0.110
Neg logit contributions
 parked-0.070
 driving-0.069
 smoking-0.068
 disturbed-0.062
 stole-0.062
Token enormous
Token ID9812
Feature activation+6.99e-02

Pos logit contributions
 satisfaction+0.080
 fascination+0.079
 accomplishment+0.078
 pleasure+0.078
 delight+0.075
Neg logit contributions
 parked-0.066
 driving-0.065
 smoking-0.064
 disturbed-0.061
 stole-0.061
Token cake
Token ID12187
Feature activation+5.16e-02

Pos logit contributions
 pleasure+0.155
 satisfaction+0.154
 fascination+0.153
 accomplishment+0.150
iper+0.149
Neg logit contributions
 driving-0.068
 smoking-0.063
 broke-0.061
 parked-0.059
 appeared-0.058
Token things
Token ID1243
Feature activation+6.14e-02

Pos logit contributions
 satisfaction+0.094
 pleasure+0.092
 fascination+0.087
 accomplishment+0.087
aire+0.083
Neg logit contributions
 driving-0.053
 parked-0.050
 smoking-0.050
 disturbed-0.048
 stuck-0.046
Token better
Token ID1365
Feature activation+5.09e-02

Pos logit contributions
 satisfaction+0.145
 fascination+0.139
 pleasure+0.139
 accomplishment+0.136
aman+0.132
Neg logit contributions
 driving-0.068
 parked-0.067
 hurt-0.065
 stuck-0.064
 smoking-0.061
Token.
Token ID13
Feature activation+4.87e-02

Pos logit contributions
 fascination+0.124
 pleasure+0.123
 satisfaction+0.123
 accomplishment+0.120
 delight+0.115
Neg logit contributions
 driving-0.050
 parked-0.050
 smoking-0.049
 hurt-0.048
 disturbed-0.047
Token She
Token ID1375
Feature activation+5.11e-02

Pos logit contributions
 accomplishment+0.112
 satisfaction+0.107
 decision+0.105
 fascination+0.104
 accomplishments+0.101
Neg logit contributions
 parked-0.052
 Wh-0.048
 driving-0.048
 CA-0.047
 smoking-0.044
Token felt
Token ID2936
Feature activation+4.95e-02

Pos logit contributions
 pleasure+0.126
 satisfaction+0.123
 accomplishment+0.123
 fascination+0.118
 delight+0.117
Neg logit contributions
 parked-0.062
 stole-0.055
 knocked-0.054
 broke-0.053
 stuck-0.051
Token a
Token ID257
Feature activation+1.11e-01

Pos logit contributions
aman+0.101
 accomplishment+0.100
 fascination+0.099
oph+0.098
 accomplishments+0.097
Neg logit contributions
 parked-0.067
 disturbed-0.061
 stuck-0.061
 hurt-0.060
 driving-0.060
Token great
Token ID1049
Feature activation+6.68e-02

Pos logit contributions
 deeds+0.218
osa+0.214
ishment+0.210
ior+0.206
aman+0.206
Neg logit contributions
 hurt-0.146
 parked-0.129
 wrecked-0.128
 knocked-0.127
 hit-0.123
Token sense
Token ID2565
Feature activation+2.10e-02

Pos logit contributions
hew+0.122
 Deal+0.121
 fascination+0.118
ishment+0.114
heart+0.112
Neg logit contributions
 parked-0.094
 driving-0.092
 knocked-0.092
 hit-0.091
 hurt-0.090
Token of
Token ID286
Feature activation+4.25e-02

Pos logit contributions
 satisfaction+0.054
 pleasure+0.054
 accomplishment+0.053
 fascination+0.052
 delight+0.051
Neg logit contributions
 parked-0.020
 driving-0.019
 smoking-0.018
 stole-0.017
 disturbed-0.017
Token accomplishment
Token ID29340
Feature activation+2.93e-02

Pos logit contributions
oph+0.051
aire+0.049
aman+0.045
 pleasure+0.042
 fascination+0.041
Neg logit contributions
 driving-0.095
 parked-0.095
 blocking-0.088
 stuck-0.087
 hurt-0.087
Token for
Token ID329
Feature activation+3.43e-02

Pos logit contributions
 pleasure+0.061
 satisfaction+0.060
 fascination+0.058
 accomplishment+0.057
 delight+0.057
Neg logit contributions
 parked-0.045
 driving-0.043
 smoking-0.040
 stole-0.039
 broke-0.039
Token the
Token ID262
Feature activation+1.25e-02

Pos logit contributions
 pleasure+0.052
 satisfaction+0.051
 accomplishment+0.050
 fascination+0.048
 difference+0.048
Neg logit contributions
 parked-0.025
 driving-0.024
 smoking-0.023
 knocked-0.021
 stole-0.021
Token sweater
Token ID35685
Feature activation+1.82e-02

Pos logit contributions
 pleasure+0.024
 satisfaction+0.024
 accomplishment+0.024
 fascination+0.022
 delight+0.022
Neg logit contributions
 parked-0.021
 driving-0.019
 smoking-0.019
 stole-0.019
 broke-0.018
Token.
Token ID13
Feature activation+3.87e-02

Pos logit contributions
 satisfaction+0.037
 pleasure+0.037
 accomplishment+0.035
 fascination+0.034
 delight+0.034
Neg logit contributions
 parked-0.029
 driving-0.027
 smoking-0.026
 stole-0.026
 broke-0.025
Token Alec
Token ID38285
Feature activation+5.35e-02

Pos logit contributions
 accomplishment+0.079
 satisfaction+0.077
 pleasure+0.076
 fascination+0.072
 delight+0.071
Neg logit contributions
 parked-0.058
 driving-0.053
 smoking-0.050
 CA-0.050
 stole-0.049
Token felt
Token ID2936
Feature activation+5.35e-02

Pos logit contributions
 pleasure+0.137
 accomplishment+0.131
 satisfaction+0.130
 difference+0.122
 fascination+0.121
Neg logit contributions
 parked-0.072
 stole-0.060
 knocked-0.059
 broke-0.058
 disturbed-0.056
Token a
Token ID257
Feature activation+1.10e-01

Pos logit contributions
 accomplishment+0.111
oph+0.108
 accomplishments+0.108
aman+0.107
 pleasure+0.106
Neg logit contributions
 parked-0.076
 hurt-0.071
 disturbed-0.070
 stuck-0.068
 stole-0.065
Token warm
Token ID5814
Feature activation+8.77e-02

Pos logit contributions
 deeds+0.228
oph+0.215
ishment+0.214
osa+0.212
aman+0.211
Neg logit contributions
 hurt-0.151
 parked-0.136
 knocked-0.134
 broke-0.131
 disturbed-0.128
Token feeling
Token ID4203
Feature activation+3.28e-02

Pos logit contributions
oph+0.172
 pleasure+0.167
 satisfaction+0.167
 accomplishments+0.166
hew+0.165
Neg logit contributions
 parked-0.124
 hurt-0.121
 driving-0.118
 stuck-0.113
 hit-0.112
Token in
Token ID287
Feature activation+4.92e-02

Pos logit contributions
 pleasure+0.082
 satisfaction+0.080
 accomplishment+0.080
 fascination+0.080
 delight+0.076
Neg logit contributions
 parked-0.037
 driving-0.033
 disturbed-0.032
 smoking-0.031
 stole-0.031
Token his
Token ID465
Feature activation+4.98e-02

Pos logit contributions
oph+0.136
 pleasure+0.133
 accomplishment+0.129
 accomplishments+0.129
aire+0.128
Neg logit contributions
 driving-0.031
 parked-0.031
 smoking-0.026
 blocking-0.025
 hurt-0.025
Token heart
Token ID2612
Feature activation+4.28e-02

Pos logit contributions
oph+0.085
 satisfaction+0.080
 accomplishment+0.080
ya+0.078
 pleasure+0.077
Neg logit contributions
 hurt-0.080
 parked-0.078
 driving-0.077
 broke-0.074
 stuck-0.074
TokenM
Token ID44
Feature activation+5.33e-02

Pos logit contributions
 pleasure+0.103
 accomplishment+0.100
 satisfaction+0.094
 difference+0.090
 fascination+0.089
Neg logit contributions
 parked-0.057
 disturbed-0.054
Ro-0.053
 knocked-0.051
 smoking-0.051
Tokenagg
Token ID9460
Feature activation+4.70e-02

Pos logit contributions
 pleasure+0.064
 satisfaction+0.063
 fascination+0.056
 accomplishment+0.056
 difference+0.055
Neg logit contributions
 knocked-0.109
 disturbed-0.107
 parked-0.105
 smoking-0.105
 arose-0.103
Tokenie
Token ID494
Feature activation+6.25e-02

Pos logit contributions
 pleasure+0.102
 satisfaction+0.094
 accomplishment+0.093
 difference+0.088
 fascination+0.086
Neg logit contributions
 parked-0.072
 knocked-0.067
 disturbed-0.065
 stole-0.063
 broke-0.062
Token hugged
Token ID41130
Feature activation+6.11e-02

Pos logit contributions
 pleasure+0.147
 satisfaction+0.136
 accomplishment+0.135
 difference+0.131
 fascination+0.127
Neg logit contributions
 parked-0.090
 knocked-0.082
 broke-0.080
 stole-0.078
 stuck-0.076
Token Mom
Token ID11254
Feature activation+8.47e-02

Pos logit contributions
 pleasure+0.175
 difference+0.165
 satisfaction+0.165
 impact+0.161
 bond+0.159
Neg logit contributions
 smoking-0.036
 knocked-0.035
 to-0.032
 parked-0.032
 on-0.031
Tokenmy
Token ID1820
Feature activation+1.10e-01

Pos logit contributions
 accomplishment+0.247
 satisfaction+0.241
 pleasure+0.236
 accomplishments+0.231
 delight+0.225
Neg logit contributions
o-0.071
 parked-0.066
 smoking-0.051
 on-0.050
 called-0.048
Token and
Token ID290
Feature activation+5.20e-02

Pos logit contributions
 pleasure+0.261
 satisfaction+0.255
agg+0.254
 fascination+0.245
 accomplishment+0.245
Neg logit contributions
 on-0.124
 parked-0.116
 to-0.109
 in-0.100
 stuck-0.100
Token said
Token ID531
Feature activation+4.22e-02

Pos logit contributions
 pleasure+0.126
 difference+0.123
 accomplishment+0.120
 accomplishments+0.120
 decision+0.119
Neg logit contributions
 parked-0.063
 knocked-0.061
 broke-0.054
 stole-0.053
 rang-0.052
Token,
Token ID11
Feature activation+4.30e-02

Pos logit contributions
 satisfaction+0.081
 pleasure+0.080
 fascination+0.079
 accomplishment+0.076
 difference+0.075
Neg logit contributions
 driving-0.063
 broke-0.062
 smoking-0.060
 parked-0.060
 knocked-0.059
Token "
Token ID366
Feature activation+5.21e-02

Pos logit contributions
 satisfaction+0.120
 pleasure+0.118
 accomplishment+0.117
 fascination+0.114
 difference+0.114
Neg logit contributions
 parked-0.036
 driving-0.035
 smoking-0.031
 knocked-0.029
 blocking-0.025
TokenI
Token ID40
Feature activation+5.78e-02

Pos logit contributions
 accomplishment+0.126
 decision+0.121
 difference+0.118
 pleasure+0.113
 accomplishments+0.112
Neg logit contributions
 parked-0.059
 disturbed-0.050
 smoking-0.048
Ro-0.048
I-0.047
Token drank
Token ID24070
Feature activation+5.32e-02

Pos logit contributions
 pleasure+0.158
 satisfaction+0.145
 accomplishment+0.145
 fascination+0.137
 difference+0.137
Neg logit contributions
 parked-0.081
 broke-0.074
 knocked-0.073
 stole-0.071
 stuck-0.066
Token the
Token ID262
Feature activation+3.59e-02

Pos logit contributions
 pleasure+0.146
 accomplishments+0.136
 accomplishment+0.136
 fascination+0.136
 satisfaction+0.136
Neg logit contributions
 broke-0.044
 smoking-0.044
 hurt-0.042
 stuck-0.040
 parked-0.039
Token soup
Token ID17141
Feature activation+7.98e-02

Pos logit contributions
 pleasure+0.085
 satisfaction+0.085
 accomplishment+0.083
 fascination+0.081
 delight+0.080
Neg logit contributions
 smoking-0.040
 disturbed-0.038
 parked-0.037
 broke-0.036
 hurt-0.036
Token and
Token ID290
Feature activation+3.02e-02

Pos logit contributions
 pleasure+0.177
 satisfaction+0.169
 fascination+0.169
 accomplishment+0.162
 amaz+0.158
Neg logit contributions
 broke-0.105
 parked-0.104
 on-0.103
 stuck-0.102
 smoking-0.101
Token felt
Token ID2936
Feature activation+3.90e-02

Pos logit contributions
 pleasure+0.072
 satisfaction+0.068
 accomplishment+0.067
 fascination+0.065
 delight+0.064
Neg logit contributions
 parked-0.039
 stole-0.036
 broke-0.035
 smoking-0.035
 knocked-0.035
Token a
Token ID257
Feature activation+1.09e-01

Pos logit contributions
aman+0.074
 accomplishment+0.073
 pleasure+0.073
 fascination+0.072
oph+0.071
Neg logit contributions
 parked-0.057
 hurt-0.055
 disturbed-0.055
 smoking-0.054
 stuck-0.053
Token little
Token ID1310
Feature activation+5.54e-02

Pos logit contributions
aman+0.218
 deeds+0.212
ya+0.211
annah+0.210
iper+0.208
Neg logit contributions
 hurt-0.155
 broke-0.138
 disturbed-0.130
 wrecked-0.124
 hit-0.124
Token better
Token ID1365
Feature activation+2.05e-02

Pos logit contributions
aman+0.119
ya+0.114
 accomplishment+0.114
oph+0.113
 accomplishments+0.112
Neg logit contributions
 hurt-0.077
 stuck-0.072
 broke-0.071
 disturbed-0.069
 stole-0.064
Token.
Token ID13
Feature activation+3.60e-02

Pos logit contributions
 pleasure+0.047
 satisfaction+0.045
 accomplishment+0.045
 fascination+0.044
 delight+0.043
Neg logit contributions
 parked-0.026
 smoking-0.024
 driving-0.024
 stole-0.023
 disturbed-0.023
Token She
Token ID1375
Feature activation+3.24e-02

Pos logit contributions
 accomplishment+0.086
 satisfaction+0.084
 fascination+0.081
 pleasure+0.081
 decision+0.080
Neg logit contributions
 parked-0.038
 broke-0.035
 driving-0.034
 stole-0.034
 knocked-0.034
Token sn
Token ID3013
Feature activation+3.15e-02

Pos logit contributions
 pleasure+0.077
 accomplishment+0.074
 satisfaction+0.073
 fascination+0.071
 accomplishments+0.069
Neg logit contributions
 parked-0.044
 stole-0.040
 knocked-0.040
 broke-0.039
 disturbed-0.038
Token\n
Token ID198
Feature activation+4.08e-02

Pos logit contributions
 pleasure+0.066
 satisfaction+0.065
 accomplishment+0.064
 fascination+0.063
 delight+0.060
Neg logit contributions
 parked-0.031
 stole-0.027
 smoking-0.026
 driving-0.026
 broke-0.025
TokenS
Token ID50
Feature activation+4.59e-02

Pos logit contributions
 pleasure+0.096
 accomplishment+0.091
 fascination+0.091
 satisfaction+0.088
 amaz+0.084
Neg logit contributions
 parked-0.052
 stole-0.046
saw-0.043
 broke-0.043
 disturbed-0.042
Tokenoph
Token ID2522
Feature activation+5.58e-02

Pos logit contributions
 pleasure+0.048
 fascination+0.046
 satisfaction+0.044
 accomplishment+0.043
 delight+0.042
Neg logit contributions
 stole-0.113
 parked-0.112
 broke-0.108
 disturbed-0.106
 CA-0.104
Tokenie
Token ID494
Feature activation+5.90e-02

Pos logit contributions
 pleasure+0.136
 fascination+0.122
 satisfaction+0.121
 accomplishment+0.121
 amaz+0.116
Neg logit contributions
 parked-0.070
 stole-0.064
 broke-0.063
 disturbed-0.060
 called-0.059
Token felt
Token ID2936
Feature activation+4.50e-02

Pos logit contributions
 pleasure+0.137
 satisfaction+0.125
 accomplishment+0.124
 fascination+0.121
 difference+0.118
Neg logit contributions
 parked-0.086
 stole-0.079
 broke-0.079
 knocked-0.078
 stuck-0.076
Token a
Token ID257
Feature activation+1.08e-01

Pos logit contributions
 pleasure+0.096
 accomplishment+0.095
aman+0.093
oph+0.092
 accomplishments+0.091
Neg logit contributions
 parked-0.058
 hurt-0.058
 disturbed-0.057
 stuck-0.056
 stole-0.055
Token bit
Token ID1643
Feature activation+4.94e-02

Pos logit contributions
 pleasure+0.235
aman+0.233
ishment+0.228
osa+0.227
oph+0.225
Neg logit contributions
 hurt-0.144
 broke-0.123
 knocked-0.115
 disturbed-0.114
 hit-0.111
Token bitter
Token ID12922
Feature activation+5.57e-02

Pos logit contributions
 pleasure+0.111
 accomplishment+0.106
 satisfaction+0.106
 fascination+0.105
oph+0.102
Neg logit contributions
 hurt-0.062
 disturbed-0.062
 stuck-0.061
 broke-0.058
 stole-0.057
Token,
Token ID11
Feature activation+7.36e-02

Pos logit contributions
 satisfaction+0.121
 fascination+0.120
 pleasure+0.119
 accomplishment+0.114
 delight+0.110
Neg logit contributions
 smoking-0.071
 disturbed-0.070
 driving-0.069
 broke-0.069
 parked-0.068
Token but
Token ID475
Feature activation+5.12e-02

Pos logit contributions
 pleasure+0.148
aire+0.148
 satisfaction+0.143
 accomplishments+0.142
 accomplishment+0.141
Neg logit contributions
 parked-0.106
 smoking-0.101
 driving-0.101
 stuck-0.098
 blocking-0.096
Token was
Token ID373
Feature activation+1.97e-02

Pos logit contributions
 pleasure+0.136
 satisfaction+0.131
 accomplishment+0.130
 accomplishments+0.130
 difference+0.129
Neg logit contributions
 parked-0.040
 stuck-0.038
 disturbed-0.034
 stole-0.033
 broke-0.033
Token walk
Token ID2513
Feature activation+3.51e-02

Pos logit contributions
 satisfaction+0.063
 pleasure+0.061
 fascination+0.058
 delight+0.057
 accomplishment+0.056
Neg logit contributions
 driving-0.051
 parked-0.050
 smoking-0.047
 knocked-0.046
 stole-0.046
Token.
Token ID13
Feature activation+1.14e-02

Pos logit contributions
 satisfaction+0.084
 pleasure+0.084
 accomplishment+0.081
 fascination+0.078
 delight+0.077
Neg logit contributions
 parked-0.042
 driving-0.040
 smoking-0.037
 stuck-0.036
 disturbed-0.036
Token She
Token ID1375
Feature activation+3.26e-02

Pos logit contributions
 satisfaction+0.022
 accomplishment+0.022
 pleasure+0.022
 delight+0.021
 fascination+0.020
Neg logit contributions
 parked-0.018
 driving-0.017
 smoking-0.017
 stole-0.016
 knocked-0.016
Token still
Token ID991
Feature activation+3.46e-02

Pos logit contributions
 pleasure+0.079
 satisfaction+0.079
 accomplishment+0.077
 delight+0.073
 fascination+0.072
Neg logit contributions
 parked-0.043
 knocked-0.037
 stole-0.036
 driving-0.036
 broke-0.034
Token felt
Token ID2936
Feature activation+4.68e-02

Pos logit contributions
 accomplishment+0.078
 satisfaction+0.078
 pleasure+0.078
 delight+0.074
 fascination+0.073
Neg logit contributions
 parked-0.046
 stole-0.042
 driving-0.042
 knocked-0.041
 broke-0.040
Token a
Token ID257
Feature activation+1.08e-01

Pos logit contributions
 accomplishment+0.098
oph+0.098
 pleasure+0.096
 satisfaction+0.096
 fascination+0.095
Neg logit contributions
 parked-0.063
 disturbed-0.061
 hurt-0.060
 driving-0.059
 stuck-0.059
Token bit
Token ID1643
Feature activation+4.65e-02

Pos logit contributions
 deeds+0.256
oph+0.244
ently+0.244
 agrees+0.242
hew+0.238
Neg logit contributions
 hurt-0.122
 knocked-0.101
 broke-0.098
 wrecked-0.096
 hit-0.095
Token w
Token ID266
Feature activation+5.04e-02

Pos logit contributions
 satisfaction+0.109
 fascination+0.108
 pleasure+0.108
 accomplishment+0.108
oph+0.104
Neg logit contributions
 disturbed-0.053
 hurt-0.051
 stuck-0.049
 broke-0.047
 smoking-0.046
Tokenobb
Token ID21963
Feature activation+5.05e-02

Pos logit contributions
 fascination+0.106
 amaz+0.105
 pleasure+0.103
 satisfaction+0.100
 delight+0.097
Neg logit contributions
 disturbed-0.066
wheel-0.062
 smoking-0.062
 bitten-0.062
 hurt-0.062
Tokenly
Token ID306
Feature activation+4.10e-02

Pos logit contributions
 fascination+0.104
 pleasure+0.103
 satisfaction+0.101
 amaz+0.099
 delight+0.097
Neg logit contributions
 parked-0.079
 stole-0.078
 broke-0.077
 driving-0.076
 hurt-0.076
Token,
Token ID11
Feature activation+7.02e-02

Pos logit contributions
 pleasure+0.076
 satisfaction+0.076
 accomplishment+0.073
 fascination+0.070
 delight+0.068
Neg logit contributions
 parked-0.069
 driving-0.067
 smoking-0.065
 stole-0.064
 blocking-0.063
Token on
Token ID319
Feature activation+1.27e-02

Pos logit contributions
 satisfaction+0.004
 pleasure+0.004
 accomplishment+0.004
 delight+0.004
 fascination+0.003
Neg logit contributions
 parked-0.005
 driving-0.004
 smoking-0.004
 stole-0.004
 disturbed-0.004
Token it
Token ID340
Feature activation-1.07e-02

Pos logit contributions
 satisfaction+0.029
 pleasure+0.029
 accomplishment+0.028
 fascination+0.027
 delight+0.027
Neg logit contributions
 parked-0.016
 driving-0.015
 smoking-0.015
 stole-0.014
 knocked-0.014
Token.
Token ID13
Feature activation+3.08e-03

Pos logit contributions
 parked+0.016
 stole+0.015
 smoking+0.014
 driving+0.014
 disturbed+0.014
Neg logit contributions
 pleasure-0.022
 satisfaction-0.022
 accomplishment-0.022
 delight-0.021
 fascination-0.020
Token Sally
Token ID25737
Feature activation+3.71e-02

Pos logit contributions
 satisfaction+0.006
 pleasure+0.006
 accomplishment+0.006
 delight+0.006
 fascination+0.006
Neg logit contributions
 parked-0.005
 driving-0.005
 stole-0.005
 smoking-0.005
 knocked-0.004
Token felt
Token ID2936
Feature activation+3.24e-02

Pos logit contributions
 pleasure+0.092
 satisfaction+0.092
 accomplishment+0.090
 fascination+0.084
 delight+0.084
Neg logit contributions
 parked-0.048
 knocked-0.043
 stole-0.042
 broke-0.041
 driving-0.039
Token a
Token ID257
Feature activation+1.07e-01

Pos logit contributions
 accomplishment+0.060
 pleasure+0.059
 satisfaction+0.059
 fascination+0.058
aman+0.058
Neg logit contributions
 parked-0.051
 disturbed-0.048
 stole-0.047
 hurt-0.047
 driving-0.046
Token little
Token ID1310
Feature activation+4.75e-02

Pos logit contributions
 deeds+0.226
aman+0.220
heart+0.217
ior+0.216
oph+0.210
Neg logit contributions
 hurt-0.137
 broke-0.124
 knocked-0.121
 hit-0.114
 disturbed-0.111
Token better
Token ID1365
Feature activation+4.11e-02

Pos logit contributions
aman+0.100
 accomplishment+0.097
oph+0.096
 satisfaction+0.095
 accomplishments+0.095
Neg logit contributions
 hurt-0.064
 stuck-0.064
 broke-0.060
 disturbed-0.060
 stole-0.057
Token,
Token ID11
Feature activation+3.08e-02

Pos logit contributions
 pleasure+0.090
 satisfaction+0.087
 fascination+0.085
 accomplishment+0.082
 delight+0.081
Neg logit contributions
 parked-0.055
 driving-0.051
 stuck-0.051
 hurt-0.051
 knocked-0.051
Token but
Token ID475
Feature activation+2.22e-02

Pos logit contributions
 satisfaction+0.056
 pleasure+0.054
 accomplishment+0.053
 fascination+0.052
 delight+0.051
Neg logit contributions
 parked-0.055
 driving-0.052
 smoking-0.050
 stuck-0.048
 stole-0.048
Token she
Token ID673
Feature activation+2.39e-02

Pos logit contributions
 pleasure+0.051
 satisfaction+0.051
 accomplishment+0.050
 fascination+0.048
 delight+0.048
Neg logit contributions
 parked-0.029
 stole-0.025
 smoking-0.025
 driving-0.025
 stuck-0.024
TokenM
Token ID44
Feature activation+4.81e-02

Pos logit contributions
 pleasure+0.054
 accomplishment+0.050
 satisfaction+0.050
 delight+0.046
 decision+0.046
Neg logit contributions
 parked-0.044
 smoking-0.041
 driving-0.040
 disturbed-0.038
 stole-0.038
Tokenum
Token ID388
Feature activation+3.52e-02

Pos logit contributions
 pleasure+0.078
 satisfaction+0.077
 delight+0.072
 fascination+0.070
 accomplishment+0.067
Neg logit contributions
 parked-0.082
 smoking-0.081
 driving-0.078
anes-0.078
 knocked-0.075
Token and
Token ID290
Feature activation+4.74e-02

Pos logit contributions
 pleasure+0.066
 satisfaction+0.065
 accomplishment+0.062
 fascination+0.059
 delight+0.057
Neg logit contributions
 parked-0.067
 driving-0.059
 stole-0.058
 knocked-0.058
 smoking-0.057
Token dad
Token ID9955
Feature activation+3.84e-02

Pos logit contributions
 difference+0.095
 pleasure+0.094
 accomplishment+0.091
 satisfaction+0.091
 decision+0.090
Neg logit contributions
 parked-0.070
 knocked-0.062
 driving-0.060
 smoking-0.059
 called-0.059
Token hugged
Token ID41130
Feature activation+8.27e-02

Pos logit contributions
 pleasure+0.086
 satisfaction+0.085
 accomplishment+0.085
 fascination+0.080
 accomplishments+0.079
Neg logit contributions
 parked-0.056
 stole-0.048
 driving-0.048
 knocked-0.046
 broke-0.046
Token Maria
Token ID14200
Feature activation+1.07e-01

Pos logit contributions
 pleasure+0.195
 satisfaction+0.185
 accomplishment+0.183
 decision+0.182
 difference+0.181
Neg logit contributions
 to-0.075
 appeared-0.074
 on-0.071
 called-0.069
 came-0.067
Token and
Token ID290
Feature activation+3.78e-02

Pos logit contributions
 satisfaction+0.217
 fascination+0.211
 accomplishment+0.210
 pleasure+0.209
 accomplishments+0.202
Neg logit contributions
 parked-0.143
 driving-0.133
 on-0.126
 stuck-0.125
o-0.121
Token they
Token ID484
Feature activation+3.00e-02

Pos logit contributions
 pleasure+0.094
 satisfaction+0.092
 accomplishment+0.090
 fascination+0.089
 difference+0.088
Neg logit contributions
 parked-0.039
 knocked-0.034
 driving-0.034
 stole-0.033
 broke-0.032
Token all
Token ID477
Feature activation+1.11e-02

Pos logit contributions
 pleasure+0.066
 accomplishment+0.065
 satisfaction+0.065
 fascination+0.062
 delight+0.061
Neg logit contributions
 parked-0.043
 stole-0.038
 driving-0.037
 broke-0.036
 knocked-0.036
Token went
Token ID1816
Feature activation-9.61e-03

Pos logit contributions
 pleasure+0.024
 satisfaction+0.024
 accomplishment+0.023
 fascination+0.022
 delight+0.022
Neg logit contributions
 parked-0.016
 stole-0.014
 driving-0.014
 smoking-0.014
 knocked-0.014
Token home
Token ID1363
Feature activation+2.50e-02

Pos logit contributions
 parked+0.014
 stole+0.013
 smoking+0.013
 driving+0.012
 disturbed+0.012
Neg logit contributions
 satisfaction-0.020
 pleasure-0.020
 accomplishment-0.020
 delight-0.019
 fascination-0.019
Token calendar
Token ID11845
Feature activation+4.86e-03

Pos logit contributions
 satisfaction+0.102
 pleasure+0.097
 accomplishment+0.096
 delight+0.094
 fascination+0.094
Neg logit contributions
 knocked-0.069
 parked-0.068
 driving-0.068
 broke-0.066
 smoking-0.065
Token soon
Token ID2582
Feature activation+9.74e-03

Pos logit contributions
 satisfaction+0.010
 pleasure+0.010
 accomplishment+0.010
 delight+0.009
 fascination+0.009
Neg logit contributions
 parked-0.007
 driving-0.007
 stole-0.007
 smoking-0.007
 disturbed-0.006
Token."
Token ID526
Feature activation+2.96e-02

Pos logit contributions
 satisfaction+0.023
 pleasure+0.022
 accomplishment+0.022
 fascination+0.021
 delight+0.021
Neg logit contributions
 parked-0.012
 driving-0.011
 stole-0.011
 smoking-0.011
 knocked-0.010
Token Amy
Token ID14235
Feature activation+7.80e-02

Pos logit contributions
 accomplishment+0.068
 satisfaction+0.068
 pleasure+0.067
 fascination+0.064
 decision+0.063
Neg logit contributions
 parked-0.033
 smoking-0.032
 broke-0.031
 driving-0.031
 stole-0.031
Token felt
Token ID2936
Feature activation+4.50e-02

Pos logit contributions
 pleasure+0.189
 accomplishment+0.177
 accomplishments+0.171
 satisfaction+0.171
 difference+0.168
Neg logit contributions
 parked-0.096
 broke-0.095
 knocked-0.091
 hurt-0.086
 rang-0.086
Token a
Token ID257
Feature activation+1.07e-01

Pos logit contributions
 accomplishment+0.094
aman+0.094
 pleasure+0.092
 fascination+0.091
 accomplishments+0.090
Neg logit contributions
 parked-0.059
 hurt-0.057
 disturbed-0.056
 stuck-0.055
 stole-0.055
Token little
Token ID1310
Feature activation+4.67e-02

Pos logit contributions
 deeds+0.221
osa+0.221
aman+0.220
heart+0.213
ishment+0.213
Neg logit contributions
 hurt-0.138
 broke-0.132
 knocked-0.122
 wrecked-0.118
 hit-0.115
Token better
Token ID1365
Feature activation+4.78e-02

Pos logit contributions
 accomplishment+0.094
aman+0.092
 accomplishments+0.091
 fascination+0.091
 satisfaction+0.090
Neg logit contributions
 hurt-0.064
 stuck-0.064
 broke-0.063
 stole-0.061
 disturbed-0.061
Token after
Token ID706
Feature activation-3.33e-03

Pos logit contributions
 pleasure+0.094
 fascination+0.088
 satisfaction+0.087
 accomplishment+0.085
agg+0.081
Neg logit contributions
 parked-0.074
 hurt-0.073
 broke-0.072
 knocked-0.072
 stuck-0.071
Token hearing
Token ID4854
Feature activation+2.52e-02

Pos logit contributions
 parked+0.005
 stole+0.005
 driving+0.004
 smoking+0.004
 broke+0.004
Neg logit contributions
 satisfaction-0.007
 pleasure-0.007
 accomplishment-0.007
 delight-0.007
 fascination-0.006
Token these
Token ID777
Feature activation+4.57e-02

Pos logit contributions
 pleasure+0.055
 satisfaction+0.054
 accomplishment+0.053
 fascination+0.052
 delight+0.050
Neg logit contributions
 parked-0.034
 smoking-0.032
 driving-0.032
 stole-0.031
 disturbed-0.031
Token happy
Token ID3772
Feature activation+3.36e-02

Pos logit contributions
 pleasure+0.059
 satisfaction+0.059
 accomplishment+0.058
oph+0.055
 accomplishments+0.055
Neg logit contributions
 parked-0.056
 disturbed-0.054
 hurt-0.052
 broke-0.052
 knocked-0.052
Token!
Token ID0
Feature activation+5.11e-02

Pos logit contributions
 satisfaction+0.082
 pleasure+0.081
 fascination+0.077
 accomplishment+0.076
 delight+0.073
Neg logit contributions
 driving-0.039
 parked-0.038
 smoking-0.038
 disturbed-0.035
 blocking-0.035
Token She
Token ID1375
Feature activation+4.22e-02

Pos logit contributions
 accomplishment+0.120
 satisfaction+0.120
 fascination+0.116
 pleasure+0.112
 decision+0.112
Neg logit contributions
 parked-0.049
 driving-0.045
 smoking-0.043
 stole-0.043
 knocked-0.043
Token hugged
Token ID41130
Feature activation+7.76e-02

Pos logit contributions
 pleasure+0.102
 satisfaction+0.098
 accomplishment+0.098
 difference+0.096
 fascination+0.095
Neg logit contributions
 parked-0.056
 stole-0.049
 knocked-0.048
 broke-0.047
 stuck-0.045
Token her
Token ID607
Feature activation+6.60e-02

Pos logit contributions
 pleasure+0.221
hew+0.219
 decision+0.217
 difference+0.214
 satisfaction+0.211
Neg logit contributions
 to-0.039
 on-0.034
 in-0.027
 smoking-0.024
 coming-0.024
Token mum
Token ID25682
Feature activation+1.07e-01

Pos logit contributions
aman+0.123
hew+0.120
 satisfaction+0.119
 fascination+0.117
 pleasures+0.116
Neg logit contributions
 stuck-0.089
 hurt-0.084
 broke-0.079
 parked-0.078
 knocked-0.077
Token and
Token ID290
Feature activation+4.33e-02

Pos logit contributions
 pleasure+0.249
 satisfaction+0.240
 fascination+0.237
 accomplishment+0.233
 accomplishments+0.233
Neg logit contributions
 parked-0.139
 on-0.121
 called-0.115
 driving-0.113
 stuck-0.110
Token together
Token ID1978
Feature activation+3.22e-02

Pos logit contributions
 pleasure+0.110
 satisfaction+0.103
 accomplishment+0.103
 difference+0.100
 accomplishments+0.100
Neg logit contributions
 parked-0.050
 knocked-0.045
 stole-0.043
 broke-0.042
 stuck-0.041
Token they
Token ID484
Feature activation+1.98e-02

Pos logit contributions
 pleasure+0.081
 satisfaction+0.080
 accomplishment+0.076
 fascination+0.075
 difference+0.074
Neg logit contributions
 parked-0.035
 knocked-0.032
 driving-0.032
 broke-0.031
 stuck-0.030
Token went
Token ID1816
Feature activation-7.92e-03

Pos logit contributions
 pleasure+0.049
 satisfaction+0.048
 accomplishment+0.048
 delight+0.045
 fascination+0.045
Neg logit contributions
 parked-0.024
 stole-0.021
 driving-0.021
 knocked-0.020
 broke-0.020
Token to
Token ID284
Feature activation+1.38e-02

Pos logit contributions
 parked+0.010
 stole+0.009
 smoking+0.009
 driving+0.009
 disturbed+0.009
Neg logit contributions
 pleasure-0.018
 satisfaction-0.018
 accomplishment-0.018
 delight-0.017
 fascination-0.016
Token was
Token ID373
Feature activation-9.66e-03

Pos logit contributions
 parked+0.025
 stole+0.023
 driving+0.023
 smoking+0.023
 blocking+0.021
Neg logit contributions
 satisfaction-0.037
 pleasure-0.037
 accomplishment-0.036
 delight-0.036
 fascination-0.034
Token shining
Token ID22751
Feature activation+2.57e-02

Pos logit contributions
 parked+0.013
 stole+0.012
 smoking+0.012
 driving+0.012
 broke+0.011
Neg logit contributions
 pleasure-0.021
 satisfaction-0.021
 accomplishment-0.021
 delight-0.020
 fascination-0.020
Token.
Token ID13
Feature activation-4.61e-03

Pos logit contributions
 satisfaction+0.060
 pleasure+0.059
 accomplishment+0.059
 fascination+0.056
 delight+0.055
Neg logit contributions
 parked-0.032
 driving-0.029
 smoking-0.029
 disturbed-0.028
 broke-0.028
Token Lucy
Token ID22162
Feature activation+3.34e-02

Pos logit contributions
 parked+0.007
 stole+0.007
 smoking+0.007
 driving+0.007
 disturbed+0.007
Neg logit contributions
 pleasure-0.009
 satisfaction-0.009
 accomplishment-0.009
 delight-0.008
 fascination-0.008
Token felt
Token ID2936
Feature activation+3.87e-02

Pos logit contributions
 satisfaction+0.076
 pleasure+0.076
 accomplishment+0.076
 delight+0.069
 fascination+0.069
Neg logit contributions
 parked-0.048
 knocked-0.044
 stole-0.043
 broke-0.041
 disturbed-0.041
Token a
Token ID257
Feature activation+1.07e-01

Pos logit contributions
 accomplishment+0.083
 pleasure+0.081
 satisfaction+0.080
 accomplishments+0.078
oph+0.077
Neg logit contributions
 parked-0.052
 disturbed-0.050
 hurt-0.048
 stuck-0.047
 smoking-0.046
Token bit
Token ID1643
Feature activation+3.95e-02

Pos logit contributions
 deeds+0.238
oph+0.230
aman+0.224
 accomplishment+0.223
hew+0.223
Neg logit contributions
 hurt-0.132
 knocked-0.117
 disturbed-0.109
 broke-0.106
 lost-0.103
Token scared
Token ID12008
Feature activation+2.98e-02

Pos logit contributions
 satisfaction+0.097
 pleasure+0.097
 accomplishment+0.097
 fascination+0.093
 accomplishments+0.092
Neg logit contributions
 disturbed-0.041
 stuck-0.038
 hurt-0.038
 knocked-0.037
 smoking-0.036
Token but
Token ID475
Feature activation+1.97e-02

Pos logit contributions
 satisfaction+0.059
 pleasure+0.058
 accomplishment+0.058
 fascination+0.055
 accomplishments+0.052
Neg logit contributions
 parked-0.047
 smoking-0.047
 driving-0.047
 disturbed-0.046
 blocking-0.044
Token with
Token ID351
Feature activation+3.48e-02

Pos logit contributions
 pleasure+0.041
 satisfaction+0.041
 accomplishment+0.040
 fascination+0.038
 difference+0.038
Neg logit contributions
 parked-0.029
 driving-0.026
 smoking-0.026
 stole-0.026
 knocked-0.026
Token Jake
Token ID14757
Feature activation+2.12e-02

Pos logit contributions
 satisfaction+0.072
 pleasure+0.072
 accomplishment+0.069
aire+0.068
 difference+0.068
Neg logit contributions
 parked-0.047
 driving-0.043
 smoking-0.042
 knocked-0.042
 disturbed-0.042
Token her
Token ID607
Feature activation+4.17e-02

Pos logit contributions
 pleasure+0.071
 satisfaction+0.067
 fascination+0.065
aire+0.065
oph+0.063
Neg logit contributions
 smoking-0.041
 driving-0.040
 parked-0.039
 blocking-0.038
 guarding-0.035
Token jam
Token ID16853
Feature activation+3.36e-02

Pos logit contributions
 satisfaction+0.080
 pleasure+0.079
 fascination+0.078
aman+0.072
 delight+0.071
Neg logit contributions
 smoking-0.061
 knocked-0.060
 broke-0.055
 disturbed-0.055
 driving-0.055
Token.
Token ID13
Feature activation+4.08e-02

Pos logit contributions
 pleasure+0.077
 satisfaction+0.076
 fascination+0.073
 accomplishment+0.072
 delight+0.070
Neg logit contributions
 parked-0.042
 smoking-0.040
 driving-0.039
 stole-0.037
 disturbed-0.037
Token She
Token ID1375
Feature activation+2.86e-02

Pos logit contributions
 accomplishment+0.096
 satisfaction+0.093
 pleasure+0.089
 decision+0.088
 fascination+0.087
Neg logit contributions
 parked-0.042
 smoking-0.039
 knocked-0.037
 driving-0.037
 broke-0.037
Token felt
Token ID2936
Feature activation+4.17e-02

Pos logit contributions
 pleasure+0.068
 satisfaction+0.066
 accomplishment+0.065
 fascination+0.063
 difference+0.062
Neg logit contributions
 parked-0.036
 stole-0.033
 knocked-0.032
 broke-0.031
 driving-0.031
Token a
Token ID257
Feature activation+1.06e-01

Pos logit contributions
 accomplishment+0.085
aman+0.084
 pleasure+0.084
 fascination+0.083
 accomplishments+0.082
Neg logit contributions
 parked-0.055
 disturbed-0.053
 stuck-0.050
 smoking-0.050
 stole-0.050
Token great
Token ID1049
Feature activation+6.55e-02

Pos logit contributions
 deeds+0.221
ishment+0.213
ior+0.207
 agrees+0.206
hew+0.205
Neg logit contributions
 hurt-0.133
 knocked-0.125
 broke-0.115
 parked-0.115
 disturbed-0.114
Token sense
Token ID2565
Feature activation+2.33e-02

Pos logit contributions
hew+0.127
 fascination+0.123
 pleasure+0.120
 Deal+0.118
aman+0.117
Neg logit contributions
 knocked-0.084
 broke-0.081
 hurt-0.081
 disturbed-0.079
 hit-0.079
Token of
Token ID286
Feature activation+4.86e-02

Pos logit contributions
 pleasure+0.061
 satisfaction+0.061
 fascination+0.059
 accomplishment+0.059
 delight+0.057
Neg logit contributions
 parked-0.022
 driving-0.020
 smoking-0.020
 disturbed-0.018
 stole-0.018
Token accomplishment
Token ID29340
Feature activation+2.76e-02

Pos logit contributions
aire+0.055
oph+0.052
aman+0.049
hew+0.044
 pleasure+0.043
Neg logit contributions
 parked-0.109
 driving-0.108
 smoking-0.105
 blocking-0.104
 hurt-0.104
Token so
Token ID523
Feature activation+4.43e-02

Pos logit contributions
 pleasure+0.059
 satisfaction+0.059
 fascination+0.056
 accomplishment+0.055
 delight+0.055
Neg logit contributions
 parked-0.040
 driving-0.038
 smoking-0.037
 stole-0.035
 knocked-0.035
Token sweet
Token ID6029
Feature activation+7.41e-02

Pos logit contributions
 satisfaction+0.103
 fascination+0.101
 pleasure+0.099
 amaz+0.099
 disbelief+0.098
Neg logit contributions
 driving-0.076
 parked-0.074
 smoking-0.071
 hurt-0.070
 disturbed-0.069
Token treat
Token ID2190
Feature activation+4.04e-02

Pos logit contributions
 pleasure+0.159
 satisfaction+0.156
 fascination+0.155
 Deal+0.150
 accomplishment+0.147
Neg logit contributions
 smoking-0.089
 driving-0.083
 disturbed-0.081
 hurt-0.080
 parked-0.079
Token and
Token ID290
Feature activation+5.09e-02

Pos logit contributions
 satisfaction+0.092
 pleasure+0.091
 fascination+0.088
 accomplishment+0.088
 delight+0.085
Neg logit contributions
 parked-0.051
 driving-0.047
 smoking-0.046
 broke-0.045
 stole-0.044
Token shared
Token ID4888
Feature activation+6.47e-02

Pos logit contributions
 pleasure+0.109
 fascination+0.106
 satisfaction+0.106
aire+0.106
 accomplishment+0.106
Neg logit contributions
 parked-0.063
 knocked-0.058
 smoking-0.058
 driving-0.057
 stole-0.055
Token a
Token ID257
Feature activation+7.64e-02

Pos logit contributions
 pleasure+0.163
 accomplishment+0.160
 fascination+0.158
 satisfaction+0.155
 decision+0.153
Neg logit contributions
 on-0.048
 smoking-0.044
 parked-0.044
 driving-0.041
 disturbed-0.040
Token big
Token ID1263
Feature activation+1.06e-01

Pos logit contributions
ishment+0.150
 deeds+0.148
 fascination+0.145
 pleasure+0.144
 satisfaction+0.143
Neg logit contributions
 parked-0.095
 driving-0.089
 hurt-0.086
 broke-0.086
 stole-0.084
Token,
Token ID11
Feature activation+5.14e-02

Pos logit contributions
 surroundings+0.201
 deeds+0.201
ishment+0.199
ently+0.196
iper+0.193
Neg logit contributions
 hit-0.121
 driving-0.121
 parked-0.118
 popped-0.117
 v-0.116
Token happy
Token ID3772
Feature activation+4.84e-02

Pos logit contributions
 satisfaction+0.087
 fascination+0.086
 accomplishment+0.086
 pleasure+0.085
aire+0.083
Neg logit contributions
 parked-0.089
 driving-0.084
 smoking-0.083
 stuck-0.077
 stole-0.076
Token hug
Token ID16225
Feature activation+3.11e-02

Pos logit contributions
 pleasure+0.086
 fascination+0.086
 satisfaction+0.084
 Deal+0.082
 accomplishment+0.079
Neg logit contributions
 driving-0.076
 parked-0.070
 smoking-0.069
 knocked-0.065
 disturbed-0.063
Token.
Token ID13
Feature activation+3.84e-02

Pos logit contributions
 pleasure+0.071
 satisfaction+0.071
 accomplishment+0.069
 fascination+0.068
 delight+0.066
Neg logit contributions
 parked-0.038
 driving-0.036
 smoking-0.035
 stole-0.033
 broke-0.033
Token<|endoftext|>
Token ID50256
Feature activation-2.36e-02

Pos logit contributions
 satisfaction+0.086
 accomplishment+0.086
 pleasure+0.083
 fascination+0.083
 decision+0.081
Neg logit contributions
 parked-0.047
 driving-0.043
 smoking-0.040
 knocked-0.040
 broke-0.040
Token it
Token ID340
Feature activation+5.14e-02

Pos logit contributions
 satisfaction+0.126
 pleasure+0.126
 accomplishment+0.122
 fascination+0.118
 delight+0.118
Neg logit contributions
 smoking-0.047
 parked-0.044
 driving-0.041
 stuck-0.038
 hurt-0.037
Token.
Token ID13
Feature activation+4.17e-02

Pos logit contributions
 satisfaction+0.125
 fascination+0.122
 pleasure+0.121
 accomplishment+0.116
 amaz+0.116
Neg logit contributions
 parked-0.055
 driving-0.051
 stuck-0.049
 smoking-0.049
 broke-0.048
Token After
Token ID2293
Feature activation-2.74e-03

Pos logit contributions
 accomplishment+0.093
 satisfaction+0.093
 fascination+0.088
 pleasure+0.087
 decision+0.086
Neg logit contributions
 parked-0.051
 driving-0.046
 Wh-0.045
 CA-0.044
 knocked-0.043
Token they
Token ID484
Feature activation+2.40e-02

Pos logit contributions
 parked+0.004
 stole+0.003
 driving+0.003
 smoking+0.003
 disturbed+0.003
Neg logit contributions
 satisfaction-0.006
 pleasure-0.006
 accomplishment-0.006
 delight-0.006
 fascination-0.006
Token finished
Token ID5201
Feature activation+3.72e-02

Pos logit contributions
 pleasure+0.053
 accomplishment+0.052
 satisfaction+0.051
 fascination+0.049
 difference+0.048
Neg logit contributions
 parked-0.035
 stole-0.032
 knocked-0.031
 broke-0.030
 disturbed-0.029
Token cleaning
Token ID12724
Feature activation+1.06e-01

Pos logit contributions
 satisfaction+0.093
 pleasure+0.093
 fascination+0.086
 accomplishment+0.084
 delight+0.084
Neg logit contributions
 smoking-0.045
 driving-0.044
 parked-0.039
 knocked-0.038
 broke-0.035
Token the
Token ID262
Feature activation+1.89e-02

Pos logit contributions
 satisfaction+0.275
 pleasure+0.260
 Deal+0.257
ited+0.254
 fascination+0.252
Neg logit contributions
 on-0.095
 smoking-0.093
 broke-0.085
 cars-0.084
 driving-0.081
Token shell
Token ID7582
Feature activation+6.09e-02

Pos logit contributions
 satisfaction+0.038
 pleasure+0.038
 accomplishment+0.037
 fascination+0.036
 delight+0.035
Neg logit contributions
 parked-0.029
 stole-0.027
 smoking-0.026
 driving-0.026
 broke-0.026
Token,
Token ID11
Feature activation+2.12e-02

Pos logit contributions
 satisfaction+0.125
 pleasure+0.119
 fascination+0.118
 accomplishment+0.112
 delight+0.109
Neg logit contributions
 broke-0.093
 parked-0.093
 driving-0.088
 smoking-0.088
 knocked-0.087
Token they
Token ID484
Feature activation+8.31e-03

Pos logit contributions
 pleasure+0.056
 satisfaction+0.056
 accomplishment+0.054
 fascination+0.054
 delight+0.052
Neg logit contributions
 parked-0.019
 smoking-0.018
 driving-0.018
 stole-0.016
 broke-0.016
Token celebrated
Token ID13943
Feature activation+5.75e-02

Pos logit contributions
 pleasure+0.019
 satisfaction+0.019
 accomplishment+0.019
 delight+0.018
 fascination+0.018
Neg logit contributions
 parked-0.011
 stole-0.010
 driving-0.010
 smoking-0.009
 knocked-0.009
Token of
Token ID286
Feature activation+2.81e-02

Pos logit contributions
 satisfaction+0.039
 pleasure+0.039
 accomplishment+0.039
 fascination+0.038
 delight+0.037
Neg logit contributions
 parked-0.014
 driving-0.013
 smoking-0.013
 stole-0.012
 disturbed-0.012
Token you
Token ID345
Feature activation+3.87e-02

Pos logit contributions
 satisfaction+0.069
 pleasure+0.068
 accomplishment+0.067
 fascination+0.067
 delight+0.064
Neg logit contributions
 driving-0.028
 parked-0.028
 smoking-0.027
 blocking-0.025
 guarding-0.024
Token!"
Token ID2474
Feature activation+2.98e-02

Pos logit contributions
 satisfaction+0.086
 fascination+0.085
 accomplishment+0.084
 pleasure+0.084
 amaz+0.083
Neg logit contributions
 parked-0.051
 driving-0.047
 smoking-0.044
 stole-0.043
 broke-0.041
Token Ruth
Token ID22132
Feature activation+6.09e-02

Pos logit contributions
 accomplishment+0.066
 pleasure+0.066
 satisfaction+0.065
 fascination+0.062
 accomplishments+0.060
Neg logit contributions
 parked-0.039
 driving-0.036
 smoking-0.035
 stole-0.034
 knocked-0.034
Token felt
Token ID2936
Feature activation+4.32e-02

Pos logit contributions
 pleasure+0.146
 accomplishment+0.139
 satisfaction+0.136
 fascination+0.132
 difference+0.131
Neg logit contributions
 parked-0.080
 knocked-0.072
 broke-0.071
 stole-0.069
 called-0.066
Token a
Token ID257
Feature activation+1.06e-01

Pos logit contributions
aman+0.086
 accomplishment+0.086
 pleasure+0.084
 fascination+0.083
 accomplishments+0.083
Neg logit contributions
 parked-0.060
 hurt-0.056
 disturbed-0.056
 stole-0.055
 stuck-0.055
Token special
Token ID2041
Feature activation+5.36e-02

Pos logit contributions
ishment+0.226
 deeds+0.225
osa+0.223
aman+0.220
hew+0.217
Neg logit contributions
 hurt-0.130
 broke-0.115
 knocked-0.113
 hit-0.106
 wrecked-0.104
Token feeling
Token ID4203
Feature activation+3.83e-02

Pos logit contributions
hew+0.080
 satisfaction+0.080
oph+0.078
 pleasure+0.078
 fascination+0.077
Neg logit contributions
 parked-0.092
 driving-0.092
 broke-0.092
 knocked-0.091
 stuck-0.090
Token in
Token ID287
Feature activation+5.02e-02

Pos logit contributions
 fascination+0.093
 pleasure+0.093
 accomplishment+0.091
 satisfaction+0.089
 amaz+0.088
Neg logit contributions
 parked-0.043
 disturbed-0.041
 broke-0.039
 stole-0.038
 driving-0.038
Token her
Token ID607
Feature activation+5.30e-02

Pos logit contributions
hew+0.127
 pleasure+0.120
ouring+0.116
aire+0.115
ealing+0.114
Neg logit contributions
 driving-0.042
 parked-0.040
 smoking-0.040
 blocking-0.039
 guarding-0.037
Token heart
Token ID2612
Feature activation+4.32e-02

Pos logit contributions
oph+0.091
heart+0.083
aman+0.079
ya+0.078
hew+0.076
Neg logit contributions
 smoking-0.085
 driving-0.084
 hurt-0.082
 broke-0.080
 parked-0.080
Token yourself
Token ID3511
Feature activation+4.21e-02

Pos logit contributions
 accomplishment+0.144
oph+0.142
 pleasure+0.142
 satisfaction+0.140
 fascination+0.139
Neg logit contributions
 smoking-0.078
 parked-0.076
 driving-0.076
 blocking-0.072
 guarding-0.069
Token.
Token ID13
Feature activation+4.94e-02

Pos logit contributions
 satisfaction+0.104
 accomplishment+0.102
 pleasure+0.101
 fascination+0.098
 delight+0.097
Neg logit contributions
 parked-0.045
 driving-0.041
 smoking-0.041
 stole-0.039
 disturbed-0.037
Token<|endoftext|>
Token ID50256
Feature activation-1.58e-02

Pos logit contributions
 accomplishment+0.122
 satisfaction+0.117
 decision+0.112
 pleasure+0.111
 fascination+0.111
Neg logit contributions
 parked-0.050
 driving-0.043
 smoking-0.043
 stole-0.041
 CA-0.040
TokenOnce
Token ID7454
Feature activation+3.02e-02

Pos logit contributions
 parked+0.023
 smoking+0.021
 stole+0.021
 driving+0.021
 knocked+0.020
Neg logit contributions
 pleasure-0.032
 satisfaction-0.032
 accomplishment-0.031
 delight-0.030
 fascination-0.030
Token upon
Token ID2402
Feature activation-3.68e-03

Pos logit contributions
 satisfaction+0.064
 pleasure+0.060
 accomplishment+0.059
 delight+0.058
 fascination+0.057
Neg logit contributions
 parked-0.046
 driving-0.042
 smoking-0.041
 stole-0.041
 knocked-0.041
Token a
Token ID257
Feature activation+1.06e-01

Pos logit contributions
 parked+0.004
 stole+0.004
 smoking+0.004
 driving+0.004
 knocked+0.004
Neg logit contributions
 satisfaction-0.009
 pleasure-0.009
 accomplishment-0.009
 delight-0.008
 fascination-0.008
Token time
Token ID640
Feature activation+2.15e-02

Pos logit contributions
 satisfaction+0.260
 accomplishment+0.253
 amaz+0.243
 delight+0.242
 deeds+0.241
Neg logit contributions
 knocked-0.082
 disturbed-0.080
o-0.079
 named-0.077
 driving-0.076
Token,
Token ID11
Feature activation-7.79e-03

Pos logit contributions
 satisfaction+0.038
 accomplishment+0.036
 pleasure+0.036
 delight+0.034
 fascination+0.034
Neg logit contributions
 parked-0.039
 driving-0.038
 smoking-0.037
 stole-0.037
 broke-0.037
Tokenthere
Token ID8117
Feature activation+1.40e-02

Pos logit contributions
 parked+0.006
 stole+0.006
 smoking+0.006
 driving+0.006
 knocked+0.005
Neg logit contributions
 pleasure-0.021
 satisfaction-0.021
 accomplishment-0.020
 delight-0.020
 fascination-0.020
Token lived
Token ID5615
Feature activation-3.10e-02

Pos logit contributions
 pleasure+0.033
 satisfaction+0.033
 accomplishment+0.033
 fascination+0.031
 delight+0.031
Neg logit contributions
 parked-0.017
 stole-0.016
 driving-0.015
 smoking-0.015
 broke-0.015
Token a
Token ID257
Feature activation+1.83e-02

Pos logit contributions
 stole+0.047
 parked+0.046
 smoking+0.044
 knocked+0.043
 driving+0.042
Neg logit contributions
 pleasure-0.061
 satisfaction-0.060
 delight-0.059
 accomplishment-0.056
 fascination-0.055
Token saw
Token ID2497
Feature activation-5.28e-02

Pos logit contributions
 driving+0.055
anes+0.055
 parked+0.054
 Enter+0.053
 blocking+0.053
Neg logit contributions
 satisfaction-0.063
 delight-0.060
 pleasure-0.058
 accomplishment-0.058
 fascination-0.058
Token a
Token ID257
Feature activation-2.87e-02

Pos logit contributions
cher+0.097
ider+0.096
 stole+0.093
 parked+0.091
 guarding+0.090
Neg logit contributions
 satisfaction-0.074
 delight-0.070
 pleasure-0.066
 accomplishment-0.064
 relief-0.064
Token big
Token ID1263
Feature activation+5.46e-03

Pos logit contributions
 parked+0.049
 driving+0.046
 stole+0.045
 smoking+0.044
 driven+0.043
Neg logit contributions
 pleasure-0.050
 satisfaction-0.050
 accomplishment-0.049
 delight-0.047
 difference-0.046
Token,
Token ID11
Feature activation-2.30e-02

Pos logit contributions
 satisfaction+0.011
 pleasure+0.011
 accomplishment+0.011
 delight+0.011
 fascination+0.010
Neg logit contributions
 parked-0.008
 smoking-0.008
 driving-0.008
 stole-0.008
 broke-0.007
Token red
Token ID2266
Feature activation-3.39e-02

Pos logit contributions
 parked+0.034
 driving+0.032
 stole+0.032
 smoking+0.031
 knocked+0.031
Neg logit contributions
 satisfaction-0.045
 pleasure-0.045
 accomplishment-0.044
 delight-0.042
 fascination-0.042
Token thing
Token ID1517
Feature activation-1.29e-01

Pos logit contributions
 parked+0.049
 stole+0.047
 driving+0.046
 disturbed+0.046
 smoking+0.045
Neg logit contributions
 satisfaction-0.066
 pleasure-0.064
 accomplishment-0.064
 delight-0.063
 fascination-0.062
Token on
Token ID319
Feature activation-3.28e-02

Pos logit contributions
 Indians+0.262
Lost+0.238
cher+0.236
chard+0.231
 villains+0.230
Neg logit contributions
 and-0.122
 goodness-0.114
 accomplishment-0.101
 until-0.097
 moments-0.096
Token the
Token ID262
Feature activation-1.39e-02

Pos logit contributions
 stole+0.057
cher+0.057
 parked+0.055
 smoking+0.055
 driving+0.054
Neg logit contributions
 satisfaction-0.055
 delight-0.053
 accomplishment-0.052
 pleasure-0.050
 fascination-0.049
Token ground
Token ID2323
Feature activation-8.12e-02

Pos logit contributions
 parked+0.024
 stole+0.024
 driving+0.023
 smoking+0.023
 disturbed+0.023
Neg logit contributions
 satisfaction-0.024
 pleasure-0.024
 accomplishment-0.023
 delight-0.022
 fascination-0.022
Token.
Token ID13
Feature activation-3.37e-02

Pos logit contributions
 Found+0.188
cher+0.187
 Indians+0.186
Ro+0.183
 villains+0.181
Neg logit contributions
 accomplishment-0.057
 until-0.056
 decision-0.056
ness-0.053
 satisfaction-0.052
Token He
Token ID679
Feature activation-3.35e-02

Pos logit contributions
 parked+0.062
cher+0.060
 driving+0.060
 stole+0.059
 smoking+0.059
Neg logit contributions
 pleasure-0.052
 satisfaction-0.050
 delight-0.050
 accomplishment-0.047
 fascination-0.045
Token mom
Token ID1995
Feature activation-2.00e-02

Pos logit contributions
 parked+0.082
 stole+0.079
 smoking+0.078
 blocking+0.077
 disturbed+0.076
Neg logit contributions
 pleasure-0.048
 accomplishment-0.044
 delight-0.043
 satisfaction-0.041
 fascination-0.038
Token and
Token ID290
Feature activation-2.65e-02

Pos logit contributions
 parked+0.038
 stole+0.038
 CA+0.038
 disturbed+0.037
 killed+0.037
Neg logit contributions
 pleasure-0.028
 satisfaction-0.027
 accomplishment-0.026
 delight-0.026
 difference-0.025
Token saw
Token ID2497
Feature activation-5.69e-02

Pos logit contributions
through+0.046
 parked+0.046
 smoking+0.046
angs+0.046
 blocking+0.046
Neg logit contributions
 pleasure-0.040
 satisfaction-0.040
 accomplishment-0.038
 delight-0.037
 fascination-0.036
Token a
Token ID257
Feature activation-2.64e-02

Pos logit contributions
ider+0.115
cher+0.111
 stole+0.109
 parked+0.106
 guarding+0.104
Neg logit contributions
 delight-0.070
 satisfaction-0.069
 pleasure-0.069
 amaz-0.063
 fascination-0.061
Token terrible
Token ID7818
Feature activation-4.63e-03

Pos logit contributions
 parked+0.044
 driving+0.041
 stole+0.041
 smoking+0.041
 disturbed+0.039
Neg logit contributions
 pleasure-0.048
 satisfaction-0.047
 accomplishment-0.046
 delight-0.044
 fascination-0.043
Token thing
Token ID1517
Feature activation-1.29e-01

Pos logit contributions
 parked+0.006
 driving+0.005
 stole+0.005
 smoking+0.005
 disturbed+0.005
Neg logit contributions
 satisfaction-0.010
 pleasure-0.010
 accomplishment-0.010
 delight-0.010
 fascination-0.010
Token.
Token ID13
Feature activation-2.79e-02

Pos logit contributions
 Indians+0.246
Ro+0.242
Lost+0.242
destruct+0.235
Wh+0.235
Neg logit contributions
 and-0.127
 accomplishment-0.105
 until-0.105
 goodness-0.098
,-0.097
Token A
Token ID317
Feature activation-2.77e-02

Pos logit contributions
 parked+0.049
 driving+0.047
cher+0.047
 stole+0.047
 smoking+0.046
Neg logit contributions
 pleasure-0.046
 delight-0.044
 satisfaction-0.043
 fascination-0.041
 accomplishment-0.040
Token man
Token ID582
Feature activation-5.76e-02

Pos logit contributions
 parked+0.050
 driving+0.046
 smoking+0.045
 stole+0.045
 blocking+0.043
Neg logit contributions
 pleasure-0.047
 satisfaction-0.046
 accomplishment-0.045
 delight-0.043
 difference-0.042
Token had
Token ID550
Feature activation-7.24e-02

Pos logit contributions
angs+0.079
 Bang+0.078
Ro+0.078
anes+0.077
through+0.073
Neg logit contributions
 satisfaction-0.092
 fascination-0.092
 goodness-0.091
 accomplishment-0.090
 pleasure-0.089
Token a
Token ID257
Feature activation-1.46e-02

Pos logit contributions
glers+0.123
saw+0.122
Ro+0.118
aro+0.115
chard+0.115
Neg logit contributions
 satisfaction-0.107
 delight-0.102
 pleasure-0.097
 joy-0.091
 enthusiasm-0.087
Token One
Token ID1881
Feature activation-1.15e-02

Pos logit contributions
 satisfaction+0.007
 pleasure+0.007
 accomplishment+0.007
 delight+0.006
 fascination+0.006
Neg logit contributions
 parked-0.005
 driving-0.004
 smoking-0.004
 stole-0.004
 knocked-0.004
Token day
Token ID1110
Feature activation-2.09e-02

Pos logit contributions
 parked+0.023
 stole+0.022
 smoking+0.022
 driving+0.022
 blocking+0.021
Neg logit contributions
 satisfaction-0.017
 pleasure-0.017
 accomplishment-0.016
 delight-0.016
 fascination-0.015
Token a
Token ID257
Feature activation-2.73e-02

Pos logit contributions
 parked+0.050
 stole+0.047
 smoking+0.046
 driving+0.046
through+0.046
Neg logit contributions
 satisfaction-0.022
 pleasure-0.022
 accomplishment-0.021
 delight-0.020
 fascination-0.019
Token very
Token ID845
Feature activation-1.21e-02

Pos logit contributions
 parked+0.060
 driving+0.057
 smoking+0.057
 stole+0.056
 blocking+0.055
Neg logit contributions
 pleasure-0.037
 satisfaction-0.036
 accomplishment-0.035
 delight-0.032
 difference-0.030
Token unusual
Token ID8468
Feature activation-2.20e-02

Pos logit contributions
 parked+0.022
 smoking+0.021
 driving+0.021
 stole+0.020
 blocking+0.019
Neg logit contributions
 accomplishment-0.021
 satisfaction-0.021
 pleasure-0.021
 delight-0.019
 fascination-0.019
Token thing
Token ID1517
Feature activation-1.23e-01

Pos logit contributions
 parked+0.038
 stole+0.035
 smoking+0.035
 driving+0.035
 knocked+0.033
Neg logit contributions
 pleasure-0.040
 satisfaction-0.039
 accomplishment-0.039
 delight-0.037
 fascination-0.036
Token happened
Token ID3022
Feature activation-3.69e-02

Pos logit contributions
chard+0.185
through+0.185
anes+0.182
Wh+0.182
cher+0.175
Neg logit contributions
 hope-0.149
 moments-0.148
 goodness-0.141
 accomplishment-0.141
 and-0.138
Token.
Token ID13
Feature activation-2.21e-02

Pos logit contributions
 parked+0.061
 stole+0.060
saw+0.056
Ro+0.056
through+0.055
Neg logit contributions
 accomplishment-0.063
 pleasure-0.062
 satisfaction-0.059
 fascination-0.058
 goodness-0.057
Token He
Token ID679
Feature activation-3.95e-02

Pos logit contributions
 parked+0.039
 stole+0.037
 smoking+0.037
 driving+0.037
 knocked+0.036
Neg logit contributions
 pleasure-0.039
 satisfaction-0.038
 delight-0.036
 fascination-0.035
 accomplishment-0.034
Token pulled
Token ID5954
Feature activation+2.43e-02

Pos logit contributions
 smoking+0.060
 driving+0.058
anes+0.056
 blocking+0.054
angs+0.054
Neg logit contributions
 satisfaction-0.067
 delight-0.066
 fascination-0.065
 pleasure-0.064
 accomplishment-0.063
Token on
Token ID319
Feature activation+2.36e-02

Pos logit contributions
 pleasure+0.061
 satisfaction+0.060
 accomplishment+0.058
oph+0.057
 fascination+0.056
Neg logit contributions
 parked-0.027
 driving-0.025
 smoking-0.024
 stuck-0.024
 broke-0.023
Token saw
Token ID2497
Feature activation-2.40e-02

Pos logit contributions
 driving+0.058
 parked+0.057
 blocking+0.056
 smoking+0.056
anes+0.053
Neg logit contributions
 delight-0.061
 satisfaction-0.060
 pleasure-0.060
 fascination-0.060
oph-0.058
Token a
Token ID257
Feature activation-1.16e-02

Pos logit contributions
 parked+0.036
 stole+0.034
 driving+0.033
 smoking+0.032
 knocked+0.031
Neg logit contributions
 pleasure-0.048
 satisfaction-0.048
 delight-0.047
 accomplishment-0.046
 fascination-0.046
Token strange
Token ID6283
Feature activation-9.69e-03

Pos logit contributions
 parked+0.018
 driving+0.017
 stole+0.017
 smoking+0.017
 knocked+0.016
Neg logit contributions
 pleasure-0.022
 satisfaction-0.022
 accomplishment-0.022
 delight-0.021
 fascination-0.021
Token and
Token ID290
Feature activation-4.80e-02

Pos logit contributions
 parked+0.015
 driving+0.014
 stole+0.014
 smoking+0.013
 knocked+0.013
Neg logit contributions
 satisfaction-0.019
 pleasure-0.019
 accomplishment-0.019
 delight-0.018
 fascination-0.018
Token unknown
Token ID6439
Feature activation-4.66e-02

Pos logit contributions
 parked+0.078
 driving+0.078
anes+0.077
 smoking+0.073
cher+0.072
Neg logit contributions
 satisfaction-0.081
 accomplishment-0.079
 fascination-0.077
 delight-0.077
 pleasure-0.076
Token thing
Token ID1517
Feature activation-1.23e-01

Pos logit contributions
 parked+0.086
 stole+0.080
hes+0.077
 driven+0.076
 driving+0.076
Neg logit contributions
 pleasure-0.073
 satisfaction-0.072
 difference-0.071
 accomplishment-0.071
 decision-0.070
Token.
Token ID13
Feature activation+5.53e-03

Pos logit contributions
Ro+0.234
 Indians+0.228
Spe+0.226
Wh+0.221
 parked+0.220
Neg logit contributions
 accomplishment-0.115
 delight-0.111
 darling-0.107
 moments-0.106
 joy-0.105
Token She
Token ID1375
Feature activation-2.30e-02

Pos logit contributions
 satisfaction+0.011
 pleasure+0.011
 accomplishment+0.011
 delight+0.010
 fascination+0.010
Neg logit contributions
 parked-0.009
 driving-0.008
 smoking-0.008
 stole-0.008
 knocked-0.008
Token asked
Token ID1965
Feature activation+2.36e-02

Pos logit contributions
 smoking+0.032
 driving+0.032
 parked+0.032
 stole+0.030
 blocking+0.030
Neg logit contributions
 satisfaction-0.044
 delight-0.044
 fascination-0.043
 pleasure-0.042
 accomplishment-0.042
Token her
Token ID607
Feature activation-4.09e-03

Pos logit contributions
 satisfaction+0.053
 pleasure+0.053
 accomplishment+0.052
 fascination+0.049
 delight+0.048
Neg logit contributions
 parked-0.031
 driving-0.030
 smoking-0.029
 broke-0.028
 knocked-0.028
Token mum
Token ID25682
Feature activation+1.94e-02

Pos logit contributions
 parked+0.008
 stole+0.007
 driving+0.007
 smoking+0.007
 disturbed+0.007
Neg logit contributions
 pleasure-0.007
 satisfaction-0.007
 accomplishment-0.007
 delight-0.007
 fascination-0.006
TokenOne
Token ID3198
Feature activation-2.41e-03

Pos logit contributions
 parked+0.050
 smoking+0.049
 driving+0.049
 stole+0.049
 broke+0.047
Neg logit contributions
 satisfaction-0.046
 pleasure-0.044
 delight-0.043
 fascination-0.043
 accomplishment-0.042
Token day
Token ID1110
Feature activation-1.94e-02

Pos logit contributions
 parked+0.004
 driving+0.004
 stole+0.004
 smoking+0.004
 knocked+0.004
Neg logit contributions
 satisfaction-0.004
 pleasure-0.004
 accomplishment-0.004
 delight-0.004
 fascination-0.004
Token,
Token ID11
Feature activation-1.68e-02

Pos logit contributions
 parked+0.048
 stole+0.045
 smoking+0.045
 driving+0.045
 blocking+0.044
Neg logit contributions
 pleasure-0.020
 satisfaction-0.020
 delight-0.018
 accomplishment-0.018
 fascination-0.016
Token a
Token ID257
Feature activation-1.55e-02

Pos logit contributions
 parked+0.024
 stole+0.022
 smoking+0.022
 driving+0.022
 blocking+0.020
Neg logit contributions
 pleasure-0.036
 satisfaction-0.035
 accomplishment-0.034
 delight-0.034
 fascination-0.033
Token strange
Token ID6283
Feature activation-1.37e-02

Pos logit contributions
 parked+0.032
 driving+0.030
 stole+0.030
 smoking+0.030
 blocking+0.029
Neg logit contributions
 pleasure-0.023
 satisfaction-0.023
 accomplishment-0.022
 delight-0.021
 fascination-0.020
Token thing
Token ID1517
Feature activation-1.22e-01

Pos logit contributions
 parked+0.022
 stole+0.020
 driving+0.020
 smoking+0.020
 blocking+0.019
Neg logit contributions
 pleasure-0.026
 satisfaction-0.026
 accomplishment-0.026
 delight-0.024
 fascination-0.024
Token happened
Token ID3022
Feature activation-3.82e-02

Pos logit contributions
Wh+0.179
chard+0.173
 Indians+0.172
Spe+0.168
Lost+0.164
Neg logit contributions
 moments-0.162
 accomplishment-0.160
 satisfaction-0.152
 goodness-0.152
 delight-0.151
Token.
Token ID13
Feature activation-3.58e-02

Pos logit contributions
 parked+0.064
 stole+0.063
saw+0.060
Ro+0.059
through+0.058
Neg logit contributions
 accomplishment-0.064
 pleasure-0.063
 satisfaction-0.061
 fascination-0.059
 accomplishments-0.057
Token A
Token ID317
Feature activation-2.25e-02

Pos logit contributions
 parked+0.060
 driving+0.058
 stole+0.057
 smoking+0.055
 blocking+0.054
Neg logit contributions
 pleasure-0.065
 satisfaction-0.064
 delight-0.063
 fascination-0.059
 accomplishment-0.058
Token small
Token ID1402
Feature activation+9.46e-04

Pos logit contributions
 parked+0.042
 driving+0.038
 stole+0.038
 smoking+0.037
 blocking+0.036
Neg logit contributions
 pleasure-0.039
 satisfaction-0.039
 accomplishment-0.038
 delight-0.036
 fascination-0.035
Token boy
Token ID2933
Feature activation-5.88e-02

Pos logit contributions
 satisfaction+0.002
 pleasure+0.002
 accomplishment+0.002
 delight+0.002
 fascination+0.002
Neg logit contributions
 parked-0.001
 driving-0.001
 stole-0.001
 smoking-0.001
 knocked-0.001
Token a
Token ID257
Feature activation-1.89e-02

Pos logit contributions
 stole+0.049
 parked+0.048
through+0.045
cher+0.045
wheel+0.044
Neg logit contributions
 pleasure-0.053
 satisfaction-0.052
 delight-0.049
 amaz-0.049
 fascination-0.049
Token man
Token ID582
Feature activation-7.19e-02

Pos logit contributions
 parked+0.031
 driving+0.028
 smoking+0.028
 stole+0.028
 disturbed+0.027
Neg logit contributions
 pleasure-0.035
 satisfaction-0.035
 accomplishment-0.035
 delight-0.033
 fascination-0.032
Token with
Token ID351
Feature activation-3.36e-02

Pos logit contributions
Ro+0.108
 CA+0.100
 Bang+0.099
 broke+0.096
 Crash+0.096
Neg logit contributions
 fascination-0.117
 amaz-0.115
 accomplishment-0.115
 delight-0.113
 satisfaction-0.113
Token a
Token ID257
Feature activation-2.03e-02

Pos logit contributions
 parked+0.049
 stole+0.047
saw+0.044
 smoking+0.044
 disturbed+0.042
Neg logit contributions
 satisfaction-0.070
 pleasure-0.069
 fascination-0.067
 accomplishment-0.067
 delight-0.067
Token real
Token ID1103
Feature activation-2.03e-02

Pos logit contributions
 parked+0.030
 driving+0.027
 stole+0.027
 smoking+0.026
 disturbed+0.025
Neg logit contributions
 pleasure-0.042
 satisfaction-0.041
 accomplishment-0.041
 fascination-0.040
 delight-0.039
Token gun
Token ID2485
Feature activation-1.21e-01

Pos logit contributions
 parked+0.023
 stole+0.019
 smoking+0.019
 driving+0.019
through+0.018
Neg logit contributions
 pleasure-0.050
 satisfaction-0.050
 accomplishment-0.049
 delight-0.047
 fascination-0.047
Token.
Token ID13
Feature activation-2.77e-02

Pos logit contributions
through+0.257
saw+0.256
chard+0.252
Ro+0.250
Free+0.243
Neg logit contributions
 and-0.105
 until-0.076
 thanks-0.074
ness-0.072
 hope-0.071
Token He
Token ID679
Feature activation-1.88e-02

Pos logit contributions
 parked+0.050
 driving+0.048
 stole+0.046
 smoking+0.045
 knocked+0.045
Neg logit contributions
 pleasure-0.047
 satisfaction-0.047
 delight-0.046
 fascination-0.043
 amaz-0.043
Token is
Token ID318
Feature activation-6.40e-02

Pos logit contributions
 parked+0.033
 smoking+0.032
 driving+0.032
 stole+0.031
 disturbed+0.030
Neg logit contributions
 satisfaction-0.031
 pleasure-0.030
 accomplishment-0.030
 fascination-0.029
 delight-0.029
Token angry
Token ID7954
Feature activation-3.30e-02

Pos logit contributions
 stole+0.101
Ro+0.099
 CA+0.099
anes+0.099
saw+0.095
Neg logit contributions
 pleasure-0.104
 accomplishment-0.100
 delight-0.097
 satisfaction-0.097
 fascination-0.096
Token.
Token ID13
Feature activation-2.79e-02

Pos logit contributions
 stole+0.072
 parked+0.070
chard+0.068
 CA+0.067
 blocked+0.066
Neg logit contributions
 pleasure-0.036
 satisfaction-0.036
 accomplishment-0.036
 delight-0.035
 goodness-0.033
Token,
Token ID11
Feature activation-2.44e-02

Pos logit contributions
 parked+0.034
 smoking+0.032
 driving+0.032
 stole+0.032
 blocking+0.031
Neg logit contributions
 pleasure-0.016
 satisfaction-0.016
 accomplishment-0.015
 delight-0.015
 fascination-0.013
Token they
Token ID484
Feature activation-6.40e-03

Pos logit contributions
 parked+0.034
 smoking+0.033
 driving+0.032
 stole+0.032
 guarding+0.031
Neg logit contributions
 pleasure-0.051
 satisfaction-0.051
 accomplishment-0.049
 delight-0.048
 fascination-0.045
Token see
Token ID766
Feature activation-2.01e-02

Pos logit contributions
 parked+0.010
 smoking+0.010
 driving+0.010
 stole+0.010
 disturbed+0.009
Neg logit contributions
 satisfaction-0.012
 pleasure-0.012
 accomplishment-0.012
 delight-0.011
 fascination-0.011
Token a
Token ID257
Feature activation-3.03e-02

Pos logit contributions
 parked+0.030
 stole+0.029
 driving+0.028
 smoking+0.027
through+0.026
Neg logit contributions
 satisfaction-0.040
 pleasure-0.040
 delight-0.038
 accomplishment-0.038
 amaz-0.037
Token new
Token ID649
Feature activation-9.00e-03

Pos logit contributions
 parked+0.052
 driving+0.048
 stole+0.048
 smoking+0.047
 driven+0.045
Neg logit contributions
 pleasure-0.055
 satisfaction-0.053
 accomplishment-0.053
 delight-0.050
 difference-0.050
Token thing
Token ID1517
Feature activation-1.21e-01

Pos logit contributions
 parked+0.012
 stole+0.011
 smoking+0.011
 driving+0.011
 disturbed+0.011
Neg logit contributions
 pleasure-0.020
 satisfaction-0.020
 accomplishment-0.019
 delight-0.019
 fascination-0.018
Token in
Token ID287
Feature activation-2.13e-02

Pos logit contributions
 Indians+0.223
Lost+0.220
Wh+0.214
 stole+0.212
 Stri+0.210
Neg logit contributions
 accomplishment-0.132
 goodness-0.117
 and-0.116
 until-0.109
 delight-0.108
Token the
Token ID262
Feature activation-1.51e-02

Pos logit contributions
 parked+0.036
 stole+0.035
 driving+0.034
 smoking+0.034
 disturbed+0.032
Neg logit contributions
 satisfaction-0.040
 pleasure-0.039
 accomplishment-0.038
 delight-0.038
 fascination-0.037
Token park
Token ID3952
Feature activation-6.21e-02

Pos logit contributions
 parked+0.027
 driving+0.026
 stole+0.025
 smoking+0.025
 knocked+0.025
Neg logit contributions
 satisfaction-0.026
 pleasure-0.026
 delight-0.025
 accomplishment-0.024
 difference-0.024
Token.
Token ID13
Feature activation-4.11e-02

Pos logit contributions
through+0.124
Ro+0.124
cher+0.122
 smoking+0.118
 driving+0.118
Neg logit contributions
 decision-0.073
 satisfaction-0.072
 delight-0.071
 accomplishment-0.071
 amaz-0.066
Token It
Token ID632
Feature activation-4.03e-02

Pos logit contributions
 driving+0.084
 parked+0.083
 smoking+0.081
cher+0.080
 guarding+0.080
Neg logit contributions
 delight-0.061
 pleasure-0.060
 satisfaction-0.059
 accomplishment-0.052
 amaz-0.050
Token small
Token ID1402
Feature activation-4.34e-03

Pos logit contributions
 parked+0.040
 driving+0.037
 stole+0.037
 smoking+0.037
 disturbed+0.035
Neg logit contributions
 pleasure-0.042
 satisfaction-0.041
 accomplishment-0.041
 delight-0.039
 fascination-0.038
Token,
Token ID11
Feature activation-2.37e-02

Pos logit contributions
 parked+0.006
 driving+0.006
 stole+0.005
 smoking+0.005
 knocked+0.005
Neg logit contributions
 satisfaction-0.009
 pleasure-0.009
 accomplishment-0.009
 delight-0.009
 fascination-0.009
Token round
Token ID2835
Feature activation-2.54e-02

Pos logit contributions
 parked+0.035
 driving+0.033
 stole+0.033
 smoking+0.033
 knocked+0.032
Neg logit contributions
 satisfaction-0.046
 pleasure-0.045
 accomplishment-0.045
 delight-0.043
 fascination-0.043
Token,
Token ID11
Feature activation-1.42e-02

Pos logit contributions
 parked+0.040
 stole+0.038
 driving+0.038
 smoking+0.038
 disturbed+0.037
Neg logit contributions
 satisfaction-0.048
 accomplishment-0.046
 delight-0.045
 pleasure-0.045
 decision-0.043
Token green
Token ID4077
Feature activation-3.78e-02

Pos logit contributions
 parked+0.021
 stole+0.020
 driving+0.020
 smoking+0.020
 knocked+0.019
Neg logit contributions
 satisfaction-0.028
 pleasure-0.027
 accomplishment-0.027
 delight-0.026
 fascination-0.026
Token thing
Token ID1517
Feature activation-1.21e-01

Pos logit contributions
 parked+0.061
 stole+0.059
 driving+0.058
cher+0.057
 disturbed+0.057
Neg logit contributions
 satisfaction-0.067
 delight-0.065
 accomplishment-0.064
 pleasure-0.063
 fascination-0.061
Token on
Token ID319
Feature activation-2.14e-02

Pos logit contributions
 Indians+0.243
cher+0.239
Lost+0.238
chard+0.224
 driven+0.223
Neg logit contributions
 and-0.115
 until-0.092
 goodness-0.089
,-0.088
 de-0.086
Token the
Token ID262
Feature activation-1.24e-02

Pos logit contributions
 stole+0.035
 parked+0.035
 smoking+0.033
 driving+0.033
 disturbed+0.033
Neg logit contributions
 satisfaction-0.039
 delight-0.038
 accomplishment-0.037
 pleasure-0.037
 fascination-0.036
Token ground
Token ID2323
Feature activation-7.14e-02

Pos logit contributions
 parked+0.021
 stole+0.020
 smoking+0.020
 driving+0.020
 disturbed+0.020
Neg logit contributions
 satisfaction-0.022
 pleasure-0.022
 accomplishment-0.021
 delight-0.021
 fascination-0.020
Token.
Token ID13
Feature activation-2.80e-02

Pos logit contributions
Ro+0.156
 CA+0.156
 Found+0.155
cher+0.154
 stole+0.153
Neg logit contributions
 accomplishment-0.062
 decision-0.060
 until-0.059
 satisfaction-0.059
 delight-0.058
Token It
Token ID632
Feature activation-4.57e-02

Pos logit contributions
 parked+0.050
 stole+0.047
 driving+0.047
 smoking+0.047
 disturbed+0.046
Neg logit contributions
 pleasure-0.048
 satisfaction-0.045
 delight-0.045
 accomplishment-0.042
 fascination-0.041
Token day
Token ID1110
Feature activation-1.91e-02

Pos logit contributions
 parked+0.014
 stole+0.014
 smoking+0.014
 driving+0.013
 knocked+0.013
Neg logit contributions
 satisfaction-0.012
 pleasure-0.012
 accomplishment-0.011
 delight-0.011
 fascination-0.010
Token,
Token ID11
Feature activation-2.41e-02

Pos logit contributions
 parked+0.047
 stole+0.044
 smoking+0.044
 driving+0.044
 blocking+0.042
Neg logit contributions
 pleasure-0.020
 satisfaction-0.020
 accomplishment-0.019
 delight-0.018
 fascination-0.016
Token a
Token ID257
Feature activation-1.81e-02

Pos logit contributions
 parked+0.037
 stole+0.034
 smoking+0.034
 driving+0.034
 blocking+0.032
Neg logit contributions
 pleasure-0.048
 satisfaction-0.047
 accomplishment-0.045
 delight-0.045
 fascination-0.042
Token shiny
Token ID22441
Feature activation-1.23e-02

Pos logit contributions
 parked+0.038
 smoking+0.036
 driving+0.036
 stole+0.036
 blocking+0.035
Neg logit contributions
 pleasure-0.026
 satisfaction-0.026
 accomplishment-0.025
 delight-0.023
 fascination-0.022
Token new
Token ID649
Feature activation-1.36e-03

Pos logit contributions
 parked+0.015
 smoking+0.014
 stole+0.014
 driving+0.014
 disturbed+0.013
Neg logit contributions
 pleasure-0.028
 satisfaction-0.028
 accomplishment-0.028
 delight-0.026
 fascination-0.026
Token vehicle
Token ID4038
Feature activation-1.21e-01

Pos logit contributions
 parked+0.002
 smoking+0.002
 stole+0.002
 driving+0.002
 disturbed+0.002
Neg logit contributions
 pleasure-0.003
 satisfaction-0.003
 accomplishment-0.003
 delight-0.003
 fascination-0.003
Token came
Token ID1625
Feature activation-4.26e-02

Pos logit contributions
through+0.210
angs+0.206
Ro+0.198
chard+0.187
 Indians+0.178
Neg logit contributions
 and-0.144
 hope-0.138
 accomplishment-0.135
 satisfaction-0.132
ness-0.128
Token to
Token ID284
Feature activation-1.95e-02

Pos logit contributions
 stole+0.062
 parked+0.060
Ro+0.059
 smoking+0.058
 disturbed+0.054
Neg logit contributions
 pleasure-0.080
 accomplishment-0.077
 satisfaction-0.073
 fascination-0.071
 delight-0.070
Token town
Token ID3240
Feature activation-4.13e-02

Pos logit contributions
 parked+0.031
 stole+0.030
 smoking+0.030
through+0.029
 blocking+0.029
Neg logit contributions
 pleasure-0.038
 satisfaction-0.037
 accomplishment-0.036
 delight-0.035
 fascination-0.034
Token.
Token ID13
Feature activation-3.40e-02

Pos logit contributions
 CA+0.069
Ro+0.068
 stole+0.067
 parked+0.064
 broke+0.063
Neg logit contributions
 accomplishment-0.068
 pleasure-0.067
 satisfaction-0.066
 delight-0.063
 fascination-0.061
Token Everyone
Token ID11075
Feature activation-3.55e-02

Pos logit contributions
 parked+0.062
 smoking+0.060
 driving+0.059
 stole+0.059
 blocking+0.058
Neg logit contributions
 pleasure-0.058
 satisfaction-0.057
 delight-0.055
 accomplishment-0.051
 fascination-0.050
Token\n
Token ID198
Feature activation-1.61e-02

Pos logit contributions
 parked+0.031
 stole+0.030
 smoking+0.030
 driving+0.030
 disturbed+0.028
Neg logit contributions
 satisfaction-0.030
 pleasure-0.029
 accomplishment-0.029
 delight-0.028
 fascination-0.027
TokenThen
Token ID6423
Feature activation-2.31e-02

Pos logit contributions
 parked+0.029
 driving+0.029
 smoking+0.029
 stole+0.028
 disturbed+0.027
Neg logit contributions
 satisfaction-0.026
 accomplishment-0.024
 pleasure-0.024
 delight-0.023
 fascination-0.022
Token,
Token ID11
Feature activation-4.12e-02

Pos logit contributions
 parked+0.034
 smoking+0.031
 driving+0.030
 stole+0.030
through+0.029
Neg logit contributions
 satisfaction-0.047
 delight-0.045
 pleasure-0.045
 accomplishment-0.045
 fascination-0.043
Token a
Token ID257
Feature activation-2.60e-02

Pos logit contributions
 parked+0.054
anes+0.053
 smoking+0.053
 driving+0.052
through+0.051
Neg logit contributions
 satisfaction-0.085
 accomplishment-0.081
 pleasure-0.080
 delight-0.080
 relief-0.076
Token bad
Token ID2089
Feature activation-9.01e-03

Pos logit contributions
 parked+0.044
 stole+0.041
 driving+0.040
 smoking+0.040
 blocking+0.038
Neg logit contributions
 satisfaction-0.048
 pleasure-0.047
 accomplishment-0.046
 delight-0.045
 fascination-0.043
Token thing
Token ID1517
Feature activation-1.21e-01

Pos logit contributions
 parked+0.013
 stole+0.012
 smoking+0.012
 driving+0.012
 blocking+0.011
Neg logit contributions
 satisfaction-0.019
 pleasure-0.019
 accomplishment-0.019
 delight-0.018
 fascination-0.018
Token happened
Token ID3022
Feature activation-4.43e-02

Pos logit contributions
chard+0.219
through+0.183
nam+0.181
 Indians+0.174
 smoking+0.165
Neg logit contributions
 moments-0.159
 accomplishment-0.157
 hope-0.150
 satisfaction-0.147
 relief-0.146
Token.
Token ID13
Feature activation-4.15e-02

Pos logit contributions
 parked+0.082
Ro+0.080
 stole+0.080
through+0.079
saw+0.078
Neg logit contributions
 accomplishment-0.064
 satisfaction-0.059
 pleasure-0.058
 goodness-0.057
 decision-0.055
Token A
Token ID317
Feature activation-3.58e-02

Pos logit contributions
 parked+0.075
 driving+0.073
 stole+0.073
 knocked+0.071
 guarding+0.070
Neg logit contributions
 satisfaction-0.068
 pleasure-0.064
 delight-0.062
 accomplishment-0.060
 relief-0.059
Token big
Token ID1263
Feature activation+3.53e-03

Pos logit contributions
 parked+0.061
 stole+0.055
through+0.052
 driving+0.052
 smoking+0.051
Neg logit contributions
 satisfaction-0.064
 accomplishment-0.063
 pleasure-0.063
 difference-0.061
 decision-0.060
Token car
Token ID1097
Feature activation-1.02e-01

Pos logit contributions
 satisfaction+0.009
 pleasure+0.008
 accomplishment+0.008
 delight+0.008
 fascination+0.008
Neg logit contributions
 parked-0.004
 driving-0.004
 smoking-0.004
 stole-0.004
 knocked-0.003
Token day
Token ID1110
Feature activation-2.22e-02

Pos logit contributions
 parked+0.021
 stole+0.020
 smoking+0.020
 driving+0.020
 blocking+0.019
Neg logit contributions
 satisfaction-0.016
 pleasure-0.016
 accomplishment-0.016
 delight-0.014
 fascination-0.014
Token,
Token ID11
Feature activation-3.75e-02

Pos logit contributions
 parked+0.056
 stole+0.053
 smoking+0.053
 driving+0.052
 blocking+0.051
Neg logit contributions
 pleasure-0.022
 satisfaction-0.021
 accomplishment-0.020
 delight-0.019
 fascination-0.017
Token a
Token ID257
Feature activation-2.68e-02

Pos logit contributions
 parked+0.069
angs+0.067
 stole+0.066
wheel+0.064
cher+0.064
Neg logit contributions
 pleasure-0.060
 satisfaction-0.058
 delight-0.056
 accomplishment-0.054
 fascination-0.051
Token new
Token ID649
Feature activation-1.38e-02

Pos logit contributions
 parked+0.057
 driving+0.053
 stole+0.053
 smoking+0.052
 blocking+0.051
Neg logit contributions
 pleasure-0.039
 satisfaction-0.038
 accomplishment-0.036
 delight-0.034
 difference-0.032
Token green
Token ID4077
Feature activation-3.06e-02

Pos logit contributions
 parked+0.022
 smoking+0.021
 stole+0.021
 driving+0.020
 blocking+0.020
Neg logit contributions
 pleasure-0.027
 satisfaction-0.026
 accomplishment-0.026
 delight-0.024
 fascination-0.024
Token truck
Token ID7779
Feature activation-1.20e-01

Pos logit contributions
 parked+0.032
 smoking+0.031
through+0.030
 stole+0.030
 blocking+0.029
Neg logit contributions
 pleasure-0.072
 satisfaction-0.071
 accomplishment-0.070
 delight-0.068
 decision-0.066
Token came
Token ID1625
Feature activation-4.22e-02

Pos logit contributions
through+0.219
nam+0.218
chard+0.215
angs+0.199
cher+0.192
Neg logit contributions
 and-0.142
 thanks-0.122
 hope-0.120
 satisfaction-0.120
 loves-0.113
Token to
Token ID284
Feature activation-2.79e-02

Pos logit contributions
 stole+0.066
 parked+0.066
 smoking+0.063
Ro+0.061
chard+0.061
Neg logit contributions
 pleasure-0.072
 accomplishment-0.070
 satisfaction-0.066
 fascination-0.065
 delight-0.064
Token play
Token ID711
Feature activation-5.96e-02

Pos logit contributions
 parked+0.047
through+0.046
 stole+0.046
angs+0.045
 smoking+0.045
Neg logit contributions
 pleasure-0.050
 satisfaction-0.049
 accomplishment-0.048
 delight-0.046
 fascination-0.045
Token.
Token ID13
Feature activation-4.49e-02

Pos logit contributions
through+0.119
 knocked+0.117
 stole+0.116
 rang+0.116
 parked+0.115
Neg logit contributions
 accomplishment-0.074
 agreement-0.073
 satisfaction-0.072
 delight-0.072
 difference-0.069
Token The
Token ID383
Feature activation-2.48e-02

Pos logit contributions
 parked+0.089
 driving+0.086
cher+0.085
 smoking+0.085
 stole+0.085
Neg logit contributions
 pleasure-0.066
 delight-0.065
 satisfaction-0.063
 accomplishment-0.056
 fascination-0.055
Token,
Token ID11
Feature activation-7.52e-04

Pos logit contributions
 satisfaction+0.009
 pleasure+0.009
 accomplishment+0.009
 fascination+0.008
 delight+0.008
Neg logit contributions
 parked-0.012
 driving-0.012
 stole-0.011
 smoking-0.011
 knocked-0.011
Token they
Token ID484
Feature activation+6.41e-03

Pos logit contributions
 parked+0.001
 driving+0.001
 smoking+0.001
 stole+0.001
 disturbed+0.001
Neg logit contributions
 satisfaction-0.002
 pleasure-0.002
 accomplishment-0.002
 delight-0.002
 fascination-0.002
Token see
Token ID766
Feature activation-2.12e-02

Pos logit contributions
 satisfaction+0.013
 pleasure+0.013
 accomplishment+0.013
 delight+0.012
 fascination+0.012
Neg logit contributions
 parked-0.010
 stole-0.009
 driving-0.009
 smoking-0.009
 knocked-0.009
Token a
Token ID257
Feature activation-2.97e-02

Pos logit contributions
 parked+0.031
 stole+0.030
 driving+0.029
 smoking+0.028
 guarding+0.027
Neg logit contributions
 pleasure-0.042
 satisfaction-0.042
 delight-0.041
 accomplishment-0.039
 fascination-0.039
Token new
Token ID649
Feature activation-6.85e-03

Pos logit contributions
 parked+0.052
 driving+0.048
 stole+0.047
 smoking+0.046
 driven+0.045
Neg logit contributions
 pleasure-0.052
 accomplishment-0.051
 satisfaction-0.051
 delight-0.048
 difference-0.048
Token thing
Token ID1517
Feature activation-1.20e-01

Pos logit contributions
 parked+0.010
 stole+0.010
 smoking+0.010
 driving+0.010
 disturbed+0.009
Neg logit contributions
 pleasure-0.014
 satisfaction-0.014
 accomplishment-0.014
 delight-0.013
 fascination-0.013
Token in
Token ID287
Feature activation-2.82e-02

Pos logit contributions
Lost+0.215
 Indians+0.212
Wh+0.209
Ro+0.205
 stole+0.204
Neg logit contributions
 accomplishment-0.132
 goodness-0.125
 darling-0.117
 and-0.116
 delight-0.115
Token the
Token ID262
Feature activation-7.19e-03

Pos logit contributions
 parked+0.048
 stole+0.047
 driving+0.047
 smoking+0.045
 disturbed+0.044
Neg logit contributions
 satisfaction-0.051
 pleasure-0.049
 delight-0.048
 accomplishment-0.048
 fascination-0.047
Token gym
Token ID11550
Feature activation-8.02e-02

Pos logit contributions
 parked+0.010
 driving+0.009
 stole+0.009
 smoking+0.009
 disturbed+0.009
Neg logit contributions
 satisfaction-0.016
 pleasure-0.015
 accomplishment-0.015
 delight-0.015
 fascination-0.015
Token.
Token ID13
Feature activation-1.93e-02

Pos logit contributions
through+0.155
Ro+0.152
 CA+0.148
 stole+0.148
 bitten+0.147
Neg logit contributions
 accomplishment-0.093
 satisfaction-0.091
 decision-0.091
 delight-0.086
 difference-0.080
Token It
Token ID632
Feature activation-4.14e-02

Pos logit contributions
 parked+0.033
 driving+0.033
 smoking+0.032
 stole+0.032
 blocking+0.030
Neg logit contributions
 pleasure-0.035
 satisfaction-0.034
 delight-0.033
 accomplishment-0.032
 fascination-0.031
Token,
Token ID11
Feature activation+2.29e-03

Pos logit contributions
 parked+0.013
 stole+0.012
 driving+0.012
 smoking+0.012
 disturbed+0.011
Neg logit contributions
 pleasure-0.008
 satisfaction-0.008
 accomplishment-0.007
 delight-0.007
 fascination-0.007
Token they
Token ID484
Feature activation-1.44e-02

Pos logit contributions
 satisfaction+0.005
 pleasure+0.005
 accomplishment+0.005
 delight+0.005
 fascination+0.005
Neg logit contributions
 parked-0.003
 driving-0.003
 smoking-0.003
 stole-0.003
 disturbed-0.002
Token see
Token ID766
Feature activation-3.22e-02

Pos logit contributions
 parked+0.023
 driving+0.023
 smoking+0.023
 stole+0.022
 blocking+0.021
Neg logit contributions
 satisfaction-0.026
 pleasure-0.026
 accomplishment-0.025
 delight-0.024
 fascination-0.024
Token a
Token ID257
Feature activation-3.98e-02

Pos logit contributions
 stole+0.054
 parked+0.050
 guarding+0.049
cher+0.049
 blocking+0.048
Neg logit contributions
 pleasure-0.055
 satisfaction-0.055
 delight-0.053
 amaz-0.053
 fascination-0.051
Token real
Token ID1103
Feature activation-2.06e-02

Pos logit contributions
 parked+0.071
 stole+0.066
 bitten+0.064
 driving+0.064
 smoking+0.063
Neg logit contributions
 pleasure-0.065
 accomplishment-0.064
 satisfaction-0.062
 difference-0.062
 decision-0.060
Token train
Token ID4512
Feature activation-1.20e-01

Pos logit contributions
 parked+0.023
 stole+0.022
 smoking+0.021
 blocking+0.020
 driving+0.019
Neg logit contributions
 accomplishment-0.048
 pleasure-0.048
 satisfaction-0.048
 delight-0.045
 fascination-0.045
Token outside
Token ID2354
Feature activation-6.23e-02

Pos logit contributions
angs+0.213
 bitten+0.208
through+0.197
chard+0.197
Ro+0.196
Neg logit contributions
 and-0.137
 difference-0.130
 accomplishment-0.124
 until-0.122
 decision-0.117
Token.
Token ID13
Feature activation-1.96e-02

Pos logit contributions
cher+0.117
 stole+0.115
chard+0.113
 parked+0.113
wheel+0.113
Neg logit contributions
 satisfaction-0.083
 delight-0.081
 accomplishment-0.078
 decision-0.078
 amaz-0.076
Token It
Token ID632
Feature activation-4.98e-02

Pos logit contributions
 parked+0.035
 driving+0.034
 smoking+0.033
 stole+0.033
 disturbed+0.031
Neg logit contributions
 pleasure-0.034
 satisfaction-0.034
 delight-0.034
 accomplishment-0.032
 fascination-0.032
Token is
Token ID318
Feature activation-6.20e-02

Pos logit contributions
chard+0.120
 smoking+0.119
 spies+0.111
 bitten+0.111
cher+0.110
Neg logit contributions
 delight-0.039
 pleasure-0.039
 fascination-0.037
 accomplishment-0.037
 difference-0.036
Token big
Token ID1263
Feature activation-3.46e-03

Pos logit contributions
 stole+0.124
cher+0.118
aro+0.111
Ro+0.110
saw+0.108
Neg logit contributions
 pleasure-0.092
 accomplishment-0.089
 delight-0.085
 fascination-0.085
 amaz-0.083
Token they
Token ID484
Feature activation+3.75e-03

Pos logit contributions
 parked+0.039
 smoking+0.038
 stole+0.037
 driving+0.036
 blocking+0.035
Neg logit contributions
 pleasure-0.059
 satisfaction-0.057
 accomplishment-0.056
 delight-0.055
 fascination-0.052
Token see
Token ID766
Feature activation-1.71e-02

Pos logit contributions
 satisfaction+0.007
 pleasure+0.007
 accomplishment+0.007
 delight+0.007
 fascination+0.007
Neg logit contributions
 parked-0.006
 stole-0.006
 driving-0.005
 smoking-0.005
 knocked-0.005
Token a
Token ID257
Feature activation-3.72e-02

Pos logit contributions
 parked+0.023
 stole+0.022
 driving+0.022
 smoking+0.022
 knocked+0.020
Neg logit contributions
 pleasure-0.036
 satisfaction-0.035
 delight-0.034
 accomplishment-0.033
 fascination-0.033
Token big
Token ID1263
Feature activation+2.89e-03

Pos logit contributions
 parked+0.063
 driving+0.059
 smoking+0.057
 stole+0.057
 bitten+0.056
Neg logit contributions
 pleasure-0.065
 accomplishment-0.064
 satisfaction-0.063
 difference-0.061
 delight-0.060
Token red
Token ID2266
Feature activation-3.13e-02

Pos logit contributions
 satisfaction+0.006
 pleasure+0.006
 accomplishment+0.006
 delight+0.006
 fascination+0.006
Neg logit contributions
 parked-0.004
 driving-0.004
 smoking-0.004
 stole-0.004
 disturbed-0.003
Token thing
Token ID1517
Feature activation-1.20e-01

Pos logit contributions
 parked+0.045
 stole+0.043
 disturbed+0.042
 driving+0.042
 smoking+0.042
Neg logit contributions
 satisfaction-0.062
 pleasure-0.062
 accomplishment-0.061
 delight-0.059
 fascination-0.059
Token on
Token ID319
Feature activation-3.66e-02

Pos logit contributions
 Indians+0.239
Lost+0.229
 villains+0.225
Wh+0.220
Ro+0.218
Neg logit contributions
 and-0.119
 accomplishment-0.111
 goodness-0.109
 de-0.103
 until-0.100
Token a
Token ID257
Feature activation-3.45e-02

Pos logit contributions
cher+0.065
 stole+0.065
 parked+0.063
 driving+0.063
 smoking+0.063
Neg logit contributions
 satisfaction-0.059
 delight-0.058
 accomplishment-0.058
 pleasure-0.056
 fascination-0.055
Token green
Token ID4077
Feature activation-4.40e-02

Pos logit contributions
 parked+0.057
 driving+0.056
 stole+0.056
 disturbed+0.055
 driven+0.054
Neg logit contributions
 pleasure-0.057
 satisfaction-0.057
 accomplishment-0.055
 delight-0.055
 fascination-0.055
Token plant
Token ID4618
Feature activation-9.35e-02

Pos logit contributions
 parked+0.072
 driving+0.070
 stole+0.069
 disturbed+0.069
 smoking+0.068
Neg logit contributions
 satisfaction-0.076
 delight-0.074
 pleasure-0.073
 accomplishment-0.073
 fascination-0.072
Token.
Token ID13
Feature activation-3.19e-02

Pos logit contributions
Ro+0.185
 Crash+0.185
 wrecked+0.184
through+0.183
 driven+0.182
Neg logit contributions
 difference-0.087
 goodness-0.082
 delight-0.078
 accomplishment-0.077
 satisfaction-0.075
Token when
Token ID618
Feature activation-5.34e-02

Pos logit contributions
 stole+0.051
 parked+0.050
 CA+0.048
 knocked+0.046
 blocked+0.046
Neg logit contributions
 delight-0.047
 satisfaction-0.047
 pleasure-0.047
 accomplishment-0.046
 fascination-0.045
Token he
Token ID339
Feature activation-5.03e-02

Pos logit contributions
 parked+0.073
 stole+0.072
angs+0.071
cher+0.070
tr+0.066
Neg logit contributions
 accomplishment-0.106
 satisfaction-0.103
 pleasure-0.102
 delight-0.101
 fascination-0.100
Token saw
Token ID2497
Feature activation-4.88e-02

Pos logit contributions
 blocking+0.076
 Enter+0.075
aro+0.074
 driving+0.073
 smoking+0.073
Neg logit contributions
 satisfaction-0.079
 delight-0.079
 fascination-0.077
 pleasure-0.076
oph-0.075
Token a
Token ID257
Feature activation-2.56e-02

Pos logit contributions
 stole+0.083
 parked+0.083
 guarding+0.080
cher+0.079
ider+0.079
Neg logit contributions
 satisfaction-0.076
 delight-0.076
 pleasure-0.073
 amaz-0.070
 accomplishment-0.070
Token big
Token ID1263
Feature activation+6.97e-03

Pos logit contributions
 parked+0.041
 stole+0.038
 driving+0.038
 smoking+0.037
 disturbed+0.036
Neg logit contributions
 pleasure-0.048
 satisfaction-0.047
 accomplishment-0.047
 delight-0.044
 fascination-0.044
Token vehicle
Token ID4038
Feature activation-1.19e-01

Pos logit contributions
 satisfaction+0.017
 pleasure+0.016
 accomplishment+0.016
 delight+0.015
 fascination+0.015
Neg logit contributions
 parked-0.008
 driving-0.008
 smoking-0.008
 stole-0.008
 broke-0.007
Token coming
Token ID2406
Feature activation-6.30e-02

Pos logit contributions
 villains+0.220
Ro+0.217
 Indians+0.215
angs+0.211
 bitten+0.205
Neg logit contributions
 accomplishment-0.132
 decision-0.130
 and-0.129
 until-0.119
 goodness-0.115
Token down
Token ID866
Feature activation-6.82e-02

Pos logit contributions
 stole+0.111
 disturbed+0.108
 villains+0.107
 killed+0.105
 smoking+0.105
Neg logit contributions
 accomplishment-0.085
 fascination-0.082
 satisfaction-0.082
 pleasure-0.081
 delight-0.077
Token the
Token ID262
Feature activation-2.25e-02

Pos logit contributions
cher+0.153
ophone+0.151
 stole+0.147
 villains+0.142
tr+0.139
Neg logit contributions
 satisfaction-0.076
 accomplishment-0.072
 delight-0.064
 joy-0.061
 fascination-0.057
Token road
Token ID2975
Feature activation-6.38e-02

Pos logit contributions
 stole+0.032
 smoking+0.031
 parked+0.031
 disturbed+0.030
 blocked+0.028
Neg logit contributions
 satisfaction-0.043
 accomplishment-0.043
 pleasure-0.043
 delight-0.041
 difference-0.039
Token.
Token ID13
Feature activation-1.98e-02

Pos logit contributions
Ro+0.119
 CA+0.119
 stole+0.117
CA+0.110
 bitten+0.110
Neg logit contributions
 accomplishment-0.090
 decision-0.085
 satisfaction-0.082
 difference-0.079
 delight-0.078
Token named
Token ID3706
Feature activation-4.56e-02

Pos logit contributions
 parked+0.036
 driving+0.032
 blocking+0.032
 smoking+0.030
through+0.030
Neg logit contributions
 satisfaction-0.088
 pleasure-0.084
 delight-0.083
 accomplishment-0.081
 fascination-0.080
Token Tim
Token ID5045
Feature activation-2.28e-02

Pos logit contributions
 smoking+0.069
 parked+0.069
 blocking+0.069
 driving+0.069
 blown+0.066
Neg logit contributions
 pleasure-0.085
 satisfaction-0.082
 delight-0.079
 goodness-0.070
 accomplishment-0.069
Token found
Token ID1043
Feature activation-5.43e-02

Pos logit contributions
 parked+0.028
 smoking+0.028
 driving+0.027
 blocking+0.026
through+0.025
Neg logit contributions
 satisfaction-0.049
 pleasure-0.047
 accomplishment-0.045
 delight-0.045
 fascination-0.045
Token a
Token ID257
Feature activation-2.20e-02

Pos logit contributions
aro+0.096
 stole+0.095
 parked+0.095
riages+0.094
cher+0.093
Neg logit contributions
 satisfaction-0.081
 pleasure-0.076
 delight-0.074
 accomplishment-0.070
 relief-0.068
Token toy
Token ID13373
Feature activation-1.12e-01

Pos logit contributions
 parked+0.038
 driving+0.034
 stole+0.034
 smoking+0.034
 disturbed+0.032
Neg logit contributions
 pleasure-0.041
 satisfaction-0.039
 accomplishment-0.039
 fascination-0.037
 delight-0.036
Token motorcycle
Token ID18757
Feature activation-1.19e-01

Pos logit contributions
 driven+0.197
saw+0.196
angs+0.195
through+0.195
RO+0.194
Neg logit contributions
 goodness-0.130
 accomplishment-0.119
 and-0.118
ummy-0.118
 decision-0.118
Token.
Token ID13
Feature activation-2.13e-02

Pos logit contributions
Ro+0.279
 villains+0.260
angs+0.259
 Indians+0.254
aro+0.254
Neg logit contributions
 and-0.095
,-0.080
 accomplishment-0.068
 until-0.064
 tightly-0.062
Token It
Token ID632
Feature activation-6.42e-02

Pos logit contributions
 parked+0.036
 driving+0.035
 stole+0.034
 smoking+0.034
 disturbed+0.034
Neg logit contributions
 pleasure-0.038
 satisfaction-0.038
 delight-0.036
 accomplishment-0.036
 fascination-0.034
Token was
Token ID373
Feature activation-7.07e-02

Pos logit contributions
 spies+0.149
angs+0.142
 smoking+0.137
aro+0.133
cher+0.133
Neg logit contributions
 pleasure-0.048
 satisfaction-0.047
 delight-0.046
 accomplishment-0.045
 fascination-0.043
Token red
Token ID2266
Feature activation-1.56e-02

Pos logit contributions
aro+0.154
cher+0.152
 stole+0.149
angs+0.146
Ro+0.145
Neg logit contributions
 accomplishment-0.090
 pleasure-0.086
 satisfaction-0.082
 delight-0.081
 sweet-0.073
Token and
Token ID290
Feature activation-3.94e-02

Pos logit contributions
 parked+0.037
 stole+0.037
 smoking+0.036
 disturbed+0.036
 driving+0.035
Neg logit contributions
 pleasure-0.016
 satisfaction-0.016
 accomplishment-0.015
 delight-0.014
 fascination-0.013
Token they
Token ID484
Feature activation-2.19e-03

Pos logit contributions
 parked+0.016
 smoking+0.015
 stole+0.015
 driving+0.015
 blocking+0.014
Neg logit contributions
 pleasure-0.032
 satisfaction-0.032
 accomplishment-0.031
 delight-0.031
 fascination-0.030
Token see
Token ID766
Feature activation-1.83e-02

Pos logit contributions
 parked+0.004
 driving+0.003
 smoking+0.003
 stole+0.003
 disturbed+0.003
Neg logit contributions
 satisfaction-0.004
 pleasure-0.004
 accomplishment-0.004
 delight-0.004
 fascination-0.004
Token a
Token ID257
Feature activation-3.52e-02

Pos logit contributions
 parked+0.025
 stole+0.024
 smoking+0.024
 driving+0.024
 guarding+0.022
Neg logit contributions
 pleasure-0.038
 satisfaction-0.037
 delight-0.036
 accomplishment-0.035
 fascination-0.035
Token big
Token ID1263
Feature activation+7.06e-03

Pos logit contributions
 parked+0.059
 driving+0.055
 stole+0.054
 smoking+0.054
 driven+0.053
Neg logit contributions
 pleasure-0.062
 accomplishment-0.061
 satisfaction-0.060
 difference-0.058
 delight-0.058
Token white
Token ID2330
Feature activation-2.56e-02

Pos logit contributions
 satisfaction+0.016
 pleasure+0.016
 accomplishment+0.015
 delight+0.015
 fascination+0.015
Neg logit contributions
 parked-0.009
 driving-0.009
 smoking-0.009
 stole-0.009
 broke-0.008
Token thing
Token ID1517
Feature activation-1.18e-01

Pos logit contributions
 parked+0.037
 stole+0.036
 driving+0.035
 smoking+0.035
 disturbed+0.034
Neg logit contributions
 satisfaction-0.051
 pleasure-0.050
 accomplishment-0.049
 delight-0.048
 fascination-0.048
Token on
Token ID319
Feature activation-3.28e-02

Pos logit contributions
 Indians+0.235
Lost+0.230
 villains+0.224
Wh+0.219
Ro+0.214
Neg logit contributions
 and-0.116
 goodness-0.108
 accomplishment-0.107
 until-0.103
 de-0.099
Token the
Token ID262
Feature activation-1.99e-02

Pos logit contributions
 stole+0.058
 parked+0.057
cher+0.056
 driving+0.056
 smoking+0.055
Neg logit contributions
 satisfaction-0.055
 delight-0.054
 accomplishment-0.053
 pleasure-0.052
 fascination-0.051
Token grass
Token ID8701
Feature activation-7.08e-02

Pos logit contributions
 stole+0.035
 smoking+0.034
 parked+0.034
 disturbed+0.034
 driving+0.033
Neg logit contributions
 satisfaction-0.034
 pleasure-0.033
 accomplishment-0.032
 delight-0.032
 fascination-0.030
Token.
Token ID13
Feature activation-1.17e-02

Pos logit contributions
 stole+0.136
 killed+0.135
 villains+0.131
 Found+0.131
 driving+0.131
Neg logit contributions
 delight-0.077
ness-0.077
aman-0.076
 decision-0.075
 amaz-0.074
Token It
Token ID632
Feature activation-2.77e-02

Pos logit contributions
 parked+0.019
 driving+0.019
 stole+0.018
 smoking+0.018
 disturbed+0.017
Neg logit contributions
 pleasure-0.022
 satisfaction-0.022
 delight-0.021
 accomplishment-0.021
 fascination-0.021
Token,
Token ID11
Feature activation-3.25e-02

Pos logit contributions
 parked+0.049
 smoking+0.047
 stole+0.046
 driving+0.046
 blocking+0.045
Neg logit contributions
 pleasure-0.022
 satisfaction-0.022
 accomplishment-0.021
 delight-0.020
 fascination-0.018
Token they
Token ID484
Feature activation-8.03e-03

Pos logit contributions
 parked+0.046
 smoking+0.045
 stole+0.043
 guarding+0.043
cher+0.043
Neg logit contributions
 pleasure-0.065
 satisfaction-0.063
 accomplishment-0.062
 delight-0.061
 fascination-0.056
Token notice
Token ID4003
Feature activation-4.02e-02

Pos logit contributions
 parked+0.013
 smoking+0.013
 driving+0.013
 stole+0.013
 blocking+0.012
Neg logit contributions
 satisfaction-0.014
 pleasure-0.014
 accomplishment-0.014
 delight-0.014
 fascination-0.013
Token a
Token ID257
Feature activation-3.88e-02

Pos logit contributions
cher+0.059
 stole+0.058
 parked+0.057
 guarding+0.056
 smoking+0.055
Neg logit contributions
 pleasure-0.073
 delight-0.072
 satisfaction-0.072
 accomplishment-0.070
 amaz-0.069
Token big
Token ID1263
Feature activation-1.36e-03

Pos logit contributions
 parked+0.068
 driving+0.062
 stole+0.061
 smoking+0.061
 driven+0.059
Neg logit contributions
 accomplishment-0.066
 pleasure-0.066
 satisfaction-0.065
 difference-0.063
 delight-0.062
Token truck
Token ID7779
Feature activation-1.18e-01

Pos logit contributions
 parked+0.002
 driving+0.002
 stole+0.002
 smoking+0.002
 disturbed+0.002
Neg logit contributions
 satisfaction-0.003
 pleasure-0.003
 accomplishment-0.003
 delight-0.003
 fascination-0.003
Token on
Token ID319
Feature activation-3.78e-02

Pos logit contributions
 villains+0.211
 Indians+0.202
Ro+0.201
chard+0.197
angs+0.189
Neg logit contributions
 and-0.135
 decision-0.133
 accomplishment-0.118
 amaz-0.116
 goodness-0.116
Token the
Token ID262
Feature activation-1.91e-02

Pos logit contributions
 stole+0.072
cher+0.070
 smoking+0.067
 parked+0.066
 disturbed+0.065
Neg logit contributions
 satisfaction-0.058
 accomplishment-0.058
 delight-0.057
 pleasure-0.054
 fascination-0.054
Token road
Token ID2975
Feature activation-7.69e-02

Pos logit contributions
 stole+0.027
 smoking+0.027
 disturbed+0.027
 parked+0.026
 broke+0.025
Neg logit contributions
 satisfaction-0.038
 pleasure-0.037
 accomplishment-0.036
 delight-0.036
 fascination-0.035
Token.
Token ID13
Feature activation-3.26e-02

Pos logit contributions
Ro+0.164
 Indians+0.152
 CA+0.151
cher+0.150
 stole+0.148
Neg logit contributions
 decision-0.088
 accomplishment-0.080
 satisfaction-0.072
 delight-0.070
 difference-0.070
Token The
Token ID383
Feature activation-2.31e-02

Pos logit contributions
 driving+0.061
 smoking+0.060
 parked+0.060
cher+0.060
 guarding+0.059
Neg logit contributions
 pleasure-0.050
 satisfaction-0.049
 delight-0.049
 accomplishment-0.044
 amaz-0.042
Token saw
Token ID2497
Feature activation-3.84e-02

Pos logit contributions
 parked+0.013
 driving+0.013
 smoking+0.012
 stole+0.012
 blocking+0.011
Neg logit contributions
 satisfaction-0.020
 pleasure-0.020
 accomplishment-0.019
 delight-0.019
 fascination-0.018
Token a
Token ID257
Feature activation-1.51e-02

Pos logit contributions
 parked+0.064
 driving+0.060
 stole+0.059
cher+0.058
saw+0.056
Neg logit contributions
 satisfaction-0.066
 pleasure-0.066
 delight-0.064
 accomplishment-0.061
 fascination-0.060
Token long
Token ID890
Feature activation-5.96e-03

Pos logit contributions
 parked+0.025
 driving+0.023
 stole+0.023
 smoking+0.023
 knocked+0.021
Neg logit contributions
 pleasure-0.029
 satisfaction-0.028
 accomplishment-0.028
 delight-0.027
 fascination-0.026
Token,
Token ID11
Feature activation-2.11e-02

Pos logit contributions
 parked+0.009
 driving+0.009
 stole+0.009
 smoking+0.009
 disturbed+0.008
Neg logit contributions
 satisfaction-0.012
 pleasure-0.012
 accomplishment-0.011
 delight-0.011
 fascination-0.011
Token green
Token ID4077
Feature activation-2.26e-02

Pos logit contributions
 parked+0.032
 driving+0.030
 stole+0.029
 smoking+0.029
 knocked+0.028
Neg logit contributions
 satisfaction-0.041
 pleasure-0.041
 accomplishment-0.040
 delight-0.039
 fascination-0.039
Token thing
Token ID1517
Feature activation-1.18e-01

Pos logit contributions
 parked+0.036
 driving+0.034
 stole+0.034
 smoking+0.033
 knocked+0.032
Neg logit contributions
 satisfaction-0.042
 pleasure-0.042
 accomplishment-0.041
 delight-0.040
 fascination-0.039
Token in
Token ID287
Feature activation-1.78e-02

Pos logit contributions
 Indians+0.222
 driven+0.221
Lost+0.218
anes+0.214
cher+0.213
Neg logit contributions
 and-0.109
 goodness-0.102
 moments-0.097
 until-0.094
 de-0.090
Token the
Token ID262
Feature activation-1.14e-02

Pos logit contributions
 parked+0.028
 driving+0.028
 stole+0.027
 smoking+0.027
 knocked+0.025
Neg logit contributions
 satisfaction-0.034
 pleasure-0.033
 delight-0.033
 accomplishment-0.032
 fascination-0.031
Token water
Token ID1660
Feature activation-6.17e-02

Pos logit contributions
 parked+0.019
 driving+0.019
 smoking+0.018
 stole+0.018
 knocked+0.017
Neg logit contributions
 satisfaction-0.021
 pleasure-0.021
 delight-0.020
 accomplishment-0.019
 fascination-0.019
Token.
Token ID13
Feature activation-1.57e-02

Pos logit contributions
 driving+0.117
 parked+0.117
 stole+0.113
 CA+0.111
destruct+0.110
Neg logit contributions
 difference-0.075
 satisfaction-0.075
 delight-0.074
 pleasure-0.070
 decision-0.070
Token He
Token ID679
Feature activation-3.31e-02

Pos logit contributions
 parked+0.025
 driving+0.024
 smoking+0.023
 stole+0.023
 knocked+0.022
Neg logit contributions
 pleasure-0.031
 satisfaction-0.030
 delight-0.029
 accomplishment-0.029
 fascination-0.028
Token bus
Token ID1323
Feature activation-8.22e-02

Pos logit contributions
 smoking+0.013
 stole+0.012
 parked+0.012
 disturbed+0.011
 broke+0.011
Neg logit contributions
 satisfaction-0.029
 pleasure-0.029
 accomplishment-0.028
 delight-0.028
 fascination-0.027
Token met
Token ID1138
Feature activation-4.43e-02

Pos logit contributions
 smoking+0.144
through+0.133
angs+0.132
chard+0.129
Ro+0.127
Neg logit contributions
 satisfaction-0.112
 decision-0.111
est-0.107
 delight-0.105
ness-0.105
Token a
Token ID257
Feature activation-2.92e-02

Pos logit contributions
 stole+0.074
 smoking+0.073
 parked+0.073
 knocked+0.071
 blocked+0.069
Neg logit contributions
 pleasure-0.076
 satisfaction-0.076
 delight-0.071
 accomplishment-0.066
 fascination-0.065
Token small
Token ID1402
Feature activation-1.24e-02

Pos logit contributions
 parked+0.063
 smoking+0.059
 stole+0.058
 driving+0.058
 blocking+0.056
Neg logit contributions
 pleasure-0.042
 satisfaction-0.040
 accomplishment-0.039
 delight-0.036
 difference-0.035
Token green
Token ID4077
Feature activation-4.41e-02

Pos logit contributions
 parked+0.019
 stole+0.018
 smoking+0.018
 driving+0.018
 knocked+0.017
Neg logit contributions
 pleasure-0.025
 satisfaction-0.024
 accomplishment-0.024
 delight-0.023
 fascination-0.023
Token car
Token ID1097
Feature activation-1.17e-01

Pos logit contributions
 stole+0.060
 parked+0.060
 smoking+0.060
 bitten+0.059
through+0.058
Neg logit contributions
 pleasure-0.086
 satisfaction-0.086
 accomplishment-0.084
 delight-0.083
 decision-0.080
Token.
Token ID13
Feature activation-3.12e-02

Pos logit contributions
angs+0.286
through+0.283
Ro+0.275
destruct+0.269
Free+0.266
Neg logit contributions
 and-0.079
 until-0.060
,-0.050
 accomplishment-0.046
 agreement-0.043
Token The
Token ID383
Feature activation-1.71e-02

Pos logit contributions
 parked+0.060
 smoking+0.060
 stole+0.060
 driving+0.059
cher+0.057
Neg logit contributions
 pleasure-0.048
 satisfaction-0.046
 delight-0.046
 accomplishment-0.041
 fascination-0.040
Token car
Token ID1097
Feature activation-6.55e-02

Pos logit contributions
 smoking+0.022
 stole+0.022
 parked+0.021
 disturbed+0.020
 blocking+0.020
Neg logit contributions
 pleasure-0.038
 satisfaction-0.038
 accomplishment-0.037
 delight-0.036
 fascination-0.035
Token liked
Token ID8288
Feature activation-5.19e-02

Pos logit contributions
 smoking+0.124
angs+0.119
through+0.115
saw+0.110
Ro+0.109
Neg logit contributions
 satisfaction-0.084
 delight-0.083
 accomplishment-0.083
 pleasure-0.083
 nod-0.076
Token to
Token ID284
Feature activation-1.43e-02

Pos logit contributions
cher+0.087
tr+0.087
 stole+0.086
 parked+0.086
through+0.084
Neg logit contributions
 pleasure-0.085
 accomplishment-0.084
 delight-0.079
 satisfaction-0.077
 joy-0.076
Token on
Token ID319
Feature activation+3.76e-02

Pos logit contributions
 parked+0.009
 stole+0.008
 smoking+0.008
 driving+0.008
 disturbed+0.008
Neg logit contributions
 satisfaction-0.010
 pleasure-0.009
 accomplishment-0.009
 delight-0.009
 fascination-0.009
Token her
Token ID607
Feature activation+2.10e-02

Pos logit contributions
 pleasure+0.085
 satisfaction+0.083
 accomplishment+0.079
 fascination+0.079
 delight+0.077
Neg logit contributions
 parked-0.046
 driving-0.046
 smoking-0.043
 knocked-0.040
 blocking-0.040
Token way
Token ID835
Feature activation+4.58e-02

Pos logit contributions
 satisfaction+0.039
 pleasure+0.038
 accomplishment+0.036
 fascination+0.036
 delight+0.035
Neg logit contributions
 parked-0.036
 driving-0.035
 stole-0.034
 smoking-0.033
 knocked-0.033
Token with
Token ID351
Feature activation+2.82e-02

Pos logit contributions
 satisfaction+0.105
 pleasure+0.099
 accomplishment+0.095
 fascination+0.095
 delight+0.092
Neg logit contributions
 parked-0.064
 driving-0.061
 stuck-0.057
 smoking-0.055
 knocked-0.055
Token a
Token ID257
Feature activation+3.76e-02

Pos logit contributions
 pleasure+0.063
 accomplishment+0.063
 satisfaction+0.062
 fascination+0.059
 accomplishments+0.059
Neg logit contributions
 parked-0.034
 smoking-0.031
 driving-0.031
 disturbed-0.030
 stole-0.030
Token big
Token ID1263
Feature activation+6.63e-02

Pos logit contributions
 satisfaction+0.057
 accomplishment+0.056
 pleasure+0.055
 fascination+0.051
 delight+0.050
Neg logit contributions
 parked-0.071
 stole-0.071
 broke-0.069
 smoking-0.069
 knocked-0.068
Token smile
Token ID8212
Feature activation-1.99e-02

Pos logit contributions
 satisfaction+0.086
agg+0.084
 pleasure+0.082
 accomplishment+0.078
 accomplishments+0.077
Neg logit contributions
 parked-0.136
 stole-0.135
 broke-0.134
 driving-0.133
 smoking-0.130
Token on
Token ID319
Feature activation+3.90e-02

Pos logit contributions
 parked+0.021
 stole+0.021
 smoking+0.020
 driving+0.019
 disturbed+0.019
Neg logit contributions
 satisfaction-0.048
 accomplishment-0.047
 pleasure-0.047
 delight-0.045
 fascination-0.044
Token her
Token ID607
Feature activation+5.56e-02

Pos logit contributions
 pleasure+0.098
 fascination+0.088
 satisfaction+0.088
 accomplishment+0.088
hew+0.086
Neg logit contributions
 parked-0.036
 driving-0.035
 hurt-0.033
 smoking-0.033
 knocked-0.032
Token face
Token ID1986
Feature activation+2.34e-03

Pos logit contributions
 pleasure+0.111
aman+0.106
 satisfaction+0.106
 fascination+0.106
heart+0.105
Neg logit contributions
 driving-0.079
 parked-0.077
 smoking-0.072
 hurt-0.070
 broke-0.064
Token.
Token ID13
Feature activation+4.29e-02

Pos logit contributions
 satisfaction+0.005
 pleasure+0.005
 accomplishment+0.005
 delight+0.005
 fascination+0.005
Neg logit contributions
 parked-0.003
 driving-0.003
 smoking-0.003
 stole-0.003
 disturbed-0.003
Token's
Token ID338
Feature activation+4.72e-03

Pos logit contributions
 satisfaction+0.002
 pleasure+0.002
 accomplishment+0.002
 delight+0.002
 fascination+0.002
Neg logit contributions
 parked-0.002
 driving-0.002
 smoking-0.002
 stole-0.002
 disturbed-0.002
Token important
Token ID1593
Feature activation+7.69e-04

Pos logit contributions
 satisfaction+0.009
 pleasure+0.009
 accomplishment+0.009
 delight+0.008
 fascination+0.008
Neg logit contributions
 parked-0.008
 driving-0.007
 smoking-0.007
 stole-0.007
 knocked-0.007
Token to
Token ID284
Feature activation+2.72e-02

Pos logit contributions
 satisfaction+0.002
 pleasure+0.002
 accomplishment+0.002
 delight+0.002
 fascination+0.002
Neg logit contributions
 parked-0.001
 driving-0.001
 smoking-0.001
 stole-0.001
 disturbed-0.001
Token respect
Token ID2461
Feature activation+3.45e-02

Pos logit contributions
 satisfaction+0.055
 pleasure+0.055
 accomplishment+0.054
 fascination+0.052
 delight+0.051
Neg logit contributions
 parked-0.041
 driving-0.039
 stole-0.037
 smoking-0.037
 knocked-0.036
Token them
Token ID606
Feature activation+3.64e-02

Pos logit contributions
 accomplishment+0.078
 satisfaction+0.077
 pleasure+0.077
 fascination+0.075
 delight+0.073
Neg logit contributions
 parked-0.043
 smoking-0.041
 driving-0.040
 stole-0.038
 disturbed-0.037
Token."
Token ID526
Feature activation+5.61e-02

Pos logit contributions
 satisfaction+0.086
 pleasure+0.084
 accomplishment+0.084
 fascination+0.083
 delight+0.080
Neg logit contributions
 parked-0.044
 driving-0.041
 smoking-0.039
 stole-0.038
 disturbed-0.037
Token
Token ID198
Feature activation+2.02e-02

Pos logit contributions
 accomplishment+0.133
 satisfaction+0.127
 pleasure+0.125
 fascination+0.124
 accomplishments+0.122
Neg logit contributions
 called-0.055
 smoking-0.054
 parked-0.051
 broke-0.049
o-0.049
Token
Token ID198
Feature activation+3.23e-02

Pos logit contributions
 satisfaction+0.047
 pleasure+0.047
 accomplishment+0.046
 fascination+0.044
 delight+0.043
Neg logit contributions
 parked-0.024
 smoking-0.022
 stole-0.021
 driving-0.021
 disturbed-0.021
TokenM
Token ID44
Feature activation+5.86e-02

Pos logit contributions
 pleasure+0.072
 accomplishment+0.070
 satisfaction+0.068
 fascination+0.066
 delight+0.062
Neg logit contributions
 parked-0.042
 stole-0.039
 disturbed-0.039
 smoking-0.038
Ro-0.038
Tokenilly
Token ID6548
Feature activation+5.27e-02

Pos logit contributions
 pleasure+0.086
 satisfaction+0.082
 fascination+0.081
 delight+0.078
 amaz+0.074
Neg logit contributions
 smoking-0.109
 parked-0.104
 called-0.100
 disturbed-0.098
 knocked-0.097
Token smiled
Token ID13541
Feature activation+5.57e-02

Pos logit contributions
 pleasure+0.124
 satisfaction+0.119
 accomplishment+0.118
 fascination+0.110
 difference+0.109
Neg logit contributions
 parked-0.075
 knocked-0.066
 stole-0.065
 broke-0.065
 driving-0.061
Token found
Token ID1043
Feature activation+5.37e-04

Pos logit contributions
 parked+0.028
 smoking+0.027
 driving+0.026
 stole+0.025
through+0.025
Neg logit contributions
 satisfaction-0.048
 pleasure-0.048
 accomplishment-0.047
 delight-0.045
 fascination-0.043
Token.
Token ID13
Feature activation+1.69e-02

Pos logit contributions
 satisfaction+0.001
 pleasure+0.001
 accomplishment+0.001
 delight+0.001
 fascination+0.001
Neg logit contributions
 parked-0.001
 driving-0.001
 smoking-0.001
 stole-0.001
 disturbed-0.001
Token Her
Token ID2332
Feature activation+2.08e-02

Pos logit contributions
 satisfaction+0.036
 accomplishment+0.036
 pleasure+0.035
 fascination+0.034
 delight+0.033
Neg logit contributions
 parked-0.024
 driving-0.022
 smoking-0.022
 stole-0.022
 knocked-0.021
Token mom
Token ID1995
Feature activation+3.80e-02

Pos logit contributions
 satisfaction+0.044
 pleasure+0.043
 accomplishment+0.041
 fascination+0.041
 delight+0.039
Neg logit contributions
 parked-0.029
 driving-0.027
 broke-0.026
 knocked-0.026
 stole-0.026
Token hugged
Token ID41130
Feature activation+6.82e-02

Pos logit contributions
 pleasure+0.095
 satisfaction+0.093
 accomplishment+0.091
 fascination+0.086
 accomplishments+0.085
Neg logit contributions
 parked-0.047
 stole-0.041
 knocked-0.040
 broke-0.039
 driving-0.039
Token her
Token ID607
Feature activation+7.43e-02

Pos logit contributions
 accomplishment+0.190
 pleasure+0.186
 accomplishments+0.185
 satisfaction+0.184
 amaz+0.180
Neg logit contributions
't-0.048
 parked-0.045
 on-0.040
 hurt-0.040
 Wh-0.040
Token and
Token ID290
Feature activation+2.70e-02

Pos logit contributions
 satisfaction+0.163
agg+0.160
 fascination+0.159
 pleasure+0.159
aman+0.153
Neg logit contributions
 parked-0.097
 on-0.087
 driving-0.082
 stuck-0.082
 in-0.074
Token praised
Token ID15342
Feature activation+4.54e-02

Pos logit contributions
 pleasure+0.073
 satisfaction+0.071
 accomplishment+0.070
 fascination+0.069
 accomplishments+0.067
Neg logit contributions
 parked-0.025
 stole-0.022
 knocked-0.021
 broke-0.020
 disturbed-0.020
Token her
Token ID607
Feature activation+4.17e-02

Pos logit contributions
 pleasure+0.107
 satisfaction+0.107
 fascination+0.103
 amaz+0.102
 accomplishment+0.101
Neg logit contributions
 driving-0.044
 parked-0.043
 disturbed-0.041
 blocking-0.039
 smoking-0.039
Token for
Token ID329
Feature activation+6.44e-03

Pos logit contributions
 satisfaction+0.089
 fascination+0.087
 pleasure+0.086
 delight+0.082
 accomplishment+0.080
Neg logit contributions
 driving-0.055
 parked-0.054
 knocked-0.048
 smoking-0.047
 stole-0.046
Token behaving
Token ID37722
Feature activation+6.53e-02

Pos logit contributions
 pleasure+0.015
 satisfaction+0.014
 accomplishment+0.014
 fascination+0.014
 delight+0.014
Neg logit contributions
 parked-0.008
 driving-0.008
 smoking-0.008
 stole-0.008
 knocked-0.007
Token Aunt
Token ID38074
Feature activation-1.65e-02

Pos logit contributions
 satisfaction+0.026
 pleasure+0.025
 accomplishment+0.025
 fascination+0.024
 delight+0.024
Neg logit contributions
 parked-0.020
 driving-0.019
 smoking-0.018
 stole-0.018
 knocked-0.018
Token.
Token ID13
Feature activation+2.59e-02

Pos logit contributions
 parked+0.027
 stole+0.025
 driving+0.025
 smoking+0.025
 broke+0.024
Neg logit contributions
 pleasure-0.031
 satisfaction-0.030
 accomplishment-0.030
 delight-0.029
 fascination-0.028
Token She
Token ID1375
Feature activation+3.04e-02

Pos logit contributions
 satisfaction+0.055
 accomplishment+0.055
 pleasure+0.053
 fascination+0.052
 delight+0.050
Neg logit contributions
 parked-0.037
 driving-0.034
 smoking-0.032
 stole-0.032
 knocked-0.031
Token felt
Token ID2936
Feature activation+4.62e-02

Pos logit contributions
 pleasure+0.070
 satisfaction+0.069
 accomplishment+0.068
 fascination+0.065
 difference+0.064
Neg logit contributions
 parked-0.042
 knocked-0.037
 stole-0.036
 driving-0.036
 broke-0.035
Token dizz
Token ID29527
Feature activation+2.61e-02

Pos logit contributions
 accomplishment+0.098
aman+0.097
 accomplishments+0.095
oph+0.095
 fascination+0.093
Neg logit contributions
 parked-0.061
 disturbed-0.056
 hurt-0.054
 stuck-0.054
 driving-0.053
Tokeny
Token ID88
Feature activation+6.28e-02

Pos logit contributions
 satisfaction+0.061
 pleasure+0.060
 accomplishment+0.059
 delight+0.057
 fascination+0.056
Neg logit contributions
 parked-0.035
 smoking-0.033
 driving-0.032
 disturbed-0.032
 broke-0.031
Token with
Token ID351
Feature activation+5.36e-02

Pos logit contributions
 satisfaction+0.122
 pleasure+0.122
 fascination+0.113
 accomplishment+0.112
oph+0.111
Neg logit contributions
 driving-0.103
 parked-0.101
 smoking-0.096
 stuck-0.094
 blocking-0.092
Token happiness
Token ID12157
Feature activation+2.98e-02

Pos logit contributions
aire+0.118
oph+0.116
 accomplishment+0.114
 pleasure+0.114
 satisfaction+0.113
Neg logit contributions
 parked-0.061
 driving-0.059
 disturbed-0.059
 knocked-0.058
 smoking-0.058
Token to
Token ID284
Feature activation+2.67e-02

Pos logit contributions
 satisfaction+0.059
 pleasure+0.058
 accomplishment+0.056
 fascination+0.055
 delight+0.054
Neg logit contributions
 parked-0.049
 driving-0.047
 smoking-0.046
 disturbed-0.044
 stole-0.043
Token know
Token ID760
Feature activation+2.97e-02

Pos logit contributions
 pleasure+0.055
 satisfaction+0.055
 accomplishment+0.054
 fascination+0.052
 delight+0.050
Neg logit contributions
 parked-0.039
 smoking-0.037
 driving-0.037
 knocked-0.035
 stole-0.034
Token that
Token ID326
Feature activation+1.05e-02

Pos logit contributions
 pleasure+0.067
 satisfaction+0.067
 accomplishment+0.065
 fascination+0.063
 delight+0.062
Neg logit contributions
 parked-0.039
 driving-0.037
 smoking-0.036
 stole-0.033
 blocking-0.033
Token.
Token ID13
Feature activation-3.05e-02

Pos logit contributions
 parked+0.025
 stole+0.023
 smoking+0.022
 driving+0.022
 disturbed+0.022
Neg logit contributions
 accomplishment-0.033
 satisfaction-0.033
 pleasure-0.033
 delight-0.031
 fascination-0.030
Token\n
Token ID198
Feature activation-4.45e-03

Pos logit contributions
 parked+0.064
 stole+0.063
 smoking+0.060
 driving+0.060
 disturbed+0.058
Neg logit contributions
 pleasure-0.042
 satisfaction-0.041
 delight-0.039
 accomplishment-0.038
 fascination-0.037
Token\n
Token ID198
Feature activation-2.10e-04

Pos logit contributions
 parked+0.007
 stole+0.007
 driving+0.007
 smoking+0.007
 disturbed+0.006
Neg logit contributions
 satisfaction-0.009
 pleasure-0.009
 accomplishment-0.008
 delight-0.008
 fascination-0.008
TokenTom
Token ID13787
Feature activation+5.06e-02

Pos logit contributions
 parked+0.000
 driving+0.000
 stole+0.000
 smoking+0.000
 disturbed+0.000
Neg logit contributions
 satisfaction-0.000
 pleasure-0.000
 accomplishment-0.000
 delight-0.000
 fascination-0.000
Token and
Token ID290
Feature activation-1.89e-02

Pos logit contributions
 pleasure+0.091
 satisfaction+0.089
 accomplishment+0.086
 delight+0.078
 fascination+0.076
Neg logit contributions
 parked-0.098
 driving-0.090
 broke-0.088
 stole-0.087
 knocked-0.087
Token Sam
Token ID3409
Feature activation+6.75e-02

Pos logit contributions
 parked+0.025
 smoking+0.024
 stole+0.024
 disturbed+0.023
 driving+0.023
Neg logit contributions
 satisfaction-0.041
 accomplishment-0.041
 pleasure-0.039
 delight-0.039
 fascination-0.038
Token are
Token ID389
Feature activation-5.04e-03

Pos logit contributions
 pleasure+0.140
 accomplishment+0.136
 satisfaction+0.131
heart+0.131
 accomplishments+0.126
Neg logit contributions
 parked-0.106
 stole-0.093
 broke-0.093
 hit-0.090
 stuck-0.089
Token happy
Token ID3772
Feature activation-9.36e-03

Pos logit contributions
 parked+0.011
 stole+0.011
 smoking+0.011
 driving+0.010
 disturbed+0.010
Neg logit contributions
 satisfaction-0.007
 pleasure-0.007
 accomplishment-0.007
 delight-0.006
 fascination-0.006
Token.
Token ID13
Feature activation-9.16e-03

Pos logit contributions
 parked+0.014
 stole+0.013
 driving+0.012
 smoking+0.012
 broke+0.012
Neg logit contributions
 satisfaction-0.020
 pleasure-0.020
 accomplishment-0.019
 delight-0.019
 fascination-0.018
Token They
Token ID1119
Feature activation+4.37e-02

Pos logit contributions
 parked+0.016
 stole+0.015
 smoking+0.015
 driving+0.015
 disturbed+0.014
Neg logit contributions
 pleasure-0.017
 satisfaction-0.017
 accomplishment-0.016
 delight-0.016
 fascination-0.015
Token mail
Token ID6920
Feature activation-1.63e-02

Pos logit contributions
 accomplishment+0.085
 pleasure+0.084
 satisfaction+0.081
 fascination+0.080
heart+0.079
Neg logit contributions
 parked-0.071
 stole-0.067
 broke-0.065
 knocked-0.062
 stuck-0.061
Token juice
Token ID13135
Feature activation-3.52e-02

Pos logit contributions
 satisfaction+0.019
 pleasure+0.019
 accomplishment+0.018
 delight+0.018
 fascination+0.018
Neg logit contributions
 parked-0.012
 driving-0.011
 smoking-0.011
 stole-0.011
 broke-0.010
Token maker
Token ID16009
Feature activation-4.37e-02

Pos logit contributions
 parked+0.065
 smoking+0.064
 stole+0.064
 driving+0.064
through+0.062
Neg logit contributions
 pleasure-0.054
 satisfaction-0.053
 accomplishment-0.053
 delight-0.049
 difference-0.046
Token.
Token ID13
Feature activation-2.21e-02

Pos logit contributions
 stole+0.078
 parked+0.077
through+0.077
 smoking+0.076
saw+0.075
Neg logit contributions
 pleasure-0.069
 satisfaction-0.067
 accomplishment-0.066
 delight-0.062
 fascination-0.059
Token It
Token ID632
Feature activation-1.33e-02

Pos logit contributions
 parked+0.041
 driving+0.040
 stole+0.039
 smoking+0.038
 disturbed+0.037
Neg logit contributions
 pleasure-0.039
 satisfaction-0.037
 delight-0.036
 fascination-0.034
 accomplishment-0.034
Token is
Token ID318
Feature activation-2.57e-02

Pos logit contributions
 parked+0.026
 driving+0.025
 smoking+0.025
 stole+0.024
 disturbed+0.023
Neg logit contributions
 pleasure-0.020
 satisfaction-0.020
 accomplishment-0.019
 delight-0.019
 fascination-0.018
Token shiny
Token ID22441
Feature activation+2.51e-02

Pos logit contributions
 stole+0.040
 parked+0.038
saw+0.037
through+0.036
 driving+0.036
Neg logit contributions
 pleasure-0.051
 satisfaction-0.050
 accomplishment-0.050
 delight-0.046
 fascination-0.045
Token and
Token ID290
Feature activation+1.96e-02

Pos logit contributions
 satisfaction+0.043
 pleasure+0.043
 accomplishment+0.041
 fascination+0.041
 delight+0.039
Neg logit contributions
 parked-0.048
 driving-0.046
 smoking-0.045
 broke-0.044
 stole-0.044
Token colorful
Token ID20239
Feature activation-6.72e-03

Pos logit contributions
 pleasure+0.040
 satisfaction+0.040
 accomplishment+0.039
 fascination+0.038
 delight+0.038
Neg logit contributions
 parked-0.030
 smoking-0.028
 stole-0.027
 driving-0.027
 broke-0.026
Token.
Token ID13
Feature activation-1.44e-02

Pos logit contributions
 parked+0.010
 stole+0.010
 smoking+0.010
 driving+0.010
 disturbed+0.009
Neg logit contributions
 satisfaction-0.013
 pleasure-0.013
 accomplishment-0.013
 delight-0.013
 fascination-0.012
Token It
Token ID632
Feature activation-1.10e-02

Pos logit contributions
 parked+0.025
 driving+0.024
 stole+0.024
 smoking+0.023
 knocked+0.022
Neg logit contributions
 pleasure-0.027
 satisfaction-0.026
 delight-0.025
 accomplishment-0.024
 fascination-0.024
Token has
Token ID468
Feature activation-2.18e-02

Pos logit contributions
 parked+0.020
 driving+0.019
 smoking+0.019
 stole+0.019
 disturbed+0.018
Neg logit contributions
 satisfaction-0.018
 pleasure-0.018
 accomplishment-0.018
 delight-0.017
 fascination-0.017
Token like
Token ID588
Feature activation-3.02e-02

Pos logit contributions
 parked+0.000
 driving+0.000
 smoking+0.000
 stole+0.000
 knocked+0.000
Neg logit contributions
 satisfaction-0.000
 pleasure-0.000
 accomplishment-0.000
 delight-0.000
 fascination-0.000
Token Tommy
Token ID19919
Feature activation-9.52e-03

Pos logit contributions
 parked+0.046
 stole+0.043
 smoking+0.041
through+0.041
 broke+0.040
Neg logit contributions
 pleasure-0.060
 accomplishment-0.058
 satisfaction-0.057
 delight-0.056
 fascination-0.053
Token and
Token ID290
Feature activation-8.28e-03

Pos logit contributions
 parked+0.015
 smoking+0.014
 stole+0.014
 driving+0.014
 knocked+0.013
Neg logit contributions
 pleasure-0.018
 satisfaction-0.018
 accomplishment-0.018
 delight-0.017
 fascination-0.017
Token a
Token ID257
Feature activation+1.95e-02

Pos logit contributions
 parked+0.011
 smoking+0.010
 driving+0.010
 stole+0.010
 disturbed+0.010
Neg logit contributions
 satisfaction-0.018
 pleasure-0.018
 accomplishment-0.018
 delight-0.017
 fascination-0.017
Token c
Token ID269
Feature activation+3.01e-02

Pos logit contributions
 satisfaction+0.034
 accomplishment+0.033
 pleasure+0.033
 fascination+0.032
 delight+0.031
Neg logit contributions
 parked-0.034
 driving-0.033
 stole-0.033
 smoking-0.032
 broke-0.032
Tokenudd
Token ID4185
Feature activation+4.51e-02

Pos logit contributions
 satisfaction+0.049
 pleasure+0.047
 fascination+0.046
 delight+0.046
 accomplishment+0.045
Neg logit contributions
 parked-0.059
 stole-0.056
 driving-0.056
 broke-0.055
 smoking-0.054
Tokenly
Token ID306
Feature activation+1.37e-02

Pos logit contributions
 pleasure+0.086
 amaz+0.085
 fascination+0.085
 satisfaction+0.084
 delight+0.080
Neg logit contributions
 parked-0.076
 driving-0.072
 stole-0.069
 broke-0.068
 disturbed-0.066
Token friend
Token ID1545
Feature activation-2.98e-02

Pos logit contributions
 satisfaction+0.029
 pleasure+0.029
 accomplishment+0.029
 delight+0.027
 fascination+0.027
Neg logit contributions
 parked-0.019
 driving-0.018
 stole-0.018
 smoking-0.017
 broke-0.017
Token like
Token ID588
Feature activation-3.52e-02

Pos logit contributions
 parked+0.046
 smoking+0.044
 driving+0.042
 stole+0.042
 CA+0.041
Neg logit contributions
 pleasure-0.059
 satisfaction-0.057
 accomplishment-0.054
 delight-0.054
 fascination-0.051
Token her
Token ID607
Feature activation-7.11e-03

Pos logit contributions
 parked+0.054
 smoking+0.049
 stole+0.049
 broke+0.048
 knocked+0.047
Neg logit contributions
 pleasure-0.071
 satisfaction-0.067
 accomplishment-0.067
 delight-0.065
 fascination-0.062
Token t
Token ID256
Feature activation-1.46e-02

Pos logit contributions
 parked+0.011
 stole+0.010
 smoking+0.010
 driving+0.010
 knocked+0.010
Neg logit contributions
 pleasure-0.014
 satisfaction-0.014
 accomplishment-0.014
 delight-0.013
 fascination-0.013
Token and
Token ID290
Feature activation-8.83e-03

Pos logit contributions
 satisfaction+0.061
 pleasure+0.060
 accomplishment+0.058
 fascination+0.057
 delight+0.055
Neg logit contributions
 parked-0.055
 driving-0.052
 smoking-0.050
 knocked-0.048
 stole-0.048
Token smiled
Token ID13541
Feature activation+2.36e-02

Pos logit contributions
 parked+0.013
 smoking+0.013
 driving+0.013
 stole+0.013
 disturbed+0.012
Neg logit contributions
 satisfaction-0.018
 pleasure-0.017
 accomplishment-0.017
 delight-0.017
 fascination-0.016
Token.
Token ID13
Feature activation+2.32e-02

Pos logit contributions
 satisfaction+0.060
 pleasure+0.059
 accomplishment+0.059
 fascination+0.056
 delight+0.055
Neg logit contributions
 parked-0.023
 driving-0.022
 smoking-0.020
 disturbed-0.019
 broke-0.019
Token �
Token ID6184
Feature activation+1.64e-02

Pos logit contributions
 satisfaction+0.054
 accomplishment+0.054
 pleasure+0.053
 fascination+0.051
 delight+0.050
Neg logit contributions
 parked-0.027
 driving-0.025
 smoking-0.024
 stole-0.024
 knocked-0.023
Token�
Token ID95
Feature activation+2.16e-02

Pos logit contributions
 satisfaction+0.030
 pleasure+0.030
 accomplishment+0.029
 fascination+0.028
 delight+0.028
Neg logit contributions
 parked-0.029
 driving-0.027
 stole-0.027
 smoking-0.026
 broke-0.025
Token€
Token ID26391
Feature activation+3.42e-02

Pos logit contributions
 pleasure+0.042
 accomplishment+0.041
 satisfaction+0.041
 delight+0.039
 fascination+0.039
Neg logit contributions
 parked-0.036
 driving-0.033
 smoking-0.033
 stole-0.033
 broke-0.030
Token�
Token ID129
Feature activation+1.44e-02

Pos logit contributions
 satisfaction+0.071
 accomplishment+0.069
 pleasure+0.068
 fascination+0.065
 delight+0.065
Neg logit contributions
 parked-0.055
 driving-0.052
 smoking-0.050
 stole-0.050
 broke-0.048
Token�
Token ID241
Feature activation+2.36e-02

Pos logit contributions
 pleasure+0.026
 satisfaction+0.026
 accomplishment+0.025
 delight+0.024
 fascination+0.023
Neg logit contributions
 parked-0.026
 driving-0.024
 smoking-0.024
 stole-0.024
 knocked-0.023
TokenOk
Token ID18690
Feature activation+3.23e-02

Pos logit contributions
 accomplishment+0.048
 satisfaction+0.048
 pleasure+0.047
 delight+0.045
 fascination+0.045
Neg logit contributions
 parked-0.035
 driving-0.032
 stole-0.032
 smoking-0.031
 disturbed-0.030
Token Mum
Token ID38367
Feature activation+8.74e-03

Pos logit contributions
 satisfaction+0.065
 fascination+0.063
 pleasure+0.062
 accomplishment+0.062
 amaz+0.058
Neg logit contributions
 parked-0.051
 driving-0.048
 stole-0.047
 smoking-0.046
 broke-0.045
Token,
Token ID11
Feature activation+3.92e-02

Pos logit contributions
 satisfaction+0.016
 pleasure+0.016
 accomplishment+0.016
 delight+0.015
 fascination+0.015
Neg logit contributions
 parked-0.015
 driving-0.014
 stole-0.014
 smoking-0.014
 knocked-0.013
Token\n
Token ID198
Feature activation+6.61e-03

Pos logit contributions
 satisfaction+0.033
 pleasure+0.032
 accomplishment+0.032
 delight+0.030
 fascination+0.030
Neg logit contributions
 parked-0.020
 driving-0.019
 smoking-0.019
 stole-0.018
 knocked-0.018
Token\n
Token ID198
Feature activation+2.66e-03

Pos logit contributions
 satisfaction+0.014
 pleasure+0.014
 accomplishment+0.014
 delight+0.013
 fascination+0.013
Neg logit contributions
 parked-0.010
 driving-0.009
 smoking-0.009
 stole-0.009
 knocked-0.008
TokenJane
Token ID41083
Feature activation+2.30e-02

Pos logit contributions
 satisfaction+0.005
 pleasure+0.005
 accomplishment+0.005
 delight+0.004
 fascination+0.004
Neg logit contributions
 parked-0.005
 driving-0.004
 smoking-0.004
 stole-0.004
 disturbed-0.004
Token:
Token ID25
Feature activation+1.57e-02

Pos logit contributions
 pleasure+0.048
 satisfaction+0.047
 accomplishment+0.046
 fascination+0.043
 delight+0.043
Neg logit contributions
 parked-0.037
 stole-0.033
 knocked-0.033
 driving-0.033
 broke-0.032
Token "
Token ID366
Feature activation+2.07e-02

Pos logit contributions
 satisfaction+0.035
 pleasure+0.034
 accomplishment+0.034
 delight+0.032
 fascination+0.032
Neg logit contributions
 parked-0.021
 driving-0.020
 smoking-0.019
 stole-0.019
 knocked-0.018
TokenLook
Token ID8567
Feature activation+3.21e-02

Pos logit contributions
 accomplishment+0.049
 pleasure+0.048
 satisfaction+0.047
 delight+0.045
 fascination+0.044
Neg logit contributions
 parked-0.026
 stole-0.023
 smoking-0.023
 disturbed-0.022
 driving-0.022
Token Mike
Token ID4995
Feature activation+1.31e-02

Pos logit contributions
 satisfaction+0.070
 accomplishment+0.066
 pleasure+0.066
 fascination+0.065
 delight+0.061
Neg logit contributions
 parked-0.045
 smoking-0.042
 driving-0.042
 stole-0.042
 knocked-0.041
Token,
Token ID11
Feature activation+1.24e-02

Pos logit contributions
 satisfaction+0.023
 pleasure+0.023
 accomplishment+0.022
 fascination+0.021
 delight+0.021
Neg logit contributions
 parked-0.024
 driving-0.023
 stole-0.022
 smoking-0.022
 knocked-0.021
Token we
Token ID356
Feature activation+1.09e-02

Pos logit contributions
 satisfaction+0.029
 accomplishment+0.029
 pleasure+0.029
 fascination+0.028
 delight+0.027
Neg logit contributions
 parked-0.015
 stole-0.014
 driving-0.014
 smoking-0.014
 knocked-0.013
Token did
Token ID750
Feature activation-2.64e-03

Pos logit contributions
 satisfaction+0.023
 pleasure+0.023
 accomplishment+0.023
 fascination+0.021
 delight+0.021
Neg logit contributions
 parked-0.016
 stole-0.015
 driving-0.014
 smoking-0.014
 broke-0.014
Token it
Token ID340
Feature activation-1.35e-02

Pos logit contributions
 parked+0.004
 smoking+0.003
 driving+0.003
 stole+0.003
 disturbed+0.003
Neg logit contributions
 pleasure-0.006
 satisfaction-0.006
 accomplishment-0.006
 delight-0.005
 fascination-0.005
Token,
Token ID11
Feature activation+3.45e-02

Pos logit contributions
 satisfaction+0.042
 pleasure+0.040
 accomplishment+0.040
 fascination+0.039
 delight+0.037
Neg logit contributions
 driving-0.036
 smoking-0.036
 parked-0.036
 stole-0.034
 broke-0.033
Token Tommy
Token ID19919
Feature activation+1.23e-02

Pos logit contributions
 satisfaction+0.080
 pleasure+0.080
 accomplishment+0.079
 fascination+0.075
 accomplishments+0.074
Neg logit contributions
 parked-0.043
 smoking-0.038
 driving-0.038
 broke-0.035
 hurt-0.034
Token.
Token ID13
Feature activation+7.83e-03

Pos logit contributions
 satisfaction+0.030
 pleasure+0.030
 accomplishment+0.030
 fascination+0.029
 delight+0.029
Neg logit contributions
 parked-0.013
 stole-0.012
 smoking-0.012
 driving-0.012
 broke-0.012
Token I
Token ID314
Feature activation+1.99e-02

Pos logit contributions
 satisfaction+0.020
 accomplishment+0.020
 pleasure+0.020
 fascination+0.019
 delight+0.019
Neg logit contributions
 parked-0.007
 stole-0.006
 smoking-0.006
 driving-0.006
 broke-0.006
Token didn
Token ID1422
Feature activation-8.30e-02

Pos logit contributions
 accomplishment+0.044
 satisfaction+0.044
 pleasure+0.044
 delight+0.042
 fascination+0.041
Neg logit contributions
 stole-0.027
 parked-0.027
 broke-0.026
 disturbed-0.026
 knocked-0.025
Token't
Token ID470
Feature activation-4.22e-03

Pos logit contributions
 stole+0.129
chard+0.126
 parked+0.121
 blocked+0.120
 guarded+0.116
Neg logit contributions
 accomplishment-0.137
 delight-0.135
 sense-0.123
 pleasure-0.121
 satisfaction-0.119
Token mean
Token ID1612
Feature activation-4.36e-03

Pos logit contributions
 parked+0.005
 driving+0.005
 smoking+0.005
 stole+0.005
 knocked+0.004
Neg logit contributions
 satisfaction-0.010
 pleasure-0.010
 accomplishment-0.009
 delight-0.009
 fascination-0.009
Token to
Token ID284
Feature activation-3.28e-02

Pos logit contributions
 parked+0.004
 driving+0.004
 stole+0.004
 smoking+0.004
 knocked+0.003
Neg logit contributions
 satisfaction-0.011
 pleasure-0.011
 accomplishment-0.011
 delight-0.011
 fascination-0.010
Token scare
Token ID19437
Feature activation-1.06e-03

Pos logit contributions
 parked+0.052
 driving+0.046
 blocking+0.044
cher+0.044
 smoking+0.043
Neg logit contributions
 delight-0.061
 difference-0.061
 satisfaction-0.060
 pleasure-0.059
 accomplishment-0.057
Token you
Token ID345
Feature activation-2.51e-02

Pos logit contributions
 parked+0.002
 driving+0.001
 stole+0.001
 smoking+0.001
 knocked+0.001
Neg logit contributions
 satisfaction-0.002
 pleasure-0.002
 accomplishment-0.002
 delight-0.002
 fascination-0.002
Token.
Token ID13
Feature activation-5.16e-04

Pos logit contributions
 parked+0.041
 stole+0.038
 smoking+0.038
 driving+0.037
 CA+0.036
Neg logit contributions
 pleasure-0.047
 satisfaction-0.045
 accomplishment-0.045
 delight-0.044
 difference-0.043
Token.
Token ID13
Feature activation-6.15e-02

Pos logit contributions
 parked+0.082
 stole+0.075
saw+0.074
 CA+0.074
Ro+0.073
Neg logit contributions
 accomplishment-0.065
 pleasure-0.064
 delight-0.062
 satisfaction-0.062
 enthusiasm-0.056
Token "
Token ID366
Feature activation-1.20e-02

Pos logit contributions
angs+0.129
 stole+0.126
through+0.125
glers+0.125
saw+0.124
Neg logit contributions
 pleasure-0.067
 delight-0.065
 satisfaction-0.059
 nod-0.057
 accomplishment-0.056
TokenDo
Token ID5211
Feature activation-1.65e-03

Pos logit contributions
 parked+0.014
 driving+0.013
 smoking+0.013
 stole+0.013
 knocked+0.012
Neg logit contributions
 satisfaction-0.028
 pleasure-0.028
 fascination-0.027
 delight-0.027
 accomplishment-0.027
Token you
Token ID345
Feature activation+7.59e-03

Pos logit contributions
 parked+0.002
 smoking+0.002
 driving+0.002
 stole+0.002
 knocked+0.002
Neg logit contributions
 satisfaction-0.004
 pleasure-0.004
 accomplishment-0.003
 delight-0.003
 fascination-0.003
Token want
Token ID765
Feature activation-5.52e-02

Pos logit contributions
 satisfaction+0.013
 pleasure+0.013
 accomplishment+0.013
 delight+0.013
 fascination+0.012
Neg logit contributions
 parked-0.013
 stole-0.013
 driving-0.013
 smoking-0.012
 disturbed-0.012
Token to
Token ID284
Feature activation-4.04e-02

Pos logit contributions
 parked+0.090
saw+0.083
 blocked+0.082
 stole+0.080
 driving+0.080
Neg logit contributions
 pleasure-0.098
 satisfaction-0.093
 joy-0.092
 delight-0.091
 happiness-0.087
Token play
Token ID711
Feature activation-4.57e-02

Pos logit contributions
 parked+0.070
 blocking+0.070
 smoking+0.069
 blocked+0.067
 stole+0.067
Neg logit contributions
 pleasure-0.070
 delight-0.069
 satisfaction-0.068
 accomplishment-0.064
 difference-0.060
Token with
Token ID351
Feature activation-3.24e-02

Pos logit contributions
 parked+0.079
 stole+0.075
 knocked+0.075
 smoking+0.071
 driving+0.071
Neg logit contributions
 pleasure-0.076
 delight-0.075
 satisfaction-0.074
 accomplishment-0.072
 difference-0.068
Token me
Token ID502
Feature activation-2.51e-02

Pos logit contributions
 parked+0.072
 driving+0.069
 driven+0.068
 stole+0.067
 knocked+0.067
Neg logit contributions
 delight-0.037
 satisfaction-0.036
 pleasure-0.035
 goodness-0.033
 accomplishment-0.032
Token in
Token ID287
Feature activation-2.48e-02

Pos logit contributions
 parked+0.028
 stole+0.027
 smoking+0.026
 driving+0.026
 knocked+0.026
Neg logit contributions
 pleasure-0.057
 satisfaction-0.055
 accomplishment-0.054
 delight-0.052
 difference-0.051
Token the
Token ID262
Feature activation-7.83e-03

Pos logit contributions
 parked+0.044
 stole+0.043
 knocked+0.041
 driving+0.041
 smoking+0.040
Neg logit contributions
 delight-0.041
 satisfaction-0.041
 pleasure-0.041
 accomplishment-0.039
 amaz-0.038
Token the
Token ID262
Feature activation-1.28e-02

Pos logit contributions
 parked+0.023
 stole+0.022
 smoking+0.021
 driving+0.021
 disturbed+0.021
Neg logit contributions
 pleasure-0.028
 accomplishment-0.028
 satisfaction-0.028
 delight-0.026
 fascination-0.026
Token boat
Token ID8848
Feature activation-3.36e-02

Pos logit contributions
 parked+0.019
 stole+0.018
 smoking+0.018
 driving+0.017
 disturbed+0.017
Neg logit contributions
 satisfaction-0.026
 pleasure-0.026
 accomplishment-0.026
 delight-0.025
 fascination-0.024
Token too
Token ID1165
Feature activation+2.06e-03

Pos logit contributions
 stole+0.055
 parked+0.054
 smoking+0.053
Ro+0.053
 CA+0.053
Neg logit contributions
 accomplishment-0.056
 pleasure-0.054
 satisfaction-0.053
 difference-0.052
 decision-0.052
Token close
Token ID1969
Feature activation+5.43e-03

Pos logit contributions
 satisfaction+0.005
 pleasure+0.005
 accomplishment+0.004
 delight+0.004
 fascination+0.004
Neg logit contributions
 parked-0.003
 driving-0.003
 smoking-0.003
 stole-0.003
 disturbed-0.002
Token to
Token ID284
Feature activation+4.62e-03

Pos logit contributions
 satisfaction+0.014
 pleasure+0.014
 accomplishment+0.013
 delight+0.013
 fascination+0.013
Neg logit contributions
 parked-0.006
 driving-0.005
 smoking-0.005
 stole-0.005
 knocked-0.005
Token the
Token ID262
Feature activation-1.49e-02

Pos logit contributions
 satisfaction+0.009
 pleasure+0.009
 accomplishment+0.009
 delight+0.009
 fascination+0.009
Neg logit contributions
 parked-0.007
 driving-0.007
 smoking-0.007
 stole-0.007
 knocked-0.006
Token cliff
Token ID19516
Feature activation-2.64e-02

Pos logit contributions
 parked+0.023
 stole+0.022
 driving+0.021
 smoking+0.021
 disturbed+0.021
Neg logit contributions
 satisfaction-0.030
 pleasure-0.030
 accomplishment-0.029
 delight-0.028
 fascination-0.027
Token.
Token ID13
Feature activation-8.95e-03

Pos logit contributions
 parked+0.040
 stole+0.039
 smoking+0.037
 driving+0.037
 disturbed+0.036
Neg logit contributions
 accomplishment-0.050
 satisfaction-0.049
 pleasure-0.049
 delight-0.047
 fascination-0.046
Token They
Token ID1119
Feature activation-1.05e-02

Pos logit contributions
 parked+0.015
 stole+0.014
 driving+0.014
 smoking+0.014
 disturbed+0.013
Neg logit contributions
 pleasure-0.017
 satisfaction-0.017
 accomplishment-0.016
 delight-0.016
 fascination-0.015
Token were
Token ID547
Feature activation-3.17e-02

Pos logit contributions
 parked+0.014
 driving+0.014
 smoking+0.014
 stole+0.013
 blocking+0.012
Neg logit contributions
 satisfaction-0.022
 pleasure-0.022
 accomplishment-0.021
 delight-0.021
 fascination-0.021
Token so
Token ID523
Feature activation-2.37e-03

Pos logit contributions
 stole+0.048
 CA+0.046
Ro+0.045
saw+0.043
 parked+0.043
Neg logit contributions
 accomplishment-0.063
 satisfaction-0.062
 pleasure-0.061
 delight-0.059
 fascination-0.058
Token.
Token ID13
Feature activation+1.40e-03

Pos logit contributions
 parked+0.031
 stole+0.031
 smoking+0.029
 driving+0.029
 disturbed+0.028
Neg logit contributions
 accomplishment-0.040
 satisfaction-0.039
 pleasure-0.038
 delight-0.037
 difference-0.035
Token They
Token ID1119
Feature activation-1.24e-02

Pos logit contributions
 satisfaction+0.003
 pleasure+0.003
 accomplishment+0.003
 delight+0.003
 fascination+0.003
Neg logit contributions
 parked-0.002
 driving-0.002
 stole-0.002
 smoking-0.002
 disturbed-0.002
Token did
Token ID750
Feature activation+1.26e-02

Pos logit contributions
 parked+0.016
 driving+0.015
 smoking+0.015
 stole+0.014
 CA+0.014
Neg logit contributions
 satisfaction-0.027
 pleasure-0.027
 accomplishment-0.026
 fascination-0.026
 delight-0.026
Token not
Token ID407
Feature activation-2.23e-03

Pos logit contributions
 satisfaction+0.034
 pleasure+0.034
 accomplishment+0.032
 fascination+0.032
 delight+0.032
Neg logit contributions
 parked-0.011
 driving-0.010
 smoking-0.010
 stole-0.010
 knocked-0.009
Token mind
Token ID2000
Feature activation-7.70e-03

Pos logit contributions
 parked+0.004
 driving+0.004
 smoking+0.004
 stole+0.004
 knocked+0.004
Neg logit contributions
 satisfaction-0.004
 pleasure-0.004
 accomplishment-0.004
 delight-0.003
 fascination-0.003
Token getting
Token ID1972
Feature activation-2.95e-02

Pos logit contributions
 parked+0.012
 stole+0.011
 driving+0.011
 smoking+0.011
 broke+0.011
Neg logit contributions
 satisfaction-0.015
 pleasure-0.015
 accomplishment-0.015
 delight-0.014
 fascination-0.014
Token filthy
Token ID35677
Feature activation-3.23e-03

Pos logit contributions
 parked+0.046
 CA+0.041
 driving+0.041
 stole+0.040
Ro+0.039
Neg logit contributions
 satisfaction-0.057
 accomplishment-0.057
 pleasure-0.056
 delight-0.054
 fascination-0.051
Token.
Token ID13
Feature activation-7.16e-04

Pos logit contributions
 parked+0.005
 stole+0.004
 driving+0.004
 smoking+0.004
 knocked+0.004
Neg logit contributions
 satisfaction-0.007
 pleasure-0.007
 accomplishment-0.007
 delight-0.006
 fascination-0.006
Token They
Token ID1119
Feature activation-1.38e-02

Pos logit contributions
 parked+0.001
 stole+0.001
 driving+0.001
 smoking+0.001
 disturbed+0.001
Neg logit contributions
 pleasure-0.001
 satisfaction-0.001
 accomplishment-0.001
 delight-0.001
 fascination-0.001
Token had
Token ID550
Feature activation-6.85e-03

Pos logit contributions
 parked+0.019
 driving+0.019
 smoking+0.019
 stole+0.017
 CA+0.017
Neg logit contributions
 satisfaction-0.029
 pleasure-0.028
 accomplishment-0.027
 fascination-0.027
 delight-0.027
Token fun
Token ID1257
Feature activation-4.38e-03

Pos logit contributions
 parked+0.009
 driving+0.008
 smoking+0.008
 stole+0.008
 disturbed+0.007
Neg logit contributions
 pleasure-0.016
 satisfaction-0.015
 accomplishment-0.015
 delight-0.015
 fascination-0.014
Token named
Token ID3706
Feature activation-1.35e-02

Pos logit contributions
 parked+0.021
 stole+0.018
 driving+0.018
 smoking+0.017
 blocked+0.017
Neg logit contributions
 satisfaction-0.055
 pleasure-0.053
 accomplishment-0.052
 delight-0.052
 fascination-0.051
Token Lily
Token ID20037
Feature activation-1.07e-02

Pos logit contributions
 parked+0.020
 driving+0.019
 smoking+0.018
 stole+0.018
 blocking+0.017
Neg logit contributions
 pleasure-0.029
 satisfaction-0.028
 accomplishment-0.027
 delight-0.027
 fascination-0.026
Token who
Token ID508
Feature activation+1.98e-03

Pos logit contributions
 parked+0.015
 stole+0.014
 smoking+0.014
 driving+0.014
 disturbed+0.013
Neg logit contributions
 satisfaction-0.022
 pleasure-0.022
 accomplishment-0.021
 fascination-0.021
 delight-0.020
Token loved
Token ID6151
Feature activation-5.75e-02

Pos logit contributions
 satisfaction+0.004
 pleasure+0.004
 accomplishment+0.004
 delight+0.003
 fascination+0.003
Neg logit contributions
 parked-0.003
 driving-0.003
 smoking-0.003
 stole-0.003
 disturbed-0.003
Token to
Token ID284
Feature activation-2.34e-02

Pos logit contributions
 parked+0.090
cher+0.089
 blocked+0.084
 stole+0.083
 exploded+0.081
Neg logit contributions
 pleasure-0.104
 satisfaction-0.102
 delight-0.101
 accomplishment-0.101
 joy-0.093
Token play
Token ID711
Feature activation-3.10e-02

Pos logit contributions
 parked+0.038
 blocking+0.035
 stole+0.035
 smoking+0.034
 blocked+0.034
Neg logit contributions
 pleasure-0.046
 satisfaction-0.045
 delight-0.044
 accomplishment-0.043
 fascination-0.041
Token with
Token ID351
Feature activation-3.97e-02

Pos logit contributions
 parked+0.048
 stole+0.045
 smoking+0.042
 knocked+0.042
 driving+0.041
Neg logit contributions
 delight-0.059
 satisfaction-0.057
 accomplishment-0.057
 pleasure-0.056
 goodness-0.054
Token water
Token ID1660
Feature activation-3.60e-02

Pos logit contributions
 parked+0.089
cher+0.085
 stole+0.085
 blocked+0.080
 driven+0.080
Neg logit contributions
 delight-0.050
 accomplishment-0.048
 satisfaction-0.047
 fascination-0.043
 enthusiasm-0.042
Token.
Token ID13
Feature activation+1.48e-02

Pos logit contributions
 parked+0.063
 stole+0.060
 driving+0.058
 knocked+0.057
 driven+0.057
Neg logit contributions
 satisfaction-0.057
 accomplishment-0.057
 pleasure-0.057
 delight-0.057
 difference-0.055
Token She
Token ID1375
Feature activation-4.98e-03

Pos logit contributions
 satisfaction+0.032
 accomplishment+0.032
 pleasure+0.031
 fascination+0.030
 delight+0.030
Neg logit contributions
 parked-0.021
 driving-0.019
 smoking-0.019
 stole-0.019
 broke-0.018
Token liked
Token ID8288
Feature activation-2.66e-02

Pos logit contributions
 parked+0.007
 smoking+0.006
 driving+0.006
 stole+0.006
 disturbed+0.006
Neg logit contributions
 satisfaction-0.011
 pleasure-0.011
 accomplishment-0.010
 delight-0.010
 fascination-0.010
Token family
Token ID1641
Feature activation-3.61e-02

Pos logit contributions
 satisfaction+0.042
 pleasure+0.041
 accomplishment+0.041
 fascination+0.039
 delight+0.039
Neg logit contributions
 parked-0.030
 driving-0.028
 smoking-0.028
 stole-0.027
 knocked-0.026
Token.
Token ID13
Feature activation-2.63e-02

Pos logit contributions
 parked+0.055
 blocked+0.055
 stole+0.055
 smoking+0.054
angs+0.054
Neg logit contributions
 satisfaction-0.064
 pleasure-0.063
 accomplishment-0.059
 delight-0.059
 difference-0.057
Token They
Token ID1119
Feature activation-3.58e-02

Pos logit contributions
 parked+0.048
 stole+0.047
 smoking+0.047
 driving+0.047
cher+0.046
Neg logit contributions
 pleasure-0.044
 satisfaction-0.043
 delight-0.041
 fascination-0.038
 amaz-0.037
Token lived
Token ID5615
Feature activation-3.10e-02

Pos logit contributions
 smoking+0.064
saw+0.060
 blocking+0.060
through+0.059
 driving+0.059
Neg logit contributions
 satisfaction-0.060
 pleasure-0.058
 delight-0.053
 accomplishment-0.051
 fascination-0.051
Token in
Token ID287
Feature activation+4.05e-03

Pos logit contributions
 stole+0.041
 parked+0.040
 smoking+0.039
 CA+0.038
saw+0.037
Neg logit contributions
 satisfaction-0.064
 pleasure-0.064
 accomplishment-0.063
 fascination-0.059
 delight-0.058
Token a
Token ID257
Feature activation-1.30e-02

Pos logit contributions
 satisfaction+0.010
 pleasure+0.010
 accomplishment+0.010
 delight+0.009
 fascination+0.009
Neg logit contributions
 parked-0.004
 driving-0.004
 smoking-0.004
 stole-0.004
 disturbed-0.004
Token big
Token ID1263
Feature activation+4.40e-02

Pos logit contributions
 parked+0.022
 driving+0.020
 stole+0.020
 smoking+0.020
 disturbed+0.019
Neg logit contributions
 pleasure-0.024
 satisfaction-0.024
 accomplishment-0.023
 delight-0.023
 fascination-0.022
Token house
Token ID2156
Feature activation-2.57e-02

Pos logit contributions
 satisfaction+0.102
 accomplishment+0.101
 pleasure+0.100
 fascination+0.094
 accomplishments+0.092
Neg logit contributions
 parked-0.056
 driving-0.052
 smoking-0.052
 stole-0.050
 broke-0.049
Token and
Token ID290
Feature activation-2.20e-02

Pos logit contributions
 parked+0.038
 stole+0.037
 CA+0.037
 smoking+0.036
 disturbed+0.035
Neg logit contributions
 satisfaction-0.049
 pleasure-0.048
 accomplishment-0.048
 delight-0.045
 fascination-0.044
Token loved
Token ID6151
Feature activation-3.87e-02

Pos logit contributions
 parked+0.034
 stole+0.034
 smoking+0.034
 driving+0.033
 blocking+0.033
Neg logit contributions
 satisfaction-0.041
 pleasure-0.040
 accomplishment-0.038
 delight-0.037
 fascination-0.036
Token to
Token ID284
Feature activation-1.63e-02

Pos logit contributions
 parked+0.053
 stole+0.051
 broke+0.048
 blocked+0.047
 CA+0.046
Neg logit contributions
 pleasure-0.084
 satisfaction-0.080
 accomplishment-0.080
 delight-0.076
 fascination-0.072
Token,
Token ID11
Feature activation+6.73e-03

Pos logit contributions
 satisfaction+0.002
 pleasure+0.002
 accomplishment+0.002
 delight+0.001
 fascination+0.001
Neg logit contributions
 parked-0.002
 driving-0.002
 stole-0.002
 smoking-0.002
 disturbed-0.002
Token they
Token ID484
Feature activation-3.14e-02

Pos logit contributions
 satisfaction+0.016
 pleasure+0.016
 accomplishment+0.016
 fascination+0.015
 delight+0.015
Neg logit contributions
 parked-0.008
 smoking-0.007
 driving-0.007
 stole-0.007
 disturbed-0.007
Token found
Token ID1043
Feature activation-2.10e-02

Pos logit contributions
 driving+0.046
 smoking+0.045
 blocking+0.043
 CA+0.043
 guarding+0.043
Neg logit contributions
 satisfaction-0.058
 delight-0.054
 pleasure-0.054
 fascination-0.054
 accomplishment-0.051
Token a
Token ID257
Feature activation-2.71e-02

Pos logit contributions
 parked+0.032
 stole+0.030
 driving+0.029
 smoking+0.028
 guarding+0.028
Neg logit contributions
 satisfaction-0.043
 pleasure-0.043
 accomplishment-0.041
 delight-0.041
 fascination-0.039
Token new
Token ID649
Feature activation-3.84e-03

Pos logit contributions
 parked+0.048
 driving+0.044
 stole+0.044
 smoking+0.043
 disturbed+0.041
Neg logit contributions
 pleasure-0.048
 accomplishment-0.046
 satisfaction-0.046
 fascination-0.044
 delight-0.044
Token box
Token ID3091
Feature activation-7.53e-02

Pos logit contributions
 parked+0.005
 stole+0.005
 smoking+0.005
 driving+0.005
 disturbed+0.004
Neg logit contributions
 pleasure-0.009
 satisfaction-0.009
 accomplishment-0.008
 delight-0.008
 fascination-0.008
Token of
Token ID286
Feature activation-9.72e-03

Pos logit contributions
Ro+0.140
 CA+0.131
 villains+0.128
 stole+0.124
 Crash+0.122
Neg logit contributions
 accomplishment-0.116
 satisfaction-0.101
 pleasure-0.096
 delight-0.095
 decision-0.094
Token blocks
Token ID7021
Feature activation-6.94e-02

Pos logit contributions
 parked+0.013
 stole+0.012
 smoking+0.012
 driving+0.012
 knocked+0.011
Neg logit contributions
 satisfaction-0.022
 pleasure-0.021
 accomplishment-0.021
 delight-0.021
 fascination-0.020
Token in
Token ID287
Feature activation-3.09e-03

Pos logit contributions
Ro+0.154
 CA+0.135
 villains+0.134
through+0.131
RO+0.128
Neg logit contributions
 accomplishment-0.085
 until-0.070
 satisfaction-0.069
 decision-0.069
 pleasure-0.068
Token the
Token ID262
Feature activation-5.22e-03

Pos logit contributions
 parked+0.005
 stole+0.004
 driving+0.004
 smoking+0.004
 disturbed+0.004
Neg logit contributions
 satisfaction-0.006
 pleasure-0.006
 accomplishment-0.006
 delight-0.006
 fascination-0.006
Token play
Token ID711
Feature activation-6.82e-02

Pos logit contributions
 parked+0.008
 driving+0.008
 smoking+0.008
 stole+0.008
 disturbed+0.007
Neg logit contributions
 satisfaction-0.010
 pleasure-0.010
 accomplishment-0.010
 delight-0.010
 fascination-0.010
Token.
Token ID13
Feature activation-1.83e-02

Pos logit contributions
 smoking+0.067
 parked+0.067
 driving+0.063
 stole+0.063
through+0.060
Neg logit contributions
 pleasure-0.077
 satisfaction-0.076
 accomplishment-0.075
 delight-0.071
 difference-0.070
Token We
Token ID775
Feature activation-5.11e-03

Pos logit contributions
 parked+0.030
 stole+0.030
 driving+0.029
 smoking+0.029
 disturbed+0.028
Neg logit contributions
 pleasure-0.034
 satisfaction-0.033
 delight-0.031
 accomplishment-0.031
 fascination-0.030
Token could
Token ID714
Feature activation+3.74e-03

Pos logit contributions
 parked+0.008
 smoking+0.007
 driving+0.007
 stole+0.007
 disturbed+0.007
Neg logit contributions
 satisfaction-0.010
 pleasure-0.010
 accomplishment-0.010
 delight-0.009
 fascination-0.009
Token get
Token ID651
Feature activation-3.61e-02

Pos logit contributions
 satisfaction+0.008
 pleasure+0.008
 accomplishment+0.008
 delight+0.008
 fascination+0.008
Neg logit contributions
 parked-0.005
 driving-0.005
 stole-0.005
 smoking-0.005
 knocked-0.005
Token hurt
Token ID5938
Feature activation-2.30e-03

Pos logit contributions
 parked+0.053
 CA+0.049
 stole+0.049
 guarding+0.048
 smoking+0.048
Neg logit contributions
 pleasure-0.070
 satisfaction-0.070
 accomplishment-0.067
 delight-0.064
 joy-0.062
Token or
Token ID393
Feature activation-7.04e-02

Pos logit contributions
 parked+0.003
 stole+0.003
 driving+0.003
 smoking+0.003
 knocked+0.003
Neg logit contributions
 satisfaction-0.005
 pleasure-0.005
 accomplishment-0.005
 delight-0.005
 fascination-0.004
Token lost
Token ID2626
Feature activation+3.24e-03

Pos logit contributions
anes+0.098
 CA+0.091
 parked+0.091
saw+0.084
 driving+0.083
Neg logit contributions
 pleasure-0.132
 satisfaction-0.131
 accomplishment-0.126
 delight-0.121
 joy-0.120
Token."
Token ID526
Feature activation+2.77e-03

Pos logit contributions
 satisfaction+0.007
 pleasure+0.007
 accomplishment+0.007
 delight+0.007
 fascination+0.007
Neg logit contributions
 parked-0.004
 driving-0.004
 smoking-0.004
 stole-0.004
 disturbed-0.003
Token\n
Token ID198
Feature activation-1.42e-02

Pos logit contributions
 satisfaction+0.006
 pleasure+0.006
 accomplishment+0.006
 delight+0.005
 fascination+0.005
Neg logit contributions
 parked-0.004
 driving-0.004
 smoking-0.004
 stole-0.004
 knocked-0.004
Token\n
Token ID198
Feature activation-1.27e-02

Pos logit contributions
 parked+0.025
 stole+0.024
 smoking+0.023
 driving+0.023
 disturbed+0.022
Neg logit contributions
 satisfaction-0.024
 pleasure-0.024
 accomplishment-0.024
 delight-0.023
 fascination-0.022
TokenTim
Token ID14967
Feature activation+4.23e-02

Pos logit contributions
 parked+0.021
 stole+0.020
 driving+0.020
 smoking+0.019
 disturbed+0.019
Neg logit contributions
 satisfaction-0.023
 accomplishment-0.023
 pleasure-0.022
 delight-0.022
 fascination-0.021
Token hit
Token ID2277
Feature activation-2.66e-02

Pos logit contributions
anes+0.098
 smoking+0.092
 driving+0.091
 Enter+0.090
 Danger+0.088
Neg logit contributions
 delight-0.143
 satisfaction-0.141
 accomplishment-0.139
 disappointment-0.130
 decision-0.128
Token a
Token ID257
Feature activation-2.19e-02

Pos logit contributions
 parked+0.039
 stole+0.039
 smoking+0.037
 driving+0.036
cher+0.036
Neg logit contributions
 satisfaction-0.052
 accomplishment-0.050
 pleasure-0.050
 delight-0.049
 fascination-0.047
Token tree
Token ID5509
Feature activation-6.56e-02

Pos logit contributions
 parked+0.035
 driving+0.033
 smoking+0.032
 stole+0.032
 disturbed+0.031
Neg logit contributions
 satisfaction-0.042
 pleasure-0.040
 accomplishment-0.040
 delight-0.039
 fascination-0.038
Token.
Token ID13
Feature activation-2.64e-02

Pos logit contributions
anes+0.112
 CA+0.107
 stole+0.107
 parked+0.107
cher+0.106
Neg logit contributions
 accomplishment-0.098
 satisfaction-0.091
 decision-0.091
 delight-0.089
 fascination-0.085
Token The
Token ID383
Feature activation-1.77e-02

Pos logit contributions
 parked+0.047
 stole+0.045
 driving+0.044
 smoking+0.044
cher+0.043
Neg logit contributions
 pleasure-0.043
 satisfaction-0.042
 delight-0.041
 accomplishment-0.040
 fascination-0.037
Token ball
Token ID2613
Feature activation-7.11e-02

Pos logit contributions
 parked+0.028
 stole+0.026
 smoking+0.026
 driving+0.026
 knocked+0.025
Neg logit contributions
 satisfaction-0.035
 pleasure-0.034
 accomplishment-0.034
 delight-0.033
 fascination-0.031
Token broke
Token ID6265
Feature activation-1.34e-02

Pos logit contributions
 driving+0.075
 Enter+0.074
 smoking+0.073
 spies+0.072
anes+0.069
Neg logit contributions
 delight-0.142
 satisfaction-0.142
 accomplishment-0.141
 decision-0.131
 relief-0.129
Token into
Token ID656
Feature activation-9.49e-03

Pos logit contributions
 parked+0.024
 stole+0.023
 smoking+0.022
 driving+0.022
 disturbed+0.021
Neg logit contributions
 satisfaction-0.022
 pleasure-0.022
 accomplishment-0.022
 delight-0.021
 fascination-0.020
Token many
Token ID867
Feature activation+1.44e-02

Pos logit contributions
 parked+0.016
 stole+0.015
 driving+0.015
 smoking+0.014
 knocked+0.014
Neg logit contributions
 satisfaction-0.018
 pleasure-0.017
 accomplishment-0.017
 delight-0.017
 fascination-0.016
Token pieces
Token ID5207
Feature activation-1.79e-02

Pos logit contributions
 pleasure+0.031
 satisfaction+0.030
 accomplishment+0.029
 fascination+0.028
 delight+0.027
Neg logit contributions
 parked-0.021
 smoking-0.020
 driving-0.020
 stole-0.020
 broke-0.019
Token and
Token ID290
Feature activation-3.24e-02

Pos logit contributions
 parked+0.035
 stole+0.034
 smoking+0.033
 driving+0.033
Ro+0.032
Neg logit contributions
 accomplishment-0.026
 satisfaction-0.026
 pleasure-0.026
 delight-0.024
 fascination-0.023
Token didn
Token ID1422
Feature activation-4.82e-02

Pos logit contributions
 parked+0.029
 smoking+0.029
 CA+0.028
anes+0.028
 stole+0.027
Neg logit contributions
 pleasure-0.050
 accomplishment-0.049
 satisfaction-0.049
 delight-0.048
 fascination-0.046
Token't
Token ID470
Feature activation-3.94e-03

Pos logit contributions
 smoking+0.051
 stole+0.051
 parked+0.048
 disturbed+0.045
 guns+0.045
Neg logit contributions
 accomplishment-0.112
 pleasure-0.110
 delight-0.107
 satisfaction-0.106
 fascination-0.101
Token see
Token ID766
Feature activation-2.55e-02

Pos logit contributions
 parked+0.007
 smoking+0.007
 driving+0.007
 stole+0.007
 broke+0.006
Neg logit contributions
 satisfaction-0.007
 pleasure-0.007
 accomplishment-0.007
 delight-0.006
 fascination-0.006
Token the
Token ID262
Feature activation-8.49e-03

Pos logit contributions
 parked+0.041
 smoking+0.040
 stole+0.039
 disturbed+0.038
 broke+0.037
Neg logit contributions
 pleasure-0.048
 satisfaction-0.046
 delight-0.045
 accomplishment-0.045
 amaz-0.042
Token big
Token ID1263
Feature activation+9.49e-03

Pos logit contributions
 parked+0.012
 smoking+0.012
 stole+0.012
 driving+0.011
 knocked+0.011
Neg logit contributions
 pleasure-0.018
 satisfaction-0.018
 accomplishment-0.017
 delight-0.017
 fascination-0.016
Token hole
Token ID7604
Feature activation-7.00e-02

Pos logit contributions
 satisfaction+0.020
 pleasure+0.020
 accomplishment+0.020
 fascination+0.019
 delight+0.019
Neg logit contributions
 parked-0.014
 driving-0.013
 stole-0.013
 smoking-0.012
 broke-0.012
Token in
Token ID287
Feature activation-3.96e-03

Pos logit contributions
Ro+0.117
 CA+0.113
 stole+0.105
 villains+0.101
 smoking+0.100
Neg logit contributions
 accomplishment-0.116
 pleasure-0.106
 delight-0.102
 decision-0.101
 satisfaction-0.099
Token the
Token ID262
Feature activation+3.83e-03

Pos logit contributions
 parked+0.006
 stole+0.005
 smoking+0.005
 driving+0.005
 disturbed+0.005
Neg logit contributions
 satisfaction-0.008
 pleasure-0.008
 accomplishment-0.008
 delight-0.008
 fascination-0.008
Token ground
Token ID2323
Feature activation-5.16e-02

Pos logit contributions
 satisfaction+0.008
 pleasure+0.008
 accomplishment+0.007
 delight+0.007
 fascination+0.007
Neg logit contributions
 parked-0.006
 driving-0.006
 stole-0.006
 smoking-0.006
 broke-0.005
Token.
Token ID13
Feature activation-1.94e-02

Pos logit contributions
 CA+0.091
Ro+0.091
 stole+0.088
 parked+0.085
 smoking+0.084
Neg logit contributions
 accomplishment-0.074
 pleasure-0.071
 delight-0.070
 satisfaction-0.069
 fascination-0.068
Token He
Token ID679
Feature activation-9.82e-03

Pos logit contributions
 parked+0.035
 smoking+0.033
 driving+0.032
 stole+0.032
 blocking+0.031
Neg logit contributions
 pleasure-0.033
 satisfaction-0.031
 accomplishment-0.031
 delight-0.031
 fascination-0.029
Token was
Token ID373
Feature activation-6.70e-03

Pos logit contributions
 satisfaction+0.029
 pleasure+0.028
 accomplishment+0.028
 delight+0.027
 fascination+0.027
Neg logit contributions
 parked-0.015
 stole-0.013
 driving-0.013
 smoking-0.013
 broke-0.013
Token a
Token ID257
Feature activation+6.19e-03

Pos logit contributions
 parked+0.008
 stole+0.007
 smoking+0.007
 driving+0.007
 knocked+0.007
Neg logit contributions
 pleasure-0.016
 satisfaction-0.016
 accomplishment-0.015
 delight-0.015
 fascination-0.015
Token long
Token ID890
Feature activation+9.81e-03

Pos logit contributions
 satisfaction+0.010
 pleasure+0.010
 accomplishment+0.010
 delight+0.010
 fascination+0.010
Neg logit contributions
 parked-0.012
 driving-0.011
 smoking-0.011
 stole-0.011
 knocked-0.010
Token,
Token ID11
Feature activation-4.40e-03

Pos logit contributions
 satisfaction+0.020
 pleasure+0.020
 accomplishment+0.019
 delight+0.018
 fascination+0.018
Neg logit contributions
 parked-0.015
 driving-0.014
 smoking-0.014
 stole-0.014
 knocked-0.013
Token green
Token ID4077
Feature activation-1.22e-02

Pos logit contributions
 parked+0.007
 stole+0.006
 driving+0.006
 smoking+0.006
 knocked+0.006
Neg logit contributions
 satisfaction-0.009
 pleasure-0.009
 accomplishment-0.008
 delight-0.008
 fascination-0.008
Token snake
Token ID17522
Feature activation-7.20e-02

Pos logit contributions
 parked+0.018
 stole+0.016
 smoking+0.016
 driving+0.016
 disturbed+0.016
Neg logit contributions
 satisfaction-0.025
 pleasure-0.025
 accomplishment-0.024
 delight-0.024
 fascination-0.023
Token named
Token ID3706
Feature activation-2.47e-02

Pos logit contributions
 parked+0.108
 driven+0.102
saw+0.097
cher+0.096
 smoking+0.096
Neg logit contributions
 goodness-0.118
 delight-0.116
 satisfaction-0.116
 pleasure-0.112
 difference-0.111
Token Sam
Token ID3409
Feature activation-2.64e-02

Pos logit contributions
 parked+0.039
 driving+0.036
 stole+0.036
 smoking+0.036
 disturbed+0.035
Neg logit contributions
 pleasure-0.049
 satisfaction-0.046
 delight-0.045
 accomplishment-0.045
 fascination-0.044
Token.
Token ID13
Feature activation-6.99e-03

Pos logit contributions
 parked+0.042
 smoking+0.041
 stole+0.040
 driving+0.039
 disturbed+0.038
Neg logit contributions
 satisfaction-0.045
 pleasure-0.045
 accomplishment-0.043
 fascination-0.042
 delight-0.042
Token Sam
Token ID3409
Feature activation-6.38e-03

Pos logit contributions
 parked+0.008
 stole+0.007
 driving+0.007
 smoking+0.007
 disturbed+0.007
Neg logit contributions
 satisfaction-0.017
 pleasure-0.017
 accomplishment-0.016
 delight-0.016
 fascination-0.016
Token lived
Token ID5615
Feature activation-2.86e-02

Pos logit contributions
 parked+0.010
 driving+0.009
 smoking+0.009
 stole+0.009
 disturbed+0.008
Neg logit contributions
 satisfaction-0.013
 pleasure-0.012
 accomplishment-0.012
 delight-0.012
 fascination-0.012