TOP ACTIVATIONS
MAX = 0.214

to school he started to worry . He was scared to
in the village started to worry . What were they going
squeezed her hands together in worry and her lips were tight
his mom was filled with worry . She never stopped thinking
set and he started to worry . But then he spotted
saw something that made her worry . She noticed that some
. His mom noticed the worry on his face and said
saw something that made her worry . There was a big
, so he started to worry . Suddenly ,
home , he started to worry . What if he lost
, the seed started to worry . It wanted to grow
in the network started to worry ; they wanted everyone to
, and it made her worry . Suddenly ,
closer , he started to worry - he realised the
twins ' parents started to worry . They asked them what
lost and Lily started to worry . She looked everywhere for
and the farmer started to worry . It was getting late
, his friends started to worry . They knew the fruit
, and he began to worry . He had never been
it started moving towards the voice . When the

MIN ACTIVATIONS
MIN = -0.295

something new ! She thought av oc ados were the most
something new ! She thought av oc ados were the most
away . She learned that av oc ados are good for
girl who didn 't like av oc ados . She always
. Did you know that av oc ados are very healthy
on , whenever they wanted av oc ados , they would
a scary situation . She coll ed her the brav est
of fabric and buttons and rib bons . She cut a
ongo didn 't like wet av oc ados , so he
before . The av oc ados were shared with
. They had bells and rib bons . They made soft
wear hats and clips and rib bons . You can be
but he also knew cater pill ars were not pets .
Did you know that cater pill ars turn into beautiful butterflies
can sew some buttons or rib bons or patches . Would
When the coll age was finished , the
a sound like this : rib bit , rib bit ."
this : rib bit , rib bit ." L
They had different colors and rib bons . They were very
" R ib bit , rib bit ," the frog agrees

INTERVAL 0.087 - 0.214
CONTAINS 1.862%

the clown toy in the ground so it wouldn 't scare
So Andy asked his mom for a ticket
and find fun things . One day she was playing in
and were now the proud owners of the bright crown !
belonged to the gr umpy old bear who lived in the

INTERVAL -0.041 - 0.087
CONTAINS 84.441%

Lily to the bathroom and cleaned her wound . She put
She tries to pull the small dog away , but the
. The girl looked up and blinked . She
put the coffee into an old cup and handed it to
Grand pa showed him all the exciting things in the museum

INTERVAL -0.168 - -0.041
CONTAINS 13.678%

, but you can 't marry her ." Anna
." Tom and Lily said , " OK ,
section , but she still couldn 't find the caul iflower
. She was running around on the lush grass , enjoying
utter . Tweet was excited to play a

INTERVAL -0.295 - -0.168
CONTAINS 0.019%

" R ib bit , rib bit ," the frog agrees
tail which was decorated with rib bons . It was the
And I love milk and cookies . And I love you
me !" The l ibrarian then took Anne to
. I will bite a cookie instead ." She
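
NOTE ON THE INTERVALS
The four INTERVAL ranges above look like equal-width bins between MIN (-0.295) and MAX (0.214). That binning rule is an assumption; the dump itself does not state it. The sketch below recomputes the bin edges under that assumption and checks that the CONTAINS percentages sum to 100. The function name interval_edges is hypothetical, and because the reported MIN and MAX are themselves rounded, the edges only match to about three decimals.

```python
# Minimal sketch, assuming the dashboard uses equal-width bins from MIN to MAX.
# interval_edges is a hypothetical helper, not part of any dashboard library.

def interval_edges(lo, hi, n_bins):
    """Return (start, end) pairs for n_bins equal-width bins over [lo, hi]."""
    width = (hi - lo) / n_bins
    return [(lo + i * width, lo + (i + 1) * width) for i in range(n_bins)]

# Values copied from the MIN / MAX lines of the dump.
MIN_ACT = -0.295
MAX_ACT = 0.214

edges = interval_edges(MIN_ACT, MAX_ACT, 4)

# Share of tokens per bin, copied from the CONTAINS lines (highest bin first).
contains_pct = [1.862, 84.441, 13.678, 0.019]
total_pct = sum(contains_pct)

if __name__ == "__main__":
    # Print bins from highest to lowest, mirroring the order in the dump.
    for start, end in reversed(edges):
        print(f"INTERVAL {start:.3f} - {end:.3f}")
    print(f"TOTAL {total_pct:.3f}%")
```

Under this reading, almost all tokens (about 98%) fall in the two middle bins, and the strongly positive and strongly negative bins together hold under 2% of tokens.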
Token to
Token ID284
Feature activation+3.49e-02

Pos logit contributions
 adult+0.003
 worry+0.003
 explanation+0.003
 zone+0.003
 son+0.003
Neg logit contributions
Aud-0.003
 brace-0.002
 wrists-0.002
 fet-0.002
 sixteen-0.002
Token school
Token ID1524
Feature activation+7.07e-03

Pos logit contributions
 adult+0.057
 worry+0.056
 explanation+0.051
 son+0.050
urious+0.049
Neg logit contributions
Aud-0.064
 brace-0.062
 fet-0.058
 wrists-0.057
 sixteen-0.055
Token he
Token ID339
Feature activation+4.29e-02

Pos logit contributions
 adult+0.011
 worry+0.010
 explanation+0.009
 son+0.009
 zone+0.009
Neg logit contributions
Aud-0.014
 brace-0.013
 fet-0.013
 wrists-0.012
 grapes-0.012
Token started
Token ID2067
Feature activation+4.37e-02

Pos logit contributions
 worry+0.056
 adult+0.053
urious+0.047
 son+0.046
 zone+0.045
Neg logit contributions
Aud-0.095
 brace-0.093
 grapes-0.087
 wrists-0.086
 sixteen-0.084
Token to
Token ID284
Feature activation+4.17e-02

Pos logit contributions
 worry+0.078
 adult+0.077
 explanation+0.069
 son+0.067
 zone+0.066
Neg logit contributions
Aud-0.077
 brace-0.071
 wrists-0.067
 fet-0.066
 sixteen-0.065
Token worry
Token ID5490
Feature activation+2.12e-01

Pos logit contributions
 worry+0.052
 adult+0.049
 explanation+0.042
 son+0.040
urious+0.039
Neg logit contributions
Aud-0.096
 brace-0.092
 fet-0.089
 wrists-0.087
ip-0.087
Token.
Token ID13
Feature activation+2.60e-02

Pos logit contributions
 adult+0.324
 worry+0.316
 explanation+0.287
 son+0.284
 zone+0.274
Neg logit contributions
Aud-0.423
 brace-0.396
 fet-0.382
 wrists-0.374
 sixteen-0.360
Token He
Token ID679
Feature activation+4.25e-02

Pos logit contributions
 adult+0.041
 worry+0.040
 explanation+0.037
 son+0.036
urious+0.036
Neg logit contributions
Aud-0.049
 brace-0.048
 fet-0.045
 wrists-0.044
 grapes-0.042
Token was
Token ID373
Feature activation+2.82e-02

Pos logit contributions
 worry+0.052
 adult+0.051
 son+0.044
urious+0.043
 explanation+0.041
Neg logit contributions
Aud-0.098
 brace-0.094
 grapes-0.090
 sixteen-0.088
 wrists-0.087
Token scared
Token ID12008
Feature activation+6.79e-02

Pos logit contributions
 adult+0.040
 worry+0.039
 son+0.033
 explanation+0.032
 nurses+0.031
Neg logit contributions
Aud-0.060
 brace-0.060
 wrists-0.057
 fet-0.055
ip-0.054
Token to
Token ID284
Feature activation-5.91e-03

Pos logit contributions
 adult+0.116
 worry+0.111
 explanation+0.107
urious+0.107
 zone+0.102
Neg logit contributions
Aud-0.108
 brace-0.108
 fet-0.102
 pro-0.097
 grapes-0.096
Token in
Token ID287
Feature activation-3.83e-02

Pos logit contributions
 adult+0.116
 worry+0.109
urious+0.108
 zone+0.107
 explanation+0.107
Neg logit contributions
 fet-0.087
 brace-0.085
Aud-0.083
 pro-0.082
 scored-0.080
Token the
Token ID262
Feature activation+6.88e-03

Pos logit contributions
Aud+0.049
 brace+0.047
 fet+0.043
 wrists+0.042
 sixteen+0.040
Neg logit contributions
 adult-0.082
 worry-0.081
 explanation-0.076
urious-0.075
 son-0.074
Token village
Token ID7404
Feature activation+1.00e-01

Pos logit contributions
 adult+0.008
iness+0.008
 outage+0.007
 explanation+0.007
 worry+0.007
Neg logit contributions
 grape-0.014
 brace-0.014
Aud-0.014
 fet-0.014
 wrists-0.013
Token started
Token ID2067
Feature activation+2.60e-02

Pos logit contributions
 adult+0.151
 worry+0.145
 explanation+0.133
 zone+0.130
urious+0.130
Neg logit contributions
Aud-0.186
 brace-0.183
 fet-0.176
 wrists-0.172
 pro-0.166
Token to
Token ID284
Feature activation+3.46e-02

Pos logit contributions
 adult+0.053
 worry+0.052
 explanation+0.048
urious+0.047
 son+0.047
Neg logit contributions
Aud-0.037
 brace-0.035
 fet-0.033
 wrists-0.032
 sixteen-0.030
Token worry
Token ID5490
Feature activation+2.08e-01

Pos logit contributions
 adult+0.045
urious+0.040
 zone+0.040
iness+0.040
 explanation+0.038
Neg logit contributions
 brace-0.071
Aud-0.070
 fet-0.068
 wrists-0.065
 grape-0.062
Token.
Token ID13
Feature activation+2.35e-02

Pos logit contributions
 adult+0.323
 worry+0.316
 explanation+0.288
 son+0.282
 zone+0.272
Neg logit contributions
Aud-0.401
 brace-0.385
 fet-0.363
 wrists-0.359
 sixteen-0.344
Token What
Token ID1867
Feature activation-1.99e-02

Pos logit contributions
 adult+0.045
 worry+0.044
 explanation+0.041
 son+0.040
urious+0.040
Neg logit contributions
Aud-0.036
 brace-0.035
 fet-0.032
 wrists-0.031
ip-0.030
Token were
Token ID547
Feature activation-3.74e-02

Pos logit contributions
Aud+0.027
 fet+0.027
 brace+0.027
 scored+0.026
 pro+0.025
Neg logit contributions
 adult-0.039
 explanation-0.035
 worry-0.035
rated-0.034
 zone-0.034
Token they
Token ID484
Feature activation-2.26e-02

Pos logit contributions
Aud+0.070
 brace+0.066
 fet+0.063
 wrists+0.063
 sixteen+0.061
Neg logit contributions
 adult-0.061
 worry-0.059
 son-0.053
 explanation-0.053
urious-0.050
Token going
Token ID1016
Feature activation+1.97e-02

Pos logit contributions
Aud+0.034
 brace+0.033
 fet+0.030
 sixteen+0.030
 grapes+0.029
Neg logit contributions
 adult-0.042
 worry-0.038
 explanation-0.038
urious-0.037
 zone-0.036
Token squeezed
Token ID29650
Feature activation-1.63e-04

Pos logit contributions
Aud+0.020
 brace+0.020
 wrists+0.018
 sixteen+0.018
 grapes+0.018
Neg logit contributions
 worry-0.012
 adult-0.012
 son-0.010
urious-0.010
 explanation-0.010
Token her
Token ID607
Feature activation+7.64e-03

Pos logit contributions
Aud+0.000
 fet+0.000
 brace+0.000
 scored+0.000
 sixteen+0.000
Neg logit contributions
 adult-0.000
 worry-0.000
 son-0.000
 nurses-0.000
 explanation-0.000
Token hands
Token ID2832
Feature activation+4.81e-02

Pos logit contributions
 worry+0.012
 adult+0.012
 son+0.011
 explanation+0.010
 eye+0.010
Neg logit contributions
Aud-0.016
 brace-0.014
 sixteen-0.014
 fet-0.013
 pro-0.013
Token together
Token ID1978
Feature activation-7.11e-04

Pos logit contributions
 worry+0.074
 adult+0.068
 explanation+0.062
 son+0.062
iness+0.061
Neg logit contributions
Aud-0.099
 brace-0.091
 fet-0.084
 wrists-0.083
 sixteen-0.082
Token in
Token ID287
Feature activation-7.79e-03

Pos logit contributions
Aud+0.002
 brace+0.001
 fet+0.001
 sixteen+0.001
 dod+0.001
Neg logit contributions
 worry-0.001
 adult-0.001
 explanation-0.001
 son-0.001
iness-0.001
Token worry
Token ID5490
Feature activation+2.07e-01

Pos logit contributions
Aud+0.012
 fet+0.011
ip+0.011
 brace+0.011
 wrists+0.010
Neg logit contributions
 worry-0.016
 adult-0.015
 explanation-0.014
 son-0.013
 distress-0.013
Token and
Token ID290
Feature activation+2.30e-03

Pos logit contributions
 worry+0.281
 adult+0.276
 son+0.256
iness+0.252
 explanation+0.248
Neg logit contributions
Aud-0.469
 fet-0.416
 brace-0.413
 sixteen-0.392
 pro-0.384
Token her
Token ID607
Feature activation-3.43e-02

Pos logit contributions
 worry+0.004
 adult+0.003
 son+0.003
 explanation+0.003
 actions+0.003
Neg logit contributions
Aud-0.005
 brace-0.004
 sixteen-0.004
 fet-0.004
ip-0.004
Token lips
Token ID11914
Feature activation-9.73e-03

Pos logit contributions
Aud+0.073
 brace+0.067
 sixteen+0.065
 fet+0.065
br+0.062
Neg logit contributions
 worry-0.051
 adult-0.048
 son-0.046
 actions-0.044
 explanation-0.043
Token were
Token ID547
Feature activation+1.82e-02

Pos logit contributions
Aud+0.021
 brace+0.019
 grapes+0.018
 fet+0.018
 grape+0.018
Neg logit contributions
 worry-0.014
iness-0.012
 adult-0.012
 explanation-0.011
rated-0.011
Token tight
Token ID5381
Feature activation-4.02e-02

Pos logit contributions
 worry+0.029
 adult+0.025
 distress+0.022
iness+0.022
 why+0.021
Neg logit contributions
Aud-0.037
 fet-0.035
ip-0.035
con-0.034
 brace-0.034
Token his
Token ID465
Feature activation+2.20e-02

Pos logit contributions
 worry+0.019
 adult+0.019
 explanation+0.017
 son+0.017
urious+0.016
Neg logit contributions
Aud-0.025
 brace-0.024
 wrists-0.022
 sixteen-0.022
 fet-0.022
Token mom
Token ID1995
Feature activation+6.41e-02

Pos logit contributions
 adult+0.028
 worry+0.028
 son+0.024
 explanation+0.024
urious+0.022
Neg logit contributions
Aud-0.049
 brace-0.047
 fet-0.045
 wrists-0.044
 sixteen-0.044
Token was
Token ID373
Feature activation+2.29e-02

Pos logit contributions
 worry+0.088
 adult+0.085
 son+0.076
 explanation+0.073
 actions+0.067
Neg logit contributions
Aud-0.139
 brace-0.133
 wrists-0.126
 grapes-0.126
 sixteen-0.125
Token filled
Token ID5901
Feature activation+3.69e-03

Pos logit contributions
 worry+0.037
 adult+0.036
 explanation+0.032
 son+0.031
 distress+0.030
Neg logit contributions
 brace-0.043
Aud-0.042
 wrists-0.040
 fet-0.040
 grapes-0.039
Token with
Token ID351
Feature activation+2.60e-02

Pos logit contributions
 worry+0.006
 adult+0.006
 explanation+0.005
 son+0.005
urious+0.005
Neg logit contributions
Aud-0.007
 brace-0.006
br-0.006
ip-0.006
 sixteen-0.006
Token worry
Token ID5490
Feature activation+2.06e-01

Pos logit contributions
 worry+0.039
 adult+0.038
 son+0.034
 explanation+0.034
iness+0.031
Neg logit contributions
Aud-0.054
 brace-0.050
 fet-0.049
 wrists-0.048
ip-0.047
Token.
Token ID13
Feature activation+5.71e-03

Pos logit contributions
 worry+0.295
 adult+0.293
 explanation+0.263
 son+0.262
iness+0.254
Neg logit contributions
Aud-0.443
 brace-0.411
 fet-0.393
 wrists-0.389
 sixteen-0.372
Token She
Token ID1375
Feature activation-2.59e-02

Pos logit contributions
 adult+0.010
 worry+0.010
 explanation+0.009
urious+0.009
 son+0.009
Neg logit contributions
Aud-0.009
 brace-0.009
 fet-0.008
 wrists-0.008
 sixteen-0.008
Token never
Token ID1239
Feature activation-9.32e-03

Pos logit contributions
Aud+0.054
 brace+0.052
 grapes+0.049
 wrists+0.049
 sixteen+0.048
Neg logit contributions
 adult-0.037
 worry-0.036
 son-0.032
 explanation-0.031
urious-0.030
Token stopped
Token ID5025
Feature activation-4.33e-02

Pos logit contributions
Aud+0.016
 fet+0.015
 brace+0.015
 pro+0.014
 scored+0.014
Neg logit contributions
 adult-0.016
 explanation-0.015
 worry-0.015
urious-0.014
 son-0.014
Token thinking
Token ID3612
Feature activation+7.46e-02

Pos logit contributions
Aud+0.078
 brace+0.074
 fet+0.071
 wrists+0.070
 sixteen+0.068
Neg logit contributions
 worry-0.073
 adult-0.073
 explanation-0.064
 son-0.063
urious-0.062
Token set
Token ID900
Feature activation+6.79e-02

Pos logit contributions
 worry+0.008
 adult+0.008
 explanation+0.007
urious+0.007
 son+0.007
Neg logit contributions
Aud-0.009
 brace-0.009
 fet-0.008
 wrists-0.008
 grapes-0.008
Token and
Token ID290
Feature activation+4.08e-02

Pos logit contributions
 adult+0.114
 explanation+0.107
 son+0.105
 worry+0.104
 eye+0.098
Neg logit contributions
Aud-0.125
 wrists-0.107
 brace-0.107
 sixteen-0.103
 pro-0.102
Token he
Token ID339
Feature activation+1.88e-02

Pos logit contributions
 worry+0.062
 adult+0.057
 son+0.055
 explanation+0.053
 ig+0.050
Neg logit contributions
Aud-0.083
 brace-0.073
 sixteen-0.072
ip-0.071
 wrists-0.070
Token started
Token ID2067
Feature activation+2.21e-02

Pos logit contributions
 worry+0.021
 adult+0.020
 son+0.019
 anchor+0.017
 figure+0.017
Neg logit contributions
Aud-0.046
 brace-0.041
 grapes-0.040
 sixteen-0.039
 wrists-0.038
Token to
Token ID284
Feature activation+4.33e-02

Pos logit contributions
 worry+0.033
 adult+0.032
 explanation+0.028
 son+0.028
 ig+0.027
Neg logit contributions
Aud-0.048
 brace-0.042
Pink-0.039
 wrists-0.039
ip-0.039
Token worry
Token ID5490
Feature activation+2.06e-01

Pos logit contributions
 worry+0.064
 adult+0.054
 explanation+0.050
 son+0.048
 answer+0.045
Neg logit contributions
Aud-0.098
ip-0.091
 brace-0.088
 fet-0.086
 grapes-0.085
Token.
Token ID13
Feature activation+1.71e-02

Pos logit contributions
 adult+0.310
 worry+0.308
 explanation+0.277
 son+0.274
iness+0.262
Neg logit contributions
Aud-0.415
 brace-0.385
 fet-0.374
 wrists-0.365
 sixteen-0.351
Token But
Token ID887
Feature activation+5.34e-02

Pos logit contributions
 adult+0.025
 worry+0.023
 son+0.023
 explanation+0.021
urious+0.021
Neg logit contributions
 brace-0.036
Aud-0.035
 wrists-0.032
 fet-0.031
br-0.031
Token then
Token ID788
Feature activation-3.30e-02

Pos logit contributions
 worry+0.081
 adult+0.080
 son+0.073
 explanation+0.069
 zone+0.066
Neg logit contributions
Aud-0.110
 brace-0.105
 fet-0.098
 wrists-0.097
 sixteen-0.093
Token he
Token ID339
Feature activation-3.43e-02

Pos logit contributions
Aud+0.069
 brace+0.066
 wrists+0.064
 fet+0.063
 sixteen+0.060
Neg logit contributions
 worry-0.049
 adult-0.047
 son-0.043
 explanation-0.040
 zone-0.038
Token spotted
Token ID13489
Feature activation-5.46e-03

Pos logit contributions
Aud+0.077
 brace+0.075
 wrists+0.071
 fet+0.069
 grapes+0.069
Neg logit contributions
 adult-0.042
 worry-0.042
 son-0.036
urious-0.035
 explanation-0.034
Token saw
Token ID2497
Feature activation+3.13e-02

Pos logit contributions
Aud+0.080
 brace+0.078
 fet+0.075
 wrists+0.072
 pro+0.072
Neg logit contributions
 adult-0.063
 worry-0.060
 explanation-0.057
 zone-0.054
iness-0.054
Token something
Token ID1223
Feature activation+5.27e-02

Pos logit contributions
 adult+0.061
 worry+0.060
 explanation+0.055
 son+0.054
urious+0.054
Neg logit contributions
Aud-0.047
 brace-0.045
 fet-0.043
 wrists-0.041
 sixteen-0.040
Token that
Token ID326
Feature activation+6.47e-02

Pos logit contributions
 adult+0.090
 worry+0.088
 explanation+0.081
 son+0.079
urious+0.079
Neg logit contributions
Aud-0.091
 brace-0.087
 fet-0.083
 wrists-0.081
 sixteen-0.078
Token made
Token ID925
Feature activation+4.29e-02

Pos logit contributions
 worry+0.098
 adult+0.098
 explanation+0.087
urious+0.085
 son+0.085
Neg logit contributions
Aud-0.123
 brace-0.118
 fet-0.113
 wrists-0.112
 sixteen-0.108
Token her
Token ID607
Feature activation+7.68e-03

Pos logit contributions
 worry+0.085
 adult+0.084
 explanation+0.078
 son+0.075
 God+0.071
Neg logit contributions
Aud-0.067
 fet-0.062
 brace-0.061
 wrists-0.061
ip-0.058
Token worry
Token ID5490
Feature activation+2.05e-01

Pos logit contributions
 adult+0.012
 worry+0.011
 explanation+0.010
urious+0.010
 zone+0.010
Neg logit contributions
Aud-0.015
 brace-0.014
 fet-0.014
 wrists-0.014
 grapes-0.013
Token.
Token ID13
Feature activation+2.42e-02

Pos logit contributions
 adult+0.384
 worry+0.362
urious+0.349
 explanation+0.345
 zone+0.336
Neg logit contributions
 brace-0.307
Aud-0.296
 fet-0.279
 wrists-0.274
 grape-0.265
Token She
Token ID1375
Feature activation-1.25e-02

Pos logit contributions
 adult+0.043
 worry+0.042
 son+0.039
 explanation+0.038
urious+0.037
Neg logit contributions
Aud-0.040
 brace-0.039
 fet-0.037
ip-0.036
 sixteen-0.036
Token noticed
Token ID6810
Feature activation+2.34e-02

Pos logit contributions
Aud+0.024
 brace+0.024
 fet+0.022
 wrists+0.022
 sixteen+0.021
Neg logit contributions
 adult-0.019
 worry-0.018
 explanation-0.017
urious-0.016
 son-0.016
Token that
Token ID326
Feature activation+6.37e-02

Pos logit contributions
 adult+0.043
 worry+0.041
 son+0.038
 explanation+0.038
urious+0.036
Neg logit contributions
Aud-0.038
 brace-0.036
 fet-0.035
 wrists-0.034
 sixteen-0.033
Token some
Token ID617
Feature activation+1.66e-02

Pos logit contributions
 worry+0.122
 adult+0.117
 son+0.107
 explanation+0.105
 outsider+0.104
Neg logit contributions
Aud-0.102
 fet-0.096
 brace-0.094
 sixteen-0.092
 wrists-0.091
Token.
Token ID13
Feature activation+2.59e-02

Pos logit contributions
 worry+0.041
 adult+0.040
 son+0.037
 explanation+0.036
 zone+0.036
Neg logit contributions
Aud-0.066
 brace-0.061
 fet-0.058
 grape-0.057
 grapes-0.057
Token His
Token ID2399
Feature activation+1.02e-01

Pos logit contributions
 adult+0.048
 worry+0.047
 explanation+0.043
urious+0.043
 zone+0.042
Neg logit contributions
Aud-0.040
 brace-0.039
 fet-0.037
 wrists-0.036
 sixteen-0.035
Token mom
Token ID1995
Feature activation+6.22e-02

Pos logit contributions
 adult+0.142
 worry+0.138
 explanation+0.124
 son+0.124
urious+0.119
Neg logit contributions
Aud-0.213
 brace-0.205
 fet-0.196
 wrists-0.191
 sixteen-0.189
Token noticed
Token ID6810
Feature activation+2.78e-02

Pos logit contributions
 adult+0.089
 worry+0.088
 explanation+0.077
 son+0.077
urious+0.074
Neg logit contributions
Aud-0.126
 brace-0.123
 wrists-0.116
 fet-0.115
 sixteen-0.112
Token the
Token ID262
Feature activation+2.53e-03

Pos logit contributions
 adult+0.051
 worry+0.051
urious+0.047
 explanation+0.047
 zone+0.047
Neg logit contributions
Aud-0.042
 brace-0.041
 fet-0.038
 wrists-0.037
 sprink-0.037
Token worry
Token ID5490
Feature activation+2.04e-01

Pos logit contributions
 adult+0.004
 worry+0.004
 explanation+0.003
 son+0.003
 zone+0.003
Neg logit contributions
Aud-0.005
 brace-0.005
 fet-0.005
 wrists-0.005
 sixteen-0.005
Token on
Token ID319
Feature activation-1.18e-01

Pos logit contributions
 adult+0.315
 worry+0.307
 explanation+0.280
urious+0.280
 zone+0.270
Neg logit contributions
Aud-0.374
 brace-0.368
 fet-0.347
 wrists-0.340
 grapes-0.330
Token his
Token ID465
Feature activation+3.74e-02

Pos logit contributions
Aud+0.212
 brace+0.198
 fet+0.196
 wrists+0.184
 sixteen+0.175
Neg logit contributions
 adult-0.216
 worry-0.206
 son-0.191
 explanation-0.182
 zone-0.175
Token face
Token ID1986
Feature activation+6.58e-02

Pos logit contributions
 adult+0.053
 worry+0.049
urious+0.046
 zone+0.045
 explanation+0.045
Neg logit contributions
 brace-0.072
Aud-0.072
 wrists-0.068
 fet-0.068
 sixteen-0.064
Token and
Token ID290
Feature activation+6.02e-03

Pos logit contributions
 worry+0.094
 adult+0.093
 explanation+0.082
 son+0.082
 zone+0.080
Neg logit contributions
Aud-0.137
 brace-0.129
 fet-0.123
 wrists-0.122
 sixteen-0.118
Token said
Token ID531
Feature activation+2.27e-02

Pos logit contributions
 adult+0.009
 worry+0.009
 explanation+0.008
 son+0.008
urious+0.008
Neg logit contributions
Aud-0.011
 brace-0.011
 fet-0.010
 wrists-0.010
 sixteen-0.010
Token saw
Token ID2497
Feature activation+3.54e-02

Pos logit contributions
Aud+0.065
 brace+0.064
 fet+0.062
 wrists+0.059
 pro+0.059
Neg logit contributions
 adult-0.055
 worry-0.051
 explanation-0.049
iness-0.047
 zone-0.046
Token something
Token ID1223
Feature activation+5.31e-02

Pos logit contributions
 adult+0.069
 worry+0.067
 explanation+0.062
urious+0.061
 son+0.061
Neg logit contributions
Aud-0.053
 brace-0.051
 fet-0.048
 wrists-0.047
 sixteen-0.045
Token that
Token ID326
Feature activation+6.61e-02

Pos logit contributions
 adult+0.093
 worry+0.091
 explanation+0.083
 son+0.081
urious+0.081
Neg logit contributions
Aud-0.089
 brace-0.086
 fet-0.081
 wrists-0.080
 sixteen-0.077
Token made
Token ID925
Feature activation+3.87e-02

Pos logit contributions
 worry+0.099
 adult+0.096
 explanation+0.086
urious+0.084
 son+0.083
Neg logit contributions
Aud-0.129
 brace-0.123
 fet-0.118
 wrists-0.117
 sixteen-0.115
Token her
Token ID607
Feature activation+2.60e-03

Pos logit contributions
 adult+0.079
 worry+0.078
 explanation+0.073
 son+0.071
iness+0.068
Neg logit contributions
Aud-0.057
 brace-0.052
 fet-0.051
 wrists-0.051
 sixteen-0.048
Token worry
Token ID5490
Feature activation+2.03e-01

Pos logit contributions
 adult+0.004
 worry+0.004
 explanation+0.004
urious+0.003
 zone+0.003
Neg logit contributions
 brace-0.005
Aud-0.005
 fet-0.005
 wrists-0.005
 grape-0.004
Token.
Token ID13
Feature activation+1.89e-02

Pos logit contributions
 adult+0.375
 worry+0.355
urious+0.340
 explanation+0.337
 zone+0.325
Neg logit contributions
 brace-0.309
Aud-0.301
 fet-0.282
 wrists-0.280
 grapes-0.268
Token There
Token ID1318
Feature activation+7.36e-02

Pos logit contributions
 adult+0.035
 worry+0.034
 son+0.032
urious+0.031
 explanation+0.031
Neg logit contributions
Aud-0.031
 brace-0.030
 fet-0.027
 wrists-0.027
 sixteen-0.027
Token was
Token ID373
Feature activation+1.80e-02

Pos logit contributions
 adult+0.122
 worry+0.114
 explanation+0.109
urious+0.107
 zone+0.106
Neg logit contributions
 brace-0.127
Aud-0.126
 fet-0.116
 wrists-0.112
 grape-0.111
Token a
Token ID257
Feature activation-1.09e-02

Pos logit contributions
 adult+0.033
 worry+0.033
 explanation+0.029
 son+0.029
urious+0.028
Neg logit contributions
Aud-0.029
 brace-0.027
 fet-0.026
 wrists-0.026
ip-0.025
Token big
Token ID1263
Feature activation-2.19e-02

Pos logit contributions
Aud+0.022
 brace+0.020
 fet+0.019
 wrists+0.019
 sixteen+0.019
Neg logit contributions
 worry-0.017
 adult-0.017
 son-0.015
 explanation-0.014
urious-0.013
Token,
Token ID11
Feature activation+3.52e-02

Pos logit contributions
 adult+0.076
 worry+0.074
 explanation+0.071
urious+0.069
 son+0.068
Neg logit contributions
 brace-0.050
Aud-0.050
 fet-0.046
 sprink-0.044
 beautifully-0.044
Token so
Token ID523
Feature activation+3.19e-03

Pos logit contributions
 adult+0.056
 worry+0.055
 explanation+0.050
 son+0.049
urious+0.049
Neg logit contributions
Aud-0.065
 brace-0.063
 fet-0.060
 wrists-0.059
 sixteen-0.057
Token he
Token ID339
Feature activation-3.68e-03

Pos logit contributions
 adult+0.005
 worry+0.005
 explanation+0.005
 son+0.005
urious+0.005
Neg logit contributions
Aud-0.006
 brace-0.006
 fet-0.005
 wrists-0.005
 sixteen-0.005
Token started
Token ID2067
Feature activation+1.64e-02

Pos logit contributions
Aud+0.007
 brace+0.007
 fet+0.007
 wrists+0.006
 pro+0.006
Neg logit contributions
 adult-0.006
 worry-0.005
 explanation-0.005
 zone-0.005
urious-0.005
Token to
Token ID284
Feature activation+2.65e-02

Pos logit contributions
 adult+0.036
 worry+0.035
 explanation+0.034
urious+0.033
 zone+0.033
Neg logit contributions
Aud-0.019
 brace-0.018
 fet-0.018
 wrists-0.016
 pro-0.015
Token worry
Token ID5490
Feature activation+2.03e-01

Pos logit contributions
 adult+0.036
iness+0.033
 zone+0.032
urious+0.031
 explanation+0.031
Neg logit contributions
 brace-0.051
Aud-0.051
 fet-0.050
 wrists-0.048
 som-0.045
Token.
Token ID13
Feature activation+2.11e-02

Pos logit contributions
 adult+0.330
 worry+0.323
 explanation+0.294
 son+0.289
urious+0.286
Neg logit contributions
Aud-0.370
 brace-0.357
 fet-0.337
 wrists-0.331
 sixteen-0.319
Token
Token ID198
Feature activation+1.17e-02

Pos logit contributions
 adult+0.032
 worry+0.031
 explanation+0.028
 son+0.028
urious+0.028
Neg logit contributions
Aud-0.041
 brace-0.041
 fet-0.037
 wrists-0.037
 grapes-0.035
Token
Token ID198
Feature activation+1.37e-02

Pos logit contributions
 worry+0.024
 adult+0.024
 explanation+0.022
 zone+0.022
iness+0.022
Neg logit contributions
Aud-0.016
 brace-0.014
 fet-0.014
ip-0.013
 sixteen-0.013
TokenSuddenly
Token ID38582
Feature activation+1.06e-02

Pos logit contributions
 adult+0.028
 worry+0.028
 explanation+0.026
 zone+0.026
iness+0.026
Neg logit contributions
Aud-0.019
 brace-0.017
 fet-0.017
 sixteen-0.016
 wrists-0.015
Token,
Token ID11
Feature activation-1.17e-02

Pos logit contributions
 adult+0.020
 worry+0.019
urious+0.019
 explanation+0.019
 zone+0.018
Neg logit contributions
Aud-0.016
 brace-0.015
 fet-0.014
 sixteen-0.014
br-0.014
Token home
Token ID1363
Feature activation+2.71e-02

Pos logit contributions
 adult+0.074
 worry+0.073
urious+0.068
 explanation+0.067
 son+0.067
Neg logit contributions
Aud-0.060
 brace-0.058
 fet-0.055
 wrists-0.052
 sixteen-0.051
Token,
Token ID11
Feature activation+2.07e-02

Pos logit contributions
 adult+0.057
urious+0.056
 zone+0.054
 outsider+0.053
rated+0.053
Neg logit contributions
 brace-0.031
 fet-0.027
 pro-0.027
Aud-0.026
ip-0.025
Token he
Token ID339
Feature activation+1.35e-02

Pos logit contributions
 adult+0.034
 worry+0.033
 explanation+0.031
urious+0.030
 son+0.030
Neg logit contributions
Aud-0.036
 brace-0.036
 fet-0.034
 wrists-0.033
 sixteen-0.032
Token started
Token ID2067
Feature activation+2.31e-02

Pos logit contributions
 adult+0.022
 worry+0.021
 explanation+0.020
 zone+0.019
 son+0.019
Neg logit contributions
Aud-0.024
 brace-0.024
 fet-0.023
 wrists-0.022
 pro-0.021
Token to
Token ID284
Feature activation+2.95e-02

Pos logit contributions
 worry+0.042
 adult+0.042
 explanation+0.037
 son+0.037
urious+0.036
Neg logit contributions
Aud-0.039
 brace-0.036
 sixteen-0.034
 wrists-0.033
br-0.032
Token worry
Token ID5490
Feature activation+2.03e-01

Pos logit contributions
 adult+0.040
 zone+0.035
iness+0.035
 explanation+0.034
urious+0.034
Neg logit contributions
 brace-0.061
Aud-0.059
 fet-0.057
 wrists-0.056
 grape-0.054
Token.
Token ID13
Feature activation+2.24e-02

Pos logit contributions
 adult+0.356
 worry+0.349
 explanation+0.322
urious+0.318
 zone+0.313
Neg logit contributions
Aud-0.330
 brace-0.328
 fet-0.307
 wrists-0.297
 grapes-0.289
Token What
Token ID1867
Feature activation-2.00e-02

Pos logit contributions
 adult+0.031
 worry+0.030
 son+0.028
 explanation+0.027
urious+0.026
Neg logit contributions
Aud-0.048
 brace-0.046
 fet-0.043
 wrists-0.042
ip-0.041
Token if
Token ID611
Feature activation-1.11e-02

Pos logit contributions
 brace+0.031
Aud+0.030
 fet+0.030
 scored+0.029
ip+0.027
Neg logit contributions
 adult-0.037
 explanation-0.035
 worry-0.033
rated-0.033
urious-0.032
Token he
Token ID339
Feature activation-6.36e-02

Pos logit contributions
 brace+0.017
Aud+0.016
 fet+0.015
 grapes+0.014
 pro+0.014
Neg logit contributions
 adult-0.021
urious-0.019
 worry-0.019
 explanation-0.019
 zone-0.019
Token lost
Token ID2626
Feature activation-1.69e-02

Pos logit contributions
 fet+0.088
 brace+0.087
Aud+0.084
 scored+0.083
 pro+0.079
Neg logit contributions
 adult-0.129
 explanation-0.124
 zone-0.114
 worry-0.113
iness-0.112
Token,
Token ID11
Feature activation+1.03e-02

Pos logit contributions
 adult+0.010
 worry+0.010
 explanation+0.009
urious+0.009
 son+0.009
Neg logit contributions
Aud-0.011
 brace-0.011
 fet-0.010
 wrists-0.010
 sixteen-0.010
Token the
Token ID262
Feature activation-2.16e-02

Pos logit contributions
 worry+0.016
 adult+0.014
 son+0.014
 God+0.013
 end+0.013
Neg logit contributions
Aud-0.022
 brace-0.020
 wrists-0.019
 fet-0.019
ip-0.019
Token seed
Token ID9403
Feature activation-1.43e-02

Pos logit contributions
 brace+0.037
 grape+0.037
 grapes+0.037
Aud+0.035
 dough+0.034
Neg logit contributions
 adult-0.033
 zone-0.032
 worry-0.031
iness-0.030
 explanation-0.030
Token started
Token ID2067
Feature activation+1.56e-02

Pos logit contributions
 brace+0.024
Aud+0.024
 fet+0.023
 wrists+0.022
 grape+0.022
Neg logit contributions
 adult-0.024
 worry-0.023
 explanation-0.022
 son-0.021
 zone-0.021
Token to
Token ID284
Feature activation+2.39e-02

Pos logit contributions
 worry+0.029
 adult+0.028
 explanation+0.026
urious+0.026
 son+0.025
Neg logit contributions
Aud-0.027
 brace-0.024
 wrists-0.022
 sixteen-0.022
ip-0.022
Token worry
Token ID5490
Feature activation+2.02e-01

Pos logit contributions
 worry+0.039
 adult+0.038
 explanation+0.035
urious+0.034
 son+0.033
Neg logit contributions
Aud-0.044
 brace-0.042
 fet-0.040
 wrists-0.039
ip-0.038
Token.
Token ID13
Feature activation+1.91e-02

Pos logit contributions
 adult+0.344
 worry+0.337
 explanation+0.309
urious+0.303
 son+0.301
Neg logit contributions
Aud-0.349
 brace-0.340
 fet-0.321
 wrists-0.314
 sixteen-0.302
Token It
Token ID632
Feature activation+1.29e-02

Pos logit contributions
 adult+0.036
 worry+0.035
 explanation+0.033
 son+0.032
urious+0.032
Neg logit contributions
Aud-0.031
 brace-0.029
 fet-0.027
ip-0.026
 wrists-0.026
Token wanted
Token ID2227
Feature activation+6.25e-02

Pos logit contributions
 adult+0.021
 worry+0.021
 explanation+0.019
 son+0.018
 zone+0.018
Neg logit contributions
Aud-0.023
 brace-0.023
 fet-0.021
 wrists-0.021
 sixteen-0.020
Token to
Token ID284
Feature activation-1.57e-02

Pos logit contributions
 adult+0.126
 worry+0.123
 explanation+0.114
 son+0.112
urious+0.111
Neg logit contributions
Aud-0.092
 brace-0.087
 fet-0.081
 wrists-0.080
 sixteen-0.076
Token grow
Token ID1663
Feature activation+3.80e-02

Pos logit contributions
Aud+0.033
 brace+0.031
 fet+0.030
 wrists+0.029
ip+0.029
Neg logit contributions
 worry-0.022
 adult-0.022
 explanation-0.019
 son-0.018
urious-0.018
Token in
Token ID287
Feature activation-1.18e-02

Pos logit contributions
 adult+0.118
 worry+0.117
 explanation+0.105
 son+0.102
urious+0.101
Neg logit contributions
Aud-0.135
 brace-0.130
 fet-0.121
 wrists-0.121
 sixteen-0.117
Token the
Token ID262
Feature activation+3.29e-02

Pos logit contributions
Aud+0.019
 brace+0.018
 fet+0.017
 wrists+0.016
ip+0.016
Neg logit contributions
 worry-0.023
 adult-0.022
 explanation-0.020
 son-0.020
 zone-0.020
Token network
Token ID3127
Feature activation+2.74e-02

Pos logit contributions
 adult+0.039
 worry+0.038
urious+0.032
 explanation+0.032
 son+0.032
Neg logit contributions
Aud-0.077
 brace-0.075
 fet-0.071
ip-0.070
 wrists-0.069
Token started
Token ID2067
Feature activation+3.16e-02

Pos logit contributions
 worry+0.039
 adult+0.039
 explanation+0.035
 zone+0.034
iness+0.034
Neg logit contributions
Aud-0.056
 brace-0.054
 wrists-0.050
 fet-0.050
ip-0.049
Token to
Token ID284
Feature activation+3.83e-02

Pos logit contributions
 worry+0.057
 adult+0.055
 explanation+0.051
 son+0.049
 actions+0.047
Neg logit contributions
Aud-0.055
 brace-0.052
 wrists-0.048
ip-0.048
 fet-0.047
Token worry
Token ID5490
Feature activation+2.02e-01

Pos logit contributions
 worry+0.051
 adult+0.048
 explanation+0.043
 son+0.040
 outsider+0.039
Neg logit contributions
Aud-0.085
 brace-0.081
ip-0.078
 fet-0.078
 wrists-0.077
Token;
Token ID26
Feature activation+3.26e-02

Pos logit contributions
 worry+0.295
 adult+0.294
 explanation+0.264
 son+0.259
iness+0.249
Neg logit contributions
Aud-0.419
 brace-0.393
 fet-0.379
 wrists-0.372
 sixteen-0.358
Token they
Token ID484
Feature activation-2.70e-02

Pos logit contributions
 worry+0.059
 adult+0.058
 son+0.052
 explanation+0.052
 God+0.049
Neg logit contributions
Aud-0.058
 brace-0.055
ip-0.052
 wrists-0.051
 fet-0.051
Token wanted
Token ID2227
Feature activation+5.91e-02

Pos logit contributions
Aud+0.052
 brace+0.051
 fet+0.048
 wrists+0.047
 sixteen+0.046
Neg logit contributions
 adult-0.041
 worry-0.040
 explanation-0.036
 son-0.035
urious-0.035
Token everyone
Token ID2506
Feature activation+6.20e-02

Pos logit contributions
 adult+0.111
 worry+0.109
 explanation+0.099
 son+0.099
 actions+0.093
Neg logit contributions
Aud-0.100
 brace-0.094
 fet-0.088
 wrists-0.088
ip-0.086
Token to
Token ID284
Feature activation+5.76e-03

Pos logit contributions
 worry+0.117
 adult+0.115
 explanation+0.106
 son+0.105
 zone+0.103
Neg logit contributions
Aud-0.097
 brace-0.093
 wrists-0.086
 fet-0.086
 sixteen-0.084
Token,
Token ID11
Feature activation+2.05e-02

Pos logit contributions
 adult+0.061
 worry+0.061
 explanation+0.054
 zone+0.052
 son+0.052
Neg logit contributions
Aud-0.082
 brace-0.078
 fet-0.074
 wrists-0.073
 sixteen-0.071
Token and
Token ID290
Feature activation-6.33e-03

Pos logit contributions
 worry+0.031
 adult+0.030
 explanation+0.026
 son+0.026
 actions+0.024
Neg logit contributions
Aud-0.042
 brace-0.040
 fet-0.038
 wrists-0.037
 sixteen-0.037
Token it
Token ID340
Feature activation-3.01e-02

Pos logit contributions
Aud+0.010
 brace+0.010
 fet+0.009
 wrists+0.009
 sixteen+0.009
Neg logit contributions
 adult-0.012
 worry-0.012
 explanation-0.011
 son-0.010
 zone-0.010
Token made
Token ID925
Feature activation+4.34e-02

Pos logit contributions
Aud+0.057
 brace+0.055
 fet+0.053
 wrists+0.051
 sixteen+0.050
Neg logit contributions
 adult-0.046
 worry-0.044
 explanation-0.041
urious-0.040
 son-0.039
Token her
Token ID607
Feature activation+5.87e-03

Pos logit contributions
 adult+0.089
 worry+0.088
 explanation+0.081
 son+0.079
urious+0.078
Neg logit contributions
Aud-0.061
 brace-0.059
 fet-0.056
 wrists-0.056
 sixteen-0.053
Token worry
Token ID5490
Feature activation+2.02e-01

Pos logit contributions
 adult+0.008
 worry+0.008
 explanation+0.007
urious+0.007
 son+0.007
Neg logit contributions
Aud-0.012
 brace-0.012
 fet-0.011
 wrists-0.011
 sixteen-0.011
Token.
Token ID13
Feature activation+1.95e-02

Pos logit contributions
 worry+0.318
 adult+0.318
 explanation+0.286
 son+0.278
 zone+0.273
Neg logit contributions
Aud-0.380
 brace-0.364
 fet-0.345
 wrists-0.339
 sixteen-0.328
Token
Token ID198
Feature activation+1.23e-02

Pos logit contributions
 adult+0.041
 worry+0.040
 explanation+0.037
urious+0.036
 son+0.036
Neg logit contributions
Aud-0.026
 brace-0.025
 fet-0.023
 wrists-0.023
 sixteen-0.021
Token
Token ID198
Feature activation+1.58e-02

Pos logit contributions
 adult+0.024
 worry+0.024
 explanation+0.022
 son+0.022
 zone+0.022
Neg logit contributions
Aud-0.018
 brace-0.017
 fet-0.016
 wrists-0.015
ip-0.015
TokenSuddenly
Token ID38582
Feature activation+8.43e-03

Pos logit contributions
 worry+0.031
 adult+0.031
 explanation+0.029
iness+0.029
 son+0.029
Neg logit contributions
Aud-0.023
 brace-0.020
 fet-0.020
ip-0.019
 wrists-0.018
Token,
Token ID11
Feature activation-1.39e-02

Pos logit contributions
 adult+0.013
 worry+0.013
 explanation+0.012
 son+0.012
urious+0.011
Neg logit contributions
Aud-0.016
 brace-0.015
 fet-0.014
 wrists-0.014
 sixteen-0.014
Token closer
Token ID5699
Feature activation+3.11e-02

Pos logit contributions
Aud+0.054
 brace+0.052
 fet+0.050
 wrists+0.047
 pro+0.047
Neg logit contributions
 adult-0.056
 worry-0.053
 explanation-0.051
urious-0.051
 zone-0.050
Token,
Token ID11
Feature activation+1.03e-02

Pos logit contributions
 adult+0.044
 worry+0.044
 explanation+0.039
 son+0.039
 zone+0.037
Neg logit contributions
Aud-0.064
 brace-0.061
 fet-0.058
 wrists-0.058
 sixteen-0.056
Token he
Token ID339
Feature activation-1.10e-06

Pos logit contributions
 adult+0.018
 worry+0.017
 explanation+0.016
urious+0.016
 zone+0.016
Neg logit contributions
Aud-0.017
 brace-0.017
 fet-0.016
 wrists-0.016
 sixteen-0.015
Token started
Token ID2067
Feature activation+8.99e-03

Feature not active on prev token; no predictions were affected.
Token to
Token ID284
Feature activation+2.59e-02

Pos logit contributions
 adult+0.017
 worry+0.017
 explanation+0.015
 son+0.015
urious+0.015
Neg logit contributions
Aud-0.014
 brace-0.014
 wrists-0.012
 sixteen-0.012
 fet-0.012
Token worry
Token ID5490
Feature activation+2.01e-01

Pos logit contributions
 adult+0.035
 worry+0.032
urious+0.031
 explanation+0.031
 zone+0.030
Neg logit contributions
Aud-0.052
 brace-0.052
 fet-0.050
 wrists-0.049
 pro-0.047
Token
Token ID220
Feature activation-3.16e-03

Pos logit contributions
 adult+0.319
 worry+0.312
 explanation+0.283
 son+0.279
urious+0.272
Neg logit contributions
Aud-0.380
 brace-0.362
 fet-0.345
 wrists-0.340
 sixteen-0.327
Token -
Token ID532
Feature activation+3.76e-02

Pos logit contributions
Aud+0.006
 brace+0.005
 fet+0.005
 wrists+0.005
 sixteen+0.005
Neg logit contributions
 adult-0.005
 worry-0.005
 explanation-0.005
 son-0.005
urious-0.005
Token he
Token ID339
Feature activation-8.85e-03

Pos logit contributions
 worry+0.056
 adult+0.054
 son+0.049
 explanation+0.046
 God+0.043
Neg logit contributions
Aud-0.081
 brace-0.077
 wrists-0.074
 fet-0.073
ip-0.070
Token realised
Token ID19169
Feature activation+3.76e-05

Pos logit contributions
Aud+0.019
 brace+0.018
 wrists+0.017
 fet+0.017
 grapes+0.016
Neg logit contributions
 adult-0.012
 worry-0.012
 son-0.011
urious-0.011
 explanation-0.010
Token the
Token ID262
Feature activation-4.26e-02

Pos logit contributions
 adult+0.000
 worry+0.000
 son+0.000
 explanation+0.000
 zone+0.000
Neg logit contributions
Aud-0.000
 brace-0.000
 fet-0.000
 wrists-0.000
 sixteen-0.000
Token twins
Token ID20345
Feature activation+4.84e-02

Pos logit contributions
 adult+0.018
 zone+0.017
 worry+0.017
urious+0.016
rated+0.016
Neg logit contributions
 brace-0.024
Aud-0.024
 grape-0.023
 grapes-0.023
 wrists-0.023
Token'
Token ID6
Feature activation+1.43e-02

Pos logit contributions
 worry+0.070
 adult+0.069
 son+0.061
 explanation+0.061
urious+0.059
Neg logit contributions
Aud-0.100
 brace-0.094
 fet-0.087
 wrists-0.087
 sixteen-0.085
Token parents
Token ID3397
Feature activation+6.87e-02

Pos logit contributions
 adult+0.020
 worry+0.019
 son+0.017
urious+0.017
 explanation+0.017
Neg logit contributions
Aud-0.030
 brace-0.029
 fet-0.028
 wrists-0.027
 sixteen-0.027
Token started
Token ID2067
Feature activation+1.55e-02

Pos logit contributions
 adult+0.108
 worry+0.104
 explanation+0.096
urious+0.095
 zone+0.094
Neg logit contributions
Aud-0.125
 brace-0.123
 fet-0.117
 wrists-0.114
 pro-0.110
Token to
Token ID284
Feature activation+2.83e-02

Pos logit contributions
 worry+0.028
 adult+0.028
 explanation+0.025
 son+0.024
urious+0.023
Neg logit contributions
Aud-0.028
 brace-0.025
ip-0.023
 wrists-0.023
br-0.023
Token worry
Token ID5490
Feature activation+2.00e-01

Pos logit contributions
 adult+0.038
 worry+0.036
 explanation+0.033
urious+0.033
 zone+0.032
Neg logit contributions
Aud-0.058
 brace-0.057
 fet-0.054
 wrists-0.054
 sixteen-0.052
Token.
Token ID13
Feature activation+1.75e-02

Pos logit contributions
 adult+0.328
 worry+0.321
 explanation+0.293
 son+0.286
urious+0.286
Neg logit contributions
Aud-0.358
 brace-0.347
 fet-0.328
 wrists-0.321
 sixteen-0.310
Token They
Token ID1119
Feature activation+2.38e-02

Pos logit contributions
 adult+0.032
 worry+0.031
 explanation+0.029
 son+0.029
urious+0.028
Neg logit contributions
Aud-0.029
 brace-0.028
 fet-0.026
 wrists-0.025
ip-0.025
Token asked
Token ID1965
Feature activation+1.04e-01

Pos logit contributions
 adult+0.038
 worry+0.036
 zone+0.035
 explanation+0.035
urious+0.034
Neg logit contributions
Aud-0.041
 brace-0.041
 fet-0.040
 wrists-0.038
 pro-0.037
Token them
Token ID606
Feature activation+9.52e-03

Pos logit contributions
 adult+0.156
 worry+0.151
urious+0.140
 explanation+0.138
 zone+0.137
Neg logit contributions
Aud-0.195
 brace-0.191
 fet-0.182
 wrists-0.178
 grapes-0.174
Token what
Token ID644
Feature activation-2.26e-02

Pos logit contributions
 adult+0.013
 worry+0.013
urious+0.012
 zone+0.012
 explanation+0.012
Neg logit contributions
Aud-0.018
 brace-0.018
 fet-0.017
 wrists-0.017
 grapes-0.017
Token lost
Token ID2626
Feature activation+5.48e-02

Pos logit contributions
 adult+0.014
 worry+0.014
 explanation+0.013
 son+0.012
urious+0.012
Neg logit contributions
Aud-0.017
 brace-0.016
 wrists-0.015
 fet-0.015
 sixteen-0.015
Token and
Token ID290
Feature activation+3.12e-02

Pos logit contributions
 adult+0.100
 worry+0.097
 explanation+0.093
urious+0.090
 son+0.089
Neg logit contributions
 brace-0.085
Aud-0.085
 fet-0.080
 grapes-0.076
 wrists-0.076
Token Lily
Token ID20037
Feature activation-1.77e-02

Pos logit contributions
 adult+0.046
 worry+0.044
 explanation+0.040
urious+0.039
 son+0.039
Neg logit contributions
Aud-0.061
 brace-0.060
 fet-0.057
 wrists-0.056
 sixteen-0.054
Token started
Token ID2067
Feature activation+1.59e-02

Pos logit contributions
 brace+0.030
 fet+0.029
Aud+0.029
 wrists+0.028
 pro+0.026
Neg logit contributions
 adult-0.030
 worry-0.028
 explanation-0.027
 zone-0.026
urious-0.026
Token to
Token ID284
Feature activation+2.93e-02

Pos logit contributions
 adult+0.034
 worry+0.033
 explanation+0.031
urious+0.031
 zone+0.031
Neg logit contributions
Aud-0.020
 brace-0.020
 fet-0.019
 wrists-0.017
 grapes-0.016
Token worry
Token ID5490
Feature activation+1.99e-01

Pos logit contributions
 adult+0.048
iness+0.044
 zone+0.044
urious+0.043
reetings+0.041
Neg logit contributions
 brace-0.053
 fet-0.051
Aud-0.046
 pur-0.044
 grape-0.044
Token.
Token ID13
Feature activation+2.27e-02

Pos logit contributions
 adult+0.309
 worry+0.302
 explanation+0.274
 son+0.268
urious+0.262
Neg logit contributions
Aud-0.385
 brace-0.368
 fet-0.349
 wrists-0.345
 sixteen-0.334
Token She
Token ID1375
Feature activation-1.12e-02

Pos logit contributions
 adult+0.038
 worry+0.038
 son+0.034
 explanation+0.034
urious+0.033
Neg logit contributions
 brace-0.039
Aud-0.038
 sixteen-0.036
 wrists-0.036
 fet-0.035
Token looked
Token ID3114
Feature activation+3.61e-02

Pos logit contributions
 brace+0.018
 fet+0.018
Aud+0.018
 wrists+0.017
 grapes+0.016
Neg logit contributions
 adult-0.019
 worry-0.019
 zone-0.018
 explanation-0.018
iness-0.018
Token everywhere
Token ID8347
Feature activation-3.35e-02

Pos logit contributions
 adult+0.082
 worry+0.080
 explanation+0.076
urious+0.075
 zone+0.074
Neg logit contributions
Aud-0.041
 brace-0.040
 fet-0.037
 wrists-0.035
 grapes-0.034
Token for
Token ID329
Feature activation-6.04e-02

Pos logit contributions
Aud+0.054
 brace+0.053
 fet+0.050
 grapes+0.048
 wrists+0.047
Neg logit contributions
 adult-0.059
 worry-0.057
urious-0.055
 explanation-0.053
 outsider-0.052
Token and
Token ID290
Feature activation+6.43e-03

Pos logit contributions
 worry+0.097
 adult+0.090
 explanation+0.081
 son+0.079
iness+0.078
Neg logit contributions
Aud-0.149
 brace-0.142
 wrists-0.137
 fet-0.133
 grapes-0.132
Token the
Token ID262
Feature activation-9.81e-03

Pos logit contributions
 worry+0.011
 adult+0.010
 son+0.009
 explanation+0.009
 nurses+0.009
Neg logit contributions
Aud-0.012
 brace-0.012
 wrists-0.011
 fet-0.011
ip-0.011
Token farmer
Token ID18739
Feature activation+2.61e-02

Pos logit contributions
Aud+0.022
 brace+0.021
 fet+0.020
 wrists+0.020
ip+0.019
Neg logit contributions
 adult-0.013
 worry-0.013
 son-0.011
 explanation-0.011
 actions-0.010
Token started
Token ID2067
Feature activation+1.15e-02

Pos logit contributions
 worry+0.037
 adult+0.036
 son+0.032
 explanation+0.031
iness+0.030
Neg logit contributions
Aud-0.056
 brace-0.053
 grapes-0.049
 wrists-0.049
 sixteen-0.047
Token to
Token ID284
Feature activation+3.05e-02

Pos logit contributions
 worry+0.020
 adult+0.018
 explanation+0.016
 son+0.015
 distress+0.015
Neg logit contributions
Aud-0.023
 brace-0.021
ip-0.020
 wrists-0.020
 grapes-0.019
Token worry
Token ID5490
Feature activation+1.99e-01

Pos logit contributions
 worry+0.040
 adult+0.039
 explanation+0.034
 son+0.032
urious+0.031
Neg logit contributions
Aud-0.067
 brace-0.064
 fet-0.062
 wrists-0.061
ip-0.060
Token.
Token ID13
Feature activation+1.05e-02

Pos logit contributions
 worry+0.298
 adult+0.297
 explanation+0.266
 son+0.260
iness+0.249
Neg logit contributions
Aud-0.404
 brace-0.381
 fet-0.365
 wrists-0.361
 grapes-0.346
Token It
Token ID632
Feature activation+2.28e-03

Pos logit contributions
 adult+0.018
 worry+0.017
 son+0.016
urious+0.015
 explanation+0.015
Neg logit contributions
 brace-0.020
Aud-0.019
 wrists-0.017
 fet-0.017
 grapes-0.017
Token was
Token ID373
Feature activation+9.49e-03

Pos logit contributions
 adult+0.003
 worry+0.003
 explanation+0.003
urious+0.003
 son+0.003
Neg logit contributions
Aud-0.004
 brace-0.004
 fet-0.004
 wrists-0.004
 sixteen-0.004
Token getting
Token ID1972
Feature activation+2.75e-02

Pos logit contributions
 worry+0.016
 adult+0.015
 explanation+0.013
 son+0.013
 distress+0.013
Neg logit contributions
 brace-0.018
Aud-0.018
 wrists-0.017
 fet-0.017
ip-0.016
Token late
Token ID2739
Feature activation+1.73e-02

Pos logit contributions
 adult+0.044
 worry+0.043
 explanation+0.039
urious+0.038
 son+0.038
Neg logit contributions
Aud-0.050
 brace-0.049
 fet-0.046
 wrists-0.046
 sixteen-0.044
Token,
Token ID11
Feature activation+2.06e-02

Pos logit contributions
Aud+0.210
Pur+0.175
 brace+0.171
 wrists+0.170
 sixteen+0.168
Neg logit contributions
 adult-0.081
 end-0.076
 worry-0.075
iness-0.071
 island-0.070
Token his
Token ID465
Feature activation+1.95e-02

Pos logit contributions
 worry+0.027
 adult+0.025
 son+0.023
 why+0.021
 explanation+0.020
Neg logit contributions
Aud-0.047
ip-0.041
 brace-0.039
 wrists-0.038
 fet-0.038
Token friends
Token ID2460
Feature activation+4.08e-02

Pos logit contributions
 worry+0.029
 son+0.028
 adult+0.028
 eye+0.025
 actions+0.024
Neg logit contributions
Aud-0.044
br-0.037
 fet-0.037
 brace-0.037
 sixteen-0.035
Token started
Token ID2067
Feature activation+2.75e-02

Pos logit contributions
 worry+0.059
 adult+0.056
 son+0.052
 zone+0.049
iness+0.047
Neg logit contributions
Aud-0.086
 brace-0.083
 wrists-0.076
 sixteen-0.075
 grapes-0.074
Token to
Token ID284
Feature activation+3.96e-02

Pos logit contributions
 worry+0.045
 adult+0.041
 explanation+0.035
 distress+0.034
 son+0.034
Neg logit contributions
Aud-0.058
ip-0.050
br-0.049
 brace-0.047
 wrists-0.047
Token worry
Token ID5490
Feature activation+1.99e-01

Pos logit contributions
 worry+0.058
 adult+0.050
 explanation+0.045
 outsider+0.042
 son+0.042
Neg logit contributions
Aud-0.089
ip-0.083
 brace-0.082
 fet-0.080
br-0.078
Token.
Token ID13
Feature activation+1.10e-02

Pos logit contributions
 adult+0.295
 worry+0.292
 explanation+0.261
 son+0.259
iness+0.248
Neg logit contributions
Aud-0.406
 brace-0.380
 fet-0.368
 wrists-0.361
 sixteen-0.350
Token They
Token ID1119
Feature activation+1.11e-02

Pos logit contributions
 adult+0.016
 worry+0.015
 son+0.015
urious+0.014
 explanation+0.014
Neg logit contributions
 brace-0.022
Aud-0.022
 wrists-0.020
 fet-0.019
 sprink-0.019
Token knew
Token ID2993
Feature activation+1.64e-02

Pos logit contributions
 worry+0.016
 adult+0.016
 explanation+0.014
 son+0.014
urious+0.014
Neg logit contributions
Aud-0.022
 brace-0.022
 fet-0.020
 wrists-0.020
 sixteen-0.020
Token the
Token ID262
Feature activation-2.61e-02

Pos logit contributions
 adult+0.029
 worry+0.029
 son+0.026
 explanation+0.026
 actions+0.024
Neg logit contributions
Aud-0.028
 brace-0.027
 fet-0.025
 wrists-0.025
 sixteen-0.024
Token fruit
Token ID8234
Feature activation-8.87e-02

Pos logit contributions
Aud+0.046
 brace+0.044
 fet+0.042
 wrists+0.041
ip+0.039
Neg logit contributions
 adult-0.045
 worry-0.045
 explanation-0.041
 son-0.040
urious-0.039
Token,
Token ID11
Feature activation+1.61e-02

Pos logit contributions
Aud+0.089
 brace+0.084
 fet+0.080
 wrists+0.079
 sixteen+0.077
Neg logit contributions
 adult-0.073
 worry-0.073
 explanation-0.064
 son-0.063
 zone-0.063
Token and
Token ID290
Feature activation-9.30e-03

Pos logit contributions
 adult+0.025
 worry+0.024
 explanation+0.022
urious+0.022
 son+0.022
Neg logit contributions
Aud-0.030
 brace-0.029
 fet-0.028
 wrists-0.027
 sixteen-0.026
Token he
Token ID339
Feature activation-3.34e-03

Pos logit contributions
Aud+0.017
 brace+0.016
 wrists+0.015
 fet+0.015
 sixteen+0.014
Neg logit contributions
 worry-0.015
 adult-0.015
 son-0.014
 explanation-0.014
urious-0.013
Token began
Token ID2540
Feature activation+3.63e-03

Pos logit contributions
 fet+0.005
Aud+0.005
 brace+0.005
 pro+0.005
 scored+0.005
Neg logit contributions
 adult-0.006
 worry-0.005
 explanation-0.005
iness-0.005
 zone-0.005
Token to
Token ID284
Feature activation+2.37e-02

Pos logit contributions
 adult+0.007
 worry+0.007
 explanation+0.007
urious+0.007
 son+0.007
Neg logit contributions
Aud-0.005
 brace-0.005
 fet-0.005
 wrists-0.005
 sixteen-0.004
Token worry
Token ID5490
Feature activation+1.99e-01

Pos logit contributions
 adult+0.031
 worry+0.029
 explanation+0.027
urious+0.026
 zone+0.026
Neg logit contributions
Aud-0.050
 brace-0.049
 fet-0.047
 wrists-0.046
 sixteen-0.044
Token.
Token ID13
Feature activation+2.60e-02

Pos logit contributions
 adult+0.315
 worry+0.308
 explanation+0.280
 son+0.274
urious+0.270
Neg logit contributions
Aud-0.380
 brace-0.360
 fet-0.342
 wrists-0.338
 sixteen-0.325
Token He
Token ID679
Feature activation+3.88e-02

Pos logit contributions
 adult+0.038
 worry+0.036
 explanation+0.033
 son+0.033
urious+0.032
Neg logit contributions
Aud-0.054
 brace-0.052
 wrists-0.048
 fet-0.048
ip-0.047
Token had
Token ID550
Feature activation-3.72e-02

Pos logit contributions
 adult+0.065
 worry+0.063
 explanation+0.060
iness+0.059
 zone+0.059
Neg logit contributions
 brace-0.065
Aud-0.064
 fet-0.064
 wrists-0.060
 pro-0.059
Token never
Token ID1239
Feature activation+1.35e-02

Pos logit contributions
Aud+0.071
 brace+0.069
 wrists+0.067
ip+0.064
 fet+0.063
Neg logit contributions
 worry-0.060
 adult-0.058
urious-0.049
 son-0.048
 explanation-0.048
Token been
Token ID587
Feature activation+6.38e-02

Pos logit contributions
 adult+0.022
 explanation+0.019
 worry+0.019
 son+0.018
urious+0.018
Neg logit contributions
Aud-0.024
 brace-0.024
 fet-0.023
 scored-0.022
 pro-0.021
Token it
Token ID340
Feature activation-3.81e-02

Pos logit contributions
Aud+0.023
 brace+0.021
 fet+0.020
 wrists+0.020
ip+0.019
Neg logit contributions
 worry-0.026
 adult-0.025
 son-0.023
 God-0.022
 explanation-0.022
Token started
Token ID2067
Feature activation+1.13e-02

Pos logit contributions
Aud+0.068
 brace+0.066
 fet+0.063
 wrists+0.061
 pro+0.058
Neg logit contributions
 adult-0.063
 worry-0.061
 explanation-0.056
urious-0.055
 zone-0.055
Token moving
Token ID3867
Feature activation+5.70e-02

Pos logit contributions
 adult+0.024
 worry+0.023
 explanation+0.022
urious+0.022
 son+0.022
Neg logit contributions
Aud-0.015
 brace-0.014
 fet-0.013
 wrists-0.013
 sixteen-0.012
Token towards
Token ID3371
Feature activation+7.05e-02

Pos logit contributions
 worry+0.090
 adult+0.089
 zone+0.079
 explanation+0.078
urious+0.078
Neg logit contributions
Aud-0.109
 brace-0.103
 wrists-0.097
 fet-0.097
ip-0.095
Token the
Token ID262
Feature activation+3.85e-02

Pos logit contributions
 adult+0.134
 worry+0.130
 son+0.119
 explanation+0.118
urious+0.115
Neg logit contributions
Aud-0.120
 brace-0.112
 fet-0.107
ip-0.103
 wrists-0.103
Token voice
Token ID3809
Feature activation+1.99e-01

Pos logit contributions
 adult+0.063
 worry+0.059
urious+0.057
iness+0.056
 explanation+0.055
Neg logit contributions
 brace-0.064
 sprink-0.062
 wrists-0.061
 dough-0.060
 icing-0.059
Token.
Token ID13
Feature activation+2.08e-02

Pos logit contributions
 worry+0.287
 adult+0.285
 explanation+0.253
 son+0.252
 zone+0.246
Neg logit contributions
Aud-0.407
 brace-0.380
 fet-0.362
 wrists-0.359
 sixteen-0.350
Token
Token ID198
Feature activation+1.03e-02

Pos logit contributions
 adult+0.033
 worry+0.032
 son+0.031
 explanation+0.031
urious+0.030
Neg logit contributions
Aud-0.039
 brace-0.038
 wrists-0.034
 fet-0.034
ip-0.033
Token
Token ID198
Feature activation+1.12e-02

Pos logit contributions
 adult+0.019
 worry+0.019
 explanation+0.017
urious+0.017
 son+0.017
Neg logit contributions
Aud-0.016
 brace-0.015
 fet-0.015
 wrists-0.014
 sixteen-0.014
TokenWhen
Token ID2215
Feature activation-1.50e-02

Pos logit contributions
 adult+0.019
 worry+0.018
urious+0.017
 explanation+0.017
 son+0.016
Neg logit contributions
 brace-0.019
Aud-0.019
 fet-0.018
 wrists-0.018
ip-0.017
Token the
Token ID262
Feature activation-5.23e-02

Pos logit contributions
Aud+0.030
 brace+0.028
 wrists+0.027
 fet+0.026
ip+0.025
Neg logit contributions
 worry-0.025
 adult-0.023
 God-0.022
 end-0.022
 son-0.022
Token something
Token ID1223
Feature activation+3.34e-02

Pos logit contributions
Aud+0.042
 fet+0.040
 wrists+0.039
 sixteen+0.039
 brace+0.038
Neg logit contributions
 adult-0.028
 worry-0.026
 why-0.024
 son-0.024
 explanation-0.023
Token new
Token ID649
Feature activation-5.38e-02

Pos logit contributions
 adult+0.053
 worry+0.051
 explanation+0.046
 son+0.046
urious+0.044
Neg logit contributions
Aud-0.065
 wrists-0.060
 brace-0.060
 fet-0.059
 sixteen-0.058
Token!
Token ID0
Feature activation-3.86e-03

Pos logit contributions
Aud+0.109
 brace+0.102
 fet+0.097
 sixteen+0.097
 wrists+0.097
Neg logit contributions
 adult-0.080
 worry-0.077
 zone-0.073
 explanation-0.071
 son-0.071
Token She
Token ID1375
Feature activation-3.17e-02

Pos logit contributions
Aud+0.007
 brace+0.007
 wrists+0.006
 sixteen+0.006
 grapes+0.006
Neg logit contributions
 adult-0.006
 zone-0.006
 worry-0.006
 son-0.006
urious-0.005
Token thought
Token ID1807
Feature activation+5.13e-02

Pos logit contributions
Aud+0.081
 brace+0.071
 sixteen+0.069
br+0.068
 wrists+0.068
Neg logit contributions
 worry-0.031
 adult-0.031
 son-0.030
 actions-0.026
 father-0.025
Token av
Token ID1196
Feature activation-2.95e-01

Pos logit contributions
 adult+0.086
 worry+0.078
 explanation+0.077
 actions+0.073
 son+0.073
Neg logit contributions
Aud-0.100
 wrists-0.094
 fet-0.088
 brace-0.088
 sixteen-0.086
Tokenoc
Token ID420
Feature activation-9.53e-02

Pos logit contributions
Aud+0.651
 brace+0.619
 wrists+0.612
 fet+0.591
 sixteen+0.576
Neg logit contributions
 worry-0.343
 adult-0.320
urious-0.316
 zone-0.293
rated-0.279
Tokenados
Token ID22484
Feature activation-1.41e-01

Pos logit contributions
Aud+0.136
 fet+0.113
 brace+0.107
 wrists+0.104
 sixteen+0.100
Neg logit contributions
 adult-0.213
 worry-0.188
 actions-0.186
 explanation-0.184
pa-0.182
Token were
Token ID547
Feature activation-2.78e-02

Pos logit contributions
Aud+0.329
 brace+0.303
 wrists+0.297
 sixteen+0.296
 fet+0.293
Neg logit contributions
 worry-0.169
 adult-0.159
iness-0.146
 zone-0.140
 explanation-0.139
Token the
Token ID262
Feature activation-5.91e-03

Pos logit contributions
Aud+0.062
 fet+0.057
ip+0.056
 wrists+0.055
con+0.053
Neg logit contributions
 worry-0.040
 adult-0.038
 why-0.035
 end-0.033
 son-0.032
Token most
Token ID749
Feature activation+7.19e-03

Pos logit contributions
Aud+0.012
 brace+0.011
 fet+0.011
 wrists+0.010
 sixteen+0.010
Neg logit contributions
 worry-0.009
 adult-0.009
 explanation-0.008
 son-0.008
urious-0.008
Token something
Token ID1223
Feature activation+3.41e-02

Pos logit contributions
Aud+0.038
 fet+0.037
 wrists+0.036
 sixteen+0.035
 brace+0.035
Neg logit contributions
 adult-0.025
 worry-0.023
 why-0.022
 son-0.021
 explanation-0.020
Token new
Token ID649
Feature activation-5.11e-02

Pos logit contributions
 adult+0.053
 worry+0.051
 son+0.046
 explanation+0.046
urious+0.044
Neg logit contributions
Aud-0.067
 wrists-0.062
 brace-0.061
 fet-0.060
 sixteen-0.060
Token!
Token ID0
Feature activation-3.39e-03

Pos logit contributions
Aud+0.104
 brace+0.098
 sixteen+0.093
 fet+0.093
 wrists+0.092
Neg logit contributions
 adult-0.075
 worry-0.072
 zone-0.069
 son-0.067
 explanation-0.067
Token She
Token ID1375
Feature activation-2.78e-02

Pos logit contributions
Aud+0.006
 brace+0.006
 wrists+0.006
 sixteen+0.006
 grapes+0.005
Neg logit contributions
 adult-0.006
 worry-0.005
 zone-0.005
 son-0.005
urious-0.005
Token thought
Token ID1807
Feature activation+5.38e-02

Pos logit contributions
Aud+0.073
 brace+0.064
 sixteen+0.062
br+0.061
 wrists+0.061
Neg logit contributions
 worry-0.027
 adult-0.026
 son-0.025
 actions-0.021
 father-0.021
Token av
Token ID1196
Feature activation-2.93e-01

Pos logit contributions
 adult+0.090
 worry+0.080
 explanation+0.079
 actions+0.076
 God+0.076
Neg logit contributions
Aud-0.106
 wrists-0.099
 fet-0.093
 brace-0.092
 sixteen-0.090
Tokenoc
Token ID420
Feature activation-9.99e-02

Pos logit contributions
Aud+0.634
 brace+0.604
 wrists+0.590
 fet+0.574
 sixteen+0.560
Neg logit contributions
 worry-0.357
 adult-0.338
urious-0.326
 zone-0.307
rated-0.293
Tokenados
Token ID22484
Feature activation-1.39e-01

Pos logit contributions
Aud+0.141
 fet+0.119
 brace+0.110
 wrists+0.107
 sixteen+0.104
Neg logit contributions
 adult-0.223
 worry-0.198
 explanation-0.194
 actions-0.193
pa-0.191
Token were
Token ID547
Feature activation-2.56e-02

Pos logit contributions
Aud+0.327
 brace+0.300
 wrists+0.296
 sixteen+0.295
 fet+0.291
Neg logit contributions
 worry-0.164
 adult-0.153
iness-0.142
 zone-0.136
 explanation-0.134
Token the
Token ID262
Feature activation-3.29e-03

Pos logit contributions
Aud+0.058
 fet+0.053
ip+0.052
 wrists+0.050
con+0.049
Neg logit contributions
 worry-0.036
 adult-0.034
 why-0.032
 end-0.030
 answer-0.029
Token most
Token ID749
Feature activation+7.31e-03

Pos logit contributions
Aud+0.007
 brace+0.006
 fet+0.006
 wrists+0.006
ip+0.006
Neg logit contributions
 worry-0.005
 adult-0.005
 explanation-0.004
 son-0.004
urious-0.004
Token away
Token ID1497
Feature activation-7.54e-03

Pos logit contributions
 worry+0.030
 adult+0.028
ie+0.025
 voice+0.025
 underground+0.025
Neg logit contributions
Aud-0.068
 wrists-0.067
 seventeen-0.063
 sixteen-0.062
 icing-0.061
Token.
Token ID13
Feature activation+1.04e-02

Pos logit contributions
Aud+0.018
 wrists+0.016
 brace+0.016
ip+0.015
 sixteen+0.015
Neg logit contributions
 worry-0.010
 adult-0.009
 explanation-0.008
 son-0.008
iness-0.007
Token She
Token ID1375
Feature activation-2.34e-02

Pos logit contributions
urious+0.013
 adult+0.013
 worry+0.013
 son+0.012
 God+0.012
Neg logit contributions
 brace-0.022
Aud-0.021
br-0.020
 wrists-0.020
 sixteen-0.019
Token learned
Token ID4499
Feature activation-2.83e-02

Pos logit contributions
Aud+0.059
 sixteen+0.052
 brace+0.052
br+0.050
 grapes+0.050
Neg logit contributions
 worry-0.026
 adult-0.023
 son-0.022
 actions-0.020
ie-0.019
Token that
Token ID326
Feature activation+4.15e-02

Pos logit contributions
Aud+0.059
 wrists+0.056
 brace+0.055
 fet+0.055
 sixteen+0.054
Neg logit contributions
 adult-0.042
 worry-0.038
 explanation-0.035
 son-0.033
 why-0.033
Token av
Token ID1196
Feature activation-2.61e-01

Pos logit contributions
 adult+0.075
 worry+0.074
 explanation+0.068
 son+0.066
urious+0.066
Neg logit contributions
Aud-0.068
 brace-0.065
 fet-0.062
 wrists-0.060
 sixteen-0.058
Tokenoc
Token ID420
Feature activation-7.20e-02

Pos logit contributions
Aud+0.553
 brace+0.529
 wrists+0.520
 fet+0.498
 sixteen+0.489
Neg logit contributions
 worry-0.339
 adult-0.327
urious-0.300
 zone-0.290
 explanation-0.283
Tokenados
Token ID22484
Feature activation-1.03e-01

Pos logit contributions
Aud+0.099
 fet+0.084
 brace+0.077
 wrists+0.076
br+0.073
Neg logit contributions
 adult-0.161
 worry-0.148
 explanation-0.143
 actions-0.143
pa-0.142
Token are
Token ID389
Feature activation-3.54e-03

Pos logit contributions
Aud+0.211
 brace+0.201
 wrists+0.191
 fet+0.190
 sixteen+0.186
Neg logit contributions
 worry-0.146
 adult-0.146
 explanation-0.129
 zone-0.125
urious-0.123
Token good
Token ID922
Feature activation-5.96e-02

Pos logit contributions
Aud+0.007
 wrists+0.007
 brace+0.007
 fet+0.007
ip+0.006
Neg logit contributions
 adult-0.005
 worry-0.005
 explanation-0.004
 actions-0.004
 end-0.004
Token for
Token ID329
Feature activation-5.14e-02

Pos logit contributions
Aud+0.088
 brace+0.086
 fet+0.081
 wrists+0.077
 grapes+0.075
Neg logit contributions
 adult-0.115
 worry-0.113
 explanation-0.105
urious-0.104
 son-0.102
Token girl
Token ID2576
Feature activation+2.58e-02

Pos logit contributions
 adult+0.087
 worry+0.082
 explanation+0.074
urious+0.072
 zone+0.068
Neg logit contributions
Aud-0.227
 brace-0.224
 fet-0.216
 wrists-0.212
 grapes-0.209
Token who
Token ID508
Feature activation+3.19e-02

Pos logit contributions
 adult+0.050
 worry+0.048
 explanation+0.047
urious+0.045
iness+0.045
Neg logit contributions
Aud-0.037
 brace-0.037
 fet-0.036
 wrists-0.034
 grape-0.033
Token didn
Token ID1422
Feature activation-8.33e-03

Pos logit contributions
 adult+0.059
 worry+0.055
 explanation+0.052
 son+0.051
urious+0.050
Neg logit contributions
Aud-0.052
 brace-0.050
 fet-0.048
 wrists-0.045
 grape-0.044
Token't
Token ID470
Feature activation+6.15e-02

Pos logit contributions
Aud+0.012
 brace+0.011
 fet+0.010
 wrists+0.010
 grapes+0.010
Neg logit contributions
 adult-0.017
 worry-0.017
 explanation-0.015
urious-0.015
 son-0.015
Token like
Token ID588
Feature activation-5.70e-02

Pos logit contributions
 adult+0.103
 worry+0.097
urious+0.093
 explanation+0.091
 son+0.090
Neg logit contributions
Aud-0.104
 brace-0.103
 fet-0.099
 grape-0.093
 wrists-0.093
Token av
Token ID1196
Feature activation-2.57e-01

Pos logit contributions
Aud+0.096
 brace+0.086
 fet+0.084
 wrists+0.083
 sixteen+0.083
Neg logit contributions
 worry-0.106
 adult-0.105
 explanation-0.094
 son-0.093
 actions-0.091
Tokenoc
Token ID420
Feature activation-7.45e-02

Pos logit contributions
Aud+0.518
 wrists+0.491
 brace+0.477
 fet+0.473
 sixteen+0.452
Neg logit contributions
 worry-0.352
 adult-0.318
urious-0.308
pa-0.280
 zone-0.275
Tokenados
Token ID22484
Feature activation-9.03e-02

Pos logit contributions
Aud+0.112
 fet+0.098
 wrists+0.089
 pro+0.085
Pur+0.083
Neg logit contributions
 adult-0.158
 worry-0.142
 actions-0.138
pa-0.137
 explanation-0.135
Token.
Token ID13
Feature activation+1.65e-02

Pos logit contributions
Aud+0.203
 sixteen+0.182
 brace+0.182
 wrists+0.179
 fet+0.179
Neg logit contributions
 worry-0.118
 adult-0.111
 son-0.096
 zone-0.095
iness-0.094
Token She
Token ID1375
Feature activation-1.16e-02

Pos logit contributions
 worry+0.027
 adult+0.027
urious+0.024
 explanation+0.024
 outsider+0.024
Neg logit contributions
Aud-0.030
 fet-0.028
 brace-0.028
 pro-0.027
ip-0.027
Token always
Token ID1464
Feature activation+3.06e-03

Pos logit contributions
Aud+0.028
 brace+0.026
 sixteen+0.025
 fet+0.024
 wrists+0.024
Neg logit contributions
 worry-0.013
 adult-0.013
urious-0.011
 son-0.011
 outsider-0.010
Token.
Token ID13
Feature activation+2.36e-02

Pos logit contributions
Aud+0.070
 brace+0.066
 wrists+0.063
 sixteen+0.060
 fet+0.059
Neg logit contributions
 worry-0.041
 adult-0.038
 son-0.037
 zone-0.034
pa-0.034
Token Did
Token ID7731
Feature activation-4.15e-02

Pos logit contributions
 adult+0.040
 worry+0.039
 son+0.035
 zone+0.035
 explanation+0.035
Neg logit contributions
 brace-0.041
Aud-0.041
 wrists-0.037
 fet-0.037
 grapes-0.035
Token you
Token ID345
Feature activation-1.36e-02

Pos logit contributions
Aud+0.083
 fet+0.075
 brace+0.073
 sixteen+0.072
 pro+0.072
Neg logit contributions
 worry-0.065
 adult-0.061
 son-0.055
 actions-0.054
 explanation-0.053
Token know
Token ID760
Feature activation+7.34e-02

Pos logit contributions
Aud+0.030
 brace+0.029
 wrists+0.028
 fet+0.028
 grapes+0.027
Neg logit contributions
 worry-0.018
 adult-0.017
 explanation-0.015
 son-0.014
 outsider-0.014
Token that
Token ID326
Feature activation+2.07e-02

Pos logit contributions
 adult+0.088
 worry+0.081
 why+0.073
 son+0.069
 explanation+0.068
Neg logit contributions
Aud-0.178
 fet-0.167
 brace-0.160
 wrists-0.155
ip-0.153
Token av
Token ID1196
Feature activation-2.46e-01

Pos logit contributions
 worry+0.033
 adult+0.032
 son+0.028
 explanation+0.028
 actions+0.027
Neg logit contributions
Aud-0.042
 fet-0.038
 wrists-0.037
ip-0.037
 sixteen-0.037
Tokenoc
Token ID420
Feature activation-4.41e-02

Pos logit contributions
Aud+0.605
 wrists+0.569
 brace+0.550
 fet+0.537
 Violet+0.521
Neg logit contributions
 worry-0.236
urious-0.214
 adult-0.210
ie-0.191
rated-0.190
Tokenados
Token ID22484
Feature activation-1.01e-01

Pos logit contributions
Aud+0.064
 fet+0.051
 brace+0.050
 wrists+0.047
 pro+0.043
Neg logit contributions
 adult-0.102
pa-0.091
 worry-0.091
 actions-0.089
 explanation-0.088
Token are
Token ID389
Feature activation-4.17e-03

Pos logit contributions
Aud+0.224
 brace+0.209
 wrists+0.203
 fet+0.200
 sixteen+0.193
Neg logit contributions
 worry-0.133
 adult-0.127
iness-0.113
 zone-0.112
 explanation-0.108
Token very
Token ID845
Feature activation-5.72e-02

Pos logit contributions
Aud+0.010
 wrists+0.009
 fet+0.009
 brace+0.008
ip+0.008
Neg logit contributions
 adult-0.006
 worry-0.005
 why-0.005
 end-0.005
 answer-0.005
Token healthy
Token ID5448
Feature activation-6.10e-02

Pos logit contributions
Aud+0.125
 brace+0.112
 wrists+0.111
 fet+0.108
 pro+0.102
Neg logit contributions
 adult-0.084
 worry-0.080
 explanation-0.071
urious-0.070
place-0.067
Token on
Token ID319
Feature activation-1.39e-01

Pos logit contributions
 adult+0.005
 worry+0.005
 explanation+0.005
 zone+0.005
iness+0.005
Neg logit contributions
Aud-0.003
 brace-0.003
 wrists-0.002
 fet-0.002
 sprink-0.002
Token,
Token ID11
Feature activation-8.11e-03

Pos logit contributions
Aud+0.351
 brace+0.329
 fet+0.327
 sixteen+0.322
 wrists+0.321
Neg logit contributions
 worry-0.140
 adult-0.136
 son-0.108
 end-0.105
 actions-0.104
Token whenever
Token ID8797
Feature activation-2.46e-02

Pos logit contributions
Aud+0.018
 wrists+0.017
 sixteen+0.016
 seventeen+0.016
seven+0.016
Neg logit contributions
 worry-0.011
 end-0.010
 adult-0.010
 son-0.009
ie-0.009
Token they
Token ID484
Feature activation-4.98e-02

Pos logit contributions
Aud+0.043
 fet+0.039
 brace+0.039
 wrists+0.039
 sixteen+0.038
Neg logit contributions
 worry-0.046
 adult-0.043
 son-0.040
 God-0.039
 explanation-0.038
Token wanted
Token ID2227
Feature activation+4.90e-02

Pos logit contributions
Aud+0.094
 brace+0.091
 fet+0.086
 wrists+0.085
 sixteen+0.082
Neg logit contributions
 adult-0.077
 worry-0.076
 explanation-0.068
urious-0.067
 son-0.067
Token av
Token ID1196
Feature activation-2.45e-01

Pos logit contributions
 worry+0.083
 adult+0.081
 explanation+0.072
 son+0.071
 actions+0.069
Neg logit contributions
Aud-0.098
 wrists-0.085
 fet-0.084
 brace-0.083
ip-0.080
Tokenoc
Token ID420
Feature activation-6.06e-02

Pos logit contributions
Aud+0.523
 wrists+0.503
 brace+0.496
 fet+0.476
 sixteen+0.470
Neg logit contributions
 worry-0.301
 adult-0.284
urious-0.268
 zone-0.243
pa-0.240
Tokenados
Token ID22484
Feature activation-8.76e-02

Pos logit contributions
Aud+0.086
 fet+0.071
 brace+0.067
 pro+0.066
 wrists+0.065
Neg logit contributions
 adult-0.135
 worry-0.123
 actions-0.119
pa-0.119
 explanation-0.118
Token,
Token ID11
Feature activation-9.67e-03

Pos logit contributions
Aud+0.181
 wrists+0.171
 brace+0.170
 sixteen+0.164
 fet+0.164
Neg logit contributions
 worry-0.126
 adult-0.119
 explanation-0.105
 son-0.104
iness-0.104
Token they
Token ID484
Feature activation-1.86e-02

Pos logit contributions
Aud+0.020
 sixteen+0.019
 wrists+0.018
 brace+0.018
 fet+0.018
Neg logit contributions
 worry-0.014
 adult-0.014
 son-0.013
 nurses-0.012
 explanation-0.012
Token would
Token ID561
Feature activation-4.89e-02

Pos logit contributions
Aud+0.043
 brace+0.040
 wrists+0.038
 sixteen+0.038
con+0.036
Neg logit contributions
 worry-0.024
 adult-0.022
 son-0.020
iness-0.020
 end-0.019
Token a
Token ID257
Feature activation+5.67e-03

Pos logit contributions
 brace+0.047
Aud+0.047
 wrists+0.043
 fet+0.043
ip+0.042
Neg logit contributions
 worry-0.050
 adult-0.049
 explanation-0.044
 son-0.042
 end-0.041
Token scary
Token ID14343
Feature activation+3.89e-02

Pos logit contributions
 worry+0.009
 adult+0.008
 explanation+0.007
 son+0.007
urious+0.007
Neg logit contributions
 brace-0.011
Aud-0.011
 wrists-0.011
 fet-0.010
 sixteen-0.010
Token situation
Token ID3074
Feature activation+1.17e-01

Pos logit contributions
 adult+0.049
 worry+0.046
urious+0.042
 explanation+0.042
 zone+0.040
Neg logit contributions
Aud-0.082
 brace-0.080
 fet-0.077
 wrists-0.075
 grapes-0.074
Token.
Token ID13
Feature activation+6.06e-03

Pos logit contributions
 adult+0.161
 worry+0.158
 zone+0.140
 explanation+0.139
 son+0.138
Neg logit contributions
Aud-0.245
 brace-0.237
 wrists-0.226
 sixteen-0.224
 fet-0.221
Token She
Token ID1375
Feature activation-2.74e-02

Pos logit contributions
 adult+0.010
 worry+0.009
 son+0.009
urious+0.008
 God+0.008
Neg logit contributions
 brace-0.012
Aud-0.011
br-0.010
 wrists-0.010
 fet-0.010
Token coll
Token ID2927
Feature activation-2.42e-01

Pos logit contributions
Aud+0.058
 brace+0.057
 grapes+0.054
 wrists+0.053
 sixteen+0.053
Neg logit contributions
 adult-0.037
 worry-0.036
 son-0.033
 explanation-0.030
 actions-0.029
Tokened
Token ID276
Feature activation-4.69e-02

Pos logit contributions
Aud+0.416
 brace+0.404
 wrists+0.396
 fet+0.387
 pro+0.377
Neg logit contributions
 adult-0.413
 worry-0.388
 explanation-0.379
iness-0.356
 son-0.351
Token her
Token ID607
Feature activation-2.81e-02

Pos logit contributions
Aud+0.078
 brace+0.077
 sixteen+0.072
 wrists+0.071
br+0.069
Neg logit contributions
 adult-0.081
 worry-0.081
 son-0.076
 explanation-0.074
 nurses-0.071
Token the
Token ID262
Feature activation-1.31e-03

Pos logit contributions
Aud+0.072
 brace+0.069
 fet+0.066
 sixteen+0.065
 wrists+0.064
Neg logit contributions
 adult-0.028
 worry-0.027
 son-0.023
 explanation-0.022
urious-0.021
Token brav
Token ID49025
Feature activation+3.83e-02

Pos logit contributions
Aud+0.003
 brace+0.003
 fet+0.003
 wrists+0.003
 sixteen+0.003
Neg logit contributions
 adult-0.002
 worry-0.002
 explanation-0.001
 son-0.001
urious-0.001
Tokenest
Token ID395
Feature activation+3.90e-02

Pos logit contributions
 adult+0.044
guard+0.042
washer+0.040
 outage+0.040
 worry+0.039
Neg logit contributions
Aud-0.075
 brace-0.075
 wrists-0.074
 sixteen-0.072
 grape-0.071
Token of
Token ID286
Feature activation-1.14e-02

Pos logit contributions
Aud+0.224
 wrists+0.217
 brace+0.214
 fet+0.207
 pro+0.200
Neg logit contributions
 adult-0.081
 worry-0.063
 explanation-0.062
place-0.055
iness-0.055
Token fabric
Token ID9664
Feature activation-6.88e-02

Pos logit contributions
Aud+0.020
 fet+0.018
 brace+0.018
 wrists+0.018
 pro+0.017
Neg logit contributions
 adult-0.020
 worry-0.019
 explanation-0.018
 son-0.018
 actions-0.017
Token and
Token ID290
Feature activation-1.55e-02

Pos logit contributions
Aud+0.140
 brace+0.126
 sixteen+0.123
 wrists+0.122
 fet+0.122
Neg logit contributions
 adult-0.095
 worry-0.093
iness-0.085
 zone-0.085
 son-0.083
Token buttons
Token ID12163
Feature activation-4.91e-03

Pos logit contributions
Aud+0.029
 pro+0.025
 sixteen+0.025
ip+0.025
 fet+0.024
Neg logit contributions
 worry-0.024
 adult-0.024
 son-0.022
 nurses-0.022
 actions-0.021
Token and
Token ID290
Feature activation-2.36e-02

Pos logit contributions
Aud+0.010
 brace+0.009
 fet+0.009
 grape+0.009
 wrists+0.009
Neg logit contributions
 adult-0.007
 worry-0.006
iness-0.006
 zone-0.006
 son-0.006
Token rib
Token ID12183
Feature activation-2.41e-01

Pos logit contributions
Aud+0.043
ip+0.038
 pro+0.037
app+0.037
 fet+0.036
Neg logit contributions
 worry-0.037
 adult-0.036
 son-0.034
 nurses-0.033
 actions-0.033
Tokenbons
Token ID23461
Feature activation-1.10e-01

Pos logit contributions
Aud+0.250
 grape+0.240
 pro+0.239
 fet+0.217
 grapes+0.216
Neg logit contributions
 adult-0.518
 worry-0.489
pa-0.488
 eye-0.483
urious-0.474
Token.
Token ID13
Feature activation-1.69e-02

Pos logit contributions
Aud+0.245
 wrists+0.219
 grape+0.219
 brace+0.214
 fet+0.213
Neg logit contributions
iness-0.128
 adult-0.122
 worry-0.119
 zone-0.119
guard-0.109
Token She
Token ID1375
Feature activation-3.36e-02

Pos logit contributions
 brace+0.030
Aud+0.030
 wrists+0.029
 fet+0.028
 sixteen+0.027
Neg logit contributions
 adult-0.027
 worry-0.025
 son-0.024
 explanation-0.023
 God-0.023
Token cut
Token ID2005
Feature activation-3.78e-02

Pos logit contributions
Aud+0.068
 brace+0.062
 sixteen+0.060
 grapes+0.059
 wrists+0.058
Neg logit contributions
 adult-0.047
 worry-0.045
 son-0.042
urious-0.040
 explanation-0.039
Token a
Token ID257
Feature activation-1.57e-02

Pos logit contributions
Aud+0.075
 fet+0.065
 wrists+0.064
 brace+0.062
 pro+0.061
Neg logit contributions
 adult-0.056
 son-0.049
 worry-0.049
 end-0.047
 side-0.046
Tokenongo
Token ID25162
Feature activation-2.29e-02

Pos logit contributions
 adult+0.131
 worry+0.130
 explanation+0.123
 zone+0.117
urious+0.117
Neg logit contributions
Aud-0.081
 brace-0.079
 fet-0.073
ip-0.070
 wrists-0.068
Token didn
Token ID1422
Feature activation-2.22e-02

Pos logit contributions
Aud+0.053
 brace+0.050
 sixteen+0.047
 wrists+0.047
 fet+0.047
Neg logit contributions
 adult-0.027
 worry-0.027
urious-0.024
 son-0.023
 explanation-0.022
Token't
Token ID470
Feature activation+4.66e-02

Pos logit contributions
Aud+0.032
 brace+0.032
 wrists+0.030
 fet+0.029
 grapes+0.027
Neg logit contributions
 adult-0.044
 worry-0.043
 explanation-0.039
urious-0.038
 son-0.038
Token like
Token ID588
Feature activation-6.88e-02

Pos logit contributions
 adult+0.068
urious+0.064
 worry+0.061
 explanation+0.060
 son+0.059
Neg logit contributions
 brace-0.086
Aud-0.085
 fet-0.085
 grape-0.082
 grapes-0.080
Token wet
Token ID9583
Feature activation-5.94e-02

Pos logit contributions
Aud+0.128
 fet+0.107
 wrists+0.105
ip+0.102
 brace+0.101
Neg logit contributions
 worry-0.129
 adult-0.128
 actions-0.110
 son-0.109
 explanation-0.108
Token av
Token ID1196
Feature activation-2.41e-01

Pos logit contributions
Aud+0.115
 fet+0.103
 brace+0.101
 wrists+0.097
 sixteen+0.095
Neg logit contributions
 worry-0.098
 adult-0.096
 son-0.084
 zone-0.084
iness-0.084
Tokenoc
Token ID420
Feature activation-6.01e-02

Pos logit contributions
Aud+0.525
 wrists+0.502
 brace+0.485
 fet+0.470
 sixteen+0.461
Neg logit contributions
 worry-0.289
 adult-0.255
urious-0.245
 zone-0.223
rated-0.222
Tokenados
Token ID22484
Feature activation-8.77e-02

Pos logit contributions
Aud+0.090
 fet+0.076
 wrists+0.071
 pro+0.066
 brace+0.065
Neg logit contributions
 adult-0.129
 worry-0.117
 actions-0.113
 explanation-0.111
iness-0.109
Token,
Token ID11
Feature activation+3.28e-03

Pos logit contributions
Aud+0.178
 brace+0.165
 wrists+0.162
 sixteen+0.159
 fet+0.159
Neg logit contributions
 worry-0.130
 adult-0.122
 son-0.107
iness-0.107
 explanation-0.106
Token so
Token ID523
Feature activation-2.91e-02

Pos logit contributions
 adult+0.006
 worry+0.006
 explanation+0.005
 son+0.005
urious+0.005
Neg logit contributions
Aud-0.006
 brace-0.005
 fet-0.005
 wrists-0.005
 sixteen-0.005
Token he
Token ID339
Feature activation-2.30e-02

Pos logit contributions
Aud+0.052
 brace+0.051
 fet+0.048
 wrists+0.047
 sixteen+0.046
Neg logit contributions
 adult-0.047
 worry-0.046
 explanation-0.042
urious-0.041
 zone-0.041
Token before
Token ID878
Feature activation-3.23e-02

Pos logit contributions
Aud+0.052
 brace+0.048
 fet+0.047
 wrists+0.047
ip+0.044
Neg logit contributions
 worry-0.036
 adult-0.035
 son-0.031
 explanation-0.030
 zone-0.030
Token.
Token ID13
Feature activation-4.34e-03

Pos logit contributions
Aud+0.075
 brace+0.068
 fet+0.066
 wrists+0.065
 sixteen+0.062
Neg logit contributions
 worry-0.043
 adult-0.041
 explanation-0.035
 son-0.034
urious-0.034
Token[newline]
Token ID198
Feature activation-1.15e-02

Pos logit contributions
 brace+0.010
Aud+0.010
 wrists+0.009
 grapes+0.008
br+0.008
Neg logit contributions
 adult-0.005
 son-0.005
 God-0.005
Son-0.004
 zone-0.004
Token[newline]
Token ID198
Feature activation-8.61e-03

Pos logit contributions
 brace+0.025
Aud+0.024
 wrists+0.024
 fet+0.022
 grapes+0.022
Neg logit contributions
 adult-0.015
 worry-0.013
 explanation-0.013
urious-0.012
 son-0.012
TokenThe
Token ID464
Feature activation-7.77e-03

Pos logit contributions
 brace+0.017
 wrists+0.017
Aud+0.016
 fet+0.015
 grapes+0.015
Neg logit contributions
 adult-0.013
 worry-0.011
 explanation-0.011
urious-0.011
 end-0.011
Token av
Token ID1196
Feature activation-2.41e-01

Pos logit contributions
Aud+0.017
 brace+0.016
 fet+0.016
 wrists+0.016
 sixteen+0.015
Neg logit contributions
 adult-0.010
 worry-0.010
 explanation-0.009
 son-0.008
urious-0.008
Tokenoc
Token ID420
Feature activation-4.95e-02

Pos logit contributions
Aud+0.571
 brace+0.537
 wrists+0.531
 fet+0.513
 sixteen+0.495
Neg logit contributions
 adult-0.251
 worry-0.249
urious-0.231
 explanation-0.200
ie-0.197
Tokenados
Token ID22484
Feature activation-8.83e-02

Pos logit contributions
Aud+0.064
 fet+0.053
 brace+0.051
 wrists+0.047
 sixteen+0.045
Neg logit contributions
 adult-0.117
 worry-0.109
 explanation-0.104
 son-0.102
urious-0.102
Token were
Token ID547
Feature activation+2.67e-03

Pos logit contributions
Aud+0.185
 brace+0.168
 wrists+0.160
 fet+0.159
 sixteen+0.157
Neg logit contributions
 worry-0.128
 adult-0.125
 explanation-0.112
iness-0.109
urious-0.106
Token shared
Token ID4888
Feature activation-6.65e-02

Pos logit contributions
 worry+0.004
 adult+0.004
 end+0.004
 son+0.003
 father+0.003
Neg logit contributions
Aud-0.006
 fet-0.005
con-0.005
 brace-0.005
 wrists-0.005
Token with
Token ID351
Feature activation-7.09e-03

Pos logit contributions
Aud+0.164
ip+0.141
seven+0.140
 wrists+0.137
con+0.137
Neg logit contributions
 worry-0.076
 son-0.065
 end-0.064
 adult-0.064
 explanation-0.060
Token.
Token ID13
Feature activation-6.15e-03

Pos logit contributions
Aud+0.122
 brace+0.111
 fet+0.109
 sixteen+0.106
 wrists+0.105
Neg logit contributions
 worry-0.081
 adult-0.079
 son-0.074
 zone-0.072
 explanation-0.068
Token They
Token ID1119
Feature activation+7.60e-03

Pos logit contributions
Aud+0.011
 brace+0.011
 fet+0.010
 wrists+0.010
 pro+0.009
Neg logit contributions
 adult-0.010
 worry-0.010
 son-0.010
 explanation-0.009
 zone-0.009
Token had
Token ID550
Feature activation-5.32e-02

Pos logit contributions
 worry+0.011
 adult+0.011
 son+0.010
 explanation+0.009
iness+0.009
Neg logit contributions
Aud-0.015
 brace-0.014
 wrists-0.013
 grapes-0.013
 sixteen-0.013
Token bells
Token ID30987
Feature activation-4.78e-02

Pos logit contributions
ip+0.101
Aud+0.097
�+0.094
 pro+0.088
 grape+0.088
Neg logit contributions
 worry-0.080
 adult-0.072
 son-0.072
 island-0.068
 ig-0.066
Token and
Token ID290
Feature activation-8.08e-03

Pos logit contributions
Aud+0.100
 brace+0.094
 grapes+0.093
 fet+0.093
 grape+0.092
Neg logit contributions
 worry-0.062
 adult-0.055
 son-0.054
 zone-0.052
iness-0.051
Token rib
Token ID12183
Feature activation-2.37e-01

Pos logit contributions
Aud+0.014
ip+0.012
 fet+0.012
 pro+0.012
 grape+0.012
Neg logit contributions
 worry-0.014
 adult-0.013
 son-0.013
 nurses-0.012
 actions-0.012
Tokenbons
Token ID23461
Feature activation-1.03e-01

Pos logit contributions
Aud+0.227
 pro+0.226
 grape+0.219
 fet+0.208
 brace+0.198
Neg logit contributions
 adult-0.534
 worry-0.514
pa-0.514
urious-0.494
 eye-0.486
Token.
Token ID13
Feature activation-1.52e-02

Pos logit contributions
Aud+0.240
 grape+0.219
 wrists+0.219
 grapes+0.216
 brace+0.216
Neg logit contributions
iness-0.104
 worry-0.101
 adult-0.098
 zone-0.097
guard-0.089
Token They
Token ID1119
Feature activation+3.70e-03

Pos logit contributions
 brace+0.027
Aud+0.027
 fet+0.025
 wrists+0.025
 pro+0.023
Neg logit contributions
 adult-0.025
 worry-0.024
 son-0.023
 explanation-0.022
 zone-0.021
Token made
Token ID925
Feature activation+2.57e-02

Pos logit contributions
 worry+0.005
 adult+0.005
 son+0.005
 explanation+0.004
iness+0.004
Neg logit contributions
Aud-0.007
 brace-0.007
 grapes-0.007
 wrists-0.006
 sixteen-0.006
Token soft
Token ID2705
Feature activation-3.54e-02

Pos logit contributions
 son+0.037
 adult+0.037
 worry+0.037
 explanation+0.034
 God+0.033
Neg logit contributions
Aud-0.052
 fet-0.045
ip-0.045
 brace-0.045
�-0.045
Token wear
Token ID5806
Feature activation-7.18e-02

Pos logit contributions
Aud+0.096
 grapes+0.090
 fet+0.088
 brace+0.088
 wrists+0.085
Neg logit contributions
 worry-0.061
 adult-0.057
 end-0.054
 answer-0.051
 pray-0.049
Token hats
Token ID23910
Feature activation-7.26e-02

Pos logit contributions
 fet+0.141
Aud+0.136
 scored+0.130
 pro+0.130
�+0.127
Neg logit contributions
 why-0.082
 worry-0.082
 adult-0.079
 God-0.079
 son-0.073
Token and
Token ID290
Feature activation-8.41e-03

Pos logit contributions
Aud+0.183
 brace+0.175
 fet+0.173
 wrists+0.172
pill+0.166
Neg logit contributions
 worry-0.062
 zone-0.060
 end-0.054
 adult-0.053
iness-0.051
Token clips
Token ID19166
Feature activation-1.19e-01

Pos logit contributions
Aud+0.014
 pro+0.014
 beautifully+0.014
app+0.013
�+0.013
Neg logit contributions
 worry-0.014
 adult-0.013
 son-0.013
 nurses-0.012
 actions-0.012
Token and
Token ID290
Feature activation-4.48e-03

Pos logit contributions
Aud+0.294
 brace+0.273
 fet+0.271
 grape+0.265
 wrists+0.264
Neg logit contributions
 zone-0.113
iness-0.099
 adult-0.099
 end-0.097
 worry-0.097
Token rib
Token ID12183
Feature activation-2.37e-01

Pos logit contributions
Aud+0.007
 pro+0.007
app+0.006
 beautifully+0.006
�+0.006
Neg logit contributions
 worry-0.008
 adult-0.007
 son-0.007
 actions-0.007
 nurses-0.007
Tokenbons
Token ID23461
Feature activation-1.09e-01

Pos logit contributions
 pro+0.309
 grape+0.279
 fet+0.266
Aud+0.262
�+0.253
Neg logit contributions
 adult-0.437
pa-0.433
 eye-0.412
ent-0.408
 purposes-0.399
Token.
Token ID13
Feature activation-1.11e-02

Pos logit contributions
Aud+0.275
 fet+0.254
 wrists+0.253
 brace+0.253
 grape+0.249
Neg logit contributions
 zone-0.092
iness-0.091
 end-0.086
 adult-0.075
 worry-0.070
Token You
Token ID921
Feature activation-9.79e-03

Pos logit contributions
 brace+0.020
Aud+0.020
 fet+0.019
 pro+0.018
 wrists+0.018
Neg logit contributions
 adult-0.017
 worry-0.016
 zone-0.015
 God-0.015
 son-0.015
Token can
Token ID460
Feature activation-4.95e-02

Pos logit contributions
Aud+0.021
 brace+0.020
 wrists+0.020
 grapes+0.020
 grape+0.019
Neg logit contributions
 worry-0.013
 adult-0.011
 explanation-0.010
 pray-0.010
 zone-0.010
Token be
Token ID307
Feature activation+2.66e-02

Pos logit contributions
Aud+0.106
 grapes+0.098
 fet+0.097
 brace+0.096
 grape+0.093
Neg logit contributions
 worry-0.068
 adult-0.064
 end-0.057
 answer-0.055
 pray-0.053
Token but
Token ID475
Feature activation+3.89e-02

Pos logit contributions
 worry+0.013
 adult+0.012
 son+0.010
 explanation+0.010
 actions+0.010
Neg logit contributions
Aud-0.030
 brace-0.028
 wrists-0.028
 fet-0.027
 sixteen-0.026
Token he
Token ID339
Feature activation-4.38e-02

Pos logit contributions
 adult+0.057
 worry+0.056
 son+0.050
 explanation+0.049
 actions+0.046
Neg logit contributions
Aud-0.078
 brace-0.074
 fet-0.072
 wrists-0.072
 sixteen-0.070
Token also
Token ID635
Feature activation-1.64e-02

Pos logit contributions
Aud+0.087
 brace+0.085
 fet+0.080
 wrists+0.079
 sixteen+0.077
Neg logit contributions
 adult-0.063
 worry-0.062
 explanation-0.055
 son-0.054
urious-0.054
Token knew
Token ID2993
Feature activation+5.06e-03

Pos logit contributions
Aud+0.035
 brace+0.034
 wrists+0.032
 fet+0.032
 sixteen+0.031
Neg logit contributions
 worry-0.022
 adult-0.021
 son-0.019
urious-0.018
 explanation-0.018
Token cater
Token ID20825
Feature activation-1.15e-01

Pos logit contributions
 adult+0.008
 worry+0.008
 son+0.007
 actions+0.007
 explanation+0.007
Neg logit contributions
Aud-0.010
 brace-0.009
 sixteen-0.009
 fet-0.009
 wrists-0.009
Tokenpill
Token ID27215
Feature activation-2.37e-01

Pos logit contributions
 wrists+0.133
 brace+0.126
Aud+0.124
 sixteen+0.123
 eighteen+0.111
Neg logit contributions
 adult-0.251
iness-0.248
 zone-0.240
 outsider-0.236
 worry-0.235
Tokenars
Token ID945
Feature activation-7.27e-03

Pos logit contributions
Aud+0.378
 brace+0.368
 wrists+0.351
 fet+0.336
 sixteen+0.328
Neg logit contributions
 adult-0.421
 worry-0.412
 explanation-0.388
iness-0.377
urious-0.376
Token were
Token ID547
Feature activation-1.97e-02

Pos logit contributions
Aud+0.015
 brace+0.014
 wrists+0.014
 fet+0.013
 sixteen+0.013
Neg logit contributions
 worry-0.010
 adult-0.010
iness-0.009
 explanation-0.009
 zone-0.009
Token not
Token ID407
Feature activation+5.08e-02

Pos logit contributions
Aud+0.042
 fet+0.038
ip+0.038
 wrists+0.037
 brace+0.037
Neg logit contributions
 adult-0.028
 worry-0.027
 son-0.024
 why-0.023
 answer-0.023
Token pets
Token ID17252
Feature activation-8.69e-02

Pos logit contributions
 worry+0.080
 adult+0.073
 son+0.066
 answer+0.063
 option+0.063
Neg logit contributions
Aud-0.106
 wrists-0.095
ip-0.095
�-0.095
 fet-0.094
Token.
Token ID13
Feature activation-2.15e-02

Pos logit contributions
Aud+0.200
 wrists+0.184
 sixteen+0.183
 brace+0.182
 fet+0.180
Neg logit contributions
 worry-0.106
 adult-0.105
 zone-0.096
iness-0.095
 end-0.089
Token Did
Token ID7731
Feature activation-5.53e-02

Pos logit contributions
 adult+0.017
 worry+0.016
 zone+0.015
 explanation+0.015
 son+0.015
Neg logit contributions
 brace-0.019
Aud-0.018
 wrists-0.017
 fet-0.017
 grapes-0.017
Token you
Token ID345
Feature activation-2.44e-02

Pos logit contributions
Aud+0.115
 fet+0.104
 brace+0.102
 pro+0.101
 sixteen+0.100
Neg logit contributions
 worry-0.083
 adult-0.077
 son-0.070
 actions-0.067
 explanation-0.066
Token know
Token ID760
Feature activation+6.69e-02

Pos logit contributions
Aud+0.056
 brace+0.054
 wrists+0.051
 fet+0.051
 grapes+0.050
Neg logit contributions
 worry-0.030
 adult-0.029
 explanation-0.025
 son-0.024
 outsider-0.023
Token that
Token ID326
Feature activation+1.56e-02

Pos logit contributions
 adult+0.085
 worry+0.079
 why+0.069
 son+0.068
 God+0.067
Neg logit contributions
Aud-0.157
 fet-0.144
 brace-0.141
 wrists-0.136
 sixteen-0.135
Token cater
Token ID20825
Feature activation-1.16e-01

Pos logit contributions
 worry+0.025
 adult+0.024
 son+0.022
 explanation+0.021
 God+0.020
Neg logit contributions
Aud-0.032
 fet-0.029
 brace-0.029
 wrists-0.028
ip-0.028
Tokenpill
Token ID27215
Feature activation-2.37e-01

Pos logit contributions
 wrists+0.189
 brace+0.177
Aud+0.173
 sixteen+0.167
 eighteen+0.156
Neg logit contributions
 adult-0.206
iness-0.198
heart-0.196
 zone-0.193
ie-0.190
Tokenars
Token ID945
Feature activation-3.85e-03

Pos logit contributions
Aud+0.442
 wrists+0.439
 brace+0.433
 sprink+0.393
 pro+0.392
Neg logit contributions
 worry-0.331
 adult-0.329
 explanation-0.321
place-0.319
iness-0.314
Token turn
Token ID1210
Feature activation+2.72e-02

Pos logit contributions
Aud+0.008
 brace+0.008
 wrists+0.007
 fet+0.007
 sixteen+0.007
Neg logit contributions
 worry-0.006
 adult-0.005
iness-0.005
 zone-0.005
 explanation-0.005
Token into
Token ID656
Feature activation-4.42e-02

Pos logit contributions
 adult+0.042
 worry+0.042
iness+0.037
ie+0.037
 zone+0.037
Neg logit contributions
Aud-0.053
 brace-0.048
 wrists-0.047
 fet-0.044
 pro-0.042
Token beautiful
Token ID4950
Feature activation-1.60e-02

Pos logit contributions
Aud+0.091
 fet+0.077
ip+0.077
 wrists+0.076
 brace+0.076
Neg logit contributions
 adult-0.071
 worry-0.071
 ig-0.060
 son-0.059
 God-0.058
Token butterflies
Token ID44571
Feature activation-2.26e-02

Pos logit contributions
Aud+0.035
 fet+0.029
 sixteen+0.028
ip+0.028
 brace+0.027
Neg logit contributions
 worry-0.023
 adult-0.022
heart-0.022
ie-0.021
 creatures-0.021
Token can
Token ID460
Feature activation-4.76e-02

Pos logit contributions
Aud+0.067
 brace+0.064
 wrists+0.060
 fet+0.059
 sixteen+0.058
Neg logit contributions
 worry-0.047
 adult-0.047
 explanation-0.040
 son-0.040
 zone-0.040
Token sew
Token ID34249
Feature activation-7.93e-02

Pos logit contributions
Aud+0.099
 brace+0.096
 fet+0.092
 wrists+0.090
 sixteen+0.087
Neg logit contributions
 adult-0.064
 worry-0.062
 explanation-0.056
urious-0.055
 son-0.054
Token some
Token ID617
Feature activation+3.41e-02

Pos logit contributions
Aud+0.156
 fet+0.146
pill+0.132
 grape+0.131
 scored+0.131
Neg logit contributions
 worry-0.105
 end-0.104
 adult-0.102
 son-0.101
 zone-0.098
Token buttons
Token ID12163
Feature activation+2.37e-02

Pos logit contributions
 adult+0.057
 worry+0.053
 explanation+0.050
place+0.050
 end+0.049
Neg logit contributions
Aud-0.066
 fet-0.056
 sixteen-0.054
 scored-0.054
 wrists-0.053
Token or
Token ID393
Feature activation-3.67e-02

Pos logit contributions
 adult+0.032
 worry+0.031
 zone+0.030
iness+0.029
 son+0.029
Neg logit contributions
Aud-0.050
 brace-0.046
 fet-0.045
 wrists-0.044
 grape-0.043
Token rib
Token ID12183
Feature activation-2.34e-01

Pos logit contributions
Aud+0.068
ip+0.060
 fet+0.058
 brace+0.058
 pro+0.057
Neg logit contributions
 worry-0.062
 adult-0.059
 son-0.054
 explanation-0.053
 nurses-0.052
Tokenbons
Token ID23461
Feature activation-1.06e-01

Pos logit contributions
Aud+0.281
 pro+0.276
 grape+0.270
 fet+0.263
 grapes+0.242
Neg logit contributions
pa-0.444
 adult-0.441
 eye-0.422
 end-0.419
 purposes-0.405
Token or
Token ID393
Feature activation-3.70e-02

Pos logit contributions
Aud+0.242
 brace+0.221
 fet+0.221
 wrists+0.216
 grape+0.210
Neg logit contributions
 adult-0.120
 zone-0.118
iness-0.117
 worry-0.114
 end-0.112
Token patches
Token ID16082
Feature activation-6.73e-02

Pos logit contributions
Aud+0.064
ip+0.055
 fet+0.054
 pro+0.054
 grape+0.052
Neg logit contributions
 worry-0.067
 adult-0.064
 son-0.060
 nurses-0.058
 actions-0.058
Token.
Token ID13
Feature activation-2.48e-02

Pos logit contributions
Aud+0.153
 brace+0.146
 wrists+0.143
 fet+0.138
 pro+0.128
Neg logit contributions
 adult-0.085
iness-0.080
 zone-0.079
 worry-0.077
 end-0.074
Token Would
Token ID10928
Feature activation-6.18e-02

Pos logit contributions
 brace+0.049
Aud+0.048
 fet+0.045
 wrists+0.045
 grapes+0.043
Neg logit contributions
 adult-0.035
 worry-0.035
 son-0.031
 explanation-0.031
 zone-0.030
Token[space]
Token ID220
Feature activation-3.70e-02

Pos logit contributions
 adult+0.013
 son+0.012
 worry+0.011
urious+0.011
 explanation+0.011
Neg logit contributions
 brace-0.016
Aud-0.015
br-0.013
 wrists-0.013
 grapes-0.012
Token[newline]
Token ID198
Feature activation-4.18e-04

Pos logit contributions
 brace+0.078
Aud+0.076
 wrists+0.073
 grapes+0.069
 pro+0.069
Neg logit contributions
 adult-0.050
 son-0.044
 worry-0.044
urious-0.043
 explanation-0.042
Token[newline]
Token ID198
Feature activation+1.37e-03

Pos logit contributions
 brace+0.001
Aud+0.001
 wrists+0.001
 grapes+0.001
 fet+0.001
Neg logit contributions
 adult-0.001
 worry-0.001
 explanation-0.001
urious-0.001
 son-0.000
TokenWhen
Token ID2215
Feature activation-2.01e-02

Pos logit contributions
 adult+0.002
 worry+0.002
urious+0.002
 explanation+0.002
Son+0.002
Neg logit contributions
 brace-0.003
Aud-0.002
 wrists-0.002
 fet-0.002
 grapes-0.002
Token the
Token ID262
Feature activation-7.34e-02

Pos logit contributions
Aud+0.044
 brace+0.041
 wrists+0.038
 fet+0.038
pill+0.035
Neg logit contributions
 worry-0.031
 adult-0.029
 son-0.028
 God-0.026
 child-0.026
Token coll
Token ID2927
Feature activation-2.34e-01

Pos logit contributions
 brace+0.158
Aud+0.157
 wrists+0.149
 fet+0.149
 sprink+0.148
Neg logit contributions
 adult-0.085
 worry-0.083
 outsider-0.077
 zone-0.077
urious-0.076
Tokenage
Token ID496
Feature activation-4.18e-02

Pos logit contributions
Aud+0.432
 wrists+0.412
 brace+0.393
 fet+0.392
 pro+0.390
Neg logit contributions
 adult-0.384
 explanation-0.354
 worry-0.328
iness-0.325
 end-0.303
Token was
Token ID373
Feature activation-8.27e-03

Pos logit contributions
Aud+0.093
 brace+0.088
 wrists+0.085
 fet+0.084
 grapes+0.082
Neg logit contributions
 adult-0.054
 worry-0.051
 explanation-0.046
 son-0.044
iness-0.043
Token finished
Token ID5201
Feature activation-7.20e-02

Pos logit contributions
Aud+0.016
 brace+0.016
 fet+0.016
 wrists+0.015
pill+0.015
Neg logit contributions
 worry-0.012
 adult-0.012
 end-0.011
 son-0.011
 explanation-0.011
Token,
Token ID11
Feature activation+6.84e-03

Pos logit contributions
Aud+0.185
 fet+0.163
 brace+0.160
 wrists+0.159
 sixteen+0.152
Neg logit contributions
 worry-0.084
 adult-0.074
 son-0.066
 explanation-0.064
iness-0.057
Token the
Token ID262
Feature activation-2.61e-02

Pos logit contributions
 worry+0.011
 son+0.011
 adult+0.010
 child+0.009
 God+0.009
Neg logit contributions
Aud-0.015
 brace-0.013
 fet-0.013
 wrists-0.012
br-0.012
Token a
Token ID257
Feature activation+1.49e-03

Pos logit contributions
 adult+0.030
 worry+0.030
 explanation+0.026
 son+0.025
 warnings+0.025
Neg logit contributions
Aud-0.046
ip-0.043
 wrists-0.043
�-0.043
 pro-0.042
Token sound
Token ID2128
Feature activation+6.35e-02

Pos logit contributions
 worry+0.002
 adult+0.002
 explanation+0.002
 son+0.002
 end+0.002
Neg logit contributions
Aud-0.003
 brace-0.003
ip-0.003
 wrists-0.002
 fet-0.002
Token like
Token ID588
Feature activation-5.03e-02

Pos logit contributions
 adult+0.088
 worry+0.088
 zone+0.078
 explanation+0.076
iness+0.076
Neg logit contributions
Aud-0.134
 brace-0.128
 wrists-0.124
 fet-0.119
 grapes-0.114
Token this
Token ID428
Feature activation+3.42e-02

Pos logit contributions
Aud+0.107
ip+0.098
 wrists+0.095
 brace+0.094
 sixteen+0.092
Neg logit contributions
 adult-0.064
 God-0.058
 worry-0.057
 son-0.054
 why-0.053
Token:
Token ID25
Feature activation+9.48e-03

Pos logit contributions
 worry+0.049
 adult+0.048
 zone+0.043
 son+0.043
 explanation+0.043
Neg logit contributions
Aud-0.074
 brace-0.066
 wrists-0.063
 fet-0.063
ip-0.061
Token rib
Token ID12183
Feature activation-2.31e-01

Pos logit contributions
 adult+0.013
 worry+0.013
 why+0.013
 God+0.013
 son+0.012
Neg logit contributions
Aud-0.019
 brace-0.018
 wrists-0.017
 icing-0.016
ip-0.016
Tokenbit
Token ID2545
Feature activation-3.41e-02

Pos logit contributions
Aud+0.365
 brace+0.340
 pro+0.339
 fet+0.332
ip+0.320
Neg logit contributions
 adult-0.390
 worry-0.381
urious-0.381
 end-0.365
 eye-0.358
Token,
Token ID11
Feature activation+6.83e-04

Pos logit contributions
Aud+0.079
 brace+0.072
 wrists+0.070
 fet+0.068
 sixteen+0.066
Neg logit contributions
 worry-0.041
 adult-0.040
 zone-0.035
iness-0.034
 son-0.034
Token rib
Token ID12183
Feature activation-2.31e-01

Pos logit contributions
 worry+0.001
 adult+0.001
 son+0.001
 actions+0.001
 explanation+0.001
Neg logit contributions
Aud-0.001
ip-0.001
 fet-0.001
 pro-0.001
 brace-0.001
Tokenbit
Token ID2545
Feature activation-4.17e-02

Pos logit contributions
Aud+0.392
 pro+0.386
ip+0.356
 fet+0.355
 brace+0.352
Neg logit contributions
 adult-0.354
 worry-0.346
 eye-0.325
an-0.323
rr-0.322
Token."
Token ID526
Feature activation-7.19e-02

Pos logit contributions
Aud+0.097
 brace+0.087
 wrists+0.086
 fet+0.083
 sixteen+0.081
Neg logit contributions
 worry-0.050
 adult-0.048
 son-0.043
 zone-0.042
iness-0.042
Token this
Token ID428
Feature activation+3.42e-02

Pos logit contributions
Aud+0.107
ip+0.098
 wrists+0.095
 brace+0.094
 sixteen+0.092
Neg logit contributions
 adult-0.064
 God-0.058
 worry-0.057
 son-0.054
 why-0.053
Token:
Token ID25
Feature activation+9.48e-03

Pos logit contributions
 worry+0.049
 adult+0.048
 zone+0.043
 son+0.043
 explanation+0.043
Neg logit contributions
Aud-0.074
 brace-0.066
 wrists-0.063
 fet-0.063
ip-0.061
Token rib
Token ID12183
Feature activation-2.31e-01

Pos logit contributions
 adult+0.013
 worry+0.013
 why+0.013
 God+0.013
 son+0.012
Neg logit contributions
Aud-0.019
 brace-0.018
 wrists-0.017
 icing-0.016
ip-0.016
Tokenbit
Token ID2545
Feature activation-3.41e-02

Pos logit contributions
Aud+0.365
 brace+0.340
 pro+0.339
 fet+0.332
ip+0.320
Neg logit contributions
 adult-0.390
 worry-0.381
urious-0.381
 end-0.365
 eye-0.358
Token,
Token ID11
Feature activation+6.83e-04

Pos logit contributions
Aud+0.079
 brace+0.072
 wrists+0.070
 fet+0.068
 sixteen+0.066
Neg logit contributions
 worry-0.041
 adult-0.040
 zone-0.035
iness-0.034
 son-0.034
Token rib
Token ID12183
Feature activation-2.31e-01

Pos logit contributions
 worry+0.001
 adult+0.001
 son+0.001
 actions+0.001
 explanation+0.001
Neg logit contributions
Aud-0.001
ip-0.001
 fet-0.001
 pro-0.001
 brace-0.001
Tokenbit
Token ID2545
Feature activation-4.17e-02

Pos logit contributions
Aud+0.392
 pro+0.386
ip+0.356
 fet+0.355
 brace+0.352
Neg logit contributions
 adult-0.354
 worry-0.346
 eye-0.325
an-0.323
rr-0.322
Token."
Token ID526
Feature activation-7.19e-02

Pos logit contributions
Aud+0.097
 brace+0.087
 wrists+0.086
 fet+0.083
 sixteen+0.081
Neg logit contributions
 worry-0.050
 adult-0.048
 son-0.043
 zone-0.042
iness-0.042
Token
Token ID198
Feature activation-3.22e-02

Pos logit contributions
Aud+0.153
 brace+0.152
 wrists+0.147
 fet+0.143
 sixteen+0.142
Neg logit contributions
 adult-0.086
 worry-0.080
 son-0.076
 explanation-0.073
 God-0.070
Token
Token ID198
Feature activation-2.76e-02

Pos logit contributions
Aud+0.066
 brace+0.065
 wrists+0.062
 fet+0.061
 grapes+0.061
Neg logit contributions
 adult-0.043
 worry-0.039
 explanation-0.035
urious-0.035
 son-0.035
TokenL
Token ID43
Feature activation-8.02e-02

Pos logit contributions
 icing+0.054
 grapes+0.053
 brace+0.052
ip+0.052
 grape+0.052
Neg logit contributions
 adult-0.036
 worry-0.031
urious-0.029
 elders-0.029
 end-0.029
Token They
Token ID1119
Feature activation+4.81e-03

Pos logit contributions
 brace+0.023
Aud+0.022
 pro+0.021
 wrists+0.021
 fet+0.020
Neg logit contributions
 adult-0.017
 son-0.016
 God-0.015
 worry-0.015
 explanation-0.014
Token had
Token ID550
Feature activation-5.36e-02

Pos logit contributions
 worry+0.007
 adult+0.007
 son+0.006
 explanation+0.006
 nurses+0.006
Neg logit contributions
Aud-0.010
 brace-0.009
 sixteen-0.009
br-0.009
 wrists-0.008
Token different
Token ID1180
Feature activation-1.35e-02

Pos logit contributions
ip+0.102
Aud+0.094
�+0.093
 pro+0.090
br+0.087
Neg logit contributions
 worry-0.088
 adult-0.077
 son-0.071
 permission-0.070
 distress-0.069
Token colors
Token ID7577
Feature activation-1.30e-02

Pos logit contributions
Aud+0.028
 fet+0.025
 brace+0.025
ip+0.025
 pro+0.024
Neg logit contributions
 adult-0.020
 worry-0.019
 explanation-0.018
 son-0.017
 actions-0.017
Token and
Token ID290
Feature activation-5.72e-03

Pos logit contributions
Aud+0.030
 wrists+0.027
 fet+0.027
 brace+0.027
 sixteen+0.026
Neg logit contributions
 adult-0.014
 worry-0.014
iness-0.014
 zone-0.013
 explanation-0.012
Token rib
Token ID12183
Feature activation-2.30e-01

Pos logit contributions
Aud+0.011
ip+0.010
 sixteen+0.010
 fet+0.009
 pro+0.009
Neg logit contributions
 worry-0.009
 adult-0.008
 actions-0.008
 explanation-0.008
 son-0.007
Tokenbons
Token ID23461
Feature activation-1.03e-01

Pos logit contributions
Aud+0.248
 pro+0.247
 wrists+0.218
 fet+0.216
ip+0.209
Neg logit contributions
 adult-0.495
 worry-0.473
urious-0.463
pa-0.460
 explanation-0.453
Token.
Token ID13
Feature activation-1.85e-02

Pos logit contributions
Aud+0.237
 wrists+0.211
 brace+0.205
 grape+0.205
 sixteen+0.205
Neg logit contributions
iness-0.117
 worry-0.111
 adult-0.108
 zone-0.104
guard-0.095
Token They
Token ID1119
Feature activation-4.32e-04

Pos logit contributions
 brace+0.036
Aud+0.035
 fet+0.033
 wrists+0.033
 pro+0.033
Neg logit contributions
 adult-0.027
 son-0.024
 worry-0.024
 God-0.023
 explanation-0.023
Token were
Token ID547
Feature activation-3.93e-03

Pos logit contributions
Aud+0.001
 brace+0.001
 sixteen+0.001
ip+0.001
br+0.001
Neg logit contributions
 worry-0.001
 adult-0.001
 son-0.001
 nurses-0.000
 end-0.000
Token very
Token ID845
Feature activation-6.67e-02

Pos logit contributions
�+0.008
Aud+0.008
ip+0.008
 AW+0.008
br+0.008
Neg logit contributions
 worry-0.005
 adult-0.004
 son-0.004
 father-0.004
 why-0.004
Token"
Token ID1
Feature activation-3.99e-02

Pos logit contributions
 brace+0.049
Aud+0.049
 wrists+0.046
 fet+0.045
 grapes+0.045
Neg logit contributions
 adult-0.043
 worry-0.041
urious-0.037
 explanation-0.036
 son-0.036
TokenR
Token ID49
Feature activation-1.60e-02

Pos logit contributions
Aud+0.073
 brace+0.071
 fet+0.067
 wrists+0.066
 sixteen+0.064
Neg logit contributions
 adult-0.063
 worry-0.062
 explanation-0.057
 son-0.055
urious-0.054
Tokenib
Token ID571
Feature activation-3.36e-02

Pos logit contributions
Aud+0.028
 brace+0.027
 fet+0.026
 wrists+0.025
 grapes+0.025
Neg logit contributions
 adult-0.027
 worry-0.026
 explanation-0.024
 son-0.023
urious-0.023
Tokenbit
Token ID2545
Feature activation-1.89e-02

Pos logit contributions
Aud+0.055
 brace+0.047
 pro+0.046
 wrists+0.046
 fet+0.045
Neg logit contributions
 worry-0.062
 adult-0.061
iness-0.056
an-0.055
 eye-0.055
Token,
Token ID11
Feature activation+1.71e-02

Pos logit contributions
Aud+0.043
 wrists+0.040
 brace+0.039
 fet+0.038
 icing+0.037
Neg logit contributions
 worry-0.024
 adult-0.022
 son-0.020
 explanation-0.019
iness-0.019
Token rib
Token ID12183
Feature activation-2.30e-01

Pos logit contributions
 adult+0.029
 worry+0.029
 son+0.026
 actions+0.026
 explanation+0.025
Neg logit contributions
Aud-0.030
 brace-0.026
 pro-0.025
 sixteen-0.025
 beautifully-0.025
Tokenbit
Token ID2545
Feature activation-2.83e-02

Pos logit contributions
 pro+0.392
Aud+0.386
 brace+0.353
 fet+0.352
ip+0.340
Neg logit contributions
 adult-0.357
 worry-0.330
pa-0.324
rr-0.322
an-0.322
Token,"
Token ID553
Feature activation-1.11e-01

Pos logit contributions
Aud+0.064
 wrists+0.058
 brace+0.057
 fet+0.057
 icing+0.055
Neg logit contributions
 worry-0.036
 adult-0.035
 son-0.031
iness-0.029
 explanation-0.028
Token the
Token ID262
Feature activation-8.43e-02

Pos logit contributions
Aud+0.187
 brace+0.164
 wrists+0.157
 fet+0.156
 sixteen+0.152
Neg logit contributions
 adult-0.204
 worry-0.202
 son-0.198
 God-0.193
 sailor-0.184
Token frog
Token ID21264
Feature activation-8.49e-02

Pos logit contributions
Aud+0.184
 brace+0.168
 fet+0.163
 sixteen+0.160
 wrists+0.159
Neg logit contributions
 adult-0.116
 worry-0.112
 explanation-0.098
 son-0.097
urious-0.091
Token agrees
Token ID14386
Feature activation-5.89e-02

Pos logit contributions
Aud+0.184
 brace+0.169
 wrists+0.165
 sixteen+0.161
 grapes+0.160
Neg logit contributions
 adult-0.104
 worry-0.103
 explanation-0.093
urious-0.092
 son-0.091
Token the
Token ID262
Feature activation+9.96e-03

Pos logit contributions
 worry+0.053
 adult+0.053
 son+0.048
 end+0.047
 answer+0.047
Neg logit contributions
Aud-0.067
 fet-0.059
 sixteen-0.057
br-0.055
ip-0.055
Token clown
Token ID25573
Feature activation-3.29e-02

Pos logit contributions
 adult+0.021
 worry+0.020
 explanation+0.019
urious+0.019
 zone+0.019
Neg logit contributions
Aud-0.013
 brace-0.013
 fet-0.012
 wrists-0.012
 sixteen-0.011
Token toy
Token ID13373
Feature activation-1.68e-02

Pos logit contributions
Aud+0.059
 brace+0.055
 wrists+0.052
 fet+0.051
 sixteen+0.051
Neg logit contributions
 worry-0.055
 adult-0.055
 son-0.049
 explanation-0.049
urious-0.049
Token in
Token ID287
Feature activation-1.97e-02

Pos logit contributions
Aud+0.034
 brace+0.030
 wrists+0.028
 fet+0.028
 sixteen+0.028
Neg logit contributions
 adult-0.027
 worry-0.025
 son-0.024
 zone-0.023
 explanation-0.023
Token the
Token ID262
Feature activation+2.30e-02

Pos logit contributions
Aud+0.031
 fet+0.029
 brace+0.029
ip+0.028
 wrists+0.027
Neg logit contributions
 worry-0.039
 adult-0.038
 son-0.033
 explanation-0.033
 distress-0.033
Token ground
Token ID2323
Feature activation+1.02e-01

Pos logit contributions
 adult+0.032
 worry+0.031
 zone+0.028
urious+0.028
 son+0.027
Neg logit contributions
Aud-0.048
 brace-0.046
 fet-0.043
ip-0.042
 wrists-0.042
Token so
Token ID523
Feature activation+2.51e-03

Pos logit contributions
 worry+0.138
 adult+0.137
 zone+0.121
 explanation+0.119
urious+0.117
Neg logit contributions
Aud-0.222
 brace-0.208
 wrists-0.194
 fet-0.192
 sixteen-0.189
Token it
Token ID340
Feature activation-1.07e-02

Pos logit contributions
 worry+0.004
 adult+0.004
ie+0.003
iness+0.003
 son+0.003
Neg logit contributions
Aud-0.005
 brace-0.005
 wrists-0.005
 sixteen-0.005
 fet-0.004
Token wouldn
Token ID3636
Feature activation-4.32e-02

Pos logit contributions
Aud+0.018
 brace+0.018
 fet+0.017
 wrists+0.016
 sixteen+0.016
Neg logit contributions
 adult-0.018
 worry-0.018
 explanation-0.016
urious-0.016
 son-0.016
Token't
Token ID470
Feature activation+7.37e-02

Pos logit contributions
Aud+0.059
 brace+0.057
 fet+0.052
 wrists+0.052
 grapes+0.049
Neg logit contributions
 adult-0.089
 worry-0.088
 explanation-0.082
 son-0.080
urious-0.080
Token scare
Token ID19437
Feature activation+5.38e-02

Pos logit contributions
 adult+0.129
urious+0.128
 zone+0.125
 son+0.120
 explanation+0.118
Neg logit contributions
 brace-0.119
 fet-0.110
Aud-0.105
 pro-0.099
br-0.099
Token
Token ID220
Feature activation-1.61e-02

Pos logit contributions
 adult+0.048
 worry+0.048
iness+0.044
 explanation+0.044
 zone+0.043
Neg logit contributions
Aud-0.043
 brace-0.042
 fet-0.041
 wrists-0.040
 sixteen-0.039
Token
Token ID198
Feature activation+1.93e-02

Pos logit contributions
Aud+0.024
 brace+0.024
 fet+0.022
 wrists+0.021
 sixteen+0.021
Neg logit contributions
 adult-0.030
 worry-0.030
 explanation-0.028
urious-0.027
 son-0.027
Token
Token ID198
Feature activation+2.23e-02

Pos logit contributions
 adult+0.038
 worry+0.038
 explanation+0.036
 son+0.035
 outsider+0.035
Neg logit contributions
Aud-0.027
 brace-0.025
 fet-0.024
 sixteen-0.023
 pro-0.023
TokenSo
Token ID2396
Feature activation+6.90e-02

Pos logit contributions
 adult+0.041
 worry+0.040
iness+0.039
 son+0.039
 explanation+0.039
Neg logit contributions
Aud-0.034
 brace-0.030
 fet-0.030
 sixteen-0.028
 pro-0.028
Token Andy
Token ID12382
Feature activation+4.31e-02

Pos logit contributions
 adult+0.136
 outsider+0.130
urious+0.130
 zone+0.128
iness+0.124
Neg logit contributions
 brace-0.086
 fet-0.079
Aud-0.077
 sixteen-0.076
 beautifully-0.075
Token asked
Token ID1965
Feature activation+1.06e-01

Pos logit contributions
 adult+0.060
 worry+0.058
 son+0.052
urious+0.052
 explanation+0.051
Neg logit contributions
Aud-0.089
 brace-0.086
 wrists-0.081
 sixteen-0.079
 fet-0.079
Token his
Token ID465
Feature activation+2.30e-02

Pos logit contributions
 adult+0.185
 worry+0.180
 explanation+0.166
 son+0.162
urious+0.159
Neg logit contributions
Aud-0.185
 brace-0.178
 fet-0.168
 wrists-0.165
 sixteen-0.158
Token mom
Token ID1995
Feature activation+8.09e-02

Pos logit contributions
 adult+0.030
 worry+0.029
 explanation+0.026
urious+0.026
 zone+0.025
Neg logit contributions
Aud-0.048
 brace-0.047
 fet-0.045
 wrists-0.045
 grapes-0.043
Token for
Token ID329
Feature activation-6.03e-02

Pos logit contributions
 adult+0.115
 worry+0.110
 explanation+0.099
 son+0.098
urious+0.095
Neg logit contributions
Aud-0.167
 brace-0.161
 fet-0.153
 wrists-0.152
 sixteen-0.146
Token a
Token ID257
Feature activation-2.00e-03

Pos logit contributions
Aud+0.124
 brace+0.120
 fet+0.115
 wrists+0.112
 sixteen+0.109
Neg logit contributions
 adult-0.086
 worry-0.083
 explanation-0.075
 son-0.072
urious-0.071
Token ticket
Token ID7846
Feature activation-1.40e-02

Pos logit contributions
Aud+0.004
 brace+0.004
 wrists+0.003
 sixteen+0.003
 fet+0.003
Neg logit contributions
 adult-0.003
 worry-0.003
 explanation-0.003
 son-0.003
 option-0.003
Token and
Token ID290
Feature activation+2.47e-02

Pos logit contributions
 adult+0.144
 worry+0.142
 explanation+0.133
 son+0.131
 zone+0.130
Neg logit contributions
 brace-0.085
Aud-0.084
 fet-0.073
 wrists-0.071
 grapes-0.071
Token find
Token ID1064
Feature activation+8.05e-02

Pos logit contributions
 adult+0.037
 worry+0.033
 son+0.031
 explanation+0.031
iness+0.031
Neg logit contributions
Aud-0.048
 brace-0.048
 fet-0.045
 wrists-0.044
 grapes-0.043
Token fun
Token ID1257
Feature activation+2.05e-02

Pos logit contributions
 adult+0.160
 worry+0.160
 explanation+0.148
urious+0.146
 zone+0.146
Neg logit contributions
 brace-0.113
Aud-0.113
 fet-0.100
 grapes-0.098
 wrists-0.098
Token things
Token ID1243
Feature activation+6.86e-02

Pos logit contributions
 adult+0.034
 worry+0.033
 explanation+0.031
 son+0.030
urious+0.030
Neg logit contributions
 brace-0.037
Aud-0.034
 fet-0.033
 grapes-0.032
 wrists-0.032
Token.
Token ID13
Feature activation+2.30e-02

Pos logit contributions
 worry+0.104
 adult+0.104
 zone+0.094
 explanation+0.093
 outsider+0.091
Neg logit contributions
Aud-0.126
 brace-0.117
 wrists-0.117
 sixteen-0.115
 fet-0.113
Token One
Token ID1881
Feature activation+8.96e-02

Pos logit contributions
 adult+0.043
 worry+0.043
 explanation+0.039
 son+0.038
 zone+0.038
Neg logit contributions
Aud-0.035
 brace-0.034
 fet-0.032
 wrists-0.031
 grapes-0.030
Token day
Token ID1110
Feature activation+9.82e-03

Pos logit contributions
 adult+0.199
 purposes+0.194
 distress+0.191
rying+0.186
 situations+0.185
Neg logit contributions
 grape-0.093
 high-0.088
 cookie-0.081
 l-0.080
Aud-0.079
Token she
Token ID673
Feature activation-3.57e-02

Pos logit contributions
 adult+0.017
 worry+0.017
 explanation+0.016
urious+0.016
 outsider+0.015
Neg logit contributions
Aud-0.016
 brace-0.016
 fet-0.015
 grapes-0.014
ip-0.014
Token was
Token ID373
Feature activation+4.16e-02

Pos logit contributions
 fet+0.040
Aud+0.037
 brace+0.036
 scored+0.035
ip+0.030
Neg logit contributions
 adult-0.082
 explanation-0.081
iness-0.080
 worry-0.077
edient-0.073
Token playing
Token ID2712
Feature activation+3.90e-03

Pos logit contributions
 adult+0.075
 worry+0.073
 explanation+0.068
urious+0.066
 son+0.066
Neg logit contributions
Aud-0.068
 brace-0.066
 fet-0.062
 wrists-0.061
 sixteen-0.058
Token in
Token ID287
Feature activation+1.85e-03

Pos logit contributions
 adult+0.007
 worry+0.007
 explanation+0.006
urious+0.006
 son+0.006
Neg logit contributions
Aud-0.006
 brace-0.006
 fet-0.006
 wrists-0.006
 sixteen-0.005
Token and
Token ID290
Feature activation+6.04e-03

Pos logit contributions
 adult+0.112
 worry+0.105
 explanation+0.094
 zone+0.094
 son+0.093
Neg logit contributions
Aud-0.199
 wrists-0.175
 brace-0.174
 fet-0.170
 sixteen-0.166
Token were
Token ID547
Feature activation+5.15e-03

Pos logit contributions
 worry+0.010
 adult+0.010
 son+0.010
 explanation+0.009
 zone+0.009
Neg logit contributions
Aud-0.011
 brace-0.010
 wrists-0.010
 fet-0.009
ip-0.009
Token now
Token ID783
Feature activation+1.81e-02

Pos logit contributions
 worry+0.007
 son+0.006
 adult+0.006
 guardian+0.006
 end+0.006
Neg logit contributions
Aud-0.011
 brace-0.011
con-0.011
 wrists-0.010
 grapes-0.010
Token the
Token ID262
Feature activation-1.24e-02

Pos logit contributions
 worry+0.030
 adult+0.027
 son+0.027
 guardian+0.024
 explanation+0.024
Neg logit contributions
Aud-0.035
 wrists-0.033
 brace-0.033
ip-0.031
 sixteen-0.030
Token proud
Token ID6613
Feature activation-2.00e-02

Pos logit contributions
Aud+0.025
 brace+0.024
 fet+0.023
 wrists+0.023
 sixteen+0.022
Neg logit contributions
 adult-0.018
 worry-0.018
 explanation-0.016
 son-0.016
urious-0.015
Token owners
Token ID4393
Feature activation+1.02e-01

Pos logit contributions
Aud+0.042
 brace+0.040
 fet+0.038
 wrists+0.038
 sixteen+0.037
Neg logit contributions
 adult-0.028
 worry-0.027
 explanation-0.024
 son-0.023
urious-0.023
Token of
Token ID286
Feature activation-3.60e-02

Pos logit contributions
 worry+0.139
 adult+0.134
 son+0.124
iness+0.124
 zone+0.123
Neg logit contributions
Aud-0.223
 brace-0.203
 wrists-0.199
 sixteen-0.192
 grapes-0.188
Token the
Token ID262
Feature activation+4.17e-03

Pos logit contributions
Aud+0.071
 brace+0.065
ip+0.064
 fet+0.063
 wrists+0.062
Neg logit contributions
 adult-0.059
 worry-0.057
 son-0.051
 God-0.050
 explanation-0.050
Token bright
Token ID6016
Feature activation+3.50e-02

Pos logit contributions
 adult+0.007
 worry+0.007
 explanation+0.006
urious+0.006
 son+0.006
Neg logit contributions
Aud-0.008
 brace-0.007
 fet-0.007
 wrists-0.007
 sixteen-0.007
Token crown
Token ID12389
Feature activation-8.76e-02

Pos logit contributions
 adult+0.060
 worry+0.060
 outsider+0.054
 eye+0.054
 explanation+0.054
Neg logit contributions
Aud-0.066
 fet-0.058
 sixteen-0.053
 wrists-0.052
ip-0.052
Token!
Token ID0
Feature activation-7.47e-03

Pos logit contributions
Aud+0.196
 brace+0.181
 wrists+0.179
 grapes+0.171
 sixteen+0.171
Neg logit contributions
 worry-0.105
 adult-0.102
 son-0.101
iness-0.099
 zone-0.096
Token belonged
Token ID19611
Feature activation+5.77e-02

Pos logit contributions
Aud+0.122
 brace+0.110
 wrists+0.108
 sixteen+0.108
seven+0.101
Neg logit contributions
 worry-0.066
 adult-0.064
 son-0.055
 explanation-0.055
 actions-0.053
Token to
Token ID284
Feature activation+4.22e-02

Pos logit contributions
 adult+0.084
 worry+0.082
 zone+0.071
 option+0.071
urious+0.071
Neg logit contributions
Aud-0.120
 wrists-0.112
 brace-0.109
 sixteen-0.106
 seventeen-0.098
Token the
Token ID262
Feature activation+4.55e-02

Pos logit contributions
 adult+0.070
 worry+0.069
 son+0.065
 God+0.060
 explanation+0.060
Neg logit contributions
Aud-0.085
 brace-0.079
 wrists-0.075
 fet-0.073
ip-0.073
Token gr
Token ID1036
Feature activation+7.44e-02

Pos logit contributions
 adult+0.062
 worry+0.060
 son+0.052
 explanation+0.049
 end+0.049
Neg logit contributions
Aud-0.104
 brace-0.097
 fet-0.094
 wrists-0.092
 sixteen-0.091
Tokenumpy
Token ID32152
Feature activation+1.54e-01

Pos logit contributions
 worry+0.076
 adult+0.073
 explanation+0.065
 son+0.063
 zone+0.060
Neg logit contributions
Aud-0.174
 brace-0.170
 fet-0.165
 grape-0.161
 grapes-0.161
Token old
Token ID1468
Feature activation+9.63e-02

Pos logit contributions
 adult+0.208
 worry+0.198
 son+0.176
 explanation+0.172
urious+0.163
Neg logit contributions
Aud-0.341
 brace-0.325
 fet-0.312
 wrists-0.306
 sixteen-0.302
Token bear
Token ID6842
Feature activation+4.34e-02

Pos logit contributions
 adult+0.132
 worry+0.126
 son+0.114
 explanation+0.106
 child+0.100
Neg logit contributions
Aud-0.222
 brace-0.200
 sixteen-0.198
 fet-0.195
br-0.189
Token who
Token ID508
Feature activation+1.63e-02

Pos logit contributions
 worry+0.061
 adult+0.059
 son+0.054
 explanation+0.052
 zone+0.051
Neg logit contributions
Aud-0.093
 brace-0.089
 wrists-0.084
 sixteen-0.082
br-0.079
Token lived
Token ID5615
Feature activation-8.28e-04

Pos logit contributions
 worry+0.020
 adult+0.018
 son+0.016
iness+0.015
urious+0.015
Neg logit contributions
Aud-0.039
 brace-0.037
 wrists-0.036
 sixteen-0.035
 eighteen-0.033
Token in
Token ID287
Feature activation-5.05e-03

Pos logit contributions
Aud+0.002
 brace+0.002
 wrists+0.002
 sixteen+0.002
 sprink+0.002
Neg logit contributions
 adult-0.001
 worry-0.001
rying-0.001
urious-0.001
 son-0.001
Token the
Token ID262
Feature activation+2.91e-02

Pos logit contributions
Aud+0.008
 brace+0.008
 wrists+0.007
 fet+0.007
ip+0.007
Neg logit contributions
 worry-0.010
 adult-0.009
 explanation-0.008
 son-0.008
 zone-0.008
Token Lily
Token ID20037
Feature activation-1.27e-02

Pos logit contributions
Aud+0.038
 wrists+0.035
 brace+0.034
 pro+0.033
 sixteen+0.032
Neg logit contributions
 adult-0.038
 worry-0.038
 son-0.034
 nurses-0.034
 explanation-0.034
Token to
Token ID284
Feature activation+6.99e-03

Pos logit contributions
Aud+0.020
 brace+0.019
 fet+0.018
 wrists+0.017
 sixteen+0.017
Neg logit contributions
 adult-0.024
 worry-0.024
 explanation-0.022
urious-0.021
 son-0.021
Token the
Token ID262
Feature activation+1.58e-02

Pos logit contributions
 adult+0.014
 worry+0.014
 explanation+0.013
 son+0.013
urious+0.012
Neg logit contributions
Aud-0.010
 brace-0.010
 fet-0.009
 wrists-0.009
ip-0.009
Token bathroom
Token ID12436
Feature activation+3.28e-02

Pos logit contributions
 adult+0.022
 worry+0.022
 explanation+0.019
urious+0.019
 son+0.019
Neg logit contributions
Aud-0.032
 brace-0.031
 fet-0.030
 wrists-0.029
 grapes-0.028
Token and
Token ID290
Feature activation-1.58e-02

Pos logit contributions
 adult+0.053
 worry+0.052
 explanation+0.048
 son+0.047
urious+0.046
Neg logit contributions
Aud-0.059
 brace-0.057
 fet-0.054
 wrists-0.053
 sixteen-0.051
Token cleaned
Token ID20750
Feature activation-2.21e-02

Pos logit contributions
Aud+0.028
 brace+0.027
 fet+0.025
 wrists+0.025
 sixteen+0.024
Neg logit contributions
 adult-0.027
 worry-0.026
 explanation-0.024
 son-0.023
urious-0.023
Token her
Token ID607
Feature activation-6.49e-03

Pos logit contributions
Aud+0.045
br+0.040
 seventeen+0.039
 sixteen+0.038
 scored+0.038
Neg logit contributions
 adult-0.034
 worry-0.032
 son-0.031
ie-0.030
 side-0.030
Token wound
Token ID11699
Feature activation+1.58e-02

Pos logit contributions
Aud+0.014
 brace+0.013
 fet+0.012
 wrists+0.012
 sixteen+0.012
Neg logit contributions
 adult-0.009
 worry-0.009
 explanation-0.008
 son-0.008
urious-0.008
Token.
Token ID13
Feature activation-1.81e-02

Pos logit contributions
 worry+0.017
 adult+0.017
 explanation+0.015
 son+0.014
iness+0.014
Neg logit contributions
Aud-0.039
 sixteen-0.034
 brace-0.034
 fet-0.034
 pro-0.034
Token She
Token ID1375
Feature activation-3.20e-02

Pos logit contributions
Aud+0.027
 brace+0.026
 fet+0.024
 wrists+0.024
 sixteen+0.023
Neg logit contributions
 adult-0.035
 worry-0.035
 explanation-0.032
urious-0.032
 son-0.032
Token put
Token ID1234
Feature activation-3.61e-02

Pos logit contributions
Aud+0.055
 brace+0.053
 fet+0.050
 wrists+0.049
 sixteen+0.048
Neg logit contributions
 adult-0.055
 worry-0.053
 explanation-0.049
urious-0.048
 son-0.048
Token She
Token ID1375
Feature activation-2.55e-02

Pos logit contributions
 brace+0.008
 wrists+0.007
Aud+0.007
 fet+0.007
 pro+0.007
Neg logit contributions
 worry-0.005
 adult-0.005
 son-0.004
 God-0.004
 distress-0.004
Token tries
Token ID8404
Feature activation-3.47e-03

Pos logit contributions
Aud+0.051
 brace+0.048
 sixteen+0.045
 wrists+0.045
 pro+0.045
Neg logit contributions
 adult-0.037
 worry-0.036
urious-0.031
 son-0.031
 explanation-0.031
Token to
Token ID284
Feature activation-3.55e-02

Pos logit contributions
Aud+0.006
 brace+0.006
 wrists+0.006
 icing+0.006
 sixteen+0.006
Neg logit contributions
 adult-0.006
 worry-0.006
 explanation-0.005
 son-0.005
 answer-0.005
Token pull
Token ID2834
Feature activation+3.57e-02

Pos logit contributions
Aud+0.061
 brace+0.061
 fet+0.055
 wrists+0.054
 grapes+0.051
Neg logit contributions
 adult-0.059
 zone-0.056
urious-0.056
iness-0.055
 worry-0.055
Token the
Token ID262
Feature activation+2.10e-02

Pos logit contributions
 adult+0.065
 worry+0.063
 son+0.056
 end+0.056
 side+0.054
Neg logit contributions
Aud-0.060
 brace-0.057
 wrists-0.056
 fet-0.054
 sprink-0.052
Token small
Token ID1402
Feature activation+5.69e-02

Pos logit contributions
 adult+0.037
 worry+0.036
 son+0.034
 explanation+0.033
 zone+0.033
Neg logit contributions
Aud-0.036
 fet-0.033
 brace-0.033
 sixteen-0.032
 pro-0.031
Token dog
Token ID3290
Feature activation+4.12e-02

Pos logit contributions
 adult+0.090
 worry+0.088
 explanation+0.079
 son+0.079
 zone+0.078
Neg logit contributions
Aud-0.107
 brace-0.102
 fet-0.098
 wrists-0.095
 sixteen-0.093
Token away
Token ID1497
Feature activation-1.84e-02

Pos logit contributions
 worry+0.068
 adult+0.066
 explanation+0.059
 son+0.058
 zone+0.057
Neg logit contributions
Aud-0.076
 brace-0.071
 wrists-0.069
 sprink-0.067
 sixteen-0.066
Token,
Token ID11
Feature activation+1.34e-02

Pos logit contributions
Aud+0.039
 brace+0.038
 fet+0.036
 wrists+0.036
 sixteen+0.035
Neg logit contributions
 adult-0.025
 worry-0.024
 explanation-0.021
 son-0.021
 zone-0.020
Token but
Token ID475
Feature activation+3.46e-02

Pos logit contributions
 worry+0.013
 adult+0.012
 son+0.010
 explanation+0.010
 actions+0.008
Neg logit contributions
Aud-0.034
 brace-0.034
 wrists-0.032
ip-0.032
 fet-0.031
Token the
Token ID262
Feature activation-4.50e-02

Pos logit contributions
 worry+0.061
 adult+0.056
 end+0.053
 son+0.050
 why+0.049
Neg logit contributions
Aud-0.061
 brace-0.061
 wrists-0.059
 fet-0.058
 sixteen-0.057
Token.
Token ID13
Feature activation-2.76e-02

Pos logit contributions
 adult+0.088
 worry+0.086
 son+0.081
 explanation+0.076
 zone+0.070
Neg logit contributions
Aud-0.132
 brace-0.122
 wrists-0.117
 fet-0.116
 sixteen-0.113
Token
Token ID198
Feature activation-2.55e-02

Pos logit contributions
 brace+0.065
 wrists+0.064
Aud+0.064
 icing+0.060
 grapes+0.058
Neg logit contributions
 adult-0.029
 son-0.025
 worry-0.025
 God-0.024
 voice-0.022
Token
Token ID198
Feature activation-2.26e-02

Pos logit contributions
 brace+0.049
Aud+0.049
 wrists+0.048
 fet+0.045
 grapes+0.045
Neg logit contributions
 adult-0.039
 worry-0.035
urious-0.032
 explanation-0.032
 son-0.032
TokenThe
Token ID464
Feature activation+1.12e-03

Pos logit contributions
 brace+0.044
Aud+0.042
 wrists+0.042
 grapes+0.040
 fet+0.040
Neg logit contributions
 adult-0.034
 worry-0.032
urious-0.029
 son-0.028
 explanation-0.028
Token girl
Token ID2576
Feature activation+7.54e-03

Pos logit contributions
 adult+0.002
 worry+0.001
 explanation+0.001
 son+0.001
urious+0.001
Neg logit contributions
Aud-0.002
 brace-0.002
 fet-0.002
 wrists-0.002
 sixteen-0.002
Token looked
Token ID3114
Feature activation+2.27e-02

Pos logit contributions
 worry+0.010
 adult+0.009
 son+0.009
urious+0.008
 explanation+0.008
Neg logit contributions
Aud-0.016
 brace-0.015
 sixteen-0.015
 grapes-0.014
 fet-0.014
Token up
Token ID510
Feature activation-6.52e-02

Pos logit contributions
 worry+0.036
 adult+0.035
 son+0.033
urious+0.032
 explanation+0.032
Neg logit contributions
Aud-0.042
 brace-0.040
 fet-0.037
 wrists-0.037
 sixteen-0.037
Token and
Token ID290
Feature activation+4.53e-03

Pos logit contributions
Aud+0.132
 brace+0.124
 fet+0.119
 wrists+0.117
 sixteen+0.111
Neg logit contributions
 worry-0.095
 adult-0.095
 son-0.086
urious-0.084
 explanation-0.083
Token blinked
Token ID40360
Feature activation+1.89e-02

Pos logit contributions
 worry+0.005
 son+0.004
 adult+0.004
urious+0.004
 explanation+0.003
Neg logit contributions
Aud-0.011
 brace-0.010
 fet-0.010
ip-0.010
br-0.010
Token.
Token ID13
Feature activation-1.28e-02

Pos logit contributions
 worry+0.027
 adult+0.025
 son+0.024
urious+0.023
 zone+0.023
Neg logit contributions
Aud-0.039
 fet-0.036
 brace-0.036
 sixteen-0.034
 grapes-0.034
Token She
Token ID1375
Feature activation-3.04e-02

Pos logit contributions
 brace+0.025
 fet+0.024
Aud+0.024
 wrists+0.024
ip+0.023
Neg logit contributions
 adult-0.017
 God-0.016
 son-0.016
 worry-0.015
 explanation-0.014
Token put
Token ID1234
Feature activation-1.66e-02

Pos logit contributions
 adult+0.042
 worry+0.041
 son+0.038
 explanation+0.037
urious+0.035
Neg logit contributions
Aud-0.058
 brace-0.056
 wrists-0.052
 fet-0.051
 sixteen-0.050
Token the
Token ID262
Feature activation+1.41e-02

Pos logit contributions
Aud+0.031
 brace+0.029
 fet+0.028
 pro+0.027
 wrists+0.027
Neg logit contributions
 adult-0.024
 worry-0.024
 son-0.024
 nurses-0.023
pa-0.021
Token coffee
Token ID6891
Feature activation-4.47e-02

Pos logit contributions
 adult+0.024
 worry+0.024
 son+0.022
 explanation+0.022
 zone+0.020
Neg logit contributions
Aud-0.025
 brace-0.024
 fet-0.022
 wrists-0.022
 sixteen-0.021
Token into
Token ID656
Feature activation-1.93e-02

Pos logit contributions
Aud+0.085
 brace+0.078
 sixteen+0.069
 fet+0.068
seven+0.068
Neg logit contributions
 adult-0.070
 zone-0.068
 worry-0.066
 son-0.066
iness-0.066
Token an
Token ID281
Feature activation+1.03e-01

Pos logit contributions
Aud+0.041
 fet+0.036
ip+0.036
 brace+0.034
 scored+0.034
Neg logit contributions
 worry-0.030
 adult-0.029
 son-0.028
 explanation-0.025
 child-0.025
Token old
Token ID1468
Feature activation+6.97e-02

Pos logit contributions
 worry+0.167
 adult+0.164
 zone+0.154
urious+0.151
iness+0.149
Neg logit contributions
 brace-0.170
Aud-0.169
 grapes-0.161
 wrists-0.160
 sixteen-0.159
Token cup
Token ID6508
Feature activation-7.44e-02

Pos logit contributions
 worry+0.106
 adult+0.101
 son+0.093
 explanation+0.090
 zone+0.088
Neg logit contributions
Aud-0.145
 brace-0.131
 fet-0.126
 sixteen-0.126
ip-0.123
Token and
Token ID290
Feature activation+1.40e-02

Pos logit contributions
Aud+0.174
 brace+0.168
 wrists+0.158
 fet+0.153
Pur+0.148
Neg logit contributions
 adult-0.087
 zone-0.079
iness-0.079
 worry-0.074
 eye-0.073
Token handed
Token ID10158
Feature activation-3.45e-02

Pos logit contributions
 worry+0.019
 son+0.018
 adult+0.018
 nurses+0.018
 why+0.016
Neg logit contributions
Aud-0.030
 brace-0.028
 sixteen-0.025
br-0.025
 wrists-0.025
Token it
Token ID340
Feature activation+2.15e-02

Pos logit contributions
Aud+0.065
 brace+0.062
 wrists+0.058
br+0.058
 fet+0.057
Neg logit contributions
 adult-0.057
 son-0.054
 worry-0.052
 child-0.049
 end-0.048
Token to
Token ID284
Feature activation+2.04e-02

Pos logit contributions
 adult+0.030
 worry+0.029
 permission+0.027
 son+0.027
 explanation+0.026
Neg logit contributions
Aud-0.044
 brace-0.042
 pro-0.039
 wrists-0.039
 sixteen-0.039
Token Grand
Token ID5675
Feature activation+2.63e-03

Pos logit contributions
 adult+0.023
 worry+0.023
 son+0.021
 explanation+0.020
 zone+0.020
Neg logit contributions
Aud-0.024
 brace-0.023
 fet-0.021
 wrists-0.021
 sixteen-0.019
Tokenpa
Token ID8957
Feature activation+8.72e-02

Pos logit contributions
 worry+0.003
 adult+0.003
rated+0.003
 zone+0.003
 outsider+0.003
Neg logit contributions
 sprink-0.005
Aud-0.005
 brace-0.005
 icing-0.005
ip-0.005
Token showed
Token ID3751
Feature activation+4.09e-02

Pos logit contributions
 adult+0.126
 worry+0.122
 son+0.110
 explanation+0.108
urious+0.106
Neg logit contributions
Aud-0.175
 brace-0.170
 wrists-0.159
 fet-0.159
 grapes-0.153
Token him
Token ID683
Feature activation+1.09e-01

Pos logit contributions
 adult+0.066
 worry+0.061
 son+0.060
 explanation+0.056
urious+0.054
Neg logit contributions
Aud-0.079
 brace-0.076
 fet-0.073
 wrists-0.072
 sixteen-0.069
Token all
Token ID477
Feature activation+1.01e-01

Pos logit contributions
 adult+0.185
 worry+0.179
 son+0.166
 explanation+0.164
urious+0.159
Neg logit contributions
Aud-0.196
 brace-0.187
 fet-0.174
 wrists-0.173
 sixteen-0.169
Token the
Token ID262
Feature activation+3.95e-02

Pos logit contributions
 adult+0.170
 worry+0.166
 explanation+0.152
urious+0.149
 son+0.148
Neg logit contributions
Aud-0.176
 brace-0.171
 fet-0.162
 wrists-0.158
 sixteen-0.152
Token exciting
Token ID7895
Feature activation+7.06e-02

Pos logit contributions
 adult+0.069
 worry+0.068
iness+0.064
urious+0.063
 explanation+0.063
Neg logit contributions
 brace-0.064
Aud-0.061
 fet-0.057
 grapes-0.057
 sprink-0.056
Token things
Token ID1243
Feature activation+8.80e-02

Pos logit contributions
 adult+0.104
 worry+0.103
 explanation+0.091
urious+0.090
 son+0.088
Neg logit contributions
 brace-0.136
Aud-0.133
 fet-0.126
 wrists-0.124
 grapes-0.122
Token in
Token ID287
Feature activation-1.70e-02

Pos logit contributions
 adult+0.140
 worry+0.139
 explanation+0.125
 son+0.124
 zone+0.123
Neg logit contributions
Aud-0.162
 brace-0.154
 fet-0.146
 wrists-0.145
 sixteen-0.141
Token the
Token ID262
Feature activation+2.84e-02

Pos logit contributions
Aud+0.023
 brace+0.022
 fet+0.020
 wrists+0.020
 sixteen+0.019
Neg logit contributions
 adult-0.035
 worry-0.034
 explanation-0.032
urious-0.032
 son-0.031
Token museum
Token ID13257
Feature activation+5.09e-02

Pos logit contributions
 adult+0.043
 worry+0.043
iness+0.040
 explanation+0.039
rated+0.038
Neg logit contributions
 brace-0.050
Aud-0.050
 fet-0.048
 sprink-0.047
 sixteen-0.047
Token,
Token ID11
Feature activation-1.16e-02

Pos logit contributions
Aud+0.097
 sixteen+0.090
 fet+0.089
 brace+0.088
 som+0.087
Neg logit contributions
 worry-0.056
 son-0.044
 adult-0.042
 zone-0.042
iness-0.040
Token but
Token ID475
Feature activation+3.26e-02

Pos logit contributions
Aud+0.025
 fet+0.024
 sixteen+0.023
 brace+0.023
 wrists+0.023
Neg logit contributions
 worry-0.016
 adult-0.015
 son-0.014
 explanation-0.013
 actions-0.012
Token you
Token ID345
Feature activation-1.03e-02

Pos logit contributions
 worry+0.054
 adult+0.053
 explanation+0.047
 son+0.047
 actions+0.046
Neg logit contributions
Aud-0.062
 fet-0.057
 brace-0.055
 sixteen-0.055
 wrists-0.054
Token can
Token ID460
Feature activation-4.62e-02

Pos logit contributions
Aud+0.021
 brace+0.020
 wrists+0.019
 fet+0.018
 grapes+0.018
Neg logit contributions
 worry-0.016
 adult-0.015
 explanation-0.013
 son-0.013
 outsider-0.013
Token't
Token ID470
Feature activation+4.58e-02

Pos logit contributions
Aud+0.092
 brace+0.086
 fet+0.084
 grapes+0.081
 wrists+0.081
Neg logit contributions
 worry-0.070
 adult-0.068
 explanation-0.060
 outsider-0.058
 end-0.056
Token marry
Token ID12479
Feature activation-1.09e-01

Pos logit contributions
 worry+0.066
 adult+0.065
 explanation+0.057
 outsider+0.055
 son+0.055
Neg logit contributions
Aud-0.094
 brace-0.089
 fet-0.086
 wrists-0.083
 sixteen-0.082
Token her
Token ID607
Feature activation-2.11e-02

Pos logit contributions
Aud+0.216
 fet+0.210
 sixteen+0.200
ip+0.195
 sprink+0.193
Neg logit contributions
 worry-0.170
 adult-0.151
 zone-0.144
 God-0.141
 son-0.140
Token."
Token ID526
Feature activation-7.30e-02

Pos logit contributions
Aud+0.046
 fet+0.042
 sixteen+0.040
 brace+0.040
br+0.038
Neg logit contributions
 worry-0.031
 adult-0.029
 son-0.029
 zone-0.026
 actions-0.025
Token
Token ID198
Feature activation-3.24e-02

Pos logit contributions
Aud+0.133
 brace+0.128
 fet+0.120
 wrists+0.117
 sixteen+0.117
Neg logit contributions
 adult-0.119
 worry-0.119
 son-0.108
 explanation-0.105
urious-0.104
Token
Token ID198
Feature activation-2.80e-02

Pos logit contributions
Aud+0.057
 brace+0.056
 fet+0.052
 wrists+0.052
 sixteen+0.050
Neg logit contributions
 adult-0.054
 worry-0.052
 explanation-0.048
urious-0.048
 son-0.047
TokenAnna
Token ID31160
Feature activation-7.33e-02

Pos logit contributions
 brace+0.048
Aud+0.048
 fet+0.046
 wrists+0.045
 sixteen+0.044
Neg logit contributions
 adult-0.047
 worry-0.045
urious-0.042
 explanation-0.040
 son-0.040
Token."
Token ID526
Feature activation-7.88e-02

Pos logit contributions
Aud+0.086
 brace+0.083
 wrists+0.080
 fet+0.075
ip+0.073
Neg logit contributions
 adult-0.053
 worry-0.052
 zone-0.047
 son-0.046
 end-0.045
Token
Token ID198
Feature activation-4.17e-02

Pos logit contributions
Aud+0.140
 brace+0.136
 wrists+0.128
 fet+0.126
 sixteen+0.123
Neg logit contributions
 adult-0.130
 worry-0.128
 explanation-0.116
 son-0.115
urious-0.112
Token
Token ID198
Feature activation-3.64e-02

Pos logit contributions
Aud+0.068
 brace+0.066
 fet+0.062
 wrists+0.061
 sixteen+0.059
Neg logit contributions
 adult-0.075
 worry-0.073
 explanation-0.067
urious-0.066
 son-0.066
TokenTom
Token ID13787
Feature activation+1.09e-03

Pos logit contributions
Aud+0.057
 brace+0.056
 fet+0.052
 wrists+0.052
 sixteen+0.050
Neg logit contributions
 adult-0.067
 worry-0.066
 explanation-0.060
urious-0.060
 son-0.059
Token and
Token ID290
Feature activation-4.91e-02

Pos logit contributions
 adult+0.002
 worry+0.002
 son+0.001
urious+0.001
 explanation+0.001
Neg logit contributions
Aud-0.002
 brace-0.002
 fet-0.002
 wrists-0.002
 sixteen-0.002
Token Lily
Token ID20037
Feature activation-7.42e-02

Pos logit contributions
Aud+0.067
 brace+0.063
 fet+0.059
 wrists+0.058
 sixteen+0.058
Neg logit contributions
 adult-0.103
 worry-0.101
 son-0.094
 explanation-0.093
 zone-0.089
Token said
Token ID531
Feature activation-4.67e-03

Pos logit contributions
Aud+0.112
 brace+0.110
 fet+0.107
 wrists+0.103
 sprink+0.097
Neg logit contributions
 adult-0.144
 worry-0.133
 explanation-0.127
 zone-0.125
urious-0.124
Token,
Token ID11
Feature activation-3.57e-02

Pos logit contributions
Aud+0.010
 brace+0.009
 fet+0.009
 wrists+0.009
 sixteen+0.008
Neg logit contributions
 worry-0.007
 adult-0.006
 explanation-0.006
 son-0.006
iness-0.005
Token "
Token ID366
Feature activation-6.74e-02

Pos logit contributions
 brace+0.071
Aud+0.070
 wrists+0.062
bons+0.062
pill+0.061
Neg logit contributions
 son-0.049
 worry-0.048
 adult-0.046
 explanation-0.044
 God-0.043
TokenOK
Token ID11380
Feature activation+3.69e-02

Pos logit contributions
Aud+0.122
 brace+0.119
 fet+0.112
 wrists+0.109
 sixteen+0.107
Neg logit contributions
 adult-0.107
 worry-0.107
 explanation-0.096
urious-0.095
 son-0.094
Token,
Token ID11
Feature activation-1.76e-02

Pos logit contributions
 adult+0.061
 worry+0.059
 explanation+0.055
urious+0.054
 zone+0.053
Neg logit contributions
Aud-0.065
 brace-0.063
 fet-0.060
 wrists-0.059
 sixteen-0.057
Token section
Token ID2665
Feature activation-3.47e-02

Pos logit contributions
Aud+0.128
 brace+0.113
 sixteen+0.109
 wrists+0.107
 fet+0.107
Neg logit contributions
 adult-0.096
 worry-0.095
 son-0.085
 explanation-0.084
 zone-0.080
Token,
Token ID11
Feature activation+3.30e-03

Pos logit contributions
Aud+0.072
 brace+0.067
 wrists+0.063
 fet+0.063
 sixteen+0.062
Neg logit contributions
 adult-0.050
 worry-0.049
 explanation-0.044
 son-0.044
 zone-0.043
Token but
Token ID475
Feature activation+3.03e-02

Pos logit contributions
 worry+0.004
 adult+0.004
 son+0.004
 explanation+0.004
 why+0.003
Neg logit contributions
Aud-0.008
br-0.007
 brace-0.007
ip-0.007
 sixteen-0.007
Token she
Token ID673
Feature activation-8.97e-02

Pos logit contributions
 worry+0.053
 adult+0.050
 son+0.046
 why+0.043
 end+0.043
Neg logit contributions
Aud-0.061
 fet-0.054
 brace-0.052
br-0.052
 sixteen-0.051
Token still
Token ID991
Feature activation-2.65e-02

Pos logit contributions
Aud+0.217
 brace+0.204
 sixteen+0.193
 wrists+0.191
 fet+0.191
Neg logit contributions
 worry-0.106
 adult-0.104
 son-0.090
urious-0.086
 explanation-0.082
Token couldn
Token ID3521
Feature activation-5.31e-02

Pos logit contributions
Aud+0.065
 brace+0.061
br+0.057
 fet+0.057
 grapes+0.057
Neg logit contributions
 worry-0.031
 adult-0.028
 son-0.026
urious-0.024
 explanation-0.023
Token't
Token ID470
Feature activation+3.78e-02

Pos logit contributions
Aud+0.101
 wrists+0.100
 brace+0.097
 fet+0.093
 grapes+0.087
Neg logit contributions
 adult-0.085
 worry-0.080
 explanation-0.070
 end-0.069
 situation-0.067
Token find
Token ID1064
Feature activation+5.01e-02

Pos logit contributions
 adult+0.052
urious+0.049
 zone+0.049
iness+0.047
 worry+0.047
Neg logit contributions
 brace-0.072
 fet-0.069
Aud-0.067
 wrists-0.066
 grape-0.066
Token the
Token ID262
Feature activation+2.21e-03

Pos logit contributions
 adult+0.083
 worry+0.080
 son+0.073
 why+0.072
 permission+0.071
Neg logit contributions
Aud-0.100
 fet-0.087
 brace-0.082
 wrists-0.081
 scored-0.081
Token caul
Token ID46087
Feature activation-7.90e-02

Pos logit contributions
 adult+0.004
 worry+0.004
 explanation+0.003
 son+0.003
urious+0.003
Neg logit contributions
Aud-0.004
 brace-0.004
 fet-0.004
 wrists-0.004
 sixteen-0.004
Tokeniflower
Token ID42642
Feature activation-4.57e-02

Pos logit contributions
Aud+0.130
 brace+0.120
 fet+0.114
 wrists+0.112
 sixteen+0.108
Neg logit contributions
 adult-0.145
 worry-0.142
urious-0.131
 explanation-0.130
 son-0.126
Token.
Token ID13
Feature activation+2.90e-02

Pos logit contributions
 worry+0.062
 adult+0.057
 zone+0.054
iness+0.053
 son+0.052
Neg logit contributions
Aud-0.121
 brace-0.111
 fet-0.107
 wrists-0.106
 sixteen-0.105
Token She
Token ID1375
Feature activation-4.84e-03

Pos logit contributions
 adult+0.055
 worry+0.053
 explanation+0.049
 son+0.049
urious+0.049
Neg logit contributions
Aud-0.045
 brace-0.044
 fet-0.041
 wrists-0.040
 grapes-0.038
Token was
Token ID373
Feature activation+3.56e-02

Pos logit contributions
Aud+0.011
 brace+0.010
 sixteen+0.010
 grapes+0.010
 wrists+0.010
Neg logit contributions
 worry-0.005
 adult-0.005
 son-0.005
urious-0.004
 actions-0.004
Token running
Token ID2491
Feature activation+3.82e-03

Pos logit contributions
 worry+0.048
 adult+0.043
 son+0.038
 guardian+0.038
 explanation+0.036
Neg logit contributions
 brace-0.076
Aud-0.074
con-0.074
 wrists-0.072
 fet-0.072
Token around
Token ID1088
Feature activation+2.24e-02

Pos logit contributions
 worry+0.004
 adult+0.004
 side+0.004
 end+0.004
 option+0.004
Neg logit contributions
Aud-0.008
 brace-0.008
 icing-0.008
 wrists-0.008
 sprink-0.008
Token on
Token ID319
Feature activation-6.45e-02

Pos logit contributions
 worry+0.028
 adult+0.027
 side+0.024
 son+0.023
 end+0.023
Neg logit contributions
Aud-0.052
 brace-0.048
ip-0.047
 wrists-0.046
 sixteen-0.046
Token the
Token ID262
Feature activation+5.06e-02

Pos logit contributions
Aud+0.125
ip+0.122
 fet+0.120
 sixteen+0.113
br+0.112
Neg logit contributions
 worry-0.100
 adult-0.098
 side-0.087
 son-0.084
 distress-0.083
Token lush
Token ID37408
Feature activation-4.52e-03

Pos logit contributions
 adult+0.093
 worry+0.086
 explanation+0.082
 son+0.082
 eye+0.081
Neg logit contributions
Aud-0.082
 fet-0.076
 brace-0.075
 pro-0.072
ip-0.072
Token grass
Token ID8701
Feature activation+1.05e-02

Pos logit contributions
Aud+0.009
 fet+0.008
 pro+0.008
 brace+0.007
ip+0.007
Neg logit contributions
 adult-0.007
 island-0.007
 ig-0.006
 worry-0.006
urious-0.006
Token,
Token ID11
Feature activation+4.72e-02

Pos logit contributions
 adult+0.013
 worry+0.013
 zone+0.012
 explanation+0.012
 son+0.012
Neg logit contributions
Aud-0.023
 brace-0.022
 fet-0.021
 wrists-0.020
 sixteen-0.020
Token enjoying
Token ID13226
Feature activation-3.59e-02

Pos logit contributions
 worry+0.065
 adult+0.061
 explanation+0.056
 son+0.055
 option+0.053
Neg logit contributions
Aud-0.094
 brace-0.090
ip-0.086
 fet-0.085
 wrists-0.085
Tokenutter
Token ID10381
Feature activation-1.18e-02

Pos logit contributions
Aud+0.205
 brace+0.198
 fet+0.187
 wrists+0.185
 sixteen+0.182
Neg logit contributions
 adult-0.154
 worry-0.152
urious-0.137
 explanation-0.133
 zone-0.132
Token.
Token ID13
Feature activation-3.53e-03

Pos logit contributions
Aud+0.024
 brace+0.023
 wrists+0.022
 fet+0.022
 sixteen+0.021
Neg logit contributions
 adult-0.017
 worry-0.016
 explanation-0.015
 son-0.014
urious-0.014
Token
Token ID220
Feature activation-4.37e-02

Pos logit contributions
Aud+0.006
 brace+0.006
 icing+0.005
 wrists+0.005
 sixteen+0.005
Neg logit contributions
 adult-0.006
 worry-0.006
 son-0.005
 explanation-0.005
 polar-0.005
Token
Token ID198
Feature activation-7.54e-03

Pos logit contributions
Aud+0.082
 brace+0.080
 wrists+0.077
 fet+0.073
 grapes+0.072
Neg logit contributions
 adult-0.067
 worry-0.064
 explanation-0.059
 son-0.058
urious-0.058
Token
Token ID198
Feature activation-2.65e-03

Pos logit contributions
Aud+0.014
 brace+0.014
 wrists+0.013
 sixteen+0.012
 grapes+0.012
Neg logit contributions
 adult-0.012
 worry-0.011
 explanation-0.010
urious-0.010
 son-0.010
TokenTweet
Token ID47845
Feature activation-5.08e-02

Pos logit contributions
Aud+0.005
 brace+0.004
 wrists+0.004
 icing+0.004
 sixteen+0.004
Neg logit contributions
 adult-0.004
 worry-0.004
urious-0.004
 explanation-0.004
row-0.004
Token was
Token ID373
Feature activation+1.96e-02

Pos logit contributions
Aud+0.105
 brace+0.097
 grapes+0.094
 icing+0.093
 sixteen+0.093
Neg logit contributions
 adult-0.068
 worry-0.068
urious-0.063
 son-0.060
 explanation-0.058
Token excited
Token ID6568
Feature activation+4.24e-02

Pos logit contributions
 worry+0.025
 adult+0.021
 son+0.019
 explanation+0.019
 distress+0.019
Neg logit contributions
Aud-0.043
ip-0.042
 brace-0.042
 fet-0.041
 wrists-0.041
Token to
Token ID284
Feature activation-9.21e-03

Pos logit contributions
 adult+0.087
 worry+0.083
urious+0.078
 explanation+0.078
 zone+0.077
Neg logit contributions
 brace-0.058
Aud-0.056
 fet-0.052
 grapes-0.051
 grape-0.049
Token play
Token ID711
Feature activation-2.02e-02

Pos logit contributions
Aud+0.021
ip+0.020
 brace+0.020
 wrists+0.019
 sprink+0.019
Neg logit contributions
 adult-0.012
 worry-0.012
 answer-0.010
 explanation-0.010
 figure-0.010
Token a
Token ID257
Feature activation+1.87e-02

Pos logit contributions
Aud+0.038
 wrists+0.036
ip+0.036
 sprink+0.035
 brace+0.034
Neg logit contributions
 worry-0.030
 adult-0.029
 explanation-0.028
 end-0.028
 figure-0.027
Token"
Token ID1
Feature activation-3.99e-02

Pos logit contributions
 brace+0.049
Aud+0.049
 wrists+0.046
 fet+0.045
 grapes+0.045
Neg logit contributions
 adult-0.043
 worry-0.041
urious-0.037
 explanation-0.036
 son-0.036
TokenR
Token ID49
Feature activation-1.60e-02

Pos logit contributions
Aud+0.073
 brace+0.071
 fet+0.067
 wrists+0.066
 sixteen+0.064
Neg logit contributions
 adult-0.063
 worry-0.062
 explanation-0.057
 son-0.055
urious-0.054
Tokenib
Token ID571
Feature activation-3.36e-02

Pos logit contributions
Aud+0.028
 brace+0.027
 fet+0.026
 wrists+0.025
 grapes+0.025
Neg logit contributions
 adult-0.027
 worry-0.026
 explanation-0.024
 son-0.023
urious-0.023
Tokenbit
Token ID2545
Feature activation-1.89e-02

Pos logit contributions
Aud+0.055
 brace+0.047
 pro+0.046
 wrists+0.046
 fet+0.045
Neg logit contributions
 worry-0.062
 adult-0.061
iness-0.056
an-0.055
 eye-0.055
Token,
Token ID11
Feature activation+1.71e-02

Pos logit contributions
Aud+0.043
 wrists+0.040
 brace+0.039
 fet+0.038
 icing+0.037
Neg logit contributions
 worry-0.024
 adult-0.022
 son-0.020
 explanation-0.019
iness-0.019
Token rib
Token ID12183
Feature activation-2.30e-01

Pos logit contributions
 adult+0.029
 worry+0.029
 son+0.026
 actions+0.026
 explanation+0.025
Neg logit contributions
Aud-0.030
 brace-0.026
 pro-0.025
 sixteen-0.025
 beautifully-0.025
Tokenbit
Token ID2545
Feature activation-2.83e-02

Pos logit contributions
 pro+0.392
Aud+0.386
 brace+0.353
 fet+0.352
ip+0.340
Neg logit contributions
 adult-0.357
 worry-0.330
pa-0.324
rr-0.322
an-0.322
Token,"
Token ID553
Feature activation-1.11e-01

Pos logit contributions
Aud+0.064
 wrists+0.058
 brace+0.057
 fet+0.057
 icing+0.055
Neg logit contributions
 worry-0.036
 adult-0.035
 son-0.031
iness-0.029
 explanation-0.028
Token the
Token ID262
Feature activation-8.43e-02

Pos logit contributions
Aud+0.187
 brace+0.164
 wrists+0.157
 fet+0.156
 sixteen+0.152
Neg logit contributions
 adult-0.204
 worry-0.202
 son-0.198
 God-0.193
 sailor-0.184
Token frog
Token ID21264
Feature activation-8.49e-02

Pos logit contributions
Aud+0.184
 brace+0.168
 fet+0.163
 sixteen+0.160
 wrists+0.159
Neg logit contributions
 adult-0.116
 worry-0.112
 explanation-0.098
 son-0.097
urious-0.091
Token agrees
Token ID14386
Feature activation-5.89e-02

Pos logit contributions
Aud+0.184
 brace+0.169
 wrists+0.165
 sixteen+0.161
 grapes+0.160
Neg logit contributions
 adult-0.104
 worry-0.103
 explanation-0.093
urious-0.092
 son-0.091
Token tail
Token ID7894
Feature activation-7.85e-02

Pos logit contributions
Aud+0.031
 fet+0.028
 brace+0.027
ip+0.026
 wrists+0.026
Neg logit contributions
 worry-0.023
 adult-0.023
 son-0.020
 explanation-0.020
 eye-0.019
Token which
Token ID543
Feature activation-1.27e-02

Pos logit contributions
Aud+0.157
 brace+0.147
 wrists+0.141
 fet+0.135
 sixteen+0.134
Neg logit contributions
 adult-0.112
 worry-0.112
 zone-0.103
 eye-0.098
 son-0.098
Token was
Token ID373
Feature activation+2.20e-02

Pos logit contributions
Aud+0.029
 brace+0.027
 sixteen+0.027
 wrists+0.027
 fet+0.026
Neg logit contributions
 worry-0.015
 adult-0.015
 zone-0.013
iness-0.012
 son-0.012
Token decorated
Token ID24789
Feature activation-4.88e-02

Pos logit contributions
 worry+0.031
 adult+0.030
 son+0.026
 distress+0.025
 polar+0.025
Neg logit contributions
Aud-0.046
ip-0.042
 brace-0.042
 fet-0.041
 wrists-0.041
Token with
Token ID351
Feature activation-3.48e-03

Pos logit contributions
Aud+0.103
 fet+0.092
 brace+0.090
 wrists+0.090
 sixteen+0.090
Neg logit contributions
 zone-0.066
 worry-0.065
 adult-0.065
 son-0.060
iness-0.059
Token rib
Token ID12183
Feature activation-1.85e-01

Pos logit contributions
Aud+0.007
ip+0.006
 fet+0.006
 sixteen+0.006
 wrists+0.006
Neg logit contributions
 worry-0.006
 adult-0.006
 son-0.005
 ig-0.005
 polar-0.005
Tokenbons
Token ID23461
Feature activation-5.78e-02

Pos logit contributions
 pro+0.203
Aud+0.198
 fet+0.182
 scored+0.174
 wrists+0.169
Neg logit contributions
 adult-0.384
urious-0.370
pa-0.369
 worry-0.356
 eye-0.352
Token.
Token ID13
Feature activation+9.75e-03

Pos logit contributions
Aud+0.128
 brace+0.113
 wrists+0.112
 fet+0.111
 grapes+0.109
Neg logit contributions
 adult-0.069
 worry-0.068
iness-0.067
 zone-0.066
 son-0.064
Token It
Token ID632
Feature activation+6.68e-03

Pos logit contributions
 adult+0.016
 worry+0.016
 explanation+0.015
 son+0.014
urious+0.014
Neg logit contributions
Aud-0.017
 brace-0.017
 fet-0.015
 wrists-0.015
 sixteen-0.015
Token was
Token ID373
Feature activation+1.51e-02

Pos logit contributions
 worry+0.008
 adult+0.007
 son+0.006
 outsider+0.006
urious+0.006
Neg logit contributions
Aud-0.015
 brace-0.015
 wrists-0.014
 grapes-0.014
 sixteen-0.014
Token the
Token ID262
Feature activation-6.08e-03

Pos logit contributions
 worry+0.024
 adult+0.023
 son+0.021
 explanation+0.020
 ig+0.020
Neg logit contributions
Aud-0.030
 brace-0.027
 fet-0.027
ip-0.027
 wrists-0.027
Token And
Token ID843
Feature activation-4.94e-02

Pos logit contributions
 brace+0.016
Aud+0.016
 fet+0.015
 wrists+0.015
 pro+0.014
Neg logit contributions
 worry-0.014
 adult-0.014
 zone-0.013
 son-0.013
 explanation-0.013
Token I
Token ID314
Feature activation-3.96e-02

Pos logit contributions
Aud+0.098
 brace+0.091
 fet+0.088
 wrists+0.087
 sixteen+0.087
Neg logit contributions
 worry-0.077
 adult-0.073
 son-0.069
 explanation-0.066
 God-0.063
Token love
Token ID1842
Feature activation-3.07e-02

Pos logit contributions
Aud+0.075
 brace+0.073
 fet+0.069
 wrists+0.068
 sixteen+0.065
Neg logit contributions
 adult-0.061
 worry-0.060
 explanation-0.055
 son-0.053
urious-0.053
Token milk
Token ID7545
Feature activation-1.21e-01

Pos logit contributions
Aud+0.068
br+0.061
 sixteen+0.061
 pro+0.060
 AW+0.059
Neg logit contributions
 adult-0.042
 worry-0.041
 God-0.039
 son-0.037
pa-0.036
Token and
Token ID290
Feature activation-1.63e-02

Pos logit contributions
Aud+0.270
 brace+0.247
 wrists+0.238
 fet+0.237
 sixteen+0.235
Neg logit contributions
 worry-0.158
 adult-0.155
 son-0.144
 end-0.136
iness-0.136
Token cookies
Token ID14746
Feature activation-1.71e-01

Pos logit contributions
Aud+0.028
 brace+0.024
 sixteen+0.024
 fet+0.023
ip+0.023
Neg logit contributions
 worry-0.032
 adult-0.029
 actions-0.028
 son-0.027
 nurses-0.027
Token.
Token ID13
Feature activation-1.29e-02

Pos logit contributions
Aud+0.414
 brace+0.387
 wrists+0.375
 fet+0.368
 som+0.360
Neg logit contributions
 end-0.166
 adult-0.162
 worry-0.160
guard-0.151
 zone-0.148
Token And
Token ID843
Feature activation-4.94e-02

Pos logit contributions
 brace+0.022
Aud+0.022
 wrists+0.021
 fet+0.021
 pro+0.019
Neg logit contributions
 adult-0.021
 worry-0.021
 son-0.019
 zone-0.019
 explanation-0.019
Token I
Token ID314
Feature activation-4.14e-02

Pos logit contributions
Aud+0.101
 brace+0.093
 fet+0.089
 wrists+0.088
 sixteen+0.087
Neg logit contributions
 worry-0.076
 adult-0.069
 son-0.067
 explanation-0.064
 God-0.063
Token love
Token ID1842
Feature activation-3.02e-02

Pos logit contributions
Aud+0.079
 brace+0.078
 wrists+0.073
 fet+0.072
 grapes+0.070
Neg logit contributions
 worry-0.063
 adult-0.063
 explanation-0.056
 son-0.055
 zone-0.054
Token you
Token ID345
Feature activation+1.57e-02

Pos logit contributions
Aud+0.065
br+0.057
 AW+0.056
 pro+0.056
 sixteen+0.055
Neg logit contributions
 adult-0.044
 worry-0.043
 God-0.043
 son-0.039
 nurses-0.038
Token me
Token ID502
Feature activation+3.90e-02

Pos logit contributions
 adult+0.040
 worry+0.037
 son+0.033
 explanation+0.033
urious+0.031
Neg logit contributions
Aud-0.051
 brace-0.050
 fet-0.048
 pro-0.047
ip-0.046
Token!"
Token ID2474
Feature activation-6.10e-02

Pos logit contributions
 adult+0.065
 worry+0.061
 zone+0.056
 explanation+0.056
 son+0.054
Neg logit contributions
Aud-0.073
 brace-0.072
 wrists-0.067
 grapes-0.065
 sprink-0.064
Token
Token ID198
Feature activation-9.08e-03

Pos logit contributions
 brace+0.114
Aud+0.114
 wrists+0.107
 grapes+0.104
 sixteen+0.103
Neg logit contributions
 worry-0.090
 adult-0.088
 explanation-0.083
 son-0.083
 God-0.081
Token
Token ID198
Feature activation-5.17e-03

Pos logit contributions
 brace+0.017
Aud+0.017
 wrists+0.016
 grapes+0.016
 fet+0.016
Neg logit contributions
 adult-0.013
 worry-0.013
 explanation-0.012
urious-0.012
 son-0.012
TokenThe
Token ID464
Feature activation+1.04e-03

Pos logit contributions
 brace+0.009
Aud+0.009
 wrists+0.008
ip+0.008
 fet+0.008
Neg logit contributions
 adult-0.009
 worry-0.008
urious-0.008
 explanation-0.008
 zone-0.007
Token l
Token ID300
Feature activation-1.69e-01

Pos logit contributions
 adult+0.003
 worry+0.003
 explanation+0.002
 son+0.002
urious+0.002
Neg logit contributions
Aud-0.001
 brace-0.001
 fet-0.001
 wrists-0.001
 sixteen-0.001
Tokenibrarian
Token ID35808
Feature activation-4.47e-02

Pos logit contributions
Aud+0.375
 brace+0.372
 wrists+0.372
 fet+0.356
 sixteen+0.347
Neg logit contributions
 adult-0.208
iness-0.160
row-0.153
 zone-0.152
 explanation-0.150
Token then
Token ID788
Feature activation-4.54e-03

Pos logit contributions
Aud+0.081
 brace+0.078
 wrists+0.073
 fet+0.071
 sixteen+0.071
Neg logit contributions
 adult-0.074
 worry-0.071
 son-0.065
 explanation-0.065
urious-0.063
Token took
Token ID1718
Feature activation-1.77e-02

Pos logit contributions
Aud+0.009
 brace+0.008
 wrists+0.008
 sixteen+0.008
 fet+0.008
Neg logit contributions
 adult-0.007
 worry-0.007
 son-0.006
 explanation-0.006
urious-0.006
Token Anne
Token ID15397
Feature activation+2.26e-02

Pos logit contributions
 brace+0.032
 wrists+0.032
Aud+0.031
ip+0.030
 fet+0.030
Neg logit contributions
 adult-0.027
 worry-0.026
 God-0.026
 son-0.025
 explanation-0.025
Token to
Token ID284
Feature activation+1.17e-02

Pos logit contributions
 adult+0.039
 worry+0.039
 explanation+0.035
 son+0.035
urious+0.034
Neg logit contributions
Aud-0.038
 brace-0.037
 wrists-0.034
 fet-0.034
 sixteen-0.034
Token.
Token ID13
Feature activation-4.46e-03

Pos logit contributions
Aud+0.152
 wrists+0.135
 brace+0.134
 fet+0.130
 sixteen+0.130
Neg logit contributions
 worry-0.076
 adult-0.071
iness-0.067
 zone-0.065
 son-0.062
Token I
Token ID314
Feature activation-1.23e-02

Pos logit contributions
 brace+0.008
Aud+0.008
 wrists+0.007
 fet+0.007
 pro+0.007
Neg logit contributions
 worry-0.007
 adult-0.007
 zone-0.007
 son-0.007
 God-0.006
Token will
Token ID481
Feature activation-2.82e-02

Pos logit contributions
Aud+0.026
 brace+0.026
 wrists+0.024
 fet+0.024
 grapes+0.023
Neg logit contributions
 worry-0.016
 adult-0.016
 explanation-0.014
 son-0.014
urious-0.014
Token bite
Token ID13197
Feature activation-2.17e-02

Pos logit contributions
Aud+0.067
ip+0.061
 grapes+0.061
 brace+0.060
 fet+0.060
Neg logit contributions
 worry-0.038
 adult-0.032
 end-0.028
 answer-0.028
 explanation-0.026
Token a
Token ID257
Feature activation-2.35e-02

Pos logit contributions
Aud+0.051
 fet+0.048
 wrists+0.047
 brace+0.046
 sixteen+0.045
Neg logit contributions
 worry-0.024
 end-0.023
 adult-0.022
ie-0.021
 side-0.020
Token cookie
Token ID19751
Feature activation-1.71e-01

Pos logit contributions
Aud+0.050
 wrists+0.043
 sixteen+0.042
 fet+0.041
 brace+0.041
Neg logit contributions
 worry-0.038
 adult-0.035
 end-0.032
 son-0.031
 child-0.030
Token instead
Token ID2427
Feature activation+1.30e-02

Pos logit contributions
Aud+0.416
 brace+0.372
 wrists+0.356
Pur+0.356
 fet+0.349
Neg logit contributions
 end-0.184
 worry-0.178
 adult-0.175
 zone-0.167
guard-0.164
Token."
Token ID526
Feature activation-7.31e-02

Pos logit contributions
 worry+0.018
 adult+0.018
iness+0.017
 zone+0.016
 explanation+0.016
Neg logit contributions
Aud-0.027
 brace-0.026
 fet-0.025
 wrists-0.025
 pro-0.023
Token
Token ID198
Feature activation-3.21e-02

Pos logit contributions
 brace+0.160
Aud+0.159
 wrists+0.152
 sixteen+0.149
 fet+0.147
Neg logit contributions
 worry-0.079
 adult-0.078
 son-0.073
 God-0.072
 explanation-0.069
Token
Token ID198
Feature activation-2.80e-02

Pos logit contributions
Aud+0.060
 brace+0.060
 wrists+0.057
 fet+0.056
 pro+0.055
Neg logit contributions
 adult-0.048
 worry-0.046
 son-0.042
 explanation-0.042
urious-0.041
TokenShe
Token ID3347
Feature activation-2.64e-02

Pos logit contributions
 brace+0.049
Aud+0.048
 fet+0.046
 wrists+0.045
 sixteen+0.045
Neg logit contributions
 adult-0.046
 worry-0.044
urious-0.041
 explanation-0.039
 son-0.039