TOP ACTIVATIONS
MAX = 14.050

" It 's your fault , you crashed into me
many songs , but her favorite song is " Happy Birthday
knew it was her own fault . The girl
. Once he found his favorite toy , he was ready
listen . Suddenly , her favorite toy bunny came to life
different animals , but her favorite was the tiger . She
. She put on her favourite dress and went straight to
Sarah . She had a favourite brush , which she liked
the bird was enjoying her favourite food of pasta when she
. They played all his favorite songs and sang as loud
" It 's your own fault , you should 've been
microscope ! It was my favorite thing ! How could you
?" Tim my chose his favorite flavor and the driver gave
, Lily accidentally tore her favorite toy . She was very
. She was wearing her favorite clothes , a pink dress
. " What 's your favorite color ?" Lily thought for
happened . Tim my 's favorite toy fell into the mill
around the kitchen to its favourite tunes . One
jumping . But , her favourite exercise was dancing !
room and put on her favourite dress . It was so

MIN ACTIVATIONS
MIN = -11.325

The three - year - old thought for a
The 3 - year old smiled and said ,
uff led the three - year - old 's hair .
, a little 3 - year - old girl found the
nie was a three - year - old with bright eyes
So , the 3 - year - old grabbed their brother
there was a three - year - old boy called Tommy
at her excited three - year - old daughter and continued
world , the 3 - year - old boy hugged his
Suddenly , their three - year - old son came running
She was a three - year - old and she was
beach with their three - year - old daughter . They
bow and the 3 - year - old followed him ,
so a brave 3 - year - old girl named Lisa
The 3 - year - old be amed and
happy that the three - year - old had changed their
ally was a 3 - year - old girl who had
to find a 3 - year old boy smiling up at
that even a three - year - old can deliver something
mother took her 3 - year - old daughter to the

INTERVAL 7.706 - 14.050
CONTAINS 0.059%

of kids playing with a football . Tim my wanted to
loved to play with his football . His football was big
so excited to eat her favorite food , spaghetti and meat
different snacks , but their favorite was the beans .
, Tom ! It 's mine !" Lily said . She

INTERVAL 1.363 - 7.706
CONTAINS 21.570%

having a nap in the sun when a little girl came
car . It is not fair ." Tom starts
From then on , Tim my proudly drank his juice
jump and do bur pees every day . One
tall they were taller than her . Y ol anda was

INTERVAL -4.981 - 1.363
CONTAINS 77.297%

She asked her mom if she could climb the ladder .
them having fun . She laughed and cl apped with joy
Their moms came and saw the bag of nuts .
m op realised he had made a big mistake and was
The lion was surprised and said to the rabbit ,

INTERVAL -11.325 - -4.981
CONTAINS 1.074%

called out “ Sam ! Hold on
she made a very strange face . The soup was bitter
One day , a big noise disturbed Bob 's regular life
and cracks . Tom felt sad and scared .
tent , they heard a loud noise . Lily felt frightened
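
The four INTERVAL ranges above look like equal-width bins spanning MIN to MAX. The sketch below reproduces the listed boundaries under that assumption and checks that the CONTAINS percentages add up to 100%. The helper name `interval_edges` and the four-bin choice are assumptions for illustration; they are not taken from the tool that produced this dump.

```python
# Sketch only: assumes the dashboard splits [MIN, MAX] into four equal-width
# intervals. `interval_edges` is a hypothetical helper, not part of any tool.
def interval_edges(lo, hi, n_bins=4):
    """Return the n_bins + 1 boundaries of equal-width bins over [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# MIN and MAX as listed in the dump above.
edges = interval_edges(-11.325, 14.050)

# "CONTAINS" percentages, in the order the intervals are listed above.
shares = [0.059, 21.570, 77.297, 1.074]
total_share = sum(shares)

# Print each interval to three decimals for comparison with the listing.
for left, right in zip(edges, edges[1:]):
    print(f"{left:.3f} to {right:.3f}")
print(f"total share: {total_share:.3f}%")
```

Each bin is (14.050 - (-11.325)) / 4 = 6.34375 wide, so the highest bin, the one holding the "favorite" and "fault" examples, covers only 0.059% of sampled tokens.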
Token\n
Token ID198
Feature activation+1.90e+00

Pos logit contributions
 fault+0.008
Mine+0.007
 pleasure+0.007
 favorites+0.007
 flavors+0.007
Neg logit contributions
 called-0.004
 worried-0.003
 scared-0.003
 shouted-0.003
 shocked-0.003
Token"
Token ID1
Feature activation+2.88e+00

Pos logit contributions
 fault+0.107
Mine+0.100
 favorites+0.096
 pleasure+0.096
 accessory+0.093
Neg logit contributions
 called-0.066
 shouted-0.057
 scared-0.057
 worried-0.056
 sad-0.053
TokenIt
Token ID1026
Feature activation+1.82e+00

Pos logit contributions
 fault+0.155
 pleasure+0.129
 favorites+0.128
eter+0.118
 findings+0.118
Neg logit contributions
 called-0.122
 shouted-0.114
 scared-0.105
 worried-0.102
 yelled-0.101
Token's
Token ID338
Feature activation+2.41e+00

Pos logit contributions
 fault+0.105
Mine+0.101
 pleasure+0.095
 favorites+0.094
 accessory+0.091
Neg logit contributions
 called-0.055
 worried-0.048
 shouted-0.047
 scared-0.047
 scientists-0.044
Token your
Token ID534
Feature activation+2.50e+00

Pos logit contributions
 fault+0.127
Mine+0.123
 pleasure+0.117
 favorites+0.114
oof+0.112
Neg logit contributions
 called-0.084
 scared-0.076
 worried-0.076
 shouted-0.075
 sad-0.070
Token fault
Token ID8046
Feature activation+1.18e+01

Pos logit contributions
Mine+0.111
rene+0.102
rust+0.100
 fault+0.099
oof+0.094
Neg logit contributions
 called-0.110
 scared-0.102
 worried-0.100
 sad-0.099
 so-0.096
Token,
Token ID11
Feature activation-9.33e-01

Pos logit contributions
 fault+0.742
 pleasure+0.655
Mine+0.646
 accessory+0.631
 favorites+0.626
Neg logit contributions
 called-0.310
 worried-0.275
 scared-0.269
irens-0.262
 scientists-0.256
Token you
Token ID345
Feature activation-1.71e+00

Pos logit contributions
 called+0.034
 scared+0.031
 worried+0.031
 shouted+0.030
 sad+0.029
Neg logit contributions
 fault-0.048
Mine-0.047
 favorites-0.044
 pleasure-0.044
 accessory-0.042
Token crashed
Token ID14997
Feature activation-1.52e+00

Pos logit contributions
 called+0.065
 causing+0.063
irens+0.061
 scientists+0.060
 men+0.059
Neg logit contributions
 fault-0.080
Mine-0.074
 pleasure-0.073
 favorites-0.071
oof-0.067
Token into
Token ID656
Feature activation-3.62e-01

Pos logit contributions
 called+0.040
 worried+0.035
 scared+0.035
 shouted+0.033
 sad+0.031
Neg logit contributions
 fault-0.094
Mine-0.090
 pleasure-0.083
 favorites-0.083
oof-0.082
Token me
Token ID502
Feature activation+6.66e-01

Pos logit contributions
 called+0.010
 worried+0.009
 scared+0.009
 shouted+0.009
 confused+0.008
Neg logit contributions
 fault-0.021
Mine-0.021
 pleasure-0.019
 favorites-0.019
 accessory-0.019
Token many
Token ID867
Feature activation-1.88e+00

Pos logit contributions
 called+0.059
 scared+0.054
 worried+0.053
 shouted+0.052
 sad+0.050
Neg logit contributions
 fault-0.066
Mine-0.065
 pleasure-0.060
 favorites-0.060
 accessory-0.058
Token songs
Token ID7259
Feature activation+3.46e+00

Pos logit contributions
 called+0.076
 scared+0.069
 worried+0.068
 shouted+0.067
 sad+0.064
Neg logit contributions
 fault-0.091
Mine-0.089
 pleasure-0.083
 favorites-0.081
oof-0.080
Token,
Token ID11
Feature activation-1.21e-01

Pos logit contributions
 fault+0.218
Mine+0.213
 pleasure+0.202
 favorites+0.202
 accessory+0.197
Neg logit contributions
 called-0.076
 worried-0.070
 scared-0.069
 shouted-0.063
 sad-0.061
Token but
Token ID475
Feature activation+2.62e-01

Pos logit contributions
 called+0.004
 scared+0.003
 worried+0.003
 shouted+0.003
 sad+0.003
Neg logit contributions
 fault-0.007
Mine-0.007
 favorites-0.006
 pleasure-0.006
 accessory-0.006
Token her
Token ID607
Feature activation+2.03e+00

Pos logit contributions
 fault+0.016
Mine+0.016
 pleasure+0.015
 favorites+0.015
 accessory+0.015
Neg logit contributions
 called-0.007
 scared-0.006
 worried-0.006
 shouted-0.006
 sad-0.005
Token favorite
Token ID4004
Feature activation+1.15e+01

Pos logit contributions
Mine+0.105
 fault+0.102
rust+0.095
rene+0.094
oof+0.093
Neg logit contributions
 called-0.077
 scared-0.071
 worried-0.069
 sad-0.067
 confused-0.064
Token song
Token ID3496
Feature activation+5.60e-01

Pos logit contributions
 fault+0.589
Mine+0.565
 pleasure+0.531
rust+0.529
rene+0.527
Neg logit contributions
 called-0.458
 scared-0.400
 worried-0.384
 noise-0.380
 shouted-0.379
Token is
Token ID318
Feature activation+2.76e+00

Pos logit contributions
 fault+0.030
Mine+0.029
 pleasure+0.027
 favorites+0.027
 accessory+0.027
Neg logit contributions
 called-0.018
 worried-0.017
 scared-0.016
 shouted-0.016
 confused-0.015
Token "
Token ID366
Feature activation-8.03e-02

Pos logit contributions
Mine+0.173
 fault+0.171
 favorites+0.160
 pleasure+0.160
rust+0.156
Neg logit contributions
 called-0.076
 scared-0.064
 worried-0.061
 shouted-0.057
 sad-0.057
TokenHappy
Token ID25082
Feature activation+3.50e+00

Pos logit contributions
 called+0.003
 scared+0.003
 shouted+0.003
 worried+0.003
 sad+0.003
Neg logit contributions
 fault-0.004
Mine-0.004
 pleasure-0.003
 favorites-0.003
 accessory-0.003
Token Birthday
Token ID33511
Feature activation+6.22e+00

Pos logit contributions
 fault+0.109
Mine+0.105
 pleasure+0.092
 favorites+0.091
 accessory+0.086
Neg logit contributions
 called-0.200
 scared-0.185
 worried-0.184
 shouted-0.182
 sad-0.177
Token knew
Token ID2993
Feature activation-3.84e+00

Pos logit contributions
 called+0.026
 worried+0.024
 scared+0.024
 scientists+0.022
 causing+0.022
Neg logit contributions
 fault-0.045
Mine-0.044
 favorites-0.042
 pleasure-0.042
 accessory-0.040
Token it
Token ID340
Feature activation+1.30e-02

Pos logit contributions
 called+0.089
 scared+0.081
 worried+0.080
 shouted+0.077
 shocked+0.073
Neg logit contributions
 fault-0.238
Mine-0.221
 pleasure-0.220
 favorites-0.212
 accessory-0.209
Token was
Token ID373
Feature activation+2.43e+00

Pos logit contributions
 fault+0.001
Mine+0.001
 pleasure+0.001
 favorites+0.001
oof+0.001
Neg logit contributions
 called-0.000
 scared-0.000
 worried-0.000
 shouted-0.000
 sad-0.000
Token her
Token ID607
Feature activation+1.67e+00

Pos logit contributions
Mine+0.132
 fault+0.131
 favorites+0.122
 pleasure+0.120
 accessory+0.118
Neg logit contributions
 called-0.084
 scared-0.076
 worried-0.074
 shouted-0.070
 sad-0.070
Token own
Token ID898
Feature activation+3.65e+00

Pos logit contributions
Mine+0.080
 fault+0.073
rust+0.071
rene+0.071
oof+0.070
Neg logit contributions
 called-0.070
 scared-0.065
 worried-0.065
 sad-0.063
 so-0.062
Token fault
Token ID8046
Feature activation+1.14e+01

Pos logit contributions
Mine+0.174
 fault+0.167
rust+0.157
 favorites+0.156
oof+0.155
Neg logit contributions
 called-0.148
 scared-0.136
 worried-0.134
 sad-0.130
 shouted-0.126
Token.
Token ID13
Feature activation-8.08e-02

Pos logit contributions
 fault+0.697
 pleasure+0.620
Mine+0.615
 accessory+0.606
 favorites+0.597
Neg logit contributions
 called-0.303
 worried-0.280
 scared-0.275
 shouted-0.268
 shocked-0.261
Token\n
Token ID198
Feature activation+1.33e-01

Pos logit contributions
 called+0.002
 scared+0.002
 worried+0.002
 shouted+0.002
 sad+0.002
Neg logit contributions
 fault-0.005
Mine-0.005
 favorites-0.004
 pleasure-0.004
 accessory-0.004
Token\n
Token ID198
Feature activation+1.25e-01

Pos logit contributions
 fault+0.008
Mine+0.008
 favorites+0.008
 pleasure+0.008
 accessory+0.008
Neg logit contributions
 called-0.004
 shouted-0.003
 scared-0.003
 worried-0.003
 sad-0.003
TokenThe
Token ID464
Feature activation-1.53e+00

Pos logit contributions
 fault+0.008
 favorites+0.007
 pleasure+0.007
Mine+0.007
 personalities+0.006
Neg logit contributions
 called-0.004
 shouted-0.004
 scared-0.003
 worried-0.003
 so-0.003
Token girl
Token ID2576
Feature activation-2.50e+00

Pos logit contributions
 called+0.058
 scared+0.053
 worried+0.053
 shouted+0.051
 sad+0.050
Neg logit contributions
 fault-0.076
Mine-0.075
 favorites-0.069
 pleasure-0.069
 accessory-0.067
Token.
Token ID13
Feature activation+1.94e+00

Pos logit contributions
 fault+0.031
Mine+0.031
 pleasure+0.030
 favorites+0.029
 accessory+0.029
Neg logit contributions
 called-0.016
 worried-0.015
 scared-0.014
 scientists-0.013
 sad-0.013
Token Once
Token ID4874
Feature activation+4.62e+00

Pos logit contributions
 fault+0.107
Mine+0.106
 pleasure+0.099
 favorites+0.099
 accessory+0.096
Neg logit contributions
 called-0.062
 scared-0.055
 worried-0.055
 shouted-0.053
 sad-0.051
Token he
Token ID339
Feature activation+1.41e-01

Pos logit contributions
 fault+0.274
Mine+0.268
 pleasure+0.249
 favorites+0.246
 friendship+0.238
Neg logit contributions
 called-0.122
 worried-0.105
 scared-0.104
 shouted-0.099
 causing-0.096
Token found
Token ID1043
Feature activation+1.16e-01

Pos logit contributions
 fault+0.008
Mine+0.008
 pleasure+0.008
 favorites+0.008
 accessory+0.008
Neg logit contributions
 called-0.004
 scared-0.004
 worried-0.004
 shouted-0.004
 sad-0.003
Token his
Token ID465
Feature activation+3.73e+00

Pos logit contributions
 fault+0.007
Mine+0.007
 pleasure+0.007
 favorites+0.007
 accessory+0.007
Neg logit contributions
 called-0.003
 scared-0.003
 worried-0.002
 shouted-0.002
 sad-0.002
Token favorite
Token ID4004
Feature activation+1.14e+01

Pos logit contributions
Mine+0.174
rene+0.165
 fault+0.161
rust+0.154
oof+0.151
Neg logit contributions
 called-0.162
 scared-0.148
 sad-0.147
 worried-0.145
 so-0.142
Token toy
Token ID13373
Feature activation+6.71e+00

Pos logit contributions
 fault+0.574
rene+0.550
Mine+0.534
rust+0.510
 pleasure+0.491
Neg logit contributions
 called-0.474
 so-0.416
 scared-0.409
 noise-0.401
 sad-0.400
Token,
Token ID11
Feature activation-2.92e-01

Pos logit contributions
 fault+0.477
Mine+0.472
 pleasure+0.448
 favorites+0.447
 accessory+0.437
Neg logit contributions
 called-0.111
 scared-0.087
 worried-0.085
 shouted-0.079
 sad-0.071
Token he
Token ID339
Feature activation-6.13e-01

Pos logit contributions
 called+0.007
 scared+0.006
 worried+0.006
 shouted+0.006
 sad+0.006
Neg logit contributions
 fault-0.018
Mine-0.018
 pleasure-0.017
 favorites-0.017
 accessory-0.016
Token was
Token ID373
Feature activation+2.67e+00

Pos logit contributions
 called+0.019
 scared+0.017
 worried+0.017
 shouted+0.016
 sad+0.016
Neg logit contributions
 fault-0.034
Mine-0.034
 favorites-0.032
 pleasure-0.032
 accessory-0.031
Token ready
Token ID3492
Feature activation+6.12e-01

Pos logit contributions
 fault+0.169
Mine+0.162
 pleasure+0.159
 favorites+0.156
 accessory+0.152
Neg logit contributions
 called-0.057
 scared-0.048
 shouted-0.048
 worried-0.048
 scientists-0.047
Token listen
Token ID6004
Feature activation+1.10e+00

Pos logit contributions
 fault+0.047
Mine+0.044
 favorites+0.043
 pleasure+0.043
 accessory+0.042
Neg logit contributions
 called-0.018
 shouted-0.016
 scared-0.016
 worried-0.016
 shocked-0.014
Token.
Token ID13
Feature activation+6.15e-01

Pos logit contributions
 fault+0.064
Mine+0.059
 pleasure+0.057
 favorites+0.056
 friendship+0.055
Neg logit contributions
 called-0.032
 worried-0.029
 scared-0.029
 shouted-0.028
 shocked-0.027
Token Suddenly
Token ID24975
Feature activation+1.72e-01

Pos logit contributions
 fault+0.035
Mine+0.034
 favorites+0.032
 pleasure+0.032
 accessory+0.031
Neg logit contributions
 called-0.020
 scared-0.017
 worried-0.017
 shouted-0.017
 sad-0.016
Token,
Token ID11
Feature activation+3.97e-02

Pos logit contributions
 fault+0.010
Mine+0.010
 pleasure+0.009
 friendship+0.009
 favorites+0.009
Neg logit contributions
 called-0.004
 worried-0.004
 scared-0.004
 shouted-0.004
 shocked-0.004
Token her
Token ID607
Feature activation+2.37e+00

Pos logit contributions
Mine+0.002
 fault+0.002
 pleasure+0.002
 favorites+0.002
 accessory+0.002
Neg logit contributions
 called-0.001
 worried-0.001
 scared-0.001
 shouted-0.001
 shocked-0.001
Token favorite
Token ID4004
Feature activation+1.14e+01

Pos logit contributions
 fault+0.118
Mine+0.118
 pleasure+0.109
 favorites+0.108
 accessory+0.105
Neg logit contributions
 called-0.089
 scared-0.081
 worried-0.080
 shouted-0.078
 sad-0.075
Token toy
Token ID13373
Feature activation+6.62e+00

Pos logit contributions
 fault+0.540
Mine+0.522
 pleasure+0.481
 favorites+0.480
rust+0.475
Neg logit contributions
 called-0.477
 scared-0.433
 worried-0.421
 shouted-0.412
 sad-0.407
Token bunny
Token ID44915
Feature activation+3.26e+00

Pos logit contributions
 fault+0.356
Mine+0.348
 favorites+0.322
 pleasure+0.321
 accessory+0.315
Neg logit contributions
 called-0.219
 worried-0.200
 scared-0.199
 shouted-0.192
 sad-0.184
Token came
Token ID1625
Feature activation-3.12e+00

Pos logit contributions
 fault+0.191
Mine+0.188
 pleasure+0.177
 favorites+0.176
oof+0.171
Neg logit contributions
 called-0.099
 scared-0.088
 worried-0.085
 shouted-0.083
 sad-0.078
Token to
Token ID284
Feature activation+3.33e-01

Pos logit contributions
 called+0.075
 scared+0.064
 worried+0.062
 shouted+0.059
 sad+0.056
Neg logit contributions
 fault-0.201
Mine-0.199
 favorites-0.189
 pleasure-0.189
 accessory-0.182
Token life
Token ID1204
Feature activation+1.10e+00

Pos logit contributions
 fault+0.020
Mine+0.019
 favorites+0.019
 pleasure+0.018
 accessory+0.018
Neg logit contributions
 called-0.010
 scared-0.009
 worried-0.009
 shouted-0.008
 noise-0.008
Token different
Token ID1180
Feature activation+3.68e+00

Pos logit contributions
 fault+0.040
Mine+0.039
 pleasure+0.037
 favorites+0.036
 accessory+0.035
Neg logit contributions
 called-0.030
 scared-0.027
 worried-0.027
 shouted-0.026
 shocked-0.025
Token animals
Token ID4695
Feature activation+8.67e-01

Pos logit contributions
 fault+0.174
Mine+0.171
 pleasure+0.158
 favorites+0.158
oof+0.152
Neg logit contributions
 called-0.148
 scared-0.134
 worried-0.133
 shouted-0.130
 sad-0.126
Token,
Token ID11
Feature activation+7.09e-02

Pos logit contributions
 fault+0.052
Mine+0.050
 pleasure+0.048
 favorites+0.048
 accessory+0.046
Neg logit contributions
 called-0.022
 worried-0.020
 scared-0.020
 shouted-0.019
 sad-0.018
Token but
Token ID475
Feature activation+2.73e-01

Pos logit contributions
 fault+0.004
Mine+0.004
 pleasure+0.004
 favorites+0.004
 accessory+0.004
Neg logit contributions
 called-0.002
 worried-0.002
 scared-0.002
 shouted-0.002
 shocked-0.002
Token her
Token ID607
Feature activation+2.07e+00

Pos logit contributions
 fault+0.017
Mine+0.016
 pleasure+0.015
 favorites+0.015
 accessory+0.015
Neg logit contributions
 called-0.007
 scared-0.006
 worried-0.006
 shouted-0.006
 sad-0.006
Token favorite
Token ID4004
Feature activation+1.14e+01

Pos logit contributions
Mine+0.108
 fault+0.105
oof+0.095
rene+0.095
 accessory+0.095
Neg logit contributions
 called-0.077
 scared-0.070
 worried-0.069
 sad-0.066
 shouted-0.065
Token was
Token ID373
Feature activation+2.36e+00

Pos logit contributions
 fault+0.596
Mine+0.577
 pleasure+0.534
rene+0.532
rust+0.528
Neg logit contributions
 called-0.435
 scared-0.379
 worried-0.365
 sad-0.357
 shouted-0.355
Token the
Token ID262
Feature activation-1.35e+00

Pos logit contributions
 fault+0.144
Mine+0.142
 pleasure+0.134
 favorites+0.133
 accessory+0.130
Neg logit contributions
 called-0.063
 scared-0.055
 worried-0.054
 shouted-0.052
 sad-0.049
Token tiger
Token ID26241
Feature activation+5.12e+00

Pos logit contributions
 called+0.053
 scared+0.049
 worried+0.047
 sad+0.046
 so+0.044
Neg logit contributions
Mine-0.067
 fault-0.066
 favorites-0.060
 pleasure-0.060
rene-0.060
Token.
Token ID13
Feature activation+4.45e-01

Pos logit contributions
 fault+0.323
Mine+0.315
 pleasure+0.301
 favorites+0.300
 accessory+0.295
Neg logit contributions
 called-0.114
 worried-0.099
 scared-0.098
 shouted-0.096
 confused-0.087
Token She
Token ID1375
Feature activation-3.87e-01

Pos logit contributions
 fault+0.026
Mine+0.025
 pleasure+0.024
 favorites+0.024
 accessory+0.023
Neg logit contributions
 called-0.014
 scared-0.012
 worried-0.012
 shouted-0.011
 sad-0.011
Token.
Token ID13
Feature activation+6.85e-01

Pos logit contributions
 fault+0.186
Mine+0.184
 pleasure+0.174
 favorites+0.173
 accessory+0.169
Neg logit contributions
 called-0.068
 scared-0.058
 worried-0.056
 shouted-0.054
 sad-0.050
Token She
Token ID1375
Feature activation+2.22e-01

Pos logit contributions
 fault+0.039
Mine+0.038
 favorites+0.036
 pleasure+0.036
 accessory+0.035
Neg logit contributions
 called-0.022
 scared-0.019
 worried-0.018
 shouted-0.018
 sad-0.017
Token put
Token ID1234
Feature activation+1.50e+00

Pos logit contributions
 fault+0.012
Mine+0.012
 favorites+0.011
 pleasure+0.011
 accessory+0.011
Neg logit contributions
 called-0.007
 scared-0.006
 worried-0.006
 shouted-0.006
 sad-0.006
Token on
Token ID319
Feature activation-2.43e-01

Pos logit contributions
 fault+0.095
Mine+0.094
 favorites+0.089
 pleasure+0.089
oof+0.086
Neg logit contributions
 called-0.036
 scared-0.031
 worried-0.030
 shouted-0.029
 sad-0.027
Token her
Token ID607
Feature activation+2.16e+00

Pos logit contributions
 called+0.008
 scared+0.007
 worried+0.007
 shouted+0.006
 sad+0.006
Neg logit contributions
 fault-0.014
Mine-0.014
 pleasure-0.013
 favorites-0.013
 accessory-0.013
Token favourite
Token ID12507
Feature activation+1.13e+01

Pos logit contributions
 fault+0.102
Mine+0.101
 pleasure+0.091
 favorites+0.089
rust+0.088
Neg logit contributions
 called-0.091
 scared-0.083
 worried-0.082
 shouted-0.079
 sad-0.079
Token dress
Token ID6576
Feature activation+1.61e+00

Pos logit contributions
 fault+0.623
Mine+0.545
rust+0.522
 pleasure+0.521
eter+0.515
Neg logit contributions
 called-0.464
 scared-0.388
 so-0.385
 shouted-0.382
 noise-0.380
Token and
Token ID290
Feature activation+5.42e-01

Pos logit contributions
 fault+0.098
Mine+0.094
 favorites+0.092
 accessory+0.092
 pleasure+0.090
Neg logit contributions
 called-0.037
 scared-0.037
 worried-0.036
 afraid-0.033
 shouted-0.033
Token went
Token ID1816
Feature activation-7.61e-01

Pos logit contributions
Mine+0.028
 fault+0.028
 favorites+0.027
 pleasure+0.026
 accessory+0.026
Neg logit contributions
 called-0.018
 worried-0.016
 scared-0.016
 shouted-0.015
 sad-0.015
Token straight
Token ID3892
Feature activation+2.04e+00

Pos logit contributions
 called+0.021
 scared+0.020
 worried+0.020
 shouted+0.019
 confused+0.018
Neg logit contributions
 fault-0.043
Mine-0.041
 favorites-0.040
 accessory-0.040
 pleasure-0.039
Token to
Token ID284
Feature activation+4.24e-01

Pos logit contributions
 fault+0.119
Mine+0.117
 favorites+0.111
 pleasure+0.109
 accessory+0.107
Neg logit contributions
 called-0.059
 scared-0.053
 worried-0.052
 shouted-0.049
 sad-0.047
Token Sarah
Token ID10490
Feature activation+3.31e+00

Pos logit contributions
 called+0.175
 worried+0.167
 scared+0.165
 shouted+0.159
 shocked+0.157
Neg logit contributions
 fault-0.250
Mine-0.240
 pleasure-0.233
 favorites-0.225
oof-0.218
Token.
Token ID13
Feature activation+6.76e-01

Pos logit contributions
 fault+0.206
Mine+0.202
 favorites+0.191
 pleasure+0.191
 accessory+0.186
Neg logit contributions
 called-0.082
 scared-0.072
 worried-0.071
 shouted-0.067
 sad-0.064
Token She
Token ID1375
Feature activation+2.20e-01

Pos logit contributions
 fault+0.041
Mine+0.039
 pleasure+0.037
 favorites+0.037
 accessory+0.036
Neg logit contributions
 called-0.019
 scared-0.016
 worried-0.016
 shouted-0.016
 sad-0.015
Token had
Token ID550
Feature activation+1.21e+00

Pos logit contributions
Mine+0.011
 fault+0.011
 favorites+0.011
 pleasure+0.010
 accessory+0.010
Neg logit contributions
 called-0.007
 worried-0.007
 scared-0.006
 sad-0.006
 noise-0.006
Token a
Token ID257
Feature activation+1.10e+00

Pos logit contributions
 fault+0.066
Mine+0.065
 pleasure+0.061
 favorites+0.061
 accessory+0.059
Neg logit contributions
 called-0.038
 scared-0.034
 worried-0.034
 shouted-0.033
 sad-0.032
Token favourite
Token ID12507
Feature activation+1.13e+01

Pos logit contributions
 fault+0.060
Mine+0.058
 favorites+0.054
rust+0.053
oof+0.053
Neg logit contributions
 called-0.038
 worried-0.035
 scared-0.035
 sad-0.033
 shouted-0.031
Token brush
Token ID14093
Feature activation-1.27e+00

Pos logit contributions
 fault+0.592
Mine+0.530
rene+0.527
rust+0.515
eter+0.502
Neg logit contributions
 called-0.481
 scared-0.408
 noise-0.403
 so-0.399
 sad-0.398
Token,
Token ID11
Feature activation-6.80e-02

Pos logit contributions
 scared+0.029
 worried+0.028
 called+0.028
 shouted+0.028
 shocked+0.026
Neg logit contributions
 fault-0.077
Mine-0.073
 accessory-0.072
 favorites-0.071
 pleasure-0.069
Token which
Token ID543
Feature activation+1.66e+00

Pos logit contributions
 called+0.002
 scared+0.002
 worried+0.002
 shouted+0.002
 sad+0.001
Neg logit contributions
 fault-0.004
Mine-0.004
 pleasure-0.004
 favorites-0.004
 accessory-0.004
Token she
Token ID673
Feature activation-1.93e+00

Pos logit contributions
 fault+0.094
Mine+0.093
 pleasure+0.086
 favorites+0.086
 accessory+0.084
Neg logit contributions
 called-0.049
 worried-0.044
 scared-0.044
 shouted-0.042
 sad-0.041
Token liked
Token ID8288
Feature activation+3.10e+00

Pos logit contributions
 called+0.066
 scared+0.055
 shouted+0.055
 worried+0.054
 sad+0.050
Neg logit contributions
 fault-0.113
Mine-0.108
 pleasure-0.102
oof-0.099
 favorites-0.099
Token the
Token ID262
Feature activation-1.33e-01

Pos logit contributions
 fault+0.067
Mine+0.066
 pleasure+0.064
 favorites+0.061
 accessory+0.060
Neg logit contributions
 called-0.032
 worried-0.029
irens-0.028
 scared-0.028
 shocked-0.028
Token bird
Token ID6512
Feature activation+2.63e+00

Pos logit contributions
 called+0.005
 shouted+0.004
 scared+0.004
 worried+0.004
 shocked+0.004
Neg logit contributions
 fault-0.007
Mine-0.007
 pleasure-0.006
 favorites-0.006
 accessory-0.006
Token was
Token ID373
Feature activation+2.66e+00

Pos logit contributions
 fault+0.126
Mine+0.123
 favorites+0.115
oof+0.115
 accessory+0.115
Neg logit contributions
 scientists-0.087
 called-0.082
irens-0.080
 worried-0.079
 men-0.078
Token enjoying
Token ID13226
Feature activation+2.94e+00

Pos logit contributions
 fault+0.160
 pleasure+0.148
Mine+0.148
 favorites+0.143
oof+0.141
Neg logit contributions
 called-0.064
 scientists-0.058
 shouted-0.057
 men-0.055
 machines-0.054
Token her
Token ID607
Feature activation+2.01e+00

Pos logit contributions
 fault+0.192
Mine+0.188
 favorites+0.178
 pleasure+0.175
 accessory+0.173
Neg logit contributions
 called-0.071
 scared-0.062
 worried-0.059
 shouted-0.055
 sad-0.054
Token favourite
Token ID12507
Feature activation+1.13e+01

Pos logit contributions
Mine+0.099
 fault+0.095
rust+0.085
 favorites+0.084
bet+0.082
Neg logit contributions
 called-0.084
 scared-0.078
 worried-0.075
 sad-0.075
 so-0.074
Token food
Token ID2057
Feature activation+5.78e-01

Pos logit contributions
 fault+0.629
Mine+0.569
rust+0.524
eter+0.520
rene+0.517
Neg logit contributions
 called-0.447
 noise-0.402
 so-0.396
 scared-0.390
 people-0.375
Token of
Token ID286
Feature activation-4.04e+00

Pos logit contributions
 fault+0.038
Mine+0.037
 pleasure+0.035
 favorites+0.035
 accessory+0.034
Neg logit contributions
 called-0.012
 scared-0.011
 worried-0.010
 shouted-0.010
 sad-0.009
Token pasta
Token ID26296
Feature activation+1.09e+00

Pos logit contributions
 called+0.133
 scared+0.113
 worried+0.108
 so+0.105
 shouted+0.105
Neg logit contributions
 fault-0.235
Mine-0.230
 favorites-0.214
 pleasure-0.210
rene-0.209
Token when
Token ID618
Feature activation-6.65e-01

Pos logit contributions
 fault+0.069
Mine+0.068
 pleasure+0.064
 favorites+0.064
 accessory+0.063
Neg logit contributions
 called-0.026
 scared-0.022
 worried-0.022
 shouted-0.021
 sad-0.020
Token she
Token ID673
Feature activation-2.52e+00

Pos logit contributions
 called+0.016
 worried+0.014
 scared+0.013
 shouted+0.013
 confused+0.012
Neg logit contributions
 fault-0.041
Mine-0.040
 pleasure-0.039
 favorites-0.038
 accessory-0.037
Token.
Token ID13
Feature activation+1.94e+00

Pos logit contributions
 called+0.064
 scared+0.058
 worried+0.058
 shouted+0.055
 sad+0.054
Neg logit contributions
 fault-0.097
Mine-0.095
 favorites-0.088
 pleasure-0.087
 accessory-0.085
Token They
Token ID1119
Feature activation+1.78e+00

Pos logit contributions
 fault+0.107
Mine+0.106
 pleasure+0.099
 favorites+0.099
 accessory+0.096
Neg logit contributions
 called-0.062
 scared-0.055
 worried-0.055
 shouted-0.053
 sad-0.051
Token played
Token ID2826
Feature activation+5.55e+00

Pos logit contributions
Mine+0.092
 fault+0.091
 favorites+0.083
 pleasure+0.083
 accessory+0.079
Neg logit contributions
 called-0.061
 worried-0.055
 scared-0.055
 shouted-0.052
 causing-0.051
Token all
Token ID477
Feature activation+2.35e+00

Pos logit contributions
Mine+0.304
 fault+0.303
 pleasure+0.281
 favorites+0.279
oof+0.275
Neg logit contributions
 called-0.172
 worried-0.157
 scared-0.154
 shouted-0.147
 confused-0.142
Token his
Token ID465
Feature activation+3.37e+00

Pos logit contributions
 fault+0.091
Mine+0.091
 pleasure+0.082
 favorites+0.081
oof+0.080
Neg logit contributions
 called-0.109
 worried-0.105
 scared-0.104
 shouted-0.100
 confused-0.100
Token favorite
Token ID4004
Feature activation+1.12e+01

Pos logit contributions
Mine+0.181
 fault+0.173
 pleasure+0.164
rene+0.163
eter+0.157
Neg logit contributions
 called-0.123
 scared-0.113
 sad-0.110
 worried-0.108
 shouted-0.104
Token songs
Token ID7259
Feature activation+2.55e+00

Pos logit contributions
 fault+0.620
rene+0.583
Mine+0.557
 pleasure+0.549
rust+0.529
Neg logit contributions
 called-0.431
 so-0.380
 people-0.369
 noise-0.368
 scared-0.357
Token and
Token ID290
Feature activation+3.89e-01

Pos logit contributions
 fault+0.165
Mine+0.160
 pleasure+0.153
 favorites+0.152
 accessory+0.150
Neg logit contributions
 called-0.054
 worried-0.048
 scared-0.048
 shouted-0.044
 confused-0.041
Token sang
Token ID25889
Feature activation-1.06e+00

Pos logit contributions
Mine+0.020
 fault+0.020
 favorites+0.018
 pleasure+0.018
 accessory+0.018
Neg logit contributions
 called-0.013
 worried-0.012
 scared-0.012
 shouted-0.012
 confused-0.011
Token as
Token ID355
Feature activation+3.92e+00

Pos logit contributions
 called+0.040
 scared+0.036
 worried+0.036
 shouted+0.035
 sad+0.033
Neg logit contributions
 fault-0.052
Mine-0.052
 pleasure-0.049
 favorites-0.048
 accessory-0.047
Token loud
Token ID7812
Feature activation-4.39e+00

Pos logit contributions
 fault+0.249
Mine+0.245
 pleasure+0.232
 favorites+0.230
 accessory+0.225
Neg logit contributions
 called-0.095
 scared-0.081
 worried-0.080
 shouted-0.077
 sad-0.071
Token "
Token ID366
Feature activation+5.62e-01

Pos logit contributions
 called+0.046
 scared+0.040
 worried+0.039
 shouted+0.038
 scientists+0.037
Neg logit contributions
 fault-0.111
Mine-0.106
 pleasure-0.105
 favorites-0.099
 accessory-0.098
TokenIt
Token ID1026
Feature activation+1.75e+00

Pos logit contributions
 fault+0.027
 favorites+0.023
Mine+0.022
 pleasure+0.022
 personalities+0.022
Neg logit contributions
 called-0.024
 shouted-0.023
 scared-0.022
 worried-0.021
 so-0.020
Token's
Token ID338
Feature activation+2.41e+00

Pos logit contributions
 fault+0.090
Mine+0.086
 pleasure+0.082
 favorites+0.080
 accessory+0.078
Neg logit contributions
 called-0.061
 worried-0.056
 scared-0.054
 shouted-0.054
 sad-0.051
Token your
Token ID534
Feature activation+2.46e+00

Pos logit contributions
 fault+0.122
Mine+0.117
 pleasure+0.113
 favorites+0.109
oof+0.106
Neg logit contributions
 called-0.088
 scared-0.081
 worried-0.080
 shouted-0.080
 shocked-0.075
Token own
Token ID898
Feature activation+3.60e+00

Pos logit contributions
Mine+0.104
 fault+0.094
rust+0.093
rene+0.092
 accessory+0.087
Neg logit contributions
 called-0.114
 scared-0.106
 worried-0.104
 sad-0.102
 so-0.101
Token fault
Token ID8046
Feature activation+1.12e+01

Pos logit contributions
Mine+0.167
 fault+0.166
 favorites+0.152
 pleasure+0.149
rust+0.148
Neg logit contributions
 called-0.148
 scared-0.135
 worried-0.133
 shouted-0.129
 sad-0.127
Token,
Token ID11
Feature activation-6.79e-01

Pos logit contributions
 fault+0.639
 pleasure+0.576
Mine+0.574
 favorites+0.549
 accessory+0.546
Neg logit contributions
 called-0.354
 worried-0.328
 scared-0.325
 shouted-0.314
 shocked-0.304
Token you
Token ID345
Feature activation-1.88e+00

Pos logit contributions
 called+0.024
 scared+0.021
 worried+0.021
 shouted+0.020
 sad+0.020
Neg logit contributions
 fault-0.036
Mine-0.036
 favorites-0.033
 pleasure-0.033
 accessory-0.032
Token should
Token ID815
Feature activation-1.25e-01

Pos logit contributions
 called+0.074
 causing+0.069
 worried+0.068
 shouted+0.067
irens+0.067
Neg logit contributions
 fault-0.087
Mine-0.082
 pleasure-0.080
 favorites-0.077
oof-0.073
Token've
Token ID1053
Feature activation-1.27e+00

Pos logit contributions
 called+0.005
 scared+0.004
 worried+0.004
 shouted+0.004
 sad+0.004
Neg logit contributions
 fault-0.007
Mine-0.006
 favorites-0.006
 pleasure-0.006
 accessory-0.006
Token been
Token ID587
Feature activation-1.32e+00

Pos logit contributions
 called+0.046
 scared+0.042
 shouted+0.041
 worried+0.040
 sad+0.038
Neg logit contributions
 fault-0.066
Mine-0.066
 favorites-0.066
 pleasure-0.062
oof-0.060
Token microscope
Token ID36396
Feature activation+9.42e-01

Pos logit contributions
 fault+0.114
Mine+0.111
 pleasure+0.105
 favorites+0.105
oof+0.102
Neg logit contributions
 called-0.060
 scared-0.052
 worried-0.052
 shouted-0.051
 sad-0.048
Token!
Token ID0
Feature activation+1.72e+00

Pos logit contributions
 fault+0.056
Mine+0.053
 favorites+0.051
 pleasure+0.050
 accessory+0.050
Neg logit contributions
 called-0.025
 worried-0.023
 scared-0.022
 shouted-0.021
 confused-0.020
Token It
Token ID632
Feature activation+2.33e+00

Pos logit contributions
 fault+0.093
Mine+0.092
 pleasure+0.086
 favorites+0.086
 accessory+0.083
Neg logit contributions
 called-0.056
 scared-0.051
 worried-0.050
 shouted-0.048
 sad-0.046
Token was
Token ID373
Feature activation+2.65e+00

Pos logit contributions
 fault+0.122
Mine+0.120
 favorites+0.111
 pleasure+0.110
 accessory+0.107
Neg logit contributions
 called-0.080
 worried-0.072
 scared-0.071
 shouted-0.069
 sad-0.067
Token my
Token ID616
Feature activation+5.95e+00

Pos logit contributions
 fault+0.158
Mine+0.154
 pleasure+0.147
 favorites+0.145
 accessory+0.140
Neg logit contributions
 called-0.071
 scared-0.062
 worried-0.062
 shouted-0.061
 sad-0.056
Token favorite
Token ID4004
Feature activation+1.12e+01

Pos logit contributions
Mine+0.268
rene+0.256
 fault+0.249
bet+0.240
oof+0.238
Neg logit contributions
 called-0.261
 scared-0.245
 worried-0.242
 sad-0.239
 so-0.232
Token thing
Token ID1517
Feature activation+3.66e+00

Pos logit contributions
 fault+0.545
Mine+0.528
rene+0.500
 pleasure+0.489
 favorites+0.483
Neg logit contributions
 called-0.467
 scared-0.413
 worried-0.398
 so-0.397
 noise-0.394
Token!
Token ID0
Feature activation+9.90e-01

Pos logit contributions
 fault+0.231
Mine+0.225
 pleasure+0.214
 favorites+0.211
 accessory+0.208
Neg logit contributions
 called-0.084
 worried-0.074
 scared-0.073
 shouted-0.070
 sad-0.066
Token How
Token ID1374
Feature activation+3.52e+00

Pos logit contributions
 fault+0.056
Mine+0.054
 favorites+0.051
 pleasure+0.051
 accessory+0.049
Neg logit contributions
 called-0.032
 scared-0.028
 worried-0.028
 shouted-0.027
 sad-0.026
Token could
Token ID714
Feature activation-6.42e-01

Pos logit contributions
 fault+0.198
Mine+0.195
 pleasure+0.182
 favorites+0.182
 accessory+0.176
Neg logit contributions
 called-0.111
 scared-0.098
 worried-0.097
 shouted-0.094
 sad-0.089
Token you
Token ID345
Feature activation-1.98e+00

Pos logit contributions
 called+0.022
 scared+0.020
 worried+0.020
 shouted+0.019
 sad+0.018
Neg logit contributions
 fault-0.034
Mine-0.034
 pleasure-0.031
 favorites-0.031
 accessory-0.030
Token?"
Token ID1701
Feature activation+2.10e+00

Pos logit contributions
 fault+0.299
Mine+0.297
 pleasure+0.280
 favorites+0.278
 accessory+0.273
Neg logit contributions
 called-0.140
 scared-0.121
 worried-0.118
 shouted-0.115
 sad-0.109
Token Tim
Token ID5045
Feature activation+4.06e+00

Pos logit contributions
 fault+0.110
Mine+0.109
 pleasure+0.101
 favorites+0.101
oof+0.098
Neg logit contributions
 called-0.072
 scared-0.065
 worried-0.065
 shouted-0.062
 sad-0.060
Tokenmy
Token ID1820
Feature activation+1.74e+00

Pos logit contributions
 fault+0.238
Mine+0.235
 favorites+0.220
 pleasure+0.219
 accessory+0.213
Neg logit contributions
 called-0.115
 scared-0.101
 worried-0.100
 shouted-0.095
 sad-0.091
Token chose
Token ID7690
Feature activation+4.55e+00

Pos logit contributions
Mine+0.098
 fault+0.097
 favorites+0.091
 pleasure+0.088
rust+0.088
Neg logit contributions
 called-0.047
 worried-0.042
 scientists-0.042
 scared-0.042
 causing-0.041
Token his
Token ID465
Feature activation+3.15e+00

Pos logit contributions
 fault+0.276
Mine+0.272
 pleasure+0.261
 favorites+0.256
 accessory+0.254
Neg logit contributions
 called-0.127
 scared-0.113
 worried-0.110
 shouted-0.103
 sad-0.101
Token favorite
Token ID4004
Feature activation+1.12e+01

Pos logit contributions
Mine+0.147
rene+0.142
 fault+0.138
rust+0.134
 accessory+0.130
Neg logit contributions
 called-0.133
 scared-0.122
 sad-0.121
 worried-0.119
 so-0.116
Token flavor
Token ID9565
Feature activation+1.56e+00

Pos logit contributions
 fault+0.557
rene+0.521
Mine+0.493
 former+0.489
rust+0.488
Neg logit contributions
 called-0.486
 so-0.433
 scared-0.421
 sad-0.418
 noise-0.413
Token and
Token ID290
Feature activation+4.66e-01

Pos logit contributions
 fault+0.112
Mine+0.110
 pleasure+0.106
 favorites+0.105
oof+0.102
Neg logit contributions
 called-0.028
 scared-0.021
 worried-0.020
 so-0.019
 shouted-0.018
Token the
Token ID262
Feature activation-1.17e+00

Pos logit contributions
 fault+0.031
Mine+0.030
 pleasure+0.029
 favorites+0.028
 accessory+0.028
Neg logit contributions
 called-0.012
 shouted-0.009
 scared-0.009
 worried-0.009
 sad-0.008
Token driver
Token ID4639
Feature activation+1.38e+00

Pos logit contributions
 called+0.047
 scared+0.044
 worried+0.043
 shouted+0.042
 sad+0.042
Neg logit contributions
 fault-0.055
Mine-0.054
 favorites-0.050
 pleasure-0.050
oof-0.049
Token gave
Token ID2921
Feature activation+6.02e-01

Pos logit contributions
 fault+0.080
Mine+0.079
 pleasure+0.074
 favorites+0.073
 accessory+0.071
Neg logit contributions
 called-0.042
 scared-0.037
 worried-0.036
 shouted-0.036
 sad-0.034
Token,
Token ID11
Feature activation+1.32e-01

Pos logit contributions
Mine+0.096
 fault+0.096
 pleasure+0.092
 favorites+0.090
 accessory+0.088
Neg logit contributions
 called-0.039
 worried-0.035
 scared-0.035
 shocked-0.033
 shouted-0.032
Token Lily
Token ID20037
Feature activation+2.68e+00

Pos logit contributions
 fault+0.007
Mine+0.007
 pleasure+0.007
 favorites+0.007
 accessory+0.007
Neg logit contributions
 called-0.004
 worried-0.003
 scared-0.003
 shouted-0.003
 shocked-0.003
Token accidentally
Token ID14716
Feature activation-3.56e-01

Pos logit contributions
 fault+0.140
Mine+0.132
 favorites+0.127
 accessory+0.122
omo+0.120
Neg logit contributions
 called-0.073
 causing-0.070
 noise-0.069
 scientists-0.068
irens-0.067
Token tore
Token ID26197
Feature activation+2.90e-01

Pos logit contributions
 called+0.011
 worried+0.009
 scared+0.009
 shouted+0.009
 sad+0.009
Neg logit contributions
 fault-0.020
Mine-0.020
 pleasure-0.018
 favorites-0.018
 accessory-0.018
Token her
Token ID607
Feature activation+2.27e+00

Pos logit contributions
 fault+0.017
Mine+0.017
 pleasure+0.016
 favorites+0.016
 accessory+0.016
Neg logit contributions
 called-0.008
 worried-0.007
 scared-0.007
 shouted-0.007
 sad-0.006
Token favorite
Token ID4004
Feature activation+1.12e+01

Pos logit contributions
Mine+0.112
 fault+0.110
 pleasure+0.103
 favorites+0.102
rust+0.100
Neg logit contributions
 called-0.088
 scared-0.081
 worried-0.078
 sad-0.077
 shouted-0.076
Token toy
Token ID13373
Feature activation+6.55e+00

Pos logit contributions
 fault+0.546
rene+0.515
rust+0.498
Mine+0.496
 pleasure+0.496
Neg logit contributions
 called-0.463
 noise-0.420
 scared-0.412
 so-0.408
 sad-0.401
Token.
Token ID13
Feature activation+1.88e-01

Pos logit contributions
 fault+0.397
Mine+0.390
 accessory+0.361
 favorites+0.355
 pleasure+0.352
Neg logit contributions
 called-0.165
 worried-0.154
 scared-0.151
 shouted-0.145
 afraid-0.138
Token She
Token ID1375
Feature activation-2.39e-01

Pos logit contributions
Mine+0.011
 fault+0.011
 pleasure+0.010
 favorites+0.010
oof+0.009
Neg logit contributions
 called-0.006
 scared-0.005
 worried-0.005
 shouted-0.005
 causing-0.005
Token was
Token ID373
Feature activation+2.73e+00

Pos logit contributions
 called+0.007
 scientists+0.007
 men+0.007
 worried+0.006
irens+0.006
Neg logit contributions
Mine-0.013
 fault-0.012
 favorites-0.011
 accessory-0.011
 flavors-0.011
Token very
Token ID845
Feature activation-3.33e+00

Pos logit contributions
 fault+0.176
Mine+0.162
 pleasure+0.158
asured+0.156
 favorites+0.151
Neg logit contributions
 men-0.049
plets-0.047
irens-0.047
 scientists-0.047
line-0.046
Token.
Token ID13
Feature activation+1.94e+00

Pos logit contributions
 called+0.101
 scared+0.089
 worried+0.089
 shouted+0.087
 sad+0.082
Neg logit contributions
 fault-0.178
Mine-0.175
 pleasure-0.164
 favorites-0.163
 accessory-0.159
Token She
Token ID1375
Feature activation+5.33e-02

Pos logit contributions
 fault+0.107
Mine+0.106
 pleasure+0.099
 favorites+0.099
 accessory+0.096
Neg logit contributions
 called-0.062
 scared-0.055
 worried-0.055
 shouted-0.053
 sad-0.051
Token was
Token ID373
Feature activation+3.02e+00

Pos logit contributions
 fault+0.003
Mine+0.003
 favorites+0.003
 pleasure+0.003
 accessory+0.003
Neg logit contributions
 called-0.002
 scared-0.001
 worried-0.001
 shouted-0.001
 sad-0.001
Token wearing
Token ID5762
Feature activation+2.02e+00

Pos logit contributions
 fault+0.188
Mine+0.181
 pleasure+0.179
 favorites+0.174
 accessory+0.165
Neg logit contributions
 called-0.062
 shouted-0.054
 scientists-0.052
 men-0.051
irens-0.051
Token her
Token ID607
Feature activation+1.92e+00

Pos logit contributions
 fault+0.128
Mine+0.123
 favorites+0.117
 pleasure+0.117
rust+0.115
Neg logit contributions
 called-0.052
 scared-0.046
 worried-0.044
 shouted-0.041
 sad-0.041
Token favorite
Token ID4004
Feature activation+1.12e+01

Pos logit contributions
Mine+0.083
 fault+0.080
rust+0.074
rene+0.071
 pleasure+0.070
Neg logit contributions
 called-0.092
 scared-0.084
 worried-0.083
 sad-0.082
 so-0.081
Token clothes
Token ID8242
Feature activation-6.06e-01

Pos logit contributions
 fault+0.601
Mine+0.548
rust+0.517
 pleasure+0.515
rene+0.506
Neg logit contributions
 called-0.460
 scared-0.393
 so-0.388
 sad-0.387
 worried-0.382
Token,
Token ID11
Feature activation-6.90e-01

Pos logit contributions
 called+0.013
 scared+0.013
 worried+0.013
 shouted+0.012
 scientists+0.012
Neg logit contributions
 fault-0.037
Mine-0.036
 favorites-0.035
 accessory-0.035
 pleasure-0.034
Token a
Token ID257
Feature activation+5.59e-01

Pos logit contributions
 called+0.020
 scared+0.016
 worried+0.016
 shouted+0.015
 so+0.015
Neg logit contributions
 fault-0.043
Mine-0.043
 pleasure-0.040
 favorites-0.039
rust-0.039
Token pink
Token ID11398
Feature activation+7.33e+00

Pos logit contributions
 fault+0.028
Mine+0.027
 pleasure+0.025
 favorites+0.025
 accessory+0.024
Neg logit contributions
 called-0.021
 scared-0.019
 worried-0.019
 shouted-0.019
 sad-0.018
Token dress
Token ID6576
Feature activation+1.18e+00

Pos logit contributions
 fault+0.323
Mine+0.312
 pleasure+0.289
 favorites+0.283
rust+0.276
Neg logit contributions
 called-0.330
 scared-0.296
 worried-0.294
 shouted-0.288
 sad-0.282
Token.
Token ID13
Feature activation+7.53e-01

Pos logit contributions
 fault+0.061
Mine+0.060
 pleasure+0.057
 favorites+0.057
 accessory+0.055
Neg logit contributions
 called-0.024
 scared-0.022
 worried-0.022
 shouted-0.021
 sad-0.020
Token "
Token ID366
Feature activation+6.58e-01

Pos logit contributions
 fault+0.042
Mine+0.042
 pleasure+0.039
 favorites+0.039
 accessory+0.038
Neg logit contributions
 called-0.024
 scared-0.021
 worried-0.021
 shouted-0.020
 sad-0.019
TokenWhat
Token ID2061
Feature activation+4.53e+00

Pos logit contributions
 fault+0.038
 pleasure+0.033
 favorites+0.031
 accessory+0.030
 findings+0.029
Neg logit contributions
 called-0.024
 shouted-0.022
 scared-0.021
 worried-0.020
 yelled-0.020
Token's
Token ID338
Feature activation+2.77e+00

Pos logit contributions
 fault+0.250
Mine+0.246
 pleasure+0.230
 favorites+0.229
 accessory+0.224
Neg logit contributions
 called-0.149
 scared-0.134
 worried-0.130
 shouted-0.127
 sad-0.122
Token your
Token ID534
Feature activation+2.63e+00

Pos logit contributions
 fault+0.147
Mine+0.141
 pleasure+0.136
 favorites+0.134
oof+0.133
Neg logit contributions
 called-0.081
 shouted-0.079
 worried-0.078
 scared-0.077
plets-0.077
Token favorite
Token ID4004
Feature activation+1.12e+01

Pos logit contributions
Mine+0.103
 fault+0.094
rene+0.091
rust+0.089
bet+0.084
Neg logit contributions
 called-0.134
 scared-0.122
 worried-0.119
 noise-0.118
 sad-0.117
Token color
Token ID3124
Feature activation+3.26e+00

Pos logit contributions
 fault+0.561
Mine+0.551
 pleasure+0.505
 favorites+0.496
rene+0.488
Neg logit contributions
 called-0.442
 scared-0.389
 worried-0.377
 noise-0.369
 shouted-0.368
Token?"
Token ID1701
Feature activation+1.70e+00

Pos logit contributions
 fault+0.176
Mine+0.170
 favorites+0.163
 pleasure+0.162
 accessory+0.159
Neg logit contributions
 called-0.100
 worried-0.095
 scared-0.093
 shouted-0.093
 afraid-0.085
Token Lily
Token ID20037
Feature activation+2.29e+00

Pos logit contributions
Mine+0.081
 fault+0.075
 pleasure+0.074
 favorites+0.074
rust+0.071
Neg logit contributions
 causing-0.054
 worried-0.053
 called-0.053
 scientists-0.051
 scared-0.051
Token thought
Token ID1807
Feature activation+6.99e-02

Pos logit contributions
Mine+0.127
 fault+0.126
 favorites+0.120
 pleasure+0.116
 accessory+0.114
Neg logit contributions
 called-0.061
 worried-0.056
 scared-0.056
 men-0.056
 causing-0.055
Token for
Token ID329
Feature activation-2.50e+00

Pos logit contributions
 fault+0.004
 pleasure+0.004
Mine+0.003
 favorites+0.003
 friendship+0.003
Neg logit contributions
 called-0.002
 shouted-0.002
 scared-0.002
 worried-0.002
irens-0.002
Token happened
Token ID3022
Feature activation+3.72e+00

Pos logit contributions
 called+0.033
 worried+0.030
 scared+0.029
 shouted+0.028
 sad+0.027
Neg logit contributions
 fault-0.069
Mine-0.066
 pleasure-0.062
 favorites-0.062
 accessory-0.061
Token.
Token ID13
Feature activation+2.99e-01

Pos logit contributions
 fault+0.226
Mine+0.212
 pleasure+0.203
 favorites+0.200
 accessory+0.199
Neg logit contributions
 called-0.096
 scared-0.084
 worried-0.084
 shouted-0.081
 scientists-0.077
Token Tim
Token ID5045
Feature activation+4.27e+00

Pos logit contributions
 fault+0.016
Mine+0.016
 pleasure+0.015
 favorites+0.015
oof+0.014
Neg logit contributions
 called-0.010
 scared-0.009
 worried-0.009
 shouted-0.009
 sad-0.008
Tokenmy
Token ID1820
Feature activation+1.67e+00

Pos logit contributions
 fault+0.166
Mine+0.161
 favorites+0.143
oof+0.140
rust+0.140
Neg logit contributions
 called-0.183
 scientists-0.177
 men-0.174
 causing-0.172
 worried-0.171
Token's
Token ID338
Feature activation+2.59e+00

Pos logit contributions
Mine+0.075
 fault+0.072
omo+0.067
rust+0.066
 flavors+0.064
Neg logit contributions
 scientists-0.057
irens-0.057
plets-0.055
line-0.055
 men-0.053
Token favorite
Token ID4004
Feature activation+1.12e+01

Pos logit contributions
Mine+0.139
 fault+0.136
 favorites+0.126
 pleasure+0.126
oof+0.125
Neg logit contributions
 called-0.090
 scared-0.083
 worried-0.082
 sad-0.079
 shouted-0.076
Token toy
Token ID13373
Feature activation+6.47e+00

Pos logit contributions
 fault+0.517
Mine+0.503
rust+0.465
rene+0.464
 favorites+0.462
Neg logit contributions
 called-0.476
 scared-0.438
 worried-0.425
 sad-0.418
 noise-0.414
Token fell
Token ID3214
Feature activation-4.20e+00

Pos logit contributions
 fault+0.367
Mine+0.355
 favorites+0.328
 pleasure+0.325
 accessory+0.323
Neg logit contributions
 called-0.190
 worried-0.175
 scared-0.172
 shouted-0.167
 confused-0.159
Token into
Token ID656
Feature activation-1.78e-01

Pos logit contributions
 called+0.105
 worried+0.093
 shouted+0.093
 scientists+0.090
 men+0.087
Neg logit contributions
 fault-0.258
Mine-0.243
 accessory-0.223
 pleasure-0.218
 favorites-0.212
Token the
Token ID262
Feature activation-1.13e+00

Pos logit contributions
 called+0.005
 worried+0.005
 scared+0.004
 shouted+0.004
 sad+0.004
Neg logit contributions
Mine-0.010
 fault-0.010
 pleasure-0.009
 favorites-0.009
 accessory-0.009
Token mill
Token ID3939
Feature activation-6.51e+00

Pos logit contributions
 called+0.040
 shouted+0.036
 worried+0.035
 scared+0.035
 scientists+0.034
Neg logit contributions
 fault-0.057
Mine-0.056
 pleasure-0.051
 favorites-0.049
 friendship-0.048
Token around
Token ID1088
Feature activation+4.90e+00

Pos logit contributions
 called+0.011
 worried+0.010
 scared+0.010
 shouted+0.010
 sad+0.009
Neg logit contributions
 fault-0.028
Mine-0.027
 pleasure-0.026
 favorites-0.025
 accessory-0.025
Token the
Token ID262
Feature activation-1.29e-01

Pos logit contributions
 fault+0.321
Mine+0.316
 pleasure+0.301
 favorites+0.300
 accessory+0.292
Neg logit contributions
 called-0.111
 scared-0.093
 worried-0.090
 shouted-0.087
 sad-0.081
Token kitchen
Token ID9592
Feature activation-3.48e+00

Pos logit contributions
 called+0.005
 scared+0.005
 worried+0.005
 shouted+0.005
 sad+0.004
Neg logit contributions
 fault-0.006
Mine-0.006
 pleasure-0.006
 favorites-0.006
 accessory-0.005
Token to
Token ID284
Feature activation+7.03e-01

Pos logit contributions
 called+0.095
 worried+0.089
 scared+0.089
 afraid+0.084
 scientists+0.084
Neg logit contributions
 fault-0.197
Mine-0.187
 favorites-0.180
 flavors-0.177
 accessory-0.177
Token its
Token ID663
Feature activation+1.05e-01

Pos logit contributions
 fault+0.032
Mine+0.032
 pleasure+0.030
 favorites+0.029
oof+0.028
Neg logit contributions
 called-0.029
 scared-0.026
 worried-0.026
 shouted-0.025
 sad-0.025
Token favourite
Token ID12507
Feature activation+1.11e+01

Pos logit contributions
 fault+0.005
Mine+0.005
 pleasure+0.005
 favorites+0.005
 accessory+0.005
Neg logit contributions
 called-0.004
 scared-0.004
 worried-0.004
 shouted-0.003
 sad-0.003
Token tunes
Token ID31796
Feature activation+2.95e+00

Pos logit contributions
 fault+0.566
Mine+0.542
rust+0.514
 pleasure+0.506
rene+0.504
Neg logit contributions
 called-0.436
 scared-0.384
 so-0.372
 worried-0.372
 noise-0.369
Token.
Token ID13
Feature activation+1.73e-01

Pos logit contributions
 fault+0.184
Mine+0.177
 pleasure+0.170
 favorites+0.167
 accessory+0.167
Neg logit contributions
 called-0.068
 worried-0.062
 scared-0.061
 shouted-0.058
 confused-0.054
Token\n
Token ID198
Feature activation+5.64e-01

Pos logit contributions
 fault+0.010
Mine+0.010
 favorites+0.009
 pleasure+0.009
 accessory+0.009
Neg logit contributions
 called-0.006
 scared-0.005
 shouted-0.005
 worried-0.005
 sad-0.004
Token\n
Token ID198
Feature activation+5.38e-01

Pos logit contributions
 fault+0.036
Mine+0.034
 pleasure+0.032
 favorites+0.032
 accessory+0.032
Neg logit contributions
 called-0.016
 shouted-0.013
 scared-0.013
 worried-0.013
 sad-0.012
TokenOne
Token ID3198
Feature activation+4.57e+00

Pos logit contributions
 fault+0.030
 pleasure+0.026
 favorites+0.026
Mine+0.026
 accessory+0.025
Neg logit contributions
 called-0.020
 shouted-0.018
 scared-0.017
 worried-0.017
 sad-0.015
Token jumping
Token ID14284
Feature activation+1.22e+00

Pos logit contributions
Mine+0.049
 fault+0.048
 pleasure+0.045
 favorites+0.043
 accessory+0.043
Neg logit contributions
 called-0.038
 worried-0.034
 scared-0.033
 shocked-0.032
 men-0.032
Token.
Token ID13
Feature activation+6.70e-01

Pos logit contributions
Mine+0.075
 fault+0.075
 pleasure+0.068
 favorites+0.068
 accessory+0.067
Neg logit contributions
 called-0.030
 worried-0.027
 scared-0.025
 shouted-0.024
 confused-0.023
Token But
Token ID887
Feature activation+7.13e-01

Pos logit contributions
 fault+0.037
Mine+0.036
 pleasure+0.034
 favorites+0.033
 accessory+0.033
Neg logit contributions
 called-0.022
 scared-0.020
 worried-0.019
 shouted-0.019
 sad-0.018
Token,
Token ID11
Feature activation-1.05e-01

Pos logit contributions
 fault+0.040
Mine+0.040
 pleasure+0.037
 favorites+0.037
 accessory+0.036
Neg logit contributions
 called-0.021
 scared-0.019
 worried-0.019
 shouted-0.018
 shocked-0.017
Token her
Token ID607
Feature activation+2.08e+00

Pos logit contributions
 called+0.003
 worried+0.003
 scared+0.003
 shouted+0.003
 shocked+0.003
Neg logit contributions
Mine-0.006
 fault-0.006
 pleasure-0.006
 favorites-0.005
 accessory-0.005
Token favourite
Token ID12507
Feature activation+1.11e+01

Pos logit contributions
Mine+0.105
 fault+0.102
rust+0.095
oof+0.094
rene+0.094
Neg logit contributions
 called-0.079
 scared-0.073
 worried-0.071
 sad-0.069
 shouted-0.067
Token exercise
Token ID5517
Feature activation+4.48e+00

Pos logit contributions
 fault+0.521
Mine+0.502
 pleasure+0.460
rust+0.454
 favorites+0.451
Neg logit contributions
 called-0.485
 scared-0.440
 worried-0.427
 shouted-0.417
 sad-0.414
Token was
Token ID373
Feature activation+2.64e+00

Pos logit contributions
 fault+0.233
Mine+0.230
 pleasure+0.210
 favorites+0.210
 accessory+0.206
Neg logit contributions
 called-0.155
 worried-0.142
 scared-0.142
 shouted-0.135
 confused-0.131
Token dancing
Token ID15360
Feature activation+3.06e+00

Pos logit contributions
 fault+0.155
Mine+0.151
 pleasure+0.142
 favorites+0.141
 accessory+0.138
Neg logit contributions
 called-0.076
 scared-0.066
 worried-0.066
 shouted-0.064
 sad-0.060
Token!
Token ID0
Feature activation+1.01e+00

Pos logit contributions
 fault+0.188
Mine+0.186
 pleasure+0.175
 favorites+0.173
 accessory+0.172
Neg logit contributions
 called-0.073
 worried-0.065
 scared-0.064
 shouted-0.062
 shocked-0.057
Token\n
Token ID198
Feature activation+4.57e-01

Pos logit contributions
 fault+0.058
Mine+0.057
 pleasure+0.053
 favorites+0.053
 accessory+0.052
Neg logit contributions
 called-0.031
 scared-0.028
 worried-0.027
 shouted-0.026
 sad-0.025
Token room
Token ID2119
Feature activation-1.19e+00

Pos logit contributions
Mine+0.157
 fault+0.156
 pleasure+0.143
 favorites+0.142
rust+0.140
Neg logit contributions
 called-0.112
 scared-0.101
 worried-0.100
 shouted-0.096
 sad-0.095
Token and
Token ID290
Feature activation+3.63e-01

Pos logit contributions
 called+0.030
 scared+0.027
 worried+0.027
 shouted+0.026
 scientists+0.024
Neg logit contributions
 fault-0.071
Mine-0.069
 favorites-0.066
 pleasure-0.065
 accessory-0.064
Token put
Token ID1234
Feature activation+9.35e-01

Pos logit contributions
Mine+0.019
 fault+0.018
 favorites+0.017
 pleasure+0.017
 accessory+0.016
Neg logit contributions
 called-0.011
 worried-0.011
 scared-0.011
 sad-0.010
 scientists-0.010
Token on
Token ID319
Feature activation-5.83e-01

Pos logit contributions
 fault+0.056
Mine+0.056
 favorites+0.053
 pleasure+0.052
 accessory+0.050
Neg logit contributions
 called-0.025
 scared-0.022
 worried-0.022
 shouted-0.021
 sad-0.020
Token her
Token ID607
Feature activation+1.90e+00

Pos logit contributions
 called+0.015
 scared+0.013
 worried+0.013
 shouted+0.012
 sad+0.012
Neg logit contributions
 fault-0.036
Mine-0.035
 pleasure-0.034
 favorites-0.034
 accessory-0.033
Token favourite
Token ID12507
Feature activation+1.11e+01

Pos logit contributions
 fault+0.090
Mine+0.088
 pleasure+0.081
 favorites+0.081
 accessory+0.078
Neg logit contributions
 called-0.078
 scared-0.071
 worried-0.070
 shouted-0.069
 sad-0.066
Token dress
Token ID6576
Feature activation+1.58e+00

Pos logit contributions
 fault+0.578
Mine+0.527
 pleasure+0.505
rust+0.504
rene+0.488
Neg logit contributions
 called-0.455
 scared-0.395
 worried-0.384
 noise-0.383
 shouted-0.377
Token.
Token ID13
Feature activation+3.28e-01

Pos logit contributions
 fault+0.095
Mine+0.091
 favorites+0.089
 accessory+0.088
 pleasure+0.086
Neg logit contributions
 called-0.038
 scared-0.037
 worried-0.037
 shouted-0.034
 afraid-0.033
Token It
Token ID632
Feature activation+1.70e+00

Pos logit contributions
 fault+0.020
Mine+0.019
 pleasure+0.018
 favorites+0.018
 accessory+0.018
Neg logit contributions
 called-0.010
 scared-0.008
 worried-0.008
 shouted-0.008
 sad-0.007
Token was
Token ID373
Feature activation+2.62e+00

Pos logit contributions
 fault+0.094
Mine+0.092
 pleasure+0.086
 favorites+0.086
 accessory+0.083
Neg logit contributions
 called-0.056
 scared-0.050
 worried-0.050
 shouted-0.048
 sad-0.046
Token so
Token ID523
Feature activation-2.25e+00

Pos logit contributions
 fault+0.172
Mine+0.171
 favorites+0.160
 pleasure+0.160
 accessory+0.156
Neg logit contributions
 called-0.061
 scared-0.050
 worried-0.049
 shouted-0.046
 sad-0.044
Token\n
Token ID198
Feature activation-4.00e-01

Pos logit contributions
 called+0.016
 worried+0.015
 scientists+0.015
 scared+0.014
 causing+0.014
Neg logit contributions
 fault-0.017
Mine-0.017
 pleasure-0.016
oof-0.015
 favorites-0.015
Token\n
Token ID198
Feature activation-3.78e-01

Pos logit contributions
 called+0.013
 scared+0.012
 worried+0.012
 shouted+0.011
 sad+0.011
Neg logit contributions
 fault-0.022
Mine-0.022
 pleasure-0.020
 favorites-0.020
oof-0.019
TokenThe
Token ID464
Feature activation-1.95e+00

Pos logit contributions
 called+0.016
 scared+0.014
 worried+0.014
 shouted+0.014
 sad+0.013
Neg logit contributions
 fault-0.018
Mine-0.017
 pleasure-0.016
 favorites-0.016
 accessory-0.015
Token three
Token ID1115
Feature activation-1.68e+00

Pos logit contributions
 called+0.085
 scared+0.078
 worried+0.077
 shouted+0.076
 sad+0.073
Neg logit contributions
 fault-0.087
Mine-0.086
 pleasure-0.079
 favorites-0.078
 accessory-0.075
Token-
Token ID12
Feature activation-5.17e+00

Pos logit contributions
 called+0.063
 causing+0.063
 worried+0.060
 scared+0.060
 out+0.059
Neg logit contributions
 favorites-0.069
 flavors-0.068
Mine-0.068
omo-0.067
 fault-0.063
Tokenyear
Token ID1941
Feature activation-1.13e+01

Pos logit contributions
 called+0.169
 scared+0.153
 worried+0.151
 shouted+0.144
 sad+0.141
Neg logit contributions
 fault-0.280
Mine-0.278
 pleasure-0.258
 favorites-0.257
oof-0.252
Token-
Token ID12
Feature activation-5.18e+00

Pos logit contributions
 called+0.363
 causing+0.324
 worried+0.320
 scared+0.318
 shouted+0.299
Neg logit contributions
 fault-0.612
Mine-0.552
 friendship-0.551
rust-0.546
 pleasure-0.544
Tokenold
Token ID727
Feature activation-7.34e-01

Pos logit contributions
 called+0.191
 scared+0.174
 worried+0.172
 shouted+0.165
 sad+0.161
Neg logit contributions
 fault-0.262
Mine-0.257
 pleasure-0.240
 favorites-0.238
oof-0.232
Token thought
Token ID1807
Feature activation-6.72e-01

Pos logit contributions
 called+0.022
 scared+0.020
 causing+0.020
 worried+0.020
irens+0.020
Neg logit contributions
 fault-0.037
Mine-0.036
 pleasure-0.035
 favorites-0.035
 friendship-0.034
Token for
Token ID329
Feature activation-3.12e+00

Pos logit contributions
 called+0.017
 scared+0.016
 worried+0.015
 shouted+0.015
 confused+0.014
Neg logit contributions
 fault-0.041
Mine-0.039
 pleasure-0.038
 favorites-0.037
 accessory-0.036
Token a
Token ID257
Feature activation+7.37e-02

Pos logit contributions
 called+0.085
 scared+0.072
 worried+0.071
 shouted+0.069
 sad+0.065
Neg logit contributions
 fault-0.191
Mine-0.188
 favorites-0.176
 pleasure-0.176
 accessory-0.172
Token\n
Token ID198
Feature activation-2.16e-01

Pos logit contributions
 called+0.015
 worried+0.014
 scared+0.014
 causing+0.013
 sad+0.013
Neg logit contributions
 fault-0.016
Mine-0.016
 pleasure-0.015
 favorites-0.015
oof-0.014
Token\n
Token ID198
Feature activation-2.07e-01

Pos logit contributions
 called+0.007
 scared+0.006
 shouted+0.006
 worried+0.006
 sad+0.005
Neg logit contributions
 fault-0.013
Mine-0.012
 favorites-0.012
 pleasure-0.012
 accessory-0.011
TokenThe
Token ID464
Feature activation-1.90e+00

Pos logit contributions
 called+0.009
 scared+0.008
 shouted+0.008
 worried+0.008
 sad+0.007
Neg logit contributions
 fault-0.010
Mine-0.009
 pleasure-0.009
 favorites-0.009
 accessory-0.009
Token 3
Token ID513
Feature activation-5.04e+00

Pos logit contributions
 called+0.065
 scared+0.058
 shouted+0.058
 worried+0.056
 shocked+0.053
Neg logit contributions
Mine-0.098
 fault-0.097
 pleasure-0.093
 favorites-0.093
 accessory-0.087
Token-
Token ID12
Feature activation-5.18e+00

Pos logit contributions
 called+0.206
 worried+0.204
 scared+0.204
 causing+0.199
 confused+0.195
Neg logit contributions
Mine-0.191
 flavors-0.187
 favorites-0.186
omo-0.176
 accessory-0.175
Tokenyear
Token ID1941
Feature activation-1.13e+01

Pos logit contributions
 called+0.157
 scared+0.141
 worried+0.139
 shouted+0.132
 sad+0.128
Neg logit contributions
 fault-0.295
Mine-0.292
 pleasure-0.273
 favorites-0.272
oof-0.265
Token old
Token ID1468
Feature activation-1.21e+00

Pos logit contributions
 called+0.359
 causing+0.341
 worried+0.332
 scared+0.331
 smoking+0.310
Neg logit contributions
 fault-0.587
Mine-0.551
 friendship-0.539
 flavors-0.538
 favorites-0.538
Token smiled
Token ID13541
Feature activation+5.74e-01

Pos logit contributions
 called+0.037
 scared+0.035
 worried+0.034
 causing+0.033
plets+0.033
Neg logit contributions
Mine-0.063
 fault-0.062
 favorites-0.061
 accessory-0.060
 flavors-0.059
Token and
Token ID290
Feature activation-4.37e-02

Pos logit contributions
 fault+0.035
Mine+0.034
 pleasure+0.033
 favorites+0.033
 accessory+0.032
Neg logit contributions
 called-0.014
 shouted-0.013
 scared-0.012
 worried-0.012
 causing-0.011
Token said
Token ID531
Feature activation-3.05e+00

Pos logit contributions
 called+0.001
 worried+0.001
 scared+0.001
 scientists+0.001
 sad+0.001
Neg logit contributions
Mine-0.003
 fault-0.003
 pleasure-0.003
 favorites-0.002
 accessory-0.002
Token,
Token ID11
Feature activation-1.11e+00

Pos logit contributions
 called+0.072
 scared+0.062
 worried+0.061
 shouted+0.059
 causing+0.057
Neg logit contributions
 fault-0.190
Mine-0.189
 pleasure-0.184
 favorites-0.174
 accessory-0.170
Tokenuff
Token ID1648
Feature activation-4.28e+00

Pos logit contributions
 fault+0.041
Mine+0.040
 favorites+0.037
 pleasure+0.037
 accessory+0.035
Neg logit contributions
 called-0.041
 scared-0.038
 worried-0.037
 shouted-0.036
 sad-0.036
Tokenled
Token ID992
Feature activation-2.13e+00

Pos logit contributions
 called+0.141
 scared+0.131
 worried+0.129
 noise+0.127
 scientists+0.125
Neg logit contributions
Mine-0.211
 fault-0.206
 favorites-0.189
 flavors-0.189
rene-0.188
Token the
Token ID262
Feature activation-1.54e+00

Pos logit contributions
 called+0.062
 scared+0.055
 worried+0.055
 shouted+0.053
 shocked+0.051
Neg logit contributions
Mine-0.120
 fault-0.118
 pleasure-0.112
 favorites-0.111
 accessory-0.107
Token three
Token ID1115
Feature activation-1.38e+00

Pos logit contributions
 called+0.060
 scared+0.054
 worried+0.053
 shouted+0.053
 sad+0.049
Neg logit contributions
 fault-0.076
Mine-0.075
 pleasure-0.070
 favorites-0.069
 accessory-0.066
Token-
Token ID12
Feature activation-5.05e+00

Pos logit contributions
 called+0.049
 worried+0.045
 scared+0.045
 causing+0.043
 confused+0.042
Neg logit contributions
Mine-0.065
 favorites-0.064
 flavors-0.063
 fault-0.063
 accessory-0.061
Tokenyear
Token ID1941
Feature activation-1.13e+01

Pos logit contributions
 called+0.186
 scared+0.172
 worried+0.170
 sad+0.161
 shouted+0.160
Neg logit contributions
 fault-0.251
Mine-0.250
 pleasure-0.230
 favorites-0.229
oof-0.224
Token-
Token ID12
Feature activation-5.05e+00

Pos logit contributions
 called+0.379
 causing+0.330
 scared+0.327
 worried+0.324
 shouted+0.322
Neg logit contributions
 fault-0.587
Mine-0.539
 flavors-0.529
eter-0.527
 favorites-0.525
Tokenold
Token ID727
Feature activation-5.99e-01

Pos logit contributions
 called+0.194
 scared+0.178
 worried+0.176
 shouted+0.169
 sad+0.166
Neg logit contributions
 fault-0.249
Mine-0.244
 pleasure-0.227
 favorites-0.226
oof-0.219
Token's
Token ID338
Feature activation+1.70e+00

Pos logit contributions
 called+0.017
 scared+0.016
 worried+0.016
irens+0.016
 shocked+0.016
Neg logit contributions
 fault-0.031
Mine-0.030
 pleasure-0.030
 favorites-0.029
 accessory-0.029
Token hair
Token ID4190
Feature activation+1.87e+00

Pos logit contributions
 fault+0.082
Mine+0.077
 pleasure+0.077
 favorites+0.075
 accessory+0.071
Neg logit contributions
 called-0.068
 scared-0.061
 shouted-0.061
 worried-0.059
 shocked-0.057
Token.
Token ID13
Feature activation-3.66e-01

Pos logit contributions
 fault+0.109
Mine+0.104
 accessory+0.100
 favorites+0.099
 pleasure+0.098
Neg logit contributions
 called-0.050
 worried-0.047
 scared-0.046
 afraid-0.043
 shouted-0.043
Token,
Token ID11
Feature activation-8.28e-01

Pos logit contributions
Mine+0.049
 fault+0.046
 pleasure+0.045
 favorites+0.045
oof+0.045
Neg logit contributions
 called-0.021
 worried-0.021
 shouted-0.020
 causing-0.019
 confused-0.019
Token a
Token ID257
Feature activation+4.28e-01

Pos logit contributions
oke+0.029
irens+0.028
 called+0.028
line+0.027
 shouted+0.027
Neg logit contributions
Mine-0.040
 pleasure-0.036
 favorites-0.034
 friendship-0.033
oof-0.033
Token little
Token ID1310
Feature activation-2.57e+00

Pos logit contributions
 pleasure+0.023
Mine+0.022
 fault+0.021
 friendship+0.021
 hobby+0.021
Neg logit contributions
 called-0.012
 causing-0.011
 shouted-0.011
irens-0.011
ting-0.010
Token 3
Token ID513
Feature activation-4.83e+00

Pos logit contributions
irens+0.109
 causing+0.106
 shouted+0.102
 called+0.098
 shocked+0.097
Neg logit contributions
Mine-0.095
 pleasure-0.093
cy-0.091
 friendship-0.089
 addition-0.086
Token-
Token ID12
Feature activation-4.91e+00

Pos logit contributions
 causing+0.187
 worried+0.175
 confused+0.175
plets+0.174
 called+0.173
Neg logit contributions
Mine-0.187
oof-0.181
 flavors-0.178
 hobby-0.177
 favorites-0.174
Tokenyear
Token ID1941
Feature activation-1.12e+01

Pos logit contributions
 called+0.164
 scared+0.152
 worried+0.151
 sad+0.144
 shouted+0.143
Neg logit contributions
Mine-0.248
 fault-0.244
oof-0.228
rust-0.226
rene-0.225
Token-
Token ID12
Feature activation-4.91e+00

Pos logit contributions
 causing+0.405
 called+0.377
 smoking+0.343
 calling+0.341
 shouted+0.340
Neg logit contributions
 fault-0.523
rust-0.495
oof-0.490
 friendship-0.484
 accessory-0.476
Tokenold
Token ID727
Feature activation-4.02e-01

Pos logit contributions
 called+0.187
 scared+0.174
 worried+0.173
 shouted+0.165
 sad+0.164
Neg logit contributions
 fault-0.233
Mine-0.231
 favorites-0.212
 pleasure-0.211
oof-0.210
Token girl
Token ID2576
Feature activation-3.50e+00

Pos logit contributions
 called+0.012
 causing+0.012
 worried+0.011
plets+0.011
 sad+0.011
Neg logit contributions
Mine-0.020
 fault-0.019
 favorites-0.019
rust-0.018
 pleasure-0.018
Token found
Token ID1043
Feature activation-6.23e-01

Pos logit contributions
 noise+0.090
taker+0.090
 sad+0.090
 worried+0.088
 confused+0.087
Neg logit contributions
Mine-0.184
 favorites-0.175
 fault-0.170
 accessory-0.167
oof-0.167
Token the
Token ID262
Feature activation-1.56e+00

Pos logit contributions
 called+0.018
 shouted+0.017
 worried+0.017
plets+0.017
 confused+0.016
Neg logit contributions
 fault-0.033
Mine-0.032
 pleasure-0.030
 favorites-0.030
oof-0.029
Tokennie
Token ID11952
Feature activation+1.06e+00

Pos logit contributions
 called+0.112
 scared+0.104
 worried+0.104
 shouted+0.101
 shocked+0.098
Neg logit contributions
 fault-0.116
Mine-0.115
 favorites-0.107
 pleasure-0.106
rust-0.102
Token was
Token ID373
Feature activation+2.16e+00

Pos logit contributions
 fault+0.058
Mine+0.058
 favorites+0.055
 pleasure+0.055
 accessory+0.053
Neg logit contributions
 called-0.030
 worried-0.028
 scared-0.028
 sad-0.027
 noise-0.026
Token a
Token ID257
Feature activation+2.90e-01

Pos logit contributions
 fault+0.119
 pleasure+0.115
 favorites+0.112
Mine+0.111
 favourites+0.104
Neg logit contributions
 called-0.057
 shouted-0.056
 men-0.054
plets-0.053
 scientists-0.053
Token three
Token ID1115
Feature activation-1.48e+00

Pos logit contributions
Mine+0.016
 fault+0.016
 pleasure+0.016
 favorites+0.016
rust+0.015
Neg logit contributions
 called-0.008
 shouted-0.007
 causing-0.007
irens-0.007
 scared-0.007
Token-
Token ID12
Feature activation-5.02e+00

Pos logit contributions
 called+0.051
 causing+0.050
 worried+0.048
 scared+0.047
 shouted+0.047
Neg logit contributions
Mine-0.067
 favorites-0.066
 flavors-0.064
 fault-0.063
 hobby-0.062
Tokenyear
Token ID1941
Feature activation-1.12e+01

Pos logit contributions
 called+0.164
 scared+0.149
 worried+0.145
 shouted+0.139
 noise+0.138
Neg logit contributions
Mine-0.269
 fault-0.265
 pleasure-0.248
 favorites-0.247
rust-0.242
Token-
Token ID12
Feature activation-5.03e+00

Pos logit contributions
 called+0.360
 causing+0.341
 shouted+0.321
 worried+0.318
 scared+0.316
Neg logit contributions
 fault-0.579
 pleasure-0.558
 favorites-0.544
 accessory-0.532
 friendship-0.526
Tokenold
Token ID727
Feature activation-5.57e-01

Pos logit contributions
 called+0.183
 scared+0.166
 worried+0.164
 shouted+0.159
 sad+0.154
Neg logit contributions
 fault-0.258
Mine-0.254
 pleasure-0.236
 favorites-0.235
 accessory-0.228
Token with
Token ID351
Feature activation+1.18e+00

Pos logit contributions
 called+0.015
irens+0.014
 worried+0.014
 shocked+0.014
 scared+0.014
Neg logit contributions
 fault-0.030
Mine-0.029
 favorites-0.029
 pleasure-0.029
 accessory-0.028
Token bright
Token ID6016
Feature activation+9.93e-01

Pos logit contributions
Mine+0.066
 fault+0.066
 pleasure+0.063
 favorites+0.062
 accessory+0.060
Neg logit contributions
 called-0.036
 scared-0.032
 worried-0.031
 shouted-0.030
 sad-0.029
Token eyes
Token ID2951
Feature activation-3.99e+00

Pos logit contributions
 fault+0.051
Mine+0.050
 pleasure+0.046
 favorites+0.046
 accessory+0.044
Neg logit contributions
 called-0.037
 scared-0.033
 worried-0.033
 shouted-0.032
 sad-0.031
Token So
Token ID1406
Feature activation-5.91e-01

Pos logit contributions
 fault+0.092
Mine+0.091
 favorites+0.084
 pleasure+0.083
oof+0.079
Neg logit contributions
 called-0.068
 worried-0.064
 scared-0.064
 causing-0.060
 shocked-0.060
Token,
Token ID11
Feature activation-9.37e-01

Pos logit contributions
 called+0.015
irens+0.015
plets+0.015
iren+0.015
 worried+0.015
Neg logit contributions
Mine-0.031
 friendship-0.031
 pleasure-0.031
 fault-0.029
 favorites-0.028
Token the
Token ID262
Feature activation-1.66e+00

Pos logit contributions
 called+0.025
plets+0.024
irens+0.023
 worried+0.023
 afraid+0.023
Neg logit contributions
Mine-0.048
 friendship-0.044
 pleasure-0.044
 fault-0.044
 favorites-0.043
Token 3
Token ID513
Feature activation-4.91e+00

Pos logit contributions
 called+0.052
 shouted+0.047
plets+0.042
 causing+0.042
irens+0.042
Neg logit contributions
 fault-0.089
Mine-0.087
 pleasure-0.085
 favorites-0.084
 friendship-0.079
Token-
Token ID12
Feature activation-5.02e+00

Pos logit contributions
 called+0.187
 causing+0.186
 worried+0.181
plets+0.178
line+0.178
Neg logit contributions
 flavors-0.188
omo-0.186
Mine-0.185
 favorites-0.184
 hobby-0.179
Tokenyear
Token ID1941
Feature activation-1.12e+01

Pos logit contributions
 called+0.186
 scared+0.176
 worried+0.174
 noise+0.169
 confused+0.167
Neg logit contributions
Mine-0.232
 fault-0.223
rene-0.214
rust-0.210
oof-0.210
Token-
Token ID12
Feature activation-5.03e+00

Pos logit contributions
 called+0.368
 causing+0.358
 worried+0.322
 shouted+0.318
 scared+0.312
Neg logit contributions
 fault-0.564
rust-0.522
 friendship-0.518
oof-0.503
Mine-0.498
Tokenold
Token ID727
Feature activation-4.83e-01

Pos logit contributions
 called+0.207
 scared+0.194
 worried+0.193
 causing+0.186
 confused+0.184
Neg logit contributions
 fault-0.216
Mine-0.215
rene-0.197
oof-0.196
rust-0.195
Token grabbed
Token ID13176
Feature activation-6.02e-01

Pos logit contributions
plets+0.017
irens+0.016
 causing+0.016
 crossing+0.016
 gust+0.015
Neg logit contributions
rust-0.020
 fault-0.020
 friendship-0.019
Mine-0.019
 accessory-0.019
Token their
Token ID511
Feature activation+9.96e-01

Pos logit contributions
 called+0.018
 worried+0.016
 scared+0.015
 shouted+0.015
 sad+0.014
Neg logit contributions
Mine-0.034
 fault-0.034
 pleasure-0.032
 favorites-0.032
oof-0.031
Token brother
Token ID3956
Feature activation-1.91e+00

Pos logit contributions
 fault+0.055
Mine+0.053
 pleasure+0.051
 favorites+0.051
 accessory+0.048
Neg logit contributions
 called-0.033
 shouted-0.028
 scared-0.028
 worried-0.028
 sad-0.026
Token there
Token ID612
Feature activation+1.52e+00

Pos logit contributions
 called+0.016
 scared+0.013
 worried+0.012
 shouted+0.011
 sad+0.010
Neg logit contributions
 fault-0.067
Mine-0.066
 pleasure-0.063
 favorites-0.063
 accessory-0.061
Token was
Token ID373
Feature activation+2.31e+00

Pos logit contributions
 fault+0.111
 former+0.106
 pleasure+0.103
 accessory+0.103
Mine+0.103
Neg logit contributions
 called-0.032
 scared-0.026
 so-0.023
 sad-0.022
 shouted-0.022
Token a
Token ID257
Feature activation+3.79e-01

Pos logit contributions
 fault+0.183
Mine+0.173
 accessory+0.170
 favorites+0.166
eter+0.166
Neg logit contributions
 called-0.034
 scared-0.025
 so-0.025
 worried-0.018
 sad-0.017
Token three
Token ID1115
Feature activation-1.41e+00

Pos logit contributions
Mine+0.022
 fault+0.021
 pleasure+0.021
 favorites+0.020
rust+0.020
Neg logit contributions
 called-0.010
 shouted-0.009
 causing-0.009
 scared-0.008
 worried-0.008
Token-
Token ID12
Feature activation-4.99e+00

Pos logit contributions
 causing+0.051
 called+0.046
 worried+0.045
 shouted+0.044
ting+0.044
Neg logit contributions
 favorites-0.061
omo-0.061
Mine-0.060
 flavors-0.060
 hobby-0.059
Tokenyear
Token ID1941
Feature activation-1.12e+01

Pos logit contributions
 called+0.158
 scared+0.142
 worried+0.140
 shouted+0.136
 sad+0.130
Neg logit contributions
Mine-0.269
 fault-0.269
 favorites-0.249
 pleasure-0.249
rust-0.245
Token-
Token ID12
Feature activation-4.99e+00

Pos logit contributions
 called+0.358
 causing+0.355
 shouted+0.323
 worried+0.319
 scared+0.310
Neg logit contributions
 fault-0.570
 pleasure-0.531
 favorites-0.530
rust-0.526
 accessory-0.525
Tokenold
Token ID727
Feature activation-4.37e-01

Pos logit contributions
 called+0.181
 scared+0.164
 worried+0.163
 shouted+0.158
 sad+0.152
Neg logit contributions
 fault-0.253
Mine-0.250
 pleasure-0.232
 favorites-0.232
oof-0.225
Token boy
Token ID2933
Feature activation-2.06e+00

Pos logit contributions
 causing+0.014
 called+0.013
 worried+0.013
 shocked+0.013
 shouted+0.013
Neg logit contributions
 fault-0.021
Mine-0.021
 favorites-0.020
oof-0.020
 accessory-0.020
Token called
Token ID1444
Feature activation-6.92e+00

Pos logit contributions
 worried+0.046
 scared+0.046
 called+0.045
 shouted+0.044
 confused+0.044
Neg logit contributions
Mine-0.118
 fault-0.118
oof-0.114
 favorites-0.113
 accessory-0.109
Token Tommy
Token ID19919
Feature activation+2.30e+00

Pos logit contributions
 called+0.283
 scared+0.254
 worried+0.248
 shouted+0.247
 sad+0.234
Neg logit contributions
 fault-0.331
Mine-0.324
 favorites-0.300
 pleasure-0.295
 accessory-0.294
Token at
Token ID379
Feature activation+1.35e+00

Pos logit contributions
 called+0.007
 shouted+0.007
 noise+0.006
plets+0.006
 worried+0.006
Neg logit contributions
Mine-0.011
 fault-0.011
 favorites-0.011
 addition-0.011
 pleasure-0.010
Token her
Token ID607
Feature activation+1.39e+00

Pos logit contributions
 fault+0.082
Mine+0.081
 pleasure+0.077
 favorites+0.076
 accessory+0.074
Neg logit contributions
 called-0.037
 scared-0.032
 worried-0.032
 shouted-0.031
 sad-0.029
Token excited
Token ID6568
Feature activation+3.97e-01

Pos logit contributions
 fault+0.072
Mine+0.069
 pleasure+0.067
 favorites+0.066
 accessory+0.064
Neg logit contributions
 called-0.049
 scared-0.044
 worried-0.043
 shouted-0.043
 shocked-0.040
Token three
Token ID1115
Feature activation-1.60e+00

Pos logit contributions
 fault+0.023
Mine+0.023
 pleasure+0.022
 favorites+0.022
 accessory+0.022
Neg logit contributions
 called-0.011
 scared-0.010
 worried-0.009
 shocked-0.009
 shouted-0.009
Token-
Token ID12
Feature activation-5.06e+00

Pos logit contributions
 called+0.061
 worried+0.060
 causing+0.058
 scared+0.058
 shocked+0.056
Neg logit contributions
 flavors-0.065
 favorites-0.064
Mine-0.063
 accessory-0.062
 fault-0.061
Tokenyear
Token ID1941
Feature activation-1.12e+01

Pos logit contributions
 called+0.157
 shouted+0.134
 scared+0.133
 worried+0.131
 sad+0.117
Neg logit contributions
 fault-0.298
Mine-0.290
 favorites-0.274
 pleasure-0.273
 accessory-0.265
Token-
Token ID12
Feature activation-5.07e+00

Pos logit contributions
 called+0.365
 causing+0.339
 scared+0.330
 worried+0.329
 shouted+0.317
Neg logit contributions
 fault-0.571
 flavors-0.511
asured-0.508
 pleasure-0.507
 accessory-0.505
Tokenold
Token ID727
Feature activation-5.42e-01

Pos logit contributions
 called+0.172
 scared+0.150
 shouted+0.149
 worried+0.148
 sad+0.136
Neg logit contributions
 fault-0.280
Mine-0.276
 favorites-0.256
 pleasure-0.256
 accessory-0.248
Token daughter
Token ID4957
Feature activation-3.08e+00

Pos logit contributions
 called+0.020
plets+0.019
irens+0.019
 worried+0.019
 scared+0.018
Neg logit contributions
 fault-0.024
 pleasure-0.023
 favorites-0.023
Mine-0.023
 accessory-0.022
Token and
Token ID290
Feature activation+1.14e-01

Pos logit contributions
 called+0.079
 scared+0.075
 worried+0.074
 noise+0.070
 shocked+0.070
Neg logit contributions
 fault-0.175
Mine-0.168
 pleasure-0.167
 favorites-0.167
 accessory-0.161
Token continued
Token ID3767
Feature activation+2.84e+00

Pos logit contributions
Mine+0.006
 pleasure+0.006
 fault+0.006
 accessory+0.006
 favorites+0.006
Neg logit contributions
irens-0.003
plets-0.003
 worried-0.003
iren-0.003
 scared-0.003
Token world
Token ID995
Feature activation-1.52e+00

Pos logit contributions
Mine+0.219
 fault+0.211
rust+0.191
 favorites+0.190
oof+0.190
Neg logit contributions
 called-0.196
 scared-0.178
 worried-0.174
 sad-0.169
 shouted-0.168
Token,
Token ID11
Feature activation-1.10e+00

Pos logit contributions
 called+0.036
 worried+0.032
 scared+0.032
 shouted+0.030
 sad+0.029
Neg logit contributions
 fault-0.095
Mine-0.092
 pleasure-0.089
 favorites-0.087
 accessory-0.086
Token the
Token ID262
Feature activation-1.83e+00

Pos logit contributions
 called+0.029
 worried+0.026
 scared+0.025
 shouted+0.025
 shocked+0.023
Neg logit contributions
 fault-0.065
Mine-0.064
 pleasure-0.063
 favorites-0.060
 accessory-0.058
Token 3
Token ID513
Feature activation-5.03e+00

Pos logit contributions
 called+0.052
 scared+0.046
 worried+0.045
 shouted+0.043
 sad+0.041
Neg logit contributions
 fault-0.109
Mine-0.108
 pleasure-0.101
 favorites-0.101
 accessory-0.098
Token-
Token ID12
Feature activation-5.08e+00

Pos logit contributions
 called+0.191
 scared+0.176
 worried+0.176
 causing+0.171
 scientists+0.165
Neg logit contributions
 fault-0.226
Mine-0.226
 favorites-0.219
 pleasure-0.212
 flavors-0.211
Tokenyear
Token ID1941
Feature activation-1.12e+01

Pos logit contributions
 called+0.153
 scared+0.135
 worried+0.133
 shouted+0.128
 sad+0.123
Neg logit contributions
 fault-0.293
Mine-0.289
 pleasure-0.271
 favorites-0.270
 accessory-0.263
Token-
Token ID12
Feature activation-5.08e+00

Pos logit contributions
 called+0.330
 worried+0.297
 causing+0.296
 scared+0.294
 shouted+0.276
Neg logit contributions
 fault-0.613
 pleasure-0.577
 favorites-0.562
 friendship-0.562
 accessory-0.561
Tokenold
Token ID727
Feature activation-5.68e-01

Pos logit contributions
 called+0.179
 scared+0.160
 worried+0.158
 shouted+0.155
 sad+0.147
Neg logit contributions
 fault-0.270
Mine-0.266
 pleasure-0.247
 favorites-0.246
 accessory-0.239
Token boy
Token ID2933
Feature activation-2.24e+00

Pos logit contributions
 called+0.018
irens+0.018
 worried+0.017
 scared+0.017
 causing+0.017
Neg logit contributions
 fault-0.026
Mine-0.026
 pleasure-0.026
 accessory-0.025
 favorites-0.025
Token hugged
Token ID41130
Feature activation-1.30e+00

Pos logit contributions
 called+0.062
 noise+0.062
irens+0.061
 worried+0.061
 scientists+0.061
Neg logit contributions
 fault-0.113
Mine-0.108
 pleasure-0.108
 favorites-0.106
 accessory-0.106
Token his
Token ID465
Feature activation+1.88e+00

Pos logit contributions
 called+0.057
 scared+0.053
 worried+0.053
 shouted+0.051
 sad+0.050
Neg logit contributions
 fault-0.056
Mine-0.056
 pleasure-0.051
 favorites-0.050
 accessory-0.048
TokenSuddenly
Token ID38582
Feature activation-1.39e+00

Pos logit contributions
 fault+0.013
 pleasure+0.012
Mine+0.012
 favorites+0.011
 accessory+0.011
Neg logit contributions
 called-0.009
 shouted-0.008
 scared-0.008
 worried-0.008
 so-0.008
Token,
Token ID11
Feature activation-9.29e-01

Pos logit contributions
 called+0.033
 worried+0.029
 causing+0.029
 scared+0.028
 afraid+0.027
Neg logit contributions
Mine-0.081
 fault-0.080
 pleasure-0.077
 favorites-0.076
 flavors-0.074
Token their
Token ID511
Feature activation+8.25e-01

Pos logit contributions
 called+0.034
 worried+0.030
 shouted+0.030
 scared+0.030
 causing+0.029
Neg logit contributions
Mine-0.045
 fault-0.043
 pleasure-0.042
 favorites-0.041
 flavors-0.039
Token three
Token ID1115
Feature activation-1.53e+00

Pos logit contributions
 fault+0.046
Mine+0.045
rene+0.043
oof+0.043
 accessory+0.043
Neg logit contributions
 called-0.025
 scared-0.024
 worried-0.023
 sad-0.023
 shouted-0.022
Token-
Token ID12
Feature activation-5.00e+00

Pos logit contributions
 called+0.046
 causing+0.042
 worried+0.042
 scared+0.042
 confused+0.038
Neg logit contributions
Mine-0.081
 favorites-0.080
 fault-0.079
 flavors-0.075
omo-0.075
Tokenyear
Token ID1941
Feature activation-1.12e+01

Pos logit contributions
 called+0.156
 scared+0.142
 worried+0.140
 shouted+0.131
 sad+0.131
Neg logit contributions
 fault-0.278
Mine-0.278
 favorites-0.257
 pleasure-0.256
oof-0.251
Token-
Token ID12
Feature activation-5.01e+00

Pos logit contributions
 called+0.360
 causing+0.344
 worried+0.319
 scared+0.311
ting+0.300
Neg logit contributions
 fault-0.587
Mine-0.533
 accessory-0.526
 favorites-0.524
rust-0.522
Tokenold
Token ID727
Feature activation-4.82e-01

Pos logit contributions
 called+0.179
 scared+0.161
 worried+0.160
 shouted+0.155
 sad+0.149
Neg logit contributions
 fault-0.262
Mine-0.257
 pleasure-0.240
 favorites-0.239
 accessory-0.231
Token son
Token ID3367
Feature activation-1.58e+00

Pos logit contributions
 called+0.015
 causing+0.014
 worried+0.014
 scared+0.014
irens+0.014
Neg logit contributions
Mine-0.024
 fault-0.023
 favorites-0.022
 pleasure-0.022
 accessory-0.021
Token came
Token ID1625
Feature activation-3.71e+00

Pos logit contributions
 called+0.040
 scientists+0.039
 worried+0.038
 causing+0.038
taker+0.037
Neg logit contributions
Mine-0.082
 fault-0.079
 favorites-0.075
 accessory-0.074
 pleasure-0.074
Token running
Token ID2491
Feature activation-6.93e-01

Pos logit contributions
 called+0.111
 worried+0.097
 scared+0.097
 shouted+0.094
 sad+0.092
Neg logit contributions
 fault-0.206
Mine-0.204
 pleasure-0.190
 favorites-0.189
 accessory-0.185
Token She
Token ID1375
Feature activation-8.06e-01

Pos logit contributions
 called+0.004
 scared+0.003
 worried+0.003
 shouted+0.003
 sad+0.003
Neg logit contributions
 fault-0.007
Mine-0.007
 pleasure-0.007
 favorites-0.006
 accessory-0.006
Token was
Token ID373
Feature activation+2.22e+00

Pos logit contributions
 called+0.030
 scared+0.027
 worried+0.026
 shouted+0.025
 sad+0.024
Neg logit contributions
 fault-0.043
Mine-0.042
 pleasure-0.039
 favorites-0.039
 accessory-0.038
Token a
Token ID257
Feature activation+2.85e-01

Pos logit contributions
 pleasure+0.121
 fault+0.121
 favorites+0.120
Mine+0.114
 favourites+0.112
Neg logit contributions
 shouted-0.057
plets-0.055
irens-0.054
 men-0.050
line-0.049
Token three
Token ID1115
Feature activation-1.50e+00

Pos logit contributions
 pleasure+0.016
Mine+0.015
 fault+0.015
 favorites+0.015
 accessory+0.014
Neg logit contributions
 called-0.008
 shouted-0.008
irens-0.007
 causing-0.007
 doors-0.007
Token-
Token ID12
Feature activation-4.96e+00

Pos logit contributions
 causing+0.048
 called+0.048
 shouted+0.045
 worried+0.045
 confused+0.044
Neg logit contributions
 favorites-0.071
Mine-0.069
 fault-0.068
 flavors-0.068
 accessory-0.066
Tokenyear
Token ID1941
Feature activation-1.12e+01

Pos logit contributions
 called+0.161
 scared+0.148
 worried+0.145
 shouted+0.140
 noise+0.139
Neg logit contributions
Mine-0.258
 fault-0.256
 favorites-0.240
oof-0.238
 pleasure-0.238
Token-
Token ID12
Feature activation-4.96e+00

Pos logit contributions
 called+0.366
 causing+0.356
 shouted+0.337
 worried+0.316
 scared+0.313
Neg logit contributions
 fault-0.570
 favorites-0.535
 pleasure-0.528
 accessory-0.519
 friendship-0.511
Tokenold
Token ID727
Feature activation-4.12e-01

Pos logit contributions
 called+0.182
 scared+0.166
 worried+0.164
 shouted+0.159
 sad+0.154
Neg logit contributions
 fault-0.250
Mine-0.246
 favorites-0.229
 pleasure-0.229
oof-0.222
Token and
Token ID290
Feature activation+2.27e-01

Pos logit contributions
 called+0.012
irens+0.012
 causing+0.012
 shouted+0.011
plets+0.011
Neg logit contributions
 fault-0.021
 favorites-0.021
Mine-0.020
 pleasure-0.020
 accessory-0.020
Token she
Token ID673
Feature activation-2.91e+00

Pos logit contributions
Mine+0.011
 fault+0.011
 pleasure+0.010
 favorites+0.010
 accessory+0.010
Neg logit contributions
 called-0.009
 scared-0.008
 worried-0.008
 shouted-0.008
 confused-0.007
Token was
Token ID373
Feature activation+2.21e+00

Pos logit contributions
 called+0.102
 scared+0.089
 worried+0.085
 shouted+0.083
 sad+0.079
Neg logit contributions
 fault-0.173
Mine-0.165
 pleasure-0.159
 favorites-0.152
 accessory-0.150
Token beach
Token ID10481
Feature activation+3.42e+00

Pos logit contributions
 called+0.087
 scared+0.081
 worried+0.081
 shouted+0.079
 sad+0.077
Neg logit contributions
 fault-0.068
Mine-0.068
 pleasure-0.061
 favorites-0.061
 accessory-0.058
Token with
Token ID351
Feature activation+1.18e+00

Pos logit contributions
 fault+0.197
Mine+0.192
 favorites+0.188
 pleasure+0.183
 accessory+0.182
Neg logit contributions
 called-0.087
 worried-0.083
 scared-0.082
 shouted-0.079
 shocked-0.076
Token their
Token ID511
Feature activation+8.61e-01

Pos logit contributions
 fault+0.070
Mine+0.065
 accessory+0.062
 favorites+0.062
 pleasure+0.062
Neg logit contributions
 called-0.038
 scared-0.032
 shouted-0.032
 worried-0.031
 so-0.030
Token three
Token ID1115
Feature activation-1.45e+00

Pos logit contributions
 fault+0.050
Mine+0.047
rene+0.045
oof+0.045
 accessory+0.045
Neg logit contributions
 called-0.028
 worried-0.025
 scared-0.025
 shouted-0.024
 sad-0.024
Token-
Token ID12
Feature activation-5.03e+00

Pos logit contributions
 called+0.046
 scared+0.042
 worried+0.042
 causing+0.040
 shouted+0.039
Neg logit contributions
Mine-0.076
 fault-0.076
 favorites-0.074
 pleasure-0.071
 flavors-0.071
Tokenyear
Token ID1941
Feature activation-1.12e+01

Pos logit contributions
 called+0.153
 scared+0.132
 worried+0.130
 shouted+0.129
 sad+0.119
Neg logit contributions
 fault-0.294
Mine-0.288
 pleasure-0.271
 favorites-0.268
 accessory-0.262
Token-
Token ID12
Feature activation-5.04e+00

Pos logit contributions
 called+0.318
 scared+0.290
 worried+0.289
 causing+0.281
 shouted+0.271
Neg logit contributions
 fault-0.638
 favorites-0.598
Mine-0.592
 pleasure-0.590
 flavors-0.577
Tokenold
Token ID727
Feature activation-4.48e-01

Pos logit contributions
 called+0.170
 scared+0.147
 shouted+0.147
 worried+0.144
 sad+0.132
Neg logit contributions
 fault-0.280
Mine-0.275
 pleasure-0.257
 favorites-0.253
 accessory-0.248
Token daughter
Token ID4957
Feature activation-2.92e+00

Pos logit contributions
 called+0.017
 scared+0.016
 worried+0.016
 noise+0.016
 shocked+0.016
Neg logit contributions
 fault-0.019
 favorites-0.019
Mine-0.019
 pleasure-0.019
 accessory-0.017
Token.
Token ID13
Feature activation-2.00e-01

Pos logit contributions
 called+0.070
 scared+0.065
 worried+0.064
 noise+0.062
 shouted+0.061
Neg logit contributions
 fault-0.173
Mine-0.170
 favorites-0.167
 pleasure-0.165
 accessory-0.157
Token They
Token ID1119
Feature activation-6.33e-02

Pos logit contributions
 called+0.006
 scared+0.006
 worried+0.006
 shouted+0.005
 sad+0.005
Neg logit contributions
 fault-0.011
Mine-0.011
 pleasure-0.010
 favorites-0.010
 accessory-0.010
Token bow
Token ID9563
Feature activation+2.61e+00

Pos logit contributions
 fault+0.015
Mine+0.014
 favorites+0.014
 pleasure+0.013
eter+0.013
Neg logit contributions
 called-0.012
 shouted-0.010
 scared-0.010
 worried-0.010
 noise-0.009
Token and
Token ID290
Feature activation+1.70e-01

Pos logit contributions
 fault+0.165
Mine+0.162
 pleasure+0.153
 favorites+0.153
 accessory+0.149
Neg logit contributions
 called-0.063
 scared-0.054
 worried-0.053
 shouted-0.051
 sad-0.048
Token the
Token ID262
Feature activation-1.53e+00

Pos logit contributions
Mine+0.008
 fault+0.008
 pleasure+0.007
 favorites+0.007
 accessory+0.007
Neg logit contributions
 called-0.007
 worried-0.007
 scared-0.006
 shocked-0.006
 shouted-0.006
Token 3
Token ID513
Feature activation-4.84e+00

Pos logit contributions
 called+0.051
 scared+0.042
 shouted+0.042
 worried+0.041
 shocked+0.039
Neg logit contributions
Mine-0.085
 fault-0.084
 pleasure-0.079
 favorites-0.078
 accessory-0.075
Token-
Token ID12
Feature activation-4.96e+00

Pos logit contributions
 called+0.191
 scared+0.169
 worried+0.168
 causing+0.159
 confused+0.158
Neg logit contributions
Mine-0.221
 fault-0.212
 favorites-0.209
 flavors-0.209
 accessory-0.200
Tokenyear
Token ID1941
Feature activation-1.12e+01

Pos logit contributions
 called+0.201
 scared+0.181
 worried+0.179
 shouted+0.178
 sad+0.168
Neg logit contributions
 fault-0.238
Mine-0.232
 pleasure-0.216
 favorites-0.215
 accessory-0.207
Token-
Token ID12
Feature activation-4.97e+00

Pos logit contributions
 called+0.298
 scared+0.257
 worried+0.256
 shouted+0.238
 causing+0.233
Neg logit contributions
 fault-0.681
Mine-0.660
 pleasure-0.632
 favorites-0.629
 accessory-0.621
Tokenold
Token ID727
Feature activation-4.95e-01

Pos logit contributions
 called+0.177
 scared+0.157
 worried+0.155
 shouted+0.155
 sad+0.144
Neg logit contributions
 fault-0.262
Mine-0.257
 pleasure-0.239
 favorites-0.238
 accessory-0.231
Token followed
Token ID3940
Feature activation-6.79e-01

Pos logit contributions
 called+0.015
 scared+0.014
 worried+0.014
 shocked+0.013
 afraid+0.013
Neg logit contributions
 fault-0.026
Mine-0.026
 favorites-0.025
 pleasure-0.025
 accessory-0.024
Token him
Token ID683
Feature activation+8.35e-01

Pos logit contributions
 called+0.018
 scared+0.017
 worried+0.017
 shocked+0.016
 afraid+0.016
Neg logit contributions
Mine-0.039
 fault-0.038
 pleasure-0.036
 favorites-0.036
oof-0.035
Token,
Token ID11
Feature activation-9.36e-01

Pos logit contributions
 fault+0.053
Mine+0.052
 pleasure+0.049
 favorites+0.049
 accessory+0.048
Neg logit contributions
 called-0.020
 scared-0.018
 worried-0.017
 shouted-0.016
 sad-0.015
Token so
Token ID523
Feature activation-2.57e+00

Pos logit contributions
 called+0.025
 worried+0.021
 scared+0.021
 shouted+0.020
 shocked+0.020
Neg logit contributions
 fault-0.079
Mine-0.076
 pleasure-0.076
 favorites-0.073
 accessory-0.072
Token a
Token ID257
Feature activation+2.12e-01

Pos logit contributions
 called+0.066
 worried+0.058
 scared+0.058
 shouted+0.055
 shocked+0.052
Neg logit contributions
 fault-0.152
Mine-0.152
 pleasure-0.145
 favorites-0.142
rust-0.138
Token brave
Token ID14802
Feature activation-1.47e+00

Pos logit contributions
 fault+0.010
 pleasure+0.010
Mine+0.010
 accessory+0.009
 favorites+0.009
Neg logit contributions
 called-0.007
 shouted-0.007
 scared-0.007
 worried-0.007
 causing-0.007
Token 3
Token ID513
Feature activation-5.01e+00

Pos logit contributions
 called+0.057
 causing+0.054
irens+0.053
 shouted+0.053
 shocked+0.053
Neg logit contributions
 fault-0.065
 pleasure-0.064
Mine-0.061
 accessory-0.060
 friendship-0.059
Token-
Token ID12
Feature activation-5.07e+00

Pos logit contributions
 called+0.176
 causing+0.174
 window+0.170
 worried+0.170
plets+0.168
Neg logit contributions
 flavors-0.204
 accessory-0.199
 hobby-0.198
Mine-0.195
rene-0.189
Tokenyear
Token ID1941
Feature activation-1.12e+01

Pos logit contributions
 called+0.174
 scared+0.160
 worried+0.159
 sad+0.151
 shouted+0.150
Neg logit contributions
Mine-0.257
 fault-0.257
 pleasure-0.238
oof-0.236
 favorites-0.234
Token-
Token ID12
Feature activation-5.09e+00

Pos logit contributions
 called+0.381
 causing+0.366
 worried+0.341
 shouted+0.332
 smoking+0.332
Neg logit contributions
 fault-0.550
 accessory-0.509
 friendship-0.504
 pleasure-0.502
rust-0.493
Tokenold
Token ID727
Feature activation-5.84e-01

Pos logit contributions
 called+0.192
 scared+0.176
 worried+0.175
 shouted+0.167
 sad+0.165
Neg logit contributions
 fault-0.250
Mine-0.246
 pleasure-0.229
 favorites-0.227
oof-0.223
Token girl
Token ID2576
Feature activation-3.72e+00

Pos logit contributions
 called+0.017
 worried+0.016
 scared+0.016
 shocked+0.016
 causing+0.016
Neg logit contributions
 fault-0.030
Mine-0.029
 pleasure-0.029
 favorites-0.028
 accessory-0.028
Token named
Token ID3706
Feature activation-4.47e+00

Pos logit contributions
 called+0.092
 scared+0.081
 worried+0.080
 shouted+0.076
 sad+0.073
Neg logit contributions
 fault-0.230
Mine-0.227
 pleasure-0.215
 favorites-0.214
 accessory-0.209
Token Lisa
Token ID15378
Feature activation-6.47e-01

Pos logit contributions
 called+0.179
 scared+0.164
 worried+0.162
 shouted+0.158
 sad+0.153
Neg logit contributions
 fault-0.213
Mine-0.209
 pleasure-0.194
 favorites-0.192
 accessory-0.185
Token\n
Token ID198
Feature activation+7.85e-04

Pos logit contributions
 called+0.170
 scared+0.153
 worried+0.153
 shouted+0.146
 sad+0.141
Neg logit contributions
 fault-0.263
Mine-0.254
 pleasure-0.241
 favorites-0.238
oof-0.229
Token\n
Token ID198
Feature activation+6.54e-03

Pos logit contributions
 fault+0.000
Mine+0.000
 favorites+0.000
 pleasure+0.000
 accessory+0.000
Neg logit contributions
 called-0.000
 scared-0.000
 worried-0.000
 shouted-0.000
 sad-0.000
TokenThe
Token ID464
Feature activation-1.76e+00

Pos logit contributions
 fault+0.000
 favorites+0.000
Mine+0.000
 pleasure+0.000
 accessory+0.000
Neg logit contributions
 called-0.000
 shouted-0.000
 scared-0.000
 worried-0.000
 so-0.000
Token 3
Token ID513
Feature activation-5.02e+00

Pos logit contributions
 called+0.058
 shouted+0.050
 scared+0.049
 worried+0.048
 causing+0.046
Neg logit contributions
Mine-0.095
 fault-0.094
 pleasure-0.089
 favorites-0.088
rust-0.084
Token-
Token ID12
Feature activation-5.08e+00

Pos logit contributions
 called+0.194
 scared+0.181
 worried+0.180
 causing+0.174
 scientists+0.171
Neg logit contributions
Mine-0.223
 fault-0.212
 favorites-0.209
 flavors-0.206
 accessory-0.198
Tokenyear
Token ID1941
Feature activation-1.12e+01

Pos logit contributions
 called+0.136
 scared+0.114
 shouted+0.113
 worried+0.112
 sad+0.101
Neg logit contributions
 fault-0.320
Mine-0.309
 favorites-0.295
 pleasure-0.294
 accessory-0.285
Token-
Token ID12
Feature activation-5.10e+00

Pos logit contributions
 called+0.320
 scared+0.286
 worried+0.285
 causing+0.277
 shouted+0.259
Neg logit contributions
 fault-0.637
Mine-0.610
 pleasure-0.583
rust-0.581
 accessory-0.577
Tokenold
Token ID727
Feature activation-5.76e-01

Pos logit contributions
 called+0.163
 scared+0.141
 shouted+0.141
 worried+0.139
 sad+0.128
Neg logit contributions
 fault-0.293
Mine-0.285
 favorites-0.268
 pleasure-0.267
 accessory-0.258
Token be
Token ID307
Feature activation+1.50e+00

Pos logit contributions
 called+0.021
 scared+0.020
 worried+0.020
 scientists+0.019
 causing+0.019
Neg logit contributions
Mine-0.026
 fault-0.026
 pleasure-0.025
 accessory-0.024
 favorites-0.024
Tokenamed
Token ID2434
Feature activation-2.31e+00

Pos logit contributions
 fault+0.076
Mine+0.074
 pleasure+0.071
 favorites+0.070
 accessory+0.069
Neg logit contributions
 called-0.052
 scared-0.047
 worried-0.047
 shouted-0.045
 sad-0.044
Token and
Token ID290
Feature activation+1.08e-01

Pos logit contributions
 called+0.068
 scared+0.060
 worried+0.060
 shouted+0.060
 scientists+0.060
Neg logit contributions
Mine-0.121
 pleasure-0.118
 fault-0.117
 accessory-0.114
rust-0.113
Token happy
Token ID3772
Feature activation+1.20e+00

Pos logit contributions
 called+0.013
 scientists+0.011
 shouted+0.011
 machines+0.010
 noise+0.010
Neg logit contributions
 fault-0.046
Mine-0.044
 pleasure-0.044
 favorites-0.042
 accessory-0.040
Token that
Token ID326
Feature activation-3.10e-01

Pos logit contributions
 fault+0.079
Mine+0.078
 pleasure+0.074
 favorites+0.073
 accessory+0.072
Neg logit contributions
 called-0.026
 scared-0.022
 worried-0.022
 shouted-0.021
 sad-0.019
Token the
Token ID262
Feature activation-1.79e+00

Pos logit contributions
 called+0.007
 scared+0.006
 worried+0.006
 shouted+0.006
 sad+0.005
Neg logit contributions
 fault-0.020
Mine-0.020
 pleasure-0.019
 favorites-0.019
 accessory-0.018
Token three
Token ID1115
Feature activation-1.53e+00

Pos logit contributions
 called+0.054
 scared+0.047
 worried+0.047
 shouted+0.046
 sad+0.043
Neg logit contributions
 fault-0.103
Mine-0.102
 pleasure-0.096
 favorites-0.095
 accessory-0.092
Token-
Token ID12
Feature activation-5.01e+00

Pos logit contributions
 called+0.048
 scared+0.043
 worried+0.043
 causing+0.041
 shouted+0.041
Neg logit contributions
 fault-0.081
Mine-0.080
 favorites-0.079
 pleasure-0.076
 flavors-0.075
Tokenyear
Token ID1941
Feature activation-1.12e+01

Pos logit contributions
 called+0.164
 scared+0.148
 worried+0.146
 shouted+0.140
 sad+0.136
Neg logit contributions
 fault-0.273
Mine-0.271
 pleasure-0.252
 favorites-0.250
oof-0.244
Token-
Token ID12
Feature activation-5.02e+00

Pos logit contributions
 called+0.301
 worried+0.262
 scared+0.262
 shouted+0.250
 causing+0.248
Neg logit contributions
 fault-0.669
Mine-0.631
 pleasure-0.618
 favorites-0.615
 accessory-0.599
Tokenold
Token ID727
Feature activation-5.19e-01

Pos logit contributions
 called+0.183
 scared+0.164
 worried+0.162
 shouted+0.159
 sad+0.151
Neg logit contributions
 fault-0.261
Mine-0.256
 pleasure-0.238
 favorites-0.237
 accessory-0.230
Token had
Token ID550
Feature activation-1.89e-02

Pos logit contributions
 called+0.017
 scared+0.015
 worried+0.015
 shouted+0.015
 noise+0.015
Neg logit contributions
 fault-0.027
Mine-0.026
 pleasure-0.025
 favorites-0.025
 accessory-0.024
Token changed
Token ID3421
Feature activation-4.36e-01

Pos logit contributions
 called+0.001
 scared+0.001
 worried+0.001
 shouted+0.001
 sad+0.001
Neg logit contributions
 fault-0.001
Mine-0.001
 pleasure-0.001
 favorites-0.001
 accessory-0.001
Token their
Token ID511
Feature activation+8.60e-01

Pos logit contributions
 called+0.011
 scared+0.010
 worried+0.009
 shouted+0.009
 sad+0.009
Neg logit contributions
 fault-0.027
Mine-0.027
 pleasure-0.025
 favorites-0.025
 accessory-0.024
Tokenally
Token ID453
Feature activation+1.10e-01

Pos logit contributions
 called+0.016
 worried+0.014
 shouted+0.014
 so+0.014
 door+0.014
Neg logit contributions
 fault-0.011
Mine-0.010
 accessory-0.009
 pleasure-0.009
 personalities-0.009
Token was
Token ID373
Feature activation+2.29e+00

Pos logit contributions
 fault+0.007
Mine+0.007
 pleasure+0.006
 favorites+0.006
 accessory+0.006
Neg logit contributions
 called-0.003
 scared-0.002
 worried-0.002
 shouted-0.002
 sad-0.002
Token a
Token ID257
Feature activation+3.16e-01

Pos logit contributions
 fault+0.137
Mine+0.135
 pleasure+0.130
 favorites+0.128
oof+0.123
Neg logit contributions
 called-0.060
 shouted-0.052
 scared-0.052
 worried-0.051
 noise-0.047
Token 3
Token ID513
Feature activation-4.90e+00

Pos logit contributions
Mine+0.018
 fault+0.018
 pleasure+0.017
 favorites+0.017
 accessory+0.016
Neg logit contributions
 called-0.009
 shouted-0.008
 scared-0.007
 worried-0.007
 causing-0.007
Token-
Token ID12
Feature activation-5.07e+00

Pos logit contributions
 called+0.179
 causing+0.175
 worried+0.171
plets+0.169
 scared+0.169
Neg logit contributions
Mine-0.204
 favorites-0.194
 flavors-0.192
 hobby-0.190
 accessory-0.189
Tokenyear
Token ID1941
Feature activation-1.12e+01

Pos logit contributions
 called+0.163
 scared+0.148
 worried+0.144
 shouted+0.139
 noise+0.137
Neg logit contributions
Mine-0.270
 fault-0.269
 pleasure-0.252
 favorites-0.250
rust-0.245
Token-
Token ID12
Feature activation-5.07e+00

Pos logit contributions
 called+0.370
 causing+0.363
 worried+0.332
 scared+0.331
 shouted+0.331
Neg logit contributions
 fault-0.543
 pleasure-0.518
 favorites-0.512
 accessory-0.510
Mine-0.501
Tokenold
Token ID727
Feature activation-5.06e-01

Pos logit contributions
 called+0.184
 scared+0.168
 worried+0.165
 shouted+0.160
 sad+0.155
Neg logit contributions
 fault-0.259
Mine-0.255
 pleasure-0.238
 favorites-0.237
 accessory-0.230
Token girl
Token ID2576
Feature activation-3.71e+00

Pos logit contributions
 called+0.013
irens+0.013
 shocked+0.012
 noise+0.012
 scared+0.012
Neg logit contributions
 fault-0.027
 favorites-0.027
Mine-0.027
 pleasure-0.026
 accessory-0.026
Token who
Token ID508
Feature activation-3.63e+00

Pos logit contributions
 noise+0.085
 scared+0.083
 worried+0.082
 shocked+0.081
 men+0.079
Neg logit contributions
 fault-0.207
 favorites-0.206
Mine-0.206
 pleasure-0.201
 accessory-0.196
Token had
Token ID550
Feature activation+1.46e-01

Pos logit contributions
 called+0.130
 scared+0.114
 worried+0.111
 shouted+0.109
 sad+0.102
Neg logit contributions
 fault-0.204
Mine-0.194
 pleasure-0.182
 favorites-0.179
oof-0.178
Token to
Token ID284
Feature activation+2.45e-01

Pos logit contributions
 called+0.004
 shouted+0.004
 noise+0.004
 scientists+0.004
 scared+0.004
Neg logit contributions
 fault-0.008
Mine-0.007
 pleasure-0.007
 favorites-0.007
 friendship-0.007
Token find
Token ID1064
Feature activation-1.91e+00

Pos logit contributions
 fault+0.013
Mine+0.013
 pleasure+0.012
 favorites+0.012
 accessory+0.012
Neg logit contributions
 called-0.008
 scared-0.008
 worried-0.007
 shouted-0.007
 sad-0.007
Token a
Token ID257
Feature activation+3.60e-01

Pos logit contributions
 called+0.057
 worried+0.053
 scared+0.053
 shouted+0.051
 sad+0.050
Neg logit contributions
 fault-0.105
Mine-0.103
 pleasure-0.100
 favorites-0.097
oof-0.094
Token 3
Token ID513
Feature activation-4.91e+00

Pos logit contributions
 fault+0.018
Mine+0.017
 pleasure+0.016
 favorites+0.016
 accessory+0.015
Neg logit contributions
 called-0.014
 scared-0.013
 worried-0.013
 shouted-0.012
 sad-0.012
Token-
Token ID12
Feature activation-4.98e+00

Pos logit contributions
 called+0.181
 causing+0.172
 scientists+0.167
 worried+0.164
 confused+0.159
Neg logit contributions
Mine-0.211
 accessory-0.207
 flavors-0.205
 fault-0.199
oof-0.196
Tokenyear
Token ID1941
Feature activation-1.12e+01

Pos logit contributions
 called+0.155
 scared+0.138
 worried+0.136
 shouted+0.132
 sad+0.126
Neg logit contributions
 fault-0.282
Mine-0.278
 pleasure-0.260
 favorites-0.259
 accessory-0.252
Token old
Token ID1468
Feature activation-1.06e+00

Pos logit contributions
 causing+0.375
 called+0.373
 smoking+0.325
 shouted+0.321
 worried+0.317
Neg logit contributions
 fault-0.543
 pleasure-0.504
rust-0.498
 accessory-0.495
 friendship-0.491
Token boy
Token ID2933
Feature activation-2.11e+00

Pos logit contributions
 called+0.033
 causing+0.029
irens+0.028
plets+0.028
 worried+0.028
Neg logit contributions
 fault-0.056
Mine-0.053
 accessory-0.053
 pleasure-0.052
 flavors-0.051
Token smiling
Token ID16755
Feature activation+1.73e+00

Pos logit contributions
 called+0.048
 worried+0.043
 scared+0.043
 shouted+0.040
 sad+0.040
Neg logit contributions
 fault-0.129
Mine-0.127
 pleasure-0.121
 favorites-0.121
 accessory-0.120
Token up
Token ID510
Feature activation+2.89e+00

Pos logit contributions
 fault+0.103
Mine+0.100
 pleasure+0.098
 favorites+0.097
 accessory+0.096
Neg logit contributions
 called-0.045
 shouted-0.038
 scared-0.037
 worried-0.037
 causing-0.035
Token at
Token ID379
Feature activation+1.56e+00

Pos logit contributions
 fault+0.168
Mine+0.165
 pleasure+0.158
 favorites+0.156
 accessory+0.152
Neg logit contributions
 called-0.074
 worried-0.064
 scared-0.064
 shouted-0.063
 scientists-0.062
Token that
Token ID326
Feature activation-2.46e-01

Pos logit contributions
 called+0.011
 scared+0.010
 worried+0.010
 shouted+0.010
 shocked+0.009
Neg logit contributions
 fault-0.026
Mine-0.025
 pleasure-0.024
 favorites-0.023
 accessory-0.023
Token even
Token ID772
Feature activation+3.35e+00

Pos logit contributions
 called+0.006
 scared+0.005
 worried+0.005
 shouted+0.005
 sad+0.005
Neg logit contributions
 fault-0.015
Mine-0.015
 favorites-0.014
 pleasure-0.014
 accessory-0.014
Token a
Token ID257
Feature activation+2.07e-01

Pos logit contributions
 fault+0.174
Mine+0.166
 pleasure+0.163
 favorites+0.160
 accessory+0.155
Neg logit contributions
 called-0.110
 shouted-0.102
 worried-0.101
 scared-0.100
 shocked-0.097
Token three
Token ID1115
Feature activation-1.56e+00

Pos logit contributions
 fault+0.009
Mine+0.009
 pleasure+0.008
 favorites+0.008
 accessory+0.008
Neg logit contributions
 called-0.009
 scared-0.008
 worried-0.008
 shouted-0.008
 sad-0.008
Token-
Token ID12
Feature activation-5.02e+00

Pos logit contributions
 called+0.056
 scared+0.050
 worried+0.050
 shouted+0.048
 confused+0.046
Neg logit contributions
 fault-0.079
Mine-0.078
 pleasure-0.074
 favorites-0.073
 accessory-0.072
Tokenyear
Token ID1941
Feature activation-1.12e+01

Pos logit contributions
 called+0.167
 scared+0.149
 worried+0.147
 shouted+0.144
 sad+0.136
Neg logit contributions
 fault-0.276
Mine-0.271
 pleasure-0.252
 favorites-0.252
 accessory-0.245
Token-
Token ID12
Feature activation-5.03e+00

Pos logit contributions
 called+0.344
 scared+0.293
 causing+0.293
 worried+0.289
 shouted+0.288
Neg logit contributions
 fault-0.626
 pleasure-0.603
Mine-0.583
 accessory-0.573
 favorites-0.573
Tokenold
Token ID727
Feature activation-5.15e-01

Pos logit contributions
 called+0.181
 scared+0.162
 worried+0.160
 shouted+0.158
 sad+0.149
Neg logit contributions
 fault-0.263
Mine-0.259
 favorites-0.239
 pleasure-0.238
 accessory-0.231
Token can
Token ID460
Feature activation+1.42e+00

Pos logit contributions
 called+0.016
 scared+0.015
 worried+0.015
 shouted+0.014
 sad+0.014
Neg logit contributions
 fault-0.028
Mine-0.027
 pleasure-0.027
 favorites-0.026
 accessory-0.025
Token deliver
Token ID5203
Feature activation-2.20e+00

Pos logit contributions
 fault+0.068
Mine+0.067
 pleasure+0.063
 favorites+0.062
 accessory+0.061
Neg logit contributions
 called-0.055
 scared-0.050
 worried-0.050
 shouted-0.048
 sad-0.048
Token something
Token ID1223
Feature activation-1.25e+00

Pos logit contributions
 called+0.066
 scared+0.059
 worried+0.059
 shouted+0.056
 sad+0.054
Neg logit contributions
 fault-0.126
Mine-0.123
 pleasure-0.117
 favorites-0.116
 accessory-0.112
Token mother
Token ID2802
Feature activation-1.25e+00

Pos logit contributions
Mine+0.026
 fault+0.026
 pleasure+0.025
 favorites+0.024
 accessory+0.024
Neg logit contributions
 called-0.010
 scared-0.008
 shouted-0.008
 worried-0.008
 causing-0.008
Token took
Token ID1718
Feature activation-1.54e+00

Pos logit contributions
 called+0.032
 scared+0.030
 worried+0.030
 confused+0.029
 sad+0.029
Neg logit contributions
Mine-0.070
 fault-0.069
 favorites-0.066
 pleasure-0.065
 accessory-0.065
Token her
Token ID607
Feature activation+1.58e+00

Pos logit contributions
 called+0.062
 shocked+0.058
 worried+0.058
 causing+0.058
 scared+0.057
Neg logit contributions
Mine-0.063
 fault-0.061
 favorites-0.060
 pleasure-0.058
oof-0.057
Token 3
Token ID513
Feature activation-4.82e+00

Pos logit contributions
 fault+0.077
 favorites+0.073
Mine+0.072
 pleasure+0.071
 accessory+0.067
Neg logit contributions
 called-0.056
 shouted-0.053
 scared-0.052
 shocked-0.052
 worried-0.051
Token-
Token ID12
Feature activation-4.92e+00

Pos logit contributions
 called+0.163
 causing+0.151
 worried+0.151
 scared+0.149
 confused+0.148
Neg logit contributions
Mine-0.220
 accessory-0.215
 fault-0.213
oof-0.211
 favorites-0.209
Tokenyear
Token ID1941
Feature activation-1.12e+01

Pos logit contributions
 called+0.169
 scared+0.157
 worried+0.154
 shouted+0.146
 sad+0.146
Neg logit contributions
Mine-0.256
 fault-0.254
 favorites-0.236
 pleasure-0.235
oof-0.230
Token-
Token ID12
Feature activation-4.93e+00

Pos logit contributions
 called+0.357
 causing+0.348
 scared+0.329
 worried+0.329
 shouted+0.321
Neg logit contributions
 fault-0.571
 favorites-0.525
 pleasure-0.519
 accessory-0.517
rust-0.515
Tokenold
Token ID727
Feature activation-3.91e-01

Pos logit contributions
 called+0.179
 scared+0.163
 worried+0.161
 shouted+0.156
 sad+0.152
Neg logit contributions
 fault-0.252
Mine-0.248
 pleasure-0.231
 favorites-0.230
 accessory-0.223
Token daughter
Token ID4957
Feature activation-2.84e+00

Pos logit contributions
plets+0.018
 shocked+0.017
 called+0.017
 causing+0.017
 ding+0.017
Neg logit contributions
 favorites-0.013
 fault-0.013
Mine-0.012
 pleasure-0.012
 accessory-0.011
Token to
Token ID284
Feature activation+3.75e-01

Pos logit contributions
plets+0.077
 shocked+0.076
 called+0.075
 shouted+0.075
 scared+0.073
Neg logit contributions
 fault-0.150
 favorites-0.147
Mine-0.143
 pleasure-0.142
 favourites-0.135
Token the
Token ID262
Feature activation-1.56e+00

Pos logit contributions
Mine+0.023
 fault+0.023
 pleasure+0.022
 favorites+0.021
 accessory+0.021
Neg logit contributions
 called-0.009
 scared-0.008
 worried-0.008
 shouted-0.008
 sad-0.007
Token of
Token ID286
Feature activation-4.42e+00

Pos logit contributions
 confused+0.137
 worried+0.137
plets+0.132
 helpless+0.130
 shocked+0.128
Neg logit contributions
Mine-0.249
 friendship-0.225
 addition-0.222
rust-0.221
 accessory-0.221
Token kids
Token ID3988
Feature activation-1.37e+00

Pos logit contributions
 called+0.132
 shouted+0.113
 worried+0.111
 scared+0.110
 causing+0.105
Neg logit contributions
Mine-0.245
 fault-0.240
 favorites-0.230
 pleasure-0.227
rust-0.223
Token playing
Token ID2712
Feature activation+2.91e+00

Pos logit contributions
 called+0.049
taker+0.047
 scientists+0.047
 noise+0.047
 scared+0.047
Neg logit contributions
 fault-0.059
Mine-0.058
 favorites-0.055
 accessory-0.051
rust-0.051
Token with
Token ID351
Feature activation+9.63e-01

Pos logit contributions
 fault+0.208
Mine+0.200
 pleasure+0.195
 favorites+0.191
 accessory+0.188
Neg logit contributions
 called-0.058
 scared-0.047
 shouted-0.042
 worried-0.042
 sad-0.037
Token a
Token ID257
Feature activation+3.73e-01

Pos logit contributions
 fault+0.062
Mine+0.060
 pleasure+0.058
 accessory+0.057
 favorites+0.057
Neg logit contributions
 called-0.024
 scared-0.021
 worried-0.020
 shouted-0.019
 sad-0.019
Token football
Token ID4346
Feature activation+8.14e+00

Pos logit contributions
 fault+0.018
Mine+0.017
 pleasure+0.016
 favorites+0.016
oof+0.016
Neg logit contributions
 called-0.015
 scared-0.014
 worried-0.014
 sad-0.013
 shouted-0.013
Token.
Token ID13
Feature activation-2.41e-01

Pos logit contributions
 fault+0.517
Mine+0.506
 favorites+0.479
 pleasure+0.474
 accessory+0.468
Neg logit contributions
 called-0.186
 worried-0.164
 scared-0.163
 shouted-0.156
 sad-0.146
Token Tim
Token ID5045
Feature activation+3.12e+00

Pos logit contributions
 called+0.009
 worried+0.009
 scared+0.008
 shouted+0.008
 causing+0.008
Neg logit contributions
Mine-0.012
 fault-0.011
 favorites-0.010
 pleasure-0.010
oof-0.010
Tokenmy
Token ID1820
Feature activation+3.17e-01

Pos logit contributions
 fault+0.130
Mine+0.128
 pleasure+0.119
 favorites+0.115
 accessory+0.111
Neg logit contributions
 called-0.148
 scared-0.136
 worried-0.135
 shouted-0.133
 sad-0.128
Token wanted
Token ID2227
Feature activation+8.19e-01

Pos logit contributions
Mine+0.015
 fault+0.015
 favorites+0.014
rust+0.014
oof+0.014
Neg logit contributions
 called-0.010
 causing-0.010
 scientists-0.010
 worried-0.009
irens-0.009
Token to
Token ID284
Feature activation+9.62e-02

Pos logit contributions
 fault+0.056
Mine+0.056
 pleasure+0.054
 favorites+0.053
 accessory+0.053
Neg logit contributions
 called-0.018
 scared-0.014
 shouted-0.014
 worried-0.013
 so-0.013
Token loved
Token ID6151
Feature activation+1.44e+00

Pos logit contributions
 fault+0.033
Mine+0.032
 pleasure+0.030
 favorites+0.030
 accessory+0.029
Neg logit contributions
 called-0.021
 scared-0.019
 worried-0.019
 shouted-0.018
 sad-0.017
Token to
Token ID284
Feature activation+7.88e-02

Pos logit contributions
 fault+0.084
Mine+0.083
 favorites+0.078
 pleasure+0.078
 accessory+0.075
Neg logit contributions
 called-0.042
 scared-0.037
 worried-0.037
 shouted-0.036
 sad-0.034
Token play
Token ID711
Feature activation+5.11e+00

Pos logit contributions
 fault+0.004
Mine+0.003
 pleasure+0.003
 favorites+0.003
eter+0.003
Neg logit contributions
 called-0.004
 scared-0.003
 shouted-0.003
 worried-0.003
 noise-0.003
Token with
Token ID351
Feature activation+9.59e-01

Pos logit contributions
Mine+0.296
 fault+0.296
 favorites+0.277
 pleasure+0.274
 accessory+0.270
Neg logit contributions
 called-0.146
 scared-0.130
 worried-0.129
 shouted-0.124
 sad-0.117
Token his
Token ID465
Feature activation+2.37e+00

Pos logit contributions
 fault+0.052
 pleasure+0.049
Mine+0.049
rene+0.046
 favorites+0.046
Neg logit contributions
 called-0.035
 scared-0.031
 worried-0.030
 shouted-0.030
 noise-0.029
Token football
Token ID4346
Feature activation+8.23e+00

Pos logit contributions
rene+0.136
 fault+0.130
Mine+0.128
 pleasure+0.124
eter+0.123
Neg logit contributions
 called-0.079
 scared-0.076
 sad-0.072
 worried-0.072
 noise-0.069
Token.
Token ID13
Feature activation-1.56e-01

Pos logit contributions
 fault+0.561
Mine+0.556
 pleasure+0.533
 favorites+0.512
oof+0.508
Neg logit contributions
 called-0.183
 scared-0.145
 worried-0.136
 shouted-0.127
 sad-0.124
Token His
Token ID2399
Feature activation+3.09e+00

Pos logit contributions
 called+0.003
 so+0.003
 scared+0.002
 noise+0.002
 sad+0.002
Neg logit contributions
 fault-0.010
Mine-0.010
 accessory-0.010
rene-0.010
 favorites-0.010
Token football
Token ID4346
Feature activation+8.23e+00

Pos logit contributions
rene+0.132
 former+0.126
rust+0.119
 fault+0.119
irst+0.117
Neg logit contributions
 sad-0.147
 scared-0.147
 so-0.138
 noise-0.138
 very-0.137
Token was
Token ID373
Feature activation+2.34e+00

Pos logit contributions
 fault+0.517
Mine+0.504
 pleasure+0.494
rene+0.478
 former+0.470
Neg logit contributions
 called-0.250
 scared-0.210
 worried-0.192
 shouted-0.191
 sad-0.187
Token big
Token ID1263
Feature activation-1.29e-01

Pos logit contributions
 fault+0.138
Mine+0.136
 pleasure+0.128
 favorites+0.128
 accessory+0.124
Neg logit contributions
 called-0.067
 scared-0.059
 worried-0.058
 shouted-0.056
 sad-0.053
Token so
Token ID523
Feature activation-2.29e+00

Pos logit contributions
 fault+0.137
 pleasure+0.131
Mine+0.129
 favorites+0.124
rene+0.121
Neg logit contributions
 called-0.057
 shouted-0.051
 scientists-0.050
irens-0.049
 machines-0.049
Token excited
Token ID6568
Feature activation+2.33e-01

Pos logit contributions
 called+0.073
 scientists+0.063
 causing+0.062
 shouted+0.061
 machines+0.060
Neg logit contributions
Mine-0.117
 fault-0.117
 pleasure-0.115
 favorites-0.109
rene-0.108
Token to
Token ID284
Feature activation+1.21e-01

Pos logit contributions
 fault+0.012
 accessory+0.011
 pleasure+0.011
Mine+0.011
 favorites+0.011
Neg logit contributions
 called-0.007
 worried-0.007
 shocked-0.007
 scared-0.007
 confused-0.007
Token eat
Token ID4483
Feature activation-5.54e-02

Pos logit contributions
 fault+0.006
Mine+0.006
 favorites+0.006
 pleasure+0.005
 accessory+0.005
Neg logit contributions
 called-0.005
 scared-0.004
 worried-0.004
 shouted-0.004
 sad-0.004
Token her
Token ID607
Feature activation+1.73e+00

Pos logit contributions
 called+0.002
 scared+0.001
 worried+0.001
 shouted+0.001
 sad+0.001
Neg logit contributions
 fault-0.003
Mine-0.003
 favorites-0.003
 pleasure-0.003
 accessory-0.003
Token favorite
Token ID4004
Feature activation+1.05e+01

Pos logit contributions
Mine+0.083
rene+0.076
rust+0.075
 fault+0.074
bet+0.072
Neg logit contributions
 called-0.073
 scared-0.070
 so-0.069
 sad-0.067
 noise-0.066
Token food
Token ID2057
Feature activation+2.67e-01

Pos logit contributions
 fault+0.521
Mine+0.468
rene+0.463
rust+0.450
 pleasure+0.436
Neg logit contributions
 called-0.461
 noise-0.431
 so-0.415
 scared-0.412
 people-0.402
Token,
Token ID11
Feature activation-7.85e-01

Pos logit contributions
 fault+0.018
Mine+0.018
 favorites+0.017
 pleasure+0.017
oof+0.017
Neg logit contributions
 called-0.006
 scared-0.004
 worried-0.004
 shouted-0.004
 so-0.004
Token spaghetti
Token ID48241
Feature activation+1.45e+00

Pos logit contributions
 called+0.019
 scared+0.016
 worried+0.016
 shouted+0.015
 sad+0.014
Neg logit contributions
 fault-0.050
Mine-0.049
 pleasure-0.046
 favorites-0.046
 accessory-0.045
Token and
Token ID290
Feature activation+9.50e-02

Pos logit contributions
 fault+0.100
Mine+0.099
 favorites+0.094
 pleasure+0.094
rust+0.091
Neg logit contributions
 called-0.029
 scared-0.022
 worried-0.021
 shouted-0.021
 so-0.019
Token meat
Token ID6174
Feature activation-1.07e+00

Pos logit contributions
 fault+0.004
Mine+0.004
 pleasure+0.004
 favorites+0.004
 accessory+0.004
Neg logit contributions
 called-0.004
 scared-0.004
 worried-0.004
 shouted-0.004
 sad-0.003
Token different
Token ID1180
Feature activation+2.98e+00

Pos logit contributions
 called+0.212
 scared+0.193
 shouted+0.189
 worried+0.185
 so+0.181
Neg logit contributions
 fault-0.198
Mine-0.188
 pleasure-0.168
 accessory-0.165
rust-0.163
Token snacks
Token ID28487
Feature activation+2.97e+00

Pos logit contributions
 fault+0.138
Mine+0.130
 pleasure+0.118
rust+0.115
oof+0.113
Neg logit contributions
 called-0.136
 scared-0.120
 so-0.118
 shouted-0.118
 noise-0.116
Token,
Token ID11
Feature activation-9.34e-01

Pos logit contributions
 fault+0.197
Mine+0.194
 pleasure+0.184
 favorites+0.184
 accessory+0.180
Neg logit contributions
 called-0.061
 scared-0.052
 worried-0.051
 shouted-0.048
 sad-0.045
Token but
Token ID475
Feature activation-4.58e-01

Pos logit contributions
 called+0.043
 scared+0.039
 shouted+0.038
 worried+0.038
 so+0.037
Neg logit contributions
 fault-0.041
Mine-0.040
 pleasure-0.036
 favorites-0.035
 accessory-0.034
Token their
Token ID511
Feature activation+8.46e-01

Pos logit contributions
 called+0.012
 scared+0.011
 worried+0.010
 shouted+0.010
 sad+0.009
Neg logit contributions
 fault-0.028
Mine-0.028
 pleasure-0.026
 favorites-0.026
 accessory-0.025
Token favorite
Token ID4004
Feature activation+1.00e+01

Pos logit contributions
 fault+0.034
Mine+0.034
rust+0.032
rene+0.032
oof+0.032
Neg logit contributions
 called-0.039
 scared-0.037
 worried-0.036
 sad-0.036
 so-0.036
Token was
Token ID373
Feature activation+2.18e+00

Pos logit contributions
 fault+0.518
Mine+0.485
rene+0.466
rust+0.465
 pleasure+0.449
Neg logit contributions
 called-0.410
 scared-0.348
 so-0.341
 shouted-0.333
 noise-0.332
Token the
Token ID262
Feature activation-1.67e+00

Pos logit contributions
 fault+0.131
Mine+0.130
 pleasure+0.122
 favorites+0.121
 accessory+0.118
Neg logit contributions
 called-0.062
 scared-0.054
 worried-0.053
 shouted-0.050
 sad-0.048
Token beans
Token ID16567
Feature activation+9.80e-01

Pos logit contributions
 called+0.067
 scared+0.063
 so+0.061
 worried+0.061
 sad+0.059
Neg logit contributions
Mine-0.079
 fault-0.077
oof-0.072
rene-0.072
rust-0.072
Token.
Token ID13
Feature activation-2.01e-01

Pos logit contributions
 fault+0.060
Mine+0.058
 pleasure+0.055
 favorites+0.055
 accessory+0.054
Neg logit contributions
 called-0.024
 worried-0.021
 scared-0.021
 shouted-0.021
 confused-0.019
Token\n
Token ID198
Feature activation+1.60e-01

Pos logit contributions
 called+0.006
 scared+0.005
 shouted+0.005
 worried+0.005
 so+0.004
Neg logit contributions
 fault-0.012
Mine-0.012
 accessory-0.011
 pleasure-0.011
 favorites-0.011
Token,
Token ID11
Feature activation-6.51e-01

Pos logit contributions
 fault+0.121
Mine+0.116
 pleasure+0.113
 favorites+0.109
oof+0.109
Neg logit contributions
 called-0.033
 scared-0.027
 worried-0.027
 shouted-0.025
 confused-0.023
Token Tom
Token ID4186
Feature activation+1.72e+00

Pos logit contributions
 called+0.023
 scared+0.021
 worried+0.021
 shouted+0.021
 confused+0.019
Neg logit contributions
Mine-0.033
 fault-0.033
 pleasure-0.031
 favorites-0.030
 accessory-0.030
Token!
Token ID0
Feature activation+7.63e-01

Pos logit contributions
 fault+0.099
oof+0.087
Mine+0.085
 pleasure+0.084
omo+0.084
Neg logit contributions
 scientists-0.045
 called-0.042
irens-0.041
 doors-0.040
line-0.040
Token It
Token ID632
Feature activation+1.54e+00

Pos logit contributions
 fault+0.042
Mine+0.042
 pleasure+0.039
 favorites+0.039
 accessory+0.038
Neg logit contributions
 called-0.024
 scared-0.022
 worried-0.021
 shouted-0.021
 sad-0.020
Token's
Token ID338
Feature activation+2.05e+00

Pos logit contributions
 fault+0.087
Mine+0.085
 pleasure+0.079
 favorites+0.079
oof+0.077
Neg logit contributions
 called-0.048
 worried-0.042
 scared-0.042
 shouted-0.041
 sad-0.039
Token mine
Token ID6164
Feature activation+8.51e+00

Pos logit contributions
 fault+0.109
Mine+0.106
 pleasure+0.099
 favorites+0.098
oof+0.096
Neg logit contributions
 called-0.071
 scared-0.064
 worried-0.064
 shouted-0.063
 sad-0.059
Token!"
Token ID2474
Feature activation+9.51e-01

Pos logit contributions
Mine+0.513
 fault+0.507
 pleasure+0.484
 favorites+0.480
rust+0.468
Neg logit contributions
 called-0.238
 scared-0.209
 worried-0.200
 shouted-0.196
 so-0.187
Token Lily
Token ID20037
Feature activation+1.96e+00

Pos logit contributions
Mine+0.035
 fault+0.031
oof+0.030
omo+0.029
 pleasure+0.028
Neg logit contributions
ting-0.041
 scientists-0.041
plets-0.040
 causing-0.040
ing-0.040
Token said
Token ID531
Feature activation-2.63e+00

Pos logit contributions
Mine+0.100
 fault+0.095
 flavors+0.086
 personalities+0.084
oof+0.084
Neg logit contributions
 scientists-0.060
plets-0.059
line-0.058
lessness-0.057
irens-0.055
Token.
Token ID13
Feature activation-3.58e-01

Pos logit contributions
 scientists+0.097
 called+0.091
 machines+0.090
 men+0.090
irens+0.089
Neg logit contributions
 fault-0.123
Mine-0.102
 pleasure-0.101
eter-0.098
uzz-0.098
Token She
Token ID1375
Feature activation-8.82e-01

Pos logit contributions
 called+0.013
 worried+0.012
 scared+0.012
 causing+0.012
 scientists+0.011
Neg logit contributions
Mine-0.018
 fault-0.017
 pleasure-0.015
 favorites-0.015
oof-0.014
Token having
Token ID1719
Feature activation+1.29e+00

Pos logit contributions
 fault+0.100
 favorites+0.089
 pleasure+0.086
 favorite+0.083
 favourites+0.082
Neg logit contributions
plets-0.090
taker-0.089
lessness-0.084
 men-0.082
 machines-0.082
Token a
Token ID257
Feature activation+4.22e-01

Pos logit contributions
 fault+0.082
Mine+0.081
 pleasure+0.076
 favorites+0.076
 accessory+0.074
Neg logit contributions
 called-0.033
 scared-0.028
 worried-0.027
 shouted-0.026
 sad-0.025
Token nap
Token ID25422
Feature activation+2.34e+00

Pos logit contributions
 fault+0.020
Mine+0.019
asured+0.018
bet+0.018
eter+0.018
Neg logit contributions
 called-0.018
 scared-0.017
 sad-0.017
 worried-0.017
 very-0.016
Token in
Token ID287
Feature activation-8.04e-01

Pos logit contributions
 fault+0.133
Mine+0.128
 accessory+0.125
 favorites+0.125
 pleasure+0.123
Neg logit contributions
 called-0.059
 shouted-0.057
 worried-0.055
 scared-0.055
 confused-0.052
Token the
Token ID262
Feature activation-1.66e+00

Pos logit contributions
 called+0.019
 scared+0.016
 worried+0.015
 so+0.015
 shouted+0.014
Neg logit contributions
 fault-0.054
Mine-0.051
 favorites-0.051
 pleasure-0.050
rene-0.049
Token sun
Token ID4252
Feature activation+2.42e+00

Pos logit contributions
 called+0.074
 scared+0.068
 worried+0.067
 sad+0.065
 shouted+0.065
Neg logit contributions
 fault-0.075
Mine-0.071
 pleasure-0.067
 favorites-0.067
 accessory-0.063
Token when
Token ID618
Feature activation-5.76e-01

Pos logit contributions
 fault+0.142
Mine+0.134
 favorites+0.132
 accessory+0.130
 pleasure+0.128
Neg logit contributions
 called-0.064
 worried-0.061
 scared-0.060
 shouted-0.059
 shocked-0.058
Token a
Token ID257
Feature activation+4.10e-01

Pos logit contributions
 called+0.011
 scared+0.009
 worried+0.009
 shouted+0.008
 sad+0.008
Neg logit contributions
 fault-0.040
Mine-0.039
 pleasure-0.037
 favorites-0.037
 accessory-0.037
Token little
Token ID1310
Feature activation-2.52e+00

Pos logit contributions
Mine+0.022
 fault+0.022
 pleasure+0.021
 favorites+0.021
 accessory+0.020
Neg logit contributions
 called-0.012
 shouted-0.011
 scared-0.011
 causing-0.011
 worried-0.010
Token girl
Token ID2576
Feature activation-3.34e+00

Pos logit contributions
 causing+0.093
 called+0.088
taker+0.086
 shocked+0.086
 shouted+0.086
Neg logit contributions
Mine-0.107
 pleasure-0.101
 fault-0.100
 favorites-0.098
 friendship-0.096
Token came
Token ID1625
Feature activation-3.53e+00

Pos logit contributions
taker+0.091
 scientists+0.086
 noise+0.081
 worried+0.080
lessly+0.079
Neg logit contributions
Mine-0.171
 favorites-0.168
 fault-0.164
 accessory-0.163
oof-0.159
Token car
Token ID1097
Feature activation-6.83e-01

Pos logit contributions
 called+0.050
 scared+0.046
 worried+0.044
 sad+0.044
 noise+0.041
Neg logit contributions
Mine-0.116
 fault-0.115
rene-0.111
oof-0.111
rust-0.109
Token.
Token ID13
Feature activation-4.52e-01

Pos logit contributions
 called+0.016
 scared+0.013
 worried+0.013
 shouted+0.012
 sad+0.012
Neg logit contributions
Mine-0.044
 fault-0.044
 pleasure-0.042
 favorites-0.041
 accessory-0.040
Token It
Token ID632
Feature activation+8.75e-01

Pos logit contributions
 called+0.013
 scared+0.012
 worried+0.012
 shouted+0.011
 sad+0.011
Neg logit contributions
 fault-0.026
Mine-0.026
 pleasure-0.024
 favorites-0.024
 accessory-0.023
Token is
Token ID318
Feature activation+2.34e+00

Pos logit contributions
Mine+0.047
 fault+0.046
 pleasure+0.045
 favorites+0.043
 accessory+0.043
Neg logit contributions
 called-0.031
 scared-0.029
 worried-0.027
 shouted-0.026
 sad-0.026
Token not
Token ID407
Feature activation-1.21e+00

Pos logit contributions
 fault+0.114
Mine+0.110
 pleasure+0.103
 favorites+0.103
oof+0.099
Neg logit contributions
 called-0.090
 scared-0.082
 worried-0.082
 shouted-0.081
 confused-0.075
Token fair
Token ID3148
Feature activation+2.75e+00

Pos logit contributions
 called+0.051
 shouted+0.048
 worried+0.048
 scared+0.047
irens+0.047
Neg logit contributions
 fault-0.052
 pleasure-0.047
Mine-0.046
 favorites-0.046
 accessory-0.044
Token."
Token ID526
Feature activation+1.74e+00

Pos logit contributions
 fault+0.169
Mine+0.163
 favorites+0.155
 pleasure+0.155
 accessory+0.151
Neg logit contributions
 called-0.074
 scared-0.064
 worried-0.064
 shouted-0.062
 sad-0.057
Token\n
Token ID198
Feature activation-7.71e-02

Pos logit contributions
 fault+0.091
Mine+0.089
 pleasure+0.083
 favorites+0.083
oof+0.080
Neg logit contributions
 called-0.061
 scared-0.055
 worried-0.055
 shouted-0.053
 causing-0.052
Token\n
Token ID198
Feature activation-7.79e-02

Pos logit contributions
 called+0.003
 scared+0.002
 worried+0.002
 shouted+0.002
 sad+0.002
Neg logit contributions
 fault-0.004
Mine-0.004
 pleasure-0.004
 favorites-0.004
 accessory-0.004
TokenTom
Token ID13787
Feature activation+9.39e-01

Pos logit contributions
 called+0.003
 scared+0.003
 worried+0.003
 shouted+0.003
 sad+0.003
Neg logit contributions
 fault-0.004
Mine-0.003
 pleasure-0.003
 favorites-0.003
 accessory-0.003
Token starts
Token ID4940
Feature activation-1.10e+00

Pos logit contributions
 fault+0.053
Mine+0.052
 favorites+0.048
 pleasure+0.047
oof+0.047
Neg logit contributions
 called-0.029
 scared-0.025
 worried-0.025
 shouted-0.024
 causing-0.024
Token\n
Token ID198
Feature activation+4.37e-01

Pos logit contributions
 fault+0.027
Mine+0.026
 pleasure+0.025
 favorites+0.025
 accessory+0.024
Neg logit contributions
 called-0.013
 scared-0.011
 shouted-0.011
 worried-0.011
 sad-0.010
TokenFrom
Token ID4863
Feature activation-6.82e-01

Pos logit contributions
 fault+0.026
 favorites+0.023
 pleasure+0.023
Mine+0.022
 personalities+0.021
Neg logit contributions
 called-0.016
 shouted-0.013
 scared-0.013
 so-0.012
 worried-0.012
Token then
Token ID788
Feature activation+6.71e-01

Pos logit contributions
 called+0.011
 scared+0.010
 worried+0.010
 shouted+0.009
 confused+0.008
Neg logit contributions
 fault-0.046
Mine-0.045
oof-0.043
 pleasure-0.043
 favorites-0.043
Token on
Token ID319
Feature activation-1.17e+00

Pos logit contributions
Mine+0.039
 fault+0.037
oof+0.036
 friendship+0.035
 accessory+0.035
Neg logit contributions
 called-0.016
 scared-0.016
 confused-0.016
 worried-0.015
 shouted-0.015
Token,
Token ID11
Feature activation-6.99e-01

Pos logit contributions
 called+0.025
 scared+0.023
 worried+0.023
 shouted+0.023
 afraid+0.021
Neg logit contributions
 fault-0.075
Mine-0.071
 pleasure-0.069
 favorites-0.068
 friendship-0.067
Token Tim
Token ID5045
Feature activation+3.07e+00

Pos logit contributions
 called+0.026
 scared+0.023
 worried+0.022
 shouted+0.021
 sad+0.021
Neg logit contributions
 fault-0.037
Mine-0.036
 favorites-0.033
 pleasure-0.033
 accessory-0.033
Tokenmy
Token ID1820
Feature activation-1.19e-02

Pos logit contributions
Mine+0.149
 fault+0.144
 pleasure+0.134
 favorites+0.130
oof+0.125
Neg logit contributions
 called-0.130
 worried-0.117
 scared-0.116
 shouted-0.110
 so-0.109
Token proudly
Token ID23017
Feature activation+2.12e+00

Pos logit contributions
 called+0.000
 scared+0.000
 worried+0.000
 shouted+0.000
 sad+0.000
Neg logit contributions
Mine-0.001
 fault-0.001
 pleasure-0.001
 favorites-0.001
oof-0.001
Token drank
Token ID24070
Feature activation-1.10e+00

Pos logit contributions
 fault+0.112
Mine+0.110
 pleasure+0.103
 favorites+0.102
oof+0.098
Neg logit contributions
 called-0.076
 scared-0.068
 worried-0.067
 shouted-0.066
 sad-0.063
Token his
Token ID465
Feature activation+2.30e+00

Pos logit contributions
 called+0.034
 scared+0.030
 worried+0.030
 shouted+0.029
 sad+0.028
Neg logit contributions
 fault-0.063
Mine-0.062
 favorites-0.058
 pleasure-0.058
 accessory-0.057
Token juice
Token ID13135
Feature activation+3.18e+00

Pos logit contributions
 fault+0.107
Mine+0.104
rust+0.094
 accessory+0.092
 habitats+0.092
Neg logit contributions
 called-0.101
 sad-0.096
 scared-0.095
 worried-0.093
 so-0.090
Token jump
Token ID4391
Feature activation-3.24e-01

Pos logit contributions
 fault+0.023
Mine+0.023
 pleasure+0.021
 favorites+0.021
 accessory+0.020
Neg logit contributions
 called-0.033
 scared-0.031
 worried-0.031
 shouted-0.030
 sad-0.030
Token and
Token ID290
Feature activation+6.71e-01

Pos logit contributions
 called+0.007
 worried+0.007
 scared+0.006
 shouted+0.006
 afraid+0.006
Neg logit contributions
Mine-0.020
 fault-0.020
 favorites-0.018
 accessory-0.018
 pleasure-0.018
Token do
Token ID466
Feature activation-1.35e+00

Pos logit contributions
Mine+0.025
 fault+0.024
 pleasure+0.022
 favorites+0.022
 accessory+0.021
Neg logit contributions
 called-0.033
 worried-0.031
 scared-0.031
 shouted-0.030
 sad-0.029
Token bur
Token ID4356
Feature activation-1.94e+00

Pos logit contributions
 called+0.052
 scared+0.047
 worried+0.047
 shouted+0.046
 confused+0.043
Neg logit contributions
 fault-0.066
Mine-0.065
 pleasure-0.060
 favorites-0.060
 accessory-0.059
Tokenpees
Token ID42623
Feature activation+2.81e+00

Pos logit contributions
 called+0.070
 scared+0.062
 worried+0.061
 shouted+0.060
 sad+0.057
Neg logit contributions
 fault-0.101
Mine-0.100
 pleasure-0.093
 favorites-0.093
 accessory-0.089
Token every
Token ID790
Feature activation+5.41e+00

Pos logit contributions
 fault+0.164
Mine+0.159
 favorites+0.150
 pleasure+0.148
 accessory+0.146
Neg logit contributions
 called-0.082
 worried-0.073
 scared-0.073
 shouted-0.070
 sad-0.065
Token day
Token ID1110
Feature activation+1.04e+00

Pos logit contributions
 fault+0.237
Mine+0.228
 favorites+0.212
 pleasure+0.211
oof+0.204
Neg logit contributions
 called-0.245
 scared-0.227
 worried-0.223
 shouted-0.217
 sad-0.214
Token.
Token ID13
Feature activation+3.33e-01

Pos logit contributions
 fault+0.056
Mine+0.054
 pleasure+0.052
 favorites+0.051
 accessory+0.050
Neg logit contributions
 called-0.029
 worried-0.028
 scared-0.028
 shouted-0.027
 shocked-0.027
Token
Token ID198
Feature activation+5.95e-01

Pos logit contributions
 fault+0.023
 accessory+0.023
Mine+0.023
bet+0.022
asured+0.022
Neg logit contributions
 called-0.006
 scared-0.004
 so-0.004
 sad-0.004
 shouted-0.004
Token
Token ID198
Feature activation+6.01e-01

Pos logit contributions
 fault+0.039
Mine+0.038
 pleasure+0.036
 favorites+0.035
 accessory+0.035
Neg logit contributions
 called-0.015
 scared-0.012
 shouted-0.012
 worried-0.012
 sad-0.011
TokenOne
Token ID3198
Feature activation+4.09e+00

Pos logit contributions
 fault+0.027
 pleasure+0.025
 favorites+0.022
 accessory+0.022
Mine+0.022
Neg logit contributions
 called-0.030
 scared-0.027
 so-0.026
 sad-0.026
 shouted-0.026
Token tall
Token ID7331
Feature activation-2.83e+00

Pos logit contributions
 called+0.077
 scared+0.071
 worried+0.069
 shouted+0.066
 sad+0.065
Neg logit contributions
 fault-0.139
Mine-0.136
 favorites-0.127
 accessory-0.125
 pleasure-0.125
Token they
Token ID484
Feature activation-2.36e+00

Pos logit contributions
 called+0.061
 worried+0.053
 scared+0.053
 shouted+0.050
 sad+0.048
Neg logit contributions
 fault-0.183
Mine-0.179
 pleasure-0.171
 favorites-0.170
 accessory-0.167
Token were
Token ID547
Feature activation-6.98e-01

Pos logit contributions
 called+0.076
 worried+0.070
 scared+0.069
 shouted+0.066
 noise+0.065
Neg logit contributions
Mine-0.123
 fault-0.122
 pleasure-0.115
 favorites-0.115
 accessory-0.111
Token taller
Token ID25242
Feature activation-1.62e-02

Pos logit contributions
 called+0.020
 shouted+0.018
 scared+0.017
 worried+0.017
 noise+0.016
Neg logit contributions
 fault-0.040
Mine-0.038
 pleasure-0.037
 favorites-0.037
 accessory-0.036
Token than
Token ID621
Feature activation+3.05e+00

Pos logit contributions
 called+0.001
 scared+0.001
 worried+0.000
 shouted+0.000
 so+0.000
Neg logit contributions
Mine-0.001
 fault-0.001
 pleasure-0.001
 favorites-0.001
oof-0.001
Token her
Token ID607
Feature activation+1.38e+00

Pos logit contributions
Mine+0.154
 fault+0.154
 favorites+0.142
 pleasure+0.140
 accessory+0.138
Neg logit contributions
 called-0.114
 scared-0.102
 worried-0.100
 shouted-0.097
 sad-0.095
Token.
Token ID13
Feature activation-2.02e-01

Pos logit contributions
Mine+0.081
 fault+0.075
rust+0.072
oof+0.071
oco+0.070
Neg logit contributions
 called-0.046
 scared-0.041
 worried-0.040
 so-0.038
 sad-0.037
Token Y
Token ID575
Feature activation+3.08e-01

Pos logit contributions
 called+0.006
 scared+0.005
 worried+0.005
 shouted+0.005
 sad+0.005
Neg logit contributions
 fault-0.012
Mine-0.011
 favorites-0.011
 accessory-0.011
 pleasure-0.010
Tokenol
Token ID349
Feature activation-6.93e-02

Pos logit contributions
 fault+0.014
Mine+0.014
 favorites+0.012
 accessory+0.012
rust+0.012
Neg logit contributions
 called-0.013
ing-0.011
 of-0.011
 so-0.011
 noise-0.010
Tokenanda
Token ID5282
Feature activation+7.41e-01

Pos logit contributions
 called+0.003
 scared+0.003
 worried+0.003
 shouted+0.002
 sad+0.002
Neg logit contributions
 fault-0.003
Mine-0.003
 pleasure-0.003
 favorites-0.003
 accessory-0.003
Token was
Token ID373
Feature activation+2.13e+00

Pos logit contributions
 fault+0.041
Mine+0.040
 pleasure+0.038
 favorites+0.038
 accessory+0.037
Neg logit contributions
 called-0.023
 worried-0.021
 scared-0.021
 shouted-0.020
 sad-0.019
Token She
Token ID1375
Feature activation-1.05e+00

Pos logit contributions
 called+0.015
 scared+0.013
 worried+0.013
 shouted+0.013
 sad+0.012
Neg logit contributions
 fault-0.026
Mine-0.026
 pleasure-0.024
 favorites-0.024
oof-0.023
Token asked
Token ID1965
Feature activation+7.48e-01

Pos logit contributions
 called+0.031
 worried+0.029
 scared+0.029
 sad+0.028
 causing+0.028
Neg logit contributions
Mine-0.057
 fault-0.055
 favorites-0.053
 accessory-0.051
 pleasure-0.050
Token her
Token ID607
Feature activation+1.49e+00

Pos logit contributions
 fault+0.038
Mine+0.037
 pleasure+0.035
 favorites+0.035
 accessory+0.033
Neg logit contributions
 called-0.028
 scared-0.025
 worried-0.025
 shouted-0.024
 sad-0.023
Token mom
Token ID1995
Feature activation-7.40e-01

Pos logit contributions
 fault+0.075
Mine+0.074
 pleasure+0.068
 favorites+0.068
 accessory+0.066
Neg logit contributions
 called-0.057
 scared-0.052
 worried-0.051
 shouted-0.050
 sad-0.048
Token if
Token ID611
Feature activation-1.08e+00

Pos logit contributions
 called+0.020
 scared+0.018
 shouted+0.018
 worried+0.018
 shocked+0.016
Neg logit contributions
 fault-0.044
Mine-0.042
 favorites-0.041
 pleasure-0.040
 accessory-0.038
Token she
Token ID673
Feature activation-2.81e+00

Pos logit contributions
 called+0.025
 scared+0.022
 shouted+0.021
 worried+0.021
 noise+0.020
Neg logit contributions
 fault-0.065
Mine-0.064
 pleasure-0.061
 favorites-0.059
 friendship-0.058
Token could
Token ID714
Feature activation-1.07e+00

Pos logit contributions
 called+0.047
 scared+0.035
 shouted+0.032
 worried+0.029
 cried+0.023
Neg logit contributions
 fault-0.220
Mine-0.214
 former-0.204
 pleasure-0.203
oof-0.202
Token climb
Token ID12080
Feature activation-3.37e-01

Pos logit contributions
 called+0.038
 scared+0.034
 worried+0.033
 shouted+0.032
 sad+0.031
Neg logit contributions
 fault-0.058
Mine-0.057
 favorites-0.053
 pleasure-0.053
oof-0.052
Token the
Token ID262
Feature activation-1.77e+00

Pos logit contributions
 called+0.007
 worried+0.006
 scared+0.006
 shouted+0.006
 confused+0.005
Neg logit contributions
 fault-0.022
Mine-0.022
 favorites-0.020
 pleasure-0.020
 accessory-0.020
Token ladder
Token ID18002
Feature activation-4.05e+00

Pos logit contributions
 called+0.071
 shouted+0.063
 worried+0.062
 scared+0.060
 scientists+0.058
Neg logit contributions
 fault-0.086
Mine-0.083
 pleasure-0.076
 favorites-0.075
 flavors-0.072
Token.
Token ID13
Feature activation-4.05e-01

Pos logit contributions
 called+0.107
 worried+0.101
 scared+0.097
 shouted+0.095
 scientists+0.092
Neg logit contributions
 fault-0.238
Mine-0.226
 favorites-0.214
 pleasure-0.212
 accessory-0.209
Token them
Token ID606
Feature activation+9.77e-01

Pos logit contributions
 called+0.043
 scared+0.039
 worried+0.038
 shouted+0.037
 sad+0.035
Neg logit contributions
 fault-0.076
Mine-0.074
 pleasure-0.070
 favorites-0.069
 accessory-0.067
Token having
Token ID1719
Feature activation+1.07e+00

Pos logit contributions
 fault+0.062
Mine+0.061
 pleasure+0.058
 favorites+0.057
 accessory+0.056
Neg logit contributions
 called-0.024
 scared-0.021
 worried-0.020
 shouted-0.020
 sad-0.018
Token fun
Token ID1257
Feature activation+6.36e+00

Pos logit contributions
 fault+0.046
Mine+0.046
 favorites+0.041
 pleasure+0.041
 accessory+0.040
Neg logit contributions
 called-0.049
 scared-0.045
 worried-0.044
 shouted-0.044
 so-0.043
Token.
Token ID13
Feature activation-3.75e-01

Pos logit contributions
 fault+0.423
Mine+0.420
 pleasure+0.398
 favorites+0.394
oof+0.386
Neg logit contributions
 called-0.140
 scared-0.117
 worried-0.112
 shouted-0.109
 sad-0.099
Token She
Token ID1375
Feature activation-8.57e-01

Pos logit contributions
 called+0.010
 scared+0.009
 worried+0.008
 shouted+0.008
 sad+0.008
Neg logit contributions
 fault-0.023
Mine-0.023
 accessory-0.021
 pleasure-0.021
 favorites-0.021
Token laughed
Token ID13818
Feature activation+9.93e-01

Pos logit contributions
 called+0.027
 scared+0.024
 worried+0.023
 shouted+0.023
 sad+0.021
Neg logit contributions
 fault-0.050
Mine-0.048
 pleasure-0.046
 favorites-0.045
 accessory-0.044
Token and
Token ID290
Feature activation+2.18e-02

Pos logit contributions
 fault+0.061
Mine+0.061
 pleasure+0.058
 favorites+0.057
 accessory+0.056
Neg logit contributions
 called-0.024
 worried-0.020
 scared-0.020
 shouted-0.020
 sad-0.017
Token cl
Token ID537
Feature activation-6.97e-01

Pos logit contributions
Mine+0.001
 fault+0.001
 pleasure+0.001
 accessory+0.001
 favorites+0.001
Neg logit contributions
 called-0.001
 worried-0.001
 scientists-0.001
 scared-0.001
 men-0.001
Tokenapped
Token ID6320
Feature activation-3.51e+00

Pos logit contributions
 called+0.019
 scared+0.016
 worried+0.016
 shouted+0.015
 confused+0.014
Neg logit contributions
 fault-0.044
Mine-0.042
 pleasure-0.039
 favorites-0.039
 flavors-0.038
Token with
Token ID351
Feature activation+8.73e-01

Pos logit contributions
 called+0.111
 scared+0.099
 worried+0.098
 shouted+0.095
 sad+0.089
Neg logit contributions
 fault-0.194
Mine-0.192
 pleasure-0.180
 favorites-0.178
 accessory-0.175
Token joy
Token ID8716
Feature activation+2.28e+00

Pos logit contributions
 fault+0.036
Mine+0.032
 favorites+0.031
 accessory+0.030
oof+0.030
Neg logit contributions
 called-0.042
 scared-0.040
 worried-0.039
 shouted-0.039
 so-0.038
Token
Token ID198
Feature activation+1.60e-02

Pos logit contributions
 fault+0.002
Mine+0.002
 pleasure+0.002
 favorites+0.002
oof+0.002
Neg logit contributions
 called-0.001
 scared-0.001
 worried-0.001
 shouted-0.001
 sad-0.001
TokenTheir
Token ID14574
Feature activation+1.15e+00

Pos logit contributions
Mine+0.001
 fault+0.001
oof+0.001
 favorites+0.001
 pleasure+0.001
Neg logit contributions
 called-0.001
 scared-0.001
 worried-0.001
 confused-0.001
 sad-0.001
Token moms
Token ID34015
Feature activation+3.71e+00

Pos logit contributions
Mine+0.048
 fault+0.045
 pleasure+0.044
 accessory+0.043
 favorites+0.042
Neg logit contributions
 called-0.055
 worried-0.052
 scared-0.052
 sad-0.052
 shouted-0.049
Token came
Token ID1625
Feature activation-3.70e+00

Pos logit contributions
Mine+0.155
 fault+0.135
 flavors+0.128
 ease+0.128
 favorites+0.127
Neg logit contributions
 scientists-0.132
plets-0.127
taker-0.122
irens-0.119
ug-0.119
Token and
Token ID290
Feature activation-2.27e-01

Pos logit contributions
plets+0.132
 gust+0.117
taker+0.116
 ashamed+0.114
 called+0.112
Neg logit contributions
 fault-0.170
Mine-0.160
 favorites-0.156
 addition-0.154
asured-0.152
Token saw
Token ID2497
Feature activation-3.36e+00

Pos logit contributions
plets+0.008
 noise+0.007
 gust+0.007
ug+0.007
taker+0.007
Neg logit contributions
Mine-0.009
rust-0.009
 flavors-0.009
 favorites-0.009
omo-0.008
Token the
Token ID262
Feature activation-1.76e+00

Pos logit contributions
 called+0.143
 scared+0.135
 worried+0.133
 shouted+0.132
 confused+0.125
Neg logit contributions
 fault-0.146
Mine-0.146
 favorites-0.131
 pleasure-0.127
rust-0.124
Token bag
Token ID6131
Feature activation-3.90e+00

Pos logit contributions
 called+0.057
 scared+0.049
 shouted+0.049
 worried+0.049
 sad+0.044
Neg logit contributions
 fault-0.099
Mine-0.097
 pleasure-0.090
 favorites-0.090
 accessory-0.087
Token of
Token ID286
Feature activation-4.49e+00

Pos logit contributions
 scientists+0.099
 called+0.094
 afraid+0.089
 worried+0.087
 shouted+0.087
Neg logit contributions
 fault-0.200
Mine-0.196
 accessory-0.182
 friendship-0.179
 addition-0.177
Token nuts
Token ID14380
Feature activation+2.92e+00

Pos logit contributions
 called+0.208
 scared+0.190
 worried+0.189
 shouted+0.188
 confused+0.179
Neg logit contributions
 fault-0.183
Mine-0.183
 pleasure-0.164
 favorites-0.164
 accessory-0.156
Token.
Token ID13
Feature activation-5.14e-01

Pos logit contributions
 fault+0.158
Mine+0.141
 accessory+0.136
 flavors+0.132
 friendship+0.131
Neg logit contributions
 called-0.081
 scientists-0.079
 Lightning-0.073
 shouted-0.073
 confused-0.073
Token m
Token ID285
Feature activation-2.82e+00

Pos logit contributions
 called+0.039
 shouted+0.034
 scared+0.034
 worried+0.033
 shocked+0.031
Neg logit contributions
Mine-0.073
 fault-0.072
 favorites-0.068
 pleasure-0.067
 accessory-0.065
Tokenop
Token ID404
Feature activation-5.44e+00

Pos logit contributions
 called+0.092
 scared+0.082
 shouted+0.079
 worried+0.077
 sad+0.076
Neg logit contributions
 fault-0.153
Mine-0.149
 favorites-0.143
omo-0.139
 flavors-0.135
Token realised
Token ID19169
Feature activation-5.26e+00

Pos logit contributions
 called+0.176
 scared+0.161
 worried+0.160
 shouted+0.153
 causing+0.153
Neg logit contributions
Mine-0.276
 fault-0.275
 favorites-0.254
rust-0.248
 accessory-0.244
Token he
Token ID339
Feature activation-1.24e+00

Pos logit contributions
plets+0.117
 called+0.116
 worried+0.114
 shouted+0.113
 scared+0.113
Neg logit contributions
 fault-0.297
Mine-0.281
 friendship-0.273
 flavors-0.269
rust-0.267
Token had
Token ID550
Feature activation+4.08e-01

Pos logit contributions
 called+0.037
 scared+0.033
 worried+0.033
 shouted+0.032
 sad+0.030
Neg logit contributions
 fault-0.069
Mine-0.068
 favorites-0.063
 pleasure-0.063
 accessory-0.061
Token made
Token ID925
Feature activation-8.44e-01

Pos logit contributions
 fault+0.021
Mine+0.020
 pleasure+0.019
 favorites+0.019
 accessory+0.018
Neg logit contributions
 called-0.015
 scared-0.014
 worried-0.013
 shouted-0.013
 sad-0.013
Token a
Token ID257
Feature activation+5.55e-01

Pos logit contributions
 called+0.026
 worried+0.022
 shouted+0.022
 scared+0.022
 afraid+0.020
Neg logit contributions
 fault-0.049
Mine-0.046
 pleasure-0.045
 favorites-0.044
oof-0.042
Token big
Token ID1263
Feature activation-2.10e-02

Pos logit contributions
 fault+0.023
Mine+0.021
 pleasure+0.020
 favorites+0.019
 accessory+0.019
Neg logit contributions
 called-0.027
 shouted-0.025
 scared-0.024
 worried-0.024
 causing-0.023
Token mistake
Token ID7457
Feature activation-3.78e+00

Pos logit contributions
 called+0.001
 shouted+0.001
 scared+0.001
 worried+0.001
 scientists+0.001
Neg logit contributions
 fault-0.001
Mine-0.001
 pleasure-0.001
 favorites-0.001
 accessory-0.001
Token and
Token ID290
Feature activation+6.01e-01

Pos logit contributions
 called+0.115
 shouted+0.108
 worried+0.106
 scared+0.106
 shocked+0.101
Neg logit contributions
 fault-0.205
 accessory-0.177
Mine-0.175
 pleasure-0.170
 friendship-0.166
Token was
Token ID373
Feature activation+2.55e+00

Pos logit contributions
 fault+0.029
Mine+0.029
 pleasure+0.028
 friendship+0.028
 flavors+0.027
Neg logit contributions
plets-0.017
irens-0.017
 called-0.016
ting-0.015
iren-0.015
Token
Token ID198
Feature activation+1.22e-01

Pos logit contributions
 fault+0.007
Mine+0.007
 pleasure+0.006
 favorites+0.006
oof+0.006
Neg logit contributions
 called-0.004
 scared-0.004
 worried-0.004
 shouted-0.004
 sad-0.004
TokenThe
Token ID464
Feature activation-1.60e+00

Pos logit contributions
Mine+0.006
rust+0.005
rene+0.005
 fault+0.005
oof+0.005
Neg logit contributions
 called-0.005
 worried-0.005
 scared-0.004
 sad-0.004
 causing-0.004
Token lion
Token ID18744
Feature activation-2.58e-01

Pos logit contributions
 called+0.050
 scared+0.045
 worried+0.044
 shouted+0.042
 sad+0.041
Neg logit contributions
 fault-0.093
Mine-0.091
 pleasure-0.085
 favorites-0.085
 accessory-0.083
Token was
Token ID373
Feature activation+2.18e+00

Pos logit contributions
 called+0.008
 scientists+0.008
 causing+0.007
 worried+0.007
irens+0.007
Neg logit contributions
Mine-0.013
rust-0.012
 fault-0.012
 favorites-0.012
 accessory-0.011
Token surprised
Token ID6655
Feature activation-1.63e+00

Pos logit contributions
 fault+0.140
Mine+0.136
 pleasure+0.133
 favorites+0.130
oof+0.126
Neg logit contributions
 called-0.040
 shouted-0.034
 scientists-0.032
 worried-0.031
 men-0.031
Token and
Token ID290
Feature activation+2.13e-01

Pos logit contributions
 called+0.042
 worried+0.037
 shouted+0.037
 scared+0.037
 shocked+0.036
Neg logit contributions
 fault-0.094
Mine-0.090
 pleasure-0.087
 favorites-0.087
rust-0.085
Token said
Token ID531
Feature activation-2.81e+00

Pos logit contributions
Mine+0.011
rust+0.010
 pleasure+0.010
 fault+0.010
 friendship+0.010
Neg logit contributions
irens-0.006
 scientists-0.006
 called-0.006
plets-0.006
iren-0.005
Token to
Token ID284
Feature activation+1.99e-01

Pos logit contributions
 called+0.078
 shouted+0.066
 causing+0.065
irens+0.063
plets+0.063
Neg logit contributions
 fault-0.154
Mine-0.152
 pleasure-0.147
 friendship-0.140
rust-0.138
Token the
Token ID262
Feature activation-1.77e+00

Pos logit contributions
 fault+0.013
Mine+0.013
 pleasure+0.012
 favorites+0.012
 accessory+0.012
Neg logit contributions
 called-0.005
 scared-0.004
 worried-0.004
 shouted-0.004
 sad-0.003
Token rabbit
Token ID22746
Feature activation-7.70e-01

Pos logit contributions
 called+0.055
 shouted+0.049
 worried+0.048
 scared+0.048
 shocked+0.045
Neg logit contributions
 fault-0.096
Mine-0.096
 pleasure-0.091
 favorites-0.090
 accessory-0.087
Token,
Token ID11
Feature activation-9.02e-01

Pos logit contributions
 called+0.016
 shouted+0.013
 worried+0.013
 scared+0.013
 scientists+0.012
Neg logit contributions
 fault-0.050
Mine-0.048
 pleasure-0.046
 favorites-0.046
rust-0.045
Token called
Token ID1444
Feature activation-6.99e+00

Pos logit contributions
 scientists+0.022
 noise+0.022
plets+0.021
 gust+0.021
irens+0.021
Neg logit contributions
Mine-0.030
omo-0.029
 fault-0.029
bet-0.028
 accessory-0.028
Token out
Token ID503
Feature activation-3.88e+00

Pos logit contributions
plets+0.253
 called+0.240
 shouted+0.229
 scared+0.227
 shocked+0.226
Neg logit contributions
 fault-0.340
Mine-0.311
 pleasure-0.296
uzz-0.279
bet-0.277
Token �
Token ID6184
Feature activation+4.19e-01

Pos logit contributions
plets+0.138
line+0.127
 called+0.126
 shouted+0.124
 cried+0.120
Neg logit contributions
 fault-0.163
Mine-0.162
 pleasure-0.157
uzz-0.155
 friendship-0.154
Token�
Token ID95
Feature activation-2.80e+00

Pos logit contributions
Mine+0.017
 fault+0.015
oof+0.015
bet+0.014
oco+0.014
Neg logit contributions
 called-0.018
 scared-0.018
 worried-0.017
 sad-0.017
 afraid-0.017
Token€
Token ID26391
Feature activation-1.85e-01

Pos logit contributions
 called+0.127
 scared+0.122
 worried+0.120
 sad+0.120
 confused+0.116
Neg logit contributions
Mine-0.110
 fault-0.107
oof-0.100
bet-0.099
 pleasure-0.097
Token�
Token ID129
Feature activation-5.25e+00

Pos logit contributions
 called+0.007
 causing+0.007
 worried+0.007
 scientists+0.007
 noise+0.007
Neg logit contributions
 fault-0.008
Mine-0.007
oof-0.007
 favorites-0.007
 pleasure-0.007
Token�
Token ID241
Feature activation+4.78e+00

Pos logit contributions
 called+0.261
 scared+0.242
 worried+0.239
 scientists+0.233
 shouted+0.232
Neg logit contributions
 fault-0.188
Mine-0.185
oof-0.168
bet-0.160
 pleasure-0.157
TokenSam
Token ID16305
Feature activation+3.02e-01

Pos logit contributions
 fault+0.224
Mine+0.217
 favorites+0.204
 pleasure+0.202
 accessory+0.195
Neg logit contributions
 called-0.198
 scared-0.182
 worried-0.179
 shouted-0.177
 sad-0.167
Token!
Token ID0
Feature activation+1.02e+00

Pos logit contributions
 fault+0.017
Mine+0.016
 pleasure+0.015
 favorites+0.015
 accessory+0.015
Neg logit contributions
 called-0.009
 scared-0.008
 worried-0.008
 shouted-0.008
 sad-0.008
Token Hold
Token ID9340
Feature activation+3.26e-01

Pos logit contributions
Mine+0.050
 fault+0.050
 pleasure+0.048
 favorites+0.046
oof+0.045
Neg logit contributions
 called-0.037
 scared-0.034
 worried-0.034
 shouted-0.032
 sad-0.032
Token on
Token ID319
Feature activation-1.31e+00

Pos logit contributions
 fault+0.022
Mine+0.021
 favorites+0.020
 pleasure+0.020
oof+0.020
Neg logit contributions
 called-0.006
 scared-0.005
 worried-0.005
 shouted-0.005
 noise-0.004
Token she
Token ID673
Feature activation-2.92e+00

Pos logit contributions
 called+0.028
 shouted+0.026
 worried+0.025
irens+0.025
 scared+0.025
Neg logit contributions
Mine-0.055
 fault-0.053
 pleasure-0.050
 favorites-0.050
 accessory-0.049
Token made
Token ID925
Feature activation-1.11e+00

Pos logit contributions
 called+0.084
 worried+0.076
 scared+0.076
 scientists+0.073
 causing+0.072
Neg logit contributions
Mine-0.161
 fault-0.157
 favorites-0.149
 pleasure-0.143
rust-0.142
Token a
Token ID257
Feature activation+2.97e-01

Pos logit contributions
 called+0.031
 scared+0.027
 worried+0.027
 shouted+0.026
 sad+0.024
Neg logit contributions
 fault-0.067
Mine-0.065
 pleasure-0.062
 favorites-0.061
 accessory-0.059
Token very
Token ID845
Feature activation-3.59e+00

Pos logit contributions
 fault+0.015
Mine+0.015
 pleasure+0.014
 favorites+0.013
 accessory+0.013
Neg logit contributions
 called-0.011
irens-0.009
 scientists-0.009
 shouted-0.009
ting-0.009
Token strange
Token ID6283
Feature activation-1.01e+00

Pos logit contributions
 called+0.125
 scared+0.107
 worried+0.106
 shouted+0.105
 causing+0.101
Neg logit contributions
 fault-0.189
Mine-0.186
 pleasure-0.174
 favorites-0.172
rust-0.165
Token face
Token ID1986
Feature activation-5.26e+00

Pos logit contributions
 called+0.040
 shouted+0.035
 scared+0.035
 worried+0.035
 scientists+0.033
Neg logit contributions
 fault-0.050
Mine-0.047
 pleasure-0.045
 favorites-0.044
 accessory-0.043
Token.
Token ID13
Feature activation-4.71e-02

Pos logit contributions
 called+0.142
 worried+0.127
 scientists+0.127
 scared+0.126
 shouted+0.125
Neg logit contributions
 fault-0.297
Mine-0.295
 accessory-0.272
 favorites-0.265
 flavors-0.265
Token The
Token ID383
Feature activation-5.81e-01

Pos logit contributions
 called+0.001
 scared+0.001
 worried+0.001
 shouted+0.001
 sad+0.001
Neg logit contributions
 fault-0.003
Mine-0.003
 pleasure-0.002
 favorites-0.002
 accessory-0.002
Token soup
Token ID17141
Feature activation-7.35e-01

Pos logit contributions
 called+0.025
 scared+0.023
 worried+0.023
 shouted+0.023
 sad+0.022
Neg logit contributions
 fault-0.026
Mine-0.025
 pleasure-0.023
 favorites-0.023
 accessory-0.022
Token was
Token ID373
Feature activation+2.24e+00

Pos logit contributions
 called+0.025
 worried+0.023
 scared+0.023
 shouted+0.022
 sad+0.021
Neg logit contributions
Mine-0.038
 fault-0.037
 pleasure-0.034
 favorites-0.034
 accessory-0.033
Token bitter
Token ID12922
Feature activation-9.02e-01

Pos logit contributions
 fault+0.131
Mine+0.128
 pleasure+0.122
 favorites+0.119
 accessory+0.115
Neg logit contributions
 called-0.060
 shouted-0.053
 scared-0.051
 worried-0.051
 scientists-0.050
Token One
Token ID1881
Feature activation+3.52e+00

Pos logit contributions
 called+0.000
 so+0.000
 scared+0.000
 sad+0.000
 shouted+0.000
Neg logit contributions
 fault-0.000
 accessory-0.000
Mine-0.000
oof-0.000
oco-0.000
Token day
Token ID1110
Feature activation+8.71e-01

Pos logit contributions
 fault+0.155
Mine+0.152
 pleasure+0.137
 favorites+0.137
 accessory+0.131
Neg logit contributions
 called-0.161
 scared-0.148
 worried-0.146
 shouted-0.141
 sad-0.139
Token,
Token ID11
Feature activation-8.53e-01

Pos logit contributions
Mine+0.055
 fault+0.055
 pleasure+0.052
 favorites+0.052
oof+0.050
Neg logit contributions
 called-0.019
 worried-0.016
 scared-0.016
 shouted-0.015
 sad-0.014
Token a
Token ID257
Feature activation+3.91e-01

Pos logit contributions
 called+0.027
 worried+0.023
 scared+0.023
 shouted+0.023
 sad+0.021
Neg logit contributions
Mine-0.047
 fault-0.047
 pleasure-0.044
 favorites-0.044
oof-0.042
Token big
Token ID1263
Feature activation-3.11e-01

Pos logit contributions
 pleasure+0.019
Mine+0.019
 fault+0.018
 friendship+0.018
 hobby+0.018
Neg logit contributions
 called-0.012
 causing-0.012
 shouted-0.012
irens-0.011
ting-0.011
Token noise
Token ID7838
Feature activation-6.73e+00

Pos logit contributions
 called+0.010
 shouted+0.009
 causing+0.009
plets+0.009
 scientists+0.009
Neg logit contributions
Mine-0.015
 pleasure-0.015
 fault-0.015
 friendship-0.014
 accessory-0.014
Token disturbed
Token ID24069
Feature activation-2.25e+00

Pos logit contributions
 worried+0.203
 sad+0.201
 confused+0.199
 called+0.192
 ashamed+0.188
Neg logit contributions
Mine-0.319
 fault-0.290
 flavors-0.286
 accessory-0.276
 favorites-0.274
Token Bob
Token ID5811
Feature activation+4.20e+00

Pos logit contributions
 called+0.094
 worried+0.090
 scared+0.087
 shouted+0.085
 sad+0.085
Neg logit contributions
Mine-0.097
 fault-0.092
 pleasure-0.086
 favorites-0.085
oof-0.084
Token's
Token ID338
Feature activation+1.95e+00

Pos logit contributions
 fault+0.213
Mine+0.196
 accessory+0.186
 favorites+0.186
oof+0.184
Neg logit contributions
 called-0.116
 worried-0.115
 ashamed-0.115
 shocked-0.115
 doors-0.111
Token regular
Token ID3218
Feature activation+3.00e+00

Pos logit contributions
 fault+0.108
Mine+0.105
 pleasure+0.099
 favorites+0.099
 accessory+0.096
Neg logit contributions
 called-0.064
 scared-0.056
 worried-0.056
 shouted-0.055
 sad-0.051
Token life
Token ID1204
Feature activation+8.00e-01

Pos logit contributions
 fault+0.151
Mine+0.143
 pleasure+0.134
 favorites+0.133
 accessory+0.133
Neg logit contributions
 called-0.112
 shouted-0.101
 worried-0.100
 scared-0.098
 shocked-0.096
Token and
Token ID290
Feature activation-1.94e-01

Pos logit contributions
 called+0.004
 worried+0.003
 scared+0.003
 shouted+0.003
 sad+0.003
Neg logit contributions
 fault-0.013
Mine-0.012
 pleasure-0.012
 favorites-0.011
 accessory-0.011
Token cracks
Token ID23217
Feature activation-2.34e+00

Pos logit contributions
 called+0.006
 worried+0.006
 scared+0.005
 shouted+0.005
 confused+0.005
Neg logit contributions
 fault-0.011
Mine-0.010
 pleasure-0.009
 favorites-0.009
 flavors-0.009
Token.
Token ID13
Feature activation-3.41e-01

Pos logit contributions
 called+0.055
 worried+0.049
 scared+0.048
 shouted+0.047
 confused+0.043
Neg logit contributions
 fault-0.145
Mine-0.141
 pleasure-0.131
 favorites-0.130
 accessory-0.129
Token Tom
Token ID4186
Feature activation+8.83e-01

Pos logit contributions
 called+0.012
 scared+0.011
 worried+0.011
 shouted+0.011
 sad+0.010
Neg logit contributions
 fault-0.018
Mine-0.017
 pleasure-0.016
 favorites-0.016
oof-0.015
Token felt
Token ID2936
Feature activation-3.68e+00

Pos logit contributions
 fault+0.044
Mine+0.043
 favorites+0.040
 accessory+0.040
 flavors+0.039
Neg logit contributions
 scientists-0.025
irens-0.024
 causing-0.024
 called-0.023
 gust-0.023
Token sad
Token ID6507
Feature activation-5.53e+00

Pos logit contributions
irens+0.079
 called+0.079
plets+0.074
 shouted+0.071
lic+0.071
Neg logit contributions
 fault-0.217
 pleasure-0.211
Mine-0.207
 friendship-0.194
 favorites-0.190
Token and
Token ID290
Feature activation-2.09e-01

Pos logit contributions
 called+0.175
 shouted+0.172
irens+0.165
 scientists+0.160
 worried+0.157
Neg logit contributions
 fault-0.307
 pleasure-0.250
 friendship-0.244
Mine-0.239
 favorites-0.239
Token scared
Token ID12008
Feature activation-3.41e+00

Pos logit contributions
plets+0.006
irens+0.006
ing+0.005
lic+0.005
 scientists+0.005
Neg logit contributions
Mine-0.010
 fault-0.010
rust-0.010
 friendship-0.010
 pleasure-0.010
Token.
Token ID13
Feature activation-3.31e-01

Pos logit contributions
 called+0.117
plets+0.114
 shouted+0.113
 worried+0.112
 scanner+0.111
Neg logit contributions
 fault-0.151
Mine-0.143
 friendship-0.132
 pleasure-0.129
eter-0.127
Token
Token ID198
Feature activation+9.00e-03

Pos logit contributions
 called+0.010
 scared+0.009
 worried+0.009
 shouted+0.009
 sad+0.008
Neg logit contributions
 fault-0.019
Mine-0.018
 pleasure-0.017
 favorites-0.017
 accessory-0.017
Token
Token ID198
Feature activation-1.78e-02

Pos logit contributions
 fault+0.001
Mine+0.001
 pleasure+0.000
 favorites+0.000
 accessory+0.000
Neg logit contributions
 called-0.000
 scared-0.000
 worried-0.000
 shouted-0.000
 sad-0.000
Token tent
Token ID11105
Feature activation-1.10e+00

Pos logit contributions
 called+0.073
 scared+0.067
 worried+0.066
 shouted+0.065
 sad+0.062
Neg logit contributions
 fault-0.090
Mine-0.089
 favorites-0.083
 pleasure-0.082
 accessory-0.079
Token,
Token ID11
Feature activation-9.60e-01

Pos logit contributions
 called+0.022
 worried+0.022
 scared+0.021
 shouted+0.020
 afraid+0.020
Neg logit contributions
 fault-0.067
Mine-0.065
 favorites-0.064
 pleasure-0.063
 accessory-0.062
Token they
Token ID484
Feature activation-2.25e+00

Pos logit contributions
 called+0.034
 shouted+0.033
 worried+0.032
 scared+0.031
irens+0.031
Neg logit contributions
Mine-0.046
 fault-0.042
 favorites-0.041
 pleasure-0.040
 friendship-0.038
Token heard
Token ID2982
Feature activation-6.76e+00

Pos logit contributions
iren+0.090
irens+0.088
oke+0.081
 noise+0.081
taker+0.078
Neg logit contributions
Mine-0.083
 favorites-0.078
rust-0.076
 ease-0.073
 favourites-0.072
Token a
Token ID257
Feature activation+2.62e-01

Pos logit contributions
 called+0.232
 worried+0.224
 shouted+0.217
plets+0.214
 causing+0.213
Neg logit contributions
Mine-0.309
 fault-0.298
 friendship-0.275
 personalities-0.274
 pleasure-0.268
Token loud
Token ID7812
Feature activation-5.10e+00

Pos logit contributions
Mine+0.012
 joke+0.012
 friendship+0.011
Me+0.010
 snack+0.010
Neg logit contributions
iers-0.008
plets-0.008
 called-0.008
ug-0.008
ting-0.008
Token noise
Token ID7838
Feature activation-6.87e+00

Pos logit contributions
 called+0.138
 scientists+0.138
iers+0.135
plets+0.123
olded+0.122
Neg logit contributions
Mine-0.256
 joke-0.241
oof-0.237
 friendship-0.224
good-0.216
Token.
Token ID13
Feature activation-1.71e-01

Pos logit contributions
 worried+0.189
 sad+0.188
 ashamed+0.186
 confused+0.185
 scientists+0.184
Neg logit contributions
Mine-0.324
 fault-0.301
 accessory-0.294
oof-0.289
good-0.287
Token Lily
Token ID20037
Feature activation+1.34e+00

Pos logit contributions
 called+0.007
 causing+0.007
 worried+0.007
irens+0.007
 scared+0.007
Neg logit contributions
Mine-0.007
 fault-0.006
 favorites-0.006
 personalities-0.006
 pleasure-0.006
Token felt
Token ID2936
Feature activation-3.33e+00

Pos logit contributions
Mine+0.052
rust+0.048
die+0.048
 accessory+0.047
omo+0.047
Neg logit contributions
taker-0.048
 causing-0.046
iers-0.046
lessness-0.045
rier-0.045
Token frightened
Token ID24776
Feature activation-3.04e+00

Pos logit contributions
oke+0.073
 called+0.071
plets+0.071
irens+0.067
 shouted+0.066
Neg logit contributions
Mine-0.184
 fault-0.183
 pleasure-0.181
 friendship-0.165
 ease-0.163