TOP ACTIVATIONS
MAX = 0.213

Pete and Lucy said their I do 's .
" Hi , I am Tim ," said the
smiled to . She said I had done a great job
says . " I said I was sorry . I want
" Look , I made a hat !" Tom
" Look , I made a big loop with
said , " Look , I have some insects in my
I 'm sorry I said I don 't like you .
" Mom says I have a bad ear .
" Hi , I 'm Ch ir py !
" Look , Lily , I found some gum in the
. She laughed and said I was lucky . She said
" Mom , I made it ! I did
" Look , I found a stick !" Tom
said , " Look , I found a ball . It
" Look , Lily , I see a bird 's nest
" Look , Lily , I found a coin !" Ben
said , " Look , I found a pretty pe b
mom ! She said ," I saw your ig loo and
from Anna . She says I am her best friend .

MIN ACTIVATIONS
MIN = -0.227

from far away lands and tales of magical creatures . She
start learning how to make models just like the ones his
museum and see what other dinosaurs she could find . <|endoftext|>
read books , especially fairy tales . One night , she
do not know anything about nations ." Ben and
whisper excited ly , telling tales of the rat who was
play cow boys and cow girls , and pretend to ride
friends are more fun than robots . Don 't you agree
be cow boys and cow girls . They put on hats
going and make more interesting models . <|endoftext|> One day a
were soldiers or animals or robots . One day , they
We know some things about nations ." The big
she knew so much about dinosaurs . She was ready to
! That 's how all masks should be organized ."
had no car and no dinosaurs to play with . They
blankets and chairs to make cages and fences .
slides and ice cream and dinosaurs . They had fun at
to know more about other nations . They ask the teacher
inside , but the police officers did not listen .
I see that book about dinosaurs ? It looks fun ,"

INTERVAL 0.103 - 0.213
CONTAINS 2.056%

and happy . " Mom , we 're sorry
, “ Frank ie , this microphone
The girl cl apped her hands and the dog w agged
was carrying a wide , sunny smile . Everyone was so
Lily says . Mom unt w ists Lily 's

INTERVAL -0.007 - 0.103
CONTAINS 50.758%

. That pencil could hurt you . ” Jack was
they are too high . " Let 's climb
, but the bird flew away . Tom felt sad that
something caught Molly 's eye . She saw two little birds
asked her mom for a bag and started picking up the

INTERVAL -0.117 - -0.007
CONTAINS 46.727%

The raven c awed and flew down , getting
he feels sad or discouraged , he remembers how he worked
high voice . " The dragon is coming to get me
very much . Tim my learned that accidents can happen and
shampoo from getting in her eyes . Lily didn 't understand

INTERVAL -0.227 - -0.117
CONTAINS 0.458%

yes , you can have cheese and crack ers later .
Tommy were surprised that the rh in oc eros could talk
play . They make more masks and pictures . They have
they never wanted to get married . One day ,
L ily and Tom were friends who liked to play in
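
The four INTERVAL headings above appear to be equal-width bins spanning the reported MIN and MAX activations, with each CONTAINS figure giving the share of activations that falls in that bin. The sketch below is only a consistency check on those printed numbers. The helper name `equal_width_edges` is hypothetical and is not taken from the tool that produced this listing.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 edges that split [lo, hi] into equal-width bins."""
    step = (hi - lo) / n_bins
    # Round to 3 decimals to match the precision printed in the listing.
    return [round(lo + i * step, 3) for i in range(n_bins + 1)]


# MIN and MAX as reported in the listing.
edges = equal_width_edges(-0.227, 0.213, 4)

# CONTAINS percentages, ordered from the lowest interval to the highest.
shares = [0.458, 46.727, 50.758, 2.056]
total_share = sum(shares)

print(edges)
print(total_share)
```

Under this reading, the computed edges match the interval boundaries in the headings, and the four shares add up to just under 100%, which is what rounding to three decimals would produce.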
Token Pete
Token ID17542
Feature activation-2.72e-02

Pos logit contributions
 not+0.006
lf+0.006
 Brown+0.006
 driving+0.006
 Alf+0.006
Neg logit contributions
ited-0.016
able-0.015
ok-0.015
cy-0.014
 single-0.014
Token and
Token ID290
Feature activation+6.22e-03

Pos logit contributions
 not+0.037
lf+0.036
 Brown+0.034
 Alf+0.034
 driving+0.034
Neg logit contributions
ited-0.065
able-0.060
ok-0.059
cy-0.057
Ok-0.055
Token Lucy
Token ID22162
Feature activation+8.80e-03

Pos logit contributions
ited+0.015
able+0.014
cy+0.014
ok+0.014
Ok+0.013
Neg logit contributions
 not-0.007
lf-0.007
 driving-0.007
 Brown-0.006
 Alf-0.006
Token said
Token ID531
Feature activation-9.30e-02

Pos logit contributions
ited+0.021
able+0.019
ok+0.019
cy+0.019
Ok+0.018
Neg logit contributions
 not-0.012
lf-0.011
 Brown-0.011
 driving-0.011
 Alf-0.011
Token their
Token ID511
Feature activation-3.26e-02

Pos logit contributions
 not+0.109
lf+0.095
 Brown+0.094
 driving+0.094
 Alf+0.090
Neg logit contributions
ited-0.254
able-0.230
ok-0.227
cy-0.224
Ok-0.212
Token I
Token ID314
Feature activation+2.03e-01

Pos logit contributions
 not+0.045
lf+0.043
 Brown+0.042
 driving+0.041
 Alf+0.041
Neg logit contributions
ited-0.077
able-0.071
ok-0.070
cy-0.067
Ok-0.066
Token do
Token ID466
Feature activation-9.14e-03

Pos logit contributions
ited+0.482
able+0.437
ok+0.434
cy+0.415
Ok+0.407
Neg logit contributions
 not-0.290
lf-0.265
 Brown-0.260
 driving-0.260
 Alf-0.254
Token's
Token ID338
Feature activation+1.19e-03

Pos logit contributions
 not+0.012
lf+0.010
 driving+0.010
 Brown+0.010
 pretending+0.009
Neg logit contributions
ited-0.024
able-0.022
ok-0.022
cy-0.021
Ok-0.020
Token.
Token ID13
Feature activation-7.87e-03

Pos logit contributions
ited+0.003
able+0.003
ok+0.003
cy+0.003
Ok+0.003
Neg logit contributions
 not-0.002
lf-0.001
 driving-0.001
 Brown-0.001
 Alf-0.001
Token
Token ID198
Feature activation-1.29e-02

Pos logit contributions
 not+0.009
 Brown+0.009
 Alf+0.009
 Max+0.008
 driving+0.008
Neg logit contributions
ited-0.022
able-0.019
ok-0.019
cy-0.018
 single-0.018
Token
Token ID198
Feature activation-1.83e-03

Pos logit contributions
 not+0.015
lf+0.014
 Brown+0.013
 driving+0.013
 Alf+0.013
Neg logit contributions
ited-0.034
able-0.031
ok-0.031
cy-0.030
Ok-0.029
Token
Token ID198
Feature activation+3.51e-02

Pos logit contributions
ited+0.072
able+0.066
ok+0.065
cy+0.062
Ok+0.061
Neg logit contributions
 not-0.041
lf-0.039
 Brown-0.038
 driving-0.038
 Alf-0.037
Token
Token ID198
Feature activation+3.28e-02

Pos logit contributions
ited+0.086
able+0.076
ok+0.075
cy+0.073
 single+0.070
Neg logit contributions
 not-0.050
lf-0.047
 Brown-0.045
 driving-0.044
 Alf-0.044
Token"
Token ID1
Feature activation+1.32e-01

Pos logit contributions
ited+0.067
able+0.059
ok+0.057
cy+0.056
 single+0.054
Neg logit contributions
 not-0.059
lf-0.057
 Brown-0.053
 Zig-0.053
 Alf-0.053
TokenHi
Token ID17250
Feature activation+1.39e-02

Pos logit contributions
ited+0.210
able+0.176
ok+0.164
 single+0.162
cy+0.158
Neg logit contributions
 not-0.292
lf-0.277
 Brown-0.274
 Zig-0.263
 Max-0.262
Token,
Token ID11
Feature activation+8.68e-03

Pos logit contributions
ited+0.033
able+0.031
cy+0.030
ok+0.030
yes+0.028
Neg logit contributions
 not-0.018
lf-0.017
 driving-0.017
opes-0.016
aurus-0.016
Token I
Token ID314
Feature activation+2.03e-01

Pos logit contributions
ited+0.019
able+0.018
cy+0.017
ok+0.017
Ok+0.016
Neg logit contributions
 not-0.012
lf-0.012
opes-0.012
 driving-0.011
 pretending-0.011
Token am
Token ID716
Feature activation+3.15e-02

Pos logit contributions
ited+0.392
able+0.352
ok+0.345
cy+0.331
Ok+0.315
Neg logit contributions
 not-0.373
lf-0.369
 Brown-0.359
 driving-0.354
 Alf-0.352
Token Tim
Token ID5045
Feature activation-7.67e-03

Pos logit contributions
ited+0.077
able+0.072
ok+0.070
cy+0.069
 single+0.068
Neg logit contributions
lf-0.037
 not-0.036
 driving-0.034
 Brown-0.034
 Alf-0.033
Token,"
Token ID553
Feature activation+3.63e-02

Pos logit contributions
 not+0.010
lf+0.010
 Brown+0.010
 driving+0.009
 Alf+0.009
Neg logit contributions
ited-0.019
able-0.017
ok-0.017
cy-0.016
Ok-0.016
Token said
Token ID531
Feature activation-7.03e-02

Pos logit contributions
ited+0.093
able+0.089
ok+0.086
cy+0.085
yes+0.084
Neg logit contributions
lf-0.037
 not-0.036
opes-0.035
 driving-0.035
aurus-0.033
Token the
Token ID262
Feature activation+4.63e-02

Pos logit contributions
 not+0.072
lf+0.072
opes+0.068
 driving+0.066
 pretending+0.061
Neg logit contributions
ited-0.180
able-0.172
ok-0.167
cy-0.166
 single-0.162
Token smiled
Token ID13541
Feature activation-2.01e-03

Pos logit contributions
ited+0.186
able+0.168
ok+0.166
cy+0.160
Ok+0.155
Neg logit contributions
 not-0.117
lf-0.110
 Brown-0.108
 driving-0.108
 Alf-0.106
Token to
Token ID284
Feature activation+2.34e-02

Pos logit contributions
 not+0.002
 Brown+0.002
 driving+0.002
lf+0.002
 Alf+0.002
Neg logit contributions
ited-0.006
able-0.005
ok-0.005
cy-0.005
Ok-0.005
Token.
Token ID13
Feature activation+7.84e-03

Pos logit contributions
ited+0.056
able+0.051
ok+0.051
cy+0.048
Ok+0.047
Neg logit contributions
 not-0.034
 Brown-0.031
lf-0.031
 Alf-0.031
 driving-0.030
Token She
Token ID1375
Feature activation-4.62e-03

Pos logit contributions
ited+0.022
able+0.020
ok+0.019
cy+0.018
eful+0.018
Neg logit contributions
 not-0.009
 Brown-0.009
 Alf-0.008
 Zig-0.008
 Max-0.008
Token said
Token ID531
Feature activation-8.15e-02

Pos logit contributions
 not+0.005
lf+0.005
 Brown+0.005
 driving+0.004
 Alf+0.004
Neg logit contributions
ited-0.012
able-0.011
ok-0.011
cy-0.011
Ok-0.011
Token I
Token ID314
Feature activation+2.02e-01

Pos logit contributions
 not+0.081
 Brown+0.067
lf+0.064
 driving+0.062
 Alf+0.062
Neg logit contributions
ited-0.246
able-0.223
ok-0.219
cy-0.217
 single-0.204
Token had
Token ID550
Feature activation+3.99e-03

Pos logit contributions
ited+0.489
able+0.447
ok+0.443
cy+0.426
eful+0.418
Neg logit contributions
 not-0.285
lf-0.234
 driving-0.234
 Brown-0.234
 Alf-0.230
Token done
Token ID1760
Feature activation-5.89e-02

Pos logit contributions
ited+0.011
able+0.010
ok+0.010
cy+0.010
Ok+0.010
Neg logit contributions
 not-0.004
lf-0.004
 driving-0.004
 Brown-0.004
 Alf-0.003
Token a
Token ID257
Feature activation-6.00e-02

Pos logit contributions
 not+0.066
lf+0.061
 Brown+0.058
 driving+0.058
 Alf+0.056
Neg logit contributions
ited-0.159
able-0.146
ok-0.145
cy-0.141
Ok-0.137
Token great
Token ID1049
Feature activation+6.89e-02

Pos logit contributions
 not+0.075
 driving+0.057
lf+0.055
 drawing+0.055
 pretending+0.054
Neg logit contributions
ited-0.161
ok-0.152
able-0.148
eful-0.143
cy-0.143
Token job
Token ID1693
Feature activation-4.14e-03

Pos logit contributions
ited+0.179
able+0.161
cy+0.157
ok+0.156
sie+0.152
Neg logit contributions
 not-0.089
lf-0.081
 driving-0.077
 pretending-0.071
 Brown-0.070
Token says
Token ID1139
Feature activation-6.06e-02

Pos logit contributions
lf+0.033
 not+0.031
 driving+0.031
 Brown+0.030
 Alf+0.030
Neg logit contributions
ited-0.062
able-0.057
ok-0.056
cy-0.054
Ok-0.053
Token.
Token ID13
Feature activation+2.33e-02

Pos logit contributions
 not+0.066
lf+0.059
 Brown+0.058
 driving+0.057
 Alf+0.055
Neg logit contributions
ited-0.167
able-0.153
ok-0.151
cy-0.147
Ok-0.143
Token "
Token ID366
Feature activation+1.08e-01

Pos logit contributions
ited+0.065
able+0.060
ok+0.059
cy+0.057
Ok+0.056
Neg logit contributions
 not-0.024
lf-0.022
 Brown-0.022
 driving-0.021
 Alf-0.021
TokenI
Token ID40
Feature activation+1.74e-01

Pos logit contributions
ited+0.214
able+0.164
sie+0.163
cy+0.162
ok+0.160
Neg logit contributions
 not-0.220
 driving-0.191
 Brown-0.187
 having-0.185
 cars-0.184
Token said
Token ID531
Feature activation-8.93e-02

Pos logit contributions
ited+0.386
able+0.351
ok+0.345
cy+0.331
Ok+0.321
Neg logit contributions
 not-0.271
lf-0.265
 Brown-0.256
 driving-0.255
 Alf-0.251
Token I
Token ID314
Feature activation+2.02e-01

Pos logit contributions
lf+0.127
 not+0.126
 Brown+0.120
 Alf+0.119
 driving+0.119
Neg logit contributions
ited-0.200
able-0.184
ok-0.183
cy-0.173
Ok-0.171
Token was
Token ID373
Feature activation+8.19e-03

Pos logit contributions
ited+0.424
able+0.394
ok+0.377
cy+0.373
Ok+0.349
Neg logit contributions
lf-0.342
 Brown-0.317
 not-0.316
 driving-0.313
 Alf-0.313
Token sorry
Token ID7926
Feature activation+8.06e-05

Pos logit contributions
ited+0.019
able+0.018
ok+0.017
cy+0.016
 single+0.016
Neg logit contributions
lf-0.011
 Alf-0.010
 Brown-0.010
 not-0.010
 Zig-0.010
Token.
Token ID13
Feature activation-1.13e-02

Pos logit contributions
ited+0.000
able+0.000
ok+0.000
cy+0.000
Ok+0.000
Neg logit contributions
 not-0.000
lf-0.000
 Brown-0.000
 driving-0.000
 Alf-0.000
Token I
Token ID314
Feature activation+1.68e-01

Pos logit contributions
 not+0.019
 Brown+0.018
 Zig+0.017
 Alf+0.017
 Max+0.017
Neg logit contributions
ited-0.028
able-0.023
ok-0.023
cy-0.022
 single-0.021
Token want
Token ID765
Feature activation+5.12e-02

Pos logit contributions
ited+0.410
able+0.376
ok+0.371
cy+0.357
Ok+0.348
Neg logit contributions
 not-0.224
lf-0.214
 Brown-0.208
 driving-0.206
 Alf-0.203
Token
Token ID198
Feature activation+1.61e-02

Pos logit contributions
ited+0.039
able+0.036
ok+0.035
cy+0.034
Ok+0.033
Neg logit contributions
 not-0.020
 Brown-0.019
lf-0.019
 Alf-0.018
 driving-0.018
Token
Token ID198
Feature activation+2.21e-02

Pos logit contributions
ited+0.046
able+0.041
ok+0.041
cy+0.040
Ok+0.039
Neg logit contributions
 not-0.017
lf-0.015
 Brown-0.015
 Alf-0.014
 driving-0.014
Token"
Token ID1
Feature activation+1.28e-01

Pos logit contributions
ited+0.047
able+0.040
cy+0.040
ok+0.040
mer+0.038
Neg logit contributions
 not-0.038
lf-0.037
 Zig-0.034
 driving-0.033
 Brown-0.033
TokenLook
Token ID8567
Feature activation+5.18e-02

Pos logit contributions
ited+0.220
able+0.185
 single+0.173
cy+0.171
ok+0.171
Neg logit contributions
 not-0.264
 Brown-0.248
lf-0.247
 Max-0.239
 Wh-0.236
Token,
Token ID11
Feature activation+4.88e-03

Pos logit contributions
ited+0.112
cy+0.106
able+0.101
ok+0.101
eful+0.097
Neg logit contributions
lf-0.072
 driving-0.071
 not-0.068
opes-0.066
aurus-0.065
Token I
Token ID314
Feature activation+2.01e-01

Pos logit contributions
ited+0.010
cy+0.010
able+0.010
ok+0.009
yes+0.009
Neg logit contributions
opes-0.006
lf-0.006
 driving-0.006
 not-0.005
anks-0.005
Token made
Token ID925
Feature activation+1.66e-02

Pos logit contributions
ited+0.461
able+0.420
ok+0.412
cy+0.407
 single+0.382
Neg logit contributions
lf-0.302
 Brown-0.290
 not-0.290
 driving-0.280
 Zig-0.280
Token a
Token ID257
Feature activation-5.49e-02

Pos logit contributions
ited+0.035
able+0.034
ok+0.034
 single+0.033
eful+0.032
Neg logit contributions
lf-0.022
aurus-0.019
 not-0.019
 driving-0.019
opes-0.018
Token hat
Token ID6877
Feature activation-1.79e-02

Pos logit contributions
 not+0.070
lf+0.066
 Brown+0.064
 driving+0.063
 Alf+0.063
Neg logit contributions
ited-0.137
able-0.126
ok-0.124
cy-0.120
Ok-0.117
Token!"
Token ID2474
Feature activation+5.57e-02

Pos logit contributions
 Zig+0.029
opes+0.028
lf+0.028
 Alf+0.028
 not+0.028
Neg logit contributions
ited-0.034
ok-0.032
able-0.031
cy-0.030
yes-0.029
Token Tom
Token ID4186
Feature activation-7.10e-03

Pos logit contributions
ited+0.135
able+0.127
ok+0.123
cy+0.121
Ok+0.119
Neg logit contributions
lf-0.064
 driving-0.063
 not-0.062
opes-0.060
 pretending-0.056
Token
Token ID198
Feature activation+1.61e-02

Pos logit contributions
ited+0.039
able+0.036
ok+0.035
cy+0.034
Ok+0.033
Neg logit contributions
 not-0.020
 Brown-0.019
lf-0.019
 Alf-0.018
 driving-0.018
Token
Token ID198
Feature activation+2.21e-02

Pos logit contributions
ited+0.046
able+0.041
ok+0.041
cy+0.040
Ok+0.039
Neg logit contributions
 not-0.017
lf-0.015
 Brown-0.015
 Alf-0.014
 driving-0.014
Token"
Token ID1
Feature activation+1.28e-01

Pos logit contributions
ited+0.047
able+0.040
cy+0.040
ok+0.040
mer+0.038
Neg logit contributions
 not-0.038
lf-0.037
 Zig-0.034
 driving-0.033
 Brown-0.033
TokenLook
Token ID8567
Feature activation+5.18e-02

Pos logit contributions
ited+0.220
able+0.185
 single+0.173
cy+0.171
ok+0.171
Neg logit contributions
 not-0.264
 Brown-0.248
lf-0.247
 Max-0.239
 Wh-0.236
Token,
Token ID11
Feature activation+4.88e-03

Pos logit contributions
ited+0.112
cy+0.106
able+0.101
ok+0.101
eful+0.097
Neg logit contributions
lf-0.072
 driving-0.071
 not-0.068
opes-0.066
aurus-0.065
Token I
Token ID314
Feature activation+2.01e-01

Pos logit contributions
ited+0.010
cy+0.010
able+0.010
ok+0.009
yes+0.009
Neg logit contributions
opes-0.006
lf-0.006
 driving-0.006
 not-0.005
anks-0.005
Token made
Token ID925
Feature activation+1.66e-02

Pos logit contributions
ited+0.461
able+0.420
ok+0.412
cy+0.407
 single+0.382
Neg logit contributions
lf-0.302
 Brown-0.290
 not-0.290
 driving-0.280
 Zig-0.280
Token a
Token ID257
Feature activation-5.49e-02

Pos logit contributions
ited+0.035
able+0.034
ok+0.034
 single+0.033
eful+0.032
Neg logit contributions
lf-0.022
aurus-0.019
 not-0.019
 driving-0.019
opes-0.018
Token big
Token ID1263
Feature activation+4.53e-02

Pos logit contributions
 not+0.070
lf+0.066
 Brown+0.064
 driving+0.063
 Alf+0.063
Neg logit contributions
ited-0.137
able-0.126
ok-0.124
cy-0.120
Ok-0.117
Token loop
Token ID9052
Feature activation+1.21e-02

Pos logit contributions
ited+0.113
able+0.103
ok+0.102
cy+0.098
Ok+0.096
Neg logit contributions
 not-0.058
lf-0.054
 Brown-0.053
 driving-0.052
 Alf-0.052
Token with
Token ID351
Feature activation+3.19e-02

Pos logit contributions
ited+0.028
able+0.025
ok+0.025
cy+0.024
 single+0.023
Neg logit contributions
 not-0.018
lf-0.017
 Brown-0.017
 Alf-0.016
 driving-0.016
Token said
Token ID531
Feature activation-4.22e-02

Pos logit contributions
ited+0.146
able+0.133
ok+0.131
cy+0.126
Ok+0.123
Neg logit contributions
 not-0.085
lf-0.081
 Brown-0.079
 driving-0.078
 Alf-0.077
Token,
Token ID11
Feature activation+3.16e-02

Pos logit contributions
 not+0.041
lf+0.041
 Brown+0.038
 driving+0.038
 Alf+0.037
Neg logit contributions
ited-0.116
able-0.107
ok-0.107
cy-0.102
Ok-0.101
Token "
Token ID366
Feature activation+1.27e-01

Pos logit contributions
ited+0.084
ok+0.078
able+0.077
cy+0.075
Ok+0.074
Neg logit contributions
 not-0.032
lf-0.032
 Brown-0.029
 Alf-0.029
 driving-0.029
TokenLook
Token ID8567
Feature activation+5.58e-02

Pos logit contributions
ited+0.197
sie+0.173
rous+0.164
 single+0.163
eful+0.159
Neg logit contributions
 not-0.244
 Max-0.231
 he-0.227
 Tom-0.227
 Dino-0.222
Token,
Token ID11
Feature activation+5.17e-03

Pos logit contributions
ited+0.129
cy+0.121
ok+0.117
able+0.116
eful+0.111
Neg logit contributions
lf-0.072
 driving-0.071
 not-0.068
 Brown-0.066
opes-0.065
Token I
Token ID314
Feature activation+2.01e-01

Pos logit contributions
ited+0.011
able+0.010
cy+0.010
ok+0.010
yes+0.010
Neg logit contributions
opes-0.007
lf-0.007
 driving-0.006
 Alf-0.006
 not-0.006
Token have
Token ID423
Feature activation+4.19e-02

Pos logit contributions
ited+0.465
able+0.423
ok+0.417
cy+0.405
Ok+0.388
Neg logit contributions
 not-0.294
lf-0.292
 Brown-0.282
 driving-0.275
 Alf-0.274
Token some
Token ID617
Feature activation+3.73e-02

Pos logit contributions
ited+0.118
able+0.110
ok+0.108
cy+0.105
Ok+0.103
Neg logit contributions
 not-0.040
lf-0.037
 Brown-0.036
 driving-0.035
 Alf-0.035
Token insects
Token ID19435
Feature activation-5.59e-02

Pos logit contributions
ited+0.096
ok+0.087
able+0.085
cy+0.084
Ok+0.082
Neg logit contributions
 not-0.048
 Brown-0.044
lf-0.044
 driving-0.043
 Alf-0.041
Token in
Token ID287
Feature activation+2.96e-02

Pos logit contributions
 not+0.067
lf+0.062
 Brown+0.060
 driving+0.060
 Alf+0.058
Neg logit contributions
ited-0.145
able-0.132
ok-0.131
cy-0.126
Ok-0.125
Token my
Token ID616
Feature activation+4.24e-02

Pos logit contributions
ited+0.082
ok+0.076
Ok+0.073
able+0.072
Okay+0.071
Neg logit contributions
 not-0.035
lf-0.031
 Brown-0.030
 driving-0.029
 Alf-0.029
Token I
Token ID314
Feature activation+1.82e-01

Pos logit contributions
 not+0.006
 Brown+0.006
 Alf+0.005
 Zig+0.005
 Max+0.005
Neg logit contributions
ited-0.011
able-0.009
ok-0.009
cy-0.009
 single-0.009
Token'm
Token ID1101
Feature activation+3.14e-02

Pos logit contributions
ited+0.395
ok+0.356
able+0.352
cy+0.332
Ok+0.330
Neg logit contributions
 not-0.305
 Brown-0.267
 driving-0.259
 Alf-0.258
 Zig-0.255
Token sorry
Token ID7926
Feature activation-2.01e-02

Pos logit contributions
ited+0.075
able+0.068
ok+0.068
cy+0.065
Ok+0.064
Neg logit contributions
 not-0.043
lf-0.042
 Brown-0.040
 driving-0.040
 Alf-0.039
Token I
Token ID314
Feature activation+1.76e-01

Pos logit contributions
lf+0.026
 not+0.025
 driving+0.024
 Alf+0.024
 Brown+0.024
Neg logit contributions
ited-0.047
ok-0.044
able-0.044
cy-0.042
 single-0.041
Token said
Token ID531
Feature activation-9.47e-02

Pos logit contributions
ited+0.424
able+0.388
ok+0.381
cy+0.370
Ok+0.357
Neg logit contributions
lf-0.247
 not-0.230
 driving-0.226
 Alf-0.225
 Brown-0.225
Token I
Token ID314
Feature activation+2.01e-01

Pos logit contributions
 not+0.145
lf+0.136
 Brown+0.134
 driving+0.132
 Alf+0.131
Neg logit contributions
ited-0.216
able-0.195
ok-0.192
cy-0.185
Ok-0.179
Token don
Token ID836
Feature activation+5.94e-02

Pos logit contributions
ited+0.428
able+0.390
ok+0.380
cy+0.369
Ok+0.355
Neg logit contributions
lf-0.342
 driving-0.314
 Alf-0.313
 Brown-0.312
opes-0.307
Token't
Token ID470
Feature activation+1.65e-03

Pos logit contributions
ited+0.152
ok+0.150
able+0.144
 single+0.139
cy+0.135
Neg logit contributions
 Alf-0.057
lf-0.056
 Zig-0.055
 Brown-0.054
 not-0.053
Token like
Token ID588
Feature activation-2.26e-02

Pos logit contributions
ited+0.004
able+0.003
ok+0.003
cy+0.003
Ok+0.003
Neg logit contributions
 not-0.003
 Brown-0.003
 driving-0.003
lf-0.002
 Alf-0.002
Token you
Token ID345
Feature activation+6.34e-02

Pos logit contributions
 not+0.029
lf+0.028
 Brown+0.027
 Alf+0.027
 driving+0.027
Neg logit contributions
ited-0.055
able-0.051
ok-0.050
cy-0.048
 single-0.047
Token.
Token ID13
Feature activation-1.99e-02

Pos logit contributions
ited+0.167
able+0.153
ok+0.152
cy+0.146
Ok+0.143
Neg logit contributions
 not-0.074
lf-0.069
 Brown-0.067
 driving-0.066
 Alf-0.065
Token
Token ID198
Feature activation+4.12e-03

Pos logit contributions
ited+0.043
ok+0.040
able+0.040
cy+0.038
Ok+0.038
Neg logit contributions
 not-0.016
lf-0.016
 driving-0.015
 Brown-0.014
 Alf-0.014
Token
Token ID198
Feature activation+1.30e-02

Pos logit contributions
ited+0.011
able+0.010
ok+0.010
cy+0.010
Ok+0.010
Neg logit contributions
 not-0.005
lf-0.004
 Brown-0.004
 driving-0.004
 Alf-0.004
Token"
Token ID1
Feature activation+1.18e-01

Pos logit contributions
ited+0.029
able+0.025
ok+0.024
cy+0.024
sie+0.024
Neg logit contributions
 not-0.023
lf-0.021
 driving-0.019
 Zig-0.019
 Brown-0.019
TokenMom
Token ID29252
Feature activation+1.03e-01

Pos logit contributions
ited+0.210
able+0.172
ok+0.170
cy+0.165
 single+0.155
Neg logit contributions
 not-0.258
lf-0.238
 driving-0.235
 Brown-0.234
 Zig-0.228
Token says
Token ID1139
Feature activation-6.31e-02

Pos logit contributions
ited+0.250
able+0.227
ok+0.225
cy+0.222
use+0.206
Neg logit contributions
lf-0.129
 not-0.126
 Zig-0.124
 Alf-0.121
 driving-0.115
Token I
Token ID314
Feature activation+2.00e-01

Pos logit contributions
lf+0.073
 Alf+0.069
opes+0.067
 not+0.067
aurus+0.066
Neg logit contributions
ited-0.147
able-0.141
ok-0.139
 single-0.134
cy-0.131
Token have
Token ID423
Feature activation+4.34e-02

Pos logit contributions
ited+0.394
able+0.380
ok+0.351
cy+0.351
use+0.326
Neg logit contributions
lf-0.359
 Alf-0.329
opes-0.326
 Brown-0.317
 Zig-0.315
Token a
Token ID257
Feature activation-2.13e-02

Pos logit contributions
ited+0.113
able+0.102
ok+0.102
cy+0.099
Ok+0.096
Neg logit contributions
 not-0.054
 Brown-0.049
lf-0.049
 driving-0.049
 Alf-0.047
Token bad
Token ID2089
Feature activation+5.37e-02

Pos logit contributions
 not+0.029
 Brown+0.027
 driving+0.027
lf+0.026
 Alf+0.025
Neg logit contributions
ited-0.054
ok-0.049
able-0.048
cy-0.046
Ok-0.045
Token ear
Token ID1027
Feature activation+8.32e-02

Pos logit contributions
ited+0.130
able+0.117
ok+0.116
cy+0.112
Ok+0.109
Neg logit contributions
 not-0.076
lf-0.071
 Brown-0.070
 driving-0.069
 Alf-0.067
Token.
Token ID13
Feature activation+2.74e-03

Pos logit contributions
ited+0.183
able+0.170
ok+0.169
cy+0.161
 single+0.157
Neg logit contributions
lf-0.120
 not-0.119
 Alf-0.114
 Zig-0.113
 Brown-0.112
Token
Token ID198
Feature activation+1.61e-02

Pos logit contributions
ited+0.039
able+0.036
ok+0.035
cy+0.034
Ok+0.033
Neg logit contributions
 not-0.020
 Brown-0.019
lf-0.019
 Alf-0.018
 driving-0.018
Token
Token ID198
Feature activation+2.21e-02

Pos logit contributions
ited+0.046
able+0.041
ok+0.041
cy+0.040
Ok+0.039
Neg logit contributions
 not-0.017
lf-0.015
 Brown-0.015
 Alf-0.014
 driving-0.014
Token"
Token ID1
Feature activation+1.28e-01

Pos logit contributions
ited+0.047
able+0.040
cy+0.040
ok+0.040
mer+0.038
Neg logit contributions
 not-0.038
lf-0.037
 Zig-0.034
 driving-0.033
 Brown-0.033
TokenHi
Token ID17250
Feature activation+8.46e-03

Pos logit contributions
ited+0.220
able+0.185
 single+0.173
cy+0.171
ok+0.171
Neg logit contributions
 not-0.264
 Brown-0.248
lf-0.247
 Max-0.239
 Wh-0.236
Token,
Token ID11
Feature activation+4.78e-03

Pos logit contributions
ited+0.021
able+0.020
cy+0.019
ok+0.019
 single+0.018
Neg logit contributions
 not-0.010
lf-0.010
 driving-0.009
 Alf-0.009
opes-0.009
Token I
Token ID314
Feature activation+2.00e-01

Pos logit contributions
ited+0.011
able+0.010
cy+0.010
ok+0.010
Ok+0.009
Neg logit contributions
 not-0.006
lf-0.006
opes-0.006
 driving-0.006
 Alf-0.006
Token'm
Token ID1101
Feature activation+5.13e-02

Pos logit contributions
ited+0.387
able+0.346
ok+0.339
cy+0.326
 single+0.310
Neg logit contributions
 not-0.368
lf-0.365
 Brown-0.355
 driving-0.350
 Alf-0.349
Token Ch
Token ID609
Feature activation-3.65e-02

Pos logit contributions
ited+0.127
able+0.119
ok+0.117
cy+0.114
 single+0.113
Neg logit contributions
lf-0.058
 not-0.055
 driving-0.052
 Brown-0.051
opes-0.051
Tokenir
Token ID343
Feature activation-4.03e-03

Pos logit contributions
 not+0.075
lf+0.073
 Brown+0.071
 driving+0.071
 Zig+0.070
Neg logit contributions
ited-0.063
able-0.056
ok-0.053
Ok-0.051
cy-0.050
Tokenpy
Token ID9078
Feature activation-9.42e-02

Pos logit contributions
 not+0.006
lf+0.005
 Brown+0.005
 driving+0.005
 Alf+0.005
Neg logit contributions
ited-0.009
able-0.009
ok-0.009
cy-0.008
Ok-0.008
Token!
Token ID0
Feature activation+7.77e-03

Pos logit contributions
 not+0.113
 Brown+0.106
lf+0.104
 driving+0.102
 Alf+0.100
Neg logit contributions
ited-0.242
ok-0.219
able-0.216
Ok-0.212
cy-0.207
Token"
Token ID1
Feature activation+1.33e-01

Pos logit contributions
ited+0.086
able+0.076
ok+0.075
cy+0.073
 single+0.070
Neg logit contributions
 not-0.050
lf-0.047
 Brown-0.045
 driving-0.044
 Alf-0.044
TokenLook
Token ID8567
Feature activation+6.10e-02

Pos logit contributions
ited+0.210
able+0.171
ok+0.165
 single+0.161
cy+0.157
Neg logit contributions
 not-0.300
lf-0.282
 Brown-0.278
 Max-0.267
 Zig-0.266
Token,
Token ID11
Feature activation+1.35e-02

Pos logit contributions
ited+0.137
cy+0.128
able+0.125
ok+0.125
eful+0.119
Neg logit contributions
lf-0.081
 driving-0.080
 not-0.077
 Zig-0.073
 Brown-0.073
Token Lily
Token ID20037
Feature activation+3.97e-02

Pos logit contributions
ited+0.029
cy+0.028
able+0.028
ok+0.026
sie+0.026
Neg logit contributions
opes-0.017
lf-0.017
 driving-0.017
 not-0.016
 pretending-0.015
Token,
Token ID11
Feature activation+2.09e-02

Pos logit contributions
ited+0.099
able+0.094
ok+0.092
cy+0.091
Ok+0.084
Neg logit contributions
lf-0.048
 not-0.047
 driving-0.044
 Zig-0.044
 Alf-0.043
Token I
Token ID314
Feature activation+2.00e-01

Pos logit contributions
ited+0.041
able+0.038
ok+0.038
cy+0.037
Ok+0.035
Neg logit contributions
lf-0.032
opes-0.031
 driving-0.030
 not-0.030
 Alf-0.030
Token found
Token ID1043
Feature activation-2.35e-02

Pos logit contributions
ited+0.464
able+0.423
ok+0.417
cy+0.404
Ok+0.387
Neg logit contributions
 not-0.290
lf-0.286
 Brown-0.277
 driving-0.272
 Alf-0.269
Token some
Token ID617
Feature activation+4.04e-02

Pos logit contributions
 not+0.027
lf+0.027
 driving+0.025
 Brown+0.024
 Alf+0.024
Neg logit contributions
ited-0.059
able-0.056
ok-0.054
cy-0.053
 single-0.052
Token gum
Token ID27142
Feature activation-1.08e-02

Pos logit contributions
ited+0.101
ok+0.092
able+0.089
Ok+0.087
cy+0.087
Neg logit contributions
 not-0.054
lf-0.050
 Brown-0.049
 driving-0.048
 Alf-0.047
Token in
Token ID287
Feature activation+3.67e-02

Pos logit contributions
 not+0.014
lf+0.013
 Brown+0.013
 driving+0.013
 Alf+0.012
Neg logit contributions
ited-0.027
able-0.025
ok-0.024
cy-0.023
Ok-0.023
Token the
Token ID262
Feature activation+6.21e-03

Pos logit contributions
ited+0.091
ok+0.084
able+0.081
Ok+0.080
cy+0.078
Neg logit contributions
 not-0.052
lf-0.048
 Brown-0.047
 Alf-0.044
 driving-0.044
Token.
Token ID13
Feature activation+1.54e-02

Pos logit contributions
 not+0.008
lf+0.007
 Brown+0.007
 driving+0.007
 Alf+0.007
Neg logit contributions
ited-0.016
able-0.015
ok-0.015
cy-0.014
Ok-0.014
Token She
Token ID1375
Feature activation+9.71e-03

Pos logit contributions
ited+0.040
able+0.036
ok+0.036
cy+0.034
 single+0.033
Neg logit contributions
 not-0.020
 Brown-0.018
 Alf-0.018
lf-0.018
 Zig-0.017
Token laughed
Token ID13818
Feature activation-4.41e-02

Pos logit contributions
ited+0.022
able+0.020
ok+0.020
cy+0.019
Ok+0.018
Neg logit contributions
 not-0.014
lf-0.014
 Brown-0.014
 driving-0.013
 Alf-0.013
Token and
Token ID290
Feature activation+2.80e-02

Pos logit contributions
lf+0.057
 not+0.056
 driving+0.052
 Brown+0.052
 Alf+0.052
Neg logit contributions
ited-0.106
able-0.098
ok-0.098
cy-0.093
Ok-0.091
Token said
Token ID531
Feature activation-9.43e-02

Pos logit contributions
ited+0.068
able+0.062
ok+0.061
cy+0.059
Ok+0.058
Neg logit contributions
 not-0.037
lf-0.036
 Brown-0.034
 driving-0.034
 Alf-0.034
Token I
Token ID314
Feature activation+2.00e-01

Pos logit contributions
 not+0.095
lf+0.088
 Brown+0.085
 driving+0.084
 Alf+0.082
Neg logit contributions
ited-0.262
able-0.242
ok-0.240
cy-0.232
Ok-0.227
Token was
Token ID373
Feature activation-1.60e-03

Pos logit contributions
ited+0.457
able+0.415
ok+0.410
cy+0.393
Ok+0.383
Neg logit contributions
 not-0.298
lf-0.284
 Brown-0.278
 driving-0.275
 Alf-0.272
Token lucky
Token ID9670
Feature activation-4.42e-02

Pos logit contributions
lf+0.002
 not+0.002
 Brown+0.002
 Alf+0.002
 driving+0.002
Neg logit contributions
ited-0.004
able-0.004
ok-0.004
cy-0.004
 single-0.003
Token.
Token ID13
Feature activation-6.63e-03

Pos logit contributions
 not+0.060
 Brown+0.055
lf+0.054
 driving+0.054
 Alf+0.052
Neg logit contributions
ited-0.111
able-0.100
ok-0.098
cy-0.095
Ok-0.093
Token She
Token ID1375
Feature activation-8.71e-03

Pos logit contributions
 not+0.010
 Brown+0.010
 Alf+0.009
 Max+0.009
 Zig+0.009
Neg logit contributions
ited-0.017
able-0.015
ok-0.014
cy-0.013
 single-0.013
Token said
Token ID531
Feature activation-9.35e-02

Pos logit contributions
 not+0.010
lf+0.010
 Brown+0.009
 driving+0.009
 Alf+0.009
Neg logit contributions
ited-0.023
able-0.021
ok-0.020
cy-0.020
Ok-0.019
Token
Token ID198
Feature activation+3.51e-02

Pos logit contributions
ited+0.121
able+0.111
ok+0.110
cy+0.105
Ok+0.103
Neg logit contributions
 not-0.067
lf-0.064
 Brown-0.062
 driving-0.061
 Alf-0.061
Token
Token ID198
Feature activation+3.28e-02

Pos logit contributions
ited+0.086
able+0.076
ok+0.075
cy+0.073
 single+0.070
Neg logit contributions
 not-0.050
lf-0.047
 Brown-0.045
 driving-0.044
 Alf-0.044
Token"
Token ID1
Feature activation+1.32e-01

Pos logit contributions
ited+0.067
able+0.059
ok+0.057
cy+0.056
 single+0.054
Neg logit contributions
 not-0.059
lf-0.057
 Brown-0.053
 Zig-0.053
 Alf-0.053
TokenMom
Token ID29252
Feature activation+1.16e-01

Pos logit contributions
ited+0.210
able+0.176
ok+0.164
 single+0.162
cy+0.158
Neg logit contributions
 not-0.292
lf-0.277
 Brown-0.274
 Zig-0.263
 Max-0.262
Token,
Token ID11
Feature activation+1.45e-02

Pos logit contributions
ited+0.301
able+0.276
ok+0.274
cy+0.264
Ok+0.258
Neg logit contributions
 not-0.137
lf-0.129
 Brown-0.125
 driving-0.123
 Alf-0.121
Token I
Token ID314
Feature activation+1.99e-01

Pos logit contributions
ited+0.032
able+0.030
ok+0.029
cy+0.029
Ok+0.028
Neg logit contributions
lf-0.020
 not-0.020
 driving-0.019
opes-0.019
 Zig-0.019
Token made
Token ID925
Feature activation+1.88e-02

Pos logit contributions
ited+0.443
able+0.401
ok+0.400
cy+0.378
Ok+0.371
Neg logit contributions
 not-0.314
lf-0.286
 Brown-0.285
 driving-0.283
 Alf-0.277
Token it
Token ID340
Feature activation+5.46e-02

Pos logit contributions
ited+0.042
ok+0.040
able+0.040
 single+0.037
cy+0.037
Neg logit contributions
lf-0.024
 not-0.023
aurus-0.022
 Zig-0.022
 driving-0.022
Token!
Token ID0
Feature activation+6.46e-03

Pos logit contributions
ited+0.118
able+0.108
ok+0.108
cy+0.104
 single+0.103
Neg logit contributions
lf-0.077
 Alf-0.075
 Zig-0.075
 Brown-0.073
 not-0.072
Token I
Token ID314
Feature activation+1.71e-01

Pos logit contributions
ited+0.015
able+0.013
ok+0.013
cy+0.012
eful+0.012
Neg logit contributions
 not-0.011
 Brown-0.010
 Alf-0.010
 driving-0.010
lf-0.010
Token did
Token ID750
Feature activation-5.69e-02

Pos logit contributions
ited+0.409
ok+0.374
able+0.373
cy+0.354
Ok+0.353
Neg logit contributions
 not-0.237
lf-0.208
 Brown-0.208
 driving-0.207
 Alf-0.202
Token\n
Token ID198
Feature activation+7.18e-03

Pos logit contributions
ited+0.016
able+0.015
ok+0.015
cy+0.014
Ok+0.014
Neg logit contributions
 not-0.008
 Brown-0.007
lf-0.007
 Alf-0.007
 driving-0.007
Token\n
Token ID198
Feature activation+1.50e-02

Pos logit contributions
ited+0.020
able+0.018
ok+0.018
cy+0.018
Ok+0.017
Neg logit contributions
 not-0.008
lf-0.007
 Brown-0.007
 Alf-0.006
 Zig-0.006
Token"
Token ID1
Feature activation+1.25e-01

Pos logit contributions
ited+0.033
able+0.028
cy+0.028
ok+0.027
mer+0.026
Neg logit contributions
 not-0.026
lf-0.025
 Zig-0.023
 driving-0.022
 Max-0.022
TokenLook
Token ID8567
Feature activation+5.01e-02

Pos logit contributions
ited+0.211
able+0.184
 single+0.168
ok+0.167
rous+0.166
Neg logit contributions
 not-0.248
lf-0.246
 Brown-0.241
 driving-0.233
 Zig-0.231
Token,
Token ID11
Feature activation+3.05e-04

Pos logit contributions
ited+0.107
cy+0.101
ok+0.096
able+0.095
eful+0.092
Neg logit contributions
lf-0.071
 driving-0.070
 not-0.067
opes-0.066
aurus-0.064
Token I
Token ID314
Feature activation+1.98e-01

Pos logit contributions
cy+0.001
ited+0.001
able+0.001
 single+0.001
yes+0.001
Neg logit contributions
opes-0.000
lf-0.000
 driving-0.000
anks-0.000
 not-0.000
Token found
Token ID1043
Feature activation-2.65e-02

Pos logit contributions
ited+0.451
able+0.410
ok+0.404
cy+0.398
 single+0.375
Neg logit contributions
lf-0.297
 Brown-0.289
 not-0.285
 Zig-0.280
 Alf-0.279
Token a
Token ID257
Feature activation-3.43e-02

Pos logit contributions
lf+0.033
 not+0.032
 driving+0.030
 Brown+0.029
 Alf+0.028
Neg logit contributions
ited-0.064
able-0.061
ok-0.059
cy-0.058
 single-0.057
Token stick
Token ID4859
Feature activation+4.77e-02

Pos logit contributions
 not+0.045
 Brown+0.041
lf+0.041
 driving+0.040
 Alf+0.038
Neg logit contributions
ited-0.090
able-0.081
ok-0.081
cy-0.076
Ok-0.075
Token!"
Token ID2474
Feature activation+5.61e-02

Pos logit contributions
ited+0.104
able+0.095
ok+0.094
cy+0.089
 single+0.087
Neg logit contributions
 not-0.073
lf-0.071
 Brown-0.070
 Alf-0.069
 Zig-0.068
Token Tom
Token ID4186
Feature activation-6.42e-03

Pos logit contributions
ited+0.137
able+0.129
ok+0.126
cy+0.123
Ok+0.120
Neg logit contributions
lf-0.065
 driving-0.063
 not-0.063
opes-0.060
 Brown-0.057
Token said
Token ID531
Feature activation-7.85e-02

Pos logit contributions
ited+0.128
ok+0.116
able+0.116
cy+0.111
Ok+0.109
Neg logit contributions
 not-0.062
lf-0.057
 Brown-0.056
 driving-0.055
 Alf-0.054
Token,
Token ID11
Feature activation+1.09e-02

Pos logit contributions
 not+0.080
lf+0.079
 Brown+0.075
 driving+0.073
 Alf+0.073
Neg logit contributions
ited-0.210
able-0.196
ok-0.194
cy-0.185
Ok-0.184
Token "
Token ID366
Feature activation+1.15e-01

Pos logit contributions
ited+0.028
ok+0.026
able+0.025
cy+0.025
Ok+0.025
Neg logit contributions
lf-0.012
 not-0.011
 Brown-0.011
 Alf-0.011
 driving-0.010
TokenLook
Token ID8567
Feature activation+4.93e-02

Pos logit contributions
ited+0.202
sie+0.175
 single+0.169
eful+0.161
able+0.161
Neg logit contributions
 not-0.201
 Max-0.191
 Tom-0.188
 Dino-0.185
lf-0.182
Token,
Token ID11
Feature activation+5.81e-04

Pos logit contributions
ited+0.113
cy+0.108
able+0.103
ok+0.103
eful+0.099
Neg logit contributions
lf-0.063
 driving-0.062
 not-0.059
opes-0.057
 Brown-0.056
Token I
Token ID314
Feature activation+1.98e-01

Pos logit contributions
ited+0.001
able+0.001
cy+0.001
ok+0.001
Ok+0.001
Neg logit contributions
lf-0.001
opes-0.001
 driving-0.001
 Alf-0.001
 not-0.001
Token found
Token ID1043
Feature activation-2.89e-02

Pos logit contributions
ited+0.463
able+0.423
ok+0.417
cy+0.404
Ok+0.389
Neg logit contributions
 not-0.283
lf-0.277
 Brown-0.269
 driving-0.263
 Alf-0.263
Token a
Token ID257
Feature activation-3.45e-02

Pos logit contributions
lf+0.036
 not+0.036
 driving+0.033
 Alf+0.032
 Brown+0.032
Neg logit contributions
ited-0.069
able-0.066
ok-0.064
cy-0.062
 single-0.061
Token ball
Token ID2613
Feature activation+3.42e-02

Pos logit contributions
 not+0.044
 Brown+0.041
lf+0.041
 driving+0.040
 Alf+0.038
Neg logit contributions
ited-0.090
ok-0.082
able-0.081
cy-0.077
Ok-0.076
Token.
Token ID13
Feature activation-1.26e-02

Pos logit contributions
ited+0.081
able+0.075
ok+0.074
cy+0.071
 single+0.069
Neg logit contributions
 not-0.047
lf-0.044
 Brown-0.043
 Alf-0.043
 driving-0.043
Token It
Token ID632
Feature activation+1.54e-02

Pos logit contributions
 not+0.022
lf+0.020
 Brown+0.020
 driving+0.020
 Alf+0.020
Neg logit contributions
ited-0.026
able-0.023
ok-0.023
cy-0.022
Ok-0.021
Token"
Token ID1
Feature activation+1.32e-01

Pos logit contributions
ited+0.067
able+0.059
ok+0.057
cy+0.056
 single+0.054
Neg logit contributions
 not-0.059
lf-0.057
 Brown-0.053
 Zig-0.053
 Alf-0.053
TokenLook
Token ID8567
Feature activation+5.82e-02

Pos logit contributions
ited+0.210
able+0.176
ok+0.164
 single+0.162
cy+0.158
Neg logit contributions
 not-0.292
lf-0.277
 Brown-0.274
 Zig-0.263
 Max-0.262
Token,
Token ID11
Feature activation+9.82e-03

Pos logit contributions
ited+0.127
cy+0.119
able+0.115
ok+0.114
eful+0.109
Neg logit contributions
lf-0.080
 driving-0.078
 not-0.076
opes-0.073
aurus-0.071
Token Lily
Token ID20037
Feature activation+3.89e-02

Pos logit contributions
ited+0.021
cy+0.020
able+0.020
sie+0.019
ok+0.019
Neg logit contributions
opes-0.013
lf-0.012
 driving-0.012
 not-0.012
 pretending-0.011
Token,
Token ID11
Feature activation+1.79e-02

Pos logit contributions
ited+0.096
able+0.091
ok+0.089
cy+0.087
Ok+0.081
Neg logit contributions
lf-0.049
 not-0.048
 driving-0.045
 Alf-0.044
 Zig-0.044
Token I
Token ID314
Feature activation+1.98e-01

Pos logit contributions
ited+0.034
able+0.032
ok+0.031
cy+0.031
Ok+0.029
Neg logit contributions
lf-0.029
opes-0.028
 not-0.027
 driving-0.027
 Alf-0.027
Token see
Token ID766
Feature activation+6.98e-03

Pos logit contributions
ited+0.456
able+0.415
ok+0.409
cy+0.398
Ok+0.379
Neg logit contributions
 not-0.290
lf-0.289
 Brown-0.280
 driving-0.274
 Alf-0.271
Token a
Token ID257
Feature activation-3.62e-02

Pos logit contributions
ited+0.018
able+0.017
ok+0.017
cy+0.016
 single+0.016
Neg logit contributions
 not-0.008
lf-0.007
 Brown-0.007
 driving-0.007
 Alf-0.007
Token bird
Token ID6512
Feature activation-1.52e-02

Pos logit contributions
 not+0.047
lf+0.045
 Brown+0.044
 driving+0.043
 Alf+0.042
Neg logit contributions
ited-0.091
ok-0.083
able-0.083
cy-0.078
Ok-0.078
Token's
Token ID338
Feature activation+1.80e-02

Pos logit contributions
 not+0.022
lf+0.021
 Brown+0.020
 driving+0.020
 Alf+0.020
Neg logit contributions
ited-0.036
able-0.032
ok-0.032
cy-0.031
 single-0.030
Token nest
Token ID16343
Feature activation-1.71e-02

Pos logit contributions
ited+0.042
able+0.038
ok+0.037
cy+0.036
Ok+0.035
Neg logit contributions
 not-0.027
lf-0.025
 Brown-0.025
 driving-0.025
 Alf-0.024
Token"
Token ID1
Feature activation+1.32e-01

Pos logit contributions
ited+0.067
able+0.059
ok+0.057
cy+0.056
 single+0.054
Neg logit contributions
 not-0.059
lf-0.057
 Brown-0.053
 Zig-0.053
 Alf-0.053
TokenLook
Token ID8567
Feature activation+5.82e-02

Pos logit contributions
ited+0.210
able+0.176
ok+0.164
 single+0.162
cy+0.158
Neg logit contributions
 not-0.292
lf-0.277
 Brown-0.274
 Zig-0.263
 Max-0.262
Token,
Token ID11
Feature activation+9.82e-03

Pos logit contributions
ited+0.127
cy+0.119
able+0.115
ok+0.114
eful+0.109
Neg logit contributions
lf-0.080
 driving-0.078
 not-0.076
opes-0.073
aurus-0.071
Token Lily
Token ID20037
Feature activation+3.89e-02

Pos logit contributions
ited+0.021
cy+0.020
able+0.020
sie+0.019
ok+0.019
Neg logit contributions
opes-0.013
lf-0.012
 driving-0.012
 not-0.012
 pretending-0.011
Token,
Token ID11
Feature activation+1.79e-02

Pos logit contributions
ited+0.096
able+0.091
ok+0.089
cy+0.087
Ok+0.081
Neg logit contributions
lf-0.049
 not-0.048
 driving-0.045
 Alf-0.044
 Zig-0.044
Token I
Token ID314
Feature activation+1.98e-01

Pos logit contributions
ited+0.034
able+0.032
ok+0.031
cy+0.031
Ok+0.029
Neg logit contributions
lf-0.029
opes-0.028
 not-0.027
 driving-0.027
 Alf-0.027
Token found
Token ID1043
Feature activation-2.45e-02

Pos logit contributions
ited+0.456
able+0.415
ok+0.409
cy+0.398
Ok+0.379
Neg logit contributions
 not-0.290
lf-0.289
 Brown-0.280
 driving-0.274
 Alf-0.271
Token a
Token ID257
Feature activation-2.96e-02

Pos logit contributions
 not+0.029
lf+0.029
 driving+0.027
 Brown+0.026
 Alf+0.026
Neg logit contributions
ited-0.061
able-0.058
ok-0.056
cy-0.055
 single-0.053
Token coin
Token ID10752
Feature activation+2.35e-02

Pos logit contributions
 not+0.039
lf+0.036
 Brown+0.035
 driving+0.035
 Alf+0.034
Neg logit contributions
ited-0.076
ok-0.069
able-0.069
cy-0.065
Ok-0.064
Token!"
Token ID2474
Feature activation+6.06e-02

Pos logit contributions
ited+0.057
able+0.052
ok+0.052
cy+0.050
Ok+0.048
Neg logit contributions
 not-0.031
lf-0.030
 Brown-0.029
 driving-0.029
 Alf-0.029
Token Ben
Token ID3932
Feature activation+2.27e-02

Pos logit contributions
ited+0.146
able+0.138
ok+0.133
cy+0.132
yes+0.130
Neg logit contributions
lf-0.068
 driving-0.067
opes-0.065
 not-0.064
aurus-0.060
Token said
Token ID531
Feature activation-8.52e-02

Pos logit contributions
ited+0.004
able+0.004
ok+0.004
cy+0.004
Ok+0.004
Neg logit contributions
 not-0.002
lf-0.002
 Brown-0.002
 driving-0.002
 Alf-0.002
Token,
Token ID11
Feature activation+8.76e-03

Pos logit contributions
lf+0.091
 not+0.090
 driving+0.085
 Brown+0.084
 Alf+0.083
Neg logit contributions
ited-0.219
able-0.205
ok-0.202
Ok-0.194
cy-0.193
Token "
Token ID366
Feature activation+1.18e-01

Pos logit contributions
ited+0.022
ok+0.020
able+0.020
Ok+0.020
cy+0.019
Neg logit contributions
lf-0.010
 not-0.010
 Brown-0.009
 driving-0.009
 Alf-0.009
TokenLook
Token ID8567
Feature activation+4.87e-02

Pos logit contributions
ited+0.213
sie+0.184
able+0.183
away+0.174
 single+0.172
Neg logit contributions
 not-0.204
 Brown-0.198
 Wh-0.191
 Max-0.188
 Zig-0.187
Token,
Token ID11
Feature activation-1.62e-04

Pos logit contributions
ited+0.110
cy+0.104
ok+0.100
able+0.099
eful+0.096
Neg logit contributions
lf-0.063
 driving-0.062
 not-0.060
opes-0.058
 Brown-0.057
Token I
Token ID314
Feature activation+1.98e-01

Pos logit contributions
opes+0.000
lf+0.000
 driving+0.000
 not+0.000
 pretending+0.000
Neg logit contributions
ited-0.000
cy-0.000
able-0.000
ok-0.000
yes-0.000
Token found
Token ID1043
Feature activation-2.88e-02

Pos logit contributions
ited+0.462
able+0.421
ok+0.416
cy+0.401
Ok+0.388
Neg logit contributions
 not-0.285
lf-0.276
 Brown-0.268
 driving-0.264
 Alf-0.262
Token a
Token ID257
Feature activation-3.46e-02

Pos logit contributions
 not+0.034
lf+0.034
 driving+0.031
 Brown+0.030
 Alf+0.030
Neg logit contributions
ited-0.071
able-0.067
ok-0.065
cy-0.063
 single-0.062
Token pretty
Token ID2495
Feature activation+6.02e-02

Pos logit contributions
 not+0.045
lf+0.042
 Brown+0.042
 driving+0.041
 Alf+0.039
Neg logit contributions
ited-0.089
ok-0.081
able-0.080
cy-0.076
Ok-0.075
Token pe
Token ID613
Feature activation+4.64e-02

Pos logit contributions
ited+0.143
ok+0.129
able+0.129
Ok+0.121
cy+0.121
Neg logit contributions
 not-0.086
lf-0.083
 Brown-0.081
 driving-0.079
 Alf-0.077
Tokenb
Token ID65
Feature activation-2.72e-02

Pos logit contributions
ited+0.094
able+0.084
ok+0.083
cy+0.081
Ok+0.079
Neg logit contributions
 not-0.083
lf-0.078
 driving-0.076
 Alf-0.074
 Brown-0.073
Token mom
Token ID1995
Feature activation+5.32e-02

Pos logit contributions
lf+0.027
opes+0.027
 Zig+0.025
 Dino+0.024
 Zed+0.023
Neg logit contributions
ited-0.045
 single-0.045
cy-0.044
ok-0.042
able-0.042
Token!
Token ID0
Feature activation+8.68e-03

Pos logit contributions
ited+0.130
able+0.116
ok+0.116
cy+0.113
 single+0.107
Neg logit contributions
lf-0.068
 not-0.068
 Zig-0.065
 Alf-0.064
 Brown-0.063
Token She
Token ID1375
Feature activation-2.17e-02

Pos logit contributions
ited+0.025
able+0.022
ok+0.021
cy+0.020
eful+0.020
Neg logit contributions
 Brown-0.010
 not-0.010
 Alf-0.010
 Max-0.009
 Tim-0.009
Token said
Token ID531
Feature activation-9.06e-02

Pos logit contributions
 not+0.028
lf+0.026
 Brown+0.026
 driving+0.025
 Alf+0.025
Neg logit contributions
ited-0.054
able-0.050
ok-0.049
cy-0.047
Ok-0.046
Token,"
Token ID553
Feature activation+3.27e-02

Pos logit contributions
 not+0.092
lf+0.088
 driving+0.083
 Brown+0.083
 Alf+0.081
Neg logit contributions
ited-0.246
able-0.229
ok-0.226
cy-0.218
Ok-0.215
Token I
Token ID314
Feature activation+1.97e-01

Pos logit contributions
ited+0.067
able+0.056
cy+0.053
ok+0.052
 single+0.050
Neg logit contributions
 not-0.065
 Brown-0.061
 Max-0.059
 Tim-0.059
 Tom-0.057
Token saw
Token ID2497
Feature activation-3.08e-02

Pos logit contributions
ited+0.456
able+0.413
ok+0.408
cy+0.392
Ok+0.380
Neg logit contributions
 not-0.286
lf-0.279
 Brown-0.271
 driving-0.269
 Alf-0.265
Token your
Token ID534
Feature activation+4.31e-02

Pos logit contributions
 not+0.039
lf+0.038
 driving+0.036
 Brown+0.036
 Alf+0.035
Neg logit contributions
ited-0.076
able-0.070
ok-0.069
cy-0.067
Ok-0.065
Token ig
Token ID45329
Feature activation-5.29e-02

Pos logit contributions
ited+0.105
able+0.097
ok+0.096
cy+0.095
 single+0.092
Neg logit contributions
lf-0.053
 not-0.053
 Zig-0.049
 Brown-0.048
 driving-0.048
Tokenloo
Token ID29680
Feature activation-2.46e-02

Pos logit contributions
 not+0.084
lf+0.079
 Brown+0.079
 driving+0.077
 Alf+0.077
Neg logit contributions
ited-0.115
ok-0.105
able-0.104
cy-0.099
 single-0.095
Token and
Token ID290
Feature activation+4.70e-03

Pos logit contributions
 not+0.030
lf+0.029
 Brown+0.028
 driving+0.027
 Alf+0.027
Neg logit contributions
ited-0.063
able-0.058
ok-0.057
cy-0.055
Ok-0.054
Token from
Token ID422
Feature activation-9.67e-02

Pos logit contributions
ited+0.051
able+0.049
 single+0.048
 tighter+0.045
cy+0.044
Neg logit contributions
lf-0.044
 Alf-0.040
opes-0.040
 Zig-0.039
 driving-0.039
Token Anna
Token ID11735
Feature activation+5.11e-02

Pos logit contributions
 not+0.157
lf+0.152
 Brown+0.148
 Alf+0.147
 driving+0.146
Neg logit contributions
ited-0.202
able-0.183
ok-0.178
cy-0.172
 single-0.168
Token.
Token ID13
Feature activation-7.44e-04

Pos logit contributions
ited+0.122
able+0.115
ok+0.111
cy+0.109
 single+0.104
Neg logit contributions
 not-0.065
lf-0.064
 Alf-0.063
 Zig-0.062
 Brown-0.061
Token She
Token ID1375
Feature activation-9.79e-03

Pos logit contributions
 not+0.001
lf+0.001
 Brown+0.001
 driving+0.001
 Alf+0.001
Neg logit contributions
ited-0.002
able-0.001
ok-0.001
cy-0.001
Ok-0.001
Token says
Token ID1139
Feature activation-8.10e-02

Pos logit contributions
lf+0.017
 Zig+0.016
 Alf+0.016
 not+0.016
 driving+0.016
Neg logit contributions
ited-0.020
able-0.018
cy-0.017
ok-0.017
use-0.016
Token I
Token ID314
Feature activation+1.97e-01

Pos logit contributions
lf+0.113
 driving+0.109
 not+0.108
 Alf+0.104
opes+0.104
Neg logit contributions
ited-0.171
able-0.163
ok-0.158
 single-0.152
cy-0.152
Token am
Token ID716
Feature activation+2.96e-02

Pos logit contributions
ited+0.431
able+0.421
cy+0.398
ok+0.374
use+0.373
Neg logit contributions
lf-0.302
opes-0.283
 Alf-0.277
 Zig-0.275
 driving-0.272
Token her
Token ID607
Feature activation-1.00e-02

Pos logit contributions
ited+0.065
able+0.061
 single+0.060
ok+0.059
 forward+0.058
Neg logit contributions
lf-0.038
opes-0.036
 Zig-0.034
 Alf-0.034
aurus-0.033
Token best
Token ID1266
Feature activation+4.31e-02

Pos logit contributions
lf+0.015
 Zig+0.014
 Alf+0.014
 not+0.013
opes+0.013
Neg logit contributions
ited-0.021
able-0.021
ok-0.020
cy-0.019
 single-0.019
Token friend
Token ID1545
Feature activation-5.01e-02

Pos logit contributions
ited+0.096
able+0.093
ok+0.092
 single+0.088
cy+0.087
Neg logit contributions
 not-0.057
lf-0.057
 Zig-0.056
 Alf-0.055
 Brown-0.053
Token.
Token ID13
Feature activation-8.82e-03

Pos logit contributions
 not+0.059
lf+0.054
 Brown+0.053
 driving+0.052
 Alf+0.051
Neg logit contributions
ited-0.132
able-0.120
ok-0.120
cy-0.115
Ok-0.113
Token from
Token ID422
Feature activation-1.19e-01

Pos logit contributions
 not+0.181
 Alf+0.180
lf+0.177
 Zig+0.176
opes+0.175
Neg logit contributions
ited-0.363
able-0.354
ok-0.347
 single-0.331
cy-0.321
Token far
Token ID1290
Feature activation+6.53e-02

Pos logit contributions
 not+0.143
lf+0.136
 driving+0.132
 Brown+0.129
 Alf+0.128
Neg logit contributions
ited-0.295
able-0.276
cy-0.263
ok-0.261
 single-0.258
Tokenaway
Token ID8272
Feature activation-1.05e-02

Pos logit contributions
ited+0.119
ok+0.106
able+0.101
Ok+0.099
cy+0.095
Neg logit contributions
 not-0.128
 Brown-0.123
lf-0.123
 driving-0.120
 Alf-0.120
Token lands
Token ID8604
Feature activation-3.45e-02

Pos logit contributions
 not+0.018
lf+0.017
 Brown+0.016
 driving+0.016
 Alf+0.016
Neg logit contributions
ited-0.021
able-0.020
ok-0.018
cy-0.018
 single-0.018
Token and
Token ID290
Feature activation-1.99e-03

Pos logit contributions
 not+0.041
lf+0.039
 Brown+0.038
 driving+0.037
 Alf+0.037
Neg logit contributions
ited-0.088
able-0.082
ok-0.081
cy-0.078
Ok-0.076
Token tales
Token ID19490
Feature activation-2.25e-01

Pos logit contributions
 not+0.002
lf+0.002
aurus+0.002
 Brown+0.002
 Alf+0.002
Neg logit contributions
ited-0.005
able-0.004
ok-0.004
cy-0.004
 single-0.004
Token of
Token ID286
Feature activation-7.30e-02

Pos logit contributions
lf+0.300
 not+0.293
 Brown+0.288
 Zig+0.287
 Alf+0.287
Neg logit contributions
ited-0.504
able-0.484
ok-0.471
 single-0.449
cy-0.443
Token magical
Token ID10883
Feature activation-5.79e-02

Pos logit contributions
lf+0.088
 not+0.085
aurus+0.080
 driving+0.078
 Alf+0.078
Neg logit contributions
ited-0.164
able-0.162
 single-0.149
ok-0.142
yes-0.141
Token creatures
Token ID8109
Feature activation-1.38e-01

Pos logit contributions
 not+0.071
lf+0.064
 driving+0.062
 Brown+0.062
 Alf+0.061
Neg logit contributions
ited-0.147
able-0.139
ok-0.132
cy-0.129
 single-0.127
Token.
Token ID13
Feature activation-1.52e-02

Pos logit contributions
 not+0.175
lf+0.174
 Brown+0.165
 Alf+0.164
 Zig+0.162
Neg logit contributions
ited-0.327
able-0.311
ok-0.298
cy-0.287
 single-0.286
Token She
Token ID1375
Feature activation-2.93e-02

Pos logit contributions
 not+0.024
 Brown+0.023
 Alf+0.022
lf+0.022
 driving+0.022
Neg logit contributions
ited-0.034
able-0.031
ok-0.030
cy-0.029
 single-0.028
Token start
Token ID923
Feature activation+2.97e-02

Pos logit contributions
lf+0.055
 Brown+0.053
 not+0.053
 Zig+0.052
 Alf+0.052
Neg logit contributions
ited-0.082
ok-0.075
able-0.074
 single-0.072
cy-0.072
Token learning
Token ID4673
Feature activation-9.66e-02

Pos logit contributions
ited+0.077
able+0.072
ok+0.071
 single+0.069
cy+0.068
Neg logit contributions
lf-0.030
 not-0.030
 Brown-0.028
 Zig-0.028
 Alf-0.027
Token how
Token ID703
Feature activation-3.76e-02

Pos logit contributions
 not+0.126
lf+0.122
 Brown+0.118
 driving+0.116
 Alf+0.115
Neg logit contributions
ited-0.234
able-0.216
ok-0.212
cy-0.203
Ok-0.199
Token to
Token ID284
Feature activation-7.08e-02

Pos logit contributions
 not+0.048
 driving+0.041
 Brown+0.039
 having+0.037
 cars+0.037
Neg logit contributions
ited-0.106
ok-0.090
cy-0.088
able-0.087
Ok-0.086
Token make
Token ID787
Feature activation-3.92e-02

Pos logit contributions
 not+0.086
 Brown+0.086
lf+0.085
 Zig+0.084
 Alf+0.083
Neg logit contributions
ited-0.169
able-0.155
ok-0.155
cy-0.150
 single-0.146
Token models
Token ID4981
Feature activation-2.24e-01

Pos logit contributions
lf+0.054
 not+0.052
 Zig+0.051
 Brown+0.050
aurus+0.049
Neg logit contributions
ited-0.084
able-0.082
ok-0.076
 single-0.075
eful-0.074
Token just
Token ID655
Feature activation+6.19e-03

Pos logit contributions
 Zig+0.337
lf+0.335
 Brown+0.323
opes+0.313
 Alf+0.310
Neg logit contributions
ited-0.468
able-0.459
ok-0.444
cy-0.416
 single-0.410
Token like
Token ID588
Feature activation-4.51e-02

Pos logit contributions
ited+0.015
able+0.014
ok+0.014
cy+0.013
 single+0.013
Neg logit contributions
 not-0.008
lf-0.008
 Brown-0.007
 Alf-0.007
 driving-0.007
Token the
Token ID262
Feature activation-5.24e-02

Pos logit contributions
 not+0.055
lf+0.053
 Brown+0.051
 Alf+0.050
 Zig+0.049
Neg logit contributions
ited-0.110
able-0.102
ok-0.100
 single-0.096
cy-0.096
Token ones
Token ID3392
Feature activation-9.62e-03

Pos logit contributions
 not+0.074
lf+0.072
 Zig+0.070
 Brown+0.070
 Alf+0.068
Neg logit contributions
ited-0.121
able-0.112
ok-0.110
cy-0.105
 single-0.104
Token his
Token ID465
Feature activation-5.41e-02

Pos logit contributions
 not+0.010
lf+0.010
 Brown+0.010
 Alf+0.010
 Zig+0.010
Neg logit contributions
ited-0.025
able-0.024
ok-0.023
cy-0.022
 single-0.022
Token museum
Token ID13257
Feature activation-4.54e-02

Pos logit contributions
 not+0.010
 Brown+0.009
 Alf+0.009
 Zig+0.009
lf+0.009
Neg logit contributions
ited-0.016
able-0.015
ok-0.014
cy-0.014
 single-0.014
Token and
Token ID290
Feature activation-2.17e-02

Pos logit contributions
 not+0.055
lf+0.052
 Brown+0.050
 driving+0.050
 Alf+0.049
Neg logit contributions
ited-0.116
able-0.107
ok-0.105
cy-0.102
Ok-0.099
Token see
Token ID766
Feature activation-4.20e-02

Pos logit contributions
lf+0.029
opes+0.029
aurus+0.026
 Zig+0.026
 Alf+0.026
Neg logit contributions
ited-0.046
cy-0.043
able-0.043
yes-0.042
 single-0.041
Token what
Token ID644
Feature activation-1.06e-01

Pos logit contributions
lf+0.051
 not+0.049
 driving+0.044
opes+0.044
 Brown+0.042
Neg logit contributions
ited-0.100
able-0.098
cy-0.092
ok-0.089
 single-0.089
Token other
Token ID584
Feature activation-4.63e-02

Pos logit contributions
 not+0.119
lf+0.115
 driving+0.114
 Brown+0.112
 Alf+0.110
Neg logit contributions
ited-0.267
able-0.257
ok-0.244
cy-0.239
 single-0.236
Token dinosaurs
Token ID27199
Feature activation-2.22e-01

Pos logit contributions
 not+0.054
opes+0.053
lf+0.052
 Fort+0.048
aurus+0.047
Neg logit contributions
able-0.111
ited-0.109
cy-0.096
 single-0.095
ok-0.094
Token she
Token ID673
Feature activation-1.21e-02

Pos logit contributions
 not+0.234
 Brown+0.233
 Alf+0.232
lf+0.231
 Zig+0.226
Neg logit contributions
ited-0.575
able-0.546
ok-0.514
cy-0.509
 single-0.491
Token could
Token ID714
Feature activation-5.36e-03

Pos logit contributions
 not+0.016
lf+0.016
 Brown+0.016
 driving+0.015
 Alf+0.015
Neg logit contributions
ited-0.029
able-0.027
ok-0.026
cy-0.025
Ok-0.025
Token find
Token ID1064
Feature activation-3.96e-02

Pos logit contributions
 Brown+0.007
 Zig+0.007
lf+0.007
 Alf+0.007
 Wh+0.007
Neg logit contributions
ited-0.012
ok-0.012
able-0.011
yes-0.011
cy-0.011
Token.
Token ID13
Feature activation-1.21e-02

Pos logit contributions
 not+0.043
lf+0.039
 Brown+0.039
 driving+0.038
 Alf+0.037
Neg logit contributions
ited-0.107
able-0.098
ok-0.098
cy-0.094
Ok-0.093
Token<|endoftext|>
Token ID50256
Feature activation+1.28e-02

Pos logit contributions
lf+0.020
 driving+0.019
 not+0.019
opes+0.018
 pretending+0.018
Neg logit contributions
ited-0.025
cy-0.023
able-0.023
ok-0.023
Ok-0.022
Token read
Token ID1100
Feature activation-4.37e-02

Pos logit contributions
 not+0.018
 Alf+0.018
lf+0.017
 Brown+0.017
 Zig+0.017
Neg logit contributions
ited-0.032
ok-0.029
able-0.029
cy-0.028
yes-0.027
Token books
Token ID3835
Feature activation-7.19e-02

Pos logit contributions
 not+0.045
aurus+0.041
 Alf+0.040
 driving+0.039
lf+0.039
Neg logit contributions
ited-0.108
able-0.107
ok-0.099
 single-0.098
yes-0.095
Token,
Token ID11
Feature activation-4.61e-02

Pos logit contributions
 not+0.086
lf+0.085
 driving+0.085
 Alf+0.084
 Brown+0.083
Neg logit contributions
ited-0.172
able-0.165
ok-0.161
cy-0.153
 single-0.150
Token especially
Token ID2592
Feature activation+7.44e-02

Pos logit contributions
aurus+0.077
anks+0.072
 Alf+0.072
 Zig+0.071
opes+0.070
Neg logit contributions
ited-0.087
able-0.075
ok-0.074
yes-0.073
 single-0.067
Token fairy
Token ID25607
Feature activation-9.46e-02

Pos logit contributions
ited+0.180
able+0.176
ok+0.165
cy+0.162
 single+0.157
Neg logit contributions
lf-0.087
 not-0.087
 Alf-0.084
 Zig-0.080
aurus-0.079
Token tales
Token ID19490
Feature activation-2.21e-01

Pos logit contributions
 not+0.136
lf+0.127
aurus+0.127
 driving+0.126
 Alf+0.120
Neg logit contributions
able-0.202
ited-0.188
cy-0.186
ok-0.180
mer-0.174
Token.
Token ID13
Feature activation-1.63e-02

Pos logit contributions
lf+0.313
 Alf+0.298
 not+0.296
 Zig+0.294
opes+0.290
Neg logit contributions
able-0.460
ited-0.460
ok-0.449
 single-0.427
yes-0.410
Token One
Token ID1881
Feature activation+5.97e-02

Pos logit contributions
 not+0.026
lf+0.025
 Brown+0.024
 driving+0.024
 Alf+0.023
Neg logit contributions
ited-0.036
able-0.033
ok-0.032
cy-0.031
Ok-0.030
Token night
Token ID1755
Feature activation+1.80e-03

Pos logit contributions
ited+0.119
ok+0.105
able+0.105
cy+0.098
Ok+0.097
Neg logit contributions
 not-0.109
lf-0.104
 Brown-0.102
 driving-0.101
 Alf-0.100
Token,
Token ID11
Feature activation-1.94e-02

Pos logit contributions
ited+0.004
able+0.004
cy+0.004
ok+0.004
 single+0.004
Neg logit contributions
 not-0.002
lf-0.002
 Alf-0.002
 driving-0.002
opes-0.002
Token she
Token ID673
Feature activation+2.36e-02

Pos logit contributions
 driving+0.030
 Alf+0.030
opes+0.030
lf+0.029
 Zig+0.029
Neg logit contributions
ited-0.039
able-0.036
cy-0.036
ok-0.035
 single-0.034
Token do
Token ID466
Feature activation-4.64e-02

Pos logit contributions
able+0.113
ited+0.111
cy+0.105
ok+0.097
use+0.096
Neg logit contributions
lf-0.094
 Alf-0.084
 driving-0.082
 pretending-0.081
opes-0.079
Token not
Token ID407
Feature activation-4.43e-02

Pos logit contributions
 Alf+0.025
 Brown+0.022
 Zig+0.021
lf+0.021
 driving+0.015
Neg logit contributions
ited-0.140
able-0.138
ok-0.131
 single-0.130
kind-0.128
Token know
Token ID760
Feature activation-1.09e-01

Pos logit contributions
 not+0.065
lf+0.062
 Brown+0.060
 driving+0.060
 Alf+0.059
Neg logit contributions
ited-0.103
able-0.094
ok-0.093
cy-0.089
Ok-0.087
Token anything
Token ID1997
Feature activation-5.44e-02

Pos logit contributions
lf+0.122
 not+0.120
 Alf+0.118
 Brown+0.115
 driving+0.113
Neg logit contributions
ited-0.262
able-0.250
ok-0.241
cy-0.234
 single-0.229
Token about
Token ID546
Feature activation-1.27e-01

Pos logit contributions
 not+0.071
lf+0.064
 Brown+0.063
 driving+0.063
 Alf+0.060
Neg logit contributions
ited-0.139
ok-0.125
able-0.125
cy-0.121
Ok-0.118
Token nations
Token ID7027
Feature activation-2.21e-01

Pos logit contributions
 not+0.188
lf+0.187
 Alf+0.185
 Zig+0.184
 driving+0.183
Neg logit contributions
ited-0.246
able-0.237
ok-0.221
 single-0.219
cy-0.219
Token."
Token ID526
Feature activation+4.08e-02

Pos logit contributions
 not+0.305
lf+0.301
 Alf+0.297
 Brown+0.293
 Zig+0.288
Neg logit contributions
ited-0.505
able-0.479
ok-0.459
cy-0.442
 single-0.428
Token\n
Token ID198
Feature activation+9.65e-04

Pos logit contributions
ited+0.089
ok+0.087
able+0.086
cy+0.081
Ok+0.079
Neg logit contributions
lf-0.059
 not-0.055
 driving-0.055
opes-0.054
 Brown-0.053
Token\n
Token ID198
Feature activation+1.02e-02

Pos logit contributions
ited+0.002
able+0.002
ok+0.002
cy+0.002
 single+0.002
Neg logit contributions
lf-0.001
 not-0.001
 driving-0.001
 Brown-0.001
 Alf-0.001
TokenBen
Token ID11696
Feature activation+2.14e-02

Pos logit contributions
ited+0.022
ok+0.019
able+0.018
cy+0.018
sie+0.018
Neg logit contributions
 not-0.018
lf-0.016
 Zig-0.016
 Alf-0.016
 Brown-0.015
Token and
Token ID290
Feature activation+4.61e-02

Pos logit contributions
ited+0.056
able+0.051
ok+0.051
cy+0.049
Ok+0.048
Neg logit contributions
 not-0.025
lf-0.023
 Brown-0.023
 driving-0.022
 Alf-0.022
Token whisper
Token ID31992
Feature activation-7.08e-02

Pos logit contributions
ited+0.007
able+0.006
ok+0.006
cy+0.006
 single+0.006
Neg logit contributions
lf-0.004
 not-0.004
 Brown-0.004
 driving-0.004
 Alf-0.004
Token excited
Token ID6568
Feature activation+2.48e-02

Pos logit contributions
 not+0.087
lf+0.087
 Brown+0.081
 driving+0.081
 Alf+0.079
Neg logit contributions
ited-0.173
able-0.162
ok-0.159
cy-0.152
Ok-0.150
Tokenly
Token ID306
Feature activation+1.40e-02

Pos logit contributions
ited+0.042
able+0.039
ok+0.039
cy+0.035
yes+0.034
Neg logit contributions
lf-0.049
 not-0.048
 driving-0.046
 Brown-0.046
 Zig-0.045
Token,
Token ID11
Feature activation-4.14e-02

Pos logit contributions
ited+0.034
able+0.032
ok+0.031
cy+0.030
 single+0.029
Neg logit contributions
lf-0.017
 not-0.017
 driving-0.016
 Brown-0.016
 Alf-0.015
Token telling
Token ID5149
Feature activation-6.18e-02

Pos logit contributions
lf+0.061
 Brown+0.053
opes+0.052
 Alf+0.051
aurus+0.051
Neg logit contributions
ited-0.088
ok-0.080
able-0.079
Ok-0.078
 single-0.077
Token tales
Token ID19490
Feature activation-2.20e-01

Pos logit contributions
lf+0.062
 not+0.059
 driving+0.056
 Brown+0.052
 pretending+0.052
Neg logit contributions
ited-0.158
able-0.152
ok-0.144
 single-0.141
Ok-0.141
Token of
Token ID286
Feature activation-7.13e-02

Pos logit contributions
lf+0.237
 Brown+0.228
 Alf+0.222
 not+0.221
 driving+0.221
Neg logit contributions
ited-0.550
able-0.531
ok-0.518
 single-0.498
cy-0.490
Token the
Token ID262
Feature activation-3.04e-02

Pos logit contributions
lf+0.084
 not+0.080
 driving+0.074
 Alf+0.074
 Brown+0.073
Neg logit contributions
ited-0.171
able-0.162
 single-0.153
cy-0.152
ok-0.151
Token rat
Token ID4227
Feature activation-2.35e-02

Pos logit contributions
 not+0.042
lf+0.039
 Brown+0.038
 driving+0.038
 Alf+0.037
Neg logit contributions
ited-0.074
able-0.067
ok-0.067
cy-0.064
Ok-0.063
Token who
Token ID508
Feature activation-5.55e-02

Pos logit contributions
lf+0.031
 not+0.029
 driving+0.028
 Alf+0.028
opes+0.028
Neg logit contributions
ited-0.052
able-0.051
cy-0.047
 single-0.047
ok-0.045
Token was
Token ID373
Feature activation-3.10e-02

Pos logit contributions
 not+0.064
lf+0.060
 Brown+0.058
 driving+0.058
 Alf+0.058
Neg logit contributions
ited-0.143
able-0.134
ok-0.127
cy-0.126
 single-0.122
Token play
Token ID711
Feature activation-4.61e-02

Pos logit contributions
 not+0.026
lf+0.026
 Brown+0.025
 driving+0.025
 Alf+0.024
Neg logit contributions
ited-0.041
able-0.038
ok-0.036
cy-0.036
 single-0.034
Token cow
Token ID9875
Feature activation-9.11e-02

Pos logit contributions
 not+0.062
lf+0.059
 Alf+0.059
 Brown+0.059
 driving+0.057
Neg logit contributions
ited-0.106
able-0.101
ok-0.097
cy-0.095
 single-0.092
Tokenboys
Token ID13202
Feature activation-1.53e-01

Pos logit contributions
lf+0.105
 not+0.101
 Alf+0.098
aurus+0.093
 driving+0.092
Neg logit contributions
ited-0.220
able-0.213
cy-0.211
ok-0.197
mer-0.195
Token and
Token ID290
Feature activation+1.35e-02

Pos logit contributions
lf+0.217
 not+0.216
 Alf+0.206
 driving+0.201
 Zig+0.189
Neg logit contributions
ited-0.329
able-0.318
cy-0.309
ok-0.296
mer-0.293
Token cow
Token ID9875
Feature activation-1.04e-01

Pos logit contributions
ited+0.033
able+0.031
ok+0.030
cy+0.030
 single+0.029
Neg logit contributions
 not-0.016
lf-0.015
 Alf-0.014
 driving-0.014
 Brown-0.014
Tokengirls
Token ID36960
Feature activation-2.19e-01

Pos logit contributions
lf+0.126
 not+0.119
 Alf+0.111
 driving+0.103
 Brown+0.101
Neg logit contributions
ited-0.239
able-0.238
cy-0.233
mer-0.224
ok-0.215
Token,
Token ID11
Feature activation-5.04e-02

Pos logit contributions
lf+0.292
 Alf+0.274
 not+0.270
 Zig+0.251
 Brown+0.250
Neg logit contributions
ited-0.481
able-0.458
cy-0.455
ok-0.454
mer-0.428
Token and
Token ID290
Feature activation-4.99e-03

Pos logit contributions
lf+0.067
 not+0.060
opes+0.058
 Alf+0.057
aurus+0.056
Neg logit contributions
ited-0.113
able-0.108
cy-0.106
ok-0.102
yes-0.099
Token pretend
Token ID16614
Feature activation-8.02e-02

Pos logit contributions
lf+0.006
 not+0.006
 Alf+0.005
aurus+0.005
opes+0.005
Neg logit contributions
ited-0.012
able-0.012
cy-0.011
ok-0.011
Ok-0.011
Token to
Token ID284
Feature activation-3.47e-02

Pos logit contributions
lf+0.104
 Alf+0.102
aurus+0.102
 Brown+0.098
opes+0.096
Neg logit contributions
able-0.170
ited-0.169
ok-0.158
 single-0.158
cy-0.153
Token ride
Token ID6594
Feature activation-5.57e-02

Pos logit contributions
lf+0.042
 not+0.041
 Alf+0.040
 Brown+0.040
 driving+0.038
Neg logit contributions
ited-0.086
able-0.079
ok-0.077
cy-0.075
 single-0.074
Token friends
Token ID2460
Feature activation-1.40e-01

Pos logit contributions
lf+0.021
 not+0.020
 driving+0.020
 Brown+0.019
 Alf+0.019
Neg logit contributions
ited-0.058
able-0.055
ok-0.053
cy-0.052
Ok-0.052
Token are
Token ID389
Feature activation-2.81e-02

Pos logit contributions
 not+0.174
lf+0.164
 Brown+0.160
 driving+0.158
 Alf+0.156
Neg logit contributions
ited-0.353
able-0.324
ok-0.321
cy-0.309
Ok-0.302
Token more
Token ID517
Feature activation+9.91e-03

Pos logit contributions
lf+0.038
 Alf+0.036
aurus+0.034
 Zig+0.034
opes+0.034
Neg logit contributions
ited-0.061
able-0.060
 single-0.055
ok-0.054
cy-0.053
Token fun
Token ID1257
Feature activation-1.51e-03

Pos logit contributions
ited+0.020
able+0.019
ok+0.018
cy+0.017
 single+0.017
Neg logit contributions
 not-0.016
lf-0.016
 Alf-0.016
 Brown-0.016
 Zig-0.016
Token than
Token ID621
Feature activation-3.11e-02

Pos logit contributions
 not+0.002
lf+0.002
 Alf+0.002
 Brown+0.002
 Zig+0.002
Neg logit contributions
ited-0.003
able-0.003
ok-0.003
cy-0.003
 single-0.003
Token robots
Token ID14193
Feature activation-2.18e-01

Pos logit contributions
 not+0.042
lf+0.041
 Alf+0.039
 Brown+0.039
 driving+0.039
Neg logit contributions
ited-0.073
able-0.067
ok-0.066
cy-0.064
 single-0.062
Token.
Token ID13
Feature activation-4.18e-02

Pos logit contributions
 not+0.282
lf+0.271
 Alf+0.263
 Brown+0.262
 driving+0.257
Neg logit contributions
ited-0.531
able-0.497
ok-0.484
cy-0.466
 single-0.452
Token Don
Token ID2094
Feature activation+1.17e-02

Pos logit contributions
lf+0.069
 not+0.067
 driving+0.063
 Brown+0.061
 Alf+0.061
Neg logit contributions
ited-0.083
able-0.079
ok-0.078
cy-0.074
Ok-0.073
Token't
Token ID470
Feature activation-3.33e-02

Pos logit contributions
ited+0.026
able+0.026
ok+0.025
 single+0.024
cy+0.023
Neg logit contributions
 pretending-0.014
 Alf-0.014
aurus-0.014
opes-0.014
 drawings-0.013
Token you
Token ID345
Feature activation+4.02e-02

Pos logit contributions
lf+0.050
 not+0.049
 Alf+0.047
 driving+0.047
 Brown+0.047
Neg logit contributions
ited-0.074
able-0.068
ok-0.066
cy-0.064
 single-0.062
Token agree
Token ID4236
Feature activation-4.16e-02

Pos logit contributions
ited+0.083
able+0.075
ok+0.074
cy+0.071
Ok+0.068
Neg logit contributions
 not-0.068
lf-0.067
 Brown-0.065
 driving-0.064
 Alf-0.064
Token be
Token ID307
Feature activation+6.31e-03

Pos logit contributions
opes+0.039
 Brown+0.038
lf+0.038
 Alf+0.038
 driving+0.037
Neg logit contributions
ited-0.052
ok-0.047
cy-0.047
able-0.046
yes-0.046
Token cow
Token ID9875
Feature activation-9.74e-02

Pos logit contributions
able+0.014
ited+0.014
cy+0.013
ok+0.013
 single+0.012
Neg logit contributions
lf-0.008
opes-0.008
 Brown-0.008
 Alf-0.008
 Zig-0.007
Tokenboys
Token ID13202
Feature activation-1.49e-01

Pos logit contributions
lf+0.101
 not+0.092
 driving+0.091
aurus+0.087
 Alf+0.085
Neg logit contributions
ited-0.242
able-0.238
cy-0.234
ok-0.225
mer-0.222
Token and
Token ID290
Feature activation+5.56e-03

Pos logit contributions
lf+0.215
 not+0.213
 Alf+0.208
 driving+0.205
 Zig+0.195
Neg logit contributions
ited-0.312
able-0.307
cy-0.296
ok-0.291
mer-0.284
Token cow
Token ID9875
Feature activation-1.03e-01

Pos logit contributions
ited+0.012
able+0.012
cy+0.012
 single+0.011
use+0.011
Neg logit contributions
lf-0.007
 not-0.007
 driving-0.006
 Alf-0.006
aurus-0.006
Tokengirls
Token ID36960
Feature activation-2.18e-01

Pos logit contributions
lf+0.137
 not+0.131
 driving+0.120
 hardly+0.114
 Alf+0.113
Neg logit contributions
able-0.227
ited-0.223
cy-0.220
mer-0.212
ok-0.207
Token.
Token ID13
Feature activation-2.02e-02

Pos logit contributions
lf+0.326
 Alf+0.306
 not+0.299
 Zig+0.284
 Brown+0.280
Neg logit contributions
ited-0.411
ok-0.409
cy-0.405
able-0.403
mer-0.380
Token They
Token ID1119
Feature activation-4.84e-02

Pos logit contributions
 not+0.024
 Brown+0.021
 Alf+0.021
 Zig+0.021
lf+0.020
Neg logit contributions
ited-0.055
able-0.050
ok-0.049
cy-0.047
 single-0.046
Token put
Token ID1234
Feature activation+5.09e-02

Pos logit contributions
lf+0.068
 not+0.065
 driving+0.063
 Brown+0.062
 Alf+0.061
Neg logit contributions
ited-0.113
able-0.103
cy-0.100
ok-0.099
use-0.093
Token on
Token ID319
Feature activation+5.89e-03

Pos logit contributions
ited+0.132
able+0.122
ok+0.120
cy+0.116
Ok+0.113
Neg logit contributions
 not-0.060
lf-0.057
 Brown-0.055
 driving-0.054
 Alf-0.054
Token hats
Token ID23910
Feature activation-3.75e-02

Pos logit contributions
ited+0.015
able+0.014
ok+0.013
cy+0.013
 single+0.013
Neg logit contributions
 not-0.007
lf-0.007
 driving-0.007
 Alf-0.006
 Brown-0.006
Token going
Token ID1016
Feature activation+3.76e-02

Pos logit contributions
ited+0.138
able+0.128
ok+0.126
cy+0.122
Ok+0.119
Neg logit contributions
 not-0.055
lf-0.052
 Brown-0.050
 driving-0.049
 Alf-0.049
Token and
Token ID290
Feature activation-3.83e-02

Pos logit contributions
ited+0.098
able+0.089
ok+0.088
cy+0.085
Ok+0.084
Neg logit contributions
 not-0.047
 driving-0.043
 Brown-0.043
lf-0.042
 Alf-0.041
Token make
Token ID787
Feature activation-3.23e-02

Pos logit contributions
 not+0.050
 driving+0.044
lf+0.043
 Brown+0.043
 Alf+0.041
Neg logit contributions
ited-0.100
able-0.092
ok-0.090
cy-0.087
Ok-0.084
Token more
Token ID517
Feature activation-1.16e-02

Pos logit contributions
lf+0.041
 not+0.041
 Brown+0.039
 Zig+0.039
 driving+0.038
Neg logit contributions
ited-0.075
able-0.071
ok-0.068
 single-0.066
eful-0.065
Token interesting
Token ID3499
Feature activation-5.24e-02

Pos logit contributions
 not+0.018
lf+0.017
 Brown+0.017
 Alf+0.016
 driving+0.016
Neg logit contributions
ited-0.026
able-0.024
ok-0.023
cy-0.022
 single-0.022
Token models
Token ID4981
Feature activation-2.15e-01

Pos logit contributions
 not+0.069
lf+0.065
 Brown+0.063
 Zig+0.062
 Alf+0.061
Neg logit contributions
ited-0.125
able-0.119
ok-0.116
cy-0.110
 single-0.109
Token.
Token ID13
Feature activation-2.16e-02

Pos logit contributions
 Zig+0.307
lf+0.305
opes+0.303
 not+0.301
 Brown+0.297
Neg logit contributions
ited-0.452
able-0.451
ok-0.439
yes-0.402
cy-0.401
Token<|endoftext|>
Token ID50256
Feature activation+1.94e-03

Pos logit contributions
 not+0.032
lf+0.031
 driving+0.030
 pretending+0.028
opes+0.027
Neg logit contributions
ited-0.047
able-0.044
cy-0.043
ok-0.043
Ok-0.042
TokenOne
Token ID3198
Feature activation+7.78e-02

Pos logit contributions
ited+0.005
able+0.004
ok+0.004
cy+0.004
 single+0.004
Neg logit contributions
 not-0.003
lf-0.003
 driving-0.002
 Alf-0.002
 Brown-0.002
Token day
Token ID1110
Feature activation+5.26e-02

Pos logit contributions
ited+0.166
ok+0.149
able+0.148
Ok+0.141
cy+0.140
Neg logit contributions
 not-0.132
lf-0.120
 driving-0.118
 Brown-0.113
 about-0.111
Token a
Token ID257
Feature activation-2.41e-02

Pos logit contributions
ited+0.144
able+0.134
ok+0.133
cy+0.128
 single+0.125
Neg logit contributions
 not-0.052
lf-0.049
 Brown-0.048
 Alf-0.047
 driving-0.047
Token were
Token ID547
Feature activation-6.66e-02

Pos logit contributions
lf+0.019
 not+0.018
opes+0.017
 driving+0.017
 Brown+0.016
Neg logit contributions
ited-0.042
able-0.039
cy-0.039
ok-0.037
mer-0.036
Token soldiers
Token ID5795
Feature activation-1.69e-01

Pos logit contributions
lf+0.090
opes+0.076
ops+0.070
 Alf+0.070
aurus+0.070
Neg logit contributions
able-0.134
 forward-0.127
 single-0.122
mer-0.121
 further-0.121
Token or
Token ID393
Feature activation-6.14e-02

Pos logit contributions
lf+0.252
 Alf+0.248
 not+0.235
opes+0.235
 Zig+0.230
Neg logit contributions
able-0.343
ited-0.334
cy-0.318
ok-0.313
mer-0.310
Token animals
Token ID4695
Feature activation-1.80e-01

Pos logit contributions
lf+0.074
 Alf+0.068
 Zig+0.066
opes+0.065
 Brown+0.065
Neg logit contributions
ited-0.133
able-0.132
use-0.123
 single-0.122
cy-0.118
Token or
Token ID393
Feature activation-7.14e-02

Pos logit contributions
lf+0.251
 not+0.243
 Alf+0.243
 Brown+0.226
opes+0.224
Neg logit contributions
able-0.396
ited-0.394
ok-0.359
cy-0.354
mer-0.350
Token robots
Token ID14193
Feature activation-2.14e-01

Pos logit contributions
lf+0.083
 Alf+0.073
 Zig+0.070
 not+0.069
aurus+0.069
Neg logit contributions
ited-0.164
able-0.164
use-0.152
cy-0.148
 single-0.148
Token.
Token ID13
Feature activation-2.83e-02

Pos logit contributions
 not+0.311
 Alf+0.306
lf+0.298
 Zig+0.288
 Brown+0.284
Neg logit contributions
able-0.462
ited-0.462
ok-0.420
mer-0.412
cy-0.406
Token One
Token ID1881
Feature activation+5.60e-02

Pos logit contributions
 not+0.036
lf+0.034
 Brown+0.033
 driving+0.033
 Alf+0.032
Neg logit contributions
ited-0.071
able-0.065
ok-0.065
cy-0.062
Ok-0.061
Token day
Token ID1110
Feature activation+4.23e-02

Pos logit contributions
ited+0.096
able+0.090
ok+0.085
cy+0.083
 single+0.080
Neg logit contributions
 not-0.111
lf-0.109
 Brown-0.109
 Alf-0.106
 driving-0.105
Token,
Token ID11
Feature activation-1.95e-02

Pos logit contributions
ited+0.090
able+0.089
cy+0.087
ok+0.086
 single+0.084
Neg logit contributions
lf-0.061
 driving-0.055
 Brown-0.055
 not-0.053
aurus-0.052
Token they
Token ID484
Feature activation+3.49e-03

Pos logit contributions
lf+0.023
 driving+0.021
 Brown+0.020
aurus+0.019
opes+0.019
Neg logit contributions
ited-0.047
cy-0.046
able-0.045
ok-0.044
Ok-0.043
Token We
Token ID775
Feature activation+9.44e-02

Pos logit contributions
 not+0.072
lf+0.069
 Brown+0.068
 driving+0.067
 Alf+0.067
Neg logit contributions
ited-0.079
able-0.070
ok-0.069
cy-0.066
Ok-0.064
Token know
Token ID760
Feature activation-9.58e-02

Pos logit contributions
ited+0.209
able+0.205
ok+0.192
cy+0.192
Ok+0.174
Neg logit contributions
lf-0.156
 Alf-0.137
 driving-0.137
 Brown-0.134
 Zig-0.129
Token some
Token ID617
Feature activation+1.43e-02

Pos logit contributions
lf+0.121
 Alf+0.115
aurus+0.113
 Brown+0.111
 not+0.111
Neg logit contributions
ited-0.206
able-0.201
ok-0.194
cy-0.188
Ok-0.185
Token things
Token ID1243
Feature activation-5.24e-02

Pos logit contributions
ited+0.038
able+0.035
ok+0.034
cy+0.033
Ok+0.032
Neg logit contributions
 not-0.016
lf-0.015
 Brown-0.015
 driving-0.015
 Alf-0.014
Token about
Token ID546
Feature activation-1.16e-01

Pos logit contributions
lf+0.067
 Alf+0.064
 not+0.063
 Brown+0.062
 driving+0.061
Neg logit contributions
ited-0.126
able-0.120
ok-0.117
cy-0.111
 single-0.108
Token nations
Token ID7027
Feature activation-2.13e-01

Pos logit contributions
lf+0.152
 not+0.148
 Alf+0.144
 Brown+0.142
 driving+0.141
Neg logit contributions
ited-0.253
able-0.245
ok-0.231
cy-0.230
 single-0.226
Token."
Token ID526
Feature activation+3.86e-02

Pos logit contributions
 not+0.269
lf+0.266
 Alf+0.260
 Brown+0.257
 Zig+0.251
Neg logit contributions
ited-0.513
able-0.489
ok-0.469
cy-0.454
 single-0.439
Token\n
Token ID198
Feature activation+3.51e-03

Pos logit contributions
ited+0.090
able+0.086
ok+0.085
cy+0.082
Ok+0.080
Neg logit contributions
lf-0.053
 not-0.050
 driving-0.049
 Brown-0.047
 Alf-0.046
Token\n
Token ID198
Feature activation+1.26e-02

Pos logit contributions
ited+0.007
able+0.007
ok+0.007
cy+0.006
 single+0.006
Neg logit contributions
lf-0.005
 driving-0.005
 pretending-0.005
 Alf-0.005
 not-0.005
TokenThe
Token ID464
Feature activation+6.80e-05

Pos logit contributions
ited+0.029
ok+0.025
able+0.025
cy+0.025
 single+0.024
Neg logit contributions
 not-0.021
lf-0.019
 driving-0.018
 Brown-0.018
 Zig-0.018
Token big
Token ID1263
Feature activation+4.67e-02

Pos logit contributions
ited+0.000
able+0.000
ok+0.000
cy+0.000
 single+0.000
Neg logit contributions
 Brown-0.000
 not-0.000
 Zig-0.000
opes-0.000
 driving-0.000
Token she
Token ID673
Feature activation+2.57e-03

Pos logit contributions
opes+0.093
lf+0.091
 driving+0.083
aurus+0.083
 Zig+0.082
Neg logit contributions
able-0.113
ited-0.111
 single-0.107
ok-0.103
cy-0.100
Token knew
Token ID2993
Feature activation-3.99e-02

Pos logit contributions
ited+0.006
able+0.006
cy+0.006
ok+0.006
Ok+0.005
Neg logit contributions
 not-0.003
 Alf-0.003
 Brown-0.003
 Zig-0.003
lf-0.003
Token so
Token ID523
Feature activation+1.15e-02

Pos logit contributions
lf+0.045
 driving+0.041
 Brown+0.040
 Alf+0.040
 not+0.040
Neg logit contributions
ited-0.096
able-0.094
ok-0.089
 single-0.087
cy-0.087
Token much
Token ID881
Feature activation+2.49e-02

Pos logit contributions
ited+0.023
able+0.022
ok+0.020
 single+0.020
cy+0.020
Neg logit contributions
lf-0.019
 driving-0.018
 not-0.018
 Brown-0.018
 Zig-0.018
Token about
Token ID546
Feature activation-1.34e-01

Pos logit contributions
ited+0.066
able+0.063
ok+0.061
cy+0.058
 single+0.058
Neg logit contributions
 not-0.026
lf-0.025
 Brown-0.024
 Alf-0.024
 driving-0.024
Token dinosaurs
Token ID27199
Feature activation-2.12e-01

Pos logit contributions
lf+0.166
 not+0.147
aurus+0.145
 Zig+0.145
 Brown+0.144
Neg logit contributions
ited-0.298
able-0.290
 single-0.276
cy-0.266
ok-0.265
Token.
Token ID13
Feature activation-1.36e-02

Pos logit contributions
lf+0.264
 Brown+0.261
 not+0.260
 Alf+0.260
 Zig+0.258
Neg logit contributions
ited-0.502
able-0.483
ok-0.456
cy-0.441
 single-0.436
Token She
Token ID1375
Feature activation-2.41e-02

Pos logit contributions
lf+0.024
 driving+0.023
 not+0.022
 pretending+0.021
aurus+0.021
Neg logit contributions
ited-0.026
ok-0.024
able-0.024
cy-0.024
Ok-0.023
Token was
Token ID373
Feature activation-1.42e-02

Pos logit contributions
lf+0.035
 driving+0.035
 Brown+0.035
 not+0.035
 Alf+0.035
Neg logit contributions
ited-0.052
able-0.046
cy-0.044
ok-0.044
Ok-0.044
Token ready
Token ID3492
Feature activation+2.37e-02

Pos logit contributions
lf+0.021
 not+0.021
 Brown+0.020
 Alf+0.020
 driving+0.020
Neg logit contributions
ited-0.032
able-0.028
ok-0.028
cy-0.027
Ok-0.027
Token to
Token ID284
Feature activation-1.04e-02

Pos logit contributions
ited+0.058
able+0.053
ok+0.052
cy+0.050
Ok+0.048
Neg logit contributions
 not-0.034
 driving-0.030
 Brown-0.030
lf-0.030
 Alf-0.029
Token!
Token ID0
Feature activation-1.33e-02

Pos logit contributions
 not+0.096
lf+0.089
 Brown+0.087
 driving+0.086
 Alf+0.084
Neg logit contributions
ited-0.202
able-0.184
ok-0.183
cy-0.176
Ok-0.172
Token That
Token ID1320
Feature activation+5.37e-03

Pos logit contributions
 not+0.023
 Brown+0.022
 driving+0.022
 Alf+0.022
lf+0.022
Neg logit contributions
ited-0.028
able-0.025
ok-0.024
cy-0.023
 single-0.022
Token's
Token ID338
Feature activation-3.31e-02

Pos logit contributions
ited+0.013
able+0.012
ok+0.011
cy+0.011
 single+0.011
Neg logit contributions
 not-0.007
lf-0.007
 Brown-0.007
 Alf-0.007
 Zig-0.007
Token how
Token ID703
Feature activation-9.54e-03

Pos logit contributions
lf+0.041
 not+0.040
 Brown+0.039
 Alf+0.038
 driving+0.038
Neg logit contributions
ited-0.081
able-0.074
ok-0.073
cy-0.071
Ok-0.070
Token all
Token ID477
Feature activation+2.12e-02

Pos logit contributions
 not+0.015
lf+0.014
 Brown+0.014
 driving+0.014
 Alf+0.014
Neg logit contributions
ited-0.021
able-0.019
ok-0.019
cy-0.018
Ok-0.018
Token masks
Token ID20680
Feature activation-2.11e-01

Pos logit contributions
ited+0.047
able+0.044
ok+0.042
cy+0.041
 single+0.039
Neg logit contributions
 not-0.032
lf-0.031
 Brown-0.030
 Alf-0.030
 Zig-0.030
Token should
Token ID815
Feature activation-1.03e-01

Pos logit contributions
lf+0.265
 not+0.264
 Brown+0.253
 Alf+0.251
 Zig+0.247
Neg logit contributions
ited-0.514
able-0.476
ok-0.472
cy-0.453
 single-0.438
Token be
Token ID307
Feature activation-2.70e-02

Pos logit contributions
lf+0.137
 Alf+0.132
 Brown+0.129
 driving+0.128
 Zig+0.126
Neg logit contributions
ited-0.233
able-0.217
ok-0.216
 single-0.205
cy-0.204
Token organized
Token ID8389
Feature activation-1.65e-02

Pos logit contributions
 not+0.039
lf+0.037
 Brown+0.036
 driving+0.036
 Alf+0.035
Neg logit contributions
ited-0.063
able-0.057
ok-0.057
cy-0.055
Ok-0.053
Token."
Token ID526
Feature activation+3.35e-02

Pos logit contributions
 not+0.020
lf+0.018
 Brown+0.018
 driving+0.018
 Alf+0.017
Neg logit contributions
ited-0.043
ok-0.039
able-0.039
cy-0.037
Ok-0.037
Token\n
Token ID198
Feature activation-6.40e-03

Pos logit contributions
ited+0.083
able+0.077
ok+0.076
cy+0.074
Ok+0.072
Neg logit contributions
 not-0.042
lf-0.040
 driving-0.038
 Brown-0.038
 Alf-0.037
Token had
Token ID550
Feature activation-2.36e-02

Pos logit contributions
lf+0.027
 driving+0.026
 not+0.025
 Brown+0.025
 Alf+0.025
Neg logit contributions
ited-0.052
able-0.047
ok-0.046
cy-0.045
Ok-0.043
Token no
Token ID645
Feature activation+3.99e-02

Pos logit contributions
 not+0.033
lf+0.030
 Brown+0.030
 driving+0.030
 Alf+0.029
Neg logit contributions
ited-0.057
able-0.052
ok-0.052
cy-0.050
Ok-0.049
Token car
Token ID1097
Feature activation-8.63e-02

Pos logit contributions
ited+0.107
ok+0.098
able+0.098
cy+0.094
Ok+0.093
Neg logit contributions
 not-0.046
lf-0.042
 driving-0.041
 Brown-0.041
 Alf-0.039
Token and
Token ID290
Feature activation-2.14e-02

Pos logit contributions
lf+0.126
 not+0.122
 Alf+0.122
 Brown+0.115
opes+0.115
Neg logit contributions
ited-0.189
able-0.182
ok-0.175
cy-0.168
yes-0.158
Token no
Token ID645
Feature activation+2.97e-02

Pos logit contributions
lf+0.023
 not+0.022
 Alf+0.022
 Brown+0.022
 driving+0.021
Neg logit contributions
ited-0.056
able-0.052
ok-0.050
cy-0.049
 single-0.048
Token dinosaurs
Token ID27199
Feature activation-2.11e-01

Pos logit contributions
ited+0.074
able+0.068
ok+0.068
cy+0.065
Ok+0.064
Neg logit contributions
 not-0.039
lf-0.036
 Brown-0.036
 driving-0.036
 Alf-0.034
Token to
Token ID284
Feature activation-4.71e-02

Pos logit contributions
 Alf+0.279
 not+0.275
lf+0.272
 Brown+0.266
 Zig+0.261
Neg logit contributions
ited-0.497
able-0.468
ok-0.452
cy-0.432
 single-0.419
Token play
Token ID711
Feature activation-7.23e-02

Pos logit contributions
 not+0.062
lf+0.061
 Alf+0.060
 Brown+0.059
 driving+0.058
Neg logit contributions
ited-0.111
able-0.100
ok-0.099
cy-0.096
 single-0.093
Token with
Token ID351
Feature activation-2.23e-02

Pos logit contributions
 Alf+0.106
 not+0.103
 Brown+0.101
 Zig+0.100
lf+0.099
Neg logit contributions
ited-0.150
able-0.147
ok-0.140
cy-0.137
mer-0.132
Token.
Token ID13
Feature activation-2.22e-02

Pos logit contributions
 not+0.024
lf+0.023
 Brown+0.022
 driving+0.022
 Alf+0.021
Neg logit contributions
ited-0.060
able-0.055
ok-0.055
cy-0.053
Ok-0.052
Token They
Token ID1119
Feature activation-5.15e-02

Pos logit contributions
 not+0.028
lf+0.028
 driving+0.026
 Brown+0.025
 Alf+0.025
Neg logit contributions
ited-0.055
able-0.051
ok-0.050
cy-0.048
Ok-0.048
Token blankets
Token ID34794
Feature activation-9.15e-02

Pos logit contributions
lf+0.047
 not+0.046
 Alf+0.043
aurus+0.043
 Brown+0.043
Neg logit contributions
ited-0.074
able-0.070
ok-0.065
cy-0.064
 single-0.062
Token and
Token ID290
Feature activation-5.27e-04

Pos logit contributions
lf+0.123
 Alf+0.112
 not+0.112
 driving+0.109
opes+0.107
Neg logit contributions
ited-0.204
able-0.198
ok-0.194
cy-0.190
 single-0.182
Token chairs
Token ID18791
Feature activation-5.93e-02

Pos logit contributions
 not+0.001
lf+0.001
 Alf+0.001
 driving+0.001
 Brown+0.001
Neg logit contributions
ited-0.001
able-0.001
ok-0.001
cy-0.001
 single-0.001
Token to
Token ID284
Feature activation-2.55e-02

Pos logit contributions
lf+0.073
 Alf+0.070
 not+0.069
 Brown+0.066
opes+0.066
Neg logit contributions
ited-0.139
able-0.131
ok-0.131
cy-0.124
 single-0.121
Token make
Token ID787
Feature activation-2.04e-02

Pos logit contributions
 not+0.032
lf+0.031
 driving+0.030
 Alf+0.029
 Brown+0.029
Neg logit contributions
ited-0.060
able-0.056
ok-0.053
cy-0.053
 single-0.052
Token cages
Token ID37933
Feature activation-2.10e-01

Pos logit contributions
lf+0.028
 not+0.026
aurus+0.024
 driving+0.024
opes+0.023
Neg logit contributions
able-0.042
ited-0.042
cy-0.039
 single-0.039
ok-0.039
Token and
Token ID290
Feature activation-2.30e-02

Pos logit contributions
lf+0.293
 Alf+0.275
 Brown+0.272
 not+0.269
 Zig+0.263
Neg logit contributions
ited-0.464
able-0.452
ok-0.427
cy-0.414
 single-0.404
Token fences
Token ID35771
Feature activation-7.78e-02

Pos logit contributions
lf+0.030
aurus+0.026
 not+0.026
 Alf+0.026
opes+0.025
Neg logit contributions
ited-0.047
able-0.046
 single-0.043
 tighter-0.042
cy-0.042
Token.
Token ID13
Feature activation-2.31e-02

Pos logit contributions
lf+0.109
 not+0.106
 Alf+0.100
opes+0.098
aurus+0.095
Neg logit contributions
ited-0.166
able-0.162
ok-0.153
 single-0.148
cy-0.146
Token\n
Token ID198
Feature activation-8.17e-03

Pos logit contributions
lf+0.032
 not+0.031
 driving+0.029
 Brown+0.027
 Alf+0.027
Neg logit contributions
ited-0.051
able-0.049
ok-0.049
cy-0.048
Ok-0.046
Token\n
Token ID198
Feature activation+6.23e-03

Pos logit contributions
 not+0.010
lf+0.009
 Brown+0.009
 driving+0.009
 Alf+0.008
Neg logit contributions
ited-0.021
able-0.020
ok-0.019
cy-0.019
Ok-0.018
Token slides
Token ID19392
Feature activation-2.44e-02

Pos logit contributions
lf+0.008
 not+0.008
 Alf+0.008
 driving+0.008
 Brown+0.008
Neg logit contributions
ited-0.014
able-0.013
ok-0.012
cy-0.012
 single-0.012
Token and
Token ID290
Feature activation-1.88e-02

Pos logit contributions
 not+0.028
lf+0.026
 Brown+0.025
 driving+0.025
 Alf+0.025
Neg logit contributions
ited-0.064
able-0.060
ok-0.059
cy-0.057
Ok-0.055
Token ice
Token ID4771
Feature activation-2.07e-02

Pos logit contributions
lf+0.021
 Alf+0.021
 Brown+0.020
 not+0.020
aurus+0.019
Neg logit contributions
ited-0.046
able-0.044
ok-0.042
cy-0.041
 single-0.040
Token cream
Token ID8566
Feature activation-4.29e-02

Pos logit contributions
 not+0.034
lf+0.032
 Brown+0.031
 driving+0.031
 Alf+0.030
Neg logit contributions
ited-0.046
able-0.040
ok-0.040
cy-0.038
Ok-0.038
Token and
Token ID290
Feature activation-2.32e-02

Pos logit contributions
 not+0.050
lf+0.046
 Brown+0.045
 driving+0.045
 Alf+0.044
Neg logit contributions
ited-0.113
able-0.103
ok-0.102
cy-0.098
Ok-0.097
Token dinosaurs
Token ID27199
Feature activation-2.10e-01

Pos logit contributions
lf+0.028
 Alf+0.027
 not+0.026
 Brown+0.026
 Zig+0.025
Neg logit contributions
ited-0.057
able-0.054
ok-0.051
cy-0.050
Ok-0.049
Token.
Token ID13
Feature activation-2.54e-02

Pos logit contributions
 not+0.250
lf+0.246
 Alf+0.239
 Brown+0.238
 driving+0.230
Neg logit contributions
ited-0.527
able-0.493
ok-0.482
cy-0.462
 single-0.450
Token They
Token ID1119
Feature activation-5.66e-02

Pos logit contributions
 not+0.032
lf+0.032
 driving+0.030
 Brown+0.029
 Alf+0.029
Neg logit contributions
ited-0.062
able-0.058
ok-0.057
cy-0.056
Ok-0.054
Token had
Token ID550
Feature activation-2.07e-02

Pos logit contributions
 driving+0.079
lf+0.078
opes+0.078
 pretending+0.074
 Alf+0.074
Neg logit contributions
ited-0.134
able-0.121
cy-0.119
ok-0.112
Ok-0.110
Token fun
Token ID1257
Feature activation+1.18e-02

Pos logit contributions
lf+0.029
 Alf+0.026
 driving+0.025
 not+0.025
 Brown+0.025
Neg logit contributions
ited-0.049
able-0.045
ok-0.044
Ok-0.042
cy-0.042
Token at
Token ID379
Feature activation-2.82e-02

Pos logit contributions
ited+0.025
able+0.024
ok+0.023
 single+0.022
cy+0.022
Neg logit contributions
 Alf-0.017
 Zig-0.017
 Brown-0.016
lf-0.016
aurus-0.016
Token to
Token ID284
Feature activation-1.87e-02

Pos logit contributions
ited+0.094
able+0.088
ok+0.084
 single+0.081
cy+0.079
Neg logit contributions
lf-0.088
 not-0.083
 Brown-0.083
 Zig-0.082
 driving-0.082
Token know
Token ID760
Feature activation-8.62e-02

Pos logit contributions
lf+0.028
 Brown+0.027
 Zig+0.027
 driving+0.026
 Alf+0.026
Neg logit contributions
ited-0.039
able-0.036
ok-0.035
 single-0.034
cy-0.034
Token more
Token ID517
Feature activation+3.08e-02

Pos logit contributions
lf+0.122
 Brown+0.116
 not+0.115
 Alf+0.114
 Zig+0.113
Neg logit contributions
ited-0.178
able-0.173
ok-0.162
 single-0.158
Ok-0.155
Token about
Token ID546
Feature activation-1.05e-01

Pos logit contributions
able+0.067
ited+0.067
ok+0.062
 single+0.060
cy+0.057
Neg logit contributions
 Brown-0.043
 not-0.042
lf-0.041
 Zig-0.041
 Alf-0.040
Token other
Token ID584
Feature activation-2.33e-02

Pos logit contributions
lf+0.139
 Brown+0.134
 not+0.134
 Zig+0.131
 driving+0.130
Neg logit contributions
ited-0.234
able-0.226
 single-0.208
cy-0.207
ok-0.207
Token nations
Token ID7027
Feature activation-2.09e-01

Pos logit contributions
 Brown+0.029
 not+0.029
lf+0.028
 Zig+0.028
 driving+0.027
Neg logit contributions
ited-0.056
able-0.055
ok-0.051
cy-0.050
eful-0.049
Token.
Token ID13
Feature activation-1.35e-02

Pos logit contributions
lf+0.265
 not+0.253
 Brown+0.251
 Alf+0.251
 driving+0.247
Neg logit contributions
ited-0.494
able-0.476
ok-0.456
cy-0.443
 single-0.430
Token They
Token ID1119
Feature activation-5.17e-02

Pos logit contributions
 not+0.013
 Brown+0.012
 Max+0.011
 Alf+0.011
 Zig+0.010
Neg logit contributions
ited-0.041
able-0.037
ok-0.036
 single-0.034
cy-0.034
Token ask
Token ID1265
Feature activation+3.22e-02

Pos logit contributions
lf+0.085
 not+0.084
 Brown+0.082
 driving+0.081
 Alf+0.079
Neg logit contributions
ited-0.110
able-0.098
ok-0.097
cy-0.094
Ok-0.090
Token the
Token ID262
Feature activation+3.63e-03

Pos logit contributions
ited+0.076
able+0.072
ok+0.070
cy+0.068
Ok+0.066
Neg logit contributions
 not-0.041
lf-0.041
 driving-0.039
 Brown-0.039
 Alf-0.038
Token teacher
Token ID4701
Feature activation+2.92e-02

Pos logit contributions
ited+0.009
able+0.008
ok+0.008
cy+0.008
Ok+0.008
Neg logit contributions
 not-0.004
lf-0.004
 Brown-0.004
 driving-0.004
 Alf-0.004
Token inside
Token ID2641
Feature activation-6.38e-02

Pos logit contributions
lf+0.072
 Alf+0.070
 Zig+0.069
 Brown+0.068
 not+0.068
Neg logit contributions
ited-0.114
able-0.110
ok-0.102
 single-0.098
mer-0.098
Token,
Token ID11
Feature activation-5.21e-02

Pos logit contributions
 not+0.074
lf+0.071
 Brown+0.069
 driving+0.068
 Alf+0.068
Neg logit contributions
ited-0.161
able-0.152
ok-0.145
cy-0.143
 single-0.139
Token but
Token ID475
Feature activation-5.45e-03

Pos logit contributions
 not+0.062
lf+0.058
 Brown+0.057
 driving+0.056
 Alf+0.055
Neg logit contributions
ited-0.135
able-0.124
ok-0.123
cy-0.118
Ok-0.115
Token the
Token ID262
Feature activation-3.50e-02

Pos logit contributions
 not+0.005
lf+0.005
 Brown+0.004
 driving+0.004
 Alf+0.004
Neg logit contributions
ited-0.015
able-0.014
ok-0.014
cy-0.014
Ok-0.013
Token police
Token ID1644
Feature activation-9.79e-02

Pos logit contributions
lf+0.048
opes+0.045
 Alf+0.045
 Zig+0.044
 Brown+0.043
Neg logit contributions
ited-0.076
able-0.074
 single-0.070
ok-0.069
Ok-0.067
Token officers
Token ID3790
Feature activation-2.08e-01

Pos logit contributions
lf+0.131
 not+0.131
opes+0.130
 Zig+0.125
 Alf+0.123
Neg logit contributions
ited-0.216
able-0.205
ok-0.191
cy-0.184
mer-0.182
Token did
Token ID750
Feature activation-8.90e-02

Pos logit contributions
lf+0.251
 Brown+0.247
 not+0.246
 Zig+0.244
 Alf+0.242
Neg logit contributions
ited-0.523
able-0.471
ok-0.464
cy-0.444
Ok-0.440
Token not
Token ID407
Feature activation-5.82e-02

Pos logit contributions
lf+0.064
 Alf+0.064
 Brown+0.064
 Zig+0.059
opes+0.054
Neg logit contributions
ited-0.256
able-0.243
ok-0.240
 single-0.231
Ok-0.230
Token listen
Token ID6004
Feature activation-7.61e-03

Pos logit contributions
lf+0.093
 Brown+0.089
 not+0.089
 Alf+0.088
 Zig+0.087
Neg logit contributions
ited-0.125
able-0.115
ok-0.114
cy-0.105
Ok-0.103
Token.
Token ID13
Feature activation-1.96e-02

Pos logit contributions
 not+0.008
 driving+0.006
 Brown+0.006
lf+0.006
 Alf+0.006
Neg logit contributions
ited-0.022
ok-0.020
able-0.020
cy-0.020
Ok-0.019
Token\n
Token ID198
Feature activation-2.26e-02

Pos logit contributions
 not+0.024
lf+0.022
 Brown+0.022
 driving+0.021
 Alf+0.021
Neg logit contributions
ited-0.051
able-0.047
ok-0.046
cy-0.044
Ok-0.043
Token I
Token ID314
Feature activation+1.39e-01

Pos logit contributions
ited+0.176
able+0.160
ok+0.159
cy+0.152
Ok+0.148
Neg logit contributions
 not-0.139
lf-0.136
 driving-0.133
 Alf-0.132
 Brown-0.131
Token see
Token ID766
Feature activation-2.11e-02

Pos logit contributions
ited+0.350
able+0.322
ok+0.320
cy+0.312
 single+0.303
Neg logit contributions
lf-0.161
 not-0.157
 driving-0.152
 Alf-0.152
 Zig-0.150
Token that
Token ID326
Feature activation-5.57e-02

Pos logit contributions
 not+0.029
lf+0.028
aurus+0.027
 driving+0.026
 Zig+0.026
Neg logit contributions
ited-0.046
able-0.044
cy-0.041
ok-0.040
 single-0.039
Token book
Token ID1492
Feature activation-1.22e-01

Pos logit contributions
 not+0.081
opes+0.080
lf+0.076
 Zig+0.075
 Alf+0.074
Neg logit contributions
ited-0.113
able-0.113
 single-0.104
cy-0.100
ok-0.098
Token about
Token ID546
Feature activation-1.27e-01

Pos logit contributions
 not+0.155
 Alf+0.149
lf+0.148
 Brown+0.148
 driving+0.146
Neg logit contributions
ited-0.291
able-0.273
ok-0.267
cy-0.256
 single-0.251
Token dinosaurs
Token ID27199
Feature activation-2.06e-01

Pos logit contributions
aurus+0.138
 not+0.136
lf+0.131
 Alf+0.126
 Zig+0.124
Neg logit contributions
able-0.273
ited-0.272
 single-0.261
cy-0.251
rous-0.248
Token?
Token ID30
Feature activation+9.15e-03

Pos logit contributions
 not+0.317
 Alf+0.304
 Brown+0.303
lf+0.303
 Zig+0.300
Neg logit contributions
ited-0.446
able-0.415
ok-0.401
cy-0.384
 single-0.371
Token It
Token ID632
Feature activation-5.81e-03

Pos logit contributions
ited+0.016
able+0.015
ok+0.015
cy+0.014
Ok+0.014
Neg logit contributions
 not-0.017
lf-0.017
 driving-0.016
 Brown-0.016
 Alf-0.016
Token looks
Token ID3073
Feature activation+4.86e-02

Pos logit contributions
lf+0.010
 driving+0.009
 Alf+0.009
 not+0.009
aurus+0.009
Neg logit contributions
ited-0.011
able-0.011
cy-0.010
ok-0.010
 single-0.010
Token fun
Token ID1257
Feature activation+8.73e-03

Pos logit contributions
ited+0.117
ok+0.107
able+0.106
cy+0.103
Ok+0.100
Neg logit contributions
 not-0.069
 Brown-0.062
lf-0.062
 driving-0.061
 Alf-0.060
Token,"
Token ID553
Feature activation+6.43e-03

Pos logit contributions
ited+0.022
able+0.020
ok+0.020
cy+0.019
Ok+0.019
Neg logit contributions
 not-0.011
lf-0.011
 Brown-0.010
 driving-0.010
 Alf-0.010
Token and
Token ID290
Feature activation+5.74e-04

Pos logit contributions
 not+0.035
lf+0.035
 Brown+0.033
 Alf+0.033
 driving+0.033
Neg logit contributions
ited-0.072
able-0.066
ok-0.066
cy-0.063
Ok-0.062
Token happy
Token ID3772
Feature activation-2.67e-02

Pos logit contributions
ited+0.001
able+0.001
ok+0.001
cy+0.001
Ok+0.001
Neg logit contributions
lf-0.001
 not-0.001
 Brown-0.001
 Alf-0.001
 driving-0.001
Token.
Token ID13
Feature activation-2.70e-02

Pos logit contributions
lf+0.040
 Zig+0.038
 Alf+0.038
 not+0.038
opes+0.037
Neg logit contributions
ited-0.054
able-0.053
ok-0.052
cy-0.049
 single-0.048
Token\n
Token ID198
Feature activation-2.01e-02

Pos logit contributions
 not+0.032
lf+0.030
 Brown+0.029
 driving+0.029
 Alf+0.028
Neg logit contributions
ited-0.070
able-0.065
ok-0.064
cy-0.062
Ok-0.060
Token\n
Token ID198
Feature activation-6.96e-03

Pos logit contributions
lf+0.029
 driving+0.029
 not+0.028
 pretending+0.028
 Brown+0.027
Neg logit contributions
ited-0.041
able-0.040
ok-0.040
cy-0.037
 single-0.037
Token"
Token ID1
Feature activation+1.05e-01

Pos logit contributions
 not+0.012
lf+0.011
 driving+0.010
 Brown+0.010
 Zig+0.010
Neg logit contributions
ited-0.015
ok-0.013
able-0.013
cy-0.013
 single-0.012
TokenMom
Token ID29252
Feature activation+9.36e-02

Pos logit contributions
ited+0.170
able+0.141
ok+0.136
cy+0.131
 single+0.131
Neg logit contributions
 not-0.229
lf-0.221
 Brown-0.219
 driving-0.210
 Zig-0.208
Token,
Token ID11
Feature activation-7.98e-03

Pos logit contributions
ited+0.236
ok+0.214
able+0.213
cy+0.210
Ok+0.197
Neg logit contributions
lf-0.111
 not-0.110
 Zig-0.105
 Alf-0.102
 driving-0.101
Token we
Token ID356
Feature activation+1.16e-01

Pos logit contributions
lf+0.013
opes+0.013
 driving+0.013
 not+0.012
 Alf+0.012
Neg logit contributions
ited-0.015
cy-0.014
able-0.014
ok-0.014
 single-0.013
Token're
Token ID821
Feature activation-1.12e-02

Pos logit contributions
ited+0.284
able+0.256
cy+0.250
ok+0.247
Ok+0.230
Neg logit contributions
lf-0.170
 driving-0.159
 Zig-0.158
 Brown-0.154
 Alf-0.150
Token sorry
Token ID7926
Feature activation-1.20e-02

Pos logit contributions
opes+0.021
lf+0.021
aurus+0.019
cases+0.019
ops+0.019
Neg logit contributions
ited-0.018
 forward-0.017
cy-0.017
ok-0.017
eful-0.016
Token,
Token ID11
Feature activation-1.36e-02

Pos logit contributions
 not+0.082
 Brown+0.071
lf+0.068
 driving+0.067
 Alf+0.065
Neg logit contributions
ited-0.281
able-0.256
ok-0.255
cy-0.250
Ok-0.240
Token �
Token ID6184
Feature activation+1.10e-02

Pos logit contributions
 not+0.013
 Brown+0.011
lf+0.011
 driving+0.011
 Alf+0.011
Neg logit contributions
ited-0.040
able-0.037
ok-0.036
cy-0.035
Ok-0.034
Token�
Token ID95
Feature activation+1.32e-02

Pos logit contributions
ited+0.023
able+0.020
ok+0.019
cy+0.019
 single+0.018
Neg logit contributions
 not-0.020
lf-0.019
 Brown-0.018
 driving-0.018
 Zig-0.018
Token€
Token ID26391
Feature activation-2.74e-02

Pos logit contributions
ited+0.028
able+0.025
cy+0.024
ok+0.023
 single+0.023
Neg logit contributions
 not-0.023
lf-0.021
 Brown-0.021
 driving-0.021
 Zig-0.021
Token�
Token ID129
Feature activation+1.94e-02

Pos logit contributions
lf+0.053
aurus+0.052
 Alf+0.051
 not+0.051
 driving+0.051
Neg logit contributions
ited-0.044
ok-0.041
able-0.040
Ok-0.038
cy-0.037
Token�
Token ID241
Feature activation+1.20e-01

Pos logit contributions
ited+0.039
able+0.034
ok+0.034
Ok+0.032
cy+0.032
Neg logit contributions
 not-0.034
lf-0.033
 Brown-0.033
 Alf-0.032
 driving-0.032
TokenFrank
Token ID17439
Feature activation-3.66e-02

Pos logit contributions
ited+0.257
able+0.217
cy+0.215
ok+0.209
sie+0.206
Neg logit contributions
 not-0.216
 Brown-0.198
 driving-0.195
 Zig-0.188
 Alf-0.183
Tokenie
Token ID494
Feature activation-1.53e-02

Pos logit contributions
 not+0.067
 Brown+0.064
lf+0.063
 Alf+0.062
 driving+0.061
Neg logit contributions
ited-0.071
ok-0.064
able-0.061
Ok-0.060
cy-0.058
Token,
Token ID11
Feature activation-2.36e-02

Pos logit contributions
 not+0.016
 Brown+0.014
lf+0.014
 driving+0.013
 Alf+0.013
Neg logit contributions
ited-0.043
ok-0.040
able-0.039
cy-0.038
Ok-0.037
Token this
Token ID428
Feature activation-1.67e-02

Pos logit contributions
 not+0.030
lf+0.027
 Brown+0.027
 driving+0.026
 Alf+0.026
Neg logit contributions
ited-0.061
able-0.056
ok-0.055
cy-0.053
Ok-0.051
Token microphone
Token ID21822
Feature activation-6.39e-02

Pos logit contributions
 not+0.022
lf+0.020
 Brown+0.020
 driving+0.019
 Alf+0.019
Neg logit contributions
ited-0.042
ok-0.038
able-0.038
cy-0.036
Ok-0.036
Token The
Token ID383
Feature activation-3.85e-02

Pos logit contributions
lf+0.028
 not+0.028
 driving+0.027
 pretending+0.025
 Brown+0.025
Neg logit contributions
ited-0.043
able-0.040
ok-0.040
cy-0.038
Ok-0.038
Token girl
Token ID2576
Feature activation+2.43e-03

Pos logit contributions
 not+0.070
lf+0.066
 Brown+0.065
 driving+0.065
 Alf+0.062
Neg logit contributions
ited-0.077
ok-0.069
able-0.068
cy-0.064
Ok-0.063
Token cl
Token ID537
Feature activation+3.84e-02

Pos logit contributions
ited+0.006
able+0.005
cy+0.005
ok+0.005
Ok+0.005
Neg logit contributions
lf-0.003
 not-0.003
 driving-0.003
 Alf-0.003
 Brown-0.003
Tokenapped
Token ID6320
Feature activation+1.27e-02

Pos logit contributions
ited+0.092
able+0.081
ok+0.080
 single+0.078
Ok+0.076
Neg logit contributions
 not-0.051
 Brown-0.048
 Alf-0.048
aurus-0.047
 Zig-0.046
Token her
Token ID607
Feature activation+3.54e-02

Pos logit contributions
ited+0.036
ok+0.033
able+0.032
cy+0.031
Ok+0.031
Neg logit contributions
 not-0.016
 about-0.012
lf-0.012
 Brown-0.012
 driving-0.011
Token hands
Token ID2832
Feature activation+1.05e-01

Pos logit contributions
ited+0.081
able+0.072
ok+0.071
Ok+0.068
cy+0.067
Neg logit contributions
 not-0.057
 Brown-0.052
 driving-0.050
lf-0.049
 Zig-0.049
Token and
Token ID290
Feature activation+1.82e-02

Pos logit contributions
ited+0.276
able+0.254
ok+0.251
cy+0.242
Ok+0.237
Neg logit contributions
 not-0.123
lf-0.113
 Brown-0.111
 driving-0.110
 Alf-0.107
Token the
Token ID262
Feature activation+6.34e-03

Pos logit contributions
ited+0.035
able+0.033
cy+0.032
yes+0.030
ok+0.030
Neg logit contributions
opes-0.029
lf-0.028
 driving-0.028
 Alf-0.027
aurus-0.026
Token dog
Token ID3290
Feature activation-5.47e-02

Pos logit contributions
ited+0.015
able+0.013
ok+0.013
cy+0.013
Ok+0.012
Neg logit contributions
 not-0.009
lf-0.009
 Brown-0.009
 driving-0.009
 Alf-0.008
Token w
Token ID266
Feature activation-3.00e-02

Pos logit contributions
lf+0.086
 driving+0.080
 Alf+0.080
 not+0.078
 Brown+0.077
Neg logit contributions
ited-0.119
able-0.107
cy-0.104
ok-0.104
yes-0.097
Tokenagged
Token ID14655
Feature activation+3.65e-02

Pos logit contributions
 not+0.057
lf+0.054
 driving+0.053
opes+0.052
 Zig+0.052
Neg logit contributions
ited-0.054
able-0.047
ok-0.046
cy-0.046
 single-0.046
Token was
Token ID373
Feature activation-5.20e-03

Pos logit contributions
 not+0.047
lf+0.045
 Brown+0.045
 Alf+0.045
 driving+0.045
Neg logit contributions
ited-0.076
able-0.067
cy-0.065
ok-0.065
Ok-0.063
Token carrying
Token ID6872
Feature activation-1.59e-03

Pos logit contributions
lf+0.009
opes+0.008
aurus+0.008
 Alf+0.008
ops+0.008
Neg logit contributions
ited-0.010
 single-0.009
able-0.009
ok-0.008
eful-0.008
Token a
Token ID257
Feature activation-1.60e-02

Pos logit contributions
 not+0.002
lf+0.002
 driving+0.001
 Brown+0.001
 Alf+0.001
Neg logit contributions
ited-0.004
able-0.004
ok-0.004
cy-0.004
 single-0.004
Token wide
Token ID3094
Feature activation+9.84e-02

Pos logit contributions
 not+0.021
lf+0.020
 Brown+0.020
 driving+0.019
 Alf+0.019
Neg logit contributions
ited-0.040
ok-0.036
able-0.036
cy-0.034
Ok-0.034
Token,
Token ID11
Feature activation+6.70e-03

Pos logit contributions
ited+0.253
ok+0.234
able+0.234
cy+0.223
Ok+0.222
Neg logit contributions
 not-0.119
lf-0.119
 Brown-0.112
 driving-0.110
 Alf-0.105
Token sunny
Token ID27737
Feature activation+1.17e-01

Pos logit contributions
ited+0.015
ok+0.014
able+0.014
cy+0.013
Ok+0.013
Neg logit contributions
 not-0.011
 driving-0.010
 Brown-0.010
lf-0.010
 pretending-0.009
Token smile
Token ID8212
Feature activation+2.92e-02

Pos logit contributions
ited+0.282
able+0.255
ok+0.252
cy+0.241
Ok+0.240
Neg logit contributions
lf-0.165
 not-0.162
 Brown-0.152
 driving-0.151
 Alf-0.141
Token.
Token ID13
Feature activation-6.76e-03

Pos logit contributions
ited+0.066
able+0.062
ok+0.060
cy+0.058
 single+0.058
Neg logit contributions
 not-0.041
lf-0.040
 Alf-0.038
 Zig-0.038
 Brown-0.037
Token Everyone
Token ID11075
Feature activation-3.09e-02

Pos logit contributions
 not+0.009
lf+0.008
 Brown+0.008
 driving+0.008
 Alf+0.008
Neg logit contributions
ited-0.017
able-0.015
ok-0.015
cy-0.015
Ok-0.014
Token was
Token ID373
Feature activation-1.93e-02

Pos logit contributions
 not+0.039
lf+0.037
 Brown+0.036
 driving+0.035
 Alf+0.035
Neg logit contributions
ited-0.078
able-0.071
ok-0.071
cy-0.068
Ok-0.066
Token so
Token ID523
Feature activation+2.15e-02

Pos logit contributions
lf+0.033
opes+0.032
aurus+0.030
 Zig+0.030
 bones+0.029
Neg logit contributions
ited-0.037
able-0.033
ok-0.032
 single-0.032
eful-0.032
Token Lily
Token ID20037
Feature activation+2.47e-02

Pos logit contributions
ited+0.102
able+0.095
ok+0.094
cy+0.091
 single+0.089
Neg logit contributions
lf-0.048
 not-0.048
 driving-0.046
 Brown-0.043
 Alf-0.043
Token says
Token ID1139
Feature activation-7.03e-02

Pos logit contributions
ited+0.062
able+0.057
ok+0.056
cy+0.054
 single+0.053
Neg logit contributions
 not-0.032
lf-0.029
 Brown-0.029
 driving-0.028
 Alf-0.028
Token.
Token ID13
Feature activation+1.68e-02

Pos logit contributions
 not+0.062
 driving+0.047
 Brown+0.045
 pretending+0.042
 about+0.042
Neg logit contributions
ited-0.224
able-0.201
cy-0.194
Ok-0.191
 single-0.190
Token
Token ID198
Feature activation+9.45e-03

Pos logit contributions
ited+0.058
able+0.050
ok+0.048
mer+0.048
cy+0.048
Neg logit contributions
 not-0.014
 Brown-0.012
 Max-0.011
 Tom-0.010
 Tim-0.009
Token
Token ID198
Feature activation+1.75e-02

Pos logit contributions
ited+0.025
able+0.023
ok+0.023
cy+0.022
Ok+0.022
Neg logit contributions
 not-0.011
lf-0.010
 Brown-0.010
 driving-0.010
 Alf-0.009
TokenMom
Token ID29252
Feature activation+1.04e-01

Pos logit contributions
ited+0.044
mer+0.035
able+0.035
 single+0.034
ok+0.033
Neg logit contributions
 not-0.030
 Max-0.026
lf-0.026
 Brown-0.025
 Tom-0.024
Token unt
Token ID1418
Feature activation+8.00e-02

Pos logit contributions
ited+0.201
able+0.181
ok+0.175
Ok+0.167
cy+0.166
Neg logit contributions
 not-0.195
lf-0.181
 Brown-0.181
 driving-0.180
 Alf-0.173
Tokenw
Token ID86
Feature activation+2.23e-02

Pos logit contributions
ited+0.152
Ok+0.138
ok+0.133
yes+0.128
 single+0.128
Neg logit contributions
 not-0.135
lf-0.129
 Brown-0.129
 cars-0.127
 Zig-0.127
Tokenists
Token ID1023
Feature activation-1.87e-02

Pos logit contributions
ited+0.056
able+0.051
ok+0.049
Ok+0.048
cy+0.047
Neg logit contributions
 not-0.031
 Brown-0.029
lf-0.028
 driving-0.027
 Max-0.027
Token Lily
Token ID20037
Feature activation+4.13e-02

Pos logit contributions
 not+0.020
 Brown+0.012
 Max+0.012
 about+0.012
 Tom+0.012
Neg logit contributions
ited-0.057
ok-0.050
sie-0.050
able-0.050
cy-0.050
Token's
Token ID338
Feature activation+4.49e-02

Pos logit contributions
ited+0.120
ok+0.109
able+0.109
Ok+0.107
cy+0.105
Neg logit contributions
 not-0.044
 Brown-0.032
 driving-0.031
 pretending-0.029
 about-0.029
Token.
Token ID13
Feature activation-2.21e-02

Pos logit contributions
 not+0.025
lf+0.025
 Alf+0.024
 Brown+0.024
 Zig+0.023
Neg logit contributions
ited-0.046
able-0.043
ok-0.042
cy-0.041
 single-0.039
Token That
Token ID1320
Feature activation+6.39e-03

Pos logit contributions
 not+0.024
 Brown+0.023
 Alf+0.022
 Max+0.022
 driving+0.021
Neg logit contributions
ited-0.063
able-0.056
ok-0.055
cy-0.053
 single-0.052
Token pencil
Token ID21613
Feature activation-5.57e-02

Pos logit contributions
ited+0.016
able+0.015
ok+0.014
cy+0.014
 single+0.014
Neg logit contributions
 not-0.008
lf-0.008
 Brown-0.008
 Zig-0.007
 Alf-0.007
Token could
Token ID714
Feature activation-5.80e-03

Pos logit contributions
lf+0.080
 not+0.079
 Brown+0.078
 Alf+0.077
 Zig+0.077
Neg logit contributions
ited-0.126
able-0.118
ok-0.113
cy-0.109
 single-0.105
Token hurt
Token ID5938
Feature activation-1.01e-01

Pos logit contributions
lf+0.008
 Alf+0.008
 driving+0.007
opes+0.007
 pretending+0.007
Neg logit contributions
ited-0.013
ok-0.012
able-0.012
 single-0.011
cy-0.011
Token you
Token ID345
Feature activation+4.51e-02

Pos logit contributions
 not+0.136
 Brown+0.125
lf+0.124
 driving+0.121
 Alf+0.119
Neg logit contributions
ited-0.253
ok-0.226
able-0.225
cy-0.215
Ok-0.212
Token.
Token ID13
Feature activation-2.04e-02

Pos logit contributions
ited+0.116
able+0.107
ok+0.106
cy+0.102
Ok+0.099
Neg logit contributions
 not-0.053
lf-0.051
 Brown-0.049
 driving-0.048
 Alf-0.048
Tokenâ
Token ID22940
Feature activation+7.10e-02

Pos logit contributions
 not+0.021
 Brown+0.020
 Alf+0.019
 Max+0.019
 driving+0.018
Neg logit contributions
ited-0.058
able-0.052
ok-0.051
cy-0.050
 single-0.049
Token€
Token ID26391
Feature activation-3.57e-02

Pos logit contributions
ited+0.127
able+0.114
 single+0.112
cy+0.107
ok+0.105
Neg logit contributions
 not-0.135
lf-0.128
 driving-0.123
 Brown-0.121
 Zig-0.118
Token Jack
Token ID3619
Feature activation-3.17e-02

Pos logit contributions
lf+0.054
 Alf+0.051
aurus+0.051
 Brown+0.051
 driving+0.050
Neg logit contributions
ited-0.070
ok-0.067
able-0.065
cy-0.064
Ok-0.063
Token was
Token ID373
Feature activation-7.32e-03

Pos logit contributions
 not+0.037
lf+0.035
 Brown+0.034
 driving+0.033
 Alf+0.033
Neg logit contributions
ited-0.082
able-0.076
ok-0.075
cy-0.072
Ok-0.071
Token they
Token ID484
Feature activation-3.99e-02

Pos logit contributions
ited+0.016
able+0.015
ok+0.015
cy+0.015
Ok+0.014
Neg logit contributions
 not-0.006
lf-0.006
 Brown-0.005
 driving-0.005
 Alf-0.005
Token are
Token ID389
Feature activation-3.42e-02

Pos logit contributions
lf+0.058
 driving+0.055
 Brown+0.055
 Alf+0.052
 Zig+0.051
Neg logit contributions
ited-0.093
able-0.083
cy-0.083
ok-0.081
Ok-0.077
Token too
Token ID1165
Feature activation+3.44e-02

Pos logit contributions
lf+0.055
 not+0.053
 Brown+0.053
 Alf+0.052
 driving+0.051
Neg logit contributions
ited-0.073
able-0.066
ok-0.064
cy-0.061
 single-0.060
Token high
Token ID1029
Feature activation+2.27e-02

Pos logit contributions
ited+0.080
ok+0.069
Ok+0.068
use+0.067
cy+0.067
Neg logit contributions
 not-0.061
 driving-0.046
lf-0.046
 Brown-0.046
 Tom-0.045
Token.
Token ID13
Feature activation-1.14e-02

Pos logit contributions
ited+0.057
able+0.053
ok+0.052
cy+0.050
 single+0.048
Neg logit contributions
lf-0.028
 not-0.027
 Alf-0.026
 Brown-0.025
 driving-0.025
Token
Token ID198
Feature activation-4.41e-03

Pos logit contributions
 not+0.010
 Max+0.009
 Brown+0.008
 Tom+0.008
 Tim+0.007
Neg logit contributions
ited-0.036
able-0.033
ok-0.032
kind-0.030
 single-0.030
Token
Token ID198
Feature activation+9.09e-03

Pos logit contributions
 not+0.004
lf+0.003
 Brown+0.003
 Zig+0.003
 driving+0.003
Neg logit contributions
ited-0.013
able-0.012
ok-0.012
cy-0.011
Ok-0.011
Token"
Token ID1
Feature activation+1.16e-01

Pos logit contributions
ited+0.021
ok+0.018
yes+0.018
isher+0.018
 single+0.018
Neg logit contributions
 not-0.014
lf-0.012
 Max-0.011
 Tom-0.011
 Zig-0.010
TokenLet
Token ID5756
Feature activation+1.44e-01

Pos logit contributions
ited+0.221
able+0.189
away+0.181
isher+0.176
sie+0.175
Neg logit contributions
 not-0.203
 Tom-0.200
 Wh-0.192
 Max-0.191
 Brown-0.187
Token's
Token ID338
Feature activation-8.08e-04

Pos logit contributions
ited+0.377
able+0.346
ok+0.343
cy+0.330
Ok+0.323
Neg logit contributions
 not-0.169
lf-0.160
 Brown-0.155
 driving-0.154
 Alf-0.152
Token climb
Token ID12080
Feature activation+4.06e-02

Pos logit contributions
lf+0.001
 not+0.001
 driving+0.001
 Alf+0.001
 Brown+0.001
Neg logit contributions
ited-0.002
ok-0.002
able-0.002
cy-0.002
Ok-0.002
Token,
Token ID11
Feature activation-2.83e-02

Pos logit contributions
 not+0.024
lf+0.024
 Brown+0.022
 driving+0.022
 Alf+0.022
Neg logit contributions
ited-0.051
able-0.047
ok-0.046
cy-0.045
 single-0.043
Token but
Token ID475
Feature activation+3.75e-03

Pos logit contributions
 not+0.032
 driving+0.028
 Brown+0.027
 pretending+0.026
lf+0.025
Neg logit contributions
ited-0.078
able-0.073
ok-0.072
cy-0.070
Ok-0.068
Token the
Token ID262
Feature activation-4.46e-02

Pos logit contributions
ited+0.010
able+0.009
ok+0.009
cy+0.008
Ok+0.008
Neg logit contributions
lf-0.004
 not-0.004
 Alf-0.004
 Brown-0.004
 driving-0.004
Token bird
Token ID6512
Feature activation-6.20e-02

Pos logit contributions
 not+0.063
lf+0.059
 Brown+0.058
 driving+0.057
 Alf+0.056
Neg logit contributions
ited-0.108
able-0.098
ok-0.097
cy-0.093
Ok-0.091
Token flew
Token ID13112
Feature activation+6.06e-03

Pos logit contributions
 not+0.072
lf+0.066
 Brown+0.064
 driving+0.063
 Alf+0.061
Neg logit contributions
ited-0.163
able-0.150
ok-0.150
cy-0.144
Ok-0.141
Token away
Token ID1497
Feature activation+5.74e-03

Pos logit contributions
ited+0.011
able+0.009
ok+0.009
ability+0.009
cy+0.009
Neg logit contributions
 not-0.013
 Zig-0.012
 Brown-0.012
 all-0.011
 Max-0.011
Token.
Token ID13
Feature activation-1.39e-02

Pos logit contributions
ited+0.017
able+0.015
ok+0.015
cy+0.015
Ok+0.014
Neg logit contributions
 not-0.006
 driving-0.005
 Brown-0.005
lf-0.005
 Alf-0.005
Token Tom
Token ID4186
Feature activation-4.99e-02

Pos logit contributions
lf+0.016
 not+0.015
 driving+0.014
 Brown+0.014
 Alf+0.014
Neg logit contributions
ited-0.036
ok-0.034
able-0.033
Ok-0.032
cy-0.032
Token felt
Token ID2936
Feature activation-1.66e-02

Pos logit contributions
 not+0.062
lf+0.060
 Brown+0.058
 driving+0.058
 Alf+0.058
Neg logit contributions
ited-0.126
able-0.114
ok-0.113
cy-0.109
Ok-0.106
Token sad
Token ID6507
Feature activation+4.37e-03

Pos logit contributions
 not+0.023
 Brown+0.020
 driving+0.020
 Zig+0.020
lf+0.019
Neg logit contributions
ited-0.041
ok-0.038
able-0.037
cy-0.036
Ok-0.035
Token that
Token ID326
Feature activation-4.81e-02

Pos logit contributions
ited+0.010
able+0.009
ok+0.009
 single+0.009
cy+0.009
Neg logit contributions
lf-0.006
 Alf-0.006
 driving-0.006
 not-0.005
 Brown-0.005
Token something
Token ID1223
Feature activation-4.70e-02

Pos logit contributions
 not+0.073
lf+0.069
 Brown+0.068
 driving+0.066
 Alf+0.066
Neg logit contributions
ited-0.079
able-0.071
ok-0.070
cy-0.066
Ok-0.064
Token caught
Token ID4978
Feature activation-3.83e-02

Pos logit contributions
 not+0.065
lf+0.062
 Brown+0.060
 driving+0.060
 Alf+0.059
Neg logit contributions
ited-0.112
able-0.102
ok-0.101
cy-0.097
Ok-0.095
Token Molly
Token ID30236
Feature activation+4.54e-02

Pos logit contributions
 not+0.043
lf+0.039
 Brown+0.039
 driving+0.037
 Max+0.036
Neg logit contributions
ited-0.105
ok-0.095
able-0.093
Ok-0.093
Okay-0.089
Token's
Token ID338
Feature activation+3.95e-02

Pos logit contributions
ited+0.112
able+0.103
ok+0.101
cy+0.098
 single+0.096
Neg logit contributions
 not-0.057
lf-0.054
 Alf-0.054
 Brown-0.053
 Zig-0.052
Token eye
Token ID4151
Feature activation-3.70e-03

Pos logit contributions
ited+0.098
able+0.088
ok+0.087
cy+0.083
Ok+0.082
Neg logit contributions
 not-0.054
 Brown-0.050
lf-0.049
 driving-0.049
 Alf-0.047
Token.
Token ID13
Feature activation+2.04e-03

Pos logit contributions
 not+0.005
 Alf+0.005
 Zig+0.005
 Brown+0.004
lf+0.004
Neg logit contributions
ited-0.009
able-0.008
ok-0.008
 single-0.008
cy-0.008
Token She
Token ID1375
Feature activation-1.93e-02

Pos logit contributions
ited+0.006
able+0.005
ok+0.005
kind+0.005
cy+0.005
Neg logit contributions
 not-0.003
 Max-0.002
 Brown-0.002
 Alf-0.002
 Tim-0.002
Token saw
Token ID2497
Feature activation-4.36e-02

Pos logit contributions
 not+0.028
lf+0.026
 Brown+0.026
 driving+0.026
 Alf+0.026
Neg logit contributions
ited-0.045
able-0.040
ok-0.040
cy-0.039
Ok-0.037
Token two
Token ID734
Feature activation-8.47e-03

Pos logit contributions
lf+0.047
 not+0.047
 driving+0.045
 Brown+0.042
 Alf+0.042
Neg logit contributions
ited-0.111
able-0.105
ok-0.100
cy-0.099
 single-0.098
Token little
Token ID1310
Feature activation+5.75e-02

Pos logit contributions
 not+0.011
lf+0.011
 Brown+0.011
 Zig+0.011
 Alf+0.010
Neg logit contributions
ited-0.020
able-0.019
ok-0.018
cy-0.018
 single-0.017
Token birds
Token ID10087
Feature activation-6.81e-02

Pos logit contributions
ited+0.147
ok+0.133
able+0.131
Ok+0.127
cy+0.124
Neg logit contributions
 not-0.070
 Brown-0.067
lf-0.065
 Alf-0.065
 driving-0.064
Token asked
Token ID1965
Feature activation+1.80e-02

Pos logit contributions
 not+0.039
lf+0.038
 Brown+0.036
 driving+0.035
 Alf+0.035
Neg logit contributions
ited-0.068
able-0.062
ok-0.061
cy-0.058
 single-0.058
Token her
Token ID607
Feature activation+1.40e-03

Pos logit contributions
ited+0.037
able+0.034
ok+0.033
cy+0.032
Ok+0.031
Neg logit contributions
 not-0.029
lf-0.029
 Alf-0.028
 driving-0.028
 Zig-0.027
Token mom
Token ID1995
Feature activation+8.03e-02

Pos logit contributions
able+0.003
ited+0.003
cy+0.003
ok+0.003
 single+0.003
Neg logit contributions
opes-0.002
 Zig-0.002
lf-0.002
 driving-0.002
aurus-0.002
Token for
Token ID329
Feature activation+1.81e-02

Pos logit contributions
ited+0.189
able+0.175
ok+0.172
cy+0.167
 single+0.161
Neg logit contributions
 not-0.108
lf-0.107
 Brown-0.101
 driving-0.101
 Zig-0.101
Token a
Token ID257
Feature activation-3.22e-02

Pos logit contributions
ited+0.046
able+0.042
ok+0.041
cy+0.041
Ok+0.039
Neg logit contributions
 not-0.025
lf-0.023
 driving-0.022
 Brown-0.022
 Alf-0.021
Token bag
Token ID6131
Feature activation+8.07e-03

Pos logit contributions
 not+0.042
 driving+0.037
 Brown+0.037
lf+0.036
 pretending+0.036
Neg logit contributions
ited-0.085
ok-0.076
able-0.076
eful-0.075
mer-0.073
Token and
Token ID290
Feature activation+2.58e-02

Pos logit contributions
ited+0.018
able+0.017
ok+0.017
cy+0.016
 single+0.015
Neg logit contributions
 not-0.011
lf-0.011
 Brown-0.011
 Alf-0.011
 driving-0.011
Token started
Token ID2067
Feature activation+8.73e-03

Pos logit contributions
ited+0.063
able+0.058
ok+0.057
cy+0.056
Ok+0.055
Neg logit contributions
 not-0.032
lf-0.032
 driving-0.030
 Brown-0.030
 Alf-0.030
Token picking
Token ID10868
Feature activation+1.02e-02

Pos logit contributions
ited+0.022
able+0.020
ok+0.020
cy+0.019
Ok+0.018
Neg logit contributions
 not-0.012
lf-0.011
 driving-0.011
 Brown-0.011
 Alf-0.010
Token up
Token ID510
Feature activation+2.20e-02

Pos logit contributions
ited+0.028
ok+0.026
able+0.025
Ok+0.025
Okay+0.024
Neg logit contributions
 not-0.011
 Brown-0.011
 Alf-0.010
 driving-0.010
 cars-0.009
Token the
Token ID262
Feature activation+4.30e-03

Pos logit contributions
ited+0.055
able+0.052
ok+0.049
cy+0.049
 single+0.047
Neg logit contributions
 not-0.026
lf-0.026
 driving-0.024
 Zig-0.023
 Brown-0.023
Token
Token ID198
Feature activation-2.91e-03

Pos logit contributions
ited+0.114
able+0.104
ok+0.102
cy+0.099
 single+0.096
Neg logit contributions
 not-0.056
 Brown-0.053
lf-0.051
 driving-0.051
 Alf-0.050
Token
Token ID198
Feature activation+4.63e-03

Pos logit contributions
 not+0.003
lf+0.003
 Brown+0.003
 driving+0.003
 Alf+0.003
Neg logit contributions
ited-0.008
able-0.007
ok-0.007
cy-0.007
Ok-0.007
TokenThe
Token ID464
Feature activation-4.09e-03

Pos logit contributions
ited+0.010
able+0.009
cy+0.009
ok+0.008
 single+0.008
Neg logit contributions
 not-0.008
lf-0.008
 driving-0.007
 Zig-0.007
 Brown-0.007
Token raven
Token ID37735
Feature activation-1.92e-02

Pos logit contributions
 not+0.007
lf+0.006
 Brown+0.006
 driving+0.006
 Zig+0.006
Neg logit contributions
ited-0.009
ok-0.008
able-0.008
cy-0.008
Ok-0.008
Token c
Token ID269
Feature activation-4.15e-02

Pos logit contributions
 not+0.027
lf+0.025
 Brown+0.025
 driving+0.025
 Alf+0.025
Neg logit contributions
ited-0.046
able-0.041
ok-0.041
cy-0.039
Ok-0.038
Tokenawed
Token ID36825
Feature activation-4.93e-02

Pos logit contributions
 not+0.070
lf+0.067
 Brown+0.065
 driving+0.064
 Zig+0.063
Neg logit contributions
ited-0.087
able-0.079
ok-0.077
cy-0.075
 single-0.072
Token and
Token ID290
Feature activation+1.69e-02

Pos logit contributions
 not+0.064
 Brown+0.060
 Alf+0.056
 driving+0.056
 Zig+0.055
Neg logit contributions
ited-0.125
ok-0.113
able-0.112
cy-0.107
Ok-0.107
Token flew
Token ID13112
Feature activation+3.73e-03

Pos logit contributions
ited+0.039
able+0.036
ok+0.035
cy+0.034
Ok+0.033
Neg logit contributions
 not-0.024
lf-0.023
 driving-0.022
 Alf-0.022
 Brown-0.022
Token down
Token ID866
Feature activation+5.89e-02

Pos logit contributions
ited+0.007
able+0.005
ok+0.005
cy+0.005
Ok+0.005
Neg logit contributions
 not-0.008
 Brown-0.008
 Zig-0.008
 Alf-0.007
 Max-0.007
Token,
Token ID11
Feature activation-3.28e-02

Pos logit contributions
ited+0.170
able+0.152
ok+0.152
cy+0.148
Ok+0.146
Neg logit contributions
 not-0.061
 Brown-0.055
 driving-0.053
 Zig-0.052
 Alf-0.052
Token getting
Token ID1972
Feature activation+1.68e-02

Pos logit contributions
 not+0.043
lf+0.039
 Brown+0.039
 driving+0.039
 Alf+0.037
Neg logit contributions
ited-0.083
able-0.076
ok-0.075
cy-0.072
Ok-0.070
Token he
Token ID339
Feature activation-1.23e-02

Pos logit contributions
 not+0.003
opes+0.003
lf+0.003
 driving+0.003
aurus+0.003
Neg logit contributions
ited-0.006
able-0.006
cy-0.006
 single-0.006
ok-0.006
Token feels
Token ID5300
Feature activation-1.94e-02

Pos logit contributions
 not+0.017
lf+0.016
 driving+0.016
 Brown+0.016
 Alf+0.015
Neg logit contributions
ited-0.029
ok-0.026
able-0.026
Ok-0.025
cy-0.025
Token sad
Token ID6507
Feature activation-1.19e-02

Pos logit contributions
 not+0.030
 driving+0.026
 pretending+0.025
 about+0.024
 Brown+0.024
Neg logit contributions
ited-0.046
ok-0.044
cy-0.042
Ok-0.041
able-0.041
Token or
Token ID393
Feature activation-5.14e-02

Pos logit contributions
 not+0.012
lf+0.012
 Alf+0.011
 Zig+0.011
 Brown+0.011
Neg logit contributions
ited-0.031
able-0.029
ok-0.029
cy-0.028
 single-0.027
Token discouraged
Token ID30170
Feature activation-3.86e-02

Pos logit contributions
 not+0.086
 driving+0.077
 pretending+0.073
lf+0.072
 Brown+0.072
Neg logit contributions
ited-0.113
ok-0.104
able-0.103
cy-0.100
Ok-0.098
Token,
Token ID11
Feature activation-5.85e-02

Pos logit contributions
 not+0.037
lf+0.035
 Brown+0.034
 driving+0.033
 Alf+0.033
Neg logit contributions
ited-0.108
able-0.100
ok-0.099
cy-0.096
Ok-0.094
Token he
Token ID339
Feature activation-1.32e-02

Pos logit contributions
lf+0.053
 not+0.050
 driving+0.047
 Brown+0.047
 Alf+0.046
Neg logit contributions
ited-0.157
able-0.149
ok-0.143
cy-0.142
 single-0.140
Token remembers
Token ID18140
Feature activation-1.95e-02

Pos logit contributions
 not+0.020
lf+0.019
 Brown+0.018
 driving+0.018
 Alf+0.018
Neg logit contributions
ited-0.030
able-0.027
ok-0.027
cy-0.026
Ok-0.025
Token how
Token ID703
Feature activation-3.05e-03

Pos logit contributions
 not+0.020
lf+0.019
 Brown+0.019
 driving+0.018
 Alf+0.018
Neg logit contributions
ited-0.053
able-0.049
ok-0.048
cy-0.046
Ok-0.046
Token he
Token ID339
Feature activation-2.36e-02

Pos logit contributions
 not+0.004
 driving+0.004
lf+0.003
 Brown+0.003
 having+0.003
Neg logit contributions
ited-0.008
ok-0.007
cy-0.007
able-0.007
Ok-0.007
Token worked
Token ID3111
Feature activation-2.16e-02

Pos logit contributions
 not+0.032
lf+0.030
 Brown+0.029
 driving+0.029
 Alf+0.028
Neg logit contributions
ited-0.057
ok-0.052
able-0.052
cy-0.049
Ok-0.049
Token high
Token ID1029
Feature activation+1.47e-01

Pos logit contributions
 not+0.008
lf+0.007
 pretending+0.006
 driving+0.006
 Max+0.006
Neg logit contributions
ited-0.013
ok-0.013
cy-0.013
able-0.012
Ok-0.012
Token voice
Token ID3809
Feature activation+3.22e-02

Pos logit contributions
ited+0.345
ok+0.319
Ok+0.318
able+0.311
Alright+0.305
Neg logit contributions
lf-0.227
 not-0.214
 driving-0.193
 Brown-0.192
 Zig-0.183
Token.
Token ID13
Feature activation+4.96e-03

Pos logit contributions
ited+0.087
able+0.080
ok+0.079
cy+0.076
Ok+0.075
Neg logit contributions
 not-0.035
lf-0.033
 Brown-0.032
 driving-0.031
 Alf-0.031
Token "
Token ID366
Feature activation+9.38e-02

Pos logit contributions
ited+0.014
able+0.013
ok+0.012
cy+0.012
Ok+0.012
Neg logit contributions
 not-0.005
lf-0.005
 Brown-0.005
 driving-0.005
 Alf-0.005
TokenThe
Token ID464
Feature activation-1.13e-02

Pos logit contributions
ited+0.187
able+0.148
sie+0.145
eful+0.142
ok+0.142
Neg logit contributions
 not-0.177
 Brown-0.158
 driving-0.157
 Zig-0.154
 Wh-0.154
Token dragon
Token ID10441
Feature activation-6.89e-02

Pos logit contributions
 not+0.014
 driving+0.012
lf+0.012
 Brown+0.012
 Alf+0.012
Neg logit contributions
ited-0.030
able-0.027
ok-0.027
cy-0.027
Ok-0.026
Token is
Token ID318
Feature activation+1.97e-02

Pos logit contributions
 not+0.107
lf+0.103
 Brown+0.101
 driving+0.099
 Alf+0.099
Neg logit contributions
ited-0.152
able-0.138
ok-0.135
cy-0.131
 single-0.126
Token coming
Token ID2406
Feature activation+1.63e-02

Pos logit contributions
ited+0.043
able+0.040
 single+0.038
ok+0.037
cy+0.036
Neg logit contributions
lf-0.029
 Alf-0.027
 Brown-0.026
opes-0.025
aurus-0.025
Token to
Token ID284
Feature activation-2.71e-02

Pos logit contributions
ited+0.041
able+0.037
ok+0.036
Ok+0.035
cy+0.034
Neg logit contributions
 not-0.023
 Brown-0.022
 driving-0.021
 Alf-0.020
 Zig-0.020
Token get
Token ID651
Feature activation+5.19e-02

Pos logit contributions
lf+0.037
 not+0.036
opes+0.034
 driving+0.034
cases+0.033
Neg logit contributions
ited-0.061
able-0.056
cy-0.054
 single-0.053
ok-0.053
Token me
Token ID502
Feature activation+8.16e-02

Pos logit contributions
ited+0.134
ok+0.120
able+0.118
Ok+0.115
cy+0.113
Neg logit contributions
 not-0.074
 Brown-0.068
 driving-0.063
 Alf-0.063
 Zig-0.062
Token very
Token ID845
Feature activation+8.48e-02

Pos logit contributions
lf+0.068
 Alf+0.062
 not+0.060
 driving+0.059
aurus+0.057
Neg logit contributions
ited-0.119
able-0.115
cy-0.106
ok-0.104
 single-0.103
Token much
Token ID881
Feature activation+7.23e-02

Pos logit contributions
ited+0.170
able+0.154
ok+0.149
cy+0.141
 single+0.138
Neg logit contributions
 not-0.147
lf-0.145
 Brown-0.142
 driving-0.141
 Alf-0.141
Token.
Token ID13
Feature activation-7.17e-03

Pos logit contributions
ited+0.186
able+0.174
ok+0.169
cy+0.162
 single+0.158
Neg logit contributions
 not-0.086
lf-0.084
 Alf-0.080
 driving-0.079
 Brown-0.079
Token Tim
Token ID5045
Feature activation-3.14e-02

Pos logit contributions
bles+0.012
lf+0.011
ics+0.010
 swords+0.010
aurus+0.010
Neg logit contributions
Ok-0.012
Okay-0.011
ok-0.011
ited-0.011
 single-0.011
Tokenmy
Token ID1820
Feature activation-2.82e-02

Pos logit contributions
 not+0.060
lf+0.058
 driving+0.057
 Brown+0.057
 Alf+0.057
Neg logit contributions
ited-0.059
able-0.053
ok-0.051
cy-0.049
Ok-0.047
Token learned
Token ID4499
Feature activation-7.83e-02

Pos logit contributions
lf+0.042
 Alf+0.042
aurus+0.041
 driving+0.040
 Zig+0.040
Neg logit contributions
ited-0.062
able-0.056
ok-0.053
cy-0.052
Ok-0.050
Token that
Token ID326
Feature activation-5.12e-02

Pos logit contributions
lf+0.108
aurus+0.100
 pretending+0.096
 Alf+0.096
opes+0.095
Neg logit contributions
able-0.165
ited-0.153
 single-0.151
ok-0.150
away-0.149
Token accidents
Token ID17390
Feature activation-9.46e-02

Pos logit contributions
lf+0.076
opes+0.075
aurus+0.074
 Alf+0.073
cases+0.073
Neg logit contributions
able-0.099
ited-0.096
 single-0.094
ok-0.092
yes-0.087
Token can
Token ID460
Feature activation+1.70e-02

Pos logit contributions
lf+0.132
 Alf+0.127
 driving+0.117
aurus+0.117
 Brown+0.117
Neg logit contributions
ited-0.207
able-0.206
ok-0.191
cy-0.185
 single-0.183
Token happen
Token ID1645
Feature activation-7.26e-02

Pos logit contributions
ited+0.033
able+0.032
ok+0.032
use+0.031
 single+0.031
Neg logit contributions
lf-0.026
 pretending-0.023
 Alf-0.023
 driving-0.023
 cousins-0.023
Token and
Token ID290
Feature activation-2.71e-02

Pos logit contributions
lf+0.089
 not+0.085
 Alf+0.084
 Brown+0.082
 driving+0.082
Neg logit contributions
ited-0.175
able-0.169
ok-0.164
cy-0.158
 single-0.154
Token shampoo
Token ID47259
Feature activation-9.34e-03

Pos logit contributions
 not+0.082
lf+0.078
 Brown+0.076
 driving+0.075
 Alf+0.075
Neg logit contributions
ited-0.127
able-0.115
ok-0.114
cy-0.109
Ok-0.106
Token from
Token ID422
Feature activation-1.31e-01

Pos logit contributions
 not+0.012
lf+0.010
 driving+0.010
 Brown+0.010
 Alf+0.010
Neg logit contributions
ited-0.024
able-0.022
ok-0.022
Ok-0.021
cy-0.021
Token getting
Token ID1972
Feature activation-3.15e-03

Pos logit contributions
 not+0.142
lf+0.132
 Brown+0.128
 driving+0.127
 Alf+0.124
Neg logit contributions
ited-0.354
able-0.326
ok-0.324
cy-0.312
Ok-0.306
Token in
Token ID287
Feature activation-1.12e-02

Pos logit contributions
 not+0.005
 driving+0.004
 Brown+0.004
 having+0.004
 cars+0.004
Neg logit contributions
ited-0.008
ok-0.007
able-0.007
cy-0.007
Ok-0.007
Token her
Token ID607
Feature activation-3.68e-02

Pos logit contributions
 not+0.014
 Brown+0.012
lf+0.012
 driving+0.012
 Alf+0.011
Neg logit contributions
ited-0.030
ok-0.028
able-0.027
cy-0.027
Ok-0.026
Token eyes
Token ID2951
Feature activation-3.70e-02

Pos logit contributions
 not+0.050
 Brown+0.044
lf+0.043
 driving+0.043
 pretending+0.041
Neg logit contributions
ited-0.095
ok-0.087
able-0.084
Ok-0.081
cy-0.080
Token.
Token ID13
Feature activation-1.00e-02

Pos logit contributions
 not+0.038
lf+0.032
 Brown+0.032
 driving+0.032
 Alf+0.029
Neg logit contributions
ited-0.104
ok-0.096
able-0.096
cy-0.094
Ok-0.093
Token Lily
Token ID20037
Feature activation+3.87e-05

Pos logit contributions
 not+0.013
lf+0.012
 Brown+0.012
 driving+0.012
 Alf+0.012
Neg logit contributions
ited-0.025
able-0.023
ok-0.022
cy-0.022
Ok-0.021
Token didn
Token ID1422
Feature activation-2.44e-02

Pos logit contributions
ited+0.000
able+0.000
ok+0.000
cy+0.000
Ok+0.000
Neg logit contributions
 not-0.000
lf-0.000
 Brown-0.000
 driving-0.000
 Alf-0.000
Token't
Token ID470
Feature activation-1.06e-02

Pos logit contributions
lf+0.026
 pretending+0.024
 Alf+0.024
opes+0.023
 driving+0.023
Neg logit contributions
ited-0.060
ok-0.058
 single-0.057
able-0.057
cy-0.054
Token understand
Token ID1833
Feature activation-6.95e-02

Pos logit contributions
 not+0.017
lf+0.016
 Brown+0.015
 driving+0.015
 Alf+0.015
Neg logit contributions
ited-0.024
able-0.021
ok-0.021
cy-0.020
Ok-0.020
Token yes
Token ID3763
Feature activation+5.17e-02

Pos logit contributions
 not+0.046
 Brown+0.041
lf+0.040
 driving+0.040
 Alf+0.040
Neg logit contributions
ited-0.066
able-0.058
ok-0.057
cy-0.054
 single-0.053
Token,
Token ID11
Feature activation-4.76e-02

Pos logit contributions
ited+0.153
able+0.141
ok+0.139
cy+0.135
Ok+0.133
Neg logit contributions
 not-0.050
 Brown-0.041
lf-0.040
 driving-0.039
 Alf-0.036
Token you
Token ID345
Feature activation+7.10e-02

Pos logit contributions
lf+0.063
opes+0.058
 not+0.057
 driving+0.057
 Alf+0.057
Neg logit contributions
ited-0.110
able-0.103
ok-0.101
cy-0.100
 single-0.097
Token can
Token ID460
Feature activation+5.25e-02

Pos logit contributions
ited+0.145
able+0.131
ok+0.129
cy+0.125
Ok+0.118
Neg logit contributions
lf-0.125
 not-0.119
 driving-0.117
 Alf-0.116
 Brown-0.116
Token have
Token ID423
Feature activation-7.89e-03

Pos logit contributions
ited+0.120
ok+0.112
cy+0.110
able+0.108
Ok+0.103
Neg logit contributions
lf-0.072
 Alf-0.069
 driving-0.067
 Zig-0.066
 not-0.066
Token cheese
Token ID9891
Feature activation-1.39e-01

Pos logit contributions
 not+0.009
lf+0.009
 Brown+0.008
 driving+0.008
 Alf+0.008
Neg logit contributions
ited-0.021
able-0.019
ok-0.019
cy-0.018
Ok-0.018
Token and
Token ID290
Feature activation-1.61e-02

Pos logit contributions
 not+0.177
lf+0.171
 Brown+0.166
 Alf+0.166
 driving+0.166
Neg logit contributions
ited-0.341
able-0.316
ok-0.311
cy-0.299
Ok-0.289
Token crack
Token ID8469
Feature activation-8.52e-02

Pos logit contributions
 not+0.019
lf+0.018
 driving+0.017
 Brown+0.017
 Alf+0.017
Neg logit contributions
ited-0.042
able-0.038
ok-0.038
cy-0.037
 single-0.036
Tokeners
Token ID364
Feature activation-1.21e-01

Pos logit contributions
 not+0.113
lf+0.101
 Brown+0.100
 Zig+0.093
 driving+0.092
Neg logit contributions
ited-0.223
ok-0.199
Ok-0.194
able-0.190
cy-0.187
Token later
Token ID1568
Feature activation-4.81e-02

Pos logit contributions
 not+0.151
lf+0.135
 Brown+0.134
 driving+0.131
 Alf+0.123
Neg logit contributions
ited-0.313
ok-0.284
able-0.283
cy-0.272
Ok-0.271
Token.
Token ID13
Feature activation-3.39e-02

Pos logit contributions
 not+0.063
lf+0.056
 Brown+0.056
 driving+0.056
 Alf+0.052
Neg logit contributions
ited-0.124
able-0.112
ok-0.109
cy-0.108
Ok-0.105
Token Tommy
Token ID19919
Feature activation-2.16e-02

Pos logit contributions
ited+0.091
cy+0.081
able+0.078
Ok+0.076
ok+0.074
Neg logit contributions
lf-0.070
opes-0.067
ves-0.066
 Zig-0.062
 Alf-0.062
Token were
Token ID547
Feature activation-4.33e-02

Pos logit contributions
lf+0.030
 not+0.030
 driving+0.027
 Brown+0.027
 Alf+0.027
Neg logit contributions
ited-0.053
ok-0.047
able-0.046
cy-0.046
Ok-0.044
Token surprised
Token ID6655
Feature activation-3.02e-02

Pos logit contributions
lf+0.085
aurus+0.078
opes+0.077
 Alf+0.076
 Dino+0.073
Neg logit contributions
ited-0.071
eful-0.065
able-0.063
cy-0.061
 forward-0.061
Token that
Token ID326
Feature activation-5.25e-02

Pos logit contributions
 not+0.038
lf+0.037
 Brown+0.036
 driving+0.035
 Alf+0.035
Neg logit contributions
ited-0.076
able-0.070
ok-0.069
cy-0.066
Ok-0.065
Token the
Token ID262
Feature activation-2.30e-02

Pos logit contributions
lf+0.064
 not+0.058
opes+0.058
aurus+0.054
 driving+0.054
Neg logit contributions
ited-0.127
able-0.123
cy-0.117
ok-0.116
 single-0.116
Token rh
Token ID9529
Feature activation-1.34e-01

Pos logit contributions
lf+0.027
opes+0.025
 not+0.025
cases+0.023
 pretending+0.023
Neg logit contributions
ited-0.056
able-0.054
 single-0.053
ok-0.049
cy-0.049
Tokenin
Token ID259
Feature activation+3.17e-02

Pos logit contributions
 not+0.398
lf+0.388
 Brown+0.384
 driving+0.382
 Alf+0.381
Neg logit contributions
ited-0.109
able-0.083
ok-0.078
cy-0.067
Ok-0.059
Tokenoc
Token ID420
Feature activation-3.89e-02

Pos logit contributions
ited+0.045
able+0.043
cy+0.037
mer+0.037
 single+0.036
Neg logit contributions
 not-0.071
lf-0.069
 pretending-0.067
opes-0.067
 Zig-0.066
Tokeneros
Token ID27498
Feature activation-1.04e-01

Pos logit contributions
 not+0.058
lf+0.055
 Brown+0.053
 driving+0.053
 Alf+0.052
Neg logit contributions
ited-0.089
able-0.081
ok-0.080
cy-0.077
Ok-0.075
Token could
Token ID714
Feature activation+2.62e-03

Pos logit contributions
lf+0.133
 not+0.126
 Alf+0.125
 Brown+0.122
 Zig+0.120
Neg logit contributions
ited-0.251
able-0.234
cy-0.222
ok-0.218
 single-0.212
Token talk
Token ID1561
Feature activation-7.00e-02

Pos logit contributions
ited+0.006
 single+0.005
ok+0.005
yes+0.005
able+0.005
Neg logit contributions
lf-0.003
opes-0.003
 Alf-0.003
 dinosaurs-0.003
 pretending-0.003
Token play
Token ID711
Feature activation-4.20e-02

Pos logit contributions
ited+0.087
able+0.078
ok+0.078
cy+0.074
Ok+0.073
Neg logit contributions
 not-0.055
 Brown-0.051
lf-0.050
 driving-0.050
 Alf-0.049
Token.
Token ID13
Feature activation-1.25e-02

Pos logit contributions
 not+0.056
 Alf+0.055
 Brown+0.055
lf+0.055
 driving+0.053
Neg logit contributions
ited-0.098
able-0.093
ok-0.091
cy-0.086
Ok-0.085
Token They
Token ID1119
Feature activation-5.04e-02

Pos logit contributions
 not+0.013
 Brown+0.012
 Zig+0.011
 Alf+0.011
 driving+0.011
Neg logit contributions
ited-0.037
able-0.033
ok-0.032
cy-0.031
sie-0.030
Token make
Token ID787
Feature activation-3.62e-04

Pos logit contributions
lf+0.072
 driving+0.067
 not+0.066
 Brown+0.066
 Alf+0.066
Neg logit contributions
ited-0.121
able-0.110
ok-0.109
cy-0.107
Ok-0.101
Token more
Token ID517
Feature activation+1.56e-02

Pos logit contributions
lf+0.001
 Brown+0.001
 Alf+0.001
 driving+0.000
aurus+0.000
Neg logit contributions
ited-0.001
able-0.001
ok-0.001
eful-0.001
 tighter-0.001
Token masks
Token ID20680
Feature activation-1.76e-01

Pos logit contributions
able+0.038
ited+0.036
ok+0.034
cy+0.031
ability+0.031
Neg logit contributions
 Brown-0.020
 not-0.020
lf-0.020
 Alf-0.019
 Zig-0.018
Token and
Token ID290
Feature activation+3.30e-03

Pos logit contributions
lf+0.295
 Alf+0.285
opes+0.285
 Brown+0.271
 Zig+0.269
Neg logit contributions
ok-0.330
able-0.319
ited-0.318
 single-0.293
cy-0.290
Token pictures
Token ID5986
Feature activation-1.76e-01

Pos logit contributions
ited+0.007
able+0.007
ok+0.007
cy+0.006
Ok+0.006
Neg logit contributions
lf-0.004
 Brown-0.004
 Alf-0.004
 driving-0.004
 not-0.004
Token.
Token ID13
Feature activation-1.99e-02

Pos logit contributions
lf+0.277
 Alf+0.274
opes+0.267
 not+0.261
 Brown+0.255
Neg logit contributions
able-0.337
ok-0.324
ited-0.311
 single-0.294
cy-0.292
Token They
Token ID1119
Feature activation-5.49e-02

Pos logit contributions
 not+0.022
 Brown+0.020
 Alf+0.019
 driving+0.019
 Zig+0.019
Neg logit contributions
ited-0.056
able-0.051
ok-0.050
cy-0.048
Ok-0.047
Token have
Token ID423
Feature activation+9.83e-03

Pos logit contributions
lf+0.081
 driving+0.075
 Alf+0.074
 Brown+0.074
opes+0.073
Neg logit contributions
ited-0.127
able-0.117
cy-0.114
ok-0.113
yes-0.104
Token they
Token ID484
Feature activation-3.11e-02

Pos logit contributions
ited+0.027
able+0.025
cy+0.024
 single+0.024
ok+0.024
Neg logit contributions
lf-0.015
opes-0.014
 not-0.014
 driving-0.014
 Alf-0.014
Token never
Token ID1239
Feature activation+2.78e-02

Pos logit contributions
lf+0.038
 driving+0.038
 Brown+0.036
 Alf+0.035
 pretending+0.035
Neg logit contributions
ited-0.078
able-0.072
cy-0.071
ok-0.067
Ok-0.066
Token wanted
Token ID2227
Feature activation-2.14e-02

Pos logit contributions
ited+0.067
able+0.061
ok+0.059
cy+0.057
Ok+0.054
Neg logit contributions
lf-0.039
 not-0.038
 Brown-0.037
 Alf-0.036
 driving-0.036
Token to
Token ID284
Feature activation-4.19e-02

Pos logit contributions
lf+0.035
 not+0.035
 Brown+0.033
 driving+0.033
 Alf+0.032
Neg logit contributions
ited-0.044
able-0.041
ok-0.040
cy-0.038
 single-0.038
Token get
Token ID651
Feature activation+3.28e-02

Pos logit contributions
lf+0.068
opes+0.065
 Alf+0.062
 pretending+0.061
cases+0.059
Neg logit contributions
ited-0.082
yes-0.076
ok-0.073
able-0.073
cy-0.072
Token married
Token ID6405
Feature activation-1.25e-01

Pos logit contributions
ited+0.076
able+0.069
ok+0.069
cy+0.066
Ok+0.064
Neg logit contributions
 not-0.048
lf-0.045
 Brown-0.044
 driving-0.044
 Alf-0.043
Token.
Token ID13
Feature activation-2.08e-02

Pos logit contributions
 not+0.144
lf+0.135
 Brown+0.131
 driving+0.129
 Alf+0.127
Neg logit contributions
ited-0.328
able-0.302
ok-0.299
cy-0.288
Ok-0.282
Token\n
Token ID198
Feature activation-1.41e-02

Pos logit contributions
lf+0.030
 not+0.030
 driving+0.029
 pretending+0.027
 Brown+0.026
Neg logit contributions
ited-0.046
able-0.043
ok-0.042
cy-0.042
Ok-0.040
TokenOne
Token ID3198
Feature activation+7.22e-02

Pos logit contributions
 not+0.019
lf+0.018
 driving+0.018
 Brown+0.017
 Alf+0.017
Neg logit contributions
ited-0.033
able-0.031
ok-0.030
cy-0.029
 single-0.029
Token day
Token ID1110
Feature activation+4.64e-02

Pos logit contributions
ited+0.161
ok+0.146
Ok+0.139
able+0.135
use+0.132
Neg logit contributions
 not-0.118
lf-0.110
 Brown-0.105
 driving-0.105
 Zig-0.101
Token,
Token ID11
Feature activation-1.97e-02

Pos logit contributions
ited+0.112
able+0.108
cy+0.104
ok+0.102
 single+0.100
Neg logit contributions
 not-0.055
lf-0.054
 driving-0.050
opes-0.050
 Alf-0.050
TokenL
Token ID43
Feature activation+4.04e-02

Pos logit contributions
ited+0.037
able+0.033
ok+0.033
cy+0.031
Ok+0.030
Neg logit contributions
 not-0.031
lf-0.030
 Brown-0.029
 driving-0.029
 Alf-0.029
Tokenily
Token ID813
Feature activation+4.49e-02

Pos logit contributions
ited+0.078
able+0.074
ok+0.070
cy+0.069
sie+0.063
Neg logit contributions
 driving-0.071
 not-0.070
 Brown-0.070
lf-0.068
 Zig-0.068
Token and
Token ID290
Feature activation+5.10e-02

Pos logit contributions
ited+0.110
able+0.102
ok+0.099
cy+0.098
Ok+0.093
Neg logit contributions
 not-0.058
lf-0.056
 Alf-0.054
 driving-0.053
 Brown-0.053
Token Tom
Token ID4186
Feature activation-3.46e-02

Pos logit contributions
ited+0.114
able+0.111
cy+0.108
ok+0.104
Ok+0.101
Neg logit contributions
lf-0.073
 driving-0.065
opes-0.065
 pretending-0.063
 Alf-0.063
Token were
Token ID547
Feature activation-2.58e-02

Pos logit contributions
lf+0.045
 not+0.044
 Alf+0.041
 driving+0.041
 Zig+0.040
Neg logit contributions
ited-0.086
able-0.081
cy-0.078
ok-0.077
Ok-0.071
Token friends
Token ID2460
Feature activation-1.20e-01

Pos logit contributions
lf+0.030
 Brown+0.027
 not+0.027
 Alf+0.027
 Zig+0.026
Neg logit contributions
ited-0.067
able-0.063
ok-0.062
cy-0.060
 single-0.058
Token who
Token ID508
Feature activation-4.80e-02

Pos logit contributions
 not+0.127
lf+0.109
 driving+0.108
 Brown+0.107
 Alf+0.098
Neg logit contributions
ited-0.336
able-0.307
ok-0.304
cy-0.295
Ok-0.293
Token liked
Token ID8288
Feature activation-8.26e-03

Pos logit contributions
lf+0.079
 not+0.075
 Alf+0.073
 driving+0.072
opes+0.072
Neg logit contributions
ited-0.100
able-0.096
cy-0.091
ok-0.089
Ok-0.085
Token to
Token ID284
Feature activation-2.64e-02

Pos logit contributions
 not+0.014
lf+0.014
 Alf+0.013
 Brown+0.013
aurus+0.013
Neg logit contributions
ited-0.015
able-0.014
ok-0.014
cy-0.013
 single-0.013
Token play
Token ID711
Feature activation-5.32e-02

Pos logit contributions
 not+0.037
lf+0.036
 Brown+0.034
 driving+0.034
 Alf+0.034
Neg logit contributions
ited-0.060
able-0.055
ok-0.055
cy-0.054
yes-0.052
Token in
Token ID287
Feature activation+1.68e-02

Pos logit contributions
 not+0.085
 Alf+0.080
lf+0.080
 Brown+0.080
 driving+0.078
Neg logit contributions
ited-0.108
able-0.104
ok-0.102
cy-0.099
yes-0.093