TOP ACTIVATIONS
MAX = 8.600

hospital . The doctors tried to help her , but it
. Finally , they managed to put it out . The
hospital . The doctors tried to help him , but they
hospital . The doctors tried to help him , but it
hospital . The doctors tried to help , but it was
Eventually , the fox managed to grab the sack and threw
cry . Her mom tried to protect her , but the
shy lion . He tried to run as fast as he
The veterinarian tried to help Spot , but he
leash . The lady tries to hold him , but he
The farmer tried to make the bird go away
move . The captain tried to fix it , but he
. Luckily , Charlie managed to escape by clinging onto a
too short . They try to use a rope to get
ases them . He tries to bite them .
shout . Their moms try to calm them down . They
sick . The vet had to take an x - ray
them fall . They tried to resist , but they were
ie . Tweet ie tried to be brave , but the
and teachers arrived and tried to put out the fire .

MIN ACTIVATIONS
MIN = -15.531

't I play with the oct opus , too ?"
he waved to the r asp berries and said , "
as they drank the mil ksh akes .
ksh akes . The mil ksh akes were blue and pink
a big net around the oct opus . It was stuck
wanted to play with the oct opus , but we broke
fish came and asked the oct opus why it wasn 't
, duck ," said the oct opus . " Do you
- war ," said the oct opus . " I have
Don 't eat the r asp berries , they are too
water and wave at the oct opus . The oct opus
it ?" The oct opus thought about it for
friends . He asked the oct opus , the whale and
. We will share the oct opus together . Let 's
He wanted to see the oct opus better . He ran
sw am over to the oct opus and asked her to
shell and pin ched the oct opus 's arm with its
, duck ," said the oct opus . " I didn
you and we love the oct opus ." Mom
Tim said to the oct opus , " Can we

INTERVAL 2.568 - 8.600
CONTAINS 27.741%

. The bird wanted to be the best fl ier in
The painter smiled and shook his head .
sad and wished that Lily had given the dog a chance
to see Sammy and they became friends . Sam
was happy that it was just an adventure ! <|endoftext|> One

INTERVAL -3.465 - 2.568
CONTAINS 56.863%

One day Bob went to the park with
" What is your name ? Where is your home ?"
. L ila heard a loud bark and saw a big
to rub it . Suddenly , a magic fairy
he blows too hard and the soap spl ashes . He

INTERVAL -9.498 - -3.465
CONTAINS 15.271%

her pocket . The next day , Joy went
a small girl called Lily . She was always in such
her the string and the toy car . He went outside
store and bought a stuffed bear that she could attach the
, there was a little boy named Tim my . Tim

INTERVAL -15.531 - -9.498
CONTAINS 0.125%

decided to jump on the tr amp oline , but Lily
They took turns playing the fl ute and had so much
hipp o . The little hipp o lived in a big
zoo would wish that the z ebra would be nicer .
you have to keep the therm ometer in your mouth for
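
The four INTERVAL ranges above look like equal-width bins spanning MIN to MAX, and the CONTAINS percentages look like the share of activations falling in each bin. This is inferred from the numbers, not stated anywhere in the dump. The sketch below checks that reading; `equal_width_edges` is a hypothetical helper, and the tolerance is an assumption that allows for rounding in the printed values.

```python
# Sketch: check that the reported interval bounds are equal-width bins
# between MIN and MAX, and that the CONTAINS shares sum to 100%.
# The helper name and the tolerance are illustrative assumptions.

def equal_width_edges(lo, hi, n_bins):
    """Return the n_bins + 1 edges of equal-width bins over [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]

MIN_ACT, MAX_ACT = -15.531, 8.600

# Bounds and shares copied from the INTERVAL / CONTAINS lines, lowest bin first.
reported_edges = [-15.531, -9.498, -3.465, 2.568, 8.600]
reported_shares = [0.125, 15.271, 56.863, 27.741]

edges = equal_width_edges(MIN_ACT, MAX_ACT, 4)

# The printed bounds are rounded to three decimals, so compare loosely.
TOL = 2e-3
edges_match = all(abs(e - r) < TOL for e, r in zip(edges, reported_edges))
shares_total = sum(reported_shares)

print("edges match:", edges_match)
print("shares total:", round(shares_total, 3))
```

If the bins are read this way, over half of all activations fall in the second-highest bin and only 0.125% fall in the lowest one, so the MIN ACTIVATIONS examples describe a rare tail.
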
Token hospital
Token ID4436
Feature activation-8.48e-01

Pos logit contributions
uddle+0.019
uffy+0.019
op+0.019
olly+0.019
ari+0.018
Neg logit contributions
 such-0.018
 his-0.017
 it-0.017
 him-0.016
 the-0.016
Token.
Token ID13
Feature activation-1.77e+00

Pos logit contributions
op+0.022
uffy+0.022
oke+0.021
uddle+0.021
arry+0.021
Neg logit contributions
 such-0.014
 his-0.014
 the-0.013
 it-0.013
 him-0.013
Token The
Token ID383
Feature activation-2.70e+00

Pos logit contributions
op+0.036
oke+0.035
uddle+0.035
arry+0.035
opus+0.034
Neg logit contributions
 such-0.041
 his-0.040
 the-0.040
 it-0.039
 him-0.038
Token doctors
Token ID7519
Feature activation+1.05e-02

Pos logit contributions
uddle+0.063
uffy+0.063
olly+0.062
op+0.062
pillar+0.060
Neg logit contributions
 such-0.053
 his-0.049
 it-0.048
 him-0.047
 the-0.047
Token tried
Token ID3088
Feature activation+6.35e+00

Pos logit contributions
 such+0.000
 his+0.000
 yours+0.000
 the+0.000
 anything+0.000
Neg logit contributions
op-0.000
oke-0.000
uffy-0.000
uddle-0.000
ari-0.000
Token to
Token ID284
Feature activation+8.60e+00

Pos logit contributions
 such+0.102
 yours+0.091
ts+0.090
 permission+0.085
 his+0.085
Neg logit contributions
op-0.157
uffy-0.156
uddy-0.147
uddle-0.143
arry-0.143
Token help
Token ID1037
Feature activation+2.83e+00

Pos logit contributions
 such+0.101
 yours+0.083
 nothing+0.083
 hers+0.079
 it+0.078
Neg logit contributions
uffy-0.242
op-0.241
rig-0.239
arry-0.236
uddle-0.230
Token her
Token ID607
Feature activation+1.52e+00

Pos logit contributions
 such+0.022
 his+0.017
 yours+0.015
 permission+0.015
 nothing+0.014
Neg logit contributions
uffy-0.097
op-0.097
olly-0.094
oke-0.092
arry-0.092
Token,
Token ID11
Feature activation+4.14e+00

Pos logit contributions
 such+0.028
 his+0.027
 the+0.026
 it+0.026
 him+0.025
Neg logit contributions
op-0.038
uffy-0.037
uddle-0.037
oke-0.036
arry-0.036
Token but
Token ID475
Feature activation+3.68e+00

Pos logit contributions
 such+0.111
 him+0.105
 yours+0.105
ps+0.104
 rid+0.104
Neg logit contributions
oke-0.050
op-0.050
rig-0.047
uffy-0.046
arry-0.043
Token it
Token ID340
Feature activation+1.59e+00

Pos logit contributions
ps+0.057
 respect+0.053
 such+0.051
 permission+0.050
ws+0.048
Neg logit contributions
uffy-0.090
op-0.079
olly-0.077
oke-0.077
uddle-0.075
Token.
Token ID13
Feature activation-2.74e+00

Pos logit contributions
olly+0.155
op+0.155
arry+0.154
oke+0.154
uffy+0.153
Neg logit contributions
 such-0.092
 his-0.091
 the-0.090
 it-0.086
 him-0.083
Token Finally
Token ID9461
Feature activation+1.20e+00

Pos logit contributions
op+0.058
oke+0.057
arry+0.057
uffy+0.057
uddle+0.057
Neg logit contributions
 such-0.059
 his-0.057
 the-0.056
 it-0.055
 him-0.054
Token,
Token ID11
Feature activation+2.24e+00

Pos logit contributions
 such+0.036
 yours+0.035
 permission+0.035
 rid+0.035
 respect+0.033
Neg logit contributions
olly-0.013
uffy-0.013
op-0.013
oke-0.012
oles-0.012
Token they
Token ID484
Feature activation+4.80e+00

Pos logit contributions
 yours+0.024
 such+0.022
 permission+0.021
ps+0.021
 yourself+0.020
Neg logit contributions
op-0.062
oles-0.060
uffy-0.059
arry-0.057
ule-0.057
Token managed
Token ID5257
Feature activation+6.21e+00

Pos logit contributions
 such+0.076
 yours+0.072
 anything+0.069
 your+0.067
 his+0.066
Neg logit contributions
oke-0.126
ari-0.124
op-0.122
uddle-0.118
olly-0.118
Token to
Token ID284
Feature activation+8.60e+00

Pos logit contributions
 such+0.113
 rid+0.111
ts+0.107
 protect+0.103
 permission+0.102
Neg logit contributions
op-0.121
oke-0.119
uddle-0.118
uffy-0.117
zag-0.115
Token put
Token ID1234
Feature activation+6.01e+00

Pos logit contributions
 such+0.128
 hers+0.115
 his+0.114
 yours+0.112
 nothing+0.110
Neg logit contributions
op-0.224
rig-0.224
uddle-0.222
arry-0.219
uffy-0.219
Token it
Token ID340
Feature activation+2.03e+00

Pos logit contributions
 letting+0.059
 permission+0.057
 such+0.055
 never+0.053
 hurting+0.051
Neg logit contributions
uffy-0.164
op-0.164
uddle-0.164
ite-0.156
ane-0.154
Token out
Token ID503
Feature activation+2.43e+00

Pos logit contributions
 such+0.032
 permission+0.028
 it+0.027
 respect+0.027
 him+0.026
Neg logit contributions
uddle-0.052
uffy-0.051
op-0.050
zag-0.049
olly-0.048
Token.
Token ID13
Feature activation-2.21e+00

Pos logit contributions
 such+0.043
 permission+0.042
 him+0.041
 his+0.039
 yours+0.039
Neg logit contributions
op-0.051
oke-0.050
rig-0.050
uddle-0.050
ari-0.049
Token The
Token ID383
Feature activation-2.36e+00

Pos logit contributions
op+0.049
oke+0.048
arry+0.048
uffy+0.048
olly+0.048
Neg logit contributions
 such-0.046
 his-0.044
 the-0.043
 it-0.042
 him-0.042
Token hospital
Token ID4436
Feature activation-6.69e-01

Pos logit contributions
uddle+0.014
op+0.013
actus+0.013
opus+0.013
olly+0.012
Neg logit contributions
 such-0.015
 his-0.014
 it-0.014
 yours-0.014
 hers-0.014
Token.
Token ID13
Feature activation-2.40e+00

Pos logit contributions
op+0.017
uffy+0.017
oke+0.017
uddle+0.017
arry+0.017
Neg logit contributions
 such-0.011
 his-0.011
 the-0.011
 it-0.010
 him-0.010
Token The
Token ID383
Feature activation-2.48e+00

Pos logit contributions
op+0.059
oke+0.058
uddle+0.058
opus+0.058
olly+0.058
Neg logit contributions
 his-0.045
 such-0.045
 the-0.044
 it-0.042
 him-0.042
Token doctors
Token ID7519
Feature activation-4.98e-01

Pos logit contributions
op+0.061
uddle+0.060
olly+0.059
opus+0.058
uffy+0.057
Neg logit contributions
 such-0.048
 his-0.043
 yours-0.042
 its-0.041
 it-0.041
Token tried
Token ID3088
Feature activation+6.26e+00

Pos logit contributions
op+0.013
oke+0.013
uffy+0.013
uddle+0.012
ari+0.012
Neg logit contributions
 such-0.009
 his-0.008
 yours-0.007
 the-0.007
 anything-0.007
Token to
Token ID284
Feature activation+8.59e+00

Pos logit contributions
 such+0.100
 yours+0.089
ts+0.086
 permission+0.083
 rid+0.082
Neg logit contributions
op-0.154
uffy-0.149
uddy-0.144
uddle-0.142
zag-0.141
Token help
Token ID1037
Feature activation+2.84e+00

Pos logit contributions
 such+0.106
 nothing+0.088
 yours+0.087
 it+0.086
 hers+0.086
Neg logit contributions
rig-0.238
op-0.236
arry-0.234
uffy-0.231
zag-0.230
Token him
Token ID683
Feature activation+2.53e+00

Pos logit contributions
 such+0.021
 permission+0.014
 nothing+0.014
 rid+0.013
 yours+0.013
Neg logit contributions
op-0.097
uffy-0.096
arry-0.094
oke-0.093
uddle-0.092
Token,
Token ID11
Feature activation+3.97e+00

Pos logit contributions
 such+0.050
 his+0.047
 the+0.046
 it+0.046
 him+0.044
Neg logit contributions
op-0.059
uffy-0.058
uddle-0.058
arry-0.057
oke-0.057
Token but
Token ID475
Feature activation+3.70e+00

Pos logit contributions
 such+0.103
 rid+0.096
 yours+0.096
 his+0.094
 him+0.094
Neg logit contributions
oke-0.058
op-0.056
rig-0.052
arry-0.051
uffy-0.050
Token they
Token ID484
Feature activation+4.13e+00

Pos logit contributions
ps+0.065
 such+0.059
 respect+0.058
 permission+0.056
ws+0.055
Neg logit contributions
uffy-0.078
op-0.075
oke-0.075
uddy-0.072
oles-0.071
Token hospital
Token ID4436
Feature activation-1.14e+00

Pos logit contributions
uddle+0.018
op+0.018
olly+0.017
uffy+0.017
opus+0.017
Neg logit contributions
 such-0.018
 it-0.018
 his-0.017
 him-0.017
 them-0.017
Token.
Token ID13
Feature activation-2.17e+00

Pos logit contributions
op+0.030
oke+0.029
uffy+0.029
arry+0.029
uddle+0.029
Neg logit contributions
 such-0.019
 his-0.018
 the-0.018
 it-0.018
 him-0.017
Token The
Token ID383
Feature activation-2.50e+00

Pos logit contributions
op+0.045
oke+0.044
uffy+0.044
arry+0.044
uddle+0.044
Neg logit contributions
 such-0.048
 his-0.047
 the-0.046
 it-0.045
 him-0.044
Token doctors
Token ID7519
Feature activation-5.85e-01

Pos logit contributions
op+0.053
uddle+0.053
uffy+0.053
olly+0.051
ari+0.051
Neg logit contributions
 such-0.052
 it-0.048
 his-0.048
 him-0.047
 yours-0.047
Token tried
Token ID3088
Feature activation+6.11e+00

Pos logit contributions
op+0.016
oke+0.015
uffy+0.015
uddle+0.015
ari+0.015
Neg logit contributions
 such-0.010
 his-0.009
 yours-0.008
 the-0.008
 anything-0.008
Token to
Token ID284
Feature activation+8.58e+00

Pos logit contributions
 such+0.102
 yours+0.091
ts+0.090
 rid+0.086
 permission+0.084
Neg logit contributions
op-0.142
uffy-0.138
uddy-0.136
zag-0.134
arry-0.133
Token help
Token ID1037
Feature activation+2.83e+00

Pos logit contributions
 such+0.104
 nothing+0.089
 yours+0.085
 it+0.083
 anything+0.080
Neg logit contributions
rig-0.234
arry-0.230
op-0.228
uffy-0.227
zag-0.224
Token him
Token ID683
Feature activation+2.42e+00

Pos logit contributions
 such+0.022
 permission+0.015
 rid+0.015
 nothing+0.015
 yours+0.014
Neg logit contributions
op-0.095
uffy-0.094
arry-0.093
oke-0.091
uddy-0.090
Token,
Token ID11
Feature activation+3.92e+00

Pos logit contributions
 such+0.047
 his+0.045
 the+0.044
 it+0.043
 him+0.042
Neg logit contributions
op-0.057
uffy-0.056
uddle-0.056
arry-0.056
oke-0.056
Token but
Token ID475
Feature activation+3.69e+00

Pos logit contributions
 such+0.102
 rid+0.094
 yours+0.094
 him+0.093
 his+0.092
Neg logit contributions
oke-0.057
op-0.055
rig-0.052
arry-0.052
uffy-0.049
Token it
Token ID340
Feature activation+1.56e+00

Pos logit contributions
ps+0.063
 respect+0.058
 such+0.058
ws+0.056
 permission+0.056
Neg logit contributions
uffy-0.077
oke-0.074
op-0.074
uddy-0.073
oles-0.071
Token hospital
Token ID4436
Feature activation-1.27e+00

Pos logit contributions
uddle+0.031
actus+0.030
opus+0.029
ari+0.028
pillar+0.028
Neg logit contributions
 such-0.040
 his-0.039
 him-0.038
 rid-0.037
 permission-0.036
Token.
Token ID13
Feature activation-2.69e+00

Pos logit contributions
op+0.033
uffy+0.033
oke+0.033
arry+0.032
uddle+0.032
Neg logit contributions
 such-0.021
 his-0.020
 the-0.020
 it-0.019
 him-0.019
Token The
Token ID383
Feature activation-2.73e+00

Pos logit contributions
op+0.066
uffy+0.066
oke+0.065
uddle+0.065
olly+0.065
Neg logit contributions
 such-0.048
 his-0.047
 the-0.045
 it-0.044
 him-0.044
Token doctors
Token ID7519
Feature activation-5.46e-01

Pos logit contributions
uddle+0.065
olly+0.064
uffy+0.063
opus+0.063
pillar+0.063
Neg logit contributions
 such-0.052
 his-0.049
 him-0.047
 the-0.045
 it-0.045
Token tried
Token ID3088
Feature activation+6.05e+00

Pos logit contributions
op+0.015
oke+0.014
uffy+0.014
uddle+0.014
ari+0.014
Neg logit contributions
 such-0.009
 his-0.009
 yours-0.008
 the-0.008
 anything-0.008
Token to
Token ID284
Feature activation+8.56e+00

Pos logit contributions
 such+0.097
 yours+0.089
ts+0.086
 permission+0.082
 his+0.081
Neg logit contributions
uffy-0.145
op-0.145
uddy-0.137
uddle-0.135
oodle-0.135
Token help
Token ID1037
Feature activation+2.74e+00

Pos logit contributions
 such+0.107
 yours+0.090
 nothing+0.089
 it+0.088
 his+0.086
Neg logit contributions
rig-0.242
uffy-0.235
arry-0.234
op-0.232
zag-0.227
Token,
Token ID11
Feature activation+3.45e+00

Pos logit contributions
 such+0.026
 his+0.022
 permission+0.020
 yours+0.019
 nothing+0.019
Neg logit contributions
uffy-0.089
op-0.089
olly-0.087
oke-0.085
uddle-0.085
Token but
Token ID475
Feature activation+3.70e+00

Pos logit contributions
 such+0.088
 his+0.084
 him+0.084
 yours+0.083
 rid+0.081
Neg logit contributions
oke-0.053
op-0.051
rig-0.047
uffy-0.047
arry-0.046
Token it
Token ID340
Feature activation+1.45e+00

Pos logit contributions
ps+0.046
 permission+0.043
 respect+0.042
 such+0.040
ws+0.038
Neg logit contributions
uffy-0.101
op-0.090
uddle-0.088
olly-0.087
oke-0.087
Token was
Token ID373
Feature activation+4.62e+00

Pos logit contributions
 such+0.028
 his+0.027
 the+0.025
 him+0.025
 permission+0.025
Neg logit contributions
uffy-0.034
op-0.034
uddle-0.034
oke-0.033
olly-0.033
Token Eventually
Token ID16178
Feature activation+1.17e+00

Pos logit contributions
op+0.065
uffy+0.065
opus+0.064
uddle+0.063
ane+0.063
Neg logit contributions
 the-0.043
 such-0.042
 his-0.042
 it-0.041
 him-0.038
Token,
Token ID11
Feature activation+1.90e+00

Pos logit contributions
 such+0.029
 permission+0.028
 yours+0.027
 his+0.027
 him+0.027
Neg logit contributions
olly-0.019
arry-0.019
oke-0.019
op-0.018
uddle-0.018
Token the
Token ID262
Feature activation-2.18e+00

Pos logit contributions
 such+0.021
 permission+0.020
ps+0.018
 yours+0.018
 his+0.018
Neg logit contributions
olly-0.055
arry-0.055
uddle-0.054
op-0.054
oles-0.053
Token fox
Token ID21831
Feature activation-1.05e+00

Pos logit contributions
pillar+0.039
uddle+0.038
arry+0.038
olly+0.038
ari+0.037
Neg logit contributions
 such-0.054
 his-0.051
 the-0.050
 it-0.049
 him-0.049
Token managed
Token ID5257
Feature activation+6.46e+00

Pos logit contributions
op+0.031
arry+0.030
uffy+0.030
oke+0.030
olly+0.030
Neg logit contributions
 such-0.014
 his-0.014
 the-0.014
 it-0.013
 him-0.012
Token to
Token ID284
Feature activation+8.54e+00

Pos logit contributions
ts+0.112
 such+0.111
 rid+0.105
 forgiveness+0.100
 permission+0.099
Neg logit contributions
arry-0.127
oke-0.126
op-0.126
uffy-0.125
zag-0.124
Token grab
Token ID5552
Feature activation+5.61e+00

Pos logit contributions
 such+0.131
 his+0.122
 nothing+0.117
 it+0.115
 him+0.113
Neg logit contributions
rig-0.222
arry-0.220
uddle-0.220
uffy-0.219
op-0.217
Token the
Token ID262
Feature activation-2.75e+00

Pos logit contributions
 such+0.038
 never+0.035
 saving+0.035
 letting+0.035
 ruining+0.034
Neg logit contributions
erry-0.172
pillar-0.170
olly-0.170
arry-0.169
op-0.168
Token sack
Token ID23704
Feature activation-2.39e+00

Pos logit contributions
op+0.041
uddle+0.041
uffy+0.041
arry+0.040
olly+0.040
Neg logit contributions
 such-0.076
 his-0.075
 the-0.073
 him-0.072
 it-0.072
Token and
Token ID290
Feature activation+5.55e+00

Pos logit contributions
uffy+0.061
oke+0.061
op+0.061
uddle+0.060
arry+0.060
Neg logit contributions
 such-0.042
 his-0.042
 the-0.042
 it-0.040
 him-0.038
Token threw
Token ID9617
Feature activation+5.46e+00

Pos logit contributions
 such+0.081
 permission+0.076
 anything+0.072
 yours+0.072
 him+0.069
Neg logit contributions
uddle-0.141
uffy-0.136
op-0.134
rig-0.133
arry-0.132
Token cry
Token ID3960
Feature activation+1.10e+00

Pos logit contributions
 such+0.125
 his+0.124
 the+0.122
 it+0.118
 him+0.115
Neg logit contributions
op-0.178
oke-0.178
olly-0.177
arry-0.176
uffy-0.176
Token.
Token ID13
Feature activation-2.20e+00

Pos logit contributions
 such+0.019
 his+0.018
 the+0.018
 it+0.017
 him+0.017
Neg logit contributions
op-0.028
uffy-0.028
arry-0.028
oke-0.028
uddle-0.028
Token Her
Token ID2332
Feature activation-5.68e-01

Pos logit contributions
op+0.045
oke+0.045
uddle+0.044
uffy+0.044
arry+0.044
Neg logit contributions
 such-0.050
 his-0.049
 the-0.049
 it-0.048
 him-0.046
Token mom
Token ID1995
Feature activation+4.45e-01

Pos logit contributions
op+0.016
uffy+0.016
uddle+0.016
oke+0.016
arry+0.016
Neg logit contributions
 such-0.008
 his-0.008
 the-0.007
 it-0.007
 him-0.007
Token tried
Token ID3088
Feature activation+5.83e+00

Pos logit contributions
 such+0.008
 his+0.007
 yours+0.007
 anything+0.007
 the+0.006
Neg logit contributions
op-0.012
oke-0.012
uffy-0.011
uddle-0.011
ari-0.011
Token to
Token ID284
Feature activation+8.51e+00

Pos logit contributions
 such+0.089
 yours+0.077
 his+0.075
 it+0.073
 the+0.072
Neg logit contributions
op-0.155
uffy-0.151
arry-0.148
rig-0.148
oke-0.146
Token protect
Token ID1805
Feature activation+4.47e+00

Pos logit contributions
 such+0.108
 nothing+0.094
 it+0.092
 yours+0.089
 anything+0.086
Neg logit contributions
rig-0.243
arry-0.241
op-0.238
uffy-0.236
uddle-0.233
Token her
Token ID607
Feature activation-1.41e-01

Pos logit contributions
 such+0.035
 saved+0.030
 his+0.029
 permission+0.027
 never+0.027
Neg logit contributions
uffy-0.144
op-0.140
olly-0.137
pillar-0.136
arry-0.135
Token,
Token ID11
Feature activation+4.04e+00

Pos logit contributions
olly+0.004
arry+0.004
oke+0.004
pill+0.004
opus+0.004
Neg logit contributions
 his-0.002
 the-0.002
 such-0.002
 it-0.002
 him-0.002
Token but
Token ID475
Feature activation+3.63e+00

Pos logit contributions
 such+0.109
 him+0.102
 yours+0.101
 rid+0.101
 his+0.101
Neg logit contributions
oke-0.053
rig-0.051
op-0.050
pillar-0.046
arry-0.045
Token the
Token ID262
Feature activation-2.42e+00

Pos logit contributions
ps+0.052
 permission+0.052
 respect+0.051
ws+0.050
 such+0.049
Neg logit contributions
uffy-0.090
op-0.082
oke-0.078
s-0.078
uddle-0.076
Token shy
Token ID15800
Feature activation-5.36e+00

Pos logit contributions
oke+0.046
ari+0.045
uddle+0.044
op+0.044
opus+0.044
Neg logit contributions
 such-0.040
 his-0.038
 it-0.037
 the-0.036
 him-0.036
Token lion
Token ID18744
Feature activation-3.43e+00

Pos logit contributions
uffy+0.133
rig+0.132
uddle+0.132
op+0.132
arry+0.130
Neg logit contributions
 his-0.101
 the-0.100
 it-0.097
 such-0.096
 him-0.092
Token.
Token ID13
Feature activation-1.77e+00

Pos logit contributions
op+0.096
oke+0.095
arry+0.095
uffy+0.095
olly+0.095
Neg logit contributions
 such-0.051
 his-0.050
 the-0.049
 it-0.047
 him-0.045
Token He
Token ID679
Feature activation+4.46e+00

Pos logit contributions
op+0.042
uffy+0.042
uddle+0.041
oke+0.041
arry+0.041
Neg logit contributions
 such-0.034
 his-0.034
 the-0.033
 it-0.032
 him-0.031
Token tried
Token ID3088
Feature activation+5.83e+00

Pos logit contributions
 such+0.073
 his+0.067
 permission+0.064
 the+0.064
 him+0.063
Neg logit contributions
oke-0.117
op-0.116
uddle-0.115
uffy-0.114
pillar-0.114
Token to
Token ID284
Feature activation+8.48e+00

Pos logit contributions
 such+0.083
 his+0.070
 permission+0.069
 the+0.069
 yours+0.069
Neg logit contributions
op-0.158
arry-0.154
oke-0.154
uffy-0.153
uddle-0.152
Token run
Token ID1057
Feature activation+2.97e+00

Pos logit contributions
 such+0.150
 it+0.133
 his+0.130
 nothing+0.129
 the+0.128
Neg logit contributions
rig-0.198
arry-0.194
op-0.193
zag-0.191
uffy-0.191
Token as
Token ID355
Feature activation+3.41e+00

Pos logit contributions
 such+0.037
 his+0.035
 the+0.034
 it+0.033
 him+0.031
Neg logit contributions
op-0.090
oke-0.089
arry-0.089
uffy-0.089
uddle-0.088
Token fast
Token ID3049
Feature activation+7.95e-01

Pos logit contributions
 such+0.059
 permission+0.053
 his+0.049
 him+0.048
 yours+0.047
Neg logit contributions
olly-0.086
op-0.086
arry-0.085
oke-0.085
uffy-0.083
Token as
Token ID355
Feature activation+2.76e+00

Pos logit contributions
 such+0.012
 permission+0.010
 never+0.010
 yours+0.010
 rid+0.010
Neg logit contributions
op-0.021
oke-0.020
olly-0.020
arry-0.020
pillar-0.020
Token he
Token ID339
Feature activation+1.49e+00

Pos logit contributions
 such+0.045
 permission+0.040
 rid+0.036
 cleaned+0.035
 respect+0.035
Neg logit contributions
oke-0.069
arry-0.069
olly-0.069
op-0.068
oles-0.067
Token\n
Token ID198
Feature activation-2.54e+00

Pos logit contributions
opus+0.092
ute+0.091
uddle+0.091
op+0.091
ane+0.090
Neg logit contributions
 the-0.057
 his-0.056
 it-0.049
 such-0.049
 him-0.047
Token\n
Token ID198
Feature activation-2.06e+00

Pos logit contributions
opus+0.066
uddle+0.066
owers+0.066
pill+0.065
anda+0.065
Neg logit contributions
 the-0.049
 his-0.048
 it-0.047
 him-0.044
 a-0.042
TokenThe
Token ID464
Feature activation-3.51e+00

Pos logit contributions
oke+0.052
uffy+0.052
op+0.052
uddle+0.051
arry+0.051
Neg logit contributions
 his-0.037
 such-0.036
 the-0.036
 it-0.034
 never-0.034
Token veterinarian
Token ID41404
Feature activation-1.59e+00

Pos logit contributions
op+0.078
uffy+0.077
oke+0.077
arry+0.077
uddle+0.077
Neg logit contributions
 such-0.073
 his-0.071
 the-0.070
 it-0.069
 him-0.067
Token tried
Token ID3088
Feature activation+5.65e+00

Pos logit contributions
op+0.045
oke+0.045
uffy+0.045
arry+0.045
uddle+0.045
Neg logit contributions
 such-0.023
 his-0.022
 the-0.021
 it-0.021
 him-0.020
Token to
Token ID284
Feature activation+8.46e+00

Pos logit contributions
 such+0.086
 yours+0.073
 permission+0.073
 it+0.070
 never+0.070
Neg logit contributions
op-0.144
uffy-0.142
oodle-0.137
uddy-0.137
arry-0.136
Token help
Token ID1037
Feature activation+2.40e+00

Pos logit contributions
 such+0.117
 nothing+0.102
 it+0.100
 anything+0.094
 yours+0.091
Neg logit contributions
rig-0.226
arry-0.219
op-0.215
zag-0.214
uffy-0.214
Token Spot
Token ID15899
Feature activation-3.09e-01

Pos logit contributions
 such+0.032
 his+0.030
 the+0.029
 it+0.028
 him+0.027
Neg logit contributions
op-0.071
uffy-0.070
oke-0.070
arry-0.069
uddle-0.069
Token,
Token ID11
Feature activation+4.28e+00

Pos logit contributions
op+0.008
oke+0.008
uffy+0.008
arry+0.008
uddle+0.008
Neg logit contributions
 such-0.005
 his-0.005
 the-0.005
 it-0.005
 him-0.005
Token but
Token ID475
Feature activation+3.39e+00

Pos logit contributions
 such+0.105
 him+0.097
 rid+0.097
 yours+0.096
 permission+0.095
Neg logit contributions
oke-0.070
op-0.070
rig-0.065
pillar-0.065
uffy-0.064
Token he
Token ID339
Feature activation+3.19e+00

Pos logit contributions
 such+0.042
 permission+0.041
 respect+0.036
 rid+0.035
 yours+0.035
Neg logit contributions
uffy-0.098
op-0.096
oke-0.095
olly-0.092
uddle-0.092
Token leash
Token ID40529
Feature activation-1.24e+00

Pos logit contributions
oke+0.027
olly+0.026
arry+0.026
op+0.026
uffy+0.026
Neg logit contributions
 his-0.020
 such-0.020
 the-0.019
 it-0.019
 him-0.018
Token.
Token ID13
Feature activation-1.33e+00

Pos logit contributions
op+0.032
uffy+0.031
uddle+0.031
arry+0.031
oke+0.031
Neg logit contributions
 such-0.021
 his-0.020
 the-0.019
 it-0.019
 him-0.018
Token The
Token ID383
Feature activation-2.50e+00

Pos logit contributions
op+0.030
uddle+0.030
uffy+0.030
arry+0.030
oke+0.030
Neg logit contributions
 such-0.027
 his-0.027
 the-0.026
 it-0.026
 him-0.025
Token lady
Token ID10846
Feature activation-1.33e+00

Pos logit contributions
uddle+0.044
op+0.042
oke+0.042
ari+0.041
opus+0.041
Neg logit contributions
 such-0.064
 his-0.060
 her-0.058
 him-0.058
 the-0.058
Token tries
Token ID8404
Feature activation+5.28e+00

Pos logit contributions
arry+0.034
olly+0.033
op+0.032
uffy+0.032
opus+0.032
Neg logit contributions
 the-0.024
 his-0.024
 such-0.024
 it-0.023
 him-0.022
Token to
Token ID284
Feature activation+8.46e+00

Pos logit contributions
 such+0.092
 yours+0.083
ts+0.082
 saved+0.079
 replaced+0.078
Neg logit contributions
op-0.119
oodle-0.117
uffy-0.115
uddy-0.113
s-0.112
Token hold
Token ID1745
Feature activation+2.78e+00

Pos logit contributions
 such+0.158
 nothing+0.141
 it+0.134
 anything+0.132
 yours+0.130
Neg logit contributions
rig-0.187
op-0.179
uffy-0.177
arry-0.177
zag-0.175
Token him
Token ID683
Feature activation-6.37e-01

Pos logit contributions
 such+0.030
 his+0.025
 the+0.024
 never+0.024
 permission+0.024
Neg logit contributions
op-0.088
uffy-0.086
olly-0.085
arry-0.085
oke-0.085
Token,
Token ID11
Feature activation+5.07e+00

Pos logit contributions
oke+0.018
olly+0.018
opus+0.018
arry+0.018
ari+0.018
Neg logit contributions
 his-0.010
 the-0.009
 such-0.009
 it-0.009
 him-0.009
Token but
Token ID475
Feature activation+3.59e+00

Pos logit contributions
 such+0.126
 yours+0.113
 permission+0.113
 rid+0.113
 him+0.112
Neg logit contributions
oke-0.084
op-0.080
uffy-0.077
rig-0.076
oodle-0.076
Token he
Token ID339
Feature activation+3.32e+00

Pos logit contributions
 such+0.054
 permission+0.051
 yours+0.046
 rid+0.046
 replaced+0.044
Neg logit contributions
op-0.094
uffy-0.094
oke-0.092
oodle-0.090
uddle-0.089
Token\n
Token ID198
Feature activation-2.35e+00

Pos logit contributions
uffy+0.079
uddle+0.079
opus+0.078
olly+0.078
op+0.078
Neg logit contributions
 the-0.058
 his-0.058
 it-0.055
 such-0.054
 him-0.052
Token\n
Token ID198
Feature activation-1.75e+00

Pos logit contributions
uddle+0.059
opus+0.059
anda+0.058
pill+0.058
ari+0.058
Neg logit contributions
 the-0.046
 his-0.045
 it-0.045
 him-0.041
 such-0.041
TokenThe
Token ID464
Feature activation-2.97e+00

Pos logit contributions
op+0.044
arry+0.043
olly+0.043
oke+0.042
uffy+0.042
Neg logit contributions
 such-0.032
 his-0.030
 the-0.029
 it-0.029
 him-0.029
Token farmer
Token ID18739
Feature activation-1.54e+00

Pos logit contributions
op+0.066
oke+0.065
uffy+0.065
arry+0.065
uddle+0.065
Neg logit contributions
 such-0.061
 his-0.059
 the-0.058
 it-0.057
 him-0.056
Token tried
Token ID3088
Feature activation+5.80e+00

Pos logit contributions
op+0.043
oke+0.043
uffy+0.043
arry+0.043
uddle+0.043
Neg logit contributions
 such-0.023
 his-0.022
 the-0.021
 it-0.021
 him-0.020
Token to
Token ID284
Feature activation+8.45e+00

Pos logit contributions
 such+0.096
 yours+0.087
 saved+0.079
 permission+0.079
 him+0.079
Neg logit contributions
op-0.144
uffy-0.134
zag-0.132
oke-0.131
uddle-0.131
Token make
Token ID787
Feature activation+5.65e+00

Pos logit contributions
 such+0.114
 nothing+0.104
 yours+0.097
 it+0.096
 anything+0.094
Neg logit contributions
rig-0.229
op-0.224
arry-0.219
uffy-0.217
uddle-0.216
Token the
Token ID262
Feature activation-1.77e+00

Pos logit contributions
 never+0.034
 such+0.032
 cleaned+0.029
 yours+0.027
 hers+0.027
Neg logit contributions
op-0.180
oles-0.176
uddle-0.175
uffy-0.173
oke-0.172
Token bird
Token ID6512
Feature activation-7.85e-01

Pos logit contributions
op+0.039
oke+0.039
uffy+0.039
arry+0.039
uddle+0.039
Neg logit contributions
 such-0.037
 his-0.036
 the-0.036
 it-0.035
 him-0.034
Token go
Token ID467
Feature activation+2.35e+00

Pos logit contributions
op+0.021
uddle+0.021
uffy+0.020
rig+0.020
oke+0.020
Neg logit contributions
 such-0.012
 his-0.011
 it-0.011
 him-0.011
 the-0.011
Token away
Token ID1497
Feature activation+1.28e+00

Pos logit contributions
 such+0.030
 his+0.025
 permission+0.024
 nothing+0.024
 it+0.023
Neg logit contributions
uffy-0.070
op-0.070
uddle-0.069
arry-0.068
ari-0.068
Token move
Token ID1445
Feature activation+1.44e+00

Pos logit contributions
 such+0.093
 nothing+0.079
 how+0.074
 his+0.071
 her+0.071
Neg logit contributions
arry-0.119
ari-0.117
oke-0.112
zag-0.110
op-0.109
Token.
Token ID13
Feature activation-2.79e+00

Pos logit contributions
 such+0.027
 permission+0.026
 his+0.024
 nothing+0.024
 never+0.024
Neg logit contributions
oke-0.031
op-0.031
arry-0.030
uffy-0.030
uddle-0.029
Token The
Token ID383
Feature activation-1.92e+00

Pos logit contributions
op+0.067
uffy+0.067
pillar+0.067
opus+0.067
olly+0.067
Neg logit contributions
 the-0.053
 his-0.053
 such-0.052
 it-0.051
 him-0.048
Token captain
Token ID10654
Feature activation-1.61e+00

Pos logit contributions
ari+0.041
oke+0.040
uddle+0.040
arry+0.039
opus+0.039
Neg logit contributions
 such-0.043
 her-0.038
 his-0.037
 him-0.037
 it-0.037
Token tried
Token ID3088
Feature activation+5.37e+00

Pos logit contributions
op+0.042
oke+0.042
uddle+0.042
uffy+0.041
ari+0.041
Neg logit contributions
 such-0.027
 his-0.025
 the-0.024
 it-0.024
 him-0.023
Token to
Token ID284
Feature activation+8.45e+00

Pos logit contributions
 such+0.086
 yours+0.078
 permission+0.077
 rid+0.074
 protect+0.074
Neg logit contributions
op-0.128
zag-0.125
oke-0.123
uddle-0.122
uffy-0.121
Token fix
Token ID4259
Feature activation+4.46e+00

Pos logit contributions
 such+0.149
 nothing+0.132
 it+0.125
 hers+0.123
 anything+0.123
Neg logit contributions
rig-0.195
arry-0.190
zag-0.186
op-0.181
ari-0.176
Token it
Token ID340
Feature activation+1.01e+00

Pos logit contributions
 permission+0.032
 such+0.025
 never+0.025
 respect+0.025
 cleaned+0.023
Neg logit contributions
uddle-0.138
oke-0.137
op-0.137
uffy-0.136
arry-0.133
Token,
Token ID11
Feature activation+4.33e+00

Pos logit contributions
 such+0.021
 his+0.020
 the+0.020
 it+0.019
 him+0.019
Neg logit contributions
op-0.022
oke-0.021
arry-0.021
uddle-0.021
uffy-0.021
Token but
Token ID475
Feature activation+3.66e+00

Pos logit contributions
 such+0.112
 permission+0.106
 rid+0.105
 yours+0.104
 respect+0.103
Neg logit contributions
oke-0.064
op-0.060
rig-0.057
uddle-0.054
arry-0.054
Token he
Token ID339
Feature activation+3.23e+00

Pos logit contributions
 permission+0.040
 such+0.037
ps+0.036
 respect+0.035
 yourself+0.033
Neg logit contributions
oke-0.105
op-0.103
uddle-0.101
oles-0.100
uddy-0.099
Token.
Token ID13
Feature activation-2.77e+00

Pos logit contributions
uffy+0.090
olly+0.089
op+0.089
uddle+0.088
arry+0.088
Neg logit contributions
 such-0.052
 his-0.051
 the-0.051
 it-0.049
 him-0.047
Token Luckily
Token ID24262
Feature activation+1.86e+00

Pos logit contributions
uddle+0.068
uffy+0.067
op+0.067
oke+0.067
olly+0.066
Neg logit contributions
 such-0.052
 his-0.052
 the-0.051
 it-0.050
 him-0.048
Token,
Token ID11
Feature activation+1.83e+00

Pos logit contributions
 such+0.046
 rid+0.043
 permission+0.043
 yours+0.042
 him+0.042
Neg logit contributions
op-0.032
oke-0.030
arry-0.030
uffy-0.029
olly-0.029
Token Charlie
Token ID11526
Feature activation-6.57e-01

Pos logit contributions
 permission+0.020
 yours+0.020
 such+0.020
 rid+0.019
 hers+0.019
Neg logit contributions
op-0.049
arry-0.048
ule-0.047
opus-0.047
olly-0.047
Token managed
Token ID5257
Feature activation+6.18e+00

Pos logit contributions
op+0.017
oke+0.016
uffy+0.016
olly+0.016
uddle+0.016
Neg logit contributions
 such-0.011
 his-0.011
 the-0.010
 it-0.010
 him-0.010
Token to
Token ID284
Feature activation+8.43e+00

Pos logit contributions
 such+0.108
ts+0.107
 rid+0.107
 yourselves+0.101
 bruises+0.100
Neg logit contributions
oke-0.118
op-0.117
arry-0.112
ogging-0.112
olly-0.109
Token escape
Token ID6654
Feature activation+3.28e+00

Pos logit contributions
 such+0.125
 his+0.115
 nothing+0.109
 yours+0.107
 it+0.107
Neg logit contributions
op-0.229
uddle-0.228
rig-0.226
arry-0.220
uffy-0.219
Token by
Token ID416
Feature activation+3.94e+00

Pos logit contributions
 such+0.044
 permission+0.041
 never+0.040
 yours+0.039
 respect+0.038
Neg logit contributions
uddle-0.085
op-0.085
rig-0.083
arry-0.083
uffy-0.082
Token clinging
Token ID40137
Feature activation+3.95e+00

Pos logit contributions
 such+0.032
 yours+0.028
 respect+0.026
 permission+0.026
 rid+0.025
Neg logit contributions
op-0.123
oles-0.121
arry-0.118
rig-0.118
uddle-0.118
Token onto
Token ID4291
Feature activation+5.39e+00

Pos logit contributions
 such+0.068
 him+0.058
 never+0.058
 care+0.057
 saved+0.057
Neg logit contributions
op-0.092
uddle-0.091
ute-0.089
rig-0.089
uffy-0.088
Token a
Token ID257
Feature activation-5.59e-01

Pos logit contributions
 such+0.030
 share+0.029
 permission+0.029
 yours+0.028
 cleaned+0.028
Neg logit contributions
op-0.164
oles-0.155
rig-0.154
pillar-0.153
uffy-0.152
Token too
Token ID1165
Feature activation+1.47e+00

Pos logit contributions
 such+0.094
 him+0.093
 care+0.085
 permission+0.083
 respect+0.082
Neg logit contributions
rig-0.107
zag-0.104
op-0.103
oodle-0.102
oke-0.101
Token short
Token ID1790
Feature activation-1.21e+00

Pos logit contributions
 such+0.025
 his+0.023
 him+0.022
 the+0.021
 it+0.021
Neg logit contributions
op-0.040
arry-0.039
olly-0.038
uddle-0.038
oke-0.038
Token.
Token ID13
Feature activation-6.84e-01

Pos logit contributions
op+0.029
oke+0.028
uffy+0.027
arry+0.027
olly+0.027
Neg logit contributions
 such-0.023
 his-0.022
 the-0.021
 it-0.021
 him-0.020
Token They
Token ID1119
Feature activation+5.49e+00

Pos logit contributions
arry+0.018
op+0.018
ari+0.018
opus+0.018
uffy+0.018
Neg logit contributions
 such-0.012
 the-0.011
 his-0.011
 it-0.011
 him-0.010
Token try
Token ID1949
Feature activation+4.61e+00

Pos logit contributions
 such+0.098
 his+0.090
 the+0.084
 it+0.081
 permission+0.081
Neg logit contributions
op-0.143
uddle-0.136
uffy-0.136
oke-0.135
arry-0.132
Token to
Token ID284
Feature activation+8.43e+00

Pos logit contributions
 such+0.072
 his+0.062
 him+0.061
 permission+0.060
 protect+0.059
Neg logit contributions
op-0.125
uddle-0.121
uffy-0.120
oke-0.119
arry-0.118
Token use
Token ID779
Feature activation+6.35e+00

Pos logit contributions
 such+0.156
 his+0.136
 nothing+0.134
 permission+0.131
 it+0.129
Neg logit contributions
op-0.199
rig-0.196
uddle-0.194
arry-0.191
uffy-0.190
Token a
Token ID257
Feature activation-1.15e-01

Pos logit contributions
 punished+0.053
 never+0.049
check+0.049
 such+0.047
 protected+0.046
Neg logit contributions
op-0.173
uddle-0.168
s-0.166
ite-0.159
uffy-0.159
Token rope
Token ID17182
Feature activation-2.28e+00

Pos logit contributions
arry+0.003
olly+0.003
uffy+0.003
ari+0.003
oke+0.003
Neg logit contributions
 his-0.002
 the-0.002
 it-0.002
 such-0.002
 nothing-0.002
Token to
Token ID284
Feature activation+6.89e+00

Pos logit contributions
op+0.062
oke+0.062
uffy+0.061
uddle+0.061
arry+0.060
Neg logit contributions
 such-0.035
 his-0.032
 the-0.031
 it-0.030
 him-0.030
Token get
Token ID651
Feature activation+5.92e+00

Pos logit contributions
 such+0.135
 permission+0.120
 nothing+0.114
 his+0.113
 never+0.112
Neg logit contributions
op-0.158
uddle-0.148
rig-0.144
uffy-0.141
zag-0.140
Tokenases
Token ID1386
Feature activation+2.46e+00

Pos logit contributions
ari+0.069
op+0.068
rig+0.068
uddle+0.068
opus+0.068
Neg logit contributions
 his-0.036
 such-0.034
 the-0.034
 him-0.033
 it-0.033
Token them
Token ID606
Feature activation-3.89e-01

Pos logit contributions
 such+0.023
 yours+0.018
 hurting+0.017
 letting+0.017
 permission+0.017
Neg logit contributions
op-0.078
arry-0.076
uffy-0.076
uddy-0.075
oke-0.074
Token.
Token ID13
Feature activation-1.17e+00

Pos logit contributions
op+0.009
uddle+0.009
uffy+0.009
oke+0.009
arry+0.009
Neg logit contributions
 such-0.007
 his-0.007
 the-0.006
 it-0.006
 him-0.006
Token He
Token ID679
Feature activation+4.69e+00

Pos logit contributions
op+0.030
olly+0.029
uffy+0.029
uddle+0.029
opus+0.029
Neg logit contributions
 his-0.021
 the-0.021
 such-0.021
 it-0.020
 him-0.019
Token tries
Token ID8404
Feature activation+4.84e+00

Pos logit contributions
 such+0.094
 his+0.090
 the+0.087
 it+0.086
 him+0.084
Neg logit contributions
op-0.106
uffy-0.106
uddle-0.106
oke-0.105
olly-0.103
Token to
Token ID284
Feature activation+8.42e+00

Pos logit contributions
 such+0.084
 yours+0.077
ts+0.073
 permission+0.072
 replaced+0.070
Neg logit contributions
op-0.110
oodle-0.106
uddy-0.106
uffy-0.106
arry-0.105
Token bite
Token ID13197
Feature activation+2.19e+00

Pos logit contributions
 such+0.161
 nothing+0.140
 it+0.136
 permission+0.133
 yours+0.132
Neg logit contributions
rig-0.179
uffy-0.175
zag-0.174
op-0.171
arry-0.171
Token them
Token ID606
Feature activation-9.63e-01

Pos logit contributions
 such+0.014
 permission+0.010
 never+0.009
 his+0.009
 nothing+0.008
Neg logit contributions
op-0.076
uffy-0.075
uddle-0.075
arry-0.074
oke-0.074
Token.
Token ID13
Feature activation-8.55e-01

Pos logit contributions
op+0.025
oke+0.024
uffy+0.024
arry+0.024
uddle+0.024
Neg logit contributions
 such-0.017
 his-0.016
 the-0.016
 it-0.015
 him-0.015
Token\n
Token ID198
Feature activation-1.23e+00

Pos logit contributions
op+0.020
uffy+0.020
olly+0.020
oke+0.020
uddle+0.020
Neg logit contributions
 such-0.016
 his-0.016
 the-0.016
 it-0.015
 him-0.015
Token\n
Token ID198
Feature activation-8.57e-01

Pos logit contributions
uddle+0.027
opus+0.027
pillar+0.027
ari+0.027
olly+0.026
Neg logit contributions
 the-0.026
 his-0.026
 it-0.026
 such-0.025
 him-0.025
Token shout
Token ID9191
Feature activation+2.56e+00

Pos logit contributions
 such+0.112
 it+0.095
 yours+0.092
 anything+0.091
 him+0.090
Neg logit contributions
rig-0.123
uffy-0.120
uddle-0.119
oke-0.119
oodle-0.118
Token.
Token ID13
Feature activation-1.37e+00

Pos logit contributions
 such+0.051
 his+0.046
 never+0.045
 it+0.045
 the+0.045
Neg logit contributions
oke-0.057
op-0.057
uddle-0.056
uffy-0.055
olly-0.055
Token Their
Token ID5334
Feature activation-2.40e-01

Pos logit contributions
op+0.029
uddle+0.029
oke+0.029
uffy+0.029
olly+0.029
Neg logit contributions
 such-0.030
 his-0.030
 the-0.030
 it-0.029
 him-0.028
Token moms
Token ID34015
Feature activation+9.74e-01

Pos logit contributions
uddle+0.005
uffy+0.005
op+0.005
olly+0.005
oke+0.005
Neg logit contributions
 such-0.005
 his-0.005
 it-0.005
 the-0.005
 him-0.005
Token try
Token ID1949
Feature activation+4.58e+00

Pos logit contributions
 such+0.014
 his+0.014
 the+0.013
 it+0.013
 him+0.012
Neg logit contributions
op-0.027
oke-0.027
uddle-0.027
uffy-0.027
arry-0.027
Token to
Token ID284
Feature activation+8.42e+00

Pos logit contributions
 such+0.071
 yours+0.060
 never+0.060
 protect+0.060
 nothing+0.059
Neg logit contributions
op-0.116
oke-0.115
oodle-0.113
uffy-0.112
uddle-0.112
Token calm
Token ID9480
Feature activation+2.21e+00

Pos logit contributions
 such+0.117
 nothing+0.112
 it+0.103
 hers+0.102
 yours+0.102
Neg logit contributions
rig-0.218
arry-0.214
uddle-0.212
op-0.211
oke-0.207
Token them
Token ID606
Feature activation+8.32e-01

Pos logit contributions
 such+0.015
 his+0.012
 the+0.010
 never+0.010
 it+0.010
Neg logit contributions
uffy-0.078
op-0.078
oke-0.078
uddle-0.078
arry-0.078
Token down
Token ID866
Feature activation+1.73e+00

Pos logit contributions
 such+0.013
 his+0.012
 it+0.011
 the+0.011
 nothing+0.011
Neg logit contributions
op-0.022
uffy-0.022
uddle-0.022
oke-0.022
arry-0.022
Token.
Token ID13
Feature activation-1.46e+00

Pos logit contributions
 such+0.030
 his+0.029
 the+0.028
 it+0.028
 him+0.027
Neg logit contributions
op-0.044
oke-0.043
arry-0.043
uffy-0.043
olly-0.043
Token They
Token ID1119
Feature activation+5.82e+00

Pos logit contributions
op+0.031
uffy+0.031
oke+0.031
uddle+0.031
olly+0.031
Neg logit contributions
 such-0.032
 his-0.031
 the-0.031
 it-0.030
 him-0.029
Token sick
Token ID6639
Feature activation+7.55e-02

Pos logit contributions
 letting+0.082
 itself+0.081
 respect+0.080
 yourself+0.078
 yours+0.078
Neg logit contributions
op-0.144
erry-0.139
olly-0.136
uddle-0.136
s-0.134
Token.
Token ID13
Feature activation-3.53e+00

Pos logit contributions
 such+0.001
 his+0.001
 the+0.001
 it+0.001
 never+0.001
Neg logit contributions
op-0.002
uddle-0.002
oke-0.002
olly-0.002
uffy-0.002
Token The
Token ID383
Feature activation-2.44e+00

Pos logit contributions
op+0.108
oke+0.108
arry+0.107
uddle+0.107
olly+0.107
Neg logit contributions
 his-0.045
 such-0.043
 the-0.043
 it-0.041
 him-0.039
Token vet
Token ID20202
Feature activation-2.67e-01

Pos logit contributions
op+0.052
uffy+0.052
uddle+0.052
olly+0.051
arry+0.050
Neg logit contributions
 such-0.052
 his-0.050
 the-0.049
 it-0.049
 him-0.047
Token had
Token ID550
Feature activation+6.54e+00

Pos logit contributions
oke+0.007
uffy+0.006
op+0.006
anda+0.006
ari+0.006
Neg logit contributions
 such-0.005
 anything-0.004
 his-0.004
 yours-0.004
 permission-0.004
Token to
Token ID284
Feature activation+8.42e+00

Pos logit contributions
ts+0.086
ieve+0.083
ps+0.080
 yours+0.080
 freeing+0.079
Neg logit contributions
op-0.142
pill-0.136
uffy-0.134
erry-0.133
s-0.131
Token take
Token ID1011
Feature activation+4.37e+00

Pos logit contributions
 such+0.126
 it+0.113
 nothing+0.109
 his+0.108
 how+0.107
Neg logit contributions
uffy-0.209
op-0.208
arry-0.203
rig-0.202
olly-0.202
Token an
Token ID281
Feature activation-1.37e+00

Pos logit contributions
 such+0.036
 permission+0.035
 letting+0.033
 yours+0.032
 how+0.030
Neg logit contributions
op-0.134
uffy-0.132
olly-0.128
pillar-0.128
ables-0.127
Token x
Token ID2124
Feature activation-6.04e+00

Pos logit contributions
op+0.037
oke+0.036
uffy+0.036
arry+0.036
uddle+0.036
Neg logit contributions
 such-0.022
 his-0.022
 the-0.021
 it-0.020
 him-0.020
Token-
Token ID12
Feature activation-1.29e+00

Pos logit contributions
op+0.098
uffy+0.095
oke+0.095
arry+0.095
uddle+0.095
Neg logit contributions
 such-0.161
 his-0.158
 the-0.156
 it-0.153
 him-0.150
Tokenray
Token ID2433
Feature activation-2.87e+00

Pos logit contributions
oke+0.025
uddle+0.025
uffy+0.025
arry+0.025
op+0.025
Neg logit contributions
 such-0.030
 his-0.030
 the-0.029
 it-0.029
 him-0.028
Token them
Token ID606
Feature activation+2.44e+00

Pos logit contributions
 such+0.038
 hurting+0.037
 permission+0.036
 yours+0.035
 letting+0.034
Neg logit contributions
uddle-0.164
op-0.159
oles-0.158
olly-0.158
rig-0.157
Token fall
Token ID2121
Feature activation+2.25e+00

Pos logit contributions
 such+0.042
 him+0.037
 his+0.037
 it+0.036
 the+0.036
Neg logit contributions
uddle-0.060
op-0.060
uffy-0.058
rig-0.057
anda-0.055
Token.
Token ID13
Feature activation-2.23e+00

Pos logit contributions
 such+0.037
 his+0.035
 the+0.034
 it+0.033
 him+0.032
Neg logit contributions
op-0.059
uddle-0.058
uffy-0.058
arry-0.058
oke-0.057
Token They
Token ID1119
Feature activation+5.34e+00

Pos logit contributions
opus+0.059
ari+0.058
ane+0.058
pillar+0.057
op+0.057
Neg logit contributions
 the-0.039
 his-0.038
 it-0.037
 such-0.036
 him-0.034
Token tried
Token ID3088
Feature activation+5.38e+00

Pos logit contributions
 such+0.108
 yours+0.107
 yourself+0.097
 your+0.097
 permission+0.095
Neg logit contributions
op-0.122
uddle-0.110
oke-0.109
uffy-0.109
anda-0.108
Token to
Token ID284
Feature activation+8.41e+00

Pos logit contributions
 such+0.081
 yours+0.071
 permission+0.069
 rid+0.068
 saved+0.068
Neg logit contributions
op-0.142
uddle-0.136
uffy-0.135
arry-0.132
oke-0.132
Token resist
Token ID4180
Feature activation+5.38e+00

Pos logit contributions
 such+0.155
 nothing+0.133
 his+0.132
 it+0.128
 yours+0.127
Neg logit contributions
rig-0.194
op-0.193
arry-0.188
uddle-0.185
zag-0.185
Token,
Token ID11
Feature activation+4.33e+00

Pos logit contributions
 such+0.058
 yours+0.053
 rid+0.052
 saved+0.052
 fixed+0.049
Neg logit contributions
op-0.156
uddle-0.149
oke-0.148
oles-0.148
rig-0.147
Token but
Token ID475
Feature activation+3.62e+00

Pos logit contributions
 such+0.103
 his+0.099
 yours+0.095
 him+0.094
 the+0.093
Neg logit contributions
op-0.076
oke-0.076
rig-0.072
uddle-0.071
arry-0.070
Token they
Token ID484
Feature activation+4.11e+00

Pos logit contributions
 such+0.043
 yours+0.039
ps+0.036
 permission+0.036
 fixed+0.036
Neg logit contributions
oles-0.098
op-0.097
uffy-0.096
oke-0.093
ho-0.093
Token were
Token ID547
Feature activation+4.64e+00

Pos logit contributions
 such+0.076
 his+0.070
 yours+0.067
 the+0.066
 permission+0.066
Neg logit contributions
op-0.106
uffy-0.101
uddle-0.099
oke-0.098
olly-0.096
Tokenie
Token ID494
Feature activation-2.51e+00

Pos logit contributions
op+0.091
arry+0.091
opus+0.091
olly+0.091
uddle+0.090
Neg logit contributions
 the-0.078
 his-0.077
 such-0.075
 him-0.072
 it-0.071
Token.
Token ID13
Feature activation-2.79e+00

Pos logit contributions
olly+0.073
arry+0.073
uddle+0.072
uffy+0.072
oke+0.072
Neg logit contributions
 the-0.037
 such-0.037
 his-0.036
 it-0.035
 him-0.032
Token Tweet
Token ID18752
Feature activation-2.14e+00

Pos logit contributions
op+0.071
oke+0.071
uffy+0.070
arry+0.070
uddle+0.070
Neg logit contributions
 such-0.048
 his-0.047
 the-0.046
 it-0.045
 him-0.043
Tokenie
Token ID494
Feature activation+6.59e-01

Pos logit contributions
op+0.044
arry+0.043
uddle+0.043
olly+0.043
oke+0.043
Neg logit contributions
 his-0.048
 such-0.048
 the-0.047
 it-0.046
 him-0.045
Token tried
Token ID3088
Feature activation+5.80e+00

Pos logit contributions
 such+0.010
 his+0.010
 the+0.010
 it+0.010
 him+0.009
Neg logit contributions
op-0.018
oke-0.017
uffy-0.017
arry-0.017
uddle-0.017
Token to
Token ID284
Feature activation+8.40e+00

Pos logit contributions
 such+0.084
 his+0.072
 the+0.070
 it+0.070
 yours+0.070
Neg logit contributions
op-0.158
arry-0.155
uffy-0.154
oke-0.151
uddle-0.151
Token be
Token ID307
Feature activation+6.27e+00

Pos logit contributions
 such+0.145
 nothing+0.127
 his+0.125
 it+0.124
 yours+0.120
Neg logit contributions
arry-0.192
rig-0.191
pillar-0.189
op-0.188
zag-0.187
Token brave
Token ID14802
Feature activation-1.11e+00

Pos logit contributions
 permission+0.100
 such+0.097
 him+0.096
 yours+0.091
 it+0.089
Neg logit contributions
arry-0.130
ang-0.129
zag-0.128
oles-0.126
anda-0.125
Token,
Token ID11
Feature activation+4.17e+00

Pos logit contributions
oke+0.023
op+0.023
arry+0.022
olly+0.022
uddle+0.022
Neg logit contributions
 such-0.024
 his-0.024
 the-0.023
 it-0.022
 him-0.022
Token but
Token ID475
Feature activation+3.50e+00

Pos logit contributions
 such+0.100
 his+0.096
 him+0.092
 permission+0.091
 rid+0.090
Neg logit contributions
oke-0.072
op-0.070
rig-0.070
pillar-0.069
arry-0.069
Token the
Token ID262
Feature activation-1.99e+00

Pos logit contributions
 such+0.034
 permission+0.030
 respect+0.028
ps+0.028
 yours+0.025
Neg logit contributions
uffy-0.102
oles-0.101
ho-0.100
oke-0.099
erry-0.099
Token and
Token ID290
Feature activation+5.04e+00

Pos logit contributions
op+0.013
oke+0.013
uffy+0.013
uddle+0.012
ari+0.012
Neg logit contributions
 such-0.009
 his-0.008
 yours-0.008
 the-0.008
 it-0.008
Token teachers
Token ID7799
Feature activation+4.31e-01

Pos logit contributions
 such+0.070
 yours+0.062
 yourself+0.054
 rid+0.053
 respect+0.053
Neg logit contributions
oke-0.127
op-0.125
ule-0.124
uffy-0.124
oles-0.121
Token arrived
Token ID5284
Feature activation+2.08e+00

Pos logit contributions
 such+0.008
 his+0.008
 yours+0.007
 the+0.007
 it+0.007
Neg logit contributions
op-0.010
oke-0.010
ari-0.010
uffy-0.010
uddle-0.010
Token and
Token ID290
Feature activation+6.05e+00

Pos logit contributions
 such+0.041
 his+0.039
 yours+0.039
 anything+0.038
 him+0.037
Neg logit contributions
uddle-0.045
op-0.044
oke-0.043
uffy-0.043
arry-0.042
Token tried
Token ID3088
Feature activation+5.48e+00

Pos logit contributions
 yours+0.100
 such+0.099
 yourself+0.088
 hers+0.087
 himself+0.080
Neg logit contributions
oke-0.144
ari-0.136
op-0.134
uddle-0.129
uffy-0.129
Token to
Token ID284
Feature activation+8.37e+00

Pos logit contributions
 such+0.104
 yours+0.104
 yourself+0.093
ts+0.091
 hers+0.091
Neg logit contributions
op-0.112
zag-0.105
uffy-0.104
uddy-0.103
s-0.099
Token put
Token ID1234
Feature activation+5.93e+00

Pos logit contributions
 such+0.124
 nothing+0.108
 yours+0.108
 hers+0.107
 it+0.103
Neg logit contributions
rig-0.218
op-0.212
uffy-0.210
arry-0.207
uddle-0.207
Token out
Token ID503
Feature activation+3.56e+00

Pos logit contributions
 letting+0.059
 such+0.058
 permission+0.057
 never+0.054
 hurting+0.049
Neg logit contributions
uffy-0.162
uddle-0.162
op-0.160
rig-0.149
ite-0.149
Token the
Token ID262
Feature activation-2.15e+00

Pos logit contributions
 hurting+0.021
 never+0.019
 such+0.018
 letting+0.018
 yours+0.017
Neg logit contributions
op-0.110
uffy-0.110
uddle-0.109
oke-0.108
ari-0.106
Token fire
Token ID2046
Feature activation-5.78e+00

Pos logit contributions
uddle+0.052
op+0.051
uffy+0.050
oke+0.050
olly+0.049
Neg logit contributions
 such-0.039
 his-0.037
 it-0.036
 him-0.035
 the-0.035
Token.
Token ID13
Feature activation-2.74e+00

Pos logit contributions
olly+0.155
op+0.155
arry+0.154
oke+0.154
uffy+0.153
Neg logit contributions
 such-0.092
 his-0.091
 the-0.090
 it-0.086
 him-0.083
Token't
Token ID470
Feature activation+2.59e+00

Pos logit contributions
 such+0.066
 his+0.062
 permission+0.060
 the+0.059
 him+0.057
Neg logit contributions
op-0.075
opus-0.074
uffy-0.073
oke-0.073
olly-0.073
Token I
Token ID314
Feature activation+3.67e+00

Pos logit contributions
 such+0.040
 his+0.038
 the+0.037
 it+0.036
 him+0.035
Neg logit contributions
op-0.071
arry-0.069
oke-0.069
olly-0.069
uffy-0.069
Token play
Token ID711
Feature activation-1.33e+00

Pos logit contributions
 such+0.046
 his+0.045
 the+0.043
 it+0.042
 him+0.040
Neg logit contributions
op-0.111
oke-0.110
uffy-0.109
arry-0.109
uddle-0.109
Token with
Token ID351
Feature activation+1.83e+00

Pos logit contributions
uddle+0.030
op+0.030
opus+0.029
olly+0.029
uffy+0.029
Neg logit contributions
 permission-0.024
 such-0.024
 never-0.023
 his-0.023
 nothing-0.022
Token the
Token ID262
Feature activation-4.48e+00

Pos logit contributions
 never+0.018
 permission+0.017
 saved+0.017
 such+0.017
 cleaned+0.016
Neg logit contributions
opus-0.054
op-0.053
pillar-0.052
uddle-0.052
olly-0.052
Token oct
Token ID19318
Feature activation-1.55e+01

Pos logit contributions
op+0.097
uffy+0.097
arry+0.096
oke+0.096
olly+0.095
Neg logit contributions
 such-0.095
 his-0.094
 the-0.093
 it-0.091
 him-0.088
Tokenopus
Token ID25790
Feature activation-7.47e+00

Pos logit contributions
rig+0.176
uffy+0.165
 Today+0.159
adow+0.157
eth+0.154
Neg logit contributions
 him-0.412
 it-0.406
 his-0.393
 such-0.392
 help-0.383
Token,
Token ID11
Feature activation+1.64e+00

Pos logit contributions
arry+0.187
op+0.187
oke+0.185
uffy+0.183
olly+0.182
Neg logit contributions
 such-0.135
 his-0.133
 the-0.131
 it-0.126
 him-0.121
Token too
Token ID1165
Feature activation-7.51e-01

Pos logit contributions
 such+0.023
 his+0.022
 the+0.022
 it+0.021
 him+0.021
Neg logit contributions
op-0.046
oke-0.046
uffy-0.045
uddle-0.045
olly-0.045
Token?"
Token ID1701
Feature activation-4.00e+00

Pos logit contributions
op+0.018
oke+0.018
uffy+0.018
arry+0.018
uddle+0.018
Neg logit contributions
 such-0.014
 his-0.014
 the-0.013
 it-0.013
 him-0.013
Token\n
Token ID198
Feature activation-3.25e+00

Pos logit contributions
uddle+0.092
olly+0.091
ari+0.090
oke+0.090
arry+0.090
Neg logit contributions
 his-0.087
 the-0.085
 it-0.082
 such-0.082
 him-0.080
Token he
Token ID339
Feature activation+6.94e-01

Pos logit contributions
op+0.014
olly+0.014
uddle+0.014
uffy+0.014
oke+0.014
Neg logit contributions
 such-0.006
 his-0.005
 the-0.005
 him-0.005
 it-0.005
Token waved
Token ID26834
Feature activation+9.65e-02

Pos logit contributions
 such+0.011
 his+0.010
 the+0.010
 him+0.009
 it+0.009
Neg logit contributions
op-0.019
uddle-0.019
olly-0.018
uffy-0.018
oke-0.018
Token to
Token ID284
Feature activation+3.21e+00

Pos logit contributions
 such+0.001
 never+0.001
 ruining+0.001
 saving+0.001
 protect+0.001
Neg logit contributions
olly-0.002
op-0.002
oodle-0.002
arry-0.002
uddle-0.002
Token the
Token ID262
Feature activation-4.12e+00

Pos logit contributions
 such+0.014
 never+0.010
 saved+0.009
 nothing+0.009
 hurting+0.009
Neg logit contributions
olly-0.109
rig-0.108
op-0.108
arry-0.108
ule-0.107
Token r
Token ID374
Feature activation-9.61e+00

Pos logit contributions
op+0.101
arry+0.100
oke+0.100
uffy+0.099
olly+0.097
Neg logit contributions
 his-0.076
 such-0.076
 the-0.076
 it-0.073
 him-0.070
Tokenasp
Token ID5126
Feature activation-1.54e+01

Pos logit contributions
pillar+0.182
opus+0.177
oke+0.175
uffy+0.173
rig+0.172
Neg logit contributions
 his-0.219
 such-0.215
 permission-0.209
 himself-0.209
 the-0.208
Tokenberries
Token ID20853
Feature activation-5.55e+00

Pos logit contributions
rig+0.218
uffy+0.206
ari+0.205
arry+0.200
oodle+0.199
Neg logit contributions
 the-0.455
 it-0.441
 his-0.434
 such-0.432
 him-0.430
Token and
Token ID290
Feature activation+3.12e+00

Pos logit contributions
oke+0.149
uffy+0.149
op+0.148
arry+0.148
uddle+0.147
Neg logit contributions
 such-0.089
 his-0.087
 the-0.086
 it-0.083
 him-0.081
Token said
Token ID531
Feature activation+8.16e-01

Pos logit contributions
 such+0.050
 permission+0.044
 him+0.043
 anything+0.042
 his+0.041
Neg logit contributions
op-0.079
uddle-0.077
oke-0.077
oodle-0.076
pill-0.076
Token,
Token ID11
Feature activation+7.89e-01

Pos logit contributions
 such+0.016
 his+0.016
 the+0.016
 it+0.015
 him+0.015
Neg logit contributions
op-0.018
oke-0.018
uffy-0.018
arry-0.018
uddle-0.018
Token "
Token ID366
Feature activation-1.36e+00

Pos logit contributions
 the+0.013
 his+0.013
 it+0.013
 such+0.012
 him+0.012
Neg logit contributions
uffy-0.022
uddle-0.022
olly-0.021
op-0.021
ari-0.021
Token as
Token ID355
Feature activation+1.75e+00

Pos logit contributions
op+0.026
oke+0.026
uffy+0.026
arry+0.026
uddle+0.026
Neg logit contributions
 such-0.015
 his-0.014
 the-0.014
 it-0.013
 him-0.013
Token they
Token ID484
Feature activation+1.83e+00

Pos logit contributions
 such+0.023
 permission+0.020
 his+0.020
 him+0.019
 never+0.019
Neg logit contributions
op-0.049
uffy-0.049
olly-0.049
oke-0.048
ule-0.048
Token drank
Token ID24070
Feature activation+1.68e+00

Pos logit contributions
 such+0.030
 his+0.028
 it+0.025
 the+0.025
 nothing+0.025
Neg logit contributions
op-0.049
oke-0.048
uddle-0.047
uffy-0.047
olly-0.047
Token the
Token ID262
Feature activation-3.43e+00

Pos logit contributions
 such+0.018
 his+0.017
 permission+0.015
 the+0.015
 never+0.015
Neg logit contributions
op-0.052
uddle-0.051
uffy-0.051
oke-0.050
olly-0.050
Token mil
Token ID1465
Feature activation-8.16e+00

Pos logit contributions
arry+0.094
oke+0.094
op+0.093
ari+0.092
olly+0.092
Neg logit contributions
 such-0.054
 his-0.053
 the-0.053
 it-0.051
 him-0.048
Tokenksh
Token ID50133
Feature activation-1.51e+01

Pos logit contributions
arry+0.148
olly+0.147
ari+0.141
opus+0.140
oke+0.140
Neg logit contributions
 such-0.196
 him-0.187
 the-0.186
 rid-0.185
 his-0.184
Tokenakes
Token ID1124
Feature activation-3.97e+00

Pos logit contributions
pillar+0.230
oke+0.222
ari+0.217
olly+0.216
opus+0.214
Neg logit contributions
 him-0.390
 such-0.388
 care-0.377
 each-0.375
 his-0.372
Token.
Token ID13
Feature activation-3.69e+00

Pos logit contributions
op+0.107
oke+0.105
uffy+0.105
uddle+0.105
arry+0.105
Neg logit contributions
 such-0.063
 his-0.061
 the-0.059
 it-0.057
 him-0.056
Token
Token ID220
Feature activation-5.48e+00

Pos logit contributions
uddle+0.092
opus+0.090
op+0.089
ari+0.088
uffy+0.088
Neg logit contributions
 the-0.072
 his-0.070
 it-0.070
 such-0.069
 him-0.065
Token\n
Token ID198
Feature activation-3.20e+00

Pos logit contributions
opus+0.143
pillar+0.141
uddle+0.137
actus+0.136
ari+0.135
Neg logit contributions
 the-0.103
 it-0.098
 his-0.097
 such-0.095
 him-0.094
Token\n
Token ID198
Feature activation-2.68e+00

Pos logit contributions
uddle+0.078
uffy+0.076
ari+0.075
ane+0.074
anda+0.074
Neg logit contributions
 the-0.064
 his-0.064
 it-0.062
 such-0.060
 him-0.058
Tokenksh
Token ID50133
Feature activation-1.54e+01

Pos logit contributions
arry+0.172
olly+0.169
opus+0.164
uffy+0.164
uddle+0.163
Neg logit contributions
 such-0.186
 permission-0.173
 the-0.173
 him-0.172
 his-0.172
Tokenakes
Token ID1124
Feature activation-4.53e+00

Pos logit contributions
pillar+0.290
pill+0.267
opus+0.267
olly+0.267
oke+0.265
Neg logit contributions
 him-0.354
 such-0.352
 his-0.334
 each-0.330
 permission-0.326
Token.
Token ID13
Feature activation-3.92e+00

Pos logit contributions
op+0.119
oke+0.117
uffy+0.117
uddle+0.117
arry+0.116
Neg logit contributions
 such-0.074
 his-0.072
 the-0.070
 it-0.068
 him-0.066
Token The
Token ID383
Feature activation-3.80e+00

Pos logit contributions
uddle+0.105
opus+0.101
actus+0.101
op+0.100
rig+0.100
Neg logit contributions
 the-0.070
 his-0.070
 it-0.068
 such-0.067
 him-0.064
Token mil
Token ID1465
Feature activation-8.22e+00

Pos logit contributions
op+0.090
arry+0.089
oke+0.088
uffy+0.088
olly+0.088
Neg logit contributions
 such-0.073
 his-0.072
 the-0.072
 it-0.069
 him-0.067
Tokenksh
Token ID50133
Feature activation-1.51e+01

Pos logit contributions
olly+0.145
arry+0.144
ari+0.139
uddle+0.137
pillar+0.136
Neg logit contributions
 such-0.202
 his-0.195
 him-0.195
 the-0.193
 it-0.189
Tokenakes
Token ID1124
Feature activation-2.98e+00

Pos logit contributions
pillar+0.240
olly+0.236
ari+0.236
arry+0.230
oke+0.229
Neg logit contributions
 such-0.383
 him-0.380
 his-0.369
 each-0.360
 the-0.358
Token were
Token ID547
Feature activation+3.02e+00

Pos logit contributions
op+0.080
oke+0.078
uddle+0.077
uffy+0.077
olly+0.076
Neg logit contributions
 such-0.048
 his-0.045
 the-0.042
 permission-0.041
 him-0.041
Token blue
Token ID4171
Feature activation-5.55e+00

Pos logit contributions
 such+0.043
 permission+0.041
 his+0.039
 him+0.039
 protect+0.038
Neg logit contributions
op-0.076
oke-0.076
oodle-0.075
uffy-0.074
rig-0.074
Token and
Token ID290
Feature activation+3.53e+00

Pos logit contributions
arry+0.144
oke+0.143
rig+0.142
uffy+0.141
ari+0.141
Neg logit contributions
 such-0.099
 his-0.096
 the-0.096
 it-0.091
 him-0.090
Token pink
Token ID11398
Feature activation-3.93e+00

Pos logit contributions
 such+0.051
 permission+0.051
 him+0.046
 his+0.045
 yours+0.043
Neg logit contributions
op-0.091
oke-0.090
uffy-0.089
pill-0.088
oodle-0.085
Token a
Token ID257
Feature activation-2.76e+00

Pos logit contributions
 such+0.023
 permission+0.018
 never+0.016
 yours+0.016
 his+0.015
Neg logit contributions
oke-0.117
op-0.114
uddle-0.113
uffy-0.113
olly-0.112
Token big
Token ID1263
Feature activation-4.48e+00

Pos logit contributions
arry+0.076
olly+0.076
op+0.076
ari+0.074
oke+0.074
Neg logit contributions
 the-0.045
 it-0.043
 his-0.043
 such-0.040
 a-0.039
Token net
Token ID2010
Feature activation-5.09e+00

Pos logit contributions
op+0.100
arry+0.099
uffy+0.099
olly+0.098
oke+0.098
Neg logit contributions
 the-0.092
 his-0.092
 such-0.091
 it-0.090
 him-0.085
Token around
Token ID1088
Feature activation+9.58e-01

Pos logit contributions
op+0.135
oke+0.133
uddle+0.132
olly+0.132
uffy+0.132
Neg logit contributions
 such-0.082
 his-0.079
 the-0.076
 it-0.074
 him-0.073
Token the
Token ID262
Feature activation-4.23e+00

Pos logit contributions
 such+0.007
 permission+0.006
 rid+0.006
 yours+0.005
 protect+0.005
Neg logit contributions
op-0.030
oles-0.030
oke-0.030
uffy-0.030
olly-0.030
Token oct
Token ID19318
Feature activation-1.51e+01

Pos logit contributions
op+0.082
arry+0.081
oke+0.081
uffy+0.080
olly+0.079
Neg logit contributions
 such-0.101
 the-0.100
 his-0.099
 it-0.097
 him-0.093
Tokenopus
Token ID25790
Feature activation-7.19e+00

Pos logit contributions
rig+0.160
uffy+0.159
 Today+0.155
eth+0.152
erry+0.147
Neg logit contributions
 him-0.411
 it-0.397
 such-0.396
 his-0.394
 that-0.389
Token.
Token ID13
Feature activation-3.45e+00

Pos logit contributions
arry+0.199
uffy+0.199
olly+0.196
op+0.195
oke+0.195
Neg logit contributions
 his-0.110
 such-0.110
 the-0.109
 it-0.108
 him-0.100
Token It
Token ID632
Feature activation+6.94e-01

Pos logit contributions
uffy+0.099
uddle+0.098
op+0.098
opus+0.095
rig+0.095
Neg logit contributions
 the-0.056
 his-0.055
 it-0.054
 such-0.052
 him-0.048
Token was
Token ID373
Feature activation+3.23e+00

Pos logit contributions
 such+0.013
 his+0.012
 the+0.012
 it+0.012
 him+0.011
Neg logit contributions
op-0.017
oke-0.017
uffy-0.017
uddle-0.017
arry-0.017
Token stuck
Token ID7819
Feature activation-1.17e+00

Pos logit contributions
 care+0.034
 permission+0.034
 respect+0.030
 such+0.030
 rid+0.030
Neg logit contributions
rig-0.089
oodle-0.088
oles-0.087
oke-0.085
uffy-0.084
Token wanted
Token ID2227
Feature activation+4.11e+00

Pos logit contributions
 his+0.044
 such+0.044
 the+0.041
 its+0.041
 hers+0.040
Neg logit contributions
op-0.053
uffy-0.051
oke-0.049
uddle-0.049
arry-0.048
Token to
Token ID284
Feature activation+5.04e+00

Pos logit contributions
ps+0.071
 such+0.070
 saved+0.070
 protected+0.070
ts+0.069
Neg logit contributions
op-0.075
uddle-0.073
s-0.070
uffy-0.069
olly-0.068
Token play
Token ID711
Feature activation-9.41e-01

Pos logit contributions
 such+0.073
 permission+0.066
 nothing+0.065
 his+0.064
 hers+0.063
Neg logit contributions
op-0.132
arry-0.128
uffy-0.128
uddle-0.128
rig-0.125
Token with
Token ID351
Feature activation+2.17e+00

Pos logit contributions
uddle+0.025
op+0.024
uffy+0.024
oke+0.023
olly+0.023
Neg logit contributions
 such-0.014
 permission-0.014
 his-0.013
 never-0.013
 him-0.012
Token the
Token ID262
Feature activation-4.20e+00

Pos logit contributions
 such+0.020
 never+0.020
 his+0.020
 hers+0.019
 letting+0.019
Neg logit contributions
op-0.066
uddle-0.066
uffy-0.065
olly-0.062
opus-0.062
Token oct
Token ID19318
Feature activation-1.49e+01

Pos logit contributions
op+0.109
oke+0.108
uffy+0.108
uddle+0.107
arry+0.107
Neg logit contributions
 such-0.071
 his-0.069
 the-0.067
 it-0.065
 him-0.063
Tokenopus
Token ID25790
Feature activation-6.89e+00

Pos logit contributions
rig+0.156
erry+0.153
adow+0.146
 Today+0.142
bi+0.138
Neg logit contributions
 him-0.406
 his-0.395
 it-0.395
 not-0.383
 such-0.375
Token,
Token ID11
Feature activation+2.99e+00

Pos logit contributions
arry+0.180
op+0.179
oke+0.178
uffy+0.177
olly+0.177
Neg logit contributions
 such-0.115
 his-0.114
 the-0.113
 it-0.109
 him-0.104
Token but
Token ID475
Feature activation+2.75e+00

Pos logit contributions
 such+0.050
 his+0.047
 the+0.044
 him+0.043
 it+0.043
Neg logit contributions
oke-0.077
op-0.076
uffy-0.076
olly-0.076
uddle-0.076
Token we
Token ID356
Feature activation+2.35e+00

Pos logit contributions
 such+0.044
 rid+0.040
 care+0.040
 permission+0.040
ps+0.036
Neg logit contributions
oke-0.069
uffy-0.061
oles-0.061
opus-0.061
op-0.061
Token broke
Token ID6265
Feature activation+2.86e+00

Pos logit contributions
 such+0.043
 his+0.040
 the+0.037
 its+0.036
 permission+0.036
Neg logit contributions
op-0.059
uffy-0.057
oke-0.057
uddle-0.055
anda-0.054
Token fish
Token ID5916
Feature activation-6.03e+00

Pos logit contributions
op+0.068
uffy+0.067
oke+0.067
arry+0.066
pill+0.066
Neg logit contributions
 the-0.042
 his-0.041
 it-0.041
 such-0.038
 a-0.037
Token came
Token ID1625
Feature activation+2.55e+00

Pos logit contributions
uffy+0.176
uddle+0.173
olly+0.173
arry+0.173
op+0.172
Neg logit contributions
 his-0.088
 the-0.086
 it-0.082
 such-0.081
 him-0.080
Token and
Token ID290
Feature activation+3.39e+00

Pos logit contributions
 his+0.030
 the+0.029
 such+0.028
 it+0.027
 him+0.026
Neg logit contributions
uddle-0.083
uffy-0.083
oke-0.083
op-0.083
olly-0.082
Token asked
Token ID1965
Feature activation+1.85e+00

Pos logit contributions
 such+0.057
 his+0.049
 permission+0.049
 him+0.047
 the+0.047
Neg logit contributions
op-0.086
oke-0.086
pillar-0.084
opus-0.084
arry-0.084
Token the
Token ID262
Feature activation-4.55e+00

Pos logit contributions
 such+0.017
 his+0.014
 never+0.013
 the+0.013
 it+0.013
Neg logit contributions
oke-0.060
op-0.060
arry-0.060
olly-0.060
uffy-0.059
Token oct
Token ID19318
Feature activation-1.48e+01

Pos logit contributions
op+0.117
uffy+0.115
oke+0.115
arry+0.115
uddle+0.114
Neg logit contributions
 such-0.079
 his-0.077
 the-0.075
 it-0.073
 him-0.071
Tokenopus
Token ID25790
Feature activation-6.13e+00

Pos logit contributions
uffy+0.170
 Today+0.162
rig+0.147
eth+0.144
itten+0.131
Neg logit contributions
 it-0.417
 him-0.408
 his-0.393
 never-0.388
 such-0.387
Token why
Token ID1521
Feature activation+2.26e+00

Pos logit contributions
op+0.170
uddle+0.168
uffy+0.167
arry+0.167
oke+0.166
Neg logit contributions
 his-0.095
 such-0.094
 the-0.092
 it-0.089
 him-0.085
Token it
Token ID340
Feature activation+1.72e+00

Pos logit contributions
 such+0.022
 permission+0.018
 his+0.018
 never+0.016
 rid+0.016
Neg logit contributions
op-0.073
oke-0.073
uffy-0.071
arry-0.071
olly-0.070
Token wasn
Token ID2492
Feature activation+1.81e+00

Pos logit contributions
 such+0.031
 his+0.030
 the+0.030
 it+0.029
 him+0.028
Neg logit contributions
op-0.042
oke-0.041
uffy-0.041
uddle-0.041
arry-0.041
Token't
Token ID470
Feature activation+3.89e+00

Pos logit contributions
 such+0.030
 the+0.030
 his+0.030
 it+0.029
 him+0.028
Neg logit contributions
uddle-0.050
uffy-0.049
oke-0.049
ari-0.048
op-0.048
Token,
Token ID11
Feature activation+1.29e+00

Pos logit contributions
rig+0.074
op+0.072
pill+0.072
arry+0.070
pillar+0.070
Neg logit contributions
 the-0.054
 it-0.052
 his-0.050
 such-0.048
 a-0.046
Token duck
Token ID22045
Feature activation-5.68e+00

Pos logit contributions
 the+0.025
 his+0.024
 it+0.024
 such+0.024
 him+0.022
Neg logit contributions
op-0.032
uffy-0.032
arry-0.032
rig-0.031
uddle-0.031
Token,"
Token ID553
Feature activation-2.14e+00

Pos logit contributions
pill+0.154
op+0.152
ari+0.150
ane+0.150
rig+0.150
Neg logit contributions
 the-0.095
 it-0.090
 his-0.089
 such-0.084
 him-0.080
Token said
Token ID531
Feature activation+1.94e-01

Pos logit contributions
olly+0.054
uffy+0.053
op+0.053
oke+0.053
arry+0.052
Neg logit contributions
 such-0.036
 permission-0.032
 his-0.032
 never-0.031
 the-0.030
Token the
Token ID262
Feature activation-4.35e+00

Pos logit contributions
 such+0.003
 never+0.003
 his+0.003
 permission+0.002
 him+0.002
Neg logit contributions
uffy-0.005
op-0.005
olly-0.005
oke-0.005
arry-0.005
Token oct
Token ID19318
Feature activation-1.48e+01

Pos logit contributions
arry+0.099
rig+0.098
oke+0.098
op+0.098
uffy+0.098
Neg logit contributions
 the-0.091
 his-0.090
 it-0.087
 such-0.085
 him-0.081
Tokenopus
Token ID25790
Feature activation-6.38e+00

Pos logit contributions
rig+0.168
 Today+0.165
yt+0.155
erry+0.141
uffy+0.140
Neg logit contributions
 it-0.402
 him-0.384
 his-0.383
 not-0.374
 never-0.373
Token.
Token ID13
Feature activation-3.79e+00

Pos logit contributions
uddle+0.186
uffy+0.185
oke+0.184
arry+0.184
op+0.183
Neg logit contributions
 his-0.093
 the-0.089
 such-0.085
 it-0.085
 him-0.080
Token "
Token ID366
Feature activation-2.10e-01

Pos logit contributions
op+0.098
uddle+0.098
uffy+0.098
arry+0.097
ari+0.095
Neg logit contributions
 his-0.067
 the-0.066
 such-0.065
 it-0.064
 him-0.059
TokenDo
Token ID5211
Feature activation+4.36e+00

Pos logit contributions
uddle+0.007
uffy+0.007
op+0.007
oke+0.007
arry+0.006
Neg logit contributions
 his-0.003
 such-0.003
 the-0.002
 it-0.002
 him-0.002
Token you
Token ID345
Feature activation+2.10e+00

Pos logit contributions
 such+0.051
 his+0.049
 the+0.047
 it+0.045
 him+0.043
Neg logit contributions
op-0.134
uffy-0.133
oke-0.133
olly-0.133
arry-0.133
Token-
Token ID12
Feature activation-1.54e+00

Pos logit contributions
arry+0.074
rig+0.073
oke+0.070
ari+0.070
uffy+0.067
Neg logit contributions
 the-0.122
 it-0.122
 such-0.117
 his-0.117
 him-0.113
Tokenwar
Token ID5767
Feature activation-5.80e+00

Pos logit contributions
uffy+0.027
uddle+0.027
oke+0.026
arry+0.026
rig+0.025
Neg logit contributions
 such-0.039
 the-0.039
 his-0.039
 it-0.039
 him-0.038
Token,"
Token ID553
Feature activation-1.94e+00

Pos logit contributions
uffy+0.163
uddle+0.161
oke+0.159
arry+0.159
ari+0.158
Neg logit contributions
 such-0.090
 the-0.090
 his-0.088
 it-0.088
 him-0.082
Token said
Token ID531
Feature activation+7.51e-01

Pos logit contributions
op+0.046
oke+0.045
olly+0.045
uffy+0.044
arry+0.044
Neg logit contributions
 such-0.036
 permission-0.034
 his-0.033
 yours-0.032
 never-0.032
Token the
Token ID262
Feature activation-3.42e+00

Pos logit contributions
 such+0.006
 never+0.005
 permission+0.005
 his+0.004
 yours+0.004
Neg logit contributions
olly-0.025
op-0.025
oke-0.025
uffy-0.025
arry-0.024
Token oct
Token ID19318
Feature activation-1.47e+01

Pos logit contributions
uffy+0.081
arry+0.079
ari+0.079
rig+0.079
op+0.079
Neg logit contributions
 the-0.070
 his-0.070
 it-0.067
 such-0.066
 him-0.062
Tokenopus
Token ID25790
Feature activation-5.70e+00

Pos logit contributions
rig+0.160
 Today+0.159
yt+0.154
owers+0.142
uffy+0.138
Neg logit contributions
 it-0.402
 him-0.392
 his-0.390
 such-0.372
 never-0.372
Token.
Token ID13
Feature activation-3.77e+00

Pos logit contributions
uddle+0.166
uffy+0.165
oke+0.163
olly+0.162
opus+0.162
Neg logit contributions
 his-0.085
 the-0.080
 such-0.078
 it-0.076
 him-0.074
Token "
Token ID366
Feature activation+4.98e-01

Pos logit contributions
uffy+0.098
op+0.098
uddle+0.097
opus+0.096
arry+0.096
Neg logit contributions
 the-0.068
 his-0.068
 it-0.066
 such-0.065
 a-0.060
TokenI
Token ID40
Feature activation+2.11e+00

Pos logit contributions
 such+0.007
 his+0.007
 it+0.007
 the+0.007
 never+0.007
Neg logit contributions
uffy-0.014
uddle-0.014
oke-0.014
arry-0.014
olly-0.014
Token have
Token ID423
Feature activation+4.99e+00

Pos logit contributions
 such+0.035
 his+0.034
 the+0.033
 it+0.033
 him+0.031
Neg logit contributions
op-0.055
oke-0.055
olly-0.055
uffy-0.055
arry-0.055
Token Don
Token ID2094
Feature activation+1.13e+00

Pos logit contributions
uddle+0.084
rig+0.079
uffy+0.079
actus+0.078
op+0.077
Neg logit contributions
 the-0.048
 such-0.046
 it-0.046
 his-0.045
 a-0.042
Token't
Token ID470
Feature activation+3.47e+00

Pos logit contributions
 the+0.018
 it+0.016
 such+0.016
 never+0.015
 his+0.015
Neg logit contributions
uddle-0.034
oodle-0.033
actus-0.033
arry-0.033
rig-0.033
Token eat
Token ID4483
Feature activation+1.33e+00

Pos logit contributions
 such+0.052
 his+0.049
 the+0.048
 it+0.046
 him+0.045
Neg logit contributions
op-0.097
arry-0.095
oke-0.095
uffy-0.095
uddle-0.095
Token the
Token ID262
Feature activation-2.80e+00

Pos logit contributions
 such+0.010
 his+0.010
 the+0.008
 permission+0.008
 it+0.008
Neg logit contributions
op-0.045
uddle-0.045
uffy-0.045
oke-0.045
olly-0.045
Token r
Token ID374
Feature activation-8.24e+00

Pos logit contributions
op+0.074
oke+0.074
arry+0.073
uffy+0.073
olly+0.073
Neg logit contributions
 such-0.047
 his-0.045
 the-0.045
 it-0.043
 him-0.042
Tokenasp
Token ID5126
Feature activation-1.46e+01

Pos logit contributions
opus+0.154
pillar+0.152
oke+0.149
op+0.148
arry+0.147
Neg logit contributions
 such-0.195
 the-0.184
 his-0.183
 permission-0.178
 each-0.176
Tokenberries
Token ID20853
Feature activation-5.02e+00

Pos logit contributions
rig+0.225
ari+0.214
uffy+0.206
eth+0.204
ute+0.203
Neg logit contributions
 the-0.424
 it-0.413
 such-0.408
 his-0.388
 made-0.387
Token,
Token ID11
Feature activation+2.35e+00

Pos logit contributions
op+0.132
arry+0.132
oke+0.132
uffy+0.131
uddle+0.130
Neg logit contributions
 such-0.083
 his-0.080
 the-0.080
 it-0.077
 him-0.074
Token they
Token ID484
Feature activation+2.57e+00

Pos logit contributions
 such+0.033
 his+0.031
 the+0.029
 him+0.028
 it+0.028
Neg logit contributions
op-0.066
oke-0.066
uddle-0.065
uffy-0.065
olly-0.065
Token are
Token ID389
Feature activation+2.85e+00

Pos logit contributions
 his+0.046
 such+0.046
 the+0.043
 it+0.042
 him+0.042
Neg logit contributions
op-0.062
oke-0.060
uffy-0.060
uddle-0.060
olly-0.059
Token too
Token ID1165
Feature activation+1.79e-01

Pos logit contributions
 him+0.035
 such+0.035
 permission+0.034
 his+0.034
 respect+0.032
Neg logit contributions
anda-0.073
uffy-0.073
oke-0.073
oles-0.072
rig-0.071
Token water
Token ID1660
Feature activation-4.17e+00

Pos logit contributions
op+0.085
arry+0.085
oke+0.084
olly+0.083
uffy+0.083
Neg logit contributions
 such-0.080
 the-0.080
 his-0.080
 it-0.077
 him-0.074
Token and
Token ID290
Feature activation+3.44e+00

Pos logit contributions
op+0.104
oke+0.103
uffy+0.102
arry+0.102
uddle+0.102
Neg logit contributions
 such-0.074
 his-0.072
 the-0.071
 it-0.069
 him-0.067
Token wave
Token ID6769
Feature activation-9.86e-01

Pos logit contributions
 such+0.050
 permission+0.042
 his+0.041
 him+0.040
 yours+0.038
Neg logit contributions
uddle-0.091
op-0.090
rig-0.090
uffy-0.090
olly-0.088
Token at
Token ID379
Feature activation+2.21e+00

Pos logit contributions
op+0.025
olly+0.025
oke+0.025
uddle+0.024
oodle+0.024
Neg logit contributions
 such-0.016
 his-0.013
 never-0.013
 permission-0.013
 the-0.013
Token the
Token ID262
Feature activation-3.49e+00

Pos logit contributions
 such+0.011
 never+0.007
 saved+0.006
 permission+0.006
 protect+0.006
Neg logit contributions
uffy-0.079
op-0.078
olly-0.078
arry-0.076
uddle-0.076
Token oct
Token ID19318
Feature activation-1.46e+01

Pos logit contributions
op+0.083
arry+0.082
oke+0.082
uffy+0.082
olly+0.081
Neg logit contributions
 such-0.067
 the-0.066
 his-0.066
 it-0.064
 him-0.061
Tokenopus
Token ID25790
Feature activation-6.36e+00

Pos logit contributions
 Today+0.159
uffy+0.146
rig+0.145
erry+0.130
eth+0.124
Neg logit contributions
 him-0.412
 it-0.400
 such-0.394
 his-0.392
 never-0.386
Token.
Token ID13
Feature activation-3.87e+00

Pos logit contributions
op+0.171
arry+0.171
uffy+0.171
oke+0.170
olly+0.169
Neg logit contributions
 such-0.100
 his-0.099
 the-0.098
 it-0.095
 him-0.090
Token The
Token ID383
Feature activation-4.38e+00

Pos logit contributions
uddle+0.106
op+0.104
uffy+0.103
opus+0.101
ari+0.100
Neg logit contributions
 his-0.069
 the-0.068
 it-0.066
 such-0.063
 him-0.061
Token oct
Token ID19318
Feature activation-1.38e+01

Pos logit contributions
op+0.100
uffy+0.100
arry+0.099
oke+0.098
olly+0.096
Neg logit contributions
 his-0.088
 the-0.088
 such-0.086
 it-0.085
 him-0.081
Tokenopus
Token ID25790
Feature activation-4.77e+00

Pos logit contributions
 Today+0.141
rig+0.140
uffy+0.135
erry+0.124
eth+0.122
Neg logit contributions
 it-0.404
 him-0.404
 such-0.400
 his-0.394
 never-0.387
Token it
Token ID340
Feature activation-1.87e+00

Pos logit contributions
ps+0.039
 saving+0.038
 preventing+0.038
ts+0.036
 protecting+0.036
Neg logit contributions
s-0.084
uddle-0.082
ite-0.081
actus-0.080
ogging-0.080
Token?"
Token ID1701
Feature activation-4.11e+00

Pos logit contributions
op+0.040
uffy+0.040
olly+0.040
uddle+0.039
arry+0.039
Neg logit contributions
 such-0.038
 his-0.036
 the-0.036
 it-0.036
 never-0.035
Token
Token ID198
Feature activation-4.53e+00

Pos logit contributions
uddle+0.101
ari+0.099
op+0.098
rig+0.097
oke+0.097
Neg logit contributions
 his-0.086
 the-0.085
 it-0.082
 him-0.079
 her-0.078
Token
Token ID198
Feature activation-3.89e+00

Pos logit contributions
uddle+0.115
ari+0.111
oke+0.110
opus+0.109
owers+0.109
Neg logit contributions
 his-0.094
 the-0.094
 it-0.090
 him-0.086
 her-0.085
TokenThe
Token ID464
Feature activation-4.44e+00

Pos logit contributions
uddle+0.101
oke+0.101
ari+0.098
ane+0.097
uffy+0.097
Neg logit contributions
 his-0.072
 the-0.069
 it-0.066
 her-0.065
 such-0.064
Token oct
Token ID19318
Feature activation-1.46e+01

Pos logit contributions
op+0.105
uffy+0.104
oke+0.103
arry+0.103
ari+0.101
Neg logit contributions
 the-0.091
 his-0.089
 it-0.087
 a-0.082
 her-0.079
Tokenopus
Token ID25790
Feature activation-6.47e+00

Pos logit contributions
 Today+0.160
rig+0.159
uffy+0.158
eth+0.142
uddy+0.130
Neg logit contributions
 him-0.411
 it-0.411
 his-0.405
 such-0.389
 that-0.384
Token thought
Token ID1807
Feature activation+6.22e-01

Pos logit contributions
op+0.175
arry+0.173
uffy+0.172
oke+0.171
olly+0.170
Neg logit contributions
 his-0.104
 the-0.100
 such-0.100
 it-0.097
 him-0.093
Token about
Token ID546
Feature activation+2.77e+00

Pos logit contributions
 such+0.009
 his+0.008
 the+0.007
 never+0.007
 it+0.007
Neg logit contributions
oke-0.018
olly-0.018
op-0.018
uffy-0.018
uddle-0.017
Token it
Token ID340
Feature activation-2.91e+00

Pos logit contributions
 such+0.013
 permission+0.011
 rid+0.011
 his+0.007
 protect+0.007
Neg logit contributions
oke-0.098
arry-0.098
uffy-0.096
pill-0.096
op-0.096
Token for
Token ID329
Feature activation+1.98e+00

Pos logit contributions
op+0.076
uffy+0.075
oke+0.075
arry+0.075
uddle+0.075
Neg logit contributions
 such-0.049
 his-0.048
 the-0.047
 it-0.045
 him-0.044
Token friends
Token ID2460
Feature activation-2.72e+00

Pos logit contributions
op+0.060
oke+0.060
arry+0.060
olly+0.059
uddle+0.058
Neg logit contributions
 his-0.042
 such-0.042
 the-0.042
 it-0.040
 him-0.038
Token.
Token ID13
Feature activation-4.00e+00

Pos logit contributions
op+0.072
oke+0.071
uffy+0.071
arry+0.071
uddle+0.070
Neg logit contributions
 such-0.045
 his-0.044
 the-0.043
 it-0.042
 him-0.040
Token He
Token ID679
Feature activation+1.75e+00

Pos logit contributions
uddle+0.106
op+0.102
uffy+0.100
actus+0.100
opus+0.099
Neg logit contributions
 his-0.077
 the-0.076
 it-0.073
 such-0.072
 him-0.070
Token asked
Token ID1965
Feature activation+1.88e+00

Pos logit contributions
 such+0.028
 his+0.026
 the+0.026
 it+0.025
 him+0.024
Neg logit contributions
op-0.047
oke-0.046
uffy-0.046
uddle-0.046
olly-0.046
Token the
Token ID262
Feature activation-4.49e+00

Pos logit contributions
 such+0.011
 never+0.007
 it+0.007
 his+0.007
 the+0.006
Neg logit contributions
oke-0.067
op-0.066
arry-0.066
uffy-0.066
olly-0.066
Token oct
Token ID19318
Feature activation-1.45e+01

Pos logit contributions
op+0.109
arry+0.107
uffy+0.106
oke+0.105
olly+0.104
Neg logit contributions
 the-0.087
 his-0.086
 such-0.082
 it-0.082
 him-0.078
Tokenopus
Token ID25790
Feature activation-6.26e+00

Pos logit contributions
uffy+0.135
 Today+0.128
rig+0.125
owers+0.112
eth+0.111
Neg logit contributions
 him-0.441
 his-0.427
 it-0.417
 such-0.416
 himself-0.398
Token,
Token ID11
Feature activation+2.27e+00

Pos logit contributions
op+0.181
uddle+0.180
uffy+0.179
arry+0.179
oke+0.178
Neg logit contributions
 his-0.090
 such-0.087
 the-0.087
 it-0.083
 him-0.080
Token the
Token ID262
Feature activation-3.70e+00

Pos logit contributions
 such+0.045
 his+0.043
 the+0.042
 it+0.041
 him+0.041
Neg logit contributions
op-0.051
oke-0.051
arry-0.050
uffy-0.050
olly-0.050
Token whale
Token ID22206
Feature activation-6.65e+00

Pos logit contributions
op+0.084
uffy+0.084
arry+0.082
oke+0.082
uddle+0.081
Neg logit contributions
 his-0.076
 the-0.076
 such-0.074
 it-0.073
 him-0.070
Token and
Token ID290
Feature activation+3.61e+00

Pos logit contributions
op+0.163
uddle+0.161
arry+0.161
uffy+0.160
olly+0.159
Neg logit contributions
 his-0.122
 such-0.121
 the-0.120
 it-0.117
 him-0.113
Token.
Token ID13
Feature activation-3.40e+00

Pos logit contributions
op+0.159
oke+0.157
arry+0.157
uffy+0.156
uddle+0.155
Neg logit contributions
 such-0.093
 his-0.090
 the-0.088
 it-0.086
 him-0.082
Token We
Token ID775
Feature activation+2.44e+00

Pos logit contributions
uddle+0.087
uffy+0.084
actus+0.084
op+0.083
olly+0.082
Neg logit contributions
 such-0.065
 the-0.062
 it-0.061
 his-0.060
 never-0.057
Token will
Token ID481
Feature activation+4.67e+00

Pos logit contributions
 such+0.050
 his+0.050
 the+0.046
 him+0.044
 it+0.044
Neg logit contributions
op-0.054
oke-0.052
uffy-0.052
arry-0.052
uddle-0.051
Token share
Token ID2648
Feature activation+3.14e+00

Pos logit contributions
 such+0.065
 his+0.064
 nothing+0.054
 the+0.054
 him+0.054
Neg logit contributions
arry-0.124
op-0.123
ari-0.120
uffy-0.120
uddle-0.120
Token the
Token ID262
Feature activation-3.61e+00

Pos logit contributions
 permission+0.025
 such+0.023
 his+0.022
 rid+0.022
 nothing+0.021
Neg logit contributions
uddle-0.095
op-0.095
oke-0.095
pillar-0.094
opus-0.093
Token oct
Token ID19318
Feature activation-1.45e+01

Pos logit contributions
op+0.080
uffy+0.080
arry+0.079
oke+0.079
olly+0.077
Neg logit contributions
 such-0.075
 his-0.075
 the-0.074
 it-0.072
 him-0.069
Tokenopus
Token ID25790
Feature activation-5.79e+00

Pos logit contributions
uffy+0.158
 Today+0.147
rig+0.141
eth+0.133
oodle+0.130
Neg logit contributions
 it-0.383
 him-0.383
 such-0.372
 help-0.370
 his-0.370
Token together
Token ID1978
Feature activation+7.32e-01

Pos logit contributions
arry+0.158
oke+0.157
uffy+0.156
op+0.155
olly+0.152
Neg logit contributions
 such-0.094
 the-0.093
 his-0.092
 it-0.089
 him-0.081
Token.
Token ID13
Feature activation-3.26e+00

Pos logit contributions
 such+0.013
 his+0.013
 the+0.012
 it+0.012
 him+0.012
Neg logit contributions
op-0.018
oke-0.017
olly-0.017
uddle-0.017
uffy-0.017
Token Let
Token ID3914
Feature activation+2.43e+00

Pos logit contributions
uddle+0.078
op+0.076
uffy+0.076
olly+0.075
actus+0.074
Neg logit contributions
 such-0.067
 the-0.064
 it-0.063
 his-0.062
 him-0.059
Token's
Token ID338
Feature activation+3.48e+00

Pos logit contributions
 such+0.056
 the+0.055
 his+0.054
 it+0.054
 him+0.052
Neg logit contributions
arry-0.050
op-0.050
oke-0.050
uffy-0.049
uddle-0.049
Token He
Token ID679
Feature activation+2.19e+00

Pos logit contributions
uddle+0.122
uffy+0.121
opus+0.119
olly+0.118
op+0.118
Neg logit contributions
 his-0.077
 the-0.075
 it-0.072
 such-0.065
 a-0.065
Token wanted
Token ID2227
Feature activation+3.67e+00

Pos logit contributions
 such+0.035
 his+0.033
 the+0.032
 it+0.031
 him+0.031
Neg logit contributions
op-0.059
oke-0.058
uddle-0.058
uffy-0.058
olly-0.058
Token to
Token ID284
Feature activation+4.08e+00

Pos logit contributions
 such+0.061
 relief+0.057
ts+0.057
 hurting+0.057
 allowing+0.057
Neg logit contributions
op-0.075
anda-0.070
pill-0.070
oodle-0.070
uddle-0.069
Token see
Token ID766
Feature activation+3.53e+00

Pos logit contributions
 such+0.057
 nothing+0.050
 permission+0.049
 saving+0.049
 his+0.049
Neg logit contributions
op-0.109
arry-0.106
pillar-0.103
zag-0.102
rig-0.102
Token the
Token ID262
Feature activation-4.08e+00

Pos logit contributions
 never+0.023
 such+0.023
 permission+0.020
 care+0.018
 saved+0.018
Neg logit contributions
oke-0.114
op-0.112
uddle-0.112
anda-0.112
uffy-0.111
Token oct
Token ID19318
Feature activation-1.45e+01

Pos logit contributions
op+0.094
uffy+0.093
oke+0.091
olly+0.091
arry+0.091
Neg logit contributions
 such-0.083
 the-0.081
 his-0.081
 it-0.079
 him-0.076
Tokenopus
Token ID25790
Feature activation-5.19e+00

Pos logit contributions
rig+0.178
uffy+0.172
 Today+0.164
aney+0.164
eth+0.161
Neg logit contributions
 him-0.374
 it-0.372
 such-0.360
 that-0.352
 his-0.348
Token better
Token ID1365
Feature activation-1.43e+00

Pos logit contributions
op+0.136
oke+0.134
uffy+0.134
arry+0.134
uddle+0.134
Neg logit contributions
 such-0.086
 his-0.084
 the-0.082
 it-0.080
 him-0.077
Token.
Token ID13
Feature activation-3.57e+00

Pos logit contributions
op+0.038
uffy+0.038
oke+0.038
uddle+0.038
arry+0.038
Neg logit contributions
 such-0.023
 his-0.023
 the-0.023
 it-0.022
 him-0.021
Token He
Token ID679
Feature activation+2.10e+00

Pos logit contributions
op+0.081
uffy+0.081
uddle+0.081
olly+0.080
opus+0.080
Neg logit contributions
 his-0.075
 the-0.073
 such-0.073
 it-0.072
 him-0.068
Token ran
Token ID4966
Feature activation+1.69e+00

Pos logit contributions
 such+0.033
 his+0.030
 the+0.029
 it+0.028
 him+0.028
Neg logit contributions
op-0.058
oke-0.057
uddle-0.057
uffy-0.057
olly-0.057
Token sw
Token ID1509
Feature activation-6.64e+00

Pos logit contributions
op+0.131
uffy+0.130
uddle+0.130
oke+0.130
arry+0.129
Neg logit contributions
 such-0.079
 his-0.078
 the-0.076
 it-0.074
 him-0.071
Tokenam
Token ID321
Feature activation-1.75e+00

Pos logit contributions
oke+0.124
op+0.122
uffy+0.119
pill+0.118
pillar+0.118
Neg logit contributions
 his-0.161
 such-0.157
 him-0.153
 the-0.152
 it-0.150
Token over
Token ID625
Feature activation+1.79e+00

Pos logit contributions
op+0.049
uffy+0.048
oke+0.048
uddle+0.048
olly+0.047
Neg logit contributions
 his-0.028
 such-0.027
 the-0.027
 it-0.025
 him-0.025
Token to
Token ID284
Feature activation+2.91e+00

Pos logit contributions
 such+0.019
 his+0.017
 the+0.016
 it+0.016
 never+0.015
Neg logit contributions
op-0.057
arry-0.057
olly-0.057
oke-0.056
uffy-0.056
Token the
Token ID262
Feature activation-3.39e+00

Pos logit contributions
 such+0.014
 never+0.009
 his+0.009
 nothing+0.009
 it+0.007
Neg logit contributions
arry-0.105
op-0.104
olly-0.104
pillar-0.103
oke-0.102
Token oct
Token ID19318
Feature activation-1.45e+01

Pos logit contributions
op+0.071
uffy+0.070
oke+0.070
arry+0.069
olly+0.069
Neg logit contributions
 such-0.075
 his-0.073
 the-0.073
 it-0.071
 him-0.069
Tokenopus
Token ID25790
Feature activation-6.49e+00

Pos logit contributions
 Today+0.175
uffy+0.174
rig+0.157
erry+0.140
eth+0.139
Neg logit contributions
 him-0.388
 it-0.382
 such-0.373
 that-0.367
 never-0.362
Token and
Token ID290
Feature activation+3.04e+00

Pos logit contributions
uffy+0.173
arry+0.170
oke+0.170
uddle+0.168
olly+0.168
Neg logit contributions
 his-0.111
 the-0.110
 such-0.108
 it-0.106
 him-0.102
Token asked
Token ID1965
Feature activation+1.93e+00

Pos logit contributions
 such+0.048
 permission+0.041
 him+0.041
 his+0.040
 yours+0.040
Neg logit contributions
op-0.078
arry-0.077
oke-0.077
pillar-0.077
opus-0.076
Token her
Token ID607
Feature activation-1.89e+00

Pos logit contributions
 such+0.017
 his+0.013
 never+0.013
 nothing+0.012
 the+0.012
Neg logit contributions
op-0.063
oke-0.063
arry-0.063
olly-0.062
uffy-0.062
Token to
Token ID284
Feature activation+4.41e+00

Pos logit contributions
op+0.052
uffy+0.051
uddle+0.051
oke+0.051
arry+0.051
Neg logit contributions
 such-0.030
 his-0.030
 the-0.029
 it-0.028
 him-0.027
Token shell
Token ID7582
Feature activation-5.72e+00

Pos logit contributions
op+0.045
uddle+0.045
uffy+0.044
arry+0.044
olly+0.044
Neg logit contributions
 such-0.042
 his-0.040
 the-0.039
 it-0.039
 him-0.038
Token and
Token ID290
Feature activation+4.21e+00

Pos logit contributions
oke+0.139
op+0.138
arry+0.137
uffy+0.136
olly+0.135
Neg logit contributions
 such-0.107
 his-0.106
 the-0.104
 it-0.101
 him-0.096
Token pin
Token ID6757
Feature activation-4.71e-01

Pos logit contributions
 such+0.075
 permission+0.071
 yours+0.068
 him+0.063
 respect+0.063
Neg logit contributions
uddle-0.095
uffy-0.092
suit-0.089
oodle-0.089
op-0.089
Tokenched
Token ID1740
Feature activation+2.96e+00

Pos logit contributions
op+0.009
pillar+0.009
uddle+0.008
pill+0.008
oke+0.008
Neg logit contributions
 such-0.011
 his-0.010
 permission-0.010
 yours-0.010
 the-0.010
Token the
Token ID262
Feature activation-3.68e+00

Pos logit contributions
 such+0.020
 his+0.016
 never+0.015
 permission+0.014
 the+0.014
Neg logit contributions
op-0.103
uffy-0.102
olly-0.102
oke-0.101
uddle-0.101
Token oct
Token ID19318
Feature activation-1.45e+01

Pos logit contributions
op+0.083
arry+0.082
uffy+0.082
oke+0.082
olly+0.081
Neg logit contributions
 such-0.074
 his-0.073
 the-0.072
 it-0.070
 him-0.068
Tokenopus
Token ID25790
Feature activation-5.75e+00

Pos logit contributions
 Today+0.170
rig+0.163
uffy+0.148
erry+0.141
arry+0.140
Neg logit contributions
 him-0.386
 it-0.384
 his-0.367
 such-0.364
 never-0.358
Token's
Token ID338
Feature activation-1.33e+00

Pos logit contributions
oke+0.149
op+0.149
arry+0.149
uffy+0.148
olly+0.147
Neg logit contributions
 such-0.097
 his-0.095
 the-0.093
 it-0.090
 him-0.087
Token arm
Token ID3211
Feature activation-5.32e+00

Pos logit contributions
op+0.031
uddle+0.030
olly+0.030
uffy+0.030
arry+0.029
Neg logit contributions
 such-0.026
 his-0.024
 the-0.024
 it-0.024
 him-0.023
Token with
Token ID351
Feature activation+4.83e+00

Pos logit contributions
op+0.137
uffy+0.134
uddle+0.134
oke+0.134
arry+0.133
Neg logit contributions
 such-0.091
 his-0.087
 the-0.085
 it-0.083
 him-0.081
Token its
Token ID663
Feature activation-7.75e-01

Pos logit contributions
 permission+0.037
 such+0.035
 saving+0.034
 punished+0.034
 ruining+0.032
Neg logit contributions
op-0.134
pillar-0.133
pill-0.131
oles-0.131
olly-0.125
Token,
Token ID11
Feature activation+1.42e+00

Pos logit contributions
pill+0.007
ari+0.007
rig+0.007
op+0.007
ane+0.007
Neg logit contributions
 the-0.005
 it-0.005
 his-0.005
 such-0.005
 him-0.005
Token duck
Token ID22045
Feature activation-4.84e+00

Pos logit contributions
 the+0.028
 such+0.028
 his+0.028
 it+0.027
 him+0.026
Neg logit contributions
op-0.034
arry-0.033
uffy-0.033
uddle-0.033
ari-0.033
Token,"
Token ID553
Feature activation-1.77e+00

Pos logit contributions
pill+0.123
ari+0.123
arry+0.122
rig+0.122
op+0.121
Neg logit contributions
 the-0.085
 his-0.083
 it-0.082
 such-0.080
 him-0.076
Token said
Token ID531
Feature activation+6.35e-01

Pos logit contributions
oke+0.040
olly+0.040
op+0.040
uffy+0.039
arry+0.039
Neg logit contributions
 such-0.033
 permission-0.030
 yours-0.029
 never-0.029
 anything-0.028
Token the
Token ID262
Feature activation-3.99e+00

Pos logit contributions
 such+0.006
 his+0.005
 never+0.005
 permission+0.005
 it+0.005
Neg logit contributions
op-0.020
uffy-0.020
oke-0.020
olly-0.020
arry-0.020
Token oct
Token ID19318
Feature activation-1.44e+01

Pos logit contributions
arry+0.086
op+0.085
rig+0.085
ari+0.085
oke+0.085
Neg logit contributions
 the-0.086
 his-0.085
 it-0.084
 such-0.083
 him-0.079
Tokenopus
Token ID25790
Feature activation-5.99e+00

Pos logit contributions
 Today+0.186
rig+0.166
yt+0.151
adow+0.148
aris+0.137
Neg logit contributions
 it-0.384
 never-0.375
 not-0.373
 him-0.368
 now-0.361
Token.
Token ID13
Feature activation-3.57e+00

Pos logit contributions
op+0.162
uffy+0.161
uddle+0.161
arry+0.161
oke+0.160
Neg logit contributions
 his-0.094
 such-0.093
 the-0.091
 it-0.088
 him-0.085
Token "
Token ID366
Feature activation-2.39e-01

Pos logit contributions
uffy+0.093
op+0.093
uddle+0.093
olly+0.092
arry+0.092
Neg logit contributions
 his-0.062
 the-0.061
 such-0.060
 it-0.059
 him-0.055
TokenI
Token ID40
Feature activation+2.10e+00

Pos logit contributions
op+0.006
uffy+0.006
arry+0.006
uddle+0.006
oke+0.006
Neg logit contributions
 such-0.004
 his-0.004
 the-0.004
 it-0.003
 him-0.003
Token didn
Token ID1422
Feature activation+1.91e+00

Pos logit contributions
 such+0.033
 his+0.032
 the+0.031
 it+0.030
 him+0.029
Neg logit contributions
op-0.057
oke-0.057
uffy-0.057
uddle-0.056
arry-0.056
Token you
Token ID345
Feature activation-7.42e-01

Pos logit contributions
 such+0.037
 saved+0.035
 his+0.034
 nothing+0.033
 never+0.033
Neg logit contributions
op-0.063
opus-0.062
uffy-0.062
pill-0.061
olly-0.061
Token and
Token ID290
Feature activation+4.08e+00

Pos logit contributions
op+0.016
oke+0.016
uddle+0.016
uffy+0.016
olly+0.016
Neg logit contributions
 such-0.014
 his-0.014
 him-0.013
 it-0.013
 rid-0.013
Token we
Token ID356
Feature activation+2.16e+00

Pos logit contributions
 such+0.067
 him+0.059
 rid+0.058
 his+0.058
 permission+0.058
Neg logit contributions
uffy-0.097
oke-0.094
op-0.093
olly-0.093
uddle-0.093
Token love
Token ID1842
Feature activation+2.32e+00

Pos logit contributions
 such+0.031
 his+0.030
 the+0.027
 it+0.027
 anything+0.027
Neg logit contributions
op-0.062
uddle-0.060
oke-0.060
uffy-0.060
arry-0.059
Token the
Token ID262
Feature activation-4.30e+00

Pos logit contributions
 such+0.027
 saved+0.025
 his+0.024
 never+0.023
 nothing+0.023
Neg logit contributions
op-0.065
uffy-0.063
opus-0.061
oke-0.061
uddle-0.061
Token oct
Token ID19318
Feature activation-1.44e+01

Pos logit contributions
op+0.092
oke+0.090
uffy+0.090
arry+0.090
uddle+0.090
Neg logit contributions
 such-0.093
 his-0.091
 the-0.090
 it-0.088
 him-0.086
Tokenopus
Token ID25790
Feature activation-6.55e+00

Pos logit contributions
rig+0.166
 Today+0.151
uffy+0.138
adow+0.137
erry+0.135
Neg logit contributions
 him-0.401
 it-0.381
 his-0.374
 such-0.370
 rid-0.357
Token."
Token ID526
Feature activation-3.87e+00

Pos logit contributions
op+0.173
oke+0.170
uffy+0.170
uddle+0.170
olly+0.170
Neg logit contributions
 such-0.106
 his-0.102
 the-0.099
 it-0.097
 him-0.094
Token
Token ID198
Feature activation-3.34e+00

Pos logit contributions
opus+0.101
pillar+0.099
uddle+0.099
rig+0.099
anda+0.098
Neg logit contributions
 the-0.076
 his-0.075
 it-0.071
 their-0.070
 him-0.068
Token
Token ID198
Feature activation-2.55e+00

Pos logit contributions
uddle+0.076
ari+0.076
arry+0.076
pill+0.075
pillar+0.075
Neg logit contributions
 the-0.071
 his-0.070
 it-0.068
 him-0.065
 such-0.064
TokenMom
Token ID29252
Feature activation-1.74e+00

Pos logit contributions
op+0.068
oke+0.068
uffy+0.068
uddle+0.068
arry+0.067
Neg logit contributions
 such-0.041
 his-0.041
 the-0.040
 it-0.039
 him-0.037
Token
Token ID198
Feature activation-2.67e+00

Pos logit contributions
uddle+0.074
owers+0.074
opus+0.073
ari+0.072
ane+0.072
Neg logit contributions
 his-0.059
 the-0.059
 it-0.059
 him-0.055
 never-0.053
TokenTim
Token ID14967
Feature activation-2.83e+00

Pos logit contributions
uffy+0.069
oke+0.068
op+0.068
uddle+0.068
olly+0.067
Neg logit contributions
 his-0.046
 such-0.046
 the-0.044
 it-0.043
 him-0.041
Token said
Token ID531
Feature activation+1.72e+00

Pos logit contributions
op+0.079
arry+0.079
uffy+0.079
oke+0.079
uddle+0.078
Neg logit contributions
 his-0.042
 the-0.042
 such-0.041
 it-0.040
 never-0.038
Token to
Token ID284
Feature activation+3.42e+00

Pos logit contributions
 such+0.039
 his+0.035
 the+0.035
 him+0.035
 permission+0.035
Neg logit contributions
op-0.034
oke-0.034
olly-0.033
arry-0.033
uffy-0.032
Token the
Token ID262
Feature activation-3.19e+00

Pos logit contributions
 such+0.024
 rid+0.019
 permission+0.017
 respect+0.017
 yours+0.017
Neg logit contributions
arry-0.105
uffy-0.103
rig-0.102
ari-0.101
ule-0.101
Token oct
Token ID19318
Feature activation-1.44e+01

Pos logit contributions
uffy+0.073
op+0.072
oke+0.070
arry+0.070
ari+0.070
Neg logit contributions
 his-0.067
 the-0.066
 such-0.064
 it-0.064
 him-0.061
Tokenopus
Token ID25790
Feature activation-5.99e+00

Pos logit contributions
rig+0.167
 Today+0.155
uffy+0.140
adow+0.135
erry+0.133
Neg logit contributions
 him-0.409
 it-0.403
 his-0.395
 such-0.378
 not-0.374
Token,
Token ID11
Feature activation+2.82e+00

Pos logit contributions
op+0.133
uddle+0.131
arry+0.131
uffy+0.131
olly+0.130
Neg logit contributions
 his-0.125
 such-0.123
 the-0.122
 it-0.119
 him-0.115
Token "
Token ID366
Feature activation+2.45e-02

Pos logit contributions
 such+0.057
 his+0.054
 the+0.053
 it+0.052
 him+0.051
Neg logit contributions
op-0.064
oke-0.063
arry-0.062
uffy-0.062
olly-0.061
TokenCan
Token ID6090
Feature activation+3.66e+00

Pos logit contributions
 such+0.000
 his+0.000
 the+0.000
 it+0.000
 him+0.000
Neg logit contributions
op-0.001
arry-0.001
uffy-0.001
oke-0.001
uddle-0.001
Token we
Token ID356
Feature activation+4.41e+00

Pos logit contributions
 such+0.062
 his+0.057
 permission+0.053
 the+0.053
 him+0.052
Neg logit contributions
op-0.093
uffy-0.093
oke-0.091
opus-0.091
arry-0.090
Token.
Token ID13
Feature activation-3.80e+00

Pos logit contributions
oke+0.157
uddle+0.157
arry+0.157
olly+0.156
uffy+0.154
Neg logit contributions
 his-0.086
 the-0.085
 such-0.081
 it-0.080
 never-0.076
Token The
Token ID383
Feature activation-4.01e+00

Pos logit contributions
op+0.099
uddle+0.098
oke+0.098
uffy+0.098
olly+0.097
Neg logit contributions
 such-0.066
 his-0.065
 the-0.064
 it-0.062
 him-0.060
Token bird
Token ID6512
Feature activation-4.36e+00

Pos logit contributions
op+0.086
oke+0.085
arry+0.085
uffy+0.084
olly+0.084
Neg logit contributions
 the-0.087
 his-0.087
 such-0.086
 it-0.084
 him-0.080
Token wanted
Token ID2227
Feature activation+3.66e+00

Pos logit contributions
op+0.115
arry+0.114
oke+0.114
uffy+0.114
olly+0.113
Neg logit contributions
 such-0.071
 his-0.070
 the-0.069
 it-0.067
 him-0.064
Token to
Token ID284
Feature activation+5.14e+00

Pos logit contributions
 such+0.061
ps+0.060
ts+0.060
 ruining+0.058
 rid+0.058
Neg logit contributions
op-0.073
erry-0.072
ule-0.072
pillar-0.071
oles-0.070
Token be
Token ID307
Feature activation+4.12e+00

Pos logit contributions
 such+0.092
 his+0.084
 it+0.081
 nothing+0.081
 the+0.081
Neg logit contributions
op-0.123
arry-0.119
uddle-0.117
uffy-0.117
rig-0.116
Token the
Token ID262
Feature activation-3.18e+00

Pos logit contributions
 him+0.059
 nothing+0.055
 permission+0.055
 themselves+0.053
 respect+0.052
Neg logit contributions
adow-0.090
ute-0.085
zag-0.080
erry-0.080
anda-0.079
Token best
Token ID1266
Feature activation-2.74e+00

Pos logit contributions
op+0.079
oke+0.078
uffy+0.077
arry+0.077
uddle+0.077
Neg logit contributions
 such-0.058
 his-0.056
 the-0.055
 it-0.054
 him-0.052
Token fl
Token ID781
Feature activation-9.75e+00

Pos logit contributions
oke+0.067
rig+0.067
olly+0.067
arry+0.066
uffy+0.066
Neg logit contributions
 the-0.054
 it-0.053
 his-0.053
 such-0.052
 help-0.048
Tokenier
Token ID959
Feature activation-4.02e+00

Pos logit contributions
pill+0.100
oke+0.098
rig+0.095
uddle+0.079
oodle+0.078
Neg logit contributions
 such-0.302
 help-0.295
 no-0.294
 never-0.294
 it-0.293
Token in
Token ID287
Feature activation+1.96e+00

Pos logit contributions
uddle+0.101
op+0.101
uffy+0.101
olly+0.100
arry+0.100
Neg logit contributions
 such-0.073
 his-0.072
 the-0.071
 it-0.070
 him-0.065
Token
Token ID198
Feature activation-2.10e+00

Pos logit contributions
uffy+0.072
uddle+0.071
owers+0.071
pill+0.071
olly+0.069
Neg logit contributions
 the-0.058
 his-0.058
 it-0.056
 him-0.052
 such-0.051
TokenThe
Token ID464
Feature activation-3.46e+00

Pos logit contributions
op+0.052
uffy+0.051
oke+0.051
arry+0.051
uddle+0.051
Neg logit contributions
 such-0.039
 his-0.038
 the-0.037
 it-0.036
 him-0.035
Token painter
Token ID34537
Feature activation-3.42e+00

Pos logit contributions
uffy+0.074
op+0.073
arry+0.072
olly+0.072
uddle+0.072
Neg logit contributions
 the-0.077
 his-0.076
 such-0.074
 it-0.073
 him-0.069
Token smiled
Token ID13541
Feature activation-2.14e-01

Pos logit contributions
op+0.095
oke+0.093
uffy+0.093
uddle+0.092
olly+0.092
Neg logit contributions
 such-0.052
 his-0.049
 the-0.048
 it-0.046
 him-0.045
Token and
Token ID290
Feature activation+4.21e+00

Pos logit contributions
op+0.005
oke+0.005
uffy+0.005
uddle+0.005
arry+0.005
Neg logit contributions
 such-0.004
 his-0.004
 the-0.004
 it-0.004
 him-0.004
Token shook
Token ID14682
Feature activation+4.27e+00

Pos logit contributions
 such+0.079
 permission+0.075
 anything+0.071
 yours+0.071
 him+0.069
Neg logit contributions
op-0.097
oke-0.096
uddle-0.094
olly-0.094
pillar-0.094
Token his
Token ID465
Feature activation-1.10e+00

Pos logit contributions
 such+0.026
 yours+0.024
 saved+0.023
 never+0.023
 permission+0.021
Neg logit contributions
op-0.135
arry-0.133
rig-0.133
oodle-0.133
ogging-0.131
Token head
Token ID1182
Feature activation-1.92e+00

Pos logit contributions
oke+0.032
pill+0.032
arry+0.031
ari+0.031
olly+0.031
Neg logit contributions
 his-0.017
 the-0.017
 it-0.015
 permission-0.015
 such-0.015
Token.
Token ID13
Feature activation-3.19e+00

Pos logit contributions
op+0.051
oke+0.050
uffy+0.050
arry+0.050
uddle+0.050
Neg logit contributions
 such-0.032
 his-0.031
 the-0.030
 it-0.029
 him-0.028
Token
Token ID198
Feature activation-2.34e+00

Pos logit contributions
uffy+0.087
uddle+0.087
op+0.086
opus+0.086
olly+0.086
Neg logit contributions
 the-0.054
 his-0.054
 it-0.051
 such-0.050
 him-0.047
Token
Token ID198
Feature activation-1.81e+00

Pos logit contributions
uffy+0.057
uddle+0.056
owers+0.056
pill+0.055
opus+0.055
Neg logit contributions
 the-0.047
 his-0.047
 it-0.046
 him-0.042
 such-0.041
Token sad
Token ID6507
Feature activation+1.12e+00

Pos logit contributions
 such+0.062
 his+0.058
 it+0.057
 the+0.056
 him+0.056
Neg logit contributions
op-0.085
uffy-0.084
oke-0.084
uddle-0.084
olly-0.084
Token and
Token ID290
Feature activation+3.86e+00

Pos logit contributions
 such+0.024
 his+0.023
 the+0.022
 permission+0.021
 him+0.021
Neg logit contributions
op-0.024
oke-0.023
uffy-0.023
arry-0.023
uddle-0.023
Token wished
Token ID16555
Feature activation+2.92e+00

Pos logit contributions
 such+0.077
 yours+0.072
 permission+0.070
 anything+0.067
 himself+0.066
Neg logit contributions
oke-0.081
uffy-0.077
anda-0.075
olly-0.072
ari-0.071
Token that
Token ID326
Feature activation+3.00e+00

Pos logit contributions
 such+0.056
 permission+0.053
 his+0.053
 rid+0.052
 protect+0.052
Neg logit contributions
uffy-0.063
oke-0.063
olly-0.062
op-0.060
pill-0.059
Token Lily
Token ID20037
Feature activation-1.15e+00

Pos logit contributions
ps+0.051
 such+0.043
 yours+0.043
 permission+0.042
ts+0.041
Neg logit contributions
uffy-0.068
oke-0.062
op-0.062
olly-0.060
uddy-0.059
Token had
Token ID550
Feature activation+7.04e+00

Pos logit contributions
uffy+0.027
op+0.026
pillar+0.026
oke+0.025
uddle+0.025
Neg logit contributions
 such-0.022
 his-0.021
 permission-0.020
 the-0.020
 him-0.019
Token given
Token ID1813
Feature activation+5.41e+00

Pos logit contributions
 fires+0.101
ps+0.093
ts+0.085
 theirs+0.082
 yours+0.082
Neg logit contributions
uffy-0.147
ess-0.133
str-0.131
ari-0.130
op-0.128
Token the
Token ID262
Feature activation-3.26e+00

Pos logit contributions
 letting+0.040
 ruining+0.039
 saving+0.038
ts+0.036
 preventing+0.035
Neg logit contributions
uffy-0.158
olly-0.157
pillar-0.157
pill-0.156
erry-0.152
Token dog
Token ID3290
Feature activation-2.12e+00

Pos logit contributions
op+0.064
oke+0.064
arry+0.063
uffy+0.063
uddle+0.063
Neg logit contributions
 such-0.076
 his-0.074
 the-0.073
 it-0.072
 him-0.070
Token a
Token ID257
Feature activation+8.60e-02

Pos logit contributions
uffy+0.065
op+0.064
olly+0.063
uddle+0.063
ari+0.063
Neg logit contributions
 such-0.021
 it-0.019
 his-0.019
 him-0.018
 the-0.017
Token chance
Token ID2863
Feature activation+8.31e-01

Pos logit contributions
 such+0.001
 his+0.001
 the+0.001
 it+0.001
 him+0.001
Neg logit contributions
op-0.002
oke-0.002
uffy-0.002
arry-0.002
uddle-0.002
Token to
Token ID284
Feature activation+3.96e+00

Pos logit contributions
 such+0.014
 permission+0.013
 his+0.013
 yours+0.013
 rid+0.013
Neg logit contributions
op-0.022
olly-0.022
pillar-0.022
arry-0.022
uffy-0.022
Token see
Token ID766
Feature activation+3.49e+00

Pos logit contributions
 such+0.055
 his+0.050
 the+0.047
 it+0.047
 nothing+0.047
Neg logit contributions
arry-0.111
op-0.110
olly-0.109
oke-0.109
ari-0.108
Token Sammy
Token ID44067
Feature activation-2.96e+00

Pos logit contributions
 such+0.017
 never+0.016
 nothing+0.015
 permission+0.014
 his+0.014
Neg logit contributions
pillar-0.121
pill-0.120
op-0.120
olly-0.119
arry-0.119
Token and
Token ID290
Feature activation+3.22e+00

Pos logit contributions
op+0.078
uffy+0.077
arry+0.077
oke+0.077
uddle+0.077
Neg logit contributions
 such-0.050
 his-0.049
 the-0.049
 it-0.047
 him-0.045
Token they
Token ID484
Feature activation+2.62e+00

Pos logit contributions
 such+0.052
 permission+0.048
 anything+0.044
 him+0.044
 yours+0.044
Neg logit contributions
uffy-0.079
arry-0.078
oke-0.078
ule-0.077
pillar-0.077
Token became
Token ID2627
Feature activation+4.94e+00

Pos logit contributions
 such+0.042
 his+0.038
 permission+0.036
 him+0.035
 it+0.035
Neg logit contributions
op-0.069
arry-0.069
uffy-0.068
ari-0.068
uddle-0.068
Token friends
Token ID2460
Feature activation-1.49e+00

Pos logit contributions
 such+0.075
 his+0.072
 him+0.069
 it+0.069
 permission+0.068
Neg logit contributions
uffy-0.128
op-0.127
arry-0.126
uddle-0.126
olly-0.125
Token.
Token ID13
Feature activation-3.98e+00

Pos logit contributions
oke+0.041
op+0.041
uffy+0.041
arry+0.040
uddle+0.040
Neg logit contributions
 such-0.024
 his-0.023
 the-0.023
 it-0.022
 him-0.021
Token
Token ID198
Feature activation-3.01e+00

Pos logit contributions
op+0.096
oke+0.096
opus+0.095
ane+0.093
ari+0.091
Neg logit contributions
 the-0.081
 his-0.079
 it-0.079
 such-0.075
 her-0.074
Token
Token ID198
Feature activation-2.52e+00

Pos logit contributions
uddle+0.073
oke+0.073
op+0.072
opus+0.072
ari+0.071
Neg logit contributions
 the-0.060
 it-0.060
 his-0.059
 him-0.055
 such-0.054
TokenSam
Token ID16305
Feature activation-4.70e+00

Pos logit contributions
oke+0.057
op+0.057
uffy+0.057
uddle+0.056
arry+0.056
Neg logit contributions
 such-0.050
 his-0.050
 the-0.049
 it-0.048
 him-0.046
Token was
Token ID373
Feature activation+5.20e+00

Pos logit contributions
 such+0.057
 yours+0.056
 hers+0.052
 permission+0.051
 rid+0.049
Neg logit contributions
oke-0.105
uddle-0.100
ari-0.099
op-0.099
pillar-0.098
Token happy
Token ID3772
Feature activation+1.53e+00

Pos logit contributions
 such+0.077
 him+0.074
 respect+0.070
 yours+0.069
 it+0.069
Neg logit contributions
oke-0.121
oodle-0.121
adow-0.120
rig-0.120
anda-0.120
Token that
Token ID326
Feature activation+2.49e+00

Pos logit contributions
 such+0.023
 yours+0.022
 permission+0.022
 rid+0.022
 his+0.021
Neg logit contributions
uddle-0.037
op-0.037
oke-0.037
arry-0.037
uffy-0.037
Token it
Token ID340
Feature activation+1.86e+00

Pos logit contributions
 such+0.034
ps+0.033
 permission+0.032
 rid+0.032
 yours+0.030
Neg logit contributions
ule-0.061
op-0.061
oke-0.060
ane-0.060
arry-0.059
Token was
Token ID373
Feature activation+4.95e+00

Pos logit contributions
 such+0.035
 permission+0.034
 his+0.032
 yours+0.032
 anything+0.032
Neg logit contributions
pillar-0.042
oke-0.042
op-0.042
uffy-0.041
olly-0.041
Token just
Token ID655
Feature activation+3.49e+00

Pos logit contributions
ts+0.063
 permission+0.057
 bruises+0.057
 yourselves+0.057
 rid+0.055
Neg logit contributions
adow-0.107
zag-0.106
rig-0.100
uffy-0.099
ess-0.097
Token an
Token ID281
Feature activation-2.95e+00

Pos logit contributions
 rid+0.032
ts+0.031
 care+0.031
 such+0.030
 share+0.030
Neg logit contributions
rig-0.094
oke-0.090
adow-0.090
arry-0.086
ari-0.085
Token adventure
Token ID8855
Feature activation-2.09e+00

Pos logit contributions
op+0.082
oke+0.081
uffy+0.081
arry+0.081
uddle+0.081
Neg logit contributions
 such-0.044
 his-0.042
 the-0.041
 it-0.040
 him-0.039
Token!
Token ID0
Feature activation-2.80e+00

Pos logit contributions
op+0.055
uffy+0.055
oke+0.055
uddle+0.054
arry+0.054
Neg logit contributions
 such-0.034
 his-0.033
 the-0.033
 it-0.032
 him-0.031
Token<|endoftext|>
Token ID50256
Feature activation-6.84e+00

Pos logit contributions
uddle+0.075
anda+0.072
uffy+0.071
opus+0.071
op+0.070
Neg logit contributions
 his-0.055
 the-0.054
 such-0.053
 it-0.052
 him-0.049
TokenOne
Token ID3198
Feature activation-1.73e+00

Pos logit contributions
rig+0.171
uddle+0.164
anda+0.164
op+0.160
opus+0.160
Neg logit contributions
 the-0.155
 it-0.147
 his-0.146
 such-0.144
 him-0.140
Token
Token ID220
Feature activation-5.42e+00

Pos logit contributions
opus+0.127
uddle+0.126
actus+0.124
ute+0.123
pillar+0.122
Neg logit contributions
 his-0.073
 the-0.072
 it-0.069
 him-0.069
 such-0.060
Token
Token ID198
Feature activation-3.44e+00

Pos logit contributions
opus+0.150
pillar+0.146
actus+0.145
uddle+0.144
pill+0.141
Neg logit contributions
 his-0.098
 the-0.097
 it-0.095
 him-0.093
 such-0.088
Token
Token ID198
Feature activation-3.05e+00

Pos logit contributions
uddle+0.104
opus+0.102
actus+0.099
oke+0.099
anda+0.099
Neg logit contributions
 it-0.057
 the-0.057
 his-0.056
 him-0.052
 them-0.051
TokenOne
Token ID3198
Feature activation-1.85e+00

Pos logit contributions
uddle+0.087
oke+0.087
uffy+0.086
opus+0.084
ari+0.084
Neg logit contributions
 his-0.051
 the-0.049
 it-0.048
 never-0.048
 such-0.046
Token day
Token ID1110
Feature activation-3.76e-01

Pos logit contributions
uddle+0.049
arry+0.049
oke+0.049
pill+0.049
ari+0.048
Neg logit contributions
 the-0.034
 his-0.034
 it-0.031
 such-0.031
 a-0.030
Token Bob
Token ID5811
Feature activation-3.03e+00

Pos logit contributions
op+0.008
oke+0.008
uddle+0.008
uffy+0.008
arry+0.008
Neg logit contributions
 such-0.008
 his-0.008
 the-0.008
 it-0.008
 him-0.007
Token went
Token ID1816
Feature activation+3.00e+00

Pos logit contributions
arry+0.084
oke+0.083
uddle+0.082
op+0.082
uffy+0.082
Neg logit contributions
 his-0.049
 the-0.049
 such-0.046
 never-0.045
 it-0.045
Token to
Token ID284
Feature activation+4.59e+00

Pos logit contributions
 such+0.038
 his+0.036
 the+0.035
 it+0.034
 him+0.032
Neg logit contributions
op-0.090
uffy-0.089
oke-0.089
arry-0.089
uddle-0.089
Token the
Token ID262
Feature activation-9.40e-01

Pos logit contributions
 such+0.024
 never+0.017
 it+0.016
 nothing+0.016
 his+0.016
Neg logit contributions
op-0.162
uffy-0.161
arry-0.159
rig-0.159
olly-0.158
Token park
Token ID3952
Feature activation-2.97e+00

Pos logit contributions
oke+0.020
op+0.020
arry+0.020
uffy+0.020
olly+0.019
Neg logit contributions
 such-0.021
 his-0.020
 the-0.020
 it-0.019
 him-0.019
Token with
Token ID351
Feature activation+4.62e+00

Pos logit contributions
oke+0.096
uffy+0.094
uddle+0.093
arry+0.093
opus+0.093
Neg logit contributions
 the-0.038
 his-0.037
 such-0.035
 it-0.034
 her-0.032
Token "
Token ID366
Feature activation-9.68e-01

Pos logit contributions
uddle+0.116
opus+0.111
uffy+0.111
olly+0.110
rig+0.110
Neg logit contributions
 the-0.070
 it-0.067
 his-0.066
 a-0.062
 her-0.061
TokenWhat
Token ID2061
Feature activation+1.62e+00

Pos logit contributions
uddle+0.029
uffy+0.029
oke+0.029
op+0.029
olly+0.029
Neg logit contributions
 such-0.012
 his-0.012
 the-0.012
 it-0.012
 him-0.011
Token is
Token ID318
Feature activation+1.51e+00

Pos logit contributions
 such+0.030
 his+0.030
 the+0.029
 it+0.029
 him+0.028
Neg logit contributions
op-0.039
arry-0.039
uddle-0.039
uffy-0.039
oke-0.039
Token your
Token ID534
Feature activation-3.21e+00

Pos logit contributions
 such+0.014
 permission+0.013
 rid+0.011
 care+0.011
 respect+0.011
Neg logit contributions
uffy-0.047
op-0.046
opus-0.046
oke-0.046
oodle-0.046
Token name
Token ID1438
Feature activation-4.21e+00

Pos logit contributions
oke+0.067
op+0.065
olly+0.064
arry+0.064
uddle+0.062
Neg logit contributions
 his-0.074
 such-0.073
 the-0.072
 it-0.070
 him-0.069
Token?
Token ID30
Feature activation-3.20e+00

Pos logit contributions
op+0.109
arry+0.109
uddle+0.109
rig+0.108
uffy+0.108
Neg logit contributions
 such-0.073
 the-0.072
 his-0.071
 it-0.069
 him-0.066
Token Where
Token ID6350
Feature activation+1.82e+00

Pos logit contributions
uddle+0.096
rig+0.091
olly+0.089
uffy+0.088
actus+0.087
Neg logit contributions
 his-0.054
 the-0.051
 it-0.051
 him-0.048
 such-0.048
Token is
Token ID318
Feature activation+3.43e+00

Pos logit contributions
 the+0.032
 his+0.032
 it+0.031
 such+0.031
 him+0.029
Neg logit contributions
rig-0.047
arry-0.047
uddle-0.047
op-0.046
olly-0.046
Token your
Token ID534
Feature activation-2.84e+00

Pos logit contributions
 such+0.025
 permission+0.024
 cleaned+0.022
 respect+0.021
 never+0.021
Neg logit contributions
opus-0.109
uffy-0.109
anda-0.107
ule-0.107
pill-0.107
Token home
Token ID1363
Feature activation-1.66e+00

Pos logit contributions
oke+0.063
op+0.062
olly+0.061
arry+0.061
uddle+0.059
Neg logit contributions
 his-0.062
 such-0.061
 the-0.060
 it-0.059
 him-0.057
Token?"
Token ID1701
Feature activation-3.32e+00

Pos logit contributions
op+0.038
uffy+0.038
uddle+0.037
oke+0.037
arry+0.037
Neg logit contributions
 such-0.032
 his-0.031
 the-0.030
 it-0.030
 him-0.030
Token.
Token ID13
Feature activation-2.22e+00

Pos logit contributions
 such+0.044
 his+0.040
 never+0.039
 the+0.038
 him+0.038
Neg logit contributions
op-0.055
oke-0.053
olly-0.053
uddle-0.053
arry-0.053
Token L
Token ID406
Feature activation-5.82e+00

Pos logit contributions
op+0.051
uffy+0.051
oke+0.051
arry+0.050
uddle+0.050
Neg logit contributions
 such-0.044
 his-0.043
 the-0.042
 it-0.041
 him-0.040
Tokenila
Token ID10102
Feature activation-2.01e+00

Pos logit contributions
uddle+0.154
arry+0.154
owers+0.153
ables+0.153
uddy+0.152
Neg logit contributions
 it-0.084
 the-0.082
 her-0.077
 such-0.074
 them-0.071
Token heard
Token ID2982
Feature activation+5.38e+00

Pos logit contributions
op+0.055
arry+0.054
uffy+0.054
oke+0.054
olly+0.054
Neg logit contributions
 such-0.031
 his-0.030
 the-0.030
 it-0.029
 him-0.027
Token a
Token ID257
Feature activation-1.33e-01

Pos logit contributions
 such+0.032
 never+0.025
 yours+0.024
 his+0.022
 saved+0.021
Neg logit contributions
arry-0.183
op-0.183
oke-0.183
olly-0.181
ari-0.180
Token loud
Token ID7812
Feature activation-8.05e-01

Pos logit contributions
uffy+0.004
olly+0.004
ane+0.004
op+0.004
pill+0.004
Neg logit contributions
 his-0.002
 the-0.002
 it-0.002
 such-0.002
 him-0.002
Token bark
Token ID21405
Feature activation-3.29e+00

Pos logit contributions
uffy+0.023
uddle+0.023
olly+0.023
op+0.023
ari+0.022
Neg logit contributions
 his-0.013
 the-0.013
 it-0.012
 such-0.012
 him-0.012
Token and
Token ID290
Feature activation+4.90e+00

Pos logit contributions
uffy+0.095
uddle+0.093
olly+0.091
rig+0.091
arry+0.090
Neg logit contributions
 his-0.053
 the-0.053
 such-0.051
 him-0.051
 it-0.050
Token saw
Token ID2497
Feature activation+4.33e+00

Pos logit contributions
 such+0.082
 permission+0.075
 yours+0.073
 him+0.067
 respect+0.064
Neg logit contributions
op-0.117
oke-0.116
oodle-0.114
arry-0.114
uffy-0.113
Token a
Token ID257
Feature activation-1.04e+00

Pos logit contributions
 such+0.031
 his+0.027
 the+0.024
 it+0.023
 never+0.023
Neg logit contributions
op-0.153
oke-0.152
uffy-0.151
arry-0.150
uddle-0.150
Token big
Token ID1263
Feature activation-2.25e+00

Pos logit contributions
olly+0.029
arry+0.027
rig+0.027
uffy+0.027
pill+0.027
Neg logit contributions
 the-0.017
 it-0.017
 his-0.017
 such-0.015
 a-0.015
Token to
Token ID284
Feature activation+5.80e+00

Pos logit contributions
 such+0.075
 yours+0.070
 rid+0.062
 hers+0.062
 him+0.061
Neg logit contributions
op-0.119
ule-0.112
rig-0.110
oles-0.110
arry-0.110
Token rub
Token ID6437
Feature activation+2.37e+00

Pos logit contributions
 such+0.111
 his+0.104
 the+0.102
 it+0.101
 him+0.098
Neg logit contributions
op-0.137
arry-0.134
uffy-0.133
uddle-0.133
rig-0.132
Token it
Token ID340
Feature activation+1.01e+00

Pos logit contributions
 such+0.014
 never+0.012
 permission+0.012
 yours+0.011
 respect+0.010
Neg logit contributions
op-0.085
uffy-0.080
uddle-0.080
arry-0.080
oles-0.079
Token.
Token ID13
Feature activation-3.38e+00

Pos logit contributions
 such+0.017
 his+0.016
 the+0.015
 permission+0.015
 it+0.015
Neg logit contributions
op-0.026
uddle-0.025
arry-0.025
rig-0.024
uffy-0.024
Token
Token ID198
Feature activation-2.71e+00

Pos logit contributions
uffy+0.084
oke+0.084
anda+0.083
op+0.082
ari+0.082
Neg logit contributions
 his-0.071
 the-0.069
 it-0.069
 him-0.065
 such-0.065
Token
Token ID198
Feature activation-2.11e+00

Pos logit contributions
uddle+0.077
owers+0.076
opus+0.075
pill+0.075
oke+0.075
Neg logit contributions
 it-0.049
 the-0.048
 his-0.048
 him-0.043
 a-0.042
TokenSuddenly
Token ID38582
Feature activation+7.22e-01

Pos logit contributions
uffy+0.052
oke+0.051
op+0.050
uddle+0.050
anda+0.049
Neg logit contributions
 his-0.042
 the-0.041
 such-0.040
 it-0.040
 never-0.039
Token,
Token ID11
Feature activation+1.57e+00

Pos logit contributions
 such+0.017
 his+0.016
 the+0.016
 it+0.015
 permission+0.015
Neg logit contributions
op-0.014
arry-0.014
olly-0.014
uffy-0.014
oke-0.013
Token a
Token ID257
Feature activation-1.18e+00

Pos logit contributions
 permission+0.019
 such+0.019
 yours+0.017
 respect+0.016
 rid+0.015
Neg logit contributions
olly-0.044
oles-0.043
arry-0.043
pillar-0.043
opus-0.042
Token magic
Token ID5536
Feature activation-4.95e+00

Pos logit contributions
op+0.033
ane+0.033
uffy+0.033
oke+0.033
olly+0.032
Neg logit contributions
 the-0.018
 his-0.018
 it-0.018
 such-0.016
 a-0.016
Token fairy
Token ID25607
Feature activation-2.70e+00

Pos logit contributions
op+0.117
uffy+0.117
oke+0.116
olly+0.115
arry+0.115
Neg logit contributions
 his-0.093
 the-0.092
 such-0.092
 it-0.091
 him-0.085
Token he
Token ID339
Feature activation+1.29e+00

Pos logit contributions
 such+0.003
 permission+0.002
 his+0.002
 never+0.002
 him+0.002
Neg logit contributions
op-0.008
uffy-0.008
oke-0.008
uddle-0.008
olly-0.008
Token blows
Token ID20385
Feature activation+1.92e+00

Pos logit contributions
 such+0.023
 his+0.023
 the+0.022
 it+0.021
 him+0.021
Neg logit contributions
op-0.032
oke-0.032
uffy-0.032
uddle-0.032
olly-0.031
Token too
Token ID1165
Feature activation-1.51e+00

Pos logit contributions
 such+0.024
 permission+0.022
 never+0.022
 protect+0.021
 saving+0.020
Neg logit contributions
op-0.054
uddle-0.053
olly-0.053
oke-0.052
uffy-0.051
Token hard
Token ID1327
Feature activation-1.62e+00

Pos logit contributions
op+0.043
oke+0.042
arry+0.042
uffy+0.042
olly+0.042
Neg logit contributions
 such-0.022
 his-0.021
 the-0.021
 it-0.020
 him-0.019
Token and
Token ID290
Feature activation+4.22e+00

Pos logit contributions
op+0.040
oke+0.039
arry+0.038
olly+0.038
uffy+0.038
Neg logit contributions
 such-0.030
 his-0.028
 the-0.028
 it-0.027
 him-0.026
Token the
Token ID262
Feature activation-3.10e+00

Pos logit contributions
 such+0.072
 permission+0.071
 theirs+0.059
 his+0.059
 yours+0.059
Neg logit contributions
oke-0.100
uddle-0.099
op-0.099
pill-0.095
uffy-0.094
Token soap
Token ID19533
Feature activation-4.67e+00

Pos logit contributions
op+0.084
uddle+0.083
uffy+0.083
oke+0.082
olly+0.082
Neg logit contributions
 such-0.049
 his-0.047
 the-0.046
 it-0.044
 him-0.043
Token spl
Token ID4328
Feature activation-6.43e+00

Pos logit contributions
op+0.120
uddle+0.117
uffy+0.115
oke+0.115
olly+0.114
Neg logit contributions
 such-0.081
 his-0.076
 the-0.073
 him-0.071
 it-0.071
Tokenashes
Token ID7465
Feature activation-1.00e+00

Pos logit contributions
arry+0.090
ari+0.089
opus+0.088
pillar+0.087
oke+0.087
Neg logit contributions
 such-0.180
 his-0.177
 the-0.176
 it-0.172
 him-0.172
Token.
Token ID13
Feature activation-2.79e+00

Pos logit contributions
op+0.025
uddle+0.024
olly+0.024
oke+0.024
pillar+0.023
Neg logit contributions
 such-0.017
 permission-0.015
 his-0.015
 never-0.015
 nothing-0.014
Token He
Token ID679
Feature activation+2.92e+00

Pos logit contributions
ari+0.072
op+0.071
uddle+0.070
oke+0.070
arry+0.070
Neg logit contributions
 his-0.051
 the-0.050
 it-0.048
 such-0.047
 him-0.046
Token her
Token ID607
Feature activation-1.29e+00

Pos logit contributions
ts+0.024
 rid+0.024
 punished+0.024
 hurting+0.022
 permission+0.021
Neg logit contributions
arry-0.114
op-0.112
uffy-0.111
pillar-0.110
olly-0.110
Token pocket
Token ID10000
Feature activation-1.05e+00

Pos logit contributions
op+0.027
uffy+0.027
uddle+0.027
arry+0.026
olly+0.026
Neg logit contributions
 such-0.028
 his-0.027
 the-0.027
 it-0.027
 him-0.026
Token.
Token ID13
Feature activation-3.23e+00

Pos logit contributions
op+0.025
uddle+0.025
ari+0.024
arry+0.024
uffy+0.024
Neg logit contributions
 such-0.019
 his-0.019
 the-0.018
 him-0.018
 it-0.017
Token
Token ID198
Feature activation-2.38e+00

Pos logit contributions
uddle+0.084
opus+0.082
oke+0.081
uffy+0.081
anda+0.081
Neg logit contributions
 the-0.064
 it-0.063
 such-0.061
 his-0.060
 her-0.058
Token
Token ID198
Feature activation-1.83e+00

Pos logit contributions
uddle+0.063
uffy+0.061
ari+0.060
oke+0.060
opus+0.060
Neg logit contributions
 the-0.046
 it-0.045
 his-0.044
 him-0.042
 never-0.041
TokenThe
Token ID464
Feature activation-4.12e+00

Pos logit contributions
uffy+0.048
uddle+0.048
oke+0.047
op+0.047
anda+0.046
Neg logit contributions
 such-0.032
 the-0.032
 his-0.031
 it-0.031
 never-0.030
Token next
Token ID1306
Feature activation-3.69e+00

Pos logit contributions
op+0.084
oke+0.083
uffy+0.083
arry+0.083
uddle+0.082
Neg logit contributions
 such-0.093
 his-0.091
 the-0.090
 it-0.088
 him-0.086
Token day
Token ID1110
Feature activation-1.70e+00

Pos logit contributions
uddle+0.098
oke+0.097
op+0.097
uffy+0.096
pill+0.095
Neg logit contributions
 the-0.067
 his-0.065
 it-0.064
 such-0.062
 a-0.060
Token,
Token ID11
Feature activation+2.48e-01

Pos logit contributions
op+0.041
uddle+0.041
rig+0.041
uffy+0.041
oke+0.040
Neg logit contributions
 the-0.034
 such-0.034
 it-0.033
 his-0.033
 her-0.030
Token Joy
Token ID14087
Feature activation-1.87e+00

Pos logit contributions
 such+0.005
 his+0.005
 the+0.005
 it+0.005
 him+0.005
Neg logit contributions
op-0.005
oke-0.005
uffy-0.005
uddle-0.005
arry-0.005
Token went
Token ID1816
Feature activation+2.59e+00

Pos logit contributions
op+0.053
uffy+0.053
uddle+0.053
oke+0.053
arry+0.053
Neg logit contributions
 the-0.027
 such-0.027
 his-0.026
 it-0.025
 never-0.024
Token a
Token ID257
Feature activation-1.71e+00

Pos logit contributions
 such+0.013
 his+0.011
 the+0.009
 it+0.009
 him+0.008
Neg logit contributions
op-0.123
olly-0.122
uffy-0.122
arry-0.121
uddle-0.121
Token small
Token ID1402
Feature activation-3.53e+00

Pos logit contributions
pill+0.047
ane+0.046
arry+0.046
oke+0.046
rig+0.045
Neg logit contributions
 the-0.030
 his-0.028
 it-0.028
 a-0.026
 more-0.024
Token girl
Token ID2576
Feature activation-3.59e+00

Pos logit contributions
rig+0.091
oke+0.086
pill+0.085
arry+0.085
oles+0.082
Neg logit contributions
 the-0.073
 his-0.067
 it-0.065
 a-0.063
 more-0.059
Token called
Token ID1444
Feature activation+7.70e-01

Pos logit contributions
uddle+0.093
oke+0.092
arry+0.092
op+0.091
uffy+0.091
Neg logit contributions
 the-0.062
 his-0.061
 such-0.061
 it-0.059
 her-0.056
Token Lily
Token ID20037
Feature activation-5.33e+00

Pos logit contributions
 the+0.010
 such+0.009
 it+0.009
 his+0.009
 her+0.009
Neg logit contributions
uddle-0.025
opus-0.024
rig-0.024
op-0.024
actus-0.024
Token.
Token ID13
Feature activation-3.83e+00

Pos logit contributions
op+0.137
uffy+0.135
oke+0.135
arry+0.135
olly+0.134
Neg logit contributions
 such-0.091
 his-0.088
 the-0.085
 it-0.084
 him-0.081
Token She
Token ID1375
Feature activation+1.43e+00

Pos logit contributions
op+0.087
oke+0.086
opus+0.085
uffy+0.084
arry+0.084
Neg logit contributions
 such-0.080
 the-0.079
 his-0.079
 it-0.076
 her-0.074
Token was
Token ID373
Feature activation+3.24e+00

Pos logit contributions
 such+0.023
 his+0.021
 the+0.020
 permission+0.020
 it+0.020
Neg logit contributions
op-0.039
uffy-0.038
olly-0.038
uddle-0.038
oke-0.038
Token always
Token ID1464
Feature activation+4.67e+00

Pos logit contributions
 such+0.046
 his+0.045
 it+0.043
 him+0.042
 the+0.041
Neg logit contributions
op-0.089
uffy-0.088
olly-0.087
oke-0.086
arry-0.086
Token in
Token ID287
Feature activation+2.91e+00

Pos logit contributions
 respect+0.081
 him+0.080
 such+0.078
 permission+0.076
 protect+0.076
Neg logit contributions
op-0.103
oodle-0.102
uffy-0.101
ang-0.100
olly-0.099
Token such
Token ID884
Feature activation+3.94e+00

Pos logit contributions
 rid+0.018
 never+0.017
 saved+0.016
 yours+0.016
 permission+0.016
Neg logit contributions
olly-0.090
pill-0.090
uffy-0.089
pillar-0.088
op-0.087
Token her
Token ID607
Feature activation-1.03e+00

Pos logit contributions
 such+0.025
 saved+0.022
 fixed+0.021
 never+0.021
 saving+0.020
Neg logit contributions
oodle-0.079
uffy-0.079
olly-0.078
arry-0.077
anda-0.077
Token the
Token ID262
Feature activation-3.71e+00

Pos logit contributions
uffy+0.034
op+0.034
uddle+0.034
oodle+0.033
oke+0.033
Neg logit contributions
 such-0.008
 never-0.008
 him-0.007
 permission-0.006
 his-0.006
Token string
Token ID4731
Feature activation-4.19e+00

Pos logit contributions
op+0.079
uddle+0.079
ane+0.077
uffy+0.076
ute+0.075
Neg logit contributions
 such-0.076
 his-0.073
 him-0.072
 the-0.067
 permission-0.066
Token and
Token ID290
Feature activation+4.74e+00

Pos logit contributions
op+0.104
uddle+0.102
uffy+0.101
oke+0.100
arry+0.100
Neg logit contributions
 such-0.073
 his-0.068
 the-0.066
 him-0.065
 it-0.064
Token the
Token ID262
Feature activation-2.98e+00

Pos logit contributions
 such+0.038
 permission+0.031
 him+0.029
 never+0.029
 nothing+0.029
Neg logit contributions
rig-0.146
uddle-0.145
pill-0.144
oodle-0.143
op-0.143
Token toy
Token ID13373
Feature activation-3.57e+00

Pos logit contributions
op+0.066
uddle+0.066
uffy+0.065
pillar+0.064
oke+0.063
Neg logit contributions
 such-0.060
 his-0.057
 him-0.055
 the-0.053
 never-0.052
Token car
Token ID1097
Feature activation-2.94e+00

Pos logit contributions
op+0.070
uffy+0.067
ane+0.066
uddle+0.066
pillar+0.064
Neg logit contributions
 such-0.085
 his-0.079
 permission-0.077
 the-0.075
 him-0.075
Token.
Token ID13
Feature activation-3.36e+00

Pos logit contributions
op+0.066
oke+0.065
uddle+0.065
ari+0.064
uffy+0.064
Neg logit contributions
 such-0.056
 permission-0.051
 his-0.050
 the-0.048
 him-0.048
Token He
Token ID679
Feature activation+2.35e+00

Pos logit contributions
opus+0.088
ane+0.087
op+0.086
ari+0.086
pillar+0.085
Neg logit contributions
 his-0.064
 the-0.062
 it-0.060
 such-0.058
 him-0.056
Token went
Token ID1816
Feature activation+1.92e+00

Pos logit contributions
 such+0.036
 his+0.033
 the+0.032
 it+0.031
 him+0.031
Neg logit contributions
op-0.065
oke-0.064
uffy-0.064
uddle-0.064
arry-0.064
Token outside
Token ID2354
Feature activation-1.08e+00

Pos logit contributions
 such+0.027
 his+0.024
 the+0.024
 it+0.023
 him+0.022
Neg logit contributions
op-0.055
uddle-0.054
uffy-0.054
olly-0.054
arry-0.054
Token store
Token ID3650
Feature activation-2.21e+00

Pos logit contributions
op+0.060
uddle+0.060
uffy+0.058
olly+0.058
oke+0.058
Neg logit contributions
 such-0.071
 his-0.069
 the-0.067
 it-0.067
 him-0.066
Token and
Token ID290
Feature activation+3.02e+00

Pos logit contributions
op+0.050
oke+0.049
arry+0.049
uffy+0.049
uddle+0.049
Neg logit contributions
 such-0.044
 his-0.043
 the-0.042
 it-0.041
 him-0.040
Token bought
Token ID5839
Feature activation+3.79e+00

Pos logit contributions
 such+0.043
 his+0.038
 permission+0.037
 yours+0.036
 anything+0.036
Neg logit contributions
op-0.086
uffy-0.081
ari-0.081
olly-0.081
uddle-0.081
Token a
Token ID257
Feature activation-1.69e+00

Pos logit contributions
 such+0.023
 his+0.022
 hurting+0.022
 never+0.022
 ruining+0.021
Neg logit contributions
op-0.126
arry-0.122
oke-0.122
uddle-0.121
oodle-0.121
Token stuffed
Token ID22259
Feature activation-4.91e+00

Pos logit contributions
uffy+0.047
oke+0.047
arry+0.047
olly+0.046
pill+0.046
Neg logit contributions
 his-0.026
 the-0.026
 it-0.026
 such-0.026
 a-0.023
Token bear
Token ID6842
Feature activation-5.15e+00

Pos logit contributions
op+0.110
arry+0.108
uddle+0.108
uffy+0.107
oke+0.107
Neg logit contributions
 such-0.100
 his-0.096
 the-0.094
 it-0.092
 him-0.090
Token that
Token ID326
Feature activation+8.54e-01

Pos logit contributions
oke+0.144
uffy+0.144
uddle+0.144
arry+0.143
olly+0.142
Neg logit contributions
 such-0.079
 his-0.077
 the-0.077
 it-0.073
 him-0.069
Token she
Token ID673
Feature activation+1.69e+00

Pos logit contributions
 such+0.015
 his+0.014
 the+0.013
 permission+0.013
 it+0.013
Neg logit contributions
op-0.022
uffy-0.021
oke-0.021
olly-0.021
arry-0.021
Token could
Token ID714
Feature activation+3.62e+00

Pos logit contributions
 such+0.029
 his+0.026
 anything+0.025
 permission+0.025
 the+0.025
Neg logit contributions
op-0.044
olly-0.043
oke-0.043
ari-0.043
uffy-0.043
Token attach
Token ID10199
Feature activation+2.15e+00

Pos logit contributions
 such+0.052
 his+0.046
 nothing+0.045
 the+0.045
 it+0.044
Neg logit contributions
op-0.102
arry-0.098
olly-0.097
rig-0.096
uffy-0.095
Token the
Token ID262
Feature activation-4.20e+00

Pos logit contributions
 such+0.020
 his+0.018
 permission+0.017
 never+0.017
 the+0.016
Neg logit contributions
op-0.067
arry-0.066
uddle-0.065
olly-0.065
uffy-0.065
Token,
Token ID11
Feature activation-3.88e-01

Pos logit contributions
rig+0.055
actus+0.052
uddle+0.052
pillar+0.051
ari+0.051
Neg logit contributions
 the-0.030
 his-0.028
 it-0.028
 a-0.026
 her-0.025
Token there
Token ID612
Feature activation-1.97e+00

Pos logit contributions
op+0.012
oke+0.012
arry+0.012
uddle+0.012
uffy+0.012
Neg logit contributions
 such-0.004
 his-0.004
 the-0.004
 it-0.004
 him-0.004
Token was
Token ID373
Feature activation+3.24e+00

Pos logit contributions
arry+0.051
oke+0.051
op+0.051
ari+0.051
olly+0.051
Neg logit contributions
 his-0.033
 the-0.033
 such-0.033
 it-0.033
 him-0.031
Token a
Token ID257
Feature activation-1.35e+00

Pos logit contributions
 such+0.011
 his+0.009
 the+0.008
 it+0.007
 him+0.005
Neg logit contributions
op-0.128
oke-0.127
uffy-0.127
arry-0.127
uddle-0.127
Token little
Token ID1310
Feature activation-1.78e+00

Pos logit contributions
pill+0.037
ane+0.037
arry+0.037
oke+0.036
rig+0.036
Neg logit contributions
 the-0.023
 his-0.022
 it-0.021
 a-0.020
 her-0.019
Token boy
Token ID2933
Feature activation-3.92e+00

Pos logit contributions
rig+0.044
oke+0.041
arry+0.040
zag+0.039
pill+0.039
Neg logit contributions
 the-0.038
 it-0.036
 his-0.036
 her-0.034
 a-0.034
Token named
Token ID3706
Feature activation+5.69e-01

Pos logit contributions
op+0.094
oke+0.093
uddle+0.093
arry+0.093
uffy+0.093
Neg logit contributions
 such-0.073
 his-0.072
 the-0.071
 it-0.069
 him-0.067
Token Tim
Token ID5045
Feature activation-5.27e+00

Pos logit contributions
 such+0.007
 his+0.006
 the+0.006
 it+0.006
 permission+0.005
Neg logit contributions
op-0.018
uffy-0.018
olly-0.018
oke-0.018
arry-0.017
Tokenmy
Token ID1820
Feature activation-3.24e+00

Pos logit contributions
op+0.101
uffy+0.100
oke+0.100
arry+0.099
olly+0.098
Neg logit contributions
 such-0.122
 it-0.113
 his-0.112
 the-0.112
 nothing-0.111
Token.
Token ID13
Feature activation-3.57e+00

Pos logit contributions
op+0.081
uffy+0.079
arry+0.078
oke+0.078
olly+0.077
Neg logit contributions
 such-0.059
 his-0.054
 the-0.053
 it-0.053
 nothing-0.051
Token Tim
Token ID5045
Feature activation-4.51e+00

Pos logit contributions
op+0.120
oke+0.119
uffy+0.119
uddle+0.119
opus+0.118
Neg logit contributions
 his-0.037
 the-0.034
 such-0.034
 him-0.032
 it-0.032
Token decided
Token ID3066
Feature activation+3.15e+00

Pos logit contributions
 such+0.073
 anything+0.065
 permission+0.064
 yours+0.063
 yourself+0.063
Neg logit contributions
op-0.086
uffy-0.085
oke-0.085
ari-0.084
olly-0.083
Token to
Token ID284
Feature activation+5.58e+00

Pos logit contributions
 such+0.061
ps+0.059
 rid+0.059
 permission+0.055
 satisfaction+0.055
Neg logit contributions
ogging-0.058
uffy-0.058
op-0.058
s-0.058
ess-0.057
Token jump
Token ID4391
Feature activation+1.24e+00

Pos logit contributions
 such+0.083
 his+0.076
 nothing+0.073
 it+0.073
 the+0.073
Neg logit contributions
op-0.151
uffy-0.148
arry-0.147
uddle-0.147
olly-0.146
Token on
Token ID319
Feature activation+3.48e+00

Pos logit contributions
 such+0.017
 his+0.017
 the+0.016
 it+0.016
 him+0.015
Neg logit contributions
op-0.036
oke-0.036
uffy-0.036
arry-0.035
uddle-0.035
Token the
Token ID262
Feature activation-1.01e+00

Pos logit contributions
 such+0.015
 permission+0.013
 rid+0.011
 his+0.009
 saved+0.009
Neg logit contributions
op-0.125
uddle-0.125
uffy-0.124
pill-0.122
pillar-0.121
Token tr
Token ID491
Feature activation-1.01e+01

Pos logit contributions
op+0.017
uddle+0.017
uffy+0.017
olly+0.017
oke+0.017
Neg logit contributions
 such-0.026
 his-0.025
 the-0.025
 it-0.024
 him-0.024
Tokenamp
Token ID696
Feature activation-7.66e+00

Pos logit contributions
oke+0.131
rig+0.112
arry+0.112
pillar+0.110
oodle+0.109
Neg logit contributions
 such-0.298
 his-0.294
 it-0.292
 the-0.289
 him-0.286
Tokenoline
Token ID14453
Feature activation-4.46e+00

Pos logit contributions
arry+0.114
oke+0.113
uffy+0.109
op+0.109
rig+0.108
Neg logit contributions
 the-0.220
 his-0.218
 such-0.217
 it-0.213
 him-0.208
Token,
Token ID11
Feature activation+3.78e+00

Pos logit contributions
oke+0.122
arry+0.121
uffy+0.120
uddle+0.119
op+0.119
Neg logit contributions
 such-0.071
 the-0.071
 his-0.070
 it-0.069
 him-0.065
Token but
Token ID475
Feature activation+2.87e+00

Pos logit contributions
 such+0.075
 his+0.072
 the+0.070
 it+0.069
 him+0.068
Neg logit contributions
op-0.086
oke-0.085
uddle-0.084
uffy-0.084
arry-0.084
Token Lily
Token ID20037
Feature activation-4.98e-01

Pos logit contributions
 permission+0.047
 such+0.046
ps+0.045
 respect+0.042
 protect+0.041
Neg logit contributions
uffy-0.068
olly-0.064
uddle-0.063
oke-0.062
oodle-0.062
Token They
Token ID1119
Feature activation+3.48e+00

Pos logit contributions
opus+0.087
op+0.086
ane+0.084
oke+0.084
oles+0.083
Neg logit contributions
 his-0.069
 the-0.066
 such-0.065
 it-0.064
 her-0.063
Token took
Token ID1718
Feature activation+5.05e+00

Pos logit contributions
 such+0.059
 yourself+0.057
 yours+0.055
 anything+0.054
 yourselves+0.053
Neg logit contributions
op-0.084
olly-0.083
uddle-0.082
ari-0.081
anda-0.081
Token turns
Token ID4962
Feature activation+2.88e+00

Pos logit contributions
 permission+0.076
 letting+0.076
 such+0.074
 never+0.073
 saving+0.071
Neg logit contributions
op-0.126
uddle-0.123
olly-0.123
uffy-0.123
pillar-0.118
Token playing
Token ID2712
Feature activation+1.09e+00

Pos logit contributions
 such+0.041
 his+0.039
 the+0.037
 it+0.036
 him+0.036
Neg logit contributions
op-0.082
oke-0.081
uffy-0.080
uddle-0.080
olly-0.080
Token the
Token ID262
Feature activation-3.59e+00

Pos logit contributions
 such+0.019
 his+0.017
 permission+0.017
 never+0.016
 him+0.016
Neg logit contributions
op-0.027
uddle-0.027
uffy-0.026
oke-0.026
olly-0.026
Token fl
Token ID781
Feature activation-9.89e+00

Pos logit contributions
op+0.088
uffy+0.087
uddle+0.087
oke+0.087
arry+0.086
Neg logit contributions
 such-0.066
 his-0.064
 the-0.062
 it-0.061
 him-0.060
Tokenute
Token ID1133
Feature activation-4.76e+00

Pos logit contributions
oke+0.084
pill+0.075
arry+0.067
uffy+0.066
erry+0.064
Neg logit contributions
 such-0.329
 it-0.326
 the-0.320
 his-0.318
 never-0.311
Token and
Token ID290
Feature activation+4.76e+00

Pos logit contributions
oke+0.128
arry+0.127
op+0.126
uffy+0.125
olly+0.124
Neg logit contributions
 such-0.078
 the-0.078
 his-0.077
 it-0.075
 him-0.070
Token had
Token ID550
Feature activation+5.35e+00

Pos logit contributions
 permission+0.067
 such+0.066
 yours+0.061
 yourself+0.057
 respect+0.056
Neg logit contributions
oke-0.120
op-0.117
zag-0.116
uddle-0.115
uffy-0.114
Token so
Token ID523
Feature activation+1.80e+00

Pos logit contributions
 permission+0.086
 rid+0.085
 hurting+0.076
 protect+0.076
ps+0.075
Neg logit contributions
pill-0.125
uddle-0.118
rig-0.111
ari-0.111
oodle-0.111
Token much
Token ID881
Feature activation+1.28e+00

Pos logit contributions
 such+0.028
 his+0.027
 the+0.026
 it+0.025
 him+0.025
Neg logit contributions
op-0.050
oke-0.049
uffy-0.049
olly-0.048
arry-0.048
Token hipp
Token ID18568
Feature activation-9.01e+00

Pos logit contributions
rig+0.048
oke+0.045
arry+0.044
zag+0.043
pill+0.043
Neg logit contributions
 the-0.043
 it-0.040
 his-0.040
 a-0.037
 her-0.037
Tokeno
Token ID78
Feature activation-3.94e+00

Pos logit contributions
rig+0.202
uddle+0.193
pill+0.192
pillar+0.181
arry+0.180
Neg logit contributions
 his-0.201
 the-0.195
 him-0.185
 such-0.182
 never-0.181
Token.
Token ID13
Feature activation-3.15e+00

Pos logit contributions
oke+0.102
uddle+0.101
arry+0.101
olly+0.100
uffy+0.100
Neg logit contributions
 his-0.067
 the-0.066
 such-0.066
 it-0.064
 him-0.061
Token The
Token ID383
Feature activation-3.44e+00

Pos logit contributions
op+0.079
uddle+0.079
oke+0.079
rig+0.079
uffy+0.077
Neg logit contributions
 his-0.062
 the-0.060
 such-0.058
 him-0.056
 it-0.056
Token little
Token ID1310
Feature activation-4.72e+00

Pos logit contributions
op+0.074
uffy+0.072
olly+0.072
arry+0.072
oke+0.071
Neg logit contributions
 his-0.077
 the-0.076
 such-0.074
 it-0.073
 him-0.071
Token hipp
Token ID18568
Feature activation-9.65e+00

Pos logit contributions
rig+0.101
op+0.101
olly+0.100
uffy+0.099
uddle+0.097
Neg logit contributions
 the-0.109
 his-0.109
 it-0.104
 such-0.103
 him-0.099
Tokeno
Token ID78
Feature activation-3.37e+00

Pos logit contributions
rig+0.223
olly+0.216
uddle+0.214
ogging+0.211
pillar+0.209
Neg logit contributions
 his-0.198
 the-0.189
 him-0.184
 never-0.182
 such-0.176
Token lived
Token ID5615
Feature activation+1.27e+00

Pos logit contributions
arry+0.092
oke+0.091
uddle+0.091
op+0.091
uffy+0.091
Neg logit contributions
 his-0.053
 the-0.052
 such-0.050
 it-0.050
 never-0.047
Token in
Token ID287
Feature activation+2.32e+00

Pos logit contributions
 such+0.022
 permission+0.020
 it+0.019
 him+0.019
 anything+0.019
Neg logit contributions
op-0.032
olly-0.031
uffy-0.031
uddle-0.031
arry-0.030
Token a
Token ID257
Feature activation-3.49e-01

Pos logit contributions
 permission+0.024
 rid+0.019
 yours+0.018
 resolved+0.018
 cleaned+0.017
Neg logit contributions
pill-0.064
op-0.064
pillar-0.062
olly-0.062
arry-0.062
Token big
Token ID1263
Feature activation-1.87e+00

Pos logit contributions
oke+0.010
op+0.010
arry+0.010
rig+0.009
uffy+0.009
Neg logit contributions
 the-0.006
 his-0.006
 it-0.005
 such-0.005
 him-0.005
Token zoo
Token ID26626
Feature activation-4.94e+00

Pos logit contributions
op+0.068
oke+0.067
arry+0.067
uffy+0.066
olly+0.065
Neg logit contributions
 such-0.066
 his-0.065
 the-0.065
 it-0.062
 him-0.060
Token would
Token ID561
Feature activation+3.05e+00

Pos logit contributions
op+0.129
arry+0.127
uffy+0.127
oke+0.125
olly+0.125
Neg logit contributions
 such-0.084
 the-0.084
 his-0.083
 it-0.080
 him-0.075
Token wish
Token ID4601
Feature activation-5.95e-01

Pos logit contributions
 such+0.042
 his+0.038
 the+0.036
 it+0.035
 permission+0.035
Neg logit contributions
arry-0.081
op-0.081
ari-0.081
oke-0.081
rig-0.080
Token that
Token ID326
Feature activation+1.44e+00

Pos logit contributions
op+0.017
oke+0.017
uffy+0.017
olly+0.016
uddle+0.016
Neg logit contributions
 such-0.008
 his-0.008
 the-0.008
 it-0.007
 him-0.007
Token the
Token ID262
Feature activation-4.55e+00

Pos logit contributions
 such+0.013
 his+0.011
 permission+0.011
 him+0.010
 rid+0.010
Neg logit contributions
oke-0.044
uffy-0.044
op-0.044
olly-0.044
pillar-0.043
Token z
Token ID1976
Feature activation-1.06e+01

Pos logit contributions
op+0.109
arry+0.107
olly+0.106
uffy+0.105
oke+0.105
Neg logit contributions
 the-0.090
 such-0.090
 his-0.088
 it-0.087
 never-0.081
Tokenebra
Token ID37052
Feature activation-6.47e+00

Pos logit contributions
rig+0.138
oke+0.135
arry+0.134
pillar+0.125
pill+0.124
Neg logit contributions
 his-0.309
 such-0.307
 the-0.299
 him-0.297
 it-0.296
Token would
Token ID561
Feature activation+3.90e+00

Pos logit contributions
op+0.163
oke+0.161
uffy+0.161
arry+0.161
uddle+0.160
Neg logit contributions
 such-0.112
 his-0.109
 the-0.107
 it-0.104
 him-0.101
Token be
Token ID307
Feature activation+3.22e+00

Pos logit contributions
 such+0.054
 his+0.047
 permission+0.046
 the+0.045
 it+0.044
Neg logit contributions
arry-0.103
ari-0.101
zag-0.099
op-0.099
pillar-0.099
Token nicer
Token ID36597
Feature activation-3.00e+00

Pos logit contributions
 permission+0.037
 respect+0.034
 such+0.034
 him+0.033
 nothing+0.032
Neg logit contributions
uffy-0.081
anda-0.081
ari-0.080
rig-0.080
adow-0.080
Token.
Token ID13
Feature activation-4.38e+00

Pos logit contributions
op+0.076
uffy+0.075
arry+0.075
uddle+0.075
oke+0.075
Neg logit contributions
 such-0.050
 his-0.048
 the-0.047
 it-0.045
 him-0.045
Token you
Token ID345
Feature activation+3.69e-01

Pos logit contributions
 his+0.054
 such+0.054
 rid+0.050
 fixed+0.048
 saved+0.047
Neg logit contributions
op-0.106
uffy-0.104
oke-0.104
olly-0.101
opus-0.099
Token have
Token ID423
Feature activation+3.90e+00

Pos logit contributions
 his+0.007
 such+0.007
 him+0.006
 her+0.006
 hers+0.006
Neg logit contributions
op-0.010
uffy-0.009
uddle-0.009
oke-0.009
olly-0.009
Token to
Token ID284
Feature activation+5.05e+00

Pos logit contributions
 him+0.068
 such+0.067
 his+0.066
 himself+0.066
 protect+0.066
Neg logit contributions
op-0.085
uffy-0.080
uddle-0.079
olly-0.076
ears-0.075
Token keep
Token ID1394
Feature activation+4.39e+00

Pos logit contributions
 such+0.077
 nothing+0.067
 it+0.066
 his+0.066
 how+0.063
Neg logit contributions
olly-0.129
op-0.129
uffy-0.126
arry-0.126
uddle-0.125
Token the
Token ID262
Feature activation-4.51e+00

Pos logit contributions
 respect+0.043
 theirs+0.040
 solved+0.039
 such+0.039
 saved+0.039
Neg logit contributions
op-0.123
uffy-0.119
ule-0.113
uddle-0.112
erry-0.110
Token therm
Token ID21969
Feature activation-1.15e+01

Pos logit contributions
op+0.076
uffy+0.074
uddle+0.072
ule+0.069
actus+0.067
Neg logit contributions
 such-0.119
 his-0.113
 the-0.111
 it-0.110
 him-0.108
Tokenometer
Token ID15635
Feature activation-5.27e+00

Pos logit contributions
pill+0.129
olly+0.116
pillar+0.116
rig+0.115
oke+0.109
Neg logit contributions
 his-0.356
 it-0.352
 the-0.350
 such-0.348
 him-0.347
Token in
Token ID287
Feature activation+2.06e+00

Pos logit contributions
op+0.133
uffy+0.130
uddle+0.129
olly+0.127
oke+0.126
Neg logit contributions
 such-0.091
 his-0.086
 the-0.082
 him-0.082
 it-0.082
Token your
Token ID534
Feature activation-1.82e+00

Pos logit contributions
 such+0.012
 saved+0.011
 him+0.010
 his+0.010
 nothing+0.010
Neg logit contributions
op-0.067
uffy-0.066
olly-0.065
arry-0.064
erry-0.063
Token mouth
Token ID5422
Feature activation-4.30e+00

Pos logit contributions
uffy+0.040
op+0.040
uddle+0.039
olly+0.037
oke+0.037
Neg logit contributions
 such-0.039
 his-0.036
 it-0.036
 the-0.036
 him-0.035
Token for
Token ID329
Feature activation+2.03e+00

Pos logit contributions
op+0.114
oke+0.113
uffy+0.113
arry+0.113
uddle+0.112
Neg logit contributions
 such-0.070
 his-0.068
 the-0.066
 it-0.064
 him-0.062