TOP ACTIVATIONS
MAX = 6.097

broke apart and the two seals could explore the place together
. The chip mun ks were very curious ,
and eagerly agreed . upon their arrival to the woods
do something special !" The other friend was very excited .
the apples easily . The other friend was short and couldn
which he called his " ch amber ". He also loved
any trouble . The two dogs played together all day ,
saw Jack and it slit hered towards him . He was
friend D ory said that sharks permit little fish like them
around and it will make us warm ." So , they
. The farmer and his dog continued to run and play
sticky and scary and made them sne e ze .
. The other frogs could jump high ,
, Tim and his toy dog went for a walk .
clock went " tick - t ock , tick - t
. Suddenly , the dog lun ged at Tim my and
time , and it made them feel blue . Then one
duck said " H ats ch !". She flew up in
to argue and it made them sad .
he said . The other mouse agreed , and so

MIN ACTIVATIONS
MIN = -5.959

. The girl asked what had happened to the sky ,
was curious about what grandma had in her house . She
He asked Brown ie what had happened , and the horse
ly showing her what he had written . His
and told her what Ben had done . Her
What an amazing adventure she had ! <|endoftext|> Andrew was so
dad asked Jacob what he had done . Jacob said he
Her mom asked her what had happened and the girl said
Mom , look what I have !" Tom said . He
. He asked John what had happened and John told him
, Anna , what I have !" Sara says , showing
Max asked his mom what had happened . His mother told
what 's in the cold drawer ? Maybe it will make
then asked mummy what he had to do with the apple
Lily , look what I have !" Ben said . He
and thought about what she had learnt .
you 're doing in my drawer ?" he said .
outside and see what he had made . His friends said
to think about what he had done the day before .
and held out what he had been carrying .

INTERVAL 3.083 - 6.097
CONTAINS 0.191%

anything about l izards . Go away ." Sam
This is my spot . Go away !" Tom felt sad
. He made a big snowball and rolled it around the
in the park . The dog was barking loudly . Tom
. Ben and Spot saw a big pile of

INTERVAL 0.069 - 3.083
CONTAINS 47.243%

play outside when it rains . They put on their boots
out at the blue skies above them . Suddenly
But today , he felt very miserable . He had a
glad that the axe did not hit them .
steak from Max . Max grow led and did not want

INTERVAL -2.945 - 0.069
CONTAINS 52.146%

The giant knew he had to run away , so he
if we ask permission before we start and work together ".
the breeze . They fell asleep and had sweet dreams .
Once he arrived home he shared the rest of the fish
, when Bob was walking , he heard two bugs talking

INTERVAL -5.959 - -2.945
CONTAINS 0.420%

Mom , what is in this envelope ?" Her mom said
she put it into her pocket . M
was a really fun day ," said Ben .
" I will , Mom . I 'm sorry ,"
the wealthy flower in the park . <|endoftext|> Once there was
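
The four INTERVAL ranges above appear to be equal-width bins spanning MIN to MAX. A minimal sketch of that binning follows, assuming equal-width bins and a half-open convention with the last bin closed. The function names `interval_edges` and `bin_shares` and the sample values are hypothetical and are not taken from the tool that produced this dump.

```python
from bisect import bisect_right


def interval_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced edges from lo to hi, rounded to 3 decimals."""
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


def bin_shares(values, edges):
    """Return the percentage of values falling in each bin.

    Bins are half-open [edge_i, edge_i+1), except the last, which also
    includes its upper edge. Out-of-range values are clamped to the nearest
    bin. This convention is an assumption.
    """
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    for v in values:
        idx = bisect_right(edges, v) - 1
        idx = max(0, min(idx, n_bins - 1))
        counts[idx] += 1
    total = len(values)
    return [100.0 * c / total for c in counts]


# MIN and MAX as reported in this dump.
edges = interval_edges(-5.959, 6.097, 4)

# The CONTAINS percentages reported above, highest interval first.
reported = [0.191, 47.243, 52.146, 0.420]
reported_total = round(sum(reported), 3)

# Hypothetical activations, only to exercise bin_shares.
sample_shares = bin_shares([-5.0, -1.0, 0.5, 1.0], edges)
```

With the MIN and MAX from this dump, `edges` reproduces the interval boundaries listed above, and the four reported CONTAINS percentages sum to 100.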
Token broke
Token ID6265
Feature activation-2.88e-01

Pos logit contributions
 trigger+0.062
 heads+0.059
 fists+0.058
rated+0.056
 noses+0.055
Neg logit contributions
iw-0.042
h-0.042
on-0.040
yl-0.040
ott-0.039
Token apart
Token ID5475
Feature activation+1.35e-01

Pos logit contributions
iw+0.002
h+0.002
on+0.002
yl+0.002
ott+0.002
Neg logit contributions
 trigger-0.011
 heads-0.010
 fists-0.010
rated-0.010
 noses-0.010
Token and
Token ID290
Feature activation+4.88e-01

Pos logit contributions
 trigger+0.005
 heads+0.005
 fists+0.005
rated+0.005
 noses+0.005
Neg logit contributions
iw-0.001
h-0.001
on-0.001
yl-0.001
ott-0.001
Token the
Token ID262
Feature activation+1.22e+00

Pos logit contributions
 trigger+0.015
 heads+0.014
 fists+0.013
rated+0.013
 noses+0.013
Neg logit contributions
iw-0.008
h-0.007
on-0.007
yl-0.007
ott-0.007
Token two
Token ID734
Feature activation+2.49e+00

Pos logit contributions
 trigger+0.030
 heads+0.028
rated+0.028
 fists+0.027
 noses+0.026
Neg logit contributions
iw-0.025
on-0.024
yl-0.024
h-0.024
ott-0.023
Token seals
Token ID29718
Feature activation+4.94e+00

Pos logit contributions
 trigger+0.072
 heads+0.069
rated+0.067
 fists+0.066
 noses+0.064
Neg logit contributions
iw-0.041
h-0.040
on-0.039
yl-0.039
ott-0.038
Token could
Token ID714
Feature activation+2.10e-01

Pos logit contributions
 trigger+0.137
 heads+0.131
 fists+0.126
rated+0.125
 noses+0.122
Neg logit contributions
iw-0.086
h-0.085
on-0.084
yl-0.082
ott-0.080
Token explore
Token ID7301
Feature activation+4.80e-01

Pos logit contributions
 trigger+0.005
 heads+0.005
 fists+0.005
rated+0.005
 noses+0.005
Neg logit contributions
iw-0.004
h-0.004
on-0.004
yl-0.004
ott-0.004
Token the
Token ID262
Feature activation+5.26e-01

Pos logit contributions
 trigger+0.015
 heads+0.014
 fists+0.014
rated+0.014
 noses+0.014
Neg logit contributions
iw-0.007
h-0.007
on-0.006
yl-0.006
ott-0.006
Token place
Token ID1295
Feature activation+9.40e-01

Pos logit contributions
 trigger+0.013
 fists+0.012
 heads+0.012
rated+0.012
 noses+0.012
Neg logit contributions
iw-0.011
yl-0.010
h-0.010
on-0.010
Happy-0.010
Token together
Token ID1978
Feature activation+6.08e-01

Pos logit contributions
 trigger+0.030
 heads+0.030
 fists+0.029
rated+0.029
 noses+0.028
Neg logit contributions
iw-0.012
h-0.011
yl-0.011
ott-0.011
on-0.011
Token.
Token ID13
Feature activation+4.86e-01

Pos logit contributions
iw+0.021
h+0.019
yl+0.019
ott+0.018
on+0.018
Neg logit contributions
 trigger-0.071
 heads-0.070
 fists-0.067
rated-0.066
 noses-0.066
Token"\n"
Token ID198
Feature activation+4.95e-01

Pos logit contributions
 trigger+0.014
 heads+0.013
 fists+0.013
 noses+0.013
rated+0.013
Neg logit contributions
iw-0.008
h-0.008
yl-0.008
on-0.008
ott-0.007
Token"\n"
Token ID198
Feature activation-6.89e-01

Pos logit contributions
 trigger+0.015
 heads+0.015
 fists+0.014
 noses+0.014
rated+0.014
Neg logit contributions
iw-0.007
h-0.006
yl-0.005
ott-0.005
br-0.005
TokenThe
Token ID464
Feature activation-6.66e-01

Pos logit contributions
iw+0.010
h+0.009
yl+0.009
ott+0.009
on+0.009
Neg logit contributions
 trigger-0.021
 heads-0.020
 fists-0.019
rated-0.019
 noses-0.019
Token chip
Token ID11594
Feature activation+2.37e+00

Pos logit contributions
iw+0.013
yl+0.013
ott+0.012
on+0.012
h+0.012
Neg logit contributions
 trigger-0.017
rated-0.016
 heads-0.016
 fists-0.016
 noses-0.015
Tokenmun
Token ID6199
Feature activation+4.86e+00

Pos logit contributions
 trigger+0.071
 heads+0.067
 fists+0.066
rated+0.065
 noses+0.063
Neg logit contributions
iw-0.039
h-0.038
on-0.037
yl-0.036
ott-0.036
Tokenks
Token ID591
Feature activation+3.02e+00

Pos logit contributions
 trigger+0.141
 heads+0.131
rated+0.129
 lawn+0.123
 fists+0.123
Neg logit contributions
h-0.083
on-0.080
yl-0.079
iw-0.078
ott-0.076
Token were
Token ID547
Feature activation+7.02e-02

Pos logit contributions
 trigger+0.081
 heads+0.078
 fists+0.075
rated+0.074
 noses+0.073
Neg logit contributions
h-0.056
on-0.054
iw-0.054
yl-0.053
Happy-0.052
Token very
Token ID845
Feature activation+8.69e-01

Pos logit contributions
 trigger+0.002
rated+0.002
 heads+0.002
 fists+0.002
 noses+0.002
Neg logit contributions
h-0.001
on-0.001
iw-0.001
yl-0.001
Happy-0.001
Token curious
Token ID11040
Feature activation+3.89e-02

Pos logit contributions
 trigger+0.023
 heads+0.022
 fists+0.021
rated+0.021
 noses+0.021
Neg logit contributions
iw-0.017
h-0.016
yl-0.016
on-0.016
ott-0.016
Token,
Token ID11
Feature activation+2.62e-01

Pos logit contributions
 trigger+0.001
 heads+0.001
 fists+0.001
rated+0.001
 noses+0.001
Neg logit contributions
h-0.000
iw-0.000
on-0.000
yl-0.000
ott-0.000
Token and
Token ID290
Feature activation-1.27e+00

Pos logit contributions
 trigger+0.010
 heads+0.010
 fists+0.010
rated+0.009
 noses+0.009
Neg logit contributions
h-0.003
iw-0.003
on-0.003
yl-0.003
ott-0.003
Token eagerly
Token ID30130
Feature activation-6.24e-01

Pos logit contributions
h+0.021
iw+0.021
on+0.020
yl+0.020
Happy+0.019
Neg logit contributions
 trigger-0.038
 heads-0.035
 fists-0.034
rated-0.034
 noses-0.033
Token agreed
Token ID4987
Feature activation+4.16e-01

Pos logit contributions
iw+0.012
ott+0.011
h+0.011
on+0.011
yl+0.011
Neg logit contributions
 heads-0.016
 trigger-0.016
 fists-0.015
rated-0.015
 noses-0.015
Token.
Token ID13
Feature activation+6.26e-01

Pos logit contributions
 trigger+0.015
 heads+0.014
 fists+0.014
rated+0.014
 noses+0.014
Neg logit contributions
iw-0.004
h-0.004
on-0.004
yl-0.004
ott-0.004
Token"\n"
Token ID198
Feature activation+1.88e+00

Pos logit contributions
 trigger+0.022
 heads+0.021
 fists+0.021
rated+0.020
 noses+0.020
Neg logit contributions
iw-0.007
h-0.006
yl-0.006
on-0.006
ott-0.006
Token upon
Token ID2402
Feature activation+4.77e+00

Pos logit contributions
 trigger+0.061
 heads+0.059
rated+0.057
 fists+0.057
 noses+0.056
Neg logit contributions
iw-0.023
h-0.019
ott-0.019
yl-0.019
on-0.018
Token their
Token ID511
Feature activation+4.71e-01

Pos logit contributions
 trigger+0.164
 heads+0.158
 fists+0.152
rated+0.152
 noses+0.149
Neg logit contributions
h-0.054
iw-0.052
on-0.051
yl-0.049
Happy-0.048
Token arrival
Token ID10325
Feature activation+1.09e+00

Pos logit contributions
 trigger+0.013
rated+0.013
lling+0.011
 fists+0.011
ping+0.011
Neg logit contributions
yl-0.008
ott-0.008
Happy-0.008
lu-0.008
on-0.008
Token to
Token ID284
Feature activation-1.46e+00

Pos logit contributions
 trigger+0.039
 heads+0.038
 fists+0.037
rated+0.036
 noses+0.036
Neg logit contributions
iw-0.010
h-0.010
on-0.010
yl-0.010
ott-0.009
Token the
Token ID262
Feature activation-5.89e-01

Pos logit contributions
iw+0.023
h+0.022
on+0.021
yl+0.021
ott+0.021
Neg logit contributions
 trigger-0.044
 heads-0.043
 fists-0.041
rated-0.041
 noses-0.040
Token woods
Token ID16479
Feature activation+4.30e-01

Pos logit contributions
iw+0.012
h+0.012
yl+0.011
ott+0.011
Happy+0.011
Neg logit contributions
 trigger-0.015
 heads-0.014
 fists-0.014
rated-0.013
 noses-0.013
Token do
Token ID466
Feature activation-5.15e-01

Pos logit contributions
 trigger+0.018
rated+0.017
 fists+0.017
 heads+0.017
 noses+0.016
Neg logit contributions
on-0.009
h-0.009
iw-0.009
ott-0.008
yl-0.008
Token something
Token ID1223
Feature activation+2.37e+00

Pos logit contributions
iw+0.008
h+0.008
ott+0.008
yl+0.008
on+0.007
Neg logit contributions
 trigger-0.015
 heads-0.015
 fists-0.014
 noses-0.014
rated-0.014
Token special
Token ID2041
Feature activation-4.93e-01

Pos logit contributions
 trigger+0.060
 heads+0.059
rated+0.057
 fists+0.057
 noses+0.056
Neg logit contributions
iw-0.044
ott-0.042
on-0.041
h-0.041
yl-0.041
Token!"
Token ID2474
Feature activation+3.87e-01

Pos logit contributions
ott+0.008
iw+0.008
yl+0.008
h+0.007
on+0.007
Neg logit contributions
 heads-0.013
 trigger-0.013
rated-0.013
 noses-0.012
 grill-0.012
Token The
Token ID383
Feature activation+1.34e+00

Pos logit contributions
 heads+0.010
 trigger+0.010
 noses+0.009
 faces+0.009
 fists+0.009
Neg logit contributions
iw-0.008
h-0.007
yl-0.007
prise-0.007
on-0.007
Token other
Token ID584
Feature activation+4.68e+00

Pos logit contributions
rated+0.039
 trigger+0.039
 fists+0.037
lling+0.037
 heads+0.037
Neg logit contributions
yl-0.019
Happy-0.018
ott-0.018
iw-0.018
on-0.018
Token friend
Token ID1545
Feature activation+1.63e+00

Pos logit contributions
 trigger+0.163
 fists+0.146
 heads+0.143
rated+0.142
lling+0.142
Neg logit contributions
h-0.055
yl-0.055
iw-0.050
,-0.045
on-0.045
Token was
Token ID373
Feature activation+7.47e-01

Pos logit contributions
 heads+0.041
rated+0.038
 fists+0.038
 trigger+0.038
 noses+0.037
Neg logit contributions
iw-0.034
br-0.030
Happy-0.030
och-0.029
h-0.029
Token very
Token ID845
Feature activation+1.20e+00

Pos logit contributions
 trigger+0.020
 heads+0.020
 fists+0.019
rated+0.019
 noses+0.018
Neg logit contributions
iw-0.014
h-0.013
on-0.013
yl-0.013
ott-0.013
Token excited
Token ID6568
Feature activation+1.92e-01

Pos logit contributions
 trigger+0.032
 heads+0.031
 fists+0.029
rated+0.029
 noses+0.029
Neg logit contributions
iw-0.023
h-0.022
on-0.022
yl-0.022
ott-0.021
Token.
Token ID13
Feature activation+1.62e+00

Pos logit contributions
 trigger+0.007
 heads+0.007
 fists+0.006
rated+0.006
 noses+0.006
Neg logit contributions
iw-0.002
h-0.002
on-0.002
yl-0.002
ott-0.002
Token the
Token ID262
Feature activation+5.87e-01

Pos logit contributions
 trigger+0.017
 heads+0.016
rated+0.016
 fists+0.016
 noses+0.015
Neg logit contributions
iw-0.008
h-0.007
on-0.007
yl-0.007
ott-0.007
Token apples
Token ID22514
Feature activation+1.07e-02

Pos logit contributions
 trigger+0.014
 fists+0.014
 heads+0.013
rated+0.013
 noses+0.013
Neg logit contributions
iw-0.012
h-0.011
on-0.011
ott-0.011
Happy-0.010
Token easily
Token ID3538
Feature activation-5.29e-01

Pos logit contributions
 trigger+0.000
 heads+0.000
 fists+0.000
rated+0.000
 noses+0.000
Neg logit contributions
iw-0.000
h-0.000
on-0.000
yl-0.000
ott-0.000
Token.
Token ID13
Feature activation+2.07e+00

Pos logit contributions
iw+0.006
yl+0.005
h+0.005
on+0.005
ott+0.005
Neg logit contributions
 trigger-0.018
 heads-0.018
 fists-0.017
rated-0.017
 noses-0.017
Token The
Token ID383
Feature activation+1.45e+00

Pos logit contributions
 trigger+0.065
 heads+0.062
rated+0.060
 fists+0.060
 noses+0.058
Neg logit contributions
iw-0.030
h-0.029
on-0.028
yl-0.028
ott-0.028
Token other
Token ID584
Feature activation+4.67e+00

Pos logit contributions
 trigger+0.044
rated+0.044
lling+0.043
 fists+0.043
 heads+0.042
Neg logit contributions
iw-0.020
ott-0.018
yl-0.018
on-0.017
h-0.017
Token friend
Token ID1545
Feature activation+1.61e+00

Pos logit contributions
 trigger+0.152
 fists+0.134
lling+0.131
 heads+0.129
rated+0.129
Neg logit contributions
iw-0.070
h-0.068
yl-0.064
ott-0.059
on-0.057
Token was
Token ID373
Feature activation+7.64e-01

Pos logit contributions
 heads+0.040
 trigger+0.038
rated+0.037
 fists+0.037
 noses+0.036
Neg logit contributions
iw-0.032
h-0.031
yl-0.029
br-0.029
Happy-0.029
Token short
Token ID1790
Feature activation+5.37e-01

Pos logit contributions
 trigger+0.022
 heads+0.021
rated+0.020
 fists+0.020
 noses+0.020
Neg logit contributions
iw-0.013
h-0.013
yl-0.012
on-0.012
ott-0.012
Token and
Token ID290
Feature activation+6.34e-02

Pos logit contributions
 trigger+0.018
 heads+0.018
rated+0.017
 fists+0.017
 noses+0.017
Neg logit contributions
iw-0.006
h-0.006
on-0.006
yl-0.006
ott-0.005
Token couldn
Token ID3521
Feature activation-6.41e-01

Pos logit contributions
 trigger+0.002
 heads+0.002
 fists+0.002
rated+0.002
 noses+0.002
Neg logit contributions
iw-0.001
h-0.001
yl-0.001
on-0.001
ott-0.001
Token which
Token ID543
Feature activation-7.76e-01

Pos logit contributions
 she+0.001
iw+0.001
on+0.001
h+0.001
 H+0.001
Neg logit contributions
 trigger-0.002
rated-0.002
 fists-0.002
 heads-0.002
lling-0.002
Token he
Token ID339
Feature activation+5.74e-01

Pos logit contributions
iw+0.014
on+0.013
h+0.013
yl+0.013
ott+0.013
Neg logit contributions
 trigger-0.022
 heads-0.021
 fists-0.020
rated-0.020
 noses-0.019
Token called
Token ID1444
Feature activation+1.45e+00

Pos logit contributions
 heads+0.013
 trigger+0.012
rated+0.012
 noses+0.012
 fists+0.012
Neg logit contributions
iw-0.012
h-0.012
Happy-0.012
on-0.012
yl-0.012
Token his
Token ID465
Feature activation+1.40e+00

Pos logit contributions
 trigger+0.045
 heads+0.043
rated+0.043
 fists+0.042
lling+0.041
Neg logit contributions
h-0.020
on-0.020
iw-0.019
yl-0.018
Happy-0.018
Token "
Token ID366
Feature activation+1.57e-02

Pos logit contributions
 trigger+0.034
rated+0.031
 heads+0.031
 fists+0.030
lling+0.029
Neg logit contributions
iw-0.030
yl-0.029
ott-0.029
h-0.028
on-0.028
Tokench
Token ID354
Feature activation+4.61e+00

Pos logit contributions
 trigger+0.000
 heads+0.000
rated+0.000
 noses+0.000
 fists+0.000
Neg logit contributions
iw-0.000
ott-0.000
yl-0.000
on-0.000
h-0.000
Tokenamber
Token ID7789
Feature activation+1.52e+00

Pos logit contributions
 trigger+0.136
 heads+0.131
 fists+0.127
rated+0.126
 noses+0.123
Neg logit contributions
h-0.076
on-0.074
iw-0.071
yl-0.071
ott-0.066
Token".
Token ID1911
Feature activation+9.67e-01

Pos logit contributions
 trigger+0.047
 heads+0.046
 fists+0.044
rated+0.044
 noses+0.043
Neg logit contributions
iw-0.022
h-0.021
yl-0.020
on-0.020
ott-0.020
Token He
Token ID679
Feature activation+1.59e-01

Pos logit contributions
 trigger+0.030
 heads+0.029
 fists+0.028
 noses+0.028
rated+0.027
Neg logit contributions
iw-0.014
h-0.014
yl-0.013
on-0.013
ott-0.013
Token also
Token ID635
Feature activation-1.12e+00

Pos logit contributions
 heads+0.004
 trigger+0.004
 fists+0.004
 noses+0.004
rated+0.004
Neg logit contributions
iw-0.003
h-0.003
on-0.003
yl-0.003
ott-0.003
Token loved
Token ID6151
Feature activation-1.28e-02

Pos logit contributions
iw+0.021
h+0.021
on+0.020
yl+0.020
ott+0.020
Neg logit contributions
 trigger-0.030
 heads-0.029
 fists-0.028
rated-0.027
 noses-0.027
Token any
Token ID597
Feature activation+4.81e-01

Pos logit contributions
 trigger+0.008
 heads+0.008
 fists+0.007
rated+0.007
 noses+0.007
Neg logit contributions
iw-0.005
h-0.005
on-0.005
yl-0.005
ott-0.004
Token trouble
Token ID5876
Feature activation+1.24e+00

Pos logit contributions
 trigger+0.012
 heads+0.012
 fists+0.011
rated+0.011
 noses+0.011
Neg logit contributions
iw-0.010
h-0.009
on-0.009
yl-0.009
ott-0.009
Token.
Token ID13
Feature activation+1.43e-01

Pos logit contributions
 trigger+0.047
 heads+0.045
 fists+0.044
rated+0.044
 noses+0.043
Neg logit contributions
h-0.010
iw-0.009
on-0.009
yl-0.008
Happy-0.007
Token The
Token ID383
Feature activation-3.12e-01

Pos logit contributions
 heads+0.004
 trigger+0.004
 fists+0.004
 noses+0.004
 lawn+0.004
Neg logit contributions
yl-0.002
iw-0.002
br-0.002
h-0.002
on-0.002
Token two
Token ID734
Feature activation+2.26e+00

Pos logit contributions
iw+0.006
h+0.006
on+0.006
yl+0.006
ott+0.006
Neg logit contributions
 trigger-0.008
 heads-0.008
rated-0.007
 fists-0.007
 noses-0.007
Token dogs
Token ID6844
Feature activation+4.61e+00

Pos logit contributions
 trigger+0.063
 heads+0.061
 fists+0.058
rated+0.057
 noses+0.057
Neg logit contributions
iw-0.040
h-0.039
on-0.038
yl-0.038
ott-0.037
Token played
Token ID2826
Feature activation+1.30e+00

Pos logit contributions
 trigger+0.137
 heads+0.129
rated+0.128
 fists+0.127
 noses+0.122
Neg logit contributions
h-0.072
on-0.067
iw-0.066
yl-0.066
 she-0.062
Token together
Token ID1978
Feature activation+1.64e+00

Pos logit contributions
 trigger+0.041
 heads+0.040
rated+0.039
 fists+0.038
 noses+0.037
Neg logit contributions
h-0.018
on-0.017
iw-0.017
yl-0.016
Happy-0.016
Token all
Token ID477
Feature activation+1.24e+00

Pos logit contributions
 trigger+0.055
 heads+0.052
rated+0.051
 fists+0.051
 noses+0.049
Neg logit contributions
h-0.021
iw-0.020
on-0.019
yl-0.018
ott-0.017
Token day
Token ID1110
Feature activation+4.60e-01

Pos logit contributions
 trigger+0.030
 heads+0.029
rated+0.028
 fists+0.028
 noses+0.027
Neg logit contributions
iw-0.026
h-0.025
on-0.025
yl-0.025
ott-0.025
Token,
Token ID11
Feature activation+3.04e-01

Pos logit contributions
 trigger+0.015
 heads+0.014
 fists+0.014
rated+0.014
 noses+0.014
Neg logit contributions
iw-0.006
h-0.006
on-0.006
yl-0.006
ott-0.006
Token saw
Token ID2497
Feature activation-5.29e-01

Pos logit contributions
 trigger+0.100
 heads+0.095
rated+0.093
 fists+0.091
 noses+0.089
Neg logit contributions
h-0.049
iw-0.048
on-0.048
yl-0.047
ott-0.045
Token Jack
Token ID3619
Feature activation+1.24e+00

Pos logit contributions
on+0.007
 course+0.005
ott+0.005
 she+0.005
iw+0.005
Neg logit contributions
rated-0.018
 heads-0.018
 trigger-0.018
 fists-0.017
 noses-0.017
Token and
Token ID290
Feature activation+8.08e-01

Pos logit contributions
 trigger+0.045
rated+0.042
 heads+0.042
 fists+0.041
 noses+0.040
Neg logit contributions
on-0.013
h-0.013
iw-0.010
yl-0.010
ott-0.009
Token it
Token ID340
Feature activation+1.39e+00

Pos logit contributions
 trigger+0.024
 heads+0.024
 fists+0.023
rated+0.023
 noses+0.022
Neg logit contributions
iw-0.012
h-0.012
on-0.012
yl-0.012
ott-0.011
Token slit
Token ID40724
Feature activation+2.12e+00

Pos logit contributions
 trigger+0.045
 heads+0.042
 fists+0.041
rated+0.041
 noses+0.040
Neg logit contributions
h-0.019
on-0.017
iw-0.017
yl-0.017
ott-0.016
Tokenhered
Token ID6083
Feature activation+4.61e+00

Pos logit contributions
 heads+0.046
 trigger+0.045
rated+0.044
 fists+0.043
 noses+0.043
Neg logit contributions
h-0.050
iw-0.048
yl-0.048
on-0.047
Happy-0.045
Token towards
Token ID3371
Feature activation+8.59e-01

Pos logit contributions
 trigger+0.136
rated+0.124
 noses+0.119
 heads+0.119
 fists+0.118
Neg logit contributions
h-0.070
yl-0.068
 together-0.066
,-0.063
!-0.062
Token him
Token ID683
Feature activation+1.42e+00

Pos logit contributions
 trigger+0.027
rated+0.026
 heads+0.026
 fists+0.026
 noses+0.025
Neg logit contributions
on-0.012
h-0.010
Happy-0.010
ott-0.010
yl-0.010
Token.
Token ID13
Feature activation+5.40e-01

Pos logit contributions
 trigger+0.053
 heads+0.051
 fists+0.050
rated+0.050
 noses+0.049
Neg logit contributions
h-0.011
iw-0.011
on-0.011
yl-0.010
ott-0.009
Token He
Token ID679
Feature activation+1.21e-01

Pos logit contributions
 trigger+0.014
 heads+0.014
 fists+0.013
rated+0.013
 noses+0.013
Neg logit contributions
iw-0.010
h-0.010
yl-0.010
on-0.010
ott-0.009
Token was
Token ID373
Feature activation+1.02e+00

Pos logit contributions
 trigger+0.003
 heads+0.003
 fists+0.003
rated+0.003
 noses+0.003
Neg logit contributions
iw-0.002
h-0.002
yl-0.002
on-0.002
ott-0.002
Token friend
Token ID1545
Feature activation+6.16e-01

Pos logit contributions
rated+0.056
lling+0.054
 trigger+0.053
ging+0.050
ping+0.049
Neg logit contributions
,-0.016
yl-0.015
ott-0.014
on-0.014
 too-0.014
Token D
Token ID360
Feature activation+1.16e+00

Pos logit contributions
 trigger+0.023
 heads+0.023
 fists+0.022
rated+0.022
 noses+0.021
Neg logit contributions
h-0.005
iw-0.005
on-0.004
yl-0.004
ott-0.004
Tokenory
Token ID652
Feature activation-2.27e-01

Pos logit contributions
 trigger+0.028
 heads+0.027
rated+0.026
 fists+0.025
 noses+0.025
Neg logit contributions
iw-0.024
h-0.024
on-0.023
yl-0.023
Happy-0.022
Token said
Token ID531
Feature activation+2.09e-01

Pos logit contributions
iw+0.004
h+0.003
Happy+0.003
on+0.003
yl+0.003
Neg logit contributions
 trigger-0.007
 heads-0.007
 fists-0.006
rated-0.006
 noses-0.006
Token that
Token ID326
Feature activation-7.29e-01

Pos logit contributions
 trigger+0.008
 heads+0.008
 fists+0.007
rated+0.007
 noses+0.007
Neg logit contributions
iw-0.002
h-0.002
on-0.001
yl-0.001
ott-0.001
Token sharks
Token ID27476
Feature activation+4.60e+00

Pos logit contributions
on+0.006
iw+0.005
yl+0.005
ott+0.005
 she+0.005
Neg logit contributions
 trigger-0.029
 heads-0.028
rated-0.027
 fists-0.027
 noses-0.026
Token permit
Token ID8749
Feature activation+2.07e+00

Pos logit contributions
 trigger+0.160
 heads+0.154
 fists+0.153
 noses+0.149
 faces+0.144
Neg logit contributions
h-0.044
on-0.044
?-0.040
?"-0.039
,-0.037
Token little
Token ID1310
Feature activation+6.06e-01

Pos logit contributions
 trigger+0.065
 fists+0.061
rated+0.060
 heads+0.059
sing+0.057
Neg logit contributions
h-0.028
 she-0.028
,-0.026
yl-0.025
ott-0.024
Token fish
Token ID5916
Feature activation+2.14e+00

Pos logit contributions
 trigger+0.015
 fists+0.015
rated+0.015
 heads+0.014
 noses+0.013
Neg logit contributions
yl-0.012
iw-0.011
 she-0.011
h-0.011
Happy-0.011
Token like
Token ID588
Feature activation-3.19e-01

Pos logit contributions
 trigger+0.084
 heads+0.082
 fists+0.081
rated+0.079
 noses+0.077
Neg logit contributions
h-0.012
iw-0.012
on-0.012
yl-0.011
ott-0.009
Token them
Token ID606
Feature activation+1.43e+00

Pos logit contributions
iw+0.004
h+0.004
on+0.004
yl+0.004
ott+0.004
Neg logit contributions
 trigger-0.010
rated-0.010
 heads-0.009
 fists-0.009
 noses-0.009
Token around
Token ID1088
Feature activation+7.37e-01

Pos logit contributions
 trigger+0.066
 heads+0.064
 fists+0.060
rated+0.059
 noses+0.059
Neg logit contributions
iw-0.042
ott-0.040
on-0.039
h-0.039
yl-0.039
Token and
Token ID290
Feature activation-1.52e-01

Pos logit contributions
 trigger+0.020
 heads+0.020
 fists+0.019
 noses+0.019
rated+0.018
Neg logit contributions
iw-0.012
h-0.012
ott-0.012
yl-0.012
on-0.011
Token it
Token ID340
Feature activation+1.02e+00

Pos logit contributions
iw+0.002
on+0.002
h+0.002
 she+0.002
ott+0.002
Neg logit contributions
 trigger-0.004
rated-0.004
 fists-0.004
 heads-0.004
 noses-0.004
Token will
Token ID481
Feature activation+5.28e-01

Pos logit contributions
 heads+0.027
 trigger+0.026
rated+0.025
 fists+0.024
 noses+0.024
Neg logit contributions
iw-0.020
yl-0.019
on-0.019
ott-0.019
h-0.018
Token make
Token ID787
Feature activation+1.35e+00

Pos logit contributions
 trigger+0.012
 heads+0.012
 fists+0.011
rated+0.011
 noses+0.011
Neg logit contributions
iw-0.012
on-0.011
yl-0.011
h-0.011
ott-0.011
Token us
Token ID514
Feature activation+4.58e+00

Pos logit contributions
 trigger+0.040
 heads+0.038
rated+0.037
 fists+0.037
 noses+0.036
Neg logit contributions
h-0.021
iw-0.021
on-0.020
yl-0.020
ott-0.019
Token warm
Token ID5814
Feature activation+1.37e+00

Pos logit contributions
 trigger+0.116
 heads+0.108
 fists+0.105
rated+0.102
 noses+0.100
Neg logit contributions
h-0.092
iw-0.091
on-0.090
yl-0.087
Happy-0.086
Token."
Token ID526
Feature activation+9.97e-02

Pos logit contributions
 trigger+0.046
 heads+0.044
 fists+0.043
rated+0.043
 noses+0.042
Neg logit contributions
iw-0.017
h-0.016
on-0.015
yl-0.015
ott-0.015
Token So
Token ID1406
Feature activation+1.39e+00

Pos logit contributions
 heads+0.003
 trigger+0.003
 noses+0.002
 fists+0.002
 faces+0.002
Neg logit contributions
h-0.002
iw-0.002
yl-0.002
on-0.002
Happy-0.002
Token,
Token ID11
Feature activation+7.71e-01

Pos logit contributions
 trigger+0.047
 heads+0.047
 fists+0.045
 noses+0.045
rated+0.044
Neg logit contributions
iw-0.016
ott-0.015
h-0.015
on-0.014
yl-0.014
Token they
Token ID484
Feature activation+1.49e+00

Pos logit contributions
 trigger+0.021
 heads+0.020
 fists+0.019
 noses+0.019
 faces+0.018
Neg logit contributions
iw-0.014
h-0.014
yl-0.013
ott-0.013
on-0.013
Token.
Token ID13
Feature activation+1.75e+00

Pos logit contributions
iw+0.016
yl+0.014
ott+0.014
on+0.014
h+0.014
Neg logit contributions
 trigger-0.040
 heads-0.040
 noses-0.038
 fists-0.038
rated-0.038
Token The
Token ID383
Feature activation-1.59e-01

Pos logit contributions
 trigger+0.070
 heads+0.068
 fists+0.067
rated+0.066
 noses+0.064
Neg logit contributions
iw-0.011
on-0.009
h-0.009
ott-0.008
Happy-0.007
Token farmer
Token ID18739
Feature activation+1.11e+00

Pos logit contributions
iw+0.003
on+0.002
Happy+0.002
 she+0.002
ott+0.002
Neg logit contributions
 trigger-0.004
 fists-0.004
rated-0.004
 heads-0.004
lling-0.004
Token and
Token ID290
Feature activation+2.49e+00

Pos logit contributions
 heads+0.029
 trigger+0.028
 fists+0.027
 noses+0.027
rated+0.027
Neg logit contributions
iw-0.021
h-0.020
yl-0.019
Happy-0.019
on-0.019
Token his
Token ID465
Feature activation+1.41e+00

Pos logit contributions
 trigger+0.087
rated+0.084
 fists+0.081
imming+0.079
lling+0.079
Neg logit contributions
iw-0.026
on-0.024
yl-0.023
h-0.022
 she-0.021
Token dog
Token ID3290
Feature activation+4.54e+00

Pos logit contributions
 trigger+0.038
rated+0.038
lling+0.035
 heads+0.034
 fists+0.033
Neg logit contributions
iw-0.025
yl-0.025
ott-0.024
on-0.023
lu-0.022
Token continued
Token ID3767
Feature activation+2.75e-01

Pos logit contributions
 trigger+0.130
 heads+0.122
 fists+0.119
rated+0.117
 noses+0.113
Neg logit contributions
iw-0.077
h-0.077
on-0.074
yl-0.072
ott-0.072
Token to
Token ID284
Feature activation-8.39e-01

Pos logit contributions
 trigger+0.009
 heads+0.009
 fists+0.008
rated+0.008
 noses+0.008
Neg logit contributions
iw-0.004
h-0.004
on-0.003
yl-0.003
ott-0.003
Token run
Token ID1057
Feature activation+2.00e+00

Pos logit contributions
iw+0.016
h+0.016
on+0.016
yl+0.016
ott+0.015
Neg logit contributions
 trigger-0.022
 heads-0.021
 fists-0.020
rated-0.020
 noses-0.019
Token and
Token ID290
Feature activation-2.49e-01

Pos logit contributions
 trigger+0.059
 heads+0.057
 fists+0.055
rated+0.054
 noses+0.053
Neg logit contributions
iw-0.033
h-0.031
on-0.031
yl-0.030
ott-0.030
Token play
Token ID711
Feature activation+8.56e-01

Pos logit contributions
iw+0.004
h+0.004
on+0.004
 she+0.004
yl+0.004
Neg logit contributions
 trigger-0.008
rated-0.007
 fists-0.007
 heads-0.007
lling-0.006
Token sticky
Token ID23408
Feature activation+4.44e-01

Pos logit contributions
 trigger+0.011
rated+0.010
 heads+0.009
 fists+0.009
 antics+0.009
Neg logit contributions
h-0.005
iw-0.005
yl-0.005
on-0.004
ott-0.004
Token and
Token ID290
Feature activation-3.64e-01

Pos logit contributions
 trigger+0.016
rated+0.015
 heads+0.015
 fists+0.015
 noses+0.014
Neg logit contributions
h-0.004
iw-0.004
on-0.004
yl-0.003
br-0.003
Token scary
Token ID14343
Feature activation+1.87e+00

Pos logit contributions
h+0.006
iw+0.006
on+0.006
yl+0.006
ott+0.006
Neg logit contributions
 trigger-0.010
 heads-0.009
rated-0.009
 fists-0.009
 antics-0.009
Token and
Token ID290
Feature activation-1.02e-01

Pos logit contributions
 trigger+0.074
rated+0.071
 heads+0.069
 fists+0.068
imming+0.067
Neg logit contributions
h-0.011
iw-0.008
on-0.007
yl-0.007
br-0.006
Token made
Token ID925
Feature activation+1.67e+00

Pos logit contributions
h+0.001
iw+0.001
on+0.001
yl+0.001
 she+0.001
Neg logit contributions
 trigger-0.003
 heads-0.003
rated-0.003
 fists-0.003
 antics-0.003
Token them
Token ID606
Feature activation+4.53e+00

Pos logit contributions
 trigger+0.051
rated+0.050
 heads+0.048
 fists+0.046
 lawn+0.044
Neg logit contributions
on-0.023
h-0.022
ott-0.022
Happy-0.022
yl-0.021
Token sne
Token ID10505
Feature activation+2.71e+00

Pos logit contributions
 trigger+0.113
rated+0.102
 heads+0.101
 goats+0.098
 fists+0.098
Neg logit contributions
h-0.092
on-0.085
 she-0.082
iw-0.080
ott-0.078
Tokene
Token ID68
Feature activation+1.91e+00

Pos logit contributions
 trigger+0.084
 heads+0.081
 fists+0.078
rated+0.077
 noses+0.076
Neg logit contributions
iw-0.040
h-0.039
on-0.038
yl-0.037
ott-0.037
Tokenze
Token ID2736
Feature activation+2.75e+00

Pos logit contributions
 trigger+0.035
rated+0.031
 heads+0.030
 fists+0.030
 goats+0.029
Neg logit contributions
h-0.054
on-0.049
yl-0.048
br-0.047
ott-0.046
Token.
Token ID13
Feature activation+1.89e+00

Pos logit contributions
 trigger+0.108
 heads+0.104
 fists+0.102
rated+0.100
 noses+0.098
Neg logit contributions
h-0.020
on-0.013
br-0.012
 she-0.010
Happy-0.009
Token"\n"
Token ID198
Feature activation+1.62e+00

Pos logit contributions
 trigger+0.071
 heads+0.069
rated+0.067
 fists+0.067
 noses+0.065
Neg logit contributions
iw-0.016
h-0.015
on-0.014
ott-0.014
yl-0.013
Token.
Token ID13
Feature activation+1.55e+00

Pos logit contributions
 trigger+0.089
 heads+0.086
 fists+0.083
rated+0.083
 noses+0.081
Neg logit contributions
iw-0.022
h-0.021
on-0.020
yl-0.019
ott-0.018
Token" "
Token ID220
Feature activation+3.82e-02

Pos logit contributions
 trigger+0.049
 heads+0.048
 fists+0.046
 noses+0.046
rated+0.045
Neg logit contributions
iw-0.021
h-0.021
yl-0.020
on-0.019
Happy-0.018
Token"\n"
Token ID198
Feature activation+1.81e+00

Pos logit contributions
 trigger+0.001
 heads+0.001
 fists+0.001
rated+0.001
 noses+0.001
Neg logit contributions
iw-0.000
h-0.000
on-0.000
yl-0.000
ott-0.000
Token"\n"
Token ID198
Feature activation+5.26e-01

Pos logit contributions
 trigger+0.051
 heads+0.050
rated+0.048
 fists+0.047
 noses+0.047
Neg logit contributions
iw-0.028
ott-0.023
yl-0.023
h-0.022
prise-0.022
TokenThe
Token ID464
Feature activation+1.07e+00

Pos logit contributions
 trigger+0.016
 heads+0.016
rated+0.015
 noses+0.015
 fists+0.015
Neg logit contributions
iw-0.008
on-0.007
yl-0.007
ott-0.007
prise-0.006
Token other
Token ID584
Feature activation+4.53e+00

Pos logit contributions
 trigger+0.032
 fists+0.032
rated+0.031
 heads+0.030
 noses+0.030
Neg logit contributions
iw-0.016
Happy-0.015
h-0.015
yl-0.015
ott-0.015
Token frogs
Token ID37475
Feature activation+2.76e+00

Pos logit contributions
 trigger+0.127
 fists+0.114
rated+0.104
lling+0.096
 footing+0.096
Neg logit contributions
iw-0.093
,-0.089
h-0.088
yl-0.085
 course-0.080
Token could
Token ID714
Feature activation+6.91e-01

Pos logit contributions
 trigger+0.082
 heads+0.078
 fists+0.076
rated+0.074
 noses+0.073
Neg logit contributions
h-0.045
iw-0.044
on-0.042
yl-0.042
ott-0.041
Token jump
Token ID4391
Feature activation+2.14e+00

Pos logit contributions
 trigger+0.019
 heads+0.017
rated+0.017
 fists+0.017
 noses+0.016
Neg logit contributions
h-0.013
iw-0.013
on-0.012
yl-0.012
ott-0.012
Token high
Token ID1029
Feature activation+1.31e+00

Pos logit contributions
 trigger+0.065
 heads+0.063
 fists+0.062
rated+0.060
 noses+0.060
Neg logit contributions
h-0.032
iw-0.029
on-0.029
yl-0.029
br-0.028
Token,
Token ID11
Feature activation+2.28e-01

Pos logit contributions
 trigger+0.039
 heads+0.037
 fists+0.036
rated+0.036
 noses+0.035
Neg logit contributions
iw-0.021
h-0.020
yl-0.019
on-0.019
ott-0.019
Token,
Token ID11
Feature activation+1.45e+00

Pos logit contributions
 trigger+0.112
 heads+0.109
 fists+0.105
rated+0.105
 noses+0.103
Neg logit contributions
iw-0.019
h-0.018
on-0.017
yl-0.016
ott-0.014
Token Tim
Token ID5045
Feature activation+1.17e+00

Pos logit contributions
 trigger+0.046
 heads+0.045
 fists+0.043
rated+0.043
 noses+0.042
Neg logit contributions
iw-0.020
h-0.020
on-0.020
yl-0.019
Happy-0.018
Token and
Token ID290
Feature activation+3.48e+00

Pos logit contributions
 trigger+0.035
 heads+0.032
 fists+0.032
rated+0.031
 noses+0.031
Neg logit contributions
iw-0.019
h-0.018
on-0.018
ott-0.018
yl-0.018
Token his
Token ID465
Feature activation+2.52e+00

Pos logit contributions
 trigger+0.105
 heads+0.099
 fists+0.096
rated+0.096
 noses+0.092
Neg logit contributions
iw-0.055
on-0.055
h-0.055
yl-0.052
ott-0.050
Token toy
Token ID13373
Feature activation+2.83e+00

Pos logit contributions
rated+0.090
lling+0.083
 trigger+0.075
 banks+0.074
 surfaces+0.074
Neg logit contributions
-0.030
,-0.029
.-0.028
on-0.028
yl-0.027
Token dog
Token ID3290
Feature activation+4.52e+00

Pos logit contributions
rated+0.077
 fists+0.069
lling+0.069
 trigger+0.069
 heads+0.066
Neg logit contributions
on-0.054
iw-0.052
yl-0.052
 she-0.052
ott-0.051
Token went
Token ID1816
Feature activation+1.99e+00

Pos logit contributions
 trigger+0.151
 heads+0.143
 fists+0.141
rated+0.141
lling+0.135
Neg logit contributions
iw-0.054
h-0.053
on-0.052
ott-0.051
yl-0.050
Token for
Token ID329
Feature activation+2.94e+00

Pos logit contributions
 trigger+0.058
 heads+0.057
 fists+0.053
 noses+0.053
rated+0.052
Neg logit contributions
iw-0.033
h-0.031
on-0.031
ott-0.030
yl-0.030
Token a
Token ID257
Feature activation+8.66e-01

Pos logit contributions
 trigger+0.088
 heads+0.086
 fists+0.083
 noses+0.082
rated+0.079
Neg logit contributions
iw-0.046
h-0.045
yl-0.042
on-0.042
ott-0.042
Token walk
Token ID2513
Feature activation+6.59e-01

Pos logit contributions
 trigger+0.017
 heads+0.016
 fists+0.015
rated+0.015
 noses+0.015
Neg logit contributions
iw-0.023
h-0.022
on-0.022
yl-0.021
ott-0.021
Token.
Token ID13
Feature activation+2.70e+00

Pos logit contributions
 heads+0.018
 trigger+0.018
rated+0.017
 noses+0.017
 fists+0.016
Neg logit contributions
iw-0.012
on-0.011
ott-0.011
yl-0.011
h-0.010
Token clock
Token ID8801
Feature activation+1.66e+00

Pos logit contributions
 trigger+0.003
 heads+0.003
 fists+0.003
rated+0.003
 noses+0.003
Neg logit contributions
iw-0.002
h-0.002
on-0.002
yl-0.002
ott-0.002
Token went
Token ID1816
Feature activation+1.42e+00

Pos logit contributions
 heads+0.039
 trigger+0.038
 noses+0.036
rated+0.035
 fists+0.034
Neg logit contributions
iw-0.038
yl-0.034
ott-0.034
h-0.034
Happy-0.034
Token "
Token ID366
Feature activation+1.14e+00

Pos logit contributions
 trigger+0.035
 heads+0.035
 fists+0.031
rated+0.031
 noses+0.031
Neg logit contributions
iw-0.030
ott-0.028
yl-0.028
on-0.028
Happy-0.027
Tokentick
Token ID42298
Feature activation+7.06e-01

Pos logit contributions
rated+0.027
 trigger+0.027
 heads+0.026
ping+0.024
 fists+0.024
Neg logit contributions
ott-0.022
iw-0.022
yl-0.022
prise-0.020
ris-0.019
Token-
Token ID12
Feature activation+1.37e+00

Pos logit contributions
 heads+0.021
 trigger+0.021
rated+0.021
 fists+0.020
 noses+0.020
Neg logit contributions
iw-0.010
ott-0.009
yl-0.009
on-0.009
prise-0.009
Tokent
Token ID83
Feature activation+4.49e+00

Pos logit contributions
 trigger+0.033
rated+0.031
 heads+0.031
 fists+0.029
 noses+0.028
Neg logit contributions
iw-0.027
ott-0.026
prise-0.024
yl-0.024
 gen-0.023
Tokenock
Token ID735
Feature activation+2.67e+00

Pos logit contributions
 trigger+0.112
rated+0.107
 heads+0.106
 fists+0.101
 noses+0.098
Neg logit contributions
iw-0.094
ott-0.088
yl-0.085
on-0.083
lu-0.081
Token,
Token ID11
Feature activation+1.14e+00

Pos logit contributions
 trigger+0.081
 heads+0.081
rated+0.078
 fists+0.077
 noses+0.076
Neg logit contributions
iw-0.040
ott-0.037
yl-0.035
on-0.034
Happy-0.034
Token tick
Token ID4378
Feature activation+1.50e+00

Pos logit contributions
 trigger+0.032
 heads+0.030
 fists+0.029
rated+0.029
 noses+0.028
Neg logit contributions
h-0.020
iw-0.020
on-0.019
yl-0.019
ott-0.018
Token-
Token ID12
Feature activation+9.10e-01

Pos logit contributions
 trigger+0.035
 heads+0.033
 fists+0.032
rated+0.031
 noses+0.031
Neg logit contributions
h-0.034
iw-0.033
on-0.032
yl-0.032
ott-0.031
Tokent
Token ID83
Feature activation+3.60e+00

Pos logit contributions
 trigger+0.024
rated+0.022
 heads+0.022
 noses+0.021
 fists+0.021
Neg logit contributions
iw-0.017
ott-0.016
prise-0.015
yl-0.015
 gen-0.014
Token.
Token ID13
Feature activation+3.36e-01

Pos logit contributions
iw+0.011
h+0.010
yl+0.010
on+0.010
ott+0.010
Neg logit contributions
 trigger-0.036
 heads-0.035
 fists-0.034
rated-0.033
 noses-0.033
Token Suddenly
Token ID24975
Feature activation-2.43e-01

Pos logit contributions
 trigger+0.009
 heads+0.009
 fists+0.009
 noses+0.008
rated+0.008
Neg logit contributions
h-0.006
yl-0.006
iw-0.006
prise-0.006
Happy-0.006
Token,
Token ID11
Feature activation-6.04e-01

Pos logit contributions
iw+0.003
h+0.003
ott+0.003
Happy+0.003
yl+0.003
Neg logit contributions
 trigger-0.008
 fists-0.008
 heads-0.008
 noses-0.008
 faces-0.007
Token the
Token ID262
Feature activation+6.61e-02

Pos logit contributions
iw+0.010
h+0.010
yl+0.009
ott+0.009
on+0.009
Neg logit contributions
 trigger-0.018
 heads-0.017
 fists-0.017
 noses-0.016
rated-0.016
Token dog
Token ID3290
Feature activation+2.47e+00

Pos logit contributions
rated+0.002
lling+0.002
imming+0.002
 heads+0.002
lled+0.002
Neg logit contributions
on-0.001
 band-0.001
 zero-0.001
 she-0.001
 course-0.001
Token lun
Token ID14678
Feature activation+4.49e+00

Pos logit contributions
rated+0.075
 trigger+0.075
 heads+0.074
 fists+0.071
 attention+0.070
Neg logit contributions
h-0.035
yl-0.033
on-0.033
br-0.031
iw-0.029
Tokenged
Token ID2004
Feature activation+1.04e+00

Pos logit contributions
 heads+0.102
 lawn+0.099
 noses+0.096
 fists+0.096
rated+0.096
Neg logit contributions
h-0.102
on-0.092
ha-0.091
yl-0.090
iw-0.084
Token at
Token ID379
Feature activation+1.49e+00

Pos logit contributions
 trigger+0.034
 heads+0.033
rated+0.033
 fists+0.032
 noses+0.032
Neg logit contributions
h-0.013
iw-0.013
on-0.013
yl-0.012
Happy-0.011
Token Tim
Token ID5045
Feature activation+4.10e-01

Pos logit contributions
 trigger+0.046
 heads+0.044
rated+0.043
 fists+0.043
 noses+0.041
Neg logit contributions
iw-0.021
h-0.021
on-0.021
yl-0.020
ott-0.020
Tokenmy
Token ID1820
Feature activation-1.25e+00

Pos logit contributions
 trigger+0.014
 heads+0.013
 fists+0.013
rated+0.013
 noses+0.013
Neg logit contributions
h-0.005
on-0.005
iw-0.004
yl-0.004
ott-0.004
Token and
Token ID290
Feature activation+5.17e-01

Pos logit contributions
on+0.013
h+0.013
iw+0.011
yl+0.011
ott+0.010
Neg logit contributions
 trigger-0.044
 heads-0.043
rated-0.043
 fists-0.041
 noses-0.040
Token time
Token ID640
Feature activation+3.15e-01

Pos logit contributions
 trigger+0.023
 heads+0.022
 fists+0.021
rated+0.021
 noses+0.020
Neg logit contributions
h-0.016
yl-0.015
iw-0.015
on-0.015
ris-0.014
Token,
Token ID11
Feature activation-6.29e-01

Pos logit contributions
 heads+0.009
 trigger+0.009
 fists+0.009
 noses+0.009
 tails+0.009
Neg logit contributions
iw-0.005
ott-0.004
Happy-0.004
h-0.004
on-0.004
Token and
Token ID290
Feature activation+2.41e-02

Pos logit contributions
iw+0.009
 she+0.009
h+0.008
on+0.008
yl+0.008
Neg logit contributions
 trigger-0.021
rated-0.019
 heads-0.019
 fists-0.018
imming-0.017
Token it
Token ID340
Feature activation+1.13e+00

Pos logit contributions
 trigger+0.001
rated+0.001
 heads+0.001
 fists+0.001
lling+0.001
Neg logit contributions
on-0.000
h-0.000
iw-0.000
 she-0.000
yl-0.000
Token made
Token ID925
Feature activation+1.43e+00

Pos logit contributions
 trigger+0.037
 fists+0.035
 heads+0.035
 noses+0.033
rated+0.033
Neg logit contributions
ott-0.015
h-0.014
on-0.014
yl-0.013
iw-0.013
Token them
Token ID606
Feature activation+4.49e+00

Pos logit contributions
 trigger+0.037
 heads+0.035
 fists+0.034
rated+0.034
 noses+0.032
Neg logit contributions
h-0.028
on-0.028
yl-0.027
Happy-0.027
iw-0.027
Token feel
Token ID1254
Feature activation+4.80e-01

Pos logit contributions
 trigger+0.115
 heads+0.108
 fists+0.104
rated+0.102
 noses+0.101
Neg logit contributions
h-0.085
on-0.083
iw-0.082
ott-0.080
Happy-0.080
Token blue
Token ID4171
Feature activation+3.11e-01

Pos logit contributions
 trigger+0.014
 heads+0.013
 fists+0.013
rated+0.013
 noses+0.013
Neg logit contributions
iw-0.008
h-0.008
on-0.007
ott-0.007
yl-0.007
Token.
Token ID13
Feature activation+1.87e+00

Pos logit contributions
 trigger+0.011
 heads+0.011
 fists+0.011
rated+0.010
 noses+0.010
Neg logit contributions
iw-0.003
h-0.003
on-0.003
yl-0.003
ott-0.003
Token Then
Token ID3244
Feature activation+1.73e+00

Pos logit contributions
 trigger+0.060
 heads+0.057
 fists+0.056
rated+0.055
 noses+0.054
Neg logit contributions
iw-0.026
h-0.025
on-0.025
yl-0.024
ott-0.024
Token one
Token ID530
Feature activation+9.04e-01

Pos logit contributions
 trigger+0.063
 heads+0.062
 fists+0.060
 noses+0.059
rated+0.059
Neg logit contributions
iw-0.016
h-0.015
ott-0.014
on-0.014
yl-0.014
Token duck
Token ID22045
Feature activation+2.62e+00

Pos logit contributions
rated+0.041
 fists+0.038
 trigger+0.038
lling+0.038
 heads+0.037
Neg logit contributions
iw-0.030
h-0.030
yl-0.030
on-0.029
ott-0.028
Token said
Token ID531
Feature activation+4.77e-01

Pos logit contributions
 trigger+0.077
 heads+0.072
 fists+0.072
rated+0.070
 noses+0.068
Neg logit contributions
h-0.043
on-0.041
iw-0.041
yl-0.039
ott-0.039
Token "
Token ID366
Feature activation+4.26e-01

Pos logit contributions
 trigger+0.015
 heads+0.014
 fists+0.014
 noses+0.014
 faces+0.014
Neg logit contributions
iw-0.007
ris-0.007
ott-0.006
Happy-0.005
yl-0.005
TokenH
Token ID39
Feature activation+1.98e+00

Pos logit contributions
 trigger+0.013
 heads+0.013
 fists+0.012
 noses+0.012
rated+0.012
Neg logit contributions
h-0.007
iw-0.007
on-0.006
Happy-0.006
yl-0.006
Tokenats
Token ID1381
Feature activation+3.21e+00

Pos logit contributions
 trigger+0.050
 heads+0.049
rated+0.047
 fists+0.046
 noses+0.045
Neg logit contributions
Happy-0.037
iw-0.037
on-0.036
ott-0.035
yl-0.035
Tokench
Token ID354
Feature activation+4.47e+00

Pos logit contributions
 trigger+0.094
 heads+0.091
 fists+0.089
rated+0.087
 noses+0.086
Neg logit contributions
h-0.051
iw-0.049
on-0.049
yl-0.047
ott-0.045
Token!".
Token ID48220
Feature activation+1.01e-01

Pos logit contributions
 trigger+0.125
 heads+0.120
 fists+0.115
rated+0.114
 noses+0.112
Neg logit contributions
h-0.080
on-0.078
iw-0.077
yl-0.075
ott-0.071
Token She
Token ID1375
Feature activation-2.68e-01

Pos logit contributions
 heads+0.003
 trigger+0.003
 fists+0.003
 noses+0.003
 faces+0.003
Neg logit contributions
iw-0.002
prise-0.002
yl-0.002
Happy-0.002
ris-0.002
Token flew
Token ID13112
Feature activation+1.77e+00

Pos logit contributions
h+0.004
iw+0.004
yl+0.004
on+0.004
ott+0.004
Neg logit contributions
 trigger-0.008
 heads-0.007
 fists-0.007
rated-0.007
 noses-0.007
Token up
Token ID510
Feature activation+6.65e-01

Pos logit contributions
 trigger+0.040
 heads+0.038
 fists+0.036
rated+0.036
 noses+0.035
Neg logit contributions
iw-0.041
h-0.040
on-0.039
yl-0.039
ott-0.038
Token in
Token ID287
Feature activation+6.54e-01

Pos logit contributions
 heads+0.017
 trigger+0.016
 fists+0.015
 noses+0.015
 lawn+0.014
Neg logit contributions
prise-0.013
on-0.013
iw-0.013
Happy-0.013
ott-0.012
Token to
Token ID284
Feature activation-5.44e-02

Pos logit contributions
 trigger+0.023
rated+0.022
 heads+0.021
lling+0.020
 fists+0.020
Neg logit contributions
h-0.003
on-0.001
iw-0.001
yl-0.001
ott-0.001
Token argue
Token ID7267
Feature activation+2.11e+00

Pos logit contributions
h+0.001
iw+0.001
ott+0.001
yl+0.001
on+0.001
Neg logit contributions
 trigger-0.001
 fists-0.001
 heads-0.001
rated-0.001
 noses-0.001
Token and
Token ID290
Feature activation-5.48e-02

Pos logit contributions
 trigger+0.082
 heads+0.074
 goats+0.074
 fists+0.074
rated+0.072
Neg logit contributions
h-0.019
on-0.017
 together-0.015
,-0.013
?-0.012
Token it
Token ID340
Feature activation+1.26e+00

Pos logit contributions
h+0.001
on+0.001
iw+0.001
yl+0.001
ott+0.001
Neg logit contributions
 trigger-0.001
rated-0.001
 heads-0.001
 fists-0.001
 noses-0.001
Token made
Token ID925
Feature activation+1.36e+00

Pos logit contributions
 trigger+0.041
 heads+0.038
 fists+0.037
rated+0.036
 noses+0.036
Neg logit contributions
h-0.017
yl-0.015
on-0.015
ott-0.015
br-0.014
Token them
Token ID606
Feature activation+4.47e+00

Pos logit contributions
 trigger+0.043
 heads+0.042
rated+0.042
 fists+0.040
 noses+0.038
Neg logit contributions
h-0.018
on-0.017
Happy-0.016
yl-0.015
ott-0.015
Token sad
Token ID6507
Feature activation+1.88e+00

Pos logit contributions
 trigger+0.120
 heads+0.107
 goats+0.107
rated+0.105
 fists+0.103
Neg logit contributions
h-0.081
 together-0.075
on-0.074
 she-0.074
 much-0.073
Token.
Token ID13
Feature activation+7.55e-01

Pos logit contributions
 trigger+0.070
 heads+0.068
 fists+0.066
rated+0.065
 noses+0.064
Neg logit contributions
h-0.015
iw-0.015
on-0.013
yl-0.013
ott-0.013
Token
Token ID220
Feature activation-5.05e-01

Pos logit contributions
 trigger+0.028
 heads+0.027
rated+0.026
 fists+0.026
 noses+0.025
Neg logit contributions
iw-0.007
on-0.006
h-0.006
ott-0.006
yl-0.006
Token\n
Token ID198
Feature activation+7.87e-01

Pos logit contributions
iw+0.003
h+0.003
on+0.003
ott+0.002
yl+0.002
Neg logit contributions
 trigger-0.020
 heads-0.020
 fists-0.019
rated-0.019
 noses-0.019
Token\n
Token ID198
Feature activation-6.97e-01

Pos logit contributions
 trigger+0.028
 heads+0.027
 fists+0.026
rated+0.026
 noses+0.026
Neg logit contributions
iw-0.008
h-0.007
yl-0.007
on-0.007
ott-0.006
Token
Token ID26391
Feature activation-1.51e+00

Pos logit contributions
rated+0.001
 heads+0.001
 trigger+0.001
lling+0.001
 fists+0.001
Neg logit contributions
iw-0.001
yl-0.001
ott-0.001
h-0.001
Happy-0.001
Token he
Token ID339
Feature activation+5.79e-01

Pos logit contributions
iw+0.028
h+0.027
ott+0.027
yl+0.026
on+0.025
Neg logit contributions
 heads-0.038
 trigger-0.037
 fists-0.036
 noses-0.036
rated-0.035
Token said
Token ID531
Feature activation-2.69e-01

Pos logit contributions
 heads+0.015
 trigger+0.015
rated+0.014
 fists+0.014
 noses+0.014
Neg logit contributions
iw-0.011
h-0.011
on-0.010
yl-0.010
ott-0.010
Token.
Token ID13
Feature activation+4.31e-01

Pos logit contributions
iw+0.004
ott+0.004
ris+0.003
yl+0.003
h+0.003
Neg logit contributions
 trigger-0.008
 heads-0.008
 fists-0.008
 noses-0.008
 faces-0.008
Token The
Token ID383
Feature activation+9.80e-01

Pos logit contributions
 trigger+0.011
 heads+0.010
 fists+0.010
 noses+0.010
 grill+0.010
Neg logit contributions
h-0.008
yl-0.008
iw-0.008
prise-0.008
on-0.008
Token other
Token ID584
Feature activation+4.47e+00

Pos logit contributions
rated+0.023
 trigger+0.022
 fists+0.021
 heads+0.021
 footing+0.020
Neg logit contributions
iw-0.020
on-0.019
Happy-0.019
ott-0.019
br-0.018
Token mouse
Token ID10211
Feature activation+1.85e+00

Pos logit contributions
 trigger+0.116
rated+0.107
 fists+0.103
 footing+0.102
 heads+0.096
Neg logit contributions
iw-0.085
house-0.080
h-0.079
yl-0.079
Happy-0.078
Token agreed
Token ID4987
Feature activation+4.43e-01

Pos logit contributions
 trigger+0.051
 heads+0.050
 fists+0.048
rated+0.047
 noses+0.047
Neg logit contributions
iw-0.032
h-0.032
on-0.030
yl-0.030
ott-0.030
Token,
Token ID11
Feature activation+1.00e+00

Pos logit contributions
 trigger+0.013
 heads+0.013
 fists+0.013
rated+0.012
 noses+0.012
Neg logit contributions
iw-0.007
h-0.006
ott-0.006
on-0.006
yl-0.006
Token and
Token ID290
Feature activation-1.54e-01

Pos logit contributions
 trigger+0.028
 heads+0.027
 fists+0.026
 noses+0.026
rated+0.025
Neg logit contributions
iw-0.018
h-0.017
yl-0.017
ott-0.016
on-0.016
Token so
Token ID523
Feature activation+8.21e-01

Pos logit contributions
iw+0.003
h+0.003
on+0.002
yl+0.002
ott+0.002
Neg logit contributions
 trigger-0.004
 heads-0.004
 fists-0.004
rated-0.004
 noses-0.004
Token.
Token ID13
Feature activation-3.63e-01

Pos logit contributions
 trigger+0.003
 heads+0.003
 noses+0.002
rated+0.002
 faces+0.002
Neg logit contributions
iw-0.001
on-0.001
ott-0.001
Happy-0.001
yl-0.001
Token The
Token ID383
Feature activation+2.01e-01

Pos logit contributions
h+0.009
on+0.009
iw+0.009
yl+0.008
prise+0.008
Neg logit contributions
 trigger-0.008
 heads-0.008
 fists-0.007
 noses-0.007
 goats-0.007
Token girl
Token ID2576
Feature activation-5.25e-01

Pos logit contributions
 trigger+0.005
rated+0.004
 fists+0.004
 heads+0.004
 noses+0.004
Neg logit contributions
yl-0.005
Happy-0.004
h-0.004
iw-0.004
on-0.004
Token asked
Token ID1965
Feature activation+8.08e-01

Pos logit contributions
iw+0.011
h+0.010
ott+0.010
on+0.010
Happy+0.009
Neg logit contributions
 heads-0.013
 trigger-0.013
 noses-0.012
 fists-0.012
rated-0.012
Token what
Token ID644
Feature activation+7.54e-01

Pos logit contributions
 trigger+0.025
 heads+0.024
 fists+0.023
rated+0.023
 noses+0.023
Neg logit contributions
iw-0.012
h-0.011
yl-0.011
on-0.011
ott-0.011
Token had
Token ID550
Feature activation-5.96e+00

Pos logit contributions
 trigger+0.023
 heads+0.023
 fists+0.022
 noses+0.022
rated+0.021
Neg logit contributions
iw-0.010
h-0.010
ott-0.009
on-0.009
yl-0.009
Token happened
Token ID3022
Feature activation-3.43e+00

Pos logit contributions
iw+0.126
h+0.122
on+0.119
yl+0.118
ott+0.117
Neg logit contributions
 trigger-0.146
 heads-0.141
 fists-0.134
rated-0.132
 noses-0.130
Token to
Token ID284
Feature activation-2.88e+00

Pos logit contributions
iw+0.050
h+0.046
on+0.045
yl+0.045
ott+0.045
Neg logit contributions
 trigger-0.105
 heads-0.103
 fists-0.099
rated-0.098
 noses-0.097
Token the
Token ID262
Feature activation-1.26e-01

Pos logit contributions
iw+0.048
h+0.046
on+0.045
yl+0.044
ott+0.044
Neg logit contributions
 trigger-0.083
 heads-0.080
 fists-0.077
 noses-0.075
rated-0.075
Token sky
Token ID6766
Feature activation-8.09e-01

Pos logit contributions
h+0.002
iw+0.002
yl+0.002
Happy+0.002
on+0.002
Neg logit contributions
 trigger-0.003
 fists-0.003
rated-0.003
 heads-0.003
 noses-0.003
Token,
Token ID11
Feature activation-6.40e-01

Pos logit contributions
iw+0.012
ott+0.011
yl+0.011
on+0.011
h+0.011
Neg logit contributions
 trigger-0.024
 heads-0.024
rated-0.023
 fists-0.023
 noses-0.023
Token was
Token ID373
Feature activation+6.46e-01

Pos logit contributions
iw+0.013
h+0.013
on+0.012
yl+0.012
ott+0.012
Neg logit contributions
 trigger-0.023
 heads-0.022
 fists-0.021
rated-0.021
 noses-0.020
Token curious
Token ID11040
Feature activation+3.83e-01

Pos logit contributions
 trigger+0.020
 heads+0.019
 fists+0.018
rated+0.018
 noses+0.018
Neg logit contributions
iw-0.010
h-0.009
on-0.009
yl-0.009
ott-0.009
Token about
Token ID546
Feature activation-1.23e+00

Pos logit contributions
 trigger+0.012
 heads+0.012
 fists+0.011
rated+0.011
 noses+0.011
Neg logit contributions
h-0.005
iw-0.005
on-0.005
yl-0.005
ott-0.005
Token what
Token ID644
Feature activation-2.24e-01

Pos logit contributions
iw+0.017
h+0.017
yl+0.017
on+0.016
ott+0.016
Neg logit contributions
 trigger-0.040
 heads-0.037
rated-0.037
 fists-0.036
 noses-0.034
Token grandma
Token ID49890
Feature activation-2.13e+00

Pos logit contributions
iw+0.004
h+0.003
on+0.003
yl+0.003
ott+0.003
Neg logit contributions
 trigger-0.007
 heads-0.006
 fists-0.006
 noses-0.006
rated-0.006
Token had
Token ID550
Feature activation-5.90e+00

Pos logit contributions
iw+0.035
h+0.034
on+0.032
yl+0.032
ott+0.032
Neg logit contributions
 trigger-0.061
 heads-0.060
 fists-0.057
rated-0.057
 noses-0.056
Token in
Token ID287
Feature activation-1.50e+00

Pos logit contributions
iw+0.106
on+0.093
ott+0.092
h+0.091
br+0.087
Neg logit contributions
 heads-0.162
 trigger-0.157
rated-0.152
 noses-0.148
 fists-0.147
Token her
Token ID607
Feature activation-1.39e+00

Pos logit contributions
iw+0.022
h+0.021
on+0.021
yl+0.021
ott+0.020
Neg logit contributions
 trigger-0.046
 heads-0.045
 fists-0.043
rated-0.043
 noses-0.042
Token house
Token ID2156
Feature activation-2.87e+00

Pos logit contributions
iw+0.028
h+0.028
yl+0.027
on+0.027
ott+0.027
Neg logit contributions
 trigger-0.035
 heads-0.033
 fists-0.032
rated-0.032
 noses-0.031
Token.
Token ID13
Feature activation+7.42e-01

Pos logit contributions
iw+0.032
h+0.029
yl+0.028
on+0.028
ott+0.028
Neg logit contributions
 trigger-0.098
 heads-0.097
 fists-0.093
 noses-0.092
rated-0.092
Token She
Token ID1375
Feature activation-1.60e+00

Pos logit contributions
 trigger+0.030
 heads+0.029
rated+0.029
 fists+0.028
lling+0.028
Neg logit contributions
iw-0.004
ott-0.004
h-0.004
on-0.003
yl-0.003
Token He
Token ID679
Feature activation-4.47e-02

Pos logit contributions
iw+0.013
h+0.012
yl+0.012
on+0.012
Happy+0.012
Neg logit contributions
 trigger-0.019
 heads-0.018
 fists-0.018
rated-0.017
 noses-0.017
Token asked
Token ID1965
Feature activation+7.10e-01

Pos logit contributions
iw+0.001
on+0.001
h+0.001
yl+0.001
ott+0.001
Neg logit contributions
 trigger-0.001
 heads-0.001
 fists-0.001
rated-0.001
 noses-0.001
Token Brown
Token ID4373
Feature activation+3.12e-01

Pos logit contributions
 trigger+0.019
 heads+0.018
 fists+0.017
rated+0.017
 noses+0.017
Neg logit contributions
iw-0.014
h-0.013
yl-0.013
ott-0.013
on-0.013
Tokenie
Token ID494
Feature activation-2.05e-01

Pos logit contributions
 trigger+0.009
 heads+0.008
 fists+0.008
rated+0.008
 noses+0.008
Neg logit contributions
iw-0.006
h-0.005
Happy-0.005
yl-0.005
on-0.005
Token what
Token ID644
Feature activation+3.80e-01

Pos logit contributions
iw+0.003
h+0.003
on+0.003
yl+0.003
ott+0.003
Neg logit contributions
 trigger-0.007
 heads-0.006
rated-0.006
 fists-0.006
 noses-0.006
Token had
Token ID550
Feature activation-5.89e+00

Pos logit contributions
 trigger+0.010
 heads+0.009
 fists+0.009
 noses+0.009
rated+0.009
Neg logit contributions
iw-0.007
h-0.007
yl-0.006
Happy-0.006
ott-0.006
Token happened
Token ID3022
Feature activation-3.56e+00

Pos logit contributions
iw+0.139
h+0.130
yl+0.126
on+0.126
ott+0.125
Neg logit contributions
 trigger-0.131
 heads-0.126
 fists-0.120
rated-0.119
 noses-0.117
Token,
Token ID11
Feature activation-9.47e-01

Pos logit contributions
iw+0.045
h+0.044
on+0.042
yl+0.042
ott+0.041
Neg logit contributions
 trigger-0.118
 heads-0.113
 fists-0.110
rated-0.109
 noses-0.107
Token and
Token ID290
Feature activation-1.36e+00

Pos logit contributions
iw+0.018
yl+0.017
ott+0.017
Happy+0.017
on+0.016
Neg logit contributions
 trigger-0.025
 heads-0.025
 fists-0.024
 noses-0.023
rated-0.023
Token the
Token ID262
Feature activation-4.29e-01

Pos logit contributions
iw+0.025
br+0.025
ris+0.024
ott+0.024
Happy+0.023
Neg logit contributions
 heads-0.034
 trigger-0.034
 fists-0.032
 noses-0.031
rated-0.031
Token horse
Token ID8223
Feature activation+1.99e+00

Pos logit contributions
iw+0.010
h+0.010
on+0.009
yl+0.009
Happy+0.009
Neg logit contributions
 trigger-0.010
 heads-0.009
 fists-0.009
rated-0.009
 noses-0.009
Tokenly
Token ID306
Feature activation-5.96e-01

Pos logit contributions
 trigger+0.002
 heads+0.002
 fists+0.002
rated+0.002
 noses+0.002
Neg logit contributions
h-0.002
iw-0.002
yl-0.002
on-0.002
ott-0.002
Token showing
Token ID4478
Feature activation+9.71e-01

Pos logit contributions
iw+0.009
h+0.008
ott+0.008
yl+0.008
Happy+0.008
Neg logit contributions
 trigger-0.018
 heads-0.018
 fists-0.017
 noses-0.017
rated-0.017
Token her
Token ID607
Feature activation+3.03e-01

Pos logit contributions
 trigger+0.029
 heads+0.028
rated+0.027
 fists+0.027
 noses+0.026
Neg logit contributions
h-0.015
iw-0.014
on-0.014
yl-0.014
ott-0.014
Token what
Token ID644
Feature activation-3.46e-01

Pos logit contributions
 trigger+0.009
 heads+0.009
rated+0.009
 fists+0.009
 noses+0.008
Neg logit contributions
iw-0.004
h-0.004
on-0.004
yl-0.004
ott-0.004
Token he
Token ID339
Feature activation-2.10e+00

Pos logit contributions
iw+0.006
h+0.006
ott+0.005
yl+0.005
ris+0.005
Neg logit contributions
 trigger-0.009
 heads-0.009
 fists-0.009
 noses-0.009
 faces-0.009
Token had
Token ID550
Feature activation-5.86e+00

Pos logit contributions
iw+0.048
h+0.046
yl+0.043
br+0.043
ott+0.042
Neg logit contributions
 heads-0.045
rated-0.043
 trigger-0.041
lling-0.039
 fists-0.039
Token written
Token ID3194
Feature activation-3.06e+00

Pos logit contributions
iw+0.116
ott+0.102
h+0.100
br+0.099
on+0.098
Neg logit contributions
 heads-0.146
 trigger-0.139
rated-0.138
lling-0.136
 noses-0.134
Token.
Token ID13
Feature activation+9.20e-02

Pos logit contributions
iw+0.050
ott+0.045
h+0.043
yl+0.043
Happy+0.040
Neg logit contributions
 trigger-0.091
 heads-0.088
 fists-0.085
rated-0.084
 noses-0.082
Token\n
Token ID198
Feature activation+5.96e-01

Pos logit contributions
 trigger+0.003
 heads+0.003
 fists+0.003
 noses+0.003
rated+0.003
Neg logit contributions
iw-0.001
h-0.001
yl-0.001
Happy-0.001
ott-0.001
Token\n
Token ID198
Feature activation-6.57e-01

Pos logit contributions
 trigger+0.020
 heads+0.019
 fists+0.019
rated+0.019
 noses+0.018
Neg logit contributions
iw-0.007
h-0.006
yl-0.006
ott-0.005
on-0.005
TokenHis
Token ID6653
Feature activation+1.13e+00

Pos logit contributions
h+0.009
iw+0.009
on+0.009
yl+0.008
Happy+0.008
Neg logit contributions
 trigger-0.021
 heads-0.021
 fists-0.020
rated-0.020
 noses-0.019
Token and
Token ID290
Feature activation-4.69e-01

Pos logit contributions
iw+0.018
h+0.017
on+0.016
yl+0.016
ott+0.016
Neg logit contributions
 trigger-0.048
 heads-0.047
 fists-0.045
rated-0.045
 noses-0.044
Token told
Token ID1297
Feature activation+4.32e-01

Pos logit contributions
iw+0.008
h+0.008
on+0.007
yl+0.007
ott+0.007
Neg logit contributions
 trigger-0.013
 heads-0.013
 fists-0.013
 noses-0.012
rated-0.012
Token her
Token ID607
Feature activation-1.23e+00

Pos logit contributions
 trigger+0.014
rated+0.014
 heads+0.014
 fists+0.013
lling+0.013
Neg logit contributions
h-0.006
yl-0.005
on-0.005
ott-0.005
iw-0.005
Token what
Token ID644
Feature activation-2.15e-01

Pos logit contributions
iw+0.020
h+0.019
on+0.019
yl+0.019
ott+0.018
Neg logit contributions
 trigger-0.036
 heads-0.035
 fists-0.033
rated-0.033
 noses-0.033
Token Ben
Token ID3932
Feature activation-5.91e-01

Pos logit contributions
iw+0.004
h+0.004
on+0.004
yl+0.004
ott+0.004
Neg logit contributions
 trigger-0.006
 heads-0.005
 fists-0.005
rated-0.005
 noses-0.005
Token had
Token ID550
Feature activation-5.86e+00

Pos logit contributions
on+0.008
h+0.007
yl+0.007
iw+0.007
ott+0.007
Neg logit contributions
 trigger-0.019
 fists-0.018
 heads-0.018
rated-0.018
 noses-0.017
Token done
Token ID1760
Feature activation-3.78e+00

Pos logit contributions
iw+0.105
h+0.097
on+0.096
ott+0.095
yl+0.094
Neg logit contributions
 heads-0.156
 trigger-0.156
 noses-0.147
 fists-0.146
rated-0.145
Token.
Token ID13
Feature activation-3.36e-01

Pos logit contributions
h+0.030
iw+0.029
on+0.028
yl+0.027
ott+0.025
Neg logit contributions
 trigger-0.143
 heads-0.138
 fists-0.135
rated-0.133
 noses-0.132
Token\n
Token ID198
Feature activation+1.90e-01

Pos logit contributions
iw+0.004
on+0.004
h+0.004
yl+0.004
ris+0.003
Neg logit contributions
 trigger-0.011
 heads-0.011
 fists-0.011
 noses-0.011
 faces-0.010
Token\n
Token ID198
Feature activation-1.24e+00

Pos logit contributions
 trigger+0.006
 heads+0.006
 fists+0.006
 noses+0.006
 faces+0.006
Neg logit contributions
iw-0.002
h-0.002
yl-0.002
on-0.002
ott-0.002
TokenHer
Token ID9360
Feature activation-8.12e-01

Pos logit contributions
iw+0.020
h+0.018
yl+0.018
on+0.018
ott+0.018
Neg logit contributions
 trigger-0.036
 heads-0.036
 fists-0.034
 noses-0.033
rated-0.032
Token What
Token ID1867
Feature activation+1.79e+00

Pos logit contributions
prise+0.008
h+0.008
yl+0.008
iw+0.008
on+0.008
Neg logit contributions
 heads-0.011
 trigger-0.010
 fists-0.010
 lawn-0.010
 noses-0.010
Token an
Token ID281
Feature activation-1.96e+00

Pos logit contributions
 trigger+0.048
 heads+0.047
 fists+0.046
rated+0.045
 noses+0.045
Neg logit contributions
iw-0.032
h-0.030
yl-0.028
ott-0.028
on-0.027
Token amazing
Token ID4998
Feature activation-1.58e+00

Pos logit contributions
iw+0.041
h+0.040
on+0.039
yl+0.039
ott+0.038
Neg logit contributions
 trigger-0.049
 heads-0.047
 fists-0.044
rated-0.044
 noses-0.043
Token adventure
Token ID8855
Feature activation-1.68e+00

Pos logit contributions
iw+0.034
h+0.033
on+0.032
ott+0.032
yl+0.032
Neg logit contributions
 trigger-0.038
 heads-0.037
 fists-0.035
rated-0.034
 noses-0.034
Token she
Token ID673
Feature activation-2.90e+00

Pos logit contributions
iw+0.023
h+0.021
ott+0.021
yl+0.020
on+0.020
Neg logit contributions
 trigger-0.052
 heads-0.051
rated-0.050
 fists-0.049
 noses-0.048
Token had
Token ID550
Feature activation-5.84e+00

Pos logit contributions
iw+0.060
h+0.057
on+0.055
ott+0.054
yl+0.053
Neg logit contributions
 heads-0.071
 trigger-0.069
rated-0.069
 fists-0.064
lling-0.063
Token!
Token ID0
Feature activation-1.06e+00

Pos logit contributions
iw+0.101
ott+0.089
on+0.088
h+0.088
yl+0.084
Neg logit contributions
 heads-0.160
 trigger-0.157
rated-0.152
lling-0.148
 fists-0.147
Token<|endoftext|>
Token ID50256
Feature activation-5.35e-01

Pos logit contributions
h+0.018
yl+0.017
iw+0.017
prise+0.017
br+0.017
Neg logit contributions
 trigger-0.026
 fists-0.026
 heads-0.025
 lawn-0.024
 goats-0.024
TokenAndrew
Token ID20508
Feature activation-9.41e-01

Pos logit contributions
iw+0.008
h+0.008
yl+0.008
prise+0.007
ott+0.007
Neg logit contributions
 trigger-0.015
 heads-0.014
rated-0.013
 fists-0.013
 lawn-0.013
Token was
Token ID373
Feature activation-1.04e+00

Pos logit contributions
iw+0.018
h+0.017
yl+0.016
Happy+0.016
ott+0.016
Neg logit contributions
 heads-0.025
 trigger-0.024
rated-0.023
 noses-0.023
 fists-0.022
Token so
Token ID523
Feature activation-1.32e+00

Pos logit contributions
iw+0.020
h+0.020
yl+0.019
on+0.019
ott+0.019
Neg logit contributions
 trigger-0.027
 heads-0.026
 fists-0.025
rated-0.024
 noses-0.024
Token dad
Token ID9955
Feature activation-1.85e+00

Pos logit contributions
 trigger+0.021
rated+0.019
 heads+0.019
 fists+0.018
 noses+0.018
Neg logit contributions
iw-0.019
yl-0.019
on-0.019
h-0.019
ott-0.018
Token asked
Token ID1965
Feature activation-2.87e-01

Pos logit contributions
iw+0.037
h+0.032
ott+0.032
Happy+0.031
yl+0.030
Neg logit contributions
 heads-0.047
 trigger-0.046
 fists-0.044
lling-0.044
rated-0.043
Token Jacob
Token ID12806
Feature activation-1.99e-01

Pos logit contributions
h+0.005
iw+0.005
on+0.005
yl+0.005
ott+0.005
Neg logit contributions
 trigger-0.008
 heads-0.008
 fists-0.007
rated-0.007
 noses-0.007
Token what
Token ID644
Feature activation+1.74e-01

Pos logit contributions
iw+0.003
h+0.003
yl+0.003
ott+0.003
Happy+0.003
Neg logit contributions
 trigger-0.006
 heads-0.006
 fists-0.005
rated-0.005
 noses-0.005
Token he
Token ID339
Feature activation-2.93e+00

Pos logit contributions
 trigger+0.005
 heads+0.005
 fists+0.005
 noses+0.005
rated+0.005
Neg logit contributions
iw-0.003
h-0.003
ott-0.003
yl-0.003
Happy-0.002
Token had
Token ID550
Feature activation-5.78e+00

Pos logit contributions
iw+0.068
h+0.062
yl+0.059
Happy+0.058
ott+0.058
Neg logit contributions
 heads-0.065
rated-0.063
 trigger-0.060
 fists-0.058
lling-0.056
Token done
Token ID1760
Feature activation-4.52e+00

Pos logit contributions
iw+0.139
h+0.118
ott+0.116
br+0.112
yl+0.112
Neg logit contributions
 heads-0.124
 fists-0.117
 trigger-0.116
 faces-0.116
lling-0.115
Token.
Token ID13
Feature activation-8.52e-01

Pos logit contributions
iw+0.049
h+0.046
yl+0.044
on+0.043
ott+0.043
Neg logit contributions
 trigger-0.156
 heads-0.152
 fists-0.147
rated-0.146
 noses-0.144
Token Jacob
Token ID12806
Feature activation+3.40e-02

Pos logit contributions
iw+0.019
h+0.018
yl+0.018
ott+0.018
on+0.018
Neg logit contributions
 trigger-0.020
 heads-0.020
 fists-0.019
 noses-0.018
rated-0.018
Token said
Token ID531
Feature activation-1.02e+00

Pos logit contributions
 heads+0.001
 fists+0.001
 noses+0.001
 trigger+0.001
rated+0.001
Neg logit contributions
iw-0.001
Happy-0.001
br-0.001
h-0.001
ott-0.001
Token he
Token ID339
Feature activation-9.83e-01

Pos logit contributions
iw+0.016
ott+0.014
ris+0.013
yl+0.013
h+0.012
Neg logit contributions
 trigger-0.031
 heads-0.030
 fists-0.030
 faces-0.029
 noses-0.029
Token Her
Token ID2332
Feature activation-8.06e-02

Pos logit contributions
iw+0.004
yl+0.004
h+0.004
on+0.004
Happy+0.003
Neg logit contributions
 heads-0.008
 trigger-0.008
 fists-0.008
 noses-0.008
rated-0.007
Token mom
Token ID1995
Feature activation-1.59e+00

Pos logit contributions
yl+0.002
iw+0.001
ott+0.001
on+0.001
h+0.001
Neg logit contributions
 trigger-0.002
rated-0.002
 heads-0.002
lling-0.002
 fists-0.002
Token asked
Token ID1965
Feature activation+2.58e-01

Pos logit contributions
iw+0.031
Happy+0.027
br+0.027
ott+0.026
yl+0.026
Neg logit contributions
 heads-0.042
 trigger-0.040
 fists-0.038
 noses-0.038
rated-0.037
Token her
Token ID607
Feature activation-1.45e+00

Pos logit contributions
 trigger+0.008
 heads+0.008
 fists+0.007
rated+0.007
 noses+0.007
Neg logit contributions
iw-0.004
h-0.004
on-0.004
yl-0.003
ott-0.003
Token what
Token ID644
Feature activation+2.80e-01

Pos logit contributions
iw+0.027
h+0.026
on+0.025
yl+0.025
ott+0.025
Neg logit contributions
 trigger-0.039
 heads-0.038
 fists-0.036
rated-0.035
 noses-0.035
Token had
Token ID550
Feature activation-5.77e+00

Pos logit contributions
 trigger+0.007
 heads+0.007
 fists+0.007
 noses+0.007
rated+0.007
Neg logit contributions
iw-0.005
h-0.005
yl-0.004
ris-0.004
on-0.004
Token happened
Token ID3022
Feature activation-3.31e+00

Pos logit contributions
iw+0.140
h+0.130
on+0.127
yl+0.123
ott+0.122
Neg logit contributions
 trigger-0.125
 heads-0.122
 fists-0.115
 noses-0.114
rated-0.111
Token and
Token ID290
Feature activation-2.00e+00

Pos logit contributions
iw+0.037
h+0.035
on+0.034
yl+0.033
ott+0.032
Neg logit contributions
 trigger-0.114
 heads-0.111
 fists-0.107
rated-0.106
 noses-0.105
Token the
Token ID262
Feature activation-8.79e-01

Pos logit contributions
iw+0.038
ris+0.037
br+0.036
h+0.034
Happy+0.034
Neg logit contributions
 heads-0.051
 trigger-0.047
 fists-0.047
 noses-0.046
rated-0.045
Token girl
Token ID2576
Feature activation-1.01e+00

Pos logit contributions
iw+0.015
on+0.015
yl+0.015
h+0.015
ott+0.014
Neg logit contributions
 trigger-0.024
 heads-0.024
 fists-0.023
rated-0.023
 noses-0.023
Token said
Token ID531
Feature activation-1.11e+00

Pos logit contributions
iw+0.019
h+0.017
Happy+0.017
on+0.016
br+0.016
Neg logit contributions
 heads-0.027
 trigger-0.026
 fists-0.025
 noses-0.025
rated-0.025
TokenMom
Token ID29252
Feature activation-2.02e+00

Pos logit contributions
iw+0.007
h+0.007
on+0.007
yl+0.007
ott+0.007
Neg logit contributions
 trigger-0.011
 heads-0.011
 fists-0.011
rated-0.010
 noses-0.010
Token,
Token ID11
Feature activation+2.12e+00

Pos logit contributions
Happy+0.033
iw+0.032
ott+0.032
br+0.029
yl+0.029
Neg logit contributions
 heads-0.058
 fists-0.053
 trigger-0.053
 noses-0.053
rated-0.051
Token look
Token ID804
Feature activation-9.38e-01

Pos logit contributions
 trigger+0.065
 heads+0.062
 fists+0.060
rated+0.059
 noses+0.059
Neg logit contributions
iw-0.032
h-0.031
on-0.030
yl-0.030
ott-0.029
Token what
Token ID644
Feature activation+2.14e-01

Pos logit contributions
iw+0.012
h+0.012
ott+0.011
yl+0.011
on+0.011
Neg logit contributions
 trigger-0.030
 heads-0.030
 fists-0.029
rated-0.028
 noses-0.028
Token I
Token ID314
Feature activation-2.25e+00

Pos logit contributions
 trigger+0.007
 heads+0.007
 fists+0.007
rated+0.007
 noses+0.007
Neg logit contributions
iw-0.002
h-0.002
yl-0.002
on-0.002
ott-0.002
Token have
Token ID423
Feature activation-5.76e+00

Pos logit contributions
iw+0.041
h+0.039
yl+0.038
on+0.037
ott+0.037
Neg logit contributions
 trigger-0.061
 heads-0.060
rated-0.057
 fists-0.056
 noses-0.055
Token!"
Token ID2474
Feature activation-1.41e+00

Pos logit contributions
iw+0.095
ott+0.095
h+0.088
yl+0.086
lu+0.084
Neg logit contributions
 heads-0.149
 trigger-0.146
rated-0.142
 noses-0.140
lling-0.138
Token Tom
Token ID4186
Feature activation-2.02e-01

Pos logit contributions
iw+0.032
h+0.032
ott+0.030
yl+0.030
on+0.030
Neg logit contributions
 trigger-0.033
 heads-0.031
 fists-0.030
 noses-0.030
 faces-0.029
Token said
Token ID531
Feature activation-2.49e-01

Pos logit contributions
iw+0.003
h+0.003
on+0.003
yl+0.003
ott+0.003
Neg logit contributions
 trigger-0.006
 heads-0.006
 fists-0.006
 noses-0.006
rated-0.006
Token.
Token ID13
Feature activation+2.82e-02

Pos logit contributions
iw+0.003
h+0.002
yl+0.002
on+0.002
ott+0.002
Neg logit contributions
 trigger-0.009
 heads-0.009
 fists-0.008
rated-0.008
 noses-0.008
Token He
Token ID679
Feature activation-1.08e+00

Pos logit contributions
 trigger+0.001
 heads+0.001
 fists+0.001
 noses+0.001
rated+0.001
Neg logit contributions
iw-0.000
h-0.000
yl-0.000
on-0.000
ott-0.000
Token.
Token ID13
Feature activation-3.15e-02

Pos logit contributions
 trigger+0.062
 heads+0.060
 fists+0.058
rated+0.058
 noses+0.057
Neg logit contributions
iw-0.018
h-0.017
on-0.016
yl-0.016
ott-0.015
Token He
Token ID679
Feature activation+1.76e-01

Pos logit contributions
iw+0.001
prise+0.001
yl+0.001
h+0.001
ig+0.001
Neg logit contributions
 heads-0.001
 fists-0.001
 trigger-0.001
rated-0.001
 noses-0.001
Token asked
Token ID1965
Feature activation+1.60e+00

Pos logit contributions
 heads+0.005
 trigger+0.004
 fists+0.004
rated+0.004
 noses+0.004
Neg logit contributions
iw-0.003
Happy-0.003
h-0.003
on-0.003
ott-0.003
Token John
Token ID1757
Feature activation+2.62e-01

Pos logit contributions
 trigger+0.042
 heads+0.041
 fists+0.039
rated+0.039
 noses+0.038
Neg logit contributions
iw-0.031
ott-0.029
h-0.028
yl-0.028
on-0.028
Token what
Token ID644
Feature activation+8.71e-01

Pos logit contributions
 trigger+0.008
 heads+0.008
 fists+0.007
rated+0.007
 noses+0.007
Neg logit contributions
iw-0.004
h-0.004
Happy-0.004
yl-0.004
on-0.004
Token had
Token ID550
Feature activation-5.76e+00

Pos logit contributions
 trigger+0.020
 heads+0.020
 fists+0.019
 noses+0.019
rated+0.019
Neg logit contributions
iw-0.018
h-0.017
yl-0.016
ott-0.016
Happy-0.016
Token happened
Token ID3022
Feature activation-3.28e+00

Pos logit contributions
iw+0.147
h+0.134
ott+0.128
yl+0.127
Happy+0.127
Neg logit contributions
 trigger-0.118
 heads-0.115
 fists-0.110
 noses-0.106
rated-0.105
Token and
Token ID290
Feature activation-1.40e+00

Pos logit contributions
iw+0.042
h+0.039
yl+0.038
on+0.038
ott+0.038
Neg logit contributions
 trigger-0.107
 heads-0.104
 fists-0.101
rated-0.099
 noses-0.098
Token John
Token ID1757
Feature activation+9.90e-01

Pos logit contributions
iw+0.028
ott+0.026
br+0.026
on+0.025
h+0.025
Neg logit contributions
 heads-0.035
 trigger-0.034
rated-0.032
 fists-0.031
 noses-0.031
Token told
Token ID1297
Feature activation+9.72e-01

Pos logit contributions
 heads+0.027
 trigger+0.026
rated+0.025
 fists+0.025
 noses+0.024
Neg logit contributions
iw-0.018
Happy-0.017
h-0.016
on-0.016
br-0.016
Token him
Token ID683
Feature activation+1.72e+00

Pos logit contributions
 trigger+0.024
 heads+0.023
 fists+0.022
rated+0.021
 noses+0.021
Neg logit contributions
iw-0.021
h-0.020
yl-0.019
ott-0.019
on-0.019
Token,
Token ID11
Feature activation+3.28e-02

Pos logit contributions
iw+0.001
h+0.001
ott+0.001
Happy+0.001
yl+0.001
Neg logit contributions
 trigger-0.001
 heads-0.001
 fists-0.001
 noses-0.001
rated-0.001
Token Anna
Token ID11735
Feature activation-1.13e+00

Pos logit contributions
 trigger+0.001
 heads+0.001
 noses+0.001
 fists+0.001
 faces+0.001
Neg logit contributions
iw-0.001
h-0.001
ris-0.001
ott-0.001
yl-0.001
Token,
Token ID11
Feature activation+1.19e+00

Pos logit contributions
iw+0.020
ott+0.019
Happy+0.019
aid+0.018
och+0.018
Neg logit contributions
 heads-0.030
rated-0.027
 fists-0.027
 noses-0.027
 trigger-0.026
Token what
Token ID644
Feature activation+6.23e-01

Pos logit contributions
 trigger+0.032
 heads+0.032
 fists+0.030
 noses+0.030
 faces+0.030
Neg logit contributions
iw-0.020
h-0.019
ris-0.019
ott-0.018
yl-0.018
Token I
Token ID314
Feature activation-1.71e+00

Pos logit contributions
 heads+0.014
 trigger+0.014
 fists+0.014
 noses+0.014
 faces+0.013
Neg logit contributions
iw-0.013
h-0.012
ott-0.012
ris-0.012
yl-0.012
Token have
Token ID423
Feature activation-5.75e+00

Pos logit contributions
iw+0.032
h+0.031
yl+0.029
Happy+0.028
br+0.028
Neg logit contributions
 trigger-0.043
 heads-0.041
rated-0.041
 fists-0.038
 noses-0.037
Token!"
Token ID2474
Feature activation-1.28e+00

Pos logit contributions
ott+0.105
iw+0.104
lu+0.097
sha+0.095
h+0.095
Neg logit contributions
 heads-0.120
 faces-0.114
 trigger-0.113
 attention-0.112
 noses-0.111
Token Sara
Token ID24799
Feature activation-9.12e-01

Pos logit contributions
iw+0.019
h+0.019
on+0.018
yl+0.018
ott+0.017
Neg logit contributions
 trigger-0.040
 heads-0.038
 fists-0.037
rated-0.037
 noses-0.036
Token says
Token ID1139
Feature activation-9.82e-02

Pos logit contributions
iw+0.018
h+0.018
yl+0.017
on+0.017
Happy+0.017
Neg logit contributions
 heads-0.023
 trigger-0.023
 fists-0.022
 noses-0.021
rated-0.021
Token,
Token ID11
Feature activation+4.61e-01

Pos logit contributions
iw+0.001
yl+0.001
h+0.001
ott+0.001
Happy+0.001
Neg logit contributions
 trigger-0.003
 heads-0.003
 fists-0.003
 noses-0.003
rated-0.003
Token showing
Token ID4478
Feature activation+1.13e+00

Pos logit contributions
 trigger+0.011
rated+0.010
 heads+0.010
 fists+0.009
 noses+0.009
Neg logit contributions
h-0.011
on-0.010
iw-0.010
yl-0.010
ott-0.010
TokenMax
Token ID11518
Feature activation+1.15e+00

Pos logit contributions
iw+0.003
h+0.003
on+0.003
yl+0.003
ott+0.003
Neg logit contributions
 trigger-0.004
 heads-0.004
 fists-0.004
rated-0.004
 noses-0.004
Token asked
Token ID1965
Feature activation+1.41e+00

Pos logit contributions
 trigger+0.031
 heads+0.030
 fists+0.028
rated+0.028
 noses+0.028
Neg logit contributions
iw-0.022
h-0.021
on-0.021
yl-0.020
ott-0.020
Token his
Token ID465
Feature activation+7.93e-01

Pos logit contributions
 trigger+0.037
 heads+0.036
 fists+0.035
rated+0.033
 noses+0.033
Neg logit contributions
iw-0.028
h-0.026
ott-0.025
yl-0.025
on-0.025
Token mom
Token ID1995
Feature activation-2.28e+00

Pos logit contributions
 trigger+0.020
rated+0.019
 heads+0.019
 fists+0.018
 noses+0.017
Neg logit contributions
iw-0.016
yl-0.016
on-0.015
ott-0.015
h-0.015
Token what
Token ID644
Feature activation+4.12e-01

Pos logit contributions
iw+0.033
h+0.032
on+0.031
yl+0.031
ott+0.030
Neg logit contributions
 trigger-0.071
 heads-0.069
 fists-0.066
rated-0.066
 noses-0.065
Token had
Token ID550
Feature activation-5.73e+00

Pos logit contributions
 trigger+0.011
 heads+0.011
 fists+0.010
 noses+0.010
 grill+0.010
Neg logit contributions
iw-0.007
h-0.007
yl-0.006
Happy-0.006
ott-0.006
Token happened
Token ID3022
Feature activation-3.18e+00

Pos logit contributions
iw+0.166
h+0.152
yl+0.144
och+0.141
ris+0.139
Neg logit contributions
 heads-0.091
 noses-0.089
 faces-0.089
 fists-0.089
 trigger-0.088
Token.
Token ID13
Feature activation-3.76e-02

Pos logit contributions
iw+0.037
h+0.036
on+0.035
yl+0.034
ott+0.033
Neg logit contributions
 trigger-0.108
 heads-0.104
 fists-0.101
rated-0.100
 noses-0.099
Token His
Token ID2399
Feature activation+1.09e+00

Pos logit contributions
iw+0.001
yl+0.001
prise+0.001
h+0.001
ris+0.001
Neg logit contributions
 trigger-0.001
 heads-0.001
 fists-0.001
 noses-0.001
 faces-0.001
Token mother
Token ID2802
Feature activation-1.45e+00

Pos logit contributions
 trigger+0.035
rated+0.034
lling+0.033
ping+0.031
 heads+0.030
Neg logit contributions
yl-0.015
on-0.015
ott-0.014
iw-0.014
h-0.013
Token told
Token ID1297
Feature activation+7.11e-01

Pos logit contributions
iw+0.028
Happy+0.026
br+0.024
h+0.024
ott+0.023
Neg logit contributions
 heads-0.037
 trigger-0.036
 fists-0.034
 noses-0.034
 faces-0.033
Token what
Token ID644
Feature activation-5.35e-01

Pos logit contributions
iw+0.023
h+0.022
yl+0.021
Happy+0.021
ott+0.021
Neg logit contributions
 trigger-0.044
 heads-0.044
 fists-0.042
 noses-0.041
rated-0.041
Token's
Token ID338
Feature activation-1.27e+00

Pos logit contributions
iw+0.008
h+0.008
on+0.007
yl+0.007
ott+0.007
Neg logit contributions
 trigger-0.017
 heads-0.016
 fists-0.015
rated-0.015
 noses-0.015
Token in
Token ID287
Feature activation-1.53e+00

Pos logit contributions
h+0.023
iw+0.022
yl+0.022
on+0.022
ott+0.021
Neg logit contributions
 trigger-0.035
 heads-0.033
 fists-0.032
rated-0.031
 noses-0.031
Token the
Token ID262
Feature activation-1.58e+00

Pos logit contributions
iw+0.024
h+0.023
on+0.022
yl+0.022
ott+0.022
Neg logit contributions
 trigger-0.045
 heads-0.044
 fists-0.043
rated-0.042
 noses-0.041
Token cold
Token ID4692
Feature activation-2.69e-01

Pos logit contributions
iw+0.030
h+0.029
yl+0.029
on+0.028
ott+0.027
Neg logit contributions
 trigger-0.043
 heads-0.041
 fists-0.039
rated-0.039
 noses-0.038
Token drawer
Token ID33451
Feature activation-5.72e+00

Pos logit contributions
iw+0.005
h+0.005
on+0.005
yl+0.004
ott+0.004
Neg logit contributions
 trigger-0.008
 heads-0.007
 fists-0.007
rated-0.007
 noses-0.007
Token?
Token ID30
Feature activation-2.11e+00

Pos logit contributions
iw+0.063
yl+0.058
ott+0.054
Happy+0.053
h+0.051
Neg logit contributions
 heads-0.192
 trigger-0.191
rated-0.187
 fists-0.180
 noses-0.180
Token Maybe
Token ID6674
Feature activation-1.31e+00

Pos logit contributions
yl+0.045
h+0.043
iw+0.043
ott+0.040
on+0.040
Neg logit contributions
 trigger-0.051
 heads-0.049
 fists-0.047
 noses-0.045
rated-0.043
Token it
Token ID340
Feature activation-1.65e-01

Pos logit contributions
iw+0.018
h+0.018
ott+0.017
on+0.017
yl+0.017
Neg logit contributions
 trigger-0.040
 heads-0.040
 fists-0.038
 noses-0.038
rated-0.037
Token will
Token ID481
Feature activation+1.29e-01

Pos logit contributions
iw+0.003
h+0.003
on+0.003
yl+0.003
ott+0.003
Neg logit contributions
 trigger-0.005
 heads-0.004
 fists-0.004
rated-0.004
 noses-0.004
Token make
Token ID787
Feature activation+6.82e-01

Pos logit contributions
 trigger+0.003
 heads+0.003
 fists+0.003
rated+0.003
 noses+0.003
Neg logit contributions
iw-0.003
h-0.003
on-0.003
yl-0.003
ott-0.003
Token then
Token ID788
Feature activation-1.54e+00

Pos logit contributions
 heads+0.021
 noses+0.019
 trigger+0.019
 fists+0.019
rated+0.019
Neg logit contributions
iw-0.015
h-0.014
yl-0.014
Happy-0.014
ott-0.014
Token asked
Token ID1965
Feature activation+7.95e-01

Pos logit contributions
iw+0.026
h+0.025
ott+0.024
yl+0.023
Happy+0.022
Neg logit contributions
 heads-0.043
 trigger-0.042
 fists-0.041
 noses-0.041
 faces-0.040
Token mummy
Token ID38567
Feature activation-1.14e+00

Pos logit contributions
 heads+0.022
 trigger+0.022
 fists+0.022
 noses+0.021
rated+0.020
Neg logit contributions
iw-0.014
ott-0.012
h-0.012
ris-0.011
Happy-0.011
Token what
Token ID644
Feature activation+4.30e-01

Pos logit contributions
iw+0.015
h+0.015
on+0.014
yl+0.014
ott+0.014
Neg logit contributions
 trigger-0.037
 heads-0.036
 fists-0.034
rated-0.034
 noses-0.034
Token he
Token ID339
Feature activation-2.59e+00

Pos logit contributions
 trigger+0.012
 heads+0.012
 noses+0.011
 fists+0.011
 grill+0.011
Neg logit contributions
iw-0.007
h-0.007
yl-0.006
ott-0.006
Happy-0.006
Token had
Token ID550
Feature activation-5.71e+00

Pos logit contributions
iw+0.055
h+0.054
yl+0.051
Happy+0.049
br+0.049
Neg logit contributions
 trigger-0.060
 heads-0.060
rated-0.058
 fists-0.054
 noses-0.054
Token to
Token ID284
Feature activation-1.38e+00

Pos logit contributions
iw+0.112
h+0.105
br+0.100
yl+0.098
ott+0.098
Neg logit contributions
 heads-0.139
 trigger-0.135
rated-0.131
 noses-0.130
lling-0.129
Token do
Token ID466
Feature activation-3.61e+00

Pos logit contributions
iw+0.034
h+0.033
yl+0.033
on+0.032
ott+0.032
Neg logit contributions
 trigger-0.029
 heads-0.028
 fists-0.026
rated-0.026
 noses-0.025
Token with
Token ID351
Feature activation-1.12e+00

Pos logit contributions
iw+0.055
yl+0.049
h+0.048
ott+0.047
Happy+0.046
Neg logit contributions
 heads-0.104
 trigger-0.102
rated-0.099
 fists-0.097
 noses-0.097
Token the
Token ID262
Feature activation+2.45e-01

Pos logit contributions
iw+0.018
h+0.018
yl+0.017
Happy+0.017
ott+0.017
Neg logit contributions
 trigger-0.032
 heads-0.031
 fists-0.031
 noses-0.030
rated-0.029
Token apple
Token ID17180
Feature activation-8.80e-01

Pos logit contributions
 trigger+0.006
 heads+0.006
 fists+0.006
rated+0.006
 noses+0.006
Neg logit contributions
iw-0.005
on-0.005
h-0.005
ott-0.005
yl-0.004
Token Lily
Token ID20037
Feature activation-2.23e+00

Pos logit contributions
iw+0.000
h+0.000
on+0.000
yl+0.000
ott+0.000
Neg logit contributions
 trigger-0.001
 heads-0.001
 fists-0.001
rated-0.001
 noses-0.001
Token,
Token ID11
Feature activation+9.99e-01

Pos logit contributions
iw+0.035
ott+0.032
h+0.032
Happy+0.031
yl+0.029
Neg logit contributions
 heads-0.066
 trigger-0.063
 fists-0.061
 noses-0.060
rated-0.060
Token look
Token ID804
Feature activation-9.37e-01

Pos logit contributions
 trigger+0.032
 heads+0.031
 fists+0.030
rated+0.029
 noses+0.029
Neg logit contributions
iw-0.014
h-0.013
yl-0.013
on-0.013
ott-0.012
Token what
Token ID644
Feature activation+1.58e-01

Pos logit contributions
iw+0.012
ott+0.011
h+0.011
yl+0.011
on+0.011
Neg logit contributions
 trigger-0.031
 heads-0.030
 fists-0.029
rated-0.029
 noses-0.028
Token I
Token ID314
Feature activation-2.50e+00

Pos logit contributions
 trigger+0.005
 heads+0.005
 fists+0.005
rated+0.005
 noses+0.005
Neg logit contributions
iw-0.002
h-0.002
yl-0.002
on-0.002
ott-0.002
Token have
Token ID423
Feature activation-5.71e+00

Pos logit contributions
h+0.045
iw+0.045
yl+0.042
on+0.041
ott+0.040
Neg logit contributions
 trigger-0.067
 heads-0.065
rated-0.063
 fists-0.060
 noses-0.059
Token!"
Token ID2474
Feature activation-1.51e+00

Pos logit contributions
ott+0.097
iw+0.097
h+0.090
lu+0.085
yl+0.084
Neg logit contributions
 heads-0.144
 trigger-0.142
rated-0.136
 noses-0.135
 faces-0.133
Token Ben
Token ID3932
Feature activation-7.08e-01

Pos logit contributions
iw+0.026
h+0.026
on+0.025
yl+0.025
ott+0.024
Neg logit contributions
 trigger-0.043
 heads-0.041
 fists-0.040
 noses-0.039
rated-0.038
Token said
Token ID531
Feature activation-5.98e-01

Pos logit contributions
iw+0.009
h+0.008
on+0.008
yl+0.008
ott+0.008
Neg logit contributions
 trigger-0.023
 heads-0.023
 fists-0.022
rated-0.022
 noses-0.022
Token.
Token ID13
Feature activation-3.18e-02

Pos logit contributions
iw+0.007
h+0.006
yl+0.006
on+0.006
ott+0.006
Neg logit contributions
 trigger-0.021
 heads-0.021
 fists-0.020
 noses-0.019
rated-0.019
Token He
Token ID679
Feature activation-1.26e+00

Pos logit contributions
iw+0.001
h+0.001
prise+0.001
yl+0.001
on+0.001
Neg logit contributions
 trigger-0.001
 heads-0.001
 fists-0.001
 noses-0.001
 faces-0.001
Token and
Token ID290
Feature activation-1.13e+00

Pos logit contributions
iw+0.026
ott+0.024
yl+0.023
Happy+0.023
h+0.023
Neg logit contributions
 trigger-0.051
 heads-0.051
 fists-0.049
rated-0.048
 noses-0.048
Token thought
Token ID1807
Feature activation-3.67e-01

Pos logit contributions
iw+0.016
h+0.015
on+0.015
yl+0.015
ott+0.014
Neg logit contributions
 trigger-0.036
 heads-0.035
 fists-0.033
rated-0.033
 noses-0.032
Token about
Token ID546
Feature activation-1.20e+00

Pos logit contributions
iw+0.007
ott+0.006
yl+0.006
h+0.006
Happy+0.006
Neg logit contributions
 trigger-0.010
 fists-0.010
 heads-0.009
 noses-0.009
 faces-0.009
Token what
Token ID644
Feature activation-5.82e-02

Pos logit contributions
iw+0.021
h+0.020
on+0.020
yl+0.020
ott+0.020
Neg logit contributions
 trigger-0.033
 heads-0.033
 fists-0.031
 noses-0.031
rated-0.030
Token she
Token ID673
Feature activation-2.97e+00

Pos logit contributions
iw+0.001
h+0.001
yl+0.001
on+0.001
ott+0.001
Neg logit contributions
 heads-0.002
 trigger-0.002
 fists-0.002
 noses-0.002
rated-0.002
Token had
Token ID550
Feature activation-5.70e+00

Pos logit contributions
iw+0.066
h+0.064
yl+0.061
on+0.061
br+0.059
Neg logit contributions
 heads-0.067
rated-0.065
 trigger-0.060
lling-0.056
 fists-0.056
Token learnt
Token ID26338
Feature activation-3.69e+00

Pos logit contributions
iw+0.110
on+0.098
h+0.097
ott+0.097
br+0.096
Neg logit contributions
 heads-0.141
rated-0.136
 trigger-0.132
lling-0.132
 noses-0.127
Token.
Token ID13
Feature activation-6.56e-01

Pos logit contributions
iw+0.045
h+0.041
on+0.040
yl+0.039
ott+0.039
Neg logit contributions
 trigger-0.122
 heads-0.120
 fists-0.115
rated-0.115
 noses-0.113
Token [space]
Token ID220
Feature activation-1.60e+00

Pos logit contributions
h+0.013
yl+0.013
br+0.012
iw+0.012
prise+0.012
Neg logit contributions
 heads-0.015
 fists-0.015
 trigger-0.014
 lawn-0.014
 noses-0.014
Token [newline]
Token ID198
Feature activation+8.59e-03

Pos logit contributions
iw+0.021
h+0.020
yl+0.019
Happy+0.017
br+0.017
Neg logit contributions
 trigger-0.049
 heads-0.048
 fists-0.048
 noses-0.045
rated-0.045
Token [newline]
Token ID198
Feature activation-1.12e+00

Pos logit contributions
 trigger+0.000
 heads+0.000
 fists+0.000
rated+0.000
 noses+0.000
Neg logit contributions
iw-0.000
yl-0.000
h-0.000
br-0.000
ris-0.000
Token you
Token ID345
Feature activation+4.23e-01

Pos logit contributions
iw+0.022
h+0.021
on+0.021
yl+0.020
ott+0.019
Neg logit contributions
 trigger-0.069
 heads-0.067
 fists-0.064
rated-0.064
 noses-0.063
Token're
Token ID821
Feature activation-1.07e+00

Pos logit contributions
 trigger+0.012
 heads+0.012
 fists+0.011
 noses+0.011
rated+0.011
Neg logit contributions
h-0.007
on-0.007
iw-0.007
yl-0.007
ott-0.006
Token doing
Token ID1804
Feature activation-3.11e+00

Pos logit contributions
h+0.020
on+0.019
iw+0.019
yl+0.019
ott+0.018
Neg logit contributions
 trigger-0.029
 heads-0.028
 fists-0.027
 noses-0.026
rated-0.026
Token in
Token ID287
Feature activation-1.57e+00

Pos logit contributions
iw+0.040
h+0.037
yl+0.035
ott+0.035
on+0.034
Neg logit contributions
 trigger-0.102
 heads-0.100
 fists-0.096
rated-0.096
 noses-0.094
Token my
Token ID616
Feature activation-6.93e-01

Pos logit contributions
iw+0.024
h+0.023
on+0.022
yl+0.022
ott+0.021
Neg logit contributions
 trigger-0.048
 heads-0.046
 fists-0.045
rated-0.044
 noses-0.043
Token drawer
Token ID33451
Feature activation-5.69e+00

Pos logit contributions
h+0.015
iw+0.015
on+0.014
yl+0.014
Happy+0.014
Neg logit contributions
 heads-0.017
 trigger-0.016
 fists-0.016
 noses-0.015
rated-0.015
Token?"
Token ID1701
Feature activation-2.57e+00

Pos logit contributions
iw+0.067
Happy+0.060
yl+0.059
h+0.057
on+0.056
Neg logit contributions
 heads-0.188
 trigger-0.187
rated-0.183
 fists-0.177
 noses-0.172
Token he
Token ID339
Feature activation-1.22e+00

Pos logit contributions
h+0.043
iw+0.042
yl+0.040
ris+0.039
prise+0.039
Neg logit contributions
 trigger-0.073
 heads-0.069
 fists-0.067
 noses-0.065
 faces-0.064
Token said
Token ID531
Feature activation-2.18e-01

Pos logit contributions
iw+0.020
ott+0.017
on+0.017
h+0.016
Happy+0.016
Neg logit contributions
 heads-0.037
 trigger-0.037
 fists-0.035
 noses-0.034
rated-0.034
Token.
Token ID13
Feature activation+2.33e-01

Pos logit contributions
iw+0.003
ott+0.003
ris+0.003
yl+0.003
prise+0.003
Neg logit contributions
 trigger-0.007
 heads-0.006
 fists-0.006
rated-0.006
 faces-0.006
Token [space]
Token ID220
Feature activation-7.26e-01

Pos logit contributions
 trigger+0.007
 heads+0.007
 fists+0.006
 noses+0.006
 faces+0.006
Neg logit contributions
iw-0.004
h-0.004
yl-0.003
prise-0.003
ris-0.003
Token outside
Token ID2354
Feature activation-6.27e-01

Pos logit contributions
 trigger+0.047
 heads+0.047
 fists+0.044
rated+0.043
 noses+0.043
Neg logit contributions
iw-0.033
h-0.032
yl-0.031
ott-0.031
on-0.030
Token and
Token ID290
Feature activation-1.17e+00

Pos logit contributions
iw+0.012
yl+0.012
ott+0.012
br+0.012
ha+0.011
Neg logit contributions
 heads-0.015
 noses-0.015
 trigger-0.014
 fists-0.014
 grill-0.014
Token see
Token ID766
Feature activation-1.83e+00

Pos logit contributions
iw+0.024
h+0.024
yl+0.023
ott+0.023
on+0.023
Neg logit contributions
 trigger-0.028
 heads-0.028
 fists-0.026
rated-0.026
 noses-0.026
Token what
Token ID644
Feature activation+3.30e-01

Pos logit contributions
iw+0.033
h+0.032
yl+0.032
ott+0.031
on+0.030
Neg logit contributions
 trigger-0.050
 heads-0.048
 fists-0.046
 noses-0.046
rated-0.045
Token he
Token ID339
Feature activation-2.32e+00

Pos logit contributions
 trigger+0.009
 heads+0.009
 fists+0.009
 noses+0.009
rated+0.008
Neg logit contributions
iw-0.006
h-0.005
yl-0.005
ott-0.005
on-0.005
Token had
Token ID550
Feature activation-5.69e+00

Pos logit contributions
iw+0.049
h+0.048
yl+0.047
on+0.045
Sus+0.044
Neg logit contributions
rated-0.052
 heads-0.051
 trigger-0.047
 noses-0.044
 fists-0.044
Token made
Token ID925
Feature activation-2.08e+00

Pos logit contributions
iw+0.121
ott+0.110
lu+0.108
on+0.107
h+0.107
Neg logit contributions
 heads-0.125
rated-0.123
lling-0.120
 trigger-0.120
 noses-0.115
Token.
Token ID13
Feature activation+1.03e-01

Pos logit contributions
iw+0.026
h+0.024
ott+0.024
yl+0.024
on+0.022
Neg logit contributions
 trigger-0.068
 heads-0.066
 fists-0.064
rated-0.064
 noses-0.063
Token His
Token ID2399
Feature activation+1.74e+00

Pos logit contributions
 trigger+0.003
 heads+0.003
 fists+0.003
rated+0.003
 noses+0.003
Neg logit contributions
iw-0.002
h-0.002
on-0.002
yl-0.002
ott-0.002
Token friends
Token ID2460
Feature activation+6.58e-01

Pos logit contributions
 trigger+0.054
rated+0.050
lling+0.047
 heads+0.047
 fists+0.046
Neg logit contributions
iw-0.027
on-0.026
yl-0.026
ott-0.026
h-0.024
Token said
Token ID531
Feature activation-1.07e+00

Pos logit contributions
 trigger+0.018
 heads+0.017
 fists+0.016
rated+0.016
 noses+0.016
Neg logit contributions
iw-0.012
h-0.011
yl-0.011
ott-0.011
on-0.011
Token to
Token ID284
Feature activation-1.05e+00

Pos logit contributions
iw+0.007
h+0.006
yl+0.006
Happy+0.006
ott+0.006
Neg logit contributions
 trigger-0.015
 heads-0.015
 fists-0.014
 noses-0.014
rated-0.014
Token think
Token ID892
Feature activation-9.47e-01

Pos logit contributions
iw+0.020
h+0.019
on+0.019
yl+0.019
ott+0.018
Neg logit contributions
 trigger-0.028
 heads-0.027
 fists-0.026
rated-0.025
 noses-0.025
Token about
Token ID546
Feature activation-1.93e+00

Pos logit contributions
iw+0.015
ott+0.013
yl+0.013
h+0.013
Happy+0.012
Neg logit contributions
 trigger-0.028
 heads-0.028
 fists-0.027
 noses-0.027
rated-0.026
Token what
Token ID644
Feature activation-2.01e-01

Pos logit contributions
iw+0.033
h+0.032
yl+0.030
ott+0.030
on+0.030
Neg logit contributions
 trigger-0.055
 heads-0.053
 fists-0.051
 noses-0.050
rated-0.050
Token he
Token ID339
Feature activation-2.51e+00

Pos logit contributions
iw+0.003
h+0.003
yl+0.003
ott+0.003
on+0.003
Neg logit contributions
 trigger-0.006
 heads-0.006
 fists-0.006
 noses-0.005
rated-0.005
Token had
Token ID550
Feature activation-5.68e+00

Pos logit contributions
iw+0.051
h+0.050
yl+0.047
on+0.047
ott+0.045
Neg logit contributions
 heads-0.062
rated-0.061
 trigger-0.060
 fists-0.056
 noses-0.055
Token done
Token ID1760
Feature activation-4.20e+00

Pos logit contributions
iw+0.108
h+0.099
ott+0.094
on+0.093
br+0.092
Neg logit contributions
 heads-0.149
 trigger-0.145
rated-0.140
 fists-0.138
 noses-0.137
Token the
Token ID262
Feature activation+4.85e-01

Pos logit contributions
iw+0.041
h+0.039
on+0.037
yl+0.037
ott+0.035
Neg logit contributions
 trigger-0.150
 heads-0.146
 fists-0.141
rated-0.140
 noses-0.138
Token day
Token ID1110
Feature activation-1.54e+00

Pos logit contributions
 trigger+0.015
 heads+0.014
 fists+0.014
rated+0.014
 noses+0.014
Neg logit contributions
iw-0.007
yl-0.007
on-0.007
h-0.007
ott-0.007
Token before
Token ID878
Feature activation-1.05e+00

Pos logit contributions
iw+0.016
h+0.015
yl+0.014
on+0.014
ott+0.014
Neg logit contributions
 trigger-0.054
 heads-0.053
 fists-0.051
rated-0.050
 noses-0.050
Token.
Token ID13
Feature activation-1.22e+00

Pos logit contributions
iw+0.015
h+0.014
yl+0.014
ott+0.014
Happy+0.013
Neg logit contributions
 trigger-0.033
 heads-0.032
 fists-0.031
 noses-0.031
rated-0.030
Token and
Token ID290
Feature activation-1.26e+00

Pos logit contributions
iw+0.013
h+0.012
on+0.012
yl+0.012
ott+0.011
Neg logit contributions
 trigger-0.023
 heads-0.023
 fists-0.021
 noses-0.021
rated-0.021
Token held
Token ID2714
Feature activation+3.86e-01

Pos logit contributions
iw+0.023
h+0.021
on+0.021
ott+0.021
ris+0.020
Neg logit contributions
 heads-0.034
 trigger-0.032
 fists-0.031
 faces-0.031
 noses-0.031
Token out
Token ID503
Feature activation+3.12e-01

Pos logit contributions
 trigger+0.010
 heads+0.009
rated+0.009
 noses+0.008
 fists+0.008
Neg logit contributions
iw-0.008
on-0.008
yl-0.008
h-0.008
ott-0.008
Token what
Token ID644
Feature activation+3.83e-01

Pos logit contributions
 trigger+0.009
 heads+0.008
 fists+0.008
 noses+0.008
rated+0.007
Neg logit contributions
iw-0.006
h-0.006
ott-0.005
on-0.005
yl-0.005
Token he
Token ID339
Feature activation-2.56e+00

Pos logit contributions
 trigger+0.014
 heads+0.013
 fists+0.013
rated+0.013
 noses+0.012
Neg logit contributions
iw-0.004
h-0.004
on-0.004
yl-0.004
ott-0.004
Token had
Token ID550
Feature activation-5.68e+00

Pos logit contributions
iw+0.056
h+0.054
on+0.051
yl+0.050
Happy+0.050
Neg logit contributions
 heads-0.057
rated-0.057
 trigger-0.055
 fists-0.051
 noses-0.050
Token been
Token ID587
Feature activation-2.17e+00

Pos logit contributions
iw+0.116
h+0.104
ott+0.103
on+0.101
lu+0.100
Neg logit contributions
 heads-0.137
 trigger-0.129
rated-0.127
 noses-0.126
lling-0.126
Token carrying
Token ID6872
Feature activation-1.25e+00

Pos logit contributions
iw+0.045
h+0.042
ott+0.042
on+0.042
Happy+0.041
Neg logit contributions
 heads-0.051
 trigger-0.051
 fists-0.049
rated-0.048
 noses-0.048
Token.
Token ID13
Feature activation-1.92e-01

Pos logit contributions
iw+0.015
h+0.014
yl+0.014
ott+0.014
Happy+0.013
Neg logit contributions
 trigger-0.042
 heads-0.041
 fists-0.039
 noses-0.039
rated-0.038
Token [newline]
Token ID198
Feature activation-7.42e-02

Pos logit contributions
iw+0.003
h+0.003
yl+0.003
on+0.003
ott+0.003
Neg logit contributions
 trigger-0.005
 heads-0.005
 fists-0.005
rated-0.005
 noses-0.005
Token [newline]
Token ID198
Feature activation-1.11e+00

Pos logit contributions
iw+0.001
yl+0.001
h+0.001
ott+0.001
on+0.001
Neg logit contributions
 trigger-0.002
 heads-0.002
rated-0.002
 fists-0.002
 lawn-0.002
Token anything
Token ID1997
Feature activation+1.19e+00

Pos logit contributions
h+0.013
iw+0.013
on+0.013
yl+0.012
ott+0.012
Neg logit contributions
 trigger-0.028
 heads-0.027
rated-0.026
 fists-0.026
 noses-0.025
Token about
Token ID546
Feature activation-6.14e-01

Pos logit contributions
 trigger+0.044
 heads+0.042
 fists+0.041
rated+0.040
 noses+0.040
Neg logit contributions
h-0.012
iw-0.011
on-0.010
yl-0.010
br-0.010
Token l
Token ID300
Feature activation-9.26e-01

Pos logit contributions
h+0.009
iw+0.009
on+0.009
yl+0.008
ott+0.008
Neg logit contributions
 trigger-0.019
rated-0.019
 heads-0.018
 fists-0.018
lling-0.017
Tokenizards
Token ID14124
Feature activation+1.64e+00

Pos logit contributions
h+0.026
yl+0.025
iw+0.025
Happy+0.025
on+0.024
Neg logit contributions
 trigger-0.016
 heads-0.014
 fists-0.014
 noses-0.012
 tails-0.012
Token.
Token ID13
Feature activation-5.20e-01

Pos logit contributions
 trigger+0.058
 heads+0.057
 fists+0.055
rated+0.054
 noses+0.054
Neg logit contributions
iw-0.016
h-0.015
on-0.015
yl-0.014
ott-0.014
Token Go
Token ID1514
Feature activation+3.86e+00

Pos logit contributions
iw+0.009
h+0.009
yl+0.009
on+0.009
ott+0.009
Neg logit contributions
 trigger-0.014
 heads-0.014
 fists-0.013
 noses-0.013
rated-0.013
Token away
Token ID1497
Feature activation+2.53e-01

Pos logit contributions
 trigger+0.079
 heads+0.077
rated+0.069
 fists+0.066
 noses+0.065
Neg logit contributions
on-0.096
iw-0.096
ott-0.094
yl-0.091
h-0.090
Token."
Token ID526
Feature activation-3.87e-01

Pos logit contributions
 trigger+0.009
 heads+0.009
 fists+0.009
rated+0.008
 noses+0.008
Neg logit contributions
iw-0.002
h-0.002
on-0.002
yl-0.002
ott-0.002
Token\n
Token ID198
Feature activation+5.41e-01

Pos logit contributions
iw+0.006
h+0.006
on+0.006
br+0.006
ott+0.006
Neg logit contributions
 heads-0.010
 trigger-0.010
 noses-0.010
 fists-0.010
 dogs-0.009
Token\n
Token ID198
Feature activation-7.30e-01

Pos logit contributions
 trigger+0.015
 heads+0.015
 fists+0.014
 noses+0.014
 faces+0.013
Neg logit contributions
iw-0.009
yl-0.008
ott-0.007
h-0.007
on-0.007
TokenSam
Token ID16305
Feature activation-9.96e-03

Pos logit contributions
on+0.016
iw+0.015
ott+0.015
yl+0.014
bul+0.014
Neg logit contributions
 trigger-0.015
 heads-0.015
 tails-0.014
 noses-0.013
 fists-0.013
TokenThis
Token ID1212
Feature activation-3.57e-01

Pos logit contributions
 trigger+0.013
 heads+0.012
 fists+0.012
rated+0.012
 noses+0.012
Neg logit contributions
iw-0.009
h-0.008
on-0.008
yl-0.008
ott-0.008
Token is
Token ID318
Feature activation+5.15e-01

Pos logit contributions
on+0.006
h+0.006
iw+0.005
yl+0.005
br+0.005
Neg logit contributions
 trigger-0.011
 heads-0.010
rated-0.010
 fists-0.010
lling-0.010
Token my
Token ID616
Feature activation+1.09e+00

Pos logit contributions
 trigger+0.018
 heads+0.017
rated+0.017
 fists+0.017
 faces+0.016
Neg logit contributions
h-0.006
iw-0.005
yl-0.005
on-0.005
br-0.005
Token spot
Token ID4136
Feature activation+9.54e-01

Pos logit contributions
 trigger+0.027
rated+0.026
 heads+0.025
lling+0.025
 fists+0.024
Neg logit contributions
iw-0.023
yl-0.022
on-0.021
h-0.021
ott-0.020
Token.
Token ID13
Feature activation-5.95e-01

Pos logit contributions
 trigger+0.030
 heads+0.030
 fists+0.029
rated+0.028
 noses+0.028
Neg logit contributions
iw-0.013
h-0.012
yl-0.012
on-0.012
ott-0.012
Token Go
Token ID1514
Feature activation+3.85e+00

Pos logit contributions
h+0.010
iw+0.010
yl+0.010
ott+0.009
on+0.009
Neg logit contributions
 trigger-0.017
 heads-0.016
 fists-0.016
 noses-0.015
rated-0.015
Token away
Token ID1497
Feature activation+2.53e-01

Pos logit contributions
 trigger+0.080
 heads+0.080
rated+0.072
 noses+0.067
 fists+0.066
Neg logit contributions
ott-0.093
iw-0.091
on-0.091
yl-0.087
br-0.086
Token!"
Token ID2474
Feature activation+7.29e-02

Pos logit contributions
 trigger+0.009
 heads+0.009
 fists+0.008
rated+0.008
 noses+0.008
Neg logit contributions
iw-0.003
h-0.003
on-0.003
yl-0.002
ott-0.002
Token Tom
Token ID4186
Feature activation+9.31e-01

Pos logit contributions
 trigger+0.002
 heads+0.002
 noses+0.002
 fists+0.002
 faces+0.002
Neg logit contributions
iw-0.001
prise-0.001
h-0.001
yl-0.001
ott-0.001
Token felt
Token ID2936
Feature activation-1.55e-01

Pos logit contributions
 trigger+0.030
 heads+0.028
 fists+0.027
rated+0.027
 noses+0.027
Neg logit contributions
h-0.013
iw-0.013
on-0.012
yl-0.012
ott-0.012
Token sad
Token ID6507
Feature activation+1.47e+00

Pos logit contributions
 much+0.002
on+0.002
 she+0.002
h+0.002
Happy+0.002
Neg logit contributions
 heads-0.004
 trigger-0.004
rated-0.004
 fists-0.004
 noses-0.004
Token.
Token ID13
Feature activation+1.48e+00

Pos logit contributions
 trigger+0.022
 heads+0.021
 fists+0.021
 noses+0.020
rated+0.020
Neg logit contributions
iw-0.007
h-0.007
yl-0.007
Happy-0.006
ott-0.006
Token He
Token ID679
Feature activation+2.08e-01

Pos logit contributions
 trigger+0.041
 heads+0.041
 fists+0.039
 noses+0.038
rated+0.038
Neg logit contributions
iw-0.027
yl-0.026
h-0.026
Happy-0.025
on-0.024
Token made
Token ID925
Feature activation+6.11e-01

Pos logit contributions
 heads+0.006
 trigger+0.006
 fists+0.005
rated+0.005
 noses+0.005
Neg logit contributions
iw-0.004
h-0.004
Happy-0.004
yl-0.003
on-0.003
Token a
Token ID257
Feature activation+9.34e-01

Pos logit contributions
 trigger+0.020
 heads+0.019
rated+0.018
 fists+0.018
 noses+0.018
Neg logit contributions
on-0.008
h-0.008
iw-0.008
yl-0.007
ott-0.007
Token big
Token ID1263
Feature activation+4.58e-01

Pos logit contributions
 trigger+0.024
 heads+0.023
 fists+0.022
rated+0.022
 noses+0.021
Neg logit contributions
iw-0.019
h-0.018
on-0.018
yl-0.018
ott-0.018
Token snowball
Token ID46275
Feature activation+3.20e+00

Pos logit contributions
 trigger+0.010
 heads+0.009
 fists+0.009
 noses+0.009
 faces+0.009
Neg logit contributions
h-0.011
iw-0.011
Happy-0.011
yl-0.011
ott-0.011
Token and
Token ID290
Feature activation-9.12e-01

Pos logit contributions
 trigger+0.095
 heads+0.092
rated+0.091
 fists+0.089
 noses+0.086
Neg logit contributions
h-0.047
on-0.047
iw-0.047
yl-0.044
ott-0.041
Token rolled
Token ID11686
Feature activation+1.79e+00

Pos logit contributions
iw+0.016
h+0.016
yl+0.015
ott+0.015
on+0.015
Neg logit contributions
 trigger-0.025
 heads-0.024
 fists-0.023
rated-0.023
 noses-0.023
Token it
Token ID340
Feature activation+1.25e+00

Pos logit contributions
 trigger+0.054
 heads+0.052
 fists+0.050
rated+0.049
 noses+0.049
Neg logit contributions
iw-0.029
h-0.028
yl-0.027
ott-0.027
on-0.027
Token around
Token ID1088
Feature activation+1.31e+00

Pos logit contributions
 trigger+0.029
 heads+0.028
 fists+0.026
rated+0.026
 noses+0.025
Neg logit contributions
iw-0.028
h-0.027
yl-0.027
on-0.027
ott-0.026
Token the
Token ID262
Feature activation+1.06e-01

Pos logit contributions
 trigger+0.041
 heads+0.040
 fists+0.039
 noses+0.038
rated+0.037
Neg logit contributions
iw-0.019
yl-0.018
ott-0.017
h-0.017
Happy-0.017
Token in
Token ID287
Feature activation+7.81e-01

Pos logit contributions
 trigger+0.039
 heads+0.037
 fists+0.035
 noses+0.035
rated+0.034
Neg logit contributions
iw-0.021
yl-0.019
ott-0.019
h-0.019
Happy-0.019
Token the
Token ID262
Feature activation-8.68e-01

Pos logit contributions
 trigger+0.024
 heads+0.023
 fists+0.022
rated+0.022
 noses+0.022
Neg logit contributions
iw-0.012
h-0.012
yl-0.011
on-0.011
ott-0.011
Token park
Token ID3952
Feature activation-2.25e+00

Pos logit contributions
iw+0.016
on+0.015
h+0.015
ott+0.015
yl+0.015
Neg logit contributions
 trigger-0.024
rated-0.022
 heads-0.022
 fists-0.022
lling-0.021
Token.
Token ID13
Feature activation+8.40e-01

Pos logit contributions
iw+0.026
yl+0.023
ott+0.023
h+0.023
Happy+0.023
Neg logit contributions
 trigger-0.076
 heads-0.075
 fists-0.071
 noses-0.071
rated-0.071
Token The
Token ID383
Feature activation+5.14e-01

Pos logit contributions
 trigger+0.025
 heads+0.024
 fists+0.024
 noses+0.023
rated+0.023
Neg logit contributions
iw-0.013
h-0.013
yl-0.013
ott-0.013
on-0.012
Token dog
Token ID3290
Feature activation+3.10e+00

Pos logit contributions
rated+0.017
lling+0.017
imming+0.016
immers+0.016
lled+0.016
Neg logit contributions
 she-0.006
-0.006
 that-0.006
 band-0.005
 too-0.005
Token was
Token ID373
Feature activation+1.33e+00

Pos logit contributions
 trigger+0.089
 heads+0.086
 fists+0.083
rated+0.082
 noses+0.081
Neg logit contributions
iw-0.052
h-0.050
on-0.049
yl-0.049
ott-0.048
Token barking
Token ID48961
Feature activation+3.71e+00

Pos logit contributions
 trigger+0.043
rated+0.041
 heads+0.039
lling+0.036
 banks+0.036
Neg logit contributions
h-0.020
on-0.019
yl-0.018
iw-0.018
 H-0.018
Token loudly
Token ID23112
Feature activation+2.19e+00

Pos logit contributions
 trigger+0.129
 heads+0.123
rated+0.122
 fists+0.119
 noses+0.116
Neg logit contributions
h-0.040
on-0.035
yl-0.033
Happy-0.032
iw-0.032
Token.
Token ID13
Feature activation+9.82e-01

Pos logit contributions
 trigger+0.080
 heads+0.077
 fists+0.075
rated+0.075
 noses+0.073
Neg logit contributions
h-0.020
iw-0.017
on-0.017
yl-0.016
Happy-0.015
Token Tom
Token ID4186
Feature activation+1.48e+00

Pos logit contributions
 trigger+0.026
 heads+0.026
 fists+0.025
rated+0.024
 noses+0.024
Neg logit contributions
iw-0.019
h-0.018
yl-0.018
ott-0.017
on-0.017
Token.
Token ID13
Feature activation+1.61e+00

Pos logit contributions
 trigger+0.007
 heads+0.007
rated+0.006
 fists+0.006
 noses+0.006
Neg logit contributions
iw-0.002
h-0.002
yl-0.002
on-0.002
ott-0.002
Token\n
Token ID198
Feature activation+2.64e+00

Pos logit contributions
 trigger+0.066
rated+0.064
 heads+0.064
lling+0.063
 fists+0.063
Neg logit contributions
iw-0.008
on-0.007
-0.006
h-0.006
ott-0.005
Token\n
Token ID198
Feature activation+1.05e+00

Pos logit contributions
 trigger+0.086
 heads+0.082
rated+0.080
 fists+0.080
 noses+0.078
Neg logit contributions
iw-0.032
h-0.027
ott-0.026
yl-0.026
on-0.025
TokenBen
Token ID11696
Feature activation+7.19e-01

Pos logit contributions
 fists+0.035
 heads+0.034
 trigger+0.034
 noses+0.032
rated+0.031
Neg logit contributions
h-0.016
Happy-0.015
br-0.015
ha-0.014
on-0.013
Token and
Token ID290
Feature activation+2.63e+00

Pos logit contributions
 heads+0.020
 trigger+0.019
rated+0.018
 fists+0.018
 noses+0.018
Neg logit contributions
iw-0.013
h-0.012
yl-0.012
on-0.012
ott-0.011
Token Spot
Token ID15899
Feature activation+3.87e+00

Pos logit contributions
 trigger+0.085
 fists+0.079
 heads+0.079
rated+0.078
lling+0.075
Neg logit contributions
on-0.037
h-0.036
iw-0.036
 H-0.035
yl-0.034
Token saw
Token ID2497
Feature activation-5.30e-01

Pos logit contributions
 trigger+0.105
 heads+0.102
 fists+0.097
rated+0.096
 noses+0.095
Neg logit contributions
iw-0.071
h-0.069
on-0.067
yl-0.067
ott-0.066
Token a
Token ID257
Feature activation-1.96e-01

Pos logit contributions
 that+0.004
on+0.003
 much+0.003
+0.003
 she+0.003
Neg logit contributions
rated-0.023
 trigger-0.022
 heads-0.022
 fists-0.021
immers-0.020
Token big
Token ID1263
Feature activation+4.49e-01

Pos logit contributions
iw+0.003
yl+0.003
h+0.003
on+0.003
ott+0.003
Neg logit contributions
 trigger-0.005
 heads-0.005
rated-0.005
 fists-0.005
 noses-0.005
Token pile
Token ID14540
Feature activation-1.70e+00

Pos logit contributions
 trigger+0.012
 heads+0.012
 fists+0.011
rated+0.011
 noses+0.011
Neg logit contributions
iw-0.008
h-0.008
on-0.008
yl-0.008
ott-0.007
Token of
Token ID286
Feature activation+1.11e+00

Pos logit contributions
iw+0.016
h+0.015
yl+0.014
on+0.014
ott+0.014
Neg logit contributions
 trigger-0.061
 heads-0.060
 fists-0.058
rated-0.057
 noses-0.057
Token play
Token ID711
Feature activation-8.83e-01

Pos logit contributions
iw+0.005
h+0.005
yl+0.005
on+0.005
ott+0.005
Neg logit contributions
 trigger-0.007
 heads-0.006
 fists-0.006
rated-0.006
 noses-0.006
Token outside
Token ID2354
Feature activation-2.48e-01

Pos logit contributions
iw+0.016
h+0.016
yl+0.016
ott+0.015
on+0.015
Neg logit contributions
 trigger-0.024
 heads-0.023
 noses-0.022
 fists-0.021
 faces-0.021
Token when
Token ID618
Feature activation+2.19e-01

Pos logit contributions
iw+0.004
yl+0.004
ott+0.003
h+0.003
on+0.003
Neg logit contributions
 heads-0.008
 trigger-0.008
 noses-0.007
 fists-0.007
rated-0.007
Token it
Token ID340
Feature activation-2.08e-01

Pos logit contributions
 trigger+0.007
 heads+0.006
 fists+0.006
rated+0.006
 noses+0.006
Neg logit contributions
iw-0.003
h-0.003
on-0.003
yl-0.003
ott-0.003
Token rains
Token ID29424
Feature activation+7.06e-01

Pos logit contributions
iw+0.004
h+0.004
on+0.004
yl+0.004
ott+0.004
Neg logit contributions
 trigger-0.006
 heads-0.005
 fists-0.005
rated-0.005
 noses-0.005
Token.
Token ID13
Feature activation+9.22e-01

Pos logit contributions
 heads+0.023
 trigger+0.023
 fists+0.022
rated+0.022
 noses+0.022
Neg logit contributions
iw-0.008
yl-0.008
on-0.008
ott-0.007
h-0.007
Token They
Token ID1119
Feature activation+1.39e+00

Pos logit contributions
 trigger+0.029
 heads+0.027
 fists+0.026
rated+0.026
lling+0.025
Neg logit contributions
iw-0.014
h-0.014
ott-0.013
on-0.013
yl-0.013
Token put
Token ID1234
Feature activation+7.59e-01

Pos logit contributions
 trigger+0.038
 heads+0.035
 fists+0.035
rated+0.034
 noses+0.034
Neg logit contributions
h-0.025
iw-0.025
yl-0.024
ott-0.024
on-0.024
Token on
Token ID319
Feature activation+9.04e-01

Pos logit contributions
 trigger+0.024
rated+0.023
 heads+0.022
 fists+0.022
 antics+0.021
Neg logit contributions
iw-0.011
on-0.010
h-0.010
yl-0.010
ott-0.009
Token their
Token ID511
Feature activation+1.83e+00

Pos logit contributions
 trigger+0.026
rated+0.023
 fists+0.021
 heads+0.021
lling+0.021
Neg logit contributions
h-0.017
yl-0.016
iw-0.016
on-0.016
Happy-0.016
Token boots
Token ID14412
Feature activation-3.04e-01

Pos logit contributions
 trigger+0.045
rated+0.041
lling+0.040
imming+0.034
 fists+0.034
Neg logit contributions
yl-0.038
iw-0.037
h-0.037
br-0.036
ott-0.036
Token out
Token ID503
Feature activation+5.86e-02

Pos logit contributions
iw+0.009
h+0.009
yl+0.008
Happy+0.008
on+0.008
Neg logit contributions
 trigger-0.016
 heads-0.015
 fists-0.014
rated-0.014
 noses-0.014
Token at
Token ID379
Feature activation-1.39e+00

Pos logit contributions
 trigger+0.002
 heads+0.002
 fists+0.002
rated+0.002
 noses+0.002
Neg logit contributions
iw-0.001
h-0.001
on-0.001
yl-0.001
ott-0.001
Token the
Token ID262
Feature activation-1.45e+00

Pos logit contributions
iw+0.016
on+0.015
yl+0.015
h+0.015
ott+0.014
Neg logit contributions
 trigger-0.048
rated-0.044
 heads-0.044
 fists-0.042
 noses-0.041
Token blue
Token ID4171
Feature activation-1.12e+00

Pos logit contributions
iw+0.028
h+0.027
Happy+0.026
on+0.025
yl+0.025
Neg logit contributions
 trigger-0.038
 fists-0.035
 heads-0.035
rated-0.034
 noses-0.033
Token skies
Token ID24091
Feature activation-8.92e-01

Pos logit contributions
iw+0.024
h+0.024
yl+0.023
on+0.023
ott+0.023
Neg logit contributions
 trigger-0.027
 heads-0.026
 fists-0.024
rated-0.024
 noses-0.024
Token above
Token ID2029
Feature activation+2.87e-01

Pos logit contributions
iw+0.011
h+0.010
yl+0.010
on+0.010
ott+0.010
Neg logit contributions
 trigger-0.030
 heads-0.029
 fists-0.028
rated-0.028
 noses-0.027
Token them
Token ID606
Feature activation+2.00e-01

Pos logit contributions
 trigger+0.009
 heads+0.009
rated+0.008
 fists+0.008
 noses+0.008
Neg logit contributions
h-0.004
on-0.004
yl-0.004
iw-0.004
Happy-0.004
Token.
Token ID13
Feature activation-6.99e-01

Pos logit contributions
 trigger+0.007
 heads+0.007
 fists+0.007
rated+0.007
 noses+0.007
Neg logit contributions
iw-0.002
h-0.002
yl-0.002
on-0.002
ott-0.002
Token\n
Token ID198
Feature activation-5.38e-01

Pos logit contributions
iw+0.008
h+0.007
on+0.007
yl+0.007
ott+0.007
Neg logit contributions
 trigger-0.024
 heads-0.024
 fists-0.023
rated-0.023
 noses-0.022
Token\n
Token ID198
Feature activation-1.62e+00

Pos logit contributions
iw+0.011
h+0.010
yl+0.010
on+0.010
ott+0.010
Neg logit contributions
 trigger-0.014
 heads-0.013
 fists-0.013
rated-0.013
 noses-0.012
TokenSuddenly
Token ID38582
Feature activation-9.02e-01

Pos logit contributions
iw+0.024
h+0.023
on+0.022
yl+0.022
ott+0.022
Neg logit contributions
 trigger-0.051
 heads-0.049
 fists-0.047
rated-0.047
 noses-0.046
Token But
Token ID887
Feature activation+1.37e+00

Pos logit contributions
 trigger+0.049
 heads+0.048
 fists+0.046
 noses+0.046
rated+0.046
Neg logit contributions
iw-0.018
h-0.018
yl-0.018
ott-0.017
on-0.016
Token today
Token ID1909
Feature activation+1.15e+00

Pos logit contributions
 trigger+0.043
 heads+0.041
 fists+0.040
rated+0.040
 noses+0.039
Neg logit contributions
iw-0.020
h-0.019
on-0.019
yl-0.019
ott-0.018
Token,
Token ID11
Feature activation-5.00e-01

Pos logit contributions
 trigger+0.034
 heads+0.033
 noses+0.032
rated+0.031
 fists+0.031
Neg logit contributions
iw-0.018
h-0.016
ott-0.016
yl-0.015
Happy-0.015
Token he
Token ID339
Feature activation+3.77e-01

Pos logit contributions
on+0.005
iw+0.005
h+0.005
yl+0.005
 she+0.005
Neg logit contributions
 trigger-0.018
 heads-0.017
rated-0.017
 fists-0.017
 noses-0.016
Token felt
Token ID2936
Feature activation-8.68e-01

Pos logit contributions
 trigger+0.010
 heads+0.010
 fists+0.009
rated+0.009
 noses+0.009
Neg logit contributions
iw-0.007
h-0.007
on-0.007
yl-0.007
ott-0.006
Token very
Token ID845
Feature activation+1.54e+00

Pos logit contributions
on+0.014
h+0.013
iw+0.013
ott+0.013
 she+0.013
Neg logit contributions
 trigger-0.025
 heads-0.025
rated-0.024
 fists-0.023
 noses-0.023
Token miserable
Token ID22444
Feature activation+1.23e+00

Pos logit contributions
 trigger+0.039
 heads+0.037
 fists+0.035
rated+0.034
 noses+0.034
Neg logit contributions
iw-0.031
h-0.031
on-0.030
ott-0.030
yl-0.029
Token.
Token ID13
Feature activation+1.11e+00

Pos logit contributions
 trigger+0.039
 heads+0.038
rated+0.037
 fists+0.037
 noses+0.036
Neg logit contributions
iw-0.017
yl-0.015
ott-0.015
Happy-0.015
h-0.015
Token He
Token ID679
Feature activation+8.87e-01

Pos logit contributions
 trigger+0.035
 heads+0.034
 fists+0.033
rated+0.032
 noses+0.032
Neg logit contributions
iw-0.015
h-0.015
yl-0.015
on-0.014
ott-0.014
Token had
Token ID550
Feature activation-1.17e+00

Pos logit contributions
 trigger+0.023
 heads+0.023
 fists+0.022
rated+0.021
 noses+0.021
Neg logit contributions
iw-0.017
h-0.017
on-0.016
yl-0.016
Happy-0.016
Token a
Token ID257
Feature activation+2.79e-01

Pos logit contributions
iw+0.022
h+0.021
yl+0.021
on+0.020
ott+0.020
Neg logit contributions
 trigger-0.031
 heads-0.031
 fists-0.030
 noses-0.029
rated-0.028
Token glad
Token ID9675
Feature activation-3.07e-01

Pos logit contributions
iw+0.026
h+0.025
yl+0.025
on+0.024
ott+0.024
Neg logit contributions
 trigger-0.039
 heads-0.039
 fists-0.037
 noses-0.036
rated-0.036
Token that
Token ID326
Feature activation-6.54e-01

Pos logit contributions
iw+0.005
h+0.004
Happy+0.004
yl+0.004
br+0.004
Neg logit contributions
 heads-0.009
 trigger-0.009
 noses-0.009
 faces-0.009
 fists-0.009
Token the
Token ID262
Feature activation+3.45e-03

Pos logit contributions
on+0.008
iw+0.007
h+0.007
yl+0.007
ott+0.007
Neg logit contributions
 trigger-0.022
 heads-0.022
rated-0.021
 fists-0.021
 noses-0.020
Token axe
Token ID23272
Feature activation+8.80e-01

Pos logit contributions
 trigger+0.000
 heads+0.000
rated+0.000
 fists+0.000
 noses+0.000
Neg logit contributions
iw-0.000
on-0.000
ott-0.000
h-0.000
yl-0.000
Token did
Token ID750
Feature activation+5.36e-02

Pos logit contributions
 trigger+0.024
 heads+0.024
 fists+0.022
rated+0.022
 noses+0.022
Neg logit contributions
iw-0.016
h-0.015
yl-0.015
ott-0.015
on-0.015
Token not
Token ID407
Feature activation+2.86e+00

Pos logit contributions
 trigger+0.001
 heads+0.001
 fists+0.001
rated+0.001
 noses+0.001
Neg logit contributions
h-0.001
iw-0.001
on-0.001
yl-0.001
ott-0.001
Token hit
Token ID2277
Feature activation-5.41e-02

Pos logit contributions
 trigger+0.066
 heads+0.064
rated+0.060
 fists+0.060
 noses+0.058
Neg logit contributions
iw-0.065
yl-0.062
on-0.062
h-0.062
ott-0.061
Token them
Token ID606
Feature activation+2.11e+00

Pos logit contributions
h+0.001
iw+0.001
on+0.001
yl+0.001
ott+0.001
Neg logit contributions
 trigger-0.002
 heads-0.002
rated-0.001
 fists-0.001
 noses-0.001
Token.
Token ID13
Feature activation-6.36e-03

Pos logit contributions
 trigger+0.075
 heads+0.073
 fists+0.072
rated+0.071
 noses+0.070
Neg logit contributions
h-0.020
on-0.019
iw-0.018
ott-0.016
yl-0.015
Token\n
Token ID198
Feature activation+8.08e-02

Pos logit contributions
h+0.000
iw+0.000
yl+0.000
on+0.000
prise+0.000
Neg logit contributions
 trigger-0.000
 heads-0.000
 fists-0.000
 noses-0.000
 faces-0.000
Token\n
Token ID198
Feature activation-1.32e+00

Pos logit contributions
 trigger+0.003
 heads+0.003
 fists+0.002
 noses+0.002
rated+0.002
Neg logit contributions
iw-0.001
h-0.001
yl-0.001
ott-0.001
on-0.001
Token steak
Token ID26320
Feature activation+2.44e+00

Pos logit contributions
rated+0.019
 heads+0.017
lling+0.017
 trigger+0.017
 noses+0.016
Neg logit contributions
iw-0.014
on-0.014
yl-0.013
br-0.013
 course-0.013
Token from
Token ID422
Feature activation+2.62e+00

Pos logit contributions
 trigger+0.084
 heads+0.080
 fists+0.079
rated+0.078
 noses+0.077
Neg logit contributions
h-0.025
iw-0.025
on-0.024
yl-0.022
ott-0.022
Token Max
Token ID5436
Feature activation+1.14e+00

Pos logit contributions
 trigger+0.070
 heads+0.067
 fists+0.065
rated+0.064
 noses+0.063
Neg logit contributions
iw-0.050
h-0.048
on-0.047
yl-0.047
ott-0.046
Token.
Token ID13
Feature activation+9.61e-01

Pos logit contributions
 trigger+0.042
 heads+0.041
 fists+0.040
rated+0.039
 noses+0.039
Neg logit contributions
iw-0.010
h-0.009
on-0.009
yl-0.009
ott-0.008
Token Max
Token ID5436
Feature activation+1.91e+00

Pos logit contributions
 trigger+0.024
 heads+0.023
 fists+0.023
 noses+0.022
rated+0.021
Neg logit contributions
iw-0.019
h-0.019
yl-0.019
Happy-0.018
ott-0.018
Token grow
Token ID1663
Feature activation+8.52e-01

Pos logit contributions
 trigger+0.058
 heads+0.053
rated+0.053
 fists+0.052
 noses+0.051
Neg logit contributions
h-0.030
iw-0.030
on-0.029
yl-0.028
ott-0.028
Tokenled
Token ID992
Feature activation+6.79e-01

Pos logit contributions
 trigger+0.018
 heads+0.017
 fists+0.016
 noses+0.015
 lawn+0.015
Neg logit contributions
h-0.020
on-0.020
Happy-0.020
br-0.019
iw-0.019
Token and
Token ID290
Feature activation-3.60e-01

Pos logit contributions
 trigger+0.023
 heads+0.022
rated+0.021
 fists+0.021
 noses+0.021
Neg logit contributions
iw-0.008
h-0.008
on-0.007
yl-0.007
ott-0.007
Token did
Token ID750
Feature activation-1.21e-03

Pos logit contributions
iw+0.006
h+0.006
on+0.006
yl+0.006
ott+0.006
Neg logit contributions
 trigger-0.010
 heads-0.010
 fists-0.010
rated-0.010
 noses-0.009
Token not
Token ID407
Feature activation+2.25e+00

Pos logit contributions
h+0.000
iw+0.000
on+0.000
yl+0.000
ott+0.000
Neg logit contributions
 trigger-0.000
 heads-0.000
 fists-0.000
rated-0.000
 noses-0.000
Token want
Token ID765
Feature activation+1.46e+00

Pos logit contributions
 trigger+0.049
 heads+0.047
rated+0.045
 fists+0.044
 noses+0.042
Neg logit contributions
iw-0.055
on-0.052
yl-0.052
h-0.052
ott-0.052
Token The
Token ID383
Feature activation-4.60e-01

Pos logit contributions
 trigger+0.021
 heads+0.020
 fists+0.020
 noses+0.019
 faces+0.018
Neg logit contributions
h-0.013
iw-0.013
yl-0.013
prise-0.012
Happy-0.012
Token giant
Token ID6175
Feature activation+1.34e+00

Pos logit contributions
iw+0.009
on+0.009
yl+0.009
h+0.009
ott+0.009
Neg logit contributions
 trigger-0.012
 heads-0.011
 fists-0.011
rated-0.011
 noses-0.010
Token knew
Token ID2993
Feature activation-1.20e+00

Pos logit contributions
 trigger+0.036
 heads+0.036
 fists+0.034
rated+0.033
 noses+0.033
Neg logit contributions
iw-0.025
h-0.024
yl-0.023
ott-0.023
on-0.023
Token he
Token ID339
Feature activation-3.59e-01

Pos logit contributions
iw+0.018
h+0.017
Happy+0.016
ott+0.016
yl+0.016
Neg logit contributions
 trigger-0.037
 heads-0.036
 fists-0.035
 noses-0.034
rated-0.034
Token had
Token ID550
Feature activation-1.92e+00

Pos logit contributions
iw+0.006
h+0.006
yl+0.006
on+0.006
ott+0.006
Neg logit contributions
 trigger-0.010
 heads-0.010
 fists-0.009
rated-0.009
 noses-0.009
Token to
Token ID284
Feature activation-3.35e-01

Pos logit contributions
iw+0.026
h+0.024
yl+0.024
on+0.023
ott+0.023
Neg logit contributions
 trigger-0.061
 heads-0.061
 fists-0.059
 noses-0.058
rated-0.057
Token run
Token ID1057
Feature activation+1.87e+00

Pos logit contributions
iw+0.008
h+0.007
on+0.007
yl+0.007
ott+0.007
Neg logit contributions
 trigger-0.008
 heads-0.007
 fists-0.007
rated-0.007
 noses-0.007
Token away
Token ID1497
Feature activation-2.74e-01

Pos logit contributions
 trigger+0.047
 heads+0.045
 fists+0.043
rated+0.042
 noses+0.041
Neg logit contributions
iw-0.039
h-0.038
on-0.037
yl-0.037
ott-0.036
Token,
Token ID11
Feature activation+9.62e-03

Pos logit contributions
iw+0.004
h+0.004
on+0.004
ott+0.004
yl+0.004
Neg logit contributions
 trigger-0.009
 heads-0.008
 fists-0.008
rated-0.008
 noses-0.008
Token so
Token ID523
Feature activation+8.99e-01

Pos logit contributions
 trigger+0.000
 heads+0.000
rated+0.000
 fists+0.000
 noses+0.000
Neg logit contributions
h-0.000
iw-0.000
on-0.000
yl-0.000
 she-0.000
Token he
Token ID339
Feature activation+2.21e-01

Pos logit contributions
 trigger+0.028
 heads+0.027
 fists+0.027
 noses+0.026
rated+0.026
Neg logit contributions
iw-0.012
h-0.012
ott-0.011
yl-0.011
Happy-0.011
Token if
Token ID611
Feature activation-1.33e+00

Pos logit contributions
iw+0.020
ott+0.020
yl+0.018
h+0.018
on+0.018
Neg logit contributions
 heads-0.042
 trigger-0.042
rated-0.041
 fists-0.040
 noses-0.040
Token we
Token ID356
Feature activation+4.66e-01

Pos logit contributions
iw+0.017
h+0.017
yl+0.016
ott+0.016
on+0.016
Neg logit contributions
 trigger-0.043
 heads-0.042
 fists-0.040
rated-0.040
 noses-0.040
Token ask
Token ID1265
Feature activation+1.94e-01

Pos logit contributions
 trigger+0.011
 heads+0.011
rated+0.010
 fists+0.010
 noses+0.010
Neg logit contributions
iw-0.010
h-0.010
yl-0.010
Happy-0.009
ott-0.009
Token permission
Token ID7170
Feature activation+2.68e-01

Pos logit contributions
 trigger+0.006
 heads+0.005
 fists+0.005
rated+0.005
 noses+0.005
Neg logit contributions
iw-0.003
h-0.003
on-0.003
yl-0.003
ott-0.003
Token before
Token ID878
Feature activation-9.36e-01

Pos logit contributions
 trigger+0.007
 heads+0.007
rated+0.006
 fists+0.006
 noses+0.006
Neg logit contributions
iw-0.005
ott-0.005
yl-0.005
h-0.005
on-0.005
Token we
Token ID356
Feature activation-5.15e-01

Pos logit contributions
iw+0.014
h+0.013
yl+0.013
on+0.013
ott+0.013
Neg logit contributions
 trigger-0.028
 heads-0.028
 fists-0.027
rated-0.026
 noses-0.026
Token start
Token ID923
Feature activation-1.52e+00

Pos logit contributions
iw+0.011
h+0.011
yl+0.011
ott+0.010
on+0.010
Neg logit contributions
 trigger-0.012
 heads-0.012
 fists-0.011
rated-0.011
 noses-0.011
Token and
Token ID290
Feature activation-1.18e+00

Pos logit contributions
iw+0.025
yl+0.023
ott+0.023
h+0.023
Happy+0.022
Neg logit contributions
 heads-0.043
 fists-0.042
 noses-0.042
 trigger-0.041
 faces-0.039
Token work
Token ID670
Feature activation-3.85e-01

Pos logit contributions
iw+0.023
h+0.022
yl+0.022
ott+0.022
on+0.022
Neg logit contributions
 trigger-0.030
 heads-0.030
 fists-0.028
rated-0.027
 noses-0.027
Token together
Token ID1978
Feature activation+1.47e-01

Pos logit contributions
iw+0.006
ott+0.006
yl+0.005
h+0.005
on+0.005
Neg logit contributions
 heads-0.012
 trigger-0.011
 fists-0.011
rated-0.011
 noses-0.011
Token".
Token ID1911
Feature activation-2.32e+00

Pos logit contributions
 heads+0.004
 fists+0.004
 noses+0.004
rated+0.004
 trigger+0.004
Neg logit contributions
iw-0.002
ott-0.002
yl-0.002
on-0.002
h-0.002
Token the
Token ID262
Feature activation-4.19e-03

Pos logit contributions
 she+0.002
h+0.001
iw+0.001
Happy+0.001
yl+0.001
Neg logit contributions
 trigger-0.010
rated-0.009
 fists-0.009
imming-0.009
 antics-0.009
Token breeze
Token ID28633
Feature activation+7.93e-02

Pos logit contributions
iw+0.000
h+0.000
on+0.000
Happy+0.000
br+0.000
Neg logit contributions
 trigger-0.000
rated-0.000
 fists-0.000
 heads-0.000
lling-0.000
Token.
Token ID13
Feature activation+8.59e-01

Pos logit contributions
 trigger+0.003
 heads+0.003
 fists+0.003
rated+0.003
 noses+0.003
Neg logit contributions
h-0.001
iw-0.001
on-0.001
yl-0.001
Happy-0.001
Token They
Token ID1119
Feature activation+1.80e-01

Pos logit contributions
 trigger+0.022
 heads+0.021
 fists+0.020
rated+0.020
 noses+0.020
Neg logit contributions
iw-0.017
h-0.017
on-0.016
yl-0.016
ott-0.016
Token fell
Token ID3214
Feature activation+9.60e-01

Pos logit contributions
 heads+0.005
 trigger+0.005
 noses+0.004
 fists+0.004
rated+0.004
Neg logit contributions
iw-0.004
h-0.003
on-0.003
yl-0.003
ott-0.003
Token asleep
Token ID16039
Feature activation-2.15e-02

Pos logit contributions
 trigger+0.023
 heads+0.023
 fists+0.021
 noses+0.021
rated+0.020
Neg logit contributions
iw-0.021
on-0.020
yl-0.020
ott-0.020
h-0.019
Token and
Token ID290
Feature activation-5.41e-01

Pos logit contributions
iw+0.000
ott+0.000
on+0.000
yl+0.000
prise+0.000
Neg logit contributions
 heads-0.001
 trigger-0.001
 noses-0.001
 faces-0.001
 tails-0.000
Token had
Token ID550
Feature activation-1.62e+00

Pos logit contributions
iw+0.010
yl+0.009
h+0.009
ott+0.009
on+0.009
Neg logit contributions
 heads-0.015
 trigger-0.015
rated-0.014
 noses-0.014
 fists-0.014
Token sweet
Token ID6029
Feature activation-4.90e-01

Pos logit contributions
iw+0.026
h+0.025
yl+0.025
on+0.025
ott+0.024
Neg logit contributions
 trigger-0.047
 heads-0.047
 fists-0.045
 noses-0.045
rated-0.044
Token dreams
Token ID10625
Feature activation-1.14e+00

Pos logit contributions
iw+0.008
h+0.008
on+0.008
Happy+0.008
yl+0.008
Neg logit contributions
 trigger-0.014
 heads-0.013
 fists-0.013
rated-0.013
lling-0.012
Token.
Token ID13
Feature activation+7.03e-01

Pos logit contributions
iw+0.017
ott+0.015
yl+0.014
on+0.014
h+0.013
Neg logit contributions
 trigger-0.035
 heads-0.035
 noses-0.033
rated-0.033
 fists-0.033
TokenOnce
Token ID7454
Feature activation+3.31e-02

Pos logit contributions
iw+0.014
yl+0.013
ott+0.013
h+0.013
on+0.012
Neg logit contributions
 trigger-0.023
 heads-0.022
rated-0.022
 fists-0.021
lling-0.021
Token he
Token ID339
Feature activation-1.47e+00

Pos logit contributions
 trigger+0.001
 heads+0.001
rated+0.001
 fists+0.001
 noses+0.001
Neg logit contributions
iw-0.001
h-0.001
yl-0.001
ott-0.001
on-0.000
Token arrived
Token ID5284
Feature activation-2.33e+00

Pos logit contributions
iw+0.032
ott+0.029
h+0.028
yl+0.027
lu+0.026
Neg logit contributions
 heads-0.032
 trigger-0.031
rated-0.031
lling-0.029
 fists-0.029
Token home
Token ID1363
Feature activation-1.06e+00

Pos logit contributions
iw+0.031
h+0.030
on+0.029
yl+0.029
ott+0.028
Neg logit contributions
 trigger-0.075
 heads-0.073
 fists-0.070
rated-0.070
 noses-0.069
Token he
Token ID339
Feature activation-1.00e+00

Pos logit contributions
iw+0.011
h+0.010
yl+0.009
ott+0.009
on+0.009
Neg logit contributions
 trigger-0.037
 heads-0.037
 fists-0.035
rated-0.035
 noses-0.034
Token shared
Token ID4888
Feature activation-2.02e+00

Pos logit contributions
iw+0.020
h+0.019
yl+0.019
ott+0.018
Happy+0.018
Neg logit contributions
 heads-0.025
 trigger-0.024
rated-0.023
 fists-0.023
 noses-0.023
Token the
Token ID262
Feature activation+7.14e-01

Pos logit contributions
iw+0.030
on+0.029
h+0.028
yl+0.027
Happy+0.026
Neg logit contributions
 trigger-0.064
 heads-0.061
rated-0.059
 fists-0.059
 noses-0.056
Token rest
Token ID1334
Feature activation+5.89e-01

Pos logit contributions
 fists+0.019
 heads+0.019
 trigger+0.019
 noses+0.018
rated+0.018
Neg logit contributions
on-0.013
ott-0.012
iw-0.012
Happy-0.012
 course-0.012
Token of
Token ID286
Feature activation-4.16e-01

Pos logit contributions
 trigger+0.023
 heads+0.022
 fists+0.021
rated+0.021
 noses+0.021
Neg logit contributions
iw-0.004
h-0.004
on-0.004
yl-0.004
ott-0.004
Token the
Token ID262
Feature activation-2.41e-01

Pos logit contributions
iw+0.006
on+0.005
h+0.005
yl+0.005
ott+0.005
Neg logit contributions
 trigger-0.014
 heads-0.013
rated-0.013
 fists-0.013
 noses-0.012
Token fish
Token ID5916
Feature activation+1.24e+00

Pos logit contributions
on+0.004
iw+0.004
ott+0.004
Happy+0.004
 course+0.004
Neg logit contributions
 fists-0.007
 trigger-0.006
 heads-0.006
 noses-0.006
rated-0.006
Token,
Token ID11
Feature activation+3.14e-01

Pos logit contributions
 trigger+0.065
 heads+0.063
 fists+0.061
rated+0.061
 noses+0.059
Neg logit contributions
h-0.009
iw-0.008
on-0.008
yl-0.008
Happy-0.006
Token when
Token ID618
Feature activation+2.15e+00

Pos logit contributions
 trigger+0.010
 heads+0.010
 fists+0.009
rated+0.009
 noses+0.009
Neg logit contributions
h-0.005
iw-0.005
 she-0.005
on-0.004
yl-0.004
Token Bob
Token ID5811
Feature activation+7.61e-01

Pos logit contributions
 trigger+0.069
 heads+0.066
 fists+0.063
rated+0.063
 noses+0.059
Neg logit contributions
h-0.031
iw-0.030
on-0.029
yl-0.028
 she-0.028
Token was
Token ID373
Feature activation-4.98e-01

Pos logit contributions
 trigger+0.022
 heads+0.021
 fists+0.020
rated+0.020
 noses+0.020
Neg logit contributions
iw-0.013
h-0.012
on-0.012
yl-0.012
ott-0.012
Token walking
Token ID6155
Feature activation+1.19e+00

Pos logit contributions
h+0.009
on+0.009
iw+0.009
yl+0.009
ott+0.009
Neg logit contributions
 trigger-0.014
 heads-0.013
rated-0.013
 fists-0.012
 noses-0.012
Token,
Token ID11
Feature activation-4.50e-01

Pos logit contributions
 trigger+0.033
 heads+0.031
 fists+0.030
rated+0.030
 noses+0.029
Neg logit contributions
iw-0.021
h-0.021
on-0.020
yl-0.020
ott-0.020
Token he
Token ID339
Feature activation+8.13e-02

Pos logit contributions
h+0.003
iw+0.003
on+0.003
yl+0.003
 she+0.002
Neg logit contributions
 trigger-0.018
 heads-0.017
rated-0.017
 fists-0.017
 noses-0.016
Token heard
Token ID2982
Feature activation+4.55e-02

Pos logit contributions
 trigger+0.003
 heads+0.002
 fists+0.002
rated+0.002
 noses+0.002
Neg logit contributions
iw-0.001
h-0.001
on-0.001
yl-0.001
ott-0.001
Token two
Token ID734
Feature activation+1.33e+00

Pos logit contributions
 trigger+0.001
 heads+0.001
 fists+0.001
rated+0.001
 noses+0.001
Neg logit contributions
iw-0.001
h-0.001
on-0.001
yl-0.001
ott-0.001
Token bugs
Token ID11316
Feature activation+9.14e-01

Pos logit contributions
rated+0.034
 trigger+0.034
 heads+0.031
 fists+0.029
imming+0.028
Neg logit contributions
h-0.025
yl-0.025
ott-0.024
on-0.024
 she-0.024
Token talking
Token ID3375
Feature activation+2.32e+00

Pos logit contributions
 trigger+0.025
 heads+0.024
rated+0.024
 fists+0.023
 noses+0.022
Neg logit contributions
h-0.018
on-0.017
iw-0.015
yl-0.015
ott-0.015
TokenMom
Token ID29252
Feature activation-1.93e+00

Pos logit contributions
iw+0.009
h+0.009
on+0.009
Happy+0.009
yl+0.008
Neg logit contributions
 trigger-0.015
 heads-0.014
 fists-0.014
rated-0.014
 noses-0.014
Token,
Token ID11
Feature activation+2.33e+00

Pos logit contributions
iw+0.030
Happy+0.029
ott+0.028
yl+0.027
br+0.027
Neg logit contributions
 heads-0.056
 trigger-0.053
 noses-0.052
rated-0.052
 fists-0.052
Token what
Token ID644
Feature activation+9.33e-01

Pos logit contributions
 trigger+0.073
 heads+0.071
 fists+0.069
rated+0.068
 noses+0.067
Neg logit contributions
iw-0.033
h-0.032
yl-0.030
on-0.030
ott-0.030
Token is
Token ID318
Feature activation-3.84e+00

Pos logit contributions
 trigger+0.029
 heads+0.027
 fists+0.027
rated+0.026
 noses+0.026
Neg logit contributions
on-0.015
iw-0.014
yl-0.014
h-0.014
ott-0.013
Token in
Token ID287
Feature activation-2.05e+00

Pos logit contributions
iw+0.061
h+0.059
ott+0.054
Happy+0.054
ris+0.053
Neg logit contributions
 trigger-0.115
 heads-0.110
 fists-0.105
 noses-0.101
 grill-0.100
Token this
Token ID428
Feature activation-3.62e+00

Pos logit contributions
iw+0.033
h+0.032
ott+0.030
on+0.030
yl+0.030
Neg logit contributions
 trigger-0.060
 heads-0.059
 fists-0.057
 noses-0.055
rated-0.054
Token envelope
Token ID22878
Feature activation-5.12e+00

Pos logit contributions
iw+0.073
h+0.072
on+0.068
ott+0.068
yl+0.067
Neg logit contributions
 trigger-0.091
 heads-0.087
 fists-0.081
rated-0.081
 noses-0.081
Token?"
Token ID1701
Feature activation-7.41e-01

Pos logit contributions
iw+0.063
yl+0.058
ott+0.057
Happy+0.053
h+0.050
Neg logit contributions
 trigger-0.163
 heads-0.161
rated-0.160
 grill-0.153
 lawn-0.150
Token Her
Token ID2332
Feature activation-2.02e-01

Pos logit contributions
iw+0.012
h+0.011
yl+0.011
on+0.011
ott+0.011
Neg logit contributions
 trigger-0.021
 heads-0.021
 fists-0.020
 noses-0.020
rated-0.019
Token mom
Token ID1995
Feature activation-2.15e+00

Pos logit contributions
yl+0.003
ott+0.003
iw+0.003
on+0.003
h+0.003
Neg logit contributions
 trigger-0.006
rated-0.006
lling-0.006
 heads-0.005
 banks-0.005
Token said
Token ID531
Feature activation-7.88e-01

Pos logit contributions
iw+0.045
br+0.042
Happy+0.042
ris+0.038
ott+0.038
Neg logit contributions
 heads-0.054
 noses-0.048
 faces-0.045
rated-0.045
 fists-0.045
Token she
Token ID673
Feature activation-9.44e-01

Pos logit contributions
iw+0.015
h+0.014
yl+0.014
on+0.014
ott+0.014
Neg logit contributions
 trigger-0.040
 heads-0.039
 fists-0.038
rated-0.037
 noses-0.037
Token put
Token ID1234
Feature activation-3.46e-01

Pos logit contributions
iw+0.018
h+0.017
yl+0.016
on+0.016
Happy+0.016
Neg logit contributions
 heads-0.025
 trigger-0.025
 fists-0.023
 noses-0.023
rated-0.023
Token it
Token ID340
Feature activation+8.78e-01

Pos logit contributions
iw+0.006
h+0.006
ris+0.006
yl+0.006
ott+0.006
Neg logit contributions
 trigger-0.009
 heads-0.009
 fists-0.009
 noses-0.008
 attention-0.008
Token into
Token ID656
Feature activation-5.53e-01

Pos logit contributions
 trigger+0.023
 heads+0.022
 fists+0.021
rated+0.021
 noses+0.021
Neg logit contributions
iw-0.017
yl-0.016
h-0.016
ott-0.016
on-0.016
Token her
Token ID607
Feature activation-9.87e-01

Pos logit contributions
iw+0.011
h+0.011
ris+0.010
br+0.010
yl+0.010
Neg logit contributions
 heads-0.014
 trigger-0.014
 fists-0.013
 noses-0.013
 faces-0.013
Token pocket
Token ID10000
Feature activation-3.29e+00

Pos logit contributions
h+0.022
iw+0.022
yl+0.022
Happy+0.021
on+0.021
Neg logit contributions
 heads-0.022
 trigger-0.022
 fists-0.021
 noses-0.021
rated-0.020
Token.
Token ID13
Feature activation+5.06e-01

Pos logit contributions
iw+0.052
yl+0.050
Happy+0.049
h+0.049
ott+0.046
Neg logit contributions
 heads-0.097
 trigger-0.096
 fists-0.090
 noses-0.089
rated-0.088
Token
Token ID220
Feature activation-8.29e-01

Pos logit contributions
 trigger+0.014
 heads+0.014
 fists+0.014
 noses+0.013
 faces+0.013
Neg logit contributions
yl-0.008
h-0.008
iw-0.007
br-0.007
ris-0.007
Token
Token ID198
Feature activation+5.71e-01

Pos logit contributions
iw+0.009
h+0.009
yl+0.008
on+0.008
Happy+0.007
Neg logit contributions
 trigger-0.028
 heads-0.027
 fists-0.027
rated-0.026
 noses-0.026
Token
Token ID198
Feature activation-8.37e-01

Pos logit contributions
 trigger+0.017
 heads+0.016
 fists+0.016
rated+0.015
 noses+0.015
Neg logit contributions
iw-0.008
yl-0.007
h-0.007
ott-0.006
br-0.006
TokenM
Token ID44
Feature activation-2.13e+00

Pos logit contributions
iw+0.012
yl+0.011
h+0.011
on+0.011
ott+0.010
Neg logit contributions
 trigger-0.026
 heads-0.025
 fists-0.024
rated-0.024
 noses-0.024
Token was
Token ID373
Feature activation-7.11e-01

Pos logit contributions
iw+0.011
on+0.010
h+0.010
yl+0.010
ott+0.010
Neg logit contributions
 trigger-0.015
 heads-0.014
 fists-0.013
rated-0.013
 noses-0.013
Token a
Token ID257
Feature activation-4.72e-01

Pos logit contributions
iw+0.011
h+0.011
on+0.010
yl+0.010
ott+0.010
Neg logit contributions
 trigger-0.021
 heads-0.021
 fists-0.020
rated-0.020
 noses-0.019
Token really
Token ID1107
Feature activation-1.18e-01

Pos logit contributions
iw+0.009
h+0.009
yl+0.009
on+0.009
ott+0.009
Neg logit contributions
 trigger-0.012
 heads-0.012
 fists-0.011
rated-0.011
 noses-0.011
Token fun
Token ID1257
Feature activation+8.26e-02

Pos logit contributions
iw+0.002
h+0.002
yl+0.002
on+0.002
ott+0.002
Neg logit contributions
 trigger-0.003
 heads-0.003
rated-0.003
 fists-0.003
 noses-0.003
Token day
Token ID1110
Feature activation-8.23e-01

Pos logit contributions
 trigger+0.002
 heads+0.002
 fists+0.002
rated+0.002
 noses+0.002
Neg logit contributions
iw-0.002
h-0.002
on-0.002
yl-0.002
ott-0.002
Token,"
Token ID553
Feature activation-3.02e+00

Pos logit contributions
iw+0.011
ott+0.011
yl+0.010
h+0.010
on+0.009
Neg logit contributions
 heads-0.026
 trigger-0.026
 fists-0.025
rated-0.025
 noses-0.025
Token said
Token ID531
Feature activation-1.69e+00

Pos logit contributions
iw+0.039
h+0.037
yl+0.035
ott+0.034
on+0.034
Neg logit contributions
 trigger-0.098
 heads-0.097
 fists-0.092
 noses-0.092
rated-0.090
Token Ben
Token ID3932
Feature activation-8.78e-01

Pos logit contributions
iw+0.036
h+0.035
ott+0.033
yl+0.032
on+0.032
Neg logit contributions
 trigger-0.042
 heads-0.040
 fists-0.039
 noses-0.037
rated-0.036
Token.
Token ID13
Feature activation-9.59e-01

Pos logit contributions
iw+0.012
Happy+0.010
yl+0.010
ott+0.010
br+0.010
Neg logit contributions
 heads-0.028
 trigger-0.028
 fists-0.027
rated-0.027
 noses-0.026
Token
Token ID198
Feature activation-7.85e-01

Pos logit contributions
iw+0.020
h+0.019
yl+0.019
prise+0.018
ott+0.018
Neg logit contributions
 trigger-0.023
 heads-0.023
 fists-0.022
 noses-0.021
 faces-0.020
Token
Token ID198
Feature activation-1.59e+00

Pos logit contributions
iw+0.011
h+0.009
yl+0.009
br+0.008
ott+0.008
Neg logit contributions
 trigger-0.024
 heads-0.024
 fists-0.022
rated-0.022
 noses-0.022
Token
Token ID198
Feature activation-1.50e+00

Pos logit contributions
iw+0.006
h+0.006
yl+0.006
on+0.005
ott+0.005
Neg logit contributions
 trigger-0.017
 heads-0.016
 fists-0.016
 noses-0.015
 faces-0.015
Token"
Token ID1
Feature activation-9.39e-01

Pos logit contributions
yl+0.025
iw+0.025
ott+0.025
on+0.024
h+0.024
Neg logit contributions
 heads-0.040
 trigger-0.039
 fists-0.036
 noses-0.036
 faces-0.036
TokenI
Token ID40
Feature activation-2.95e-01

Pos logit contributions
on+0.020
iw+0.019
yl+0.019
h+0.019
ott+0.019
Neg logit contributions
 heads-0.021
 trigger-0.021
 noses-0.019
 faces-0.019
 tails-0.018
Token will
Token ID481
Feature activation-1.17e+00

Pos logit contributions
on+0.007
iw+0.007
h+0.007
br+0.007
yl+0.007
Neg logit contributions
 heads-0.005
 trigger-0.005
rated-0.005
 noses-0.005
 fists-0.005
Token,
Token ID11
Feature activation-3.51e-01

Pos logit contributions
iw+0.027
yl+0.025
on+0.025
ott+0.024
Happy+0.024
Neg logit contributions
 heads-0.028
 trigger-0.027
 noses-0.025
 fists-0.025
 faces-0.024
Token Mom
Token ID11254
Feature activation-3.04e+00

Pos logit contributions
iw+0.008
h+0.007
on+0.007
yl+0.007
ott+0.007
Neg logit contributions
 trigger-0.008
 heads-0.008
 fists-0.008
 noses-0.008
rated-0.007
Token.
Token ID13
Feature activation-1.33e+00

Pos logit contributions
yl+0.054
iw+0.053
br+0.052
ott+0.051
Happy+0.051
Neg logit contributions
 heads-0.084
rated-0.077
 attention-0.076
 faces-0.074
 noses-0.074
Token I
Token ID314
Feature activation+3.05e-02

Pos logit contributions
yl+0.028
h+0.028
on+0.025
ott+0.025
iw+0.024
Neg logit contributions
 heads-0.033
 trigger-0.032
 noses-0.031
 fists-0.031
 tails-0.029
Token'm
Token ID1101
Feature activation-7.21e-01

Pos logit contributions
 trigger+0.001
 heads+0.001
rated+0.001
 noses+0.001
 fists+0.001
Neg logit contributions
on-0.001
iw-0.001
h-0.001
yl-0.001
ott-0.001
Token sorry
Token ID7926
Feature activation+3.88e-02

Pos logit contributions
iw+0.020
ris+0.020
yl+0.019
ha+0.019
br+0.019
Neg logit contributions
 fists-0.010
 heads-0.010
 trigger-0.009
 grill-0.009
 attention-0.009
Token,"
Token ID553
Feature activation-2.66e+00

Pos logit contributions
 heads+0.001
 trigger+0.001
 fists+0.001
 noses+0.001
 faces+0.001
Neg logit contributions
iw-0.001
yl-0.000
on-0.000
ott-0.000
ris-0.000
Token the
Token ID262
Feature activation-6.33e-01

Pos logit contributions
iw+0.002
yl+0.002
h+0.002
on+0.002
ott+0.002
Neg logit contributions
 trigger-0.005
rated-0.005
 heads-0.004
 fists-0.004
 noses-0.004
Token wealthy
Token ID11574
Feature activation-2.01e-01

Pos logit contributions
iw+0.013
h+0.012
on+0.012
yl+0.012
Happy+0.012
Neg logit contributions
 trigger-0.016
 heads-0.015
 fists-0.015
rated-0.015
 noses-0.014
Token flower
Token ID15061
Feature activation-1.14e+00

Pos logit contributions
iw+0.004
yl+0.004
h+0.004
on+0.004
Happy+0.004
Neg logit contributions
 trigger-0.005
 fists-0.005
 heads-0.004
rated-0.004
 noses-0.004
Token in
Token ID287
Feature activation-4.33e-01

Pos logit contributions
iw+0.015
yl+0.014
ott+0.014
h+0.014
on+0.013
Neg logit contributions
 heads-0.035
 trigger-0.035
 noses-0.034
 fists-0.033
rated-0.033
Token the
Token ID262
Feature activation-1.53e+00

Pos logit contributions
iw+0.007
h+0.007
on+0.007
ott+0.007
yl+0.007
Neg logit contributions
 trigger-0.012
 heads-0.012
 fists-0.012
 noses-0.011
rated-0.011
Token park
Token ID3952
Feature activation-3.16e+00

Pos logit contributions
iw+0.030
h+0.029
on+0.028
yl+0.028
ott+0.028
Neg logit contributions
 trigger-0.040
 heads-0.038
 fists-0.037
rated-0.036
 noses-0.035
Token.
Token ID13
Feature activation-1.13e+00

Pos logit contributions
iw+0.039
ott+0.036
on+0.035
yl+0.034
h+0.034
Neg logit contributions
 heads-0.104
 trigger-0.103
 noses-0.098
 fists-0.097
rated-0.097
Token<|endoftext|>
Token ID50256
Feature activation-1.53e+00

Pos logit contributions
iw+0.020
h+0.019
yl+0.019
on+0.019
ott+0.018
Neg logit contributions
 trigger-0.031
 heads-0.030
 fists-0.029
rated-0.029
 noses-0.028
TokenOnce
Token ID7454
Feature activation-2.05e-01

Pos logit contributions
iw+0.022
h+0.021
yl+0.020
prise+0.019
ott+0.019
Neg logit contributions
 trigger-0.046
 heads-0.043
 fists-0.041
rated-0.041
 noses-0.040
Token there
Token ID612
Feature activation+9.79e-01

Pos logit contributions
iw+0.005
h+0.004
Happy+0.004
on+0.004
ott+0.004
Neg logit contributions
 trigger-0.004
 heads-0.004
 fists-0.004
rated-0.004
 noses-0.004
Token was
Token ID373
Feature activation-1.17e+00

Pos logit contributions
 trigger+0.028
 heads+0.027
 fists+0.026
rated+0.026
 noses+0.026
Neg logit contributions
iw-0.016
h-0.016
on-0.015
yl-0.015
ott-0.015