TOP ACTIVATIONS
MAX = 0.093

very special toy for L ott ie . It was a
. When D ott ie arrived at the lab
, why can 't the man hear us ?"
!" Turn ing around to face the voice
loyal little girl named L ott ie . Every day ,
didn 't want the old man to know that . He
" Who is Lincoln , Ben ? Do you
One day , L ott ie 's mom went to
?" D ott ie per ked up and
" What 's in that man 's case ?". Her grandmother
did not want the bad man to take her away so
a little bit of coax ing from her dad , she
something sweet or something sav our y ? In the end
if I can give young man some advice ?". Tom nodded
Ben did not like r iddles . They did not like
doll makes a m unch ing sound . Anna smiles .
," she said , scoop ing him up into her arms
the sound of hurry and bust le . G
the lion have a big man e ?" she asked .
After a while , D ott ie was no longer scared

MIN ACTIVATIONS
MIN = -0.108

and put on their p aj amas . They climbed into
and put on their p aj amas . They hugged their
and put on their p aj amas . They get into
They put on their p aj amas and get into their
and put on their p aj amas . Their mom tucked
and put on their p aj amas . They went to
and cozy in her p aj amas , even though she
to wear her new p aj amas and read the book
time , her new p aj amas fit perfectly . "
They put on their p aj amas and went to their
and put on their p aj amas . They each pick
and put on their p aj amas . Uncle Joe showed
and put on their p aj amas . They hug their
and put on their p aj amas . They said good
give them towels and p aj amas . They give them
and changed into her p aj amas . She was so
and put on their p aj amas . They climbed into
She puts on her p aj amas . She hugs her
on dry and cozy p aj amas and sn ugg led
and put on their p aj amas first . They do

INTERVAL 0.042 - 0.093
CONTAINS 0.256%

curious and ran to the man . He opened his bag
Tom 's room . One day , Tom 's mom tells
and Ben didn 't want to nap . They wanted to
the little girl didn 't understand . Tears started rolling down
?" Sam thought for a moment and then said , "

INTERVAL -0.008 - 0.042
CONTAINS 72.494%

stone and it was very big . The little train was
the ball all around the gym and had so much fun
“ Be careful , the branches are very high
. He worked all day , making things prett ier and
<|endoftext|> Once upon a time , there was a silly zero

INTERVAL -0.058 - -0.008
CONTAINS 27.220%

make pictures of animals , flowers and stars . One day
. She grabbed her favorite shirt and raced to store it
, " Thank you , Mia , for helping me find
mom my said , " Ok , Joe . You can
- up . <|endoftext|> Once upon a time , there was

INTERVAL -0.108 - -0.058
CONTAINS 0.029%

k ang aroo . The k ang aroo has a sack
. After their saf ari , they sat down
and vegetables and enjoy their lunch . They are proud of
of the ball . It fl apped its wings and blew
the tree to its once former beauty . He was happy
Token very
Token ID845
Feature activation+3.21e-03

Pos logit contributions
igator+0.025
inn+0.025
 his+0.024
 ang+0.023
His+0.022
Neg logit contributions
ich-0.027
 ourselves-0.026
sters-0.025
lor-0.025
 yourselves-0.025
Token special
Token ID2041
Feature activation+2.41e-02

Pos logit contributions
inn+0.005
igator+0.005
 his+0.005
His+0.004
ny+0.004
Neg logit contributions
ich-0.006
 ourselves-0.005
sters-0.005
lor-0.005
 yourselves-0.005
Token toy
Token ID13373
Feature activation-9.96e-03

Pos logit contributions
igator+0.034
inn+0.033
 his+0.032
ny+0.029
His+0.029
Neg logit contributions
ich-0.045
 ourselves-0.043
sters-0.042
lor-0.042
 yourselves-0.041
Token for
Token ID329
Feature activation-8.38e-03

Pos logit contributions
ich+0.020
aj+0.019
lor+0.019
 ourselves+0.019
 yourselves+0.019
Neg logit contributions
igator-0.012
inn-0.012
 his-0.012
 He-0.011
His-0.011
Token L
Token ID406
Feature activation+7.98e-03

Pos logit contributions
ich+0.019
lor+0.018
aj+0.018
sters+0.018
 ourselves+0.017
Neg logit contributions
 his-0.009
inn-0.009
His-0.007
 ang-0.007
igator-0.007
Tokenott
Token ID1252
Feature activation+9.26e-02

Pos logit contributions
inn+0.009
igator+0.009
 his+0.008
ny+0.008
His+0.008
Neg logit contributions
ich-0.017
 ourselves-0.016
sters-0.016
lor-0.016
aj-0.015
Tokenie
Token ID494
Feature activation+1.24e-02

Pos logit contributions
inn+0.103
igator+0.098
 ang+0.095
 He+0.092
His+0.091
Neg logit contributions
ich-0.193
lor-0.181
aj-0.178
 ourselves-0.176
sters-0.174
Token.
Token ID13
Feature activation+5.56e-03

Pos logit contributions
igator+0.015
inn+0.014
 his+0.014
ny+0.013
His+0.012
Neg logit contributions
ich-0.026
 ourselves-0.025
 yourselves-0.025
lor-0.024
sters-0.024
Token It
Token ID632
Feature activation+1.17e-02

Pos logit contributions
inn+0.007
igator+0.006
 his+0.006
ny+0.006
His+0.006
Neg logit contributions
ich-0.012
 ourselves-0.011
sters-0.011
lor-0.011
aj-0.010
Token was
Token ID373
Feature activation+6.47e-03

Pos logit contributions
igator+0.016
inn+0.016
 his+0.015
ny+0.015
His+0.014
Neg logit contributions
ich-0.022
 ourselves-0.021
sters-0.020
lor-0.020
 yourselves-0.020
Token a
Token ID257
Feature activation-1.88e-02

Pos logit contributions
 his+0.007
inn+0.006
igator+0.006
 He+0.005
His+0.005
Neg logit contributions
ich-0.016
sters-0.015
lor-0.015
 ourselves-0.015
asks-0.014
Token.
Token ID13
Feature activation-8.30e-03

Pos logit contributions
 his+0.003
inn+0.002
His+0.002
igator+0.002
 He+0.002
Neg logit contributions
ich-0.005
sters-0.005
sed-0.005
lor-0.005
asks-0.005
Token
Token ID198
Feature activation-9.47e-03

Pos logit contributions
ich+0.016
 ourselves+0.015
sters+0.015
lor+0.014
 yourselves+0.014
Neg logit contributions
inn-0.012
igator-0.011
 his-0.011
ny-0.010
His-0.010
Token
Token ID198
Feature activation+4.25e-03

Pos logit contributions
ich+0.017
 ourselves+0.016
sters+0.015
lor+0.015
aj+0.015
Neg logit contributions
inn-0.014
igator-0.014
 his-0.013
ny-0.013
 ang-0.013
TokenWhen
Token ID2215
Feature activation+4.96e-03

Pos logit contributions
inn+0.006
igator+0.006
His+0.005
 his+0.005
ny+0.005
Neg logit contributions
ich-0.008
 ourselves-0.008
sters-0.008
aj-0.007
 yourselves-0.007
Token D
Token ID360
Feature activation-2.78e-02

Pos logit contributions
inn+0.007
 his+0.006
igator+0.006
His+0.006
ny+0.006
Neg logit contributions
ich-0.010
lor-0.009
aj-0.009
 ourselves-0.009
sters-0.009
Tokenott
Token ID1252
Feature activation+8.49e-02

Pos logit contributions
ich+0.056
 ourselves+0.053
sters+0.053
aj+0.052
lor+0.051
Neg logit contributions
igator-0.034
inn-0.034
 his-0.030
His-0.029
ny-0.029
Tokenie
Token ID494
Feature activation+2.21e-02

Pos logit contributions
igator+0.096
inn+0.096
 ang+0.085
 trader+0.084
His+0.084
Neg logit contributions
ich-0.177
lor-0.166
aj-0.161
sters-0.161
 ourselves-0.160
Token arrived
Token ID5284
Feature activation+1.96e-03

Pos logit contributions
inn+0.029
igator+0.028
 his+0.027
ny+0.026
His+0.025
Neg logit contributions
ich-0.044
 ourselves-0.042
sters-0.040
 yourselves-0.040
lor-0.039
Token at
Token ID379
Feature activation-4.47e-03

Pos logit contributions
inn+0.003
igator+0.003
 his+0.003
His+0.003
ny+0.003
Neg logit contributions
ich-0.004
 ourselves-0.003
sters-0.003
lor-0.003
aj-0.003
Token the
Token ID262
Feature activation+2.51e-03

Pos logit contributions
ich+0.009
 ourselves+0.009
 yourselves+0.008
lor+0.008
sters+0.008
Neg logit contributions
inn-0.006
igator-0.006
ny-0.005
His-0.005
 ang-0.005
Token lab
Token ID2248
Feature activation+7.50e-03

Pos logit contributions
inn+0.004
igator+0.003
 his+0.003
ny+0.003
His+0.003
Neg logit contributions
ich-0.005
 ourselves-0.004
sters-0.004
lor-0.004
aj-0.004
Token,
Token ID11
Feature activation-2.29e-02

Pos logit contributions
igator+0.002
 his+0.002
inn+0.002
ny+0.002
 He+0.001
Neg logit contributions
ich-0.003
lor-0.003
 ourselves-0.003
 yourselves-0.003
asks-0.003
Token why
Token ID1521
Feature activation-7.71e-03

Pos logit contributions
ich+0.053
lor+0.052
aj+0.051
sed+0.049
sters+0.049
Neg logit contributions
igator-0.024
 his-0.023
 He-0.018
 he-0.017
inn-0.016
Token can
Token ID460
Feature activation+1.06e-02

Pos logit contributions
ich+0.012
lor+0.011
aj+0.011
 ourselves+0.010
sed+0.010
Neg logit contributions
igator-0.014
 his-0.014
 He-0.012
 ang-0.012
ny-0.012
Token't
Token ID470
Feature activation+2.95e-02

Pos logit contributions
igator+0.022
 his+0.022
inn+0.020
His+0.020
 He+0.020
Neg logit contributions
ich-0.013
aj-0.012
lor-0.012
sters-0.011
asks-0.011
Token the
Token ID262
Feature activation+1.20e-02

Pos logit contributions
igator+0.072
 his+0.069
inn+0.067
His+0.065
 He+0.064
Neg logit contributions
aj-0.026
ich-0.026
lor-0.023
sters-0.022
sed-0.021
Token man
Token ID582
Feature activation+8.06e-02

Pos logit contributions
igator+0.019
 his+0.017
inn+0.017
 ang+0.016
ny+0.016
Neg logit contributions
ich-0.021
lor-0.020
 ourselves-0.020
aj-0.020
sters-0.020
Token hear
Token ID3285
Feature activation+2.36e-02

Pos logit contributions
igator+0.109
inn+0.104
 his+0.101
ny+0.095
 ang+0.094
Neg logit contributions
ich-0.155
 ourselves-0.146
lor-0.145
sters-0.144
aj-0.142
Token us
Token ID514
Feature activation+3.09e-02

Pos logit contributions
 his+0.036
igator+0.031
 ang+0.026
His+0.025
 He+0.025
Neg logit contributions
ich-0.048
sed-0.046
sters-0.046
lor-0.045
aj-0.044
Token?"
Token ID1701
Feature activation+1.51e-02

Pos logit contributions
igator+0.046
 his+0.043
inn+0.042
ny+0.040
 ang+0.040
Neg logit contributions
ich-0.056
lor-0.054
sters-0.053
aj-0.051
sed-0.051
Token
Token ID198
Feature activation+8.55e-03

Pos logit contributions
inn+0.026
igator+0.025
ny+0.022
 ang+0.021
 trader+0.021
Neg logit contributions
ich-0.024
 ourselves-0.023
sters-0.022
asks-0.021
aj-0.021
Token
Token ID198
Feature activation+9.88e-03

Pos logit contributions
inn+0.012
igator+0.012
 his+0.011
His+0.011
ny+0.010
Neg logit contributions
ich-0.016
 ourselves-0.015
sters-0.015
lor-0.015
aj-0.015
Token!"
Token ID2474
Feature activation+1.12e-02

Pos logit contributions
ich+0.025
aj+0.024
asks+0.023
 ourselves+0.023
sters+0.023
Neg logit contributions
igator-0.025
inn-0.021
 his-0.021
ny-0.021
 He-0.018
Token
Token ID220
Feature activation+1.52e-02

Pos logit contributions
inn+0.013
 his+0.012
igator+0.012
His+0.011
 He+0.011
Neg logit contributions
ich-0.024
 ourselves-0.023
sters-0.022
lor-0.022
aj-0.022
Token
Token ID198
Feature activation+1.30e-02

Pos logit contributions
inn+0.017
 his+0.016
 He+0.016
His+0.014
igator+0.014
Neg logit contributions
ich-0.033
sters-0.032
 ourselves-0.031
sed-0.030
 yourselves-0.030
Token
Token ID198
Feature activation+1.49e-02

Pos logit contributions
inn+0.018
igator+0.017
 his+0.017
His+0.017
ny+0.016
Neg logit contributions
ich-0.025
 ourselves-0.023
sters-0.023
lor-0.022
asks-0.022
TokenTurn
Token ID17278
Feature activation+2.82e-02

Pos logit contributions
inn+0.023
igator+0.023
 his+0.021
ny+0.020
 ang+0.020
Neg logit contributions
ich-0.026
 ourselves-0.024
sters-0.023
lor-0.023
 yourselves-0.023
Tokening
Token ID278
Feature activation+7.99e-02

Pos logit contributions
inn+0.046
igator+0.046
 his+0.042
ny+0.041
His+0.041
Neg logit contributions
ich-0.046
 ourselves-0.043
sters-0.042
lor-0.041
aj-0.041
Token around
Token ID1088
Feature activation+3.68e-02

Pos logit contributions
inn+0.113
igator+0.110
 ang+0.097
His+0.097
ny+0.095
Neg logit contributions
 ourselves-0.136
ich-0.135
sters-0.127
 yourselves-0.125
lor-0.125
Token to
Token ID284
Feature activation+2.65e-03

Pos logit contributions
inn+0.045
igator+0.044
 his+0.041
ny+0.038
His+0.038
Neg logit contributions
ich-0.077
 ourselves-0.072
sters-0.071
lor-0.070
aj-0.070
Token face
Token ID1986
Feature activation+4.67e-02

Pos logit contributions
inn+0.003
igator+0.003
ny+0.003
His+0.002
 ang+0.002
Neg logit contributions
ich-0.006
 ourselves-0.006
 yourselves-0.005
sters-0.005
lor-0.005
Token the
Token ID262
Feature activation+8.10e-03

Pos logit contributions
inn+0.054
igator+0.054
 his+0.053
His+0.048
ny+0.046
Neg logit contributions
ich-0.099
 ourselves-0.093
sters-0.091
lor-0.091
aj-0.090
Token voice
Token ID3809
Feature activation+4.36e-02

Pos logit contributions
inn+0.012
igator+0.012
 his+0.011
ny+0.011
 ang+0.010
Neg logit contributions
ich-0.015
lor-0.014
 ourselves-0.014
sters-0.014
aj-0.013
Token loyal
Token ID9112
Feature activation-1.14e-02

Pos logit contributions
ich+0.007
sters+0.006
aj+0.006
 ourselves+0.006
 majesty+0.006
Neg logit contributions
inn-0.009
igator-0.008
 aback-0.008
 ang-0.008
 his-0.008
Token little
Token ID1310
Feature activation+9.15e-03

Pos logit contributions
ich+0.017
 ourselves+0.015
sters+0.015
lor+0.015
aj+0.015
Neg logit contributions
inn-0.021
igator-0.020
 his-0.019
His-0.018
 ang-0.018
Token girl
Token ID2576
Feature activation+1.11e-02

Pos logit contributions
inn+0.016
 ang+0.015
His+0.015
igator+0.014
 his+0.014
Neg logit contributions
ich-0.015
sters-0.012
 ourselves-0.012
sed-0.012
 roses-0.011
Token named
Token ID3706
Feature activation-2.51e-02

Pos logit contributions
inn+0.015
igator+0.015
 ang+0.013
 his+0.013
His+0.013
Neg logit contributions
ich-0.021
 ourselves-0.020
sters-0.019
lor-0.019
aj-0.019
Token L
Token ID406
Feature activation+3.82e-03

Pos logit contributions
ich+0.069
sters+0.067
 ourselves+0.065
lor+0.065
 yourselves+0.063
Neg logit contributions
inn-0.017
igator-0.015
ny-0.013
 his-0.013
 He-0.010
Tokenott
Token ID1252
Feature activation+7.74e-02

Pos logit contributions
inn+0.005
igator+0.005
 his+0.004
ny+0.004
His+0.004
Neg logit contributions
ich-0.008
 ourselves-0.007
sters-0.007
 yourselves-0.007
sed-0.007
Tokenie
Token ID494
Feature activation-1.45e-03

Pos logit contributions
inn+0.080
igator+0.075
 ang+0.069
 his+0.066
His+0.066
Neg logit contributions
ich-0.174
lor-0.160
 ourselves-0.159
sters-0.159
aj-0.157
Token.
Token ID13
Feature activation-2.41e-03

Pos logit contributions
ich+0.002
 ourselves+0.002
sters+0.002
aj+0.002
lor+0.002
Neg logit contributions
inn-0.003
igator-0.003
 ang-0.002
 trader-0.002
His-0.002
Token Every
Token ID3887
Feature activation+3.60e-03

Pos logit contributions
ich+0.004
 ourselves+0.004
asks+0.004
aj+0.004
 yourselves+0.004
Neg logit contributions
igator-0.004
inn-0.004
 ang-0.003
 trader-0.003
His-0.003
Token day
Token ID1110
Feature activation+2.67e-02

Pos logit contributions
inn+0.005
igator+0.005
 his+0.004
 ang+0.004
His+0.004
Neg logit contributions
ich-0.007
 ourselves-0.007
lor-0.007
sters-0.006
aj-0.006
Token,
Token ID11
Feature activation+1.73e-02

Pos logit contributions
inn+0.035
 his+0.032
igator+0.032
His+0.030
ny+0.029
Neg logit contributions
ich-0.054
 ourselves-0.051
sters-0.050
lor-0.050
 yourselves-0.049
Token didn
Token ID1422
Feature activation-1.05e-02

Pos logit contributions
inn+0.016
igator+0.016
 ang+0.013
ny+0.013
 trader+0.013
Neg logit contributions
ich-0.015
sters-0.014
lor-0.013
 ourselves-0.013
sed-0.013
Token't
Token ID470
Feature activation+4.45e-03

Pos logit contributions
ich+0.015
 ourselves+0.014
lor+0.013
sters+0.013
 yourselves+0.013
Neg logit contributions
inn-0.020
igator-0.019
 his-0.018
ny-0.018
 ang-0.017
Token want
Token ID765
Feature activation+1.89e-02

Pos logit contributions
inn+0.008
jack+0.007
igator+0.007
penter+0.007
 trader+0.007
Neg logit contributions
 ourselves-0.005
 yourself-0.005
sters-0.005
ich-0.005
 you-0.005
Token the
Token ID262
Feature activation+4.02e-02

Pos logit contributions
inn+0.035
igator+0.034
ny+0.030
His+0.030
 ang+0.029
Neg logit contributions
 ourselves-0.026
ich-0.026
 yourselves-0.024
sters-0.024
asks-0.023
Token old
Token ID1468
Feature activation+5.58e-02

Pos logit contributions
inn+0.072
His+0.060
igator+0.060
ny+0.057
 ang+0.055
Neg logit contributions
ich-0.062
sters-0.058
 ourselves-0.057
 majesty-0.057
 yourselves-0.056
Token man
Token ID582
Feature activation+7.71e-02

Pos logit contributions
inn+0.096
igator+0.083
His+0.077
ny+0.076
 ang+0.075
Neg logit contributions
ich-0.091
sters-0.084
 ourselves-0.081
lor-0.081
 majesty-0.080
Token to
Token ID284
Feature activation+2.65e-02

Pos logit contributions
inn+0.129
igator+0.120
His+0.109
 ang+0.107
ny+0.106
Neg logit contributions
ich-0.125
 ourselves-0.118
sters-0.115
lor-0.110
aj-0.108
Token know
Token ID760
Feature activation-7.97e-03

Pos logit contributions
inn+0.038
igator+0.037
 his+0.034
ny+0.033
His+0.033
Neg logit contributions
ich-0.049
 ourselves-0.046
sters-0.045
lor-0.045
aj-0.044
Token that
Token ID326
Feature activation+2.69e-02

Pos logit contributions
ich+0.017
aj+0.016
 ourselves+0.016
lor+0.016
sters+0.016
Neg logit contributions
 his-0.009
igator-0.009
inn-0.008
His-0.008
 ang-0.008
Token.
Token ID13
Feature activation+1.56e-02

Pos logit contributions
igator+0.022
 his+0.021
inn+0.020
 ang+0.017
His+0.017
Neg logit contributions
ich-0.067
 ourselves-0.063
lor-0.063
sters-0.063
aj-0.062
Token He
Token ID679
Feature activation+1.79e-02

Pos logit contributions
igator+0.017
inn+0.016
 ang+0.012
 his+0.012
ny+0.012
Neg logit contributions
ich-0.033
 ourselves-0.033
sters-0.032
lor-0.032
aj-0.032
Token
Token ID198
Feature activation+5.14e-03

Pos logit contributions
ich+0.014
 ourselves+0.013
sters+0.013
lor+0.012
aj+0.012
Neg logit contributions
inn-0.007
igator-0.007
 his-0.006
His-0.006
ny-0.006
Token
Token ID198
Feature activation+7.17e-03

Pos logit contributions
inn+0.007
igator+0.007
 his+0.007
His+0.006
ny+0.006
Neg logit contributions
ich-0.010
sters-0.009
 ourselves-0.009
lor-0.009
aj-0.009
Token"
Token ID1
Feature activation-7.97e-03

Pos logit contributions
inn+0.011
igator+0.010
 his+0.009
ny+0.009
His+0.009
Neg logit contributions
ich-0.013
 ourselves-0.012
sters-0.012
lor-0.012
aj-0.012
TokenWho
Token ID8241
Feature activation+1.49e-02

Pos logit contributions
ich+0.013
 ourselves+0.012
sters+0.012
 yourselves+0.012
lor+0.012
Neg logit contributions
inn-0.013
igator-0.013
 his-0.012
 ang-0.011
ny-0.011
Token is
Token ID318
Feature activation+2.39e-02

Pos logit contributions
igator+0.025
 his+0.024
ah+0.022
ny+0.021
 weigh+0.020
Neg logit contributions
ich-0.026
aj-0.024
 ourselves-0.023
asks-0.022
lor-0.022
Token Lincoln
Token ID12406
Feature activation+7.60e-02

Pos logit contributions
igator+0.035
inn+0.035
 his+0.033
 ang+0.031
His+0.031
Neg logit contributions
ich-0.043
 ourselves-0.040
aj-0.039
sters-0.039
lor-0.039
Token,
Token ID11
Feature activation+1.55e-02

Pos logit contributions
igator+0.125
inn+0.121
 his+0.118
ny+0.113
 ang+0.110
Neg logit contributions
ich-0.123
 ourselves-0.115
lor-0.112
aj-0.111
sters-0.111
Token Ben
Token ID3932
Feature activation+1.17e-02

Pos logit contributions
igator+0.020
 his+0.020
inn+0.019
His+0.018
 He+0.017
Neg logit contributions
ich-0.031
lor-0.029
 ourselves-0.029
aj-0.028
sters-0.028
Token?
Token ID30
Feature activation+1.00e-02

Pos logit contributions
igator+0.018
inn+0.018
 his+0.017
ny+0.017
 ang+0.016
Neg logit contributions
ich-0.019
 ourselves-0.019
sters-0.018
lor-0.018
asks-0.017
Token Do
Token ID2141
Feature activation-3.57e-02

Pos logit contributions
igator+0.014
inn+0.014
 his+0.013
His+0.013
 He+0.013
Neg logit contributions
ich-0.019
 ourselves-0.017
sters-0.017
aj-0.017
lor-0.017
Token you
Token ID345
Feature activation+1.98e-02

Pos logit contributions
ich+0.032
aj+0.026
 ourselves+0.025
lor+0.025
sters+0.025
Neg logit contributions
 his-0.084
igator-0.084
inn-0.081
His-0.081
 He-0.078
Token
Token ID198
Feature activation+9.45e-04

Pos logit contributions
ich+0.020
 ourselves+0.019
sters+0.019
lor+0.018
aj+0.018
Neg logit contributions
inn-0.016
igator-0.015
 his-0.014
His-0.014
ny-0.014
TokenOne
Token ID3198
Feature activation-6.96e-03

Pos logit contributions
inn+0.001
igator+0.001
 his+0.001
His+0.001
ny+0.001
Neg logit contributions
ich-0.002
sters-0.002
 ourselves-0.002
lor-0.002
aj-0.002
Token day
Token ID1110
Feature activation+3.43e-02

Pos logit contributions
ich+0.012
 ourselves+0.011
aj+0.010
asks+0.010
sters+0.010
Neg logit contributions
inn-0.011
igator-0.010
 ang-0.010
His-0.009
 aback-0.009
Token,
Token ID11
Feature activation+2.94e-02

Pos logit contributions
inn+0.043
 his+0.039
igator+0.038
His+0.036
ny+0.035
Neg logit contributions
ich-0.072
 ourselves-0.067
lor-0.067
sters-0.066
 yourselves-0.064
Token L
Token ID406
Feature activation-7.50e-03

Pos logit contributions
inn+0.042
igator+0.042
 his+0.037
ny+0.037
His+0.036
Neg logit contributions
ich-0.054
 ourselves-0.051
sters-0.050
aj-0.048
lor-0.048
Tokenott
Token ID1252
Feature activation+7.57e-02

Pos logit contributions
ich+0.015
 ourselves+0.015
sters+0.014
lor+0.014
aj+0.014
Neg logit contributions
igator-0.009
inn-0.009
 his-0.008
ny-0.008
His-0.008
Tokenie
Token ID494
Feature activation+1.52e-04

Pos logit contributions
inn+0.079
igator+0.077
 ang+0.067
 his+0.066
His+0.065
Neg logit contributions
ich-0.169
lor-0.156
 ourselves-0.155
sters-0.155
aj-0.153
Token's
Token ID338
Feature activation+1.04e-02

Pos logit contributions
 his+0.000
ny+0.000
igator+0.000
 He+0.000
inn+0.000
Neg logit contributions
 ourselves-0.000
ich-0.000
 yourselves-0.000
lor-0.000
sters-0.000
Token mom
Token ID1995
Feature activation-4.34e-03

Pos logit contributions
igator+0.014
 his+0.014
inn+0.013
 ang+0.012
ny+0.012
Neg logit contributions
 ourselves-0.020
lor-0.019
ich-0.019
 yourselves-0.019
sters-0.018
Token went
Token ID1816
Feature activation+1.36e-02

Pos logit contributions
ich+0.010
 ourselves+0.010
lor+0.009
sters+0.009
 yourselves+0.009
Neg logit contributions
igator-0.004
inn-0.004
 his-0.004
ny-0.004
 ang-0.003
Token to
Token ID284
Feature activation-1.22e-02

Pos logit contributions
inn+0.020
igator+0.020
 his+0.019
ny+0.018
His+0.018
Neg logit contributions
ich-0.025
 ourselves-0.023
sters-0.023
lor-0.022
aj-0.022
Token?"
Token ID1701
Feature activation+1.12e-02

Pos logit contributions
igator+0.001
 his+0.001
ny+0.001
inn+0.001
 He+0.001
Neg logit contributions
ich-0.001
sed-0.001
sters-0.001
aj-0.001
lor-0.001
Token
Token ID220
Feature activation+1.52e-02

Pos logit contributions
igator+0.019
inn+0.019
 ang+0.017
ny+0.017
 his+0.016
Neg logit contributions
ich-0.017
 ourselves-0.016
sters-0.016
lor-0.016
asks-0.015
Token
Token ID198
Feature activation+1.69e-03

Pos logit contributions
inn+0.020
igator+0.018
 his+0.018
 He+0.017
His+0.017
Neg logit contributions
ich-0.030
 ourselves-0.028
sters-0.028
 yourselves-0.027
lor-0.027
Token
Token ID198
Feature activation+1.88e-03

Pos logit contributions
inn+0.002
igator+0.002
 his+0.002
His+0.002
ny+0.002
Neg logit contributions
ich-0.003
 ourselves-0.003
sters-0.003
lor-0.003
aj-0.003
TokenD
Token ID35
Feature activation-2.46e-02

Pos logit contributions
inn+0.003
igator+0.003
 his+0.003
ny+0.003
 ang+0.003
Neg logit contributions
ich-0.003
 ourselves-0.003
sters-0.003
lor-0.003
aj-0.003
Tokenott
Token ID1252
Feature activation+7.55e-02

Pos logit contributions
ich+0.052
 ourselves+0.050
sters+0.049
lor+0.048
aj+0.048
Neg logit contributions
inn-0.029
igator-0.028
 his-0.025
ny-0.024
His-0.024
Tokenie
Token ID494
Feature activation+4.63e-03

Pos logit contributions
inn+0.083
igator+0.080
 ang+0.071
 his+0.070
His+0.069
Neg logit contributions
ich-0.163
lor-0.151
sters-0.149
 ourselves-0.148
aj-0.147
Token per
Token ID583
Feature activation-1.88e-02

Pos logit contributions
inn+0.007
igator+0.007
 his+0.006
 ang+0.006
ny+0.006
Neg logit contributions
ich-0.008
sters-0.008
 ourselves-0.007
lor-0.007
aj-0.007
Tokenked
Token ID9091
Feature activation+1.34e-02

Pos logit contributions
ich+0.033
sters+0.030
 ourselves+0.030
lor+0.029
aj+0.029
Neg logit contributions
igator-0.029
inn-0.028
 his-0.026
His-0.024
 ang-0.024
Token up
Token ID510
Feature activation+8.06e-03

Pos logit contributions
inn+0.019
igator+0.019
ited+0.016
ny+0.015
 trader+0.015
Neg logit contributions
ich-0.023
 ourselves-0.022
sters-0.021
 yourselves-0.020
 rib-0.020
Token and
Token ID290
Feature activation-1.02e-02

Pos logit contributions
igator+0.016
inn+0.015
ny+0.014
ah+0.013
 trader+0.013
Neg logit contributions
ich-0.009
sters-0.009
 ourselves-0.009
sed-0.008
 rib-0.008
Token "
Token ID366
Feature activation+1.37e-02

Pos logit contributions
ich+0.000
 ourselves+0.000
sters+0.000
lor+0.000
aj+0.000
Neg logit contributions
inn-0.000
igator-0.000
ny-0.000
 his-0.000
 ang-0.000
TokenWhat
Token ID2061
Feature activation+7.68e-04

Pos logit contributions
inn+0.014
igator+0.014
His+0.013
 his+0.013
ny+0.012
Neg logit contributions
ich-0.030
 ourselves-0.029
aj-0.028
sters-0.028
sed-0.028
Token's
Token ID338
Feature activation+8.24e-03

Pos logit contributions
 his+0.001
igator+0.001
 He+0.001
ny+0.001
 weigh+0.001
Neg logit contributions
ich-0.001
aj-0.001
lor-0.001
sed-0.001
 ourselves-0.001
Token in
Token ID287
Feature activation+1.79e-02

Pos logit contributions
igator+0.011
inn+0.011
 his+0.011
 ang+0.009
His+0.009
Neg logit contributions
ich-0.016
lor-0.015
aj-0.015
sters-0.015
 ourselves-0.015
Token that
Token ID326
Feature activation+2.05e-02

Pos logit contributions
 his+0.016
igator+0.015
inn+0.015
 ang+0.012
 trader+0.012
Neg logit contributions
ich-0.042
aj-0.041
sters-0.041
lor-0.040
sed-0.040
Token man
Token ID582
Feature activation+7.53e-02

Pos logit contributions
 his+0.033
inn+0.033
igator+0.033
ny+0.032
 ang+0.031
Neg logit contributions
ich-0.031
 ourselves-0.029
aj-0.029
sters-0.029
asks-0.028
Token's
Token ID338
Feature activation-1.98e-03

Pos logit contributions
igator+0.105
 his+0.102
inn+0.097
ny+0.095
 ang+0.093
Neg logit contributions
ich-0.135
 ourselves-0.131
lor-0.130
aj-0.126
sed-0.126
Token case
Token ID1339
Feature activation-5.18e-03

Pos logit contributions
ich+0.004
sters+0.004
 ourselves+0.004
EEP+0.004
we+0.004
Neg logit contributions
 his-0.003
 load-0.002
 He-0.002
 weigh-0.002
 trader-0.002
Token?".
Token ID43634
Feature activation+1.09e-02

Pos logit contributions
ich+0.011
 ourselves+0.010
sters+0.010
 majesty+0.010
aj+0.010
Neg logit contributions
igator-0.007
 his-0.006
 He-0.006
ny-0.006
inn-0.006
Token Her
Token ID2332
Feature activation+2.08e-02

Pos logit contributions
inn+0.018
igator+0.018
ny+0.016
 ang+0.015
 his+0.015
Neg logit contributions
ich-0.018
 ourselves-0.017
lor-0.016
sters-0.016
 yourselves-0.016
Token grandmother
Token ID18410
Feature activation+6.54e-03

Pos logit contributions
inn+0.042
igator+0.042
 his+0.039
ny+0.038
His+0.038
Neg logit contributions
ich-0.026
 ourselves-0.024
sters-0.023
lor-0.022
aj-0.022
Token did
Token ID750
Feature activation+5.40e-03

Pos logit contributions
ich+0.004
 ourselves+0.003
lor+0.003
sters+0.003
 yourselves+0.003
Neg logit contributions
igator-0.002
inn-0.002
 his-0.002
ny-0.002
His-0.002
Token not
Token ID407
Feature activation+1.83e-02

Pos logit contributions
inn+0.012
igator+0.012
ny+0.011
 trader+0.010
jack+0.010
Neg logit contributions
 ourselves-0.006
ich-0.005
 yourselves-0.005
asks-0.005
sters-0.005
Token want
Token ID765
Feature activation+2.12e-02

Pos logit contributions
inn+0.031
igator+0.031
jack+0.031
penter+0.028
His+0.027
Neg logit contributions
 ourselves-0.025
 yourself-0.023
 yourselves-0.023
asks-0.022
onies-0.021
Token the
Token ID262
Feature activation+3.80e-02

Pos logit contributions
inn+0.033
igator+0.033
ny+0.029
His+0.029
 ang+0.029
Neg logit contributions
ich-0.036
 ourselves-0.034
sters-0.033
 yourselves-0.032
lor-0.032
Token bad
Token ID2089
Feature activation+4.32e-02

Pos logit contributions
inn+0.072
igator+0.070
 his+0.064
His+0.063
ny+0.063
Neg logit contributions
ich-0.055
 ourselves-0.049
sters-0.049
lor-0.048
aj-0.047
Token man
Token ID582
Feature activation+7.46e-02

Pos logit contributions
inn+0.073
igator+0.070
His+0.062
ny+0.057
jack+0.057
Neg logit contributions
ich-0.070
sters-0.067
asks-0.063
 ourselves-0.062
aj-0.061
Token to
Token ID284
Feature activation+2.38e-02

Pos logit contributions
inn+0.126
igator+0.118
His+0.107
 ang+0.105
 his+0.104
Neg logit contributions
ich-0.119
 ourselves-0.112
sters-0.109
lor-0.104
aj-0.103
Token take
Token ID1011
Feature activation+3.30e-02

Pos logit contributions
inn+0.034
igator+0.033
 his+0.029
ny+0.029
His+0.029
Neg logit contributions
ich-0.044
 ourselves-0.043
sters-0.041
 yourselves-0.040
lor-0.040
Token her
Token ID607
Feature activation+2.29e-02

Pos logit contributions
inn+0.040
igator+0.039
His+0.033
ny+0.033
 his+0.033
Neg logit contributions
ich-0.067
 ourselves-0.065
sters-0.063
 yourselves-0.062
lor-0.062
Token away
Token ID1497
Feature activation-4.10e-03

Pos logit contributions
 his+0.039
igator+0.037
inn+0.035
 ang+0.035
ny+0.035
Neg logit contributions
ich-0.037
lor-0.036
 ourselves-0.036
 yourselves-0.035
asks-0.034
Token so
Token ID523
Feature activation-5.42e-03

Pos logit contributions
ich+0.007
 ourselves+0.007
sters+0.007
lor+0.007
aj+0.007
Neg logit contributions
inn-0.006
igator-0.006
 his-0.006
ny-0.005
His-0.005
Token a
Token ID257
Feature activation+1.53e-02

Pos logit contributions
inn+0.010
igator+0.010
ny+0.009
His+0.009
 ang+0.009
Neg logit contributions
ich-0.013
 ourselves-0.012
lor-0.012
aj-0.012
 yourselves-0.012
Token little
Token ID1310
Feature activation+2.83e-02

Pos logit contributions
inn+0.022
igator+0.022
 his+0.019
ny+0.018
His+0.018
Neg logit contributions
ich-0.028
aj-0.025
 ourselves-0.025
sters-0.025
lor-0.025
Token bit
Token ID1643
Feature activation+1.32e-02

Pos logit contributions
inn+0.043
igator+0.038
His+0.036
 his+0.034
 ang+0.031
Neg logit contributions
ich-0.049
sed-0.045
 ourselves-0.045
sters-0.045
 rib-0.044
Token of
Token ID286
Feature activation+2.14e-02

Pos logit contributions
igator+0.022
inn+0.021
His+0.019
jack+0.019
ah+0.018
Neg logit contributions
aj-0.017
sters-0.017
 ourselves-0.016
 roses-0.016
asks-0.016
Token coax
Token ID37700
Feature activation+6.44e-03

Pos logit contributions
inn+0.032
igator+0.030
ny+0.028
His+0.027
 his+0.026
Neg logit contributions
ich-0.038
 ourselves-0.037
 yourselves-0.035
sters-0.035
asks-0.034
Tokening
Token ID278
Feature activation+7.39e-02

Pos logit contributions
inn+0.009
igator+0.009
 his+0.009
ny+0.008
His+0.008
Neg logit contributions
ich-0.012
 ourselves-0.011
sters-0.011
lor-0.011
aj-0.011
Token from
Token ID422
Feature activation+2.44e-02

Pos logit contributions
inn+0.099
igator+0.097
 his+0.087
ny+0.085
His+0.084
Neg logit contributions
ich-0.143
 ourselves-0.135
sters-0.132
lor-0.130
aj-0.129
Token her
Token ID607
Feature activation+1.46e-02

Pos logit contributions
inn+0.027
igator+0.027
 his+0.025
 ang+0.022
ny+0.022
Neg logit contributions
ich-0.053
 ourselves-0.050
lor-0.050
sters-0.050
aj-0.049
Token dad
Token ID9955
Feature activation+2.65e-02

Pos logit contributions
 his+0.019
 ang+0.016
inn+0.015
igator+0.015
 trader+0.015
Neg logit contributions
lor-0.029
ich-0.028
 yourselves-0.027
 ourselves-0.027
asks-0.027
Token,
Token ID11
Feature activation+5.82e-03

Pos logit contributions
inn+0.036
igator+0.036
 his+0.034
ny+0.033
His+0.031
Neg logit contributions
ich-0.051
 ourselves-0.047
 yourselves-0.046
lor-0.046
sters-0.046
Token she
Token ID673
Feature activation-8.98e-03

Pos logit contributions
inn+0.008
igator+0.008
 his+0.008
His+0.007
 ang+0.007
Neg logit contributions
ich-0.011
 ourselves-0.010
sters-0.010
lor-0.010
aj-0.010
Token something
Token ID1223
Feature activation+1.77e-03

Pos logit contributions
inn+0.018
igator+0.018
 his+0.017
 ang+0.016
ny+0.016
Neg logit contributions
ich-0.027
 ourselves-0.025
sters-0.025
lor-0.025
aj-0.024
Token sweet
Token ID6029
Feature activation+3.61e-02

Pos logit contributions
igator+0.002
 his+0.002
inn+0.002
 ang+0.002
His+0.002
Neg logit contributions
ich-0.004
lor-0.004
sters-0.004
 ourselves-0.003
aj-0.003
Token or
Token ID393
Feature activation+2.26e-02

Pos logit contributions
inn+0.052
igator+0.052
 his+0.049
ny+0.046
 ang+0.045
Neg logit contributions
ich-0.069
lor-0.062
 ourselves-0.062
sters-0.061
aj-0.060
Token something
Token ID1223
Feature activation-3.72e-03

Pos logit contributions
igator+0.031
 his+0.030
inn+0.029
 ang+0.027
His+0.026
Neg logit contributions
ich-0.045
lor-0.042
 ourselves-0.040
sters-0.040
asks-0.039
Token sav
Token ID6799
Feature activation+1.26e-02

Pos logit contributions
ich+0.008
lor+0.007
sters+0.007
aj+0.007
 ourselves+0.007
Neg logit contributions
igator-0.005
 his-0.005
inn-0.004
 ang-0.004
ny-0.004
Tokenour
Token ID454
Feature activation+7.36e-02

Pos logit contributions
inn+0.021
igator+0.020
ny+0.018
 ang+0.018
 his+0.018
Neg logit contributions
ich-0.021
 ourselves-0.020
sters-0.019
lor-0.019
 yourselves-0.019
Tokeny
Token ID88
Feature activation+4.53e-02

Pos logit contributions
igator+0.107
 his+0.102
inn+0.101
ny+0.093
 ang+0.087
Neg logit contributions
ich-0.142
lor-0.128
sters-0.125
sed-0.124
 ourselves-0.123
Token?
Token ID30
Feature activation+9.67e-03

Pos logit contributions
igator+0.062
 his+0.060
inn+0.057
ny+0.052
 He+0.048
Neg logit contributions
ich-0.094
lor-0.085
sters-0.080
sed-0.079
aj-0.078
Token In
Token ID554
Feature activation+5.26e-04

Pos logit contributions
inn+0.012
igator+0.011
 his+0.011
 He+0.011
His+0.010
Neg logit contributions
ich-0.020
 ourselves-0.019
sters-0.018
aj-0.018
 yourselves-0.018
Token the
Token ID262
Feature activation-1.78e-03

Pos logit contributions
inn+0.001
igator+0.001
 his+0.001
 ang+0.000
His+0.000
Neg logit contributions
ich-0.001
sters-0.001
aj-0.001
 ourselves-0.001
asks-0.001
Token end
Token ID886
Feature activation+2.82e-02

Pos logit contributions
ich+0.003
 ourselves+0.003
sters+0.003
lor+0.003
aj+0.003
Neg logit contributions
inn-0.003
igator-0.003
 his-0.002
ny-0.002
 ang-0.002
Token if
Token ID611
Feature activation-4.96e-03

Pos logit contributions
igator+0.027
 his+0.026
inn+0.025
ny+0.024
 ang+0.023
Neg logit contributions
ich-0.034
lor-0.031
sters-0.031
aj-0.031
sed-0.031
Token I
Token ID314
Feature activation+2.18e-02

Pos logit contributions
ich+0.007
aj+0.006
lor+0.006
sters+0.006
asks+0.006
Neg logit contributions
 his-0.010
igator-0.010
 he-0.009
 He-0.009
His-0.009
Token can
Token ID460
Feature activation-1.96e-03

Pos logit contributions
igator+0.030
 his+0.027
inn+0.027
ny+0.027
 He+0.024
Neg logit contributions
ich-0.046
 ourselves-0.042
 yourselves-0.042
lor-0.042
aj-0.041
Token give
Token ID1577
Feature activation+8.67e-03

Pos logit contributions
ich+0.004
aj+0.004
sters+0.004
lor+0.004
 ourselves+0.004
Neg logit contributions
igator-0.002
inn-0.002
 his-0.002
ny-0.002
His-0.002
Token young
Token ID1862
Feature activation+4.74e-02

Pos logit contributions
 his+0.018
igator+0.018
ny+0.017
inn+0.017
 He+0.016
Neg logit contributions
ich-0.011
aj-0.010
lor-0.010
asks-0.010
sed-0.009
Token man
Token ID582
Feature activation+7.30e-02

Pos logit contributions
igator+0.078
 his+0.075
inn+0.071
 ang+0.068
ny+0.068
Neg logit contributions
ich-0.081
lor-0.073
aj-0.072
sters-0.071
 ourselves-0.071
Token some
Token ID617
Feature activation-6.24e-03

Pos logit contributions
igator+0.093
inn+0.089
 his+0.086
ny+0.081
His+0.078
Neg logit contributions
ich-0.148
 ourselves-0.137
sters-0.136
lor-0.135
asks-0.135
Token advice
Token ID5608
Feature activation-6.16e-04

Pos logit contributions
ich+0.013
lor+0.012
sters+0.011
 ourselves+0.011
aj+0.011
Neg logit contributions
igator-0.009
 his-0.009
inn-0.007
ny-0.007
 He-0.007
Token?".
Token ID43634
Feature activation+2.06e-02

Pos logit contributions
ich+0.001
sters+0.001
sed+0.001
aj+0.001
lor+0.001
Neg logit contributions
igator-0.001
 his-0.001
ny-0.001
inn-0.001
 He-0.001
Token Tom
Token ID4186
Feature activation+4.68e-03

Pos logit contributions
igator+0.027
inn+0.026
ah+0.022
 ang+0.021
ny+0.021
Neg logit contributions
 ourselves-0.038
ich-0.037
lor-0.035
aj-0.035
 yourselves-0.035
Token nodded
Token ID14464
Feature activation+2.21e-02

Pos logit contributions
inn+0.007
igator+0.007
 ang+0.006
 trader+0.006
 his+0.006
Neg logit contributions
ich-0.008
sters-0.007
aj-0.007
lor-0.007
 ourselves-0.007
Token Ben
Token ID3932
Feature activation-3.82e-03

Pos logit contributions
igator+0.042
inn+0.042
ny+0.037
His+0.037
jack+0.036
Neg logit contributions
ich-0.034
 ourselves-0.033
sters-0.032
 yourselves-0.030
asks-0.030
Token did
Token ID750
Feature activation-2.80e-03

Pos logit contributions
ich+0.008
 ourselves+0.007
sters+0.007
lor+0.007
aj+0.007
Neg logit contributions
inn-0.005
igator-0.005
 his-0.004
ny-0.004
His-0.004
Token not
Token ID407
Feature activation+1.49e-02

Pos logit contributions
ich+0.004
 ourselves+0.004
sters+0.003
lor+0.003
aj+0.003
Neg logit contributions
inn-0.005
igator-0.005
 his-0.005
His-0.005
 ang-0.005
Token like
Token ID588
Feature activation+3.70e-02

Pos logit contributions
inn+0.026
jack+0.024
 trader+0.022
igator+0.022
 ang+0.022
Neg logit contributions
 ourselves-0.020
onies-0.019
 yourself-0.018
asks-0.018
sters-0.018
Token r
Token ID374
Feature activation+1.07e-02

Pos logit contributions
 his+0.043
igator+0.043
inn+0.041
 ang+0.037
 He+0.035
Neg logit contributions
ich-0.078
lor-0.074
 ourselves-0.073
sters-0.072
aj-0.071
Tokeniddles
Token ID29319
Feature activation+7.28e-02

Pos logit contributions
igator+0.020
 his+0.019
His+0.019
inn+0.019
 trader+0.018
Neg logit contributions
ich-0.016
 yourselves-0.015
asks-0.014
 ourselves-0.014
aj-0.014
Token.
Token ID13
Feature activation+1.80e-02

Pos logit contributions
inn+0.104
igator+0.103
 his+0.095
ny+0.091
His+0.090
Neg logit contributions
ich-0.135
 ourselves-0.126
sters-0.123
lor-0.122
aj-0.121
Token They
Token ID1119
Feature activation-5.02e-03

Pos logit contributions
igator+0.027
inn+0.027
 his+0.024
 ang+0.023
His+0.023
Neg logit contributions
ich-0.031
 ourselves-0.031
sters-0.029
 yourselves-0.029
asks-0.029
Token did
Token ID750
Feature activation-4.74e-03

Pos logit contributions
ich+0.009
 ourselves+0.008
sters+0.008
aj+0.008
lor+0.008
Neg logit contributions
inn-0.007
igator-0.007
 his-0.006
 ang-0.006
ny-0.006
Token not
Token ID407
Feature activation+8.19e-03

Pos logit contributions
ich+0.007
sters+0.006
lor+0.006
 ourselves+0.006
aj+0.006
Neg logit contributions
inn-0.009
igator-0.009
 his-0.008
 ang-0.008
His-0.008
Token like
Token ID588
Feature activation+2.44e-02

Pos logit contributions
inn+0.014
His+0.012
jack+0.012
 ang+0.012
igator+0.012
Neg logit contributions
 ourselves-0.012
 yourselves-0.011
ich-0.011
 yourself-0.011
asks-0.010
Token doll
Token ID3654
Feature activation-2.23e-03

Pos logit contributions
igator+0.008
 his+0.007
inn+0.007
 ang+0.006
ny+0.006
Neg logit contributions
lor-0.012
 ourselves-0.012
ich-0.011
sters-0.011
asks-0.011
Token makes
Token ID1838
Feature activation+8.86e-03

Pos logit contributions
ich+0.004
 ourselves+0.004
lor+0.004
sters+0.004
 yourselves+0.004
Neg logit contributions
igator-0.003
inn-0.003
 his-0.003
ny-0.003
His-0.003
Token a
Token ID257
Feature activation-5.34e-03

Pos logit contributions
inn+0.011
igator+0.011
 his+0.010
ny+0.009
His+0.009
Neg logit contributions
ich-0.018
 ourselves-0.017
sters-0.017
lor-0.017
aj-0.016
Token m
Token ID285
Feature activation-2.89e-02

Pos logit contributions
ich+0.009
 ourselves+0.008
sters+0.008
lor+0.008
aj+0.008
Neg logit contributions
inn-0.009
igator-0.008
 his-0.008
ny-0.007
His-0.007
Tokenunch
Token ID3316
Feature activation+7.73e-03

Pos logit contributions
ich+0.037
 ourselves+0.033
sters+0.032
lor+0.032
aj+0.031
Neg logit contributions
inn-0.059
igator-0.058
 his-0.055
ny-0.054
 ang-0.053
Tokening
Token ID278
Feature activation+7.21e-02

Pos logit contributions
inn+0.013
igator+0.013
 ang+0.012
 trader+0.012
His+0.012
Neg logit contributions
ich-0.012
 ourselves-0.011
sters-0.010
lor-0.010
 yourselves-0.010
Token sound
Token ID2128
Feature activation+3.70e-02

Pos logit contributions
inn+0.111
igator+0.105
 trader+0.099
His+0.095
ny+0.092
Neg logit contributions
 ourselves-0.116
ich-0.111
sters-0.106
 yourselves-0.103
lor-0.101
Token.
Token ID13
Feature activation+8.51e-03

Pos logit contributions
inn+0.058
igator+0.058
ny+0.051
 ang+0.051
His+0.050
Neg logit contributions
ich-0.061
 ourselves-0.059
sters-0.057
lor-0.055
aj-0.055
Token Anna
Token ID11735
Feature activation-8.06e-03

Pos logit contributions
inn+0.010
igator+0.010
 his+0.009
ny+0.009
His+0.009
Neg logit contributions
ich-0.017
 ourselves-0.017
sters-0.016
lor-0.016
aj-0.016
Token smiles
Token ID21845
Feature activation+1.35e-02

Pos logit contributions
ich+0.014
 ourselves+0.013
lor+0.013
 yourselves+0.012
sters+0.012
Neg logit contributions
igator-0.013
inn-0.013
 his-0.012
ny-0.012
 He-0.011
Token.
Token ID13
Feature activation+3.89e-03

Pos logit contributions
inn+0.021
igator+0.020
 his+0.018
ny+0.018
His+0.018
Neg logit contributions
ich-0.024
 ourselves-0.022
sters-0.022
lor-0.021
aj-0.021
Token,"
Token ID553
Feature activation+1.68e-02

Pos logit contributions
inn+0.024
igator+0.023
 his+0.021
His+0.021
ny+0.021
Neg logit contributions
ich-0.025
 ourselves-0.024
sters-0.023
lor-0.022
aj-0.022
Token she
Token ID673
Feature activation+6.44e-03

Pos logit contributions
inn+0.030
igator+0.029
ny+0.027
ah+0.026
 trader+0.025
Neg logit contributions
 ourselves-0.024
ich-0.023
sters-0.023
 yourselves-0.022
asks-0.021
Token said
Token ID531
Feature activation-3.15e-03

Pos logit contributions
igator+0.013
inn+0.013
 trader+0.012
aster+0.012
isel+0.011
Neg logit contributions
sters-0.006
ich-0.006
aj-0.006
sed-0.006
lor-0.006
Token,
Token ID11
Feature activation+1.57e-03

Pos logit contributions
 ourselves+0.004
ich+0.004
sters+0.004
aj+0.004
sed+0.004
Neg logit contributions
igator-0.006
inn-0.006
ah-0.005
ny-0.005
ited-0.005
Token scoop
Token ID23529
Feature activation+8.01e-03

Pos logit contributions
inn+0.002
igator+0.002
 his+0.002
His+0.002
ny+0.002
Neg logit contributions
ich-0.003
 ourselves-0.003
sters-0.003
lor-0.003
aj-0.003
Tokening
Token ID278
Feature activation+7.20e-02

Pos logit contributions
igator+0.016
inn+0.016
His+0.015
 trader+0.015
 ang+0.014
Neg logit contributions
 ourselves-0.009
ich-0.009
sters-0.008
 yourselves-0.008
lor-0.008
Token him
Token ID683
Feature activation+3.04e-02

Pos logit contributions
igator+0.112
inn+0.109
jack+0.098
ah+0.097
penter+0.097
Neg logit contributions
 ourselves-0.114
sters-0.104
ich-0.099
 yourselves-0.096
 roses-0.095
Token up
Token ID510
Feature activation+2.43e-02

Pos logit contributions
inn+0.050
igator+0.050
ah+0.043
 trader+0.042
jack+0.041
Neg logit contributions
 ourselves-0.048
ich-0.047
sters-0.044
 yourselves-0.042
lor-0.042
Token into
Token ID656
Feature activation-5.87e-03

Pos logit contributions
igator+0.041
inn+0.041
ny+0.036
His+0.035
 ang+0.035
Neg logit contributions
ich-0.036
 ourselves-0.036
sters-0.034
aj-0.032
lor-0.032
Token her
Token ID607
Feature activation+4.89e-03

Pos logit contributions
 ourselves+0.011
ich+0.011
 yourselves+0.011
sters+0.010
asks+0.010
Neg logit contributions
inn-0.008
igator-0.008
ny-0.007
ah-0.006
His-0.006
Token arms
Token ID5101
Feature activation+8.78e-03

Pos logit contributions
 his+0.008
igator+0.008
inn+0.008
ny+0.007
 He+0.007
Neg logit contributions
ich-0.008
asks-0.008
 ourselves-0.008
lor-0.007
 yourselves-0.007
Token the
Token ID262
Feature activation+4.45e-03

Pos logit contributions
ich+0.006
sters+0.005
aj+0.005
lor+0.005
 ourselves+0.005
Neg logit contributions
 his-0.003
igator-0.003
inn-0.002
His-0.002
 He-0.002
Token sound
Token ID2128
Feature activation+1.78e-02

Pos logit contributions
igator+0.005
 his+0.005
 ang+0.005
ny+0.005
 He+0.005
Neg logit contributions
lor-0.009
ich-0.008
aj-0.008
 ourselves-0.008
sters-0.008
Token of
Token ID286
Feature activation+4.37e-03

Pos logit contributions
inn+0.029
igator+0.027
ny+0.024
 trader+0.024
His+0.024
Neg logit contributions
ich-0.029
 ourselves-0.027
sters-0.027
asks-0.026
lor-0.026
Token hurry
Token ID23290
Feature activation-1.46e-02

Pos logit contributions
 his+0.005
igator+0.005
 ang+0.004
inn+0.004
 He+0.004
Neg logit contributions
ich-0.009
lor-0.009
aj-0.009
sters-0.009
 ourselves-0.009
Token and
Token ID290
Feature activation-5.71e-03

Pos logit contributions
ich+0.029
sters+0.027
lor+0.027
aj+0.027
 ourselves+0.027
Neg logit contributions
inn-0.019
 his-0.018
igator-0.018
ny-0.017
 ang-0.016
Token bust
Token ID13076
Feature activation+7.20e-02

Pos logit contributions
ich+0.012
lor+0.012
 ourselves+0.011
aj+0.011
sters+0.011
Neg logit contributions
igator-0.007
 his-0.006
inn-0.006
 ang-0.006
His-0.006
Tokenle
Token ID293
Feature activation+1.73e-02

Pos logit contributions
inn+0.096
igator+0.092
His+0.079
 He+0.079
 trader+0.076
Neg logit contributions
ich-0.136
 ourselves-0.132
sters-0.129
 yourselves-0.125
lor-0.125
Token.
Token ID13
Feature activation+6.22e-03

Pos logit contributions
 his+0.021
igator+0.020
inn+0.019
ny+0.018
 ang+0.017
Neg logit contributions
ich-0.037
lor-0.035
aj-0.034
 ourselves-0.034
 yourselves-0.034
Token\n
Token ID198
Feature activation+4.09e-03

Pos logit contributions
inn+0.008
 his+0.007
igator+0.007
ny+0.007
 He+0.007
Neg logit contributions
ich-0.013
 ourselves-0.012
sters-0.012
aj-0.011
lor-0.011
Token\n
Token ID198
Feature activation+1.55e-02

Pos logit contributions
inn+0.005
 his+0.005
His+0.005
igator+0.005
ny+0.005
Neg logit contributions
ich-0.008
sters-0.008
 ourselves-0.008
 yourselves-0.008
aj-0.008
TokenG
Token ID38
Feature activation-5.13e-03

Pos logit contributions
inn+0.024
His+0.022
ny+0.022
 his+0.022
igator+0.021
Neg logit contributions
ich-0.027
 ourselves-0.026
aj-0.025
sters-0.025
 yourselves-0.025
Token the
Token ID262
Feature activation+2.94e-03

Pos logit contributions
igator+0.035
inn+0.033
 his+0.033
His+0.030
 ang+0.030
Neg logit contributions
ich-0.048
 ourselves-0.045
aj-0.044
lor-0.044
sters-0.044
Token lion
Token ID18744
Feature activation+2.02e-02

Pos logit contributions
igator+0.004
 his+0.004
ny+0.004
ah+0.004
 ang+0.004
Neg logit contributions
aj-0.006
lor-0.006
ich-0.005
 ourselves-0.005
uddled-0.005
Token have
Token ID423
Feature activation+1.27e-03

Pos logit contributions
igator+0.030
 his+0.028
inn+0.026
ny+0.026
 ang+0.025
Neg logit contributions
ich-0.038
 ourselves-0.036
lor-0.035
aj-0.035
sters-0.034
Token a
Token ID257
Feature activation-8.61e-03

Pos logit contributions
 his+0.002
igator+0.001
ny+0.001
 He+0.001
His+0.001
Neg logit contributions
ich-0.003
aj-0.003
lor-0.003
sters-0.003
asks-0.003
Token big
Token ID1263
Feature activation+2.25e-02

Pos logit contributions
ich+0.017
 ourselves+0.016
sters+0.016
 yourselves+0.016
lor+0.016
Neg logit contributions
igator-0.011
inn-0.011
 his-0.011
ny-0.010
 ang-0.010
Token man
Token ID582
Feature activation+7.19e-02

Pos logit contributions
igator+0.028
inn+0.027
 his+0.025
ny+0.024
 ang+0.023
Neg logit contributions
ich-0.046
 ourselves-0.044
lor-0.043
sters-0.043
 yourselves-0.042
Tokene
Token ID68
Feature activation+1.11e-02

Pos logit contributions
igator+0.112
inn+0.111
 his+0.105
ny+0.100
 ang+0.098
Neg logit contributions
ich-0.122
 ourselves-0.116
lor-0.112
sters-0.112
aj-0.109
Token?"
Token ID1701
Feature activation+3.12e-03

Pos logit contributions
igator+0.016
 his+0.015
inn+0.014
His+0.014
ny+0.013
Neg logit contributions
 ourselves-0.021
ich-0.020
sters-0.018
 yourselves-0.018
lor-0.018
Token she
Token ID673
Feature activation+1.78e-02

Pos logit contributions
inn+0.005
igator+0.005
ny+0.004
 ang+0.004
ited+0.004
Neg logit contributions
ich-0.005
 ourselves-0.005
 yourselves-0.005
asks-0.005
sters-0.005
Token asked
Token ID1965
Feature activation-1.58e-02

Pos logit contributions
igator+0.019
inn+0.019
 trader+0.015
ny+0.015
jack+0.014
Neg logit contributions
ich-0.039
sters-0.037
 ourselves-0.036
aj-0.036
asks-0.036
Token.
Token ID13
Feature activation+1.54e-02

Pos logit contributions
ich+0.025
 ourselves+0.024
sters+0.023
 yourselves+0.022
aj+0.022
Neg logit contributions
igator-0.027
inn-0.026
ny-0.023
 ang-0.023
His-0.023
Token After
Token ID2293
Feature activation-3.33e-02

Pos logit contributions
ich+0.004
 ourselves+0.004
sters+0.004
 yourselves+0.003
lor+0.003
Neg logit contributions
inn-0.002
igator-0.002
 his-0.002
His-0.002
ny-0.002
Token a
Token ID257
Feature activation+1.52e-02

Pos logit contributions
ich+0.072
aj+0.067
lor+0.066
 ourselves+0.066
sters+0.066
Neg logit contributions
 his-0.039
inn-0.038
igator-0.035
His-0.033
 He-0.032
Token while
Token ID981
Feature activation+3.30e-02

Pos logit contributions
igator+0.022
inn+0.021
 ang+0.019
 his+0.019
jack+0.018
Neg logit contributions
ich-0.026
 ourselves-0.024
lor-0.024
sters-0.023
aj-0.023
Token,
Token ID11
Feature activation+2.05e-02

Pos logit contributions
inn+0.046
igator+0.045
 his+0.041
ny+0.039
His+0.039
Neg logit contributions
ich-0.063
 ourselves-0.059
sters-0.058
lor-0.057
aj-0.057
Token D
Token ID360
Feature activation-3.95e-02

Pos logit contributions
inn+0.026
 his+0.025
igator+0.025
 ang+0.022
His+0.022
Neg logit contributions
ich-0.042
aj-0.039
 ourselves-0.039
lor-0.038
sters-0.038
Tokenott
Token ID1252
Feature activation+7.17e-02

Pos logit contributions
ich+0.084
 ourselves+0.079
sters+0.078
lor+0.077
aj+0.076
Neg logit contributions
inn-0.046
igator-0.045
 his-0.040
ny-0.038
His-0.038
Tokenie
Token ID494
Feature activation-2.62e-03

Pos logit contributions
inn+0.083
igator+0.082
 ang+0.073
 trader+0.073
 his+0.072
Neg logit contributions
ich-0.149
lor-0.139
sters-0.135
 ourselves-0.134
aj-0.133
Token was
Token ID373
Feature activation+9.83e-03

Pos logit contributions
 ourselves+0.006
ich+0.006
 yourselves+0.006
asks+0.005
 ours+0.005
Neg logit contributions
 his-0.003
inn-0.003
ny-0.003
igator-0.003
His-0.003
Token no
Token ID645
Feature activation-9.01e-03

Pos logit contributions
inn+0.014
igator+0.013
 his+0.012
ny+0.012
His+0.012
Neg logit contributions
ich-0.019
 ourselves-0.018
sters-0.017
lor-0.017
aj-0.017
Token longer
Token ID2392
Feature activation-9.53e-03

Pos logit contributions
 ourselves+0.014
asks+0.013
 rib+0.013
ich+0.013
aj+0.013
Neg logit contributions
igator-0.015
jack-0.013
His-0.012
inn-0.012
bo-0.011
Token scared
Token ID12008
Feature activation-3.40e-03

Pos logit contributions
ich+0.020
 ourselves+0.019
sters+0.019
 yourselves+0.018
aj+0.018
Neg logit contributions
inn-0.011
igator-0.011
His-0.009
ny-0.009
 his-0.009
Token and
Token ID290
Feature activation-1.48e-02

Pos logit contributions
ich+0.036
 ourselves+0.033
sters+0.032
asks+0.032
 yourselves+0.032
Neg logit contributions
inn-0.031
igator-0.031
 his-0.030
ny-0.028
His-0.027
Token put
Token ID1234
Feature activation+9.22e-03

Pos logit contributions
ich+0.031
lor+0.029
asks+0.028
 ourselves+0.027
sters+0.027
Neg logit contributions
 his-0.020
igator-0.018
inn-0.017
 ang-0.017
His-0.017
Token on
Token ID319
Feature activation+1.42e-02

Pos logit contributions
inn+0.013
igator+0.013
 his+0.011
ny+0.011
His+0.011
Neg logit contributions
ich-0.017
 ourselves-0.016
sters-0.016
lor-0.016
aj-0.016
Token their
Token ID511
Feature activation+3.72e-03

Pos logit contributions
 his+0.016
inn+0.016
igator+0.015
ny+0.013
 ang+0.013
Neg logit contributions
ich-0.031
lor-0.029
sters-0.029
asks-0.028
 ourselves-0.028
Token p
Token ID279
Feature activation-5.04e-02

Pos logit contributions
 his+0.006
igator+0.006
 ang+0.005
inn+0.005
 He+0.005
Neg logit contributions
lor-0.007
ich-0.007
sters-0.006
 ourselves-0.006
asks-0.005
Tokenaj
Token ID1228
Feature activation-1.08e-01

Pos logit contributions
ich+0.036
 ourselves+0.033
lor+0.026
 yourselves+0.026
sters+0.026
Neg logit contributions
inn-0.132
ny-0.119
igator-0.119
 his-0.119
His-0.116
Tokenamas
Token ID17485
Feature activation+2.68e-03

Pos logit contributions
ich+0.124
 ourselves+0.104
 ours+0.102
 yourselves+0.098
sed+0.095
Neg logit contributions
inn-0.227
His-0.213
 his-0.210
ny-0.209
 He-0.208
Token.
Token ID13
Feature activation-2.88e-03

Pos logit contributions
inn+0.003
igator+0.003
 his+0.003
ny+0.003
His+0.003
Neg logit contributions
ich-0.005
 ourselves-0.005
sters-0.005
 yourselves-0.005
asks-0.005
Token They
Token ID1119
Feature activation-8.88e-03

Pos logit contributions
ich+0.007
sters+0.007
lor+0.007
 ourselves+0.006
ocket+0.006
Neg logit contributions
 He-0.003
inn-0.002
 his-0.002
 His-0.002
His-0.002
Token climbed
Token ID19952
Feature activation-2.09e-04

Pos logit contributions
ich+0.017
 ourselves+0.016
lor+0.016
sters+0.016
 yourselves+0.016
Neg logit contributions
igator-0.012
inn-0.012
 his-0.011
ny-0.011
His-0.011
Token into
Token ID656
Feature activation-8.52e-03

Pos logit contributions
ich+0.000
sters+0.000
 ourselves+0.000
lor+0.000
asks+0.000
Neg logit contributions
inn-0.000
 his-0.000
igator-0.000
ny-0.000
 ang-0.000
Token and
Token ID290
Feature activation-1.77e-02

Pos logit contributions
ich+0.046
 ourselves+0.042
sters+0.041
lor+0.040
asks+0.040
Neg logit contributions
inn-0.043
igator-0.042
 his-0.040
ny-0.039
His-0.038
Token put
Token ID1234
Feature activation+6.08e-03

Pos logit contributions
ich+0.038
lor+0.035
asks+0.033
 ourselves+0.033
sters+0.033
Neg logit contributions
 his-0.023
igator-0.021
inn-0.021
 ang-0.020
His-0.019
Token on
Token ID319
Feature activation+1.17e-02

Pos logit contributions
inn+0.009
igator+0.009
His+0.008
ny+0.008
 his+0.008
Neg logit contributions
ich-0.011
 ourselves-0.010
sters-0.010
 yourselves-0.010
lor-0.010
Token their
Token ID511
Feature activation+2.25e-04

Pos logit contributions
inn+0.014
igator+0.014
 his+0.014
His+0.012
ny+0.012
Neg logit contributions
ich-0.024
 ourselves-0.023
lor-0.022
sters-0.022
asks-0.022
Token p
Token ID279
Feature activation-5.00e-02

Pos logit contributions
 his+0.000
igator+0.000
 He+0.000
 ang+0.000
 cube+0.000
Neg logit contributions
lor-0.000
ich-0.000
sters-0.000
 ourselves-0.000
asks-0.000
Tokenaj
Token ID1228
Feature activation-1.08e-01

Pos logit contributions
ich+0.031
 ourselves+0.028
lor+0.024
 yourselves+0.022
sters+0.020
Neg logit contributions
inn-0.133
igator-0.122
ny-0.122
 his-0.121
ah-0.118
Tokenamas
Token ID17485
Feature activation+1.25e-03

Pos logit contributions
ich+0.125
 ourselves+0.103
 ours+0.103
 yourselves+0.099
asks+0.097
Neg logit contributions
inn-0.217
 his-0.207
ny-0.207
 He-0.205
His-0.205
Token.
Token ID13
Feature activation-9.61e-03

Pos logit contributions
inn+0.002
igator+0.001
 his+0.001
ny+0.001
His+0.001
Neg logit contributions
ich-0.003
 ourselves-0.002
sters-0.002
 yourselves-0.002
asks-0.002
Token They
Token ID1119
Feature activation-1.06e-02

Pos logit contributions
ich+0.023
sters+0.022
lor+0.021
 ourselves+0.021
 yourselves+0.020
Neg logit contributions
inn-0.010
 his-0.010
 He-0.010
His-0.009
 His-0.008
Token hugged
Token ID41130
Feature activation-1.87e-02

Pos logit contributions
ich+0.021
 ourselves+0.020
lor+0.019
sters+0.019
 yourselves+0.019
Neg logit contributions
igator-0.014
inn-0.014
 his-0.013
ny-0.012
His-0.012
Token their
Token ID511
Feature activation+8.18e-03

Pos logit contributions
ich+0.040
lor+0.038
sters+0.037
aj+0.036
 ourselves+0.036
Neg logit contributions
 his-0.023
igator-0.022
inn-0.022
His-0.020
ny-0.019
Token and
Token ID290
Feature activation-1.46e-02

Pos logit contributions
ich+0.038
 ourselves+0.035
sters+0.034
lor+0.034
aj+0.033
Neg logit contributions
inn-0.040
igator-0.039
 his-0.036
ny-0.035
His-0.035
Token put
Token ID1234
Feature activation+8.45e-03

Pos logit contributions
ich+0.033
lor+0.031
sters+0.029
asks+0.029
 ourselves+0.028
Neg logit contributions
 his-0.018
igator-0.017
inn-0.015
His-0.015
 ang-0.015
Token on
Token ID319
Feature activation+1.30e-02

Pos logit contributions
inn+0.013
igator+0.013
His+0.012
ny+0.011
 ang+0.011
Neg logit contributions
ich-0.014
 ourselves-0.014
sters-0.013
 yourselves-0.013
aj-0.013
Token their
Token ID511
Feature activation+1.48e-03

Pos logit contributions
inn+0.019
igator+0.018
ny+0.016
His+0.016
 his+0.016
Neg logit contributions
ich-0.024
 ourselves-0.023
sters-0.022
aj-0.022
 yourselves-0.022
Token p
Token ID279
Feature activation-4.46e-02

Pos logit contributions
 his+0.002
igator+0.002
 ang+0.002
inn+0.002
 His+0.002
Neg logit contributions
lor-0.003
ich-0.003
sters-0.002
 ourselves-0.002
asks-0.002
Tokenaj
Token ID1228
Feature activation-1.05e-01

Pos logit contributions
ich+0.025
 ourselves+0.020
lor+0.018
sters+0.016
 yourselves+0.014
Neg logit contributions
inn-0.122
igator-0.113
 his-0.113
ny-0.111
His-0.110
Tokenamas
Token ID17485
Feature activation+7.26e-03

Pos logit contributions
ich+0.115
 ourselves+0.093
 ours+0.089
sed+0.088
 yourselves+0.086
Neg logit contributions
inn-0.226
 his-0.212
His-0.211
ny-0.210
 He-0.209
Token.
Token ID13
Feature activation+4.12e-04

Pos logit contributions
inn+0.010
igator+0.010
 his+0.009
ny+0.009
His+0.009
Neg logit contributions
ich-0.014
 ourselves-0.013
sters-0.013
lor-0.013
 yourselves-0.013
Token They
Token ID1119
Feature activation-9.62e-03

Pos logit contributions
inn+0.000
 his+0.000
 He+0.000
His+0.000
igator+0.000
Neg logit contributions
ich-0.001
sters-0.001
lor-0.001
 ourselves-0.001
 yourselves-0.001
Token get
Token ID651
Feature activation-1.94e-03

Pos logit contributions
ich+0.020
lor+0.018
 ourselves+0.018
sters+0.018
 yourselves+0.018
Neg logit contributions
igator-0.013
inn-0.012
 his-0.012
ny-0.011
 ang-0.011
Token into
Token ID656
Feature activation-4.21e-03

Pos logit contributions
ich+0.004
 ourselves+0.003
sters+0.003
lor+0.003
aj+0.003
Neg logit contributions
inn-0.003
igator-0.003
 his-0.003
ny-0.002
 ang-0.002
Token They
Token ID1119
Feature activation-1.11e-02

Pos logit contributions
ich+0.014
 ourselves+0.013
sters+0.013
lor+0.012
aj+0.012
Neg logit contributions
inn-0.010
igator-0.009
 his-0.008
ny-0.008
His-0.008
Token put
Token ID1234
Feature activation+7.72e-03

Pos logit contributions
ich+0.022
 ourselves+0.020
lor+0.020
sters+0.019
asks+0.019
Neg logit contributions
igator-0.015
inn-0.015
 his-0.014
ny-0.014
 ang-0.013
Token on
Token ID319
Feature activation+1.80e-02

Pos logit contributions
igator+0.012
inn+0.012
His+0.010
ny+0.010
 trader+0.010
Neg logit contributions
 ourselves-0.013
ich-0.013
sters-0.012
 yourselves-0.012
aj-0.012
Token their
Token ID511
Feature activation+1.24e-02

Pos logit contributions
inn+0.026
igator+0.026
ny+0.022
His+0.022
 ang+0.022
Neg logit contributions
ich-0.033
 ourselves-0.032
sters-0.031
 yourselves-0.030
aj-0.030
Token p
Token ID279
Feature activation-4.41e-02

Pos logit contributions
 his+0.022
 ang+0.016
igator+0.016
His+0.016
 His+0.016
Neg logit contributions
lor-0.024
ich-0.023
 ourselves-0.021
oor-0.021
asks-0.020
Tokenaj
Token ID1228
Feature activation-1.04e-01

Pos logit contributions
ich+0.026
 ourselves+0.025
 yourselves+0.018
lor+0.018
sters+0.017
Neg logit contributions
inn-0.119
 his-0.110
His-0.107
ny-0.107
ah-0.106
Tokenamas
Token ID17485
Feature activation+3.47e-03

Pos logit contributions
ich+0.107
 ourselves+0.093
 ours+0.086
 yourselves+0.082
sed+0.081
Neg logit contributions
inn-0.226
 his-0.214
His-0.213
 He-0.210
ny-0.207
Token and
Token ID290
Feature activation-1.64e-02

Pos logit contributions
inn+0.005
igator+0.005
 his+0.005
ny+0.005
His+0.004
Neg logit contributions
ich-0.006
 ourselves-0.006
sters-0.006
 yourselves-0.006
lor-0.006
Token get
Token ID651
Feature activation-2.19e-03

Pos logit contributions
ich+0.035
lor+0.032
 ourselves+0.031
sters+0.031
asks+0.030
Neg logit contributions
 his-0.021
igator-0.021
inn-0.020
His-0.018
 ang-0.018
Token into
Token ID656
Feature activation-7.95e-04

Pos logit contributions
ich+0.004
lor+0.004
 ourselves+0.004
sters+0.004
asks+0.004
Neg logit contributions
inn-0.003
igator-0.003
 his-0.003
 ang-0.002
His-0.002
Token their
Token ID511
Feature activation-7.81e-03

Pos logit contributions
ich+0.002
 ourselves+0.001
sters+0.001
lor+0.001
aj+0.001
Neg logit contributions
inn-0.001
igator-0.001
 his-0.001
His-0.001
 ang-0.001
Token and
Token ID290
Feature activation-1.28e-02

Pos logit contributions
ich+0.030
 ourselves+0.028
sters+0.027
lor+0.026
 yourselves+0.026
Neg logit contributions
inn-0.030
igator-0.029
 his-0.028
ny-0.027
His-0.027
Token put
Token ID1234
Feature activation+1.07e-02

Pos logit contributions
ich+0.027
lor+0.025
 ourselves+0.024
sters+0.024
asks+0.024
Neg logit contributions
 his-0.016
igator-0.016
inn-0.015
His-0.014
 ang-0.014
Token on
Token ID319
Feature activation+1.36e-02

Pos logit contributions
inn+0.016
igator+0.016
His+0.014
ny+0.014
 ang+0.014
Neg logit contributions
ich-0.018
 ourselves-0.018
sters-0.017
 yourselves-0.017
aj-0.016
Token their
Token ID511
Feature activation+1.24e-02

Pos logit contributions
inn+0.018
igator+0.017
 his+0.016
ny+0.015
His+0.015
Neg logit contributions
ich-0.027
 ourselves-0.025
sters-0.025
lor-0.025
aj-0.024
Token p
Token ID279
Feature activation-4.27e-02

Pos logit contributions
 his+0.021
igator+0.018
 ang+0.017
 His+0.017
His+0.016
Neg logit contributions
lor-0.024
ich-0.023
 ourselves-0.021
sters-0.020
oor-0.019
Tokenaj
Token ID1228
Feature activation-1.02e-01

Pos logit contributions
ich+0.023
 ourselves+0.022
lor+0.016
sters+0.015
 yourselves+0.014
Neg logit contributions
inn-0.117
 his-0.109
igator-0.108
ny-0.107
His-0.106
Tokenamas
Token ID17485
Feature activation+5.25e-03

Pos logit contributions
ich+0.112
 ourselves+0.097
 ours+0.089
 yourselves+0.086
sed+0.085
Neg logit contributions
inn-0.223
His-0.209
 his-0.208
 He-0.205
ny-0.203
Token.
Token ID13
Feature activation-2.31e-03

Pos logit contributions
inn+0.007
igator+0.007
 his+0.006
ny+0.006
His+0.006
Neg logit contributions
ich-0.010
 ourselves-0.010
sters-0.010
 yourselves-0.009
lor-0.009
Token Their
Token ID5334
Feature activation+1.05e-03

Pos logit contributions
ich+0.005
sters+0.005
 ourselves+0.005
lor+0.005
 yourselves+0.005
Neg logit contributions
inn-0.003
 his-0.002
 He-0.002
igator-0.002
His-0.002
Token mom
Token ID1995
Feature activation-9.53e-03

Pos logit contributions
 his+0.001
igator+0.001
 He+0.001
His+0.001
 His+0.001
Neg logit contributions
lor-0.002
ich-0.002
 ourselves-0.002
sters-0.002
oor-0.002
Token tucked
Token ID29779
Feature activation+3.80e-03

Pos logit contributions
ich+0.019
 ourselves+0.018
 yourselves+0.017
sters+0.017
lor+0.017
Neg logit contributions
inn-0.013
igator-0.012
 his-0.012
ny-0.011
His-0.011
Token and
Token ID290
Feature activation-1.31e-02

Pos logit contributions
ich+0.047
sters+0.042
 ourselves+0.042
lor+0.041
asks+0.041
Neg logit contributions
inn-0.034
igator-0.034
 his-0.032
ny-0.030
His-0.030
Token put
Token ID1234
Feature activation+7.92e-03

Pos logit contributions
ich+0.029
lor+0.027
asks+0.026
 ourselves+0.026
sters+0.026
Neg logit contributions
 his-0.016
igator-0.014
inn-0.013
 ang-0.013
His-0.013
Token on
Token ID319
Feature activation+1.03e-02

Pos logit contributions
inn+0.011
igator+0.011
 his+0.010
ny+0.009
His+0.009
Neg logit contributions
ich-0.015
 ourselves-0.014
sters-0.014
lor-0.014
aj-0.014
Token their
Token ID511
Feature activation+8.63e-03

Pos logit contributions
inn+0.012
igator+0.011
 his+0.011
His+0.010
ny+0.010
Neg logit contributions
ich-0.022
sters-0.020
lor-0.020
 ourselves-0.020
asks-0.020
Token p
Token ID279
Feature activation-4.20e-02

Pos logit contributions
 his+0.016
igator+0.013
His+0.012
 His+0.012
 ang+0.012
Neg logit contributions
lor-0.016
ich-0.015
 ourselves-0.014
asks-0.014
 yourselves-0.013
Tokenaj
Token ID1228
Feature activation-1.00e-01

Pos logit contributions
ich+0.029
 ourselves+0.027
lor+0.024
 yourselves+0.021
sters+0.019
Neg logit contributions
inn-0.108
 his-0.099
igator-0.099
ny-0.098
 ang-0.097
Tokenamas
Token ID17485
Feature activation+3.01e-03

Pos logit contributions
ich+0.120
 ours+0.106
 ourselves+0.104
 yourselves+0.100
ocket+0.100
Neg logit contributions
inn-0.195
 his-0.191
 He-0.189
His-0.187
ny-0.185
Token.
Token ID13
Feature activation-8.10e-03

Pos logit contributions
inn+0.004
 his+0.003
igator+0.003
ny+0.003
His+0.003
Neg logit contributions
ich-0.006
sters-0.006
 ourselves-0.006
 yourselves-0.006
asks-0.006
Token They
Token ID1119
Feature activation-9.77e-03

Pos logit contributions
ich+0.018
sters+0.017
 ourselves+0.017
lor+0.016
 yourselves+0.016
Neg logit contributions
inn-0.009
 his-0.009
 He-0.008
igator-0.008
His-0.008
Token went
Token ID1816
Feature activation+6.14e-03

Pos logit contributions
ich+0.020
 ourselves+0.019
lor+0.018
sters+0.018
 yourselves+0.018
Neg logit contributions
igator-0.013
inn-0.012
 his-0.012
ny-0.011
His-0.011
Token to
Token ID284
Feature activation-2.14e-02

Pos logit contributions
inn+0.008
 his+0.008
igator+0.007
ny+0.007
His+0.007
Neg logit contributions
ich-0.013
 ourselves-0.012
sters-0.012
lor-0.011
 yourselves-0.011
Token and
Token ID290
Feature activation+7.71e-05

Pos logit contributions
ich+0.002
lor+0.002
 ourselves+0.002
sters+0.002
 yourselves+0.002
Neg logit contributions
 his-0.001
inn-0.001
igator-0.001
 He-0.001
 ang-0.001
Token cozy
Token ID37438
Feature activation-2.44e-02

Pos logit contributions
 his+0.000
igator+0.000
His+0.000
inn+0.000
 He+0.000
Neg logit contributions
ich-0.000
lor-0.000
asks-0.000
sters-0.000
 ourselves-0.000
Token in
Token ID287
Feature activation-4.02e-03

Pos logit contributions
ich+0.045
 ourselves+0.042
sters+0.041
lor+0.040
aj+0.040
Neg logit contributions
inn-0.036
igator-0.035
 his-0.032
ny-0.031
His-0.031
Token her
Token ID607
Feature activation-1.41e-02

Pos logit contributions
ich+0.009
sters+0.008
lor+0.008
 ourselves+0.008
aj+0.008
Neg logit contributions
inn-0.005
 his-0.004
igator-0.004
His-0.004
 ang-0.004
Token p
Token ID279
Feature activation-4.66e-02

Pos logit contributions
ich+0.023
 ourselves+0.023
lor+0.022
sters+0.022
asks+0.022
Neg logit contributions
 his-0.023
igator-0.022
inn-0.021
 He-0.020
His-0.020
Tokenaj
Token ID1228
Feature activation-1.00e-01

Pos logit contributions
ich+0.032
 ourselves+0.032
 yourselves+0.026
lor+0.026
sters+0.026
Neg logit contributions
inn-0.116
igator-0.107
 his-0.107
ny-0.105
 ang-0.104
Tokenamas
Token ID17485
Feature activation-5.46e-03

Pos logit contributions
ich+0.098
 ourselves+0.088
 ours+0.084
 yourselves+0.084
asks+0.082
Neg logit contributions
inn-0.218
 his-0.213
His-0.208
 He-0.207
ny-0.202
Token,
Token ID11
Feature activation-4.86e-03

Pos logit contributions
ich+0.012
 ourselves+0.011
sters+0.011
 yourselves+0.011
asks+0.011
Neg logit contributions
inn-0.006
 his-0.006
igator-0.006
ny-0.006
His-0.006
Token even
Token ID772
Feature activation+1.15e-02

Pos logit contributions
ich+0.010
lor+0.010
sters+0.010
 ourselves+0.009
asks+0.009
Neg logit contributions
 his-0.007
His-0.006
 he-0.006
 He-0.006
 ang-0.005
Token though
Token ID996
Feature activation-5.77e-03

Pos logit contributions
inn+0.016
igator+0.015
ny+0.014
His+0.014
 his+0.013
Neg logit contributions
ich-0.021
 ourselves-0.020
aj-0.020
sters-0.020
 yourselves-0.019
Token she
Token ID673
Feature activation+1.52e-02

Pos logit contributions
ich+0.014
lor+0.013
sters+0.013
aj+0.013
asks+0.013
Neg logit contributions
 his-0.007
 he-0.005
inn-0.005
igator-0.005
 He-0.004
Token to
Token ID284
Feature activation+1.01e-02

Pos logit contributions
ich+0.004
 ourselves+0.004
lor+0.004
sters+0.004
aj+0.004
Neg logit contributions
inn-0.003
 his-0.002
igator-0.002
ny-0.002
 He-0.002
Token wear
Token ID5806
Feature activation-2.14e-03

Pos logit contributions
inn+0.012
igator+0.012
 his+0.012
His+0.011
ny+0.010
Neg logit contributions
ich-0.021
 ourselves-0.020
sters-0.020
lor-0.020
aj-0.020
Token her
Token ID607
Feature activation+9.72e-03

Pos logit contributions
ich+0.005
 ourselves+0.005
sters+0.005
asks+0.005
aj+0.004
Neg logit contributions
 his-0.002
inn-0.002
 He-0.002
His-0.002
igator-0.002
Token new
Token ID649
Feature activation+1.95e-02

Pos logit contributions
 his+0.017
inn+0.015
igator+0.015
His+0.014
 He+0.014
Neg logit contributions
ich-0.016
 ourselves-0.015
 yourselves-0.014
lor-0.014
sters-0.014
Token p
Token ID279
Feature activation-4.16e-02

Pos logit contributions
 his+0.038
inn+0.037
igator+0.037
His+0.034
ny+0.034
Neg logit contributions
ich-0.026
 ourselves-0.025
sters-0.023
 yourselves-0.023
lor-0.023
Tokenaj
Token ID1228
Feature activation-1.00e-01

Pos logit contributions
 ourselves+0.038
ich+0.034
 yourselves+0.032
lor+0.027
sters+0.026
Neg logit contributions
inn-0.098
igator-0.090
ny-0.088
 his-0.088
ah-0.088
Tokenamas
Token ID17485
Feature activation-6.26e-03

Pos logit contributions
ich+0.100
 ourselves+0.094
 yourselves+0.087
 ours+0.082
sters+0.082
Neg logit contributions
inn-0.220
 his-0.207
His-0.206
 He-0.204
ny-0.203
Token and
Token ID290
Feature activation-7.39e-03

Pos logit contributions
ich+0.012
 ourselves+0.011
sters+0.011
 yourselves+0.011
asks+0.011
Neg logit contributions
inn-0.009
igator-0.008
 his-0.008
ny-0.008
His-0.007
Token read
Token ID1100
Feature activation-6.26e-03

Pos logit contributions
ich+0.019
lor+0.017
 ourselves+0.016
sters+0.016
asks+0.016
Neg logit contributions
 his-0.009
igator-0.006
 he-0.006
 He-0.006
His-0.006
Token the
Token ID262
Feature activation+2.47e-02

Pos logit contributions
ich+0.016
sters+0.014
lor+0.014
sed+0.014
aj+0.014
Neg logit contributions
 his-0.007
inn-0.005
 He-0.005
His-0.005
 he-0.004
Token book
Token ID1492
Feature activation-8.24e-03

Pos logit contributions
 his+0.042
igator+0.039
 ang+0.036
ny+0.035
inn+0.035
Neg logit contributions
ich-0.043
 ourselves-0.040
sters-0.038
 yourselves-0.037
asks-0.036
Token time
Token ID640
Feature activation+9.10e-03

Pos logit contributions
inn+0.009
igator+0.009
 his+0.008
His+0.008
 ang+0.008
Neg logit contributions
ich-0.011
 ourselves-0.011
lor-0.010
sters-0.010
aj-0.010
Token,
Token ID11
Feature activation+6.70e-03

Pos logit contributions
 his+0.010
inn+0.009
igator+0.008
 He+0.008
His+0.008
Neg logit contributions
ich-0.022
 ourselves-0.020
lor-0.020
sters-0.020
 yourselves-0.019
Token her
Token ID607
Feature activation+1.57e-02

Pos logit contributions
inn+0.009
igator+0.009
 his+0.009
His+0.008
 ang+0.008
Neg logit contributions
ich-0.013
 ourselves-0.012
sters-0.012
lor-0.012
aj-0.012
Token new
Token ID649
Feature activation-1.92e-03

Pos logit contributions
 his+0.021
inn+0.016
His+0.015
 ang+0.015
 His+0.015
Neg logit contributions
 ourselves-0.033
lor-0.033
ich-0.032
 yourselves-0.031
sters-0.030
Token p
Token ID279
Feature activation-5.44e-02

Pos logit contributions
ich+0.003
 ourselves+0.003
lor+0.003
sters+0.003
 yourselves+0.003
Neg logit contributions
igator-0.003
 his-0.003
inn-0.003
ny-0.002
 ang-0.002
Tokenaj
Token ID1228
Feature activation-9.97e-02

Pos logit contributions
 ourselves+0.065
ich+0.060
 yourselves+0.056
lor+0.055
sters+0.051
Neg logit contributions
inn-0.113
igator-0.101
ny-0.096
ah-0.096
 his-0.096
Tokenamas
Token ID17485
Feature activation-1.23e-02

Pos logit contributions
 ourselves+0.101
ich+0.100
 ours+0.096
 yourselves+0.091
asks+0.087
Neg logit contributions
inn-0.208
 He-0.195
His-0.194
ny-0.193
 his-0.192
Token fit
Token ID4197
Feature activation+5.14e-03

Pos logit contributions
ich+0.028
 ourselves+0.028
 yourselves+0.027
sters+0.026
 majesty+0.026
Neg logit contributions
inn-0.013
 his-0.013
igator-0.012
ny-0.012
 He-0.011
Token perfectly
Token ID7138
Feature activation+5.27e-03

Pos logit contributions
 his+0.007
inn+0.006
igator+0.005
ny+0.005
 He+0.005
Neg logit contributions
ich-0.011
lor-0.010
sters-0.010
asks-0.010
sed-0.010
Token.
Token ID13
Feature activation+3.34e-04

Pos logit contributions
inn+0.007
 his+0.006
igator+0.006
ny+0.005
His+0.005
Neg logit contributions
ich-0.011
 ourselves-0.010
asks-0.010
 yourselves-0.010
lor-0.010
Token "
Token ID366
Feature activation-8.71e-03

Pos logit contributions
inn+0.000
igator+0.000
 his+0.000
His+0.000
 He+0.000
Neg logit contributions
ich-0.001
 ourselves-0.001
sters-0.001
lor-0.001
 yourselves-0.001
Token They
Token ID1119
Feature activation-9.86e-03

Pos logit contributions
 He+0.003
 his+0.003
inn+0.003
 His+0.003
His+0.002
Neg logit contributions
ich-0.008
sters-0.008
lor-0.008
 ourselves-0.007
ocket-0.007
Token put
Token ID1234
Feature activation+5.32e-03

Pos logit contributions
ich+0.021
 ourselves+0.019
 yourselves+0.019
lor+0.019
sters+0.019
Neg logit contributions
igator-0.012
inn-0.012
 his-0.012
ny-0.011
His-0.011
Token on
Token ID319
Feature activation+1.51e-02

Pos logit contributions
inn+0.008
igator+0.007
His+0.007
ny+0.007
 ang+0.006
Neg logit contributions
ich-0.010
 ourselves-0.010
sters-0.009
 yourselves-0.009
lor-0.009
Token their
Token ID511
Feature activation+1.61e-02

Pos logit contributions
inn+0.018
igator+0.018
 his+0.017
His+0.016
ny+0.016
Neg logit contributions
ich-0.031
 ourselves-0.029
sters-0.029
lor-0.029
asks-0.028
Token p
Token ID279
Feature activation-4.26e-02

Pos logit contributions
 his+0.027
igator+0.023
 ang+0.020
His+0.019
inn+0.019
Neg logit contributions
ich-0.031
lor-0.031
oor-0.027
 ourselves-0.027
 yourselves-0.026
Tokenaj
Token ID1228
Feature activation-9.93e-02

Pos logit contributions
ich+0.030
 ourselves+0.028
 yourselves+0.024
lor+0.024
 unic+0.022
Neg logit contributions
inn-0.109
 his-0.100
ny-0.096
His-0.096
igator-0.095
Tokenamas
Token ID17485
Feature activation+5.07e-03

Pos logit contributions
ich+0.109
 ours+0.096
 ourselves+0.093
 yourselves+0.091
sed+0.085
Neg logit contributions
inn-0.205
His-0.198
 his-0.198
 He-0.193
ny-0.188
Token and
Token ID290
Feature activation-1.68e-02

Pos logit contributions
inn+0.007
 his+0.007
igator+0.007
ny+0.007
His+0.007
Neg logit contributions
ich-0.009
 ourselves-0.009
 yourselves-0.009
sters-0.009
asks-0.008
Token went
Token ID1816
Feature activation+2.89e-03

Pos logit contributions
ich+0.039
lor+0.036
 ourselves+0.035
asks+0.035
 yourselves+0.035
Neg logit contributions
 his-0.019
igator-0.017
inn-0.016
His-0.015
ny-0.015
Token to
Token ID284
Feature activation-2.40e-02

Pos logit contributions
 his+0.004
inn+0.004
igator+0.003
ny+0.003
His+0.003
Neg logit contributions
ich-0.006
 ourselves-0.006
sters-0.006
 yourselves-0.005
lor-0.005
Token their
Token ID511
Feature activation+1.28e-02

Pos logit contributions
ich+0.044
sters+0.042
lor+0.042
aj+0.038
asks+0.038
Neg logit contributions
 his-0.038
inn-0.035
igator-0.034
 He-0.031
 ang-0.031
Token and
Token ID290
Feature activation-1.16e-02

Pos logit contributions
ich+0.042
 ourselves+0.039
sters+0.038
lor+0.038
asks+0.038
Neg logit contributions
inn-0.034
igator-0.033
 his-0.032
ny-0.031
His-0.030
Token put
Token ID1234
Feature activation+6.74e-03

Pos logit contributions
ich+0.025
lor+0.024
sters+0.021
 ourselves+0.021
asks+0.021
Neg logit contributions
 his-0.016
igator-0.015
inn-0.013
His-0.013
 He-0.013
Token on
Token ID319
Feature activation+1.46e-02

Pos logit contributions
inn+0.010
igator+0.009
His+0.008
ny+0.008
 ang+0.008
Neg logit contributions
ich-0.012
 ourselves-0.012
sters-0.011
 yourselves-0.011
aj-0.011
Token their
Token ID511
Feature activation+8.13e-03

Pos logit contributions
inn+0.018
igator+0.018
 his+0.017
ny+0.016
His+0.016
Neg logit contributions
ich-0.030
 ourselves-0.028
sters-0.027
lor-0.027
aj-0.027
Token p
Token ID279
Feature activation-4.29e-02

Pos logit contributions
 his+0.014
igator+0.012
inn+0.010
 He+0.010
 His+0.010
Neg logit contributions
lor-0.016
ich-0.016
 ourselves-0.013
asks-0.013
 yourselves-0.013
Tokenaj
Token ID1228
Feature activation-9.93e-02

Pos logit contributions
ich+0.044
 ourselves+0.041
lor+0.037
 yourselves+0.036
sters+0.033
Neg logit contributions
inn-0.100
igator-0.089
 his-0.088
ny-0.088
 He-0.084
Tokenamas
Token ID17485
Feature activation+6.42e-03

Pos logit contributions
ich+0.106
 ourselves+0.091
 yourselves+0.085
 ours+0.081
asks+0.079
Neg logit contributions
inn-0.217
 his-0.203
His-0.201
ny-0.199
 He-0.199
Token.
Token ID13
Feature activation+9.68e-03

Pos logit contributions
inn+0.009
igator+0.009
 his+0.008
ny+0.008
His+0.008
Neg logit contributions
ich-0.012
 ourselves-0.011
sters-0.011
lor-0.011
 yourselves-0.011
Token They
Token ID1119
Feature activation-8.29e-03

Pos logit contributions
igator+0.014
inn+0.014
 his+0.013
ny+0.012
 ang+0.012
Neg logit contributions
ich-0.017
 ourselves-0.017
sters-0.016
aj-0.016
lor-0.016
Token each
Token ID1123
Feature activation-5.43e-03

Pos logit contributions
ich+0.018
lor+0.016
 ourselves+0.016
sters+0.016
 yourselves+0.016
Neg logit contributions
igator-0.010
inn-0.010
 his-0.010
ny-0.009
 ang-0.008
Token pick
Token ID2298
Feature activation-2.17e-03

Pos logit contributions
ich+0.011
 ourselves+0.010
sters+0.010
lor+0.010
 yourselves+0.010
Neg logit contributions
inn-0.007
igator-0.007
 his-0.006
ny-0.006
His-0.006
Token and
Token ID290
Feature activation-1.07e-02

Pos logit contributions
ich+0.032
 ourselves+0.030
sters+0.029
lor+0.029
 yourselves+0.029
Neg logit contributions
inn-0.029
igator-0.029
 his-0.027
ny-0.026
His-0.026
Token put
Token ID1234
Feature activation+9.42e-03

Pos logit contributions
ich+0.023
lor+0.021
asks+0.020
 ourselves+0.020
sters+0.020
Neg logit contributions
 his-0.014
igator-0.013
inn-0.012
 ang-0.011
His-0.011
Token on
Token ID319
Feature activation+1.90e-02

Pos logit contributions
inn+0.013
igator+0.013
 his+0.011
His+0.011
ny+0.011
Neg logit contributions
ich-0.018
 ourselves-0.017
sters-0.016
lor-0.016
aj-0.016
Token their
Token ID511
Feature activation+1.28e-02

Pos logit contributions
 his+0.023
inn+0.022
igator+0.022
 ang+0.019
ny+0.019
Neg logit contributions
ich-0.040
lor-0.037
sters-0.037
asks-0.036
 ourselves-0.036
Token p
Token ID279
Feature activation-4.43e-02

Pos logit contributions
 his+0.021
igator+0.018
 ang+0.016
 His+0.015
ny+0.015
Neg logit contributions
lor-0.024
ich-0.024
 ourselves-0.021
 yourselves-0.021
asks-0.020
Tokenaj
Token ID1228
Feature activation-9.93e-02

Pos logit contributions
 ourselves+0.028
ich+0.028
 yourselves+0.023
lor+0.021
 unic+0.019
Neg logit contributions
inn-0.117
igator-0.105
 his-0.104
ny-0.104
His-0.102
Tokenamas
Token ID17485
Feature activation+7.62e-03

Pos logit contributions
ich+0.106
 ourselves+0.094
 yourselves+0.089
 ours+0.089
sed+0.081
Neg logit contributions
inn-0.214
His-0.199
 his-0.198
 He-0.195
ny-0.195
Token.
Token ID13
Feature activation+5.90e-03

Pos logit contributions
inn+0.010
igator+0.010
 his+0.009
ny+0.009
His+0.009
Neg logit contributions
ich-0.015
 ourselves-0.014
sters-0.014
 yourselves-0.014
asks-0.013
Token Uncle
Token ID23169
Feature activation-3.89e-03

Pos logit contributions
 his+0.006
 He+0.006
inn+0.006
 His+0.006
His+0.006
Neg logit contributions
ich-0.013
sters-0.012
 ourselves-0.012
lor-0.012
 yourselves-0.011
Token Joe
Token ID5689
Feature activation-8.32e-03

Pos logit contributions
ich+0.008
 ourselves+0.008
lor+0.008
sters+0.008
 yourselves+0.008
Neg logit contributions
igator-0.004
inn-0.004
 his-0.004
ny-0.004
His-0.004
Token showed
Token ID3751
Feature activation+2.07e-02

Pos logit contributions
ich+0.017
 ourselves+0.017
 yourselves+0.016
lor+0.016
asks+0.016
Neg logit contributions
igator-0.010
 his-0.010
inn-0.010
ny-0.009
 He-0.009
Token and
Token ID290
Feature activation-1.20e-02

Pos logit contributions
ich+0.030
 ourselves+0.028
sters+0.027
lor+0.026
aj+0.026
Neg logit contributions
inn-0.035
igator-0.035
 his-0.032
 ang-0.031
ny-0.031
Token put
Token ID1234
Feature activation+8.52e-03

Pos logit contributions
ich+0.027
lor+0.025
 ourselves+0.023
asks+0.023
sters+0.023
Neg logit contributions
 his-0.015
igator-0.014
inn-0.013
His-0.012
 ang-0.012
Token on
Token ID319
Feature activation+1.61e-02

Pos logit contributions
inn+0.013
igator+0.012
His+0.011
ny+0.011
 ang+0.011
Neg logit contributions
 ourselves-0.015
ich-0.015
sters-0.014
 yourselves-0.014
aj-0.013
Token their
Token ID511
Feature activation+1.43e-02

Pos logit contributions
inn+0.020
igator+0.020
 his+0.019
ny+0.017
His+0.017
Neg logit contributions
ich-0.032
 ourselves-0.030
sters-0.030
lor-0.030
aj-0.029
Token p
Token ID279
Feature activation-4.11e-02

Pos logit contributions
 his+0.025
igator+0.021
His+0.018
 ang+0.018
 football+0.018
Neg logit contributions
lor-0.027
ich-0.027
 ourselves-0.023
 yourselves-0.023
oor-0.022
Tokenaj
Token ID1228
Feature activation-9.86e-02

Pos logit contributions
ich+0.026
 ourselves+0.023
 yourselves+0.019
lor+0.019
 unic+0.017
Neg logit contributions
inn-0.108
 his-0.099
ny-0.098
igator-0.098
His-0.096
Tokenamas
Token ID17485
Feature activation+7.58e-03

Pos logit contributions
ich+0.098
 ourselves+0.082
 yourselves+0.077
 ours+0.074
sters+0.073
Neg logit contributions
inn-0.222
 his-0.208
His-0.208
ny-0.204
 He-0.203
Token.
Token ID13
Feature activation+5.24e-03

Pos logit contributions
inn+0.011
igator+0.011
 his+0.010
ny+0.010
His+0.009
Neg logit contributions
ich-0.014
 ourselves-0.013
sters-0.013
lor-0.013
aj-0.012
Token They
Token ID1119
Feature activation-9.12e-03

Pos logit contributions
inn+0.007
igator+0.007
 his+0.006
ny+0.006
His+0.006
Neg logit contributions
ich-0.010
 ourselves-0.010
sters-0.009
lor-0.009
aj-0.009
Token hug
Token ID16225
Feature activation-1.46e-02

Pos logit contributions
ich+0.020
lor+0.018
 ourselves+0.018
sters+0.018
 yourselves+0.018
Neg logit contributions
igator-0.011
 his-0.010
inn-0.010
ny-0.009
 ang-0.009
Token their
Token ID511
Feature activation+1.63e-02

Pos logit contributions
ich+0.033
lor+0.031
sters+0.031
aj+0.030
 ourselves+0.030
Neg logit contributions
 his-0.015
igator-0.015
inn-0.015
ny-0.013
His-0.013
Token and
Token ID290
Feature activation-7.26e-03

Pos logit contributions
ich+0.023
 ourselves+0.022
sters+0.021
lor+0.021
aj+0.021
Neg logit contributions
inn-0.025
igator-0.025
 his-0.023
ny-0.022
His-0.022
Token put
Token ID1234
Feature activation+1.57e-02

Pos logit contributions
ich+0.015
lor+0.014
 ourselves+0.014
asks+0.014
sters+0.014
Neg logit contributions
 his-0.009
igator-0.009
inn-0.009
His-0.008
 ang-0.008
Token on
Token ID319
Feature activation+1.82e-02

Pos logit contributions
inn+0.022
igator+0.022
His+0.020
ny+0.019
 ang+0.019
Neg logit contributions
ich-0.028
 ourselves-0.028
sters-0.026
 yourselves-0.026
lor-0.026
Token their
Token ID511
Feature activation+1.90e-02

Pos logit contributions
inn+0.023
igator+0.022
 his+0.021
His+0.019
ny+0.019
Neg logit contributions
ich-0.037
 ourselves-0.035
sters-0.034
lor-0.034
aj-0.034
Token p
Token ID279
Feature activation-3.77e-02

Pos logit contributions
 his+0.031
igator+0.028
 ang+0.025
His+0.023
 cube+0.023
Neg logit contributions
lor-0.037
ich-0.035
asks-0.031
 ourselves-0.030
 yourselves-0.029
Tokenaj
Token ID1228
Feature activation-9.86e-02

Pos logit contributions
ich+0.028
 ourselves+0.027
lor+0.024
 yourselves+0.023
 unic+0.022
Neg logit contributions
inn-0.094
igator-0.085
 his-0.084
ny-0.082
His-0.082
Tokenamas
Token ID17485
Feature activation+9.39e-03

Pos logit contributions
ich+0.106
 ourselves+0.088
 ours+0.087
sed+0.082
 yourselves+0.082
Neg logit contributions
inn-0.210
His-0.199
 his-0.198
 He-0.193
ny-0.192
Token.
Token ID13
Feature activation-5.04e-03

Pos logit contributions
inn+0.011
 his+0.010
igator+0.010
ny+0.010
His+0.009
Neg logit contributions
ich-0.020
sters-0.019
 ourselves-0.019
 yourselves-0.018
asks-0.018
Token They
Token ID1119
Feature activation-3.66e-03

Pos logit contributions
ich+0.012
sters+0.011
lor+0.010
 ourselves+0.010
 yourselves+0.010
Neg logit contributions
inn-0.006
 his-0.005
 He-0.005
His-0.005
 His-0.004
Token said
Token ID531
Feature activation-3.54e-02

Pos logit contributions
ich+0.008
 ourselves+0.007
lor+0.007
 yourselves+0.007
sters+0.007
Neg logit contributions
igator-0.005
inn-0.005
 his-0.004
ny-0.004
His-0.004
Token good
Token ID922
Feature activation+5.41e-03

Pos logit contributions
ich+0.079
lor+0.071
 ourselves+0.067
 yourselves+0.067
asks+0.067
Neg logit contributions
 his-0.048
igator-0.040
inn-0.038
 He-0.037
His-0.037
Token give
Token ID1577
Feature activation-2.14e-02

Pos logit contributions
ich+0.021
 ourselves+0.019
lor+0.019
sters+0.019
aj+0.018
Neg logit contributions
igator-0.013
inn-0.013
 his-0.012
ny-0.011
 ang-0.011
Token them
Token ID606
Feature activation+2.43e-02

Pos logit contributions
ich+0.044
lor+0.041
 ourselves+0.041
sters+0.040
aj+0.040
Neg logit contributions
inn-0.027
igator-0.027
 his-0.026
ny-0.024
 ang-0.023
Token towels
Token ID38494
Feature activation-1.68e-02

Pos logit contributions
igator+0.034
inn+0.034
 his+0.032
ny+0.030
His+0.030
Neg logit contributions
ich-0.046
lor-0.042
 ourselves-0.042
sters-0.042
aj-0.041
Token and
Token ID290
Feature activation-1.26e-02

Pos logit contributions
ich+0.036
lor+0.033
sters+0.032
 majesty+0.032
 yourselves+0.032
Neg logit contributions
 his-0.022
igator-0.021
ny-0.020
inn-0.020
 He-0.018
Token p
Token ID279
Feature activation-2.82e-02

Pos logit contributions
ich+0.025
lor+0.024
sters+0.021
 ourselves+0.020
asks+0.019
Neg logit contributions
 his-0.020
igator-0.019
 ang-0.017
His-0.017
inn-0.016
Tokenaj
Token ID1228
Feature activation-9.83e-02

Pos logit contributions
ich+0.038
 ourselves+0.035
sters+0.034
aj+0.033
lor+0.033
Neg logit contributions
inn-0.054
igator-0.054
 his-0.050
ny-0.049
His-0.049
Tokenamas
Token ID17485
Feature activation+1.11e-03

Pos logit contributions
ich+0.101
lor+0.078
 ourselves+0.077
sters+0.073
asks+0.073
Neg logit contributions
inn-0.221
ny-0.206
 his-0.205
 He-0.203
His-0.201
Token.
Token ID13
Feature activation-2.70e-03

Pos logit contributions
inn+0.001
 his+0.001
igator+0.001
ny+0.001
His+0.001
Neg logit contributions
ich-0.002
 ourselves-0.002
 yourselves-0.002
sters-0.002
lor-0.002
Token They
Token ID1119
Feature activation-1.07e-02

Pos logit contributions
ich+0.006
sters+0.006
lor+0.006
 ourselves+0.006
aj+0.005
Neg logit contributions
inn-0.003
 his-0.003
 He-0.003
igator-0.003
His-0.003
Token give
Token ID1577
Feature activation-2.18e-02

Pos logit contributions
ich+0.022
lor+0.021
 ourselves+0.020
sters+0.020
 yourselves+0.020
Neg logit contributions
igator-0.014
inn-0.013
 his-0.013
ny-0.012
 ang-0.012
Token them
Token ID606
Feature activation+2.36e-02

Pos logit contributions
ich+0.047
lor+0.044
sters+0.043
 ourselves+0.043
aj+0.042
Neg logit contributions
igator-0.026
inn-0.026
 his-0.026
ny-0.023
 ang-0.022
Token and
Token ID290
Feature activation-1.01e-02

Pos logit contributions
ich+0.043
 ourselves+0.042
 yourselves+0.042
sters+0.041
 majesty+0.040
Neg logit contributions
 his-0.025
inn-0.023
ny-0.022
igator-0.021
 He-0.021
Token changed
Token ID3421
Feature activation+1.06e-02

Pos logit contributions
ich+0.025
lor+0.024
 ourselves+0.023
asks+0.023
sters+0.022
Neg logit contributions
 his-0.012
His-0.008
 he-0.008
igator-0.008
 He-0.007
Token into
Token ID656
Feature activation-1.83e-02

Pos logit contributions
 his+0.013
inn+0.010
igator+0.010
 ang+0.010
 He+0.009
Neg logit contributions
ich-0.024
lor-0.023
sters-0.022
sed-0.022
asks-0.022
Token her
Token ID607
Feature activation-7.65e-03

Pos logit contributions
ich+0.040
sters+0.038
lor+0.038
aj+0.037
 ourselves+0.036
Neg logit contributions
 his-0.021
inn-0.020
igator-0.019
 ang-0.018
His-0.017
Token p
Token ID279
Feature activation-3.96e-02

Pos logit contributions
ich+0.014
lor+0.014
 ourselves+0.013
sters+0.013
asks+0.012
Neg logit contributions
 his-0.012
igator-0.010
 ang-0.010
inn-0.010
 He-0.010
Tokenaj
Token ID1228
Feature activation-9.83e-02

Pos logit contributions
ich+0.028
 ourselves+0.027
sters+0.022
 yourselves+0.021
lor+0.021
Neg logit contributions
inn-0.100
igator-0.095
 his-0.093
 ang-0.091
ny-0.091
Tokenamas
Token ID17485
Feature activation-4.36e-03

Pos logit contributions
ich+0.098
 ourselves+0.092
 yourselves+0.084
 ours+0.083
sed+0.080
Neg logit contributions
inn-0.210
 his-0.202
 He-0.200
ny-0.197
His-0.195
Token.
Token ID13
Feature activation-3.93e-04

Pos logit contributions
ich+0.010
sters+0.009
 ourselves+0.009
 yourselves+0.009
asks+0.009
Neg logit contributions
inn-0.005
 his-0.004
ny-0.004
igator-0.004
 He-0.004
Token She
Token ID1375
Feature activation+5.72e-03

Pos logit contributions
ich+0.001
sters+0.001
 ourselves+0.001
lor+0.001
 yourselves+0.001
Neg logit contributions
 He-0.000
inn-0.000
 his-0.000
igator-0.000
 His-0.000
Token was
Token ID373
Feature activation+9.11e-03

Pos logit contributions
 his+0.007
igator+0.005
inn+0.005
His+0.005
 He+0.005
Neg logit contributions
ich-0.013
 ourselves-0.013
 yourselves-0.013
sters-0.012
asks-0.012
Token so
Token ID523
Feature activation-1.13e-02

Pos logit contributions
inn+0.011
igator+0.010
 his+0.010
 ang+0.009
His+0.009
Neg logit contributions
ich-0.019
sters-0.019
 ourselves-0.018
lor-0.018
asks-0.018
Token and
Token ID290
Feature activation-1.57e-02

Pos logit contributions
ich+0.057
 ourselves+0.054
sters+0.051
 yourselves+0.051
lor+0.051
Neg logit contributions
 his-0.038
inn-0.035
igator-0.033
 He-0.032
ny-0.031
Token put
Token ID1234
Feature activation+9.43e-03

Pos logit contributions
ich+0.033
 ourselves+0.030
sters+0.030
lor+0.030
asks+0.029
Neg logit contributions
igator-0.019
inn-0.019
 his-0.019
His-0.017
ny-0.017
Token on
Token ID319
Feature activation+1.48e-02

Pos logit contributions
inn+0.014
igator+0.013
ny+0.012
His+0.012
 ang+0.012
Neg logit contributions
ich-0.017
 ourselves-0.017
sters-0.016
lor-0.016
 yourselves-0.016
Token their
Token ID511
Feature activation+1.84e-02

Pos logit contributions
inn+0.017
 his+0.017
igator+0.017
His+0.015
ny+0.015
Neg logit contributions
ich-0.031
sters-0.029
 ourselves-0.029
lor-0.029
asks-0.028
Token p
Token ID279
Feature activation-3.76e-02

Pos logit contributions
 his+0.031
igator+0.026
 His+0.024
His+0.023
 He+0.023
Neg logit contributions
ich-0.037
lor-0.035
sters-0.032
 ourselves-0.032
 yourselves-0.030
Tokenaj
Token ID1228
Feature activation-9.80e-02

Pos logit contributions
ich+0.025
 ourselves+0.023
 yourselves+0.017
sters+0.017
lor+0.016
Neg logit contributions
inn-0.100
 his-0.093
igator-0.091
His-0.091
ny-0.090
Tokenamas
Token ID17485
Feature activation+6.83e-03

Pos logit contributions
ich+0.109
 ourselves+0.094
 ours+0.087
 yourselves+0.085
sed+0.083
Neg logit contributions
inn-0.211
 his-0.200
His-0.200
 He-0.196
ny-0.192
Token.
Token ID13
Feature activation+6.23e-03

Pos logit contributions
inn+0.009
igator+0.008
 his+0.008
ny+0.008
His+0.007
Neg logit contributions
ich-0.014
 ourselves-0.013
sters-0.013
 yourselves-0.013
asks-0.013
Token They
Token ID1119
Feature activation-5.90e-03

Pos logit contributions
 He+0.006
 his+0.006
inn+0.006
 His+0.005
His+0.005
Neg logit contributions
ich-0.015
sters-0.015
lor-0.014
 ourselves-0.014
sed-0.013
Token climbed
Token ID19952
Feature activation+2.81e-03

Pos logit contributions
ich+0.012
 ourselves+0.011
lor+0.011
 yourselves+0.011
sters+0.011
Neg logit contributions
igator-0.008
inn-0.007
 his-0.007
ny-0.007
His-0.007
Token into
Token ID656
Feature activation-5.36e-03

Pos logit contributions
inn+0.004
 his+0.004
igator+0.004
ny+0.003
 He+0.003
Neg logit contributions
ich-0.005
 ourselves-0.005
sters-0.005
asks-0.005
lor-0.005
Token She
Token ID1375
Feature activation+7.00e-05

Pos logit contributions
ich+0.027
sters+0.025
 ourselves+0.025
lor+0.025
aj+0.024
Neg logit contributions
inn-0.011
 He-0.011
 his-0.011
igator-0.010
His-0.010
Token puts
Token ID7584
Feature activation+1.28e-02

Pos logit contributions
 his+0.000
igator+0.000
inn+0.000
 He+0.000
 ang+0.000
Neg logit contributions
ich-0.000
 ourselves-0.000
lor-0.000
sters-0.000
 yourselves-0.000
Token on
Token ID319
Feature activation+1.26e-02

Pos logit contributions
inn+0.017
igator+0.017
 his+0.016
ny+0.015
His+0.015
Neg logit contributions
ich-0.024
 ourselves-0.023
sters-0.023
lor-0.022
aj-0.022
Token her
Token ID607
Feature activation+1.41e-02

Pos logit contributions
inn+0.014
 his+0.014
igator+0.013
 ang+0.012
His+0.012
Neg logit contributions
ich-0.028
sters-0.026
 ourselves-0.026
lor-0.026
asks-0.025
Token p
Token ID279
Feature activation-3.49e-02

Pos logit contributions
 his+0.024
igator+0.019
His+0.019
 He+0.019
 His+0.019
Neg logit contributions
ich-0.025
 ourselves-0.025
lor-0.025
asks-0.023
 yourselves-0.023
Tokenaj
Token ID1228
Feature activation-9.79e-02

Pos logit contributions
ich+0.020
 ourselves+0.018
lor+0.015
 yourselves+0.013
sters+0.012
Neg logit contributions
inn-0.095
 his-0.088
ny-0.088
igator-0.087
His-0.085
Tokenamas
Token ID17485
Feature activation+2.22e-03

Pos logit contributions
ich+0.109
 ourselves+0.095
 ours+0.091
 yourselves+0.088
asks+0.086
Neg logit contributions
inn-0.204
 his-0.199
 He-0.197
His-0.196
ny-0.192
Token.
Token ID13
Feature activation-7.33e-03

Pos logit contributions
inn+0.003
 his+0.003
igator+0.003
ny+0.003
His+0.003
Neg logit contributions
ich-0.004
 ourselves-0.004
sters-0.004
 yourselves-0.004
asks-0.004
Token She
Token ID1375
Feature activation+3.99e-03

Pos logit contributions
ich+0.018
sters+0.017
 ourselves+0.017
lor+0.017
 yourselves+0.016
Neg logit contributions
 He-0.007
 his-0.007
inn-0.006
His-0.006
 His-0.006
Token hugs
Token ID40657
Feature activation-2.15e-02

Pos logit contributions
 his+0.006
igator+0.005
inn+0.005
 He+0.005
 ang+0.005
Neg logit contributions
ich-0.008
 ourselves-0.007
lor-0.007
sters-0.007
 yourselves-0.007
Token her
Token ID607
Feature activation+8.53e-03

Pos logit contributions
ich+0.047
 ourselves+0.044
sters+0.044
lor+0.044
aj+0.044
Neg logit contributions
inn-0.023
 his-0.023
igator-0.023
His-0.020
ny-0.020
Token on
Token ID319
Feature activation+1.82e-03

Pos logit contributions
inn+0.028
igator+0.028
 his+0.027
ny+0.023
 ang+0.023
Neg logit contributions
ich-0.055
sters-0.051
lor-0.051
 ourselves-0.050
asks-0.050
Token dry
Token ID5894
Feature activation+1.49e-03

Pos logit contributions
 his+0.002
igator+0.001
inn+0.001
His+0.001
 He+0.001
Neg logit contributions
ich-0.005
lor-0.004
sters-0.004
asks-0.004
aj-0.004
Token and
Token ID290
Feature activation-5.10e-03

Pos logit contributions
 his+0.003
igator+0.002
inn+0.002
 He+0.002
ny+0.002
Neg logit contributions
ich-0.003
lor-0.002
sters-0.002
sed-0.002
asks-0.002
Token cozy
Token ID37438
Feature activation-2.89e-02

Pos logit contributions
ich+0.011
lor+0.011
asks+0.010
sters+0.010
 ourselves+0.009
Neg logit contributions
 his-0.007
igator-0.006
inn-0.006
 ang-0.005
His-0.005
Token p
Token ID279
Feature activation-3.27e-02

Pos logit contributions
ich+0.052
sters+0.047
lor+0.046
asks+0.044
ige+0.042
Neg logit contributions
 his-0.048
igator-0.043
inn-0.042
ny-0.040
 He-0.039
Tokenaj
Token ID1228
Feature activation-9.78e-02

Pos logit contributions
ich+0.034
lor+0.030
 ourselves+0.030
sters+0.027
 yourselves+0.027
Neg logit contributions
inn-0.072
igator-0.066
ny-0.064
 ang-0.064
 his-0.063
Tokenamas
Token ID17485
Feature activation-3.18e-04

Pos logit contributions
ich+0.112
 ours+0.105
ocket+0.101
 ourselves+0.096
 privacy+0.095
Neg logit contributions
inn-0.190
 He-0.190
 his-0.188
ny-0.184
His-0.181
Token and
Token ID290
Feature activation-6.45e-03

Pos logit contributions
ich+0.001
sters+0.001
 yourselves+0.001
 ourselves+0.001
asks+0.001
Neg logit contributions
 his-0.000
inn-0.000
ny-0.000
 He-0.000
igator-0.000
Token sn
Token ID3013
Feature activation-1.26e-02

Pos logit contributions
ich+0.015
lor+0.014
sters+0.013
asks+0.013
oor+0.012
Neg logit contributions
 his-0.009
igator-0.007
 he-0.007
 He-0.007
 ang-0.006
Tokenugg
Token ID6837
Feature activation-1.85e-02

Pos logit contributions
ich+0.020
sters+0.019
 ourselves+0.019
sed+0.018
asks+0.017
Neg logit contributions
inn-0.022
ny-0.019
igator-0.019
 his-0.019
His-0.018
Tokenled
Token ID992
Feature activation+1.03e-02

Pos logit contributions
ich+0.041
sters+0.037
sed+0.037
 ourselves+0.036
lor+0.036
Neg logit contributions
inn-0.021
 his-0.021
ny-0.019
igator-0.018
 He-0.017
Token and
Token ID290
Feature activation-3.12e-03

Pos logit contributions
ich+0.022
 ourselves+0.021
sters+0.021
lor+0.020
aj+0.020
Neg logit contributions
inn-0.020
igator-0.020
 his-0.018
ny-0.018
His-0.018
Token put
Token ID1234
Feature activation+9.54e-03

Pos logit contributions
ich+0.007
lor+0.007
sters+0.006
asks+0.006
 ourselves+0.006
Neg logit contributions
 his-0.004
igator-0.003
His-0.003
inn-0.003
 He-0.003
Token on
Token ID319
Feature activation+1.24e-02

Pos logit contributions
inn+0.014
igator+0.013
 his+0.012
ny+0.012
His+0.012
Neg logit contributions
ich-0.018
 ourselves-0.017
sters-0.016
lor-0.016
aj-0.016
Token their
Token ID511
Feature activation+1.24e-02

Pos logit contributions
 his+0.014
inn+0.014
igator+0.013
His+0.012
 ang+0.011
Neg logit contributions
ich-0.028
lor-0.025
sters-0.025
asks-0.025
 ourselves-0.024
Token p
Token ID279
Feature activation-3.70e-02

Pos logit contributions
 his+0.022
igator+0.018
ny+0.017
His+0.016
 He+0.016
Neg logit contributions
ich-0.023
lor-0.023
sters-0.019
 ourselves-0.019
oor-0.019
Tokenaj
Token ID1228
Feature activation-9.78e-02

Pos logit contributions
ich+0.024
 ourselves+0.022
lor+0.017
 yourselves+0.017
sters+0.016
Neg logit contributions
inn-0.099
 his-0.090
ny-0.089
igator-0.089
His-0.087
Tokenamas
Token ID17485
Feature activation+7.71e-03

Pos logit contributions
ich+0.104
 ourselves+0.088
 yourselves+0.080
sters+0.078
sed+0.078
Neg logit contributions
inn-0.218
 his-0.204
His-0.203
ny-0.200
 He-0.199
Token first
Token ID717
Feature activation-7.00e-03

Pos logit contributions
inn+0.011
igator+0.010
 his+0.010
ny+0.009
His+0.009
Neg logit contributions
ich-0.015
 ourselves-0.014
sters-0.014
lor-0.013
 yourselves-0.013
Token.
Token ID13
Feature activation+2.21e-03

Pos logit contributions
ich+0.017
lor+0.016
sters+0.016
 ourselves+0.015
asks+0.015
Neg logit contributions
 his-0.008
inn-0.007
igator-0.006
ny-0.006
 He-0.006
Token They
Token ID1119
Feature activation-7.41e-03

Pos logit contributions
igator+0.003
inn+0.003
ny+0.002
 ang+0.002
 his+0.002
Neg logit contributions
ich-0.005
 ourselves-0.004
asks-0.004
aj-0.004
 yourselves-0.004
Token do
Token ID466
Feature activation-1.33e-02

Pos logit contributions
ich+0.015
 ourselves+0.014
lor+0.014
sters+0.014
 yourselves+0.013
Neg logit contributions
igator-0.010
inn-0.009
 his-0.009
ny-0.008
 ang-0.008
Token curious
Token ID11040
Feature activation+2.50e-03

Pos logit contributions
inn+0.016
igator+0.015
 his+0.015
 ang+0.013
His+0.013
Neg logit contributions
ich-0.029
 ourselves-0.028
lor-0.028
sters-0.028
asks-0.027
Token and
Token ID290
Feature activation+6.39e-03

Pos logit contributions
inn+0.004
igator+0.004
His+0.003
ny+0.003
 ang+0.003
Neg logit contributions
ich-0.004
 ourselves-0.004
sters-0.004
aj-0.004
asks-0.004
Token ran
Token ID4966
Feature activation+1.95e-03

Pos logit contributions
inn+0.010
igator+0.009
ny+0.008
 ang+0.008
His+0.008
Neg logit contributions
ich-0.011
 ourselves-0.011
sters-0.011
asks-0.010
aj-0.010
Token to
Token ID284
Feature activation-1.06e-02

Pos logit contributions
igator+0.004
ited+0.004
emaker+0.004
ah+0.004
inn+0.004
Neg logit contributions
 ourselves-0.002
ich-0.002
sters-0.002
 roses-0.001
 flowers-0.001
Token the
Token ID262
Feature activation-1.89e-04

Pos logit contributions
 ourselves+0.019
ich+0.018
 yourselves+0.017
asks+0.016
sters+0.016
Neg logit contributions
inn-0.017
igator-0.015
ny-0.014
His-0.014
ited-0.013
Token man
Token ID582
Feature activation+4.94e-02

Pos logit contributions
 ourselves+0.000
asks+0.000
lor+0.000
ich+0.000
aj+0.000
Neg logit contributions
 his-0.000
igator-0.000
ny-0.000
 ang-0.000
 trader-0.000
Token.
Token ID13
Feature activation+1.73e-02

Pos logit contributions
inn+0.079
igator+0.077
His+0.069
 his+0.069
 ang+0.069
Neg logit contributions
ich-0.082
 ourselves-0.078
sters-0.076
aj-0.073
lor-0.073
Token He
Token ID679
Feature activation+4.85e-03

Pos logit contributions
inn+0.023
igator+0.023
ited+0.020
ah+0.018
emaker+0.018
Neg logit contributions
asks-0.030
ich-0.030
aj-0.030
 ourselves-0.030
 yourselves-0.030
Token opened
Token ID4721
Feature activation+1.50e-02

Pos logit contributions
inn+0.007
igator+0.006
 his+0.006
ny+0.006
His+0.006
Neg logit contributions
ich-0.009
 ourselves-0.009
sters-0.009
lor-0.009
aj-0.008
Token his
Token ID465
Feature activation+1.21e-02

Pos logit contributions
igator+0.014
inn+0.014
ny+0.012
ited+0.011
 ang+0.011
Neg logit contributions
 ourselves-0.033
ich-0.033
 yourselves-0.032
sters-0.031
aj-0.031
Token bag
Token ID6131
Feature activation-2.35e-02

Pos logit contributions
 his+0.023
inn+0.022
 ang+0.021
igator+0.021
ny+0.020
Neg logit contributions
ich-0.017
 ourselves-0.017
 yourselves-0.016
lor-0.016
asks-0.015
Token Tom
Token ID4186
Feature activation-9.58e-03

Pos logit contributions
 ourselves+0.024
ich+0.024
 yourselves+0.022
lor+0.022
sters+0.022
Neg logit contributions
igator-0.017
inn-0.017
 ang-0.014
ny-0.014
His-0.014
Token's
Token ID338
Feature activation-2.46e-02

Pos logit contributions
ich+0.017
lor+0.016
 ourselves+0.016
sters+0.016
asks+0.016
Neg logit contributions
ny-0.015
 his-0.015
igator-0.014
inn-0.014
 He-0.013
Token room
Token ID2119
Feature activation-1.42e-02

Pos logit contributions
ich+0.032
 ourselves+0.030
sters+0.028
aj+0.028
lor+0.028
Neg logit contributions
inn-0.049
igator-0.048
 his-0.044
ny-0.044
His-0.044
Token.
Token ID13
Feature activation+3.41e-03

Pos logit contributions
ich+0.032
lor+0.029
 ourselves+0.029
sters+0.028
 yourselves+0.028
Neg logit contributions
 his-0.015
inn-0.015
igator-0.015
His-0.013
 He-0.013
Token One
Token ID1881
Feature activation-1.78e-02

Pos logit contributions
igator+0.006
inn+0.006
ited+0.006
 ang+0.006
 discouraged+0.005
Neg logit contributions
asks-0.005
ich-0.005
 ourselves-0.005
aj-0.005
 yourselves-0.004
Token day
Token ID1110
Feature activation+4.40e-02

Pos logit contributions
ich+0.028
 ourselves+0.026
aj+0.025
asks+0.025
 yourselves+0.024
Neg logit contributions
inn-0.029
 ang-0.027
igator-0.026
 aback-0.026
 trader-0.025
Token,
Token ID11
Feature activation+3.37e-02

Pos logit contributions
inn+0.057
igator+0.054
 his+0.052
His+0.048
ny+0.047
Neg logit contributions
ich-0.090
 ourselves-0.084
lor-0.083
sters-0.082
aj-0.081
Token Tom
Token ID4186
Feature activation-1.01e-02

Pos logit contributions
inn+0.057
igator+0.057
ny+0.052
 ang+0.050
His+0.048
Neg logit contributions
ich-0.052
 ourselves-0.051
sters-0.048
 yourselves-0.048
asks-0.047
Token's
Token ID338
Feature activation-1.84e-03

Pos logit contributions
ich+0.016
 ourselves+0.016
lor+0.015
sters+0.014
 yourselves+0.014
Neg logit contributions
igator-0.018
 his-0.017
ny-0.017
inn-0.017
 He-0.017
Token mom
Token ID1995
Feature activation-6.11e-04

Pos logit contributions
ich+0.003
 ourselves+0.003
sters+0.003
lor+0.003
 yourselves+0.003
Neg logit contributions
inn-0.003
igator-0.003
 his-0.003
ny-0.003
His-0.003
Token tells
Token ID4952
Feature activation-1.15e-02

Pos logit contributions
ich+0.001
 ourselves+0.001
lor+0.001
sters+0.001
 yourselves+0.001
Neg logit contributions
igator-0.001
inn-0.001
 his-0.001
ny-0.001
 ang-0.001
Token and
Token ID290
Feature activation+1.38e-02

Pos logit contributions
ich+0.007
 ourselves+0.007
sters+0.006
lor+0.006
aj+0.006
Neg logit contributions
inn-0.008
igator-0.008
 his-0.008
ny-0.007
His-0.007
Token Ben
Token ID3932
Feature activation-9.32e-03

Pos logit contributions
inn+0.024
igator+0.023
ny+0.020
His+0.020
 trader+0.019
Neg logit contributions
ich-0.022
 ourselves-0.021
sters-0.020
 yourselves-0.020
aj-0.019
Token didn
Token ID1422
Feature activation-9.27e-03

Pos logit contributions
ich+0.018
 ourselves+0.017
sters+0.017
lor+0.016
aj+0.016
Neg logit contributions
inn-0.013
igator-0.012
 his-0.011
ny-0.010
His-0.010
Token't
Token ID470
Feature activation+1.17e-02

Pos logit contributions
ich+0.013
 ourselves+0.012
sters+0.011
lor+0.011
asks+0.011
Neg logit contributions
igator-0.017
inn-0.017
 his-0.016
ny-0.016
 ang-0.015
Token want
Token ID765
Feature activation+2.24e-02

Pos logit contributions
inn+0.021
igator+0.020
jack+0.020
penter+0.019
 trader+0.019
Neg logit contributions
 ourselves-0.015
asks-0.014
 yourselves-0.014
 yourself-0.014
 privacy-0.013
Token to
Token ID284
Feature activation+4.39e-02

Pos logit contributions
inn+0.039
igator+0.038
ny+0.035
His+0.034
 ang+0.034
Neg logit contributions
ich-0.034
 ourselves-0.034
 yourselves-0.031
asks-0.030
sters-0.030
Token nap
Token ID25422
Feature activation+1.39e-02

Pos logit contributions
inn+0.066
igator+0.064
 ang+0.055
His+0.055
 his+0.054
Neg logit contributions
 ourselves-0.078
ich-0.078
 yourselves-0.073
sters-0.071
asks-0.070
Token.
Token ID13
Feature activation+1.34e-02

Pos logit contributions
 his+0.017
inn+0.017
igator+0.015
His+0.014
 ang+0.014
Neg logit contributions
ich-0.030
lor-0.028
sters-0.027
 ourselves-0.026
asks-0.026
Token They
Token ID1119
Feature activation-6.18e-03

Pos logit contributions
inn+0.017
igator+0.017
 his+0.016
ny+0.015
His+0.015
Neg logit contributions
ich-0.027
 ourselves-0.025
sters-0.025
lor-0.025
aj-0.024
Token wanted
Token ID2227
Feature activation-6.03e-03

Pos logit contributions
ich+0.010
 ourselves+0.010
sters+0.010
aj+0.009
asks+0.009
Neg logit contributions
inn-0.010
igator-0.010
 his-0.008
 ang-0.008
 trader-0.008
Token to
Token ID284
Feature activation+1.28e-02

Pos logit contributions
ich+0.011
 ourselves+0.010
sters+0.010
lor+0.010
aj+0.010
Neg logit contributions
inn-0.009
igator-0.009
 his-0.008
His-0.008
ny-0.008
Token the
Token ID262
Feature activation-5.36e-04

Pos logit contributions
igator+0.002
 his+0.002
inn+0.002
 ang+0.002
His+0.002
Neg logit contributions
ich-0.004
lor-0.004
sters-0.004
 ourselves-0.004
aj-0.004
Token little
Token ID1310
Feature activation+2.45e-03

Pos logit contributions
ich+0.001
 ourselves+0.001
sters+0.001
aj+0.001
lor+0.001
Neg logit contributions
igator-0.001
inn-0.001
 his-0.001
ny-0.001
 He-0.001
Token girl
Token ID2576
Feature activation+7.63e-03

Pos logit contributions
inn+0.004
igator+0.004
His+0.004
 ang+0.003
jack+0.003
Neg logit contributions
ich-0.004
sters-0.003
 ourselves-0.003
 roses-0.003
aj-0.003
Token didn
Token ID1422
Feature activation-9.29e-03

Pos logit contributions
inn+0.011
igator+0.011
 his+0.010
ny+0.010
His+0.010
Neg logit contributions
ich-0.014
 ourselves-0.013
sters-0.013
lor-0.013
aj-0.012
Token't
Token ID470
Feature activation+5.68e-03

Pos logit contributions
ich+0.014
 ourselves+0.013
sters+0.012
lor+0.012
aj+0.012
Neg logit contributions
inn-0.017
igator-0.017
 his-0.015
ny-0.015
His-0.015
Token understand
Token ID1833
Feature activation+5.08e-02

Pos logit contributions
inn+0.009
igator+0.009
jack+0.008
 ang+0.008
His+0.008
Neg logit contributions
 ourselves-0.008
ich-0.008
 yourself-0.007
asks-0.007
sters-0.007
Token.
Token ID13
Feature activation+1.29e-02

Pos logit contributions
inn+0.067
igator+0.065
 his+0.060
ny+0.057
His+0.057
Neg logit contributions
ich-0.100
 ourselves-0.095
sters-0.093
lor-0.092
aj-0.091
Token Tears
Token ID40328
Feature activation+5.78e-03

Pos logit contributions
inn+0.016
igator+0.016
 his+0.014
ny+0.013
His+0.013
Neg logit contributions
ich-0.026
 ourselves-0.025
sters-0.024
lor-0.024
aj-0.024
Token started
Token ID2067
Feature activation-6.29e-03

Pos logit contributions
inn+0.008
igator+0.008
 his+0.007
ny+0.007
 ang+0.007
Neg logit contributions
ich-0.011
 ourselves-0.010
sters-0.010
lor-0.010
aj-0.010
Token rolling
Token ID10708
Feature activation+1.75e-02

Pos logit contributions
ich+0.011
 ourselves+0.010
sters+0.010
lor+0.010
aj+0.010
Neg logit contributions
inn-0.010
igator-0.009
 his-0.009
ny-0.008
His-0.008
Token down
Token ID866
Feature activation+8.34e-03

Pos logit contributions
inn+0.022
igator+0.021
His+0.018
 ang+0.017
 trader+0.017
Neg logit contributions
ich-0.035
 ourselves-0.034
sters-0.033
 yourselves-0.032
aj-0.032
Token?"
Token ID1701
Feature activation+1.53e-02

Pos logit contributions
igator+0.014
inn+0.013
 his+0.013
ny+0.012
His+0.012
Neg logit contributions
ich-0.016
 ourselves-0.015
sters-0.015
lor-0.015
asks-0.014
Token Sam
Token ID3409
Feature activation-9.47e-04

Pos logit contributions
igator+0.027
inn+0.026
ited+0.024
ny+0.023
 ang+0.022
Neg logit contributions
ich-0.022
 ourselves-0.022
asks-0.021
aj-0.021
sters-0.020
Token thought
Token ID1807
Feature activation-1.09e-03

Pos logit contributions
aj+0.001
sters+0.001
ich+0.001
sed+0.001
 rib+0.001
Neg logit contributions
inn-0.002
igator-0.002
 trader-0.001
 ang-0.001
isel-0.001
Token for
Token ID329
Feature activation+1.26e-02

Pos logit contributions
ich+0.002
 ourselves+0.002
sters+0.002
lor+0.002
 yourselves+0.002
Neg logit contributions
inn-0.002
igator-0.002
 his-0.002
ny-0.002
 ang-0.002
Token a
Token ID257
Feature activation+1.94e-02

Pos logit contributions
inn+0.021
igator+0.020
ny+0.018
 trader+0.017
His+0.017
Neg logit contributions
 ourselves-0.021
ich-0.020
 yourselves-0.020
sters-0.018
asks-0.018
Token moment
Token ID2589
Feature activation+4.27e-02

Pos logit contributions
igator+0.027
inn+0.027
jack+0.023
 ang+0.023
 his+0.023
Neg logit contributions
ich-0.034
lor-0.031
 ourselves-0.031
aj-0.031
sed-0.030
Token and
Token ID290
Feature activation+1.31e-02

Pos logit contributions
inn+0.069
igator+0.067
 his+0.063
ny+0.061
His+0.061
Neg logit contributions
ich-0.071
 ourselves-0.067
sters-0.065
lor-0.064
aj-0.063
Token then
Token ID788
Feature activation+6.26e-03

Pos logit contributions
inn+0.019
igator+0.019
ny+0.017
 his+0.017
His+0.016
Neg logit contributions
ich-0.024
 ourselves-0.023
sters-0.022
aj-0.022
lor-0.022
Token said
Token ID531
Feature activation-1.48e-02

Pos logit contributions
inn+0.011
igator+0.011
ny+0.010
His+0.009
 trader+0.009
Neg logit contributions
sters-0.009
ich-0.009
aj-0.008
 ourselves-0.008
sed-0.008
Token,
Token ID11
Feature activation+1.14e-03

Pos logit contributions
ich+0.022
sters+0.021
 ourselves+0.021
lor+0.020
aj+0.020
Neg logit contributions
inn-0.025
igator-0.025
ny-0.023
ited-0.023
 ang-0.022
Token "
Token ID366
Feature activation+8.06e-03

Pos logit contributions
inn+0.002
igator+0.002
 his+0.002
His+0.001
ny+0.001
Neg logit contributions
ich-0.002
 ourselves-0.002
sters-0.002
lor-0.002
aj-0.002
Token stone
Token ID7815
Feature activation+5.51e-03

Pos logit contributions
 his+0.021
igator+0.021
inn+0.020
ny+0.019
 He+0.019
Neg logit contributions
ich-0.019
lor-0.017
 ourselves-0.016
aj-0.016
sters-0.016
Token and
Token ID290
Feature activation-3.59e-04

Pos logit contributions
igator+0.007
 his+0.007
inn+0.007
ny+0.006
 He+0.006
Neg logit contributions
ich-0.011
lor-0.010
 yourselves-0.010
aj-0.010
 ourselves-0.010
Token it
Token ID340
Feature activation-1.69e-02

Pos logit contributions
ich+0.001
lor+0.001
 ourselves+0.001
sters+0.001
aj+0.001
Neg logit contributions
igator-0.000
inn-0.000
 his-0.000
His-0.000
ny-0.000
Token was
Token ID373
Feature activation-1.04e-03

Pos logit contributions
ich+0.028
 ourselves+0.026
sters+0.026
lor+0.025
aj+0.025
Neg logit contributions
inn-0.027
igator-0.025
 ang-0.023
His-0.023
ny-0.022
Token very
Token ID845
Feature activation-3.20e-02

Pos logit contributions
ich+0.002
 ourselves+0.002
sters+0.002
lor+0.002
aj+0.002
Neg logit contributions
inn-0.001
igator-0.001
 his-0.001
ny-0.001
His-0.001
Token big
Token ID1263
Feature activation+8.76e-03

Pos logit contributions
ich+0.057
 ourselves+0.054
sters+0.052
aj+0.052
 yourselves+0.051
Neg logit contributions
inn-0.048
igator-0.047
 his-0.042
ny-0.041
His-0.041
Token.
Token ID13
Feature activation+2.99e-03

Pos logit contributions
 his+0.011
igator+0.011
inn+0.010
ny+0.010
 He+0.010
Neg logit contributions
ich-0.018
lor-0.017
 ourselves-0.017
sters-0.016
aj-0.016
Token The
Token ID383
Feature activation+5.49e-03

Pos logit contributions
inn+0.003
igator+0.003
 his+0.003
ny+0.003
His+0.003
Neg logit contributions
ich-0.007
 ourselves-0.006
sters-0.006
lor-0.006
aj-0.006
Token little
Token ID1310
Feature activation-1.41e-02

Pos logit contributions
igator+0.007
 his+0.006
inn+0.006
ny+0.006
 trader+0.005
Neg logit contributions
lor-0.012
 ourselves-0.011
sters-0.011
ich-0.011
aj-0.011
Token train
Token ID4512
Feature activation-2.34e-02

Pos logit contributions
ich+0.023
 ourselves+0.022
lor+0.022
sters+0.021
aj+0.021
Neg logit contributions
igator-0.023
inn-0.023
 his-0.021
ny-0.021
 ang-0.020
Token was
Token ID373
Feature activation+7.56e-03

Pos logit contributions
ich+0.048
 ourselves+0.047
lor+0.045
 yourselves+0.045
aj+0.044
Neg logit contributions
igator-0.029
inn-0.028
 his-0.027
ny-0.026
His-0.025
Token the
Token ID262
Feature activation+1.10e-02

Pos logit contributions
igator+0.009
ited+0.009
vous+0.008
inn+0.008
 ang+0.008
Neg logit contributions
 ourselves-0.011
 rib-0.010
ich-0.010
 yourselves-0.010
 sandwiches-0.009
Token ball
Token ID2613
Feature activation-7.80e-03

Pos logit contributions
igator+0.017
inn+0.015
 his+0.015
ny+0.014
 ang+0.014
Neg logit contributions
ich-0.020
lor-0.020
 ourselves-0.020
asks-0.019
sters-0.019
Token all
Token ID477
Feature activation+1.82e-02

Pos logit contributions
ich+0.011
 ourselves+0.011
sters+0.010
oor+0.010
sed+0.009
Neg logit contributions
 ang-0.012
inn-0.012
igator-0.012
ited-0.011
 Boy-0.011
Token around
Token ID1088
Feature activation+7.54e-03

Pos logit contributions
 ang+0.026
 trader+0.026
 grocer+0.025
His+0.025
isel+0.023
Neg logit contributions
 ourselves-0.027
ich-0.025
 you-0.023
 yourselves-0.023
w-0.022
Token the
Token ID262
Feature activation-8.26e-03

Pos logit contributions
igator+0.011
inn+0.011
 ang+0.010
His+0.009
ny+0.009
Neg logit contributions
ich-0.013
 ourselves-0.013
 yourselves-0.012
sters-0.012
lor-0.011
Token gym
Token ID11550
Feature activation+6.55e-03

Pos logit contributions
ich+0.013
lor+0.013
sters+0.013
asks+0.013
 ourselves+0.012
Neg logit contributions
igator-0.014
ny-0.013
 his-0.013
inn-0.013
 He-0.012
Token and
Token ID290
Feature activation+3.28e-03

Pos logit contributions
inn+0.010
igator+0.010
 his+0.009
 ang+0.009
His+0.009
Neg logit contributions
ich-0.011
 ourselves-0.011
sters-0.010
lor-0.010
 yourselves-0.010
Token had
Token ID550
Feature activation-2.09e-02

Pos logit contributions
inn+0.004
igator+0.004
 his+0.003
His+0.003
ny+0.003
Neg logit contributions
ich-0.007
 ourselves-0.006
sters-0.006
lor-0.006
aj-0.006
Token so
Token ID523
Feature activation-7.07e-03

Pos logit contributions
ich+0.044
sters+0.040
 ourselves+0.040
lor+0.039
asks+0.039
Neg logit contributions
 his-0.027
inn-0.026
igator-0.025
His-0.023
ny-0.023
Token much
Token ID881
Feature activation+1.52e-02

Pos logit contributions
ich+0.013
 ourselves+0.013
sters+0.012
aj+0.012
 yourselves+0.012
Neg logit contributions
igator-0.009
inn-0.009
 ang-0.008
ny-0.008
 his-0.008
Token fun
Token ID1257
Feature activation+1.86e-02

Pos logit contributions
igator+0.026
inn+0.026
ny+0.024
His+0.023
 ang+0.023
Neg logit contributions
ich-0.023
 ourselves-0.022
aj-0.021
asks-0.021
sters-0.021
Token€
Token ID26391
Feature activation-1.29e-02

Pos logit contributions
inn+0.006
igator+0.006
 his+0.006
 ang+0.005
His+0.005
Neg logit contributions
ich-0.006
 ourselves-0.006
lor-0.006
sters-0.006
 yourselves-0.006
Token�
Token ID129
Feature activation+2.64e-02

Pos logit contributions
 ourselves+0.020
 yourselves+0.019
asks+0.019
ich+0.018
 ours+0.018
Neg logit contributions
 He-0.019
ny-0.018
inn-0.018
 his-0.017
ah-0.017
Token�
Token ID241
Feature activation+2.78e-02

Pos logit contributions
inn+0.039
igator+0.037
 his+0.035
ny+0.035
His+0.033
Neg logit contributions
ich-0.047
sters-0.045
 ourselves-0.045
lor-0.044
aj-0.043
TokenBe
Token ID3856
Feature activation-1.66e-03

Pos logit contributions
inn+0.039
igator+0.037
 his+0.035
His+0.034
ny+0.034
Neg logit contributions
ich-0.052
sters-0.050
 ourselves-0.049
aj-0.048
lor-0.048
Token careful
Token ID8161
Feature activation-1.87e-03

Pos logit contributions
ich+0.003
 ourselves+0.002
sters+0.002
lor+0.002
aj+0.002
Neg logit contributions
inn-0.003
igator-0.003
 his-0.003
His-0.002
ny-0.002
Token,
Token ID11
Feature activation+8.91e-03

Pos logit contributions
ich+0.004
lor+0.004
sters+0.004
aj+0.003
asks+0.003
Neg logit contributions
igator-0.002
 his-0.002
inn-0.002
ny-0.002
 He-0.002
Token the
Token ID262
Feature activation-1.70e-02

Pos logit contributions
 his+0.010
 he+0.009
 He+0.008
igator+0.008
inn+0.008
Neg logit contributions
ich-0.021
lor-0.019
aj-0.019
sters-0.018
sed-0.018
Token branches
Token ID13737
Feature activation-2.27e-02

Pos logit contributions
ich+0.032
 ourselves+0.032
aj+0.031
lor+0.031
asks+0.031
Neg logit contributions
igator-0.024
 his-0.021
inn-0.021
 He-0.020
jack-0.020
Token are
Token ID389
Feature activation+8.98e-03

Pos logit contributions
ich+0.045
aj+0.042
lor+0.042
 ourselves+0.040
sters+0.040
Neg logit contributions
igator-0.034
 his-0.029
inn-0.029
ny-0.027
 He-0.027
Token very
Token ID845
Feature activation-1.65e-02

Pos logit contributions
igator+0.010
 his+0.010
inn+0.009
 ang+0.009
ny+0.008
Neg logit contributions
ich-0.020
lor-0.020
sters-0.019
asks-0.018
 ourselves-0.018
Token high
Token ID1029
Feature activation+2.47e-02

Pos logit contributions
ich+0.035
lor+0.034
 ourselves+0.032
sters+0.032
asks+0.032
Neg logit contributions
igator-0.020
inn-0.020
 his-0.019
 ang-0.017
ny-0.017
Token.
Token ID13
Feature activation-5.51e-03

Pos logit contributions
inn+0.022
igator+0.021
 his+0.019
ny+0.018
His+0.018
Neg logit contributions
ich-0.034
 ourselves-0.033
sters-0.032
lor-0.031
aj-0.031
Token He
Token ID679
Feature activation+7.70e-03

Pos logit contributions
 ourselves+0.010
aj+0.010
ich+0.010
asks+0.010
lor+0.010
Neg logit contributions
igator-0.007
inn-0.007
 ang-0.006
ited-0.006
ah-0.006
Token worked
Token ID3111
Feature activation+1.46e-02

Pos logit contributions
inn+0.010
igator+0.010
 ang+0.008
His+0.008
 his+0.008
Neg logit contributions
ich-0.015
sters-0.014
aj-0.014
 ourselves-0.014
lor-0.014
Token all
Token ID477
Feature activation+3.16e-02

Pos logit contributions
igator+0.027
inn+0.022
jack+0.021
ny+0.021
ited+0.021
Neg logit contributions
 ourselves-0.021
ich-0.019
aj-0.019
sters-0.019
 yourselves-0.019
Token day
Token ID1110
Feature activation+4.22e-03

Pos logit contributions
 grocer+0.051
 ang+0.050
 trader+0.050
igator+0.048
jack+0.048
Neg logit contributions
 ourselves-0.043
 yourselves-0.034
 our-0.034
w-0.034
 you-0.034
Token,
Token ID11
Feature activation-3.92e-03

Pos logit contributions
inn+0.007
igator+0.006
 his+0.006
ny+0.006
His+0.006
Neg logit contributions
ich-0.007
 ourselves-0.007
 yourselves-0.007
sters-0.007
lor-0.007
Token making
Token ID1642
Feature activation+2.31e-03

Pos logit contributions
ich+0.007
 ourselves+0.006
sters+0.006
lor+0.006
aj+0.006
Neg logit contributions
inn-0.006
igator-0.006
 his-0.005
ny-0.005
His-0.005
Token things
Token ID1243
Feature activation-1.86e-02

Pos logit contributions
inn+0.003
igator+0.003
 his+0.003
 ang+0.003
His+0.003
Neg logit contributions
ich-0.004
 ourselves-0.004
sters-0.004
lor-0.004
aj-0.004
Token prett
Token ID46442
Feature activation-1.35e-02

Pos logit contributions
ich+0.036
 ourselves+0.034
sters+0.033
lor+0.032
aj+0.032
Neg logit contributions
inn-0.025
igator-0.025
 his-0.022
ny-0.022
His-0.022
Tokenier
Token ID959
Feature activation+3.98e-03

Pos logit contributions
ich+0.028
lor+0.025
 ourselves+0.025
sters+0.024
 yourselves+0.024
Neg logit contributions
inn-0.018
igator-0.017
 his-0.016
ny-0.015
His-0.015
Token and
Token ID290
Feature activation+5.44e-04

Pos logit contributions
 his+0.006
ny+0.005
 He+0.005
inn+0.005
igator+0.005
Neg logit contributions
ich-0.008
lor-0.007
sed-0.007
asks-0.007
aj-0.007
Token<|endoftext|>
Token ID50256
Feature activation-1.55e-03

Pos logit contributions
igator+0.008
inn+0.008
ny+0.007
 ang+0.007
 his+0.007
Neg logit contributions
ich-0.008
 ourselves-0.008
lor-0.008
sters-0.008
aj-0.008
TokenOnce
Token ID7454
Feature activation-2.11e-02

Pos logit contributions
ich+0.003
 ourselves+0.003
sters+0.003
aj+0.003
 yourselves+0.003
Neg logit contributions
inn-0.002
igator-0.002
 his-0.002
ny-0.002
 He-0.002
Token upon
Token ID2402
Feature activation-1.55e-02

Pos logit contributions
ich+0.022
 ourselves+0.020
aj+0.018
sters+0.018
lor+0.018
Neg logit contributions
inn-0.046
igator-0.046
ny-0.043
 ang-0.043
His-0.043
Token a
Token ID257
Feature activation+6.04e-03

Pos logit contributions
ich+0.030
 ourselves+0.029
sters+0.028
lor+0.028
aj+0.027
Neg logit contributions
inn-0.020
igator-0.020
 his-0.018
ny-0.017
 ang-0.017
Token time
Token ID640
Feature activation+1.81e-02

Pos logit contributions
igator+0.009
inn+0.009
 his+0.008
 ang+0.008
His+0.008
Neg logit contributions
ich-0.010
 ourselves-0.010
aj-0.010
lor-0.009
asks-0.009
Token,
Token ID11
Feature activation+8.52e-03

Pos logit contributions
inn+0.023
igator+0.022
 his+0.021
His+0.019
ny+0.019
Neg logit contributions
ich-0.037
 ourselves-0.035
sters-0.035
lor-0.035
 yourselves-0.034
Token there
Token ID612
Feature activation+1.12e-02

Pos logit contributions
inn+0.015
igator+0.015
 ang+0.014
His+0.014
 trader+0.013
Neg logit contributions
ich-0.012
 ourselves-0.011
sters-0.010
 yourselves-0.010
aj-0.010
Token was
Token ID373
Feature activation+6.82e-03

Pos logit contributions
inn+0.018
igator+0.017
 his+0.016
ny+0.015
His+0.015
Neg logit contributions
ich-0.019
 ourselves-0.018
sters-0.017
lor-0.017
aj-0.017
Token a
Token ID257
Feature activation-7.08e-03

Pos logit contributions
inn+0.009
igator+0.009
 his+0.008
ny+0.008
His+0.008
Neg logit contributions
ich-0.013
 ourselves-0.013
sters-0.012
lor-0.012
aj-0.012
Token silly
Token ID14397
Feature activation+1.11e-02

Pos logit contributions
ich+0.011
sters+0.009
aj+0.009
 ourselves+0.009
 majesty+0.009
Neg logit contributions
inn-0.013
igator-0.012
 aback-0.011
 ang-0.011
 his-0.011
Token zero
Token ID6632
Feature activation+2.00e-02

Pos logit contributions
inn+0.018
igator+0.017
 his+0.016
His+0.015
 ang+0.015
Neg logit contributions
ich-0.019
 ourselves-0.018
sters-0.017
lor-0.017
aj-0.017
Token make
Token ID787
Feature activation-1.50e-02

Pos logit contributions
ich+0.009
lor+0.009
 ourselves+0.008
sters+0.008
aj+0.008
Neg logit contributions
inn-0.010
igator-0.009
 his-0.009
His-0.009
ny-0.009
Token pictures
Token ID5986
Feature activation-5.23e-03

Pos logit contributions
ich+0.025
lor+0.023
aj+0.022
sters+0.022
 ourselves+0.022
Neg logit contributions
inn-0.024
igator-0.024
 his-0.024
 ang-0.022
ny-0.021
Token of
Token ID286
Feature activation-2.24e-02

Pos logit contributions
ich+0.012
aj+0.011
sed+0.011
lor+0.011
asks+0.010
Neg logit contributions
igator-0.006
 his-0.006
inn-0.006
 He-0.005
ny-0.005
Token animals
Token ID4695
Feature activation-2.04e-02

Pos logit contributions
ich+0.047
lor+0.046
aj+0.045
sters+0.042
asks+0.042
Neg logit contributions
 his-0.030
igator-0.025
inn-0.023
 ang-0.022
 He-0.022
Token,
Token ID11
Feature activation-2.52e-02

Pos logit contributions
ich+0.045
aj+0.044
lor+0.043
sed+0.043
asks+0.042
Neg logit contributions
igator-0.022
 his-0.021
inn-0.021
ny-0.019
 He-0.018
Token flowers
Token ID12734
Feature activation-4.12e-02

Pos logit contributions
ich+0.050
lor+0.049
 ourselves+0.046
aj+0.046
sters+0.044
Neg logit contributions
 his-0.035
igator-0.033
inn-0.030
 He-0.029
His-0.029
Token and
Token ID290
Feature activation-2.75e-02

Pos logit contributions
ich+0.092
aj+0.089
 yourselves+0.088
lor+0.088
 ourselves+0.087
Neg logit contributions
 his-0.043
inn-0.042
igator-0.041
 He-0.037
ny-0.037
Token stars
Token ID5788
Feature activation-2.40e-03

Pos logit contributions
ich+0.045
lor+0.044
aj+0.040
 ourselves+0.038
asks+0.038
Neg logit contributions
 his-0.049
igator-0.047
 He-0.043
 ang-0.042
His-0.042
Token.
Token ID13
Feature activation-1.02e-02

Pos logit contributions
ich+0.005
aj+0.005
 ourselves+0.005
 yourselves+0.005
lor+0.005
Neg logit contributions
igator-0.003
inn-0.003
 his-0.003
ny-0.002
 He-0.002
Token One
Token ID1881
Feature activation-8.49e-03

Pos logit contributions
ich+0.022
 ourselves+0.021
aj+0.021
asks+0.021
lor+0.020
Neg logit contributions
igator-0.011
inn-0.010
 ang-0.010
 his-0.009
His-0.009
Token day
Token ID1110
Feature activation+3.56e-02

Pos logit contributions
ich+0.016
 ourselves+0.015
sters+0.015
aj+0.015
asks+0.015
Neg logit contributions
inn-0.012
igator-0.011
 ang-0.010
 his-0.010
His-0.010
Token.
Token ID13
Feature activation+2.33e-03

Pos logit contributions
inn+0.007
igator+0.006
 his+0.006
His+0.006
ny+0.006
Neg logit contributions
ich-0.010
 ourselves-0.010
sters-0.010
lor-0.009
aj-0.009
Token She
Token ID1375
Feature activation+9.06e-03

Pos logit contributions
igator+0.003
inn+0.003
 his+0.003
 ang+0.003
ny+0.003
Neg logit contributions
ich-0.004
 ourselves-0.004
sters-0.004
lor-0.004
aj-0.004
Token grabbed
Token ID13176
Feature activation+1.58e-02

Pos logit contributions
inn+0.011
igator+0.011
 his+0.009
 ang+0.009
His+0.009
Neg logit contributions
ich-0.019
sters-0.017
 ourselves-0.017
aj-0.017
lor-0.017
Token her
Token ID607
Feature activation+1.08e-02

Pos logit contributions
igator+0.024
inn+0.022
jack+0.021
ny+0.021
His+0.021
Neg logit contributions
 rib-0.024
 ourselves-0.024
sters-0.023
ich-0.023
 roses-0.023
Token favorite
Token ID4004
Feature activation+8.24e-03

Pos logit contributions
 his+0.018
igator+0.017
inn+0.017
 ang+0.015
ny+0.015
Neg logit contributions
ich-0.018
lor-0.018
 ourselves-0.017
 yourselves-0.017
asks-0.016
Token shirt
Token ID10147
Feature activation-1.47e-02

Pos logit contributions
igator+0.015
inn+0.014
 his+0.014
ny+0.013
His+0.013
Neg logit contributions
ich-0.013
 ourselves-0.012
lor-0.012
sters-0.011
 yourselves-0.011
Token and
Token ID290
Feature activation+1.67e-03

Pos logit contributions
ich+0.025
 ourselves+0.024
sters+0.023
 yourselves+0.023
lor+0.023
Neg logit contributions
inn-0.023
igator-0.023
 his-0.022
ny-0.021
His-0.021
Token raced
Token ID32897
Feature activation+9.01e-03

Pos logit contributions
inn+0.002
igator+0.002
 his+0.002
His+0.002
ny+0.002
Neg logit contributions
ich-0.004
 ourselves-0.003
lor-0.003
sters-0.003
 yourselves-0.003
Token to
Token ID284
Feature activation-1.42e-02

Pos logit contributions
igator+0.015
inn+0.013
 ang+0.013
ited+0.012
jack+0.012
Neg logit contributions
 ourselves-0.014
ich-0.014
sters-0.013
 yourselves-0.013
 rib-0.013
Token store
Token ID3650
Feature activation+3.88e-03

Pos logit contributions
 ourselves+0.024
 yourselves+0.023
ich+0.022
 rib+0.021
 yourself+0.020
Neg logit contributions
igator-0.023
inn-0.021
ny-0.020
ah-0.019
ited-0.019
Token it
Token ID340
Feature activation-2.73e-03

Pos logit contributions
inn+0.005
igator+0.005
 his+0.004
ny+0.004
His+0.004
Neg logit contributions
ich-0.008
 ourselves-0.007
sters-0.007
lor-0.007
aj-0.007
Token,
Token ID11
Feature activation-1.53e-03

Pos logit contributions
ich+0.031
 ourselves+0.031
sters+0.030
lor+0.029
 yourselves+0.028
Neg logit contributions
inn-0.033
igator-0.031
ny-0.028
 ang-0.028
ited-0.028
Token "
Token ID366
Feature activation+9.28e-03

Pos logit contributions
ich+0.003
 ourselves+0.003
aj+0.003
 yourselves+0.003
lor+0.003
Neg logit contributions
inn-0.002
 his-0.002
igator-0.002
His-0.002
 He-0.002
TokenThank
Token ID10449
Feature activation-7.94e-03

Pos logit contributions
inn+0.015
igator+0.015
 his+0.013
 ang+0.013
ny+0.013
Neg logit contributions
ich-0.016
 ourselves-0.015
lor-0.014
sters-0.014
 yourselves-0.014
Token you
Token ID345
Feature activation+6.40e-03

Pos logit contributions
ich+0.005
aj+0.004
 ourselves+0.004
sters+0.004
lor+0.004
Neg logit contributions
inn-0.021
igator-0.020
 his-0.020
His-0.019
ny-0.019
Token,
Token ID11
Feature activation+2.37e-03

Pos logit contributions
inn+0.008
igator+0.008
 his+0.008
ny+0.007
 He+0.007
Neg logit contributions
ich-0.013
 ourselves-0.012
sters-0.012
lor-0.012
 yourselves-0.012
Token Mia
Token ID32189
Feature activation-8.67e-03

Pos logit contributions
inn+0.003
igator+0.002
ny+0.002
 ang+0.002
His+0.002
Neg logit contributions
ich-0.005
 ourselves-0.005
sters-0.005
aj-0.005
asks-0.005
Token,
Token ID11
Feature activation-1.46e-02

Pos logit contributions
ich+0.014
 ourselves+0.013
sters+0.013
lor+0.013
asks+0.013
Neg logit contributions
igator-0.014
inn-0.014
 his-0.013
ny-0.013
His-0.012
Token for
Token ID329
Feature activation-2.47e-02

Pos logit contributions
ich+0.026
aj+0.025
sters+0.023
lor+0.023
 ourselves+0.022
Neg logit contributions
igator-0.025
 his-0.025
inn-0.023
 He-0.022
 he-0.021
Token helping
Token ID5742
Feature activation-1.65e-02

Pos logit contributions
ich+0.042
aj+0.040
sters+0.040
lor+0.040
asks+0.039
Neg logit contributions
 his-0.038
igator-0.038
inn-0.037
 ang-0.034
His-0.034
Token me
Token ID502
Feature activation-1.98e-02

Pos logit contributions
ich+0.026
aj+0.026
sters+0.025
lor+0.025
sed+0.024
Neg logit contributions
 his-0.028
igator-0.028
inn-0.026
ny-0.025
His-0.023
Token find
Token ID1064
Feature activation-1.76e-02

Pos logit contributions
ich+0.037
aj+0.034
sters+0.034
lor+0.034
asks+0.033
Neg logit contributions
inn-0.028
igator-0.028
 his-0.027
ny-0.025
His-0.024
Token mom
Token ID1995
Feature activation-1.23e-02

Pos logit contributions
 his+0.001
igator+0.001
inn+0.001
His+0.001
 ang+0.001
Neg logit contributions
ich-0.001
lor-0.001
sters-0.001
 ourselves-0.001
 yourselves-0.001
Tokenmy
Token ID1820
Feature activation-7.79e-03

Pos logit contributions
ich+0.028
 ourselves+0.026
sters+0.026
aj+0.026
lor+0.025
Neg logit contributions
inn-0.013
igator-0.012
 his-0.010
His-0.010
 ang-0.010
Token said
Token ID531
Feature activation-2.50e-02

Pos logit contributions
ich+0.015
 ourselves+0.015
sters+0.014
 yourselves+0.014
lor+0.014
Neg logit contributions
inn-0.010
igator-0.010
 his-0.009
ny-0.009
His-0.009
Token,
Token ID11
Feature activation-1.11e-02

Pos logit contributions
ich+0.049
 ourselves+0.046
aj+0.045
sters+0.045
lor+0.045
Neg logit contributions
inn-0.033
igator-0.033
 his-0.031
His-0.029
ny-0.028
Token "
Token ID366
Feature activation+1.05e-02

Pos logit contributions
ich+0.024
 ourselves+0.023
aj+0.023
 yourselves+0.022
lor+0.022
Neg logit contributions
inn-0.012
 his-0.012
His-0.012
 He-0.011
igator-0.011
TokenOk
Token ID18690
Feature activation-1.89e-02

Pos logit contributions
inn+0.014
igator+0.014
 his+0.013
 ang+0.012
ny+0.012
Neg logit contributions
ich-0.020
 ourselves-0.019
sters-0.018
lor-0.018
 yourselves-0.018
Token,
Token ID11
Feature activation-2.01e-02

Pos logit contributions
ich+0.037
 ourselves+0.034
lor+0.034
sters+0.034
aj+0.033
Neg logit contributions
inn-0.025
igator-0.025
 his-0.023
ny-0.022
His-0.021
Token Joe
Token ID5689
Feature activation-3.77e-03

Pos logit contributions
ich+0.042
lor+0.040
aj+0.039
sters+0.038
sed+0.037
Neg logit contributions
igator-0.025
 his-0.025
inn-0.024
 He-0.022
ny-0.021
Token.
Token ID13
Feature activation-1.78e-02

Pos logit contributions
ich+0.007
 ourselves+0.007
asks+0.007
sters+0.007
lor+0.006
Neg logit contributions
igator-0.005
inn-0.005
ny-0.005
 his-0.005
 He-0.005
Token You
Token ID921
Feature activation-3.09e-03

Pos logit contributions
ich+0.028
 ourselves+0.026
sters+0.025
lor+0.025
aj+0.025
Neg logit contributions
inn-0.030
igator-0.030
 his-0.028
ny-0.027
His-0.027
Token can
Token ID460
Feature activation-2.23e-02

Pos logit contributions
ich+0.006
aj+0.005
 ourselves+0.005
 yourselves+0.005
asks+0.005
Neg logit contributions
igator-0.005
inn-0.004
ny-0.004
 his-0.004
 He-0.004
Token-
Token ID12
Feature activation+3.01e-02

Pos logit contributions
inn+0.003
 his+0.003
igator+0.003
ny+0.003
 ang+0.003
Neg logit contributions
ich-0.005
lor-0.004
sed-0.004
 ourselves-0.004
sters-0.004
Tokenup
Token ID929
Feature activation+5.33e-03

Pos logit contributions
inn+0.048
igator+0.047
 his+0.043
His+0.042
ny+0.042
Neg logit contributions
ich-0.051
 ourselves-0.048
sters-0.046
lor-0.046
aj-0.046
Token.
Token ID13
Feature activation-7.14e-03

Pos logit contributions
 his+0.006
igator+0.005
ny+0.005
 He+0.005
inn+0.005
Neg logit contributions
ich-0.012
lor-0.011
sed-0.010
asks-0.010
 ourselves-0.010
Token<|endoftext|>
Token ID50256
Feature activation-5.45e-04

Pos logit contributions
ich+0.015
sters+0.015
 ourselves+0.014
lor+0.014
 yourselves+0.013
Neg logit contributions
inn-0.009
 his-0.008
 He-0.008
igator-0.008
His-0.007
TokenOnce
Token ID7454
Feature activation-1.41e-02

Pos logit contributions
ich+0.001
 ourselves+0.001
sters+0.001
lor+0.001
aj+0.001
Neg logit contributions
inn-0.001
igator-0.001
 his-0.001
ny-0.001
His-0.001
Token upon
Token ID2402
Feature activation-1.16e-02

Pos logit contributions
ich+0.016
 ourselves+0.014
sters+0.013
aj+0.013
lor+0.013
Neg logit contributions
inn-0.030
igator-0.030
 his-0.028
ny-0.028
His-0.027
Token a
Token ID257
Feature activation+5.87e-03

Pos logit contributions
ich+0.023
 ourselves+0.022
sters+0.021
lor+0.021
aj+0.021
Neg logit contributions
inn-0.015
igator-0.015
 his-0.013
ny-0.013
His-0.013
Token time
Token ID640
Feature activation+1.67e-02

Pos logit contributions
igator+0.009
inn+0.008
 his+0.008
 ang+0.008
His+0.007
Neg logit contributions
ich-0.010
 ourselves-0.010
aj-0.010
lor-0.009
sters-0.009
Token,
Token ID11
Feature activation+9.14e-03

Pos logit contributions
inn+0.020
igator+0.019
 his+0.019
His+0.017
ny+0.016
Neg logit contributions
ich-0.035
 ourselves-0.033
sters-0.033
lor-0.033
 yourselves-0.032
Token there
Token ID612
Feature activation+1.08e-02

Pos logit contributions
inn+0.017
igator+0.017
 ang+0.016
 trader+0.015
His+0.015
Neg logit contributions
ich-0.012
 ourselves-0.011
 yourselves-0.010
aj-0.010
sters-0.010
Token was
Token ID373
Feature activation+5.90e-03

Pos logit contributions
inn+0.017
igator+0.017
 his+0.016
ny+0.015
 ang+0.015
Neg logit contributions
ich-0.018
 ourselves-0.017
sters-0.016
lor-0.016
aj-0.016
Token k
Token ID479
Feature activation-4.56e-02

Pos logit contributions
inn+0.032
igator+0.031
 his+0.029
His+0.028
ny+0.028
Neg logit contributions
ich-0.032
 ourselves-0.029
sters-0.029
lor-0.028
aj-0.028
Tokenang
Token ID648
Feature activation+3.15e-03

Pos logit contributions
ich+0.067
 ourselves+0.059
sters+0.057
lor+0.056
 yourselves+0.055
Neg logit contributions
inn-0.085
igator-0.084
 his-0.080
 ang-0.078
His-0.077
Tokenaroo
Token ID38049
Feature activation-5.89e-03

Pos logit contributions
inn+0.005
 ang+0.004
igator+0.004
 his+0.004
ny+0.004
Neg logit contributions
ich-0.006
 ourselves-0.005
sters-0.005
lor-0.005
 yourselves-0.005
Token.
Token ID13
Feature activation+1.01e-02

Pos logit contributions
aj+0.013
 majesty+0.012
lor+0.012
ich+0.012
 ourselves+0.012
Neg logit contributions
igator-0.007
 his-0.006
inn-0.006
ny-0.006
 He-0.005
Token The
Token ID383
Feature activation-4.75e-03

Pos logit contributions
inn+0.014
igator+0.013
 ang+0.012
ny+0.011
 his+0.011
Neg logit contributions
ich-0.019
 ourselves-0.018
sters-0.018
 yourselves-0.018
asks-0.018
Token k
Token ID479
Feature activation-5.89e-02

Pos logit contributions
aj+0.007
sters+0.007
ich+0.007
lor+0.007
asks+0.007
Neg logit contributions
igator-0.009
inn-0.008
 his-0.008
ny-0.008
His-0.008
Tokenang
Token ID648
Feature activation-1.75e-02

Pos logit contributions
ich+0.106
 ourselves+0.099
sters+0.096
lor+0.094
 yourselves+0.093
Neg logit contributions
inn-0.088
igator-0.087
 his-0.080
 ang-0.077
His-0.077
Tokenaroo
Token ID38049
Feature activation-1.99e-02

Pos logit contributions
ich+0.035
 ourselves+0.032
sters+0.032
lor+0.031
 yourselves+0.031
Neg logit contributions
inn-0.023
igator-0.022
 his-0.020
 ang-0.020
ny-0.020
Token has
Token ID468
Feature activation-2.78e-02

Pos logit contributions
ich+0.030
 ourselves+0.029
lor+0.029
aj+0.027
 yourselves+0.026
Neg logit contributions
igator-0.037
 his-0.036
ny-0.035
 He-0.035
inn-0.034
Token a
Token ID257
Feature activation-1.32e-02

Pos logit contributions
aj+0.071
sters+0.069
ich+0.068
asks+0.068
lor+0.068
Neg logit contributions
 his-0.028
 He-0.021
His-0.019
ny-0.019
 His-0.018
Token sack
Token ID23704
Feature activation-7.88e-03

Pos logit contributions
ich+0.022
 ourselves+0.022
sters+0.021
lor+0.021
 yourselves+0.021
Neg logit contributions
inn-0.021
igator-0.020
 his-0.019
ny-0.019
His-0.018
Token.
Token ID13
Feature activation-1.12e-03

Pos logit contributions
ich+0.082
lor+0.078
aj+0.076
 ourselves+0.076
asks+0.075
Neg logit contributions
 his-0.051
igator-0.049
inn-0.046
ny-0.044
 He-0.043
Token\n
Token ID198
Feature activation-2.28e-03

Pos logit contributions
ich+0.002
 ourselves+0.002
sters+0.002
lor+0.002
 yourselves+0.002
Neg logit contributions
inn-0.001
 He-0.001
igator-0.001
 his-0.001
ny-0.001
Token\n
Token ID198
Feature activation+3.52e-03

Pos logit contributions
ich+0.005
sters+0.004
 ourselves+0.004
lor+0.004
 yourselves+0.004
Neg logit contributions
inn-0.003
 his-0.003
igator-0.003
His-0.003
ny-0.002
TokenAfter
Token ID3260
Feature activation-2.69e-02

Pos logit contributions
inn+0.005
igator+0.005
His+0.005
 his+0.004
ny+0.004
Neg logit contributions
ich-0.007
sters-0.006
 ourselves-0.006
sed-0.006
lor-0.006
Token their
Token ID511
Feature activation+1.37e-02

Pos logit contributions
ich+0.054
 ourselves+0.051
sters+0.050
lor+0.050
aj+0.050
Neg logit contributions
inn-0.034
igator-0.033
 his-0.032
His-0.030
 ang-0.029
Token saf
Token ID1932
Feature activation-6.58e-02

Pos logit contributions
 his+0.014
igator+0.013
inn+0.013
 ang+0.012
ny+0.011
Neg logit contributions
 yourselves-0.030
lor-0.030
ich-0.030
 ourselves-0.028
sters-0.028
Tokenari
Token ID2743
Feature activation+9.08e-04

Pos logit contributions
ich+0.079
 ourselves+0.077
 yourselves+0.071
aj+0.069
sters+0.069
Neg logit contributions
inn-0.133
igator-0.131
 his-0.123
ny-0.122
His-0.122
Token,
Token ID11
Feature activation+4.32e-03

Pos logit contributions
igator+0.001
 his+0.001
inn+0.001
ny+0.001
 ang+0.001
Neg logit contributions
ich-0.002
 ourselves-0.002
aj-0.002
 yourselves-0.002
asks-0.002
Token they
Token ID484
Feature activation-2.48e-02

Pos logit contributions
inn+0.005
igator+0.004
 his+0.004
His+0.004
ny+0.004
Neg logit contributions
ich-0.010
 ourselves-0.009
sters-0.009
lor-0.009
aj-0.009
Token sat
Token ID3332
Feature activation-7.22e-03

Pos logit contributions
ich+0.050
 ourselves+0.048
sters+0.046
lor+0.046
 yourselves+0.046
Neg logit contributions
inn-0.032
igator-0.031
 his-0.029
ny-0.027
His-0.027
Token down
Token ID866
Feature activation-2.08e-02

Pos logit contributions
ich+0.015
 ourselves+0.014
lor+0.014
sters+0.014
asks+0.014
Neg logit contributions
inn-0.008
igator-0.008
 his-0.008
His-0.007
 ang-0.007
Token and
Token ID290
Feature activation-7.18e-03

Pos logit contributions
ich+0.044
lor+0.041
asks+0.041
 yourselves+0.041
sed+0.040
Neg logit contributions
igator-0.032
 his-0.032
inn-0.031
ny-0.028
His-0.028
Token vegetables
Token ID13701
Feature activation-1.88e-02

Pos logit contributions
ich+0.014
lor+0.013
aj+0.012
asks+0.012
sters+0.011
Neg logit contributions
 his-0.012
igator-0.011
ny-0.010
 ang-0.010
 He-0.010
Token and
Token ID290
Feature activation-1.56e-02

Pos logit contributions
ich+0.040
lor+0.038
asks+0.038
 yourselves+0.038
aj+0.037
Neg logit contributions
 his-0.022
igator-0.021
inn-0.021
ny-0.018
His-0.017
Token enjoy
Token ID2883
Feature activation-3.19e-02

Pos logit contributions
ich+0.033
lor+0.032
aj+0.029
asks+0.029
sters+0.027
Neg logit contributions
 his-0.024
igator-0.022
 He-0.018
ny-0.018
 ang-0.018
Token their
Token ID511
Feature activation+9.48e-04

Pos logit contributions
ich+0.078
lor+0.077
aj+0.074
sed+0.073
asks+0.073
Neg logit contributions
 his-0.031
inn-0.023
igator-0.022
His-0.020
 He-0.018
Token lunch
Token ID9965
Feature activation-6.41e-02

Pos logit contributions
 his+0.001
igator+0.001
 football+0.001
 particular+0.001
ny+0.001
Neg logit contributions
lor-0.002
ich-0.002
umes-0.002
 yourselves-0.002
sters-0.002
Token.
Token ID13
Feature activation-6.46e-03

Pos logit contributions
ich+0.150
 yourselves+0.137
sed+0.136
asks+0.136
 ourselves+0.134
Neg logit contributions
inn-0.065
 his-0.063
igator-0.059
ny-0.052
His-0.049
Token They
Token ID1119
Feature activation-7.31e-03

Pos logit contributions
ich+0.015
sters+0.014
lor+0.014
 ourselves+0.014
 yourselves+0.013
Neg logit contributions
inn-0.007
 He-0.006
 his-0.006
igator-0.006
His-0.006
Token are
Token ID389
Feature activation-6.95e-03

Pos logit contributions
ich+0.012
lor+0.011
aj+0.011
 ourselves+0.011
asks+0.011
Neg logit contributions
igator-0.013
inn-0.012
 his-0.012
ny-0.011
 ang-0.011
Token proud
Token ID6613
Feature activation-1.24e-02

Pos logit contributions
ich+0.015
lor+0.015
asks+0.014
sters+0.014
aj+0.014
Neg logit contributions
 his-0.008
igator-0.008
inn-0.007
 ang-0.006
His-0.006
Token of
Token ID286
Feature activation-7.51e-03

Pos logit contributions
ich+0.031
lor+0.031
 ourselves+0.030
aj+0.029
 yourselves+0.029
Neg logit contributions
 his-0.010
inn-0.009
igator-0.009
 He-0.008
 he-0.007
Token of
Token ID286
Feature activation-3.26e-03

Pos logit contributions
ich+0.015
 ourselves+0.015
sters+0.014
 yourselves+0.013
asks+0.013
Neg logit contributions
inn-0.014
igator-0.013
His-0.012
 ang-0.012
 his-0.012
Token the
Token ID262
Feature activation-1.87e-02

Pos logit contributions
ich+0.007
 ourselves+0.006
sters+0.006
 yourselves+0.006
asks+0.006
Neg logit contributions
inn-0.004
igator-0.004
ny-0.003
 his-0.003
His-0.003
Token ball
Token ID2613
Feature activation-9.82e-03

Pos logit contributions
lor+0.035
ich+0.034
sters+0.033
aj+0.033
 ourselves+0.032
Neg logit contributions
igator-0.028
inn-0.027
ny-0.025
 his-0.024
 trader-0.022
Token.
Token ID13
Feature activation+8.67e-03

Pos logit contributions
ich+0.018
 ourselves+0.017
sters+0.016
lor+0.016
 yourselves+0.016
Neg logit contributions
inn-0.014
igator-0.014
 his-0.013
His-0.012
ny-0.012
Token It
Token ID632
Feature activation+7.18e-03

Pos logit contributions
inn+0.011
igator+0.011
 his+0.009
 ang+0.009
ny+0.009
Neg logit contributions
ich-0.017
 ourselves-0.017
sters-0.016
 yourselves-0.016
lor-0.016
Token fl
Token ID781
Feature activation-6.39e-02

Pos logit contributions
inn+0.011
 trader+0.010
igator+0.010
 ang+0.010
 grocer+0.009
Neg logit contributions
ich-0.011
sed-0.010
 rib-0.010
sters-0.010
aj-0.010
Tokenapped
Token ID6320
Feature activation+2.74e-02

Pos logit contributions
ich+0.110
 ourselves+0.104
sters+0.101
lor+0.101
 yourselves+0.098
Neg logit contributions
inn-0.102
igator-0.097
 his-0.089
His-0.088
ny-0.087
Token its
Token ID663
Feature activation-7.31e-04

Pos logit contributions
inn+0.038
 his+0.038
igator+0.037
His+0.034
ny+0.033
Neg logit contributions
ich-0.055
lor-0.051
sters-0.050
aj-0.049
 ourselves-0.048
Token wings
Token ID12098
Feature activation-9.44e-04

Pos logit contributions
ich+0.001
lor+0.001
asks+0.001
 yourselves+0.001
sters+0.001
Neg logit contributions
inn-0.002
ny-0.002
 his-0.002
igator-0.002
His-0.001
Token and
Token ID290
Feature activation-1.33e-02

Pos logit contributions
ich+0.001
 ourselves+0.001
sters+0.001
lor+0.001
aj+0.001
Neg logit contributions
inn-0.002
igator-0.002
 ang-0.001
His-0.001
 his-0.001
Token blew
Token ID17948
Feature activation+2.75e-03

Pos logit contributions
ich+0.025
 ourselves+0.023
sters+0.022
lor+0.022
aj+0.022
Neg logit contributions
inn-0.019
igator-0.019
 his-0.018
ny-0.017
His-0.017
Token the
Token ID262
Feature activation-1.27e-02

Pos logit contributions
inn+0.032
igator+0.031
His+0.026
ny+0.026
 ang+0.025
Neg logit contributions
 ourselves-0.058
ich-0.057
 yourselves-0.054
sters-0.054
lor-0.053
Token tree
Token ID5509
Feature activation+9.91e-03

Pos logit contributions
lor+0.026
aj+0.025
asks+0.025
 ourselves+0.025
sters+0.024
Neg logit contributions
igator-0.017
jack-0.014
 his-0.014
ny-0.013
 ang-0.013
Token to
Token ID284
Feature activation-1.00e-02

Pos logit contributions
igator+0.014
inn+0.014
 his+0.013
ny+0.013
His+0.012
Neg logit contributions
ich-0.018
 ourselves-0.017
sters-0.017
lor-0.017
asks-0.017
Token its
Token ID663
Feature activation+3.82e-03

Pos logit contributions
ich+0.021
 ourselves+0.020
 yourselves+0.019
sters+0.019
lor+0.019
Neg logit contributions
inn-0.012
igator-0.011
ny-0.010
 ang-0.010
His-0.010
Token once
Token ID1752
Feature activation-5.08e-03

Pos logit contributions
 his+0.006
igator+0.005
 ang+0.005
jack+0.005
ny+0.004
Neg logit contributions
asks-0.007
lor-0.007
ich-0.007
sters-0.007
 yourselves-0.006
Token former
Token ID1966
Feature activation-6.17e-02

Pos logit contributions
ich+0.010
 ourselves+0.009
sters+0.009
lor+0.009
asks+0.009
Neg logit contributions
inn-0.007
igator-0.006
 his-0.006
ny-0.006
His-0.006
Token beauty
Token ID8737
Feature activation-1.52e-02

Pos logit contributions
ich+0.100
sters+0.084
 ourselves+0.081
lor+0.081
asks+0.080
Neg logit contributions
 his-0.106
inn-0.106
igator-0.103
ny-0.098
 ang-0.094
Token.
Token ID13
Feature activation+3.94e-03

Pos logit contributions
ich+0.031
 ourselves+0.028
sters+0.028
aj+0.028
 yourselves+0.028
Neg logit contributions
igator-0.020
inn-0.019
 his-0.019
ny-0.017
 He-0.016
Token He
Token ID679
Feature activation+4.52e-03

Pos logit contributions
inn+0.004
igator+0.003
 his+0.003
ny+0.003
His+0.003
Neg logit contributions
ich-0.009
 ourselves-0.009
sters-0.009
lor-0.009
aj-0.009
Token was
Token ID373
Feature activation+4.62e-03

Pos logit contributions
 his+0.005
igator+0.005
inn+0.005
ny+0.005
His+0.005
Neg logit contributions
ich-0.010
 ourselves-0.009
 yourselves-0.009
asks-0.009
aj-0.009
Token happy
Token ID3772
Feature activation-1.27e-02

Pos logit contributions
inn+0.006
igator+0.006
 his+0.006
ny+0.005
His+0.005
Neg logit contributions
ich-0.009
 ourselves-0.008
sters-0.008
lor-0.008
aj-0.008