TOP ACTIVATIONS
MAX = 5.581

matter how small or how big , it is important to
. But no matter how high he jumped , he could
. But no matter how high John tried to jump he
" Look how big our pile is !" Tom
fear . No matter how big or mighty something seems ,
the water . See how big and strong you are .
" Look how high I can bounce it !"
" Look how big our pile is now !"
moon . This is how big and far I am .
learned that no matter how big or mysterious the cliff was
says , " They are av oc ados . My mom
. Can you jump as high as you can and touch
Wow , look how big it is !
he couldn 't see how tall he was !
Maybe they could grow more av oc ados and share them
than yours . Look how high it is !" He was
important , no matter how big or small they are .
something new ! She thought av oc ados were the most
girl who didn 't like av oc ados . She always
happy that it produced more av oc ados than ever before

MIN ACTIVATIONS
MIN = -5.399

some furniture around in the living room . Lily was happy
the piano and around the living room . Soon
them and ran to the living room . She saw the
pron and ran to the living room . She saw her
noise and run to the living room . They see the
crash and came to the living room . She saw the
noise and ran to the living room . She saw the
's go back to the living room . We can play
noise and ran to the living room . They saw the
the big couch in the living room . She was always
up and ran around the living room again . The couch
. They ran to the living room and pretended to play
and ran back to the living room . They played with
They run around the living room , bump ing into
and Dad come into the living room and see the mess
and takes them to the living room . "
chair and ran to the living room . Their
they run back to the living room and pick up their
. They ran to the living room . They saw the
They go to the living room and play with Beef

INTERVAL 2.836 - 5.581
CONTAINS 0.124%

try your switch again . Maybe we can make a song
't know anything about l izards . Go away ."
room . " What 's in the box ?" she asked
toy . She knew how much it meant to him and
He wants to know how high their tower is .

INTERVAL 0.091 - 2.836
CONTAINS 46.535%

. One day , Lily and her friends were playing on
, a super hero arrived on the scene . Her name
this me adow ." The little girl was surprised , she
upon a time , Ben and Erin stepped onto the beach
in his room that he loved to sit on . One

INTERVAL -2.654 - 0.091
CONTAINS 52.910%

brother smiled and said , " It 's a good thing
hands and feet to hold on to the grass and rocks
Sarah . She was 3 years old and loved to explore
One day the dad forgot to lock the door before he
face . <|endoftext|> Once upon a time there was a bug

INTERVAL -5.399 - -2.654
CONTAINS 0.431%

there was a little girl named Lily . She liked to
Two curious little children , called Max and Jenny , played
avi said , " Let 's follow it and see where
there was a little girl named Lily . She loved to
" Wow , that 's cool ! Can you

TOKEN DETAILS

Token matter
Token ID2300
Feature activation-4.82e-01

Pos logit contributions
ement+0.051
 nervously+0.050
 sadly+0.048
 curiously+0.048
 closely+0.046
Neg logit contributions
ww-0.036
rav-0.033
eware-0.032
fl-0.032
w-0.031
Token how
Token ID703
Feature activation-6.88e-01

Pos logit contributions
ww+0.015
rav+0.015
eware+0.014
fl+0.014
w+0.014
Neg logit contributions
ement-0.011
 nervously-0.011
 sadly-0.010
 curiously-0.010
 closely-0.010
Token small
Token ID1402
Feature activation+3.67e+00

Pos logit contributions
ww+0.021
rav+0.020
eware+0.020
less+0.019
fl+0.019
Neg logit contributions
ement-0.017
 nervously-0.015
 sadly-0.015
 curiously-0.014
 finally-0.014
Token or
Token ID393
Feature activation+5.40e-01

Pos logit contributions
ement+0.104
 nervously+0.103
 sadly+0.098
 curiously+0.097
 closely+0.093
Neg logit contributions
ww-0.096
fl-0.090
rav-0.090
eware-0.088
w-0.085
Token how
Token ID703
Feature activation-8.23e-01

Pos logit contributions
ement+0.014
 nervously+0.013
 sadly+0.013
 curiously+0.012
 closely+0.012
Neg logit contributions
ww-0.016
rav-0.015
eware-0.015
fl-0.015
w-0.014
Token big
Token ID1263
Feature activation+5.58e+00

Pos logit contributions
ww+0.024
rav+0.023
eware+0.023
fl+0.023
w+0.022
Neg logit contributions
ement-0.021
 nervously-0.020
 sadly-0.019
 curiously-0.019
 closely-0.018
Token,
Token ID11
Feature activation-2.99e-03

Pos logit contributions
ement+0.154
 nervously+0.154
 curiously+0.146
 sadly+0.143
 proudly+0.136
Neg logit contributions
ww-0.154
fl-0.144
w-0.138
rav-0.138
eware-0.135
Token it
Token ID340
Feature activation-3.45e-01

Pos logit contributions
ww+0.000
rav+0.000
fl+0.000
eware+0.000
w+0.000
Neg logit contributions
ement-0.000
 nervously-0.000
 sadly-0.000
 curiously-0.000
 proudly-0.000
Token is
Token ID318
Feature activation+2.71e-01

Pos logit contributions
ww+0.007
fl+0.006
rav+0.006
eware+0.006
w+0.006
Neg logit contributions
ement-0.012
 nervously-0.011
 curiously-0.011
 sadly-0.011
 closely-0.010
Token important
Token ID1593
Feature activation-3.17e-01

Pos logit contributions
ement+0.007
 nervously+0.006
 curiously+0.006
 sadly+0.006
 closely+0.005
Neg logit contributions
ww-0.008
fl-0.008
rav-0.008
eware-0.008
less-0.007
Token to
Token ID284
Feature activation+2.60e-01

Pos logit contributions
ww+0.009
fl+0.009
rav+0.009
eware+0.009
less+0.009
Neg logit contributions
ement-0.008
 nervously-0.007
 curiously-0.007
 sadly-0.007
 closely-0.007
Token.
Token ID13
Feature activation+6.09e-01

Pos logit contributions
ement+0.003
 nervously+0.003
 sadly+0.003
 curiously+0.003
 closely+0.002
Neg logit contributions
ww-0.006
rav-0.006
eware-0.006
fl-0.006
w-0.006
Token But
Token ID887
Feature activation+1.84e+00

Pos logit contributions
ement+0.016
 nervously+0.013
 closely+0.013
 curiously+0.013
 sadly+0.013
Neg logit contributions
Fl-0.018
ww-0.018
fl-0.017
eware-0.017
rav-0.017
Token no
Token ID645
Feature activation+1.76e+00

Pos logit contributions
ement+0.043
 nervously+0.033
 curiously+0.031
 closely+0.029
 deeply+0.029
Neg logit contributions
ww-0.061
Fl-0.060
fl-0.059
w-0.058
eware-0.057
Token matter
Token ID2300
Feature activation+7.42e-02

Pos logit contributions
ement+0.048
 nervously+0.046
 sadly+0.045
 curiously+0.044
 closely+0.042
Neg logit contributions
ww-0.049
fl-0.045
eware-0.044
rav-0.044
w-0.042
Token how
Token ID703
Feature activation-1.48e-01

Pos logit contributions
ement+0.002
 nervously+0.002
 sadly+0.002
 curiously+0.002
 closely+0.002
Neg logit contributions
ww-0.002
rav-0.002
eware-0.002
fl-0.002
Fl-0.002
Token high
Token ID1029
Feature activation+5.54e+00

Pos logit contributions
ww+0.005
eware+0.004
w+0.004
rav+0.004
less+0.004
Neg logit contributions
ement-0.004
 nervously-0.003
 sadly-0.003
 curiously-0.003
 proudly-0.003
Token he
Token ID339
Feature activation+3.68e+00

Pos logit contributions
ement+0.114
 nervously+0.106
 curiously+0.099
 sadly+0.096
 proudly+0.092
Neg logit contributions
ww-0.191
fl-0.187
eware-0.174
w-0.174
Fl-0.168
Token jumped
Token ID11687
Feature activation+1.86e+00

Pos logit contributions
ement+0.107
 nervously+0.099
 sadly+0.097
 deeply+0.095
 curiously+0.095
Neg logit contributions
ww-0.093
eware-0.088
rav-0.087
w-0.086
fl-0.086
Token,
Token ID11
Feature activation+3.99e-01

Pos logit contributions
ement+0.036
 nervously+0.033
 curiously+0.031
 sadly+0.030
 closely+0.029
Neg logit contributions
ww-0.065
eware-0.062
fl-0.061
less-0.061
rav-0.060
Token he
Token ID339
Feature activation+1.12e+00

Pos logit contributions
ement+0.009
 nervously+0.007
 curiously+0.007
 closely+0.007
 sadly+0.006
Neg logit contributions
ww-0.014
fl-0.013
Fl-0.013
eware-0.013
rav-0.012
Token could
Token ID714
Feature activation+6.56e-01

Pos logit contributions
ement+0.037
 nervously+0.030
 curiously+0.029
 deeply+0.029
 closely+0.029
Neg logit contributions
ww-0.025
w-0.025
fl-0.024
eware-0.024
Fl-0.024
Token.
Token ID13
Feature activation+3.91e-01

Pos logit contributions
ement+0.010
 nervously+0.009
 sadly+0.009
 curiously+0.009
 closely+0.008
Neg logit contributions
ww-0.013
rav-0.012
fl-0.012
eware-0.012
w-0.011
Token But
Token ID887
Feature activation+1.36e+00

Pos logit contributions
ement+0.010
 closely+0.009
 sadly+0.009
 deeply+0.009
 armor+0.009
Neg logit contributions
eware-0.009
fl-0.009
rav-0.009
ww-0.009
Fl-0.009
Token no
Token ID645
Feature activation+1.59e+00

Pos logit contributions
ement+0.032
pered+0.025
 nervously+0.025
 victorious+0.025
 deeply+0.025
Neg logit contributions
fl-0.037
ww-0.036
less-0.035
eware-0.035
w-0.034
Token matter
Token ID2300
Feature activation-1.28e-01

Pos logit contributions
ement+0.044
 nervously+0.043
 sadly+0.042
 curiously+0.040
 proudly+0.039
Neg logit contributions
ww-0.042
fl-0.039
eware-0.039
rav-0.038
less-0.037
Token how
Token ID703
Feature activation-1.07e-01

Pos logit contributions
ww+0.003
rav+0.003
fl+0.003
eware+0.003
less+0.003
Neg logit contributions
ement-0.004
 nervously-0.003
 sadly-0.003
 curiously-0.003
 closely-0.003
Token high
Token ID1029
Feature activation+5.46e+00

Pos logit contributions
ww+0.003
eware+0.003
rav+0.003
w+0.003
less+0.003
Neg logit contributions
ement-0.003
 nervously-0.002
 sadly-0.002
 curiously-0.002
 proudly-0.002
Token John
Token ID1757
Feature activation+2.25e+00

Pos logit contributions
ement+0.107
 nervously+0.099
 curiously+0.092
 deeply+0.089
 frantically+0.088
Neg logit contributions
fl-0.186
ww-0.181
eware-0.171
w-0.168
Fl-0.158
Token tried
Token ID3088
Feature activation+1.01e+00

Pos logit contributions
ement+0.070
 deeply+0.062
 nervously+0.062
 curiously+0.060
 sadly+0.059
Neg logit contributions
ww-0.051
fl-0.050
w-0.050
eware-0.047
Fl-0.047
Token to
Token ID284
Feature activation+4.72e-01

Pos logit contributions
ement+0.028
 nervously+0.024
 curiously+0.023
 sadly+0.023
 deeply+0.022
Neg logit contributions
fl-0.027
ww-0.026
rav-0.025
less-0.025
eware-0.025
Token jump
Token ID4391
Feature activation+1.36e+00

Pos logit contributions
ement+0.014
 nervously+0.012
 sadly+0.012
 curiously+0.011
 deeply+0.011
Neg logit contributions
ww-0.012
fl-0.012
eware-0.011
Fl-0.011
rav-0.011
Token he
Token ID339
Feature activation+1.20e+00

Pos logit contributions
ement+0.030
 nervously+0.026
 curiously+0.024
 sadly+0.024
 closely+0.023
Neg logit contributions
ww-0.043
fl-0.042
less-0.042
rav-0.042
eware-0.042
Token
Token ID198
Feature activation+1.68e-02

Pos logit contributions
ement+0.017
 nervously+0.016
 sadly+0.016
 curiously+0.016
 closely+0.015
Neg logit contributions
rav-0.013
eware-0.013
fl-0.013
ww-0.013
Fl-0.012
Token
Token ID198
Feature activation-9.14e-02

Pos logit contributions
ement+0.000
 nervously+0.000
 sadly+0.000
 curiously+0.000
 closely+0.000
Neg logit contributions
ww-0.001
rav-0.001
fl-0.000
eware-0.000
Fl-0.000
Token"
Token ID1
Feature activation-1.73e+00

Pos logit contributions
ww+0.002
rav+0.002
fl+0.002
eware+0.002
less+0.002
Neg logit contributions
 nervously-0.002
ement-0.002
 closely-0.002
 sadly-0.002
 curiously-0.002
TokenLook
Token ID8567
Feature activation+2.06e-01

Pos logit contributions
eware+0.039
ww+0.037
rav+0.036
less+0.036
fl+0.032
Neg logit contributions
 finally-0.053
ement-0.053
 nervously-0.051
 happily-0.049
 He-0.049
Token how
Token ID703
Feature activation-2.49e+00

Pos logit contributions
ement+0.003
 nervously+0.003
 sadly+0.003
 curiously+0.003
 closely+0.002
Neg logit contributions
ww-0.008
rav-0.008
eware-0.008
fl-0.008
w-0.008
Token big
Token ID1263
Feature activation+5.39e+00

Pos logit contributions
ww+0.080
fl+0.079
rav+0.077
eware+0.076
Fl+0.075
Neg logit contributions
 nervously-0.052
ement-0.051
 sadly-0.050
 curiously-0.048
 closely-0.048
Token our
Token ID674
Feature activation+2.49e-01

Pos logit contributions
 nervously+0.156
ement+0.153
 curiously+0.149
 sadly+0.144
 frantically+0.140
Neg logit contributions
ww-0.149
w-0.130
fl-0.129
rav-0.124
eware-0.124
Token pile
Token ID14540
Feature activation+1.89e+00

Pos logit contributions
ement+0.008
 nervously+0.007
 sadly+0.007
 curiously+0.007
 frantically+0.007
Neg logit contributions
fl-0.006
ww-0.006
rav-0.005
w-0.005
less-0.005
Token is
Token ID318
Feature activation+1.25e+00

Pos logit contributions
ement+0.059
 nervously+0.058
 curiously+0.056
 sadly+0.056
 closely+0.054
Neg logit contributions
ww-0.044
fl-0.041
rav-0.041
eware-0.039
w-0.039
Token!"
Token ID2474
Feature activation+1.41e+00

Pos logit contributions
ement+0.041
 nervously+0.039
 curiously+0.037
 sadly+0.037
 frantically+0.035
Neg logit contributions
ww-0.029
eware-0.026
fl-0.026
less-0.024
w-0.024
Token Tom
Token ID4186
Feature activation+3.76e-01

Pos logit contributions
ement+0.041
 sadly+0.036
 nervously+0.036
 deeply+0.035
 curiously+0.035
Neg logit contributions
eware-0.036
ww-0.035
rav-0.033
fl-0.033
Fl-0.032
Token fear
Token ID3252
Feature activation-6.82e-01

Pos logit contributions
ement+0.061
 nervously+0.057
 sadly+0.055
 curiously+0.054
 closely+0.052
Neg logit contributions
ww-0.064
rav-0.062
eware-0.061
fl-0.060
less-0.058
Token.
Token ID13
Feature activation+2.13e-01

Pos logit contributions
ww+0.025
rav+0.024
eware+0.023
fl+0.023
w+0.022
Neg logit contributions
ement-0.013
 nervously-0.012
 sadly-0.011
 curiously-0.011
 closely-0.010
Token No
Token ID1400
Feature activation+2.13e+00

Pos logit contributions
ement+0.005
 nervously+0.005
 curiously+0.005
 sadly+0.005
 closely+0.005
Neg logit contributions
ww-0.006
rav-0.006
eware-0.006
Fl-0.006
fl-0.005
Token matter
Token ID2300
Feature activation-3.19e-01

Pos logit contributions
ement+0.066
 nervously+0.065
 sadly+0.063
 curiously+0.062
 closely+0.060
Neg logit contributions
ww-0.050
rav-0.047
eware-0.046
fl-0.045
w-0.043
Token how
Token ID703
Feature activation-1.17e-01

Pos logit contributions
ww+0.010
rav+0.009
eware+0.009
fl+0.009
w+0.009
Neg logit contributions
ement-0.008
 nervously-0.007
 sadly-0.007
 curiously-0.007
 closely-0.007
Token big
Token ID1263
Feature activation+5.35e+00

Pos logit contributions
ww+0.004
eware+0.003
rav+0.003
less+0.003
w+0.003
Neg logit contributions
ement-0.003
 nervously-0.003
 curiously-0.002
 sadly-0.002
 finally-0.002
Token or
Token ID393
Feature activation+7.43e-01

Pos logit contributions
ement+0.146
 nervously+0.143
 curiously+0.138
 sadly+0.136
 proudly+0.130
Neg logit contributions
ww-0.148
fl-0.135
rav-0.133
w-0.131
eware-0.129
Token mighty
Token ID18680
Feature activation+3.31e+00

Pos logit contributions
ement+0.018
 nervously+0.018
 sadly+0.017
 curiously+0.017
 closely+0.016
Neg logit contributions
ww-0.022
rav-0.021
eware-0.021
fl-0.021
w-0.020
Token something
Token ID1223
Feature activation+9.36e-01

Pos logit contributions
ement+0.095
 nervously+0.091
 curiously+0.088
 sadly+0.087
 closely+0.083
Neg logit contributions
ww-0.087
rav-0.080
fl-0.079
eware-0.077
w-0.075
Token seems
Token ID2331
Feature activation+1.38e+00

Pos logit contributions
ement+0.027
 nervously+0.026
 sadly+0.026
 curiously+0.025
 eagerly+0.024
Neg logit contributions
ww-0.024
rav-0.022
fl-0.022
eware-0.022
w-0.020
Token,
Token ID11
Feature activation+3.66e-01

Pos logit contributions
ement+0.028
 nervously+0.026
 sadly+0.025
 curiously+0.025
 closely+0.023
Neg logit contributions
ww-0.047
eware-0.045
rav-0.045
fl-0.044
less-0.043
Token the
Token ID262
Feature activation-1.76e-01

Pos logit contributions
ement+0.028
 nervously+0.027
 sadly+0.025
 curiously+0.025
 eagerly+0.023
Neg logit contributions
ww-0.041
fl-0.038
eware-0.037
Fl-0.037
w-0.037
Token water
Token ID1660
Feature activation-1.48e+00

Pos logit contributions
ww+0.006
rav+0.005
eware+0.005
fl+0.005
w+0.005
Neg logit contributions
ement-0.004
 nervously-0.004
 sadly-0.004
 curiously-0.004
 closely-0.004
Token.
Token ID13
Feature activation-4.39e-03

Pos logit contributions
ww+0.042
rav+0.040
eware+0.039
fl+0.039
w+0.037
Neg logit contributions
ement-0.038
 nervously-0.037
 sadly-0.036
 curiously-0.036
 closely-0.034
Token See
Token ID4091
Feature activation+1.24e+00

Pos logit contributions
ww+0.000
rav+0.000
fl+0.000
eware+0.000
w+0.000
Neg logit contributions
ement-0.000
 nervously-0.000
 sadly-0.000
 curiously-0.000
 He-0.000
Token how
Token ID703
Feature activation-2.19e+00

Pos logit contributions
ement+0.030
 nervously+0.029
 sadly+0.027
 curiously+0.026
 frantically+0.026
Neg logit contributions
ww-0.037
fl-0.036
rav-0.035
w-0.034
eware-0.034
Token big
Token ID1263
Feature activation+5.30e+00

Pos logit contributions
ww+0.075
rav+0.072
fl+0.071
eware+0.071
Fl+0.068
Neg logit contributions
ement-0.044
 nervously-0.043
 sadly-0.042
 curiously-0.041
 closely-0.039
Token and
Token ID290
Feature activation-7.75e-02

Pos logit contributions
ement+0.130
 nervously+0.128
 curiously+0.120
 sadly+0.119
 proudly+0.111
Neg logit contributions
ww-0.165
fl-0.155
w-0.147
eware-0.143
rav-0.143
Token strong
Token ID1913
Feature activation+3.42e+00

Pos logit contributions
ww+0.002
rav+0.002
eware+0.002
fl+0.002
w+0.002
Neg logit contributions
ement-0.002
 nervously-0.002
 sadly-0.002
 curiously-0.002
 closely-0.002
Token you
Token ID345
Feature activation-8.76e-02

Pos logit contributions
ement+0.088
 nervously+0.085
 curiously+0.081
 sadly+0.080
 proudly+0.076
Neg logit contributions
ww-0.103
fl-0.095
w-0.091
eware-0.090
rav-0.090
Token are
Token ID389
Feature activation+1.36e+00

Pos logit contributions
ww+0.002
rav+0.002
eware+0.002
fl+0.002
w+0.002
Neg logit contributions
ement-0.003
 nervously-0.003
 sadly-0.003
 curiously-0.003
 closely-0.003
Token.
Token ID13
Feature activation+1.81e-01

Pos logit contributions
ement+0.037
 nervously+0.036
 sadly+0.035
 curiously+0.034
 closely+0.033
Neg logit contributions
ww-0.038
rav-0.036
eware-0.035
fl-0.035
w-0.033
Token
Token ID198
Feature activation+7.23e-01

Pos logit contributions
ement+0.018
 nervously+0.017
 curiously+0.016
 frantically+0.016
 deeply+0.016
Neg logit contributions
ww-0.016
rav-0.016
Fl-0.016
eware-0.015
fl-0.014
Token
Token ID198
Feature activation+3.77e-01

Pos logit contributions
ement+0.018
 nervously+0.016
 sadly+0.016
 curiously+0.016
 closely+0.015
Neg logit contributions
ww-0.022
rav-0.021
eware-0.020
Fl-0.020
fl-0.020
Token"
Token ID1
Feature activation-1.45e+00

Pos logit contributions
ement+0.011
 nervously+0.011
 sadly+0.011
 curiously+0.011
 closely+0.010
Neg logit contributions
ww-0.009
fl-0.009
rav-0.009
eware-0.009
w-0.008
TokenLook
Token ID8567
Feature activation+3.31e-01

Pos logit contributions
ww+0.036
eware+0.035
rav+0.034
fl+0.034
less+0.032
Neg logit contributions
 nervously-0.040
ement-0.040
 finally-0.040
 curiously-0.038
 eagerly-0.038
Token how
Token ID703
Feature activation-2.37e+00

Pos logit contributions
ement+0.006
 nervously+0.005
 curiously+0.005
 sadly+0.005
 frantically+0.005
Neg logit contributions
ww-0.012
rav-0.012
eware-0.012
fl-0.011
w-0.011
Token high
Token ID1029
Feature activation+5.29e+00

Pos logit contributions
fl+0.079
ww+0.078
rav+0.075
eware+0.074
Fl+0.072
Neg logit contributions
 nervously-0.047
ement-0.046
 sadly-0.046
 curiously-0.044
 closely-0.044
Token I
Token ID314
Feature activation+8.75e-01

Pos logit contributions
ement+0.120
 nervously+0.117
 curiously+0.111
 sadly+0.109
 frantically+0.103
Neg logit contributions
ww-0.172
fl-0.162
eware-0.156
rav-0.156
w-0.154
Token can
Token ID460
Feature activation+7.61e-01

Pos logit contributions
ement+0.026
 nervously+0.025
 sadly+0.024
 curiously+0.024
 closely+0.023
Neg logit contributions
ww-0.022
rav-0.021
fl-0.020
eware-0.020
w-0.019
Token bounce
Token ID20110
Feature activation+1.16e+00

Pos logit contributions
ement+0.023
 nervously+0.021
 sadly+0.021
 curiously+0.020
 closely+0.019
Neg logit contributions
ww-0.019
rav-0.018
eware-0.018
fl-0.018
w-0.017
Token it
Token ID340
Feature activation+5.01e-01

Pos logit contributions
ement+0.028
 nervously+0.026
 curiously+0.025
 sadly+0.024
 eagerly+0.023
Neg logit contributions
ww-0.037
fl-0.034
eware-0.033
rav-0.033
w-0.033
Token!"
Token ID2474
Feature activation+1.31e+00

Pos logit contributions
ement+0.015
 nervously+0.014
 curiously+0.013
 frantically+0.013
 sadly+0.013
Neg logit contributions
ww-0.013
fl-0.012
w-0.011
eware-0.011
cr-0.011
Token
Token ID198
Feature activation+6.13e-01

Pos logit contributions
ement+0.016
 nervously+0.015
 sadly+0.015
 curiously+0.014
 closely+0.014
Neg logit contributions
eware-0.017
ww-0.017
rav-0.017
fl-0.017
Fl-0.016
Token
Token ID198
Feature activation+1.42e-01

Pos logit contributions
ement+0.015
 nervously+0.014
 sadly+0.013
 curiously+0.013
 closely+0.012
Neg logit contributions
ww-0.019
rav-0.018
eware-0.018
Fl-0.017
fl-0.017
Token"
Token ID1
Feature activation-1.57e+00

Pos logit contributions
 nervously+0.004
ement+0.004
 sadly+0.003
 curiously+0.003
 closely+0.003
Neg logit contributions
ww-0.004
fl-0.004
rav-0.004
eware-0.004
less-0.004
TokenLook
Token ID8567
Feature activation+4.23e-01

Pos logit contributions
ww+0.040
rav+0.040
eware+0.037
less+0.037
fl+0.034
Neg logit contributions
 nervously-0.044
 finally-0.043
 sadly-0.042
ement-0.041
 curiously-0.041
Token how
Token ID703
Feature activation-1.95e+00

Pos logit contributions
ement+0.007
 nervously+0.007
 sadly+0.006
 curiously+0.006
 frantically+0.006
Neg logit contributions
ww-0.016
rav-0.015
eware-0.015
fl-0.015
w-0.014
Token big
Token ID1263
Feature activation+5.27e+00

Pos logit contributions
fl+0.067
ww+0.066
rav+0.065
Fl+0.062
eware+0.061
Neg logit contributions
 nervously-0.036
 sadly-0.036
ement-0.034
 curiously-0.033
 closely-0.033
Token our
Token ID674
Feature activation+1.68e-01

Pos logit contributions
 nervously+0.145
ement+0.145
 curiously+0.140
 sadly+0.137
 proudly+0.130
Neg logit contributions
ww-0.150
w-0.132
fl-0.132
rav-0.130
eware-0.130
Token pile
Token ID14540
Feature activation+1.60e+00

Pos logit contributions
ement+0.004
 nervously+0.004
 sadly+0.004
 curiously+0.004
 closely+0.004
Neg logit contributions
ww-0.005
rav-0.005
fl-0.004
eware-0.004
w-0.004
Token is
Token ID318
Feature activation+1.39e+00

Pos logit contributions
ement+0.048
 nervously+0.047
 sadly+0.045
 curiously+0.045
 closely+0.044
Neg logit contributions
ww-0.039
rav-0.037
fl-0.037
eware-0.036
w-0.035
Token now
Token ID783
Feature activation+3.39e-01

Pos logit contributions
ement+0.043
 nervously+0.041
 curiously+0.040
 sadly+0.039
 closely+0.038
Neg logit contributions
ww-0.034
eware-0.031
fl-0.031
rav-0.030
w-0.029
Token!"
Token ID2474
Feature activation+1.25e+00

Pos logit contributions
ement+0.012
 nervously+0.011
 curiously+0.011
 sadly+0.011
 frantically+0.010
Neg logit contributions
ww-0.007
eware-0.006
fl-0.006
w-0.006
rav-0.006
Token moon
Token ID8824
Feature activation+1.25e+00

Pos logit contributions
ement+0.021
 nervously+0.021
 sadly+0.020
 curiously+0.020
 closely+0.019
Neg logit contributions
ww-0.021
rav-0.020
fl-0.020
eware-0.020
w-0.019
Token.
Token ID13
Feature activation-6.71e-02

Pos logit contributions
ement+0.024
 nervously+0.023
 sadly+0.022
 curiously+0.021
 closely+0.020
Neg logit contributions
ww-0.044
rav-0.042
fl-0.042
eware-0.042
w-0.040
Token This
Token ID770
Feature activation-1.58e+00

Pos logit contributions
ww+0.002
rav+0.002
fl+0.002
eware+0.002
w+0.002
Neg logit contributions
ement-0.001
 nervously-0.001
 sadly-0.001
 He-0.001
 curiously-0.001
Token is
Token ID318
Feature activation-3.93e-01

Pos logit contributions
ww+0.040
rav+0.039
eware+0.038
less+0.037
Fl+0.037
Neg logit contributions
ement-0.046
 nervously-0.044
 sadly-0.043
 curiously-0.043
 closely-0.042
Token how
Token ID703
Feature activation-2.11e+00

Pos logit contributions
ww+0.012
rav+0.011
eware+0.011
fl+0.011
w+0.011
Neg logit contributions
ement-0.009
 nervously-0.009
 sadly-0.009
 curiously-0.009
 closely-0.008
Token big
Token ID1263
Feature activation+5.27e+00

Pos logit contributions
ww+0.059
rav+0.056
fl+0.056
eware+0.055
Fl+0.053
Neg logit contributions
ement-0.054
 nervously-0.053
 sadly-0.052
 curiously-0.050
 closely-0.049
Token and
Token ID290
Feature activation-1.16e-01

Pos logit contributions
ement+0.113
 nervously+0.109
 curiously+0.104
 sadly+0.101
 eagerly+0.092
Neg logit contributions
ww-0.181
fl-0.168
w-0.165
rav-0.157
eware-0.156
Token far
Token ID1290
Feature activation+2.69e+00

Pos logit contributions
ww+0.004
rav+0.003
eware+0.003
fl+0.003
w+0.003
Neg logit contributions
ement-0.003
 nervously-0.003
 sadly-0.003
 curiously-0.002
 closely-0.002
Token I
Token ID314
Feature activation+1.14e+00

Pos logit contributions
ement+0.066
 nervously+0.063
 curiously+0.060
 sadly+0.060
 closely+0.056
Neg logit contributions
ww-0.083
rav-0.077
fl-0.076
eware-0.076
w-0.075
Token am
Token ID716
Feature activation+6.55e-01

Pos logit contributions
ement+0.034
 nervously+0.033
 sadly+0.032
 curiously+0.031
 closely+0.030
Neg logit contributions
ww-0.029
rav-0.027
eware-0.027
fl-0.027
w-0.025
Token.
Token ID13
Feature activation-9.36e-02

Pos logit contributions
ement+0.017
 nervously+0.016
 sadly+0.015
 curiously+0.015
 closely+0.014
Neg logit contributions
ww-0.019
eware-0.018
rav-0.018
fl-0.017
w-0.017
Token learned
Token ID4499
Feature activation-6.31e-01

Pos logit contributions
ement+0.004
 nervously+0.004
 curiously+0.004
 closely+0.004
 sadly+0.004
Neg logit contributions
ww-0.003
rav-0.003
fl-0.003
w-0.003
Fl-0.003
Token that
Token ID326
Feature activation-1.34e+00

Pos logit contributions
ww+0.021
rav+0.020
fl+0.020
eware+0.020
w+0.019
Neg logit contributions
ement-0.014
 nervously-0.013
 sadly-0.012
 curiously-0.012
 closely-0.012
Token no
Token ID645
Feature activation+1.56e+00

Pos logit contributions
ww+0.044
rav+0.042
eware+0.041
fl+0.041
w+0.039
Neg logit contributions
ement-0.030
 nervously-0.029
 sadly-0.027
 curiously-0.027
 closely-0.026
Token matter
Token ID2300
Feature activation-7.55e-01

Pos logit contributions
ement+0.054
 nervously+0.053
 sadly+0.051
 curiously+0.050
 proudly+0.049
Neg logit contributions
ww-0.032
fl-0.028
eware-0.028
rav-0.028
w-0.026
Token how
Token ID703
Feature activation-6.51e-01

Pos logit contributions
ww+0.024
rav+0.023
eware+0.022
fl+0.022
w+0.021
Neg logit contributions
ement-0.018
 nervously-0.017
 sadly-0.017
 curiously-0.016
 closely-0.016
Token big
Token ID1263
Feature activation+5.26e+00

Pos logit contributions
ww+0.021
rav+0.020
eware+0.020
fl+0.020
w+0.019
Neg logit contributions
ement-0.015
 nervously-0.014
 sadly-0.013
 curiously-0.013
 closely-0.012
Token or
Token ID393
Feature activation+3.18e-01

Pos logit contributions
ement+0.140
 nervously+0.138
 sadly+0.133
 curiously+0.131
 proudly+0.126
Neg logit contributions
ww-0.149
fl-0.141
w-0.134
rav-0.132
eware-0.132
Token mysterious
Token ID11428
Feature activation+1.52e+00

Pos logit contributions
ement+0.007
 nervously+0.007
 sadly+0.007
 curiously+0.007
 closely+0.007
Neg logit contributions
ww-0.010
rav-0.010
eware-0.009
fl-0.009
w-0.009
Token the
Token ID262
Feature activation+1.39e+00

Pos logit contributions
ement+0.038
 nervously+0.037
 sadly+0.036
 curiously+0.035
 proudly+0.033
Neg logit contributions
ww-0.045
fl-0.043
rav-0.042
eware-0.041
w-0.040
Token cliff
Token ID19516
Feature activation+1.67e+00

Pos logit contributions
ement+0.037
 nervously+0.034
 sadly+0.034
 curiously+0.032
 proudly+0.032
Neg logit contributions
ww-0.040
fl-0.038
eware-0.037
rav-0.037
w-0.035
Token was
Token ID373
Feature activation+1.27e+00

Pos logit contributions
 nervously+0.053
ement+0.053
 curiously+0.051
 sadly+0.050
 closely+0.048
Neg logit contributions
ww-0.035
fl-0.034
rav-0.033
eware-0.032
w-0.032
Token says
Token ID1139
Feature activation-4.04e-01

Pos logit contributions
ement+0.036
 sadly+0.033
 nervously+0.032
 victorious+0.032
 deeply+0.031
Neg logit contributions
ww-0.021
fl-0.021
rav-0.020
eware-0.018
w-0.018
Token,
Token ID11
Feature activation+4.15e-01

Pos logit contributions
ww+0.017
rav+0.016
fl+0.016
eware+0.016
w+0.016
Neg logit contributions
ement-0.005
 nervously-0.005
 sadly-0.005
 curiously-0.005
 closely-0.004
Token "
Token ID366
Feature activation-9.24e-01

Pos logit contributions
ement+0.009
 nervously+0.009
 sadly+0.009
 curiously+0.009
 proudly+0.008
Neg logit contributions
ww-0.014
rav-0.013
eware-0.013
fl-0.013
w-0.013
TokenThey
Token ID2990
Feature activation+1.89e-01

Pos logit contributions
rav+0.025
less+0.025
ww+0.024
eware+0.023
uc+0.023
Neg logit contributions
 He-0.024
 nervously-0.022
 curiously-0.022
ement-0.022
 closely-0.021
Token are
Token ID389
Feature activation+3.74e-01

Pos logit contributions
ement+0.007
 nervously+0.006
 sadly+0.006
 curiously+0.006
 closely+0.006
Neg logit contributions
ww-0.004
rav-0.003
eware-0.003
fl-0.003
w-0.003
Token av
Token ID1196
Feature activation+5.25e+00

Pos logit contributions
ement+0.011
 nervously+0.010
 sadly+0.009
 curiously+0.009
 frantically+0.008
Neg logit contributions
ww-0.011
fl-0.010
Fl-0.010
eware-0.010
rav-0.010
Tokenoc
Token ID420
Feature activation+2.21e-01

Pos logit contributions
ement+0.153
 nervously+0.151
 sadly+0.146
 curiously+0.144
 closely+0.137
Neg logit contributions
ww-0.139
fl-0.126
rav-0.125
w-0.122
eware-0.122
Tokenados
Token ID22484
Feature activation+1.84e+00

Pos logit contributions
ement+0.004
 nervously+0.004
 sadly+0.004
 curiously+0.004
 closely+0.004
Neg logit contributions
ww-0.008
rav-0.008
fl-0.007
eware-0.007
w-0.007
Token.
Token ID13
Feature activation+5.84e-01

Pos logit contributions
ement+0.039
 nervously+0.038
 sadly+0.036
 curiously+0.035
 closely+0.033
Neg logit contributions
ww-0.062
fl-0.059
rav-0.058
eware-0.058
w-0.056
Token My
Token ID2011
Feature activation+3.04e-01

Pos logit contributions
ement+0.015
 nervously+0.014
 sadly+0.014
 curiously+0.014
 closely+0.013
Neg logit contributions
ww-0.017
rav-0.016
eware-0.016
fl-0.016
w-0.015
Token mom
Token ID1995
Feature activation+5.38e-01

Pos logit contributions
ement+0.008
 nervously+0.008
 sadly+0.008
 frantically+0.008
 curiously+0.008
Neg logit contributions
fl-0.008
ww-0.008
rav-0.008
eware-0.008
less-0.007
Token.
Token ID13
Feature activation-3.08e-01

Pos logit contributions
ww+0.020
rav+0.019
eware+0.018
fl+0.018
w+0.017
Neg logit contributions
ement-0.018
 nervously-0.017
 sadly-0.017
 curiously-0.016
 closely-0.016
Token Can
Token ID1680
Feature activation+9.84e-01

Pos logit contributions
ww+0.010
rav+0.009
eware+0.009
fl+0.009
Fl+0.009
Neg logit contributions
ement-0.007
 nervously-0.007
 sadly-0.006
 curiously-0.006
 closely-0.006
Token you
Token ID345
Feature activation+5.99e-01

Pos logit contributions
ement+0.037
 nervously+0.036
 sadly+0.035
 curiously+0.035
 closely+0.034
Neg logit contributions
ww-0.017
rav-0.015
eware-0.015
fl-0.014
Fl-0.014
Token jump
Token ID4391
Feature activation+1.23e+00

Pos logit contributions
ement+0.021
 nervously+0.019
 sadly+0.018
 curiously+0.018
 deeply+0.018
Neg logit contributions
ww-0.011
eware-0.011
rav-0.010
w-0.010
cr-0.010
Token as
Token ID355
Feature activation-4.82e-02

Pos logit contributions
ement+0.028
 nervously+0.025
 sadly+0.024
 curiously+0.024
 closely+0.022
Neg logit contributions
ww-0.040
eware-0.038
rav-0.038
fl-0.038
w-0.037
Token high
Token ID1029
Feature activation+5.24e+00

Pos logit contributions
ww+0.001
w+0.001
eware+0.001
rav+0.001
less+0.001
Neg logit contributions
ement-0.001
 nervously-0.001
 sadly-0.001
 curiously-0.001
 closely-0.001
Token as
Token ID355
Feature activation+4.14e-01

Pos logit contributions
ement+0.113
 nervously+0.101
 curiously+0.094
 sadly+0.093
 frantically+0.090
Neg logit contributions
ww-0.180
fl-0.172
w-0.166
eware-0.165
rav-0.156
Token you
Token ID345
Feature activation+7.00e-01

Pos logit contributions
ement+0.016
 nervously+0.015
 sadly+0.014
 curiously+0.014
 closely+0.013
Neg logit contributions
ww-0.008
w-0.007
eware-0.007
rav-0.006
Fl-0.006
Token can
Token ID460
Feature activation+1.43e+00

Pos logit contributions
ement+0.022
 nervously+0.021
 sadly+0.020
 curiously+0.020
 closely+0.019
Neg logit contributions
ww-0.016
rav-0.016
eware-0.016
fl-0.015
w-0.015
Token and
Token ID290
Feature activation+1.77e-01

Pos logit contributions
ement+0.041
 nervously+0.037
 sadly+0.036
 curiously+0.035
 frantically+0.034
Neg logit contributions
ww-0.037
fl-0.036
rav-0.036
cr-0.035
eware-0.035
Token touch
Token ID3638
Feature activation+5.44e-01

Pos logit contributions
ement+0.005
 nervously+0.005
 sadly+0.005
 curiously+0.004
 closely+0.004
Neg logit contributions
ww-0.005
fl-0.004
eware-0.004
rav-0.004
Fl-0.004
Token
Token ID241
Feature activation-1.43e+00

Pos logit contributions
ww+0.000
rav+0.000
eware+0.000
fl+0.000
w+0.000
Neg logit contributions
ement-0.000
 nervously-0.000
 sadly-0.000
 curiously-0.000
 proudly-0.000
TokenWow
Token ID22017
Feature activation+4.92e-01

Pos logit contributions
ww+0.034
fl+0.031
rav+0.031
eware+0.031
ucer+0.029
Neg logit contributions
 nervously-0.042
 He-0.041
ement-0.041
 curiously-0.041
 finally-0.039
Token,
Token ID11
Feature activation-2.56e-01

Pos logit contributions
ement+0.009
 nervously+0.009
 sadly+0.008
 curiously+0.008
 closely+0.008
Neg logit contributions
ww-0.018
rav-0.017
eware-0.017
fl-0.017
Fl-0.016
Token look
Token ID804
Feature activation+1.68e+00

Pos logit contributions
ww+0.009
rav+0.008
fl+0.008
eware+0.008
w+0.008
Neg logit contributions
ement-0.005
 nervously-0.005
 sadly-0.005
 curiously-0.005
 closely-0.005
Token how
Token ID703
Feature activation-2.48e+00

Pos logit contributions
ement+0.020
 nervously+0.019
 sadly+0.018
 curiously+0.017
 closely+0.015
Neg logit contributions
ww-0.072
rav-0.070
eware-0.069
fl-0.069
w-0.067
Token big
Token ID1263
Feature activation+5.24e+00

Pos logit contributions
fl+0.082
ww+0.082
Fl+0.080
eware+0.080
rav+0.079
Neg logit contributions
 nervously-0.046
ement-0.045
 sadly-0.045
 closely-0.044
 He-0.043
Token it
Token ID340
Feature activation+1.13e+00

Pos logit contributions
ement+0.136
 nervously+0.134
 sadly+0.128
 curiously+0.127
 closely+0.120
Neg logit contributions
ww-0.152
rav-0.142
fl-0.139
eware-0.138
w-0.135
Token is
Token ID318
Feature activation+1.96e+00

Pos logit contributions
ement+0.035
 nervously+0.034
 sadly+0.033
 curiously+0.032
 closely+0.031
Neg logit contributions
ww-0.027
fl-0.026
rav-0.025
eware-0.025
w-0.024
Token!
Token ID0
Feature activation+7.09e-01

Pos logit contributions
ement+0.039
 nervously+0.037
 sadly+0.036
 curiously+0.035
 closely+0.033
Neg logit contributions
ww-0.068
rav-0.065
eware-0.065
fl-0.065
w-0.062
Tokenâ
Token ID22940
Feature activation-5.67e-01

Pos logit contributions
ement+0.014
 nervously+0.013
 sadly+0.013
 curiously+0.012
 closely+0.012
Neg logit contributions
ww-0.025
rav-0.024
fl-0.024
eware-0.024
w-0.023
Token
Token ID26391
Feature activation-7.90e-01

Pos logit contributions
ww+0.014
rav+0.014
eware+0.014
fl+0.013
less+0.013
Neg logit contributions
ement-0.017
 nervously-0.015
 sadly-0.015
 finally-0.015
 closely-0.015
Token he
Token ID339
Feature activation+1.76e+00

Pos logit contributions
ement+0.026
 nervously+0.020
 curiously+0.019
 closely+0.018
 deeply+0.018
Neg logit contributions
ww-0.040
fl-0.039
Fl-0.038
eware-0.038
w-0.038
Token couldn
Token ID3521
Feature activation+1.92e+00

Pos logit contributions
ement+0.056
 nervously+0.047
 closely+0.046
 curiously+0.046
 deeply+0.046
Neg logit contributions
ww-0.038
a-0.038
w-0.037
eware-0.037
cr-0.036
Token't
Token ID470
Feature activation-1.42e-01

Pos logit contributions
ement+0.069
 nervously+0.065
 curiously+0.063
 closely+0.063
 armor+0.062
Neg logit contributions
ww-0.035
Fl-0.032
eware-0.032
rav-0.031
fl-0.030
Token see
Token ID766
Feature activation+2.43e+00

Pos logit contributions
cr+0.003
eware+0.003
'm+0.003
free+0.003
Fl+0.002
Neg logit contributions
ement-0.005
 sadly-0.004
 armor-0.004
 nervously-0.004
 novels-0.004
Token how
Token ID703
Feature activation-1.26e+00

Pos logit contributions
ement+0.062
 nervously+0.056
 sadly+0.054
 curiously+0.054
 proudly+0.052
Neg logit contributions
fl-0.068
ww-0.064
eware-0.062
Fl-0.062
rav-0.060
Token tall
Token ID7331
Feature activation+5.20e+00

Pos logit contributions
ww+0.039
rav+0.036
less+0.036
eware+0.036
fl+0.035
Neg logit contributions
ement-0.030
 nervously-0.027
 sadly-0.026
 curiously-0.025
 finally-0.024
Token he
Token ID339
Feature activation+2.67e+00

Pos logit contributions
ement+0.110
 nervously+0.105
 curiously+0.102
 sadly+0.102
 proudly+0.096
Neg logit contributions
ww-0.164
fl-0.163
Fl-0.145
w-0.141
eware-0.140
Token was
Token ID373
Feature activation+2.53e+00

Pos logit contributions
ement+0.095
 nervously+0.087
 sadly+0.085
 deeply+0.084
 closely+0.084
Neg logit contributions
fl-0.047
Fl-0.044
eware-0.044
ww-0.042
w-0.040
Token!
Token ID0
Feature activation+6.63e-01

Pos logit contributions
ement+0.054
 nervously+0.047
 sadly+0.045
 curiously+0.043
 deeply+0.041
Neg logit contributions
fl-0.081
eware-0.079
ww-0.078
less-0.077
cr-0.076
Token (space)
Token ID220
Feature activation-5.67e-01

Pos logit contributions
ement+0.014
 closely+0.013
 nervously+0.012
 curiously+0.012
 sadly+0.012
Neg logit contributions
ww-0.021
Fl-0.021
eware-0.021
rav-0.020
fl-0.020
Token (newline)
Token ID198
Feature activation+3.16e-01

Pos logit contributions
rav+0.014
eware+0.014
fl+0.014
Fl+0.014
less+0.014
Neg logit contributions
ement-0.015
 nervously-0.014
 sadly-0.014
 closely-0.013
 curiously-0.013
Token Maybe
Token ID6674
Feature activation+2.72e+00

Pos logit contributions
ement+0.001
 nervously+0.001
 sadly+0.001
 curiously+0.001
 closely+0.001
Neg logit contributions
ww-0.001
rav-0.001
fl-0.001
eware-0.001
w-0.001
Token they
Token ID484
Feature activation+6.14e-01

Pos logit contributions
ement+0.069
 nervously+0.065
 sadly+0.062
 curiously+0.061
 deeply+0.058
Neg logit contributions
ww-0.081
fl-0.077
eware-0.077
rav-0.077
w-0.072
Token could
Token ID714
Feature activation+4.26e-01

Pos logit contributions
ement+0.023
 nervously+0.021
 curiously+0.021
 sadly+0.020
 closely+0.020
Neg logit contributions
rav-0.010
w-0.010
fl-0.010
ww-0.010
eware-0.009
Token grow
Token ID1663
Feature activation-6.59e-01

Pos logit contributions
ement+0.015
 nervously+0.014
 sadly+0.013
 frantically+0.012
 curiously+0.012
Neg logit contributions
cr-0.008
fl-0.008
ww-0.008
w-0.007
Fl-0.007
Token more
Token ID517
Feature activation+1.09e-02

Pos logit contributions
fl+0.020
ww+0.020
rav+0.019
eware+0.019
less+0.018
Neg logit contributions
ement-0.015
 nervously-0.014
 sadly-0.014
 curiously-0.013
 frantically-0.013
Token av
Token ID1196
Feature activation+5.20e+00

Pos logit contributions
ement+0.000
 nervously+0.000
 sadly+0.000
 curiously+0.000
 frantically+0.000
Neg logit contributions
ww-0.000
fl-0.000
w-0.000
eware-0.000
rav-0.000
Tokenoc
Token ID420
Feature activation+1.36e+00

Pos logit contributions
ement+0.161
 nervously+0.158
 sadly+0.151
 curiously+0.151
 frantically+0.141
Neg logit contributions
ww-0.130
fl-0.120
w-0.115
rav-0.114
eware-0.113
Tokenados
Token ID22484
Feature activation+2.35e+00

Pos logit contributions
 nervously+0.030
ement+0.028
 sadly+0.028
 curiously+0.028
 closely+0.027
Neg logit contributions
ww-0.044
fl-0.043
rav-0.043
eware-0.042
w-0.041
Token and
Token ID290
Feature activation-3.08e-01

Pos logit contributions
ement+0.052
 nervously+0.048
 curiously+0.046
 sadly+0.045
 frantically+0.042
Neg logit contributions
ww-0.076
fl-0.075
rav-0.072
cr-0.071
eware-0.071
Token share
Token ID2648
Feature activation-5.13e-01

Pos logit contributions
ww+0.008
fl+0.008
rav+0.007
eware+0.007
Fl+0.007
Neg logit contributions
ement-0.009
 nervously-0.008
 sadly-0.008
 curiously-0.008
 closely-0.008
Token them
Token ID606
Feature activation+9.70e-01

Pos logit contributions
ww+0.017
rav+0.016
fl+0.016
eware+0.015
w+0.015
Neg logit contributions
ement-0.011
 nervously-0.011
 sadly-0.010
 curiously-0.010
 closely-0.009
Token than
Token ID621
Feature activation+2.26e+00

Pos logit contributions
ww+0.014
rav+0.013
eware+0.013
fl+0.013
w+0.013
Neg logit contributions
ement-0.013
 nervously-0.012
 curiously-0.011
 sadly-0.011
 closely-0.011
Token yours
Token ID12431
Feature activation-4.10e-03

Pos logit contributions
ement+0.081
 nervously+0.077
 sadly+0.074
 curiously+0.074
 closely+0.071
Neg logit contributions
ww-0.046
rav-0.042
eware-0.040
w-0.039
fl-0.039
Token.
Token ID13
Feature activation+3.37e-01

Pos logit contributions
ww+0.000
rav+0.000
eware+0.000
fl+0.000
w+0.000
Neg logit contributions
ement-0.000
 nervously-0.000
 curiously-0.000
 sadly-0.000
 closely-0.000
Token Look
Token ID6803
Feature activation-9.70e-02

Pos logit contributions
ement+0.009
 nervously+0.009
 sadly+0.009
 curiously+0.009
 closely+0.008
Neg logit contributions
ww-0.009
rav-0.009
fl-0.009
eware-0.008
w-0.008
Token how
Token ID703
Feature activation-2.46e+00

Pos logit contributions
ww+0.004
rav+0.004
eware+0.003
fl+0.003
w+0.003
Neg logit contributions
ement-0.002
 nervously-0.002
 sadly-0.001
 curiously-0.001
 closely-0.001
Token high
Token ID1029
Feature activation+5.20e+00

Pos logit contributions
ww+0.071
rav+0.068
fl+0.068
eware+0.067
Fl+0.064
Neg logit contributions
ement-0.062
 nervously-0.061
 sadly-0.059
 curiously-0.058
 closely-0.056
Token it
Token ID340
Feature activation+7.37e-01

Pos logit contributions
ement+0.131
 nervously+0.130
 curiously+0.124
 sadly+0.121
 frantically+0.116
Neg logit contributions
ww-0.158
fl-0.151
w-0.142
eware-0.139
rav-0.137
Token is
Token ID318
Feature activation+1.23e+00

Pos logit contributions
ement+0.026
 nervously+0.025
 sadly+0.024
 curiously+0.024
 frantically+0.023
Neg logit contributions
ww-0.015
fl-0.014
rav-0.014
eware-0.013
w-0.013
Token!"
Token ID2474
Feature activation+1.56e+00

Pos logit contributions
ement+0.039
 nervously+0.037
 curiously+0.036
 sadly+0.035
 frantically+0.034
Neg logit contributions
ww-0.029
eware-0.027
fl-0.027
rav-0.026
w-0.025
Token He
Token ID679
Feature activation+7.81e-01

Pos logit contributions
ement+0.048
 closely+0.044
 curiously+0.043
 nervously+0.043
 frantically+0.041
Neg logit contributions
ww-0.035
Fl-0.034
eware-0.033
Br-0.032
L-0.032
Token was
Token ID373
Feature activation+3.78e-01

Pos logit contributions
ement+0.024
 nervously+0.021
 deeply+0.020
 curiously+0.020
 sadly+0.020
Neg logit contributions
ww-0.019
eware-0.017
Fl-0.017
w-0.017
rav-0.017
Token important
Token ID1593
Feature activation-9.17e-01

Pos logit contributions
ement+0.000
 nervously+0.000
 sadly+0.000
 curiously+0.000
 closely+0.000
Neg logit contributions
ww-0.000
rav-0.000
eware-0.000
fl-0.000
w-0.000
Token,
Token ID11
Feature activation-6.65e-01

Pos logit contributions
ww+0.033
rav+0.032
fl+0.032
eware+0.031
less+0.030
Neg logit contributions
ement-0.017
 nervously-0.016
 curiously-0.015
 sadly-0.015
 closely-0.014
Token no
Token ID645
Feature activation+3.67e-01

Pos logit contributions
ww+0.024
rav+0.023
fl+0.023
eware+0.022
Fl+0.022
Neg logit contributions
ement-0.013
 nervously-0.011
 curiously-0.011
 sadly-0.010
 closely-0.009
Token matter
Token ID2300
Feature activation-8.25e-01

Pos logit contributions
ement+0.011
 nervously+0.010
 sadly+0.010
 curiously+0.010
 closely+0.009
Neg logit contributions
ww-0.010
rav-0.009
fl-0.009
eware-0.009
w-0.008
Token how
Token ID703
Feature activation-9.21e-01

Pos logit contributions
ww+0.027
rav+0.026
eware+0.026
fl+0.026
w+0.025
Neg logit contributions
ement-0.018
 nervously-0.017
 sadly-0.017
 curiously-0.016
 closely-0.016
Token big
Token ID1263
Feature activation+5.20e+00

Pos logit contributions
ww+0.030
rav+0.029
fl+0.029
eware+0.029
Fl+0.028
Neg logit contributions
ement-0.020
 nervously-0.019
 sadly-0.019
 curiously-0.018
 closely-0.018
Token or
Token ID393
Feature activation+3.92e-01

Pos logit contributions
ement+0.126
 nervously+0.123
 curiously+0.118
 sadly+0.115
 proudly+0.109
Neg logit contributions
ww-0.161
fl-0.153
rav-0.148
w-0.146
eware-0.143
Token small
Token ID1402
Feature activation+2.94e+00

Pos logit contributions
ement+0.009
 nervously+0.009
 sadly+0.009
 curiously+0.008
 closely+0.008
Neg logit contributions
ww-0.012
rav-0.012
fl-0.011
eware-0.011
w-0.011
Token they
Token ID484
Feature activation+1.45e+00

Pos logit contributions
ement+0.068
 nervously+0.065
 curiously+0.061
 sadly+0.061
 closely+0.058
Neg logit contributions
ww-0.093
fl-0.090
rav-0.088
eware-0.087
w-0.084
Token are
Token ID389
Feature activation+1.63e+00

Pos logit contributions
ement+0.045
 nervously+0.044
 sadly+0.043
 curiously+0.042
 closely+0.041
Neg logit contributions
ww-0.034
rav-0.033
eware-0.032
fl-0.031
w-0.030
Token.
Token ID13
Feature activation+1.56e-01

Pos logit contributions
ement+0.033
 nervously+0.031
 sadly+0.029
 curiously+0.029
 closely+0.026
Neg logit contributions
ww-0.056
rav-0.054
eware-0.054
fl-0.054
less-0.051
Token something
Token ID1223
Feature activation+5.90e-02

Pos logit contributions
ww+0.008
fl+0.007
eware+0.007
rav+0.007
less+0.007
Neg logit contributions
ement-0.008
 nervously-0.007
 curiously-0.007
 sadly-0.007
 frantically-0.006
Token new
Token ID649
Feature activation+1.17e+00

Pos logit contributions
ement+0.001
 nervously+0.001
 sadly+0.001
 curiously+0.001
 finally+0.001
Neg logit contributions
ww-0.002
rav-0.002
fl-0.002
eware-0.002
less-0.002
Token!
Token ID0
Feature activation+4.58e-01

Pos logit contributions
ement+0.024
 nervously+0.023
 curiously+0.022
 sadly+0.021
 closely+0.020
Neg logit contributions
ww-0.039
fl-0.039
rav-0.038
eware-0.037
less-0.036
Token She
Token ID1375
Feature activation-2.77e-01

Pos logit contributions
ement+0.009
 nervously+0.009
 curiously+0.008
 sadly+0.008
 closely+0.008
Neg logit contributions
ww-0.016
rav-0.016
fl-0.015
eware-0.015
Fl-0.015
Token thought
Token ID1807
Feature activation+1.71e+00

Pos logit contributions
ww+0.007
fl+0.006
rav+0.006
Fl+0.006
cr+0.006
Neg logit contributions
ement-0.009
 nervously-0.008
 curiously-0.008
 closely-0.007
 anymore-0.007
Token av
Token ID1196
Feature activation+5.19e+00

Pos logit contributions
ement+0.042
 nervously+0.039
 sadly+0.037
 curiously+0.036
 frantically+0.035
Neg logit contributions
ww-0.046
rav-0.045
less-0.044
eware-0.042
fl-0.040
Tokenoc
Token ID420
Feature activation+5.01e-01

Pos logit contributions
ement+0.142
 nervously+0.139
 sadly+0.134
 curiously+0.132
 closely+0.127
Neg logit contributions
ww-0.142
rav-0.134
fl-0.132
eware-0.131
w-0.126
Tokenados
Token ID22484
Feature activation+1.71e+00

Pos logit contributions
 nervously+0.010
ement+0.010
 sadly+0.010
 curiously+0.010
 closely+0.009
Neg logit contributions
ww-0.017
fl-0.016
rav-0.016
eware-0.016
w-0.016
Token were
Token ID547
Feature activation+1.17e+00

Pos logit contributions
ement+0.048
 nervously+0.046
 sadly+0.045
 curiously+0.044
 closely+0.043
Neg logit contributions
ww-0.045
fl-0.043
rav-0.042
eware-0.041
w-0.040
Token the
Token ID262
Feature activation+8.07e-01

Pos logit contributions
ement+0.030
 nervously+0.029
 sadly+0.027
 curiously+0.027
 frantically+0.025
Neg logit contributions
ww-0.034
fl-0.032
rav-0.031
eware-0.030
Fl-0.030
Token most
Token ID749
Feature activation+5.29e-01

Pos logit contributions
ement+0.023
 nervously+0.023
 sadly+0.022
 curiously+0.022
 closely+0.021
Neg logit contributions
ww-0.021
fl-0.020
rav-0.019
eware-0.019
w-0.018
Token girl
Token ID2576
Feature activation-1.19e+00

Pos logit contributions
ww+0.009
rav+0.009
eware+0.009
fl+0.009
w+0.008
Neg logit contributions
ement-0.011
 nervously-0.011
 sadly-0.010
 curiously-0.010
 closely-0.010
Token who
Token ID508
Feature activation-1.02e+00

Pos logit contributions
rav+0.046
ww+0.046
eware+0.044
fl+0.043
w+0.042
Neg logit contributions
 nervously-0.019
 sadly-0.018
ement-0.017
 curiously-0.017
 closely-0.016
Token didn
Token ID1422
Feature activation+5.02e-01

Pos logit contributions
ww+0.036
rav+0.036
eware+0.035
fl+0.034
cr+0.033
Neg logit contributions
 He-0.018
 happily-0.018
 nervously-0.018
 sadly-0.018
 finally-0.018
Token't
Token ID470
Feature activation-2.67e-01

Pos logit contributions
ement+0.014
 nervously+0.013
 sadly+0.013
 curiously+0.012
 closely+0.012
Neg logit contributions
ww-0.014
rav-0.013
eware-0.013
fl-0.013
w-0.012
Token like
Token ID588
Feature activation+8.14e-01

Pos logit contributions
ww+0.009
rav+0.009
fl+0.008
eware+0.008
w+0.008
Neg logit contributions
 nervously-0.006
 sadly-0.005
ement-0.005
 curiously-0.005
 finally-0.005
Token av
Token ID1196
Feature activation+5.13e+00

Pos logit contributions
ement+0.018
 nervously+0.018
 sadly+0.017
 curiously+0.017
 closely+0.016
Neg logit contributions
ww-0.026
rav-0.025
eware-0.025
fl-0.024
w-0.024
Tokenoc
Token ID420
Feature activation+8.99e-01

Pos logit contributions
ement+0.148
 nervously+0.148
 sadly+0.141
 curiously+0.140
 closely+0.133
Neg logit contributions
ww-0.138
fl-0.126
w-0.123
rav-0.123
eware-0.120
Tokenados
Token ID22484
Feature activation+1.87e+00

Pos logit contributions
 nervously+0.022
 sadly+0.021
 curiously+0.020
 closely+0.020
ement+0.020
Neg logit contributions
ww-0.028
fl-0.027
rav-0.026
eware-0.026
w-0.025
Token.
Token ID13
Feature activation+6.60e-01

Pos logit contributions
ement+0.040
 nervously+0.038
 sadly+0.036
 curiously+0.035
 closely+0.033
Neg logit contributions
ww-0.062
fl-0.060
eware-0.058
rav-0.058
Fl-0.057
Token She
Token ID1375
Feature activation+3.58e-01

Pos logit contributions
ement+0.012
 nervously+0.011
 sadly+0.010
 curiously+0.010
 closely+0.010
Neg logit contributions
ww-0.025
rav-0.024
eware-0.023
fl-0.023
Fl-0.023
Token always
Token ID1464
Feature activation-5.87e-01

Pos logit contributions
ement+0.011
 nervously+0.010
 sadly+0.010
 curiously+0.010
 closely+0.010
Neg logit contributions
ww-0.009
fl-0.008
rav-0.008
w-0.008
eware-0.008
Token happy
Token ID3772
Feature activation+2.54e-01

Pos logit contributions
ww+0.022
rav+0.020
eware+0.020
fl+0.020
less+0.019
Neg logit contributions
ement-0.017
 nervously-0.016
 sadly-0.015
 curiously-0.015
 closely-0.014
Token that
Token ID326
Feature activation-8.94e-01

Pos logit contributions
ement+0.006
 nervously+0.006
 sadly+0.006
 curiously+0.006
 closely+0.005
Neg logit contributions
ww-0.008
fl-0.007
rav-0.007
eware-0.007
less-0.007
Token it
Token ID340
Feature activation-4.74e-01

Pos logit contributions
ww+0.028
rav+0.027
fl+0.027
eware+0.025
w+0.025
Neg logit contributions
ement-0.021
 nervously-0.019
 curiously-0.018
 sadly-0.017
 frantically-0.017
Token produced
Token ID4635
Feature activation-5.08e-01

Pos logit contributions
ww+0.012
fl+0.011
rav+0.011
w+0.010
eware+0.010
Neg logit contributions
ement-0.014
 nervously-0.013
 curiously-0.013
 sadly-0.013
 closely-0.012
Token more
Token ID517
Feature activation+6.44e-02

Pos logit contributions
ww+0.014
fl+0.013
cr+0.013
less+0.013
eware+0.012
Neg logit contributions
ement-0.013
 nervously-0.013
 curiously-0.012
 sadly-0.012
 frantically-0.011
Token av
Token ID1196
Feature activation+5.13e+00

Pos logit contributions
 nervously+0.002
ement+0.002
 sadly+0.002
 frantically+0.002
alling+0.002
Neg logit contributions
ww-0.002
fl-0.002
less-0.001
cr-0.001
w-0.001
Tokenoc
Token ID420
Feature activation+1.63e+00

Pos logit contributions
ement+0.166
 nervously+0.165
 sadly+0.156
 curiously+0.155
 frantically+0.148
Neg logit contributions
ww-0.120
fl-0.110
w-0.104
rav-0.102
eware-0.100
Tokenados
Token ID22484
Feature activation+1.78e+00

Pos logit contributions
 nervously+0.044
 curiously+0.040
 sadly+0.039
 finally+0.039
 frantically+0.039
Neg logit contributions
fl-0.046
ww-0.045
rav-0.043
w-0.042
eware-0.042
Token than
Token ID621
Feature activation+1.99e+00

Pos logit contributions
ement+0.040
 nervously+0.038
 curiously+0.035
 sadly+0.035
 frantically+0.033
Neg logit contributions
fl-0.056
ww-0.055
less-0.053
cr-0.053
rav-0.053
Token ever
Token ID1683
Feature activation+9.41e-01

Pos logit contributions
ement+0.051
 nervously+0.048
 curiously+0.045
 sadly+0.042
 frantically+0.042
Neg logit contributions
ww-0.054
cr-0.054
less-0.054
fl-0.052
rav-0.052
Token before
Token ID878
Feature activation-1.41e+00

Pos logit contributions
ement+0.021
 nervously+0.020
 sadly+0.019
 curiously+0.019
 closely+0.018
Neg logit contributions
ww-0.030
fl-0.029
rav-0.028
eware-0.028
less-0.027
Token some
Token ID617
Feature activation-8.19e-01

Pos logit contributions
ww+0.007
fl+0.007
eware+0.007
rav+0.007
Fl+0.007
Neg logit contributions
ement-0.006
 nervously-0.005
 sadly-0.005
 curiously-0.005
 closely-0.005
Token furniture
Token ID13091
Feature activation+5.01e-01

Pos logit contributions
ww+0.022
fl+0.020
eware+0.020
rav+0.020
Fl+0.019
Neg logit contributions
ement-0.023
 nervously-0.022
 sadly-0.022
 curiously-0.021
 finally-0.020
Token around
Token ID1088
Feature activation-9.24e-01

Pos logit contributions
ement+0.012
 nervously+0.011
 sadly+0.010
 curiously+0.010
 closely+0.010
Neg logit contributions
fl-0.014
ww-0.014
eware-0.014
less-0.014
rav-0.014
Token in
Token ID287
Feature activation-1.02e+00

Pos logit contributions
ww+0.029
fl+0.029
rav+0.028
eware+0.028
Fl+0.028
Neg logit contributions
ement-0.021
 nervously-0.019
 sadly-0.018
 curiously-0.018
 closely-0.017
Token the
Token ID262
Feature activation-1.99e+00

Pos logit contributions
ww+0.036
rav+0.034
fl+0.034
eware+0.034
Fl+0.033
Neg logit contributions
ement-0.020
 nervously-0.019
 sadly-0.018
 curiously-0.018
 closely-0.017
Token living
Token ID2877
Feature activation-5.40e+00

Pos logit contributions
ww+0.063
rav+0.061
eware+0.060
fl+0.059
w+0.057
Neg logit contributions
ement-0.046
 nervously-0.045
 sadly-0.043
 curiously-0.042
 closely-0.041
Token room
Token ID2119
Feature activation-1.65e+00

Pos logit contributions
ww+0.203
rav+0.197
eware+0.194
fl+0.187
w+0.182
Neg logit contributions
ement-0.100
 nervously-0.096
 sadly-0.091
 curiously-0.086
 proudly-0.085
Token.
Token ID13
Feature activation-2.73e-01

Pos logit contributions
ww+0.059
rav+0.057
eware+0.056
fl+0.056
w+0.054
Neg logit contributions
ement-0.031
 nervously-0.029
 sadly-0.028
 curiously-0.027
 closely-0.026
Token Lily
Token ID20037
Feature activation-8.54e-02

Pos logit contributions
ww+0.010
rav+0.009
eware+0.009
fl+0.009
w+0.009
Neg logit contributions
ement-0.005
 nervously-0.005
 sadly-0.005
 curiously-0.005
 closely-0.005
Token was
Token ID373
Feature activation+1.00e+00

Pos logit contributions
ww+0.002
rav+0.002
eware+0.002
fl+0.002
w+0.002
Neg logit contributions
ement-0.002
 nervously-0.002
 sadly-0.002
 curiously-0.002
 closely-0.002
Token happy
Token ID3772
Feature activation+2.59e-01

Pos logit contributions
ement+0.019
 nervously+0.018
 sadly+0.018
 curiously+0.017
 closely+0.016
Neg logit contributions
ww-0.036
rav-0.035
eware-0.034
fl-0.034
w-0.033
Token the
Token ID262
Feature activation-1.33e+00

Pos logit contributions
fl+0.027
rav+0.026
ww+0.026
less+0.025
Fl+0.025
Neg logit contributions
ement-0.019
 nervously-0.017
 sadly-0.017
 curiously-0.017
 closely-0.016
Token piano
Token ID19132
Feature activation-2.27e+00

Pos logit contributions
ww+0.035
rav+0.033
fl+0.033
eware+0.032
Fl+0.031
Neg logit contributions
ement-0.037
 nervously-0.036
 sadly-0.035
 curiously-0.035
 closely-0.033
Token and
Token ID290
Feature activation-8.27e-01

Pos logit contributions
ww+0.074
fl+0.073
rav+0.072
eware+0.071
Fl+0.069
Neg logit contributions
ement-0.046
 nervously-0.043
 curiously-0.041
 sadly-0.040
 closely-0.039
Token around
Token ID1088
Feature activation-1.08e+00

Pos logit contributions
ww+0.023
rav+0.022
fl+0.022
eware+0.022
w+0.021
Neg logit contributions
ement-0.022
 nervously-0.021
 sadly-0.020
 curiously-0.020
 closely-0.019
Token the
Token ID262
Feature activation-1.19e+00

Pos logit contributions
fl+0.036
rav+0.035
ww+0.035
less+0.033
eware+0.033
Neg logit contributions
ement-0.022
 nervously-0.020
 sadly-0.019
 curiously-0.019
 closely-0.018
Token living
Token ID2877
Feature activation-5.40e+00

Pos logit contributions
ww+0.032
rav+0.031
fl+0.030
eware+0.030
Fl+0.029
Neg logit contributions
ement-0.033
 nervously-0.031
 sadly-0.031
 curiously-0.030
 closely-0.029
Token room
Token ID2119
Feature activation-2.35e+00

Pos logit contributions
ww+0.194
eware+0.186
rav+0.185
fl+0.179
w+0.174
Neg logit contributions
ement-0.108
 nervously-0.104
 sadly-0.099
 curiously-0.095
 proudly-0.091
Token.
Token ID13
Feature activation-6.01e-01

Pos logit contributions
ww+0.081
rav+0.079
fl+0.078
eware+0.078
less+0.075
Neg logit contributions
ement-0.045
 nervously-0.043
 sadly-0.040
 curiously-0.040
 closely-0.038
Token (newline)
Token ID198
Feature activation-1.46e-01

Pos logit contributions
ww+0.021
rav+0.020
eware+0.020
fl+0.020
w+0.019
Neg logit contributions
ement-0.012
 nervously-0.011
 sadly-0.011
 curiously-0.011
 closely-0.010
Token (newline)
Token ID198
Feature activation-6.75e-01

Pos logit contributions
ww+0.005
rav+0.005
eware+0.005
fl+0.005
w+0.005
Neg logit contributions
ement-0.003
 nervously-0.003
 sadly-0.003
 curiously-0.003
 closely-0.002
TokenSoon
Token ID28093
Feature activation+9.15e-02

Pos logit contributions
ww+0.020
eware+0.020
rav+0.020
less+0.020
fl+0.019
Neg logit contributions
 nervously-0.017
 curiously-0.016
 closely-0.015
 sadly-0.015
ement-0.015
Token them
Token ID606
Feature activation-4.53e-01

Pos logit contributions
ement+0.010
 nervously+0.009
 sadly+0.009
 curiously+0.009
 proudly+0.008
Neg logit contributions
ww-0.014
rav-0.013
fl-0.013
eware-0.013
Fl-0.013
Token and
Token ID290
Feature activation-5.48e-01

Pos logit contributions
fl+0.017
ww+0.017
rav+0.017
eware+0.016
less+0.016
Neg logit contributions
ement-0.007
 nervously-0.006
 sadly-0.005
 finally-0.005
 curiously-0.005
Token ran
Token ID4966
Feature activation-1.15e-01

Pos logit contributions
ww+0.016
rav+0.015
eware+0.015
fl+0.015
w+0.014
Neg logit contributions
ement-0.014
 nervously-0.014
 sadly-0.014
 curiously-0.013
 closely-0.013
Token to
Token ID284
Feature activation-1.56e+00

Pos logit contributions
ww+0.003
eware+0.003
rav+0.003
fl+0.003
less+0.003
Neg logit contributions
ement-0.003
 nervously-0.003
 sadly-0.003
 curiously-0.003
 closely-0.003
Token the
Token ID262
Feature activation-1.96e+00

Pos logit contributions
ww+0.056
rav+0.054
eware+0.053
fl+0.053
w+0.051
Neg logit contributions
ement-0.030
 nervously-0.029
 sadly-0.027
 curiously-0.027
 closely-0.025
Token living
Token ID2877
Feature activation-5.39e+00

Pos logit contributions
ww+0.045
rav+0.043
fl+0.043
eware+0.042
Fl+0.040
Neg logit contributions
ement-0.062
 nervously-0.059
 sadly-0.057
 curiously-0.057
 closely-0.055
Token room
Token ID2119
Feature activation-1.80e+00

Pos logit contributions
ww+0.179
rav+0.172
eware+0.169
fl+0.168
w+0.162
Neg logit contributions
ement-0.116
 nervously-0.112
 sadly-0.107
 curiously-0.105
 closely-0.100
Token.
Token ID13
Feature activation-2.75e-01

Pos logit contributions
less+0.050
eware+0.048
fl+0.047
rav+0.046
ww+0.046
Neg logit contributions
ement-0.043
 nervously-0.039
 curiously-0.038
 sadly-0.037
 armor-0.036
Token She
Token ID1375
Feature activation-7.04e-01

Pos logit contributions
ww+0.009
eware+0.009
rav+0.009
fl+0.009
Fl+0.008
Neg logit contributions
ement-0.006
 nervously-0.005
 sadly-0.005
 curiously-0.005
 closely-0.005
Token saw
Token ID2497
Feature activation+4.65e-01

Pos logit contributions
ww+0.021
rav+0.020
eware+0.020
fl+0.020
Fl+0.019
Neg logit contributions
ement-0.018
 nervously-0.017
 sadly-0.016
 curiously-0.016
 closely-0.015
Token the
Token ID262
Feature activation-5.43e-01

Pos logit contributions
ement+0.012
 nervously+0.011
 sadly+0.011
 curiously+0.010
 eagerly+0.010
Neg logit contributions
ww-0.013
fl-0.013
Fl-0.013
eware-0.012
less-0.012
Tokenpron
Token ID31186
Feature activation-1.34e+00

Pos logit contributions
eware+0.027
fl+0.027
rav+0.027
ww+0.027
w+0.025
Neg logit contributions
 nervously-0.031
ement-0.031
 sadly-0.030
 curiously-0.030
 proudly-0.030
Token and
Token ID290
Feature activation+1.95e-02

Pos logit contributions
ww+0.053
fl+0.051
eware+0.051
rav+0.051
w+0.050
Neg logit contributions
ement-0.020
 nervously-0.017
 curiously-0.016
 sadly-0.016
 closely-0.014
Token ran
Token ID4966
Feature activation-2.80e-01

Pos logit contributions
ement+0.001
 nervously+0.000
 curiously+0.000
 sadly+0.000
 closely+0.000
Neg logit contributions
ww-0.001
fl-0.000
rav-0.000
eware-0.000
w-0.000
Token to
Token ID284
Feature activation-1.57e+00

Pos logit contributions
ww+0.007
rav+0.007
eware+0.007
fl+0.007
less+0.007
Neg logit contributions
ement-0.008
 nervously-0.008
 sadly-0.007
 curiously-0.007
 closely-0.007
Token the
Token ID262
Feature activation-1.74e+00

Pos logit contributions
rav+0.038
Fl+0.035
less+0.035
fl+0.035
ww+0.035
Neg logit contributions
ement-0.047
 sadly-0.037
 nervously-0.036
 deeply-0.035
 frantically-0.034
Token living
Token ID2877
Feature activation-5.39e+00

Pos logit contributions
ww+0.055
rav+0.053
eware+0.052
fl+0.051
w+0.050
Neg logit contributions
 nervously-0.040
ement-0.040
 sadly-0.039
 curiously-0.038
 closely-0.037
Token room
Token ID2119
Feature activation-1.52e+00

Pos logit contributions
ww+0.182
rav+0.175
eware+0.172
fl+0.170
w+0.164
Neg logit contributions
ement-0.114
 nervously-0.110
 sadly-0.106
 curiously-0.103
 closely-0.099
Token.
Token ID13
Feature activation-5.24e-02

Pos logit contributions
rav+0.047
ww+0.047
fl+0.046
eware+0.046
less+0.046
Neg logit contributions
ement-0.033
 nervously-0.030
 sadly-0.029
 curiously-0.029
 closely-0.027
Token She
Token ID1375
Feature activation-2.23e-01

Pos logit contributions
eware+0.002
ww+0.002
rav+0.002
fl+0.002
Fl+0.002
Neg logit contributions
ement-0.001
 nervously-0.001
 sadly-0.001
 curiously-0.001
 closely-0.001
Token saw
Token ID2497
Feature activation+7.11e-01

Pos logit contributions
ww+0.006
rav+0.006
fl+0.005
eware+0.005
w+0.005
Neg logit contributions
ement-0.007
 nervously-0.006
 curiously-0.006
 sadly-0.006
 closely-0.005
Token her
Token ID607
Feature activation+6.95e-01

Pos logit contributions
ement+0.017
 nervously+0.016
 sadly+0.015
 curiously+0.015
 closely+0.014
Neg logit contributions
ww-0.021
fl-0.021
rav-0.021
eware-0.020
Fl-0.020
Token noise
Token ID7838
Feature activation-1.35e+00

Pos logit contributions
ww+0.038
rav+0.036
eware+0.035
fl+0.034
w+0.033
Neg logit contributions
ement-0.028
 nervously-0.028
 curiously-0.027
 sadly-0.027
 closely-0.025
Token and
Token ID290
Feature activation-1.38e-01

Pos logit contributions
ww+0.044
fl+0.042
rav+0.042
less+0.042
Fl+0.040
Neg logit contributions
ement-0.027
 nervously-0.024
 sadly-0.023
 curiously-0.023
 finally-0.022
Token run
Token ID1057
Feature activation-4.64e-01

Pos logit contributions
ww+0.004
rav+0.004
fl+0.004
eware+0.004
Fl+0.003
Neg logit contributions
ement-0.004
 nervously-0.003
 sadly-0.003
 curiously-0.003
 closely-0.003
Token to
Token ID284
Feature activation-1.44e+00

Pos logit contributions
ww+0.015
rav+0.014
fl+0.013
eware+0.013
w+0.013
Neg logit contributions
 nervously-0.011
ement-0.011
 sadly-0.011
 curiously-0.010
 closely-0.010
Token the
Token ID262
Feature activation-1.64e+00

Pos logit contributions
ww+0.053
rav+0.051
eware+0.050
fl+0.050
w+0.048
Neg logit contributions
ement-0.026
 nervously-0.026
 sadly-0.024
 curiously-0.024
 closely-0.022
Token living
Token ID2877
Feature activation-5.33e+00

Pos logit contributions
ww+0.047
rav+0.045
eware+0.044
fl+0.043
w+0.042
Neg logit contributions
ement-0.043
 nervously-0.042
 sadly-0.041
 curiously-0.040
 closely-0.039
Token room
Token ID2119
Feature activation-1.55e+00

Pos logit contributions
ww+0.183
rav+0.176
eware+0.172
fl+0.170
w+0.165
Neg logit contributions
ement-0.110
 nervously-0.108
 sadly-0.103
 curiously-0.100
 closely-0.096
Token.
Token ID13
Feature activation-1.09e-01

Pos logit contributions
ww+0.046
less+0.046
rav+0.046
eware+0.046
fl+0.045
Neg logit contributions
ement-0.034
 nervously-0.031
 curiously-0.030
 sadly-0.030
 closely-0.028
Token They
Token ID1119
Feature activation-1.13e+00

Pos logit contributions
ww+0.004
rav+0.003
eware+0.003
fl+0.003
w+0.003
Neg logit contributions
ement-0.002
 nervously-0.002
 sadly-0.002
 curiously-0.002
 closely-0.002
Token see
Token ID766
Feature activation+3.41e-01

Pos logit contributions
rav+0.030
eware+0.027
ww+0.027
Fl+0.027
fl+0.027
Neg logit contributions
ement-0.033
 sadly-0.027
 nervously-0.026
 curiously-0.026
 closely-0.025
Token the
Token ID262
Feature activation-4.16e-01

Pos logit contributions
ement+0.007
 nervously+0.006
 sadly+0.006
 curiously+0.006
 closely+0.006
Neg logit contributions
ww-0.012
rav-0.011
fl-0.011
eware-0.011
Fl-0.011
Token crash
Token ID7014
Feature activation-6.06e-01

Pos logit contributions
ww+0.029
rav+0.028
eware+0.027
cr+0.026
w+0.025
Neg logit contributions
ement-0.018
 nervously-0.018
 tears-0.017
 curiously-0.017
 sadly-0.017
Token and
Token ID290
Feature activation-2.65e-01

Pos logit contributions
ww+0.024
fl+0.023
rav+0.023
eware+0.023
less+0.023
Neg logit contributions
ement-0.008
 nervously-0.007
 curiously-0.007
 sadly-0.007
 closely-0.006
Token came
Token ID1625
Feature activation-1.13e+00

Pos logit contributions
ww+0.007
rav+0.007
eware+0.007
fl+0.007
w+0.006
Neg logit contributions
ement-0.007
 nervously-0.007
 sadly-0.007
 curiously-0.007
 closely-0.007
Token to
Token ID284
Feature activation-9.83e-01

Pos logit contributions
ww+0.034
rav+0.033
fl+0.032
eware+0.032
w+0.030
Neg logit contributions
ement-0.028
 nervously-0.027
 sadly-0.026
 curiously-0.025
 closely-0.024
Token the
Token ID262
Feature activation-1.54e+00

Pos logit contributions
ww+0.039
rav+0.037
eware+0.036
fl+0.036
w+0.036
Neg logit contributions
 nervously-0.016
ement-0.015
 sadly-0.014
 curiously-0.014
 finally-0.013
Token living
Token ID2877
Feature activation-5.31e+00

Pos logit contributions
ww+0.047
rav+0.045
eware+0.044
fl+0.042
w+0.042
Neg logit contributions
 nervously-0.038
ement-0.038
 sadly-0.037
 curiously-0.036
 closely-0.035
Token room
Token ID2119
Feature activation-1.90e+00

Pos logit contributions
ww+0.185
rav+0.178
eware+0.175
fl+0.172
w+0.167
Neg logit contributions
ement-0.108
 nervously-0.105
 sadly-0.100
 curiously-0.098
 closely-0.093
Token.
Token ID13
Feature activation-3.10e-01

Pos logit contributions
less+0.058
eware+0.057
ww+0.057
rav+0.057
fl+0.056
Neg logit contributions
ement-0.041
 nervously-0.038
 sadly-0.036
 curiously-0.036
 closely-0.034
Token She
Token ID1375
Feature activation-6.21e-01

Pos logit contributions
ww+0.010
eware+0.010
rav+0.010
fl+0.010
Fl+0.010
Neg logit contributions
ement-0.007
 nervously-0.006
 sadly-0.006
 curiously-0.006
 closely-0.005
Token saw
Token ID2497
Feature activation+3.65e-01

Pos logit contributions
ww+0.019
rav+0.018
eware+0.017
fl+0.017
Fl+0.017
Neg logit contributions
ement-0.016
 nervously-0.015
 sadly-0.014
 curiously-0.014
 closely-0.013
Token the
Token ID262
Feature activation-3.45e-01

Pos logit contributions
ement+0.008
 nervously+0.008
 sadly+0.007
 curiously+0.007
 eagerly+0.007
Neg logit contributions
ww-0.012
fl-0.011
rav-0.011
eware-0.011
Fl-0.011
Token noise
Token ID7838
Feature activation-1.11e+00

Pos logit contributions
ww+0.017
rav+0.017
eware+0.016
fl+0.016
w+0.015
Neg logit contributions
ement-0.016
 nervously-0.016
 sadly-0.015
 curiously-0.015
 closely-0.014
Token and
Token ID290
Feature activation+1.87e-01

Pos logit contributions
ww+0.035
fl+0.035
less+0.034
Fl+0.033
rav+0.033
Neg logit contributions
ement-0.022
 nervously-0.018
 armor-0.018
 sadly-0.018
 curiously-0.018
Token ran
Token ID4966
Feature activation+2.64e-03

Pos logit contributions
ement+0.005
 nervously+0.005
 sadly+0.005
 curiously+0.005
 closely+0.004
Neg logit contributions
ww-0.005
rav-0.005
eware-0.005
fl-0.005
w-0.005
Token to
Token ID284
Feature activation-1.17e+00

Pos logit contributions
ement+0.000
 nervously+0.000
 sadly+0.000
 curiously+0.000
 closely+0.000
Neg logit contributions
ww-0.000
rav-0.000
eware-0.000
fl-0.000
less-0.000
Token the
Token ID262
Feature activation-1.52e+00

Pos logit contributions
ww+0.040
rav+0.039
eware+0.039
fl+0.038
Fl+0.037
Neg logit contributions
ement-0.024
 nervously-0.021
 sadly-0.021
 curiously-0.020
 closely-0.019
Token living
Token ID2877
Feature activation-5.31e+00

Pos logit contributions
ww+0.036
fl+0.035
Fl+0.034
rav+0.033
eware+0.033
Neg logit contributions
ement-0.048
 nervously-0.043
 sadly-0.043
 curiously-0.042
 closely-0.041
Token room
Token ID2119
Feature activation-1.56e+00

Pos logit contributions
ww+0.171
rav+0.164
fl+0.162
eware+0.161
w+0.155
Neg logit contributions
ement-0.119
 nervously-0.114
 sadly-0.110
 curiously-0.108
 closely-0.103
Token.
Token ID13
Feature activation-1.38e-01

Pos logit contributions
less+0.043
eware+0.042
fl+0.041
rav+0.040
ww+0.040
Neg logit contributions
ement-0.039
 nervously-0.034
 curiously-0.034
 sadly-0.033
 armor-0.031
Token She
Token ID1375
Feature activation-4.58e-01

Pos logit contributions
ww+0.005
rav+0.005
eware+0.005
fl+0.005
w+0.005
Neg logit contributions
ement-0.002
 nervously-0.002
 sadly-0.002
 curiously-0.002
 closely-0.002
Token saw
Token ID2497
Feature activation+6.01e-01

Pos logit contributions
ww+0.014
rav+0.013
eware+0.013
fl+0.013
Fl+0.013
Neg logit contributions
ement-0.011
 nervously-0.011
 sadly-0.010
 curiously-0.010
 closely-0.010
Token the
Token ID262
Feature activation-1.17e-01

Pos logit contributions
ement+0.015
 nervously+0.014
 sadly+0.013
 curiously+0.013
 eagerly+0.012
Neg logit contributions
fl-0.018
ww-0.018
Fl-0.017
eware-0.017
less-0.017
Token's
Token ID338
Feature activation-3.92e+00

Pos logit contributions
 sadly+0.018
 nervously+0.018
 finally+0.018
ement+0.017
 closely+0.017
Neg logit contributions
rav-0.027
ww-0.027
eware-0.027
less-0.026
fl-0.026
Token go
Token ID467
Feature activation-9.04e-03

Pos logit contributions
ww+0.091
fl+0.083
rav+0.081
eware+0.080
Fl+0.079
Neg logit contributions
ement-0.129
 nervously-0.117
 sadly-0.113
 curiously-0.113
 closely-0.108
Token back
Token ID736
Feature activation-5.03e-01

Pos logit contributions
ww+0.000
rav+0.000
fl+0.000
eware+0.000
less+0.000
Neg logit contributions
ement-0.000
 nervously-0.000
 sadly-0.000
 curiously-0.000
 closely-0.000
Token to
Token ID284
Feature activation-1.82e+00

Pos logit contributions
ww+0.014
fl+0.013
rav+0.013
eware+0.013
Fl+0.012
Neg logit contributions
ement-0.014
 nervously-0.013
 sadly-0.013
 curiously-0.013
 closely-0.012
Token the
Token ID262
Feature activation-2.46e+00

Pos logit contributions
ww+0.058
rav+0.055
fl+0.055
eware+0.054
Fl+0.053
Neg logit contributions
ement-0.043
 nervously-0.039
 sadly-0.038
 curiously-0.037
 closely-0.035
Token living
Token ID2877
Feature activation-5.30e+00

Pos logit contributions
ww+0.075
rav+0.073
eware+0.071
fl+0.070
w+0.068
Neg logit contributions
 nervously-0.058
ement-0.058
 sadly-0.056
 curiously-0.055
 closely-0.053
Token room
Token ID2119
Feature activation-5.44e-01

Pos logit contributions
ww+0.178
rav+0.171
eware+0.168
fl+0.167
w+0.160
Neg logit contributions
ement-0.113
 nervously-0.110
 sadly-0.105
 curiously-0.102
 closely-0.098
Token.
Token ID13
Feature activation-7.54e-01

Pos logit contributions
ww+0.017
fl+0.016
rav+0.016
eware+0.016
less+0.016
Neg logit contributions
ement-0.013
 nervously-0.011
 curiously-0.011
 sadly-0.011
 closely-0.010
Token We
Token ID775
Feature activation+8.73e-02

Pos logit contributions
ww+0.026
rav+0.024
w+0.024
fl+0.024
eware+0.023
Neg logit contributions
ement-0.015
 nervously-0.015
 sadly-0.015
 He-0.015
 curiously-0.014
Token can
Token ID460
Feature activation+2.41e-01

Pos logit contributions
ement+0.003
 nervously+0.003
 curiously+0.003
 sadly+0.003
 closely+0.003
Neg logit contributions
ww-0.002
rav-0.002
eware-0.002
fl-0.002
w-0.001
Token play
Token ID711
Feature activation+3.16e-01

Pos logit contributions
ement+0.008
 nervously+0.007
 sadly+0.007
 curiously+0.007
 deeply+0.007
Neg logit contributions
fl-0.005
ww-0.005
Fl-0.005
rav-0.005
burn-0.004
Token noise
Token ID7838
Feature activation-1.13e+00

Pos logit contributions
ww+0.020
rav+0.019
eware+0.018
fl+0.018
w+0.017
Neg logit contributions
ement-0.019
 nervously-0.019
 sadly-0.018
 curiously-0.018
 closely-0.017
Token and
Token ID290
Feature activation-1.69e-01

Pos logit contributions
fl+0.035
less+0.035
ww+0.034
Fl+0.033
rav+0.033
Neg logit contributions
ement-0.022
 nervously-0.019
 sadly-0.019
 armor-0.019
 curiously-0.018
Token ran
Token ID4966
Feature activation-2.00e-01

Pos logit contributions
ww+0.005
rav+0.005
eware+0.005
fl+0.005
w+0.005
Neg logit contributions
 nervously-0.004
ement-0.004
 sadly-0.004
 curiously-0.004
 closely-0.004
Token to
Token ID284
Feature activation-1.46e+00

Pos logit contributions
ww+0.005
eware+0.005
rav+0.005
fl+0.005
less+0.005
Neg logit contributions
ement-0.006
 nervously-0.005
 sadly-0.005
 curiously-0.005
 closely-0.005
Token the
Token ID262
Feature activation-1.55e+00

Pos logit contributions
ww+0.048
rav+0.047
eware+0.046
fl+0.046
Fl+0.045
Neg logit contributions
ement-0.032
 nervously-0.029
 sadly-0.028
 curiously-0.027
 closely-0.026
Token living
Token ID2877
Feature activation-5.23e+00

Pos logit contributions
ww+0.039
fl+0.037
eware+0.037
rav+0.036
Fl+0.036
Neg logit contributions
ement-0.046
 nervously-0.042
 sadly-0.041
 curiously-0.040
 closely-0.038
Token room
Token ID2119
Feature activation-1.65e+00

Pos logit contributions
ww+0.159
rav+0.152
fl+0.152
eware+0.150
w+0.144
Neg logit contributions
ement-0.124
 nervously-0.117
 sadly-0.114
 curiously-0.112
 closely-0.107
Token.
Token ID13
Feature activation-3.89e-01

Pos logit contributions
less+0.046
eware+0.044
fl+0.043
ww+0.042
rav+0.042
Neg logit contributions
ement-0.040
 nervously-0.036
 curiously-0.035
 sadly-0.034
 closely-0.033
Token They
Token ID1119
Feature activation-1.18e+00

Pos logit contributions
ww+0.012
eware+0.011
fl+0.011
rav+0.011
Fl+0.011
Neg logit contributions
ement-0.010
 nervously-0.009
 sadly-0.008
 curiously-0.008
 closely-0.008
Token saw
Token ID2497
Feature activation+4.10e-01

Pos logit contributions
ww+0.030
rav+0.030
eware+0.029
Fl+0.029
fl+0.029
Neg logit contributions
ement-0.034
 nervously-0.029
 sadly-0.029
 curiously-0.029
 closely-0.028
Token the
Token ID262
Feature activation-5.92e-02

Pos logit contributions
ement+0.010
 nervously+0.009
 sadly+0.009
 curiously+0.009
 eagerly+0.009
Neg logit contributions
ww-0.012
fl-0.011
Fl-0.011
less-0.011
eware-0.011
Token the
Token ID262
Feature activation-1.27e+00

Pos logit contributions
ww+0.000
rav+0.000
eware+0.000
fl+0.000
w+0.000
Neg logit contributions
ement-0.000
 nervously-0.000
 sadly-0.000
 curiously-0.000
 closely-0.000
Token big
Token ID1263
Feature activation-1.17e+00

Pos logit contributions
ww+0.043
rav+0.043
eware+0.041
w+0.040
fl+0.039
Neg logit contributions
 nervously-0.025
ement-0.024
 curiously-0.024
 sadly-0.023
 closely-0.023
Token couch
Token ID18507
Feature activation-1.01e+00

Pos logit contributions
ww+0.042
rav+0.040
eware+0.040
fl+0.040
w+0.038
Neg logit contributions
ement-0.022
 nervously-0.021
 sadly-0.020
 curiously-0.020
 closely-0.018
Token in
Token ID287
Feature activation-1.64e-01

Pos logit contributions
ww+0.034
fl+0.033
eware+0.033
rav+0.033
Fl+0.031
Neg logit contributions
ement-0.021
 nervously-0.019
 sadly-0.018
 curiously-0.018
 closely-0.017
Token the
Token ID262
Feature activation-1.48e+00

Pos logit contributions
ww+0.006
rav+0.006
eware+0.006
fl+0.006
w+0.005
Neg logit contributions
ement-0.003
 nervously-0.003
 sadly-0.003
 curiously-0.003
 closely-0.003
Token living
Token ID2877
Feature activation-5.23e+00

Pos logit contributions
rav+0.048
eware+0.047
ww+0.046
cr+0.043
w+0.043
Neg logit contributions
 nervously-0.032
ement-0.032
 sadly-0.030
 He-0.030
 curiously-0.030
Token room
Token ID2119
Feature activation-1.61e+00

Pos logit contributions
ww+0.214
rav+0.210
eware+0.207
fl+0.195
w+0.191
Neg logit contributions
ement-0.080
 nervously-0.077
 sadly-0.073
 happily-0.069
 proudly-0.069
Token.
Token ID13
Feature activation+6.05e-01

Pos logit contributions
ww+0.067
rav+0.063
eware+0.061
fl+0.061
w+0.060
Neg logit contributions
ement-0.024
 nervously-0.023
 sadly-0.022
 curiously-0.021
 closely-0.020
Token She
Token ID1375
Feature activation+5.10e-02

Pos logit contributions
ement+0.009
 nervously+0.009
 sadly+0.008
 curiously+0.008
 closely+0.008
Neg logit contributions
ww-0.024
rav-0.023
eware-0.023
fl-0.022
w-0.022
Token was
Token ID373
Feature activation+1.04e+00

Pos logit contributions
ement+0.001
 nervously+0.001
 sadly+0.001
 curiously+0.001
 closely+0.001
Neg logit contributions
ww-0.001
rav-0.001
eware-0.001
fl-0.001
w-0.001
Token always
Token ID1464
Feature activation-4.58e-01

Pos logit contributions
ement+0.021
 nervously+0.020
 sadly+0.020
 curiously+0.019
 closely+0.018
Neg logit contributions
ww-0.036
rav-0.035
eware-0.034
fl-0.034
w-0.032
Token up
Token ID510
Feature activation+1.03e+00

Pos logit contributions
ement+0.016
 nervously+0.014
 sadly+0.013
 curiously+0.013
 deeply+0.013
Neg logit contributions
fl-0.016
ww-0.015
less-0.015
eware-0.015
rav-0.015
Token and
Token ID290
Feature activation-5.25e-01

Pos logit contributions
ement+0.025
aling+0.020
 goals+0.020
 armor+0.020
 warrior+0.019
Neg logit contributions
fl-0.030
less-0.029
ww-0.027
cr-0.026
eware-0.026
Token ran
Token ID4966
Feature activation+1.72e-01

Pos logit contributions
ww+0.016
rav+0.015
fl+0.015
eware+0.015
w+0.014
Neg logit contributions
ement-0.013
 nervously-0.012
 sadly-0.011
 curiously-0.011
 closely-0.011
Token around
Token ID1088
Feature activation-9.90e-01

Pos logit contributions
ement+0.005
 nervously+0.004
 sadly+0.004
 curiously+0.004
 closely+0.004
Neg logit contributions
ww-0.005
rav-0.005
eware-0.005
less-0.005
fl-0.005
Token the
Token ID262
Feature activation-1.53e+00

Pos logit contributions
fl+0.030
rav+0.029
less+0.028
Fl+0.028
ww+0.028
Neg logit contributions
ement-0.023
 nervously-0.019
 sadly-0.019
 curiously-0.018
 closely-0.018
Token living
Token ID2877
Feature activation-5.22e+00

Pos logit contributions
ww+0.045
rav+0.043
fl+0.042
eware+0.042
Fl+0.040
Neg logit contributions
ement-0.039
 nervously-0.037
 sadly-0.036
 curiously-0.035
 closely-0.034
Token room
Token ID2119
Feature activation-1.91e+00

Pos logit contributions
ww+0.172
rav+0.165
eware+0.162
fl+0.162
w+0.155
Neg logit contributions
ement-0.114
 nervously-0.110
 sadly-0.105
 curiously-0.103
 closely-0.099
Token again
Token ID757
Feature activation-2.81e-01

Pos logit contributions
less+0.059
rav+0.058
ww+0.058
fl+0.058
eware+0.057
Neg logit contributions
ement-0.040
 nervously-0.037
 sadly-0.035
 curiously-0.035
 closely-0.034
Token.
Token ID13
Feature activation-8.01e-01

Pos logit contributions
ww+0.008
fl+0.008
eware+0.008
less+0.008
rav+0.008
Neg logit contributions
ement-0.007
 nervously-0.006
 sadly-0.006
 curiously-0.006
 frantically-0.005
Token The
Token ID383
Feature activation-5.65e-01

Pos logit contributions
ww+0.028
rav+0.027
eware+0.027
fl+0.027
Fl+0.026
Neg logit contributions
ement-0.015
 nervously-0.015
 sadly-0.014
 curiously-0.014
 closely-0.013
Token couch
Token ID18507
Feature activation-8.06e-02

Pos logit contributions
ww+0.018
rav+0.017
eware+0.017
fl+0.017
w+0.016
Neg logit contributions
ement-0.013
 nervously-0.013
 sadly-0.012
 curiously-0.012
 closely-0.012
Token.
Token ID13
Feature activation-3.84e-02

Pos logit contributions
fl+0.002
rav+0.002
ww+0.002
eware+0.002
less+0.002
Neg logit contributions
ement-0.001
 nervously-0.001
 sadly-0.001
 curiously-0.001
 closely-0.001
Token They
Token ID1119
Feature activation-1.16e+00

Pos logit contributions
ww+0.001
rav+0.001
eware+0.001
fl+0.001
w+0.001
Neg logit contributions
ement-0.001
 nervously-0.001
 sadly-0.001
 curiously-0.001
 closely-0.001
Token ran
Token ID4966
Feature activation+3.70e-01

Pos logit contributions
rav+0.026
ww+0.026
Fl+0.024
w+0.024
eware+0.024
Neg logit contributions
ement-0.037
 closely-0.032
 nervously-0.031
 curiously-0.031
 sadly-0.030
Token to
Token ID284
Feature activation-1.54e+00

Pos logit contributions
ement+0.013
 nervously+0.011
 curiously+0.011
 sadly+0.011
 deeply+0.011
Neg logit contributions
eware-0.008
rav-0.008
ww-0.008
less-0.007
fl-0.007
Token the
Token ID262
Feature activation-2.06e+00

Pos logit contributions
rav+0.047
ww+0.046
fl+0.045
eware+0.045
Fl+0.045
Neg logit contributions
ement-0.037
 nervously-0.031
 sadly-0.031
 curiously-0.029
 closely-0.029
Token living
Token ID2877
Feature activation-5.22e+00

Pos logit contributions
ww+0.057
rav+0.054
eware+0.053
fl+0.052
w+0.051
Neg logit contributions
 nervously-0.056
ement-0.055
 sadly-0.054
 curiously-0.053
 closely-0.051
Token room
Token ID2119
Feature activation-1.16e+00

Pos logit contributions
ww+0.181
rav+0.174
eware+0.171
fl+0.168
w+0.163
Neg logit contributions
ement-0.107
 nervously-0.104
 sadly-0.099
 curiously-0.097
 closely-0.093
Token and
Token ID290
Feature activation-9.45e-01

Pos logit contributions
less+0.037
rav+0.035
eware+0.035
fl+0.035
ww+0.034
Neg logit contributions
ement-0.024
 nervously-0.021
 curiously-0.021
 sadly-0.019
 closely-0.018
Token pretended
Token ID36766
Feature activation+1.88e-01

Pos logit contributions
ww+0.028
rav+0.027
eware+0.026
fl+0.026
Fl+0.025
Neg logit contributions
ement-0.024
 nervously-0.022
 sadly-0.021
 curiously-0.021
 closely-0.020
Token to
Token ID284
Feature activation-1.46e-01

Pos logit contributions
ement+0.005
 nervously+0.005
 sadly+0.004
 curiously+0.004
 closely+0.004
Neg logit contributions
ww-0.006
rav-0.005
eware-0.005
fl-0.005
w-0.005
Token play
Token ID711
Feature activation-7.86e-01

Pos logit contributions
ww+0.004
rav+0.004
fl+0.004
eware+0.004
w+0.004
Neg logit contributions
 nervously-0.004
 finally-0.004
 sadly-0.004
 curiously-0.004
ement-0.004
Token and
Token ID290
Feature activation-8.42e-01

Pos logit contributions
fl+0.028
less+0.027
ww+0.027
cr+0.027
Fl+0.026
Neg logit contributions
ement-0.021
 nervously-0.018
 curiously-0.017
 frantically-0.017
 finally-0.016
Token ran
Token ID4966
Feature activation-1.98e-01

Pos logit contributions
ww+0.022
fl+0.021
rav+0.021
Fl+0.020
w+0.020
Neg logit contributions
ement-0.025
 nervously-0.022
 curiously-0.021
 closely-0.021
 sadly-0.020
Token back
Token ID736
Feature activation-7.81e-01

Pos logit contributions
ww+0.005
less+0.004
eware+0.004
rav+0.004
fl+0.004
Neg logit contributions
ement-0.007
 nervously-0.006
 curiously-0.006
 sadly-0.005
 closely-0.005
Token to
Token ID284
Feature activation-1.51e+00

Pos logit contributions
fl+0.017
eware+0.016
ww+0.016
rav+0.016
less+0.016
Neg logit contributions
ement-0.027
 nervously-0.024
 curiously-0.023
 sadly-0.023
 closely-0.022
Token the
Token ID262
Feature activation-1.38e+00

Pos logit contributions
rav+0.033
fl+0.032
ww+0.032
Fl+0.031
less+0.030
Neg logit contributions
ement-0.049
 nervously-0.040
 sadly-0.039
 curiously-0.039
 closely-0.038
Token living
Token ID2877
Feature activation-5.22e+00

Pos logit contributions
ww+0.040
fl+0.038
rav+0.037
eware+0.037
Fl+0.036
Neg logit contributions
ement-0.037
 nervously-0.034
 sadly-0.033
 curiously-0.032
 closely-0.031
Token room
Token ID2119
Feature activation-1.49e+00

Pos logit contributions
ww+0.153
fl+0.146
rav+0.145
eware+0.142
Fl+0.139
Neg logit contributions
ement-0.128
 nervously-0.122
 curiously-0.117
 sadly-0.116
 closely-0.112
Token.
Token ID13
Feature activation+2.51e-02

Pos logit contributions
less+0.043
eware+0.043
rav+0.042
ww+0.042
fl+0.042
Neg logit contributions
ement-0.034
 nervously-0.031
 curiously-0.030
 sadly-0.029
 closely-0.028
Token They
Token ID1119
Feature activation-1.06e+00

Pos logit contributions
ement+0.001
 nervously+0.001
 sadly+0.001
 curiously+0.001
 closely+0.001
Neg logit contributions
ww-0.001
eware-0.001
rav-0.001
fl-0.001
Fl-0.001
Token played
Token ID2826
Feature activation+2.51e-01

Pos logit contributions
Fl+0.024
ww+0.024
rav+0.024
w+0.023
eware+0.023
Neg logit contributions
ement-0.034
 nervously-0.028
 curiously-0.028
 closely-0.028
 sadly-0.027
Token with
Token ID351
Feature activation+1.15e+00

Pos logit contributions
ement+0.007
 nervously+0.006
 curiously+0.006
 sadly+0.006
 closely+0.005
Neg logit contributions
Fl-0.006
rav-0.006
ww-0.006
fl-0.006
less-0.006
Token\n
Token ID198
Feature activation+1.82e-01

Pos logit contributions
ement+0.011
 nervously+0.010
 sadly+0.010
 curiously+0.009
 closely+0.009
Neg logit contributions
ww-0.010
rav-0.010
Fl-0.010
eware-0.010
w-0.009
TokenThey
Token ID2990
Feature activation-1.11e+00

Pos logit contributions
ement+0.005
 nervously+0.005
 sadly+0.005
 curiously+0.005
 closely+0.005
Neg logit contributions
ww-0.004
rav-0.004
fl-0.004
eware-0.004
less-0.004
Token run
Token ID1057
Feature activation+2.07e-01

Pos logit contributions
rav+0.024
Fl+0.024
ww+0.024
w+0.023
CH+0.023
Neg logit contributions
ement-0.037
 nervously-0.031
 sadly-0.030
 curiously-0.030
 deeply-0.029
Token around
Token ID1088
Feature activation-4.74e-01

Pos logit contributions
ement+0.006
 nervously+0.005
 curiously+0.005
 sadly+0.005
 deeply+0.005
Neg logit contributions
ww-0.006
rav-0.006
eware-0.006
fl-0.006
less-0.005
Token the
Token ID262
Feature activation-1.75e+00

Pos logit contributions
fl+0.014
Fl+0.013
less+0.013
rav+0.013
free+0.012
Neg logit contributions
ement-0.012
 sadly-0.010
 nervously-0.009
 armor-0.009
 curiously-0.009
Token living
Token ID2877
Feature activation-5.20e+00

Pos logit contributions
ww+0.060
rav+0.059
eware+0.058
fl+0.056
w+0.055
Neg logit contributions
 nervously-0.035
ement-0.034
 curiously-0.033
 sadly-0.033
 closely-0.032
Token room
Token ID2119
Feature activation-1.37e+00

Pos logit contributions
ww+0.183
rav+0.176
eware+0.174
fl+0.170
w+0.165
Neg logit contributions
ement-0.103
 nervously-0.101
 sadly-0.096
 curiously-0.093
 closely-0.089
Token,
Token ID11
Feature activation-7.67e-01

Pos logit contributions
less+0.042
rav+0.041
fl+0.039
eware+0.039
ww+0.038
Neg logit contributions
ement-0.029
 nervously-0.025
 curiously-0.024
 armor-0.024
 Sir-0.024
Token bump
Token ID13852
Feature activation-1.41e+00

Pos logit contributions
ww+0.022
rav+0.021
fl+0.021
eware+0.020
Fl+0.020
Neg logit contributions
ement-0.020
 nervously-0.018
 sadly-0.017
 curiously-0.017
 closely-0.016
Tokening
Token ID278
Feature activation-1.44e-01

Pos logit contributions
ww+0.034
rav+0.032
eware+0.031
fl+0.031
w+0.029
Neg logit contributions
ement-0.043
 nervously-0.042
 sadly-0.041
 curiously-0.041
 closely-0.039
Token into
Token ID656
Feature activation-9.12e-01

Pos logit contributions
fl+0.005
ww+0.005
Fl+0.005
rav+0.005
less+0.005
Neg logit contributions
ement-0.003
 nervously-0.003
 sadly-0.002
 curiously-0.002
 finally-0.002
Token and
Token ID290
Feature activation-1.89e-01

Pos logit contributions
ement+0.001
 sadly+0.001
 nervously+0.001
 curiously+0.001
 finally+0.001
Neg logit contributions
fl-0.003
rav-0.003
ww-0.003
Fl-0.002
w-0.002
Token Dad
Token ID17415
Feature activation-9.87e-01

Pos logit contributions
ww+0.008
eware+0.007
w+0.007
rav+0.007
fl+0.007
Neg logit contributions
 nervously-0.003
 sadly-0.002
 curiously-0.002
ement-0.002
 He-0.002
Token come
Token ID1282
Feature activation-1.01e+00

Pos logit contributions
rav+0.018
ww+0.017
fl+0.016
w+0.015
eware+0.015
Neg logit contributions
ement-0.035
 sadly-0.033
 nervously-0.032
 curiously-0.031
 closely-0.030
Token into
Token ID656
Feature activation-1.25e+00

Pos logit contributions
ww+0.035
rav+0.034
eware+0.034
fl+0.034
w+0.032
Neg logit contributions
ement-0.020
 nervously-0.019
 sadly-0.018
 curiously-0.018
 closely-0.017
Token the
Token ID262
Feature activation-2.03e+00

Pos logit contributions
ww+0.044
rav+0.043
fl+0.042
eware+0.042
Fl+0.041
Neg logit contributions
ement-0.025
 nervously-0.023
 sadly-0.022
 curiously-0.021
 closely-0.020
Token living
Token ID2877
Feature activation-5.19e+00

Pos logit contributions
ww+0.062
rav+0.059
eware+0.058
fl+0.058
w+0.056
Neg logit contributions
ement-0.049
 nervously-0.048
 sadly-0.046
 curiously-0.045
 closely-0.043
Token room
Token ID2119
Feature activation-1.56e+00

Pos logit contributions
ww+0.177
rav+0.170
eware+0.167
fl+0.165
w+0.159
Neg logit contributions
ement-0.109
 nervously-0.106
 sadly-0.101
 curiously-0.098
 closely-0.094
Token and
Token ID290
Feature activation-1.01e+00

Pos logit contributions
less+0.045
rav+0.044
fl+0.043
eware+0.042
ww+0.042
Neg logit contributions
ement-0.036
 nervously-0.033
 curiously-0.032
 sadly-0.031
 closely-0.030
Token see
Token ID766
Feature activation+2.80e-02

Pos logit contributions
ww+0.028
rav+0.027
fl+0.027
eware+0.027
Fl+0.025
Neg logit contributions
ement-0.028
 nervously-0.026
 sadly-0.025
 curiously-0.025
 closely-0.024
Token the
Token ID262
Feature activation-3.83e-01

Pos logit contributions
ement+0.001
 nervously+0.001
 sadly+0.001
 curiously+0.001
 closely+0.001
Neg logit contributions
ww-0.001
rav-0.001
fl-0.001
eware-0.001
Fl-0.001
Token mess
Token ID2085
Feature activation-2.24e+00

Pos logit contributions
ww+0.010
rav+0.010
fl+0.010
eware+0.010
w+0.009
Neg logit contributions
ement-0.011
 nervously-0.010
 sadly-0.010
 curiously-0.010
 closely-0.009
Token and
Token ID290
Feature activation-6.30e-01

Pos logit contributions
less+0.030
ww+0.030
w+0.030
fl+0.030
Fl+0.029
Neg logit contributions
ement-0.019
 nervously-0.017
 armor-0.017
 finally-0.016
 curiously-0.015
Token takes
Token ID2753
Feature activation-3.42e-01

Pos logit contributions
fl+0.014
ww+0.014
Fl+0.014
rav+0.014
w+0.013
Neg logit contributions
ement-0.020
 nervously-0.017
 sadly-0.016
 deeply-0.016
 curiously-0.016
Token them
Token ID606
Feature activation-4.08e-02

Pos logit contributions
ww+0.011
rav+0.011
fl+0.011
eware+0.011
Fl+0.010
Neg logit contributions
ement-0.007
 nervously-0.007
 sadly-0.007
 curiously-0.007
 closely-0.006
Token to
Token ID284
Feature activation-1.49e+00

Pos logit contributions
ww+0.001
rav+0.001
eware+0.001
fl+0.001
less+0.001
Neg logit contributions
ement-0.001
 nervously-0.001
 curiously-0.001
 sadly-0.001
 closely-0.001
Token the
Token ID262
Feature activation-1.31e+00

Pos logit contributions
ww+0.053
rav+0.051
eware+0.050
fl+0.050
Fl+0.048
Neg logit contributions
ement-0.029
 nervously-0.027
 sadly-0.026
 curiously-0.026
 closely-0.024
Token living
Token ID2877
Feature activation-5.19e+00

Pos logit contributions
ww+0.039
rav+0.037
eware+0.036
fl+0.036
w+0.035
Neg logit contributions
ement-0.033
 nervously-0.032
 sadly-0.031
 curiously-0.030
 closely-0.029
Token room
Token ID2119
Feature activation-1.21e+00

Pos logit contributions
ww+0.171
rav+0.165
eware+0.162
fl+0.161
w+0.155
Neg logit contributions
ement-0.113
 nervously-0.109
 sadly-0.104
 curiously-0.102
 closely-0.098
Token.
Token ID13
Feature activation-1.65e-01

Pos logit contributions
less+0.034
eware+0.033
rav+0.032
ww+0.032
fl+0.032
Neg logit contributions
ement-0.029
 nervously-0.027
 curiously-0.026
 sadly-0.024
 finally-0.024
Token\n
Token ID198
Feature activation-2.77e-01

Pos logit contributions
ww+0.006
rav+0.005
eware+0.005
fl+0.005
Fl+0.005
Neg logit contributions
ement-0.004
 nervously-0.003
 sadly-0.003
 curiously-0.003
 closely-0.003
Token\n
Token ID198
Feature activation-5.11e-01

Pos logit contributions
ww+0.009
rav+0.008
eware+0.008
Fl+0.008
fl+0.008
Neg logit contributions
ement-0.007
 nervously-0.006
 sadly-0.006
 curiously-0.006
 closely-0.006
Token"
Token ID1
Feature activation-1.74e+00

Pos logit contributions
rav+0.013
fl+0.013
ww+0.013
eware+0.013
less+0.012
Neg logit contributions
 nervously-0.014
 curiously-0.013
 sadly-0.013
ement-0.013
 He-0.013
Token chair
Token ID5118
Feature activation+9.52e-01

Pos logit contributions
ww+0.026
rav+0.025
eware+0.025
fl+0.025
w+0.023
Neg logit contributions
ement-0.023
 nervously-0.022
 sadly-0.022
 curiously-0.021
 closely-0.020
Token and
Token ID290
Feature activation-1.12e+00

Pos logit contributions
ement+0.020
 nervously+0.015
 curiously+0.015
 Sir+0.013
 sadly+0.013
Neg logit contributions
less-0.030
fl-0.030
tr-0.029
cr-0.029
ww-0.029
Token ran
Token ID4966
Feature activation-3.99e-01

Pos logit contributions
ww+0.032
rav+0.031
fl+0.031
eware+0.030
Fl+0.029
Neg logit contributions
ement-0.031
 nervously-0.027
 curiously-0.026
 sadly-0.026
 closely-0.025
Token to
Token ID284
Feature activation-1.69e+00

Pos logit contributions
ww+0.011
rav+0.011
fl+0.011
eware+0.011
less+0.011
Neg logit contributions
ement-0.011
 nervously-0.010
 curiously-0.009
 sadly-0.009
 deeply-0.009
Token the
Token ID262
Feature activation-1.49e+00

Pos logit contributions
rav+0.045
ww+0.044
Fl+0.043
fl+0.043
eware+0.043
Neg logit contributions
ement-0.047
 nervously-0.037
 sadly-0.037
 deeply-0.035
 curiously-0.035
Token living
Token ID2877
Feature activation-5.19e+00

Pos logit contributions
ww+0.043
rav+0.041
fl+0.041
eware+0.041
Fl+0.039
Neg logit contributions
ement-0.038
 nervously-0.037
 sadly-0.036
 curiously-0.035
 closely-0.034
Token room
Token ID2119
Feature activation-1.68e+00

Pos logit contributions
ww+0.163
rav+0.156
fl+0.155
eware+0.153
w+0.147
Neg logit contributions
ement-0.118
 nervously-0.114
 sadly-0.110
 curiously-0.108
 closely-0.103
Token.
Token ID13
Feature activation-7.21e-01

Pos logit contributions
less+0.050
rav+0.049
eware+0.049
fl+0.049
ww+0.048
Neg logit contributions
ement-0.038
 nervously-0.034
 curiously-0.033
 sadly-0.033
 closely-0.031
Token\n
Token ID198
Feature activation-8.58e-01

Pos logit contributions
ww+0.022
rav+0.021
eware+0.021
Fl+0.020
fl+0.020
Neg logit contributions
ement-0.018
 nervously-0.016
 sadly-0.016
 curiously-0.015
 closely-0.015
Token\n
Token ID198
Feature activation-1.11e+00

Pos logit contributions
ww+0.028
rav+0.027
eware+0.027
fl+0.026
Fl+0.026
Neg logit contributions
ement-0.019
 nervously-0.018
 sadly-0.017
 curiously-0.017
 closely-0.016
TokenTheir
Token ID14574
Feature activation-1.27e+00

Pos logit contributions
fl+0.037
eware+0.034
rav+0.034
ww+0.033
burn+0.031
Neg logit contributions
 nervously-0.025
 finally-0.024
 sadly-0.024
 happily-0.024
 proudly-0.024
Token they
Token ID484
Feature activation-1.20e+00

Pos logit contributions
ement+0.011
 nervously+0.011
 sadly+0.011
 curiously+0.010
 finally+0.010
Neg logit contributions
ww-0.017
rav-0.016
eware-0.016
fl-0.016
w-0.016
Token run
Token ID1057
Feature activation-2.41e-01

Pos logit contributions
ww+0.026
rav+0.026
fl+0.025
Fl+0.024
w+0.024
Neg logit contributions
ement-0.040
 nervously-0.035
 curiously-0.034
 sadly-0.034
 closely-0.032
Token back
Token ID736
Feature activation-3.44e-01

Pos logit contributions
ww+0.008
rav+0.007
eware+0.007
fl+0.007
w+0.007
Neg logit contributions
ement-0.006
 nervously-0.005
 sadly-0.005
 curiously-0.005
 closely-0.005
Token to
Token ID284
Feature activation-1.77e+00

Pos logit contributions
ww+0.008
fl+0.008
rav+0.008
eware+0.008
less+0.008
Neg logit contributions
ement-0.011
 nervously-0.010
 sadly-0.009
 curiously-0.009
 closely-0.009
Token the
Token ID262
Feature activation-1.65e+00

Pos logit contributions
fl+0.044
rav+0.044
ww+0.044
Fl+0.042
less+0.041
Neg logit contributions
ement-0.052
 nervously-0.045
 sadly-0.043
 curiously-0.043
 closely-0.041
Token living
Token ID2877
Feature activation-5.19e+00

Pos logit contributions
ww+0.052
rav+0.051
eware+0.049
fl+0.048
w+0.047
Neg logit contributions
 nervously-0.037
ement-0.037
 sadly-0.036
 curiously-0.035
 closely-0.034
Token room
Token ID2119
Feature activation-1.43e+00

Pos logit contributions
ww+0.181
rav+0.175
eware+0.173
fl+0.169
w+0.164
Neg logit contributions
ement-0.104
 nervously-0.101
 sadly-0.097
 curiously-0.094
 closely-0.090
Token and
Token ID290
Feature activation-1.18e+00

Pos logit contributions
rav+0.047
ww+0.046
fl+0.046
less+0.046
eware+0.046
Neg logit contributions
ement-0.028
 nervously-0.025
 curiously-0.024
 sadly-0.023
 closely-0.022
Token pick
Token ID2298
Feature activation-1.05e+00

Pos logit contributions
ww+0.028
rav+0.027
fl+0.027
Fl+0.026
eware+0.026
Neg logit contributions
ement-0.038
 nervously-0.033
 curiously-0.032
 sadly-0.032
 closely-0.031
Token up
Token ID510
Feature activation+9.12e-01

Pos logit contributions
ww+0.036
rav+0.035
fl+0.034
eware+0.034
w+0.033
Neg logit contributions
ement-0.021
 nervously-0.020
 sadly-0.019
 curiously-0.019
 closely-0.018
Token their
Token ID511
Feature activation-2.13e-01

Pos logit contributions
ement+0.025
 nervously+0.023
 curiously+0.022
 sadly+0.022
 frantically+0.021
Neg logit contributions
ww-0.025
fl-0.025
rav-0.024
Fl-0.023
eware-0.023
Token.
Token ID13
Feature activation-1.09e-01

Pos logit contributions
ww+0.078
rav+0.075
eware+0.074
fl+0.074
w+0.071
Neg logit contributions
ement-0.033
 nervously-0.032
 sadly-0.031
 curiously-0.029
 closely-0.028
Token They
Token ID1119
Feature activation-9.79e-01

Pos logit contributions
ww+0.004
rav+0.004
eware+0.003
fl+0.003
w+0.003
Neg logit contributions
ement-0.002
 nervously-0.002
 sadly-0.002
 curiously-0.002
 closely-0.002
Token ran
Token ID4966
Feature activation+4.68e-01

Pos logit contributions
ww+0.026
rav+0.025
eware+0.025
Fl+0.024
fl+0.024
Neg logit contributions
ement-0.028
 nervously-0.024
 sadly-0.024
 curiously-0.024
 closely-0.024
Token to
Token ID284
Feature activation-1.30e+00

Pos logit contributions
ement+0.015
 nervously+0.013
 sadly+0.013
 curiously+0.013
 deeply+0.013
Neg logit contributions
eware-0.011
ww-0.011
rav-0.010
fl-0.010
less-0.010
Token the
Token ID262
Feature activation-1.92e+00

Pos logit contributions
ww+0.047
rav+0.045
eware+0.044
fl+0.044
w+0.043
Neg logit contributions
ement-0.024
 nervously-0.023
 sadly-0.022
 curiously-0.022
 closely-0.020
Token living
Token ID2877
Feature activation-5.18e+00

Pos logit contributions
ww+0.054
rav+0.052
fl+0.051
eware+0.051
Fl+0.049
Neg logit contributions
ement-0.051
 nervously-0.049
 sadly-0.047
 curiously-0.046
 closely-0.045
Token room
Token ID2119
Feature activation-1.26e+00

Pos logit contributions
ww+0.172
rav+0.165
eware+0.162
fl+0.161
w+0.155
Neg logit contributions
ement-0.112
 nervously-0.108
 sadly-0.104
 curiously-0.102
 closely-0.097
Token.
Token ID13
Feature activation-1.15e-01

Pos logit contributions
less+0.037
eware+0.036
ww+0.035
fl+0.035
rav+0.035
Neg logit contributions
ement-0.028
 nervously-0.025
 curiously-0.025
 sadly-0.023
 closely-0.022
Token They
Token ID1119
Feature activation-1.15e+00

Pos logit contributions
ww+0.003
eware+0.003
rav+0.003
fl+0.003
Fl+0.003
Neg logit contributions
ement-0.003
 nervously-0.003
 sadly-0.003
 curiously-0.003
 closely-0.002
Token saw
Token ID2497
Feature activation+5.22e-01

Pos logit contributions
ww+0.029
rav+0.029
eware+0.029
Fl+0.028
fl+0.028
Neg logit contributions
ement-0.033
 nervously-0.029
 curiously-0.028
 sadly-0.028
 closely-0.028
Token the
Token ID262
Feature activation-5.68e-01

Pos logit contributions
ement+0.010
 nervously+0.010
 sadly+0.009
 curiously+0.009
 closely+0.009
Neg logit contributions
ww-0.018
rav-0.018
fl-0.017
eware-0.017
Fl-0.017
Token
Token ID198
Feature activation-4.12e-01

Pos logit contributions
ww+0.004
rav+0.004
eware+0.004
fl+0.004
w+0.004
Neg logit contributions
ement-0.002
 nervously-0.002
 sadly-0.002
 curiously-0.002
 closely-0.002
TokenThey
Token ID2990
Feature activation-1.31e+00

Pos logit contributions
eware+0.012
rav+0.012
fl+0.012
ww+0.011
less+0.011
Neg logit contributions
 nervously-0.010
 sadly-0.010
 curiously-0.010
ement-0.009
 happily-0.009
Token go
Token ID467
Feature activation-6.54e-02

Pos logit contributions
ww+0.028
rav+0.027
Fl+0.025
w+0.025
fl+0.025
Neg logit contributions
ement-0.045
 nervously-0.039
 sadly-0.038
 curiously-0.038
 closely-0.036
Token to
Token ID284
Feature activation-1.82e+00

Pos logit contributions
ww+0.002
rav+0.002
fl+0.002
eware+0.002
less+0.002
Neg logit contributions
ement-0.002
 nervously-0.002
 curiously-0.002
 sadly-0.002
 closely-0.002
Token the
Token ID262
Feature activation-2.18e+00

Pos logit contributions
ww+0.050
rav+0.050
fl+0.050
Fl+0.049
less+0.048
Neg logit contributions
ement-0.049
 nervously-0.040
 sadly-0.039
 curiously-0.038
 deeply-0.037
Token living
Token ID2877
Feature activation-5.18e+00

Pos logit contributions
ww+0.068
rav+0.065
eware+0.064
fl+0.061
w+0.061
Neg logit contributions
 nervously-0.052
ement-0.050
 sadly-0.050
 curiously-0.049
 closely-0.047
Token room
Token ID2119
Feature activation-8.55e-01

Pos logit contributions
ww+0.170
rav+0.163
eware+0.160
fl+0.160
w+0.153
Neg logit contributions
ement-0.113
 nervously-0.109
 sadly-0.105
 curiously-0.103
 closely-0.098
Token and
Token ID290
Feature activation-8.25e-01

Pos logit contributions
less+0.026
rav+0.024
fl+0.024
eware+0.023
cr+0.022
Neg logit contributions
ement-0.019
 nervously-0.016
 curiously-0.016
 Sir-0.015
 sadly-0.015
Token play
Token ID711
Feature activation-6.67e-01

Pos logit contributions
rav+0.015
fl+0.015
ww+0.015
Fl+0.015
eware+0.014
Neg logit contributions
ement-0.030
 nervously-0.025
 curiously-0.024
 sadly-0.024
 deeply-0.023
Token with
Token ID351
Feature activation+4.30e-01

Pos logit contributions
ww+0.020
fl+0.019
rav+0.019
Fl+0.019
eware+0.018
Neg logit contributions
ement-0.016
 nervously-0.015
 sadly-0.014
 curiously-0.014
 closely-0.013
Token Beef
Token ID35309
Feature activation-1.76e+00

Pos logit contributions
ement+0.011
 nervously+0.010
 sadly+0.010
 curiously+0.009
 closely+0.009
Neg logit contributions
ww-0.012
fl-0.012
Fl-0.012
rav-0.012
eware-0.012
Token try
Token ID1949
Feature activation-5.49e-01

Pos logit contributions
ww+0.004
rav+0.004
fl+0.004
eware+0.004
Fl+0.004
Neg logit contributions
ement-0.005
 nervously-0.005
 sadly-0.005
 curiously-0.005
 closely-0.005
Token your
Token ID534
Feature activation+2.65e-01

Pos logit contributions
ww+0.018
rav+0.017
eware+0.017
fl+0.016
w+0.016
Neg logit contributions
ement-0.012
 nervously-0.012
 sadly-0.012
 curiously-0.011
 closely-0.011
Token switch
Token ID5078
Feature activation+3.68e-01

Pos logit contributions
ement+0.007
 nervously+0.007
 sadly+0.007
 curiously+0.006
 closely+0.006
Neg logit contributions
ww-0.008
rav-0.007
eware-0.007
fl-0.007
w-0.007
Token again
Token ID757
Feature activation+4.24e-01

Pos logit contributions
ement+0.008
 nervously+0.007
 sadly+0.007
 curiously+0.007
 closely+0.007
Neg logit contributions
ww-0.012
rav-0.012
fl-0.012
eware-0.012
w-0.011
Token.
Token ID13
Feature activation-2.81e-01

Pos logit contributions
ement+0.012
 nervously+0.011
 sadly+0.011
 curiously+0.011
 closely+0.010
Neg logit contributions
ww-0.012
rav-0.011
fl-0.011
eware-0.011
w-0.010
Token Maybe
Token ID6674
Feature activation+3.03e+00

Pos logit contributions
ww+0.009
fl+0.008
rav+0.008
w+0.008
eware+0.008
Neg logit contributions
ement-0.006
 nervously-0.006
 He-0.006
 sadly-0.006
 curiously-0.006
Token we
Token ID356
Feature activation+6.98e-01

Pos logit contributions
ement+0.099
 nervously+0.095
 sadly+0.092
 curiously+0.091
 closely+0.087
Neg logit contributions
ww-0.068
rav-0.064
fl-0.062
eware-0.062
w-0.059
Token can
Token ID460
Feature activation+5.19e-01

Pos logit contributions
ement+0.026
 nervously+0.025
 curiously+0.024
 sadly+0.024
 closely+0.023
Neg logit contributions
ww-0.012
rav-0.012
eware-0.011
fl-0.011
w-0.011
Token make
Token ID787
Feature activation-9.56e-01

Pos logit contributions
ement+0.016
 nervously+0.014
 sadly+0.014
 curiously+0.014
 frantically+0.013
Neg logit contributions
ww-0.012
rav-0.012
fl-0.012
eware-0.012
w-0.011
Token a
Token ID257
Feature activation-1.98e-01

Pos logit contributions
ww+0.030
rav+0.029
eware+0.029
fl+0.028
w+0.027
Neg logit contributions
ement-0.022
 nervously-0.021
 sadly-0.021
 curiously-0.020
 closely-0.020
Token song
Token ID3496
Feature activation-9.17e-01

Pos logit contributions
eware+0.006
ww+0.006
rav+0.006
fl+0.006
w+0.006
Neg logit contributions
 nervously-0.004
 sadly-0.004
ement-0.004
 curiously-0.004
 closely-0.004
Token't
Token ID470
Feature activation-9.40e-01

Pos logit contributions
ement+0.009
 nervously+0.009
 sadly+0.009
 curiously+0.009
 closely+0.008
Neg logit contributions
ww-0.010
rav-0.009
eware-0.009
fl-0.009
w-0.009
Token know
Token ID760
Feature activation+2.37e-02

Pos logit contributions
fl+0.031
rav+0.031
ww+0.029
eware+0.029
ucer+0.028
Neg logit contributions
 He-0.017
 finally-0.017
 sadly-0.017
 happily-0.017
 nervously-0.017
Token anything
Token ID1997
Feature activation-9.48e-01

Pos logit contributions
ement+0.001
 nervously+0.001
 sadly+0.001
 curiously+0.001
 closely+0.001
Neg logit contributions
ww-0.001
rav-0.001
fl-0.001
eware-0.001
Fl-0.001
Token about
Token ID546
Feature activation+5.56e-01

Pos logit contributions
ww+0.032
rav+0.031
eware+0.031
fl+0.031
Fl+0.029
Neg logit contributions
ement-0.019
 nervously-0.018
 sadly-0.018
 closely-0.017
 curiously-0.017
Token l
Token ID300
Feature activation+1.37e+00

Pos logit contributions
ement+0.015
 nervously+0.015
 sadly+0.014
 curiously+0.014
 closely+0.013
Neg logit contributions
ww-0.015
fl-0.015
rav-0.014
eware-0.014
Fl-0.014
Tokenizards
Token ID14124
Feature activation+2.89e+00

Pos logit contributions
 nervously+0.036
ement+0.036
 sadly+0.035
 curiously+0.034
 finally+0.034
Neg logit contributions
w-0.037
eware-0.037
ww-0.036
rav-0.035
a-0.035
Token.
Token ID13
Feature activation+4.31e-01

Pos logit contributions
ement+0.069
 nervously+0.065
 curiously+0.061
 sadly+0.060
 finally+0.057
Neg logit contributions
ww-0.087
rav-0.084
fl-0.084
eware-0.082
w-0.082
Token Go
Token ID1514
Feature activation-9.34e-01

Pos logit contributions
ement+0.012
 nervously+0.012
 sadly+0.012
 curiously+0.011
 He+0.011
Neg logit contributions
ww-0.011
rav-0.011
eware-0.010
fl-0.010
less-0.010
Token away
Token ID1497
Feature activation-1.04e+00

Pos logit contributions
ww+0.030
rav+0.029
fl+0.028
eware+0.028
w+0.026
Neg logit contributions
 nervously-0.021
ement-0.021
 sadly-0.020
 curiously-0.019
 closely-0.018
Token."
Token ID526
Feature activation+1.78e+00

Pos logit contributions
ww+0.030
rav+0.028
eware+0.028
fl+0.028
w+0.026
Neg logit contributions
ement-0.028
 nervously-0.027
 sadly-0.026
 curiously-0.025
 closely-0.024
Token
Token ID198
Feature activation+5.34e-01

Pos logit contributions
ement+0.050
 nervously+0.044
 closely+0.043
 sadly+0.043
 curiously+0.043
Neg logit contributions
ww-0.049
Fl-0.048
rav-0.047
eware-0.046
fl-0.045
Token room
Token ID2119
Feature activation-9.43e-01

Pos logit contributions
ww+0.160
rav+0.157
eware+0.153
fl+0.147
w+0.143
Neg logit contributions
ement-0.074
 nervously-0.071
 sadly-0.067
 curiously-0.064
 closely-0.062
Token.
Token ID13
Feature activation+5.24e-01

Pos logit contributions
ww+0.032
eware+0.030
fl+0.030
rav+0.030
less+0.029
Neg logit contributions
ement-0.019
 nervously-0.018
 sadly-0.017
 curiously-0.017
 closely-0.016
Token "
Token ID366
Feature activation-8.80e-01

Pos logit contributions
ement+0.011
 nervously+0.010
 sadly+0.009
 curiously+0.009
 closely+0.009
Neg logit contributions
ww-0.018
rav-0.017
eware-0.017
fl-0.017
Fl-0.017
TokenWhat
Token ID2061
Feature activation+3.94e-01

Pos logit contributions
rav+0.024
ww+0.024
eware+0.023
less+0.022
uc+0.022
Neg logit contributions
 finally-0.022
 nervously-0.022
 curiously-0.021
 He-0.020
 sadly-0.020
Token's
Token ID338
Feature activation-2.09e-01

Pos logit contributions
ement+0.007
 and+0.007
 nervously+0.007
 proudly+0.007
 He+0.007
Neg logit contributions
fl-0.013
ww-0.013
Br-0.012
Fl-0.012
Snow-0.012
Token in
Token ID287
Feature activation+3.48e+00

Pos logit contributions
rav+0.008
ww+0.008
fl+0.008
eware+0.008
Fl+0.008
Neg logit contributions
 closely-0.003
ement-0.003
 sadly-0.003
 nervously-0.003
 finally-0.003
Token the
Token ID262
Feature activation-5.02e-01

Pos logit contributions
ement+0.073
 nervously+0.070
 sadly+0.067
 curiously+0.065
 closely+0.062
Neg logit contributions
ww-0.118
rav-0.113
eware-0.111
fl-0.111
w-0.107
Token box
Token ID3091
Feature activation+2.39e+00

Pos logit contributions
rav+0.020
ww+0.019
cr+0.019
eware+0.019
fl+0.018
Neg logit contributions
ement-0.007
 tears-0.007
 closely-0.006
 nervously-0.006
 sadly-0.006
Token?"
Token ID1701
Feature activation+1.70e+00

Pos logit contributions
ement+0.070
 nervously+0.068
 curiously+0.064
 sadly+0.064
 closely+0.061
Neg logit contributions
ww-0.060
eware-0.056
w-0.054
rav-0.053
fl-0.053
Token she
Token ID673
Feature activation+6.44e-01

Pos logit contributions
ement+0.040
 nervously+0.037
 sadly+0.035
 curiously+0.035
 closely+0.033
Neg logit contributions
ww-0.053
fl-0.050
eware-0.050
rav-0.049
Fl-0.048
Token asked
Token ID1965
Feature activation+1.73e-01

Pos logit contributions
 nervously+0.008
 curiously+0.008
 sadly+0.008
 closely+0.007
ement+0.007
Neg logit contributions
rav-0.028
ww-0.028
fl-0.027
less-0.027
eware-0.027
Token toy
Token ID13373
Feature activation-3.73e-01

Pos logit contributions
ww+0.020
rav+0.019
eware+0.018
fl+0.018
w+0.017
Neg logit contributions
ement-0.019
 nervously-0.018
 sadly-0.017
 curiously-0.017
 closely-0.017
Token.
Token ID13
Feature activation-3.47e-01

Pos logit contributions
fl+0.012
eware+0.011
ww+0.011
less+0.011
rav+0.011
Neg logit contributions
ement-0.008
 nervously-0.008
 curiously-0.007
 closely-0.007
 sadly-0.007
Token She
Token ID1375
Feature activation-8.91e-02

Pos logit contributions
ww+0.012
rav+0.012
fl+0.011
eware+0.011
Fl+0.011
Neg logit contributions
ement-0.007
 nervously-0.007
 sadly-0.006
 curiously-0.006
 closely-0.006
Token knew
Token ID2993
Feature activation+1.20e+00

Pos logit contributions
fl+0.002
ww+0.002
a+0.002
rav+0.002
w+0.002
Neg logit contributions
ement-0.003
 curiously-0.002
 nervously-0.002
 deeply-0.002
 closely-0.002
Token how
Token ID703
Feature activation-1.84e+00

Pos logit contributions
ement+0.030
 nervously+0.027
 curiously+0.026
 sadly+0.025
 closely+0.024
Neg logit contributions
ww-0.035
fl-0.034
rav-0.033
eware-0.032
less-0.032
Token much
Token ID881
Feature activation+2.88e+00

Pos logit contributions
ww+0.056
rav+0.054
fl+0.053
eware+0.053
w+0.051
Neg logit contributions
 nervously-0.044
ement-0.044
 sadly-0.043
 curiously-0.041
 closely-0.040
Token it
Token ID340
Feature activation+7.22e-01

Pos logit contributions
ement+0.077
 nervously+0.075
 sadly+0.071
 curiously+0.071
 closely+0.066
Neg logit contributions
ww-0.083
rav-0.076
fl-0.076
eware-0.075
w-0.072
Token meant
Token ID4001
Feature activation+1.23e+00

Pos logit contributions
ement+0.022
 nervously+0.021
 sadly+0.020
 curiously+0.020
 closely+0.019
Neg logit contributions
ww-0.017
fl-0.017
rav-0.017
eware-0.017
w-0.016
Token to
Token ID284
Feature activation+5.68e-01

Pos logit contributions
ement+0.024
 nervously+0.023
 curiously+0.021
 sadly+0.021
 closely+0.020
Neg logit contributions
ww-0.042
rav-0.041
fl-0.041
eware-0.041
less-0.040
Token him
Token ID683
Feature activation+2.05e+00

Pos logit contributions
ement+0.018
 nervously+0.017
 sadly+0.017
 curiously+0.017
 closely+0.016
Neg logit contributions
ww-0.013
rav-0.012
eware-0.012
fl-0.012
w-0.011
Token and
Token ID290
Feature activation-8.66e-02

Pos logit contributions
ement+0.043
 nervously+0.038
 curiously+0.035
 sadly+0.034
 closely+0.033
Neg logit contributions
ww-0.067
fl-0.066
eware-0.066
less-0.065
rav-0.064
Token He
Token ID679
Feature activation+8.97e-01

Pos logit contributions
ement+0.011
 nervously+0.009
 sadly+0.008
 curiously+0.008
 closely+0.008
Neg logit contributions
ww-0.027
Fl-0.027
eware-0.026
rav-0.026
fl-0.026
Token wants
Token ID3382
Feature activation+1.04e+00

Pos logit contributions
ement+0.030
 nervously+0.026
 sadly+0.026
 curiously+0.026
 frantically+0.025
Neg logit contributions
rav-0.018
fl-0.018
ww-0.018
eware-0.016
Fl-0.015
Token to
Token ID284
Feature activation+6.46e-02

Pos logit contributions
ement+0.033
 nervously+0.031
 sadly+0.031
 curiously+0.030
 closely+0.028
Neg logit contributions
ww-0.024
fl-0.023
eware-0.023
rav-0.023
less-0.021
Token know
Token ID760
Feature activation+4.96e-01

Pos logit contributions
ement+0.002
 sadly+0.002
 nervously+0.002
 deeply+0.002
 curiously+0.002
Neg logit contributions
ww-0.001
rav-0.001
fl-0.001
eware-0.001
w-0.001
Token how
Token ID703
Feature activation-1.22e+00

Pos logit contributions
ement+0.013
 nervously+0.012
 sadly+0.011
 curiously+0.011
 frantically+0.011
Neg logit contributions
ww-0.013
fl-0.013
Fl-0.013
rav-0.012
less-0.012
Token high
Token ID1029
Feature activation+4.99e+00

Pos logit contributions
ww+0.036
rav+0.035
eware+0.034
fl+0.034
w+0.032
Neg logit contributions
ement-0.031
 nervously-0.030
 sadly-0.029
 curiously-0.028
 closely-0.027
Token their
Token ID511
Feature activation+4.93e-01

Pos logit contributions
ement+0.123
 nervously+0.118
 curiously+0.111
 sadly+0.110
 frantically+0.109
Neg logit contributions
fl-0.151
ww-0.150
w-0.136
eware-0.134
Fl-0.132
Token tower
Token ID10580
Feature activation+2.94e+00

Pos logit contributions
ement+0.011
 nervously+0.011
 sadly+0.011
 curiously+0.010
 frantically+0.010
Neg logit contributions
rav-0.015
fl-0.015
ww-0.015
eware-0.014
w-0.014
Token is
Token ID318
Feature activation+1.59e+00

Pos logit contributions
ement+0.101
 nervously+0.099
 curiously+0.096
 sadly+0.094
 closely+0.092
Neg logit contributions
ww-0.056
fl-0.054
rav-0.054
eware-0.051
w-0.051
Token.
Token ID13
Feature activation+8.24e-01

Pos logit contributions
ement+0.039
 nervously+0.035
 curiously+0.033
 sadly+0.033
 frantically+0.032
Neg logit contributions
fl-0.047
ww-0.047
eware-0.046
less-0.044
rav-0.043
Token
Token ID198
Feature activation+5.74e-01

Pos logit contributions
ement+0.015
 nervously+0.013
 sadly+0.012
 curiously+0.012
 closely+0.012
Neg logit contributions
ww-0.031
eware-0.030
rav-0.030
Fl-0.029
fl-0.029
Token.
Token ID13
Feature activation+3.35e-01

Pos logit contributions
fl+0.018
ww+0.018
rav+0.017
eware+0.017
Fl+0.017
Neg logit contributions
ement-0.011
 nervously-0.011
 sadly-0.010
 curiously-0.010
 closely-0.009
Token One
Token ID1881
Feature activation+1.28e+00

Pos logit contributions
ement+0.006
 nervously+0.006
 sadly+0.006
 curiously+0.005
 frantically+0.005
Neg logit contributions
fl-0.012
Fl-0.011
rav-0.011
ww-0.011
eware-0.011
Token day
Token ID1110
Feature activation+4.78e-02

Pos logit contributions
 nervously+0.026
ement+0.025
 finally+0.025
 happily+0.025
 closely+0.025
Neg logit contributions
rav-0.041
eware-0.038
ww-0.038
ucer-0.038
fl-0.037
Token,
Token ID11
Feature activation+7.19e-01

Pos logit contributions
 sadly+0.000
 nervously+0.000
ement+0.000
 curiously+0.000
 happily+0.000
Neg logit contributions
ww-0.002
fl-0.002
rav-0.002
eware-0.002
Fl-0.002
Token Lily
Token ID20037
Feature activation+2.40e-01

Pos logit contributions
 nervously+0.014
ement+0.014
 sadly+0.014
 curiously+0.013
 finally+0.013
Neg logit contributions
ww-0.025
rav-0.024
eware-0.024
fl-0.024
w-0.023
Token and
Token ID290
Feature activation+1.07e+00

Pos logit contributions
ement+0.007
 deeply+0.006
 nervously+0.006
 curiously+0.006
 sadly+0.006
Neg logit contributions
ww-0.007
fl-0.007
eware-0.006
rav-0.006
w-0.006
Token her
Token ID607
Feature activation+5.74e-01

Pos logit contributions
ement+0.023
 nervously+0.022
 sadly+0.022
 curiously+0.021
 closely+0.020
Neg logit contributions
ww-0.036
rav-0.034
eware-0.034
fl-0.034
w-0.032
Token friends
Token ID2460
Feature activation-1.04e+00

Pos logit contributions
ement+0.012
 nervously+0.012
 sadly+0.011
 curiously+0.011
 closely+0.010
Neg logit contributions
ww-0.019
rav-0.019
fl-0.018
eware-0.018
w-0.017
Token were
Token ID547
Feature activation+1.45e+00

Pos logit contributions
ww+0.029
rav+0.027
eware+0.027
fl+0.027
w+0.026
Neg logit contributions
ement-0.029
 nervously-0.026
 curiously-0.025
 sadly-0.025
 closely-0.025
Token playing
Token ID2712
Feature activation+7.63e-01

Pos logit contributions
ement+0.040
 nervously+0.037
 curiously+0.035
 sadly+0.035
 closely+0.033
Neg logit contributions
ww-0.041
fl-0.039
rav-0.038
eware-0.037
less-0.036
Token on
Token ID319
Feature activation+1.55e+00

Pos logit contributions
ement+0.017
 nervously+0.016
 sadly+0.015
 curiously+0.015
 closely+0.015
Neg logit contributions
ww-0.025
fl-0.024
rav-0.024
eware-0.023
w-0.023
Token,
Token ID11
Feature activation+5.00e-02

Pos logit contributions
ement+0.003
 nervously+0.003
 curiously+0.003
 closely+0.002
 sadly+0.002
Neg logit contributions
ww-0.006
rav-0.006
eware-0.006
fl-0.006
w-0.006
Token a
Token ID257
Feature activation-5.28e-01

Pos logit contributions
ement+0.001
 nervously+0.001
 curiously+0.001
 sadly+0.001
 closely+0.001
Neg logit contributions
ww-0.002
rav-0.001
fl-0.001
eware-0.001
less-0.001
Token super
Token ID2208
Feature activation-1.34e+00

Pos logit contributions
ww+0.015
rav+0.015
eware+0.014
fl+0.014
w+0.014
Neg logit contributions
ement-0.013
 nervously-0.013
 sadly-0.013
 curiously-0.012
 closely-0.012
Token hero
Token ID4293
Feature activation-2.21e-01

Pos logit contributions
ww+0.038
rav+0.036
eware+0.035
fl+0.034
less+0.033
Neg logit contributions
ement-0.036
 nervously-0.034
 sadly-0.033
 curiously-0.033
 closely-0.032
Token arrived
Token ID5284
Feature activation+9.76e-01

Pos logit contributions
ww+0.006
rav+0.006
eware+0.005
fl+0.005
w+0.005
Neg logit contributions
ement-0.006
 nervously-0.006
 sadly-0.006
 curiously-0.006
 closely-0.006
Token on
Token ID319
Feature activation+3.71e-01

Pos logit contributions
ement+0.023
 nervously+0.020
 curiously+0.019
 finally+0.019
 closely+0.019
Neg logit contributions
less-0.028
fl-0.028
ww-0.028
rav-0.027
w-0.027
Token the
Token ID262
Feature activation-5.74e-01

Pos logit contributions
ement+0.011
 curiously+0.010
 closely+0.010
 nervously+0.009
azes+0.009
Neg logit contributions
fl-0.007
ww-0.007
less-0.007
Fl-0.007
Snow-0.007
Token scene
Token ID3715
Feature activation-1.38e+00

Pos logit contributions
ww+0.016
Fl+0.015
fl+0.015
eware+0.015
rav+0.015
Neg logit contributions
ement-0.016
 nervously-0.014
 sadly-0.013
 curiously-0.013
 eagerly-0.013
Token.
Token ID13
Feature activation-6.62e-01

Pos logit contributions
ww+0.040
rav+0.039
less+0.039
eware+0.039
fl+0.038
Neg logit contributions
ement-0.031
 nervously-0.029
 curiously-0.028
 sadly-0.027
 closely-0.026
Token Her
Token ID2332
Feature activation-1.53e-01

Pos logit contributions
ww+0.024
rav+0.024
eware+0.023
fl+0.022
less+0.022
Neg logit contributions
ement-0.011
 nervously-0.010
 sadly-0.009
 curiously-0.009
 closely-0.009
Token name
Token ID1438
Feature activation-2.80e-01

Pos logit contributions
ww+0.004
rav+0.004
fl+0.004
eware+0.004
less+0.004
Neg logit contributions
ement-0.004
 nervously-0.004
 curiously-0.004
 sadly-0.004
 closely-0.004
Token this
Token ID428
Feature activation-1.47e+00

Pos logit contributions
ement+0.007
 nervously+0.007
 curiously+0.006
 sadly+0.006
 eagerly+0.006
Neg logit contributions
ww-0.007
fl-0.007
eware-0.007
Fl-0.006
rav-0.006
Token me
Token ID502
Feature activation+7.25e-01

Pos logit contributions
ww+0.044
rav+0.044
eware+0.043
fl+0.042
w+0.040
Neg logit contributions
ement-0.036
 nervously-0.034
 sadly-0.034
 curiously-0.033
 closely-0.032
Tokenadow
Token ID4584
Feature activation+7.06e-02

Pos logit contributions
ement+0.021
 nervously+0.020
 sadly+0.020
 curiously+0.019
 closely+0.019
Neg logit contributions
ww-0.019
rav-0.018
eware-0.017
fl-0.017
w-0.016
Token."
Token ID526
Feature activation+5.99e-01

Pos logit contributions
ement+0.002
 nervously+0.002
 sadly+0.002
 curiously+0.002
 closely+0.002
Neg logit contributions
ww-0.002
rav-0.002
eware-0.002
fl-0.002
w-0.002
Token The
Token ID383
Feature activation-1.50e-01

Pos logit contributions
ement+0.014
 nervously+0.013
 sadly+0.012
 closely+0.012
 curiously+0.012
Neg logit contributions
ww-0.018
fl-0.017
rav-0.017
Fl-0.017
eware-0.017
Token little
Token ID1310
Feature activation+8.88e-01

Pos logit contributions
ww+0.005
rav+0.005
eware+0.004
fl+0.004
w+0.004
Neg logit contributions
ement-0.004
 nervously-0.004
 sadly-0.003
 curiously-0.003
 closely-0.003
Token girl
Token ID2576
Feature activation-3.13e-01

Pos logit contributions
ement+0.028
 nervously+0.027
 sadly+0.027
 curiously+0.026
 closely+0.025
Neg logit contributions
ww-0.021
rav-0.020
eware-0.020
fl-0.019
w-0.018
Token was
Token ID373
Feature activation+1.05e+00

Pos logit contributions
ww+0.008
rav+0.008
fl+0.008
eware+0.008
Fl+0.008
Neg logit contributions
ement-0.008
 nervously-0.008
 curiously-0.007
 closely-0.007
 sadly-0.007
Token surprised
Token ID6655
Feature activation+8.75e-01

Pos logit contributions
ement+0.025
 nervously+0.022
 curiously+0.021
 closely+0.020
 armor+0.019
Neg logit contributions
ww-0.029
a-0.028
Fl-0.027
less-0.027
eware-0.027
Token,
Token ID11
Feature activation+9.89e-01

Pos logit contributions
ement+0.016
 nervously+0.015
 armor+0.014
 sadly+0.014
 victorious+0.013
Neg logit contributions
less-0.030
fl-0.028
ww-0.028
rav-0.027
eware-0.026
Token she
Token ID673
Feature activation+9.54e-01

Pos logit contributions
ement+0.023
 nervously+0.019
 curiously+0.018
 closely+0.017
 sadly+0.016
Neg logit contributions
ww-0.029
fl-0.029
less-0.028
rav-0.027
Fl-0.027
Token upon
Token ID2402
Feature activation-1.72e+00

Pos logit contributions
rav+0.007
ww+0.006
less+0.006
eware+0.006
fl+0.006
Neg logit contributions
ement-0.014
 nervously-0.014
 sadly-0.013
 curiously-0.013
 frantically-0.013
Token a
Token ID257
Feature activation-1.06e+00

Pos logit contributions
ww+0.060
rav+0.058
eware+0.057
fl+0.056
w+0.054
Neg logit contributions
 nervously-0.034
ement-0.033
 curiously-0.032
 sadly-0.032
 finally-0.031
Token time
Token ID640
Feature activation+1.82e-01

Pos logit contributions
ww+0.036
rav+0.033
eware+0.033
fl+0.032
w+0.032
Neg logit contributions
ement-0.023
 nervously-0.022
 sadly-0.021
 curiously-0.021
 closely-0.020
Token,
Token ID11
Feature activation+7.23e-01

Pos logit contributions
ement+0.002
 nervously+0.002
 sadly+0.002
 curiously+0.002
 closely+0.002
Neg logit contributions
ww-0.007
rav-0.007
eware-0.007
fl-0.007
w-0.007
Token Ben
Token ID3932
Feature activation-4.23e-01

Pos logit contributions
ement+0.015
 nervously+0.015
 sadly+0.015
 curiously+0.014
 closely+0.014
Neg logit contributions
ww-0.024
rav-0.023
eware-0.023
fl-0.023
w-0.022
Token and
Token ID290
Feature activation+8.45e-01

Pos logit contributions
ww+0.017
rav+0.017
eware+0.016
fl+0.016
Fl+0.016
Neg logit contributions
 nervously-0.006
ement-0.005
 sadly-0.005
 curiously-0.005
 closely-0.005
Token Erin
Token ID28894
Feature activation-2.30e+00

Pos logit contributions
 nervously+0.006
 He+0.006
 sadly+0.005
ement+0.005
 happily+0.005
Neg logit contributions
eware-0.039
rav-0.038
ww-0.038
less-0.038
fl-0.038
Token stepped
Token ID10764
Feature activation+1.05e+00

Pos logit contributions
ww+0.075
eware+0.070
rav+0.070
fl+0.067
Fl+0.067
Neg logit contributions
 nervously-0.049
 sadly-0.049
 happily-0.048
 finally-0.047
ement-0.047
Token onto
Token ID4291
Feature activation+4.29e-01

Pos logit contributions
ement+0.024
 nervously+0.021
 sadly+0.021
 curiously+0.020
 finally+0.019
Neg logit contributions
fl-0.033
ww-0.033
rav-0.032
eware-0.032
less-0.031
Token the
Token ID262
Feature activation-9.59e-01

Pos logit contributions
ement+0.008
 nervously+0.008
 sadly+0.007
 curiously+0.007
 closely+0.007
Neg logit contributions
ww-0.015
rav-0.015
eware-0.015
fl-0.014
w-0.014
Token beach
Token ID10481
Feature activation-3.36e-01

Pos logit contributions
ww+0.029
rav+0.028
eware+0.027
w+0.026
fl+0.026
Neg logit contributions
 nervously-0.023
ement-0.023
 curiously-0.022
 sadly-0.021
 closely-0.021
Token in
Token ID287
Feature activation+5.82e-01

Pos logit contributions
ement+0.062
 nervously+0.060
 curiously+0.059
 closely+0.056
 frantically+0.054
Neg logit contributions
ww-0.059
less-0.056
w-0.056
Fl-0.056
tr-0.055
Token his
Token ID465
Feature activation+3.51e-01

Pos logit contributions
ement+0.004
 nervously+0.004
 sadly+0.003
 curiously+0.003
 closely+0.003
Neg logit contributions
rav-0.028
ww-0.028
eware-0.027
fl-0.027
less-0.026
Token room
Token ID2119
Feature activation-8.70e-01

Pos logit contributions
ement+0.007
 nervously+0.007
 sadly+0.007
 curiously+0.007
 proudly+0.006
Neg logit contributions
ww-0.012
rav-0.011
eware-0.011
fl-0.011
w-0.011
Token that
Token ID326
Feature activation-2.19e+00

Pos logit contributions
ww+0.029
eware+0.028
rav+0.028
fl+0.028
less+0.027
Neg logit contributions
ement-0.017
 nervously-0.017
 sadly-0.016
 curiously-0.016
 closely-0.015
Token he
Token ID339
Feature activation+1.02e+00

Pos logit contributions
ww+0.090
rav+0.088
eware+0.087
fl+0.084
less+0.082
Neg logit contributions
 sadly-0.027
 nervously-0.027
 finally-0.027
 He-0.026
ement-0.026
Token loved
Token ID6151
Feature activation+1.38e+00

Pos logit contributions
ement+0.027
 nervously+0.026
 sadly+0.025
 curiously+0.025
 closely+0.024
Neg logit contributions
ww-0.029
rav-0.028
eware-0.027
fl-0.027
w-0.025
Token to
Token ID284
Feature activation+4.74e-01

Pos logit contributions
ement+0.033
 nervously+0.032
 sadly+0.031
 curiously+0.030
 closely+0.029
Neg logit contributions
ww-0.043
rav-0.041
eware-0.040
fl-0.039
w-0.038
Token sit
Token ID1650
Feature activation+2.02e+00

Pos logit contributions
ement+0.013
 nervously+0.013
 sadly+0.012
 curiously+0.012
 closely+0.012
Neg logit contributions
ww-0.013
rav-0.013
eware-0.012
fl-0.012
w-0.012
Token on
Token ID319
Feature activation+1.09e+00

Pos logit contributions
ement+0.048
 nervously+0.044
 curiously+0.041
 frantically+0.041
 sadly+0.040
Neg logit contributions
fl-0.061
ww-0.060
w-0.059
Fl-0.058
rav-0.056
Token.
Token ID13
Feature activation+2.80e-01

Pos logit contributions
ement+0.027
 nervously+0.024
 curiously+0.023
 sadly+0.023
 frantically+0.023
Neg logit contributions
Fl-0.030
ww-0.029
fl-0.028
Snow-0.028
Ch-0.027
Token One
Token ID1881
Feature activation+8.04e-01

Pos logit contributions
ement+0.005
 nervously+0.005
 sadly+0.005
 curiously+0.005
 frantically+0.005
Neg logit contributions
ww-0.009
fl-0.009
rav-0.009
Fl-0.009
eware-0.009
Token brother
Token ID3956
Feature activation-2.42e-01

Pos logit contributions
ww+0.017
fl+0.017
rav+0.017
eware+0.016
less+0.015
Neg logit contributions
ement-0.017
 nervously-0.016
 curiously-0.015
 sadly-0.015
 closely-0.014
Token smiled
Token ID13541
Feature activation+5.00e-01

Pos logit contributions
ww+0.007
rav+0.007
fl+0.006
Fl+0.006
eware+0.006
Neg logit contributions
ement-0.006
 nervously-0.006
 curiously-0.006
 closely-0.006
 sadly-0.005
Token and
Token ID290
Feature activation-5.43e-01

Pos logit contributions
ement+0.009
 nervously+0.007
 curiously+0.006
 frantically+0.006
 closely+0.006
Neg logit contributions
fl-0.018
less-0.018
rav-0.018
ww-0.017
cr-0.016
Token said
Token ID531
Feature activation-1.98e+00

Pos logit contributions
ww+0.017
eware+0.017
rav+0.017
fl+0.016
w+0.016
Neg logit contributions
 nervously-0.012
 sadly-0.012
ement-0.012
 proudly-0.011
 curiously-0.011
Token,
Token ID11
Feature activation-1.06e+00

Pos logit contributions
ww+0.083
eware+0.080
fl+0.079
rav+0.079
w+0.077
Neg logit contributions
ement-0.027
 nervously-0.025
 sadly-0.025
 curiously-0.022
 proudly-0.021
Token "
Token ID366
Feature activation-1.34e+00

Pos logit contributions
ww+0.031
rav+0.030
eware+0.029
fl+0.029
w+0.028
Neg logit contributions
ement-0.026
 nervously-0.026
 sadly-0.025
 curiously-0.024
 closely-0.023
TokenIt
Token ID1026
Feature activation-1.98e+00

Pos logit contributions
eware+0.037
ww+0.032
uc+0.031
less+0.031
rav+0.031
Neg logit contributions
 He-0.035
ement-0.034
 finally-0.033
 happily-0.033
 nervously-0.032
Token's
Token ID338
Feature activation-2.04e+00

Pos logit contributions
eware+0.068
ww+0.067
rav+0.067
less+0.067
Br+0.063
Neg logit contributions
 sadly-0.037
 nervously-0.036
 happily-0.035
 proudly-0.035
 finally-0.035
Token a
Token ID257
Feature activation-2.93e-01

Pos logit contributions
ww+0.058
rav+0.057
eware+0.057
fl+0.055
w+0.053
Neg logit contributions
ement-0.053
 nervously-0.051
 sadly-0.051
 curiously-0.049
 closely-0.048
Token good
Token ID922
Feature activation-1.68e+00

Pos logit contributions
eware+0.009
fl+0.009
w+0.009
ww+0.009
rav+0.009
Neg logit contributions
 nervously-0.006
 sadly-0.006
 finally-0.006
 proudly-0.006
ement-0.006
Token thing
Token ID1517
Feature activation-1.34e+00

Pos logit contributions
ww+0.052
rav+0.050
eware+0.050
fl+0.049
Fl+0.049
Neg logit contributions
 nervously-0.038
ement-0.038
 sadly-0.037
 curiously-0.036
 proudly-0.035
Token hands
Token ID2832
Feature activation-3.19e-01

Pos logit contributions
ement+0.019
 nervously+0.018
 sadly+0.018
 curiously+0.017
 proudly+0.017
Neg logit contributions
ww-0.020
rav-0.019
eware-0.019
fl-0.018
w-0.018
Token and
Token ID290
Feature activation-1.92e-01

Pos logit contributions
ww+0.012
fl+0.012
rav+0.011
eware+0.011
less+0.011
Neg logit contributions
ement-0.005
 nervously-0.005
 sadly-0.005
 curiously-0.005
 closely-0.004
Token feet
Token ID3625
Feature activation-3.77e-01

Pos logit contributions
ww+0.005
fl+0.005
rav+0.005
eware+0.004
cr+0.004
Neg logit contributions
ement-0.006
 nervously-0.005
 sadly-0.005
 curiously-0.005
 deeply-0.005
Token to
Token ID284
Feature activation-1.03e+00

Pos logit contributions
ww+0.013
rav+0.013
fl+0.013
eware+0.013
less+0.012
Neg logit contributions
ement-0.007
 nervously-0.007
 sadly-0.006
 curiously-0.006
 closely-0.006
Token hold
Token ID1745
Feature activation+8.97e-01

Pos logit contributions
fl+0.020
Fl+0.019
cr+0.018
w+0.017
Ch+0.017
Neg logit contributions
ement-0.037
 sadly-0.030
 deeply-0.029
 nervously-0.028
 Sir-0.027
Token on
Token ID319
Feature activation-7.83e-02

Pos logit contributions
ement+0.024
 nervously+0.022
 curiously+0.021
 sadly+0.021
 proudly+0.020
Neg logit contributions
fl-0.024
ww-0.024
rav-0.023
less-0.023
eware-0.022
Token to
Token ID284
Feature activation-7.09e-01

Pos logit contributions
fl+0.002
ww+0.002
less+0.002
eware+0.002
rav+0.002
Neg logit contributions
ement-0.002
 nervously-0.002
 sadly-0.002
 finally-0.002
 curiously-0.002
Token the
Token ID262
Feature activation-3.07e-01

Pos logit contributions
fl+0.016
Fl+0.014
cr+0.014
free+0.013
less+0.013
Neg logit contributions
ement-0.023
 sadly-0.018
 nervously-0.018
 deeply-0.017
 Sir-0.017
Token grass
Token ID8701
Feature activation+4.08e-02

Pos logit contributions
ww+0.009
fl+0.009
rav+0.009
eware+0.008
Fl+0.008
Neg logit contributions
ement-0.007
 sadly-0.007
 nervously-0.007
 proudly-0.006
 curiously-0.006
Token and
Token ID290
Feature activation-4.07e-01

Pos logit contributions
ement+0.001
 nervously+0.001
 sadly+0.001
 curiously+0.001
 closely+0.001
Neg logit contributions
ww-0.001
rav-0.001
fl-0.001
eware-0.001
w-0.001
Token rocks
Token ID12586
Feature activation+9.24e-01

Pos logit contributions
fl+0.010
ww+0.009
rav+0.009
eware+0.009
cr+0.009
Neg logit contributions
ement-0.013
 nervously-0.011
 sadly-0.011
 curiously-0.011
 deeply-0.010
Token Sarah
Token ID10490
Feature activation-1.09e+00

Pos logit contributions
ww+0.113
rav+0.110
w+0.108
fl+0.108
haw+0.102
Neg logit contributions
 nervously-0.056
 sadly-0.053
 curiously-0.052
ement-0.051
 closely-0.049
Token.
Token ID13
Feature activation-2.48e-02

Pos logit contributions
ww+0.040
rav+0.039
eware+0.038
fl+0.038
w+0.037
Neg logit contributions
ement-0.019
 nervously-0.019
 sadly-0.018
 curiously-0.017
 closely-0.016
Token She
Token ID1375
Feature activation-2.22e-01

Pos logit contributions
ww+0.001
rav+0.001
fl+0.001
eware+0.001
less+0.001
Neg logit contributions
ement-0.000
 nervously-0.000
 sadly-0.000
 curiously-0.000
 closely-0.000
Token was
Token ID373
Feature activation+6.00e-01

Pos logit contributions
ww+0.007
rav+0.007
eware+0.007
less+0.007
fl+0.007
Neg logit contributions
 nervously-0.005
ement-0.005
 sadly-0.005
 finally-0.005
 proudly-0.005
Token 3
Token ID513
Feature activation+2.83e-01

Pos logit contributions
ement+0.013
 nervously+0.012
 sadly+0.012
 curiously+0.012
 closely+0.011
Neg logit contributions
ww-0.020
rav-0.019
eware-0.019
fl-0.019
w-0.018
Token years
Token ID812
Feature activation-1.16e+00

Pos logit contributions
ement+0.006
 nervously+0.006
 finally+0.006
 curiously+0.006
 sadly+0.005
Neg logit contributions
ww-0.009
fl-0.008
Fl-0.008
rav-0.008
eware-0.008
Token old
Token ID1468
Feature activation-7.82e-01

Pos logit contributions
ww+0.030
less+0.028
eware+0.027
rav+0.026
fl+0.025
Neg logit contributions
ement-0.032
 sadly-0.029
 nervously-0.029
 finally-0.029
 curiously-0.028
Token and
Token ID290
Feature activation+5.56e-01

Pos logit contributions
fl+0.029
ww+0.028
rav+0.027
Fl+0.027
eware+0.027
Neg logit contributions
ement-0.013
 nervously-0.012
 closely-0.011
 curiously-0.011
 sadly-0.011
Token loved
Token ID6151
Feature activation+9.54e-01

Pos logit contributions
 nervously+0.011
 sadly+0.011
ement+0.010
 He+0.010
 curiously+0.010
Neg logit contributions
ww-0.019
eware-0.019
rav-0.018
fl-0.018
cr-0.017
Token to
Token ID284
Feature activation+1.83e-02

Pos logit contributions
 nervously+0.021
 curiously+0.020
ement+0.020
 sadly+0.019
 happily+0.019
Neg logit contributions
ww-0.031
rav-0.030
eware-0.030
w-0.029
Fl-0.028
Token explore
Token ID7301
Feature activation+5.46e-01

Pos logit contributions
ement+0.001
 nervously+0.001
 sadly+0.000
 curiously+0.000
 closely+0.000
Neg logit contributions
ww-0.000
rav-0.000
eware-0.000
fl-0.000
w-0.000
TokenOne
Token ID3198
Feature activation+1.13e-01

Pos logit contributions
 nervously+0.000
ement+0.000
 curiously+0.000
 sadly+0.000
 closely+0.000
Neg logit contributions
ww-0.000
rav-0.000
fl-0.000
eware-0.000
w-0.000
Token day
Token ID1110
Feature activation-1.90e-01

Pos logit contributions
 nervously+0.003
ement+0.003
 sadly+0.003
 curiously+0.003
 closely+0.003
Neg logit contributions
rav-0.003
ww-0.003
fl-0.003
eware-0.003
Fl-0.003
Token the
Token ID262
Feature activation+3.50e-01

Pos logit contributions
ww+0.008
rav+0.007
eware+0.007
fl+0.007
w+0.007
Neg logit contributions
ement-0.003
 nervously-0.003
 sadly-0.003
 curiously-0.002
 closely-0.002
Token dad
Token ID9955
Feature activation-4.71e-01

Pos logit contributions
ement+0.008
 nervously+0.007
 sadly+0.007
 curiously+0.007
 closely+0.006
Neg logit contributions
ww-0.012
rav-0.011
eware-0.011
fl-0.011
w-0.011
Token forgot
Token ID16453
Feature activation-1.44e+00

Pos logit contributions
ww+0.011
eware+0.011
fl+0.010
rav+0.010
w+0.010
Neg logit contributions
ement-0.015
 nervously-0.014
 deeply-0.013
 curiously-0.013
 sadly-0.013
Token to
Token ID284
Feature activation+8.91e-02

Pos logit contributions
ww+0.042
eware+0.041
rav+0.041
fl+0.040
less+0.039
Neg logit contributions
ement-0.036
 nervously-0.033
 curiously-0.031
 sadly-0.031
 closely-0.030
Token lock
Token ID5793
Feature activation+1.94e-01

Pos logit contributions
ement+0.003
 nervously+0.002
 sadly+0.002
 curiously+0.002
 closely+0.002
Neg logit contributions
ww-0.002
eware-0.002
rav-0.002
w-0.002
fl-0.002
Token the
Token ID262
Feature activation+7.49e-01

Pos logit contributions
ement+0.005
 nervously+0.004
 curiously+0.004
 sadly+0.004
 eagerly+0.004
Neg logit contributions
eware-0.005
less-0.005
ww-0.005
fl-0.005
w-0.005
Token door
Token ID3420
Feature activation+1.08e+00

Pos logit contributions
ement+0.015
 nervously+0.014
 sadly+0.014
 curiously+0.013
 closely+0.013
Neg logit contributions
ww-0.026
rav-0.025
eware-0.025
fl-0.025
w-0.024
Token before
Token ID878
Feature activation-1.37e+00

Pos logit contributions
ement+0.029
 nervously+0.025
 curiously+0.023
 frantically+0.023
 armor+0.023
Neg logit contributions
less-0.028
eware-0.026
tr-0.024
ww-0.024
free-0.024
Token he
Token ID339
Feature activation+9.60e-01

Pos logit contributions
ww+0.038
fl+0.036
eware+0.035
less+0.035
rav+0.034
Neg logit contributions
ement-0.035
 nervously-0.030
 closely-0.030
 curiously-0.029
 sadly-0.028
Token face
Token ID1986
Feature activation-1.58e+00

Pos logit contributions
ww+0.049
rav+0.047
eware+0.045
fl+0.045
w+0.043
Neg logit contributions
ement-0.054
 nervously-0.053
 sadly-0.051
 curiously-0.050
 closely-0.048
Token.
Token ID13
Feature activation-3.12e-01

Pos logit contributions
ww+0.062
rav+0.060
eware+0.059
w+0.057
fl+0.057
Neg logit contributions
ement-0.025
 nervously-0.025
 sadly-0.024
 curiously-0.022
 proudly-0.022
Token<|endoftext|>
Token ID50256
Feature activation-7.12e-02

Pos logit contributions
ww+0.010
rav+0.010
fl+0.010
Fl+0.009
eware+0.009
Neg logit contributions
ement-0.007
 nervously-0.006
 curiously-0.006
 sadly-0.006
 closely-0.006
TokenOnce
Token ID7454
Feature activation-6.87e-02

Pos logit contributions
ww+0.002
fl+0.002
rav+0.002
Fl+0.002
eware+0.002
Neg logit contributions
ement-0.002
 nervously-0.002
 sadly-0.002
 curiously-0.002
 closely-0.002
Token upon
Token ID2402
Feature activation-2.01e+00

Pos logit contributions
rav+0.001
ww+0.001
less+0.001
eware+0.001
w+0.001
Neg logit contributions
ement-0.003
 nervously-0.002
 curiously-0.002
 sadly-0.002
 armor-0.002
Token a
Token ID257
Feature activation-1.14e+00

Pos logit contributions
ww+0.067
rav+0.065
eware+0.064
fl+0.063
w+0.060
Neg logit contributions
 nervously-0.042
ement-0.042
 sadly-0.040
 curiously-0.040
 closely-0.037
Token time
Token ID640
Feature activation+8.06e-02

Pos logit contributions
ww+0.037
rav+0.035
eware+0.034
Fl+0.033
w+0.033
Neg logit contributions
ement-0.026
 nervously-0.024
 sadly-0.023
 curiously-0.023
 closely-0.021
Token there
Token ID612
Feature activation-5.82e-01

Pos logit contributions
ement+0.001
 nervously+0.001
 sadly+0.001
 curiously+0.001
 closely+0.001
Neg logit contributions
ww-0.003
rav-0.003
eware-0.003
fl-0.003
w-0.003
Token was
Token ID373
Feature activation-1.00e+00

Pos logit contributions
ww+0.015
rav+0.014
fl+0.014
eware+0.014
w+0.013
Neg logit contributions
ement-0.017
 nervously-0.016
 sadly-0.016
 curiously-0.015
 closely-0.015
Token a
Token ID257
Feature activation-5.43e-01

Pos logit contributions
ww+0.030
rav+0.029
fl+0.028
eware+0.028
w+0.027
Neg logit contributions
ement-0.025
 nervously-0.024
 sadly-0.023
 curiously-0.023
 closely-0.022
Token bug
Token ID5434
Feature activation-5.37e-01

Pos logit contributions
ww+0.016
rav+0.016
eware+0.015
fl+0.015
w+0.015
Neg logit contributions
ement-0.013
 nervously-0.013
 sadly-0.013
 curiously-0.012
 closely-0.012
Token there
Token ID612
Feature activation-2.85e-01

Pos logit contributions
ement+0.015
 nervously+0.014
 sadly+0.014
 curiously+0.013
 closely+0.013
Neg logit contributions
ww-0.021
rav-0.020
eware-0.019
fl-0.019
w-0.019
Token was
Token ID373
Feature activation-7.42e-01

Pos logit contributions
ww+0.008
rav+0.007
fl+0.007
eware+0.007
w+0.007
Neg logit contributions
ement-0.008
 nervously-0.008
 sadly-0.008
 curiously-0.007
 closely-0.007
Token a
Token ID257
Feature activation-3.95e-01

Pos logit contributions
ww+0.021
fl+0.019
rav+0.019
Fl+0.019
eware+0.019
Neg logit contributions
ement-0.021
 nervously-0.019
 sadly-0.019
 curiously-0.018
 closely-0.018
Token little
Token ID1310
Feature activation-3.71e-01

Pos logit contributions
ww+0.012
rav+0.011
eware+0.011
fl+0.011
w+0.010
Neg logit contributions
ement-0.010
 nervously-0.010
 sadly-0.009
 curiously-0.009
 closely-0.009
Token girl
Token ID2576
Feature activation-1.06e+00

Pos logit contributions
ww+0.009
rav+0.009
eware+0.009
fl+0.008
w+0.008
Neg logit contributions
ement-0.011
 nervously-0.011
 sadly-0.011
 curiously-0.010
 closely-0.010
Token named
Token ID3706
Feature activation-3.45e+00

Pos logit contributions
ww+0.039
rav+0.038
eware+0.037
fl+0.036
w+0.035
Neg logit contributions
 nervously-0.019
ement-0.018
 sadly-0.018
 curiously-0.017
 closely-0.017
Token Lily
Token ID20037
Feature activation-6.85e-01

Pos logit contributions
ww+0.134
fl+0.129
rav+0.128
w+0.127
eware+0.126
Neg logit contributions
 nervously-0.053
 sadly-0.053
 finally-0.052
 closely-0.051
 curiously-0.051
Token.
Token ID13
Feature activation-2.54e-01

Pos logit contributions
ww+0.026
rav+0.026
eware+0.025
fl+0.025
w+0.024
Neg logit contributions
ement-0.011
 nervously-0.011
 sadly-0.010
 curiously-0.010
 closely-0.010
Token She
Token ID1375
Feature activation-3.01e-01

Pos logit contributions
ww+0.010
rav+0.010
eware+0.009
fl+0.009
w+0.009
Neg logit contributions
ement-0.004
 nervously-0.004
 sadly-0.004
 curiously-0.004
 closely-0.003
Token liked
Token ID8288
Feature activation+1.39e+00

Pos logit contributions
eware+0.011
rav+0.011
less+0.011
ww+0.011
fl+0.010
Neg logit contributions
 finally-0.005
 happily-0.005
 proudly-0.005
 nervously-0.005
 sadly-0.005
Token to
Token ID284
Feature activation+2.72e-01

Pos logit contributions
 nervously+0.029
 curiously+0.027
 closely+0.026
ement+0.026
 sadly+0.026
Neg logit contributions
ww-0.045
rav-0.044
eware-0.044
w-0.042
Fl-0.042
TokenTwo
Token ID7571
Feature activation+1.10e+00

Pos logit contributions
fl+0.001
rav+0.001
ww+0.001
eware+0.001
w+0.001
Neg logit contributions
 nervously-0.001
 He-0.001
 finally-0.000
 curiously-0.000
 eagerly-0.000
Token curious
Token ID11040
Feature activation-4.79e-01

Pos logit contributions
 nervously+0.026
ement+0.026
 sadly+0.025
 curiously+0.025
 closely+0.024
Neg logit contributions
ww-0.033
rav-0.031
fl-0.031
eware-0.031
w-0.030
Token little
Token ID1310
Feature activation-5.70e-01

Pos logit contributions
ww+0.015
rav+0.014
eware+0.014
fl+0.014
w+0.013
Neg logit contributions
ement-0.011
 nervously-0.011
 sadly-0.010
 curiously-0.010
 closely-0.010
Token children
Token ID1751
Feature activation-7.84e-01

Pos logit contributions
ww+0.015
fl+0.014
rav+0.014
eware+0.014
Fl+0.013
Neg logit contributions
ement-0.017
 nervously-0.016
 sadly-0.016
 curiously-0.015
 closely-0.015
Token,
Token ID11
Feature activation+7.83e-01

Pos logit contributions
ww+0.025
fl+0.024
rav+0.024
eware+0.024
w+0.022
Neg logit contributions
ement-0.018
 nervously-0.017
 sadly-0.016
 curiously-0.016
 closely-0.015
Token called
Token ID1444
Feature activation-2.71e+00

Pos logit contributions
ement+0.017
 nervously+0.016
 sadly+0.016
 curiously+0.015
 closely+0.015
Neg logit contributions
ww-0.026
rav-0.025
eware-0.024
fl-0.024
w-0.023
Token Max
Token ID5436
Feature activation-6.59e-01

Pos logit contributions
ww+0.104
rav+0.097
w+0.097
fl+0.096
eware+0.095
Neg logit contributions
 nervously-0.045
ement-0.043
 sadly-0.042
 curiously-0.041
 closely-0.039
Token and
Token ID290
Feature activation+3.90e-01

Pos logit contributions
ww+0.027
rav+0.026
fl+0.026
eware+0.026
Fl+0.025
Neg logit contributions
ement-0.009
 nervously-0.008
 sadly-0.008
 curiously-0.007
 closely-0.007
Token Jenny
Token ID28437
Feature activation-2.45e+00

Pos logit contributions
ement+0.009
 nervously+0.008
 sadly+0.008
 curiously+0.007
 closely+0.007
Neg logit contributions
ww-0.013
rav-0.013
fl-0.012
eware-0.012
Fl-0.012
Token,
Token ID11
Feature activation+6.61e-01

Pos logit contributions
ww+0.097
rav+0.093
eware+0.092
fl+0.090
w+0.089
Neg logit contributions
ement-0.037
 nervously-0.037
 sadly-0.035
 curiously-0.034
 closely-0.031
Token played
Token ID2826
Feature activation+5.80e-01

Pos logit contributions
ement+0.016
 nervously+0.016
 sadly+0.015
 curiously+0.015
 closely+0.014
Neg logit contributions
ww-0.020
rav-0.019
eware-0.019
fl-0.018
w-0.018
Tokenavi
Token ID15820
Feature activation+2.84e-01

Pos logit contributions
ement+0.037
 nervously+0.036
 sadly+0.035
 curiously+0.034
 closely+0.033
Neg logit contributions
ww-0.034
rav-0.032
eware-0.032
fl-0.031
w-0.030
Token said
Token ID531
Feature activation-7.88e-01

Pos logit contributions
ement+0.008
 nervously+0.007
 sadly+0.007
 deeply+0.006
 curiously+0.006
Neg logit contributions
ww-0.007
fl-0.007
rav-0.007
w-0.007
eware-0.007
Token,
Token ID11
Feature activation+5.33e-01

Pos logit contributions
ww+0.033
rav+0.031
fl+0.031
eware+0.031
w+0.031
Neg logit contributions
 nervously-0.010
ement-0.010
 sadly-0.010
 curiously-0.009
 proudly-0.009
Token "
Token ID366
Feature activation-1.10e+00

Pos logit contributions
 nervously+0.012
ement+0.012
 sadly+0.011
 curiously+0.011
 closely+0.011
Neg logit contributions
ww-0.018
rav-0.017
eware-0.016
w-0.016
fl-0.016
TokenLet
Token ID5756
Feature activation+1.24e+00

Pos logit contributions
less+0.029
eware+0.028
rav+0.028
uc+0.027
ww+0.027
Neg logit contributions
 nervously-0.028
 He-0.028
 finally-0.028
 curiously-0.028
ement-0.027
Token's
Token ID338
Feature activation-2.96e+00

Pos logit contributions
 nervously+0.027
 sadly+0.026
ement+0.026
 closely+0.025
 curiously+0.025
Neg logit contributions
rav-0.040
eware-0.040
less-0.040
ww-0.039
ucer-0.038
Token follow
Token ID1061
Feature activation+4.14e-01

Pos logit contributions
ww+0.074
fl+0.070
eware+0.068
rav+0.068
w+0.065
Neg logit contributions
ement-0.090
 nervously-0.083
 sadly-0.081
 curiously-0.078
 closely-0.075
Token it
Token ID340
Feature activation-1.14e+00

Pos logit contributions
ement+0.009
 nervously+0.008
 sadly+0.008
 curiously+0.007
 closely+0.007
Neg logit contributions
ww-0.014
rav-0.014
fl-0.013
eware-0.013
w-0.013
Token and
Token ID290
Feature activation-7.41e-01

Pos logit contributions
ww+0.031
rav+0.030
eware+0.029
fl+0.029
w+0.028
Neg logit contributions
ement-0.031
 nervously-0.030
 sadly-0.029
 curiously-0.029
 closely-0.028
Token see
Token ID766
Feature activation+3.45e-02

Pos logit contributions
ww+0.022
rav+0.021
eware+0.021
fl+0.021
w+0.020
Neg logit contributions
ement-0.019
 nervously-0.017
 sadly-0.016
 curiously-0.016
 closely-0.015
Token where
Token ID810
Feature activation-1.31e+00

Pos logit contributions
ement+0.001
 nervously+0.001
 sadly+0.001
 curiously+0.001
 closely+0.001
Neg logit contributions
ww-0.001
rav-0.001
fl-0.001
eware-0.001
Fl-0.001
Token there
Token ID612
Feature activation-5.18e-01

Pos logit contributions
ement+0.014
 nervously+0.013
 sadly+0.013
 curiously+0.012
 closely+0.012
Neg logit contributions
ww-0.021
rav-0.020
eware-0.020
fl-0.020
w-0.019
Token was
Token ID373
Feature activation-8.36e-01

Pos logit contributions
ww+0.015
rav+0.014
eware+0.014
fl+0.014
w+0.013
Neg logit contributions
ement-0.014
 nervously-0.013
 sadly-0.013
 curiously-0.013
 closely-0.012
Token a
Token ID257
Feature activation-5.52e-01

Pos logit contributions
ww+0.025
rav+0.024
fl+0.023
eware+0.023
Fl+0.022
Neg logit contributions
ement-0.021
 nervously-0.020
 sadly-0.019
 curiously-0.019
 closely-0.018
Token little
Token ID1310
Feature activation-4.96e-01

Pos logit contributions
ww+0.016
rav+0.015
eware+0.015
fl+0.015
w+0.014
Neg logit contributions
ement-0.014
 nervously-0.014
 sadly-0.013
 curiously-0.013
 closely-0.012
Token girl
Token ID2576
Feature activation-1.15e+00

Pos logit contributions
ww+0.011
rav+0.011
eware+0.010
fl+0.010
w+0.010
Neg logit contributions
ement-0.016
 nervously-0.015
 sadly-0.015
 curiously-0.015
 closely-0.014
Token named
Token ID3706
Feature activation-3.49e+00

Pos logit contributions
ww+0.043
rav+0.042
eware+0.041
fl+0.040
w+0.039
Neg logit contributions
 nervously-0.020
 sadly-0.019
ement-0.019
 curiously-0.018
 closely-0.017
Token Lily
Token ID20037
Feature activation-8.00e-01

Pos logit contributions
ww+0.144
fl+0.141
w+0.137
rav+0.136
haw+0.135
Neg logit contributions
 finally-0.047
 nervously-0.045
 sadly-0.044
 happily-0.044
 curiously-0.042
Token.
Token ID13
Feature activation-3.14e-01

Pos logit contributions
ww+0.031
rav+0.030
eware+0.030
fl+0.029
w+0.028
Neg logit contributions
 nervously-0.012
ement-0.012
 sadly-0.012
 curiously-0.011
 closely-0.011
Token She
Token ID1375
Feature activation-4.19e-01

Pos logit contributions
ww+0.012
rav+0.011
eware+0.011
fl+0.011
w+0.011
Neg logit contributions
ement-0.006
 nervously-0.005
 sadly-0.005
 curiously-0.005
 closely-0.005
Token loved
Token ID6151
Feature activation+1.19e+00

Pos logit contributions
eware+0.014
rav+0.014
ww+0.014
less+0.014
fl+0.013
Neg logit contributions
 nervously-0.008
 finally-0.008
 proudly-0.008
 happily-0.008
 sadly-0.008
Token to
Token ID284
Feature activation+1.37e-01

Pos logit contributions
 nervously+0.025
 curiously+0.024
ement+0.023
 sadly+0.023
 closely+0.023
Neg logit contributions
ww-0.038
rav-0.037
eware-0.037
w-0.035
Fl-0.035
Token\n
Token ID198
Feature activation+2.38e-01

Pos logit contributions
ement+0.000
 nervously+0.000
 sadly+0.000
 curiously+0.000
 closely+0.000
Neg logit contributions
ww-0.000
fl-0.000
rav-0.000
eware-0.000
Fl-0.000
Token\n
Token ID198
Feature activation-6.02e-02

Pos logit contributions
ement+0.005
 nervously+0.005
 sadly+0.005
 curiously+0.004
 closely+0.004
Neg logit contributions
ww-0.008
rav-0.008
eware-0.008
fl-0.008
Fl-0.007
Token"
Token ID1
Feature activation-2.01e+00

Pos logit contributions
rav+0.002
fl+0.002
ww+0.002
less+0.002
eware+0.002
Neg logit contributions
 nervously-0.001
 curiously-0.001
 closely-0.001
 finally-0.001
 eagerly-0.001
TokenWow
Token ID22017
Feature activation+4.72e-01

Pos logit contributions
eware+0.056
less+0.053
uc+0.053
rav+0.051
ww+0.049
Neg logit contributions
 finally-0.052
 He-0.048
 nervously-0.047
ement-0.047
 happily-0.046
Token,
Token ID11
Feature activation-1.37e+00

Pos logit contributions
ement+0.007
 nervously+0.007
 sadly+0.007
 curiously+0.006
 closely+0.006
Neg logit contributions
ww-0.018
rav-0.018
eware-0.018
fl-0.018
Fl-0.017
Token that
Token ID326
Feature activation-3.36e+00

Pos logit contributions
rav+0.049
ww+0.049
fl+0.049
eware+0.048
less+0.047
Neg logit contributions
 nervously-0.023
 sadly-0.023
ement-0.022
 curiously-0.022
 closely-0.022
Token's
Token ID338
Feature activation-2.42e+00

Pos logit contributions
rav+0.104
eware+0.102
ww+0.100
Fl+0.098
less+0.097
Neg logit contributions
 sadly-0.079
ement-0.077
 nervously-0.075
 proudly-0.075
 closely-0.072
Token cool
Token ID3608
Feature activation-1.99e+00

Pos logit contributions
rav+0.084
ww+0.083
eware+0.082
fl+0.080
w+0.077
Neg logit contributions
ement-0.046
 sadly-0.045
 nervously-0.044
 closely-0.043
 curiously-0.042
Token!
Token ID0
Feature activation-6.80e-02

Pos logit contributions
ww+0.058
rav+0.055
eware+0.054
fl+0.054
Fl+0.052
Neg logit contributions
ement-0.051
 nervously-0.049
 sadly-0.048
 curiously-0.047
 closely-0.045
Token Can
Token ID1680
Feature activation+9.01e-01

Pos logit contributions
ww+0.002
w+0.002
fl+0.002
rav+0.002
ucer+0.002
Neg logit contributions
 He-0.002
 nervously-0.002
ement-0.002
 sadly-0.002
 curiously-0.002
Token you
Token ID345
Feature activation-5.77e-02

Pos logit contributions
ement+0.030
 nervously+0.029
 sadly+0.029
 curiously+0.028
 closely+0.028
Neg logit contributions
ww-0.019
rav-0.018
eware-0.018
fl-0.018
w-0.016