TOP ACTIVATIONS
MAX = 0.246

Weight : 15 kilograms . Average for age : 3 years
was a beautiful pink c asser ole . The
the jar . She t ipt o ed up to the
not want to be a tatt let ale . She decided
didn 't know what " mod est " meant , but
that the child l aug heed in delight .
't understand the word " expensive ," but she recognized the
and dreamed of becoming a Veter in arian one day .
to try the broccoli c asser ole . He said ,
the baby . They t ipt o ed inside and put
called T ippy . T ippy would always get up early
M olly t ipt o ed around the corner
few minutes , then t ipt o ed back to the
time . She quietly t ipt o ed down the street
jumping over logs , t ipt oe ing around rocks ,
didn 't understand what " mod est " meant , but
T - Rex used D ippy 's measurements to cause trouble
asked . Wh ippy happily nodded and started to
. They write " VOL CAN O " in big letters
the ground , so D ippy hopped over to take a

MIN ACTIVATIONS
MIN = -0.342

the tree to its once former beauty . He was happy
to be restored to its former glory ."
the village back to its former glory !"
, leaving it to its former state .
back the village to its former state . With the help
We have different likes and disl ikes . We can be
is from a long time ago , when a king lived
They lived a long time ago . They had big bones
me here a long time ago . They forgot about me
it away a long time ago . It is not good
They were here a minute ago ." " Maybe
it here a long time ago ." They decided
animals and people from long ago . Sam 's mom said
started smoking a long time ago and now it 's hard
coins that people used long ago . They saw paintings and
a ship that sank long ago . She said they could
lost it a long time ago , and you found it
are from a long time ago , before we were born
people did a long time ago . They lived in castles
by a famous artist long ago . One

INTERVAL 0.099 - 0.246
CONTAINS 0.091%

the tree . While m unch ing on her
She also saw shells . Shell s were her favorite .
They ran to find shelter under a rough tree . Lily
the tree . He stays under the tree and bar ks
flower and said , " Wow ! It is very pretty

INTERVAL -0.048 - 0.099
CONTAINS 96.651%

very excited and they quickly began to pick the leaves .
. Sam hugged Dad and Tim and said he
smile , he opened the box and his eyes filled with
me if you don 't . And it 's not good
so big and exciting that Sarah had to take a deep

INTERVAL -0.195 - -0.048
CONTAINS 3.254%

birds . They also saw some people sitting under a tent
idea ! He grabbed a nearby stick and started throwing it
to their favorite spot and then cl apped their hands in
have cows , pigs , chickens , and sheep . They
out of the freeze and soon it was playing again .

INTERVAL -0.342 - -0.195
CONTAINS 0.004%

is from a long time ago , when people used different
is from a long time ago , from when dinosaurs lived
it away in a wheel bar row . Once
When Daisy put the magn ifying glass down , the cater
they just kept on qu arre ling !
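
The four INTERVAL blocks above appear to split the range from MIN to MAX into equal-width bins. The sketch below checks that reading and confirms that the listed CONTAINS shares add up to the whole. It is an illustration only: the equal-width binning is an assumption inferred from the numbers, and the names `INTERVALS` and `equal_width_edges` are hypothetical, not taken from the tool that produced this listing.

```python
# Values copied from the summary above (MIN, MAX, and the four INTERVAL blocks).
ACT_MIN, ACT_MAX = -0.342, 0.246

# (lower bound, upper bound, share of tokens in percent), highest interval first.
INTERVALS = [
    (0.099, 0.246, 0.091),
    (-0.048, 0.099, 96.651),
    (-0.195, -0.048, 3.254),
    (-0.342, -0.195, 0.004),
]


def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced edges from lo to hi, rounded to 3 decimals."""
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


# Assumption: the dashboard splits [MIN, MAX] into equal-width bins.
edges = equal_width_edges(ACT_MIN, ACT_MAX, len(INTERVALS))

# Distinct bounds as they are printed in the INTERVAL headers, in ascending order.
listed_edges = sorted({bound for lo, hi, _ in INTERVALS for bound in (lo, hi)})

# The CONTAINS percentages should cover every token exactly once.
total_share = sum(share for _, _, share in INTERVALS)

print("computed edges:", edges)
print("listed edges:  ", listed_edges)
print("total share:   ", round(total_share, 3))
```

Under this assumption each bin is 0.147 wide, the computed edges match the listed bounds, and the four shares sum to 100%. Nearly all tokens (96.651%) fall in the interval that contains zero, so the top and bottom examples describe only the sparse tails of the distribution.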
Token Weight
Token ID14331
Feature activation+1.04e-02

Pos logit contributions
 Professional+0.001
 Symb+0.001
 Pride+0.001
 Average+0.001
ALTH+0.001
Neg logit contributions
 tant-0.005
bel-0.005
 former-0.005
spe-0.005
 disl-0.005
Token:
Token ID25
Feature activation+2.08e-02

Pos logit contributions
 is+0.003
,+0.003
 things+0.002
 Dollars+0.001
inges+0.001
Neg logit contributions
 former-0.022
 tant-0.019
 disl-0.019
bel-0.018
 stret-0.018
Token 15
Token ID1315
Feature activation-6.62e-03

Pos logit contributions
 Symb+0.008
 Ten+0.007
 Pride+0.007
 Roots+0.007
 V+0.007
Neg logit contributions
 tant-0.037
bel-0.037
side-0.036
handed-0.036
keeping-0.036
Token kilograms
Token ID37075
Feature activation-1.55e-02

Pos logit contributions
 former+0.013
 disl+0.012
isk+0.012
 tant+0.012
ishly+0.012
Neg logit contributions
,-0.002
.-0.001
 centimeters-0.001
!-0.001
ALTH-0.001
Token.
Token ID13
Feature activation-3.51e-03

Pos logit contributions
 former+0.036
 disl+0.032
 tant+0.029
bel+0.028
spe+0.028
Neg logit contributions
,-0.002
.-0.001
!-0.001
!"-0.001
 hard-0.001
Token Average
Token ID13475
Feature activation+2.13e-01

Pos logit contributions
 tant+0.006
bel+0.006
 former+0.006
 disl+0.006
spe+0.006
Neg logit contributions
 Professional-0.001
 Pride-0.001
 Symb-0.001
 Average-0.001
ALTH-0.001
Token for
Token ID329
Feature activation+2.41e-02

Pos logit contributions
 Mueller+0.079
erv+0.069
 Professional+0.062
 for+0.055
ALTH+0.054
Neg logit contributions
 disl-0.375
 former-0.327
 tant-0.327
oc-0.307
storm-0.306
Token age
Token ID2479
Feature activation-1.01e-01

Pos logit contributions
 help+0.008
 looking+0.006
 helping+0.006
 more+0.005
 solving+0.005
Neg logit contributions
spe-0.044
 former-0.044
bel-0.044
ci-0.042
 tant-0.042
Token:
Token ID25
Feature activation+1.51e-02

Pos logit contributions
 former+0.231
 tant+0.205
 disl+0.204
bel+0.190
isk+0.189
Neg logit contributions
,-0.020
."-0.008
.-0.006
!-0.005
!"-0.002
Token 3
Token ID513
Feature activation-1.66e-02

Pos logit contributions
 to+0.004
 very+0.004
 new+0.004
 ounce+0.004
 more+0.004
Neg logit contributions
isk-0.028
 former-0.027
bel-0.026
 disl-0.026
ro-0.026
Token years
Token ID812
Feature activation+2.46e-02

Pos logit contributions
ifying+0.033
 former+0.033
arl+0.031
stack+0.029
bar+0.029
Neg logit contributions
 years-0.004
 things-0.002
 days-0.002
,-0.002
 ways-0.001
Token was
Token ID373
Feature activation-9.84e-03

Pos logit contributions
 looked+0.007
 vodka+0.004
 covered+0.003
 helped+0.003
 smelled+0.003
Neg logit contributions
 former-0.078
arl-0.077
 disl-0.075
 ago-0.073
 tant-0.070
Token a
Token ID257
Feature activation+9.58e-04

Pos logit contributions
 disl+0.020
 former+0.019
 tant+0.018
arre+0.018
spe+0.018
Neg logit contributions
 vodka-0.002
 very-0.002
 covered-0.002
 under-0.002
 making-0.001
Token beautiful
Token ID4950
Feature activation-1.15e-02

Pos logit contributions
 vodka+0.000
ALTH+0.000
 Professional+0.000
 Mueller+0.000
 capit+0.000
Neg logit contributions
 former-0.002
 disl-0.002
und-0.001
 ago-0.001
bar-0.001
Token pink
Token ID11398
Feature activation+2.30e-02

Pos logit contributions
 former+0.019
ifying+0.018
 disl+0.018
spe+0.017
bar+0.017
Neg logit contributions
 vodka-0.004
 Mueller-0.004
 Professional-0.004
ALTH-0.004
Code-0.003
Token c
Token ID269
Feature activation+1.25e-02

Pos logit contributions
 Mueller+0.008
ALTH+0.006
 vodka+0.006
 Professional+0.006
 soils+0.005
Neg logit contributions
 former-0.041
 disl-0.041
spe-0.036
bel-0.035
arl-0.034
Tokenasser
Token ID24929
Feature activation+2.00e-01

Pos logit contributions
ALTH+0.009
 Professional+0.009
Code+0.009
 Mueller+0.008
 vodka+0.008
Neg logit contributions
 former-0.017
 disl-0.016
spe-0.015
 tant-0.014
 ago-0.014
Tokenole
Token ID2305
Feature activation-6.77e-02

Pos logit contributions
ALTH+0.094
 Mueller+0.092
 soils+0.075
Code+0.074
 blogging+0.069
Neg logit contributions
 former-0.306
 disl-0.304
ifying-0.277
 ah-0.269
 tant-0.262
Token.
Token ID13
Feature activation+1.47e-02

Pos logit contributions
 former+0.137
 disl+0.122
 tant+0.117
bel+0.112
arl+0.106
Neg logit contributions
ALTH-0.021
 Mueller-0.020
 Professional-0.017
 blogging-0.014
Code-0.013
Token
Token ID198
Feature activation+1.98e-02

Pos logit contributions
 Professional+0.007
ALTH+0.005
Code+0.005
 vodka+0.004
 Ded+0.004
Neg logit contributions
 former-0.027
 disl-0.024
 tant-0.023
bel-0.023
spe-0.022
Token
Token ID198
Feature activation+2.00e-02

Pos logit contributions
 Professional+0.014
Code+0.013
ALTH+0.013
 Mueller+0.012
 vodka+0.012
Neg logit contributions
 former-0.029
 disl-0.026
 tant-0.024
ifying-0.024
 ago-0.023
TokenThe
Token ID464
Feature activation+2.28e-02

Pos logit contributions
Code+0.010
ALTH+0.010
 Professional+0.010
 vodka+0.009
 Mueller+0.008
Neg logit contributions
 former-0.033
 disl-0.030
 tant-0.028
 ago-0.028
ifying-0.026
Token the
Token ID262
Feature activation+2.82e-02

Pos logit contributions
ALTH+0.010
 Professional+0.010
Code+0.010
 Mueller+0.007
 Vend+0.007
Neg logit contributions
 former-0.044
 disl-0.040
bel-0.038
ifying-0.037
spe-0.037
Token jar
Token ID17379
Feature activation+3.74e-02

Pos logit contributions
Code+0.010
ALTH+0.009
 Professional+0.008
 vodka+0.008
 Mueller+0.008
Neg logit contributions
 disl-0.052
 former-0.047
isk-0.044
bel-0.044
spe-0.043
Token.
Token ID13
Feature activation+2.08e-02

Pos logit contributions
.+0.003
,+0.000
 under-0.000
 because-0.002
 but-0.002
Neg logit contributions
 disl-0.096
 former-0.087
 tant-0.085
 seventeen-0.079
spe-0.078
Token She
Token ID1375
Feature activation+2.72e-02

Pos logit contributions
 Professional+0.010
 Mueller+0.010
ALTH+0.010
Code+0.008
 Average+0.008
Neg logit contributions
 former-0.035
 disl-0.032
ifying-0.031
 tant-0.031
bel-0.029
Token t
Token ID256
Feature activation+9.35e-03

Pos logit contributions
 looked+0.006
 vodka+0.006
Code+0.005
 Professional+0.005
 opened+0.004
Neg logit contributions
 disl-0.057
 former-0.056
ifying-0.051
arl-0.050
opl-0.048
Tokenipt
Token ID10257
Feature activation+1.91e-01

Pos logit contributions
ALTH+0.006
 Professional+0.006
Code+0.005
 Mueller+0.005
 vodka+0.005
Neg logit contributions
 former-0.014
 disl-0.012
ifying-0.012
 tant-0.012
bel-0.012
Tokeno
Token ID78
Feature activation+5.12e-03

Pos logit contributions
 Professional+0.090
Code+0.084
ipt+0.083
 Mueller+0.082
expensive+0.076
Neg logit contributions
 former-0.331
 disl-0.296
 tant-0.264
ifying-0.246
 ago-0.241
Tokened
Token ID276
Feature activation-4.67e-02

Pos logit contributions
 under+0.002
 looked+0.001
 followed+0.001
 led+0.001
's+0.001
Neg logit contributions
 former-0.011
 disl-0.011
igsaw-0.009
arl-0.009
opl-0.009
Token up
Token ID510
Feature activation+2.63e-02

Pos logit contributions
 disl+0.096
 former+0.092
arre+0.079
ifying+0.078
spe+0.077
Neg logit contributions
 under-0.016
 Mueller-0.011
 Professional-0.009
 vodka-0.009
Code-0.008
Token to
Token ID284
Feature activation+2.25e-02

Pos logit contributions
 Mueller+0.011
Code+0.011
 Professional+0.011
ALTH+0.011
 vodka+0.010
Neg logit contributions
 disl-0.045
 former-0.045
ifying-0.038
 tant-0.038
arre-0.036
Token the
Token ID262
Feature activation+3.22e-02

Pos logit contributions
 the+0.001
 it+0.000
 Professional-0.001
 look-0.001
 looking-0.002
Neg logit contributions
 former-0.053
spe-0.051
arre-0.049
 tant-0.048
 disl-0.048
Token not
Token ID407
Feature activation+1.31e-02

Pos logit contributions
ALTH+0.022
 Professional+0.021
Code+0.021
 Mueller+0.020
 vodka+0.020
Neg logit contributions
 former-0.034
 disl-0.030
ifying-0.027
 tant-0.027
 ago-0.026
Token want
Token ID765
Feature activation-1.86e-02

Pos logit contributions
ALTH+0.012
 Professional+0.012
Code+0.012
 Mueller+0.011
 vodka+0.011
Neg logit contributions
 former-0.016
 disl-0.014
 tant-0.013
ifying-0.013
 ago-0.013
Token to
Token ID284
Feature activation+1.19e-02

Pos logit contributions
 former+0.031
 disl+0.027
spe+0.027
ifying+0.026
bel+0.026
Neg logit contributions
 Professional-0.010
 Mueller-0.009
ALTH-0.009
Code-0.009
 vodka-0.008
Token be
Token ID307
Feature activation-1.18e-02

Pos logit contributions
Code+0.010
ALTH+0.009
 Professional+0.009
 Mueller+0.009
 vodka+0.009
Neg logit contributions
 former-0.016
 disl-0.014
ifying-0.013
 tant-0.013
 ago-0.012
Token a
Token ID257
Feature activation+3.34e-04

Pos logit contributions
 former+0.022
ifying+0.020
spe+0.019
 disl+0.019
stack+0.019
Neg logit contributions
 more-0.003
 deleting-0.003
 vodka-0.003
 agreeable-0.003
 tatt-0.003
Token tatt
Token ID14399
Feature activation+1.91e-01

Pos logit contributions
ALTH+0.000
 Professional+0.000
Code+0.000
 Mueller+0.000
 vodka+0.000
Neg logit contributions
 former-0.000
 disl-0.000
ifying-0.000
 ago-0.000
bel-0.000
Tokenlet
Token ID1616
Feature activation-5.88e-03

Pos logit contributions
 Professional+0.154
Code+0.153
ALTH+0.152
 Mueller+0.150
 vodka+0.139
Neg logit contributions
 former-0.254
 disl-0.228
 tant-0.214
ifying-0.195
spe-0.192
Tokenale
Token ID1000
Feature activation+6.41e-03

Pos logit contributions
 former+0.013
 disl+0.011
ifying+0.010
spe+0.010
bel+0.010
Neg logit contributions
ALTH-0.002
 Professional-0.001
 vodka-0.001
expensive-0.001
 Mueller-0.001
Token.
Token ID13
Feature activation+1.49e-02

Pos logit contributions
.+0.000
,-0.001
 for-0.001
 Mueller-0.001
!-0.001
Neg logit contributions
 former-0.017
 disl-0.014
ifying-0.013
bel-0.013
 tant-0.013
Token She
Token ID1375
Feature activation+1.83e-02

Pos logit contributions
 Professional+0.009
ALTH+0.009
Code+0.008
 Mueller+0.008
 vodka+0.007
Neg logit contributions
 former-0.023
 disl-0.021
spe-0.020
 tant-0.020
bel-0.020
Token decided
Token ID3066
Feature activation-6.70e-03

Pos logit contributions
Code+0.009
 vodka+0.008
ALTH+0.008
 Professional+0.008
 Mueller+0.008
Neg logit contributions
 former-0.031
 disl-0.030
ifying-0.028
igsaw-0.027
 ago-0.027
Token didn
Token ID1422
Feature activation-2.87e-02

Pos logit contributions
 former+0.027
 disl+0.024
ifying+0.023
 ago+0.022
 tant+0.022
Neg logit contributions
Code-0.015
 Professional-0.015
ALTH-0.015
 Mueller-0.014
 vodka-0.014
Token't
Token ID470
Feature activation+2.59e-02

Pos logit contributions
 former+0.032
 disl+0.029
ifying+0.027
 tant+0.026
spe+0.024
Neg logit contributions
Code-0.027
 Professional-0.027
ALTH-0.026
 Mueller-0.026
 vodka-0.025
Token know
Token ID760
Feature activation-3.48e-03

Pos logit contributions
Code+0.023
ALTH+0.023
 Professional+0.023
 Mueller+0.022
 vodka+0.021
Neg logit contributions
 former-0.032
 disl-0.028
ifying-0.026
 tant-0.025
 ago-0.025
Token what
Token ID644
Feature activation-4.64e-03

Pos logit contributions
 former+0.006
 disl+0.005
ifying+0.005
bel+0.005
 tant+0.005
Neg logit contributions
Code-0.002
 Professional-0.002
ALTH-0.002
 Mueller-0.002
 vodka-0.001
Token "
Token ID366
Feature activation-1.91e-02

Pos logit contributions
 former+0.008
ifying+0.007
 disl+0.007
 tant+0.007
 ago+0.007
Neg logit contributions
 Professional-0.002
Code-0.002
ALTH-0.002
 Mueller-0.002
 Ded-0.002
Tokenmod
Token ID4666
Feature activation+1.90e-01

Pos logit contributions
 former+0.033
 disl+0.032
 ago+0.030
 Majesty+0.029
 amaz+0.027
Neg logit contributions
expensive-0.013
sort-0.011
pay-0.011
mod-0.011
danger-0.009
Tokenest
Token ID395
Feature activation+1.27e-02

Pos logit contributions
 Professional+0.047
Code+0.047
ALTH+0.045
 Mueller+0.042
retched+0.038
Neg logit contributions
 former-0.337
 disl-0.311
ifying-0.304
side-0.291
 ago-0.285
Token"
Token ID1
Feature activation-1.70e-02

Pos logit contributions
 thing+0.003
 things+0.001
 Mueller+0.001
ALTH+0.001
 ounce+0.001
Neg logit contributions
 former-0.028
bel-0.023
 disl-0.022
spe-0.022
side-0.022
Token meant
Token ID4001
Feature activation+1.66e-02

Pos logit contributions
 former+0.035
 disl+0.034
ocol+0.031
ifying+0.030
 Majesty+0.030
Neg logit contributions
expensive-0.006
sort-0.005
pay-0.005
mod-0.004
 Mueller-0.003
Token,
Token ID11
Feature activation+1.23e-02

Pos logit contributions
,+0.001
 but+0.001
ALTH+0.000
 because+0.000
 Mueller-0.000
Neg logit contributions
 former-0.040
ifying-0.033
bel-0.031
 disl-0.031
 tant-0.031
Token but
Token ID475
Feature activation+3.56e-02

Pos logit contributions
Code+0.003
ALTH+0.003
 Professional+0.002
 vodka+0.002
 Mueller+0.002
Neg logit contributions
 former-0.025
 disl-0.022
ifying-0.022
spe-0.021
bel-0.021
Token that
Token ID326
Feature activation-1.60e-02

Pos logit contributions
 former+0.038
 tant+0.035
 disl+0.033
ifying+0.032
spe+0.031
Neg logit contributions
Code-0.006
ALTH-0.005
 Mueller-0.005
 Professional-0.004
 vodka-0.004
Token the
Token ID262
Feature activation+1.46e-02

Pos logit contributions
 former+0.037
arl+0.037
 disl+0.035
 tant+0.035
ro+0.034
Neg logit contributions
 it-0.002
 the-0.001
 they+0.000
 everyone+0.001
 everything+0.002
Token child
Token ID1200
Feature activation+3.42e-02

Pos logit contributions
Code+0.003
 Mueller+0.003
 vodka+0.002
 Professional+0.002
 moon+0.002
Neg logit contributions
 disl-0.028
 former-0.027
 tant-0.025
isk-0.025
ro-0.024
Token l
Token ID300
Feature activation+3.26e-02

Pos logit contributions
 looked+0.008
 closed+0.007
 opened+0.006
 lay+0.005
 sat+0.004
Neg logit contributions
arl-0.079
 former-0.072
 disl-0.071
 tant-0.070
igsaw-0.065
Tokenaug
Token ID7493
Feature activation-1.81e-02

Pos logit contributions
 Professional+0.025
ALTH+0.025
Code+0.024
 Mueller+0.024
 vodka+0.021
Neg logit contributions
 former-0.045
 disl-0.042
 tant-0.038
spe-0.036
bel-0.035
Tokenheed
Token ID23616
Feature activation+1.88e-01

Pos logit contributions
 former+0.022
 disl+0.019
 tant+0.018
ifying+0.016
bel+0.016
Neg logit contributions
Code-0.017
ALTH-0.017
 Professional-0.017
 Mueller-0.016
 vodka-0.015
Token in
Token ID287
Feature activation+1.27e-02

Pos logit contributions
Code+0.057
ALTH+0.056
 Mueller+0.052
 Professional+0.045
 vodka+0.045
Neg logit contributions
 disl-0.351
 former-0.329
 tant-0.285
 supervision-0.284
opl-0.275
Token delight
Token ID10974
Feature activation-4.25e-02

Pos logit contributions
Code+0.011
ALTH+0.011
 Professional+0.010
 Mueller+0.010
 vodka+0.010
Neg logit contributions
 former-0.016
 disl-0.014
 tant-0.013
bel-0.013
spe-0.013
Token.
Token ID13
Feature activation+8.33e-03

Pos logit contributions
 former+0.081
 disl+0.078
 tant+0.067
ifying+0.067
bar+0.066
Neg logit contributions
 Mueller-0.007
ALTH-0.004
Code-0.004
 Professional-0.004
 vodka-0.004
Token
Token ID198
Feature activation+1.14e-02

Pos logit contributions
 Inside+0.001
 Professional+0.000
 Bes+0.000
 "+0.000
ALTH+0.000
Neg logit contributions
 former-0.018
 tant-0.017
 disl-0.016
bar-0.016
spe-0.015
Token
Token ID198
Feature activation+1.15e-02

Pos logit contributions
 Professional+0.007
ALTH+0.007
Code+0.007
 Mueller+0.006
 vodka+0.006
Neg logit contributions
 former-0.018
 disl-0.016
ifying-0.015
 tant-0.015
spe-0.014
Token't
Token ID470
Feature activation+1.73e-02

Pos logit contributions
 former+0.045
 disl+0.040
ifying+0.036
 tant+0.036
 ago+0.035
Neg logit contributions
Code-0.032
 Professional-0.031
ALTH-0.031
 Mueller-0.030
 vodka-0.029
Token understand
Token ID1833
Feature activation-5.68e-02

Pos logit contributions
ALTH+0.020
 Professional+0.020
Code+0.020
 vodka+0.018
 Mueller+0.018
Neg logit contributions
 former-0.017
 disl-0.015
 tant-0.013
 ago-0.013
ro-0.012
Token the
Token ID262
Feature activation-2.27e-03

Pos logit contributions
 former+0.098
ifying+0.084
bel+0.082
 disl+0.081
spe+0.081
Neg logit contributions
 Mueller-0.027
Code-0.026
 Professional-0.026
ALTH-0.024
 vodka-0.022
Token word
Token ID1573
Feature activation-8.17e-03

Pos logit contributions
 former+0.004
ifying+0.003
 disl+0.003
 ago+0.003
spe+0.003
Neg logit contributions
ALTH-0.001
 Professional-0.001
Code-0.001
 Mueller-0.001
 vodka-0.001
Token "
Token ID366
Feature activation-2.60e-02

Pos logit contributions
 former+0.020
 disl+0.019
arl+0.017
ifying+0.016
bel+0.016
Neg logit contributions
,-0.001
 but-0.000
 and+0.000
.+0.000
 Mueller+0.000
Tokenexpensive
Token ID22031
Feature activation+1.86e-01

Pos logit contributions
 former+0.044
 disl+0.044
 ago+0.040
 Majesty+0.038
ifying+0.035
Neg logit contributions
expensive-0.016
mod-0.014
pay-0.014
sort-0.014
danger-0.012
Token,"
Token ID553
Feature activation-3.10e-02

Pos logit contributions
ALTH+0.039
 Professional+0.037
expensive+0.033
 Mueller+0.028
 ounce+0.026
Neg logit contributions
 former-0.391
 disl-0.349
 tant-0.310
bel-0.308
 amaz-0.300
Token but
Token ID475
Feature activation+2.70e-02

Pos logit contributions
 former+0.053
 disl+0.048
spe+0.048
ifying+0.048
 ago+0.045
Neg logit contributions
 Professional-0.015
 Mueller-0.014
Code-0.013
ALTH-0.012
expensive-0.012
Token she
Token ID673
Feature activation-3.28e-02

Pos logit contributions
 Mueller+0.004
 Professional+0.003
 it+0.002
 her+0.001
 under+0.001
Neg logit contributions
 former-0.059
 disl-0.054
ifying-0.052
arl-0.050
opl-0.049
Token recognized
Token ID8018
Feature activation+2.92e-02

Pos logit contributions
 former+0.046
 disl+0.043
ifying+0.040
 tant+0.038
 ago+0.038
Neg logit contributions
Code-0.024
 Professional-0.023
ALTH-0.023
 Mueller-0.023
 vodka-0.021
Token the
Token ID262
Feature activation-1.35e-03

Pos logit contributions
 it+0.004
 the+0.002
.-0.001
 more-0.002
 Professional-0.003
Neg logit contributions
 former-0.069
bel-0.062
 ago-0.060
ro-0.059
 disl-0.059
Token and
Token ID290
Feature activation-2.59e-04

Pos logit contributions
ALTH+0.065
Code+0.063
 Professional+0.062
 vodka+0.061
 Average+0.059
Neg logit contributions
 former-0.035
 disl-0.032
 tant-0.030
bel-0.030
 ago-0.029
Token dreamed
Token ID27947
Feature activation+2.35e-02

Pos logit contributions
 former+0.000
 disl+0.000
ifying+0.000
 tant+0.000
spe+0.000
Neg logit contributions
 Professional-0.000
Code-0.000
ALTH-0.000
 vodka-0.000
 Mueller-0.000
Token of
Token ID286
Feature activation-1.92e-02

Pos logit contributions
 Professional+0.008
Code+0.007
 Mueller+0.007
ALTH+0.007
 vodka+0.005
Neg logit contributions
 former-0.041
 disl-0.040
ifying-0.037
spe-0.035
 ago-0.034
Token becoming
Token ID5033
Feature activation+3.98e-03

Pos logit contributions
 former+0.032
ifying+0.032
 tant+0.030
spe+0.029
bel+0.029
Neg logit contributions
 Professional-0.009
ALTH-0.008
Code-0.008
 vodka-0.007
 Mueller-0.006
Token a
Token ID257
Feature activation-1.20e-02

Pos logit contributions
Code+0.002
ALTH+0.002
 Professional+0.002
 Mueller+0.002
 vodka+0.002
Neg logit contributions
 former-0.006
 disl-0.006
ifying-0.006
 tant-0.006
 ago-0.005
Token Veter
Token ID13525
Feature activation+1.85e-01

Pos logit contributions
 former+0.015
 disl+0.014
ifying+0.012
 tant+0.012
 ago+0.012
Neg logit contributions
ALTH-0.010
Code-0.010
 Professional-0.010
 Mueller-0.009
 vodka-0.009
Tokenin
Token ID259
Feature activation-1.29e-02

Pos logit contributions
ALTH+0.207
Code+0.201
 Professional+0.201
 Mueller+0.189
 vodka+0.188
Neg logit contributions
 former-0.190
 tant-0.153
spe-0.150
 disl-0.148
bel-0.145
Tokenarian
Token ID3699
Feature activation+4.81e-02

Pos logit contributions
 former+0.021
 disl+0.019
ifying+0.018
 tant+0.017
bel+0.017
Neg logit contributions
 Mueller-0.007
ALTH-0.006
 Professional-0.006
Code-0.006
 Veter-0.005
Token one
Token ID530
Feature activation-3.88e-02

Pos logit contributions
ALTH+0.014
 Mueller+0.012
adies+0.006
 among+0.004
swers+0.004
Neg logit contributions
 former-0.096
 tant-0.092
ifying-0.086
 disl-0.082
arre-0.080
Token day
Token ID1110
Feature activation-1.03e-02

Pos logit contributions
 former+0.047
 disl+0.041
ifying+0.040
 tant+0.039
 ago+0.038
Neg logit contributions
Code-0.035
ALTH-0.035
 Professional-0.034
 vodka-0.032
 Mueller-0.031
Token.
Token ID13
Feature activation+8.51e-04

Pos logit contributions
 former+0.021
 disl+0.021
 tant+0.019
spe+0.018
isk+0.018
Neg logit contributions
 Mueller-0.001
ALTH-0.001
.-0.001
 Professional-0.001
,-0.001
Token to
Token ID284
Feature activation-2.08e-03

Pos logit contributions
 former+0.048
ifying+0.043
spe+0.041
 disl+0.041
bel+0.040
Neg logit contributions
 Mueller-0.006
Code-0.004
 Professional-0.003
 vodka-0.003
ALTH-0.003
Token try
Token ID1949
Feature activation+8.70e-03

Pos logit contributions
 former+0.003
 disl+0.003
ifying+0.002
 tant+0.002
spe+0.002
Neg logit contributions
Code-0.002
ALTH-0.002
 Professional-0.002
 Mueller-0.002
 vodka-0.001
Token the
Token ID262
Feature activation+7.20e-03

Pos logit contributions
 Professional+0.002
 Mueller+0.002
expensive+0.001
 it+0.001
 looking+0.001
Neg logit contributions
 former-0.017
ifying-0.016
bel-0.016
spe-0.016
 disl-0.015
Token broccoli
Token ID44653
Feature activation-3.20e-02

Pos logit contributions
Code+0.007
ALTH+0.007
 Professional+0.007
 Mueller+0.007
 vodka+0.007
Neg logit contributions
 former-0.008
 disl-0.007
 tant-0.006
ifying-0.006
 ago-0.006
Token c
Token ID269
Feature activation-2.23e-03

Pos logit contributions
 former+0.071
 disl+0.069
ifying+0.063
bel+0.061
arl+0.060
Neg logit contributions
 under-0.003
.-0.002
,-0.001
 because-0.001
 according+0.000
Tokenasser
Token ID24929
Feature activation+1.84e-01

Pos logit contributions
 former+0.003
 disl+0.003
spe+0.003
 ago+0.003
ifying+0.003
Neg logit contributions
 Professional-0.002
ALTH-0.002
Code-0.001
 Mueller-0.001
 vodka-0.001
Tokenole
Token ID2305
Feature activation-8.50e-02

Pos logit contributions
 Mueller+0.150
ALTH+0.149
Code+0.148
 Professional+0.144
 vodka+0.132
Neg logit contributions
 former-0.226
 disl-0.218
ifying-0.195
 tant-0.182
bel-0.179
Token.
Token ID13
Feature activation+4.24e-04

Pos logit contributions
 former+0.165
 disl+0.141
ifying+0.131
bel+0.131
 tant+0.130
Neg logit contributions
ALTH-0.037
 Mueller-0.033
 Professional-0.033
Code-0.028
 vodka-0.024
Token He
Token ID679
Feature activation-5.13e-03

Pos logit contributions
 Professional+0.000
 Mueller+0.000
ALTH+0.000
 Vend+0.000
 Ashes+0.000
Neg logit contributions
 former-0.001
ifying-0.001
bel-0.001
spe-0.001
 ago-0.001
Token said
Token ID531
Feature activation-1.24e-02

Pos logit contributions
 former+0.010
arl+0.009
ifying+0.009
 disl+0.009
 ago+0.009
Neg logit contributions
 vodka-0.002
 Professional-0.001
 Mueller-0.001
 unfolds-0.001
Code-0.001
Token,
Token ID11
Feature activation+1.58e-02

Pos logit contributions
 former+0.029
ifying+0.024
 habitats+0.023
opl+0.022
 disl+0.022
Neg logit contributions
ALTH-0.000
,-0.000
 Professional-0.000
 Mueller-0.000
Code+0.000
Token the
Token ID262
Feature activation+2.38e-02

Pos logit contributions
,+0.002
.+0.001
 because+0.000
 and-0.000
 but-0.001
Neg logit contributions
 disl-0.042
bar-0.041
 former-0.040
 seventeen-0.040
tr-0.040
Token baby
Token ID5156
Feature activation-2.45e-03

Pos logit contributions
 Professional+0.009
ALTH+0.009
Code+0.009
 Mueller+0.009
 vodka+0.008
Neg logit contributions
 former-0.040
 disl-0.039
ifying-0.035
spe-0.035
bel-0.035
Token.
Token ID13
Feature activation+1.94e-02

Pos logit contributions
 former+0.005
 disl+0.005
 tant+0.004
arl+0.004
arre+0.004
Neg logit contributions
 Mueller-0.001
 under-0.001
ALTH-0.001
 Professional-0.001
Code-0.001
Token They
Token ID1119
Feature activation+2.67e-02

Pos logit contributions
 Professional+0.006
ALTH+0.006
 Vend+0.005
 Mueller+0.005
Code+0.005
Neg logit contributions
 former-0.035
 disl-0.033
ifying-0.031
bel-0.030
 tant-0.030
Token t
Token ID256
Feature activation+4.57e-03

Pos logit contributions
 Mueller+0.008
 Professional+0.008
Code+0.008
 vodka+0.008
ALTH+0.007
Neg logit contributions
 disl-0.050
 former-0.049
igsaw-0.044
opl-0.043
bel-0.043
Tokenipt
Token ID10257
Feature activation+1.84e-01

Pos logit contributions
ALTH+0.003
 Professional+0.002
Code+0.002
 Mueller+0.002
 vodka+0.002
Neg logit contributions
 former-0.007
 disl-0.006
ifying-0.006
 tant-0.006
spe-0.006
Tokeno
Token ID78
Feature activation+2.54e-04

Pos logit contributions
Code+0.099
 Professional+0.098
 Mueller+0.090
ipt+0.089
 vodka+0.084
Neg logit contributions
 former-0.310
 disl-0.276
 tant-0.239
ifying-0.230
 ago-0.230
Tokened
Token ID276
Feature activation-5.37e-02

Pos logit contributions
 under+0.000
 Professional+0.000
ultane+0.000
Code+0.000
 vodka+0.000
Neg logit contributions
 disl-0.000
 former-0.000
ocol-0.000
igsaw-0.000
 tant-0.000
Token inside
Token ID2641
Feature activation+3.99e-02

Pos logit contributions
 disl+0.107
 former+0.104
opl+0.088
ifying+0.086
spe+0.084
Neg logit contributions
 under-0.018
 Mueller-0.013
 Professional-0.012
 vodka-0.010
Code-0.010
Token and
Token ID290
Feature activation+1.67e-02

Pos logit contributions
 Professional+0.012
ALTH+0.008
 Mueller+0.008
Code+0.007
 under+0.007
Neg logit contributions
 former-0.078
 disl-0.074
ifying-0.068
bar-0.062
 Majesty-0.062
Token put
Token ID1234
Feature activation+3.61e-02

Pos logit contributions
 closed+0.004
 hid+0.004
 looked+0.004
 opened+0.004
 under+0.003
Neg logit contributions
 former-0.037
opl-0.036
 disl-0.035
arl-0.033
igsaw-0.031
Token called
Token ID1444
Feature activation+8.97e-02

Pos logit contributions
,+0.015
.+0.012
 and+0.008
!+0.007
 came+0.003
Neg logit contributions
 former-0.145
igsaw-0.130
ifying-0.127
arl-0.125
 disl-0.120
Token T
Token ID309
Feature activation+5.87e-02

Pos logit contributions
 the+0.023
 a+0.019
 Leo+0.013
 out+0.012
 his+0.012
Neg logit contributions
 former-0.221
arre-0.197
aj-0.187
side-0.180
 tant-0.179
Tokenippy
Token ID41214
Feature activation+2.01e-01

Pos logit contributions
ippy+0.025
 Mueller+0.021
erv+0.020
ALTH+0.019
utor+0.019
Neg logit contributions
 former-0.103
 disl-0.101
bel-0.094
ifying-0.094
 tant-0.092
Token.
Token ID13
Feature activation+3.05e-02

Pos logit contributions
.+0.031
,+0.024
 Mueller+0.013
 under+0.011
 looked+0.009
Neg logit contributions
 former-0.431
 disl-0.418
arl-0.400
igsaw-0.391
 tant-0.390
Token T
Token ID309
Feature activation+3.56e-02

Pos logit contributions
 W+0.009
 Professional+0.007
 Squ+0.005
 Inside+0.005
 In+0.005
Neg logit contributions
 tant-0.061
 disl-0.059
 former-0.058
bel-0.057
spe-0.051
Tokenippy
Token ID41214
Feature activation+1.83e-01

Pos logit contributions
ippy+0.017
Code+0.015
 Professional+0.014
ALTH+0.014
 Mueller+0.013
Neg logit contributions
 former-0.062
 disl-0.057
 tant-0.057
ifying-0.056
bel-0.054
Token would
Token ID561
Feature activation+1.53e-02

Pos logit contributions
 looked+0.027
 loved+0.015
 followed+0.012
 was+0.011
 Mueller+0.011
Neg logit contributions
arl-0.421
 former-0.385
 disl-0.382
 tant-0.378
ifying-0.369
Token always
Token ID1464
Feature activation+1.58e-02

Pos logit contributions
 look+0.004
 always+0.003
 run+0.003
 make+0.003
 tell+0.002
Neg logit contributions
arl-0.035
ifying-0.035
aj-0.034
ocol-0.032
 former-0.032
Token get
Token ID651
Feature activation+3.70e-02

Pos logit contributions
 look+0.005
 ask+0.004
 tell+0.004
 listen+0.003
 looking+0.003
Neg logit contributions
ifying-0.035
side-0.035
arl-0.034
 former-0.033
igsaw-0.033
Token up
Token ID510
Feature activation+2.47e-02

Pos logit contributions
 lost+0.009
 very+0.009
 more+0.008
 in+0.006
 the+0.006
Neg logit contributions
 former-0.080
storm-0.079
opl-0.077
oc-0.077
tr-0.076
Token early
Token ID1903
Feature activation-4.22e-02

Pos logit contributions
 under+0.005
 and+0.005
 looking+0.003
 to+0.003
 in+0.003
Neg logit contributions
 disl-0.053
tr-0.052
arl-0.051
 seventeen-0.051
bar-0.050
Token
Token ID198
Feature activation+1.59e-02

Pos logit contributions
 Professional+0.006
 Mueller+0.005
ALTH+0.005
Code+0.004
 Ded+0.004
Neg logit contributions
 former-0.021
 disl-0.021
bel-0.019
 tant-0.019
spe-0.019
Token
Token ID198
Feature activation+1.59e-02

Pos logit contributions
 Professional+0.005
Code+0.004
ALTH+0.004
 Mueller+0.003
 Ded+0.003
Neg logit contributions
 former-0.031
 disl-0.030
 tant-0.026
ifying-0.026
spe-0.026
TokenM
Token ID44
Feature activation+3.99e-02

Pos logit contributions
 Professional+0.006
Code+0.005
ALTH+0.005
 Mueller+0.005
 vodka+0.004
Neg logit contributions
 former-0.029
 disl-0.029
 ago-0.025
 tant-0.025
spe-0.025
Tokenolly
Token ID5098
Feature activation+5.22e-02

Pos logit contributions
 Mueller+0.016
ALTH+0.016
 Professional+0.014
Code+0.014
 vodka+0.011
Neg logit contributions
 former-0.067
 disl-0.065
ifying-0.061
bel-0.060
spe-0.059
Token t
Token ID256
Feature activation+2.22e-03

Pos logit contributions
 looked+0.014
's+0.011
 opened+0.010
 followed+0.009
 closed+0.009
Neg logit contributions
arl-0.125
 disl-0.119
 former-0.113
opl-0.111
ifying-0.109
Tokenipt
Token ID10257
Feature activation+1.82e-01

Pos logit contributions
Code+0.001
 Professional+0.001
ALTH+0.001
 Mueller+0.001
 vodka+0.001
Neg logit contributions
 former-0.003
 disl-0.003
ifying-0.003
 tant-0.003
 ago-0.003
Tokeno
Token ID78
Feature activation-3.22e-03

Pos logit contributions
ipt+0.078
 Professional+0.068
Code+0.063
 Mueller+0.063
itates+0.059
Neg logit contributions
 former-0.336
 disl-0.300
 tant-0.269
 ago-0.253
ifying-0.246
Tokened
Token ID276
Feature activation-5.59e-02

Pos logit contributions
 disl+0.007
 former+0.007
igsaw+0.006
arl+0.006
opl+0.006
Neg logit contributions
 followed-0.001
 looked-0.001
's-0.001
 led-0.001
 crawled-0.001
Token around
Token ID1088
Feature activation-2.88e-02

Pos logit contributions
 disl+0.116
 former+0.111
opl+0.098
spe+0.095
arre+0.095
Neg logit contributions
 under-0.017
 Mueller-0.012
Code-0.009
 vodka-0.009
 Professional-0.009
Token the
Token ID262
Feature activation+2.49e-02

Pos logit contributions
 former+0.062
 disl+0.060
ifying+0.057
opl+0.056
arre+0.055
Neg logit contributions
 under-0.004
 looking-0.004
 but-0.002
,-0.002
 the-0.002
Token corner
Token ID5228
Feature activation+3.66e-02

Pos logit contributions
 Professional+0.008
Code+0.006
 Mueller+0.006
 vodka+0.006
 cave+0.006
Neg logit contributions
 disl-0.047
 former-0.043
spe-0.040
 tant-0.040
arl-0.040
Token few
Token ID1178
Feature activation+3.17e-02

Pos logit contributions
 very+0.011
 looking+0.011
 way+0.010
 box+0.009
 new+0.009
Neg logit contributions
 disl-0.066
 former-0.065
und-0.060
ifying-0.060
keeping-0.060
Token minutes
Token ID2431
Feature activation+7.23e-02

Pos logit contributions
 days+0.008
 more+0.007
 minutes+0.005
 ways+0.005
 things+0.004
Neg logit contributions
ifying-0.065
 former-0.063
 disl-0.063
arl-0.062
tr-0.058
Token,
Token ID11
Feature activation+4.42e-02

Pos logit contributions
,+0.019
 looking+0.012
 solving+0.010
 and+0.010
 blogging+0.008
Neg logit contributions
 former-0.162
 tant-0.145
 disl-0.145
ifying-0.139
arl-0.128
Token then
Token ID788
Feature activation-5.51e-02

Pos logit contributions
 but+0.014
 the+0.010
 it+0.010
 looking+0.009
 a+0.007
Neg logit contributions
 former-0.104
arl-0.101
 disl-0.099
opl-0.094
arre-0.093
Token t
Token ID256
Feature activation+9.20e-03

Pos logit contributions
 former+0.136
arl+0.129
 disl+0.126
isk+0.119
 tant+0.118
Neg logit contributions
,-0.013
 the-0.010
 it-0.009
 he-0.007
 opened-0.005
Tokenipt
Token ID10257
Feature activation+1.82e-01

Pos logit contributions
Code+0.006
 Professional+0.005
 Mueller+0.005
ALTH+0.005
 vodka+0.005
Neg logit contributions
 former-0.014
 disl-0.014
 tant-0.012
 ago-0.011
ifying-0.011
Tokeno
Token ID78
Feature activation+1.46e-03

Pos logit contributions
ipt+0.093
ok+0.064
 Professional+0.061
ippy+0.060
itates+0.059
Neg logit contributions
 former-0.353
 disl-0.333
 tant-0.277
 amaz-0.254
arre-0.253
Tokened
Token ID276
Feature activation-5.54e-02

Pos logit contributions
 under+0.000
's+0.000
d+0.000
 in+0.000
 into+0.000
Neg logit contributions
 disl-0.004
 former-0.003
 tant-0.003
igsaw-0.003
ocol-0.003
Token back
Token ID736
Feature activation+2.16e-02

Pos logit contributions
 disl+0.137
 former+0.126
arre+0.115
opl+0.111
 tant+0.107
Neg logit contributions
 under-0.014
 out-0.006
 inside-0.005
 to-0.004
 through-0.003
Token to
Token ID284
Feature activation+1.48e-02

Pos logit contributions
 under+0.002
 to+0.002
,+0.002
 inside+0.002
 in+0.001
Neg logit contributions
 disl-0.056
 tant-0.051
 former-0.050
arre-0.050
 seventeen-0.048
Token the
Token ID262
Feature activation+2.26e-02

Pos logit contributions
 the+0.003
 his+0.000
 it-0.000
 looking-0.001
 check-0.001
Neg logit contributions
arre-0.038
 tant-0.035
bel-0.034
aj-0.034
cases-0.034
Token time
Token ID640
Feature activation-1.01e-02

Pos logit contributions
 disl+0.024
 former+0.023
bel+0.022
spe+0.022
ifying+0.022
Neg logit contributions
ALTH-0.002
 vodka-0.002
 going-0.001
 Mueller-0.001
 very-0.001
Token.
Token ID13
Feature activation+1.05e-02

Pos logit contributions
 former+0.022
 disl+0.020
 tant+0.019
ifying+0.018
spe+0.018
Neg logit contributions
 under-0.001
 for-0.001
 looking-0.001
 Professional-0.000
 Mueller-0.000
Token She
Token ID1375
Feature activation+1.67e-02

Pos logit contributions
 Professional+0.007
ALTH+0.007
Code+0.007
 Mueller+0.006
 vodka+0.006
Neg logit contributions
 former-0.016
 disl-0.014
ifying-0.013
 tant-0.013
spe-0.013
Token quietly
Token ID12703
Feature activation+4.08e-03

Pos logit contributions
 looked+0.004
 opened+0.003
 closed+0.003
 vodka+0.003
 tried+0.003
Neg logit contributions
 disl-0.036
arl-0.034
 former-0.033
ifying-0.033
opl-0.033
Token t
Token ID256
Feature activation+5.17e-04

Pos logit contributions
 opened+0.001
 looked+0.001
 closed+0.001
 sat+0.001
 followed+0.001
Neg logit contributions
 former-0.009
ifying-0.008
 disl-0.008
opl-0.008
 tant-0.008
Tokenipt
Token ID10257
Feature activation+1.81e-01

Pos logit contributions
Code+0.000
 Professional+0.000
ALTH+0.000
 Mueller+0.000
 vodka+0.000
Neg logit contributions
 former-0.001
 disl-0.001
ifying-0.001
 tant-0.001
 ago-0.001
Tokeno
Token ID78
Feature activation-4.84e-03

Pos logit contributions
ipt+0.083
 Professional+0.072
Code+0.065
ippy+0.061
 Mueller+0.061
Neg logit contributions
 former-0.331
 disl-0.302
 tant-0.259
ifying-0.246
 ago-0.244
Tokened
Token ID276
Feature activation-5.67e-02

Pos logit contributions
 disl+0.011
 former+0.011
ocol+0.009
opl+0.009
arl+0.009
Neg logit contributions
's-0.001
 under-0.001
 to-0.001
 followed-0.001
 led-0.001
Token down
Token ID866
Feature activation+1.11e-02

Pos logit contributions
 disl+0.140
 former+0.126
opl+0.114
arre+0.112
spe+0.112
Neg logit contributions
 under-0.017
 to-0.005
 out-0.005
 outside-0.004
 for-0.003
Token the
Token ID262
Feature activation+2.36e-02

Pos logit contributions
 under+0.002
 and+0.001
 to+0.000
,+0.000
 the+0.000
Neg logit contributions
 disl-0.026
 former-0.024
 Majesty-0.021
ifying-0.021
arre-0.021
Token street
Token ID4675
Feature activation-1.96e-02

Pos logit contributions
 Professional+0.013
Code+0.013
ALTH+0.012
 Mueller+0.011
 vodka+0.010
Neg logit contributions
 former-0.037
 disl-0.035
ifying-0.032
 tant-0.031
bel-0.031
Token jumping
Token ID14284
Feature activation+2.81e-02

Pos logit contributions
Code+0.015
 Professional+0.015
ALTH+0.014
 vodka+0.014
 Mueller+0.014
Neg logit contributions
 former-0.041
 disl-0.038
ifying-0.036
 tant-0.035
 ago-0.033
Token over
Token ID625
Feature activation-1.74e-02

Pos logit contributions
 under+0.007
 among+0.004
 and+0.003
,+0.003
ALTH+0.002
Neg logit contributions
 former-0.062
 disl-0.061
 supervision-0.053
ifying-0.051
ocol-0.051
Token logs
Token ID17259
Feature activation+9.70e-03

Pos logit contributions
 former+0.034
 disl+0.033
ifying+0.028
arre+0.028
arl+0.028
Neg logit contributions
Code-0.005
 Professional-0.004
 Mueller-0.003
 under-0.003
ALTH-0.003
Token,
Token ID11
Feature activation+2.71e-02

Pos logit contributions
 Professional+0.001
 Mueller+0.001
 under+0.001
 blogging+0.001
ALTH+0.001
Neg logit contributions
 former-0.020
 disl-0.020
ifying-0.017
 tant-0.017
bel-0.016
Token t
Token ID256
Feature activation+8.38e-04

Pos logit contributions
 Professional+0.014
Code+0.014
ALTH+0.014
 Mueller+0.013
 vodka+0.013
Neg logit contributions
 former-0.045
 disl-0.041
ifying-0.039
 tant-0.038
spe-0.036
Tokenipt
Token ID10257
Feature activation+1.81e-01

Pos logit contributions
ALTH+0.001
 Professional+0.001
 Mueller+0.001
 vodka+0.001
Code+0.001
Neg logit contributions
 former-0.001
bel-0.001
ifying-0.001
bar-0.001
und-0.001
Tokenoe
Token ID2577
Feature activation+5.29e-03

Pos logit contributions
 Professional+0.131
Code+0.130
 Mueller+0.124
ALTH+0.123
 vodka+0.117
Neg logit contributions
 former-0.261
 disl-0.233
 tant-0.209
ifying-0.201
 ago-0.195
Tokening
Token ID278
Feature activation+1.41e-02

Pos logit contributions
 Professional+0.004
 vodka+0.004
Code+0.004
 Mueller+0.003
ultane+0.003
Neg logit contributions
 former-0.008
 disl-0.008
 tant-0.006
ifying-0.006
spe-0.006
Token around
Token ID1088
Feature activation-3.05e-02

Pos logit contributions
Code+0.005
 Professional+0.005
ALTH+0.004
 Mueller+0.004
 vodka+0.004
Neg logit contributions
 former-0.027
 disl-0.026
ifying-0.022
 tant-0.021
 ago-0.021
Token rocks
Token ID12586
Feature activation+3.38e-02

Pos logit contributions
 former+0.062
 disl+0.059
 tant+0.054
ifying+0.054
arl+0.053
Neg logit contributions
Code-0.005
 Professional-0.005
 Mueller-0.004
 under-0.004
ALTH-0.004
Token,
Token ID11
Feature activation+2.98e-02

Pos logit contributions
,+0.002
 and+0.001
 among+0.001
 under+0.001
.+0.000
Neg logit contributions
 former-0.077
 disl-0.075
 tant-0.064
ifying-0.064
 aids-0.061
Token didn
Token ID1422
Feature activation-3.61e-02

Pos logit contributions
 former+0.040
 disl+0.037
ifying+0.036
 tant+0.034
 ago+0.034
Neg logit contributions
Code-0.016
 Professional-0.016
ALTH-0.015
 Mueller-0.014
 vodka-0.014
Token't
Token ID470
Feature activation+1.59e-02

Pos logit contributions
 former+0.050
 disl+0.045
 tant+0.041
ifying+0.041
 ago+0.040
Neg logit contributions
ALTH-0.027
Code-0.027
 Professional-0.027
 Mueller-0.025
 vodka-0.024
Token understand
Token ID1833
Feature activation-5.77e-02

Pos logit contributions
 Mueller+0.007
Code+0.007
ALTH+0.006
 Professional+0.006
 Average+0.006
Neg logit contributions
 former-0.027
ifying-0.025
 disl-0.023
spe-0.023
bel-0.023
Token what
Token ID644
Feature activation-8.54e-03

Pos logit contributions
 former+0.130
ifying+0.115
att+0.112
bel+0.110
ro+0.110
Neg logit contributions
 but-0.012
 because-0.011
,-0.010
 what-0.007
 about-0.006
Token "
Token ID366
Feature activation-2.41e-02

Pos logit contributions
 former+0.018
ifying+0.017
arl+0.016
ro+0.016
 tant+0.015
Neg logit contributions
 it-0.001
 the-0.001
 helping-0.001
 new-0.001
 very-0.001
Tokenmod
Token ID4666
Feature activation+1.80e-01

Pos logit contributions
 disl+0.040
 former+0.039
 ago+0.037
 Majesty+0.035
igsaw+0.033
Neg logit contributions
expensive-0.016
sort-0.014
pay-0.014
mod-0.014
danger-0.013
Tokenest
Token ID395
Feature activation+4.76e-03

Pos logit contributions
est+0.012
ickle+0.008
 Novel+0.003
ie+0.001
Code+0.000
Neg logit contributions
 former-0.342
ifying-0.332
 disl-0.327
Aud-0.317
side-0.317
Token"
Token ID1
Feature activation-2.46e-02

Pos logit contributions
 thing+0.001
."+0.001
 things+0.001
,+0.001
.+0.001
Neg logit contributions
 former-0.010
bel-0.009
Aud-0.009
 disl-0.008
isk-0.008
Token meant
Token ID4001
Feature activation+9.69e-03

Pos logit contributions
 former+0.053
 disl+0.052
ocol+0.050
ifying+0.047
 Majesty+0.045
Neg logit contributions
expensive-0.006
sort-0.006
pay-0.005
mod-0.004
file-0.003
Token,
Token ID11
Feature activation+7.11e-03

Pos logit contributions
,+0.002
.+0.001
 it+0.001
 and+0.000
 but-0.000
Neg logit contributions
 former-0.025
 tant-0.021
isk-0.021
bel-0.021
ifying-0.020
Token but
Token ID475
Feature activation+2.84e-02

Pos logit contributions
 but+0.001
 because-0.000
 and-0.001
 so-0.001
 it-0.001
Neg logit contributions
 former-0.017
ifying-0.017
bel-0.016
 disl-0.016
igsaw-0.016
Token T
Token ID309
Feature activation+3.08e-02

Pos logit contributions
 W+0.005
 because+0.005
 Poo+0.005
 it+0.004
 the+0.004
Neg logit contributions
 disl-0.068
 former-0.062
 tant-0.060
arl-0.059
 seventeen-0.058
Token-
Token ID12
Feature activation-7.54e-03

Pos logit contributions
 Professional+0.021
ALTH+0.021
Code+0.020
 Mueller+0.020
 vodka+0.019
Neg logit contributions
 former-0.042
 disl-0.038
ro-0.035
 tant-0.035
ifying-0.035
TokenRex
Token ID47389
Feature activation+6.23e-02

Pos logit contributions
 former+0.014
arre+0.012
 disl+0.012
 amaz+0.011
 ago+0.011
Neg logit contributions
expensive-0.002
B-0.002
 Professional-0.002
Lab-0.002
rex-0.002
Token used
Token ID973
Feature activation+6.38e-02

Pos logit contributions
 followed+0.011
,+0.011
's+0.011
 looked+0.008
 listened+0.008
Neg logit contributions
 former-0.140
igsaw-0.122
 disl-0.120
ocol-0.118
arl-0.118
Token D
Token ID360
Feature activation+1.16e-02

Pos logit contributions
 his+0.006
 the+0.005
 to+0.002
 it-0.001
 all-0.004
Neg logit contributions
 former-0.153
 disl-0.144
storm-0.138
aj-0.132
arre-0.131
Tokenippy
Token ID41214
Feature activation+1.80e-01

Pos logit contributions
ALTH+0.008
Code+0.008
 Professional+0.008
 Mueller+0.008
 vodka+0.007
Neg logit contributions
 former-0.016
 disl-0.015
ifying-0.013
 tant-0.013
 ago-0.013
Token's
Token ID338
Feature activation+4.64e-02

Pos logit contributions
 to+0.040
 things+0.036
,+0.029
 ounce+0.027
 Mueller+0.024
Neg logit contributions
 former-0.377
arl-0.322
aman-0.303
ifying-0.299
 disl-0.298
Token measurements
Token ID13871
Feature activation-2.42e-02

Pos logit contributions
 help+0.012
 ounce+0.012
 Mueller+0.009
 new+0.008
ALTH+0.008
Neg logit contributions
 former-0.084
ifying-0.078
 disl-0.077
isk-0.074
arre-0.073
Token to
Token ID284
Feature activation+2.14e-02

Pos logit contributions
 former+0.062
 disl+0.058
 Majesty+0.053
 Amen+0.049
arl+0.048
Neg logit contributions
,-0.003
 and-0.003
 to-0.003
.-0.002
 for-0.000
Token cause
Token ID2728
Feature activation+1.49e-02

Pos logit contributions
 help+0.006
 the+0.006
 solve+0.005
 make+0.005
 weigh+0.003
Neg logit contributions
aj-0.049
ifying-0.044
 disl-0.042
arre-0.042
 former-0.040
Token trouble
Token ID5876
Feature activation+1.52e-02

Pos logit contributions
 more+0.002
 the+0.002
 it+0.001
 things+0.001
 trouble+0.001
Neg logit contributions
 former-0.032
ifying-0.031
tr-0.030
stack-0.030
bel-0.029
Token asked
Token ID1965
Feature activation+1.47e-02

Pos logit contributions
ifying+0.002
 former+0.002
 disl+0.002
arl+0.002
 tant+0.002
Neg logit contributions
 vodka-0.000
 Professional-0.000
 Mueller-0.000
Code-0.000
 quer-0.000
Token.
Token ID13
Feature activation+1.65e-02

Pos logit contributions
 Mueller+0.003
 Professional+0.003
 under+0.002
ALTH+0.002
Code+0.001
Neg logit contributions
 former-0.030
 disl-0.030
ifying-0.028
arre-0.026
arl-0.025
Token
Token ID198
Feature activation+2.03e-02

Pos logit contributions
 Professional+0.005
ALTH+0.003
Code+0.002
 Average+0.002
 Ashes+0.002
Neg logit contributions
 former-0.035
 disl-0.031
ifying-0.031
 tant-0.030
spe-0.029
Token
Token ID198
Feature activation+2.07e-02

Pos logit contributions
 Professional+0.009
Code+0.008
ALTH+0.007
 Ded+0.006
 Average+0.005
Neg logit contributions
 former-0.038
 disl-0.034
ifying-0.033
 ago-0.032
 tant-0.032
TokenWh
Token ID1199
Feature activation+4.86e-02

Pos logit contributions
Code+0.011
 Professional+0.010
ALTH+0.010
 Ded+0.008
 capit+0.008
Neg logit contributions
 former-0.036
 ago-0.033
 disl-0.032
 tant-0.031
ifying-0.030
Tokenippy
Token ID41214
Feature activation+1.78e-01

Pos logit contributions
Code+0.020
 Professional+0.020
ALTH+0.019
 Mueller+0.017
 vodka+0.016
Neg logit contributions
 former-0.084
 disl-0.078
ifying-0.072
 tant-0.071
 ago-0.070
Token happily
Token ID18177
Feature activation-1.86e-02

Pos logit contributions
 Mueller+0.060
 Professional+0.043
Code+0.040
 looked+0.035
 capit+0.035
Neg logit contributions
ifying-0.336
arl-0.336
 former-0.334
 disl-0.321
opl-0.305
Token nodded
Token ID14464
Feature activation+1.49e-02

Pos logit contributions
 former+0.035
 tant+0.033
 disl+0.033
ifying+0.032
 ago+0.030
Neg logit contributions
 Professional-0.005
 capit-0.005
 Mueller-0.005
 vodka-0.005
Code-0.005
Token and
Token ID290
Feature activation+1.60e-02

Pos logit contributions
 under+0.002
,+0.002
 and+0.001
.+0.001
 his+0.000
Neg logit contributions
 former-0.033
 disl-0.032
arre-0.031
ifying-0.029
pill-0.028
Token started
Token ID2067
Feature activation+2.61e-02

Pos logit contributions
 helped+0.004
 sn+0.003
 followed+0.003
 looked+0.003
 sat+0.002
Neg logit contributions
arl-0.036
ifying-0.034
 former-0.034
 ago-0.033
 tant-0.032
Token to
Token ID284
Feature activation+2.05e-02

Pos logit contributions
 looking+0.005
 Professional+0.005
 digging+0.005
 blogging+0.005
 vodka+0.004
Neg logit contributions
 former-0.053
 tant-0.048
ifying-0.048
spe-0.046
 disl-0.046
Token.
Token ID13
Feature activation+2.44e-02

Pos logit contributions
 look+0.003
 tell+0.003
 make+0.003
 ask+0.003
 help+0.002
Neg logit contributions
ifying-0.043
 tant-0.040
 former-0.039
spe-0.039
run-0.038
Token They
Token ID1119
Feature activation+4.78e-02

Pos logit contributions
+0.003
 She+0.002
 The+0.002
 They+0.002
 It+0.002
Neg logit contributions
 former-0.057
 tant-0.056
bel-0.052
ache-0.049
olving-0.048
Token write
Token ID3551
Feature activation+1.68e-02

Pos logit contributions
 looked+0.011
 put+0.008
 did+0.007
 were+0.007
 made+0.006
Neg logit contributions
 former-0.106
igsaw-0.105
arl-0.100
ocol-0.096
umm-0.095
Token "
Token ID366
Feature activation+1.05e-02

Pos logit contributions
 and+0.003
,+0.003
 for+0.002
 very+0.002
 about+0.001
Neg logit contributions
 former-0.041
 disl-0.037
bar-0.034
ocol-0.034
spir-0.034
TokenVOL
Token ID44558
Feature activation+1.78e-01

Pos logit contributions
expensive+0.006
mail+0.005
Lab+0.005
sort+0.005
gor+0.005
Neg logit contributions
 former-0.018
 disl-0.018
 Majesty-0.016
ishly-0.015
 tant-0.015
TokenCAN
Token ID44565
Feature activation+1.78e-01

Pos logit contributions
Code+0.143
ALTH+0.138
 Professional+0.137
 Mueller+0.132
 vodka+0.131
Neg logit contributions
 former-0.235
 disl-0.206
 tant-0.196
ifying-0.193
 ago-0.193
TokenO
Token ID46
Feature activation+8.25e-02

Pos logit contributions
ALTH+0.115
Code+0.103
 Professional+0.098
 Mueller+0.090
 capit+0.083
Neg logit contributions
 former-0.303
 disl-0.254
 ago-0.244
 tant-0.240
bel-0.221
Token"
Token ID1
Feature activation+1.50e-03

Pos logit contributions
ALTH+0.037
 Professional+0.036
expensive+0.034
Code+0.034
 Mueller+0.029
Neg logit contributions
 former-0.143
 disl-0.119
 tant-0.117
 ago-0.115
 amaz-0.115
Token in
Token ID287
Feature activation+2.24e-02

Pos logit contributions
 and+0.000
expensive+0.000
Private+0.000
 but+0.000
sort+0.000
Neg logit contributions
 disl-0.003
ocol-0.003
 former-0.003
 Majesty-0.003
 tant-0.003
Token big
Token ID1263
Feature activation-8.51e-03

Pos logit contributions
 the+0.005
 a+0.002
 it-0.000
 bed-0.001
 their-0.002
Neg logit contributions
 former-0.050
ossom-0.045
 disl-0.045
bel-0.044
arre-0.043
Token letters
Token ID7475
Feature activation-6.43e-03

Pos logit contributions
 former+0.018
 disl+0.015
ifying+0.014
bel+0.014
isk+0.014
Neg logit contributions
 breath-0.002
 box-0.002
,-0.002
 things-0.002
ALTH-0.002
Token the
Token ID262
Feature activation+5.80e-02

Pos logit contributions
 the+0.006
,+0.003
.+0.002
 it+0.002
 her-0.001
Neg logit contributions
 former-0.105
pp-0.103
ifying-0.100
att-0.099
arre-0.099
Token ground
Token ID2323
Feature activation+1.31e-02

Pos logit contributions
 way+0.019
 box+0.014
 grass+0.014
 tent+0.011
 new+0.011
Neg logit contributions
 disl-0.120
run-0.117
 former-0.113
arre-0.111
und-0.109
Token,
Token ID11
Feature activation+3.94e-02

Pos logit contributions
.+0.001
,+0.001
 and+0.000
!-0.001
 under-0.002
Neg logit contributions
 former-0.035
 disl-0.033
 tant-0.033
ocol-0.032
 Deal-0.031
Token so
Token ID523
Feature activation+2.34e-02

Pos logit contributions
 but+0.013
 it+0.009
 and+0.007
 the+0.006
 under+0.004
Neg logit contributions
 former-0.098
igsaw-0.096
arl-0.091
 disl-0.087
 tant-0.085
Token D
Token ID360
Feature activation+1.00e-02

Pos logit contributions
 it+0.004
 the+0.004
 he+0.003
 they+0.001
 she+0.001
Neg logit contributions
arl-0.061
arre-0.060
 former-0.058
 tant-0.058
att-0.057
Tokenippy
Token ID41214
Feature activation+1.78e-01

Pos logit contributions
 Professional+0.002
 Mueller+0.002
ALTH+0.002
Code+0.002
ippy+0.001
Neg logit contributions
 former-0.018
 disl-0.018
spe-0.017
bel-0.017
 tant-0.017
Token hopped
Token ID45230
Feature activation+1.16e-02

Pos logit contributions
 followed+0.035
 looked+0.032
 crawled+0.032
 ran+0.024
ALTH+0.023
Neg logit contributions
 former-0.377
 disl-0.350
arl-0.348
ifying-0.328
igsaw-0.327
Token over
Token ID625
Feature activation-1.27e-02

Pos logit contributions
 under+0.004
ALTH+0.004
 Professional+0.003
 Mueller+0.003
Code+0.003
Neg logit contributions
 former-0.023
 disl-0.022
spe-0.019
ifying-0.018
 tant-0.018
Token to
Token ID284
Feature activation+1.59e-02

Pos logit contributions
 former+0.032
 disl+0.031
arre+0.029
ifying+0.029
spe+0.028
Neg logit contributions
 and-0.001
 to-0.000
 the-0.000
 under-0.000
 it+0.000
Token take
Token ID1011
Feature activation+1.00e-03

Pos logit contributions
 the+0.003
 show+0.002
 it+0.002
 lay+0.002
 look+0.001
Neg logit contributions
spe-0.042
ifying-0.038
 former-0.038
bel-0.036
arre-0.036
Token a
Token ID257
Feature activation+3.39e-03

Pos logit contributions
 W+0.000
 Professional+0.000
 it+0.000
 a+0.000
 the+0.000
Neg logit contributions
spe-0.002
 former-0.002
bel-0.002
 disl-0.002
 ago-0.002
Token the
Token ID262
Feature activation+6.71e-03

Pos logit contributions
 the+0.000
 things-0.000
 it-0.000
 new-0.000
 his-0.000
Neg logit contributions
 former-0.003
 disl-0.003
 ago-0.003
arl-0.003
ifying-0.003
Token tree
Token ID5509
Feature activation-2.95e-02

Pos logit contributions
 tree+0.001
 new+0.001
 rat+0.001
 hole+0.000
 things+0.000
Neg logit contributions
 disl-0.014
ifying-0.013
bel-0.013
arl-0.013
spe-0.013
Token to
Token ID284
Feature activation-5.33e-03

Pos logit contributions
arl+0.073
 disl+0.069
 tant+0.064
ifying+0.063
ocol+0.062
Neg logit contributions
,-0.006
 and-0.005
.-0.002
 because-0.002
 very-0.001
Token its
Token ID663
Feature activation-6.38e-03

Pos logit contributions
aj+0.012
ifying+0.012
 tant+0.011
arre+0.011
pill+0.011
Neg logit contributions
 the-0.001
 help-0.001
 solve-0.001
 make-0.001
 his-0.000
Token once
Token ID1752
Feature activation-1.55e-02

Pos logit contributions
arl+0.013
 disl+0.013
arre+0.012
ifying+0.012
 tant+0.012
Neg logit contributions
 new-0.002
 way-0.002
 place-0.002
 more-0.002
 home-0.001
Token former
Token ID1966
Feature activation-3.42e-01

Pos logit contributions
 disl+0.032
arl+0.032
ifying+0.031
 tant+0.029
Aud+0.027
Neg logit contributions
 more-0.005
,-0.003
 new-0.002
 safe-0.002
.-0.002
Token beauty
Token ID8737
Feature activation-4.84e-02

Pos logit contributions
ifying+0.585
 ah+0.506
bel+0.502
 tant+0.489
arl+0.488
Neg logit contributions
 ways-0.115
 voy-0.096
 Professional-0.094
ALTH-0.090
Code-0.089
Token.
Token ID13
Feature activation-9.20e-03

Pos logit contributions
 former+0.120
 disl+0.118
 tant+0.111
arl+0.106
arre+0.104
Neg logit contributions
.-0.003
,-0.000
!+0.002
 and+0.002
 because+0.005
Token He
Token ID679
Feature activation-1.55e-02

Pos logit contributions
 tant+0.016
 former+0.016
bel+0.016
 disl+0.015
ifying+0.014
Neg logit contributions
 Professional-0.003
ALTH-0.003
 Ded-0.003
 Average-0.002
 Conflict-0.002
Token was
Token ID373
Feature activation-3.17e-02

Pos logit contributions
arl+0.039
ifying+0.034
 disl+0.034
 tant+0.033
igsaw+0.032
Neg logit contributions
 knew-0.003
 looked-0.002
 was-0.001
 believed-0.001
 used-0.001
Token happy
Token ID3772
Feature activation-1.33e-02

Pos logit contributions
 tant+0.075
arl+0.074
ro+0.072
 disl+0.071
 former+0.068
Neg logit contributions
 very-0.006
 so-0.004
 able+0.001
 the+0.001
 happy+0.002
Token to
Token ID284
Feature activation-6.89e-03

Pos logit contributions
 former+0.029
 disl+0.027
 amaz+0.026
 ah+0.025
 Majesty+0.024
Neg logit contributions
 and-0.001
,-0.001
."-0.001
.-0.001
 to-0.001
Token be
Token ID307
Feature activation-3.14e-02

Pos logit contributions
ifying+0.015
 tant+0.015
arre+0.014
aj+0.013
pill+0.013
Neg logit contributions
 solve-0.002
 help-0.001
 find-0.001
 get-0.001
 make-0.001
Token restored
Token ID15032
Feature activation+7.79e-03

Pos logit contributions
ho+0.064
ifying+0.060
 tant+0.059
ro+0.058
stack+0.056
Neg logit contributions
 used-0.008
 very-0.007
 organized-0.007
 safe-0.007
 more-0.006
Token to
Token ID284
Feature activation-5.88e-03

Pos logit contributions
."+0.001
.+0.000
 and-0.000
,-0.000
 for-0.000
Neg logit contributions
 disl-0.020
 seventeen-0.018
 former-0.018
 ah-0.018
 tant-0.017
Token its
Token ID663
Feature activation-7.05e-03

Pos logit contributions
ifying+0.013
 tant+0.013
ho+0.012
arre+0.012
arl+0.012
Neg logit contributions
 solve-0.002
 the-0.001
 find-0.001
 it-0.001
 space-0.001
Token former
Token ID1966
Feature activation-3.40e-01

Pos logit contributions
ifying+0.013
 ah+0.012
lic+0.012
paste+0.012
arl+0.012
Neg logit contributions
 way-0.004
 new-0.004
 place-0.004
 home-0.003
 best-0.003
Token glory
Token ID13476
Feature activation-5.49e-02

Pos logit contributions
ifying+0.506
bel+0.445
 disl+0.442
 ah+0.439
 tant+0.430
Neg logit contributions
ALTH-0.167
Code-0.159
 Mueller-0.156
 Professional-0.156
 soils-0.146
Token."
Token ID526
Feature activation+9.29e-03

Pos logit contributions
 tant+0.134
 disl+0.131
 former+0.126
arre+0.122
ishly+0.119
Neg logit contributions
."-0.005
.-0.001
,+0.003
 and+0.005
?"+0.008
Token
Token ID220
Feature activation-4.70e-02

Pos logit contributions
 Professional+0.006
ALTH+0.006
Code+0.006
 Mueller+0.006
 vodka+0.005
Neg logit contributions
 former-0.014
 disl-0.013
ifying-0.012
 tant-0.012
spe-0.011
Token
Token ID198
Feature activation-2.12e-03

Pos logit contributions
 former+0.091
ifying+0.084
 disl+0.082
 tant+0.081
spe+0.078
Neg logit contributions
 Professional-0.015
ALTH-0.015
Code-0.009
 Mueller-0.009
 Ded-0.008
Token
Token ID198
Feature activation-1.92e-03

Pos logit contributions
 former+0.003
 disl+0.002
 tant+0.002
ifying+0.002
 ago+0.002
Neg logit contributions
ALTH-0.002
Code-0.002
 Professional-0.002
 Mueller-0.002
 vodka-0.002
Token the
Token ID262
Feature activation+5.09e-03

Pos logit contributions
 it+0.000
 the-0.000
 things-0.000
 more-0.000
 better-0.000
Neg logit contributions
spe-0.006
bel-0.005
arl-0.005
storm-0.005
hy-0.005
Token village
Token ID7404
Feature activation-3.80e-02

Pos logit contributions
 things+0.001
 world+0.001
 hole+0.001
 way+0.001
 animals+0.001
Neg logit contributions
 disl-0.009
ifying-0.009
run-0.009
bar-0.009
bel-0.009
Token back
Token ID736
Feature activation+5.56e-03

Pos logit contributions
arl+0.081
Aud+0.075
 tant+0.074
 disl+0.073
isk+0.073
Neg logit contributions
."-0.009
.-0.009
,-0.008
 more-0.004
 and-0.004
Token to
Token ID284
Feature activation-2.54e-03

Pos logit contributions
 to+0.001
."+0.001
 and+0.001
.+0.001
 home+0.000
Neg logit contributions
 disl-0.013
 seventeen-0.013
 tant-0.012
 ah-0.012
 gloss-0.012
Token its
Token ID663
Feature activation-4.79e-03

Pos logit contributions
 tant+0.005
spe+0.005
bel+0.005
aj+0.005
ro+0.005
Neg logit contributions
 the-0.001
 it-0.000
 our-0.000
 find-0.000
."-0.000
Token former
Token ID1966
Feature activation-3.37e-01

Pos logit contributions
 disl+0.009
arl+0.008
 tant+0.008
auc+0.008
lic+0.008
Neg logit contributions
 way-0.002
 place-0.002
 new-0.002
 home-0.002
 friends-0.002
Token glory
Token ID13476
Feature activation-5.25e-02

Pos logit contributions
ifying+0.506
bel+0.463
 disl+0.449
bar+0.439
 ah+0.437
Neg logit contributions
 Professional-0.145
ALTH-0.142
Code-0.137
 soils-0.134
 Mueller-0.128
Token!"
Token ID2474
Feature activation-3.80e-03

Pos logit contributions
 tant+0.130
 disl+0.120
 former+0.113
isk+0.111
hatt+0.110
Neg logit contributions
."-0.007
.-0.003
,-0.001
!"+0.002
 and+0.003
Token
Token ID220
Feature activation-4.36e-02

Pos logit contributions
 former+0.007
 disl+0.007
ifying+0.007
 ago+0.006
 tant+0.006
Neg logit contributions
 Professional-0.001
ALTH-0.001
 Ded-0.001
 Mueller-0.001
Code-0.001
Token
Token ID198
Feature activation+5.51e-04

Pos logit contributions
 former+0.086
 tant+0.082
ifying+0.080
 disl+0.079
spe+0.073
Neg logit contributions
ALTH-0.014
 Professional-0.013
 Ded-0.006
Code-0.005
 Average-0.004
Token
Token ID198
Feature activation+4.97e-04

Pos logit contributions
 Professional+0.000
Code+0.000
ALTH+0.000
 Ded+0.000
 Mueller+0.000
Neg logit contributions
 former-0.001
ifying-0.001
 disl-0.001
 ago-0.001
 tant-0.001
Token,
Token ID11
Feature activation+1.19e-02

Pos logit contributions
.+0.000
,+0.000
 and-0.000
!-0.000
 looking-0.000
Neg logit contributions
 disl-0.003
 former-0.003
 tant-0.003
arl-0.003
Aud-0.003
Token leaving
Token ID4305
Feature activation-1.28e-02

Pos logit contributions
 and+0.002
 looking+0.002
 under+0.002
 but+0.001
 thanks+0.001
Neg logit contributions
 tant-0.024
 former-0.024
igsaw-0.023
 disl-0.023
ocol-0.022
Token it
Token ID340
Feature activation+3.06e-02

Pos logit contributions
 disl+0.029
 former+0.028
arl+0.026
aj+0.025
bel+0.025
Neg logit contributions
 the-0.001
 it+0.000
 new+0.000
 under+0.000
 more+0.001
Token to
Token ID284
Feature activation-1.38e-03

Pos logit contributions
 under+0.009
 looking+0.007
 hiding+0.006
 buried+0.006
 underground+0.006
Neg logit contributions
 tant-0.059
 former-0.059
 disl-0.058
arl-0.054
ocol-0.052
Token its
Token ID663
Feature activation-3.81e-03

Pos logit contributions
ifying+0.003
 tant+0.003
cases+0.003
spe+0.003
aj+0.003
Neg logit contributions
 the-0.000
 place-0.000
 its-0.000
 find-0.000
 safe-0.000
Token former
Token ID1966
Feature activation-3.36e-01

Pos logit contributions
arl+0.007
 disl+0.007
arre+0.007
Aud+0.007
 tant+0.007
Neg logit contributions
 cave-0.001
 way-0.001
 place-0.001
 hole-0.001
 new-0.001
Token state
Token ID1181
Feature activation-8.87e-02

Pos logit contributions
ifying+0.458
bel+0.426
 disl+0.421
 tant+0.414
 former+0.402
Neg logit contributions
Code-0.195
ALTH-0.191
 Professional-0.179
 vodka-0.176
 Mueller-0.171
Token.
Token ID13
Feature activation-1.09e-03

Pos logit contributions
 disl+0.234
 tant+0.217
 former+0.211
arre+0.203
arl+0.201
Neg logit contributions
.-0.005
!+0.009
,+0.013
 and+0.014
 among+0.025
Token
Token ID220
Feature activation-4.13e-02

Pos logit contributions
 former+0.002
 disl+0.002
 tant+0.002
bel+0.002
ifying+0.001
Neg logit contributions
 Professional-0.001
ALTH-0.001
 Mueller-0.001
Code-0.001
 Ded-0.001
Token
Token ID198
Feature activation+3.37e-03

Pos logit contributions
 former+0.085
 tant+0.081
 disl+0.075
ifying+0.074
spe+0.071
Neg logit contributions
ALTH-0.011
 Professional-0.011
 Ded-0.004
Code-0.004
 Conflict-0.003
Token
Token ID198
Feature activation+3.45e-03

Pos logit contributions
 Professional+0.001
Code+0.001
ALTH+0.001
 Ded+0.001
 Mueller+0.001
Neg logit contributions
 former-0.006
 tant-0.005
 disl-0.005
ifying-0.005
 ago-0.005
Token back
Token ID736
Feature activation+1.44e-02

Pos logit contributions
 new+0.006
 more+0.004
 things+0.004
 the+0.004
 Professional+0.004
Neg logit contributions
arl-0.045
ifying-0.044
 disl-0.044
 ago-0.043
storm-0.043
Token the
Token ID262
Feature activation+1.52e-02

Pos logit contributions
 the+0.002
 things+0.001
 new+0.001
 many+0.001
 food+0.001
Neg logit contributions
storm-0.031
arl-0.030
 disl-0.030
 seventeen-0.029
 former-0.028
Token village
Token ID7404
Feature activation-2.81e-02

Pos logit contributions
ALTH+0.010
Code+0.009
 Mueller+0.009
 Professional+0.009
 capit+0.008
Neg logit contributions
 former-0.022
ifying-0.021
 disl-0.020
 tant-0.019
bel-0.019
Token to
Token ID284
Feature activation+7.31e-03

Pos logit contributions
arl+0.061
 disl+0.053
 former+0.052
ifying+0.052
 tant+0.052
Neg logit contributions
 very-0.005
 and-0.005
,-0.005
.-0.004
 more-0.004
Token its
Token ID663
Feature activation+5.00e-03

Pos logit contributions
 the+0.001
 help+0.000
 solve+0.000
 new-0.000
 helping-0.000
Neg logit contributions
aj-0.016
 tant-0.016
ifying-0.015
ro-0.015
drops-0.015
Token former
Token ID1966
Feature activation-3.27e-01

Pos logit contributions
 new+0.002
 place+0.002
 way+0.001
 more+0.001
 cave+0.001
Neg logit contributions
 disl-0.009
 tant-0.009
arl-0.009
arre-0.009
ifying-0.009
Token state
Token ID1181
Feature activation-7.95e-02

Pos logit contributions
ifying+0.457
 disl+0.423
bel+0.421
 tant+0.408
bar+0.391
Neg logit contributions
Code-0.176
 Professional-0.175
ALTH-0.174
 Mueller-0.162
 vodka-0.160
Token.
Token ID13
Feature activation+6.62e-03

Pos logit contributions
 disl+0.202
 tant+0.183
 former+0.172
arl+0.172
arre+0.172
Neg logit contributions
.-0.007
 and+0.001
,+0.002
!+0.009
 more+0.010
Token With
Token ID2080
Feature activation+1.22e-02

Pos logit contributions
 Professional+0.003
ALTH+0.003
 Ded+0.002
 Mueller+0.002
 Pride+0.002
Neg logit contributions
 former-0.012
 tant-0.011
 disl-0.011
bel-0.010
ifying-0.010
Token the
Token ID262
Feature activation+1.72e-02

Pos logit contributions
 the+0.002
 more+0.002
 many+0.001
 a+0.001
 it+0.001
Neg logit contributions
 former-0.025
ro-0.025
 disl-0.024
arre-0.024
aj-0.024
Token help
Token ID1037
Feature activation+4.37e-02

Pos logit contributions
 Mueller+0.008
ALTH+0.008
 Professional+0.007
Code+0.007
 capit+0.006
Neg logit contributions
 former-0.028
ifying-0.026
 disl-0.025
und-0.024
spe-0.024
Token We
Token ID775
Feature activation-1.84e-02

Pos logit contributions
 former+0.015
 tant+0.014
bel+0.014
 disl+0.013
spe+0.013
Neg logit contributions
ALTH-0.002
Code-0.002
 Professional-0.002
 Vend-0.002
 Roots-0.002
Token have
Token ID423
Feature activation-2.41e-02

Pos logit contributions
 former+0.038
igsaw+0.033
 disl+0.032
side+0.032
 gru+0.032
Neg logit contributions
 are-0.002
ALTH-0.002
 look-0.001
 Professional-0.001
 vodka-0.001
Token different
Token ID1180
Feature activation+1.34e-02

Pos logit contributions
oc+0.049
 former+0.048
cut+0.048
stack+0.047
opl+0.047
Neg logit contributions
 to-0.003
 more-0.003
 many-0.002
 toys-0.001
 new-0.000
Token likes
Token ID7832
Feature activation-2.13e-02

Pos logit contributions
 things+0.003
 colors+0.003
 toys+0.002
 costumes+0.001
 ways+0.001
Neg logit contributions
 former-0.028
storm-0.027
ifying-0.027
stack-0.027
spe-0.026
Token and
Token ID290
Feature activation-9.48e-03

Pos logit contributions
 former+0.045
arre+0.043
spe+0.042
 disl+0.041
bar+0.040
Neg logit contributions
 things-0.005
 animals-0.004
 toys-0.004
 for-0.004
 new-0.004
Token disl
Token ID19621
Feature activation-2.95e-01

Pos logit contributions
aman+0.020
lor+0.019
rol+0.019
und+0.018
oc+0.018
Neg logit contributions
 look-0.003
 fun-0.003
 things-0.003
 we-0.003
 make-0.003
Tokenikes
Token ID7938
Feature activation+2.40e-02

Pos logit contributions
 former+0.361
 disl+0.322
 tant+0.288
ifying+0.280
bel+0.278
Neg logit contributions
ALTH-0.263
 Professional-0.263
Code-0.263
 Mueller-0.247
 vodka-0.240
Token.
Token ID13
Feature activation-5.29e-03

Pos logit contributions
.+0.000
."-0.001
ALTH-0.002
,-0.002
 more-0.003
Neg logit contributions
 former-0.060
 disl-0.054
bel-0.051
spe-0.050
isk-0.047
Token We
Token ID775
Feature activation-1.66e-02

Pos logit contributions
 former+0.009
 tant+0.009
bel+0.009
 disl+0.008
spe+0.008
Neg logit contributions
ALTH-0.002
Code-0.002
 Professional-0.002
 Vend-0.002
 Average-0.002
Token can
Token ID460
Feature activation-4.20e-02

Pos logit contributions
 former+0.031
 disl+0.027
igsaw+0.027
bel+0.026
side+0.026
Neg logit contributions
ALTH-0.004
Code-0.003
 Professional-0.003
 Mueller-0.003
 vodka-0.003
Token be
Token ID307
Feature activation-2.96e-02

Pos logit contributions
 former+0.069
 tant+0.066
 disl+0.066
ifying+0.065
spe+0.062
Neg logit contributions
Code-0.016
 Professional-0.016
ALTH-0.015
 Mueller-0.014
 vodka-0.013
Token is
Token ID318
Feature activation-3.66e-03

Pos logit contributions
ALTH+0.004
Code+0.004
 Professional+0.004
 vodka+0.004
 Mueller+0.004
Neg logit contributions
 former-0.011
 disl-0.010
 tant-0.010
ifying-0.009
bel-0.009
Token from
Token ID422
Feature activation-1.64e-02

Pos logit contributions
 disl+0.007
 former+0.007
 tant+0.006
spe+0.006
arre+0.006
Neg logit contributions
 vodka-0.001
ALTH-0.001
 Professional-0.001
Code-0.001
 under-0.001
Token a
Token ID257
Feature activation-2.67e-02

Pos logit contributions
arre+0.035
 disl+0.035
spe+0.035
age+0.033
anne+0.032
Neg logit contributions
 the-0.003
 a-0.002
 my-0.001
 looking-0.000
 gold-0.000
Token long
Token ID890
Feature activation-5.67e-02

Pos logit contributions
 disl+0.049
 former+0.048
arre+0.045
und+0.045
isk+0.044
Neg logit contributions
 place-0.007
 coin-0.007
 new-0.007
 good-0.007
 treasure-0.007
Token time
Token ID640
Feature activation-3.66e-02

Pos logit contributions
 disl+0.121
 former+0.119
bel+0.109
ifying+0.104
arre+0.099
Neg logit contributions
 way-0.012
 spot-0.011
,-0.010
 box-0.010
 place-0.010
Token ago
Token ID2084
Feature activation-2.89e-01

Pos logit contributions
 former+0.069
 tant+0.067
 disl+0.064
isk+0.061
spe+0.059
Neg logit contributions
 looking-0.018
 for-0.017
.-0.016
 under-0.015
 and-0.014
Token,
Token ID11
Feature activation+4.33e-04

Pos logit contributions
 former+0.707
 disl+0.624
 tant+0.613
ifying+0.595
arre+0.554
Neg logit contributions
,-0.010
.-0.009
 and+0.008
."+0.014
ALTH+0.014
Token when
Token ID618
Feature activation-4.09e-02

Pos logit contributions
 but+0.000
 under+0.000
 and+0.000
 it+0.000
 because+0.000
Neg logit contributions
isk-0.001
spe-0.001
igsaw-0.001
handed-0.001
 former-0.001
Token a
Token ID257
Feature activation-2.57e-02

Pos logit contributions
 tant+0.103
 former+0.093
 disl+0.092
arre+0.092
isk+0.091
Neg logit contributions
 it-0.008
 we-0.003
 things-0.002
 the-0.001
 you+0.001
Token king
Token ID5822
Feature activation-4.90e-03

Pos logit contributions
 disl+0.045
 former+0.045
und+0.039
bel+0.039
isk+0.038
Neg logit contributions
ALTH-0.008
 vodka-0.007
Code-0.007
 Veter-0.007
 tatt-0.007
Token lived
Token ID5615
Feature activation-2.87e-02

Pos logit contributions
 former+0.012
 tant+0.010
 disl+0.010
igsaw+0.009
spe+0.009
Neg logit contributions
 comes-0.001
,-0.001
 came-0.001
 and-0.001
?"-0.001
Token They
Token ID1119
Feature activation-6.14e-03

Pos logit contributions
 former+0.023
 tant+0.021
 disl+0.020
bel+0.019
spe+0.019
Neg logit contributions
Code-0.007
ALTH-0.007
 Professional-0.007
 Average-0.006
 Mueller-0.006
Token lived
Token ID5615
Feature activation-3.16e-02

Pos logit contributions
 former+0.010
 tant+0.009
 disl+0.008
ifying+0.008
spe+0.008
Neg logit contributions
Code-0.004
ALTH-0.003
 Professional-0.003
 vodka-0.003
 Mueller-0.003
Token a
Token ID257
Feature activation-2.80e-02

Pos logit contributions
 disl+0.063
 former+0.061
arre+0.056
 tant+0.055
ifying+0.053
Neg logit contributions
 under-0.010
 Mueller-0.006
ALTH-0.005
 among-0.005
 underground-0.004
Token long
Token ID890
Feature activation-5.39e-02

Pos logit contributions
 former+0.044
 disl+0.041
bel+0.038
ifying+0.038
 tant+0.037
Neg logit contributions
 Mueller-0.014
ALTH-0.014
 Professional-0.014
Code-0.014
 Veter-0.013
Token time
Token ID640
Feature activation-3.57e-02

Pos logit contributions
 disl+0.104
 former+0.100
bel+0.095
ifying+0.093
arre+0.086
Neg logit contributions
 dinosaur-0.013
,-0.011
 hiding-0.010
 lizard-0.010
 Mueller-0.010
Token ago
Token ID2084
Feature activation-2.88e-01

Pos logit contributions
 disl+0.053
 tant+0.052
 former+0.052
isk+0.049
bel+0.046
Neg logit contributions
 looking-0.024
 under-0.022
 creators-0.021
 digging-0.020
 searching-0.020
Token.
Token ID13
Feature activation-1.25e-02

Pos logit contributions
 former+0.700
ifying+0.637
 disl+0.620
 tant+0.616
arre+0.569
Neg logit contributions
,-0.025
.-0.010
 and+0.008
 looking+0.016
."+0.023
Token They
Token ID1119
Feature activation-5.10e-03

Pos logit contributions
 tant+0.021
 former+0.021
bel+0.019
 disl+0.019
spe+0.019
Neg logit contributions
Code-0.006
 Professional-0.006
ALTH-0.006
 Average-0.005
 Vend-0.005
Token had
Token ID550
Feature activation-3.27e-02

Pos logit contributions
 former+0.009
 tant+0.009
 disl+0.008
igsaw+0.008
arl+0.008
Neg logit contributions
Code-0.002
 Professional-0.002
 vodka-0.002
ALTH-0.001
 Mueller-0.001
Token big
Token ID1263
Feature activation-3.89e-02

Pos logit contributions
 former+0.062
 tant+0.061
arl+0.060
igsaw+0.059
arre+0.057
Neg logit contributions
 vodka-0.007
 more-0.005
 treasure-0.005
Code-0.005
 fun-0.005
Token bones
Token ID11945
Feature activation+1.65e-02

Pos logit contributions
 former+0.059
 disl+0.054
ifying+0.051
bel+0.049
spe+0.049
Neg logit contributions
Code-0.023
ALTH-0.023
 Professional-0.023
 Mueller-0.022
 vodka-0.020
Token me
Token ID502
Feature activation-7.94e-02

Pos logit contributions
 it+0.001
 the+0.001
 things+0.000
 on+0.000
 under+0.000
Neg logit contributions
 former-0.012
spe-0.011
 disl-0.010
arre-0.010
bel-0.010
Token here
Token ID994
Feature activation-7.31e-02

Pos logit contributions
 former+0.200
 disl+0.178
 amaz+0.168
spe+0.163
 tant+0.162
Neg logit contributions
 under-0.013
 in-0.011
 on-0.009
 near-0.006
 by+0.001
Token a
Token ID257
Feature activation-2.63e-02

Pos logit contributions
 disl+0.178
 former+0.166
arre+0.151
isk+0.149
pill+0.146
Neg logit contributions
 under-0.012
 and-0.007
,-0.007
.-0.006
 in-0.004
Token long
Token ID890
Feature activation-5.24e-02

Pos logit contributions
 disl+0.052
arre+0.050
 former+0.050
und+0.047
alled+0.047
Neg logit contributions
 new-0.007
 place-0.007
 box-0.007
 gift-0.006
 coin-0.006
Token time
Token ID640
Feature activation-3.43e-02

Pos logit contributions
 former+0.117
 disl+0.106
arre+0.098
spe+0.093
ifying+0.093
Neg logit contributions
,-0.010
 stick-0.010
 box-0.008
 spot-0.008
 piece-0.008
Token ago
Token ID2084
Feature activation-2.87e-01

Pos logit contributions
 former+0.073
 disl+0.070
isk+0.068
 tant+0.067
spe+0.063
Neg logit contributions
,-0.013
.-0.012
 and-0.010
."-0.010
,"-0.010
Token.
Token ID13
Feature activation-1.79e-02

Pos logit contributions
 former+0.749
ifying+0.667
 disl+0.639
 tant+0.620
arre+0.574
Neg logit contributions
,-0.032
.+0.003
 and+0.016
!"+0.020
."+0.023
Token They
Token ID1119
Feature activation-7.12e-03

Pos logit contributions
 former+0.034
 tant+0.033
spe+0.031
arre+0.031
bel+0.030
Neg logit contributions
 Professional-0.005
Code-0.005
ALTH-0.005
 Mane-0.004
 Average-0.004
Token forgot
Token ID16453
Feature activation+1.39e-02

Pos logit contributions
 former+0.013
 disl+0.012
 tant+0.012
spe+0.011
igsaw+0.011
Neg logit contributions
 Professional-0.002
ALTH-0.002
 vodka-0.002
Code-0.002
 Talks-0.002
Token about
Token ID546
Feature activation+1.57e-02

Pos logit contributions
 to+0.001
 about+0.000
 the-0.000
 it-0.001
.-0.001
Neg logit contributions
 former-0.034
ro-0.030
bel-0.030
 tant-0.029
igsaw-0.029
Token me
Token ID502
Feature activation-8.02e-02

Pos logit contributions
 the+0.003
 it+0.000
 looking+0.000
 being-0.000
 animals-0.001
Neg logit contributions
bel-0.037
ifying-0.036
brush-0.036
spe-0.036
bar-0.035
Token it
Token ID340
Feature activation+1.51e-02

Pos logit contributions
 former+0.014
spe+0.013
bel+0.012
 disl+0.012
storm+0.012
Neg logit contributions
 it-0.001
 the+0.000
 things+0.000
 more+0.001
 everything+0.001
Token away
Token ID1497
Feature activation-6.07e-02

Pos logit contributions
 under+0.005
.+0.003
 in+0.002
 for+0.002
 because+0.002
Neg logit contributions
 disl-0.034
 former-0.033
 amaz-0.030
spe-0.029
ocol-0.029
Token a
Token ID257
Feature activation-2.56e-02

Pos logit contributions
 disl+0.146
 former+0.143
Aud+0.137
hatt+0.133
 seventeen+0.131
Neg logit contributions
.-0.005
."-0.005
 under-0.000
 and+0.000
,+0.000
Token long
Token ID890
Feature activation-5.51e-02

Pos logit contributions
 disl+0.046
bel+0.044
 former+0.044
alled+0.042
und+0.042
Neg logit contributions
 new-0.006
 place-0.006
 lot-0.005
 way-0.005
 good-0.005
Token time
Token ID640
Feature activation-3.58e-02

Pos logit contributions
 former+0.124
ifying+0.114
bel+0.114
 disl+0.112
ibr+0.105
Neg logit contributions
 way-0.004
 time-0.003
,-0.001
 place-0.000
 thing+0.001
Token ago
Token ID2084
Feature activation-2.87e-01

Pos logit contributions
 former+0.079
bel+0.076
 tant+0.076
 disl+0.072
spe+0.071
Neg logit contributions
.-0.010
."-0.010
,-0.007
 and-0.004
 but-0.003
Token.
Token ID13
Feature activation-1.46e-02

Pos logit contributions
 former+0.774
ifying+0.689
 disl+0.668
 tant+0.642
Aud+0.618
Neg logit contributions
.-0.010
,+0.002
."+0.016
 and+0.028
!+0.062
Token It
Token ID632
Feature activation+8.13e-03

Pos logit contributions
 former+0.023
bel+0.022
 tant+0.022
spe+0.021
 disl+0.020
Neg logit contributions
 Professional-0.008
ALTH-0.007
Code-0.007
 Mueller-0.007
 Average-0.007
Token is
Token ID318
Feature activation-3.52e-03

Pos logit contributions
ALTH+0.001
 Professional+0.001
 vodka+0.001
Code+0.001
 conject+0.001
Neg logit contributions
 former-0.017
 ago-0.015
ifying-0.015
side-0.015
arl-0.015
Token not
Token ID407
Feature activation-1.25e-02

Pos logit contributions
bel+0.007
spe+0.007
 tant+0.007
 disl+0.007
 former+0.007
Neg logit contributions
 very-0.001
 for-0.000
 good-0.000
 under-0.000
 hard-0.000
Token good
Token ID922
Feature activation+2.07e-02

Pos logit contributions
 former+0.024
spe+0.023
bel+0.023
 disl+0.022
side+0.022
Neg logit contributions
 good-0.002
 worth-0.002
 very-0.002
 vodka-0.002
 qualified-0.001
TokenThey
Token ID2990
Feature activation-2.15e-02

Pos logit contributions
 former+0.059
 disl+0.049
 Majesty+0.046
 ago+0.045
 tant+0.044
Neg logit contributions
Code-0.014
ALTH-0.007
Private-0.007
History-0.007
Lab-0.007
Token were
Token ID547
Feature activation-1.82e-02

Pos logit contributions
 former+0.035
 disl+0.031
 tant+0.030
ifying+0.029
bel+0.028
Neg logit contributions
ALTH-0.012
Code-0.012
 Professional-0.011
 vodka-0.011
 Mueller-0.011
Token here
Token ID994
Feature activation-7.26e-02

Pos logit contributions
 former+0.034
 disl+0.031
spe+0.030
bel+0.029
 tant+0.029
Neg logit contributions
 vodka-0.007
ALTH-0.007
 Mueller-0.006
 Professional-0.006
Code-0.006
Token a
Token ID257
Feature activation-2.58e-02

Pos logit contributions
 disl+0.153
 former+0.152
 tant+0.152
bar+0.132
spe+0.129
Neg logit contributions
 under-0.010
,-0.006
 for-0.005
ALTH-0.004
 Mueller-0.004
Token minute
Token ID5664
Feature activation-1.60e-02

Pos logit contributions
 former+0.048
 disl+0.047
ifying+0.046
bel+0.045
spe+0.043
Neg logit contributions
 place-0.007
 new-0.007
 way-0.007
 Veter-0.006
 tatt-0.005
Token ago
Token ID2084
Feature activation-2.86e-01

Pos logit contributions
 former+0.040
 tant+0.038
 disl+0.037
ifying+0.032
bel+0.032
Neg logit contributions
.-0.001
,-0.001
."-0.001
 and+0.000
!"+0.001
Token."
Token ID526
Feature activation+1.75e-03

Pos logit contributions
 former+0.707
 tant+0.609
 disl+0.594
ifying+0.579
arre+0.536
Neg logit contributions
,-0.025
ALTH-0.004
.-0.000
."+0.010
 looking+0.019
Token
Token ID198
Feature activation-9.24e-03

Pos logit contributions
Code+0.001
ALTH+0.001
 Professional+0.001
 Mueller+0.001
 vodka+0.001
Neg logit contributions
 former-0.002
 disl-0.002
 tant-0.002
ifying-0.002
spe-0.002
Token
Token ID198
Feature activation-8.95e-03

Pos logit contributions
 former+0.012
 disl+0.011
ifying+0.010
 tant+0.010
 ago+0.010
Neg logit contributions
Code-0.007
ALTH-0.007
 Professional-0.007
 Mueller-0.007
 vodka-0.007
Token"
Token ID1
Feature activation-2.66e-02

Pos logit contributions
 former+0.011
 disl+0.009
ifying+0.008
 tant+0.008
 ago+0.008
Neg logit contributions
ALTH-0.008
 Professional-0.008
Code-0.008
 Mueller-0.008
 vodka-0.008
TokenMaybe
Token ID13300
Feature activation+3.16e-02

Pos logit contributions
 former+0.054
 disl+0.044
 ago+0.043
ocol+0.043
 tant+0.043
Neg logit contributions
Code-0.013
Nice-0.007
Over-0.007
Left-0.006
ALTH-0.006
Token it
Token ID340
Feature activation+1.76e-02

Pos logit contributions
 it+0.003
 the-0.001
 under-0.001
."-0.003
.-0.003
Neg logit contributions
 former-0.047
arre-0.044
 disl-0.042
spe-0.041
bel-0.041
Token here
Token ID994
Feature activation-7.12e-02

Pos logit contributions
 under+0.006
."+0.002
 in+0.002
 underground+0.002
 near+0.002
Neg logit contributions
 disl-0.040
 former-0.038
 tant-0.036
spe-0.036
igsaw-0.035
Token a
Token ID257
Feature activation-2.49e-02

Pos logit contributions
 disl+0.182
 tant+0.168
 former+0.167
igsaw+0.163
bel+0.158
Neg logit contributions
."-0.006
.-0.002
 and+0.004
 under+0.005
,+0.008
Token long
Token ID890
Feature activation-5.31e-02

Pos logit contributions
 disl+0.048
 former+0.044
bel+0.044
arre+0.043
und+0.041
Neg logit contributions
 treasure-0.007
 place-0.007
 coin-0.007
 cave-0.006
 box-0.006
Token time
Token ID640
Feature activation-3.38e-02

Pos logit contributions
 former+0.117
 disl+0.112
bel+0.109
ifying+0.099
iqu+0.098
Neg logit contributions
 way-0.010
 place-0.009
 rock-0.007
 spot-0.007
,-0.007
Token ago
Token ID2084
Feature activation-2.85e-01

Pos logit contributions
 tant+0.061
 former+0.058
bel+0.055
 disl+0.054
isk+0.054
Neg logit contributions
."-0.023
.-0.021
 looking-0.019
,-0.017
 and-0.016
Token."
Token ID526
Feature activation+2.06e-03

Pos logit contributions
 former+0.729
 disl+0.628
ifying+0.625
 tant+0.614
arre+0.567
Neg logit contributions
.-0.008
."-0.007
,-0.006
 and+0.029
!"+0.030
Token
Token ID198
Feature activation-8.27e-03

Pos logit contributions
 Professional+0.001
Code+0.001
ALTH+0.001
 Mueller+0.001
 vodka+0.001
Neg logit contributions
 former-0.003
 disl-0.003
ifying-0.003
 tant-0.003
spe-0.003
Token
Token ID198
Feature activation-8.20e-03

Pos logit contributions
 former+0.012
 disl+0.010
ifying+0.010
 tant+0.010
spe+0.009
Neg logit contributions
 Professional-0.006
Code-0.006
ALTH-0.006
 Mueller-0.006
 vodka-0.005
TokenThey
Token ID2990
Feature activation-1.88e-02

Pos logit contributions
 former+0.015
 disl+0.012
 tant+0.012
ifying+0.012
 ago+0.012
Neg logit contributions
Code-0.004
ALTH-0.004
 Professional-0.003
 Mueller-0.003
 vodka-0.003
Token decided
Token ID3066
Feature activation-3.38e-02

Pos logit contributions
 former+0.030
 disl+0.029
 tant+0.025
ifying+0.025
 ago+0.024
Neg logit contributions
Code-0.011
 Professional-0.010
ALTH-0.010
 Mueller-0.010
 vodka-0.009
Token animals
Token ID4695
Feature activation+3.66e-02

Pos logit contributions
arre+0.075
bel+0.073
arl+0.073
 tant+0.073
ifying+0.071
Neg logit contributions
 animals-0.009
 things-0.005
 new-0.003
 toys-0.002
 stars-0.002
Token and
Token ID290
Feature activation-1.54e-02

Pos logit contributions
.+0.002
,-0.000
 and-0.001
 because-0.004
 under-0.005
Neg logit contributions
 former-0.097
 tant-0.088
 disl-0.085
bel-0.082
igsaw-0.081
Token people
Token ID661
Feature activation-1.30e-02

Pos logit contributions
arl+0.038
arre+0.035
igsaw+0.035
ifying+0.034
bar+0.034
Neg logit contributions
 animals-0.005
 things-0.003
 toys-0.003
 people-0.002
 flowers-0.001
Token from
Token ID422
Feature activation-1.34e-02

Pos logit contributions
 former+0.034
igsaw+0.028
 tant+0.028
 Deal+0.027
 seventeen+0.027
Neg logit contributions
.-0.001
 wearing+0.001
 who+0.001
,+0.001
 looking+0.002
Token long
Token ID890
Feature activation-5.22e-02

Pos logit contributions
arre+0.030
bar+0.029
age+0.028
 former+0.027
arl+0.027
Neg logit contributions
 the-0.003
 it-0.002
 under-0.002
 animals-0.001
 looking-0.001
Token ago
Token ID2084
Feature activation-2.85e-01

Pos logit contributions
 former+0.097
 disl+0.089
bar+0.084
ifying+0.083
bly+0.081
Neg logit contributions
.-0.022
,-0.019
 things-0.017
 ways-0.016
 and-0.014
Token.
Token ID13
Feature activation-1.16e-02

Pos logit contributions
 former+0.764
ifying+0.693
 tant+0.671
 disl+0.656
arl+0.637
Neg logit contributions
.-0.002
,+0.046
 and+0.077
 looking+0.080
!+0.097
Token Sam
Token ID3409
Feature activation-4.24e-02

Pos logit contributions
 former+0.019
bel+0.017
spe+0.017
ifying+0.017
 tant+0.016
Neg logit contributions
 Professional-0.007
ALTH-0.006
Code-0.006
 Mueller-0.006
 Vend-0.005
Token's
Token ID338
Feature activation+1.26e-02

Pos logit contributions
 former+0.086
arl+0.085
ifying+0.083
 tant+0.078
ocol+0.078
Neg logit contributions
Code-0.008
ALTH-0.007
 Professional-0.006
 Mueller-0.006
 looked-0.005
Token mom
Token ID1995
Feature activation+4.76e-03

Pos logit contributions
 Mueller+0.004
 Professional+0.003
 Vend+0.003
 characteristic+0.002
ALTH+0.002
Neg logit contributions
 disl-0.022
 former-0.022
bar-0.021
und-0.021
ifying-0.020
Token said
Token ID531
Feature activation-2.40e-02

Pos logit contributions
 helped+0.001
 looked+0.001
 showed+0.001
 vodka+0.001
 Professional+0.001
Neg logit contributions
 former-0.011
ifying-0.010
 disl-0.010
 tant-0.010
igsaw-0.009
Token started
Token ID2067
Feature activation-9.05e-03

Pos logit contributions
ifying+0.055
 former+0.053
bel+0.053
side+0.052
spe+0.052
Neg logit contributions
 know-0.003
 Mueller-0.002
 Professional-0.002
 overs-0.002
 helped-0.002
Token smoking
Token ID9216
Feature activation-2.13e-02

Pos logit contributions
 former+0.020
isk+0.020
arl+0.019
belt+0.019
 tant+0.019
Neg logit contributions
 to-0.001
 making-0.001
 looking+0.000
 helping+0.000
 the+0.000
Token a
Token ID257
Feature activation-2.51e-02

Pos logit contributions
 former+0.051
spe+0.049
 disl+0.045
bel+0.045
ifying+0.045
Neg logit contributions
 and-0.003
.-0.002
 more-0.001
 it-0.001
 the-0.000
Token long
Token ID890
Feature activation-5.26e-02

Pos logit contributions
 disl+0.046
 former+0.044
alled+0.043
isk+0.042
 ago+0.041
Neg logit contributions
 new-0.006
 cigarette-0.005
 hole-0.005
 piece-0.004
 place-0.004
Token time
Token ID640
Feature activation-3.49e-02

Pos logit contributions
 former+0.109
 disl+0.098
ibr+0.097
ifying+0.096
bel+0.094
Neg logit contributions
 piece-0.015
,-0.013
 way-0.011
 hole-0.009
 thing-0.009
Token ago
Token ID2084
Feature activation-2.84e-01

Pos logit contributions
 former+0.076
 tant+0.073
isk+0.072
Aud+0.071
 disl+0.070
Neg logit contributions
.-0.009
 and-0.005
 making-0.005
,-0.004
 for-0.003
Token and
Token ID290
Feature activation-1.79e-02

Pos logit contributions
 former+0.756
ifying+0.675
 disl+0.665
 tant+0.655
Aud+0.627
Neg logit contributions
.-0.011
,-0.004
 and+0.019
."+0.031
!+0.051
Token now
Token ID783
Feature activation-6.32e-02

Pos logit contributions
lor+0.037
igsaw+0.036
opl+0.035
 former+0.035
side+0.034
Neg logit contributions
 it-0.007
 making-0.007
 the-0.007
 we-0.004
 made-0.004
Token it
Token ID340
Feature activation+1.66e-02

Pos logit contributions
 tant+0.151
 former+0.144
bel+0.142
spe+0.140
ifying+0.131
Neg logit contributions
 it-0.013
.-0.009
,-0.009
."-0.007
 we-0.005
Token's
Token ID338
Feature activation+1.19e-02

Pos logit contributions
 vodka+0.003
ALTH+0.003
's+0.003
 spans+0.002
 fills+0.002
Neg logit contributions
 former-0.034
 ago-0.031
 disl-0.030
igsaw-0.030
 tant-0.029
Token hard
Token ID1327
Feature activation+1.95e-02

Pos logit contributions
 very+0.003
 making+0.003
 hard+0.002
 going+0.002
 allergic+0.002
Neg logit contributions
 disl-0.024
spe-0.024
bel-0.023
 tant-0.023
igsaw-0.023
Token coins
Token ID10796
Feature activation-1.60e-02

Pos logit contributions
arl+0.026
 former+0.025
aman+0.024
arre+0.024
opl+0.023
Neg logit contributions
 looked-0.003
 colors-0.003
 animals-0.003
 more-0.002
 rocks-0.002
Token that
Token ID326
Feature activation-3.15e-02

Pos logit contributions
 former+0.041
 disl+0.037
 tant+0.034
arre+0.032
igsaw+0.031
Neg logit contributions
 under-0.001
.-0.001
,-0.001
 and+0.001
 looking+0.001
Token people
Token ID661
Feature activation-8.65e-03

Pos logit contributions
 former+0.073
arl+0.069
arre+0.064
 disl+0.063
ifying+0.062
Neg logit contributions
 looked-0.008
 they-0.005
 it-0.005
 made-0.004
 covered-0.004
Token used
Token ID973
Feature activation+3.39e-02

Pos logit contributions
 former+0.021
 tant+0.016
igsaw+0.015
 disl+0.015
arl+0.015
Neg logit contributions
 looked-0.002
 looking-0.001
 under-0.001
 used-0.001
 helped-0.001
Token long
Token ID890
Feature activation-5.01e-02

Pos logit contributions
 it+0.006
 to+0.005
 everywhere+0.004
 things+0.004
.+0.004
Neg logit contributions
 former-0.073
arre-0.070
 disl-0.069
storm-0.065
opl-0.061
Token ago
Token ID2084
Feature activation-2.84e-01

Pos logit contributions
 former+0.102
 disl+0.090
arre+0.086
spe+0.085
bel+0.084
Neg logit contributions
,-0.012
 things-0.009
 colors-0.008
.-0.008
Code-0.007
Token.
Token ID13
Feature activation-4.76e-03

Pos logit contributions
 former+0.758
ifying+0.663
 tant+0.654
arre+0.652
arl+0.648
Neg logit contributions
.-0.009
,+0.034
 and+0.044
 toys+0.068
 looking+0.073
Token They
Token ID1119
Feature activation+1.53e-03

Pos logit contributions
 former+0.007
 disl+0.006
 tant+0.005
ifying+0.005
bel+0.005
Neg logit contributions
ALTH-0.004
 Professional-0.004
Code-0.004
 Mueller-0.003
 vodka-0.003
Token saw
Token ID2497
Feature activation-4.34e-03

Pos logit contributions
Code+0.000
 Mueller+0.000
 vodka+0.000
 looked+0.000
 Professional+0.000
Neg logit contributions
 former-0.003
 disl-0.003
arl-0.003
ifying-0.003
 tant-0.003
Token paintings
Token ID21641
Feature activation-3.03e-02

Pos logit contributions
 former+0.010
arl+0.009
 disl+0.009
arre+0.009
ifying+0.008
Neg logit contributions
 many-0.001
 animals-0.001
 new-0.001
 more-0.001
 dinosaurs-0.001
Token and
Token ID290
Feature activation-9.77e-03

Pos logit contributions
 former+0.075
 disl+0.071
 tant+0.064
arre+0.058
ifying+0.057
Neg logit contributions
,-0.003
 and-0.000
 under+0.000
Code+0.002
 blogging+0.003
Token a
Token ID257
Feature activation-2.46e-02

Pos logit contributions
spe+0.024
 disl+0.024
ifying+0.024
bar+0.024
arre+0.024
Neg logit contributions
 under-0.004
 looking-0.002
 the-0.002
 behind-0.002
 near-0.001
Token ship
Token ID4074
Feature activation+2.91e-02

Pos logit contributions
 former+0.043
 disl+0.041
ifying+0.037
und+0.036
bel+0.036
Neg logit contributions
 place-0.009
 new-0.007
 Professional-0.007
 forwarding-0.007
 looking-0.007
Token that
Token ID326
Feature activation-3.27e-02

Pos logit contributions
.+0.002
 and-0.002
 under-0.002
,-0.003
 near-0.004
Neg logit contributions
 disl-0.075
 former-0.069
ocol-0.063
ifying-0.063
igsaw-0.060
Token sank
Token ID30895
Feature activation-2.24e-02

Pos logit contributions
 former+0.073
arl+0.068
ifying+0.066
 tant+0.063
igsaw+0.060
Neg logit contributions
 looked-0.007
 helped-0.007
 made-0.005
 led-0.004
 under-0.004
Token long
Token ID890
Feature activation-5.10e-02

Pos logit contributions
 disl+0.055
 former+0.052
igsaw+0.051
opl+0.050
arre+0.049
Neg logit contributions
 under-0.003
.-0.002
 to-0.001
 in-0.001
 for-0.000
Token ago
Token ID2084
Feature activation-2.84e-01

Pos logit contributions
 former+0.117
 disl+0.114
bel+0.103
ifying+0.099
Aud+0.099
Neg logit contributions
.-0.008
 and-0.005
,-0.005
 under-0.003
 days+0.003
Token.
Token ID13
Feature activation-9.99e-03

Pos logit contributions
 former+0.713
ifying+0.648
 disl+0.633
 tant+0.594
Aud+0.577
Neg logit contributions
.-0.004
,+0.009
 and+0.026
 under+0.050
 looking+0.051
Token She
Token ID1375
Feature activation-5.20e-03

Pos logit contributions
 former+0.016
spe+0.015
 disl+0.015
bel+0.014
 tant+0.014
Neg logit contributions
 Professional-0.006
ALTH-0.005
Code-0.005
 Mueller-0.004
 Vend-0.004
Token said
Token ID531
Feature activation-2.32e-02

Pos logit contributions
 former+0.010
ifying+0.009
 disl+0.009
 tant+0.008
igsaw+0.008
Neg logit contributions
Code-0.002
 Professional-0.002
ALTH-0.002
 vodka-0.002
 Mueller-0.001
Token they
Token ID484
Feature activation-4.64e-03

Pos logit contributions
 former+0.053
ifying+0.046
isk+0.043
 tant+0.042
bel+0.042
Neg logit contributions
,-0.003
 Professional-0.003
ALTH-0.002
 it-0.002
Code-0.002
Token could
Token ID714
Feature activation-4.16e-04

Pos logit contributions
 former+0.007
 disl+0.006
 tant+0.006
ifying+0.006
ro+0.005
Neg logit contributions
Code-0.003
 Professional-0.003
ALTH-0.003
 Mueller-0.003
 vodka-0.003
Token lost
Token ID2626
Feature activation+5.41e-02

Pos logit contributions
 Professional+0.005
 vodka+0.005
Code+0.004
 used+0.004
 under+0.004
Neg logit contributions
 former-0.048
 tant-0.040
 disl-0.038
side-0.038
spe-0.037
Token it
Token ID340
Feature activation+1.82e-02

Pos logit contributions
 it+0.006
 things-0.002
 the-0.006
 under-0.007
 treasure-0.011
Neg logit contributions
 former-0.140
arre-0.129
bel-0.123
 disl-0.122
spe-0.119
Token a
Token ID257
Feature activation-2.19e-02

Pos logit contributions
 under+0.003
,+0.002
 and+0.001
."+0.001
 because+0.001
Neg logit contributions
 former-0.043
 disl-0.042
spe-0.038
 tant-0.038
 amaz-0.037
Token long
Token ID890
Feature activation-5.17e-02

Pos logit contributions
 disl+0.042
arre+0.042
isk+0.041
bel+0.041
und+0.040
Neg logit contributions
 treasure-0.007
 coin-0.006
 new-0.005
 way-0.005
 gift-0.005
Token time
Token ID640
Feature activation-3.24e-02

Pos logit contributions
 former+0.108
 disl+0.103
bel+0.102
arre+0.100
ifying+0.098
Neg logit contributions
 thing-0.014
 way-0.014
 coin-0.011
 place-0.011
,-0.010
Token ago
Token ID2084
Feature activation-2.84e-01

Pos logit contributions
 former+0.052
 tant+0.051
isk+0.049
bel+0.046
 disl+0.045
Neg logit contributions
 looking-0.027
.-0.025
,-0.024
 and-0.023
 hiding-0.022
Token,
Token ID11
Feature activation+2.42e-03

Pos logit contributions
 former+0.748
ifying+0.665
 disl+0.650
 tant+0.634
arre+0.604
Neg logit contributions
,-0.027
.-0.009
 and-0.001
."+0.029
 looking+0.039
Token and
Token ID290
Feature activation-1.47e-02

Pos logit contributions
 but+0.001
 and+0.001
 it+0.000
 because+0.000
 so+0.000
Neg logit contributions
igsaw-0.006
 former-0.005
 tant-0.005
spe-0.005
opl-0.005
Token you
Token ID345
Feature activation-4.36e-02

Pos logit contributions
 tant+0.033
arre+0.033
igsaw+0.033
arl+0.033
opl+0.033
Neg logit contributions
 it-0.006
 we-0.003
 the-0.002
 could-0.002
 I-0.001
Token found
Token ID1043
Feature activation+3.59e-04

Pos logit contributions
 former+0.077
 disl+0.076
 tant+0.070
side+0.068
spe+0.068
Neg logit contributions
 Professional-0.014
Code-0.012
ALTH-0.012
 Average-0.011
 conject-0.011
Token it
Token ID340
Feature activation+1.67e-02

Pos logit contributions
 it+0.000
 Professional-0.000
 under-0.000
 vodka-0.000
 Clinic-0.000
Neg logit contributions
 former-0.001
spe-0.001
arre-0.001
bel-0.001
 disl-0.001
Token are
Token ID389
Feature activation-1.18e-02

Pos logit contributions
ALTH+0.005
 vodka+0.004
 fills+0.004
Code+0.003
 according+0.003
Neg logit contributions
 former-0.073
spe-0.061
 tant-0.059
 disl-0.057
bel-0.057
Token from
Token ID422
Feature activation-1.07e-02

Pos logit contributions
 former+0.024
 tant+0.024
spe+0.022
 disl+0.022
bel+0.022
Neg logit contributions
 very-0.002
 good-0.002
 deleting-0.001
 for-0.001
 hard-0.001
Token a
Token ID257
Feature activation-2.23e-02

Pos logit contributions
brush+0.021
age+0.020
 gloss+0.020
 former+0.020
 disl+0.020
Neg logit contributions
 the-0.004
 animals-0.003
 for-0.003
 looking-0.002
 our-0.002
Token long
Token ID890
Feature activation-5.01e-02

Pos logit contributions
 disl+0.041
ment+0.037
 former+0.037
bel+0.036
und+0.035
Neg logit contributions
 place-0.007
 very-0.007
 way-0.007
 good-0.007
 lot-0.006
Token time
Token ID640
Feature activation-3.22e-02

Pos logit contributions
 disl+0.117
 former+0.112
bly+0.110
ifying+0.108
bel+0.107
Neg logit contributions
 way-0.008
 time-0.004
,-0.004
 place-0.001
 and+0.003
Token ago
Token ID2084
Feature activation-2.84e-01

Pos logit contributions
 former+0.068
 tant+0.067
 disl+0.059
Mic+0.059
spe+0.058
Neg logit contributions
.-0.013
,-0.012
 and-0.010
."-0.009
 for-0.008
Token,
Token ID11
Feature activation+3.30e-04

Pos logit contributions
 former+0.747
ifying+0.686
 tant+0.660
 disl+0.642
igsaw+0.608
Neg logit contributions
,-0.013
.-0.006
 and+0.017
."+0.025
 looking+0.068
Token before
Token ID878
Feature activation-2.65e-02

Pos logit contributions
 but+0.000
 and+0.000
 because+0.000
 like-0.000
 for-0.000
Neg logit contributions
 tant-0.001
handed-0.001
igsaw-0.001
ifying-0.001
spe-0.001
Token we
Token ID356
Feature activation+1.78e-02

Pos logit contributions
 tant+0.056
 former+0.053
spe+0.052
bel+0.050
 disl+0.050
Neg logit contributions
 going-0.005
 looking-0.004
 making-0.004
 we-0.004
,-0.004
Token were
Token ID547
Feature activation-1.89e-02

Pos logit contributions
 help+0.003
 look+0.003
 make+0.003
 are+0.003
 get+0.002
Neg logit contributions
 former-0.037
 disl-0.037
spe-0.034
 tant-0.034
igsaw-0.034
Token born
Token ID4642
Feature activation+4.19e-02

Pos logit contributions
 former+0.038
 disl+0.036
 tant+0.035
ho+0.035
spe+0.034
Neg logit contributions
 looking-0.006
 going-0.005
 supposed-0.005
 very-0.005
 bored-0.004
Token people
Token ID661
Feature activation-1.00e-02

Pos logit contributions
 former+0.024
 tant+0.021
isk+0.021
ifying+0.021
arl+0.020
Neg logit contributions
 it-0.001
 things-0.001
 we-0.001
 you-0.000
 I-0.000
Token did
Token ID750
Feature activation+3.16e-03

Pos logit contributions
 former+0.022
 tant+0.019
spe+0.018
ifying+0.018
bel+0.017
Neg logit contributions
 ask-0.002
 vodka-0.002
 look-0.002
 looking-0.002
 are-0.001
Token a
Token ID257
Feature activation-2.13e-02

Pos logit contributions
."+0.000
 and+0.000
,+0.000
.+0.000
 for+0.000
Neg logit contributions
ifying-0.007
 tant-0.007
 former-0.007
igsaw-0.006
ishly-0.006
Token long
Token ID890
Feature activation-4.95e-02

Pos logit contributions
 former+0.032
 disl+0.029
ifying+0.029
bar+0.028
spe+0.027
Neg logit contributions
ALTH-0.012
 Mueller-0.012
Code-0.011
 Professional-0.011
 vodka-0.010
Token time
Token ID640
Feature activation-3.10e-02

Pos logit contributions
ifying+0.114
 former+0.104
iqu+0.102
bly+0.096
rol+0.096
Neg logit contributions
,-0.009
 time-0.009
 and-0.007
 thing-0.007
 story-0.006
Token ago
Token ID2084
Feature activation-2.84e-01

Pos logit contributions
 tant+0.069
 former+0.064
isk+0.063
ifying+0.061
igsaw+0.059
Neg logit contributions
.-0.010
."-0.010
 and-0.008
,-0.008
 for-0.005
Token.
Token ID13
Feature activation-8.29e-03

Pos logit contributions
 former+0.741
ifying+0.682
 tant+0.638
 disl+0.610
Aud+0.595
Neg logit contributions
,-0.022
.-0.020
."-0.011
 and+0.004
!"+0.008
Token They
Token ID1119
Feature activation-2.24e-03

Pos logit contributions
 tant+0.015
 former+0.015
bel+0.014
 disl+0.014
spe+0.013
Neg logit contributions
ALTH-0.003
Code-0.003
 Professional-0.003
 Vend-0.003
 Average-0.003
Token lived
Token ID5615
Feature activation-2.82e-02

Pos logit contributions
 tant+0.005
 former+0.005
igsaw+0.004
run+0.004
aj+0.004
Neg logit contributions
 are-0.000
 look-0.000
 tell-0.000
 make-0.000
 ask-0.000
Token in
Token ID287
Feature activation-4.70e-03

Pos logit contributions
 disl+0.064
bar+0.056
 former+0.055
opl+0.053
ifying+0.052
Neg logit contributions
 under-0.009
 near-0.004
 in-0.003
 very-0.002
 underground-0.001
Token castles
Token ID49203
Feature activation-1.66e-02

Pos logit contributions
 former+0.010
 disl+0.010
arre+0.010
drops+0.010
bar+0.010
Neg logit contributions
 the-0.000
 a-0.000
 space+0.001
 many+0.001
 an+0.001
Token by
Token ID416
Feature activation+5.43e-03

Pos logit contributions
 it+0.002
 Professional+0.002
 new+0.002
inges+0.002
 vodka+0.001
Neg logit contributions
 former-0.027
ifying-0.025
 disl-0.024
 tant-0.023
 ago-0.022
Token a
Token ID257
Feature activation-2.13e-02

Pos logit contributions
 animals+0.002
 looking+0.001
 the+0.001
 things+0.001
 being+0.001
Neg logit contributions
ifying-0.011
 former-0.010
arl-0.010
Aud-0.010
aj-0.010
Token famous
Token ID5863
Feature activation+1.62e-03

Pos logit contributions
 disl+0.036
ifying+0.034
 former+0.034
 ago+0.033
und+0.032
Neg logit contributions
 new-0.006
 very-0.006
ALTH-0.005
 Mueller-0.005
 Veter-0.005
Token artist
Token ID6802
Feature activation-4.49e-03

Pos logit contributions
 Mueller+0.001
ALTH+0.001
 Professional+0.001
Code+0.001
 Veter+0.001
Neg logit contributions
 former-0.003
 disl-0.002
ifying-0.002
 tant-0.002
arre-0.002
Token long
Token ID890
Feature activation-4.91e-02

Pos logit contributions
 former+0.010
 tant+0.009
 disl+0.009
arl+0.009
ifying+0.009
Neg logit contributions
.-0.000
,+0.000
 and+0.000
."+0.000
!+0.000
Token ago
Token ID2084
Feature activation-2.83e-01

Pos logit contributions
 former+0.101
ifying+0.094
 disl+0.093
bel+0.087
iqu+0.087
Neg logit contributions
 way-0.014
.-0.013
,-0.010
 and-0.008
 place-0.008
Token.
Token ID13
Feature activation-8.43e-03

Pos logit contributions
 former+0.676
 disl+0.598
ifying+0.594
 tant+0.592
arl+0.554
Neg logit contributions
,-0.005
.-0.005
ALTH+0.015
 and+0.025
 looking+0.025
Token
Token ID220
Feature activation-5.30e-02

Pos logit contributions
 former+0.012
 disl+0.011
 tant+0.010
ifying+0.010
bel+0.010
Neg logit contributions
 Professional-0.006
ALTH-0.006
Code-0.006
 Mueller-0.006
 vodka-0.005
Token\n
Token ID198
Feature activation-6.26e-03

Pos logit contributions
 former+0.098
 disl+0.089
ifying+0.083
 tant+0.083
spe+0.078
Neg logit contributions
 Professional-0.022
ALTH-0.021
Code-0.018
 Mueller-0.015
 Ded-0.015
Token\n
Token ID198
Feature activation-6.51e-03

Pos logit contributions
 former+0.006
 disl+0.005
 ago+0.005
 tant+0.004
bel+0.004
Neg logit contributions
ALTH-0.007
Code-0.007
 Mueller-0.007
 Professional-0.007
 vodka-0.007
TokenOne
Token ID3198
Feature activation+2.48e-02

Pos logit contributions
 former+0.007
 disl+0.006
spe+0.005
 ago+0.005
ro+0.005
Neg logit contributions
 Professional-0.007
ALTH-0.007
Code-0.007
 Mueller-0.007
 vodka-0.006
Token the
Token ID262
Feature activation+5.26e-03

Pos logit contributions
 former+0.056
 disl+0.054
arre+0.054
spe+0.052
bel+0.052
Neg logit contributions
 the-0.001
 it+0.002
 looking+0.003
 Professional+0.003
 a+0.004
Token tree
Token ID5509
Feature activation-2.65e-02

Pos logit contributions
 Professional+0.004
ALTH+0.004
Code+0.004
 Mueller+0.004
 vodka+0.003
Neg logit contributions
 former-0.007
 disl-0.007
ifying-0.006
 tant-0.006
bel-0.006
Token.
Token ID13
Feature activation-2.70e-03

Pos logit contributions
 disl+0.068
 tant+0.064
arl+0.064
 former+0.058
ifying+0.057
Neg logit contributions
.-0.001
 looking+0.002
,+0.002
 under+0.003
 and+0.003
Token\n
Token ID198
Feature activation+7.01e-04

Pos logit contributions
 former+0.005
 tant+0.004
 disl+0.004
spe+0.004
ifying+0.004
Neg logit contributions
 Professional-0.001
ALTH-0.001
 Mueller-0.001
 Ded-0.001
Code-0.001
Token\n
Token ID198
Feature activation+5.37e-04

Pos logit contributions
 Professional+0.000
ALTH+0.000
Code+0.000
 Mueller+0.000
 vodka+0.000
Neg logit contributions
 former-0.001
 disl-0.001
ifying-0.001
 tant-0.001
spe-0.001
TokenWhile
Token ID3633
Feature activation+1.09e-01

Pos logit contributions
 Professional+0.000
ALTH+0.000
Code+0.000
 Mueller+0.000
 vodka+0.000
Neg logit contributions
 former-0.001
 disl-0.001
 tant-0.001
ifying-0.001
 ago-0.001
Token m
Token ID285
Feature activation-4.09e-02

Pos logit contributions
 Professional+0.034
 Mueller+0.028
ALTH+0.023
 vodka+0.023
 solving+0.021
Neg logit contributions
 former-0.213
 disl-0.210
 tant-0.200
ro-0.181
ifying-0.180
Tokenunch
Token ID3316
Feature activation+6.28e-02

Pos logit contributions
 former+0.056
 disl+0.050
ifying+0.047
 tant+0.047
bel+0.044
Neg logit contributions
 Professional-0.031
ALTH-0.030
Code-0.030
 Mueller-0.029
 vodka-0.028
Tokening
Token ID278
Feature activation-2.58e-03

Pos logit contributions
Code+0.011
 Professional+0.010
,+0.009
 it+0.007
 under+0.007
Neg logit contributions
 former-0.146
 disl-0.130
 ago-0.116
spe-0.109
 habitat-0.108
Token on
Token ID319
Feature activation-9.20e-03

Pos logit contributions
 former+0.006
ifying+0.005
 disl+0.005
spe+0.005
arl+0.005
Neg logit contributions
 Professional-0.000
Code-0.000
,-0.000
 the-0.000
ALTH-0.000
Token her
Token ID607
Feature activation-4.88e-03

Pos logit contributions
 former+0.021
arre+0.020
 disl+0.020
bel+0.019
spe+0.018
Neg logit contributions
 the-0.000
 Professional+0.000
,+0.000
 it+0.001
 under+0.001
Token She
Token ID1375
Feature activation+2.28e-03

Pos logit contributions
 former+0.001
 disl+0.001
 tant+0.001
ifying+0.001
 ago+0.001
Neg logit contributions
Code-0.001
ALTH-0.001
 Professional-0.001
 Mueller-0.001
 vodka-0.001
Token also
Token ID635
Feature activation+3.11e-02

Pos logit contributions
Code+0.001
 vodka+0.001
 Professional+0.001
ALTH+0.001
 Mueller+0.001
Neg logit contributions
 former-0.004
 disl-0.004
ifying-0.004
arl-0.004
igsaw-0.004
Token saw
Token ID2497
Feature activation+2.31e-03

Pos logit contributions
Code+0.010
 Professional+0.007
 vodka+0.007
 blogging+0.007
 Mueller+0.006
Neg logit contributions
 former-0.061
ifying-0.053
 tant-0.051
 disl-0.051
arl-0.051
Token shells
Token ID19679
Feature activation+6.93e-03

Pos logit contributions
 a+0.000
 many+0.000
 new+0.000
 more+0.000
 flowers-0.000
Neg logit contributions
 former-0.006
isk-0.005
 disl-0.005
arl-0.005
arre-0.005
Token.
Token ID13
Feature activation-2.73e-04

Pos logit contributions
,+0.000
 under-0.001
Code-0.001
 and-0.001
 among-0.001
Neg logit contributions
 former-0.018
 disl-0.016
 tant-0.015
spe-0.015
 aids-0.014
Token Shell
Token ID17537
Feature activation+1.11e-01

Pos logit contributions
 former+0.000
 disl+0.000
ifying+0.000
 tant+0.000
 ago+0.000
Neg logit contributions
Code-0.000
ALTH-0.000
 Professional-0.000
 Mueller-0.000
 vodka-0.000
Tokens
Token ID82
Feature activation+3.08e-02

Pos logit contributions
Code+0.056
 Professional+0.050
 Mueller+0.047
ALTH+0.047
 vodka+0.044
Neg logit contributions
 former-0.197
 disl-0.170
ifying-0.164
 tant-0.159
 ago-0.149
Token were
Token ID547
Feature activation-5.84e-03

Pos logit contributions
Code+0.010
 vodka+0.007
 blogging+0.006
 Mueller+0.006
 looked+0.005
Neg logit contributions
 former-0.061
 disl-0.054
 tant-0.052
igsaw-0.051
ifying-0.049
Token her
Token ID607
Feature activation-4.41e-03

Pos logit contributions
 former+0.012
 disl+0.011
 tant+0.011
ifying+0.010
arre+0.010
Neg logit contributions
Code-0.002
 vodka-0.001
 blogging-0.001
 very-0.001
 looking-0.001
Token favorite
Token ID4004
Feature activation+2.95e-02

Pos logit contributions
 disl+0.009
 former+0.008
arre+0.008
isk+0.008
 tant+0.008
Neg logit contributions
 Mueller-0.001
 best-0.001
 under-0.001
 new-0.001
 looking-0.001
Token.
Token ID13
Feature activation+2.11e-03

Pos logit contributions
Code+0.007
 Mueller+0.007
 vodka+0.006
ALTH+0.006
 soils+0.006
Neg logit contributions
 former-0.056
 disl-0.051
arl-0.048
bel-0.047
arre-0.047
Token They
Token ID1119
Feature activation+1.81e-02

Pos logit contributions
 Professional+0.004
ALTH+0.004
Code+0.003
 Mueller+0.003
 vodka+0.003
Neg logit contributions
 former-0.011
 disl-0.010
ifying-0.010
spe-0.009
 tant-0.009
Token ran
Token ID4966
Feature activation+2.32e-02

Pos logit contributions
ALTH+0.019
 Professional+0.018
Code+0.018
 Mueller+0.017
 Average+0.016
Neg logit contributions
 former-0.019
ifying-0.016
 tant-0.015
 disl-0.015
spe-0.014
Token to
Token ID284
Feature activation+6.97e-03

Pos logit contributions
ALTH+0.025
 Professional+0.025
Code+0.025
 vodka+0.024
 Mueller+0.023
Neg logit contributions
 former-0.022
 disl-0.019
 tant-0.018
ro-0.018
 ago-0.018
Token find
Token ID1064
Feature activation+3.09e-02

Pos logit contributions
 Professional+0.004
ALTH+0.004
Code+0.004
 Mueller+0.004
 vodka+0.004
Neg logit contributions
 former-0.011
 disl-0.010
ifying-0.009
spe-0.009
 tant-0.009
Token shelter
Token ID11772
Feature activation+2.10e-02

Pos logit contributions
Code+0.025
 Professional+0.025
ALTH+0.024
 Mueller+0.023
 vodka+0.022
Neg logit contributions
 former-0.042
 disl-0.037
ifying-0.034
 tant-0.034
spe-0.033
Token under
Token ID739
Feature activation+1.17e-01

Pos logit contributions
 under+0.005
 and+0.002
 for+0.001
,+0.001
.+0.001
Neg logit contributions
 former-0.048
 disl-0.045
 tant-0.044
ocol-0.043
ifying-0.041
Token a
Token ID257
Feature activation-2.69e-03

Pos logit contributions
Code+0.040
 Professional+0.036
ALTH+0.030
 Mueller+0.023
 Ashes+0.022
Neg logit contributions
 former-0.222
 disl-0.209
 tant-0.196
spe-0.195
bel-0.193
Token rough
Token ID5210
Feature activation+2.08e-02

Pos logit contributions
 former+0.004
 disl+0.003
ifying+0.003
 tant+0.003
 ago+0.003
Neg logit contributions
Code-0.002
ALTH-0.002
 Professional-0.002
 Mueller-0.002
 vodka-0.002
Token tree
Token ID5509
Feature activation-1.36e-02

Pos logit contributions
ALTH+0.014
Code+0.013
 Professional+0.013
 Mueller+0.011
 vodka+0.011
Neg logit contributions
 former-0.031
 disl-0.029
ifying-0.026
bel-0.025
spe-0.025
Token.
Token ID13
Feature activation+1.04e-02

Pos logit contributions
 disl+0.029
 former+0.028
 tant+0.027
ocol+0.025
ifying+0.025
Neg logit contributions
.-0.000
 under-0.000
 Professional+0.000
,+0.001
 hiding+0.001
Token Lily
Token ID20037
Feature activation+7.07e-03

Pos logit contributions
 Professional+0.008
ALTH+0.007
Code+0.007
 Mueller+0.007
 vodka+0.007
Neg logit contributions
 former-0.015
 disl-0.013
 tant-0.012
ifying-0.012
spe-0.012
Token the
Token ID262
Feature activation+9.23e-03

Pos logit contributions
ALTH+0.000
Code+0.000
 Professional+0.000
 under+0.000
 Mueller+0.000
Neg logit contributions
 former-0.003
 disl-0.003
 tant-0.002
bel-0.002
arre-0.002
Token tree
Token ID5509
Feature activation-2.48e-02

Pos logit contributions
ALTH+0.007
Code+0.007
 Professional+0.007
 Mueller+0.007
 vodka+0.007
Neg logit contributions
 former-0.012
 disl-0.011
ifying-0.010
 tant-0.010
spe-0.010
Token.
Token ID13
Feature activation+2.86e-03

Pos logit contributions
 disl+0.053
 former+0.052
ifying+0.046
 tant+0.045
ocol+0.043
Neg logit contributions
 under-0.001
 Professional-0.001
ALTH-0.001
.-0.000
 Mueller-0.000
Token He
Token ID679
Feature activation-5.65e-03

Pos logit contributions
 Professional+0.002
ALTH+0.002
Code+0.002
 Mueller+0.002
 vodka+0.002
Neg logit contributions
 former-0.004
 disl-0.004
 tant-0.003
ifying-0.003
spe-0.003
Token stays
Token ID14768
Feature activation-2.24e-02

Pos logit contributions
 former+0.009
 disl+0.008
ifying+0.008
spe+0.008
 ago+0.008
Neg logit contributions
ALTH-0.003
 Professional-0.003
Code-0.003
 vodka-0.003
 Mueller-0.003
Token under
Token ID739
Feature activation+1.09e-01

Pos logit contributions
 former+0.036
 disl+0.034
 tant+0.031
ifying+0.030
spe+0.030
Neg logit contributions
ALTH-0.012
Code-0.012
 Professional-0.011
 Mueller-0.011
 vodka-0.011
Token the
Token ID262
Feature activation+1.08e-02

Pos logit contributions
Code+0.030
 Professional+0.029
ALTH+0.026
 Vend+0.014
 Mueller+0.014
Neg logit contributions
 former-0.215
 disl-0.203
spe-0.190
bel-0.189
arre-0.181
Token tree
Token ID5509
Feature activation-2.34e-02

Pos logit contributions
Code+0.007
ALTH+0.007
 Professional+0.006
 Mueller+0.006
 vodka+0.005
Neg logit contributions
 former-0.017
 disl-0.016
spe-0.014
ifying-0.014
bel-0.014
Token and
Token ID290
Feature activation-5.71e-04

Pos logit contributions
 disl+0.051
 former+0.048
ifying+0.046
 tant+0.043
ocol+0.042
Neg logit contributions
 Professional-0.001
.-0.000
 under-0.000
 Mueller+0.000
asser+0.000
Token bar
Token ID2318
Feature activation-3.35e-02

Pos logit contributions
 former+0.001
 disl+0.001
spe+0.001
ifying+0.001
 tant+0.001
Neg logit contributions
 Professional-0.000
ALTH-0.000
Code-0.000
 vodka-0.000
 Mueller-0.000
Tokenks
Token ID591
Feature activation-1.79e-02

Pos logit contributions
 former+0.041
 disl+0.036
ifying+0.033
 tant+0.033
 ago+0.031
Neg logit contributions
ALTH-0.030
 Professional-0.030
Code-0.030
 Mueller-0.029
 vodka-0.027
Token flower
Token ID15061
Feature activation+2.72e-02

Pos logit contributions
 Professional+0.009
Code+0.009
 Mueller+0.009
ALTH+0.009
 vodka+0.008
Neg logit contributions
 former-0.037
 disl-0.035
ifying-0.033
spe-0.033
bar-0.033
Token and
Token ID290
Feature activation+1.08e-02

Pos logit contributions
 Mueller+0.003
 under+0.003
 Professional+0.002
 vodka+0.001
 and+0.001
Neg logit contributions
 disl-0.058
 former-0.053
 tant-0.051
arl-0.051
spe-0.048
Token said
Token ID531
Feature activation+3.04e-03

Pos logit contributions
 vodka+0.004
Code+0.004
 Professional+0.004
 Mueller+0.003
 Average+0.003
Neg logit contributions
 former-0.019
ifying-0.019
arl-0.018
 tant-0.017
igsaw-0.017
Token,
Token ID11
Feature activation+3.01e-02

Pos logit contributions
 Professional+0.002
ALTH+0.002
Code+0.002
 Mueller+0.002
 vodka+0.001
Neg logit contributions
 former-0.005
 disl-0.004
ifying-0.004
 tant-0.004
 ago-0.004
Token "
Token ID366
Feature activation+1.72e-03

Pos logit contributions
ALTH+0.022
 Professional+0.022
Code+0.022
 Mueller+0.021
 vodka+0.020
Neg logit contributions
 former-0.043
 disl-0.037
 tant-0.035
ifying-0.035
spe-0.033
TokenWow
Token ID22017
Feature activation+1.04e-01

Pos logit contributions
ALTH+0.001
 Professional+0.001
Code+0.001
 Mueller+0.001
 vodka+0.001
Neg logit contributions
 former-0.002
 disl-0.002
ifying-0.002
 tant-0.002
spe-0.002
Token!
Token ID0
Feature activation+2.99e-02

Pos logit contributions
Code+0.085
 Professional+0.085
ALTH+0.084
 Mueller+0.080
 vodka+0.078
Neg logit contributions
 former-0.128
 disl-0.116
ifying-0.109
 tant-0.104
 ago-0.102
Token It
Token ID632
Feature activation+4.00e-02

Pos logit contributions
 Professional+0.019
ALTH+0.018
Code+0.017
 Mueller+0.016
 Average+0.015
Neg logit contributions
 former-0.047
 tant-0.042
 disl-0.041
bel-0.039
spe-0.038
Token is
Token ID318
Feature activation+2.73e-02

Pos logit contributions
 Professional+0.026
ALTH+0.026
Code+0.026
 Mueller+0.024
 vodka+0.023
Neg logit contributions
 former-0.059
 disl-0.053
 tant-0.050
ifying-0.049
 ago-0.048
Token very
Token ID845
Feature activation+6.25e-02

Pos logit contributions
ALTH+0.014
 Professional+0.013
Code+0.013
 vodka+0.012
 Mueller+0.011
Neg logit contributions
 former-0.044
 tant-0.042
 disl-0.042
bel-0.040
spe-0.039
Token pretty
Token ID2495
Feature activation+3.32e-02

Pos logit contributions
Code+0.037
 Professional+0.036
ALTH+0.036
 vodka+0.034
 Mueller+0.033
Neg logit contributions
 former-0.093
 disl-0.084
 tant-0.083
ifying-0.080
spe-0.078
Token very
Token ID845
Feature activation+5.66e-02

Pos logit contributions
 former+0.029
bel+0.028
arl+0.028
 tant+0.028
 disl+0.027
Neg logit contributions
 so-0.001
 looking-0.001
 very-0.001
 covered+0.001
 wearing+0.001
Token excited
Token ID6568
Feature activation+5.21e-03

Pos logit contributions
 good+0.008
 eager+0.008
 gifted+0.007
 fit+0.007
 clever+0.006
Neg logit contributions
 former-0.115
ifying-0.108
tr-0.108
isk-0.105
storm-0.104
Token and
Token ID290
Feature activation+9.02e-03

Pos logit contributions
!+0.000
 and+0.000
,+0.000
.-0.000
 but-0.000
Neg logit contributions
 former-0.013
ifying-0.011
 tant-0.011
 disl-0.011
bel-0.010
Token they
Token ID484
Feature activation+1.73e-02

Pos logit contributions
 it+0.002
 ran+0.001
 he+0.001
 looked+0.001
 opened+0.001
Neg logit contributions
 former-0.021
arl-0.020
 tant-0.019
arre-0.019
opl-0.019
Token quickly
Token ID2952
Feature activation-2.23e-02

Pos logit contributions
 vodka+0.003
 looked+0.003
 Professional+0.003
 capit+0.002
 sat+0.002
Neg logit contributions
 former-0.034
 disl-0.033
arl-0.030
 tant-0.030
 ago-0.029
Token began
Token ID2540
Feature activation-1.15e-02

Pos logit contributions
 former+0.043
 tant+0.039
ifying+0.038
bel+0.037
 disl+0.037
Neg logit contributions
 vodka-0.004
 buried-0.004
 dug-0.004
 crawled-0.004
 opened-0.004
Token to
Token ID284
Feature activation+1.30e-02

Pos logit contributions
 former+0.021
 tant+0.019
ifying+0.019
 disl+0.018
arl+0.018
Neg logit contributions
 Professional-0.003
 blogging-0.003
 vodka-0.003
 Mueller-0.002
ALTH-0.002
Token pick
Token ID2298
Feature activation+1.86e-02

Pos logit contributions
 Professional+0.002
 unfolds+0.002
 vodka+0.002
assuming+0.002
 pick+0.001
Neg logit contributions
bel-0.024
 tant-0.024
 former-0.024
belt-0.024
 ago-0.023
Token the
Token ID262
Feature activation+2.23e-02

Pos logit contributions
 Professional+0.008
Code+0.008
ALTH+0.007
 Mueller+0.007
expensive+0.006
Neg logit contributions
 former-0.033
 disl-0.030
bel-0.029
 ago-0.029
spe-0.028
Token leaves
Token ID5667
Feature activation+2.83e-03

Pos logit contributions
 Professional+0.014
ALTH+0.013
Code+0.013
 Mueller+0.012
 vodka+0.011
Neg logit contributions
 former-0.034
 disl-0.032
bel-0.030
ifying-0.029
spe-0.029
Token.
Token ID13
Feature activation+1.54e-02

Pos logit contributions
 under+0.000
.+0.000
 and+0.000
,+0.000
 for+0.000
Neg logit contributions
 former-0.006
 disl-0.006
 habitats-0.006
 aids-0.006
 Majesty-0.006
Token.
Token ID13
Feature activation-5.74e-03

Pos logit contributions
.+0.001
,+0.000
 and-0.001
 more-0.001
 for-0.001
Neg logit contributions
ifying-0.023
ocol-0.022
 disl-0.021
arl-0.021
 former-0.021
Token\n
Token ID198
Feature activation-4.54e-03

Pos logit contributions
 former+0.012
ifying+0.011
 tant+0.011
 disl+0.011
bel+0.011
Neg logit contributions
 Professional-0.001
-0.000
 Vend-0.000
ALTH-0.000
 But+0.000
Token\n
Token ID198
Feature activation-5.46e-03

Pos logit contributions
 former+0.008
ifying+0.008
 disl+0.007
 tant+0.007
spe+0.007
Neg logit contributions
 Professional-0.002
Code-0.002
ALTH-0.002
 Ded-0.001
 Mueller-0.001
TokenSam
Token ID16305
Feature activation+1.45e-02

Pos logit contributions
 former+0.010
ifying+0.009
 disl+0.009
 tant+0.008
 ago+0.008
Neg logit contributions
Code-0.002
ALTH-0.002
 Professional-0.002
Fore-0.002
 Mueller-0.001
Token hugged
Token ID41130
Feature activation+1.88e-02

Pos logit contributions
 looked+0.003
 Professional+0.003
Code+0.003
 followed+0.003
 closed+0.002
Neg logit contributions
 former-0.030
 disl-0.029
ifying-0.029
ocol-0.027
igsaw-0.026
Token Dad
Token ID17415
Feature activation-2.65e-02

Pos logit contributions
 the+0.002
 his+0.000
Code+0.000
 Professional-0.000
 under-0.001
Neg logit contributions
 former-0.043
ifying-0.042
 disl-0.038
spe-0.037
 ago-0.037
Token and
Token ID290
Feature activation-1.21e-02

Pos logit contributions
 former+0.040
 disl+0.036
ifying+0.033
 ago+0.032
 tant+0.032
Neg logit contributions
 Professional-0.018
Code-0.017
ALTH-0.017
 Mueller-0.016
 vodka-0.015
Token Tim
Token ID5045
Feature activation-6.09e-03

Pos logit contributions
 former+0.023
ifying+0.022
 ago+0.021
 disl+0.020
arl+0.020
Neg logit contributions
Code-0.004
 Professional-0.004
 vodka-0.003
ALTH-0.002
 unfolds-0.002
Token and
Token ID290
Feature activation-1.19e-02

Pos logit contributions
 former+0.010
 disl+0.009
ifying+0.008
 tant+0.008
 ago+0.008
Neg logit contributions
 Professional-0.004
Code-0.004
ALTH-0.003
 Mueller-0.003
 vodka-0.003
Token said
Token ID531
Feature activation-2.40e-02

Pos logit contributions
 former+0.024
ifying+0.022
 disl+0.021
arl+0.021
 ago+0.021
Neg logit contributions
 Professional-0.002
 helped-0.002
Code-0.002
 looked-0.002
 vodka-0.002
Token he
Token ID339
Feature activation-3.02e-02

Pos logit contributions
 former+0.057
ifying+0.050
isk+0.046
igsaw+0.045
 tant+0.045
Neg logit contributions
,-0.004
 it-0.003
 they-0.001
 good-0.001
 he-0.001
Token smile
Token ID8212
Feature activation-3.73e-02

Pos logit contributions
 former+0.037
 disl+0.033
 ago+0.030
ro+0.028
ifying+0.028
Neg logit contributions
 Professional-0.015
Code-0.014
ALTH-0.014
 Mueller-0.014
 vodka-0.013
Token,
Token ID11
Feature activation+1.37e-03

Pos logit contributions
 former+0.106
 disl+0.093
 habitats+0.086
 habitat+0.085
 tant+0.081
Neg logit contributions
,-0.005
 and+0.000
 on+0.000
 for+0.006
 in+0.008
Token he
Token ID339
Feature activation-3.16e-02

Pos logit contributions
 it+0.000
 the+0.000
Code+0.000
 he+0.000
 because+0.000
Neg logit contributions
 former-0.003
 disl-0.003
ro-0.003
 tant-0.002
 ago-0.002
Token opened
Token ID4721
Feature activation+5.81e-02

Pos logit contributions
 disl+0.066
 former+0.065
arl+0.060
 tant+0.059
ifying+0.057
Neg logit contributions
 opened-0.007
 looked-0.007
 helped-0.006
 closed-0.006
 followed-0.005
Token the
Token ID262
Feature activation-4.49e-03

Pos logit contributions
 it+0.003
 the+0.002
 Professional-0.000
Code-0.001
ALTH-0.005
Neg logit contributions
 former-0.127
 disl-0.122
arl-0.111
storm-0.109
isk-0.108
Token box
Token ID3091
Feature activation+4.80e-02

Pos logit contributions
 disl+0.008
 former+0.007
bel+0.007
spe+0.007
 ago+0.006
Neg logit contributions
 Mueller-0.002
ALTH-0.002
Code-0.002
 Professional-0.002
 Average-0.001
Token and
Token ID290
Feature activation-1.49e-02

Pos logit contributions
 under+0.003
 Mueller+0.002
 Professional+0.002
 and+0.001
 ranging+0.000
Neg logit contributions
 disl-0.109
 former-0.100
 tant-0.088
arl-0.084
ifying-0.083
Token his
Token ID465
Feature activation-1.39e-02

Pos logit contributions
 former+0.032
 disl+0.032
opl+0.030
arl+0.030
 habitats+0.027
Neg logit contributions
 opened-0.004
 looked-0.003
 under-0.003
ultane-0.002
 showed-0.002
Token eyes
Token ID2951
Feature activation+2.42e-02

Pos logit contributions
 disl+0.025
 former+0.023
und+0.021
bush+0.021
ro+0.021
Neg logit contributions
 Mueller-0.005
 new-0.004
 colleague-0.004
Code-0.004
 Professional-0.004
Token filled
Token ID5901
Feature activation-6.03e-03

Pos logit contributions
Code+0.013
 vodka+0.011
ALTH+0.011
 Professional+0.011
 blogging+0.010
Neg logit contributions
 former-0.040
 disl-0.039
 tant-0.034
 ago-0.032
spe-0.031
Token with
Token ID351
Feature activation-3.34e-02

Pos logit contributions
 former+0.012
arl+0.012
 disl+0.012
 ago+0.012
ifying+0.011
Neg logit contributions
 the-0.001
 it-0.001
Code-0.000
 under+0.000
 his+0.000
Token me
Token ID502
Feature activation-7.88e-02

Pos logit contributions
 former+0.041
rol+0.040
side+0.039
mo+0.039
 seventeen+0.039
Neg logit contributions
 dad-0.002
 your+0.000
 we+0.000
 mom+0.001
 toys+0.001
Token if
Token ID611
Feature activation-5.16e-02

Pos logit contributions
 former+0.169
 seventeen+0.165
 tant+0.160
 disl+0.159
ifying+0.156
Neg logit contributions
.-0.007
."-0.002
 because+0.001
,+0.005
 very+0.006
Token you
Token ID345
Feature activation-4.47e-02

Pos logit contributions
 former+0.121
 tant+0.116
arre+0.107
ifying+0.106
isk+0.104
Neg logit contributions
 we-0.007
 it-0.005
 things-0.001
 you+0.001
 they+0.002
Token don
Token ID836
Feature activation-1.07e-01

Pos logit contributions
 former+0.083
 disl+0.075
spe+0.074
ifying+0.071
 tant+0.070
Neg logit contributions
ALTH-0.011
expensive-0.010
 Professional-0.009
 overs-0.009
 look-0.008
Token't
Token ID470
Feature activation+1.54e-02

Pos logit contributions
 former+0.089
 disl+0.078
ifying+0.075
 tant+0.068
arre+0.067
Neg logit contributions
Code-0.121
 Professional-0.121
 Mueller-0.119
 vodka-0.119
ALTH-0.114
Token.
Token ID13
Feature activation-1.52e-02

Pos logit contributions
 Professional+0.009
ALTH+0.009
 Mueller+0.009
Code+0.009
 vodka+0.008
Neg logit contributions
 former-0.023
 disl-0.021
ifying-0.020
bel-0.020
 tant-0.019
Token And
Token ID843
Feature activation+1.49e-02

Pos logit contributions
 former+0.025
bel+0.025
 tant+0.024
spe+0.023
 disl+0.023
Neg logit contributions
 Professional-0.007
ALTH-0.006
Code-0.006
 Mueller-0.006
 Roots-0.006
Token it
Token ID340
Feature activation+1.67e-02

Pos logit contributions
 we+0.002
 you+0.001
 it+0.001
 maybe+0.001
 let+0.001
Neg logit contributions
opl-0.036
 tant-0.035
arre-0.035
 former-0.035
ifying-0.034
Token's
Token ID338
Feature activation+1.12e-02

Pos logit contributions
ALTH+0.005
 Professional+0.004
 vodka+0.004
Code+0.003
 Mueller+0.003
Neg logit contributions
 former-0.033
 disl-0.030
bel-0.029
ifying-0.029
 ago-0.028
Token not
Token ID407
Feature activation-1.16e-02

Pos logit contributions
ALTH+0.003
 vodka+0.002
 Professional+0.002
 because+0.002
 very+0.002
Neg logit contributions
bel-0.021
 disl-0.020
 former-0.020
spe-0.019
ho-0.019
Token good
Token ID922
Feature activation+2.18e-02

Pos logit contributions
 former+0.023
 disl+0.022
ifying+0.021
ho+0.021
ro+0.021
Neg logit contributions
 good-0.002
 worth-0.001
 nice-0.001
 for-0.001
 very-0.001
Token so
Token ID523
Feature activation+8.21e-03

Pos logit contributions
 disl+0.030
 tant+0.028
 former+0.028
arre+0.027
arl+0.026
Neg logit contributions
 vodka-0.003
 Professional-0.003
ALTH-0.003
 under-0.002
Code-0.002
Token big
Token ID1263
Feature activation-1.67e-02

Pos logit contributions
Code+0.003
 Professional+0.002
ALTH+0.002
 vodka+0.002
 Mueller+0.002
Neg logit contributions
 former-0.014
 disl-0.013
 tant-0.013
 supervision-0.013
bar-0.013
Token and
Token ID290
Feature activation+4.46e-03

Pos logit contributions
 former+0.034
 tant+0.031
ifying+0.031
 disl+0.030
spe+0.030
Neg logit contributions
ALTH-0.003
 compared-0.003
 Professional-0.002
Code-0.002
 under-0.002
Token exciting
Token ID7895
Feature activation-3.46e-03

Pos logit contributions
 cozy+0.001
 new+0.001
 shiny+0.001
 dark+0.001
 looked+0.001
Neg logit contributions
arl-0.009
 former-0.009
 disl-0.009
arre-0.009
ro-0.009
Token that
Token ID326
Feature activation-1.21e-02

Pos logit contributions
 former+0.007
 tant+0.006
 disl+0.006
ifying+0.006
arl+0.006
Neg logit contributions
 Mueller-0.001
ALTH-0.001
 Professional-0.001
Code-0.001
 blogging-0.001
Token Sarah
Token ID10490
Feature activation-3.17e-02

Pos logit contributions
 former+0.025
arl+0.024
 disl+0.024
ro+0.023
ifying+0.023
Neg logit contributions
 it-0.002
 the-0.001
 she-0.001
 everything-0.000
 new-0.000
Token had
Token ID550
Feature activation-1.13e-02

Pos logit contributions
 former+0.060
arl+0.060
 disl+0.057
 tant+0.055
ocol+0.054
Neg logit contributions
 Professional-0.009
ALTH-0.008
 vodka-0.008
 Mueller-0.007
ultane-0.007
Token to
Token ID284
Feature activation+9.86e-03

Pos logit contributions
 former+0.019
 tant+0.018
 disl+0.018
arl+0.017
ifying+0.017
Neg logit contributions
 Professional-0.005
ALTH-0.005
 vodka-0.005
Code-0.005
 capit-0.004
Token take
Token ID1011
Feature activation-3.39e-03

Pos logit contributions
 find+0.002
 open+0.002
 hide+0.002
 look+0.002
 keep+0.001
Neg logit contributions
ifying-0.021
 tant-0.021
arre-0.021
bel-0.021
spe-0.020
Token a
Token ID257
Feature activation-1.26e-03

Pos logit contributions
 disl+0.007
 former+0.007
spe+0.007
bar+0.006
cut+0.006
Neg logit contributions
 good-0.001
 it-0.000
 very-0.000
 things-0.000
 the-0.000
Token deep
Token ID2769
Feature activation+3.18e-02

Pos logit contributions
 former+0.002
 disl+0.002
ifying+0.001
 ago+0.001
 tant+0.001
Neg logit contributions
 Professional-0.001
ALTH-0.001
Code-0.001
 Mueller-0.001
 vodka-0.001
Token birds
Token ID10087
Feature activation-4.05e-03

Pos logit contributions
 former+0.038
arl+0.037
igsaw+0.037
belt+0.036
arre+0.035
Neg logit contributions
 animals-0.005
 bugs-0.004
 fish-0.004
 the-0.003
 flowers-0.003
Token.
Token ID13
Feature activation-1.30e-02

Pos logit contributions
 former+0.011
 tant+0.010
 disl+0.009
igsaw+0.009
spe+0.009
Neg logit contributions
,-0.000
 looking-0.000
 and-0.000
.-0.000
 under+0.000
Token They
Token ID1119
Feature activation-4.30e-03

Pos logit contributions
 former+0.022
 disl+0.020
 tant+0.019
spe+0.019
bel+0.018
Neg logit contributions
ALTH-0.006
Code-0.006
 Professional-0.006
 Vend-0.006
 Mueller-0.005
Token also
Token ID635
Feature activation+2.00e-02

Pos logit contributions
 former+0.010
igsaw+0.009
 disl+0.009
arl+0.008
 tant+0.008
Neg logit contributions
 looked-0.001
 crawled-0.001
 followed-0.001
 also-0.000
 opened-0.000
Token saw
Token ID2497
Feature activation-9.10e-03

Pos logit contributions
 looked+0.003
 saw+0.002
 crawled+0.001
 smelled+0.001
 covered+0.001
Neg logit contributions
 former-0.046
igsaw-0.044
arl-0.042
 tant-0.042
bar-0.041
Token some
Token ID617
Feature activation-5.92e-02

Pos logit contributions
 former+0.021
belt+0.018
igsaw+0.018
ifying+0.018
arre+0.018
Neg logit contributions
 animals-0.001
 many-0.001
 new-0.001
 the-0.001
 more-0.001
Token people
Token ID661
Feature activation-1.17e-02

Pos logit contributions
 former+0.133
 disl+0.122
igsaw+0.120
bar+0.117
ifying+0.114
Neg logit contributions
 animals-0.013
 new-0.009
 things-0.007
 flowers-0.007
Code-0.006
Token sitting
Token ID5586
Feature activation-9.62e-04

Pos logit contributions
 former+0.027
 tant+0.023
 disl+0.022
igsaw+0.020
ifying+0.020
Neg logit contributions
 looking-0.003
 under-0.002
 wearing-0.002
,-0.001
 blogging-0.001
Token under
Token ID739
Feature activation+9.50e-02

Pos logit contributions
 disl+0.002
 former+0.002
spe+0.002
igsaw+0.002
 tant+0.002
Neg logit contributions
 under-0.000
 near-0.000
Code-0.000
 among-0.000
 underground-0.000
Token a
Token ID257
Feature activation-2.40e-02

Pos logit contributions
 the+0.005
 a+0.001
Code+0.000
 it-0.006
 animals-0.012
Neg logit contributions
 former-0.217
 disl-0.207
spe-0.205
arre-0.199
bel-0.197
Token tent
Token ID11105
Feature activation+5.07e-02

Pos logit contributions
 former+0.046
 disl+0.039
spe+0.036
arre+0.035
 ago+0.034
Neg logit contributions
Code-0.011
ALTH-0.010
 Professional-0.009
 vodka-0.007
 blogging-0.006
Token idea
Token ID2126
Feature activation+9.95e-03

Pos logit contributions
 former+0.026
 disl+0.023
ifying+0.023
 tant+0.021
spe+0.021
Neg logit contributions
ALTH-0.015
 Professional-0.015
Code-0.015
 Mueller-0.014
 vodka-0.014
Token!
Token ID0
Feature activation+1.31e-03

Pos logit contributions
ALTH+0.004
Code+0.004
 Mueller+0.003
 Professional+0.003
 vodka+0.003
Neg logit contributions
 former-0.019
 disl-0.016
ifying-0.015
 tant-0.015
bel-0.015
Token He
Token ID679
Feature activation-1.54e-02

Pos logit contributions
 Professional+0.001
ALTH+0.001
Code+0.000
 Mueller+0.000
 Ded+0.000
Neg logit contributions
 former-0.003
 disl-0.002
bel-0.002
 tant-0.002
spe-0.002
Token grabbed
Token ID13176
Feature activation-3.66e-02

Pos logit contributions
 former+0.032
 disl+0.032
ifying+0.031
arl+0.030
 tant+0.030
Neg logit contributions
 followed-0.003
 looked-0.003
 opened-0.003
 helped-0.003
 tried-0.003
Token a
Token ID257
Feature activation-2.16e-02

Pos logit contributions
 former+0.082
 disl+0.076
spe+0.073
bel+0.070
 ago+0.070
Neg logit contributions
 Professional-0.006
Code-0.005
 the-0.005
 Scale-0.004
 Rob-0.004
Token nearby
Token ID6716
Feature activation-7.97e-02

Pos logit contributions
 former+0.044
 disl+0.040
 habitats+0.038
 ago+0.036
 habitat+0.035
Neg logit contributions
 box-0.006
 new-0.004
 shovel-0.004
 piece-0.004
 stick-0.004
Token stick
Token ID4859
Feature activation-6.92e-03

Pos logit contributions
 former+0.170
 disl+0.159
ifying+0.148
bel+0.147
spe+0.135
Neg logit contributions
 box-0.019
 tree-0.013
 rock-0.012
,-0.011
 and-0.009
Token and
Token ID290
Feature activation-1.17e-02

Pos logit contributions
 former+0.018
 disl+0.018
 Majesty+0.015
ocol+0.015
ifying+0.015
Neg logit contributions
 and-0.000
,+0.001
 under+0.001
 to+0.001
 but+0.001
Token started
Token ID2067
Feature activation-3.88e-03

Pos logit contributions
arl+0.028
 former+0.027
opl+0.024
 disl+0.024
ifying+0.024
Neg logit contributions
 tried-0.003
 helped-0.003
 wrapped-0.003
 used-0.002
 tied-0.002
Token throwing
Token ID9644
Feature activation-3.54e-03

Pos logit contributions
 former+0.008
ifying+0.007
 tant+0.007
arl+0.007
 disl+0.007
Neg logit contributions
 digging-0.001
 looking-0.001
 blogging-0.001
 Professional-0.001
 searching-0.000
Token it
Token ID340
Feature activation+2.22e-02

Pos logit contributions
 former+0.009
storm+0.008
 disl+0.008
aj+0.008
spe+0.008
Neg logit contributions
 it-0.001
 the-0.000
 things-0.000
 more+0.000
 rocks+0.000
Token to
Token ID284
Feature activation-9.02e-03

Pos logit contributions
 disl+0.088
 former+0.080
ocol+0.075
ifying+0.067
 tant+0.063
Neg logit contributions
 under-0.011
,-0.006
 near-0.005
 and-0.004
 outside-0.004
Token their
Token ID511
Feature activation-1.38e-02

Pos logit contributions
ifying+0.020
 tant+0.020
arre+0.020
spe+0.019
aj+0.019
Neg logit contributions
 the-0.001
 a+0.000
 place+0.001
 look+0.001
 new+0.001
Token favorite
Token ID4004
Feature activation+2.19e-02

Pos logit contributions
 disl+0.022
 former+0.020
 ago+0.020
arl+0.020
ifying+0.020
Neg logit contributions
 Professional-0.006
 Mueller-0.005
Code-0.005
ALTH-0.005
 Vend-0.005
Token spot
Token ID4136
Feature activation+2.83e-02

Pos logit contributions
 place+0.005
 spot+0.003
 way+0.002
 thing+0.002
 room+0.002
Neg logit contributions
arl-0.044
 disl-0.043
 former-0.043
ifying-0.042
 seventeen-0.039
Token and
Token ID290
Feature activation-1.15e-02

Pos logit contributions
 under+0.005
,+0.004
.+0.003
 and+0.002
 where+0.001
Neg logit contributions
 disl-0.069
 former-0.067
arre-0.064
 tant-0.061
 Deal-0.060
Token then
Token ID788
Feature activation-8.57e-02

Pos logit contributions
arl+0.028
ocol+0.025
opl+0.025
 former+0.024
 tant+0.023
Neg logit contributions
 it-0.003
 the-0.003
 looked-0.003
 followed-0.002
 played-0.002
Token cl
Token ID537
Feature activation+8.91e-03

Pos logit contributions
 former+0.200
arl+0.191
 disl+0.185
opl+0.184
 tant+0.182
Neg logit contributions
 looked-0.016
 followed-0.016
 went-0.016
 it-0.016
 the-0.015
Tokenapped
Token ID6320
Feature activation+3.17e-02

Pos logit contributions
 Professional+0.005
ALTH+0.005
Code+0.005
 Mueller+0.004
 vodka+0.004
Neg logit contributions
 former-0.014
 disl-0.013
 tant-0.012
 ago-0.011
ifying-0.011
Token their
Token ID511
Feature activation-1.45e-02

Pos logit contributions
 for+0.003
 their+0.002
.+0.002
 and+0.001
 the-0.000
Neg logit contributions
 former-0.074
 disl-0.071
pill-0.071
opl-0.070
isk-0.069
Token hands
Token ID2832
Feature activation-1.60e-02

Pos logit contributions
 former+0.028
 disl+0.028
opl+0.024
 ago+0.023
bar+0.023
Neg logit contributions
Code-0.003
 Professional-0.002
ALTH-0.002
 vodka-0.002
 capit-0.002
Token in
Token ID287
Feature activation-4.00e-03

Pos logit contributions
 disl+0.038
 former+0.035
 tant+0.031
isk+0.030
bel+0.029
Neg logit contributions
.-0.001
 under-0.000
,-0.000
 and+0.000
 in+0.001
Token have
Token ID423
Feature activation-1.61e-02

Pos logit contributions
 former+0.071
 tant+0.061
 disl+0.059
eh+0.056
ifying+0.055
Neg logit contributions
 are-0.003
 things-0.002
 is-0.002
,-0.002
 according-0.001
Token cows
Token ID22575
Feature activation-4.78e-02

Pos logit contributions
 tant+0.032
 former+0.032
 disl+0.029
ci+0.029
ifying+0.028
Neg logit contributions
 animals-0.006
 many-0.003
 more-0.003
 food-0.002
 colors-0.002
Token,
Token ID11
Feature activation+1.85e-02

Pos logit contributions
 former+0.113
 tant+0.103
igsaw+0.092
 disl+0.088
ifying+0.085
Neg logit contributions
,-0.010
 and-0.007
 because-0.004
 are-0.004
 animals-0.002
Token pigs
Token ID22333
Feature activation-2.50e-02

Pos logit contributions
 animals+0.006
Code+0.004
 trains+0.003
 stal+0.003
 but+0.002
Neg logit contributions
 former-0.038
igsaw-0.035
ifying-0.034
 tant-0.034
handed-0.034
Token,
Token ID11
Feature activation+1.94e-02

Pos logit contributions
 former+0.067
 tant+0.062
igsaw+0.055
ifying+0.054
 disl+0.054
Neg logit contributions
,-0.003
 and-0.000
 because+0.004
.+0.005
 looking+0.005
Token chickens
Token ID25972
Feature activation-5.10e-02

Pos logit contributions
 animals+0.006
Code+0.004
 and+0.004
 stal+0.003
 dogs+0.002
Neg logit contributions
 former-0.041
igsaw-0.038
spe-0.036
 tant-0.036
ifying-0.036
Token,
Token ID11
Feature activation+1.97e-02

Pos logit contributions
 former+0.143
 tant+0.129
igsaw+0.123
 disl+0.120
ifying+0.118
Neg logit contributions
,-0.003
 and-0.000
.+0.013
 but+0.014
 because+0.015
Token and
Token ID290
Feature activation+2.14e-03

Pos logit contributions
 animals+0.004
 and+0.004
Code+0.003
 but+0.001
 toys+0.001
Neg logit contributions
 former-0.043
igsaw-0.040
spe-0.039
 tant-0.038
ifying-0.038
Token sheep
Token ID15900
Feature activation-2.18e-02

Pos logit contributions
 animals+0.001
 more+0.001
 stal+0.000
 things+0.000
 toys+0.000
Neg logit contributions
 former-0.004
igsaw-0.004
 tant-0.004
und-0.004
arl-0.004
Token.
Token ID13
Feature activation+7.27e-03

Pos logit contributions
 former+0.048
 tant+0.047
igsaw+0.042
 amaz+0.040
U+0.038
Neg logit contributions
.-0.005
,-0.005
 and-0.004
 animals-0.004
 to-0.002
Token They
Token ID1119
Feature activation+1.52e-02

Pos logit contributions
 Professional+0.004
ALTH+0.003
 Vend+0.003
 Mueller+0.003
Code+0.002
Neg logit contributions
 former-0.012
 disl-0.011
spe-0.011
ifying-0.011
bel-0.011
Token out
Token ID503
Feature activation+7.17e-03

Pos logit contributions
 former+0.087
 disl+0.086
 tant+0.082
ifying+0.082
arl+0.080
Neg logit contributions
 Professional-0.010
 Mueller-0.009
ALTH-0.008
,-0.008
 solve-0.007
Token of
Token ID286
Feature activation-2.56e-02

Pos logit contributions
,+0.001
 under+0.001
 and+0.000
 of+0.000
 Professional-0.000
Neg logit contributions
 former-0.016
 disl-0.015
 habitats-0.013
 Majesty-0.013
 tant-0.013
Token the
Token ID262
Feature activation+1.78e-03

Pos logit contributions
 former+0.053
 disl+0.052
bel+0.047
ifying+0.047
arre+0.046
Neg logit contributions
 Professional-0.004
Code-0.002
 the-0.001
 breath-0.000
 characteristic-0.000
Token freeze
Token ID16611
Feature activation-2.65e-02

Pos logit contributions
Code+0.001
 way+0.000
 Professional+0.000
ALTH+0.000
 hole+0.000
Neg logit contributions
 disl-0.003
 former-0.003
bel-0.003
ifying-0.003
spe-0.003
Token and
Token ID290
Feature activation-1.19e-02

Pos logit contributions
 former+0.054
 disl+0.052
 Deal+0.049
arl+0.046
 Amen+0.044
Neg logit contributions
 and-0.008
.-0.007
,-0.007
 way-0.007
!-0.006
Token soon
Token ID2582
Feature activation-7.40e-02

Pos logit contributions
arl+0.028
opl+0.027
igsaw+0.026
ifying+0.025
 former+0.025
Neg logit contributions
 the-0.004
 it-0.003
 helped-0.002
 covered-0.002
 looked-0.002
Token it
Token ID340
Feature activation+2.28e-02

Pos logit contributions
 former+0.142
 disl+0.127
 tant+0.123
ifying+0.121
arl+0.115
Neg logit contributions
ALTH-0.026
 Professional-0.026
Code-0.025
 vodka-0.021
 Mueller-0.018
Token was
Token ID373
Feature activation-3.05e-02

Pos logit contributions
ALTH+0.003
 led+0.002
 looked+0.002
Code+0.002
 covered+0.002
Neg logit contributions
 disl-0.049
 former-0.049
arl-0.045
 tant-0.044
ocol-0.043
Token playing
Token ID2712
Feature activation-5.46e-03

Pos logit contributions
 disl+0.056
 tant+0.052
 former+0.051
bel+0.048
ro+0.047
Neg logit contributions
 vodka-0.012
ALTH-0.012
 Professional-0.010
 deleting-0.008
 adjourn-0.008
Token again
Token ID757
Feature activation-1.54e-02

Pos logit contributions
 disl+0.012
 former+0.012
arl+0.011
ifying+0.011
aj+0.011
Neg logit contributions
 under-0.001
.-0.001
 in-0.001
 near-0.000
 among-0.000
Token.
Token ID13
Feature activation-8.99e-03

Pos logit contributions
 tant+0.042
 disl+0.042
 aids+0.040
arl+0.040
 paramed+0.039
Neg logit contributions
.-0.001
!+0.001
,+0.002
 and+0.003
 the+0.004
Token is
Token ID318
Feature activation+9.51e-03

Pos logit contributions
ALTH+0.014
Code+0.014
 Professional+0.014
 Mueller+0.013
 vodka+0.013
Neg logit contributions
 former-0.030
 disl-0.026
 tant-0.025
ifying-0.025
bel-0.024
Token from
Token ID422
Feature activation-2.52e-03

Pos logit contributions
 vodka+0.003
ALTH+0.003
 under+0.002
 Professional+0.002
Code+0.002
Neg logit contributions
 disl-0.017
 former-0.017
 tant-0.017
arre-0.016
spe-0.016
Token a
Token ID257
Feature activation-1.37e-02

Pos logit contributions
arre+0.006
spe+0.005
 disl+0.005
anne+0.005
cases+0.005
Neg logit contributions
 the-0.000
 gold-0.000
 looking-0.000
 a-0.000
 under-0.000
Token long
Token ID890
Feature activation-4.26e-02

Pos logit contributions
 disl+0.025
 former+0.024
arre+0.023
und+0.022
bel+0.022
Neg logit contributions
 treasure-0.004
 place-0.003
 cave-0.003
 coin-0.003
 way-0.003
Token time
Token ID640
Feature activation-2.28e-02

Pos logit contributions
 disl+0.087
 former+0.085
bel+0.081
ifying+0.078
arre+0.073
Neg logit contributions
 way-0.009
 place-0.008
 spot-0.007
 hole-0.006
 box-0.006
Token ago
Token ID2084
Feature activation-2.75e-01

Pos logit contributions
 tant+0.038
 former+0.038
 disl+0.036
isk+0.034
bel+0.033
Neg logit contributions
 looking-0.017
 under-0.015
 for-0.015
.-0.014
 and-0.014
Token,
Token ID11
Feature activation+1.35e-02

Pos logit contributions
 former+0.655
 disl+0.573
 tant+0.571
ifying+0.562
arre+0.519
Neg logit contributions
,-0.013
.-0.007
ALTH-0.003
 looking+0.008
 and+0.009
Token when
Token ID618
Feature activation-2.73e-02

Pos logit contributions
 but+0.003
 under+0.001
 it+0.001
 and+0.001
 because+0.000
Neg logit contributions
spe-0.030
isk-0.030
igsaw-0.030
 tant-0.029
 former-0.029
Token people
Token ID661
Feature activation+1.55e-03

Pos logit contributions
 tant+0.067
arre+0.062
 former+0.062
 disl+0.061
spe+0.059
Neg logit contributions
 it-0.005
 things-0.002
 we-0.001
 the+0.001
 you+0.002
Token used
Token ID973
Feature activation+4.37e-02

Pos logit contributions
 look+0.000
 vodka+0.000
 ask+0.000
ALTH+0.000
 get+0.000
Neg logit contributions
 former-0.003
spe-0.003
 tant-0.003
 disl-0.003
ifying-0.003
Token different
Token ID1180
Feature activation+1.85e-02

Pos logit contributions
 it+0.004
 things-0.000
 Professional-0.001
 vodka-0.002
 the-0.002
Neg logit contributions
 former-0.095
bel-0.086
 disl-0.086
arre-0.085
spe-0.084
Token is
Token ID318
Feature activation+4.60e-03

Pos logit contributions
 is+0.000
 looks-0.000
,-0.000
 things-0.000
!-0.000
Neg logit contributions
 former-0.006
ifying-0.005
 disl-0.005
 tant-0.005
 stret-0.005
Token from
Token ID422
Feature activation-5.84e-03

Pos logit contributions
 very+0.001
 going+0.001
 for+0.001
 too+0.001
 more+0.001
Neg logit contributions
 tant-0.010
 former-0.009
 disl-0.009
ro-0.009
bel-0.009
Token a
Token ID257
Feature activation-1.94e-02

Pos logit contributions
 disl+0.013
arre+0.012
 tant+0.012
bel+0.012
 former+0.012
Neg logit contributions
 the-0.001
 my-0.001
 a-0.001
 under-0.001
 it-0.000
Token long
Token ID890
Feature activation-4.50e-02

Pos logit contributions
 disl+0.035
 former+0.033
bel+0.030
isk+0.030
aj+0.030
Neg logit contributions
 way-0.006
 very-0.006
 place-0.005
 cave-0.005
ALTH-0.005
Token time
Token ID640
Feature activation-2.81e-02

Pos logit contributions
 disl+0.092
 former+0.090
bel+0.083
Aud+0.079
ifying+0.077
Neg logit contributions
 way-0.011
 place-0.007
,-0.007
 time-0.005
 hole-0.003
Token ago
Token ID2084
Feature activation-2.78e-01

Pos logit contributions
 tant+0.064
 former+0.059
 disl+0.059
bel+0.054
Mic+0.052
Neg logit contributions
.-0.008
,-0.007
!-0.007
."-0.006
 looking-0.005
Token,
Token ID11
Feature activation+5.29e-03

Pos logit contributions
 former+0.746
 tant+0.664
 disl+0.653
ifying+0.646
Aud+0.600
Neg logit contributions
,-0.025
.+0.002
 and+0.027
."+0.029
!+0.032
Token from
Token ID422
Feature activation-8.88e-03

Pos logit contributions
 but+0.002
 it+0.001
 and+0.001
 because+0.001
 I+0.001
Neg logit contributions
 tant-0.012
 former-0.011
 disl-0.011
opl-0.011
bel-0.011
Token when
Token ID618
Feature activation-3.61e-02

Pos logit contributions
arre+0.020
 disl+0.019
pill+0.019
 tant+0.018
age+0.018
Neg logit contributions
 the-0.002
 under-0.001
 my-0.001
 a-0.001
 upstairs-0.001
Token dinosaurs
Token ID27199
Feature activation+4.05e-02

Pos logit contributions
 tant+0.091
 disl+0.083
 former+0.081
isk+0.080
bel+0.077
Neg logit contributions
 it-0.007
 we-0.004
 I-0.002
 the-0.000
 you-0.000
Token lived
Token ID5615
Feature activation-2.70e-02

Pos logit contributions
ALTH+0.010
,+0.009
 Professional+0.005
 vodka+0.005
?"+0.004
Neg logit contributions
 former-0.090
 disl-0.079
 tant-0.071
ifying-0.068
bel-0.066
Token it
Token ID340
Feature activation+3.04e-02

Pos logit contributions
 it+0.001
 the-0.000
 things-0.001
 under-0.001
 more-0.001
Neg logit contributions
 former-0.017
 disl-0.016
arre-0.015
aj-0.015
arl-0.015
Token away
Token ID1497
Feature activation-4.62e-02

Pos logit contributions
 under+0.009
 Professional+0.007
 Mueller+0.007
ultane+0.007
ALTH+0.006
Neg logit contributions
 disl-0.055
 former-0.054
ifying-0.049
spe-0.047
 tant-0.047
Token in
Token ID287
Feature activation+6.38e-03

Pos logit contributions
 disl+0.101
 former+0.093
arre+0.089
 tant+0.088
arl+0.087
Neg logit contributions
 under-0.002
.-0.002
 among-0.002
,+0.003
 according+0.003
Token a
Token ID257
Feature activation-1.23e-02

Pos logit contributions
 Professional+0.001
expensive+0.001
ALTH+0.000
 the+0.000
Code+0.000
Neg logit contributions
 disl-0.012
 former-0.012
bel-0.012
arre-0.011
drops-0.011
Token wheel
Token ID7825
Feature activation-1.33e-02

Pos logit contributions
 disl+0.021
 former+0.020
bel+0.019
 ago+0.018
ro+0.018
Neg logit contributions
ALTH-0.004
 Professional-0.004
 new-0.003
 vodka-0.003
 place-0.003
Tokenbar
Token ID5657
Feature activation-2.19e-01

Pos logit contributions
 disl+0.036
 former+0.034
 tant+0.030
 Deal+0.029
 aids+0.029
Neg logit contributions
.-0.000
!+0.002
,+0.003
 and+0.004
 for+0.004
Tokenrow
Token ID808
Feature activation+2.17e-02

Pos logit contributions
 former+0.372
 disl+0.367
ifying+0.326
spe+0.305
 tant+0.295
Neg logit contributions
 Professional-0.087
ALTH-0.086
Code-0.085
 Mueller-0.085
 vodka-0.069
Token.
Token ID13
Feature activation+9.70e-04

Pos logit contributions
.+0.000
 among-0.003
!-0.003
 under-0.003
,-0.004
Neg logit contributions
 disl-0.056
 former-0.050
 tant-0.046
Aud-0.045
ocol-0.044
Token
Token ID198
Feature activation+3.98e-03

Pos logit contributions
 Professional+0.001
ALTH+0.001
Code+0.001
 Mueller+0.001
 vodka+0.001
Neg logit contributions
 former-0.001
 disl-0.001
 tant-0.001
ifying-0.001
bel-0.001
Token
Token ID198
Feature activation+4.36e-03

Pos logit contributions
ALTH+0.004
Code+0.004
 Professional+0.004
 Mueller+0.004
 vodka+0.004
Neg logit contributions
 former-0.004
 disl-0.004
 tant-0.003
 ago-0.003
bel-0.003
TokenOnce
Token ID7454
Feature activation+4.15e-02

Pos logit contributions
 Professional+0.004
ALTH+0.004
Code+0.004
 Mueller+0.004
 vodka+0.004
Neg logit contributions
 former-0.005
 disl-0.004
spe-0.004
bel-0.004
ifying-0.004
Token When
Token ID1649
Feature activation+6.03e-02

Pos logit contributions
+0.003
 She+0.003
 It+0.002
 They+0.002
 The+0.002
Neg logit contributions
 former-0.072
 tant-0.069
bel-0.063
arre-0.062
ache-0.062
Token Daisy
Token ID40355
Feature activation-2.93e-02

Pos logit contributions
 it+0.011
 the+0.006
 they+0.004
 he+0.004
 she+0.001
Neg logit contributions
 former-0.154
 tant-0.147
arre-0.143
arl-0.139
rig-0.137
Token put
Token ID1234
Feature activation+3.95e-02

Pos logit contributions
 former+0.067
 disl+0.058
 tant+0.056
arl+0.055
ifying+0.055
Neg logit contributions
 opened-0.007
 looked-0.006
 Professional-0.005
 vodka-0.005
 lay-0.005
Token the
Token ID262
Feature activation+2.95e-02

Pos logit contributions
 it+0.004
 the+0.002
 under+0.001
 on+0.001
 her-0.003
Neg logit contributions
 former-0.112
spe-0.106
 disl-0.105
arre-0.094
bel-0.091
Token magn
Token ID7842
Feature activation+3.10e-02

Pos logit contributions
 box+0.012
 coin+0.008
 things+0.007
 book+0.007
 egg+0.006
Neg logit contributions
 former-0.060
 disl-0.059
isk-0.056
arre-0.054
und-0.053
Tokenifying
Token ID4035
Feature activation-2.42e-01

Pos logit contributions
 under+0.028
ultane+0.027
Code+0.026
hole+0.026
 Professional+0.026
Neg logit contributions
 disl-0.052
 former-0.043
isk-0.032
 Majesty-0.028
opl-0.028
Token glass
Token ID5405
Feature activation+3.72e-02

Pos logit contributions
 disl+0.468
 former+0.453
spe+0.433
 supervision+0.433
isk+0.401
Neg logit contributions
 Professional-0.039
 glass-0.035
Code-0.024
ALTH-0.017
 Shards-0.015
Token down
Token ID866
Feature activation+7.82e-03

Pos logit contributions
,+0.006
 box+0.001
 and-0.002
.-0.002
 jar-0.005
Neg logit contributions
 former-0.102
 disl-0.095
 habitats-0.086
isk-0.083
 afar-0.083
Token,
Token ID11
Feature activation+2.05e-02

Pos logit contributions
,+0.001
 the-0.000
 under-0.000
 it-0.001
 but-0.001
Neg logit contributions
 former-0.021
 disl-0.021
arre-0.020
isk-0.018
making-0.018
Token the
Token ID262
Feature activation+1.30e-02

Pos logit contributions
 it+0.002
 she+0.001
 the-0.000
 there-0.003
 her-0.004
Neg logit contributions
 former-0.059
arl-0.058
 seventeen-0.054
aj-0.053
ifying-0.052
Token cater
Token ID20825
Feature activation-2.67e-02

Pos logit contributions
 box+0.005
 girl+0.005
 boy+0.004
 things+0.003
 animals+0.003
Neg logit contributions
 disl-0.025
 former-0.024
arre-0.023
und-0.023
isk-0.023
Token they
Token ID484
Feature activation+1.06e-03

Pos logit contributions
 the+0.005
 it+0.005
 he+0.002
 they-0.002
 all-0.003
Neg logit contributions
arl-0.085
 former-0.081
 disl-0.079
aj-0.075
 stret-0.072
Token just
Token ID655
Feature activation-1.57e-02

Pos logit contributions
 looked+0.000
 tried+0.000
 knew+0.000
 listened+0.000
 all+0.000
Neg logit contributions
 former-0.002
 disl-0.002
igsaw-0.002
arl-0.002
opl-0.002
Token kept
Token ID4030
Feature activation-9.75e-03

Pos logit contributions
 former+0.033
 disl+0.031
bel+0.030
 tant+0.029
arl+0.029
Neg logit contributions
 looked-0.004
 compared-0.003
 blinked-0.002
 tried-0.002
 covered-0.002
Token on
Token ID319
Feature activation-1.03e-02

Pos logit contributions
 former+0.021
ro+0.020
bel+0.020
ifying+0.020
tr+0.019
Neg logit contributions
 looking-0.003
 trying-0.002
 going-0.002
 moving-0.001
 being-0.001
Token qu
Token ID627
Feature activation-1.38e-02

Pos logit contributions
 former+0.018
ifying+0.018
ro+0.017
arl+0.017
bel+0.017
Neg logit contributions
 looking-0.003
 trying-0.002
 admitting-0.002
 Mueller-0.002
 listening-0.002
Tokenarre
Token ID9624
Feature activation-2.12e-01

Pos logit contributions
 former+0.012
 disl+0.010
 ago+0.009
ifying+0.009
 tant+0.009
Neg logit contributions
ALTH-0.017
Code-0.017
 Professional-0.017
 Mueller-0.016
 vodka-0.016
Tokenling
Token ID1359
Feature activation-5.50e-02

Pos logit contributions
 former+0.309
 disl+0.269
ifying+0.260
 ago+0.249
ro+0.224
Neg logit contributions
Code-0.151
ALTH-0.150
 Professional-0.142
 Mueller-0.139
 vodka-0.137
Token!
Token ID0
Feature activation+9.72e-03

Pos logit contributions
 former+0.137
aj+0.129
arl+0.129
 disl+0.123
aman+0.122
Neg logit contributions
.-0.009
,+0.001
 and+0.001
 the+0.003
 in+0.004
Token
Token ID220
Feature activation-4.54e-02

Pos logit contributions
 Professional+0.003
 Mueller+0.002
ALTH+0.001
 In+0.001
 Ded+0.001
Neg logit contributions
 former-0.019
 disl-0.019
bel-0.018
 tant-0.017
spe-0.016
Token
Token ID198
Feature activation-1.31e-04

Pos logit contributions
 former+0.100
 disl+0.092
 tant+0.091
ifying+0.088
spe+0.083
Neg logit contributions
 Professional-0.006
ALTH-0.005
-0.000
 Ded+0.001
 Mueller+0.001
Token
Token ID198
Feature activation-3.28e-06

Pos logit contributions
 former+0.000
 disl+0.000
ifying+0.000
 tant+0.000
 ago+0.000
Neg logit contributions
 Professional-0.000
Code-0.000
ALTH-0.000
 Ded-0.000
 Mueller-0.000