TOP ACTIVATIONS
MAX = 0.063

friends . Tim learned that school was a fun place to
loved to play make - bel ieve . Today she was
Mom replied , " The navy can sail across the ocean
for taking her to the carn ival . They hugged her
little girl soon realized that school was a place of friendship
That 's why the school kept shining
waved at the famous aer opl ane as it flew away
y roof . Your new classroom is modern and bright .
. She explained that the computers were connected to a network
always remember how fragile the city was . <|endoftext|> Once upon
. " Your dolls are both beautiful ," Mom
also learned that meeting new people can be fun and that
and they walked through the carn ival gates , while Max
airport , Lily saw many aer opl anes parked outside .
day because he knew that school was a reliable place where
Max loved being at the carn ival so much that he
explain to him that the trains are not for eating ,
They waved at the famous aer opl ane as it flew
. She was 3 - years - old . Every day
As the aer opl ane took off , Lucy

MIN ACTIVATIONS
MIN = -0.064

forest and looking for a carrot trail ," replied Benny .
It bit me . It bit me ." Tom
the berries and have something delicious to eat . But first
Tom , I found a carrot !" she said , holding
Max shouted . " It bit me ! It bit me
me , Tom . It bit me . It bit me
have enough fruits for a pie ," Ben said .
emerged with a big juicy carrot . He held
" O uch , it bit me !" Ben cried .
, ow ! The dog bit me , Lily ! It
It bit me ! It bit me !" He
! I brought you a carrot !" Harriet opened her mouth
m unch ing on a carrot . Suddenly ,
west and twist off a carrot from the ground . That
, ow ! The lizard bit me ! It hurt me
cried . " The ducks bit me ! They are mean
the pr une . It bit me ." She
! I found a bigger carrot !" Lily says . She
it bit me , it bit me !" she sob bed
" Help , help , dad !" Ben screams

INTERVAL 0.031 - 0.063
CONTAINS 0.307%

she went she would ask lots of questions and discover new
a par rot . The par rot was ch ubby and
and make sounds when they press a button . Lily likes
she grows up , she can choose a job that has
at his adventure . He couldn 't wait to go back

INTERVAL -0.001 - 0.031
CONTAINS 52.337%

being together . Motor loved playing with M abel and feeling
and wondered if there might be something fascinating inside .
. <|endoftext|> Once upon a time there was a large patch
. She says she hopes he is healthy soon . <|endoftext|>
was a little girl named Lily . Lily loved her blanket

INTERVAL -0.032 - -0.001
CONTAINS 47.139%

Lily saw him and said , " No , Tom ,
It came closer and closer , until it was right in
hugged her and said , " Don 't worry , Lily
man was so happy ! He waved his arms and shouted
ordered ! <|endoftext|> Once upon a time , in a small

INTERVAL -0.064 - -0.032
CONTAINS 0.217%

These fruits are y ummy !" L ila said , biting
soldier : " Can you help me deliver this package ?".
that looked like a big carrot . He smiled and thought
. They gave him a carrot nose and two rocks for
Look at me , Sue !" he shouts . " I
Token friends
Token ID2460
Feature activation+9.84e-03

Pos logit contributions
 studies+0.002
 university+0.002
 studying+0.002
ifting+0.001
old+0.001
Neg logit contributions
Br-0.002
 elbows-0.001
fl-0.001
mar-0.001
 fart-0.001
Token.
Token ID13
Feature activation-1.12e-02

Pos logit contributions
 studies+0.018
 university+0.016
ifting+0.016
 studying+0.016
old+0.016
Neg logit contributions
Br-0.013
 elbows-0.011
fl-0.010
mar-0.010
ety-0.010
Token Tim
Token ID5045
Feature activation+3.17e-04

Pos logit contributions
Br+0.021
 elbows+0.018
fl+0.017
mar+0.017
icles+0.017
Neg logit contributions
 studies-0.015
 university-0.014
 studying-0.013
old-0.013
ifting-0.013
Token learned
Token ID4499
Feature activation+1.92e-02

Pos logit contributions
 university+0.001
 studies+0.000
ifting+0.000
 amaz+0.000
igsaw+0.000
Neg logit contributions
Br-0.000
ety-0.000
fl-0.000
 forth-0.000
mar-0.000
Token that
Token ID326
Feature activation+1.06e-02

Pos logit contributions
 studies+0.025
 amaz+0.022
ifting+0.021
 studying+0.020
 university+0.020
Neg logit contributions
Br-0.035
mar-0.031
fl-0.030
icles-0.029
 fart-0.029
Token school
Token ID1524
Feature activation+6.29e-02

Pos logit contributions
 studies+0.017
old+0.016
 university+0.015
ifting+0.015
ver+0.015
Neg logit contributions
Br-0.016
fl-0.013
 elbows-0.013
 forth-0.013
 fart-0.013
Token was
Token ID373
Feature activation-6.24e-03

Pos logit contributions
 amaz+0.108
ifting+0.104
 arrest+0.101
 university+0.097
 church+0.097
Neg logit contributions
 fart-0.068
fl-0.066
 ones-0.064
mar-0.063
 forth-0.062
Token a
Token ID257
Feature activation+4.62e-03

Pos logit contributions
Br+0.009
 fart+0.007
ety+0.007
mar+0.007
Fin+0.007
Neg logit contributions
 studies-0.011
ifting-0.010
old-0.009
 university-0.009
 amaz-0.009
Token fun
Token ID1257
Feature activation-2.87e-03

Pos logit contributions
 studies+0.007
ifting+0.007
old+0.007
 studying+0.006
 university+0.006
Neg logit contributions
Br-0.007
 fart-0.006
 elbows-0.006
mar-0.006
fl-0.006
Token place
Token ID1295
Feature activation+1.32e-02

Pos logit contributions
Br+0.004
 elbows+0.003
icles+0.003
fl+0.003
mar+0.003
Neg logit contributions
 studies-0.005
 university-0.005
 studying-0.005
old-0.005
ifting-0.004
Token to
Token ID284
Feature activation+9.83e-03

Pos logit contributions
 studies+0.023
ifting+0.022
 university+0.022
 amaz+0.021
old+0.021
Neg logit contributions
Br-0.015
fl-0.014
mar-0.013
 fart-0.013
cr-0.013
Token loved
Token ID6151
Feature activation+1.23e-02

Pos logit contributions
 university+0.023
 studies+0.023
ifting+0.022
igsaw+0.022
 amaz+0.022
Neg logit contributions
Br-0.018
fl-0.016
mar-0.016
ety-0.015
 forth-0.015
Token to
Token ID284
Feature activation-9.35e-04

Pos logit contributions
ver+0.021
 studies+0.021
ifting+0.019
old+0.019
ience+0.019
Neg logit contributions
Br-0.016
fl-0.015
 elbows-0.013
 forth-0.013
icles-0.013
Token play
Token ID711
Feature activation+2.70e-02

Pos logit contributions
Br+0.001
 elbows+0.001
mar+0.001
fl+0.001
 forth+0.001
Neg logit contributions
ver-0.002
ifting-0.002
ival-0.002
year-0.002
osing-0.002
Token make
Token ID787
Feature activation+2.21e-02

Pos logit contributions
 studies+0.035
 university+0.033
 studying+0.032
ifting+0.030
 reading+0.029
Neg logit contributions
Br-0.051
 elbows-0.045
ety-0.043
igi-0.043
icles-0.043
Token-
Token ID12
Feature activation+1.73e-02

Pos logit contributions
 studies+0.037
 university+0.034
 studying+0.033
ifting+0.032
old+0.032
Neg logit contributions
Br-0.033
 elbows-0.027
fl-0.027
icles-0.026
 fart-0.026
Tokenbel
Token ID6667
Feature activation+5.93e-02

Pos logit contributions
 studies+0.029
 university+0.027
 studying+0.026
ifting+0.025
 amaz+0.025
Neg logit contributions
Br-0.026
 elbows-0.021
fl-0.021
mar-0.020
ety-0.020
Tokenieve
Token ID12311
Feature activation+2.16e-02

Pos logit contributions
 studies+0.071
 amaz+0.069
ience+0.063
igsaw+0.062
 aston+0.060
Neg logit contributions
Br-0.098
 elbows-0.090
mar-0.087
 forth-0.086
 fart-0.086
Token.
Token ID13
Feature activation+1.86e-02

Pos logit contributions
 studies+0.037
ifting+0.034
old+0.034
ver+0.033
 amaz+0.033
Neg logit contributions
Br-0.027
fl-0.025
 fart-0.024
 elbows-0.023
mar-0.023
Token Today
Token ID6288
Feature activation+1.55e-02

Pos logit contributions
 studies+0.025
 university+0.023
 studying+0.022
old+0.022
ifting+0.022
Neg logit contributions
Br-0.033
 elbows-0.028
fl-0.028
icles-0.027
ety-0.027
Token she
Token ID673
Feature activation+5.18e-03

Pos logit contributions
 studies+0.021
 university+0.018
old+0.018
ifting+0.018
 studying+0.018
Neg logit contributions
Br-0.027
fl-0.023
mar-0.023
 elbows-0.023
icles-0.023
Token was
Token ID373
Feature activation-8.30e-05

Pos logit contributions
 university+0.009
 studies+0.009
ifting+0.009
 amaz+0.008
 studying+0.008
Neg logit contributions
Br-0.006
mar-0.006
ety-0.005
fl-0.005
olyn-0.005
TokenMom
Token ID29252
Feature activation-7.51e-03

Pos logit contributions
 studies+0.022
 university+0.020
 studying+0.020
ifting+0.020
old+0.019
Neg logit contributions
Br-0.015
 elbows-0.011
mar-0.011
fl-0.011
icles-0.010
Token replied
Token ID8712
Feature activation+8.04e-03

Pos logit contributions
Br+0.009
mar+0.007
icles+0.007
fl+0.007
olyn+0.007
Neg logit contributions
 studies-0.014
 university-0.014
 studying-0.013
ifting-0.012
 mailing-0.012
Token,
Token ID11
Feature activation-4.99e-03

Pos logit contributions
 studies+0.012
 university+0.010
 studying+0.010
old+0.010
ifting+0.009
Neg logit contributions
Br-0.014
mar-0.012
fl-0.012
 elbows-0.012
icles-0.012
Token "
Token ID366
Feature activation-1.96e-03

Pos logit contributions
Br+0.010
 elbows+0.008
fl+0.008
mar+0.008
icles+0.008
Neg logit contributions
 studies-0.006
 university-0.005
old-0.005
 studying-0.005
ifting-0.005
TokenThe
Token ID464
Feature activation+1.37e-02

Pos logit contributions
Br+0.003
 elbows+0.003
 fart+0.002
icles+0.002
mar+0.002
Neg logit contributions
 studies-0.003
 university-0.003
old-0.003
 studying-0.003
ifting-0.003
Token navy
Token ID23956
Feature activation+5.54e-02

Pos logit contributions
 studies+0.023
ifting+0.022
old+0.021
ver+0.021
 university+0.020
Neg logit contributions
Br-0.020
fl-0.016
 fart-0.016
 elbows-0.016
 forth-0.015
Token can
Token ID460
Feature activation+2.48e-02

Pos logit contributions
ifting+0.087
 studies+0.083
old+0.081
 university+0.079
 amaz+0.078
Neg logit contributions
fl-0.069
Br-0.068
suit-0.066
 fart-0.066
 ones-0.064
Token sail
Token ID14936
Feature activation+4.17e-03

Pos logit contributions
 studies+0.042
 studying+0.039
 university+0.039
old+0.039
 amaz+0.038
Neg logit contributions
Br-0.035
icles-0.029
mar-0.028
cr-0.028
 fart-0.026
Token across
Token ID1973
Feature activation+2.43e-02

Pos logit contributions
 studies+0.007
 studying+0.006
ifting+0.006
 university+0.006
old+0.006
Neg logit contributions
Br-0.006
 fart-0.005
fl-0.005
icles-0.005
mar-0.005
Token the
Token ID262
Feature activation+5.89e-03

Pos logit contributions
 studies+0.036
ver+0.031
 studying+0.030
old+0.029
ifting+0.029
Neg logit contributions
Br-0.040
 fart-0.035
icles-0.034
fl-0.034
 elbows-0.034
Token ocean
Token ID9151
Feature activation+4.75e-03

Pos logit contributions
 studies+0.009
 university+0.008
 studying+0.008
old+0.008
ifting+0.008
Neg logit contributions
Br-0.009
 elbows-0.008
fl-0.008
mar-0.008
 fart-0.008
Token for
Token ID329
Feature activation-6.78e-03

Pos logit contributions
ifting+0.032
 studies+0.032
 university+0.030
 amaz+0.029
ieving+0.028
Neg logit contributions
 forth-0.011
 away-0.011
Br-0.010
fl-0.010
ety-0.009
Token taking
Token ID2263
Feature activation-3.76e-03

Pos logit contributions
Br+0.012
 elbows+0.011
 forth+0.010
mar+0.010
icles+0.010
Neg logit contributions
 studies-0.010
 university-0.008
old-0.008
ifting-0.008
ver-0.008
Token her
Token ID607
Feature activation-1.59e-02

Pos logit contributions
Br+0.006
 elbows+0.005
fl+0.005
mar+0.005
icles+0.005
Neg logit contributions
 studies-0.006
 university-0.005
 studying-0.005
old-0.005
ifting-0.005
Token to
Token ID284
Feature activation+3.94e-04

Pos logit contributions
Br+0.016
 fart+0.015
fl+0.015
 away+0.015
 forth+0.014
Neg logit contributions
ifting-0.030
 studies-0.030
 university-0.028
old-0.027
 amaz-0.027
Token the
Token ID262
Feature activation+7.86e-03

Pos logit contributions
 studies+0.001
ience+0.001
ver+0.001
ifting+0.001
old+0.001
Neg logit contributions
Br-0.001
 elbows-0.000
 P-0.000
 avail-0.000
 forth-0.000
Token carn
Token ID18466
Feature activation+5.49e-02

Pos logit contributions
 studies+0.012
ifting+0.011
old+0.011
ver+0.011
 studying+0.010
Neg logit contributions
Br-0.012
 fart-0.011
fl-0.010
 elbows-0.010
 bisc-0.010
Tokenival
Token ID2473
Feature activation+1.12e-02

Pos logit contributions
 amaz+0.044
 studies+0.044
 studying+0.037
 university+0.034
 mailing+0.034
Neg logit contributions
Br-0.119
P-0.112
mar-0.109
Fin-0.104
Carl-0.103
Token.
Token ID13
Feature activation-5.88e-04

Pos logit contributions
 studies+0.019
ifting+0.019
 studying+0.017
old+0.017
 amaz+0.017
Neg logit contributions
Br-0.014
fl-0.012
mar-0.012
 forth-0.012
 fart-0.012
Token They
Token ID1119
Feature activation+5.94e-03

Pos logit contributions
Br+0.001
ety+0.001
fl+0.001
icles+0.001
 elbows+0.001
Neg logit contributions
 studies-0.001
 university-0.001
 studying-0.001
ifting-0.001
old-0.001
Token hugged
Token ID41130
Feature activation+2.73e-03

Pos logit contributions
 studies+0.010
 university+0.010
ifting+0.009
 studying+0.009
old+0.009
Neg logit contributions
Br-0.009
 forth-0.007
mar-0.007
ety-0.007
fl-0.007
Token her
Token ID607
Feature activation+1.23e-02

Pos logit contributions
 studies+0.003
 university+0.003
 studying+0.003
old+0.003
ifting+0.002
Neg logit contributions
Br-0.006
 elbows-0.005
fl-0.005
mar-0.005
 fart-0.005
Token little
Token ID1310
Feature activation-1.18e-02

Pos logit contributions
Br+0.005
 elbows+0.004
 fart+0.004
fl+0.004
icles+0.004
Neg logit contributions
 studies-0.005
ifting-0.004
 studying-0.004
 university-0.004
old-0.004
Token girl
Token ID2576
Feature activation-6.77e-03

Pos logit contributions
Br+0.016
 elbows+0.013
mar+0.012
icles+0.012
fl+0.012
Neg logit contributions
 studies-0.022
 university-0.020
 studying-0.020
old-0.019
ifting-0.019
Token soon
Token ID2582
Feature activation+5.42e-03

Pos logit contributions
Br+0.008
 forth+0.006
fl+0.006
mar+0.006
 elbows+0.006
Neg logit contributions
 university-0.012
ifting-0.012
 studies-0.012
igsaw-0.012
 studying-0.011
Token realized
Token ID6939
Feature activation+1.36e-02

Pos logit contributions
 studies+0.009
 university+0.008
 studying+0.008
ifting+0.008
old+0.008
Neg logit contributions
Br-0.008
 elbows-0.006
ety-0.006
mar-0.006
fl-0.006
Token that
Token ID326
Feature activation+4.37e-03

Pos logit contributions
 studies+0.019
ifting+0.016
old+0.015
ver+0.015
 sells+0.015
Neg logit contributions
Br-0.022
fl-0.021
icles-0.020
 elbows-0.020
mar-0.020
Token school
Token ID1524
Feature activation+5.48e-02

Pos logit contributions
 studies+0.007
ifting+0.006
old+0.006
ver+0.006
ival+0.006
Neg logit contributions
Br-0.007
fl-0.006
 elbows-0.006
 forth-0.006
mar-0.005
Token was
Token ID373
Feature activation-7.21e-03

Pos logit contributions
ifting+0.108
 amaz+0.103
 arrest+0.102
itt+0.098
old+0.098
Neg logit contributions
mar-0.039
fl-0.038
 forth-0.037
Br-0.037
 fart-0.036
Token a
Token ID257
Feature activation+3.67e-03

Pos logit contributions
Br+0.011
 elbows+0.009
 fart+0.009
mar+0.009
fl+0.009
Neg logit contributions
 studies-0.012
old-0.010
ifting-0.010
 university-0.010
 amaz-0.010
Token place
Token ID1295
Feature activation+7.49e-03

Pos logit contributions
 studies+0.005
ifting+0.005
old+0.005
 studying+0.005
ver+0.005
Neg logit contributions
Br-0.006
 fart-0.005
 elbows-0.005
fl-0.005
mar-0.005
Token of
Token ID286
Feature activation+2.20e-03

Pos logit contributions
 studies+0.012
ifting+0.011
 university+0.011
old+0.010
 studying+0.010
Neg logit contributions
Br-0.010
fl-0.009
mar-0.009
 elbows-0.009
 fart-0.009
Token friendship
Token ID14738
Feature activation-1.01e-04

Pos logit contributions
 studies+0.003
ifting+0.003
ver+0.003
old+0.003
ience+0.003
Neg logit contributions
Br-0.003
 elbows-0.003
 fart-0.003
 forth-0.003
fl-0.003
Token\n
Token ID198
Feature activation+1.72e-02

Pos logit contributions
 studies+0.004
 university+0.003
 studying+0.003
ifting+0.003
 amaz+0.003
Neg logit contributions
Br-0.004
mar-0.003
fl-0.003
 elbows-0.003
igi-0.003
TokenThat
Token ID2504
Feature activation+2.42e-02

Pos logit contributions
 studies+0.029
 studying+0.025
 university+0.025
 mailing+0.024
 amaz+0.024
Neg logit contributions
Br-0.029
mar-0.022
Fin-0.020
Carl-0.020
P-0.020
Token's
Token ID338
Feature activation-9.93e-03

Pos logit contributions
 studies+0.050
ifting+0.049
 university+0.048
 arrest+0.048
 studying+0.047
Neg logit contributions
Br-0.022
fl-0.019
mar-0.017
 forth-0.016
olyn-0.015
Token why
Token ID1521
Feature activation+5.73e-03

Pos logit contributions
Br+0.016
fl+0.014
mar+0.014
 forth+0.013
icles+0.013
Neg logit contributions
 studies-0.015
 arrest-0.014
ifting-0.013
old-0.013
 university-0.013
Token the
Token ID262
Feature activation+8.74e-03

Pos logit contributions
 studies+0.009
ifting+0.009
old+0.008
 arrest+0.008
aling+0.008
Neg logit contributions
Br-0.008
mar-0.007
fl-0.007
olyn-0.006
icles-0.006
Token school
Token ID1524
Feature activation+5.47e-02

Pos logit contributions
ifting+0.014
 studies+0.013
old+0.013
ver+0.013
anging+0.012
Neg logit contributions
Br-0.012
fl-0.011
 fart-0.011
 forth-0.010
 ones-0.010
Token kept
Token ID4030
Feature activation+5.50e-03

Pos logit contributions
ifting+0.102
 arrest+0.098
igsaw+0.093
 amaz+0.093
old+0.092
Neg logit contributions
 forth-0.046
mar-0.046
Br-0.046
fl-0.042
 sound-0.042
Token shining
Token ID22751
Feature activation+1.47e-02

Pos logit contributions
 studies+0.008
 university+0.008
old+0.008
ifting+0.007
 studying+0.007
Neg logit contributions
Br-0.009
 elbows-0.008
fl-0.007
mar-0.007
icles-0.007
Token
Token ID6184
Feature activation+2.45e-02

Pos logit contributions
 studies+0.022
old+0.020
ifting+0.019
 university+0.019
 studying+0.019
Neg logit contributions
Br-0.023
fl-0.021
icles-0.020
 elbows-0.020
 forth-0.020
Token
Token ID95
Feature activation+2.59e-02

Pos logit contributions
 amaz+0.035
 glad+0.030
old+0.030
 struggling+0.030
igh+0.029
Neg logit contributions
-0.033
mar-0.031
Br-0.030
ern-0.029
gh-0.029
Token
Token ID26391
Feature activation+1.88e-02

Pos logit contributions
 studies+0.042
old+0.040
 amaz+0.040
 university+0.039
 studying+0.039
Neg logit contributions
Br-0.033
mar-0.028
fl-0.027
-0.027
 fart-0.026
Token waved
Token ID26834
Feature activation+3.08e-02

Pos logit contributions
 studies+0.016
 university+0.015
ifting+0.015
 studying+0.014
igsaw+0.014
Neg logit contributions
Br-0.011
 forth-0.009
mar-0.009
olyn-0.009
ety-0.009
Token at
Token ID379
Feature activation+1.66e-02

Pos logit contributions
 studies+0.042
ifting+0.038
 university+0.037
 studying+0.037
old+0.035
Neg logit contributions
Br-0.055
 elbows-0.047
fl-0.046
 forth-0.046
 away-0.045
Token the
Token ID262
Feature activation+1.94e-02

Pos logit contributions
 studies+0.027
ifting+0.025
 university+0.024
 studying+0.024
ver+0.023
Neg logit contributions
Br-0.025
 elbows-0.022
fl-0.021
 forth-0.020
icles-0.020
Token famous
Token ID5863
Feature activation+3.22e-02

Pos logit contributions
ifting+0.032
 studies+0.031
old+0.030
anging+0.030
smart+0.030
Neg logit contributions
Br-0.025
 sound-0.024
 fart-0.022
 forth-0.021
 away-0.021
Token aer
Token ID9551
Feature activation+5.28e-02

Pos logit contributions
ifting+0.054
ver+0.048
old+0.048
smart+0.047
 studies+0.047
Neg logit contributions
 ones-0.039
 sound-0.036
Br-0.036
 forth-0.035
 away-0.034
Tokenopl
Token ID20106
Feature activation+5.45e-02

Pos logit contributions
 amaz+0.077
 studies+0.065
 aston+0.064
 university+0.060
 young+0.060
Neg logit contributions
rim-0.076
ibi-0.074
Br-0.074
ty-0.073
bur-0.072
Tokenane
Token ID1531
Feature activation+1.73e-02

Pos logit contributions
 amaz+0.094
 studies+0.089
 enchanted+0.083
 aston+0.080
 young+0.080
Neg logit contributions
w-0.068
we-0.068
mar-0.066
fl-0.060
olyn-0.060
Token as
Token ID355
Feature activation+6.19e-04

Pos logit contributions
ifting+0.027
 studies+0.027
old+0.025
 university+0.024
 studying+0.024
Neg logit contributions
 forth-0.023
fl-0.022
Br-0.022
 away-0.022
mar-0.019
Token it
Token ID340
Feature activation-1.12e-02

Pos logit contributions
 studies+0.001
old+0.001
ifting+0.001
 amaz+0.001
osing+0.001
Neg logit contributions
Br-0.001
fl-0.001
 forth-0.001
 avail-0.001
olyn-0.001
Token flew
Token ID13112
Feature activation-2.53e-03

Pos logit contributions
Br+0.016
mar+0.013
fl+0.013
 elbows+0.012
 fart+0.012
Neg logit contributions
 studies-0.020
 university-0.019
 studying-0.019
old-0.018
ifting-0.018
Token away
Token ID1497
Feature activation-1.10e-02

Pos logit contributions
Br+0.004
 elbows+0.003
rim+0.003
igi+0.003
icles+0.003
Neg logit contributions
 university-0.004
 studies-0.004
ver-0.004
 young-0.004
 victorious-0.004
Tokeny
Token ID88
Feature activation-7.41e-03

Pos logit contributions
Br+0.021
 elbows+0.018
fl+0.018
mar+0.018
icles+0.018
Neg logit contributions
 studies-0.027
 university-0.025
 studying-0.024
old-0.024
ifting-0.024
Token roof
Token ID9753
Feature activation+9.07e-03

Pos logit contributions
Br+0.011
 elbows+0.009
fl+0.009
 fart+0.009
icles+0.009
Neg logit contributions
 studies-0.012
 university-0.011
old-0.011
ifting-0.011
 studying-0.011
Token.
Token ID13
Feature activation+4.30e-03

Pos logit contributions
 studies+0.016
 university+0.015
old+0.014
 studying+0.014
ifting+0.014
Neg logit contributions
Br-0.011
icles-0.011
fl-0.011
 elbows-0.010
mar-0.010
Token Your
Token ID3406
Feature activation+1.08e-03

Pos logit contributions
 studies+0.007
 university+0.006
 studying+0.006
ifting+0.006
old+0.006
Neg logit contributions
Br-0.007
 elbows-0.006
icles-0.006
ety-0.005
Wow-0.005
Token new
Token ID649
Feature activation+4.29e-03

Pos logit contributions
ifting+0.002
ver+0.002
 studies+0.002
old+0.002
osing+0.002
Neg logit contributions
Br-0.001
 elbows-0.001
fl-0.001
 fart-0.001
 forth-0.001
Token classroom
Token ID15806
Feature activation+5.44e-02

Pos logit contributions
ifting+0.007
 studies+0.007
ver+0.007
old+0.006
 studying+0.006
Neg logit contributions
Br-0.006
 elbows-0.005
fl-0.005
 fart-0.005
 forth-0.005
Token is
Token ID318
Feature activation-1.53e-03

Pos logit contributions
ifting+0.093
 amaz+0.088
year+0.085
old+0.085
igsaw+0.084
Neg logit contributions
 fart-0.055
 forth-0.054
 ones-0.052
 elbows-0.051
mar-0.051
Token modern
Token ID3660
Feature activation+1.58e-02

Pos logit contributions
Br+0.002
 elbows+0.002
 fart+0.002
 forth+0.002
mar+0.002
Neg logit contributions
 studies-0.002
old-0.002
ifting-0.002
 university-0.002
 arrest-0.002
Token and
Token ID290
Feature activation+1.90e-02

Pos logit contributions
ifting+0.025
 studies+0.024
old+0.023
ver+0.023
ival+0.022
Neg logit contributions
fl-0.020
 ones-0.019
 fart-0.019
Br-0.019
 forth-0.018
Token bright
Token ID6016
Feature activation+1.44e-02

Pos logit contributions
 studies+0.030
ifting+0.030
old+0.029
ver+0.028
 arrest+0.027
Neg logit contributions
Br-0.027
 forth-0.023
 fart-0.023
 elbows-0.023
fl-0.022
Token.
Token ID13
Feature activation+7.54e-03

Pos logit contributions
 studies+0.027
old+0.026
ifting+0.026
ver+0.025
 university+0.024
Neg logit contributions
fl-0.015
Br-0.015
icles-0.014
 ones-0.013
 away-0.013
Token.
Token ID13
Feature activation-2.24e-03

Pos logit contributions
 studies+0.033
ifting+0.033
 amaz+0.031
old+0.031
ience+0.031
Neg logit contributions
Br-0.016
icles-0.015
 fart-0.015
mar-0.015
 forth-0.014
Token She
Token ID1375
Feature activation+3.91e-03

Pos logit contributions
Br+0.004
 elbows+0.003
fl+0.003
icles+0.003
mar+0.003
Neg logit contributions
 studies-0.004
 university-0.003
 studying-0.003
old-0.003
ifting-0.003
Token explained
Token ID4893
Feature activation+7.19e-03

Pos logit contributions
 university+0.007
ifting+0.007
igsaw+0.007
 studies+0.007
 arrest+0.007
Neg logit contributions
Br-0.004
mar-0.003
ety-0.003
 forth-0.003
cr-0.003
Token that
Token ID326
Feature activation+6.26e-03

Pos logit contributions
 studies+0.009
 university+0.007
ifting+0.007
 studying+0.007
ver+0.007
Neg logit contributions
Br-0.013
icles-0.012
fl-0.012
 elbows-0.012
mar-0.012
Token the
Token ID262
Feature activation+2.56e-03

Pos logit contributions
 studies+0.010
ifting+0.009
old+0.009
ver+0.008
 university+0.008
Neg logit contributions
Br-0.009
fl-0.008
icles-0.008
 elbows-0.008
mar-0.008
Token computers
Token ID9061
Feature activation+5.44e-02

Pos logit contributions
ifting+0.004
 studies+0.004
old+0.004
ver+0.003
anging+0.003
Neg logit contributions
Br-0.004
 fart-0.003
fl-0.003
 elbows-0.003
 bisc-0.003
Token were
Token ID547
Feature activation+3.43e-03

Pos logit contributions
ifting+0.098
 studies+0.093
old+0.092
 amaz+0.091
 university+0.090
Neg logit contributions
 fart-0.050
 forth-0.049
mar-0.049
icles-0.049
 away-0.048
Token connected
Token ID5884
Feature activation+1.31e-02

Pos logit contributions
 studies+0.006
ifting+0.005
old+0.005
 university+0.005
 amaz+0.005
Neg logit contributions
Br-0.005
 fart-0.004
 elbows-0.004
icles-0.004
ety-0.004
Token to
Token ID284
Feature activation+8.05e-03

Pos logit contributions
 studies+0.021
 university+0.019
ifting+0.019
 studying+0.019
old+0.019
Neg logit contributions
Br-0.019
fl-0.016
 elbows-0.016
icles-0.016
 fart-0.016
Token a
Token ID257
Feature activation-3.22e-03

Pos logit contributions
 studies+0.012
ifting+0.012
ver+0.011
old+0.011
ience+0.011
Neg logit contributions
Br-0.012
 elbows-0.011
icles-0.010
 fart-0.010
fl-0.010
Token network
Token ID3127
Feature activation+1.93e-02

Pos logit contributions
Br+0.005
 elbows+0.005
 fart+0.004
fl+0.004
mar+0.004
Neg logit contributions
 studies-0.005
 university-0.004
ifting-0.004
 studying-0.004
old-0.004
Token always
Token ID1464
Feature activation+1.22e-02

Pos logit contributions
Br+0.000
mar+0.000
 forth+0.000
 elbows+0.000
bur+0.000
Neg logit contributions
 studies-0.000
old-0.000
ifting-0.000
ver-0.000
 university-0.000
Token remember
Token ID3505
Feature activation+1.37e-02

Pos logit contributions
 studies+0.019
ifting+0.019
old+0.018
ver+0.018
 university+0.018
Neg logit contributions
Br-0.016
 forth-0.015
 elbows-0.014
mar-0.014
 fart-0.013
Token how
Token ID703
Feature activation+8.31e-03

Pos logit contributions
 studies+0.020
ifting+0.018
old+0.018
ver+0.017
ival+0.017
Neg logit contributions
Br-0.020
fl-0.018
 elbows-0.018
mar-0.018
 away-0.018
Token fragile
Token ID21049
Feature activation+1.21e-02

Pos logit contributions
 studies+0.014
 amaz+0.013
igh+0.013
 arrest+0.013
ifting+0.013
Neg logit contributions
Br-0.009
 rid-0.009
ety-0.009
fl-0.008
 forth-0.008
Token the
Token ID262
Feature activation-2.60e-03

Pos logit contributions
 studies+0.019
old+0.017
 university+0.017
ifting+0.016
 studying+0.016
Neg logit contributions
Br-0.018
 elbows-0.016
fl-0.016
icles-0.016
 forth-0.016
Token city
Token ID1748
Feature activation+5.42e-02

Pos logit contributions
Br+0.004
 fart+0.003
fl+0.003
 elbows+0.003
 forth+0.003
Neg logit contributions
 studies-0.004
old-0.004
ifting-0.004
ver-0.004
 university-0.004
Token was
Token ID373
Feature activation+8.93e-03

Pos logit contributions
 studies+0.098
 arrest+0.098
ifting+0.095
old+0.094
 Many+0.092
Neg logit contributions
fl-0.052
 forth-0.051
mar-0.048
 away-0.046
 fart-0.046
Token.
Token ID13
Feature activation+3.87e-03

Pos logit contributions
 studies+0.016
old+0.015
ifting+0.015
 amaz+0.015
 university+0.015
Neg logit contributions
Br-0.009
fl-0.009
 away-0.009
 fart-0.009
mar-0.009
Token<|endoftext|>
Token ID50256
Feature activation+1.17e-04

Pos logit contributions
 studies+0.006
 university+0.006
 arrest+0.006
ifting+0.006
 studying+0.006
Neg logit contributions
Br-0.006
ety-0.004
 elbows-0.004
mar-0.004
Wow-0.004
TokenOnce
Token ID7454
Feature activation+1.11e-03

Pos logit contributions
 studies+0.000
 university+0.000
 studying+0.000
old+0.000
ifting+0.000
Neg logit contributions
Br-0.000
 elbows-0.000
 fart-0.000
icles-0.000
mar-0.000
Token upon
Token ID2402
Feature activation-1.35e-03

Pos logit contributions
 studies+0.002
 university+0.002
 studying+0.002
 arrest+0.002
ifting+0.002
Neg logit contributions
Br-0.001
ety-0.001
mar-0.001
icles-0.001
 elbows-0.001
Token.
Token ID13
Feature activation+8.78e-03

Pos logit contributions
 studies+0.007
 university+0.007
ifting+0.007
ver+0.007
old+0.007
Neg logit contributions
Br-0.006
fl-0.005
 away-0.005
mar-0.005
 elbows-0.005
Token\n
Token ID198
Feature activation-2.75e-05

Pos logit contributions
 studies+0.015
 studying+0.013
 university+0.013
old+0.013
ifting+0.012
Neg logit contributions
Br-0.013
 elbows-0.011
fl-0.010
Look-0.010
Wow-0.010
Token\n
Token ID198
Feature activation+1.04e-02

Pos logit contributions
Br+0.000
 elbows+0.000
mar+0.000
fl+0.000
icles+0.000
Neg logit contributions
 studies-0.000
 university-0.000
 studying-0.000
ifting-0.000
old-0.000
Token"
Token ID1
Feature activation-2.10e-03

Pos logit contributions
 studies+0.015
 university+0.013
 studying+0.013
 amaz+0.013
ver+0.012
Neg logit contributions
Br-0.018
 elbows-0.015
mar-0.015
P-0.014
Carl-0.014
TokenYour
Token ID7120
Feature activation+8.97e-03

Pos logit contributions
Br+0.003
 elbows+0.002
icles+0.002
 fart+0.002
fl+0.002
Neg logit contributions
 studies-0.004
 university-0.003
 studying-0.003
old-0.003
ifting-0.003
Token dolls
Token ID36062
Feature activation+5.39e-02

Pos logit contributions
ifting+0.016
 studies+0.015
old+0.015
ver+0.014
 studying+0.014
Neg logit contributions
Br-0.011
fl-0.010
 elbows-0.010
ety-0.009
 fart-0.009
Token are
Token ID389
Feature activation-5.07e-03

Pos logit contributions
ifting+0.089
 university+0.087
old+0.087
 studies+0.086
year+0.086
Neg logit contributions
Br-0.064
 fart-0.063
 away-0.058
w-0.058
 forth-0.058
Token both
Token ID1111
Feature activation+1.73e-02

Pos logit contributions
Br+0.009
 elbows+0.007
 fart+0.007
fl+0.007
mar+0.007
Neg logit contributions
 studies-0.008
 university-0.007
old-0.007
ifting-0.007
 studying-0.006
Token beautiful
Token ID4950
Feature activation+9.60e-03

Pos logit contributions
 studies+0.027
ifting+0.026
 university+0.025
old+0.025
ver+0.024
Neg logit contributions
Br-0.025
 fart-0.022
 elbows-0.021
fl-0.021
ety-0.020
Token,"
Token ID553
Feature activation-2.23e-02

Pos logit contributions
 studies+0.014
 university+0.013
 studying+0.012
ifting+0.012
old+0.012
Neg logit contributions
Br-0.016
 elbows-0.014
fl-0.014
mar-0.013
 fart-0.013
Token Mom
Token ID11254
Feature activation-9.95e-03

Pos logit contributions
Br+0.024
 elbows+0.020
 forth+0.016
fl+0.016
ety+0.016
Neg logit contributions
 studies-0.046
ifting-0.043
ver-0.042
 university-0.041
 studying-0.040
Token also
Token ID635
Feature activation+2.62e-02

Pos logit contributions
ifting+0.030
 studies+0.029
 studying+0.029
 university+0.029
igsaw+0.028
Neg logit contributions
Br-0.030
ety-0.026
mar-0.025
 elbows-0.024
 fart-0.024
Token learned
Token ID4499
Feature activation+1.91e-02

Pos logit contributions
ifting+0.044
igsaw+0.042
 university+0.039
ival+0.038
itt+0.038
Neg logit contributions
Br-0.034
ety-0.031
 away-0.030
mar-0.030
 fart-0.028
Token that
Token ID326
Feature activation+7.83e-03

Pos logit contributions
 studies+0.026
ifting+0.023
 studying+0.022
 amaz+0.021
 university+0.021
Neg logit contributions
Br-0.033
mar-0.029
fl-0.029
 elbows-0.028
icles-0.028
Token meeting
Token ID3249
Feature activation+3.98e-02

Pos logit contributions
 studies+0.012
ifting+0.011
old+0.011
 university+0.010
ver+0.010
Neg logit contributions
Br-0.012
 elbows-0.011
fl-0.010
 forth-0.010
mar-0.010
Token new
Token ID649
Feature activation+9.60e-03

Pos logit contributions
 studies+0.059
ifting+0.058
 arrest+0.057
 amaz+0.055
anging+0.055
Neg logit contributions
Br-0.051
 fart-0.051
 forth-0.048
fl-0.048
mar-0.047
Token people
Token ID661
Feature activation+5.38e-02

Pos logit contributions
ifting+0.018
 studies+0.018
 university+0.016
 studying+0.016
ver+0.016
Neg logit contributions
Br-0.011
fl-0.010
 elbows-0.009
 fart-0.008
 forth-0.008
Token can
Token ID460
Feature activation+1.94e-02

Pos logit contributions
 amaz+0.095
igsaw+0.092
ifting+0.088
 university+0.086
 studies+0.084
Neg logit contributions
 away-0.060
 forth-0.056
fl-0.054
ety-0.052
 fart-0.051
Token be
Token ID307
Feature activation+1.27e-02

Pos logit contributions
 studies+0.033
 university+0.030
 studying+0.030
ifting+0.030
old+0.030
Neg logit contributions
Br-0.028
 elbows-0.023
mar-0.022
fl-0.022
 fart-0.022
Token fun
Token ID1257
Feature activation+1.02e-02

Pos logit contributions
 studies+0.020
ifting+0.019
 university+0.017
 amaz+0.017
igsaw+0.017
Neg logit contributions
Br-0.019
ety-0.017
 elbows-0.016
 fart-0.016
 away-0.016
Token and
Token ID290
Feature activation+9.52e-03

Pos logit contributions
 studies+0.016
ifting+0.016
 amaz+0.016
 university+0.015
old+0.014
Neg logit contributions
fl-0.012
 ones-0.012
 fart-0.012
Br-0.011
 forth-0.011
Token that
Token ID326
Feature activation+5.96e-03

Pos logit contributions
ifting+0.015
 studies+0.014
ver+0.013
 university+0.013
old+0.013
Neg logit contributions
Br-0.014
 forth-0.012
 sound-0.012
fl-0.012
 fart-0.011
Token and
Token ID290
Feature activation+1.38e-02

Pos logit contributions
ience+0.014
ifting+0.014
 studies+0.014
ieving+0.013
itt+0.013
Neg logit contributions
Br-0.010
mar-0.009
icles-0.009
 away-0.009
 sound-0.009
Token they
Token ID484
Feature activation+2.62e-03

Pos logit contributions
ifting+0.021
 studies+0.020
ience+0.019
ieving+0.019
old+0.018
Neg logit contributions
Br-0.020
 forth-0.018
mar-0.017
 fart-0.016
fl-0.016
Token walked
Token ID6807
Feature activation+1.22e-02

Pos logit contributions
 studies+0.005
ifting+0.005
 university+0.004
 studying+0.004
igsaw+0.004
Neg logit contributions
Br-0.003
mar-0.003
 forth-0.003
ety-0.003
bur-0.003
Token through
Token ID832
Feature activation+3.20e-02

Pos logit contributions
 studies+0.019
 university+0.018
 studying+0.018
old+0.017
ifting+0.016
Neg logit contributions
Br-0.020
 elbows-0.017
icles-0.016
fl-0.016
 fart-0.016
Token the
Token ID262
Feature activation+4.71e-03

Pos logit contributions
ifting+0.057
ver+0.054
osing+0.052
 studies+0.052
ittle+0.051
Neg logit contributions
 elbows-0.035
Br-0.034
 away-0.033
 ones-0.032
 them-0.030
Token carn
Token ID18466
Feature activation+5.37e-02

Pos logit contributions
 studies+0.006
ver+0.005
ifting+0.005
 studying+0.005
old+0.005
Neg logit contributions
Br-0.009
 fart-0.008
 elbows-0.007
fl-0.007
 bisc-0.007
Tokenival
Token ID2473
Feature activation+1.86e-03

Pos logit contributions
 amaz+0.044
 studies+0.043
 studying+0.037
 mailing+0.036
 arrest+0.034
Neg logit contributions
Br-0.114
P-0.105
mar-0.102
suit-0.100
 fart-0.098
Token gates
Token ID17435
Feature activation+1.45e-02

Pos logit contributions
ifting+0.003
 studies+0.003
old+0.003
ience+0.003
 amaz+0.003
Neg logit contributions
Br-0.002
fl-0.002
 fart-0.002
 forth-0.002
 away-0.002
Token,
Token ID11
Feature activation-8.42e-03

Pos logit contributions
 studies+0.030
ience+0.029
igsaw+0.029
ifting+0.028
 arrest+0.028
Neg logit contributions
 away-0.010
Br-0.009
cr-0.009
 downstream-0.009
 forth-0.009
Token while
Token ID981
Feature activation-3.20e-03

Pos logit contributions
Br+0.011
 forth+0.011
 fart+0.010
fl+0.010
 elbows+0.010
Neg logit contributions
ifting-0.013
 studies-0.013
old-0.013
igsaw-0.013
ience-0.013
Token Max
Token ID5436
Feature activation-1.72e-02

Pos logit contributions
Br+0.004
 elbows+0.004
 forth+0.004
icles+0.003
fl+0.003
Neg logit contributions
 studies-0.005
ifting-0.005
old-0.005
 amaz-0.005
ieving-0.004
Token airport
Token ID9003
Feature activation+1.59e-02

Pos logit contributions
 studies+0.023
old+0.021
ver+0.021
ifting+0.021
 studying+0.019
Neg logit contributions
Br-0.038
 fart-0.033
fl-0.032
 elbows-0.031
bur-0.031
Token,
Token ID11
Feature activation+6.76e-03

Pos logit contributions
 studies+0.024
ifting+0.022
ience+0.022
old+0.021
 amaz+0.021
Neg logit contributions
Br-0.023
fl-0.020
 forth-0.020
icles-0.020
mar-0.020
Token Lily
Token ID20037
Feature activation-5.57e-03

Pos logit contributions
 studies+0.010
 university+0.009
 studying+0.009
old+0.009
ifting+0.009
Neg logit contributions
Br-0.011
 elbows-0.010
mar-0.009
fl-0.009
 fart-0.009
Token saw
Token ID2497
Feature activation+1.10e-02

Pos logit contributions
Br+0.006
ety+0.005
 forth+0.005
fl+0.005
bur+0.004
Neg logit contributions
igsaw-0.011
 studies-0.011
 university-0.010
ifting-0.010
 amaz-0.010
Token many
Token ID867
Feature activation+3.56e-02

Pos logit contributions
 studies+0.017
ience+0.014
ifting+0.014
 studying+0.014
igsaw+0.014
Neg logit contributions
Br-0.019
fl-0.016
icles-0.016
 elbows-0.016
Look-0.015
Token aer
Token ID9551
Feature activation+5.35e-02

Pos logit contributions
 studies+0.063
ver+0.060
ifting+0.057
igsaw+0.057
 university+0.055
Neg logit contributions
Br-0.045
fl-0.038
 elbows-0.037
 forth-0.036
 ones-0.036
Tokenopl
Token ID20106
Feature activation+4.68e-02

Pos logit contributions
 studies+0.053
 amaz+0.048
 university+0.048
 studying+0.043
ver+0.043
Neg logit contributions
Br-0.109
bur-0.096
 elbows-0.096
fl-0.096
icles-0.096
Tokenanes
Token ID7305
Feature activation+2.51e-02

Pos logit contributions
 studies+0.067
 amaz+0.064
ience+0.060
 sells+0.054
 university+0.054
Neg logit contributions
Br-0.071
icles-0.070
w-0.070
mar-0.070
fl-0.070
Token parked
Token ID19584
Feature activation+8.70e-03

Pos logit contributions
 studies+0.046
igsaw+0.042
 amaz+0.041
 university+0.040
old+0.039
Neg logit contributions
fl-0.032
 forth-0.030
mar-0.027
cr-0.026
icles-0.026
Token outside
Token ID2354
Feature activation+2.86e-03

Pos logit contributions
 studies+0.015
old+0.013
 university+0.013
 studying+0.013
ifting+0.013
Neg logit contributions
Br-0.013
 elbows-0.011
mar-0.011
fl-0.011
icles-0.010
Token.
Token ID13
Feature activation+1.02e-02

Pos logit contributions
 studies+0.006
ience+0.005
ifting+0.005
old+0.005
ver+0.005
Neg logit contributions
Br-0.003
fl-0.003
icles-0.002
 elbows-0.002
mar-0.002
Token day
Token ID1110
Feature activation+2.23e-03

Pos logit contributions
 studies+0.032
ifting+0.029
ver+0.029
 university+0.028
 studying+0.028
Neg logit contributions
Br-0.017
 fart-0.013
 forth-0.013
 elbows-0.013
fl-0.013
Token because
Token ID780
Feature activation+2.07e-02

Pos logit contributions
 studies+0.003
ifting+0.003
old+0.003
 university+0.003
 amaz+0.003
Neg logit contributions
Br-0.003
 fart-0.003
mar-0.003
 elbows-0.003
fl-0.003
Token he
Token ID339
Feature activation-1.53e-05

Pos logit contributions
 studies+0.040
ience+0.039
ifting+0.037
 arrest+0.036
ieving+0.036
Neg logit contributions
Br-0.018
 forth-0.016
icles-0.016
Tim-0.015
 ones-0.014
Token knew
Token ID2993
Feature activation+9.98e-03

Pos logit contributions
Br+0.000
mar+0.000
ety+0.000
fl+0.000
 forth+0.000
Neg logit contributions
 university-0.000
ifting-0.000
igsaw-0.000
 mailing-0.000
 arrest-0.000
Token that
Token ID326
Feature activation+2.87e-03

Pos logit contributions
 studies+0.015
ifting+0.014
old+0.013
 university+0.013
 arrest+0.013
Neg logit contributions
Br-0.014
icles-0.013
 forth-0.013
mar-0.012
fl-0.012
Token school
Token ID1524
Feature activation+5.34e-02

Pos logit contributions
 studies+0.005
ifting+0.004
old+0.004
 university+0.004
ver+0.004
Neg logit contributions
Br-0.004
 elbows-0.004
 forth-0.004
fl-0.003
 fart-0.003
Token was
Token ID373
Feature activation-1.08e-02

Pos logit contributions
ifting+0.103
 amaz+0.098
 arrest+0.098
 studies+0.095
 university+0.094
Neg logit contributions
mar-0.045
Br-0.043
fl-0.041
 fart-0.040
 forth-0.039
Token a
Token ID257
Feature activation+3.74e-03

Pos logit contributions
Br+0.015
mar+0.013
 fart+0.013
 forth+0.013
 away+0.012
Neg logit contributions
 studies-0.018
ifting-0.016
old-0.016
 arrest-0.015
itt-0.015
Token reliable
Token ID9314
Feature activation+3.05e-03

Pos logit contributions
 studies+0.005
ifting+0.005
old+0.005
ver+0.005
 studying+0.005
Neg logit contributions
Br-0.006
 fart-0.005
 elbows-0.005
fl-0.005
 forth-0.005
Token place
Token ID1295
Feature activation+1.38e-02

Pos logit contributions
 studies+0.005
ifting+0.004
old+0.004
 university+0.004
 studying+0.004
Neg logit contributions
Br-0.005
fl-0.004
 elbows-0.004
 fart-0.004
mar-0.004
Token where
Token ID810
Feature activation+2.38e-02

Pos logit contributions
 studies+0.024
ifting+0.022
 university+0.022
old+0.021
 amaz+0.021
Neg logit contributions
Br-0.017
fl-0.015
mar-0.015
icles-0.015
 fart-0.014
Token Max
Token ID5436
Feature activation-7.89e-03

Pos logit contributions
Br+0.017
 elbows+0.015
mar+0.014
fl+0.014
 fart+0.014
Neg logit contributions
 studies-0.014
 university-0.013
 studying-0.012
old-0.012
ifting-0.012
Token loved
Token ID6151
Feature activation+4.07e-03

Pos logit contributions
Br+0.011
fl+0.010
mar+0.009
ety+0.009
 forth+0.009
Neg logit contributions
 university-0.013
 studies-0.012
ifting-0.012
old-0.011
igsaw-0.011
Token being
Token ID852
Feature activation+2.17e-03

Pos logit contributions
 studies+0.006
ver+0.006
ifting+0.006
old+0.006
ience+0.005
Neg logit contributions
Br-0.006
fl-0.005
 elbows-0.005
icles-0.005
 fart-0.005
Token at
Token ID379
Feature activation+7.24e-03

Pos logit contributions
 studies+0.003
ifting+0.003
 university+0.003
old+0.003
 amaz+0.003
Neg logit contributions
Br-0.004
 fart-0.003
 elbows-0.003
fl-0.003
mar-0.003
Token the
Token ID262
Feature activation+2.13e-03

Pos logit contributions
ifting+0.010
old+0.010
 studies+0.010
ver+0.010
ience+0.010
Neg logit contributions
Br-0.010
 fart-0.009
fl-0.009
icles-0.009
 ones-0.009
Token carn
Token ID18466
Feature activation+5.32e-02

Pos logit contributions
 studies+0.003
old+0.002
ifting+0.002
 university+0.002
 studying+0.002
Neg logit contributions
Br-0.004
 fart-0.004
 elbows-0.004
fl-0.004
icles-0.003
Tokenival
Token ID2473
Feature activation+1.12e-02

Pos logit contributions
 amaz+0.053
 studies+0.045
 mailing+0.039
 studying+0.038
 arrest+0.037
Neg logit contributions
Br-0.104
P-0.103
mar-0.100
suit-0.098
Fin-0.095
Token so
Token ID523
Feature activation+1.14e-02

Pos logit contributions
ifting+0.017
 studies+0.017
old+0.015
 university+0.015
 amaz+0.015
Neg logit contributions
Br-0.015
fl-0.014
 fart-0.014
mar-0.013
icles-0.013
Token much
Token ID881
Feature activation+1.16e-03

Pos logit contributions
 studies+0.014
ifting+0.013
 university+0.013
 studying+0.012
 mailing+0.012
Neg logit contributions
Br-0.021
 elbows-0.018
mar-0.017
Fin-0.017
icles-0.017
Token that
Token ID326
Feature activation-8.95e-03

Pos logit contributions
 studies+0.002
ifting+0.002
 university+0.002
old+0.002
 studying+0.002
Neg logit contributions
Br-0.002
fl-0.001
 elbows-0.001
mar-0.001
icles-0.001
Token he
Token ID339
Feature activation-1.21e-02

Pos logit contributions
Br+0.013
fl+0.011
 elbows+0.011
 forth+0.011
 fart+0.011
Neg logit contributions
 studies-0.014
ifting-0.013
old-0.013
 university-0.013
 studying-0.012
Token explain
Token ID4727
Feature activation+9.18e-03

Pos logit contributions
ifting+0.022
 studies+0.021
 studying+0.021
old+0.021
 mailing+0.020
Neg logit contributions
Br-0.020
 elbows-0.018
 forth-0.017
 fart-0.017
 away-0.017
Token to
Token ID284
Feature activation-1.10e-02

Pos logit contributions
 studies+0.012
 university+0.010
ifting+0.010
 studying+0.010
old+0.010
Neg logit contributions
Br-0.017
 elbows-0.015
fl-0.014
icles-0.014
mar-0.014
Token him
Token ID683
Feature activation+1.70e-02

Pos logit contributions
Br+0.013
 elbows+0.010
icles+0.009
fl+0.009
ety+0.009
Neg logit contributions
 studies-0.022
 university-0.021
ifting-0.020
 studying-0.020
old-0.020
Token that
Token ID326
Feature activation+5.34e-03

Pos logit contributions
 studies+0.024
 amaz+0.022
ifting+0.022
 university+0.021
igsaw+0.021
Neg logit contributions
fl-0.026
 away-0.025
Br-0.025
balls-0.024
icles-0.023
Token the
Token ID262
Feature activation+6.29e-04

Pos logit contributions
 studies+0.008
ifting+0.007
 university+0.007
old+0.007
 studying+0.007
Neg logit contributions
Br-0.008
 elbows-0.007
fl-0.007
icles-0.007
 fart-0.007
Token trains
Token ID13404
Feature activation+5.31e-02

Pos logit contributions
 studies+0.001
 university+0.001
 studying+0.001
old+0.001
 amaz+0.001
Neg logit contributions
Br-0.001
 elbows-0.001
mar-0.001
ety-0.001
icles-0.001
Token are
Token ID389
Feature activation-5.63e-03

Pos logit contributions
igsaw+0.090
 studies+0.089
ifting+0.089
 amaz+0.089
old+0.086
Neg logit contributions
 fart-0.060
 away-0.059
 forth-0.059
 elbows-0.054
Br-0.054
Token not
Token ID407
Feature activation-1.27e-02

Pos logit contributions
Br+0.009
 elbows+0.007
 fart+0.007
icles+0.007
mar+0.007
Neg logit contributions
 studies-0.009
old-0.008
ifting-0.008
 university-0.008
 amaz-0.008
Token for
Token ID329
Feature activation-6.19e-03

Pos logit contributions
Br+0.021
 elbows+0.018
 fart+0.017
mar+0.017
icles+0.017
Neg logit contributions
 studies-0.020
 university-0.017
old-0.017
ifting-0.017
 studying-0.017
Token eating
Token ID6600
Feature activation+3.59e-03

Pos logit contributions
Br+0.009
 elbows+0.008
icles+0.008
 fart+0.007
 forth+0.007
Neg logit contributions
 studies-0.010
ifting-0.009
old-0.009
 university-0.009
 amaz-0.009
Token,
Token ID11
Feature activation-1.91e-04

Pos logit contributions
 studies+0.006
old+0.006
ifting+0.006
 university+0.005
 studying+0.005
Neg logit contributions
Br-0.005
 elbows-0.004
icles-0.004
 fart-0.004
fl-0.004
Token They
Token ID1119
Feature activation+8.61e-03

Pos logit contributions
 studies+0.007
 university+0.006
 studying+0.006
ifting+0.006
old+0.006
Neg logit contributions
Br-0.007
fl-0.005
Wow-0.005
 elbows-0.005
ety-0.005
Token waved
Token ID26834
Feature activation+3.08e-02

Pos logit contributions
 studies+0.016
 university+0.015
ifting+0.015
 studying+0.014
igsaw+0.014
Neg logit contributions
Br-0.011
 forth-0.009
mar-0.009
olyn-0.009
ety-0.009
Token at
Token ID379
Feature activation+1.66e-02

Pos logit contributions
 studies+0.042
ifting+0.038
 university+0.037
 studying+0.037
old+0.035
Neg logit contributions
Br-0.055
 elbows-0.047
fl-0.046
 forth-0.046
 away-0.045
Token the
Token ID262
Feature activation+1.94e-02

Pos logit contributions
 studies+0.027
ifting+0.025
 university+0.024
 studying+0.024
ver+0.023
Neg logit contributions
Br-0.025
 elbows-0.022
fl-0.021
 forth-0.020
icles-0.020
Token famous
Token ID5863
Feature activation+3.22e-02

Pos logit contributions
ifting+0.032
 studies+0.031
old+0.030
anging+0.030
smart+0.030
Neg logit contributions
Br-0.025
 sound-0.024
 fart-0.022
 forth-0.021
 away-0.021
Token aer
Token ID9551
Feature activation+5.28e-02

Pos logit contributions
ifting+0.054
ver+0.048
old+0.048
smart+0.047
 studies+0.047
Neg logit contributions
 ones-0.039
 sound-0.036
Br-0.036
 forth-0.035
 away-0.034
Tokenopl
Token ID20106
Feature activation+5.45e-02

Pos logit contributions
 amaz+0.077
 studies+0.065
 aston+0.064
 university+0.060
 young+0.060
Neg logit contributions
rim-0.076
ibi-0.074
Br-0.074
ty-0.073
bur-0.072
Tokenane
Token ID1531
Feature activation+1.73e-02

Pos logit contributions
 amaz+0.094
 studies+0.089
 enchanted+0.083
 aston+0.080
 young+0.080
Neg logit contributions
w-0.068
we-0.068
mar-0.066
fl-0.060
olyn-0.060
Token as
Token ID355
Feature activation+6.19e-04

Pos logit contributions
ifting+0.027
 studies+0.027
old+0.025
 university+0.024
 studying+0.024
Neg logit contributions
 forth-0.023
fl-0.022
Br-0.022
 away-0.022
mar-0.019
Token it
Token ID340
Feature activation-1.12e-02

Pos logit contributions
 studies+0.001
old+0.001
ifting+0.001
 amaz+0.001
osing+0.001
Neg logit contributions
Br-0.001
fl-0.001
 forth-0.001
 avail-0.001
olyn-0.001
Token flew
Token ID13112
Feature activation-2.53e-03

Pos logit contributions
Br+0.016
mar+0.013
fl+0.013
 elbows+0.012
 fart+0.012
Neg logit contributions
 studies-0.020
 university-0.019
 studying-0.019
old-0.018
ifting-0.018
Token.
Token ID13
Feature activation-8.28e-03

Pos logit contributions
 studies+0.009
 university+0.009
 studying+0.008
ifting+0.008
 amaz+0.008
Neg logit contributions
Br-0.007
ety-0.006
 elbows-0.006
fl-0.006
mar-0.006
Token She
Token ID1375
Feature activation-3.85e-05

Pos logit contributions
Br+0.015
 elbows+0.013
fl+0.012
mar+0.012
icles+0.012
Neg logit contributions
 studies-0.011
 university-0.010
 studying-0.010
old-0.010
ifting-0.010
Token was
Token ID373
Feature activation+2.33e-03

Pos logit contributions
Br+0.000
 elbows+0.000
mar+0.000
fl+0.000
ety+0.000
Neg logit contributions
 studies-0.000
 university-0.000
 studying-0.000
old-0.000
ifting-0.000
Token 3
Token ID513
Feature activation+3.41e-02

Pos logit contributions
 studies+0.004
ience+0.003
old+0.003
ifting+0.003
 university+0.003
Neg logit contributions
Br-0.004
 elbows-0.003
ety-0.003
fl-0.003
 fart-0.003
Token-
Token ID12
Feature activation+2.84e-02

Pos logit contributions
 studies+0.051
 university+0.048
ver+0.046
ifting+0.045
 studying+0.044
Neg logit contributions
Br-0.053
 elbows-0.047
fl-0.045
ety-0.045
mar-0.045
Tokenyears
Token ID19002
Feature activation+5.28e-02

Pos logit contributions
 studies+0.025
 university+0.022
 studying+0.020
 amaz+0.018
ifting+0.018
Neg logit contributions
Br-0.065
 elbows-0.058
mar-0.056
ety-0.056
fl-0.056
Token-
Token ID12
Feature activation+3.68e-02

Pos logit contributions
 studies+0.090
 amaz+0.086
 university+0.084
 studying+0.081
ifting+0.078
Neg logit contributions
Br-0.067
row-0.062
fl-0.061
ety-0.061
icles-0.059
Tokenold
Token ID727
Feature activation+4.13e-03

Pos logit contributions
 studies+0.018
 university+0.013
 amaz+0.012
 studying+0.010
ifting+0.008
Neg logit contributions
Br-0.098
ety-0.089
mar-0.088
fl-0.087
 elbows-0.087
Token.
Token ID13
Feature activation-3.80e-03

Pos logit contributions
 studies+0.003
old+0.003
 studying+0.003
 university+0.003
 reading+0.003
Neg logit contributions
Br-0.010
 elbows-0.009
 fart-0.008
icles-0.008
mar-0.008
Token Every
Token ID3887
Feature activation+2.25e-03

Pos logit contributions
Br+0.006
 elbows+0.005
fl+0.005
icles+0.005
mar+0.005
Neg logit contributions
 studies-0.006
 university-0.005
 studying-0.005
old-0.005
ifting-0.005
Token day
Token ID1110
Feature activation-9.51e-03

Pos logit contributions
 studies+0.005
 university+0.004
 studying+0.004
old+0.004
ifting+0.004
Neg logit contributions
Br-0.002
fl-0.002
 elbows-0.002
mar-0.002
ety-0.002
Token
Token ID198
Feature activation-9.82e-03

Pos logit contributions
Br+0.011
mar+0.008
 elbows+0.008
icles+0.008
fl+0.008
Neg logit contributions
 studies-0.010
 university-0.009
 studying-0.009
 amaz-0.009
ifting-0.009
Token
Token ID198
Feature activation+9.03e-03

Pos logit contributions
Br+0.020
 elbows+0.019
icles+0.018
 fart+0.018
fl+0.017
Neg logit contributions
 studies-0.010
old-0.009
 university-0.009
ifting-0.009
 studying-0.009
TokenAs
Token ID1722
Feature activation-1.17e-03

Pos logit contributions
 studies+0.017
 university+0.016
 studying+0.016
old+0.015
ifting+0.015
Neg logit contributions
Br-0.012
 elbows-0.009
mar-0.009
fl-0.009
 fart-0.009
Token the
Token ID262
Feature activation+8.50e-03

Pos logit contributions
Br+0.002
mar+0.001
fl+0.001
igi+0.001
 forth+0.001
Neg logit contributions
 studies-0.002
old-0.002
ifting-0.002
 amaz-0.002
 studying-0.002
Token aer
Token ID9551
Feature activation+4.94e-02

Pos logit contributions
 studies+0.014
ifting+0.013
old+0.013
ver+0.013
 university+0.013
Neg logit contributions
Br-0.012
fl-0.010
 forth-0.010
 elbows-0.010
 fart-0.010
Tokenopl
Token ID20106
Feature activation+5.26e-02

Pos logit contributions
 amaz+0.069
 studies+0.060
 university+0.059
 aston+0.057
 young+0.056
Neg logit contributions
Br-0.077
w-0.074
fl-0.073
bur-0.073
mar-0.071
Tokenane
Token ID1531
Feature activation+1.28e-02

Pos logit contributions
 amaz+0.087
 studies+0.085
ience+0.079
 delighted+0.076
igsaw+0.076
Neg logit contributions
w-0.069
fl-0.069
mar-0.067
we-0.062
olyn-0.061
Token took
Token ID1718
Feature activation-4.07e-03

Pos logit contributions
 studies+0.026
 university+0.025
igsaw+0.024
old+0.024
 studying+0.024
Neg logit contributions
Br-0.015
 forth-0.013
fl-0.012
mar-0.011
 fart-0.011
Token off
Token ID572
Feature activation-5.10e-03

Pos logit contributions
Br+0.007
 elbows+0.006
fl+0.006
mar+0.006
icles+0.006
Neg logit contributions
 studies-0.006
 university-0.005
 studying-0.005
old-0.005
ifting-0.005
Token,
Token ID11
Feature activation+1.08e-02

Pos logit contributions
Br+0.008
 elbows+0.007
fl+0.007
icles+0.007
mar+0.007
Neg logit contributions
 studies-0.008
ifting-0.007
old-0.007
 university-0.007
 studying-0.006
Token Lucy
Token ID22162
Feature activation-7.28e-03

Pos logit contributions
 studies+0.020
old+0.018
ifting+0.018
 studying+0.017
 university+0.017
Neg logit contributions
Br-0.015
fl-0.011
icles-0.011
Wow-0.011
 elbows-0.011
Token forest
Token ID8222
Feature activation-5.68e-03

Pos logit contributions
Br+0.004
 elbows+0.004
icles+0.004
mar+0.004
cr+0.004
Neg logit contributions
 university-0.002
 studies-0.002
 young-0.002
 studying-0.002
 enchanted-0.002
Token and
Token ID290
Feature activation-1.61e-02

Pos logit contributions
Br+0.008
fl+0.007
 fart+0.007
cr+0.007
ety+0.007
Neg logit contributions
 studies-0.009
ifting-0.008
old-0.008
 university-0.008
 amaz-0.008
Token looking
Token ID2045
Feature activation+4.16e-04

Pos logit contributions
Br+0.027
fl+0.024
 elbows+0.023
 forth+0.023
 fart+0.023
Neg logit contributions
 studies-0.023
old-0.021
 university-0.021
ifting-0.020
 studying-0.020
Token for
Token ID329
Feature activation-1.55e-02

Pos logit contributions
 studies+0.001
 university+0.001
ifting+0.001
 studying+0.001
old+0.001
Neg logit contributions
Br-0.001
fl-0.000
 fart-0.000
 elbows-0.000
ety-0.000
Token a
Token ID257
Feature activation-3.69e-02

Pos logit contributions
Br+0.025
 elbows+0.021
fl+0.020
mar+0.020
icles+0.020
Neg logit contributions
 studies-0.024
 university-0.022
 studying-0.021
old-0.021
ifting-0.020
Token carrot
Token ID40562
Feature activation-6.20e-02

Pos logit contributions
Br+0.066
 elbows+0.064
icles+0.062
mar+0.057
fl+0.057
Neg logit contributions
 studies-0.047
 university-0.046
 studying-0.042
 young-0.041
 reading-0.038
Token trail
Token ID8025
Feature activation-1.22e-02

Pos logit contributions
Br+0.128
 elbows+0.105
Wow+0.102
Look+0.099
Oh+0.098
Neg logit contributions
 studies-0.073
 university-0.065
 reading-0.064
 studying-0.064
old-0.058
Token,"
Token ID553
Feature activation-3.23e-02

Pos logit contributions
Br+0.019
 elbows+0.016
mar+0.016
 fart+0.015
fl+0.015
Neg logit contributions
 studies-0.019
 university-0.017
 studying-0.017
ifting-0.017
 amaz-0.017
Token replied
Token ID8712
Feature activation-7.02e-03

Pos logit contributions
Br+0.052
 fart+0.048
 elbows+0.048
icles+0.047
igi+0.047
Neg logit contributions
 studies-0.046
old-0.042
 studying-0.041
 university-0.040
 young-0.039
Token Benny
Token ID44275
Feature activation-5.08e-03

Pos logit contributions
Br+0.012
 elbows+0.011
 fart+0.010
igi+0.010
icles+0.010
Neg logit contributions
 studies-0.009
 university-0.009
 studying-0.009
 young-0.009
old-0.009
Token.
Token ID13
Feature activation-2.29e-03

Pos logit contributions
Br+0.008
 elbows+0.007
fl+0.006
mar+0.006
icles+0.006
Neg logit contributions
 studies-0.008
 university-0.007
 studying-0.007
old-0.007
ifting-0.007
Token It
Token ID632
Feature activation-3.53e-03

Pos logit contributions
Br+0.003
 elbows+0.003
icles+0.003
fl+0.003
ety+0.003
Neg logit contributions
 studies-0.003
 university-0.002
 studying-0.002
ifting-0.002
old-0.002
Token bit
Token ID1643
Feature activation-6.03e-02

Pos logit contributions
Br+0.006
 elbows+0.005
fl+0.005
 fart+0.005
mar+0.005
Neg logit contributions
 studies-0.005
 university-0.005
 studying-0.005
old-0.004
ifting-0.004
Token me
Token ID502
Feature activation+1.89e-03

Pos logit contributions
Br+0.122
suit+0.105
mar+0.103
gh+0.101
fl+0.100
Neg logit contributions
 that-0.067
 how-0.066
 this-0.064
 the-0.060
 young-0.057
Token.
Token ID13
Feature activation+1.95e-03

Pos logit contributions
 studies+0.003
 university+0.003
 studying+0.003
old+0.003
ifting+0.003
Neg logit contributions
Br-0.003
 elbows-0.003
fl-0.002
mar-0.002
icles-0.002
Token It
Token ID632
Feature activation-5.58e-03

Pos logit contributions
 studies+0.003
 university+0.002
 studying+0.002
old+0.002
ifting+0.002
Neg logit contributions
Br-0.004
 elbows-0.003
icles-0.003
fl-0.003
ety-0.003
Token bit
Token ID1643
Feature activation-6.14e-02

Pos logit contributions
Br+0.008
 elbows+0.007
fl+0.007
 fart+0.007
mar+0.007
Neg logit contributions
 studies-0.009
 university-0.009
 studying-0.008
old-0.008
ifting-0.008
Token me
Token ID502
Feature activation-9.30e-04

Pos logit contributions
Br+0.125
suit+0.106
mar+0.104
gh+0.104
fl+0.102
Neg logit contributions
 that-0.069
 how-0.068
 this-0.065
 the-0.062
 young-0.058
Token."
Token ID526
Feature activation-1.44e-03

Pos logit contributions
Br+0.001
 elbows+0.001
fl+0.001
mar+0.001
icles+0.001
Neg logit contributions
 studies-0.002
 university-0.001
 studying-0.001
old-0.001
ifting-0.001
Token
Token ID198
Feature activation+1.43e-03

Pos logit contributions
Br+0.003
 elbows+0.002
mar+0.002
fl+0.002
ety+0.002
Neg logit contributions
 studies-0.002
 university-0.002
 studying-0.002
old-0.002
ifting-0.002
Token
Token ID198
Feature activation+9.37e-03

Pos logit contributions
 studies+0.002
 university+0.002
 studying+0.002
old+0.002
ifting+0.002
Neg logit contributions
Br-0.003
 elbows-0.002
mar-0.002
fl-0.002
icles-0.002
TokenTom
Token ID13787
Feature activation-9.37e-03

Pos logit contributions
 studies+0.018
 university+0.016
 studying+0.016
old+0.016
ifting+0.016
Neg logit contributions
Br-0.012
 elbows-0.010
fl-0.009
mar-0.009
icles-0.009
Token the
Token ID262
Feature activation-3.20e-02

Pos logit contributions
Br+0.032
 elbows+0.028
fl+0.027
icles+0.027
mar+0.027
Neg logit contributions
 studies-0.017
 university-0.015
 studying-0.015
ifting-0.014
old-0.014
Token berries
Token ID36322
Feature activation-3.67e-03

Pos logit contributions
 elbows+0.050
Br+0.050
icles+0.041
 forth+0.040
mar+0.040
Neg logit contributions
 studies-0.046
 young-0.044
old-0.042
 knows-0.041
 studying-0.040
Token and
Token ID290
Feature activation-1.55e-02

Pos logit contributions
Br+0.006
 elbows+0.006
mar+0.005
fl+0.005
icles+0.005
Neg logit contributions
 studies-0.005
 university-0.005
 studying-0.004
old-0.004
ifting-0.004
Token have
Token ID423
Feature activation-1.64e-02

Pos logit contributions
Br+0.027
 elbows+0.023
fl+0.022
mar+0.022
 fart+0.022
Neg logit contributions
 studies-0.022
 university-0.020
 studying-0.019
ifting-0.019
old-0.019
Token something
Token ID1223
Feature activation-3.90e-02

Pos logit contributions
Br+0.032
 elbows+0.028
fl+0.027
mar+0.026
icles+0.026
Neg logit contributions
 studies-0.020
 university-0.018
 studying-0.018
old-0.017
ifting-0.017
Token delicious
Token ID12625
Feature activation-6.14e-02

Pos logit contributions
Br+0.085
 elbows+0.080
fl+0.072
 forth+0.072
 fart+0.071
Neg logit contributions
 studies-0.039
old-0.035
 studying-0.034
 glad-0.032
 university-0.031
Token to
Token ID284
Feature activation-2.75e-02

Pos logit contributions
Br+0.154
 elbows+0.136
Look+0.115
Wow+0.115
 fart+0.113
Neg logit contributions
 studies-0.049
 how-0.046
 studying-0.045
 reading-0.043
 about-0.040
Token eat
Token ID4483
Feature activation-3.66e-02

Pos logit contributions
Br+0.048
 elbows+0.041
mar+0.040
icles+0.040
fl+0.040
Neg logit contributions
 studies-0.039
 university-0.035
 studying-0.034
old-0.033
ifting-0.033
Token.
Token ID13
Feature activation-6.77e-03

Pos logit contributions
Br+0.075
 elbows+0.064
fl+0.059
ety+0.058
 fart+0.058
Neg logit contributions
 studies-0.042
 university-0.039
 studying-0.038
old-0.037
ifting-0.036
Token But
Token ID887
Feature activation-2.08e-02

Pos logit contributions
Br+0.012
 elbows+0.011
fl+0.010
mar+0.010
 fart+0.010
Neg logit contributions
 studies-0.010
 university-0.009
 studying-0.008
old-0.008
ifting-0.008
Token first
Token ID717
Feature activation-7.06e-03

Pos logit contributions
Br+0.033
 elbows+0.028
 forth+0.027
icles+0.027
mar+0.027
Neg logit contributions
 studies-0.032
 university-0.029
old-0.029
ifting-0.029
 studying-0.028
Token Tom
Token ID4186
Feature activation-1.11e-02

Pos logit contributions
Br+0.023
 fart+0.017
icles+0.017
ety+0.017
Wow+0.016
Neg logit contributions
 studies-0.047
ifting-0.042
old-0.042
 studying-0.041
 university-0.040
Token,
Token ID11
Feature activation-8.57e-03

Pos logit contributions
Br+0.024
 elbows+0.021
icles+0.020
 fart+0.020
fl+0.019
Neg logit contributions
 studies-0.011
 studying-0.010
 university-0.010
old-0.010
 reading-0.009
Token I
Token ID314
Feature activation-1.40e-02

Pos logit contributions
Br+0.012
 fart+0.011
 P+0.011
cr+0.010
Wow+0.010
Neg logit contributions
 studies-0.015
old-0.013
ifting-0.012
ver-0.012
 studying-0.012
Token found
Token ID1043
Feature activation-1.73e-02

Pos logit contributions
Br+0.016
ety+0.013
w+0.013
mar+0.013
 fart+0.013
Neg logit contributions
 studies-0.027
 university-0.026
 studying-0.025
old-0.024
ifting-0.024
Token a
Token ID257
Feature activation-2.28e-02

Pos logit contributions
Br+0.039
 elbows+0.034
mar+0.033
ety+0.033
icles+0.033
Neg logit contributions
 studies-0.016
 university-0.014
 studying-0.013
ifting-0.013
old-0.013
Token carrot
Token ID40562
Feature activation-6.05e-02

Pos logit contributions
Br+0.040
 elbows+0.039
icles+0.037
mar+0.035
fl+0.034
Neg logit contributions
 university-0.029
 studies-0.029
 young-0.027
 studying-0.027
 amaz-0.026
Token!"
Token ID2474
Feature activation-2.75e-02

Pos logit contributions
Br+0.125
 elbows+0.104
Wow+0.093
ety+0.092
Fin+0.092
Neg logit contributions
 studies-0.070
 university-0.065
 reading-0.063
 studying-0.062
 arrest-0.057
Token she
Token ID673
Feature activation-2.22e-02

Pos logit contributions
Br+0.040
 elbows+0.034
fl+0.033
mar+0.032
 fart+0.032
Neg logit contributions
 studies-0.047
 university-0.044
 studying-0.042
old-0.042
ifting-0.041
Token said
Token ID531
Feature activation+1.51e-03

Pos logit contributions
Br+0.038
igi+0.035
icles+0.035
bur+0.034
fl+0.034
Neg logit contributions
 studies-0.030
old-0.028
 young-0.027
 studying-0.026
 university-0.025
Token,
Token ID11
Feature activation-1.35e-02

Pos logit contributions
 studies+0.002
 university+0.002
 studying+0.002
old+0.002
ifting+0.002
Neg logit contributions
Br-0.002
 elbows-0.002
fl-0.002
mar-0.002
icles-0.002
Token holding
Token ID4769
Feature activation-6.46e-03

Pos logit contributions
Br+0.018
 elbows+0.016
 forth+0.016
cr+0.014
fl+0.014
Neg logit contributions
 studies-0.024
old-0.024
ival-0.022
 university-0.022
ience-0.022
Token Max
Token ID5436
Feature activation-3.33e-02

Pos logit contributions
Br+0.039
 elbows+0.033
fl+0.031
mar+0.031
icles+0.031
Neg logit contributions
 studies-0.040
 university-0.037
 studying-0.036
old-0.035
ifting-0.034
Token shouted
Token ID17293
Feature activation+8.87e-03

Pos logit contributions
Br+0.066
fl+0.063
icles+0.063
igi+0.061
 avail+0.059
Neg logit contributions
old-0.030
 studies-0.030
 who-0.028
 arrest-0.028
 young-0.028
Token.
Token ID13
Feature activation+2.02e-03

Pos logit contributions
 studies+0.013
 university+0.012
 studying+0.012
old+0.012
ifting+0.012
Neg logit contributions
Br-0.015
 elbows-0.012
fl-0.012
mar-0.012
 fart-0.012
Token "
Token ID366
Feature activation-1.02e-02

Pos logit contributions
 studies+0.002
 university+0.002
 studying+0.002
old+0.002
ifting+0.002
Neg logit contributions
Br-0.004
 elbows-0.004
fl-0.003
mar-0.003
icles-0.003
TokenIt
Token ID1026
Feature activation-1.43e-02

Pos logit contributions
Br+0.013
igi+0.011
icles+0.011
 elbows+0.011
mar+0.011
Neg logit contributions
 studies-0.017
 university-0.016
 studying-0.016
ifting-0.016
 arrest-0.015
Token bit
Token ID1643
Feature activation-6.04e-02

Pos logit contributions
Br+0.021
 elbows+0.017
fl+0.016
mar+0.016
icles+0.016
Neg logit contributions
 studies-0.025
 university-0.023
 studying-0.022
old-0.022
ifting-0.022
Token me
Token ID502
Feature activation+5.55e-03

Pos logit contributions
Br+0.120
fl+0.102
mar+0.101
igi+0.097
 elbows+0.096
Neg logit contributions
 that-0.068
 how-0.067
 this-0.066
 young-0.063
 arrest-0.058
Token!
Token ID0
Feature activation-6.21e-03

Pos logit contributions
 studies+0.009
 university+0.008
 studying+0.008
old+0.008
ifting+0.008
Neg logit contributions
Br-0.008
 elbows-0.007
mar-0.007
fl-0.007
 fart-0.007
Token It
Token ID632
Feature activation-3.84e-03

Pos logit contributions
Br+0.011
 elbows+0.009
icles+0.009
fl+0.009
 fart+0.009
Neg logit contributions
 studies-0.009
 university-0.008
 studying-0.008
ifting-0.008
old-0.008
Token bit
Token ID1643
Feature activation-5.78e-02

Pos logit contributions
Br+0.005
 elbows+0.004
fl+0.004
 fart+0.004
mar+0.004
Neg logit contributions
 studies-0.007
 university-0.007
 studying-0.006
old-0.006
ifting-0.006
Token me
Token ID502
Feature activation+4.30e-04

Pos logit contributions
Br+0.115
fl+0.096
mar+0.096
 bisc+0.090
 elbows+0.090
Neg logit contributions
 how-0.063
 that-0.063
 young-0.061
 studies-0.060
 this-0.060
Token me
Token ID502
Feature activation-1.25e-02

Pos logit contributions
Br+0.070
mar+0.057
bank+0.057
suit+0.057
Fin+0.056
Neg logit contributions
 university-0.031
 arrest-0.030
old-0.029
 studying-0.029
 young-0.028
Token,
Token ID11
Feature activation-1.79e-02

Pos logit contributions
Br+0.022
 elbows+0.019
fl+0.018
mar+0.018
icles+0.017
Neg logit contributions
 studies-0.018
 university-0.017
 studying-0.016
old-0.016
ifting-0.015
Token Tom
Token ID4186
Feature activation-1.68e-02

Pos logit contributions
Br+0.030
 elbows+0.025
 fart+0.024
icles+0.024
fl+0.023
Neg logit contributions
 studies-0.028
old-0.024
 studying-0.024
 university-0.024
ifting-0.024
Token.
Token ID13
Feature activation-1.84e-03

Pos logit contributions
Br+0.031
 elbows+0.025
icles+0.025
fl+0.024
mar+0.024
Neg logit contributions
 studies-0.023
 university-0.021
 studying-0.021
old-0.021
ifting-0.019
Token It
Token ID632
Feature activation-3.53e-03

Pos logit contributions
Br+0.003
 elbows+0.003
icles+0.003
fl+0.003
ety+0.003
Neg logit contributions
 studies-0.003
 university-0.002
 studying-0.002
ifting-0.002
old-0.002
Token bit
Token ID1643
Feature activation-6.03e-02

Pos logit contributions
Br+0.006
 elbows+0.005
fl+0.005
 fart+0.005
mar+0.005
Neg logit contributions
 studies-0.005
 university-0.005
 studying-0.005
old-0.004
ifting-0.004
Token me
Token ID502
Feature activation+1.89e-03

Pos logit contributions
Br+0.122
suit+0.105
mar+0.103
gh+0.101
fl+0.100
Neg logit contributions
 that-0.067
 how-0.066
 this-0.064
 the-0.060
 young-0.057
Token.
Token ID13
Feature activation+1.95e-03

Pos logit contributions
 studies+0.003
 university+0.003
 studying+0.003
old+0.003
ifting+0.003
Neg logit contributions
Br-0.003
 elbows-0.003
fl-0.002
mar-0.002
icles-0.002
Token It
Token ID632
Feature activation-5.58e-03

Pos logit contributions
 studies+0.003
 university+0.002
 studying+0.002
old+0.002
ifting+0.002
Neg logit contributions
Br-0.004
 elbows-0.003
icles-0.003
fl-0.003
ety-0.003
Token bit
Token ID1643
Feature activation-6.14e-02

Pos logit contributions
Br+0.008
 elbows+0.007
fl+0.007
 fart+0.007
mar+0.007
Neg logit contributions
 studies-0.009
 university-0.009
 studying-0.008
old-0.008
ifting-0.008
Token me
Token ID502
Feature activation-9.30e-04

Pos logit contributions
Br+0.125
suit+0.106
mar+0.104
gh+0.104
fl+0.102
Neg logit contributions
 that-0.069
 how-0.068
 this-0.065
 the-0.062
 young-0.058
Token have
Token ID423
Feature activation-1.85e-02

Pos logit contributions
Br+0.010
 fart+0.009
ety+0.009
w+0.008
cr+0.008
Neg logit contributions
 studies-0.014
old-0.013
 university-0.013
 studying-0.013
ifting-0.012
Token enough
Token ID1576
Feature activation-3.27e-02

Pos logit contributions
Br+0.032
 elbows+0.027
 fart+0.027
fl+0.027
icles+0.027
Neg logit contributions
 studies-0.026
 university-0.023
 studying-0.023
old-0.023
ifting-0.022
Token fruits
Token ID15921
Feature activation-1.95e-02

Pos logit contributions
Br+0.045
 elbows+0.037
fl+0.036
icles+0.036
 fart+0.035
Neg logit contributions
 studies-0.058
 university-0.054
 studying-0.053
old-0.052
ifting-0.051
Token for
Token ID329
Feature activation-2.71e-02

Pos logit contributions
Br+0.026
 fart+0.022
mar+0.022
icles+0.022
ety+0.022
Neg logit contributions
 studies-0.035
 university-0.032
 studying-0.031
 amaz-0.031
old-0.030
Token a
Token ID257
Feature activation-2.55e-02

Pos logit contributions
Br+0.038
 elbows+0.032
fl+0.030
mar+0.030
icles+0.029
Neg logit contributions
 studies-0.048
 university-0.044
 studying-0.043
old-0.042
ifting-0.042
Token pie
Token ID2508
Feature activation-5.89e-02

Pos logit contributions
Br+0.043
 elbows+0.039
icles+0.037
mar+0.037
fl+0.037
Neg logit contributions
 studies-0.037
 university-0.035
 studying-0.032
old-0.030
 church-0.030
Token,"
Token ID553
Feature activation-2.81e-02

Pos logit contributions
Br+0.116
 elbows+0.101
Carl+0.087
Wow+0.087
fl+0.087
Neg logit contributions
 studies-0.075
 university-0.069
 studying-0.064
old-0.062
 reading-0.061
Token Ben
Token ID3932
Feature activation-2.35e-02

Pos logit contributions
Br+0.043
 elbows+0.038
mar+0.037
fl+0.037
igi+0.036
Neg logit contributions
 studies-0.044
 university-0.041
old-0.040
 studying-0.040
 amaz-0.038
Token said
Token ID531
Feature activation+5.06e-03

Pos logit contributions
Br+0.036
 elbows+0.031
fl+0.029
icles+0.029
 fart+0.028
Neg logit contributions
 studies-0.039
old-0.036
 studying-0.035
 university-0.034
ifting-0.033
Token.
Token ID13
Feature activation+9.42e-05

Pos logit contributions
 studies+0.008
 university+0.007
 studying+0.007
old+0.007
ifting+0.007
Neg logit contributions
Br-0.008
 elbows-0.007
fl-0.007
 fart-0.007
mar-0.007
Token
Token ID198
Feature activation+4.07e-03

Pos logit contributions
 studies+0.000
 university+0.000
old+0.000
ifting+0.000
 young+0.000
Neg logit contributions
Br-0.000
 elbows-0.000
 forth-0.000
fl-0.000
mar-0.000
Token emerged
Token ID9349
Feature activation+1.88e-02

Pos logit contributions
Br+0.021
 elbows+0.017
mar+0.016
 fart+0.016
fl+0.016
Neg logit contributions
 studies-0.026
 university-0.025
ifting-0.025
 mailing-0.024
 studying-0.024
Token with
Token ID351
Feature activation+8.47e-03

Pos logit contributions
 studies+0.034
old+0.030
ience+0.029
 university+0.029
ifting+0.029
Neg logit contributions
Br-0.025
icles-0.021
cr-0.021
fl-0.021
 elbows-0.021
Token a
Token ID257
Feature activation-5.50e-03

Pos logit contributions
 studies+0.011
ifting+0.010
 university+0.010
 studying+0.010
ver+0.010
Neg logit contributions
Br-0.015
 elbows-0.014
 fart-0.012
icles-0.012
fl-0.012
Token big
Token ID1263
Feature activation+8.85e-05

Pos logit contributions
Br+0.010
 elbows+0.008
 fart+0.008
fl+0.008
 bisc+0.008
Neg logit contributions
 studies-0.008
ifting-0.007
old-0.007
 university-0.007
 studying-0.007
Token juicy
Token ID35075
Feature activation-2.89e-02

Pos logit contributions
 studies+0.000
ifting+0.000
old+0.000
ver+0.000
 university+0.000
Neg logit contributions
Br-0.000
 elbows-0.000
 forth-0.000
 bisc-0.000
 fart-0.000
Token carrot
Token ID40562
Feature activation-5.86e-02

Pos logit contributions
Br+0.054
 elbows+0.047
mar+0.044
fl+0.043
ety+0.043
Neg logit contributions
 studies-0.038
 university-0.034
 studying-0.033
 amaz-0.032
old-0.031
Token.
Token ID13
Feature activation+2.31e-03

Pos logit contributions
Br+0.110
 elbows+0.091
ety+0.085
mar+0.085
 fart+0.084
Neg logit contributions
 studies-0.075
 studying-0.069
 reading-0.067
 university-0.066
 mailing-0.064
Token
Token ID198
Feature activation+1.14e-03

Pos logit contributions
 studies+0.004
 university+0.003
 studying+0.003
ifting+0.003
old+0.003
Neg logit contributions
Br-0.004
 elbows-0.003
icles-0.003
fl-0.003
 fart-0.003
Token
Token ID198
Feature activation+1.37e-02

Pos logit contributions
 studies+0.002
 university+0.001
 studying+0.001
ifting+0.001
old+0.001
Neg logit contributions
Br-0.002
 elbows-0.002
mar-0.002
fl-0.002
icles-0.002
TokenHe
Token ID1544
Feature activation-3.55e-03

Pos logit contributions
 studies+0.024
 university+0.021
 studying+0.021
ifting+0.020
 mailing+0.020
Neg logit contributions
Br-0.021
 elbows-0.016
mar-0.016
Fin-0.016
Carl-0.016
Token held
Token ID2714
Feature activation+3.00e-03

Pos logit contributions
Br+0.005
 elbows+0.004
mar+0.004
fl+0.004
 fart+0.004
Neg logit contributions
 studies-0.006
 university-0.006
ifting-0.006
 studying-0.005
old-0.005
Token"
Token ID1
Feature activation-1.53e-02

Pos logit contributions
 studies+0.009
 university+0.008
 studying+0.008
ifting+0.007
old+0.007
Neg logit contributions
Br-0.011
 elbows-0.010
 fart-0.009
mar-0.009
ety-0.009
TokenO
Token ID46
Feature activation-8.75e-03

Pos logit contributions
Br+0.016
 elbows+0.014
icles+0.014
igi+0.013
 forth+0.013
Neg logit contributions
 studies-0.028
 university-0.027
ifting-0.026
 arrest-0.026
 amaz-0.026
Tokenuch
Token ID794
Feature activation-2.81e-02

Pos logit contributions
Br+0.004
mar+0.003
 elbows+0.003
fl+0.002
w+0.002
Neg logit contributions
 studies-0.023
 university-0.022
 studying-0.022
old-0.021
 amaz-0.021
Token,
Token ID11
Feature activation-1.93e-02

Pos logit contributions
Br+0.059
 elbows+0.050
icles+0.047
fl+0.045
mar+0.045
Neg logit contributions
 studies-0.030
 studying-0.027
 university-0.027
ifting-0.026
old-0.025
Token it
Token ID340
Feature activation+2.27e-03

Pos logit contributions
Br+0.029
 elbows+0.025
fl+0.023
mar+0.023
 fart+0.023
Neg logit contributions
 studies-0.032
 university-0.029
 studying-0.028
old-0.028
ifting-0.028
Token bit
Token ID1643
Feature activation-5.84e-02

Pos logit contributions
 studies+0.004
 university+0.004
 studying+0.003
old+0.003
ifting+0.003
Neg logit contributions
Br-0.003
 fart-0.003
fl-0.003
 elbows-0.003
mar-0.003
Token me
Token ID502
Feature activation-1.29e-03

Pos logit contributions
Br+0.122
mar+0.102
 elbows+0.100
 bisc+0.100
suit+0.097
Neg logit contributions
 that-0.059
 how-0.059
 this-0.058
 young-0.058
 university-0.057
Token!"
Token ID2474
Feature activation-2.65e-02

Pos logit contributions
Br+0.002
 elbows+0.002
fl+0.001
mar+0.001
 fart+0.001
Neg logit contributions
 studies-0.002
 university-0.002
 studying-0.002
old-0.002
ifting-0.002
Token Ben
Token ID3932
Feature activation-2.62e-02

Pos logit contributions
Br+0.035
 elbows+0.028
fl+0.027
mar+0.026
icles+0.026
Neg logit contributions
 studies-0.050
 university-0.046
 studying-0.045
old-0.044
ifting-0.044
Token cried
Token ID16896
Feature activation-1.05e-02

Pos logit contributions
Br+0.045
fl+0.038
icles+0.038
igi+0.037
 elbows+0.037
Neg logit contributions
 studies-0.037
old-0.034
 studying-0.033
 university-0.032
 reading-0.031
Token.
Token ID13
Feature activation+1.22e-03

Pos logit contributions
Br+0.020
 elbows+0.017
igi+0.017
mar+0.017
fl+0.017
Neg logit contributions
 studies-0.012
 university-0.012
old-0.011
 studying-0.011
 young-0.011
Token,
Token ID11
Feature activation-1.58e-02

Pos logit contributions
Br+0.034
 elbows+0.030
fl+0.029
mar+0.029
icles+0.028
Neg logit contributions
 studies-0.021
 university-0.018
 studying-0.018
old-0.017
ifting-0.017
Token ow
Token ID12334
Feature activation-1.83e-02

Pos logit contributions
Br+0.025
 elbows+0.021
mar+0.021
fl+0.021
ety+0.020
Neg logit contributions
 studies-0.024
 university-0.023
 studying-0.022
ifting-0.021
old-0.021
Token!
Token ID0
Feature activation-1.26e-03

Pos logit contributions
Br+0.030
 elbows+0.025
fl+0.024
mar+0.024
icles+0.024
Neg logit contributions
 studies-0.028
 university-0.026
 studying-0.025
old-0.025
ifting-0.024
Token The
Token ID383
Feature activation-1.26e-03

Pos logit contributions
Br+0.002
 elbows+0.002
fl+0.002
icles+0.002
ety+0.002
Neg logit contributions
 studies-0.002
 university-0.002
 studying-0.002
old-0.002
ifting-0.002
Token dog
Token ID3290
Feature activation+6.67e-03

Pos logit contributions
Br+0.002
mar+0.002
igi+0.002
icles+0.002
ety+0.002
Neg logit contributions
 studies-0.002
 university-0.001
 studying-0.001
 arrest-0.001
 reading-0.001
Token bit
Token ID1643
Feature activation-5.81e-02

Pos logit contributions
 studies+0.011
 university+0.010
 studying+0.010
old+0.009
ifting+0.009
Neg logit contributions
Br-0.010
 elbows-0.009
 fart-0.009
mar-0.008
fl-0.008
Token me
Token ID502
Feature activation+5.40e-03

Pos logit contributions
Br+0.124
mar+0.109
fl+0.103
suit+0.099
boxes+0.099
Neg logit contributions
 that-0.063
 this-0.060
 how-0.057
 the-0.056
 arrest-0.051
Token,
Token ID11
Feature activation-3.22e-02

Pos logit contributions
 studies+0.008
 university+0.008
 studying+0.008
old+0.007
ifting+0.007
Neg logit contributions
Br-0.009
 elbows-0.007
fl-0.007
mar-0.007
icles-0.007
Token Lily
Token ID20037
Feature activation-2.34e-03

Pos logit contributions
Br+0.056
 elbows+0.048
mar+0.047
fl+0.047
icles+0.046
Neg logit contributions
 studies-0.044
 university-0.041
 studying-0.040
old-0.038
ifting-0.038
Token!
Token ID0
Feature activation-8.80e-03

Pos logit contributions
Br+0.004
 elbows+0.003
mar+0.003
fl+0.003
 fart+0.003
Neg logit contributions
 studies-0.004
 university-0.003
 studying-0.003
ifting-0.003
old-0.003
Token It
Token ID632
Feature activation-4.23e-03

Pos logit contributions
Br+0.016
 elbows+0.013
fl+0.013
ety+0.013
icles+0.013
Neg logit contributions
 studies-0.013
 university-0.012
 studying-0.011
ifting-0.011
old-0.011
TokenIt
Token ID1026
Feature activation-1.43e-02

Pos logit contributions
Br+0.013
igi+0.011
icles+0.011
 elbows+0.011
mar+0.011
Neg logit contributions
 studies-0.017
 university-0.016
 studying-0.016
ifting-0.016
 arrest-0.015
Token bit
Token ID1643
Feature activation-6.04e-02

Pos logit contributions
Br+0.021
 elbows+0.017
fl+0.016
mar+0.016
icles+0.016
Neg logit contributions
 studies-0.025
 university-0.023
 studying-0.022
old-0.022
ifting-0.022
Token me
Token ID502
Feature activation+5.55e-03

Pos logit contributions
Br+0.120
fl+0.102
mar+0.101
igi+0.097
 elbows+0.096
Neg logit contributions
 that-0.068
 how-0.067
 this-0.066
 young-0.063
 arrest-0.058
Token!
Token ID0
Feature activation-6.21e-03

Pos logit contributions
 studies+0.009
 university+0.008
 studying+0.008
old+0.008
ifting+0.008
Neg logit contributions
Br-0.008
 elbows-0.007
mar-0.007
fl-0.007
 fart-0.007
Token It
Token ID632
Feature activation-3.84e-03

Pos logit contributions
Br+0.011
 elbows+0.009
icles+0.009
fl+0.009
 fart+0.009
Neg logit contributions
 studies-0.009
 university-0.008
 studying-0.008
ifting-0.008
old-0.008
Token bit
Token ID1643
Feature activation-5.78e-02

Pos logit contributions
Br+0.005
 elbows+0.004
fl+0.004
 fart+0.004
mar+0.004
Neg logit contributions
 studies-0.007
 university-0.007
 studying-0.006
old-0.006
ifting-0.006
Token me
Token ID502
Feature activation+4.30e-04

Pos logit contributions
Br+0.115
fl+0.096
mar+0.096
 bisc+0.090
 elbows+0.090
Neg logit contributions
 how-0.063
 that-0.063
 young-0.061
 studies-0.060
 this-0.060
Token!"
Token ID2474
Feature activation-5.17e-03

Pos logit contributions
 studies+0.001
 university+0.001
 studying+0.001
old+0.001
ifting+0.001
Neg logit contributions
Br-0.001
 elbows-0.001
mar-0.000
fl-0.000
 fart-0.000
Token
Token ID198
Feature activation+5.64e-03

Pos logit contributions
Br+0.010
 elbows+0.009
fl+0.009
icles+0.008
mar+0.008
Neg logit contributions
 studies-0.006
 university-0.006
 studying-0.005
old-0.005
ifting-0.005
Token
Token ID198
Feature activation+1.25e-02

Pos logit contributions
 studies+0.007
 university+0.007
 studying+0.006
old+0.006
ifting+0.006
Neg logit contributions
Br-0.011
 elbows-0.009
fl-0.009
mar-0.009
icles-0.009
TokenHe
Token ID1544
Feature activation-9.81e-03

Pos logit contributions
 studies+0.024
 university+0.022
 studying+0.022
old+0.022
ifting+0.021
Neg logit contributions
Br-0.015
 elbows-0.012
fl-0.012
icles-0.012
mar-0.012
Token!
Token ID0
Feature activation+3.59e-03

Pos logit contributions
Br+0.010
fl+0.009
mar+0.008
ety+0.008
 elbows+0.008
Neg logit contributions
 studies-0.014
 university-0.013
ifting-0.012
 studying-0.012
old-0.012
Token I
Token ID314
Feature activation-7.13e-03

Pos logit contributions
 studies+0.006
 studying+0.005
ifting+0.005
 university+0.005
old+0.005
Neg logit contributions
Br-0.005
Wow-0.004
icles-0.004
fl-0.004
ety-0.004
Token brought
Token ID3181
Feature activation-1.72e-02

Pos logit contributions
Br+0.009
mar+0.008
cr+0.007
ety+0.007
ny+0.007
Neg logit contributions
 university-0.013
 studies-0.012
ifting-0.012
 studying-0.012
 reading-0.012
Token you
Token ID345
Feature activation-1.54e-02

Pos logit contributions
Br+0.031
 elbows+0.026
mar+0.025
fl+0.025
ety+0.025
Neg logit contributions
 studies-0.024
 university-0.022
 studying-0.021
old-0.021
ifting-0.020
Token a
Token ID257
Feature activation-3.10e-02

Pos logit contributions
Br+0.024
fl+0.022
cr+0.021
 fart+0.021
ety+0.021
Neg logit contributions
 studies-0.024
ifting-0.022
 university-0.021
 studying-0.021
old-0.020
Token carrot
Token ID40562
Feature activation-5.73e-02

Pos logit contributions
Br+0.055
 elbows+0.048
fl+0.046
icles+0.045
mar+0.045
Neg logit contributions
 studies-0.043
 university-0.039
 studying-0.038
old-0.036
ifting-0.036
Token!"
Token ID2474
Feature activation+4.04e-03

Pos logit contributions
Br+0.103
 elbows+0.088
mar+0.079
fl+0.078
ety+0.077
Neg logit contributions
 studies-0.081
 university-0.075
 studying-0.072
old-0.068
 reading-0.067
Token Harriet
Token ID48805
Feature activation-1.33e-02

Pos logit contributions
 studies+0.006
ifting+0.006
old+0.006
 university+0.006
 studying+0.006
Neg logit contributions
Br-0.006
 elbows-0.005
mar-0.005
fl-0.005
ety-0.005
Token opened
Token ID4721
Feature activation+3.02e-03

Pos logit contributions
Br+0.020
mar+0.016
fl+0.016
 elbows+0.016
ety+0.015
Neg logit contributions
 studies-0.022
 university-0.021
 studying-0.020
ifting-0.020
old-0.019
Token her
Token ID607
Feature activation-3.86e-03

Pos logit contributions
 studies+0.004
 university+0.004
 studying+0.004
old+0.004
ifting+0.004
Neg logit contributions
Br-0.005
 elbows-0.004
fl-0.004
icles-0.004
mar-0.004
Token mouth
Token ID5422
Feature activation-1.00e-02

Pos logit contributions
Br+0.005
 elbows+0.004
fl+0.004
 fart+0.004
icles+0.004
Neg logit contributions
 studies-0.007
 university-0.007
ifting-0.007
old-0.007
 studying-0.007
Token m
Token ID285
Feature activation-1.05e-02

Pos logit contributions
Br+0.013
 elbows+0.011
mar+0.010
icles+0.010
fl+0.010
Neg logit contributions
 studies-0.012
 university-0.011
 studying-0.011
old-0.011
ifting-0.011
Tokenunch
Token ID3316
Feature activation-1.00e-02

Pos logit contributions
Br+0.023
 elbows+0.020
fl+0.020
 fart+0.020
mar+0.020
Neg logit contributions
 studies-0.010
 university-0.009
 studying-0.008
old-0.008
ifting-0.008
Tokening
Token ID278
Feature activation-1.30e-02

Pos logit contributions
Br+0.019
 elbows+0.017
 fart+0.014
mar+0.014
rim+0.014
Neg logit contributions
ifting-0.011
ver-0.011
 studies-0.011
old-0.011
 studying-0.010
Token on
Token ID319
Feature activation-1.45e-02

Pos logit contributions
Br+0.023
 elbows+0.020
mar+0.019
ety+0.019
 fart+0.018
Neg logit contributions
 studies-0.018
 studying-0.016
 university-0.016
ifting-0.015
old-0.015
Token a
Token ID257
Feature activation-2.30e-02

Pos logit contributions
Br+0.028
 elbows+0.024
mar+0.024
ety+0.023
icles+0.023
Neg logit contributions
 studies-0.018
 studying-0.016
 university-0.016
ifting-0.015
 amaz-0.014
Token carrot
Token ID40562
Feature activation-5.71e-02

Pos logit contributions
 elbows+0.041
Br+0.040
icles+0.039
fl+0.036
mar+0.036
Neg logit contributions
 studies-0.025
 young-0.024
 amaz-0.023
 university-0.022
 enchanted-0.022
Token.
Token ID13
Feature activation-8.10e-03

Pos logit contributions
Br+0.128
Kids+0.111
 elbows+0.109
Wow+0.107
Look+0.105
Neg logit contributions
 studies-0.050
 reading-0.044
 studying-0.043
 university-0.040
 that-0.040
Token
Token ID198
Feature activation-1.47e-02

Pos logit contributions
Br+0.016
 elbows+0.015
mar+0.014
igi+0.014
icles+0.014
Neg logit contributions
 studies-0.009
 university-0.008
 studying-0.008
ifting-0.007
old-0.007
Token
Token ID198
Feature activation+1.90e-03

Pos logit contributions
Br+0.031
 elbows+0.030
fl+0.028
 fart+0.028
icles+0.028
Neg logit contributions
 studies-0.014
old-0.013
 university-0.013
ifting-0.012
 studying-0.012
TokenSuddenly
Token ID38582
Feature activation-4.86e-03

Pos logit contributions
 studies+0.003
 university+0.003
 studying+0.003
old+0.002
ifting+0.002
Neg logit contributions
Br-0.003
 elbows-0.003
icles-0.003
fl-0.003
mar-0.003
Token,
Token ID11
Feature activation+3.12e-03

Pos logit contributions
Br+0.010
 elbows+0.008
fl+0.008
mar+0.008
icles+0.008
Neg logit contributions
 studies-0.006
 university-0.005
 studying-0.005
old-0.005
ifting-0.005
Token west
Token ID7421
Feature activation-5.84e-03

Pos logit contributions
 university+0.006
 studies+0.006
 church+0.005
 studying+0.005
 reading+0.005
Neg logit contributions
Br-0.008
icles-0.008
 elbows-0.007
mar-0.007
rim-0.007
Token and
Token ID290
Feature activation-3.87e-03

Pos logit contributions
Br+0.008
 fart+0.007
ety+0.007
fl+0.007
icles+0.007
Neg logit contributions
 studies-0.010
ifting-0.009
old-0.009
 studying-0.009
ience-0.009
Token twist
Token ID14528
Feature activation-1.12e-02

Pos logit contributions
Br+0.006
 elbows+0.005
 fart+0.005
 forth+0.005
fl+0.005
Neg logit contributions
 studies-0.006
ifting-0.005
old-0.005
 university-0.005
 studying-0.005
Token off
Token ID572
Feature activation-1.62e-02

Pos logit contributions
Br+0.018
 elbows+0.015
mar+0.014
fl+0.014
icles+0.014
Neg logit contributions
 studies-0.017
 university-0.016
 studying-0.015
old-0.015
ifting-0.015
Token a
Token ID257
Feature activation-1.81e-02

Pos logit contributions
Br+0.020
 fart+0.018
 away+0.017
icles+0.017
 forth+0.017
Neg logit contributions
 studies-0.029
old-0.026
ifting-0.026
 amaz-0.026
 studying-0.025
Token carrot
Token ID40562
Feature activation-5.70e-02

Pos logit contributions
Br+0.032
 elbows+0.029
fl+0.027
icles+0.027
mar+0.027
Neg logit contributions
 studies-0.025
 university-0.023
 studying-0.022
old-0.021
ifting-0.021
Token from
Token ID422
Feature activation-1.92e-02

Pos logit contributions
Br+0.099
 elbows+0.084
mar+0.078
fl+0.077
ety+0.076
Neg logit contributions
 studies-0.083
 university-0.077
 studying-0.073
old-0.069
ver-0.068
Token the
Token ID262
Feature activation-1.26e-02

Pos logit contributions
Br+0.035
 elbows+0.031
 fart+0.029
fl+0.029
mar+0.029
Neg logit contributions
 studies-0.026
 university-0.022
 studying-0.022
old-0.022
ifting-0.022
Token ground
Token ID2323
Feature activation-2.22e-02

Pos logit contributions
Br+0.021
 elbows+0.018
mar+0.018
icles+0.018
ety+0.017
Neg logit contributions
 studies-0.019
 university-0.018
 studying-0.017
 church-0.016
 young-0.016
Token.
Token ID13
Feature activation+5.04e-03

Pos logit contributions
Br+0.026
 fart+0.023
ety+0.022
fl+0.021
icles+0.021
Neg logit contributions
 studies-0.041
ifting-0.038
 university-0.037
 studying-0.037
 amaz-0.037
Token That
Token ID1320
Feature activation+1.50e-02

Pos logit contributions
 studies+0.008
 university+0.007
 studying+0.007
old+0.007
ifting+0.007
Neg logit contributions
Br-0.008
icles-0.006
 fart-0.006
fl-0.006
ety-0.006
Token,
Token ID11
Feature activation-1.52e-02

Pos logit contributions
Br+0.040
 elbows+0.035
fl+0.033
mar+0.033
icles+0.033
Neg logit contributions
 studies-0.022
 university-0.020
 studying-0.019
old-0.018
ifting-0.018
Token ow
Token ID12334
Feature activation-2.00e-02

Pos logit contributions
Br+0.023
 elbows+0.019
mar+0.018
fl+0.018
icles+0.018
Neg logit contributions
 studies-0.025
 university-0.023
 studying-0.023
ifting-0.022
old-0.022
Token!
Token ID0
Feature activation+3.89e-04

Pos logit contributions
Br+0.035
 elbows+0.030
fl+0.028
icles+0.028
mar+0.028
Neg logit contributions
 studies-0.029
 university-0.026
 studying-0.026
ifting-0.025
old-0.025
Token The
Token ID383
Feature activation-7.24e-03

Pos logit contributions
 studies+0.001
 university+0.001
 studying+0.001
old+0.001
ifting+0.001
Neg logit contributions
Br-0.001
 elbows-0.001
fl-0.000
icles-0.000
ety-0.000
Token lizard
Token ID42406
Feature activation+1.60e-02

Pos logit contributions
Br+0.015
icles+0.014
ety+0.014
mar+0.014
 elbows+0.013
Neg logit contributions
 studies-0.007
 university-0.007
 studying-0.006
 amaz-0.006
 reading-0.006
Token bit
Token ID1643
Feature activation-5.68e-02

Pos logit contributions
 studies+0.025
 university+0.024
 studying+0.023
old+0.022
ifting+0.022
Neg logit contributions
Br-0.024
 fart-0.022
 elbows-0.022
w-0.021
 forth-0.021
Token me
Token ID502
Feature activation+5.97e-03

Pos logit contributions
Br+0.119
mar+0.100
 bisc+0.098
fl+0.094
suit+0.093
Neg logit contributions
 that-0.059
 how-0.059
 this-0.057
 the-0.053
 knows-0.052
Token!
Token ID0
Feature activation-2.46e-03

Pos logit contributions
 studies+0.009
 university+0.008
 studying+0.008
old+0.008
ifting+0.008
Neg logit contributions
Br-0.010
 elbows-0.008
fl-0.008
icles-0.008
mar-0.008
Token It
Token ID632
Feature activation-1.85e-03

Pos logit contributions
Br+0.004
 elbows+0.003
ety+0.003
icles+0.003
fl+0.003
Neg logit contributions
 studies-0.004
 university-0.004
 studying-0.003
ifting-0.003
old-0.003
Token hurt
Token ID5938
Feature activation-4.02e-02

Pos logit contributions
Br+0.003
 elbows+0.002
fl+0.002
 fart+0.002
mar+0.002
Neg logit contributions
 studies-0.003
 university-0.003
 studying-0.003
old-0.003
ifting-0.003
Token me
Token ID502
Feature activation+4.83e-03

Pos logit contributions
Br+0.079
 bisc+0.064
mar+0.064
 elbows+0.063
fl+0.062
Neg logit contributions
 how-0.044
 university-0.042
 arrest-0.041
old-0.041
 young-0.041
Token cried
Token ID16896
Feature activation-1.43e-02

Pos logit contributions
Br+0.056
fl+0.053
icles+0.052
igi+0.051
 avail+0.051
Neg logit contributions
old-0.032
 studies-0.032
 young-0.030
 who-0.029
 relieved-0.029
Token.
Token ID13
Feature activation-2.09e-03

Pos logit contributions
Br+0.029
igi+0.026
 fart+0.026
bur+0.026
 elbows+0.026
Neg logit contributions
 university-0.013
 studies-0.013
 young-0.013
old-0.013
 studying-0.012
Token "
Token ID366
Feature activation-4.40e-03

Pos logit contributions
Br+0.005
 forth+0.004
 elbows+0.004
fl+0.004
ety+0.004
Neg logit contributions
 studies-0.002
 university-0.002
 studying-0.001
ifting-0.001
 young-0.001
TokenThe
Token ID464
Feature activation-7.18e-03

Pos logit contributions
Br+0.006
 elbows+0.005
icles+0.005
mar+0.005
fl+0.005
Neg logit contributions
 studies-0.007
 university-0.007
 studying-0.007
ifting-0.007
old-0.006
Token ducks
Token ID39694
Feature activation-3.33e-03

Pos logit contributions
Br+0.011
 elbows+0.009
mar+0.009
fl+0.009
icles+0.009
Neg logit contributions
 studies-0.011
 university-0.011
 studying-0.010
old-0.010
ifting-0.010
Token bit
Token ID1643
Feature activation-5.64e-02

Pos logit contributions
Br+0.006
 elbows+0.005
 fart+0.005
fl+0.005
mar+0.005
Neg logit contributions
 studies-0.005
 university-0.004
 studying-0.004
old-0.004
ifting-0.004
Token me
Token ID502
Feature activation+3.02e-03

Pos logit contributions
Br+0.119
mar+0.106
suit+0.100
fl+0.098
boxes+0.097
Neg logit contributions
 that-0.059
 this-0.056
 the-0.056
 how-0.054
 arrest-0.053
Token!
Token ID0
Feature activation-3.59e-03

Pos logit contributions
 studies+0.004
 university+0.004
 studying+0.004
old+0.004
ifting+0.004
Neg logit contributions
Br-0.005
 elbows-0.004
fl-0.004
mar-0.004
icles-0.004
Token They
Token ID1119
Feature activation-1.99e-03

Pos logit contributions
Br+0.006
 elbows+0.005
icles+0.005
fl+0.005
ety+0.005
Neg logit contributions
 studies-0.005
 university-0.005
 studying-0.005
ifting-0.004
old-0.004
Token are
Token ID389
Feature activation-1.10e-02

Pos logit contributions
Br+0.003
 elbows+0.003
 fart+0.003
fl+0.003
mar+0.003
Neg logit contributions
 studies-0.003
 university-0.003
 studying-0.003
old-0.003
ifting-0.003
Token mean
Token ID1612
Feature activation-1.82e-02

Pos logit contributions
Br+0.021
fl+0.017
mar+0.017
 elbows+0.017
icles+0.017
Neg logit contributions
 studies-0.014
 studying-0.013
 university-0.012
 reading-0.012
 glad-0.012
Token the
Token ID262
Feature activation-6.45e-03

Pos logit contributions
Br+0.011
 elbows+0.010
 fart+0.010
icles+0.010
fl+0.010
Neg logit contributions
 studies-0.010
old-0.008
 university-0.008
 studying-0.008
ver-0.008
Token pr
Token ID778
Feature activation-1.78e-02

Pos logit contributions
Br+0.012
 elbows+0.010
mar+0.010
icles+0.010
ibi+0.010
Neg logit contributions
 studies-0.008
 university-0.008
 church-0.007
 studying-0.007
 young-0.007
Tokenune
Token ID1726
Feature activation-8.49e-03

Pos logit contributions
Br+0.033
 elbows+0.028
mar+0.028
fl+0.027
 fart+0.027
Neg logit contributions
 studies-0.023
 university-0.020
 studying-0.020
ifting-0.020
old-0.019
Token.
Token ID13
Feature activation-2.78e-04

Pos logit contributions
Br+0.010
 fart+0.009
fl+0.009
 elbows+0.009
mar+0.009
Neg logit contributions
 studies-0.016
 university-0.014
 amaz-0.014
 studying-0.014
old-0.014
Token It
Token ID632
Feature activation+2.85e-03

Pos logit contributions
Br+0.000
cr+0.000
icles+0.000
fl+0.000
 elbows+0.000
Neg logit contributions
 studies-0.000
 university-0.000
 studying-0.000
old-0.000
ifting-0.000
Token bit
Token ID1643
Feature activation-5.63e-02

Pos logit contributions
 studies+0.004
 university+0.004
old+0.004
 studying+0.004
 amaz+0.004
Neg logit contributions
Br-0.004
fl-0.004
 fart-0.004
mar-0.004
ety-0.004
Token me
Token ID502
Feature activation+5.82e-03

Pos logit contributions
Br+0.102
 elbows+0.082
ibi+0.081
mar+0.081
 bisc+0.080
Neg logit contributions
 how-0.071
 that-0.068
 young-0.067
 knows-0.064
 this-0.064
Token."
Token ID526
Feature activation-4.43e-03

Pos logit contributions
 studies+0.011
 university+0.010
old+0.009
 amaz+0.009
 studying+0.009
Neg logit contributions
Br-0.008
fl-0.007
 fart-0.007
mar-0.007
 elbows-0.007
Token
Token ID198
Feature activation+5.13e-03

Pos logit contributions
Br+0.008
 elbows+0.007
fl+0.007
 fart+0.007
mar+0.007
Neg logit contributions
 studies-0.006
 university-0.005
 studying-0.005
old-0.005
ifting-0.005
Token
Token ID198
Feature activation+8.01e-03

Pos logit contributions
 studies+0.007
 university+0.006
 studying+0.006
old+0.006
ifting+0.006
Neg logit contributions
Br-0.009
 elbows-0.008
fl-0.008
mar-0.008
icles-0.008
TokenShe
Token ID3347
Feature activation+3.77e-03

Pos logit contributions
 studies+0.014
 university+0.013
 studying+0.012
old+0.012
ifting+0.012
Neg logit contributions
Br-0.012
 elbows-0.010
fl-0.009
mar-0.009
icles-0.009
Token!
Token ID0
Feature activation-2.79e-03

Pos logit contributions
Br+0.015
 elbows+0.013
icles+0.012
mar+0.012
fl+0.012
Neg logit contributions
 studies-0.009
 university-0.008
 studying-0.008
old-0.007
ifting-0.007
Token I
Token ID314
Feature activation-1.39e-02

Pos logit contributions
Br+0.005
fl+0.004
icles+0.004
 fart+0.004
ety+0.004
Neg logit contributions
 studies-0.004
old-0.004
 studying-0.004
 university-0.004
ifting-0.004
Token found
Token ID1043
Feature activation-9.65e-03

Pos logit contributions
Br+0.017
 fart+0.013
cr+0.013
ety+0.013
mar+0.013
Neg logit contributions
 studies-0.027
 university-0.026
 studying-0.025
old-0.024
ifting-0.024
Token a
Token ID257
Feature activation-2.00e-02

Pos logit contributions
Br+0.021
 elbows+0.018
fl+0.018
mar+0.018
icles+0.017
Neg logit contributions
 studies-0.010
 university-0.009
 studying-0.008
old-0.008
ifting-0.008
Token bigger
Token ID5749
Feature activation-2.17e-02

Pos logit contributions
Br+0.033
 elbows+0.032
icles+0.030
mar+0.030
fl+0.028
Neg logit contributions
 studies-0.027
 university-0.026
 young-0.026
 studying-0.025
 reading-0.024
Token carrot
Token ID40562
Feature activation-5.61e-02

Pos logit contributions
Br+0.038
 elbows+0.035
mar+0.034
rim+0.031
icles+0.031
Neg logit contributions
 studies-0.028
 university-0.026
 amaz-0.026
 young-0.025
 studying-0.025
Token!"
Token ID2474
Feature activation-2.86e-02

Pos logit contributions
Br+0.110
 elbows+0.092
icles+0.083
mar+0.083
ety+0.083
Neg logit contributions
 studies-0.072
 university-0.065
 studying-0.063
 reading-0.060
 arrest-0.058
Token Lily
Token ID20037
Feature activation-1.14e-02

Pos logit contributions
Br+0.036
 elbows+0.030
mar+0.028
fl+0.028
 fart+0.027
Neg logit contributions
 studies-0.054
 university-0.051
 studying-0.049
old-0.049
ifting-0.048
Token says
Token ID1139
Feature activation+4.26e-03

Pos logit contributions
Br+0.019
 elbows+0.017
fl+0.016
 fart+0.016
mar+0.016
Neg logit contributions
 studies-0.017
 university-0.015
 studying-0.015
old-0.014
ifting-0.014
Token.
Token ID13
Feature activation-3.36e-04

Pos logit contributions
 studies+0.007
 university+0.006
 studying+0.006
old+0.006
ifting+0.006
Neg logit contributions
Br-0.007
 elbows-0.006
fl-0.005
mar-0.005
icles-0.005
Token She
Token ID1375
Feature activation+1.22e-02

Pos logit contributions
Br+0.001
 elbows+0.001
mar+0.001
fl+0.001
icles+0.001
Neg logit contributions
 studies-0.000
 university-0.000
 studying-0.000
old-0.000
ifting-0.000
Token it
Token ID340
Feature activation+8.35e-03

Pos logit contributions
Br+0.033
 elbows+0.028
mar+0.027
fl+0.027
ety+0.026
Neg logit contributions
 studies-0.026
 university-0.024
 studying-0.023
old-0.023
ifting-0.023
Token bit
Token ID1643
Feature activation-5.17e-02

Pos logit contributions
 studies+0.012
 university+0.012
 studying+0.011
old+0.011
ifting+0.011
Neg logit contributions
Br-0.014
 elbows-0.012
fl-0.012
 fart-0.012
mar-0.012
Token me
Token ID502
Feature activation+6.63e-03

Pos logit contributions
Br+0.109
suit+0.095
mar+0.089
Fin+0.088
 bisc+0.087
Neg logit contributions
 how-0.060
 that-0.057
 this-0.056
 the-0.051
 what-0.047
Token,
Token ID11
Feature activation-2.59e-02

Pos logit contributions
 studies+0.010
 university+0.009
 studying+0.009
old+0.009
ifting+0.008
Neg logit contributions
Br-0.011
 elbows-0.009
fl-0.009
icles-0.009
mar-0.009
Token it
Token ID340
Feature activation+2.72e-03

Pos logit contributions
Br+0.043
 elbows+0.037
mar+0.036
fl+0.036
ety+0.034
Neg logit contributions
 studies-0.038
 university-0.036
 studying-0.034
ifting-0.033
old-0.033
Token bit
Token ID1643
Feature activation-5.50e-02

Pos logit contributions
 studies+0.004
 university+0.004
 studying+0.004
old+0.004
ifting+0.004
Neg logit contributions
Br-0.004
 elbows-0.003
fl-0.003
 fart-0.003
mar-0.003
Token me
Token ID502
Feature activation+1.65e-03

Pos logit contributions
Br+0.115
suit+0.099
mar+0.095
Fin+0.092
ibi+0.091
Neg logit contributions
 how-0.063
 that-0.062
 this-0.059
 the-0.054
 what-0.051
Token!"
Token ID2474
Feature activation-2.38e-02

Pos logit contributions
 studies+0.003
 university+0.002
 studying+0.002
old+0.002
ifting+0.002
Neg logit contributions
Br-0.003
 elbows-0.002
fl-0.002
mar-0.002
icles-0.002
Token she
Token ID673
Feature activation-2.00e-02

Pos logit contributions
Br+0.039
fl+0.034
mar+0.033
icles+0.033
 fart+0.033
Neg logit contributions
 studies-0.035
 university-0.034
 studying-0.032
old-0.032
ifting-0.031
Token sob
Token ID27355
Feature activation-1.46e-02

Pos logit contributions
Br+0.036
icles+0.035
fl+0.034
igi+0.034
bur+0.033
Neg logit contributions
old-0.022
 studies-0.022
 young-0.021
 knows-0.020
 relieved-0.020
Tokenbed
Token ID3077
Feature activation-1.39e-02

Pos logit contributions
Br+0.025
Wow+0.021
 bisc+0.020
 fart+0.020
icles+0.019
Neg logit contributions
 studying-0.021
 studies-0.020
 reading-0.020
anging-0.019
ifting-0.019
Token
Token ID198
Feature activation-1.14e-02

Pos logit contributions
Br+0.022
 elbows+0.019
fl+0.018
mar+0.018
icles+0.018
Neg logit contributions
 studies-0.015
 university-0.013
 studying-0.013
ifting-0.013
old-0.012
Token
Token ID198
Feature activation+1.56e-03

Pos logit contributions
Br+0.022
 elbows+0.020
fl+0.019
 fart+0.019
icles+0.018
Neg logit contributions
 studies-0.014
 university-0.012
old-0.012
 studying-0.012
ifting-0.012
Token"
Token ID1
Feature activation-2.14e-02

Pos logit contributions
 studies+0.002
 university+0.001
 studying+0.001
old+0.001
ifting+0.001
Neg logit contributions
Br-0.003
 elbows-0.003
fl-0.003
icles-0.003
mar-0.003
TokenHelp
Token ID22087
Feature activation-3.37e-02

Pos logit contributions
Br+0.024
icles+0.024
wood+0.023
igi+0.023
row+0.023
Neg logit contributions
 arrest-0.034
 studies-0.033
 university-0.032
 amaz-0.032
ifting-0.031
Token,
Token ID11
Feature activation-2.64e-02

Pos logit contributions
Br+0.085
mar+0.071
 elbows+0.071
ksh+0.069
icles+0.068
Neg logit contributions
old-0.021
 arrest-0.020
 university-0.020
 studying-0.019
 young-0.018
Token help
Token ID1037
Feature activation-5.50e-02

Pos logit contributions
Br+0.048
 elbows+0.042
 fart+0.040
mar+0.039
fl+0.039
Neg logit contributions
 studies-0.036
 university-0.031
 studying-0.031
old-0.030
ifting-0.030
Token,
Token ID11
Feature activation-2.51e-02

Pos logit contributions
Br+0.129
 elbows+0.113
fl+0.106
Wow+0.104
igi+0.104
Neg logit contributions
old-0.046
 studies-0.043
 university-0.040
 how-0.038
 young-0.038
Token dad
Token ID9955
Feature activation-1.06e-02

Pos logit contributions
Br+0.039
 elbows+0.032
 fart+0.031
icles+0.030
 forth+0.029
Neg logit contributions
 studies-0.042
old-0.037
ifting-0.036
 studying-0.036
 university-0.036
Token!"
Token ID2474
Feature activation-3.50e-02

Pos logit contributions
Br+0.016
 elbows+0.014
fl+0.013
mar+0.013
 fart+0.013
Neg logit contributions
 studies-0.017
 university-0.016
 studying-0.015
old-0.015
ifting-0.015
Token Ben
Token ID3932
Feature activation-2.24e-02

Pos logit contributions
Br+0.050
fl+0.044
mar+0.042
 elbows+0.042
icles+0.041
Neg logit contributions
 studies-0.060
 university-0.058
 studying-0.054
old-0.053
ifting-0.051
Token screams
Token ID26557
Feature activation-1.49e-02

Pos logit contributions
Br+0.043
icles+0.036
igi+0.036
fl+0.036
mar+0.036
Neg logit contributions
 studies-0.029
 university-0.026
old-0.025
 studying-0.025
ifting-0.023
Token she
Token ID673
Feature activation+8.94e-03

Pos logit contributions
 studies+0.030
ifting+0.028
 amaz+0.027
 studying+0.027
 university+0.026
Neg logit contributions
Br-0.025
fl-0.023
 forth-0.023
mar-0.022
olyn-0.022
Token went
Token ID1816
Feature activation+2.73e-02

Pos logit contributions
 studies+0.017
 university+0.017
 studying+0.017
 glad+0.017
old+0.017
Neg logit contributions
Br-0.009
mar-0.008
 forth-0.007
fl-0.007
ety-0.007
Token she
Token ID673
Feature activation+5.65e-03

Pos logit contributions
 studies+0.043
 amaz+0.040
 arrest+0.037
ifting+0.036
ience+0.036
Neg logit contributions
fl-0.035
Br-0.034
 forth-0.034
mar-0.033
 fart-0.031
Token would
Token ID561
Feature activation+1.64e-02

Pos logit contributions
igsaw+0.011
 university+0.011
ifting+0.010
 studies+0.010
 glad+0.010
Neg logit contributions
mar-0.005
Br-0.005
ety-0.005
 forth-0.005
fl-0.005
Token ask
Token ID1265
Feature activation+4.88e-03

Pos logit contributions
ifting+0.027
 studies+0.026
old+0.025
itt+0.024
ience+0.024
Neg logit contributions
Br-0.022
mar-0.018
 away-0.018
 elbows-0.018
 fart-0.018
Token lots
Token ID6041
Feature activation+3.33e-02

Pos logit contributions
 studies+0.008
ifting+0.007
old+0.007
 arrest+0.007
 university+0.007
Neg logit contributions
Br-0.007
icles-0.006
fl-0.006
 elbows-0.006
mar-0.006
Token of
Token ID286
Feature activation+1.61e-02

Pos logit contributions
 studies+0.048
ver+0.042
 amaz+0.042
ifting+0.041
 university+0.040
Neg logit contributions
ie-0.047
fl-0.046
Br-0.046
ies-0.045
 ones-0.045
Token questions
Token ID2683
Feature activation+2.56e-02

Pos logit contributions
 studies+0.026
ifting+0.025
ver+0.024
 arrest+0.023
 university+0.023
Neg logit contributions
Br-0.021
 elbows-0.018
fl-0.018
 ones-0.017
 fart-0.017
Token and
Token ID290
Feature activation+6.71e-03

Pos logit contributions
 studies+0.047
 arrest+0.047
ifting+0.044
 amaz+0.044
ience+0.043
Neg logit contributions
mar-0.023
fl-0.023
icles-0.022
Br-0.022
 ones-0.021
Token discover
Token ID7073
Feature activation+2.95e-02

Pos logit contributions
ifting+0.009
 studies+0.009
ver+0.009
 university+0.008
old+0.008
Neg logit contributions
Br-0.010
fl-0.009
 elbows-0.009
 forth-0.009
mar-0.009
Token new
Token ID649
Feature activation+1.32e-02

Pos logit contributions
 studies+0.049
 arrest+0.044
ifting+0.043
 university+0.042
 studying+0.042
Neg logit contributions
Br-0.037
fl-0.037
mar-0.034
 ones-0.033
 elbows-0.032
Token a
Token ID257
Feature activation-8.85e-04

Pos logit contributions
 studies+0.012
 university+0.010
 studying+0.010
old+0.009
ifting+0.009
Neg logit contributions
Br-0.024
 elbows-0.022
fl-0.021
mar-0.021
icles-0.021
Token par
Token ID1582
Feature activation+2.66e-02

Pos logit contributions
Br+0.002
 elbows+0.001
fl+0.001
 fart+0.001
mar+0.001
Neg logit contributions
 studies-0.001
 university-0.001
old-0.001
 studying-0.001
ifting-0.001
Tokenrot
Token ID10599
Feature activation-1.14e-02

Pos logit contributions
 studies+0.046
 university+0.042
 studying+0.041
old+0.040
ifting+0.039
Neg logit contributions
Br-0.039
 elbows-0.031
mar-0.031
fl-0.030
ety-0.030
Token.
Token ID13
Feature activation+2.84e-03

Pos logit contributions
Br+0.015
fl+0.014
mar+0.013
 elbows+0.013
cr+0.012
Neg logit contributions
 studies-0.021
 amaz-0.018
ifting-0.018
 university-0.018
ver-0.018
Token The
Token ID383
Feature activation+2.87e-03

Pos logit contributions
 studies+0.004
ience+0.004
ieving+0.004
 studying+0.004
old+0.004
Neg logit contributions
Br-0.005
icles-0.004
Wow-0.004
ety-0.004
 P-0.004
Token par
Token ID1582
Feature activation+3.62e-02

Pos logit contributions
 studies+0.004
 university+0.003
 studying+0.003
old+0.003
 mailing+0.003
Neg logit contributions
Br-0.005
 elbows-0.005
mar-0.005
icles-0.005
fl-0.004
Tokenrot
Token ID10599
Feature activation+1.31e-02

Pos logit contributions
 studies+0.066
 university+0.060
 studying+0.059
old+0.057
 amaz+0.057
Neg logit contributions
Br-0.050
fl-0.039
mar-0.039
 elbows-0.039
ety-0.038
Token was
Token ID373
Feature activation-1.08e-02

Pos logit contributions
 studies+0.026
 university+0.026
igsaw+0.025
 amaz+0.025
ifting+0.024
Neg logit contributions
Br-0.013
fl-0.012
mar-0.012
ety-0.011
cr-0.010
Token ch
Token ID442
Feature activation+2.27e-02

Pos logit contributions
Br+0.020
 elbows+0.017
fl+0.017
mar+0.016
 fart+0.016
Neg logit contributions
 studies-0.014
 university-0.013
 studying-0.012
old-0.012
ifting-0.012
Tokenubby
Token ID38393
Feature activation+1.37e-02

Pos logit contributions
 studies+0.041
 university+0.038
 studying+0.037
 mailing+0.034
 reading+0.034
Neg logit contributions
Br-0.031
 elbows-0.028
ety-0.027
w-0.025
row-0.025
Token and
Token ID290
Feature activation+3.80e-03

Pos logit contributions
 studies+0.022
ifting+0.021
old+0.019
igsaw+0.019
 university+0.019
Neg logit contributions
fl-0.018
Br-0.017
 fart-0.017
ters-0.017
ety-0.017
Token and
Token ID290
Feature activation+9.82e-03

Pos logit contributions
 studies+0.015
 university+0.014
 studying+0.013
ifting+0.013
old+0.013
Neg logit contributions
Br-0.019
 elbows-0.016
 fart-0.016
fl-0.016
icles-0.016
Token make
Token ID787
Feature activation+4.84e-03

Pos logit contributions
old+0.016
ifting+0.015
 studies+0.015
igsaw+0.014
ver+0.014
Neg logit contributions
Br-0.013
 forth-0.013
 fart-0.013
 elbows-0.012
fl-0.011
Token sounds
Token ID5238
Feature activation+9.88e-03

Pos logit contributions
 studies+0.007
 university+0.006
 studying+0.006
ifting+0.006
old+0.006
Neg logit contributions
Br-0.008
 elbows-0.007
 fart-0.007
fl-0.007
icles-0.006
Token when
Token ID618
Feature activation-9.52e-03

Pos logit contributions
 studies+0.017
 university+0.015
old+0.015
 studying+0.015
ifting+0.015
Neg logit contributions
Br-0.014
 elbows-0.011
fl-0.011
 fart-0.011
icles-0.011
Token they
Token ID484
Feature activation+1.28e-02

Pos logit contributions
Br+0.015
 elbows+0.013
icles+0.012
fl+0.012
 fart+0.012
Neg logit contributions
 studies-0.015
 university-0.013
old-0.013
 studying-0.013
ifting-0.013
Token press
Token ID1803
Feature activation+3.14e-02

Pos logit contributions
ifting+0.022
igsaw+0.022
year+0.022
old+0.022
 mailing+0.020
Neg logit contributions
 elbows-0.014
 away-0.014
 fart-0.014
Br-0.014
 forth-0.013
Token a
Token ID257
Feature activation-1.93e-02

Pos logit contributions
 studies+0.050
ifting+0.045
old+0.045
 studying+0.044
 university+0.043
Neg logit contributions
Br-0.048
 elbows-0.041
icles-0.040
 fart-0.040
fl-0.038
Token button
Token ID4936
Feature activation-4.65e-03

Pos logit contributions
Br+0.036
 elbows+0.032
ety+0.031
mar+0.030
fl+0.030
Neg logit contributions
 studies-0.025
 university-0.024
 studying-0.022
 amaz-0.021
old-0.021
Token.
Token ID13
Feature activation-4.90e-03

Pos logit contributions
Br+0.006
 elbows+0.005
icles+0.005
fl+0.005
mar+0.005
Neg logit contributions
 studies-0.008
 university-0.007
old-0.007
 studying-0.007
ifting-0.007
Token Lily
Token ID20037
Feature activation+6.14e-03

Pos logit contributions
Br+0.008
 elbows+0.007
mar+0.007
fl+0.007
 forth+0.006
Neg logit contributions
 studies-0.007
 university-0.007
 studying-0.007
ifting-0.007
old-0.007
Token likes
Token ID7832
Feature activation-9.68e-04

Pos logit contributions
igsaw+0.009
year+0.009
 university+0.009
ifting+0.009
 mailing+0.009
Neg logit contributions
fl-0.008
 forth-0.008
ny-0.007
ety-0.007
 fart-0.007
Token she
Token ID673
Feature activation+1.82e-02

Pos logit contributions
 studies+0.018
ifting+0.016
old+0.016
 university+0.015
 amaz+0.015
Neg logit contributions
Br-0.019
 elbows-0.016
fl-0.016
mar-0.016
icles-0.016
Token grows
Token ID13676
Feature activation+2.13e-02

Pos logit contributions
 university+0.039
ifting+0.036
igsaw+0.036
 arrest+0.036
 studying+0.035
Neg logit contributions
mar-0.015
ety-0.014
olyn-0.014
Br-0.013
suit-0.012
Token up
Token ID510
Feature activation+2.27e-02

Pos logit contributions
 studies+0.041
old+0.039
 amaz+0.039
 arrest+0.039
 Many+0.037
Neg logit contributions
Br-0.021
icles-0.020
fl-0.020
mar-0.019
 elbows-0.019
Token,
Token ID11
Feature activation+1.61e-02

Pos logit contributions
 studies+0.034
ifting+0.033
old+0.032
 amaz+0.031
 university+0.029
Neg logit contributions
Br-0.033
 elbows-0.029
mar-0.028
 away-0.028
 forth-0.027
Token she
Token ID673
Feature activation+1.13e-02

Pos logit contributions
 studies+0.026
ifting+0.024
old+0.024
 university+0.023
 studying+0.022
Neg logit contributions
Br-0.024
 elbows-0.020
fl-0.019
icles-0.019
 forth-0.019
Token can
Token ID460
Feature activation+3.28e-02

Pos logit contributions
 university+0.024
igsaw+0.023
ifting+0.023
 arrest+0.023
 studies+0.022
Neg logit contributions
Br-0.010
mar-0.009
ety-0.008
 forth-0.008
olyn-0.007
Token choose
Token ID3853
Feature activation+2.09e-02

Pos logit contributions
 studies+0.052
 amaz+0.052
ifting+0.052
old+0.050
igsaw+0.050
Neg logit contributions
Br-0.042
mar-0.039
icles-0.035
bur-0.033
olyn-0.033
Token a
Token ID257
Feature activation-7.31e-03

Pos logit contributions
 studies+0.031
ifting+0.028
 university+0.027
old+0.027
 studying+0.026
Neg logit contributions
Br-0.033
icles-0.029
fl-0.029
 elbows-0.029
mar-0.028
Token job
Token ID1693
Feature activation-1.11e-02

Pos logit contributions
Br+0.012
 elbows+0.010
 fart+0.010
fl+0.010
mar+0.010
Neg logit contributions
 studies-0.011
 university-0.010
ifting-0.010
old-0.009
 studying-0.009
Token that
Token ID326
Feature activation+4.88e-04

Pos logit contributions
Br+0.017
 elbows+0.015
mar+0.014
fl+0.014
 fart+0.014
Neg logit contributions
 studies-0.016
ifting-0.015
 university-0.015
old-0.014
 studying-0.014
Token has
Token ID468
Feature activation+7.37e-03

Pos logit contributions
 studies+0.001
ifting+0.001
 university+0.001
old+0.001
 studying+0.001
Neg logit contributions
Br-0.001
fl-0.001
 elbows-0.001
mar-0.001
 fart-0.001
Token at
Token ID379
Feature activation+8.09e-03

Pos logit contributions
Br+0.005
 elbows+0.004
 fart+0.004
icles+0.004
 bisc+0.004
Neg logit contributions
 studies-0.004
 studying-0.003
 university-0.003
old-0.003
ifting-0.003
Token his
Token ID465
Feature activation+1.13e-03

Pos logit contributions
 studies+0.012
ifting+0.011
old+0.011
ival+0.011
ience+0.011
Neg logit contributions
Br-0.011
 forth-0.011
 elbows-0.011
icles-0.011
bur-0.010
Token adventure
Token ID8855
Feature activation+5.25e-03

Pos logit contributions
ifting+0.002
old+0.002
ver+0.002
osing+0.002
 studies+0.002
Neg logit contributions
Br-0.002
 elbows-0.002
fl-0.001
 forth-0.001
 fart-0.001
Token.
Token ID13
Feature activation+7.54e-04

Pos logit contributions
 studies+0.009
 university+0.008
ifting+0.008
 studying+0.008
old+0.008
Neg logit contributions
Br-0.007
 elbows-0.006
fl-0.006
mar-0.006
icles-0.006
Token He
Token ID679
Feature activation-3.26e-03

Pos logit contributions
 studies+0.001
 university+0.001
 studying+0.001
ifting+0.001
 mailing+0.001
Neg logit contributions
Br-0.001
fl-0.001
 elbows-0.001
icles-0.001
Carl-0.001
Token couldn
Token ID3521
Feature activation+3.97e-02

Pos logit contributions
Br+0.004
 forth+0.003
fl+0.003
mar+0.003
cr+0.003
Neg logit contributions
 university-0.006
ifting-0.006
 studies-0.006
igsaw-0.006
 studying-0.006
Token't
Token ID470
Feature activation+2.69e-02

Pos logit contributions
 amaz+0.077
 studies+0.077
old+0.074
year+0.072
 studying+0.071
Neg logit contributions
Br-0.040
mar-0.036
icles-0.035
fl-0.032
 fart-0.032
Token wait
Token ID4043
Feature activation-6.01e-03

Pos logit contributions
 studies+0.043
 university+0.039
 studying+0.039
ifting+0.038
old+0.038
Neg logit contributions
Br-0.042
 elbows-0.034
mar-0.033
 fart-0.033
fl-0.033
Token to
Token ID284
Feature activation-2.96e-03

Pos logit contributions
Br+0.006
fl+0.005
 elbows+0.005
mar+0.005
 forth+0.005
Neg logit contributions
 studies-0.012
old-0.011
 university-0.011
ifting-0.011
ience-0.010
Token go
Token ID467
Feature activation+5.55e-03

Pos logit contributions
Br+0.004
 forth+0.004
 elbows+0.004
 fart+0.004
mar+0.004
Neg logit contributions
 studies-0.005
ifting-0.004
old-0.004
ver-0.004
ience-0.004
Token back
Token ID736
Feature activation-4.40e-03

Pos logit contributions
 studies+0.010
 university+0.009
 studying+0.009
old+0.009
ifting+0.009
Neg logit contributions
Br-0.007
 elbows-0.006
fl-0.006
mar-0.006
 fart-0.005
Token being
Token ID852
Feature activation+1.66e-03

Pos logit contributions
 studies+0.026
old+0.024
ifting+0.023
ver+0.023
ience+0.022
Neg logit contributions
Br-0.024
fl-0.022
 elbows-0.020
mar-0.019
 forth-0.019
Token together
Token ID1978
Feature activation+8.16e-04

Pos logit contributions
 studies+0.002
 university+0.002
 studying+0.002
old+0.002
ifting+0.002
Neg logit contributions
Br-0.003
 elbows-0.003
fl-0.003
mar-0.003
icles-0.003
Token.
Token ID13
Feature activation+1.00e-02

Pos logit contributions
 studies+0.001
old+0.001
ifting+0.001
 university+0.001
igsaw+0.001
Neg logit contributions
Br-0.001
 away-0.001
cr-0.001
 forth-0.001
mar-0.001
Token Motor
Token ID12533
Feature activation+9.62e-03

Pos logit contributions
 studies+0.017
 university+0.016
old+0.016
 studying+0.016
ifting+0.015
Neg logit contributions
Br-0.014
icles-0.012
fl-0.011
 elbows-0.011
ety-0.011
Token loved
Token ID6151
Feature activation-4.66e-05

Pos logit contributions
 studies+0.015
 university+0.015
 studying+0.014
old+0.014
ifting+0.013
Neg logit contributions
Br-0.015
fl-0.013
mar-0.013
 elbows-0.012
icles-0.012
Token playing
Token ID2712
Feature activation+1.31e-02

Pos logit contributions
Br+0.000
fl+0.000
 elbows+0.000
mar+0.000
ety+0.000
Neg logit contributions
 studies-0.000
 university-0.000
ifting-0.000
old-0.000
 studying-0.000
Token with
Token ID351
Feature activation+8.34e-03

Pos logit contributions
 studies+0.021
 university+0.019
old+0.018
 studying+0.018
ifting+0.018
Neg logit contributions
Br-0.021
 elbows-0.017
fl-0.017
mar-0.017
 fart-0.017
Token M
Token ID337
Feature activation-8.43e-04

Pos logit contributions
 studies+0.015
 university+0.014
 studying+0.013
ifting+0.013
old+0.013
Neg logit contributions
Br-0.012
 elbows-0.010
fl-0.009
icles-0.009
mar-0.009
Tokenabel
Token ID9608
Feature activation+2.04e-02

Pos logit contributions
Br+0.001
 elbows+0.001
fl+0.001
icles+0.001
mar+0.001
Neg logit contributions
 studies-0.001
 university-0.001
 studying-0.001
old-0.001
ifting-0.001
Token and
Token ID290
Feature activation-3.12e-03

Pos logit contributions
 studies+0.034
 university+0.033
 amaz+0.032
ifting+0.031
 studying+0.030
Neg logit contributions
Br-0.026
fl-0.025
 forth-0.023
ie-0.022
mar-0.022
Token feeling
Token ID4203
Feature activation+1.68e-02

Pos logit contributions
Br+0.005
fl+0.005
 elbows+0.004
 forth+0.004
icles+0.004
Neg logit contributions
 studies-0.005
ifting-0.004
old-0.004
 university-0.004
ival-0.004
Token and
Token ID290
Feature activation+3.15e-03

Pos logit contributions
 studies+0.015
 university+0.014
 arrest+0.014
 studying+0.013
old+0.013
Neg logit contributions
fl-0.014
Br-0.014
mar-0.014
icles-0.013
 forth-0.013
Token wondered
Token ID14028
Feature activation+7.22e-03

Pos logit contributions
 studies+0.005
 university+0.005
ifting+0.005
old+0.004
ver+0.004
Neg logit contributions
Br-0.005
 elbows-0.004
fl-0.004
 forth-0.004
mar-0.004
Token if
Token ID611
Feature activation+8.22e-03

Pos logit contributions
 studies+0.010
 arrest+0.009
old+0.008
igsaw+0.008
ifting+0.008
Neg logit contributions
 forth-0.011
Br-0.011
icles-0.010
fl-0.010
mar-0.010
Token there
Token ID612
Feature activation+5.06e-03

Pos logit contributions
 studies+0.013
old+0.011
ifting+0.011
 studying+0.011
 university+0.011
Neg logit contributions
Br-0.012
 elbows-0.010
fl-0.010
mar-0.010
ety-0.010
Token might
Token ID1244
Feature activation+4.26e-03

Pos logit contributions
 studies+0.008
 university+0.008
 studying+0.007
ifting+0.007
old+0.007
Neg logit contributions
Br-0.007
fl-0.006
mar-0.006
 elbows-0.006
icles-0.006
Token be
Token ID307
Feature activation+1.68e-02

Pos logit contributions
 studies+0.008
old+0.007
 studying+0.007
 university+0.007
ifting+0.007
Neg logit contributions
Br-0.006
 elbows-0.005
fl-0.005
mar-0.005
 fart-0.005
Token something
Token ID1223
Feature activation-9.32e-03

Pos logit contributions
 studies+0.024
 university+0.022
 studying+0.021
old+0.021
ifting+0.020
Neg logit contributions
Br-0.029
 elbows-0.025
fl-0.024
mar-0.024
icles-0.024
Token fascinating
Token ID13899
Feature activation-9.55e-03

Pos logit contributions
Br+0.017
 elbows+0.015
fl+0.015
mar+0.014
 fart+0.014
Neg logit contributions
 studies-0.012
 university-0.011
 studying-0.010
old-0.010
ifting-0.010
Token inside
Token ID2641
Feature activation-1.37e-02

Pos logit contributions
Br+0.013
fl+0.011
mar+0.011
 elbows+0.011
 fart+0.011
Neg logit contributions
 studies-0.016
 university-0.015
 studying-0.014
old-0.014
ifting-0.014
Token.
Token ID13
Feature activation+1.05e-02

Pos logit contributions
Br+0.019
 elbows+0.017
fl+0.016
mar+0.016
icles+0.016
Neg logit contributions
 studies-0.023
 university-0.020
 studying-0.020
ifting-0.020
old-0.020
Token
Token ID198
Feature activation+3.12e-03

Pos logit contributions
 studies+0.016
 university+0.015
 studying+0.014
ifting+0.014
old+0.014
Neg logit contributions
Br-0.017
 elbows-0.014
fl-0.014
ety-0.014
mar-0.013
Token.
Token ID13
Feature activation+1.26e-03

Pos logit contributions
Br+0.009
fl+0.008
 elbows+0.008
 fart+0.008
 forth+0.008
Neg logit contributions
 studies-0.011
ifting-0.010
 university-0.010
 studying-0.010
old-0.009
Token<|endoftext|>
Token ID50256
Feature activation+1.04e-02

Pos logit contributions
 studies+0.002
 university+0.002
 studying+0.002
ifting+0.002
old+0.002
Neg logit contributions
Br-0.002
 elbows-0.002
ety-0.001
fl-0.001
icles-0.001
TokenOnce
Token ID7454
Feature activation-9.13e-04

Pos logit contributions
 studies+0.015
 university+0.014
 studying+0.014
old+0.013
ifting+0.013
Neg logit contributions
Br-0.017
 elbows-0.015
fl-0.014
mar-0.014
icles-0.014
Token upon
Token ID2402
Feature activation+4.20e-03

Pos logit contributions
Br+0.001
ety+0.001
 elbows+0.001
mar+0.001
icles+0.001
Neg logit contributions
 studies-0.002
 university-0.001
 studying-0.001
ifting-0.001
 amaz-0.001
Token a
Token ID257
Feature activation-5.22e-03

Pos logit contributions
 studies+0.007
ience+0.006
 studying+0.006
 university+0.006
ifting+0.005
Neg logit contributions
Br-0.007
 elbows-0.006
cr-0.006
mar-0.006
Tim-0.005
Token time
Token ID640
Feature activation+1.02e-02

Pos logit contributions
Br+0.006
 elbows+0.005
mar+0.005
fl+0.005
 fart+0.005
Neg logit contributions
 studies-0.010
 university-0.009
 studying-0.009
old-0.009
ifting-0.009
Token there
Token ID612
Feature activation-2.05e-02

Pos logit contributions
 studies+0.016
 university+0.013
old+0.013
 studying+0.013
ience+0.013
Neg logit contributions
Br-0.017
mar-0.014
fl-0.014
 elbows-0.014
ety-0.013
Token was
Token ID373
Feature activation+7.25e-03

Pos logit contributions
Br+0.035
 elbows+0.030
fl+0.028
mar+0.028
 fart+0.028
Neg logit contributions
 studies-0.030
 university-0.027
 studying-0.026
old-0.026
ifting-0.025
Token a
Token ID257
Feature activation+1.04e-02

Pos logit contributions
 studies+0.009
 university+0.007
 studying+0.007
old+0.007
ifting+0.007
Neg logit contributions
Br-0.015
 elbows-0.013
fl-0.013
ety-0.012
mar-0.012
Token large
Token ID1588
Feature activation+1.87e-02

Pos logit contributions
 studies+0.014
old+0.012
ifting+0.012
 studying+0.011
 university+0.011
Neg logit contributions
Br-0.020
ety-0.016
fl-0.016
 elbows-0.016
 fart-0.016
Token patch
Token ID8529
Feature activation+4.20e-03

Pos logit contributions
 studies+0.033
ver+0.032
old+0.032
ifting+0.032
 amaz+0.029
Neg logit contributions
Br-0.022
 forth-0.020
fl-0.019
gie-0.019
ety-0.018
Token.
Token ID13
Feature activation-6.28e-04

Pos logit contributions
 studies+0.023
ifting+0.022
 university+0.021
 amaz+0.021
old+0.020
Neg logit contributions
Br-0.015
mar-0.013
fl-0.013
icles-0.013
cr-0.012
Token She
Token ID1375
Feature activation+1.95e-02

Pos logit contributions
Br+0.001
 elbows+0.001
fl+0.001
mar+0.001
ety+0.001
Neg logit contributions
 studies-0.001
 university-0.001
 studying-0.001
ifting-0.001
old-0.001
Token says
Token ID1139
Feature activation+2.79e-02

Pos logit contributions
igsaw+0.040
ended+0.037
ifting+0.036
year+0.036
 mailing+0.035
Neg logit contributions
fl-0.018
 scar-0.017
 bisc-0.017
Br-0.016
 dri-0.016
Token she
Token ID673
Feature activation+1.49e-02

Pos logit contributions
 studies+0.046
old+0.039
igsaw+0.038
ifting+0.038
 amaz+0.038
Neg logit contributions
Br-0.043
icles-0.036
fl-0.035
mar-0.034
 P-0.033
Token hopes
Token ID7176
Feature activation+1.38e-02

Pos logit contributions
igsaw+0.034
ifting+0.030
 university+0.030
 amaz+0.029
 impressed+0.028
Neg logit contributions
fl-0.012
w-0.010
mar-0.010
 will-0.010
Br-0.009
Token he
Token ID339
Feature activation+2.01e-02

Pos logit contributions
 studies+0.022
ifting+0.019
 university+0.019
 amaz+0.019
old+0.018
Neg logit contributions
Br-0.021
 elbows-0.018
icles-0.018
fl-0.017
mar-0.017
Token is
Token ID318
Feature activation-1.17e-02

Pos logit contributions
igsaw+0.054
ifting+0.049
 amaz+0.048
 university+0.048
 impressed+0.046
Neg logit contributions
fl-0.009
mar-0.008
w-0.006
ny-0.006
Br-0.006
Token healthy
Token ID5448
Feature activation-2.20e-03

Pos logit contributions
Br+0.018
 elbows+0.015
 forth+0.015
mar+0.015
 fart+0.015
Neg logit contributions
 studies-0.018
old-0.017
ifting-0.016
igsaw-0.016
ival-0.016
Token soon
Token ID2582
Feature activation+6.99e-03

Pos logit contributions
Br+0.004
fl+0.003
 elbows+0.003
mar+0.003
 fart+0.003
Neg logit contributions
 studies-0.003
 university-0.003
ifting-0.003
old-0.003
 studying-0.002
Token.
Token ID13
Feature activation-2.67e-03

Pos logit contributions
 studies+0.014
ifting+0.013
old+0.013
 university+0.013
 amaz+0.012
Neg logit contributions
Br-0.007
fl-0.006
mar-0.006
 elbows-0.006
 forth-0.006
Token<|endoftext|>
Token ID50256
Feature activation+1.68e-02

Pos logit contributions
Br+0.005
 elbows+0.004
fl+0.004
mar+0.004
ety+0.004
Neg logit contributions
 studies-0.004
 university-0.003
 studying-0.003
ifting-0.003
old-0.003
Token was
Token ID373
Feature activation+1.02e-02

Pos logit contributions
Br+0.024
 elbows+0.021
fl+0.020
mar+0.020
 fart+0.020
Neg logit contributions
 studies-0.020
 university-0.018
 studying-0.018
old-0.017
ifting-0.017
Token a
Token ID257
Feature activation+1.76e-02

Pos logit contributions
 studies+0.012
 university+0.010
 studying+0.010
old+0.009
ifting+0.009
Neg logit contributions
Br-0.021
 elbows-0.018
fl-0.018
 fart-0.018
icles-0.017
Token little
Token ID1310
Feature activation+4.19e-03

Pos logit contributions
 studies+0.025
old+0.022
ifting+0.022
 studying+0.021
ver+0.021
Neg logit contributions
Br-0.031
ety-0.025
 fart-0.025
Fin-0.025
fl-0.024
Token girl
Token ID2576
Feature activation-4.51e-03

Pos logit contributions
 studies+0.008
 university+0.007
 studying+0.007
old+0.007
ifting+0.007
Neg logit contributions
Br-0.006
 elbows-0.005
mar-0.004
fl-0.004
icles-0.004
Token named
Token ID3706
Feature activation-1.88e-02

Pos logit contributions
Br+0.009
 elbows+0.008
fl+0.007
mar+0.007
ety+0.007
Neg logit contributions
 studies-0.006
 university-0.005
 studying-0.005
ifting-0.005
old-0.005
Token Lily
Token ID20037
Feature activation+3.74e-03

Pos logit contributions
Br+0.028
 elbows+0.026
fl+0.023
mar+0.023
icles+0.023
Neg logit contributions
 studies-0.030
 university-0.030
 studying-0.029
 reading-0.027
old-0.027
Token.
Token ID13
Feature activation-4.69e-03

Pos logit contributions
 studies+0.006
 university+0.005
 studying+0.005
old+0.005
ifting+0.005
Neg logit contributions
Br-0.006
 elbows-0.005
 fart-0.005
icles-0.005
fl-0.005
Token Lily
Token ID20037
Feature activation+1.67e-02

Pos logit contributions
Br+0.010
 elbows+0.009
igi+0.009
 fart+0.009
mar+0.008
Neg logit contributions
 studies-0.005
 university-0.004
 studying-0.004
 reading-0.004
ifting-0.004
Token loved
Token ID6151
Feature activation+7.96e-03

Pos logit contributions
 studies+0.018
 university+0.016
 studying+0.016
old+0.015
ifting+0.015
Neg logit contributions
Br-0.035
 elbows-0.030
fl-0.029
mar-0.029
icles-0.029
Token her
Token ID607
Feature activation+4.66e-03

Pos logit contributions
 studies+0.013
 university+0.012
old+0.012
ifting+0.012
 studying+0.012
Neg logit contributions
Br-0.012
 elbows-0.010
fl-0.010
mar-0.009
icles-0.009
Token blanket
Token ID18447
Feature activation+5.99e-03

Pos logit contributions
 studies+0.008
old+0.008
ifting+0.008
ver+0.007
 university+0.007
Neg logit contributions
Br-0.006
fl-0.006
 elbows-0.005
mar-0.005
ety-0.005
Token Lily
Token ID20037
Feature activation-2.87e-03

Pos logit contributions
Br+0.003
 elbows+0.002
fl+0.002
mar+0.002
icles+0.002
Neg logit contributions
 studies-0.002
 university-0.002
 studying-0.002
old-0.002
ifting-0.002
Token saw
Token ID2497
Feature activation+9.13e-03

Pos logit contributions
Br+0.003
 fart+0.003
mar+0.003
 elbows+0.003
fl+0.003
Neg logit contributions
 university-0.006
igsaw-0.005
 studies-0.005
ifting-0.005
 mailing-0.005
Token him
Token ID683
Feature activation+9.15e-03

Pos logit contributions
 studies+0.014
 university+0.012
ver+0.012
ifting+0.012
ience+0.012
Neg logit contributions
Br-0.014
 elbows-0.013
icles-0.013
fl-0.012
ety-0.012
Token and
Token ID290
Feature activation+2.29e-03

Pos logit contributions
 studies+0.014
old+0.014
 university+0.013
ifting+0.013
igsaw+0.013
Neg logit contributions
 forth-0.013
 elbows-0.012
 fart-0.012
Br-0.011
mar-0.011
Token said
Token ID531
Feature activation+1.66e-02

Pos logit contributions
 studies+0.004
 university+0.003
ifting+0.003
 studying+0.003
old+0.003
Neg logit contributions
Br-0.004
 elbows-0.003
fl-0.003
ety-0.003
mar-0.003
Token,
Token ID11
Feature activation-4.89e-03

Pos logit contributions
 studies+0.024
 university+0.021
old+0.020
 studying+0.020
 amaz+0.020
Neg logit contributions
Br-0.028
mar-0.024
 elbows-0.024
fl-0.024
icles-0.023
Token "
Token ID366
Feature activation-6.74e-03

Pos logit contributions
Br+0.007
fl+0.006
gie+0.006
 elbows+0.006
icles+0.006
Neg logit contributions
 studies-0.008
old-0.008
 university-0.007
ifting-0.007
igsaw-0.007
TokenNo
Token ID2949
Feature activation-1.19e-02

Pos logit contributions
row+0.010
igi+0.010
rim+0.009
icles+0.009
uggets+0.009
Neg logit contributions
 Many-0.009
 studies-0.009
 arrest-0.009
 university-0.009
-0.008
Token,
Token ID11
Feature activation-2.27e-03

Pos logit contributions
Br+0.028
 elbows+0.025
mar+0.025
igi+0.024
fl+0.024
Neg logit contributions
 studies-0.010
 university-0.010
 studying-0.009
old-0.008
 reading-0.008
Token Tom
Token ID4186
Feature activation+5.69e-03

Pos logit contributions
Br+0.003
 elbows+0.003
 fart+0.003
icles+0.003
 forth+0.003
Neg logit contributions
 studies-0.004
ifting-0.004
old-0.004
 university-0.003
 studying-0.003
Token,
Token ID11
Feature activation-6.92e-04

Pos logit contributions
 studies+0.007
 university+0.007
 studying+0.006
old+0.006
ifting+0.006
Neg logit contributions
Br-0.011
 elbows-0.009
fl-0.009
mar-0.009
icles-0.009
Token It
Token ID632
Feature activation-6.96e-03

Pos logit contributions
 studies+0.015
 studying+0.013
 university+0.013
old+0.013
ifting+0.013
Neg logit contributions
Br-0.020
 elbows-0.017
icles-0.016
 fart-0.016
ety-0.016
Token came
Token ID1625
Feature activation-4.30e-04

Pos logit contributions
Br+0.012
icles+0.011
 elbows+0.011
igi+0.010
fl+0.010
Neg logit contributions
 studies-0.009
 university-0.008
 studying-0.007
old-0.007
ifting-0.007
Token closer
Token ID5699
Feature activation-8.10e-03

Pos logit contributions
Br+0.001
 elbows+0.001
mar+0.001
 bisc+0.001
icles+0.001
Neg logit contributions
 studies-0.000
 university-0.000
ver-0.000
 studying-0.000
 young-0.000
Token and
Token ID290
Feature activation+1.49e-02

Pos logit contributions
Br+0.015
 elbows+0.013
fl+0.013
mar+0.013
 fart+0.013
Neg logit contributions
 studies-0.010
 university-0.009
 studying-0.009
old-0.009
ifting-0.009
Token closer
Token ID5699
Feature activation-1.76e-02

Pos logit contributions
 studies+0.023
 university+0.021
 studying+0.020
ifting+0.019
old+0.019
Neg logit contributions
Br-0.024
 elbows-0.021
icles-0.019
mar-0.019
fl-0.019
Token,
Token ID11
Feature activation-7.51e-03

Pos logit contributions
Br+0.033
 elbows+0.028
 fart+0.027
icles+0.026
mar+0.026
Neg logit contributions
 studies-0.023
 university-0.021
 studying-0.021
old-0.020
ifting-0.019
Token until
Token ID1566
Feature activation+8.40e-04

Pos logit contributions
Br+0.011
 elbows+0.010
fl+0.009
 fart+0.009
 forth+0.009
Neg logit contributions
 studies-0.013
old-0.012
 university-0.011
ifting-0.011
igsaw-0.011
Token it
Token ID340
Feature activation-2.05e-02

Pos logit contributions
 studies+0.001
 university+0.001
 studying+0.001
old+0.001
ifting+0.001
Neg logit contributions
Br-0.001
 elbows-0.001
fl-0.001
mar-0.001
icles-0.001
Token was
Token ID373
Feature activation-3.42e-02

Pos logit contributions
Br+0.038
 elbows+0.035
icles+0.034
fl+0.033
mar+0.032
Neg logit contributions
 studies-0.024
 university-0.020
ifting-0.020
ver-0.020
 studying-0.019
Token right
Token ID826
Feature activation-1.62e-02

Pos logit contributions
Br+0.071
fl+0.067
 elbows+0.061
amboo+0.059
icles+0.059
Neg logit contributions
 university-0.032
 studying-0.031
 glad-0.031
 reading-0.031
 young-0.030
Token in
Token ID287
Feature activation+3.26e-03

Pos logit contributions
Br+0.037
 elbows+0.029
Fin+0.029
Carl+0.028
 fart+0.028
Neg logit contributions
 studies-0.016
 university-0.016
 studying-0.016
 reading-0.015
 young-0.014
Token hugged
Token ID41130
Feature activation-1.93e-02

Pos logit contributions
Br+0.022
 elbows+0.018
mar+0.017
ety+0.017
fl+0.017
Neg logit contributions
 studies-0.025
 university-0.023
 studying-0.022
ifting-0.021
old-0.021
Token her
Token ID607
Feature activation+4.28e-03

Pos logit contributions
Br+0.038
cr+0.033
mar+0.033
fl+0.033
 fart+0.032
Neg logit contributions
 studies-0.021
 university-0.021
old-0.020
 studying-0.019
 young-0.018
Token and
Token ID290
Feature activation-6.25e-03

Pos logit contributions
 studies+0.006
ifting+0.006
old+0.005
 university+0.005
 studying+0.005
Neg logit contributions
Br-0.007
 elbows-0.007
fl-0.006
 forth-0.006
 fart-0.006
Token said
Token ID531
Feature activation+1.11e-02

Pos logit contributions
Br+0.008
 elbows+0.007
fl+0.006
mar+0.006
icles+0.006
Neg logit contributions
 studies-0.011
 university-0.011
 studying-0.010
old-0.010
ifting-0.010
Token,
Token ID11
Feature activation-8.50e-03

Pos logit contributions
 studies+0.017
 university+0.015
 amaz+0.014
 studying+0.014
old+0.014
Neg logit contributions
Br-0.019
 elbows-0.016
mar-0.016
fl-0.016
icles-0.015
Token "
Token ID366
Feature activation-4.89e-03

Pos logit contributions
Br+0.016
 elbows+0.013
fl+0.013
mar+0.013
icles+0.013
Neg logit contributions
 studies-0.012
old-0.010
 university-0.010
 amaz-0.010
ifting-0.010
TokenDon
Token ID3987
Feature activation+2.27e-03

Pos logit contributions
icles+0.006
Br+0.006
 elbows+0.006
suit+0.006
cr+0.006
Neg logit contributions
 studies-0.007
old-0.007
 university-0.007
ifting-0.007
 young-0.007
Token't
Token ID470
Feature activation-2.01e-02

Pos logit contributions
 studies+0.003
 university+0.002
 studying+0.002
old+0.002
ifting+0.002
Neg logit contributions
Br-0.004
 elbows-0.004
mar-0.004
fl-0.004
 fart-0.004
Token worry
Token ID5490
Feature activation-1.95e-02

Pos logit contributions
Br+0.040
suit+0.037
igi+0.036
icles+0.035
ksh+0.034
Neg logit contributions
 university-0.022
 studies-0.022
 studying-0.021
 arrest-0.020
 young-0.019
Token,
Token ID11
Feature activation-1.08e-02

Pos logit contributions
Br+0.045
 elbows+0.040
mar+0.038
fl+0.038
 fart+0.038
Neg logit contributions
 studies-0.017
 university-0.016
 studying-0.014
old-0.014
ifting-0.013
Token Lily
Token ID20037
Feature activation+1.15e-02

Pos logit contributions
Br+0.009
 fart+0.006
ety+0.006
icles+0.006
 forth+0.006
Neg logit contributions
 studies-0.025
old-0.023
ifting-0.023
 studying-0.023
 university-0.023
Token man
Token ID582
Feature activation+4.18e-03

Pos logit contributions
ifting+0.021
ver+0.021
 studies+0.021
old+0.020
anging+0.020
Neg logit contributions
Br-0.008
 sound-0.007
 forth-0.006
fl-0.006
 bisc-0.006
Token was
Token ID373
Feature activation-5.93e-03

Pos logit contributions
 university+0.008
 studies+0.008
igsaw+0.007
ifting+0.007
 studying+0.007
Neg logit contributions
Br-0.004
 forth-0.004
mar-0.004
cr-0.004
ety-0.004
Token so
Token ID523
Feature activation+2.75e-03

Pos logit contributions
Br+0.008
 elbows+0.007
 fart+0.006
cr+0.006
 away+0.006
Neg logit contributions
 studies-0.010
ience-0.010
 university-0.009
ifting-0.009
igsaw-0.009
Token happy
Token ID3772
Feature activation+4.95e-03

Pos logit contributions
 studies+0.004
 university+0.004
 arrest+0.004
ience+0.004
igsaw+0.004
Neg logit contributions
Br-0.004
 sound-0.003
mar-0.003
cr-0.003
 away-0.003
Token!
Token ID0
Feature activation+6.74e-03

Pos logit contributions
 studies+0.007
 university+0.006
 studying+0.006
ifting+0.006
old+0.006
Neg logit contributions
Br-0.009
 elbows-0.008
fl-0.008
mar-0.008
icles-0.008
Token He
Token ID679
Feature activation-6.75e-03

Pos logit contributions
 studies+0.012
 university+0.012
 studying+0.012
ifting+0.011
old+0.011
Neg logit contributions
Br-0.008
icles-0.006
 fart-0.006
ety-0.006
 elbows-0.006
Token waved
Token ID26834
Feature activation+2.41e-02

Pos logit contributions
Br+0.007
ety+0.006
mar+0.006
 bisc+0.006
 forth+0.006
Neg logit contributions
 university-0.013
ifting-0.012
 studies-0.012
igsaw-0.012
 studying-0.012
Token his
Token ID465
Feature activation-1.24e-02

Pos logit contributions
 studies+0.035
 university+0.031
 studying+0.031
ifting+0.031
old+0.029
Neg logit contributions
Br-0.042
 elbows-0.035
fl-0.034
icles-0.033
 forth-0.033
Token arms
Token ID5101
Feature activation+2.22e-03

Pos logit contributions
Br+0.015
 elbows+0.013
 fart+0.012
 forth+0.012
suit+0.012
Neg logit contributions
ifting-0.023
 studies-0.022
old-0.022
 university-0.022
 studying-0.021
Token and
Token ID290
Feature activation-2.02e-02

Pos logit contributions
 studies+0.003
 university+0.003
old+0.003
 studying+0.003
ifting+0.003
Neg logit contributions
Br-0.004
fl-0.003
cr-0.003
 forth-0.003
mar-0.003
Token shouted
Token ID17293
Feature activation+2.12e-02

Pos logit contributions
Br+0.034
 elbows+0.029
mar+0.028
icles+0.028
ety+0.028
Neg logit contributions
 studies-0.030
 studying-0.027
 university-0.027
 amaz-0.025
old-0.025
Token ordered
Token ID6149
Feature activation-6.87e-03

Pos logit contributions
 studies+0.010
 university+0.010
ifting+0.009
ver+0.009
 studying+0.009
Neg logit contributions
Br-0.006
ety-0.005
fl-0.005
 elbows-0.005
cr-0.005
Token!
Token ID0
Feature activation-9.17e-03

Pos logit contributions
Br+0.009
icles+0.008
 elbows+0.008
fl+0.008
mar+0.008
Neg logit contributions
 studies-0.012
 university-0.011
ifting-0.010
old-0.010
 studying-0.010
Token<|endoftext|>
Token ID50256
Feature activation+3.14e-03

Pos logit contributions
Br+0.015
icles+0.012
fl+0.011
 elbows+0.011
mar+0.011
Neg logit contributions
 studies-0.015
 university-0.013
 studying-0.013
ifting-0.013
 mailing-0.012
TokenOnce
Token ID7454
Feature activation-2.59e-03

Pos logit contributions
 studies+0.004
 university+0.004
 studying+0.004
old+0.004
ifting+0.004
Neg logit contributions
Br-0.006
 elbows-0.005
 fart-0.005
icles-0.005
mar-0.005
Token upon
Token ID2402
Feature activation-7.58e-04

Pos logit contributions
Br+0.004
 elbows+0.003
ety+0.003
mar+0.003
fl+0.003
Neg logit contributions
 studies-0.005
 university-0.004
 studying-0.004
ifting-0.004
old-0.004
Token a
Token ID257
Feature activation-3.49e-03

Pos logit contributions
Br+0.001
 elbows+0.001
cr+0.001
Tim+0.001
mar+0.001
Neg logit contributions
 studies-0.001
ience-0.001
 studying-0.001
 university-0.001
ifting-0.001
Token time
Token ID640
Feature activation+6.64e-03

Pos logit contributions
Br+0.004
 elbows+0.003
mar+0.003
ety+0.003
 fart+0.003
Neg logit contributions
 studies-0.007
 university-0.007
 studying-0.006
ifting-0.006
old-0.006
Token,
Token ID11
Feature activation-3.17e-03

Pos logit contributions
 studies+0.010
 university+0.008
old+0.008
 studying+0.008
ifting+0.008
Neg logit contributions
Br-0.011
mar-0.009
fl-0.009
 elbows-0.009
icles-0.009
Token in
Token ID287
Feature activation+6.94e-03

Pos logit contributions
Br+0.006
 elbows+0.005
mar+0.005
fl+0.005
 fart+0.005
Neg logit contributions
 studies-0.004
 university-0.004
 studying-0.004
old-0.004
ifting-0.004
Token a
Token ID257
Feature activation+6.18e-03

Pos logit contributions
 studies+0.011
old+0.010
ifting+0.010
 studying+0.009
 university+0.009
Neg logit contributions
Br-0.012
 elbows-0.009
 bisc-0.009
Wow-0.009
 fart-0.009
Token small
Token ID1402
Feature activation+8.44e-03

Pos logit contributions
 studies+0.009
ver+0.008
ifting+0.008
 studying+0.008
old+0.007
Neg logit contributions
Br-0.011
 elbows-0.009
 fart-0.009
ety-0.009
fl-0.009
TokenThese
Token ID4711
Feature activation+1.75e-02

Pos logit contributions
row+0.009
 elbows+0.009
ram+0.009
 forth+0.009
icles+0.009
Neg logit contributions
 university-0.010
ifting-0.010
 church-0.010
 studies-0.010
 amaz-0.009
Token fruits
Token ID15921
Feature activation+6.76e-03

Pos logit contributions
 studies+0.034
 university+0.031
 studying+0.031
ifting+0.031
old+0.031
Neg logit contributions
Br-0.020
fl-0.018
 elbows-0.017
ety-0.016
 fart-0.016
Token are
Token ID389
Feature activation-1.26e-02

Pos logit contributions
 studies+0.010
 university+0.010
 studying+0.009
 amaz+0.009
old+0.009
Neg logit contributions
Br-0.010
 fart-0.009
mar-0.009
icles-0.009
ety-0.009
Token y
Token ID331
Feature activation-1.94e-02

Pos logit contributions
Br+0.023
 elbows+0.020
fl+0.018
mar+0.018
icles+0.018
Neg logit contributions
 studies-0.017
 university-0.016
 studying-0.016
ifting-0.015
old-0.015
Tokenummy
Token ID13513
Feature activation-1.62e-02

Pos logit contributions
Br+0.036
 elbows+0.030
fl+0.028
mar+0.027
 fart+0.027
Neg logit contributions
 studies-0.026
ifting-0.024
old-0.023
 university-0.023
 studying-0.023
Token!"
Token ID2474
Feature activation-3.34e-02

Pos logit contributions
Br+0.024
 elbows+0.020
fl+0.020
mar+0.020
icles+0.019
Neg logit contributions
 studies-0.027
 university-0.024
 studying-0.024
old-0.023
ifting-0.023
Token L
Token ID406
Feature activation+1.45e-03

Pos logit contributions
Br+0.052
igi+0.051
bur+0.051
 fart+0.050
fl+0.048
Neg logit contributions
 studies-0.043
old-0.041
 university-0.039
 Many-0.039
 young-0.038
Tokenila
Token ID10102
Feature activation-3.92e-03

Pos logit contributions
 studies+0.002
old+0.002
ifting+0.002
 university+0.002
 studying+0.002
Neg logit contributions
Br-0.003
 elbows-0.002
fl-0.002
 fart-0.002
 forth-0.002
Token said
Token ID531
Feature activation+7.36e-03

Pos logit contributions
Br+0.006
 elbows+0.005
fl+0.004
mar+0.004
icles+0.004
Neg logit contributions
 studies-0.007
 university-0.006
 studying-0.006
old-0.006
ifting-0.006
Token,
Token ID11
Feature activation-5.20e-03

Pos logit contributions
 studies+0.010
 university+0.009
 studying+0.009
old+0.009
ifting+0.009
Neg logit contributions
Br-0.013
 elbows-0.011
 fart-0.011
fl-0.011
icles-0.011
Token biting
Token ID30791
Feature activation-2.66e-02

Pos logit contributions
Br+0.007
 elbows+0.007
 forth+0.007
cr+0.006
 fart+0.006
Neg logit contributions
 studies-0.009
old-0.008
 university-0.008
ival-0.008
 owns-0.008
Token soldier
Token ID10686
Feature activation-4.77e-03

Pos logit contributions
 studies+0.003
ifting+0.003
old+0.003
ver+0.003
 studying+0.003
Neg logit contributions
Br-0.004
 fart-0.003
 bisc-0.003
 elbows-0.003
fl-0.003
Token:
Token ID25
Feature activation+3.46e-03

Pos logit contributions
Br+0.006
fl+0.006
gie+0.006
cr+0.006
 away+0.006
Neg logit contributions
 studies-0.007
ifting-0.007
 amaz-0.006
igsaw-0.006
 university-0.006
Token "
Token ID366
Feature activation-1.77e-02

Pos logit contributions
 studies+0.006
ifting+0.005
old+0.005
ience+0.005
igsaw+0.005
Neg logit contributions
Br-0.005
 P-0.004
Look-0.004
fl-0.004
Fin-0.004
TokenCan
Token ID6090
Feature activation+1.60e-02

Pos logit contributions
Br+0.021
 elbows+0.018
fl+0.016
mar+0.016
icles+0.016
Neg logit contributions
 studies-0.034
 university-0.032
 studying-0.031
old-0.031
ifting-0.030
Token you
Token ID345
Feature activation+7.79e-03

Pos logit contributions
 studies+0.029
 amaz+0.029
old+0.027
ifting+0.027
 studying+0.027
Neg logit contributions
Br-0.020
icles-0.016
 fart-0.016
mar-0.016
ety-0.015
Token help
Token ID1037
Feature activation-3.74e-02

Pos logit contributions
 studies+0.013
ifting+0.012
old+0.012
 studying+0.012
igsaw+0.012
Neg logit contributions
Br-0.010
mar-0.009
 fart-0.009
 away-0.009
Fin-0.008
Token me
Token ID502
Feature activation+2.63e-03

Pos logit contributions
Br+0.033
 elbows+0.023
fl+0.021
mar+0.020
icles+0.020
Neg logit contributions
 studies-0.087
 university-0.082
 studying-0.080
old-0.079
ifting-0.078
Token deliver
Token ID5203
Feature activation-2.09e-02

Pos logit contributions
 studies+0.004
 amaz+0.004
year+0.004
old+0.004
ifting+0.004
Neg logit contributions
Br-0.003
 away-0.003
 fart-0.003
 forth-0.003
bur-0.003
Token this
Token ID428
Feature activation-1.12e-02

Pos logit contributions
Br+0.032
 fart+0.027
 elbows+0.027
icles+0.027
mar+0.026
Neg logit contributions
 studies-0.033
ifting-0.029
 studying-0.029
old-0.029
 university-0.029
Token package
Token ID5301
Feature activation-2.10e-02

Pos logit contributions
 fart+0.014
Br+0.013
 away+0.012
 forth+0.012
 downstream+0.012
Neg logit contributions
 studies-0.018
old-0.018
ifting-0.018
 amaz-0.017
 studying-0.017
Token?".
Token ID43634
Feature activation-1.76e-03

Pos logit contributions
Br+0.028
mar+0.026
 fart+0.026
icles+0.025
fl+0.024
Neg logit contributions
 studies-0.034
ifting-0.031
 amaz-0.031
old-0.031
 studying-0.030
Token that
Token ID326
Feature activation-9.32e-03

Pos logit contributions
 studies+0.002
 university+0.002
old+0.001
 studying+0.001
ifting+0.001
Neg logit contributions
Br-0.002
fl-0.002
mar-0.002
 forth-0.002
 elbows-0.002
Token looked
Token ID3114
Feature activation+4.25e-03

Pos logit contributions
Br+0.016
 elbows+0.014
icles+0.013
ety+0.013
igi+0.013
Neg logit contributions
 studies-0.013
 university-0.012
 studying-0.012
ifting-0.011
old-0.011
Token like
Token ID588
Feature activation+1.77e-02

Pos logit contributions
 studies+0.007
 university+0.006
 studying+0.006
old+0.006
ifting+0.006
Neg logit contributions
Br-0.007
 elbows-0.006
fl-0.006
mar-0.005
 fart-0.005
Token a
Token ID257
Feature activation-7.46e-03

Pos logit contributions
 studies+0.025
old+0.021
 university+0.021
 studying+0.020
ifting+0.020
Neg logit contributions
Br-0.032
fl-0.028
icles-0.027
 elbows-0.027
cr-0.027
Token big
Token ID1263
Feature activation+6.50e-03

Pos logit contributions
Br+0.013
 elbows+0.012
icles+0.011
mar+0.011
fl+0.011
Neg logit contributions
 studies-0.010
 university-0.010
 studying-0.009
 amaz-0.009
 reading-0.008
Token carrot
Token ID40562
Feature activation-3.58e-02

Pos logit contributions
 studies+0.009
 university+0.008
old+0.008
 studying+0.008
ifting+0.008
Neg logit contributions
Br-0.011
 elbows-0.010
fl-0.009
 fart-0.009
mar-0.009
Token.
Token ID13
Feature activation+2.23e-03

Pos logit contributions
Br+0.069
 elbows+0.058
ety+0.054
mar+0.054
icles+0.053
Neg logit contributions
 studies-0.046
 university-0.042
 studying-0.042
 reading-0.039
 mailing-0.038
Token He
Token ID679
Feature activation-3.94e-03

Pos logit contributions
 studies+0.003
 university+0.003
 studying+0.003
old+0.003
ifting+0.002
Neg logit contributions
Br-0.004
 elbows-0.004
mar-0.003
 fart-0.003
 forth-0.003
Token smiled
Token ID13541
Feature activation+1.99e-02

Pos logit contributions
Br+0.006
 elbows+0.005
mar+0.005
fl+0.005
ety+0.005
Neg logit contributions
 studies-0.007
 university-0.006
 studying-0.006
ifting-0.006
old-0.006
Token and
Token ID290
Feature activation-1.39e-02

Pos logit contributions
 studies+0.026
 university+0.022
old+0.022
 studying+0.021
ifting+0.021
Neg logit contributions
Br-0.038
fl-0.034
 elbows-0.032
ety-0.032
mar-0.032
Token thought
Token ID1807
Feature activation+5.80e-03

Pos logit contributions
Br+0.022
 elbows+0.019
icles+0.018
fl+0.018
mar+0.018
Neg logit contributions
 studies-0.022
 university-0.020
 studying-0.019
old-0.019
ifting-0.019
Token.
Token ID13
Feature activation-2.24e-03

Pos logit contributions
Br+0.044
 elbows+0.037
mar+0.035
fl+0.035
 fart+0.035
Neg logit contributions
 studies-0.041
 university-0.037
 studying-0.036
old-0.035
ifting-0.035
Token They
Token ID1119
Feature activation+4.23e-03

Pos logit contributions
Br+0.004
mar+0.004
 elbows+0.004
w+0.004
 forth+0.004
Neg logit contributions
 studies-0.003
 university-0.002
 studying-0.002
ifting-0.002
old-0.002
Token gave
Token ID2921
Feature activation-2.70e-03

Pos logit contributions
 studies+0.008
 university+0.007
ifting+0.007
 studying+0.007
igsaw+0.007
Neg logit contributions
Br-0.006
ety-0.005
cr-0.004
icles-0.004
 elbows-0.004
Token him
Token ID683
Feature activation-1.69e-03

Pos logit contributions
Br+0.005
mar+0.004
igi+0.004
 elbows+0.004
row+0.004
Neg logit contributions
 studies-0.004
 amaz-0.003
old-0.003
 studying-0.003
 university-0.003
Token a
Token ID257
Feature activation-1.97e-02

Pos logit contributions
Br+0.003
fl+0.003
icles+0.003
cr+0.003
 elbows+0.003
Neg logit contributions
 studies-0.002
 university-0.002
 studying-0.002
ifting-0.002
 mailing-0.002
Token carrot
Token ID40562
Feature activation-3.58e-02

Pos logit contributions
Br+0.027
 elbows+0.021
 fart+0.020
fl+0.020
 bisc+0.020
Neg logit contributions
 studies-0.036
 university-0.033
 studying-0.033
ifting-0.033
old-0.032
Token nose
Token ID9686
Feature activation-8.99e-03

Pos logit contributions
Br+0.052
 elbows+0.043
mar+0.040
 fart+0.038
ety+0.038
Neg logit contributions
 studies-0.062
 university-0.056
 studying-0.055
old-0.053
 reading-0.053
Token and
Token ID290
Feature activation+7.05e-04

Pos logit contributions
Br+0.018
 elbows+0.016
icles+0.016
fl+0.016
mar+0.016
Neg logit contributions
 studies-0.010
 university-0.009
 studying-0.009
ifting-0.008
old-0.008
Token two
Token ID734
Feature activation+1.68e-02

Pos logit contributions
 studies+0.001
 university+0.001
ifting+0.001
 studying+0.001
old+0.001
Neg logit contributions
Br-0.001
 elbows-0.001
 bisc-0.001
fl-0.001
icles-0.001
Token rocks
Token ID12586
Feature activation-6.57e-03

Pos logit contributions
 studies+0.028
 university+0.025
 studying+0.025
old+0.024
ifting+0.024
Neg logit contributions
Br-0.026
 elbows-0.022
fl-0.021
mar-0.020
icles-0.020
Token for
Token ID329
Feature activation-1.35e-02

Pos logit contributions
Br+0.010
 elbows+0.008
fl+0.008
icles+0.008
mar+0.008
Neg logit contributions
 studies-0.011
 university-0.010
 studying-0.010
ifting-0.009
old-0.009
TokenLook
Token ID8567
Feature activation+5.79e-04

Pos logit contributions
ram+0.021
row+0.021
uggets+0.020
igi+0.020
 elbows+0.020
Neg logit contributions
 Many-0.021
 amaz-0.020
 studies-0.019
ifting-0.019
 sells-0.019
Token at
Token ID379
Feature activation-1.21e-02

Pos logit contributions
 studies+0.000
 university+0.000
 studying+0.000
ifting+0.000
old+0.000
Neg logit contributions
Br-0.001
 elbows-0.001
fl-0.001
mar-0.001
igi-0.001
Token me
Token ID502
Feature activation+4.16e-03

Pos logit contributions
Br+0.013
 elbows+0.010
mar+0.009
fl+0.009
ety+0.009
Neg logit contributions
 studies-0.025
 university-0.024
 studying-0.023
old-0.023
ifting-0.023
Token,
Token ID11
Feature activation-1.38e-02

Pos logit contributions
 studies+0.007
 university+0.006
old+0.006
 studying+0.006
ifting+0.006
Neg logit contributions
Br-0.006
 elbows-0.005
mar-0.005
 fart-0.005
icles-0.005
Token Sue
Token ID26089
Feature activation-6.19e-03

Pos logit contributions
Br+0.013
icles+0.010
 fart+0.010
ety+0.009
fl+0.009
Neg logit contributions
 studies-0.031
 studying-0.027
ifting-0.027
old-0.027
 university-0.027
Token!"
Token ID2474
Feature activation-3.81e-02

Pos logit contributions
Br+0.010
 elbows+0.008
fl+0.008
mar+0.008
icles+0.008
Neg logit contributions
 studies-0.010
 university-0.009
 studying-0.009
old-0.009
ifting-0.008
Token he
Token ID339
Feature activation-1.56e-02

Pos logit contributions
Br+0.057
 elbows+0.046
fl+0.046
mar+0.045
 fart+0.044
Neg logit contributions
 studies-0.064
 university-0.060
 studying-0.058
old-0.058
ifting-0.056
Token shouts
Token ID34757
Feature activation+8.86e-03

Pos logit contributions
Br+0.029
 elbows+0.025
icles+0.024
mar+0.024
fl+0.024
Neg logit contributions
 studies-0.021
 university-0.019
 studying-0.018
old-0.018
ifting-0.017
Token.
Token ID13
Feature activation-2.23e-03

Pos logit contributions
 studies+0.014
 university+0.013
 studying+0.013
old+0.012
ifting+0.012
Neg logit contributions
Br-0.014
 elbows-0.012
fl-0.011
mar-0.011
icles-0.011
Token "
Token ID366
Feature activation-6.54e-03

Pos logit contributions
Br+0.005
 elbows+0.004
fl+0.004
mar+0.004
 fart+0.004
Neg logit contributions
 studies-0.002
 university-0.002
 studying-0.002
old-0.002
ifting-0.002
TokenI
Token ID40
Feature activation-5.06e-03

Pos logit contributions
Br+0.009
 elbows+0.007
 fart+0.007
fl+0.007
mar+0.006
Neg logit contributions
 studies-0.012
 university-0.011
 studying-0.011
old-0.011
ifting-0.011