TOP ACTIVATIONS
MAX = 0.076

hat and grabbed her magn ifying glass . She was ready
He pointed his magn ifying glass at different classmates and
said , holding the magn ifying glass over a black ant
, Lilly put on her nic est dress and said goodbye
He brought his magn ifying glass and looked closely .
Inspired , Emily asked her pap a , " How can
When Daisy put the magn ifying glass down , the cater
examine it with the magn ifying glass ?" "
and handed him the magn ifying glass . He
She takes out a magn ifying glass . She looks at
reached for her toy te ap ot , but it was
tea from her toy te ap ot and they all had
borrowed her mom 's magn ifying glass to take a closer
day , Emma asked her pap a if they could go
the necklace and the magn ifying glass and told him that
the necklace and the magn ifying glass and started to examine
Maybe . Let 's cl ap and see if it hops
cried , dropping the magn ifying glass . "
at him with a mis chie vous grin . She said
it and started to cl ap her hands .

MIN ACTIVATIONS
MIN = -0.085

It became the happiest books he lf in the world !
I don 't like rac co ons . They are mean
went on a scary roller co aster ride . Tim my
didn 't know what " organ ize " meant , but
Lily loved to ride roller co asters and always remembered how
. And we love Roc co . He is our new
." R oc co nodded and said , "
and not look for rac co ons anymore . <|endoftext|> L
lesson and never pretended with tr ays again . <|endoftext|> Billy
spotted a row of books hel ves in her grandparents '
friends . Bye , Roc co . Bye , man ."
Sue , help me un plug the alarm and then we
Mom . I wanted more h angers for my dolls .
was not happy and un plug ged the iron . She
followed him and they all h uddled together , feeling safe
." They both h oot , h oot ,
� � € � � co o â € , it
. <|endoftext|> Jimmy was 3 - years - old and he
. Everyone ceased working and h uddled up for warmth .
. But those are not tr unks . Those are my

INTERVAL 0.036 - 0.076
CONTAINS 0.223%

. She put on her best dress and put on her
but then the bicycle started making strange noises and it stopped
street and the man would always give the bird a piece
treats on the table , â € but the little girl
dress , and she was always delighted when she put it

INTERVAL -0.005 - 0.036
CONTAINS 66.510%

. The boy had a warm feeling in his
mom my said , " Not today , sweet ie .
. <|endoftext|> Once upon a time there was a calm missile
to sleep . Playing is good , but sleeping is good
out and play with it . One day ,

INTERVAL -0.045 - -0.005
CONTAINS 33.224%

bur row and found some leaves . He piled them up
. He knew from now on there 's nothing to be
to do anything . I just want the rain to stop
, I have some news for you ," Mom said .
bored and decided to do something naughty . Her bad behaviour

INTERVAL -0.085 - -0.045
CONTAINS 0.043%

got off the couch and tr ud ged out of the
mom , " What is behind the fence ?"
Annie was never anxious about z ipping up her shirt again
were two friends , Ab ig ail and Mat ty .
str ayed too far from home again . <|endoftext|> Once upon
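
The four interval boundaries above are consistent with splitting the range from MIN (-0.085) to MAX (0.076) into four equal-width bins of about 0.040 each, and the four CONTAINS percentages sum to 100%. The dashboard's own binning code is not shown, so the following is only a minimal sketch under that equal-width assumption; `interval_shares` is a hypothetical helper name, and the input is assumed to be a non-empty list of activation values.

```python
def interval_shares(activations, n_bins=4):
    """Split [min, max] into n_bins equal-width intervals.

    Returns a list of (low, high, percent_of_values) tuples,
    highest interval first, mirroring the order of the listing above.
    Assumes `activations` is non-empty.
    """
    lo, hi = min(activations), max(activations)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for a in activations:
        if width > 0:
            # Clamp so the maximum value lands in the last bin.
            idx = min(int((a - lo) / width), n_bins - 1)
        else:
            # All values identical: everything goes into the first bin.
            idx = 0
        counts[idx] += 1
    total = len(activations)
    rows = []
    for i in reversed(range(n_bins)):
        low = lo + i * width
        high = lo + (i + 1) * width
        rows.append((low, high, 100.0 * counts[i] / total))
    return rows


if __name__ == "__main__":
    # Illustrative values only, not taken from the dashboard.
    sample = [-0.085, -0.01, 0.0, 0.01, 0.076]
    for low, high, pct in interval_shares(sample):
        print(f"INTERVAL {low:.3f} - {high:.3f}  CONTAINS {pct:.3f}%")
```

Under this assumption the bin edges depend only on the extreme activations, which would explain why the outer intervals hold well under 1% of tokens while the two middle intervals hold nearly all of them.
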
Token hat
Token ID6877
Feature activation+1.73e-02

Pos logit contributions
 fart+0.030
rated+0.027
 ju+0.027
 lo+0.026
 funn+0.026
Neg logit contributions
 arrest-0.024
 Tom-0.024
 lawyer-0.023
 school-0.023
 that-0.022
Token and
Token ID290
Feature activation-4.18e-03

Pos logit contributions
 fart+0.048
rated+0.046
 ju+0.045
 funn+0.045
 lo+0.045
Neg logit contributions
 Tom-0.008
 arrest-0.007
 that-0.007
 lawyer-0.007
 school-0.007
Token grabbed
Token ID13176
Feature activation-9.33e-03

Pos logit contributions
 Tom+0.005
 arrest+0.005
 school+0.004
 that+0.004
 Ann+0.004
Neg logit contributions
 fart-0.009
rated-0.009
 ju-0.008
 funn-0.008
 lo-0.008
Token her
Token ID607
Feature activation+1.31e-02

Pos logit contributions
 arrest+0.007
 Tom+0.007
 lawyer+0.007
 school+0.007
 elev+0.006
Neg logit contributions
 fart-0.023
 ju-0.021
rated-0.021
 lo-0.021
 funn-0.021
Token magn
Token ID7842
Feature activation+1.18e-02

Pos logit contributions
 fart+0.026
rated+0.024
 ju+0.024
 funn+0.023
 lo+0.023
Neg logit contributions
 Tom-0.016
 arrest-0.016
 lawyer-0.015
 school-0.015
 that-0.015
Tokenifying
Token ID4035
Feature activation+7.58e-02

Pos logit contributions
 fart+0.009
rated+0.008
 funn+0.007
 ju+0.007
 lo+0.006
Neg logit contributions
 arrest-0.029
 Tom-0.029
 that-0.028
 lawyer-0.028
 school-0.028
Token glass
Token ID5405
Feature activation+1.73e-02

Pos logit contributions
rated+0.153
 fart+0.143
nam+0.132
 funn+0.132
 lo+0.128
Neg logit contributions
 Tom-0.107
 that-0.100
 arrest-0.100
 school-0.099
 church-0.097
Token.
Token ID13
Feature activation+5.40e-04

Pos logit contributions
 fart+0.044
 ju+0.041
rated+0.041
 lo+0.041
 funn+0.040
Neg logit contributions
 arrest-0.012
 Tom-0.011
 lawyer-0.011
 school-0.010
 elev-0.010
Token She
Token ID1375
Feature activation-7.95e-04

Pos logit contributions
 fart+0.001
 ju+0.001
 lo+0.001
 funn+0.001
rated+0.001
Neg logit contributions
 arrest-0.001
 lawyer-0.001
 elev-0.000
 school-0.000
 church-0.000
Token was
Token ID373
Feature activation-2.81e-04

Pos logit contributions
 Tom+0.001
 arrest+0.001
 lawyer+0.001
 school+0.001
 that+0.001
Neg logit contributions
 fart-0.001
rated-0.001
 ju-0.001
 funn-0.001
 lo-0.001
Token ready
Token ID3492
Feature activation+2.55e-02

Pos logit contributions
 Tom+0.000
 that+0.000
 school+0.000
 studying+0.000
 arrest+0.000
Neg logit contributions
 fart-0.001
rated-0.001
 ju-0.000
 lo-0.000
 funn-0.000
Token\n
Token ID198
Feature activation+1.92e-03

Pos logit contributions
 fart+0.011
 lo+0.011
 funn+0.010
rated+0.010
 ju+0.010
Neg logit contributions
 arrest-0.008
 studying-0.008
 elev-0.007
 school-0.007
 lawyer-0.007
TokenHe
Token ID1544
Feature activation+9.10e-03

Pos logit contributions
 fart+0.004
 lo+0.004
 funn+0.004
rated+0.004
 ju+0.004
Neg logit contributions
 arrest-0.002
 school-0.002
 Tom-0.002
 studying-0.002
 church-0.002
Token pointed
Token ID6235
Feature activation+1.45e-02

Pos logit contributions
 fart+0.016
rated+0.015
 ju+0.015
 lo+0.014
 funn+0.014
Neg logit contributions
 arrest-0.013
 Tom-0.013
 lawyer-0.013
 school-0.012
 elev-0.012
Token his
Token ID465
Feature activation+2.51e-03

Pos logit contributions
 fart+0.035
rated+0.033
 ju+0.033
 lo+0.032
 funn+0.032
Neg logit contributions
 arrest-0.011
 Tom-0.011
 lawyer-0.011
 school-0.010
 elev-0.010
Token magn
Token ID7842
Feature activation+8.41e-03

Pos logit contributions
 fart+0.004
rated+0.004
 ju+0.004
 funn+0.004
 lo+0.004
Neg logit contributions
 Tom-0.004
 arrest-0.004
 lawyer-0.004
 school-0.004
 that-0.004
Tokenifying
Token ID4035
Feature activation+7.27e-02

Pos logit contributions
 fart+0.006
rated+0.006
 funn+0.005
 ju+0.005
 lo+0.005
Neg logit contributions
 arrest-0.021
 Tom-0.020
 that-0.020
issa-0.020
 lawyer-0.020
Token glass
Token ID5405
Feature activation+1.54e-02

Pos logit contributions
rated+0.146
 fart+0.136
 funn+0.124
nam+0.122
 ju+0.121
Neg logit contributions
 Tom-0.103
 school-0.098
 arrest-0.097
 that-0.096
 church-0.096
Token at
Token ID379
Feature activation-7.93e-03

Pos logit contributions
 fart+0.036
rated+0.033
 ju+0.033
 lo+0.033
 funn+0.033
Neg logit contributions
 arrest-0.013
 Tom-0.013
 lawyer-0.013
 school-0.012
 elev-0.012
Token different
Token ID1180
Feature activation+5.85e-03

Pos logit contributions
 arrest+0.009
 lawyer+0.009
azy+0.008
 elev+0.008
 studying+0.008
Neg logit contributions
 fart-0.016
 lo-0.015
rated-0.014
 funn-0.014
 ju-0.014
Token classmates
Token ID28999
Feature activation+2.27e-03

Pos logit contributions
 fart+0.010
rated+0.009
 ju+0.009
 lo+0.009
 funn+0.009
Neg logit contributions
 arrest-0.009
 Tom-0.009
 lawyer-0.008
 school-0.008
 that-0.008
Token and
Token ID290
Feature activation+1.41e-02

Pos logit contributions
 fart+0.005
 lo+0.005
 funn+0.005
 ju+0.005
rated+0.004
Neg logit contributions
 arrest-0.002
 elev-0.002
 lawyer-0.002
azy-0.002
 Karen-0.002
Token said
Token ID531
Feature activation-1.09e-02

Pos logit contributions
 fart+0.036
rated+0.034
 ju+0.033
 lo+0.032
 funn+0.031
Neg logit contributions
 Tom-0.024
 that-0.021
 arrest-0.021
 Ann-0.021
 school-0.021
Token,
Token ID11
Feature activation-1.27e-03

Pos logit contributions
 arrest+0.007
 lawyer+0.006
 elev+0.006
 school+0.006
 Tom+0.006
Neg logit contributions
 fart-0.028
 lo-0.027
rated-0.026
 ju-0.026
 funn-0.026
Token holding
Token ID4769
Feature activation+9.36e-03

Pos logit contributions
 arrest+0.001
 Tom+0.001
 lawyer+0.001
 school+0.001
 elev+0.001
Neg logit contributions
 fart-0.003
rated-0.003
 ju-0.003
 lo-0.003
 funn-0.003
Token the
Token ID262
Feature activation+2.40e-03

Pos logit contributions
 fart+0.017
 lo+0.016
 ju+0.016
 funn+0.016
rated+0.015
Neg logit contributions
 arrest-0.013
 lawyer-0.012
 school-0.012
 church-0.012
 studying-0.012
Token magn
Token ID7842
Feature activation+2.76e-03

Pos logit contributions
 fart+0.004
rated+0.004
 ju+0.004
 lo+0.004
 funn+0.004
Neg logit contributions
 arrest-0.003
 Tom-0.003
 lawyer-0.003
 school-0.003
 that-0.003
Tokenifying
Token ID4035
Feature activation+7.22e-02

Pos logit contributions
 fart+0.002
rated+0.001
 ju+0.001
 funn+0.001
 lo+0.001
Neg logit contributions
 arrest-0.007
 Tom-0.007
 lawyer-0.007
 school-0.007
 that-0.007
Token glass
Token ID5405
Feature activation+1.54e-02

Pos logit contributions
rated+0.143
 fart+0.132
 funn+0.123
 lo+0.121
nam+0.120
Neg logit contributions
 Tom-0.103
 arrest-0.100
 that-0.099
 church-0.098
 school-0.098
Token over
Token ID625
Feature activation-3.02e-03

Pos logit contributions
 fart+0.039
rated+0.037
 ju+0.037
 lo+0.036
 funn+0.036
Neg logit contributions
 arrest-0.010
 Tom-0.010
 lawyer-0.009
 school-0.009
 elev-0.009
Token a
Token ID257
Feature activation+1.22e-02

Pos logit contributions
 lawyer+0.004
issa+0.004
 arrest+0.004
azy+0.004
 studying+0.004
Neg logit contributions
 lo-0.006
 fart-0.005
 tighter-0.005
 wider-0.005
 ju-0.005
Token black
Token ID2042
Feature activation-9.34e-04

Pos logit contributions
 fart+0.023
rated+0.022
 funn+0.021
 lo+0.021
 ju+0.021
Neg logit contributions
 lawyer-0.016
 Tom-0.016
 school-0.015
 arrest-0.015
 that-0.015
Token ant
Token ID1885
Feature activation+1.90e-02

Pos logit contributions
 arrest+0.001
 Tom+0.001
 elev+0.001
 studying+0.001
issa+0.001
Neg logit contributions
 fart-0.002
rated-0.001
 lo-0.001
 ju-0.001
 funn-0.001
Token,
Token ID11
Feature activation+6.24e-03

Pos logit contributions
 fart+0.003
 lo+0.002
 ju+0.002
rated+0.002
 funn+0.002
Neg logit contributions
 arrest-0.001
azy-0.001
 elev-0.001
itt-0.001
issa-0.001
Token Lilly
Token ID35134
Feature activation-9.01e-03

Pos logit contributions
 fart+0.014
 ju+0.013
 lo+0.013
rated+0.013
 funn+0.012
Neg logit contributions
 arrest-0.006
 Tom-0.006
 lawyer-0.006
 elev-0.006
 school-0.006
Token put
Token ID1234
Feature activation-4.70e-03

Pos logit contributions
 arrest+0.014
 elev+0.014
 Tom+0.013
 lawyer+0.013
 school+0.013
Neg logit contributions
 fart-0.015
rated-0.014
 lo-0.014
 ju-0.013
 funn-0.013
Token on
Token ID319
Feature activation+1.41e-02

Pos logit contributions
 arrest+0.006
 lawyer+0.005
 Tom+0.005
 elev+0.005
 school+0.005
Neg logit contributions
 fart-0.009
 lo-0.009
 ju-0.009
rated-0.009
 funn-0.008
Token her
Token ID607
Feature activation+3.73e-03

Pos logit contributions
 fart+0.028
 lo+0.028
nam+0.026
 ju+0.026
 funn+0.026
Neg logit contributions
azy-0.015
 lawyer-0.015
makers-0.014
 arrest-0.014
ibrarian-0.013
Token nic
Token ID9200
Feature activation+7.19e-02

Pos logit contributions
 fart+0.007
rated+0.007
 ju+0.006
 funn+0.006
 lo+0.006
Neg logit contributions
 Tom-0.005
 arrest-0.005
 school-0.005
 lawyer-0.005
 that-0.005
Tokenest
Token ID395
Feature activation+1.61e-02

Pos logit contributions
 fart+0.113
 ju+0.104
 funn+0.097
 lo+0.094
nam+0.094
Neg logit contributions
 Tom-0.123
 arrest-0.121
 that-0.118
 school-0.117
 lawyer-0.113
Token dress
Token ID6576
Feature activation-1.02e-02

Pos logit contributions
 fart+0.031
rated+0.028
 ju+0.028
 lo+0.028
 funn+0.028
Neg logit contributions
 arrest-0.021
 Tom-0.021
 lawyer-0.020
 school-0.020
 that-0.019
Token and
Token ID290
Feature activation+5.26e-03

Pos logit contributions
 arrest+0.008
 elev+0.008
azy+0.008
 Tom+0.008
 lawyer+0.007
Neg logit contributions
 fart-0.024
 lo-0.023
rated-0.022
 ju-0.022
 funn-0.021
Token said
Token ID531
Feature activation+3.26e-04

Pos logit contributions
 fart+0.011
rated+0.010
 ju+0.010
 funn+0.010
 lo+0.009
Neg logit contributions
 Tom-0.006
 school-0.006
 that-0.006
 arrest-0.006
 grandma-0.005
Token goodbye
Token ID24829
Feature activation+1.53e-02

Pos logit contributions
 fart+0.001
rated+0.001
 funn+0.001
 ju+0.001
 lo+0.001
Neg logit contributions
 Tom-0.000
 that-0.000
 lawyer-0.000
 school-0.000
 arrest-0.000
Token\n
Token ID198
Feature activation+5.81e-03

Pos logit contributions
 fart+0.024
 lo+0.022
rated+0.022
 ju+0.022
 funn+0.022
Neg logit contributions
 arrest-0.014
 lawyer-0.013
 elev-0.013
 studying-0.012
 school-0.012
TokenHe
Token ID1544
Feature activation+1.06e-03

Pos logit contributions
 fart+0.010
rated+0.010
 ju+0.010
 funn+0.009
 lo+0.009
Neg logit contributions
 Tom-0.008
 arrest-0.008
 lawyer-0.008
Tom-0.008
 school-0.008
Token brought
Token ID3181
Feature activation+1.67e-02

Pos logit contributions
 fart+0.002
 ju+0.002
rated+0.002
 funn+0.002
 lo+0.002
Neg logit contributions
 Tom-0.001
 that-0.001
 school-0.001
 building-0.001
 arrest-0.001
Token his
Token ID465
Feature activation-1.19e-03

Pos logit contributions
 fart+0.042
rated+0.041
 ju+0.039
 funn+0.039
 lo+0.039
Neg logit contributions
 Tom-0.013
 that-0.012
 arrest-0.011
 school-0.010
 Ann-0.010
Token magn
Token ID7842
Feature activation+6.98e-03

Pos logit contributions
 school+0.001
 that+0.001
 lawyer+0.001
 church+0.001
 Tom+0.001
Neg logit contributions
rated-0.002
 fart-0.002
 ju-0.002
 lo-0.002
 funn-0.002
Tokenifying
Token ID4035
Feature activation+7.14e-02

Pos logit contributions
rated+0.007
 fart+0.007
 funn+0.007
 ju+0.006
 lo+0.006
Neg logit contributions
 arrest-0.015
 that-0.015
 ever-0.014
 inside-0.014
 Tom-0.014
Token glass
Token ID5405
Feature activation+1.51e-02

Pos logit contributions
rated+0.156
 fart+0.136
 funn+0.128
nam+0.127
 lo+0.123
Neg logit contributions
 Tom-0.097
 that-0.091
 school-0.090
 arrest-0.089
 church-0.087
Token and
Token ID290
Feature activation+7.20e-03

Pos logit contributions
 fart+0.036
rated+0.035
 funn+0.034
 ju+0.033
 lo+0.033
Neg logit contributions
 Tom-0.012
 that-0.012
 arrest-0.012
 school-0.011
 lawyer-0.011
Token looked
Token ID3114
Feature activation+1.81e-02

Pos logit contributions
 fart+0.014
rated+0.013
 ju+0.012
 funn+0.012
 lo+0.012
Neg logit contributions
 Tom-0.010
 arrest-0.010
 that-0.010
 school-0.009
 building-0.009
Token closely
Token ID7173
Feature activation+2.25e-03

Pos logit contributions
 fart+0.040
rated+0.037
 ju+0.037
 lo+0.036
 funn+0.036
Neg logit contributions
 arrest-0.019
 Tom-0.019
 lawyer-0.018
 school-0.017
 elev-0.017
Token.
Token ID13
Feature activation+8.93e-03

Pos logit contributions
 fart+0.006
 lo+0.005
 ju+0.005
rated+0.005
 funn+0.005
Neg logit contributions
 arrest-0.002
 lawyer-0.001
 school-0.001
 Tom-0.001
 elev-0.001
Token Inspired
Token ID45827
Feature activation-1.03e-02

Pos logit contributions
 Tom+0.003
 arrest+0.003
 lawyer+0.003
 school+0.003
 that+0.003
Neg logit contributions
 fart-0.009
rated-0.009
 lo-0.008
 ju-0.008
 funn-0.008
Token,
Token ID11
Feature activation+5.62e-03

Pos logit contributions
 Tom+0.008
 school+0.007
 that+0.007
 arrest+0.007
 lawyer+0.007
Neg logit contributions
 fart-0.026
 ju-0.025
rated-0.025
 funn-0.024
 lo-0.024
Token Emily
Token ID17608
Feature activation-4.22e-03

Pos logit contributions
 fart+0.016
rated+0.015
 ju+0.015
 funn+0.014
 lo+0.014
Neg logit contributions
 that-0.003
 Tom-0.003
 school-0.003
 she-0.002
 arrest-0.002
Token asked
Token ID1965
Feature activation-1.40e-02

Pos logit contributions
 arrest+0.006
 Tom+0.006
 school+0.006
 lawyer+0.006
 elev+0.006
Neg logit contributions
 fart-0.007
rated-0.007
 ju-0.007
 lo-0.007
 funn-0.007
Token her
Token ID607
Feature activation-3.12e-03

Pos logit contributions
 arrest+0.013
 elev+0.011
 Tom+0.011
 lawyer+0.010
 school+0.010
Neg logit contributions
 fart-0.032
 lo-0.030
 ju-0.030
rated-0.030
 funn-0.029
Token pap
Token ID20461
Feature activation+6.91e-02

Pos logit contributions
 arrest+0.004
 Tom+0.004
 lawyer+0.004
 school+0.004
 elev+0.004
Neg logit contributions
 fart-0.006
 ju-0.006
rated-0.006
 lo-0.006
 funn-0.005
Tokena
Token ID64
Feature activation+4.43e-02

Pos logit contributions
 fart+0.135
rated+0.119
 ju+0.118
 lo+0.111
 funn+0.111
Neg logit contributions
 lawyer-0.094
 that-0.091
issa-0.087
azy-0.087
 Tom-0.085
Token,
Token ID11
Feature activation-3.83e-04

Pos logit contributions
 fart+0.104
rated+0.097
 ju+0.097
 lo+0.096
 funn+0.095
Neg logit contributions
 arrest-0.038
 Tom-0.037
 lawyer-0.035
 school-0.035
 elev-0.033
Token "
Token ID366
Feature activation-6.63e-04

Pos logit contributions
 arrest+0.000
 elev+0.000
 Tom+0.000
 school+0.000
 lawyer+0.000
Neg logit contributions
 fart-0.001
 ju-0.001
 lo-0.001
 funn-0.001
rated-0.001
TokenHow
Token ID2437
Feature activation+1.92e-03

Pos logit contributions
 lawyer+0.001
 Tom+0.001
 arrest+0.001
 school+0.001
 that+0.001
Neg logit contributions
 fart-0.001
rated-0.001
 funn-0.001
 ju-0.001
 lo-0.001
Token can
Token ID460
Feature activation-3.81e-03

Pos logit contributions
 fart+0.004
rated+0.004
 ju+0.004
 lo+0.004
 funn+0.004
Neg logit contributions
 arrest-0.002
 Tom-0.002
 lawyer-0.002
 school-0.002
 that-0.002
Token When
Token ID1649
Feature activation+1.37e-02

Pos logit contributions
 fart+0.008
rated+0.008
 ju+0.008
 lo+0.008
 funn+0.008
Neg logit contributions
 Tom-0.004
 arrest-0.004
 lawyer-0.004
 school-0.004
 that-0.004
Token Daisy
Token ID40355
Feature activation-2.84e-04

Pos logit contributions
 fart+0.033
 lo+0.031
 ju+0.031
rated+0.031
 funn+0.031
Neg logit contributions
 arrest-0.010
 lawyer-0.009
 Tom-0.009
 elev-0.009
azy-0.008
Token put
Token ID1234
Feature activation-9.08e-03

Pos logit contributions
 elev+0.000
 lawyer+0.000
 arrest+0.000
 school+0.000
 recess+0.000
Neg logit contributions
 fart-0.000
 lo-0.000
 ju-0.000
rated-0.000
 funn-0.000
Token the
Token ID262
Feature activation+9.71e-03

Pos logit contributions
 arrest+0.010
 lawyer+0.009
 Tom+0.009
 elev+0.009
 school+0.008
Neg logit contributions
 fart-0.020
 lo-0.018
 ju-0.018
 funn-0.018
rated-0.018
Token magn
Token ID7842
Feature activation-3.28e-03

Pos logit contributions
 fart+0.019
rated+0.018
 ju+0.017
 funn+0.017
 lo+0.017
Neg logit contributions
 lawyer-0.012
 school-0.012
 arrest-0.012
 Tom-0.012
 church-0.011
Tokenifying
Token ID4035
Feature activation+6.80e-02

Pos logit contributions
 that+0.007
 Tom+0.007
 arrest+0.007
 upstairs+0.007
 inside+0.007
Neg logit contributions
 fart-0.003
rated-0.003
 funn-0.003
 ju-0.003
 lo-0.002
Token glass
Token ID5405
Feature activation+1.62e-02

Pos logit contributions
rated+0.144
 fart+0.129
 funn+0.125
nam+0.123
 lo+0.120
Neg logit contributions
 Tom-0.091
 that-0.090
 church-0.084
 arrest-0.083
 school-0.082
Token down
Token ID866
Feature activation+3.84e-03

Pos logit contributions
 fart+0.042
rated+0.040
 ju+0.039
 funn+0.039
 lo+0.039
Neg logit contributions
 Tom-0.010
 arrest-0.010
 that-0.009
 school-0.009
 lawyer-0.009
Token,
Token ID11
Feature activation-4.54e-03

Pos logit contributions
 fart+0.009
 lo+0.009
 ju+0.009
rated+0.009
 funn+0.009
Neg logit contributions
 arrest-0.003
 lawyer-0.003
 studying-0.002
 elev-0.002
 Tom-0.002
Token the
Token ID262
Feature activation+7.35e-03

Pos logit contributions
 arrest+0.003
 lawyer+0.002
 Tom+0.002
 elev+0.002
 school+0.002
Neg logit contributions
 fart-0.012
 ju-0.011
 lo-0.011
rated-0.011
 funn-0.011
Token cater
Token ID20825
Feature activation+1.22e-02

Pos logit contributions
 fart+0.014
rated+0.013
 ju+0.013
 funn+0.013
 lo+0.013
Neg logit contributions
 Tom-0.009
 arrest-0.009
 lawyer-0.009
 school-0.009
 that-0.009
Token examine
Token ID10716
Feature activation-1.57e-02

Pos logit contributions
 fart+0.007
 lo+0.006
 ju+0.006
 funn+0.006
rated+0.006
Neg logit contributions
 arrest-0.005
 elev-0.005
 Tom-0.005
 lawyer-0.005
issa-0.004
Token it
Token ID340
Feature activation+9.81e-03

Pos logit contributions
 arrest+0.009
 Tom+0.009
 lawyer+0.008
 school+0.008
 elev+0.008
Neg logit contributions
 fart-0.041
 ju-0.038
rated-0.038
 lo-0.038
 funn-0.038
Token with
Token ID351
Feature activation+2.26e-03

Pos logit contributions
 fart+0.023
rated+0.022
 ju+0.021
 funn+0.021
 lo+0.021
Neg logit contributions
 that-0.009
 Tom-0.009
 school-0.009
 lawyer-0.008
 building-0.008
Token the
Token ID262
Feature activation-2.08e-03

Pos logit contributions
 fart+0.005
rated+0.005
 ju+0.005
 funn+0.005
 lo+0.005
Neg logit contributions
 Tom-0.002
 arrest-0.002
 school-0.002
 that-0.002
 lawyer-0.002
Token magn
Token ID7842
Feature activation+7.02e-04

Pos logit contributions
 Tom+0.003
 arrest+0.003
 school+0.003
 lawyer+0.003
 that+0.003
Neg logit contributions
 fart-0.004
rated-0.004
 ju-0.003
 funn-0.003
 lo-0.003
Tokenifying
Token ID4035
Feature activation+6.74e-02

Pos logit contributions
 fart+0.001
rated+0.000
 funn+0.000
 ju+0.000
 lo+0.000
Neg logit contributions
 arrest-0.002
 Tom-0.002
 that-0.002
 lawyer-0.002
 school-0.002
Token glass
Token ID5405
Feature activation+1.26e-02

Pos logit contributions
rated+0.146
 fart+0.129
nam+0.129
 funn+0.122
mediately+0.120
Neg logit contributions
 Tom-0.086
 church-0.085
 that-0.084
 school-0.082
 arrest-0.078
Token?"
Token ID1701
Feature activation-1.24e-02

Pos logit contributions
 fart+0.028
rated+0.027
 ju+0.027
 funn+0.027
 lo+0.026
Neg logit contributions
 that-0.012
 Tom-0.012
 school-0.011
 lawyer-0.011
 arrest-0.010
Token\n
Token ID198
Feature activation+1.39e-02

Pos logit contributions
 arrest+0.010
 Tom+0.010
 lawyer+0.009
 school+0.009
 elev+0.009
Neg logit contributions
 fart-0.030
rated-0.028
 ju-0.028
 lo-0.028
 funn-0.027
Token\n
Token ID198
Feature activation+8.69e-03

Pos logit contributions
 fart+0.034
rated+0.032
 ju+0.031
 funn+0.031
 lo+0.031
Neg logit contributions
 Tom-0.011
 arrest-0.011
 school-0.010
 lawyer-0.010
 that-0.010
Token"
Token ID1
Feature activation+3.22e-03

Pos logit contributions
 fart+0.022
rated+0.021
 funn+0.020
 ju+0.020
 lo+0.019
Neg logit contributions
 Tom-0.007
Tom-0.006
 that-0.006
 arrest-0.006
 school-0.006
Token and
Token ID290
Feature activation+1.14e-02

Pos logit contributions
 fart+0.065
 lo+0.061
rated+0.061
 funn+0.060
 ju+0.059
Neg logit contributions
 arrest-0.030
 elev-0.028
 recess-0.027
azy-0.027
issa-0.027
Token handed
Token ID10158
Feature activation+5.00e-03

Pos logit contributions
 fart+0.020
 ju+0.018
 lo+0.018
 funn+0.018
rated+0.017
Neg logit contributions
 arrest-0.017
 lawyer-0.016
 elev-0.016
 school-0.016
 church-0.015
Token him
Token ID683
Feature activation+1.36e-02

Pos logit contributions
 fart+0.011
 lo+0.011
 ju+0.011
 funn+0.010
 tighter+0.010
Neg logit contributions
 arrest-0.005
 lawyer-0.004
 elev-0.004
 church-0.004
 studying-0.004
Token the
Token ID262
Feature activation+7.60e-03

Pos logit contributions
 fart+0.027
 ju+0.026
 funn+0.025
 lo+0.025
 tighter+0.023
Neg logit contributions
 arrest-0.016
 lawyer-0.015
 church-0.014
 school-0.014
 studying-0.014
Token magn
Token ID7842
Feature activation+1.54e-03

Pos logit contributions
rated+0.014
 fart+0.013
 funn+0.012
ksh+0.012
 ju+0.012
Neg logit contributions
 that-0.011
 Tom-0.011
 arrest-0.011
 school-0.011
 lawyer-0.010
Tokenifying
Token ID4035
Feature activation+6.71e-02

Pos logit contributions
 fart+0.001
rated+0.001
 ju+0.001
 funn+0.001
 lo+0.001
Neg logit contributions
 arrest-0.004
 Tom-0.004
 lawyer-0.004
 that-0.004
 school-0.004
Token glass
Token ID5405
Feature activation+1.24e-02

Pos logit contributions
rated+0.138
 fart+0.126
 funn+0.117
 ju+0.113
nam+0.113
Neg logit contributions
 Tom-0.092
 that-0.091
 arrest-0.091
 church-0.090
 school-0.089
Token.
Token ID13
Feature activation+1.08e-03

Pos logit contributions
 fart+0.031
 ju+0.029
 lo+0.028
 funn+0.028
rated+0.028
Neg logit contributions
 arrest-0.009
 lawyer-0.009
 Tom-0.008
 elev-0.008
 studying-0.008
Token\n
Token ID198
Feature activation+6.21e-03

Pos logit contributions
 fart+0.002
 ju+0.002
 lo+0.002
 funn+0.002
rated+0.002
Neg logit contributions
 arrest-0.001
 elev-0.001
 lawyer-0.001
 school-0.001
issa-0.001
Token\n
Token ID198
Feature activation+2.62e-04

Pos logit contributions
 fart+0.011
 lo+0.010
 ju+0.010
 funn+0.009
rated+0.009
Neg logit contributions
 arrest-0.008
 lawyer-0.008
 elev-0.008
 studying-0.008
 building-0.008
TokenHe
Token ID1544
Feature activation+2.84e-03

Pos logit contributions
 fart+0.001
 ju+0.000
rated+0.000
 lo+0.000
 funn+0.000
Neg logit contributions
 arrest-0.000
 Tom-0.000
 lawyer-0.000
 school-0.000
 elev-0.000
Token She
Token ID1375
Feature activation+1.04e-02

Pos logit contributions
 fart+0.000
 lo+0.000
 ju+0.000
 funn+0.000
rated+0.000
Neg logit contributions
 arrest-0.000
 school-0.000
 elev-0.000
 lawyer-0.000
 studying-0.000
Token takes
Token ID2753
Feature activation-9.58e-04

Pos logit contributions
 fart+0.015
 lo+0.014
 funn+0.014
 wider+0.013
 ju+0.013
Neg logit contributions
 Karen-0.017
 lawyer-0.017
 school-0.017
 arrest-0.017
 Tom-0.017
Token out
Token ID503
Feature activation-6.55e-04

Pos logit contributions
 arrest+0.001
 lawyer+0.001
 studying+0.001
 church+0.001
 school+0.001
Neg logit contributions
 fart-0.002
 lo-0.002
 funn-0.002
 ju-0.002
rated-0.002
Token a
Token ID257
Feature activation+3.78e-04

Pos logit contributions
 arrest+0.001
 lawyer+0.001
 studying+0.001
parent+0.001
azy+0.001
Neg logit contributions
 fart-0.001
 lo-0.001
 funn-0.001
 ju-0.001
 wider-0.001
Token magn
Token ID7842
Feature activation-4.12e-03

Pos logit contributions
 fart+0.001
rated+0.001
 ju+0.001
 funn+0.001
 lo+0.001
Neg logit contributions
 lawyer-0.000
 school-0.000
 Tom-0.000
 building-0.000
 arrest-0.000
Tokenifying
Token ID4035
Feature activation+6.63e-02

Pos logit contributions
 arrest+0.010
 Tom+0.010
 lawyer+0.010
 that+0.010
 school+0.010
Neg logit contributions
 fart-0.003
rated-0.002
 ju-0.002
 funn-0.002
 lo-0.002
Token glass
Token ID5405
Feature activation+1.20e-02

Pos logit contributions
rated+0.134
 fart+0.122
 funn+0.114
 lo+0.111
nam+0.110
Neg logit contributions
 Tom-0.093
 that-0.092
 arrest-0.090
 church-0.088
 school-0.088
Token.
Token ID13
Feature activation-4.03e-03

Pos logit contributions
 fart+0.027
 ju+0.025
rated+0.025
 lo+0.025
 funn+0.025
Neg logit contributions
 arrest-0.012
 Tom-0.011
 lawyer-0.011
 school-0.011
 elev-0.010
Token She
Token ID1375
Feature activation+3.16e-04

Pos logit contributions
 arrest+0.005
 lawyer+0.004
 elev+0.004
 school+0.004
issa+0.004
Neg logit contributions
 fart-0.008
 ju-0.008
 lo-0.007
 funn-0.007
 wider-0.007
Token looks
Token ID3073
Feature activation+5.98e-03

Pos logit contributions
 fart+0.000
 lo+0.000
 funn+0.000
 ju+0.000
rated+0.000
Neg logit contributions
 lawyer-0.001
 Karen-0.001
 arrest-0.001
 Tom-0.001
 school-0.000
Token at
Token ID379
Feature activation-1.14e-02

Pos logit contributions
 fart+0.014
rated+0.013
 ju+0.013
 lo+0.013
 funn+0.013
Neg logit contributions
 Tom-0.005
 that-0.005
 school-0.005
 arrest-0.005
 lawyer-0.005
Token reached
Token ID4251
Feature activation-1.24e-03

Pos logit contributions
 fart+0.007
 lo+0.007
 ju+0.007
rated+0.007
 funn+0.007
Neg logit contributions
 arrest-0.007
 Tom-0.006
 elev-0.006
 lawyer-0.006
 school-0.006
Token for
Token ID329
Feature activation-1.51e-02

Pos logit contributions
 arrest+0.001
 lawyer+0.001
 Tom+0.001
 elev+0.001
azy+0.001
Neg logit contributions
 fart-0.003
 lo-0.002
rated-0.002
 funn-0.002
 ju-0.002
Token her
Token ID607
Feature activation-1.20e-02

Pos logit contributions
 lawyer+0.016
 arrest+0.016
parent+0.015
azy+0.015
 elev+0.015
Neg logit contributions
 fart-0.031
 lo-0.030
 funn-0.028
 ju-0.028
nam-0.028
Token toy
Token ID13373
Feature activation-1.02e-02

Pos logit contributions
 Tom+0.014
 arrest+0.013
 school+0.013
 lawyer+0.013
 that+0.013
Neg logit contributions
 fart-0.025
rated-0.023
 ju-0.023
 funn-0.023
 lo-0.023
Token te
Token ID573
Feature activation+3.57e-02

Pos logit contributions
 arrest+0.012
 elev+0.011
azy+0.011
 lawyer+0.011
 studying+0.011
Neg logit contributions
 fart-0.021
 lo-0.019
 funn-0.018
 ju-0.018
 prett-0.017
Tokenap
Token ID499
Feature activation+6.61e-02

Pos logit contributions
 fart+0.050
rated+0.046
 ju+0.045
 funn+0.043
 lo+0.042
Neg logit contributions
 arrest-0.064
 Tom-0.064
 school-0.063
 lawyer-0.061
 Lee-0.060
Tokenot
Token ID313
Feature activation+2.60e-02

Pos logit contributions
 fart+0.110
 ju+0.102
rated+0.099
 funn+0.097
 lo+0.092
Neg logit contributions
 Tom-0.111
 that-0.104
 school-0.103
 Ann-0.102
 Lee-0.099
Token,
Token ID11
Feature activation-9.77e-03

Pos logit contributions
 fart+0.062
 ju+0.057
rated+0.057
 lo+0.057
 funn+0.057
Neg logit contributions
 arrest-0.020
 elev-0.019
 school-0.019
 lawyer-0.019
 church-0.018
Token but
Token ID475
Feature activation-2.06e-02

Pos logit contributions
 Tom+0.010
 that+0.009
 arrest+0.008
 building+0.008
 school+0.008
Neg logit contributions
 fart-0.022
rated-0.022
 ju-0.021
 funn-0.020
 lo-0.020
Token it
Token ID340
Feature activation+1.11e-02

Pos logit contributions
 Tom+0.012
 arrest+0.011
 school+0.011
 that+0.010
 lawyer+0.010
Neg logit contributions
 fart-0.055
rated-0.052
 ju-0.051
 funn-0.050
 lo-0.050
Token was
Token ID373
Feature activation+2.29e-02

Pos logit contributions
 fart+0.025
rated+0.024
 ju+0.024
nam+0.024
 afloat+0.023
Neg logit contributions
 Tom-0.010
 that-0.009
 Ann-0.009
 ever-0.008
 building-0.008
Token tea
Token ID8887
Feature activation+1.13e-02

Pos logit contributions
 fart+0.030
rated+0.028
 ju+0.028
 funn+0.028
 lo+0.027
Neg logit contributions
 Tom-0.015
 arrest-0.014
 school-0.014
 that-0.014
 lawyer-0.013
Token from
Token ID422
Feature activation-4.56e-03

Pos logit contributions
 fart+0.027
 ju+0.025
 lo+0.025
rated+0.025
 funn+0.024
Neg logit contributions
 arrest-0.010
 lawyer-0.009
 Tom-0.009
 elev-0.009
 school-0.009
Token her
Token ID607
Feature activation-6.65e-03

Pos logit contributions
 arrest+0.005
 lawyer+0.004
azy+0.004
 Tom+0.004
 elev+0.004
Neg logit contributions
 fart-0.010
 lo-0.009
rated-0.009
 ju-0.009
 funn-0.009
Token toy
Token ID13373
Feature activation-1.19e-02

Pos logit contributions
 arrest+0.009
 elev+0.009
 Tom+0.009
azy+0.008
 lawyer+0.008
Neg logit contributions
 fart-0.012
 ju-0.011
 lo-0.011
 funn-0.011
 prett-0.010
Token te
Token ID573
Feature activation+3.59e-02

Pos logit contributions
 Tom+0.016
 elev+0.016
 Mike+0.016
 Karen+0.016
 studying+0.016
Neg logit contributions
 fart-0.021
 lo-0.018
 ju-0.018
 funn-0.017
cr-0.016
Tokenap
Token ID499
Feature activation+6.57e-02

Pos logit contributions
 fart+0.054
rated+0.050
 ju+0.049
 funn+0.047
 lo+0.047
Neg logit contributions
 arrest-0.062
 Tom-0.061
 school-0.059
 lawyer-0.058
 elev-0.057
Tokenot
Token ID313
Feature activation+2.69e-02

Pos logit contributions
 fart+0.096
 ju+0.086
rated+0.085
 funn+0.083
 lo+0.082
Neg logit contributions
 Tom-0.118
 arrest-0.115
 school-0.113
 that-0.112
 lawyer-0.111
Token and
Token ID290
Feature activation+1.03e-02

Pos logit contributions
 fart+0.060
rated+0.054
 ju+0.054
 lo+0.053
 funn+0.053
Neg logit contributions
 elev-0.025
 church-0.025
 school-0.025
 lawyer-0.025
 arrest-0.024
Token they
Token ID484
Feature activation+1.10e-03

Pos logit contributions
 fart+0.021
rated+0.020
 ju+0.019
 funn+0.019
 lo+0.018
Neg logit contributions
 Tom-0.013
 school-0.012
 that-0.012
 building-0.012
 arrest-0.012
Token all
Token ID477
Feature activation+1.69e-02

Pos logit contributions
 fart+0.002
 ju+0.002
rated+0.002
 funn+0.002
 lo+0.002
Neg logit contributions
 Tom-0.001
 that-0.001
 school-0.001
 arrest-0.001
 building-0.001
Token had
Token ID550
Feature activation+6.63e-03

Pos logit contributions
 fart+0.036
 ju+0.033
 lo+0.032
rated+0.031
 funn+0.031
Neg logit contributions
 that-0.020
 Tom-0.019
 school-0.018
 ever-0.018
 building-0.018
Token borrowed
Token ID22546
Feature activation+5.31e-03

Pos logit contributions
 fart+0.013
rated+0.012
 ju+0.012
 funn+0.012
 lo+0.012
Neg logit contributions
 Tom-0.006
 that-0.006
 arrest-0.005
 lawyer-0.005
 school-0.005
Token her
Token ID607
Feature activation-3.53e-03

Pos logit contributions
 fart+0.013
rated+0.013
 ju+0.012
 lo+0.012
 funn+0.012
Neg logit contributions
 Tom-0.004
 arrest-0.004
 that-0.004
 lawyer-0.004
 school-0.003
Token mom
Token ID1995
Feature activation+2.05e-02

Pos logit contributions
 that+0.005
 school+0.005
 arrest+0.005
 Tom+0.005
 lawyer+0.004
Neg logit contributions
 fart-0.007
rated-0.006
 funn-0.006
 ju-0.006
 lo-0.006
Token's
Token ID338
Feature activation+1.17e-03

Pos logit contributions
 fart+0.039
rated+0.036
 ju+0.036
 lo+0.035
 funn+0.035
Neg logit contributions
 arrest-0.026
 Tom-0.026
 lawyer-0.025
 school-0.025
 that-0.024
Token magn
Token ID7842
Feature activation-6.94e-05

Pos logit contributions
 fart+0.002
rated+0.002
 funn+0.002
 ju+0.002
 lo+0.002
Neg logit contributions
 arrest-0.002
 that-0.002
 school-0.002
 Tom-0.002
 lawyer-0.002
Tokenifying
Token ID4035
Feature activation+6.56e-02

Pos logit contributions
 arrest+0.000
 Tom+0.000
 that+0.000
 lawyer+0.000
 school+0.000
Neg logit contributions
 fart-0.000
rated-0.000
 funn-0.000
 ju-0.000
 lo-0.000
Token glass
Token ID5405
Feature activation+1.13e-02

Pos logit contributions
rated+0.128
 fart+0.119
 funn+0.110
 lo+0.107
nam+0.107
Neg logit contributions
 Tom-0.094
 arrest-0.092
 that-0.091
 school-0.090
 church-0.088
Token to
Token ID284
Feature activation+3.06e-03

Pos logit contributions
 fart+0.028
 ju+0.026
rated+0.026
 lo+0.025
 funn+0.025
Neg logit contributions
 arrest-0.009
 Tom-0.008
 lawyer-0.008
 school-0.008
 elev-0.008
Token take
Token ID1011
Feature activation+8.31e-03

Pos logit contributions
 fart+0.006
 ju+0.006
 lo+0.006
rated+0.006
 funn+0.006
Neg logit contributions
 arrest-0.004
 lawyer-0.003
 Tom-0.003
 elev-0.003
issa-0.003
Token a
Token ID257
Feature activation+9.28e-03

Pos logit contributions
 fart+0.017
 ju+0.016
 lo+0.016
 funn+0.016
rated+0.015
Neg logit contributions
 lawyer-0.009
 arrest-0.009
issa-0.009
 school-0.008
 church-0.008
Token closer
Token ID5699
Feature activation+1.41e-02

Pos logit contributions
 fart+0.017
rated+0.017
 funn+0.016
 lo+0.016
 ju+0.016
Neg logit contributions
 that-0.012
 lawyer-0.012
 arrest-0.012
 school-0.012
 Tom-0.012
Token day
Token ID1110
Feature activation+2.10e-02

Pos logit contributions
 fart+0.008
 ju+0.007
rated+0.007
 funn+0.007
 lo+0.007
Neg logit contributions
 Tom-0.003
 that-0.003
 school-0.003
 lawyer-0.002
 church-0.002
Token,
Token ID11
Feature activation+1.14e-03

Pos logit contributions
 fart+0.046
 lo+0.045
 ju+0.042
 funn+0.041
rated+0.040
Neg logit contributions
 arrest-0.019
parent-0.018
 elev-0.018
 lawyer-0.017
issa-0.016
Token Emma
Token ID18966
Feature activation+2.58e-03

Pos logit contributions
 fart+0.003
 lo+0.002
 ju+0.002
 funn+0.002
nam+0.002
Neg logit contributions
 arrest-0.001
 elev-0.001
 lawyer-0.001
parent-0.001
 studying-0.001
Token asked
Token ID1965
Feature activation-1.09e-02

Pos logit contributions
 fart+0.005
rated+0.004
 lo+0.004
 ju+0.004
 funn+0.004
Neg logit contributions
 arrest-0.004
 elev-0.004
 Tom-0.003
 lawyer-0.003
 school-0.003
Token her
Token ID607
Feature activation-1.18e-03

Pos logit contributions
 arrest+0.013
 elev+0.010
 lawyer+0.010
 church+0.009
azy+0.009
Neg logit contributions
 lo-0.021
 fart-0.021
rated-0.020
 funn-0.020
nam-0.020
Token pap
Token ID20461
Feature activation+6.49e-02

Pos logit contributions
 arrest+0.001
 elev+0.001
 Tom+0.001
azy+0.001
issa+0.001
Neg logit contributions
 fart-0.002
 ju-0.002
 funn-0.002
 lo-0.002
rated-0.002
Tokena
Token ID64
Feature activation+3.64e-02

Pos logit contributions
 fart+0.127
 ju+0.113
rated+0.112
 funn+0.104
 lo+0.103
Neg logit contributions
 lawyer-0.089
 that-0.085
issa-0.083
azy-0.081
 Tom-0.081
Token if
Token ID611
Feature activation-6.23e-03

Pos logit contributions
 fart+0.088
rated+0.083
 ju+0.082
 lo+0.081
 funn+0.081
Neg logit contributions
 Tom-0.029
 arrest-0.028
 lawyer-0.027
 school-0.026
 that-0.025
Token they
Token ID484
Feature activation-2.61e-03

Pos logit contributions
 arrest+0.004
 elev+0.004
 lawyer+0.003
parent+0.003
 Tom+0.003
Neg logit contributions
 fart-0.015
 ju-0.015
 lo-0.015
 funn-0.015
rated-0.014
Token could
Token ID714
Feature activation+7.13e-03

Pos logit contributions
 arrest+0.004
 elev+0.004
 lawyer+0.004
 Tom+0.004
 school+0.004
Neg logit contributions
 fart-0.004
 ju-0.004
rated-0.004
 lo-0.004
 funn-0.004
Token go
Token ID467
Feature activation+7.40e-03

Pos logit contributions
 fart+0.012
 lo+0.011
 ju+0.011
rated+0.011
 funn+0.011
Neg logit contributions
 arrest-0.010
 elev-0.010
issa-0.010
 lawyer-0.010
 Tom-0.010
Token the
Token ID262
Feature activation-3.81e-03

Pos logit contributions
 fart+0.008
 funn+0.008
 ju+0.008
 lo+0.008
 prett+0.007
Neg logit contributions
 arrest-0.005
 lawyer-0.005
 church-0.005
 studying-0.004
 school-0.004
Token necklace
Token ID38987
Feature activation-3.69e-03

Pos logit contributions
 arrest+0.005
 school+0.005
 that+0.005
 Tom+0.005
 lawyer+0.005
Neg logit contributions
rated-0.007
 fart-0.006
ksh-0.006
 ju-0.006
 funn-0.006
Token and
Token ID290
Feature activation+1.31e-03

Pos logit contributions
 arrest+0.004
 elev+0.003
 church+0.003
azy+0.003
 recess+0.003
Neg logit contributions
 fart-0.008
 lo-0.008
 ju-0.008
 funn-0.007
rated-0.007
Token the
Token ID262
Feature activation+5.13e-03

Pos logit contributions
 fart+0.002
 ju+0.002
rated+0.002
 lo+0.002
 funn+0.002
Neg logit contributions
 arrest-0.002
 Tom-0.002
 lawyer-0.002
 school-0.002
 elev-0.002
Token magn
Token ID7842
Feature activation-9.65e-03

Pos logit contributions
rated+0.010
 fart+0.009
ksh+0.008
 ju+0.008
 funn+0.008
Neg logit contributions
 Tom-0.007
 school-0.007
 arrest-0.007
 lawyer-0.007
 that-0.007
Tokenifying
Token ID4035
Feature activation+6.48e-02

Pos logit contributions
 Tom+0.025
 arrest+0.024
 school+0.024
 lawyer+0.024
 church+0.024
Neg logit contributions
 fart-0.006
 lo-0.005
 ju-0.005
rated-0.005
 funn-0.004
Token glass
Token ID5405
Feature activation+7.96e-03

Pos logit contributions
rated+0.134
 fart+0.121
 funn+0.114
 ju+0.110
nam+0.109
Neg logit contributions
 arrest-0.090
 that-0.088
 Tom-0.088
 church-0.085
 school-0.085
Token and
Token ID290
Feature activation+2.68e-03

Pos logit contributions
 fart+0.020
 lo+0.018
 ju+0.018
 funn+0.018
rated+0.018
Neg logit contributions
 arrest-0.006
 lawyer-0.005
 elev-0.005
 studying-0.005
 Tom-0.005
Token told
Token ID1297
Feature activation-1.12e-02

Pos logit contributions
 fart+0.005
rated+0.005
 ju+0.005
 funn+0.005
 lo+0.005
Neg logit contributions
 Tom-0.003
 arrest-0.003
 school-0.003
 lawyer-0.003
 that-0.003
Token him
Token ID683
Feature activation+3.70e-03

Pos logit contributions
 arrest+0.012
 lawyer+0.011
aren+0.011
 elev+0.010
idd+0.010
Neg logit contributions
 fart-0.023
 funn-0.022
 lo-0.022
 ju-0.021
 tighter-0.021
Token that
Token ID326
Feature activation+2.94e-02

Pos logit contributions
 fart+0.008
 ju+0.007
 funn+0.007
 lo+0.007
rated+0.007
Neg logit contributions
 arrest-0.004
 lawyer-0.003
 elev-0.003
parent-0.003
issa-0.003
Token the
Token ID262
Feature activation+1.53e-04

Pos logit contributions
 arrest+0.001
 lawyer+0.001
 studying+0.001
parent+0.001
azy+0.001
Neg logit contributions
 fart-0.003
 lo-0.003
 ju-0.003
 funn-0.003
 tighter-0.003
Token necklace
Token ID38987
Feature activation-5.49e-03

Pos logit contributions
rated+0.000
 fart+0.000
 funn+0.000
 ju+0.000
ksh+0.000
Neg logit contributions
 arrest-0.000
 Tom-0.000
 school-0.000
 that-0.000
 lawyer-0.000
Token and
Token ID290
Feature activation-1.80e-04

Pos logit contributions
 arrest+0.003
 Tom+0.003
 lawyer+0.003
 school+0.003
 elev+0.003
Neg logit contributions
 fart-0.014
 ju-0.013
rated-0.013
 lo-0.013
 funn-0.013
Token the
Token ID262
Feature activation+7.59e-03

Pos logit contributions
 Tom+0.000
 arrest+0.000
 school+0.000
 that+0.000
 church+0.000
Neg logit contributions
 fart-0.000
rated-0.000
 ju-0.000
 funn-0.000
 lo-0.000
Token magn
Token ID7842
Feature activation-1.01e-02

Pos logit contributions
rated+0.014
 fart+0.013
 funn+0.012
 ju+0.012
ksh+0.012
Neg logit contributions
 Tom-0.011
 arrest-0.010
 school-0.010
 lawyer-0.010
 that-0.010
Tokenifying
Token ID4035
Feature activation+6.46e-02

Pos logit contributions
 arrest+0.024
 Tom+0.024
 lawyer+0.024
 school+0.024
 that+0.023
Neg logit contributions
 fart-0.008
rated-0.006
 ju-0.006
 lo-0.006
 funn-0.006
Token glass
Token ID5405
Feature activation+8.77e-03

Pos logit contributions
rated+0.138
 fart+0.122
 funn+0.117
nam+0.115
 lo+0.113
Neg logit contributions
 that-0.088
 Tom-0.087
 arrest-0.086
 church-0.084
 school-0.081
Token and
Token ID290
Feature activation+1.04e-04

Pos logit contributions
 fart+0.023
 ju+0.021
 lo+0.021
rated+0.021
 funn+0.021
Neg logit contributions
 arrest-0.005
 Tom-0.005
 lawyer-0.005
 school-0.005
 elev-0.005
Token started
Token ID2067
Feature activation+5.75e-04

Pos logit contributions
 fart+0.000
rated+0.000
 ju+0.000
 funn+0.000
 lo+0.000
Neg logit contributions
 Tom-0.000
 arrest-0.000
 that-0.000
 school-0.000
 Ann-0.000
Token to
Token ID284
Feature activation+9.50e-03

Pos logit contributions
 fart+0.001
 ju+0.001
 lo+0.001
 funn+0.001
rated+0.001
Neg logit contributions
 lawyer-0.001
azy-0.001
 arrest-0.001
issa-0.001
 Tom-0.001
Token examine
Token ID10716
Feature activation-1.47e-02

Pos logit contributions
 fart+0.020
rated+0.018
 ju+0.018
 funn+0.017
 lo+0.017
Neg logit contributions
 arrest-0.012
 school-0.012
 Tom-0.011
 building-0.011
 that-0.011
TokenMaybe
Token ID13300
Feature activation-5.77e-03

Pos logit contributions
rated+0.016
 fart+0.015
 afloat+0.014
ench+0.014
 ju+0.014
Neg logit contributions
Tom-0.008
 Tom-0.008
 that-0.008
 ever-0.008
 lawyer-0.008
Token.
Token ID13
Feature activation-8.42e-03

Pos logit contributions
 Tom+0.004
 that+0.004
 arrest+0.003
 school+0.003
 lawyer+0.003
Neg logit contributions
 fart-0.015
rated-0.014
 ju-0.014
 funn-0.014
 lo-0.014
Token Let
Token ID3914
Feature activation+6.24e-03

Pos logit contributions
 Tom+0.011
 Lee+0.010
 that+0.010
 Ann+0.010
 lawyer+0.010
Neg logit contributions
 fart-0.017
rated-0.017
 funn-0.015
 lo-0.015
 ju-0.015
Token's
Token ID338
Feature activation+3.28e-02

Pos logit contributions
 fart+0.011
rated+0.010
nam+0.010
 ju+0.010
 funn+0.009
Neg logit contributions
 Tom-0.010
 that-0.010
 arrest-0.009
 she-0.008
 school-0.008
Token cl
Token ID537
Feature activation+7.28e-03

Pos logit contributions
 fart+0.062
rated+0.060
 ju+0.060
 funn+0.058
ksh+0.057
Neg logit contributions
 Tom-0.043
 that-0.042
 school-0.041
 arrest-0.040
 playing-0.039
Tokenap
Token ID499
Feature activation+6.46e-02

Pos logit contributions
 fart+0.012
 ju+0.012
rated+0.011
 funn+0.010
 lo+0.010
Neg logit contributions
 Tom-0.012
 school-0.012
 arrest-0.011
 that-0.011
 Mike-0.011
Token and
Token ID290
Feature activation+7.50e-03

Pos logit contributions
 ju+0.165
 fart+0.162
rated+0.157
 redd+0.153
 funn+0.150
Neg logit contributions
 Tom-0.053
 that-0.052
 school-0.044
 Ann-0.041
 now-0.040
Token see
Token ID766
Feature activation+1.53e-02

Pos logit contributions
rated+0.016
 fart+0.015
 ju+0.015
 redd+0.014
ksh+0.014
Neg logit contributions
 Tom-0.009
 school-0.009
 that-0.008
 playing-0.008
 building-0.008
Token if
Token ID611
Feature activation+7.05e-03

Pos logit contributions
 fart+0.036
rated+0.034
 ju+0.034
 lo+0.033
 funn+0.033
Neg logit contributions
 Tom-0.013
 arrest-0.013
 that-0.012
 school-0.012
 lawyer-0.012
Token it
Token ID340
Feature activation+4.86e-04

Pos logit contributions
 fart+0.018
rated+0.017
 ju+0.017
 funn+0.017
 lo+0.017
Neg logit contributions
 Tom-0.005
 arrest-0.004
 that-0.004
 school-0.004
 lawyer-0.004
Token hops
Token ID29438
Feature activation+1.22e-02

Pos logit contributions
 fart+0.001
 ju+0.001
nam+0.001
rated+0.001
 funn+0.001
Neg logit contributions
 Tom-0.001
 that-0.001
 arrest-0.001
 Ann-0.001
 building-0.001
Token cried
Token ID16896
Feature activation-2.91e-02

Pos logit contributions
 fart+0.015
rated+0.014
 ju+0.014
 funn+0.014
 lo+0.014
Neg logit contributions
 Tom-0.010
 arrest-0.010
 lawyer-0.009
 school-0.009
 that-0.009
Token,
Token ID11
Feature activation-1.28e-02

Pos logit contributions
 arrest+0.028
 church+0.022
issa+0.022
 elev+0.022
 studying+0.022
Neg logit contributions
 fart-0.067
 lo-0.064
 funn-0.062
 tighter-0.062
rated-0.062
Token dropping
Token ID12047
Feature activation+6.96e-03

Pos logit contributions
 arrest+0.012
 Tom+0.012
 lawyer+0.012
 school+0.011
 elev+0.011
Neg logit contributions
 fart-0.029
rated-0.027
 ju-0.026
 lo-0.026
 funn-0.026
Token the
Token ID262
Feature activation-3.33e-03

Pos logit contributions
 fart+0.014
 lo+0.013
 funn+0.013
 ju+0.013
 tighter+0.013
Neg logit contributions
 arrest-0.008
 studying-0.007
 lawyer-0.007
issa-0.007
parent-0.007
Token magn
Token ID7842
Feature activation-6.27e-03

Pos logit contributions
 Tom+0.005
 arrest+0.005
 lawyer+0.004
 school+0.004
 elev+0.004
Neg logit contributions
 fart-0.006
rated-0.006
 ju-0.006
 lo-0.005
 funn-0.005
Tokenifying
Token ID4035
Feature activation+6.44e-02

Pos logit contributions
 Tom+0.017
 arrest+0.017
 lawyer+0.017
 school+0.017
 elev+0.016
Neg logit contributions
 fart-0.003
rated-0.002
 ju-0.002
 lo-0.002
 funn-0.002
Token glass
Token ID5405
Feature activation+5.43e-03

Pos logit contributions
rated+0.115
 fart+0.114
 funn+0.104
 ju+0.104
 lo+0.102
Neg logit contributions
 Tom-0.097
 arrest-0.095
 school-0.092
 church-0.092
 that-0.092
Token.
Token ID13
Feature activation-7.09e-03

Pos logit contributions
 fart+0.013
rated+0.012
 ju+0.012
 lo+0.012
 funn+0.012
Neg logit contributions
 arrest-0.004
 Tom-0.004
 lawyer-0.004
 school-0.004
 elev-0.004
Token\n
Token ID198
Feature activation-2.98e-03

Pos logit contributions
 arrest+0.009
 elev+0.009
parent+0.008
issa+0.008
 school+0.008
Neg logit contributions
 fart-0.014
 lo-0.012
 ju-0.012
 funn-0.012
rated-0.011
Token\n
Token ID198
Feature activation-6.51e-03

Pos logit contributions
 arrest+0.004
 lawyer+0.004
 elev+0.004
 studying+0.004
 event+0.004
Neg logit contributions
 fart-0.005
 lo-0.004
rac-0.004
 wider-0.004
mediately-0.004
Token"
Token ID1
Feature activation-1.64e-03

Pos logit contributions
 arrest+0.007
 studying+0.007
 elev+0.007
 lawyer+0.007
 church+0.007
Neg logit contributions
 fart-0.013
 lo-0.012
 ju-0.011
rated-0.011
 funn-0.011
Token at
Token ID379
Feature activation-3.56e-03

Pos logit contributions
 fart+0.030
rated+0.028
 ju+0.027
 lo+0.026
 funn+0.026
Neg logit contributions
 that-0.008
 Tom-0.008
 inside-0.007
 outside-0.007
 school-0.007
Token him
Token ID683
Feature activation+5.53e-03

Pos logit contributions
 arrest+0.004
 studying+0.004
 lawyer+0.004
makers+0.004
 choosing+0.004
Neg logit contributions
 fart-0.006
 lo-0.006
nam-0.006
 funn-0.006
 ju-0.006
Token with
Token ID351
Feature activation-3.63e-04

Pos logit contributions
 fart+0.015
rated+0.014
 ju+0.014
 funn+0.014
 lo+0.014
Neg logit contributions
 Tom-0.003
 arrest-0.003
 school-0.003
 lawyer-0.003
 that-0.003
Token a
Token ID257
Feature activation+5.19e-03

Pos logit contributions
 Tom+0.000
 that+0.000
 arrest+0.000
 school+0.000
 lawyer+0.000
Neg logit contributions
 fart-0.001
rated-0.001
 ju-0.001
ksh-0.001
 funn-0.001
Token mis
Token ID2984
Feature activation+3.81e-02

Pos logit contributions
rated+0.010
 fart+0.010
ksh+0.009
nam+0.009
mediately+0.009
Neg logit contributions
 that-0.006
 building-0.006
 lawyer-0.006
 Tom-0.006
 school-0.006
Tokenchie
Token ID3043
Feature activation+6.40e-02

Pos logit contributions
 fart+0.058
 ju+0.053
rated+0.053
 lo+0.052
 funn+0.051
Neg logit contributions
 arrest-0.061
 lawyer-0.060
azy-0.058
 elev-0.057
 ever-0.056
Tokenvous
Token ID31222
Feature activation+7.35e-03

Pos logit contributions
 fart+0.145
nam+0.137
 afloat+0.135
 ju+0.134
 lo+0.128
Neg logit contributions
 that-0.063
 ever-0.059
 Tom-0.058
--0.057
old-0.056
Token grin
Token ID16770
Feature activation+1.10e-02

Pos logit contributions
 fart+0.011
rated+0.009
 ju+0.009
 lo+0.009
 funn+0.009
Neg logit contributions
 Tom-0.013
 arrest-0.013
 lawyer-0.013
 school-0.013
 that-0.012
Token.
Token ID13
Feature activation-5.68e-03

Pos logit contributions
 fart+0.027
rated+0.025
 lo+0.025
 ju+0.025
 funn+0.025
Neg logit contributions
 arrest-0.008
 elev-0.007
 lawyer-0.007
 Tom-0.007
 school-0.007
Token She
Token ID1375
Feature activation-5.28e-03

Pos logit contributions
 arrest+0.008
 recess+0.006
 school+0.006
issa+0.006
azy+0.006
Neg logit contributions
 fart-0.011
 ju-0.010
 lo-0.010
 funn-0.010
rated-0.009
Token said
Token ID531
Feature activation-8.80e-03

Pos logit contributions
 Tom+0.007
 arrest+0.007
 lawyer+0.007
 school+0.007
 that+0.007
Neg logit contributions
 fart-0.010
rated-0.009
 ju-0.009
 funn-0.009
 lo-0.009
Token it
Token ID340
Feature activation-2.03e-03

Pos logit contributions
 fart+0.009
 funn+0.008
 lo+0.008
 ju+0.008
nam+0.008
Neg logit contributions
azy-0.005
aren-0.005
 lawyer-0.004
 arrest-0.004
makers-0.004
Token and
Token ID290
Feature activation+1.26e-02

Pos logit contributions
 arrest+0.001
 lawyer+0.001
 Tom+0.001
 elev+0.001
 church+0.001
Neg logit contributions
 fart-0.005
 lo-0.005
rated-0.005
 ju-0.005
 funn-0.005
Token started
Token ID2067
Feature activation+4.64e-03

Pos logit contributions
 fart+0.026
rated+0.025
 ju+0.024
 lo+0.024
 funn+0.024
Neg logit contributions
 Tom-0.015
 arrest-0.014
 school-0.014
 lawyer-0.013
 that-0.013
Token to
Token ID284
Feature activation+1.77e-02

Pos logit contributions
 fart+0.010
 lo+0.009
 funn+0.009
 ju+0.009
nam+0.008
Neg logit contributions
azy-0.004
 arrest-0.004
issa-0.004
 lawyer-0.004
aren-0.004
Token cl
Token ID537
Feature activation+1.28e-02

Pos logit contributions
 fart+0.036
rated+0.033
 ju+0.033
 lo+0.032
 funn+0.032
Neg logit contributions
 Tom-0.022
 arrest-0.022
 school-0.021
 lawyer-0.020
 studying-0.020
Tokenap
Token ID499
Feature activation+6.37e-02

Pos logit contributions
 fart+0.017
rated+0.015
 ju+0.015
 lo+0.014
 funn+0.014
Neg logit contributions
 Tom-0.025
 arrest-0.025
 school-0.024
 lawyer-0.024
 that-0.024
Token her
Token ID607
Feature activation+1.26e-02

Pos logit contributions
 fart+0.168
rated+0.158
 ju+0.158
 lo+0.156
 funn+0.155
Neg logit contributions
 Tom-0.037
 arrest-0.037
 lawyer-0.033
 school-0.033
 that-0.031
Token hands
Token ID2832
Feature activation+2.79e-02

Pos logit contributions
rated+0.022
 fart+0.022
 ju+0.022
 funn+0.020
 redd+0.019
Neg logit contributions
 Tom-0.019
 arrest-0.019
 playing-0.017
 school-0.017
 that-0.017
Token.
Token ID13
Feature activation+5.88e-03

Pos logit contributions
 fart+0.078
rated+0.073
 ju+0.073
 lo+0.072
 funn+0.072
Neg logit contributions
 arrest-0.012
 Tom-0.012
 lawyer-0.010
 school-0.010
 that-0.009
Token\n
Token ID198
Feature activation+6.15e-03

Pos logit contributions
 fart+0.011
 ju+0.010
 lo+0.010
rated+0.010
 funn+0.010
Neg logit contributions
 arrest-0.008
 lawyer-0.007
 elev-0.007
 school-0.007
 Tom-0.007
Token\n
Token ID198
Feature activation-5.51e-04

Pos logit contributions
 fart+0.011
 lo+0.010
rated+0.009
 ju+0.009
 funn+0.009
Neg logit contributions
 arrest-0.009
 elev-0.008
 lawyer-0.008
 studying-0.008
 helpers-0.008
Token It
Token ID632
Feature activation-1.18e-02

Pos logit contributions
 fart+0.013
 ju+0.012
 lo+0.012
 funn+0.012
rated+0.012
Neg logit contributions
 arrest-0.007
 lawyer-0.007
 elev-0.007
 school-0.007
 Tom-0.006
Token became
Token ID2627
Feature activation+1.10e-02

Pos logit contributions
 Tom+0.012
 arrest+0.012
 that+0.011
 school+0.011
 lawyer+0.011
Neg logit contributions
 fart-0.026
rated-0.024
 ju-0.024
 funn-0.023
 lo-0.023
Token the
Token ID262
Feature activation-4.80e-03

Pos logit contributions
 fart+0.028
rated+0.027
ksh+0.025
rac+0.024
 ju+0.024
Neg logit contributions
 Tom-0.008
 that-0.007
 school-0.007
 now-0.007
 playing-0.007
Token happiest
Token ID49414
Feature activation+1.56e-04

Pos logit contributions
 school+0.008
 lawyer+0.008
 Tom+0.008
 church+0.008
 arrest+0.007
Neg logit contributions
 fart-0.007
rated-0.007
 ju-0.006
 lo-0.006
 funn-0.006
Token books
Token ID3835
Feature activation-5.47e-02

Pos logit contributions
 brightest+0.000
 fart+0.000
 lo+0.000
 ju+0.000
 funn+0.000
Neg logit contributions
Mike-0.000
 arrest-0.000
co-0.000
makers-0.000
aren-0.000
Tokenhe
Token ID258
Feature activation-8.50e-02

Pos logit contributions
 arrest+0.062
 elev+0.058
 Lee+0.057
 lawyer+0.056
 Tom+0.056
Neg logit contributions
 fart-0.111
 lo-0.106
 ju-0.105
rated-0.104
 brighter-0.100
Tokenlf
Token ID1652
Feature activation-6.27e-02

Pos logit contributions
aren+0.100
 elev+0.097
 Karen+0.097
old+0.092
 arrest+0.091
Neg logit contributions
 fart-0.154
rated-0.145
 lo-0.135
boat-0.125
rily-0.125
Token in
Token ID287
Feature activation-2.06e-02

Pos logit contributions
 arrest+0.066
 elev+0.064
 Tom+0.061
 lawyer+0.059
 Lee+0.059
Neg logit contributions
 fart-0.128
rated-0.123
 lo-0.122
 ju-0.120
 brightest-0.117
Token the
Token ID262
Feature activation-1.17e-02

Pos logit contributions
 lawyer+0.024
co+0.024
 arrest+0.023
 Lee+0.023
azy+0.023
Neg logit contributions
 fart-0.039
 lo-0.036
rated-0.034
 ju-0.033
nam-0.033
Token world
Token ID995
Feature activation-1.41e-02

Pos logit contributions
 arrest+0.020
 Tom+0.019
 Lee+0.019
azy+0.019
 Mike+0.018
Neg logit contributions
 fart-0.018
 lo-0.016
 ju-0.016
 funn-0.016
rated-0.015
Token!
Token ID0
Feature activation-1.21e-03

Pos logit contributions
 arrest+0.013
 lawyer+0.013
innie+0.013
 Lee+0.012
 elev+0.012
Neg logit contributions
 fart-0.030
rated-0.029
 ju-0.029
 lo-0.029
 brighter-0.027
Token I
Token ID314
Feature activation-1.55e-02

Pos logit contributions
 arrest+0.020
 Tom+0.019
 lawyer+0.019
 school+0.019
 elev+0.018
Neg logit contributions
 fart-0.040
 ju-0.037
 lo-0.036
rated-0.036
 funn-0.036
Token don
Token ID836
Feature activation-3.87e-02

Pos logit contributions
 arrest+0.022
 elev+0.021
 lawyer+0.021
 Tom+0.020
 school+0.020
Neg logit contributions
 fart-0.028
 lo-0.026
 funn-0.026
 ju-0.026
rated-0.026
Token't
Token ID470
Feature activation-1.80e-02

Pos logit contributions
 arrest+0.030
 lawyer+0.028
 recess+0.028
 school+0.028
aren+0.028
Neg logit contributions
 fart-0.092
 lo-0.081
 ju-0.079
 funn-0.079
rated-0.078
Token like
Token ID588
Feature activation-7.43e-03

Pos logit contributions
 arrest+0.024
 elev+0.024
issa+0.023
 lawyer+0.023
 Tom+0.022
Neg logit contributions
 fart-0.034
rated-0.031
 funn-0.031
 ju-0.030
 lo-0.030
Token rac
Token ID3444
Feature activation-2.96e-02

Pos logit contributions
 arrest+0.007
 Tom+0.007
 lawyer+0.007
azy+0.006
 school+0.006
Neg logit contributions
 fart-0.017
 ju-0.015
 funn-0.015
 lo-0.015
rated-0.015
Tokenco
Token ID1073
Feature activation-8.17e-02

Pos logit contributions
 grandma+0.052
 Karen+0.049
 uncle+0.048
 Mike+0.047
 Tom+0.046
Neg logit contributions
 fart-0.048
 funn-0.039
 lo-0.036
rated-0.036
sc-0.035
Tokenons
Token ID684
Feature activation-1.56e-02

Pos logit contributions
 lawyer+0.091
 arrest+0.090
 upstairs+0.089
 school+0.087
 elev+0.085
Neg logit contributions
 fart-0.161
rated-0.149
 funn-0.143
 lo-0.141
 ju-0.139
Token.
Token ID13
Feature activation-1.22e-02

Pos logit contributions
 arrest+0.010
 Tom+0.010
 elev+0.009
 lawyer+0.009
 school+0.009
Neg logit contributions
 fart-0.040
 ju-0.037
 lo-0.037
 funn-0.036
rated-0.036
Token They
Token ID1119
Feature activation-1.12e-02

Pos logit contributions
 Tom+0.014
 arrest+0.013
 lawyer+0.013
 school+0.012
 that+0.012
Neg logit contributions
 fart-0.026
rated-0.024
 ju-0.024
 lo-0.023
 funn-0.023
Token are
Token ID389
Feature activation-4.05e-03

Pos logit contributions
 arrest+0.014
 Tom+0.014
 that+0.013
 lawyer+0.013
 school+0.013
Neg logit contributions
 fart-0.022
rated-0.020
 ju-0.020
 lo-0.020
 funn-0.019
Token mean
Token ID1612
Feature activation+1.33e-02

Pos logit contributions
 Tom+0.005
 arrest+0.005
 that+0.005
 school+0.005
 lawyer+0.005
Neg logit contributions
 fart-0.008
rated-0.007
 ju-0.007
 lo-0.007
 funn-0.007
Token went
Token ID1816
Feature activation-1.59e-03

Pos logit contributions
 fart+0.003
rated+0.003
 ju+0.003
 funn+0.003
 lo+0.003
Neg logit contributions
 Tom-0.002
 arrest-0.002
 school-0.002
 that-0.002
 lawyer-0.002
Token on
Token ID319
Feature activation-4.64e-04

Pos logit contributions
 arrest+0.002
azy+0.002
issa+0.002
 lawyer+0.002
parent+0.002
Neg logit contributions
 fart-0.003
 lo-0.002
 funn-0.002
 wider-0.002
 prett-0.002
Token a
Token ID257
Feature activation+2.21e-03

Pos logit contributions
azy+0.001
 arrest+0.001
ibrarian+0.001
issa+0.001
 lawyer+0.001
Neg logit contributions
 fart-0.001
 lo-0.001
 funn-0.001
 ju-0.001
nam-0.001
Token scary
Token ID14343
Feature activation+5.08e-04

Pos logit contributions
 fart+0.004
rated+0.004
 ju+0.004
 lo+0.004
 funn+0.004
Neg logit contributions
 Tom-0.003
 arrest-0.003
 school-0.003
 lawyer-0.003
 that-0.003
Token roller
Token ID24471
Feature activation-3.00e-02

Pos logit contributions
 fart+0.001
rated+0.001
 funn+0.001
 lo+0.001
 ju+0.001
Neg logit contributions
 Tom-0.001
azy-0.001
 arrest-0.001
 Lee-0.001
issa-0.001
Tokenco
Token ID1073
Feature activation-7.22e-02

Pos logit contributions
irst+0.032
 Tom+0.031
 arrest+0.031
issa+0.031
 lawyer+0.031
Neg logit contributions
 fart-0.062
 lo-0.054
 funn-0.053
rated-0.053
 ju-0.053
Tokenaster
Token ID1603
Feature activation-2.46e-02

Pos logit contributions
 lawyer+0.087
 helpers+0.085
 Lee+0.085
 Tom+0.084
 arrest+0.081
Neg logit contributions
 fart-0.124
rated-0.121
 lo-0.110
cr-0.107
seed-0.107
Token ride
Token ID6594
Feature activation-1.76e-02

Pos logit contributions
 arrest+0.019
 lawyer+0.019
 Tom+0.018
 elev+0.018
 school+0.017
Neg logit contributions
 fart-0.059
 lo-0.055
rated-0.055
 funn-0.054
 ju-0.054
Token.
Token ID13
Feature activation+1.38e-03

Pos logit contributions
 arrest+0.019
 lawyer+0.017
 church+0.016
 Tom+0.016
 elev+0.016
Neg logit contributions
 fart-0.038
 lo-0.035
rated-0.035
 funn-0.034
 ju-0.034
Token Tim
Token ID5045
Feature activation-3.19e-03

Pos logit contributions
 fart+0.003
 ju+0.002
 lo+0.002
rated+0.002
 funn+0.002
Neg logit contributions
 arrest-0.002
 lawyer-0.002
 school-0.002
 elev-0.002
 studying-0.002
Tokenmy
Token ID1820
Feature activation-9.58e-03

Pos logit contributions
 Tom+0.005
 arrest+0.005
 school+0.005
 lawyer+0.005
 that+0.005
Neg logit contributions
 fart-0.005
 ju-0.005
rated-0.005
 funn-0.005
 lo-0.005
Token didn
Token ID1422
Feature activation-1.36e-02

Pos logit contributions
 fart+0.024
 ju+0.022
rated+0.022
 funn+0.021
 lo+0.021
Neg logit contributions
 Tom-0.014
 that-0.013
 school-0.013
 lawyer-0.013
 arrest-0.013
Token't
Token ID470
Feature activation-1.59e-02

Pos logit contributions
 helpers+0.015
aren+0.015
 arrest+0.015
 recess+0.014
 hours+0.014
Neg logit contributions
 fart-0.027
 lo-0.023
rated-0.022
 funn-0.022
 ju-0.021
Token know
Token ID760
Feature activation-1.34e-02

Pos logit contributions
 arrest+0.027
 elev+0.025
 Tom+0.025
 lawyer+0.024
 church+0.024
Neg logit contributions
 fart-0.025
rated-0.025
 funn-0.023
 lo-0.023
 ju-0.022
Token what
Token ID644
Feature activation-1.19e-02

Pos logit contributions
 arrest+0.019
 recess+0.018
 lawyer+0.018
 church+0.016
 school+0.016
Neg logit contributions
 lo-0.022
 fart-0.022
 ju-0.022
nam-0.021
 funn-0.021
Token "
Token ID366
Feature activation+2.42e-03

Pos logit contributions
 arrest+0.020
irst+0.015
aren+0.015
 recess+0.015
zy+0.014
Neg logit contributions
 ju-0.019
 funn-0.018
 fart-0.018
 lo-0.018
nam-0.018
Tokenorgan
Token ID9971
Feature activation-7.19e-02

Pos logit contributions
 fart+0.004
 ju+0.004
rated+0.004
 funn+0.004
 fins+0.004
Neg logit contributions
 lawyer-0.003
 ever-0.003
 Tom-0.003
 school-0.003
Tom-0.003
Tokenize
Token ID1096
Feature activation-3.33e-02

Pos logit contributions
 arrest+0.090
 Tom+0.090
 lawyer+0.084
 school+0.083
 Lee+0.083
Neg logit contributions
 fart-0.139
rated-0.129
 ju-0.128
 lo-0.124
 funn-0.124
Token"
Token ID1
Feature activation-7.20e-04

Pos logit contributions
 arrest+0.036
 Tom+0.035
 school+0.035
 elev+0.035
 lawyer+0.033
Neg logit contributions
 fart-0.068
rated-0.064
 ju-0.064
 lo-0.062
 funn-0.061
Token meant
Token ID4001
Feature activation-1.77e-02

Pos logit contributions
 Tom+0.001
 lawyer+0.001
 school+0.001
 that+0.001
 arrest+0.001
Neg logit contributions
 fart-0.002
rated-0.001
 ju-0.001
 lo-0.001
 funn-0.001
Token,
Token ID11
Feature activation+1.50e-02

Pos logit contributions
 arrest+0.024
 recess+0.024
irst+0.022
 zoo+0.022
aren+0.021
Neg logit contributions
 fart-0.030
 lo-0.028
rated-0.028
 ju-0.027
 wider-0.026
Token but
Token ID475
Feature activation-6.20e-03

Pos logit contributions
 fart+0.031
rated+0.030
 ju+0.029
 lo+0.028
 funn+0.028
Neg logit contributions
 Tom-0.017
 arrest-0.016
 that-0.016
 lawyer-0.016
 school-0.016
Token Lily
Token ID20037
Feature activation-8.21e-03

Pos logit contributions
 fart+0.006
 ju+0.006
 lo+0.006
 funn+0.006
 brightest+0.006
Neg logit contributions
 arrest-0.003
parent-0.002
 lawyer-0.002
issa-0.002
 elev-0.002
Token loved
Token ID6151
Feature activation-2.90e-02

Pos logit contributions
 arrest+0.009
 Tom+0.008
 elev+0.008
 lawyer+0.008
 church+0.008
Neg logit contributions
 fart-0.017
rated-0.016
 lo-0.016
 ju-0.016
 funn-0.016
Token to
Token ID284
Feature activation-2.92e-03

Pos logit contributions
 arrest+0.031
irst+0.029
issa+0.029
ibrarian+0.028
ily+0.028
Neg logit contributions
 fart-0.056
 ju-0.055
 funn-0.055
 lo-0.053
 prett-0.050
Token ride
Token ID6594
Feature activation-2.16e-02

Pos logit contributions
 arrest+0.004
 Tom+0.004
 lawyer+0.004
issa+0.003
 elev+0.003
Neg logit contributions
 fart-0.006
rated-0.005
 ju-0.005
 lo-0.005
 funn-0.005
Token roller
Token ID24471
Feature activation-3.84e-02

Pos logit contributions
 arrest+0.023
 lawyer+0.019
 Tom+0.018
azy+0.018
 studying+0.017
Neg logit contributions
 fart-0.048
 lo-0.045
 funn-0.045
 ju-0.044
rated-0.042
Tokenco
Token ID1073
Feature activation-7.15e-02

Pos logit contributions
 arrest+0.033
 lawyer+0.032
 Tom+0.028
issa+0.028
 church+0.028
Neg logit contributions
 fart-0.091
 lo-0.082
 funn-0.081
 ju-0.080
rated-0.078
Tokenasters
Token ID7060
Feature activation-1.79e-02

Pos logit contributions
 lawyer+0.087
 arrest+0.086
 helpers+0.083
 Lee+0.083
 Tom+0.082
Neg logit contributions
 fart-0.131
rated-0.117
 lo-0.116
 funn-0.111
 ju-0.110
Token and
Token ID290
Feature activation-8.18e-03

Pos logit contributions
 arrest+0.014
 lawyer+0.012
 Tom+0.012
 school+0.011
 elev+0.011
Neg logit contributions
 fart-0.044
 lo-0.041
 ju-0.040
 funn-0.040
rated-0.040
Token always
Token ID1464
Feature activation+3.24e-02

Pos logit contributions
 arrest+0.010
 Tom+0.009
 lawyer+0.009
 school+0.009
 elev+0.009
Neg logit contributions
 fart-0.017
 ju-0.015
rated-0.015
 lo-0.015
 funn-0.015
Token remembered
Token ID12086
Feature activation-1.02e-03

Pos logit contributions
 fart+0.059
rated+0.055
 lo+0.054
 ju+0.054
 funn+0.053
Neg logit contributions
 arrest-0.045
 Tom-0.044
 lawyer-0.043
 school-0.042
 elev-0.041
Token how
Token ID703
Feature activation+3.15e-03

Pos logit contributions
 arrest+0.001
 lawyer+0.001
 Tom+0.001
issa+0.001
 elev+0.001
Neg logit contributions
 fart-0.002
 ju-0.002
 lo-0.002
 funn-0.002
rated-0.002
Token.
Token ID13
Feature activation-2.02e-02

Pos logit contributions
 fart+0.022
rated+0.021
 lo+0.021
 funn+0.021
 ju+0.020
Neg logit contributions
 arrest-0.015
 elev-0.013
 lawyer-0.013
 school-0.012
issa-0.012
Token And
Token ID843
Feature activation-3.54e-03

Pos logit contributions
 arrest+0.028
 Tom+0.028
 lawyer+0.027
 school+0.027
 elev+0.026
Neg logit contributions
 fart-0.036
 ju-0.033
rated-0.033
 lo-0.032
 funn-0.032
Token we
Token ID356
Feature activation-1.14e-02

Pos logit contributions
 Tom+0.003
 that+0.003
 school+0.003
 arrest+0.003
 lawyer+0.003
Neg logit contributions
 fart-0.008
rated-0.008
 ju-0.008
 funn-0.008
 lo-0.008
Token love
Token ID1842
Feature activation-1.50e-02

Pos logit contributions
 arrest+0.011
 Tom+0.011
 lawyer+0.010
 school+0.010
 elev+0.010
Neg logit contributions
 fart-0.026
rated-0.024
 ju-0.024
 lo-0.024
 funn-0.024
Token Roc
Token ID13545
Feature activation-3.60e-02

Pos logit contributions
 arrest+0.018
 lawyer+0.017
azy+0.016
issa+0.016
 elev+0.016
Neg logit contributions
 fart-0.030
 funn-0.027
 ju-0.027
 lo-0.027
rated-0.025
Tokenco
Token ID1073
Feature activation-7.12e-02

Pos logit contributions
 arrest+0.037
 school+0.033
 she+0.032
 Tom+0.032
 choosing+0.030
Neg logit contributions
 fart-0.077
 funn-0.075
rated-0.074
ksh-0.071
 lo-0.071
Token.
Token ID13
Feature activation-1.58e-02

Pos logit contributions
 arrest+0.079
 lawyer+0.073
 elev+0.073
 school+0.071
 upstairs+0.070
Neg logit contributions
 fart-0.140
rated-0.134
 funn-0.129
 lo-0.128
 ju-0.126
Token He
Token ID679
Feature activation-9.87e-03

Pos logit contributions
 arrest+0.021
 lawyer+0.020
 school+0.020
 elev+0.019
 studying+0.019
Neg logit contributions
 fart-0.030
 ju-0.028
rated-0.027
 funn-0.027
 lo-0.027
Token is
Token ID318
Feature activation-1.44e-02

Pos logit contributions
 recess+0.015
 elev+0.015
 school+0.015
 arrest+0.015
 lawyer+0.014
Neg logit contributions
 fart-0.016
 funn-0.015
 lo-0.015
rated-0.015
 ju-0.015
Token our
Token ID674
Feature activation-1.83e-02

Pos logit contributions
 arrest+0.020
 lawyer+0.018
 elev+0.018
 school+0.018
 Tom+0.018
Neg logit contributions
 fart-0.027
 funn-0.025
 ju-0.025
 lo-0.024
rated-0.023
Token new
Token ID649
Feature activation-1.40e-02

Pos logit contributions
 arrest+0.023
 Tom+0.022
 elev+0.021
 lawyer+0.021
 studying+0.020
Neg logit contributions
 fart-0.036
 funn-0.033
 ju-0.033
 lo-0.033
rated-0.031
Token."
Token ID526
Feature activation-1.49e-03

Pos logit contributions
 fart+0.019
 lo+0.017
 funn+0.017
rated+0.017
 ju+0.017
Neg logit contributions
 arrest-0.012
 elev-0.011
 school-0.010
 lawyer-0.010
 studying-0.010
Token\n
Token ID198
Feature activation+2.24e-03

Pos logit contributions
 arrest+0.002
 lawyer+0.001
 elev+0.001
 school+0.001
 Tom+0.001
Neg logit contributions
 fart-0.003
 ju-0.003
 lo-0.003
rated-0.003
 funn-0.003
Token\n
Token ID198
Feature activation-7.08e-04

Pos logit contributions
 fart+0.004
 lo+0.004
rated+0.004
 ju+0.004
 funn+0.004
Neg logit contributions
 arrest-0.003
 lawyer-0.002
 Tom-0.002
 elev-0.002
 school-0.002
TokenR
Token ID49
Feature activation-1.28e-02

Pos logit contributions
 arrest+0.001
 Tom+0.001
 lawyer+0.001
 school+0.001
 elev+0.001
Neg logit contributions
 fart-0.001
 ju-0.001
 lo-0.001
rated-0.001
 funn-0.001
Tokenoc
Token ID420
Feature activation-1.78e-02

Pos logit contributions
 elev+0.021
 upstairs+0.020
 lawyer+0.020
 that+0.020
 church+0.020
Neg logit contributions
rated-0.018
 lo-0.018
 fart-0.018
mediately-0.017
die-0.017
Tokenco
Token ID1073
Feature activation-7.10e-02

Pos logit contributions
 school+0.021
 hours+0.019
 upstairs+0.019
 studying+0.019
 university+0.019
Neg logit contributions
 fart-0.032
rac-0.031
 funn-0.029
bid-0.029
 lo-0.028
Token nodded
Token ID14464
Feature activation-1.59e-03

Pos logit contributions
 arrest+0.109
 elev+0.104
 lawyer+0.103
 school+0.099
 upstairs+0.097
Neg logit contributions
rated-0.109
 fart-0.106
 lo-0.104
 funn-0.099
mediately-0.098
Token and
Token ID290
Feature activation+1.06e-02

Pos logit contributions
 Tom+0.001
 that+0.000
 school+0.000
 lawyer+0.000
 arrest+0.000
Neg logit contributions
 fart-0.005
 ju-0.004
rated-0.004
 funn-0.004
 lo-0.004
Token said
Token ID531
Feature activation-4.96e-03

Pos logit contributions
 fart+0.021
rated+0.020
 ju+0.019
 funn+0.018
 lo+0.018
Neg logit contributions
 Tom-0.015
 that-0.013
 arrest-0.013
 Ann-0.013
 school-0.013
Token,
Token ID11
Feature activation+1.37e-02

Pos logit contributions
 arrest+0.003
 Tom+0.003
 lawyer+0.002
 school+0.002
 elev+0.002
Neg logit contributions
 fart-0.013
 ju-0.012
rated-0.012
 lo-0.012
 funn-0.012
Token "
Token ID366
Feature activation+9.97e-03

Pos logit contributions
 fart+0.029
 ju+0.028
 lo+0.027
 funn+0.027
rated+0.027
Neg logit contributions
 arrest-0.015
 elev-0.013
 school-0.013
 lawyer-0.013
 Tom-0.013
Token and
Token ID290
Feature activation-5.96e-05

Pos logit contributions
 arrest+0.007
 elev+0.007
parent+0.006
issa+0.006
 Karen+0.006
Neg logit contributions
 fart-0.015
 lo-0.013
 ju-0.013
 funn-0.013
 tighter-0.012
Token not
Token ID407
Feature activation-5.95e-03

Pos logit contributions
 Tom+0.000
 arrest+0.000
 school+0.000
 lawyer+0.000
 that+0.000
Neg logit contributions
 fart-0.000
rated-0.000
 ju-0.000
 lo-0.000
 funn-0.000
Token look
Token ID804
Feature activation-6.46e-03

Pos logit contributions
 Tom+0.007
 arrest+0.006
 that+0.006
 school+0.006
 lawyer+0.006
Neg logit contributions
 fart-0.013
rated-0.012
 ju-0.012
 lo-0.012
 funn-0.012
Token for
Token ID329
Feature activation-2.12e-02

Pos logit contributions
 arrest+0.007
 Tom+0.007
 lawyer+0.006
 school+0.006
 elev+0.006
Neg logit contributions
 fart-0.014
rated-0.013
 ju-0.013
 lo-0.013
 funn-0.013
Token rac
Token ID3444
Feature activation-2.42e-02

Pos logit contributions
azy+0.027
issa+0.026
 lawyer+0.026
 arrest+0.025
parent+0.025
Neg logit contributions
 fart-0.039
 lo-0.038
 funn-0.037
nam-0.036
 ju-0.035
Tokenco
Token ID1073
Feature activation-7.07e-02

Pos logit contributions
 grandma+0.038
 Mike+0.036
 Tom+0.035
 church+0.035
 Karen+0.035
Neg logit contributions
 fart-0.043
 funn-0.036
rated-0.035
 lo-0.035
sc-0.034
Tokenons
Token ID684
Feature activation-6.81e-03

Pos logit contributions
 arrest+0.090
 lawyer+0.089
 Lee+0.086
 school+0.086
 upstairs+0.086
Neg logit contributions
 fart-0.120
rated-0.113
 lo-0.107
 funn-0.105
 ju-0.102
Token anymore
Token ID7471
Feature activation-2.97e-02

Pos logit contributions
 arrest+0.005
 elev+0.005
 school+0.005
 studying+0.005
 church+0.005
Neg logit contributions
 fart-0.016
 ju-0.015
 lo-0.015
 funn-0.015
rated-0.015
Token.
Token ID13
Feature activation+1.23e-03

Pos logit contributions
 arrest+0.031
issa+0.029
aren+0.028
 elev+0.028
 lawyer+0.027
Neg logit contributions
 fart-0.062
 lo-0.058
cr-0.056
 funn-0.055
rated-0.054
Token<|endoftext|>
Token ID50256
Feature activation-7.62e-03

Pos logit contributions
 fart+0.002
 ju+0.002
 lo+0.002
 funn+0.002
rated+0.002
Neg logit contributions
 arrest-0.002
 lawyer-0.001
 elev-0.001
issa-0.001
 studying-0.001
TokenL
Token ID43
Feature activation-2.42e-02

Pos logit contributions
 arrest+0.011
 lawyer+0.010
 elev+0.010
parent+0.010
 studying+0.010
Neg logit contributions
 fart-0.014
 lo-0.013
 ju-0.012
 funn-0.011
cr-0.011
Token lesson
Token ID11483
Feature activation-2.87e-03

Pos logit contributions
 fart+0.001
rated+0.001
 ju+0.001
 lo+0.001
 funn+0.001
Neg logit contributions
 arrest-0.001
 Tom-0.001
 lawyer-0.001
 school-0.001
 that-0.001
Token and
Token ID290
Feature activation+1.39e-03

Pos logit contributions
 church+0.004
 arrest+0.004
 elev+0.004
ait+0.004
 recess+0.003
Neg logit contributions
 tighter-0.005
 fart-0.005
rated-0.005
 wider-0.005
 lo-0.005
Token never
Token ID1239
Feature activation+4.33e-03

Pos logit contributions
 fart+0.003
 lo+0.003
 funn+0.003
rated+0.003
 ju+0.003
Neg logit contributions
 arrest-0.002
 lawyer-0.002
 Tom-0.002
 elev-0.002
 school-0.001
Token pretended
Token ID36766
Feature activation-2.52e-02

Pos logit contributions
 fart+0.009
 ju+0.009
rated+0.008
 brightest+0.008
 redd+0.008
Neg logit contributions
 Tom-0.005
 that-0.005
 arrest-0.005
 school-0.005
 playing-0.004
Token with
Token ID351
Feature activation-1.66e-02

Pos logit contributions
 arrest+0.030
issa+0.029
innie+0.029
ait+0.028
azy+0.028
Neg logit contributions
 fart-0.049
 lo-0.044
 funn-0.044
 ju-0.043
rated-0.041
Token tr
Token ID491
Feature activation-7.00e-02

Pos logit contributions
 arrest+0.019
 lawyer+0.019
azy+0.018
issa+0.018
aren+0.018
Neg logit contributions
 fart-0.033
 funn-0.030
 lo-0.030
 ju-0.030
nam-0.029
Tokenays
Token ID592
Feature activation-1.10e-02

Pos logit contributions
 lawyer+0.109
 Tom+0.104
 nurse+0.104
 worker+0.102
 veterinarian+0.102
Neg logit contributions
 fart-0.110
rac-0.093
ifying-0.091
cr-0.088
 afloat-0.087
Token again
Token ID757
Feature activation-1.80e-02

Pos logit contributions
 arrest+0.011
 lawyer+0.010
 Tom+0.010
issa+0.010
 church+0.009
Neg logit contributions
 fart-0.025
 lo-0.023
 ju-0.022
 funn-0.022
rated-0.022
Token.
Token ID13
Feature activation+7.59e-03

Pos logit contributions
 arrest+0.015
 lawyer+0.013
 elev+0.013
issa+0.013
 church+0.012
Neg logit contributions
 fart-0.042
 lo-0.040
rated-0.039
 ju-0.039
 funn-0.039
Token<|endoftext|>
Token ID50256
Feature activation-6.10e-03

Pos logit contributions
 fart+0.015
 ju+0.014
 funn+0.014
 lo+0.014
rated+0.014
Neg logit contributions
 arrest-0.009
 lawyer-0.008
 elev-0.008
 school-0.008
 church-0.008
TokenBilly
Token ID49640
Feature activation-1.26e-02

Pos logit contributions
 arrest+0.009
 lawyer+0.008
 elev+0.008
 studying+0.008
parent+0.008
Neg logit contributions
 fart-0.011
 lo-0.010
 ju-0.009
 funn-0.009
rated-0.009
Token spotted
Token ID13489
Feature activation-1.24e-02

Pos logit contributions
 arrest+0.010
 Tom+0.010
 lawyer+0.010
 school+0.010
 elev+0.010
Neg logit contributions
 fart-0.015
rated-0.014
 ju-0.013
 lo-0.013
 funn-0.013
Token a
Token ID257
Feature activation-9.70e-03

Pos logit contributions
 arrest+0.016
 lawyer+0.014
 elev+0.014
 Tom+0.014
azy+0.013
Neg logit contributions
 fart-0.024
 lo-0.022
 ju-0.022
 funn-0.022
rated-0.021
Token row
Token ID5752
Feature activation+3.94e-03

Pos logit contributions
 lawyer+0.013
 school+0.012
 Tom+0.012
 church+0.012
 arrest+0.012
Neg logit contributions
 fart-0.018
rated-0.018
 ju-0.017
 funn-0.017
 lo-0.016
Token of
Token ID286
Feature activation-2.03e-03

Pos logit contributions
 fart+0.009
rated+0.008
 ju+0.008
 funn+0.008
 lo+0.008
Neg logit contributions
 arrest-0.004
 Tom-0.004
 lawyer-0.003
 school-0.003
 that-0.003
Token books
Token ID3835
Feature activation-3.41e-02

Pos logit contributions
 Tom+0.003
 arrest+0.003
 lawyer+0.002
 school+0.002
 elev+0.002
Neg logit contributions
 fart-0.004
rated-0.004
 ju-0.004
 lo-0.004
 funn-0.004
Tokenhel
Token ID2978
Feature activation-6.98e-02

Pos logit contributions
 arrest+0.039
 lawyer+0.034
 elev+0.034
 Tom+0.034
parent+0.033
Neg logit contributions
 fart-0.072
 lo-0.068
rated-0.067
 ju-0.066
 funn-0.064
Tokenves
Token ID1158
Feature activation-1.47e-02

Pos logit contributions
 Tom+0.099
 lawyer+0.095
 Lee+0.095
 arrest+0.093
 school+0.090
Neg logit contributions
 fart-0.119
rated-0.118
 lo-0.107
 ju-0.098
 funn-0.098
Token in
Token ID287
Feature activation+5.85e-03

Pos logit contributions
 arrest+0.012
 lawyer+0.011
 Tom+0.011
 elev+0.011
 school+0.011
Neg logit contributions
 fart-0.035
 lo-0.032
rated-0.032
 ju-0.032
 funn-0.032
Token her
Token ID607
Feature activation-1.21e-02

Pos logit contributions
 fart+0.013
 lo+0.012
 ju+0.011
 funn+0.011
rated+0.011
Neg logit contributions
 arrest-0.006
 lawyer-0.006
azy-0.006
 studying-0.005
issa-0.005
Token grandparents
Token ID28571
Feature activation+6.46e-03

Pos logit contributions
 arrest+0.014
 Tom+0.014
 lawyer+0.013
 elev+0.013
 school+0.013
Neg logit contributions
 fart-0.024
 ju-0.022
 lo-0.022
rated-0.022
 funn-0.022
Token'
Token ID6
Feature activation+8.70e-03

Pos logit contributions
 fart+0.010
 lo+0.010
 ju+0.009
 funn+0.009
rated+0.009
Neg logit contributions
 arrest-0.010
 elev-0.009
 lawyer-0.009
 Tom-0.009
azy-0.009
Token friends
Token ID2460
Feature activation+2.12e-02

Pos logit contributions
 arrest+0.017
 Tom+0.017
 elev+0.016
 lawyer+0.016
 school+0.016
Neg logit contributions
 fart-0.025
 funn-0.022
 lo-0.022
 ju-0.022
rated-0.022
Token.
Token ID13
Feature activation-1.02e-02

Pos logit contributions
 fart+0.047
 ju+0.043
 lo+0.043
rated+0.043
 funn+0.042
Neg logit contributions
 arrest-0.022
 Tom-0.020
 elev-0.020
 lawyer-0.019
 school-0.019
Token Bye
Token ID47843
Feature activation+1.20e-02

Pos logit contributions
 Tom+0.013
 arrest+0.013
 lawyer+0.012
 school+0.012
 that+0.012
Neg logit contributions
 fart-0.020
rated-0.018
 ju-0.018
 lo-0.018
 funn-0.018
Token,
Token ID11
Feature activation+9.25e-03

Pos logit contributions
 fart+0.025
rated+0.023
 ju+0.022
 lo+0.022
 funn+0.021
Neg logit contributions
 elev-0.014
 arrest-0.013
azy-0.013
issa-0.013
 lawyer-0.012
Token Roc
Token ID13545
Feature activation-2.55e-02

Pos logit contributions
 fart+0.019
 ju+0.017
 lo+0.017
 funn+0.017
rated+0.017
Neg logit contributions
 arrest-0.011
 elev-0.010
 Tom-0.010
 lawyer-0.010
 school-0.010
Tokenco
Token ID1073
Feature activation-6.89e-02

Pos logit contributions
 arrest+0.035
 school+0.031
 she+0.030
 recess+0.030
 studying+0.030
Neg logit contributions
rated-0.048
 fart-0.047
 funn-0.044
ksh-0.043
 lo-0.043
Token.
Token ID13
Feature activation-3.15e-03

Pos logit contributions
 elev+0.069
 arrest+0.063
 upstairs+0.060
 lawyer+0.060
 Olivia+0.059
Neg logit contributions
 fart-0.138
rated-0.137
 lo-0.127
 funn-0.123
rac-0.120
Token Bye
Token ID47843
Feature activation+1.13e-02

Pos logit contributions
 Tom+0.004
 arrest+0.004
 lawyer+0.004
 school+0.004
 that+0.004
Neg logit contributions
 fart-0.006
rated-0.006
 ju-0.006
 lo-0.006
 funn-0.006
Token,
Token ID11
Feature activation+8.29e-03

Pos logit contributions
 fart+0.021
rated+0.020
 ju+0.019
 lo+0.019
 funn+0.018
Neg logit contributions
 elev-0.015
 arrest-0.014
issa-0.014
azy-0.014
 lawyer-0.014
Token man
Token ID582
Feature activation-6.99e-04

Pos logit contributions
 fart+0.017
 ju+0.015
 lo+0.015
 funn+0.015
rated+0.015
Neg logit contributions
 arrest-0.010
 lawyer-0.009
 elev-0.009
 studying-0.009
 school-0.009
Token."
Token ID526
Feature activation-6.04e-03

Pos logit contributions
 elev+0.001
 school+0.001
 Karen+0.001
innie+0.001
issa+0.001
Neg logit contributions
 fart-0.001
rated-0.001
 lo-0.001
 ju-0.001
 funn-0.001
Token Sue
Token ID26089
Feature activation-1.54e-02

Pos logit contributions
 arrest+0.009
 elev+0.008
 lawyer+0.007
azy+0.007
 church+0.007
Neg logit contributions
 fart-0.016
 lo-0.015
rated-0.015
 funn-0.015
 ju-0.015
Token,
Token ID11
Feature activation+8.72e-04

Pos logit contributions
 arrest+0.017
 elev+0.017
 church+0.016
 recess+0.015
 school+0.015
Neg logit contributions
 fart-0.031
rated-0.031
 lo-0.030
 ju-0.029
 funn-0.028
Token help
Token ID1037
Feature activation+8.62e-03

Pos logit contributions
 fart+0.002
 ju+0.002
 lo+0.001
 funn+0.001
rated+0.001
Neg logit contributions
 arrest-0.001
 lawyer-0.001
 school-0.001
 elev-0.001
issa-0.001
Token me
Token ID502
Feature activation+7.15e-04

Pos logit contributions
 fart+0.017
 lo+0.016
rated+0.015
 funn+0.015
 ju+0.015
Neg logit contributions
 arrest-0.011
 lawyer-0.010
 recess-0.009
 Lee-0.009
 studying-0.009
Token un
Token ID555
Feature activation-2.94e-03

Pos logit contributions
 fart+0.001
 lo+0.001
rated+0.001
 ju+0.001
 funn+0.001
Neg logit contributions
 Tom-0.001
 lawyer-0.001
 arrest-0.001
issa-0.001
parent-0.001
Tokenplug
Token ID16875
Feature activation-6.84e-02

Pos logit contributions
 Tom+0.004
 arrest+0.004
 lawyer+0.003
 school+0.003
 that+0.003
Neg logit contributions
 fart-0.006
rated-0.005
 lo-0.005
 ju-0.005
 funn-0.005
Token the
Token ID262
Feature activation+3.31e-03

Pos logit contributions
 lawyer+0.078
 dentist+0.067
 meeting+0.067
 clim+0.066
 elev+0.065
Neg logit contributions
 fart-0.130
 lo-0.113
rated-0.113
rac-0.109
 afloat-0.104
Token alarm
Token ID10436
Feature activation-1.98e-05

Pos logit contributions
 fart+0.007
rated+0.006
 ju+0.006
 funn+0.006
 lo+0.006
Neg logit contributions
 arrest-0.004
 Tom-0.004
 lawyer-0.004
 school-0.004
 church-0.004
Token and
Token ID290
Feature activation+7.61e-04

Pos logit contributions
 arrest+0.000
 Tom+0.000
 lawyer+0.000
 school+0.000
 elev+0.000
Neg logit contributions
 fart-0.000
rated-0.000
 ju-0.000
 lo-0.000
 funn-0.000
Token then
Token ID788
Feature activation+2.11e-02

Pos logit contributions
 fart+0.002
 ju+0.001
 lo+0.001
rated+0.001
 funn+0.001
Neg logit contributions
 arrest-0.001
 Tom-0.001
 lawyer-0.001
 elev-0.001
 school-0.001
Token we
Token ID356
Feature activation+9.93e-04

Pos logit contributions
 fart+0.041
 lo+0.038
 ju+0.037
rated+0.037
 funn+0.036
Neg logit contributions
 arrest-0.026
 lawyer-0.025
 elev-0.024
azy-0.024
parent-0.024
Token Mom
Token ID11254
Feature activation+1.81e-02

Pos logit contributions
 arrest+0.001
 lawyer+0.001
 elev+0.001
 school+0.001
 studying+0.001
Neg logit contributions
 fart-0.002
 ju-0.001
 lo-0.001
 funn-0.001
rated-0.001
Token.
Token ID13
Feature activation-1.45e-02

Pos logit contributions
rated+0.037
 fart+0.037
 lo+0.036
 funn+0.036
 ju+0.035
Neg logit contributions
 arrest-0.020
 elev-0.018
 recess-0.018
 lawyer-0.018
 school-0.017
Token I
Token ID314
Feature activation-2.24e-02

Pos logit contributions
 arrest+0.016
 Tom+0.015
 lawyer+0.015
 school+0.015
 elev+0.015
Neg logit contributions
 fart-0.030
 ju-0.028
rated-0.028
 lo-0.027
 funn-0.027
Token wanted
Token ID2227
Feature activation-1.80e-02

Pos logit contributions
 arrest+0.031
 Tom+0.031
 lawyer+0.030
 elev+0.030
 school+0.030
Neg logit contributions
 fart-0.040
rated-0.038
 lo-0.037
 ju-0.037
 funn-0.037
Token more
Token ID517
Feature activation-1.00e-02

Pos logit contributions
 arrest+0.019
azy+0.019
 lawyer+0.018
issa+0.018
parent+0.017
Neg logit contributions
 fart-0.038
 funn-0.034
 ju-0.034
 lo-0.034
 brightest-0.031
Token h
Token ID289
Feature activation-6.75e-02

Pos logit contributions
 arrest+0.012
 lawyer+0.011
 Tom+0.011
azy+0.011
issa+0.011
Neg logit contributions
 fart-0.020
 funn-0.019
 lo-0.019
 ju-0.018
rated-0.018
Tokenangers
Token ID6606
Feature activation-3.93e-02

Pos logit contributions
 Tom+0.102
 John+0.102
 lawyer+0.096
 Karen+0.094
parent+0.094
Neg logit contributions
 fart-0.099
rac-0.091
 lo-0.086
immer-0.085
 prett-0.082
Token for
Token ID329
Feature activation-3.31e-02

Pos logit contributions
 arrest+0.037
 lawyer+0.034
 elev+0.033
 Tom+0.033
issa+0.033
Neg logit contributions
 fart-0.091
 ju-0.083
 lo-0.083
 funn-0.082
rated-0.081
Token my
Token ID616
Feature activation-1.91e-02

Pos logit contributions
aren+0.050
 lawyer+0.049
 arrest+0.046
 worker+0.046
azy+0.046
Neg logit contributions
 fart-0.053
nam-0.050
 lo-0.047
 funn-0.044
 prett-0.043
Token dolls
Token ID36062
Feature activation-1.08e-02

Pos logit contributions
 arrest+0.025
 Tom+0.024
azy+0.024
Mike+0.024
 Karen+0.023
Neg logit contributions
 fart-0.036
 lo-0.033
 funn-0.032
 prett-0.032
 ju-0.031
Token.
Token ID13
Feature activation-1.13e-02

Pos logit contributions
 arrest+0.008
 elev+0.007
 Tom+0.007
 lawyer+0.007
azy+0.007
Neg logit contributions
 fart-0.027
 ju-0.025
 lo-0.025
rated-0.024
 funn-0.024
Token was
Token ID373
Feature activation-9.23e-03

Pos logit contributions
 arrest+0.014
 Tom+0.013
 lawyer+0.013
 school+0.013
 elev+0.013
Neg logit contributions
 fart-0.018
 lo-0.017
 ju-0.017
 funn-0.017
rated-0.016
Token not
Token ID407
Feature activation-2.27e-02

Pos logit contributions
 arrest+0.014
 lawyer+0.013
 Tom+0.013
 school+0.013
 elev+0.013
Neg logit contributions
 fart-0.016
 lo-0.014
 funn-0.014
 ju-0.014
rated-0.014
Token happy
Token ID3772
Feature activation-1.91e-03

Pos logit contributions
 arrest+0.041
parent+0.038
 lawyer+0.038
 elev+0.037
 Tom+0.037
Neg logit contributions
 fart-0.033
 lo-0.030
 funn-0.030
 ju-0.029
rated-0.028
Token and
Token ID290
Feature activation-1.28e-02

Pos logit contributions
issa+0.002
innie+0.002
aren+0.002
azy+0.002
 arrest+0.002
Neg logit contributions
 fart-0.004
 lo-0.004
 funn-0.004
rated-0.003
 ju-0.003
Token un
Token ID555
Feature activation-8.80e-03

Pos logit contributions
 arrest+0.021
 lawyer+0.020
 Tom+0.020
 school+0.020
 elev+0.020
Neg logit contributions
 fart-0.020
 lo-0.018
 ju-0.018
rated-0.018
 funn-0.018
Tokenplug
Token ID16875
Feature activation-6.75e-02

Pos logit contributions
 Tom+0.014
 arrest+0.014
 school+0.013
 lawyer+0.013
 church+0.013
Neg logit contributions
 fart-0.015
rated-0.013
 lo-0.013
 funn-0.013
 ju-0.013
Tokenged
Token ID2004
Feature activation-2.32e-02

Pos logit contributions
 lawyer+0.131
 university+0.125
 dentist+0.122
 church+0.122
 veterinarian+0.122
Neg logit contributions
 fart-0.073
rated-0.069
bery-0.067
 lo-0.067
 afloat-0.066
Token the
Token ID262
Feature activation+9.29e-03

Pos logit contributions
 arrest+0.025
 lawyer+0.024
parent+0.023
 event+0.022
 school+0.022
Neg logit contributions
 fart-0.049
 lo-0.046
 funn-0.044
 tighter-0.044
 ju-0.043
Token iron
Token ID6953
Feature activation+6.37e-03

Pos logit contributions
 fart+0.018
 lo+0.016
rated+0.016
 funn+0.016
 ju+0.016
Neg logit contributions
 Tom-0.013
 arrest-0.012
azy-0.012
 Ann-0.012
 lawyer-0.011
Token.
Token ID13
Feature activation-9.88e-03

Pos logit contributions
 fart+0.016
 lo+0.015
 ju+0.015
rated+0.015
 funn+0.015
Neg logit contributions
 arrest-0.004
 Tom-0.004
 lawyer-0.004
 school-0.004
 elev-0.004
Token She
Token ID1375
Feature activation-1.27e-02

Pos logit contributions
 arrest+0.013
 elev+0.012
 lawyer+0.012
issa+0.011
 event+0.011
Neg logit contributions
 fart-0.018
 ju-0.017
 lo-0.017
 funn-0.016
 redd-0.015
Token followed
Token ID3940
Feature activation-1.19e-02

Pos logit contributions
 fart+0.025
 lo+0.023
rated+0.022
 ju+0.022
 funn+0.022
Neg logit contributions
 arrest-0.019
 elev-0.018
 lawyer-0.018
 school-0.018
 Tom-0.018
Token him
Token ID683
Feature activation-2.17e-03

Pos logit contributions
issa+0.009
 school+0.009
azy+0.008
 studying+0.008
 elev+0.008
Neg logit contributions
 fart-0.028
 lo-0.026
 ju-0.026
rated-0.026
 funn-0.025
Token and
Token ID290
Feature activation+1.82e-03

Pos logit contributions
 arrest+0.002
 lawyer+0.002
 Tom+0.002
issa+0.002
 elev+0.002
Neg logit contributions
 fart-0.005
rated-0.005
 lo-0.005
 ju-0.005
 funn-0.005
Token they
Token ID484
Feature activation-3.13e-03

Pos logit contributions
 fart+0.004
rated+0.004
 ju+0.004
 lo+0.004
 funn+0.004
Neg logit contributions
 arrest-0.002
 Tom-0.002
 lawyer-0.002
 school-0.002
 elev-0.002
Token all
Token ID477
Feature activation+3.93e-03

Pos logit contributions
 Tom+0.004
 arrest+0.004
 school+0.003
 lawyer+0.003
 that+0.003
Neg logit contributions
 fart-0.007
rated-0.006
 ju-0.006
 funn-0.006
 lo-0.006
Token h
Token ID289
Feature activation-6.73e-02

Pos logit contributions
 fart+0.008
 ju+0.007
rated+0.007
 funn+0.007
 lo+0.007
Neg logit contributions
 Tom-0.005
 that-0.005
 school-0.004
 Ann-0.004
 arrest-0.004
Tokenuddled
Token ID32745
Feature activation-1.95e-02

Pos logit contributions
 lawyer+0.139
 school+0.132
 church+0.130
 grandma+0.127
 university+0.124
Neg logit contributions
cr-0.070
ench-0.066
 fart-0.066
rac-0.064
immer-0.063
Token together
Token ID1978
Feature activation+2.09e-02

Pos logit contributions
 arrest+0.028
 lawyer+0.027
issa+0.026
azy+0.026
 school+0.026
Neg logit contributions
 fart-0.036
 lo-0.033
rated-0.032
 ju-0.032
 funn-0.032
Token,
Token ID11
Feature activation+2.72e-04

Pos logit contributions
 fart+0.049
 lo+0.045
 ju+0.045
 funn+0.044
rated+0.044
Neg logit contributions
 arrest-0.019
 lawyer-0.018
 elev-0.017
issa-0.017
 church-0.016
Token feeling
Token ID4203
Feature activation-1.88e-02

Pos logit contributions
 fart+0.001
 ju+0.001
 lo+0.001
 funn+0.001
rated+0.001
Neg logit contributions
 lawyer-0.000
 arrest-0.000
issa-0.000
 school-0.000
 Tom-0.000
Token safe
Token ID3338
Feature activation-2.10e-02

Pos logit contributions
 Tom+0.025
 arrest+0.025
 that+0.024
 school+0.024
 lawyer+0.024
Neg logit contributions
 fart-0.036
rated-0.033
 ju-0.032
 lo-0.032
 funn-0.031
Token."
Token ID526
Feature activation-1.83e-02

Pos logit contributions
 arrest+0.014
 lawyer+0.012
 Tom+0.012
 elev+0.012
 school+0.011
Neg logit contributions
 fart-0.041
rated-0.038
 lo-0.038
 ju-0.038
 funn-0.037
Token\n
Token ID198
Feature activation-3.79e-03

Pos logit contributions
 arrest+0.026
 lawyer+0.024
 elev+0.023
issa+0.023
azy+0.023
Neg logit contributions
 fart-0.032
 lo-0.031
 ju-0.030
 funn-0.029
 wider-0.028
Token\n
Token ID198
Feature activation-7.55e-03

Pos logit contributions
 arrest+0.005
 lawyer+0.004
 elev+0.004
 studying+0.004
issa+0.004
Neg logit contributions
 fart-0.007
 lo-0.007
rated-0.007
 ju-0.007
 funn-0.007
TokenThey
Token ID2990
Feature activation-1.59e-02

Pos logit contributions
 arrest+0.011
 Tom+0.011
 lawyer+0.010
 school+0.010
 elev+0.010
Neg logit contributions
 fart-0.014
 ju-0.012
rated-0.012
 lo-0.012
 funn-0.012
Token both
Token ID1111
Feature activation+8.37e-04

Pos logit contributions
 arrest+0.022
 Tom+0.021
 lawyer+0.021
 school+0.020
 elev+0.020
Neg logit contributions
 fart-0.029
rated-0.027
 ju-0.027
 lo-0.026
 funn-0.026
Token h
Token ID289
Feature activation-6.64e-02

Pos logit contributions
 fart+0.002
 ju+0.001
rated+0.001
 lo+0.001
 funn+0.001
Neg logit contributions
 Tom-0.001
 arrest-0.001
 school-0.001
 lawyer-0.001
 church-0.001
Tokenoot
Token ID1025
Feature activation-3.56e-02

Pos logit contributions
 lawyer+0.112
 school+0.109
parent+0.106
 Tom+0.104
 church+0.102
Neg logit contributions
rac-0.083
 fart-0.078
immer-0.075
 wider-0.073
 lo-0.073
Token,
Token ID11
Feature activation-1.82e-02

Pos logit contributions
 arrest+0.053
 Tom+0.052
 lawyer+0.051
 school+0.051
 elev+0.049
Neg logit contributions
 fart-0.061
rated-0.056
 ju-0.055
 lo-0.055
 funn-0.054
Token h
Token ID289
Feature activation-5.86e-02

Pos logit contributions
 Tom+0.016
 arrest+0.016
 lawyer+0.015
 school+0.015
 that+0.015
Neg logit contributions
 fart-0.042
rated-0.039
 ju-0.039
 lo-0.039
 funn-0.039
Tokenoot
Token ID1025
Feature activation-3.25e-02

Pos logit contributions
 lawyer+0.091
parent+0.090
 school+0.088
 Tom+0.083
 event+0.081
Neg logit contributions
rac-0.082
 fart-0.076
bid-0.072
itated-0.071
immer-0.071
Token,
Token ID11
Feature activation-1.23e-02

Pos logit contributions
 lawyer+0.041
 arrest+0.040
 school+0.039
 elev+0.038
 Tom+0.038
Neg logit contributions
 fart-0.062
 lo-0.057
rated-0.057
 funn-0.055
 ju-0.054
Token �
Token ID6184
Feature activation-1.12e-02

Pos logit contributions
 fart+0.007
 ju+0.007
rated+0.006
 funn+0.006
 lo+0.006
Neg logit contributions
 Tom-0.006
 arrest-0.006
 lawyer-0.006
 that-0.006
 school-0.006
Token�
Token ID95
Feature activation+8.25e-03

Pos logit contributions
 Tom+0.017
 Lee+0.017
 lawyer+0.016
 school+0.016
 Ann+0.016
Neg logit contributions
 fart-0.018
rated-0.017
 ju-0.017
 lo-0.016
rac-0.015
Token€
Token ID26391
Feature activation-1.19e-02

Pos logit contributions
 fart+0.013
rated+0.012
 ju+0.012
 funn+0.011
 lo+0.011
Neg logit contributions
 Tom-0.014
 school-0.013
 lawyer-0.013
 Lee-0.013
 arrest-0.013
Token�
Token ID129
Feature activation-1.31e-02

Pos logit contributions
 school+0.014
 lawyer+0.014
 Tom+0.013
 arrest+0.013
 elev+0.013
Neg logit contributions
 fart-0.024
 lo-0.022
 ju-0.022
rated-0.022
 funn-0.020
Token�
Token ID241
Feature activation-9.12e-04

Pos logit contributions
 lawyer+0.018
 Tom+0.018
 arrest+0.018
 school+0.018
 Lee+0.017
Neg logit contributions
 fart-0.023
rated-0.022
 ju-0.021
 funn-0.020
 lo-0.020
Tokenco
Token ID1073
Feature activation-6.64e-02

Pos logit contributions
 arrest+0.001
 lawyer+0.001
 school+0.001
 elev+0.001
 Tom+0.001
Neg logit contributions
 fart-0.002
 lo-0.001
rated-0.001
 ju-0.001
 funn-0.001
Tokeno
Token ID78
Feature activation+2.10e-02

Pos logit contributions
 lawyer+0.086
 arrest+0.085
 Tom+0.084
 elev+0.084
 Lee+0.083
Neg logit contributions
 fart-0.119
rated-0.115
 lo-0.110
 ju-0.104
 funn-0.104
Tokenâ
Token ID22940
Feature activation+3.07e-02

Pos logit contributions
 fart+0.051
 ju+0.048
rated+0.047
 funn+0.047
 lo+0.047
Neg logit contributions
 Tom-0.016
 arrest-0.016
 that-0.015
 lawyer-0.015
 school-0.015
Token€
Token ID26391
Feature activation-7.16e-03

Pos logit contributions
 fart+0.051
rated+0.047
 ju+0.046
 lo+0.045
 funn+0.045
Neg logit contributions
 arrest-0.048
 Tom-0.047
 lawyer-0.046
 school-0.046
 elev-0.045
Token,
Token ID11
Feature activation+8.82e-03

Pos logit contributions
 school+0.008
 lawyer+0.008
 arrest+0.008
 elev+0.008
 Tom+0.008
Neg logit contributions
 fart-0.014
 ju-0.013
 lo-0.013
rated-0.013
 funn-0.012
Token it
Token ID340
Feature activation-3.72e-03

Pos logit contributions
 fart+0.020
rated+0.019
 ju+0.019
 lo+0.019
 funn+0.019
Neg logit contributions
 arrest-0.008
 Tom-0.008
 lawyer-0.007
 school-0.007
 elev-0.007
Token.
Token ID13
Feature activation+2.47e-03

Pos logit contributions
 fart+0.021
 lo+0.020
rated+0.019
 ju+0.019
 funn+0.019
Neg logit contributions
 arrest-0.009
 lawyer-0.009
 elev-0.008
irst-0.008
issa-0.008
Token<|endoftext|>
Token ID50256
Feature activation-5.94e-03

Pos logit contributions
 fart+0.005
 ju+0.004
 lo+0.004
 funn+0.004
rated+0.004
Neg logit contributions
 arrest-0.003
 elev-0.003
 lawyer-0.003
 school-0.003
 church-0.003
TokenJimmy
Token ID40335
Feature activation+6.06e-03

Pos logit contributions
 arrest+0.009
 lawyer+0.008
 elev+0.008
 studying+0.007
parent+0.007
Neg logit contributions
 fart-0.011
 lo-0.010
 ju-0.010
rated-0.009
 funn-0.009
Token was
Token ID373
Feature activation-1.21e-02

Pos logit contributions
 fart+0.011
rated+0.011
 lo+0.010
 ju+0.010
 funn+0.010
Neg logit contributions
 elev-0.008
 arrest-0.008
 lawyer-0.008
 recess-0.007
 school-0.007
Token 3
Token ID513
Feature activation-1.78e-03

Pos logit contributions
 arrest+0.011
 Tom+0.010
 lawyer+0.010
 school+0.010
 elev+0.009
Neg logit contributions
 fart-0.028
rated-0.026
 ju-0.026
 lo-0.026
 funn-0.026
Token-
Token ID12
Feature activation-6.58e-02

Pos logit contributions
 arrest+0.002
 Tom+0.002
 recess+0.002
 elev+0.002
 Lee+0.002
Neg logit contributions
 fart-0.003
rated-0.003
 lo-0.003
 funn-0.003
 ju-0.003
Tokenyears
Token ID19002
Feature activation-3.17e-02

Pos logit contributions
 Tom+0.093
 arrest+0.093
 school+0.089
 lawyer+0.088
 church+0.087
Neg logit contributions
 fart-0.115
 ju-0.106
rated-0.105
 lo-0.105
 funn-0.104
Token-
Token ID12
Feature activation-6.50e-02

Pos logit contributions
 Tom+0.026
 arrest+0.025
 Lee+0.024
 elev+0.024
 school+0.023
Neg logit contributions
 fart-0.072
rated-0.069
 ju-0.068
 funn-0.068
 lo-0.067
Tokenold
Token ID727
Feature activation-1.16e-02

Pos logit contributions
 Tom+0.053
 arrest+0.052
 Lee+0.050
 school+0.049
 church+0.048
Neg logit contributions
 fart-0.143
 ju-0.141
 funn-0.138
 lo-0.138
rated-0.135
Token and
Token ID290
Feature activation-1.37e-02

Pos logit contributions
 elev+0.010
 arrest+0.010
azy+0.010
 recess+0.009
 church+0.009
Neg logit contributions
 fart-0.025
 lo-0.024
 ju-0.024
 funn-0.023
rated-0.023
Token he
Token ID339
Feature activation-4.09e-03

Pos logit contributions
 arrest+0.015
 lawyer+0.014
 elev+0.014
 Tom+0.013
azy+0.013
Neg logit contributions
 fart-0.028
 lo-0.027
rated-0.027
 ju-0.027
 funn-0.027
Token.
Token ID13
Feature activation+2.50e-04

Pos logit contributions
 fart+0.007
rated+0.007
 lo+0.006
 ju+0.006
 funn+0.006
Neg logit contributions
 arrest-0.003
 lawyer-0.003
issa-0.003
parent-0.003
azy-0.003
Token Everyone
Token ID11075
Feature activation-1.48e-02

Pos logit contributions
 fart+0.001
 ju+0.000
 lo+0.000
rated+0.000
 funn+0.000
Neg logit contributions
 arrest-0.000
 lawyer-0.000
 school-0.000
 elev-0.000
 Tom-0.000
Token ceased
Token ID24368
Feature activation-3.24e-02

Pos logit contributions
 arrest+0.017
 Tom+0.017
 lawyer+0.016
 school+0.016
 elev+0.015
Neg logit contributions
 fart-0.031
rated-0.029
 lo-0.028
 ju-0.028
 funn-0.028
Token working
Token ID1762
Feature activation+1.38e-02

Pos logit contributions
 arrest+0.023
 lawyer+0.023
 Tom+0.022
issa+0.020
 school+0.020
Neg logit contributions
 fart-0.079
rated-0.074
 lo-0.074
 funn-0.073
 ju-0.072
Token and
Token ID290
Feature activation-1.55e-02

Pos logit contributions
 fart+0.026
 lo+0.024
rated+0.024
 funn+0.024
 ju+0.024
Neg logit contributions
 arrest-0.019
 Tom-0.018
 lawyer-0.018
 school-0.017
 Lee-0.016
Token h
Token ID289
Feature activation-6.57e-02

Pos logit contributions
 Tom+0.018
 that+0.018
 school+0.018
 arrest+0.017
 building+0.017
Neg logit contributions
 fart-0.033
 ju-0.030
rated-0.030
 funn-0.029
 lo-0.028
Tokenuddled
Token ID32745
Feature activation-2.58e-02

Pos logit contributions
 lawyer+0.120
 school+0.114
 Tom+0.114
parent+0.112
 Lee+0.112
Neg logit contributions
 fart-0.081
rated-0.080
ench-0.075
rac-0.075
 lo-0.074
Token up
Token ID510
Feature activation-3.28e-02

Pos logit contributions
 arrest+0.031
 lawyer+0.030
 Tom+0.028
azy+0.028
 school+0.028
Neg logit contributions
 fart-0.053
rated-0.049
 lo-0.048
 ju-0.047
 funn-0.047
Token for
Token ID329
Feature activation-2.22e-02

Pos logit contributions
 arrest+0.030
 lawyer+0.029
 Tom+0.028
 school+0.026
issa+0.026
Neg logit contributions
 fart-0.076
 lo-0.070
rated-0.070
 ju-0.070
 funn-0.069
Token warmth
Token ID23125
Feature activation-6.78e-03

Pos logit contributions
 lawyer+0.027
aren+0.025
 Lee+0.025
issa+0.025
 arrest+0.024
Neg logit contributions
 fart-0.043
 lo-0.040
 funn-0.039
 ju-0.037
 tighter-0.037
Token.
Token ID13
Feature activation-4.43e-04

Pos logit contributions
 arrest+0.004
 lawyer+0.004
 Tom+0.004
 school+0.004
 elev+0.004
Neg logit contributions
 fart-0.017
rated-0.016
 lo-0.016
 ju-0.016
 funn-0.016
Token.
Token ID13
Feature activation-1.30e-02

Pos logit contributions
 fart+0.022
 lo+0.021
 funn+0.021
rated+0.021
 ju+0.019
Neg logit contributions
 arrest-0.011
 elev-0.011
 lawyer-0.010
 recess-0.010
 church-0.009
Token But
Token ID887
Feature activation-2.80e-02

Pos logit contributions
 arrest+0.020
 lawyer+0.019
 school+0.019
parent+0.018
 studying+0.018
Neg logit contributions
 fart-0.020
 ju-0.018
 funn-0.018
 lo-0.017
 brightest-0.017
Token those
Token ID883
Feature activation-1.82e-02

Pos logit contributions
 arrest+0.034
 lawyer+0.031
parent+0.030
 Tom+0.029
 elev+0.028
Neg logit contributions
 fart-0.058
 lo-0.054
 funn-0.053
 ju-0.052
rated-0.051
Token are
Token ID389
Feature activation-5.51e-05

Pos logit contributions
 arrest+0.034
 lawyer+0.032
parent+0.030
 Karen+0.029
itor+0.029
Neg logit contributions
 fart-0.025
 funn-0.023
 lo-0.023
 prett-0.021
 ju-0.021
Token not
Token ID407
Feature activation-7.65e-03

Pos logit contributions
 arrest+0.000
 lawyer+0.000
parent+0.000
 school+0.000
 church+0.000
Neg logit contributions
 fart-0.000
 funn-0.000
 ju-0.000
 lo-0.000
 prett-0.000
Token tr
Token ID491
Feature activation-6.56e-02

Pos logit contributions
 arrest+0.010
 lawyer+0.009
 Tom+0.009
parent+0.009
 school+0.009
Neg logit contributions
 fart-0.015
 funn-0.014
 lo-0.014
 ju-0.013
rated-0.013
Tokenunks
Token ID14125
Feature activation-1.56e-02

Pos logit contributions
 she+0.112
 lawyer+0.111
 Lucy+0.111
 Tom+0.110
 Karen+0.109
Neg logit contributions
 fart-0.086
rac-0.076
ifying-0.072
ksh-0.072
c-0.067
Token.
Token ID13
Feature activation-1.12e-02

Pos logit contributions
 arrest+0.016
 lawyer+0.015
 Tom+0.015
 elev+0.015
issa+0.015
Neg logit contributions
 fart-0.034
rated-0.031
 lo-0.031
 funn-0.031
 ju-0.031
Token Those
Token ID5845
Feature activation-6.55e-03

Pos logit contributions
 arrest+0.014
 lawyer+0.014
 school+0.014
 Tom+0.014
 elev+0.013
Neg logit contributions
 fart-0.022
 ju-0.020
rated-0.019
 funn-0.019
 lo-0.019
Token are
Token ID389
Feature activation+6.86e-03

Pos logit contributions
 arrest+0.009
 Tom+0.009
 lawyer+0.009
 school+0.008
 elev+0.008
Neg logit contributions
 fart-0.012
 ju-0.011
rated-0.011
 lo-0.011
 funn-0.011
Token my
Token ID616
Feature activation-2.03e-02

Pos logit contributions
 fart+0.014
rated+0.013
 ju+0.013
 lo+0.013
 funn+0.013
Neg logit contributions
 arrest-0.008
 Tom-0.008
 lawyer-0.008
 school-0.008
 elev-0.007
Token.
Token ID13
Feature activation-1.47e-02

Pos logit contributions
 fart+0.005
rated+0.005
 funn+0.005
 lo+0.005
 ju+0.005
Neg logit contributions
 arrest-0.004
azy-0.003
 elev-0.003
idd-0.003
itt-0.003
Token She
Token ID1375
Feature activation-1.54e-02

Pos logit contributions
 arrest+0.015
 Tom+0.013
 lawyer+0.013
 school+0.013
azy+0.013
Neg logit contributions
 fart-0.032
rated-0.031
 ju-0.031
 lo-0.030
 funn-0.030
Token put
Token ID1234
Feature activation-1.44e-02

Pos logit contributions
 arrest+0.023
 elev+0.022
 lawyer+0.022
 Tom+0.022
 Lee+0.021
Neg logit contributions
 fart-0.026
rated-0.025
 lo-0.024
 ju-0.024
 funn-0.023
Token on
Token ID319
Feature activation+7.20e-03

Pos logit contributions
 arrest+0.018
 lawyer+0.017
 Tom+0.016
 school+0.016
 studying+0.016
Neg logit contributions
 fart-0.029
 ju-0.027
 lo-0.027
 funn-0.026
rated-0.026
Token her
Token ID607
Feature activation+3.99e-03

Pos logit contributions
 fart+0.016
 ju+0.015
 lo+0.015
 funn+0.014
nam+0.014
Neg logit contributions
 lawyer-0.007
 arrest-0.006
makers-0.006
azy-0.006
 Lee-0.006
Token best
Token ID1266
Feature activation+3.71e-02

Pos logit contributions
 fart+0.007
 ju+0.006
 lo+0.006
 funn+0.006
rated+0.006
Neg logit contributions
 arrest-0.006
 Tom-0.006
 lawyer-0.006
 elev-0.005
 school-0.005
Token dress
Token ID6576
Feature activation-8.31e-03

Pos logit contributions
 fart+0.071
rated+0.065
 ju+0.065
 lo+0.064
 funn+0.064
Neg logit contributions
 arrest-0.048
 Tom-0.048
 lawyer-0.046
 school-0.046
 that-0.044
Token and
Token ID290
Feature activation+4.42e-03

Pos logit contributions
 arrest+0.007
 elev+0.006
 Tom+0.006
 lawyer+0.006
azy+0.006
Neg logit contributions
 fart-0.020
 lo-0.019
 ju-0.019
rated-0.018
 funn-0.018
Token put
Token ID1234
Feature activation-8.83e-03

Pos logit contributions
 fart+0.008
rated+0.008
 ju+0.008
 lo+0.008
 funn+0.008
Neg logit contributions
 arrest-0.006
 Tom-0.006
 lawyer-0.006
 school-0.005
 elev-0.005
Token on
Token ID319
Feature activation+1.28e-02

Pos logit contributions
 arrest+0.012
 lawyer+0.011
 studying+0.010
azy+0.010
 elev+0.010
Neg logit contributions
 fart-0.017
 lo-0.016
 ju-0.016
 funn-0.016
 brightest-0.015
Token her
Token ID607
Feature activation+9.56e-03

Pos logit contributions
 fart+0.028
 ju+0.027
 lo+0.027
 funn+0.026
 brightest+0.025
Neg logit contributions
 lawyer-0.012
 arrest-0.012
azy-0.011
makers-0.011
 Lee-0.010
Token but
Token ID475
Feature activation-6.96e-03

Pos logit contributions
 fart+0.013
 ju+0.012
rated+0.012
 lo+0.012
 funn+0.012
Neg logit contributions
 arrest-0.005
 Tom-0.005
 lawyer-0.005
 school-0.005
 elev-0.005
Token then
Token ID788
Feature activation+1.80e-02

Pos logit contributions
 Tom+0.005
 arrest+0.005
 that+0.005
 school+0.005
 lawyer+0.004
Neg logit contributions
 fart-0.018
rated-0.017
 ju-0.016
 funn-0.016
 lo-0.016
Token the
Token ID262
Feature activation+5.67e-03

Pos logit contributions
 fart+0.038
 lo+0.036
 ju+0.036
 funn+0.035
rated+0.034
Neg logit contributions
 arrest-0.019
 lawyer-0.018
 elev-0.017
parent-0.017
azy-0.017
Token bicycle
Token ID17026
Feature activation-9.35e-03

Pos logit contributions
 fart+0.011
rated+0.010
 ju+0.010
 lo+0.010
 funn+0.010
Neg logit contributions
 arrest-0.007
 Tom-0.007
 lawyer-0.007
 school-0.007
 that-0.007
Token started
Token ID2067
Feature activation+6.42e-03

Pos logit contributions
 Tom+0.013
 arrest+0.013
 lawyer+0.012
 that+0.012
 school+0.012
Neg logit contributions
 fart-0.017
rated-0.016
 ju-0.016
 funn-0.015
 lo-0.015
Token making
Token ID1642
Feature activation+3.66e-02

Pos logit contributions
 fart+0.016
rated+0.015
 ju+0.015
 funn+0.015
 lo+0.015
Neg logit contributions
 Tom-0.005
 arrest-0.005
 school-0.004
 lawyer-0.004
 that-0.004
Token strange
Token ID6283
Feature activation+1.92e-02

Pos logit contributions
 fart+0.079
rated+0.073
 ju+0.072
 lo+0.071
 funn+0.071
Neg logit contributions
 arrest-0.039
 Tom-0.039
 lawyer-0.037
 school-0.037
 that-0.035
Token noises
Token ID26782
Feature activation-8.12e-03

Pos logit contributions
 fart+0.034
rated+0.031
 ju+0.031
 lo+0.030
 funn+0.030
Neg logit contributions
 Tom-0.028
 arrest-0.028
 school-0.027
 lawyer-0.027
 that-0.026
Token and
Token ID290
Feature activation+6.43e-03

Pos logit contributions
 arrest+0.006
issa+0.006
parent+0.006
 lawyer+0.006
innie+0.006
Neg logit contributions
 fart-0.020
 lo-0.019
 funn-0.018
rated-0.018
 ju-0.018
Token it
Token ID340
Feature activation+8.62e-03

Pos logit contributions
 fart+0.013
rated+0.013
 ju+0.012
 lo+0.012
 funn+0.012
Neg logit contributions
 Tom-0.008
 that-0.007
 arrest-0.007
 school-0.007
 building-0.007
Token stopped
Token ID5025
Feature activation-1.84e-02

Pos logit contributions
 fart+0.017
 ju+0.016
rated+0.016
nam+0.016
 redd+0.015
Neg logit contributions
 that-0.010
 Tom-0.010
 arrest-0.010
 school-0.009
 building-0.009
Token street
Token ID4675
Feature activation-1.60e-02

Pos logit contributions
 church+0.005
 lawyer+0.005
 school+0.005
 Tom+0.005
 that+0.004
Neg logit contributions
 fart-0.021
rated-0.021
 ju-0.020
cr-0.020
 lo-0.019
Token and
Token ID290
Feature activation+1.07e-02

Pos logit contributions
 arrest+0.013
 Tom+0.011
issa+0.011
 elev+0.011
azy+0.011
Neg logit contributions
 fart-0.038
 lo-0.036
 ju-0.036
 funn-0.036
rated-0.035
Token the
Token ID262
Feature activation+1.31e-03

Pos logit contributions
 fart+0.024
rated+0.023
 ju+0.022
 lo+0.022
 funn+0.022
Neg logit contributions
 Tom-0.011
 arrest-0.011
 school-0.010
 that-0.010
 lawyer-0.010
Token man
Token ID582
Feature activation+1.43e-02

Pos logit contributions
 fart+0.003
rated+0.003
 ju+0.002
 funn+0.002
 lo+0.002
Neg logit contributions
 lawyer-0.002
 arrest-0.001
 Tom-0.001
 school-0.001
 church-0.001
Token would
Token ID561
Feature activation+9.34e-03

Pos logit contributions
 fart+0.029
rated+0.026
 ju+0.026
 lo+0.026
 funn+0.026
Neg logit contributions
 Tom-0.017
 arrest-0.017
 lawyer-0.016
 school-0.016
 that-0.016
Token always
Token ID1464
Feature activation+3.97e-02

Pos logit contributions
 fart+0.017
 ju+0.016
 lo+0.016
rated+0.016
 funn+0.016
Neg logit contributions
 Tom-0.012
 arrest-0.012
 lawyer-0.012
 elev-0.012
issa-0.012
Token give
Token ID1577
Feature activation+2.29e-03

Pos logit contributions
 fart+0.079
rated+0.073
 ju+0.072
 funn+0.070
 lo+0.070
Neg logit contributions
 arrest-0.049
 Tom-0.049
 school-0.047
 lawyer-0.047
 that-0.046
Token the
Token ID262
Feature activation+6.43e-03

Pos logit contributions
 fart+0.006
rated+0.006
 ju+0.006
 funn+0.006
 lo+0.005
Neg logit contributions
 Tom-0.002
 arrest-0.001
 that-0.001
 lawyer-0.001
 school-0.001
Token bird
Token ID6512
Feature activation+1.78e-02

Pos logit contributions
rated+0.013
 fart+0.013
 lo+0.012
 ju+0.012
nam+0.012
Neg logit contributions
 lawyer-0.007
 Tom-0.007
 that-0.007
 arrest-0.007
 church-0.007
Token a
Token ID257
Feature activation+1.98e-02

Pos logit contributions
 fart+0.042
rated+0.040
 lo+0.037
 funn+0.037
 ju+0.037
Neg logit contributions
 that-0.017
 Tom-0.016
 arrest-0.014
 lawyer-0.014
 school-0.014
Token piece
Token ID3704
Feature activation+7.63e-03

Pos logit contributions
rated+0.045
 fart+0.044
nam+0.042
mediately+0.039
ksh+0.039
Neg logit contributions
 that-0.022
 lawyer-0.020
 church-0.020
 school-0.019
 Tom-0.019
Token treats
Token ID18432
Feature activation+1.70e-02

Pos logit contributions
 fart+0.000
 ju+0.000
 funn+0.000
 lo+0.000
rated+0.000
Neg logit contributions
 arrest-0.000
 Tom-0.000
 lawyer-0.000
 school-0.000
issa-0.000
Token on
Token ID319
Feature activation+1.78e-03

Pos logit contributions
 fart+0.037
 ju+0.034
 funn+0.034
 lo+0.034
rated+0.033
Neg logit contributions
 arrest-0.019
 elev-0.017
 lawyer-0.017
 church-0.016
issa-0.016
Token the
Token ID262
Feature activation-3.86e-03

Pos logit contributions
 fart+0.003
 ju+0.003
 lo+0.003
nam+0.003
 funn+0.003
Neg logit contributions
 lawyer-0.002
issa-0.002
makers-0.002
 arrest-0.002
 elev-0.002
Token table
Token ID3084
Feature activation+6.05e-03

Pos logit contributions
 arrest+0.005
 Tom+0.005
 lawyer+0.005
 school+0.005
 elev+0.005
Neg logit contributions
 fart-0.008
 ju-0.007
rated-0.007
 funn-0.007
 lo-0.007
Token,
Token ID11
Feature activation+1.06e-02

Pos logit contributions
 fart+0.014
 ju+0.013
 lo+0.013
rated+0.013
 funn+0.013
Neg logit contributions
 arrest-0.005
 lawyer-0.005
 elev-0.005
 Tom-0.005
 school-0.004
Tokenâ
Token ID22940
Feature activation+3.61e-02

Pos logit contributions
 fart+0.024
 ju+0.022
rated+0.022
 lo+0.022
 funn+0.022
Neg logit contributions
 arrest-0.010
 Tom-0.010
 lawyer-0.010
 school-0.010
 elev-0.009
Token€
Token ID26391
Feature activation-6.85e-03

Pos logit contributions
 fart+0.053
rated+0.048
 ju+0.047
 lo+0.046
 funn+0.046
Neg logit contributions
 arrest-0.063
 Tom-0.062
 lawyer-0.061
 school-0.060
 elev-0.059
Token but
Token ID475
Feature activation-8.35e-03

Pos logit contributions
 elev+0.011
issa+0.010
 lawyer+0.010
azy+0.010
 adult+0.010
Neg logit contributions
 fart-0.010
 ju-0.010
 lo-0.009
rated-0.009
 funn-0.008
Token the
Token ID262
Feature activation-1.39e-03

Pos logit contributions
 Tom+0.009
 that+0.008
 school+0.008
 Mike+0.007
 arrest+0.007
Neg logit contributions
 fart-0.019
rated-0.018
 funn-0.017
 ju-0.017
 lo-0.017
Token little
Token ID1310
Feature activation+1.69e-02

Pos logit contributions
 Tom+0.002
 arrest+0.002
 lawyer+0.002
 school+0.002
 that+0.002
Neg logit contributions
 fart-0.003
rated-0.003
 ju-0.003
 lo-0.002
 funn-0.002
Token girl
Token ID2576
Feature activation+1.69e-02

Pos logit contributions
 fart+0.031
rated+0.029
 ju+0.028
 funn+0.028
 lo+0.027
Neg logit contributions
 Tom-0.023
 school-0.022
 lawyer-0.022
 arrest-0.022
 that-0.022
Token dress
Token ID6576
Feature activation-7.28e-03

Pos logit contributions
 arrest+0.006
 Tom+0.006
 lawyer+0.006
 school+0.006
 elev+0.006
Neg logit contributions
 fart-0.010
 ju-0.009
rated-0.009
 lo-0.009
 funn-0.009
Token,
Token ID11
Feature activation+3.29e-03

Pos logit contributions
 arrest+0.006
 Tom+0.005
 elev+0.005
 lawyer+0.005
azy+0.005
Neg logit contributions
 fart-0.018
 lo-0.017
 ju-0.017
rated-0.016
 funn-0.016
Token and
Token ID290
Feature activation+4.34e-03

Pos logit contributions
 fart+0.007
 ju+0.006
 lo+0.006
rated+0.006
 funn+0.006
Neg logit contributions
 arrest-0.004
 Tom-0.004
 lawyer-0.003
 school-0.003
 elev-0.003
Token she
Token ID673
Feature activation+7.83e-03

Pos logit contributions
 fart+0.011
rated+0.010
 ju+0.010
 funn+0.010
 lo+0.010
Neg logit contributions
 Tom-0.003
 school-0.003
 that-0.003
 arrest-0.003
 lawyer-0.003
Token was
Token ID373
Feature activation-2.40e-03

Pos logit contributions
 fart+0.014
rated+0.013
 ju+0.013
 lo+0.013
 funn+0.012
Neg logit contributions
 arrest-0.011
 Tom-0.011
 lawyer-0.011
 school-0.010
 elev-0.010
Token always
Token ID1464
Feature activation+3.59e-02

Pos logit contributions
 arrest+0.003
 Tom+0.003
 lawyer+0.003
 school+0.003
 elev+0.002
Neg logit contributions
 fart-0.005
 ju-0.005
rated-0.005
 lo-0.005
 funn-0.005
Token delighted
Token ID20707
Feature activation-3.56e-03

Pos logit contributions
 fart+0.073
rated+0.068
 ju+0.066
 funn+0.063
 lo+0.062
Neg logit contributions
 Tom-0.044
 school-0.044
 studying-0.043
 playing-0.042
 that-0.042
Token when
Token ID618
Feature activation+1.44e-02

Pos logit contributions
 arrest+0.005
 lawyer+0.004
 Tom+0.004
 elev+0.004
 church+0.004
Neg logit contributions
 fart-0.007
 lo-0.007
 ju-0.006
 funn-0.006
rated-0.006
Token she
Token ID673
Feature activation-4.52e-04

Pos logit contributions
 lo+0.033
 fart+0.032
nam+0.031
 brightest+0.031
 ju+0.031
Neg logit contributions
 arrest-0.013
azy-0.010
 lawyer-0.010
co-0.010
 Tom-0.009
Token put
Token ID1234
Feature activation-6.15e-03

Pos logit contributions
 arrest+0.001
 Tom+0.001
 lawyer+0.001
 elev+0.001
 school+0.001
Neg logit contributions
 fart-0.001
 lo-0.001
rated-0.001
 ju-0.001
 funn-0.001
Token it
Token ID340
Feature activation+2.08e-02

Pos logit contributions
 arrest+0.007
 lawyer+0.006
 Tom+0.005
azy+0.005
 elev+0.005
Neg logit contributions
 fart-0.014
 lo-0.013
 ju-0.013
 funn-0.013
 brightest-0.012
Token.
Token ID13
Feature activation+9.27e-03

Pos logit contributions
ait+0.014
issa+0.014
 lawyer+0.014
 Lee+0.014
ibrarian+0.013
Neg logit contributions
 fart-0.021
 lo-0.019
 funn-0.018
cr-0.017
 afloat-0.017
Token
Token ID198
Feature activation+9.51e-03

Pos logit contributions
 fart+0.014
 ju+0.013
 funn+0.013
 lo+0.013
rated+0.012
Neg logit contributions
 arrest-0.014
 lawyer-0.014
 elev-0.014
 school-0.013
 church-0.013
Token
Token ID198
Feature activation+3.67e-03

Pos logit contributions
 fart+0.019
rated+0.017
 lo+0.017
 ju+0.017
 funn+0.017
Neg logit contributions
 arrest-0.012
 lawyer-0.011
 elev-0.011
 Lee-0.010
 studying-0.010
TokenThe
Token ID464
Feature activation+1.72e-03

Pos logit contributions
 fart+0.006
rated+0.005
 ju+0.005
 lo+0.005
 funn+0.005
Neg logit contributions
 arrest-0.006
 Tom-0.006
 lawyer-0.006
 school-0.006
 that-0.006
Token boy
Token ID2933
Feature activation+4.27e-03

Pos logit contributions
 fart+0.003
rated+0.003
 ju+0.003
 lo+0.003
 funn+0.003
Neg logit contributions
 arrest-0.002
 Tom-0.002
 lawyer-0.002
 school-0.002
 elev-0.002
Token had
Token ID550
Feature activation+1.13e-02

Pos logit contributions
 fart+0.009
 lo+0.008
 funn+0.008
 ju+0.008
rated+0.008
Neg logit contributions
 arrest-0.005
 elev-0.005
 lawyer-0.005
 Tom-0.005
 school-0.005
Token a
Token ID257
Feature activation+1.62e-04

Pos logit contributions
 fart+0.022
rated+0.021
 ju+0.020
 lo+0.019
 funn+0.019
Neg logit contributions
 Tom-0.015
 arrest-0.015
 school-0.014
 that-0.014
 lawyer-0.014
Token warm
Token ID5814
Feature activation+1.35e-02

Pos logit contributions
 fart+0.000
rated+0.000
 ju+0.000
 lo+0.000
 funn+0.000
Neg logit contributions
 Tom-0.000
 arrest-0.000
 lawyer-0.000
 school-0.000
 that-0.000
Token feeling
Token ID4203
Feature activation-1.07e-02

Pos logit contributions
 fart+0.028
 lo+0.026
rated+0.026
 ju+0.025
 funn+0.025
Neg logit contributions
 arrest-0.015
 Tom-0.015
 elev-0.014
 lawyer-0.014
azy-0.014
Token in
Token ID287
Feature activation+3.43e-03

Pos logit contributions
 Lee+0.014
 elev+0.014
 arrest+0.013
parent+0.013
innie+0.013
Neg logit contributions
 fart-0.020
 lo-0.019
 funn-0.018
 ju-0.018
 brightest-0.018
Token his
Token ID465
Feature activation-1.43e-02

Pos logit contributions
 fart+0.005
 lo+0.005
 funn+0.005
 ju+0.005
nam+0.004
Neg logit contributions
azy-0.005
 lawyer-0.005
issa-0.005
ibrarian-0.005
itt-0.005
Token mom
Token ID1995
Feature activation+2.10e-03

Pos logit contributions
 arrest+0.008
 Tom+0.008
 elev+0.007
azy+0.007
 lawyer+0.007
Neg logit contributions
 fart-0.018
 lo-0.017
 funn-0.017
rated-0.017
 ju-0.017
Tokenmy
Token ID1820
Feature activation-7.76e-03

Pos logit contributions
 fart+0.004
 ju+0.004
rated+0.004
 funn+0.003
 lo+0.003
Neg logit contributions
 Tom-0.003
 that-0.003
 lawyer-0.003
 school-0.003
 arrest-0.003
Token said
Token ID531
Feature activation-2.29e-02

Pos logit contributions
 arrest+0.012
 elev+0.011
 school+0.011
 lawyer+0.011
 Tom+0.011
Neg logit contributions
 fart-0.013
 lo-0.012
rated-0.012
 ju-0.012
 funn-0.012
Token,
Token ID11
Feature activation-3.55e-03

Pos logit contributions
 arrest+0.024
 elev+0.020
parent+0.020
 school+0.020
 lawyer+0.019
Neg logit contributions
 fart-0.049
 lo-0.048
 ju-0.047
rated-0.046
 funn-0.045
Token "
Token ID366
Feature activation+2.50e-03

Pos logit contributions
 arrest+0.005
 recess+0.004
 elev+0.004
 school+0.004
parent+0.004
Neg logit contributions
 fart-0.006
 ju-0.006
 lo-0.006
 funn-0.006
rated-0.006
TokenNot
Token ID3673
Feature activation+1.25e-03

Pos logit contributions
rated+0.005
 fart+0.005
 afloat+0.005
ench+0.004
 lo+0.004
Neg logit contributions
 that-0.003
 Tom-0.003
Tom-0.003
 lawyer-0.003
 she-0.003
Token today
Token ID1909
Feature activation+9.34e-03

Pos logit contributions
 fart+0.003
rated+0.002
 lo+0.002
 funn+0.002
 ju+0.002
Neg logit contributions
itt-0.001
 arrest-0.001
ibrarian-0.001
azy-0.001
parent-0.001
Token,
Token ID11
Feature activation+1.02e-02

Pos logit contributions
 fart+0.020
 lo+0.019
rated+0.019
 ju+0.018
 funn+0.018
Neg logit contributions
 arrest-0.009
 elev-0.008
 lawyer-0.008
azy-0.008
parent-0.008
Token sweet
Token ID6029
Feature activation+1.93e-02

Pos logit contributions
 fart+0.025
rated+0.023
 ju+0.023
 lo+0.023
 funn+0.023
Neg logit contributions
 Tom-0.008
 arrest-0.008
 school-0.007
 lawyer-0.007
 that-0.007
Tokenie
Token ID494
Feature activation-5.31e-03

Pos logit contributions
rated+0.023
 fart+0.021
 lo+0.020
 ju+0.019
 funn+0.019
Neg logit contributions
 arrest-0.040
 elev-0.038
 studying-0.037
 lawyer-0.037
 Tom-0.036
Token.
Token ID13
Feature activation-1.36e-02

Pos logit contributions
 arrest+0.006
 elev+0.005
 recess+0.005
 lawyer+0.005
 school+0.005
Neg logit contributions
rated-0.011
 fart-0.010
 lo-0.010
 ju-0.010
 funn-0.009
Token.
Token ID13
Feature activation+6.26e-03

Pos logit contributions
rated+0.043
 fart+0.043
 funn+0.042
 lo+0.042
 wider+0.041
Neg logit contributions
 arrest-0.030
 recess-0.023
 lawyer-0.022
 elev-0.021
 school-0.021
Token<|endoftext|>
Token ID50256
Feature activation-3.88e-03

Pos logit contributions
 fart+0.013
 ju+0.013
 lo+0.012
 funn+0.012
rated+0.012
Neg logit contributions
 arrest-0.006
 lawyer-0.006
 school-0.006
 Tom-0.006
 elev-0.006
TokenOnce
Token ID7454
Feature activation+6.52e-03

Pos logit contributions
 arrest+0.006
 lawyer+0.005
 elev+0.005
 studying+0.005
issa+0.005
Neg logit contributions
 fart-0.007
 lo-0.006
 ju-0.006
rated-0.006
 funn-0.006
Token upon
Token ID2402
Feature activation+3.18e-02

Pos logit contributions
 fart+0.007
rated+0.006
 lo+0.006
 ju+0.006
 funn+0.006
Neg logit contributions
 arrest-0.014
 lawyer-0.013
 elev-0.013
 studying-0.013
 church-0.013
Token a
Token ID257
Feature activation-7.65e-03

Pos logit contributions
 fart+0.070
rated+0.065
 ju+0.065
 lo+0.064
 funn+0.064
Neg logit contributions
 arrest-0.031
 Tom-0.031
 lawyer-0.030
 school-0.029
 elev-0.029
Token time
Token ID640
Feature activation+1.48e-02

Pos logit contributions
 arrest+0.008
 Tom+0.008
 lawyer+0.008
 elev+0.007
 school+0.007
Neg logit contributions
 fart-0.017
rated-0.015
 lo-0.015
 ju-0.015
 funn-0.015
Token there
Token ID612
Feature activation+1.19e-02

Pos logit contributions
 fart+0.031
 lo+0.029
 ju+0.027
rated+0.027
cr+0.026
Neg logit contributions
 arrest-0.017
 lawyer-0.015
issa-0.014
parent-0.014
 elev-0.014
Token was
Token ID373
Feature activation-1.61e-03

Pos logit contributions
 fart+0.021
 lo+0.020
rated+0.019
 ju+0.019
 funn+0.019
Neg logit contributions
 arrest-0.018
 lawyer-0.016
 elev-0.016
 studying-0.015
 church-0.015
Token a
Token ID257
Feature activation-5.04e-03

Pos logit contributions
 arrest+0.002
 lawyer+0.002
 Tom+0.002
 elev+0.002
 Lee+0.002
Neg logit contributions
 fart-0.003
 ju-0.003
 lo-0.003
 funn-0.003
rated-0.003
Token calm
Token ID9480
Feature activation-1.54e-03

Pos logit contributions
 arrest+0.008
 elev+0.008
 Tom+0.007
 studying+0.007
 Mike+0.007
Neg logit contributions
 fart-0.008
 ju-0.008
 lo-0.008
 funn-0.007
rated-0.007
Token missile
Token ID10105
Feature activation+4.90e-03

Pos logit contributions
 arrest+0.002
 elev+0.002
 choosing+0.002
 studying+0.002
azy+0.002
Neg logit contributions
 fart-0.003
 ju-0.003
 lo-0.003
rated-0.003
 funn-0.003
Token to
Token ID284
Feature activation-7.89e-03

Pos logit contributions
 arrest+0.009
 lawyer+0.008
 Tom+0.008
azy+0.008
issa+0.008
Neg logit contributions
 fart-0.020
 ju-0.018
 lo-0.018
 funn-0.018
rated-0.017
Token sleep
Token ID3993
Feature activation-3.14e-02

Pos logit contributions
 lawyer+0.011
 elev+0.011
 arrest+0.011
issa+0.011
 Tom+0.011
Neg logit contributions
 fart-0.014
 lo-0.012
rated-0.012
 ju-0.012
 funn-0.012
Token.
Token ID13
Feature activation-1.76e-02

Pos logit contributions
 arrest+0.030
 elev+0.028
 lawyer+0.027
 Tom+0.027
azy+0.026
Neg logit contributions
 fart-0.071
rated-0.066
 lo-0.066
 ju-0.064
 funn-0.063
Token Playing
Token ID23911
Feature activation+8.96e-03

Pos logit contributions
 arrest+0.023
 lawyer+0.022
 school+0.022
 elev+0.021
 building+0.021
Neg logit contributions
 fart-0.032
 ju-0.029
 lo-0.029
rated-0.028
 funn-0.028
Token is
Token ID318
Feature activation+1.00e-02

Pos logit contributions
 fart+0.020
rated+0.019
 ju+0.019
 lo+0.019
 funn+0.019
Neg logit contributions
 arrest-0.008
 Tom-0.008
 lawyer-0.008
 school-0.008
 that-0.008
Token good
Token ID922
Feature activation+1.51e-02

Pos logit contributions
 fart+0.021
rated+0.020
 ju+0.020
 lo+0.019
 funn+0.019
Neg logit contributions
 Tom-0.011
 arrest-0.010
 school-0.010
 that-0.010
 lawyer-0.010
Token,
Token ID11
Feature activation+9.86e-03

Pos logit contributions
 fart+0.035
rated+0.033
 lo+0.032
 funn+0.031
 ju+0.031
Neg logit contributions
 arrest-0.014
azy-0.013
issa-0.013
 elev-0.013
 lawyer-0.012
Token but
Token ID475
Feature activation-1.89e-02

Pos logit contributions
 fart+0.021
rated+0.020
 ju+0.020
 lo+0.019
 funn+0.019
Neg logit contributions
 Tom-0.011
 arrest-0.010
 school-0.010
 lawyer-0.010
 that-0.010
Token sleeping
Token ID11029
Feature activation-4.06e-03

Pos logit contributions
 arrest+0.019
 lawyer+0.017
 Tom+0.017
 school+0.016
 elev+0.016
Neg logit contributions
 fart-0.042
 ju-0.039
 lo-0.039
 funn-0.038
rated-0.038
Token is
Token ID318
Feature activation+1.05e-02

Pos logit contributions
 arrest+0.004
 lawyer+0.003
 elev+0.003
 Tom+0.003
 Karen+0.003
Neg logit contributions
 fart-0.009
rated-0.009
 lo-0.009
 ju-0.008
 funn-0.008
Token good
Token ID922
Feature activation+1.27e-02

Pos logit contributions
 fart+0.023
rated+0.022
 ju+0.021
 lo+0.021
 funn+0.021
Neg logit contributions
 Tom-0.011
 arrest-0.010
 school-0.010
 lawyer-0.010
 that-0.010
Token out
Token ID503
Feature activation+3.23e-03

Pos logit contributions
 fart+0.007
rated+0.007
 ju+0.007
 lo+0.007
 funn+0.007
Neg logit contributions
 arrest-0.003
 Tom-0.003
 lawyer-0.003
 school-0.003
 elev-0.003
Token and
Token ID290
Feature activation+1.45e-02

Pos logit contributions
 fart+0.008
rated+0.007
 ju+0.007
 lo+0.007
 funn+0.007
Neg logit contributions
 arrest-0.002
 Tom-0.002
 lawyer-0.002
 school-0.002
 elev-0.002
Token play
Token ID711
Feature activation+5.83e-03

Pos logit contributions
 fart+0.033
rated+0.031
 funn+0.029
 ju+0.029
ksh+0.029
Neg logit contributions
 Tom-0.015
 that-0.015
 school-0.015
 arrest-0.014
 building-0.014
Token with
Token ID351
Feature activation+9.88e-03

Pos logit contributions
 fart+0.010
 funn+0.008
 lo+0.008
 ju+0.008
 brighter+0.008
Neg logit contributions
aren-0.009
 arrest-0.009
 Karen-0.009
issa-0.009
ait-0.009
Token it
Token ID340
Feature activation+2.74e-03

Pos logit contributions
 fart+0.020
 lo+0.018
 ju+0.017
 funn+0.017
nam+0.017
Neg logit contributions
aren-0.012
 Lee-0.011
issa-0.011
 lawyer-0.011
 Karen-0.011
Token.
Token ID13
Feature activation+1.24e-02

Pos logit contributions
 fart+0.007
rated+0.007
 ju+0.007
 lo+0.007
 funn+0.007
Neg logit contributions
 arrest-0.002
 Tom-0.002
 lawyer-0.002
 school-0.001
 elev-0.001
Token
Token ID198
Feature activation+1.48e-02

Pos logit contributions
 fart+0.026
 ju+0.024
rated+0.023
 funn+0.023
 lo+0.023
Neg logit contributions
 arrest-0.014
 lawyer-0.013
 Tom-0.013
 school-0.013
 elev-0.013
Token
Token ID198
Feature activation+5.52e-03

Pos logit contributions
 fart+0.029
 lo+0.027
rated+0.027
 ju+0.026
 funn+0.026
Neg logit contributions
 arrest-0.019
 lawyer-0.016
 elev-0.016
 studying-0.016
 Lee-0.016
TokenOne
Token ID3198
Feature activation-2.84e-04

Pos logit contributions
 fart+0.010
rated+0.009
 ju+0.009
 lo+0.009
 funn+0.009
Neg logit contributions
 arrest-0.008
 Tom-0.008
 lawyer-0.007
 school-0.007
 that-0.007
Token day
Token ID1110
Feature activation+1.31e-02

Pos logit contributions
 Tom+0.000
 arrest+0.000
 lawyer+0.000
 school+0.000
 that+0.000
Neg logit contributions
 fart-0.001
rated-0.001
 ju-0.001
 lo-0.001
 funn-0.001
Token,
Token ID11
Feature activation-1.02e-02

Pos logit contributions
 fart+0.028
 lo+0.026
rated+0.025
 ju+0.025
 funn+0.024
Neg logit contributions
 arrest-0.013
parent-0.013
issa-0.012
 lawyer-0.012
 elev-0.012
Token bur
Token ID4356
Feature activation+7.33e-03

Pos logit contributions
 fart+0.021
rated+0.020
 lo+0.019
 funn+0.019
 ju+0.019
Neg logit contributions
 school-0.013
 arrest-0.013
 Tom-0.012
 lawyer-0.012
 that-0.012
Tokenrow
Token ID808
Feature activation+7.43e-03

Pos logit contributions
 fart+0.010
 funn+0.009
 ju+0.009
 lo+0.009
nam+0.008
Neg logit contributions
 that-0.013
 arrest-0.013
 building-0.013
co-0.012
 school-0.012
Token and
Token ID290
Feature activation+9.21e-03

Pos logit contributions
 fart+0.019
rated+0.018
 ju+0.018
 lo+0.018
 funn+0.018
Neg logit contributions
 Tom-0.004
 arrest-0.004
 lawyer-0.004
 school-0.004
 that-0.004
Token found
Token ID1043
Feature activation+1.50e-03

Pos logit contributions
 fart+0.019
rated+0.017
 funn+0.017
 ju+0.017
 lo+0.017
Neg logit contributions
 Tom-0.012
 that-0.011
 arrest-0.011
 school-0.011
 building-0.010
Token some
Token ID617
Feature activation+2.04e-03

Pos logit contributions
 fart+0.004
rated+0.003
 ju+0.003
 lo+0.003
 funn+0.003
Neg logit contributions
 Tom-0.001
 that-0.001
 arrest-0.001
 school-0.001
 lawyer-0.001
Token leaves
Token ID5667
Feature activation-1.44e-02

Pos logit contributions
 fart+0.004
rated+0.004
 ju+0.003
 funn+0.003
 lo+0.003
Neg logit contributions
 that-0.003
 Tom-0.003
 school-0.003
 arrest-0.002
 lawyer-0.002
Token.
Token ID13
Feature activation+1.46e-02

Pos logit contributions
 arrest+0.009
 Tom+0.009
 lawyer+0.009
 school+0.008
 elev+0.008
Neg logit contributions
 fart-0.037
rated-0.035
 ju-0.035
 lo-0.034
 funn-0.034
Token He
Token ID679
Feature activation+4.73e-03

Pos logit contributions
 fart+0.029
 ju+0.027
rated+0.026
 lo+0.026
 funn+0.026
Neg logit contributions
 arrest-0.018
 Tom-0.017
 lawyer-0.017
 school-0.017
 elev-0.017
Token piled
Token ID33654
Feature activation+1.14e-02

Pos logit contributions
 fart+0.009
rated+0.008
 ju+0.008
 lo+0.008
 funn+0.008
Neg logit contributions
 arrest-0.006
 Tom-0.006
 lawyer-0.006
 school-0.006
 that-0.006
Token them
Token ID606
Feature activation+2.40e-02

Pos logit contributions
 fart+0.026
rated+0.025
 ju+0.025
 funn+0.025
 lo+0.024
Neg logit contributions
 Tom-0.010
 arrest-0.010
 that-0.009
 school-0.009
 elev-0.009
Token up
Token ID510
Feature activation-7.48e-03

Pos logit contributions
 fart+0.045
rated+0.044
 funn+0.044
 ju+0.044
 lo+0.041
Neg logit contributions
 that-0.030
 Tom-0.029
 upstairs-0.028
 inside-0.028
 outside-0.027
Token.
Token ID13
Feature activation-2.83e-03

Pos logit contributions
 lo+0.016
 fart+0.015
 funn+0.015
 ju+0.015
 brighter+0.015
Neg logit contributions
 arrest-0.010
 lawyer-0.009
issa-0.009
 elev-0.009
 Lee-0.008
Token He
Token ID679
Feature activation-1.34e-02

Pos logit contributions
 arrest+0.004
 lawyer+0.004
 elev+0.004
 church+0.004
 school+0.004
Neg logit contributions
 fart-0.005
rated-0.005
 ju-0.005
 funn-0.005
 lo-0.005
Token knew
Token ID2993
Feature activation-1.48e-02

Pos logit contributions
 arrest+0.016
 Tom+0.016
 lawyer+0.015
 school+0.015
 elev+0.015
Neg logit contributions
 fart-0.027
rated-0.025
 ju-0.025
 lo-0.024
 funn-0.024
Token from
Token ID422
Feature activation-1.35e-02

Pos logit contributions
 arrest+0.012
 lawyer+0.010
parent+0.009
 Lee+0.009
 elev+0.009
Neg logit contributions
 fart-0.035
 lo-0.034
 ju-0.033
 funn-0.033
rated-0.031
Token now
Token ID783
Feature activation+1.20e-04

Pos logit contributions
 arrest+0.011
 lawyer+0.010
 Lee+0.010
 Tom+0.010
 elev+0.009
Neg logit contributions
 fart-0.031
 lo-0.029
 funn-0.029
rated-0.029
 ju-0.029
Token on
Token ID319
Feature activation-4.73e-03

Pos logit contributions
 fart+0.000
rated+0.000
 lo+0.000
 funn+0.000
 ju+0.000
Neg logit contributions
 arrest-0.000
 Tom-0.000
 lawyer-0.000
 elev-0.000
azy-0.000
Token there
Token ID612
Feature activation-9.94e-05

Pos logit contributions
 arrest+0.004
 lawyer+0.004
 elev+0.004
azy+0.004
aren+0.004
Neg logit contributions
 fart-0.010
 lo-0.010
rated-0.010
 ju-0.010
 funn-0.009
Token's
Token ID338
Feature activation+1.32e-02

Pos logit contributions
 arrest+0.000
 lawyer+0.000
 Tom+0.000
 elev+0.000
issa+0.000
Neg logit contributions
 fart-0.000
 lo-0.000
rated-0.000
 ju-0.000
 funn-0.000
Token nothing
Token ID2147
Feature activation-6.73e-03

Pos logit contributions
 fart+0.028
 ju+0.026
rated+0.026
 funn+0.025
 lo+0.025
Neg logit contributions
 school-0.015
 that-0.014
 Tom-0.014
 church-0.014
 studying-0.014
Token to
Token ID284
Feature activation-3.11e-04

Pos logit contributions
 arrest+0.008
 Tom+0.007
 lawyer+0.007
 Lee+0.007
 school+0.006
Neg logit contributions
 fart-0.014
rated-0.013
 lo-0.013
 ju-0.013
 funn-0.013
Token be
Token ID307
Feature activation+1.25e-02

Pos logit contributions
 arrest+0.000
 Tom+0.000
 lawyer+0.000
 school+0.000
 elev+0.000
Neg logit contributions
 fart-0.001
rated-0.001
 ju-0.001
 lo-0.001
 funn-0.000
Token to
Token ID284
Feature activation-1.27e-03

Pos logit contributions
 arrest+0.018
 lawyer+0.017
issa+0.016
azy+0.016
 Tom+0.016
Neg logit contributions
 fart-0.041
 funn-0.037
 ju-0.037
 lo-0.037
rated-0.035
Token do
Token ID466
Feature activation-5.55e-03

Pos logit contributions
 arrest+0.002
 Tom+0.001
 lawyer+0.001
 elev+0.001
 school+0.001
Neg logit contributions
 fart-0.003
rated-0.002
 ju-0.002
 lo-0.002
 funn-0.002
Token anything
Token ID1997
Feature activation-2.77e-02

Pos logit contributions
aren+0.006
 arrest+0.006
parent+0.005
ait+0.005
issa+0.005
Neg logit contributions
 fart-0.012
 funn-0.011
 lo-0.011
 ju-0.010
nam-0.010
Token.
Token ID13
Feature activation-3.56e-03

Pos logit contributions
 arrest+0.030
parent+0.027
aren+0.026
issa+0.026
 lawyer+0.025
Neg logit contributions
 fart-0.063
 funn-0.057
 ju-0.056
 lo-0.056
rated-0.056
Token I
Token ID314
Feature activation-1.32e-02

Pos logit contributions
 Tom+0.004
 arrest+0.003
 lawyer+0.003
 Ann+0.003
 that+0.003
Neg logit contributions
 fart-0.008
rated-0.008
 ju-0.008
 lo-0.008
 funn-0.007
Token just
Token ID655
Feature activation-1.64e-02

Pos logit contributions
 Tom+0.017
 arrest+0.017
 lawyer+0.016
 school+0.016
 that+0.016
Neg logit contributions
 fart-0.026
rated-0.023
 ju-0.023
 lo-0.023
 funn-0.023
Token want
Token ID765
Feature activation-1.06e-02

Pos logit contributions
 Tom+0.021
 arrest+0.020
 school+0.019
 lawyer+0.019
 that+0.019
Neg logit contributions
 fart-0.032
 ju-0.030
rated-0.029
 lo-0.029
 funn-0.029
Token the
Token ID262
Feature activation-1.04e-02

Pos logit contributions
 arrest+0.009
 Tom+0.009
 lawyer+0.009
 school+0.008
 elev+0.008
Neg logit contributions
 fart-0.025
 ju-0.023
 funn-0.023
 lo-0.023
rated-0.022
Token rain
Token ID6290
Feature activation+6.59e-04

Pos logit contributions
 Tom+0.014
 school+0.013
 lawyer+0.013
 arrest+0.013
 that+0.012
Neg logit contributions
 fart-0.020
rated-0.020
 ju-0.018
 lo-0.018
 funn-0.018
Token to
Token ID284
Feature activation+3.25e-03

Pos logit contributions
 fart+0.002
rated+0.001
 ju+0.001
 funn+0.001
 lo+0.001
Neg logit contributions
 arrest-0.001
 lawyer-0.001
 Tom-0.001
issa-0.001
 elev-0.001
Token stop
Token ID2245
Feature activation-1.33e-02

Pos logit contributions
 fart+0.006
rated+0.006
 ju+0.006
 lo+0.006
 funn+0.005
Neg logit contributions
 arrest-0.004
 Tom-0.004
 lawyer-0.004
 elev-0.004
 school-0.004
Token,
Token ID11
Feature activation-3.62e-03

Pos logit contributions
 elev+0.002
 arrest+0.002
 recess+0.002
azy+0.002
 lawyer+0.001
Neg logit contributions
 fart-0.004
 lo-0.004
rated-0.004
 ju-0.003
 funn-0.003
Token I
Token ID314
Feature activation-2.30e-02

Pos logit contributions
 arrest+0.004
 lawyer+0.004
 elev+0.004
parent+0.004
 school+0.004
Neg logit contributions
 fart-0.007
 ju-0.007
 lo-0.007
 funn-0.007
rated-0.006
Token have
Token ID423
Feature activation-1.59e-02

Pos logit contributions
 arrest+0.030
 elev+0.029
 Tom+0.028
 lawyer+0.028
 school+0.027
Neg logit contributions
 fart-0.043
rated-0.041
 lo-0.041
 funn-0.040
 ju-0.040
Token some
Token ID617
Feature activation-8.43e-03

Pos logit contributions
 arrest+0.017
 Tom+0.017
 lawyer+0.017
 school+0.016
 elev+0.016
Neg logit contributions
 fart-0.033
 ju-0.031
 lo-0.030
 funn-0.030
rated-0.030
Token news
Token ID1705
Feature activation+8.66e-03

Pos logit contributions
 arrest+0.010
 Tom+0.010
 lawyer+0.009
 school+0.009
 elev+0.009
Neg logit contributions
 fart-0.017
 ju-0.016
rated-0.016
 lo-0.016
 funn-0.016
Token for
Token ID329
Feature activation-2.98e-02

Pos logit contributions
 fart+0.019
 funn+0.017
 lo+0.017
 ju+0.017
rated+0.017
Neg logit contributions
 arrest-0.010
 elev-0.008
 lawyer-0.008
azy-0.008
parent-0.008
Token you
Token ID345
Feature activation-3.89e-03

Pos logit contributions
 arrest+0.038
 lawyer+0.037
aren+0.036
azy+0.036
 elev+0.036
Neg logit contributions
 fart-0.053
 lo-0.049
nam-0.049
 funn-0.047
 ju-0.046
Token,"
Token ID553
Feature activation-3.77e-02

Pos logit contributions
 Tom+0.004
 arrest+0.004
 school+0.003
 lawyer+0.003
 that+0.003
Neg logit contributions
 fart-0.009
 ju-0.008
rated-0.008
 funn-0.008
 lo-0.008
Token Mom
Token ID11254
Feature activation+5.97e-03

Pos logit contributions
 Tom+0.031
 she+0.024
 that+0.023
 school+0.023
 grandma+0.022
Neg logit contributions
 fart-0.109
rated-0.102
 ju-0.096
 afloat-0.094
ksh-0.092
Token said
Token ID531
Feature activation-2.31e-02

Pos logit contributions
 fart+0.010
rated+0.010
 ju+0.009
 lo+0.009
 funn+0.009
Neg logit contributions
 arrest-0.009
 Tom-0.009
 lawyer-0.008
 school-0.008
 elev-0.008
Token.
Token ID13
Feature activation-9.89e-03

Pos logit contributions
 arrest+0.012
 Tom+0.010
 lawyer+0.010
 school+0.009
 elev+0.009
Neg logit contributions
 fart-0.062
rated-0.059
 lo-0.059
 ju-0.059
 funn-0.058
Token bored
Token ID17533
Feature activation+9.60e-03

Pos logit contributions
 arrest+0.007
 Tom+0.007
 elev+0.006
 lawyer+0.006
 school+0.006
Neg logit contributions
 fart-0.008
 lo-0.007
 ju-0.007
 funn-0.007
rated-0.007
Token and
Token ID290
Feature activation-1.39e-02

Pos logit contributions
 fart+0.022
 lo+0.021
 funn+0.020
 ju+0.020
rated+0.020
Neg logit contributions
 arrest-0.009
innie-0.007
azy-0.007
issa-0.007
aren-0.007
Token decided
Token ID3066
Feature activation+3.30e-03

Pos logit contributions
 Tom+0.018
 that+0.017
 school+0.017
 arrest+0.017
 lawyer+0.016
Neg logit contributions
 fart-0.028
rated-0.025
 ju-0.025
 funn-0.024
 lo-0.024
Token to
Token ID284
Feature activation+7.26e-03

Pos logit contributions
 fart+0.008
 ju+0.007
 lo+0.007
rated+0.007
 funn+0.007
Neg logit contributions
 arrest-0.003
 Tom-0.003
azy-0.003
 elev-0.003
issa-0.003
Token do
Token ID466
Feature activation+8.69e-03

Pos logit contributions
 fart+0.014
rated+0.013
 ju+0.013
 funn+0.013
 lo+0.013
Neg logit contributions
 Tom-0.009
 arrest-0.009
 school-0.009
 lawyer-0.009
 that-0.009
Token something
Token ID1223
Feature activation-2.17e-02

Pos logit contributions
 fart+0.013
 lo+0.012
 funn+0.012
 ju+0.012
rated+0.011
Neg logit contributions
 arrest-0.015
 Lee-0.014
 Tom-0.014
 lawyer-0.014
parent-0.013
Token naughty
Token ID45128
Feature activation+4.93e-03

Pos logit contributions
 arrest+0.031
 Tom+0.029
 lawyer+0.028
 Lee+0.027
parent+0.027
Neg logit contributions
 fart-0.041
 ju-0.038
 funn-0.038
 lo-0.037
rated-0.037
Token.
Token ID13
Feature activation+2.04e-03

Pos logit contributions
 fart+0.012
rated+0.011
 lo+0.011
 funn+0.011
 ju+0.011
Neg logit contributions
 arrest-0.004
 elev-0.003
 lawyer-0.003
 church-0.003
issa-0.003
Token Her
Token ID2332
Feature activation-8.09e-03

Pos logit contributions
 fart+0.005
rated+0.005
 ju+0.005
 lo+0.004
 funn+0.004
Neg logit contributions
 arrest-0.002
 Tom-0.002
 lawyer-0.002
 school-0.002
 that-0.001
Token bad
Token ID2089
Feature activation+1.37e-02

Pos logit contributions
 arrest+0.010
 Tom+0.010
 lawyer+0.010
 school+0.010
 that+0.010
Neg logit contributions
 fart-0.016
rated-0.015
 ju-0.014
 lo-0.014
 funn-0.014
Token behaviour
Token ID9172
Feature activation+1.45e-02

Pos logit contributions
 fart+0.024
rated+0.022
 ju+0.021
 lo+0.021
 funn+0.021
Neg logit contributions
 arrest-0.020
 Tom-0.020
 lawyer-0.019
 school-0.019
 that-0.019
Token got
Token ID1392
Feature activation+5.37e-03

Pos logit contributions
 arrest+0.021
 elev+0.020
 lawyer+0.020
 studying+0.020
 school+0.020
Neg logit contributions
 fart-0.021
rated-0.021
 lo-0.020
 ju-0.020
 funn-0.019
Token off
Token ID572
Feature activation-2.15e-02

Pos logit contributions
 fart+0.011
rated+0.010
 ju+0.010
 lo+0.010
 funn+0.010
Neg logit contributions
 arrest-0.007
 Tom-0.007
 lawyer-0.006
 school-0.006
 elev-0.006
Token the
Token ID262
Feature activation+4.54e-03

Pos logit contributions
 arrest+0.025
 lawyer+0.025
azy+0.025
issa+0.024
 elev+0.023
Neg logit contributions
 fart-0.040
 lo-0.039
 ju-0.037
 tighter-0.037
 funn-0.035
Token couch
Token ID18507
Feature activation-2.52e-02

Pos logit contributions
 fart+0.010
rated+0.009
 ju+0.009
 funn+0.009
 lo+0.009
Neg logit contributions
 arrest-0.005
 Tom-0.005
 lawyer-0.004
 school-0.004
 that-0.004
Token and
Token ID290
Feature activation+4.89e-03

Pos logit contributions
 arrest+0.020
 elev+0.018
azy+0.018
 lawyer+0.018
 Tom+0.017
Neg logit contributions
 fart-0.060
 lo-0.056
rated-0.056
 ju-0.055
 funn-0.054
Token tr
Token ID491
Feature activation-4.82e-02

Pos logit contributions
 fart+0.009
rated+0.008
 lo+0.008
 ju+0.008
 funn+0.008
Neg logit contributions
 arrest-0.007
 lawyer-0.006
 elev-0.006
 Tom-0.006
 school-0.006
Tokenud
Token ID463
Feature activation-3.10e-02

Pos logit contributions
 lawyer+0.062
 meeting+0.061
 Tom+0.060
 worker+0.059
 that+0.058
Neg logit contributions
ifying-0.078
rated-0.075
 fart-0.075
ench-0.074
rac-0.073
Tokenged
Token ID2004
Feature activation-2.09e-02

Pos logit contributions
 lawyer+0.043
 arrest+0.043
 elev+0.042
 church+0.041
 university+0.041
Neg logit contributions
rated-0.052
 fart-0.051
 lo-0.049
cr-0.048
 tighter-0.046
Token out
Token ID503
Feature activation-4.94e-03

Pos logit contributions
 arrest+0.031
 lawyer+0.030
 school+0.029
azy+0.029
 church+0.029
Neg logit contributions
 fart-0.034
 lo-0.032
 funn-0.031
 tighter-0.031
 ju-0.031
Token of
Token ID286
Feature activation+6.26e-04

Pos logit contributions
 arrest+0.005
 lawyer+0.005
 Tom+0.004
 church+0.004
azy+0.004
Neg logit contributions
 fart-0.011
 lo-0.010
rated-0.010
 funn-0.010
 ju-0.010
Token the
Token ID262
Feature activation-6.50e-03

Pos logit contributions
 fart+0.001
 lo+0.001
 funn+0.001
 ju+0.001
 tighter+0.001
Neg logit contributions
 lawyer-0.001
 arrest-0.001
azy-0.001
ibrarian-0.001
 studying-0.001
Token mom
Token ID1995
Feature activation+1.70e-02

Pos logit contributions
 fart+0.013
 funn+0.013
 lo+0.012
 ju+0.012
rated+0.012
Neg logit contributions
 arrest-0.008
 elev-0.007
azy-0.007
issa-0.007
 Tom-0.007
Token,
Token ID11
Feature activation-6.73e-03

Pos logit contributions
 fart+0.039
 ju+0.037
 lo+0.037
rated+0.036
 funn+0.036
Neg logit contributions
 arrest-0.016
 lawyer-0.013
 Tom-0.013
 elev-0.013
 school-0.013
Token "
Token ID366
Feature activation+4.39e-03

Pos logit contributions
 arrest+0.010
 recess+0.008
 elev+0.008
parent+0.008
issa+0.008
Neg logit contributions
 fart-0.012
 ju-0.012
 lo-0.012
 funn-0.012
nam-0.011
TokenWhat
Token ID2061
Feature activation-1.29e-02

Pos logit contributions
 fart+0.008
rated+0.008
 ju+0.007
 funn+0.007
 lo+0.007
Neg logit contributions
 Tom-0.006
 arrest-0.006
 lawyer-0.006
 school-0.006
 elev-0.005
Token is
Token ID318
Feature activation-2.23e-02

Pos logit contributions
 arrest+0.017
 Tom+0.016
 lawyer+0.016
 school+0.016
 elev+0.015
Neg logit contributions
 fart-0.025
 ju-0.023
rated-0.023
 lo-0.022
 funn-0.022
Token behind
Token ID2157
Feature activation-4.52e-02

Pos logit contributions
parent+0.025
 arrest+0.025
aren+0.024
loe+0.024
kids+0.024
Neg logit contributions
 fart-0.048
 ju-0.042
 prett-0.041
 brightest-0.040
 funn-0.040
Token the
Token ID262
Feature activation-8.40e-03

Pos logit contributions
 arrest+0.044
 lawyer+0.044
 studying+0.044
 recess+0.044
aren+0.043
Neg logit contributions
 fart-0.096
 ju-0.083
 lo-0.082
rated-0.082
 brightest-0.078
Token fence
Token ID13990
Feature activation-1.87e-02

Pos logit contributions
 arrest+0.015
 Tom+0.014
 studying+0.014
 lawyer+0.014
issa+0.014
Neg logit contributions
 fart-0.012
rated-0.010
 ju-0.010
 lo-0.010
 funn-0.010
Token?"
Token ID1701
Feature activation-1.62e-02

Pos logit contributions
 Tom+0.018
 that+0.017
 school+0.016
 arrest+0.016
 building+0.016
Neg logit contributions
 fart-0.043
rated-0.041
 ju-0.041
 funn-0.041
 lo-0.040
Token\n
Token ID198
Feature activation+7.95e-03

Pos logit contributions
 arrest+0.018
 Tom+0.017
 lawyer+0.017
 elev+0.016
 school+0.016
Neg logit contributions
 fart-0.034
rated-0.031
 ju-0.031
 lo-0.031
 funn-0.031
Token\n
Token ID198
Feature activation+3.93e-03

Pos logit contributions
 fart+0.016
 lo+0.015
rated+0.015
 ju+0.015
 funn+0.015
Neg logit contributions
 arrest-0.009
 lawyer-0.008
 elev-0.008
 studying-0.008
 Tom-0.008
Token Annie
Token ID25860
Feature activation-1.99e-02

Pos logit contributions
 fart+0.004
 lo+0.004
 ju+0.004
rated+0.004
 funn+0.004
Neg logit contributions
 arrest-0.002
 lawyer-0.002
 Tom-0.001
 elev-0.001
 Lee-0.001
Token was
Token ID373
Feature activation+5.01e-03

Pos logit contributions
 Tom+0.027
 arrest+0.026
 lawyer+0.026
 elev+0.025
 school+0.025
Neg logit contributions
 fart-0.036
rated-0.034
 lo-0.033
 ju-0.032
 funn-0.032
Token never
Token ID1239
Feature activation-1.68e-03

Pos logit contributions
 fart+0.010
 ju+0.009
 funn+0.009
 lo+0.009
rated+0.009
Neg logit contributions
 arrest-0.007
 Tom-0.006
 lawyer-0.006
 school-0.006
 elev-0.006
Token anxious
Token ID18116
Feature activation-7.72e-03

Pos logit contributions
 Tom+0.002
 school+0.002
 arrest+0.002
 that+0.002
 studying+0.002
Neg logit contributions
 fart-0.003
rated-0.003
 ju-0.003
 lo-0.003
 funn-0.003
Token about
Token ID546
Feature activation-2.44e-02

Pos logit contributions
 arrest+0.010
parent+0.009
 Tom+0.009
issa+0.009
 church+0.009
Neg logit contributions
 fart-0.015
rated-0.014
 ju-0.014
 funn-0.014
 lo-0.014
Token z
Token ID1976
Feature activation-5.69e-02

Pos logit contributions
 arrest+0.027
 lawyer+0.025
ibrarian+0.025
azy+0.024
itt+0.024
Neg logit contributions
 fart-0.049
 lo-0.047
 funn-0.046
 ju-0.046
nam-0.044
Tokenipping
Token ID4501
Feature activation-9.17e-03

Pos logit contributions
 Lucy+0.094
 Karen+0.094
 Tom+0.090
 lawyer+0.090
 John+0.089
Neg logit contributions
 fart-0.083
rac-0.077
 lo-0.074
 funn-0.068
 ju-0.065
Token up
Token ID510
Feature activation-2.56e-02

Pos logit contributions
 arrest+0.012
 Tom+0.011
 lawyer+0.011
 church+0.010
 school+0.010
Neg logit contributions
 fart-0.018
 lo-0.017
 ju-0.017
rated-0.017
 funn-0.017
Token her
Token ID607
Feature activation-5.56e-03

Pos logit contributions
 arrest+0.027
 lawyer+0.026
 Tom+0.026
 Lee+0.024
issa+0.024
Neg logit contributions
 fart-0.055
 lo-0.051
 ju-0.050
 funn-0.049
rated-0.049
Token shirt
Token ID10147
Feature activation+1.57e-02

Pos logit contributions
 Tom+0.007
 arrest+0.007
 school+0.007
 lawyer+0.007
 that+0.007
Neg logit contributions
 fart-0.010
rated-0.010
 ju-0.010
 funn-0.009
 lo-0.009
Token again
Token ID757
Feature activation-1.18e-02

Pos logit contributions
 fart+0.033
 ju+0.031
 lo+0.030
rated+0.030
 funn+0.030
Neg logit contributions
 arrest-0.018
 Tom-0.017
 elev-0.017
 lawyer-0.016
 church-0.016
Token were
Token ID547
Feature activation-6.35e-03

Pos logit contributions
 fart+0.021
 lo+0.019
rated+0.019
 ju+0.019
 funn+0.019
Neg logit contributions
 arrest-0.016
 lawyer-0.015
 elev-0.014
 studying-0.014
 church-0.014
Token two
Token ID734
Feature activation-2.54e-03

Pos logit contributions
 Tom+0.007
 arrest+0.006
 school+0.006
 lawyer+0.006
 that+0.006
Neg logit contributions
 fart-0.014
rated-0.013
 ju-0.013
 lo-0.013
 funn-0.013
Token friends
Token ID2460
Feature activation-1.64e-03

Pos logit contributions
 arrest+0.003
 Tom+0.003
 school+0.003
 elev+0.003
 lawyer+0.003
Neg logit contributions
 fart-0.005
 lo-0.005
 funn-0.005
 ju-0.004
rated-0.004
Token,
Token ID11
Feature activation-2.10e-03

Pos logit contributions
 arrest+0.002
 elev+0.002
 church+0.002
issa+0.002
 lawyer+0.002
Neg logit contributions
 lo-0.003
 fart-0.003
 funn-0.003
seed-0.003
cr-0.003
Token Ab
Token ID2275
Feature activation-3.95e-02

Pos logit contributions
 arrest+0.002
 elev+0.002
 lawyer+0.002
 church+0.002
 studying+0.002
Neg logit contributions
 fart-0.004
 lo-0.004
 funn-0.004
 ju-0.004
nam-0.004
Tokenig
Token ID328
Feature activation-5.57e-02

Pos logit contributions
 arrest+0.064
 lawyer+0.063
 Tom+0.062
 school+0.061
 elev+0.061
Neg logit contributions
 fart-0.060
 lo-0.055
rated-0.055
 ju-0.054
 funn-0.051
Tokenail
Token ID603
Feature activation-4.26e-02

Pos logit contributions
 elev+0.082
 Tom+0.081
 lawyer+0.081
 church+0.080
 school+0.080
Neg logit contributions
 fart-0.088
 lo-0.080
rated-0.078
cr-0.077
 ju-0.076
Token and
Token ID290
Feature activation-1.28e-02

Pos logit contributions
 elev+0.050
 recess+0.048
 arrest+0.043
 university+0.043
azy+0.043
Neg logit contributions
 fart-0.076
 lo-0.075
rated-0.074
seed-0.073
cr-0.071
Token Mat
Token ID6550
Feature activation-3.71e-02

Pos logit contributions
 arrest+0.014
 lawyer+0.013
 elev+0.013
 church+0.012
azy+0.012
Neg logit contributions
 fart-0.025
 funn-0.025
nam-0.025
 lo-0.024
 ju-0.023
Tokenty
Token ID774
Feature activation-2.38e-02

Pos logit contributions
 church+0.067
 arrest+0.064
 lawyer+0.064
 studying+0.061
 counting+0.061
Neg logit contributions
rated-0.046
 fart-0.045
 lo-0.044
 tighter-0.042
 ju-0.042
Token.
Token ID13
Feature activation-3.93e-03

Pos logit contributions
 arrest+0.026
 elev+0.025
 church+0.025
issa+0.024
 lawyer+0.024
Neg logit contributions
 lo-0.046
 fart-0.045
 funn-0.044
cr-0.043
rated-0.043
Token str
Token ID965
Feature activation-4.46e-03

Pos logit contributions
 Tom+0.002
 that+0.002
 arrest+0.002
 school+0.002
 playing+0.002
Neg logit contributions
 fart-0.004
 ju-0.004
rated-0.004
 funn-0.004
 lo-0.004
Tokenayed
Token ID16548
Feature activation+1.33e-02

Pos logit contributions
 Tom+0.006
 that+0.006
 arrest+0.006
azy+0.005
 lawyer+0.005
Neg logit contributions
 fart-0.009
 ju-0.009
rated-0.008
 funn-0.008
nam-0.008
Token too
Token ID1165
Feature activation-1.63e-02

Pos logit contributions
 fart+0.028
rated+0.026
 ju+0.025
 funn+0.025
 lo+0.025
Neg logit contributions
 Tom-0.015
 arrest-0.015
 that-0.014
 lawyer-0.014
 school-0.014
Token far
Token ID1290
Feature activation-1.31e-02

Pos logit contributions
 arrest+0.023
 Tom+0.023
 lawyer+0.022
 school+0.022
 that+0.021
Neg logit contributions
 fart-0.030
rated-0.027
 ju-0.027
 lo-0.027
 funn-0.027
Token from
Token ID422
Feature activation-2.29e-02

Pos logit contributions
 arrest+0.019
 lawyer+0.019
 Tom+0.019
 school+0.019
 elev+0.019
Neg logit contributions
 fart-0.023
rated-0.021
 lo-0.020
 ju-0.020
 funn-0.019
Token home
Token ID1363
Feature activation-5.08e-02

Pos logit contributions
 arrest+0.024
 Tom+0.023
 lawyer+0.023
azy+0.022
issa+0.022
Neg logit contributions
 fart-0.049
 lo-0.044
rated-0.044
 funn-0.043
 ju-0.043
Token again
Token ID757
Feature activation-2.36e-02

Pos logit contributions
 Tom+0.047
issa+0.046
 arrest+0.046
 lawyer+0.045
 elev+0.045
Neg logit contributions
 fart-0.109
 lo-0.102
 ju-0.100
rated-0.098
 funn-0.096
Token.
Token ID13
Feature activation+5.79e-03

Pos logit contributions
 Tom+0.013
 arrest+0.013
 lawyer+0.013
issa+0.012
 elev+0.012
Neg logit contributions
 fart-0.062
rated-0.058
 lo-0.057
 ju-0.057
 funn-0.056
Token<|endoftext|>
Token ID50256
Feature activation-8.31e-03

Pos logit contributions
 fart+0.011
 ju+0.010
 funn+0.010
 lo+0.010
rated+0.009
Neg logit contributions
 arrest-0.008
 lawyer-0.007
 elev-0.007
issa-0.007
 church-0.007
TokenOnce
Token ID7454
Feature activation+3.21e-03

Pos logit contributions
 arrest+0.012
 lawyer+0.012
 elev+0.012
 studying+0.011
issa+0.011
Neg logit contributions
 fart-0.014
 lo-0.013
 ju-0.013
rated-0.012
cr-0.012
Token upon
Token ID2402
Feature activation+2.92e-02

Pos logit contributions
 fart+0.004
rated+0.003
 lo+0.003
 ju+0.003
 funn+0.003
Neg logit contributions
 arrest-0.007
 lawyer-0.006
 elev-0.006
 Tom-0.006
 church-0.006