TOP ACTIVATIONS
MAX = 0.095

closer and closer , but the fog was too thick .
. Suddenly , the fog began to move away
thought , and then the organized friend had an idea .
Ben agreed and soon the two of them were walking
of them . The fog was gone , and
was surprised and wondered how the all igator could talk .
today ?" Mr . Sun smiled and said ,
! He even showed his new friends how to do them
! The new all igator said , "
nodded and watched carefully as the grown up stretched her arms
a little girl named Lu . She was three years old
bright fan . Joe wanted the fan too , and so
Can I use it ?" asked Lucy . "
was excited to have a new one . " I hope
able . And they liked the fog . It made their
. " Do you like the stairs ?" "
One day , Sam and Lucy went to the park
again . They understood that the fog was trying to tell
Sam and Lucy . Sam and Lucy loved to play together
curious little cat named Kitty . Kitty loved to play and

MIN ACTIVATIONS
MIN = -0.107

found a brilliant skirt in Lily 's closet . It was
a nap . One day , Gro ver decided he wanted
the beach . But first , he had to pack !
what to do . But then , a kind
pumpkin friends . One day , a big , mean squash
a fair peanut . It looked so tasty that a lion
every day . One day , while Av a was playing
big ocean . One day , P inch y met a
saw a new toy in Lily 's part . It was
eating bamboo . One day , P andy met a rabbit
was a big card with a picture of a rainbow .
From that day on , Jolly and his friends were
to eat . One day , P ingu was feeling tired
a big ice cream in Ben 's hand and a new
a break . One day , his friend Sammy came over
puts a shiny sticker on Ben 's folder . Ben feels
make funny noises . So , he kept rubbing and rubbing
's purse . It has a picture of his mom and
card on the floor . The card had a big ,
affe ." They looked through Lily 's toy box and found

INTERVAL 0.045 - 0.095
CONTAINS 0.477%

car ride was long and hot . Tom and Lily felt
like it ?" L ily smiled . She liked
Suddenly , a strange noise made her jump . A scary
Tim my felt sad and went back inside his hut .
One day , it r ained outside , and Lily

INTERVAL -0.006 - 0.045
CONTAINS 63.041%

, the boy felt fresh and happy . He thanked the
refused to listen . Finally , Jack 's
friends were angry . " No !" they shouted . "
get filled with water . ” So Tim
friend , O ll ie ." O ll ie was happy

INTERVAL -0.057 - -0.006
CONTAINS 36.322%

so he got on the plane . The
said , reaching for the coal . But Ben
along . He felt so happy knowing that he would still
One day , Kitty wanted to cook some food
closet , and in his toy box . He cannot find

INTERVAL -0.107 - -0.057
CONTAINS 0.161%

said , " Oh , this looks like something from Grand
so much fun . After a while , the big purple
T oni 's watch back . They ran after the bully
bus stop . He sang a song , danced around ,
as the wind blew through their hair . The girl noticed
Token closer
Token ID5699
Feature activation+1.86e-02

Pos logit contributions
 Lu+0.074
Ann+0.068
 angels+0.066
Lu+0.066
 pastor+0.064
Neg logit contributions
ret-0.044
orse-0.041
ets-0.040
 peanut-0.038
sel-0.037
Token and
Token ID290
Feature activation-1.42e-02

Pos logit contributions
 Lu+0.041
Ann+0.035
Lu+0.035
 firefighters+0.035
 angels+0.035
Neg logit contributions
ret-0.025
orse-0.024
orses-0.022
ets-0.020
 peanut-0.020
Token closer
Token ID5699
Feature activation+6.25e-03

Pos logit contributions
ret+0.027
orse+0.027
orses+0.025
 peanut+0.023
ork+0.023
Neg logit contributions
 Lu-0.026
 angels-0.019
Ann-0.019
Lu-0.019
 firefighters-0.019
Token,
Token ID11
Feature activation-1.18e-02

Pos logit contributions
 Lu+0.013
 angels+0.011
Ann+0.011
Lu+0.011
 firefighters+0.011
Neg logit contributions
orse-0.009
ret-0.009
orses-0.009
 peanut-0.008
ets-0.008
Token but
Token ID475
Feature activation+6.37e-03

Pos logit contributions
ret+0.021
orse+0.021
orses+0.019
 peanut+0.018
ocket+0.018
Neg logit contributions
 Lu-0.022
 angels-0.018
Ann-0.017
 firefighters-0.017
Lu-0.017
Token the
Token ID262
Feature activation+9.53e-02

Pos logit contributions
 Lu+0.014
 angels+0.012
Ann+0.012
Lu+0.012
 firefighters+0.011
Neg logit contributions
ret-0.009
orse-0.009
orses-0.008
 peanut-0.007
ets-0.007
Token fog
Token ID19558
Feature activation+1.78e-02

Pos logit contributions
ussy+0.179
whe+0.176
ators+0.170
jack+0.162
beh+0.162
Neg logit contributions
 holes-0.100
 long-0.089
 middle-0.086
 it-0.084
 shell-0.083
Token was
Token ID373
Feature activation+5.37e-02

Pos logit contributions
 Lu+0.037
 angels+0.031
 youths+0.030
Lu+0.030
Ann+0.030
Neg logit contributions
orse-0.026
ret-0.026
orses-0.024
ork-0.022
 peanut-0.022
Token too
Token ID1165
Feature activation+2.65e-02

Pos logit contributions
 Lu+0.109
 youths+0.096
 angels+0.095
Ann+0.095
 Taxi+0.093
Neg logit contributions
orse-0.067
 peanut-0.066
ret-0.062
ifying-0.060
 accidentally-0.059
Token thick
Token ID6546
Feature activation+7.15e-03

Pos logit contributions
 Lu+0.059
 angels+0.053
Ann+0.051
ators+0.049
 youths+0.049
Neg logit contributions
ret-0.033
orse-0.033
 peanut-0.030
orses-0.029
ork-0.029
Token.
Token ID13
Feature activation+4.11e-03

Pos logit contributions
 Lu+0.015
 angels+0.012
Ann+0.012
Lu+0.012
 firefighters+0.012
Neg logit contributions
ret-0.011
orse-0.011
orses-0.010
 peanut-0.009
ets-0.009
Token.
Token ID13
Feature activation+4.23e-03

Pos logit contributions
ret+0.004
orse+0.004
orses+0.004
 peanut+0.004
ets+0.003
Neg logit contributions
 Lu-0.005
 angels-0.004
Ann-0.004
Lu-0.004
 firefighters-0.004
Token (newline)
Token ID198
Feature activation+1.60e-02

Pos logit contributions
 Lu+0.009
 angels+0.008
Ann+0.007
 firefighters+0.007
Lu+0.007
Neg logit contributions
ret-0.006
orse-0.006
orses-0.006
 peanut-0.005
ets-0.005
Token (newline)
Token ID198
Feature activation+5.42e-03

Pos logit contributions
 Lu+0.034
 angels+0.030
 firefighters+0.029
 youths+0.028
Ann+0.028
Neg logit contributions
orse-0.023
ret-0.022
orses-0.021
ork-0.018
 peanut-0.018
TokenSuddenly
Token ID38582
Feature activation+1.82e-02

Pos logit contributions
 Lu+0.012
 angels+0.010
 firefighters+0.009
Ann+0.009
 youths+0.009
Neg logit contributions
orse-0.008
ret-0.008
orses-0.007
 peanut-0.007
ets-0.006
Token,
Token ID11
Feature activation+2.07e-02

Pos logit contributions
 Lu+0.043
Ann+0.038
 angels+0.037
Lu+0.037
co+0.037
Neg logit contributions
orse-0.021
ret-0.021
orses-0.018
 peanut-0.017
ork-0.016
Token the
Token ID262
Feature activation+9.46e-02

Pos logit contributions
 Lu+0.045
Ann+0.040
co+0.040
 angels+0.039
Lu+0.039
Neg logit contributions
ret-0.025
orse-0.025
orses-0.022
 peanut-0.021
sel-0.020
Token fog
Token ID19558
Feature activation+8.43e-03

Pos logit contributions
ussy+0.181
whe+0.175
beh+0.167
ators+0.166
 Lu+0.163
Neg logit contributions
 holes-0.086
 long-0.084
 sudden-0.079
 shell-0.077
 middle-0.077
Token began
Token ID2540
Feature activation+1.46e-02

Pos logit contributions
 Lu+0.018
 angels+0.015
Ann+0.015
Lu+0.015
 youths+0.015
Neg logit contributions
orse-0.012
ret-0.012
orses-0.011
 peanut-0.010
ets-0.010
Token to
Token ID284
Feature activation+3.98e-02

Pos logit contributions
 Lu+0.030
Ann+0.025
 angels+0.025
Lu+0.025
 youths+0.024
Neg logit contributions
orse-0.022
ret-0.022
orses-0.020
 peanut-0.019
ork-0.018
Token move
Token ID1445
Feature activation+2.52e-02

Pos logit contributions
 Lu+0.084
Ann+0.073
Lu+0.072
 youths+0.072
 Barber+0.070
Neg logit contributions
orse-0.052
ret-0.047
 peanut-0.046
orses-0.046
sel-0.040
Token away
Token ID1497
Feature activation+2.49e-02

Pos logit contributions
 Lu+0.055
 angels+0.048
Ann+0.046
Lu+0.045
 Barber+0.045
Neg logit contributions
ret-0.033
orse-0.030
ets-0.028
orses-0.028
 peanut-0.027
Token thought
Token ID1807
Feature activation-1.38e-02

Pos logit contributions
ret+0.001
orse+0.001
orses+0.001
 peanut+0.001
ets+0.001
Neg logit contributions
 Lu-0.001
 angels-0.001
Ann-0.001
Lu-0.001
 firefighters-0.001
Token,
Token ID11
Feature activation-1.36e-02

Pos logit contributions
ret+0.024
orse+0.024
orses+0.022
 peanut+0.021
ets+0.020
Neg logit contributions
 Lu-0.026
 angels-0.021
 firefighters-0.020
Ann-0.020
Lu-0.020
Token and
Token ID290
Feature activation-6.67e-03

Pos logit contributions
ret+0.024
orse+0.024
orses+0.023
 peanut+0.021
ocket+0.021
Neg logit contributions
 Lu-0.025
 angels-0.021
 firefighters-0.020
Ann-0.019
Lu-0.019
Token then
Token ID788
Feature activation+4.41e-03

Pos logit contributions
ret+0.012
orse+0.012
orses+0.011
ets+0.010
 peanut+0.010
Neg logit contributions
 Lu-0.012
 angels-0.010
 firefighters-0.010
Ann-0.010
Lu-0.010
Token the
Token ID262
Feature activation+6.76e-02

Pos logit contributions
 Lu+0.010
 angels+0.008
Ann+0.008
Lu+0.008
 firefighters+0.008
Neg logit contributions
orse-0.006
ret-0.006
orses-0.006
 peanut-0.005
ets-0.005
Token organized
Token ID8389
Feature activation+9.09e-02

Pos logit contributions
 Lu+0.142
ators+0.123
Ann+0.118
lo+0.116
 Stev+0.116
Neg logit contributions
 peanut-0.089
 tomato-0.080
orse-0.077
 pen-0.076
 notebook-0.073
Token friend
Token ID1545
Feature activation-1.50e-02

Pos logit contributions
 Lu+0.155
Michael+0.154
Mark+0.153
Ann+0.152
 mercy+0.151
Neg logit contributions
's-0.105
 tomato-0.099
 mouth-0.093
 peanut-0.092
 picture-0.091
Token had
Token ID550
Feature activation-1.90e-03

Pos logit contributions
ret+0.024
orse+0.023
orses+0.022
ets+0.020
 peanut+0.020
Neg logit contributions
 Lu-0.030
 angels-0.025
 firefighters-0.024
Ann-0.024
Lu-0.024
Token an
Token ID281
Feature activation+4.48e-02

Pos logit contributions
ret+0.003
orse+0.003
orses+0.003
 peanut+0.003
ets+0.003
Neg logit contributions
 Lu-0.004
 angels-0.003
Ann-0.003
Lu-0.003
 firefighters-0.003
Token idea
Token ID2126
Feature activation+6.26e-03

Pos logit contributions
 Lu+0.081
Ann+0.073
 angels+0.070
Lu+0.068
Michael+0.063
Neg logit contributions
ret-0.076
orse-0.076
 peanut-0.072
ets-0.067
ocket-0.066
Token.
Token ID13
Feature activation-5.52e-03

Pos logit contributions
 Lu+0.013
 angels+0.011
Ann+0.010
 firefighters+0.010
Lu+0.010
Neg logit contributions
orse-0.010
ret-0.010
orses-0.009
 peanut-0.008
ets-0.008
Token (newline)
Token ID198
Feature activation-2.41e-02

Pos logit contributions
ret+0.034
orse+0.033
orses+0.031
ets+0.031
 peanut+0.030
Neg logit contributions
 Lu-0.032
Ann-0.025
Lu-0.025
 angels-0.024
 youths-0.023
TokenBen
Token ID11696
Feature activation+4.13e-02

Pos logit contributions
ret+0.044
orse+0.041
ocket+0.038
orses+0.037
 peanut+0.037
Neg logit contributions
 Lu-0.045
Lu-0.039
Ann-0.039
 youths-0.036
 angels-0.035
Token agreed
Token ID4987
Feature activation-1.42e-04

Pos logit contributions
 Lu+0.085
 angels+0.079
 firefighters+0.075
 Taxi+0.073
 taxis+0.072
Neg logit contributions
orse-0.060
ret-0.055
orses-0.051
 peanut-0.050
's-0.049
Token and
Token ID290
Feature activation+2.40e-03

Pos logit contributions
ret+0.000
orse+0.000
orses+0.000
 peanut+0.000
ets+0.000
Neg logit contributions
 Lu-0.000
 angels-0.000
Ann-0.000
Lu-0.000
 firefighters-0.000
Token soon
Token ID2582
Feature activation-1.76e-02

Pos logit contributions
 Lu+0.005
 angels+0.005
Ann+0.005
Lu+0.005
 firefighters+0.005
Neg logit contributions
orse-0.003
ret-0.003
orses-0.003
 peanut-0.003
ets-0.002
Token the
Token ID262
Feature activation+8.94e-02

Pos logit contributions
ret+0.027
orse+0.026
orses+0.025
ets+0.022
ocket+0.022
Neg logit contributions
 Lu-0.037
 angels-0.030
Ann-0.030
Lu-0.029
 firefighters-0.029
Token two
Token ID734
Feature activation+7.80e-02

Pos logit contributions
 Lu+0.147
Ann+0.143
 Tit+0.136
 Kitty+0.135
Lu+0.134
Neg logit contributions
 picture-0.107
 drawing-0.104
's-0.101
 ink-0.099
 face-0.096
Token of
Token ID286
Feature activation+2.32e-02

Pos logit contributions
 Lu+0.144
iger+0.142
 Amen+0.134
asc+0.134
jack+0.133
Neg logit contributions
's-0.098
s-0.096
bs-0.093
orse-0.092
orses-0.090
Token them
Token ID606
Feature activation+3.15e-02

Pos logit contributions
 Lu+0.051
jack+0.044
 angels+0.044
 attendant+0.044
 Taxi+0.043
Neg logit contributions
orse-0.030
ret-0.029
orses-0.027
 peanut-0.025
 notebooks-0.023
Token were
Token ID547
Feature activation+3.86e-02

Pos logit contributions
 Lu+0.067
 angels+0.059
Ann+0.058
 firefighters+0.057
Lu+0.057
Neg logit contributions
orse-0.042
ret-0.040
orses-0.039
ets-0.037
 peanut-0.035
Token walking
Token ID6155
Feature activation+4.24e-02

Pos logit contributions
 Lu+0.075
jack+0.067
Ann+0.067
Lu+0.066
 attendant+0.064
Neg logit contributions
orse-0.057
orses-0.053
ret-0.047
 peanut-0.046
ets-0.045
Token of
Token ID286
Feature activation+1.06e-02

Pos logit contributions
 Lu+0.058
 angels+0.049
Lu+0.047
Ann+0.047
 firefighters+0.045
Neg logit contributions
orse-0.048
ret-0.047
orses-0.045
ets-0.044
 peanut-0.042
Token them
Token ID606
Feature activation-3.57e-03

Pos logit contributions
 Lu+0.021
 angels+0.018
Ann+0.018
Lu+0.017
 firefighters+0.017
Neg logit contributions
ret-0.016
orse-0.016
orses-0.015
 peanut-0.014
ets-0.013
Token.
Token ID13
Feature activation-6.79e-04

Pos logit contributions
ret+0.006
orse+0.006
orses+0.006
 peanut+0.005
ets+0.005
Neg logit contributions
 Lu-0.007
 angels-0.006
Ann-0.005
Lu-0.005
 firefighters-0.005
Token (newline)
Token ID198
Feature activation+1.78e-02

Pos logit contributions
ret+0.001
orse+0.001
orses+0.001
 peanut+0.001
ets+0.001
Neg logit contributions
 Lu-0.001
 angels-0.001
Ann-0.001
Lu-0.001
 firefighters-0.001
Token (newline)
Token ID198
Feature activation+1.04e-02

Pos logit contributions
 Lu+0.038
 angels+0.034
 firefighters+0.033
 youths+0.032
ators+0.032
Neg logit contributions
orse-0.025
ret-0.024
orses-0.022
ork-0.020
 peanut-0.019
TokenThe
Token ID464
Feature activation+8.91e-02

Pos logit contributions
 Lu+0.021
 angels+0.018
 firefighters+0.017
Ann+0.017
 youths+0.017
Neg logit contributions
orse-0.015
ret-0.015
orses-0.014
ork-0.013
 peanut-0.013
Token fog
Token ID19558
Feature activation+2.57e-02

Pos logit contributions
whe+0.158
ussy+0.157
 Lu+0.151
wood+0.149
ators+0.145
Neg logit contributions
 wishes-0.086
 wish-0.084
 bad-0.082
 holes-0.079
 dream-0.079
Token was
Token ID373
Feature activation+3.38e-02

Pos logit contributions
 Lu+0.054
Lu+0.044
 youths+0.044
Ann+0.044
 angels+0.044
Neg logit contributions
orse-0.038
ret-0.037
orses-0.034
ork-0.031
 peanut-0.031
Token gone
Token ID3750
Feature activation+4.21e-03

Pos logit contributions
 Lu+0.072
Ann+0.063
 angels+0.062
 youths+0.062
Lu+0.060
Neg logit contributions
orse-0.043
ret-0.041
 peanut-0.040
orses-0.038
ets-0.035
Token,
Token ID11
Feature activation-5.26e-03

Pos logit contributions
 Lu+0.009
 angels+0.007
Ann+0.007
Lu+0.007
 firefighters+0.007
Neg logit contributions
ret-0.006
orse-0.006
orses-0.006
 peanut-0.005
ets-0.005
Token and
Token ID290
Feature activation+9.73e-03

Pos logit contributions
ret+0.009
orse+0.009
orses+0.008
 peanut+0.008
ets+0.007
Neg logit contributions
 Lu-0.010
 angels-0.008
Ann-0.008
 firefighters-0.008
Lu-0.008
Token was
Token ID373
Feature activation+3.72e-02

Pos logit contributions
 Lu+0.042
 angels+0.037
Ann+0.035
 firefighters+0.035
 youths+0.034
Neg logit contributions
orse-0.029
ret-0.028
orses-0.025
 peanut-0.024
sel-0.022
Token surprised
Token ID6655
Feature activation-4.79e-03

Pos logit contributions
 Lu+0.082
 angels+0.074
ators+0.073
lo+0.071
Ann+0.071
Neg logit contributions
orse-0.046
ret-0.039
orses-0.037
 peanut-0.037
's-0.036
Token and
Token ID290
Feature activation-4.82e-03

Pos logit contributions
ret+0.008
orse+0.008
orses+0.008
 peanut+0.007
ets+0.007
Neg logit contributions
 Lu-0.009
 angels-0.008
Ann-0.007
Lu-0.007
 firefighters-0.007
Token wondered
Token ID14028
Feature activation+2.56e-02

Pos logit contributions
ret+0.009
orse+0.008
orses+0.008
 peanut+0.007
ets+0.007
Neg logit contributions
 Lu-0.009
 angels-0.007
Ann-0.007
Lu-0.007
 firefighters-0.007
Token how
Token ID703
Feature activation+1.83e-02

Pos logit contributions
 Lu+0.057
Ann+0.050
Lu+0.048
 angels+0.048
 youths+0.048
Neg logit contributions
ret-0.033
orse-0.032
orses-0.028
 peanut-0.026
ocket-0.025
Token the
Token ID262
Feature activation+8.89e-02

Pos logit contributions
 Lu+0.037
Ann+0.031
 angels+0.031
ators+0.030
co+0.030
Neg logit contributions
ret-0.027
orse-0.027
orses-0.025
 peanut-0.023
ets-0.022
Token all
Token ID477
Feature activation+6.76e-02

Pos logit contributions
 Lu+0.184
Ann+0.184
co+0.174
Michael+0.171
ators+0.167
Neg logit contributions
 peanut-0.083
 photos-0.080
 shrimp-0.079
 bean-0.069
 photo-0.069
Tokenigator
Token ID23823
Feature activation+1.49e-02

Pos logit contributions
 Lu+0.101
co+0.097
Michael+0.086
ators+0.085
 angels+0.085
Neg logit contributions
's-0.110
 peanut-0.108
ets-0.106
ret-0.105
s-0.103
Token could
Token ID714
Feature activation+3.80e-02

Pos logit contributions
 Lu+0.031
 angels+0.026
Ann+0.025
Lu+0.025
 firefighters+0.024
Neg logit contributions
orse-0.023
ret-0.022
orses-0.020
 peanut-0.020
ets-0.019
Token talk
Token ID1561
Feature activation+1.69e-02

Pos logit contributions
 Lu+0.083
ators+0.069
Ann+0.069
 angels+0.068
Lu+0.068
Neg logit contributions
orse-0.053
ret-0.049
orses-0.046
 peanut-0.044
ork-0.043
Token.
Token ID13
Feature activation+4.50e-03

Pos logit contributions
 Lu+0.035
Ann+0.029
 angels+0.029
Lu+0.028
 firefighters+0.028
Neg logit contributions
ret-0.025
orse-0.024
orses-0.022
ets-0.022
 peanut-0.021
Token today
Token ID1909
Feature activation+1.40e-02

Pos logit contributions
ret+0.025
orse+0.024
orses+0.023
 peanut+0.021
ets+0.021
Neg logit contributions
 Lu-0.020
 angels-0.016
 firefighters-0.015
Ann-0.014
Lu-0.014
Token?"
Token ID1701
Feature activation+5.22e-03

Pos logit contributions
 Lu+0.026
 angels+0.021
Ann+0.021
Lu+0.020
 firefighters+0.020
Neg logit contributions
orse-0.024
ret-0.024
orses-0.022
 peanut-0.022
ets-0.021
Token (newline)
Token ID198
Feature activation+7.13e-03

Pos logit contributions
 Lu+0.011
 angels+0.009
Ann+0.009
 firefighters+0.009
Lu+0.009
Neg logit contributions
orse-0.008
ret-0.008
orses-0.007
 peanut-0.007
ets-0.006
Token (newline)
Token ID198
Feature activation+1.17e-03

Pos logit contributions
 Lu+0.015
 angels+0.013
Ann+0.012
 firefighters+0.012
Lu+0.012
Neg logit contributions
orse-0.011
ret-0.011
orses-0.010
 peanut-0.009
ork-0.009
TokenMr
Token ID5246
Feature activation+5.87e-02

Pos logit contributions
 Lu+0.002
 angels+0.002
Ann+0.002
 firefighters+0.002
Lu+0.002
Neg logit contributions
ret-0.002
orse-0.002
orses-0.002
 peanut-0.002
ets-0.002
Token.
Token ID13
Feature activation+8.88e-02

Pos logit contributions
 Lu+0.139
Ann+0.122
 youths+0.122
 firefighters+0.122
 angels+0.122
Neg logit contributions
orse-0.058
ret-0.054
 peanut-0.049
orses-0.048
ork-0.046
Token Sun
Token ID3825
Feature activation+4.67e-02

Pos logit contributions
guard+0.118
aid+0.117
 youths+0.116
co+0.115
 mercy+0.112
Neg logit contributions
orse-0.145
 T-0.143
 This-0.142
orses-0.131
ret-0.123
Token smiled
Token ID13541
Feature activation+1.36e-02

Pos logit contributions
 Lu+0.097
 youths+0.085
 angels+0.084
 firefighters+0.084
Ann+0.083
Neg logit contributions
orse-0.067
ret-0.059
orses-0.058
 peanut-0.056
's-0.051
Token and
Token ID290
Feature activation+1.25e-02

Pos logit contributions
 Lu+0.031
Ann+0.027
 angels+0.026
Lu+0.026
 firefighters+0.025
Neg logit contributions
orse-0.018
ret-0.018
orses-0.016
 peanut-0.016
ets-0.015
Token said
Token ID531
Feature activation+8.82e-03

Pos logit contributions
 Lu+0.027
 angels+0.023
Ann+0.023
Lu+0.022
 firefighters+0.022
Neg logit contributions
orse-0.017
ret-0.017
orses-0.016
 peanut-0.015
ets-0.014
Token,
Token ID11
Feature activation-2.65e-03

Pos logit contributions
 Lu+0.020
Ann+0.017
 angels+0.017
Lu+0.017
 firefighters+0.017
Neg logit contributions
ret-0.011
orse-0.011
orses-0.010
 peanut-0.010
ets-0.009
Token!
Token ID0
Feature activation-5.63e-03

Pos logit contributions
ret+0.011
orse+0.011
orses+0.010
 peanut+0.010
ets+0.009
Neg logit contributions
 Lu-0.013
 angels-0.011
Ann-0.011
 firefighters-0.010
Lu-0.010
Token He
Token ID679
Feature activation+2.65e-02

Pos logit contributions
ret+0.010
orse+0.010
orses+0.009
 peanut+0.009
ets+0.008
Neg logit contributions
 Lu-0.010
 angels-0.008
Ann-0.008
Lu-0.008
 firefighters-0.008
Token even
Token ID772
Feature activation+6.18e-03

Pos logit contributions
 Lu+0.058
 angels+0.051
 firefighters+0.049
 Taxi+0.048
 attendant+0.048
Neg logit contributions
orse-0.037
ret-0.034
orses-0.034
 peanut-0.031
ets-0.029
Token showed
Token ID3751
Feature activation-1.21e-02

Pos logit contributions
 Lu+0.013
 angels+0.011
Ann+0.011
 firefighters+0.010
Lu+0.010
Neg logit contributions
orse-0.009
ret-0.009
orses-0.009
 peanut-0.008
ets-0.008
Token his
Token ID465
Feature activation+5.48e-02

Pos logit contributions
orse+0.023
ret+0.022
orses+0.022
 peanut+0.020
ocket+0.019
Neg logit contributions
 Lu-0.021
 angels-0.017
 firefighters-0.016
Ann-0.016
Lu-0.016
Token new
Token ID649
Feature activation+8.86e-02

Pos logit contributions
 Lu+0.106
Lu+0.092
ators+0.089
Ann+0.088
guard+0.084
Neg logit contributions
orse-0.080
 mouth-0.076
orses-0.076
 peanut-0.074
ret-0.072
Token friends
Token ID2460
Feature activation-1.44e-02

Pos logit contributions
 Lu+0.167
Ann+0.154
Lu+0.149
ators+0.143
K+0.142
Neg logit contributions
 drawing-0.112
 mouth-0.111
 tomato-0.110
's-0.110
orse-0.104
Token how
Token ID703
Feature activation+1.76e-02

Pos logit contributions
ret+0.027
orse+0.027
orses+0.025
 peanut+0.024
ets+0.023
Neg logit contributions
 Lu-0.025
 angels-0.020
Ann-0.019
Lu-0.018
 youths-0.018
Token to
Token ID284
Feature activation+6.41e-02

Pos logit contributions
 Lu+0.036
Ann+0.030
 angels+0.030
 attendant+0.030
Lu+0.029
Neg logit contributions
orse-0.026
ret-0.026
orses-0.025
 peanut-0.022
ets-0.022
Token do
Token ID466
Feature activation+9.16e-03

Pos logit contributions
 Lu+0.139
 firefighters+0.128
 angels+0.126
 department+0.124
 Taxi+0.124
Neg logit contributions
orse-0.078
orses-0.069
ets-0.063
ret-0.063
 scored-0.055
Token them
Token ID606
Feature activation+4.02e-02

Pos logit contributions
 Lu+0.020
 angels+0.017
Ann+0.017
Lu+0.016
 firefighters+0.016
Neg logit contributions
orse-0.013
ret-0.013
orses-0.012
 peanut-0.011
ets-0.011
Token!
Token ID0
Feature activation-2.02e-02

Pos logit contributions
ret+0.013
orse+0.013
orses+0.012
 peanut+0.011
ets+0.011
Neg logit contributions
 Lu-0.013
 angels-0.011
Ann-0.010
Lu-0.010
 firefighters-0.010
Token (space)
Token ID220
Feature activation+2.66e-03

Pos logit contributions
ret+0.032
orse+0.032
orses+0.030
ets+0.028
 peanut+0.027
Neg logit contributions
 Lu-0.043
 Lucy-0.033
Lu-0.033
 attendant-0.033
Ann-0.033
Token (newline)
Token ID198
Feature activation-9.26e-04

Pos logit contributions
 Lu+0.005
 angels+0.004
Ann+0.004
Lu+0.004
 firefighters+0.004
Neg logit contributions
ret-0.004
orse-0.004
orses-0.004
 peanut-0.004
ets-0.003
Token (newline)
Token ID198
Feature activation-7.96e-03

Pos logit contributions
ret+0.002
orse+0.001
orses+0.001
 peanut+0.001
ets+0.001
Neg logit contributions
 Lu-0.002
 angels-0.002
Ann-0.001
Lu-0.001
 firefighters-0.001
TokenThe
Token ID464
Feature activation+6.85e-02

Pos logit contributions
ret+0.014
orse+0.013
orses+0.013
 peanut+0.012
ets+0.012
Neg logit contributions
 Lu-0.015
Lu-0.013
Ann-0.013
 angels-0.012
 firefighters-0.012
Token new
Token ID649
Feature activation+8.84e-02

Pos logit contributions
 Lu+0.155
Ann+0.152
Michael+0.137
co+0.135
Kim+0.134
Neg logit contributions
 peanut-0.067
 shrimp-0.066
 photos-0.059
 bean-0.057
 net-0.053
Token all
Token ID477
Feature activation+9.57e-03

Pos logit contributions
Ann+0.191
 Lu+0.188
Michael+0.178
Kim+0.175
co+0.172
Neg logit contributions
 shrimp-0.079
 shell-0.073
 peanut-0.073
 net-0.073
s-0.072
Tokenigator
Token ID23823
Feature activation-2.66e-02

Pos logit contributions
 Lu+0.010
 angels+0.007
Ann+0.006
Lu+0.006
 firefighters+0.006
Neg logit contributions
orse-0.024
ret-0.024
orses-0.023
 peanut-0.022
ets-0.022
Token said
Token ID531
Feature activation+8.53e-04

Pos logit contributions
ret+0.047
orse+0.044
orses+0.043
ets+0.041
 notebooks+0.040
Neg logit contributions
 Lu-0.048
 angels-0.038
 attendant-0.038
 firefighters-0.037
Ann-0.037
Token,
Token ID11
Feature activation-5.55e-03

Pos logit contributions
 Lu+0.002
 angels+0.002
Ann+0.002
Lu+0.002
 firefighters+0.002
Neg logit contributions
ret-0.001
orse-0.001
orses-0.001
 peanut-0.001
ets-0.001
Token "
Token ID366
Feature activation+2.13e-03

Pos logit contributions
ret+0.010
orse+0.010
orses+0.009
 peanut+0.009
ets+0.009
Neg logit contributions
 Lu-0.010
 angels-0.008
Ann-0.008
Lu-0.008
 firefighters-0.008
Token nodded
Token ID14464
Feature activation+5.14e-03

Pos logit contributions
 Lu+0.026
 angels+0.022
 firefighters+0.021
Ann+0.021
Lu+0.021
Neg logit contributions
orse-0.019
ret-0.019
orses-0.017
 peanut-0.016
ets-0.016
Token and
Token ID290
Feature activation+1.09e-02

Pos logit contributions
 Lu+0.011
 angels+0.009
Ann+0.009
Lu+0.009
 firefighters+0.009
Neg logit contributions
orse-0.008
ret-0.008
orses-0.007
 peanut-0.007
ets-0.006
Token watched
Token ID7342
Feature activation-4.71e-03

Pos logit contributions
 Lu+0.022
 angels+0.018
Ann+0.018
 firefighters+0.018
Lu+0.018
Neg logit contributions
orse-0.017
ret-0.017
orses-0.016
 peanut-0.015
ets-0.014
Token carefully
Token ID7773
Feature activation-6.85e-03

Pos logit contributions
ret+0.007
orse+0.007
orses+0.007
 peanut+0.006
ets+0.006
Neg logit contributions
 Lu-0.010
 angels-0.008
Ann-0.008
 firefighters-0.008
Lu-0.008
Token as
Token ID355
Feature activation+1.19e-02

Pos logit contributions
ret+0.011
orse+0.011
orses+0.010
 peanut+0.010
ets+0.009
Neg logit contributions
 Lu-0.013
 angels-0.011
Ann-0.011
 firefighters-0.010
Lu-0.010
Token the
Token ID262
Feature activation+8.84e-02

Pos logit contributions
 Lu+0.025
Ann+0.021
 angels+0.021
Lu+0.021
co+0.021
Neg logit contributions
ret-0.017
orse-0.017
orses-0.015
 peanut-0.014
ork-0.014
Token grown
Token ID7334
Feature activation+3.21e-02

Pos logit contributions
 Johnson+0.145
 Jenkins+0.142
 Tit+0.141
girls+0.141
 department+0.140
Neg logit contributions
 picture-0.098
's-0.094
 drawing-0.093
 photo-0.090
 paper-0.089
Token up
Token ID510
Feature activation+2.51e-03

Pos logit contributions
 Lu+0.065
Ann+0.056
 Taxi+0.055
Lu+0.055
 youths+0.054
Neg logit contributions
ret-0.047
orse-0.044
sel-0.041
orses-0.041
ets-0.039
Token stretched
Token ID19110
Feature activation-7.31e-03

Pos logit contributions
 Lu+0.005
 angels+0.004
Ann+0.004
Lu+0.004
 firefighters+0.004
Neg logit contributions
ret-0.004
orse-0.004
orses-0.004
 peanut-0.003
ets-0.003
Token her
Token ID607
Feature activation+2.60e-02

Pos logit contributions
orse+0.012
ret+0.012
orses+0.011
 peanut+0.011
ets+0.010
Neg logit contributions
 Lu-0.014
 angels-0.011
Ann-0.011
Lu-0.011
 firefighters-0.011
Token arms
Token ID5101
Feature activation-1.09e-02

Pos logit contributions
 Lu+0.056
Ann+0.050
 firefighters+0.049
Lu+0.049
 youths+0.048
Neg logit contributions
ret-0.035
orse-0.033
orses-0.030
ork-0.029
 peanut-0.028
Token a
Token ID257
Feature activation-6.74e-03

Pos logit contributions
ret+0.010
orse+0.009
orses+0.009
 peanut+0.008
ets+0.008
Neg logit contributions
 Lu-0.010
 angels-0.008
Ann-0.008
 firefighters-0.008
Lu-0.008
Token little
Token ID1310
Feature activation+1.61e-02

Pos logit contributions
ret+0.010
orse+0.010
orses+0.009
ets+0.009
 peanut+0.008
Neg logit contributions
 Lu-0.014
 angels-0.012
Lu-0.011
Ann-0.011
 firefighters-0.011
Token girl
Token ID2576
Feature activation-2.29e-02

Pos logit contributions
 Lu+0.025
Ann+0.020
 angels+0.019
Lu+0.018
 firefighters+0.018
Neg logit contributions
orse-0.033
ret-0.032
 peanut-0.031
orses-0.030
ets-0.027
Token named
Token ID3706
Feature activation-1.20e-02

Pos logit contributions
ret+0.041
orse+0.037
orses+0.036
ets+0.035
sel+0.034
Neg logit contributions
 Lu-0.044
Lu-0.035
 angels-0.035
Ann-0.035
co-0.033
Token Lu
Token ID6026
Feature activation+2.28e-02

Pos logit contributions
orse+0.023
ret+0.022
orses+0.021
 peanut+0.020
sel+0.019
Neg logit contributions
 Lu-0.022
Lu-0.017
 angels-0.017
Ann-0.017
 Lucy-0.016
Token.
Token ID13
Feature activation+8.81e-02

Pos logit contributions
 Lu+0.039
 angels+0.035
 firefighters+0.035
 youths+0.034
 attendant+0.032
Neg logit contributions
orse-0.041
ret-0.040
orses-0.037
 peanut-0.033
ork-0.033
Token She
Token ID1375
Feature activation-1.21e-02

Pos logit contributions
guard+0.207
 helpers+0.201
 firefighters+0.200
 mercy+0.199
 fighters+0.199
Neg logit contributions
 This-0.080
 Its-0.067
s-0.067
 It-0.067
's-0.059
Token was
Token ID373
Feature activation+1.72e-02

Pos logit contributions
ret+0.023
orse+0.022
orses+0.021
ets+0.020
 peanut+0.020
Neg logit contributions
 Lu-0.021
Lu-0.017
 angels-0.017
Ann-0.017
 firefighters-0.016
Token three
Token ID1115
Feature activation+2.46e-02

Pos logit contributions
 Lu+0.036
Ann+0.031
Lu+0.030
 angels+0.030
 attendant+0.029
Neg logit contributions
orse-0.024
ret-0.024
 peanut-0.022
orses-0.022
ocket-0.019
Token years
Token ID812
Feature activation+2.46e-02

Pos logit contributions
 Lu+0.056
 attendant+0.048
 angels+0.047
Ann+0.047
 Taxi+0.046
Neg logit contributions
orse-0.031
ret-0.030
orses-0.029
 peanut-0.028
ets-0.026
Token old
Token ID1468
Feature activation+3.75e-02

Pos logit contributions
 Lu+0.047
Ann+0.041
 angels+0.039
Lu+0.038
ators+0.038
Neg logit contributions
orse-0.039
ret-0.039
orses-0.036
 peanut-0.036
ork-0.033
Token bright
Token ID6016
Feature activation+3.60e-02

Pos logit contributions
 Lu+0.035
 angels+0.030
Ann+0.029
 youths+0.028
Lu+0.028
Neg logit contributions
orse-0.026
ret-0.025
orses-0.024
 peanut-0.023
ets-0.022
Token fan
Token ID4336
Feature activation+7.63e-03

Pos logit contributions
 Lu+0.076
Ann+0.067
 angels+0.065
 youths+0.064
Lu+0.063
Neg logit contributions
orse-0.049
 peanut-0.047
ret-0.046
orses-0.042
ork-0.040
Token.
Token ID13
Feature activation+4.77e-03

Pos logit contributions
 Lu+0.016
 angels+0.013
Ann+0.013
Lu+0.013
 firefighters+0.012
Neg logit contributions
orse-0.012
ret-0.012
orses-0.011
 peanut-0.010
ets-0.010
Token Joe
Token ID5689
Feature activation+8.28e-03

Pos logit contributions
 Lu+0.009
 angels+0.008
Ann+0.007
 firefighters+0.007
Lu+0.007
Neg logit contributions
orse-0.008
ret-0.008
orses-0.007
 peanut-0.007
ets-0.007
Token wanted
Token ID2227
Feature activation+3.60e-02

Pos logit contributions
 Lu+0.018
 angels+0.015
Ann+0.015
 firefighters+0.014
Lu+0.014
Neg logit contributions
orse-0.012
ret-0.012
orses-0.011
 peanut-0.010
ets-0.010
Token the
Token ID262
Feature activation+8.74e-02

Pos logit contributions
 Lu+0.081
 youths+0.071
 attendant+0.071
 angels+0.071
Ann+0.070
Neg logit contributions
orse-0.040
 peanut-0.040
ret-0.037
orses-0.034
ork-0.032
Token fan
Token ID4336
Feature activation+3.51e-02

Pos logit contributions
 Lu+0.138
Ann+0.135
 Jenkins+0.131
jack+0.130
Lu+0.127
Neg logit contributions
's-0.119
 picture-0.113
 drawing-0.105
 may-0.102
ret-0.101
Token too
Token ID1165
Feature activation+1.11e-02

Pos logit contributions
 Lu+0.077
Ann+0.068
Lu+0.066
 angels+0.064
 firefighters+0.063
Neg logit contributions
ret-0.045
orse-0.044
 peanut-0.042
orses-0.039
ets-0.038
Token,
Token ID11
Feature activation-5.52e-03

Pos logit contributions
 Lu+0.025
 angels+0.022
Ann+0.021
 firefighters+0.021
Lu+0.021
Neg logit contributions
ret-0.014
orse-0.014
orses-0.013
 peanut-0.012
ets-0.012
Token and
Token ID290
Feature activation+2.18e-02

Pos logit contributions
ret+0.009
orse+0.009
orses+0.009
 peanut+0.008
ets+0.008
Neg logit contributions
 Lu-0.011
 angels-0.009
Ann-0.008
 firefighters-0.008
Lu-0.008
Token so
Token ID523
Feature activation-3.44e-03

Pos logit contributions
 Lu+0.045
 angels+0.039
Ann+0.038
 Taxi+0.037
Lu+0.037
Neg logit contributions
orse-0.032
ret-0.029
orses-0.028
 peanut-0.027
ets-0.025
TokenCan
Token ID6090
Feature activation+2.51e-02

Pos logit contributions
 Lu+0.080
 angels+0.072
 firefighters+0.068
 attendant+0.066
jack+0.066
Neg logit contributions
 peanut-0.047
orse-0.046
ret-0.045
 scored-0.043
orses-0.043
Token I
Token ID314
Feature activation+2.48e-02

Pos logit contributions
 Lu+0.066
 angels+0.059
 firefighters+0.058
Ann+0.057
 youths+0.057
Neg logit contributions
ret-0.021
orse-0.021
orses-0.017
 peanut-0.016
ork-0.014
Token use
Token ID779
Feature activation-9.53e-03

Pos logit contributions
 Lu+0.054
 angels+0.048
Ann+0.045
Lu+0.045
 firefighters+0.045
Neg logit contributions
orse-0.035
ret-0.033
orses-0.030
 peanut-0.029
ets-0.027
Token it
Token ID340
Feature activation+2.46e-02

Pos logit contributions
orse+0.013
ret+0.013
orses+0.012
 peanut+0.011
ets+0.011
Neg logit contributions
 Lu-0.021
 angels-0.018
Ann-0.017
 firefighters-0.017
Lu-0.017
Token?"
Token ID1701
Feature activation+4.90e-02

Pos logit contributions
 Lu+0.052
Ann+0.043
 angels+0.043
Lu+0.042
 youths+0.042
Neg logit contributions
ret-0.036
orse-0.035
 peanut-0.032
orses-0.032
ets-0.030
Token asked
Token ID1965
Feature activation+8.69e-02

Pos logit contributions
ators+0.061
 angels+0.059
 Lu+0.056
co+0.055
 department+0.054
Neg logit contributions
orse-0.096
 peanut-0.089
bs-0.088
's-0.088
 This-0.086
Token Lucy
Token ID22162
Feature activation+1.53e-02

Pos logit contributions
co+0.133
ators+0.132
iddle+0.132
 fighters+0.129
 sub+0.123
Neg logit contributions
's-0.120
bs-0.112
 it-0.110
 peanut-0.109
s-0.108
Token.
Token ID13
Feature activation+3.19e-02

Pos logit contributions
 Lu+0.032
 angels+0.028
Ann+0.027
 firefighters+0.027
Lu+0.027
Neg logit contributions
orse-0.022
ret-0.022
orses-0.020
 peanut-0.019
ets-0.018
Token
Token ID198
Feature activation+2.76e-02

Pos logit contributions
 Lu+0.073
 angels+0.070
ators+0.068
 firefighters+0.067
 youths+0.066
Neg logit contributions
orse-0.034
ret-0.031
orses-0.027
 This-0.027
ork-0.027
Token
Token ID198
Feature activation+3.67e-02

Pos logit contributions
 Lu+0.062
 angels+0.058
 firefighters+0.057
 youths+0.055
ators+0.054
Neg logit contributions
ret-0.033
orse-0.032
orses-0.029
ork-0.028
 peanut-0.024
Token"
Token ID1
Feature activation+1.99e-02

Pos logit contributions
 Lu+0.089
 angels+0.084
 firefighters+0.082
ators+0.081
 youths+0.080
Neg logit contributions
orse-0.029
ork-0.028
's-0.027
ret-0.026
orses-0.024
Token was
Token ID373
Feature activation+3.88e-02

Pos logit contributions
 Lu+0.013
 angels+0.011
Ann+0.011
 firefighters+0.011
Lu+0.011
Neg logit contributions
orse-0.008
ret-0.008
orses-0.008
 peanut-0.007
ets-0.007
Token excited
Token ID6568
Feature activation-8.90e-03

Pos logit contributions
 Lu+0.089
Ann+0.082
lo+0.080
 angels+0.079
co+0.078
Neg logit contributions
orse-0.049
ret-0.041
 peanut-0.038
orses-0.038
ocket-0.035
Token to
Token ID284
Feature activation+1.63e-02

Pos logit contributions
ret+0.017
orse+0.017
orses+0.016
 peanut+0.014
ocket+0.014
Neg logit contributions
 Lu-0.016
 angels-0.013
Lu-0.012
 firefighters-0.012
Ann-0.012
Token have
Token ID423
Feature activation+4.72e-03

Pos logit contributions
 Lu+0.036
 angels+0.032
Ann+0.031
Lu+0.031
co+0.030
Neg logit contributions
orse-0.023
ret-0.021
orses-0.019
 peanut-0.018
ets-0.017
Token a
Token ID257
Feature activation+3.85e-02

Pos logit contributions
 Lu+0.009
 angels+0.008
Ann+0.008
Lu+0.007
 firefighters+0.007
Neg logit contributions
orse-0.008
ret-0.008
orses-0.007
 peanut-0.007
ets-0.006
Token new
Token ID649
Feature activation+8.69e-02

Pos logit contributions
 Lu+0.087
Ann+0.081
ators+0.078
lo+0.078
co+0.077
Neg logit contributions
orse-0.042
 peanut-0.042
ret-0.041
ork-0.038
 pen-0.036
Token one
Token ID530
Feature activation+1.51e-02

Pos logit contributions
Ann+0.179
lo+0.177
Andy+0.167
 Lu+0.166
co+0.165
Neg logit contributions
's-0.095
 folder-0.094
 pen-0.094
 notebook-0.093
 peanut-0.088
Token.
Token ID13
Feature activation-5.19e-04

Pos logit contributions
 Lu+0.033
Ann+0.028
 angels+0.028
 firefighters+0.027
Lu+0.027
Neg logit contributions
orse-0.021
ret-0.021
orses-0.019
 peanut-0.019
ets-0.018
Token "
Token ID366
Feature activation+6.13e-03

Pos logit contributions
ret+0.001
orse+0.001
orses+0.001
 peanut+0.001
ets+0.001
Neg logit contributions
 Lu-0.001
 angels-0.001
Ann-0.001
Lu-0.001
 firefighters-0.001
TokenI
Token ID40
Feature activation-1.86e-03

Pos logit contributions
 Lu+0.010
 angels+0.008
 firefighters+0.008
Ann+0.008
 youths+0.007
Neg logit contributions
orse-0.012
ret-0.012
orses-0.011
 peanut-0.011
ets-0.010
Token hope
Token ID2911
Feature activation-1.44e-03

Pos logit contributions
ret+0.003
orse+0.003
orses+0.003
 peanut+0.003
ets+0.003
Neg logit contributions
 Lu-0.004
 angels-0.003
Ann-0.003
Lu-0.003
 firefighters-0.003
Tokenable
Token ID540
Feature activation+1.98e-02

Pos logit contributions
orse+0.028
ret+0.027
orses+0.026
 peanut+0.024
ocket+0.023
Neg logit contributions
 Lu-0.024
Ann-0.019
Lu-0.019
 angels-0.019
 youths-0.019
Token.
Token ID13
Feature activation-7.45e-03

Pos logit contributions
 Lu+0.044
Ann+0.037
Lu+0.036
 angels+0.036
 firefighters+0.035
Neg logit contributions
ret-0.027
orse-0.026
orses-0.024
ets-0.023
 peanut-0.023
Token And
Token ID843
Feature activation-1.72e-02

Pos logit contributions
ret+0.013
orse+0.013
orses+0.012
 peanut+0.011
ets+0.011
Neg logit contributions
 Lu-0.014
 angels-0.012
Ann-0.011
 firefighters-0.011
Lu-0.011
Token they
Token ID484
Feature activation+3.39e-02

Pos logit contributions
ret+0.027
orse+0.025
orses+0.024
ork+0.023
ets+0.022
Neg logit contributions
 Lu-0.037
 angels-0.031
 firefighters-0.029
 youths-0.029
Ann-0.029
Token liked
Token ID8288
Feature activation+1.37e-02

Pos logit contributions
 Lu+0.076
 angels+0.067
Ann+0.064
 firefighters+0.064
 Taxi+0.063
Neg logit contributions
orse-0.043
ret-0.042
ets-0.038
 peanut-0.038
orses-0.037
Token the
Token ID262
Feature activation+8.67e-02

Pos logit contributions
 Lu+0.027
Ann+0.023
Lu+0.022
 angels+0.022
 attendant+0.022
Neg logit contributions
orse-0.021
ret-0.021
orses-0.020
 peanut-0.020
 notebooks-0.018
Token fog
Token ID19558
Feature activation+3.27e-02

Pos logit contributions
Andy+0.155
Lu+0.151
Ann+0.148
jack+0.147
 Lu+0.146
Neg logit contributions
 seas-0.103
 shell-0.101
 picture-0.094
 photo-0.093
 net-0.092
Token.
Token ID13
Feature activation-1.85e-02

Pos logit contributions
 Lu+0.071
Lu+0.060
Ann+0.060
Michael+0.057
 youths+0.056
Neg logit contributions
ret-0.045
orse-0.045
orses-0.041
ets-0.040
 peanut-0.039
Token It
Token ID632
Feature activation+4.34e-02

Pos logit contributions
ret+0.034
orse+0.032
orses+0.030
 peanut+0.029
ets+0.029
Neg logit contributions
 Lu-0.034
 angels-0.026
 firefighters-0.026
Ann-0.026
 youths-0.026
Token made
Token ID925
Feature activation-1.72e-02

Pos logit contributions
 Lu+0.103
 youths+0.092
 Taxi+0.092
 angels+0.090
 taxis+0.090
Neg logit contributions
orse-0.052
 peanut-0.042
orses-0.041
ret-0.041
's-0.041
Token their
Token ID511
Feature activation+1.47e-02

Pos logit contributions
orse+0.032
ret+0.032
orses+0.030
ocket+0.028
 peanut+0.028
Neg logit contributions
 Lu-0.030
 angels-0.026
 firefighters-0.024
Ann-0.023
 youths-0.022
Token.
Token ID13
Feature activation+2.46e-02

Pos logit contributions
 Lu+0.021
Ann+0.017
Lu+0.017
 angels+0.017
 firefighters+0.017
Neg logit contributions
ret-0.015
orse-0.015
orses-0.013
 peanut-0.012
ets-0.012
Token "
Token ID366
Feature activation-2.92e-03

Pos logit contributions
 Lu+0.051
 angels+0.046
ators+0.045
Ann+0.044
co+0.043
Neg logit contributions
orse-0.032
ret-0.030
orses-0.028
 peanut-0.027
ork-0.027
TokenDo
Token ID5211
Feature activation-2.88e-02

Pos logit contributions
ret+0.005
orse+0.005
orses+0.005
 peanut+0.005
ets+0.005
Neg logit contributions
 Lu-0.005
 angels-0.004
Ann-0.004
Lu-0.004
 firefighters-0.004
Token you
Token ID345
Feature activation-5.98e-03

Pos logit contributions
orse+0.035
ret+0.035
orses+0.031
sel+0.030
 peanut+0.029
Neg logit contributions
 Lu-0.067
 angels-0.061
Ann-0.059
Lu-0.058
 Lucy-0.056
Token like
Token ID588
Feature activation+3.86e-02

Pos logit contributions
ret+0.009
orse+0.009
orses+0.008
 peanut+0.008
ork+0.008
Neg logit contributions
 Lu-0.012
 angels-0.010
Ann-0.010
Lu-0.010
 firefighters-0.010
Token the
Token ID262
Feature activation+8.65e-02

Pos logit contributions
 Lu+0.093
jack+0.090
 attendant+0.086
Ann+0.081
 firefighters+0.081
Neg logit contributions
ret-0.032
 peanut-0.030
orse-0.030
orses-0.028
 drawing-0.028
Token stairs
Token ID16046
Feature activation+1.37e-02

Pos logit contributions
Andy+0.177
 Lu+0.169
jack+0.166
Michael+0.163
co+0.159
Neg logit contributions
 folder-0.100
 notebook-0.095
 picture-0.091
 drawing-0.086
 long-0.085
Token?"
Token ID1701
Feature activation+1.18e-02

Pos logit contributions
 Lu+0.026
 angels+0.021
Ann+0.021
Lu+0.020
 firefighters+0.020
Neg logit contributions
ret-0.023
orse-0.023
orses-0.021
ets-0.021
 peanut-0.020
Token
Token ID198
Feature activation+1.06e-02

Pos logit contributions
 Lu+0.025
 angels+0.022
 firefighters+0.021
Ann+0.021
ators+0.021
Neg logit contributions
orse-0.017
ret-0.016
orses-0.015
 peanut-0.014
ets-0.013
Token
Token ID198
Feature activation+6.30e-03

Pos logit contributions
 Lu+0.022
 angels+0.019
 firefighters+0.019
Ann+0.018
 youths+0.018
Neg logit contributions
ret-0.015
orse-0.015
orses-0.014
 peanut-0.012
ork-0.012
Token"
Token ID1
Feature activation+3.65e-02

Pos logit contributions
 Lu+0.012
 angels+0.010
 firefighters+0.010
Ann+0.009
 attendant+0.009
Neg logit contributions
orse-0.010
ret-0.010
orses-0.010
 peanut-0.009
ets-0.009
Token
Token ID198
Feature activation+2.45e-02

Pos logit contributions
 Lu+0.075
 angels+0.067
 firefighters+0.066
 youths+0.064
 attendant+0.062
Neg logit contributions
orse-0.043
ret-0.041
orses-0.038
ork-0.034
 peanut-0.031
TokenOne
Token ID3198
Feature activation+1.47e-03

Pos logit contributions
 Lu+0.049
 angels+0.042
 firefighters+0.040
 attendant+0.040
 youths+0.040
Neg logit contributions
orse-0.038
orses-0.036
ret-0.035
ork-0.032
 peanut-0.031
Token day
Token ID1110
Feature activation+9.45e-04

Pos logit contributions
 Lu+0.003
 angels+0.002
Ann+0.002
 firefighters+0.002
Lu+0.002
Neg logit contributions
orse-0.003
ret-0.003
orses-0.002
 peanut-0.002
ets-0.002
Token,
Token ID11
Feature activation+1.59e-02

Pos logit contributions
 Lu+0.002
 angels+0.002
Ann+0.002
Lu+0.002
 firefighters+0.002
Neg logit contributions
ret-0.001
orse-0.001
orses-0.001
 peanut-0.001
ets-0.001
Token Sam
Token ID3409
Feature activation-1.76e-02

Pos logit contributions
 Lu+0.036
 angels+0.031
Ann+0.031
co+0.030
 attendant+0.030
Neg logit contributions
orse-0.019
ret-0.018
orses-0.017
 peanut-0.016
ets-0.015
Token and
Token ID290
Feature activation+8.65e-02

Pos logit contributions
ret+0.030
orse+0.029
orses+0.027
ocket+0.025
 peanut+0.024
Neg logit contributions
 Lu-0.034
Lu-0.027
Ann-0.027
 angels-0.027
 firefighters-0.026
Token Lucy
Token ID22162
Feature activation-3.08e-03

Pos logit contributions
 attendant+0.128
iddle+0.127
iren+0.120
 ambulance+0.120
ests+0.119
Neg logit contributions
's-0.117
 Po-0.114
 This-0.114
 peanut-0.112
 T-0.111
Token went
Token ID1816
Feature activation+2.77e-02

Pos logit contributions
ret+0.005
orse+0.005
orses+0.004
 peanut+0.004
ets+0.004
Neg logit contributions
 Lu-0.006
 angels-0.005
Ann-0.005
Lu-0.005
 firefighters-0.005
Token to
Token ID284
Feature activation+1.93e-02

Pos logit contributions
 Lu+0.063
Ann+0.054
Lu+0.053
Michael+0.051
 Stev+0.050
Neg logit contributions
orse-0.034
ret-0.034
ets-0.031
orses-0.031
 peanut-0.031
Token the
Token ID262
Feature activation+8.46e-03

Pos logit contributions
 Lu+0.039
 angels+0.035
Ann+0.034
Lu+0.033
co+0.033
Neg logit contributions
orse-0.027
ret-0.026
orses-0.024
 peanut-0.024
ork-0.021
Token park
Token ID3952
Feature activation+7.67e-04

Pos logit contributions
 Lu+0.015
 angels+0.012
Ann+0.011
Lu+0.011
 firefighters+0.011
Neg logit contributions
ret-0.015
orse-0.015
orses-0.014
 peanut-0.014
ork-0.013
Token again
Token ID757
Feature activation-2.66e-03

Pos logit contributions
 Lu+0.022
 angels+0.019
Ann+0.019
Lu+0.018
 firefighters+0.018
Neg logit contributions
ret-0.016
orse-0.016
orses-0.015
 peanut-0.014
ets-0.013
Token.
Token ID13
Feature activation-2.91e-03

Pos logit contributions
ret+0.004
orse+0.004
orses+0.004
 peanut+0.004
ets+0.004
Neg logit contributions
 Lu-0.005
 angels-0.004
Ann-0.004
Lu-0.004
 firefighters-0.004
Token They
Token ID1119
Feature activation+9.19e-03

Pos logit contributions
ret+0.005
orse+0.005
orses+0.004
 peanut+0.004
ets+0.004
Neg logit contributions
 Lu-0.006
 angels-0.005
Ann-0.005
Lu-0.005
 firefighters-0.005
Token understood
Token ID7247
Feature activation+9.18e-03

Pos logit contributions
 Lu+0.019
 angels+0.017
Ann+0.016
 firefighters+0.016
Lu+0.016
Neg logit contributions
orse-0.014
ret-0.014
orses-0.012
 peanut-0.012
ets-0.011
Token that
Token ID326
Feature activation+2.96e-02

Pos logit contributions
 Lu+0.021
 angels+0.018
Ann+0.018
Lu+0.017
 firefighters+0.017
Neg logit contributions
ret-0.012
orse-0.012
orses-0.011
 peanut-0.010
ets-0.009
Token the
Token ID262
Feature activation+8.64e-02

Pos logit contributions
 Lu+0.067
Ann+0.062
co+0.061
Lu+0.060
 attendant+0.058
Neg logit contributions
orse-0.034
ret-0.033
orses-0.030
 peanut-0.028
's-0.024
Token fog
Token ID19558
Feature activation+2.76e-02

Pos logit contributions
ussy+0.147
Ann+0.145
 Lu+0.141
whe+0.141
Mark+0.134
Neg logit contributions
 may-0.100
 holes-0.096
 wish-0.095
 seas-0.094
 shell-0.094
Token was
Token ID373
Feature activation+2.12e-02

Pos logit contributions
 Lu+0.057
Ann+0.047
Lu+0.047
 angels+0.047
 youths+0.046
Neg logit contributions
ret-0.041
orse-0.041
orses-0.037
ets-0.035
ork-0.034
Token trying
Token ID2111
Feature activation-7.86e-03

Pos logit contributions
 Lu+0.043
Ann+0.037
 angels+0.036
 youths+0.034
Lu+0.034
Neg logit contributions
ret-0.032
orse-0.032
orses-0.029
 peanut-0.028
ets-0.027
Token to
Token ID284
Feature activation+2.84e-02

Pos logit contributions
orse+0.014
ret+0.014
orses+0.013
 peanut+0.013
ets+0.012
Neg logit contributions
 Lu-0.014
 angels-0.011
 firefighters-0.011
Ann-0.011
Lu-0.011
Token tell
Token ID1560
Feature activation+6.28e-03

Pos logit contributions
 Lu+0.062
Ann+0.052
 youths+0.052
Lu+0.051
 angels+0.051
Neg logit contributions
orse-0.039
ret-0.039
orses-0.035
 peanut-0.033
ets-0.031
Token Sam
Token ID3409
Feature activation-1.15e-02

Pos logit contributions
 Lu+0.035
 angels+0.030
Ann+0.029
 attendant+0.028
co+0.028
Neg logit contributions
orse-0.026
ret-0.024
orses-0.023
 peanut-0.022
sel-0.020
Token and
Token ID290
Feature activation+3.41e-02

Pos logit contributions
ret+0.020
orse+0.020
orses+0.018
 peanut+0.017
ocket+0.017
Neg logit contributions
 Lu-0.021
 angels-0.017
Lu-0.017
Ann-0.017
 firefighters-0.016
Token Lucy
Token ID22162
Feature activation+1.11e-02

Pos logit contributions
 Lu+0.068
ators+0.061
 attendant+0.061
 angels+0.060
co+0.059
Neg logit contributions
orse-0.051
 peanut-0.045
orses-0.044
ret-0.044
cious-0.038
Token.
Token ID13
Feature activation+2.19e-02

Pos logit contributions
 Lu+0.023
Ann+0.020
 angels+0.019
Lu+0.019
 firefighters+0.019
Neg logit contributions
orse-0.017
ret-0.016
orses-0.015
 peanut-0.014
ets-0.013
Token Sam
Token ID3409
Feature activation-8.13e-03

Pos logit contributions
 Lu+0.051
 angels+0.045
 attendant+0.044
guard+0.043
 firefighters+0.043
Neg logit contributions
orse-0.025
orses-0.023
ret-0.023
 peanut-0.021
ets-0.021
Token and
Token ID290
Feature activation+8.62e-02

Pos logit contributions
ret+0.013
orse+0.013
orses+0.012
 peanut+0.011
ets+0.011
Neg logit contributions
 Lu-0.016
 angels-0.013
Ann-0.013
Lu-0.013
 firefighters-0.013
Token Lucy
Token ID22162
Feature activation+8.11e-03

Pos logit contributions
iddle+0.134
 attendant+0.134
 ambulance+0.128
ators+0.125
pr+0.123
Neg logit contributions
 This-0.126
's-0.122
 T-0.121
 Po-0.115
 peanut-0.113
Token loved
Token ID6151
Feature activation+3.38e-02

Pos logit contributions
 Lu+0.017
 angels+0.014
Ann+0.014
 firefighters+0.014
Lu+0.013
Neg logit contributions
orse-0.012
ret-0.012
orses-0.011
 peanut-0.011
ets-0.010
Token to
Token ID284
Feature activation+2.11e-02

Pos logit contributions
 Lu+0.075
Ann+0.069
 attendant+0.068
Lu+0.065
Michael+0.062
Neg logit contributions
 peanut-0.041
orse-0.041
orses-0.038
 notebooks-0.037
 ol-0.036
Token play
Token ID711
Feature activation+1.89e-02

Pos logit contributions
 Lu+0.048
Ann+0.041
Lu+0.040
 attendant+0.040
 angels+0.040
Neg logit contributions
orse-0.029
orses-0.026
ret-0.026
 peanut-0.025
 notebooks-0.022
Token together
Token ID1978
Feature activation+9.86e-03

Pos logit contributions
 Lu+0.041
Ann+0.033
 attendant+0.033
 Taxi+0.033
 angels+0.032
Neg logit contributions
orse-0.027
ret-0.025
orses-0.024
 peanut-0.023
ets-0.022
Token curious
Token ID11040
Feature activation+1.39e-02

Pos logit contributions
ret+0.011
orse+0.011
orses+0.010
ets+0.010
 peanut+0.010
Neg logit contributions
 Lu-0.013
 angels-0.011
Ann-0.010
 firefighters-0.010
Lu-0.010
Token little
Token ID1310
Feature activation+2.36e-02

Pos logit contributions
 Lu+0.028
Ann+0.023
 angels+0.023
Lu+0.022
 firefighters+0.022
Neg logit contributions
orse-0.022
ret-0.022
orses-0.020
 peanut-0.020
ets-0.018
Token cat
Token ID3797
Feature activation-1.05e-02

Pos logit contributions
 Lu+0.041
Ann+0.034
 angels+0.033
ators+0.032
Lu+0.032
Neg logit contributions
orse-0.042
 peanut-0.041
ret-0.040
orses-0.038
 notebook-0.036
Token named
Token ID3706
Feature activation+3.34e-03

Pos logit contributions
ret+0.016
orse+0.015
orses+0.015
ets+0.014
 peanut+0.013
Neg logit contributions
 Lu-0.022
 angels-0.018
Lu-0.018
Ann-0.018
 firefighters-0.017
Token Kitty
Token ID21168
Feature activation+9.07e-03

Pos logit contributions
 Lu+0.005
 angels+0.004
Ann+0.004
 firefighters+0.004
Lu+0.004
Neg logit contributions
orse-0.007
ret-0.007
orses-0.006
 peanut-0.006
ets-0.006
Token.
Token ID13
Feature activation+8.58e-02

Pos logit contributions
 Lu+0.019
 angels+0.016
Ann+0.016
 firefighters+0.015
 attendant+0.015
Neg logit contributions
orse-0.014
ret-0.013
orses-0.013
 peanut-0.012
ets-0.011
Token Kitty
Token ID21168
Feature activation+2.04e-02

Pos logit contributions
guard+0.151
ators+0.145
 pastor+0.135
 attendant+0.134
 helpers+0.124
Neg logit contributions
 This-0.147
 Its-0.123
 It-0.119
 The-0.115
 Find-0.113
Token loved
Token ID6151
Feature activation+2.19e-02

Pos logit contributions
 Lu+0.044
 angels+0.039
 attendant+0.038
 firefighters+0.037
Ann+0.037
Neg logit contributions
orse-0.028
ret-0.027
orses-0.025
 peanut-0.024
ets-0.022
Token to
Token ID284
Feature activation+6.02e-03

Pos logit contributions
 Lu+0.046
Ann+0.040
 attendant+0.039
Lu+0.038
 angels+0.037
Neg logit contributions
orse-0.030
 peanut-0.030
ret-0.029
orses-0.028
 notebooks-0.026
Token play
Token ID711
Feature activation+8.52e-03

Pos logit contributions
 Lu+0.013
 angels+0.011
Ann+0.011
Lu+0.010
 firefighters+0.010
Neg logit contributions
orse-0.009
ret-0.009
orses-0.008
 peanut-0.008
ets-0.007
Token and
Token ID290
Feature activation+7.32e-03

Pos logit contributions
 Lu+0.017
 angels+0.014
Ann+0.014
 attendant+0.013
Lu+0.013
Neg logit contributions
orse-0.014
ret-0.013
orses-0.012
 peanut-0.012
ets-0.012
Token found
Token ID1043
Feature activation-9.96e-03

Pos logit contributions
ret+0.047
orse+0.047
orses+0.045
ocus+0.040
ets+0.040
Neg logit contributions
 Lu-0.044
Lu-0.035
 firefighters-0.035
Ann-0.034
 youths-0.033
Token a
Token ID257
Feature activation-3.29e-02

Pos logit contributions
orse+0.019
ret+0.019
orses+0.018
ets+0.016
 peanut+0.016
Neg logit contributions
 Lu-0.018
 angels-0.014
 firefighters-0.014
Ann-0.013
Lu-0.013
Token brilliant
Token ID10457
Feature activation-1.85e-02

Pos logit contributions
orse+0.062
ret+0.061
orses+0.059
ets+0.056
sel+0.051
Neg logit contributions
 Lu-0.057
 angels-0.047
 attendant-0.045
Lu-0.044
 firefighters-0.044
Token skirt
Token ID23967
Feature activation-1.93e-02

Pos logit contributions
orse+0.031
ret+0.031
orses+0.030
ets+0.027
sel+0.025
Neg logit contributions
 Lu-0.035
 angels-0.029
 firefighters-0.028
Lu-0.028
 attendant-0.027
Token in
Token ID287
Feature activation+1.08e-02

Pos logit contributions
ret+0.038
orse+0.037
orses+0.035
 peanut+0.032
ets+0.031
Neg logit contributions
 Lu-0.032
 angels-0.026
 firefighters-0.025
Lu-0.024
 attendant-0.024
Token Lily
Token ID20037
Feature activation-1.03e-01

Pos logit contributions
 Lu+0.022
 angels+0.019
Ann+0.018
co+0.017
Lu+0.017
Neg logit contributions
ret-0.016
orse-0.016
orses-0.015
 peanut-0.014
 notebooks-0.013
Token's
Token ID338
Feature activation-4.20e-02

Pos logit contributions
ret+0.189
 AW+0.161
orse+0.159
alks+0.150
eg+0.148
Neg logit contributions
 Lu-0.155
co-0.146
Lu-0.130
my-0.128
 Pen-0.127
Token closet
Token ID21615
Feature activation-2.27e-02

Pos logit contributions
orse+0.096
orses+0.091
ret+0.091
ocus+0.081
sel+0.080
Neg logit contributions
 Lu-0.057
 angels-0.043
 attendant-0.040
Lu-0.039
 Lucy-0.039
Token.
Token ID13
Feature activation-5.88e-03

Pos logit contributions
ret+0.043
orse+0.043
orses+0.040
 peanut+0.037
ets+0.035
Neg logit contributions
 Lu-0.038
 angels-0.032
 attendant-0.030
 firefighters-0.030
Ann-0.030
Token It
Token ID632
Feature activation+7.24e-05

Pos logit contributions
ret+0.010
orse+0.010
orses+0.010
 peanut+0.009
ets+0.009
Neg logit contributions
 Lu-0.011
 angels-0.009
Ann-0.009
Lu-0.009
 firefighters-0.008
Token was
Token ID373
Feature activation-1.72e-02

Pos logit contributions
 Lu+0.000
 angels+0.000
Ann+0.000
Lu+0.000
 firefighters+0.000
Neg logit contributions
ret-0.000
orse-0.000
orses-0.000
 peanut-0.000
ets-0.000
Token a
Token ID257
Feature activation-1.28e-02

Pos logit contributions
ret+0.005
orse+0.004
orses+0.004
 peanut+0.004
ets+0.004
Neg logit contributions
 Lu-0.005
 angels-0.004
Ann-0.004
 firefighters-0.004
Lu-0.004
Token nap
Token ID25422
Feature activation+4.60e-03

Pos logit contributions
ret+0.022
orse+0.021
orses+0.020
ets+0.019
sel+0.018
Neg logit contributions
 Lu-0.024
 angels-0.020
Lu-0.019
Ann-0.019
 firefighters-0.018
Token.
Token ID13
Feature activation-4.28e-02

Pos logit contributions
 Lu+0.009
 angels+0.008
Ann+0.007
Lu+0.007
 firefighters+0.007
Neg logit contributions
orse-0.007
ret-0.007
orses-0.007
 peanut-0.006
ets-0.006
Token One
Token ID1881
Feature activation+1.37e-03

Pos logit contributions
ret+0.089
orse+0.087
ocket+0.084
orses+0.084
ocus+0.083
Neg logit contributions
 Lu-0.060
 Taxi-0.043
 Barber-0.042
 attendant-0.039
 Luis-0.039
Token day
Token ID1110
Feature activation-7.92e-02

Pos logit contributions
 Lu+0.003
 angels+0.002
Ann+0.002
Lu+0.002
 firefighters+0.002
Neg logit contributions
orse-0.002
ret-0.002
orses-0.002
 peanut-0.002
ets-0.002
Token,
Token ID11
Feature activation-1.02e-01

Pos logit contributions
ret+0.179
iche+0.176
ets+0.174
orse+0.173
orses+0.172
Neg logit contributions
 Lu-0.083
 angels-0.055
 Taxi-0.049
 Dog-0.049
 Barber-0.047
Token Gro
Token ID10299
Feature activation-4.12e-02

Pos logit contributions
iche+0.221
pler+0.219
ocus+0.218
ets+0.216
ocado+0.213
Neg logit contributions
 Lu-0.099
 Squirrel-0.086
 angels-0.083
 Bird-0.080
 Dog-0.080
Tokenver
Token ID332
Feature activation-5.35e-02

Pos logit contributions
 peanut+0.086
orse+0.086
orses+0.081
ret+0.078
 notebooks+0.077
Neg logit contributions
 Lu-0.053
Ann-0.042
 angels-0.041
Lu-0.039
 Taxi-0.038
Token decided
Token ID3066
Feature activation-2.54e-02

Pos logit contributions
iche+0.107
ret+0.104
ork+0.101
ocket+0.101
ets+0.100
Neg logit contributions
 Lu-0.077
 Karen-0.059
 angels-0.058
 Dog-0.057
 Taxi-0.055
Token he
Token ID339
Feature activation+2.88e-03

Pos logit contributions
ret+0.054
orse+0.053
orses+0.051
sel+0.048
ets+0.048
Neg logit contributions
 Lu-0.038
 angels-0.029
 firefighters-0.027
Ann-0.026
Lu-0.026
Token wanted
Token ID2227
Feature activation+1.71e-03

Pos logit contributions
 Lu+0.006
 angels+0.005
Ann+0.005
Lu+0.005
 firefighters+0.005
Neg logit contributions
ret-0.004
orse-0.004
orses-0.004
 peanut-0.004
ets-0.004
Token the
Token ID262
Feature activation-2.13e-02

Pos logit contributions
ret+0.011
orse+0.010
orses+0.010
 peanut+0.009
ets+0.009
Neg logit contributions
 Lu-0.011
 angels-0.009
Ann-0.009
 firefighters-0.009
Lu-0.008
Token beach
Token ID10481
Feature activation-2.49e-02

Pos logit contributions
orse+0.044
ret+0.044
orses+0.041
ets+0.038
ocket+0.038
Neg logit contributions
 Lu-0.033
 angels-0.027
 firefighters-0.026
 attendant-0.025
Ann-0.025
Token.
Token ID13
Feature activation-7.53e-02

Pos logit contributions
ret+0.047
orse+0.044
orses+0.041
 peanut+0.040
ocket+0.039
Neg logit contributions
 Lu-0.044
 angels-0.037
Ann-0.034
 firefighters-0.034
Lu-0.034
Token But
Token ID887
Feature activation-7.94e-02

Pos logit contributions
orse+0.163
ret+0.158
arding+0.155
orses+0.154
ocus+0.154
Neg logit contributions
 Lu-0.085
 Barber-0.063
 youths-0.059
 Pen-0.055
 Bob-0.053
Token first
Token ID717
Feature activation-8.64e-02

Pos logit contributions
ets+0.161
ret+0.154
ork+0.154
orses+0.152
iche+0.152
Neg logit contributions
 Lu-0.104
 Barber-0.083
 Pen-0.075
 Bob-0.074
tons-0.068
Token,
Token ID11
Feature activation-1.02e-01

Pos logit contributions
iche+0.191
pler+0.186
orse+0.184
ork+0.180
orses+0.180
Neg logit contributions
 Lu-0.084
 helpers-0.064
 Barber-0.063
 youths-0.062
guard-0.061
Token he
Token ID339
Feature activation+3.24e-03

Pos logit contributions
ork+0.191
orse+0.190
ocus+0.190
pler+0.189
orses+0.188
Neg logit contributions
 Lu-0.117
 youths-0.099
 Barber-0.098
 helpers-0.097
 answer-0.092
Token had
Token ID550
Feature activation+2.92e-03

Pos logit contributions
 Lu+0.007
 angels+0.006
Ann+0.006
Lu+0.006
 firefighters+0.006
Neg logit contributions
orse-0.005
ret-0.005
orses-0.004
 peanut-0.004
ets-0.004
Token to
Token ID284
Feature activation+7.79e-05

Pos logit contributions
 Lu+0.006
 angels+0.005
Ann+0.005
Lu+0.004
 firefighters+0.004
Neg logit contributions
ret-0.005
orse-0.005
orses-0.004
 peanut-0.004
ets-0.004
Token pack
Token ID2353
Feature activation-2.45e-02

Pos logit contributions
 Lu+0.000
 angels+0.000
Ann+0.000
Lu+0.000
 firefighters+0.000
Neg logit contributions
ret-0.000
orse-0.000
orses-0.000
 peanut-0.000
ets-0.000
Token!
Token ID0
Feature activation-5.76e-02

Pos logit contributions
orse+0.054
ret+0.051
orses+0.050
pler+0.045
ocus+0.045
Neg logit contributions
 Lu-0.035
 angels-0.028
 firefighters-0.026
Ann-0.025
Lu-0.025
Token what
Token ID644
Feature activation+1.34e-02

Pos logit contributions
ret+0.013
orse+0.013
orses+0.012
 peanut+0.012
ets+0.011
Neg logit contributions
 Lu-0.016
 angels-0.013
 firefighters-0.012
Ann-0.012
Lu-0.012
Token to
Token ID284
Feature activation+2.28e-02

Pos logit contributions
 Lu+0.028
 angels+0.023
Ann+0.023
 attendant+0.022
Lu+0.022
Neg logit contributions
orse-0.021
ret-0.020
orses-0.019
 peanut-0.018
ork-0.017
Token do
Token ID466
Feature activation+3.57e-03

Pos logit contributions
 Lu+0.052
 angels+0.043
 firefighters+0.043
Lu+0.043
Ann+0.042
Neg logit contributions
orse-0.028
ret-0.027
orses-0.026
 peanut-0.025
ets-0.023
Token.
Token ID13
Feature activation-8.60e-02

Pos logit contributions
 Lu+0.007
 angels+0.006
Ann+0.006
Lu+0.006
 firefighters+0.006
Neg logit contributions
ret-0.006
orse-0.006
orses-0.005
 peanut-0.005
ets-0.005
Token
Token ID198
Feature activation-3.83e-02

Pos logit contributions
ifying+0.198
ocket+0.195
ret+0.188
ocus+0.188
ets+0.187
Neg logit contributions
 Lu-0.073
 youths-0.059
 Bob-0.057
 attendant-0.057
 Bird-0.052
Token
Token ID198
Feature activation-1.01e-01

Pos logit contributions
ret+0.075
orse+0.074
ets+0.073
 peanut+0.071
orses+0.070
Neg logit contributions
 Lu-0.059
Ann-0.047
Lu-0.045
 angels-0.044
 attendant-0.042
TokenBut
Token ID1537
Feature activation-8.48e-02

Pos logit contributions
ocket+0.198
ret+0.193
icol+0.189
ifying+0.189
orse+0.185
Neg logit contributions
Ann-0.089
 youths-0.084
Mark-0.074
 attendant-0.074
Fine-0.073
Token then
Token ID788
Feature activation-5.06e-02

Pos logit contributions
ret+0.188
ork+0.186
ets+0.175
iche+0.174
ables+0.171
Neg logit contributions
 Lu-0.089
 Bob-0.073
 Bird-0.072
 helpers-0.070
 Dog-0.070
Token,
Token ID11
Feature activation-9.14e-02

Pos logit contributions
ret+0.105
ets+0.103
orse+0.097
orses+0.096
ork+0.094
Neg logit contributions
 Lu-0.070
 angels-0.058
 Bird-0.054
 firefighters-0.054
 Lucy-0.053
Token a
Token ID257
Feature activation+3.47e-04

Pos logit contributions
ets+0.217
ret+0.208
ocus+0.206
ard+0.204
orses+0.203
Neg logit contributions
 Lu-0.068
 helpers-0.064
 angels-0.062
 firefighters-0.061
 Bird-0.059
Token kind
Token ID1611
Feature activation-1.18e-03

Pos logit contributions
 Lu+0.001
 angels+0.001
Ann+0.000
Lu+0.000
 firefighters+0.000
Neg logit contributions
ret-0.001
orse-0.001
orses-0.001
 peanut-0.001
ets-0.001
Token pumpkin
Token ID30089
Feature activation-8.19e-03

Pos logit contributions
 Lu+0.007
Ann+0.005
 angels+0.005
Lu+0.005
 firefighters+0.005
Neg logit contributions
orse-0.011
ret-0.011
orses-0.010
 peanut-0.010
ets-0.009
Token friends
Token ID2460
Feature activation+9.03e-03

Pos logit contributions
ret+0.019
orse+0.018
orses+0.017
ets+0.017
 peanut+0.016
Neg logit contributions
 Lu-0.011
 angels-0.008
 firefighters-0.008
Lu-0.007
Ann-0.007
Token.
Token ID13
Feature activation-6.71e-02

Pos logit contributions
 Lu+0.019
Ann+0.016
 angels+0.016
Lu+0.016
 firefighters+0.016
Neg logit contributions
orse-0.013
ret-0.013
orses-0.012
 peanut-0.011
ets-0.011
Token One
Token ID1881
Feature activation-1.22e-02

Pos logit contributions
ret+0.142
ocket+0.139
orse+0.137
ifying+0.135
pler+0.134
Neg logit contributions
 Lu-0.074
 youths-0.059
 attendant-0.054
 Ann-0.053
Ann-0.052
Token day
Token ID1110
Feature activation-7.38e-02

Pos logit contributions
ret+0.025
orse+0.023
orses+0.022
ets+0.021
sel+0.021
Neg logit contributions
 Lu-0.020
 angels-0.016
Lu-0.015
Ann-0.015
 firefighters-0.015
Token,
Token ID11
Feature activation-9.97e-02

Pos logit contributions
ret+0.163
ets+0.155
orse+0.155
ocket+0.155
 generation+0.152
Neg logit contributions
 Lu-0.074
 angels-0.063
 youths-0.059
 helpers-0.053
 Lucy-0.047
Token a
Token ID257
Feature activation-2.50e-02

Pos logit contributions
ret+0.179
ets+0.175
pler+0.174
ork+0.173
ocus+0.171
Neg logit contributions
 helpers-0.118
 youths-0.115
 Bob-0.111
 angels-0.111
 Lu-0.105
Token big
Token ID1263
Feature activation-7.13e-03

Pos logit contributions
ret+0.046
orse+0.043
orses+0.041
ets+0.041
ocus+0.039
Neg logit contributions
 Lu-0.045
 angels-0.037
 firefighters-0.036
Lu-0.036
 attendant-0.035
Token,
Token ID11
Feature activation-2.09e-03

Pos logit contributions
ret+0.014
orse+0.014
orses+0.013
ets+0.012
 peanut+0.012
Neg logit contributions
 Lu-0.012
 angels-0.009
 firefighters-0.009
Ann-0.009
Lu-0.009
Token mean
Token ID1612
Feature activation-1.71e-02

Pos logit contributions
ret+0.004
orse+0.003
orses+0.003
 peanut+0.003
ets+0.003
Neg logit contributions
 Lu-0.004
 angels-0.003
Ann-0.003
 firefighters-0.003
Lu-0.003
Token squash
Token ID34613
Feature activation-2.81e-02

Pos logit contributions
ret+0.033
orse+0.033
orses+0.031
ets+0.030
ocus+0.028
Neg logit contributions
 Lu-0.028
 angels-0.022
 firefighters-0.022
 youths-0.020
Ann-0.020
Token a
Token ID257
Feature activation-9.20e-03

Pos logit contributions
ret+0.015
orse+0.015
orses+0.014
ets+0.013
 peanut+0.013
Neg logit contributions
 Lu-0.015
 angels-0.012
Lu-0.011
Ann-0.011
 firefighters-0.011
Token fair
Token ID3148
Feature activation-1.76e-03

Pos logit contributions
ret+0.015
orse+0.015
orses+0.014
ets+0.013
 peanut+0.012
Neg logit contributions
 Lu-0.018
 angels-0.015
Lu-0.015
 firefighters-0.015
Ann-0.015
Token peanut
Token ID26636
Feature activation-8.43e-03

Pos logit contributions
ret+0.003
orse+0.003
orses+0.003
 peanut+0.003
ets+0.003
Neg logit contributions
 Lu-0.003
 angels-0.002
Ann-0.002
Lu-0.002
 firefighters-0.002
Token.
Token ID13
Feature activation-4.95e-02

Pos logit contributions
ret+0.015
orse+0.014
orses+0.013
ets+0.012
 peanut+0.012
Neg logit contributions
 Lu-0.016
 angels-0.013
Lu-0.012
Ann-0.012
 firefighters-0.012
Token It
Token ID632
Feature activation-5.48e-02

Pos logit contributions
ret+0.085
orses+0.083
ets+0.083
orse+0.082
bs+0.071
Neg logit contributions
 Lu-0.094
 Lucy-0.084
Lu-0.074
 Kitty-0.072
 Stev-0.072
Token looked
Token ID3114
Feature activation-9.88e-02

Pos logit contributions
ret+0.133
ets+0.127
orses+0.125
arding+0.125
pieces+0.124
Neg logit contributions
 Lu-0.055
Lu-0.041
 Lucy-0.040
jack-0.038
 Bird-0.037
Token so
Token ID523
Feature activation-6.28e-02

Pos logit contributions
orse+0.284
orses+0.271
ork+0.256
pler+0.255
stand+0.251
Neg logit contributions
 Lu-0.046
 Lucy-0.045
 too-0.034
 all-0.033
 firefighters-0.029
Token tasty
Token ID25103
Feature activation-2.47e-03

Pos logit contributions
pler+0.131
ret+0.125
orse+0.125
ork+0.123
ets+0.122
Neg logit contributions
 Lu-0.078
 Lucy-0.067
 angels-0.064
 youths-0.052
Lu-0.052
Token that
Token ID326
Feature activation-2.58e-02

Pos logit contributions
ret+0.004
orse+0.003
orses+0.003
 peanut+0.003
ets+0.003
Neg logit contributions
 Lu-0.005
 angels-0.005
Ann-0.004
Lu-0.004
 firefighters-0.004
Token a
Token ID257
Feature activation-4.93e-02

Pos logit contributions
ret+0.039
orse+0.038
orses+0.035
ets+0.035
ocus+0.032
Neg logit contributions
 Lu-0.054
 angels-0.047
 firefighters-0.044
 Lucy-0.044
Lu-0.043
Token lion
Token ID18744
Feature activation-3.00e-02

Pos logit contributions
ret+0.095
ets+0.090
ocus+0.085
orse+0.085
orses+0.084
Neg logit contributions
 Lu-0.078
 angels-0.068
 firefighters-0.064
Lu-0.064
 firefighter-0.061
Token every
Token ID790
Feature activation+8.60e-03

Pos logit contributions
 Lu+0.014
 angels+0.012
Ann+0.012
 firefighters+0.011
Lu+0.011
Neg logit contributions
ret-0.011
orse-0.011
orses-0.010
 peanut-0.009
ets-0.009
Token day
Token ID1110
Feature activation-3.10e-02

Pos logit contributions
 Lu+0.017
 angels+0.014
Ann+0.014
Lu+0.014
 firefighters+0.014
Neg logit contributions
orse-0.014
ret-0.013
orses-0.013
 peanut-0.012
ets-0.011
Token.
Token ID13
Feature activation-3.52e-02

Pos logit contributions
ret+0.068
orse+0.063
orses+0.060
 peanut+0.057
sel+0.056
Neg logit contributions
 Lu-0.048
 angels-0.036
Lu-0.035
 Lucy-0.034
 youths-0.032
Token One
Token ID1881
Feature activation-5.49e-03

Pos logit contributions
orse+0.073
orses+0.068
ret+0.067
ets+0.062
ocket+0.061
Neg logit contributions
 Lu-0.057
 angels-0.043
 Lucy-0.042
Lu-0.041
 youths-0.041
Token day
Token ID1110
Feature activation-6.83e-02

Pos logit contributions
ret+0.010
orse+0.010
orses+0.010
ets+0.009
 peanut+0.009
Neg logit contributions
 Lu-0.010
 angels-0.008
Ann-0.007
Lu-0.007
 firefighters-0.007
Token,
Token ID11
Feature activation-9.64e-02

Pos logit contributions
orse+0.159
orses+0.154
ret+0.153
 battlefield+0.149
ets+0.149
Neg logit contributions
 Lu-0.081
 Lucy-0.058
 angels-0.057
 Megan-0.049
 youths-0.049
Token while
Token ID981
Feature activation-4.09e-02

Pos logit contributions
orse+0.199
orses+0.193
oc+0.187
ables+0.176
ets+0.175
Neg logit contributions
 Lu-0.131
 angels-0.124
 Lucy-0.122
 Zoe-0.118
 Megan-0.112
Token Av
Token ID5184
Feature activation-4.71e-02

Pos logit contributions
ret+0.082
ets+0.081
orse+0.073
ocado+0.071
sel+0.071
Neg logit contributions
 Lu-0.069
 angels-0.059
 youths-0.053
 Lucy-0.051
 Barber-0.050
Tokena
Token ID64
Feature activation-8.15e-03

Pos logit contributions
orse+0.082
ret+0.079
 peanut+0.078
orses+0.075
 lens+0.073
Neg logit contributions
 Lu-0.075
Lu-0.062
Ann-0.061
 angels-0.060
 youths-0.059
Token was
Token ID373
Feature activation-9.44e-04

Pos logit contributions
ret+0.015
orse+0.014
orses+0.013
 peanut+0.013
ets+0.012
Neg logit contributions
 Lu-0.015
 angels-0.012
Ann-0.012
Lu-0.012
 firefighters-0.011
Token playing
Token ID2712
Feature activation+2.24e-02

Pos logit contributions
ret+0.002
orse+0.002
orses+0.001
 peanut+0.001
ets+0.001
Neg logit contributions
 Lu-0.002
 angels-0.002
Ann-0.001
Lu-0.001
 firefighters-0.001
Token big
Token ID1263
Feature activation-3.22e-02

Pos logit contributions
ret+0.098
ets+0.090
orse+0.089
sel+0.086
orses+0.086
Neg logit contributions
 Lu-0.060
Lu-0.046
 angels-0.045
 firefighters-0.044
Ann-0.044
Token ocean
Token ID9151
Feature activation-3.18e-02

Pos logit contributions
ret+0.065
orse+0.060
orses+0.058
ets+0.056
sel+0.055
Neg logit contributions
 Lu-0.052
 angels-0.040
 firefighters-0.038
 Taxi-0.038
 youths-0.038
Token.
Token ID13
Feature activation-6.54e-02

Pos logit contributions
ret+0.066
orse+0.060
orses+0.057
ocket+0.054
ets+0.054
Neg logit contributions
 Lu-0.049
 angels-0.042
 firefighters-0.039
Lu-0.038
 youths-0.037
Token One
Token ID1881
Feature activation-2.08e-02

Pos logit contributions
ret+0.143
orse+0.140
pler+0.137
orses+0.136
ocket+0.136
Neg logit contributions
 Lu-0.072
 Jenkins-0.052
 Lucy-0.051
 youths-0.049
 Barber-0.046
Token day
Token ID1110
Feature activation-8.30e-02

Pos logit contributions
ret+0.044
orse+0.041
orses+0.040
sel+0.039
ets+0.039
Neg logit contributions
 Lu-0.031
Lu-0.023
 angels-0.023
Ann-0.022
 firefighters-0.022
Token,
Token ID11
Feature activation-9.61e-02

Pos logit contributions
ret+0.199
 generation+0.192
ocket+0.189
ets+0.188
pieces+0.185
Neg logit contributions
 Lu-0.073
 youths-0.046
 Lucy-0.045
 angels-0.043
 bud-0.040
Token P
Token ID350
Feature activation-5.65e-02

Pos logit contributions
pler+0.207
ets+0.202
ret+0.202
ocket+0.195
ox+0.195
Neg logit contributions
 Lu-0.096
 Bob-0.084
 Squirrel-0.083
 Beaver-0.082
 Lucy-0.082
Tokeninch
Token ID8589
Feature activation-4.53e-02

Pos logit contributions
orse+0.076
orses+0.068
pler+0.063
bs+0.058
fortable+0.056
Neg logit contributions
 Lu-0.115
 angels-0.100
 youths-0.099
ators-0.096
 Jenkins-0.095
Tokeny
Token ID88
Feature activation-4.87e-02

Pos logit contributions
ret+0.089
orse+0.089
orses+0.086
ocket+0.083
 peanut+0.081
Neg logit contributions
 Lu-0.058
 angels-0.048
guard-0.044
Lu-0.044
Ann-0.043
Token met
Token ID1138
Feature activation+2.14e-03

Pos logit contributions
ret+0.093
orse+0.082
orses+0.080
ocket+0.079
ocus+0.078
Neg logit contributions
 Lu-0.081
 firefighters-0.061
Lu-0.061
 youths-0.061
Ann-0.060
Token a
Token ID257
Feature activation-1.41e-02

Pos logit contributions
 Lu+0.004
 angels+0.003
Ann+0.003
Lu+0.003
 firefighters+0.003
Neg logit contributions
ret-0.004
orse-0.004
orses-0.003
 peanut-0.003
ets-0.003
Token saw
Token ID2497
Feature activation+5.47e-03

Pos logit contributions
ret+0.054
orse+0.050
orses+0.047
ocket+0.043
sel+0.042
Neg logit contributions
 Lu-0.061
Ann-0.049
 firefighters-0.049
Lu-0.048
 angels-0.048
Token a
Token ID257
Feature activation-1.12e-02

Pos logit contributions
 Lu+0.011
 angels+0.009
Ann+0.009
Lu+0.009
 attendant+0.009
Neg logit contributions
orse-0.008
ret-0.008
orses-0.008
 peanut-0.008
ets-0.007
Token new
Token ID649
Feature activation+2.79e-02

Pos logit contributions
ret+0.020
orse+0.020
orses+0.019
ets+0.018
ocket+0.017
Neg logit contributions
 Lu-0.020
 angels-0.017
 firefighters-0.016
Ann-0.016
Lu-0.016
Token toy
Token ID13373
Feature activation-1.82e-02

Pos logit contributions
 Lu+0.053
Ann+0.044
Lu+0.042
Michael+0.042
 angels+0.042
Neg logit contributions
orse-0.046
ret-0.044
 peanut-0.044
ork-0.041
orses-0.040
Token in
Token ID287
Feature activation+1.28e-02

Pos logit contributions
ret+0.034
orse+0.033
orses+0.031
ets+0.029
 peanut+0.028
Neg logit contributions
 Lu-0.032
 angels-0.027
 firefighters-0.025
Ann-0.025
 attendant-0.024
Token Lily
Token ID20037
Feature activation-9.61e-02

Pos logit contributions
 Lu+0.026
 angels+0.022
Ann+0.021
Lu+0.020
co+0.020
Neg logit contributions
ret-0.019
orse-0.019
 peanut-0.018
orses-0.018
 notebooks-0.016
Token's
Token ID338
Feature activation-3.01e-02

Pos logit contributions
ret+0.169
orse+0.136
 AW+0.135
sel+0.129
orses+0.128
Neg logit contributions
 Lu-0.155
co-0.142
 angels-0.128
 Lucy-0.127
Lu-0.126
Token part
Token ID636
Feature activation+3.79e-03

Pos logit contributions
orse+0.073
ret+0.073
orses+0.071
sel+0.065
ets+0.064
Neg logit contributions
 Lu-0.035
 angels-0.025
 firefighters-0.025
 attendant-0.024
 taxis-0.023
Token.
Token ID13
Feature activation-3.00e-03

Pos logit contributions
 Lu+0.007
 angels+0.005
Ann+0.005
Lu+0.005
 firefighters+0.005
Neg logit contributions
ret-0.007
orse-0.007
orses-0.007
 peanut-0.006
ets-0.006
Token It
Token ID632
Feature activation+3.17e-03

Pos logit contributions
ret+0.005
orse+0.005
orses+0.005
 peanut+0.004
ets+0.004
Neg logit contributions
 Lu-0.006
 angels-0.005
Ann-0.005
Lu-0.005
 firefighters-0.005
Token was
Token ID373
Feature activation-1.68e-02

Pos logit contributions
 Lu+0.006
 angels+0.005
Ann+0.005
 firefighters+0.005
Lu+0.005
Neg logit contributions
ret-0.005
orse-0.005
orses-0.005
 peanut-0.005
ets-0.004
Token eating
Token ID6600
Feature activation+3.81e-02

Pos logit contributions
 Lu+0.024
Ann+0.019
Lu+0.019
 angels+0.019
 firefighters+0.019
Neg logit contributions
orse-0.021
ret-0.020
orses-0.020
 peanut-0.018
ets-0.017
Token bamboo
Token ID37252
Feature activation+1.97e-02

Pos logit contributions
 Lu+0.074
 attendant+0.066
Ann+0.064
Michael+0.062
ators+0.062
Neg logit contributions
 peanut-0.058
 ol-0.054
ret-0.052
orse-0.050
orses-0.048
Token.
Token ID13
Feature activation-4.10e-02

Pos logit contributions
 Lu+0.043
Ann+0.036
 firefighters+0.035
 Taxi+0.035
Lu+0.035
Neg logit contributions
ret-0.026
orse-0.025
orses-0.024
ets-0.024
 peanut-0.023
Token One
Token ID1881
Feature activation+1.08e-02

Pos logit contributions
ret+0.088
orse+0.086
orses+0.082
ocket+0.080
ets+0.080
Neg logit contributions
 Lu-0.059
 Lucy-0.043
Lu-0.041
 Jenkins-0.040
Ann-0.040
Token day
Token ID1110
Feature activation-8.16e-02

Pos logit contributions
 Lu+0.022
 angels+0.019
 firefighters+0.018
Ann+0.018
 attendant+0.018
Neg logit contributions
orse-0.017
ret-0.016
orses-0.015
 peanut-0.015
ets-0.014
Token,
Token ID11
Feature activation-9.59e-02

Pos logit contributions
ret+0.199
 generation+0.194
ocket+0.190
ets+0.188
pieces+0.186
Neg logit contributions
 Lu-0.073
 Lucy-0.046
 Olivia-0.044
 angels-0.036
Lu-0.036
Token P
Token ID350
Feature activation-2.23e-02

Pos logit contributions
ret+0.211
ets+0.209
pler+0.204
ience+0.198
orses+0.197
Neg logit contributions
 Lu-0.100
 Lucy-0.093
 Bunny-0.090
 Olivia-0.088
 Squirrel-0.083
Tokenandy
Token ID10757
Feature activation-1.86e-02

Pos logit contributions
orse+0.037
ret+0.035
orses+0.034
 peanut+0.032
ocket+0.030
Neg logit contributions
 Lu-0.042
 angels-0.035
Ann-0.034
Lu-0.033
co-0.032
Token met
Token ID1138
Feature activation+2.02e-02

Pos logit contributions
ret+0.028
orse+0.027
orses+0.025
ocket+0.023
 peanut+0.023
Neg logit contributions
 Lu-0.039
 angels-0.032
Ann-0.032
Lu-0.032
 firefighters-0.031
Token a
Token ID257
Feature activation+3.49e-03

Pos logit contributions
 Lu+0.044
 angels+0.037
Ann+0.037
 attendant+0.036
 firefighters+0.036
Neg logit contributions
ret-0.028
orse-0.028
 peanut-0.026
orses-0.025
ocket-0.022
Token rabbit
Token ID22746
Feature activation-1.54e-02

Pos logit contributions
 Lu+0.007
 angels+0.005
Ann+0.005
Lu+0.005
 firefighters+0.005
Neg logit contributions
orse-0.006
ret-0.006
orses-0.006
 peanut-0.005
ets-0.005
Token was
Token ID373
Feature activation-3.93e-02

Pos logit contributions
ret+0.047
orse+0.047
orses+0.045
ets+0.043
ocket+0.040
Neg logit contributions
 Lu-0.037
Ann-0.029
 angels-0.028
Lu-0.028
 attendant-0.028
Token a
Token ID257
Feature activation-7.24e-02

Pos logit contributions
orse+0.083
ret+0.083
orses+0.080
ets+0.074
oc+0.070
Neg logit contributions
 Lu-0.057
 firefighters-0.048
 angels-0.047
 attendant-0.044
 Lucy-0.043
Token big
Token ID1263
Feature activation-5.29e-02

Pos logit contributions
ret+0.139
ets+0.132
orses+0.128
orse+0.128
ocus+0.116
Neg logit contributions
 Lu-0.109
 attendant-0.101
 angels-0.101
 firefighter-0.099
 firefighters-0.098
Token card
Token ID2657
Feature activation-4.09e-02

Pos logit contributions
ret+0.099
orse+0.095
orses+0.094
ets+0.084
bs+0.080
Neg logit contributions
 Lu-0.083
 angels-0.074
 attendant-0.072
 firefighters-0.071
 angel-0.071
Token with
Token ID351
Feature activation-8.24e-02

Pos logit contributions
ret+0.082
orse+0.075
orses+0.073
ets+0.067
ocus+0.064
Neg logit contributions
 Lu-0.065
 angels-0.056
 attendant-0.053
 firefighters-0.052
Lu-0.051
Token a
Token ID257
Feature activation-9.51e-02

Pos logit contributions
orse+0.198
ret+0.186
orses+0.179
ossom+0.173
ork+0.172
Neg logit contributions
 angels-0.087
 Lu-0.087
 firefighters-0.085
 Lucy-0.069
 youths-0.066
Token picture
Token ID4286
Feature activation-2.17e-02

Pos logit contributions
ret+0.164
ets+0.148
orse+0.146
orses+0.141
earing+0.138
Neg logit contributions
 Lu-0.147
 angels-0.143
 attendant-0.137
 firefighters-0.134
 firefighter-0.132
Token of
Token ID286
Feature activation-3.86e-02

Pos logit contributions
ret+0.048
orse+0.047
orses+0.044
 peanut+0.041
ets+0.040
Neg logit contributions
 Lu-0.030
 angels-0.024
 firefighters-0.022
Ann-0.022
Lu-0.022
Token a
Token ID257
Feature activation-7.66e-02

Pos logit contributions
ret+0.081
orse+0.080
orses+0.075
ets+0.074
ork+0.071
Neg logit contributions
 Lu-0.057
 angels-0.050
 firefighters-0.047
 Lucy-0.043
 youths-0.042
Token rainbow
Token ID27223
Feature activation-5.61e-02

Pos logit contributions
ret+0.140
orse+0.124
ets+0.122
orses+0.118
earing+0.117
Neg logit contributions
 Lu-0.126
 angels-0.119
 firefighters-0.115
 attendant-0.111
 firefighter-0.110
Token.
Token ID13
Feature activation-6.75e-03

Pos logit contributions
ret+0.117
orse+0.112
orses+0.107
fortable+0.096
ocket+0.094
Neg logit contributions
 Lu-0.078
 angels-0.071
co-0.061
 firefighters-0.061
 attendant-0.061
Token\n
Token ID198
Feature activation-7.68e-02

Pos logit contributions
ret+0.076
orse+0.074
ets+0.073
 peanut+0.072
sel+0.072
Neg logit contributions
 Lu-0.058
Ann-0.047
Lu-0.045
 angels-0.042
 attendant-0.042
TokenFrom
Token ID4863
Feature activation-5.27e-03

Pos logit contributions
ret+0.165
orse+0.162
ocket+0.161
 peanut+0.151
 mouths+0.150
Neg logit contributions
Ann-0.076
Lu-0.074
 Lu-0.071
 attendant-0.071
 youths-0.067
Token that
Token ID326
Feature activation-1.15e-02

Pos logit contributions
ret+0.006
orse+0.006
orses+0.006
 peanut+0.005
ets+0.005
Neg logit contributions
 Lu-0.013
 angels-0.011
Ann-0.011
 firefighters-0.011
Lu-0.010
Token day
Token ID1110
Feature activation-2.80e-02

Pos logit contributions
ret+0.022
orse+0.022
orses+0.021
ets+0.020
 peanut+0.019
Neg logit contributions
 Lu-0.019
 angels-0.015
Lu-0.015
Ann-0.014
 firefighters-0.014
Token on
Token ID319
Feature activation-2.59e-02

Pos logit contributions
orse+0.060
ret+0.059
orses+0.057
 peanut+0.054
ets+0.052
Neg logit contributions
 Lu-0.039
 angels-0.031
Ann-0.029
 firefighters-0.028
Lu-0.028
Token,
Token ID11
Feature activation-9.48e-02

Pos logit contributions
ret+0.050
orse+0.049
orses+0.047
ets+0.047
sel+0.045
Neg logit contributions
 Lu-0.042
 angels-0.034
 Lucy-0.032
 firefighters-0.031
 Luis-0.031
Token Jolly
Token ID40276
Feature activation-1.66e-02

Pos logit contributions
orse+0.155
ifying+0.154
ork+0.154
ox+0.153
ocket+0.153
Neg logit contributions
 youths-0.126
 helpers-0.121
 angels-0.117
 Lu-0.113
 Wise-0.109
Token and
Token ID290
Feature activation-3.55e-02

Pos logit contributions
ret+0.028
orse+0.028
orses+0.026
 peanut+0.024
ocket+0.023
Neg logit contributions
 Lu-0.031
 angels-0.026
Ann-0.025
Lu-0.025
 firefighters-0.025
Token his
Token ID465
Feature activation+1.48e-02

Pos logit contributions
orse+0.080
ret+0.080
orses+0.077
ocus+0.076
ets+0.075
Neg logit contributions
 Lu-0.042
 youths-0.033
 angels-0.033
 firefighters-0.033
 Lucy-0.030
Token friends
Token ID2460
Feature activation+2.54e-02

Pos logit contributions
 Lu+0.024
Ann+0.018
Lu+0.017
 attendant+0.016
Michael+0.016
Neg logit contributions
orse-0.031
ret-0.031
 peanut-0.029
orses-0.029
 notebooks-0.027
Token were
Token ID547
Feature activation+2.84e-02

Pos logit contributions
 Lu+0.056
Ann+0.048
 angels+0.048
 attendant+0.046
Lu+0.046
Neg logit contributions
orse-0.034
ret-0.034
orses-0.030
ets-0.029
 peanut-0.029
Token to
Token ID284
Feature activation-7.46e-03

Pos logit contributions
ret+0.041
orse+0.039
orses+0.036
ocket+0.035
ork+0.035
Neg logit contributions
 Lu-0.035
 angels-0.031
 youths-0.028
 firefighters-0.027
Lu-0.026
Token eat
Token ID4483
Feature activation-1.07e-02

Pos logit contributions
ret+0.013
orse+0.013
orses+0.012
 peanut+0.011
ets+0.011
Neg logit contributions
 Lu-0.014
 angels-0.012
Ann-0.011
 firefighters-0.011
Lu-0.011
Token.
Token ID13
Feature activation-3.27e-02

Pos logit contributions
ret+0.020
orse+0.019
orses+0.018
ets+0.016
ocket+0.016
Neg logit contributions
 Lu-0.019
 angels-0.016
Lu-0.015
 firefighters-0.015
 youths-0.015
Token One
Token ID1881
Feature activation-5.53e-03

Pos logit contributions
orse+0.068
ret+0.067
orses+0.064
ocket+0.063
ocus+0.061
Neg logit contributions
 Lu-0.048
 Barber-0.034
 angels-0.034
Ann-0.034
 Lucy-0.033
Token day
Token ID1110
Feature activation-8.20e-02

Pos logit contributions
ret+0.010
orse+0.010
orses+0.009
ets+0.009
 peanut+0.009
Neg logit contributions
 Lu-0.010
 angels-0.008
Ann-0.008
Lu-0.008
 firefighters-0.007
Token,
Token ID11
Feature activation-9.47e-02

Pos logit contributions
ret+0.192
 generation+0.188
ocket+0.185
orse+0.183
orses+0.182
Neg logit contributions
 Lu-0.078
 angels-0.047
 youths-0.046
 Lucy-0.045
 Olivia-0.044
Token P
Token ID350
Feature activation-5.25e-02

Pos logit contributions
ret+0.203
pler+0.203
ets+0.200
orses+0.196
ocket+0.196
Neg logit contributions
 Lu-0.095
whe-0.076
 Lucy-0.072
 Olivia-0.072
 Squirrel-0.071
Tokeningu
Token ID6680
Feature activation-3.33e-02

Pos logit contributions
orse+0.079
orses+0.068
pler+0.061
ret+0.061
bs+0.060
Neg logit contributions
 Lu-0.102
 angels-0.090
 Barber-0.085
 youths-0.085
ators-0.083
Token was
Token ID373
Feature activation-3.21e-03

Pos logit contributions
ret+0.056
orse+0.051
orses+0.048
ocket+0.047
ocus+0.046
Neg logit contributions
 Lu-0.062
 angels-0.051
 firefighters-0.051
Lu-0.049
 youths-0.049
Token feeling
Token ID4203
Feature activation-1.15e-02

Pos logit contributions
ret+0.005
orse+0.005
orses+0.005
 peanut+0.005
ets+0.005
Neg logit contributions
 Lu-0.006
 angels-0.005
Ann-0.005
 firefighters-0.005
Lu-0.005
Token tired
Token ID10032
Feature activation-7.37e-03

Pos logit contributions
ret+0.020
orse+0.019
orses+0.018
ets+0.017
ork+0.016
Neg logit contributions
 Lu-0.022
 angels-0.018
 firefighters-0.018
 youths-0.017
Lu-0.017
Token a
Token ID257
Feature activation-5.88e-03

Pos logit contributions
 Lu+0.034
Ann+0.028
Lu+0.028
 angels+0.028
 Taxi+0.027
Neg logit contributions
ret-0.024
orse-0.024
orses-0.022
 peanut-0.021
ets-0.020
Token big
Token ID1263
Feature activation+4.19e-04

Pos logit contributions
ret+0.010
orse+0.010
orses+0.009
ets+0.008
 peanut+0.008
Neg logit contributions
 Lu-0.011
 angels-0.009
Ann-0.009
 firefighters-0.009
Lu-0.009
Token ice
Token ID4771
Feature activation+1.56e-03

Pos logit contributions
 Lu+0.001
 angels+0.001
Ann+0.001
Lu+0.001
 firefighters+0.001
Neg logit contributions
ret-0.001
orse-0.001
orses-0.001
 peanut-0.001
ets-0.001
Token cream
Token ID8566
Feature activation-1.95e-02

Pos logit contributions
 Lu+0.002
 angels+0.002
Ann+0.002
Lu+0.002
 firefighters+0.002
Neg logit contributions
ret-0.003
orse-0.003
orses-0.003
 peanut-0.003
ets-0.003
Token in
Token ID287
Feature activation-9.58e-03

Pos logit contributions
ret+0.040
orse+0.039
orses+0.037
ets+0.034
 peanut+0.033
Neg logit contributions
 Lu-0.031
 angels-0.025
 firefighters-0.024
Ann-0.024
 attendant-0.024
Token Ben
Token ID3932
Feature activation-9.47e-02

Pos logit contributions
ret+0.018
orse+0.018
orses+0.017
ets+0.016
 peanut+0.015
Neg logit contributions
 Lu-0.017
Ann-0.014
 angels-0.014
Lu-0.013
 firefighters-0.013
Token's
Token ID338
Feature activation-4.71e-02

Pos logit contributions
ret+0.162
 AW+0.140
orse+0.126
ocus+0.125
alks+0.125
Neg logit contributions
 Lu-0.150
 Pen-0.137
 youths-0.127
 Dog-0.127
co-0.125
Token hand
Token ID1021
Feature activation-2.56e-02

Pos logit contributions
ret+0.099
orse+0.096
orses+0.092
arding+0.084
ets+0.083
Neg logit contributions
 Lu-0.063
 firefighters-0.052
 angels-0.051
 attendant-0.050
 Taxi-0.049
Token and
Token ID290
Feature activation-2.12e-02

Pos logit contributions
ret+0.050
orse+0.048
orses+0.045
 peanut+0.041
ets+0.040
Neg logit contributions
 Lu-0.043
 angels-0.035
 firefighters-0.033
Ann-0.033
co-0.033
Token a
Token ID257
Feature activation-2.24e-02

Pos logit contributions
ret+0.040
orse+0.037
orses+0.036
ets+0.033
ork+0.033
Neg logit contributions
 Lu-0.038
 angels-0.031
 firefighters-0.031
Ann-0.029
 youths-0.028
Token new
Token ID649
Feature activation+2.11e-02

Pos logit contributions
ret+0.039
orse+0.038
orses+0.036
ets+0.034
sel+0.032
Neg logit contributions
 Lu-0.041
 angels-0.035
 firefighters-0.033
Ann-0.033
Lu-0.032
Token a
Token ID257
Feature activation+1.01e-02

Pos logit contributions
 Lu+0.061
Ann+0.054
 angels+0.051
Lu+0.051
ators+0.051
Neg logit contributions
orse-0.037
 peanut-0.035
ret-0.035
orses-0.032
ork-0.029
Token break
Token ID2270
Feature activation+2.48e-02

Pos logit contributions
 Lu+0.021
Ann+0.018
 angels+0.018
Lu+0.017
 firefighters+0.017
Neg logit contributions
orse-0.015
ret-0.015
orses-0.014
 peanut-0.013
ork-0.012
Token.
Token ID13
Feature activation-5.07e-02

Pos logit contributions
 Lu+0.055
Ann+0.048
Lu+0.046
 angels+0.046
 attendant+0.045
Neg logit contributions
ret-0.033
orse-0.031
ets-0.029
orses-0.029
 peanut-0.028
Token One
Token ID1881
Feature activation+1.04e-02

Pos logit contributions
orse+0.107
ets+0.105
ocket+0.105
ret+0.104
orses+0.103
Neg logit contributions
 Lu-0.067
 attendant-0.047
 Jenkins-0.047
 Barber-0.046
 Lucy-0.045
Token day
Token ID1110
Feature activation-7.70e-02

Pos logit contributions
 Lu+0.021
 angels+0.018
Ann+0.017
 firefighters+0.017
 youths+0.017
Neg logit contributions
orse-0.016
ret-0.016
orses-0.015
 peanut-0.014
ork-0.013
Token,
Token ID11
Feature activation-9.43e-02

Pos logit contributions
ret+0.184
ets+0.181
ocket+0.180
 generation+0.176
ocado+0.176
Neg logit contributions
 Lu-0.071
 Olivia-0.041
 Lucy-0.041
 Pupp-0.037
 Dog-0.037
Token his
Token ID465
Feature activation+2.38e-02

Pos logit contributions
ets+0.221
pler+0.209
ocus+0.208
ocket+0.202
ocado+0.201
Neg logit contributions
 Lu-0.085
 Squirrel-0.081
 Bunny-0.074
 his-0.072
 Olivia-0.070
Token friend
Token ID1545
Feature activation-1.34e-02

Pos logit contributions
 Lu+0.040
Lu+0.033
Ann+0.032
 angels+0.030
co+0.029
Neg logit contributions
orse-0.046
ret-0.045
 peanut-0.043
orses-0.042
ork-0.039
Token Sammy
Token ID44067
Feature activation-1.31e-02

Pos logit contributions
ret+0.023
orse+0.022
orses+0.021
ets+0.020
ocket+0.019
Neg logit contributions
 Lu-0.026
 angels-0.021
Ann-0.021
Lu-0.020
 firefighters-0.020
Token came
Token ID1625
Feature activation+3.63e-02

Pos logit contributions
ret+0.021
orse+0.020
orses+0.019
ets+0.017
 peanut+0.017
Neg logit contributions
 Lu-0.027
 angels-0.022
Lu-0.021
Ann-0.021
 firefighters-0.021
Token over
Token ID625
Feature activation+6.35e-03

Pos logit contributions
 Lu+0.074
Ann+0.063
 angels+0.060
Lu+0.060
 pastor+0.060
Neg logit contributions
ret-0.053
orse-0.051
orses-0.047
's-0.046
 peanut-0.045
Token puts
Token ID7584
Feature activation-4.35e-02

Pos logit contributions
 Lu+0.016
 angels+0.013
Ann+0.013
Lu+0.013
 firefighters+0.012
Neg logit contributions
ret-0.011
orse-0.011
orses-0.010
 peanut-0.010
ets-0.009
Token a
Token ID257
Feature activation-6.87e-02

Pos logit contributions
orse+0.092
orses+0.086
ret+0.079
ets+0.076
ocus+0.074
Neg logit contributions
 Lu-0.068
 angels-0.057
 Lucy-0.053
 firefighters-0.051
Ann-0.051
Token shiny
Token ID22441
Feature activation-5.61e-02

Pos logit contributions
orses+0.124
ret+0.124
orse+0.123
ets+0.121
ocus+0.109
Neg logit contributions
 Lu-0.094
 firefighter-0.089
 angels-0.087
 angel-0.082
 firefighters-0.081
Token sticker
Token ID27735
Feature activation-3.43e-02

Pos logit contributions
orse+0.105
orses+0.104
ret+0.098
cious+0.089
ets+0.085
Neg logit contributions
 Lu-0.086
 angels-0.075
 firefighter-0.071
 angel-0.071
 firefighters-0.070
Token on
Token ID319
Feature activation-6.39e-02

Pos logit contributions
orse+0.077
ret+0.076
orses+0.075
 peanut+0.067
ets+0.064
Neg logit contributions
 Lu-0.043
 angels-0.036
 attendant-0.032
Ann-0.031
co-0.031
Token Ben
Token ID3932
Feature activation-9.42e-02

Pos logit contributions
orse+0.145
orses+0.141
ret+0.134
ets+0.132
ocus+0.129
Neg logit contributions
 Lu-0.087
 Lucy-0.072
Ann-0.065
 Ann-0.063
Lu-0.063
Token's
Token ID338
Feature activation-6.77e-02

Pos logit contributions
ret+0.139
orse+0.128
orses+0.126
 seas+0.122
 AW+0.120
Neg logit contributions
 Lu-0.159
co-0.141
 Pen-0.136
 angels-0.136
 Lucy-0.131
Token folder
Token ID9483
Feature activation-3.11e-02

Pos logit contributions
orse+0.116
ret+0.115
orses+0.115
ocus+0.105
fortable+0.099
Neg logit contributions
 angels-0.097
 Lu-0.089
 angel-0.088
 attendant-0.082
 firefighters-0.081
Token.
Token ID13
Feature activation-2.54e-02

Pos logit contributions
orse+0.059
ret+0.059
orses+0.056
 peanut+0.050
ocus+0.047
Neg logit contributions
 Lu-0.052
 angels-0.042
 attendant-0.040
Lu-0.040
Ann-0.039
Token Ben
Token ID3932
Feature activation-1.63e-02

Pos logit contributions
ret+0.049
orse+0.048
orses+0.045
ets+0.043
ocket+0.042
Neg logit contributions
 Lu-0.044
Ann-0.034
Lu-0.033
 Lucy-0.033
 angels-0.033
Token feels
Token ID5300
Feature activation-4.59e-02

Pos logit contributions
orse+0.028
ret+0.027
orses+0.026
 peanut+0.023
ork+0.023
Neg logit contributions
 Lu-0.031
 angels-0.026
Ann-0.025
 firefighters-0.024
 youths-0.024
Token make
Token ID787
Feature activation-3.07e-02

Pos logit contributions
 Lu+0.004
 angels+0.003
Ann+0.003
Lu+0.003
 firefighters+0.003
Neg logit contributions
ret-0.004
orse-0.004
orses-0.003
 peanut-0.003
ets-0.003
Token funny
Token ID8258
Feature activation-6.47e-03

Pos logit contributions
orse+0.056
ret+0.055
orses+0.052
ocket+0.050
 peanut+0.048
Neg logit contributions
 Lu-0.054
 angels-0.047
 firefighters-0.042
Ann-0.040
 Lucy-0.039
Token noises
Token ID26782
Feature activation-1.47e-02

Pos logit contributions
ret+0.012
orse+0.011
orses+0.011
 peanut+0.010
ets+0.010
Neg logit contributions
 Lu-0.012
 angels-0.010
Ann-0.009
 firefighters-0.009
Lu-0.009
Token.
Token ID13
Feature activation-5.44e-02

Pos logit contributions
ret+0.027
orse+0.026
orses+0.024
 peanut+0.022
ork+0.022
Neg logit contributions
 Lu-0.027
 angels-0.022
 firefighters-0.021
Ann-0.021
Lu-0.020
Token So
Token ID1406
Feature activation-7.56e-02

Pos logit contributions
orse+0.122
orses+0.116
ret+0.114
 peanut+0.110
ocket+0.109
Neg logit contributions
 Lu-0.073
 attendant-0.052
 firefighters-0.049
Ann-0.048
 Barber-0.048
Token,
Token ID11
Feature activation-9.42e-02

Pos logit contributions
iche+0.171
orse+0.164
ocado+0.163
orses+0.163
ret+0.159
Neg logit contributions
 Lu-0.085
 attendant-0.063
 helpers-0.062
 Luis-0.058
 Ann-0.056
Token he
Token ID339
Feature activation+1.25e-02

Pos logit contributions
iche+0.200
ocado+0.199
ork+0.193
 peanut+0.191
orse+0.188
Neg logit contributions
 Lu-0.101
 attendant-0.077
 helpers-0.075
 Luis-0.075
 angels-0.071
Token kept
Token ID4030
Feature activation+1.12e-02

Pos logit contributions
 Lu+0.027
 angels+0.023
Ann+0.022
 firefighters+0.022
 attendant+0.022
Neg logit contributions
orse-0.019
ret-0.018
orses-0.017
 peanut-0.016
ets-0.015
Token rubbing
Token ID27714
Feature activation-7.72e-03

Pos logit contributions
 Lu+0.022
Ann+0.018
 angels+0.018
Lu+0.018
 firefighters+0.017
Neg logit contributions
orse-0.019
ret-0.018
orses-0.017
 peanut-0.016
ocket-0.015
Token and
Token ID290
Feature activation-9.55e-03

Pos logit contributions
orse+0.013
ret+0.013
orses+0.013
 peanut+0.012
ets+0.012
Neg logit contributions
 Lu-0.014
 angels-0.012
Ann-0.011
 firefighters-0.011
Lu-0.011
Token rubbing
Token ID27714
Feature activation-6.60e-03

Pos logit contributions
ret+0.017
orse+0.017
orses+0.016
 peanut+0.015
ets+0.015
Neg logit contributions
 Lu-0.017
 angels-0.014
Ann-0.014
Lu-0.013
 firefighters-0.013
Token's
Token ID338
Feature activation-5.96e-02

Pos logit contributions
ret+0.106
orse+0.092
ork+0.088
orses+0.087
 AW+0.086
Neg logit contributions
 Lu-0.130
co-0.117
ma-0.114
 Lucy-0.114
Ann-0.112
Token purse
Token ID27517
Feature activation-2.54e-02

Pos logit contributions
orse+0.146
ret+0.140
orses+0.139
plets+0.128
arding+0.122
Neg logit contributions
 Lu-0.059
 firefighters-0.052
 angels-0.052
 helpers-0.050
 ambulance-0.049
Token.
Token ID13
Feature activation-2.90e-02

Pos logit contributions
ret+0.051
orse+0.049
orses+0.047
 peanut+0.042
ets+0.042
Neg logit contributions
 Lu-0.040
 angels-0.032
Ann-0.031
 attendant-0.031
Lu-0.030
Token It
Token ID632
Feature activation-4.00e-02

Pos logit contributions
orse+0.060
ret+0.060
orses+0.058
ets+0.053
ocket+0.051
Neg logit contributions
 Lu-0.047
 Lucy-0.033
 youths-0.033
 firefighters-0.032
 Pen-0.032
Token has
Token ID468
Feature activation-5.70e-02

Pos logit contributions
orse+0.084
ret+0.080
orses+0.080
ets+0.071
alks+0.070
Neg logit contributions
 Lu-0.059
Ann-0.046
 angels-0.045
 Lucy-0.045
 attendant-0.042
Token a
Token ID257
Feature activation-9.42e-02

Pos logit contributions
orse+0.130
ret+0.125
orses+0.119
ossom+0.107
oppy+0.107
Neg logit contributions
 Lu-0.073
 angels-0.067
 firefighters-0.064
 taxis-0.055
Ann-0.054
Token picture
Token ID4286
Feature activation-1.18e-02

Pos logit contributions
ret+0.165
orse+0.156
ets+0.154
orses+0.154
earing+0.139
Neg logit contributions
 Lu-0.132
 angels-0.131
 attendant-0.126
 angel-0.125
 firefighter-0.125
Token of
Token ID286
Feature activation-2.34e-02

Pos logit contributions
ret+0.026
orse+0.025
orses+0.024
 peanut+0.022
ets+0.022
Neg logit contributions
 Lu-0.017
 angels-0.013
Ann-0.013
 firefighters-0.012
Lu-0.012
Token his
Token ID465
Feature activation-5.88e-02

Pos logit contributions
ret+0.046
orse+0.045
orses+0.042
ets+0.040
ork+0.039
Neg logit contributions
 Lu-0.039
 angels-0.033
 firefighters-0.031
Ann-0.029
 Lucy-0.029
Token mom
Token ID1995
Feature activation-6.51e-02

Pos logit contributions
ret+0.122
orse+0.118
orses+0.112
sel+0.106
ets+0.105
Neg logit contributions
 Lu-0.080
 angels-0.076
 firefighters-0.066
 helpers-0.064
 angel-0.064
Token and
Token ID290
Feature activation-2.95e-02

Pos logit contributions
ret+0.096
orse+0.088
orses+0.082
ork+0.081
 AW+0.075
Neg logit contributions
 Lu-0.128
 Lucy-0.109
Ann-0.106
 Ann-0.106
co-0.106
Token card
Token ID2657
Feature activation-3.29e-02

Pos logit contributions
ret+0.042
orse+0.042
orses+0.040
ets+0.039
ocus+0.036
Neg logit contributions
 Lu-0.040
 angels-0.033
 firefighters-0.032
Lu-0.031
 attendant-0.031
Token on
Token ID319
Feature activation-1.41e-02

Pos logit contributions
ret+0.067
orse+0.063
orses+0.062
ets+0.057
ocus+0.055
Neg logit contributions
 Lu-0.049
 angels-0.042
 firefighters-0.041
Lu-0.040
 attendant-0.039
Token the
Token ID262
Feature activation-6.67e-02

Pos logit contributions
orse+0.028
ret+0.028
orses+0.027
ets+0.025
ocket+0.024
Neg logit contributions
 Lu-0.024
 angels-0.018
Ann-0.018
Lu-0.018
 firefighters-0.018
Token floor
Token ID4314
Feature activation-2.47e-02

Pos logit contributions
orse+0.140
ret+0.138
orses+0.133
bs+0.133
ocus+0.120
Neg logit contributions
 Lu-0.090
 angels-0.088
 attendant-0.087
 firefighters-0.082
co-0.076
Token.
Token ID13
Feature activation+1.10e-02

Pos logit contributions
ret+0.048
orse+0.047
orses+0.045
 peanut+0.042
ets+0.041
Neg logit contributions
 Lu-0.039
 angels-0.032
Lu-0.031
Ann-0.031
 firefighters-0.030
Token The
Token ID383
Feature activation-9.38e-02

Pos logit contributions
 Lu+0.025
 angels+0.022
Ann+0.021
Lu+0.021
 firefighters+0.021
Neg logit contributions
orse-0.014
ret-0.014
orses-0.012
 peanut-0.012
ork-0.011
Token card
Token ID2657
Feature activation-3.58e-02

Pos logit contributions
orse+0.189
ret+0.184
orses+0.183
ience+0.182
ifying+0.175
Neg logit contributions
 lady-0.109
 angels-0.109
 attendant-0.105
 firefighters-0.102
 helpers-0.093
Token had
Token ID550
Feature activation-6.30e-02

Pos logit contributions
ret+0.070
orse+0.067
orses+0.066
ets+0.064
ocus+0.059
Neg logit contributions
 Lu-0.054
 angels-0.046
 attendant-0.044
Ann-0.043
Lu-0.042
Token a
Token ID257
Feature activation-8.03e-02

Pos logit contributions
orse+0.155
orses+0.146
ret+0.144
oppy+0.136
oc+0.135
Neg logit contributions
 firefighters-0.063
 angels-0.063
 Lu-0.062
 taxis-0.045
 helpers-0.044
Token big
Token ID1263
Feature activation-6.37e-02

Pos logit contributions
ret+0.162
ets+0.160
orses+0.159
orse+0.158
iece+0.142
Neg logit contributions
 Lu-0.094
 angels-0.094
 bud-0.087
 attendant-0.087
 firefighters-0.087
Token,
Token ID11
Feature activation-5.26e-02

Pos logit contributions
ret+0.118
orses+0.116
orse+0.114
ets+0.107
bs+0.099
Neg logit contributions
 Lu-0.087
 bud-0.086
 angels-0.085
 angel-0.080
 firefighters-0.077
Tokenaffe
Token ID21223
Feature activation-1.60e-02

Pos logit contributions
orse+0.022
ret+0.022
orses+0.020
 peanut+0.018
ocket+0.017
Neg logit contributions
 Lu-0.036
 angels-0.032
 firefighters-0.031
Ann-0.030
co-0.030
Token."
Token ID526
Feature activation-3.74e-02

Pos logit contributions
ret+0.027
orse+0.027
orses+0.025
 peanut+0.023
ork+0.023
Neg logit contributions
 Lu-0.030
 angels-0.025
 firefighters-0.024
Ann-0.023
Lu-0.023
Token They
Token ID1119
Feature activation-9.93e-03

Pos logit contributions
orse+0.076
ret+0.074
orses+0.071
ocket+0.068
sel+0.066
Neg logit contributions
 Lu-0.064
 Lucy-0.054
Lu-0.051
Ann-0.048
 angels-0.044
Token looked
Token ID3114
Feature activation-2.34e-02

Pos logit contributions
orse+0.016
ret+0.016
orses+0.015
 peanut+0.014
ets+0.014
Neg logit contributions
 Lu-0.020
 angels-0.016
Ann-0.016
Lu-0.015
 firefighters-0.015
Token through
Token ID832
Feature activation-6.47e-04

Pos logit contributions
orse+0.045
ret+0.044
orses+0.043
ork+0.038
ocus+0.037
Neg logit contributions
 Lu-0.041
 angels-0.032
 firefighters-0.032
 Lucy-0.031
Lu-0.030
Token Lily
Token ID20037
Feature activation-9.37e-02

Pos logit contributions
ret+0.001
orse+0.001
orses+0.001
 peanut+0.001
ets+0.001
Neg logit contributions
 Lu-0.001
 angels-0.001
Ann-0.001
Lu-0.001
 firefighters-0.001
Token's
Token ID338
Feature activation-3.74e-02

Pos logit contributions
orse+0.150
ret+0.145
orses+0.143
 AW+0.142
Ble+0.139
Neg logit contributions
 Lu-0.149
 angels-0.135
co-0.125
 upstairs-0.122
 Lucy-0.121
Token toy
Token ID13373
Feature activation-3.06e-02

Pos logit contributions
orse+0.075
ret+0.074
orses+0.072
ocus+0.066
sel+0.062
Neg logit contributions
 Lu-0.057
 angels-0.048
 firefighters-0.045
jack-0.044
 attendant-0.044
Token box
Token ID3091
Feature activation-3.17e-02

Pos logit contributions
orse+0.064
ret+0.061
orses+0.060
ocus+0.052
 peanut+0.052
Neg logit contributions
 Lu-0.047
 angels-0.042
 firefighters-0.039
 attendant-0.037
jack-0.036
Token and
Token ID290
Feature activation-3.80e-02

Pos logit contributions
orse+0.060
ret+0.059
orses+0.057
 peanut+0.051
ocus+0.049
Neg logit contributions
 Lu-0.053
 angels-0.045
 attendant-0.042
 firefighters-0.041
Lu-0.040
Token found
Token ID1043
Feature activation-8.99e-03

Pos logit contributions
ret+0.065
orse+0.062
orses+0.061
ocus+0.056
ets+0.055
Neg logit contributions
 Lu-0.074
 Lucy-0.061
 angels-0.060
 firefighters-0.060
Lu-0.057
Token car
Token ID1097
Feature activation+1.44e-02

Pos logit contributions
 Lu+0.032
Ann+0.025
Lu+0.024
 angels+0.024
 Luis+0.023
Neg logit contributions
orse-0.032
ret-0.032
orses-0.030
 peanut-0.030
 notebooks-0.028
Token ride
Token ID6594
Feature activation-9.32e-05

Pos logit contributions
 Lu+0.027
 angels+0.021
Lu+0.021
Ann+0.021
 youths+0.021
Neg logit contributions
orse-0.025
ret-0.025
orses-0.023
 peanut-0.022
ets-0.021
Token was
Token ID373
Feature activation+2.21e-02

Pos logit contributions
ret+0.000
orse+0.000
orses+0.000
 peanut+0.000
ets+0.000
Neg logit contributions
 Lu-0.000
 angels-0.000
Ann-0.000
Lu-0.000
 firefighters-0.000
Token long
Token ID890
Feature activation+1.23e-02

Pos logit contributions
 Lu+0.045
Ann+0.036
Lu+0.036
 angels+0.036
 youths+0.036
Neg logit contributions
orse-0.033
ret-0.033
orses-0.031
 peanut-0.029
ets-0.029
Token and
Token ID290
Feature activation+1.57e-02

Pos logit contributions
 Lu+0.027
Ann+0.022
Lu+0.022
 angels+0.022
 firefighters+0.021
Neg logit contributions
ret-0.017
orse-0.017
orses-0.016
ets-0.015
 peanut-0.015
Token hot
Token ID3024
Feature activation+4.61e-02

Pos logit contributions
 Lu+0.032
 angels+0.027
Ann+0.026
 youths+0.026
Lu+0.026
Neg logit contributions
orse-0.024
ret-0.023
orses-0.022
 peanut-0.020
ets-0.020
Token.
Token ID13
Feature activation-2.17e-02

Pos logit contributions
 Lu+0.109
Ann+0.098
Lu+0.096
Pen+0.095
 attendant+0.093
Neg logit contributions
ets-0.044
orse-0.041
 peanut-0.038
ard-0.037
orses-0.036
Token Tom
Token ID4186
Feature activation+4.57e-03

Pos logit contributions
ret+0.039
orse+0.039
orses+0.036
ocket+0.033
 peanut+0.032
Neg logit contributions
 Lu-0.041
 firefighters-0.033
Ann-0.033
 Lucy-0.032
 attendant-0.032
Token and
Token ID290
Feature activation+5.76e-03

Pos logit contributions
 Lu+0.010
 angels+0.008
Ann+0.008
Lu+0.008
 firefighters+0.008
Neg logit contributions
orse-0.007
ret-0.007
orses-0.006
 peanut-0.006
ets-0.006
Token Lily
Token ID20037
Feature activation+7.67e-03

Pos logit contributions
 Lu+0.012
 angels+0.010
Ann+0.010
Lu+0.010
 firefighters+0.009
Neg logit contributions
ret-0.009
orse-0.009
orses-0.008
 peanut-0.008
ets-0.007
Token felt
Token ID2936
Feature activation-1.08e-02

Pos logit contributions
 Lu+0.015
 angels+0.013
Ann+0.013
Lu+0.013
 firefighters+0.013
Neg logit contributions
orse-0.012
ret-0.012
orses-0.011
 peanut-0.010
ets-0.010
Token like
Token ID588
Feature activation-2.13e-02

Pos logit contributions
ret+0.058
orse+0.055
ork+0.055
orses+0.054
 peanut+0.052
Neg logit contributions
 Lu-0.050
Ann-0.039
Lu-0.037
 angels-0.037
co-0.037
Token it
Token ID340
Feature activation+1.35e-02

Pos logit contributions
orse+0.030
ret+0.030
orses+0.028
ets+0.025
sel+0.025
Neg logit contributions
 Lu-0.047
 angels-0.040
Ann-0.039
Lu-0.037
 taxis-0.037
Token?"
Token ID1701
Feature activation+1.72e-02

Pos logit contributions
 Lu+0.025
Ann+0.020
 angels+0.020
Lu+0.020
 firefighters+0.020
Neg logit contributions
orse-0.023
ret-0.023
orses-0.021
 peanut-0.021
ets-0.020
Token\n
Token ID198
Feature activation+1.69e-02

Pos logit contributions
 Lu+0.038
 angels+0.034
ators+0.033
 firefighters+0.033
co+0.032
Neg logit contributions
orse-0.022
ret-0.021
orses-0.020
 peanut-0.018
bs-0.017
Token\n
Token ID198
Feature activation+1.32e-02

Pos logit contributions
 Lu+0.037
 angels+0.033
 firefighters+0.032
 youths+0.031
ators+0.031
Neg logit contributions
orse-0.022
ret-0.022
orses-0.020
ork-0.018
 peanut-0.018
TokenL
Token ID43
Feature activation+4.72e-02

Pos logit contributions
 Lu+0.030
 angels+0.026
 firefighters+0.024
 attendant+0.024
 youths+0.023
Neg logit contributions
orse-0.017
ret-0.017
orses-0.016
 peanut-0.014
ets-0.014
Tokenily
Token ID813
Feature activation+2.39e-02

Pos logit contributions
 Lu+0.108
 Taxi+0.098
jack+0.097
 angels+0.095
 ambul+0.094
Neg logit contributions
ets-0.057
ret-0.055
ocket-0.046
ocus-0.046
ork-0.044
Token smiled
Token ID13541
Feature activation+1.86e-02

Pos logit contributions
 Lu+0.049
 angels+0.043
 firefighters+0.042
Ann+0.040
Lu+0.039
Neg logit contributions
orse-0.036
ret-0.035
orses-0.032
 peanut-0.031
ets-0.029
Token.
Token ID13
Feature activation+3.04e-02

Pos logit contributions
 Lu+0.043
Ann+0.037
Lu+0.036
 angels+0.036
 firefighters+0.035
Neg logit contributions
ret-0.023
orse-0.023
orses-0.021
 peanut-0.020
ets-0.020
Token She
Token ID1375
Feature activation+1.66e-05

Pos logit contributions
 Lu+0.062
 angels+0.055
ators+0.054
 firefighters+0.053
Ann+0.053
Neg logit contributions
ret-0.041
orse-0.039
 peanut-0.036
orses-0.036
 This-0.035
Token liked
Token ID8288
Feature activation+2.83e-03

Pos logit contributions
 Lu+0.000
 angels+0.000
Ann+0.000
Lu+0.000
 firefighters+0.000
Neg logit contributions
ret-0.000
orse-0.000
orses-0.000
 peanut-0.000
ets-0.000
TokenSuddenly
Token ID38582
Feature activation+5.08e-02

Pos logit contributions
 Lu+0.058
 angels+0.051
 firefighters+0.047
 attendant+0.046
 youths+0.046
Neg logit contributions
orse-0.041
orses-0.039
ret-0.037
ork-0.035
ets-0.035
Token,
Token ID11
Feature activation+5.34e-02

Pos logit contributions
 Lu+0.131
co+0.126
ators+0.122
 youths+0.121
 attendant+0.118
Neg logit contributions
orse-0.030
bs-0.025
ret-0.025
's-0.023
orses-0.023
Token a
Token ID257
Feature activation+4.20e-02

Pos logit contributions
 Lu+0.108
co+0.107
ators+0.100
 youths+0.098
 angels+0.097
Neg logit contributions
orse-0.052
bs-0.051
dream-0.047
 peanut-0.047
 accidentally-0.046
Token strange
Token ID6283
Feature activation+3.54e-02

Pos logit contributions
 Lu+0.090
 angels+0.078
ators+0.077
Ann+0.077
Lu+0.073
Neg logit contributions
orse-0.057
 peanut-0.053
ret-0.050
orses-0.049
ork-0.047
Token noise
Token ID7838
Feature activation+2.59e-02

Pos logit contributions
 Lu+0.068
 angels+0.058
Ann+0.058
Lu+0.056
 youths+0.054
Neg logit contributions
orse-0.057
 peanut-0.053
ret-0.053
ork-0.049
orses-0.049
Token made
Token ID925
Feature activation+5.72e-02

Pos logit contributions
 Lu+0.056
 angels+0.049
Ann+0.048
Lu+0.046
 youths+0.046
Neg logit contributions
orse-0.035
ret-0.032
orses-0.030
 peanut-0.030
ets-0.029
Token her
Token ID607
Feature activation-8.24e-03

Pos logit contributions
 Lu+0.099
co+0.093
 Barber+0.086
ators+0.085
 Taxi+0.084
Neg logit contributions
ret-0.077
ets-0.075
orse-0.072
bs-0.070
ork-0.070
Token jump
Token ID4391
Feature activation+1.27e-03

Pos logit contributions
ret+0.015
orse+0.015
orses+0.014
 peanut+0.013
ets+0.013
Neg logit contributions
 Lu-0.015
 angels-0.012
Ann-0.011
 firefighters-0.011
 attendant-0.011
Token.
Token ID13
Feature activation+4.43e-02

Pos logit contributions
 Lu+0.002
 angels+0.002
Ann+0.002
Lu+0.002
 firefighters+0.002
Neg logit contributions
ret-0.002
orse-0.002
orses-0.002
 peanut-0.002
ets-0.002
Token A
Token ID317
Feature activation+2.01e-02

Pos logit contributions
 Lu+0.096
 angels+0.094
ators+0.092
guard+0.088
co+0.088
Neg logit contributions
orse-0.045
 This-0.042
ork-0.040
orses-0.038
 Its-0.038
Token scary
Token ID14343
Feature activation+1.29e-02

Pos logit contributions
 Lu+0.041
 angels+0.034
Ann+0.034
Lu+0.033
ators+0.033
Neg logit contributions
orse-0.031
ret-0.029
orses-0.027
 peanut-0.027
ork-0.025
Token Tim
Token ID5045
Feature activation+1.56e-02

Pos logit contributions
ret+0.015
orse+0.015
orses+0.014
ocket+0.013
 peanut+0.013
Neg logit contributions
 Lu-0.012
 angels-0.009
Ann-0.009
Lu-0.009
 firefighters-0.009
Tokenmy
Token ID1820
Feature activation-1.86e-02

Pos logit contributions
 Lu+0.024
 angels+0.019
 firefighters+0.018
Ann+0.017
 attendant+0.017
Neg logit contributions
ret-0.032
orse-0.032
orses-0.030
 peanut-0.029
ets-0.029
Token felt
Token ID2936
Feature activation-1.98e-02

Pos logit contributions
ret+0.028
orse+0.027
orses+0.025
ocket+0.023
ets+0.023
Neg logit contributions
 Lu-0.039
Ann-0.032
 angels-0.031
Lu-0.031
co-0.031
Token sad
Token ID6507
Feature activation-7.79e-03

Pos logit contributions
ret+0.034
orse+0.032
orses+0.031
ork+0.028
sel+0.028
Neg logit contributions
 Lu-0.039
 youths-0.032
 firefighters-0.032
 angels-0.030
Lu-0.029
Token and
Token ID290
Feature activation-1.81e-02

Pos logit contributions
ret+0.013
orse+0.013
orses+0.012
 peanut+0.011
ocket+0.011
Neg logit contributions
 Lu-0.015
 angels-0.013
 firefighters-0.012
Ann-0.012
 youths-0.012
Token went
Token ID1816
Feature activation+4.56e-02

Pos logit contributions
ret+0.033
orse+0.031
orses+0.029
ork+0.028
sel+0.027
Neg logit contributions
 Lu-0.034
 firefighters-0.027
 angels-0.026
 youths-0.026
 Lucy-0.026
Token back
Token ID736
Feature activation+4.69e-02

Pos logit contributions
 Lu+0.107
Ann+0.092
 Stev+0.092
ators+0.091
Lu+0.090
Neg logit contributions
 peanut-0.043
ets-0.043
 drawing-0.042
ork-0.041
ret-0.041
Token inside
Token ID2641
Feature activation-2.87e-03

Pos logit contributions
 Lu+0.109
Ann+0.104
 pastor+0.103
 angels+0.103
ators+0.101
Neg logit contributions
orse-0.038
ret-0.036
s-0.035
 drawing-0.034
 peanut-0.034
Token his
Token ID465
Feature activation+3.16e-02

Pos logit contributions
ret+0.005
orse+0.005
orses+0.005
 peanut+0.004
ets+0.004
Neg logit contributions
 Lu-0.005
 angels-0.004
Ann-0.004
Lu-0.004
 firefighters-0.004
Token hut
Token ID40812
Feature activation+1.99e-02

Pos logit contributions
 Lu+0.068
Ann+0.061
Lu+0.057
co+0.056
whe+0.055
Neg logit contributions
orse-0.043
 peanut-0.042
ret-0.039
orses-0.039
ork-0.038
Token.
Token ID13
Feature activation-2.36e-02

Pos logit contributions
 Lu+0.044
Ann+0.037
 angels+0.036
Lu+0.036
 youths+0.035
Neg logit contributions
orse-0.028
ret-0.027
orses-0.025
 peanut-0.024
ets-0.023
Token\n
Token ID198
Feature activation-8.91e-03

Pos logit contributions
 Lu+0.005
 angels+0.004
Ann+0.004
 firefighters+0.004
Lu+0.004
Neg logit contributions
ret-0.004
orse-0.004
orses-0.004
 peanut-0.003
ets-0.003
TokenOne
Token ID3198
Feature activation+4.21e-03

Pos logit contributions
ret+0.019
orse+0.019
orses+0.018
 peanut+0.017
ocket+0.017
Neg logit contributions
 Lu-0.013
Lu-0.010
Ann-0.010
 angels-0.010
co-0.009
Token day
Token ID1110
Feature activation-2.08e-02

Pos logit contributions
 Lu+0.008
 angels+0.007
Ann+0.007
 firefighters+0.006
Lu+0.006
Neg logit contributions
orse-0.007
ret-0.007
orses-0.006
 peanut-0.006
ets-0.006
Token,
Token ID11
Feature activation-1.84e-02

Pos logit contributions
ret+0.038
orse+0.037
orses+0.035
ets+0.033
ocket+0.033
Neg logit contributions
 Lu-0.039
 angels-0.031
Lu-0.030
Ann-0.030
 Lucy-0.029
Token it
Token ID340
Feature activation+2.80e-03

Pos logit contributions
ret+0.038
orse+0.037
orses+0.036
ets+0.033
ocket+0.032
Neg logit contributions
 Lu-0.031
 angels-0.025
Lu-0.024
 Lucy-0.024
Ann-0.023
Token r
Token ID374
Feature activation+4.58e-02

Pos logit contributions
 Lu+0.006
 angels+0.005
Ann+0.004
Lu+0.004
 firefighters+0.004
Neg logit contributions
ret-0.004
orse-0.004
orses-0.004
 peanut-0.004
ets-0.004
Tokenained
Token ID1328
Feature activation+1.70e-02

Pos logit contributions
 Lu+0.088
 Luis+0.069
 attendant+0.068
 Pen+0.068
 Taxi+0.067
Neg logit contributions
ret-0.075
ets-0.072
orse-0.068
ocus-0.067
orses-0.067
Token outside
Token ID2354
Feature activation+1.59e-03

Pos logit contributions
 Lu+0.035
 angels+0.030
Ann+0.029
 youths+0.028
 Taxi+0.028
Neg logit contributions
ret-0.024
orse-0.023
orses-0.021
ets-0.021
 peanut-0.021
Token,
Token ID11
Feature activation-9.01e-03

Pos logit contributions
 Lu+0.003
 angels+0.003
Ann+0.003
Lu+0.003
 firefighters+0.003
Neg logit contributions
ret-0.003
orse-0.003
orses-0.002
 peanut-0.002
ets-0.002
Token and
Token ID290
Feature activation-3.72e-03

Pos logit contributions
ret+0.016
orse+0.016
orses+0.015
 peanut+0.013
ocket+0.013
Neg logit contributions
 Lu-0.017
 angels-0.014
 firefighters-0.013
Ann-0.013
Lu-0.013
Token Lily
Token ID20037
Feature activation-2.26e-02

Pos logit contributions
ret+0.006
orse+0.006
orses+0.006
 peanut+0.005
ets+0.005
Neg logit contributions
 Lu-0.007
 angels-0.006
Ann-0.006
Lu-0.006
 firefighters-0.006
Token,
Token ID11
Feature activation-7.02e-03

Pos logit contributions
ret+0.057
orse+0.052
orses+0.050
ets+0.049
ocket+0.047
Neg logit contributions
 Lu-0.056
 angels-0.043
 firefighters-0.043
 Lucy-0.043
 attendant-0.042
Token the
Token ID262
Feature activation+3.36e-02

Pos logit contributions
ret+0.012
orse+0.012
orses+0.011
ets+0.010
 peanut+0.010
Neg logit contributions
 Lu-0.014
 angels-0.011
Ann-0.011
 firefighters-0.011
Lu-0.010
Token boy
Token ID2933
Feature activation-2.18e-02

Pos logit contributions
 Lu+0.066
Ann+0.057
Lu+0.054
ators+0.054
co+0.052
Neg logit contributions
orse-0.051
 peanut-0.050
ret-0.047
orses-0.046
 pen-0.042
Token felt
Token ID2936
Feature activation-7.43e-03

Pos logit contributions
ret+0.035
orse+0.033
orses+0.032
ets+0.029
 peanut+0.029
Neg logit contributions
 Lu-0.044
 angels-0.035
Ann-0.035
 firefighters-0.034
Lu-0.034
Token fresh
Token ID4713
Feature activation+2.42e-02

Pos logit contributions
ret+0.013
orse+0.013
orses+0.012
 peanut+0.011
ets+0.011
Neg logit contributions
 Lu-0.014
 angels-0.012
 firefighters-0.011
Lu-0.011
Ann-0.011
Token and
Token ID290
Feature activation+9.11e-03

Pos logit contributions
 Lu+0.055
Ann+0.048
Lu+0.046
 attendant+0.045
 angels+0.045
Neg logit contributions
orse-0.029
 peanut-0.028
ret-0.028
ets-0.027
orses-0.026
Token happy
Token ID3772
Feature activation+8.00e-03

Pos logit contributions
 Lu+0.018
Ann+0.015
 angels+0.015
Lu+0.015
 firefighters+0.014
Neg logit contributions
orse-0.015
ret-0.014
orses-0.013
 peanut-0.013
ets-0.012
Token.
Token ID13
Feature activation-4.82e-03

Pos logit contributions
 Lu+0.016
Ann+0.014
Lu+0.013
 angels+0.013
 firefighters+0.013
Neg logit contributions
orse-0.012
ret-0.012
orses-0.011
 peanut-0.011
ets-0.010
Token He
Token ID679
Feature activation-2.63e-03

Pos logit contributions
ret+0.010
orse+0.010
orses+0.009
 peanut+0.008
ets+0.008
Neg logit contributions
 Lu-0.008
 angels-0.006
Ann-0.006
Lu-0.006
 firefighters-0.006
Token thanked
Token ID26280
Feature activation+8.26e-03

Pos logit contributions
ret+0.004
orse+0.004
orses+0.004
 peanut+0.003
ets+0.003
Neg logit contributions
 Lu-0.005
 angels-0.004
Ann-0.004
Lu-0.004
 firefighters-0.004
Token the
Token ID262
Feature activation+4.62e-02

Pos logit contributions
 Lu+0.016
Ann+0.013
 angels+0.013
Lu+0.013
 firefighters+0.013
Neg logit contributions
orse-0.013
ret-0.013
orses-0.012
 peanut-0.012
ocket-0.011
Token refused
Token ID6520
Feature activation-1.41e-03

Pos logit contributions
ret+0.020
orse+0.020
orses+0.019
 peanut+0.017
ets+0.017
Neg logit contributions
 Lu-0.025
 angels-0.020
Ann-0.019
Lu-0.019
 firefighters-0.019
Token to
Token ID284
Feature activation-7.06e-03

Pos logit contributions
ret+0.002
orse+0.002
orses+0.002
 peanut+0.002
ets+0.002
Neg logit contributions
 Lu-0.003
 angels-0.002
Ann-0.002
Lu-0.002
 firefighters-0.002
Token listen
Token ID6004
Feature activation-2.22e-02

Pos logit contributions
ret+0.011
orse+0.011
orses+0.011
ets+0.010
 peanut+0.010
Neg logit contributions
 Lu-0.014
 angels-0.012
Ann-0.011
Lu-0.011
 firefighters-0.011
Token.
Token ID13
Feature activation+6.79e-04

Pos logit contributions
orse+0.043
ret+0.042
orses+0.041
 peanut+0.038
ocket+0.037
Neg logit contributions
 Lu-0.037
 angels-0.029
Lu-0.028
 attendant-0.028
Ann-0.027
Token␣
Token ID220
Feature activation+1.67e-02

Pos logit contributions
 Lu+0.001
 angels+0.001
Ann+0.001
Lu+0.001
 firefighters+0.001
Neg logit contributions
ret-0.001
orse-0.001
orses-0.001
 peanut-0.001
ets-0.001
Token\n
Token ID198
Feature activation-5.48e-03

Pos logit contributions
 Lu+0.036
 angels+0.030
 firefighters+0.029
Ann+0.029
Lu+0.029
Neg logit contributions
ret-0.024
orse-0.024
orses-0.022
 peanut-0.020
ork-0.019
Token\n
Token ID198
Feature activation-1.40e-02

Pos logit contributions
ret+0.009
orse+0.009
orses+0.008
 peanut+0.008
ets+0.008
Neg logit contributions
 Lu-0.011
 angels-0.009
Ann-0.009
Lu-0.008
 firefighters-0.008
TokenFinally
Token ID11158
Feature activation-1.67e-02

Pos logit contributions
ret+0.025
orse+0.024
orses+0.023
 peanut+0.022
ocket+0.022
Neg logit contributions
 Lu-0.025
Lu-0.021
Ann-0.021
 angels-0.020
 firefighters-0.020
Token,
Token ID11
Feature activation-2.54e-02

Pos logit contributions
ret+0.029
orse+0.028
orses+0.026
ets+0.025
ocket+0.024
Neg logit contributions
 Lu-0.032
 angels-0.026
 firefighters-0.025
 Lucy-0.025
Lu-0.025
Token Jack
Token ID3619
Feature activation-3.20e-02

Pos logit contributions
ret+0.051
orse+0.049
orses+0.048
ocus+0.045
ets+0.045
Neg logit contributions
 Lu-0.043
 firefighters-0.033
 Lucy-0.033
 angels-0.032
Lu-0.032
Token's
Token ID338
Feature activation-7.49e-03

Pos logit contributions
ret+0.048
orse+0.047
orses+0.045
ets+0.039
ocus+0.038
Neg logit contributions
 Lu-0.065
Ann-0.053
 attendant-0.052
 firefighters-0.052
 angels-0.052
Token friends
Token ID2460
Feature activation+1.80e-02

Pos logit contributions
 Lu+0.066
Ann+0.055
Lu+0.055
Michael+0.053
 Taxi+0.050
Neg logit contributions
orse-0.059
ret-0.059
 peanut-0.058
 notebooks-0.055
orses-0.054
Token were
Token ID547
Feature activation+3.22e-02

Pos logit contributions
 Lu+0.039
 angels+0.034
Ann+0.032
 firefighters+0.032
Lu+0.031
Neg logit contributions
ret-0.026
orse-0.025
orses-0.022
 peanut-0.022
ets-0.021
Token angry
Token ID7954
Feature activation+8.48e-03

Pos logit contributions
 Lu+0.074
 angels+0.065
Ann+0.064
Lu+0.063
 Taxi+0.062
Neg logit contributions
orse-0.037
ret-0.032
orses-0.032
 peanut-0.030
 drawing-0.029
Token.
Token ID13
Feature activation-8.26e-04

Pos logit contributions
 Lu+0.018
 angels+0.015
Ann+0.015
Lu+0.015
 firefighters+0.015
Neg logit contributions
orse-0.012
ret-0.012
orses-0.011
ets-0.010
 peanut-0.010
Token "
Token ID366
Feature activation-1.04e-02

Pos logit contributions
ret+0.001
orse+0.001
orses+0.001
 peanut+0.001
ets+0.001
Neg logit contributions
 Lu-0.002
 angels-0.001
Ann-0.001
Lu-0.001
 firefighters-0.001
TokenNo
Token ID2949
Feature activation-1.20e-03

Pos logit contributions
orse+0.024
ret+0.024
orses+0.023
 peanut+0.022
ets+0.021
Neg logit contributions
 Lu-0.014
Ann-0.010
Lu-0.010
 angels-0.010
 firefighters-0.009
Token!"
Token ID2474
Feature activation-2.37e-02

Pos logit contributions
ret+0.002
orse+0.002
orses+0.002
 peanut+0.001
ets+0.001
Neg logit contributions
 Lu-0.003
 angels-0.002
Ann-0.002
Lu-0.002
 firefighters-0.002
Token they
Token ID484
Feature activation+1.54e-02

Pos logit contributions
ret+0.043
orse+0.043
orses+0.041
ocket+0.038
 peanut+0.037
Neg logit contributions
 Lu-0.046
Ann-0.035
 attendant-0.035
Lu-0.034
 Lucy-0.034
Token shouted
Token ID17293
Feature activation-9.37e-03

Pos logit contributions
 Lu+0.034
 angels+0.030
Ann+0.029
 firefighters+0.028
Lu+0.028
Neg logit contributions
ret-0.021
orse-0.020
orses-0.018
 peanut-0.017
ets-0.017
Token.
Token ID13
Feature activation-3.46e-03

Pos logit contributions
ret+0.015
orse+0.015
orses+0.014
 peanut+0.013
ets+0.013
Neg logit contributions
 Lu-0.019
 angels-0.015
 firefighters-0.015
Ann-0.015
Lu-0.015
Token "
Token ID366
Feature activation-5.37e-03

Pos logit contributions
ret+0.006
orse+0.006
orses+0.006
 peanut+0.005
ets+0.005
Neg logit contributions
 Lu-0.007
 angels-0.005
Ann-0.005
Lu-0.005
 firefighters-0.005
Token get
Token ID651
Feature activation+8.48e-03

Pos logit contributions
 Lu+0.024
 angels+0.020
Ann+0.020
Lu+0.019
 youths+0.019
Neg logit contributions
orse-0.017
ret-0.017
orses-0.016
 peanut-0.015
ork-0.014
Token filled
Token ID5901
Feature activation-1.23e-02

Pos logit contributions
 Lu+0.016
Ann+0.014
 angels+0.013
Lu+0.013
 youths+0.013
Neg logit contributions
ret-0.014
orse-0.014
orses-0.013
 peanut-0.012
ets-0.012
Token with
Token ID351
Feature activation-2.63e-02

Pos logit contributions
orse+0.021
ret+0.021
orses+0.019
ets+0.018
 peanut+0.018
Neg logit contributions
 Lu-0.024
 angels-0.019
Lu-0.019
Ann-0.019
 firefighters-0.019
Token water
Token ID1660
Feature activation+9.61e-03

Pos logit contributions
orse+0.048
ret+0.048
orses+0.044
ork+0.041
ets+0.040
Neg logit contributions
 Lu-0.048
 angels-0.040
 firefighters-0.039
 youths-0.036
Lu-0.035
Token.
Token ID13
Feature activation+1.32e-02

Pos logit contributions
 Lu+0.019
 angels+0.016
Ann+0.016
Lu+0.016
 firefighters+0.015
Neg logit contributions
ret-0.015
orse-0.015
orses-0.014
 peanut-0.013
ets-0.013
Tokenâ
Token ID22940
Feature activation+4.75e-03

Pos logit contributions
 Lu+0.028
 angels+0.025
Ann+0.023
 firefighters+0.023
Lu+0.023
Neg logit contributions
orse-0.019
ret-0.018
orses-0.017
 peanut-0.016
ets-0.015
Token€
Token ID26391
Feature activation-1.26e-02

Pos logit contributions
 Lu+0.009
 angels+0.007
 firefighters+0.007
Ann+0.007
Lu+0.007
Neg logit contributions
ret-0.009
orse-0.008
orses-0.008
ets-0.007
 peanut-0.007
Token\n
Token ID198
Feature activation+2.77e-02

Pos logit contributions
ret+0.020
orse+0.019
orses+0.018
 peanut+0.017
ets+0.017
Neg logit contributions
 Lu-0.026
Ann-0.021
 angels-0.021
Lu-0.020
 Taxi-0.020
Token\n
Token ID198
Feature activation+1.66e-02

Pos logit contributions
 Lu+0.062
 angels+0.058
 firefighters+0.056
ators+0.054
 angel+0.053
Neg logit contributions
orse-0.035
ret-0.033
orses-0.031
ork-0.027
 peanut-0.026
TokenSo
Token ID2396
Feature activation-3.63e-03

Pos logit contributions
 Lu+0.030
 angels+0.026
 firefighters+0.023
 attendant+0.023
Ann+0.023
Neg logit contributions
orse-0.029
ret-0.027
orses-0.027
 peanut-0.025
ork-0.024
Token Tim
Token ID5045
Feature activation+2.86e-02

Pos logit contributions
ret+0.006
orse+0.006
orses+0.006
 peanut+0.005
ets+0.005
Neg logit contributions
 Lu-0.007
 angels-0.006
Ann-0.006
Lu-0.005
 firefighters-0.005
Token friend
Token ID1545
Feature activation+3.23e-02

Pos logit contributions
 Lu+0.049
Ann+0.043
ators+0.041
Lu+0.040
 attendant+0.039
Neg logit contributions
orse-0.053
ret-0.051
 peanut-0.051
orses-0.046
 pen-0.044
Token,
Token ID11
Feature activation+3.67e-02

Pos logit contributions
 Lu+0.072
 angels+0.061
Ann+0.061
 firefighters+0.060
 attendant+0.060
Neg logit contributions
orse-0.041
ret-0.039
orses-0.036
 peanut-0.035
ets-0.034
Token O
Token ID440
Feature activation+7.43e-03

Pos logit contributions
 Lu+0.067
ators+0.060
co+0.060
 angels+0.059
Ann+0.059
Neg logit contributions
orse-0.059
ret-0.058
 peanut-0.056
orses-0.052
 Po-0.046
Tokenll
Token ID297
Feature activation-1.36e-04

Pos logit contributions
 Lu+0.012
 angels+0.010
 firefighters+0.010
Ann+0.009
 youths+0.009
Neg logit contributions
ret-0.015
orse-0.015
orses-0.014
 peanut-0.013
ets-0.012
Tokenie
Token ID494
Feature activation+3.36e-02

Pos logit contributions
ret+0.000
orse+0.000
orses+0.000
 peanut+0.000
ets+0.000
Neg logit contributions
 Lu-0.000
 angels-0.000
Ann-0.000
Lu-0.000
 firefighters-0.000
Token."
Token ID526
Feature activation+1.19e-02

Pos logit contributions
 Lu+0.077
Ann+0.066
 angels+0.064
 firefighters+0.064
 attendant+0.063
Neg logit contributions
orse-0.041
ret-0.041
orses-0.037
 peanut-0.034
ets-0.033
Token O
Token ID440
Feature activation+1.95e-02

Pos logit contributions
 Lu+0.023
 angels+0.020
 firefighters+0.019
Ann+0.019
Lu+0.019
Neg logit contributions
orse-0.019
ret-0.019
orses-0.018
 peanut-0.017
ets-0.015
Tokenll
Token ID297
Feature activation+2.40e-02

Pos logit contributions
 Lu+0.037
 firefighters+0.032
 angels+0.031
 youths+0.030
 attendant+0.029
Neg logit contributions
ret-0.035
orse-0.035
orses-0.032
 peanut-0.029
ocus-0.028
Tokenie
Token ID494
Feature activation+2.98e-02

Pos logit contributions
 Lu+0.051
 firefighters+0.043
 angels+0.042
 attendant+0.042
Ann+0.042
Neg logit contributions
ret-0.036
orse-0.035
orses-0.032
 peanut-0.029
ets-0.028
Token was
Token ID373
Feature activation+2.48e-02

Pos logit contributions
 Lu+0.061
 angels+0.051
 firefighters+0.051
Ann+0.051
 Taxi+0.051
Neg logit contributions
orse-0.047
ret-0.046
orses-0.042
 peanut-0.041
sel-0.036
Token happy
Token ID3772
Feature activation+1.24e-02

Pos logit contributions
 Lu+0.058
co+0.051
 angels+0.050
Ann+0.050
ators+0.049
Neg logit contributions
orse-0.031
ret-0.028
orses-0.027
 peanut-0.026
ocket-0.022
Token so
Token ID523
Feature activation-1.28e-02

Pos logit contributions
ret+0.025
orse+0.024
orses+0.023
ork+0.021
 peanut+0.021
Neg logit contributions
 Lu-0.024
 angels-0.019
 firefighters-0.019
Ann-0.018
 attendant-0.018
Token he
Token ID339
Feature activation+1.52e-02

Pos logit contributions
ret+0.022
orse+0.020
ets+0.019
orses+0.019
ocket+0.018
Neg logit contributions
 Lu-0.025
 angels-0.021
Ann-0.020
 firefighters-0.020
Lu-0.019
Token got
Token ID1392
Feature activation+1.37e-02

Pos logit contributions
 Lu+0.031
 angels+0.027
Ann+0.026
Lu+0.025
 youths+0.025
Neg logit contributions
orse-0.024
ret-0.023
orses-0.021
 peanut-0.020
ets-0.019
Token on
Token ID319
Feature activation-1.00e-03

Pos logit contributions
 Lu+0.028
Ann+0.023
Lu+0.023
 angels+0.023
co+0.022
Neg logit contributions
ret-0.021
orse-0.021
orses-0.019
 peanut-0.019
ets-0.017
Token the
Token ID262
Feature activation-3.81e-02

Pos logit contributions
ret+0.002
orse+0.002
orses+0.002
 peanut+0.002
ets+0.002
Neg logit contributions
 Lu-0.002
 angels-0.001
Ann-0.001
Lu-0.001
 firefighters-0.001
Token plane
Token ID6614
Feature activation-1.29e-02

Pos logit contributions
ret+0.076
orse+0.075
orses+0.069
 peanut+0.068
ocket+0.066
Neg logit contributions
 Lu-0.062
 angels-0.053
 firefighters-0.050
 attendant-0.048
 taxis-0.047
Token.
Token ID13
Feature activation+2.21e-03

Pos logit contributions
ret+0.023
orse+0.022
orses+0.021
 peanut+0.020
ocket+0.019
Neg logit contributions
 Lu-0.024
 angels-0.020
 attendant-0.018
Ann-0.018
 firefighters-0.018
Token␣
Token ID220
Feature activation+2.03e-02

Pos logit contributions
 Lu+0.005
 angels+0.004
Ann+0.004
 firefighters+0.004
Lu+0.004
Neg logit contributions
ret-0.003
orse-0.003
orses-0.003
 peanut-0.003
ets-0.003
Token\n
Token ID198
Feature activation+1.01e-02

Pos logit contributions
 Lu+0.044
 angels+0.038
 firefighters+0.037
Ann+0.036
 youths+0.036
Neg logit contributions
ret-0.029
orse-0.028
orses-0.026
ork-0.023
 peanut-0.023
Token\n
Token ID198
Feature activation+7.06e-03

Pos logit contributions
 Lu+0.021
 angels+0.018
 firefighters+0.017
Ann+0.017
 youths+0.017
Neg logit contributions
ret-0.015
orse-0.015
orses-0.014
 peanut-0.012
ork-0.012
TokenThe
Token ID464
Feature activation-1.61e-02

Pos logit contributions
 Lu+0.015
 angels+0.013
 firefighters+0.012
Ann+0.012
 youths+0.012
Neg logit contributions
orse-0.011
ret-0.010
orses-0.010
 peanut-0.009
ets-0.009
Token said
Token ID531
Feature activation+1.37e-02

Pos logit contributions
 Lu+0.040
 angels+0.035
 firefighters+0.034
 Taxi+0.033
ators+0.033
Neg logit contributions
orse-0.023
ret-0.022
orses-0.019
 peanut-0.019
ets-0.018
Token,
Token ID11
Feature activation+3.43e-03

Pos logit contributions
 Lu+0.032
Ann+0.028
 angels+0.027
Lu+0.027
 firefighters+0.027
Neg logit contributions
orse-0.017
ret-0.017
orses-0.015
 peanut-0.014
ets-0.013
Token reaching
Token ID8978
Feature activation-6.57e-03

Pos logit contributions
 Lu+0.007
 angels+0.005
Ann+0.005
Lu+0.005
 firefighters+0.005
Neg logit contributions
orse-0.006
ret-0.006
orses-0.005
 peanut-0.005
ets-0.005
Token for
Token ID329
Feature activation-3.71e-02

Pos logit contributions
orse+0.011
ret+0.011
orses+0.010
 peanut+0.009
ets+0.009
Neg logit contributions
 Lu-0.013
 angels-0.011
Ann-0.010
 firefighters-0.010
Lu-0.010
Token the
Token ID262
Feature activation-4.12e-02

Pos logit contributions
ret+0.072
orse+0.070
orses+0.068
ets+0.068
ocus+0.066
Neg logit contributions
 Lu-0.060
 firefighters-0.047
 angels-0.047
 Lucy-0.046
 Ann-0.042
Token coal
Token ID5655
Feature activation-1.71e-02

Pos logit contributions
orse+0.080
ret+0.078
orses+0.077
ocus+0.073
ets+0.070
Neg logit contributions
 Lu-0.059
 angels-0.053
 firefighters-0.053
 attendant-0.048
jack-0.047
Token.
Token ID13
Feature activation-1.72e-02

Pos logit contributions
orse+0.031
ret+0.031
orses+0.029
 peanut+0.026
ets+0.026
Neg logit contributions
 Lu-0.030
 angels-0.025
 firefighters-0.024
 attendant-0.023
Ann-0.023
Token\n
Token ID198
Feature activation-1.72e-02

Pos logit contributions
ret+0.032
orse+0.032
orses+0.030
ets+0.028
ocket+0.028
Neg logit contributions
 Lu-0.031
 angels-0.024
Ann-0.023
Lu-0.023
 attendant-0.023
Token\n
Token ID198
Feature activation-3.29e-02

Pos logit contributions
ret+0.031
orse+0.031
orses+0.029
 peanut+0.028
ets+0.028
Neg logit contributions
 Lu-0.030
Ann-0.024
Lu-0.023
 angels-0.023
 attendant-0.022
TokenBut
Token ID1537
Feature activation-3.40e-02

Pos logit contributions
ret+0.075
orse+0.069
ocket+0.067
 peanut+0.065
orses+0.065
Neg logit contributions
 Lu-0.044
Lu-0.039
Ann-0.038
 youths-0.034
 angels-0.033
Token Ben
Token ID3932
Feature activation-1.27e-02

Pos logit contributions
ret+0.066
orse+0.060
orses+0.059
ets+0.058
ork+0.056
Neg logit contributions
 Lu-0.062
 Lucy-0.048
 angels-0.047
Ann-0.045
 Pen-0.044
Token along
Token ID1863
Feature activation-5.43e-03

Pos logit contributions
 Lu+0.030
 angels+0.026
Ann+0.025
 firefighters+0.024
Lu+0.024
Neg logit contributions
ret-0.021
orse-0.020
ets-0.019
orses-0.019
 peanut-0.019
Token.
Token ID13
Feature activation-4.27e-03

Pos logit contributions
ret+0.010
orse+0.010
orses+0.009
 peanut+0.008
ets+0.008
Neg logit contributions
 Lu-0.010
 angels-0.008
Ann-0.008
Lu-0.008
 firefighters-0.008
Token He
Token ID679
Feature activation-7.95e-04

Pos logit contributions
ret+0.008
orse+0.008
orses+0.007
 peanut+0.007
ets+0.007
Neg logit contributions
 Lu-0.008
 angels-0.006
Ann-0.006
Lu-0.006
 firefighters-0.006
Token felt
Token ID2936
Feature activation-2.40e-02

Pos logit contributions
ret+0.001
orse+0.001
orses+0.001
 peanut+0.001
ets+0.001
Neg logit contributions
 Lu-0.002
 angels-0.001
Ann-0.001
Lu-0.001
 firefighters-0.001
Token so
Token ID523
Feature activation-2.54e-02

Pos logit contributions
ret+0.047
orse+0.046
orses+0.044
ork+0.041
ets+0.040
Neg logit contributions
 Lu-0.042
 angels-0.031
 youths-0.031
 firefighters-0.030
 Lucy-0.030
Token happy
Token ID3772
Feature activation-8.54e-03

Pos logit contributions
ret+0.049
orse+0.048
orses+0.045
 peanut+0.044
sel+0.043
Neg logit contributions
 Lu-0.042
 angels-0.032
Lu-0.031
 youths-0.030
Ann-0.030
Token knowing
Token ID6970
Feature activation+7.06e-03

Pos logit contributions
ret+0.014
orse+0.013
orses+0.012
 peanut+0.011
ocket+0.011
Neg logit contributions
 Lu-0.018
 angels-0.015
Ann-0.014
 firefighters-0.014
Lu-0.014
Token that
Token ID326
Feature activation+1.25e-02

Pos logit contributions
 Lu+0.017
Ann+0.014
 angels+0.014
Lu+0.014
 firefighters+0.014
Neg logit contributions
orse-0.009
ret-0.008
orses-0.008
 peanut-0.007
ets-0.006
Token he
Token ID339
Feature activation+3.18e-03

Pos logit contributions
 Lu+0.028
Ann+0.024
 angels+0.024
Lu+0.024
co+0.023
Neg logit contributions
orse-0.016
ret-0.016
orses-0.015
 peanut-0.014
ocket-0.012
Token would
Token ID561
Feature activation+2.01e-02

Pos logit contributions
 Lu+0.007
 angels+0.006
Ann+0.006
 firefighters+0.006
Lu+0.006
Neg logit contributions
ret-0.004
orse-0.004
orses-0.004
 peanut-0.003
ets-0.003
Token still
Token ID991
Feature activation+2.27e-02

Pos logit contributions
 Lu+0.043
 angels+0.037
Ann+0.036
Lu+0.036
 firefighters+0.036
Neg logit contributions
orse-0.028
ret-0.028
orses-0.025
 peanut-0.023
ets-0.023
Token\n
Token ID198
Feature activation+1.36e-02

Pos logit contributions
 Lu+0.094
 angels+0.084
 firefighters+0.083
guard+0.083
ators+0.082
Neg logit contributions
orse-0.042
orses-0.038
ret-0.037
 This-0.033
 peanut-0.033
Token\n
Token ID198
Feature activation+4.09e-02

Pos logit contributions
 Lu+0.028
 angels+0.024
 firefighters+0.024
Ann+0.023
 youths+0.023
Neg logit contributions
orse-0.020
ret-0.020
orses-0.018
 peanut-0.017
ork-0.016
TokenOne
Token ID3198
Feature activation+7.94e-03

Pos logit contributions
 Lu+0.085
 angels+0.073
 youths+0.072
 helpers+0.072
 attendant+0.071
Neg logit contributions
orse-0.055
orses-0.052
ork-0.048
ret-0.048
oc-0.045
Token day
Token ID1110
Feature activation+3.68e-02

Pos logit contributions
 Lu+0.016
 angels+0.013
Ann+0.013
 firefighters+0.012
 attendant+0.012
Neg logit contributions
orse-0.013
ret-0.013
orses-0.012
 peanut-0.011
ets-0.011
Token,
Token ID11
Feature activation+5.75e-02

Pos logit contributions
 Lu+0.094
 attendant+0.084
ators+0.083
Ann+0.082
 pastor+0.081
Neg logit contributions
orse-0.030
ret-0.029
orses-0.026
 peanut-0.025
bs-0.022
Token Kitty
Token ID21168
Feature activation-1.94e-02

Pos logit contributions
 Lu+0.090
ators+0.085
iddle+0.083
 attendant+0.082
 pastor+0.081
Neg logit contributions
 peanut-0.083
orse-0.080
bs-0.079
s-0.079
orses-0.075
Token wanted
Token ID2227
Feature activation+3.75e-02

Pos logit contributions
ret+0.034
orse+0.032
orses+0.030
 peanut+0.028
ocket+0.028
Neg logit contributions
 Lu-0.037
Ann-0.029
 angels-0.029
Lu-0.029
 firefighters-0.029
Token to
Token ID284
Feature activation+4.05e-03

Pos logit contributions
 Lu+0.091
Ann+0.082
 attendant+0.079
ators+0.079
jack+0.078
Neg logit contributions
 peanut-0.041
orse-0.033
ret-0.032
 ol-0.032
 tomato-0.031
Token cook
Token ID4255
Feature activation+1.44e-02

Pos logit contributions
 Lu+0.009
 angels+0.007
Ann+0.007
Lu+0.007
 firefighters+0.007
Neg logit contributions
orse-0.006
ret-0.006
orses-0.006
 peanut-0.005
ets-0.005
Token some
Token ID617
Feature activation+1.63e-02

Pos logit contributions
 Lu+0.032
Ann+0.027
 angels+0.026
Lu+0.026
 attendant+0.026
Neg logit contributions
orse-0.019
ret-0.019
 peanut-0.017
orses-0.017
ets-0.015
Token food
Token ID2057
Feature activation+1.37e-02

Pos logit contributions
 Lu+0.034
Ann+0.029
 angels+0.028
Lu+0.028
 attendant+0.027
Neg logit contributions
orse-0.024
ret-0.023
 peanut-0.022
orses-0.022
ets-0.020
Token closet
Token ID21615
Feature activation-3.70e-02

Pos logit contributions
ret+0.061
orse+0.059
orses+0.057
ocus+0.052
sel+0.050
Neg logit contributions
 Lu-0.048
 angels-0.037
Ann-0.035
 firefighters-0.035
 youths-0.035
Token,
Token ID11
Feature activation-3.13e-02

Pos logit contributions
orse+0.071
ret+0.070
orses+0.066
 peanut+0.061
ocket+0.058
Neg logit contributions
 Lu-0.062
 angels-0.052
 attendant-0.049
Ann-0.048
 firefighters-0.048
Token and
Token ID290
Feature activation-2.51e-02

Pos logit contributions
ret+0.068
orse+0.065
orses+0.062
ork+0.059
 peanut+0.057
Neg logit contributions
 Lu-0.049
 angels-0.038
 firefighters-0.035
 attendant-0.034
Ann-0.034
Token in
Token ID287
Feature activation+9.91e-04

Pos logit contributions
ret+0.050
orse+0.050
orses+0.048
ork+0.043
ets+0.043
Neg logit contributions
 Lu-0.041
 angels-0.033
 firefighters-0.030
Ann-0.030
 Lucy-0.029
Token his
Token ID465
Feature activation-2.96e-02

Pos logit contributions
 Lu+0.002
 angels+0.002
Ann+0.001
Lu+0.001
 firefighters+0.001
Neg logit contributions
ret-0.002
orse-0.002
orses-0.002
 peanut-0.002
ets-0.001
Token toy
Token ID13373
Feature activation-1.13e-02

Pos logit contributions
ret+0.059
orse+0.059
orses+0.056
ocus+0.052
ets+0.049
Neg logit contributions
 Lu-0.049
 angels-0.038
 firefighters-0.036
Ann-0.036
 Lucy-0.036
Token box
Token ID3091
Feature activation-1.82e-02

Pos logit contributions
orse+0.021
ret+0.021
orses+0.019
 peanut+0.018
ets+0.017
Neg logit contributions
 Lu-0.020
 angels-0.017
Ann-0.016
 firefighters-0.016
Lu-0.016
Token.
Token ID13
Feature activation-2.61e-02

Pos logit contributions
orse+0.032
ret+0.031
orses+0.029
 peanut+0.028
ocket+0.026
Neg logit contributions
 Lu-0.034
 angels-0.028
Ann-0.027
 attendant-0.026
 firefighters-0.026
Token He
Token ID679
Feature activation-7.08e-03

Pos logit contributions
orse+0.061
ret+0.061
orses+0.059
ocket+0.054
 peanut+0.054
Neg logit contributions
 Lu-0.034
Ann-0.023
 youths-0.022
 angels-0.021
Lu-0.021
Token cannot
Token ID2314
Feature activation-1.43e-02

Pos logit contributions
orse+0.010
ret+0.010
orses+0.009
 peanut+0.008
ets+0.008
Neg logit contributions
 Lu-0.016
 angels-0.013
Ann-0.013
 firefighters-0.013
Lu-0.012
Token find
Token ID1064
Feature activation-1.82e-02

Pos logit contributions
ret+0.019
orse+0.018
orses+0.017
 peanut+0.016
ets+0.015
Neg logit contributions
 Lu-0.032
 angels-0.027
Ann-0.026
 firefighters-0.026
Lu-0.026
Token said
Token ID531
Feature activation-2.88e-03

Pos logit contributions
 Lu+0.016
 angels+0.014
Ann+0.013
Lu+0.013
 firefighters+0.013
Neg logit contributions
ret-0.012
orse-0.012
orses-0.011
 peanut-0.011
ets-0.010
Token,
Token ID11
Feature activation-1.26e-02

Pos logit contributions
ret+0.004
orse+0.004
orses+0.004
 peanut+0.004
ets+0.004
Neg logit contributions
 Lu-0.006
 angels-0.005
Ann-0.005
Lu-0.005
 firefighters-0.005
Token "
Token ID366
Feature activation-8.80e-03

Pos logit contributions
ret+0.024
orse+0.024
orses+0.023
 peanut+0.021
ets+0.021
Neg logit contributions
 Lu-0.021
 angels-0.017
Ann-0.016
 firefighters-0.016
Lu-0.016
TokenOh
Token ID5812
Feature activation+5.78e-03

Pos logit contributions
orse+0.018
ret+0.018
orses+0.017
 peanut+0.016
ets+0.016
Neg logit contributions
 Lu-0.014
Ann-0.011
 angels-0.011
Lu-0.011
 firefighters-0.010
Token,
Token ID11
Feature activation+5.82e-03

Pos logit contributions
 Lu+0.013
 angels+0.011
Ann+0.011
Lu+0.011
 firefighters+0.011
Neg logit contributions
ret-0.008
orse-0.008
orses-0.007
 peanut-0.006
ets-0.006
Token this
Token ID428
Feature activation-5.82e-02

Pos logit contributions
 Lu+0.012
 angels+0.010
Ann+0.010
Lu+0.010
 firefighters+0.010
Neg logit contributions
ret-0.008
orse-0.008
orses-0.007
 peanut-0.007
ets-0.007
Token looks
Token ID3073
Feature activation-6.43e-02

Pos logit contributions
orse+0.124
orses+0.120
alks+0.106
ret+0.105
ifying+0.102
Neg logit contributions
 Lu-0.078
 angel-0.064
 angels-0.063
Ann-0.063
Lu-0.062
Token like
Token ID588
Feature activation-2.24e-02

Pos logit contributions
orse+0.139
orses+0.132
ret+0.126
alks+0.120
iche+0.109
Neg logit contributions
 Lu-0.088
 angels-0.072
 Lucy-0.071
 angel-0.067
 firefighters-0.064
Token something
Token ID1223
Feature activation-4.61e-02

Pos logit contributions
orse+0.040
ret+0.040
orses+0.038
ets+0.035
ork+0.034
Neg logit contributions
 Lu-0.040
 angels-0.034
 firefighters-0.032
Lu-0.031
Ann-0.031
Token from
Token ID422
Feature activation-4.39e-02

Pos logit contributions
orse+0.081
ret+0.078
orses+0.076
alks+0.066
sel+0.066
Neg logit contributions
 Lu-0.085
 angels-0.072
 Lucy-0.068
 firefighters-0.065
Lu-0.065
Token Grand
Token ID5675
Feature activation-2.96e-02

Pos logit contributions
orse+0.087
orses+0.082
ret+0.081
ets+0.073
ork+0.070
Neg logit contributions
 Lu-0.075
 firefighters-0.060
 angels-0.059
 Lucy-0.058
 taxis-0.056
Token so
Token ID523
Feature activation-2.31e-02

Pos logit contributions
ret+0.028
orse+0.028
orses+0.025
ocket+0.024
ork+0.023
Neg logit contributions
 Lu-0.025
 angels-0.020
Ann-0.019
Lu-0.019
 firefighters-0.019
Token much
Token ID881
Feature activation+1.30e-02

Pos logit contributions
ret+0.037
orse+0.036
orses+0.033
ocket+0.032
 peanut+0.032
Neg logit contributions
 Lu-0.045
 angels-0.038
Ann-0.037
Lu-0.036
 firefighters-0.035
Token fun
Token ID1257
Feature activation-1.08e-02

Pos logit contributions
 Lu+0.025
 angels+0.020
Ann+0.020
Lu+0.020
ators+0.020
Neg logit contributions
orse-0.022
ret-0.021
orses-0.021
 peanut-0.020
ets-0.019
Token.
Token ID13
Feature activation-1.83e-02

Pos logit contributions
ret+0.019
orse+0.018
orses+0.017
 peanut+0.016
ocket+0.016
Neg logit contributions
 Lu-0.021
 angels-0.017
Ann-0.016
 firefighters-0.016
Lu-0.016
Token After
Token ID2293
Feature activation-3.11e-02

Pos logit contributions
ret+0.034
orse+0.032
orses+0.030
 peanut+0.029
ets+0.029
Neg logit contributions
 Lu-0.034
Ann-0.027
Lu-0.026
 angels-0.026
 Lucy-0.025
Token a
Token ID257
Feature activation-6.21e-02

Pos logit contributions
ret+0.066
ets+0.059
sel+0.057
orse+0.056
ocado+0.056
Neg logit contributions
 Lu-0.050
 angels-0.043
 firefighters-0.039
Ann-0.037
 Lucy-0.037
Token while
Token ID981
Feature activation-1.01e-02

Pos logit contributions
ret+0.123
ocket+0.117
alks+0.109
sel+0.109
ocus+0.108
Neg logit contributions
 Lu-0.103
Lu-0.089
 Lucy-0.080
Ann-0.080
 angels-0.079
Token,
Token ID11
Feature activation-2.41e-02

Pos logit contributions
ret+0.017
orse+0.016
orses+0.015
 peanut+0.014
ets+0.014
Neg logit contributions
 Lu-0.021
 angels-0.017
Ann-0.017
 firefighters-0.016
Lu-0.016
Token the
Token ID262
Feature activation-1.67e-02

Pos logit contributions
ret+0.042
orse+0.040
orses+0.037
ets+0.037
ocket+0.035
Neg logit contributions
 Lu-0.049
 angels-0.040
 Lucy-0.038
 firefighters-0.038
Ann-0.037
Token big
Token ID1263
Feature activation-6.11e-03

Pos logit contributions
ret+0.036
orse+0.035
orses+0.033
ets+0.031
sel+0.030
Neg logit contributions
 Lu-0.025
 angels-0.021
 firefighters-0.020
Ann-0.019
 attendant-0.019
Token purple
Token ID14032
Feature activation-2.06e-02

Pos logit contributions
ret+0.012
orse+0.012
orses+0.011
ets+0.010
 peanut+0.010
Neg logit contributions
 Lu-0.010
 angels-0.008
Ann-0.008
 firefighters-0.008
Lu-0.008
Token T
Token ID309
Feature activation-2.29e-02

Pos logit contributions
ret+0.009
orse+0.009
orses+0.008
ets+0.008
 peanut+0.008
Neg logit contributions
 Lu-0.009
 angels-0.008
Ann-0.007
 firefighters-0.007
Lu-0.007
Tokenoni
Token ID14651
Feature activation+9.08e-03

Pos logit contributions
orse+0.032
ret+0.029
orses+0.029
 peanut+0.026
ets+0.024
Neg logit contributions
 Lu-0.048
 angels-0.042
Ann-0.041
 youths-0.041
co-0.041
Token's
Token ID338
Feature activation+2.66e-03

Pos logit contributions
 Lu+0.017
 angels+0.014
Ann+0.014
Lu+0.013
 firefighters+0.013
Neg logit contributions
ret-0.016
orse-0.015
orses-0.014
 peanut-0.014
ets-0.013
Token watch
Token ID2342
Feature activation+7.21e-03

Pos logit contributions
 Lu+0.005
 angels+0.004
Ann+0.004
Lu+0.004
 firefighters+0.004
Neg logit contributions
ret-0.005
orse-0.005
orses-0.004
 peanut-0.004
ets-0.004
Token back
Token ID736
Feature activation+2.00e-02

Pos logit contributions
 Lu+0.014
 angels+0.012
Ann+0.011
Lu+0.011
 firefighters+0.011
Neg logit contributions
ret-0.012
orse-0.012
orses-0.011
 peanut-0.010
ets-0.010
Token.
Token ID13
Feature activation-7.34e-02

Pos logit contributions
 Lu+0.042
 angels+0.036
Ann+0.035
Lu+0.035
 firefighters+0.034
Neg logit contributions
ret-0.028
orse-0.028
orses-0.025
ets-0.024
 peanut-0.024
Token They
Token ID1119
Feature activation+1.95e-02

Pos logit contributions
orse+0.154
orses+0.146
ocket+0.146
ret+0.137
 peanut+0.131
Neg logit contributions
 Lu-0.118
 attendant-0.092
 Kenny-0.085
 Luis-0.084
 Taxi-0.081
Token ran
Token ID4966
Feature activation+1.60e-02

Pos logit contributions
 Lu+0.040
 angels+0.036
Ann+0.034
Lu+0.034
 youths+0.033
Neg logit contributions
ret-0.028
orse-0.028
orses-0.025
 peanut-0.024
ets-0.023
Token after
Token ID706
Feature activation-5.55e-02

Pos logit contributions
 Lu+0.033
 angels+0.029
Ann+0.028
Lu+0.027
 firefighters+0.027
Neg logit contributions
ret-0.023
orse-0.022
ets-0.020
orses-0.020
 peanut-0.019
Token the
Token ID262
Feature activation+4.95e-03

Pos logit contributions
ocado+0.113
ret+0.112
ets+0.110
orse+0.109
orses+0.106
Neg logit contributions
 Lu-0.090
 angels-0.065
 Luis-0.062
 firefighters-0.062
 youths-0.061
Token bully
Token ID27410
Feature activation-1.23e-02

Pos logit contributions
 Lu+0.009
 angels+0.007
Ann+0.007
Lu+0.007
 firefighters+0.007
Neg logit contributions
orse-0.009
ret-0.009
orses-0.008
 peanut-0.008
ets-0.008
Token bus
Token ID1323
Feature activation-2.80e-02

Pos logit contributions
ret+0.044
orse+0.043
orses+0.040
ets+0.037
ocket+0.036
Neg logit contributions
 Lu-0.041
 angels-0.034
 firefighters-0.033
 youths-0.032
Lu-0.032
Token stop
Token ID2245
Feature activation-2.19e-02

Pos logit contributions
ret+0.052
orse+0.051
orses+0.048
 peanut+0.046
ocket+0.044
Neg logit contributions
 Lu-0.048
 angels-0.039
 attendant-0.038
Ann-0.038
 Taxi-0.036
Token.
Token ID13
Feature activation+3.86e-02

Pos logit contributions
ret+0.040
orse+0.039
orses+0.036
 peanut+0.034
ocket+0.033
Neg logit contributions
 Lu-0.040
 angels-0.033
 firefighters-0.031
 youths-0.030
Ann-0.030
Token He
Token ID679
Feature activation-7.50e-03

Pos logit contributions
 Lu+0.081
 angels+0.077
Ann+0.073
co+0.072
 firefighters+0.071
Neg logit contributions
orse-0.051
orses-0.045
ret-0.045
 peanut-0.042
 This-0.042
Token sang
Token ID25889
Feature activation-8.65e-03

Pos logit contributions
ret+0.012
orse+0.012
orses+0.011
 peanut+0.011
ets+0.010
Neg logit contributions
 Lu-0.015
 angels-0.012
Ann-0.012
Lu-0.011
 firefighters-0.011
Token a
Token ID257
Feature activation-5.89e-02

Pos logit contributions
ret+0.016
orse+0.016
orses+0.014
 peanut+0.014
ets+0.013
Neg logit contributions
 Lu-0.016
 angels-0.013
Ann-0.012
 firefighters-0.012
Lu-0.012
Token song
Token ID3496
Feature activation+3.81e-04

Pos logit contributions
ret+0.127
orse+0.118
ocket+0.113
orses+0.112
alks+0.112
Neg logit contributions
 Lu-0.079
 angels-0.067
 attendant-0.058
Lu-0.057
Ann-0.056
Token,
Token ID11
Feature activation-1.79e-02

Pos logit contributions
 Lu+0.001
 angels+0.001
Ann+0.001
Lu+0.001
 firefighters+0.001
Neg logit contributions
ret-0.001
orse-0.001
orses-0.001
 peanut-0.001
ets-0.001
Token danced
Token ID39480
Feature activation-1.89e-03

Pos logit contributions
ret+0.033
orse+0.031
orses+0.029
ork+0.028
ocket+0.027
Neg logit contributions
 Lu-0.033
 angels-0.027
Ann-0.026
 firefighters-0.025
Lu-0.025
Token around
Token ID1088
Feature activation-6.67e-03

Pos logit contributions
ret+0.003
orse+0.003
orses+0.003
 peanut+0.003
ets+0.003
Neg logit contributions
 Lu-0.004
 angels-0.003
Ann-0.003
Lu-0.003
 firefighters-0.003
Token,
Token ID11
Feature activation-3.09e-02

Pos logit contributions
ret+0.012
orse+0.012
orses+0.011
 peanut+0.010
ets+0.010
Neg logit contributions
 Lu-0.012
 angels-0.010
Ann-0.010
 firefighters-0.010
Lu-0.010
Token as
Token ID355
Feature activation+2.19e-02

Pos logit contributions
 Lu+0.022
Ann+0.019
 angels+0.018
Lu+0.018
 firefighters+0.018
Neg logit contributions
orse-0.014
ret-0.014
orses-0.013
ets-0.012
 peanut-0.012
Token the
Token ID262
Feature activation-1.71e-02

Pos logit contributions
 Lu+0.054
Ann+0.047
co+0.046
 Taxi+0.046
Lu+0.046
Neg logit contributions
ret-0.022
orse-0.021
orses-0.019
 peanut-0.017
ork-0.016
Token wind
Token ID2344
Feature activation+2.99e-02

Pos logit contributions
orse+0.035
ret+0.035
orses+0.033
ets+0.030
ocus+0.029
Neg logit contributions
 Lu-0.027
 angels-0.023
 firefighters-0.022
 attendant-0.021
Ann-0.020
Token blew
Token ID17948
Feature activation+7.16e-03

Pos logit contributions
 Lu+0.065
 angels+0.054
Ann+0.054
Lu+0.053
 youths+0.053
Neg logit contributions
orse-0.041
ret-0.040
orses-0.037
ets-0.035
 peanut-0.034
Token through
Token ID832
Feature activation-9.57e-03

Pos logit contributions
 Lu+0.014
 angels+0.012
Ann+0.012
Lu+0.011
 firefighters+0.011
Neg logit contributions
ret-0.011
orse-0.011
orses-0.010
 peanut-0.010
ets-0.010
Token their
Token ID511
Feature activation-5.71e-02

Pos logit contributions
orse+0.018
ret+0.017
orses+0.016
 peanut+0.015
ets+0.015
Neg logit contributions
 Lu-0.017
 angels-0.014
Ann-0.013
 firefighters-0.013
Lu-0.013
Token hair
Token ID4190
Feature activation-2.40e-02

Pos logit contributions
ret+0.097
orse+0.097
orses+0.085
sel+0.083
ocket+0.079
Neg logit contributions
 Lu-0.095
 angels-0.092
 firefighters-0.087
 taxis-0.084
co-0.079
Token.
Token ID13
Feature activation+1.73e-03

Pos logit contributions
orse+0.046
ret+0.046
orses+0.043
 peanut+0.040
ocket+0.039
Neg logit contributions
 Lu-0.041
 angels-0.033
Ann-0.031
co-0.031
Lu-0.030
Token The
Token ID383
Feature activation-1.05e-02

Pos logit contributions
 Lu+0.004
 angels+0.003
Ann+0.003
 firefighters+0.003
Lu+0.003
Neg logit contributions
ret-0.002
orse-0.002
orses-0.002
 peanut-0.002
ets-0.002
Token girl
Token ID2576
Feature activation-1.21e-03

Pos logit contributions
ret+0.023
orse+0.023
orses+0.021
ets+0.020
 peanut+0.020
Neg logit contributions
 Lu-0.015
 angels-0.012
 firefighters-0.012
Ann-0.011
 attendant-0.011
Token noticed
Token ID6810
Feature activation+1.05e-02

Pos logit contributions
ret+0.002
orse+0.002
orses+0.002
 peanut+0.002
ets+0.002
Neg logit contributions
 Lu-0.002
 angels-0.002
Ann-0.002
Lu-0.002
 firefighters-0.002