TOP ACTIVATIONS
MAX = 0.114

. It looked even y umm ier . Lily wanted to
reward . It was ch ocol ates and sweets !
- it was bigger and ju ic ier than any he
the tree to its once former beauty . He was happy
was a box of ch ocol ates . Lily loved ch
some of the biggest , ju ic iest berries in the
. It was even y umm ier this time ! The
a big box of ch ocol ates . It was so
ier ? You are y umm ier . You are my
. It was even y umm ier than the first card
to be restored to its former glory ."
he found the biggest , ju ic iest squash he had
oll ip ops , ch ocol ates , g ummies ,
She found the biggest and ju ic iest one , but
could see the biggest , ju ic iest cher ries ever
was the sweet est and ju ic iest strawberry she had
, cup cakes and ch ocol ates .
was the sweet est and ju ic iest cherry he had
you know what is y umm ier ? You are y
the garden to pick the ju ic iest and sweet est

MIN ACTIVATIONS
MIN = -0.092

against all of the fastest runners in the world ! Even
friends would always join the kids to play and have fun
strategies to beat the other players . After
them fight ! So the players got ready and hopped into
little girl and the other players attached the pieces of wood
it asked each of the players to take a nap .
So the two girls fell on the ground and
friends . You are good runners . Do you want a
ran to join the big kids . They put on helmets
be cow boys and cow girls . They put on hats
flew down and joined the kids . Tim , Sue ,
you ." So together , Daddy and Tim went
been able to watch the players bounce the purple around the
the basketball as the top players . The little
what it says !" So together the mom and the girl
and carrots and played fun games . Little Billy talked to
to share their cars and play together . They make zoom
said goodbye and introduced the students to their new classroom .
their wool en creations and play with them some more .
So she asked the other players if they wanted to play

INTERVAL 0.062 - 0.114
CONTAINS 0.046%

. He saw a poor vegetable vendor near a bench and
. It was sour and mush y , but he did
, he found a big mel on in his garden .
that had landed on his tum my . It opened its
pond and everyone watched in amaz ement as her thread gl

INTERVAL 0.011 - 0.062
CONTAINS 27.458%

to eat pr unes . She liked how they were sweet
sad and didn 't know what to do . But then
the numbers . It was so much fun !
ladder . He was about to slide down when he felt
held the board high in the air and shouted ' one

INTERVAL -0.041 - 0.011
CONTAINS 69.552%

" Thank you , Tim my . I forgive you .
law says you must be honest and kind .
scared . He stayed by the snake and watched it .
like a world where people have loyal friends ."
It was sweet and crunch y . Tim smiled

INTERVAL -0.092 - -0.041
CONTAINS 2.943%

and had lots of fun together . After a while ,
Jake and Jane were best friends . Every day when school
so happy that they kept playing with the balls for hours
with you , Tim !" They played on the swings and
Bob and his new friend continued to play together , and

Token.
Token ID13
Feature activation+1.00e-03

Pos logit contributions
Their+0.012
pered+0.011
uits+0.010
Her+0.010
ics+0.010
Neg logit contributions
ocol-0.023
 accidents-0.023
 beast-0.021
oc-0.021
 reality-0.021
Token It
Token ID632
Feature activation+2.79e-02

Pos logit contributions
ocol+0.002
 accidents+0.002
 beast+0.002
 reality+0.002
oc+0.002
Neg logit contributions
Their-0.001
pered-0.001
Her-0.001
uits-0.001
ics-0.001
Token looked
Token ID3114
Feature activation+3.65e-02

Pos logit contributions
ocol+0.068
 reality+0.061
 accidents+0.060
umm+0.058
 beast+0.058
Neg logit contributions
Their-0.032
 run-0.031
 ran-0.031
 low-0.029
Her-0.029
Token even
Token ID772
Feature activation+6.83e-02

Pos logit contributions
ocol+0.093
 beast+0.092
 accidents+0.088
umm+0.088
 reality+0.088
Neg logit contributions
Their-0.030
 low-0.029
pered-0.026
 off-0.026
 fast-0.026
Token y
Token ID331
Feature activation+5.54e-02

Pos logit contributions
ocol+0.155
 beast+0.154
oc+0.149
umm+0.142
icer+0.141
Neg logit contributions
 low-0.076
Their-0.076
 over-0.075
 off-0.074
 fast-0.070
Tokenumm
Token ID13929
Feature activation+1.14e-01

Pos logit contributions
ocol+0.059
 beast+0.058
 accidents+0.058
 reality+0.056
oc+0.053
Neg logit contributions
Their-0.125
pered-0.122
uits-0.118
ics-0.117
Her-0.117
Tokenier
Token ID959
Feature activation+1.16e-02

Pos logit contributions
 beast+0.193
 accidents+0.192
 reality+0.191
ocol+0.191
oc+0.184
Neg logit contributions
pered-0.172
Their-0.172
ics-0.169
suits-0.165
uits-0.164
Token.
Token ID13
Feature activation+1.54e-03

Pos logit contributions
ocol+0.024
 accidents+0.023
 beast+0.022
 reality+0.022
oc+0.022
Neg logit contributions
Their-0.016
pered-0.015
uits-0.014
ics-0.014
Her-0.014
Token Lily
Token ID20037
Feature activation-1.18e-02

Pos logit contributions
ocol+0.004
 accidents+0.004
 beast+0.003
 reality+0.003
oc+0.003
Neg logit contributions
Their-0.002
pered-0.002
Her-0.002
uits-0.001
ics-0.001
Token wanted
Token ID2227
Feature activation-8.82e-03

Pos logit contributions
Their+0.019
pered+0.018
uits+0.017
ics+0.017
Her+0.017
Neg logit contributions
ocol-0.022
 accidents-0.021
 beast-0.020
 reality-0.020
oc-0.019
Token to
Token ID284
Feature activation+1.77e-02

Pos logit contributions
Their+0.005
Her+0.004
 off+0.004
ics+0.003
 run+0.003
Neg logit contributions
ocol-0.026
oc-0.025
 accidents-0.024
 beast-0.024
 reality-0.024
Token reward
Token ID6721
Feature activation+1.36e-02

Pos logit contributions
ocol+0.035
 accidents+0.034
 reality+0.031
 beast+0.031
 cavity+0.031
Neg logit contributions
Their-0.024
pered-0.023
 run-0.023
 swing-0.023
uits-0.022
Token.
Token ID13
Feature activation-1.06e-02

Pos logit contributions
ocol+0.030
 accidents+0.030
 beast+0.028
 reality+0.028
oc+0.028
Neg logit contributions
Their-0.016
pered-0.015
uits-0.014
ics-0.014
Her-0.014
Token It
Token ID632
Feature activation+2.45e-02

Pos logit contributions
Their+0.014
pered+0.013
uits+0.013
ics+0.012
Her+0.012
Neg logit contributions
ocol-0.022
 accidents-0.022
 beast-0.021
 reality-0.020
oc-0.020
Token was
Token ID373
Feature activation-3.37e-03

Pos logit contributions
ocol+0.047
 accidents+0.044
 beast+0.041
 reality+0.041
umm+0.040
Neg logit contributions
Their-0.038
pered-0.036
ics-0.035
 run-0.035
uits-0.034
Token ch
Token ID442
Feature activation+9.94e-03

Pos logit contributions
Their+0.004
pered+0.004
uits+0.003
ics+0.003
Her+0.003
Neg logit contributions
ocol-0.008
 accidents-0.008
 beast-0.007
oc-0.007
 reality-0.007
Tokenocol
Token ID4668
Feature activation+1.14e-01

Pos logit contributions
 accidents+0.020
 beast+0.019
ocol+0.019
 reality+0.019
oc+0.018
Neg logit contributions
pered-0.014
Their-0.014
uits-0.012
amed-0.012
 aside-0.012
Tokenates
Token ID689
Feature activation+8.83e-03

Pos logit contributions
 accidents+0.227
ocol+0.217
 reality+0.213
 beast+0.212
oc+0.205
Neg logit contributions
Their-0.160
pered-0.156
ics-0.150
uits-0.144
amed-0.143
Token and
Token ID290
Feature activation+2.10e-02

Pos logit contributions
ocol+0.016
 accidents+0.016
 beast+0.015
 reality+0.015
oc+0.015
Neg logit contributions
Their-0.014
pered-0.013
uits-0.012
Her-0.012
ics-0.012
Token sweets
Token ID42402
Feature activation+2.50e-02

Pos logit contributions
ocol+0.045
 accidents+0.043
 beast+0.041
oc+0.041
 reality+0.040
Neg logit contributions
Their-0.027
pered-0.025
ics-0.025
uits-0.024
 run-0.024
Token!
Token ID0
Feature activation+6.51e-03

Pos logit contributions
ocol+0.049
 accidents+0.049
 beast+0.046
 reality+0.046
oc+0.045
Neg logit contributions
Their-0.036
pered-0.034
uits-0.032
ics-0.031
Her-0.031
Token\n
Token ID198
Feature activation+6.03e-03

Pos logit contributions
ocol+0.014
 accidents+0.013
 beast+0.013
 reality+0.012
oc+0.012
Neg logit contributions
Their-0.009
pered-0.008
uits-0.008
ics-0.008
Her-0.008
Token -
Token ID532
Feature activation+8.28e-03

Pos logit contributions
ocol+0.125
 accidents+0.122
 beast+0.116
 reality+0.116
oc+0.114
Neg logit contributions
Their-0.053
pered-0.049
ics-0.046
uits-0.046
Her-0.045
Token it
Token ID340
Feature activation+5.53e-02

Pos logit contributions
ocol+0.023
umm+0.021
icer+0.021
 reality+0.020
 accidents+0.020
Neg logit contributions
 run-0.006
 fast-0.006
 low-0.006
Their-0.006
 off-0.005
Token was
Token ID373
Feature activation+4.28e-03

Pos logit contributions
ocol+0.119
 accidents+0.109
 reality+0.105
umm+0.104
 beast+0.102
Neg logit contributions
Their-0.074
 run-0.072
 low-0.071
 over-0.070
ics-0.070
Token bigger
Token ID5749
Feature activation+5.25e-03

Pos logit contributions
ocol+0.010
 accidents+0.010
 beast+0.010
oc+0.009
 reality+0.009
Neg logit contributions
Their-0.005
pered-0.004
uits-0.004
ics-0.004
Her-0.004
Token and
Token ID290
Feature activation+3.64e-02

Pos logit contributions
ocol+0.014
 accidents+0.013
 reality+0.013
icer+0.012
 beast+0.012
Neg logit contributions
pered-0.004
suits-0.004
 run-0.004
 fast-0.004
 swing-0.004
Token ju
Token ID7544
Feature activation+1.12e-01

Pos logit contributions
ocol+0.089
 beast+0.081
oc+0.080
icer+0.078
 accidents+0.078
Neg logit contributions
Their-0.039
 low-0.038
 fast-0.036
 over-0.035
 off-0.035
Tokenic
Token ID291
Feature activation+3.36e-02

Pos logit contributions
ocol+0.173
 accidents+0.170
 beast+0.160
 reality+0.155
oc+0.151
Neg logit contributions
Their-0.212
ics-0.202
pered-0.197
Her-0.194
uits-0.192
Tokenier
Token ID959
Feature activation+2.07e-02

Pos logit contributions
 accidents+0.055
ocol+0.054
 reality+0.050
 beast+0.050
 cavity+0.049
Neg logit contributions
Their-0.059
pered-0.057
ics-0.055
uits-0.054
Her-0.053
Token than
Token ID621
Feature activation+7.34e-04

Pos logit contributions
ocol+0.049
 accidents+0.047
 beast+0.045
 cavity+0.044
 reality+0.044
Neg logit contributions
pered-0.020
Their-0.020
 off-0.019
 run-0.018
ics-0.018
Token any
Token ID597
Feature activation-1.42e-03

Pos logit contributions
oc+0.001
ocol+0.001
umm+0.001
 accidents+0.001
 beast+0.001
Neg logit contributions
 low-0.001
 fast-0.001
 off-0.001
 their-0.001
Their-0.001
Token he
Token ID339
Feature activation+1.63e-02

Pos logit contributions
 run+0.001
 low+0.001
 fast+0.001
 first+0.001
 second+0.001
Neg logit contributions
ocol-0.003
 accidents-0.003
 reality-0.003
icer-0.003
umm-0.003
Token the
Token ID262
Feature activation+2.51e-03

Pos logit contributions
ocol+0.044
 accidents+0.042
 cavity+0.042
 beast+0.041
umm+0.040
Neg logit contributions
Their-0.019
 off-0.017
uits-0.017
 their-0.017
ics-0.016
Token tree
Token ID5509
Feature activation+1.46e-02

Pos logit contributions
ocol+0.006
 accidents+0.005
 reality+0.005
oc+0.005
icer+0.005
Neg logit contributions
 swing-0.003
 run-0.003
 leash-0.003
 low-0.003
 fast-0.003
Token to
Token ID284
Feature activation+5.34e-03

Pos logit contributions
ocol+0.033
 accidents+0.033
 beast+0.031
 reality+0.031
oc+0.031
Neg logit contributions
Their-0.016
pered-0.015
uits-0.014
Her-0.014
ics-0.014
Token its
Token ID663
Feature activation+1.23e-02

Pos logit contributions
ocol+0.012
 accidents+0.012
 beast+0.011
umm+0.011
 cavity+0.011
Neg logit contributions
Their-0.006
 run-0.005
pered-0.005
uits-0.005
ics-0.005
Token once
Token ID1752
Feature activation+1.14e-02

Pos logit contributions
ocol+0.027
 accidents+0.025
 beast+0.023
oc+0.023
icer+0.023
Neg logit contributions
Their-0.015
pered-0.015
 run-0.014
 swing-0.014
ics-0.014
Token former
Token ID1966
Feature activation+1.11e-01

Pos logit contributions
ocol+0.026
 accidents+0.024
 beast+0.024
icer+0.024
 cavity+0.023
Neg logit contributions
Their-0.013
 low-0.012
 aside-0.012
pered-0.012
ics-0.012
Token beauty
Token ID8737
Feature activation+1.57e-02

Pos logit contributions
 accidents+0.139
 beast+0.135
ocol+0.130
 cavity+0.122
icer+0.120
Neg logit contributions
pered-0.221
Their-0.217
uits-0.217
ics-0.210
 off-0.203
Token.
Token ID13
Feature activation-7.82e-04

Pos logit contributions
ocol+0.037
 accidents+0.036
 beast+0.034
 cavity+0.034
 reality+0.034
Neg logit contributions
Their-0.016
pered-0.015
uits-0.015
ics-0.015
 contain-0.014
Token He
Token ID679
Feature activation+6.26e-03

Pos logit contributions
Their+0.001
pered+0.001
uits+0.001
Her+0.001
ics+0.001
Neg logit contributions
 accidents-0.002
ocol-0.002
 beast-0.002
 cavity-0.002
 reality-0.002
Token was
Token ID373
Feature activation-1.33e-02

Pos logit contributions
ocol+0.013
 accidents+0.013
umm+0.012
 beast+0.012
icer+0.012
Neg logit contributions
Their-0.008
pered-0.008
 run-0.008
ics-0.008
amed-0.008
Token happy
Token ID3772
Feature activation-1.38e-02

Pos logit contributions
Their+0.014
pered+0.012
uits+0.012
ics+0.012
suits+0.012
Neg logit contributions
ocol-0.031
 accidents-0.031
 reality-0.030
 beast-0.030
oc-0.029
Token was
Token ID373
Feature activation+5.79e-03

Pos logit contributions
ocol+0.065
 accidents+0.062
 beast+0.057
 reality+0.057
umm+0.056
Neg logit contributions
Their-0.053
pered-0.049
uits-0.048
ics-0.048
 run-0.048
Token a
Token ID257
Feature activation-9.70e-03

Pos logit contributions
ocol+0.013
 accidents+0.013
oc+0.013
 beast+0.013
 reality+0.012
Neg logit contributions
Their-0.006
pered-0.006
uits-0.006
ics-0.005
 run-0.005
Token box
Token ID3091
Feature activation+3.54e-02

Pos logit contributions
Their+0.014
pered+0.012
uits+0.012
 run+0.012
 swing+0.012
Neg logit contributions
ocol-0.020
 accidents-0.019
 beast-0.018
 reality-0.017
oc-0.017
Token of
Token ID286
Feature activation+3.81e-03

Pos logit contributions
ocol+0.073
 accidents+0.073
 beast+0.070
oc+0.070
 reality+0.069
Neg logit contributions
Their-0.047
pered-0.043
Her-0.041
uits-0.040
ics-0.040
Token ch
Token ID442
Feature activation+2.55e-03

Pos logit contributions
 beast+0.008
oc+0.008
umm+0.008
ocol+0.008
 accidents+0.008
Neg logit contributions
Their-0.005
 run-0.004
 their-0.004
 swing-0.004
 fast-0.004
Tokenocol
Token ID4668
Feature activation+1.08e-01

Pos logit contributions
 accidents+0.002
 reality+0.002
 beast+0.002
 cavity+0.001
oc+0.001
Neg logit contributions
pered-0.007
Their-0.007
suits-0.006
uits-0.006
 aside-0.006
Tokenates
Token ID689
Feature activation+7.49e-03

Pos logit contributions
ocol+0.199
 accidents+0.196
 beast+0.184
 reality+0.182
oc+0.179
Neg logit contributions
Their-0.170
pered-0.160
uits-0.155
ics-0.154
Her-0.152
Token.
Token ID13
Feature activation+5.93e-03

Pos logit contributions
ocol+0.015
 accidents+0.015
 beast+0.014
 reality+0.014
oc+0.014
Neg logit contributions
Their-0.010
pered-0.010
uits-0.009
Her-0.009
ics-0.009
Token Lily
Token ID20037
Feature activation-4.84e-03

Pos logit contributions
ocol+0.012
 accidents+0.012
 beast+0.011
 reality+0.011
oc+0.011
Neg logit contributions
Their-0.009
pered-0.008
uits-0.008
Her-0.008
 low-0.007
Token loved
Token ID6151
Feature activation-8.25e-03

Pos logit contributions
Their+0.006
pered+0.006
uits+0.006
ics+0.006
Her+0.006
Neg logit contributions
ocol-0.010
 accidents-0.010
 beast-0.009
 reality-0.009
oc-0.009
Token ch
Token ID442
Feature activation-3.43e-03

Pos logit contributions
Their+0.011
pered+0.010
ics+0.010
uits+0.010
 run+0.010
Neg logit contributions
ocol-0.017
 accidents-0.016
oc-0.016
 beast-0.016
 reality-0.016
Token some
Token ID617
Feature activation+1.72e-03

Pos logit contributions
ocol+0.107
umm+0.104
 accidents+0.103
 reality+0.103
oc+0.100
Neg logit contributions
 off-0.037
Their-0.035
 run-0.034
 back-0.033
Her-0.033
Token of
Token ID286
Feature activation+5.81e-03

Pos logit contributions
ocol+0.003
 reality+0.003
 accidents+0.003
 beast+0.003
oc+0.003
Neg logit contributions
 run-0.003
Their-0.003
 leash-0.003
pered-0.002
Her-0.002
Token the
Token ID262
Feature activation+6.90e-03

Pos logit contributions
ocol+0.014
 accidents+0.014
oc+0.014
 beast+0.014
 reality+0.014
Neg logit contributions
Their-0.005
Her-0.004
 run-0.004
pered-0.004
ics-0.004
Token biggest
Token ID4094
Feature activation+1.49e-02

Pos logit contributions
ocol+0.015
 accidents+0.014
 reality+0.014
 beast+0.013
oc+0.013
Neg logit contributions
Their-0.009
 run-0.008
pered-0.008
Her-0.008
ics-0.008
Token,
Token ID11
Feature activation+1.16e-02

Pos logit contributions
ocol+0.031
 reality+0.030
 accidents+0.030
 beast+0.029
icer+0.028
Neg logit contributions
 run-0.019
pered-0.018
Their-0.018
 chase-0.018
 swing-0.018
Token ju
Token ID7544
Feature activation+1.07e-01

Pos logit contributions
ocol+0.025
 accidents+0.024
 reality+0.023
 beast+0.023
oc+0.022
Neg logit contributions
Their-0.014
pered-0.014
 run-0.013
uits-0.013
Her-0.013
Tokenic
Token ID291
Feature activation+1.68e-02

Pos logit contributions
ocol+0.160
 accidents+0.155
 beast+0.152
 reality+0.150
 cavity+0.136
Neg logit contributions
ics-0.204
Their-0.203
Her-0.187
amed-0.181
pered-0.176
Tokeniest
Token ID6386
Feature activation+1.26e-02

Pos logit contributions
ocol+0.019
 accidents+0.018
 beast+0.016
 reality+0.016
oc+0.016
Neg logit contributions
Their-0.038
pered-0.037
ics-0.036
uits-0.036
Her-0.036
Token berries
Token ID36322
Feature activation+2.89e-02

Pos logit contributions
ocol+0.023
 reality+0.023
 beast+0.023
 accidents+0.023
 cavity+0.022
Neg logit contributions
 run-0.018
 swing-0.017
 chase-0.017
 leash-0.017
 off-0.016
Token in
Token ID287
Feature activation-2.32e-02

Pos logit contributions
 accidents+0.056
ocol+0.056
oc+0.050
 pounds+0.049
 beast+0.049
Neg logit contributions
Their-0.043
pered-0.039
uits-0.038
Her-0.036
amed-0.036
Token the
Token ID262
Feature activation-1.05e-03

Pos logit contributions
Their+0.027
pered+0.024
ics+0.024
Her+0.024
uits+0.023
Neg logit contributions
ocol-0.052
 accidents-0.051
 beast-0.049
oc-0.048
 reality-0.048
Token.
Token ID13
Feature activation+4.24e-03

Pos logit contributions
ocol+0.048
 accidents+0.047
 beast+0.045
 reality+0.045
oc+0.044
Neg logit contributions
Their-0.025
pered-0.023
uits-0.022
ics-0.022
Her-0.021
Token It
Token ID632
Feature activation+3.18e-02

Pos logit contributions
ocol+0.011
 accidents+0.010
 reality+0.010
 beast+0.010
 pounds+0.009
Neg logit contributions
Their-0.004
pered-0.004
 low-0.004
Her-0.004
uits-0.004
Token was
Token ID373
Feature activation+2.04e-03

Pos logit contributions
ocol+0.077
 accidents+0.068
 reality+0.067
umm+0.067
icer+0.066
Neg logit contributions
 run-0.038
 ran-0.037
Their-0.037
 low-0.036
 over-0.035
Token even
Token ID772
Feature activation+5.73e-02

Pos logit contributions
ocol+0.005
oc+0.005
 accidents+0.005
 beast+0.005
 reality+0.004
Neg logit contributions
Their-0.002
pered-0.002
uits-0.002
 run-0.002
 low-0.002
Token y
Token ID331
Feature activation+5.21e-02

Pos logit contributions
ocol+0.129
 beast+0.120
umm+0.119
oc+0.117
icer+0.117
Neg logit contributions
Their-0.068
 low-0.065
 fast-0.063
 off-0.061
 aside-0.061
Tokenumm
Token ID13929
Feature activation+1.07e-01

Pos logit contributions
 beast+0.058
ocol+0.058
 reality+0.057
 accidents+0.057
oc+0.053
Neg logit contributions
Their-0.114
pered-0.111
Her-0.107
ics-0.106
uits-0.106
Tokenier
Token ID959
Feature activation+1.21e-02

Pos logit contributions
 accidents+0.180
ocol+0.177
 beast+0.177
 reality+0.176
oc+0.172
Neg logit contributions
ics-0.160
pered-0.160
Their-0.158
coat-0.154
uits-0.153
Token this
Token ID428
Feature activation+9.88e-03

Pos logit contributions
ocol+0.027
 accidents+0.025
 beast+0.024
 reality+0.024
 cavity+0.024
Neg logit contributions
Their-0.014
pered-0.014
ics-0.013
uits-0.013
 run-0.012
Token time
Token ID640
Feature activation-1.27e-02

Pos logit contributions
ocol+0.023
 accidents+0.021
umm+0.020
 reality+0.020
icer+0.019
Neg logit contributions
 run-0.010
 first-0.010
 over-0.010
 fast-0.010
 swing-0.010
Token!
Token ID0
Feature activation+6.39e-03

Pos logit contributions
Their+0.014
pered+0.013
uits+0.013
ics+0.012
 run+0.012
Neg logit contributions
ocol-0.029
 accidents-0.028
 beast-0.027
oc-0.027
 cavity-0.027
Token The
Token ID383
Feature activation-1.03e-02

Pos logit contributions
ocol+0.017
 accidents+0.017
 reality+0.016
 beast+0.016
 pounds+0.015
Neg logit contributions
Their-0.006
pered-0.005
Her-0.005
uits-0.004
 low-0.004
Token a
Token ID257
Feature activation-7.08e-03

Pos logit contributions
ocol+0.122
umm+0.114
 accidents+0.111
oc+0.111
icer+0.109
Neg logit contributions
Their-0.053
uits-0.047
ics-0.044
 off-0.044
 their-0.044
Token big
Token ID1263
Feature activation+5.03e-03

Pos logit contributions
Their+0.008
 swing+0.008
 run+0.007
 low+0.007
 leash+0.007
Neg logit contributions
ocol-0.017
 accidents-0.016
umm-0.015
icer-0.015
oc-0.014
Token box
Token ID3091
Feature activation+2.94e-02

Pos logit contributions
ocol+0.011
 accidents+0.010
 reality+0.010
oc+0.010
umm+0.010
Neg logit contributions
 swing-0.006
 run-0.006
 leash-0.005
 chase-0.005
Their-0.005
Token of
Token ID286
Feature activation-4.39e-03

Pos logit contributions
 accidents+0.057
ocol+0.057
 beast+0.054
 reality+0.053
 ig+0.053
Neg logit contributions
Their-0.042
pered-0.039
Her-0.037
 dads-0.037
uits-0.036
Token ch
Token ID442
Feature activation+4.49e-03

Pos logit contributions
 run+0.005
Their+0.004
 swing+0.004
 their+0.004
 fast+0.004
Neg logit contributions
umm-0.010
 beast-0.010
oc-0.010
ocol-0.009
icer-0.009
Tokenocol
Token ID4668
Feature activation+1.06e-01

Pos logit contributions
 accidents+0.004
 reality+0.004
 beast+0.004
oc+0.004
 cavity+0.004
Neg logit contributions
pered-0.011
Their-0.011
uits-0.010
suits-0.010
 aside-0.010
Tokenates
Token ID689
Feature activation+9.04e-03

Pos logit contributions
ocol+0.192
 accidents+0.185
 beast+0.174
 reality+0.171
oc+0.171
Neg logit contributions
Their-0.171
pered-0.160
uits-0.157
ics-0.154
Her-0.154
Token.
Token ID13
Feature activation+1.37e-03

Pos logit contributions
ocol+0.019
 accidents+0.019
 beast+0.018
 reality+0.017
oc+0.017
Neg logit contributions
Their-0.012
pered-0.011
uits-0.010
Her-0.010
ics-0.010
Token It
Token ID632
Feature activation+3.57e-02

Pos logit contributions
ocol+0.003
 accidents+0.003
 reality+0.003
 beast+0.003
oc+0.003
Neg logit contributions
Their-0.002
pered-0.002
uits-0.002
Her-0.002
 run-0.002
Token was
Token ID373
Feature activation-2.06e-06

Pos logit contributions
ocol+0.074
 accidents+0.070
 reality+0.066
 beast+0.064
umm+0.064
Neg logit contributions
Their-0.051
 run-0.048
ics-0.047
Her-0.046
pered-0.046
Token so
Token ID523
Feature activation+3.36e-02

Feature not active on previous token; no predictions were affected.
Tokenier
Token ID959
Feature activation+1.27e-02

Pos logit contributions
 accidents+0.163
ocol+0.162
 beast+0.158
 reality+0.157
oc+0.151
Neg logit contributions
Their-0.176
ics-0.170
pered-0.169
uits-0.162
suits-0.161
Token?
Token ID30
Feature activation+1.50e-02

Pos logit contributions
ocol+0.028
 accidents+0.027
 beast+0.026
 reality+0.026
oc+0.025
Neg logit contributions
Their-0.016
pered-0.014
ics-0.014
uits-0.014
Her-0.013
Token You
Token ID921
Feature activation-1.22e-02

Pos logit contributions
 accidents+0.030
 reality+0.030
ocol+0.029
 beast+0.029
 ig+0.027
Neg logit contributions
Their-0.020
pered-0.020
Her-0.019
ends-0.017
 run-0.017
Token are
Token ID389
Feature activation-1.21e-02

Pos logit contributions
 run+0.009
Their+0.009
 contain+0.008
 follow+0.008
 argue+0.008
Neg logit contributions
 accidents-0.032
ocol-0.032
umm-0.031
 beast-0.030
 reality-0.030
Token y
Token ID331
Feature activation+4.78e-02

Pos logit contributions
Their+0.012
pered+0.011
ics+0.010
uits+0.010
Her+0.010
Neg logit contributions
ocol-0.030
 accidents-0.029
 beast-0.028
oc-0.028
 reality-0.028
Tokenumm
Token ID13929
Feature activation+1.06e-01

Pos logit contributions
 accidents+0.053
 beast+0.052
 reality+0.051
ocol+0.051
oc+0.048
Neg logit contributions
Their-0.104
pered-0.103
ics-0.101
uits-0.099
Her-0.098
Tokenier
Token ID959
Feature activation+1.55e-02

Pos logit contributions
 accidents+0.170
ocol+0.168
 beast+0.165
 reality+0.164
oc+0.159
Neg logit contributions
Their-0.173
pered-0.173
ics-0.169
suits-0.161
uits-0.160
Token.
Token ID13
Feature activation-2.78e-03

Pos logit contributions
ocol+0.034
 accidents+0.033
 beast+0.031
 reality+0.031
oc+0.030
Neg logit contributions
Their-0.019
pered-0.018
uits-0.017
ics-0.017
Her-0.017
Token You
Token ID921
Feature activation-1.53e-02

Pos logit contributions
Their+0.004
pered+0.004
Her+0.004
uits+0.003
 run+0.003
Neg logit contributions
 accidents-0.005
 reality-0.005
ocol-0.005
 beast-0.005
 ig-0.005
Token are
Token ID389
Feature activation-1.70e-02

Pos logit contributions
Their+0.014
 run+0.014
 contain+0.013
ics+0.013
pered+0.012
Neg logit contributions
ocol-0.038
 accidents-0.038
umm-0.035
 beast-0.035
 reality-0.035
Token my
Token ID616
Feature activation+4.71e-03

Pos logit contributions
Their+0.016
pered+0.015
ics+0.014
uits+0.014
Her+0.013
Neg logit contributions
ocol-0.042
 accidents-0.041
 beast-0.039
oc-0.039
 reality-0.039
Token.
Token ID13
Feature activation-6.46e-03

Pos logit contributions
 accidents+0.019
ocol+0.019
 reality+0.018
 beast+0.018
oc+0.017
Neg logit contributions
Their-0.012
pered-0.011
ics-0.011
uits-0.011
Her-0.010
Token It
Token ID632
Feature activation+3.19e-02

Pos logit contributions
Their+0.008
pered+0.007
uits+0.007
Her+0.007
ics+0.007
Neg logit contributions
ocol-0.014
 accidents-0.014
 beast-0.013
 reality-0.013
oc-0.013
Token was
Token ID373
Feature activation+6.34e-04

Pos logit contributions
ocol+0.068
 accidents+0.061
 beast+0.057
 reality+0.057
umm+0.056
Neg logit contributions
Their-0.044
 run-0.043
ics-0.041
 low-0.041
Her-0.041
Token even
Token ID772
Feature activation+5.91e-02

Pos logit contributions
ocol+0.002
 accidents+0.002
 beast+0.001
oc+0.001
 reality+0.001
Neg logit contributions
Their-0.001
pered-0.001
uits-0.001
ics-0.001
Her-0.000
Token y
Token ID331
Feature activation+4.73e-02

Pos logit contributions
ocol+0.131
 beast+0.122
icer+0.120
oc+0.119
umm+0.117
Neg logit contributions
Their-0.074
 low-0.068
 off-0.067
 fast-0.066
 aside-0.065
Tokenumm
Token ID13929
Feature activation+1.05e-01

Pos logit contributions
ocol+0.050
 beast+0.050
 accidents+0.050
 reality+0.048
oc+0.047
Neg logit contributions
Their-0.105
pered-0.104
uits-0.100
ics-0.099
Her-0.099
Tokenier
Token ID959
Feature activation+8.76e-03

Pos logit contributions
 accidents+0.174
 beast+0.173
ocol+0.171
 reality+0.169
oc+0.168
Neg logit contributions
pered-0.163
ics-0.162
Their-0.161
uits-0.157
coat-0.154
Token than
Token ID621
Feature activation-7.92e-04

Pos logit contributions
ocol+0.017
 accidents+0.016
 beast+0.016
 reality+0.015
oc+0.015
Neg logit contributions
Their-0.013
pered-0.012
uits-0.012
ics-0.012
Her-0.011
Token the
Token ID262
Feature activation-1.73e-02

Pos logit contributions
Their+0.001
uits+0.001
 contain+0.001
 off+0.001
 aside+0.001
Neg logit contributions
oc-0.002
ocol-0.002
 beast-0.002
 accidents-0.002
icer-0.002
Token first
Token ID717
Feature activation-2.91e-02

Pos logit contributions
Their+0.024
pered+0.022
uits+0.022
ics+0.021
Her+0.021
Neg logit contributions
ocol-0.035
 accidents-0.034
 beast-0.033
 reality-0.032
oc-0.032
Token card
Token ID2657
Feature activation-7.79e-03

Pos logit contributions
Their+0.045
pered+0.042
uits+0.041
ics+0.041
Her+0.040
Neg logit contributions
ocol-0.055
 accidents-0.053
 beast-0.051
 reality-0.050
oc-0.050
Token to
Token ID284
Feature activation+1.26e-02

Pos logit contributions
Their+0.008
pered+0.007
Her+0.007
uits+0.006
ics+0.006
Neg logit contributions
ocol-0.016
 accidents-0.015
 beast-0.015
 reality-0.014
oc-0.014
Token be
Token ID307
Feature activation-4.83e-03

Pos logit contributions
ocol+0.030
 accidents+0.029
 beast+0.028
oc+0.027
 reality+0.027
Neg logit contributions
Their-0.013
pered-0.012
ics-0.012
uits-0.012
 run-0.011
Token restored
Token ID15032
Feature activation+6.06e-03

Pos logit contributions
Their+0.006
pered+0.006
uits+0.005
ics+0.005
 run+0.005
Neg logit contributions
ocol-0.010
 accidents-0.010
 beast-0.010
umm-0.010
oc-0.009
Token to
Token ID284
Feature activation+2.00e-02

Pos logit contributions
ocol+0.014
 accidents+0.014
 beast+0.013
 reality+0.013
oc+0.013
Neg logit contributions
Their-0.007
pered-0.006
uits-0.006
ics-0.006
Her-0.006
Token its
Token ID663
Feature activation+1.67e-02

Pos logit contributions
ocol+0.046
 accidents+0.044
umm+0.044
icer+0.043
 beast+0.043
Neg logit contributions
Their-0.022
 run-0.021
ics-0.019
pered-0.019
 swing-0.019
Token former
Token ID1966
Feature activation+1.05e-01

Pos logit contributions
ocol+0.031
 accidents+0.030
 beast+0.028
oc+0.027
icer+0.027
Neg logit contributions
Their-0.026
pered-0.025
 run-0.024
uits-0.024
 swing-0.024
Token glory
Token ID13476
Feature activation+1.67e-02

Pos logit contributions
 accidents+0.135
 beast+0.130
ocol+0.129
 cavity+0.115
 reality+0.112
Neg logit contributions
pered-0.211
Their-0.210
uits-0.207
suits-0.198
 low-0.196
Token."
Token ID526
Feature activation+2.80e-03

Pos logit contributions
ocol+0.039
 accidents+0.038
 beast+0.036
 cavity+0.035
oc+0.035
Neg logit contributions
Their-0.018
uits-0.016
pered-0.016
ics-0.016
Her-0.015
Token (space)
Token ID220
Feature activation+3.55e-02

Pos logit contributions
ocol+0.007
 accidents+0.007
 beast+0.006
 reality+0.006
oc+0.006
Neg logit contributions
Their-0.003
pered-0.003
ics-0.003
uits-0.003
Her-0.002
Token\n
Token ID198
Feature activation-5.50e-03

Pos logit contributions
ocol+0.081
 accidents+0.079
 beast+0.075
 reality+0.075
oc+0.074
Neg logit contributions
Their-0.040
pered-0.037
uits-0.035
ics-0.035
Her-0.034
Token\n
Token ID198
Feature activation+9.92e-05

Pos logit contributions
pered+0.007
Their+0.007
ics+0.007
uits+0.007
 contain+0.007
Neg logit contributions
ocol-0.011
 accidents-0.010
oc-0.010
 reality-0.010
 beast-0.010
Token he
Token ID339
Feature activation+2.25e-02

Pos logit contributions
ocol+0.054
 accidents+0.050
umm+0.049
 beast+0.049
 cavity+0.049
Neg logit contributions
Their-0.038
uits-0.034
ics-0.034
 run-0.034
 their-0.034
Token found
Token ID1043
Feature activation+5.02e-02

Pos logit contributions
ocol+0.056
 accidents+0.052
umm+0.050
 beast+0.048
 reality+0.047
Neg logit contributions
Their-0.024
 run-0.023
 ran-0.023
amed-0.022
 low-0.022
Token the
Token ID262
Feature activation+6.05e-03

Pos logit contributions
ocol+0.149
umm+0.142
oc+0.140
icer+0.139
 accidents+0.134
Neg logit contributions
 back-0.032
 off-0.031
 over-0.029
 aside-0.027
 out-0.026
Token biggest
Token ID4094
Feature activation+8.32e-03

Pos logit contributions
ocol+0.013
 accidents+0.012
oc+0.012
 beast+0.012
 reality+0.012
Neg logit contributions
Their-0.008
pered-0.007
ics-0.007
Her-0.007
uits-0.007
Token,
Token ID11
Feature activation-5.42e-03

Pos logit contributions
ocol+0.018
 accidents+0.017
icer+0.017
 reality+0.016
 beast+0.016
Neg logit contributions
Their-0.010
 run-0.009
 swing-0.009
 low-0.009
pered-0.009
Token ju
Token ID7544
Feature activation+1.05e-01

Pos logit contributions
 low+0.006
 fast+0.005
 run+0.005
 slow+0.005
 swing+0.005
Neg logit contributions
ocol-0.013
oc-0.012
icer-0.012
 reality-0.012
 beast-0.012
Tokenic
Token ID291
Feature activation+1.38e-02

Pos logit contributions
ocol+0.172
 accidents+0.160
 beast+0.160
 reality+0.155
 cavity+0.148
Neg logit contributions
Their-0.187
ics-0.184
amed-0.172
Her-0.167
They-0.158
Tokeniest
Token ID6386
Feature activation+6.47e-03

Pos logit contributions
ocol+0.018
 accidents+0.018
 reality+0.016
 cavity+0.016
icer+0.015
Neg logit contributions
Their-0.028
ics-0.027
pered-0.027
 run-0.026
uits-0.026
Token squash
Token ID34613
Feature activation+1.76e-02

Pos logit contributions
ocol+0.013
icer+0.012
 beast+0.012
 accidents+0.012
 reality+0.012
Neg logit contributions
Their-0.008
 swing-0.008
 run-0.008
 leash-0.008
 chase-0.008
Token he
Token ID339
Feature activation+1.03e-02

Pos logit contributions
ocol+0.037
 accidents+0.036
 beast+0.034
 reality+0.034
oc+0.033
Neg logit contributions
Their-0.023
pered-0.022
ics-0.021
uits-0.021
Her-0.020
Token had
Token ID550
Feature activation-1.25e-02

Pos logit contributions
ocol+0.026
 accidents+0.025
umm+0.023
 beast+0.023
 reality+0.023
Neg logit contributions
Their-0.009
pered-0.009
 run-0.009
 contain-0.008
nes-0.008
Tokenoll
Token ID692
Feature activation+6.40e-02

Pos logit contributions
 accidents+0.032
ocol+0.032
 beast+0.029
 reality+0.029
oc+0.029
Neg logit contributions
Their-0.049
pered-0.048
uits-0.047
ics-0.046
Her-0.046
Tokenip
Token ID541
Feature activation+3.07e-02

Pos logit contributions
 reality+0.127
umm+0.127
 accidents+0.125
 beast+0.122
ocol+0.121
Neg logit contributions
ics-0.080
balls-0.078
Their-0.077
 off-0.076
pered-0.074
Tokenops
Token ID2840
Feature activation+7.09e-03

Pos logit contributions
ocol+0.052
 accidents+0.050
 beast+0.047
 reality+0.047
oc+0.045
Neg logit contributions
Their-0.052
pered-0.050
ics-0.049
uits-0.049
Her-0.048
Token,
Token ID11
Feature activation+8.77e-03

Pos logit contributions
 accidents+0.016
ocol+0.015
 beast+0.014
 reality+0.014
oc+0.014
Neg logit contributions
Their-0.009
pered-0.008
Her-0.007
uits-0.007
suits-0.007
Token ch
Token ID442
Feature activation-1.72e-03

Pos logit contributions
ocol+0.019
 accidents+0.019
 beast+0.018
 reality+0.018
oc+0.018
Neg logit contributions
Their-0.011
pered-0.010
uits-0.009
ics-0.009
Her-0.009
Tokenocol
Token ID4668
Feature activation+1.05e-01

Pos logit contributions
pered+0.004
Their+0.004
uits+0.004
Her+0.004
amed+0.004
Neg logit contributions
 accidents-0.002
 reality-0.002
 beast-0.001
umm-0.001
oc-0.001
Tokenates
Token ID689
Feature activation+1.36e-03

Pos logit contributions
 accidents+0.205
ocol+0.199
 reality+0.192
 beast+0.192
oc+0.184
Neg logit contributions
Their-0.151
pered-0.145
ics-0.140
uits-0.137
Her-0.134
Token,
Token ID11
Feature activation+8.15e-03

Pos logit contributions
ocol+0.003
 accidents+0.003
 beast+0.003
 reality+0.003
oc+0.003
Neg logit contributions
Their-0.002
pered-0.002
uits-0.001
Her-0.001
ics-0.001
Token g
Token ID308
Feature activation+2.68e-02

Pos logit contributions
ocol+0.019
 accidents+0.018
 beast+0.017
oc+0.017
 reality+0.017
Neg logit contributions
Their-0.009
pered-0.008
ics-0.008
uits-0.008
Her-0.008
Tokenummies
Token ID39578
Feature activation+1.57e-02

Pos logit contributions
 accidents+0.041
ocol+0.040
 beast+0.039
 reality+0.038
oc+0.037
Neg logit contributions
Their-0.049
pered-0.048
ics-0.046
uits-0.046
Her-0.046
Token,
Token ID11
Feature activation-2.17e-03

Pos logit contributions
ocol+0.035
 accidents+0.035
 beast+0.033
oc+0.032
 reality+0.032
Neg logit contributions
Their-0.019
pered-0.017
uits-0.016
ics-0.016
Her-0.016
Token She
Token ID1375
Feature activation+1.60e-02

Pos logit contributions
 accidents+0.010
ocol+0.010
oc+0.010
 cavity+0.009
icer+0.009
Neg logit contributions
Their-0.007
pered-0.007
ics-0.007
uits-0.007
 contain-0.006
Token found
Token ID1043
Feature activation+4.92e-02

Pos logit contributions
ocol+0.033
 accidents+0.033
 beast+0.031
 reality+0.031
oc+0.030
Neg logit contributions
Their-0.021
pered-0.020
uits-0.019
ics-0.019
Her-0.019
Token the
Token ID262
Feature activation+1.18e-02

Pos logit contributions
ocol+0.113
 accidents+0.111
 beast+0.106
oc+0.105
 reality+0.105
Neg logit contributions
Their-0.056
pered-0.050
Her-0.048
uits-0.048
ics-0.048
Token biggest
Token ID4094
Feature activation+1.15e-02

Pos logit contributions
ocol+0.019
 accidents+0.019
 reality+0.017
umm+0.017
icer+0.017
Neg logit contributions
Their-0.021
pered-0.020
uits-0.019
ics-0.019
nes-0.018
Token and
Token ID290
Feature activation+1.01e-02

Pos logit contributions
ocol+0.020
 accidents+0.020
 reality+0.018
 beast+0.018
oc+0.018
Neg logit contributions
Their-0.019
pered-0.018
ics-0.018
uits-0.017
nes-0.017
Token ju
Token ID7544
Feature activation+1.04e-01

Pos logit contributions
 accidents+0.018
ocol+0.017
 reality+0.016
 beast+0.016
oc+0.016
Neg logit contributions
Their-0.017
pered-0.016
uits-0.016
ics-0.015
Her-0.015
Tokenic
Token ID291
Feature activation+1.70e-02

Pos logit contributions
ocol+0.157
 beast+0.151
 accidents+0.150
 reality+0.143
 cavity+0.134
Neg logit contributions
Their-0.199
ics-0.198
Her-0.185
amed-0.181
pered-0.172
Tokeniest
Token ID6386
Feature activation+1.02e-02

Pos logit contributions
ocol+0.018
 accidents+0.017
 beast+0.015
 reality+0.015
oc+0.014
Neg logit contributions
Their-0.041
pered-0.039
uits-0.038
ics-0.038
Her-0.038
Token one
Token ID530
Feature activation+1.29e-02

Pos logit contributions
ocol+0.016
 accidents+0.016
 beast+0.015
 reality+0.014
oc+0.014
Neg logit contributions
Their-0.019
pered-0.017
uits-0.017
ics-0.017
Her-0.017
Token,
Token ID11
Feature activation+7.54e-04

Pos logit contributions
ocol+0.030
 accidents+0.029
 beast+0.028
 reality+0.027
oc+0.027
Neg logit contributions
Their-0.015
pered-0.013
ics-0.013
Her-0.013
uits-0.013
Token but
Token ID475
Feature activation+4.02e-02

Pos logit contributions
ocol+0.002
 accidents+0.002
 reality+0.002
 beast+0.002
oc+0.001
Neg logit contributions
Their-0.001
pered-0.001
ics-0.001
uits-0.001
Her-0.001
Token could
Token ID714
Feature activation+8.18e-03

Pos logit contributions
ocol+0.054
 accidents+0.053
umm+0.048
 reality+0.048
 beast+0.047
Neg logit contributions
 run-0.025
Their-0.024
 swing-0.024
 low-0.024
ics-0.024
Token see
Token ID766
Feature activation+4.05e-02

Pos logit contributions
 beast+0.014
 reality+0.014
 accidents+0.014
ocol+0.014
 ig+0.014
Neg logit contributions
Their-0.012
suits-0.012
pered-0.011
amed-0.010
sc-0.010
Token the
Token ID262
Feature activation+1.74e-03

Pos logit contributions
ocol+0.106
 cavity+0.098
 accidents+0.097
umm+0.094
 reality+0.092
Neg logit contributions
Their-0.035
 over-0.033
 low-0.033
 fast-0.033
 off-0.032
Token biggest
Token ID4094
Feature activation+8.00e-03

Pos logit contributions
ocol+0.004
 accidents+0.004
 beast+0.003
 reality+0.003
oc+0.003
Neg logit contributions
Their-0.002
ics-0.002
pered-0.002
 swing-0.002
 low-0.002
Token,
Token ID11
Feature activation-2.02e-03

Pos logit contributions
ocol+0.017
 accidents+0.016
 beast+0.016
 reality+0.015
 cavity+0.015
Neg logit contributions
Their-0.010
pered-0.010
ics-0.009
 run-0.009
Her-0.009
Token ju
Token ID7544
Feature activation+1.04e-01

Pos logit contributions
Their+0.003
pered+0.003
uits+0.003
ics+0.003
Her+0.003
Neg logit contributions
ocol-0.004
 accidents-0.004
 beast-0.004
 reality-0.004
oc-0.004
Tokenic
Token ID291
Feature activation+1.50e-02

Pos logit contributions
ocol+0.155
 accidents+0.148
 beast+0.146
 reality+0.142
 cavity+0.136
Neg logit contributions
ics-0.198
Their-0.197
Her-0.185
amed-0.182
pered-0.175
Tokeniest
Token ID6386
Feature activation+8.02e-03

Pos logit contributions
ocol+0.018
 accidents+0.018
 reality+0.017
 cavity+0.016
 beast+0.016
Neg logit contributions
Their-0.033
ics-0.031
pered-0.031
Her-0.030
uits-0.030
Token cher
Token ID21382
Feature activation+5.74e-02

Pos logit contributions
ocol+0.017
 beast+0.017
 accidents+0.016
 reality+0.016
 cavity+0.016
Neg logit contributions
 swing-0.009
Their-0.009
 run-0.009
Her-0.009
ics-0.008
Tokenries
Token ID1678
Feature activation+9.51e-03

Pos logit contributions
 beast+0.059
ocol+0.058
 reality+0.057
 accidents+0.057
 cavity+0.055
Neg logit contributions
Their-0.129
ics-0.128
pered-0.124
Her-0.123
 aside-0.120
Token ever
Token ID1683
Feature activation-8.52e-03

Pos logit contributions
ocol+0.019
 accidents+0.019
 beast+0.018
oc+0.017
 reality+0.017
Neg logit contributions
Their-0.013
pered-0.012
uits-0.012
Her-0.012
ics-0.011
Token was
Token ID373
Feature activation-1.07e-03

Pos logit contributions
ocol+0.062
 accidents+0.055
 beast+0.053
 reality+0.053
icer+0.051
Neg logit contributions
Their-0.042
 run-0.041
Her-0.040
 low-0.039
ics-0.039
Token the
Token ID262
Feature activation+5.23e-03

Pos logit contributions
Their+0.001
pered+0.001
uits+0.001
ics+0.001
Her+0.001
Neg logit contributions
ocol-0.002
 accidents-0.002
 beast-0.002
oc-0.002
 reality-0.002
Token sweet
Token ID6029
Feature activation+2.58e-02

Pos logit contributions
ocol+0.010
 accidents+0.010
 beast+0.009
 reality+0.009
oc+0.009
Neg logit contributions
Their-0.008
pered-0.007
uits-0.007
ics-0.007
Her-0.007
Tokenest
Token ID395
Feature activation-2.60e-04

Pos logit contributions
ocol+0.052
 beast+0.050
 reality+0.050
 accidents+0.049
 cavity+0.048
Neg logit contributions
Their-0.033
 run-0.032
 contain-0.031
 swing-0.031
Her-0.030
Token and
Token ID290
Feature activation+5.24e-03

Pos logit contributions
Their+0.000
 swing+0.000
 run+0.000
 contain+0.000
 dads+0.000
Neg logit contributions
ocol-0.000
 accidents-0.000
 beast-0.000
 cavity-0.000
 reality-0.000
Token ju
Token ID7544
Feature activation+1.04e-01

Pos logit contributions
ocol+0.011
 beast+0.010
 accidents+0.010
oc+0.010
 reality+0.009
Neg logit contributions
Their-0.007
 low-0.007
 run-0.007
Her-0.007
 contain-0.007
Tokenic
Token ID291
Feature activation+1.43e-02

Pos logit contributions
ocol+0.155
 accidents+0.149
 beast+0.149
 reality+0.142
 cavity+0.137
Neg logit contributions
Their-0.199
ics-0.194
amed-0.183
Her-0.183
pered-0.175
Tokeniest
Token ID6386
Feature activation+4.69e-03

Pos logit contributions
ocol+0.017
 accidents+0.016
 beast+0.015
 reality+0.014
oc+0.014
Neg logit contributions
Their-0.032
pered-0.031
ics-0.030
uits-0.030
Her-0.030
Token strawberry
Token ID41236
Feature activation+3.14e-02

Pos logit contributions
ocol+0.009
 beast+0.008
 cavity+0.008
 accidents+0.008
oc+0.008
Neg logit contributions
Their-0.007
 swing-0.007
 run-0.007
 dads-0.007
Her-0.006
Token she
Token ID673
Feature activation+1.48e-02

Pos logit contributions
ocol+0.065
 accidents+0.064
 beast+0.061
 reality+0.060
oc+0.060
Neg logit contributions
Their-0.041
pered-0.038
ics-0.038
uits-0.037
Her-0.036
Token had
Token ID550
Feature activation-1.42e-02

Pos logit contributions
ocol+0.036
 accidents+0.035
 beast+0.033
 reality+0.033
oc+0.032
Neg logit contributions
Their-0.015
pered-0.014
 run-0.013
nes-0.013
 contain-0.013
Token,
Token ID11
Feature activation+5.40e-03

Pos logit contributions
 accidents+0.057
ocol+0.057
 beast+0.053
 reality+0.053
oc+0.052
Neg logit contributions
Their-0.034
pered-0.031
Her-0.029
uits-0.029
ics-0.028
Token cup
Token ID6508
Feature activation+3.94e-02

Pos logit contributions
ocol+0.012
 accidents+0.011
 beast+0.011
 reality+0.011
oc+0.011
Neg logit contributions
Their-0.007
pered-0.006
uits-0.006
ics-0.006
Her-0.006
Tokencakes
Token ID37263
Feature activation+3.69e-03

Pos logit contributions
ocol+0.079
 accidents+0.078
 beast+0.073
oc+0.071
umm+0.071
Neg logit contributions
Their-0.055
pered-0.052
Her-0.049
uits-0.048
ics-0.047
Token and
Token ID290
Feature activation-2.19e-03

Pos logit contributions
 accidents+0.007
ocol+0.006
oc+0.006
umm+0.006
 ig+0.006
Neg logit contributions
pered-0.006
Their-0.006
ighed-0.005
Her-0.005
nes-0.005
Token ch
Token ID442
Feature activation-1.21e-02

Pos logit contributions
Their+0.003
pered+0.003
uits+0.003
ics+0.003
Her+0.003
Neg logit contributions
ocol-0.004
 accidents-0.004
 beast-0.004
 reality-0.004
oc-0.004
Tokenocol
Token ID4668
Feature activation+1.03e-01

Pos logit contributions
Their+0.036
pered+0.035
uits+0.035
ics+0.034
Her+0.034
Neg logit contributions
 accidents-0.005
ocol-0.004
 beast-0.004
 reality-0.004
oc-0.003
Tokenates
Token ID689
Feature activation-1.75e-02

Pos logit contributions
ocol+0.188
 accidents+0.182
 beast+0.171
 reality+0.168
oc+0.168
Neg logit contributions
Their-0.166
pered-0.155
uits-0.152
ics-0.150
Her-0.149
Token.
Token ID13
Feature activation-7.74e-03

Pos logit contributions
Their+0.023
pered+0.022
Her+0.021
uits+0.020
ics+0.020
Neg logit contributions
ocol-0.036
 accidents-0.035
 reality-0.033
oc-0.032
 pounds-0.032
Token␣
Token ID220
Feature activation+4.30e-02

Pos logit contributions
Their+0.010
pered+0.009
uits+0.009
Her+0.009
ics+0.008
Neg logit contributions
ocol-0.017
 accidents-0.016
 beast-0.016
 reality-0.015
oc-0.015
Token\n
Token ID198
Feature activation-1.81e-03

Pos logit contributions
ocol+0.097
 accidents+0.095
 beast+0.090
 reality+0.089
oc+0.089
Neg logit contributions
Their-0.049
pered-0.045
uits-0.043
ics-0.043
Her-0.042
Token\n
Token ID198
Feature activation+2.26e-03

Pos logit contributions
pered+0.002
Their+0.002
ics+0.002
 contain+0.002
uits+0.002
Neg logit contributions
ocol-0.004
oc-0.004
 accidents-0.004
 beast-0.003
 reality-0.003
Token was
Token ID373
Feature activation+7.38e-04

Pos logit contributions
ocol+0.091
 accidents+0.079
umm+0.078
icer+0.078
 reality+0.076
Neg logit contributions
 run-0.057
Their-0.056
 ran-0.053
ics-0.053
 over-0.052
Token the
Token ID262
Feature activation+1.91e-03

Pos logit contributions
ocol+0.002
oc+0.002
 accidents+0.002
 beast+0.002
umm+0.002
Neg logit contributions
Their-0.001
pered-0.001
 over-0.001
 low-0.001
 off-0.001
Token sweet
Token ID6029
Feature activation+2.84e-02

Pos logit contributions
ocol+0.004
 accidents+0.004
oc+0.004
 beast+0.004
 reality+0.004
Neg logit contributions
Their-0.002
pered-0.002
ics-0.002
 run-0.002
 swing-0.002
Tokenest
Token ID395
Feature activation-8.98e-04

Pos logit contributions
ocol+0.052
 reality+0.050
 beast+0.050
 cavity+0.047
icer+0.047
Neg logit contributions
 run-0.041
 swing-0.041
 chase-0.039
ics-0.038
 leash-0.038
Token and
Token ID290
Feature activation+6.99e-03

Pos logit contributions
 swing+0.001
 run+0.001
Their+0.001
 leash+0.001
 chase+0.001
Neg logit contributions
ocol-0.002
 beast-0.002
 accidents-0.002
 reality-0.002
icer-0.002
Token ju
Token ID7544
Feature activation+1.03e-01

Pos logit contributions
ocol+0.016
icer+0.014
 beast+0.014
oc+0.014
 reality+0.014
Neg logit contributions
 low-0.009
 fast-0.008
 run-0.008
 slow-0.008
 swing-0.008
Tokenic
Token ID291
Feature activation+1.64e-02

Pos logit contributions
ocol+0.157
 accidents+0.152
 beast+0.151
 reality+0.144
 cavity+0.138
Neg logit contributions
Their-0.190
ics-0.190
amed-0.175
Her-0.174
pered-0.169
Tokeniest
Token ID6386
Feature activation+7.53e-03

Pos logit contributions
ocol+0.020
 accidents+0.020
 reality+0.018
 beast+0.018
 cavity+0.018
Neg logit contributions
Their-0.035
ics-0.033
pered-0.033
uits-0.033
Her-0.032
Token cherry
Token ID23612
Feature activation+2.97e-02

Pos logit contributions
ocol+0.015
 beast+0.014
icer+0.014
 accidents+0.014
 cavity+0.014
Neg logit contributions
 swing-0.010
 run-0.010
Their-0.009
 leash-0.009
 chase-0.009
Token he
Token ID339
Feature activation+8.99e-03

Pos logit contributions
ocol+0.061
 accidents+0.058
 reality+0.056
 beast+0.056
 cavity+0.055
Neg logit contributions
Their-0.040
ics-0.038
pered-0.037
 run-0.036
 swing-0.035
Token had
Token ID550
Feature activation-1.55e-02

Pos logit contributions
ocol+0.023
 accidents+0.022
umm+0.021
icer+0.021
 reality+0.021
Neg logit contributions
Their-0.008
 run-0.007
nes-0.007
pered-0.007
 contain-0.007
Token you
Token ID345
Feature activation-9.78e-03

Pos logit contributions
ocol+0.062
 beast+0.062
oc+0.062
 cavity+0.060
icer+0.059
Neg logit contributions
 first-0.032
Their-0.030
 run-0.029
 follow-0.028
 their-0.028
Token know
Token ID760
Feature activation+1.80e-02

Pos logit contributions
Their+0.007
 run+0.006
uits+0.006
ics+0.006
 contain+0.006
Neg logit contributions
ocol-0.026
 accidents-0.026
umm-0.025
 reality-0.025
 beast-0.024
Token what
Token ID644
Feature activation+1.67e-02

Pos logit contributions
oc+0.041
ocol+0.040
 beast+0.039
umm+0.039
icer+0.038
Neg logit contributions
 off-0.020
 their-0.019
Their-0.019
 first-0.019
ics-0.019
Token is
Token ID318
Feature activation-2.19e-03

Pos logit contributions
ocol+0.036
umm+0.034
icer+0.034
 accidents+0.034
oc+0.033
Neg logit contributions
Their-0.021
 run-0.019
 their-0.019
Her-0.019
ics-0.019
Token y
Token ID331
Feature activation+4.68e-02

Pos logit contributions
Their+0.003
uits+0.002
pered+0.002
 contain+0.002
Her+0.002
Neg logit contributions
ocol-0.005
 accidents-0.005
 beast-0.004
 cavity-0.004
 reality-0.004
Tokenumm
Token ID13929
Feature activation+1.03e-01

Pos logit contributions
 beast+0.056
 reality+0.056
 accidents+0.055
ocol+0.053
oc+0.051
Neg logit contributions
Their-0.097
pered-0.096
ics-0.095
uits-0.091
suits-0.090
Tokenier
Token ID959
Feature activation+1.27e-02

Pos logit contributions
 accidents+0.163
ocol+0.162
 beast+0.158
 reality+0.157
oc+0.151
Neg logit contributions
Their-0.176
ics-0.170
pered-0.169
uits-0.162
suits-0.161
Token?
Token ID30
Feature activation+1.50e-02

Pos logit contributions
ocol+0.028
 accidents+0.027
 beast+0.026
 reality+0.026
oc+0.025
Neg logit contributions
Their-0.016
pered-0.014
ics-0.014
uits-0.014
Her-0.013
Token You
Token ID921
Feature activation-1.22e-02

Pos logit contributions
 accidents+0.030
 reality+0.030
ocol+0.029
 beast+0.029
 ig+0.027
Neg logit contributions
Their-0.020
pered-0.020
Her-0.019
ends-0.017
 run-0.017
Token are
Token ID389
Feature activation-1.21e-02

Pos logit contributions
 run+0.009
Their+0.009
 contain+0.008
 follow+0.008
 argue+0.008
Neg logit contributions
 accidents-0.032
ocol-0.032
umm-0.031
 beast-0.030
 reality-0.030
Token y
Token ID331
Feature activation+4.78e-02

Pos logit contributions
Their+0.012
pered+0.011
ics+0.010
uits+0.010
Her+0.010
Neg logit contributions
ocol-0.030
 accidents-0.029
 beast-0.028
oc-0.028
 reality-0.028
Token the
Token ID262
Feature activation+1.73e-02

Pos logit contributions
ocol+0.024
 accidents+0.024
 beast+0.022
oc+0.022
umm+0.022
Neg logit contributions
Their-0.012
pered-0.010
ics-0.010
Her-0.010
uits-0.010
Token garden
Token ID11376
Feature activation-2.09e-03

Pos logit contributions
ocol+0.035
 accidents+0.034
 beast+0.032
 reality+0.031
oc+0.031
Neg logit contributions
Their-0.025
pered-0.023
ics-0.022
uits-0.022
Her-0.022
Token to
Token ID284
Feature activation+1.44e-02

Pos logit contributions
Their+0.003
pered+0.002
nes+0.002
uits+0.002
amed+0.002
Neg logit contributions
 accidents-0.004
oc-0.004
ocol-0.004
 ig-0.004
 reality-0.004
Token pick
Token ID2298
Feature activation+5.21e-02

Pos logit contributions
ocol+0.036
 accidents+0.036
 beast+0.034
 reality+0.034
oc+0.034
Neg logit contributions
Their-0.013
pered-0.012
uits-0.011
ics-0.011
Her-0.011
Token the
Token ID262
Feature activation+1.04e-02

Pos logit contributions
ocol+0.126
 accidents+0.123
 reality+0.121
 beast+0.120
oc+0.116
Neg logit contributions
Their-0.049
 off-0.045
Her-0.045
ics-0.044
pered-0.043
Token ju
Token ID7544
Feature activation+1.02e-01

Pos logit contributions
ocol+0.019
 accidents+0.019
 beast+0.018
 reality+0.018
oc+0.017
Neg logit contributions
Their-0.016
 swing-0.015
Her-0.015
 run-0.015
 low-0.015
Tokenic
Token ID291
Feature activation+1.60e-02

Pos logit contributions
ocol+0.153
 accidents+0.146
 beast+0.142
 reality+0.136
 cavity+0.124
Neg logit contributions
ics-0.199
Their-0.195
Her-0.184
amed-0.177
 swing-0.173
Tokeniest
Token ID6386
Feature activation+7.97e-03

Pos logit contributions
ocol+0.022
 accidents+0.021
 beast+0.019
oc+0.019
 reality+0.018
Neg logit contributions
Their-0.033
pered-0.032
uits-0.031
ics-0.030
Her-0.030
Token and
Token ID290
Feature activation+2.33e-02

Pos logit contributions
 beast+0.014
 accidents+0.014
 reality+0.014
ocol+0.014
 cavity+0.012
Neg logit contributions
 swing-0.013
 run-0.012
Her-0.011
 on-0.011
Their-0.011
Token sweet
Token ID6029
Feature activation+2.96e-02

Pos logit contributions
ocol+0.052
 beast+0.051
 accidents+0.047
 reality+0.047
icer+0.047
Neg logit contributions
 low-0.028
Their-0.027
 run-0.027
 fast-0.027
Her-0.027
Tokenest
Token ID395
Feature activation+7.97e-03

Pos logit contributions
 beast+0.054
 reality+0.054
ocol+0.050
 accidents+0.049
icer+0.048
Neg logit contributions
 run-0.043
 swing-0.042
 off-0.041
 dads-0.039
Her-0.039
Token against
Token ID1028
Feature activation-5.28e-03

Pos logit contributions
Their+0.081
ics+0.078
Her+0.076
pered+0.075
suits+0.069
Neg logit contributions
 accidents-0.122
ocol-0.118
 reality-0.118
 beast-0.118
heart-0.110
Token all
Token ID477
Feature activation-1.59e-02

Pos logit contributions
Their+0.007
pered+0.007
ics+0.007
uits+0.007
Her+0.007
Neg logit contributions
ocol-0.011
 accidents-0.010
 beast-0.010
 reality-0.010
oc-0.010
Token of
Token ID286
Feature activation-9.92e-03

Pos logit contributions
Their+0.025
pered+0.024
suits+0.023
uits+0.023
ics+0.023
Neg logit contributions
ocol-0.028
 accidents-0.028
 beast-0.026
 reality-0.026
oc-0.026
Token the
Token ID262
Feature activation-3.90e-04

Pos logit contributions
Their+0.013
pered+0.012
uits+0.012
ics+0.011
Her+0.011
Neg logit contributions
ocol-0.021
 accidents-0.020
 beast-0.019
oc-0.019
 reality-0.019
Token fastest
Token ID14162
Feature activation-4.09e-02

Pos logit contributions
Their+0.000
pered+0.000
uits+0.000
ics+0.000
Her+0.000
Neg logit contributions
ocol-0.001
 accidents-0.001
 beast-0.001
 reality-0.001
oc-0.001
Token runners
Token ID19323
Feature activation-9.21e-02

Pos logit contributions
Their+0.054
pered+0.050
uits+0.048
ics+0.048
Her+0.047
Neg logit contributions
ocol-0.085
 accidents-0.084
 beast-0.080
 reality-0.079
oc-0.078
Token in
Token ID287
Feature activation-3.88e-02

Pos logit contributions
Their+0.137
Her+0.122
pered+0.121
uits+0.116
amed+0.113
Neg logit contributions
ocol-0.170
 beast-0.164
 reality-0.161
 accidents-0.161
oc-0.159
Token the
Token ID262
Feature activation+6.96e-03

Pos logit contributions
Their+0.052
pered+0.048
uits+0.046
ics+0.046
Her+0.045
Neg logit contributions
ocol-0.081
 accidents-0.079
 beast-0.075
 reality-0.075
oc-0.073
Token world
Token ID995
Feature activation-1.91e-02

Pos logit contributions
ocol+0.015
 accidents+0.014
 cavity+0.013
oc+0.013
 beast+0.013
Neg logit contributions
Their-0.010
pered-0.009
uits-0.009
 run-0.009
ics-0.008
Token!
Token ID0
Feature activation-1.87e-03

Pos logit contributions
Their+0.028
pered+0.026
Her+0.025
uits+0.024
 dads+0.024
Neg logit contributions
 accidents-0.036
ocol-0.036
 beast-0.035
oc-0.034
 reality-0.034
Token Even
Token ID3412
Feature activation+7.58e-04

Pos logit contributions
Their+0.002
pered+0.002
uits+0.002
ics+0.002
Her+0.002
Neg logit contributions
ocol-0.004
 accidents-0.004
 beast-0.004
 reality-0.004
 cavity-0.004
Token friends
Token ID2460
Feature activation-6.71e-02

Pos logit contributions
Their+0.025
pered+0.024
ics+0.023
uits+0.023
 run+0.022
Neg logit contributions
ocol-0.038
 accidents-0.037
 beast-0.035
 reality-0.035
oc-0.035
Token would
Token ID561
Feature activation-4.30e-02

Pos logit contributions
Their+0.108
pered+0.104
uits+0.096
Her+0.094
ics+0.092
Neg logit contributions
ocol-0.114
 reality-0.112
oc-0.111
 accidents-0.111
 beast-0.108
Token always
Token ID1464
Feature activation-1.19e-02

Pos logit contributions
Their+0.059
pered+0.057
suits+0.055
amed+0.053
uits+0.051
Neg logit contributions
ocol-0.086
 reality-0.084
 accidents-0.084
oc-0.080
 beast-0.080
Token join
Token ID4654
Feature activation-7.06e-02

Pos logit contributions
Their+0.014
pered+0.014
uits+0.013
ics+0.012
suits+0.012
Neg logit contributions
ocol-0.026
 accidents-0.026
 reality-0.025
 beast-0.024
oc-0.024
Token the
Token ID262
Feature activation-1.64e-02

Pos logit contributions
Their+0.065
uits+0.060
ics+0.060
pered+0.058
Her+0.055
Neg logit contributions
ocol-0.175
 accidents-0.172
 reality-0.162
 beast-0.162
oc-0.161
Token kids
Token ID3988
Feature activation-8.65e-02

Pos logit contributions
Their+0.024
pered+0.023
uits+0.022
Her+0.021
ics+0.021
Neg logit contributions
ocol-0.032
 accidents-0.032
 beast-0.030
 reality-0.030
oc-0.029
Token to
Token ID284
Feature activation-1.63e-02

Pos logit contributions
Their+0.116
pered+0.104
Her+0.102
uits+0.099
amed+0.098
Neg logit contributions
 accidents-0.179
ocol-0.170
 reality-0.169
oc-0.169
 beast-0.164
Token play
Token ID711
Feature activation-7.52e-02

Pos logit contributions
Their+0.018
pered+0.017
uits+0.016
suits+0.016
ics+0.016
Neg logit contributions
ocol-0.037
 accidents-0.037
 reality-0.035
 beast-0.035
oc-0.034
Token and
Token ID290
Feature activation+2.77e-03

Pos logit contributions
Their+0.064
pered+0.058
uits+0.057
ics+0.055
Her+0.055
Neg logit contributions
 accidents-0.186
ocol-0.185
oc-0.179
 beast-0.176
 reality-0.176
Token have
Token ID423
Feature activation-2.31e-02

Pos logit contributions
 accidents+0.006
ocol+0.006
 reality+0.006
 beast+0.006
oc+0.006
Neg logit contributions
Their-0.003
pered-0.003
uits-0.003
Her-0.002
amed-0.002
Token fun
Token ID1257
Feature activation-4.99e-02

Pos logit contributions
Their+0.033
pered+0.031
uits+0.030
ics+0.029
Her+0.029
Neg logit contributions
ocol-0.046
 accidents-0.045
 reality-0.043
 beast-0.043
oc-0.042
Token strategies
Token ID10064
Feature activation-3.79e-02

Pos logit contributions
 beast+0.022
ocol+0.022
 cavity+0.021
 accidents+0.021
oc+0.020
Neg logit contributions
Their-0.014
uits-0.014
 off-0.013
 over-0.013
 their-0.013
Token to
Token ID284
Feature activation-1.91e-03

Pos logit contributions
Their+0.046
pered+0.043
Her+0.042
ics+0.040
uits+0.040
Neg logit contributions
 accidents-0.081
ocol-0.081
 reality-0.076
 beast-0.076
oc-0.074
Token beat
Token ID4405
Feature activation-4.49e-02

Pos logit contributions
Their+0.002
pered+0.002
uits+0.002
ics+0.002
Her+0.002
Neg logit contributions
ocol-0.004
 accidents-0.004
 beast-0.004
 reality-0.004
oc-0.004
Token the
Token ID262
Feature activation-2.78e-02

Pos logit contributions
Their+0.054
pered+0.049
uits+0.048
ics+0.045
 run+0.044
Neg logit contributions
ocol-0.100
 accidents-0.098
 beast-0.094
oc-0.092
 cavity-0.091
Token other
Token ID584
Feature activation-5.25e-02

Pos logit contributions
Their+0.038
pered+0.035
uits+0.035
 run+0.034
 contain+0.034
Neg logit contributions
ocol-0.057
 accidents-0.053
oc-0.051
 beast-0.051
 cavity-0.049
Token players
Token ID1938
Feature activation-8.65e-02

Pos logit contributions
Their+0.064
pered+0.060
uits+0.059
 run+0.059
suits+0.057
Neg logit contributions
ocol-0.116
 accidents-0.109
 beast-0.104
 cavity-0.103
oc-0.101
Token.
Token ID13
Feature activation-7.57e-03

Pos logit contributions
Their+0.110
pered+0.101
Her+0.098
uits+0.096
ics+0.094
Neg logit contributions
 accidents-0.179
ocol-0.179
 reality-0.174
 beast-0.172
oc-0.165
Token␣
Token ID220
Feature activation+3.78e-02

Pos logit contributions
Their+0.013
pered+0.013
ics+0.013
Her+0.012
uits+0.012
Neg logit contributions
 accidents-0.012
ocol-0.012
 beast-0.011
 reality-0.011
oc-0.011
Token\n
Token ID198
Feature activation-6.77e-03

Pos logit contributions
ocol+0.083
 accidents+0.082
 beast+0.078
oc+0.077
 reality+0.077
Neg logit contributions
Their-0.044
pered-0.041
uits-0.039
ics-0.039
Her-0.038
Token\n
Token ID198
Feature activation+1.72e-04

Pos logit contributions
pered+0.010
ics+0.009
 contain+0.009
Their+0.009
 leash+0.009
Neg logit contributions
oc-0.012
 beast-0.012
 accidents-0.012
 reality-0.012
ocol-0.012
TokenAfter
Token ID3260
Feature activation-2.83e-02

Pos logit contributions
 accidents+0.000
ocol+0.000
 beast+0.000
oc+0.000
 reality+0.000
Neg logit contributions
pered-0.000
Their-0.000
ics-0.000
uits-0.000
 contain-0.000
Token them
Token ID606
Feature activation-2.12e-02

Pos logit contributions
Their+0.008
pered+0.008
uits+0.007
ics+0.007
Her+0.007
Neg logit contributions
 accidents-0.014
ocol-0.013
 beast-0.013
 reality-0.013
oc-0.012
Token fight
Token ID1907
Feature activation-3.30e-02

Pos logit contributions
Their+0.028
pered+0.027
suits+0.027
uits+0.026
Her+0.026
Neg logit contributions
 accidents-0.045
oc-0.041
 beast-0.040
 reality-0.039
ocol-0.039
Token!
Token ID0
Feature activation+2.51e-03

Pos logit contributions
Their+0.036
pered+0.035
Her+0.033
uits+0.033
ics+0.033
Neg logit contributions
 accidents-0.075
ocol-0.071
 beast-0.070
 reality-0.069
oc-0.067
Token So
Token ID1406
Feature activation-1.33e-02

Pos logit contributions
ocol+0.006
 accidents+0.006
 beast+0.005
oc+0.005
 reality+0.005
Neg logit contributions
Their-0.003
pered-0.002
ics-0.002
uits-0.002
 contain-0.002
Token the
Token ID262
Feature activation-1.96e-02

Pos logit contributions
Their+0.013
pered+0.012
uits+0.011
ics+0.011
Her+0.011
Neg logit contributions
ocol-0.032
 accidents-0.032
 beast-0.030
 reality-0.030
oc-0.030
Token players
Token ID1938
Feature activation-8.60e-02

Pos logit contributions
Their+0.031
pered+0.031
Her+0.031
ics+0.030
uits+0.029
Neg logit contributions
 accidents-0.035
 beast-0.033
 reality-0.031
ocol-0.030
 cavity-0.029
Token got
Token ID1392
Feature activation+7.79e-03

Pos logit contributions
Their+0.133
pered+0.123
Her+0.120
uits+0.117
ics+0.116
Neg logit contributions
 beast-0.150
 accidents-0.147
 reality-0.144
ocol-0.141
oc-0.140
Token ready
Token ID3492
Feature activation+1.53e-02

Pos logit contributions
 accidents+0.017
ocol+0.016
 beast+0.015
 reality+0.015
 cavity+0.015
Neg logit contributions
Their-0.009
Her-0.009
uits-0.009
pered-0.009
nes-0.008
Token and
Token ID290
Feature activation+1.64e-02

Pos logit contributions
 accidents+0.032
ocol+0.029
 reality+0.029
 beast+0.028
 pounds+0.027
Neg logit contributions
Their-0.020
Her-0.020
ics-0.019
pered-0.019
nes-0.019
Token hopped
Token ID45230
Feature activation-2.99e-02

Pos logit contributions
 accidents+0.036
ocol+0.035
 beast+0.034
 reality+0.033
oc+0.033
Neg logit contributions
Their-0.020
pered-0.018
uits-0.018
Her-0.017
ics-0.017
Token into
Token ID656
Feature activation+1.85e-03

Pos logit contributions
Her+0.040
ics+0.037
Their+0.034
nes+0.032
atha+0.031
Neg logit contributions
 accidents-0.056
 beast-0.055
ocol-0.053
 reality-0.052
oc-0.052
Token little
Token ID1310
Feature activation-2.73e-02

Pos logit contributions
Their+0.026
pered+0.024
ics+0.024
Her+0.023
uits+0.023
Neg logit contributions
 accidents-0.028
ocol-0.027
 beast-0.026
 reality-0.026
oc-0.024
Token girl
Token ID2576
Feature activation-2.66e-02

Pos logit contributions
Their+0.054
pered+0.051
uits+0.050
ics+0.050
Her+0.049
Neg logit contributions
 accidents-0.041
ocol-0.039
 beast-0.037
 reality-0.037
oc-0.034
Token and
Token ID290
Feature activation-3.69e-02

Pos logit contributions
Their+0.037
pered+0.034
Her+0.032
uits+0.032
ics+0.032
Neg logit contributions
ocol-0.052
 accidents-0.052
 beast-0.051
oc-0.050
 reality-0.050
Token the
Token ID262
Feature activation-1.17e-02

Pos logit contributions
Their+0.046
pered+0.045
uits+0.044
suits+0.042
ics+0.040
Neg logit contributions
 accidents-0.078
ocol-0.072
 reality-0.071
 beast-0.071
oc-0.067
Token other
Token ID584
Feature activation-3.53e-02

Pos logit contributions
Their+0.018
pered+0.017
ics+0.017
Her+0.017
uits+0.016
Neg logit contributions
 accidents-0.021
 beast-0.021
 reality-0.020
ocol-0.020
umm-0.018
Token players
Token ID1938
Feature activation-8.58e-02

Pos logit contributions
Their+0.053
pered+0.049
uits+0.048
ics+0.048
Her+0.047
Neg logit contributions
ocol-0.068
 accidents-0.067
 beast-0.063
 reality-0.062
oc-0.062
Token attached
Token ID7223
Feature activation-5.18e-03

Pos logit contributions
Their+0.126
Her+0.113
pered+0.111
uits+0.101
ics+0.101
Neg logit contributions
ocol-0.154
 reality-0.153
 beast-0.152
 accidents-0.150
oc-0.148
Token the
Token ID262
Feature activation-1.08e-02

Pos logit contributions
Their+0.006
pered+0.006
Her+0.005
nes+0.005
ics+0.005
Neg logit contributions
 accidents-0.011
ocol-0.011
 beast-0.011
 reality-0.011
 cavity-0.010
Token pieces
Token ID5207
Feature activation-2.00e-02

Pos logit contributions
Their+0.015
Her+0.014
pered+0.014
ics+0.014
uits+0.013
Neg logit contributions
 accidents-0.022
ocol-0.021
 beast-0.021
 reality-0.020
 cavity-0.019
Token of
Token ID286
Feature activation-1.83e-02

Pos logit contributions
Their+0.027
pered+0.026
Her+0.026
 argue+0.024
ics+0.024
Neg logit contributions
 accidents-0.038
ocol-0.036
 reality-0.035
 beast-0.033
 pounds-0.033
Token wood
Token ID4898
Feature activation+1.23e-02

Pos logit contributions
Their+0.023
uits+0.020
 run+0.020
pered+0.019
ics+0.019
Neg logit contributions
ocol-0.040
 accidents-0.040
 beast-0.038
umm-0.037
oc-0.037
Token it
Token ID340
Feature activation+2.56e-02

Pos logit contributions
ocol+0.008
 cavity+0.007
 accidents+0.007
oc+0.007
icer+0.007
Neg logit contributions
Their-0.005
 run-0.005
ics-0.005
 low-0.005
Her-0.005
Token asked
Token ID1965
Feature activation+5.23e-03

Pos logit contributions
ocol+0.053
 accidents+0.053
 beast+0.050
 reality+0.049
oc+0.049
Neg logit contributions
Their-0.034
pered-0.031
uits-0.030
ics-0.030
Her-0.030
Token each
Token ID1123
Feature activation-2.81e-02

Pos logit contributions
ocol+0.012
 accidents+0.011
 beast+0.011
 reality+0.011
oc+0.011
Neg logit contributions
Their-0.006
pered-0.006
uits-0.006
ics-0.006
Her-0.006
Token of
Token ID286
Feature activation-2.96e-03

Pos logit contributions
Their+0.028
pered+0.025
 run+0.025
uits+0.024
 contain+0.024
Neg logit contributions
ocol-0.068
 accidents-0.067
oc-0.063
 beast-0.063
 reality-0.062
Token the
Token ID262
Feature activation-1.60e-02

Pos logit contributions
Their+0.003
pered+0.003
uits+0.003
Her+0.003
 their+0.003
Neg logit contributions
ocol-0.007
 accidents-0.007
umm-0.007
 cavity-0.006
oc-0.006
Token players
Token ID1938
Feature activation-8.52e-02

Pos logit contributions
Their+0.024
ics+0.022
uits+0.022
pered+0.022
Her+0.022
Neg logit contributions
 accidents-0.031
ocol-0.030
 beast-0.029
 reality-0.028
oc-0.028
Token to
Token ID284
Feature activation-6.16e-03

Pos logit contributions
Their+0.118
pered+0.106
Her+0.102
amed+0.101
ics+0.101
Neg logit contributions
 accidents-0.167
ocol-0.166
 reality-0.163
 beast-0.160
oc-0.152
Token take
Token ID1011
Feature activation-1.21e-02

Pos logit contributions
suits+0.007
Their+0.007
pered+0.007
amed+0.007
uits+0.006
Neg logit contributions
ocol-0.013
 accidents-0.013
 reality-0.013
 beast-0.012
 ig-0.012
Token a
Token ID257
Feature activation-1.84e-02

Pos logit contributions
Their+0.017
ics+0.016
pered+0.016
uits+0.016
Her+0.016
Neg logit contributions
ocol-0.024
 accidents-0.023
 beast-0.022
 reality-0.022
oc-0.021
Token nap
Token ID25422
Feature activation-1.58e-02

Pos logit contributions
pered+0.027
Their+0.026
ics+0.026
Her+0.025
amed+0.025
Neg logit contributions
 accidents-0.035
 reality-0.034
ocol-0.034
 beast-0.034
 cavity-0.033
Token.
Token ID13
Feature activation-5.47e-03

Pos logit contributions
Their+0.020
pered+0.018
Her+0.018
ics+0.018
uits+0.018
Neg logit contributions
ocol-0.033
 accidents-0.033
 reality-0.032
 beast-0.031
oc-0.031
Token
Token ID198
Feature activation-1.02e-02

Pos logit contributions
Their+0.015
ics+0.015
nes+0.014
uits+0.014
 argue+0.014
Neg logit contributions
 accidents-0.019
ocol-0.018
umm-0.017
oc-0.017
 cavity-0.016
Token
Token ID198
Feature activation-4.86e-03

Pos logit contributions
pered+0.015
suits+0.015
 contain+0.015
 argue+0.015
uits+0.014
Neg logit contributions
 accidents-0.017
ocol-0.017
oc-0.017
 cavity-0.016
 reality-0.016
TokenSo
Token ID2396
Feature activation-2.77e-02

Pos logit contributions
pered+0.007
suits+0.007
uits+0.006
 low+0.006
 argue+0.006
Neg logit contributions
 accidents-0.009
ocol-0.008
 reality-0.008
oc-0.008
 beast-0.008
Token the
Token ID262
Feature activation-3.10e-02

Pos logit contributions
Their+0.040
pered+0.040
uits+0.038
coat+0.037
ics+0.037
Neg logit contributions
 accidents-0.054
ocol-0.051
 reality-0.048
heart-0.048
umm-0.048
Token two
Token ID734
Feature activation-4.84e-02

Pos logit contributions
Their+0.048
pered+0.046
uits+0.045
suits+0.044
eyed+0.041
Neg logit contributions
 accidents-0.062
 reality-0.055
ocol-0.055
 beast-0.054
umm-0.054
Token girls
Token ID4813
Feature activation-8.48e-02

Pos logit contributions
Their+0.087
pered+0.083
uits+0.079
Her+0.076
ics+0.075
Neg logit contributions
 accidents-0.083
ocol-0.078
 reality-0.071
 beast-0.069
heart-0.068
Token fell
Token ID3214
Feature activation+2.04e-02

Pos logit contributions
Their+0.125
Her+0.108
pered+0.106
arl+0.101
ics+0.098
Neg logit contributions
ocol-0.151
 ig-0.149
 reality-0.148
oc-0.148
 beast-0.148
Token on
Token ID319
Feature activation+7.98e-04

Pos logit contributions
 accidents+0.040
 ig+0.034
 reality+0.031
ocol+0.030
heart+0.030
Neg logit contributions
atha-0.030
Their-0.030
 contain-0.029
Her-0.029
nes-0.028
Token the
Token ID262
Feature activation-4.53e-03

Pos logit contributions
 accidents+0.002
ocol+0.002
 reality+0.001
 beast+0.001
oc+0.001
Neg logit contributions
Their-0.001
pered-0.001
ics-0.001
amed-0.001
nes-0.001
Token ground
Token ID2323
Feature activation+7.14e-03

Pos logit contributions
Their+0.008
Her+0.008
ends+0.008
pered+0.008
uits+0.008
Neg logit contributions
 accidents-0.006
 beast-0.006
 worries-0.006
 ig-0.006
 nightmare-0.005
Token and
Token ID290
Feature activation-1.09e-03

Pos logit contributions
 accidents+0.014
 reality+0.012
oc+0.012
 beast+0.012
 ig+0.012
Neg logit contributions
Their-0.010
nes-0.009
 dads-0.009
pered-0.009
Her-0.009
Token friends
Token ID2460
Feature activation-4.55e-02

Pos logit contributions
Their+0.010
Her+0.009
uits+0.009
ics+0.009
pered+0.009
Neg logit contributions
 accidents-0.009
ocol-0.009
 beast-0.009
 reality-0.008
umm-0.008
Token.
Token ID13
Feature activation-6.79e-03

Pos logit contributions
Their+0.056
pered+0.052
uits+0.050
ics+0.049
Her+0.048
Neg logit contributions
ocol-0.099
 accidents-0.098
 beast-0.093
 reality-0.092
oc-0.091
Token You
Token ID921
Feature activation-1.88e-02

Pos logit contributions
Their+0.012
pered+0.011
Her+0.011
uits+0.010
ics+0.010
Neg logit contributions
ocol-0.012
 accidents-0.012
 beast-0.011
 reality-0.011
 cavity-0.010
Token are
Token ID389
Feature activation-2.08e-02

Pos logit contributions
Their+0.025
pered+0.023
uits+0.022
ics+0.022
Her+0.022
Neg logit contributions
ocol-0.040
 accidents-0.039
 beast-0.037
 reality-0.037
oc-0.036
Token good
Token ID922
Feature activation-7.49e-03

Pos logit contributions
Their+0.028
uits+0.027
pered+0.026
suits+0.026
ics+0.025
Neg logit contributions
ocol-0.043
 accidents-0.043
 beast-0.040
 reality-0.040
 cavity-0.038
Token runners
Token ID19323
Feature activation-8.46e-02

Pos logit contributions
Their+0.014
uits+0.013
Her+0.013
ics+0.012
pered+0.012
Neg logit contributions
 accidents-0.013
ocol-0.012
 beast-0.011
 reality-0.011
oc-0.011
Token.
Token ID13
Feature activation-8.47e-03

Pos logit contributions
Their+0.120
pered+0.113
Her+0.109
uits+0.109
ics+0.107
Neg logit contributions
ocol-0.163
 accidents-0.162
oc-0.154
 reality-0.153
 beast-0.152
Token Do
Token ID2141
Feature activation+6.19e-03

Pos logit contributions
Their+0.015
pered+0.014
uits+0.014
ics+0.014
Her+0.013
Neg logit contributions
ocol-0.014
 accidents-0.014
 beast-0.013
 reality-0.013
oc-0.013
Token you
Token ID345
Feature activation-1.14e-02

Pos logit contributions
ocol+0.010
 accidents+0.010
 beast+0.009
 reality+0.009
oc+0.009
Neg logit contributions
Their-0.011
pered-0.010
ics-0.010
Her-0.010
uits-0.010
Token want
Token ID765
Feature activation-1.65e-02

Pos logit contributions
Their+0.014
pered+0.013
ics+0.013
uits+0.012
Her+0.012
Neg logit contributions
ocol-0.025
 accidents-0.024
 reality-0.023
 beast-0.023
oc-0.023
Token a
Token ID257
Feature activation-2.67e-02

Pos logit contributions
Their+0.012
pered+0.011
uits+0.010
ics+0.010
Her+0.010
Neg logit contributions
ocol-0.044
 accidents-0.043
 beast-0.042
 reality-0.041
oc-0.041
Token ran
Token ID4966
Feature activation-6.65e-02

Pos logit contributions
pered+0.056
Their+0.056
uits+0.054
ics+0.054
oned+0.053
Neg logit contributions
 accidents-0.077
 beast-0.069
 reality-0.068
 ig-0.068
ocol-0.066
Token to
Token ID284
Feature activation-1.53e-02

Pos logit contributions
ics+0.110
uits+0.097
 contain+0.095
Her+0.093
 swing+0.090
Neg logit contributions
 accidents-0.106
 trouble-0.093
 cavity-0.088
oc-0.087
ocol-0.087
Token join
Token ID4654
Feature activation-6.11e-02

Pos logit contributions
suits+0.025
amed+0.025
coat+0.024
nes+0.023
pered+0.023
Neg logit contributions
 accidents-0.027
ocol-0.025
 reality-0.025
 beast-0.023
 dentist-0.022
Token the
Token ID262
Feature activation-1.75e-02

Pos logit contributions
uits+0.069
Their+0.068
ics+0.066
coat+0.066
suits+0.065
Neg logit contributions
ocol-0.140
 accidents-0.133
 reality-0.119
 beast-0.115
heart-0.114
Token big
Token ID1263
Feature activation-3.33e-02

Pos logit contributions
pered+0.030
Her+0.029
Their+0.029
uits+0.028
nes+0.027
Neg logit contributions
 accidents-0.031
 beast-0.028
 reality-0.027
ocol-0.026
 cel-0.025
Token kids
Token ID3988
Feature activation-8.42e-02

Pos logit contributions
Their+0.051
pered+0.048
Her+0.048
uits+0.046
ics+0.045
Neg logit contributions
 accidents-0.065
ocol-0.063
 beast-0.061
oc-0.058
 reality-0.058
Token.
Token ID13
Feature activation-1.73e-02

Pos logit contributions
Their+0.148
amed+0.139
Her+0.136
pered+0.136
uits+0.135
Neg logit contributions
 accidents-0.129
oc-0.128
ocol-0.128
 beast-0.115
 reality-0.115
Token They
Token ID1119
Feature activation-4.64e-02

Pos logit contributions
ics+0.021
 contain+0.019
amed+0.019
nes+0.019
 aside+0.018
Neg logit contributions
 accidents-0.036
ocol-0.034
oc-0.033
 reality-0.033
umm-0.033
Token put
Token ID1234
Feature activation-4.98e-03

Pos logit contributions
Their+0.071
pered+0.069
Her+0.064
ics+0.063
oned+0.062
Neg logit contributions
 reality-0.079
oc-0.077
 accidents-0.076
 beast-0.075
 ig-0.075
Token on
Token ID319
Feature activation+2.46e-03

Pos logit contributions
Their+0.005
Her+0.005
nes+0.005
uits+0.005
pered+0.005
Neg logit contributions
ocol-0.012
 accidents-0.012
 beast-0.011
icer-0.010
 reality-0.010
Token helmets
Token ID28359
Feature activation-1.91e-02

Pos logit contributions
 accidents+0.006
ocol+0.006
 reality+0.006
 beast+0.006
 cavity+0.005
Neg logit contributions
pered-0.002
Their-0.002
 contain-0.002
nes-0.002
ics-0.002
Token be
Token ID307
Feature activation-9.33e-03

Pos logit contributions
 reality+0.015
ocol+0.015
 ig+0.014
 accidents+0.013
 dreams+0.013
Neg logit contributions
pered-0.011
amed-0.010
suits-0.010
Their-0.010
ics-0.009
Token cow
Token ID9875
Feature activation+4.16e-04

Pos logit contributions
Their+0.013
uits+0.012
ics+0.012
pered+0.012
 contain+0.012
Neg logit contributions
ocol-0.018
 accidents-0.018
 reality-0.018
 beast-0.017
 cavity-0.016
Tokenboys
Token ID13202
Feature activation-4.32e-02

Pos logit contributions
 accidents+0.001
ocol+0.001
oc+0.001
 beast+0.001
 reality+0.001
Neg logit contributions
Their-0.000
pered-0.000
Her-0.000
uits-0.000
amed-0.000
Token and
Token ID290
Feature activation-1.09e-02

Pos logit contributions
Their+0.066
pered+0.061
amed+0.059
 contain+0.057
uits+0.057
Neg logit contributions
oc-0.082
 accidents-0.076
 beast-0.074
 reality-0.072
ocol-0.070
Token cow
Token ID9875
Feature activation-1.09e-02

Pos logit contributions
Their+0.016
pered+0.015
uits+0.014
ics+0.014
Her+0.014
Neg logit contributions
ocol-0.022
 accidents-0.021
 beast-0.020
 reality-0.020
oc-0.020
Tokengirls
Token ID36960
Feature activation-8.38e-02

Pos logit contributions
Their+0.014
pered+0.013
Her+0.012
uits+0.012
ics+0.012
Neg logit contributions
ocol-0.023
 accidents-0.023
oc-0.022
 beast-0.022
 reality-0.022
Token.
Token ID13
Feature activation-1.06e-03

Pos logit contributions
Their+0.135
Her+0.121
pered+0.117
 contain+0.116
amed+0.116
Neg logit contributions
ocol-0.140
oc-0.140
 beast-0.134
 accidents-0.132
 reality-0.131
Token They
Token ID1119
Feature activation-4.17e-02

Pos logit contributions
Their+0.001
ics+0.001
pered+0.001
uits+0.001
amed+0.001
Neg logit contributions
ocol-0.002
 accidents-0.002
oc-0.002
 beast-0.002
icer-0.002
Token put
Token ID1234
Feature activation-2.95e-03

Pos logit contributions
Their+0.052
pered+0.046
Her+0.046
ics+0.046
amed+0.044
Neg logit contributions
ocol-0.092
oc-0.086
 beast-0.085
 reality-0.084
 accidents-0.083
Token on
Token ID319
Feature activation+4.91e-03

Pos logit contributions
Their+0.003
pered+0.003
 contain+0.003
Her+0.003
ics+0.003
Neg logit contributions
ocol-0.007
 accidents-0.007
 beast-0.006
oc-0.006
icer-0.006
Token hats
Token ID23910
Feature activation-2.73e-02

Pos logit contributions
ocol+0.011
 accidents+0.011
oc+0.010
 beast+0.010
 reality+0.010
Neg logit contributions
pered-0.005
Their-0.005
 contain-0.005
nes-0.005
ics-0.005
Token flew
Token ID13112
Feature activation-3.08e-02

Pos logit contributions
ocol+0.051
 accidents+0.050
 beast+0.048
 reality+0.047
oc+0.047
Neg logit contributions
Their-0.029
pered-0.027
uits-0.026
ics-0.026
Her-0.025
Token down
Token ID866
Feature activation+4.50e-03

Pos logit contributions
nes+0.036
Their+0.036
ics+0.036
 contain+0.035
Her+0.034
Neg logit contributions
 accidents-0.062
 beast-0.055
heart-0.054
 reality-0.052
ocol-0.051
Token and
Token ID290
Feature activation+1.76e-02

Pos logit contributions
 accidents+0.009
 beast+0.008
ocol+0.008
 reality+0.008
oc+0.008
Neg logit contributions
Their-0.006
nes-0.006
ics-0.006
 contain-0.006
 dads-0.006
Token joined
Token ID5399
Feature activation-4.71e-02

Pos logit contributions
 accidents+0.036
 beast+0.032
ocol+0.032
 reality+0.032
oc+0.031
Neg logit contributions
Their-0.025
uits-0.023
pered-0.023
ics-0.022
Her-0.022
Token the
Token ID262
Feature activation-4.08e-03

Pos logit contributions
uits+0.066
Their+0.064
ics+0.063
coat+0.063
nes+0.063
Neg logit contributions
ocol-0.091
 accidents-0.091
heart-0.079
 beast-0.078
oc-0.075
Token kids
Token ID3988
Feature activation-8.33e-02

Pos logit contributions
Their+0.007
pered+0.007
uits+0.007
Her+0.007
ics+0.006
Neg logit contributions
 accidents-0.007
ocol-0.007
 beast-0.006
 reality-0.006
oc-0.006
Token.
Token ID13
Feature activation-1.11e-02

Pos logit contributions
Their+0.125
pered+0.110
amed+0.110
uits+0.109
Her+0.107
Neg logit contributions
 accidents-0.147
ocol-0.146
oc-0.144
 reality-0.141
 beast-0.136
Token Tim
Token ID5045
Feature activation-3.14e-03

Pos logit contributions
Their+0.016
ics+0.015
 contain+0.015
nes+0.015
uits+0.015
Neg logit contributions
ocol-0.021
 accidents-0.021
oc-0.020
 beast-0.019
umm-0.019
Token,
Token ID11
Feature activation-2.50e-02

Pos logit contributions
Their+0.005
pered+0.004
uits+0.004
Her+0.004
 contain+0.004
Neg logit contributions
 accidents-0.006
oc-0.006
 beast-0.006
 reality-0.006
ocol-0.005
Token Sue
Token ID26089
Feature activation-4.39e-03

Pos logit contributions
Their+0.034
pered+0.031
uits+0.031
ics+0.030
Her+0.030
Neg logit contributions
 accidents-0.051
ocol-0.051
 beast-0.048
 reality-0.048
oc-0.047
Token,
Token ID11
Feature activation-3.07e-02

Pos logit contributions
Their+0.007
pered+0.006
uits+0.006
Her+0.006
 contain+0.006
Neg logit contributions
 accidents-0.008
 beast-0.008
oc-0.008
ocol-0.007
 reality-0.007
Token you
Token ID345
Feature activation-2.74e-02

Pos logit contributions
Their+0.000
Her+0.000
ics+0.000
pered+0.000
 run+0.000
Neg logit contributions
ocol-0.000
 accidents-0.000
 beast-0.000
 reality-0.000
oc-0.000
Token."
Token ID526
Feature activation-1.57e-04

Pos logit contributions
Their+0.032
pered+0.029
uits+0.029
Her+0.027
ics+0.027
Neg logit contributions
ocol-0.061
 accidents-0.060
oc-0.058
 beast-0.057
 reality-0.057
Token
Token ID198
Feature activation-1.45e-02

Pos logit contributions
Their+0.000
pered+0.000
uits+0.000
ics+0.000
Her+0.000
Neg logit contributions
ocol-0.000
 accidents-0.000
 beast-0.000
 reality-0.000
 cavity-0.000
Token
Token ID198
Feature activation-8.07e-03

Pos logit contributions
pered+0.019
Their+0.018
uits+0.017
ics+0.017
 contain+0.017
Neg logit contributions
 accidents-0.029
ocol-0.029
oc-0.029
 beast-0.027
 cavity-0.027
TokenSo
Token ID2396
Feature activation-2.61e-02

Pos logit contributions
Their+0.011
pered+0.010
uits+0.010
ics+0.010
Her+0.010
Neg logit contributions
ocol-0.016
 accidents-0.016
oc-0.015
 beast-0.015
 reality-0.015
Token together
Token ID1978
Feature activation-8.33e-02

Pos logit contributions
Their+0.034
pered+0.034
uits+0.032
ics+0.032
nes+0.031
Neg logit contributions
 accidents-0.055
ocol-0.052
oc-0.050
 beast-0.049
 reality-0.048
Token,
Token ID11
Feature activation-3.37e-02

Pos logit contributions
pered+0.110
Their+0.105
uits+0.103
nes+0.103
amed+0.102
Neg logit contributions
 accidents-0.176
ocol-0.162
oc-0.154
 beast-0.150
 ig-0.149
Token Daddy
Token ID26926
Feature activation-3.46e-03

Pos logit contributions
Their+0.046
ics+0.041
pered+0.041
uits+0.040
 run+0.040
Neg logit contributions
ocol-0.072
 accidents-0.067
 beast-0.066
 reality-0.065
 cavity-0.065
Token and
Token ID290
Feature activation-4.44e-02

Pos logit contributions
Their+0.005
pered+0.004
Her+0.004
uits+0.004
 contain+0.004
Neg logit contributions
 accidents-0.007
oc-0.007
ocol-0.007
 beast-0.007
 cavity-0.007
Token Tim
Token ID5045
Feature activation-1.96e-02

Pos logit contributions
Their+0.061
 run+0.057
 dads+0.056
pered+0.054
 their+0.054
Neg logit contributions
ocol-0.090
 beast-0.086
 accidents-0.085
 cavity-0.083
icer-0.081
Token went
Token ID1816
Feature activation-3.98e-02

Pos logit contributions
Their+0.026
pered+0.023
uits+0.023
Her+0.022
ics+0.022
Neg logit contributions
oc-0.040
ocol-0.040
 accidents-0.039
 beast-0.038
 cavity-0.037
Token been
Token ID587
Feature activation+1.07e-02

Pos logit contributions
Their+0.017
pered+0.016
uits+0.016
ics+0.015
suits+0.015
Neg logit contributions
 accidents-0.023
ocol-0.022
 reality-0.021
 beast-0.021
 cavity-0.020
Token able
Token ID1498
Feature activation-8.08e-04

Pos logit contributions
 accidents+0.021
 beast+0.020
 reality+0.019
ocol+0.019
 cavity+0.018
Neg logit contributions
Their-0.015
uits-0.015
suits-0.014
Her-0.014
ics-0.014
Token to
Token ID284
Feature activation+5.37e-03

Pos logit contributions
Their+0.001
pered+0.001
uits+0.001
ics+0.001
Her+0.001
Neg logit contributions
 accidents-0.002
ocol-0.002
 reality-0.002
 beast-0.002
 cavity-0.002
Token watch
Token ID2342
Feature activation-1.35e-02

Pos logit contributions
 accidents+0.011
 reality+0.011
ocol+0.011
 beast+0.011
oc+0.010
Neg logit contributions
Their-0.007
suits-0.006
pered-0.006
ics-0.006
amed-0.006
Token the
Token ID262
Feature activation-4.50e-03

Pos logit contributions
Their+0.018
pered+0.017
uits+0.016
ics+0.016
Her+0.016
Neg logit contributions
ocol-0.028
 accidents-0.028
 beast-0.026
 reality-0.026
oc-0.025
Token players
Token ID1938
Feature activation-8.32e-02

Pos logit contributions
Their+0.006
pered+0.006
Her+0.006
uits+0.006
ics+0.005
Neg logit contributions
 accidents-0.009
 beast-0.009
ocol-0.009
 reality-0.008
umm-0.008
Token bounce
Token ID20110
Feature activation-3.25e-02

Pos logit contributions
Their+0.110
pered+0.098
suits+0.095
amed+0.093
uits+0.093
Neg logit contributions
 accidents-0.173
ocol-0.170
 reality-0.168
 beast-0.165
oc-0.157
Token the
Token ID262
Feature activation-6.70e-03

Pos logit contributions
Their+0.041
pered+0.037
Her+0.037
uits+0.035
ics+0.035
Neg logit contributions
 accidents-0.069
ocol-0.066
 beast-0.066
 reality-0.065
oc-0.063
Token purple
Token ID14032
Feature activation-6.30e-03

Pos logit contributions
Their+0.010
Her+0.009
pered+0.009
uits+0.008
ics+0.008
Neg logit contributions
 accidents-0.014
 beast-0.013
ocol-0.013
umm-0.012
 reality-0.012
Token around
Token ID1088
Feature activation-2.69e-02

Pos logit contributions
Their+0.007
pered+0.007
Her+0.006
uits+0.006
ics+0.006
Neg logit contributions
ocol-0.014
 accidents-0.014
 beast-0.013
oc-0.013
 reality-0.013
Token the
Token ID262
Feature activation-1.31e-02

Pos logit contributions
Their+0.034
pered+0.031
uits+0.030
ics+0.030
Her+0.029
Neg logit contributions
ocol-0.058
 accidents-0.057
 beast-0.054
 reality-0.053
oc-0.053
Token the
Token ID262
Feature activation-6.10e-03

Pos logit contributions
Their+0.018
pered+0.017
uits+0.016
ics+0.016
Her+0.015
Neg logit contributions
ocol-0.046
 accidents-0.046
 beast-0.043
 reality-0.043
oc-0.043
Token basketball
Token ID9669
Feature activation-9.37e-03

Pos logit contributions
Their+0.008
pered+0.008
ics+0.007
uits+0.007
Her+0.007
Neg logit contributions
 accidents-0.013
ocol-0.012
 beast-0.012
 reality-0.012
oc-0.011
Token as
Token ID355
Feature activation-2.00e-02

Pos logit contributions
Their+0.013
pered+0.012
uits+0.011
Her+0.011
ics+0.011
Neg logit contributions
 accidents-0.019
ocol-0.018
 beast-0.018
 reality-0.018
oc-0.017
Token the
Token ID262
Feature activation-1.78e-02

Pos logit contributions
Their+0.023
pered+0.020
uits+0.020
ics+0.020
Her+0.020
Neg logit contributions
ocol-0.046
 accidents-0.044
 beast-0.043
oc-0.043
 cavity-0.042
Token top
Token ID1353
Feature activation+3.02e-02

Pos logit contributions
Their+0.025
pered+0.024
uits+0.023
ics+0.023
Her+0.022
Neg logit contributions
ocol-0.036
 accidents-0.035
 beast-0.033
 reality-0.033
oc-0.032
Token players
Token ID1938
Feature activation-8.32e-02

Pos logit contributions
ocol+0.061
 accidents+0.060
 beast+0.057
 reality+0.056
oc+0.055
Neg logit contributions
Their-0.042
pered-0.039
uits-0.038
ics-0.037
Her-0.037
Token.
Token ID13
Feature activation-1.23e-02

Pos logit contributions
Their+0.118
pered+0.104
Her+0.099
uits+0.099
suits+0.097
Neg logit contributions
ocol-0.164
 accidents-0.160
 reality-0.158
 beast-0.157
oc-0.149
Token
Token ID198
Feature activation-5.63e-03

Pos logit contributions
Their+0.015
pered+0.014
uits+0.013
ics+0.013
Her+0.013
Neg logit contributions
ocol-0.027
 accidents-0.027
 beast-0.025
 reality-0.025
oc-0.025
Token
Token ID198
Feature activation-1.60e-03

Pos logit contributions
pered+0.008
suits+0.008
Their+0.007
 contain+0.007
uits+0.007
Neg logit contributions
ocol-0.010
 accidents-0.010
oc-0.010
 reality-0.010
 beast-0.010
TokenThe
Token ID464
Feature activation-1.32e-02

Pos logit contributions
pered+0.002
suits+0.002
uits+0.002
ics+0.002
 dads+0.002
Neg logit contributions
 accidents-0.003
ocol-0.003
 reality-0.003
oc-0.003
 beast-0.003
Token little
Token ID1310
Feature activation-2.21e-02

Pos logit contributions
Their+0.021
pered+0.020
ics+0.019
uits+0.019
Her+0.019
Neg logit contributions
 accidents-0.024
ocol-0.024
 beast-0.022
 reality-0.022
oc-0.021
Token what
Token ID644
Feature activation+2.93e-02

Pos logit contributions
ocol+0.019
 accidents+0.018
umm+0.018
 beast+0.018
 cavity+0.018
Neg logit contributions
Their-0.017
pered-0.015
uits-0.015
 off-0.015
 aside-0.015
Token it
Token ID340
Feature activation+4.77e-02

Pos logit contributions
ocol+0.063
oc+0.056
 accidents+0.055
icer+0.055
umm+0.054
Neg logit contributions
Their-0.041
ics-0.038
 run-0.037
 off-0.036
 swing-0.036
Token says
Token ID1139
Feature activation+8.65e-03

Pos logit contributions
ocol+0.086
 accidents+0.079
 reality+0.075
 beast+0.075
oc+0.074
Neg logit contributions
Their-0.077
ics-0.077
 run-0.074
pered-0.073
uits-0.071
Token!"
Token ID2474
Feature activation-1.09e-02

Pos logit contributions
ocol+0.017
 accidents+0.017
 reality+0.016
 beast+0.016
oc+0.015
Neg logit contributions
Their-0.012
pered-0.011
uits-0.011
Her-0.011
ics-0.011
Token So
Token ID1406
Feature activation-2.46e-02

Pos logit contributions
Their+0.015
pered+0.015
uits+0.015
ics+0.015
amed+0.014
Neg logit contributions
 accidents-0.021
ocol-0.020
oc-0.020
umm-0.019
 cavity-0.018
Token together
Token ID1978
Feature activation-8.32e-02

Pos logit contributions
pered+0.038
uits+0.037
ics+0.037
Their+0.036
amed+0.036
Neg logit contributions
 accidents-0.048
ocol-0.041
 reality-0.040
heart-0.039
oc-0.039
Token the
Token ID262
Feature activation-2.64e-02

Pos logit contributions
pered+0.120
Their+0.112
amed+0.112
ics+0.111
uits+0.111
Neg logit contributions
 accidents-0.171
ocol-0.152
 reality-0.141
oc-0.137
 ig-0.136
Token mom
Token ID1995
Feature activation+7.20e-04

Pos logit contributions
Their+0.044
pered+0.042
uits+0.041
ics+0.040
Her+0.038
Neg logit contributions
 accidents-0.047
ocol-0.046
 beast-0.043
 reality-0.043
oc-0.041
Token and
Token ID290
Feature activation-3.64e-02

Pos logit contributions
 accidents+0.002
ocol+0.001
 beast+0.001
oc+0.001
 reality+0.001
Neg logit contributions
Their-0.001
pered-0.001
uits-0.001
ics-0.001
Her-0.001
Token the
Token ID262
Feature activation-2.26e-02

Pos logit contributions
uits+0.051
Their+0.050
pered+0.049
ics+0.048
suits+0.047
Neg logit contributions
 accidents-0.074
ocol-0.068
 reality-0.065
 beast-0.063
 ig-0.062
Token girl
Token ID2576
Feature activation-4.63e-02

Pos logit contributions
Their+0.040
pered+0.038
uits+0.038
ics+0.037
Her+0.035
Neg logit contributions
 accidents-0.038
ocol-0.036
 beast-0.034
 reality-0.034
oc-0.033
Token and
Token ID290
Feature activation+4.67e-03

Pos logit contributions
Their+0.005
pered+0.005
Her+0.005
nes+0.005
atha+0.005
Neg logit contributions
 accidents-0.007
oc-0.007
ocol-0.007
 ig-0.006
 cel-0.006
Token carrots
Token ID34397
Feature activation-1.43e-02

Pos logit contributions
ocol+0.011
 accidents+0.010
 beast+0.010
 reality+0.010
icer+0.010
Neg logit contributions
Their-0.005
 run-0.005
uits-0.005
pered-0.005
ics-0.004
Token and
Token ID290
Feature activation-1.11e-02

Pos logit contributions
Their+0.020
pered+0.018
Her+0.018
nes+0.016
atha+0.016
Neg logit contributions
 accidents-0.028
oc-0.027
ocol-0.026
 reality-0.026
 ig-0.025
Token played
Token ID2826
Feature activation-7.18e-02

Pos logit contributions
Their+0.014
 run+0.014
uits+0.013
 dads+0.013
pered+0.013
Neg logit contributions
ocol-0.024
 accidents-0.022
 beast-0.022
 reality-0.021
icer-0.021
Token fun
Token ID1257
Feature activation-6.64e-02

Pos logit contributions
Their+0.071
Her+0.063
uits+0.062
ics+0.061
pered+0.060
Neg logit contributions
oc-0.164
ocol-0.164
 accidents-0.162
 beast-0.155
 reality-0.153
Token games
Token ID1830
Feature activation-8.31e-02

Pos logit contributions
Their+0.070
pered+0.064
uits+0.062
Her+0.059
ics+0.057
Neg logit contributions
ocol-0.157
 accidents-0.155
oc-0.149
 beast-0.148
 reality-0.146
Token.
Token ID13
Feature activation-1.24e-02

Pos logit contributions
Their+0.140
pered+0.132
Her+0.129
amed+0.123
 dads+0.120
Neg logit contributions
oc-0.137
 accidents-0.131
 reality-0.131
ocol-0.124
 beast-0.124
Token Little
Token ID7703
Feature activation+9.65e-03

Pos logit contributions
Their+0.017
 They+0.015
 over+0.015
pered+0.015
 run+0.014
Neg logit contributions
ocol-0.028
 accidents-0.025
 pounds-0.024
 reality-0.024
 beast-0.024
Token Billy
Token ID15890
Feature activation+6.98e-04

Pos logit contributions
ocol+0.020
 reality+0.019
 cavity+0.017
oc+0.017
 pounds+0.016
Neg logit contributions
 run-0.013
 dads-0.012
 swing-0.012
 first-0.012
Their-0.012
Token talked
Token ID6619
Feature activation-2.96e-02

Pos logit contributions
ocol+0.001
 accidents+0.001
 beast+0.001
 reality+0.001
oc+0.001
Neg logit contributions
Their-0.001
pered-0.001
uits-0.001
ics-0.001
Her-0.001
Token to
Token ID284
Feature activation-4.29e-03

Pos logit contributions
Their+0.042
pered+0.041
ics+0.039
Her+0.039
nes+0.039
Neg logit contributions
 accidents-0.057
 beast-0.052
 reality-0.052
oc-0.051
ocol-0.050
Token to
Token ID284
Feature activation-2.65e-02

Pos logit contributions
pered+0.038
amed+0.036
ics+0.035
suits+0.034
uits+0.034
Neg logit contributions
 accidents-0.065
 reality-0.058
ocol-0.053
 cavity-0.053
 dentist-0.053
Token share
Token ID2648
Feature activation-3.67e-02

Pos logit contributions
Their+0.028
pered+0.027
uits+0.025
suits+0.024
ics+0.023
Neg logit contributions
ocol-0.061
 accidents-0.060
 reality-0.058
oc-0.058
 beast-0.058
Token their
Token ID511
Feature activation-3.78e-02

Pos logit contributions
Their+0.032
pered+0.029
uits+0.027
ics+0.027
Her+0.026
Neg logit contributions
ocol-0.093
 accidents-0.092
 beast-0.088
 reality-0.087
oc-0.086
Token cars
Token ID5006
Feature activation-6.90e-02

Pos logit contributions
Their+0.049
pered+0.044
uits+0.044
ics+0.043
 run+0.042
Neg logit contributions
ocol-0.080
 accidents-0.078
 beast-0.076
 reality-0.075
oc-0.074
Token and
Token ID290
Feature activation-1.63e-03

Pos logit contributions
Their+0.083
pered+0.083
uits+0.077
Her+0.075
ics+0.072
Neg logit contributions
 accidents-0.146
oc-0.142
ocol-0.141
 reality-0.132
 beast-0.132
Token play
Token ID711
Feature activation-8.30e-02

Pos logit contributions
Their+0.002
pered+0.002
uits+0.002
ics+0.002
Her+0.001
Neg logit contributions
ocol-0.004
 accidents-0.004
 beast-0.004
 reality-0.004
oc-0.003
Token together
Token ID1978
Feature activation-6.09e-02

Pos logit contributions
uits+0.081
ics+0.080
pered+0.079
Their+0.079
 contain+0.077
Neg logit contributions
 accidents-0.189
ocol-0.184
oc-0.182
 reality-0.180
 beast-0.174
Token.
Token ID13
Feature activation-1.65e-02

Pos logit contributions
Their+0.098
uits+0.094
pered+0.093
amed+0.089
Her+0.089
Neg logit contributions
 accidents-0.102
ocol-0.101
 reality-0.100
oc-0.096
 beast-0.096
Token They
Token ID1119
Feature activation-4.23e-02

Pos logit contributions
Their+0.016
pered+0.015
ics+0.015
uits+0.015
 contain+0.014
Neg logit contributions
 accidents-0.039
ocol-0.039
oc-0.037
 beast-0.037
 reality-0.037
Token make
Token ID787
Feature activation-2.00e-02

Pos logit contributions
Their+0.057
pered+0.054
ics+0.050
Her+0.050
uits+0.049
Neg logit contributions
ocol-0.084
 accidents-0.082
oc-0.081
 reality-0.081
 beast-0.080
Token zoom
Token ID19792
Feature activation-4.39e-02

Pos logit contributions
Their+0.025
pered+0.023
uits+0.022
ics+0.022
Her+0.022
Neg logit contributions
ocol-0.043
 accidents-0.042
 beast-0.040
 reality-0.040
oc-0.040
Token said
Token ID531
Feature activation-1.61e-03

Pos logit contributions
Their+0.000
pered+0.000
uits+0.000
ics+0.000
Her+0.000
Neg logit contributions
ocol-0.000
 accidents-0.000
 beast-0.000
 reality-0.000
oc-0.000
Token goodbye
Token ID24829
Feature activation-1.06e-02

Pos logit contributions
Their+0.003
pered+0.003
uits+0.003
ics+0.003
suits+0.003
Neg logit contributions
 accidents-0.003
ocol-0.003
 beast-0.002
 reality-0.002
 cavity-0.002
Token and
Token ID290
Feature activation-2.45e-03

Pos logit contributions
Their+0.015
ics+0.014
uits+0.013
pered+0.013
 contain+0.013
Neg logit contributions
 accidents-0.021
oc-0.020
 beast-0.020
 cavity-0.019
 reality-0.019
Token introduced
Token ID5495
Feature activation-2.97e-02

Pos logit contributions
Their+0.003
pered+0.003
uits+0.003
ics+0.003
Her+0.003
Neg logit contributions
 accidents-0.005
ocol-0.005
 beast-0.005
 reality-0.005
oc-0.005
Token the
Token ID262
Feature activation-2.02e-02

Pos logit contributions
Their+0.038
uits+0.038
pered+0.037
ics+0.037
suits+0.036
Neg logit contributions
 accidents-0.061
ocol-0.059
 reality-0.055
 beast-0.054
oc-0.054
Token students
Token ID2444
Feature activation-8.29e-02

Pos logit contributions
Their+0.028
pered+0.026
ics+0.026
uits+0.025
Her+0.025
Neg logit contributions
ocol-0.041
 accidents-0.041
 beast-0.039
 reality-0.038
oc-0.037
Token to
Token ID284
Feature activation-1.42e-02

Pos logit contributions
Their+0.097
pered+0.093
ics+0.087
Her+0.083
suits+0.083
Neg logit contributions
 accidents-0.173
 reality-0.168
ocol-0.163
 beast-0.161
oc-0.160
Token their
Token ID511
Feature activation-2.76e-02

Pos logit contributions
Their+0.017
pered+0.016
ics+0.015
uits+0.015
suits+0.015
Neg logit contributions
ocol-0.031
 accidents-0.031
 reality-0.030
 beast-0.029
oc-0.028
Token new
Token ID649
Feature activation-2.80e-02

Pos logit contributions
Their+0.038
pered+0.036
uits+0.034
ics+0.034
Her+0.034
Neg logit contributions
ocol-0.056
 accidents-0.055
 beast-0.052
 reality-0.051
oc-0.051
Token classroom
Token ID15806
Feature activation-3.85e-02

Pos logit contributions
Their+0.039
pered+0.036
uits+0.035
ics+0.034
Her+0.034
Neg logit contributions
ocol-0.057
 accidents-0.056
 beast-0.052
 reality-0.052
oc-0.051
Token.
Token ID13
Feature activation-8.46e-03

Pos logit contributions
Their+0.055
pered+0.051
ics+0.048
uits+0.047
Her+0.047
Neg logit contributions
 accidents-0.075
ocol-0.072
 reality-0.071
oc-0.070
 beast-0.070
Token their
Token ID511
Feature activation-1.74e-02

Pos logit contributions
 accidents+0.020
ocol+0.020
 beast+0.018
 reality+0.018
 cavity+0.017
Neg logit contributions
Their-0.018
uits-0.017
 contain-0.017
pered-0.017
Her-0.017
Token wool
Token ID25749
Feature activation+2.04e-03

Pos logit contributions
Their+0.021
 run+0.019
pered+0.019
 swing+0.019
uits+0.018
Neg logit contributions
ocol-0.039
 accidents-0.037
oc-0.036
 beast-0.036
 reality-0.035
Tokenen
Token ID268
Feature activation-1.92e-02

Pos logit contributions
 accidents+0.004
 ig+0.004
ocol+0.004
 reality+0.004
oc+0.003
Neg logit contributions
Their-0.003
pered-0.003
Her-0.003
atha-0.003
 contain-0.003
Token creations
Token ID28443
Feature activation-3.03e-02

Pos logit contributions
Their+0.022
pered+0.020
Her+0.019
uits+0.019
ics+0.019
Neg logit contributions
ocol-0.043
 accidents-0.043
 reality-0.040
 beast-0.040
oc-0.040
Token and
Token ID290
Feature activation-2.48e-03

Pos logit contributions
pered+0.044
Their+0.042
Her+0.041
 argue+0.040
 dads+0.038
Neg logit contributions
 accidents-0.053
 reality-0.049
oc-0.048
ocol-0.048
 ig-0.046
Token play
Token ID711
Feature activation-8.28e-02

Pos logit contributions
Their+0.003
pered+0.002
uits+0.002
ics+0.002
Her+0.002
Neg logit contributions
ocol-0.006
 accidents-0.006
 beast-0.006
 reality-0.005
oc-0.005
Token with
Token ID351
Feature activation-2.57e-02

Pos logit contributions
 dads+0.089
Their+0.087
ics+0.087
uits+0.087
 contain+0.087
Neg logit contributions
 accidents-0.178
ocol-0.174
oc-0.173
 reality-0.166
 beast-0.163
Token them
Token ID606
Feature activation-4.34e-02

Pos logit contributions
Their+0.026
 their+0.023
Her+0.020
 low+0.020
 themselves+0.020
Neg logit contributions
ocol-0.065
umm-0.065
 accidents-0.063
 beast-0.062
 cavity-0.060
Token some
Token ID617
Feature activation-1.33e-02

Pos logit contributions
Their+0.063
pered+0.059
uits+0.057
amed+0.056
Her+0.055
Neg logit contributions
 accidents-0.084
ocol-0.080
 reality-0.079
oc-0.078
 beast-0.076
Token more
Token ID517
Feature activation-2.44e-02

Pos logit contributions
Their+0.019
pered+0.018
uits+0.017
 run+0.017
 aside+0.017
Neg logit contributions
ocol-0.026
 accidents-0.026
 beast-0.025
umm-0.024
 reality-0.024
Token.
Token ID13
Feature activation-1.44e-02

Pos logit contributions
Their+0.030
pered+0.028
uits+0.027
ics+0.027
Her+0.026
Neg logit contributions
ocol-0.053
 accidents-0.052
 beast-0.050
 reality-0.049
oc-0.049
Token So
Token ID1406
Feature activation-1.21e-02

Pos logit contributions
Their+0.003
pered+0.003
uits+0.003
ics+0.003
 contain+0.003
Neg logit contributions
 accidents-0.004
ocol-0.004
oc-0.004
 beast-0.003
 cavity-0.003
Token she
Token ID673
Feature activation-4.18e-03

Pos logit contributions
Their+0.016
pered+0.015
uits+0.015
ics+0.014
suits+0.014
Neg logit contributions
 accidents-0.026
ocol-0.025
 beast-0.024
 reality-0.024
oc-0.023
Token asked
Token ID1965
Feature activation-2.10e-03

Pos logit contributions
Their+0.006
pered+0.006
uits+0.006
ics+0.006
Her+0.006
Neg logit contributions
 accidents-0.007
 beast-0.007
 reality-0.007
oc-0.007
 cavity-0.007
Token the
Token ID262
Feature activation-1.13e-02

Pos logit contributions
suits+0.003
Their+0.003
nes+0.003
coat+0.003
uits+0.003
Neg logit contributions
 accidents-0.004
ocol-0.004
 reality-0.004
 beast-0.004
heart-0.004
Token other
Token ID584
Feature activation-4.37e-02

Pos logit contributions
Their+0.017
pered+0.016
ics+0.016
uits+0.016
nes+0.015
Neg logit contributions
 accidents-0.022
 beast-0.021
ocol-0.020
 reality-0.020
umm-0.019
Token players
Token ID1938
Feature activation-8.27e-02

Pos logit contributions
Their+0.076
pered+0.070
uits+0.070
ics+0.069
Her+0.067
Neg logit contributions
 accidents-0.075
ocol-0.074
 beast-0.069
 reality-0.068
oc-0.066
Token if
Token ID611
Feature activation+1.83e-02

Pos logit contributions
Their+0.139
amed+0.131
pered+0.129
ics+0.127
suits+0.124
Neg logit contributions
 accidents-0.134
 reality-0.130
ocol-0.126
 beast-0.123
heart-0.112
Token they
Token ID484
Feature activation-3.21e-02

Pos logit contributions
ocol+0.045
 accidents+0.044
 beast+0.042
 reality+0.042
oc+0.041
Neg logit contributions
Their-0.018
pered-0.016
uits-0.015
ics-0.015
Her-0.015
Token wanted
Token ID2227
Feature activation-4.66e-03

Pos logit contributions
Their+0.057
pered+0.054
ics+0.053
uits+0.052
Her+0.052
Neg logit contributions
ocol-0.052
 accidents-0.051
 beast-0.048
 reality-0.047
oc-0.046
Token to
Token ID284
Feature activation+9.93e-03

Pos logit contributions
Their+0.005
pered+0.005
ics+0.004
amed+0.004
uits+0.004
Neg logit contributions
 accidents-0.011
ocol-0.011
 reality-0.010
 beast-0.010
 cavity-0.010
Token play
Token ID711
Feature activation-5.66e-02

Pos logit contributions
 accidents+0.020
 reality+0.020
ocol+0.019
 beast+0.018
 ig+0.018
Neg logit contributions
pered-0.012
Their-0.012
suits-0.012
amed-0.011
uits-0.011
Token.
Token ID13
Feature activation+1.60e-02

Pos logit contributions
Their+0.037
pered+0.037
nes+0.034
uits+0.034
ics+0.034
Neg logit contributions
 accidents-0.041
ocol-0.040
 reality-0.039
oc-0.038
 ig-0.038
Token He
Token ID679
Feature activation+2.52e-02

Pos logit contributions
ocol+0.034
 accidents+0.034
 beast+0.032
 reality+0.031
oc+0.031
Neg logit contributions
Their-0.021
pered-0.019
uits-0.019
ics-0.018
Her-0.018
Token saw
Token ID2497
Feature activation+5.43e-02

Pos logit contributions
ocol+0.054
 accidents+0.052
umm+0.049
 reality+0.048
 beast+0.048
Neg logit contributions
Their-0.034
 run-0.032
uits-0.031
pered-0.031
 low-0.031
Token a
Token ID257
Feature activation+3.55e-03

Pos logit contributions
ocol+0.146
 accidents+0.135
oc+0.134
umm+0.133
icer+0.130
Neg logit contributions
Their-0.049
Her-0.040
 run-0.040
uits-0.039
ics-0.039
Token poor
Token ID3595
Feature activation+1.98e-02

Pos logit contributions
ocol+0.009
 accidents+0.008
umm+0.008
 reality+0.008
icer+0.008
Neg logit contributions
 swing-0.003
Their-0.003
 run-0.003
 low-0.003
 fast-0.003
Token vegetable
Token ID20236
Feature activation+6.36e-02

Pos logit contributions
ocol+0.044
 accidents+0.040
icer+0.040
umm+0.039
 reality+0.038
Neg logit contributions
 swing-0.024
 run-0.023
 leash-0.022
 fast-0.022
 low-0.021
Token vendor
Token ID18371
Feature activation+3.16e-02

Pos logit contributions
ocol+0.145
 accidents+0.137
 reality+0.130
icer+0.127
 cavity+0.127
Neg logit contributions
 swing-0.068
 run-0.067
pered-0.065
Their-0.065
ics-0.065
Token near
Token ID1474
Feature activation-2.19e-03

Pos logit contributions
ocol+0.076
 accidents+0.074
 reality+0.071
 beast+0.071
umm+0.070
Neg logit contributions
Their-0.030
 run-0.028
ics-0.028
pered-0.028
 swing-0.027
Token a
Token ID257
Feature activation-1.59e-02

Pos logit contributions
Their+0.002
 run+0.002
 their+0.002
pered+0.002
Her+0.002
Neg logit contributions
ocol-0.005
 accidents-0.005
umm-0.005
 reality-0.005
 beast-0.005
Token bench
Token ID7624
Feature activation+1.02e-02

Pos logit contributions
 swing+0.017
Their+0.017
 run+0.016
 low+0.016
 leash+0.016
Neg logit contributions
ocol-0.037
 accidents-0.036
umm-0.033
 reality-0.033
oc-0.032
Token and
Token ID290
Feature activation+2.06e-02

Pos logit contributions
ocol+0.020
 accidents+0.020
 beast+0.019
oc+0.019
 reality+0.018
Neg logit contributions
Their-0.014
pered-0.013
uits-0.013
ics-0.012
Her-0.012
Token.
Token ID13
Feature activation-9.04e-03

Pos logit contributions
 accidents+0.020
oc+0.018
 cel+0.018
 cavity+0.017
 ig+0.016
Neg logit contributions
Their-0.023
 dads-0.022
nes-0.022
uits-0.022
atha-0.022
Token It
Token ID632
Feature activation+2.65e-02

Pos logit contributions
Their+0.013
pered+0.012
uits+0.012
Her+0.011
 low+0.011
Neg logit contributions
ocol-0.019
 accidents-0.018
 beast-0.017
 reality-0.017
heart-0.016
Token was
Token ID373
Feature activation-5.79e-03

Pos logit contributions
ocol+0.046
 accidents+0.043
 beast+0.041
 reality+0.040
oc+0.039
Neg logit contributions
Their-0.046
pered-0.043
ics-0.042
Her-0.042
uits-0.042
Token sour
Token ID11348
Feature activation+4.13e-02

Pos logit contributions
Their+0.008
pered+0.007
uits+0.007
ics+0.007
Her+0.007
Neg logit contributions
ocol-0.012
 accidents-0.012
 beast-0.011
 reality-0.011
oc-0.011
Token and
Token ID290
Feature activation+2.29e-02

Pos logit contributions
ocol+0.110
 accidents+0.104
 beast+0.103
 reality+0.103
oc+0.100
Neg logit contributions
Their-0.031
pered-0.029
uits-0.026
ics-0.026
 run-0.025
Token mush
Token ID15026
Feature activation+6.87e-02

Pos logit contributions
ocol+0.046
 beast+0.044
 accidents+0.043
 reality+0.043
oc+0.042
Neg logit contributions
Their-0.032
 low-0.029
 run-0.029
pered-0.028
Her-0.028
Tokeny
Token ID88
Feature activation+1.36e-02

Pos logit contributions
ocol+0.170
 accidents+0.160
 beast+0.159
 reality+0.158
oc+0.152
Neg logit contributions
Their-0.066
pered-0.060
suits-0.058
ics-0.057
uits-0.056
Token,
Token ID11
Feature activation-3.85e-04

Pos logit contributions
 accidents+0.029
ocol+0.029
 beast+0.027
oc+0.027
 reality+0.027
Neg logit contributions
Their-0.017
pered-0.016
uits-0.016
ics-0.016
Her-0.015
Token but
Token ID475
Feature activation+4.23e-02

Pos logit contributions
Their+0.000
pered+0.000
uits+0.000
Her+0.000
ics+0.000
Neg logit contributions
ocol-0.001
 accidents-0.001
 beast-0.001
 reality-0.001
oc-0.001
Token he
Token ID339
Feature activation+2.34e-02

Pos logit contributions
ocol+0.100
oc+0.093
icer+0.092
 beast+0.091
umm+0.087
Neg logit contributions
Their-0.051
 their-0.050
 fast-0.049
 not-0.047
 first-0.046
Token did
Token ID750
Feature activation-2.30e-02

Pos logit contributions
ocol+0.059
 accidents+0.053
umm+0.051
 reality+0.051
 beast+0.050
Neg logit contributions
Their-0.026
 run-0.024
Her-0.023
 ran-0.023
 swing-0.022
Token,
Token ID11
Feature activation+1.07e-03

Pos logit contributions
Their+0.002
pered+0.002
uits+0.002
ics+0.002
Her+0.002
Neg logit contributions
ocol-0.004
 accidents-0.004
 beast-0.003
oc-0.003
 reality-0.003
Token he
Token ID339
Feature activation+1.71e-02

Pos logit contributions
ocol+0.002
 accidents+0.002
 reality+0.002
 beast+0.002
oc+0.002
Neg logit contributions
Their-0.001
pered-0.001
uits-0.001
Her-0.001
ics-0.001
Token found
Token ID1043
Feature activation+5.49e-02

Pos logit contributions
ocol+0.047
 accidents+0.044
umm+0.043
 beast+0.041
 reality+0.041
Neg logit contributions
 run-0.015
Their-0.014
 ran-0.014
 swing-0.013
suits-0.013
Token a
Token ID257
Feature activation-2.20e-03

Pos logit contributions
ocol+0.156
oc+0.148
umm+0.146
 accidents+0.146
 reality+0.144
Neg logit contributions
Their-0.044
pered-0.033
amed-0.033
 off-0.032
 fast-0.031
Token big
Token ID1263
Feature activation+2.85e-03

Pos logit contributions
 swing+0.002
 run+0.002
Their+0.002
 low+0.002
 leash+0.002
Neg logit contributions
ocol-0.006
 accidents-0.005
oc-0.005
umm-0.005
icer-0.005
Token mel
Token ID7758
Feature activation+7.46e-02

Pos logit contributions
ocol+0.006
 accidents+0.006
oc+0.006
 reality+0.006
umm+0.006
Neg logit contributions
 swing-0.003
 run-0.003
 leash-0.003
 low-0.003
 skip-0.003
Tokenon
Token ID261
Feature activation+3.27e-02

Pos logit contributions
ocol+0.137
 reality+0.137
 accidents+0.136
 worries+0.122
 beast+0.121
Neg logit contributions
ics-0.109
 swing-0.107
Her-0.106
Their-0.103
 on-0.099
Token in
Token ID287
Feature activation-1.95e-03

Pos logit contributions
 accidents+0.076
ocol+0.074
 beast+0.073
oc+0.071
 cavity+0.070
Neg logit contributions
Their-0.036
pered-0.031
uits-0.031
Her-0.030
ics-0.029
Token his
Token ID465
Feature activation+2.20e-02

Pos logit contributions
Their+0.002
 their+0.002
 fast+0.002
Her+0.002
 run+0.002
Neg logit contributions
ocol-0.005
 accidents-0.005
umm-0.005
oc-0.004
heart-0.004
Token garden
Token ID11376
Feature activation+2.55e-02

Pos logit contributions
ocol+0.046
 accidents+0.044
 reality+0.041
 beast+0.040
oc+0.040
Neg logit contributions
Their-0.032
Her-0.029
pered-0.029
ics-0.028
 swing-0.028
Token.
Token ID13
Feature activation+1.64e-03

Pos logit contributions
ocol+0.061
 accidents+0.060
 beast+0.057
 reality+0.056
oc+0.055
Neg logit contributions
Their-0.027
pered-0.024
ics-0.024
uits-0.023
Her-0.023
Token that
Token ID326
Feature activation+2.36e-02

Pos logit contributions
 accidents+0.024
ocol+0.024
 beast+0.022
oc+0.022
 reality+0.022
Neg logit contributions
Their-0.017
pered-0.015
uits-0.015
suits-0.014
ics-0.014
Token had
Token ID550
Feature activation+4.47e-03

Pos logit contributions
ocol+0.054
 accidents+0.049
icer+0.047
oc+0.047
 cavity+0.047
Neg logit contributions
Their-0.029
 low-0.027
 run-0.027
ics-0.027
Her-0.026
Token landed
Token ID11406
Feature activation+2.08e-02

Pos logit contributions
 accidents+0.008
ocol+0.008
 beast+0.008
 reality+0.008
 cavity+0.007
Neg logit contributions
Their-0.007
pered-0.006
suits-0.006
uits-0.006
ics-0.006
Token on
Token ID319
Feature activation+1.03e-02

Pos logit contributions
ocol+0.052
 accidents+0.051
 beast+0.049
 reality+0.048
oc+0.047
Neg logit contributions
Their-0.020
uits-0.016
ics-0.016
Her-0.016
pered-0.015
Token his
Token ID465
Feature activation-5.09e-03

Pos logit contributions
ocol+0.024
 accidents+0.024
oc+0.023
 beast+0.022
 reality+0.022
Neg logit contributions
Their-0.011
pered-0.010
uits-0.009
 low-0.009
 off-0.009
Token tum
Token ID11814
Feature activation+8.54e-02

Pos logit contributions
Their+0.007
pered+0.006
uits+0.006
ics+0.006
Her+0.006
Neg logit contributions
 accidents-0.011
ocol-0.010
 beast-0.010
 reality-0.010
 ig-0.010
Tokenmy
Token ID1820
Feature activation+2.49e-02

Pos logit contributions
ocol+0.173
 accidents+0.167
 beast+0.165
 reality+0.164
oc+0.159
Neg logit contributions
Their-0.119
Her-0.112
ics-0.112
amed-0.108
pered-0.107
Token.
Token ID13
Feature activation-1.56e-04

Pos logit contributions
ocol+0.052
 accidents+0.052
 beast+0.049
oc+0.048
 reality+0.048
Neg logit contributions
Their-0.033
pered-0.030
uits-0.030
ics-0.029
Her-0.028
Token It
Token ID632
Feature activation+2.76e-02

Pos logit contributions
Their+0.000
pered+0.000
uits+0.000
Her+0.000
 low+0.000
Neg logit contributions
ocol-0.000
 accidents-0.000
 beast-0.000
 reality-0.000
oc-0.000
Token opened
Token ID4721
Feature activation+1.46e-02

Pos logit contributions
ocol+0.051
 accidents+0.048
 beast+0.045
 reality+0.045
oc+0.044
Neg logit contributions
Their-0.045
pered-0.042
Her-0.041
ics-0.041
uits-0.041
Token its
Token ID663
Feature activation+2.64e-02

Pos logit contributions
ocol+0.033
 accidents+0.032
 beast+0.031
 reality+0.030
oc+0.030
Neg logit contributions
Their-0.017
pered-0.016
uits-0.015
ics-0.015
Her-0.015
Token pond
Token ID16723
Feature activation-1.81e-02

Pos logit contributions
 swing+0.020
 run+0.019
Their+0.019
pered+0.018
 leash+0.018
Neg logit contributions
ocol-0.034
 accidents-0.033
 reality-0.033
 cavity-0.031
 beast-0.031
Token and
Token ID290
Feature activation+3.45e-03

Pos logit contributions
Their+0.023
pered+0.020
Her+0.020
nes+0.020
 dads+0.019
Neg logit contributions
 accidents-0.038
ocol-0.037
 beast-0.036
oc-0.036
 reality-0.035
Token everyone
Token ID2506
Feature activation-3.31e-02

Pos logit contributions
ocol+0.007
 accidents+0.007
 beast+0.007
 reality+0.007
oc+0.007
Neg logit contributions
Their-0.004
pered-0.004
uits-0.004
ics-0.004
Her-0.004
Token watched
Token ID7342
Feature activation+4.20e-03

Pos logit contributions
Their+0.043
pered+0.040
uits+0.040
ics+0.038
Her+0.038
Neg logit contributions
ocol-0.069
 accidents-0.068
 beast-0.065
oc-0.064
 reality-0.064
Token in
Token ID287
Feature activation-1.50e-02

Pos logit contributions
ocol+0.008
 accidents+0.008
 reality+0.008
 beast+0.008
oc+0.007
Neg logit contributions
Their-0.006
pered-0.006
ics-0.005
uits-0.005
suits-0.005
Token amaz
Token ID40642
Feature activation+7.45e-02

Pos logit contributions
Their+0.032
pered+0.030
uits+0.030
 run+0.029
Her+0.029
Neg logit contributions
ocol-0.020
 accidents-0.019
 beast-0.018
oc-0.017
 cavity-0.017
Tokenement
Token ID972
Feature activation+1.99e-02

Pos logit contributions
oc+0.084
 cavity+0.078
 beast+0.077
ocol+0.068
 accidents+0.068
Neg logit contributions
Their-0.175
Her-0.165
ics-0.160
 off-0.159
pered-0.158
Token as
Token ID355
Feature activation-1.94e-02

Pos logit contributions
ocol+0.042
 accidents+0.042
 reality+0.039
 beast+0.039
oc+0.038
Neg logit contributions
Their-0.025
pered-0.024
uits-0.023
ics-0.022
Her-0.022
Token her
Token ID607
Feature activation-1.54e-03

Pos logit contributions
Their+0.022
uits+0.020
ics+0.020
pered+0.020
 run+0.020
Neg logit contributions
ocol-0.045
 accidents-0.042
oc-0.042
 beast-0.041
 cavity-0.041
Token thread
Token ID4704
Feature activation+5.39e-03

Pos logit contributions
 run+0.002
 swing+0.002
pered+0.002
 leash+0.002
Their+0.002
Neg logit contributions
ocol-0.003
 accidents-0.003
oc-0.003
icer-0.003
 beast-0.003
Token gl
Token ID1278
Feature activation+1.93e-02

Pos logit contributions
 accidents+0.011
ocol+0.010
 reality+0.010
 beast+0.010
oc+0.010
Neg logit contributions
Their-0.007
pered-0.007
Her-0.007
uits-0.006
ics-0.006
Token to
Token ID284
Feature activation+2.77e-02

Pos logit contributions
Their+0.015
pered+0.014
uits+0.013
ics+0.012
amed+0.012
Neg logit contributions
 accidents-0.042
ocol-0.042
 beast-0.039
 reality-0.039
 cavity-0.039
Token eat
Token ID4483
Feature activation+2.01e-02

Pos logit contributions
ocol+0.060
 accidents+0.057
 reality+0.056
 beast+0.055
 ig+0.054
Neg logit contributions
Their-0.033
pered-0.033
amed-0.031
suits-0.030
ics-0.029
Token pr
Token ID778
Feature activation+6.98e-02

Pos logit contributions
 beast+0.043
ocol+0.043
 reality+0.042
 accidents+0.042
umm+0.041
Neg logit contributions
Their-0.026
 their-0.023
Her-0.022
ics-0.022
 run-0.022
Tokenunes
Token ID4015
Feature activation+3.23e-02

Pos logit contributions
 accidents+0.115
ocol+0.115
 beast+0.110
 reality+0.108
oc+0.106
Neg logit contributions
Their-0.120
pered-0.117
ics-0.115
uits-0.114
Her-0.111
Token.
Token ID13
Feature activation+1.45e-02

Pos logit contributions
 accidents+0.056
ocol+0.053
 cel+0.049
oc+0.049
umm+0.048
Neg logit contributions
Their-0.053
pered-0.052
uits-0.049
atha-0.047
suits-0.047
Token She
Token ID1375
Feature activation+1.94e-02

Pos logit contributions
ocol+0.035
 accidents+0.035
 beast+0.034
 reality+0.033
oc+0.033
Neg logit contributions
Their-0.015
pered-0.013
Her-0.013
uits-0.013
ics-0.012
Token liked
Token ID8288
Feature activation-2.08e-02

Pos logit contributions
ocol+0.043
 accidents+0.042
 beast+0.040
 reality+0.039
oc+0.039
Neg logit contributions
Their-0.024
pered-0.022
uits-0.021
ics-0.021
Her-0.021
Token how
Token ID703
Feature activation+2.29e-03

Pos logit contributions
Their+0.020
pered+0.019
uits+0.018
ics+0.017
nes+0.017
Neg logit contributions
ocol-0.051
 accidents-0.051
 reality-0.046
 beast-0.046
 cavity-0.046
Token they
Token ID484
Feature activation-1.70e-02

Pos logit contributions
ocol+0.005
 accidents+0.005
 beast+0.005
 reality+0.005
oc+0.005
Neg logit contributions
Their-0.003
pered-0.002
uits-0.002
ics-0.002
Her-0.002
Token were
Token ID547
Feature activation-3.62e-03

Pos logit contributions
Their+0.018
 run+0.017
ics+0.017
Her+0.016
pered+0.016
Neg logit contributions
 accidents-0.039
ocol-0.039
 beast-0.038
 reality-0.037
 cavity-0.035
Token sweet
Token ID6029
Feature activation+2.80e-02

Pos logit contributions
Their+0.005
pered+0.005
uits+0.005
 contain+0.005
nes+0.005
Neg logit contributions
ocol-0.007
 accidents-0.007
 reality-0.006
 ig-0.006
oc-0.006
Token sad
Token ID6507
Feature activation+8.59e-03

Pos logit contributions
ocol+0.040
 accidents+0.040
 beast+0.037
 reality+0.037
oc+0.036
Neg logit contributions
Their-0.023
uits-0.022
pered-0.021
ics-0.021
Her-0.021
Token and
Token ID290
Feature activation+9.96e-03

Pos logit contributions
ocol+0.026
icer+0.023
 beast+0.023
oc+0.023
umm+0.023
Neg logit contributions
Their-0.005
 off-0.004
 run-0.004
pered-0.004
uits-0.004
Token didn
Token ID1422
Feature activation-4.43e-03

Pos logit contributions
ocol+0.021
 accidents+0.019
umm+0.019
oc+0.019
 reality+0.019
Neg logit contributions
Their-0.013
pered-0.013
 run-0.012
ics-0.012
 off-0.012
Token't
Token ID470
Feature activation+1.13e-02

Pos logit contributions
Their+0.004
pered+0.004
suits+0.004
uits+0.004
ics+0.003
Neg logit contributions
 accidents-0.011
ocol-0.011
 beast-0.010
 reality-0.010
 cavity-0.010
Token know
Token ID760
Feature activation+2.66e-02

Pos logit contributions
ocol+0.023
 accidents+0.023
 beast+0.022
 reality+0.022
oc+0.022
Neg logit contributions
Their-0.015
ics-0.014
pered-0.014
uits-0.014
suits-0.013
Token what
Token ID644
Feature activation+2.13e-02

Pos logit contributions
umm+0.058
icer+0.057
ocol+0.055
oc+0.053
 cavity+0.052
Neg logit contributions
 their-0.035
Their-0.035
 off-0.034
 run-0.034
 over-0.032
Token to
Token ID284
Feature activation+3.97e-02

Pos logit contributions
ocol+0.068
umm+0.064
icer+0.064
 accidents+0.063
 reality+0.062
Neg logit contributions
Their-0.010
 run-0.006
Her-0.005
 fast-0.005
 chase-0.004
Token do
Token ID466
Feature activation-1.07e-02

Pos logit contributions
ocol+0.080
 accidents+0.078
 beast+0.074
 reality+0.073
oc+0.072
Neg logit contributions
Their-0.056
pered-0.053
uits-0.051
ics-0.050
Her-0.050
Token.
Token ID13
Feature activation-1.90e-03

Pos logit contributions
Their+0.015
pered+0.014
uits+0.013
ics+0.013
Her+0.013
Neg logit contributions
 accidents-0.022
ocol-0.021
 reality-0.020
 beast-0.020
oc-0.020
Token But
Token ID887
Feature activation+2.34e-02

Pos logit contributions
Their+0.002
pered+0.002
Her+0.002
 low+0.002
uits+0.002
Neg logit contributions
ocol-0.005
 accidents-0.004
 reality-0.004
 beast-0.004
 pounds-0.004
Token then
Token ID788
Feature activation+1.30e-03

Pos logit contributions
ocol+0.060
umm+0.057
icer+0.055
 pounds+0.054
 accidents+0.054
Neg logit contributions
Their-0.023
Her-0.020
 run-0.020
 off-0.020
pered-0.020
Token the
Token ID262
Feature activation-1.48e-02

Pos logit contributions
Their+0.020
pered+0.019
uits+0.018
ics+0.017
Her+0.017
Neg logit contributions
ocol-0.039
 accidents-0.038
 beast-0.036
 cavity-0.036
 reality-0.036
Token numbers
Token ID3146
Feature activation-1.70e-02

Pos logit contributions
 swing+0.021
 run+0.021
Their+0.021
 fast+0.021
 low+0.020
Neg logit contributions
ocol-0.031
heart-0.029
 accidents-0.029
 reality-0.027
umm-0.026
Token.
Token ID13
Feature activation-8.79e-03

Pos logit contributions
Their+0.025
pered+0.023
Her+0.023
 dads+0.022
amed+0.021
Neg logit contributions
 accidents-0.032
ocol-0.031
 beast-0.031
 reality-0.031
 pounds-0.030
Token\n
Token ID198
Feature activation-2.52e-03

Pos logit contributions
Their+0.011
pered+0.010
uits+0.009
Her+0.009
ics+0.009
Neg logit contributions
ocol-0.020
 accidents-0.020
 reality-0.019
 beast-0.019
 cavity-0.018
Token\n
Token ID198
Feature activation+2.45e-03

Pos logit contributions
pered+0.003
Their+0.003
ics+0.003
uits+0.003
suits+0.003
Neg logit contributions
oc-0.005
ocol-0.005
 accidents-0.005
 beast-0.005
 cavity-0.004
TokenIt
Token ID1026
Feature activation+4.01e-02

Pos logit contributions
ocol+0.005
 accidents+0.005
 beast+0.005
oc+0.005
 reality+0.005
Neg logit contributions
Their-0.003
pered-0.003
uits-0.003
ics-0.003
suits-0.003
Token was
Token ID373
Feature activation+1.85e-03

Pos logit contributions
ocol+0.075
 accidents+0.069
 cavity+0.064
 beast+0.064
 reality+0.063
Neg logit contributions
Their-0.066
ics-0.062
pered-0.062
 run-0.060
uits-0.060
Token so
Token ID523
Feature activation+4.00e-02

Pos logit contributions
ocol+0.004
 accidents+0.004
 beast+0.004
oc+0.004
 reality+0.004
Neg logit contributions
Their-0.002
pered-0.002
uits-0.002
ics-0.002
 run-0.002
Token much
Token ID881
Feature activation+2.26e-02

Pos logit contributions
ocol+0.073
 accidents+0.072
 reality+0.067
 beast+0.066
oc+0.066
Neg logit contributions
Their-0.063
pered-0.061
Her-0.058
ics-0.058
uits-0.058
Token fun
Token ID1257
Feature activation-4.55e-02

Pos logit contributions
ocol+0.056
icer+0.052
oc+0.051
 beast+0.050
 accidents+0.048
Neg logit contributions
Their-0.024
 run-0.021
 off-0.021
 fast-0.021
uits-0.021
Token!
Token ID0
Feature activation+1.36e-03

Pos logit contributions
Their+0.073
pered+0.068
uits+0.067
ics+0.067
Her+0.064
Neg logit contributions
ocol-0.083
 accidents-0.080
 beast-0.075
 reality-0.074
oc-0.074
Token ladder
Token ID18002
Feature activation+6.52e-04

Pos logit contributions
Their+0.009
pered+0.008
uits+0.008
Her+0.007
ics+0.007
Neg logit contributions
ocol-0.016
 accidents-0.016
 beast-0.015
 reality-0.014
oc-0.014
Token.
Token ID13
Feature activation-2.34e-03

Pos logit contributions
 accidents+0.001
oc+0.001
 ig+0.001
 beast+0.001
ocol+0.001
Neg logit contributions
Their-0.001
 dads-0.001
Her-0.001
pered-0.001
nes-0.001
Token He
Token ID679
Feature activation+1.24e-02

Pos logit contributions
Their+0.004
pered+0.004
uits+0.004
ics+0.004
Her+0.004
Neg logit contributions
ocol-0.004
 accidents-0.004
 beast-0.004
 reality-0.004
oc-0.004
Token was
Token ID373
Feature activation-6.51e-03

Pos logit contributions
ocol+0.030
 accidents+0.028
umm+0.027
 reality+0.026
 beast+0.026
Neg logit contributions
Their-0.014
 run-0.013
uits-0.013
pered-0.013
 swing-0.012
Token about
Token ID546
Feature activation+1.89e-02

Pos logit contributions
Their+0.009
suits+0.008
uits+0.008
Her+0.008
ics+0.007
Neg logit contributions
 accidents-0.013
ocol-0.013
 beast-0.012
 cavity-0.012
 reality-0.012
Token to
Token ID284
Feature activation+2.35e-02

Pos logit contributions
ocol+0.050
 accidents+0.050
 beast+0.047
 reality+0.047
oc+0.047
Neg logit contributions
Their-0.015
pered-0.013
uits-0.012
ics-0.012
Her-0.011
Token slide
Token ID10649
Feature activation-7.99e-03

Pos logit contributions
ocol+0.055
 accidents+0.054
 beast+0.051
 reality+0.051
oc+0.050
Neg logit contributions
Their-0.026
pered-0.024
uits-0.022
ics-0.022
suits-0.022
Token down
Token ID866
Feature activation+6.13e-03

Pos logit contributions
Their+0.008
pered+0.008
uits+0.007
ics+0.007
Her+0.007
Neg logit contributions
 accidents-0.019
ocol-0.018
 beast-0.017
 reality-0.017
oc-0.017
Token when
Token ID618
Feature activation-3.81e-04

Pos logit contributions
 accidents+0.013
ocol+0.013
 beast+0.012
 reality+0.012
oc+0.012
Neg logit contributions
Their-0.008
pered-0.007
ics-0.007
uits-0.007
Her-0.007
Token he
Token ID339
Feature activation+2.41e-02

Pos logit contributions
Their+0.001
 run+0.001
 low+0.001
 fast+0.001
pered+0.001
Neg logit contributions
ocol-0.001
oc-0.001
umm-0.001
 accidents-0.001
 cavity-0.001
Token felt
Token ID2936
Feature activation+2.55e-02

Pos logit contributions
ocol+0.059
 accidents+0.057
umm+0.054
 reality+0.052
 beast+0.051
Neg logit contributions
 run-0.025
Their-0.025
 ran-0.024
 low-0.024
 swing-0.024
Token held
Token ID2714
Feature activation+1.67e-02

Pos logit contributions
ocol+0.014
 accidents+0.014
 beast+0.013
 reality+0.013
oc+0.013
Neg logit contributions
Their-0.008
pered-0.007
uits-0.007
ics-0.007
Her-0.007
Token the
Token ID262
Feature activation-2.13e-03

Pos logit contributions
ocol+0.040
 accidents+0.039
 beast+0.037
 reality+0.036
oc+0.036
Neg logit contributions
Their-0.018
pered-0.016
uits-0.016
ics-0.015
Her-0.015
Token board
Token ID3096
Feature activation-2.10e-02

Pos logit contributions
Their+0.003
pered+0.003
uits+0.003
ics+0.002
Her+0.002
Neg logit contributions
ocol-0.004
 accidents-0.004
 beast-0.004
 reality-0.004
oc-0.004
Token high
Token ID1029
Feature activation+1.51e-02

Pos logit contributions
Their+0.027
 dads+0.024
pered+0.024
Her+0.021
suits+0.021
Neg logit contributions
 accidents-0.044
 reality-0.040
 beast-0.039
oc-0.039
ocol-0.038
Token in
Token ID287
Feature activation-7.35e-03

Pos logit contributions
ocol+0.038
 accidents+0.038
 beast+0.036
 reality+0.036
oc+0.035
Neg logit contributions
Their-0.014
pered-0.012
uits-0.012
ics-0.011
Her-0.011
Token the
Token ID262
Feature activation+1.48e-02

Pos logit contributions
Their+0.008
pered+0.008
ics+0.008
suits+0.007
nes+0.007
Neg logit contributions
ocol-0.017
 accidents-0.017
 reality-0.016
 beast-0.016
oc-0.015
Token air
Token ID1633
Feature activation-1.33e-02

Pos logit contributions
ocol+0.028
 accidents+0.028
 reality+0.026
oc+0.025
 beast+0.024
Neg logit contributions
Their-0.023
pered-0.021
 swing-0.021
 low-0.021
 run-0.021
Token and
Token ID290
Feature activation+2.18e-02

Pos logit contributions
Their+0.019
pered+0.017
Her+0.016
 dads+0.016
nes+0.016
Neg logit contributions
 accidents-0.026
ocol-0.025
 reality-0.025
 beast-0.024
oc-0.024
Token shouted
Token ID17293
Feature activation-1.97e-02

Pos logit contributions
ocol+0.052
 accidents+0.049
icer+0.047
 beast+0.047
oc+0.047
Neg logit contributions
Their-0.024
pered-0.022
 run-0.021
uits-0.021
ics-0.021
Token '
Token ID705
Feature activation-1.55e-02

Pos logit contributions
Their+0.027
ics+0.026
 dads+0.024
suits+0.024
nes+0.024
Neg logit contributions
 accidents-0.039
ocol-0.039
 reality-0.036
 beast-0.035
oc-0.033
Tokenone
Token ID505
Feature activation-1.41e-02

Pos logit contributions
Their+0.022
pered+0.020
Her+0.020
ics+0.019
uits+0.019
Neg logit contributions
 accidents-0.031
ocol-0.031
 reality-0.029
 beast-0.029
 cavity-0.028
Token "
Token ID366
Feature activation-1.17e-02

Pos logit contributions
ics+0.046
Their+0.045
nes+0.044
 contain+0.044
suits+0.044
Neg logit contributions
 accidents-0.057
ocol-0.054
 cavity-0.051
oc-0.050
 beast-0.049
TokenThank
Token ID10449
Feature activation-1.32e-03

Pos logit contributions
Their+0.017
pered+0.015
Her+0.015
uits+0.015
ics+0.015
Neg logit contributions
ocol-0.024
 accidents-0.023
 beast-0.022
 reality-0.022
oc-0.021
Token you
Token ID345
Feature activation-4.14e-02

Pos logit contributions
Their+0.002
pered+0.002
Her+0.002
ics+0.002
uits+0.002
Neg logit contributions
ocol-0.002
 accidents-0.002
 reality-0.002
 beast-0.002
oc-0.002
Token,
Token ID11
Feature activation-2.59e-02

Pos logit contributions
Their+0.060
ics+0.060
pered+0.057
uits+0.054
amed+0.054
Neg logit contributions
 accidents-0.082
ocol-0.078
 reality-0.074
oc-0.074
 beast-0.074
Token Tim
Token ID5045
Feature activation-7.37e-03

Pos logit contributions
Their+0.033
pered+0.031
uits+0.030
Her+0.030
 run+0.029
Neg logit contributions
ocol-0.055
 accidents-0.053
 beast-0.051
oc-0.050
 reality-0.050
Tokenmy
Token ID1820
Feature activation-7.76e-03

Pos logit contributions
Their+0.010
pered+0.010
uits+0.009
ics+0.009
Her+0.009
Neg logit contributions
ocol-0.015
 accidents-0.014
 beast-0.014
 reality-0.013
oc-0.013
Token.
Token ID13
Feature activation-4.58e-03

Pos logit contributions
Their+0.008
pered+0.008
uits+0.007
ics+0.007
Her+0.007
Neg logit contributions
ocol-0.018
 accidents-0.017
 beast-0.017
 reality-0.016
oc-0.016
Token I
Token ID314
Feature activation-9.29e-04

Pos logit contributions
Their+0.007
pered+0.006
Her+0.006
uits+0.006
 run+0.006
Neg logit contributions
 accidents-0.009
ocol-0.009
 reality-0.008
 beast-0.008
 pounds-0.008
Token forgive
Token ID20927
Feature activation+6.31e-03

Pos logit contributions
Their+0.001
 run+0.001
uits+0.001
Her+0.001
pered+0.001
Neg logit contributions
ocol-0.002
 accidents-0.002
 beast-0.002
umm-0.002
 pounds-0.002
Token you
Token ID345
Feature activation-3.07e-02

Pos logit contributions
ocol+0.013
 beast+0.012
 reality+0.012
oc+0.012
 cavity+0.012
Neg logit contributions
Their-0.009
Her-0.008
 run-0.008
pered-0.008
 off-0.007
Token.
Token ID13
Feature activation-4.96e-03

Pos logit contributions
Their+0.051
pered+0.048
ics+0.047
uits+0.047
amed+0.046
Neg logit contributions
 accidents-0.057
ocol-0.051
oc-0.049
 reality-0.049
heart-0.047
Token law
Token ID1099
Feature activation+1.36e-02

Pos logit contributions
Their+0.028
pered+0.026
ics+0.025
uits+0.025
Her+0.024
Neg logit contributions
ocol-0.040
 accidents-0.039
 beast-0.037
 reality-0.037
oc-0.037
Token says
Token ID1139
Feature activation+8.89e-03

Pos logit contributions
ocol+0.022
 accidents+0.022
 beast+0.020
 reality+0.020
oc+0.020
Neg logit contributions
Their-0.024
pered-0.023
uits-0.022
ics-0.022
Her-0.022
Token you
Token ID345
Feature activation-2.48e-02

Pos logit contributions
ocol+0.021
 accidents+0.021
 beast+0.020
 reality+0.020
oc+0.020
Neg logit contributions
Their-0.009
pered-0.008
uits-0.008
ics-0.008
Her-0.008
Token must
Token ID1276
Feature activation+3.14e-03

Pos logit contributions
Their+0.031
 run+0.029
uits+0.027
nes+0.027
ics+0.027
Neg logit contributions
ocol-0.057
umm-0.056
 accidents-0.054
icer-0.051
 beast-0.050
Token be
Token ID307
Feature activation-1.71e-02

Pos logit contributions
ocol+0.006
 accidents+0.006
 reality+0.006
 beast+0.006
 cavity+0.006
Neg logit contributions
pered-0.004
suits-0.004
Their-0.004
amed-0.003
uits-0.003
Token honest
Token ID5508
Feature activation-4.27e-03

Pos logit contributions
Their+0.021
uits+0.021
ics+0.020
pered+0.020
 contain+0.020
Neg logit contributions
 accidents-0.036
ocol-0.036
 beast-0.034
 reality-0.033
 cavity-0.033
Token and
Token ID290
Feature activation+1.80e-02

Pos logit contributions
Their+0.004
pered+0.004
uits+0.004
ics+0.004
Her+0.004
Neg logit contributions
ocol-0.010
 accidents-0.010
 beast-0.010
 reality-0.010
oc-0.010
Token kind
Token ID1611
Feature activation-1.53e-03

Pos logit contributions
 accidents+0.040
ocol+0.039
 beast+0.037
 reality+0.037
oc+0.036
Neg logit contributions
Their-0.021
uits-0.020
pered-0.020
ics-0.019
Her-0.019
Token.
Token ID13
Feature activation+2.46e-03

Pos logit contributions
Their+0.002
pered+0.002
uits+0.002
ics+0.002
Her+0.002
Neg logit contributions
ocol-0.003
 accidents-0.003
 beast-0.003
 reality-0.003
oc-0.003
Token
Token ID198
Feature activation-5.97e-03

Pos logit contributions
ocol+0.006
 accidents+0.006
 reality+0.005
 beast+0.005
 cavity+0.005
Neg logit contributions
Their-0.003
pered-0.003
Her-0.003
uits-0.002
nes-0.002
Token
Token ID198
Feature activation-7.69e-04

Pos logit contributions
Their+0.007
pered+0.007
ics+0.007
uits+0.007
 contain+0.006
Neg logit contributions
ocol-0.013
 accidents-0.013
oc-0.012
 beast-0.012
 reality-0.012
Token scared
Token ID12008
Feature activation+7.06e-03

Pos logit contributions
 accidents+0.031
ocol+0.030
 beast+0.029
 reality+0.028
 cavity+0.028
Neg logit contributions
Their-0.027
uits-0.026
ics-0.025
suits-0.025
pered-0.025
Token.
Token ID13
Feature activation+2.68e-03

Pos logit contributions
ocol+0.016
 accidents+0.015
 reality+0.015
oc+0.015
 beast+0.015
Neg logit contributions
Their-0.008
pered-0.008
 run-0.007
uits-0.007
Her-0.007
Token He
Token ID679
Feature activation+1.35e-02

Pos logit contributions
ocol+0.007
 accidents+0.007
 reality+0.007
 beast+0.006
 cavity+0.006
Neg logit contributions
Their-0.003
pered-0.002
 low-0.002
Her-0.002
uits-0.002
Token stayed
Token ID9658
Feature activation-3.76e-03

Pos logit contributions
ocol+0.031
 accidents+0.030
umm+0.028
 reality+0.028
icer+0.027
Neg logit contributions
Their-0.016
 run-0.015
pered-0.015
Her-0.014
 low-0.014
Token by
Token ID416
Feature activation+1.13e-02

Pos logit contributions
Their+0.004
ics+0.004
uits+0.004
Her+0.004
nes+0.004
Neg logit contributions
 accidents-0.008
ocol-0.008
 beast-0.008
 reality-0.008
oc-0.008
Token the
Token ID262
Feature activation-4.78e-03

Pos logit contributions
ocol+0.024
 accidents+0.024
 beast+0.023
oc+0.022
 reality+0.022
Neg logit contributions
Their-0.014
pered-0.014
uits-0.013
ics-0.012
Her-0.012
Token snake
Token ID17522
Feature activation-5.70e-03

Pos logit contributions
Their+0.007
pered+0.007
ics+0.007
uits+0.007
Her+0.007
Neg logit contributions
ocol-0.009
 accidents-0.009
 reality-0.008
 beast-0.008
oc-0.008
Token and
Token ID290
Feature activation+1.04e-02

Pos logit contributions
Their+0.007
pered+0.006
uits+0.006
Her+0.006
ics+0.006
Neg logit contributions
 accidents-0.012
ocol-0.012
 beast-0.012
oc-0.012
 reality-0.012
Token watched
Token ID7342
Feature activation+7.33e-03

Pos logit contributions
ocol+0.025
 accidents+0.023
umm+0.022
icer+0.022
 beast+0.022
Neg logit contributions
Their-0.012
 run-0.011
pered-0.011
ics-0.011
uits-0.010
Token it
Token ID340
Feature activation+1.28e-02

Pos logit contributions
ocol+0.014
 accidents+0.014
 beast+0.013
 reality+0.013
oc+0.013
Neg logit contributions
Their-0.011
pered-0.010
uits-0.010
ics-0.010
Her-0.009
Token.
Token ID13
Feature activation-3.30e-03

Pos logit contributions
 accidents+0.028
ocol+0.027
oc+0.026
 reality+0.026
 beast+0.026
Neg logit contributions
Their-0.016
uits-0.014
suits-0.014
nes-0.014
amed-0.014
Token like
Token ID588
Feature activation-2.70e-02

Pos logit contributions
ocol+0.016
 accidents+0.016
 beast+0.015
oc+0.015
umm+0.014
Neg logit contributions
Their-0.007
pered-0.006
uits-0.006
ics-0.006
Her-0.006
Token a
Token ID257
Feature activation-1.63e-02

Pos logit contributions
Their+0.034
pered+0.031
uits+0.030
ics+0.030
Her+0.029
Neg logit contributions
ocol-0.059
 accidents-0.057
oc-0.055
 beast-0.054
 cavity-0.053
Token world
Token ID995
Feature activation+5.72e-03

Pos logit contributions
Their+0.026
pered+0.024
uits+0.023
ics+0.023
Her+0.023
Neg logit contributions
ocol-0.030
 accidents-0.029
 beast-0.027
 reality-0.027
oc-0.027
Token where
Token ID810
Feature activation+2.87e-03

Pos logit contributions
 accidents+0.011
ocol+0.011
oc+0.011
 beast+0.010
 reality+0.010
Neg logit contributions
Their-0.009
pered-0.008
Her-0.008
amed-0.007
 dads-0.007
Token people
Token ID661
Feature activation-2.79e-02

Pos logit contributions
ocol+0.006
oc+0.006
 cavity+0.006
icer+0.006
 accidents+0.006
Neg logit contributions
Their-0.004
ics-0.004
 run-0.004
pered-0.003
uits-0.003
Token have
Token ID423
Feature activation-1.50e-02

Pos logit contributions
Their+0.035
pered+0.032
uits+0.031
Her+0.030
ics+0.030
Neg logit contributions
ocol-0.060
 accidents-0.059
 beast-0.056
 reality-0.056
oc-0.055
Token loyal
Token ID9112
Feature activation-2.71e-02

Pos logit contributions
Their+0.021
pered+0.020
uits+0.019
ics+0.019
Her+0.019
Neg logit contributions
ocol-0.030
 accidents-0.029
 beast-0.028
 reality-0.027
oc-0.027
Token friends
Token ID2460
Feature activation-4.82e-02

Pos logit contributions
Their+0.038
pered+0.037
uits+0.036
 run+0.034
suits+0.034
Neg logit contributions
ocol-0.053
 accidents-0.052
 beast-0.049
oc-0.048
 cavity-0.048
Token."
Token ID526
Feature activation-1.07e-02

Pos logit contributions
Their+0.063
pered+0.058
Her+0.055
uits+0.055
ics+0.054
Neg logit contributions
ocol-0.101
 accidents-0.100
 beast-0.094
 reality-0.094
oc-0.094
Token
Token ID220
Feature activation+3.00e-02

Pos logit contributions
Their+0.015
pered+0.014
uits+0.014
ics+0.014
Her+0.013
Neg logit contributions
ocol-0.022
 accidents-0.021
 beast-0.020
 reality-0.020
oc-0.020
Token
Token ID198
Feature activation-1.28e-02

Pos logit contributions
ocol+0.064
 accidents+0.062
oc+0.059
 reality+0.059
 beast+0.059
Neg logit contributions
Their-0.036
pered-0.035
uits-0.033
 contain-0.033
ics-0.032
Token It
Token ID632
Feature activation+2.57e-02

Pos logit contributions
Their+0.015
pered+0.014
ics+0.014
uits+0.014
 contain+0.013
Neg logit contributions
ocol-0.019
 accidents-0.019
 beast-0.018
oc-0.018
 reality-0.017
Token was
Token ID373
Feature activation-5.44e-04

Pos logit contributions
ocol+0.059
 reality+0.050
 accidents+0.050
 beast+0.049
umm+0.049
Neg logit contributions
Their-0.034
 run-0.033
 over-0.032
 low-0.032
 ran-0.031
Token sweet
Token ID6029
Feature activation+3.45e-02

Pos logit contributions
Their+0.001
pered+0.001
uits+0.001
Her+0.001
ics+0.001
Neg logit contributions
ocol-0.001
 accidents-0.001
oc-0.001
 beast-0.001
 reality-0.001
Token and
Token ID290
Feature activation+1.51e-02

Pos logit contributions
ocol+0.102
 reality+0.098
 beast+0.096
icer+0.095
 cavity+0.092
Neg logit contributions
 run-0.013
Their-0.012
 off-0.011
 over-0.010
pered-0.010
Token crunch
Token ID23527
Feature activation+2.76e-02

Pos logit contributions
ocol+0.033
 beast+0.032
 reality+0.031
icer+0.031
 accidents+0.030
Neg logit contributions
Their-0.019
 run-0.017
 low-0.017
 fast-0.017
Her-0.016
Tokeny
Token ID88
Feature activation+6.85e-03

Pos logit contributions
ocol+0.059
 accidents+0.058
 beast+0.055
 reality+0.055
oc+0.054
Neg logit contributions
Their-0.035
pered-0.032
uits-0.031
ics-0.031
Her-0.030
Token.
Token ID13
Feature activation-8.97e-03

Pos logit contributions
 accidents+0.014
ocol+0.013
oc+0.013
 beast+0.012
 reality+0.012
Neg logit contributions
Their-0.010
uits-0.009
pered-0.009
Her-0.009
ics-0.009
Token
Token ID198
Feature activation-4.04e-03

Pos logit contributions
Their+0.013
pered+0.012
uits+0.011
Her+0.011
ics+0.011
Neg logit contributions
ocol-0.019
 accidents-0.018
 beast-0.017
 reality-0.017
oc-0.017
Token
Token ID198
Feature activation+1.29e-03

Pos logit contributions
pered+0.005
Their+0.005
 contain+0.005
ics+0.005
uits+0.005
Neg logit contributions
ocol-0.008
 accidents-0.008
oc-0.008
 beast-0.008
 reality-0.007
TokenTim
Token ID14967
Feature activation-5.81e-04

Pos logit contributions
oc+0.003
 accidents+0.002
ocol+0.002
 beast+0.002
icer+0.002
Neg logit contributions
pered-0.002
ics-0.002
 contain-0.002
uits-0.002
 aside-0.002
Token smiled
Token ID13541
Feature activation+5.12e-03

Pos logit contributions
Their+0.001
pered+0.001
uits+0.001
ics+0.001
Her+0.001
Neg logit contributions
ocol-0.001
 accidents-0.001
 beast-0.001
 reality-0.001
oc-0.001
Token and
Token ID290
Feature activation+4.38e-03

Pos logit contributions
Their+0.116
Her+0.113
pered+0.107
arl+0.102
eyed+0.099
Neg logit contributions
 accidents-0.114
 reality-0.107
oc-0.103
 ig-0.102
 beast-0.101
Token had
Token ID550
Feature activation-1.28e-02

Pos logit contributions
 accidents+0.009
ocol+0.009
 reality+0.009
 beast+0.009
oc+0.009
Neg logit contributions
Their-0.006
pered-0.005
Her-0.005
ics-0.005
uits-0.005
Token lots
Token ID6041
Feature activation+9.90e-03

Pos logit contributions
pered+0.019
Their+0.019
Her+0.018
eyed+0.018
 aside+0.018
Neg logit contributions
 accidents-0.023
 reality-0.022
 ig-0.021
 beast-0.020
 cel-0.020
Token of
Token ID286
Feature activation+2.71e-03

Pos logit contributions
ocol+0.019
 accidents+0.018
 beast+0.018
 reality+0.017
icer+0.017
Neg logit contributions
Their-0.015
uits-0.014
pered-0.014
ics-0.013
 contain-0.013
Token fun
Token ID1257
Feature activation-5.53e-02

Pos logit contributions
ocol+0.005
 accidents+0.005
 beast+0.005
 reality+0.005
oc+0.005
Neg logit contributions
Their-0.004
pered-0.004
ics-0.004
uits-0.004
Her-0.004
Token together
Token ID1978
Feature activation-6.03e-02

Pos logit contributions
Their+0.065
pered+0.060
uits+0.058
Her+0.057
ics+0.057
Neg logit contributions
ocol-0.124
 accidents-0.123
 beast-0.115
 reality-0.115
oc-0.114
Token.
Token ID13
Feature activation-1.27e-02

Pos logit contributions
Their+0.090
pered+0.082
Her+0.082
ics+0.080
uits+0.080
Neg logit contributions
 accidents-0.115
ocol-0.115
 reality-0.108
 beast-0.106
oc-0.106
Token After
Token ID2293
Feature activation-2.12e-02

Pos logit contributions
Their+0.016
pered+0.015
uits+0.014
Her+0.014
ics+0.014
Neg logit contributions
ocol-0.028
 accidents-0.027
 beast-0.026
 reality-0.026
 cavity-0.025
Token a
Token ID257
Feature activation-2.99e-02

Pos logit contributions
Their+0.024
pered+0.022
uits+0.021
ics+0.021
Her+0.021
Neg logit contributions
ocol-0.049
 accidents-0.048
 beast-0.046
 reality-0.045
oc-0.045
Token while
Token ID981
Feature activation-2.57e-02

Pos logit contributions
pered+0.040
ics+0.038
Their+0.037
uits+0.035
amed+0.035
Neg logit contributions
 accidents-0.063
ocol-0.060
 reality-0.059
oc-0.059
 beast-0.058
Token,
Token ID11
Feature activation-1.32e-02

Pos logit contributions
Their+0.033
ics+0.032
pered+0.031
uits+0.030
nes+0.030
Neg logit contributions
 accidents-0.055
ocol-0.055
 beast-0.051
 reality-0.049
oc-0.049
TokenJake
Token ID43930
Feature activation+4.51e-03

Pos logit contributions
ocol+0.077
 accidents+0.075
 reality+0.070
 beast+0.069
 cavity+0.067
Neg logit contributions
Their-0.067
pered-0.063
ics-0.062
uits-0.062
Her-0.060
Token and
Token ID290
Feature activation-1.24e-02

Pos logit contributions
ocol+0.011
 accidents+0.011
 reality+0.010
 beast+0.010
umm+0.010
Neg logit contributions
Their-0.004
 run-0.004
pered-0.004
uits-0.004
ics-0.004
Token Jane
Token ID12091
Feature activation-3.25e-03

Pos logit contributions
Their+0.021
pered+0.020
uits+0.020
ics+0.019
suits+0.019
Neg logit contributions
 accidents-0.021
ocol-0.021
 beast-0.020
oc-0.019
 reality-0.019
Token were
Token ID547
Feature activation+1.12e-03

Pos logit contributions
Their+0.004
pered+0.004
uits+0.003
ics+0.003
Her+0.003
Neg logit contributions
ocol-0.007
 accidents-0.007
 beast-0.007
 reality-0.007
oc-0.007
Token best
Token ID1266
Feature activation-9.16e-03

Pos logit contributions
ocol+0.002
 accidents+0.002
 reality+0.002
icer+0.002
 ig+0.002
Neg logit contributions
Their-0.002
uits-0.002
 contain-0.002
nes-0.002
suits-0.002
Token friends
Token ID2460
Feature activation-4.95e-02

Pos logit contributions
Their+0.015
pered+0.014
ics+0.014
uits+0.014
Her+0.014
Neg logit contributions
ocol-0.016
 accidents-0.016
 beast-0.015
 reality-0.015
oc-0.014
Token.
Token ID13
Feature activation+6.23e-03

Pos logit contributions
Their+0.072
pered+0.067
ics+0.067
uits+0.064
Her+0.062
Neg logit contributions
 accidents-0.095
ocol-0.093
 reality-0.089
 beast-0.088
oc-0.087
Token Every
Token ID3887
Feature activation-3.92e-02

Pos logit contributions
ocol+0.013
 accidents+0.013
 reality+0.012
oc+0.012
umm+0.012
Neg logit contributions
Their-0.008
pered-0.007
ics-0.007
amed-0.007
 leash-0.007
Token day
Token ID1110
Feature activation-1.13e-02

Pos logit contributions
Their+0.061
pered+0.060
uits+0.057
Her+0.055
ics+0.055
Neg logit contributions
 accidents-0.070
ocol-0.069
 beast-0.068
 reality-0.067
oc-0.065
Token when
Token ID618
Feature activation-9.69e-03

Pos logit contributions
Their+0.011
pered+0.010
uits+0.010
ics+0.010
Her+0.009
Neg logit contributions
ocol-0.027
 accidents-0.027
 reality-0.025
 beast-0.025
oc-0.025
Token school
Token ID1524
Feature activation-1.63e-02

Pos logit contributions
Their+0.006
ics+0.004
uits+0.004
Her+0.004
 run+0.004
Neg logit contributions
ocol-0.028
 accidents-0.027
oc-0.027
 beast-0.027
 cavity-0.026
Token so
Token ID523
Feature activation+1.62e-02

Pos logit contributions
Their+0.020
suits+0.019
Her+0.018
uits+0.018
pered+0.017
Neg logit contributions
 accidents-0.027
 reality-0.027
ocol-0.026
 ig-0.026
 beast-0.026
Token happy
Token ID3772
Feature activation-1.58e-02

Pos logit contributions
 accidents+0.036
ocol+0.035
 reality+0.034
oc+0.033
 beast+0.032
Neg logit contributions
Their-0.018
pered-0.017
uits-0.017
Her-0.016
suits-0.015
Token that
Token ID326
Feature activation-3.79e-03

Pos logit contributions
Their+0.018
pered+0.017
uits+0.016
ics+0.016
Her+0.015
Neg logit contributions
ocol-0.036
 accidents-0.035
 beast-0.034
 reality-0.033
oc-0.033
Token they
Token ID484
Feature activation-4.23e-02

Pos logit contributions
Their+0.004
ics+0.003
pered+0.003
uits+0.003
 run+0.003
Neg logit contributions
ocol-0.009
 accidents-0.009
 beast-0.009
oc-0.009
 cavity-0.009
Token kept
Token ID4030
Feature activation-9.92e-03

Pos logit contributions
Their+0.062
pered+0.058
uits+0.055
Her+0.055
ics+0.054
Neg logit contributions
ocol-0.081
 accidents-0.081
 reality-0.077
 beast-0.076
oc-0.074
Token playing
Token ID2712
Feature activation-6.08e-02

Pos logit contributions
Their+0.013
suits+0.013
Her+0.013
uits+0.012
pered+0.012
Neg logit contributions
ocol-0.020
 accidents-0.020
 reality-0.019
 beast-0.018
 cavity-0.017
Token with
Token ID351
Feature activation-1.40e-02

Pos logit contributions
Their+0.060
pered+0.053
Her+0.052
ics+0.052
uits+0.050
Neg logit contributions
 accidents-0.145
ocol-0.145
 reality-0.139
oc-0.137
 beast-0.137
Token the
Token ID262
Feature activation+2.41e-03

Pos logit contributions
Their+0.014
uits+0.012
pered+0.012
ics+0.012
 run+0.012
Neg logit contributions
ocol-0.034
 accidents-0.033
 beast-0.032
oc-0.032
umm-0.032
Token balls
Token ID11333
Feature activation-5.19e-02

Pos logit contributions
ocol+0.005
 accidents+0.005
oc+0.005
 beast+0.005
 cavity+0.004
Neg logit contributions
Their-0.003
uits-0.003
 run-0.003
 swing-0.003
pered-0.003
Token for
Token ID329
Feature activation-2.29e-02

Pos logit contributions
Their+0.077
pered+0.070
Her+0.067
 dads+0.065
suits+0.063
Neg logit contributions
 accidents-0.098
 reality-0.092
ocol-0.088
 ig-0.086
 beast-0.085
Token hours
Token ID2250
Feature activation-4.37e-02

Pos logit contributions
Their+0.031
pered+0.029
uits+0.029
ics+0.028
Her+0.028
Neg logit contributions
ocol-0.047
 accidents-0.046
 beast-0.043
 reality-0.043
oc-0.042
Token with
Token ID351
Feature activation-1.90e-02

Pos logit contributions
Their+0.078
uits+0.078
pered+0.074
ics+0.073
 contain+0.072
Neg logit contributions
 accidents-0.130
oc-0.127
ocol-0.126
 beast-0.119
 reality-0.116
Token you
Token ID345
Feature activation-2.35e-02

Pos logit contributions
Their+0.031
Her+0.028
pered+0.028
ics+0.028
uits+0.028
Neg logit contributions
ocol-0.034
 accidents-0.033
 reality-0.032
 beast-0.032
 cavity-0.031
Token,
Token ID11
Feature activation-1.45e-02

Pos logit contributions
Their+0.034
pered+0.032
uits+0.031
Her+0.030
ics+0.029
Neg logit contributions
 accidents-0.046
ocol-0.045
oc-0.043
 reality-0.043
 beast-0.043
Token Tim
Token ID5045
Feature activation-1.93e-04

Pos logit contributions
Their+0.023
pered+0.022
uits+0.021
ics+0.021
Her+0.021
Neg logit contributions
ocol-0.027
 accidents-0.026
 beast-0.024
 reality-0.024
oc-0.024
Token!"
Token ID2474
Feature activation-1.04e-02

Pos logit contributions
Their+0.000
uits+0.000
pered+0.000
 contain+0.000
ics+0.000
Neg logit contributions
 accidents-0.000
ocol-0.000
oc-0.000
 beast-0.000
 reality-0.000
Token They
Token ID1119
Feature activation-6.29e-02

Pos logit contributions
nes+0.019
Their+0.018
pered+0.018
amed+0.018
 contain+0.018
Neg logit contributions
 accidents-0.016
oc-0.015
ocol-0.014
 ig-0.013
umm-0.013
Token played
Token ID2826
Feature activation-6.37e-02

Pos logit contributions
Their+0.104
pered+0.102
arl+0.097
Her+0.096
ics+0.095
Neg logit contributions
 ig-0.088
oc-0.084
 cel-0.082
 reality-0.081
 dreams-0.080
Token on
Token ID319
Feature activation-6.99e-03

Pos logit contributions
nes+0.078
eyed+0.075
 contain+0.074
Her+0.073
Their+0.073
Neg logit contributions
 accidents-0.123
oc-0.122
 beast-0.110
ocol-0.104
 cel-0.104
Token the
Token ID262
Feature activation-5.07e-03

Pos logit contributions
Their+0.010
pered+0.009
uits+0.009
Her+0.009
ics+0.009
Neg logit contributions
ocol-0.014
 accidents-0.014
 beast-0.013
oc-0.013
 reality-0.013
Token swings
Token ID26728
Feature activation-4.91e-02

Pos logit contributions
pered+0.010
nes+0.009
eyed+0.009
amed+0.009
 aside+0.009
Neg logit contributions
 accidents-0.006
 ig-0.006
 beast-0.006
 cel-0.005
icer-0.005
Token and
Token ID290
Feature activation+1.57e-02

Pos logit contributions
nes+0.089
Their+0.086
Her+0.083
 contain+0.081
eyed+0.081
Neg logit contributions
oc-0.073
 ig-0.062
 beast-0.062
 accidents-0.062
icer-0.061
TokenBob
Token ID18861
Feature activation+3.41e-03

Pos logit contributions
pered+0.005
Their+0.004
ics+0.004
uits+0.004
suits+0.004
Neg logit contributions
ocol-0.007
 accidents-0.007
oc-0.007
 beast-0.006
 reality-0.006
Token and
Token ID290
Feature activation-4.18e-02

Pos logit contributions
ocol+0.007
 accidents+0.007
 beast+0.007
 reality+0.007
oc+0.007
Neg logit contributions
Their-0.004
pered-0.004
uits-0.004
ics-0.004
Her-0.004
Token his
Token ID465
Feature activation-2.31e-02

Pos logit contributions
Their+0.067
pered+0.063
uits+0.061
ics+0.061
Her+0.060
Neg logit contributions
ocol-0.076
 accidents-0.075
 beast-0.070
 reality-0.069
oc-0.069
Token new
Token ID649
Feature activation-3.87e-02

Pos logit contributions
Their+0.028
pered+0.027
uits+0.025
 run+0.025
ics+0.025
Neg logit contributions
ocol-0.051
 accidents-0.049
oc-0.046
 beast-0.046
 reality-0.046
Token friend
Token ID1545
Feature activation-3.68e-02

Pos logit contributions
Their+0.052
pered+0.049
uits+0.048
ics+0.048
Her+0.047
Neg logit contributions
ocol-0.079
 accidents-0.077
 beast-0.072
 reality-0.072
oc-0.071
Token continued
Token ID3767
Feature activation-5.82e-02

Pos logit contributions
Their+0.046
pered+0.042
uits+0.040
ics+0.040
Her+0.039
Neg logit contributions
ocol-0.080
 accidents-0.079
 beast-0.075
 reality-0.074
oc-0.074
Token to
Token ID284
Feature activation-2.48e-03

Pos logit contributions
nes+0.079
atha+0.073
suits+0.073
Their+0.073
 contain+0.072
Neg logit contributions
 accidents-0.115
ocol-0.114
 reality-0.107
 beast-0.105
 ig-0.105
Token play
Token ID711
Feature activation-6.06e-02

Pos logit contributions
Their+0.003
pered+0.003
uits+0.002
suits+0.002
ics+0.002
Neg logit contributions
ocol-0.006
 accidents-0.006
 reality-0.005
 beast-0.005
oc-0.005
Token together
Token ID1978
Feature activation-5.59e-02

Pos logit contributions
Their+0.058
pered+0.052
uits+0.051
ics+0.050
Her+0.049
Neg logit contributions
ocol-0.146
 accidents-0.146
 beast-0.139
oc-0.138
 reality-0.137
Token,
Token ID11
Feature activation-1.80e-02

Pos logit contributions
Their+0.076
pered+0.068
uits+0.065
Her+0.065
amed+0.065
Neg logit contributions
ocol-0.114
 accidents-0.113
 reality-0.109
oc-0.107
 beast-0.106
Token and
Token ID290
Feature activation-7.03e-03

Pos logit contributions
Their+0.024
pered+0.022
Her+0.021
uits+0.020
ics+0.020
Neg logit contributions
 accidents-0.038
ocol-0.035
 reality-0.035
 beast-0.035
oc-0.034