TOP ACTIVATIONS
MAX = 0.160

it here a long time ago ." " Maybe
lost it a long time ago . I thought I would
started smoking a long time ago and now it 's hard
she made a long time ago . Rose had promised always
ac orn a few days ago so I could have some
painted them a long time ago . She thought they were
the wardrobe a few days ago , so she opened it
who had died a year ago . It was very precious
by people a long time ago in history . Blue didn
same dress a long time ago . <|endoftext|> Once upon a
Jim from a long time ago , and he was very
had lived a long time ago . L
someone a very long time ago . The little girl thought
best friend a long time ago . He was my dog
who lived a long time ago ," mom said .
her birthday a few days ago . She liked to practice
is from a long time ago , when people used different
lost it a long time ago ." Sam was sorry to
it away a long time ago . It is not good
was painted a long time ago ," her mum

MIN ACTIVATIONS
MIN = -0.168

girl would make a great fer ret parent .
and exciting . A baby fer ret !
The young girl took the fer ret home and was very
clown s juggling and ac rob ats doing flips . But
and proud of himself for accompl ishing such a difficult task
lesson and never pretended with tr ays again . <|endoftext|> Billy
let 's go on the fer ris wheel and see the
sc olded him for dis obe ying her .
Ele ph ants have strong tr unks that they can use
Lily decided to play " mar ry ". They dressed up
didn 't know what " pr une " meant , so
lay was careful not to tr ample the flowers and plants
something new ! She thought av oc ados were the most
at first you don 't succeed , try , try again
so brave , wished him luck and then continued on his
. They said , " Ju ice , cheese and crack
. He says , " Ju ice is y ummy !
. But Brad didn 't bud ge , he remained where
And Ske leton stayed warm and cozy
her wish and wished him luck on his journey . <|endoftext|>

INTERVAL 0.078 - 0.160
CONTAINS 0.816%

. They have a lot of fun . The dog is
y and Leo both cheered when they reached the very top
fight . Tim was frustrated when he couldn 't win .
excited for his first day of class . He put on
there was a little boy named Tim my . Tim my

INTERVAL -0.004 - 0.078
CONTAINS 56.552%

my and apologized . " Here you go ,
a lot . She is lucky to have you ."
" But you can make another castle tomorrow . And maybe
their toes . After they finished doing yoga ,
It was her swim suit . It was too big for

INTERVAL -0.086 - -0.004
CONTAINS 42.344%

it got stuck in a tree . Wh isk ers tried
back to the house to show his mom .
taught the mouse how to sneak into the farm and steal
Daisy replied , " I was looking for a silly butter
. The nurse smiled and said , " It 's okay

INTERVAL -0.168 - -0.086
CONTAINS 0.288%

girl never got to play again . <|endoftext|> Once upon a
was a young girl named Av a . She liked to
eat , they smiled at each other . The cat said
and showed them his new trick . Hipp y could blow
just had an amazing fight against a mud p uddle .
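
The four INTERVAL blocks above appear to split the range from MIN (-0.168) to MAX (0.160) into equal-width bins, with each CONTAINS line giving the share of activations that falls in that bin. A minimal Python sketch, assuming that reading (the variable names are illustrative and not part of the dashboard), checks that the bins have equal width and that the shares sum to 100%:

```python
# Bin edges and shares copied from the INTERVAL / CONTAINS lines above.
# Assumption: CONTAINS is the percentage of activations inside each bin.
intervals = [
    ((0.078, 0.160), 0.816),
    ((-0.004, 0.078), 56.552),
    ((-0.086, -0.004), 42.344),
    ((-0.168, -0.086), 0.288),
]

# Width of each bin, rounded to the three decimals the dump reports.
widths = [round(hi - lo, 3) for (lo, hi), _ in intervals]

# Total share across all bins, and the share held by the two outer bins.
total_share = round(sum(share for _, share in intervals), 3)
tail_share = round(intervals[0][1] + intervals[-1][1], 3)

print("bin widths:", widths)
print("total share (%):", total_share)
print("share in the two outer bins (%):", tail_share)
```

Under this reading, nearly all activations sit in the two central bins, so the "ago" contexts listed under TOP ACTIVATIONS belong to a small upper tail.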
Token it
Token ID340
Feature activation+2.93e-02

Pos logit contributions
less+0.085
free+0.075
 okay+0.071
 warm+0.070
 wet+0.068
Neg logit contributions
rob-0.123
 obstacles-0.118
 obstacle-0.117
 facts-0.113
 battles-0.112
Token here
Token ID994
Feature activation+6.09e-02

Pos logit contributions
less+0.031
free+0.027
 okay+0.026
 warm+0.025
inky+0.024
Neg logit contributions
rob-0.057
 obstacles-0.056
 obstacle-0.055
 facts-0.054
 battles-0.053
Token a
Token ID257
Feature activation+1.88e-02

Pos logit contributions
less+0.064
free+0.055
 okay+0.054
 warm+0.052
 wet+0.049
Neg logit contributions
rob-0.125
 obstacle-0.118
 obstacles-0.118
 facts-0.112
 battles-0.111
Token long
Token ID890
Feature activation+4.22e-02

Pos logit contributions
less+0.028
free+0.025
 okay+0.024
 warm+0.024
inky+0.024
Neg logit contributions
rob-0.029
 obstacles-0.028
 obstacle-0.027
 facts-0.026
der-0.026
Token time
Token ID640
Feature activation+7.52e-02

Pos logit contributions
less+0.050
free+0.043
 warm+0.041
 okay+0.041
 wet+0.040
Neg logit contributions
rob-0.076
 obstacles-0.073
 obstacle-0.072
 facts-0.072
der-0.070
Token ago
Token ID2084
Feature activation+1.43e-01

Pos logit contributions
less+0.095
free+0.084
 okay+0.079
 warm+0.076
inky+0.075
Neg logit contributions
rob-0.132
 obstacles-0.127
 obstacle-0.127
 facts-0.122
 battles-0.121
Token."
Token ID526
Feature activation+1.53e-02

Pos logit contributions
less+0.162
free+0.139
 okay+0.131
inky+0.127
 warm+0.125
Neg logit contributions
rob-0.265
 obstacles-0.261
 obstacle-0.259
 facts-0.247
 battles-0.247
Token
Token ID198
Feature activation+2.26e-02

Pos logit contributions
less+0.020
free+0.018
inky+0.017
etsy+0.016
 okay+0.016
Neg logit contributions
rob-0.026
 obstacles-0.026
 obstacle-0.025
 facts-0.025
 hears-0.024
Token
Token ID198
Feature activation+2.22e-02

Pos logit contributions
less+0.030
free+0.026
etsy+0.025
 okay+0.024
inky+0.024
Neg logit contributions
rob-0.038
 obstacles-0.037
 obstacle-0.036
 facts-0.035
 hears-0.035
Token"
Token ID1
Feature activation+1.36e-02

Pos logit contributions
less+0.034
free+0.032
etsy+0.030
inky+0.029
 okay+0.029
Neg logit contributions
rob-0.032
 obstacles-0.032
 obstacle-0.031
 hears-0.030
 battles-0.030
TokenMaybe
Token ID13300
Feature activation-3.60e-02

Pos logit contributions
less+0.021
free+0.020
There+0.019
 okay+0.019
 warm+0.018
Neg logit contributions
 obstacle-0.020
rob-0.018
 obstacles-0.018
 facts-0.018
 battles-0.018
Token lost
Token ID2626
Feature activation+3.10e-02

Pos logit contributions
less+0.007
free+0.007
 okay+0.006
inky+0.006
etsy+0.006
Neg logit contributions
rob-0.009
 obstacles-0.008
 obstacle-0.008
 facts-0.008
 battles-0.008
Token it
Token ID340
Feature activation+1.09e-02

Pos logit contributions
less+0.042
free+0.038
etsy+0.035
inky+0.035
 okay+0.035
Neg logit contributions
rob-0.050
 obstacles-0.050
 obstacle-0.049
 facts-0.047
 battles-0.047
Token a
Token ID257
Feature activation+1.23e-02

Pos logit contributions
less+0.014
free+0.013
etsy+0.012
inky+0.012
eth+0.012
Neg logit contributions
 obstacles-0.017
rob-0.017
 obstacle-0.017
 battles-0.017
der-0.017
Token long
Token ID890
Feature activation+3.38e-02

Pos logit contributions
less+0.016
free+0.014
 okay+0.014
 warm+0.014
inky+0.013
Neg logit contributions
rob-0.021
 obstacles-0.020
 obstacle-0.020
 facts-0.019
 battles-0.019
Token time
Token ID640
Feature activation+7.07e-02

Pos logit contributions
less+0.045
free+0.040
 okay+0.037
inky+0.037
etsy+0.037
Neg logit contributions
rob-0.056
 obstacles-0.055
 obstacle-0.054
 facts-0.052
 battles-0.052
Token ago
Token ID2084
Feature activation+1.40e-01

Pos logit contributions
less+0.097
free+0.087
etsy+0.083
inky+0.082
 okay+0.081
Neg logit contributions
rob-0.111
 obstacles-0.111
 obstacle-0.108
 facts-0.105
 battles-0.105
Token.
Token ID13
Feature activation+1.70e-02

Pos logit contributions
less+0.161
free+0.139
 okay+0.130
inky+0.127
etsy+0.127
Neg logit contributions
rob-0.258
 obstacles-0.254
 obstacle-0.252
 facts-0.242
 battles-0.241
Token I
Token ID314
Feature activation+9.66e-03

Pos logit contributions
less+0.028
free+0.024
inky+0.024
etsy+0.024
 wet+0.023
Neg logit contributions
rob-0.022
 obstacles-0.021
 facts-0.021
der-0.021
 hears-0.021
Token thought
Token ID1807
Feature activation-2.94e-02

Pos logit contributions
less+0.014
free+0.013
inky+0.012
etsy+0.012
 okay+0.012
Neg logit contributions
rob-0.014
 obstacles-0.014
 obstacle-0.014
 facts-0.013
 battles-0.013
Token I
Token ID314
Feature activation-1.75e-02

Pos logit contributions
rob+0.055
 obstacles+0.053
 obstacle+0.052
 battles+0.050
 facts+0.049
Neg logit contributions
less-0.035
free-0.030
 okay-0.030
 warm-0.029
inky-0.027
Token would
Token ID561
Feature activation-3.11e-03

Pos logit contributions
rob+0.026
 obstacles+0.025
 obstacle+0.025
 facts+0.024
 battles+0.024
Neg logit contributions
less-0.027
free-0.025
 okay-0.023
etsy-0.023
inky-0.023
Token started
Token ID2067
Feature activation+6.30e-02

Pos logit contributions
rob+0.015
 obstacles+0.014
 obstacle+0.014
 facts+0.014
 battles+0.014
Neg logit contributions
less-0.010
free-0.009
 okay-0.008
etsy-0.008
inky-0.008
Token smoking
Token ID9216
Feature activation+2.68e-02

Pos logit contributions
less+0.077
free+0.068
 okay+0.063
etsy+0.063
inky+0.063
Neg logit contributions
rob-0.111
 obstacles-0.109
 obstacle-0.108
 facts-0.104
 battles-0.104
Token a
Token ID257
Feature activation+1.26e-02

Pos logit contributions
less+0.029
free+0.025
 warm+0.024
 okay+0.024
 wet+0.023
Neg logit contributions
rob-0.053
 obstacle-0.050
 obstacles-0.050
 battles-0.048
 facts-0.048
Token long
Token ID890
Feature activation+3.99e-02

Pos logit contributions
less+0.016
free+0.014
 okay+0.013
 warm+0.013
inky+0.013
Neg logit contributions
rob-0.022
 obstacles-0.021
 obstacle-0.021
 facts-0.020
 battles-0.020
Token time
Token ID640
Feature activation+7.39e-02

Pos logit contributions
less+0.052
free+0.046
 okay+0.043
inky+0.043
 warm+0.043
Neg logit contributions
rob-0.068
 obstacles-0.066
 obstacle-0.064
 facts-0.063
 battles-0.062
Token ago
Token ID2084
Feature activation+1.40e-01

Pos logit contributions
less+0.094
free+0.083
 okay+0.077
inky+0.075
 warm+0.075
Neg logit contributions
rob-0.129
 obstacles-0.126
 obstacle-0.124
 facts-0.120
 battles-0.120
Token and
Token ID290
Feature activation+6.79e-03

Pos logit contributions
less+0.155
free+0.132
 okay+0.124
inky+0.118
etsy+0.118
Neg logit contributions
rob-0.267
 obstacles-0.262
 obstacle-0.262
 facts-0.249
 battles-0.249
Token now
Token ID783
Feature activation+3.60e-02

Pos logit contributions
less+0.008
free+0.008
 okay+0.008
 warm+0.008
 wet+0.008
Neg logit contributions
rob-0.013
 obstacles-0.011
 obstacle-0.011
 battles-0.011
 facts-0.011
Token it
Token ID340
Feature activation+1.58e-03

Pos logit contributions
less+0.039
 okay+0.035
free+0.034
 warm+0.033
 wet+0.032
Neg logit contributions
rob-0.074
 obstacles-0.068
 obstacle-0.066
 facts-0.066
 battles-0.065
Token's
Token ID338
Feature activation+9.90e-03

Pos logit contributions
less+0.002
free+0.002
 wet+0.002
 okay+0.002
 warm+0.002
Neg logit contributions
rob-0.003
 obstacles-0.003
 facts-0.002
 obstacle-0.002
 battles-0.002
Token hard
Token ID1327
Feature activation-5.14e-03

Pos logit contributions
less+0.012
free+0.010
 okay+0.010
inky+0.009
 warm+0.009
Neg logit contributions
rob-0.018
 obstacles-0.018
 obstacle-0.018
 facts-0.017
 battles-0.017
Token she
Token ID673
Feature activation-1.12e-03

Pos logit contributions
less+0.012
free+0.010
 okay+0.009
etsy+0.009
inky+0.009
Neg logit contributions
rob-0.019
 obstacles-0.018
 obstacle-0.018
 battles-0.017
 facts-0.017
Token made
Token ID925
Feature activation+3.15e-02

Pos logit contributions
rob+0.002
 obstacles+0.002
 obstacle+0.002
 cliffs+0.002
 battles+0.002
Neg logit contributions
less-0.002
free-0.002
 okay-0.002
 warm-0.001
etsy-0.001
Token a
Token ID257
Feature activation-1.83e-03

Pos logit contributions
less+0.043
etsy+0.040
inky+0.038
free+0.038
eth+0.037
Neg logit contributions
 obstacles-0.049
 facts-0.048
 battles-0.047
 obstacle-0.047
 hears-0.046
Token long
Token ID890
Feature activation+3.43e-02

Pos logit contributions
 obstacles+0.003
 obstacle+0.003
rob+0.003
 battles+0.003
 facts+0.003
Neg logit contributions
less-0.003
free-0.002
etsy-0.002
inky-0.002
 okay-0.002
Token time
Token ID640
Feature activation+7.08e-02

Pos logit contributions
less+0.052
free+0.047
 okay+0.045
etsy+0.044
inky+0.044
Neg logit contributions
rob-0.050
 obstacles-0.049
 obstacle-0.049
 facts-0.047
 battles-0.047
Token ago
Token ID2084
Feature activation+1.39e-01

Pos logit contributions
less+0.097
free+0.087
etsy+0.086
inky+0.086
eth+0.082
Neg logit contributions
 obstacles-0.110
rob-0.108
 obstacle-0.107
 facts-0.106
 battles-0.105
Token.
Token ID13
Feature activation+9.51e-03

Pos logit contributions
less+0.155
free+0.134
 okay+0.125
etsy+0.122
inky+0.121
Neg logit contributions
rob-0.262
 obstacles-0.258
 obstacle-0.255
 facts-0.245
 battles-0.245
Token Rose
Token ID8049
Feature activation-1.98e-02

Pos logit contributions
less+0.012
free+0.010
 okay+0.009
etsy+0.009
 warm+0.009
Neg logit contributions
 obstacles-0.017
der-0.016
 obstacle-0.016
rob-0.016
 facts-0.016
Token had
Token ID550
Feature activation+4.20e-03

Pos logit contributions
rob+0.037
 obstacles+0.035
 obstacle+0.035
 facts+0.034
 cliffs+0.033
Neg logit contributions
less-0.021
 clean-0.021
 okay-0.021
 warm-0.020
 wet-0.020
Token promised
Token ID8072
Feature activation+2.20e-02

Pos logit contributions
less+0.004
 okay+0.004
free+0.004
 warm+0.003
 wet+0.003
Neg logit contributions
rob-0.009
 obstacle-0.008
 obstacles-0.008
 cliffs-0.008
der-0.008
Token always
Token ID1464
Feature activation-1.23e-02

Pos logit contributions
less+0.024
free+0.020
 okay+0.019
etsy+0.018
 warm+0.018
Neg logit contributions
rob-0.044
 obstacle-0.043
 obstacles-0.042
 battles-0.040
 facts-0.040
Token ac
Token ID936
Feature activation-2.71e-02

Pos logit contributions
rob+0.009
 obstacles+0.009
 obstacle+0.009
 facts+0.009
 battles+0.009
Neg logit contributions
less-0.007
free-0.006
 okay-0.006
 warm-0.006
inky-0.005
Tokenorn
Token ID1211
Feature activation-1.25e-02

Pos logit contributions
 obstacles+0.042
 obstacle+0.041
rob+0.041
 facts+0.040
 battles+0.040
Neg logit contributions
less-0.039
free-0.035
inky-0.033
etsy-0.033
 okay-0.032
Token a
Token ID257
Feature activation+7.68e-03

Pos logit contributions
rob+0.025
 obstacles+0.024
 obstacle+0.023
 facts+0.023
 battles+0.022
Neg logit contributions
less-0.014
free-0.012
 okay-0.011
 warm-0.011
inky-0.011
Token few
Token ID1178
Feature activation-4.25e-03

Pos logit contributions
less+0.009
free+0.008
 okay+0.007
inky+0.007
etsy+0.007
Neg logit contributions
rob-0.014
 obstacles-0.014
 obstacle-0.014
 facts-0.013
 battles-0.013
Token days
Token ID1528
Feature activation+4.78e-02

Pos logit contributions
rob+0.009
 obstacles+0.007
 facts+0.007
 Fer+0.007
 battles+0.007
Neg logit contributions
less-0.005
 warm-0.004
free-0.004
 wet-0.004
ucky-0.004
Token ago
Token ID2084
Feature activation+1.38e-01

Pos logit contributions
less+0.058
free+0.051
 okay+0.047
 warm+0.046
etsy+0.046
Neg logit contributions
rob-0.087
 obstacles-0.085
 obstacle-0.084
 facts-0.081
 battles-0.081
Token so
Token ID523
Feature activation+8.45e-03

Pos logit contributions
less+0.158
free+0.136
 okay+0.128
etsy+0.123
inky+0.123
Neg logit contributions
rob-0.258
 obstacles-0.253
 obstacle-0.252
 battles-0.241
 facts-0.240
Token I
Token ID314
Feature activation+1.81e-02

Pos logit contributions
less+0.011
free+0.010
 okay+0.009
 warm+0.009
inky+0.009
Neg logit contributions
rob-0.015
 obstacles-0.014
 obstacle-0.014
 battles-0.014
 facts-0.014
Token could
Token ID714
Feature activation+7.08e-02

Pos logit contributions
less+0.024
free+0.022
 okay+0.021
etsy+0.020
 warm+0.020
Neg logit contributions
rob-0.032
 obstacles-0.030
 obstacle-0.029
 battles-0.029
 facts-0.029
Token have
Token ID423
Feature activation-5.17e-03

Pos logit contributions
less+0.094
free+0.085
 okay+0.084
 warm+0.082
inky+0.080
Neg logit contributions
rob-0.126
 obstacles-0.118
 obstacle-0.115
 battles-0.111
 facts-0.110
Token some
Token ID617
Feature activation-1.06e-02

Pos logit contributions
rob+0.010
 obstacle+0.009
 obstacles+0.009
 facts+0.009
 battles+0.009
Neg logit contributions
less-0.006
 okay-0.005
 warm-0.005
free-0.005
 wet-0.005
Token painted
Token ID13055
Feature activation+1.88e-02

Pos logit contributions
less+0.013
free+0.011
 okay+0.011
 warm+0.011
 wet+0.011
Neg logit contributions
rob-0.016
 obstacle-0.015
 obstacles-0.015
 facts-0.014
der-0.014
Token them
Token ID606
Feature activation+4.19e-02

Pos logit contributions
less+0.029
free+0.026
 okay+0.025
etsy+0.025
inky+0.025
Neg logit contributions
rob-0.027
 obstacles-0.027
 obstacle-0.026
 facts-0.025
 battles-0.025
Token a
Token ID257
Feature activation+5.15e-03

Pos logit contributions
less+0.060
free+0.053
etsy+0.053
inky+0.051
eth+0.050
Neg logit contributions
 obstacles-0.064
 obstacle-0.063
rob-0.062
 battles-0.061
 facts-0.061
Token long
Token ID890
Feature activation+3.06e-02

Pos logit contributions
less+0.007
free+0.006
 okay+0.006
etsy+0.006
inky+0.006
Neg logit contributions
rob-0.009
 obstacles-0.008
 obstacle-0.008
 battles-0.008
 facts-0.008
Token time
Token ID640
Feature activation+6.78e-02

Pos logit contributions
less+0.040
free+0.035
 okay+0.033
inky+0.033
 warm+0.033
Neg logit contributions
rob-0.052
 obstacles-0.050
 obstacle-0.049
 facts-0.048
der-0.047
Token ago
Token ID2084
Feature activation+1.38e-01

Pos logit contributions
less+0.092
free+0.084
etsy+0.082
inky+0.082
 okay+0.078
Neg logit contributions
 obstacles-0.106
rob-0.104
 obstacle-0.102
 facts-0.102
 battles-0.101
Token.
Token ID13
Feature activation+6.46e-03

Pos logit contributions
less+0.173
free+0.159
inky+0.148
etsy+0.147
 okay+0.144
Neg logit contributions
 obstacles-0.229
rob-0.229
 battles-0.220
 facts-0.219
 obstacle-0.219
Token She
Token ID1375
Feature activation+6.92e-05

Pos logit contributions
less+0.008
free+0.006
 okay+0.006
inky+0.006
etsy+0.005
Neg logit contributions
der-0.012
 obstacles-0.012
 obstacle-0.011
rob-0.011
 facts-0.011
Token thought
Token ID1807
Feature activation-3.14e-02

Pos logit contributions
 wet+0.000
 warm+0.000
 clean+0.000
 careful+0.000
 dry+0.000
Neg logit contributions
rob-0.000
 obstacle-0.000
 facts-0.000
 obstacles-0.000
 battles-0.000
Token they
Token ID484
Feature activation-2.39e-02

Pos logit contributions
rob+0.059
 obstacles+0.055
 obstacle+0.055
 battles+0.053
 facts+0.053
Neg logit contributions
less-0.038
free-0.032
 okay-0.032
 warm-0.030
 wet-0.029
Token were
Token ID547
Feature activation-1.81e-02

Pos logit contributions
rob+0.042
 obstacle+0.040
 obstacles+0.040
 facts+0.038
 battles+0.038
Neg logit contributions
less-0.031
free-0.029
 okay-0.027
 warm-0.027
 wet-0.027
Token the
Token ID262
Feature activation-7.43e-03

Pos logit contributions
rob+0.113
 obstacles+0.108
 obstacle+0.108
 facts+0.105
 battles+0.105
Neg logit contributions
less-0.069
free-0.060
 warm-0.057
 okay-0.056
 wet-0.055
Token wardrobe
Token ID35593
Feature activation-7.27e-03

Pos logit contributions
 obstacles+0.009
 obstacle+0.009
rob+0.008
 cliffs+0.008
 mysteries+0.008
Neg logit contributions
less-0.013
inky-0.012
etsy-0.012
free-0.012
 okay-0.011
Token a
Token ID257
Feature activation+1.13e-02

Pos logit contributions
rob+0.013
 obstacles+0.013
 obstacle+0.012
 facts+0.012
 battles+0.012
Neg logit contributions
less-0.009
free-0.008
 okay-0.007
 warm-0.007
inky-0.007
Token few
Token ID1178
Feature activation-5.21e-03

Pos logit contributions
less+0.015
free+0.013
etsy+0.013
inky+0.013
 okay+0.012
Neg logit contributions
 obstacles-0.019
 obstacle-0.019
rob-0.018
 battles-0.018
 jugg-0.017
Token days
Token ID1528
Feature activation+4.33e-02

Pos logit contributions
rob+0.008
der+0.007
 scoring+0.007
 Fer+0.007
 jugg+0.007
Neg logit contributions
less-0.008
 warm-0.007
 wet-0.007
free-0.007
 clean-0.007
Token ago
Token ID2084
Feature activation+1.37e-01

Pos logit contributions
less+0.063
inky+0.060
free+0.059
etsy+0.058
eth+0.055
Neg logit contributions
 obstacles-0.062
 obstacle-0.058
der-0.058
 cliffs-0.058
rob-0.058
Token,
Token ID11
Feature activation-9.28e-03

Pos logit contributions
less+0.178
free+0.166
inky+0.159
etsy+0.155
 okay+0.151
Neg logit contributions
 obstacles-0.217
rob-0.214
 facts-0.209
 battles-0.209
der-0.208
Token so
Token ID523
Feature activation+3.91e-03

Pos logit contributions
rob+0.015
 obstacles+0.015
 obstacle+0.015
 battles+0.014
 facts+0.014
Neg logit contributions
less-0.012
free-0.011
 okay-0.011
 warm-0.010
etsy-0.010
Token she
Token ID673
Feature activation+9.35e-03

Pos logit contributions
less+0.004
free+0.004
inky+0.004
eth+0.003
etsy+0.003
Neg logit contributions
 obstacles-0.007
 obstacle-0.007
 facts-0.007
der-0.007
 mysteries-0.007
Token opened
Token ID4721
Feature activation+3.19e-02

Pos logit contributions
less+0.013
free+0.012
 okay+0.012
etsy+0.011
 warm+0.011
Neg logit contributions
rob-0.015
 obstacles-0.014
 obstacle-0.014
 facts-0.014
 battles-0.014
Token it
Token ID340
Feature activation+2.03e-02

Pos logit contributions
less+0.053
free+0.048
inky+0.047
etsy+0.047
 okay+0.046
Neg logit contributions
 obstacles-0.042
rob-0.040
 obstacle-0.040
 cliffs-0.039
 hears-0.039
Token who
Token ID508
Feature activation+1.86e-03

Pos logit contributions
rob+0.008
 obstacle+0.008
 obstacles+0.008
 facts+0.008
 battles+0.007
Neg logit contributions
less-0.008
 okay-0.007
free-0.007
 warm-0.007
etsy-0.007
Token had
Token ID550
Feature activation+1.45e-03

Pos logit contributions
less+0.003
free+0.002
 warm+0.002
 wet+0.002
 clean+0.002
Neg logit contributions
 obstacles-0.003
 obstacle-0.003
rob-0.003
 battles-0.003
 cliffs-0.003
Token died
Token ID3724
Feature activation+5.13e-02

Pos logit contributions
less+0.002
 okay+0.001
 warm+0.001
free+0.001
 wet+0.001
Neg logit contributions
rob-0.003
 obstacle-0.003
 obstacles-0.003
der-0.002
 facts-0.002
Token a
Token ID257
Feature activation+1.71e-02

Pos logit contributions
less+0.062
free+0.054
etsy+0.050
inky+0.050
 okay+0.050
Neg logit contributions
rob-0.091
 obstacles-0.090
 obstacle-0.088
 facts-0.086
 battles-0.086
Token year
Token ID614
Feature activation+4.59e-02

Pos logit contributions
less+0.023
free+0.020
 okay+0.019
etsy+0.019
 warm+0.019
Neg logit contributions
rob-0.029
 obstacles-0.028
 obstacle-0.028
 facts-0.027
 battles-0.027
Token ago
Token ID2084
Feature activation+1.37e-01

Pos logit contributions
less+0.059
free+0.053
etsy+0.050
inky+0.049
 okay+0.049
Neg logit contributions
 obstacles-0.076
 obstacle-0.075
rob-0.075
 facts-0.073
 battles-0.073
Token.
Token ID13
Feature activation+2.47e-02

Pos logit contributions
less+0.167
free+0.148
etsy+0.137
inky+0.137
 okay+0.137
Neg logit contributions
rob-0.239
 obstacles-0.236
 obstacle-0.231
 facts-0.226
 battles-0.226
Token It
Token ID632
Feature activation+8.87e-03

Pos logit contributions
less+0.030
free+0.026
inky+0.024
etsy+0.023
 warm+0.023
Neg logit contributions
rob-0.043
 obstacles-0.042
der-0.041
 facts-0.041
 obstacle-0.041
Token was
Token ID373
Feature activation-3.29e-03

Pos logit contributions
less+0.013
free+0.012
 okay+0.012
 warm+0.012
 wet+0.011
Neg logit contributions
 obstacles-0.013
rob-0.013
 obstacle-0.013
 battles-0.012
 facts-0.012
Token very
Token ID845
Feature activation+2.79e-03

Pos logit contributions
 obstacles+0.005
rob+0.005
 obstacle+0.005
 facts+0.005
 battles+0.005
Neg logit contributions
less-0.004
free-0.004
etsy-0.004
inky-0.003
eth-0.003
Token precious
Token ID14186
Feature activation-1.42e-02

Pos logit contributions
less+0.004
free+0.004
etsy+0.004
opped+0.004
inky+0.004
Neg logit contributions
 obstacle-0.004
 facts-0.004
 obstacles-0.004
 hears-0.004
der-0.004
Token by
Token ID416
Feature activation-8.07e-03

Pos logit contributions
less+0.036
etsy+0.031
inky+0.029
eth+0.029
free+0.029
Neg logit contributions
 obstacles-0.079
 obstacle-0.077
 battles-0.077
rob-0.074
 facts-0.074
Token people
Token ID661
Feature activation+2.42e-03

Pos logit contributions
rob+0.015
 obstacle+0.013
 Fer+0.013
 obstacles+0.013
 jugg+0.013
Neg logit contributions
 okay-0.010
free-0.010
less-0.010
 warm-0.009
 wet-0.009
Token a
Token ID257
Feature activation+3.63e-03

Pos logit contributions
less+0.004
etsy+0.003
inky+0.003
free+0.003
isc+0.003
Neg logit contributions
 obstacles-0.003
 obstacle-0.003
 battles-0.003
 facts-0.003
 hears-0.003
Token long
Token ID890
Feature activation+3.56e-02

Pos logit contributions
less+0.004
 warm+0.004
free+0.004
 okay+0.004
 wet+0.004
Neg logit contributions
rob-0.007
der-0.006
 obstacles-0.006
 facts-0.006
 obstacle-0.006
Token time
Token ID640
Feature activation+6.74e-02

Pos logit contributions
less+0.034
free+0.030
 warm+0.029
 okay+0.029
eth+0.028
Neg logit contributions
rob-0.072
 obstacles-0.067
 obstacle-0.066
der-0.066
 facts-0.065
Token ago
Token ID2084
Feature activation+1.37e-01

Pos logit contributions
less+0.074
free+0.063
 okay+0.062
 warm+0.058
 wet+0.057
Neg logit contributions
rob-0.137
 obstacle-0.128
 obstacles-0.125
 Fer-0.121
 battles-0.119
Token in
Token ID287
Feature activation-5.45e-02

Pos logit contributions
less+0.147
free+0.124
 okay+0.117
etsy+0.112
inky+0.111
Neg logit contributions
rob-0.266
 obstacle-0.261
 obstacles-0.261
 facts-0.248
 battles-0.247
Token history
Token ID2106
Feature activation-3.65e-02

Pos logit contributions
rob+0.116
der+0.106
 facts+0.098
 concepts+0.097
 jugg+0.097
Neg logit contributions
 warm-0.052
less-0.052
 okay-0.050
 wet-0.049
free-0.048
Token.
Token ID13
Feature activation-4.00e-03

Pos logit contributions
rob+0.084
 obstacles+0.077
 obstacle+0.077
 accompl+0.076
 jugg+0.074
Neg logit contributions
less-0.027
 warm-0.023
free-0.022
 okay-0.020
 wet-0.020
Token Blue
Token ID4518
Feature activation-2.82e-02

Pos logit contributions
rob+0.009
 obstacles+0.009
 obstacle+0.009
 facts+0.008
 jugg+0.008
Neg logit contributions
free-0.003
less-0.003
inky-0.003
 okay-0.003
etsy-0.003
Token didn
Token ID1422
Feature activation+4.75e-02

Pos logit contributions
rob+0.046
 obstacle+0.045
 obstacles+0.045
 facts+0.044
 accompl+0.043
Neg logit contributions
less-0.036
free-0.034
inky-0.033
 warm-0.033
etsy-0.033
Token same
Token ID976
Feature activation+1.09e-03

Pos logit contributions
rob+0.021
 circus+0.019
 Fer+0.019
 obstacles+0.019
 fer+0.019
Neg logit contributions
less-0.023
etsy-0.022
inky-0.021
free-0.021
ung-0.021
Token dress
Token ID6576
Feature activation+1.64e-03

Pos logit contributions
less+0.001
free+0.001
inky+0.001
etsy+0.001
 okay+0.001
Neg logit contributions
rob-0.002
 obstacles-0.002
 obstacle-0.002
 battles-0.002
 facts-0.002
Token a
Token ID257
Feature activation+2.47e-03

Pos logit contributions
less+0.002
free+0.002
inky+0.002
etsy+0.002
 okay+0.002
Neg logit contributions
rob-0.003
 obstacles-0.002
 obstacle-0.002
der-0.002
 battles-0.002
Token long
Token ID890
Feature activation+3.27e-02

Pos logit contributions
less+0.003
free+0.003
etsy+0.003
inky+0.003
isc+0.003
Neg logit contributions
rob-0.004
 jugg-0.004
 circus-0.004
 battles-0.004
 obstacles-0.004
Token time
Token ID640
Feature activation+6.67e-02

Pos logit contributions
less+0.043
free+0.039
inky+0.035
etsy+0.035
 okay+0.034
Neg logit contributions
rob-0.054
 obstacles-0.053
 battles-0.051
 obstacle-0.051
 jugg-0.050
Token ago
Token ID2084
Feature activation+1.37e-01

Pos logit contributions
less+0.088
free+0.081
inky+0.080
etsy+0.079
 okay+0.076
Neg logit contributions
 obstacles-0.104
rob-0.103
 obstacle-0.101
 facts-0.100
 battles-0.099
Token.
Token ID13
Feature activation+6.89e-03

Pos logit contributions
less+0.164
free+0.146
inky+0.135
 okay+0.135
etsy+0.134
Neg logit contributions
rob-0.241
 obstacles-0.238
 obstacle-0.233
 facts-0.228
 battles-0.227
Token<|endoftext|>
Token ID50256
Feature activation-2.99e-02

Pos logit contributions
less+0.011
free+0.009
inky+0.008
 okay+0.008
etsy+0.008
Neg logit contributions
rob-0.010
der-0.009
 obstacles-0.009
 hears-0.009
 facts-0.009
TokenOnce
Token ID7454
Feature activation+1.32e-02

Pos logit contributions
rob+0.048
 obstacles+0.045
 hears+0.044
 obstacle+0.043
 battles+0.043
Neg logit contributions
less-0.042
free-0.037
etsy-0.035
 okay-0.035
inky-0.035
Token upon
Token ID2402
Feature activation+8.91e-03

Pos logit contributions
less+0.015
free+0.015
inky+0.013
opped+0.013
etsy+0.013
Neg logit contributions
 hears-0.022
 obstacle-0.021
 obstacles-0.021
rob-0.021
der-0.020
Token a
Token ID257
Feature activation+1.34e-03

Pos logit contributions
less+0.014
inky+0.013
eth+0.012
etsy+0.012
isc+0.012
Neg logit contributions
 hears-0.012
rob-0.012
 obstacles-0.012
 battles-0.012
 jugg-0.012
Token Jim
Token ID5395
Feature activation-1.73e-02

Pos logit contributions
less+0.097
 warm+0.082
free+0.082
 okay+0.082
 wet+0.079
Neg logit contributions
rob-0.175
 obstacle-0.163
 obstacles-0.162
der-0.159
 facts-0.157
Token from
Token ID422
Feature activation-2.71e-02

Pos logit contributions
 obstacles+0.027
rob+0.026
 obstacle+0.026
 battles+0.026
 hears+0.025
Neg logit contributions
less-0.024
free-0.022
etsy-0.020
inky-0.020
 okay-0.020
Token a
Token ID257
Feature activation+9.18e-03

Pos logit contributions
rob+0.051
 obstacles+0.048
 obstacle+0.048
der+0.048
 battles+0.048
Neg logit contributions
less-0.030
 warm-0.027
 wet-0.026
free-0.026
 okay-0.024
Token long
Token ID890
Feature activation+4.00e-02

Pos logit contributions
less+0.011
free+0.010
 okay+0.009
etsy+0.009
inky+0.009
Neg logit contributions
rob-0.016
 obstacles-0.016
 obstacle-0.016
 facts-0.015
 battles-0.015
Token time
Token ID640
Feature activation+6.60e-02

Pos logit contributions
less+0.050
free+0.045
 okay+0.042
etsy+0.041
inky+0.041
Neg logit contributions
rob-0.069
 obstacles-0.068
 obstacle-0.067
 facts-0.065
 battles-0.064
Token ago
Token ID2084
Feature activation+1.37e-01

Pos logit contributions
less+0.095
free+0.084
 okay+0.080
etsy+0.077
 warm+0.077
Neg logit contributions
rob-0.105
 obstacles-0.102
 obstacle-0.101
 facts-0.097
 battles-0.096
Token,
Token ID11
Feature activation-4.57e-03

Pos logit contributions
less+0.164
free+0.147
inky+0.136
etsy+0.135
 okay+0.134
Neg logit contributions
rob-0.239
 obstacles-0.237
 obstacle-0.230
 battles-0.227
 facts-0.227
Token and
Token ID290
Feature activation-5.69e-05

Pos logit contributions
rob+0.007
 obstacles+0.007
 obstacle+0.007
 facts+0.007
 battles+0.007
Neg logit contributions
less-0.006
free-0.006
 okay-0.005
 warm-0.005
etsy-0.005
Token he
Token ID339
Feature activation-1.69e-02

Pos logit contributions
rob+0.000
 obstacle+0.000
 obstacles+0.000
 battles+0.000
 facts+0.000
Neg logit contributions
less-0.000
 okay-0.000
 warm-0.000
 wet-0.000
free-0.000
Token was
Token ID373
Feature activation-8.01e-03

Pos logit contributions
rob+0.032
 obstacle+0.030
 obstacles+0.030
 facts+0.029
 accompl+0.028
Neg logit contributions
less-0.019
 okay-0.018
 wet-0.018
 warm-0.017
 clean-0.017
Token very
Token ID845
Feature activation+3.06e-03

Pos logit contributions
rob+0.014
 obstacles+0.014
 obstacle+0.014
 facts+0.013
 battles+0.013
Neg logit contributions
less-0.010
free-0.009
 okay-0.008
 warm-0.008
inky-0.008
Token had
Token ID550
Feature activation-3.97e-03

Pos logit contributions
less+0.002
free+0.002
 warm+0.002
 okay+0.002
 wet+0.002
Neg logit contributions
rob-0.002
 obstacle-0.002
 obstacles-0.002
der-0.002
 battles-0.002
Token lived
Token ID5615
Feature activation+9.70e-03

Pos logit contributions
rob+0.007
 obstacle+0.007
 obstacles+0.006
der+0.006
 facts+0.006
Neg logit contributions
less-0.005
free-0.005
 warm-0.005
 okay-0.005
 wet-0.004
Token a
Token ID257
Feature activation+1.78e-02

Pos logit contributions
less+0.010
free+0.009
 okay+0.008
 warm+0.008
 wet+0.007
Neg logit contributions
rob-0.020
 obstacle-0.019
 obstacles-0.019
 facts-0.018
der-0.018
Token long
Token ID890
Feature activation+3.86e-02

Pos logit contributions
less+0.024
 warm+0.021
free+0.021
 okay+0.021
 wet+0.020
Neg logit contributions
rob-0.031
 obstacles-0.029
 obstacle-0.028
 facts-0.028
der-0.028
Token time
Token ID640
Feature activation+6.73e-02

Pos logit contributions
less+0.043
 warm+0.037
free+0.037
 okay+0.037
 wet+0.037
Neg logit contributions
rob-0.071
 obstacles-0.067
 obstacle-0.067
 facts-0.065
der-0.065
Token ago
Token ID2084
Feature activation+1.37e-01

Pos logit contributions
less+0.078
free+0.069
 okay+0.064
 warm+0.062
 wet+0.061
Neg logit contributions
rob-0.128
 obstacle-0.121
 obstacles-0.120
 facts-0.115
 battles-0.114
Token.
Token ID13
Feature activation+3.96e-03

Pos logit contributions
less+0.156
free+0.134
 okay+0.125
etsy+0.123
inky+0.122
Neg logit contributions
rob-0.253
 obstacles-0.249
 obstacle-0.247
 facts-0.237
 battles-0.236
Token
Token ID220
Feature activation-2.47e-02

Pos logit contributions
less+0.005
free+0.005
 okay+0.004
etsy+0.004
 warm+0.004
Neg logit contributions
der-0.006
rob-0.006
 obstacles-0.006
 obstacle-0.006
 battles-0.006
Token
Token ID198
Feature activation+5.33e-03

Pos logit contributions
rob+0.037
 obstacles+0.036
 hears+0.034
der+0.034
 obstacle+0.034
Neg logit contributions
less-0.036
free-0.031
etsy-0.029
 warm-0.029
 wet-0.028
Token
Token ID198
Feature activation+7.36e-03

Pos logit contributions
less+0.008
free+0.007
 okay+0.007
etsy+0.006
isc+0.006
Neg logit contributions
rob-0.008
 obstacles-0.007
 hears-0.007
 obstacle-0.007
 facts-0.007
TokenL
Token ID43
Feature activation+3.07e-02

Pos logit contributions
less+0.009
free+0.007
 okay+0.007
isc+0.007
etsy+0.007
Neg logit contributions
 obstacles-0.013
 obstacle-0.013
rob-0.013
 battles-0.012
der-0.012
Token someone
Token ID2130
Feature activation+2.79e-03

Pos logit contributions
rob+0.020
 obstacle+0.018
 obstacles+0.018
 battles+0.017
 Fer+0.017
Neg logit contributions
less-0.012
free-0.011
 okay-0.010
 warm-0.010
 wet-0.010
Token a
Token ID257
Feature activation+7.56e-03

Pos logit contributions
less+0.004
free+0.003
 okay+0.003
 warm+0.003
etsy+0.003
Neg logit contributions
rob-0.005
 obstacles-0.005
 obstacle-0.005
 facts-0.004
 battles-0.004
Token very
Token ID845
Feature activation+5.71e-03

Pos logit contributions
less+0.011
free+0.010
 okay+0.009
 warm+0.009
etsy+0.009
Neg logit contributions
rob-0.012
 obstacles-0.012
 obstacle-0.011
 facts-0.011
der-0.011
Token long
Token ID890
Feature activation+3.18e-02

Pos logit contributions
less+0.007
free+0.006
 okay+0.005
 warm+0.005
inky+0.005
Neg logit contributions
rob-0.011
 obstacles-0.010
 obstacle-0.010
 facts-0.010
 battles-0.010
Token time
Token ID640
Feature activation+6.33e-02

Pos logit contributions
less+0.035
free+0.030
 okay+0.028
etsy+0.028
inky+0.028
Neg logit contributions
rob-0.060
 obstacles-0.059
 obstacle-0.058
 facts-0.056
 battles-0.056
Token ago
Token ID2084
Feature activation+1.36e-01

Pos logit contributions
less+0.081
free+0.071
 okay+0.068
etsy+0.065
 warm+0.065
Neg logit contributions
rob-0.111
 obstacles-0.107
 obstacle-0.106
 battles-0.102
 facts-0.102
Token.
Token ID13
Feature activation-1.07e-03

Pos logit contributions
less+0.155
free+0.135
 okay+0.126
etsy+0.124
inky+0.123
Neg logit contributions
rob-0.251
 obstacles-0.248
 obstacle-0.245
 facts-0.237
 battles-0.236
Token The
Token ID383
Feature activation-1.47e-02

Pos logit contributions
rob+0.002
 obstacles+0.002
der+0.002
 obstacle+0.002
 facts+0.002
Neg logit contributions
less-0.002
free-0.001
 okay-0.001
inky-0.001
 wet-0.001
Token little
Token ID1310
Feature activation+2.81e-02

Pos logit contributions
 obstacle+0.028
 obstacles+0.028
 facts+0.026
 mysteries+0.026
 battles+0.026
Neg logit contributions
less-0.016
free-0.014
inky-0.014
eth-0.014
etsy-0.013
Token girl
Token ID2576
Feature activation-2.09e-02

Pos logit contributions
less+0.044
inky+0.043
free+0.041
isc+0.041
ittens+0.040
Neg logit contributions
 acts-0.040
 obstacle-0.040
 obstacles-0.038
 mysteries-0.037
 believes-0.036
Token thought
Token ID1807
Feature activation-4.01e-02

Pos logit contributions
rob+0.039
 obstacle+0.035
 obstacles+0.035
 facts+0.033
 battles+0.033
Neg logit contributions
less-0.026
 warm-0.024
 okay-0.024
 clean-0.024
etsy-0.024
Token best
Token ID1266
Feature activation-3.63e-02

Pos logit contributions
less+0.036
free+0.032
etsy+0.029
inky+0.029
 okay+0.028
Neg logit contributions
rob-0.065
 obstacles-0.064
 obstacle-0.063
 facts-0.062
 battles-0.062
Token friend
Token ID1545
Feature activation+1.01e-02

Pos logit contributions
rob+0.065
 obstacles+0.064
 obstacle+0.063
 battles+0.062
 facts+0.062
Neg logit contributions
less-0.041
free-0.037
etsy-0.033
 okay-0.033
inky-0.033
Token a
Token ID257
Feature activation+6.81e-03

Pos logit contributions
less+0.013
free+0.012
inky+0.011
eth+0.011
etsy+0.011
Neg logit contributions
rob-0.016
 obstacles-0.016
 obstacle-0.015
 battles-0.015
 facts-0.015
Token long
Token ID890
Feature activation+2.90e-02

Pos logit contributions
less+0.009
free+0.008
 okay+0.007
etsy+0.007
inky+0.007
Neg logit contributions
rob-0.012
 obstacles-0.012
 obstacle-0.011
 facts-0.011
 battles-0.011
Token time
Token ID640
Feature activation+6.71e-02

Pos logit contributions
less+0.034
free+0.030
 okay+0.028
inky+0.028
etsy+0.028
Neg logit contributions
rob-0.052
 obstacles-0.051
 obstacle-0.051
 battles-0.049
 facts-0.049
Token ago
Token ID2084
Feature activation+1.36e-01

Pos logit contributions
less+0.088
free+0.079
etsy+0.077
inky+0.077
eth+0.073
Neg logit contributions
 obstacles-0.108
rob-0.106
 obstacle-0.105
 facts-0.103
 battles-0.103
Token.
Token ID13
Feature activation+1.16e-02

Pos logit contributions
less+0.155
free+0.135
 okay+0.125
etsy+0.123
inky+0.123
Neg logit contributions
rob-0.250
 obstacles-0.247
 obstacle-0.244
 facts-0.235
 battles-0.234
Token He
Token ID679
Feature activation-2.51e-02

Pos logit contributions
less+0.019
free+0.017
 wet+0.016
inky+0.016
 warm+0.016
Neg logit contributions
der-0.014
rob-0.014
 facts-0.014
 obstacles-0.014
 hears-0.014
Token was
Token ID373
Feature activation-3.07e-03

Pos logit contributions
rob+0.030
 obstacles+0.030
 obstacle+0.029
 facts+0.027
 battles+0.027
Neg logit contributions
less-0.045
free-0.041
 okay-0.039
inky-0.039
etsy-0.039
Token my
Token ID616
Feature activation+4.58e-02

Pos logit contributions
rob+0.005
 obstacles+0.005
 obstacle+0.005
 battles+0.005
 facts+0.005
Neg logit contributions
less-0.004
free-0.004
etsy-0.003
inky-0.003
 okay-0.003
Token dog
Token ID3290
Feature activation-1.31e-02

Pos logit contributions
less+0.055
free+0.051
etsy+0.047
inky+0.046
eth+0.044
Neg logit contributions
 obstacles-0.078
 obstacle-0.077
rob-0.076
 facts-0.075
 battles-0.074
Token who
Token ID508
Feature activation+8.43e-03

Pos logit contributions
rob+0.038
 facts+0.036
 hears+0.035
 obstacle+0.035
 obstacles+0.034
Neg logit contributions
less-0.043
free-0.039
inky-0.038
etsy-0.037
eth-0.036
Token lived
Token ID5615
Feature activation+1.75e-02

Pos logit contributions
less+0.012
free+0.012
 warm+0.011
 okay+0.011
 wet+0.011
Neg logit contributions
 obstacles-0.013
 obstacle-0.013
rob-0.012
der-0.012
 battles-0.012
Token a
Token ID257
Feature activation+1.84e-02

Pos logit contributions
less+0.019
free+0.016
 okay+0.015
etsy+0.015
inky+0.015
Neg logit contributions
rob-0.034
 obstacles-0.033
 obstacle-0.033
 facts-0.032
 battles-0.032
Token long
Token ID890
Feature activation+3.88e-02

Pos logit contributions
less+0.027
free+0.025
 warm+0.024
 okay+0.023
inky+0.023
Neg logit contributions
rob-0.028
 obstacles-0.028
 obstacle-0.027
 facts-0.026
der-0.026
Token time
Token ID640
Feature activation+6.95e-02

Pos logit contributions
less+0.053
free+0.047
inky+0.044
 okay+0.044
etsy+0.044
Neg logit contributions
rob-0.062
 obstacles-0.061
 obstacle-0.060
 facts-0.058
 battles-0.057
Token ago
Token ID2084
Feature activation+1.35e-01

Pos logit contributions
less+0.085
free+0.075
 okay+0.070
inky+0.069
etsy+0.069
Neg logit contributions
rob-0.122
 obstacles-0.120
 obstacle-0.119
 facts-0.115
 battles-0.115
Token,"
Token ID553
Feature activation-4.07e-03

Pos logit contributions
less+0.153
free+0.133
 okay+0.123
etsy+0.122
inky+0.122
Neg logit contributions
rob-0.250
 obstacles-0.247
 obstacle-0.243
 facts-0.236
 battles-0.236
Token mom
Token ID1995
Feature activation+1.62e-02

Pos logit contributions
rob+0.009
 obstacles+0.009
 obstacle+0.009
 battles+0.009
 challenges+0.009
Neg logit contributions
less-0.003
 okay-0.002
 warm-0.002
free-0.002
 wet-0.002
Token said
Token ID531
Feature activation+2.68e-02

Pos logit contributions
less+0.020
free+0.018
 okay+0.017
etsy+0.017
inky+0.017
Neg logit contributions
rob-0.028
 obstacles-0.028
 obstacle-0.027
 facts-0.026
 battles-0.026
Token.
Token ID13
Feature activation+1.47e-02

Pos logit contributions
isc+0.041
less+0.040
inky+0.039
etsy+0.039
There+0.039
Neg logit contributions
der-0.033
 hears-0.030
 seconds-0.028
 acts-0.028
 compared-0.027
Token
Token ID198
Feature activation+9.15e-03

Pos logit contributions
less+0.017
free+0.015
 okay+0.014
etsy+0.013
inky+0.013
Neg logit contributions
rob-0.027
 obstacles-0.027
 obstacle-0.027
 facts-0.026
 battles-0.026
Token her
Token ID607
Feature activation+2.71e-02

Pos logit contributions
rob+0.081
 obstacles+0.078
 obstacle+0.078
 facts+0.074
 battles+0.074
Neg logit contributions
less-0.043
free-0.037
 okay-0.035
 warm-0.033
etsy-0.033
Token birthday
Token ID10955
Feature activation-1.14e-02

Pos logit contributions
less+0.033
free+0.030
inky+0.028
etsy+0.028
 okay+0.027
Neg logit contributions
 obstacles-0.046
rob-0.046
 obstacle-0.045
 facts-0.044
 battles-0.044
Token a
Token ID257
Feature activation+1.81e-03

Pos logit contributions
rob+0.020
 obstacles+0.020
 obstacle+0.020
 facts+0.019
 battles+0.019
Neg logit contributions
less-0.014
free-0.012
 okay-0.011
etsy-0.011
inky-0.011
Token few
Token ID1178
Feature activation-8.79e-03

Pos logit contributions
less+0.002
free+0.002
 okay+0.002
etsy+0.002
inky+0.002
Neg logit contributions
rob-0.003
 obstacles-0.003
 obstacle-0.003
 facts-0.003
 battles-0.003
Token days
Token ID1528
Feature activation+4.40e-02

Pos logit contributions
rob+0.015
 cliffs+0.012
 hears+0.012
 obstacles+0.012
 obstacle+0.012
Neg logit contributions
less-0.012
 warm-0.011
free-0.011
 wet-0.010
 clean-0.010
Token ago
Token ID2084
Feature activation+1.35e-01

Pos logit contributions
less+0.058
free+0.053
etsy+0.051
 okay+0.051
inky+0.051
Neg logit contributions
rob-0.068
 obstacles-0.067
der-0.066
 obstacle-0.065
 battles-0.064
Token.
Token ID13
Feature activation+4.72e-03

Pos logit contributions
less+0.177
free+0.163
inky+0.155
etsy+0.153
 okay+0.151
Neg logit contributions
rob-0.213
 obstacles-0.211
 facts-0.206
der-0.204
 battles-0.204
Token She
Token ID1375
Feature activation+1.91e-03

Pos logit contributions
less+0.006
free+0.005
 okay+0.005
inky+0.005
etsy+0.005
Neg logit contributions
rob-0.008
 obstacles-0.008
 obstacle-0.008
 facts-0.008
 battles-0.008
Token liked
Token ID8288
Feature activation+1.34e-02

Pos logit contributions
less+0.002
 warm+0.002
 clean+0.002
 wet+0.002
 okay+0.002
Neg logit contributions
rob-0.004
 obstacle-0.004
 obstacles-0.003
 facts-0.003
 cliffs-0.003
Token to
Token ID284
Feature activation-1.69e-02

Pos logit contributions
less+0.013
free+0.012
 warm+0.011
 okay+0.010
 wet+0.010
Neg logit contributions
rob-0.027
 obstacle-0.026
 obstacles-0.025
der-0.024
 hears-0.024
Token practice
Token ID3357
Feature activation+4.82e-06

Pos logit contributions
rob+0.032
 obstacles+0.030
 obstacle+0.029
der+0.028
 cliffs+0.028
Neg logit contributions
less-0.018
 warm-0.018
free-0.018
 clean-0.017
 okay-0.017
Token is
Token ID318
Feature activation-1.19e-02

Pos logit contributions
rob+0.003
 obstacles+0.003
 obstacle+0.003
 battles+0.003
 facts+0.003
Neg logit contributions
less-0.003
free-0.003
 wet-0.003
 warm-0.003
 okay-0.003
Token from
Token ID422
Feature activation-2.82e-02

Pos logit contributions
rob+0.022
 obstacles+0.021
 obstacle+0.021
 facts+0.020
 battles+0.020
Neg logit contributions
less-0.014
free-0.012
 okay-0.012
 warm-0.012
inky-0.011
Token a
Token ID257
Feature activation+5.07e-04

Pos logit contributions
rob+0.057
 obstacle+0.053
 obstacles+0.052
der+0.051
 facts+0.051
Neg logit contributions
less-0.030
 warm-0.026
 wet-0.025
free-0.025
 okay-0.025
Token long
Token ID890
Feature activation+3.33e-02

Pos logit contributions
less+0.001
free+0.001
 okay+0.001
 warm+0.001
inky+0.001
Neg logit contributions
rob-0.001
 obstacles-0.001
 obstacle-0.001
 facts-0.001
 battles-0.001
Token time
Token ID640
Feature activation+6.83e-02

Pos logit contributions
less+0.042
free+0.037
 okay+0.035
inky+0.035
etsy+0.035
Neg logit contributions
rob-0.057
 obstacles-0.056
 obstacle-0.055
 facts-0.054
 battles-0.053
Token ago
Token ID2084
Feature activation+1.35e-01

Pos logit contributions
less+0.092
free+0.082
 okay+0.076
etsy+0.076
inky+0.076
Neg logit contributions
rob-0.112
 obstacles-0.110
 obstacle-0.109
 facts-0.105
 battles-0.105
Token,
Token ID11
Feature activation+3.12e-03

Pos logit contributions
less+0.161
free+0.141
 okay+0.131
etsy+0.130
inky+0.130
Neg logit contributions
rob-0.242
 obstacles-0.239
 obstacle-0.235
 facts-0.228
 battles-0.227
Token when
Token ID618
Feature activation+9.00e-02

Pos logit contributions
less+0.004
free+0.004
 okay+0.003
etsy+0.003
inky+0.003
Neg logit contributions
rob-0.005
 obstacles-0.005
 obstacle-0.005
 facts-0.005
 battles-0.005
Token people
Token ID661
Feature activation+1.21e-02

Pos logit contributions
less+0.094
 okay+0.084
 warm+0.081
free+0.079
 wet+0.077
Neg logit contributions
rob-0.183
 obstacle-0.168
 obstacles-0.167
 facts-0.163
 battles-0.163
Token used
Token ID973
Feature activation-2.08e-02

Pos logit contributions
less+0.017
free+0.015
 okay+0.014
 warm+0.014
inky+0.014
Neg logit contributions
rob-0.020
 obstacles-0.019
 obstacle-0.019
 facts-0.018
der-0.018
Token different
Token ID1180
Feature activation-7.53e-03

Pos logit contributions
 obstacles+0.033
 obstacle+0.032
rob+0.032
 facts+0.031
 battles+0.031
Neg logit contributions
less-0.029
free-0.026
etsy-0.025
inky-0.025
 okay-0.024
Token lost
Token ID2626
Feature activation+2.51e-02

Pos logit contributions
rob+0.000
 obstacles+0.000
 obstacle+0.000
 hears+0.000
 facts+0.000
Neg logit contributions
less-0.000
free-0.000
etsy-0.000
inky-0.000
 okay-0.000
Token it
Token ID340
Feature activation+6.99e-03

Pos logit contributions
less+0.030
free+0.027
 okay+0.025
 warm+0.025
inky+0.024
Neg logit contributions
rob-0.046
 obstacles-0.044
 obstacle-0.044
 facts-0.043
 battles-0.042
Token a
Token ID257
Feature activation+7.20e-03

Pos logit contributions
less+0.009
free+0.008
 okay+0.007
inky+0.007
 warm+0.007
Neg logit contributions
rob-0.012
 obstacles-0.012
 obstacle-0.012
 facts-0.012
 battles-0.011
Token long
Token ID890
Feature activation+2.83e-02

Pos logit contributions
less+0.009
free+0.008
 okay+0.007
 warm+0.007
inky+0.007
Neg logit contributions
rob-0.013
 obstacles-0.012
 obstacle-0.012
 facts-0.012
 battles-0.012
Token time
Token ID640
Feature activation+6.50e-02

Pos logit contributions
less+0.034
free+0.030
 okay+0.029
 warm+0.028
etsy+0.028
Neg logit contributions
rob-0.051
 obstacles-0.049
 obstacle-0.048
 facts-0.048
 battles-0.046
Token ago
Token ID2084
Feature activation+1.35e-01

Pos logit contributions
less+0.082
free+0.072
 okay+0.068
 warm+0.065
 wet+0.064
Neg logit contributions
rob-0.117
 obstacles-0.112
 obstacle-0.112
 facts-0.107
 battles-0.106
Token."
Token ID526
Feature activation-4.76e-03

Pos logit contributions
less+0.152
free+0.129
 okay+0.123
inky+0.117
etsy+0.116
Neg logit contributions
rob-0.256
 obstacle-0.253
 obstacles-0.252
 facts-0.239
 battles-0.237
Token Sam
Token ID3409
Feature activation+2.01e-03

Pos logit contributions
rob+0.010
 obstacles+0.010
 obstacle+0.010
der+0.009
 challenges+0.009
Neg logit contributions
less-0.004
free-0.003
 okay-0.003
 warm-0.003
inky-0.003
Token was
Token ID373
Feature activation-5.04e-03

Pos logit contributions
less+0.002
 warm+0.002
 okay+0.002
free+0.002
 wet+0.002
Neg logit contributions
rob-0.003
 obstacle-0.003
 obstacles-0.003
 facts-0.003
 accompl-0.003
Token sorry
Token ID7926
Feature activation+6.10e-02

Pos logit contributions
 obstacles+0.009
rob+0.009
 obstacle+0.009
 battles+0.009
 facts+0.009
Neg logit contributions
less-0.006
free-0.005
etsy-0.005
inky-0.005
eth-0.005
Token to
Token ID284
Feature activation+6.32e-04

Pos logit contributions
less+0.093
free+0.084
etsy+0.081
inky+0.080
eth+0.078
Neg logit contributions
 obstacles-0.087
 obstacle-0.086
rob-0.085
der-0.083
 battles-0.081
Token it
Token ID340
Feature activation+1.05e-02

Pos logit contributions
less+0.012
free+0.010
 okay+0.010
inky+0.010
etsy+0.010
Neg logit contributions
rob-0.014
 obstacles-0.014
 obstacle-0.014
 facts-0.013
 battles-0.013
Token away
Token ID1497
Feature activation+8.67e-04

Pos logit contributions
less+0.012
free+0.011
etsy+0.011
isc+0.011
inky+0.011
Neg logit contributions
 obstacles-0.017
 obstacle-0.017
der-0.017
 battles-0.017
 hears-0.016
Token a
Token ID257
Feature activation+1.04e-02

Pos logit contributions
less+0.001
free+0.001
 okay+0.001
 warm+0.001
inky+0.001
Neg logit contributions
rob-0.002
 obstacles-0.002
 obstacle-0.002
 facts-0.001
 battles-0.001
Token long
Token ID890
Feature activation+3.64e-02

Pos logit contributions
less+0.013
free+0.012
 okay+0.011
inky+0.011
etsy+0.011
Neg logit contributions
 obstacles-0.017
 obstacle-0.017
rob-0.017
 battles-0.016
 facts-0.016
Token time
Token ID640
Feature activation+7.29e-02

Pos logit contributions
less+0.042
free+0.037
 okay+0.034
inky+0.033
etsy+0.033
Neg logit contributions
rob-0.066
 obstacles-0.066
 obstacle-0.065
 battles-0.063
 facts-0.062
Token ago
Token ID2084
Feature activation+1.35e-01

Pos logit contributions
less+0.096
free+0.087
etsy+0.083
inky+0.083
eth+0.079
Neg logit contributions
 obstacles-0.117
rob-0.115
 obstacle-0.113
 facts-0.111
der-0.111
Token.
Token ID13
Feature activation+9.12e-03

Pos logit contributions
less+0.155
free+0.134
 okay+0.125
inky+0.122
etsy+0.122
Neg logit contributions
rob-0.249
 obstacles-0.246
 obstacle-0.244
 facts-0.234
 battles-0.234
Token It
Token ID632
Feature activation+3.94e-03

Pos logit contributions
less+0.015
free+0.013
etsy+0.012
 wet+0.012
inky+0.012
Neg logit contributions
rob-0.012
der-0.012
 obstacles-0.012
 facts-0.012
 hears-0.012
Token is
Token ID318
Feature activation-7.77e-03

Pos logit contributions
less+0.007
free+0.006
 okay+0.006
inky+0.006
 warm+0.006
Neg logit contributions
rob-0.005
 obstacles-0.005
 obstacle-0.005
 facts-0.005
 battles-0.005
Token not
Token ID407
Feature activation+5.21e-02

Pos logit contributions
rob+0.014
 obstacles+0.013
 obstacle+0.013
 facts+0.013
 battles+0.013
Neg logit contributions
less-0.010
free-0.009
 okay-0.008
 warm-0.008
inky-0.008
Token good
Token ID922
Feature activation-2.32e-02

Pos logit contributions
less+0.065
free+0.057
etsy+0.056
eth+0.055
inky+0.055
Neg logit contributions
 obstacles-0.088
 obstacle-0.086
rob-0.085
 battles-0.084
 facts-0.084
Token was
Token ID373
Feature activation-2.36e-02

Pos logit contributions
rob+0.027
 obstacles+0.027
 obstacle+0.026
 facts+0.025
 battles+0.025
Neg logit contributions
less-0.026
free-0.024
 okay-0.022
inky-0.022
etsy-0.022
Token painted
Token ID13055
Feature activation+1.97e-02

Pos logit contributions
 obstacles+0.041
rob+0.040
 obstacle+0.040
 facts+0.039
 battles+0.038
Neg logit contributions
less-0.029
free-0.026
etsy-0.024
inky-0.024
eth-0.023
Token a
Token ID257
Feature activation+1.16e-03

Pos logit contributions
less+0.025
free+0.022
etsy+0.022
inky+0.022
 okay+0.021
Neg logit contributions
rob-0.033
 obstacles-0.032
 obstacle-0.032
 battles-0.031
 facts-0.031
Token long
Token ID890
Feature activation+2.85e-02

Pos logit contributions
less+0.001
free+0.001
 okay+0.001
 warm+0.001
inky+0.001
Neg logit contributions
rob-0.002
 obstacles-0.002
 obstacle-0.002
 facts-0.002
 battles-0.002
Token time
Token ID640
Feature activation+6.39e-02

Pos logit contributions
less+0.039
free+0.035
 okay+0.033
etsy+0.032
inky+0.032
Neg logit contributions
rob-0.046
 obstacles-0.045
 obstacle-0.045
 facts-0.043
 battles-0.043
Token ago
Token ID2084
Feature activation+1.35e-01

Pos logit contributions
less+0.082
free+0.072
 okay+0.067
etsy+0.067
inky+0.067
Neg logit contributions
rob-0.109
 obstacles-0.107
 obstacle-0.106
 facts-0.102
 battles-0.102
Token,
Token ID11
Feature activation-1.01e-02

Pos logit contributions
less+0.156
free+0.137
 okay+0.126
etsy+0.125
inky+0.125
Neg logit contributions
rob-0.244
 obstacles-0.241
 obstacle-0.237
 facts-0.230
 battles-0.230
Tokenâ
Token ID22940
Feature activation-3.66e-02

Pos logit contributions
 obstacles+0.016
 obstacle+0.016
rob+0.016
der+0.015
 facts+0.015
Neg logit contributions
less-0.014
free-0.012
inky-0.012
etsy-0.011
eth-0.011
Token
Token ID26391
Feature activation-5.58e-02

Pos logit contributions
rob+0.054
 obstacle+0.048
 obstacles+0.047
 believes+0.046
 facts+0.045
Neg logit contributions
less-0.058
free-0.050
etsy-0.049
 okay-0.049
inky-0.048
Token her
Token ID607
Feature activation+2.67e-02

Pos logit contributions
rob+0.066
 hears+0.063
 obstacles+0.062
 battles+0.060
 obstacle+0.058
Neg logit contributions
less-0.098
free-0.092
inky-0.084
ucky-0.084
 warm-0.082
Token mum
Token ID25682
Feature activation-2.51e-02

Pos logit contributions
less+0.030
free+0.026
 okay+0.024
inky+0.024
etsy+0.024
Neg logit contributions
rob-0.050
 obstacles-0.049
 obstacle-0.048
 battles-0.047
 facts-0.047
Token girl
Token ID2576
Feature activation-4.54e-02

Pos logit contributions
rob+0.032
 obstacles+0.031
 obstacle+0.031
 battles+0.030
 facts+0.030
Neg logit contributions
less-0.024
free-0.021
etsy-0.019
 okay-0.019
inky-0.019
Token would
Token ID561
Feature activation-1.30e-02

Pos logit contributions
rob+0.081
 obstacles+0.076
 obstacle+0.075
 battles+0.070
 cliffs+0.069
Neg logit contributions
less-0.060
free-0.054
 okay-0.050
etsy-0.049
 warm-0.048
Token make
Token ID787
Feature activation+1.35e-02

Pos logit contributions
rob+0.024
 obstacles+0.022
 obstacle+0.022
der+0.021
 facts+0.020
Neg logit contributions
less-0.017
free-0.015
 okay-0.015
 warm-0.015
eth-0.014
Token a
Token ID257
Feature activation-1.66e-03

Pos logit contributions
less+0.020
free+0.017
etsy+0.017
inky+0.017
eth+0.016
Neg logit contributions
 obstacles-0.021
 obstacle-0.020
rob-0.020
 battles-0.020
 facts-0.019
Token great
Token ID1049
Feature activation-3.23e-02

Pos logit contributions
rob+0.003
 obstacles+0.003
 obstacle+0.003
der+0.003
 facts+0.003
Neg logit contributions
less-0.002
free-0.002
 warm-0.002
 okay-0.002
inky-0.002
Token fer
Token ID11354
Feature activation-1.68e-01

Pos logit contributions
rob+0.049
 obstacles+0.048
 obstacle+0.048
 facts+0.046
 battles+0.046
Neg logit contributions
less-0.048
free-0.043
 okay-0.040
etsy-0.040
inky-0.040
Tokenret
Token ID1186
Feature activation-9.94e-03

Pos logit contributions
 obstacles+0.263
der+0.256
rob+0.255
 cliffs+0.255
 obstacle+0.250
Neg logit contributions
less-0.240
inky-0.208
eth-0.205
free-0.204
etsy-0.193
Token parent
Token ID2560
Feature activation+1.66e-02

Pos logit contributions
 battles+0.016
 obstacle+0.016
 obstacles+0.016
der+0.016
rob+0.015
Neg logit contributions
less-0.012
etsy-0.011
free-0.011
olk-0.011
inky-0.010
Token.
Token ID13
Feature activation-7.77e-03

Pos logit contributions
less+0.021
free+0.019
 okay+0.018
etsy+0.018
inky+0.018
Neg logit contributions
rob-0.028
 obstacles-0.028
 obstacle-0.027
 facts-0.026
 battles-0.026
Token
Token ID220
Feature activation-3.89e-02

Pos logit contributions
rob+0.013
 obstacles+0.013
 obstacle+0.013
 facts+0.013
der+0.013
Neg logit contributions
less-0.010
free-0.008
 okay-0.008
inky-0.008
 warm-0.008
Token
Token ID198
Feature activation-8.94e-03

Pos logit contributions
rob+0.055
 obstacles+0.055
 hears+0.052
 obstacle+0.052
der+0.051
Neg logit contributions
less-0.056
free-0.051
 wet-0.047
 warm-0.047
 okay-0.046
Token and
Token ID290
Feature activation+3.23e-03

Pos logit contributions
less+0.000
free+0.000
 wet+0.000
 warm+0.000
 okay+0.000
Neg logit contributions
rob-0.001
 obstacle-0.001
 obstacles-0.001
 battles-0.001
 Fer-0.001
Token exciting
Token ID7895
Feature activation-2.53e-02

Pos logit contributions
less+0.004
 warm+0.004
 okay+0.003
free+0.003
 wet+0.003
Neg logit contributions
rob-0.006
 obstacle-0.005
 obstacles-0.005
 battles-0.005
 facts-0.005
Token.
Token ID13
Feature activation+2.22e-06

Pos logit contributions
rob+0.050
 obstacle+0.045
 obstacles+0.045
 battles+0.044
 Fer+0.042
Neg logit contributions
less-0.031
free-0.025
 warm-0.023
 okay-0.023
 wet-0.022
Token A
Token ID317
Feature activation-1.44e-02

Feature not active on prev token;
no predictions were affected.
Token baby
Token ID5156
Feature activation-3.82e-03

Pos logit contributions
rob+0.026
 obstacles+0.025
 obstacle+0.025
der+0.024
 facts+0.024
Neg logit contributions
less-0.018
free-0.015
 warm-0.015
 okay-0.014
inky-0.014
Token fer
Token ID11354
Feature activation-1.59e-01

Pos logit contributions
rob+0.006
 obstacles+0.006
 obstacle+0.005
 facts+0.005
 battles+0.005
Neg logit contributions
less-0.006
free-0.005
inky-0.005
 okay-0.005
etsy-0.005
Tokenret
Token ID1186
Feature activation-1.88e-02

Pos logit contributions
 obstacles+0.236
rob+0.234
 obstacle+0.228
 battles+0.222
 facts+0.220
Neg logit contributions
less-0.241
free-0.211
inky-0.209
etsy-0.207
eth-0.202
Token!
Token ID0
Feature activation-1.47e-02

Pos logit contributions
der+0.027
 obstacles+0.026
rob+0.026
 obstacle+0.025
 hears+0.025
Neg logit contributions
less-0.027
free-0.025
isc-0.023
inky-0.023
 okay-0.023
Token
Token ID220
Feature activation-2.31e-02

Pos logit contributions
der+0.023
 obstacles+0.022
rob+0.022
 obstacle+0.022
 hears+0.021
Neg logit contributions
less-0.020
free-0.017
 okay-0.016
inky-0.016
 wet-0.015
Token
Token ID198
Feature activation+2.81e-03

Pos logit contributions
 obstacles+0.034
rob+0.034
der+0.032
 hears+0.032
 obstacle+0.032
Neg logit contributions
less-0.033
free-0.029
 warm-0.027
 wet-0.027
etsy-0.027
Token
Token ID198
Feature activation+4.57e-03

Pos logit contributions
less+0.004
free+0.003
 okay+0.003
inky+0.003
etsy+0.003
Neg logit contributions
rob-0.004
 obstacles-0.004
 obstacle-0.004
 hears-0.004
 battles-0.004
TokenThe
Token ID464
Feature activation-2.65e-02

Pos logit contributions
rob+0.014
 obstacles+0.013
 obstacle+0.013
 facts+0.013
 battles+0.013
Neg logit contributions
less-0.010
free-0.009
 okay-0.009
etsy-0.009
inky-0.009
Token young
Token ID1862
Feature activation-2.74e-03

Pos logit contributions
rob+0.047
 obstacles+0.045
 obstacle+0.045
 facts+0.043
 battles+0.043
Neg logit contributions
less-0.033
free-0.030
 okay-0.027
inky-0.027
 warm-0.027
Token girl
Token ID2576
Feature activation-2.63e-02

Pos logit contributions
rob+0.005
 obstacles+0.005
 obstacle+0.005
 facts+0.005
 battles+0.005
Neg logit contributions
less-0.003
free-0.002
 okay-0.002
inky-0.002
etsy-0.002
Token took
Token ID1718
Feature activation-1.93e-02

Pos logit contributions
rob+0.045
 obstacle+0.042
 obstacles+0.042
 facts+0.041
 battles+0.040
Neg logit contributions
less-0.035
eth-0.031
free-0.031
 warm-0.030
etsy-0.030
Token the
Token ID262
Feature activation-2.59e-03

Pos logit contributions
 obstacles+0.032
rob+0.032
 obstacle+0.031
 battles+0.030
 facts+0.030
Neg logit contributions
less-0.026
free-0.022
etsy-0.021
inky-0.021
 okay-0.021
Token fer
Token ID11354
Feature activation-1.54e-01

Pos logit contributions
rob+0.004
 obstacles+0.003
 obstacle+0.003
 facts+0.003
 battles+0.003
Neg logit contributions
less-0.004
free-0.004
 okay-0.004
etsy-0.004
inky-0.004
Tokenret
Token ID1186
Feature activation-4.02e-03

Pos logit contributions
 obstacles+0.225
rob+0.225
der+0.220
 obstacle+0.218
 cliffs+0.212
Neg logit contributions
less-0.236
inky-0.206
eth-0.204
free-0.203
etsy-0.199
Token home
Token ID1363
Feature activation-1.25e-03

Pos logit contributions
 obstacles+0.007
 obstacle+0.006
 battles+0.006
 challenges+0.006
 hears+0.006
Neg logit contributions
less-0.005
free-0.004
olk-0.004
etsy-0.004
ucky-0.004
Token and
Token ID290
Feature activation-9.85e-03

Pos logit contributions
 obstacles+0.002
rob+0.002
 obstacle+0.002
 facts+0.002
 battles+0.002
Neg logit contributions
less-0.002
free-0.002
inky-0.002
etsy-0.001
 okay-0.001
Token was
Token ID373
Feature activation-2.02e-02

Pos logit contributions
rob+0.016
 obstacles+0.016
 obstacle+0.016
 facts+0.015
 battles+0.015
Neg logit contributions
less-0.013
free-0.012
 warm-0.011
 okay-0.011
etsy-0.011
Token very
Token ID845
Feature activation-9.05e-03

Pos logit contributions
 obstacles+0.031
 obstacle+0.031
rob+0.031
 battles+0.030
 facts+0.030
Neg logit contributions
less-0.027
free-0.024
inky-0.024
etsy-0.023
eth-0.022
Token clown
Token ID25573
Feature activation-6.22e-02

Pos logit contributions
less+0.020
free+0.017
 okay+0.016
 warm+0.016
etsy+0.016
Neg logit contributions
rob-0.027
 obstacles-0.026
 obstacle-0.026
 facts-0.025
 battles-0.025
Tokens
Token ID82
Feature activation+2.94e-02

Pos logit contributions
rob+0.103
 obstacles+0.099
 jugg+0.097
 battles+0.096
 obstacle+0.096
Neg logit contributions
less-0.081
free-0.074
 okay-0.069
etsy-0.069
inky-0.067
Token juggling
Token ID48097
Feature activation-3.05e-02

Pos logit contributions
less+0.035
free+0.030
 okay+0.029
 warm+0.028
inky+0.028
Neg logit contributions
rob-0.054
 obstacles-0.053
 obstacle-0.053
 battles-0.050
 facts-0.050
Token and
Token ID290
Feature activation-1.31e-02

Pos logit contributions
rob+0.057
 obstacle+0.055
 accompl+0.054
 believes+0.053
der+0.052
Neg logit contributions
less-0.033
 warm-0.031
 wet-0.030
free-0.028
 okay-0.026
Token ac
Token ID936
Feature activation-4.34e-02

Pos logit contributions
 obstacle+0.021
 obstacles+0.020
rob+0.019
 facts+0.018
 challenges+0.018
Neg logit contributions
less-0.018
free-0.017
 warm-0.017
 wet-0.017
 okay-0.016
Tokenrob
Token ID22609
Feature activation-1.52e-01

Pos logit contributions
rob+0.005
 obstacles+0.004
 obstacle+0.003
 facts+0.001
 battles+0.001
Neg logit contributions
less-0.125
free-0.119
 okay-0.115
inky-0.115
etsy-0.115
Tokenats
Token ID1381
Feature activation+3.54e-02

Pos logit contributions
 obstacles+0.253
 obstacle+0.252
rob+0.248
der+0.239
 facts+0.236
Neg logit contributions
less-0.197
free-0.169
inky-0.166
 okay-0.164
etsy-0.163
Token doing
Token ID1804
Feature activation-2.15e-02

Pos logit contributions
less+0.046
free+0.040
 okay+0.038
etsy+0.038
inky+0.038
Neg logit contributions
rob-0.060
 obstacles-0.059
 obstacle-0.058
 facts-0.056
 battles-0.056
Token flips
Token ID45971
Feature activation+1.77e-02

Pos logit contributions
rob+0.036
der+0.034
 obstacle+0.033
 believes+0.032
 obstacles+0.032
Neg logit contributions
less-0.027
 warm-0.026
 wet-0.025
free-0.024
 okay-0.024
Token.
Token ID13
Feature activation+1.35e-03

Pos logit contributions
less+0.022
free+0.020
 okay+0.018
etsy+0.018
inky+0.018
Neg logit contributions
rob-0.030
 obstacles-0.030
 obstacle-0.030
 facts-0.029
 battles-0.029
Token But
Token ID887
Feature activation-1.66e-02

Pos logit contributions
less+0.002
free+0.001
 okay+0.001
inky+0.001
 warm+0.001
Neg logit contributions
rob-0.002
 obstacles-0.002
 obstacle-0.002
der-0.002
 facts-0.002
Token and
Token ID290
Feature activation+2.51e-03

Pos logit contributions
less+0.060
free+0.052
 okay+0.049
 warm+0.048
etsy+0.047
Neg logit contributions
rob-0.085
 obstacles-0.083
 obstacle-0.083
 battles-0.079
 facts-0.078
Token proud
Token ID6613
Feature activation-3.39e-02

Pos logit contributions
less+0.003
free+0.003
inky+0.003
etsy+0.003
eth+0.003
Neg logit contributions
rob-0.004
 obstacles-0.004
 obstacle-0.004
 battles-0.004
 facts-0.004
Token of
Token ID286
Feature activation+4.29e-02

Pos logit contributions
rob+0.067
 obstacles+0.066
 obstacle+0.065
 facts+0.063
 battles+0.063
Neg logit contributions
less-0.035
free-0.030
 okay-0.028
etsy-0.027
inky-0.027
Token himself
Token ID2241
Feature activation-3.19e-02

Pos logit contributions
less+0.048
 okay+0.045
 warm+0.045
free+0.045
 wet+0.041
Neg logit contributions
rob-0.079
 obstacle-0.076
 obstacles-0.074
 hears-0.072
 battles-0.071
Token for
Token ID329
Feature activation-4.20e-02

Pos logit contributions
 obstacles+0.049
 obstacle+0.049
rob+0.049
 facts+0.047
der+0.047
Neg logit contributions
less-0.044
free-0.039
etsy-0.039
inky-0.037
 okay-0.037
Token accompl
Token ID6424
Feature activation-1.50e-01

Pos logit contributions
rob+0.064
 obstacle+0.063
 obstacles+0.057
der+0.055
 hears+0.055
Neg logit contributions
 okay-0.062
less-0.062
 warm-0.061
free-0.060
 clean-0.057
Tokenishing
Token ID3929
Feature activation-2.08e-02

Pos logit contributions
rob+0.261
 obstacle+0.258
 obstacles+0.256
 facts+0.253
der+0.249
Neg logit contributions
less-0.187
free-0.147
etsy-0.143
eth-0.140
 okay-0.139
Token such
Token ID884
Feature activation-7.86e-03

Pos logit contributions
rob+0.037
 obstacle+0.035
 obstacles+0.034
 jugg+0.034
der+0.033
Neg logit contributions
less-0.026
free-0.023
 warm-0.022
 okay-0.021
 wet-0.021
Token a
Token ID257
Feature activation-4.47e-03

Pos logit contributions
 obstacles+0.013
 obstacle+0.013
 battles+0.013
 challenges+0.012
 facts+0.012
Neg logit contributions
less-0.010
inky-0.009
eth-0.009
etsy-0.009
free-0.009
Token difficult
Token ID2408
Feature activation+1.03e-02

Pos logit contributions
rob+0.008
 obstacles+0.008
 obstacle+0.008
 facts+0.008
 battles+0.008
Neg logit contributions
less-0.005
free-0.004
 okay-0.004
 warm-0.004
etsy-0.004
Token task
Token ID4876
Feature activation-4.31e-02

Pos logit contributions
less+0.012
 warm+0.010
 okay+0.010
free+0.010
 clean+0.010
Neg logit contributions
rob-0.020
 facts-0.018
 hears-0.018
Offic-0.018
der-0.018
Token lesson
Token ID11483
Feature activation-9.85e-02

Pos logit contributions
rob+0.020
 obstacles+0.019
 obstacle+0.019
 facts+0.018
 battles+0.018
Neg logit contributions
less-0.025
free-0.023
 okay-0.022
inky-0.022
etsy-0.022
Token and
Token ID290
Feature activation-1.68e-02

Pos logit contributions
rob+0.168
 obstacles+0.163
 obstacle+0.163
 battles+0.155
 facts+0.155
Neg logit contributions
less-0.129
free-0.115
 okay-0.108
inky-0.105
etsy-0.104
Token never
Token ID1239
Feature activation+8.83e-03

Pos logit contributions
rob+0.025
 obstacles+0.024
 obstacle+0.024
 facts+0.022
 battles+0.022
Neg logit contributions
less-0.026
free-0.023
 okay-0.023
 warm-0.022
etsy-0.021
Token pretended
Token ID36766
Feature activation+5.20e-02

Pos logit contributions
less+0.014
free+0.013
 okay+0.012
etsy+0.012
inky+0.012
Neg logit contributions
rob-0.013
 obstacles-0.012
 obstacle-0.012
 facts-0.012
 battles-0.012
Token with
Token ID351
Feature activation-3.52e-02

Pos logit contributions
less+0.055
free+0.045
 okay+0.045
 warm+0.042
 wet+0.041
Neg logit contributions
rob-0.105
 obstacle-0.100
 obstacles-0.099
 facts-0.094
der-0.094
Token tr
Token ID491
Feature activation-1.50e-01

Pos logit contributions
rob+0.057
 obstacles+0.054
 obstacle+0.054
 facts+0.051
 battles+0.051
Neg logit contributions
less-0.049
free-0.044
 okay-0.042
 warm-0.041
 wet-0.040
Tokenays
Token ID592
Feature activation+1.40e-02

Pos logit contributions
 facts+0.239
 obstacles+0.237
 obstacle+0.236
Or+0.234
 battles+0.234
Neg logit contributions
less-0.203
ucky-0.178
inky-0.173
uffy-0.171
ick-0.171
Token again
Token ID757
Feature activation-9.86e-02

Pos logit contributions
less+0.016
free+0.014
 okay+0.013
 warm+0.012
etsy+0.012
Neg logit contributions
rob-0.026
 obstacles-0.026
 obstacle-0.025
 facts-0.025
 battles-0.025
Token.
Token ID13
Feature activation+3.16e-03

Pos logit contributions
rob+0.215
 obstacle+0.198
 obstacles+0.196
 facts+0.192
 Fer+0.191
Neg logit contributions
less-0.094
free-0.078
 okay-0.077
 warm-0.071
 wet-0.066
Token<|endoftext|>
Token ID50256
Feature activation-4.08e-02

Pos logit contributions
less+0.004
free+0.003
 okay+0.003
etsy+0.003
inky+0.003
Neg logit contributions
rob-0.005
 obstacles-0.005
 obstacle-0.005
 facts-0.005
 battles-0.005
TokenBilly
Token ID49640
Feature activation+1.83e-02

Pos logit contributions
rob+0.068
 obstacles+0.066
 obstacle+0.063
 hears+0.063
 battles+0.062
Neg logit contributions
less-0.054
free-0.048
 okay-0.045
etsy-0.045
inky-0.045
Token let
Token ID1309
Feature activation+1.12e-02

Pos logit contributions
 obstacles+0.049
 obstacle+0.048
 battles+0.046
rob+0.046
 facts+0.045
Neg logit contributions
less-0.053
free-0.048
inky-0.047
etsy-0.046
eth-0.046
Token's
Token ID338
Feature activation-3.86e-03

Pos logit contributions
less+0.021
free+0.019
etsy+0.017
eth+0.017
inky+0.016
Neg logit contributions
 obstacle-0.014
 obstacles-0.013
 hears-0.013
rob-0.013
 battles-0.013
Token go
Token ID467
Feature activation-1.62e-03

Pos logit contributions
 jugg+0.005
 obstacle+0.005
 obstacles+0.005
 battles+0.005
rob+0.005
Neg logit contributions
less-0.006
free-0.006
etsy-0.005
eth-0.005
inky-0.005
Token on
Token ID319
Feature activation-2.67e-02

Pos logit contributions
rob+0.003
 obstacles+0.002
 obstacle+0.002
 hears+0.002
 battles+0.002
Neg logit contributions
less-0.002
free-0.002
 okay-0.002
etsy-0.002
inky-0.002
Token the
Token ID262
Feature activation-1.99e-03

Pos logit contributions
rob+0.048
 obstacles+0.047
 obstacle+0.047
 facts+0.046
 battles+0.045
Neg logit contributions
less-0.032
free-0.028
 okay-0.027
 warm-0.027
 wet-0.026
Token fer
Token ID11354
Feature activation-1.48e-01

Pos logit contributions
rob+0.003
 obstacles+0.003
 facts+0.003
 obstacle+0.003
der+0.003
Neg logit contributions
less-0.003
 warm-0.003
 wet-0.003
free-0.002
 okay-0.002
Tokenris
Token ID2442
Feature activation-4.78e-03

Pos logit contributions
 obstacles+0.230
der+0.228
rob+0.222
 obstacle+0.219
 cliffs+0.216
Neg logit contributions
less-0.213
inky-0.188
eth-0.183
etsy-0.175
free-0.173
Token wheel
Token ID7825
Feature activation+1.00e-02

Pos logit contributions
der+0.009
 facts+0.008
 accompl+0.008
 obstacles+0.008
rob+0.008
Neg logit contributions
less-0.005
eth-0.004
ay-0.004
isc-0.004
 wet-0.004
Token and
Token ID290
Feature activation-4.26e-03

Pos logit contributions
less+0.013
free+0.012
 okay+0.011
etsy+0.011
inky+0.011
Neg logit contributions
rob-0.016
 obstacles-0.016
 obstacle-0.016
 facts-0.015
 battles-0.015
Token see
Token ID766
Feature activation+2.74e-03

Pos logit contributions
rob+0.007
 obstacles+0.007
 obstacle+0.007
 facts+0.007
der+0.007
Neg logit contributions
less-0.006
free-0.005
 okay-0.005
 warm-0.005
 wet-0.005
Token the
Token ID262
Feature activation-3.42e-03

Pos logit contributions
less+0.004
 okay+0.004
 warm+0.004
 wet+0.003
free+0.003
Neg logit contributions
rob-0.005
 obstacle-0.004
 obstacles-0.004
 facts-0.004
 battles-0.004
Token sc
Token ID629
Feature activation-5.15e-02

Pos logit contributions
rob+0.004
 obstacle+0.004
 obstacles+0.004
 facts+0.004
 battles+0.004
Neg logit contributions
less-0.004
free-0.003
 okay-0.003
 warm-0.003
etsy-0.003
Tokenolded
Token ID48959
Feature activation+4.69e-02

Pos logit contributions
 obstacles+0.083
der+0.082
 obstacle+0.081
rob+0.080
 facts+0.080
Neg logit contributions
free-0.059
less-0.058
 warm-0.054
 okay-0.053
 wet-0.050
Token him
Token ID683
Feature activation+1.96e-02

Pos logit contributions
less+0.088
etsy+0.082
inky+0.081
free+0.079
eth+0.079
Neg logit contributions
 obstacles-0.051
 obstacle-0.047
der-0.046
rob-0.045
 challenges-0.044
Token for
Token ID329
Feature activation-4.40e-02

Pos logit contributions
less+0.031
free+0.028
etsy+0.027
inky+0.027
 okay+0.026
Neg logit contributions
 obstacles-0.026
 obstacle-0.026
der-0.025
rob-0.025
 facts-0.025
Token dis
Token ID595
Feature activation-6.13e-02

Pos logit contributions
rob+0.059
 obstacle+0.055
 obstacles+0.053
 facts+0.051
 battles+0.050
Neg logit contributions
less-0.077
free-0.071
 okay-0.070
 warm-0.068
 wet-0.066
Tokenobe
Token ID5910
Feature activation-1.46e-01

Pos logit contributions
rob+0.051
 obstacles+0.048
 obstacle+0.046
 battles+0.042
 facts+0.042
Neg logit contributions
less-0.134
free-0.125
 okay-0.122
etsy-0.121
inky-0.121
Tokenying
Token ID1112
Feature activation-2.23e-02

Pos logit contributions
rob+0.260
 obstacles+0.251
 obstacle+0.247
 facts+0.243
 battles+0.242
Neg logit contributions
less-0.182
free-0.161
 okay-0.149
inky-0.145
etsy-0.143
Token her
Token ID607
Feature activation+1.60e-02

Pos logit contributions
 obstacles+0.031
 obstacle+0.029
rob+0.029
 challenges+0.029
 facts+0.029
Neg logit contributions
less-0.034
inky-0.031
etsy-0.031
free-0.030
eth-0.029
Token.
Token ID13
Feature activation+1.47e-03

Pos logit contributions
less+0.020
free+0.018
etsy+0.018
inky+0.017
 okay+0.017
Neg logit contributions
 obstacles-0.027
 obstacle-0.026
der-0.025
rob-0.025
 facts-0.025
Token\n
Token ID198
Feature activation+3.51e-04

Pos logit contributions
less+0.002
free+0.002
 okay+0.002
 warm+0.002
inky+0.002
Neg logit contributions
rob-0.002
 obstacles-0.002
der-0.002
 obstacle-0.002
 facts-0.002
Token\n
Token ID198
Feature activation+3.29e-03

Pos logit contributions
less+0.000
free+0.000
 okay+0.000
etsy+0.000
inky+0.000
Neg logit contributions
rob-0.001
 obstacles-0.001
 obstacle-0.001
 battles-0.001
 facts-0.001
Token Ele
Token ID15987
Feature activation-3.68e-02

Pos logit contributions
 obstacles+0.005
der+0.005
 obstacle+0.005
 facts+0.005
rob+0.005
Neg logit contributions
less-0.003
free-0.002
inky-0.002
 warm-0.002
 okay-0.002
Tokenph
Token ID746
Feature activation-9.11e-02

Pos logit contributions
rob+0.048
 obstacles+0.048
 obstacle+0.046
 facts+0.045
 battles+0.045
Neg logit contributions
less-0.061
free-0.056
inky-0.053
etsy-0.053
 warm-0.052
Tokenants
Token ID1187
Feature activation-1.04e-02

Pos logit contributions
rob+0.116
 obstacles+0.115
 obstacle+0.113
der+0.110
 facts+0.108
Neg logit contributions
less-0.152
free-0.139
 okay-0.133
 warm-0.132
etsy-0.130
Token have
Token ID423
Feature activation-1.74e-02

Pos logit contributions
rob+0.017
 obstacles+0.016
 obstacle+0.016
 battles+0.016
 facts+0.016
Neg logit contributions
less-0.014
free-0.013
 okay-0.012
 warm-0.012
etsy-0.012
Token strong
Token ID1913
Feature activation-2.51e-02

Pos logit contributions
rob+0.030
 obstacle+0.029
 obstacles+0.029
 facts+0.027
 battles+0.027
Neg logit contributions
less-0.022
free-0.020
 okay-0.019
etsy-0.018
 warm-0.018
Token tr
Token ID491
Feature activation-1.45e-01

Pos logit contributions
rob+0.038
 obstacles+0.036
 obstacle+0.035
 battles+0.034
der+0.033
Neg logit contributions
less-0.038
free-0.034
 okay-0.032
 warm-0.032
inky-0.031
Tokenunks
Token ID14125
Feature activation+1.22e-02

Pos logit contributions
 facts+0.227
 obstacle+0.226
 obstacles+0.225
 battles+0.223
rob+0.213
Neg logit contributions
less-0.197
inky-0.185
eth-0.181
ucky-0.180
izz-0.177
Token that
Token ID326
Feature activation+1.94e-02

Pos logit contributions
less+0.015
free+0.014
 okay+0.013
etsy+0.012
inky+0.012
Neg logit contributions
rob-0.021
 obstacles-0.021
 obstacle-0.021
 facts-0.020
 battles-0.020
Token they
Token ID484
Feature activation-1.21e-04

Pos logit contributions
less+0.025
free+0.023
 okay+0.022
 warm+0.022
 wet+0.021
Neg logit contributions
rob-0.032
 obstacles-0.032
 obstacle-0.031
der-0.030
 battles-0.030
Token can
Token ID460
Feature activation+3.07e-02

Pos logit contributions
rob+0.000
 obstacle+0.000
 obstacles+0.000
der+0.000
 facts+0.000
Neg logit contributions
free-0.000
less-0.000
 warm-0.000
 wet-0.000
 clean-0.000
Token use
Token ID779
Feature activation-7.48e-03

Pos logit contributions
less+0.038
free+0.035
 warm+0.034
inky+0.034
 okay+0.033
Neg logit contributions
rob-0.055
 obstacles-0.053
 obstacle-0.052
 facts-0.051
 battles-0.050
Token Lily
Token ID20037
Feature activation+3.04e-02

Pos logit contributions
less+0.009
free+0.007
inky+0.007
 okay+0.007
etsy+0.006
Neg logit contributions
rob-0.020
 obstacles-0.020
 obstacle-0.020
 battles-0.019
 facts-0.019
Token decided
Token ID3066
Feature activation+1.60e-02

Pos logit contributions
less+0.033
 warm+0.031
 wet+0.031
 okay+0.030
 clean+0.030
Neg logit contributions
rob-0.057
 obstacles-0.055
 obstacle-0.054
 facts-0.054
 accompl-0.051
Token to
Token ID284
Feature activation-2.66e-02

Pos logit contributions
less+0.015
 okay+0.012
free+0.012
 warm+0.011
 wet+0.011
Neg logit contributions
rob-0.035
 obstacle-0.033
 obstacles-0.033
 facts-0.031
 battles-0.031
Token play
Token ID711
Feature activation+2.45e-02

Pos logit contributions
 jugg+0.035
 obstacles+0.034
 obstacle+0.033
 battles+0.033
 facts+0.033
Neg logit contributions
less-0.045
free-0.040
etsy-0.037
 okay-0.037
inky-0.036
Token "
Token ID366
Feature activation-3.94e-02

Pos logit contributions
less+0.023
free+0.020
 warm+0.020
 okay+0.020
 wet+0.019
Neg logit contributions
rob-0.049
 obstacles-0.049
 obstacle-0.049
 facts-0.048
der-0.047
Tokenmar
Token ID3876
Feature activation-1.45e-01

Pos logit contributions
 obstacle+0.062
 obstacles+0.061
 facts+0.060
der+0.058
rob+0.058
Neg logit contributions
less-0.057
free-0.056
etsy-0.048
un-0.046
 warm-0.046
Tokenry
Token ID563
Feature activation+2.79e-02

Pos logit contributions
rob+0.236
 obstacles+0.233
 obstacle+0.229
 facts+0.221
 battles+0.220
Neg logit contributions
less-0.196
free-0.175
 okay-0.164
etsy-0.163
inky-0.163
Token".
Token ID1911
Feature activation-1.43e-02

Pos logit contributions
less+0.040
etsy+0.037
free+0.036
inky+0.035
 okay+0.034
Neg logit contributions
 obstacles-0.040
 obstacle-0.040
rob-0.039
der-0.038
 battles-0.038
Token They
Token ID1119
Feature activation+3.94e-03

Pos logit contributions
 obstacles+0.021
 facts+0.021
der+0.020
 hears+0.020
rob+0.020
Neg logit contributions
less-0.021
free-0.019
 okay-0.018
etsy-0.018
inky-0.017
Token dressed
Token ID12049
Feature activation+1.01e-02

Pos logit contributions
less+0.006
free+0.005
 warm+0.005
 okay+0.005
etsy+0.005
Neg logit contributions
rob-0.006
 obstacles-0.006
 obstacle-0.006
 facts-0.006
der-0.005
Token up
Token ID510
Feature activation-3.44e-02

Pos logit contributions
less+0.013
free+0.011
 okay+0.011
etsy+0.010
 warm+0.010
Neg logit contributions
rob-0.017
 obstacles-0.017
 obstacle-0.017
 facts-0.016
 battles-0.016
Token didn
Token ID1422
Feature activation+5.06e-02

Pos logit contributions
less+0.015
 okay+0.013
etsy+0.013
 warm+0.013
free+0.013
Neg logit contributions
rob-0.020
 obstacles-0.019
 obstacle-0.019
 facts-0.018
 battles-0.018
Token't
Token ID470
Feature activation+2.41e-02

Pos logit contributions
less+0.049
free+0.042
 okay+0.039
inky+0.039
etsy+0.037
Neg logit contributions
rob-0.102
 obstacles-0.101
 obstacle-0.100
 facts-0.098
 battles-0.097
Token know
Token ID760
Feature activation-1.78e-03

Pos logit contributions
less+0.040
inky+0.034
uffy+0.034
etsy+0.034
free+0.033
Neg logit contributions
 obstacle-0.031
 obstacles-0.030
 facts-0.030
 hears-0.030
rob-0.029
Token what
Token ID644
Feature activation-3.30e-02

Pos logit contributions
 obstacles+0.002
rob+0.002
 obstacle+0.002
 facts+0.002
der+0.002
Neg logit contributions
less-0.003
free-0.003
inky-0.003
etsy-0.003
eth-0.003
Token "
Token ID366
Feature activation-4.72e-02

Pos logit contributions
rob+0.067
 obstacle+0.064
 obstacles+0.064
der+0.061
 battles+0.061
Neg logit contributions
less-0.035
free-0.031
 warm-0.031
 okay-0.030
 wet-0.029
Tokenpr
Token ID1050
Feature activation-1.44e-01

Pos logit contributions
rob+0.069
 obstacle+0.069
 obstacles+0.068
 facts+0.066
 battles+0.065
Neg logit contributions
less-0.072
free-0.069
 okay-0.063
etsy-0.062
inky-0.061
Tokenune
Token ID1726
Feature activation-5.16e-02

Pos logit contributions
rob+0.287
 obstacles+0.277
 battles+0.268
 facts+0.266
 obstacle+0.266
Neg logit contributions
less-0.158
free-0.137
inky-0.129
etsy-0.123
 okay-0.123
Token"
Token ID1
Feature activation-1.20e-02

Pos logit contributions
rob+0.086
 obstacles+0.084
 obstacle+0.081
 battles+0.080
 facts+0.078
Neg logit contributions
less-0.070
free-0.064
 okay-0.060
inky-0.059
etsy-0.058
Token meant
Token ID4001
Feature activation-3.23e-03

Pos logit contributions
rob+0.019
 obstacles+0.019
 obstacle+0.019
 battles+0.018
 facts+0.018
Neg logit contributions
less-0.017
free-0.015
 okay-0.015
etsy-0.014
 warm-0.014
Token,
Token ID11
Feature activation-2.77e-02

Pos logit contributions
der+0.004
 facts+0.004
 obstacles+0.004
 site+0.004
 seconds+0.004
Neg logit contributions
less-0.004
etsy-0.004
inky-0.004
isc-0.004
opped-0.004
Token so
Token ID523
Feature activation-1.01e-02

Pos logit contributions
der+0.038
 obstacles+0.038
rob+0.037
 obstacle+0.036
 facts+0.036
Neg logit contributions
less-0.045
free-0.039
inky-0.038
etsy-0.038
eth-0.037
Token lay
Token ID3830
Feature activation+2.78e-02

Pos logit contributions
 obstacles+0.020
 obstacle+0.019
 cliffs+0.019
 circus+0.019
 mysteries+0.018
Neg logit contributions
less-0.015
inky-0.014
eth-0.013
free-0.013
etsy-0.013
Token was
Token ID373
Feature activation+4.63e-04

Pos logit contributions
less+0.031
free+0.027
inky+0.026
etsy+0.025
 okay+0.025
Neg logit contributions
rob-0.052
 obstacles-0.050
 obstacle-0.050
der-0.048
 facts-0.048
Token careful
Token ID8161
Feature activation+6.26e-02

Pos logit contributions
etsy+0.001
inky+0.001
less+0.001
eth+0.001
isc+0.001
Neg logit contributions
 obstacles-0.001
 tilted-0.001
 scoring-0.001
 guarding-0.001
 accompl-0.001
Token not
Token ID407
Feature activation+4.08e-02

Pos logit contributions
etsy+0.096
inky+0.092
less+0.090
isc+0.089
eth+0.086
Neg logit contributions
der-0.079
 obstacles-0.078
 facts-0.072
 cliffs-0.072
 challenges-0.071
Token to
Token ID284
Feature activation-3.30e-02

Pos logit contributions
less+0.053
inky+0.051
etsy+0.051
eth+0.050
isc+0.048
Neg logit contributions
 obstacles-0.063
 obstacle-0.059
 hears-0.059
 battles-0.059
 facts-0.059
Token tr
Token ID491
Feature activation-1.43e-01

Pos logit contributions
 obstacles+0.046
 obstacle+0.045
 jugg+0.044
 hears+0.044
 cliffs+0.044
Neg logit contributions
less-0.051
free-0.045
etsy-0.044
inky-0.043
eth-0.041
Tokenample
Token ID1403
Feature activation-6.48e-02

Pos logit contributions
rob+0.202
 obstacles+0.198
 obstacle+0.195
 facts+0.186
 battles+0.186
Neg logit contributions
less-0.226
free-0.205
 okay-0.194
etsy-0.193
inky-0.193
Token the
Token ID262
Feature activation-7.21e-03

Pos logit contributions
 obstacles+0.099
 obstacle+0.098
rob+0.097
 battles+0.093
 facts+0.093
Neg logit contributions
less-0.093
free-0.083
etsy-0.081
inky-0.080
eth-0.078
Token flowers
Token ID12734
Feature activation+2.69e-02

Pos logit contributions
 obstacles+0.011
 obstacle+0.011
rob+0.011
 facts+0.011
 cliffs+0.011
Neg logit contributions
less-0.010
free-0.009
etsy-0.008
inky-0.008
 okay-0.008
Token and
Token ID290
Feature activation-1.05e-02

Pos logit contributions
less+0.033
free+0.029
etsy+0.028
inky+0.028
eth+0.027
Neg logit contributions
 obstacles-0.046
rob-0.045
 obstacle-0.045
 battles-0.043
der-0.043
Token plants
Token ID6134
Feature activation+1.65e-02

Pos logit contributions
rob+0.019
 obstacle+0.018
 obstacles+0.017
 battles+0.017
der+0.017
Neg logit contributions
less-0.013
 warm-0.012
 wet-0.012
free-0.011
 okay-0.011
Token something
Token ID1223
Feature activation-4.62e-02

Pos logit contributions
rob+0.010
 obstacle+0.010
 obstacles+0.010
 battles+0.009
 jugg+0.009
Neg logit contributions
less-0.007
 okay-0.006
free-0.006
 warm-0.006
 wet-0.005
Token new
Token ID649
Feature activation-1.91e-02

Pos logit contributions
rob+0.092
 obstacles+0.086
 obstacle+0.084
 battles+0.083
 hears+0.082
Neg logit contributions
less-0.053
 okay-0.045
free-0.045
 warm-0.044
 wet-0.041
Token!
Token ID0
Feature activation-2.07e-02

Pos logit contributions
rob+0.042
 obstacle+0.040
 obstacles+0.040
 Fer+0.038
 jugg+0.038
Neg logit contributions
less-0.019
free-0.014
 okay-0.014
 warm-0.013
 clean-0.011
Token She
Token ID1375
Feature activation-7.82e-03

Pos logit contributions
rob+0.040
 obstacles+0.040
 obstacle+0.039
 facts+0.038
der+0.038
Neg logit contributions
less-0.022
free-0.018
 okay-0.017
etsy-0.016
 warm-0.016
Token thought
Token ID1807
Feature activation-3.27e-02

Pos logit contributions
rob+0.015
 obstacle+0.015
 obstacles+0.015
 facts+0.014
 languages+0.014
Neg logit contributions
 warm-0.008
 clean-0.008
less-0.008
 wet-0.008
 okay-0.007
Token av
Token ID1196
Feature activation-1.42e-01

Pos logit contributions
rob+0.059
 obstacles+0.056
 obstacle+0.056
 battles+0.054
 facts+0.053
Neg logit contributions
less-0.041
free-0.036
 okay-0.034
 warm-0.033
etsy-0.032
Tokenoc
Token ID420
Feature activation-3.69e-02

Pos logit contributions
 obstacles+0.130
 obstacle+0.127
rob+0.126
der+0.123
 cliffs+0.120
Neg logit contributions
less-0.289
free-0.277
inky-0.263
etsy-0.259
 warm-0.255
Tokenados
Token ID22484
Feature activation-6.97e-02

Pos logit contributions
 obstacle+0.061
 obstacles+0.060
rob+0.060
 hears+0.057
 battles+0.057
Neg logit contributions
less-0.050
free-0.044
inky-0.043
etsy-0.040
ucky-0.039
Token were
Token ID547
Feature activation-2.09e-02

Pos logit contributions
rob+0.122
 obstacle+0.120
 obstacles+0.120
 battles+0.113
der+0.111
Neg logit contributions
less-0.088
free-0.081
 warm-0.073
 okay-0.071
etsy-0.071
Token the
Token ID262
Feature activation-8.70e-04

Pos logit contributions
rob+0.039
 obstacles+0.037
 obstacle+0.037
der+0.035
 hears+0.035
Neg logit contributions
less-0.025
free-0.022
 okay-0.022
 warm-0.021
 wet-0.020
Token most
Token ID749
Feature activation-2.60e-02

Pos logit contributions
rob+0.002
 obstacles+0.002
 obstacle+0.002
 facts+0.002
 battles+0.002
Neg logit contributions
less-0.001
free-0.001
 okay-0.001
etsy-0.001
inky-0.001
Token at
Token ID379
Feature activation-5.24e-02

Pos logit contributions
less+0.026
 okay+0.022
free+0.022
 warm+0.021
etsy+0.020
Neg logit contributions
rob-0.059
 obstacle-0.056
 obstacles-0.056
 facts-0.054
 battles-0.054
Token first
Token ID717
Feature activation-6.02e-02

Pos logit contributions
rob+0.102
 facts+0.094
Nice+0.087
 concepts+0.087
 artifacts+0.086
Neg logit contributions
 warm-0.066
 okay-0.066
 wet-0.063
 clean-0.063
 home-0.061
Token you
Token ID345
Feature activation+8.13e-03

Pos logit contributions
rob+0.132
 obstacles+0.125
 obstacle+0.123
 facts+0.122
 battles+0.120
Neg logit contributions
less-0.052
free-0.043
 okay-0.042
 warm-0.039
etsy-0.037
Token don
Token ID836
Feature activation+2.73e-02

Pos logit contributions
less+0.011
free+0.010
 okay+0.009
etsy+0.009
inky+0.009
Neg logit contributions
rob-0.013
 obstacles-0.013
 obstacle-0.013
 facts-0.013
 battles-0.013
Token't
Token ID470
Feature activation+1.06e-02

Pos logit contributions
less+0.023
free+0.020
inky+0.019
 okay+0.018
 warm+0.017
Neg logit contributions
 obstacles-0.058
rob-0.058
 obstacle-0.057
 facts-0.057
 battles-0.055
Token succeed
Token ID6758
Feature activation-1.42e-01

Pos logit contributions
less+0.016
free+0.014
etsy+0.013
inky+0.013
 okay+0.013
Neg logit contributions
 obstacles-0.016
rob-0.016
 obstacle-0.016
 battles-0.015
 facts-0.015
Token,
Token ID11
Feature activation-3.57e-02

Pos logit contributions
rob+0.337
 obstacle+0.318
 obstacles+0.316
 facts+0.314
 battles+0.305
Neg logit contributions
less-0.107
 okay-0.092
free-0.091
 warm-0.074
 wet-0.073
Token try
Token ID1949
Feature activation+2.29e-04

Pos logit contributions
rob+0.065
 obstacles+0.062
 obstacle+0.061
 facts+0.059
 battles+0.059
Neg logit contributions
less-0.043
free-0.038
 okay-0.038
 warm-0.036
etsy-0.036
Token,
Token ID11
Feature activation-3.08e-02

Pos logit contributions
 okay+0.000
less+0.000
 warm+0.000
free+0.000
 wet+0.000
Neg logit contributions
rob-0.000
 obstacle-0.000
 facts-0.000
 obstacles-0.000
 Fer-0.000
Token try
Token ID1949
Feature activation+3.98e-03

Pos logit contributions
rob+0.055
 obstacles+0.052
 obstacle+0.051
 facts+0.050
 battles+0.050
Neg logit contributions
less-0.039
free-0.034
 okay-0.034
 warm-0.032
etsy-0.032
Token again
Token ID757
Feature activation-1.03e-01

Pos logit contributions
 okay+0.005
less+0.004
 warm+0.004
free+0.004
 wet+0.004
Neg logit contributions
rob-0.008
 facts-0.007
 obstacle-0.007
 obstacles-0.007
 Fer-0.007
Token so
Token ID523
Feature activation-7.12e-03

Pos logit contributions
rob+0.094
 obstacles+0.093
 obstacle+0.092
 facts+0.089
 battles+0.088
Neg logit contributions
less-0.061
free-0.053
 okay-0.050
etsy-0.048
inky-0.048
Token brave
Token ID14802
Feature activation-5.80e-02

Pos logit contributions
 obstacles+0.012
 obstacle+0.011
 battles+0.011
 facts+0.011
 challenges+0.011
Neg logit contributions
less-0.010
free-0.008
inky-0.008
etsy-0.008
eth-0.008
Token,
Token ID11
Feature activation-2.47e-02

Pos logit contributions
rob+0.094
 obstacles+0.093
 obstacle+0.091
 facts+0.088
 battles+0.088
Neg logit contributions
less-0.079
free-0.071
 okay-0.066
etsy-0.066
inky-0.066
Token wished
Token ID16555
Feature activation+4.06e-03

Pos logit contributions
rob+0.045
 obstacle+0.042
 obstacles+0.041
 facts+0.040
 battles+0.040
Neg logit contributions
less-0.031
 okay-0.028
free-0.028
 warm-0.027
 wet-0.025
Token him
Token ID683
Feature activation-1.70e-03

Pos logit contributions
less+0.005
free+0.004
 okay+0.004
 warm+0.004
 wet+0.004
Neg logit contributions
rob-0.007
 obstacle-0.007
 obstacles-0.007
 battles-0.007
 jugg-0.007
Token luck
Token ID8458
Feature activation-1.41e-01

Pos logit contributions
rob+0.003
 obstacle+0.003
 obstacles+0.003
 jugg+0.003
der+0.003
Neg logit contributions
less-0.002
free-0.002
 okay-0.002
 warm-0.002
etsy-0.002
Token and
Token ID290
Feature activation-1.57e-02

Pos logit contributions
rob+0.264
 obstacle+0.260
 obstacles+0.260
 battles+0.249
 facts+0.248
Neg logit contributions
less-0.171
free-0.146
 okay-0.128
 warm-0.124
etsy-0.120
Token then
Token ID788
Feature activation+7.28e-03

Pos logit contributions
rob+0.026
 obstacle+0.024
 obstacles+0.023
der+0.022
 cliffs+0.022
Neg logit contributions
less-0.023
free-0.020
 warm-0.020
 okay-0.020
 wet-0.019
Token continued
Token ID3767
Feature activation+1.97e-03

Pos logit contributions
less+0.009
free+0.008
 okay+0.008
 warm+0.008
 wet+0.008
Neg logit contributions
rob-0.013
 obstacle-0.012
 obstacles-0.012
 facts-0.011
der-0.011
Token on
Token ID319
Feature activation-2.89e-02

Pos logit contributions
less+0.002
free+0.002
 warm+0.002
 wet+0.001
 okay+0.001
Neg logit contributions
rob-0.004
 obstacle-0.004
 obstacles-0.004
der-0.004
 facts-0.004
Token his
Token ID465
Feature activation+2.84e-02

Pos logit contributions
rob+0.058
der+0.053
Really+0.050
 believes+0.050
 obstacle+0.050
Neg logit contributions
 warm-0.029
less-0.028
 okay-0.028
 wet-0.025
 safe-0.025
Token.
Token ID13
Feature activation+2.00e-02

Pos logit contributions
less+0.051
free+0.044
 okay+0.041
 warm+0.040
inky+0.040
Neg logit contributions
rob-0.082
 obstacles-0.080
 obstacle-0.079
 facts-0.077
 battles-0.077
Token They
Token ID1119
Feature activation+1.04e-02

Pos logit contributions
less+0.028
free+0.025
inky+0.024
etsy+0.024
 okay+0.023
Neg logit contributions
 obstacles-0.031
rob-0.031
 facts-0.030
 obstacle-0.030
 battles-0.030
Token said
Token ID531
Feature activation+1.57e-02

Pos logit contributions
less+0.013
 warm+0.013
 wet+0.012
 okay+0.012
free+0.012
Neg logit contributions
rob-0.018
 obstacle-0.017
 obstacles-0.017
 facts-0.016
 battles-0.016
Token,
Token ID11
Feature activation-2.14e-02

Pos logit contributions
less+0.020
free+0.018
isc+0.018
inky+0.018
etsy+0.017
Neg logit contributions
 facts-0.025
der-0.025
 hears-0.025
 obstacles-0.024
 battles-0.024
Token "
Token ID366
Feature activation-4.86e-02

Pos logit contributions
rob+0.031
 obstacle+0.031
 obstacles+0.030
 battles+0.028
 facts+0.028
Neg logit contributions
less-0.035
free-0.032
 okay-0.031
 warm-0.030
etsy-0.029
TokenJu
Token ID33018
Feature activation-1.40e-01

Pos logit contributions
rob+0.067
 obstacles+0.065
 obstacle+0.063
 facts+0.062
 battles+0.061
Neg logit contributions
less-0.079
free-0.072
inky-0.068
 okay-0.068
etsy-0.068
Tokenice
Token ID501
Feature activation+1.98e-02

Pos logit contributions
 obstacles+0.212
 obstacle+0.207
 battles+0.200
rob+0.195
 facts+0.193
Neg logit contributions
less-0.206
free-0.185
ucky-0.178
inky-0.176
etsy-0.174
Token,
Token ID11
Feature activation-2.09e-02

Pos logit contributions
less+0.021
free+0.018
etsy+0.017
 okay+0.017
inky+0.016
Neg logit contributions
rob-0.038
 obstacles-0.037
 obstacle-0.037
 facts-0.036
 battles-0.036
Token cheese
Token ID9891
Feature activation+1.19e-03

Pos logit contributions
rob+0.037
 obstacles+0.036
 obstacle+0.035
 battles+0.034
 facts+0.034
Neg logit contributions
less-0.026
free-0.023
 okay-0.022
 warm-0.022
inky-0.021
Token and
Token ID290
Feature activation+3.67e-03

Pos logit contributions
less+0.001
free+0.001
 okay+0.001
inky+0.001
etsy+0.001
Neg logit contributions
rob-0.002
 obstacles-0.002
 obstacle-0.002
 battles-0.002
 facts-0.002
Token crack
Token ID8469
Feature activation-7.89e-03

Pos logit contributions
less+0.004
 warm+0.004
 wet+0.003
 okay+0.003
free+0.003
Neg logit contributions
rob-0.007
 obstacle-0.007
 obstacles-0.007
 battles-0.006
 site-0.006
Token.
Token ID13
Feature activation+3.23e-02

Pos logit contributions
less+0.086
free+0.079
etsy+0.077
 okay+0.075
eth+0.074
Neg logit contributions
der-0.089
rob-0.088
 obstacles-0.086
 obstacle-0.084
 hears-0.084
Token He
Token ID679
Feature activation-1.79e-02

Pos logit contributions
less+0.054
free+0.049
etsy+0.047
inky+0.045
eth+0.044
Neg logit contributions
der-0.042
 facts-0.041
 obstacles-0.041
rob-0.040
 hears-0.040
Token says
Token ID1139
Feature activation+2.23e-03

Pos logit contributions
 obstacles+0.029
rob+0.028
 obstacle+0.027
 accompl+0.027
 battles+0.027
Neg logit contributions
 warm-0.024
 wet-0.024
 clean-0.023
 juice-0.022
 dry-0.021
Token,
Token ID11
Feature activation-1.74e-02

Pos logit contributions
less+0.003
free+0.002
etsy+0.002
inky+0.002
eth+0.002
Neg logit contributions
 obstacles-0.004
 facts-0.004
der-0.004
rob-0.004
 battles-0.004
Token "
Token ID366
Feature activation-5.14e-02

Pos logit contributions
rob+0.028
 obstacle+0.028
 obstacles+0.028
 challenges+0.025
 battles+0.025
Neg logit contributions
less-0.026
free-0.024
 okay-0.024
 warm-0.023
etsy-0.021
TokenJu
Token ID33018
Feature activation-1.40e-01

Pos logit contributions
rob+0.077
 obstacles+0.072
 obstacle+0.069
 facts+0.069
 battles+0.068
Neg logit contributions
less-0.079
free-0.069
inky-0.068
 okay-0.068
eth-0.067
Tokenice
Token ID501
Feature activation+2.09e-02

Pos logit contributions
 obstacles+0.224
 obstacle+0.219
rob+0.215
 battles+0.214
 facts+0.205
Neg logit contributions
less-0.200
free-0.177
ucky-0.173
inky-0.167
 okay-0.163
Token is
Token ID318
Feature activation-1.62e-02

Pos logit contributions
less+0.026
free+0.023
 okay+0.022
etsy+0.022
inky+0.022
Neg logit contributions
rob-0.035
 obstacles-0.035
 obstacle-0.035
 facts-0.034
 battles-0.033
Token y
Token ID331
Feature activation-6.60e-03

Pos logit contributions
 obstacles+0.031
rob+0.030
 obstacle+0.030
 facts+0.030
 battles+0.030
Neg logit contributions
less-0.017
free-0.015
etsy-0.014
eth-0.014
inky-0.014
Tokenummy
Token ID13513
Feature activation+7.58e-03

Pos logit contributions
 obstacle+0.012
rob+0.011
 obstacles+0.011
der+0.011
 facts+0.011
Neg logit contributions
free-0.007
less-0.007
 okay-0.006
etsy-0.006
eth-0.006
Token!
Token ID0
Feature activation-4.29e-03

Pos logit contributions
less+0.009
free+0.008
etsy+0.008
 okay+0.008
inky+0.008
Neg logit contributions
rob-0.013
 obstacles-0.013
 obstacle-0.013
 facts-0.012
 battles-0.012
Token.
Token ID13
Feature activation-1.06e-02

Pos logit contributions
der+0.083
 obstacles+0.083
 obstacle+0.082
rob+0.079
 hears+0.078
Neg logit contributions
less-0.068
free-0.064
inky-0.063
etsy-0.062
eth-0.059
Token But
Token ID887
Feature activation-2.30e-02

Pos logit contributions
rob+0.018
 obstacle+0.018
 obstacles+0.017
 facts+0.017
 battles+0.017
Neg logit contributions
less-0.014
free-0.012
etsy-0.012
 okay-0.012
inky-0.012
Token Brad
Token ID8114
Feature activation-8.07e-02

Pos logit contributions
 obstacles+0.035
rob+0.034
 obstacle+0.034
 battles+0.032
 challenges+0.032
Neg logit contributions
less-0.034
free-0.031
inky-0.029
etsy-0.029
eth-0.028
Token didn
Token ID1422
Feature activation+4.16e-02

Pos logit contributions
rob+0.120
 obstacles+0.116
 obstacle+0.115
 facts+0.110
 battles+0.110
Neg logit contributions
less-0.123
free-0.110
 okay-0.106
etsy-0.105
inky-0.104
Token't
Token ID470
Feature activation+1.63e-02

Pos logit contributions
less+0.044
free+0.038
 okay+0.035
etsy+0.035
inky+0.034
Neg logit contributions
rob-0.080
 obstacles-0.079
 obstacle-0.078
 facts-0.076
 battles-0.075
Token bud
Token ID3600
Feature activation-1.39e-01

Pos logit contributions
less+0.023
free+0.021
 okay+0.020
etsy+0.019
inky+0.019
Neg logit contributions
rob-0.026
 obstacles-0.025
 obstacle-0.025
 facts-0.024
 battles-0.024
Tokenge
Token ID469
Feature activation-1.59e-02

Pos logit contributions
 obstacles+0.131
 obstacle+0.123
rob+0.122
der+0.120
 battles+0.116
Neg logit contributions
less-0.268
free-0.254
inky-0.252
etsy-0.249
isc-0.245
Token,
Token ID11
Feature activation-3.54e-02

Pos logit contributions
 obstacles+0.027
rob+0.027
 obstacle+0.027
 battles+0.026
 facts+0.026
Neg logit contributions
less-0.020
free-0.018
inky-0.017
etsy-0.017
 okay-0.016
Token he
Token ID339
Feature activation-3.13e-02

Pos logit contributions
rob+0.062
 obstacle+0.059
 obstacles+0.059
 facts+0.057
 battles+0.057
Neg logit contributions
less-0.045
free-0.040
 okay-0.039
etsy-0.037
 warm-0.037
Token remained
Token ID6150
Feature activation-1.23e-02

Pos logit contributions
rob+0.054
 obstacle+0.049
 obstacles+0.048
 facts+0.047
 battles+0.046
Neg logit contributions
less-0.044
 okay-0.039
free-0.038
etsy-0.038
 warm-0.037
Token where
Token ID810
Feature activation+4.79e-02

Pos logit contributions
rob+0.023
 obstacle+0.022
 obstacles+0.022
 facts+0.022
der+0.021
Neg logit contributions
less-0.015
 okay-0.013
 warm-0.012
free-0.012
 wet-0.012
Token
Token ID26391
Feature activation-6.26e-02

Pos logit contributions
rob+0.079
 obstacle+0.074
 obstacles+0.074
 facts+0.070
 battles+0.069
Neg logit contributions
less-0.074
free-0.064
 okay-0.062
inky-0.062
etsy-0.062
Token
Token ID220
Feature activation-3.63e-02

Pos logit contributions
rob+0.088
 obstacles+0.085
 obstacle+0.083
 battles+0.081
 facts+0.080
Neg logit contributions
less-0.099
free-0.090
inky-0.084
 okay-0.083
etsy-0.083
Token
Token ID198
Feature activation-1.12e-02

Pos logit contributions
rob+0.058
 obstacles+0.057
 obstacle+0.055
der+0.055
 facts+0.054
Neg logit contributions
less-0.048
free-0.042
etsy-0.040
 warm-0.039
 wet-0.039
Token
Token ID198
Feature activation-1.00e-02

Pos logit contributions
rob+0.019
 obstacles+0.019
 obstacle+0.018
 battles+0.017
 facts+0.017
Neg logit contributions
less-0.015
free-0.013
inky-0.012
 okay-0.012
etsy-0.012
TokenAnd
Token ID1870
Feature activation-5.46e-02

Pos logit contributions
rob+0.017
 obstacles+0.016
 obstacle+0.016
 battles+0.015
 facts+0.015
Neg logit contributions
less-0.013
free-0.012
inky-0.011
 okay-0.011
etsy-0.011
Token Ske
Token ID19460
Feature activation-1.39e-01

Pos logit contributions
rob+0.088
 obstacles+0.087
 obstacle+0.085
 facts+0.083
 battles+0.082
Neg logit contributions
less-0.074
free-0.066
inky-0.062
etsy-0.061
 okay-0.060
Tokenleton
Token ID10565
Feature activation-7.99e-02

Pos logit contributions
 obstacles+0.211
rob+0.199
 obstacle+0.199
der+0.197
 facts+0.196
Neg logit contributions
less-0.205
free-0.183
inky-0.173
eth-0.172
etsy-0.172
Token stayed
Token ID9658
Feature activation+1.51e-02

Pos logit contributions
rob+0.126
 obstacles+0.123
 obstacle+0.121
 facts+0.118
der+0.118
Neg logit contributions
less-0.108
free-0.097
 okay-0.092
inky-0.091
etsy-0.090
Token warm
Token ID5814
Feature activation+6.51e-02

Pos logit contributions
less+0.017
free+0.015
etsy+0.014
inky+0.014
 okay+0.014
Neg logit contributions
rob-0.028
 obstacles-0.028
 obstacle-0.027
 battles-0.026
 facts-0.026
Token and
Token ID290
Feature activation+3.14e-03

Pos logit contributions
less+0.095
inky+0.085
etsy+0.085
free+0.084
eth+0.081
Neg logit contributions
rob-0.095
 obstacles-0.094
der-0.092
 obstacle-0.091
 cliffs-0.087
Token cozy
Token ID37438
Feature activation+6.63e-02

Pos logit contributions
less+0.003
free+0.003
 okay+0.003
etsy+0.002
 warm+0.002
Neg logit contributions
rob-0.006
 obstacles-0.006
 obstacle-0.006
 facts-0.006
 battles-0.006
Token her
Token ID607
Feature activation+9.66e-03

Pos logit contributions
less+0.023
free+0.020
etsy+0.018
inky+0.018
eth+0.018
Neg logit contributions
 obstacles-0.039
rob-0.038
 obstacle-0.038
 facts-0.037
 battles-0.037
Token wish
Token ID4601
Feature activation-6.52e-03

Pos logit contributions
less+0.011
 warm+0.010
 okay+0.010
 wet+0.009
free+0.009
Neg logit contributions
rob-0.019
 obstacles-0.017
 battles-0.016
 cliffs-0.016
der-0.016
Token and
Token ID290
Feature activation-1.99e-02

Pos logit contributions
rob+0.012
 obstacles+0.012
 obstacle+0.012
 battles+0.011
 facts+0.011
Neg logit contributions
less-0.008
free-0.007
 okay-0.006
 warm-0.006
etsy-0.006
Token wished
Token ID16555
Feature activation+1.61e-03

Pos logit contributions
rob+0.033
 obstacles+0.032
 obstacle+0.032
 battles+0.030
 facts+0.030
Neg logit contributions
less-0.027
free-0.024
 okay-0.023
 warm-0.023
etsy-0.022
Token him
Token ID683
Feature activation+6.90e-04

Pos logit contributions
less+0.001
free+0.001
 okay+0.001
etsy+0.001
 warm+0.001
Neg logit contributions
rob-0.003
 obstacle-0.003
 obstacles-0.003
 battles-0.003
 facts-0.003
Token luck
Token ID8458
Feature activation-1.39e-01

Pos logit contributions
less+0.001
free+0.001
 okay+0.001
 warm+0.001
etsy+0.001
Neg logit contributions
rob-0.001
 obstacles-0.001
 obstacle-0.001
 battles-0.001
 facts-0.001
Token on
Token ID319
Feature activation-5.32e-02

Pos logit contributions
rob+0.265
 obstacles+0.263
 obstacle+0.261
 battles+0.251
 facts+0.249
Neg logit contributions
less-0.163
free-0.139
 okay-0.122
 warm-0.119
etsy-0.118
Token his
Token ID465
Feature activation+6.03e-03

Pos logit contributions
rob+0.117
 obstacle+0.109
 obstacles+0.108
 hears+0.105
 battles+0.104
Neg logit contributions
less-0.046
 warm-0.040
free-0.038
 okay-0.038
 wet-0.035
Token journey
Token ID7002
Feature activation-4.01e-02

Pos logit contributions
less+0.007
 warm+0.007
 okay+0.006
 wet+0.006
 clean+0.006
Neg logit contributions
rob-0.012
 obstacles-0.011
der-0.010
 obstacle-0.010
 facts-0.010
Token.
Token ID13
Feature activation-6.77e-03

Pos logit contributions
rob+0.075
 obstacles+0.073
 obstacle+0.073
 facts+0.070
 battles+0.070
Neg logit contributions
less-0.046
free-0.040
 okay-0.037
etsy-0.036
 warm-0.036
Token<|endoftext|>
Token ID50256
Feature activation-4.36e-02

Pos logit contributions
rob+0.012
 obstacles+0.011
 obstacle+0.011
 facts+0.011
der+0.011
Neg logit contributions
less-0.009
free-0.007
 okay-0.007
inky-0.007
etsy-0.007
Token.
Token ID13
Feature activation+2.35e-02

Pos logit contributions
less+0.005
free+0.004
 okay+0.004
 warm+0.004
 wet+0.004
Neg logit contributions
rob-0.008
 obstacles-0.008
 obstacle-0.008
 facts-0.008
 battles-0.008
Token They
Token ID1119
Feature activation+1.09e-02

Pos logit contributions
less+0.033
free+0.029
etsy+0.027
 okay+0.027
inky+0.027
Neg logit contributions
 obstacles-0.038
rob-0.038
 obstacle-0.037
 facts-0.036
 battles-0.036
Token have
Token ID423
Feature activation-9.09e-04

Pos logit contributions
less+0.011
free+0.011
 warm+0.011
 wet+0.010
 clean+0.010
Neg logit contributions
rob-0.021
 obstacles-0.020
 obstacle-0.020
der-0.020
 facts-0.019
Token a
Token ID257
Feature activation+1.02e-02

Pos logit contributions
rob+0.002
 obstacle+0.002
 obstacles+0.002
der+0.002
 facts+0.002
Neg logit contributions
less-0.001
free-0.001
 okay-0.001
 warm-0.001
 wet-0.001
Token lot
Token ID1256
Feature activation+4.43e-02

Pos logit contributions
less+0.016
free+0.014
 okay+0.013
etsy+0.013
inky+0.013
Neg logit contributions
rob-0.015
 obstacles-0.015
 obstacle-0.014
 facts-0.014
 battles-0.014
Token of
Token ID286
Feature activation+1.01e-01

Pos logit contributions
less+0.037
free+0.034
etsy+0.033
 okay+0.033
inky+0.032
Neg logit contributions
rob-0.091
 facts-0.091
 obstacles-0.090
 battles-0.087
 obstacle-0.087
Token fun
Token ID1257
Feature activation+3.57e-02

Pos logit contributions
less+0.122
free+0.108
 okay+0.101
 warm+0.100
inky+0.098
Neg logit contributions
rob-0.180
 obstacles-0.178
 obstacle-0.177
der-0.170
 battles-0.169
Token.
Token ID13
Feature activation+2.15e-02

Pos logit contributions
less+0.031
free+0.028
 okay+0.024
 warm+0.023
 wet+0.023
Neg logit contributions
rob-0.076
 obstacle-0.075
 obstacles-0.074
 facts-0.071
 Fer-0.071
Token The
Token ID383
Feature activation-2.84e-03

Pos logit contributions
less+0.025
free+0.022
 okay+0.021
inky+0.021
etsy+0.020
Neg logit contributions
rob-0.039
 obstacles-0.039
 obstacle-0.038
 facts-0.037
 battles-0.037
Token dog
Token ID3290
Feature activation-3.01e-02

Pos logit contributions
 obstacles+0.005
rob+0.005
 obstacle+0.004
 facts+0.004
 battles+0.004
Neg logit contributions
less-0.004
free-0.004
etsy-0.003
inky-0.003
 okay-0.003
Token is
Token ID318
Feature activation-2.35e-03

Pos logit contributions
rob+0.049
 accompl+0.047
 obstacles+0.047
 obstacle+0.047
ators+0.044
Neg logit contributions
 wet-0.039
 warm-0.038
 clean-0.035
 okay-0.033
 safe-0.032
Tokeny
Token ID88
Feature activation-2.17e-02

Pos logit contributions
rob+0.084
 obstacles+0.083
 obstacle+0.082
 facts+0.079
 battles+0.079
Neg logit contributions
less-0.069
free-0.062
etsy-0.058
inky-0.057
 okay-0.057
Token and
Token ID290
Feature activation-5.54e-03

Pos logit contributions
rob+0.036
 obstacle+0.035
 obstacles+0.034
 facts+0.033
 battles+0.033
Neg logit contributions
less-0.028
 warm-0.025
etsy-0.025
free-0.025
eth-0.025
Token Leo
Token ID19632
Feature activation-4.14e-02

Pos logit contributions
rob+0.008
 obstacles+0.008
 obstacle+0.008
 facts+0.008
der+0.008
Neg logit contributions
less-0.008
free-0.007
 okay-0.007
 warm-0.007
inky-0.007
Token both
Token ID1111
Feature activation-1.27e-03

Pos logit contributions
rob+0.058
 obstacles+0.057
 obstacle+0.056
 facts+0.054
 battles+0.054
Neg logit contributions
less-0.065
free-0.059
 okay-0.056
inky-0.056
etsy-0.056
Token cheered
Token ID34767
Feature activation-3.52e-03

Pos logit contributions
rob+0.002
 obstacles+0.002
 obstacle+0.002
 facts+0.002
 battles+0.002
Neg logit contributions
less-0.002
free-0.002
 okay-0.002
inky-0.002
etsy-0.002
Token when
Token ID618
Feature activation+7.96e-02

Pos logit contributions
 obstacles+0.005
der+0.005
 obstacle+0.005
 battles+0.004
 challenges+0.004
Neg logit contributions
less-0.005
etsy-0.005
inky-0.005
olk-0.005
free-0.005
Token they
Token ID484
Feature activation+1.53e-02

Pos logit contributions
less+0.116
free+0.104
inky+0.100
etsy+0.099
 okay+0.097
Neg logit contributions
 obstacles-0.121
rob-0.120
 obstacle-0.118
 facts-0.114
 battles-0.114
Token reached
Token ID4251
Feature activation-1.74e-02

Pos logit contributions
less+0.020
free+0.019
 warm+0.017
 wet+0.017
etsy+0.017
Neg logit contributions
rob-0.026
 facts-0.026
 obstacles-0.025
 obstacle-0.025
 battles-0.024
Token the
Token ID262
Feature activation+1.62e-03

Pos logit contributions
rob+0.033
 facts+0.031
 obstacles+0.031
 obstacle+0.030
 battles+0.030
Neg logit contributions
less-0.020
free-0.017
 warm-0.017
 okay-0.016
 wet-0.016
Token very
Token ID845
Feature activation+4.23e-03

Pos logit contributions
less+0.002
free+0.002
 okay+0.002
 warm+0.002
 wet+0.002
Neg logit contributions
rob-0.003
 obstacles-0.002
 facts-0.002
 obstacle-0.002
 battles-0.002
Token top
Token ID1353
Feature activation-2.81e-02

Pos logit contributions
less+0.006
free+0.005
 okay+0.005
etsy+0.005
inky+0.005
Neg logit contributions
 obstacles-0.007
rob-0.007
 obstacle-0.007
 battles-0.007
 facts-0.007
Token fight
Token ID1907
Feature activation-5.46e-02

Pos logit contributions
rob+0.048
der+0.045
arium+0.043
 reefs+0.042
 islands+0.042
Neg logit contributions
 clean-0.022
 warm-0.021
 wet-0.019
 okay-0.019
free-0.019
Token.
Token ID13
Feature activation-9.36e-04

Pos logit contributions
rob+0.140
 habitats+0.118
Or+0.117
arium+0.116
 cliffs+0.115
Neg logit contributions
 clean-0.038
 okay-0.037
 warm-0.036
 wet-0.036
less-0.035
Token Tim
Token ID5045
Feature activation+2.09e-02

Pos logit contributions
rob+0.002
 obstacles+0.002
 obstacle+0.002
 facts+0.002
 battles+0.002
Neg logit contributions
less-0.001
free-0.001
 okay-0.001
etsy-0.001
inky-0.001
Token was
Token ID373
Feature activation-1.07e-02

Pos logit contributions
less+0.014
free+0.013
 wet+0.013
 clean+0.012
 warm+0.012
Neg logit contributions
 facts-0.045
rob-0.045
 obstacle-0.045
 obstacles-0.045
der-0.044
Token frustrated
Token ID14718
Feature activation+3.11e-03

Pos logit contributions
rob+0.019
 obstacles+0.019
 obstacle+0.019
 facts+0.018
 battles+0.018
Neg logit contributions
less-0.013
free-0.011
 okay-0.011
 warm-0.010
 wet-0.010
Token when
Token ID618
Feature activation+8.66e-02

Pos logit contributions
less+0.003
 okay+0.003
free+0.003
 wet+0.003
 warm+0.003
Neg logit contributions
rob-0.006
 obstacle-0.006
 facts-0.006
 obstacles-0.006
 battles-0.006
Token he
Token ID339
Feature activation+3.60e-05

Pos logit contributions
less+0.095
 okay+0.087
free+0.085
 warm+0.085
 wet+0.082
Neg logit contributions
rob-0.163
 obstacle-0.154
 obstacles-0.150
 facts-0.149
der-0.149
Token couldn
Token ID3521
Feature activation+5.15e-02

Pos logit contributions
 wet+0.000
 clean+0.000
 okay+0.000
less+0.000
 tired+0.000
Neg logit contributions
rob-0.000
 facts-0.000
 obstacle-0.000
der-0.000
 cultures-0.000
Token't
Token ID470
Feature activation+3.92e-02

Pos logit contributions
less+0.040
free+0.035
inky+0.032
 okay+0.032
 warm+0.030
Neg logit contributions
 obstacles-0.111
rob-0.111
 facts-0.110
 obstacle-0.109
 battles-0.107
Token win
Token ID1592
Feature activation-8.44e-02

Pos logit contributions
 clean+0.044
free+0.044
 okay+0.041
 warm+0.041
less+0.041
Neg logit contributions
rob-0.077
 facts-0.071
 mysteries-0.070
 obstacles-0.070
 habitats-0.069
Token.
Token ID13
Feature activation+2.49e-03

Pos logit contributions
rob+0.207
 facts+0.185
Offic+0.185
 obstacle+0.183
 lair+0.180
Neg logit contributions
less-0.059
 okay-0.058
free-0.050
 wet-0.050
 clean-0.042
Token excited
Token ID6568
Feature activation+1.78e-02

Pos logit contributions
less+0.012
etsy+0.011
inky+0.011
eth+0.010
free+0.010
Neg logit contributions
 obstacles-0.017
 obstacle-0.016
 jugg-0.016
 accompl-0.016
 scoring-0.016
Token for
Token ID329
Feature activation-3.10e-02

Pos logit contributions
less+0.024
free+0.022
etsy+0.021
inky+0.021
 okay+0.020
Neg logit contributions
 obstacles-0.028
rob-0.027
 obstacle-0.027
 facts-0.027
der-0.027
Token his
Token ID465
Feature activation+1.98e-02

Pos logit contributions
rob+0.061
 obstacle+0.057
 obstacles+0.056
der+0.054
 facts+0.053
Neg logit contributions
less-0.034
 warm-0.031
free-0.029
 okay-0.029
 clean-0.029
Token first
Token ID717
Feature activation-7.33e-03

Pos logit contributions
less+0.024
free+0.020
 okay+0.020
 warm+0.019
inky+0.019
Neg logit contributions
rob-0.036
 obstacles-0.035
 obstacle-0.035
der-0.033
 facts-0.033
Token day
Token ID1110
Feature activation+1.68e-02

Pos logit contributions
rob+0.013
 obstacles+0.013
 obstacle+0.012
 facts+0.012
der+0.012
Neg logit contributions
less-0.009
free-0.008
 warm-0.007
 okay-0.007
etsy-0.007
Token of
Token ID286
Feature activation+8.85e-02

Pos logit contributions
less+0.020
free+0.018
 okay+0.017
etsy+0.017
inky+0.017
Neg logit contributions
rob-0.029
 obstacles-0.029
 obstacle-0.029
 battles-0.028
 facts-0.028
Token class
Token ID1398
Feature activation-4.17e-02

Pos logit contributions
 warm+0.083
less+0.077
 clean+0.074
 wet+0.071
 okay+0.066
Neg logit contributions
der-0.182
rob-0.174
 obstacle-0.171
Or-0.169
 obstacles-0.168
Token.
Token ID13
Feature activation+2.27e-03

Pos logit contributions
rob+0.089
 obstacles+0.087
 obstacle+0.087
 cliffs+0.083
 facts+0.082
Neg logit contributions
less-0.039
free-0.031
 okay-0.029
 warm-0.029
eth-0.027
Token He
Token ID679
Feature activation-2.23e-02

Pos logit contributions
less+0.003
free+0.002
 okay+0.002
etsy+0.002
inky+0.002
Neg logit contributions
rob-0.004
 obstacles-0.004
 obstacle-0.004
 facts-0.004
 battles-0.004
Token put
Token ID1234
Feature activation+1.49e-02

Pos logit contributions
rob+0.043
 obstacle+0.041
der+0.039
 obstacles+0.039
 heights+0.039
Neg logit contributions
 clean-0.024
 warm-0.023
less-0.023
 wet-0.022
eth-0.022
Token on
Token ID319
Feature activation-2.19e-02

Pos logit contributions
less+0.019
free+0.017
 okay+0.016
inky+0.016
etsy+0.016
Neg logit contributions
rob-0.025
 obstacles-0.025
 obstacle-0.025
 facts-0.024
 battles-0.024
Token there
Token ID612
Feature activation+3.19e-02

Pos logit contributions
 hears+0.009
Which+0.008
 obstacles+0.008
 obstacle+0.008
 mysteries+0.008
Neg logit contributions
isc-0.009
less-0.009
inky-0.008
uster-0.008
eth-0.008
Token was
Token ID373
Feature activation-9.52e-03

Pos logit contributions
less+0.056
free+0.052
inky+0.049
etsy+0.049
 okay+0.049
Neg logit contributions
rob-0.039
 obstacles-0.038
 obstacle-0.037
 facts-0.035
 battles-0.035
Token a
Token ID257
Feature activation+1.65e-02

Pos logit contributions
rob+0.009
 walks+0.009
 circus+0.009
 motorcycles+0.009
 dragons+0.008
Neg logit contributions
less-0.020
eth-0.018
etsy-0.017
inky-0.017
free-0.017
Token little
Token ID1310
Feature activation+5.77e-02

Pos logit contributions
less+0.017
free+0.014
etsy+0.013
inky+0.012
 okay+0.012
Neg logit contributions
rob-0.032
 obstacle-0.032
 obstacles-0.032
 battles-0.031
 jugg-0.030
Token boy
Token ID2933
Feature activation-1.32e-02

Pos logit contributions
less+0.086
free+0.077
 okay+0.074
etsy+0.073
opped+0.072
Neg logit contributions
 obstacle-0.087
rob-0.085
 obstacles-0.081
 circus-0.080
 Fer-0.080
Token named
Token ID3706
Feature activation+8.49e-02

Pos logit contributions
rob+0.017
 obstacles+0.016
 obstacle+0.015
 battles+0.015
 hears+0.015
Neg logit contributions
less-0.023
free-0.021
etsy-0.020
 okay-0.020
inky-0.020
Token Tim
Token ID5045
Feature activation+3.93e-02

Pos logit contributions
less+0.139
ung+0.129
etsy+0.128
ume+0.127
opped+0.126
Neg logit contributions
 motorcycles-0.112
 hunt-0.104
 tigers-0.103
 dragons-0.102
 jugg-0.102
Tokenmy
Token ID1820
Feature activation+3.09e-02

Pos logit contributions
less+0.039
free+0.033
etsy+0.031
inky+0.030
 okay+0.029
Neg logit contributions
rob-0.078
 obstacles-0.077
 obstacle-0.076
 hears-0.074
 battles-0.074
Token.
Token ID13
Feature activation+2.45e-03

Pos logit contributions
less+0.038
free+0.033
 okay+0.031
etsy+0.031
inky+0.031
Neg logit contributions
rob-0.054
 obstacles-0.054
 obstacle-0.053
 facts-0.051
 battles-0.051
Token Tim
Token ID5045
Feature activation+2.42e-02

Pos logit contributions
less+0.003
free+0.002
etsy+0.002
inky+0.002
eth+0.002
Neg logit contributions
rob-0.004
 obstacles-0.004
 obstacle-0.004
 battles-0.004
 facts-0.004
Tokenmy
Token ID1820
Feature activation+1.72e-02

Pos logit contributions
less+0.027
free+0.022
etsy+0.021
inky+0.021
 okay+0.020
Neg logit contributions
rob-0.047
 obstacles-0.045
 obstacle-0.044
 hears-0.044
 battles-0.043
Tokenmy
Token ID1820
Feature activation+5.88e-02

Pos logit contributions
less+0.071
free+0.062
etsy+0.060
inky+0.056
isc+0.056
Neg logit contributions
 hears-0.088
rob-0.083
 obstacle-0.081
 obstacles-0.080
 battles-0.080
Token and
Token ID290
Feature activation+1.65e-02

Pos logit contributions
less+0.096
free+0.086
etsy+0.084
inky+0.081
isc+0.080
Neg logit contributions
 hears-0.075
 obstacle-0.075
 obstacles-0.074
der-0.073
 battles-0.072
Token apologized
Token ID22893
Feature activation+1.07e-02

Pos logit contributions
less+0.018
 warm+0.017
 okay+0.016
free+0.016
 wet+0.016
Neg logit contributions
 obstacles-0.031
 obstacle-0.030
rob-0.030
 battles-0.029
 cliffs-0.028
Token.
Token ID13
Feature activation+2.57e-02

Pos logit contributions
less+0.012
free+0.011
 okay+0.010
 warm+0.009
inky+0.009
Neg logit contributions
rob-0.020
 obstacles-0.020
 obstacle-0.020
 battles-0.019
 facts-0.019
Token
Token ID198
Feature activation+1.81e-02

Pos logit contributions
less+0.038
free+0.032
inky+0.030
etsy+0.030
 okay+0.030
Neg logit contributions
rob-0.037
der-0.037
 facts-0.036
 obstacles-0.036
Or-0.036
Token
Token ID198
Feature activation+2.05e-02

Pos logit contributions
less+0.026
free+0.022
 okay+0.021
etsy+0.021
inky+0.020
Neg logit contributions
rob-0.029
 obstacles-0.027
 obstacle-0.026
 hears-0.026
 facts-0.026
Token"
Token ID1
Feature activation+7.06e-03

Pos logit contributions
less+0.033
isc+0.028
 okay+0.028
free+0.028
inky+0.027
Neg logit contributions
rob-0.028
 obstacles-0.027
 obstacle-0.026
 jugg-0.025
Or-0.025
TokenHere
Token ID4342
Feature activation+2.04e-02

Pos logit contributions
less+0.010
free+0.009
inky+0.009
eth+0.009
 okay+0.009
Neg logit contributions
rob-0.011
 obstacles-0.011
 obstacle-0.010
 facts-0.010
der-0.010
Token you
Token ID345
Feature activation+1.82e-02

Pos logit contributions
less+0.022
free+0.019
 okay+0.017
etsy+0.017
inky+0.017
Neg logit contributions
rob-0.039
 obstacles-0.039
 obstacle-0.038
 facts-0.037
 battles-0.037
Token go
Token ID467
Feature activation+1.13e-02

Pos logit contributions
less+0.031
etsy+0.028
free+0.028
coat+0.026
eth+0.026
Neg logit contributions
 hears-0.024
 obstacle-0.023
 obstacles-0.022
 battles-0.022
 acts-0.022
Token,
Token ID11
Feature activation-9.04e-03

Pos logit contributions
less+0.015
free+0.013
 okay+0.013
isc+0.013
etsy+0.013
Neg logit contributions
 hears-0.017
rob-0.017
 obstacle-0.016
 obstacles-0.016
 site-0.016
Token a
Token ID257
Feature activation+1.27e-02

Pos logit contributions
less+0.027
free+0.024
 okay+0.023
 warm+0.022
inky+0.021
Neg logit contributions
rob-0.047
 obstacles-0.046
 obstacle-0.045
 facts-0.044
 battles-0.043
Token lot
Token ID1256
Feature activation+3.73e-02

Pos logit contributions
less+0.014
free+0.013
 okay+0.012
etsy+0.011
inky+0.011
Neg logit contributions
rob-0.023
 obstacles-0.023
 obstacle-0.023
 facts-0.022
 battles-0.022
Token.
Token ID13
Feature activation+1.62e-02

Pos logit contributions
less+0.044
free+0.039
 okay+0.036
inky+0.035
etsy+0.035
Neg logit contributions
rob-0.068
 obstacles-0.067
 obstacle-0.066
 battles-0.064
 facts-0.063
Token She
Token ID1375
Feature activation-3.21e-04

Pos logit contributions
less+0.024
free+0.020
 wet+0.019
inky+0.019
etsy+0.019
Neg logit contributions
rob-0.024
 facts-0.024
 obstacles-0.024
der-0.023
 obstacle-0.023
Token is
Token ID318
Feature activation-2.63e-03

Pos logit contributions
 obstacles+0.000
 obstacle+0.000
rob+0.000
 battles+0.000
 cliffs+0.000
Neg logit contributions
less-0.000
free-0.000
 warm-0.000
 okay-0.000
inky-0.000
Token lucky
Token ID9670
Feature activation-2.37e-03

Pos logit contributions
rob+0.004
 obstacles+0.004
 obstacle+0.004
 facts+0.004
 battles+0.004
Neg logit contributions
less-0.003
free-0.003
etsy-0.003
eth-0.003
inky-0.003
Token to
Token ID284
Feature activation-2.80e-02

Pos logit contributions
rob+0.004
 obstacles+0.004
 obstacle+0.004
 battles+0.004
 facts+0.004
Neg logit contributions
less-0.003
free-0.003
 okay-0.002
etsy-0.002
inky-0.002
Token have
Token ID423
Feature activation-9.49e-03

Pos logit contributions
der+0.061
rob+0.060
 obstacles+0.059
 obstacle+0.059
 motorcycles+0.057
Neg logit contributions
 okay-0.022
 warm-0.021
free-0.020
less-0.019
 OK-0.019
Token you
Token ID345
Feature activation+3.38e-02

Pos logit contributions
 obstacle+0.019
rob+0.019
 obstacles+0.019
der+0.018
 cliffs+0.017
Neg logit contributions
less-0.009
 okay-0.008
free-0.008
 warm-0.008
 wet-0.007
Token."
Token ID526
Feature activation+1.21e-02

Pos logit contributions
less+0.043
free+0.038
etsy+0.037
inky+0.036
eth+0.036
Neg logit contributions
rob-0.056
 obstacles-0.055
 obstacle-0.054
 facts-0.054
 battles-0.053
Token
Token ID198
Feature activation+1.76e-02

Pos logit contributions
less+0.017
free+0.015
 okay+0.014
etsy+0.014
inky+0.014
Neg logit contributions
rob-0.019
 obstacles-0.019
 hears-0.019
 facts-0.019
der-0.019
Token "
Token ID366
Feature activation-3.12e-02

Pos logit contributions
less+0.022
free+0.020
 okay+0.019
 warm+0.019
eth+0.019
Neg logit contributions
rob-0.030
 obstacle-0.029
 obstacles-0.029
 facts-0.028
 battles-0.027
TokenBut
Token ID1537
Feature activation-1.69e-02

Pos logit contributions
 obstacle+0.050
rob+0.050
 obstacles+0.049
 facts+0.046
 Fer+0.046
Neg logit contributions
free-0.043
less-0.042
 okay-0.036
 warm-0.036
etsy-0.035
Token you
Token ID345
Feature activation+7.65e-03

Pos logit contributions
rob+0.028
 obstacles+0.027
 obstacle+0.027
 battles+0.026
 facts+0.026
Neg logit contributions
less-0.023
free-0.020
 okay-0.020
 warm-0.019
 wet-0.019
Token can
Token ID460
Feature activation+3.32e-02

Pos logit contributions
less+0.010
free+0.009
 okay+0.008
inky+0.008
etsy+0.008
Neg logit contributions
rob-0.013
 obstacles-0.013
 obstacle-0.012
 facts-0.012
 battles-0.012
Token make
Token ID787
Feature activation+1.70e-02

Pos logit contributions
less+0.046
free+0.041
 okay+0.039
etsy+0.038
inky+0.038
Neg logit contributions
rob-0.053
 obstacles-0.052
 obstacle-0.052
 facts-0.050
 battles-0.050
Token another
Token ID1194
Feature activation-2.73e-03

Pos logit contributions
less+0.021
free+0.019
 okay+0.019
 warm+0.018
inky+0.018
Neg logit contributions
rob-0.029
 obstacles-0.028
 obstacle-0.028
 facts-0.027
 battles-0.027
Token castle
Token ID16669
Feature activation+6.25e-03

Pos logit contributions
rob+0.005
 obstacles+0.005
 accompl+0.004
 concepts+0.004
 battles+0.004
Neg logit contributions
less-0.003
 warm-0.003
 wet-0.003
 okay-0.003
free-0.003
Token tomorrow
Token ID9439
Feature activation+2.64e-02

Pos logit contributions
less+0.008
free+0.007
 okay+0.006
 warm+0.006
inky+0.006
Neg logit contributions
rob-0.011
 obstacles-0.011
 obstacle-0.011
 facts-0.010
 battles-0.010
Token.
Token ID13
Feature activation+1.39e-02

Pos logit contributions
less+0.031
free+0.027
 okay+0.025
inky+0.025
etsy+0.025
Neg logit contributions
rob-0.048
 obstacles-0.048
 obstacle-0.047
 facts-0.045
 battles-0.045
Token And
Token ID843
Feature activation+1.21e-02

Pos logit contributions
less+0.021
free+0.018
etsy+0.017
inky+0.017
 wet+0.017
Neg logit contributions
 obstacles-0.020
rob-0.020
der-0.020
 obstacle-0.020
 facts-0.020
Token maybe
Token ID3863
Feature activation-2.08e-03

Pos logit contributions
less+0.015
 okay+0.014
free+0.014
 warm+0.014
 wet+0.013
Neg logit contributions
rob-0.022
 obstacles-0.020
 obstacle-0.020
 battles-0.019
 facts-0.019
Token their
Token ID511
Feature activation+4.62e-02

Pos logit contributions
rob+0.013
 obstacles+0.013
 obstacle+0.012
 facts+0.012
 battles+0.012
Neg logit contributions
less-0.012
free-0.011
 okay-0.010
etsy-0.010
inky-0.010
Token toes
Token ID23932
Feature activation+2.77e-02

Pos logit contributions
less+0.065
free+0.058
 warm+0.056
 okay+0.055
 wet+0.054
Neg logit contributions
rob-0.077
 obstacles-0.070
 obstacle-0.069
 battles-0.068
 facts-0.067
Token.
Token ID13
Feature activation+5.90e-04

Pos logit contributions
less+0.038
etsy+0.035
free+0.034
inky+0.034
isc+0.033
Neg logit contributions
der-0.043
 obstacles-0.042
 obstacle-0.041
rob-0.040
 facts-0.040
Token
Token ID198
Feature activation+2.20e-03

Pos logit contributions
less+0.001
free+0.001
eth+0.001
 okay+0.001
inky+0.001
Neg logit contributions
der-0.001
rob-0.001
 obstacles-0.001
 facts-0.001
 battles-0.001
Token
Token ID198
Feature activation+3.96e-03

Pos logit contributions
less+0.003
free+0.002
 okay+0.002
etsy+0.002
inky+0.002
Neg logit contributions
rob-0.004
 obstacles-0.004
 obstacle-0.004
 battles-0.004
 facts-0.004
TokenAfter
Token ID3260
Feature activation+4.79e-02

Pos logit contributions
less+0.006
free+0.005
inky+0.005
 okay+0.005
isc+0.005
Neg logit contributions
 obstacles-0.006
rob-0.006
 obstacle-0.006
 battles-0.006
 jugg-0.005
Token they
Token ID484
Feature activation+1.44e-02

Pos logit contributions
less+0.064
free+0.057
inky+0.054
etsy+0.054
 okay+0.052
Neg logit contributions
rob-0.080
 obstacles-0.078
 obstacle-0.076
 facts-0.075
 battles-0.075
Token finished
Token ID5201
Feature activation-8.38e-03

Pos logit contributions
less+0.016
free+0.014
 warm+0.014
 wet+0.013
 okay+0.013
Neg logit contributions
rob-0.027
 obstacles-0.027
 obstacle-0.027
 facts-0.026
 battles-0.026
Token doing
Token ID1804
Feature activation-4.28e-03

Pos logit contributions
rob+0.014
 obstacles+0.014
 obstacle+0.014
 facts+0.013
 battles+0.013
Neg logit contributions
less-0.011
free-0.010
 warm-0.009
 okay-0.009
 wet-0.009
Token yoga
Token ID20351
Feature activation-4.24e-02

Pos logit contributions
rob+0.006
 obstacles+0.006
 obstacle+0.006
 facts+0.006
der+0.006
Neg logit contributions
less-0.006
 warm-0.006
 okay-0.006
free-0.006
 wet-0.006
Token,
Token ID11
Feature activation-1.32e-02

Pos logit contributions
 obstacle+0.101
 obstacles+0.099
Offic+0.098
rob+0.097
 abrupt+0.096
Neg logit contributions
less-0.025
 warm-0.018
free-0.016
 okay-0.016
 wet-0.015
Token It
Token ID632
Feature activation+2.78e-03

Pos logit contributions
less+0.017
free+0.014
etsy+0.013
inky+0.013
 okay+0.013
Neg logit contributions
 obstacles-0.032
der-0.032
 obstacle-0.031
rob-0.031
 facts-0.031
Token was
Token ID373
Feature activation-5.71e-03

Pos logit contributions
less+0.004
free+0.004
 wet+0.004
 warm+0.004
 okay+0.004
Neg logit contributions
rob-0.004
 obstacles-0.004
 obstacle-0.004
 facts-0.004
 battles-0.004
Token her
Token ID607
Feature activation+5.14e-02

Pos logit contributions
 obstacles+0.009
 obstacle+0.009
der+0.009
 seconds+0.009
 hears+0.009
Neg logit contributions
less-0.007
free-0.007
etsy-0.007
inky-0.007
eth-0.006
Token swim
Token ID9422
Feature activation-1.11e-02

Pos logit contributions
less+0.076
free+0.075
etsy+0.073
inky+0.068
eth+0.067
Neg logit contributions
 obstacles-0.073
 obstacle-0.071
der-0.068
 battles-0.067
 challenges-0.067
Token suit
Token ID6050
Feature activation+8.74e-03

Pos logit contributions
 obstacles+0.019
 obstacle+0.019
rob+0.018
 battles+0.018
 hears+0.018
Neg logit contributions
less-0.013
free-0.012
etsy-0.012
 okay-0.011
inky-0.011
Token.
Token ID13
Feature activation+1.59e-02

Pos logit contributions
etsy+0.012
less+0.012
free+0.012
inky+0.011
uffy+0.011
Neg logit contributions
 obstacles-0.012
der-0.012
 obstacle-0.012
 battles-0.012
 hears-0.011
Token It
Token ID632
Feature activation+1.66e-03

Pos logit contributions
less+0.019
free+0.016
inky+0.014
etsy+0.014
 okay+0.014
Neg logit contributions
der-0.029
 obstacles-0.029
 obstacle-0.028
 hears-0.028
 facts-0.028
Token was
Token ID373
Feature activation-6.06e-03

Pos logit contributions
less+0.002
 wet+0.002
 warm+0.002
free+0.002
 okay+0.002
Neg logit contributions
rob-0.003
 facts-0.003
 obstacle-0.003
 obstacles-0.003
 concepts-0.003
Token too
Token ID1165
Feature activation+2.17e-02

Pos logit contributions
 obstacles+0.010
 obstacle+0.009
 battles+0.009
rob+0.009
 hears+0.009
Neg logit contributions
less-0.008
free-0.008
etsy-0.008
inky-0.008
eth-0.007
Token big
Token ID1263
Feature activation-6.49e-03

Pos logit contributions
less+0.026
etsy+0.026
free+0.025
inky+0.023
 okay+0.022
Neg logit contributions
 obstacles-0.036
 obstacle-0.035
 hears-0.034
rob-0.034
 battles-0.033
Token for
Token ID329
Feature activation-2.60e-02

Pos logit contributions
rob+0.010
 obstacles+0.010
 obstacle+0.010
 facts+0.009
 battles+0.009
Neg logit contributions
less-0.009
free-0.008
etsy-0.008
inky-0.008
 okay-0.008
Token it
Token ID340
Feature activation+6.56e-03

Pos logit contributions
less+0.106
free+0.095
inky+0.090
etsy+0.088
 okay+0.085
Neg logit contributions
 obstacles-0.133
rob-0.133
 obstacle-0.127
 battles-0.126
 facts-0.123
Token got
Token ID1392
Feature activation+3.17e-02

Pos logit contributions
less+0.011
free+0.010
etsy+0.009
 okay+0.009
inky+0.009
Neg logit contributions
rob-0.009
 obstacles-0.009
 obstacle-0.008
 hears-0.008
 battles-0.008
Token stuck
Token ID7819
Feature activation+2.02e-02

Pos logit contributions
less+0.039
etsy+0.036
free+0.034
inky+0.033
isc+0.033
Neg logit contributions
 obstacles-0.055
 obstacle-0.052
rob-0.051
 battles-0.051
 cliffs-0.050
Token in
Token ID287
Feature activation-6.18e-02

Pos logit contributions
less+0.025
etsy+0.023
free+0.022
inky+0.022
isc+0.021
Neg logit contributions
 obstacles-0.034
rob-0.033
 obstacle-0.032
der-0.031
 battles-0.031
Token a
Token ID257
Feature activation+3.82e-03

Pos logit contributions
 obstacles+0.103
rob+0.100
 obstacle+0.099
 battles+0.096
 challenges+0.094
Neg logit contributions
less-0.083
free-0.074
etsy-0.071
inky-0.069
 okay-0.068
Token tree
Token ID5509
Feature activation-3.80e-02

Pos logit contributions
less+0.005
free+0.004
 okay+0.004
etsy+0.004
inky+0.004
Neg logit contributions
 obstacles-0.006
rob-0.006
 obstacle-0.006
 battles-0.006
 facts-0.006
Token.
Token ID13
Feature activation+2.22e-03

Pos logit contributions
 obstacles+0.061
 obstacle+0.061
rob+0.059
 hears+0.058
 battles+0.057
Neg logit contributions
less-0.051
free-0.045
etsy-0.045
 okay-0.044
inky-0.043
Token Wh
Token ID854
Feature activation+1.45e-02

Pos logit contributions
less+0.004
free+0.003
 okay+0.003
ick+0.003
icking+0.003
Neg logit contributions
der-0.003
 obstacles-0.003
rob-0.003
 battles-0.003
 tigers-0.003
Tokenisk
Token ID1984
Feature activation-2.86e-02

Pos logit contributions
less+0.030
free+0.028
inky+0.027
etsy+0.027
 okay+0.027
Neg logit contributions
 obstacles-0.013
rob-0.013
 obstacle-0.013
 facts-0.012
 battles-0.012
Tokeners
Token ID364
Feature activation-9.24e-03

Pos logit contributions
rob+0.044
 obstacles+0.043
 obstacle+0.042
 battles+0.041
 facts+0.041
Neg logit contributions
less-0.041
free-0.037
 okay-0.035
inky-0.035
etsy-0.035
Token tried
Token ID3088
Feature activation+2.93e-02

Pos logit contributions
rob+0.013
 obstacles+0.013
 obstacle+0.013
 facts+0.012
 battles+0.012
Neg logit contributions
less-0.014
free-0.013
 okay-0.012
etsy-0.012
inky-0.012
Token back
Token ID736
Feature activation-3.44e-02

Pos logit contributions
 obstacles+0.008
 hears+0.007
 seconds+0.007
 battles+0.007
 obstacle+0.007
Neg logit contributions
uffy-0.009
 okay-0.009
inky-0.008
ama-0.008
less-0.008
Token to
Token ID284
Feature activation-4.85e-03

Pos logit contributions
rob+0.063
 obstacles+0.063
 obstacle+0.061
 facts+0.060
 battles+0.059
Neg logit contributions
less-0.039
free-0.034
 okay-0.032
etsy-0.031
inky-0.031
Token the
Token ID262
Feature activation+5.01e-03

Pos logit contributions
rob+0.009
 obstacles+0.009
 obstacle+0.009
 facts+0.009
der+0.009
Neg logit contributions
 warm-0.005
less-0.005
 clean-0.005
 wet-0.005
 okay-0.005
Token house
Token ID2156
Feature activation+2.47e-02

Pos logit contributions
less+0.007
free+0.006
etsy+0.006
 okay+0.006
inky+0.006
Neg logit contributions
 obstacles-0.008
rob-0.008
 obstacle-0.008
 battles-0.007
 facts-0.007
Token to
Token ID284
Feature activation-1.76e-02

Pos logit contributions
less+0.031
free+0.027
inky+0.026
etsy+0.026
 okay+0.025
Neg logit contributions
 obstacles-0.042
rob-0.042
 obstacle-0.041
 battles-0.040
 facts-0.040
Token show
Token ID905
Feature activation-6.84e-02

Pos logit contributions
rob+0.032
 obstacles+0.031
 obstacle+0.030
 facts+0.030
der+0.029
Neg logit contributions
less-0.021
 warm-0.020
 clean-0.019
 wet-0.019
free-0.019
Token his
Token ID465
Feature activation-2.59e-03

Pos logit contributions
rob+0.106
 obstacles+0.104
 obstacle+0.102
 facts+0.098
 battles+0.098
Neg logit contributions
less-0.099
free-0.089
 okay-0.084
inky-0.083
etsy-0.083
Token mom
Token ID1995
Feature activation+9.50e-03

Pos logit contributions
rob+0.005
 obstacles+0.005
 obstacle+0.005
 facts+0.005
 battles+0.005
Neg logit contributions
less-0.003
free-0.002
 warm-0.002
 okay-0.002
 wet-0.002
Token.
Token ID13
Feature activation-1.18e-04

Pos logit contributions
less+0.014
free+0.013
inky+0.013
etsy+0.013
eth+0.012
Neg logit contributions
rob-0.013
 facts-0.013
 obstacles-0.013
 obstacle-0.013
 hears-0.013
Token
Token ID220
Feature activation-3.40e-02

Pos logit contributions
 obstacles+0.000
der+0.000
rob+0.000
 facts+0.000
 hears+0.000
Neg logit contributions
less-0.000
free-0.000
 okay-0.000
inky-0.000
 wet-0.000
Token
Token ID198
Feature activation-2.39e-03

Pos logit contributions
rob+0.044
 obstacles+0.043
 hears+0.042
der+0.042
 battles+0.040
Neg logit contributions
less-0.053
free-0.048
 wet-0.044
 warm-0.044
etsy-0.044
Token taught
Token ID7817
Feature activation-4.29e-02

Pos logit contributions
less+0.022
free+0.020
 okay+0.019
 warm+0.019
etsy+0.019
Neg logit contributions
rob-0.030
 obstacle-0.028
 obstacles-0.028
 facts-0.027
 battles-0.026
Token the
Token ID262
Feature activation-4.93e-02

Pos logit contributions
rob+0.079
 obstacles+0.077
 obstacle+0.076
 facts+0.074
der+0.073
Neg logit contributions
less-0.048
free-0.043
 okay-0.040
inky-0.040
 warm-0.039
Token mouse
Token ID10211
Feature activation-3.97e-02

Pos logit contributions
rob+0.083
 obstacles+0.081
 obstacle+0.079
 facts+0.076
 battles+0.076
Neg logit contributions
less-0.065
free-0.059
 okay-0.056
inky-0.055
 warm-0.054
Token how
Token ID703
Feature activation-3.83e-02

Pos logit contributions
rob+0.054
 obstacles+0.049
 obstacle+0.049
 battles+0.045
 Fer+0.045
Neg logit contributions
less-0.067
free-0.061
 okay-0.059
 warm-0.059
inky-0.058
Token to
Token ID284
Feature activation-4.78e-02

Pos logit contributions
rob+0.073
 obstacles+0.071
 obstacle+0.071
 facts+0.068
 battles+0.068
Neg logit contributions
less-0.043
free-0.037
 okay-0.034
etsy-0.033
inky-0.033
Token sneak
Token ID20528
Feature activation-2.53e-02

Pos logit contributions
rob+0.087
 obstacles+0.081
 obstacle+0.080
der+0.078
 facts+0.076
Neg logit contributions
less-0.057
 warm-0.056
 clean-0.055
 okay-0.053
free-0.052
Token into
Token ID656
Feature activation-3.96e-02

Pos logit contributions
rob+0.047
 obstacles+0.046
 obstacle+0.046
 battles+0.044
 facts+0.044
Neg logit contributions
less-0.029
free-0.025
 okay-0.023
etsy-0.023
inky-0.023
Token the
Token ID262
Feature activation-4.77e-04

Pos logit contributions
rob+0.084
der+0.074
 facts+0.073
 believes+0.073
 obstacle+0.073
Neg logit contributions
less-0.037
 warm-0.035
 wet-0.033
free-0.033
 okay-0.032
Token farm
Token ID5318
Feature activation-2.90e-03

Pos logit contributions
rob+0.001
 obstacles+0.001
 facts+0.001
der+0.001
 obstacle+0.001
Neg logit contributions
less-0.001
free-0.001
 warm-0.001
 wet-0.001
 okay-0.001
Token and
Token ID290
Feature activation-1.50e-02

Pos logit contributions
rob+0.006
 obstacle+0.005
 obstacles+0.005
 facts+0.005
 Fer+0.005
Neg logit contributions
less-0.003
free-0.003
 okay-0.003
 warm-0.003
 wet-0.002
Token steal
Token ID8711
Feature activation-2.35e-02

Pos logit contributions
rob+0.025
 obstacles+0.023
 obstacle+0.023
 facts+0.022
der+0.022
Neg logit contributions
less-0.021
free-0.019
 okay-0.018
 warm-0.018
 wet-0.017
Token Daisy
Token ID40355
Feature activation-3.35e-03

Pos logit contributions
rob+0.083
 obstacles+0.081
 obstacle+0.080
 battles+0.078
 facts+0.077
Neg logit contributions
less-0.032
free-0.029
 okay-0.028
etsy-0.028
 warm-0.027
Token replied
Token ID8712
Feature activation-7.97e-03

Pos logit contributions
rob+0.006
 obstacles+0.006
 obstacle+0.006
 facts+0.006
 battles+0.006
Neg logit contributions
less-0.004
 okay-0.004
 warm-0.004
etsy-0.004
free-0.004
Token,
Token ID11
Feature activation-2.79e-02

Pos logit contributions
rob+0.016
 obstacles+0.016
 obstacle+0.016
 facts+0.015
 battles+0.015
Neg logit contributions
less-0.008
free-0.007
 okay-0.006
etsy-0.006
inky-0.006
Token "
Token ID366
Feature activation-5.23e-02

Pos logit contributions
rob+0.051
 obstacle+0.050
 obstacles+0.049
 facts+0.047
 battles+0.046
Neg logit contributions
less-0.036
 okay-0.036
free-0.035
 warm-0.034
etsy-0.031
TokenI
Token ID40
Feature activation-2.08e-02

Pos logit contributions
rob+0.075
 obstacles+0.073
 obstacle+0.072
 facts+0.069
 battles+0.069
Neg logit contributions
less-0.082
free-0.074
 okay-0.070
etsy-0.070
inky-0.070
Token was
Token ID373
Feature activation-1.90e-02

Pos logit contributions
rob+0.036
 obstacles+0.035
 obstacle+0.035
 facts+0.034
 battles+0.034
Neg logit contributions
less-0.027
free-0.024
 okay-0.023
etsy-0.022
inky-0.022
Token looking
Token ID2045
Feature activation-2.40e-02

Pos logit contributions
rob+0.031
 obstacles+0.030
 obstacle+0.030
 facts+0.029
 battles+0.029
Neg logit contributions
less-0.026
free-0.023
 okay-0.023
 warm-0.022
etsy-0.022
Token for
Token ID329
Feature activation-3.59e-02

Pos logit contributions
 obstacle+0.033
 obstacles+0.033
 facts+0.030
der+0.030
 challenges+0.030
Neg logit contributions
less-0.037
etsy-0.034
eth-0.033
inky-0.033
free-0.032
Token a
Token ID257
Feature activation-4.54e-03

Pos logit contributions
rob+0.072
 obstacle+0.066
 obstacles+0.066
 battles+0.064
 facts+0.064
Neg logit contributions
less-0.039
 warm-0.036
 okay-0.035
 wet-0.033
free-0.032
Token silly
Token ID14397
Feature activation+2.11e-02

Pos logit contributions
 obstacles+0.007
 obstacle+0.007
rob+0.007
 battles+0.007
 facts+0.007
Neg logit contributions
less-0.006
free-0.005
inky-0.005
etsy-0.005
 okay-0.005
Token butter
Token ID9215
Feature activation+7.03e-03

Pos logit contributions
less+0.027
free+0.024
 okay+0.022
etsy+0.022
inky+0.022
Neg logit contributions
rob-0.036
 obstacles-0.036
 obstacle-0.035
 facts-0.034
 battles-0.034
Token.
Token ID13
Feature activation+6.22e-03

Pos logit contributions
less+0.005
free+0.005
 okay+0.004
etsy+0.004
inky+0.004
Neg logit contributions
rob-0.008
 obstacles-0.007
 obstacle-0.007
 facts-0.007
 battles-0.007
Token The
Token ID383
Feature activation-7.82e-03

Pos logit contributions
less+0.007
free+0.006
 okay+0.006
etsy+0.006
inky+0.005
Neg logit contributions
der-0.011
 obstacles-0.011
rob-0.011
 obstacle-0.011
 facts-0.011
Token nurse
Token ID15849
Feature activation-3.73e-02

Pos logit contributions
 obstacles+0.012
 obstacle+0.012
rob+0.012
 facts+0.011
 hears+0.011
Neg logit contributions
less-0.011
free-0.010
inky-0.010
etsy-0.010
 okay-0.009
Token smiled
Token ID13541
Feature activation-4.16e-02

Pos logit contributions
rob+0.060
 obstacles+0.059
 obstacle+0.059
 facts+0.056
 battles+0.056
Neg logit contributions
less-0.051
free-0.046
 okay-0.045
etsy-0.044
 warm-0.043
Token and
Token ID290
Feature activation-7.18e-03

Pos logit contributions
der+0.055
 obstacles+0.052
 obstacle+0.051
 hears+0.049
 battles+0.048
Neg logit contributions
less-0.067
free-0.062
etsy-0.062
inky-0.061
isc-0.059
Token said
Token ID531
Feature activation-5.58e-03

Pos logit contributions
 obstacles+0.014
 obstacle+0.014
rob+0.014
 battles+0.013
 facts+0.013
Neg logit contributions
less-0.007
 warm-0.006
free-0.006
 okay-0.006
 wet-0.006
Token,
Token ID11
Feature activation-3.67e-02

Pos logit contributions
der+0.010
 obstacles+0.009
 facts+0.009
 obstacle+0.009
rob+0.009
Neg logit contributions
less-0.006
isc-0.006
free-0.006
inky-0.005
eth-0.005
Token "
Token ID366
Feature activation-5.67e-02

Pos logit contributions
 obstacle+0.065
rob+0.065
 obstacles+0.064
 facts+0.059
 challenges+0.059
Neg logit contributions
less-0.049
free-0.049
 okay-0.047
 warm-0.043
etsy-0.041
TokenIt
Token ID1026
Feature activation-2.42e-02

Pos logit contributions
rob+0.080
 obstacles+0.078
 obstacle+0.076
 facts+0.073
 battles+0.072
Neg logit contributions
less-0.090
free-0.080
inky-0.078
 okay-0.077
eth-0.076
Token's
Token ID338
Feature activation-1.87e-02

Pos logit contributions
 obstacles+0.035
rob+0.035
 obstacle+0.034
 hears+0.034
 believes+0.033
Neg logit contributions
less-0.037
free-0.032
etsy-0.031
inky-0.031
eth-0.030
Token okay
Token ID8788
Feature activation+3.04e-02

Pos logit contributions
 obstacles+0.041
 obstacle+0.041
 battles+0.040
 challenges+0.040
 mysteries+0.039
Neg logit contributions
less-0.013
free-0.011
eth-0.010
inky-0.010
etsy-0.010
Token girl
Token ID2576
Feature activation-2.03e-02

Pos logit contributions
inky+0.045
less+0.043
free+0.042
ittens+0.039
isc+0.039
Neg logit contributions
 obstacle-0.042
 obstacles-0.041
 acts-0.040
 mysteries-0.039
 challenges-0.038
Token never
Token ID1239
Feature activation+1.84e-02

Pos logit contributions
rob+0.031
 obstacles+0.029
 obstacle+0.029
 facts+0.028
 battles+0.027
Neg logit contributions
less-0.030
free-0.028
 okay-0.027
etsy-0.026
 warm-0.026
Token got
Token ID1392
Feature activation+2.34e-02

Pos logit contributions
less+0.028
free+0.025
 okay+0.024
 wet+0.023
 warm+0.023
Neg logit contributions
rob-0.029
 obstacles-0.027
 obstacle-0.027
 facts-0.026
 battles-0.025
Token to
Token ID284
Feature activation-2.27e-03

Pos logit contributions
less+0.032
inky+0.031
free+0.031
isc+0.031
eth+0.030
Neg logit contributions
 obstacles-0.035
 hears-0.033
 obstacle-0.033
 mysteries-0.033
 challenges-0.032
Token play
Token ID711
Feature activation+2.16e-02

Pos logit contributions
rob+0.004
 obstacles+0.004
 obstacle+0.004
der+0.004
 facts+0.004
Neg logit contributions
less-0.003
 warm-0.003
 okay-0.003
free-0.003
 wet-0.003
Token again
Token ID757
Feature activation-9.09e-02

Pos logit contributions
less+0.023
free+0.020
 okay+0.019
 warm+0.018
 wet+0.017
Neg logit contributions
rob-0.043
 obstacles-0.042
 obstacle-0.041
 facts-0.040
 battles-0.039
Token.
Token ID13
Feature activation+2.63e-03

Pos logit contributions
rob+0.208
 facts+0.187
 obstacles+0.186
 obstacle+0.186
 Fer+0.184
Neg logit contributions
less-0.077
 okay-0.068
 warm-0.065
free-0.064
 wet-0.060
Token<|endoftext|>
Token ID50256
Feature activation-4.27e-02

Pos logit contributions
less+0.003
free+0.003
inky+0.003
 okay+0.003
 warm+0.003
Neg logit contributions
 obstacles-0.004
rob-0.004
 obstacle-0.004
 facts-0.004
der-0.004
TokenOnce
Token ID7454
Feature activation+7.71e-03

Pos logit contributions
rob+0.055
 obstacles+0.050
 hears+0.049
 obstacle+0.046
 cliffs+0.046
Neg logit contributions
less-0.075
 okay-0.066
free-0.066
inky-0.065
etsy-0.065
Token upon
Token ID2402
Feature activation+3.44e-03

Pos logit contributions
less+0.009
free+0.008
inky+0.008
opped+0.007
etsy+0.007
Neg logit contributions
 hears-0.013
 obstacle-0.013
 obstacles-0.013
rob-0.012
der-0.012
Token a
Token ID257
Feature activation-3.69e-03

Pos logit contributions
less+0.006
inky+0.005
etsy+0.005
eth+0.005
isc+0.005
Neg logit contributions
 hears-0.005
 obstacles-0.005
rob-0.005
 battles-0.004
 facts-0.004
Token was
Token ID373
Feature activation-1.05e-02

Pos logit contributions
less+0.050
free+0.046
inky+0.043
etsy+0.043
 okay+0.043
Neg logit contributions
rob-0.034
 obstacles-0.033
 obstacle-0.033
 facts-0.031
 battles-0.031
Token a
Token ID257
Feature activation+1.32e-02

Pos logit contributions
rob+0.009
 obstacles+0.009
 circus+0.009
 tigers+0.009
 planets+0.009
Neg logit contributions
less-0.022
eth-0.020
inky-0.019
etsy-0.019
uffy-0.019
Token young
Token ID1862
Feature activation+2.83e-02

Pos logit contributions
less+0.014
free+0.012
etsy+0.011
inky+0.010
 okay+0.010
Neg logit contributions
 obstacle-0.025
rob-0.025
 obstacles-0.025
 jugg-0.024
 battles-0.024
Token girl
Token ID2576
Feature activation+6.89e-03

Pos logit contributions
less+0.038
free+0.034
 okay+0.032
etsy+0.031
inky+0.031
Neg logit contributions
rob-0.046
 obstacle-0.045
 obstacles-0.044
 jugg-0.042
 battles-0.042
Token named
Token ID3706
Feature activation+7.52e-02

Pos logit contributions
less+0.012
free+0.011
inky+0.011
etsy+0.011
 okay+0.011
Neg logit contributions
rob-0.008
 obstacle-0.007
 obstacles-0.007
 hears-0.007
 facts-0.007
Token Av
Token ID5184
Feature activation-9.01e-02

Pos logit contributions
less+0.104
opped+0.094
free+0.093
isc+0.089
uster+0.089
Neg logit contributions
 hears-0.098
 mysteries-0.097
 museums-0.097
 dragons-0.093
 obstacle-0.093
Tokena
Token ID64
Feature activation-6.21e-03

Pos logit contributions
rob+0.232
 obstacles+0.231
 obstacle+0.227
 facts+0.221
 battles+0.220
Neg logit contributions
less-0.042
free-0.027
inky-0.023
etsy-0.020
 okay-0.019
Token.
Token ID13
Feature activation-5.59e-03

Pos logit contributions
rob+0.013
 obstacles+0.013
 obstacle+0.012
 battles+0.012
 facts+0.012
Neg logit contributions
less-0.006
free-0.005
inky-0.005
 warm-0.005
 okay-0.005
Token She
Token ID1375
Feature activation-8.90e-03

Pos logit contributions
rob+0.010
 obstacles+0.009
 obstacle+0.009
 facts+0.009
 battles+0.009
Neg logit contributions
less-0.007
free-0.006
 okay-0.006
etsy-0.006
inky-0.006
Token liked
Token ID8288
Feature activation+5.70e-03

Pos logit contributions
 obstacles+0.013
 obstacle+0.013
rob+0.013
 facts+0.012
 battles+0.012
Neg logit contributions
less-0.013
free-0.012
 warm-0.012
 okay-0.011
 wet-0.011
Token to
Token ID284
Feature activation-2.25e-02

Pos logit contributions
less+0.005
free+0.004
 warm+0.004
 wet+0.004
 okay+0.004
Neg logit contributions
rob-0.012
 obstacle-0.012
 obstacles-0.012
der-0.011
 believes-0.011
Token eat
Token ID4483
Feature activation-5.59e-03

Pos logit contributions
less+0.001
 warm+0.000
free+0.000
 okay+0.000
 wet+0.000
Neg logit contributions
rob-0.001
 obstacles-0.001
 obstacle-0.001
 facts-0.001
 battles-0.001
Token,
Token ID11
Feature activation-1.43e-02

Pos logit contributions
rob+0.011
 obstacles+0.011
 obstacle+0.011
 battles+0.010
 facts+0.010
Neg logit contributions
less-0.006
free-0.005
 okay-0.005
 warm-0.005
inky-0.004
Token they
Token ID484
Feature activation+5.70e-03

Pos logit contributions
 obstacles+0.020
rob+0.019
 obstacle+0.019
 battles+0.019
 cliffs+0.019
Neg logit contributions
less-0.023
inky-0.020
free-0.020
etsy-0.019
eth-0.019
Token smiled
Token ID13541
Feature activation-3.45e-02

Pos logit contributions
less+0.008
 warm+0.008
free+0.008
 okay+0.008
 wet+0.007
Neg logit contributions
rob-0.009
 obstacle-0.009
 obstacles-0.008
 facts-0.008
 battles-0.008
Token at
Token ID379
Feature activation-2.37e-02

Pos logit contributions
der+0.052
 obstacles+0.051
 obstacle+0.050
 battles+0.048
 facts+0.048
Neg logit contributions
less-0.049
etsy-0.045
free-0.045
inky-0.044
eth-0.042
Token each
Token ID1123
Feature activation-8.86e-02

Pos logit contributions
rob+0.039
 obstacle+0.035
 obstacles+0.034
 battles+0.033
 Fer+0.033
Neg logit contributions
less-0.035
 warm-0.032
free-0.032
 okay-0.031
 wet-0.030
Token other
Token ID584
Feature activation+2.91e-03

Pos logit contributions
 obstacles+0.125
rob+0.125
 obstacle+0.123
 battles+0.117
 facts+0.117
Neg logit contributions
less-0.139
free-0.126
etsy-0.120
inky-0.120
 okay-0.119
Token.
Token ID13
Feature activation-5.63e-03

Pos logit contributions
less+0.004
free+0.003
etsy+0.003
inky+0.003
 okay+0.003
Neg logit contributions
 obstacles-0.005
rob-0.005
 obstacle-0.005
 facts-0.005
 battles-0.005
Token The
Token ID383
Feature activation-1.70e-02

Pos logit contributions
rob+0.010
 obstacles+0.010
 obstacle+0.010
 facts+0.010
 battles+0.010
Neg logit contributions
less-0.007
free-0.006
 okay-0.005
etsy-0.005
inky-0.005
Token cat
Token ID3797
Feature activation-1.93e-02

Pos logit contributions
 obstacles+0.028
 obstacle+0.028
rob+0.027
 facts+0.026
 battles+0.026
Neg logit contributions
less-0.023
free-0.021
etsy-0.020
eth-0.019
inky-0.019
Token said
Token ID531
Feature activation-6.27e-03

Pos logit contributions
rob+0.033
 obstacles+0.032
 obstacle+0.032
 facts+0.030
 battles+0.030
Neg logit contributions
less-0.024
 warm-0.023
 wet-0.022
 okay-0.021
etsy-0.021
Token and
Token ID290
Feature activation-6.52e-03

Pos logit contributions
 obstacles+0.033
rob+0.033
der+0.033
 obstacle+0.032
 battles+0.031
Neg logit contributions
less-0.036
free-0.033
etsy-0.032
inky-0.032
 okay-0.031
Token showed
Token ID3751
Feature activation-4.74e-02

Pos logit contributions
rob+0.010
 obstacle+0.010
 obstacles+0.010
 facts+0.009
der+0.009
Neg logit contributions
less-0.009
free-0.008
 warm-0.008
 okay-0.008
 wet-0.008
Token them
Token ID606
Feature activation+1.32e-02

Pos logit contributions
rob+0.063
 obstacles+0.063
 obstacle+0.061
 facts+0.059
 battles+0.059
Neg logit contributions
less-0.078
free-0.070
 okay-0.067
etsy-0.067
inky-0.067
Token his
Token ID465
Feature activation-3.08e-03

Pos logit contributions
less+0.024
free+0.022
etsy+0.021
inky+0.021
 okay+0.021
Neg logit contributions
 obstacles-0.015
rob-0.015
 obstacle-0.015
 facts-0.014
 battles-0.014
Token new
Token ID649
Feature activation-2.19e-02

Pos logit contributions
rob+0.006
 obstacle+0.005
 obstacles+0.005
der+0.005
 battles+0.005
Neg logit contributions
less-0.004
 warm-0.004
free-0.004
 okay-0.003
 wet-0.003
Token trick
Token ID6908
Feature activation-9.77e-02

Pos logit contributions
rob+0.040
der+0.034
 obstacle+0.033
 obstacles+0.033
 accompl+0.032
Neg logit contributions
less-0.030
 warm-0.026
 wet-0.025
free-0.025
 okay-0.024
Token.
Token ID13
Feature activation-6.00e-03

Pos logit contributions
rob+0.168
 obstacles+0.167
 obstacle+0.164
 facts+0.159
 battles+0.158
Neg logit contributions
less-0.123
free-0.108
 okay-0.101
etsy-0.100
inky-0.100
Token Hipp
Token ID33812
Feature activation-9.28e-02

Pos logit contributions
rob+0.010
 obstacle+0.010
 obstacles+0.010
 facts+0.009
 battles+0.009
Neg logit contributions
less-0.008
free-0.007
inky-0.007
etsy-0.007
 okay-0.007
Tokeny
Token ID88
Feature activation-3.68e-02

Pos logit contributions
rob+0.155
 obstacles+0.152
 obstacle+0.149
 facts+0.145
 battles+0.144
Neg logit contributions
less-0.118
free-0.103
 okay-0.103
 warm-0.098
inky-0.095
Token could
Token ID714
Feature activation+4.93e-02

Pos logit contributions
rob+0.062
 obstacle+0.059
 obstacles+0.058
 facts+0.056
 accompl+0.054
Neg logit contributions
less-0.048
free-0.044
 warm-0.043
etsy-0.041
 wet-0.041
Token blow
Token ID6611
Feature activation+3.73e-03

Pos logit contributions
less+0.057
 warm+0.056
free+0.053
 okay+0.053
 clean+0.052
Neg logit contributions
rob-0.093
 obstacles-0.085
 obstacle-0.084
 facts-0.080
 battles-0.079
Token just
Token ID655
Feature activation-6.16e-03

Pos logit contributions
less+0.003
free+0.002
 okay+0.002
 warm+0.002
 wet+0.002
Neg logit contributions
rob-0.005
 obstacle-0.005
 obstacles-0.005
 facts-0.005
 battles-0.005
Token had
Token ID550
Feature activation+1.34e-03

Pos logit contributions
rob+0.010
 obstacles+0.010
 obstacle+0.010
 facts+0.009
 battles+0.009
Neg logit contributions
less-0.009
free-0.008
 okay-0.007
 warm-0.007
etsy-0.007
Token an
Token ID281
Feature activation-4.95e-02

Pos logit contributions
less+0.001
free+0.001
 okay+0.001
 warm+0.001
 wet+0.001
Neg logit contributions
rob-0.003
 obstacle-0.002
 obstacles-0.002
 facts-0.002
 battles-0.002
Token amazing
Token ID4998
Feature activation-5.62e-02

Pos logit contributions
rob+0.097
 obstacles+0.090
 hears+0.089
 battles+0.086
Or+0.086
Neg logit contributions
 warm-0.051
less-0.051
 wet-0.048
 okay-0.048
 clean-0.045
Token fight
Token ID1907
Feature activation-4.16e-02

Pos logit contributions
rob+0.100
 obstacles+0.097
der+0.092
 obstacle+0.092
 battles+0.091
Neg logit contributions
less-0.072
 warm-0.063
 wet-0.059
 okay-0.058
free-0.057
Token against
Token ID1028
Feature activation-9.32e-02

Pos logit contributions
rob+0.087
 obstacle+0.082
 obstacles+0.081
 facts+0.079
 cliffs+0.078
Neg logit contributions
less-0.045
free-0.036
 okay-0.035
 warm-0.031
 wet-0.030
Token a
Token ID257
Feature activation-5.36e-03

Pos logit contributions
rob+0.201
 obstacle+0.177
 believes+0.176
der+0.172
 facts+0.170
Neg logit contributions
less-0.090
 warm-0.080
 okay-0.077
 wet-0.076
 clean-0.072
Token mud
Token ID17492
Feature activation+8.43e-02

Pos logit contributions
rob+0.008
 obstacles+0.008
 facts+0.008
der+0.007
 hears+0.007
Neg logit contributions
less-0.008
 warm-0.007
free-0.007
 wet-0.007
inky-0.007
Token p
Token ID279
Feature activation-2.05e-02

Pos logit contributions
less+0.108
free+0.092
 wet+0.088
inky+0.086
ucky+0.086
Neg logit contributions
rob-0.144
 obstacles-0.143
 obstacle-0.138
 facts-0.138
 challenges-0.137
Tokenuddle
Token ID24500
Feature activation+2.97e-02

Pos logit contributions
rob+0.037
 obstacle+0.037
 obstacles+0.037
der+0.036
 facts+0.035
Neg logit contributions
less-0.023
free-0.021
 okay-0.020
etsy-0.019
 warm-0.018
Token.
Token ID13
Feature activation+5.90e-03

Pos logit contributions
less+0.034
free+0.029
 okay+0.026
inky+0.026
 warm+0.026
Neg logit contributions
rob-0.056
 obstacles-0.055
 obstacle-0.054
 facts-0.053
 battles-0.052