TOP ACTIVATIONS
MAX = 0.099

to explore the garden and look for bugs . One day
walk around his neighborhood and look at all the pretty houses
walk around his neighborhood and look at all the interesting things
play in her backyard and look at the big wall that
wander around her garden and look at all the pretty flowers
and proud of himself for accompl ishing such a difficult task
Lily . She loved to look at the stars in the
. Tim my loved to look up at the sky and
. Tim my loved to look up at the sky and
Lily . She loved to look at the stars at night
walk in the garden and look at all the pretty flowers
things in his hands and look at them . One day
up at the sky and look at the clouds . One
thought they looked like ac rob ats . She enjoyed the
Lily . She loved to look up at the sky and
girl . She liked to look up into the night sky
wander around her backyard and look for animals . One day
outside in the grass and look for bugs . One day
loved to play outside and look for pretty rocks . One
Lily . She loved to look up at the sky and

MIN ACTIVATIONS
MIN = -0.073

, it has a tip like a star ." Tom says
work trip . <|endoftext|> Once upon a time , there was
of chocolate . <|endoftext|> Once upon a time , there was
playing too deep <|endoftext|> Once upon a time there was a
Thankfully , the <|endoftext|> Once upon a time there was a
hole forever . <|endoftext|> Once upon a time , there was
the day ! <|endoftext|> Once upon a time , there was
been waiting for <|endoftext|> Once upon a time , there was
The end . <|endoftext|> Once upon a time , in a
dream on a miserable Tuesday like this ?
bad ending . <|endoftext|> Once upon a time , there was
picnic soon ." <|endoftext|> Once upon a time , there was
too late ! <|endoftext|> Once upon a time there was a
the tower !" <|endoftext|> Once upon a time , there was
too late . <|endoftext|> Once upon a time there were two
Experience won ! <|endoftext|> Once upon a time , there were
sang again . <|endoftext|> Once upon a time , there was
about itself . <|endoftext|> Once upon a time , there was
really strong !" <|endoftext|> Once upon a time , there was
and play . <|endoftext|> Once upon a time there was a

INTERVAL 0.056 - 0.099
CONTAINS 0.199%

radio at home . She looked out the window . It
to brand new ! They looked around in amaz ement and
. She made sure to look up at the stars each
them . Mom looks at them and smiles too
original idea . They looked very different from each other ,

INTERVAL 0.013 - 0.056
CONTAINS 17.141%

, Lily and Ben ? Why are you fighting and pin
was inside . " Let 's go to the
for a better look . M andy asked
girl to march . The little girl was very
again . " Here you go , kids ,

INTERVAL -0.030 - 0.013
CONTAINS 80.561%

it 's not about the score of how many toys he
Rex went to the park . They liked to play with
to talk . They say hello , how are you ,
<|endoftext|> S ara and Ben are twins . They like to
garden . It quickly started to scatter the vegetables and flowers

INTERVAL -0.073 - -0.030
CONTAINS 2.100%

Tom was trading something that wasn 't fair . The fox
the sky and then it started to laugh . It was
! The puppy was so pleased . The puppy
was brown and soft and had a pillow on it .
. They said sorry and gave back the shiny things .
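
The four INTERVAL bounds above appear to be equal-width quarters of the range from MIN (-0.073) to MAX (0.099). This is inferred from the listed numbers; the dump itself does not state how the intervals were chosen. A minimal Python sketch, using a hypothetical helper named `interval_edges`, reproduces the edges under that assumption:

```python
def interval_edges(lo, hi, n_bins=4, ndigits=3):
    """Return n_bins + 1 equal-width bin edges between lo and hi.

    Assumption: the dashboard splits [MIN, MAX] into equal-width bins.
    The helper name and the rounding to 3 decimals are illustrative only.
    """
    width = (hi - lo) / n_bins
    return [round(lo + i * width, ndigits) for i in range(n_bins + 1)]


# MIN and MAX as reported in the dump
edges = interval_edges(-0.073, 0.099)

# Pair consecutive edges into (low, high) intervals, lowest first
intervals = list(zip(edges[:-1], edges[1:]))
for low, high in intervals:
    print(f"{low:.3f} - {high:.3f}")
```

Each bin is 0.043 wide, which agrees with the bounds -0.073, -0.030, 0.013, 0.056 and 0.099 shown in the INTERVAL headers.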

TOKEN DETAILS

Token to
Token ID284
Feature activation+2.49e-02

Pos logit contributions
 possessions+0.010
 possession+0.010
+0.010
izing+0.010
ement+0.010
Neg logit contributions
 a-0.005
 two-0.005
 Ben-0.004
 sne-0.004
 blast-0.004
Token explore
Token ID7301
Feature activation+5.09e-02

Pos logit contributions
 possessions+0.044
 possession+0.042
+0.040
 Acc+0.040
izing+0.040
Neg logit contributions
 a-0.029
 two-0.028
 Ben-0.025
 blast-0.025
 sne-0.025
Token the
Token ID262
Feature activation+3.14e-02

Pos logit contributions
ishly+0.131
 Acc+0.130
osity+0.130
 possessions+0.129
 mattered+0.127
Neg logit contributions
 a-0.046
 two-0.029
 Ben-0.021
 many-0.019
 some-0.019
Token garden
Token ID11376
Feature activation+2.67e-02

Pos logit contributions
 possessions+0.043
 possession+0.041
+0.039
izing+0.039
 Acc+0.039
Neg logit contributions
 a-0.049
 two-0.048
 Ben-0.044
 blast-0.044
 sne-0.044
Token and
Token ID290
Feature activation+2.83e-02

Pos logit contributions
 possessions+0.053
 Acc+0.051
 possession+0.050
 disl+0.050
+0.050
Neg logit contributions
 a-0.032
 two-0.028
 Ben-0.023
 four-0.022
 sne-0.022
Token look
Token ID804
Feature activation+9.90e-02

Pos logit contributions
ement+0.071
izing+0.069
 possessions+0.068
 possession+0.066
ating+0.065
Neg logit contributions
 a-0.026
 two-0.020
 sne-0.018
 roast-0.017
 Ben-0.016
Token for
Token ID329
Feature activation+2.49e-02

Pos logit contributions
+0.226
 possessions+0.223
ths+0.221
 mattered+0.221
 Acc+0.220
Neg logit contributions
 a-0.097
 two-0.073
 Ben-0.070
 four-0.056
 finish-0.052
Token bugs
Token ID11316
Feature activation+2.58e-02

Pos logit contributions
+0.041
 Acc+0.039
 possessions+0.039
bing+0.038
izing+0.038
Neg logit contributions
 a-0.038
 two-0.034
 Ben-0.030
 sne-0.030
 four-0.029
Token.
Token ID13
Feature activation+2.66e-02

Pos logit contributions
 possessions+0.053
 possession+0.051
 Acc+0.050
+0.050
izing+0.050
Neg logit contributions
 a-0.024
 two-0.022
 Ben-0.019
 sne-0.019
 blast-0.018
Token One
Token ID1881
Feature activation-1.56e-02

Pos logit contributions
 possessions+0.063
izing+0.060
 possession+0.060
ths+0.059
ized+0.058
Neg logit contributions
 a-0.023
 Ben-0.021
 two-0.020
 roast-0.015
 four-0.015
Token day
Token ID1110
Feature activation+2.04e-02

Pos logit contributions
 a+0.022
 two+0.020
 Ben+0.018
 sne+0.018
 roast+0.018
Neg logit contributions
 possessions-0.026
izing-0.025
-0.024
 possession-0.024
ement-0.024
Token walk
Token ID2513
Feature activation+2.66e-02

Pos logit contributions
 possessions+0.036
 bicycle+0.035
 possession+0.034
 ornament+0.034
 Acc+0.034
Neg logit contributions
 two-0.030
 a-0.029
ata-0.028
rip-0.028
 blah-0.027
Token around
Token ID1088
Feature activation+3.92e-02

Pos logit contributions
 possessions+0.049
 possession+0.047
 Acc+0.046
izing+0.046
+0.046
Neg logit contributions
 a-0.030
 two-0.028
 Ben-0.025
 sne-0.025
 blast-0.025
Token his
Token ID465
Feature activation+2.36e-02

Pos logit contributions
 possessions+0.073
 possession+0.072
+0.070
 angel+0.070
 bicycle+0.070
Neg logit contributions
 two-0.036
ata-0.034
 a-0.034
umbs-0.034
rip-0.034
Token neighborhood
Token ID6232
Feature activation+1.97e-02

Pos logit contributions
 possessions+0.039
 Acc+0.036
izing+0.036
ement+0.036
 mattered+0.036
Neg logit contributions
 a-0.033
 two-0.031
 Ben-0.028
 gas-0.027
 sne-0.027
Token and
Token ID290
Feature activation+2.65e-02

Pos logit contributions
 possessions+0.036
 possession+0.035
 Acc+0.034
+0.034
izing+0.033
Neg logit contributions
 a-0.023
 two-0.022
 Ben-0.019
 sne-0.019
 roast-0.018
Token look
Token ID804
Feature activation+9.69e-02

Pos logit contributions
ement+0.055
izing+0.054
 possessions+0.053
 possession+0.051
ating+0.050
Neg logit contributions
 a-0.030
 two-0.027
 sne-0.026
 roast-0.025
 Ben-0.025
Token at
Token ID379
Feature activation+2.37e-02

Pos logit contributions
 possessions+0.202
+0.201
 possession+0.197
 mattered+0.195
 Acc+0.195
Neg logit contributions
 a-0.104
 two-0.084
 Ben-0.080
 four-0.070
 sne-0.069
Token all
Token ID477
Feature activation+3.52e-02

Pos logit contributions
 possessions+0.053
+0.053
izing+0.052
 Acc+0.051
bing+0.050
Neg logit contributions
 a-0.020
 two-0.017
 Ben-0.014
 gas-0.013
 four-0.013
Token the
Token ID262
Feature activation+3.33e-02

Pos logit contributions
+0.087
 possessions+0.086
Really+0.084
ths+0.084
 Acc+0.084
Neg logit contributions
 a-0.024
 two-0.021
 Ben-0.016
 gas-0.013
 water-0.012
Token pretty
Token ID2495
Feature activation+2.70e-02

Pos logit contributions
 possessions+0.051
 possession+0.050
+0.048
 Acc+0.047
izing+0.047
Neg logit contributions
 a-0.048
 two-0.046
 Ben-0.042
 sne-0.041
 blast-0.041
Token houses
Token ID7777
Feature activation+2.30e-02

Pos logit contributions
 Acc+0.047
+0.045
ths+0.044
ishly+0.044
 possession+0.043
Neg logit contributions
 a-0.037
 two-0.035
 water-0.032
 gas-0.031
 Ben-0.030
Token walk
Token ID2513
Feature activation+2.96e-02

Pos logit contributions
 possessions+0.039
 bicycle+0.038
 possession+0.037
 ornament+0.037
 Acc+0.036
Neg logit contributions
 two-0.032
 a-0.031
rip-0.030
ata-0.030
 Ben-0.029
Token around
Token ID1088
Feature activation+3.98e-02

Pos logit contributions
 possessions+0.053
 possession+0.051
+0.049
izing+0.049
 Acc+0.049
Neg logit contributions
 a-0.034
 two-0.033
 Ben-0.029
 blast-0.029
 sne-0.029
Token his
Token ID465
Feature activation+2.50e-02

Pos logit contributions
 possessions+0.073
 possession+0.072
 angel+0.070
bing+0.070
 bicycle+0.070
Neg logit contributions
 two-0.037
umbs-0.037
ata-0.037
rip-0.036
 roasted-0.035
Token neighborhood
Token ID6232
Feature activation+2.12e-02

Pos logit contributions
 possessions+0.041
 Acc+0.039
ement+0.039
izing+0.039
 mattered+0.039
Neg logit contributions
 a-0.034
 two-0.033
 Ben-0.030
 sne-0.029
 gas-0.028
Token and
Token ID290
Feature activation+2.63e-02

Pos logit contributions
 possessions+0.038
 possession+0.037
 Acc+0.036
+0.036
izing+0.035
Neg logit contributions
 a-0.025
 two-0.024
 Ben-0.021
 sne-0.021
 roast-0.020
Token look
Token ID804
Feature activation+9.66e-02

Pos logit contributions
ement+0.053
izing+0.053
 possessions+0.052
 possession+0.051
+0.049
Neg logit contributions
 a-0.030
 two-0.026
 sne-0.025
 Ben-0.025
 roast-0.024
Token at
Token ID379
Feature activation+2.44e-02

Pos logit contributions
 possessions+0.200
+0.199
 possession+0.194
 Acc+0.192
 mattered+0.191
Neg logit contributions
 a-0.105
 two-0.085
 Ben-0.081
 sne-0.073
 four-0.071
Token all
Token ID477
Feature activation+3.63e-02

Pos logit contributions
 possessions+0.054
+0.052
izing+0.051
 possession+0.051
 Acc+0.051
Neg logit contributions
 a-0.021
 two-0.019
 Ben-0.015
 gas-0.014
 four-0.014
Token the
Token ID262
Feature activation+3.32e-02

Pos logit contributions
+0.089
 possessions+0.088
Really+0.086
ths+0.086
 Acc+0.085
Neg logit contributions
 a-0.025
 two-0.022
 Ben-0.018
 gas-0.014
 water-0.014
Token interesting
Token ID3499
Feature activation+3.10e-02

Pos logit contributions
 possessions+0.050
 possession+0.049
+0.048
 Acc+0.047
izing+0.047
Neg logit contributions
 a-0.048
 two-0.047
 sne-0.042
 Ben-0.042
 blast-0.041
Token things
Token ID1243
Feature activation+3.08e-02

Pos logit contributions
+0.045
 Acc+0.044
ishly+0.042
 possessions+0.042
 possession+0.042
Neg logit contributions
 a-0.049
 two-0.046
 gas-0.042
 Ben-0.041
 sne-0.041
Token play
Token ID711
Feature activation+2.34e-02

Pos logit contributions
 possessions+0.047
 possession+0.045
+0.043
 Acc+0.043
 angel+0.043
Neg logit contributions
 a-0.030
 two-0.029
 Ben-0.026
 blast-0.026
 sne-0.026
Token in
Token ID287
Feature activation+2.04e-02

Pos logit contributions
 possessions+0.044
 possession+0.041
 Acc+0.041
osity+0.039
 mattered+0.039
Neg logit contributions
 a-0.031
 two-0.027
 blast-0.025
 sne-0.025
 Ben-0.025
Token her
Token ID607
Feature activation+2.43e-02

Pos logit contributions
 possessions+0.040
 possession+0.039
+0.038
izing+0.038
 angel+0.037
Neg logit contributions
 a-0.019
 two-0.018
 Ben-0.016
 blast-0.016
 sne-0.016
Token backyard
Token ID24296
Feature activation+2.99e-02

Pos logit contributions
izing+0.044
 possessions+0.044
ement+0.043
osity+0.041
 mattered+0.041
Neg logit contributions
 a-0.034
 two-0.032
 Ben-0.028
 blast-0.027
 lunch-0.026
Token and
Token ID290
Feature activation+2.39e-02

Pos logit contributions
 possessions+0.070
 disl+0.066
 Acc+0.066
 possession+0.065
+0.065
Neg logit contributions
 a-0.031
 two-0.025
 Ben-0.019
 some-0.018
 sne-0.017
Token look
Token ID804
Feature activation+9.65e-02

Pos logit contributions
ement+0.066
izing+0.062
 possessions+0.059
lights+0.058
 possession+0.058
Neg logit contributions
 a-0.022
 sne-0.016
 roast-0.015
 two-0.015
 Ben-0.013
Token at
Token ID379
Feature activation+2.91e-02

Pos logit contributions
+0.232
 possessions+0.229
 mattered+0.227
 disl+0.226
ths+0.226
Neg logit contributions
 a-0.088
 two-0.060
 Ben-0.059
 four-0.045
 finish-0.044
Token the
Token ID262
Feature activation+2.38e-02

Pos logit contributions
+0.074
 Acc+0.071
izing+0.071
 possessions+0.070
osity+0.069
Neg logit contributions
 a-0.027
 two-0.021
 Ben-0.015
 four-0.014
 water-0.014
Token big
Token ID1263
Feature activation+3.85e-02

Pos logit contributions
 possessions+0.036
 possession+0.035
 angel+0.033
izing+0.033
+0.033
Neg logit contributions
 a-0.033
 two-0.032
 Ben-0.030
 blast-0.030
 sne-0.030
Token wall
Token ID3355
Feature activation+3.02e-02

Pos logit contributions
 possessions+0.071
 mattered+0.068
 possession+0.067
 Acc+0.067
izing+0.066
Neg logit contributions
 a-0.049
 two-0.048
 Ben-0.043
 sne-0.040
 gas-0.040
Token that
Token ID326
Feature activation+2.38e-02

Pos logit contributions
 possessions+0.069
 possession+0.066
 Acc+0.065
+0.065
 disl+0.063
Neg logit contributions
 a-0.025
 two-0.022
 Ben-0.017
 sne-0.016
 blast-0.016
Token wander
Token ID27776
Feature activation+1.59e-02

Pos logit contributions
 possessions+0.036
 possession+0.035
 bicycle+0.034
 ornament+0.034
 Acc+0.034
Neg logit contributions
 two-0.029
 a-0.028
rip-0.027
ata-0.027
 Ben-0.026
Token around
Token ID1088
Feature activation+3.56e-02

Pos logit contributions
 Acc+0.026
 possessions+0.025
+0.025
ths+0.024
izing+0.024
Neg logit contributions
 a-0.024
 two-0.022
 Ben-0.019
 gas-0.019
 four-0.019
Token her
Token ID607
Feature activation+1.87e-02

Pos logit contributions
 possessions+0.072
 possession+0.069
 Acc+0.068
izing+0.067
ement+0.066
Neg logit contributions
 a-0.036
 two-0.033
 Ben-0.030
 sne-0.029
 blast-0.028
Token garden
Token ID11376
Feature activation+1.95e-02

Pos logit contributions
 possessions+0.034
ement+0.033
 Acc+0.033
 mattered+0.032
izing+0.032
Neg logit contributions
 a-0.024
 two-0.022
 Ben-0.020
 sne-0.018
 gas-0.018
Token and
Token ID290
Feature activation+2.81e-02

Pos logit contributions
 possessions+0.039
 Acc+0.038
 possession+0.038
+0.037
 disl+0.037
Neg logit contributions
 a-0.021
 two-0.019
 Ben-0.015
 sne-0.015
 four-0.015
Token look
Token ID804
Feature activation+9.65e-02

Pos logit contributions
ement+0.071
izing+0.068
lights+0.064
 possessions+0.064
 possession+0.063
Neg logit contributions
 a-0.029
 two-0.023
 sne-0.022
 roast-0.021
 Ben-0.019
Token at
Token ID379
Feature activation+2.73e-02

Pos logit contributions
+0.229
ths+0.226
 possessions+0.225
 mattered+0.224
 disl+0.224
Neg logit contributions
 a-0.091
 two-0.063
 Ben-0.062
 four-0.046
 finish-0.046
Token all
Token ID477
Feature activation+3.46e-02

Pos logit contributions
+0.070
izing+0.066
 Acc+0.066
 possessions+0.065
bing+0.065
Neg logit contributions
 a-0.025
 two-0.020
 Ben-0.014
 four-0.013
 water-0.013
Token the
Token ID262
Feature activation+3.49e-02

Pos logit contributions
+0.094
Really+0.093
ths+0.091
 instincts+0.090
ishly+0.089
Neg logit contributions
 a-0.023
 two-0.019
 Ben-0.014
 water-0.013
 the-0.011
Token pretty
Token ID2495
Feature activation+2.89e-02

Pos logit contributions
 possession+0.055
+0.055
 possessions+0.054
 Acc+0.054
ths+0.052
Neg logit contributions
 a-0.049
 two-0.048
 sne-0.042
 Ben-0.041
 gas-0.041
Token flowers
Token ID12734
Feature activation+2.31e-02

Pos logit contributions
 Acc+0.058
ishly+0.056
ths+0.055
belt+0.054
+0.054
Neg logit contributions
 a-0.034
 two-0.032
 water-0.030
 Ben-0.026
 bread-0.025
Token and
Token ID290
Feature activation+1.08e-02

Pos logit contributions
 two+0.020
 a+0.020
rip+0.019
 blast+0.019
 roast+0.018
Neg logit contributions
 possessions-0.032
 possession-0.031
 angel-0.030
izing-0.030
 bicycle-0.029
Token proud
Token ID6613
Feature activation-1.30e-02

Pos logit contributions
 possessions+0.017
 possession+0.017
+0.016
izing+0.016
 angel+0.016
Neg logit contributions
 a-0.015
 two-0.014
 Ben-0.013
 sne-0.013
 blast-0.013
Token of
Token ID286
Feature activation-2.22e-05

Pos logit contributions
 a+0.014
 two+0.014
rip+0.013
 blast+0.013
 roast+0.013
Neg logit contributions
 possessions-0.023
 possession-0.023
izing-0.022
 angel-0.022
ement-0.021
Token himself
Token ID2241
Feature activation-1.20e-03

Pos logit contributions
 a+0.000
 two+0.000
 Ben+0.000
 blast+0.000
 sne+0.000
Neg logit contributions
 possessions-0.000
 possession-0.000
-0.000
izing-0.000
 angel-0.000
Token for
Token ID329
Feature activation+4.92e-03

Pos logit contributions
Ar+0.002
 blast+0.002
rip+0.002
 roast+0.002
 GO+0.002
Neg logit contributions
 possessions-0.002
 possession-0.002
 angel-0.002
ement-0.002
bing-0.002
Token accompl
Token ID6424
Feature activation+9.63e-02

Pos logit contributions
 possessions+0.006
 possession+0.006
 angel+0.005
izing+0.005
 Acc+0.005
Neg logit contributions
rip-0.008
umbs-0.008
ata-0.008
 two-0.008
 GO-0.007
Tokenishing
Token ID3929
Feature activation+2.20e-02

Pos logit contributions
 angel+0.126
 Nice+0.118
 Iris+0.112
 Acc+0.112
 Ellen+0.111
Neg logit contributions
 a-0.145
 two-0.144
 padd-0.139
 four-0.133
bang-0.129
Token such
Token ID884
Feature activation+8.34e-03

Pos logit contributions
 possessions+0.041
+0.039
 possession+0.039
 Acc+0.039
 angel+0.039
Neg logit contributions
 a-0.025
 two-0.023
 Ben-0.020
 gas-0.019
 sne-0.019
Token a
Token ID257
Feature activation-6.81e-03

Pos logit contributions
 possessions+0.019
 possession+0.019
izing+0.018
 mattered+0.018
 angel+0.018
Neg logit contributions
 a-0.005
 two-0.005
rip-0.004
 roast-0.004
 blast-0.004
Token difficult
Token ID2408
Feature activation+1.43e-02

Pos logit contributions
 a+0.009
 two+0.009
 blast+0.008
 gas+0.008
 sne+0.008
Neg logit contributions
 possessions-0.011
-0.011
ement-0.011
 possession-0.011
adoes-0.011
Token task
Token ID4876
Feature activation+1.13e-02

Pos logit contributions
 possessions+0.025
 possession+0.024
 Acc+0.024
+0.024
izing+0.024
Neg logit contributions
 a-0.017
 two-0.016
 blast-0.014
 sne-0.014
 Ben-0.014
Token Lily
Token ID20037
Feature activation+1.09e-02

Pos logit contributions
 a+0.014
 two+0.013
 Ben+0.012
 blast+0.012
 sne+0.012
Neg logit contributions
 possessions-0.026
 possession-0.025
-0.024
izing-0.024
 Acc-0.024
Token.
Token ID13
Feature activation+2.12e-02

Pos logit contributions
 possessions+0.025
izing+0.024
ement+0.024
+0.024
ths+0.024
Neg logit contributions
 a-0.009
 two-0.007
 Ben-0.006
 sne-0.006
 roast-0.005
Token She
Token ID1375
Feature activation+1.97e-02

Pos logit contributions
 possessions+0.038
 possession+0.034
izing+0.034
ths+0.034
ized+0.032
Neg logit contributions
 a-0.031
 Ben-0.030
 two-0.029
 roast-0.025
 sne-0.025
Token loved
Token ID6151
Feature activation+6.57e-03

Pos logit contributions
 possessions+0.027
ement+0.027
izing+0.026
 possession+0.025
ths+0.025
Neg logit contributions
 a-0.035
 two-0.032
 sne-0.031
 Ben-0.030
 roasted-0.029
Token to
Token ID284
Feature activation+2.55e-02

Pos logit contributions
 possessions+0.013
 possession+0.013
+0.013
izing+0.013
ement+0.013
Neg logit contributions
 a-0.006
 two-0.006
 sne-0.005
 Ben-0.005
 blast-0.005
Token look
Token ID804
Feature activation+9.56e-02

Pos logit contributions
 possessions+0.046
 possession+0.045
izing+0.043
+0.043
 Acc+0.042
Neg logit contributions
 a-0.029
 two-0.028
 Ben-0.025
 blast-0.025
 sne-0.025
Token at
Token ID379
Feature activation+3.23e-02

Pos logit contributions
 possessions+0.197
+0.196
 Acc+0.196
 mattered+0.194
 possession+0.194
Neg logit contributions
 a-0.105
 two-0.088
 Ben-0.082
 four-0.070
 sne-0.068
Token the
Token ID262
Feature activation+2.94e-02

Pos logit contributions
+0.082
izing+0.081
 Acc+0.080
bing+0.079
Really+0.078
Neg logit contributions
 a-0.027
 two-0.022
 Ben-0.015
 four-0.013
 gas-0.012
Token stars
Token ID5788
Feature activation+3.65e-02

Pos logit contributions
 possessions+0.046
 possession+0.045
 Acc+0.043
+0.043
izing+0.042
Neg logit contributions
 a-0.041
 two-0.040
 Ben-0.036
 sne-0.035
 gas-0.035
Token in
Token ID287
Feature activation+2.88e-02

Pos logit contributions
 possessions+0.073
 possession+0.070
 Acc+0.068
+0.067
 mattered+0.066
Neg logit contributions
 a-0.037
 two-0.035
 Ben-0.030
 sne-0.030
 blast-0.029
Token the
Token ID262
Feature activation+2.32e-02

Pos logit contributions
 possessions+0.062
 possession+0.059
 Acc+0.058
izing+0.058
ths+0.058
Neg logit contributions
 a-0.030
 two-0.027
 Ben-0.024
 blast-0.020
 four-0.020
Token.
Token ID13
Feature activation+9.12e-03

Pos logit contributions
 a+0.010
 two+0.010
 Ben+0.008
 sne+0.008
 blast+0.008
Neg logit contributions
 possessions-0.023
 possession-0.023
izing-0.022
-0.022
 angel-0.022
Token Tim
Token ID5045
Feature activation+2.74e-02

Pos logit contributions
 possessions+0.018
 possession+0.017
izing+0.016
 angel+0.016
+0.016
Neg logit contributions
 a-0.010
 two-0.009
 Ben-0.008
 blast-0.008
 roast-0.008
Tokenmy
Token ID1820
Feature activation+1.29e-02

Pos logit contributions
 possessions+0.035
 possession+0.033
izing+0.032
+0.032
 angel+0.031
Neg logit contributions
 a-0.047
 two-0.045
 sne-0.042
 Ben-0.042
rip-0.041
Token loved
Token ID6151
Feature activation+4.60e-03

Pos logit contributions
 possessions+0.017
 possession+0.016
izing+0.016
ement+0.015
 angel+0.015
Neg logit contributions
 a-0.022
 two-0.021
 sne-0.020
 Ben-0.020
 roasted-0.019
Token to
Token ID284
Feature activation+2.70e-02

Pos logit contributions
 possessions+0.009
 possession+0.009
+0.008
izing+0.008
 Acc+0.008
Neg logit contributions
 a-0.005
 two-0.004
 Ben-0.004
 sne-0.004
 blast-0.004
Token look
Token ID804
Feature activation+9.53e-02

Pos logit contributions
 possessions+0.049
 possession+0.047
izing+0.045
+0.045
 angel+0.044
Neg logit contributions
 a-0.032
 two-0.030
 sne-0.027
 Ben-0.027
 blast-0.027
Token up
Token ID510
Feature activation+5.27e-02

Pos logit contributions
+0.189
 possessions+0.188
 possession+0.184
 Acc+0.184
 mattered+0.182
Neg logit contributions
 a-0.109
 two-0.094
 Ben-0.088
 four-0.077
 sne-0.075
Token at
Token ID379
Feature activation+2.23e-02

Pos logit contributions
 possessions+0.096
 possession+0.092
 Acc+0.090
+0.090
 mattered+0.088
Neg logit contributions
 a-0.064
 two-0.060
 Ben-0.053
 sne-0.051
 four-0.050
Token the
Token ID262
Feature activation+1.67e-02

Pos logit contributions
 possessions+0.050
+0.049
izing+0.049
 possession+0.048
 Acc+0.048
Neg logit contributions
 a-0.017
 two-0.016
 Ben-0.013
 gas-0.011
 four-0.011
Token sky
Token ID6766
Feature activation+2.21e-02

Pos logit contributions
 possessions+0.025
 angel+0.024
 bicycle+0.023
 ornament+0.023
 possession+0.023
Neg logit contributions
 a-0.023
 two-0.022
 blast-0.022
rip-0.022
ata-0.022
Token and
Token ID290
Feature activation+2.73e-02

Pos logit contributions
 possessions+0.041
 possession+0.040
izing+0.038
 angel+0.038
+0.038
Neg logit contributions
 a-0.023
 two-0.022
 blast-0.020
 Ben-0.020
 sne-0.020
Token.
Token ID13
Feature activation+9.12e-03

Pos logit contributions
 a+0.010
 two+0.010
 Ben+0.008
 sne+0.008
 blast+0.008
Neg logit contributions
 possessions-0.023
 possession-0.023
izing-0.022
-0.022
 angel-0.022
Token Tim
Token ID5045
Feature activation+2.74e-02

Pos logit contributions
 possessions+0.018
 possession+0.017
izing+0.016
 angel+0.016
+0.016
Neg logit contributions
 a-0.010
 two-0.009
 Ben-0.008
 blast-0.008
 roast-0.008
Tokenmy
Token ID1820
Feature activation+1.29e-02

Pos logit contributions
 possessions+0.035
 possession+0.033
izing+0.032
+0.032
 angel+0.031
Neg logit contributions
 a-0.047
 two-0.045
 sne-0.042
 Ben-0.042
rip-0.041
Token loved
Token ID6151
Feature activation+4.60e-03

Pos logit contributions
 possessions+0.017
 possession+0.016
izing+0.016
ement+0.015
 angel+0.015
Neg logit contributions
 a-0.022
 two-0.021
 sne-0.020
 Ben-0.020
 roasted-0.019
Token to
Token ID284
Feature activation+2.70e-02

Pos logit contributions
 possessions+0.009
 possession+0.009
+0.008
izing+0.008
 Acc+0.008
Neg logit contributions
 a-0.005
 two-0.004
 Ben-0.004
 sne-0.004
 blast-0.004
Token look
Token ID804
Feature activation+9.53e-02

Pos logit contributions
 possessions+0.049
 possession+0.047
izing+0.045
+0.045
 angel+0.044
Neg logit contributions
 a-0.032
 two-0.030
 sne-0.027
 Ben-0.027
 blast-0.027
Token up
Token ID510
Feature activation+5.27e-02

Pos logit contributions
+0.189
 possessions+0.188
 possession+0.184
 Acc+0.184
 mattered+0.182
Neg logit contributions
 a-0.109
 two-0.094
 Ben-0.088
 four-0.077
 sne-0.075
Token at
Token ID379
Feature activation+2.23e-02

Pos logit contributions
 possessions+0.096
 possession+0.092
 Acc+0.090
+0.090
 mattered+0.088
Neg logit contributions
 a-0.064
 two-0.060
 Ben-0.053
 sne-0.051
 four-0.050
Token the
Token ID262
Feature activation+1.67e-02

Pos logit contributions
 possessions+0.050
+0.049
izing+0.049
 possession+0.048
 Acc+0.048
Neg logit contributions
 a-0.017
 two-0.016
 Ben-0.013
 gas-0.011
 four-0.011
Token sky
Token ID6766
Feature activation+2.21e-02

Pos logit contributions
 possessions+0.025
 angel+0.024
 bicycle+0.023
 ornament+0.023
 possession+0.023
Neg logit contributions
 a-0.023
 two-0.022
 blast-0.022
rip-0.022
ata-0.022
Token and
Token ID290
Feature activation+2.73e-02

Pos logit contributions
 possessions+0.041
 possession+0.040
izing+0.038
 angel+0.038
+0.038
Neg logit contributions
 a-0.023
 two-0.022
 blast-0.020
 Ben-0.020
 sne-0.020
Token Lily
Token ID20037
Feature activation+1.02e-02

Pos logit contributions
 a+0.016
 two+0.015
 Ben+0.013
 blast+0.013
 sne+0.013
Neg logit contributions
 possessions-0.030
 possession-0.029
izing-0.028
-0.028
 Acc-0.027
Token.
Token ID13
Feature activation+2.17e-02

Pos logit contributions
 possessions+0.024
izing+0.023
ement+0.022
ths+0.022
+0.022
Neg logit contributions
 a-0.008
 two-0.007
 Ben-0.006
 sne-0.006
 roast-0.005
Token She
Token ID1375
Feature activation+1.82e-02

Pos logit contributions
 possessions+0.035
 possession+0.032
izing+0.031
ths+0.031
 angel+0.030
Neg logit contributions
 a-0.034
 two-0.032
 Ben-0.032
 roast-0.028
 sne-0.028
Token loved
Token ID6151
Feature activation+6.19e-03

Pos logit contributions
 possessions+0.024
ement+0.024
izing+0.023
 possession+0.023
ths+0.022
Neg logit contributions
 a-0.033
 two-0.030
 sne-0.029
 Ben-0.028
 blast-0.028
Token to
Token ID284
Feature activation+2.54e-02

Pos logit contributions
 possessions+0.012
 possession+0.012
+0.012
izing+0.012
ement+0.012
Neg logit contributions
 a-0.006
 two-0.005
 Ben-0.005
 sne-0.005
 blast-0.005
Token look
Token ID804
Feature activation+9.53e-02

Pos logit contributions
 possessions+0.043
 possession+0.042
 Acc+0.040
+0.040
 angel+0.040
Neg logit contributions
 a-0.030
 two-0.029
 Ben-0.027
rip-0.026
 blast-0.026
Token at
Token ID379
Feature activation+3.23e-02

Pos logit contributions
 possessions+0.193
+0.193
 Acc+0.191
 possession+0.190
 mattered+0.189
Neg logit contributions
 a-0.106
 two-0.090
 Ben-0.083
 four-0.072
 sne-0.071
Token the
Token ID262
Feature activation+2.73e-02

Pos logit contributions
+0.081
izing+0.079
 Acc+0.079
bing+0.077
osity+0.076
Neg logit contributions
 a-0.028
 two-0.023
 Ben-0.016
 four-0.014
 gas-0.013
Token stars
Token ID5788
Feature activation+3.52e-02

Pos logit contributions
 possessions+0.042
 possession+0.040
+0.039
 Acc+0.038
izing+0.038
Neg logit contributions
 a-0.039
 two-0.037
 Ben-0.034
 sne-0.034
 blast-0.034
Token at
Token ID379
Feature activation+2.17e-02

Pos logit contributions
 possessions+0.068
 possession+0.066
 Acc+0.064
+0.063
izing+0.062
Neg logit contributions
 a-0.036
 two-0.034
 Ben-0.030
 sne-0.030
 blast-0.029
Token night
Token ID1755
Feature activation-5.85e-03

Pos logit contributions
 possessions+0.046
izing+0.044
 possession+0.044
+0.043
 Acc+0.043
Neg logit contributions
 a-0.020
 two-0.018
 Ben-0.016
 blast-0.014
 four-0.014
Token walk
Token ID2513
Feature activation+2.72e-02

Pos logit contributions
 possessions+0.035
 bicycle+0.034
 possession+0.034
 ornament+0.033
 Acc+0.033
Neg logit contributions
 two-0.028
 a-0.027
rip-0.026
ata-0.026
 blah-0.025
Token in
Token ID287
Feature activation+1.80e-02

Pos logit contributions
 Acc+0.059
 possessions+0.058
izing+0.056
ement+0.056
osity+0.056
Neg logit contributions
 a-0.031
 two-0.024
 Ben-0.021
 sne-0.020
 four-0.020
Token the
Token ID262
Feature activation+2.33e-02

Pos logit contributions
 possessions+0.035
 possession+0.035
 angel+0.033
 ornament+0.033
 bicycle+0.033
Neg logit contributions
rip-0.015
 two-0.015
 a-0.015
ata-0.014
 roasted-0.014
Token garden
Token ID11376
Feature activation+1.75e-02

Pos logit contributions
 possessions+0.038
 possession+0.037
+0.035
 angel+0.035
izing+0.035
Neg logit contributions
 a-0.029
 two-0.028
 Ben-0.026
 blast-0.026
 sne-0.026
Token and
Token ID290
Feature activation+2.13e-02

Pos logit contributions
 possessions+0.034
 possession+0.032
 Acc+0.031
+0.031
 angel+0.031
Neg logit contributions
 a-0.020
 two-0.018
 Ben-0.016
 sne-0.016
 roast-0.015
Token look
Token ID804
Feature activation+9.52e-02

Pos logit contributions
ement+0.054
izing+0.053
 possessions+0.051
lights+0.049
 possession+0.049
Neg logit contributions
 a-0.020
 two-0.016
 sne-0.015
 roast-0.014
 Ben-0.014
Token at
Token ID379
Feature activation+3.01e-02

Pos logit contributions
+0.229
 possessions+0.228
ths+0.228
 disl+0.224
 mattered+0.224
Neg logit contributions
 a-0.085
 Ben-0.059
 two-0.058
 four-0.041
 many-0.040
Token all
Token ID477
Feature activation+3.45e-02

Pos logit contributions
+0.079
izing+0.075
 Acc+0.075
 possessions+0.074
bing+0.074
Neg logit contributions
 a-0.026
 two-0.020
 Ben-0.014
 four-0.012
 some-0.012
Token the
Token ID262
Feature activation+3.71e-02

Pos logit contributions
+0.092
Really+0.091
 instincts+0.090
ths+0.090
 possessions+0.090
Neg logit contributions
 a-0.023
 two-0.018
 Ben-0.014
 water-0.013
 the-0.010
Token pretty
Token ID2495
Feature activation+2.86e-02

Pos logit contributions
 possession+0.059
 Acc+0.058
+0.058
 possessions+0.058
ths+0.055
Neg logit contributions
 a-0.052
 two-0.052
 sne-0.044
 Ben-0.044
 gas-0.043
Token flowers
Token ID12734
Feature activation+2.34e-02

Pos logit contributions
 Acc+0.060
ishly+0.057
ths+0.056
belt+0.055
 possession+0.054
Neg logit contributions
 a-0.033
 two-0.031
 water-0.029
 Ben-0.025
 bread-0.025
Token things
Token ID1243
Feature activation+4.11e-02

Pos logit contributions
 possessions+0.065
 possession+0.061
 bicycle+0.056
 ornament+0.056
 angel+0.055
Neg logit contributions
 blast-0.049
ata-0.048
 a-0.047
 two-0.047
rip-0.046
Token in
Token ID287
Feature activation+3.22e-02

Pos logit contributions
 possessions+0.077
 possession+0.074
 Acc+0.074
+0.073
 angel+0.072
Neg logit contributions
 a-0.046
 two-0.043
 Ben-0.037
 sne-0.036
 roast-0.036
Token his
Token ID465
Feature activation+2.48e-02

Pos logit contributions
 possessions+0.051
 possession+0.048
 bicycle+0.045
 ornament+0.045
 angel+0.044
Neg logit contributions
ata-0.043
rip-0.042
umbs-0.039
 roasted-0.039
 blah-0.039
Token hands
Token ID2832
Feature activation+3.93e-02

Pos logit contributions
izing+0.045
 possessions+0.044
ement+0.043
 angel+0.043
 Acc+0.043
Neg logit contributions
 a-0.030
 two-0.028
 Ben-0.025
 sne-0.024
 blast-0.024
Token and
Token ID290
Feature activation+2.54e-02

Pos logit contributions
 possessions+0.060
 possession+0.057
bing+0.056
izing+0.056
 ornament+0.055
Neg logit contributions
 blast-0.056
 GO-0.052
 Run-0.049
 blah-0.049
 DO-0.048
Token look
Token ID804
Feature activation+9.51e-02

Pos logit contributions
ement+0.053
 possessions+0.053
 possession+0.052
izing+0.052
 Acc+0.050
Neg logit contributions
 a-0.024
 two-0.022
 sne-0.020
 roast-0.020
 Ben-0.019
Token at
Token ID379
Feature activation+3.09e-02

Pos logit contributions
+0.200
 possessions+0.199
 possession+0.197
 Acc+0.196
ths+0.194
Neg logit contributions
 a-0.097
 two-0.080
 Ben-0.075
 sne-0.064
 four-0.064
Token them
Token ID606
Feature activation+3.60e-02

Pos logit contributions
+0.071
 Acc+0.069
izing+0.068
 possessions+0.068
ths+0.066
Neg logit contributions
 a-0.028
 two-0.024
 Ben-0.019
 gas-0.018
 four-0.017
Token.
Token ID13
Feature activation+1.47e-02

Pos logit contributions
 possessions+0.057
 possession+0.054
izing+0.052
+0.052
 angel+0.052
Neg logit contributions
 a-0.049
 two-0.047
 Ben-0.043
 blast-0.043
 sne-0.043
Token One
Token ID1881
Feature activation-1.61e-02

Pos logit contributions
 possessions+0.028
 possession+0.027
izing+0.026
+0.026
 angel+0.026
Neg logit contributions
 a-0.016
 two-0.015
 Ben-0.013
 blast-0.013
 sne-0.013
Token day
Token ID1110
Feature activation+1.64e-02

Pos logit contributions
 a+0.025
 two+0.025
 blast+0.023
 Ben+0.023
 GO+0.023
Neg logit contributions
 possessions-0.021
 possession-0.021
 angel-0.019
 bicycle-0.019
 mattered-0.019
Token up
Token ID510
Feature activation+5.43e-02

Pos logit contributions
 possessions+0.183
+0.183
 Acc+0.179
 possession+0.179
 mattered+0.177
Neg logit contributions
 a-0.104
 two-0.087
 Ben-0.081
 sne-0.072
 four-0.072
Token at
Token ID379
Feature activation+2.37e-02

Pos logit contributions
 possessions+0.098
 possession+0.094
 Acc+0.093
+0.092
 mattered+0.090
Neg logit contributions
 a-0.066
 two-0.062
 Ben-0.055
 sne-0.053
 four-0.052
Token the
Token ID262
Feature activation+1.67e-02

Pos logit contributions
+0.052
izing+0.051
 possessions+0.050
ement+0.049
 Acc+0.049
Neg logit contributions
 a-0.024
 two-0.021
 Ben-0.018
 four-0.016
 gas-0.014
Token sky
Token ID6766
Feature activation+2.04e-02

Pos logit contributions
 possessions+0.025
 possession+0.024
 angel+0.024
 ornament+0.023
 bicycle+0.023
Neg logit contributions
 a-0.023
 two-0.022
 blast-0.021
 Ben-0.021
rip-0.021
Token and
Token ID290
Feature activation+2.85e-02

Pos logit contributions
 possessions+0.040
 possession+0.039
 Acc+0.037
+0.037
izing+0.036
Neg logit contributions
 a-0.022
 two-0.020
 Ben-0.018
 sne-0.017
 blast-0.017
Token look
Token ID804
Feature activation+9.48e-02

Pos logit contributions
 possessions+0.066
ement+0.065
 possession+0.064
izing+0.064
ating+0.062
Neg logit contributions
 a-0.026
 two-0.020
 sne-0.019
 Ben-0.018
 roast-0.017
Token at
Token ID379
Feature activation+3.14e-02

Pos logit contributions
 possessions+0.206
+0.201
 possession+0.199
ths+0.196
 mattered+0.193
Neg logit contributions
 a-0.104
 two-0.082
 Ben-0.078
 sne-0.069
 four-0.067
Token the
Token ID262
Feature activation+1.72e-02

Pos logit contributions
+0.079
 possessions+0.077
izing+0.074
 Acc+0.074
bing+0.073
Neg logit contributions
 a-0.026
 two-0.021
 Ben-0.015
 four-0.014
 water-0.011
Token clouds
Token ID15114
Feature activation+1.63e-02

Pos logit contributions
 angel+0.024
 possessions+0.024
 ornament+0.023
izing+0.023
 possession+0.023
Neg logit contributions
 a-0.023
 Ben-0.023
 two-0.023
 blast-0.023
 Theo-0.022
Token.
Token ID13
Feature activation+1.08e-02

Pos logit contributions
 possessions+0.034
 possession+0.033
 Acc+0.032
+0.031
 mattered+0.031
Neg logit contributions
 a-0.015
 two-0.014
 sne-0.012
 Ben-0.012
 blast-0.012
Token One
Token ID1881
Feature activation-2.27e-02

Pos logit contributions
 possessions+0.029
izing+0.027
 possession+0.027
ths+0.027
ized+0.026
Neg logit contributions
 a-0.008
 two-0.006
 Ben-0.006
-0.004
 Tom-0.004
Token thought
Token ID1807
Feature activation+3.07e-02

Pos logit contributions
 possessions+0.026
 possession+0.025
izing+0.024
 Acc+0.024
 angel+0.024
Neg logit contributions
 a-0.022
 two-0.021
 sne-0.019
 Ben-0.019
 blast-0.019
Token they
Token ID484
Feature activation+1.62e-02

Pos logit contributions
 possessions+0.056
 possession+0.054
+0.053
izing+0.052
 Acc+0.052
Neg logit contributions
 a-0.038
 two-0.035
 Ben-0.032
 sne-0.031
 blast-0.031
Token looked
Token ID3114
Feature activation+6.54e-02

Pos logit contributions
 possessions+0.029
 possession+0.028
izing+0.028
 Acc+0.027
+0.027
Neg logit contributions
 a-0.021
 two-0.019
 sne-0.018
 roasted-0.017
 blast-0.016
Token like
Token ID588
Feature activation-2.53e-02

Pos logit contributions
 possessions+0.130
+0.125
 possession+0.125
 Acc+0.122
 angel+0.120
Neg logit contributions
 a-0.074
 two-0.063
 Ben-0.055
 sne-0.055
 blast-0.055
Token ac
Token ID936
Feature activation+5.73e-04

Pos logit contributions
 a+0.023
 two+0.022
 blast+0.019
 Ben+0.019
 sne+0.019
Neg logit contributions
 possessions-0.051
 possession-0.049
izing-0.047
 angel-0.047
-0.047
Tokenrob
Token ID22609
Feature activation+9.47e-02

Pos logit contributions
 possessions+0.001
 Acc+0.001
 possession+0.001
+0.001
 angel+0.001
Neg logit contributions
 a-0.001
 two-0.001
ata-0.001
 Ben-0.001
rip-0.001
Tokenats
Token ID1381
Feature activation+1.03e-02

Pos logit contributions
 Acc+0.164
 angel+0.164
+0.162
 bicycle+0.160
Really+0.159
Neg logit contributions
 a-0.105
 two-0.104
ata-0.098
 sne-0.095
rip-0.095
Token.
Token ID13
Feature activation+2.52e-02

Pos logit contributions
 possessions+0.018
 possession+0.017
 angel+0.017
izing+0.017
 mattered+0.017
Neg logit contributions
 a-0.011
 two-0.011
 blast-0.010
 roast-0.010
rip-0.010
Token She
Token ID1375
Feature activation+9.14e-03

Pos logit contributions
 possessions+0.036
 possession+0.034
izing+0.032
 angel+0.032
 mattered+0.031
Neg logit contributions
 a-0.041
 two-0.039
 Ben-0.037
 four-0.035
 sne-0.035
Token enjoyed
Token ID8359
Feature activation-5.21e-03

Pos logit contributions
 possessions+0.015
 possession+0.015
izing+0.014
+0.014
 Acc+0.014
Neg logit contributions
 a-0.012
 two-0.011
 Ben-0.010
 blast-0.010
 sne-0.010
Token the
Token ID262
Feature activation+6.63e-03

Pos logit contributions
 a+0.008
 two+0.008
 Ben+0.007
 sne+0.007
 blast+0.007
Neg logit contributions
 possessions-0.008
 possession-0.007
izing-0.007
 angel-0.007
-0.007
Token Lily
Token ID20037
Feature activation+9.27e-03

Pos logit contributions
 a+0.015
 two+0.015
 Ben+0.013
 blast+0.013
 sne+0.013
Neg logit contributions
 possessions-0.029
 possession-0.028
izing-0.027
-0.027
 Acc-0.027
Token.
Token ID13
Feature activation+1.94e-02

Pos logit contributions
 possessions+0.022
izing+0.021
ement+0.021
ths+0.021
+0.021
Neg logit contributions
 a-0.008
 two-0.006
 Ben-0.005
 sne-0.005
 roasted-0.004
Token She
Token ID1375
Feature activation+1.69e-02

Pos logit contributions
 possessions+0.031
 possession+0.029
izing+0.028
ths+0.028
 angel+0.027
Neg logit contributions
 a-0.029
 two-0.028
 Ben-0.027
 blast-0.025
 roast-0.024
Token loved
Token ID6151
Feature activation+6.31e-03

Pos logit contributions
 possessions+0.022
ement+0.021
 possession+0.021
izing+0.021
+0.020
Neg logit contributions
 a-0.031
 two-0.028
 sne-0.027
 blast-0.026
 Ben-0.026
Token to
Token ID284
Feature activation+2.58e-02

Pos logit contributions
 possessions+0.013
 possession+0.012
+0.012
izing+0.012
ement+0.012
Neg logit contributions
 a-0.006
 two-0.006
 Ben-0.005
 sne-0.005
 blast-0.005
Token look
Token ID804
Feature activation+9.47e-02

Pos logit contributions
 possessions+0.046
 possession+0.045
izing+0.043
+0.043
 Acc+0.042
Neg logit contributions
 a-0.030
 two-0.029
 Ben-0.026
 blast-0.026
 sne-0.026
Token up
Token ID510
Feature activation+5.16e-02

Pos logit contributions
 possessions+0.192
+0.192
 Acc+0.192
 mattered+0.190
 possession+0.189
Neg logit contributions
 a-0.106
 two-0.090
 Ben-0.084
 four-0.073
 sne-0.070
Token at
Token ID379
Feature activation+2.39e-02

Pos logit contributions
 possessions+0.094
 possession+0.090
 Acc+0.089
+0.088
 mattered+0.086
Neg logit contributions
 a-0.062
 two-0.058
 Ben-0.051
 sne-0.050
 four-0.049
Token the
Token ID262
Feature activation+1.73e-02

Pos logit contributions
izing+0.055
+0.054
 possessions+0.053
 Acc+0.053
bing+0.053
Neg logit contributions
 a-0.022
 two-0.019
 Ben-0.015
 four-0.013
 gas-0.012
Token sky
Token ID6766
Feature activation+2.29e-02

Pos logit contributions
 possessions+0.026
 angel+0.025
 possession+0.024
 ornament+0.024
 bicycle+0.024
Neg logit contributions
 a-0.024
 two-0.023
 blast-0.022
 Ben-0.022
rip-0.022
Token and
Token ID290
Feature activation+2.94e-02

Pos logit contributions
 possessions+0.047
 possession+0.045
 Acc+0.044
+0.044
 mattered+0.043
Neg logit contributions
 a-0.023
 two-0.021
 Ben-0.018
 sne-0.018
 roast-0.017
Token girl
Token ID2576
Feature activation+5.42e-03

Pos logit contributions
 a+0.009
 two+0.009
 Ben+0.008
 sne+0.008
 blast+0.008
Neg logit contributions
 possessions-0.009
 possession-0.009
izing-0.008
-0.008
 Acc-0.008
Token.
Token ID13
Feature activation+1.76e-02

Pos logit contributions
 possessions+0.012
izing+0.011
ement+0.011
 possession+0.011
 mattered+0.011
Neg logit contributions
 a-0.005
 two-0.005
 sne-0.004
 Ben-0.004
 roast-0.004
Token She
Token ID1375
Feature activation+1.84e-02

Pos logit contributions
 possessions+0.030
izing+0.027
ths+0.027
 possession+0.026
ized+0.026
Neg logit contributions
 a-0.028
 Ben-0.026
 two-0.025
 roast-0.022
 four-0.022
Token liked
Token ID8288
Feature activation-8.16e-03

Pos logit contributions
 possessions+0.023
 possession+0.021
izing+0.021
ement+0.021
ths+0.020
Neg logit contributions
 a-0.034
 two-0.032
 sne-0.030
 Ben-0.030
 blast-0.029
Token to
Token ID284
Feature activation+2.52e-02

Pos logit contributions
 a+0.008
 two+0.007
 Ben+0.006
 sne+0.006
 blast+0.006
Neg logit contributions
 possessions-0.017
 possession-0.016
-0.015
izing-0.015
 Acc-0.015
Token look
Token ID804
Feature activation+9.46e-02

Pos logit contributions
 possessions+0.039
 possession+0.038
 bicycle+0.037
 ornament+0.037
 Acc+0.037
Neg logit contributions
 two-0.031
 a-0.031
rip-0.029
ata-0.029
 Ben-0.029
Token up
Token ID510
Feature activation+5.50e-02

Pos logit contributions
+0.203
 possessions+0.201
 Acc+0.199
 mattered+0.198
 possession+0.197
Neg logit contributions
 a-0.100
 two-0.081
 Ben-0.075
 sne-0.065
 four-0.063
Token into
Token ID656
Feature activation+3.23e-02

Pos logit contributions
 possessions+0.106
 Acc+0.103
 possession+0.103
+0.101
 mattered+0.100
Neg logit contributions
 a-0.065
 two-0.059
 Ben-0.051
 sne-0.048
 four-0.048
Token the
Token ID262
Feature activation+2.91e-02

Pos logit contributions
 possessions+0.072
+0.070
 Acc+0.070
 possession+0.070
izing+0.069
Neg logit contributions
 a-0.027
 two-0.024
 Ben-0.020
 sne-0.018
 gas-0.017
Token night
Token ID1755
Feature activation-1.60e-03

Pos logit contributions
 possessions+0.047
 possession+0.045
 mattered+0.044
+0.044
izing+0.044
Neg logit contributions
 a-0.041
 two-0.040
 gas-0.035
 Ben-0.035
 sne-0.034
Token sky
Token ID6766
Feature activation+1.62e-02

Pos logit contributions
 a+0.002
 two+0.002
 blast+0.002
 Ben+0.002
 sne+0.002
Neg logit contributions
 possessions-0.003
 possession-0.003
izing-0.003
-0.003
 angel-0.003
Token wander
Token ID27776
Feature activation+1.45e-02

Pos logit contributions
 possessions+0.032
 possession+0.030
 bicycle+0.030
 Acc+0.030
 ornament+0.030
Neg logit contributions
 two-0.024
 a-0.024
rip-0.022
ata-0.022
 Ben-0.022
Token around
Token ID1088
Feature activation+3.47e-02

Pos logit contributions
 Acc+0.023
+0.022
 possessions+0.022
ths+0.021
ement+0.021
Neg logit contributions
 a-0.023
 two-0.021
 Ben-0.019
 gas-0.018
 four-0.018
Token her
Token ID607
Feature activation+1.77e-02

Pos logit contributions
 possessions+0.069
 possession+0.066
 Acc+0.065
ement+0.064
 mattered+0.064
Neg logit contributions
 a-0.037
 two-0.034
 Ben-0.030
 sne-0.029
 blast-0.028
Token backyard
Token ID24296
Feature activation+2.34e-02

Pos logit contributions
 possessions+0.034
 Acc+0.033
ement+0.033
 mattered+0.032
izing+0.032
Neg logit contributions
 a-0.022
 two-0.019
 Ben-0.018
 sne-0.016
 four-0.015
Token and
Token ID290
Feature activation+2.57e-02

Pos logit contributions
 possessions+0.049
 disl+0.048
 Acc+0.047
 possession+0.046
+0.046
Neg logit contributions
 a-0.026
 two-0.024
 four-0.018
 Ben-0.018
 sne-0.017
Token look
Token ID804
Feature activation+9.44e-02

Pos logit contributions
ement+0.067
izing+0.064
lights+0.061
 possessions+0.060
ined+0.059
Neg logit contributions
 a-0.026
 two-0.020
 sne-0.019
 roast-0.017
 Ben-0.017
Token for
Token ID329
Feature activation+1.91e-02

Pos logit contributions
+0.223
 disl+0.220
 possessions+0.219
 mattered+0.218
ths+0.218
Neg logit contributions
 a-0.090
 two-0.062
 Ben-0.059
 four-0.047
 finish-0.046
Token animals
Token ID4695
Feature activation+2.27e-02

Pos logit contributions
+0.030
 possessions+0.029
 Acc+0.029
 possession+0.029
izing+0.029
Neg logit contributions
 a-0.029
 two-0.026
 sne-0.023
 Ben-0.023
 gas-0.023
Token.
Token ID13
Feature activation+1.97e-02

Pos logit contributions
 possessions+0.049
 Acc+0.048
 possession+0.047
+0.047
izing+0.047
Neg logit contributions
 a-0.019
 two-0.017
 Ben-0.015
 sne-0.014
 gas-0.013
Token One
Token ID1881
Feature activation-2.02e-02

Pos logit contributions
 possessions+0.048
izing+0.046
 possession+0.045
ths+0.045
ized+0.044
Neg logit contributions
 a-0.017
 Ben-0.015
 two-0.015
 four-0.011
 Tom-0.011
Token day
Token ID1110
Feature activation+1.95e-02

Pos logit contributions
 a+0.028
 two+0.026
 Ben+0.023
 sne+0.023
 roast+0.023
Neg logit contributions
 possessions-0.035
izing-0.033
-0.032
ement-0.032
 possession-0.032
Token outside
Token ID2354
Feature activation+2.83e-02

Pos logit contributions
 possessions+0.039
 possession+0.037
 Acc+0.036
 mattered+0.035
ement+0.035
Neg logit contributions
 a-0.029
 two-0.026
 blast-0.024
 Ben-0.024
 sne-0.024
Token in
Token ID287
Feature activation+2.02e-02

Pos logit contributions
 possessions+0.068
 disl+0.064
 possession+0.063
osity+0.062
 Acc+0.062
Neg logit contributions
 a-0.029
 two-0.024
 Ben-0.019
 sne-0.018
 four-0.018
Token the
Token ID262
Feature activation+1.89e-02

Pos logit contributions
 possessions+0.050
ement+0.047
izing+0.047
 mattered+0.046
 possession+0.046
Neg logit contributions
 a-0.017
 two-0.014
 Ben-0.012
 blast-0.011
 sne-0.009
Token grass
Token ID8701
Feature activation+1.25e-02

Pos logit contributions
 possessions+0.034
 possession+0.033
izing+0.032
+0.031
 Acc+0.031
Neg logit contributions
 a-0.022
 two-0.021
 Ben-0.019
 blast-0.019
 sne-0.019
Token and
Token ID290
Feature activation+2.09e-02

Pos logit contributions
 possessions+0.021
 possession+0.021
 angel+0.020
izing+0.020
 ornament+0.020
Neg logit contributions
 blast-0.013
 a-0.013
 two-0.013
 GO-0.013
 blah-0.013
Token look
Token ID804
Feature activation+9.43e-02

Pos logit contributions
ement+0.054
 possessions+0.051
izing+0.051
 possession+0.050
ths+0.048
Neg logit contributions
 a-0.019
 two-0.015
 sne-0.015
 Ben-0.014
 roast-0.014
Token for
Token ID329
Feature activation+2.10e-02

Pos logit contributions
+0.228
 possessions+0.224
ths+0.223
 mattered+0.221
 Acc+0.220
Neg logit contributions
 a-0.086
 Ben-0.061
 two-0.060
 four-0.044
 many-0.043
Token bugs
Token ID11316
Feature activation+2.44e-02

Pos logit contributions
+0.036
 possessions+0.035
 Acc+0.035
 possession+0.034
izing+0.034
Neg logit contributions
 a-0.030
 two-0.027
 Ben-0.024
 sne-0.024
 four-0.023
Token.
Token ID13
Feature activation+2.59e-02

Pos logit contributions
 possessions+0.050
 possession+0.049
 Acc+0.047
+0.047
izing+0.047
Neg logit contributions
 a-0.022
 two-0.021
 Ben-0.018
 sne-0.018
 blast-0.017
Token One
Token ID1881
Feature activation-1.50e-02

Pos logit contributions
 possessions+0.062
izing+0.060
 possession+0.059
ths+0.059
ized+0.057
Neg logit contributions
 a-0.022
 Ben-0.020
 two-0.019
 roast-0.014
 Tom-0.014
Token day
Token ID1110
Feature activation+1.88e-02

Pos logit contributions
 a+0.021
 two+0.020
 Ben+0.018
 sne+0.018
 roast+0.018
Neg logit contributions
 possessions-0.025
izing-0.023
 possession-0.023
-0.023
 Acc-0.022
Token loved
Token ID6151
Feature activation+6.19e-03

Pos logit contributions
 possessions+0.022
ement+0.021
 possession+0.021
izing+0.021
ths+0.020
Neg logit contributions
 a-0.031
 two-0.029
 sne-0.028
 Ben-0.027
 blast-0.027
Token to
Token ID284
Feature activation+2.63e-02

Pos logit contributions
 possessions+0.012
 possession+0.012
+0.012
izing+0.012
 Acc+0.012
Neg logit contributions
 a-0.006
 two-0.006
 Ben-0.005
 sne-0.005
 blast-0.005
Token play
Token ID711
Feature activation+2.34e-02

Pos logit contributions
 possessions+0.044
 possession+0.043
 Acc+0.041
+0.041
 angel+0.041
Neg logit contributions
 a-0.031
 two-0.031
 Ben-0.028
rip-0.028
ata-0.027
Token outside
Token ID2354
Feature activation+2.77e-02

Pos logit contributions
 possessions+0.043
 possession+0.040
 Acc+0.039
 mattered+0.038
ement+0.038
Neg logit contributions
 a-0.031
 two-0.028
 blast-0.026
 sne-0.026
 Ben-0.026
Token and
Token ID290
Feature activation+2.55e-02

Pos logit contributions
 possessions+0.068
 disl+0.064
 possession+0.063
osity+0.062
 Acc+0.062
Neg logit contributions
 a-0.028
 two-0.022
 Ben-0.018
 sne-0.018
 four-0.017
Token look
Token ID804
Feature activation+9.43e-02

Pos logit contributions
ement+0.064
izing+0.061
 possessions+0.061
 possession+0.058
ths+0.058
Neg logit contributions
 a-0.024
 sne-0.019
 two-0.019
 roast-0.018
 blast-0.016
Token for
Token ID329
Feature activation+2.21e-02

Pos logit contributions
 possessions+0.217
+0.216
 mattered+0.215
 possession+0.211
 disl+0.210
Neg logit contributions
 a-0.092
 Ben-0.067
 two-0.065
 four-0.051
 sne-0.050
Token pretty
Token ID2495
Feature activation+2.52e-02

Pos logit contributions
 possessions+0.036
+0.035
 possession+0.035
 Acc+0.035
izing+0.034
Neg logit contributions
 a-0.031
 two-0.029
 Ben-0.026
 sne-0.026
 blast-0.025
Token rocks
Token ID12586
Feature activation+3.33e-02

Pos logit contributions
 Acc+0.050
+0.047
ishly+0.046
osity+0.046
ths+0.046
Neg logit contributions
 a-0.031
 two-0.027
 water-0.025
 sne-0.024
 Ben-0.024
Token.
Token ID13
Feature activation+2.93e-02

Pos logit contributions
 possessions+0.070
 possession+0.067
 Acc+0.067
+0.067
izing+0.065
Neg logit contributions
 a-0.029
 two-0.027
 Ben-0.023
 sne-0.022
 blast-0.022
Token One
Token ID1881
Feature activation-1.34e-02

Pos logit contributions
 possessions+0.072
izing+0.070
ths+0.069
 possession+0.068
ized+0.067
Neg logit contributions
 a-0.023
 Ben-0.021
 two-0.019
 Tom-0.014
 roast-0.014
Token Lily
Token ID20037
Feature activation+1.17e-02

Pos logit contributions
 a+0.012
 two+0.012
 blast+0.010
 Ben+0.010
 sne+0.010
Neg logit contributions
 possessions-0.022
 possession-0.022
-0.021
izing-0.021
 Acc-0.021
Token.
Token ID13
Feature activation+2.07e-02

Pos logit contributions
 possessions+0.027
izing+0.026
ement+0.026
ths+0.025
+0.025
Neg logit contributions
 a-0.010
 two-0.008
 Ben-0.007
 sne-0.006
 roast-0.006
Token She
Token ID1375
Feature activation+1.83e-02

Pos logit contributions
 possessions+0.033
 possession+0.031
izing+0.030
ths+0.030
 angel+0.029
Neg logit contributions
 a-0.032
 two-0.030
 Ben-0.030
 roast-0.027
 blast-0.027
Token loved
Token ID6151
Feature activation+6.38e-03

Pos logit contributions
 possessions+0.023
 possession+0.022
ement+0.022
izing+0.022
+0.021
Neg logit contributions
 a-0.033
 two-0.031
 sne-0.030
 Ben-0.029
 blast-0.029
Token to
Token ID284
Feature activation+2.44e-02

Pos logit contributions
 possessions+0.012
 possession+0.012
+0.011
izing+0.011
 angel+0.011
Neg logit contributions
 a-0.006
 two-0.006
 Ben-0.005
 blast-0.005
 sne-0.005
Token look
Token ID804
Feature activation+9.41e-02

Pos logit contributions
 possessions+0.041
 possession+0.039
 bicycle+0.038
 Acc+0.038
 angel+0.038
Neg logit contributions
 a-0.029
 two-0.028
rip-0.026
ata-0.026
 Ben-0.026
Token up
Token ID510
Feature activation+5.34e-02

Pos logit contributions
 possessions+0.190
 Acc+0.188
+0.188
 possession+0.187
 mattered+0.186
Neg logit contributions
 a-0.105
 two-0.090
 Ben-0.083
 four-0.073
 sne-0.071
Token at
Token ID379
Feature activation+2.33e-02

Pos logit contributions
 possessions+0.095
 possession+0.092
 Acc+0.089
+0.089
 mattered+0.087
Neg logit contributions
 a-0.065
 two-0.062
 Ben-0.055
 sne-0.053
 four-0.052
Token the
Token ID262
Feature activation+1.81e-02

Pos logit contributions
izing+0.051
+0.051
 possessions+0.051
 Acc+0.050
bing+0.050
Neg logit contributions
 a-0.022
 two-0.019
 Ben-0.016
 four-0.014
 gas-0.014
Token sky
Token ID6766
Feature activation+2.12e-02

Pos logit contributions
 possessions+0.027
 angel+0.026
 ornament+0.026
 bicycle+0.026
 possession+0.026
Neg logit contributions
 a-0.024
 two-0.023
 blast-0.023
rip-0.022
ata-0.022
Token and
Token ID290
Feature activation+2.99e-02

Pos logit contributions
 possessions+0.041
 possession+0.039
 Acc+0.038
+0.038
izing+0.038
Neg logit contributions
 a-0.022
 two-0.021
 Ben-0.018
 sne-0.018
 blast-0.018
Token,
Token ID11
Feature activation-2.74e-02

Pos logit contributions
 possessions+0.026
 possession+0.024
izing+0.023
 angel+0.023
+0.023
Neg logit contributions
 a-0.027
 two-0.027
 roast-0.026
 blast-0.026
 roasted-0.025
Token it
Token ID340
Feature activation-7.98e-03

Pos logit contributions
Ar+0.037
rip+0.035
umbs+0.035
rr+0.034
ata+0.034
Neg logit contributions
 angel-0.041
 ornament-0.040
 bicycle-0.037
-0.037
 Acc-0.037
Token has
Token ID468
Feature activation-4.88e-02

Pos logit contributions
 a+0.010
 two+0.010
 blast+0.010
 Theo+0.009
 Ben+0.009
Neg logit contributions
 possessions-0.012
-0.012
 angel-0.012
 ornament-0.012
 mattered-0.011
Token a
Token ID257
Feature activation-2.58e-02

Pos logit contributions
 Theo+0.033
rip+0.029
ata+0.027
 GO+0.027
 Run+0.026
Neg logit contributions
 possessions-0.104
 ornament-0.101
 possession-0.101
 mattered-0.100
 angel-0.098
Token tip
Token ID8171
Feature activation-1.42e-02

Pos logit contributions
 a+0.034
 two+0.033
 Ben+0.031
rip+0.031
 roasted+0.030
Neg logit contributions
 possessions-0.040
 possession-0.039
 angel-0.037
 ornament-0.037
 bicycle-0.037
Token like
Token ID588
Feature activation-7.27e-02

Pos logit contributions
 a+0.014
 two+0.014
 blast+0.014
 roasted+0.014
 roast+0.013
Neg logit contributions
 possessions-0.025
 possession-0.025
 angel-0.024
 ornament-0.024
 bicycle-0.023
Token a
Token ID257
Feature activation-3.61e-02

Pos logit contributions
ata+0.053
rip+0.047
 roasted+0.047
 Theo+0.046
 roast+0.044
Neg logit contributions
 ornament-0.153
 angel-0.148
 possessions-0.146
 possession-0.145
-0.143
Token star
Token ID3491
Feature activation-3.45e-02

Pos logit contributions
 a+0.051
 two+0.049
rip+0.047
 roasted+0.047
ata+0.046
Neg logit contributions
 possessions-0.051
 angel-0.050
 possession-0.049
 ornament-0.049
 bicycle-0.049
Token."
Token ID526
Feature activation-2.21e-02

Pos logit contributions
 two+0.037
 a+0.037
 roasted+0.036
 blast+0.036
 GO+0.036
Neg logit contributions
 angel-0.059
 ornament-0.058
 possessions-0.057
 possession-0.056
izing-0.055
Token Tom
Token ID4186
Feature activation-2.38e-02

Pos logit contributions
 roasted+0.018
ata+0.018
umbs+0.017
rip+0.017
rr+0.017
Neg logit contributions
 Acc-0.041
 she-0.041
 bicycle-0.040
 disl-0.040
-0.039
Token says
Token ID1139
Feature activation+7.03e-03

Pos logit contributions
 blast+0.029
 roasted+0.029
OOOO+0.027
rip+0.027
 roast+0.027
Neg logit contributions
 angel-0.036
ement-0.033
 Acc-0.033
 ornament-0.033
 disl-0.033
Token work
Token ID670
Feature activation-3.02e-03

Pos logit contributions
 possessions+0.008
 possession+0.007
izing+0.007
+0.007
 Acc+0.007
Neg logit contributions
 a-0.006
 two-0.006
 blast-0.006
 sne-0.006
 Ben-0.006
Token trip
Token ID5296
Feature activation-2.22e-02

Pos logit contributions
 roast+0.004
rip+0.004
 two+0.004
 roasted+0.004
 a+0.003
Neg logit contributions
 possessions-0.005
 possession-0.005
izing-0.005
ement-0.005
 bicycle-0.004
Token.
Token ID13
Feature activation-1.04e-02

Pos logit contributions
 roast+0.028
 roasted+0.027
 sne+0.027
 blast+0.027
rip+0.026
Neg logit contributions
 possessions-0.035
 possession-0.035
izing-0.033
ating-0.032
ement-0.032
Token<|endoftext|>
Token ID50256
Feature activation-3.90e-02

Pos logit contributions
 hook+0.013
 sne+0.013
 bread+0.013
bang+0.013
 cast+0.012
Neg logit contributions
 Acc-0.016
-0.015
 Mill-0.014
osity-0.014
 possession-0.013
TokenOnce
Token ID7454
Feature activation+5.51e-03

Pos logit contributions
 hook+0.061
 sne+0.060
 roast+0.060
 boil+0.060
 padd+0.058
Neg logit contributions
-0.045
 mattered-0.042
ement-0.041
ating-0.040
fl-0.040
Token upon
Token ID2402
Feature activation-7.02e-02

Pos logit contributions
 possessions+0.010
izing+0.010
 possession+0.010
+0.010
 angel+0.009
Neg logit contributions
 a-0.007
 two-0.006
 Ben-0.006
 sne-0.005
rip-0.005
Token a
Token ID257
Feature activation-2.21e-02

Pos logit contributions
rip+0.072
 blah+0.071
 Theo+0.068
 GO+0.066
OOOO+0.065
Neg logit contributions
 she-0.117
 ornament-0.116
 mattered-0.114
Her-0.114
 bicycle-0.112
Token time
Token ID640
Feature activation+3.41e-03

Pos logit contributions
 two+0.015
 a+0.014
rip+0.014
 Theo+0.014
 GO+0.014
Neg logit contributions
 possessions-0.045
 possession-0.045
-0.044
 Acc-0.043
ement-0.043
Token,
Token ID11
Feature activation-2.31e-03

Pos logit contributions
 possessions+0.007
 possession+0.006
izing+0.006
 angel+0.006
 ornament+0.006
Neg logit contributions
 a-0.005
 two-0.004
 Ben-0.004
 Tom-0.003
 blast-0.003
Token there
Token ID612
Feature activation+1.71e-02

Pos logit contributions
 a+0.002
 two+0.002
 Ben+0.002
 blast+0.002
 sne+0.002
Neg logit contributions
 possessions-0.005
 possession-0.004
izing-0.004
-0.004
 Acc-0.004
Token was
Token ID373
Feature activation-3.84e-02

Pos logit contributions
 possessions+0.034
 possession+0.032
izing+0.031
 angel+0.031
+0.030
Neg logit contributions
 a-0.021
 two-0.019
 Ben-0.017
 sne-0.016
 gas-0.015
Token of
Token ID286
Feature activation-2.31e-02

Pos logit contributions
 a+0.033
 two+0.032
 blast+0.028
 Ben+0.028
 sne+0.028
Neg logit contributions
 possessions-0.059
 possession-0.056
izing-0.054
 angel-0.054
 Acc-0.054
Token chocolate
Token ID11311
Feature activation-1.37e-02

Pos logit contributions
rip+0.023
ata+0.023
 two+0.022
 GO+0.022
 a+0.021
Neg logit contributions
 possessions-0.043
 possession-0.041
 angel-0.041
 ornament-0.040
 Acc-0.040
Token.
Token ID13
Feature activation-8.03e-03

Pos logit contributions
 two+0.015
 a+0.015
 blast+0.014
 GO+0.014
rip+0.014
Neg logit contributions
 possessions-0.023
 angel-0.023
 ornament-0.022
 possession-0.022
izing-0.022
Token<|endoftext|>
Token ID50256
Feature activation-4.55e-02

Pos logit contributions
 sne+0.008
rip+0.008
 padd+0.008
 GO+0.008
 hook+0.008
Neg logit contributions
 Acc-0.014
 possessions-0.014
-0.014
 possession-0.013
 mattered-0.013
TokenOnce
Token ID7454
Feature activation+5.66e-03

Pos logit contributions
 roast+0.069
 roasted+0.067
 padd+0.067
 hook+0.066
 boil+0.066
Neg logit contributions
-0.056
 mattered-0.051
ement-0.050
 ornament-0.050
ating-0.049
Token upon
Token ID2402
Feature activation-6.97e-02

Pos logit contributions
 possessions+0.012
izing+0.012
 possession+0.011
+0.011
ths+0.011
Neg logit contributions
 a-0.006
 two-0.006
 Ben-0.006
 Tom-0.004
rip-0.004
Token a
Token ID257
Feature activation-2.15e-02

Pos logit contributions
 Theo+0.042
 GO+0.042
 blah+0.041
rip+0.040
 padd+0.036
Neg logit contributions
 ornament-0.147
 mattered-0.146
 bicycle-0.143
Her-0.142
 possession-0.142
Token time
Token ID640
Feature activation-1.07e-03

Pos logit contributions
 two+0.014
 a+0.013
 GO+0.012
 Theo+0.012
rip+0.012
Neg logit contributions
 possessions-0.046
 possession-0.045
-0.044
 Acc-0.044
 mattered-0.043
Token,
Token ID11
Feature activation-3.17e-03

Pos logit contributions
 a+0.001
 two+0.001
 Ben+0.001
 Tom+0.001
 blast+0.001
Neg logit contributions
 possessions-0.002
 possession-0.002
izing-0.002
ths-0.002
adoes-0.002
Token there
Token ID612
Feature activation+1.60e-02

Pos logit contributions
 a+0.003
 two+0.003
 Ben+0.002
 sne+0.002
 four+0.002
Neg logit contributions
 possessions-0.007
 possession-0.007
izing-0.007
ths-0.006
 instincts-0.006
Token was
Token ID373
Feature activation-4.22e-02

Pos logit contributions
 possessions+0.037
izing+0.035
 possession+0.035
osity+0.034
 instincts+0.034
Neg logit contributions
 a-0.018
 two-0.014
 Ben-0.013
 sne-0.011
 Tom-0.011
Token playing
Token ID2712
Feature activation-2.82e-02

Pos logit contributions
ata+0.015
rip+0.015
umbs+0.015
 blast+0.015
 blah+0.015
Neg logit contributions
 possessions-0.014
 possession-0.013
izing-0.012
bing-0.012
ths-0.012
Token too
Token ID1165
Feature activation-2.16e-02

Pos logit contributions
 a+0.034
 two+0.034
rip+0.032
 roast+0.031
 sne+0.031
Neg logit contributions
 possessions-0.047
 possession-0.045
izing-0.044
 angel-0.043
-0.043
Token deep
Token ID2769
Feature activation-1.11e-02

Pos logit contributions
rip+0.031
 roast+0.030
 roasted+0.030
 Stri+0.029
 microwave+0.028
Neg logit contributions
 possessions-0.030
izing-0.030
 possession-0.029
ized-0.028
 mattered-0.027
Token<|endoftext|>
Token ID50256
Feature activation-5.36e-02

Pos logit contributions
 a+0.012
 two+0.012
rip+0.012
 blast+0.012
 roast+0.012
Neg logit contributions
 possessions-0.019
 possession-0.019
izing-0.018
 angel-0.018
bing-0.018
TokenOnce
Token ID7454
Feature activation-3.33e-03

Pos logit contributions
 roast+0.086
 hook+0.082
 roasted+0.081
 boil+0.080
 blast+0.080
Neg logit contributions
ement-0.063
-0.061
 mattered-0.060
ating-0.059
ined-0.058
Token upon
Token ID2402
Feature activation-6.92e-02

Pos logit contributions
 a+0.004
 two+0.004
 Ben+0.004
 sne+0.004
rip+0.004
Neg logit contributions
 possessions-0.006
 possession-0.005
izing-0.005
-0.005
 angel-0.005
Token a
Token ID257
Feature activation-2.65e-02

Pos logit contributions
rip+0.063
 Theo+0.061
 blah+0.060
rr+0.058
 Stri+0.055
Neg logit contributions
 ornament-0.123
 mattered-0.123
 she-0.122
 bicycle-0.121
 bracelet-0.118
Token time
Token ID640
Feature activation-4.56e-03

Pos logit contributions
 two+0.017
 a+0.016
rip+0.015
 GO+0.015
 Theo+0.015
Neg logit contributions
 possessions-0.057
 possession-0.056
-0.054
 mattered-0.054
 Acc-0.054
Token there
Token ID612
Feature activation+1.05e-02

Pos logit contributions
 a+0.007
 two+0.006
 Ben+0.006
 Tom+0.005
 sne+0.005
Neg logit contributions
 possessions-0.008
 possession-0.008
izing-0.007
 angel-0.007
 ornament-0.007
Token was
Token ID373
Feature activation-4.62e-02

Pos logit contributions
 possessions+0.015
 possession+0.015
+0.014
izing+0.014
 Acc+0.014
Neg logit contributions
 a-0.015
 two-0.015
 blast-0.014
 Ben-0.014
 sne-0.014
Token a
Token ID257
Feature activation-2.56e-02

Pos logit contributions
rip+0.025
 blah+0.024
 GO+0.021
rr+0.021
 Theo+0.021
Neg logit contributions
 ornament-0.101
 possession-0.101
 mattered-0.100
 Acc-0.098
 angel-0.097
Token Thankfully
Token ID31251
Feature activation+7.37e-03

Pos logit contributions
 a+0.009
 two+0.009
 sne+0.008
 blast+0.008
 roast+0.008
Neg logit contributions
 possessions-0.015
 possession-0.014
-0.014
 Acc-0.014
 angel-0.014
Token,
Token ID11
Feature activation-1.37e-02

Pos logit contributions
 possessions+0.012
 possession+0.012
 Acc+0.012
izing+0.012
 bicycle+0.011
Neg logit contributions
 a-0.011
 two-0.010
 Ben-0.009
 sne-0.008
 four-0.008
Token the
Token ID262
Feature activation-1.57e-02

Pos logit contributions
 a+0.023
 two+0.022
 Ben+0.021
 blast+0.020
 sne+0.020
Neg logit contributions
 possessions-0.018
 possession-0.017
izing-0.016
-0.016
 Acc-0.016
Token<|endoftext|>
Token ID50256
Feature activation-4.57e-02

Pos logit contributions
 a+0.021
rip+0.020
 roast+0.020
 roasted+0.020
 two+0.020
Neg logit contributions
 possessions-0.025
 possession-0.023
 angel-0.023
 ornament-0.022
 bicycle-0.022
TokenOnce
Token ID7454
Feature activation-4.89e-03

Pos logit contributions
 hook+0.071
 roast+0.071
 sne+0.070
 boil+0.069
 padd+0.069
Neg logit contributions
-0.056
ement-0.053
 mattered-0.050
ating-0.049
ined-0.049
Token upon
Token ID2402
Feature activation-6.92e-02

Pos logit contributions
 a+0.006
 two+0.006
 Ben+0.006
 sne+0.005
rip+0.005
Neg logit contributions
 possessions-0.009
izing-0.008
-0.008
 possession-0.008
 Acc-0.008
Token a
Token ID257
Feature activation-2.64e-02

Pos logit contributions
rip+0.071
 Theo+0.066
rr+0.065
 blah+0.065
 rods+0.065
Neg logit contributions
 she-0.124
 ornament-0.123
 bracelet-0.118
 bicycle-0.117
 mattered-0.116
Token time
Token ID640
Feature activation+3.35e-03

Pos logit contributions
 two+0.019
rip+0.018
 a+0.017
 Theo+0.017
 GO+0.017
Neg logit contributions
 possessions-0.054
 possession-0.054
-0.052
 bicycle-0.051
 Acc-0.051
Token there
Token ID612
Feature activation+1.24e-02

Pos logit contributions
 possessions+0.006
 possession+0.006
izing+0.006
 angel+0.006
 ornament+0.006
Neg logit contributions
 a-0.005
 two-0.004
 Ben-0.004
 Tom-0.003
 four-0.003
Token was
Token ID373
Feature activation-4.21e-02

Pos logit contributions
 possessions+0.019
 possession+0.018
izing+0.017
+0.017
 Acc+0.017
Neg logit contributions
 a-0.018
 two-0.017
 Ben-0.016
 blast-0.016
 sne-0.016
Token a
Token ID257
Feature activation-2.38e-02

Pos logit contributions
rip+0.026
 blah+0.023
 GO+0.021
rr+0.021
ata+0.021
Neg logit contributions
 ornament-0.091
 possession-0.090
 mattered-0.090
 angel-0.087
 possessions-0.087
Token hole
Token ID7604
Feature activation+4.53e-03

Pos logit contributions
 possessions+0.018
 possession+0.017
 angel+0.017
izing+0.017
+0.016
Neg logit contributions
 a-0.010
 two-0.010
 Ben-0.009
rip-0.009
 blast-0.009
Token forever
Token ID8097
Feature activation-1.50e-02

Pos logit contributions
 possessions+0.007
 possession+0.007
izing+0.007
 angel+0.007
ement+0.007
Neg logit contributions
 blast-0.005
 roast-0.005
 GO-0.005
 roasted-0.005
 two-0.005
Token.
Token ID13
Feature activation-1.44e-02

Pos logit contributions
rip+0.017
 blast+0.017
 two+0.017
 roasted+0.017
 roast+0.017
Neg logit contributions
 possessions-0.025
 possession-0.025
izing-0.024
bing-0.023
 mattered-0.023
Token<|endoftext|>
Token ID50256
Feature activation-4.34e-02

Pos logit contributions
 a+0.018
 two+0.018
 blast+0.017
 sne+0.017
 roast+0.017
Neg logit contributions
 Acc-0.022
 possessions-0.022
 possession-0.021
-0.021
 mattered-0.020
TokenOnce
Token ID7454
Feature activation+1.72e-03

Pos logit contributions
 roast+0.064
 padd+0.063
 sne+0.062
 hook+0.062
 roasted+0.062
Neg logit contributions
-0.052
 mattered-0.051
ement-0.049
ating-0.048
ined-0.048
Token upon
Token ID2402
Feature activation-6.90e-02

Pos logit contributions
 possessions+0.003
izing+0.003
+0.003
 possession+0.003
ths+0.003
Neg logit contributions
 a-0.002
 two-0.002
 Ben-0.002
 sne-0.001
rip-0.001
Token a
Token ID257
Feature activation-2.13e-02

Pos logit contributions
rip+0.055
 blah+0.055
 Theo+0.054
 GO+0.052
rr+0.051
Neg logit contributions
 she-0.133
 ornament-0.132
Her-0.129
 bicycle-0.127
 mattered-0.127
Token time
Token ID640
Feature activation+1.66e-03

Pos logit contributions
 two+0.013
 a+0.013
 GO+0.011
rip+0.011
 roasted+0.011
Neg logit contributions
 possessions-0.046
 possession-0.045
-0.044
 Acc-0.044
 mattered-0.044
Token,
Token ID11
Feature activation-2.85e-04

Pos logit contributions
 possessions+0.003
 possession+0.003
izing+0.003
 angel+0.003
 ornament+0.003
Neg logit contributions
 a-0.002
 two-0.002
 Ben-0.002
 Tom-0.002
 blast-0.002
Token there
Token ID612
Feature activation+1.47e-02

Pos logit contributions
 a+0.000
 two+0.000
 Ben+0.000
 sne+0.000
 blast+0.000
Neg logit contributions
 possessions-0.001
 possession-0.001
izing-0.001
ths-0.001
 angel-0.001
Token was
Token ID373
Feature activation-3.83e-02

Pos logit contributions
 possessions+0.031
 possession+0.029
izing+0.029
 angel+0.029
osity+0.028
Neg logit contributions
 a-0.017
 two-0.014
 Ben-0.014
 sne-0.012
 gas-0.011
Token the
Token ID262
Feature activation-3.39e-03

Pos logit contributions
 blast+0.038
 roast+0.036
rip+0.036
rr+0.036
 roasted+0.036
Neg logit contributions
 possessions-0.041
 possession-0.039
izing-0.037
 ornament-0.037
 angel-0.036
Token day
Token ID1110
Feature activation-2.84e-02

Pos logit contributions
 a+0.004
 two+0.004
rip+0.004
 blast+0.004
 Ben+0.004
Neg logit contributions
 possessions-0.006
 possession-0.005
 Acc-0.005
 angel-0.005
izing-0.005
Token!
Token ID0
Feature activation-1.56e-02

Pos logit contributions
rip+0.036
 two+0.035
Ar+0.035
 blast+0.034
 roast+0.034
Neg logit contributions
 possessions-0.044
 possession-0.042
izing-0.041
 angel-0.040
bing-0.040
Token<|endoftext|>
Token ID50256
Feature activation-3.99e-02

Pos logit contributions
 a+0.016
 two+0.015
 blast+0.015
 sne+0.015
ata+0.015
Neg logit contributions
 Acc-0.028
 possessions-0.028
 possession-0.027
-0.026
ement-0.026
TokenOnce
Token ID7454
Feature activation+2.59e-03

Pos logit contributions
 roast+0.060
 hook+0.060
 sne+0.059
 blast+0.058
 boil+0.058
Neg logit contributions
-0.050
 mattered-0.047
ement-0.046
ined-0.045
ating-0.044
Token upon
Token ID2402
Feature activation-6.90e-02

Pos logit contributions
 possessions+0.005
izing+0.005
+0.005
 possession+0.005
 angel+0.005
Neg logit contributions
 a-0.003
 two-0.003
 Ben-0.003
 sne-0.002
rip-0.002
Token a
Token ID257
Feature activation-2.25e-02

Pos logit contributions
rip+0.061
 blah+0.058
 Theo+0.057
rr+0.056
OOOO+0.054
Neg logit contributions
 she-0.127
 ornament-0.127
 mattered-0.124
 bicycle-0.123
Her-0.121
Token time
Token ID640
Feature activation+4.45e-03

Pos logit contributions
 two+0.014
 a+0.014
rip+0.013
 GO+0.012
 Theo+0.012
Neg logit contributions
 possessions-0.048
 possession-0.048
-0.046
 Acc-0.046
 mattered-0.046
Token,
Token ID11
Feature activation-4.07e-03

Pos logit contributions
 possessions+0.009
 possession+0.009
izing+0.009
 angel+0.008
adoes+0.008
Neg logit contributions
 a-0.006
 two-0.005
 Ben-0.005
 Tom-0.004
 sne-0.004
Token there
Token ID612
Feature activation+1.51e-02

Pos logit contributions
 a+0.004
 two+0.004
 Ben+0.003
 sne+0.003
 blast+0.003
Neg logit contributions
 possessions-0.008
 possession-0.008
izing-0.008
-0.008
 Acc-0.008
Token was
Token ID373
Feature activation-4.03e-02

Pos logit contributions
 possessions+0.030
 possession+0.028
izing+0.027
 angel+0.027
+0.026
Neg logit contributions
 a-0.019
 two-0.017
 Ben-0.016
 sne-0.015
 gas-0.014
Token been
Token ID587
Feature activation-1.54e-02

Pos logit contributions
rip+0.035
 roast+0.034
 blah+0.033
rr+0.033
olo+0.033
Neg logit contributions
 possessions-0.056
 mattered-0.053
 possession-0.052
 ornament-0.049
 Acc-0.048
Token waiting
Token ID4953
Feature activation-7.07e-03

Pos logit contributions
 two+0.016
 a+0.016
rip+0.016
 roast+0.015
 Ben+0.015
Neg logit contributions
 possessions-0.028
 possession-0.027
 mattered-0.026
 Acc-0.025
 angel-0.025
Token for
Token ID329
Feature activation-2.51e-02

Pos logit contributions
 two+0.008
 a+0.008
rip+0.008
 blast+0.008
ata+0.007
Neg logit contributions
 possessions-0.012
 possession-0.012
 Acc-0.011
 angel-0.011
izing-0.011
Token<|endoftext|>
Token ID50256
Feature activation-5.13e-02

Pos logit contributions
ata+0.035
rip+0.034
 GO+0.034
 blah+0.033
 blast+0.033
Neg logit contributions
 possessions-0.037
 possession-0.036
ie-0.033
izing-0.033
 Mill-0.032
TokenOnce
Token ID7454
Feature activation-2.79e-03

Pos logit contributions
 roast+0.075
 hook+0.075
 padd+0.072
 boil+0.072
 sne+0.072
Neg logit contributions
ement-0.063
-0.063
 mattered-0.059
 Ellie-0.057
 Ellen-0.057
Token upon
Token ID2402
Feature activation-6.88e-02

Pos logit contributions
 a+0.003
 two+0.003
 Ben+0.003
 sne+0.003
rip+0.003
Neg logit contributions
 possessions-0.005
izing-0.005
-0.005
 possession-0.005
 angel-0.005
Token a
Token ID257
Feature activation-2.60e-02

Pos logit contributions
rip+0.058
 blah+0.056
 Theo+0.056
rr+0.056
 Stri+0.052
Neg logit contributions
 ornament-0.131
 she-0.130
 mattered-0.129
Her-0.125
 bracelet-0.124
Token time
Token ID640
Feature activation-5.90e-03

Pos logit contributions
 two+0.016
 a+0.016
rip+0.015
 Theo+0.014
 GO+0.014
Neg logit contributions
 possessions-0.056
 possession-0.056
-0.054
 mattered-0.054
 Acc-0.054
Token,
Token ID11
Feature activation-6.83e-03

Pos logit contributions
 a+0.008
 two+0.007
 Ben+0.007
 Tom+0.006
 sne+0.005
Neg logit contributions
 possessions-0.012
 possession-0.011
izing-0.011
 angel-0.010
ating-0.010
Token there
Token ID612
Feature activation+1.48e-02

Pos logit contributions
 a+0.007
 two+0.007
 blast+0.007
 sne+0.007
rip+0.006
Neg logit contributions
 possessions-0.012
 possession-0.012
-0.011
izing-0.011
 angel-0.011
Token was
Token ID373
Feature activation-4.25e-02

Pos logit contributions
 possessions+0.028
 possession+0.026
izing+0.026
 angel+0.025
+0.025
Neg logit contributions
 a-0.019
 two-0.017
 Ben-0.016
 sne-0.015
 gas-0.014
Token The
Token ID383
Feature activation-1.39e-02

Pos logit contributions
 sne+0.001
rip+0.001
 a+0.001
ata+0.001
 GO+0.001
Neg logit contributions
 Acc-0.002
 possessions-0.002
 possession-0.002
-0.002
 mattered-0.002
Token end
Token ID886
Feature activation-5.97e-03

Pos logit contributions
rip+0.019
 a+0.019
ata+0.018
 GO+0.018
 blah+0.018
Neg logit contributions
 possessions-0.021
 angel-0.020
 ornament-0.019
 possession-0.019
 bicycle-0.019
Token.
Token ID13
Feature activation-1.32e-02

Pos logit contributions
 a+0.005
 two+0.005
 Ben+0.004
 blast+0.004
 sne+0.004
Neg logit contributions
 possessions-0.012
 possession-0.012
-0.011
izing-0.011
 Acc-0.011
Token<|endoftext|>
Token ID50256
Feature activation-4.18e-02

Pos logit contributions
 sne+0.016
 padd+0.015
 hook+0.015
rip+0.015
 GO+0.015
Neg logit contributions
 Acc-0.022
-0.020
Really-0.020
 mattered-0.020
 disl-0.020
TokenOnce
Token ID7454
Feature activation+8.54e-04

Pos logit contributions
 roast+0.066
 padd+0.066
 sne+0.065
 roasted+0.064
 hook+0.063
Neg logit contributions
-0.048
ating-0.045
 mattered-0.044
ined-0.044
ement-0.044
Token upon
Token ID2402
Feature activation-6.88e-02

Pos logit contributions
 possessions+0.002
izing+0.002
 possession+0.002
+0.002
ths+0.002
Neg logit contributions
 a-0.001
 two-0.001
 Ben-0.001
rip-0.001
 sne-0.001
Token a
Token ID257
Feature activation-1.70e-02

Pos logit contributions
 Theo+0.050
 GO+0.047
 blah+0.046
rip+0.045
 padd+0.041
Neg logit contributions
 ornament-0.137
Her-0.137
 mattered-0.136
 she-0.134
 bracelet-0.134
Token time
Token ID640
Feature activation+4.26e-03

Pos logit contributions
 two+0.011
 a+0.011
 Theo+0.009
 GO+0.009
 roasted+0.009
Neg logit contributions
 possessions-0.036
 possession-0.036
-0.035
 Acc-0.035
 mattered-0.035
Token,
Token ID11
Feature activation+1.72e-03

Pos logit contributions
 possessions+0.009
 possession+0.008
izing+0.008
 angel+0.008
ths+0.008
Neg logit contributions
 a-0.006
 two-0.005
 Ben-0.005
 Tom-0.004
rr-0.004
Token in
Token ID287
Feature activation+8.42e-03

Pos logit contributions
 possessions+0.004
 possession+0.004
izing+0.004
ths+0.003
+0.003
Neg logit contributions
 a-0.002
 two-0.002
 Ben-0.001
 sne-0.001
 blast-0.001
Token a
Token ID257
Feature activation+2.70e-03

Pos logit contributions
 possession+0.021
 possessions+0.021
 angel+0.021
 ornament+0.021
+0.021
Neg logit contributions
rip-0.002
 blah-0.002
rr-0.001
ata-0.001
 GO-0.001
Token dream
Token ID4320
Feature activation-3.20e-02

Pos logit contributions
 two+0.031
 a+0.030
 Ben+0.028
 blast+0.027
rip+0.027
Neg logit contributions
 possessions-0.051
 angel-0.049
 possession-0.049
 mattered-0.048
 bicycle-0.048
Token on
Token ID319
Feature activation-7.59e-03

Pos logit contributions
 two+0.040
 a+0.040
 roast+0.038
 roasted+0.038
 blast+0.038
Neg logit contributions
 possessions-0.051
 possession-0.048
izing-0.047
 angel-0.046
-0.046
Token a
Token ID257
Feature activation-9.53e-04

Pos logit contributions
 blast+0.008
rip+0.008
ata+0.008
rr+0.007
 two+0.007
Neg logit contributions
 possessions-0.014
 possession-0.013
-0.013
izing-0.013
 bicycle-0.013
Token miserable
Token ID22444
Feature activation-5.25e-03

Pos logit contributions
 a+0.001
 two+0.001
rip+0.001
 Ben+0.001
 roast+0.001
Neg logit contributions
 possessions-0.001
 possession-0.001
 angel-0.001
 bicycle-0.001
 ornament-0.001
Token Tuesday
Token ID3431
Feature activation-3.35e-02

Pos logit contributions
 a+0.007
 two+0.007
 Ben+0.007
 blast+0.007
rip+0.007
Neg logit contributions
 possessions-0.008
 possession-0.008
 angel-0.007
izing-0.007
-0.007
Token like
Token ID588
Feature activation-6.86e-02

Pos logit contributions
rip+0.043
 roast+0.043
umbs+0.043
 eight+0.043
 roasted+0.043
Neg logit contributions
-0.046
 possessions-0.045
 angel-0.045
 bicycle-0.045
izing-0.044
Token this
Token ID428
Feature activation-7.05e-03

Pos logit contributions
rip+0.079
ata+0.079
rr+0.077
 blast+0.075
umbs+0.075
Neg logit contributions
 possessions-0.113
 possession-0.108
 angel-0.105
 bicycle-0.104
 Acc-0.103
Token?
Token ID30
Feature activation-1.59e-02

Pos logit contributions
rip+0.009
umbs+0.009
rr+0.009
 eight+0.009
 GO+0.009
Neg logit contributions
 angel-0.011
 possessions-0.010
 ornament-0.010
 possession-0.010
 bicycle-0.010
Token " "
Token ID220
Feature activation-1.22e-02

Pos logit contributions
 a+0.014
 two+0.014
 blast+0.013
 sne+0.013
 roast+0.013
Neg logit contributions
 possessions-0.030
 possession-0.030
-0.030
 Acc-0.029
 bicycle-0.029
Token "\n"
Token ID198
Feature activation-1.39e-02

Pos logit contributions
 blast+0.011
 sne+0.011
 roast+0.011
 roasted+0.011
 a+0.011
Neg logit contributions
 possessions-0.022
-0.022
 Acc-0.022
 mattered-0.021
 possession-0.021
Token "\n"
Token ID198
Feature activation+2.86e-04

Pos logit contributions
 a+0.012
 two+0.011
 sne+0.011
 blast+0.011
 roast+0.011
Neg logit contributions
 possessions-0.027
-0.026
 possession-0.026
 Acc-0.025
 bicycle-0.025
Token bad
Token ID2089
Feature activation+1.36e-02

Pos logit contributions
 a+0.002
 two+0.001
 Ben+0.001
 blast+0.001
 sne+0.001
Neg logit contributions
 possessions-0.003
 possession-0.003
-0.003
 Acc-0.003
 angel-0.003
Token ending
Token ID7464
Feature activation-1.40e-02

Pos logit contributions
 possessions+0.014
 possession+0.013
 angel+0.012
 bicycle+0.012
 Acc+0.011
Neg logit contributions
 a-0.026
 two-0.025
 GO-0.024
rip-0.024
 Ben-0.024
Token.
Token ID13
Feature activation-6.40e-03

Pos logit contributions
 two+0.017
 GO+0.017
 blast+0.016
 roast+0.016
 a+0.016
Neg logit contributions
 possessions-0.022
 possession-0.021
 angel-0.021
 Acc-0.021
 mattered-0.020
Token<|endoftext|>
Token ID50256
Feature activation-4.31e-02

Pos logit contributions
 GO+0.007
rip+0.007
 sne+0.007
 padd+0.007
 hook+0.007
Neg logit contributions
 Acc-0.011
-0.011
 mattered-0.010
 possession-0.010
 possessions-0.010
TokenOnce
Token ID7454
Feature activation+2.16e-03

Pos logit contributions
 roast+0.066
 roasted+0.065
 padd+0.064
 sne+0.063
 hook+0.063
Neg logit contributions
-0.052
ating-0.050
ined-0.047
ement-0.047
 mattered-0.047
Token upon
Token ID2402
Feature activation-6.85e-02

Pos logit contributions
 possessions+0.004
izing+0.004
 possession+0.004
+0.004
ths+0.004
Neg logit contributions
 a-0.003
 two-0.002
 Ben-0.002
 sne-0.002
rip-0.002
Token a
Token ID257
Feature activation-1.94e-02

Pos logit contributions
 Theo+0.045
 blah+0.042
rip+0.042
 GO+0.041
ata+0.036
Neg logit contributions
 ornament-0.142
 mattered-0.142
 bicycle-0.138
 possession-0.137
 angel-0.136
Token time
Token ID640
Feature activation-5.48e-04

Pos logit contributions
 two+0.014
 a+0.013
 GO+0.013
 Theo+0.012
 roasted+0.012
Neg logit contributions
 possession-0.039
 possessions-0.039
-0.038
 Acc-0.038
 mattered-0.037
Token,
Token ID11
Feature activation-2.82e-03

Pos logit contributions
 a+0.001
 two+0.001
 Ben+0.001
 Tom+0.001
rr+0.001
Neg logit contributions
 possessions-0.001
 possession-0.001
izing-0.001
 angel-0.001
 ornament-0.001
Token there
Token ID612
Feature activation+1.63e-02

Pos logit contributions
 a+0.003
 two+0.003
 Ben+0.002
 sne+0.002
 blast+0.002
Neg logit contributions
 possessions-0.006
 possession-0.006
izing-0.006
-0.005
ths-0.005
Token was
Token ID373
Feature activation-3.84e-02

Pos logit contributions
 possessions+0.038
izing+0.036
 possession+0.035
osity+0.035
 angel+0.035
Neg logit contributions
 a-0.017
 two-0.014
 Ben-0.013
 sne-0.011
 Tom-0.011
Token picnic
Token ID35715
Feature activation-1.16e-03

Pos logit contributions
 possessions+0.001
 angel+0.001
 possession+0.001
 mattered+0.001
 ornament+0.001
Neg logit contributions
rip-0.001
 Stri-0.001
hes-0.001
umbs-0.001
ata-0.001
Token soon
Token ID2582
Feature activation-3.24e-02

Pos logit contributions
rip+0.001
 two+0.001
 sne+0.001
 a+0.001
 blah+0.001
Neg logit contributions
 possessions-0.002
 possession-0.002
 angel-0.002
 mattered-0.002
-0.002
Token."
Token ID526
Feature activation-8.35e-03

Pos logit contributions
 Theo+0.040
 roasted+0.037
 roast+0.037
 Stri+0.037
Ar+0.036
Neg logit contributions
 possessions-0.050
izing-0.050
-0.049
ating-0.047
 angel-0.046
Token<|endoftext|>
Token ID50256
Feature activation-4.05e-02

Pos logit contributions
 a+0.008
 blast+0.008
rip+0.008
 sne+0.008
 two+0.008
Neg logit contributions
 Acc-0.015
-0.014
 possessions-0.014
 possession-0.014
ement-0.014
TokenOnce
Token ID7454
Feature activation+3.73e-03

Pos logit contributions
 roast+0.059
 blast+0.056
 sne+0.056
 boil+0.056
 hook+0.056
Neg logit contributions
-0.053
 mattered-0.051
ement-0.050
ating-0.047
fl-0.047
Token upon
Token ID2402
Feature activation-6.84e-02

Pos logit contributions
 possessions+0.007
izing+0.007
 possession+0.007
 angel+0.007
+0.007
Neg logit contributions
 a-0.004
 two-0.004
 Ben-0.004
 sne-0.003
rip-0.003
Token a
Token ID257
Feature activation-2.23e-02

Pos logit contributions
rip+0.058
 blah+0.056
 Theo+0.055
OOOO+0.052
rr+0.052
Neg logit contributions
 ornament-0.127
 she-0.127
 bicycle-0.125
 bracelet-0.124
 mattered-0.124
Token time
Token ID640
Feature activation+2.28e-03

Pos logit contributions
 two+0.015
 a+0.014
rip+0.013
 roast+0.012
 Theo+0.012
Neg logit contributions
 possessions-0.047
 possession-0.047
-0.046
 Acc-0.045
ement-0.045
Token,
Token ID11
Feature activation-4.79e-03

Pos logit contributions
 possessions+0.005
 possession+0.005
izing+0.004
 angel+0.004
 ornament+0.004
Neg logit contributions
 a-0.003
 two-0.003
 Ben-0.003
 Tom-0.002
 an-0.002
Token there
Token ID612
Feature activation+1.55e-02

Pos logit contributions
 a+0.005
 two+0.005
 Ben+0.004
 blast+0.004
 sne+0.004
Neg logit contributions
 possessions-0.010
 possession-0.009
izing-0.009
-0.009
 angel-0.009
Token was
Token ID373
Feature activation-3.62e-02

Pos logit contributions
 possessions+0.033
 possession+0.031
izing+0.030
osity+0.030
 angel+0.030
Neg logit contributions
 a-0.018
 two-0.016
 Ben-0.015
 sne-0.013
 gas-0.013
Token too
Token ID1165
Feature activation-8.91e-03

Pos logit contributions
 a+0.022
 two+0.021
 Ben+0.020
 blast+0.020
rip+0.020
Neg logit contributions
 possessions-0.033
 possession-0.032
-0.030
izing-0.030
 angel-0.030
Token late
Token ID2739
Feature activation-2.16e-02

Pos logit contributions
 blast+0.009
 GO+0.009
 roast+0.008
rip+0.008
 Theo+0.008
Neg logit contributions
 possessions-0.017
 possession-0.016
izing-0.016
 mattered-0.016
 disl-0.015
Token!
Token ID0
Feature activation-2.77e-02

Pos logit contributions
 two+0.025
 blast+0.025
 roasted+0.024
 roast+0.024
 a+0.024
Neg logit contributions
 possessions-0.036
 possession-0.035
izing-0.034
-0.033
ement-0.033
Token<|endoftext|>
Token ID50256
Feature activation-4.12e-02

Pos logit contributions
 sne+0.040
 hook+0.039
rip+0.036
 boil+0.036
ata+0.036
Neg logit contributions
 Acc-0.043
-0.040
 mattered-0.039
 Ellen-0.038
 Mill-0.038
TokenOnce
Token ID7454
Feature activation+1.15e-03

Pos logit contributions
 roast+0.064
 sne+0.064
 hook+0.064
 boil+0.063
 padd+0.062
Neg logit contributions
-0.048
ement-0.047
 mattered-0.046
ined-0.044
ating-0.043
Token upon
Token ID2402
Feature activation-6.84e-02

Pos logit contributions
 possessions+0.002
izing+0.002
+0.002
bing+0.002
ths+0.002
Neg logit contributions
 a-0.001
 two-0.001
 Ben-0.001
 sne-0.001
 Tom-0.001
Token a
Token ID257
Feature activation-1.99e-02

Pos logit contributions
rip+0.071
 Theo+0.066
 blah+0.066
rr+0.066
OOOO+0.065
Neg logit contributions
 she-0.121
 ornament-0.116
 mattered-0.114
Her-0.113
 bracelet-0.111
Token time
Token ID640
Feature activation+3.90e-03

Pos logit contributions
 two+0.015
 Theo+0.015
rip+0.015
 GO+0.014
 a+0.014
Neg logit contributions
 possession-0.039
 possessions-0.039
-0.037
 mattered-0.037
ement-0.037
Token there
Token ID612
Feature activation+7.75e-03

Pos logit contributions
 possessions+0.007
 possession+0.007
izing+0.007
 angel+0.007
ths+0.007
Neg logit contributions
 a-0.005
 two-0.005
 Ben-0.005
 Tom-0.004
 sne-0.004
Token was
Token ID373
Feature activation-4.04e-02

Pos logit contributions
 possessions+0.013
 possession+0.013
izing+0.012
+0.012
 angel+0.012
Neg logit contributions
 a-0.010
 two-0.010
 Ben-0.009
 sne-0.009
 blast-0.008
Token a
Token ID257
Feature activation-2.47e-02

Pos logit contributions
rip+0.030
 blah+0.027
rr+0.025
 Theo+0.024
 GO+0.024
Neg logit contributions
 ornament-0.084
 mattered-0.082
 possession-0.081
 angel-0.079
 Ellie-0.078
Token the
Token ID262
Feature activation-5.05e-03

Pos logit contributions
 possessions+0.009
 possession+0.009
 angel+0.008
izing+0.008
 ornament+0.008
Neg logit contributions
 two-0.005
 a-0.005
 blast-0.005
 sne-0.005
rip-0.005
Token tower
Token ID10580
Feature activation-5.23e-03

Pos logit contributions
rip+0.008
umbs+0.007
ata+0.007
 Run+0.007
rr+0.007
Neg logit contributions
 possessions-0.007
 angel-0.007
 ornament-0.007
 bicycle-0.007
 possession-0.007
Token!"
Token ID2474
Feature activation-9.35e-03

Pos logit contributions
 two+0.005
 a+0.005
 roasted+0.005
rip+0.005
 roast+0.005
Neg logit contributions
 possessions-0.009
 angel-0.009
 possession-0.009
 ornament-0.009
izing-0.009
Token<|endoftext|>
Token ID50256
Feature activation-3.93e-02

Pos logit contributions
 blast+0.010
ata+0.010
 GO+0.009
 sne+0.009
rip+0.009
Neg logit contributions
-0.016
 Acc-0.016
 possession-0.015
 possessions-0.015
ement-0.014
TokenOnce
Token ID7454
Feature activation+1.64e-03

Pos logit contributions
 roast+0.060
 hook+0.059
 sne+0.058
 boil+0.057
 roasted+0.057
Neg logit contributions
-0.049
 mattered-0.046
ement-0.046
ating-0.045
ined-0.044
Token upon
Token ID2402
Feature activation-6.83e-02

Pos logit contributions
 possessions+0.003
izing+0.003
 possession+0.003
+0.003
ths+0.003
Neg logit contributions
 a-0.002
 two-0.002
 Ben-0.002
 sne-0.001
rip-0.001
Token a
Token ID257
Feature activation-2.09e-02

Pos logit contributions
rip+0.065
 blah+0.064
 Theo+0.064
 GO+0.059
OOOO+0.059
Neg logit contributions
 ornament-0.121
 she-0.120
 mattered-0.117
 bracelet-0.117
Her-0.115
Token time
Token ID640
Feature activation+4.93e-03

Pos logit contributions
 two+0.015
 Theo+0.015
 a+0.014
rip+0.014
 GO+0.014
Neg logit contributions
 possessions-0.041
 possession-0.041
-0.040
ement-0.040
 mattered-0.040
Token,
Token ID11
Feature activation-3.07e-04

Pos logit contributions
 possessions+0.011
 possession+0.010
izing+0.010
 ornament+0.010
adoes+0.009
Neg logit contributions
 a-0.006
 two-0.005
 Ben-0.005
 Tom-0.004
 time-0.004
Token there
Token ID612
Feature activation+1.84e-02

Pos logit contributions
 a+0.000
 two+0.000
 Ben+0.000
 blast+0.000
 sne+0.000
Neg logit contributions
 possessions-0.001
 possession-0.001
izing-0.001
 Acc-0.001
-0.001
Token was
Token ID373
Feature activation-3.36e-02

Pos logit contributions
 possessions+0.037
 possession+0.035
izing+0.034
 angel+0.034
osity+0.033
Neg logit contributions
 a-0.023
 two-0.020
 Ben-0.018
 sne-0.017
 gas-0.016
Token too
Token ID1165
Feature activation-3.54e-03

Pos logit contributions
 a+0.015
 two+0.014
rip+0.013
 blast+0.013
 Ben+0.013
Neg logit contributions
 possessions-0.032
 possession-0.031
 angel-0.030
izing-0.030
 mattered-0.030
Token late
Token ID2739
Feature activation-2.15e-02

Pos logit contributions
 blast+0.003
 GO+0.002
 roast+0.002
 blah+0.002
rip+0.002
Neg logit contributions
 possessions-0.007
 possession-0.007
izing-0.007
 Acc-0.007
 mattered-0.007
Token.
Token ID13
Feature activation-1.83e-02

Pos logit contributions
 blast+0.026
 two+0.026
 roasted+0.026
 roast+0.025
rip+0.025
Neg logit contributions
 possessions-0.035
 possession-0.034
izing-0.033
ement-0.032
-0.032
Token<|endoftext|>
Token ID50256
Feature activation-3.64e-02

Pos logit contributions
 hook+0.026
 sne+0.025
 roast+0.024
 padd+0.024
 bait+0.024
Neg logit contributions
-0.028
 Acc-0.027
 Ellen-0.026
ement-0.025
 Mill-0.025
TokenOnce
Token ID7454
Feature activation+4.95e-03

Pos logit contributions
 sne+0.053
 roast+0.052
 hook+0.051
 boil+0.051
 padd+0.051
Neg logit contributions
-0.047
 mattered-0.045
ement-0.045
ating-0.043
ined-0.042
Token upon
Token ID2402
Feature activation-6.82e-02

Pos logit contributions
 possessions+0.010
izing+0.009
+0.009
 possession+0.009
bing+0.009
Neg logit contributions
 a-0.006
 two-0.005
 Ben-0.005
 Tom-0.004
 sne-0.004
Token a
Token ID257
Feature activation-1.94e-02

Pos logit contributions
rip+0.070
rr+0.068
 blah+0.065
 Theo+0.065
OOOO+0.063
Neg logit contributions
 she-0.123
 ornament-0.118
Her-0.115
 mattered-0.113
 bicycle-0.113
Token time
Token ID640
Feature activation+3.96e-03

Pos logit contributions
 two+0.014
 a+0.013
rip+0.012
 Theo+0.012
 roasted+0.012
Neg logit contributions
 possessions-0.040
 possession-0.040
-0.039
ement-0.038
 mattered-0.038
Token there
Token ID612
Feature activation+9.40e-03

Pos logit contributions
 possessions+0.008
 possession+0.008
izing+0.007
 angel+0.007
 ornament+0.007
Neg logit contributions
 a-0.005
 two-0.005
 Ben-0.005
 Tom-0.004
 four-0.004
Token were
Token ID547
Feature activation-3.58e-02

Pos logit contributions
 possessions+0.017
 possession+0.016
izing+0.016
 angel+0.016
+0.016
Neg logit contributions
 a-0.012
 two-0.011
 Ben-0.010
 sne-0.010
 gas-0.009
Token two
Token ID734
Feature activation-1.41e-03

Pos logit contributions
rip+0.025
ata+0.025
 hook+0.025
 cast+0.024
 blah+0.023
Neg logit contributions
 ornament-0.072
 mattered-0.071
 Mill-0.071
 angel-0.070
ths-0.069
Token Experience
Token ID16386
Feature activation-3.08e-02

Pos logit contributions
 sne+0.021
 roast+0.019
 a+0.019
 blast+0.019
 two+0.018
Neg logit contributions
 Acc-0.036
 possession-0.036
-0.035
 possessions-0.035
 mattered-0.035
Token won
Token ID1839
Feature activation-2.38e-02

Pos logit contributions
 roast+0.036
 boil+0.035
 roasted+0.034
 a+0.034
 two+0.034
Neg logit contributions
 possessions-0.052
 mattered-0.049
 possession-0.048
izing-0.048
-0.047
Token!
Token ID0
Feature activation-3.12e-02

Pos logit contributions
 roast+0.031
 roasted+0.030
 microwave+0.030
 meat+0.030
 boil+0.029
Neg logit contributions
 possessions-0.035
izing-0.033
 mattered-0.033
bing-0.033
ating-0.033
Token<|endoftext|>
Token ID50256
Feature activation-4.01e-02

Pos logit contributions
 sne+0.038
 roast+0.035
 roasted+0.034
 hook+0.034
 boil+0.033
Neg logit contributions
 Acc-0.052
 possession-0.051
-0.050
 mattered-0.050
ement-0.048
TokenOnce
Token ID7454
Feature activation+2.04e-03

Pos logit contributions
 sne+0.062
 roast+0.062
 boil+0.061
 hook+0.060
 blast+0.060
Neg logit contributions
-0.048
ement-0.047
 mattered-0.045
ating-0.043
ined-0.042
Token upon
Token ID2402
Feature activation-6.81e-02

Pos logit contributions
 possessions+0.004
+0.004
izing+0.004
 possession+0.003
 angel+0.003
Neg logit contributions
 a-0.003
 two-0.002
 Ben-0.002
 sne-0.002
rip-0.002
Token a
Token ID257
Feature activation-1.92e-02

Pos logit contributions
rip+0.064
 blah+0.059
 Theo+0.058
OOOO+0.057
rr+0.057
Neg logit contributions
 ornament-0.123
 she-0.123
 bicycle-0.120
 mattered-0.118
 bracelet-0.117
Token time
Token ID640
Feature activation+4.27e-03

Pos logit contributions
 two+0.014
 a+0.013
rip+0.013
 Theo+0.013
 GO+0.012
Neg logit contributions
 possessions-0.039
 possession-0.039
ement-0.037
-0.037
 Acc-0.037
Token,
Token ID11
Feature activation-5.66e-03

Pos logit contributions
 possessions+0.008
 possession+0.008
izing+0.007
 angel+0.007
 ornament+0.007
Neg logit contributions
 a-0.006
 two-0.005
 Ben-0.005
 Tom-0.004
 four-0.004
Token there
Token ID612
Feature activation+1.56e-02

Pos logit contributions
 a+0.006
 two+0.006
 blast+0.005
 sne+0.005
 Ben+0.005
Neg logit contributions
 possessions-0.010
 possession-0.010
-0.009
izing-0.009
 angel-0.009
Token were
Token ID547
Feature activation-3.57e-02

Pos logit contributions
 possessions+0.030
 possession+0.028
izing+0.027
 angel+0.027
+0.027
Neg logit contributions
 a-0.020
 two-0.018
 Ben-0.017
 sne-0.016
 gas-0.015
Token sang
Token ID25889
Feature activation-3.37e-02

Pos logit contributions
 Theo+0.024
 two+0.024
 a+0.023
 GO+0.023
umbs+0.022
Neg logit contributions
 possessions-0.031
 mattered-0.030
 angel-0.029
 possession-0.029
 Acc-0.028
Token again
Token ID757
Feature activation-7.15e-03

Pos logit contributions
 roast+0.041
 roasted+0.040
 blast+0.039
 Theo+0.038
 two+0.038
Neg logit contributions
 angel-0.054
 possessions-0.054
 possession-0.052
izing-0.051
 mattered-0.051
Token.
Token ID13
Feature activation-1.26e-02

Pos logit contributions
 GO+0.012
 blast+0.012
rip+0.012
 roasted+0.012
Ar+0.011
Neg logit contributions
 angel-0.008
 possession-0.008
 possessions-0.008
izing-0.008
 mattered-0.007
Token<|endoftext|>
Token ID50256
Feature activation-4.43e-02

Pos logit contributions
 roasted+0.013
 two+0.013
 sne+0.013
 padd+0.013
 a+0.013
Neg logit contributions
 Acc-0.021
 possession-0.021
 possessions-0.020
-0.020
 angel-0.020
TokenOnce
Token ID7454
Feature activation-5.94e-03

Pos logit contributions
 roast+0.064
 roasted+0.064
 padd+0.063
 sne+0.061
 GO+0.061
Neg logit contributions
ating-0.053
ined-0.052
-0.052
 mattered-0.052
 ornament-0.052
Token upon
Token ID2402
Feature activation-6.80e-02

Pos logit contributions
 a+0.007
 two+0.007
 Ben+0.006
rip+0.006
 sne+0.006
Neg logit contributions
 possessions-0.011
 possession-0.010
izing-0.010
-0.010
 Acc-0.010
Token a
Token ID257
Feature activation-1.96e-02

Pos logit contributions
 Theo+0.051
 blah+0.047
 GO+0.047
rip+0.044
 padd+0.043
Neg logit contributions
 ornament-0.135
Her-0.132
 she-0.131
 angel-0.131
 bicycle-0.130
Token time
Token ID640
Feature activation+2.01e-03

Pos logit contributions
 two+0.014
 a+0.013
 Theo+0.013
 GO+0.013
 roasted+0.013
Neg logit contributions
 possession-0.039
 possessions-0.039
-0.038
 Acc-0.038
 mattered-0.038
Token,
Token ID11
Feature activation-2.38e-03

Pos logit contributions
 possessions+0.004
 possession+0.004
izing+0.004
ths+0.004
 angel+0.004
Neg logit contributions
 a-0.003
 two-0.002
 Ben-0.002
rr-0.002
 Tom-0.002
Token there
Token ID612
Feature activation+1.53e-02

Pos logit contributions
 a+0.002
 two+0.002
 Ben+0.002
 sne+0.002
 blast+0.002
Neg logit contributions
 possessions-0.005
 possession-0.005
izing-0.005
-0.005
 Acc-0.005
Token was
Token ID373
Feature activation-3.94e-02

Pos logit contributions
 possessions+0.033
 possession+0.030
izing+0.030
osity+0.029
 instincts+0.029
Neg logit contributions
 a-0.018
 two-0.015
 Ben-0.014
 sne-0.013
 gas-0.012
Token about
Token ID546
Feature activation-2.17e-02

Pos logit contributions
 two+0.002
 a+0.002
 blast+0.002
 sne+0.002
 roast+0.002
Neg logit contributions
 possessions-0.002
 possession-0.002
 angel-0.001
 Acc-0.001
izing-0.001
Token itself
Token ID2346
Feature activation+8.61e-03

Pos logit contributions
 a+0.023
 two+0.023
 blast+0.020
 sne+0.020
rip+0.020
Neg logit contributions
 possessions-0.040
 possession-0.039
izing-0.037
 angel-0.037
 Acc-0.037
Token.
Token ID13
Feature activation-1.67e-02

Pos logit contributions
 possessions+0.015
 angel+0.014
 possession+0.014
izing+0.014
 ornament+0.014
Neg logit contributions
 blast-0.010
rip-0.009
 GO-0.009
 roast-0.009
 two-0.009
Token<|endoftext|>
Token ID50256
Feature activation-4.32e-02

Pos logit contributions
 sne+0.026
 roasted+0.025
 padd+0.024
 roast+0.024
 boil+0.023
Neg logit contributions
 Acc-0.023
-0.021
Really-0.020
 disl-0.020
 mattered-0.020
TokenOnce
Token ID7454
Feature activation-1.73e-03

Pos logit contributions
 roast+0.068
 sne+0.065
 roasted+0.065
 boil+0.065
 blast+0.064
Neg logit contributions
-0.051
 mattered-0.048
ating-0.048
ement-0.047
ined-0.047
Token upon
Token ID2402
Feature activation-6.80e-02

Pos logit contributions
 a+0.002
 two+0.002
 Ben+0.002
rip+0.002
 sne+0.002
Neg logit contributions
 possessions-0.003
izing-0.003
-0.003
 possession-0.003
ths-0.003
Token a
Token ID257
Feature activation-1.94e-02

Pos logit contributions
 Theo+0.044
 blah+0.041
 GO+0.041
rip+0.040
 WANT+0.035
Neg logit contributions
 ornament-0.143
 mattered-0.140
 possession-0.139
 bicycle-0.138
 bracelet-0.137
Token time
Token ID640
Feature activation+2.83e-04

Pos logit contributions
 two+0.013
 a+0.012
 GO+0.011
 Theo+0.011
rip+0.011
Neg logit contributions
 possessions-0.041
 possession-0.041
-0.039
 Acc-0.039
 mattered-0.039
Token,
Token ID11
Feature activation-2.88e-03

Pos logit contributions
 possessions+0.001
 possession+0.001
izing+0.001
ths+0.000
 angel+0.000
Neg logit contributions
 a-0.000
 two-0.000
 Ben-0.000
 Tom-0.000
rr-0.000
Token there
Token ID612
Feature activation+1.60e-02

Pos logit contributions
 a+0.003
 two+0.003
 Ben+0.003
 blast+0.002
 sne+0.002
Neg logit contributions
 possessions-0.006
 possession-0.005
izing-0.005
-0.005
 angel-0.005
Token was
Token ID373
Feature activation-3.66e-02

Pos logit contributions
 possessions+0.036
izing+0.034
 possession+0.033
osity+0.033
 angel+0.033
Neg logit contributions
 a-0.018
 two-0.014
 Ben-0.014
 sne-0.012
 Tom-0.011
Token really
Token ID1107
Feature activation+1.35e-02

Pos logit contributions
 possessions+0.034
 possession+0.032
izing+0.031
 mattered+0.030
+0.030
Neg logit contributions
rip-0.024
 two-0.024
 a-0.023
 roast-0.022
 Ben-0.022
Token strong
Token ID1913
Feature activation-5.39e-03

Pos logit contributions
 possessions+0.022
 possession+0.021
izing+0.021
 Acc+0.020
+0.020
Neg logit contributions
 a-0.017
 two-0.016
 blast-0.015
 Ben-0.015
 sne-0.015
Token!"
Token ID2474
Feature activation-1.35e-02

Pos logit contributions
 roast+0.006
 blast+0.006
 roasted+0.006
rip+0.006
 a+0.006
Neg logit contributions
 possessions-0.009
 possession-0.009
izing-0.009
-0.009
 angel-0.009
Token<|endoftext|>
Token ID50256
Feature activation-4.27e-02

Pos logit contributions
ata+0.015
rip+0.015
 blast+0.015
rr+0.015
 explode+0.014
Neg logit contributions
ement-0.021
-0.021
 Acc-0.020
 she-0.020
 possession-0.020
TokenOnce
Token ID7454
Feature activation-5.74e-04

Pos logit contributions
 roast+0.063
 boil+0.060
 sne+0.060
 roasted+0.060
 hook+0.060
Neg logit contributions
-0.056
ement-0.053
 mattered-0.051
ating-0.051
ined-0.049
Token upon
Token ID2402
Feature activation-6.79e-02

Pos logit contributions
 a+0.001
 two+0.001
 Ben+0.001
 sne+0.001
rip+0.001
Neg logit contributions
 possessions-0.001
izing-0.001
 possession-0.001
-0.001
 angel-0.001
Token a
Token ID257
Feature activation-2.34e-02

Pos logit contributions
rip+0.069
 blah+0.065
OOOO+0.063
rr+0.061
 Theo+0.060
Neg logit contributions
 she-0.118
 ornament-0.112
 mattered-0.111
Her-0.111
 bicycle-0.110
Token time
Token ID640
Feature activation+3.15e-03

Pos logit contributions
 two+0.018
 Theo+0.017
rip+0.017
 a+0.017
 roast+0.016
Neg logit contributions
 possession-0.045
 possessions-0.045
-0.045
ement-0.044
 Acc-0.044
Token,
Token ID11
Feature activation-2.38e-03

Pos logit contributions
 possessions+0.007
 possession+0.006
izing+0.006
 angel+0.006
 ornament+0.006
Neg logit contributions
 a-0.004
 two-0.004
 Ben-0.004
 Tom-0.003
 sne-0.003
Token there
Token ID612
Feature activation+1.50e-02

Pos logit contributions
 a+0.003
 two+0.002
 Ben+0.002
 blast+0.002
 sne+0.002
Neg logit contributions
 possessions-0.005
 possession-0.004
izing-0.004
-0.004
 Acc-0.004
Token was
Token ID373
Feature activation-3.48e-02

Pos logit contributions
 possessions+0.028
 possession+0.026
izing+0.025
 angel+0.025
+0.025
Neg logit contributions
 a-0.019
 two-0.018
 Ben-0.016
 sne-0.016
 gas-0.015
Token and
Token ID290
Feature activation-1.44e-02

Pos logit contributions
rip+0.034
 two+0.032
 GO+0.031
ata+0.031
 roasted+0.031
Neg logit contributions
 possessions-0.043
 possession-0.043
 angel-0.040
 Acc-0.039
bing-0.039
Token play
Token ID711
Feature activation-1.78e-02

Pos logit contributions
 a+0.016
 two+0.016
 Ben+0.014
 blast+0.014
 sne+0.014
Neg logit contributions
 possessions-0.026
 possession-0.025
izing-0.024
-0.024
 Acc-0.024
Token.
Token ID13
Feature activation-2.43e-02

Pos logit contributions
 two+0.023
 a+0.022
rip+0.022
 roasted+0.021
 roast+0.021
Neg logit contributions
 possessions-0.027
 possession-0.027
 angel-0.026
izing-0.026
 ornament-0.025
Token<|endoftext|>
Token ID50256
Feature activation-3.86e-02

Pos logit contributions
 a+0.028
 two+0.027
 sne+0.027
 hook+0.026
rip+0.026
Neg logit contributions
 possessions-0.039
 Acc-0.039
 possession-0.039
-0.039
 angel-0.037
TokenOnce
Token ID7454
Feature activation-2.70e-04

Pos logit contributions
 roast+0.056
 sne+0.055
 hook+0.055
 boil+0.055
 padd+0.054
Neg logit contributions
-0.050
 mattered-0.047
ement-0.047
ined-0.045
ating-0.044
Token upon
Token ID2402
Feature activation-6.79e-02

Pos logit contributions
 a+0.000
 two+0.000
 Ben+0.000
 Tom+0.000
 sne+0.000
Neg logit contributions
 possessions-0.001
izing-0.001
-0.001
bing-0.001
ths-0.001
Token a
Token ID257
Feature activation-2.31e-02

Pos logit contributions
rip+0.061
 Theo+0.058
 blah+0.058
OOOO+0.054
rr+0.052
Neg logit contributions
 ornament-0.127
 she-0.125
 bicycle-0.123
 mattered-0.123
Her-0.122
Token time
Token ID640
Feature activation+2.51e-03

Pos logit contributions
 two+0.015
 a+0.014
rip+0.013
 Theo+0.012
 GO+0.012
Neg logit contributions
 possessions-0.050
 possession-0.049
-0.048
 Acc-0.048
ement-0.047
Token there
Token ID612
Feature activation+1.21e-02

Pos logit contributions
 possessions+0.005
 possession+0.005
izing+0.005
 angel+0.004
ths+0.004
Neg logit contributions
 a-0.003
 two-0.003
 Ben-0.003
 Tom-0.002
 blast-0.002
Token was
Token ID373
Feature activation-4.07e-02

Pos logit contributions
 possessions+0.023
 possession+0.021
izing+0.021
 angel+0.021
+0.021
Neg logit contributions
 a-0.015
 two-0.014
 Ben-0.013
 sne-0.012
 gas-0.012
Token a
Token ID257
Feature activation-2.65e-02

Pos logit contributions
rip+0.026
 blah+0.025
 GO+0.021
rr+0.021
 Theo+0.021
Neg logit contributions
 ornament-0.088
 possession-0.086
 mattered-0.086
 angel-0.083
 bracelet-0.083
Token radio
Token ID5243
Feature activation+1.38e-02

Pos logit contributions
 possessions+0.001
 possession+0.001
+0.001
izing+0.001
 Acc+0.001
Neg logit contributions
 a-0.001
 two-0.001
 Ben-0.001
 sne-0.001
 blast-0.001
Token at
Token ID379
Feature activation+4.73e-03

Pos logit contributions
 possessions+0.023
 possession+0.022
 angel+0.021
 bicycle+0.021
 ornament+0.021
Neg logit contributions
 a-0.017
 two-0.017
 blast-0.016
rip-0.015
 roast-0.015
Token home
Token ID1363
Feature activation-3.16e-03

Pos logit contributions
 possession+0.007
 angel+0.007
 possessions+0.007
 ornament+0.006
 bicycle+0.006
Neg logit contributions
 GO-0.006
 roasted-0.006
 roast-0.006
 blah-0.006
ata-0.006
Token.
Token ID13
Feature activation+9.47e-03

Pos logit contributions
 a+0.003
 two+0.003
 blast+0.003
 Ben+0.003
 sne+0.003
Neg logit contributions
 possessions-0.006
 possession-0.006
izing-0.006
 angel-0.006
-0.006
Token She
Token ID1375
Feature activation-1.68e-04

Pos logit contributions
 possessions+0.011
 possession+0.011
+0.010
izing+0.010
 Acc+0.010
Neg logit contributions
 a-0.017
 two-0.016
 Ben-0.015
 blast-0.015
 sne-0.015
Token looked
Token ID3114
Feature activation+7.18e-02

Pos logit contributions
umbs+0.000
 GO+0.000
rr+0.000
 blast+0.000
rip+0.000
Neg logit contributions
 possessions-0.000
 angel-0.000
 mattered-0.000
 possession-0.000
 ornament-0.000
Token out
Token ID503
Feature activation+5.52e-02

Pos logit contributions
+0.144
 possessions+0.141
ths+0.137
izing+0.137
 possession+0.136
Neg logit contributions
 a-0.092
 two-0.075
 Ben-0.070
 sne-0.065
 late-0.060
Token the
Token ID262
Feature activation+2.19e-02

Pos logit contributions
 possessions+0.123
+0.119
 possession+0.118
 Acc+0.117
izing+0.115
Neg logit contributions
 a-0.051
 two-0.044
 Ben-0.038
 sne-0.036
 blast-0.034
Token window
Token ID4324
Feature activation+7.43e-03

Pos logit contributions
 possessions+0.034
 possession+0.033
 angel+0.032
 bicycle+0.031
 ornament+0.031
Neg logit contributions
 a-0.030
 two-0.028
rip-0.027
 Ben-0.027
 blast-0.027
Token.
Token ID13
Feature activation+9.83e-03

Pos logit contributions
 possessions+0.014
 possession+0.013
izing+0.013
 angel+0.013
ement+0.013
Neg logit contributions
 a-0.008
 two-0.007
 blast-0.007
rip-0.007
 roast-0.007
Token It
Token ID632
Feature activation+2.18e-02

Pos logit contributions
 possessions+0.011
 possession+0.011
+0.010
 Acc+0.010
izing+0.010
Neg logit contributions
 a-0.017
 two-0.017
 blast-0.016
 sne-0.016
 roast-0.016
Token to
Token ID284
Feature activation-1.01e-02

Pos logit contributions
 Acc+0.022
+0.022
ths+0.021
 disl+0.021
izing+0.020
Neg logit contributions
 a-0.020
 two-0.019
 gas-0.017
 sne-0.017
 four-0.016
Token brand
Token ID4508
Feature activation+2.00e-02

Pos logit contributions
umbs+0.010
rip+0.010
ata+0.010
 blah+0.010
 GO+0.010
Neg logit contributions
 possessions-0.018
 possession-0.018
 angel-0.017
 mattered-0.017
 Acc-0.016
Token new
Token ID649
Feature activation+9.53e-03

Pos logit contributions
+0.040
flower+0.036
Ell+0.035
ishly+0.035
bing+0.034
Neg logit contributions
 a-0.024
 Ben-0.021
 two-0.020
 gas-0.020
 water-0.019
Token!
Token ID0
Feature activation-4.83e-03

Pos logit contributions
+0.015
 Acc+0.015
ths+0.014
izing+0.014
osity+0.014
Neg logit contributions
 a-0.015
 two-0.014
 sne-0.012
 gas-0.012
 four-0.012
Token They
Token ID1119
Feature activation+7.20e-03

Pos logit contributions
 sne+0.007
 hook+0.006
 br+0.006
 GO+0.006
 boil+0.006
Neg logit contributions
 Acc-0.007
 possession-0.007
-0.007
 mattered-0.006
 Ellie-0.006
Token looked
Token ID3114
Feature activation+6.68e-02

Pos logit contributions
 possessions+0.011
 possession+0.011
+0.010
 angel+0.010
izing+0.010
Neg logit contributions
 a-0.010
 two-0.009
 Ben-0.009
 blast-0.009
 sne-0.008
Token around
Token ID1088
Feature activation+4.67e-02

Pos logit contributions
 possessions+0.128
+0.127
 Acc+0.122
 possession+0.121
 disl+0.121
Neg logit contributions
 a-0.087
 two-0.073
 four-0.063
 Ben-0.062
 blast-0.060
Token in
Token ID287
Feature activation+2.37e-02

Pos logit contributions
 possessions+0.094
 possession+0.089
+0.087
 Acc+0.087
izing+0.086
Neg logit contributions
 a-0.048
 two-0.044
 Ben-0.039
 sne-0.037
 blast-0.037
Token amaz
Token ID40642
Feature activation+5.25e-02

Pos logit contributions
 possessions+0.039
 possession+0.037
+0.036
izing+0.036
 Acc+0.035
Neg logit contributions
 a-0.032
 two-0.030
 Ben-0.028
 blast-0.027
 sne-0.027
Tokenement
Token ID972
Feature activation+4.30e-02

Pos logit contributions
 possessions+0.038
+0.036
 Acc+0.035
 bicycle+0.033
 mattered+0.033
Neg logit contributions
 a-0.123
 two-0.117
 four-0.110
 Ben-0.109
 gas-0.105
Token and
Token ID290
Feature activation+1.86e-02

Pos logit contributions
 possessions+0.073
 possession+0.070
izing+0.067
+0.067
 angel+0.067
Neg logit contributions
 a-0.053
 two-0.051
 blast-0.047
 Ben-0.046
 sne-0.046
Token.
Token ID13
Feature activation+1.65e-03

Pos logit contributions
 a+0.008
 two+0.008
 Ben+0.007
 sne+0.007
 blast+0.007
Neg logit contributions
 possessions-0.010
 possession-0.010
 Acc-0.009
-0.009
izing-0.009
Token She
Token ID1375
Feature activation+2.74e-03

Pos logit contributions
 possessions+0.002
 possession+0.002
izing+0.002
 angel+0.002
+0.002
Neg logit contributions
 a-0.002
 two-0.002
 Ben-0.002
 blast-0.002
rip-0.002
Token made
Token ID925
Feature activation-1.25e-02

Pos logit contributions
 possessions+0.004
 possession+0.004
izing+0.004
+0.004
 Acc+0.004
Neg logit contributions
 a-0.004
 two-0.004
 sne-0.003
 Ben-0.003
 blast-0.003
Token sure
Token ID1654
Feature activation+2.13e-03

Pos logit contributions
 roast+0.013
rip+0.013
rr+0.013
 blah+0.013
 roasted+0.012
Neg logit contributions
 possessions-0.022
 possession-0.022
 angel-0.021
 ornament-0.020
izing-0.020
Token to
Token ID284
Feature activation+4.45e-03

Pos logit contributions
 possessions+0.004
+0.004
 possession+0.004
izing+0.004
 Acc+0.004
Neg logit contributions
 a-0.002
 two-0.002
 Ben-0.002
 sne-0.002
 blast-0.002
Token look
Token ID804
Feature activation+5.65e-02

Pos logit contributions
 possessions+0.007
 possession+0.007
 angel+0.007
 ornament+0.007
 bicycle+0.007
Neg logit contributions
 a-0.005
 two-0.005
ata-0.005
umbs-0.005
rip-0.005
Token up
Token ID510
Feature activation+3.18e-02

Pos logit contributions
 possessions+0.096
 possession+0.092
+0.091
 Acc+0.090
izing+0.088
Neg logit contributions
 a-0.074
 two-0.069
 Ben-0.062
 sne-0.060
 blast-0.059
Token at
Token ID379
Feature activation+4.13e-03

Pos logit contributions
 possessions+0.061
 possession+0.059
 Acc+0.058
+0.057
 ornament+0.057
Neg logit contributions
 a-0.037
 two-0.034
 Ben-0.029
 sne-0.027
 four-0.027
Token the
Token ID262
Feature activation+1.43e-02

Pos logit contributions
 possessions+0.009
+0.009
 Acc+0.008
 possession+0.008
izing+0.008
Neg logit contributions
 a-0.004
 two-0.003
 Ben-0.003
 four-0.003
 gas-0.002
Token stars
Token ID5788
Feature activation+1.42e-02

Pos logit contributions
 possessions+0.020
 possession+0.019
+0.019
 Acc+0.019
izing+0.018
Neg logit contributions
 a-0.023
 two-0.023
 blast-0.019
 sne-0.019
 four-0.019
Token each
Token ID1123
Feature activation+2.29e-02

Pos logit contributions
 possessions+0.024
 possession+0.023
izing+0.022
+0.022
 angel+0.022
Neg logit contributions
 a-0.018
 two-0.017
 Ben-0.016
 blast-0.016
 sne-0.016
Token them
Token ID606
Feature activation+9.63e-03

Pos logit contributions
 possessions+0.006
 possession+0.006
izing+0.006
 angel+0.005
+0.005
Neg logit contributions
 a-0.004
 two-0.004
 blast-0.004
 Ben-0.004
 sne-0.004
Token.
Token ID13
Feature activation+9.31e-03

Pos logit contributions
 possessions+0.015
 angel+0.015
 mattered+0.014
ement+0.014
 possession+0.014
Neg logit contributions
 blast-0.012
 roast-0.011
 GO-0.011
 roasted-0.011
Ar-0.011
Token\n
Token ID198
Feature activation+1.02e-02

Pos logit contributions
 possessions+0.020
 possession+0.019
 Acc+0.019
+0.019
 angel+0.018
Neg logit contributions
 a-0.007
 two-0.007
 sne-0.006
 blast-0.006
rip-0.006
Token\n
Token ID198
Feature activation+2.33e-02

Pos logit contributions
 possessions+0.021
 possession+0.020
 Acc+0.020
+0.020
 angel+0.020
Neg logit contributions
 a-0.008
 two-0.008
 sne-0.007
 blast-0.007
rip-0.007
TokenMom
Token ID29252
Feature activation-1.07e-02

Pos logit contributions
 possessions+0.039
 possession+0.038
 angel+0.036
izing+0.036
 Acc+0.036
Neg logit contributions
 a-0.030
 two-0.029
 Ben-0.026
 blast-0.026
 sne-0.025
Token looks
Token ID3073
Feature activation+6.36e-02

Pos logit contributions
 a+0.016
 roasted+0.016
 bait+0.016
 roast+0.016
rip+0.016
Neg logit contributions
ement-0.013
 angel-0.013
 possessions-0.012
 Acc-0.012
ths-0.012
Token at
Token ID379
Feature activation-6.59e-03

Pos logit contributions
 possessions+0.111
 possession+0.107
+0.104
izing+0.104
 Acc+0.102
Neg logit contributions
 a-0.080
 two-0.075
 Ben-0.069
 sne-0.067
 blast-0.065
Token them
Token ID606
Feature activation+8.97e-03

Pos logit contributions
ata+0.008
 blah+0.008
rip+0.008
 GO+0.008
 roast+0.008
Neg logit contributions
 possessions-0.010
 possession-0.010
 angel-0.010
 ornament-0.009
 Acc-0.009
Token and
Token ID290
Feature activation-7.30e-03

Pos logit contributions
 possessions+0.015
 angel+0.015
izing+0.015
ement+0.015
 possession+0.015
Neg logit contributions
 blast-0.010
 a-0.009
rip-0.009
 two-0.009
 roasted-0.009
Token smiles
Token ID21845
Feature activation+5.10e-03

Pos logit contributions
 a+0.011
 two+0.010
 Ben+0.010
 sne+0.009
 gas+0.009
Neg logit contributions
 possessions-0.013
 possession-0.012
izing-0.012
 mattered-0.011
-0.011
Token too
Token ID1165
Feature activation-7.17e-04

Pos logit contributions
 possessions+0.010
 possession+0.009
 angel+0.009
+0.009
izing+0.009
Neg logit contributions
 a-0.005
 two-0.005
 blast-0.005
 roast-0.005
 roasted-0.004
Token original
Token ID2656
Feature activation+4.92e-03

Pos logit contributions
 possessions+0.031
 possession+0.030
 angel+0.029
 bicycle+0.028
 ornament+0.028
Neg logit contributions
 a-0.042
 two-0.042
 Ben-0.040
rip-0.040
ata-0.039
Token idea
Token ID2126
Feature activation+9.67e-03

Pos logit contributions
 possessions+0.008
 possession+0.008
+0.007
izing+0.007
 Acc+0.007
Neg logit contributions
 a-0.006
 two-0.006
 Ben-0.006
 blast-0.006
 sne-0.006
Token.
Token ID13
Feature activation+5.05e-04

Pos logit contributions
 possessions+0.019
 possession+0.018
izing+0.018
+0.018
 Acc+0.018
Neg logit contributions
 a-0.009
 two-0.009
 Ben-0.008
 blast-0.008
 sne-0.008
Token They
Token ID1119
Feature activation+6.72e-03

Pos logit contributions
+0.001
 Ellie+0.001
 Acc+0.001
 bicycle+0.001
 Ellen+0.001
Neg logit contributions
 sne-0.001
ata-0.001
rip-0.001
 padd-0.001
 blast-0.001
Token looked
Token ID3114
Feature activation+6.75e-02

Pos logit contributions
 possessions+0.008
 mattered+0.008
 bicycle+0.008
 possession+0.008
+0.008
Neg logit contributions
 a-0.010
 two-0.010
umbs-0.010
 roast-0.010
 blah-0.010
Token very
Token ID845
Feature activation+6.08e-02

Pos logit contributions
 possessions+0.114
 possession+0.109
+0.105
 Acc+0.105
izing+0.105
Neg logit contributions
 a-0.088
 two-0.083
 Ben-0.075
 blast-0.075
 sne-0.074
Token different
Token ID1180
Feature activation+3.39e-02

Pos logit contributions
 possessions+0.104
 possession+0.103
+0.102
 ornament+0.097
 Acc+0.096
Neg logit contributions
 a-0.081
 two-0.075
 Ben-0.069
 sne-0.067
 four-0.065
Token from
Token ID422
Feature activation+2.74e-02

Pos logit contributions
 possessions+0.061
 possession+0.059
+0.057
 Acc+0.056
izing+0.056
Neg logit contributions
 a-0.039
 two-0.037
 Ben-0.033
 sne-0.033
 blast-0.033
Token each
Token ID1123
Feature activation+3.65e-02

Pos logit contributions
 possessions+0.057
 possession+0.055
+0.054
izing+0.054
 Acc+0.054
Neg logit contributions
 a-0.024
 two-0.022
 Ben-0.019
 blast-0.019
 sne-0.018
Token other
Token ID584
Feature activation+2.15e-02

Pos logit contributions
 possessions+0.064
 possession+0.061
 angel+0.061
+0.060
 bicycle+0.060
Neg logit contributions
 roasted-0.037
 roast-0.037
 a-0.036
 boil-0.035
 sne-0.034
Token,
Token ID11
Feature activation+9.04e-03

Pos logit contributions
 possessions+0.035
izing+0.034
 possession+0.033
 angel+0.033
bing+0.032
Neg logit contributions
 roast-0.025
 blast-0.025
 a-0.024
 roasted-0.024
 two-0.024
Token,
Token ID11
Feature activation-6.38e-03

Pos logit contributions
 possessions+0.007
 possession+0.007
+0.007
 Acc+0.007
izing+0.006
Neg logit contributions
 a-0.005
 two-0.005
 Ben-0.004
 sne-0.004
 blast-0.004
Token Lily
Token ID20037
Feature activation-2.01e-02

Pos logit contributions
 a+0.007
rip+0.006
 roasted+0.006
ata+0.006
 two+0.006
Neg logit contributions
 angel-0.011
 possessions-0.011
 possession-0.011
 ornament-0.011
izing-0.010
Token and
Token ID290
Feature activation-1.87e-02

Pos logit contributions
 blast+0.025
 a+0.025
 two+0.025
 GO+0.025
 roast+0.025
Neg logit contributions
 possessions-0.030
 possession-0.029
 angel-0.029
 ornament-0.028
-0.028
Token Ben
Token ID3932
Feature activation-1.34e-02

Pos logit contributions
 a+0.007
 two+0.006
 blast+0.004
 roast+0.004
 sne+0.004
Neg logit contributions
 possessions-0.048
 possession-0.046
 angel-0.045
 Acc-0.045
-0.045
Token?
Token ID30
Feature activation+1.20e-02

Pos logit contributions
 blast+0.021
 GO+0.021
umbs+0.020
 roast+0.020
 Theo+0.020
Neg logit contributions
ie-0.015
 angel-0.015
 possessions-0.015
 mattered-0.014
-0.014
Token Why
Token ID4162
Feature activation+2.65e-02

Pos logit contributions
 Acc+0.015
+0.015
 possessions+0.014
 possession+0.013
 mattered+0.013
Neg logit contributions
 a-0.018
 two-0.018
rip-0.018
 blast-0.018
ata-0.017
Token are
Token ID389
Feature activation-7.81e-03

Pos logit contributions
 possessions+0.050
 possession+0.047
 Acc+0.046
+0.046
 mattered+0.046
Neg logit contributions
 a-0.027
 two-0.026
 blast-0.024
 roast-0.023
 sne-0.023
Token you
Token ID345
Feature activation+5.90e-04

Pos logit contributions
ata+0.010
umbs+0.010
rip+0.009
 blast+0.009
 roasted+0.009
Neg logit contributions
 possessions-0.012
nam-0.012
ths-0.011
 mattered-0.011
izing-0.011
Token fighting
Token ID4330
Feature activation+9.93e-03

Pos logit contributions
bing+0.001
 angel+0.001
 Acc+0.001
 ornament+0.001
izing+0.001
Neg logit contributions
Ar-0.001
 blast-0.001
umbs-0.001
ata-0.001
 meat-0.001
Token and
Token ID290
Feature activation-2.76e-03

Pos logit contributions
 possessions+0.018
 possession+0.017
+0.017
izing+0.017
 Acc+0.017
Neg logit contributions
 a-0.012
 two-0.011
 Ben-0.010
 sne-0.010
 blast-0.009
Token pin
Token ID6757
Feature activation+6.51e-03

Pos logit contributions
 a+0.004
 two+0.003
 Ben+0.003
 sne+0.003
 blast+0.003
Neg logit contributions
 possessions-0.005
 possession-0.004
-0.004
ement-0.004
izing-0.004
Token was
Token ID373
Feature activation-1.01e-03

Pos logit contributions
 possessions+0.046
 Acc+0.044
osity+0.043
izing+0.043
+0.043
Neg logit contributions
 a-0.032
 two-0.029
 Ben-0.026
 gas-0.025
 sne-0.025
Token inside
Token ID2641
Feature activation+2.01e-02

Pos logit contributions
 a+0.001
 two+0.001
 Ben+0.001
 blast+0.001
 sne+0.001
Neg logit contributions
 possessions-0.001
 possession-0.001
izing-0.001
 angel-0.001
-0.001
Token.
Token ID13
Feature activation+3.29e-03

Pos logit contributions
 possessions+0.038
 possession+0.037
izing+0.036
+0.035
 angel+0.035
Neg logit contributions
 a-0.020
 two-0.019
 blast-0.018
 sne-0.017
rip-0.017
Token\n
Token ID198
Feature activation-3.16e-03

Pos logit contributions
+0.006
 possessions+0.005
 possession+0.005
 Acc+0.005
 mattered+0.005
Neg logit contributions
 roast-0.003
 blast-0.003
 roasted-0.003
rip-0.003
 sne-0.003
Token\n
Token ID198
Feature activation+1.44e-02

Pos logit contributions
 a+0.003
 sne+0.003
 roast+0.003
 two+0.003
 roasted+0.003
Neg logit contributions
 possessions-0.006
-0.006
 possession-0.006
 mattered-0.006
izing-0.005
Token"
Token ID1
Feature activation+4.64e-02

Pos logit contributions
 possessions+0.020
+0.019
 possession+0.019
izing+0.019
 mattered+0.018
Neg logit contributions
 a-0.022
 two-0.021
 sne-0.020
 blast-0.020
 roast-0.020
TokenLet
Token ID5756
Feature activation+7.75e-03

Pos logit contributions
 possessions+0.077
 possession+0.074
izing+0.071
 Acc+0.070
 angel+0.070
Neg logit contributions
 a-0.060
 two-0.058
 Ben-0.053
 blast-0.053
 sne-0.052
Token's
Token ID338
Feature activation+3.91e-03

Pos logit contributions
 possessions+0.015
+0.014
 possession+0.014
 angel+0.014
izing+0.014
Neg logit contributions
 a-0.007
 two-0.007
 roast-0.007
 blast-0.007
 roasted-0.007
Token go
Token ID467
Feature activation+5.68e-03

Pos logit contributions
 bicycle+0.007
 ornament+0.006
 angel+0.006
+0.006
 possessions+0.006
Neg logit contributions
ata-0.004
umbs-0.004
rip-0.004
 blah-0.004
 roasted-0.004
Token to
Token ID284
Feature activation+7.96e-03

Pos logit contributions
 possessions+0.008
 possession+0.008
 bicycle+0.008
bing+0.007
 angel+0.007
Neg logit contributions
 blah-0.007
 Ben-0.007
 roasted-0.007
 sne-0.007
 two-0.007
Token the
Token ID262
Feature activation+1.05e-02

Pos logit contributions
 bicycle+0.011
 ornament+0.010
 possession+0.010
 angel+0.010
 Ellen+0.009
Neg logit contributions
umbs-0.013
ata-0.012
rip-0.011
rr-0.011
OOOO-0.011
Token for
Token ID329
Feature activation-1.24e-03

Pos logit contributions
 possessions+0.060
 possession+0.058
+0.056
izing+0.056
 Acc+0.056
Neg logit contributions
 a-0.029
 two-0.028
 Ben-0.024
 sne-0.024
 blast-0.024
Token a
Token ID257
Feature activation+1.04e-03

Pos logit contributions
 a+0.001
 two+0.001
 Ben+0.001
 blast+0.001
 sne+0.001
Neg logit contributions
 possessions-0.002
 possession-0.002
izing-0.002
-0.002
 angel-0.002
Token better
Token ID1365
Feature activation+1.44e-02

Pos logit contributions
 possessions+0.002
 possession+0.002
izing+0.001
+0.001
 Acc+0.001
Neg logit contributions
 a-0.001
 two-0.001
 blast-0.001
 sne-0.001
 Ben-0.001
Token look
Token ID804
Feature activation+3.82e-02

Pos logit contributions
 possessions+0.028
izing+0.026
 Acc+0.026
+0.026
 possession+0.026
Neg logit contributions
 a-0.016
 two-0.015
 blast-0.013
 sne-0.013
 Ben-0.012
Token.
Token ID13
Feature activation+9.06e-03

Pos logit contributions
 possessions+0.084
 Acc+0.083
 possession+0.083
+0.081
 bicycle+0.080
Neg logit contributions
 a-0.030
 two-0.028
 Ben-0.021
 sne-0.021
 gas-0.020
Token
Token ID220
Feature activation+1.83e-02

Pos logit contributions
 possessions+0.016
 possession+0.015
+0.015
 Acc+0.015
izing+0.015
Neg logit contributions
 a-0.010
 two-0.010
 blast-0.009
 sne-0.009
 roast-0.009
Token\n
Token ID198
Feature activation+8.07e-03

Pos logit contributions
 possessions+0.041
 possession+0.039
+0.038
 Acc+0.038
izing+0.038
Neg logit contributions
 a-0.013
 two-0.012
 blast-0.010
 sne-0.010
 roast-0.010
Token\n
Token ID198
Feature activation+2.12e-02

Pos logit contributions
 possessions+0.015
+0.015
 possession+0.015
 Acc+0.015
izing+0.015
Neg logit contributions
 a-0.007
 blast-0.007
 sne-0.007
 roast-0.006
 two-0.006
TokenM
Token ID44
Feature activation+6.62e-03

Pos logit contributions
 possessions+0.036
 possession+0.035
+0.033
izing+0.033
 Acc+0.033
Neg logit contributions
 a-0.026
 two-0.025
 blast-0.023
 Ben-0.023
 sne-0.023
Tokenandy
Token ID10757
Feature activation+2.71e-02

Pos logit contributions
 possessions+0.007
 possession+0.007
izing+0.006
+0.006
 Acc+0.006
Neg logit contributions
 a-0.012
 two-0.012
 blast-0.011
 Ben-0.011
 sne-0.011
Token asked
Token ID1965
Feature activation+2.01e-02

Pos logit contributions
 possessions+0.044
+0.041
 possession+0.041
ths+0.041
izing+0.040
Neg logit contributions
 a-0.033
 two-0.032
 blast-0.031
rip-0.031
 roast-0.031
Token girl
Token ID2576
Feature activation-2.55e-02

Pos logit contributions
 a+0.023
 two+0.022
 Ben+0.020
 sne+0.020
 blast+0.020
Neg logit contributions
 possessions-0.017
 possession-0.016
izing-0.015
-0.015
 Acc-0.015
Token to
Token ID284
Feature activation+3.27e-03

Pos logit contributions
 two+0.025
 a+0.025
 blast+0.024
rip+0.024
 roast+0.024
Neg logit contributions
 possessions-0.048
 possession-0.046
 mattered-0.044
 angel-0.043
 bicycle-0.043
Token march
Token ID9960
Feature activation-2.01e-03

Pos logit contributions
 possessions+0.005
 bicycle+0.005
 possession+0.004
+0.004
 mattered+0.004
Neg logit contributions
umbs-0.004
 meat-0.004
ata-0.004
 roasted-0.004
 Theo-0.004
Token.
Token ID13
Feature activation-1.43e-03

Pos logit contributions
 a+0.002
 two+0.002
 Ben+0.002
 sne+0.002
 blast+0.002
Neg logit contributions
 possessions-0.004
 possession-0.003
-0.003
 Acc-0.003
izing-0.003
Token\n
Token ID198
Feature activation-4.91e-03

Pos logit contributions
 a+0.002
 two+0.002
 sne+0.001
 blast+0.001
 roast+0.001
Neg logit contributions
 possessions-0.002
-0.002
 possession-0.002
 Acc-0.002
izing-0.002
Token\n
Token ID198
Feature activation+1.52e-02

Pos logit contributions
 GO+0.006
 sne+0.006
 two+0.006
 roast+0.005
 roasted+0.005
Neg logit contributions
 possession-0.007
Ell-0.007
 possessions-0.007
ement-0.007
 mattered-0.007
TokenThe
Token ID464
Feature activation-7.23e-04

Pos logit contributions
izing+0.022
+0.021
 possessions+0.021
ths+0.020
Ell+0.020
Neg logit contributions
 a-0.021
 two-0.020
 sne-0.020
 Ben-0.020
 roast-0.020
Token little
Token ID1310
Feature activation-2.96e-03

Pos logit contributions
 a+0.001
 two+0.001
 Ben+0.001
rip+0.001
 blast+0.000
Neg logit contributions
 possessions-0.002
 possession-0.001
 angel-0.001
-0.001
 Acc-0.001
Token girl
Token ID2576
Feature activation+1.45e-02

Pos logit contributions
 a+0.004
 two+0.004
 Ben+0.004
 sne+0.004
 four+0.004
Neg logit contributions
 possessions-0.005
 possession-0.005
izing-0.004
 Acc-0.004
 disl-0.004
Token was
Token ID373
Feature activation+9.58e-03

Pos logit contributions
 possessions+0.021
 possession+0.020
+0.019
 mattered+0.019
ths+0.018
Neg logit contributions
 a-0.020
umbs-0.019
 two-0.019
 roast-0.019
rip-0.019
Token very
Token ID845
Feature activation+2.90e-02

Pos logit contributions
 possessions+0.015
 possession+0.014
 angel+0.014
 mattered+0.014
 Acc+0.014
Neg logit contributions
rip-0.011
 roast-0.011
umbs-0.011
 two-0.011
 meat-0.011
Token again
Token ID757
Feature activation+9.15e-03

Pos logit contributions
 blast+0.006
 two+0.006
 roasted+0.006
 roast+0.006
 a+0.006
Neg logit contributions
 possessions-0.011
 possession-0.010
 angel-0.010
 Acc-0.010
 bicycle-0.010
Token.
Token ID13
Feature activation-1.01e-02

Pos logit contributions
 possessions+0.013
 possession+0.013
bing+0.012
izing+0.012
+0.012
Neg logit contributions
umbs-0.013
 blast-0.013
 GO-0.013
rip-0.012
 roasted-0.012
Token\n
Token ID198
Feature activation-2.58e-03

Pos logit contributions
 blast+0.011
 roasted+0.011
ata+0.011
 sne+0.011
 roast+0.011
Neg logit contributions
 Acc-0.016
 possession-0.015
-0.015
 possessions-0.015
Really-0.015
Token\n
Token ID198
Feature activation+9.28e-03

Pos logit contributions
 sne+0.002
 a+0.002
 roasted+0.002
 two+0.002
 blast+0.002
Neg logit contributions
 possessions-0.005
 Acc-0.004
 possession-0.004
-0.004
 disl-0.004
Token"
Token ID1
Feature activation+2.29e-02

Pos logit contributions
 possessions+0.013
 possession+0.012
+0.012
izing+0.012
 Acc+0.012
Neg logit contributions
 a-0.014
 two-0.013
 sne-0.013
 blast-0.013
 roast-0.013
TokenHere
Token ID4342
Feature activation+2.50e-02

Pos logit contributions
+0.025
 possessions+0.025
Really+0.024
izing+0.024
 angel+0.023
Neg logit contributions
 a-0.039
 roast-0.037
 blast-0.037
rip-0.037
 two-0.037
Token you
Token ID345
Feature activation-1.87e-02

Pos logit contributions
 possessions+0.046
 possession+0.045
 Acc+0.043
izing+0.042
+0.042
Neg logit contributions
 a-0.030
 two-0.028
 Ben-0.026
 sne-0.024
 gas-0.024
Token go
Token ID467
Feature activation-2.46e-02

Pos logit contributions
 roasted+0.023
ata+0.021
hes+0.021
 roast+0.021
Ar+0.021
Neg logit contributions
ie-0.029
 possessions-0.028
 bicycle-0.027
 angel-0.027
 Acc-0.027
Token,
Token ID11
Feature activation-2.04e-02

Pos logit contributions
 roasted+0.049
 roast+0.047
rip+0.047
 boil+0.046
 two+0.046
Neg logit contributions
 possessions-0.021
 possession-0.020
 bicycle-0.019
izing-0.019
-0.019
Token kids
Token ID3988
Feature activation-1.94e-02

Pos logit contributions
 roasted+0.028
ata+0.025
umbs+0.025
rip+0.025
 roast+0.025
Neg logit contributions
 angel-0.034
 ornament-0.030
 possessions-0.030
 bicycle-0.029
izing-0.029
Token,
Token ID11
Feature activation-1.06e-02

Pos logit contributions
 roasted+0.030
 two+0.030
 roast+0.030
umbs+0.029
rip+0.029
Neg logit contributions
 possessions-0.024
 possession-0.023
 angel-0.022
 bicycle-0.022
ie-0.022
Token it
Token ID340
Feature activation+1.96e-02

Pos logit contributions
 possessions+0.016
+0.016
ement+0.016
 possession+0.016
izing+0.016
Neg logit contributions
 a-0.013
 two-0.012
 Ben-0.011
 sne-0.010
 gas-0.010
Token's
Token ID338
Feature activation-2.58e-03

Pos logit contributions
 possessions+0.040
 possession+0.040
izing+0.038
ement+0.038
 angel+0.038
Neg logit contributions
 a-0.019
 two-0.018
 Ben-0.015
 sne-0.014
 gas-0.014
Token not
Token ID407
Feature activation+1.61e-02

Pos logit contributions
 a+0.004
 two+0.004
 blast+0.004
 Ben+0.003
 sne+0.003
Neg logit contributions
 possessions-0.004
 possession-0.004
izing-0.003
 Acc-0.003
-0.003
Token about
Token ID546
Feature activation+2.41e-03

Pos logit contributions
 possessions+0.025
 possession+0.024
 mattered+0.023
 ornament+0.022
izing+0.022
Neg logit contributions
ata-0.021
 blast-0.021
rip-0.020
 Theo-0.020
 blah-0.020
Token the
Token ID262
Feature activation+9.06e-03

Pos logit contributions
 possessions+0.004
 possession+0.004
izing+0.004
 Acc+0.004
+0.004
Neg logit contributions
 a-0.003
 two-0.003
 blast-0.003
 Ben-0.003
 sne-0.003
Token score
Token ID4776
Feature activation+1.00e-02

Pos logit contributions
 possessions+0.014
 possession+0.014
 angel+0.013
 ornament+0.013
 bicycle+0.013
Neg logit contributions
 a-0.012
 two-0.012
rip-0.011
 blast-0.011
 Ben-0.011
Token of
Token ID286
Feature activation-2.08e-03

Pos logit contributions
 possessions+0.018
 possession+0.017
 mattered+0.016
 angel+0.016
izing+0.016
Neg logit contributions
 a-0.011
 two-0.011
 roast-0.010
 blast-0.010
 roasted-0.010
Token how
Token ID703
Feature activation+1.37e-02

Pos logit contributions
 a+0.002
 two+0.002
rip+0.002
ata+0.002
 roast+0.002
Neg logit contributions
 possessions-0.004
 possession-0.003
izing-0.003
 angel-0.003
 mattered-0.003
Token many
Token ID867
Feature activation+1.77e-02

Pos logit contributions
 possessions+0.024
 possession+0.023
izing+0.022
+0.022
 Acc+0.022
Neg logit contributions
 a-0.016
 two-0.016
 Ben-0.014
 blast-0.014
 sne-0.014
Token toys
Token ID14958
Feature activation+1.12e-02

Pos logit contributions
 possessions+0.026
 possession+0.026
+0.026
 Acc+0.026
izing+0.025
Neg logit contributions
 a-0.026
 two-0.025
 Ben-0.023
 sne-0.022
 blast-0.022
Token he
Token ID339
Feature activation-9.18e-03

Pos logit contributions
 possessions+0.020
 possession+0.020
 Acc+0.019
 angel+0.019
+0.019
Neg logit contributions
 a-0.013
 two-0.012
 Ben-0.011
 sne-0.010
 gas-0.010
Token Rex
Token ID17853
Feature activation+8.30e-03

Pos logit contributions
 possessions+0.012
 possession+0.012
izing+0.011
bing+0.011
ths+0.011
Neg logit contributions
 a-0.010
 two-0.009
 Ben-0.009
 sne-0.008
 gas-0.008
Token went
Token ID1816
Feature activation+2.27e-03

Pos logit contributions
 possessions+0.013
 possession+0.012
+0.012
 mattered+0.012
 Acc+0.012
Neg logit contributions
 a-0.011
 two-0.010
 blast-0.010
rip-0.010
 roast-0.010
Token to
Token ID284
Feature activation+7.38e-03

Pos logit contributions
 possessions+0.004
 possession+0.004
 bicycle+0.004
 mattered+0.003
 angel+0.003
Neg logit contributions
 two-0.002
 blast-0.002
rip-0.002
 a-0.002
 Theo-0.002
Token the
Token ID262
Feature activation+1.93e-02

Pos logit contributions
 bicycle+0.009
 Ellie+0.008
 possession+0.008
 Ellen+0.008
 Acc+0.008
Neg logit contributions
umbs-0.013
ata-0.012
rip-0.012
OOOO-0.011
sel-0.011
Token park
Token ID3952
Feature activation-4.23e-03

Pos logit contributions
 bicycle+0.025
 possession+0.024
 possessions+0.023
 angel+0.023
 ornament+0.023
Neg logit contributions
umbs-0.029
ata-0.029
rip-0.029
 blah-0.028
rr-0.028
Token.
Token ID13
Feature activation+7.86e-03

Pos logit contributions
 two+0.005
rip+0.005
 blast+0.005
 GO+0.005
 a+0.005
Neg logit contributions
 possession-0.007
 possessions-0.006
 mattered-0.006
 bicycle-0.006
 angel-0.006
Token They
Token ID1119
Feature activation+9.49e-03

Pos logit contributions
+0.012
lights+0.011
 possessions+0.011
 possession+0.011
 Nice+0.011
Neg logit contributions
 GO-0.009
ata-0.008
rr-0.008
 two-0.008
 blast-0.008
Token liked
Token ID8288
Feature activation-1.48e-02

Pos logit contributions
 possessions+0.015
 possession+0.014
izing+0.014
+0.014
 Acc+0.014
Neg logit contributions
 a-0.013
 two-0.013
 Ben-0.011
 blast-0.011
 sne-0.011
Token to
Token ID284
Feature activation+1.73e-02

Pos logit contributions
ata+0.018
rip+0.017
 two+0.017
umbs+0.016
 blah+0.016
Neg logit contributions
 possessions-0.024
 possession-0.024
 bicycle-0.023
 angel-0.022
 mattered-0.022
Token play
Token ID711
Feature activation+1.17e-02

Pos logit contributions
 possessions+0.028
 possession+0.027
 bicycle+0.026
+0.026
 angel+0.026
Neg logit contributions
 a-0.021
 two-0.021
rip-0.019
 Ben-0.019
 blah-0.019
Token with
Token ID351
Feature activation+5.14e-03

Pos logit contributions
 possessions+0.020
 possession+0.019
+0.019
 mattered+0.019
izing+0.019
Neg logit contributions
 two-0.012
rip-0.012
 a-0.012
 roasted-0.012
 roast-0.012
Token to
Token ID284
Feature activation-4.00e-03

Pos logit contributions
 a+0.006
 two+0.005
 Ben+0.005
 blast+0.005
 sne+0.005
Neg logit contributions
 possessions-0.009
 possession-0.009
izing-0.009
-0.009
 Acc-0.008
Token talk
Token ID1561
Feature activation-1.62e-02

Pos logit contributions
 two+0.004
 a+0.004
 roasted+0.004
umbs+0.004
ata+0.004
Neg logit contributions
 angel-0.007
 possessions-0.007
 ornament-0.006
 possession-0.006
 bicycle-0.006
Token.
Token ID13
Feature activation+3.97e-03

Pos logit contributions
 two+0.017
 roast+0.017
 a+0.016
 blast+0.016
 roasted+0.016
Neg logit contributions
 possessions-0.028
 angel-0.027
 possession-0.027
 bicycle-0.026
 mattered-0.026
Token They
Token ID1119
Feature activation+2.11e-04

Pos logit contributions
+0.006
 Acc+0.006
 possessions+0.006
 mattered+0.006
 possession+0.006
Neg logit contributions
rip-0.005
 sne-0.005
ata-0.005
 two-0.004
rr-0.004
Token say
Token ID910
Feature activation+2.71e-02

Pos logit contributions
 possessions+0.000
 possession+0.000
 angel+0.000
+0.000
 Acc+0.000
Neg logit contributions
 a-0.000
 two-0.000
 Ben-0.000
 blast-0.000
rip-0.000
Token hello
Token ID23748
Feature activation-1.38e-03

Pos logit contributions
 possessions+0.036
 possession+0.034
 angel+0.033
+0.033
 Acc+0.032
Neg logit contributions
 a-0.042
 two-0.041
 blast-0.038
 Ben-0.038
 roast-0.038
Token,
Token ID11
Feature activation-1.25e-02

Pos logit contributions
 a+0.001
 two+0.001
 blast+0.001
 roast+0.001
 roasted+0.001
Neg logit contributions
 possessions-0.003
 possession-0.002
 angel-0.002
-0.002
 bicycle-0.002
Token how
Token ID703
Feature activation-2.69e-03

Pos logit contributions
rip+0.017
umbs+0.016
ata+0.016
 two+0.015
rr+0.015
Neg logit contributions
 angel-0.017
 possessions-0.017
 Acc-0.016
 disl-0.016
 bicycle-0.016
Token are
Token ID389
Feature activation-3.15e-02

Pos logit contributions
 two+0.003
 a+0.003
 blast+0.003
 roast+0.003
 sne+0.003
Neg logit contributions
 possessions-0.005
 possession-0.004
 mattered-0.004
 Acc-0.004
 angel-0.004
Token you
Token ID345
Feature activation-6.35e-03

Pos logit contributions
 a+0.036
 two+0.036
rip+0.036
 blast+0.036
 roast+0.035
Neg logit contributions
 possessions-0.053
 possession-0.049
 angel-0.048
 mattered-0.048
izing-0.046
Token,
Token ID11
Feature activation-9.95e-03

Pos logit contributions
 GO+0.008
 blast+0.008
umbs+0.008
 Theo+0.007
 roast+0.007
Neg logit contributions
 possessions-0.010
 mattered-0.009
 angel-0.009
 possession-0.009
ie-0.009
Token<|endoftext|>
Token ID50256
Feature activation-2.93e-02

Pos logit contributions
 sne+0.005
 padd+0.005
 roasted+0.005
 a+0.005
rip+0.005
Neg logit contributions
 Acc-0.008
-0.007
 possession-0.007
 possessions-0.007
Really-0.007
TokenS
Token ID50
Feature activation+1.14e-03

Pos logit contributions
 roast+0.046
 roasted+0.046
 padd+0.045
 sne+0.045
 boil+0.044
Neg logit contributions
-0.034
ating-0.033
 ornament-0.030
ined-0.030
 mattered-0.030
Tokenara
Token ID3301
Feature activation-2.13e-03

Pos logit contributions
ement+0.001
ie+0.001
 possessions+0.001
+0.001
ths+0.001
Neg logit contributions
 blast-0.002
 a-0.002
 Ro-0.002
 Run-0.002
 sne-0.002
Token and
Token ID290
Feature activation-1.18e-02

Pos logit contributions
 a+0.002
 two+0.002
 blast+0.002
 GO+0.002
 roast+0.002
Neg logit contributions
 possessions-0.004
 possession-0.004
-0.004
 Acc-0.004
 angel-0.004
Token Ben
Token ID3932
Feature activation+1.13e-02

Pos logit contributions
 two+0.014
 a+0.014
ata+0.014
rip+0.013
 blast+0.013
Neg logit contributions
 possessions-0.019
 possession-0.018
 angel-0.018
 Acc-0.018
-0.018
Token are
Token ID389
Feature activation-1.20e-02

Pos logit contributions
 possessions+0.020
 possession+0.020
izing+0.019
+0.019
 angel+0.019
Neg logit contributions
 a-0.013
 two-0.012
 Ben-0.011
 sne-0.011
 blast-0.011
Token twins
Token ID20345
Feature activation+2.27e-03

Pos logit contributions
ata+0.023
rip+0.023
 hook+0.022
 boil+0.022
 blast+0.022
Neg logit contributions
 Acc-0.011
 possessions-0.010
 mattered-0.009
 angel-0.009
 possession-0.009
Token.
Token ID13
Feature activation+1.19e-02

Pos logit contributions
 possessions+0.005
 possession+0.005
izing+0.004
+0.004
 angel+0.004
Neg logit contributions
 a-0.002
 two-0.002
 Ben-0.002
 sne-0.002
 blast-0.002
Token They
Token ID1119
Feature activation+1.16e-02

Pos logit contributions
+0.021
 Acc+0.021
 possessions+0.020
 possession+0.020
 mattered+0.020
Neg logit contributions
ata-0.012
 GO-0.011
 a-0.011
 blast-0.011
rip-0.011
Token like
Token ID588
Feature activation-1.42e-02

Pos logit contributions
 possessions+0.020
 possession+0.020
 Acc+0.019
+0.019
 angel+0.019
Neg logit contributions
 a-0.013
 two-0.013
 blast-0.011
 Ben-0.011
rip-0.011
Token to
Token ID284
Feature activation+2.89e-02

Pos logit contributions
ata+0.016
 two+0.016
rip+0.016
 blah+0.015
 blast+0.015
Neg logit contributions
 possessions-0.024
 possession-0.023
 bicycle-0.023
 ornament-0.022
 angel-0.022
Token garden
Token ID11376
Feature activation-3.23e-03

Pos logit contributions
 possessions+0.025
 mattered+0.023
ement+0.023
 possession+0.023
izing+0.023
Neg logit contributions
 a-0.023
 two-0.022
 Ben-0.019
 sne-0.019
 finish-0.019
Token.
Token ID13
Feature activation+1.19e-02

Pos logit contributions
 blast+0.004
 GO+0.004
Ar+0.004
rip+0.004
 blah+0.004
Neg logit contributions
 possessions-0.005
 possession-0.005
izing-0.005
bing-0.004
ized-0.004
Token It
Token ID632
Feature activation+2.56e-02

Pos logit contributions
 possessions+0.018
 Acc+0.018
 possession+0.018
 bicycle+0.018
+0.017
Neg logit contributions
 blast-0.015
ata-0.015
 two-0.014
rip-0.014
rr-0.014
Token quickly
Token ID2952
Feature activation+1.05e-02

Pos logit contributions
 possessions+0.046
 possession+0.045
izing+0.042
+0.042
ement+0.042
Neg logit contributions
 a-0.033
 two-0.031
 sne-0.029
 Ben-0.027
 roasted-0.027
Token started
Token ID2067
Feature activation-1.49e-02

Pos logit contributions
 possessions+0.017
 possession+0.016
 mattered+0.016
+0.016
 angel+0.016
Neg logit contributions
 a-0.013
 two-0.013
 blast-0.012
 Theo-0.012
 GO-0.011
Token to
Token ID284
Feature activation+2.80e-04

Pos logit contributions
 two+0.015
 a+0.015
rip+0.015
 blast+0.015
ata+0.014
Neg logit contributions
 possessions-0.027
 possession-0.026
 Acc-0.025
izing-0.025
 angel-0.025
Token scatter
Token ID41058
Feature activation+4.89e-04

Pos logit contributions
 possessions+0.000
 Acc+0.000
 angel+0.000
 ornament+0.000
 possession+0.000
Neg logit contributions
 two-0.000
 a-0.000
rip-0.000
ata-0.000
 Ben-0.000
Token the
Token ID262
Feature activation+6.55e-03

Pos logit contributions
 possessions+0.001
 possession+0.001
+0.001
 Acc+0.001
izing+0.001
Neg logit contributions
 a-0.001
 two-0.000
 Ben-0.000
 sne-0.000
 gas-0.000
Token vegetables
Token ID13701
Feature activation+1.10e-02

Pos logit contributions
 Acc+0.011
 possession+0.011
 possessions+0.011
ths+0.010
+0.010
Neg logit contributions
 two-0.009
 a-0.009
 meat-0.007
 four-0.007
 bread-0.007
Token and
Token ID290
Feature activation-3.65e-03

Pos logit contributions
 possessions+0.021
 possession+0.020
 Acc+0.019
+0.019
 angel+0.019
Neg logit contributions
 a-0.012
 two-0.011
 Ben-0.010
 sne-0.010
 blast-0.010
Token flowers
Token ID12734
Feature activation+3.94e-03

Pos logit contributions
 a+0.004
 two+0.004
 blast+0.003
 Ben+0.003
rip+0.003
Neg logit contributions
 possessions-0.007
 possession-0.006
izing-0.006
-0.006
 angel-0.006
Token Tom
Token ID4186
Feature activation-5.10e-04

Pos logit contributions
 possessions+0.009
 possession+0.009
 mattered+0.008
 angel+0.008
izing+0.008
Neg logit contributions
 two-0.008
 a-0.008
 blast-0.007
 roast-0.007
rip-0.007
Token was
Token ID373
Feature activation+8.98e-03

Pos logit contributions
 a+0.001
 two+0.001
 Ben+0.001
 sne+0.001
 blast+0.001
Neg logit contributions
 possessions-0.001
 possession-0.001
izing-0.001
-0.001
 Acc-0.001
Token trading
Token ID7313
Feature activation-2.05e-03

Pos logit contributions
 possessions+0.016
 possession+0.015
 angel+0.015
 mattered+0.015
izing+0.015
Neg logit contributions
 a-0.010
 two-0.009
 blast-0.009
rip-0.009
 roast-0.009
Token something
Token ID1223
Feature activation+3.70e-02

Pos logit contributions
 a+0.002
 two+0.002
 Ben+0.002
 gas+0.002
 sne+0.002
Neg logit contributions
 possessions-0.004
-0.004
 Acc-0.004
ths-0.004
 possession-0.004
Token that
Token ID326
Feature activation-3.46e-03

Pos logit contributions
+0.069
 possessions+0.067
 Acc+0.065
 possession+0.065
ths+0.065
Neg logit contributions
 a-0.046
 two-0.040
 Ben-0.035
 gas-0.034
 sne-0.034
Token wasn
Token ID2492
Feature activation-3.21e-02

Pos logit contributions
 a+0.004
 two+0.004
 Ben+0.004
 sne+0.004
 blast+0.003
Neg logit contributions
 possessions-0.006
 possession-0.006
-0.006
izing-0.006
ement-0.006
Token't
Token ID470
Feature activation+1.67e-02

Pos logit contributions
 roasted+0.049
 meat+0.048
 roast+0.047
 blast+0.046
 rods+0.045
Neg logit contributions
izing-0.037
 angel-0.037
 ornament-0.037
-0.036
ized-0.035
Token fair
Token ID3148
Feature activation+6.21e-03

Pos logit contributions
 possessions+0.025
 possession+0.024
 mattered+0.023
 angel+0.022
 ornament+0.022
Neg logit contributions
 blast-0.024
 roast-0.024
 GO-0.024
 blah-0.023
 Run-0.023
Token.
Token ID13
Feature activation-1.34e-02

Pos logit contributions
 possessions+0.012
 possession+0.011
izing+0.011
+0.011
 angel+0.011
Neg logit contributions
 a-0.007
 two-0.006
 blast-0.006
 Ben-0.006
 sne-0.006
Token The
Token ID383
Feature activation-7.61e-03

Pos logit contributions
 roasted+0.013
 roast+0.012
 sne+0.012
 blast+0.012
 boil+0.012
Neg logit contributions
 Acc-0.025
ement-0.025
 possession-0.024
 possessions-0.024
 disl-0.024
Token fox
Token ID21831
Feature activation-1.48e-02

Pos logit contributions
rip+0.012
umbs+0.012
ata+0.012
 GO+0.012
 blah+0.012
Neg logit contributions
 angel-0.010
 possessions-0.009
 ornament-0.009
 bicycle-0.009
 possession-0.009
Token the
Token ID262
Feature activation+2.15e-03

Pos logit contributions
 possessions+0.011
 possession+0.010
 angel+0.010
+0.010
izing+0.010
Neg logit contributions
 a-0.009
 two-0.009
rip-0.008
 roast-0.008
 roasted-0.008
Token sky
Token ID6766
Feature activation+2.27e-03

Pos logit contributions
 possessions+0.003
 possession+0.003
+0.003
 angel+0.003
izing+0.003
Neg logit contributions
 a-0.003
 two-0.003
 Ben-0.003
 blast-0.003
 sne-0.003
Token and
Token ID290
Feature activation-1.38e-03

Pos logit contributions
 possessions+0.004
 possession+0.004
izing+0.003
+0.003
 angel+0.003
Neg logit contributions
 a-0.003
 two-0.003
 blast-0.003
 Ben-0.003
rip-0.003
Token then
Token ID788
Feature activation-8.06e-03

Pos logit contributions
 a+0.001
 two+0.001
 Ben+0.001
 blast+0.001
 sne+0.001
Neg logit contributions
 possessions-0.003
 possession-0.003
-0.002
izing-0.002
 Acc-0.002
Token it
Token ID340
Feature activation-1.00e-03

Pos logit contributions
 a+0.011
 two+0.011
 Ben+0.010
 sne+0.010
 blast+0.010
Neg logit contributions
 possessions-0.013
 possession-0.012
izing-0.012
-0.012
 Acc-0.012
Token started
Token ID2067
Feature activation-3.15e-02

Pos logit contributions
 a+0.001
 two+0.001
 blast+0.001
rip+0.001
 GO+0.001
Neg logit contributions
 possessions-0.002
 angel-0.001
 mattered-0.001
 possession-0.001
ths-0.001
Token to
Token ID284
Feature activation-1.76e-02

Pos logit contributions
 a+0.031
 two+0.030
 blast+0.027
rip+0.027
 Ben+0.027
Neg logit contributions
 possessions-0.060
 possession-0.058
 Acc-0.056
izing-0.055
 angel-0.055
Token laugh
Token ID6487
Feature activation-8.84e-03

Pos logit contributions
 a+0.021
 two+0.021
 Ben+0.019
rip+0.019
 blast+0.019
Neg logit contributions
 possessions-0.029
 angel-0.027
 possession-0.027
 Acc-0.027
 ornament-0.027
Token.
Token ID13
Feature activation-9.54e-03

Pos logit contributions
 roast+0.010
 blast+0.010
rip+0.010
 two+0.010
 roasted+0.010
Neg logit contributions
 possessions-0.014
 angel-0.014
izing-0.014
 possession-0.014
 mattered-0.014
Token It
Token ID632
Feature activation+1.06e-02

Pos logit contributions
 a+0.011
 two+0.011
 sne+0.010
rip+0.010
 blast+0.010
Neg logit contributions
 possessions-0.016
 possession-0.016
 Acc-0.016
-0.016
izing-0.015
Token was
Token ID373
Feature activation-2.21e-02

Pos logit contributions
 possessions+0.018
 possession+0.017
+0.016
 Acc+0.016
izing+0.016
Neg logit contributions
 a-0.014
 two-0.013
 sne-0.012
 Ben-0.012
 blast-0.012
Token!
Token ID0
Feature activation-1.73e-02

Pos logit contributions
 a+0.013
 two+0.013
 blast+0.013
rip+0.013
 roast+0.013
Neg logit contributions
 possessions-0.022
 possession-0.021
 angel-0.020
 bicycle-0.020
izing-0.020
Token The
Token ID383
Feature activation-1.04e-02

Pos logit contributions
 sne+0.029
 boil+0.027
 bread+0.027
ata+0.027
 hook+0.026
Neg logit contributions
 Acc-0.021
 Ellen-0.020
 Ellie-0.020
 Mill-0.020
-0.020
Token puppy
Token ID26188
Feature activation-1.05e-02

Pos logit contributions
 a+0.016
 two+0.016
 Ben+0.015
 blast+0.015
 sne+0.015
Neg logit contributions
 possessions-0.014
 possession-0.013
-0.013
izing-0.013
 angel-0.013
Token was
Token ID373
Feature activation-8.66e-03

Pos logit contributions
 a+0.015
 two+0.015
rip+0.015
 Ben+0.015
rr+0.015
Neg logit contributions
 possessions-0.014
 possession-0.013
 angel-0.013
 ornament-0.013
-0.013
Token so
Token ID523
Feature activation+1.45e-02

Pos logit contributions
rip+0.009
ata+0.009
 GO+0.009
 Run+0.009
 blah+0.009
Neg logit contributions
 angel-0.014
 possession-0.014
 possessions-0.014
 ornament-0.014
ement-0.014
Token pleased
Token ID10607
Feature activation-3.07e-02

Pos logit contributions
 possessions+0.018
 possession+0.017
 angel+0.017
 Acc+0.017
 mattered+0.016
Neg logit contributions
 a-0.023
 two-0.023
ata-0.022
rip-0.022
 blast-0.022
Token.
Token ID13
Feature activation-1.09e-02

Pos logit contributions
 a+0.034
 two+0.033
 Ben+0.029
 blast+0.029
 sne+0.029
Neg logit contributions
 possessions-0.056
 possession-0.054
izing-0.052
-0.052
 Acc-0.051
Token
Token ID198
Feature activation-3.36e-03

Pos logit contributions
 sne+0.015
 roast+0.014
 boil+0.014
 roasted+0.014
 blast+0.014
Neg logit contributions
 Acc-0.016
 possession-0.016
 possessions-0.015
-0.015
 bicycle-0.015
Token
Token ID198
Feature activation+1.04e-02

Pos logit contributions
 sne+0.004
 GO+0.004
 two+0.003
 roast+0.003
 blast+0.003
Neg logit contributions
 Acc-0.005
 possessions-0.005
 possession-0.005
bing-0.005
 mattered-0.005
TokenThe
Token ID464
Feature activation-5.71e-03

Pos logit contributions
+0.014
 possessions+0.013
izing+0.013
ths+0.013
bing+0.013
Neg logit contributions
 a-0.016
 Ben-0.015
 sne-0.015
 two-0.015
 roast-0.014
Token puppy
Token ID26188
Feature activation-4.24e-03

Pos logit contributions
 a+0.009
 two+0.009
 Ben+0.008
 blast+0.008
 sne+0.008
Neg logit contributions
 possessions-0.008
 possession-0.008
izing-0.007
 Acc-0.007
-0.007
Token was
Token ID373
Feature activation-1.83e-02

Pos logit contributions
 possessions+0.020
 possession+0.019
izing+0.019
 Acc+0.019
+0.018
Neg logit contributions
 a-0.014
 two-0.013
 sne-0.012
 gas-0.011
 Ben-0.011
Token brown
Token ID7586
Feature activation-1.85e-02

Pos logit contributions
rip+0.017
rr+0.016
ata+0.016
 two+0.016
umbs+0.016
Neg logit contributions
 possessions-0.034
 ornament-0.033
 possession-0.033
 angel-0.032
 bicycle-0.032
Token and
Token ID290
Feature activation-3.29e-03

Pos logit contributions
 a+0.017
 two+0.016
 Ben+0.014
 blast+0.014
 sne+0.014
Neg logit contributions
 possessions-0.037
 possession-0.036
izing-0.035
-0.035
 angel-0.034
Token soft
Token ID2705
Feature activation-1.58e-02

Pos logit contributions
 a+0.004
 two+0.004
 Ben+0.003
 sne+0.003
 gas+0.003
Neg logit contributions
 possessions-0.006
 possession-0.006
-0.006
izing-0.006
ement-0.006
Token and
Token ID290
Feature activation-9.79e-03

Pos logit contributions
 a+0.016
 two+0.015
 Ben+0.013
 blast+0.013
 sne+0.013
Neg logit contributions
 possessions-0.030
 possession-0.029
izing-0.028
-0.028
 angel-0.028
Token had
Token ID550
Feature activation-3.58e-02

Pos logit contributions
 a+0.011
 two+0.010
 Ben+0.009
 sne+0.009
 gas+0.009
Neg logit contributions
 possessions-0.019
 possession-0.018
-0.018
izing-0.018
ement-0.018
Token a
Token ID257
Feature activation-1.83e-02

Pos logit contributions
rip+0.031
rr+0.031
olo+0.030
 boil+0.029
 Shark+0.028
Neg logit contributions
 possessions-0.067
 ornament-0.065
 possession-0.065
 bicycle-0.064
 bicycles-0.063
Token pillow
Token ID28774
Feature activation-2.34e-02

Pos logit contributions
 a+0.020
 two+0.020
 sne+0.018
 blast+0.017
 Ben+0.017
Neg logit contributions
 possessions-0.034
 possession-0.033
izing-0.031
 Acc-0.031
-0.031
Token on
Token ID319
Feature activation-1.36e-02

Pos logit contributions
 two+0.026
 a+0.026
rip+0.026
 GO+0.025
 blast+0.025
Neg logit contributions
 possessions-0.038
 possession-0.037
 bicycle-0.037
 angel-0.037
 ornament-0.037
Token it
Token ID340
Feature activation-1.05e-02

Pos logit contributions
 two+0.017
rip+0.017
rr+0.017
ata+0.016
 roasted+0.016
Neg logit contributions
 possessions-0.020
 possession-0.020
 ornament-0.020
 bicycle-0.019
izing-0.019
Token.
Token ID13
Feature activation-1.21e-02

Pos logit contributions
 blast+0.012
 two+0.011
 roasted+0.011
 roast+0.011
 GO+0.011
Neg logit contributions
 possessions-0.017
 possession-0.017
 mattered-0.016
 bicycle-0.016
izing-0.016
Token.
Token ID13
Feature activation+2.37e-03

Pos logit contributions
 a+0.001
 two+0.001
 blast+0.001
 roast+0.001
 roasted+0.001
Neg logit contributions
 possessions-0.002
 possession-0.002
 Acc-0.002
 angel-0.002
izing-0.002
Token They
Token ID1119
Feature activation-2.09e-03

Pos logit contributions
 possessions+0.005
 possession+0.005
+0.005
 Acc+0.005
izing+0.005
Neg logit contributions
 a-0.002
 two-0.002
 sne-0.002
 blast-0.001
rip-0.001
Token said
Token ID531
Feature activation+2.81e-02

Pos logit contributions
umbs+0.003
 GO+0.003
rip+0.003
rr+0.003
 blah+0.003
Neg logit contributions
 possessions-0.003
 mattered-0.003
 angel-0.002
 Acc-0.002
 bicycle-0.002
Token sorry
Token ID7926
Feature activation-6.01e-03

Pos logit contributions
 possession+0.053
 possessions+0.053
izing+0.050
+0.048
ement+0.048
Neg logit contributions
 a-0.037
 two-0.034
 Ben-0.033
 sne-0.029
 four-0.027
Token and
Token ID290
Feature activation-4.02e-03

Pos logit contributions
 a+0.005
 two+0.005
 Ben+0.004
 sne+0.004
rip+0.004
Neg logit contributions
 possessions-0.013
 possession-0.013
izing-0.012
-0.012
ement-0.012
Token gave
Token ID2921
Feature activation-3.02e-02

Pos logit contributions
 a+0.005
 two+0.005
 blast+0.005
 Ben+0.005
rip+0.005
Neg logit contributions
 possessions-0.006
 possession-0.006
 Acc-0.006
 angel-0.006
izing-0.006
Token back
Token ID736
Feature activation-1.65e-02

Pos logit contributions
 a+0.044
 two+0.043
 blast+0.040
 sne+0.039
 Ben+0.039
Neg logit contributions
 possessions-0.045
 possession-0.043
-0.040
izing-0.040
 angel-0.040
Token the
Token ID262
Feature activation-2.42e-03

Pos logit contributions
 a+0.015
 two+0.014
 Ben+0.013
 blast+0.012
 sne+0.012
Neg logit contributions
 possessions-0.033
 possession-0.032
izing-0.031
-0.031
 Acc-0.031
Token shiny
Token ID22441
Feature activation+1.41e-02

Pos logit contributions
 a+0.003
 two+0.003
 Ben+0.003
 sne+0.003
 gas+0.003
Neg logit contributions
izing-0.004
ths-0.004
-0.004
bing-0.004
 possessions-0.004
Token things
Token ID1243
Feature activation-6.89e-03

Pos logit contributions
ths+0.026
 Acc+0.025
Ell+0.024
 disl+0.023
ishly+0.023
Neg logit contributions
 a-0.020
 two-0.019
 hook-0.016
 four-0.016
 gas-0.016
Token.
Token ID13
Feature activation-7.16e-03

Pos logit contributions
 a+0.007
 two+0.007
 Ben+0.006
 blast+0.006
 sne+0.006
Neg logit contributions
 possessions-0.013
 possession-0.013
izing-0.012
-0.012
 Acc-0.012