TOP ACTIVATIONS
MAX = 0.093

blocks to make it stand upright . Then they used the
the village back to its former glory !"
back the village to its former state . With the help
the tree to its once former beauty . He was happy
to be restored to its former glory ."
his socks away in a timely manner . And each time
a tall tree . Un bek now st to him ,
knows how to get things unst uck ." So , they
go to the land of Imp ressive Things .
heading to the land of Imp ressive Things , â
to tie her teacher 's sho el aces together . Each
friends helped her loosen her sho el aces so she could
just like in a fair yt ale ! <|endoftext|> Once upon
a big knot in his sho el aces and he couldn
and proud of himself for accompl ishing such a difficult task
, they were very apolog etic . The
day ever ! <|endoftext|> Mor ris was very excited . Today
they didn 't get it unst uck . Just
, leaving it to its former state .
Max was very apolog etic and said ,

MIN ACTIVATIONS
MIN = -0.081

the garden . It was as tall as her ! She
thick mushroom . It was as big as her head !
sun flower . It was as yellow as the sun and
. The ground now looked as clean as before and the
plates , and some sticks as forks and knives .
cried . " The ost ric hes are mean ! They
the sky . It was as high as could be and
the dress and it looked as good as new .
the doll and it looked as good as new . The
it , the shirt looked as good as new . Lily
them play . As the sun went down , Lily hugged
a moment , it was as if there was no time
and laughed . As the sun started to set , Max
the shirt and it looked as good as new ! The
fire work . It was as big as a house !
was repaired , it looked as good as new . The
. They were called ost ric hes . Lily and Ben
toys and fruits and books as things to sell .
. As the sun began to set , Buddy
to the park and swing as high as she could .

INTERVAL 0.049 - 0.093
CONTAINS 0.098%

I will always remind you â . Gr
then on , he always remembered that if he worked hard
to turn off the f auc et when she was done
he wanted to go to an opera himself . So ,
some more toys to your collection !" L

INTERVAL 0.006 - 0.049
CONTAINS 35.151%

careless . From then on , Fl uffy
prevent bad things from happening by being careful and listening to
so excited that from then on , she was always happy
. Everyone agreed that it was important to be flexible ,
" Look at me , I am so pretty !" she

INTERVAL -0.038 - 0.006
CONTAINS 63.469%

One day , something magical happened . A little
did you go outside ? Why didn 't you listen to
day . He lived in a small house with his best
competition ! He was so happy that he rubbed his head
. Dad said , " Great job ,

INTERVAL -0.081 - -0.038
CONTAINS 1.282%

. Mom and Dad heard them . They ran
, and puzzles . Lily likes puzzles . She likes to
up , but it was too hard . Tim felt sad
was making noises again . It was flashing and moving .
. They all hugged and smiled . Then they
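
NOTE ON THE INTERVALS
The four interval boundaries above (-0.081, -0.038, 0.006, 0.049, 0.093) are consistent with splitting the range from MIN to MAX into four equal-width bins, and the four CONTAINS percentages sum to 100%. A minimal sketch of how such shares could be computed is given below. It assumes the activations are available as a plain list of floats; the function name interval_shares and the n_bins parameter are hypothetical and are not taken from the tool that produced this dump.

```python
from bisect import bisect_right


def interval_shares(activations, n_bins=4):
    """Return [((low, high), percent), ...] for n_bins equal-width intervals.

    Hypothetical helper: it assumes equal-width bins between the minimum and
    maximum activation, which matches the boundaries listed in this dump.
    """
    lo, hi = min(activations), max(activations)
    if hi == lo:
        # Degenerate case: every activation is identical.
        return [((lo, hi), 100.0)]
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    edges[-1] = hi  # avoid floating-point drift on the top edge
    counts = [0] * n_bins
    for a in activations:
        # bisect_right gives the bin index; clamp so the maximum value
        # is counted in the last bin instead of falling off the end.
        idx = min(bisect_right(edges, a) - 1, n_bins - 1)
        counts[idx] += 1
    total = len(activations)
    return [
        ((edges[i], edges[i + 1]), 100.0 * counts[i] / total)
        for i in range(n_bins)
    ]
```

The result is ordered from the lowest interval to the highest, which is the reverse of the listing above. A value that lies exactly on an interior boundary is counted in the upper interval.
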
Token blocks
Token ID7021
Feature activation+1.81e-03

Pos logit contributions
 pet+0.014
 water+0.013
 dog+0.013
 walk+0.012
 drops+0.012
Neg logit contributions
 exactly-0.033
ently-0.033
vel-0.032
 forever-0.032
 correctly-0.032
Token to
Token ID284
Feature activation+6.23e-03

Pos logit contributions
ently+0.005
vel+0.005
 exactly+0.004
mmm+0.004
 forever+0.004
Neg logit contributions
 pet-0.001
s-0.001
 water-0.001
 dog-0.001
 walk-0.001
Token make
Token ID787
Feature activation+6.01e-03

Pos logit contributions
ently+0.015
vel+0.015
 exactly+0.015
rol+0.015
 forever+0.015
Neg logit contributions
 pet-0.006
 water-0.006
 walk-0.005
 wave-0.005
 play-0.005
Token it
Token ID340
Feature activation-6.67e-03

Pos logit contributions
ently+0.013
 forever+0.013
anish+0.013
vel+0.013
ieved+0.013
Neg logit contributions
 pet-0.007
 water-0.007
 big-0.006
 dog-0.006
 drops-0.006
Token stand
Token ID1302
Feature activation+1.83e-03

Pos logit contributions
 pet+0.007
 water+0.006
 walk+0.006
 drops+0.006
s+0.006
Neg logit contributions
ently-0.015
vel-0.015
mmm-0.015
 exactly-0.015
loe-0.015
Token upright
Token ID24826
Feature activation+9.30e-02

Pos logit contributions
anish+0.004
ently+0.004
mmm+0.004
ocolate+0.004
loe+0.004
Neg logit contributions
 pet-0.002
 water-0.002
s-0.002
 strong-0.001
 walk-0.001
Token.
Token ID13
Feature activation-1.55e-03

Pos logit contributions
mmm+0.256
ocolate+0.250
ieved+0.246
vel+0.246
loe+0.245
Neg logit contributions
 pet-0.052
s-0.046
 water-0.042
 dog-0.039
 big-0.034
Token Then
Token ID3244
Feature activation+6.82e-04

Pos logit contributions
 pet+0.002
 dog+0.002
 water+0.002
s+0.002
 walk+0.002
Neg logit contributions
vel-0.003
 forever-0.003
 exactly-0.003
ently-0.003
 correctly-0.003
Token they
Token ID484
Feature activation-2.83e-03

Pos logit contributions
ently+0.002
vel+0.002
rol+0.002
aney+0.001
ieved+0.001
Neg logit contributions
 pet-0.001
 dog-0.001
 water-0.001
s-0.001
 walk-0.001
Token used
Token ID973
Feature activation+1.85e-02

Pos logit contributions
 pet+0.003
 water+0.003
 walk+0.003
 dog+0.003
 went+0.003
Neg logit contributions
ently-0.006
 exactly-0.006
vel-0.006
mmm-0.006
 forever-0.006
Token the
Token ID262
Feature activation-1.17e-02

Pos logit contributions
vel+0.047
ently+0.047
mmm+0.047
ieved+0.045
anish+0.044
Neg logit contributions
 water-0.019
 pet-0.019
 big-0.016
 drops-0.014
 walk-0.014
Token the
Token ID262
Feature activation+4.69e-03

Pos logit contributions
aney+0.034
rol+0.034
ently+0.033
vel+0.032
 exactly+0.032
Neg logit contributions
 pet-0.023
 water-0.023
s-0.020
 dog-0.020
 big-0.019
Token village
Token ID7404
Feature activation-3.19e-03

Pos logit contributions
 exactly+0.010
ently+0.010
 correctly+0.010
vel+0.010
 forever+0.010
Neg logit contributions
 pet-0.006
 dog-0.005
 water-0.005
 walk-0.005
 drops-0.005
Token back
Token ID736
Feature activation-4.02e-03

Pos logit contributions
 pet+0.004
s+0.004
 water+0.004
 big+0.003
 strong+0.003
Neg logit contributions
anish-0.007
ocolate-0.007
ently-0.007
mmm-0.007
loe-0.007
Token to
Token ID284
Feature activation+2.95e-02

Pos logit contributions
 pet+0.005
 water+0.004
s+0.004
 dog+0.004
 walk+0.004
Neg logit contributions
vel-0.009
mmm-0.009
ently-0.009
anish-0.009
ocolate-0.009
Token its
Token ID663
Feature activation+3.29e-02

Pos logit contributions
rol+0.064
ently+0.064
aney+0.063
 exactly+0.062
ieved+0.062
Neg logit contributions
 pet-0.037
 water-0.037
 big-0.031
 walk-0.031
 dog-0.031
Token former
Token ID1966
Feature activation+9.29e-02

Pos logit contributions
vel+0.069
ently+0.069
otion+0.068
rol+0.067
ocolate+0.067
Neg logit contributions
 pet-0.049
 dog-0.041
 walk-0.041
 water-0.040
 big-0.037
Token glory
Token ID13476
Feature activation+8.85e-03

Pos logit contributions
 exactly+0.142
vel+0.142
 correctly+0.139
otion+0.137
anish+0.135
Neg logit contributions
 pet-0.165
 water-0.154
 drops-0.149
 walk-0.144
 pets-0.141
Token!"
Token ID2474
Feature activation+1.89e-02

Pos logit contributions
rol+0.018
anish+0.018
 exactly+0.018
ieved+0.018
 fault+0.018
Neg logit contributions
 water-0.010
 pet-0.010
s-0.010
 big-0.009
 strong-0.009
Token
Token ID220
Feature activation-9.72e-03

Pos logit contributions
 exactly+0.046
vel+0.046
 forever+0.044
ust+0.044
 fault+0.044
Neg logit contributions
 pet-0.019
s-0.017
 dog-0.017
 water-0.016
 man-0.014
Token
Token ID198
Feature activation+1.53e-02

Pos logit contributions
 pet+0.006
s+0.005
 dog+0.005
 water+0.004
 drops+0.004
Neg logit contributions
 forever-0.027
 exactly-0.026
vel-0.026
 correctly-0.026
ently-0.025
Token
Token ID198
Feature activation+1.58e-02

Pos logit contributions
 exactly+0.044
 correctly+0.043
 fault+0.043
 forever+0.043
vel+0.042
Neg logit contributions
 pet-0.009
s-0.007
 dog-0.006
 water-0.005
 big-0.005
Token back
Token ID736
Feature activation+2.23e-03

Pos logit contributions
 exactly+0.023
ently+0.023
 forever+0.023
vel+0.023
 correctly+0.023
Neg logit contributions
 pet-0.017
 dog-0.015
 water-0.014
 walk-0.014
 drops-0.014
Token the
Token ID262
Feature activation-6.60e-03

Pos logit contributions
 exactly+0.005
ently+0.004
 forever+0.004
vel+0.004
 correctly+0.004
Neg logit contributions
 pet-0.003
 dog-0.003
 water-0.003
 walk-0.003
 drops-0.002
Token village
Token ID7404
Feature activation+1.05e-04

Pos logit contributions
 pet+0.009
 dog+0.008
 walk+0.007
 water+0.007
 drops+0.007
Neg logit contributions
 exactly-0.014
 forever-0.013
ently-0.013
vel-0.013
 correctly-0.013
Token to
Token ID284
Feature activation+2.49e-02

Pos logit contributions
ently+0.000
mmm+0.000
vel+0.000
anish+0.000
 exactly+0.000
Neg logit contributions
 pet-0.000
 walk-0.000
 water-0.000
s-0.000
 dog-0.000
Token its
Token ID663
Feature activation+2.75e-02

Pos logit contributions
ently+0.059
rol+0.058
vel+0.058
aney+0.057
 exactly+0.057
Neg logit contributions
 pet-0.027
 water-0.023
 walk-0.022
 dog-0.022
 wave-0.020
Token former
Token ID1966
Feature activation+9.14e-02

Pos logit contributions
vel+0.058
ently+0.058
 correctly+0.056
 exactly+0.056
anish+0.055
Neg logit contributions
 pet-0.039
 dog-0.032
 walk-0.032
 water-0.031
 pets-0.029
Token state
Token ID1181
Feature activation+3.03e-02

Pos logit contributions
vel+0.139
 exactly+0.138
 correctly+0.136
otion+0.132
anish+0.132
Neg logit contributions
 pet-0.164
 water-0.152
 drops-0.146
 walk-0.145
 dog-0.139
Token.
Token ID13
Feature activation+1.06e-02

Pos logit contributions
mmm+0.076
vel+0.075
ieved+0.074
anish+0.074
ocolate+0.073
Neg logit contributions
 pet-0.024
 water-0.021
s-0.020
 dog-0.018
 big-0.017
Token With
Token ID2080
Feature activation+2.68e-03

Pos logit contributions
 exactly+0.027
ently+0.027
vel+0.027
 correctly+0.026
 forever+0.026
Neg logit contributions
 pet-0.008
 dog-0.007
s-0.007
 water-0.006
 walk-0.005
Token the
Token ID262
Feature activation-5.83e-03

Pos logit contributions
ently+0.006
vel+0.006
anish+0.006
ide+0.006
 exactly+0.005
Neg logit contributions
 pet-0.004
 water-0.003
 dog-0.003
 big-0.003
 walk-0.003
Token help
Token ID1037
Feature activation-8.27e-03

Pos logit contributions
 pet+0.006
 dog+0.005
 walk+0.005
s+0.005
 water+0.005
Neg logit contributions
 exactly-0.013
 forever-0.013
ently-0.013
vel-0.013
 correctly-0.013
Token the
Token ID262
Feature activation-6.47e-03

Pos logit contributions
vel+0.020
ently+0.019
anish+0.019
mmm+0.019
 forever+0.019
Neg logit contributions
 pet-0.008
 water-0.008
 dog-0.007
 drops-0.006
s-0.006
Token tree
Token ID5509
Feature activation+9.56e-03

Pos logit contributions
 pet+0.007
 dog+0.006
 walk+0.006
 water+0.005
 drops+0.005
Neg logit contributions
ently-0.016
 exactly-0.016
vel-0.016
 correctly-0.015
mmm-0.015
Token to
Token ID284
Feature activation+2.51e-02

Pos logit contributions
mmm+0.022
anish+0.021
ently+0.021
vel+0.021
abella+0.021
Neg logit contributions
 pet-0.011
 walk-0.010
s-0.009
 water-0.009
 dog-0.009
Token its
Token ID663
Feature activation+3.41e-02

Pos logit contributions
ently+0.059
vel+0.059
 correctly+0.057
anish+0.056
rol+0.056
Neg logit contributions
 pet-0.028
 water-0.024
 walk-0.023
 dog-0.022
 wave-0.020
Token once
Token ID1752
Feature activation+1.01e-02

Pos logit contributions
vel+0.076
otion+0.075
ently+0.075
mmm+0.073
anish+0.073
Neg logit contributions
 pet-0.047
 walk-0.038
 dog-0.038
 water-0.036
 big-0.034
Token former
Token ID1966
Feature activation+8.96e-02

Pos logit contributions
mmm+0.020
ocolate+0.020
abella+0.020
vel+0.020
 correctly+0.020
Neg logit contributions
 pet-0.014
 walk-0.013
 water-0.013
 warm-0.012
 strong-0.012
Token beauty
Token ID8737
Feature activation-1.29e-02

Pos logit contributions
vel+0.138
 exactly+0.137
 correctly+0.134
otion+0.133
anish+0.132
Neg logit contributions
 pet-0.157
 water-0.144
 drops-0.141
 walk-0.140
 pets-0.136
Token.
Token ID13
Feature activation+1.89e-02

Pos logit contributions
 pet+0.011
s+0.010
 dog+0.009
 water+0.009
 walk+0.008
Neg logit contributions
vel-0.032
mmm-0.032
ently-0.032
ocolate-0.031
 exactly-0.031
Token He
Token ID679
Feature activation+2.11e-02

Pos logit contributions
vel+0.051
ently+0.049
anish+0.047
 exactly+0.047
ieved+0.046
Neg logit contributions
 pet-0.013
 dog-0.011
s-0.011
 water-0.010
 went-0.009
Token was
Token ID373
Feature activation+2.46e-02

Pos logit contributions
ocolate+0.053
vel+0.053
ently+0.052
 fault+0.052
mmm+0.052
Neg logit contributions
 pet-0.021
 went-0.017
 walk-0.016
s-0.015
 started-0.015
Token happy
Token ID3772
Feature activation-2.17e-03

Pos logit contributions
anish+0.058
ently+0.057
mmm+0.054
mmmm+0.052
ide+0.051
Neg logit contributions
 pet-0.034
 strong-0.032
 warm-0.030
 big-0.029
 calm-0.028
Token to
Token ID284
Feature activation+1.09e-02

Pos logit contributions
vel+0.025
 exactly+0.024
ently+0.024
mmm+0.024
ieved+0.023
Neg logit contributions
 water-0.009
 pet-0.009
s-0.009
 dog-0.009
 walk-0.008
Token be
Token ID307
Feature activation+6.22e-03

Pos logit contributions
ently+0.028
vel+0.028
 exactly+0.028
 correctly+0.027
rol+0.027
Neg logit contributions
 pet-0.009
 walk-0.009
 water-0.009
 dog-0.007
 play-0.007
Token restored
Token ID15032
Feature activation+2.67e-02

Pos logit contributions
ently+0.014
anish+0.014
ocolate+0.014
mmm+0.014
aney+0.014
Neg logit contributions
 pet-0.007
 water-0.006
 strong-0.006
 walk-0.005
 drops-0.005
Token to
Token ID284
Feature activation+1.95e-02

Pos logit contributions
ieved+0.062
aney+0.062
ocolate+0.061
vel+0.061
mmm+0.061
Neg logit contributions
 pet-0.026
 water-0.025
s-0.024
 dog-0.024
 big-0.022
Token its
Token ID663
Feature activation+2.32e-02

Pos logit contributions
rol+0.049
ently+0.048
aney+0.047
vel+0.047
ieved+0.046
Neg logit contributions
 pet-0.020
 water-0.019
 walk-0.018
 play-0.017
 big-0.016
Token former
Token ID1966
Feature activation+8.79e-02

Pos logit contributions
vel+0.049
ently+0.047
rol+0.046
mmm+0.046
otion+0.046
Neg logit contributions
 pet-0.036
 dog-0.030
 walk-0.030
 water-0.030
 big-0.028
Token glory
Token ID13476
Feature activation+4.69e-03

Pos logit contributions
 exactly+0.131
vel+0.130
 correctly+0.126
 forever+0.123
ently+0.123
Neg logit contributions
 pet-0.159
 water-0.147
 drops-0.144
 walk-0.143
 dog-0.140
Token."
Token ID526
Feature activation+8.48e-03

Pos logit contributions
anish+0.010
rol+0.010
 exactly+0.010
vel+0.010
ieved+0.010
Neg logit contributions
 water-0.005
s-0.005
 pet-0.005
 big-0.004
 strong-0.004
Token
Token ID220
Feature activation-1.57e-02

Pos logit contributions
vel+0.021
 correctly+0.021
 exactly+0.021
 forever+0.020
ently+0.020
Neg logit contributions
 pet-0.008
s-0.008
 water-0.007
 dog-0.006
 went-0.006
Token
Token ID198
Feature activation+6.56e-03

Pos logit contributions
 pet+0.010
s+0.009
 dog+0.008
 water+0.007
 walk+0.006
Neg logit contributions
 forever-0.042
 exactly-0.042
 correctly-0.042
vel-0.042
anish-0.040
Token
Token ID198
Feature activation+9.37e-03

Pos logit contributions
 correctly+0.018
 exactly+0.018
 forever+0.018
vel+0.017
ently+0.017
Neg logit contributions
 pet-0.005
s-0.004
 water-0.003
 dog-0.003
 big-0.003
Token his
Token ID465
Feature activation-2.53e-02

Pos logit contributions
ently+0.020
 exactly+0.020
 forever+0.020
vel+0.020
 correctly+0.020
Neg logit contributions
 pet-0.014
 water-0.013
 dog-0.013
 walk-0.012
 drops-0.012
Token socks
Token ID24359
Feature activation+4.13e-03

Pos logit contributions
 pet+0.035
 walk+0.031
 dog+0.031
 Claus+0.030
s+0.029
Neg logit contributions
 exactly-0.046
 correctly-0.044
 forever-0.044
vel-0.044
ently-0.042
Token away
Token ID1497
Feature activation+6.33e-03

Pos logit contributions
mmm+0.010
ently+0.010
 exactly+0.010
vel+0.009
anish+0.009
Neg logit contributions
 pet-0.004
 water-0.004
 walk-0.004
s-0.003
 dog-0.003
Token in
Token ID287
Feature activation+3.00e-02

Pos logit contributions
mmm+0.015
vel+0.015
anish+0.015
ently+0.015
ocolate+0.015
Neg logit contributions
 pet-0.006
 water-0.006
s-0.005
 walk-0.005
 warm-0.005
Token a
Token ID257
Feature activation+3.28e-02

Pos logit contributions
ently+0.077
mmm+0.073
ieved+0.072
vel+0.072
aney+0.071
Neg logit contributions
 water-0.032
 pet-0.031
 big-0.026
 dog-0.026
 walk-0.024
Token timely
Token ID19376
Feature activation+8.78e-02

Pos logit contributions
ently+0.079
 exactly+0.077
 correctly+0.076
mmm+0.075
vel+0.074
Neg logit contributions
 pet-0.036
 dog-0.030
 water-0.029
 walk-0.029
 drops-0.027
Token manner
Token ID5642
Feature activation+2.59e-02

Pos logit contributions
 exactly+0.135
ently+0.131
 forever+0.130
 correctly+0.129
vel+0.129
Neg logit contributions
 pet-0.162
 dog-0.148
 walk-0.144
 water-0.143
 drops-0.141
Token.
Token ID13
Feature activation+1.15e-02

Pos logit contributions
mmm+0.072
ieved+0.072
mmmm+0.070
vel+0.069
heit+0.069
Neg logit contributions
 pet-0.013
s-0.012
 dog-0.012
 man-0.009
 water-0.008
Token And
Token ID843
Feature activation-4.81e-03

Pos logit contributions
vel+0.029
 exactly+0.029
ently+0.028
 forever+0.028
ieved+0.027
Neg logit contributions
 pet-0.010
 dog-0.009
s-0.008
 water-0.008
 pets-0.006
Token each
Token ID1123
Feature activation+2.61e-02

Pos logit contributions
 pet+0.005
 dog+0.005
 water+0.005
s+0.005
 walk+0.004
Neg logit contributions
ently-0.011
vel-0.011
aney-0.011
ocolate-0.010
anish-0.010
Token time
Token ID640
Feature activation-3.55e-02

Pos logit contributions
ently+0.050
 exactly+0.050
mmm+0.049
 correctly+0.048
 forever+0.048
Neg logit contributions
 pet-0.040
 dog-0.036
 water-0.036
 walk-0.035
 drops-0.034
Token a
Token ID257
Feature activation-1.02e-02

Pos logit contributions
ently+0.037
vel+0.037
 correctly+0.036
 forever+0.036
mmm+0.035
Neg logit contributions
 pet-0.018
 water-0.015
 dog-0.015
 walk-0.013
 drops-0.013
Token tall
Token ID7331
Feature activation+2.07e-03

Pos logit contributions
 pet+0.009
 dog+0.007
 walk+0.007
 water+0.007
 drops+0.007
Neg logit contributions
 exactly-0.025
 forever-0.025
vel-0.024
ently-0.024
 correctly-0.024
Token tree
Token ID5509
Feature activation+2.20e-02

Pos logit contributions
 exactly+0.005
ently+0.005
vel+0.005
 correctly+0.005
 forever+0.005
Neg logit contributions
 pet-0.002
 dog-0.002
 water-0.002
 walk-0.002
 drops-0.002
Token.
Token ID13
Feature activation-1.67e-02

Pos logit contributions
 exactly+0.050
ently+0.050
vel+0.049
 correctly+0.049
 forever+0.049
Neg logit contributions
 pet-0.025
 dog-0.021
 walk-0.020
 water-0.020
s-0.019
Token Un
Token ID791
Feature activation+2.78e-02

Pos logit contributions
 pet+0.024
 dog+0.022
 walk+0.021
 drops+0.021
 water+0.021
Neg logit contributions
 exactly-0.031
 forever-0.030
ently-0.029
 correctly-0.029
vel-0.029
Tokenbek
Token ID47083
Feature activation+8.63e-02

Pos logit contributions
 exactly+0.044
 forever+0.044
ently+0.043
vel+0.043
 correctly+0.042
Neg logit contributions
 pet-0.049
 dog-0.045
 water-0.043
 walk-0.043
 drops-0.043
Tokennow
Token ID2197
Feature activation+6.58e-03

Pos logit contributions
 exactly+0.096
 forever+0.095
vel+0.092
 correctly+0.091
ently+0.091
Neg logit contributions
 pet-0.194
 dog-0.179
 walk-0.172
 drops-0.172
s-0.172
Tokenst
Token ID301
Feature activation+3.22e-02

Pos logit contributions
ently+0.013
 exactly+0.013
vel+0.013
 correctly+0.013
 fault+0.013
Neg logit contributions
 pet-0.009
 dog-0.008
s-0.008
 walk-0.008
 water-0.008
Token to
Token ID284
Feature activation+8.12e-03

Pos logit contributions
 accompl+0.046
ocolate+0.043
anish+0.042
mmm+0.042
 forever+0.041
Neg logit contributions
s-0.063
 pet-0.059
 water-0.057
 walk-0.055
 drops-0.054
Token him
Token ID683
Feature activation+1.02e-02

Pos logit contributions
ently+0.022
vel+0.022
mmm+0.020
rol+0.020
ement+0.020
Neg logit contributions
 pet-0.008
 water-0.007
 walk-0.007
 dog-0.006
 wave-0.006
Token,
Token ID11
Feature activation+2.26e-02

Pos logit contributions
rol+0.027
ocolate+0.026
aney+0.026
vel+0.026
 thirteen+0.026
Neg logit contributions
 pet-0.010
 walk-0.008
 dog-0.007
s-0.007
 play-0.007
Token knows
Token ID4206
Feature activation+1.18e-02

Pos logit contributions
 exactly+0.004
 forever+0.004
ently+0.004
 correctly+0.004
vel+0.004
Neg logit contributions
 pet-0.003
 dog-0.003
 walk-0.003
 water-0.003
 drops-0.003
Token how
Token ID703
Feature activation-1.78e-02

Pos logit contributions
vel+0.022
rol+0.022
aney+0.022
ently+0.021
ieved+0.021
Neg logit contributions
 pet-0.019
 water-0.018
 dog-0.018
 walk-0.016
 pets-0.016
Token to
Token ID284
Feature activation-4.92e-03

Pos logit contributions
 pet+0.017
 dog+0.015
 water+0.015
 walk+0.014
s+0.014
Neg logit contributions
ently-0.042
 exactly-0.042
vel-0.042
 correctly-0.041
 forever-0.041
Token get
Token ID651
Feature activation-8.84e-03

Pos logit contributions
 pet+0.005
 dog+0.004
 drops+0.004
s+0.004
 walk+0.004
Neg logit contributions
 exactly-0.011
 forever-0.011
ently-0.011
vel-0.011
 correctly-0.011
Token things
Token ID1243
Feature activation-2.10e-02

Pos logit contributions
 pet+0.012
 dog+0.010
 walk+0.010
 water+0.010
 drops+0.010
Neg logit contributions
 exactly-0.018
 forever-0.018
ently-0.017
 correctly-0.017
vel-0.017
Token unst
Token ID15014
Feature activation+8.37e-02

Pos logit contributions
 pet+0.029
 dog+0.026
 water+0.025
 walk+0.025
 drops+0.024
Neg logit contributions
 exactly-0.042
ently-0.041
vel-0.041
 forever-0.040
 correctly-0.040
Tokenuck
Token ID1347
Feature activation+2.64e-02

Pos logit contributions
ently+0.119
 forever+0.115
 exactly+0.114
mmm+0.112
 correctly+0.111
Neg logit contributions
 pet-0.159
 wave-0.147
 walk-0.143
 water-0.143
 drops-0.142
Token."
Token ID526
Feature activation+1.01e-03

Pos logit contributions
vel+0.056
 exactly+0.054
 accompl+0.054
ently+0.054
aney+0.053
Neg logit contributions
 water-0.034
 pet-0.032
s-0.030
 dog-0.029
 drops-0.027
Token So
Token ID1406
Feature activation+1.28e-02

Pos logit contributions
vel+0.002
 exactly+0.002
 correctly+0.002
 forever+0.002
ently+0.002
Neg logit contributions
 pet-0.001
s-0.001
 dog-0.001
 water-0.001
 man-0.001
Token,
Token ID11
Feature activation+2.79e-02

Pos logit contributions
vel+0.031
ently+0.030
rol+0.030
 fault+0.029
 correctly+0.029
Neg logit contributions
 pet-0.014
s-0.012
 dog-0.012
 water-0.012
 went-0.011
Token they
Token ID484
Feature activation+5.65e-03

Pos logit contributions
ently+0.067
vel+0.066
 fault+0.062
 correctly+0.062
anish+0.061
Neg logit contributions
 pet-0.032
 dog-0.028
 water-0.028
s-0.027
 walk-0.025
Token go
Token ID467
Feature activation-2.31e-02

Pos logit contributions
 pet+0.003
 walk+0.002
 dog+0.002
 water+0.002
 wave+0.002
Neg logit contributions
 exactly-0.006
ently-0.006
vel-0.006
 forever-0.005
 correctly-0.005
Token to
Token ID284
Feature activation+4.43e-03

Pos logit contributions
 drops+0.033
 pet+0.033
 dog+0.032
s+0.031
 pans+0.030
Neg logit contributions
 forever-0.041
 exactly-0.041
 correctly-0.037
vel-0.036
ently-0.035
Token the
Token ID262
Feature activation-7.36e-03

Pos logit contributions
ently+0.010
vel+0.010
 exactly+0.010
 correctly+0.010
 forever+0.010
Neg logit contributions
 pet-0.005
 dog-0.004
 walk-0.004
 water-0.004
 wave-0.004
Token land
Token ID1956
Feature activation-1.11e-02

Pos logit contributions
 pet+0.009
 dog+0.008
 walk+0.008
 water+0.008
 drops+0.007
Neg logit contributions
ently-0.016
 exactly-0.016
vel-0.016
 forever-0.016
 correctly-0.016
Token of
Token ID286
Feature activation+8.89e-03

Pos logit contributions
 pet+0.012
 walk+0.011
 dog+0.011
s+0.010
 water+0.010
Neg logit contributions
vel-0.026
ently-0.026
mmm-0.025
anish-0.025
loe-0.024
Token Imp
Token ID9855
Feature activation+8.24e-02

Pos logit contributions
ently+0.020
vel+0.019
 exactly+0.019
 forever+0.019
 correctly+0.019
Neg logit contributions
 pet-0.011
 dog-0.010
 water-0.010
 walk-0.009
 drops-0.009
Tokenressive
Token ID3314
Feature activation+1.50e-02

Pos logit contributions
 exactly+0.095
 forever+0.092
ently+0.092
vel+0.090
 correctly+0.090
Neg logit contributions
 pet-0.181
 dog-0.170
 walk-0.164
 water-0.164
 drops-0.164
Token Things
Token ID11597
Feature activation-1.57e-02

Pos logit contributions
 exactly+0.033
ently+0.032
vel+0.032
 forever+0.031
mmm+0.031
Neg logit contributions
 pet-0.018
 dog-0.017
 water-0.016
s-0.015
 walk-0.015
Token.
Token ID13
Feature activation+5.50e-03

Pos logit contributions
 pet+0.018
 walk+0.016
 water+0.016
 dog+0.015
s+0.015
Neg logit contributions
ently-0.037
 exactly-0.036
vel-0.035
mmm-0.035
anish-0.034
Token
Token ID198
Feature activation+3.42e-03

Pos logit contributions
 exactly+0.012
ently+0.012
vel+0.012
 forever+0.012
 correctly+0.012
Neg logit contributions
 pet-0.006
 dog-0.005
s-0.005
 walk-0.005
 water-0.005
Token
Token ID198
Feature activation+3.52e-03

Pos logit contributions
 correctly+0.009
 exactly+0.009
vel+0.009
ently+0.009
 forever+0.009
Neg logit contributions
 pet-0.003
s-0.002
 dog-0.002
 walk-0.002
 water-0.002
Token heading
Token ID9087
Feature activation+1.76e-02

Pos logit contributions
 pet+0.039
 dog+0.036
 drops+0.035
 water+0.034
 big+0.032
Neg logit contributions
ently-0.051
anish-0.049
 forever-0.049
 correctly-0.047
ieved-0.047
Token to
Token ID284
Feature activation+1.52e-02

Pos logit contributions
vel+0.047
ently+0.047
mmm+0.046
aney+0.046
anish+0.046
Neg logit contributions
 pet-0.017
 water-0.013
 walk-0.012
 dog-0.012
 wave-0.011
Token the
Token ID262
Feature activation+5.08e-03

Pos logit contributions
ently+0.039
vel+0.038
 exactly+0.037
anish+0.037
 correctly+0.036
Neg logit contributions
 pet-0.016
 water-0.014
 walk-0.014
 dog-0.014
 wave-0.012
Token land
Token ID1956
Feature activation-4.18e-03

Pos logit contributions
 exactly+0.010
ently+0.010
 forever+0.010
vel+0.010
 correctly+0.010
Neg logit contributions
 pet-0.007
 dog-0.006
 walk-0.006
 water-0.006
 drops-0.006
Token of
Token ID286
Feature activation+1.13e-02

Pos logit contributions
s+0.004
 water+0.004
 dog+0.004
 pet+0.004
 wave+0.004
Neg logit contributions
vel-0.009
mmm-0.009
anish-0.009
ocolate-0.009
 fault-0.009
Token Imp
Token ID9855
Feature activation+8.23e-02

Pos logit contributions
ently+0.030
vel+0.029
ieved+0.028
ust+0.028
aney+0.028
Neg logit contributions
 pet-0.012
 water-0.012
 dog-0.011
 big-0.010
 bushes-0.009
Tokenressive
Token ID3314
Feature activation+1.35e-02

Pos logit contributions
 exactly+0.102
 forever+0.098
 correctly+0.094
ently+0.094
vel+0.092
Neg logit contributions
 pet-0.175
 walk-0.162
 dog-0.159
 water-0.154
 drops-0.151
Token Things
Token ID11597
Feature activation-1.56e-02

Pos logit contributions
 exactly+0.027
ently+0.026
vel+0.025
 forever+0.025
aney+0.025
Neg logit contributions
 pet-0.019
 dog-0.019
 water-0.018
 walk-0.017
s-0.017
Token,
Token ID11
Feature activation+1.77e-02

Pos logit contributions
 pet+0.019
 water+0.018
 walk+0.018
s+0.017
 dog+0.017
Neg logit contributions
 exactly-0.034
ently-0.034
vel-0.033
mmm-0.032
anish-0.032
Tokenâ
Token ID22940
Feature activation+5.16e-02

Pos logit contributions
ently+0.039
vel+0.036
anish+0.035
rol+0.034
 correctly+0.034
Neg logit contributions
 pet-0.025
 dog-0.023
 water-0.023
 walk-0.021
 wave-0.020
Token
Token ID26391
Feature activation+1.89e-02

Pos logit contributions
 exactly+0.059
 forever+0.057
ently+0.057
vel+0.056
 correctly+0.056
Neg logit contributions
 pet-0.114
 dog-0.106
 walk-0.104
 water-0.103
 drops-0.102
Token to
Token ID284
Feature activation+8.11e-03

Pos logit contributions
ently+0.024
vel+0.024
mmm+0.024
rol+0.024
anish+0.024
Neg logit contributions
 pet-0.008
 dog-0.006
 water-0.006
 walk-0.005
s-0.005
Token tie
Token ID9839
Feature activation+1.53e-02

Pos logit contributions
 exactly+0.019
ently+0.019
 forever+0.019
vel+0.019
 correctly+0.019
Neg logit contributions
 pet-0.008
 dog-0.007
 walk-0.007
 water-0.007
 drops-0.006
Token her
Token ID607
Feature activation-4.94e-03

Pos logit contributions
ently+0.032
vel+0.031
 exactly+0.031
mmm+0.030
 forever+0.030
Neg logit contributions
 pet-0.021
 water-0.018
 dog-0.018
 drops-0.018
s-0.017
Token teacher
Token ID4701
Feature activation+1.59e-02

Pos logit contributions
 pet+0.006
 dog+0.005
 walk+0.005
 water+0.005
 drops+0.005
Neg logit contributions
 exactly-0.011
ently-0.011
vel-0.011
mmm-0.011
 forever-0.011
Token's
Token ID338
Feature activation+1.18e-02

Pos logit contributions
ently+0.043
mmm+0.042
 exactly+0.041
ocolate+0.040
anish+0.040
Neg logit contributions
 pet-0.015
s-0.011
 walk-0.011
 dog-0.011
 water-0.009
Token sho
Token ID7354
Feature activation+8.22e-02

Pos logit contributions
mmm+0.027
 exactly+0.027
ently+0.027
vel+0.026
 correctly+0.026
Neg logit contributions
 pet-0.015
 dog-0.013
 walk-0.012
 water-0.012
 drops-0.011
Tokenel
Token ID417
Feature activation+1.00e-02

Pos logit contributions
 exactly+0.142
vel+0.137
 forever+0.136
 correctly+0.134
ently+0.132
Neg logit contributions
 pet-0.132
s-0.122
 dog-0.118
 walk-0.113
 water-0.110
Tokenaces
Token ID2114
Feature activation+3.83e-02

Pos logit contributions
mmm+0.015
 exactly+0.014
otion+0.014
 forever+0.014
 reality+0.014
Neg logit contributions
 pet-0.020
 dog-0.018
 water-0.018
 walk-0.017
s-0.017
Token together
Token ID1978
Feature activation+2.65e-02

Pos logit contributions
mmm+0.104
anish+0.102
ocolate+0.102
mmmm+0.100
loe+0.099
Neg logit contributions
s-0.025
 pet-0.024
 walk-0.019
 big-0.017
 water-0.017
Token.
Token ID13
Feature activation+1.83e-03

Pos logit contributions
mmm+0.082
ently+0.080
mmmm+0.080
ieved+0.079
ocolate+0.079
Neg logit contributions
 pet-0.013
s-0.007
 walk-0.007
 big-0.006
 water-0.006
Token Each
Token ID5501
Feature activation-1.28e-02

Pos logit contributions
vel+0.005
 reality+0.004
 forever+0.004
 exactly+0.004
ently+0.004
Neg logit contributions
 pet-0.002
 dog-0.002
s-0.001
 walk-0.001
 water-0.001
Token friends
Token ID2460
Feature activation+5.45e-03

Pos logit contributions
 forever+0.005
 exactly+0.005
ently+0.005
vel+0.004
 correctly+0.004
Neg logit contributions
 pet-0.003
s-0.003
 walk-0.002
 dog-0.002
 drops-0.002
Token helped
Token ID4193
Feature activation+1.48e-02

Pos logit contributions
ently+0.013
mmm+0.013
vel+0.013
ocolate+0.012
 correctly+0.012
Neg logit contributions
 pet-0.006
 dog-0.005
 walk-0.005
 went-0.005
 bark-0.005
Token her
Token ID607
Feature activation+1.03e-02

Pos logit contributions
 exactly+0.027
ently+0.027
vel+0.026
 forever+0.026
 correctly+0.026
Neg logit contributions
 pet-0.023
 walk-0.021
 dog-0.021
 water-0.020
s-0.019
Token loosen
Token ID40953
Feature activation+3.49e-02

Pos logit contributions
 exactly+0.025
ently+0.024
vel+0.024
 forever+0.024
 correctly+0.024
Neg logit contributions
 pet-0.011
 walk-0.009
 dog-0.009
 water-0.008
s-0.008
Token her
Token ID607
Feature activation-8.06e-04

Pos logit contributions
anish+0.085
mmm+0.084
vel+0.083
ently+0.082
ocolate+0.081
Neg logit contributions
 pet-0.042
 water-0.038
 drops-0.033
 dog-0.032
s-0.032
Token sho
Token ID7354
Feature activation+8.16e-02

Pos logit contributions
 pet+0.001
 dog+0.001
 walk+0.001
 water+0.001
 drops+0.001
Neg logit contributions
 exactly-0.002
 forever-0.002
ently-0.002
vel-0.002
 correctly-0.002
Tokenel
Token ID417
Feature activation+1.12e-02

Pos logit contributions
 exactly+0.127
vel+0.123
 forever+0.123
 correctly+0.121
ently+0.120
Neg logit contributions
 pet-0.144
 dog-0.131
s-0.130
 walk-0.126
 water-0.124
Tokenaces
Token ID2114
Feature activation+3.86e-02

Pos logit contributions
 exactly+0.015
 forever+0.015
mmm+0.015
 reality+0.014
vel+0.014
Neg logit contributions
 pet-0.023
 water-0.021
s-0.020
 dog-0.020
 walk-0.020
Token so
Token ID523
Feature activation+6.88e-03

Pos logit contributions
anish+0.094
mmm+0.094
vel+0.093
 exactly+0.092
ocolate+0.092
Neg logit contributions
s-0.031
 pet-0.031
 walk-0.029
 water-0.026
 dog-0.024
Token she
Token ID673
Feature activation+3.02e-02

Pos logit contributions
vel+0.013
ently+0.013
 exactly+0.012
 correctly+0.012
anish+0.012
Neg logit contributions
 pet-0.010
 dog-0.009
s-0.009
 water-0.009
 walk-0.009
Token could
Token ID714
Feature activation+3.40e-03

Pos logit contributions
 exactly+0.063
ently+0.063
vel+0.062
 forever+0.062
 correctly+0.062
Neg logit contributions
 pet-0.038
 dog-0.034
 walk-0.032
 water-0.032
 drops-0.031
Token just
Token ID655
Feature activation-2.70e-03

Pos logit contributions
ently+0.037
vel+0.034
anish+0.034
rum+0.033
mmm+0.033
Neg logit contributions
 pet-0.022
 walk-0.020
 dog-0.019
 water-0.019
 play-0.018
Token like
Token ID588
Feature activation-1.15e-02

Pos logit contributions
 pet+0.004
 walk+0.003
 water+0.003
 touch+0.003
 wave+0.003
Neg logit contributions
ently-0.006
loe-0.006
vel-0.006
anish-0.006
rol-0.006
Token in
Token ID287
Feature activation+1.13e-02

Pos logit contributions
 pet+0.013
 water+0.011
 dog+0.010
 big+0.010
 animals+0.009
Neg logit contributions
ently-0.029
ieved-0.029
vel-0.028
rum-0.028
anish-0.027
Token a
Token ID257
Feature activation+1.76e-02

Pos logit contributions
ently+0.032
vel+0.031
ieved+0.031
ust+0.031
rol+0.031
Neg logit contributions
 water-0.008
 big-0.008
 pet-0.008
 dog-0.007
 warm-0.006
Token fair
Token ID3148
Feature activation+1.49e-02

Pos logit contributions
ently+0.041
 exactly+0.038
 correctly+0.038
mmm+0.038
ocolate+0.037
Neg logit contributions
 pet-0.023
 dog-0.020
 walk-0.020
 water-0.019
 big-0.019
Tokenyt
Token ID20760
Feature activation+8.15e-02

Pos logit contributions
ently+0.031
 correctly+0.031
 exactly+0.031
vel+0.030
mmm+0.030
Neg logit contributions
 pet-0.020
 dog-0.018
 water-0.018
 walk-0.018
 wave-0.017
Tokenale
Token ID1000
Feature activation+1.39e-02

Pos logit contributions
vel+0.167
 exactly+0.167
mmm+0.164
anish+0.163
 accompl+0.161
Neg logit contributions
 pet-0.096
s-0.088
 walk-0.086
 dog-0.080
 water-0.079
Token!
Token ID0
Feature activation+1.73e-02

Pos logit contributions
ently+0.031
mmm+0.030
 exactly+0.030
ieved+0.029
anish+0.029
Neg logit contributions
 pet-0.017
 dog-0.016
 water-0.015
 walk-0.015
 drops-0.014
Token<|endoftext|>
Token ID50256
Feature activation-2.92e-02

Pos logit contributions
 exactly+0.044
vel+0.044
ently+0.044
 forever+0.043
 correctly+0.043
Neg logit contributions
 pet-0.014
 dog-0.012
s-0.012
 water-0.010
 walk-0.010
TokenOnce
Token ID7454
Feature activation+1.29e-02

Pos logit contributions
 pet+0.025
 dog+0.021
s+0.019
 water+0.019
 walk+0.018
Neg logit contributions
 exactly-0.075
ently-0.073
vel-0.072
 correctly-0.072
 forever-0.072
Token upon
Token ID2402
Feature activation-8.33e-03

Pos logit contributions
vel+0.027
mmm+0.025
 correctly+0.025
anish+0.025
ently+0.025
Neg logit contributions
s-0.017
 pet-0.016
 dog-0.016
 walk-0.015
 big-0.015
Token a
Token ID257
Feature activation-8.31e-03

Pos logit contributions
ently+0.043
mmm+0.042
vel+0.040
 correctly+0.040
 exactly+0.039
Neg logit contributions
 pet-0.023
 dog-0.020
 water-0.020
 hurt-0.020
 walk-0.019
Token big
Token ID1263
Feature activation-1.28e-02

Pos logit contributions
 pet+0.009
 dog+0.008
 walk+0.007
 water+0.007
 drops+0.007
Neg logit contributions
ently-0.020
 correctly-0.019
 exactly-0.019
vel-0.019
mmm-0.019
Token knot
Token ID29654
Feature activation+3.78e-02

Pos logit contributions
 pet+0.017
 dog+0.016
 walk+0.015
 water+0.014
 drops+0.014
Neg logit contributions
ently-0.029
 exactly-0.028
rol-0.028
 correctly-0.028
vel-0.028
Token in
Token ID287
Feature activation+1.70e-02

Pos logit contributions
mmm+0.102
mmmm+0.098
ently+0.097
ocolate+0.094
anish+0.094
Neg logit contributions
 pet-0.032
 dog-0.027
 walk-0.027
 water-0.025
s-0.025
Token his
Token ID465
Feature activation+2.61e-02

Pos logit contributions
ently+0.043
vel+0.042
mmm+0.042
 exactly+0.042
ieved+0.041
Neg logit contributions
 water-0.016
 pet-0.016
 big-0.014
 dog-0.013
 hurt-0.012
Token sho
Token ID7354
Feature activation+8.12e-02

Pos logit contributions
 exactly+0.050
 forever+0.050
vel+0.047
ently+0.046
 correctly+0.046
Neg logit contributions
 pet-0.033
 drops-0.030
s-0.030
 walk-0.029
 water-0.029
Tokenel
Token ID417
Feature activation+1.65e-02

Pos logit contributions
 exactly+0.119
vel+0.116
 forever+0.115
 correctly+0.113
ently+0.113
Neg logit contributions
 pet-0.152
 dog-0.140
s-0.139
 walk-0.133
 water-0.132
Tokenaces
Token ID2114
Feature activation+3.85e-02

Pos logit contributions
 exactly+0.026
 forever+0.025
vel+0.024
 correctly+0.024
ently+0.024
Neg logit contributions
 pet-0.030
 dog-0.028
 water-0.028
 walk-0.027
s-0.026
Token and
Token ID290
Feature activation-1.48e-02

Pos logit contributions
mmm+0.100
ocolate+0.096
anish+0.096
ently+0.095
mmmm+0.095
Neg logit contributions
 pet-0.030
s-0.029
 walk-0.025
 water-0.024
 dog-0.023
Token he
Token ID339
Feature activation+1.61e-02

Pos logit contributions
 pet+0.017
 dog+0.015
 water+0.014
 walk+0.014
 drops+0.014
Neg logit contributions
ently-0.033
 exactly-0.033
vel-0.032
 forever-0.032
 correctly-0.032
Token couldn
Token ID3521
Feature activation+9.77e-03

Pos logit contributions
ently+0.037
mmm+0.036
 exactly+0.036
vel+0.035
ocolate+0.035
Neg logit contributions
 pet-0.019
 dog-0.016
 walk-0.016
s-0.015
 drops-0.015
Token and
Token ID290
Feature activation+4.90e-04

Pos logit contributions
 pet+0.008
 dog+0.007
 water+0.007
s+0.006
 walk+0.006
Neg logit contributions
ently-0.021
vel-0.020
 correctly-0.020
 exactly-0.020
ieved-0.020
Token proud
Token ID6613
Feature activation-1.16e-02

Pos logit contributions
 exactly+0.001
 forever+0.001
 correctly+0.001
ently+0.001
vel+0.001
Neg logit contributions
 pet-0.001
 dog-0.001
 walk-0.001
 drops-0.000
 water-0.000
Token of
Token ID286
Feature activation+2.49e-02

Pos logit contributions
 pet+0.013
 dog+0.011
 water+0.011
s+0.010
 drops+0.010
Neg logit contributions
ently-0.027
vel-0.027
ieved-0.026
 exactly-0.026
ocolate-0.026
Token himself
Token ID2241
Feature activation+1.76e-02

Pos logit contributions
aney+0.057
vel+0.056
rol+0.056
ently+0.055
anish+0.055
Neg logit contributions
 water-0.028
 pet-0.028
 play-0.023
 dog-0.023
 big-0.021
Token for
Token ID329
Feature activation+2.18e-02

Pos logit contributions
anish+0.043
mmm+0.043
ocolate+0.042
ently+0.041
aney+0.041
Neg logit contributions
 pet-0.016
s-0.014
 dog-0.014
 walk-0.013
 water-0.012
Token accompl
Token ID6424
Feature activation+8.10e-02

Pos logit contributions
ently+0.044
aney+0.044
rol+0.043
ust+0.041
 fault+0.041
Neg logit contributions
 pet-0.033
 water-0.029
 dog-0.027
 walk-0.027
 w-0.026
Tokenishing
Token ID3929
Feature activation+2.07e-02

Pos logit contributions
vel+0.175
mmm+0.170
ocolate+0.169
otion+0.168
anish+0.167
Neg logit contributions
 pet-0.087
s-0.078
 walk-0.068
 water-0.067
 pets-0.066
Token such
Token ID884
Feature activation+4.61e-02

Pos logit contributions
vel+0.049
mmm+0.048
aney+0.047
ieved+0.047
ust+0.046
Neg logit contributions
 water-0.022
 pet-0.020
 dog-0.020
s-0.019
 big-0.017
Token a
Token ID257
Feature activation+3.66e-02

Pos logit contributions
otion+0.120
rol+0.120
ently+0.119
mmm+0.116
aney+0.115
Neg logit contributions
 pet-0.045
 strong-0.040
 water-0.040
 big-0.038
 warm-0.034
Token difficult
Token ID2408
Feature activation+2.73e-02

Pos logit contributions
ently+0.082
 correctly+0.078
 exactly+0.077
rol+0.077
vel+0.076
Neg logit contributions
 pet-0.048
 dog-0.041
 water-0.040
 walk-0.040
 strong-0.038
Token task
Token ID4876
Feature activation+4.76e-02

Pos logit contributions
rol+0.057
 exactly+0.057
ently+0.057
vel+0.056
aney+0.056
Neg logit contributions
 pet-0.037
 walk-0.035
 water-0.033
 dog-0.033
 wave-0.030
Token,
Token ID11
Feature activation+2.63e-02

Pos logit contributions
rol+0.106
ocolate+0.106
aney+0.104
mmm+0.103
ieved+0.102
Neg logit contributions
s-0.034
 pet-0.032
 dog-0.028
 walk-0.026
 big-0.026
Token they
Token ID484
Feature activation+8.98e-03

Pos logit contributions
vel+0.067
ently+0.065
anish+0.064
mmm+0.064
aney+0.063
Neg logit contributions
 pet-0.026
 dog-0.024
 water-0.023
 man-0.021
 drops-0.020
Token were
Token ID547
Feature activation+2.38e-02

Pos logit contributions
ently+0.021
mmm+0.021
ocolate+0.021
vel+0.021
 fault+0.020
Neg logit contributions
 pet-0.010
 walk-0.008
 went-0.008
s-0.008
 water-0.008
Token very
Token ID845
Feature activation-3.73e-03

Pos logit contributions
mmm+0.054
ently+0.053
anish+0.053
mmmm+0.052
ocolate+0.051
Neg logit contributions
 pet-0.032
 hurt-0.028
 big-0.026
 started-0.026
 strong-0.026
Token apolog
Token ID8453
Feature activation+2.13e-02

Pos logit contributions
 pet+0.004
 hurt+0.003
 water+0.003
 big+0.003
 strong+0.003
Neg logit contributions
ently-0.009
anish-0.008
mmm-0.008
vel-0.008
ocolate-0.008
Tokenetic
Token ID5139
Feature activation+8.02e-02

Pos logit contributions
 forever+0.029
 exactly+0.028
 correctly+0.028
vel+0.027
ently+0.027
Neg logit contributions
 pet-0.040
 walk-0.037
 dog-0.036
s-0.036
 wave-0.036
Token.
Token ID13
Feature activation+1.31e-02

Pos logit contributions
mmm+0.223
ocolate+0.221
vel+0.216
mmmm+0.216
 accompl+0.215
Neg logit contributions
 pet-0.053
s-0.051
 dog-0.047
 man-0.035
 water-0.035
Token
Token ID220
Feature activation-1.66e-02

Pos logit contributions
vel+0.031
 exactly+0.030
ently+0.030
 forever+0.029
 correctly+0.029
Neg logit contributions
 pet-0.014
 dog-0.012
s-0.012
 water-0.011
 drops-0.011
Token
Token ID198
Feature activation+1.61e-02

Pos logit contributions
 pet+0.010
s+0.010
 dog+0.009
 water+0.007
 walk+0.007
Neg logit contributions
 forever-0.044
 exactly-0.043
vel-0.043
 correctly-0.042
ently-0.042
Token
Token ID198
Feature activation+1.60e-02

Pos logit contributions
 exactly+0.044
 forever+0.044
 correctly+0.043
 fault+0.043
ently+0.042
Neg logit contributions
 pet-0.010
s-0.009
 dog-0.008
 water-0.007
 walk-0.006
TokenThe
Token ID464
Feature activation+3.83e-03

Pos logit contributions
 fault+0.042
 exactly+0.041
 forever+0.040
ently+0.040
 correctly+0.039
Neg logit contributions
 pet-0.014
s-0.012
 dog-0.011
 water-0.010
 walk-0.010
Token day
Token ID1110
Feature activation-2.32e-02

Pos logit contributions
 pet+0.000
 dog+0.000
 walk+0.000
 water+0.000
 drops+0.000
Neg logit contributions
ently-0.000
vel-0.000
 exactly-0.000
 correctly-0.000
 forever-0.000
Token ever
Token ID1683
Feature activation+1.35e-02

Pos logit contributions
 pet+0.039
 dog+0.035
 walk+0.034
s+0.034
 water+0.034
Neg logit contributions
ently-0.039
 exactly-0.039
 correctly-0.038
vel-0.038
 forever-0.037
Token!
Token ID0
Feature activation+1.23e-02

Pos logit contributions
 exactly+0.025
 forever+0.025
ently+0.025
vel+0.024
 correctly+0.024
Neg logit contributions
 pet-0.020
 dog-0.018
 walk-0.017
 water-0.017
 drops-0.017
Token<|endoftext|>
Token ID50256
Feature activation-3.06e-02

Pos logit contributions
 exactly+0.027
 forever+0.027
ently+0.027
 correctly+0.027
vel+0.027
Neg logit contributions
 pet-0.014
 dog-0.012
 walk-0.011
 water-0.011
 drops-0.011
TokenMor
Token ID20044
Feature activation+6.81e-02

Pos logit contributions
 pet+0.029
 dog+0.024
 water+0.022
 walk+0.022
s+0.022
Neg logit contributions
 exactly-0.075
ently-0.073
 forever-0.073
vel-0.073
 correctly-0.072
Tokenris
Token ID2442
Feature activation+8.02e-02

Pos logit contributions
 exactly+0.136
ently+0.133
vel+0.131
 forever+0.129
 correctly+0.128
Neg logit contributions
 pet-0.098
 walk-0.088
s-0.084
 dog-0.081
 water-0.079
Token was
Token ID373
Feature activation+3.65e-02

Pos logit contributions
vel+0.208
 exactly+0.203
 correctly+0.202
ently+0.197
 forever+0.196
Neg logit contributions
 pet-0.081
s-0.063
 dog-0.062
 walk-0.061
 went-0.058
Token very
Token ID845
Feature activation+2.38e-02

Pos logit contributions
ently+0.090
mmm+0.087
anish+0.087
 perspective+0.086
 correctly+0.086
Neg logit contributions
 pet-0.041
 dog-0.033
 strong-0.031
 big-0.031
 went-0.031
Token excited
Token ID6568
Feature activation+2.73e-02

Pos logit contributions
 exactly+0.055
ently+0.054
 forever+0.054
vel+0.053
 correctly+0.053
Neg logit contributions
 pet-0.026
 dog-0.022
 walk-0.021
 water-0.020
s-0.020
Token.
Token ID13
Feature activation+3.21e-02

Pos logit contributions
 exactly+0.060
 forever+0.059
ently+0.059
vel+0.058
 correctly+0.058
Neg logit contributions
 pet-0.032
 dog-0.028
 walk-0.026
 water-0.026
 drops-0.025
Token Today
Token ID6288
Feature activation-1.61e-02

Pos logit contributions
 exactly+0.073
ently+0.073
 forever+0.071
vel+0.071
 correctly+0.069
Neg logit contributions
 pet-0.037
 dog-0.032
 walk-0.030
 water-0.029
s-0.029
Token they
Token ID484
Feature activation-7.68e-03

Pos logit contributions
 pet+0.020
 dog+0.019
 water+0.018
 walk+0.017
 drops+0.016
Neg logit contributions
vel-0.040
ently-0.039
mmm-0.037
 correctly-0.037
 exactly-0.037
Token didn
Token ID1422
Feature activation-1.83e-02

Pos logit contributions
 pet+0.009
 dog+0.009
s+0.008
 pets+0.008
 drops+0.008
Neg logit contributions
 exactly-0.016
 forever-0.016
 correctly-0.015
vel-0.015
ently-0.015
Token't
Token ID470
Feature activation-1.27e-02

Pos logit contributions
 pet+0.027
 dog+0.024
 walk+0.024
s+0.023
 water+0.023
Neg logit contributions
 exactly-0.035
 forever-0.034
ently-0.034
vel-0.033
 correctly-0.033
Token get
Token ID651
Feature activation-1.16e-02

Pos logit contributions
 pet+0.015
 dog+0.014
 pets+0.013
 drops+0.013
ted+0.013
Neg logit contributions
 forever-0.027
vel-0.025
 correctly-0.025
 exactly-0.025
ide-0.023
Token it
Token ID340
Feature activation-1.20e-02

Pos logit contributions
 pet+0.014
 dog+0.013
 walk+0.013
s+0.012
 drops+0.012
Neg logit contributions
 exactly-0.024
 forever-0.023
 correctly-0.022
vel-0.022
ently-0.021
Token unst
Token ID15014
Feature activation+7.95e-02

Pos logit contributions
 pet+0.013
 dog+0.011
 walk+0.011
 water+0.011
s+0.011
Neg logit contributions
 exactly-0.027
ently-0.027
vel-0.027
mmm-0.026
 correctly-0.026
Tokenuck
Token ID1347
Feature activation+2.43e-02

Pos logit contributions
 forever+0.115
ently+0.113
 exactly+0.112
vel+0.108
 correctly+0.108
Neg logit contributions
 pet-0.147
 wave-0.137
 water-0.132
 walk-0.131
 dog-0.130
Token.
Token ID13
Feature activation+7.74e-03

Pos logit contributions
mmm+0.061
vel+0.061
ently+0.060
anish+0.060
 exactly+0.060
Neg logit contributions
 pet-0.020
 water-0.019
s-0.018
 dog-0.017
 drops-0.016
Token
Token ID198
Feature activation+6.88e-03

Pos logit contributions
vel+0.019
 exactly+0.019
 forever+0.019
ently+0.019
 correctly+0.019
Neg logit contributions
 pet-0.007
 dog-0.006
s-0.006
 water-0.006
 walk-0.005
Token
Token ID198
Feature activation+7.39e-03

Pos logit contributions
 correctly+0.018
 exactly+0.018
vel+0.018
 forever+0.018
ently+0.017
Neg logit contributions
 pet-0.005
s-0.005
 dog-0.004
 water-0.004
 walk-0.004
TokenJust
Token ID5703
Feature activation+1.06e-04

Pos logit contributions
 correctly+0.014
 fault+0.014
 forever+0.014
ently+0.014
 exactly+0.014
Neg logit contributions
 pet-0.011
s-0.010
 dog-0.009
 water-0.009
 walk-0.009
Token,
Token ID11
Feature activation+2.60e-02

Pos logit contributions
vel+0.018
mmm+0.018
ently+0.018
 exactly+0.018
anish+0.018
Neg logit contributions
 pet-0.007
s-0.006
 water-0.006
 dog-0.006
 walk-0.006
Token leaving
Token ID4305
Feature activation+1.00e-02

Pos logit contributions
ently+0.059
vel+0.058
mmm+0.056
anish+0.056
 correctly+0.056
Neg logit contributions
 pet-0.032
 dog-0.027
 water-0.027
 walk-0.025
 drops-0.025
Token it
Token ID340
Feature activation-3.36e-03

Pos logit contributions
vel+0.023
mmm+0.023
ently+0.022
anish+0.022
ieved+0.022
Neg logit contributions
 pet-0.011
 dog-0.010
 water-0.010
 drops-0.009
 walk-0.009
Token to
Token ID284
Feature activation+1.77e-02

Pos logit contributions
 pet+0.004
 water+0.003
 walk+0.003
s+0.003
 drops+0.003
Neg logit contributions
mmm-0.008
ently-0.008
vel-0.007
 exactly-0.007
ocolate-0.007
Token its
Token ID663
Feature activation+2.64e-02

Pos logit contributions
ently+0.042
vel+0.041
 correctly+0.040
 exactly+0.039
anish+0.039
Neg logit contributions
 pet-0.021
 water-0.019
 walk-0.017
 dog-0.017
 wave-0.017
Token former
Token ID1966
Feature activation+7.90e-02

Pos logit contributions
vel+0.057
ently+0.057
 correctly+0.056
mmm+0.055
 exactly+0.055
Neg logit contributions
 pet-0.035
 water-0.030
 dog-0.030
 walk-0.028
 drops-0.027
Token state
Token ID1181
Feature activation+1.90e-02

Pos logit contributions
 exactly+0.116
vel+0.115
 correctly+0.113
ently+0.111
 forever+0.111
Neg logit contributions
 pet-0.147
 water-0.135
 drops-0.132
 dog-0.130
 walk-0.129
Token.
Token ID13
Feature activation+1.00e-02

Pos logit contributions
mmm+0.049
vel+0.048
anish+0.047
ently+0.047
ieved+0.047
Neg logit contributions
 pet-0.015
 water-0.013
s-0.011
 dog-0.011
 walk-0.011
Token
Token ID220
Feature activation-1.62e-02

Pos logit contributions
vel+0.023
 exactly+0.023
ently+0.023
 forever+0.022
 correctly+0.022
Neg logit contributions
 pet-0.011
s-0.010
 dog-0.010
 water-0.009
 walk-0.008
Token
Token ID198
Feature activation+1.64e-02

Pos logit contributions
 pet+0.011
s+0.010
 dog+0.009
 water+0.008
 walk+0.007
Neg logit contributions
 forever-0.043
 exactly-0.042
vel-0.042
 correctly-0.042
ently-0.041
Token
Token ID198
Feature activation+1.58e-02

Pos logit contributions
 exactly+0.045
 forever+0.044
 correctly+0.044
ently+0.044
vel+0.043
Neg logit contributions
 pet-0.011
s-0.009
 dog-0.008
 water-0.007
 walk-0.006
Token
Token ID198
Feature activation+1.96e-02

Pos logit contributions
 exactly+0.061
 correctly+0.061
 forever+0.060
 fault+0.060
vel+0.058
Neg logit contributions
 pet-0.011
s-0.009
 dog-0.007
 big-0.007
 water-0.007
TokenMax
Token ID11518
Feature activation+7.17e-03

Pos logit contributions
 fault+0.048
 correctly+0.044
ently+0.044
 forever+0.044
 exactly+0.043
Neg logit contributions
 pet-0.022
Rex-0.019
s-0.018
 walk-0.017
 dog-0.017
Token was
Token ID373
Feature activation+1.92e-02

Pos logit contributions
ently+0.018
vel+0.018
 exactly+0.018
ocolate+0.017
mmm+0.017
Neg logit contributions
 pet-0.008
 dog-0.006
 bark-0.006
 walk-0.006
s-0.006
Token very
Token ID845
Feature activation-2.53e-03

Pos logit contributions
anish+0.048
ently+0.047
mmm+0.046
mmmm+0.044
 perspective+0.043
Neg logit contributions
 pet-0.026
 hurt-0.022
 strong-0.021
 big-0.020
 sleepy-0.020
Token apolog
Token ID8453
Feature activation+1.02e-02

Pos logit contributions
 pet+0.003
 dog+0.002
 water+0.002
 drops+0.002
 strong+0.002
Neg logit contributions
ently-0.006
anish-0.006
 forever-0.006
vel-0.006
 exactly-0.006
Tokenetic
Token ID5139
Feature activation+7.90e-02

Pos logit contributions
 forever+0.015
 exactly+0.015
 correctly+0.015
anish+0.015
vel+0.015
Neg logit contributions
 pet-0.018
s-0.016
 dog-0.016
 walk-0.016
 wave-0.016
Token and
Token ID290
Feature activation+1.35e-02

Pos logit contributions
ocolate+0.220
vel+0.220
 accompl+0.219
mmm+0.215
anish+0.212
Neg logit contributions
 pet-0.046
 dog-0.045
s-0.045
 man-0.038
 water-0.032
Token said
Token ID531
Feature activation+4.55e-04

Pos logit contributions
ently+0.031
vel+0.031
anish+0.030
 forever+0.030
mmm+0.030
Neg logit contributions
 pet-0.015
 dog-0.013
 water-0.013
 drops-0.012
 walk-0.012
Token,
Token ID11
Feature activation+2.78e-02

Pos logit contributions
ieved+0.001
ently+0.001
rol+0.001
ide+0.001
aney+0.001
Neg logit contributions
 pet-0.000
 dog-0.000
s-0.000
 strong-0.000
 big-0.000
Token
Token ID6184
Feature activation+3.64e-02

Pos logit contributions
ently+0.068
ieved+0.066
anish+0.064
rol+0.064
 fault+0.064
Neg logit contributions
 pet-0.031
 dog-0.029
 man-0.026
 bark-0.026
s-0.023
Token
Token ID95
Feature activation+3.90e-02

Pos logit contributions
 exactly+0.060
 forever+0.058
ently+0.058
 correctly+0.057
vel+0.056
Neg logit contributions
 pet-0.063
 dog-0.057
 walk-0.056
 water-0.054
 drops-0.054
Token the
Token ID262
Feature activation+6.51e-03

Pos logit contributions
 exactly+0.013
 forever+0.013
 correctly+0.012
ently+0.012
vel+0.012
Neg logit contributions
 pet-0.010
 dog-0.010
s-0.009
 walk-0.009
 drops-0.009
Token garden
Token ID11376
Feature activation+1.59e-02

Pos logit contributions
 forever+0.013
 exactly+0.013
ently+0.013
 correctly+0.012
vel+0.012
Neg logit contributions
 pet-0.006
s-0.006
 walk-0.006
 dog-0.006
 pets-0.006
Token.
Token ID13
Feature activation-1.41e-02

Pos logit contributions
 exactly+0.037
vel+0.037
 correctly+0.037
ently+0.037
 forever+0.036
Neg logit contributions
 pet-0.017
 dog-0.014
 water-0.014
 walk-0.013
s-0.013
Token It
Token ID632
Feature activation-3.88e-02

Pos logit contributions
 pet+0.024
 walk+0.022
 wave+0.022
 dog+0.022
 water+0.021
Neg logit contributions
 exactly-0.021
 forever-0.021
 correctly-0.020
ently-0.020
vel-0.020
Token was
Token ID373
Feature activation-1.05e-02

Pos logit contributions
 pet+0.052
 dog+0.049
 walk+0.047
 pets+0.046
 wave+0.043
Neg logit contributions
 forever-0.072
 exactly-0.072
 correctly-0.069
vel-0.068
ently-0.068
Token as
Token ID355
Feature activation-8.12e-02

Pos logit contributions
 pet+0.015
 water+0.013
 dog+0.012
 big+0.012
 drops+0.012
Neg logit contributions
ently-0.022
vel-0.022
 correctly-0.021
anish-0.021
 forever-0.021
Token tall
Token ID7331
Feature activation+2.06e-03

Pos logit contributions
 pet+0.083
 dog+0.071
 walk+0.067
 water+0.064
 drops+0.064
Neg logit contributions
 exactly-0.190
 forever-0.187
ently-0.185
 correctly-0.184
vel-0.183
Token as
Token ID355
Feature activation-6.15e-02

Pos logit contributions
 exactly+0.003
 forever+0.003
ently+0.003
 correctly+0.003
mmm+0.003
Neg logit contributions
 pet-0.003
 dog-0.003
s-0.003
 walk-0.003
 Claus-0.003
Token her
Token ID607
Feature activation-1.07e-02

Pos logit contributions
 pet+0.084
 dog+0.075
 water+0.074
 walk+0.069
 drops+0.068
Neg logit contributions
vel-0.125
ently-0.123
 exactly-0.120
 correctly-0.118
 forever-0.118
Token!
Token ID0
Feature activation-3.16e-03

Pos logit contributions
 pet+0.012
 dog+0.011
s+0.010
 drops+0.010
 walk+0.010
Neg logit contributions
 exactly-0.023
 forever-0.023
ently-0.022
 correctly-0.022
vel-0.021
Token She
Token ID1375
Feature activation-2.19e-03

Pos logit contributions
 pet+0.004
 dog+0.004
 walk+0.004
 water+0.004
 drops+0.004
Neg logit contributions
 exactly-0.006
 forever-0.006
ently-0.006
 correctly-0.006
vel-0.006
Token thick
Token ID6546
Feature activation-9.74e-03

Pos logit contributions
 exactly+0.028
 forever+0.028
vel+0.027
 correctly+0.027
ently+0.026
Neg logit contributions
 pet-0.012
 dog-0.010
s-0.010
 walk-0.010
 water-0.009
Token mushroom
Token ID28520
Feature activation+2.26e-02

Pos logit contributions
 pet+0.012
 dog+0.010
 walk+0.010
 water+0.009
 drops+0.009
Neg logit contributions
 exactly-0.022
ently-0.021
 correctly-0.021
 forever-0.021
vel-0.021
Token.
Token ID13
Feature activation-1.61e-02

Pos logit contributions
 exactly+0.044
 forever+0.043
vel+0.042
ently+0.042
 correctly+0.041
Neg logit contributions
 pet-0.031
 dog-0.028
 walk-0.027
 water-0.027
 drops-0.026
Token It
Token ID632
Feature activation-3.90e-02

Pos logit contributions
 pet+0.026
 dog+0.024
 walk+0.024
 wave+0.023
 water+0.023
Neg logit contributions
 exactly-0.026
 forever-0.025
 correctly-0.025
ently-0.025
vel-0.025
Token was
Token ID373
Feature activation-5.53e-03

Pos logit contributions
 pet+0.051
 dog+0.048
 walk+0.046
 pets+0.044
 water+0.043
Neg logit contributions
 exactly-0.075
 forever-0.074
 correctly-0.072
vel-0.070
ently-0.069
Token as
Token ID355
Feature activation-7.72e-02

Pos logit contributions
 pet+0.007
 dog+0.006
 water+0.006
 drops+0.006
 walk+0.006
Neg logit contributions
ently-0.012
vel-0.012
 correctly-0.011
 exactly-0.011
 forever-0.011
Token big
Token ID1263
Feature activation-1.95e-04

Pos logit contributions
 pet+0.068
 dog+0.058
 water+0.055
 drops+0.050
s+0.050
Neg logit contributions
ently-0.193
vel-0.190
 exactly-0.188
 forever-0.186
 correctly-0.185
Token as
Token ID355
Feature activation-5.35e-02

Pos logit contributions
 pet+0.000
s+0.000
 walk+0.000
 dog+0.000
 wave+0.000
Neg logit contributions
 exactly-0.000
 forever-0.000
 correctly-0.000
vel-0.000
mmm-0.000
Token her
Token ID607
Feature activation-9.08e-03

Pos logit contributions
 pet+0.071
 dog+0.063
 water+0.063
 walk+0.058
 drops+0.057
Neg logit contributions
vel-0.112
ently-0.111
anish-0.105
 exactly-0.105
ocolate-0.104
Token head
Token ID1182
Feature activation-2.67e-02

Pos logit contributions
 pet+0.011
 dog+0.010
 drops+0.010
 water+0.010
s+0.010
Neg logit contributions
 exactly-0.019
 forever-0.019
ently-0.018
 correctly-0.018
vel-0.017
Token!
Token ID0
Feature activation-5.74e-03

Pos logit contributions
 pet+0.032
 dog+0.028
 walk+0.026
 water+0.026
 drops+0.026
Neg logit contributions
 exactly-0.057
 forever-0.056
ently-0.056
vel-0.056
 correctly-0.056
Token sun
Token ID4252
Feature activation-2.77e-02

Pos logit contributions
 exactly+0.034
ently+0.034
vel+0.033
 correctly+0.033
 forever+0.033
Neg logit contributions
 pet-0.014
 dog-0.013
 water-0.011
 walk-0.011
 drops-0.011
Tokenflower
Token ID25547
Feature activation-8.70e-04

Pos logit contributions
 pet+0.034
 dog+0.030
 walk+0.028
 water+0.028
 drops+0.027
Neg logit contributions
 exactly-0.060
 forever-0.058
ently-0.058
vel-0.058
 correctly-0.058
Token.
Token ID13
Feature activation-7.25e-04

Pos logit contributions
 pet+0.001
 dog+0.001
 water+0.001
 walk+0.001
s+0.001
Neg logit contributions
 exactly-0.002
ently-0.002
 correctly-0.002
vel-0.002
 forever-0.002
Token It
Token ID632
Feature activation-3.95e-02

Pos logit contributions
 pet+0.001
 dog+0.001
 water+0.001
 walk+0.001
 drops+0.001
Neg logit contributions
 exactly-0.002
ently-0.002
vel-0.002
 forever-0.002
ieved-0.002
Token was
Token ID373
Feature activation-1.32e-02

Pos logit contributions
 pet+0.049
 dog+0.044
 walk+0.042
 water+0.040
 drops+0.040
Neg logit contributions
 exactly-0.082
 forever-0.081
ently-0.079
vel-0.079
 correctly-0.079
Token as
Token ID355
Feature activation-7.66e-02

Pos logit contributions
 pet+0.014
 water+0.012
 big+0.012
 strong+0.011
 warm+0.010
Neg logit contributions
ently-0.035
vel-0.034
anish-0.033
 correctly-0.033
mmm-0.032
Token yellow
Token ID7872
Feature activation-1.16e-02

Pos logit contributions
 pet+0.071
 dog+0.059
 walk+0.056
 drops+0.053
 water+0.052
Neg logit contributions
 exactly-0.187
 forever-0.184
ently-0.181
 correctly-0.180
vel-0.179
Token as
Token ID355
Feature activation-6.77e-02

Pos logit contributions
 pet+0.016
 dog+0.014
 water+0.014
 walk+0.013
 drops+0.013
Neg logit contributions
 exactly-0.023
 forever-0.023
vel-0.023
ently-0.023
 correctly-0.023
Token the
Token ID262
Feature activation-1.86e-03

Pos logit contributions
 pet+0.095
 dog+0.086
 water+0.086
 walk+0.080
 drops+0.079
Neg logit contributions
vel-0.133
ently-0.132
 exactly-0.130
 correctly-0.128
 forever-0.127
Token sun
Token ID4252
Feature activation-4.39e-02

Pos logit contributions
 pet+0.001
s+0.001
 dog+0.001
 drops+0.001
 walk+0.001
Neg logit contributions
 forever-0.004
 exactly-0.004
ently-0.004
 correctly-0.004
vel-0.004
Token and
Token ID290
Feature activation-2.35e-02

Pos logit contributions
 pet+0.056
 dog+0.051
 walk+0.049
 drops+0.047
 water+0.046
Neg logit contributions
 exactly-0.085
ently-0.085
 forever-0.085
 correctly-0.084
vel-0.081
Token.
Token ID13
Feature activation-1.26e-04

Pos logit contributions
 pet+0.007
 water+0.006
s+0.006
 drops+0.005
 dog+0.005
Neg logit contributions
mmm-0.025
vel-0.025
ieved-0.024
ocolate-0.024
anish-0.024
Token The
Token ID383
Feature activation-1.08e-02

Pos logit contributions
 pet+0.000
 dog+0.000
 water+0.000
 drops+0.000
 walk+0.000
Neg logit contributions
 exactly-0.000
 forever-0.000
vel-0.000
ently-0.000
 correctly-0.000
Token ground
Token ID2323
Feature activation-2.96e-02

Pos logit contributions
s+0.012
 pet+0.010
ted+0.009
 Claus+0.009
Rex+0.009
Neg logit contributions
 exactly-0.020
 forever-0.020
ently-0.019
vel-0.019
 reality-0.019
Token now
Token ID783
Feature activation-2.38e-02

Pos logit contributions
 pet+0.029
 dog+0.025
 walk+0.024
 water+0.023
 drops+0.023
Neg logit contributions
 exactly-0.070
 forever-0.069
ently-0.069
vel-0.068
 correctly-0.068
Token looked
Token ID3114
Feature activation+4.33e-03

Pos logit contributions
 pet+0.020
 water+0.017
 drops+0.017
 dog+0.017
s+0.016
Neg logit contributions
vel-0.061
ently-0.060
 exactly-0.060
mmm-0.060
ocolate-0.059
Token as
Token ID355
Feature activation-7.65e-02

Pos logit contributions
ently+0.009
ocolate+0.009
anish+0.009
loe+0.009
mmm+0.009
Neg logit contributions
 pet-0.006
 drops-0.005
 water-0.005
 big-0.005
 strong-0.005
Token clean
Token ID3424
Feature activation+7.80e-03

Pos logit contributions
 pet+0.094
 dog+0.083
 water+0.082
 walk+0.078
 drops+0.077
Neg logit contributions
vel-0.161
ently-0.161
 exactly-0.160
 forever-0.157
 correctly-0.157
Token as
Token ID355
Feature activation-5.91e-02

Pos logit contributions
vel+0.019
ocolate+0.018
anish+0.018
 exactly+0.018
aney+0.018
Neg logit contributions
 pet-0.008
 water-0.008
 drops-0.007
 dog-0.007
s-0.007
Token before
Token ID878
Feature activation-2.36e-02

Pos logit contributions
 pet+0.079
 water+0.070
 dog+0.069
 walk+0.065
 drops+0.063
Neg logit contributions
vel-0.122
ently-0.119
 exactly-0.117
anish-0.116
 correctly-0.116
Token and
Token ID290
Feature activation-4.71e-03

Pos logit contributions
 pet+0.025
 water+0.021
 dog+0.021
 walk+0.020
 drops+0.019
Neg logit contributions
vel-0.056
 exactly-0.056
 correctly-0.055
ently-0.055
anish-0.054
Token the
Token ID262
Feature activation-5.37e-03

Pos logit contributions
 pet+0.006
 water+0.006
 drops+0.005
 walk+0.005
 dog+0.005
Neg logit contributions
ently-0.010
vel-0.010
ocolate-0.010
mmm-0.010
anish-0.010
Token plates
Token ID13375
Feature activation-9.17e-03

Pos logit contributions
 pet+0.086
 dog+0.077
 water+0.077
 walk+0.069
 drops+0.069
Neg logit contributions
ently-0.162
vel-0.162
 forever-0.158
 exactly-0.156
ieved-0.156
Token,
Token ID11
Feature activation+3.36e-04

Pos logit contributions
 pet+0.010
 water+0.009
 dog+0.009
s+0.009
 walk+0.008
Neg logit contributions
 exactly-0.021
ently-0.021
vel-0.021
ieved-0.020
mmm-0.020
Token and
Token ID290
Feature activation-2.64e-02

Pos logit contributions
 exactly+0.001
ently+0.001
 forever+0.001
vel+0.001
 correctly+0.001
Neg logit contributions
 pet-0.000
 water-0.000
 dog-0.000
 drops-0.000
 walk-0.000
Token some
Token ID617
Feature activation-7.64e-03

Pos logit contributions
 pet+0.032
 water+0.028
 dog+0.028
 walk+0.027
 drops+0.026
Neg logit contributions
ently-0.058
 exactly-0.058
 forever-0.057
vel-0.056
 correctly-0.055
Token sticks
Token ID16461
Feature activation+1.20e-02

Pos logit contributions
 pet+0.008
 water+0.007
 dog+0.007
 drops+0.006
 walk+0.006
Neg logit contributions
ently-0.018
 exactly-0.018
vel-0.018
 forever-0.018
 correctly-0.017
Token as
Token ID355
Feature activation-7.63e-02

Pos logit contributions
mmm+0.033
ieved+0.033
vel+0.033
ently+0.032
loe+0.032
Neg logit contributions
 pet-0.009
 water-0.008
s-0.008
 dog-0.007
 walk-0.006
Token forks
Token ID43378
Feature activation-2.41e-03

Pos logit contributions
 pet+0.088
 water+0.078
 dog+0.077
 walk+0.071
 drops+0.069
Neg logit contributions
ently-0.171
vel-0.171
 forever-0.167
 exactly-0.165
ieved-0.165
Token and
Token ID290
Feature activation-2.87e-02

Pos logit contributions
 pet+0.002
s+0.002
 water+0.002
 dog+0.002
 walk+0.001
Neg logit contributions
ieved-0.006
mmm-0.006
ently-0.006
 exactly-0.006
vel-0.006
Token knives
Token ID23110
Feature activation+2.51e-02

Pos logit contributions
 pet+0.032
 dog+0.027
 water+0.026
 walk+0.026
 drops+0.025
Neg logit contributions
 exactly-0.066
ently-0.065
 forever-0.065
vel-0.064
 correctly-0.063
Token.
Token ID13
Feature activation-6.53e-03

Pos logit contributions
 exactly+0.064
ently+0.064
ieved+0.064
vel+0.064
mmm+0.063
Neg logit contributions
 pet-0.020
s-0.019
 water-0.018
 dog-0.016
 walk-0.015
Token
Token ID198
Feature activation+1.31e-04

Pos logit contributions
 pet+0.006
s+0.005
 dog+0.005
 water+0.005
 drops+0.004
Neg logit contributions
 forever-0.017
vel-0.017
 exactly-0.016
ently-0.016
 correctly-0.016
Token cried
Token ID16896
Feature activation-4.13e-02

Pos logit contributions
 pet+0.003
 dog+0.002
 walk+0.002
 pets+0.002
 water+0.002
Neg logit contributions
 exactly-0.005
 forever-0.005
vel-0.005
 correctly-0.005
ently-0.005
Token.
Token ID13
Feature activation-1.26e-02

Pos logit contributions
 pet+0.062
 walk+0.057
 dog+0.056
 wave+0.054
 pans+0.053
Neg logit contributions
 forever-0.073
 exactly-0.070
 correctly-0.068
vel-0.067
ently-0.066
Token "
Token ID366
Feature activation-2.34e-02

Pos logit contributions
 pet+0.016
 dog+0.014
 walk+0.013
 water+0.013
 drops+0.013
Neg logit contributions
 exactly-0.027
 forever-0.026
ently-0.026
 correctly-0.026
vel-0.026
TokenThe
Token ID464
Feature activation-1.98e-02

Pos logit contributions
 pet+0.037
 walk+0.034
 pets+0.033
 dog+0.032
 wave+0.031
Neg logit contributions
vel-0.038
anish-0.036
 forever-0.035
 correctly-0.034
mmm-0.034
Token ost
Token ID23619
Feature activation+1.01e-02

Pos logit contributions
 pet+0.020
 dog+0.018
 water+0.016
 walk+0.016
 drops+0.016
Neg logit contributions
 exactly-0.047
ently-0.046
 forever-0.046
vel-0.046
 correctly-0.046
Tokenric
Token ID1173
Feature activation-7.61e-02

Pos logit contributions
 forever+0.013
 exactly+0.012
 correctly+0.012
vel+0.012
ently+0.012
Neg logit contributions
 pet-0.021
 dog-0.019
s-0.019
 water-0.019
 drops-0.019
Tokenhes
Token ID956
Feature activation-4.50e-02

Pos logit contributions
 pet+0.069
 water+0.067
s+0.066
 dog+0.065
 drops+0.063
Neg logit contributions
 forever-0.184
 exactly-0.174
mmm-0.171
 correctly-0.171
 perspective-0.168
Token are
Token ID389
Feature activation-2.81e-03

Pos logit contributions
 water+0.051
 dog+0.050
s+0.048
 pet+0.046
 drops+0.044
Neg logit contributions
 exactly-0.108
 accompl-0.108
vel-0.107
mmm-0.107
ently-0.106
Token mean
Token ID1612
Feature activation-1.31e-02

Pos logit contributions
 pet+0.003
 water+0.003
 dog+0.003
 drops+0.003
 walk+0.002
Neg logit contributions
ently-0.007
 exactly-0.006
 forever-0.006
 correctly-0.006
vel-0.006
Token!
Token ID0
Feature activation+8.28e-03

Pos logit contributions
 pet+0.021
 dog+0.020
 water+0.019
s+0.019
 drops+0.018
Neg logit contributions
 exactly-0.023
ently-0.023
 forever-0.022
vel-0.022
 correctly-0.022
Token They
Token ID1119
Feature activation-1.78e-02

Pos logit contributions
 exactly+0.017
 forever+0.016
ently+0.016
 correctly+0.016
vel+0.016
Neg logit contributions
 pet-0.012
 dog-0.012
s-0.010
 drops-0.010
 water-0.010
Token the
Token ID262
Feature activation+3.72e-03

Pos logit contributions
 pet+0.001
 water+0.001
 dog+0.001
 walk+0.001
 drops+0.001
Neg logit contributions
ently-0.002
 exactly-0.002
vel-0.002
 forever-0.001
 correctly-0.001
Token sky
Token ID6766
Feature activation-1.38e-02

Pos logit contributions
 forever+0.008
 exactly+0.008
vel+0.008
 correctly+0.008
ently+0.007
Neg logit contributions
 pet-0.004
s-0.004
 dog-0.004
 walk-0.003
 drops-0.003
Token.
Token ID13
Feature activation-5.00e-03

Pos logit contributions
 pet+0.013
 water+0.011
 dog+0.010
s+0.010
 walk+0.010
Neg logit contributions
ently-0.034
 exactly-0.034
vel-0.034
 correctly-0.034
 forever-0.033
Token It
Token ID632
Feature activation-3.84e-02

Pos logit contributions
 pet+0.006
 dog+0.005
 water+0.005
 walk+0.005
 bark+0.005
Neg logit contributions
 exactly-0.012
ently-0.011
vel-0.011
 forever-0.011
ieved-0.011
Token was
Token ID373
Feature activation-1.18e-02

Pos logit contributions
 pet+0.053
 dog+0.049
 walk+0.047
 pets+0.046
 drops+0.045
Neg logit contributions
 forever-0.071
 exactly-0.070
 correctly-0.067
ently-0.067
vel-0.066
Token as
Token ID355
Feature activation-7.60e-02

Pos logit contributions
 pet+0.013
 water+0.012
 big+0.011
 dog+0.010
 strong+0.010
Neg logit contributions
ently-0.030
vel-0.029
anish-0.028
 correctly-0.028
mmm-0.027
Token high
Token ID1029
Feature activation+1.73e-03

Pos logit contributions
 pet+0.079
 dog+0.066
 walk+0.065
 drops+0.061
s+0.059
Neg logit contributions
 exactly-0.178
 forever-0.175
 correctly-0.171
ently-0.169
vel-0.167
Token as
Token ID355
Feature activation-5.73e-02

Pos logit contributions
vel+0.004
 exactly+0.004
ently+0.003
 forever+0.003
 correctly+0.003
Neg logit contributions
 pet-0.002
 dog-0.002
 water-0.002
 drops-0.002
 walk-0.002
Token could
Token ID714
Feature activation-1.78e-02

Pos logit contributions
 pet+0.085
 water+0.077
 dog+0.077
 walk+0.073
s+0.072
Neg logit contributions
vel-0.108
ently-0.105
 exactly-0.105
 correctly-0.102
 forever-0.101
Token be
Token ID307
Feature activation-9.61e-03

Pos logit contributions
 pet+0.018
 walk+0.015
 water+0.015
 dog+0.015
s+0.014
Neg logit contributions
 exactly-0.042
vel-0.042
ently-0.041
mmm-0.041
 correctly-0.040
Token and
Token ID290
Feature activation-5.96e-03

Pos logit contributions
 pet+0.012
 dog+0.011
 walk+0.010
 drops+0.010
s+0.009
Neg logit contributions
 exactly-0.020
 forever-0.020
ently-0.020
 correctly-0.019
vel-0.019
Token the
Token ID262
Feature activation-1.67e-02

Pos logit contributions
 exactly+0.015
 forever+0.015
ently+0.015
 correctly+0.014
vel+0.014
Neg logit contributions
 pet-0.013
 dog-0.012
 walk-0.012
 water-0.012
 drops-0.011
Token dress
Token ID6576
Feature activation-8.02e-03

Pos logit contributions
 pet+0.021
 dog+0.018
 walk+0.018
 water+0.017
 drops+0.016
Neg logit contributions
 exactly-0.036
ently-0.035
vel-0.035
 forever-0.035
 correctly-0.035
Token and
Token ID290
Feature activation-2.35e-02

Pos logit contributions
 pet+0.007
s+0.006
 walk+0.006
 dog+0.006
 water+0.006
Neg logit contributions
vel-0.020
 exactly-0.019
mmm-0.019
anish-0.019
 correctly-0.019
Token it
Token ID340
Feature activation-1.95e-02

Pos logit contributions
 pet+0.027
 dog+0.024
 walk+0.023
 drops+0.022
s+0.022
Neg logit contributions
 exactly-0.051
 forever-0.050
 correctly-0.049
ently-0.049
vel-0.048
Token looked
Token ID3114
Feature activation+1.18e-02

Pos logit contributions
 pet+0.024
 dog+0.021
 walk+0.020
 water+0.019
 drops+0.019
Neg logit contributions
 exactly-0.042
 forever-0.041
ently-0.040
 correctly-0.040
vel-0.040
Token as
Token ID355
Feature activation-7.57e-02

Pos logit contributions
ently+0.026
anish+0.025
ocolate+0.025
loe+0.025
vel+0.024
Neg logit contributions
 pet-0.015
 water-0.013
 drops-0.013
 big-0.013
 warm-0.012
Token good
Token ID922
Feature activation-3.86e-04

Pos logit contributions
 pet+0.096
 dog+0.084
 walk+0.081
 water+0.080
 drops+0.079
Neg logit contributions
 exactly-0.158
 forever-0.155
ently-0.155
vel-0.154
 correctly-0.153
Token as
Token ID355
Feature activation-4.95e-02

Pos logit contributions
 pet+0.001
 dog+0.000
 walk+0.000
 water+0.000
 drops+0.000
Neg logit contributions
 exactly-0.001
ently-0.001
 forever-0.001
vel-0.001
 correctly-0.001
Token new
Token ID649
Feature activation+5.72e-03

Pos logit contributions
 pet+0.063
 dog+0.055
 walk+0.053
 water+0.053
 drops+0.052
Neg logit contributions
 exactly-0.103
ently-0.101
 forever-0.101
vel-0.101
 correctly-0.100
Token.
Token ID13
Feature activation+9.53e-03

Pos logit contributions
ently+0.013
 exactly+0.013
vel+0.013
 correctly+0.013
 forever+0.013
Neg logit contributions
 pet-0.006
 dog-0.005
 walk-0.005
 water-0.005
 drops-0.005
Token␣
Token ID220
Feature activation-2.32e-02

Pos logit contributions
ently+0.019
 exactly+0.019
vel+0.019
 correctly+0.019
 forever+0.019
Neg logit contributions
 pet-0.013
 dog-0.012
 water-0.011
s-0.011
 drops-0.011
Token the
Token ID262
Feature activation-7.88e-03

Pos logit contributions
vel+0.060
ently+0.059
 forever+0.058
mmm+0.057
anish+0.057
Neg logit contributions
 pet-0.025
 dog-0.022
 water-0.020
s-0.019
 drops-0.019
Token doll
Token ID3654
Feature activation-1.00e-02

Pos logit contributions
 pet+0.008
 dog+0.007
 walk+0.006
 water+0.006
 drops+0.006
Neg logit contributions
ently-0.020
 exactly-0.020
vel-0.019
 forever-0.019
 correctly-0.019
Token and
Token ID290
Feature activation-1.68e-02

Pos logit contributions
 pet+0.015
s+0.013
 walk+0.013
 dog+0.013
 water+0.013
Neg logit contributions
ently-0.019
vel-0.018
 exactly-0.018
 forever-0.018
mmm-0.018
Token it
Token ID340
Feature activation-9.59e-03

Pos logit contributions
 pet+0.020
 dog+0.018
 water+0.017
 walk+0.017
 drops+0.016
Neg logit contributions
ently-0.037
vel-0.037
 exactly-0.036
 forever-0.036
 correctly-0.036
Token looked
Token ID3114
Feature activation+1.86e-02

Pos logit contributions
 pet+0.010
 dog+0.009
 walk+0.009
 water+0.008
 drops+0.008
Neg logit contributions
 exactly-0.022
ently-0.021
 forever-0.021
vel-0.021
 correctly-0.021
Token as
Token ID355
Feature activation-7.56e-02

Pos logit contributions
ently+0.039
loe+0.039
anish+0.039
ocolate+0.038
mmm+0.036
Neg logit contributions
 pet-0.026
 drops-0.023
 big-0.023
 water-0.022
 strong-0.022
Token good
Token ID922
Feature activation+4.49e-03

Pos logit contributions
 pet+0.079
 water+0.071
 dog+0.069
 strong+0.064
 big+0.064
Neg logit contributions
vel-0.174
ently-0.173
mmm-0.167
 exactly-0.167
anish-0.165
Token as
Token ID355
Feature activation-4.87e-02

Pos logit contributions
vel+0.009
ently+0.009
 exactly+0.009
 correctly+0.009
anish+0.009
Neg logit contributions
 pet-0.006
 water-0.005
 dog-0.005
 walk-0.005
 drops-0.005
Token new
Token ID649
Feature activation+8.89e-03

Pos logit contributions
 pet+0.057
 water+0.050
 dog+0.050
 walk+0.046
s+0.045
Neg logit contributions
vel-0.108
ently-0.106
anish-0.103
ieved-0.103
 correctly-0.102
Token.
Token ID13
Feature activation+1.32e-02

Pos logit contributions
ently+0.021
vel+0.021
 exactly+0.021
ieved+0.021
aney+0.021
Neg logit contributions
 pet-0.009
 dog-0.008
 walk-0.007
 drops-0.007
s-0.007
Token The
Token ID383
Feature activation-4.46e-03

Pos logit contributions
vel+0.034
ently+0.032
gh+0.031
anish+0.031
 forever+0.031
Neg logit contributions
 pet-0.014
 dog-0.012
 water-0.011
s-0.011
 pets-0.010
Token it
Token ID340
Feature activation-1.49e-02

Pos logit contributions
vel+0.008
ently+0.008
 exactly+0.008
mmm+0.008
 correctly+0.008
Neg logit contributions
 pet-0.005
 water-0.004
s-0.004
 dog-0.004
 drops-0.004
Token,
Token ID11
Feature activation+9.41e-03

Pos logit contributions
 pet+0.020
 water+0.018
 walk+0.018
 dog+0.017
s+0.017
Neg logit contributions
 exactly-0.029
ently-0.029
vel-0.029
 correctly-0.029
mmm-0.028
Token the
Token ID262
Feature activation-2.69e-03

Pos logit contributions
vel+0.023
ently+0.023
 correctly+0.022
rum+0.022
anish+0.022
Neg logit contributions
 pet-0.011
 dog-0.009
 water-0.009
 walk-0.008
 pets-0.008
Token shirt
Token ID10147
Feature activation-1.98e-02

Pos logit contributions
 pet+0.003
 dog+0.003
 walk+0.002
s+0.002
 water+0.002
Neg logit contributions
 exactly-0.006
 forever-0.006
ently-0.006
 correctly-0.006
vel-0.006
Token looked
Token ID3114
Feature activation+1.06e-02

Pos logit contributions
 pet+0.019
 dog+0.016
 water+0.016
 drops+0.015
s+0.015
Neg logit contributions
 exactly-0.047
vel-0.047
ently-0.046
 forever-0.046
 correctly-0.046
Token as
Token ID355
Feature activation-7.53e-02

Pos logit contributions
ently+0.022
anish+0.021
ocolate+0.021
vel+0.021
loe+0.020
Neg logit contributions
 pet-0.015
 water-0.013
 drops-0.013
s-0.012
 big-0.012
Token good
Token ID922
Feature activation-3.26e-03

Pos logit contributions
 pet+0.084
 water+0.078
 dog+0.075
 drops+0.071
s+0.069
Neg logit contributions
vel-0.167
ently-0.166
 exactly-0.161
 correctly-0.160
 forever-0.159
Token as
Token ID355
Feature activation-5.10e-02

Pos logit contributions
 pet+0.004
 dog+0.004
 water+0.004
 drops+0.004
 walk+0.004
Neg logit contributions
 exactly-0.007
ently-0.007
vel-0.007
 correctly-0.006
 forever-0.006
Token new
Token ID649
Feature activation+5.09e-03

Pos logit contributions
 pet+0.067
 water+0.060
 dog+0.058
 walk+0.054
s+0.053
Neg logit contributions
vel-0.107
ently-0.105
anish-0.102
ieved-0.101
 correctly-0.101
Token.
Token ID13
Feature activation+7.11e-03

Pos logit contributions
ently+0.012
 exactly+0.012
vel+0.012
 correctly+0.012
mmm+0.012
Neg logit contributions
 pet-0.005
 dog-0.004
 water-0.004
 walk-0.004
 drops-0.004
Token Lily
Token ID20037
Feature activation+6.40e-03

Pos logit contributions
vel+0.017
ently+0.016
 correctly+0.016
 exactly+0.016
 forever+0.016
Neg logit contributions
 pet-0.008
 dog-0.007
s-0.007
 water-0.007
 drops-0.006
Token them
Token ID606
Feature activation-2.71e-02

Pos logit contributions
 pet+0.007
 dog+0.007
 walk+0.006
 water+0.006
 drops+0.006
Neg logit contributions
 exactly-0.008
 forever-0.008
ently-0.008
 correctly-0.008
vel-0.008
Token play
Token ID711
Feature activation-5.50e-03

Pos logit contributions
 pet+0.029
 dog+0.025
 walk+0.023
 water+0.023
 drops+0.023
Neg logit contributions
 exactly-0.062
 forever-0.061
ently-0.061
 correctly-0.060
vel-0.060
Token.
Token ID13
Feature activation-5.74e-03

Pos logit contributions
 pet+0.005
 dog+0.004
 water+0.004
s+0.003
 walk+0.003
Neg logit contributions
 exactly-0.015
mmm-0.015
ently-0.015
 fault-0.014
aman-0.014
Token As
Token ID1081
Feature activation-5.28e-02

Pos logit contributions
 pet+0.008
 walk+0.007
 wave+0.007
 dog+0.007
 water+0.007
Neg logit contributions
 exactly-0.011
 forever-0.010
ently-0.010
mmm-0.010
 correctly-0.010
Token the
Token ID262
Feature activation-2.45e-02

Pos logit contributions
 pet+0.079
 drops+0.072
 Claus+0.072
 walk+0.072
 dog+0.071
Neg logit contributions
 forever-0.091
 exactly-0.091
mmm-0.078
 correctly-0.075
 reality-0.074
Token sun
Token ID4252
Feature activation-7.52e-02

Pos logit contributions
s+0.032
 pet+0.027
 Claus+0.026
 sniff+0.026
ed+0.026
Neg logit contributions
 course-0.033
 exactly-0.033
 forever-0.033
ocolate-0.033
mmm-0.032
Token went
Token ID1816
Feature activation-3.31e-02

Pos logit contributions
 robbers+0.081
 packages+0.076
 Claus+0.074
 pans+0.074
 pet+0.073
Neg logit contributions
 exactly-0.127
 forever-0.126
rol-0.126
 unfortunately-0.125
ently-0.122
Token down
Token ID866
Feature activation-2.80e-02

Pos logit contributions
 pet+0.042
 walk+0.037
 dog+0.037
 human+0.035
 pets+0.035
Neg logit contributions
 forever-0.065
 exactly-0.063
 correctly-0.060
ently-0.057
rol-0.056
Token,
Token ID11
Feature activation-1.05e-02

Pos logit contributions
 pet+0.047
 dog+0.042
 walk+0.041
 drops+0.040
 water+0.040
Neg logit contributions
 exactly-0.047
 forever-0.047
ently-0.046
 correctly-0.045
vel-0.044
Token Lily
Token ID20037
Feature activation-2.44e-03

Pos logit contributions
 pet+0.015
 walk+0.013
 dog+0.013
 drops+0.012
 wave+0.012
Neg logit contributions
 exactly-0.020
 forever-0.019
ently-0.019
 correctly-0.019
vel-0.018
Token hugged
Token ID41130
Feature activation-4.07e-02

Pos logit contributions
 pet+0.003
 dog+0.002
 walk+0.002
 water+0.002
 drops+0.002
Neg logit contributions
 exactly-0.005
 forever-0.005
ently-0.005
vel-0.005
 correctly-0.005
Token a
Token ID257
Feature activation+1.43e-02

Pos logit contributions
 pet+0.006
 dog+0.005
 water+0.005
 walk+0.005
 drops+0.004
Neg logit contributions
ently-0.008
vel-0.008
 exactly-0.008
 correctly-0.008
anish-0.008
Token moment
Token ID2589
Feature activation-3.25e-03

Pos logit contributions
ently+0.032
 exactly+0.031
 correctly+0.031
vel+0.030
 forever+0.030
Neg logit contributions
 pet-0.017
 dog-0.015
 walk-0.015
 water-0.014
s-0.014
Token,
Token ID11
Feature activation+2.03e-02

Pos logit contributions
 pet+0.003
s+0.003
 dog+0.003
 walk+0.003
 water+0.003
Neg logit contributions
ently-0.008
anish-0.008
mmm-0.008
ocolate-0.008
vel-0.008
Token it
Token ID340
Feature activation-1.83e-02

Pos logit contributions
ently+0.051
vel+0.049
ool+0.049
rum+0.047
anish+0.047
Neg logit contributions
 pet-0.025
 dog-0.022
 water-0.020
 pets-0.019
 walk-0.019
Token was
Token ID373
Feature activation+9.61e-03

Pos logit contributions
 pet+0.021
 dog+0.018
 walk+0.017
 water+0.017
 drops+0.017
Neg logit contributions
 exactly-0.041
ently-0.040
 forever-0.040
vel-0.040
 correctly-0.040
Token as
Token ID355
Feature activation-7.50e-02

Pos logit contributions
ently+0.024
anish+0.023
vel+0.023
ocolate+0.021
rum+0.021
Neg logit contributions
 pet-0.012
 big-0.011
 water-0.011
 strong-0.010
 warm-0.010
Token if
Token ID611
Feature activation-2.91e-02

Pos logit contributions
 pet+0.087
 dog+0.076
 walk+0.072
 water+0.072
 drops+0.070
Neg logit contributions
 exactly-0.164
ently-0.162
 forever-0.161
vel-0.160
 correctly-0.159
Token there
Token ID612
Feature activation-2.43e-02

Pos logit contributions
 pet+0.027
 dog+0.025
 water+0.024
s+0.023
 walk+0.022
Neg logit contributions
ently-0.074
vel-0.073
mmm-0.070
rol-0.069
ocolate-0.069
Token was
Token ID373
Feature activation-3.08e-03

Pos logit contributions
 pet+0.020
 dog+0.019
s+0.019
 went+0.017
 water+0.016
Neg logit contributions
vel-0.067
ently-0.065
mmm-0.064
 exactly-0.063
ieved-0.062
Token no
Token ID645
Feature activation+1.99e-02

Pos logit contributions
 pet+0.004
 water+0.004
 dog+0.004
 big+0.004
 walk+0.004
Neg logit contributions
ently-0.007
vel-0.007
mmm-0.007
rum-0.006
anish-0.006
Token time
Token ID640
Feature activation-4.61e-02

Pos logit contributions
ently+0.045
vel+0.045
 exactly+0.043
mmm+0.043
 correctly+0.042
Neg logit contributions
 pet-0.027
 dog-0.023
 water-0.023
 walk-0.022
 drops-0.021
Token and
Token ID290
Feature activation-1.86e-02

Pos logit contributions
 pet+0.017
 dog+0.014
 water+0.014
 drops+0.014
s+0.013
Neg logit contributions
ently-0.031
 exactly-0.031
vel-0.031
mmm-0.030
 correctly-0.030
Token laughed
Token ID13818
Feature activation-2.40e-02

Pos logit contributions
 pet+0.021
 dog+0.019
 walk+0.018
 water+0.017
 drops+0.017
Neg logit contributions
 exactly-0.041
 forever-0.040
ently-0.040
 correctly-0.040
vel-0.039
Token.
Token ID13
Feature activation-1.07e-02

Pos logit contributions
 pet+0.026
 dog+0.022
s+0.021
 water+0.020
 walk+0.020
Neg logit contributions
ently-0.056
mmm-0.055
vel-0.054
aney-0.054
 exactly-0.054
Token As
Token ID1081
Feature activation-5.93e-02

Pos logit contributions
 pet+0.013
 dog+0.012
 drops+0.011
 water+0.011
 walk+0.011
Neg logit contributions
 exactly-0.024
vel-0.023
ently-0.023
 forever-0.023
 correctly-0.022
Token the
Token ID262
Feature activation-3.10e-02

Pos logit contributions
 pet+0.084
 drops+0.072
 dog+0.072
 walk+0.071
 water+0.069
Neg logit contributions
 forever-0.115
 exactly-0.112
 correctly-0.105
ently-0.102
vel-0.102
Token sun
Token ID4252
Feature activation-7.49e-02

Pos logit contributions
 pet+0.045
 pans+0.043
tails+0.040
 Claus+0.040
ns+0.040
Neg logit contributions
anish-0.041
 anymore-0.039
 about-0.038
 forever-0.038
 today-0.037
Token started
Token ID2067
Feature activation-6.60e-02

Pos logit contributions
 pans+0.085
 pet+0.085
 robbers+0.080
 pets+0.078
 backs+0.077
Neg logit contributions
 unfortunately-0.119
 exactly-0.118
 forever-0.117
ently-0.117
anish-0.114
Token to
Token ID284
Feature activation-2.22e-02

Pos logit contributions
 pet+0.104
 pans+0.093
Rex+0.092
 dog+0.091
 Claus+0.091
Neg logit contributions
 forever-0.111
 exactly-0.100
 anymore-0.092
 correctly-0.090
ently-0.088
Token set
Token ID900
Feature activation-1.81e-02

Pos logit contributions
 pet+0.025
 dog+0.025
s+0.025
 pets+0.024
 drops+0.023
Neg logit contributions
 forever-0.044
ently-0.043
 exactly-0.043
 correctly-0.042
mmm-0.041
Token,
Token ID11
Feature activation-1.38e-02

Pos logit contributions
 pet+0.025
 dog+0.023
 walk+0.022
 water+0.022
s+0.022
Neg logit contributions
vel-0.035
 exactly-0.035
ently-0.035
 correctly-0.034
 forever-0.034
Token Max
Token ID5436
Feature activation-9.29e-03

Pos logit contributions
 pet+0.015
 dog+0.015
 water+0.014
 drops+0.013
 walk+0.013
Neg logit contributions
ently-0.032
vel-0.032
 correctly-0.031
 exactly-0.031
 forever-0.030
Token the
Token ID262
Feature activation-8.83e-03

Pos logit contributions
 forever+0.036
vel+0.036
ently+0.035
 exactly+0.035
anish+0.035
Neg logit contributions
 pet-0.017
 dog-0.015
 water-0.015
s-0.014
 drops-0.014
Token shirt
Token ID10147
Feature activation-8.61e-03

Pos logit contributions
 pet+0.011
 dog+0.009
 walk+0.009
s+0.009
 drops+0.009
Neg logit contributions
 exactly-0.018
 forever-0.018
ently-0.018
vel-0.018
 correctly-0.018
Token and
Token ID290
Feature activation-1.69e-02

Pos logit contributions
s+0.006
 pet+0.005
 water+0.005
 big+0.005
 went+0.004
Neg logit contributions
mmm-0.023
vel-0.022
 perspective-0.022
loe-0.022
ocolate-0.022
Token it
Token ID340
Feature activation-6.91e-03

Pos logit contributions
 pet+0.021
 dog+0.019
 walk+0.018
 drops+0.018
s+0.017
Neg logit contributions
 exactly-0.035
 forever-0.034
ently-0.034
 correctly-0.034
vel-0.033
Token looked
Token ID3114
Feature activation+2.00e-02

Pos logit contributions
 pet+0.007
 dog+0.006
 water+0.006
 walk+0.006
 drops+0.006
Neg logit contributions
 exactly-0.016
 forever-0.015
ently-0.015
vel-0.015
 correctly-0.015
Token as
Token ID355
Feature activation-7.48e-02

Pos logit contributions
loe+0.042
ently+0.042
anish+0.041
ocolate+0.041
 perspective+0.039
Neg logit contributions
 pet-0.027
 drops-0.024
 water-0.024
 big-0.024
 strong-0.024
Token good
Token ID922
Feature activation+2.46e-03

Pos logit contributions
 pet+0.092
 dog+0.082
 water+0.080
 walk+0.077
 drops+0.076
Neg logit contributions
 exactly-0.155
vel-0.155
ently-0.155
 forever-0.153
 correctly-0.151
Token as
Token ID355
Feature activation-5.26e-02

Pos logit contributions
 exactly+0.005
ently+0.005
vel+0.005
 forever+0.005
 correctly+0.005
Neg logit contributions
 pet-0.003
 dog-0.003
 water-0.003
 walk-0.003
 drops-0.003
Token new
Token ID649
Feature activation+8.14e-03

Pos logit contributions
 pet+0.064
 dog+0.056
 water+0.056
 walk+0.053
s+0.052
Neg logit contributions
vel-0.112
ently-0.111
 exactly-0.109
 correctly-0.108
 forever-0.107
Token!
Token ID0
Feature activation+2.06e-02

Pos logit contributions
ently+0.020
 exactly+0.020
aney+0.020
rol+0.020
vel+0.019
Neg logit contributions
 pet-0.008
 dog-0.007
 water-0.006
 walk-0.006
s-0.006
Token The
Token ID383
Feature activation-3.75e-03

Pos logit contributions
vel+0.044
 exactly+0.044
 forever+0.043
ently+0.043
 correctly+0.042
Neg logit contributions
 pet-0.026
 dog-0.024
s-0.022
 water-0.021
 drops-0.021
Token fire
Token ID2046
Feature activation-6.28e-03

Pos logit contributions
ently+0.003
 exactly+0.003
 correctly+0.003
vel+0.003
 forever+0.003
Neg logit contributions
 pet-0.001
 dog-0.001
 walk-0.001
 water-0.001
 drops-0.001
Tokenwork
Token ID1818
Feature activation-2.03e-03

Pos logit contributions
 pet+0.008
 dog+0.007
 walk+0.007
 water+0.007
 drops+0.007
Neg logit contributions
 exactly-0.013
 forever-0.012
ently-0.012
vel-0.012
 correctly-0.012
Token.
Token ID13
Feature activation-1.43e-03

Pos logit contributions
 pet+0.002
 water+0.002
 dog+0.002
 wave+0.002
 walk+0.002
Neg logit contributions
ently-0.005
 exactly-0.004
mmm-0.004
vel-0.004
loe-0.004
Token It
Token ID632
Feature activation-2.94e-02

Pos logit contributions
 pet+0.001
 dog+0.001
 water+0.001
 sniff+0.001
 bark+0.001
Neg logit contributions
ently-0.004
 forever-0.004
vel-0.004
 exactly-0.004
 correctly-0.003
Token was
Token ID373
Feature activation-2.31e-03

Pos logit contributions
 pet+0.032
 dog+0.027
 water+0.026
 walk+0.026
 drops+0.025
Neg logit contributions
 exactly-0.067
ently-0.066
 forever-0.066
vel-0.065
 correctly-0.065
Token as
Token ID355
Feature activation-7.46e-02

Pos logit contributions
 pet+0.003
 water+0.002
 big+0.002
 strong+0.002
 dog+0.002
Neg logit contributions
ently-0.006
vel-0.006
anish-0.005
mmm-0.005
ieved-0.005
Token big
Token ID1263
Feature activation+1.54e-04

Pos logit contributions
 pet+0.067
 water+0.061
 dog+0.058
 drops+0.053
 strong+0.052
Neg logit contributions
ently-0.184
vel-0.182
 exactly-0.176
 forever-0.176
 correctly-0.174
Token as
Token ID355
Feature activation-5.69e-02

Pos logit contributions
 exactly+0.000
 forever+0.000
ently+0.000
 correctly+0.000
vel+0.000
Neg logit contributions
 pet-0.000
 dog-0.000
 walk-0.000
 water-0.000
s-0.000
Token a
Token ID257
Feature activation+9.54e-03

Pos logit contributions
 pet+0.068
 water+0.062
 dog+0.061
 walk+0.055
 drops+0.055
Neg logit contributions
vel-0.125
ently-0.123
anish-0.117
 exactly-0.117
 forever-0.117
Token house
Token ID2156
Feature activation-2.35e-02

Pos logit contributions
 exactly+0.021
 forever+0.021
ently+0.021
vel+0.020
 correctly+0.020
Neg logit contributions
 pet-0.011
 dog-0.009
 walk-0.009
 water-0.009
 drops-0.009
Token!
Token ID0
Feature activation+7.09e-03

Pos logit contributions
 pet+0.030
 dog+0.027
 water+0.027
 walk+0.026
 drops+0.026
Neg logit contributions
vel-0.049
ently-0.049
 exactly-0.048
 forever-0.047
mmm-0.046
Token was
Token ID373
Feature activation+5.54e-03

Pos logit contributions
 pet+0.008
 dog+0.006
 water+0.006
 walk+0.006
 went+0.006
Neg logit contributions
vel-0.022
 exactly-0.022
ently-0.021
rol-0.021
 forever-0.021
Token repaired
Token ID27457
Feature activation+2.52e-02

Pos logit contributions
anish+0.013
ently+0.013
 perspective+0.012
mmm+0.012
eway+0.012
Neg logit contributions
 strong-0.007
 big-0.006
 pet-0.006
 warm-0.006
 water-0.006
Token,
Token ID11
Feature activation+7.08e-03

Pos logit contributions
aney+0.062
rol+0.062
ocolate+0.061
vel+0.061
ieved+0.061
Neg logit contributions
 pet-0.022
 water-0.020
 dog-0.020
 strong-0.019
 big-0.019
Token it
Token ID340
Feature activation-9.47e-03

Pos logit contributions
vel+0.018
ently+0.018
anish+0.017
 fault+0.017
ieved+0.017
Neg logit contributions
 pet-0.007
 dog-0.006
 water-0.006
 man-0.005
 big-0.005
Token looked
Token ID3114
Feature activation+1.85e-02

Pos logit contributions
 pet+0.010
 dog+0.008
 water+0.008
 walk+0.008
 drops+0.008
Neg logit contributions
ently-0.022
 exactly-0.022
vel-0.022
 forever-0.022
 correctly-0.021
Token as
Token ID355
Feature activation-7.46e-02

Pos logit contributions
ently+0.038
loe+0.038
anish+0.037
ocolate+0.037
mmm+0.036
Neg logit contributions
 pet-0.026
 water-0.023
 big-0.023
 drops-0.023
 strong-0.023
Token good
Token ID922
Feature activation+9.26e-04

Pos logit contributions
 pet+0.077
 water+0.071
 dog+0.068
 strong+0.065
 big+0.064
Neg logit contributions
vel-0.171
ently-0.170
 exactly-0.163
mmm-0.163
anish-0.162
Token as
Token ID355
Feature activation-4.90e-02

Pos logit contributions
vel+0.002
ently+0.002
 exactly+0.002
rol+0.002
anish+0.002
Neg logit contributions
 pet-0.001
 water-0.001
 drops-0.001
 dog-0.001
 walk-0.001
Token new
Token ID649
Feature activation+5.34e-03

Pos logit contributions
 pet+0.059
 water+0.052
 dog+0.051
 walk+0.048
s+0.047
Neg logit contributions
vel-0.107
ently-0.104
anish-0.102
 exactly-0.101
ieved-0.100
Token.
Token ID13
Feature activation+1.45e-02

Pos logit contributions
vel+0.013
ently+0.012
aney+0.012
 exactly+0.012
rol+0.012
Neg logit contributions
 pet-0.005
 dog-0.005
 water-0.004
 walk-0.004
 drops-0.004
Token The
Token ID383
Feature activation-8.40e-04

Pos logit contributions
vel+0.035
gh+0.034
 exactly+0.033
ently+0.033
 forever+0.033
Neg logit contributions
 pet-0.016
 dog-0.013
 water-0.013
 drops-0.012
s-0.012
Token.
Token ID13
Feature activation-1.37e-02

Pos logit contributions
 pet+0.008
s+0.006
 water+0.006
 dog+0.006
 walk+0.006
Neg logit contributions
mmm-0.023
ently-0.023
 exactly-0.023
anish-0.022
vel-0.022
Token They
Token ID1119
Feature activation-2.47e-02

Pos logit contributions
 pet+0.011
 dog+0.010
 water+0.007
 pets+0.007
 drops+0.007
Neg logit contributions
ently-0.039
vel-0.039
 forever-0.037
 exactly-0.037
gh-0.036
Token were
Token ID547
Feature activation-3.99e-03

Pos logit contributions
 pet+0.027
 walk+0.021
 dog+0.021
 water+0.020
 drops+0.019
Neg logit contributions
ently-0.062
vel-0.060
 exactly-0.060
mmm-0.059
 forever-0.058
Token called
Token ID1444
Feature activation-2.63e-02

Pos logit contributions
 pet+0.005
 dog+0.005
 water+0.004
 pets+0.004
 drops+0.004
Neg logit contributions
ently-0.009
vel-0.008
anish-0.008
 correctly-0.008
mmm-0.008
Token ost
Token ID23619
Feature activation+5.11e-03

Pos logit contributions
 pet+0.037
 dog+0.034
 water+0.029
 walk+0.029
 pets+0.028
Neg logit contributions
ently-0.060
vel-0.059
ieved-0.058
 exactly-0.056
mmm-0.056
Tokenric
Token ID1173
Feature activation-7.46e-02

Pos logit contributions
 exactly+0.005
 forever+0.005
ently+0.005
vel+0.005
 correctly+0.005
Neg logit contributions
 pet-0.012
 dog-0.011
 walk-0.011
 water-0.010
 drops-0.010
Tokenhes
Token ID956
Feature activation-3.04e-02

Pos logit contributions
 pet+0.074
 water+0.068
s+0.067
 dog+0.065
 drops+0.064
Neg logit contributions
 forever-0.172
mmm-0.167
 exactly-0.164
ently-0.161
 perspective-0.160
Token.
Token ID13
Feature activation-1.47e-02

Pos logit contributions
 pet+0.032
s+0.029
 dog+0.028
 water+0.028
 drops+0.025
Neg logit contributions
mmm-0.075
vel-0.074
 exactly-0.073
ently-0.072
 accompl-0.072
Token Lily
Token ID20037
Feature activation-7.64e-03

Pos logit contributions
 pet+0.014
 dog+0.012
 water+0.010
s+0.009
 pets+0.009
Neg logit contributions
ently-0.040
vel-0.039
 forever-0.038
 exactly-0.038
 correctly-0.038
Token and
Token ID290
Feature activation-3.48e-02

Pos logit contributions
 pet+0.008
 dog+0.007
 walk+0.006
s+0.006
 water+0.006
Neg logit contributions
 exactly-0.018
ently-0.018
 forever-0.018
vel-0.018
 correctly-0.018
Token Ben
Token ID3932
Feature activation-7.35e-03

Pos logit contributions
 pet+0.050
 dog+0.045
 walk+0.042
 water+0.041
 drops+0.040
Neg logit contributions
ently-0.070
 exactly-0.069
 forever-0.069
vel-0.069
 correctly-0.067
Token toys
Token ID14958
Feature activation-6.50e-03

Pos logit contributions
ently+0.003
vel+0.003
mmm+0.003
 forever+0.003
 exactly+0.003
Neg logit contributions
 pet-0.002
 water-0.001
 dog-0.001
 walk-0.001
 drops-0.001
Token and
Token ID290
Feature activation-3.56e-02

Pos logit contributions
 pet+0.007
 dog+0.006
 water+0.005
s+0.005
 walk+0.005
Neg logit contributions
vel-0.016
ently-0.015
 exactly-0.015
mmm-0.015
 forever-0.015
Token fruits
Token ID15921
Feature activation+9.15e-04

Pos logit contributions
 pet+0.043
 dog+0.039
s+0.037
 drops+0.036
 walk+0.036
Neg logit contributions
 exactly-0.073
 forever-0.072
 correctly-0.071
vel-0.070
ently-0.069
Token and
Token ID290
Feature activation-4.21e-02

Pos logit contributions
vel+0.002
ently+0.002
 exactly+0.002
 forever+0.002
 correctly+0.002
Neg logit contributions
 pet-0.001
 dog-0.001
 water-0.001
s-0.001
 walk-0.001
Token books
Token ID3835
Feature activation+7.12e-03

Pos logit contributions
 pet+0.051
 dog+0.046
s+0.045
 drops+0.043
 walk+0.042
Neg logit contributions
 exactly-0.085
 forever-0.083
 correctly-0.082
vel-0.080
ently-0.079
Token as
Token ID355
Feature activation-7.43e-02

Pos logit contributions
vel+0.019
ently+0.019
anish+0.019
mmm+0.019
 exactly+0.018
Neg logit contributions
 pet-0.005
s-0.005
 walk-0.004
 water-0.004
 dog-0.004
Token things
Token ID1243
Feature activation-1.59e-02

Pos logit contributions
 pet+0.093
 dog+0.082
 walk+0.078
 water+0.077
 drops+0.074
Neg logit contributions
 exactly-0.157
ently-0.156
vel-0.155
 forever-0.155
 correctly-0.153
Token to
Token ID284
Feature activation+9.41e-03

Pos logit contributions
 pet+0.016
 dog+0.014
 walk+0.013
 water+0.013
s+0.013
Neg logit contributions
ently-0.038
 exactly-0.037
vel-0.037
 forever-0.037
anish-0.036
Token sell
Token ID3677
Feature activation+7.56e-03

Pos logit contributions
 exactly+0.022
 forever+0.022
ently+0.022
vel+0.021
 correctly+0.021
Neg logit contributions
 pet-0.010
 dog-0.008
 walk-0.008
 water-0.008
 drops-0.007
Token.
Token ID13
Feature activation-4.69e-03

Pos logit contributions
ently+0.019
vel+0.019
 exactly+0.018
ieved+0.018
anish+0.018
Neg logit contributions
 pet-0.007
 water-0.006
 dog-0.006
s-0.006
 walk-0.005
Token\n
Token ID198
Feature activation-6.83e-04

Pos logit contributions
 pet+0.004
s+0.004
 dog+0.003
 water+0.003
 walk+0.003
Neg logit contributions
ently-0.012
vel-0.012
 exactly-0.012
 forever-0.012
 correctly-0.012
Token.
Token ID13
Feature activation+3.82e-03

Pos logit contributions
 exactly+0.059
vel+0.059
ently+0.058
mmm+0.058
 correctly+0.057
Neg logit contributions
 pet-0.022
 dog-0.019
s-0.017
 water-0.017
 drops-0.016
Token\n
Token ID198
Feature activation+7.97e-03

Pos logit contributions
 forever+0.006
 exactly+0.006
ently+0.006
mmm+0.005
 about+0.005
Neg logit contributions
 pet-0.006
 wave-0.006
 walk-0.006
 pans-0.006
 drops-0.005
Token\n
Token ID198
Feature activation+8.13e-03

Pos logit contributions
 exactly+0.013
 forever+0.013
ently+0.013
vel+0.013
 correctly+0.012
Neg logit contributions
 pet-0.013
 dog-0.012
 drops-0.011
 walk-0.011
 water-0.011
TokenAs
Token ID1722
Feature activation-5.66e-02

Pos logit contributions
 exactly+0.011
 forever+0.011
vel+0.010
ently+0.010
 correctly+0.010
Neg logit contributions
 pet-0.016
 dog-0.015
 drops-0.015
 walk-0.015
 water-0.014
Token the
Token ID262
Feature activation-2.46e-02

Pos logit contributions
 pet+0.093
 pans+0.084
 waves+0.083
 walk+0.083
 wave+0.083
Neg logit contributions
 forever-0.097
 exactly-0.090
mmm-0.077
 anymore-0.075
 correctly-0.074
Token sun
Token ID4252
Feature activation-7.40e-02

Pos logit contributions
s+0.032
 pet+0.031
 Claus+0.030
ns+0.029
 pets+0.028
Neg logit contributions
 forever-0.040
ently-0.037
ocolate-0.036
 exactly-0.036
mmm-0.034
Token began
Token ID2540
Feature activation-6.16e-02

Pos logit contributions
 pet+0.084
 robbers+0.081
 pans+0.081
 pets+0.077
 Claus+0.074
Neg logit contributions
ently-0.125
 exactly-0.123
 forever-0.121
 unfortunately-0.120
rol-0.119
Token to
Token ID284
Feature activation-2.35e-02

Pos logit contributions
 pet+0.099
 dog+0.088
 pans+0.086
Rex+0.085
 drops+0.085
Neg logit contributions
 forever-0.109
 exactly-0.099
ently-0.090
 correctly-0.086
 fault-0.086
Token set
Token ID900
Feature activation-1.74e-02

Pos logit contributions
 pet+0.028
 dog+0.027
s+0.025
 pets+0.025
 drops+0.024
Neg logit contributions
 forever-0.049
 exactly-0.047
ently-0.047
 correctly-0.045
mmm-0.044
Token,
Token ID11
Feature activation-5.67e-03

Pos logit contributions
 pet+0.025
 dog+0.023
 walk+0.022
 water+0.022
s+0.022
Neg logit contributions
 exactly-0.033
vel-0.032
ently-0.032
 correctly-0.032
 forever-0.031
Token Buddy
Token ID36896
Feature activation+4.19e-03

Pos logit contributions
 pet+0.008
 dog+0.007
 walk+0.007
 drops+0.007
 water+0.007
Neg logit contributions
 exactly-0.011
 forever-0.011
ently-0.010
 correctly-0.010
vel-0.010
Token to
Token ID284
Feature activation+8.32e-04

Pos logit contributions
 pet+0.028
 dog+0.027
 drops+0.027
 stuffing+0.027
 Claus+0.026
Neg logit contributions
 exactly-0.038
 forever-0.038
 correctly-0.035
ently-0.034
vel-0.033
Token the
Token ID262
Feature activation-9.29e-03

Pos logit contributions
 exactly+0.002
ently+0.002
vel+0.002
 forever+0.002
 correctly+0.002
Neg logit contributions
 pet-0.001
 dog-0.001
 walk-0.001
 water-0.001
 drops-0.001
Token park
Token ID3952
Feature activation-1.10e-02

Pos logit contributions
 pet+0.010
 dog+0.009
 water+0.008
 walk+0.008
 drops+0.007
Neg logit contributions
ently-0.023
 exactly-0.022
vel-0.022
 forever-0.022
 correctly-0.022
Token and
Token ID290
Feature activation-8.55e-03

Pos logit contributions
 pet+0.012
 dog+0.010
 walk+0.010
s+0.010
 water+0.009
Neg logit contributions
ently-0.026
 exactly-0.026
vel-0.026
 correctly-0.025
mmm-0.025
Token swing
Token ID9628
Feature activation+2.98e-03

Pos logit contributions
 pet+0.008
 dog+0.006
 walk+0.006
 water+0.006
 drops+0.006
Neg logit contributions
 exactly-0.021
ently-0.021
 forever-0.021
vel-0.021
 correctly-0.020
Token as
Token ID355
Feature activation-7.38e-02

Pos logit contributions
vel+0.008
ently+0.008
mmm+0.008
 exactly+0.008
 reality+0.007
Neg logit contributions
 pet-0.003
 dog-0.002
s-0.002
 water-0.002
 walk-0.002
Token high
Token ID1029
Feature activation+6.26e-03

Pos logit contributions
 pet+0.083
 dog+0.072
 walk+0.067
 water+0.067
s+0.066
Neg logit contributions
 exactly-0.165
vel-0.165
ently-0.163
 forever-0.163
 correctly-0.161
Token as
Token ID355
Feature activation-5.49e-02

Pos logit contributions
 exactly+0.013
vel+0.012
 forever+0.012
ently+0.012
 correctly+0.012
Neg logit contributions
 pet-0.009
 dog-0.008
 walk-0.007
 water-0.007
 drops-0.007
Token she
Token ID673
Feature activation+1.34e-02

Pos logit contributions
 pet+0.062
 dog+0.052
 water+0.050
 walk+0.050
s+0.048
Neg logit contributions
vel-0.128
ently-0.122
anish-0.122
 exactly-0.122
ieved-0.119
Token could
Token ID714
Feature activation-4.10e-03

Pos logit contributions
 exactly+0.029
 forever+0.028
ently+0.028
 correctly+0.028
vel+0.027
Neg logit contributions
 pet-0.015
 dog-0.014
 walk-0.013
 water-0.013
 drops-0.012
Token.
Token ID13
Feature activation+1.80e-03

Pos logit contributions
 pet+0.005
 dog+0.004
 walk+0.004
 water+0.004
 drops+0.003
Neg logit contributions
 exactly-0.009
ently-0.009
vel-0.009
 forever-0.009
 correctly-0.009
TokenI
Token ID40
Feature activation+1.72e-02

Pos logit contributions
 pet+0.023
 dog+0.020
s+0.018
 water+0.017
 walk+0.017
Neg logit contributions
ently-0.038
 exactly-0.038
 correctly-0.036
vel-0.036
anish-0.035
Token will
Token ID481
Feature activation+3.53e-03

Pos logit contributions
 exactly+0.035
ently+0.035
 forever+0.034
 correctly+0.034
vel+0.034
Neg logit contributions
 pet-0.023
 dog-0.020
 walk-0.019
 water-0.019
s-0.019
Token always
Token ID1464
Feature activation+2.14e-02

Pos logit contributions
ently+0.009
 exactly+0.009
 correctly+0.009
rol+0.009
vel+0.008
Neg logit contributions
 pet-0.004
 walk-0.003
 water-0.003
 wave-0.003
 dog-0.003
Token remind
Token ID7101
Feature activation+3.12e-02

Pos logit contributions
ently+0.050
 exactly+0.048
rol+0.048
 correctly+0.047
anish+0.047
Neg logit contributions
 pet-0.026
 walk-0.022
 dog-0.020
 water-0.020
 wave-0.020
Token you
Token ID345
Feature activation+3.46e-02

Pos logit contributions
 exactly+0.060
 correctly+0.059
vel+0.059
ently+0.058
 forever+0.057
Neg logit contributions
 pet-0.045
 dog-0.039
 water-0.039
 walk-0.038
s-0.036
Tokenâ
Token ID22940
Feature activation+5.20e-02

Pos logit contributions
rol+0.084
ently+0.080
 accompl+0.080
ement+0.079
aman+0.079
Neg logit contributions
 pet-0.037
 water-0.035
 big-0.031
 walk-0.031
 strong-0.030
Token
Token ID26391
Feature activation+2.46e-02

Pos logit contributions
 exactly+0.073
 forever+0.068
ently+0.066
 correctly+0.065
vel+0.065
Neg logit contributions
 pet-0.103
 walk-0.096
 dog-0.093
 water-0.092
 drops-0.090
Token.
Token ID13
Feature activation+2.80e-02

Pos logit contributions
ently+0.044
 exactly+0.044
 correctly+0.043
vel+0.042
 forever+0.042
Neg logit contributions
 pet-0.039
 dog-0.036
s-0.035
 walk-0.034
 drops-0.032
Token
Token ID198
Feature activation+2.31e-02

Pos logit contributions
ently+0.056
 exactly+0.055
 correctly+0.054
vel+0.053
 forever+0.053
Neg logit contributions
 pet-0.039
 dog-0.035
 water-0.032
s-0.032
 walk-0.031
Token
Token ID198
Feature activation+2.07e-02

Pos logit contributions
 correctly+0.058
 exactly+0.058
ently+0.057
vel+0.056
 forever+0.056
Neg logit contributions
 pet-0.020
s-0.017
 dog-0.016
 water-0.016
 walk-0.015
TokenGr
Token ID8642
Feature activation+1.03e-02

Pos logit contributions
 fault+0.049
ently+0.046
 correctly+0.046
 exactly+0.045
ocolate+0.042
Neg logit contributions
 pet-0.024
s-0.021
 walk-0.020
 water-0.019
 dog-0.019
Token then
Token ID788
Feature activation+9.42e-03

Pos logit contributions
vel+0.044
anish+0.041
ently+0.040
rol+0.040
aney+0.040
Neg logit contributions
 pet-0.028
 water-0.025
 dog-0.025
 wave-0.023
 man-0.023
Token on
Token ID319
Feature activation+1.75e-02

Pos logit contributions
 exactly+0.023
ently+0.022
aney+0.022
 fault+0.022
 forever+0.022
Neg logit contributions
 pet-0.009
 dog-0.008
s-0.008
 walk-0.007
 water-0.007
Token,
Token ID11
Feature activation+2.43e-02

Pos logit contributions
rol+0.047
vel+0.047
ust+0.046
aney+0.045
ieved+0.045
Neg logit contributions
 dog-0.016
s-0.014
 water-0.014
 pet-0.013
 man-0.013
Token he
Token ID339
Feature activation+2.11e-02

Pos logit contributions
vel+0.062
ently+0.062
 exactly+0.059
aney+0.059
ieved+0.059
Neg logit contributions
 pet-0.022
 dog-0.021
 water-0.019
 walk-0.018
s-0.018
Token always
Token ID1464
Feature activation+1.13e-02

Pos logit contributions
ently+0.048
vel+0.048
 exactly+0.048
mmm+0.046
 correctly+0.046
Neg logit contributions
 pet-0.025
 dog-0.020
 water-0.020
s-0.019
 walk-0.019
Token remembered
Token ID12086
Feature activation+5.52e-02

Pos logit contributions
 exactly+0.026
ently+0.025
vel+0.025
 forever+0.025
 correctly+0.025
Neg logit contributions
 pet-0.013
 dog-0.011
 walk-0.010
 water-0.010
 drops-0.010
Token that
Token ID326
Feature activation+1.91e-02

Pos logit contributions
vel+0.146
ently+0.143
mmm+0.142
rol+0.141
aney+0.140
Neg logit contributions
 pet-0.053
 water-0.046
 dog-0.039
 walk-0.038
 big-0.036
Token if
Token ID611
Feature activation-2.37e-02

Pos logit contributions
ently+0.047
rol+0.046
mmm+0.046
vel+0.046
aney+0.046
Neg logit contributions
 dog-0.021
 water-0.020
 pet-0.020
 strong-0.018
 walk-0.018
Token he
Token ID339
Feature activation+1.19e-03

Pos logit contributions
 pet+0.023
 dog+0.021
 water+0.021
 walk+0.020
 hurt+0.018
Neg logit contributions
vel-0.058
ently-0.057
rol-0.056
aney-0.055
ieved-0.055
Token worked
Token ID3111
Feature activation+2.30e-02

Pos logit contributions
 exactly+0.002
 forever+0.002
ently+0.002
 correctly+0.002
vel+0.002
Neg logit contributions
 pet-0.002
 dog-0.001
 walk-0.001
 water-0.001
 drops-0.001
Token hard
Token ID1327
Feature activation+2.81e-03

Pos logit contributions
rol+0.054
aney+0.054
ocolate+0.053
anish+0.052
otion+0.051
Neg logit contributions
 water-0.022
s-0.022
 big-0.022
 strong-0.022
 pet-0.020
Token to
Token ID284
Feature activation+1.02e-02

Pos logit contributions
rol+0.027
ieved+0.027
ently+0.027
vel+0.027
 correctly+0.026
Neg logit contributions
 pet-0.012
 dog-0.011
s-0.010
 walk-0.010
 water-0.010
Token turn
Token ID1210
Feature activation-1.31e-02

Pos logit contributions
ently+0.026
 exactly+0.026
vel+0.026
 correctly+0.025
 forever+0.025
Neg logit contributions
 pet-0.009
 water-0.007
 walk-0.007
 dog-0.007
 wave-0.006
Token off
Token ID572
Feature activation-1.52e-02

Pos logit contributions
 water+0.012
 pet+0.012
 drops+0.011
 dog+0.010
s+0.010
Neg logit contributions
mmm-0.034
ently-0.033
vel-0.033
anish-0.033
 exactly-0.032
Token the
Token ID262
Feature activation-1.67e-02

Pos logit contributions
 water+0.012
 pet+0.011
 big+0.009
 walk+0.009
 warm+0.008
Neg logit contributions
vel-0.043
anish-0.043
 fault-0.042
mmm-0.041
 Majesty-0.041
Token f
Token ID277
Feature activation+1.51e-02

Pos logit contributions
 pet+0.019
 dog+0.015
 walk+0.015
 pets+0.015
s+0.015
Neg logit contributions
 forever-0.034
 exactly-0.033
 fault-0.033
 correctly-0.032
ently-0.032
Tokenauc
Token ID14272
Feature activation+5.68e-02

Pos logit contributions
ently+0.016
 exactly+0.016
vel+0.015
 correctly+0.015
 forever+0.014
Neg logit contributions
 pet-0.036
 dog-0.033
 walk-0.033
 drops-0.031
 water-0.031
Tokenet
Token ID316
Feature activation+6.73e-03

Pos logit contributions
ently+0.033
 correctly+0.032
 exactly+0.032
 forever+0.030
vel+0.029
Neg logit contributions
 pet-0.152
 dog-0.143
 walk-0.143
s-0.139
 drops-0.136
Token when
Token ID618
Feature activation-3.05e-02

Pos logit contributions
vel+0.016
 exactly+0.016
mmm+0.016
ieved+0.016
ently+0.015
Neg logit contributions
s-0.007
 pet-0.007
 water-0.007
 dog-0.006
 walk-0.006
Token she
Token ID673
Feature activation+7.39e-03

Pos logit contributions
 pet+0.032
 water+0.031
 dog+0.031
 walk+0.028
 drops+0.026
Neg logit contributions
vel-0.075
ently-0.073
ieved-0.072
rol-0.072
 correctly-0.070
Token was
Token ID373
Feature activation+2.06e-03

Pos logit contributions
 exactly+0.016
 forever+0.016
 correctly+0.016
vel+0.016
ently+0.016
Neg logit contributions
 pet-0.008
 dog-0.007
 walk-0.006
 pets-0.006
 water-0.006
Token done
Token ID1760
Feature activation+1.10e-02

Pos logit contributions
ently+0.005
 correctly+0.005
anish+0.005
mmm+0.005
 forever+0.004
Neg logit contributions
 pet-0.002
 water-0.002
 dog-0.002
 walk-0.002
 drops-0.002
Token he
Token ID339
Feature activation+2.81e-02

Pos logit contributions
ocolate+0.099
ently+0.098
mmm+0.098
anish+0.098
loe+0.097
Neg logit contributions
 pet-0.021
 walk-0.015
 dog-0.014
 water-0.013
 big-0.013
Token wanted
Token ID2227
Feature activation+2.86e-03

Pos logit contributions
ently+0.071
mmm+0.069
ocolate+0.069
vel+0.069
anish+0.068
Neg logit contributions
 pet-0.030
s-0.024
 went-0.024
 dog-0.023
 walk-0.023
Token to
Token ID284
Feature activation+4.35e-03

Pos logit contributions
ently+0.009
mmm+0.009
ieved+0.009
anish+0.009
vel+0.009
Neg logit contributions
 pet-0.001
 water-0.001
 big-0.001
 dog-0.001
 walk-0.000
Token go
Token ID467
Feature activation-1.17e-02

Pos logit contributions
ently+0.012
vel+0.012
 exactly+0.012
rol+0.012
mmm+0.011
Neg logit contributions
 pet-0.004
 walk-0.003
 play-0.003
 wave-0.003
 water-0.003
Token to
Token ID284
Feature activation+1.26e-02

Pos logit contributions
 pet+0.013
 walk+0.011
 dog+0.011
 water+0.010
s+0.010
Neg logit contributions
ently-0.026
 exactly-0.026
 correctly-0.026
vel-0.026
 forever-0.026
Token an
Token ID281
Feature activation+5.14e-02

Pos logit contributions
ently+0.034
anish+0.033
aney+0.033
vel+0.032
 exactly+0.032
Neg logit contributions
 pet-0.013
 play-0.011
 walk-0.011
 water-0.011
 dog-0.010
Token opera
Token ID27296
Feature activation+1.33e-02

Pos logit contributions
anish+0.091
ently+0.089
vel+0.086
 exactly+0.083
mmm+0.083
Neg logit contributions
 pet-0.096
 walk-0.086
 water-0.086
 play-0.082
 dog-0.081
Token himself
Token ID2241
Feature activation+1.03e-02

Pos logit contributions
ently+0.032
mmm+0.032
ocolate+0.031
 fault+0.031
anish+0.031
Neg logit contributions
 pet-0.015
 walk-0.012
 dog-0.012
 wave-0.011
s-0.011
Token.
Token ID13
Feature activation+1.81e-03

Pos logit contributions
mmm+0.031
ocolate+0.031
anish+0.031
mmmm+0.030
ently+0.029
Neg logit contributions
 pet-0.005
 walk-0.005
s-0.004
 big-0.003
 dog-0.003
Token So
Token ID1406
Feature activation+8.88e-03

Pos logit contributions
vel+0.004
 exactly+0.004
 forever+0.004
ently+0.004
 correctly+0.004
Neg logit contributions
 pet-0.002
 dog-0.002
s-0.002
 walk-0.002
 water-0.002
Token,
Token ID11
Feature activation+2.17e-02

Pos logit contributions
vel+0.021
rol+0.020
 fault+0.020
ently+0.020
 exactly+0.019
Neg logit contributions
 pet-0.009
 walk-0.008
 dog-0.008
 water-0.008
s-0.008
Token some
Token ID617
Feature activation-1.36e-02

Pos logit contributions
ently+0.024
 forever+0.024
loe+0.024
vel+0.024
aney+0.023
Neg logit contributions
 water-0.013
 pet-0.011
 big-0.011
s-0.010
 dog-0.009
Token more
Token ID517
Feature activation-1.49e-02

Pos logit contributions
 pet+0.016
 water+0.015
 dog+0.014
 drops+0.013
 walk+0.013
Neg logit contributions
ently-0.030
vel-0.030
 forever-0.030
 exactly-0.030
ieved-0.029
Token toys
Token ID14958
Feature activation-1.33e-02

Pos logit contributions
 water+0.017
 pet+0.017
 drops+0.015
 dog+0.015
s+0.014
Neg logit contributions
ently-0.035
 exactly-0.035
ieved-0.034
 forever-0.034
vel-0.034
Token to
Token ID284
Feature activation+1.08e-02

Pos logit contributions
 pet+0.010
 water+0.010
s+0.009
 dog+0.008
 walk+0.007
Neg logit contributions
ently-0.035
vel-0.034
ieved-0.033
loe-0.033
 exactly-0.033
Token your
Token ID534
Feature activation+1.52e-02

Pos logit contributions
ently+0.029
 exactly+0.027
rol+0.027
vel+0.027
 forever+0.027
Neg logit contributions
 pet-0.010
 water-0.009
 play-0.007
 walk-0.007
 big-0.007
Token collection
Token ID4947
Feature activation+5.27e-02

Pos logit contributions
 exactly+0.034
ently+0.033
vel+0.033
 correctly+0.033
 forever+0.032
Neg logit contributions
 pet-0.018
 dog-0.016
 walk-0.015
 water-0.015
 drops-0.015
Token!"
Token ID2474
Feature activation+2.65e-03

Pos logit contributions
vel+0.117
 accompl+0.115
 thirteen+0.114
urbed+0.113
 exactly+0.112
Neg logit contributions
s-0.061
 water-0.058
 dog-0.054
 pet-0.052
 big-0.051
Token
Token ID220
Feature activation-1.97e-02

Pos logit contributions
 exactly+0.005
 forever+0.005
ently+0.005
 correctly+0.005
vel+0.005
Neg logit contributions
 pet-0.003
 dog-0.003
 walk-0.003
 water-0.003
 drops-0.003
Token
Token ID198
Feature activation-5.16e-03

Pos logit contributions
 pet+0.018
 dog+0.016
s+0.015
 water+0.014
 walk+0.014
Neg logit contributions
 exactly-0.047
 forever-0.046
 correctly-0.046
vel-0.046
ently-0.046
Token
Token ID198
Feature activation-2.62e-03

Pos logit contributions
 pet+0.006
 dog+0.005
 walk+0.005
 water+0.005
 drops+0.004
Neg logit contributions
 exactly-0.012
 forever-0.011
ently-0.011
vel-0.011
 correctly-0.011
TokenL
Token ID43
Feature activation+5.54e-03

Pos logit contributions
 pet+0.003
 dog+0.003
 walk+0.003
 water+0.003
s+0.003
Neg logit contributions
 exactly-0.006
ently-0.006
 forever-0.006
 correctly-0.006
vel-0.006
Token careless
Token ID36138
Feature activation+2.87e-02

Pos logit contributions
ently+0.021
vel+0.021
anish+0.021
ocolate+0.021
 forever+0.020
Neg logit contributions
 pet-0.010
 dog-0.009
 water-0.008
 drops-0.008
s-0.008
Token.
Token ID13
Feature activation+1.38e-02

Pos logit contributions
mmm+0.069
ently+0.068
vel+0.068
 exactly+0.067
ieved+0.067
Neg logit contributions
 pet-0.028
 dog-0.026
 water-0.025
 walk-0.024
s-0.023
Token
Token ID220
Feature activation-2.10e-02

Pos logit contributions
ently+0.032
vel+0.032
 correctly+0.031
 exactly+0.031
 forever+0.031
Neg logit contributions
 pet-0.016
 dog-0.014
s-0.013
 bark-0.012
 drops-0.012
Token
Token ID198
Feature activation+1.12e-02

Pos logit contributions
 pet+0.013
s+0.012
 dog+0.011
 water+0.009
 drops+0.008
Neg logit contributions
 forever-0.056
 exactly-0.055
vel-0.055
 correctly-0.055
ently-0.054
Token
Token ID198
Feature activation+1.32e-02

Pos logit contributions
 correctly+0.031
 exactly+0.031
 forever+0.030
ently+0.030
 fault+0.030
Neg logit contributions
 pet-0.008
s-0.006
 dog-0.005
 water-0.005
 walk-0.005
TokenFrom
Token ID4863
Feature activation+2.02e-02

Pos logit contributions
 fault+0.030
 correctly+0.028
ently+0.028
 exactly+0.028
 forever+0.027
Neg logit contributions
 pet-0.017
s-0.015
 walk-0.014
 dog-0.014
Rex-0.014
Token then
Token ID788
Feature activation+1.47e-02

Pos logit contributions
vel+0.037
ently+0.036
 forever+0.036
 exactly+0.035
 correctly+0.035
Neg logit contributions
 pet-0.032
 dog-0.029
 water-0.028
 walk-0.028
 drops-0.027
Token on
Token ID319
Feature activation+2.16e-02

Pos logit contributions
aney+0.038
 exactly+0.038
 fault+0.037
ently+0.037
 reality+0.036
Neg logit contributions
 pet-0.011
 dog-0.010
s-0.010
 walk-0.009
 went-0.009
Token,
Token ID11
Feature activation+2.50e-02

Pos logit contributions
vel+0.057
aney+0.057
ieved+0.056
ust+0.056
rol+0.055
Neg logit contributions
s-0.019
 dog-0.018
 pet-0.017
 walk-0.014
 big-0.014
Token Fl
Token ID1610
Feature activation-4.27e-04

Pos logit contributions
ently+0.053
vel+0.052
 exactly+0.051
 correctly+0.051
 forever+0.049
Neg logit contributions
 pet-0.033
 dog-0.030
s-0.029
 walk-0.028
 water-0.028
Tokenuffy
Token ID15352
Feature activation+1.46e-03

Pos logit contributions
 pet+0.001
 dog+0.001
 walk+0.001
 water+0.001
 drops+0.001
Neg logit contributions
 exactly-0.001
 forever-0.001
ently-0.001
vel-0.001
 correctly-0.001
Token prevent
Token ID2948
Feature activation+3.28e-02

Pos logit contributions
 exactly+0.007
 forever+0.007
ently+0.006
 correctly+0.006
vel+0.006
Neg logit contributions
 pet-0.004
 dog-0.003
 drops-0.003
 water-0.003
s-0.003
Token bad
Token ID2089
Feature activation+4.69e-03

Pos logit contributions
ently+0.071
vel+0.069
 exactly+0.069
anish+0.068
 correctly+0.068
Neg logit contributions
 pet-0.043
 dog-0.039
 water-0.038
 drops-0.036
 pets-0.034
Token things
Token ID1243
Feature activation-1.26e-02

Pos logit contributions
 forever+0.010
ently+0.010
vel+0.010
 exactly+0.009
 correctly+0.009
Neg logit contributions
 pet-0.006
s-0.005
 dog-0.005
 walk-0.005
 water-0.004
Token from
Token ID422
Feature activation+3.07e-02

Pos logit contributions
 pet+0.018
 dog+0.016
 walk+0.015
 water+0.015
 drops+0.015
Neg logit contributions
 exactly-0.025
 forever-0.024
ently-0.024
vel-0.024
 correctly-0.024
Token happening
Token ID5836
Feature activation+2.03e-02

Pos logit contributions
ently+0.057
vel+0.055
 correctly+0.055
 exactly+0.055
 forever+0.054
Neg logit contributions
 pet-0.048
 water-0.045
 dog-0.043
 walk-0.042
 drops-0.041
Token by
Token ID416
Feature activation+1.30e-02

Pos logit contributions
 exactly+0.043
vel+0.043
ently+0.042
mmm+0.042
anish+0.042
Neg logit contributions
 pet-0.026
 water-0.024
 dog-0.023
 drops-0.023
s-0.023
Token being
Token ID852
Feature activation+1.30e-03

Pos logit contributions
 correctly+0.025
aney+0.025
ieved+0.025
rol+0.025
 exactly+0.025
Neg logit contributions
 water-0.019
 pet-0.019
 dog-0.017
 hurt-0.016
 drops-0.016
Token careful
Token ID8161
Feature activation+1.96e-02

Pos logit contributions
ently+0.003
 exactly+0.003
 correctly+0.003
vel+0.003
mmm+0.003
Neg logit contributions
 pet-0.001
 dog-0.001
 water-0.001
 drops-0.001
 walk-0.001
Token and
Token ID290
Feature activation+2.42e-03

Pos logit contributions
 exactly+0.044
mmm+0.043
vel+0.043
ently+0.042
 reality+0.041
Neg logit contributions
 pet-0.020
 water-0.019
s-0.019
 dog-0.018
 drops-0.017
Token listening
Token ID8680
Feature activation+1.03e-03

Pos logit contributions
 exactly+0.004
 forever+0.004
 correctly+0.004
vel+0.004
ently+0.004
Neg logit contributions
 pet-0.004
 dog-0.003
s-0.003
 walk-0.003
 drops-0.003
Token to
Token ID284
Feature activation+2.03e-02

Pos logit contributions
mmm+0.003
vel+0.003
 exactly+0.003
ently+0.003
ocolate+0.003
Neg logit contributions
 pet-0.001
s-0.001
 water-0.001
 dog-0.001
 drops-0.001
Token so
Token ID523
Feature activation-6.60e-03

Pos logit contributions
ently+0.030
anish+0.029
mmm+0.028
mmmm+0.027
 perspective+0.026
Neg logit contributions
 pet-0.016
 strong-0.014
 warm-0.014
 big-0.014
 calm-0.014
Token excited
Token ID6568
Feature activation+2.01e-04

Pos logit contributions
 pet+0.008
 dog+0.006
 strong+0.006
 water+0.006
 drops+0.006
Neg logit contributions
ently-0.015
anish-0.014
 exactly-0.014
vel-0.014
 correctly-0.014
Token that
Token ID326
Feature activation+1.66e-02

Pos logit contributions
vel+0.000
ently+0.000
anish+0.000
 correctly+0.000
ocolate+0.000
Neg logit contributions
 pet-0.000
 dog-0.000
s-0.000
 walk-0.000
 water-0.000
Token from
Token ID422
Feature activation+2.15e-02

Pos logit contributions
ently+0.042
anish+0.042
vel+0.041
 fault+0.040
rol+0.040
Neg logit contributions
 pet-0.016
 walk-0.014
 dog-0.014
 water-0.012
 went-0.012
Token then
Token ID788
Feature activation+1.36e-03

Pos logit contributions
vel+0.052
anish+0.052
mmm+0.050
ently+0.049
rol+0.048
Neg logit contributions
 pet-0.028
 water-0.023
 dog-0.023
 walk-0.022
 drops-0.019
Token on
Token ID319
Feature activation+2.01e-02

Pos logit contributions
ently+0.003
 exactly+0.003
vel+0.003
 fault+0.003
 correctly+0.003
Neg logit contributions
 pet-0.002
 walk-0.001
 dog-0.001
s-0.001
 drops-0.001
Token,
Token ID11
Feature activation+2.54e-02

Pos logit contributions
vel+0.050
rol+0.049
ust+0.047
anish+0.047
ieved+0.047
Neg logit contributions
 pet-0.023
 dog-0.021
 walk-0.019
s-0.017
 water-0.017
Token she
Token ID673
Feature activation+2.38e-02

Pos logit contributions
ently+0.062
anish+0.061
vel+0.060
ocolate+0.058
 fault+0.058
Neg logit contributions
 pet-0.031
 dog-0.025
 walk-0.024
 water-0.022
 went-0.021
Token was
Token ID373
Feature activation+2.26e-02

Pos logit contributions
ently+0.060
ocolate+0.059
mmm+0.059
anish+0.059
vel+0.058
Neg logit contributions
 pet-0.026
 went-0.020
 walk-0.019
 dog-0.018
s-0.018
Token always
Token ID1464
Feature activation+1.90e-02

Pos logit contributions
anish+0.049
ently+0.047
mmm+0.047
mmmm+0.044
 perspective+0.042
Neg logit contributions
 pet-0.036
 strong-0.031
 warm-0.029
 hurt-0.029
 big-0.028
Token happy
Token ID3772
Feature activation-2.79e-03

Pos logit contributions
ently+0.043
mmm+0.042
anish+0.040
ide+0.040
mmmm+0.039
Neg logit contributions
 pet-0.024
 strong-0.021
 dog-0.020
 water-0.020
 sleepy-0.020
Token.
Token ID13
Feature activation+1.05e-02

Pos logit contributions
anish+0.048
vel+0.048
mmm+0.047
ently+0.047
ocolate+0.047
Neg logit contributions
s-0.015
 pet-0.015
 water-0.014
 dog-0.013
 walk-0.013
Token Everyone
Token ID11075
Feature activation-4.46e-02

Pos logit contributions
ently+0.028
vel+0.028
 exactly+0.027
 forever+0.027
 correctly+0.027
Neg logit contributions
 pet-0.008
s-0.007
 dog-0.007
 water-0.006
 walk-0.005
Token agreed
Token ID4987
Feature activation+5.69e-03

Pos logit contributions
 pet+0.041
s+0.035
 went+0.034
 dog+0.033
 started+0.032
Neg logit contributions
mmm-0.117
 fault-0.115
ently-0.114
vel-0.112
mmmm-0.110
Token that
Token ID326
Feature activation+2.22e-02

Pos logit contributions
vel+0.014
ently+0.014
rol+0.014
anish+0.014
mmm+0.014
Neg logit contributions
 pet-0.005
 water-0.004
 dog-0.004
 walk-0.004
s-0.004
Token it
Token ID340
Feature activation-4.84e-03

Pos logit contributions
ently+0.054
vel+0.052
ieved+0.052
mmm+0.051
aney+0.051
Neg logit contributions
 pet-0.023
 dog-0.022
 water-0.021
 walk-0.020
s-0.019
Token was
Token ID373
Feature activation+1.62e-02

Pos logit contributions
 pet+0.006
 dog+0.005
 walk+0.005
 water+0.005
 drops+0.005
Neg logit contributions
 exactly-0.010
ently-0.010
vel-0.010
 forever-0.010
 correctly-0.010
Token important
Token ID1593
Feature activation+1.42e-02

Pos logit contributions
ently+0.039
anish+0.039
vel+0.037
mmm+0.037
rol+0.036
Neg logit contributions
 pet-0.018
 strong-0.018
 water-0.018
 big-0.017
 warm-0.016
Token to
Token ID284
Feature activation+2.33e-02

Pos logit contributions
vel+0.039
rol+0.039
ieved+0.039
ently+0.038
aney+0.038
Neg logit contributions
 pet-0.010
 dog-0.009
 water-0.008
s-0.007
 walk-0.007
Token be
Token ID307
Feature activation+1.48e-02

Pos logit contributions
rol+0.059
ently+0.058
vel+0.057
 exactly+0.056
 correctly+0.055
Neg logit contributions
 pet-0.023
 walk-0.021
 water-0.020
 wave-0.018
 play-0.018
Token flexible
Token ID12846
Feature activation+4.50e-02

Pos logit contributions
ently+0.035
mmm+0.034
vel+0.033
anish+0.033
 correctly+0.033
Neg logit contributions
 pet-0.017
 water-0.014
 dog-0.014
 strong-0.013
 drops-0.012
Token,
Token ID11
Feature activation+2.58e-02

Pos logit contributions
mmm+0.114
ocolate+0.112
ently+0.110
vel+0.110
aney+0.110
Neg logit contributions
 pet-0.039
 dog-0.036
 water-0.035
 drops-0.033
s-0.032
Token"
Token ID1
Feature activation-2.92e-02

Pos logit contributions
 exactly+0.012
 fault+0.012
ently+0.012
 forever+0.012
 correctly+0.012
Neg logit contributions
 pet-0.006
 dog-0.005
 walk-0.005
s-0.005
 water-0.005
TokenLook
Token ID8567
Feature activation+2.12e-02

Pos logit contributions
 pet+0.041
 dog+0.036
 walk+0.035
 water+0.035
 drops+0.034
Neg logit contributions
 exactly-0.058
 forever-0.057
ently-0.057
 correctly-0.056
vel-0.056
Token at
Token ID379
Feature activation+5.91e-03

Pos logit contributions
ently+0.048
 virtue+0.047
anish+0.046
rol+0.046
eway+0.046
Neg logit contributions
 pet-0.025
 dog-0.022
s-0.021
 walk-0.020
 drops-0.019
Token me
Token ID502
Feature activation-2.46e-03

Pos logit contributions
ently+0.015
vel+0.014
anish+0.014
 correctly+0.014
rum+0.014
Neg logit contributions
 pet-0.007
 water-0.006
 dog-0.006
 big-0.005
 drops-0.005
Token,
Token ID11
Feature activation+1.76e-02

Pos logit contributions
 pet+0.003
 dog+0.003
 walk+0.002
 wave+0.002
 water+0.002
Neg logit contributions
ently-0.006
 virtue-0.006
 accompl-0.006
anish-0.005
rol-0.005
Token I
Token ID314
Feature activation+1.61e-02

Pos logit contributions
ently+0.044
ieved+0.040
vel+0.039
 correctly+0.038
 forever+0.038
Neg logit contributions
 pet-0.025
 dog-0.024
 water-0.021
 big-0.019
 drops-0.019
Token am
Token ID716
Feature activation+1.32e-02

Pos logit contributions
 exactly+0.035
ently+0.035
 forever+0.035
 correctly+0.034
vel+0.034
Neg logit contributions
 pet-0.019
 dog-0.016
 walk-0.016
 water-0.016
 drops-0.015
Token so
Token ID523
Feature activation-9.57e-03

Pos logit contributions
ently+0.033
anish+0.032
 correctly+0.029
 perspective+0.029
ieved+0.029
Neg logit contributions
 pet-0.017
 water-0.014
 strong-0.014
 big-0.014
 hurt-0.013
Token pretty
Token ID2495
Feature activation-2.36e-03

Pos logit contributions
 pet+0.009
 water+0.007
 dog+0.006
 big+0.006
 drops+0.006
Neg logit contributions
ently-0.025
 correctly-0.025
 exactly-0.024
anish-0.024
 forever-0.024
Token!"
Token ID2474
Feature activation-1.02e-02

Pos logit contributions
 pet+0.004
 dog+0.003
s+0.003
 water+0.003
 drops+0.003
Neg logit contributions
ently-0.004
 exactly-0.004
vel-0.004
anish-0.004
ocolate-0.004
Token she
Token ID673
Feature activation+1.08e-02

Pos logit contributions
 pet+0.013
 dog+0.011
 water+0.010
 drops+0.010
s+0.010
Neg logit contributions
vel-0.024
ently-0.024
 forever-0.024
 exactly-0.024
 correctly-0.023
Token
Token ID198
Feature activation+3.61e-03

Pos logit contributions
 exactly+0.005
vel+0.005
ently+0.005
ieved+0.005
 correctly+0.005
Neg logit contributions
 pet-0.001
 dog-0.001
s-0.001
 water-0.001
 play-0.001
Token
Token ID198
Feature activation+4.87e-03

Pos logit contributions
 exactly+0.010
 correctly+0.010
vel+0.010
ently+0.010
 forever+0.009
Neg logit contributions
 pet-0.002
s-0.002
 dog-0.002
 walk-0.002
 water-0.002
TokenOne
Token ID3198
Feature activation-1.29e-03

Pos logit contributions
ently+0.011
 fault+0.010
 exactly+0.010
 correctly+0.010
rol+0.010
Neg logit contributions
 pet-0.007
 walk-0.006
 dog-0.006
 bark-0.006
 big-0.006
Token day
Token ID1110
Feature activation-2.52e-02

Pos logit contributions
 pet+0.002
 drops+0.002
s+0.002
 pets+0.002
 stuffing+0.002
Neg logit contributions
 forever-0.002
 exactly-0.002
 correctly-0.002
vel-0.002
ently-0.002
Token,
Token ID11
Feature activation+1.18e-02

Pos logit contributions
 pet+0.042
 dog+0.037
 walk+0.037
 water+0.036
 drops+0.036
Neg logit contributions
 exactly-0.042
 forever-0.042
ently-0.041
vel-0.040
 correctly-0.040
Token something
Token ID1223
Feature activation-1.38e-02

Pos logit contributions
ently+0.026
vel+0.026
 exactly+0.026
 correctly+0.025
 forever+0.025
Neg logit contributions
 pet-0.014
 dog-0.013
 water-0.012
 walk-0.012
 drops-0.011
Token magical
Token ID10883
Feature activation+1.24e-02

Pos logit contributions
 pet+0.020
 dog+0.019
 Claus+0.018
 walk+0.018
s+0.018
Neg logit contributions
 exactly-0.025
 forever-0.023
 correctly-0.022
vel-0.022
ently-0.022
Token happened
Token ID3022
Feature activation+3.48e-02

Pos logit contributions
 exactly+0.021
 forever+0.021
ently+0.021
 correctly+0.020
vel+0.020
Neg logit contributions
 pet-0.020
 dog-0.018
 walk-0.017
 water-0.017
s-0.017
Token.
Token ID13
Feature activation-6.11e-03

Pos logit contributions
aney+0.087
ently+0.087
 exactly+0.086
mmm+0.086
 correctly+0.086
Neg logit contributions
 pet-0.029
 dog-0.023
s-0.022
 walk-0.022
 water-0.022
Token A
Token ID317
Feature activation-1.45e-03

Pos logit contributions
 pet+0.007
 dog+0.006
 water+0.005
 drops+0.005
 walk+0.005
Neg logit contributions
 exactly-0.015
ently-0.014
 forever-0.014
vel-0.014
 correctly-0.014
Token little
Token ID1310
Feature activation-1.85e-02

Pos logit contributions
 pet+0.001
 dog+0.001
 walk+0.001
s+0.001
 drops+0.001
Neg logit contributions
 exactly-0.004
 forever-0.004
vel-0.003
ently-0.003
 correctly-0.003
Token did
Token ID750
Feature activation-1.58e-02

Pos logit contributions
 pet+0.001
 dog+0.001
 walk+0.001
 water+0.001
 drops+0.001
Neg logit contributions
 exactly-0.001
 forever-0.001
ently-0.001
vel-0.001
 correctly-0.001
Token you
Token ID345
Feature activation+6.53e-03

Pos logit contributions
 pet+0.024
 dog+0.022
 walk+0.021
 water+0.021
 drops+0.021
Neg logit contributions
 exactly-0.029
 forever-0.028
ently-0.028
vel-0.028
 correctly-0.028
Token go
Token ID467
Feature activation-1.58e-02

Pos logit contributions
 exactly+0.015
 forever+0.015
vel+0.015
 correctly+0.015
ently+0.015
Neg logit contributions
 pet-0.006
 dog-0.005
 drops-0.005
s-0.005
 walk-0.005
Token outside
Token ID2354
Feature activation-9.05e-03

Pos logit contributions
 pet+0.018
 dog+0.016
 walk+0.015
 water+0.015
 drops+0.015
Neg logit contributions
 exactly-0.035
 forever-0.034
ently-0.034
vel-0.034
 correctly-0.034
Token?
Token ID30
Feature activation+1.06e-02

Pos logit contributions
 pet+0.008
 water+0.008
 dog+0.008
 play+0.007
 big+0.007
Neg logit contributions
ently-0.025
vel-0.023
rol-0.023
mmm-0.023
aney-0.023
Token Why
Token ID4162
Feature activation-4.43e-03

Pos logit contributions
 exactly+0.018
ently+0.018
 forever+0.018
 correctly+0.017
vel+0.017
Neg logit contributions
 pet-0.018
 dog-0.017
 drops-0.016
 water-0.016
 walk-0.015
Token didn
Token ID1422
Feature activation-2.65e-02

Pos logit contributions
 pet+0.006
 dog+0.006
 water+0.005
 walk+0.005
s+0.005
Neg logit contributions
ently-0.009
vel-0.009
 forever-0.009
 exactly-0.009
 correctly-0.009
Token't
Token ID470
Feature activation-2.57e-03

Pos logit contributions
 pet+0.053
 pets+0.047
 dog+0.046
 walk+0.045
 drops+0.045
Neg logit contributions
 correctly-0.034
 exactly-0.034
vel-0.034
 forever-0.034
ently-0.033
Token you
Token ID345
Feature activation+4.05e-03

Pos logit contributions
 pet+0.004
 pets+0.004
 drops+0.004
 dog+0.004
s+0.004
Neg logit contributions
 forever-0.004
 exactly-0.004
vel-0.004
 correctly-0.004
mmm-0.004
Token listen
Token ID6004
Feature activation+1.52e-02

Pos logit contributions
 exactly+0.010
ently+0.010
vel+0.010
 forever+0.010
 correctly+0.010
Neg logit contributions
 pet-0.004
 dog-0.003
 walk-0.003
 water-0.003
 drops-0.003
Token to
Token ID284
Feature activation+1.92e-02

Pos logit contributions
ently+0.046
ocolate+0.046
anish+0.045
rol+0.045
mmm+0.045
Neg logit contributions
 pet-0.007
s-0.006
 water-0.006
 big-0.005
 dog-0.004
Token day
Token ID1110
Feature activation-1.32e-03

Pos logit contributions
 exactly+0.027
ently+0.027
vel+0.026
 forever+0.026
 correctly+0.026
Neg logit contributions
 pet-0.018
 dog-0.016
 walk-0.016
 water-0.016
 drops-0.015
Token.
Token ID13
Feature activation+9.24e-03

Pos logit contributions
 pet+0.001
 dog+0.001
s+0.001
 water+0.001
 walk+0.001
Neg logit contributions
 exactly-0.003
ently-0.003
anish-0.003
vel-0.003
ieved-0.003
Token He
Token ID679
Feature activation+1.83e-03

Pos logit contributions
 forever+0.018
 exactly+0.018
 correctly+0.017
ently+0.017
vel+0.016
Neg logit contributions
 pet-0.013
 dog-0.011
 drops-0.011
 walk-0.011
 water-0.011
Token lived
Token ID5615
Feature activation-3.99e-03

Pos logit contributions
ently+0.004
 exactly+0.004
vel+0.004
mmm+0.004
 correctly+0.004
Neg logit contributions
 pet-0.002
 dog-0.002
 walk-0.002
 water-0.002
s-0.002
Token in
Token ID287
Feature activation-3.31e-04

Pos logit contributions
 pet+0.005
 dog+0.005
 walk+0.004
 water+0.004
 drops+0.004
Neg logit contributions
 exactly-0.008
 forever-0.008
ently-0.008
vel-0.008
 correctly-0.008
Token a
Token ID257
Feature activation-1.42e-03

Pos logit contributions
 pet+0.001
s+0.000
 drops+0.000
 walk+0.000
 dog+0.000
Neg logit contributions
 exactly-0.001
 forever-0.001
 correctly-0.001
vel-0.001
ently-0.000
Token small
Token ID1402
Feature activation-1.05e-02

Pos logit contributions
 pet+0.001
 drops+0.001
 walk+0.001
ted+0.001
s+0.001
Neg logit contributions
 forever-0.003
 exactly-0.003
vel-0.003
 correctly-0.003
ently-0.003
Token house
Token ID2156
Feature activation+7.42e-03

Pos logit contributions
 pet+0.011
 dog+0.009
 walk+0.009
 drops+0.009
 water+0.009
Neg logit contributions
 forever-0.023
 exactly-0.023
vel-0.023
ently-0.022
 correctly-0.022
Token with
Token ID351
Feature activation-4.89e-03

Pos logit contributions
ently+0.018
mmm+0.018
vel+0.018
 exactly+0.017
 correctly+0.017
Neg logit contributions
 pet-0.008
 dog-0.006
 walk-0.006
 water-0.006
s-0.005
Token his
Token ID465
Feature activation+9.97e-04

Pos logit contributions
 pet+0.008
 drops+0.007
 walk+0.007
 dog+0.007
s+0.007
Neg logit contributions
 exactly-0.008
 forever-0.008
 correctly-0.007
vel-0.007
ently-0.007
Token best
Token ID1266
Feature activation+3.95e-02

Pos logit contributions
 forever+0.002
 exactly+0.002
 fault+0.002
 correctly+0.002
ently+0.002
Neg logit contributions
 drops-0.001
s-0.001
ted-0.001
 pans-0.001
 pet-0.001
Token competition
Token ID5449
Feature activation+3.58e-02

Pos logit contributions
 exactly+0.016
ently+0.016
aney+0.016
rol+0.016
vel+0.016
Neg logit contributions
 pet-0.010
 walk-0.009
 dog-0.009
 water-0.008
 wave-0.008
Token!
Token ID0
Feature activation+1.32e-02

Pos logit contributions
ently+0.093
 exactly+0.090
mmm+0.088
ieved+0.088
vel+0.087
Neg logit contributions
 pet-0.031
s-0.029
 walk-0.026
 dog-0.026
 water-0.023
Token He
Token ID679
Feature activation+8.33e-03

Pos logit contributions
vel+0.034
 exactly+0.033
ently+0.033
 forever+0.033
 correctly+0.032
Neg logit contributions
 pet-0.013
 dog-0.011
s-0.011
 water-0.010
 bark-0.009
Token was
Token ID373
Feature activation+1.65e-02

Pos logit contributions
ently+0.022
ocolate+0.022
mmm+0.022
otion+0.022
 exactly+0.022
Neg logit contributions
 pet-0.008
s-0.006
 went-0.006
 started-0.006
 bark-0.006
Token so
Token ID523
Feature activation-8.60e-03

Pos logit contributions
ently+0.039
anish+0.037
mmm+0.034
ool+0.034
vel+0.033
Neg logit contributions
 pet-0.024
 strong-0.022
 big-0.021
 water-0.020
 w-0.020
Token happy
Token ID3772
Feature activation-6.19e-03

Pos logit contributions
 pet+0.010
 water+0.008
 strong+0.008
 dog+0.008
 drops+0.008
Neg logit contributions
ently-0.020
 exactly-0.019
anish-0.019
 forever-0.019
vel-0.019
Token that
Token ID326
Feature activation+1.90e-02

Pos logit contributions
 pet+0.006
 dog+0.006
s+0.005
 water+0.005
 walk+0.005
Neg logit contributions
ently-0.015
vel-0.015
 correctly-0.014
ieved-0.014
 exactly-0.014
Token he
Token ID339
Feature activation+2.60e-02

Pos logit contributions
ently+0.055
vel+0.053
anish+0.051
ieved+0.051
 fault+0.051
Neg logit contributions
 pet-0.013
 dog-0.012
 water-0.010
 wave-0.009
 walk-0.009
Token rubbed
Token ID31862
Feature activation-3.96e-03

Pos logit contributions
ently+0.063
 exactly+0.062
vel+0.061
 correctly+0.060
 fault+0.060
Neg logit contributions
 pet-0.027
 dog-0.022
s-0.021
 drops-0.021
 water-0.020
Token his
Token ID465
Feature activation-1.37e-02

Pos logit contributions
 pet+0.004
 water+0.004
 dog+0.004
s+0.004
 drops+0.003
Neg logit contributions
vel-0.009
mmm-0.009
ently-0.009
 exactly-0.009
ust-0.009
Token head
Token ID1182
Feature activation-9.99e-03

Pos logit contributions
 pet+0.014
 dog+0.012
 walk+0.012
 water+0.011
 drops+0.011
Neg logit contributions
 exactly-0.031
 forever-0.031
ently-0.031
 correctly-0.030
vel-0.030
Token.
Token ID13
Feature activation+4.50e-03

Pos logit contributions
vel+0.019
mmm+0.019
 exactly+0.019
ieved+0.019
anish+0.019
Neg logit contributions
 pet-0.007
 water-0.006
s-0.006
 dog-0.006
 walk-0.006
Token
Token ID220
Feature activation-2.19e-02

Pos logit contributions
vel+0.009
 exactly+0.009
 forever+0.009
ently+0.009
 correctly+0.008
Neg logit contributions
 pet-0.006
 dog-0.006
 water-0.006
s-0.005
 walk-0.005
Token
Token ID198
Feature activation+1.11e-02

Pos logit contributions
 pet+0.015
s+0.014
 dog+0.013
 water+0.012
 walk+0.011
Neg logit contributions
 forever-0.056
vel-0.056
 exactly-0.056
 correctly-0.055
ently-0.055
Token
Token ID198
Feature activation+1.24e-02

Pos logit contributions
 exactly+0.028
 forever+0.028
 correctly+0.028
ently+0.027
vel+0.027
Neg logit contributions
 pet-0.009
s-0.007
 dog-0.007
 water-0.007
 walk-0.007
TokenDad
Token ID46270
Feature activation-1.78e-02

Pos logit contributions
 forever+0.020
 exactly+0.020
ently+0.020
 correctly+0.020
 fault+0.020
Neg logit contributions
 pet-0.021
 dog-0.019
 water-0.019
s-0.019
 walk-0.019
Token said
Token ID531
Feature activation-4.13e-03

Pos logit contributions
 pet+0.016
 dog+0.013
s+0.012
 walk+0.012
 water+0.012
Neg logit contributions
ently-0.044
 forever-0.044
 exactly-0.044
vel-0.043
 correctly-0.043
Token,
Token ID11
Feature activation+2.34e-02

Pos logit contributions
 pet+0.003
 dog+0.003
s+0.003
 man+0.003
 big+0.003
Neg logit contributions
ieved-0.011
ently-0.011
ust-0.011
 perspective-0.011
rol-0.011
Token "
Token ID366
Feature activation+1.51e-02

Pos logit contributions
ently+0.055
 fault+0.055
ieved+0.054
vel+0.054
 exactly+0.053
Neg logit contributions
 pet-0.025
 dog-0.023
 man-0.021
 water-0.020
 bark-0.019
TokenGreat
Token ID13681
Feature activation+2.79e-02

Pos logit contributions
 exactly+0.032
ently+0.032
 forever+0.031
 correctly+0.031
vel+0.031
Neg logit contributions
 pet-0.019
 dog-0.017
 water-0.016
 walk-0.016
s-0.015
Token job
Token ID1693
Feature activation+1.42e-02

Pos logit contributions
 exactly+0.054
ently+0.053
 forever+0.053
 correctly+0.052
vel+0.052
Neg logit contributions
 pet-0.040
 dog-0.036
 walk-0.034
 water-0.034
 drops-0.033
Token,
Token ID11
Feature activation+3.06e-02

Pos logit contributions
ently+0.031
ieved+0.031
rol+0.031
vel+0.031
ust+0.030
Neg logit contributions
 water-0.014
 pet-0.014
 dog-0.014
 drops-0.013
 wave-0.012
Token.
Token ID13
Feature activation-2.02e-02

Pos logit contributions
 pet+0.055
 dog+0.048
 walk+0.047
 water+0.045
 drops+0.045
Neg logit contributions
 exactly-0.097
 forever-0.096
ently-0.094
 correctly-0.094
vel-0.094
Token
Token ID198
Feature activation-1.71e-03

Pos logit contributions
 pet+0.023
 dog+0.020
 walk+0.019
 water+0.019
 drops+0.018
Neg logit contributions
 exactly-0.045
 forever-0.044
ently-0.044
vel-0.043
 correctly-0.043
Token
Token ID198
Feature activation-3.80e-03

Pos logit contributions
 pet+0.002
 walk+0.002
 dog+0.002
 drops+0.002
 wave+0.002
Neg logit contributions
 exactly-0.003
ently-0.003
 forever-0.003
 correctly-0.003
mmm-0.003
TokenMom
Token ID29252
Feature activation-3.66e-02

Pos logit contributions
 pet+0.007
 walk+0.007
 drops+0.006
 dog+0.006
 pets+0.006
Neg logit contributions
 exactly-0.005
mmm-0.005
vel-0.005
ently-0.005
 forever-0.005
Token and
Token ID290
Feature activation-4.95e-02

Pos logit contributions
 pet+0.049
 pets+0.044
 walk+0.043
 dog+0.043
 water+0.040
Neg logit contributions
 exactly-0.067
abel-0.063
 forever-0.063
ently-0.063
 correctly-0.063
Token Dad
Token ID17415
Feature activation-4.05e-02

Pos logit contributions
s+0.079
 pet+0.078
 stuffing+0.078
actus+0.077
 drops+0.074
Neg logit contributions
 exactly-0.068
 correctly-0.065
 forever-0.064
 none-0.059
 Jane-0.057
Token heard
Token ID2982
Feature activation-8.67e-03

Pos logit contributions
 pet+0.053
 pets+0.049
 dog+0.048
 pans+0.047
 water+0.046
Neg logit contributions
 exactly-0.075
 forever-0.075
 correctly-0.070
vel-0.069
ently-0.068
Token them
Token ID606
Feature activation-2.11e-02

Pos logit contributions
 pet+0.015
 dog+0.014
 walk+0.013
 water+0.013
 drops+0.013
Neg logit contributions
 exactly-0.014
 forever-0.014
ently-0.014
vel-0.014
 correctly-0.014
Token.
Token ID13
Feature activation-2.17e-02

Pos logit contributions
 pet+0.029
 dog+0.025
 pans+0.024
 water+0.024
 drops+0.024
Neg logit contributions
 exactly-0.040
 forever-0.040
 correctly-0.039
vel-0.038
ently-0.038
Token They
Token ID1119
Feature activation-3.59e-02

Pos logit contributions
 walk+0.041
 pet+0.037
 Claus+0.035
 wave+0.035
 stuffing+0.035
Neg logit contributions
 correctly-0.021
 Jane-0.021
 Megan-0.020
 about-0.020
 exactly-0.020
Token ran
Token ID4966
Feature activation-9.29e-03

Pos logit contributions
 pet+0.038
s+0.036
 walk+0.035
 pets+0.035
 dog+0.034
Neg logit contributions
 exactly-0.075
 forever-0.071
 correctly-0.065
vel-0.065
ently-0.063
Token,
Token ID11
Feature activation-6.76e-03

Pos logit contributions
vel+0.052
ently+0.052
 exactly+0.052
ieved+0.052
mmm+0.050
Neg logit contributions
s-0.022
 pet-0.022
 dog-0.022
 water-0.020
 drops-0.017
Token and
Token ID290
Feature activation-2.63e-02

Pos logit contributions
 pet+0.008
 dog+0.007
 water+0.006
 walk+0.006
 drops+0.006
Neg logit contributions
 exactly-0.016
ently-0.015
vel-0.015
 forever-0.015
 correctly-0.015
Token puzzles
Token ID24367
Feature activation+2.92e-02

Pos logit contributions
 pet+0.035
 walk+0.031
 dog+0.031
 water+0.031
 drops+0.029
Neg logit contributions
ently-0.057
 exactly-0.055
vel-0.054
 forever-0.053
mmm-0.053
Token.
Token ID13
Feature activation-9.80e-03

Pos logit contributions
mmm+0.084
ently+0.083
vel+0.083
loe+0.082
ocolate+0.081
Neg logit contributions
 pet-0.018
 water-0.017
s-0.015
 dog-0.014
 big-0.013
Token Lily
Token ID20037
Feature activation-3.71e-03

Pos logit contributions
 pet+0.011
 dog+0.010
 water+0.010
s+0.009
 drops+0.009
Neg logit contributions
ently-0.023
 exactly-0.023
vel-0.023
 forever-0.022
 correctly-0.022
Token likes
Token ID7832
Feature activation-3.78e-02

Pos logit contributions
 pet+0.004
 dog+0.003
s+0.003
 drops+0.003
 water+0.003
Neg logit contributions
 exactly-0.009
ently-0.009
vel-0.009
 forever-0.009
mmm-0.009
Token puzzles
Token ID24367
Feature activation+1.53e-02

Pos logit contributions
 pet+0.036
 dog+0.030
 water+0.029
 walk+0.028
 pets+0.026
Neg logit contributions
ently-0.097
 exactly-0.095
vel-0.095
mmm-0.093
 forever-0.093
Token.
Token ID13
Feature activation-7.34e-03

Pos logit contributions
ently+0.045
vel+0.045
rol+0.044
mmm+0.044
loe+0.044
Neg logit contributions
 pet-0.008
 water-0.008
 big-0.008
s-0.007
 dog-0.006
Token She
Token ID1375
Feature activation-1.12e-02

Pos logit contributions
 pet+0.010
 dog+0.009
 walk+0.008
 water+0.008
 drops+0.008
Neg logit contributions
 exactly-0.015
 forever-0.015
ently-0.015
vel-0.014
 correctly-0.014
Token likes
Token ID7832
Feature activation-3.89e-02

Pos logit contributions
 pet+0.012
 dog+0.010
 drops+0.009
 water+0.009
 walk+0.009
Neg logit contributions
ently-0.028
 exactly-0.027
 forever-0.027
vel-0.027
 correctly-0.026
Token to
Token ID284
Feature activation-1.02e-02

Pos logit contributions
 pet+0.044
 dog+0.039
 water+0.036
 walk+0.036
s+0.035
Neg logit contributions
ently-0.090
 exactly-0.090
vel-0.088
 forever-0.087
mmm-0.085
Token up
Token ID510
Feature activation-6.84e-04

Pos logit contributions
 pet+0.028
 dog+0.024
s+0.024
 walk+0.024
 water+0.023
Neg logit contributions
 exactly-0.056
ently-0.056
vel-0.056
 forever-0.055
mmm-0.054
Token,
Token ID11
Feature activation+1.28e-02

Pos logit contributions
 pet+0.001
 water+0.001
 walk+0.000
 dog+0.000
 strong+0.000
Neg logit contributions
vel-0.002
mmm-0.002
ently-0.002
 exactly-0.002
ieved-0.002
Token but
Token ID475
Feature activation+5.12e-03

Pos logit contributions
ently+0.027
 exactly+0.027
vel+0.027
 forever+0.026
anish+0.026
Neg logit contributions
 pet-0.017
 dog-0.014
 water-0.014
 walk-0.014
 drops-0.014
Token it
Token ID340
Feature activation-1.20e-02

Pos logit contributions
vel+0.011
ently+0.011
aney+0.011
anish+0.010
rol+0.010
Neg logit contributions
 pet-0.006
 water-0.006
 dog-0.006
 drops-0.005
 went-0.005
Token was
Token ID373
Feature activation+1.18e-02

Pos logit contributions
 pet+0.016
 dog+0.014
 walk+0.014
 pets+0.013
 water+0.013
Neg logit contributions
 exactly-0.024
 forever-0.023
 correctly-0.023
ently-0.023
vel-0.022
Token too
Token ID1165
Feature activation-5.31e-02

Pos logit contributions
ently+0.027
vel+0.026
anish+0.026
mmm+0.025
 exactly+0.025
Neg logit contributions
 pet-0.014
 water-0.012
 drops-0.012
 dog-0.012
 strong-0.011
Token hard
Token ID1327
Feature activation+3.39e-02

Pos logit contributions
 pet+0.068
 dog+0.059
 pets+0.059
 walk+0.056
ted+0.055
Neg logit contributions
 forever-0.106
 exactly-0.106
 correctly-0.101
 badly-0.097
vel-0.097
Token.
Token ID13
Feature activation+4.08e-03

Pos logit contributions
 exactly+0.077
ently+0.076
vel+0.076
 forever+0.075
 correctly+0.074
Neg logit contributions
 pet-0.035
 dog-0.031
s-0.030
 walk-0.029
 water-0.029
Token Tim
Token ID5045
Feature activation+1.08e-02

Pos logit contributions
vel+0.010
 forever+0.010
 exactly+0.010
ently+0.010
 correctly+0.010
Neg logit contributions
 pet-0.004
 dog-0.003
 water-0.003
s-0.003
 drops-0.003
Token felt
Token ID2936
Feature activation+1.51e-02

Pos logit contributions
 exactly+0.025
 forever+0.025
ently+0.025
vel+0.025
 correctly+0.025
Neg logit contributions
 pet-0.011
 dog-0.009
 walk-0.009
 water-0.009
 drops-0.009
Token sad
Token ID6507
Feature activation-1.83e-02

Pos logit contributions
 exactly+0.030
 forever+0.029
ently+0.029
vel+0.029
 correctly+0.029
Neg logit contributions
 pet-0.021
 dog-0.018
 walk-0.018
 water-0.017
 drops-0.017
Token was
Token ID373
Feature activation-3.93e-03

Pos logit contributions
 pet+0.068
 pets+0.063
 walk+0.062
 dog+0.061
 pans+0.057
Neg logit contributions
 exactly-0.068
 forever-0.065
 correctly-0.062
ieved-0.056
ently-0.056
Token making
Token ID1642
Feature activation-1.64e-02

Pos logit contributions
 pet+0.005
 dog+0.004
 water+0.004
 drops+0.004
 walk+0.004
Neg logit contributions
ently-0.009
vel-0.008
 exactly-0.008
 forever-0.008
 correctly-0.008
Token noises
Token ID26782
Feature activation-4.70e-02

Pos logit contributions
 pet+0.016
 water+0.015
 dog+0.014
 drops+0.013
 walk+0.013
Neg logit contributions
ently-0.040
 forever-0.039
vel-0.039
 exactly-0.039
 correctly-0.039
Token again
Token ID757
Feature activation+1.24e-02

Pos logit contributions
 pet+0.066
 dog+0.057
 walk+0.055
 pets+0.054
 drops+0.052
Neg logit contributions
 forever-0.092
 correctly-0.088
 exactly-0.088
vel-0.086
ently-0.081
Token.
Token ID13
Feature activation-1.74e-02

Pos logit contributions
 exactly+0.027
 forever+0.027
ently+0.027
vel+0.027
 correctly+0.027
Neg logit contributions
 pet-0.015
 dog-0.013
 walk-0.012
 water-0.012
 drops-0.012
Token It
Token ID632
Feature activation-4.56e-02

Pos logit contributions
 pet+0.021
 dog+0.019
 water+0.018
 drops+0.017
s+0.016
Neg logit contributions
vel-0.039
 exactly-0.039
ently-0.038
 forever-0.038
 correctly-0.038
Token was
Token ID373
Feature activation-3.49e-03

Pos logit contributions
 pet+0.069
 walk+0.065
 pets+0.064
 dog+0.064
 packages+0.061
Neg logit contributions
 exactly-0.072
 forever-0.069
 correctly-0.065
abella-0.060
abel-0.060
Token flashing
Token ID25293
Feature activation-2.51e-02

Pos logit contributions
 pet+0.004
 dog+0.004
 water+0.004
 drops+0.004
 walk+0.003
Neg logit contributions
ently-0.008
vel-0.007
 exactly-0.007
 correctly-0.007
 forever-0.007
Token and
Token ID290
Feature activation-3.64e-02

Pos logit contributions
 pet+0.037
 walk+0.034
 dog+0.031
s+0.031
 pets+0.031
Neg logit contributions
 forever-0.048
 exactly-0.047
 correctly-0.044
 anymore-0.043
ently-0.043
Token moving
Token ID3867
Feature activation-9.62e-03

Pos logit contributions
 pet+0.041
 dog+0.035
 walk+0.034
s+0.033
 water+0.032
Neg logit contributions
 exactly-0.080
 forever-0.079
 correctly-0.077
ently-0.077
vel-0.076
Token.
Token ID13
Feature activation-2.35e-02

Pos logit contributions
 pet+0.010
 dog+0.009
 water+0.008
 drops+0.008
s+0.008
Neg logit contributions
ently-0.023
mmm-0.023
 exactly-0.023
vel-0.023
ocolate-0.022
Token.
Token ID13
Feature activation-3.34e-03

Pos logit contributions
 exactly+0.006
 forever+0.006
ently+0.006
 correctly+0.006
vel+0.006
Neg logit contributions
 pet-0.004
 dog-0.004
 walk-0.003
 water-0.003
 drops-0.003
Token
Token ID198
Feature activation+2.45e-03

Pos logit contributions
 pet+0.004
 walk+0.004
 dog+0.004
 drops+0.004
 water+0.004
Neg logit contributions
 forever-0.007
 exactly-0.007
 correctly-0.007
ently-0.006
vel-0.006
Token
Token ID198
Feature activation+5.58e-04

Pos logit contributions
 exactly+0.005
 forever+0.005
ently+0.005
vel+0.005
 correctly+0.005
Neg logit contributions
 pet-0.003
 dog-0.003
 walk-0.003
 drops-0.002
 water-0.002
TokenThey
Token ID2990
Feature activation-2.56e-02

Pos logit contributions
 exactly+0.001
 forever+0.001
ently+0.001
 correctly+0.001
vel+0.001
Neg logit contributions
 pet-0.001
 dog-0.001
s-0.001
 water-0.001
 walk-0.001
Token all
Token ID477
Feature activation-2.03e-02

Pos logit contributions
 pet+0.024
 dog+0.019
 walk+0.018
 water+0.018
 drops+0.017
Neg logit contributions
ently-0.064
 exactly-0.063
vel-0.063
 forever-0.062
 correctly-0.062
Token hugged
Token ID41130
Feature activation-4.33e-02

Pos logit contributions
 pet+0.020
 dog+0.018
 walk+0.016
s+0.016
 water+0.016
Neg logit contributions
 exactly-0.047
 forever-0.046
 correctly-0.046
ently-0.045
vel-0.045
Token and
Token ID290
Feature activation-1.48e-02

Pos logit contributions
 pet+0.053
 dog+0.047
 walk+0.045
 water+0.044
 drops+0.044
Neg logit contributions
 exactly-0.092
 forever-0.091
ently-0.090
 correctly-0.089
vel-0.089
Token smiled
Token ID13541
Feature activation-2.51e-02

Pos logit contributions
 pet+0.017
 dog+0.016
s+0.016
 drops+0.015
 walk+0.015
Neg logit contributions
 forever-0.029
 exactly-0.028
 correctly-0.028
ently-0.026
vel-0.025
Token.
Token ID13
Feature activation-1.12e-02

Pos logit contributions
 pet+0.033
 walk+0.030
 dog+0.029
 drops+0.028
 water+0.028
Neg logit contributions
 forever-0.051
 exactly-0.050
 correctly-0.048
ently-0.047
vel-0.047
Token Then
Token ID3244
Feature activation-6.66e-03

Pos logit contributions
 pet+0.013
 dog+0.011
 walk+0.011
 water+0.010
 drops+0.010
Neg logit contributions
 exactly-0.024
 forever-0.024
ently-0.024
 correctly-0.024
vel-0.024
Token they
Token ID484
Feature activation-1.80e-02

Pos logit contributions
 pet+0.009
 dog+0.008
 walk+0.007
 water+0.007
 drops+0.007
Neg logit contributions
 exactly-0.014
 forever-0.014
ently-0.013
 correctly-0.013
vel-0.013