TOP ACTIVATIONS
MAX = 0.101

the mud all the time . <|endoftext|> Once upon a time
little girl and her family . <|endoftext|> Once upon a time
and grateful to her friend . From that day on ,
for her everywhere , but couldn 't find her . A
first ." Tim my promised and quickly ran out of the
crawled out from under the leather and surprised Sammy . Sammy
apple and her tum my couldn 't handle it . She
She hurt her head and couldn 't wake up . Lucy
Sadly , Nut ty 's sickness got worse and he couldn
!" Mr . Fox smiled and said , " You 're
careful and sat on the c actus . They became good
ear ." Fl oppy apologized and they continued to have a
you not to touch the c actus !" Tom
't make you touch the c actus ," Sam said .
help when he needed it . <|endoftext|> Once upon a time
happy and thanked Farmer Brown . They brought the hay back
She hurt herself badly and couldn 't get out of the
y flew away , happy and clean . <|endoftext|> Once upon
hurt her leg . She couldn 't dance anymore and had
They lost their passports and couldn 't go on their trip

MIN ACTIVATIONS
MIN = -0.113

if she could go get an X - ray .
ie . They both had birth days coming up , and
s mom explained that an X - ray was a
was too young to get an X - ray , but
, " One , two , three !" As soon as
. They had many toys , but their favorite was a
They had many number toys , like blocks , puzzles ,
. They had many toys , but their favorite was a
steps . One , two , three - all the way
them . They could count , add , and subtract with
to count and she counted to five . Then her head
. They had many toys , but their favorite was a
. " One , two , three !" they shouted ,
cards . They could count , add , and subtract very
up information and asked many questions . Eventually , the mole
. They had many toys , but their favorite ones were
. " One , two , three , four ," she
. They had many toys , but their favorite was a
blocks . One , two , three , four ... The
so high . How did you learn to count so well

INTERVAL 0.048 - 0.101
CONTAINS 1.002%

. One day , she couldn 't find the key to
together all the time and everyone in the shop was happy
just built ever again and he was left feeling scared and
One warm day , the sun was shining , so the
L ily says , " Thank you , Mom . I

INTERVAL -0.006 - 0.048
CONTAINS 60.713%

her mum . " Hel ping others can be a delicate
hole in the net . He was not angry . He
One day she was playing in her room , when
means different things to different people . She had learned that
Tim my he was going to be punished . Tim my

INTERVAL -0.059 - -0.006
CONTAINS 37.991%

She wanted to see what it could unlock . She took
We can skip around the yard and I 'll count how
again excited . <|endoftext|> Once upon a time , there were
, faster and faster . Tim 's mom
and said , " I will call you Blue . You

INTERVAL -0.113 - -0.059
CONTAINS 0.294%

replied , " I can look at the stones and remember
. " He is unique . I have never seen a
in this cup . Do you want to see it ?"
the trees . " One , two , three !" they
, " We need many different colors so we can draw

Token the
Token ID262
Feature activation+2.63e-02

Pos logit contributions
 therapist+0.015
actus+0.015
 Midnight+0.014
lor+0.014
 masse+0.013
Neg logit contributions
 guess-0.017
 count-0.016
 three-0.016
 ten-0.016
 imagine-0.015
Token mud
Token ID17492
Feature activation+3.57e-02

Pos logit contributions
 therapist+0.054
 Midnight+0.050
actus+0.048
lor+0.048
 masse+0.045
Neg logit contributions
 count-0.042
 guess-0.041
 three-0.040
 ten-0.038
 One-0.037
Token all
Token ID477
Feature activation+1.43e-02

Pos logit contributions
 therapist+0.082
 Midnight+0.080
actus+0.079
 masse+0.078
lor+0.074
Neg logit contributions
 count-0.045
 guess-0.043
 ten-0.041
 three-0.039
 counted-0.039
Token the
Token ID262
Feature activation+1.23e-02

Pos logit contributions
 therapist+0.024
actus+0.023
 Midnight+0.023
lor+0.022
 masse+0.021
Neg logit contributions
 count-0.026
 guess-0.026
 three-0.025
 ten-0.025
 imagine-0.024
Token time
Token ID640
Feature activation+5.30e-02

Pos logit contributions
 therapist+0.024
actus+0.023
 Midnight+0.023
lor+0.022
 masse+0.021
Neg logit contributions
 count-0.020
 guess-0.019
 three-0.018
 ten-0.018
 imagine-0.017
Token.
Token ID13
Feature activation+1.01e-01

Pos logit contributions
 therapist+0.128
actus+0.124
 masse+0.118
lor+0.117
 Midnight+0.112
Neg logit contributions
 like-0.057
 three-0.056
 guess-0.055
 ten-0.054
 count-0.052
Token<|endoftext|>
Token ID50256
Feature activation+4.11e-02

Pos logit contributions
 therapist+0.284
 swollen+0.276
guard+0.268
actus+0.267
eds+0.266
Neg logit contributions
 One-0.109
 Many-0.059
 three-0.057
 Some-0.056
 Today-0.055
TokenOnce
Token ID7454
Feature activation+1.43e-02

Pos logit contributions
 therapist+0.089
actus+0.085
 masse+0.084
 Midnight+0.083
 swollen+0.079
Neg logit contributions
 guess-0.055
 One-0.052
 three-0.052
 count-0.051
 ten-0.049
Token upon
Token ID2402
Feature activation-2.22e-02

Pos logit contributions
 therapist+0.039
actus+0.038
 Midnight+0.038
 masse+0.036
lor+0.036
Neg logit contributions
 guess-0.010
 count-0.010
 three-0.009
 ten-0.008
 One-0.008
Token a
Token ID257
Feature activation-2.80e-02

Pos logit contributions
 count+0.030
 guess+0.029
 1+0.027
 counted+0.026
 ten+0.026
Neg logit contributions
 Midnight-0.045
 therapist-0.044
actus-0.044
 Woo-0.040
iflower-0.040
Token time
Token ID640
Feature activation-3.63e-02

Pos logit contributions
 count+0.044
 guess+0.043
 counted+0.041
 three+0.040
 ten+0.040
Neg logit contributions
 Midnight-0.050
actus-0.049
 therapist-0.047
iflower-0.044
 masse-0.044
Token little
Token ID1310
Feature activation+2.86e-02

Pos logit contributions
lor+0.041
 therapist+0.041
actus+0.040
iflower+0.038
 Midnight+0.036
Neg logit contributions
 guess-0.073
 three-0.073
 count-0.070
 One-0.066
 like-0.065
Token girl
Token ID2576
Feature activation+4.98e-03

Pos logit contributions
actus+0.044
 therapist+0.042
lor+0.042
iflower+0.041
 Midnight+0.041
Neg logit contributions
 guess-0.056
 three-0.055
 count-0.053
 imagine-0.050
 One-0.050
Token and
Token ID290
Feature activation+6.23e-02

Pos logit contributions
 therapist+0.010
actus+0.010
 Midnight+0.010
 masse+0.009
lor+0.009
Neg logit contributions
 guess-0.007
 count-0.007
 three-0.007
 ten-0.007
 imagine-0.006
Token her
Token ID607
Feature activation+2.02e-02

Pos logit contributions
actus+0.142
lor+0.128
iflower+0.124
 therapist+0.121
eds+0.120
Neg logit contributions
 guess-0.091
 count-0.089
 three-0.087
 imagine-0.083
 counted-0.082
Token family
Token ID1641
Feature activation+4.98e-02

Pos logit contributions
 therapist+0.037
actus+0.037
 Midnight+0.036
lor+0.035
 masse+0.033
Neg logit contributions
 guess-0.033
 count-0.033
 three-0.032
 ten-0.030
 imagine-0.029
Token.
Token ID13
Feature activation+9.89e-02

Pos logit contributions
actus+0.112
 Midnight+0.108
 masse+0.107
 therapist+0.107
lor+0.102
Neg logit contributions
 count-0.063
 guess-0.062
 three-0.060
 like-0.058
 counted-0.056
Token<|endoftext|>
Token ID50256
Feature activation+3.96e-02

Pos logit contributions
 therapist+0.266
 swollen+0.265
eds+0.258
iflower+0.257
 drowned+0.254
Neg logit contributions
 One-0.111
 Many-0.065
 three-0.063
 Some-0.061
 Today-0.060
TokenOnce
Token ID7454
Feature activation+3.41e-03

Pos logit contributions
 therapist+0.081
actus+0.078
 masse+0.076
 Midnight+0.076
 swollen+0.073
Neg logit contributions
 guess-0.055
 three-0.054
 One-0.054
 count-0.052
 ten-0.050
Token upon
Token ID2402
Feature activation-3.52e-02

Pos logit contributions
 therapist+0.009
actus+0.009
 Midnight+0.009
 masse+0.008
lor+0.008
Neg logit contributions
 guess-0.003
 count-0.003
 three-0.002
 ten-0.002
 imagine-0.002
Token a
Token ID257
Feature activation-3.29e-02

Pos logit contributions
 count+0.053
 guess+0.049
 1+0.049
 counted+0.047
ruits+0.047
Neg logit contributions
 Midnight-0.067
 therapist-0.064
actus-0.062
 Woo-0.059
iflower-0.057
Token time
Token ID640
Feature activation-4.61e-02

Pos logit contributions
 count+0.054
 guess+0.051
 counted+0.050
 ten+0.049
 three+0.049
Neg logit contributions
 Midnight-0.056
actus-0.056
 therapist-0.053
lor-0.049
iflower-0.049
Token and
Token ID290
Feature activation+8.25e-02

Pos logit contributions
 therapist+0.127
actus+0.127
 masse+0.123
lor+0.119
 Midnight+0.119
Neg logit contributions
 like-0.062
 guess-0.057
 three-0.051
 count-0.051
 whether-0.048
Token grateful
Token ID14066
Feature activation+5.15e-02

Pos logit contributions
actus+0.196
lor+0.194
orns+0.179
burn+0.173
rob+0.173
Neg logit contributions
 counted-0.107
 guess-0.104
 excited-0.096
 like-0.095
 count-0.094
Token to
Token ID284
Feature activation+6.17e-02

Pos logit contributions
actus+0.120
 therapist+0.116
lor+0.113
 masse+0.111
 Midnight+0.111
Neg logit contributions
 like-0.066
 guess-0.059
 three-0.057
 count-0.054
 ten-0.052
Token her
Token ID607
Feature activation+1.84e-02

Pos logit contributions
actus+0.160
 therapist+0.153
lor+0.147
iflower+0.136
uddled+0.134
Neg logit contributions
 One-0.075
 count-0.074
 three-0.072
 guess-0.071
 ten-0.065
Token friend
Token ID1545
Feature activation+2.28e-02

Pos logit contributions
actus+0.037
 therapist+0.036
 Midnight+0.035
lor+0.034
 masse+0.032
Neg logit contributions
 guess-0.029
 count-0.029
 three-0.028
 imagine-0.026
 ten-0.025
Token.
Token ID13
Feature activation+9.79e-02

Pos logit contributions
actus+0.045
 therapist+0.044
 Midnight+0.040
lor+0.040
 masse+0.040
Neg logit contributions
 guess-0.036
 count-0.035
 three-0.033
 ten-0.032
 like-0.031
Token From
Token ID3574
Feature activation+4.19e-02

Pos logit contributions
 severe+0.246
eds+0.245
 thorn+0.242
 drowned+0.241
burn+0.241
Neg logit contributions
 One-0.117
 three-0.072
 Many-0.063
 Today-0.062
 As-0.058
Token that
Token ID326
Feature activation+2.88e-02

Pos logit contributions
 therapist+0.080
lor+0.079
actus+0.074
 Midnight+0.073
 sy+0.072
Neg logit contributions
 guess-0.065
 like-0.063
 three-0.063
 count-0.059
 ten-0.058
Token day
Token ID1110
Feature activation+4.85e-02

Pos logit contributions
actus+0.057
 therapist+0.056
lor+0.052
 Midnight+0.052
 masse+0.050
Neg logit contributions
 count-0.046
 guess-0.045
 three-0.045
 One-0.042
 ten-0.041
Token on
Token ID319
Feature activation+5.52e-02

Pos logit contributions
actus+0.105
 therapist+0.099
 Midnight+0.091
lor+0.088
keeping+0.088
Neg logit contributions
 three-0.068
 count-0.067
 guess-0.065
 One-0.063
 like-0.063
Token,
Token ID11
Feature activation+7.53e-02

Pos logit contributions
actus+0.185
 therapist+0.177
lor+0.177
keeping+0.173
burn+0.169
Neg logit contributions
 One-0.013
 count-0.006
 three-0.004
 like-0.002
 yes-0.001
Token for
Token ID329
Feature activation+2.29e-02

Pos logit contributions
 count+0.009
 guess+0.009
 ten+0.008
 three+0.008
 counted+0.008
Neg logit contributions
actus-0.008
 Midnight-0.008
 therapist-0.008
 masse-0.007
 Woo-0.007
Token her
Token ID607
Feature activation+6.42e-03

Pos logit contributions
actus+0.039
 therapist+0.038
lor+0.035
iflower+0.033
 Midnight+0.032
Neg logit contributions
 guess-0.042
 count-0.042
 three-0.041
 ten-0.040
 One-0.039
Token everywhere
Token ID8347
Feature activation+5.24e-03

Pos logit contributions
 therapist+0.012
actus+0.011
 Midnight+0.011
 masse+0.010
lor+0.010
Neg logit contributions
 guess-0.011
 count-0.010
 three-0.010
 ten-0.010
 imagine-0.009
Token,
Token ID11
Feature activation+3.84e-03

Pos logit contributions
 therapist+0.012
actus+0.012
 Midnight+0.012
lor+0.011
 masse+0.011
Neg logit contributions
 guess-0.006
 count-0.006
 three-0.006
 ten-0.006
 imagine-0.005
Token but
Token ID475
Feature activation+3.40e-02

Pos logit contributions
 therapist+0.011
actus+0.010
 Midnight+0.010
 masse+0.010
lor+0.010
Neg logit contributions
 guess-0.003
 count-0.003
 three-0.002
 ten-0.002
 imagine-0.002
Token couldn
Token ID3521
Feature activation+9.75e-02

Pos logit contributions
actus+0.066
lor+0.061
 therapist+0.061
iflower+0.057
 masse+0.056
Neg logit contributions
 guess-0.056
 count-0.052
 three-0.052
 like-0.051
 ten-0.050
Token't
Token ID470
Feature activation+8.93e-03

Pos logit contributions
actus+0.246
 therapist+0.227
rows+0.221
ugg+0.216
iflower+0.216
Neg logit contributions
 One-0.124
 count-0.085
 three-0.082
 guess-0.082
 Two-0.081
Token find
Token ID1064
Feature activation+3.74e-02

Pos logit contributions
 therapist+0.021
actus+0.020
 Midnight+0.020
lor+0.019
 masse+0.019
Neg logit contributions
 guess-0.011
 count-0.011
 three-0.009
 ten-0.009
 imagine-0.009
Token her
Token ID607
Feature activation+1.62e-02

Pos logit contributions
 therapist+0.076
actus+0.075
lor+0.068
iflower+0.067
ugg+0.065
Neg logit contributions
 ten-0.058
 count-0.058
 three-0.056
 guess-0.055
 One-0.054
Token.
Token ID13
Feature activation+5.23e-02

Pos logit contributions
 therapist+0.032
actus+0.031
 Midnight+0.030
 masse+0.029
lor+0.028
Neg logit contributions
 guess-0.025
 count-0.025
 three-0.023
 ten-0.023
 imagine-0.022
Token A
Token ID317
Feature activation+2.68e-02

Pos logit contributions
actus+0.116
 therapist+0.115
iflower+0.114
lor+0.108
rob+0.106
Neg logit contributions
 One-0.071
 guess-0.061
 three-0.059
 Many-0.055
 count-0.051
Token first
Token ID717
Feature activation-2.71e-03

Pos logit contributions
 guess+0.008
 count+0.008
 three+0.007
 ten+0.007
 imagine+0.007
Neg logit contributions
 therapist-0.007
 Midnight-0.006
actus-0.006
 masse-0.006
lor-0.006
Token."
Token ID526
Feature activation+2.01e-02

Pos logit contributions
 guess+0.005
 count+0.005
 three+0.004
 ten+0.004
 imagine+0.004
Neg logit contributions
 therapist-0.005
 Midnight-0.004
actus-0.004
 masse-0.004
lor-0.004
Token Tim
Token ID5045
Feature activation-7.66e-04

Pos logit contributions
 therapist+0.049
actus+0.048
 Midnight+0.045
lor+0.045
 masse+0.044
Neg logit contributions
 guess-0.022
 count-0.021
 One-0.020
 three-0.020
 ten-0.019
Tokenmy
Token ID1820
Feature activation+2.06e-02

Pos logit contributions
 guess+0.001
 count+0.001
 three+0.001
 ten+0.001
 imagine+0.001
Neg logit contributions
 therapist-0.002
actus-0.002
 Midnight-0.002
 masse-0.002
lor-0.002
Token promised
Token ID8072
Feature activation+4.46e-02

Pos logit contributions
actus+0.041
 therapist+0.040
 Midnight+0.040
lor+0.037
 masse+0.037
Neg logit contributions
 guess-0.031
 count-0.030
 three-0.029
 counted-0.027
 ten-0.027
Token and
Token ID290
Feature activation+9.73e-02

Pos logit contributions
actus+0.108
 therapist+0.100
iflower+0.099
 Midnight+0.095
lor+0.093
Neg logit contributions
 guess-0.057
 three-0.054
 count-0.053
 One-0.050
 ten-0.049
Token quickly
Token ID2952
Feature activation+4.98e-02

Pos logit contributions
actus+0.242
orns+0.211
eds+0.208
lor+0.206
iflower+0.199
Neg logit contributions
 three-0.115
 counted-0.114
 guess-0.111
 count-0.098
 yes-0.097
Token ran
Token ID4966
Feature activation+2.07e-02

Pos logit contributions
actus+0.111
 therapist+0.105
 Midnight+0.098
iflower+0.098
lor+0.096
Neg logit contributions
 guess-0.070
 count-0.069
 counted-0.069
 three-0.062
 like-0.059
Token out
Token ID503
Feature activation-3.95e-03

Pos logit contributions
 therapist+0.031
actus+0.030
 Midnight+0.028
iflower+0.028
lor+0.027
Neg logit contributions
 guess-0.043
 count-0.042
 three-0.041
 ten-0.041
 like-0.039
Token of
Token ID286
Feature activation+3.24e-02

Pos logit contributions
 guess+0.007
 count+0.007
 ten+0.006
 three+0.006
 imagine+0.006
Neg logit contributions
 therapist-0.007
 Midnight-0.007
actus-0.007
 masse-0.006
 drowned-0.006
Token the
Token ID262
Feature activation+1.30e-02

Pos logit contributions
 therapist+0.074
actus+0.074
lor+0.072
ugg+0.064
iflower+0.064
Neg logit contributions
 ten-0.043
 count-0.043
 guess-0.042
 three-0.042
 One-0.037
Token crawled
Token ID45668
Feature activation+1.62e-03

Pos logit contributions
 count+0.012
 guess+0.012
 ten+0.011
 three+0.011
 imagine+0.010
Neg logit contributions
 therapist-0.015
actus-0.015
 Midnight-0.014
 drowned-0.013
 masse-0.013
Token out
Token ID503
Feature activation-1.51e-02

Pos logit contributions
 therapist+0.003
actus+0.003
 Midnight+0.003
 masse+0.002
lor+0.002
Neg logit contributions
 guess-0.003
 count-0.003
 three-0.003
 ten-0.003
 imagine-0.002
Token from
Token ID422
Feature activation-7.28e-03

Pos logit contributions
 guess+0.028
 count+0.028
 imagine+0.026
ruits+0.026
 ten+0.026
Neg logit contributions
 therapist-0.022
 Midnight-0.022
actus-0.021
 masse-0.020
 Woo-0.019
Token under
Token ID739
Feature activation-6.40e-03

Pos logit contributions
 guess+0.012
 count+0.012
 three+0.011
 ten+0.010
 imagine+0.010
Neg logit contributions
 Midnight-0.013
 therapist-0.013
actus-0.013
 Woo-0.011
 masse-0.011
Token the
Token ID262
Feature activation+5.43e-02

Pos logit contributions
 guess+0.012
 count+0.011
 imagine+0.010
 ten+0.010
 counted+0.010
Neg logit contributions
 Midnight-0.010
 therapist-0.010
actus-0.010
 Woo-0.009
 drowned-0.009
Token leather
Token ID11620
Feature activation+9.71e-02

Pos logit contributions
 therapist+0.104
lor+0.103
ugg+0.096
iflower+0.095
ic+0.095
Neg logit contributions
 three-0.083
 count-0.080
 like-0.077
 guess-0.077
 ten-0.075
Token and
Token ID290
Feature activation+9.83e-03

Pos logit contributions
iflower+0.172
live+0.165
 swollen+0.165
ww+0.165
 therapist+0.164
Neg logit contributions
 three-0.126
 like-0.118
 One-0.118
 guess-0.117
 but-0.113
Token surprised
Token ID6655
Feature activation-9.56e-04

Pos logit contributions
 therapist+0.019
actus+0.018
 Midnight+0.018
lor+0.017
 masse+0.017
Neg logit contributions
 guess-0.016
 count-0.015
 three-0.014
 ten-0.014
 imagine-0.014
Token Sammy
Token ID44067
Feature activation+8.64e-03

Pos logit contributions
 guess+0.001
 count+0.001
 three+0.001
 ten+0.001
 imagine+0.001
Neg logit contributions
 therapist-0.002
actus-0.002
 Midnight-0.002
 masse-0.002
lor-0.002
Token.
Token ID13
Feature activation-4.99e-03

Pos logit contributions
 therapist+0.016
actus+0.015
 Midnight+0.015
 masse+0.014
lor+0.014
Neg logit contributions
 guess-0.014
 count-0.014
 three-0.013
 ten-0.013
 imagine-0.013
Token Sammy
Token ID44067
Feature activation+1.69e-02

Pos logit contributions
 guess+0.007
 count+0.007
 ten+0.007
 three+0.007
 imagine+0.007
Neg logit contributions
 Midnight-0.010
 therapist-0.010
actus-0.010
 Woo-0.009
 masse-0.009
Token apple
Token ID17180
Feature activation+5.69e-02

Pos logit contributions
 therapist+0.034
actus+0.033
 Midnight+0.032
lor+0.032
 Woo+0.029
Neg logit contributions
 count-0.042
 guess-0.041
 three-0.040
 ten-0.039
 imagine-0.038
Token and
Token ID290
Feature activation+6.12e-02

Pos logit contributions
 therapist+0.131
actus+0.131
lor+0.128
 masse+0.123
ugg+0.122
Neg logit contributions
 count-0.064
 guess-0.062
 like-0.061
 ten-0.058
 three-0.055
Token her
Token ID607
Feature activation+2.14e-02

Pos logit contributions
lor+0.129
actus+0.129
 therapist+0.123
 masse+0.116
ugg+0.110
Neg logit contributions
 counted-0.081
 guess-0.081
 like-0.080
 three-0.080
 count-0.079
Token tum
Token ID11814
Feature activation+1.30e-02

Pos logit contributions
 therapist+0.040
actus+0.038
lor+0.037
 Midnight+0.037
 masse+0.035
Neg logit contributions
 guess-0.035
 count-0.035
 three-0.033
 ten-0.032
 imagine-0.031
Tokenmy
Token ID1820
Feature activation+4.59e-02

Pos logit contributions
 therapist+0.033
actus+0.033
 Midnight+0.032
 masse+0.031
lor+0.030
Neg logit contributions
 guess-0.011
 count-0.011
 three-0.010
 ten-0.010
 counted-0.009
Token couldn
Token ID3521
Feature activation+9.70e-02

Pos logit contributions
 therapist+0.074
actus+0.072
lor+0.069
 Midnight+0.069
 Woo+0.067
Neg logit contributions
 count-0.082
 guess-0.081
 three-0.079
 like-0.077
 counted-0.077
Token't
Token ID470
Feature activation+2.68e-02

Pos logit contributions
actus+0.234
 therapist+0.230
ugg+0.208
ich+0.206
iflower+0.205
Neg logit contributions
 One-0.123
 count-0.099
 guess-0.090
 three-0.087
 Many-0.085
Token handle
Token ID5412
Feature activation+1.98e-02

Pos logit contributions
 therapist+0.062
actus+0.060
lor+0.058
 Midnight+0.057
 masse+0.055
Neg logit contributions
 count-0.035
 guess-0.035
 imagine-0.030
 like-0.030
 three-0.029
Token it
Token ID340
Feature activation+1.39e-02

Pos logit contributions
 therapist+0.038
actus+0.037
 Midnight+0.036
lor+0.035
 masse+0.034
Neg logit contributions
 count-0.030
 guess-0.030
 ten-0.029
 three-0.029
 like-0.027
Token.
Token ID13
Feature activation+2.47e-02

Pos logit contributions
 therapist+0.028
actus+0.027
 Midnight+0.026
lor+0.025
 masse+0.025
Neg logit contributions
 count-0.021
 guess-0.021
 ten-0.019
 three-0.019
 imagine-0.018
Token She
Token ID1375
Feature activation+4.64e-02

Pos logit contributions
 therapist+0.053
actus+0.051
iflower+0.047
lor+0.047
 Midnight+0.047
Neg logit contributions
 One-0.034
 guess-0.033
 three-0.031
 count-0.031
 ten-0.029
Token She
Token ID1375
Feature activation+2.96e-02

Pos logit contributions
 therapist+0.028
actus+0.028
 Midnight+0.027
lor+0.026
 masse+0.025
Neg logit contributions
 guess-0.020
 count-0.019
 three-0.018
 ten-0.017
 One-0.017
Token hurt
Token ID5938
Feature activation+1.70e-02

Pos logit contributions
 therapist+0.056
actus+0.056
lor+0.054
 Midnight+0.053
 masse+0.052
Neg logit contributions
 guess-0.047
 count-0.045
 three-0.043
 counted-0.043
 ten-0.042
Token her
Token ID607
Feature activation+1.71e-02

Pos logit contributions
actus+0.032
 therapist+0.031
 Midnight+0.029
lor+0.029
iflower+0.028
Neg logit contributions
 guess-0.028
 count-0.027
 three-0.027
 ten-0.027
 like-0.025
Token head
Token ID1182
Feature activation+7.54e-03

Pos logit contributions
 therapist+0.025
actus+0.024
 Midnight+0.023
lor+0.022
 masse+0.021
Neg logit contributions
 guess-0.035
 count-0.035
 three-0.033
 ten-0.032
 imagine-0.032
Token and
Token ID290
Feature activation+4.89e-02

Pos logit contributions
 therapist+0.016
actus+0.015
 Midnight+0.015
 masse+0.014
lor+0.014
Neg logit contributions
 guess-0.010
 count-0.010
 three-0.010
 ten-0.009
 imagine-0.009
Token couldn
Token ID3521
Feature activation+9.56e-02

Pos logit contributions
actus+0.100
lor+0.098
 therapist+0.094
iflower+0.086
ugg+0.086
Neg logit contributions
 guess-0.075
 count-0.074
 like-0.071
 counted-0.070
 three-0.070
Token't
Token ID470
Feature activation+1.50e-02

Pos logit contributions
actus+0.243
 therapist+0.236
iflower+0.218
ugg+0.214
lor+0.209
Neg logit contributions
 One-0.114
 count-0.088
 guess-0.082
 three-0.079
 ten-0.075
Token wake
Token ID7765
Feature activation+3.41e-02

Pos logit contributions
 therapist+0.032
actus+0.031
 Midnight+0.030
lor+0.029
 masse+0.028
Neg logit contributions
 count-0.022
 guess-0.022
 ten-0.019
 imagine-0.019
 three-0.019
Token up
Token ID510
Feature activation+2.63e-02

Pos logit contributions
 therapist+0.067
actus+0.065
lor+0.062
ugg+0.061
 masse+0.058
Neg logit contributions
 count-0.051
 guess-0.051
 ten-0.051
 three-0.048
 like-0.046
Token.
Token ID13
Feature activation+1.95e-02

Pos logit contributions
actus+0.056
 therapist+0.055
 Midnight+0.052
lor+0.052
iflower+0.050
Neg logit contributions
 guess-0.036
 count-0.036
 three-0.034
 ten-0.033
 like-0.033
Token Lucy
Token ID22162
Feature activation+4.57e-02

Pos logit contributions
 therapist+0.039
actus+0.039
 Midnight+0.036
iflower+0.035
lor+0.035
Neg logit contributions
 guess-0.029
 count-0.028
 One-0.027
 three-0.027
 ten-0.026
TokenSadly
Token ID36725
Feature activation+3.21e-02

Pos logit contributions
 therapist+0.068
actus+0.064
 Midnight+0.061
 disappears+0.055
 masse+0.055
Neg logit contributions
 guess-0.052
 count-0.050
 One-0.049
 like-0.046
 three-0.046
Token,
Token ID11
Feature activation+3.77e-02

Pos logit contributions
 therapist+0.085
actus+0.084
 Midnight+0.081
lor+0.080
 masse+0.079
Neg logit contributions
 guess-0.024
 three-0.023
 count-0.023
 ten-0.021
 One-0.020
Token Nut
Token ID11959
Feature activation+5.38e-02

Pos logit contributions
 therapist+0.058
actus+0.056
lor+0.050
 Midnight+0.046
ugg+0.046
Neg logit contributions
 One-0.076
 three-0.075
 guess-0.075
 like-0.074
 count-0.074
Tokenty
Token ID774
Feature activation+5.66e-02

Pos logit contributions
 therapist+0.093
 masse+0.086
 Midnight+0.084
actus+0.083
+0.081
Neg logit contributions
 guess-0.087
 count-0.084
 counted-0.083
 like-0.080
 three-0.076
Token's
Token ID338
Feature activation+2.66e-02

Pos logit contributions
 therapist+0.116
actus+0.114
 masse+0.112
 Midnight+0.109
lor+0.106
Neg logit contributions
 guess-0.079
 counted-0.076
 count-0.074
 three-0.071
 ten-0.065
Token sickness
Token ID26359
Feature activation+9.51e-02

Pos logit contributions
 therapist+0.047
actus+0.045
 Midnight+0.044
lor+0.043
 masse+0.041
Neg logit contributions
 guess-0.045
 count-0.045
 three-0.044
 ten-0.041
 imagine-0.040
Token got
Token ID1392
Feature activation+3.30e-02

Pos logit contributions
actus+0.209
 masse+0.206
 therapist+0.204
iflower+0.198
lor+0.197
Neg logit contributions
 like-0.117
 counted-0.107
 liked-0.092
 count-0.091
 guess-0.091
Token worse
Token ID4785
Feature activation+3.20e-02

Pos logit contributions
 therapist+0.062
actus+0.059
lor+0.055
 Midnight+0.053
ugg+0.050
Neg logit contributions
 three-0.056
 ten-0.055
 count-0.055
 guess-0.054
 like-0.053
Token and
Token ID290
Feature activation+5.75e-02

Pos logit contributions
 therapist+0.075
actus+0.074
 Midnight+0.074
 masse+0.069
lor+0.069
Neg logit contributions
 guess-0.037
 count-0.036
 like-0.034
 three-0.033
 ten-0.032
Token he
Token ID339
Feature activation+4.08e-02

Pos logit contributions
actus+0.124
 therapist+0.123
lor+0.120
ugg+0.112
Charl+0.109
Neg logit contributions
 three-0.074
 One-0.072
 guess-0.071
 count-0.070
 ten-0.069
Token couldn
Token ID3521
Feature activation+8.68e-02

Pos logit contributions
 therapist+0.082
actus+0.078
 masse+0.077
 Midnight+0.077
lor+0.074
Neg logit contributions
 guess-0.061
 counted-0.059
 count-0.058
 three-0.055
 ten-0.053
Token!"
Token ID2474
Feature activation+6.61e-02

Pos logit contributions
 therapist+0.000
actus+0.000
 Midnight+0.000
 masse+0.000
lor+0.000
Neg logit contributions
 guess-0.000
 count-0.000
 three-0.000
 ten-0.000
 imagine-0.000
Token Mr
Token ID1770
Feature activation+3.65e-02

Pos logit contributions
actus+0.154
 therapist+0.144
eds+0.143
lor+0.142
iflower+0.141
Neg logit contributions
 One-0.076
 three-0.068
 guess-0.067
 count-0.057
 yes-0.057
Token.
Token ID13
Feature activation+4.71e-02

Pos logit contributions
 therapist+0.078
actus+0.074
 Midnight+0.073
lor+0.071
 masse+0.070
Neg logit contributions
 guess-0.046
 count-0.046
 three-0.045
 ten-0.044
 One-0.043
Token Fox
Token ID5426
Feature activation+1.65e-02

Pos logit contributions
actus+0.071
 therapist+0.068
ugg+0.060
iflower+0.056
 Midnight+0.055
Neg logit contributions
 three-0.094
 guess-0.092
 One-0.091
 count-0.090
 ten-0.087
Token smiled
Token ID13541
Feature activation+1.43e-02

Pos logit contributions
 therapist+0.034
actus+0.034
 Midnight+0.033
 masse+0.031
lor+0.031
Neg logit contributions
 guess-0.023
 count-0.023
 three-0.021
 ten-0.021
 counted-0.020
Token and
Token ID290
Feature activation+9.50e-02

Pos logit contributions
 therapist+0.031
actus+0.031
 Midnight+0.030
lor+0.029
iflower+0.028
Neg logit contributions
 guess-0.018
 count-0.018
 three-0.018
 ten-0.016
 like-0.016
Token said
Token ID531
Feature activation+1.60e-02

Pos logit contributions
lor+0.251
eds+0.248
actus+0.248
orns+0.238
iflower+0.230
Neg logit contributions
 counted-0.085
 three-0.082
 guess-0.080
 yes-0.074
 like-0.066
Token,
Token ID11
Feature activation+4.06e-02

Pos logit contributions
actus+0.040
 therapist+0.039
 Midnight+0.037
lor+0.036
 masse+0.036
Neg logit contributions
 guess-0.016
 count-0.016
 three-0.015
 ten-0.014
 One-0.013
Token "
Token ID366
Feature activation+5.22e-02

Pos logit contributions
actus+0.100
 therapist+0.095
 Midnight+0.090
lor+0.089
 drowned+0.089
Neg logit contributions
 three-0.043
 guess-0.043
 count-0.041
 yes-0.039
 One-0.039
TokenYou
Token ID1639
Feature activation-4.51e-03

Pos logit contributions
actus+0.088
 disappears+0.079
 drowned+0.077
 therapist+0.076
 Midnight+0.074
Neg logit contributions
 three-0.093
 like-0.089
 guess-0.088
 especially-0.084
 count-0.083
Token're
Token ID821
Feature activation+2.93e-02

Pos logit contributions
 guess+0.005
 count+0.005
 ten+0.004
 three+0.004
 imagine+0.004
Neg logit contributions
 therapist-0.011
 Midnight-0.011
actus-0.010
lor-0.010
 masse-0.010
Token careful
Token ID8161
Feature activation-9.42e-03

Pos logit contributions
 guess+0.004
 count+0.004
 ten+0.004
 three+0.004
 imagine+0.004
Neg logit contributions
 therapist-0.006
 Midnight-0.006
actus-0.006
 masse-0.005
lor-0.005
Token and
Token ID290
Feature activation+2.34e-02

Pos logit contributions
 guess+0.014
 count+0.014
 ten+0.013
 three+0.013
 imagine+0.013
Neg logit contributions
 therapist-0.018
 Midnight-0.017
actus-0.017
 masse-0.016
lor-0.016
Token sat
Token ID3332
Feature activation+2.03e-02

Pos logit contributions
actus+0.048
 therapist+0.047
 Midnight+0.045
lor+0.044
 masse+0.043
Neg logit contributions
 guess-0.035
 count-0.034
 three-0.031
 counted-0.031
 imagine-0.031
Token on
Token ID319
Feature activation+2.01e-02

Pos logit contributions
 therapist+0.037
 Midnight+0.037
actus+0.036
 masse+0.034
lor+0.034
Neg logit contributions
 guess-0.032
 count-0.031
 three-0.031
 ten-0.030
 like-0.029
Token the
Token ID262
Feature activation+4.99e-02

Pos logit contributions
 therapist+0.038
actus+0.038
 Midnight+0.036
lor+0.034
 masse+0.034
Neg logit contributions
 guess-0.031
 count-0.031
 three-0.030
 ten-0.030
 One-0.029
Token c
Token ID269
Feature activation+9.49e-02

Pos logit contributions
 Midnight+0.090
 therapist+0.088
lor+0.083
+0.077
ugg+0.077
Neg logit contributions
 three-0.080
 count-0.079
 guess-0.078
 imagine-0.074
 ten-0.072
Tokenactus
Token ID34144
Feature activation+6.19e-03

Pos logit contributions
 Midnight+0.097
 therapist+0.088
 swollen+0.083
 drowned+0.077
 severe+0.074
Neg logit contributions
 One-0.198
 like-0.194
ruits-0.192
 guess-0.190
iner-0.190
Token.
Token ID13
Feature activation+5.59e-03

Pos logit contributions
 therapist+0.012
actus+0.012
 Midnight+0.011
 masse+0.010
lor+0.010
Neg logit contributions
 guess-0.010
 count-0.010
 three-0.009
 ten-0.009
 imagine-0.009
Token They
Token ID1119
Feature activation-3.24e-03

Pos logit contributions
 therapist+0.010
actus+0.010
 Midnight+0.010
 masse+0.009
lor+0.009
Neg logit contributions
 guess-0.009
 count-0.009
 three-0.008
 ten-0.008
 imagine-0.008
Token became
Token ID2627
Feature activation+1.91e-02

Pos logit contributions
 guess+0.005
 count+0.005
 three+0.005
 ten+0.005
 imagine+0.004
Neg logit contributions
 therapist-0.006
actus-0.006
 Midnight-0.006
 masse-0.005
 drowned-0.005
Token good
Token ID922
Feature activation+1.38e-02

Pos logit contributions
 therapist+0.037
actus+0.036
lor+0.035
 Midnight+0.034
ugg+0.032
Neg logit contributions
 guess-0.031
 count-0.031
 three-0.030
 ten-0.029
 like-0.028
Token ear
Token ID1027
Feature activation-6.28e-03

Pos logit contributions
 therapist+0.034
actus+0.033
 Midnight+0.032
lor+0.030
 masse+0.030
Neg logit contributions
 guess-0.030
 count-0.030
 three-0.028
 ten-0.027
 imagine-0.027
Token."
Token ID526
Feature activation+6.42e-02

Pos logit contributions
 guess+0.011
 count+0.011
 three+0.010
 ten+0.010
 counted+0.010
Neg logit contributions
 therapist-0.011
actus-0.010
 Midnight-0.010
lor-0.009
 masse-0.009
Token Fl
Token ID1610
Feature activation+4.20e-02

Pos logit contributions
actus+0.129
iflower+0.127
ich+0.125
 therapist+0.125
 severe+0.124
Neg logit contributions
 One-0.100
 three-0.084
 guess-0.080
 count-0.074
 yes-0.072
Tokenoppy
Token ID26696
Feature activation+3.67e-02

Pos logit contributions
 therapist+0.063
keeping+0.054
 Midnight+0.053
lor+0.052
 masse+0.052
Neg logit contributions
 guess-0.081
 three-0.075
 count-0.074
 One-0.073
 ten-0.072
Token apologized
Token ID22893
Feature activation+2.54e-02

Pos logit contributions
actus+0.077
 therapist+0.076
 Midnight+0.074
 masse+0.072
lor+0.072
Neg logit contributions
 guess-0.052
 count-0.049
 three-0.048
 counted-0.046
 ten-0.043
Token and
Token ID290
Feature activation+9.48e-02

Pos logit contributions
actus+0.062
 therapist+0.061
 Midnight+0.058
lor+0.057
iflower+0.056
Neg logit contributions
 guess-0.027
 three-0.025
 count-0.025
 like-0.024
 ten-0.024
Token they
Token ID484
Feature activation+3.73e-02

Pos logit contributions
actus+0.229
lor+0.223
eds+0.222
iflower+0.220
ich+0.212
Neg logit contributions
 guess-0.103
 counted-0.100
 three-0.099
 yes-0.093
 pretended-0.089
Token continued
Token ID3767
Feature activation+5.59e-02

Pos logit contributions
actus+0.081
 therapist+0.081
lor+0.076
iflower+0.074
 masse+0.074
Neg logit contributions
 guess-0.051
 count-0.049
 three-0.047
 counted-0.047
 ten-0.046
Token to
Token ID284
Feature activation+1.66e-02

Pos logit contributions
actus+0.147
 therapist+0.143
lor+0.136
iflower+0.129
 masse+0.124
Neg logit contributions
 three-0.063
 count-0.062
 One-0.057
 guess-0.056
 ten-0.055
Token have
Token ID423
Feature activation+3.76e-02

Pos logit contributions
 therapist+0.039
actus+0.038
 Midnight+0.036
lor+0.035
 masse+0.035
Neg logit contributions
 count-0.021
 guess-0.021
 imagine-0.018
 three-0.018
 ten-0.018
Token a
Token ID257
Feature activation+1.96e-02

Pos logit contributions
actus+0.098
 therapist+0.096
lor+0.087
iflower+0.086
ich+0.086
Neg logit contributions
 guess-0.039
 count-0.038
 three-0.038
 ten-0.037
 One-0.035
Token you
Token ID345
Feature activation-9.78e-03

Pos logit contributions
 therapist+0.014
actus+0.014
 Midnight+0.013
lor+0.012
 masse+0.011
Neg logit contributions
 guess-0.025
 count-0.025
 three-0.024
 ten-0.024
 imagine-0.023
Token not
Token ID407
Feature activation+9.18e-03

Pos logit contributions
 guess+0.018
 count+0.018
 three+0.017
 ten+0.017
 imagine+0.016
Neg logit contributions
 Midnight-0.015
 therapist-0.015
actus-0.015
 drowned-0.013
lor-0.013
Token to
Token ID284
Feature activation+1.36e-02

Pos logit contributions
 therapist+0.017
actus+0.016
 Midnight+0.016
lor+0.015
 masse+0.014
Neg logit contributions
 guess-0.016
 count-0.016
 ten-0.014
 three-0.014
 imagine-0.014
Token touch
Token ID3638
Feature activation+3.23e-02

Pos logit contributions
 therapist+0.025
actus+0.025
 Midnight+0.023
lor+0.023
iflower+0.022
Neg logit contributions
 guess-0.023
 count-0.023
 ten-0.021
 three-0.021
 imagine-0.021
Token the
Token ID262
Feature activation+5.02e-02

Pos logit contributions
 therapist+0.059
actus+0.057
ugg+0.053
lor+0.053
 Midnight+0.052
Neg logit contributions
 ten-0.052
 count-0.052
 three-0.051
 guess-0.051
 like-0.048
Token c
Token ID269
Feature activation+9.47e-02

Pos logit contributions
 Midnight+0.081
 therapist+0.078
lor+0.078
ugg+0.073
live+0.070
Neg logit contributions
 three-0.091
 guess-0.089
 like-0.085
 count-0.084
 ten-0.083
Tokenactus
Token ID34144
Feature activation+2.21e-02

Pos logit contributions
 Midnight+0.101
 therapist+0.098
 drowned+0.090
 swollen+0.083
 masse+0.081
Neg logit contributions
 like-0.176
ruits-0.176
bell-0.176
iner-0.174
 One-0.171
Token!"
Token ID2474
Feature activation+2.28e-03

Pos logit contributions
 therapist+0.044
actus+0.043
 Midnight+0.042
 masse+0.039
lor+0.039
Neg logit contributions
 guess-0.032
 count-0.032
 ten-0.030
 three-0.030
 like-0.028
Token
Token ID198
Feature activation+2.62e-02

Pos logit contributions
 therapist+0.004
actus+0.004
 Midnight+0.004
 masse+0.003
lor+0.003
Neg logit contributions
 guess-0.004
 count-0.004
 three-0.004
 ten-0.004
 imagine-0.004
Token
Token ID198
Feature activation+3.80e-03

Pos logit contributions
 therapist+0.054
actus+0.052
 Midnight+0.051
 masse+0.049
lor+0.047
Neg logit contributions
 guess-0.039
 count-0.036
 One-0.035
 ten-0.034
 three-0.034
TokenTom
Token ID13787
Feature activation-1.86e-02

Pos logit contributions
 therapist+0.007
actus+0.007
 Midnight+0.007
 masse+0.006
lor+0.006
Neg logit contributions
 guess-0.006
 count-0.006
 three-0.006
 ten-0.006
 imagine-0.006
Token't
Token ID470
Feature activation-1.81e-02

Pos logit contributions
actus+0.097
 therapist+0.097
lor+0.086
ugg+0.086
 Midnight+0.083
Neg logit contributions
 count-0.077
 guess-0.076
 three-0.072
 One-0.071
 ten-0.069
Token make
Token ID787
Feature activation+1.63e-02

Pos logit contributions
 guess+0.022
 count+0.022
 1+0.022
 three+0.022
 ten+0.021
Neg logit contributions
 Midnight-0.036
actus-0.034
 therapist-0.034
 Woo-0.033
 drowned-0.033
Token you
Token ID345
Feature activation+5.59e-03

Pos logit contributions
 therapist+0.030
actus+0.029
 Midnight+0.027
lor+0.026
ugg+0.026
Neg logit contributions
 guess-0.028
 count-0.027
 three-0.026
 ten-0.026
 imagine-0.024
Token touch
Token ID3638
Feature activation+2.60e-02

Pos logit contributions
 therapist+0.008
actus+0.008
 Midnight+0.008
lor+0.007
 masse+0.007
Neg logit contributions
 guess-0.011
 count-0.011
 three-0.010
 ten-0.010
 imagine-0.010
Token the
Token ID262
Feature activation+4.47e-02

Pos logit contributions
 therapist+0.047
actus+0.045
 Midnight+0.043
lor+0.042
ugg+0.041
Neg logit contributions
 guess-0.043
 count-0.042
 three-0.041
 ten-0.041
 like-0.040
Token c
Token ID269
Feature activation+9.45e-02

Pos logit contributions
 therapist+0.077
 Midnight+0.075
lor+0.071
 masse+0.069
ugg+0.067
Neg logit contributions
 guess-0.074
 three-0.071
 like-0.071
 imagine-0.069
 count-0.068
Tokenactus
Token ID34144
Feature activation+2.50e-02

Pos logit contributions
 Midnight+0.101
 therapist+0.099
 drowned+0.098
 masse+0.088
 payoff+0.084
Neg logit contributions
bell-0.176
iner-0.171
 like-0.171
ruits-0.168
 One-0.168
Token,"
Token ID553
Feature activation-1.67e-02

Pos logit contributions
 therapist+0.050
actus+0.048
 Midnight+0.048
lor+0.045
 masse+0.044
Neg logit contributions
 guess-0.037
 count-0.037
 ten-0.034
 three-0.034
 imagine-0.033
Token Sam
Token ID3409
Feature activation-1.29e-02

Pos logit contributions
 count+0.029
 guess+0.028
 ten+0.027
ruits+0.027
 imagine+0.026
Neg logit contributions
 therapist-0.027
 Midnight-0.026
actus-0.025
 masse-0.024
 Woo-0.023
Token said
Token ID531
Feature activation+1.41e-02

Pos logit contributions
 count+0.024
 guess+0.024
 ten+0.023
 three+0.023
 imagine+0.022
Neg logit contributions
 therapist-0.020
 Midnight-0.019
actus-0.018
 drowned-0.017
 masse-0.016
Token.
Token ID13
Feature activation+1.89e-02

Pos logit contributions
actus+0.028
 therapist+0.028
 Midnight+0.027
lor+0.025
 masse+0.025
Neg logit contributions
 guess-0.021
 count-0.021
 three-0.020
 ten-0.019
 counted-0.019
Token help
Token ID1037
Feature activation+2.07e-02

Pos logit contributions
actus+0.068
 therapist+0.064
 Midnight+0.062
lor+0.059
ugg+0.058
Neg logit contributions
 guess-0.046
 count-0.046
 three-0.043
 ten-0.042
 imagine-0.039
Token when
Token ID618
Feature activation+3.95e-02

Pos logit contributions
actus+0.044
 therapist+0.043
 Midnight+0.040
lor+0.038
ugg+0.038
Neg logit contributions
 guess-0.030
 count-0.030
 three-0.028
 ten-0.027
 imagine-0.026
Token he
Token ID339
Feature activation+1.15e-02

Pos logit contributions
actus+0.070
 therapist+0.067
 Midnight+0.063
lor+0.063
ugg+0.059
Neg logit contributions
 guess-0.070
 count-0.069
 three-0.067
 ten-0.065
 imagine-0.063
Token needed
Token ID2622
Feature activation+3.71e-02

Pos logit contributions
 therapist+0.022
actus+0.021
 Midnight+0.021
 masse+0.020
lor+0.019
Neg logit contributions
 guess-0.018
 count-0.018
 three-0.016
 ten-0.016
 counted-0.016
Token it
Token ID340
Feature activation+2.53e-02

Pos logit contributions
actus+0.076
 therapist+0.071
 Midnight+0.066
lor+0.065
ugg+0.064
Neg logit contributions
 guess-0.062
 count-0.060
 three-0.057
 ten-0.057
 like-0.054
Token.
Token ID13
Feature activation+9.45e-02

Pos logit contributions
actus+0.052
 therapist+0.052
 Midnight+0.049
 masse+0.046
lor+0.046
Neg logit contributions
 count-0.037
 guess-0.037
 ten-0.034
 three-0.033
 like-0.032
Token<|endoftext|>
Token ID50256
Feature activation+3.47e-02

Pos logit contributions
actus+0.252
 therapist+0.246
 drowned+0.243
eds+0.242
 thorn+0.241
Neg logit contributions
 One-0.093
 three-0.053
 Many-0.048
 Today-0.046
 Some-0.039
TokenOnce
Token ID7454
Feature activation-6.63e-03

Pos logit contributions
 therapist+0.068
actus+0.067
 Midnight+0.064
 masse+0.064
iflower+0.061
Neg logit contributions
 guess-0.051
 three-0.049
 One-0.049
 count-0.048
 ten-0.047
Token upon
Token ID2402
Feature activation-4.67e-02

Pos logit contributions
 guess+0.006
 count+0.006
 ten+0.005
 three+0.005
 imagine+0.005
Neg logit contributions
 therapist-0.017
actus-0.017
 Midnight-0.017
lor-0.015
 masse-0.015
Token a
Token ID257
Feature activation-3.33e-02

Pos logit contributions
 count+0.076
 1+0.072
 guess+0.069
 counted+0.068
ruits+0.067
Neg logit contributions
 Midnight-0.083
 therapist-0.077
actus-0.074
 Woo-0.072
iflower-0.069
Token time
Token ID640
Feature activation-5.13e-02

Pos logit contributions
 count+0.055
 guess+0.053
 counted+0.051
 three+0.050
 ten+0.050
Neg logit contributions
 Midnight-0.057
actus-0.056
 therapist-0.053
 masse-0.049
lor-0.049
Token happy
Token ID3772
Feature activation+1.89e-02

Pos logit contributions
 guess+0.009
 count+0.009
 ten+0.009
 imagine+0.008
 three+0.008
Neg logit contributions
 therapist-0.012
 Midnight-0.012
actus-0.011
 masse-0.010
 drowned-0.010
Token and
Token ID290
Feature activation+4.48e-02

Pos logit contributions
actus+0.038
 therapist+0.037
 Midnight+0.036
lor+0.035
iflower+0.033
Neg logit contributions
 guess-0.029
 count-0.028
 three-0.027
 ten-0.026
 like-0.026
Token thanked
Token ID26280
Feature activation+5.81e-02

Pos logit contributions
actus+0.101
lor+0.095
 therapist+0.089
 Midnight+0.085
 masse+0.084
Neg logit contributions
 guess-0.066
 count-0.066
 counted-0.060
 three-0.059
 imagine-0.058
Token Farmer
Token ID37755
Feature activation+3.36e-02

Pos logit contributions
actus+0.102
 therapist+0.094
ugg+0.086
lor+0.084
 drowned+0.082
Neg logit contributions
 three-0.105
 One-0.102
 guess-0.101
 count-0.099
 ten-0.098
Token Brown
Token ID4373
Feature activation+2.89e-02

Pos logit contributions
actus+0.049
 therapist+0.048
 Midnight+0.044
lor+0.042
 masse+0.042
Neg logit contributions
 count-0.066
 guess-0.065
 three-0.064
 ten-0.062
 counted-0.061
Token.
Token ID13
Feature activation+9.44e-02

Pos logit contributions
 therapist+0.055
actus+0.054
 Midnight+0.052
 masse+0.050
lor+0.050
Neg logit contributions
 count-0.044
 guess-0.044
 three-0.042
 ten-0.041
 like-0.038
Token They
Token ID1119
Feature activation+2.39e-02

Pos logit contributions
actus+0.220
eds+0.211
 therapist+0.209
iflower+0.205
web+0.205
Neg logit contributions
 One-0.122
 three-0.082
 So-0.080
 Today-0.076
 Two-0.073
Token brought
Token ID3181
Feature activation+3.72e-02

Pos logit contributions
actus+0.048
 therapist+0.047
 Midnight+0.045
lor+0.044
 masse+0.043
Neg logit contributions
 guess-0.036
 count-0.035
 three-0.033
 counted-0.033
 ten-0.032
Token the
Token ID262
Feature activation-1.50e-03

Pos logit contributions
actus+0.083
 therapist+0.081
lor+0.075
 masse+0.072
iflower+0.072
Neg logit contributions
 three-0.051
 guess-0.047
 ten-0.047
 count-0.046
 One-0.043
Token hay
Token ID27678
Feature activation+4.50e-02

Pos logit contributions
 guess+0.003
 count+0.003
 ten+0.003
 three+0.003
 imagine+0.003
Neg logit contributions
 therapist-0.002
actus-0.002
 Midnight-0.002
 masse-0.002
lor-0.002
Token back
Token ID736
Feature activation+2.30e-02

Pos logit contributions
actus+0.101
 therapist+0.100
 Midnight+0.097
 masse+0.095
lor+0.094
Neg logit contributions
 count-0.059
 guess-0.058
 three-0.053
 ten-0.052
 like-0.050
Token She
Token ID1375
Feature activation+2.53e-02

Pos logit contributions
 therapist+0.039
actus+0.039
iflower+0.036
 Midnight+0.036
lor+0.036
Neg logit contributions
 guess-0.025
 count-0.024
 three-0.024
 One-0.024
 ten-0.023
Token hurt
Token ID5938
Feature activation+1.25e-02

Pos logit contributions
 therapist+0.043
actus+0.042
 Midnight+0.041
lor+0.040
 masse+0.039
Neg logit contributions
 guess-0.046
 count-0.044
 three-0.042
 counted-0.042
 ten-0.041
Token herself
Token ID5223
Feature activation+2.17e-02

Pos logit contributions
 therapist+0.022
actus+0.022
 Midnight+0.021
lor+0.020
 masse+0.019
Neg logit contributions
 guess-0.022
 count-0.021
 three-0.021
 ten-0.021
 like-0.019
Token badly
Token ID11234
Feature activation+4.08e-02

Pos logit contributions
 therapist+0.042
actus+0.041
 Midnight+0.040
lor+0.038
 masse+0.037
Neg logit contributions
 guess-0.034
 count-0.033
 three-0.032
 ten-0.031
 imagine-0.030
Token and
Token ID290
Feature activation+4.09e-02

Pos logit contributions
 therapist+0.104
actus+0.102
lor+0.099
 Midnight+0.099
 masse+0.092
Neg logit contributions
 like-0.044
 guess-0.039
 count-0.037
 ten-0.037
 three-0.037
Token couldn
Token ID3521
Feature activation+9.39e-02

Pos logit contributions
actus+0.081
lor+0.079
 therapist+0.077
 masse+0.071
 Midnight+0.070
Neg logit contributions
 guess-0.064
 count-0.062
 three-0.061
 like-0.060
 ten-0.060
Token't
Token ID470
Feature activation+1.60e-02

Pos logit contributions
actus+0.230
 therapist+0.223
ugg+0.211
rows+0.206
iflower+0.205
Neg logit contributions
 One-0.120
 count-0.095
 guess-0.091
 three-0.086
 ten-0.083
Token get
Token ID651
Feature activation+4.27e-02

Pos logit contributions
 therapist+0.034
actus+0.033
 Midnight+0.032
lor+0.031
 masse+0.030
Neg logit contributions
 guess-0.023
 count-0.023
 imagine-0.020
 ten-0.020
 three-0.020
Token out
Token ID503
Feature activation+1.41e-03

Pos logit contributions
 therapist+0.094
lor+0.092
actus+0.092
ugg+0.081
 Midnight+0.080
Neg logit contributions
 ten-0.061
 three-0.061
 count-0.061
 guess-0.059
 like-0.056
Token of
Token ID286
Feature activation+4.03e-02

Pos logit contributions
 therapist+0.003
actus+0.002
 Midnight+0.002
 masse+0.002
lor+0.002
Neg logit contributions
 guess-0.002
 count-0.002
 three-0.002
 ten-0.002
 imagine-0.002
Token the
Token ID262
Feature activation+1.57e-02

Pos logit contributions
actus+0.084
 therapist+0.084
lor+0.080
ugg+0.077
iflower+0.074
Neg logit contributions
 ten-0.059
 guess-0.058
 count-0.057
 three-0.056
 like-0.053
Tokeny
Token ID88
Feature activation+5.12e-02

Pos logit contributions
 therapist+0.058
 Midnight+0.056
actus+0.056
 masse+0.055
lor+0.053
Neg logit contributions
 guess-0.036
 count-0.033
 three-0.033
 ten-0.030
 counted-0.030
Token flew
Token ID13112
Feature activation+2.32e-02

Pos logit contributions
actus+0.108
 therapist+0.106
 Midnight+0.105
 masse+0.104
lor+0.103
Neg logit contributions
 guess-0.069
 count-0.066
 three-0.065
 counted-0.063
 imagine-0.057
Token away
Token ID1497
Feature activation+3.79e-02

Pos logit contributions
actus+0.039
 therapist+0.038
lor+0.036
 Midnight+0.035
iflower+0.035
Neg logit contributions
 guess-0.043
 three-0.042
 count-0.042
 ten-0.042
 like-0.041
Token,
Token ID11
Feature activation+4.92e-02

Pos logit contributions
 therapist+0.094
actus+0.093
lor+0.092
 Midnight+0.090
iflower+0.087
Neg logit contributions
 guess-0.041
 three-0.040
 count-0.038
 like-0.038
 ten-0.035
Token happy
Token ID3772
Feature activation+5.12e-02

Pos logit contributions
actus+0.107
lor+0.105
 therapist+0.104
iflower+0.097
 masse+0.095
Neg logit contributions
 guess-0.068
 like-0.065
 three-0.064
 count-0.064
 counted-0.060
Token and
Token ID290
Feature activation+9.38e-02

Pos logit contributions
 therapist+0.132
 Midnight+0.131
lor+0.130
 masse+0.127
actus+0.127
Neg logit contributions
 like-0.053
 three-0.047
 guess-0.043
 imagine-0.037
 count-0.037
Token clean
Token ID3424
Feature activation+5.16e-02

Pos logit contributions
lor+0.225
actus+0.213
orns+0.206
 therapist+0.202
pan+0.198
Neg logit contributions
 like-0.111
 excited-0.105
 counted-0.102
 guess-0.102
 liked-0.095
Token.
Token ID13
Feature activation+8.08e-02

Pos logit contributions
actus+0.116
lor+0.113
ugg+0.110
 therapist+0.110
ich+0.110
Neg logit contributions
 like-0.058
 three-0.054
 count-0.054
 guess-0.053
 imagine-0.049
Token<|endoftext|>
Token ID50256
Feature activation+3.11e-02

Pos logit contributions
 therapist+0.194
actus+0.191
ich+0.182
 masse+0.181
burn+0.181
Neg logit contributions
 One-0.098
 three-0.078
 like-0.065
 Many-0.064
 guess-0.063
TokenOnce
Token ID7454
Feature activation+1.31e-02

Pos logit contributions
 therapist+0.063
actus+0.060
 Midnight+0.059
 masse+0.058
iflower+0.055
Neg logit contributions
 guess-0.046
 count-0.044
 three-0.043
 One-0.043
 ten-0.041
Token upon
Token ID2402
Feature activation-1.20e-02

Pos logit contributions
 therapist+0.037
actus+0.036
 Midnight+0.036
 masse+0.034
lor+0.034
Neg logit contributions
 guess-0.009
 count-0.008
 three-0.007
 ten-0.007
 imagine-0.006
Token hurt
Token ID5938
Feature activation+1.97e-02

Pos logit contributions
actus+0.077
lor+0.075
 therapist+0.074
 masse+0.067
iflower+0.066
Neg logit contributions
 guess-0.070
 count-0.066
 counted-0.065
 three-0.065
 ten-0.065
Token her
Token ID607
Feature activation+1.99e-02

Pos logit contributions
actus+0.038
 therapist+0.036
lor+0.034
 Midnight+0.034
iflower+0.033
Neg logit contributions
 guess-0.032
 three-0.031
 ten-0.031
 count-0.031
 like-0.029
Token leg
Token ID1232
Feature activation+3.18e-02

Pos logit contributions
 therapist+0.032
actus+0.031
 Midnight+0.030
lor+0.029
 masse+0.028
Neg logit contributions
 guess-0.037
 count-0.037
 three-0.035
 ten-0.034
 imagine-0.034
Token.
Token ID13
Feature activation+2.17e-02

Pos logit contributions
actus+0.068
 therapist+0.067
 Midnight+0.063
lor+0.062
 masse+0.061
Neg logit contributions
 count-0.044
 guess-0.044
 three-0.040
 ten-0.040
 like-0.040
Token She
Token ID1375
Feature activation+3.92e-02

Pos logit contributions
actus+0.044
 therapist+0.044
 Midnight+0.041
iflower+0.040
lor+0.040
Neg logit contributions
 guess-0.030
 One-0.029
 count-0.029
 three-0.029
 ten-0.027
Token couldn
Token ID3521
Feature activation+9.36e-02

Pos logit contributions
 therapist+0.077
actus+0.077
lor+0.075
 masse+0.072
 Midnight+0.070
Neg logit contributions
 guess-0.059
 count-0.055
 counted-0.055
 three-0.054
 like-0.051
Token't
Token ID470
Feature activation+8.27e-03

Pos logit contributions
actus+0.239
 therapist+0.231
ugg+0.215
iflower+0.214
rows+0.208
Neg logit contributions
 One-0.105
 count-0.084
 guess-0.080
 three-0.074
 Many-0.069
Token dance
Token ID9280
Feature activation+1.81e-02

Pos logit contributions
 therapist+0.017
actus+0.016
 Midnight+0.016
lor+0.015
 masse+0.015
Neg logit contributions
 guess-0.012
 count-0.012
 three-0.011
 ten-0.011
 imagine-0.011
Token anymore
Token ID7471
Feature activation+1.00e-02

Pos logit contributions
 therapist+0.039
actus+0.039
 Midnight+0.038
lor+0.036
 masse+0.036
Neg logit contributions
 guess-0.023
 count-0.023
 ten-0.021
 three-0.021
 like-0.021
Token and
Token ID290
Feature activation+4.28e-02

Pos logit contributions
 therapist+0.019
actus+0.019
 Midnight+0.018
 masse+0.017
lor+0.017
Neg logit contributions
 guess-0.016
 count-0.016
 three-0.014
 ten-0.014
 imagine-0.014
Token had
Token ID550
Feature activation+2.86e-02

Pos logit contributions
actus+0.081
 therapist+0.077
lor+0.076
iflower+0.071
 masse+0.070
Neg logit contributions
 guess-0.072
 count-0.072
 imagine-0.067
 like-0.067
 ten-0.066
Token They
Token ID1119
Feature activation-1.08e-02

Pos logit contributions
 therapist+0.052
actus+0.051
lor+0.048
iflower+0.048
 Midnight+0.047
Neg logit contributions
 guess-0.029
 three-0.028
 One-0.028
 count-0.027
 ten-0.025
Token lost
Token ID2626
Feature activation+2.51e-02

Pos logit contributions
 count+0.018
 guess+0.018
 ten+0.017
 three+0.016
ruits+0.016
Neg logit contributions
 therapist-0.019
 Midnight-0.018
actus-0.018
 drowned-0.017
 died-0.017
Token their
Token ID511
Feature activation+4.71e-03

Pos logit contributions
actus+0.050
 therapist+0.049
lor+0.046
 Midnight+0.046
ugg+0.044
Neg logit contributions
 guess-0.038
 three-0.037
 ten-0.037
 count-0.037
 like-0.035
Token passports
Token ID33052
Feature activation+1.95e-02

Pos logit contributions
 therapist+0.010
actus+0.009
 Midnight+0.009
 masse+0.009
lor+0.009
Neg logit contributions
 guess-0.007
 count-0.007
 three-0.006
 ten-0.006
 imagine-0.006
Token and
Token ID290
Feature activation+3.47e-02

Pos logit contributions
 therapist+0.040
actus+0.039
 Midnight+0.038
 masse+0.036
lor+0.036
Neg logit contributions
 guess-0.028
 count-0.027
 three-0.026
 ten-0.025
 imagine-0.024
Token couldn
Token ID3521
Feature activation+9.35e-02

Pos logit contributions
actus+0.072
 therapist+0.070
lor+0.070
iflower+0.065
 masse+0.065
Neg logit contributions
 guess-0.051
 count-0.048
 three-0.048
 counted-0.046
 like-0.046
Token't
Token ID470
Feature activation+2.54e-03

Pos logit contributions
actus+0.232
 therapist+0.225
iflower+0.207
ugg+0.206
rows+0.204
Neg logit contributions
 One-0.117
 count-0.090
 guess-0.088
 three-0.086
bell-0.079
Token go
Token ID467
Feature activation+2.41e-02

Pos logit contributions
 therapist+0.005
actus+0.005
 Midnight+0.005
 masse+0.005
lor+0.004
Neg logit contributions
 guess-0.004
 count-0.004
 three-0.003
 ten-0.003
 imagine-0.003
Token on
Token ID319
Feature activation+3.83e-02

Pos logit contributions
 therapist+0.043
actus+0.041
 Midnight+0.039
iflower+0.038
lor+0.037
Neg logit contributions
 count-0.043
 guess-0.042
 ten-0.041
 three-0.041
 like-0.040
Token their
Token ID511
Feature activation+2.23e-02

Pos logit contributions
actus+0.078
 therapist+0.077
ugg+0.071
 Midnight+0.069
iflower+0.069
Neg logit contributions
 guess-0.056
 three-0.055
 ten-0.055
 count-0.054
 One-0.052
Token trip
Token ID5296
Feature activation+6.71e-03

Pos logit contributions
 therapist+0.041
actus+0.040
 Midnight+0.040
lor+0.037
 masse+0.036
Neg logit contributions
 count-0.037
 guess-0.036
 three-0.034
 ten-0.034
 imagine-0.033
Token if
Token ID611
Feature activation+1.60e-02

Pos logit contributions
 blond+0.094
 blonde+0.093
 guess+0.089
 turrets+0.084
 Many+0.083
Neg logit contributions
 Midnight-0.098
 therapist-0.094
actus-0.086
 bottle-0.080
iflower-0.079
Token she
Token ID673
Feature activation-3.48e-02

Pos logit contributions
actus+0.023
 therapist+0.022
 Midnight+0.021
lor+0.019
 masse+0.019
Neg logit contributions
 guess-0.034
 count-0.033
 three-0.032
 ten-0.031
 imagine-0.030
Token could
Token ID714
Feature activation-3.13e-02

Pos logit contributions
 count+0.063
 ten+0.062
 guess+0.061
 imagine+0.059
 three+0.059
Neg logit contributions
 therapist-0.051
actus-0.048
 Midnight-0.047
 died-0.044
 drowned-0.043
Token go
Token ID467
Feature activation-3.41e-02

Pos logit contributions
 ten+0.044
ptions+0.044
 guess+0.044
 three+0.044
 blond+0.043
Neg logit contributions
 therapist-0.053
 Midnight-0.052
actus-0.051
 masse-0.048
 died-0.046
Token get
Token ID651
Feature activation-1.88e-02

Pos logit contributions
 guess+0.061
ruits+0.060
 blond+0.060
 blonde+0.059
 count+0.059
Neg logit contributions
actus-0.047
 therapist-0.047
 Midnight-0.044
 Woo-0.041
 masse-0.040
Token an
Token ID281
Feature activation-1.07e-01

Pos logit contributions
 count+0.033
 guess+0.033
 counted+0.030
 imagine+0.030
 ten+0.030
Neg logit contributions
 therapist-0.029
actus-0.029
 Midnight-0.028
 masse-0.025
iflower-0.025
Token X
Token ID1395
Feature activation-3.64e-02

Pos logit contributions
 blond+0.191
 blonde+0.189
ptions+0.187
ruits+0.177
 counted+0.175
Neg logit contributions
actus-0.138
iflower-0.131
 Tooth-0.127
 medicine-0.126
 band-0.114
Token-
Token ID12
Feature activation-5.24e-02

Pos logit contributions
 guess+0.078
 count+0.075
 One+0.073
 Many+0.072
 imagine+0.072
Neg logit contributions
 therapist-0.044
actus-0.042
lor-0.041
iflower-0.038
 Midnight-0.037
Tokenray
Token ID2433
Feature activation-2.04e-02

Pos logit contributions
 guess+0.105
 count+0.103
 imagine+0.100
 blonde+0.097
 blond+0.097
Neg logit contributions
actus-0.068
 therapist-0.064
lor-0.060
iflower-0.060
ugg-0.055
Token.
Token ID13
Feature activation+4.18e-03

Pos logit contributions
 guess+0.037
 count+0.036
 ten+0.034
 three+0.033
 imagine+0.033
Neg logit contributions
 therapist-0.033
actus-0.031
 Midnight-0.030
 masse-0.028
lor-0.028
Token
Token ID198
Feature activation+9.52e-03

Pos logit contributions
 therapist+0.008
actus+0.007
 Midnight+0.007
 masse+0.007
lor+0.007
Neg logit contributions
 guess-0.007
 count-0.007
 three-0.006
 ten-0.006
 imagine-0.006
Tokenie
Token ID494
Feature activation+3.98e-03

Pos logit contributions
 guess+0.033
 count+0.033
 ten+0.030
 imagine+0.030
 three+0.030
Neg logit contributions
 therapist-0.022
actus-0.022
 Midnight-0.022
iflower-0.019
 drowned-0.019
Token.
Token ID13
Feature activation-2.38e-03

Pos logit contributions
 therapist+0.007
actus+0.007
 Midnight+0.007
 masse+0.007
lor+0.007
Neg logit contributions
 guess-0.006
 count-0.006
 three-0.006
 ten-0.006
 imagine-0.006
Token They
Token ID1119
Feature activation-3.23e-02

Pos logit contributions
 guess+0.004
 count+0.004
 three+0.003
 ten+0.003
 imagine+0.003
Neg logit contributions
 therapist-0.004
 Midnight-0.004
actus-0.004
 masse-0.004
 Woo-0.004
Token both
Token ID1111
Feature activation-4.72e-02

Pos logit contributions
 guess+0.051
 count+0.049
 ten+0.045
ruits+0.044
 blond+0.043
Neg logit contributions
 Midnight-0.059
 therapist-0.053
actus-0.053
 Woo-0.051
 drowned-0.051
Token had
Token ID550
Feature activation-8.60e-03

Pos logit contributions
 guess+0.066
 count+0.066
 ten+0.060
 whether+0.059
ptions+0.058
Neg logit contributions
 Midnight-0.086
 therapist-0.085
actus-0.082
 died-0.082
 drowned-0.082
Token birth
Token ID4082
Feature activation-1.06e-01

Pos logit contributions
 guess+0.011
 count+0.010
 imagine+0.009
 ten+0.009
 three+0.009
Neg logit contributions
 Midnight-0.019
 therapist-0.018
actus-0.018
 Woo-0.017
 masse-0.017
Tokendays
Token ID12545
Feature activation-3.70e-02

Pos logit contributions
 One+0.161
 Many+0.160
 guess+0.160
 blond+0.159
 1+0.156
Neg logit contributions
ugg-0.168
eddy-0.167
uggle-0.153
burn-0.150
eds-0.146
Token coming
Token ID2406
Feature activation-2.81e-02

Pos logit contributions
 guess+0.062
 count+0.060
 imagine+0.055
 blond+0.055
 ten+0.055
Neg logit contributions
 Midnight-0.059
 therapist-0.059
ugg-0.055
actus-0.055
 Woo-0.052
Token up
Token ID510
Feature activation-1.89e-02

Pos logit contributions
 guess+0.055
 count+0.055
 blond+0.052
 imagine+0.052
 blonde+0.052
Neg logit contributions
 Midnight-0.036
 therapist-0.035
 Woo-0.033
actus-0.032
 masse-0.030
Token,
Token ID11
Feature activation-3.02e-02

Pos logit contributions
 count+0.032
 guess+0.031
 imagine+0.029
 counted+0.028
 three+0.028
Neg logit contributions
 therapist-0.031
 Midnight-0.030
actus-0.030
 Woo-0.028
 masse-0.027
Token and
Token ID290
Feature activation-8.24e-03

Pos logit contributions
 guess+0.043
 count+0.042
 ten+0.039
 three+0.038
ff+0.038
Neg logit contributions
 Midnight-0.057
 therapist-0.055
 Woo-0.052
actus-0.051
 masse-0.048
Token
Token ID8151
Feature activation+1.76e-04

Pos logit contributions
 guess+0.010
 count+0.010
 three+0.010
 ten+0.010
 imagine+0.009
Neg logit contributions
 therapist-0.011
actus-0.011
 Midnight-0.011
lor-0.010
 masse-0.010
Tokens
Token ID82
Feature activation-1.17e-02

Pos logit contributions
 therapist+0.000
actus+0.000
 Midnight+0.000
lor+0.000
 masse+0.000
Neg logit contributions
 guess-0.000
 count-0.000
 three-0.000
 ten-0.000
 imagine-0.000
Token mom
Token ID1995
Feature activation-1.95e-02

Pos logit contributions
 count+0.021
 guess+0.021
 ten+0.020
 three+0.020
 imagine+0.019
Neg logit contributions
 therapist-0.018
actus-0.018
 Midnight-0.017
lor-0.015
 died-0.015
Token explained
Token ID4893
Feature activation-1.03e-03

Pos logit contributions
 count+0.035
 guess+0.035
 ten+0.034
 imagine+0.033
 three+0.033
Neg logit contributions
 therapist-0.030
 Midnight-0.029
actus-0.028
 died-0.025
 drowned-0.025
Token that
Token ID326
Feature activation-2.34e-02

Pos logit contributions
 guess+0.002
 count+0.002
 three+0.002
 ten+0.002
 imagine+0.002
Neg logit contributions
 therapist-0.001
 Midnight-0.001
actus-0.001
 masse-0.001
lor-0.001
Token an
Token ID281
Feature activation-1.05e-01

Pos logit contributions
 guess+0.048
 count+0.047
 ten+0.044
 imagine+0.044
 three+0.044
Neg logit contributions
 therapist-0.031
 Midnight-0.029
actus-0.028
iflower-0.025
 masse-0.025
Token X
Token ID1395
Feature activation-2.75e-02

Pos logit contributions
 blond+0.191
 yes+0.191
 blonde+0.189
 guess+0.188
ill+0.186
Neg logit contributions
actus-0.131
iflower-0.126
 Tooth-0.107
orn-0.103
 masse-0.102
Token-
Token ID12
Feature activation-5.61e-02

Pos logit contributions
 guess+0.054
 count+0.052
 ten+0.050
 three+0.050
 imagine+0.049
Neg logit contributions
 therapist-0.040
actus-0.039
lor-0.036
 Midnight-0.035
iflower-0.035
Tokenray
Token ID2433
Feature activation-2.64e-02

Pos logit contributions
 guess+0.123
 count+0.120
 imagine+0.117
 blond+0.111
 blonde+0.111
Neg logit contributions
actus-0.064
 therapist-0.062
lor-0.060
iflower-0.059
con-0.049
Token was
Token ID373
Feature activation-2.21e-02

Pos logit contributions
 guess+0.050
 count+0.048
 ten+0.047
 three+0.046
 imagine+0.045
Neg logit contributions
 therapist-0.040
actus-0.038
 Midnight-0.035
lor-0.035
iflower-0.034
Token a
Token ID257
Feature activation-4.82e-02

Pos logit contributions
 guess+0.034
 count+0.033
 imagine+0.031
 three+0.030
 One+0.030
Neg logit contributions
 therapist-0.039
 Midnight-0.038
actus-0.038
iflower-0.035
 masse-0.035
Token was
Token ID373
Feature activation-6.41e-03

Pos logit contributions
 guess+0.018
 count+0.017
 ten+0.016
 three+0.016
 imagine+0.016
Neg logit contributions
 therapist-0.018
actus-0.018
 Midnight-0.018
lor-0.016
iflower-0.016
Token too
Token ID1165
Feature activation-2.22e-02

Pos logit contributions
 guess+0.009
 count+0.009
 ten+0.008
 imagine+0.008
 three+0.008
Neg logit contributions
 therapist-0.013
 Midnight-0.012
actus-0.012
 masse-0.011
 Woo-0.011
Token young
Token ID1862
Feature activation+2.46e-02

Pos logit contributions
 count+0.039
 guess+0.036
 counted+0.035
 ten+0.034
 imagine+0.034
Neg logit contributions
 therapist-0.037
actus-0.037
 Midnight-0.035
 masse-0.034
 severe-0.034
Token to
Token ID284
Feature activation-3.45e-02

Pos logit contributions
actus+0.050
 therapist+0.048
 Midnight+0.046
lor+0.045
 masse+0.044
Neg logit contributions
 guess-0.037
 count-0.037
 three-0.035
 ten-0.034
 like-0.033
Token get
Token ID651
Feature activation-2.58e-03

Pos logit contributions
ruits+0.046
 ten+0.046
 three+0.045
ptions+0.044
 1+0.044
Neg logit contributions
 therapist-0.061
 Midnight-0.060
 masse-0.057
actus-0.056
 drowned-0.055
Token an
Token ID281
Feature activation-1.03e-01

Pos logit contributions
 guess+0.004
 count+0.004
 ten+0.004
 three+0.004
 imagine+0.004
Neg logit contributions
 therapist-0.004
actus-0.004
 Midnight-0.004
 masse-0.004
lor-0.004
Token X
Token ID1395
Feature activation-3.26e-02

Pos logit contributions
 blond+0.188
 blonde+0.187
ptions+0.180
 yes+0.178
ruits+0.174
Neg logit contributions
actus-0.127
iflower-0.119
 Tooth-0.114
 medicine-0.112
 masse-0.106
Token-
Token ID12
Feature activation-4.58e-02

Pos logit contributions
 guess+0.069
 count+0.066
 One+0.064
 ten+0.063
 three+0.063
Neg logit contributions
 therapist-0.041
actus-0.040
lor-0.037
iflower-0.036
 Midnight-0.035
Tokenray
Token ID2433
Feature activation-2.01e-02

Pos logit contributions
 guess+0.094
 count+0.093
 imagine+0.089
 blond+0.085
 blonde+0.085
Neg logit contributions
actus-0.058
 therapist-0.056
lor-0.052
iflower-0.051
ugg-0.047
Token,
Token ID11
Feature activation-3.96e-02

Pos logit contributions
 guess+0.036
 count+0.035
 ten+0.033
 three+0.033
 imagine+0.032
Neg logit contributions
 therapist-0.033
actus-0.031
 Midnight-0.029
 masse-0.028
lor-0.028
Token but
Token ID475
Feature activation-2.94e-02

Pos logit contributions
 count+0.050
 guess+0.046
 ten+0.045
 counted+0.043
ruits+0.043
Neg logit contributions
 therapist-0.079
 Midnight-0.078
 Woo-0.072
 masse-0.071
actus-0.069
Token,
Token ID11
Feature activation-4.36e-02

Pos logit contributions
 guess+0.110
 count+0.106
 imagine+0.103
ruits+0.103
adies+0.101
Neg logit contributions
 therapist-0.064
 Midnight-0.058
actus-0.055
ls-0.052
 masse-0.052
Token "
Token ID366
Feature activation-4.94e-02

Pos logit contributions
 guess+0.060
ruits+0.059
 count+0.058
 imagine+0.057
ff+0.055
Neg logit contributions
 therapist-0.080
 Midnight-0.075
 masse-0.069
actus-0.066
 Woo-0.065
TokenOne
Token ID3198
Feature activation-4.67e-02

Pos logit contributions
 guess+0.088
 count+0.088
 imagine+0.087
ruits+0.079
 One+0.077
Neg logit contributions
 therapist-0.073
 masse-0.065
Now-0.064
 Midnight-0.062
lor-0.061
Token,
Token ID11
Feature activation-8.31e-02

Pos logit contributions
 guess+0.071
 blond+0.069
 blonde+0.067
 imagine+0.067
ptions+0.066
Neg logit contributions
 therapist-0.073
actus-0.069
iflower-0.069
 masse-0.065
 Midnight-0.064
Token two
Token ID734
Feature activation-5.68e-02

Pos logit contributions
illed+0.096
ruits+0.092
ff+0.092
 blond+0.092
adows+0.088
Neg logit contributions
 therapist-0.147
 band-0.143
 masse-0.142
 Midnight-0.140
 Woo-0.135
Token,
Token ID11
Feature activation-1.03e-01

Pos logit contributions
 guess+0.086
 count+0.079
 One+0.077
 imagine+0.077
ruits+0.077
Neg logit contributions
 therapist-0.091
 Midnight-0.091
actus-0.085
iflower-0.081
 masse-0.081
Token three
Token ID1115
Feature activation-5.55e-02

Pos logit contributions
illed+0.107
ff+0.093
ruits+0.093
 exception+0.090
adows+0.090
Neg logit contributions
 band-0.194
 masse-0.188
 Midnight-0.186
 therapist-0.185
 jar-0.183
Token!"
Token ID2474
Feature activation-4.99e-03

Pos logit contributions
 guess+0.096
 count+0.093
 imagine+0.089
ruits+0.086
 One+0.086
Neg logit contributions
 therapist-0.081
iflower-0.076
actus-0.075
 Midnight-0.074
ls-0.069
Token As
Token ID1081
Feature activation-4.55e-03

Pos logit contributions
 count+0.008
 guess+0.008
 ten+0.007
 three+0.007
 imagine+0.007
Neg logit contributions
 therapist-0.010
 Midnight-0.009
actus-0.009
 masse-0.008
lor-0.008
Token soon
Token ID2582
Feature activation-5.92e-03

Pos logit contributions
 guess+0.009
 count+0.009
 ten+0.009
 three+0.009
 imagine+0.008
Neg logit contributions
 therapist-0.006
 Midnight-0.006
actus-0.006
 masse-0.005
iflower-0.005
Token as
Token ID355
Feature activation+3.84e-03

Pos logit contributions
 guess+0.007
 count+0.007
 ten+0.007
 three+0.007
 imagine+0.006
Neg logit contributions
 therapist-0.013
 Midnight-0.012
actus-0.012
 masse-0.011
 drowned-0.011
Token.
Token ID13
Feature activation-4.29e-02

Pos logit contributions
 guess+0.018
 count+0.017
 three+0.016
 ten+0.016
 imagine+0.016
Neg logit contributions
 therapist-0.017
actus-0.017
 Midnight-0.016
 masse-0.014
lor-0.014
Token They
Token ID1119
Feature activation-5.73e-02

Pos logit contributions
 count+0.040
 guess+0.037
 counted+0.034
 imagine+0.031
 ten+0.030
Neg logit contributions
 Midnight-0.105
 Woo-0.100
 therapist-0.098
actus-0.097
 Lucky-0.096
Token had
Token ID550
Feature activation-3.95e-02

Pos logit contributions
 guess+0.113
teen+0.111
adies+0.110
 Today+0.110
ff+0.108
Neg logit contributions
 Midnight-0.069
 therapist-0.057
 died-0.055
 drowned-0.055
 Woo-0.053
Token many
Token ID867
Feature activation-5.36e-02

Pos logit contributions
 guess+0.065
 count+0.063
ruits+0.059
 imagine+0.059
ff+0.058
Neg logit contributions
 Midnight-0.069
 Woo-0.063
 therapist-0.060
 drowned-0.056
 Velvet-0.056
Token toys
Token ID14958
Feature activation-1.31e-02

Pos logit contributions
 guess+0.114
 count+0.106
ill+0.105
 ten+0.104
adies+0.104
Neg logit contributions
 Midnight-0.061
actus-0.061
 band-0.056
 swollen-0.053
ls-0.053
Token,
Token ID11
Feature activation-1.01e-01

Pos logit contributions
 guess+0.020
 count+0.019
 three+0.018
 ten+0.017
 imagine+0.017
Neg logit contributions
 therapist-0.025
 Midnight-0.024
actus-0.024
 Woo-0.022
lor-0.022
Token but
Token ID475
Feature activation-4.40e-02

Pos logit contributions
ff+0.184
 1+0.151
illed+0.150
rr+0.149
used+0.142
Neg logit contributions
 Midnight-0.167
 Woo-0.148
 Velvet-0.148
 band-0.144
 medicine-0.136
Token their
Token ID511
Feature activation-2.70e-02

Pos logit contributions
 count+0.085
 guess+0.083
ff+0.080
adows+0.079
 ten+0.079
Neg logit contributions
 Midnight-0.065
 therapist-0.062
 Woo-0.056
 Velvet-0.052
 Lucky-0.050
Token favorite
Token ID4004
Feature activation-2.46e-02

Pos logit contributions
 guess+0.034
 count+0.032
 counted+0.030
 ten+0.029
ruits+0.028
Neg logit contributions
actus-0.057
 Midnight-0.056
 therapist-0.056
 drowned-0.050
 masse-0.050
Token was
Token ID373
Feature activation-1.83e-02

Pos logit contributions
 guess+0.050
 count+0.050
 ten+0.048
 three+0.047
 1+0.046
Neg logit contributions
 therapist-0.033
actus-0.032
 Midnight-0.032
 drowned-0.026
 died-0.026
Token a
Token ID257
Feature activation+4.41e-03

Pos logit contributions
 guess+0.029
 count+0.027
 imagine+0.025
 three+0.024
 ten+0.024
Neg logit contributions
 Midnight-0.035
 therapist-0.034
actus-0.032
 Woo-0.030
 masse-0.030
Token They
Token ID1119
Feature activation-5.16e-02

Pos logit contributions
 count+0.038
 guess+0.038
 counted+0.032
 imagine+0.031
 ten+0.030
Neg logit contributions
 Midnight-0.100
 Woo-0.094
actus-0.092
 therapist-0.091
 Lucky-0.090
Token had
Token ID550
Feature activation-5.07e-02

Pos logit contributions
 guess+0.087
 count+0.082
adies+0.081
 Today+0.079
 whether+0.078
Neg logit contributions
 Midnight-0.075
 therapist-0.070
actus-0.065
 drowned-0.065
 died-0.065
Token many
Token ID867
Feature activation-7.52e-02

Pos logit contributions
 guess+0.088
immer+0.086
ruits+0.085
 imagine+0.084
 count+0.084
Neg logit contributions
 Midnight-0.067
 Woo-0.063
 therapist-0.062
 band-0.058
 Velvet-0.057
Token number
Token ID1271
Feature activation-8.68e-02

Pos logit contributions
 guess+0.141
 mate+0.141
ill+0.139
 yes+0.134
adies+0.133
Neg logit contributions
iflower-0.082
ls-0.078
actus-0.076
 band-0.076
 lands-0.075
Token toys
Token ID14958
Feature activation-2.47e-02

Pos logit contributions
 guess+0.155
 imagine+0.143
 blond+0.137
 count+0.136
illed+0.134
Neg logit contributions
 therapist-0.108
ls-0.105
actus-0.100
ic-0.099
 lands-0.097
Token,
Token ID11
Feature activation-1.00e-01

Pos logit contributions
 guess+0.041
 count+0.039
 imagine+0.037
 three+0.037
 blond+0.036
Neg logit contributions
 therapist-0.042
actus-0.040
 Midnight-0.040
lor-0.035
 Woo-0.035
Token like
Token ID588
Feature activation-6.44e-02

Pos logit contributions
ff+0.148
illed+0.139
adies+0.124
int+0.121
adows+0.116
Neg logit contributions
 Midnight-0.159
 Velvet-0.149
 band-0.147
 therapist-0.145
 chip-0.144
Token blocks
Token ID7021
Feature activation-1.84e-02

Pos logit contributions
 guess+0.098
illed+0.092
ruits+0.091
 imagine+0.086
 count+0.084
Neg logit contributions
 Midnight-0.107
 therapist-0.105
 masse-0.097
 chip-0.094
 band-0.094
Token,
Token ID11
Feature activation-8.51e-02

Pos logit contributions
 guess+0.028
 count+0.027
 three+0.025
 imagine+0.025
 ten+0.024
Neg logit contributions
 therapist-0.034
actus-0.033
 Midnight-0.033
lor-0.029
 masse-0.029
Token puzzles
Token ID24367
Feature activation+2.75e-03

Pos logit contributions
illed+0.162
ff+0.152
 guess+0.131
 exception+0.131
int+0.130
Neg logit contributions
 Midnight-0.112
 band-0.102
 Velvet-0.102
 chip-0.100
 sy-0.098
Token,
Token ID11
Feature activation-6.73e-02

Pos logit contributions
 therapist+0.006
actus+0.006
 Midnight+0.005
 masse+0.005
lor+0.005
Neg logit contributions
 guess-0.004
 count-0.004
 three-0.004
 ten-0.003
 imagine-0.003
Token.
Token ID13
Feature activation-5.53e-02

Pos logit contributions
 guess+0.005
 count+0.004
 three+0.004
 ten+0.004
 imagine+0.004
Neg logit contributions
 therapist-0.005
actus-0.005
 Midnight-0.004
 masse-0.004
lor-0.004
Token They
Token ID1119
Feature activation-5.14e-02

Pos logit contributions
 count+0.056
 guess+0.055
ptions+0.050
 counted+0.049
 yes+0.048
Neg logit contributions
 Woo-0.114
 Midnight-0.110
actus-0.110
 therapist-0.106
 Mitt-0.103
Token had
Token ID550
Feature activation-3.49e-02

Pos logit contributions
 guess+0.104
teen+0.100
 count+0.099
 whether+0.099
 Spanish+0.098
Neg logit contributions
 Midnight-0.061
 therapist-0.052
 drowned-0.048
 died-0.048
 Woo-0.047
Token many
Token ID867
Feature activation-4.60e-02

Pos logit contributions
 guess+0.056
 count+0.055
 imagine+0.051
ruits+0.051
ff+0.050
Neg logit contributions
 Midnight-0.061
 Woo-0.056
 therapist-0.055
actus-0.051
 Velvet-0.050
Token toys
Token ID14958
Feature activation-9.09e-03

Pos logit contributions
 guess+0.097
 count+0.090
 ten+0.089
ill+0.088
 1+0.087
Neg logit contributions
actus-0.056
 Midnight-0.055
 band-0.051
 therapist-0.049
ls-0.048
Token,
Token ID11
Feature activation-9.92e-02

Pos logit contributions
 guess+0.013
 count+0.013
 three+0.012
 ten+0.012
 imagine+0.012
Neg logit contributions
 therapist-0.018
actus-0.017
 Midnight-0.017
lor-0.015
 Woo-0.015
Token but
Token ID475
Feature activation-4.93e-02

Pos logit contributions
ff+0.180
 1+0.150
illed+0.148
rr+0.141
used+0.137
Neg logit contributions
 Midnight-0.158
 Velvet-0.145
 band-0.144
 Woo-0.141
 fle-0.131
Token their
Token ID511
Feature activation-1.75e-02

Pos logit contributions
 count+0.097
 guess+0.096
ff+0.093
adows+0.092
rr+0.092
Neg logit contributions
 Midnight-0.059
 therapist-0.059
 Woo-0.054
 Velvet-0.049
 swollen-0.047
Token favorite
Token ID4004
Feature activation-2.10e-02

Pos logit contributions
 guess+0.022
 count+0.021
 counted+0.019
 ten+0.019
 three+0.018
Neg logit contributions
actus-0.037
 therapist-0.037
 Midnight-0.037
 drowned-0.033
 masse-0.033
Token was
Token ID373
Feature activation-1.63e-02

Pos logit contributions
 guess+0.042
 count+0.041
 ten+0.040
 three+0.039
 1+0.039
Neg logit contributions
 therapist-0.029
actus-0.028
 Midnight-0.028
 drowned-0.024
 died-0.023
Token a
Token ID257
Feature activation+9.48e-03

Pos logit contributions
 guess+0.025
 count+0.024
 imagine+0.022
 three+0.022
 ten+0.021
Neg logit contributions
 Midnight-0.031
 therapist-0.030
actus-0.029
 Woo-0.027
 masse-0.027
Token steps
Token ID4831
Feature activation-4.28e-02

Pos logit contributions
 guess+0.066
ruits+0.063
 count+0.062
 imagine+0.061
 counted+0.060
Neg logit contributions
 therapist-0.041
actus-0.040
 masse-0.037
 Midnight-0.035
 disappears-0.034
Token.
Token ID13
Feature activation-4.63e-02

Pos logit contributions
 guess+0.078
 count+0.075
 imagine+0.072
ruits+0.068
 One+0.068
Neg logit contributions
 therapist-0.065
 Midnight-0.059
actus-0.059
 masse-0.054
iflower-0.051
Token One
Token ID1881
Feature activation-4.50e-02

Pos logit contributions
 guess+0.086
 count+0.083
 imagine+0.082
 mate+0.079
ruits+0.076
Neg logit contributions
 Midnight-0.068
 therapist-0.066
 masse-0.059
actus-0.055
 Woo-0.055
Token,
Token ID11
Feature activation-6.05e-02

Pos logit contributions
racted+0.081
ruits+0.081
 blond+0.080
 guess+0.079
ptions+0.078
Neg logit contributions
 therapist-0.056
 masse-0.052
actus-0.049
 band-0.047
 jar-0.047
Token two
Token ID734
Feature activation-6.81e-02

Pos logit contributions
ruits+0.094
ff+0.090
 guess+0.087
 count+0.087
 imagine+0.086
Neg logit contributions
 therapist-0.100
 Midnight-0.095
 masse-0.094
 band-0.089
 poppy-0.082
Token,
Token ID11
Feature activation-9.91e-02

Pos logit contributions
 guess+0.117
ruits+0.111
 imagine+0.107
ff+0.106
 whether+0.104
Neg logit contributions
ls-0.088
 masse-0.084
 Midnight-0.083
 therapist-0.082
 drowned-0.075
Token three
Token ID1115
Feature activation-6.98e-02

Pos logit contributions
ruits+0.120
immer+0.111
illed+0.110
 exception+0.104
ff+0.101
Neg logit contributions
 band-0.170
 therapist-0.166
 chip-0.160
 masse-0.158
 po-0.157
Token -
Token ID532
Feature activation-8.82e-02

Pos logit contributions
 guess+0.122
ruits+0.119
 count+0.119
 imagine+0.116
epad+0.113
Neg logit contributions
ls-0.098
 therapist-0.089
actus-0.084
iflower-0.083
 Midnight-0.082
Token all
Token ID477
Feature activation-2.65e-02

Pos logit contributions
ruits+0.126
illed+0.123
ff+0.116
 guess+0.115
immer+0.114
Neg logit contributions
 therapist-0.143
 Midnight-0.132
 masse-0.124
 Woo-0.119
 Cathy-0.116
Token the
Token ID262
Feature activation-4.82e-02

Pos logit contributions
 guess+0.054
 count+0.051
ruits+0.049
 ten+0.049
 imagine+0.048
Neg logit contributions
 therapist-0.033
 Midnight-0.030
actus-0.029
 masse-0.027
iflower-0.026
Token way
Token ID835
Feature activation+7.64e-03

Pos logit contributions
ruits+0.097
ff+0.094
 guess+0.094
 yes+0.093
immer+0.091
Neg logit contributions
 therapist-0.055
 masse-0.050
actus-0.046
 Midnight-0.044
 band-0.042
Token them
Token ID606
Feature activation-4.21e-02

Pos logit contributions
 guess+0.076
 count+0.074
 imagine+0.071
 counted+0.069
 want+0.068
Neg logit contributions
 Midnight-0.050
 therapist-0.046
actus-0.043
iflower-0.042
 Woo-0.040
Token.
Token ID13
Feature activation-3.83e-02

Pos logit contributions
 guess+0.079
 count+0.076
 three+0.075
 ten+0.074
 blond+0.072
Neg logit contributions
 Midnight-0.055
 therapist-0.055
actus-0.053
 died-0.047
iflower-0.046
Token They
Token ID1119
Feature activation-5.51e-02

Pos logit contributions
 count+0.072
 guess+0.070
 imagine+0.068
 counted+0.068
 ten+0.066
Neg logit contributions
 Midnight-0.058
 Woo-0.052
actus-0.049
 therapist-0.048
 Velvet-0.048
Token could
Token ID714
Feature activation-5.15e-02

Pos logit contributions
 guess+0.087
 Today+0.085
ruits+0.085
 whether+0.084
illed+0.084
Neg logit contributions
 therapist-0.080
 Midnight-0.079
 died-0.077
 drowned-0.074
actus-0.072
Token count
Token ID954
Feature activation-7.06e-02

Pos logit contributions
illed+0.076
ruits+0.073
 blonde+0.069
ff+0.069
adows+0.069
Neg logit contributions
 Midnight-0.084
 therapist-0.082
 band-0.075
actus-0.074
 masse-0.073
Token,
Token ID11
Feature activation-9.90e-02

Pos logit contributions
 guess+0.141
 count+0.132
 imagine+0.130
illed+0.128
 counted+0.128
Neg logit contributions
 Midnight-0.077
 therapist-0.074
actus-0.073
ls-0.072
iflower-0.070
Token add
Token ID751
Feature activation-6.01e-02

Pos logit contributions
illed+0.145
ruits+0.123
ff+0.122
adows+0.119
adies+0.114
Neg logit contributions
 Midnight-0.153
 therapist-0.147
 chip-0.146
 band-0.144
 masse-0.144
Token,
Token ID11
Feature activation-9.49e-02

Pos logit contributions
 guess+0.097
 imagine+0.092
 count+0.090
illed+0.088
 counted+0.085
Neg logit contributions
 Midnight-0.089
 therapist-0.089
actus-0.086
ls-0.081
 masse-0.080
Token and
Token ID290
Feature activation-7.43e-02

Pos logit contributions
illed+0.153
ruits+0.133
ff+0.127
 exception+0.126
 guess+0.125
Neg logit contributions
 band-0.128
 therapist-0.128
 Midnight-0.126
 chip-0.124
 masse-0.124
Token subtract
Token ID34128
Feature activation-6.31e-02

Pos logit contributions
illed+0.101
ruits+0.096
ff+0.090
adies+0.087
adows+0.086
Neg logit contributions
 Midnight-0.125
 therapist-0.124
 band-0.117
 masse-0.117
 disappears-0.111
Token with
Token ID351
Feature activation-5.48e-02

Pos logit contributions
 guess+0.131
 imagine+0.121
 blond+0.119
 wonder+0.119
 count+0.118
Neg logit contributions
 therapist-0.069
 Midnight-0.066
iflower-0.064
actus-0.063
ic-0.062
Token to
Token ID284
Feature activation-7.79e-02

Pos logit contributions
 guess+0.055
ruits+0.052
 count+0.052
 yes+0.052
 imagine+0.050
Neg logit contributions
 therapist-0.036
actus-0.032
 Midnight-0.031
 masse-0.031
 drowned-0.028
Token count
Token ID954
Feature activation-6.21e-02

Pos logit contributions
ruits+0.107
 yes+0.097
 mate+0.095
immer+0.093
adies+0.093
Neg logit contributions
 tree-0.123
 ble-0.119
 lands-0.119
 dri-0.117
 band-0.117
Token and
Token ID290
Feature activation-7.34e-02

Pos logit contributions
 guess+0.136
ruits+0.128
 count+0.128
 imagine+0.127
 counted+0.123
Neg logit contributions
ls-0.053
 therapist-0.049
 lands-0.045
actus-0.043
iflower-0.042
Token she
Token ID673
Feature activation-6.00e-02

Pos logit contributions
ruits+0.109
illed+0.098
ff+0.096
immer+0.096
 blond+0.096
Neg logit contributions
 therapist-0.117
 Midnight-0.104
 lands-0.102
 disappears-0.099
 she-0.098
Token counted
Token ID14789
Feature activation-8.36e-02

Pos logit contributions
ruits+0.069
 Today+0.064
 yes+0.061
 blond+0.060
 blonde+0.059
Neg logit contributions
 therapist-0.118
 drowned-0.113
 died-0.111
actus-0.109
 Midnight-0.104
Token to
Token ID284
Feature activation-9.89e-02

Pos logit contributions
 guess+0.177
ruits+0.173
 want+0.167
 imagine+0.167
 Many+0.167
Neg logit contributions
ls-0.064
 lands-0.064
 therapist-0.057
iflower-0.054
actus-0.053
Token five
Token ID1936
Feature activation-6.33e-02

Pos logit contributions
ruits+0.111
immer+0.100
 mate+0.095
 yes+0.095
 exception+0.092
Neg logit contributions
 tree-0.173
 lands-0.168
 band-0.153
 safety-0.152
 bush-0.148
Token.
Token ID13
Feature activation-2.17e-02

Pos logit contributions
 guess+0.119
ruits+0.114
 imagine+0.113
 mate+0.113
 yes+0.111
Neg logit contributions
ls-0.080
actus-0.071
 lands-0.071
 therapist-0.069
iflower-0.066
Token Then
Token ID3244
Feature activation+5.44e-03

Pos logit contributions
 guess+0.036
 count+0.035
 imagine+0.032
 ten+0.031
 counted+0.031
Neg logit contributions
 Midnight-0.039
 therapist-0.038
actus-0.037
 masse-0.034
 Woo-0.033
Token her
Token ID607
Feature activation-2.01e-02

Pos logit contributions
 therapist+0.012
actus+0.012
 Midnight+0.012
lor+0.011
 masse+0.011
Neg logit contributions
 guess-0.007
 count-0.007
 three-0.006
 ten-0.006
 imagine-0.006
Token head
Token ID1182
Feature activation-4.32e-02

Pos logit contributions
 guess+0.036
 count+0.035
 One+0.033
 imagine+0.033
 whether+0.033
Neg logit contributions
 therapist-0.031
actus-0.030
 Midnight-0.029
 masse-0.027
iflower-0.026
Token.
Token ID13
Feature activation-6.59e-02

Pos logit contributions
 guess+0.007
 count+0.007
 three+0.007
 ten+0.007
 imagine+0.006
Neg logit contributions
 therapist-0.007
actus-0.007
 Midnight-0.007
 masse-0.006
lor-0.006
Token They
Token ID1119
Feature activation-5.10e-02

Pos logit contributions
 guess+0.080
 imagine+0.073
 count+0.071
ptions+0.067
years+0.066
Neg logit contributions
 Midnight-0.121
 Woo-0.120
 Cathy-0.115
 Olive-0.113
actus-0.112
Token had
Token ID550
Feature activation-3.92e-02

Pos logit contributions
 guess+0.100
 count+0.092
 Today+0.091
 whether+0.091
teen+0.091
Neg logit contributions
 Midnight-0.066
 therapist-0.052
 Woo-0.052
 drowned-0.052
 died-0.051
Token many
Token ID867
Feature activation-5.37e-02

Pos logit contributions
 guess+0.065
 count+0.064
ruits+0.061
 imagine+0.060
ff+0.059
Neg logit contributions
 Midnight-0.064
 Woo-0.060
 therapist-0.054
 drowned-0.053
 Velvet-0.053
Token toys
Token ID14958
Feature activation-1.38e-02

Pos logit contributions
 guess+0.116
ill+0.110
ff+0.107
 count+0.107
 ten+0.106
Neg logit contributions
 Midnight-0.061
actus-0.056
 band-0.055
ls-0.052
 swollen-0.050
Token,
Token ID11
Feature activation-9.89e-02

Pos logit contributions
 guess+0.021
 count+0.020
 three+0.018
 ten+0.018
 imagine+0.018
Neg logit contributions
 therapist-0.026
 Midnight-0.026
actus-0.026
 Woo-0.023
lor-0.023
Token but
Token ID475
Feature activation-5.13e-02

Pos logit contributions
ff+0.185
rr+0.152
illed+0.148
 1+0.144
 amaz+0.143
Neg logit contributions
 Midnight-0.154
 Velvet-0.138
 band-0.136
 Woo-0.136
 poison-0.126
Token their
Token ID511
Feature activation-1.29e-02

Pos logit contributions
 count+0.101
 guess+0.101
rr+0.099
ff+0.099
adows+0.098
Neg logit contributions
 Midnight-0.063
 therapist-0.060
 Woo-0.056
 Velvet-0.049
 poppy-0.049
Token favorite
Token ID4004
Feature activation-1.88e-02

Pos logit contributions
 guess+0.016
 count+0.016
 ten+0.015
 counted+0.014
 three+0.014
Neg logit contributions
actus-0.027
 therapist-0.027
 Midnight-0.027
 masse-0.024
 drowned-0.024
Token was
Token ID373
Feature activation-1.79e-02

Pos logit contributions
 guess+0.037
 count+0.036
 ten+0.035
 three+0.034
 counted+0.034
Neg logit contributions
 therapist-0.027
actus-0.026
 Midnight-0.026
 drowned-0.022
iflower-0.021
Token a
Token ID257
Feature activation+2.63e-03

Pos logit contributions
 guess+0.028
 count+0.027
 imagine+0.025
 three+0.024
 ten+0.023
Neg logit contributions
 Midnight-0.034
 therapist-0.032
actus-0.031
 Woo-0.030
 masse-0.029
Token.
Token ID13
Feature activation-6.01e-02

Pos logit contributions
 guess+0.043
 count+0.041
 ten+0.039
 three+0.039
 imagine+0.039
Neg logit contributions
 therapist-0.036
actus-0.035
 Midnight-0.035
 drowned-0.031
 died-0.031
Token "
Token ID366
Feature activation-5.54e-02

Pos logit contributions
 guess+0.114
 count+0.112
 ten+0.111
 imagine+0.110
 yes+0.107
Neg logit contributions
 Midnight-0.079
 Cathy-0.063
 Woo-0.062
 therapist-0.062
actus-0.057
TokenOne
Token ID3198
Feature activation-4.47e-02

Pos logit contributions
 count+0.130
 guess+0.126
 imagine+0.123
ruits+0.121
 yes+0.119
Neg logit contributions
 Midnight-0.049
 therapist-0.047
Now-0.047
burn-0.044
 Woo-0.043
Token,
Token ID11
Feature activation-8.23e-02

Pos logit contributions
 guess+0.081
 blond+0.081
 blonde+0.079
 counted+0.079
racted+0.078
Neg logit contributions
actus-0.057
ww-0.054
iflower-0.051
 therapist-0.050
 mosquito-0.049
Token two
Token ID734
Feature activation-5.31e-02

Pos logit contributions
illed+0.122
 blond+0.120
adows+0.119
ill+0.115
ff+0.114
Neg logit contributions
 Midnight-0.135
 Woo-0.128
 poppy-0.117
 mosquito-0.116
 bunny-0.115
Token,
Token ID11
Feature activation-9.88e-02

Pos logit contributions
 guess+0.088
 count+0.079
 imagine+0.078
ruits+0.077
 whether+0.074
Neg logit contributions
 Midnight-0.080
actus-0.078
 drowned-0.075
 therapist-0.074
 died-0.073
Token three
Token ID1115
Feature activation-4.70e-02

Pos logit contributions
illed+0.117
adows+0.112
ff+0.105
ruits+0.103
owed+0.102
Neg logit contributions
 Midnight-0.181
 Woo-0.180
 band-0.174
 poppy-0.174
 mosquito-0.172
Token!"
Token ID2474
Feature activation-3.14e-02

Pos logit contributions
 guess+0.079
 count+0.076
 imagine+0.072
 ten+0.069
ruits+0.068
Neg logit contributions
actus-0.074
 therapist-0.072
 Midnight-0.070
iflower-0.070
ls-0.069
Token they
Token ID484
Feature activation-5.60e-02

Pos logit contributions
 ten+0.049
 count+0.049
 guess+0.047
ruits+0.045
 three+0.045
Neg logit contributions
 therapist-0.056
 Midnight-0.056
actus-0.050
 Woo-0.047
 Cathy-0.047
Token shouted
Token ID17293
Feature activation-3.93e-02

Pos logit contributions
 count+0.112
 ten+0.107
ruits+0.107
 guess+0.103
 whether+0.103
Neg logit contributions
 Midnight-0.071
 therapist-0.067
 Woo-0.058
 drowned-0.055
 band-0.053
Token,
Token ID11
Feature activation-2.96e-02

Pos logit contributions
 count+0.068
 guess+0.065
 ten+0.062
ruits+0.061
 counted+0.061
Neg logit contributions
 Midnight-0.064
 therapist-0.062
 Woo-0.055
actus-0.054
 masse-0.052
Token cards
Token ID4116
Feature activation-1.11e-03

Pos logit contributions
illed+0.155
ff+0.153
ruits+0.142
int+0.142
immer+0.139
Neg logit contributions
 Midnight-0.124
 band-0.113
 therapist-0.112
 masse-0.109
 medicine-0.108
Token.
Token ID13
Feature activation-3.20e-02

Pos logit contributions
 guess+0.002
 count+0.002
 three+0.002
 ten+0.002
 imagine+0.002
Neg logit contributions
 therapist-0.002
actus-0.002
 Midnight-0.002
lor-0.002
 masse-0.002
Token They
Token ID1119
Feature activation-6.86e-02

Pos logit contributions
 count+0.036
 guess+0.034
 counted+0.032
 ten+0.031
 imagine+0.031
Neg logit contributions
 Midnight-0.075
 Woo-0.069
actus-0.069
 therapist-0.069
 Lucky-0.066
Token could
Token ID714
Feature activation-5.71e-02

Pos logit contributions
adies+0.112
 Today+0.109
 guess+0.108
illed+0.107
adows+0.106
Neg logit contributions
 Midnight-0.088
 therapist-0.087
 died-0.085
 drowned-0.080
 band-0.078
Token count
Token ID954
Feature activation-7.13e-02

Pos logit contributions
illed+0.083
ruits+0.077
ff+0.073
adies+0.072
adows+0.071
Neg logit contributions
 Midnight-0.100
 therapist-0.092
 band-0.087
 masse-0.086
actus-0.085
Token,
Token ID11
Feature activation-9.87e-02

Pos logit contributions
 guess+0.145
 count+0.135
 imagine+0.132
illed+0.132
 counted+0.130
Neg logit contributions
 Midnight-0.079
actus-0.075
ls-0.074
 therapist-0.073
iflower-0.070
Token add
Token ID751
Feature activation-6.45e-02

Pos logit contributions
illed+0.152
ff+0.131
ruits+0.127
adies+0.125
adows+0.125
Neg logit contributions
 Midnight-0.152
 chip-0.146
 therapist-0.142
 sy-0.139
 masse-0.139
Token,
Token ID11
Feature activation-9.29e-02

Pos logit contributions
 guess+0.106
 imagine+0.099
 count+0.099
illed+0.096
 counted+0.095
Neg logit contributions
 Midnight-0.096
 therapist-0.094
actus-0.092
 masse-0.087
ls-0.086
Token and
Token ID290
Feature activation-8.10e-02

Pos logit contributions
illed+0.158
ruits+0.137
ff+0.135
 guess+0.133
adows+0.132
Neg logit contributions
 Midnight-0.121
 chip-0.119
 therapist-0.118
 band-0.116
 sy-0.114
Token subtract
Token ID34128
Feature activation-5.54e-02

Pos logit contributions
illed+0.117
adies+0.110
ruits+0.110
ff+0.106
adows+0.104
Neg logit contributions
 Midnight-0.131
 therapist-0.128
 masse-0.119
 band-0.117
 chip-0.112
Token very
Token ID845
Feature activation-3.66e-02

Pos logit contributions
 guess+0.112
 imagine+0.102
 count+0.101
 wonder+0.100
 blond+0.099
Neg logit contributions
 therapist-0.068
 Midnight-0.066
actus-0.063
iflower-0.063
ic-0.061
Token up
Token ID510
Feature activation-2.45e-02

Pos logit contributions
 guess+0.095
 count+0.095
ruits+0.094
 blond+0.089
 Many+0.087
Neg logit contributions
actus-0.051
 Midnight-0.048
 therapist-0.047
ls-0.045
 masse-0.040
Token information
Token ID1321
Feature activation-4.54e-02

Pos logit contributions
 count+0.041
 guess+0.039
ruits+0.036
 counted+0.035
 imagine+0.035
Neg logit contributions
actus-0.042
 therapist-0.042
 Midnight-0.040
 masse-0.038
iflower-0.036
Token and
Token ID290
Feature activation-5.88e-02

Pos logit contributions
 guess+0.068
 count+0.067
 imagine+0.061
 ten+0.061
 three+0.059
Neg logit contributions
 therapist-0.083
 Midnight-0.079
actus-0.076
 masse-0.071
lor-0.071
Token asked
Token ID1965
Feature activation-4.25e-02

Pos logit contributions
ruits+0.095
 blond+0.090
aney+0.087
 blonde+0.085
 count+0.083
Neg logit contributions
 therapist-0.091
 died-0.085
 drowned-0.083
 Midnight-0.081
actus-0.080
Token many
Token ID867
Feature activation-5.43e-02

Pos logit contributions
 guess+0.089
 blonde+0.088
 blond+0.087
 count+0.086
 imagine+0.084
Neg logit contributions
 Midnight-0.046
 therapist-0.043
 medicine-0.036
 Woo-0.036
iflower-0.035
Token questions
Token ID2683
Feature activation-9.86e-02

Pos logit contributions
 guess+0.059
 counted+0.058
 count+0.058
ill+0.055
ruits+0.053
Neg logit contributions
 therapist-0.113
 masse-0.107
actus-0.106
ls-0.104
 Midnight-0.103
Token.
Token ID13
Feature activation-9.38e-03

Pos logit contributions
 blond+0.191
 blonde+0.186
ptions+0.185
 guess+0.184
 imagine+0.177
Neg logit contributions
 therapist-0.110
 Midnight-0.081
lor-0.079
 that-0.076
iflower-0.075
Token Eventually
Token ID16178
Feature activation-9.54e-03

Pos logit contributions
 count+0.016
 guess+0.016
 ten+0.015
 imagine+0.015
 three+0.015
Neg logit contributions
 therapist-0.016
 Midnight-0.015
actus-0.015
lor-0.014
 masse-0.013
Token,
Token ID11
Feature activation-1.33e-02

Pos logit contributions
 guess+0.013
 count+0.013
 ten+0.012
 imagine+0.011
 three+0.011
Neg logit contributions
 therapist-0.019
 Midnight-0.019
actus-0.018
 drowned-0.017
 masse-0.017
Token the
Token ID262
Feature activation+1.16e-02

Pos logit contributions
 count+0.027
 guess+0.027
 ten+0.025
 imagine+0.025
ruits+0.024
Neg logit contributions
 therapist-0.018
 Midnight-0.018
actus-0.017
 masse-0.015
iflower-0.015
Token mole
Token ID9411
Feature activation-1.57e-02

Pos logit contributions
 therapist+0.013
actus+0.012
 Midnight+0.012
lor+0.011
 masse+0.010
Neg logit contributions
 guess-0.028
 count-0.028
 three-0.027
 ten-0.026
 imagine-0.025
Token.
Token ID13
Feature activation-7.05e-02

Pos logit contributions
 guess+0.025
 count+0.025
 three+0.023
 ten+0.023
 imagine+0.023
Neg logit contributions
 therapist-0.023
actus-0.022
 Midnight-0.022
 masse-0.019
iflower-0.019
Token They
Token ID1119
Feature activation-5.86e-02

Pos logit contributions
 guess+0.086
 count+0.085
 imagine+0.082
 believe+0.081
 yes+0.076
Neg logit contributions
 Woo-0.125
 Midnight-0.120
actus-0.117
 Mitt-0.115
 Cathy-0.111
Token had
Token ID550
Feature activation-3.80e-02

Pos logit contributions
 guess+0.107
 Today+0.104
 count+0.103
adies+0.101
teen+0.101
Neg logit contributions
 Midnight-0.077
 drowned-0.068
 therapist-0.068
 died-0.066
 Woo-0.064
Token many
Token ID867
Feature activation-5.38e-02

Pos logit contributions
 guess+0.060
 count+0.059
 imagine+0.056
ruits+0.056
ff+0.055
Neg logit contributions
 Midnight-0.064
 Woo-0.059
 therapist-0.058
 drowned-0.056
actus-0.054
Token toys
Token ID14958
Feature activation-1.08e-02

Pos logit contributions
 guess+0.112
ill+0.104
 count+0.102
ff+0.102
adies+0.100
Neg logit contributions
 band-0.061
actus-0.060
 Midnight-0.058
ls-0.058
 swollen-0.055
Token,
Token ID11
Feature activation-9.85e-02

Pos logit contributions
 guess+0.016
 count+0.016
 three+0.015
 ten+0.015
 imagine+0.014
Neg logit contributions
 therapist-0.020
actus-0.020
 Midnight-0.020
lor-0.018
 Woo-0.018
Token but
Token ID475
Feature activation-5.70e-02

Pos logit contributions
ff+0.169
rr+0.141
illed+0.140
 Today+0.135
 1+0.134
Neg logit contributions
 Midnight-0.160
 Woo-0.149
 band-0.145
 Velvet-0.141
 medicine-0.132
Token their
Token ID511
Feature activation-2.86e-02

Pos logit contributions
adows+0.114
 count+0.114
ff+0.112
 guess+0.111
rr+0.110
Neg logit contributions
 Midnight-0.061
 Woo-0.058
 therapist-0.057
 Velvet-0.049
 Mitt-0.047
Token favorite
Token ID4004
Feature activation-2.75e-02

Pos logit contributions
 guess+0.038
 count+0.036
 counted+0.034
 ten+0.034
 imagine+0.033
Neg logit contributions
actus-0.057
 therapist-0.056
 Midnight-0.056
 drowned-0.051
 masse-0.049
Token ones
Token ID3392
Feature activation-1.53e-02

Pos logit contributions
 guess+0.056
 count+0.055
 ten+0.053
 1+0.052
 three+0.052
Neg logit contributions
 therapist-0.037
actus-0.035
 Midnight-0.034
 drowned-0.030
 died-0.029
Token were
Token ID547
Feature activation-1.85e-02

Pos logit contributions
 guess+0.023
 count+0.023
 ten+0.022
 three+0.022
 imagine+0.021
Neg logit contributions
 therapist-0.028
actus-0.027
 Midnight-0.027
 drowned-0.024
iflower-0.024
Token.
Token ID13
Feature activation-4.74e-03

Pos logit contributions
 guess+0.071
 count+0.069
 three+0.066
 ten+0.065
 imagine+0.065
Neg logit contributions
 therapist-0.050
 Midnight-0.048
actus-0.047
 masse-0.041
iflower-0.041
Token "
Token ID366
Feature activation-4.94e-02

Pos logit contributions
 count+0.009
 guess+0.008
 ten+0.008
 three+0.008
 imagine+0.008
Neg logit contributions
 therapist-0.008
 Midnight-0.008
actus-0.007
 masse-0.007
lor-0.007
TokenOne
Token ID3198
Feature activation-3.59e-02

Pos logit contributions
 count+0.106
 guess+0.103
 imagine+0.100
ruits+0.098
 counted+0.096
Neg logit contributions
 therapist-0.056
 masse-0.051
 Woo-0.050
actus-0.050
Now-0.049
Token,
Token ID11
Feature activation-6.69e-02

Pos logit contributions
 guess+0.059
 counted+0.057
 count+0.056
 whether+0.055
 blond+0.055
Neg logit contributions
actus-0.057
 therapist-0.055
 bottle-0.053
 Midnight-0.051
 band-0.050
Token two
Token ID734
Feature activation-4.64e-02

Pos logit contributions
adows+0.124
illed+0.123
ff+0.117
ruits+0.116
 exception+0.109
Neg logit contributions
 Midnight-0.091
 therapist-0.085
 Woo-0.085
 bottle-0.083
 band-0.083
Token,
Token ID11
Feature activation-9.85e-02

Pos logit contributions
 guess+0.075
 count+0.066
 imagine+0.064
ruits+0.064
 ten+0.063
Neg logit contributions
 Midnight-0.076
 therapist-0.073
actus-0.071
 Woo-0.069
 band-0.067
Token three
Token ID1115
Feature activation-4.58e-02

Pos logit contributions
illed+0.169
adows+0.165
ff+0.156
ruits+0.150
 exception+0.143
Neg logit contributions
 Midnight-0.149
 Woo-0.148
 band-0.147
 Velvet-0.138
 bottle-0.136
Token,
Token ID11
Feature activation-8.96e-02

Pos logit contributions
 guess+0.078
 count+0.074
 imagine+0.070
 ten+0.068
 One+0.067
Neg logit contributions
 therapist-0.070
actus-0.068
 Midnight-0.067
iflower-0.064
ls-0.063
Token four
Token ID1440
Feature activation-4.17e-02

Pos logit contributions
illed+0.143
adows+0.134
ff+0.132
ruits+0.127
 exception+0.122
Neg logit contributions
 Woo-0.141
 Midnight-0.140
 band-0.140
 Velvet-0.132
 bottle-0.129
Token,"
Token ID553
Feature activation-2.71e-02

Pos logit contributions
 guess+0.068
 count+0.063
 imagine+0.060
 ten+0.060
ruits+0.059
Neg logit contributions
 therapist-0.067
 Midnight-0.065
actus-0.064
 drowned-0.059
ls-0.059
Token she
Token ID673
Feature activation-3.59e-02

Pos logit contributions
 count+0.048
 ten+0.046
ptions+0.045
 guess+0.045
ruits+0.045
Neg logit contributions
 therapist-0.042
 Midnight-0.040
actus-0.036
 masse-0.035
 ble-0.034
Token.
Token ID13
Feature activation-4.67e-02

Pos logit contributions
 guess+0.012
 count+0.012
 three+0.011
 ten+0.011
 imagine+0.011
Neg logit contributions
 therapist-0.014
actus-0.013
 Midnight-0.013
 masse-0.012
lor-0.012
Token They
Token ID1119
Feature activation-4.32e-02

Pos logit contributions
 count+0.043
 guess+0.041
 counted+0.040
 turrets+0.036
 ten+0.036
Neg logit contributions
 Midnight-0.110
 Lucky-0.103
 Woo-0.101
actus-0.101
 Velvet-0.099
Token had
Token ID550
Feature activation-3.71e-02

Pos logit contributions
 guess+0.082
 Today+0.076
 count+0.076
ff+0.075
 exception+0.074
Neg logit contributions
 Midnight-0.060
 therapist-0.051
 drowned-0.050
 Woo-0.048
 died-0.048
Token many
Token ID867
Feature activation-5.11e-02

Pos logit contributions
 guess+0.063
 count+0.061
ruits+0.059
ff+0.058
 imagine+0.058
Neg logit contributions
 Midnight-0.060
 Woo-0.056
 drowned-0.051
 therapist-0.050
 band-0.050
Token toys
Token ID14958
Feature activation-8.28e-03

Pos logit contributions
 guess+0.110
 count+0.103
ill+0.102
ff+0.100
adies+0.100
Neg logit contributions
actus-0.055
 Midnight-0.055
 band-0.055
orns-0.049
 swollen-0.049
Token,
Token ID11
Feature activation-9.84e-02

Pos logit contributions
 guess+0.012
 count+0.012
 three+0.011
 ten+0.011
 imagine+0.011
Neg logit contributions
 therapist-0.016
actus-0.016
 Midnight-0.016
lor-0.014
 Woo-0.014
Token but
Token ID475
Feature activation-3.91e-02

Pos logit contributions
ff+0.180
illed+0.149
 1+0.145
rr+0.142
used+0.139
Neg logit contributions
 Midnight-0.155
 band-0.143
 Velvet-0.142
 Woo-0.138
 poison-0.129
Token their
Token ID511
Feature activation-1.45e-02

Pos logit contributions
 count+0.082
 guess+0.081
ff+0.080
adows+0.079
 ten+0.077
Neg logit contributions
 Midnight-0.047
 therapist-0.042
 Woo-0.040
 Velvet-0.037
 Lucky-0.036
Token favorite
Token ID4004
Feature activation-2.30e-02

Pos logit contributions
 guess+0.019
 count+0.019
 ten+0.017
 counted+0.017
 three+0.017
Neg logit contributions
actus-0.030
 therapist-0.030
 Midnight-0.029
 drowned-0.027
 masse-0.027
Token was
Token ID373
Feature activation-1.88e-02

Pos logit contributions
 guess+0.046
 count+0.045
 ten+0.043
 three+0.042
 1+0.042
Neg logit contributions
 therapist-0.032
actus-0.031
 Midnight-0.030
 drowned-0.026
 died-0.026
Token a
Token ID257
Feature activation+6.70e-03

Pos logit contributions
 guess+0.031
 count+0.029
 imagine+0.028
 three+0.026
 ten+0.026
Neg logit contributions
 Midnight-0.034
 therapist-0.033
actus-0.031
 Woo-0.030
 masse-0.029
Token blocks
Token ID7021
Feature activation-4.62e-02

Pos logit contributions
 guess+0.096
ruits+0.093
 yes+0.090
 imagine+0.090
 blond+0.088
Neg logit contributions
actus-0.072
iflower-0.072
 band-0.069
 therapist-0.069
 jar-0.068
Token.
Token ID13
Feature activation-5.41e-02

Pos logit contributions
 guess+0.090
 blond+0.084
 count+0.083
 imagine+0.083
 blonde+0.082
Neg logit contributions
 therapist-0.056
actus-0.052
 Midnight-0.049
iflower-0.048
 masse-0.046
Token One
Token ID1881
Feature activation-4.85e-02

Pos logit contributions
 guess+0.100
 imagine+0.096
 count+0.095
 yes+0.091
ruits+0.088
Neg logit contributions
 Midnight-0.075
 therapist-0.067
 Woo-0.064
 Mitt-0.060
 Lucky-0.059
Token,
Token ID11
Feature activation-7.61e-02

Pos logit contributions
 blond+0.093
 blonde+0.091
adies+0.089
racted+0.088
uries+0.087
Neg logit contributions
 chip-0.050
 therapist-0.050
 band-0.050
iflower-0.049
 jar-0.048
Token two
Token ID734
Feature activation-8.06e-02

Pos logit contributions
ff+0.138
ruits+0.134
adows+0.132
iner+0.129
illed+0.128
Neg logit contributions
 band-0.098
 Midnight-0.095
 therapist-0.093
 bottle-0.086
 Woo-0.085
Token,
Token ID11
Feature activation-9.84e-02

Pos logit contributions
 guess+0.142
ruits+0.134
 imagine+0.134
ff+0.133
adies+0.133
Neg logit contributions
ls-0.090
 drowned-0.089
 chip-0.087
 died-0.086
 Midnight-0.084
Token three
Token ID1115
Feature activation-7.48e-02

Pos logit contributions
immer+0.122
ruits+0.120
illed+0.118
ff+0.113
adows+0.110
Neg logit contributions
 band-0.172
 chip-0.156
 Midnight-0.153
 therapist-0.152
 jar-0.149
Token,
Token ID11
Feature activation-9.28e-02

Pos logit contributions
 guess+0.133
 imagine+0.130
 count+0.130
ruits+0.128
epad+0.125
Neg logit contributions
ls-0.097
iflower-0.084
 therapist-0.083
 drowned-0.081
actus-0.081
Token four
Token ID1440
Feature activation-7.83e-02

Pos logit contributions
ruits+0.122
immer+0.120
illed+0.114
ff+0.112
adows+0.107
Neg logit contributions
 band-0.146
 therapist-0.138
 chip-0.137
 Midnight-0.136
 she-0.130
Token...
Token ID986
Feature activation-8.64e-02

Pos logit contributions
 guess+0.136
ruits+0.135
 imagine+0.132
adies+0.131
 mate+0.126
Neg logit contributions
ls-0.093
 therapist-0.085
 drowned-0.084
 chip-0.084
 died-0.083
Token The
Token ID383
Feature activation-5.46e-02

Pos logit contributions
 imagine+0.132
 mate+0.126
 count+0.123
 guess+0.122
 blond+0.120
Neg logit contributions
 therapist-0.131
Now-0.124
we-0.124
 Midnight-0.118
-0.115
Token so
Token ID523
Feature activation-3.86e-02

Pos logit contributions
 guess+0.142
illed+0.134
ruits+0.127
 count+0.127
 imagine+0.127
Neg logit contributions
actus-0.097
iflower-0.093
 Midnight-0.093
 therapist-0.093
lor-0.091
Token high
Token ID1029
Feature activation+4.64e-03

Pos logit contributions
 guess+0.064
 count+0.064
 counted+0.060
 imagine+0.060
 blond+0.056
Neg logit contributions
 therapist-0.062
 Midnight-0.060
 masse-0.057
actus-0.056
iflower-0.055
Token.
Token ID13
Feature activation-6.86e-02

Pos logit contributions
 therapist+0.009
actus+0.008
 Midnight+0.008
 masse+0.008
lor+0.008
Neg logit contributions
 guess-0.007
 count-0.007
 three-0.007
 ten-0.007
 imagine-0.007
Token How
Token ID1374
Feature activation-7.71e-02

Pos logit contributions
 guess+0.124
illed+0.117
 imagine+0.114
 count+0.114
 exception+0.112
Neg logit contributions
 Midnight-0.097
 therapist-0.092
 masse-0.091
 Woo-0.088
 Velvet-0.081
Token did
Token ID750
Feature activation-4.07e-03

Pos logit contributions
 count+0.120
ptions+0.117
 blond+0.117
 counted+0.115
 fascination+0.112
Neg logit contributions
actus-0.098
iflower-0.089
ww-0.086
 therapist-0.086
 band-0.086
Token you
Token ID345
Feature activation-9.81e-02

Pos logit contributions
 guess+0.010
 count+0.010
 three+0.009
 ten+0.009
 counted+0.009
Neg logit contributions
 therapist-0.004
 Midnight-0.004
actus-0.004
 masse-0.003
lor-0.003
Token learn
Token ID2193
Feature activation-4.08e-02

Pos logit contributions
 blond+0.150
 blonde+0.146
ruits+0.141
illed+0.132
 English+0.123
Neg logit contributions
 hurt-0.131
 band-0.129
 tree-0.125
 healed-0.123
 you-0.121
Token to
Token ID284
Feature activation-6.22e-02

Pos logit contributions
 guess+0.072
 count+0.069
 counted+0.067
ruits+0.067
 imagine+0.067
Neg logit contributions
 therapist-0.060
 Midnight-0.060
actus-0.060
lor-0.057
 masse-0.055
Token count
Token ID954
Feature activation-6.90e-02

Pos logit contributions
ruits+0.069
 blonde+0.067
illed+0.066
 blond+0.065
 yes+0.063
Neg logit contributions
 band-0.108
iflower-0.107
 therapist-0.106
actus-0.106
 Midnight-0.106
Token so
Token ID523
Feature activation-3.77e-02

Pos logit contributions
 guess+0.127
 count+0.119
 counted+0.119
 blond+0.116
illed+0.115
Neg logit contributions
 therapist-0.085
actus-0.085
 Midnight-0.082
iflower-0.082
 masse-0.078
Token well
Token ID880
Feature activation-9.18e-03

Pos logit contributions
 count+0.063
 guess+0.062
 counted+0.061
 imagine+0.058
 blond+0.058
Neg logit contributions
 therapist-0.061
actus-0.056
 Midnight-0.056
 masse-0.055
iflower-0.053
Token.
Token ID13
Feature activation-1.62e-02

Pos logit contributions
 guess+0.004
 count+0.004
 three+0.004
 ten+0.004
 imagine+0.004
Neg logit contributions
 therapist-0.004
actus-0.004
 Midnight-0.004
 masse-0.004
lor-0.004
Token One
Token ID1881
Feature activation-1.23e-02

Pos logit contributions
 guess+0.013
 count+0.013
 ten+0.011
 counted+0.011
 imagine+0.010
Neg logit contributions
 Midnight-0.044
 therapist-0.041
actus-0.041
 Woo-0.040
 masse-0.039
Token day
Token ID1110
Feature activation+4.02e-03

Pos logit contributions
 guess+0.028
 count+0.028
 ten+0.027
 1+0.026
 counted+0.026
Neg logit contributions
actus-0.014
 therapist-0.014
 Midnight-0.013
 masse-0.011
iflower-0.011
Token,
Token ID11
Feature activation+1.04e-02

Pos logit contributions
 therapist+0.009
actus+0.009
 Midnight+0.009
 masse+0.008
lor+0.008
Neg logit contributions
 guess-0.005
 count-0.005
 three-0.004
 ten-0.004
 imagine-0.004
Token she
Token ID673
Feature activation+2.36e-03

Pos logit contributions
actus+0.016
 therapist+0.016
 Midnight+0.015
lor+0.015
 masse+0.014
Neg logit contributions
 guess-0.020
 count-0.020
 three-0.019
 ten-0.019
 imagine-0.018
Token couldn
Token ID3521
Feature activation+4.81e-02

Pos logit contributions
 therapist+0.005
actus+0.005
 Midnight+0.004
 masse+0.004
lor+0.004
Neg logit contributions
 guess-0.004
 count-0.003
 three-0.003
 ten-0.003
 imagine-0.003
Token't
Token ID470
Feature activation-3.20e-02

Pos logit contributions
actus+0.091
 therapist+0.089
 Midnight+0.080
iflower+0.079
lor+0.079
Neg logit contributions
 count-0.077
 One-0.077
 guess-0.075
 three-0.072
 Many-0.067
Token find
Token ID1064
Feature activation+8.16e-03

Pos logit contributions
 1+0.045
 guess+0.044
 ten+0.044
 count+0.044
 blond+0.043
Neg logit contributions
 therapist-0.055
 Midnight-0.055
actus-0.055
 drowned-0.052
 masse-0.050
Token the
Token ID262
Feature activation-1.21e-02

Pos logit contributions
 therapist+0.014
actus+0.014
 Midnight+0.013
lor+0.012
 masse+0.012
Neg logit contributions
 guess-0.014
 count-0.014
 three-0.014
 ten-0.013
 imagine-0.013
Token key
Token ID1994
Feature activation-9.98e-03

Pos logit contributions
 guess+0.022
 count+0.022
 counted+0.020
 ten+0.020
 imagine+0.020
Neg logit contributions
 therapist-0.019
actus-0.019
 Midnight-0.019
 masse-0.017
 Woo-0.016
Token to
Token ID284
Feature activation-1.47e-02

Pos logit contributions
 guess+0.017
 count+0.017
 three+0.016
 ten+0.016
 imagine+0.015
Neg logit contributions
 therapist-0.017
 Midnight-0.017
actus-0.017
lor-0.015
 masse-0.015
Token together
Token ID1978
Feature activation+1.26e-03

Pos logit contributions
 guess+0.003
 count+0.003
 three+0.002
 ten+0.002
 imagine+0.002
Neg logit contributions
 therapist-0.003
actus-0.003
 Midnight-0.003
lor-0.002
 masse-0.002
Token all
Token ID477
Feature activation+6.24e-03

Pos logit contributions
 therapist+0.002
actus+0.002
 Midnight+0.002
 masse+0.002
lor+0.002
Neg logit contributions
 guess-0.002
 count-0.002
 three-0.002
 ten-0.002
 imagine-0.002
Token the
Token ID262
Feature activation-1.61e-03

Pos logit contributions
 therapist+0.009
actus+0.009
 Midnight+0.009
lor+0.008
 masse+0.008
Neg logit contributions
 guess-0.012
 count-0.012
 three-0.012
 ten-0.011
 imagine-0.011
Token time
Token ID640
Feature activation+2.14e-03

Pos logit contributions
 guess+0.003
 count+0.003
 three+0.003
 ten+0.003
 imagine+0.003
Neg logit contributions
actus-0.003
 therapist-0.003
 Midnight-0.003
 masse-0.002
iflower-0.002
Token and
Token ID290
Feature activation+2.45e-02

Pos logit contributions
 therapist+0.004
actus+0.004
 Midnight+0.004
 masse+0.004
lor+0.004
Neg logit contributions
 guess-0.003
 count-0.003
 three-0.003
 ten-0.003
 imagine-0.003
Token everyone
Token ID2506
Feature activation+5.22e-02

Pos logit contributions
actus+0.048
 therapist+0.048
lor+0.045
 Midnight+0.045
 masse+0.045
Neg logit contributions
 guess-0.038
 count-0.038
 three-0.036
 counted-0.035
 ten-0.035
Token in
Token ID287
Feature activation+1.73e-02

Pos logit contributions
 masse+0.124
 Midnight+0.123
 therapist+0.123
actus+0.122
lor+0.120
Neg logit contributions
 counted-0.057
 guess-0.057
 like-0.056
 count-0.056
 three-0.054
Token the
Token ID262
Feature activation+2.38e-02

Pos logit contributions
 therapist+0.031
actus+0.031
 Midnight+0.029
lor+0.028
 masse+0.028
Neg logit contributions
 guess-0.029
 count-0.029
 three-0.029
 ten-0.028
 imagine-0.027
Token shop
Token ID6128
Feature activation+1.68e-02

Pos logit contributions
 therapist+0.045
 Midnight+0.042
actus+0.041
lor+0.039
 masse+0.038
Neg logit contributions
 three-0.038
 guess-0.038
 count-0.037
 One-0.036
 ten-0.034
Token was
Token ID373
Feature activation+2.41e-02

Pos logit contributions
 therapist+0.032
actus+0.032
 Midnight+0.032
 masse+0.030
lor+0.029
Neg logit contributions
 guess-0.026
 count-0.025
 three-0.024
 ten-0.023
 counted-0.023
Token happy
Token ID3772
Feature activation+3.59e-02

Pos logit contributions
actus+0.058
 therapist+0.057
 Midnight+0.054
 masse+0.052
lor+0.052
Neg logit contributions
 guess-0.029
 count-0.029
 three-0.028
 ten-0.027
 like-0.026
Token just
Token ID655
Feature activation-1.64e-02

Pos logit contributions
 therapist+0.014
actus+0.014
 Midnight+0.013
 masse+0.012
lor+0.012
Neg logit contributions
 guess-0.013
 count-0.013
 three-0.012
 ten-0.012
 counted-0.012
Token built
Token ID3170
Feature activation+8.31e-03

Pos logit contributions
 guess+0.027
 count+0.027
 ten+0.024
 imagine+0.024
 three+0.024
Neg logit contributions
 therapist-0.028
actus-0.027
 Midnight-0.027
 drowned-0.026
 died-0.025
Token ever
Token ID1683
Feature activation-1.37e-02

Pos logit contributions
 therapist+0.014
actus+0.014
 Midnight+0.013
 masse+0.012
lor+0.012
Neg logit contributions
 guess-0.015
 count-0.015
 three-0.014
 ten-0.014
 imagine-0.013
Token again
Token ID757
Feature activation+7.32e-03

Pos logit contributions
 guess+0.027
 count+0.027
 three+0.025
 ten+0.025
ruits+0.025
Neg logit contributions
 therapist-0.018
actus-0.018
 Midnight-0.018
 drowned-0.017
 died-0.016
Token and
Token ID290
Feature activation+4.17e-02

Pos logit contributions
 therapist+0.014
actus+0.013
 Midnight+0.013
 masse+0.012
lor+0.012
Neg logit contributions
 guess-0.012
 count-0.012
 three-0.011
 ten-0.011
 imagine-0.010
Token he
Token ID339
Feature activation+5.30e-02

Pos logit contributions
actus+0.088
 therapist+0.084
lor+0.079
 Midnight+0.079
 masse+0.079
Neg logit contributions
 guess-0.063
 count-0.063
 counted-0.060
 three-0.059
 ten-0.058
Token was
Token ID373
Feature activation+3.55e-02

Pos logit contributions
actus+0.117
 therapist+0.115
 masse+0.112
 Midnight+0.112
iflower+0.107
Neg logit contributions
 guess-0.070
 counted-0.067
 count-0.066
 like-0.062
 three-0.060
Token left
Token ID1364
Feature activation+3.96e-02

Pos logit contributions
actus+0.083
 therapist+0.082
 Midnight+0.074
 masse+0.073
lor+0.072
Neg logit contributions
 three-0.047
 guess-0.045
 count-0.043
 like-0.043
 ten-0.042
Token feeling
Token ID4203
Feature activation+3.73e-02

Pos logit contributions
actus+0.077
 therapist+0.076
 Midnight+0.072
 masse+0.070
iflower+0.069
Neg logit contributions
 three-0.061
 guess-0.060
 count-0.059
 like-0.059
 ten-0.058
Token scared
Token ID12008
Feature activation+1.16e-02

Pos logit contributions
actus+0.084
 therapist+0.081
lor+0.077
 Midnight+0.075
 masse+0.074
Neg logit contributions
 like-0.050
 three-0.048
 guess-0.047
 count-0.044
 ten-0.043
Token and
Token ID290
Feature activation+3.24e-02

Pos logit contributions
 therapist+0.022
actus+0.022
 Midnight+0.021
lor+0.020
 masse+0.020
Neg logit contributions
 guess-0.018
 count-0.018
 three-0.017
 ten-0.017
 like-0.016
Token One
Token ID1881
Feature activation-7.72e-03

Pos logit contributions
 guess+0.041
 count+0.038
 ten+0.036
 counted+0.035
 imagine+0.034
Neg logit contributions
 Midnight-0.052
actus-0.049
 therapist-0.049
 masse-0.046
 Woo-0.045
Token warm
Token ID5814
Feature activation+5.20e-02

Pos logit contributions
 guess+0.016
 count+0.016
 ten+0.016
 three+0.015
 counted+0.015
Neg logit contributions
 therapist-0.010
actus-0.010
 Midnight-0.009
 masse-0.008
iflower-0.008
Token day
Token ID1110
Feature activation-1.14e-02

Pos logit contributions
lor+0.100
 therapist+0.097
 Midnight+0.096
 drowned+0.096
 swollen+0.093
Neg logit contributions
 three-0.076
 imagine-0.071
 like-0.070
 count-0.070
 One-0.068
Token,
Token ID11
Feature activation-1.45e-03

Pos logit contributions
 guess+0.017
 count+0.016
 ten+0.014
 imagine+0.014
 counted+0.014
Neg logit contributions
 Midnight-0.023
 therapist-0.022
actus-0.022
 Woo-0.020
 masse-0.020
Token the
Token ID262
Feature activation+2.57e-02

Pos logit contributions
 guess+0.003
 count+0.003
 ten+0.003
 three+0.003
 counted+0.003
Neg logit contributions
 therapist-0.002
 Midnight-0.002
actus-0.002
 masse-0.001
 Woo-0.001
Token sun
Token ID4252
Feature activation+5.54e-02

Pos logit contributions
 therapist+0.047
lor+0.046
 Midnight+0.046
actus+0.045
 drowned+0.043
Neg logit contributions
 three-0.040
 guess-0.040
 count-0.038
 imagine-0.036
 ten-0.035
Token was
Token ID373
Feature activation+9.42e-03

Pos logit contributions
lor+0.132
 Midnight+0.127
 therapist+0.125
actus+0.125
 masse+0.124
Neg logit contributions
 count-0.059
 counted-0.058
 three-0.057
 like-0.056
 guess-0.054
Token shining
Token ID22751
Feature activation-8.60e-03

Pos logit contributions
 therapist+0.016
actus+0.016
 Midnight+0.015
lor+0.014
 masse+0.014
Neg logit contributions
 guess-0.016
 count-0.016
 three-0.016
 ten-0.015
 imagine-0.015
Token,
Token ID11
Feature activation+1.56e-02

Pos logit contributions
 guess+0.014
 count+0.013
 ten+0.012
 imagine+0.012
 counted+0.012
Neg logit contributions
 therapist-0.016
actus-0.016
 Midnight-0.016
 masse-0.015
iflower-0.014
Token so
Token ID523
Feature activation+4.35e-03

Pos logit contributions
 therapist+0.035
actus+0.034
 Midnight+0.033
lor+0.032
 masse+0.031
Neg logit contributions
 guess-0.019
 count-0.019
 three-0.018
 ten-0.017
 imagine-0.017
Token the
Token ID262
Feature activation+1.43e-02

Pos logit contributions
 therapist+0.007
actus+0.007
 Midnight+0.007
lor+0.006
 masse+0.006
Neg logit contributions
 guess-0.008
 count-0.008
 three-0.007
 ten-0.007
 imagine-0.007
TokenL
Token ID43
Feature activation+3.37e-03

Pos logit contributions
 therapist+0.069
actus+0.067
 Midnight+0.065
lor+0.062
 masse+0.062
Neg logit contributions
 guess-0.046
 count-0.043
 ten-0.041
 One-0.039
 three-0.037
Tokenily
Token ID813
Feature activation+5.31e-02

Pos logit contributions
 therapist+0.007
actus+0.007
 Midnight+0.007
 masse+0.006
lor+0.006
Neg logit contributions
 guess-0.005
 count-0.005
 three-0.004
 ten-0.004
 imagine-0.004
Token says
Token ID1139
Feature activation+2.54e-02

Pos logit contributions
 therapist+0.125
lor+0.123
 Midnight+0.121
iflower+0.120
 masse+0.119
Neg logit contributions
 guess-0.058
 count-0.057
 counted-0.050
 three-0.049
 like-0.048
Token,
Token ID11
Feature activation+4.27e-02

Pos logit contributions
actus+0.068
 therapist+0.066
lor+0.063
 Midnight+0.063
 masse+0.062
Neg logit contributions
 guess-0.021
 three-0.021
 count-0.021
 yes-0.019
 ten-0.019
Token "
Token ID366
Feature activation+4.45e-02

Pos logit contributions
actus+0.111
lor+0.104
 therapist+0.103
eds+0.101
 drowned+0.100
Neg logit contributions
 three-0.042
 yes-0.040
 guess-0.040
 count-0.040
 like-0.035
TokenThank
Token ID10449
Feature activation+4.92e-02

Pos logit contributions
actus+0.076
eds+0.070
 drowned+0.070
lor+0.069
 disappears+0.069
Neg logit contributions
 guess-0.077
 count-0.074
 three-0.074
 like-0.073
 ten-0.069
Token you
Token ID345
Feature activation+1.24e-02

Pos logit contributions
actus+0.088
 therapist+0.085
 drowned+0.079
 Midnight+0.078
 severe+0.076
Neg logit contributions
 guess-0.085
 count-0.084
 three-0.083
 ten-0.080
 yes-0.076
Token,
Token ID11
Feature activation+1.21e-02

Pos logit contributions
 therapist+0.030
actus+0.029
 Midnight+0.028
lor+0.027
 masse+0.027
Neg logit contributions
 guess-0.014
 count-0.014
 three-0.013
 ten-0.012
 One-0.011
Token Mom
Token ID11254
Feature activation-1.71e-03

Pos logit contributions
 therapist+0.026
actus+0.025
 Midnight+0.024
lor+0.024
iflower+0.023
Neg logit contributions
 guess-0.017
 count-0.017
 three-0.015
 ten-0.015
 imagine-0.014
Token.
Token ID13
Feature activation+3.20e-02

Pos logit contributions
 guess+0.003
 count+0.003
 three+0.003
 ten+0.003
 imagine+0.003
Neg logit contributions
 therapist-0.003
actus-0.003
 Midnight-0.003
 masse-0.003
lor-0.003
Token I
Token ID314
Feature activation+5.30e-03

Pos logit contributions
actus+0.060
 therapist+0.058
 drowned+0.055
iflower+0.052
lor+0.052
Neg logit contributions
 count-0.052
 guess-0.050
 three-0.050
 One-0.049
 ten-0.048
Token her
Token ID607
Feature activation+9.59e-03

Pos logit contributions
actus+0.050
 therapist+0.046
lor+0.045
 Midnight+0.043
 masse+0.043
Neg logit contributions
 guess-0.041
 count-0.039
 three-0.039
 ten-0.037
 counted-0.035
Token mum
Token ID25682
Feature activation-4.04e-03

Pos logit contributions
 therapist+0.017
actus+0.017
 Midnight+0.017
lor+0.016
 masse+0.015
Neg logit contributions
 guess-0.016
 count-0.016
 three-0.015
 ten-0.015
 imagine-0.014
Token.
Token ID13
Feature activation+1.06e-02

Pos logit contributions
 guess+0.007
 count+0.007
 ten+0.006
 three+0.006
 imagine+0.006
Neg logit contributions
 therapist-0.007
 Midnight-0.007
actus-0.007
 masse-0.006
lor-0.006
Token "
Token ID366
Feature activation+2.33e-02

Pos logit contributions
actus+0.021
 therapist+0.021
 Midnight+0.020
 masse+0.019
lor+0.019
Neg logit contributions
 guess-0.016
 count-0.016
 three-0.015
 ten-0.014
 One-0.014
TokenHel
Token ID12621
Feature activation+1.47e-02

Pos logit contributions
actus+0.031
 therapist+0.030
 Midnight+0.029
 drowned+0.026
 masse+0.026
Neg logit contributions
 guess-0.049
 count-0.048
 three-0.047
 ten-0.046
 like-0.045
Tokenping
Token ID13886
Feature activation+3.62e-02

Pos logit contributions
 therapist+0.026
 Midnight+0.025
actus+0.025
 masse+0.023
 Woo+0.022
Neg logit contributions
 count-0.025
 guess-0.025
 three-0.024
 ten-0.024
 imagine-0.023
Token others
Token ID1854
Feature activation+2.65e-02

Pos logit contributions
actus+0.068
 therapist+0.068
 Midnight+0.065
 masse+0.062
lor+0.060
Neg logit contributions
 count-0.057
 guess-0.056
 three-0.056
 ten-0.053
 like-0.052
Token can
Token ID460
Feature activation+1.81e-02

Pos logit contributions
 therapist+0.054
actus+0.053
 Midnight+0.053
 masse+0.050
lor+0.049
Neg logit contributions
 guess-0.038
 count-0.038
 three-0.035
 imagine-0.034
 ten-0.033
Token be
Token ID307
Feature activation+5.61e-03

Pos logit contributions
 therapist+0.037
actus+0.036
 Midnight+0.036
lor+0.033
 masse+0.033
Neg logit contributions
 count-0.027
 guess-0.027
 three-0.024
 imagine-0.024
 ten-0.024
Token a
Token ID257
Feature activation+1.80e-02

Pos logit contributions
 therapist+0.010
actus+0.010
 Midnight+0.010
 masse+0.009
lor+0.009
Neg logit contributions
 guess-0.009
 count-0.009
 three-0.008
 ten-0.008
 imagine-0.008
Token delicate
Token ID19217
Feature activation+1.39e-02

Pos logit contributions
actus+0.033
 therapist+0.033
 Midnight+0.031
lor+0.029
 masse+0.028
Neg logit contributions
 guess-0.030
 count-0.030
 three-0.028
 ten-0.028
 imagine-0.026
Token hole
Token ID7604
Feature activation-4.47e-03

Pos logit contributions
lor+0.078
 therapist+0.078
ugg+0.077
 Midnight+0.072
 masse+0.070
Neg logit contributions
 guess-0.090
 three-0.086
 like-0.082
 count-0.081
 want-0.081
Token in
Token ID287
Feature activation+4.58e-02

Pos logit contributions
 guess+0.007
 count+0.007
 ten+0.007
 three+0.007
 imagine+0.007
Neg logit contributions
 therapist-0.008
actus-0.008
 Midnight-0.008
 masse-0.007
lor-0.007
Token the
Token ID262
Feature activation+5.95e-02

Pos logit contributions
actus+0.109
 therapist+0.103
lor+0.103
ugg+0.100
iflower+0.097
Neg logit contributions
 three-0.058
 ten-0.058
 guess-0.056
 count-0.052
 One-0.052
Token net
Token ID2010
Feature activation+2.03e-02

Pos logit contributions
lor+0.115
 therapist+0.111
ugg+0.111
ic+0.107
burn+0.103
Neg logit contributions
 three-0.097
 guess-0.092
 count-0.087
 like-0.087
 ten-0.083
Token.
Token ID13
Feature activation+2.74e-02

Pos logit contributions
 therapist+0.041
actus+0.041
 Midnight+0.039
 masse+0.037
lor+0.037
Neg logit contributions
 guess-0.030
 count-0.029
 three-0.027
 ten-0.027
 One-0.026
Token He
Token ID679
Feature activation+2.96e-03

Pos logit contributions
actus+0.071
 therapist+0.071
lor+0.068
burn+0.067
 Midnight+0.066
Neg logit contributions
 guess-0.024
 One-0.024
 three-0.023
 count-0.021
 ten-0.020
Token was
Token ID373
Feature activation+6.81e-03

Pos logit contributions
 therapist+0.006
actus+0.006
 Midnight+0.006
 masse+0.005
lor+0.005
Neg logit contributions
 guess-0.004
 count-0.004
 three-0.004
 ten-0.004
 imagine-0.004
Token not
Token ID407
Feature activation-1.30e-02

Pos logit contributions
 therapist+0.013
actus+0.013
 Midnight+0.013
lor+0.012
 masse+0.012
Neg logit contributions
 guess-0.010
 count-0.010
 three-0.010
 ten-0.009
 counted-0.009
Token angry
Token ID7954
Feature activation-1.46e-02

Pos logit contributions
 count+0.019
 guess+0.019
 One+0.017
 ten+0.017
 imagine+0.017
Neg logit contributions
 Midnight-0.024
 therapist-0.024
actus-0.023
 drowned-0.022
 Woo-0.022
Token.
Token ID13
Feature activation+1.89e-02

Pos logit contributions
 count+0.024
 guess+0.024
 imagine+0.022
 ten+0.021
 three+0.021
Neg logit contributions
 Midnight-0.025
 therapist-0.025
actus-0.025
 Woo-0.023
 masse-0.022
Token He
Token ID679
Feature activation+3.84e-03

Pos logit contributions
 therapist+0.047
actus+0.047
 Midnight+0.044
lor+0.043
iflower+0.043
Neg logit contributions
 guess-0.019
 three-0.018
 One-0.017
 count-0.017
 ten-0.016
Token\n
Token ID198
Feature activation+1.58e-03

Pos logit contributions
 therapist+0.015
actus+0.015
 Midnight+0.014
 masse+0.013
lor+0.013
Neg logit contributions
 guess-0.015
 count-0.015
 three-0.014
 ten-0.014
 imagine-0.013
TokenOne
Token ID3198
Feature activation-1.15e-02

Pos logit contributions
 therapist+0.003
actus+0.003
 Midnight+0.003
 masse+0.002
lor+0.002
Neg logit contributions
 guess-0.003
 count-0.003
 three-0.002
 ten-0.002
 imagine-0.002
Token day
Token ID1110
Feature activation+8.12e-03

Pos logit contributions
 count+0.026
 guess+0.026
 ten+0.025
 counted+0.024
 imagine+0.024
Neg logit contributions
 therapist-0.013
actus-0.013
 Midnight-0.012
 masse-0.011
iflower-0.011
Token she
Token ID673
Feature activation-4.52e-03

Pos logit contributions
 therapist+0.018
actus+0.018
 Midnight+0.018
lor+0.017
 masse+0.016
Neg logit contributions
 guess-0.010
 count-0.010
 three-0.009
 ten-0.009
 imagine-0.008
Token was
Token ID373
Feature activation+2.96e-03

Pos logit contributions
 count+0.007
 guess+0.007
 ten+0.006
 three+0.006
 imagine+0.006
Neg logit contributions
 therapist-0.009
actus-0.009
 Midnight-0.008
 masse-0.008
lor-0.008
Token playing
Token ID2712
Feature activation+2.85e-03

Pos logit contributions
 therapist+0.005
actus+0.005
 Midnight+0.005
 masse+0.005
lor+0.005
Neg logit contributions
 guess-0.005
 count-0.005
 three-0.004
 ten-0.004
 imagine-0.004
Token in
Token ID287
Feature activation+1.10e-02

Pos logit contributions
 therapist+0.005
actus+0.005
 Midnight+0.005
 masse+0.005
lor+0.005
Neg logit contributions
 guess-0.004
 count-0.004
 three-0.004
 ten-0.004
 imagine-0.004
Token her
Token ID607
Feature activation+2.28e-02

Pos logit contributions
 therapist+0.019
actus+0.019
 Midnight+0.018
lor+0.017
 masse+0.016
Neg logit contributions
 guess-0.020
 count-0.020
 three-0.019
 ten-0.019
 imagine-0.018
Token room
Token ID2119
Feature activation-2.60e-02

Pos logit contributions
 therapist+0.039
 Midnight+0.038
actus+0.035
lor+0.034
 Woo+0.033
Neg logit contributions
 count-0.043
 guess-0.042
 three-0.040
 imagine-0.039
 ten-0.039
Token,
Token ID11
Feature activation-1.07e-02

Pos logit contributions
 guess+0.041
 count+0.041
 ten+0.037
 imagine+0.036
 One+0.036
Neg logit contributions
 therapist-0.047
 Midnight-0.044
actus-0.044
 masse-0.040
lor-0.039
Token when
Token ID618
Feature activation+2.14e-02

Pos logit contributions
 guess+0.015
 count+0.015
 ten+0.014
 three+0.014
 imagine+0.013
Neg logit contributions
 therapist-0.021
 Midnight-0.021
actus-0.020
 masse-0.019
 Woo-0.018
Token means
Token ID1724
Feature activation+1.78e-02

Pos logit contributions
actus+0.064
 therapist+0.064
 Midnight+0.061
lor+0.059
 masse+0.057
Neg logit contributions
 count-0.040
 guess-0.039
 ten-0.037
 three-0.037
 like-0.037
Token different
Token ID1180
Feature activation-1.74e-02

Pos logit contributions
actus+0.034
 therapist+0.033
 Midnight+0.031
lor+0.030
ugg+0.029
Neg logit contributions
 guess-0.029
 count-0.029
 three-0.028
 ten-0.027
 imagine-0.026
Token things
Token ID1243
Feature activation+3.38e-02

Pos logit contributions
 guess+0.031
 count+0.030
 ten+0.028
 counted+0.028
 three+0.027
Neg logit contributions
 therapist-0.028
actus-0.027
 Midnight-0.027
 masse-0.025
lor-0.024
Token to
Token ID284
Feature activation-2.43e-02

Pos logit contributions
actus+0.071
 therapist+0.071
 Midnight+0.068
lor+0.066
 masse+0.065
Neg logit contributions
 count-0.047
 guess-0.046
 like-0.043
 three-0.042
 imagine-0.041
Token different
Token ID1180
Feature activation-2.40e-02

Pos logit contributions
 three+0.033
 counted+0.033
 guess+0.033
 ten+0.033
 1+0.033
Neg logit contributions
 Midnight-0.045
 therapist-0.043
actus-0.041
 masse-0.039
 Woo-0.039
Token people
Token ID661
Feature activation+1.05e-03

Pos logit contributions
 guess+0.041
 count+0.041
 three+0.038
 ten+0.038
 counted+0.038
Neg logit contributions
 therapist-0.038
actus-0.037
 Midnight-0.037
 masse-0.033
 drowned-0.033
Token.
Token ID13
Feature activation+2.34e-02

Pos logit contributions
 therapist+0.002
actus+0.002
 Midnight+0.002
 masse+0.002
lor+0.002
Neg logit contributions
 guess-0.002
 count-0.002
 three-0.001
 ten-0.001
 imagine-0.001
Token She
Token ID1375
Feature activation+3.43e-02

Pos logit contributions
 therapist+0.047
actus+0.046
 Midnight+0.042
 drowned+0.041
lor+0.041
Neg logit contributions
 guess-0.035
 count-0.034
 One-0.034
 three-0.033
 ten-0.032
Token had
Token ID550
Feature activation+7.18e-03

Pos logit contributions
actus+0.069
 Midnight+0.068
 therapist+0.068
 masse+0.065
lor+0.065
Neg logit contributions
 guess-0.050
 count-0.048
 counted-0.046
 three-0.044
 imagine-0.044
Token learned
Token ID4499
Feature activation+1.41e-02

Pos logit contributions
 therapist+0.015
actus+0.015
 Midnight+0.015
lor+0.014
 masse+0.014
Neg logit contributions
 guess-0.010
 count-0.010
 three-0.009
 ten-0.009
 counted-0.008
Token that
Token ID326
Feature activation+1.26e-02

Pos logit contributions
 therapist+0.026
actus+0.026
 Midnight+0.024
lor+0.023
 drowned+0.022
Neg logit contributions
 guess-0.023
 count-0.023
 three-0.022
 ten-0.022
 like-0.020
Token Tim
Token ID5045
Feature activation-1.83e-02

Pos logit contributions
 guess+0.036
 count+0.035
 imagine+0.032
 three+0.032
 counted+0.032
Neg logit contributions
 Midnight-0.034
 therapist-0.030
 Woo-0.029
actus-0.029
 drowned-0.028
Tokenmy
Token ID1820
Feature activation-1.18e-03

Pos logit contributions
 guess+0.024
 count+0.024
 three+0.022
 imagine+0.021
 ten+0.021
Neg logit contributions
 Midnight-0.038
actus-0.037
 therapist-0.037
 Woo-0.033
iflower-0.033
Token he
Token ID339
Feature activation-3.45e-03

Pos logit contributions
 guess+0.002
 count+0.002
 three+0.002
 ten+0.002
 imagine+0.002
Neg logit contributions
 therapist-0.002
 Midnight-0.002
actus-0.002
 masse-0.001
lor-0.001
Token was
Token ID373
Feature activation+1.75e-02

Pos logit contributions
 guess+0.005
 count+0.005
 ten+0.005
 three+0.005
 imagine+0.005
Neg logit contributions
 therapist-0.007
actus-0.006
 Midnight-0.006
 masse-0.006
 drowned-0.006
Token going
Token ID1016
Feature activation+1.66e-02

Pos logit contributions
actus+0.036
 therapist+0.035
lor+0.033
 Midnight+0.033
iflower+0.031
Neg logit contributions
 guess-0.026
 count-0.026
 three-0.026
 ten-0.024
 like-0.024
Token to
Token ID284
Feature activation+2.24e-02

Pos logit contributions
 therapist+0.032
actus+0.032
 Midnight+0.030
lor+0.029
iflower+0.028
Neg logit contributions
 guess-0.026
 count-0.026
 three-0.025
 ten-0.025
 counted-0.023
Token be
Token ID307
Feature activation+2.64e-03

Pos logit contributions
actus+0.051
 therapist+0.050
 Midnight+0.047
lor+0.046
iflower+0.046
Neg logit contributions
 guess-0.031
 count-0.031
 three-0.027
 ten-0.027
 imagine-0.027
Token punished
Token ID16851
Feature activation+5.53e-03

Pos logit contributions
 therapist+0.004
actus+0.004
 Midnight+0.004
 masse+0.004
lor+0.004
Neg logit contributions
 guess-0.005
 count-0.005
 three-0.004
 ten-0.004
 imagine-0.004
Token.
Token ID13
Feature activation-1.66e-03

Pos logit contributions
 therapist+0.010
actus+0.010
 Midnight+0.010
lor+0.009
 masse+0.009
Neg logit contributions
 guess-0.009
 count-0.009
 three-0.008
 ten-0.008
 imagine-0.008
Token Tim
Token ID5045
Feature activation-7.22e-03

Pos logit contributions
 guess+0.002
 count+0.002
 ten+0.002
 three+0.002
 imagine+0.002
Neg logit contributions
 therapist-0.003
 Midnight-0.003
actus-0.003
 masse-0.003
lor-0.003
Tokenmy
Token ID1820
Feature activation+8.48e-03

Pos logit contributions
 guess+0.008
 count+0.008
 ten+0.008
 three+0.008
 imagine+0.007
Neg logit contributions
actus-0.016
 therapist-0.016
 Midnight-0.016
 masse-0.015
lor-0.014
Token She
Token ID1375
Feature activation-4.02e-02

Pos logit contributions
 count+0.009
 guess+0.009
 ten+0.008
 three+0.008
 imagine+0.008
Neg logit contributions
 therapist-0.009
 Midnight-0.009
actus-0.009
 masse-0.008
lor-0.008
Token wanted
Token ID2227
Feature activation-1.42e-02

Pos logit contributions
 count+0.063
ruits+0.062
 ten+0.059
 like+0.057
 guess+0.057
Neg logit contributions
 therapist-0.066
actus-0.064
 Midnight-0.063
 drowned-0.060
 died-0.060
Token to
Token ID284
Feature activation-2.63e-02

Pos logit contributions
 guess+0.029
 count+0.029
 imagine+0.027
 three+0.026
ruits+0.026
Neg logit contributions
 Midnight-0.019
 therapist-0.019
actus-0.018
 Woo-0.016
lor-0.015
Token see
Token ID766
Feature activation-3.65e-02

Pos logit contributions
ruits+0.034
 three+0.032
 guess+0.031
 ten+0.031
 1+0.031
Neg logit contributions
 therapist-0.048
 Midnight-0.048
actus-0.047
 masse-0.046
 drowned-0.043
Token what
Token ID644
Feature activation+1.56e-02

Pos logit contributions
 guess+0.054
 count+0.053
 counted+0.050
 imagine+0.049
ruits+0.049
Neg logit contributions
 Midnight-0.066
 therapist-0.063
actus-0.063
 Woo-0.058
lor-0.058
Token it
Token ID340
Feature activation-1.17e-02

Pos logit contributions
actus+0.028
 therapist+0.028
 Midnight+0.027
lor+0.025
 masse+0.025
Neg logit contributions
 guess-0.026
 count-0.026
 three-0.025
 ten-0.024
 counted-0.024
Token could
Token ID714
Feature activation-4.21e-02

Pos logit contributions
 guess+0.021
 count+0.021
 ten+0.020
 three+0.020
 imagine+0.019
Neg logit contributions
 therapist-0.018
actus-0.018
 Midnight-0.018
 masse-0.015
 died-0.015
Token unlock
Token ID12116
Feature activation-2.00e-02

Pos logit contributions
ptions+0.062
 three+0.062
 ten+0.062
 blond+0.061
 1+0.060
Neg logit contributions
actus-0.064
 therapist-0.063
 Midnight-0.063
 died-0.059
iflower-0.058
Token.
Token ID13
Feature activation+1.52e-03

Pos logit contributions
 guess+0.036
 count+0.036
 three+0.034
 imagine+0.033
 counted+0.033
Neg logit contributions
 therapist-0.031
 Midnight-0.030
actus-0.030
lor-0.027
 masse-0.026
Token She
Token ID1375
Feature activation-4.02e-02

Pos logit contributions
 therapist+0.003
actus+0.003
 Midnight+0.003
 masse+0.002
lor+0.002
Neg logit contributions
 guess-0.003
 count-0.003
 three-0.002
 ten-0.002
 imagine-0.002
Token took
Token ID1718
Feature activation-6.81e-03

Pos logit contributions
ruits+0.063
 count+0.063
 like+0.060
 Today+0.060
 imagine+0.059
Neg logit contributions
 therapist-0.066
actus-0.060
 Midnight-0.060
 drowned-0.059
 died-0.059
Token We
Token ID775
Feature activation-1.84e-02

Pos logit contributions
 count+0.020
 guess+0.020
 ten+0.019
 three+0.018
 imagine+0.018
Neg logit contributions
 therapist-0.023
 Midnight-0.023
actus-0.022
 masse-0.021
lor-0.020
Token can
Token ID460
Feature activation+1.34e-04

Pos logit contributions
 count+0.029
 guess+0.029
 ten+0.027
 three+0.027
 imagine+0.026
Neg logit contributions
 therapist-0.032
 Midnight-0.032
actus-0.031
 drowned-0.028
 Woo-0.028
Token skip
Token ID14267
Feature activation+3.70e-03

Pos logit contributions
 therapist+0.000
actus+0.000
 Midnight+0.000
 masse+0.000
lor+0.000
Neg logit contributions
 guess-0.000
 count-0.000
 three-0.000
 ten-0.000
 imagine-0.000
Token around
Token ID1088
Feature activation+1.46e-02

Pos logit contributions
 therapist+0.007
actus+0.007
 Midnight+0.007
 masse+0.006
lor+0.006
Neg logit contributions
 guess-0.006
 count-0.006
 three-0.005
 ten-0.005
 imagine-0.005
Token the
Token ID262
Feature activation+7.20e-03

Pos logit contributions
 therapist+0.026
actus+0.026
 Midnight+0.025
lor+0.023
 masse+0.023
Neg logit contributions
 guess-0.025
 count-0.025
 three-0.024
 ten-0.023
 imagine-0.022
Token yard
Token ID12699
Feature activation-1.53e-02

Pos logit contributions
 therapist+0.012
actus+0.012
 Midnight+0.011
lor+0.010
 masse+0.010
Neg logit contributions
 guess-0.013
 count-0.013
 three-0.012
 ten-0.012
 imagine-0.012
Token and
Token ID290
Feature activation-7.88e-04

Pos logit contributions
 guess+0.023
 count+0.022
 three+0.020
 ten+0.020
 imagine+0.020
Neg logit contributions
 therapist-0.029
actus-0.028
 Midnight-0.028
lor-0.026
 masse-0.025
Token I
Token ID314
Feature activation-1.54e-02

Pos logit contributions
 guess+0.001
 count+0.001
 three+0.001
 ten+0.001
 imagine+0.001
Neg logit contributions
 therapist-0.002
 Midnight-0.002
actus-0.002
 masse-0.001
lor-0.001
Token'll
Token ID1183
Feature activation+2.07e-02

Pos logit contributions
 count+0.022
 guess+0.022
 ten+0.021
 three+0.021
 counted+0.020
Neg logit contributions
 therapist-0.029
actus-0.028
 Midnight-0.028
 masse-0.025
lor-0.025
Token count
Token ID954
Feature activation-2.96e-02

Pos logit contributions
actus+0.045
 therapist+0.044
 Midnight+0.043
lor+0.041
iflower+0.040
Neg logit contributions
 guess-0.029
 count-0.029
 three-0.026
 ten-0.026
 imagine-0.025
Token how
Token ID703
Feature activation+6.80e-03

Pos logit contributions
 guess+0.057
 count+0.055
 counted+0.052
 imagine+0.052
 three+0.050
Neg logit contributions
 therapist-0.041
 Midnight-0.040
actus-0.040
 Woo-0.036
 masse-0.036
Token again
Token ID757
Feature activation+4.48e-03

Pos logit contributions
 guess+0.001
 count+0.001
 ten+0.001
 three+0.001
 imagine+0.001
Neg logit contributions
 therapist-0.001
actus-0.001
 Midnight-0.001
 masse-0.001
lor-0.001
Token excited
Token ID6568
Feature activation+1.40e-02

Pos logit contributions
 therapist+0.008
actus+0.008
 Midnight+0.008
 masse+0.007
lor+0.007
Neg logit contributions
 guess-0.007
 count-0.007
 three-0.007
 ten-0.007
 imagine-0.006
Token.
Token ID13
Feature activation+1.20e-02

Pos logit contributions
 therapist+0.028
actus+0.027
 Midnight+0.027
lor+0.025
 masse+0.025
Neg logit contributions
 guess-0.021
 count-0.021
 three-0.020
 ten-0.020
 imagine-0.019
Token<|endoftext|>
Token ID50256
Feature activation+3.85e-02

Pos logit contributions
 therapist+0.023
actus+0.023
 Midnight+0.022
iflower+0.021
lor+0.020
Neg logit contributions
 guess-0.019
 count-0.018
 three-0.017
 ten-0.017
 One-0.017
TokenOnce
Token ID7454
Feature activation-9.19e-03

Pos logit contributions
 therapist+0.061
 Midnight+0.060
actus+0.059
 masse+0.057
 swollen+0.055
Neg logit contributions
 guess-0.069
 One-0.067
 three-0.067
 count-0.067
 ten-0.064
Token upon
Token ID2402
Feature activation-4.42e-02

Pos logit contributions
 count+0.008
 guess+0.008
 ten+0.007
 three+0.007
 counted+0.006
Neg logit contributions
actus-0.023
 therapist-0.023
 Midnight-0.023
lor-0.021
 masse-0.021
Token a
Token ID257
Feature activation-4.04e-02

Pos logit contributions
 count+0.073
 1+0.070
ruits+0.068
 guess+0.067
 counted+0.066
Neg logit contributions
 Midnight-0.075
 therapist-0.071
actus-0.068
 Woo-0.066
iflower-0.063
Token time
Token ID640
Feature activation-5.01e-02

Pos logit contributions
 count+0.069
 guess+0.067
 counted+0.066
 1+0.064
 three+0.063
Neg logit contributions
 Midnight-0.065
actus-0.062
 therapist-0.059
 masse-0.057
 drowned-0.054
Token,
Token ID11
Feature activation-5.36e-02

Pos logit contributions
 count+0.089
 guess+0.089
 counted+0.085
 imagine+0.084
 ten+0.084
Neg logit contributions
 Midnight-0.074
actus-0.065
 Woo-0.065
 therapist-0.063
ugg-0.061
Token there
Token ID612
Feature activation-8.36e-03

Pos logit contributions
 count+0.109
racted+0.102
 guess+0.102
 counted+0.100
 imagine+0.097
Neg logit contributions
 Midnight-0.072
 Woo-0.058
actus-0.057
 therapist-0.057
 Velvet-0.055
Token were
Token ID547
Feature activation-1.85e-02

Pos logit contributions
 count+0.016
 guess+0.016
 ten+0.015
 three+0.015
 imagine+0.015
Neg logit contributions
 therapist-0.012
actus-0.012
 Midnight-0.012
lor-0.010
 drowned-0.010
Token,
Token ID11
Feature activation-1.81e-02

Pos logit contributions
 guess+0.055
 count+0.052
ruits+0.050
 imagine+0.048
 yes+0.045
Neg logit contributions
 therapist-0.042
actus-0.042
 Midnight-0.040
 Woo-0.038
 band-0.037
Token faster
Token ID5443
Feature activation-1.84e-03

Pos logit contributions
 guess+0.030
 count+0.028
ruits+0.027
 ten+0.026
 imagine+0.026
Neg logit contributions
 therapist-0.031
 Midnight-0.031
actus-0.030
 Woo-0.028
 masse-0.028
Token and
Token ID290
Feature activation-2.20e-02

Pos logit contributions
 guess+0.003
 count+0.003
 three+0.003
 ten+0.002
 imagine+0.002
Neg logit contributions
 therapist-0.004
actus-0.003
 Midnight-0.003
 masse-0.003
lor-0.003
Token faster
Token ID5443
Feature activation-1.15e-03

Pos logit contributions
 guess+0.042
 count+0.040
ruits+0.040
 imagine+0.039
ff+0.038
Neg logit contributions
 therapist-0.031
 Midnight-0.030
actus-0.027
 Woo-0.027
 masse-0.026
Token.
Token ID13
Feature activation-3.99e-02

Pos logit contributions
 guess+0.002
 count+0.002
 three+0.002
 ten+0.002
 imagine+0.002
Neg logit contributions
 therapist-0.002
actus-0.002
 Midnight-0.002
 masse-0.002
lor-0.002
Token␣
Token ID220
Feature activation-2.20e-02

Pos logit contributions
 guess+0.076
 imagine+0.073
 count+0.072
 ten+0.072
ptions+0.068
Neg logit contributions
 Midnight-0.056
 Mitt-0.051
 Woo-0.050
 therapist-0.048
actus-0.047
Token\n
Token ID198
Feature activation-4.28e-02

Pos logit contributions
 count+0.046
 guess+0.046
 imagine+0.043
 three+0.043
 ten+0.043
Neg logit contributions
 Midnight-0.028
actus-0.027
 therapist-0.027
lor-0.024
 Woo-0.023
Token\n
Token ID198
Feature activation-3.35e-02

Pos logit contributions
 count+0.102
 imagine+0.097
 guess+0.096
 blond+0.095
 three+0.094
Neg logit contributions
 Midnight-0.040
actus-0.037
 therapist-0.037
 Woo-0.034
lor-0.033
TokenTim
Token ID14967
Feature activation-3.82e-02

Pos logit contributions
 count+0.058
 guess+0.058
 three+0.056
 imagine+0.056
 ten+0.055
Neg logit contributions
 Midnight-0.050
 therapist-0.047
actus-0.046
 Woo-0.046
Now-0.045
Token's
Token ID338
Feature activation-3.17e-02

Pos logit contributions
ruits+0.072
 guess+0.071
 count+0.071
 ten+0.069
 blond+0.067
Neg logit contributions
 drowned-0.050
actus-0.050
 therapist-0.050
 Midnight-0.047
 died-0.046
Token mom
Token ID1995
Feature activation-2.36e-02

Pos logit contributions
 guess+0.066
ff+0.063
 count+0.063
 counted+0.062
ruits+0.062
Neg logit contributions
 therapist-0.037
 Midnight-0.036
actus-0.035
 band-0.034
 masse-0.031
Token and
Token ID290
Feature activation+1.83e-02

Pos logit contributions
 guess+0.021
 count+0.021
 ten+0.019
 three+0.019
 imagine+0.019
Neg logit contributions
 therapist-0.023
actus-0.022
 Midnight-0.022
 masse-0.020
lor-0.020
Token said
Token ID531
Feature activation-3.06e-02

Pos logit contributions
actus+0.036
 therapist+0.035
 Midnight+0.034
lor+0.033
iflower+0.032
Neg logit contributions
 guess-0.028
 count-0.028
 three-0.026
 ten-0.025
 counted-0.025
Token,
Token ID11
Feature activation-1.57e-02

Pos logit contributions
 guess+0.049
 count+0.047
 imagine+0.044
 counted+0.042
ptions+0.042
Neg logit contributions
 therapist-0.055
 Midnight-0.054
actus-0.049
 Woo-0.048
lor-0.048
Token "
Token ID366
Feature activation+1.97e-03

Pos logit contributions
 guess+0.028
 count+0.027
 ten+0.025
 imagine+0.025
 three+0.025
Neg logit contributions
 therapist-0.026
 Midnight-0.026
actus-0.025
 masse-0.022
lor-0.022
TokenI
Token ID40
Feature activation-2.70e-02

Pos logit contributions
 therapist+0.003
actus+0.003
 Midnight+0.003
 masse+0.002
lor+0.002
Neg logit contributions
 guess-0.004
 count-0.004
 three-0.004
 ten-0.004
 imagine-0.004
Token will
Token ID481
Feature activation-8.01e-03

Pos logit contributions
 ten+0.042
 count+0.041
 1+0.041
 three+0.040
 blond+0.039
Neg logit contributions
 therapist-0.047
actus-0.044
 Midnight-0.043
 drowned-0.041
 died-0.041
Token call
Token ID869
Feature activation-1.73e-02

Pos logit contributions
 guess+0.012
 count+0.011
 ten+0.011
 three+0.011
 counted+0.010
Neg logit contributions
 therapist-0.015
 Midnight-0.015
actus-0.015
 masse-0.014
lor-0.013
Token you
Token ID345
Feature activation-4.34e-02

Pos logit contributions
 guess+0.037
 count+0.036
 counted+0.034
 imagine+0.034
 three+0.034
Neg logit contributions
 Midnight-0.022
 therapist-0.022
actus-0.020
 Woo-0.018
 masse-0.018
Token Blue
Token ID4518
Feature activation-5.51e-03

Pos logit contributions
 guess+0.066
 count+0.065
 counted+0.064
 blond+0.062
 blonde+0.060
Neg logit contributions
 Midnight-0.077
 therapist-0.075
actus-0.071
 Woo-0.065
lor-0.063
Token.
Token ID13
Feature activation-2.18e-02

Pos logit contributions
 guess+0.009
 count+0.009
 three+0.008
 ten+0.008
 imagine+0.008
Neg logit contributions
 therapist-0.010
actus-0.010
 Midnight-0.010
lor-0.009
 masse-0.009
Token You
Token ID921
Feature activation-3.21e-02

Pos logit contributions
 guess+0.043
 count+0.042
 ten+0.040
 three+0.040
 counted+0.039
Neg logit contributions
 Midnight-0.032
 therapist-0.032
 masse-0.028
actus-0.028
lor-0.027
Token replied
Token ID8712
Feature activation-1.15e-02

Pos logit contributions
 count+0.047
 guess+0.046
 ten+0.045
 imagine+0.043
ruits+0.043
Neg logit contributions
 therapist-0.049
actus-0.048
 Midnight-0.047
 drowned-0.042
iflower-0.042
Token,
Token ID11
Feature activation-2.05e-03

Pos logit contributions
 guess+0.016
 count+0.016
 imagine+0.014
 ten+0.014
 counted+0.014
Neg logit contributions
 therapist-0.024
 Midnight-0.023
actus-0.022
 masse-0.021
 Woo-0.021
Token "
Token ID366
Feature activation-1.11e-03

Pos logit contributions
 guess+0.003
 count+0.003
 ten+0.003
 three+0.003
 imagine+0.003
Neg logit contributions
 therapist-0.004
 Midnight-0.004
actus-0.004
 masse-0.003
lor-0.003
TokenI
Token ID40
Feature activation-1.78e-02

Pos logit contributions
 count+0.003
 guess+0.003
 ten+0.003
 three+0.003
 imagine+0.002
Neg logit contributions
 therapist-0.001
actus-0.001
 Midnight-0.001
 masse-0.001
lor-0.001
Token can
Token ID460
Feature activation-2.37e-02

Pos logit contributions
 count+0.024
 ten+0.024
 three+0.023
 guess+0.023
 1+0.022
Neg logit contributions
actus-0.035
 therapist-0.035
 Midnight-0.034
 drowned-0.032
 died-0.031
Token look
Token ID804
Feature activation-6.05e-02

Pos logit contributions
 ten+0.041
 guess+0.040
ruits+0.040
 count+0.040
 counted+0.039
Neg logit contributions
 therapist-0.036
 Midnight-0.035
actus-0.033
 masse-0.031
iflower-0.030
Token at
Token ID379
Feature activation+3.83e-03

Pos logit contributions
 count+0.105
 counted+0.100
 guess+0.099
ruits+0.099
 blond+0.099
Neg logit contributions
 Midnight-0.090
actus-0.082
 therapist-0.080
 Woo-0.080
ls-0.075
Token the
Token ID262
Feature activation-5.17e-04

Pos logit contributions
 therapist+0.006
actus+0.006
 Midnight+0.006
 masse+0.005
lor+0.005
Neg logit contributions
 guess-0.007
 count-0.007
 three-0.007
 ten-0.007
 imagine-0.007
Token stones
Token ID14966
Feature activation-1.38e-02

Pos logit contributions
 guess+0.001
 count+0.001
 ten+0.001
 three+0.001
 counted+0.001
Neg logit contributions
 therapist-0.001
actus-0.001
 Midnight-0.001
 masse-0.001
lor-0.001
Token and
Token ID290
Feature activation-3.36e-02

Pos logit contributions
 guess+0.021
 count+0.021
 three+0.019
 ten+0.019
 counted+0.019
Neg logit contributions
 therapist-0.026
actus-0.025
 Midnight-0.025
lor-0.023
iflower-0.023
Token remember
Token ID3505
Feature activation-1.46e-02

Pos logit contributions
 guess+0.046
 count+0.046
ruits+0.045
 counted+0.044
 One+0.043
Neg logit contributions
 Midnight-0.061
 therapist-0.060
actus-0.057
 masse-0.054
 drowned-0.053
Token.
Token ID13
Feature activation-2.48e-03

Pos logit contributions
 count+0.061
 guess+0.061
 ten+0.058
 imagine+0.058
ptions+0.057
Neg logit contributions
 therapist-0.037
 Midnight-0.036
 died-0.030
actus-0.029
 masse-0.029
Token "
Token ID366
Feature activation-3.48e-02

Pos logit contributions
 count+0.004
 guess+0.004
 ten+0.003
 three+0.003
 imagine+0.003
Neg logit contributions
 therapist-0.005
 Midnight-0.005
actus-0.005
 masse-0.004
lor-0.004
TokenHe
Token ID1544
Feature activation+8.27e-03

Pos logit contributions
 count+0.076
 guess+0.073
 counted+0.070
ruits+0.070
 imagine+0.070
Neg logit contributions
 therapist-0.042
actus-0.041
 Midnight-0.038
 masse-0.034
 Woo-0.034
Token is
Token ID318
Feature activation+9.53e-04

Pos logit contributions
 therapist+0.014
actus+0.014
 Midnight+0.013
 masse+0.012
lor+0.012
Neg logit contributions
 guess-0.015
 count-0.015
 three-0.014
 ten-0.013
 imagine-0.013
Token unique
Token ID3748
Feature activation-1.52e-02

Pos logit contributions
 therapist+0.002
actus+0.002
 Midnight+0.002
 masse+0.002
lor+0.002
Neg logit contributions
 guess-0.001
 count-0.001
 three-0.001
 ten-0.001
 imagine-0.001
Token.
Token ID13
Feature activation-8.33e-02

Pos logit contributions
 guess+0.027
 count+0.026
 ten+0.024
 counted+0.024
 imagine+0.024
Neg logit contributions
 therapist-0.026
actus-0.025
 Midnight-0.024
 masse-0.022
iflower-0.022
Token I
Token ID314
Feature activation-3.67e-02

Pos logit contributions
ptions+0.166
illed+0.164
 exception+0.161
 guess+0.161
 count+0.158
Neg logit contributions
 Midnight-0.094
 therapist-0.088
 Woo-0.085
 Pink-0.080
 masse-0.080
Token have
Token ID423
Feature activation-2.27e-02

Pos logit contributions
 three+0.038
 ten+0.037
 count+0.037
 1+0.036
ruits+0.035
Neg logit contributions
 therapist-0.077
actus-0.077
 Midnight-0.076
 drowned-0.073
 died-0.072
Token never
Token ID1239
Feature activation-3.80e-02

Pos logit contributions
 guess+0.031
 count+0.030
 imagine+0.027
ruits+0.027
 counted+0.026
Neg logit contributions
 Midnight-0.045
 therapist-0.045
actus-0.043
 drowned-0.041
 Woo-0.040
Token seen
Token ID1775
Feature activation-2.57e-02

Pos logit contributions
 1+0.040
 guess+0.039
 One+0.039
 count+0.038
 three+0.037
Neg logit contributions
actus-0.077
 drowned-0.077
 died-0.076
 therapist-0.076
 Midnight-0.076
Token a
Token ID257
Feature activation+2.33e-02

Pos logit contributions
 guess+0.042
 count+0.041
 imagine+0.037
 counted+0.036
ruits+0.035
Neg logit contributions
 Midnight-0.047
 therapist-0.046
actus-0.046
 Woo-0.041
lor-0.041
Token in
Token ID287
Feature activation+7.27e-03

Pos logit contributions
 guess+0.015
 count+0.014
 three+0.013
 ten+0.013
 imagine+0.013
Neg logit contributions
 therapist-0.014
 Midnight-0.014
actus-0.014
 masse-0.012
lor-0.012
Token this
Token ID428
Feature activation+1.10e-02

Pos logit contributions
 therapist+0.012
actus+0.012
 Midnight+0.011
lor+0.010
 masse+0.010
Neg logit contributions
 guess-0.014
 count-0.013
 three-0.013
 ten-0.013
 imagine-0.012
Token cup
Token ID6508
Feature activation+2.15e-02

Pos logit contributions
 therapist+0.016
actus+0.015
 Midnight+0.015
 masse+0.014
lor+0.014
Neg logit contributions
 guess-0.022
 count-0.022
 three-0.021
 ten-0.021
 imagine-0.020
Token.
Token ID13
Feature activation-3.66e-02

Pos logit contributions
 therapist+0.042
actus+0.041
 Midnight+0.041
 masse+0.038
lor+0.038
Neg logit contributions
 guess-0.031
 count-0.031
 three-0.029
 ten-0.029
 imagine-0.027
Token Do
Token ID2141
Feature activation-1.17e-03

Pos logit contributions
 count+0.064
 guess+0.063
 ten+0.061
 counted+0.059
 imagine+0.058
Neg logit contributions
 Midnight-0.060
 therapist-0.057
 Woo-0.052
 masse-0.051
actus-0.050
Token you
Token ID345
Feature activation-6.70e-02

Pos logit contributions
 guess+0.003
 count+0.003
 three+0.003
 ten+0.003
 imagine+0.003
Neg logit contributions
 therapist-0.001
 Midnight-0.001
actus-0.001
lor-0.001
 masse-0.001
Token want
Token ID765
Feature activation-1.89e-03

Pos logit contributions
 1+0.091
ptions+0.085
 6+0.078
 ten+0.077
 exhibits+0.077
Neg logit contributions
 drowned-0.115
 sting-0.114
 band-0.113
 died-0.112
 healed-0.110
Token to
Token ID284
Feature activation-1.02e-02

Pos logit contributions
 guess+0.003
 count+0.003
 three+0.003
 ten+0.003
 imagine+0.003
Neg logit contributions
 therapist-0.003
 Midnight-0.003
actus-0.003
 masse-0.003
lor-0.003
Token see
Token ID766
Feature activation-3.21e-02

Pos logit contributions
 guess+0.013
 count+0.013
 three+0.012
 ten+0.012
 counted+0.012
Neg logit contributions
 Midnight-0.021
actus-0.021
 therapist-0.020
 Woo-0.019
 masse-0.019
Token it
Token ID340
Feature activation-1.37e-02

Pos logit contributions
 guess+0.072
 count+0.072
ruits+0.069
 counted+0.068
 blond+0.067
Neg logit contributions
 Midnight-0.036
actus-0.033
 therapist-0.032
iflower-0.029
 Woo-0.029
Token?"
Token ID1701
Feature activation+2.26e-03

Pos logit contributions
 guess+0.021
 count+0.021
 three+0.020
 counted+0.020
 ten+0.020
Neg logit contributions
 Midnight-0.025
actus-0.024
 therapist-0.024
iflower-0.022
 masse-0.021
Token the
Token ID262
Feature activation-2.52e-02

Pos logit contributions
 guess+0.126
 count+0.123
 imagine+0.117
ruits+0.116
iner+0.114
Neg logit contributions
ls-0.033
 lands-0.032
 Midnight-0.030
 therapist-0.030
actus-0.029
Token trees
Token ID7150
Feature activation-2.39e-02

Pos logit contributions
 guess+0.047
 count+0.045
 counted+0.043
 imagine+0.043
ruits+0.042
Neg logit contributions
actus-0.039
 therapist-0.037
 Midnight-0.034
 masse-0.034
iflower-0.031
Token.
Token ID13
Feature activation-6.01e-02

Pos logit contributions
 guess+0.043
 count+0.041
 ten+0.039
 three+0.039
 imagine+0.039
Neg logit contributions
 therapist-0.036
actus-0.035
 Midnight-0.035
 drowned-0.031
 died-0.031
Token "
Token ID366
Feature activation-5.54e-02

Pos logit contributions
 guess+0.114
 count+0.112
 ten+0.111
 imagine+0.110
 yes+0.107
Neg logit contributions
 Midnight-0.079
 Cathy-0.063
 Woo-0.062
 therapist-0.062
actus-0.057
TokenOne
Token ID3198
Feature activation-4.47e-02

Pos logit contributions
 count+0.130
 guess+0.126
 imagine+0.123
ruits+0.121
 yes+0.119
Neg logit contributions
 Midnight-0.049
 therapist-0.047
Now-0.047
burn-0.044
 Woo-0.043
Token,
Token ID11
Feature activation-8.23e-02

Pos logit contributions
 guess+0.081
 blond+0.081
 blonde+0.079
 counted+0.079
racted+0.078
Neg logit contributions
actus-0.057
ww-0.054
iflower-0.051
 therapist-0.050
 mosquito-0.049
Token two
Token ID734
Feature activation-5.31e-02

Pos logit contributions
illed+0.122
 blond+0.120
adows+0.119
ill+0.115
ff+0.114
Neg logit contributions
 Midnight-0.135
 Woo-0.128
 poppy-0.117
 mosquito-0.116
 bunny-0.115
Token,
Token ID11
Feature activation-9.88e-02

Pos logit contributions
 guess+0.088
 count+0.079
 imagine+0.078
ruits+0.077
 whether+0.074
Neg logit contributions
 Midnight-0.080
actus-0.078
 drowned-0.075
 therapist-0.074
 died-0.073
Token three
Token ID1115
Feature activation-4.70e-02

Pos logit contributions
illed+0.117
adows+0.112
ff+0.105
ruits+0.103
owed+0.102
Neg logit contributions
 Midnight-0.181
 Woo-0.180
 band-0.174
 poppy-0.174
 mosquito-0.172
Token!"
Token ID2474
Feature activation-3.14e-02

Pos logit contributions
 guess+0.079
 count+0.076
 imagine+0.072
 ten+0.069
ruits+0.068
Neg logit contributions
actus-0.074
 therapist-0.072
 Midnight-0.070
iflower-0.070
ls-0.069
Token they
Token ID484
Feature activation-5.60e-02

Pos logit contributions
 ten+0.049
 count+0.049
 guess+0.047
ruits+0.045
 three+0.045
Neg logit contributions
 therapist-0.056
 Midnight-0.056
actus-0.050
 Woo-0.047
 Cathy-0.047
Token,
Token ID11
Feature activation+9.93e-03

Pos logit contributions
 guess+0.005
 count+0.005
 ten+0.004
 three+0.004
 imagine+0.004
Neg logit contributions
 therapist-0.008
 Midnight-0.008
actus-0.008
 masse-0.007
lor-0.007
Token "
Token ID366
Feature activation+6.52e-04

Pos logit contributions
actus+0.020
 therapist+0.020
 Midnight+0.019
lor+0.018
 masse+0.018
Neg logit contributions
 guess-0.014
 count-0.014
 three-0.013
 ten-0.013
 counted-0.012
TokenWe
Token ID1135
Feature activation-2.99e-02

Pos logit contributions
 therapist+0.001
actus+0.001
 Midnight+0.001
 masse+0.001
lor+0.001
Neg logit contributions
 guess-0.001
 count-0.001
 three-0.001
 ten-0.001
 imagine-0.001
Token need
Token ID761
Feature activation-8.49e-03

Pos logit contributions
 guess+0.046
 count+0.046
 ten+0.044
 three+0.043
 blond+0.043
Neg logit contributions
 Midnight-0.051
 therapist-0.049
actus-0.047
 drowned-0.045
 died-0.043
Token many
Token ID867
Feature activation-4.76e-02

Pos logit contributions
 guess+0.015
 count+0.015
 ten+0.014
 three+0.013
 imagine+0.013
Neg logit contributions
 therapist-0.014
 Midnight-0.014
actus-0.013
 masse-0.012
lor-0.012
Token different
Token ID1180
Feature activation-6.71e-02

Pos logit contributions
 guess+0.075
 count+0.069
ruits+0.068
 ten+0.067
 imagine+0.066
Neg logit contributions
 therapist-0.074
 Midnight-0.073
actus-0.072
 band-0.071
 masse-0.068
Token colors
Token ID7577
Feature activation-8.92e-03

Pos logit contributions
 guess+0.103
 yes+0.097
 count+0.092
ruits+0.092
 ten+0.090
Neg logit contributions
 therapist-0.098
actus-0.096
 Midnight-0.095
 band-0.093
orns-0.090
Token so
Token ID523
Feature activation-3.68e-03

Pos logit contributions
 guess+0.014
 count+0.014
 three+0.013
 ten+0.013
 imagine+0.013
Neg logit contributions
 therapist-0.016
actus-0.016
 Midnight-0.016
lor-0.014
 masse-0.014
Token we
Token ID356
Feature activation-2.22e-02

Pos logit contributions
 guess+0.008
 count+0.008
 three+0.007
 ten+0.007
 imagine+0.007
Neg logit contributions
 therapist-0.005
actus-0.005
 Midnight-0.005
 masse-0.004
lor-0.004
Token can
Token ID460
Feature activation-3.29e-02

Pos logit contributions
 guess+0.032
 count+0.031
 ten+0.030
 three+0.030
 counted+0.028
Neg logit contributions
 Midnight-0.041
 therapist-0.040
actus-0.040
 drowned-0.037
 masse-0.036
Token draw
Token ID3197
Feature activation-3.52e-02

Pos logit contributions
 guess+0.046
ruits+0.045
 three+0.044
 ten+0.044
 count+0.044
Neg logit contributions
actus-0.058
 Midnight-0.058
 therapist-0.056
 masse-0.051
lor-0.051