TOP ACTIVATIONS
MAX = 8.366

With his mom â s help , they quickly
thanks to Bob â s kind encouragement . <|endoftext|>
. With Joe â s help , Steve started
back to Lily â s home together . They
. As Daisy â s gaze passed by ,
back to Lily â s house and parted ways
patiently for Jack â s new passport . And
Jack and Emma â s mom gave them cel
to his dad â s instructions .
With his parents â help , he was able
road to grandmother â s . Although this day
into his mum â s car .
. When Lily â s mom picked her up
off and let â s go home . â
to his mom â s advice and ate the
With her mom â s help , she made
smiled at Daisy â s beautiful rainbow hair .
follow her mom â s instructions . <|endoftext|> One
for Strip es â family . Spot finally found
basket and let â s go home to show

MIN ACTIVATIONS
MIN = -8.522

when she spotted something gl ist ening in the grass .
a sign that says " tem ple ". They ask mom
that said " Self - destruct activated . Countdown begins ."
adm iring how it gl ist ened in the sun .
how the snow was gl ist ening in the sunlight .
blue and the sun was just right . Everyone was so
This box is probably too light to weigh an
the water was very sw ir ly and Arnold felt funny
fountain was not smart . List ening to Mom was the
all the stars that tw inkle . Everywhere she went ,
. We can find something just as good in the '
your back now ! What must I do to take care
box there was a surprise just for him . It was
easier and more enjoyable . List ening to others and trying
there was a kind wood cut ter who lived in the
balloon . A str id held onto her balloon
voice . But then , mirac ulously , it returned and
" Ben , it was just a spider . It was
One day , the wood cut ter got an idea to
wasn 't so bad . List ening to the song helped

INTERVAL 4.144 - 8.366
CONTAINS 1.335%

girl . She hugged her toy tightly and said goodbye to
and cried . " Hey , what 's wrong
. He zoom ed and raced on the street . He
the playground ?" " To get to the other
to make sand castles , how to collect shells , and

INTERVAL -0.078 - 4.144
CONTAINS 48.565%

mine ." L ila and Max feel sad .
the bird became friends and would whistle together in the backyard
It was a hot summer day , and the sun was
. One day , the little girl got tired
to give her a special gift . She took Lily to

INTERVAL -4.300 - -0.078
CONTAINS 49.580%

. A man stepped out of the jet
to spread it on the ground . Suddenly , a big
But as he climbed , the leaf slipped and he fell
they went on a walk and saw a big loop in
. He was so happy that he was getting faster and

INTERVAL -8.522 - -4.300
CONTAINS 0.520%

very happy . <|endoftext|> Once upon a time , there was
be sleeping , it is not time to open your gifts
does not answer . It just w igg les some more
The foolish muff in was not happy . He knew he
a reward . <|endoftext|> Once upon a time , a generous
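
The four INTERVAL ranges above are consistent with splitting the span from MIN to MAX into four equal-width bins, and the CONTAINS percentages sum to 100%. A minimal sketch of that arithmetic follows. It assumes equal-width binning, since the dashboard's own binning code is not shown here, and the function name `equal_width_edges` is hypothetical.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 edges that split [lo, hi] into equal-width bins."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# Values copied from the MIN / MAX headers above.
edges = equal_width_edges(-8.522, 8.366, 4)
print([f"{e:.3f}" for e in edges])
# ['-8.522', '-4.300', '-0.078', '4.144', '8.366']

# CONTAINS percentages, listed from the highest interval to the lowest.
coverage = [1.335, 48.565, 49.580, 0.520]
print(f"{sum(coverage):.3f}")
# 100.000
```

Each bin is (8.366 - (-8.522)) / 4 = 4.222 wide. About 98% of activations fall in the two middle bins, so the extreme intervals sampled above are rare cases.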
Token With
Token ID2080
Feature activation-1.62e+00

Pos logit contributions
 sunscreen+0.021
aid+0.021
P+0.021
coat+0.020
ival+0.020
Neg logit contributions
arter-0.007
 erupt-0.006
 meant-0.006
rated-0.006
cker-0.006
Token his
Token ID465
Feature activation+2.24e+00

Pos logit contributions
 meant+0.027
 wasn+0.027
 He+0.025
ires+0.025
arter+0.025
Neg logit contributions
P-0.046
suits-0.045
aid-0.045
coat-0.045
ival-0.043
Token mom
Token ID1995
Feature activation+2.94e+00

Pos logit contributions
P+0.066
aid+0.065
 sunscreen+0.063
suits+0.061
 merry+0.061
Neg logit contributions
arter-0.035
 wasn-0.034
 meant-0.034
ires-0.033
 probably-0.033
Tokenâ
Token ID22940
Feature activation+3.14e-01

Pos logit contributions
P+0.093
 sunscreen+0.090
aid+0.089
 merry+0.087
suits+0.086
Neg logit contributions
 wasn-0.042
 meant-0.040
 probably-0.036
 Andy-0.036
 He-0.034
Token
Token ID26391
Feature activation+3.28e+00

Pos logit contributions
P+0.007
aid+0.007
 sunscreen+0.007
 merry+0.006
suits+0.006
Neg logit contributions
 wasn-0.007
 meant-0.007
arter-0.007
 probably-0.007
 He-0.007
Token
Token ID8151
Feature activation+8.37e+00

Pos logit contributions
P+0.082
aid+0.080
 sunscreen+0.078
suits+0.076
 merry+0.076
Neg logit contributions
arter-0.067
 meant-0.064
ires-0.063
 wasn-0.062
 probably-0.062
Tokens
Token ID82
Feature activation+1.70e+00

Pos logit contributions
P+0.227
aid+0.225
 sunscreen+0.217
 merry+0.212
suits+0.211
Neg logit contributions
arter-0.147
 meant-0.141
 wasn-0.140
ires-0.138
 probably-0.137
Token help
Token ID1037
Feature activation+2.08e+00

Pos logit contributions
P+0.037
aid+0.036
 sunscreen+0.035
 merry+0.034
suits+0.033
Neg logit contributions
 wasn-0.039
arter-0.039
 meant-0.039
 probably-0.038
ires-0.037
Token,
Token ID11
Feature activation+1.42e+00

Pos logit contributions
aid+0.047
P+0.047
 sunscreen+0.046
coat+0.045
 merry+0.044
Neg logit contributions
arter-0.046
ires-0.044
 erupt-0.042
 meant-0.041
 probably-0.041
Token they
Token ID484
Feature activation+7.99e-01

Pos logit contributions
P+0.043
aid+0.042
 sunscreen+0.042
 merry+0.041
coat+0.040
Neg logit contributions
arter-0.021
ires-0.019
 meant-0.019
 wasn-0.019
 erupt-0.019
Token quickly
Token ID2952
Feature activation+2.69e+00

Pos logit contributions
P+0.020
aid+0.019
 sunscreen+0.019
 merry+0.019
suits+0.019
Neg logit contributions
 wasn-0.016
 meant-0.016
 probably-0.015
arter-0.014
 erupt-0.014
Token thanks
Token ID5176
Feature activation-2.24e-01

Pos logit contributions
aid+0.080
P+0.077
 sunscreen+0.074
suits+0.073
 merry+0.072
Neg logit contributions
 wasn-0.060
arter-0.058
 meant-0.057
 Andy-0.056
 probably-0.056
Token to
Token ID284
Feature activation+1.92e+00

Pos logit contributions
arter+0.005
 wasn+0.005
 meant+0.005
 probably+0.005
ires+0.005
Neg logit contributions
P-0.005
aid-0.005
 sunscreen-0.005
suits-0.005
 merry-0.005
Token Bob
Token ID5811
Feature activation+2.72e+00

Pos logit contributions
P+0.053
aid+0.052
 sunscreen+0.052
 merry+0.050
coat+0.049
Neg logit contributions
arter-0.033
 wasn-0.031
 meant-0.031
ires-0.031
 probably-0.030
Tokenâ
Token ID22940
Feature activation+1.85e-01

Pos logit contributions
P+0.076
aid+0.076
 sunscreen+0.073
 merry+0.072
suits+0.071
Neg logit contributions
arter-0.046
 meant-0.043
 wasn-0.043
ires-0.043
 probably-0.042
Token
Token ID26391
Feature activation+3.39e+00

Pos logit contributions
P+0.004
aid+0.004
 sunscreen+0.004
 merry+0.004
suits+0.004
Neg logit contributions
arter-0.004
 wasn-0.004
 meant-0.004
 probably-0.004
ires-0.004
Token
Token ID8151
Feature activation+8.35e+00

Pos logit contributions
P+0.082
aid+0.079
 sunscreen+0.078
 merry+0.078
coat+0.077
Neg logit contributions
arter-0.072
 meant-0.069
ires-0.068
 probably-0.066
 erupt-0.066
Tokens
Token ID82
Feature activation+1.13e+00

Pos logit contributions
P+0.218
aid+0.209
 sunscreen+0.206
coat+0.204
suits+0.202
Neg logit contributions
arter-0.159
 meant-0.153
ires-0.150
 erupt-0.149
 probably-0.148
Token kind
Token ID1611
Feature activation+2.49e+00

Pos logit contributions
P+0.028
aid+0.027
 sunscreen+0.026
suits+0.026
 merry+0.025
Neg logit contributions
arter-0.023
 wasn-0.022
 meant-0.022
 probably-0.022
ires-0.022
Token encouragement
Token ID27529
Feature activation+1.47e+00

Pos logit contributions
P+0.070
aid+0.068
 sunscreen+0.066
 merry+0.064
suits+0.064
Neg logit contributions
 wasn-0.042
 meant-0.042
arter-0.042
 He-0.040
ires-0.040
Token.
Token ID13
Feature activation+1.19e+00

Pos logit contributions
aid+0.041
P+0.040
 sunscreen+0.040
 merry+0.039
coat+0.039
Neg logit contributions
arter-0.025
ires-0.023
 erupt-0.023
 meant-0.022
 probably-0.022
Token<|endoftext|>
Token ID50256
Feature activation-1.09e-01

Pos logit contributions
P+0.027
aid+0.026
 sunscreen+0.025
 merry+0.025
Of+0.024
Neg logit contributions
arter-0.027
rated-0.026
 matter-0.025
ires-0.025
cker-0.024
Token.
Token ID13
Feature activation+5.03e-01

Pos logit contributions
arter+0.000
ires+0.000
 meant+0.000
 probably+0.000
 wasn+0.000
Neg logit contributions
 merry-0.000
suits-0.000
P-0.000
 sunscreen-0.000
ival-0.000
Token With
Token ID2080
Feature activation-1.40e+00

Pos logit contributions
suits+0.013
ival+0.013
 sunscreen+0.013
 merry+0.011
coat+0.011
Neg logit contributions
arter-0.009
 meant-0.008
rated-0.008
ires-0.008
 probably-0.008
Token Joe
Token ID5689
Feature activation+2.66e+00

Pos logit contributions
 meant+0.025
 wasn+0.024
 Andy+0.024
arter+0.023
 He+0.023
Neg logit contributions
P-0.039
aid-0.038
coat-0.037
suits-0.036
ival-0.036
Tokenâ
Token ID22940
Feature activation+3.12e-01

Pos logit contributions
P+0.080
aid+0.079
 sunscreen+0.077
coat+0.074
 merry+0.074
Neg logit contributions
 wasn-0.040
 meant-0.039
arter-0.038
 probably-0.037
 He-0.036
Token
Token ID26391
Feature activation+3.06e+00

Pos logit contributions
P+0.007
aid+0.007
 sunscreen+0.007
coat+0.007
 merry+0.007
Neg logit contributions
arter-0.007
 wasn-0.006
 meant-0.006
ires-0.006
 probably-0.006
Token
Token ID8151
Feature activation+8.26e+00

Pos logit contributions
P+0.065
aid+0.063
 sunscreen+0.061
suits+0.061
 merry+0.060
Neg logit contributions
arter-0.074
 meant-0.071
ires-0.070
 wasn-0.069
 probably-0.069
Tokens
Token ID82
Feature activation+1.81e+00

Pos logit contributions
P+0.215
aid+0.212
 sunscreen+0.205
suits+0.201
coat+0.200
Neg logit contributions
arter-0.156
 meant-0.148
 wasn-0.146
 probably-0.146
ires-0.146
Token help
Token ID1037
Feature activation+2.18e+00

Pos logit contributions
aid+0.041
P+0.040
 sunscreen+0.039
 merry+0.038
coat+0.037
Neg logit contributions
arter-0.040
ires-0.038
 wasn-0.038
 meant-0.038
 probably-0.038
Token,
Token ID11
Feature activation+1.22e+00

Pos logit contributions
 sunscreen+0.048
aid+0.048
P+0.048
coat+0.046
 merry+0.046
Neg logit contributions
arter-0.049
ires-0.046
 erupt-0.044
 probably-0.044
 Next-0.044
Token Steve
Token ID6542
Feature activation-4.31e-01

Pos logit contributions
 sunscreen+0.035
P+0.034
aid+0.034
 merry+0.034
ival+0.033
Neg logit contributions
arter-0.021
ires-0.018
 probably-0.017
rated-0.017
 meant-0.017
Token started
Token ID2067
Feature activation+2.75e+00

Pos logit contributions
arter+0.007
 wasn+0.007
 meant+0.007
ires+0.007
 probably+0.007
Neg logit contributions
P-0.012
aid-0.012
 sunscreen-0.011
 merry-0.011
suits-0.011
Token back
Token ID736
Feature activation+4.05e+00

Pos logit contributions
aid+0.114
P+0.098
ival+0.093
 gown+0.093
 merry+0.092
Neg logit contributions
 He-0.080
 before-0.071
 meant-0.070
 wasn-0.069
 Joe-0.069
Token to
Token ID284
Feature activation+1.69e+00

Pos logit contributions
aid+0.098
P+0.095
 sunscreen+0.084
 merry+0.082
ival+0.082
Neg logit contributions
 wasn-0.088
 meant-0.087
 He-0.087
 Andy-0.086
ires-0.084
Token Lily
Token ID20037
Feature activation+1.97e+00

Pos logit contributions
aid+0.052
P+0.052
suits+0.049
coat+0.049
 sunscreen+0.048
Neg logit contributions
arter-0.024
 wasn-0.023
 meant-0.023
 probably-0.022
 He-0.022
Tokenâ
Token ID22940
Feature activation-9.98e-03

Pos logit contributions
P+0.058
aid+0.057
 sunscreen+0.056
suits+0.055
 merry+0.055
Neg logit contributions
arter-0.031
ires-0.028
 meant-0.028
 wasn-0.028
 probably-0.027
Token
Token ID26391
Feature activation+3.08e+00

Pos logit contributions
arter+0.000
 wasn+0.000
 meant+0.000
 probably+0.000
 erupt+0.000
Neg logit contributions
P-0.000
aid-0.000
 sunscreen-0.000
 merry-0.000
suits-0.000
Token
Token ID8151
Feature activation+8.22e+00

Pos logit contributions
P+0.078
aid+0.075
 sunscreen+0.074
suits+0.074
coat+0.074
Neg logit contributions
arter-0.061
 meant-0.058
ires-0.058
 erupt-0.056
 wasn-0.056
Tokens
Token ID82
Feature activation+1.16e+00

Pos logit contributions
P+0.218
aid+0.212
 sunscreen+0.207
suits+0.206
coat+0.205
Neg logit contributions
arter-0.151
 meant-0.145
 probably-0.142
ires-0.140
 erupt-0.139
Token home
Token ID1363
Feature activation+5.52e-01

Pos logit contributions
aid+0.031
P+0.031
ival+0.027
coat+0.026
 sunscreen+0.026
Neg logit contributions
 wasn-0.023
arter-0.023
 meant-0.022
 probably-0.022
 erupt-0.022
Token together
Token ID1978
Feature activation+8.65e-01

Pos logit contributions
P+0.017
aid+0.017
 sunscreen+0.017
suits+0.017
 merry+0.017
Neg logit contributions
arter-0.007
ires-0.006
 meant-0.006
 wasn-0.006
 probably-0.006
Token.
Token ID13
Feature activation+5.56e-01

Pos logit contributions
P+0.029
aid+0.029
 sunscreen+0.028
 merry+0.028
suits+0.027
Neg logit contributions
arter-0.010
 wasn-0.009
 meant-0.009
ires-0.009
 probably-0.008
Token They
Token ID1119
Feature activation+1.18e+00

Pos logit contributions
P+0.016
aid+0.016
 sunscreen+0.016
suits+0.015
coat+0.015
Neg logit contributions
arter-0.009
 meant-0.008
 wasn-0.008
 erupt-0.008
ires-0.008
Token.
Token ID13
Feature activation+7.07e-01

Pos logit contributions
arter+0.020
ires+0.018
 meant+0.018
 wasn+0.017
 probably+0.017
Neg logit contributions
P-0.032
aid-0.031
 sunscreen-0.031
 merry-0.030
suits-0.030
Token As
Token ID1081
Feature activation+6.76e-01

Pos logit contributions
P+0.020
 sunscreen+0.019
aid+0.019
suits+0.019
ival+0.019
Neg logit contributions
arter-0.012
 erupt-0.011
 meant-0.011
rated-0.010
ires-0.010
Token Daisy
Token ID40355
Feature activation+3.41e-01

Pos logit contributions
aid+0.021
P+0.021
 sunscreen+0.020
 merry+0.019
coat+0.019
Neg logit contributions
arter-0.010
 wasn-0.009
 meant-0.009
 probably-0.009
ires-0.009
Tokenâ
Token ID22940
Feature activation+1.27e-01

Pos logit contributions
P+0.009
aid+0.009
 sunscreen+0.009
 merry+0.008
coat+0.008
Neg logit contributions
 wasn-0.006
 meant-0.006
arter-0.006
 probably-0.006
 erupt-0.006
Token
Token ID26391
Feature activation+2.93e+00

Pos logit contributions
P+0.003
aid+0.003
 sunscreen+0.003
 merry+0.003
coat+0.003
Neg logit contributions
arter-0.003
 wasn-0.003
 meant-0.003
ires-0.003
 probably-0.003
Token
Token ID8151
Feature activation+8.13e+00

Pos logit contributions
P+0.067
suits+0.065
aid+0.064
 sunscreen+0.064
ival+0.063
Neg logit contributions
arter-0.065
 meant-0.062
ires-0.061
 probably-0.059
 erupt-0.058
Tokens
Token ID82
Feature activation+1.02e+00

Pos logit contributions
P+0.239
aid+0.238
 sunscreen+0.233
 merry+0.228
suits+0.223
Neg logit contributions
arter-0.120
 wasn-0.119
 meant-0.116
 He-0.111
ires-0.110
Token gaze
Token ID17841
Feature activation-7.06e-01

Pos logit contributions
P+0.023
aid+0.023
 sunscreen+0.022
 merry+0.022
suits+0.021
Neg logit contributions
arter-0.022
 wasn-0.021
 meant-0.021
ires-0.021
 probably-0.021
Token passed
Token ID3804
Feature activation+3.20e+00

Pos logit contributions
 wasn+0.014
 meant+0.014
arter+0.014
 probably+0.013
ires+0.013
Neg logit contributions
aid-0.018
P-0.018
 sunscreen-0.017
 merry-0.016
coat-0.016
Token by
Token ID416
Feature activation-1.97e+00

Pos logit contributions
aid+0.084
P+0.084
 sunscreen+0.080
coat+0.077
 merry+0.076
Neg logit contributions
 wasn-0.063
 meant-0.063
 He-0.059
 probably-0.058
ires-0.056
Token,
Token ID11
Feature activation+8.75e-01

Pos logit contributions
arter+0.034
 wasn+0.034
 meant+0.033
 probably+0.032
ires+0.032
Neg logit contributions
P-0.054
aid-0.054
 sunscreen-0.051
suits-0.050
 merry-0.050
Token back
Token ID736
Feature activation+4.15e+00

Pos logit contributions
aid+0.120
coat+0.104
P+0.102
 merry+0.100
 gown+0.098
Neg logit contributions
 He-0.095
 before-0.084
 Joe-0.082
.-0.080
 from-0.078
Token to
Token ID284
Feature activation+2.20e+00

Pos logit contributions
aid+0.102
P+0.099
 sunscreen+0.091
 merry+0.091
coat+0.088
Neg logit contributions
 wasn-0.087
 He-0.086
ires-0.085
 meant-0.084
 Andy-0.084
Token Lily
Token ID20037
Feature activation+1.77e+00

Pos logit contributions
P+0.064
aid+0.064
suits+0.060
 sunscreen+0.060
coat+0.060
Neg logit contributions
arter-0.035
 wasn-0.033
 meant-0.033
 probably-0.033
ires-0.032
Tokenâ
Token ID22940
Feature activation+4.20e-01

Pos logit contributions
P+0.049
aid+0.048
 sunscreen+0.048
coat+0.047
suits+0.046
Neg logit contributions
arter-0.030
ires-0.028
 meant-0.027
 wasn-0.027
 erupt-0.027
Token
Token ID26391
Feature activation+2.96e+00

Pos logit contributions
aid+0.011
P+0.011
 sunscreen+0.011
 merry+0.010
suits+0.010
Neg logit contributions
 wasn-0.008
arter-0.007
 meant-0.007
 probably-0.007
 erupt-0.007
Token
Token ID8151
Feature activation+8.12e+00

Pos logit contributions
P+0.064
suits+0.060
 sunscreen+0.060
coat+0.060
 merry+0.059
Neg logit contributions
arter-0.071
 meant-0.069
ires-0.068
 erupt-0.065
rated-0.065
Tokens
Token ID82
Feature activation+1.13e+00

Pos logit contributions
P+0.215
aid+0.209
 sunscreen+0.204
coat+0.203
suits+0.202
Neg logit contributions
arter-0.151
 meant-0.143
ires-0.141
 probably-0.140
 erupt-0.139
Token house
Token ID2156
Feature activation+2.15e-01

Pos logit contributions
P+0.027
aid+0.026
 sunscreen+0.025
 merry+0.024
suits+0.024
Neg logit contributions
arter-0.024
 wasn-0.023
 meant-0.023
ires-0.023
 probably-0.023
Token and
Token ID290
Feature activation+1.43e-01

Pos logit contributions
P+0.007
aid+0.007
 sunscreen+0.006
 merry+0.006
suits+0.006
Neg logit contributions
arter-0.003
 meant-0.003
 wasn-0.003
ires-0.003
 probably-0.003
Token parted
Token ID37813
Feature activation+2.47e+00

Pos logit contributions
P+0.004
aid+0.004
 sunscreen+0.004
suits+0.004
 merry+0.004
Neg logit contributions
 wasn-0.003
 meant-0.003
arter-0.002
 probably-0.002
 He-0.002
Token ways
Token ID2842
Feature activation+1.14e+00

Pos logit contributions
P+0.077
aid+0.077
coat+0.075
ival+0.074
 merry+0.073
Neg logit contributions
ires-0.033
 meant-0.032
 wasn-0.031
arter-0.031
 He-0.031
Token patiently
Token ID37220
Feature activation+4.92e-01

Pos logit contributions
arter+0.016
 meant+0.015
 wasn+0.015
ires+0.015
 probably+0.015
Neg logit contributions
P-0.021
aid-0.021
 sunscreen-0.020
 merry-0.020
suits-0.020
Token for
Token ID329
Feature activation+9.08e-01

Pos logit contributions
P+0.014
aid+0.014
 sunscreen+0.013
 merry+0.013
coat+0.013
Neg logit contributions
arter-0.008
 wasn-0.008
 meant-0.008
ires-0.008
 probably-0.008
Token Jack
Token ID3619
Feature activation+1.63e+00

Pos logit contributions
P+0.026
aid+0.025
coat+0.025
suits+0.024
 sunscreen+0.023
Neg logit contributions
arter-0.015
 wasn-0.015
 meant-0.015
 probably-0.015
 erupt-0.015
Tokenâ
Token ID22940
Feature activation+3.42e-01

Pos logit contributions
P+0.036
aid+0.035
 sunscreen+0.034
 merry+0.033
coat+0.032
Neg logit contributions
arter-0.037
 wasn-0.037
 meant-0.037
 probably-0.036
ires-0.035
Token
Token ID26391
Feature activation+3.61e+00

Pos logit contributions
P+0.008
aid+0.008
coat+0.007
 sunscreen+0.007
suits+0.007
Neg logit contributions
arter-0.008
ires-0.007
 wasn-0.007
 meant-0.007
 erupt-0.007
Token
Token ID8151
Feature activation+8.11e+00

Pos logit contributions
P+0.081
suits+0.077
aid+0.076
 sunscreen+0.076
 merry+0.075
Neg logit contributions
arter-0.083
 meant-0.078
ires-0.078
 erupt-0.076
 probably-0.076
Tokens
Token ID82
Feature activation+1.09e+00

Pos logit contributions
P+0.213
aid+0.205
suits+0.203
 sunscreen+0.202
ival+0.202
Neg logit contributions
arter-0.150
 meant-0.143
ires-0.143
 probably-0.140
rated-0.138
Token new
Token ID649
Feature activation+1.97e+00

Pos logit contributions
P+0.027
aid+0.026
 sunscreen+0.025
suits+0.024
coat+0.024
Neg logit contributions
 wasn-0.022
arter-0.022
 meant-0.022
 probably-0.021
ires-0.021
Token passport
Token ID17981
Feature activation+3.37e-01

Pos logit contributions
aid+0.056
P+0.052
ival+0.047
coat+0.047
rob+0.047
Neg logit contributions
 wasn-0.039
 ending-0.038
 meant-0.037
 either-0.036
 matter-0.036
Token.
Token ID13
Feature activation+1.01e+00

Pos logit contributions
P+0.010
aid+0.010
 sunscreen+0.010
 merry+0.009
suits+0.009
Neg logit contributions
arter-0.005
 wasn-0.005
 meant-0.005
ires-0.005
 probably-0.005
Token And
Token ID843
Feature activation-9.38e-01

Pos logit contributions
P+0.020
ival+0.020
suits+0.020
aid+0.019
Of+0.019
Neg logit contributions
rated-0.024
arter-0.023
cker-0.023
 matter-0.023
 meant-0.022
Token Jack
Token ID3619
Feature activation+3.76e-01

Pos logit contributions
 sunscreen+0.008
P+0.008
aid+0.007
ival+0.007
 merry+0.007
Neg logit contributions
arter-0.005
cker-0.005
ires-0.005
 erupt-0.005
 probably-0.005
Token and
Token ID290
Feature activation-3.66e-01

Pos logit contributions
P+0.010
aid+0.010
 sunscreen+0.010
 merry+0.010
suits+0.010
Neg logit contributions
arter-0.006
 wasn-0.006
 meant-0.006
ires-0.006
 probably-0.006
Token Emma
Token ID18966
Feature activation+1.84e+00

Pos logit contributions
arter+0.009
ires+0.009
 erupt+0.009
cker+0.008
uders+0.008
Neg logit contributions
P-0.007
aid-0.007
 sunscreen-0.007
 gown-0.007
 merry-0.007
Tokenâ
Token ID22940
Feature activation-1.64e-03

Pos logit contributions
aid+0.050
P+0.049
 sunscreen+0.046
 merry+0.045
ival+0.045
Neg logit contributions
 wasn-0.035
 meant-0.033
 He-0.031
 probably-0.031
 couldn-0.030
Token
Token ID26391
Feature activation+2.85e+00

Pos logit contributions
arter+0.000
 wasn+0.000
 meant+0.000
 probably+0.000
ires+0.000
Neg logit contributions
P-0.000
aid-0.000
 sunscreen-0.000
 merry-0.000
suits-0.000
Token
Token ID8151
Feature activation+8.06e+00

Pos logit contributions
P+0.064
aid+0.062
suits+0.062
 sunscreen+0.061
 merry+0.060
Neg logit contributions
arter-0.064
 meant-0.061
ires-0.060
 probably-0.059
 wasn-0.059
Tokens
Token ID82
Feature activation+4.63e-01

Pos logit contributions
P+0.206
aid+0.205
 sunscreen+0.198
 merry+0.192
suits+0.191
Neg logit contributions
arter-0.153
 wasn-0.147
 meant-0.147
ires-0.144
 probably-0.144
Token mom
Token ID1995
Feature activation+1.41e+00

Pos logit contributions
aid+0.012
P+0.011
 sunscreen+0.011
 merry+0.011
suits+0.011
Neg logit contributions
arter-0.009
ires-0.009
 probably-0.008
 meant-0.008
 wasn-0.008
Token gave
Token ID2921
Feature activation+6.49e-01

Pos logit contributions
P+0.038
aid+0.037
 sunscreen+0.036
 merry+0.035
suits+0.034
Neg logit contributions
 wasn-0.026
 meant-0.026
arter-0.025
 probably-0.024
 He-0.024
Token them
Token ID606
Feature activation+2.10e+00

Pos logit contributions
aid+0.020
 merry+0.019
coat+0.019
P+0.019
suits+0.019
Neg logit contributions
 meant-0.010
 wasn-0.010
 Andy-0.009
 Joe-0.008
 Next-0.008
Token cel
Token ID18725
Feature activation-6.17e-02

Pos logit contributions
rob+0.058
 merry+0.057
aid+0.056
coat+0.056
ival+0.055
Neg logit contributions
 Joe-0.032
 Sam-0.030
 before-0.030
 Ted-0.029
 wasn-0.029
Token to
Token ID284
Feature activation+5.82e-01

Pos logit contributions
arter+0.004
 meant+0.004
 wasn+0.004
ires+0.004
 probably+0.004
Neg logit contributions
P-0.004
aid-0.004
 sunscreen-0.004
suits-0.004
 merry-0.004
Token his
Token ID465
Feature activation+1.21e+00

Pos logit contributions
 sunscreen+0.015
 merry+0.015
aid+0.015
P+0.014
 gown+0.014
Neg logit contributions
arter-0.011
 meant-0.010
ires-0.010
 wasn-0.010
rated-0.010
Token dad
Token ID9955
Feature activation+3.77e+00

Pos logit contributions
P+0.032
aid+0.032
 sunscreen+0.031
 merry+0.030
coat+0.030
Neg logit contributions
arter-0.022
 wasn-0.022
 meant-0.021
ires-0.021
 probably-0.021
Tokenâ
Token ID22940
Feature activation-1.42e-01

Pos logit contributions
aid+0.101
 sunscreen+0.097
 gown+0.096
suits+0.096
 merry+0.096
Neg logit contributions
arter-0.068
ires-0.061
uders-0.058
 probably-0.057
 meant-0.057
Token
Token ID26391
Feature activation+3.08e+00

Pos logit contributions
arter+0.003
ires+0.003
 wasn+0.003
 meant+0.003
 probably+0.003
Neg logit contributions
P-0.003
aid-0.003
 sunscreen-0.003
coat-0.003
 merry-0.003
Token
Token ID8151
Feature activation+8.02e+00

Pos logit contributions
suits+0.061
ival+0.060
P+0.060
coat+0.060
 merry+0.058
Neg logit contributions
arter-0.079
 meant-0.076
ires-0.076
rated-0.073
 matter-0.072
Tokens
Token ID82
Feature activation+1.07e+00

Pos logit contributions
P+0.199
ival+0.192
coat+0.191
aid+0.191
suits+0.189
Neg logit contributions
arter-0.159
 meant-0.152
rated-0.148
 probably-0.148
 matter-0.146
Token instructions
Token ID7729
Feature activation+7.88e-01

Pos logit contributions
aid+0.024
 merry+0.023
 sunscreen+0.023
P+0.022
 gown+0.022
Neg logit contributions
arter-0.024
ires-0.022
 meant-0.022
 probably-0.021
 wasn-0.021
Token.
Token ID13
Feature activation+7.85e-01

Pos logit contributions
aid+0.022
 sunscreen+0.022
coat+0.021
suits+0.021
ival+0.021
Neg logit contributions
arter-0.014
ires-0.011
uders-0.011
 erupt-0.011
 probably-0.011
Token
Token ID198
Feature activation+1.70e-01

Pos logit contributions
P+0.019
aid+0.019
 sunscreen+0.019
ival+0.018
suits+0.018
Neg logit contributions
arter-0.016
rated-0.016
 meant-0.015
 erupt-0.015
cker-0.015
Token
Token ID198
Feature activation-4.73e-01

Pos logit contributions
P+0.005
aid+0.005
 sunscreen+0.005
ival+0.005
 merry+0.005
Neg logit contributions
arter-0.003
 meant-0.003
 wasn-0.002
ires-0.002
 probably-0.002
Token With
Token ID2080
Feature activation-1.17e+00

Pos logit contributions
 sunscreen+0.017
aid+0.017
P+0.017
ival+0.016
suits+0.016
Neg logit contributions
arter-0.010
rated-0.009
cker-0.009
uders-0.009
 erupt-0.009
Token his
Token ID465
Feature activation+2.46e+00

Pos logit contributions
 meant+0.016
 wasn+0.016
 He+0.015
 Andy+0.014
arter+0.014
Neg logit contributions
suits-0.036
P-0.036
aid-0.035
coat-0.035
ival-0.035
Token parents
Token ID3397
Feature activation+4.59e+00

Pos logit contributions
P+0.061
aid+0.061
suits+0.058
coat+0.058
 sunscreen+0.055
Neg logit contributions
 wasn-0.048
arter-0.047
 meant-0.047
ires-0.047
 probably-0.046
Tokenâ
Token ID22940
Feature activation+1.15e-02

Pos logit contributions
P+0.122
aid+0.121
 sunscreen+0.118
coat+0.114
suits+0.114
Neg logit contributions
 wasn-0.085
 meant-0.083
 probably-0.078
 He-0.077
arter-0.074
Token
Token ID26391
Feature activation+3.15e+00

Pos logit contributions
P+0.000
aid+0.000
 sunscreen+0.000
 merry+0.000
suits+0.000
Neg logit contributions
arter-0.000
 wasn-0.000
 meant-0.000
ires-0.000
 probably-0.000
Token
Token ID8151
Feature activation+7.96e+00

Pos logit contributions
P+0.069
aid+0.067
 sunscreen+0.065
suits+0.064
coat+0.063
Neg logit contributions
arter-0.075
ires-0.071
 meant-0.071
 probably-0.069
 erupt-0.069
Token help
Token ID1037
Feature activation+2.45e+00

Pos logit contributions
P+0.229
aid+0.227
 sunscreen+0.220
 merry+0.215
suits+0.213
Neg logit contributions
arter-0.127
 meant-0.121
 wasn-0.121
ires-0.118
 probably-0.118
Token,
Token ID11
Feature activation+1.75e+00

Pos logit contributions
P+0.058
aid+0.057
 sunscreen+0.057
coat+0.054
 merry+0.054
Neg logit contributions
arter-0.052
ires-0.048
 erupt-0.048
 probably-0.047
 meant-0.046
Token he
Token ID339
Feature activation+1.21e+00

Pos logit contributions
 sunscreen+0.050
P+0.049
aid+0.048
 merry+0.046
coat+0.045
Neg logit contributions
arter-0.031
ires-0.028
 erupt-0.028
rated-0.027
 probably-0.026
Token was
Token ID373
Feature activation+4.19e-01

Pos logit contributions
aid+0.031
P+0.031
 sunscreen+0.030
 merry+0.029
coat+0.029
Neg logit contributions
arter-0.023
ires-0.022
 erupt-0.021
 meant-0.021
 probably-0.020
Token able
Token ID1498
Feature activation+2.26e+00

Pos logit contributions
P+0.010
suits+0.010
aid+0.010
 sunscreen+0.009
ival+0.009
Neg logit contributions
 meant-0.009
 wasn-0.009
 probably-0.009
arter-0.008
 erupt-0.008
Token road
Token ID2975
Feature activation+1.90e+00

Pos logit contributions
aid+0.043
P+0.041
 sunscreen+0.038
 gown+0.038
coat+0.037
Neg logit contributions
 He-0.028
 erupt-0.028
 wasn-0.027
 meant-0.026
arter-0.026
Token to
Token ID284
Feature activation+3.51e-01

Pos logit contributions
P+0.054
aid+0.054
 merry+0.053
 sunscreen+0.053
coat+0.053
Neg logit contributions
arter-0.030
ires-0.027
 meant-0.027
 wasn-0.027
 probably-0.027
Token grandmother
Token ID18410
Feature activation+7.92e-01

Pos logit contributions
P+0.009
aid+0.009
suits+0.008
 sunscreen+0.008
coat+0.008
Neg logit contributions
arter-0.007
 wasn-0.007
 meant-0.007
 erupt-0.007
 probably-0.007
Tokenâ
Token ID22940
Feature activation-3.22e-01

Pos logit contributions
P+0.024
aid+0.024
 sunscreen+0.023
 merry+0.023
suits+0.023
Neg logit contributions
arter-0.011
ires-0.011
 meant-0.011
 wasn-0.011
 probably-0.010
Token
Token ID26391
Feature activation+2.77e+00

Pos logit contributions
arter+0.006
 wasn+0.006
 meant+0.006
 probably+0.006
ires+0.006
Neg logit contributions
P-0.008
aid-0.008
 sunscreen-0.008
 merry-0.008
suits-0.008
Token
Token ID8151
Feature activation+7.89e+00

Pos logit contributions
P+0.057
suits+0.057
coat+0.057
 sunscreen+0.055
ival+0.055
Neg logit contributions
arter-0.068
ires-0.065
 meant-0.065
 erupt-0.061
rated-0.061
Tokens
Token ID82
Feature activation+6.78e-01

Pos logit contributions
P+0.193
aid+0.185
 sunscreen+0.184
coat+0.183
suits+0.183
Neg logit contributions
arter-0.161
 meant-0.156
 probably-0.152
ires-0.150
rated-0.149
Token.
Token ID13
Feature activation+1.02e+00

Pos logit contributions
P+0.017
aid+0.016
 sunscreen+0.015
suits+0.015
coat+0.015
Neg logit contributions
 wasn-0.014
arter-0.014
 meant-0.014
 probably-0.014
 erupt-0.013
Token Although
Token ID4900
Feature activation-4.02e+00

Pos logit contributions
P+0.029
aid+0.029
 sunscreen+0.029
suits+0.028
coat+0.028
Neg logit contributions
arter-0.016
 meant-0.015
 erupt-0.015
ires-0.015
 wasn-0.015
Token this
Token ID428
Feature activation+6.13e-01

Pos logit contributions
arter+0.077
ires+0.070
 erupt+0.069
rated+0.066
 meant+0.065
Neg logit contributions
 sunscreen-0.106
P-0.103
aid-0.102
 merry-0.101
coat-0.098
Token day
Token ID1110
Feature activation+2.19e+00

Pos logit contributions
P+0.016
 sunscreen+0.016
aid+0.016
coat+0.016
 merry+0.016
Neg logit contributions
arter-0.011
ires-0.011
 erupt-0.010
 probably-0.010
iring-0.010
Token into
Token ID656
Feature activation+2.22e+00

Pos logit contributions
aid+0.111
 merry+0.103
P+0.103
rob+0.100
ival+0.095
Neg logit contributions
 He-0.090
 before-0.085
 wasn-0.083
 probably-0.080
 meant-0.078
Token his
Token ID465
Feature activation+3.44e+00

Pos logit contributions
P+0.068
aid+0.067
 sunscreen+0.064
coat+0.063
suits+0.063
Neg logit contributions
arter-0.032
 wasn-0.031
 meant-0.031
 probably-0.030
 erupt-0.029
Token mum
Token ID25682
Feature activation+3.73e+00

Pos logit contributions
aid+0.087
P+0.083
ival+0.076
coat+0.074
rob+0.073
Neg logit contributions
arter-0.071
 wasn-0.070
 probably-0.069
 He-0.069
 erupt-0.066
Tokenâ
Token ID22940
Feature activation-1.21e-01

Pos logit contributions
P+0.107
aid+0.106
 sunscreen+0.104
suits+0.101
 merry+0.101
Neg logit contributions
arter-0.061
ires-0.057
 wasn-0.056
 meant-0.056
 probably-0.055
Token
Token ID26391
Feature activation+2.78e+00

Pos logit contributions
arter+0.003
 wasn+0.002
 meant+0.002
ires+0.002
 probably+0.002
Neg logit contributions
P-0.003
aid-0.003
 sunscreen-0.003
coat-0.003
 merry-0.003
Token
Token ID8151
Feature activation+7.88e+00

Pos logit contributions
P+0.061
suits+0.060
coat+0.059
 sunscreen+0.058
ival+0.056
Neg logit contributions
arter-0.067
 meant-0.064
ires-0.063
 matter-0.060
rated-0.060
Tokens
Token ID82
Feature activation+1.90e+00

Pos logit contributions
P+0.208
aid+0.204
 sunscreen+0.199
coat+0.199
suits+0.198
Neg logit contributions
arter-0.145
 meant-0.140
ires-0.135
 probably-0.134
 erupt-0.133
Token car
Token ID1097
Feature activation+3.63e+00

Pos logit contributions
P+0.048
aid+0.048
 sunscreen+0.044
ival+0.044
 merry+0.043
Neg logit contributions
 wasn-0.038
arter-0.037
 meant-0.037
 He-0.036
 probably-0.036
Token.
Token ID13
Feature activation+5.66e-01

Pos logit contributions
P+0.120
aid+0.119
 sunscreen+0.117
coat+0.115
suits+0.114
Neg logit contributions
arter-0.043
ires-0.039
 wasn-0.038
 meant-0.038
 probably-0.037
Token
Token ID198
Feature activation-1.33e-01

Pos logit contributions
P+0.016
aid+0.016
 sunscreen+0.016
suits+0.015
coat+0.015
Neg logit contributions
arter-0.010
 erupt-0.009
rated-0.009
 meant-0.009
ires-0.009
Token
Token ID198
Feature activation-8.16e-01

Pos logit contributions
arter+0.002
 wasn+0.002
 meant+0.001
ires+0.001
 probably+0.001
Neg logit contributions
P-0.004
aid-0.004
 sunscreen-0.004
 merry-0.004
coat-0.004
Token.
Token ID13
Feature activation+1.97e-01

Pos logit contributions
arter+0.019
ires+0.017
 wasn+0.017
 probably+0.017
 meant+0.017
Neg logit contributions
aid-0.037
P-0.036
 sunscreen-0.036
suits-0.035
 merry-0.035
Token When
Token ID1649
Feature activation-1.57e+00

Pos logit contributions
suits+0.005
P+0.005
 sunscreen+0.005
coat+0.005
ival+0.005
Neg logit contributions
arter-0.004
rated-0.003
cker-0.003
ires-0.003
 erupt-0.003
Token Lily
Token ID20037
Feature activation+1.46e+00

Pos logit contributions
arter+0.022
 wasn+0.021
 meant+0.021
 probably+0.020
 erupt+0.020
Neg logit contributions
aid-0.049
P-0.049
 sunscreen-0.047
suits-0.046
coat-0.046
Tokenâ
Token ID22940
Feature activation+1.54e-01

Pos logit contributions
aid+0.036
P+0.035
 sunscreen+0.034
 merry+0.033
suits+0.032
Neg logit contributions
 wasn-0.030
 meant-0.029
arter-0.029
 probably-0.029
 He-0.028
Token
Token ID26391
Feature activation+2.84e+00

Pos logit contributions
P+0.004
aid+0.004
 sunscreen+0.004
 merry+0.003
suits+0.003
Neg logit contributions
arter-0.003
 wasn-0.003
 meant-0.003
ires-0.003
 probably-0.003
Token
Token ID8151
Feature activation+7.87e+00

Pos logit contributions
P+0.070
aid+0.068
suits+0.068
coat+0.068
ival+0.067
Neg logit contributions
arter-0.057
 meant-0.055
ires-0.054
 probably-0.052
 erupt-0.052
Tokens
Token ID82
Feature activation+7.11e-01

Pos logit contributions
P+0.219
aid+0.218
 sunscreen+0.211
 merry+0.206
suits+0.204
Neg logit contributions
arter-0.132
 wasn-0.126
 meant-0.126
ires-0.124
 probably-0.123
Token mom
Token ID1995
Feature activation+1.27e+00

Pos logit contributions
P+0.019
aid+0.019
 sunscreen+0.018
 merry+0.018
suits+0.018
Neg logit contributions
arter-0.013
 wasn-0.013
 meant-0.013
 probably-0.012
ires-0.012
Token picked
Token ID6497
Feature activation+7.62e-01

Pos logit contributions
P+0.033
aid+0.032
 sunscreen+0.031
coat+0.030
 merry+0.030
Neg logit contributions
arter-0.024
ires-0.023
 meant-0.023
 wasn-0.023
 erupt-0.022
Token her
Token ID607
Feature activation+2.39e+00

Pos logit contributions
aid+0.025
P+0.024
suits+0.024
coat+0.024
 merry+0.023
Neg logit contributions
 meant-0.010
 He-0.009
 wasn-0.009
 probably-0.009
ires-0.009
Token up
Token ID510
Feature activation+1.68e+00

Pos logit contributions
aid+0.075
P+0.074
 merry+0.068
ival+0.068
coat+0.068
Neg logit contributions
 wasn-0.036
 meant-0.035
 probably-0.034
 He-0.032
arter-0.031
Token off
Token ID572
Feature activation+2.91e-01

Pos logit contributions
P+0.017
aid+0.017
 merry+0.016
rob+0.015
 sunscreen+0.015
Neg logit contributions
 wasn-0.017
 meant-0.016
 probably-0.016
 before-0.016
 He-0.016
Token and
Token ID290
Feature activation-1.76e-01

Pos logit contributions
aid+0.008
P+0.008
 sunscreen+0.008
 merry+0.008
suits+0.008
Neg logit contributions
 wasn-0.005
arter-0.005
 meant-0.004
 probably-0.004
ires-0.004
Token let
Token ID1309
Feature activation+7.32e-01

Pos logit contributions
 wasn+0.003
arter+0.003
 meant+0.003
 probably+0.003
ires+0.003
Neg logit contributions
P-0.005
aid-0.005
 sunscreen-0.004
suits-0.004
 merry-0.004
Tokenâ
Token ID22940
Feature activation+3.33e-01

Pos logit contributions
P+0.021
aid+0.021
 sunscreen+0.020
 merry+0.020
suits+0.020
Neg logit contributions
arter-0.011
 wasn-0.011
 meant-0.011
 probably-0.011
ires-0.011
Token
Token ID26391
Feature activation+3.06e+00

Pos logit contributions
P+0.008
 sunscreen+0.008
aid+0.008
 merry+0.008
suits+0.008
Neg logit contributions
 meant-0.006
 wasn-0.006
arter-0.006
 probably-0.006
rated-0.006
Token
Token ID8151
Feature activation+7.87e+00

Pos logit contributions
P+0.075
aid+0.075
 sunscreen+0.072
 merry+0.070
suits+0.069
Neg logit contributions
arter-0.062
 wasn-0.059
 meant-0.059
ires-0.058
 probably-0.058
Tokens
Token ID82
Feature activation+1.69e+00

Pos logit contributions
P+0.189
aid+0.185
 sunscreen+0.177
coat+0.176
suits+0.175
Neg logit contributions
arter-0.166
 meant-0.158
ires-0.157
 probably-0.156
 erupt-0.154
Token go
Token ID467
Feature activation+5.07e+00

Pos logit contributions
P+0.042
aid+0.041
 sunscreen+0.040
 merry+0.039
coat+0.038
Neg logit contributions
arter-0.034
 wasn-0.033
 meant-0.033
ires-0.032
 probably-0.032
Token home
Token ID1363
Feature activation+1.05e+00

Pos logit contributions
P+0.151
aid+0.146
 merry+0.135
ival+0.135
 sunscreen+0.135
Neg logit contributions
 He-0.079
 wasn-0.072
 probably-0.071
arter-0.071
 before-0.070
Token.
Token ID13
Feature activation+1.01e+00

Pos logit contributions
P+0.034
aid+0.033
 sunscreen+0.032
 merry+0.032
suits+0.031
Neg logit contributions
arter-0.013
 wasn-0.013
 meant-0.013
 probably-0.012
ires-0.012
Tokenâ
Token ID22940
Feature activation-1.19e+00

Pos logit contributions
P+0.028
aid+0.027
 sunscreen+0.026
ival+0.025
 gown+0.025
Neg logit contributions
arter-0.020
 erupt-0.018
rated-0.018
cker-0.018
ires-0.017
Token to
Token ID284
Feature activation+8.73e-01

Pos logit contributions
arter+0.002
 wasn+0.002
 meant+0.002
ires+0.002
 probably+0.002
Neg logit contributions
P-0.002
aid-0.002
 sunscreen-0.002
 merry-0.002
suits-0.002
Token his
Token ID465
Feature activation+1.73e+00

Pos logit contributions
 sunscreen+0.023
aid+0.023
 merry+0.022
P+0.022
 gown+0.021
Neg logit contributions
arter-0.015
 meant-0.015
rated-0.015
 erupt-0.014
 wasn-0.014
Token mom
Token ID1995
Feature activation+2.09e+00

Pos logit contributions
P+0.053
aid+0.053
 sunscreen+0.051
 merry+0.050
suits+0.050
Neg logit contributions
arter-0.025
 wasn-0.024
 meant-0.024
ires-0.023
 probably-0.023
Tokenâ
Token ID22940
Feature activation-9.30e-03

Pos logit contributions
aid+0.052
P+0.050
 sunscreen+0.050
 merry+0.049
coat+0.049
Neg logit contributions
arter-0.042
ires-0.039
 erupt-0.037
 meant-0.037
 probably-0.037
Token
Token ID26391
Feature activation+3.15e+00

Pos logit contributions
arter+0.000
 wasn+0.000
 meant+0.000
ires+0.000
 probably+0.000
Neg logit contributions
P-0.000
aid-0.000
 sunscreen-0.000
 merry-0.000
suits-0.000
Token
Token ID8151
Feature activation+7.82e+00

Pos logit contributions
P+0.061
suits+0.058
ival+0.057
coat+0.056
 merry+0.056
Neg logit contributions
arter-0.083
 meant-0.079
ires-0.078
 matter-0.076
rated-0.076
Tokens
Token ID82
Feature activation+1.29e+00

Pos logit contributions
P+0.206
aid+0.198
coat+0.194
 sunscreen+0.193
ival+0.192
Neg logit contributions
arter-0.145
 meant-0.140
ires-0.138
 erupt-0.136
rated-0.135
Token advice
Token ID5608
Feature activation+9.62e-01

Pos logit contributions
aid+0.028
P+0.027
 sunscreen+0.027
 merry+0.026
 gown+0.025
Neg logit contributions
arter-0.030
 meant-0.029
ires-0.028
 wasn-0.028
 probably-0.028
Token and
Token ID290
Feature activation+3.21e-01

Pos logit contributions
aid+0.024
P+0.024
 sunscreen+0.024
 merry+0.023
coat+0.023
Neg logit contributions
arter-0.019
ires-0.017
 erupt-0.017
 meant-0.017
 probably-0.016
Token ate
Token ID15063
Feature activation+1.24e+00

Pos logit contributions
aid+0.008
P+0.008
 sunscreen+0.008
 merry+0.007
 gown+0.007
Neg logit contributions
arter-0.007
ires-0.006
 erupt-0.006
 meant-0.006
rated-0.006
Token the
Token ID262
Feature activation+9.59e-01

Pos logit contributions
P+0.035
aid+0.034
coat+0.033
suits+0.033
 merry+0.033
Neg logit contributions
 wasn-0.020
arter-0.020
 meant-0.020
ires-0.020
 probably-0.020
Token With
Token ID2080
Feature activation-1.07e+00

Pos logit contributions
P+0.012
aid+0.012
 sunscreen+0.012
coat+0.012
 merry+0.012
Neg logit contributions
arter-0.007
 erupt-0.007
 meant-0.006
 probably-0.006
ires-0.006
Token her
Token ID607
Feature activation+1.92e+00

Pos logit contributions
 meant+0.019
 wasn+0.018
 He+0.017
 Ste+0.017
ires+0.017
Neg logit contributions
suits-0.030
P-0.029
ival-0.029
aid-0.029
coat-0.028
Token mom
Token ID1995
Feature activation+2.85e+00

Pos logit contributions
P+0.054
aid+0.054
 sunscreen+0.052
 merry+0.051
suits+0.051
Neg logit contributions
arter-0.032
 wasn-0.031
 meant-0.031
 probably-0.030
ires-0.030
Tokenâ
Token ID22940
Feature activation+1.20e-01

Pos logit contributions
P+0.084
aid+0.082
 sunscreen+0.081
 merry+0.078
suits+0.077
Neg logit contributions
 wasn-0.045
 meant-0.044
 probably-0.040
arter-0.040
 Andy-0.040
Token
Token ID26391
Feature activation+3.07e+00

Pos logit contributions
P+0.003
aid+0.003
 sunscreen+0.003
 merry+0.003
suits+0.003
Neg logit contributions
 wasn-0.002
 meant-0.002
arter-0.002
rated-0.002
 probably-0.002
Token
Token ID8151
Feature activation+7.80e+00

Pos logit contributions
aid+0.068
P+0.067
 sunscreen+0.064
 merry+0.061
suits+0.059
Neg logit contributions
 wasn-0.069
arter-0.068
 meant-0.067
 probably-0.066
ires-0.065
Tokens
Token ID82
Feature activation+1.72e+00

Pos logit contributions
P+0.209
aid+0.208
 sunscreen+0.201
 merry+0.196
suits+0.193
Neg logit contributions
arter-0.138
 wasn-0.135
 meant-0.133
ires-0.130
 probably-0.129
Token help
Token ID1037
Feature activation+1.98e+00

Pos logit contributions
P+0.038
aid+0.037
 sunscreen+0.035
 merry+0.035
suits+0.034
Neg logit contributions
 wasn-0.040
 meant-0.039
arter-0.039
 probably-0.038
ires-0.037
Token,
Token ID11
Feature activation+1.26e+00

Pos logit contributions
P+0.051
aid+0.051
 sunscreen+0.049
 merry+0.048
coat+0.048
Neg logit contributions
arter-0.038
ires-0.035
 wasn-0.035
 meant-0.035
 probably-0.035
Token she
Token ID673
Feature activation+3.90e-01

Pos logit contributions
P+0.035
aid+0.035
 sunscreen+0.033
suits+0.032
 merry+0.032
Neg logit contributions
 wasn-0.021
 meant-0.021
arter-0.021
 probably-0.020
ires-0.020
Token made
Token ID925
Feature activation+6.30e-01

Pos logit contributions
P+0.009
aid+0.009
 sunscreen+0.009
suits+0.009
 merry+0.009
Neg logit contributions
arter-0.008
ires-0.008
 erupt-0.007
 meant-0.007
 probably-0.007
Token smiled
Token ID13541
Feature activation-1.89e+00

Pos logit contributions
arter+0.001
 wasn+0.001
 meant+0.001
ires+0.001
 probably+0.001
Neg logit contributions
P-0.002
aid-0.002
 sunscreen-0.002
 merry-0.002
coat-0.001
Token at
Token ID379
Feature activation-2.74e-01

Pos logit contributions
arter+0.029
 erupt+0.026
 meant+0.026
 wasn+0.026
ires+0.026
Neg logit contributions
P-0.056
aid-0.055
 sunscreen-0.054
 merry-0.054
suits-0.052
Token Daisy
Token ID40355
Feature activation+1.25e+00

Pos logit contributions
arter+0.006
 erupt+0.006
 wasn+0.006
 meant+0.006
 probably+0.006
Neg logit contributions
P-0.006
 sunscreen-0.006
aid-0.006
 merry-0.005
 gown-0.005
Tokenâ
Token ID22940
Feature activation-2.33e-02

Pos logit contributions
aid+0.036
P+0.035
 sunscreen+0.035
 merry+0.035
coat+0.035
Neg logit contributions
arter-0.020
ires-0.018
 erupt-0.017
 meant-0.017
 probably-0.017
Token
Token ID26391
Feature activation+3.04e+00

Pos logit contributions
arter+0.001
ires+0.000
 wasn+0.000
 meant+0.000
 erupt+0.000
Neg logit contributions
P-0.001
aid-0.001
 sunscreen-0.001
coat-0.001
 merry-0.000
Token
Token ID8151
Feature activation+7.79e+00

Pos logit contributions
P+0.077
coat+0.074
aid+0.074
 sunscreen+0.074
suits+0.073
Neg logit contributions
arter-0.061
 meant-0.058
ires-0.058
 erupt-0.056
 probably-0.055
Tokens
Token ID82
Feature activation+7.35e-01

Pos logit contributions
P+0.214
aid+0.205
coat+0.205
 sunscreen+0.204
suits+0.201
Neg logit contributions
arter-0.134
 meant-0.128
 erupt-0.124
ires-0.123
 probably-0.123
Token beautiful
Token ID4950
Feature activation-1.33e+00

Pos logit contributions
aid+0.018
 sunscreen+0.017
 merry+0.017
P+0.017
 gown+0.017
Neg logit contributions
arter-0.015
ires-0.014
 meant-0.014
 wasn-0.014
 probably-0.014
Token rainbow
Token ID27223
Feature activation-2.25e+00

Pos logit contributions
 wasn+0.030
arter+0.030
 meant+0.030
 probably+0.029
ires+0.028
Neg logit contributions
P-0.030
aid-0.029
 sunscreen-0.028
suits-0.027
ival-0.027
Token hair
Token ID4190
Feature activation-1.30e+00

Pos logit contributions
arter+0.030
ires+0.028
 erupt+0.026
 meant+0.026
 probably+0.026
Neg logit contributions
P-0.070
aid-0.070
 sunscreen-0.068
 merry-0.067
coat-0.067
Token.
Token ID13
Feature activation+2.90e-01

Pos logit contributions
arter+0.016
ires+0.015
 erupt+0.014
 wasn+0.014
 meant+0.014
Neg logit contributions
aid-0.041
P-0.041
 sunscreen-0.040
coat-0.040
 merry-0.040
Token follow
Token ID1061
Feature activation+8.47e-01

Pos logit contributions
arter+0.001
 wasn+0.001
 meant+0.001
ires+0.001
 probably+0.001
Neg logit contributions
P-0.001
aid-0.001
 sunscreen-0.001
 merry-0.001
coat-0.001
Token her
Token ID607
Feature activation+2.30e+00

Pos logit contributions
P+0.025
aid+0.025
suits+0.024
 sunscreen+0.024
coat+0.024
Neg logit contributions
 wasn-0.013
arter-0.013
 meant-0.012
 probably-0.012
ires-0.012
Token mom
Token ID1995
Feature activation+2.08e+00

Pos logit contributions
P+0.065
aid+0.065
 sunscreen+0.063
 merry+0.062
suits+0.061
Neg logit contributions
arter-0.038
 wasn-0.036
 meant-0.036
ires-0.035
 probably-0.035
Tokenâ
Token ID22940
Feature activation-3.59e-01

Pos logit contributions
P+0.057
aid+0.056
 sunscreen+0.056
coat+0.056
suits+0.055
Neg logit contributions
arter-0.036
ires-0.034
 wasn-0.031
 probably-0.031
 meant-0.031
Token
Token ID26391
Feature activation+3.12e+00

Pos logit contributions
arter+0.007
 wasn+0.007
 meant+0.007
 probably+0.007
ires+0.007
Neg logit contributions
P-0.009
aid-0.009
 sunscreen-0.008
 merry-0.008
suits-0.008
Token
Token ID8151
Feature activation+7.78e+00

Pos logit contributions
P+0.070
suits+0.067
coat+0.067
aid+0.065
 merry+0.065
Neg logit contributions
arter-0.073
ires-0.070
 meant-0.069
 erupt-0.066
 probably-0.065
Tokens
Token ID82
Feature activation+1.37e+00

Pos logit contributions
P+0.213
aid+0.203
coat+0.201
 sunscreen+0.199
suits+0.198
Neg logit contributions
arter-0.140
 meant-0.133
ires-0.131
 probably-0.130
 erupt-0.129
Token instructions
Token ID7729
Feature activation+3.62e-02

Pos logit contributions
P+0.038
aid+0.037
 sunscreen+0.037
 merry+0.036
suits+0.035
Neg logit contributions
arter-0.024
 meant-0.022
 wasn-0.022
ires-0.022
 probably-0.022
Token.
Token ID13
Feature activation+6.19e-01

Pos logit contributions
aid+0.001
P+0.001
 sunscreen+0.001
coat+0.001
suits+0.001
Neg logit contributions
arter-0.000
ires-0.000
 erupt-0.000
 wasn-0.000
 probably-0.000
Token<|endoftext|>
Token ID50256
Feature activation-2.12e-02

Pos logit contributions
P+0.017
aid+0.017
 sunscreen+0.017
suits+0.016
coat+0.016
Neg logit contributions
arter-0.011
 erupt-0.010
ires-0.010
 meant-0.010
 probably-0.009
TokenOne
Token ID3198
Feature activation+1.90e+00

Pos logit contributions
arter+0.000
ires+0.000
 erupt+0.000
 probably+0.000
rated+0.000
Neg logit contributions
P-0.001
aid-0.000
suits-0.000
 sunscreen-0.000
ival-0.000
Token for
Token ID329
Feature activation-8.33e-02

Pos logit contributions
P+0.051
aid+0.051
 sunscreen+0.050
 merry+0.049
suits+0.049
Neg logit contributions
arter-0.019
ires-0.018
 wasn-0.017
 meant-0.017
 probably-0.017
Token Strip
Token ID18508
Feature activation-3.17e-02

Pos logit contributions
arter+0.002
 wasn+0.002
 meant+0.002
ires+0.002
 probably+0.002
Neg logit contributions
P-0.002
aid-0.002
 sunscreen-0.002
 merry-0.002
suits-0.002
Tokenes
Token ID274
Feature activation+1.34e+00

Pos logit contributions
arter+0.001
 meant+0.001
ires+0.001
 wasn+0.001
 probably+0.001
Neg logit contributions
P-0.001
aid-0.001
suits-0.001
 sunscreen-0.001
ival-0.001
Tokenâ
Token ID22940
Feature activation-3.08e-01

Pos logit contributions
P+0.041
aid+0.041
 sunscreen+0.040
 merry+0.039
suits+0.039
Neg logit contributions
arter-0.018
 wasn-0.017
ires-0.017
 meant-0.017
 probably-0.017
Token
Token ID26391
Feature activation+2.94e+00

Pos logit contributions
arter+0.007
 wasn+0.007
 meant+0.006
ires+0.006
 probably+0.006
Neg logit contributions
P-0.007
aid-0.007
 sunscreen-0.007
coat-0.007
 merry-0.007
Token
Token ID8151
Feature activation+7.77e+00

Pos logit contributions
P+0.068
suits+0.067
coat+0.065
ival+0.065
 merry+0.064
Neg logit contributions
arter-0.066
ires-0.063
 meant-0.062
rated-0.061
 probably-0.059
Token family
Token ID1641
Feature activation+2.62e+00

Pos logit contributions
P+0.196
suits+0.190
aid+0.188
ival+0.187
coat+0.187
Neg logit contributions
arter-0.151
 meant-0.143
 probably-0.142
 matter-0.141
rated-0.141
Token.
Token ID13
Feature activation+6.52e-01

Pos logit contributions
aid+0.084
P+0.083
 sunscreen+0.081
suits+0.080
 merry+0.080
Neg logit contributions
arter-0.034
ires-0.031
 probably-0.029
 wasn-0.029
 meant-0.029
Token Spot
Token ID15899
Feature activation+2.04e+00

Pos logit contributions
P+0.017
aid+0.017
 sunscreen+0.017
ival+0.017
suits+0.017
Neg logit contributions
arter-0.012
rated-0.011
 meant-0.011
 wasn-0.011
 probably-0.011
Token finally
Token ID3443
Feature activation-7.84e-03

Pos logit contributions
aid+0.065
P+0.064
 sunscreen+0.063
 merry+0.060
coat+0.060
Neg logit contributions
 wasn-0.028
 meant-0.027
arter-0.025
 probably-0.025
 He-0.024
Token found
Token ID1043
Feature activation+2.51e-01

Pos logit contributions
arter+0.000
 wasn+0.000
 meant+0.000
ires+0.000
 probably+0.000
Neg logit contributions
P-0.000
aid-0.000
 sunscreen-0.000
 merry-0.000
suits-0.000
Token basket
Token ID7988
Feature activation+1.98e+00

Pos logit contributions
aid+0.051
P+0.049
rob+0.042
coat+0.042
ival+0.042
Neg logit contributions
 wasn-0.071
 probably-0.070
arter-0.070
ires-0.069
 erupt-0.067
Token and
Token ID290
Feature activation-4.02e-02

Pos logit contributions
P+0.052
aid+0.051
 sunscreen+0.049
 merry+0.048
suits+0.048
Neg logit contributions
arter-0.036
 wasn-0.035
 meant-0.035
ires-0.034
 probably-0.034
Token let
Token ID1309
Feature activation+1.12e+00

Pos logit contributions
arter+0.001
 wasn+0.001
 meant+0.001
ires+0.001
 probably+0.001
Neg logit contributions
P-0.001
aid-0.001
 sunscreen-0.001
 merry-0.001
suits-0.001
Tokenâ
Token ID22940
Feature activation+4.06e-01

Pos logit contributions
P+0.029
 sunscreen+0.028
aid+0.028
ival+0.026
 merry+0.026
Neg logit contributions
arter-0.022
ires-0.021
 erupt-0.020
 meant-0.020
 wasn-0.019
Token
Token ID26391
Feature activation+2.99e+00

Pos logit contributions
P+0.010
aid+0.010
 sunscreen+0.010
 merry+0.010
suits+0.010
Neg logit contributions
arter-0.008
 wasn-0.008
 meant-0.008
 probably-0.007
 erupt-0.007
Token
Token ID8151
Feature activation+7.73e+00

Pos logit contributions
P+0.076
aid+0.074
 sunscreen+0.073
suits+0.072
coat+0.072
Neg logit contributions
arter-0.059
ires-0.056
 meant-0.056
 wasn-0.054
 probably-0.054
Tokens
Token ID82
Feature activation+1.86e+00

Pos logit contributions
P+0.188
aid+0.185
 sunscreen+0.178
suits+0.175
coat+0.174
Neg logit contributions
arter-0.160
 meant-0.153
ires-0.151
 probably-0.151
 wasn-0.149
Token go
Token ID467
Feature activation+5.17e+00

Pos logit contributions
P+0.042
aid+0.042
 sunscreen+0.041
 merry+0.038
coat+0.038
Neg logit contributions
arter-0.041
ires-0.039
 meant-0.039
 wasn-0.038
 erupt-0.037
Token home
Token ID1363
Feature activation+9.60e-01

Pos logit contributions
aid+0.151
P+0.150
 merry+0.142
 sunscreen+0.139
ival+0.136
Neg logit contributions
 He-0.081
arter-0.078
 wasn-0.075
 probably-0.074
 erupt-0.073
Token to
Token ID284
Feature activation+8.32e-01

Pos logit contributions
P+0.029
aid+0.029
 sunscreen+0.028
 merry+0.028
suits+0.028
Neg logit contributions
arter-0.014
 wasn-0.013
 meant-0.013
ires-0.013
 probably-0.012
Token show
Token ID905
Feature activation-2.78e-01

Pos logit contributions
P+0.019
 sunscreen+0.019
aid+0.019
 merry+0.018
 gown+0.017
Neg logit contributions
arter-0.018
 meant-0.017
 wasn-0.017
ires-0.017
 probably-0.016
Token when
Token ID618
Feature activation-2.31e+00

Pos logit contributions
arter+0.027
 meant+0.025
rated+0.024
 erupt+0.024
 wasn+0.024
Neg logit contributions
P-0.043
 sunscreen-0.040
coat-0.039
aid-0.039
suits-0.038
Token she
Token ID673
Feature activation-9.02e-01

Pos logit contributions
arter+0.044
 meant+0.042
 wasn+0.042
ires+0.042
 probably+0.041
Neg logit contributions
P-0.060
aid-0.059
 sunscreen-0.058
 merry-0.056
coat-0.055
Token spotted
Token ID13489
Feature activation-2.22e+00

Pos logit contributions
 wasn+0.021
 meant+0.019
arter+0.019
 probably+0.019
 erupt+0.018
Neg logit contributions
aid-0.021
 sunscreen-0.020
 merry-0.019
P-0.019
ival-0.019
Token something
Token ID1223
Feature activation-1.79e+00

Pos logit contributions
 wasn+0.040
 meant+0.040
 probably+0.039
arter+0.038
 He+0.038
Neg logit contributions
suits-0.060
aid-0.057
P-0.057
coat-0.056
ival-0.056
Token gl
Token ID1278
Feature activation-5.05e+00

Pos logit contributions
arter+0.041
 meant+0.040
 wasn+0.040
 probably+0.039
ires+0.039
Neg logit contributions
P-0.039
aid-0.039
 sunscreen-0.037
suits-0.036
 merry-0.036
Tokenist
Token ID396
Feature activation-7.44e+00

Pos logit contributions
arter+0.083
 wasn+0.082
 meant+0.081
 probably+0.078
 erupt+0.077
Neg logit contributions
aid-0.143
P-0.142
 sunscreen-0.135
coat-0.134
suits-0.133
Tokenening
Token ID3101
Feature activation-4.13e+00

Pos logit contributions
arter+0.117
ires+0.102
 Next+0.098
 erupt+0.094
uders+0.093
Neg logit contributions
 sunscreen-0.216
P-0.211
 merry-0.210
aid-0.209
 sunglasses-0.204
Token in
Token ID287
Feature activation-2.11e+00

Pos logit contributions
arter+0.107
ires+0.099
 Next+0.097
 hasn+0.097
 wasn+0.097
Neg logit contributions
P-0.082
 sunscreen-0.074
 merry-0.072
coat-0.072
aid-0.072
Token the
Token ID262
Feature activation-2.50e-01

Pos logit contributions
arter+0.039
 wasn+0.037
ires+0.036
 meant+0.035
 probably+0.035
Neg logit contributions
P-0.057
 sunscreen-0.056
aid-0.056
 merry-0.053
suits-0.051
Token grass
Token ID8701
Feature activation-2.34e+00

Pos logit contributions
 wasn+0.004
 He+0.004
 probably+0.004
 meant+0.004
arter+0.004
Neg logit contributions
aid-0.007
ival-0.007
coat-0.007
P-0.007
suits-0.007
Token.
Token ID13
Feature activation-1.04e+00

Pos logit contributions
arter+0.048
 Andy+0.046
 God+0.046
rated+0.046
 Next+0.046
Neg logit contributions
P-0.058
coat-0.054
suits-0.051
 sunscreen-0.050
 purse-0.050
Token a
Token ID257
Feature activation+2.33e+00

Pos logit contributions
 meant+0.014
 wasn+0.014
 erupt+0.013
ires+0.013
 He+0.013
Neg logit contributions
P-0.017
aid-0.016
coat-0.016
 sunscreen-0.014
ugg-0.014
Token sign
Token ID1051
Feature activation-2.53e+00

Pos logit contributions
aid+0.063
P+0.060
 sunscreen+0.057
coat+0.056
shirts+0.054
Neg logit contributions
 erupt-0.045
 wasn-0.044
arter-0.044
 ending-0.043
 probably-0.043
Token that
Token ID326
Feature activation-1.10e+00

Pos logit contributions
 meant+0.049
arter+0.049
 wasn+0.048
ires+0.048
 probably+0.047
Neg logit contributions
P-0.064
aid-0.063
 sunscreen-0.061
coat-0.059
 merry-0.058
Token says
Token ID1139
Feature activation-2.16e+00

Pos logit contributions
arter+0.022
rated+0.020
 wasn+0.020
 probably+0.020
 ending+0.019
Neg logit contributions
P-0.027
 sunscreen-0.027
aid-0.027
suits-0.026
 merry-0.026
Token "
Token ID366
Feature activation+2.36e+00

Pos logit contributions
arter+0.026
rated+0.024
ires+0.023
 probably+0.022
 wasn+0.022
Neg logit contributions
P-0.069
aid-0.069
 sunscreen-0.068
 merry-0.067
ival-0.067
Tokentem
Token ID11498
Feature activation-7.26e+00

Pos logit contributions
 sunscreen+0.048
aid+0.048
P+0.047
 merry+0.045
coat+0.044
Neg logit contributions
 meant-0.056
 wasn-0.054
arter-0.053
ires-0.053
 erupt-0.052
Tokenple
Token ID1154
Feature activation-3.27e+00

Pos logit contributions
arter+0.142
 wasn+0.138
 meant+0.137
ires+0.134
 probably+0.133
Neg logit contributions
P-0.183
aid-0.180
 sunscreen-0.174
 merry-0.169
suits-0.168
Token".
Token ID1911
Feature activation-6.06e-01

Pos logit contributions
arter+0.058
 wasn+0.055
 meant+0.054
 probably+0.053
ires+0.053
Neg logit contributions
P-0.088
aid-0.087
 sunscreen-0.085
 merry-0.083
suits-0.082
Token They
Token ID1119
Feature activation+1.94e-01

Pos logit contributions
arter+0.011
 wasn+0.010
 probably+0.010
 meant+0.010
rated+0.010
Neg logit contributions
P-0.016
 sunscreen-0.016
suits-0.016
aid-0.016
 merry-0.015
Token ask
Token ID1265
Feature activation-7.88e-01

Pos logit contributions
P+0.005
aid+0.005
 sunscreen+0.004
 merry+0.004
coat+0.004
Neg logit contributions
arter-0.004
 wasn-0.004
 meant-0.004
 probably-0.004
ires-0.004
Token mom
Token ID1995
Feature activation+2.10e+00

Pos logit contributions
arter+0.014
 wasn+0.013
 meant+0.013
ires+0.013
 probably+0.013
Neg logit contributions
P-0.021
aid-0.021
 sunscreen-0.020
 merry-0.020
suits-0.020
Token that
Token ID326
Feature activation-9.70e-01

Pos logit contributions
arter+0.047
ires+0.045
 Next+0.044
 wasn+0.042
 probably+0.042
Neg logit contributions
aid-0.078
 merry-0.074
 gown-0.074
P-0.073
 sunscreen-0.072
Token said
Token ID531
Feature activation-1.67e+00

Pos logit contributions
arter+0.027
ires+0.025
rated+0.024
rum+0.023
uders+0.023
Neg logit contributions
P-0.019
aid-0.018
 merry-0.018
 band-0.017
 gown-0.017
Token "
Token ID366
Feature activation+2.07e+00

Pos logit contributions
rated+0.022
arter+0.022
ires+0.021
 matter+0.019
 probably+0.019
Neg logit contributions
aid-0.051
P-0.050
 sunscreen-0.050
 merry-0.049
 gown-0.049
TokenSelf
Token ID24704
Feature activation-4.62e+00

Pos logit contributions
P+0.042
aid+0.041
 sunscreen+0.038
coat+0.037
 merry+0.037
Neg logit contributions
arter-0.053
ires-0.050
 wasn-0.050
 probably-0.050
 meant-0.049
Token-
Token ID12
Feature activation-4.67e-01

Pos logit contributions
arter+0.099
 wasn+0.095
 meant+0.095
ires+0.093
 probably+0.093
Neg logit contributions
P-0.107
aid-0.106
 sunscreen-0.103
suits-0.100
 merry-0.098
Tokendestruct
Token ID35678
Feature activation-7.20e+00

Pos logit contributions
arter+0.011
 wasn+0.010
 meant+0.010
ires+0.010
 probably+0.010
Neg logit contributions
P-0.010
aid-0.010
 sunscreen-0.010
 merry-0.009
coat-0.009
Token activated
Token ID13906
Feature activation-1.92e+00

Pos logit contributions
arter+0.156
 ending+0.144
 He+0.143
 wasn+0.139
 meant+0.137
Neg logit contributions
suits-0.154
P-0.148
ival-0.145
 merry-0.141
aid-0.138
Token.
Token ID13
Feature activation-3.96e-01

Pos logit contributions
arter+0.038
 ending+0.034
 probably+0.033
ires+0.033
 wasn+0.033
Neg logit contributions
P-0.048
 sunscreen-0.046
suits-0.045
aid-0.045
 gown-0.043
Token Countdown
Token ID47555
Feature activation-2.51e+00

Pos logit contributions
arter+0.007
rated+0.006
 ending+0.006
 wasn+0.006
 probably+0.006
Neg logit contributions
P-0.011
aid-0.011
ival-0.010
 sunscreen-0.010
 gown-0.010
Token begins
Token ID6140
Feature activation+4.71e-01

Pos logit contributions
arter+0.047
 wasn+0.044
 probably+0.043
 matter+0.043
 Andy+0.043
Neg logit contributions
ival-0.061
suits-0.060
P-0.059
aid-0.058
 gown-0.055
Token."
Token ID526
Feature activation+3.85e-01

Pos logit contributions
aid+0.013
P+0.013
 sunscreen+0.012
suits+0.012
 gown+0.012
Neg logit contributions
arter-0.008
 meant-0.007
ires-0.007
 ending-0.007
 wasn-0.007
Token adm
Token ID6178
Feature activation-4.30e+00

Pos logit contributions
arter+0.014
 meant+0.013
 wasn+0.013
ires+0.013
 probably+0.012
Neg logit contributions
P-0.021
aid-0.021
 sunscreen-0.021
 merry-0.020
coat-0.020
Tokeniring
Token ID3428
Feature activation-4.40e+00

Pos logit contributions
arter+0.039
 wasn+0.038
 Andy+0.037
 hasn+0.035
 ending+0.035
Neg logit contributions
P-0.159
aid-0.152
 sunscreen-0.148
 merry-0.141
suits-0.141
Token how
Token ID703
Feature activation+4.64e+00

Pos logit contributions
arter+0.078
 meant+0.072
 wasn+0.071
 erupt+0.071
ires+0.071
Neg logit contributions
aid-0.118
 sunscreen-0.118
P-0.117
 merry-0.113
coat-0.110
Token it
Token ID340
Feature activation-4.32e+00

Pos logit contributions
coat+0.108
aid+0.108
P+0.106
 sunscreen+0.103
 gown+0.103
Neg logit contributions
 wasn-0.100
ires-0.096
 He-0.096
 meant-0.095
 probably-0.093
Token gl
Token ID1278
Feature activation-4.77e+00

Pos logit contributions
arter+0.100
ires+0.094
 erupt+0.090
 probably+0.089
 meant+0.088
Neg logit contributions
P-0.096
aid-0.095
 sunscreen-0.090
coat-0.089
 merry-0.089
Tokenist
Token ID396
Feature activation-7.14e+00

Pos logit contributions
arter+0.081
 meant+0.077
 wasn+0.077
 erupt+0.075
 probably+0.075
Neg logit contributions
aid-0.134
P-0.132
suits-0.125
coat-0.125
 sunscreen-0.125
Tokenened
Token ID2945
Feature activation-3.55e+00

Pos logit contributions
arter+0.118
ires+0.106
 erupt+0.104
 probably+0.101
 meant+0.100
Neg logit contributions
P-0.203
aid-0.203
 sunscreen-0.197
 merry-0.193
suits-0.190
Token in
Token ID287
Feature activation-1.24e+00

Pos logit contributions
arter+0.066
ires+0.060
 erupt+0.059
 meant+0.058
 wasn+0.058
Neg logit contributions
P-0.093
aid-0.091
 sunscreen-0.090
 merry-0.089
coat-0.088
Token the
Token ID262
Feature activation+5.32e-01

Pos logit contributions
arter+0.021
 wasn+0.021
 meant+0.021
ires+0.020
 probably+0.020
Neg logit contributions
P-0.034
aid-0.034
 sunscreen-0.032
 merry-0.032
suits-0.032
Token sun
Token ID4252
Feature activation-2.55e+00

Pos logit contributions
aid+0.014
P+0.013
coat+0.013
ival+0.013
suits+0.013
Neg logit contributions
 wasn-0.010
 He-0.010
 meant-0.010
 probably-0.010
 ending-0.009
Token.
Token ID13
Feature activation-2.25e-01

Pos logit contributions
arter+0.048
ires+0.043
 probably+0.040
 erupt+0.040
 Next+0.039
Neg logit contributions
coat-0.070
 sunscreen-0.068
 gown-0.067
P-0.066
aid-0.066
Token how
Token ID703
Feature activation+4.49e+00

Pos logit contributions
 wasn+0.028
 meant+0.028
 probably+0.027
arter+0.026
 He+0.026
Neg logit contributions
P-0.036
suits-0.036
aid-0.036
ival-0.034
coat-0.034
Token the
Token ID262
Feature activation-2.96e+00

Pos logit contributions
P+0.099
aid+0.096
coat+0.093
 sunscreen+0.093
suits+0.092
Neg logit contributions
 wasn-0.102
 meant-0.100
arter-0.099
 probably-0.098
ires-0.098
Token snow
Token ID6729
Feature activation-3.18e+00

Pos logit contributions
arter+0.070
ires+0.066
 meant+0.063
 wasn+0.063
 probably+0.062
Neg logit contributions
aid-0.064
P-0.064
 sunscreen-0.064
 merry-0.062
coat-0.059
Token was
Token ID373
Feature activation-3.01e+00

Pos logit contributions
arter+0.068
ires+0.066
 Next+0.056
rum+0.053
iring+0.053
Neg logit contributions
coat-0.082
 merry-0.077
aid-0.077
 sunscreen-0.076
co-0.075
Token gl
Token ID1278
Feature activation-4.59e+00

Pos logit contributions
ires+0.064
arter+0.063
 Next+0.058
uders+0.057
 Andy+0.053
Neg logit contributions
 sunscreen-0.073
coat-0.071
 merry-0.069
P-0.067
 gown-0.067
Tokenist
Token ID396
Feature activation-7.11e+00

Pos logit contributions
arter+0.083
 meant+0.082
 wasn+0.082
 He+0.078
 probably+0.078
Neg logit contributions
aid-0.123
P-0.119
coat-0.116
suits-0.114
 sunscreen-0.113
Tokenening
Token ID3101
Feature activation-3.78e+00

Pos logit contributions
arter+0.119
ires+0.110
 erupt+0.104
 Next+0.100
 probably+0.098
Neg logit contributions
 sunscreen-0.197
aid-0.196
P-0.195
 merry-0.194
coat-0.190
Token in
Token ID287
Feature activation-1.48e+00

Pos logit contributions
arter+0.076
ires+0.071
 erupt+0.067
 probably+0.066
 wasn+0.066
Neg logit contributions
P-0.092
 sunscreen-0.091
aid-0.091
 merry-0.089
coat-0.089
Token the
Token ID262
Feature activation+2.56e-01

Pos logit contributions
arter+0.027
 wasn+0.025
 meant+0.025
ires+0.025
 probably+0.024
Neg logit contributions
P-0.040
aid-0.039
 sunscreen-0.039
 merry-0.037
suits-0.037
Token sunlight
Token ID19606
Feature activation-2.91e+00

Pos logit contributions
aid+0.007
P+0.007
ival+0.006
 sunscreen+0.006
suits+0.006
Neg logit contributions
 wasn-0.005
 meant-0.005
arter-0.005
 He-0.005
 probably-0.005
Token.
Token ID13
Feature activation-6.85e-01

Pos logit contributions
arter+0.055
ires+0.051
 Next+0.048
 Andy+0.047
uders+0.046
Neg logit contributions
coat-0.078
 sunscreen-0.075
 gown-0.074
aid-0.074
P-0.073
Token blue
Token ID4171
Feature activation-4.13e+00

Pos logit contributions
arter+0.063
 wasn+0.061
 meant+0.061
ires+0.060
 probably+0.059
Neg logit contributions
P-0.077
aid-0.076
 sunscreen-0.074
 merry-0.072
suits-0.071
Token and
Token ID290
Feature activation-2.93e+00

Pos logit contributions
arter+0.089
 wasn+0.084
 erupt+0.083
ires+0.083
 meant+0.083
Neg logit contributions
P-0.096
 sunscreen-0.094
aid-0.094
 merry-0.091
coat-0.089
Token the
Token ID262
Feature activation-2.40e+00

Pos logit contributions
arter+0.070
ires+0.066
 wasn+0.066
 meant+0.066
 probably+0.065
Neg logit contributions
P-0.061
aid-0.061
 sunscreen-0.059
 merry-0.059
suits-0.056
Token sun
Token ID4252
Feature activation-4.11e+00

Pos logit contributions
 wasn+0.048
 meant+0.048
 probably+0.047
arter+0.046
 ending+0.046
Neg logit contributions
aid-0.060
P-0.060
 sunscreen-0.054
coat-0.054
suits-0.054
Token was
Token ID373
Feature activation-3.38e+00

Pos logit contributions
arter+0.097
ires+0.093
uders+0.084
rum+0.083
 Next+0.083
Neg logit contributions
 merry-0.094
coat-0.090
 sunscreen-0.084
 gown-0.084
P-0.083
Token just
Token ID655
Feature activation-7.07e+00

Pos logit contributions
arter+0.082
ires+0.078
 wasn+0.075
 He+0.075
 meant+0.074
Neg logit contributions
 sunscreen-0.069
 merry-0.068
P-0.067
aid-0.065
coat-0.064
Token right
Token ID826
Feature activation-4.23e+00

Pos logit contributions
arter+0.172
ires+0.158
uders+0.153
 arose+0.152
iny+0.152
Neg logit contributions
 sunscreen-0.148
 merry-0.143
 sunglasses-0.128
aid-0.127
P-0.125
Token.
Token ID13
Feature activation-1.13e-01

Pos logit contributions
arter+0.064
 wasn+0.057
 erupt+0.057
 meant+0.056
ires+0.056
Neg logit contributions
P-0.127
 sunscreen-0.123
aid-0.121
suits-0.120
 merry-0.120
Token Everyone
Token ID11075
Feature activation+1.86e-01

Pos logit contributions
arter+0.002
rated+0.002
 erupt+0.002
 meant+0.002
 probably+0.002
Neg logit contributions
P-0.003
 sunscreen-0.003
aid-0.003
ival-0.003
suits-0.003
Token was
Token ID373
Feature activation-2.40e-01

Pos logit contributions
P+0.005
aid+0.005
 sunscreen+0.005
 merry+0.005
coat+0.005
Neg logit contributions
arter-0.003
 wasn-0.003
 meant-0.003
ires-0.003
 probably-0.003
Token so
Token ID523
Feature activation+9.51e-01

Pos logit contributions
 meant+0.005
 wasn+0.005
 probably+0.005
arter+0.005
 erupt+0.005
Neg logit contributions
P-0.006
aid-0.006
suits-0.005
 sunscreen-0.005
coat-0.005
Token
Token ID129
Feature activation-3.50e+00

Pos logit contributions
arter+0.006
 meant+0.006
ires+0.006
 probably+0.006
 erupt+0.006
Neg logit contributions
P-0.010
aid-0.010
coat-0.009
suits-0.009
 sunscreen-0.009
Token
Token ID241
Feature activation+2.16e+00

Pos logit contributions
arter+0.071
 meant+0.068
 wasn+0.068
ires+0.066
 probably+0.065
Neg logit contributions
P-0.087
aid-0.085
 sunscreen-0.083
 merry-0.081
coat-0.080
TokenThis
Token ID1212
Feature activation-9.40e-01

Pos logit contributions
P+0.053
aid+0.051
 sunscreen+0.050
 merry+0.049
coat+0.048
Neg logit contributions
arter-0.047
 erupt-0.043
ires-0.042
 meant-0.042
 wasn-0.042
Token box
Token ID3091
Feature activation-3.24e+00

Pos logit contributions
 wasn+0.011
 meant+0.011
arter+0.011
 probably+0.011
 erupt+0.010
Neg logit contributions
aid-0.031
P-0.030
 sunscreen-0.029
suits-0.029
 merry-0.029
Token is
Token ID318
Feature activation-3.38e+00

Pos logit contributions
arter+0.048
ires+0.043
 meant+0.042
 erupt+0.042
 wasn+0.041
Neg logit contributions
P-0.095
aid-0.095
 sunscreen-0.092
 merry-0.092
suits-0.092
Token probably
Token ID2192
Feature activation-7.04e+00

Pos logit contributions
arter+0.077
ires+0.073
 wasn+0.073
 meant+0.072
 probably+0.071
Neg logit contributions
P-0.074
aid-0.073
 sunscreen-0.072
 merry-0.069
suits-0.068
Token too
Token ID1165
Feature activation-2.34e+00

Pos logit contributions
arter+0.153
ires+0.146
 He+0.135
 erupt+0.134
 meant+0.134
Neg logit contributions
 sunscreen-0.160
P-0.157
 merry-0.153
aid-0.152
coat-0.150
Token light
Token ID1657
Feature activation-5.09e+00

Pos logit contributions
arter+0.057
 wasn+0.055
ires+0.054
 meant+0.054
 probably+0.053
Neg logit contributions
P-0.048
aid-0.047
 sunscreen-0.046
suits-0.045
 merry-0.044
Token to
Token ID284
Feature activation-2.08e+00

Pos logit contributions
arter+0.087
ires+0.080
 wasn+0.079
 probably+0.078
 erupt+0.077
Neg logit contributions
P-0.137
aid-0.137
 sunscreen-0.136
coat-0.133
 merry-0.132
Token weigh
Token ID10164
Feature activation-6.55e-01

Pos logit contributions
arter+0.043
ires+0.042
 wasn+0.042
 meant+0.041
 probably+0.040
Neg logit contributions
aid-0.049
P-0.049
 sunscreen-0.048
 merry-0.047
 gown-0.045
Token an
Token ID281
Feature activation+9.77e-01

Pos logit contributions
arter+0.012
 wasn+0.011
 meant+0.011
ires+0.011
 probably+0.011
Neg logit contributions
P-0.018
aid-0.017
 sunscreen-0.017
 merry-0.017
suits-0.016
Token the
Token ID262
Feature activation-3.11e+00

Pos logit contributions
arter+0.026
rated+0.026
cker+0.023
 erupt+0.022
ires+0.022
Neg logit contributions
 sunscreen-0.038
aid-0.037
P-0.037
 merry-0.037
ival-0.036
Token water
Token ID1660
Feature activation-2.32e+00

Pos logit contributions
arter+0.067
 wasn+0.065
 meant+0.065
 probably+0.063
ires+0.063
Neg logit contributions
P-0.072
aid-0.072
 sunscreen-0.068
 merry-0.066
suits-0.066
Token was
Token ID373
Feature activation-2.47e+00

Pos logit contributions
arter+0.044
ires+0.042
 Next+0.038
 God+0.036
uders+0.035
Neg logit contributions
coat-0.061
 merry-0.060
 sunscreen-0.060
P-0.058
aid-0.058
Token very
Token ID845
Feature activation-3.97e-01

Pos logit contributions
arter+0.058
ires+0.056
 Next+0.052
 wasn+0.051
rated+0.051
Neg logit contributions
 sunscreen-0.054
 merry-0.052
aid-0.051
coat-0.050
P-0.050
Token sw
Token ID1509
Feature activation-2.96e+00

Pos logit contributions
arter+0.011
ires+0.010
 wasn+0.010
 meant+0.010
uders+0.010
Neg logit contributions
 sunscreen-0.007
P-0.007
 merry-0.007
aid-0.007
suits-0.007
Tokenir
Token ID343
Feature activation-7.01e+00

Pos logit contributions
arter+0.068
 meant+0.068
 wasn+0.067
 probably+0.064
 erupt+0.064
Neg logit contributions
P-0.064
aid-0.062
 sunscreen-0.061
suits-0.060
coat-0.059
Tokenly
Token ID306
Feature activation-1.44e+00

Pos logit contributions
arter+0.140
 hasn+0.137
 God+0.132
 Mother+0.131
 erupt+0.129
Neg logit contributions
coat-0.162
aid-0.158
co-0.150
P-0.149
 sunscreen-0.147
Token and
Token ID290
Feature activation-1.96e+00

Pos logit contributions
arter+0.025
ires+0.024
 wasn+0.024
 meant+0.024
 probably+0.023
Neg logit contributions
P-0.039
aid-0.038
 sunscreen-0.038
 merry-0.037
coat-0.036
Token Arnold
Token ID21418
Feature activation-2.19e+00

Pos logit contributions
arter+0.036
ires+0.035
rated+0.031
 probably+0.031
 erupt+0.031
Neg logit contributions
 sunscreen-0.051
 merry-0.051
aid-0.051
P-0.049
coat-0.048
Token felt
Token ID2936
Feature activation-1.16e-01

Pos logit contributions
arter+0.044
ires+0.041
 erupt+0.039
 Next+0.037
 ending+0.036
Neg logit contributions
P-0.057
aid-0.055
suits-0.053
 sunscreen-0.053
 merry-0.052
Token funny
Token ID8258
Feature activation-1.23e+00

Pos logit contributions
arter+0.003
 wasn+0.003
 meant+0.003
ires+0.003
 probably+0.003
Neg logit contributions
P-0.003
aid-0.003
 sunscreen-0.002
 merry-0.002
suits-0.002
Token fountain
Token ID29420
Feature activation-6.13e-01

Pos logit contributions
aid+0.007
ival+0.006
P+0.006
coat+0.006
Aud+0.006
Neg logit contributions
 wasn-0.007
 ending-0.007
arter-0.006
 probably-0.006
 He-0.006
Token was
Token ID373
Feature activation-1.24e+00

Pos logit contributions
arter+0.010
ires+0.009
 probably+0.008
 Andy+0.008
rated+0.008
Neg logit contributions
P-0.017
aid-0.017
 merry-0.017
 sunscreen-0.017
suits-0.017
Token not
Token ID407
Feature activation-4.15e+00

Pos logit contributions
arter+0.032
ires+0.030
 wasn+0.029
 He+0.029
rated+0.029
Neg logit contributions
 sunscreen-0.024
aid-0.024
P-0.023
 merry-0.022
coat-0.021
Token smart
Token ID4451
Feature activation+2.69e-01

Pos logit contributions
arter+0.108
ires+0.105
 wasn+0.100
 erupt+0.098
 probably+0.098
Neg logit contributions
aid-0.079
 sunscreen-0.077
P-0.074
 merry-0.072
coat-0.071
Token.
Token ID13
Feature activation+8.21e-02

Pos logit contributions
aid+0.007
 sunscreen+0.007
coat+0.007
suits+0.007
P+0.007
Neg logit contributions
arter-0.005
ires-0.004
 probably-0.004
 erupt-0.004
 Andy-0.004
Token List
Token ID7343
Feature activation-6.94e+00

Pos logit contributions
P+0.002
aid+0.002
 Woo+0.002
 sunscreen+0.002
Of+0.002
Neg logit contributions
rated-0.002
cker-0.002
 matter-0.002
 erupt-0.002
 ending-0.002
Tokenening
Token ID3101
Feature activation-1.90e+00

Pos logit contributions
arter+0.094
 Andy+0.094
 He+0.091
 erupt+0.091
 probably+0.087
Neg logit contributions
 sunscreen-0.204
aid-0.201
suits-0.198
P-0.187
coat-0.186
Token to
Token ID284
Feature activation+4.64e-01

Pos logit contributions
arter+0.035
ires+0.032
 erupt+0.032
 probably+0.032
 meant+0.030
Neg logit contributions
P-0.049
aid-0.049
 sunscreen-0.049
suits-0.047
coat-0.047
Token Mom
Token ID11254
Feature activation+1.81e+00

Pos logit contributions
P+0.012
aid+0.012
 sunscreen+0.012
 merry+0.012
coat+0.011
Neg logit contributions
arter-0.008
 wasn-0.008
 meant-0.008
ires-0.008
 probably-0.008
Token was
Token ID373
Feature activation+3.82e-01

Pos logit contributions
aid+0.052
P+0.050
coat+0.049
suits+0.049
 sunscreen+0.049
Neg logit contributions
arter-0.030
ires-0.028
 erupt-0.026
 probably-0.026
 wasn-0.025
Token the
Token ID262
Feature activation+2.31e-01

Pos logit contributions
aid+0.009
P+0.008
 sunscreen+0.008
 merry+0.008
coat+0.008
Neg logit contributions
arter-0.009
ires-0.008
 wasn-0.008
rated-0.008
 erupt-0.008
Token all
Token ID477
Feature activation+5.33e-01

Pos logit contributions
 meant+0.008
 wasn+0.008
arter+0.008
ires+0.008
 probably+0.008
Neg logit contributions
P-0.016
aid-0.016
suits-0.016
coat-0.015
 sunscreen-0.015
Token the
Token ID262
Feature activation-1.57e+00

Pos logit contributions
 merry+0.011
aid+0.011
 sunscreen+0.011
P+0.010
 gown+0.010
Neg logit contributions
arter-0.013
uders-0.012
 probably-0.012
der-0.012
ires-0.012
Token stars
Token ID5788
Feature activation-3.48e+00

Pos logit contributions
arter+0.037
 wasn+0.035
ires+0.035
 meant+0.034
 erupt+0.034
Neg logit contributions
P-0.034
aid-0.033
 sunscreen-0.033
 merry-0.032
 gown-0.030
Token that
Token ID326
Feature activation-5.62e-01

Pos logit contributions
arter+0.066
ires+0.060
 Next+0.059
 wasn+0.058
 He+0.058
Neg logit contributions
 merry-0.088
aid-0.087
P-0.086
 sunscreen-0.086
 gown-0.084
Token tw
Token ID665
Feature activation-3.23e+00

Pos logit contributions
arter+0.012
ires+0.012
 probably+0.011
 He+0.011
 erupt+0.011
Neg logit contributions
aid-0.013
P-0.013
 merry-0.013
 sunscreen-0.012
 gown-0.012
Tokeninkle
Token ID19894
Feature activation-6.92e+00

Pos logit contributions
arter+0.055
 wasn+0.052
 meant+0.052
ires+0.051
 probably+0.051
Neg logit contributions
P-0.090
aid-0.090
 sunscreen-0.087
 merry-0.085
suits-0.084
Token.
Token ID13
Feature activation-7.88e-01

Pos logit contributions
arter+0.130
ires+0.118
 erupt+0.117
 probably+0.116
 wasn+0.115
Neg logit contributions
P-0.182
aid-0.178
 sunscreen-0.174
 merry-0.169
suits-0.169
Token Everywhere
Token ID48201
Feature activation-4.36e-01

Pos logit contributions
arter+0.016
 erupt+0.014
ires+0.014
 meant+0.013
 ending+0.013
Neg logit contributions
P-0.021
aid-0.020
 sunscreen-0.020
coat-0.020
suits-0.019
Token she
Token ID673
Feature activation-8.01e-01

Pos logit contributions
arter+0.010
ires+0.009
 erupt+0.009
 probably+0.008
rated+0.008
Neg logit contributions
P-0.011
aid-0.010
 merry-0.010
 sunscreen-0.010
coat-0.009
Token went
Token ID1816
Feature activation+2.31e+00

Pos logit contributions
arter+0.017
 wasn+0.016
ires+0.016
 meant+0.016
 probably+0.016
Neg logit contributions
P-0.019
aid-0.018
 sunscreen-0.018
 merry-0.017
suits-0.017
Token,
Token ID11
Feature activation+4.53e-01

Pos logit contributions
P+0.050
 merry+0.046
aid+0.046
ival+0.045
 sunscreen+0.045
Neg logit contributions
arter-0.056
ires-0.051
 probably-0.051
 erupt-0.050
rated-0.049
Token.
Token ID13
Feature activation-7.11e-01

Pos logit contributions
arter+0.023
 wasn+0.022
 meant+0.021
ires+0.021
 probably+0.021
Neg logit contributions
P-0.030
aid-0.030
 sunscreen-0.029
 merry-0.028
suits-0.028
Token We
Token ID775
Feature activation+8.02e-01

Pos logit contributions
arter+0.012
 ending+0.011
rated+0.011
 meant+0.011
 wasn+0.011
Neg logit contributions
aid-0.019
 sunscreen-0.019
P-0.019
ival-0.018
suits-0.018
Token can
Token ID460
Feature activation+3.51e-01

Pos logit contributions
P+0.022
aid+0.021
 sunscreen+0.021
 merry+0.020
suits+0.020
Neg logit contributions
arter-0.014
 wasn-0.014
 meant-0.014
ires-0.013
 probably-0.013
Token find
Token ID1064
Feature activation+4.62e-01

Pos logit contributions
P+0.008
aid+0.007
 sunscreen+0.007
 merry+0.007
coat+0.007
Neg logit contributions
arter-0.008
 wasn-0.008
 meant-0.008
ires-0.008
 ending-0.007
Token something
Token ID1223
Feature activation-1.15e+00

Pos logit contributions
coat+0.013
P+0.012
aid+0.012
 merry+0.012
suits+0.012
Neg logit contributions
 meant-0.008
 wasn-0.008
ires-0.008
 probably-0.008
 Andy-0.008
Token just
Token ID655
Feature activation-6.92e+00

Pos logit contributions
 meant+0.026
 wasn+0.026
arter+0.026
ires+0.026
 probably+0.025
Neg logit contributions
P-0.025
aid-0.025
 sunscreen-0.024
coat-0.023
 merry-0.023
Token as
Token ID355
Feature activation-1.83e-01

Pos logit contributions
arter+0.154
ires+0.141
 wasn+0.141
 meant+0.137
 probably+0.136
Neg logit contributions
 sunscreen-0.156
aid-0.153
P-0.150
 merry-0.145
suits-0.145
Token good
Token ID922
Feature activation-2.95e-01

Pos logit contributions
arter+0.005
 wasn+0.005
 meant+0.005
ires+0.005
 probably+0.005
Neg logit contributions
P-0.003
aid-0.003
 sunscreen-0.003
 merry-0.003
suits-0.003
Token in
Token ID287
Feature activation-1.27e+00

Pos logit contributions
arter+0.005
 wasn+0.005
 meant+0.004
 probably+0.004
ires+0.004
Neg logit contributions
P-0.008
aid-0.008
 sunscreen-0.008
suits-0.008
 merry-0.008
Token the
Token ID262
Feature activation-6.53e-02

Pos logit contributions
arter+0.022
 wasn+0.021
 meant+0.021
ires+0.021
 probably+0.021
Neg logit contributions
P-0.035
aid-0.035
 sunscreen-0.034
 merry-0.033
coat-0.033
Token '
Token ID705
Feature activation+4.97e+00

Pos logit contributions
 He+0.001
 wasn+0.001
 meant+0.001
 ending+0.001
 probably+0.001
Neg logit contributions
aid-0.002
P-0.001
ival-0.001
 merry-0.001
suits-0.001
Token your
Token ID534
Feature activation+1.76e+00

Pos logit contributions
arter+0.054
 wasn+0.052
 meant+0.052
rated+0.051
 erupt+0.051
Neg logit contributions
P-0.054
 sunscreen-0.053
aid-0.048
 gown-0.047
 merry-0.046
Token back
Token ID736
Feature activation+2.08e+00

Pos logit contributions
aid+0.044
P+0.043
ival+0.041
 sunscreen+0.039
suits+0.038
Neg logit contributions
 wasn-0.036
arter-0.036
 probably-0.035
 meant-0.034
ires-0.033
Token now
Token ID783
Feature activation+7.61e-01

Pos logit contributions
P+0.055
aid+0.054
 sunscreen+0.053
 merry+0.051
suits+0.051
Neg logit contributions
arter-0.038
 wasn-0.036
 meant-0.036
ires-0.036
 probably-0.035
Token!
Token ID0
Feature activation-8.65e-01

Pos logit contributions
P+0.019
aid+0.019
 sunscreen+0.019
coat+0.018
 merry+0.018
Neg logit contributions
arter-0.015
ires-0.014
 wasn-0.014
 meant-0.014
 probably-0.014
Token What
Token ID1867
Feature activation-8.79e-02

Pos logit contributions
arter+0.015
 erupt+0.014
 meant+0.013
rated+0.013
 ending+0.013
Neg logit contributions
aid-0.024
P-0.024
 sunscreen-0.023
coat-0.022
ival-0.022
Token must
Token ID1276
Feature activation-6.88e+00

Pos logit contributions
arter+0.002
 erupt+0.002
ires+0.002
rated+0.002
cker+0.002
Neg logit contributions
P-0.002
aid-0.002
 merry-0.002
 sunscreen-0.002
 gown-0.002
Token I
Token ID314
Feature activation+1.46e+00

Pos logit contributions
arter+0.143
ires+0.138
 erupt+0.134
der+0.133
rated+0.132
Neg logit contributions
 sunscreen-0.165
P-0.164
aid-0.157
 merry-0.154
 purse-0.150
Token do
Token ID466
Feature activation-1.48e+00

Pos logit contributions
P+0.036
aid+0.036
 sunscreen+0.035
coat+0.034
 merry+0.032
Neg logit contributions
arter-0.029
ires-0.029
 erupt-0.027
 wasn-0.027
uders-0.027
Token to
Token ID284
Feature activation-2.16e+00

Pos logit contributions
arter+0.023
 wasn+0.022
 meant+0.022
ires+0.022
 probably+0.022
Neg logit contributions
P-0.043
aid-0.043
 sunscreen-0.042
 merry-0.041
suits-0.040
Token take
Token ID1011
Feature activation+2.00e+00

Pos logit contributions
arter+0.046
 wasn+0.045
 meant+0.044
ires+0.044
 probably+0.043
Neg logit contributions
P-0.050
aid-0.050
 sunscreen-0.049
 merry-0.047
coat-0.046
Token care
Token ID1337
Feature activation+1.76e+00

Pos logit contributions
 merry+0.059
suits+0.057
aid+0.057
 sandbox+0.057
coat+0.056
Neg logit contributions
 wasn-0.031
 meant-0.030
 probably-0.029
 He-0.028
 Andy-0.028
Token box
Token ID3091
Feature activation+1.43e-01

Pos logit contributions
 meant+0.026
 wasn+0.026
 He+0.025
 probably+0.025
ires+0.024
Neg logit contributions
aid-0.023
P-0.022
 sunscreen-0.021
coat-0.020
suits-0.020
Token there
Token ID612
Feature activation-6.81e-01

Pos logit contributions
P+0.004
aid+0.004
ival+0.004
 merry+0.004
 sunscreen+0.004
Neg logit contributions
arter-0.003
rated-0.002
ires-0.002
 erupt-0.002
 probably-0.002
Token was
Token ID373
Feature activation-2.45e+00

Pos logit contributions
arter+0.018
ires+0.016
rated+0.015
arks+0.014
uders+0.014
Neg logit contributions
P-0.014
ival-0.014
 merry-0.013
 suit-0.013
 seas-0.013
Token a
Token ID257
Feature activation+1.74e+00

Pos logit contributions
 wasn+0.053
 meant+0.052
arter+0.052
 probably+0.051
 He+0.050
Neg logit contributions
P-0.058
aid-0.057
 sunscreen-0.053
suits-0.053
coat-0.052
Token surprise
Token ID5975
Feature activation-1.61e+00

Pos logit contributions
aid+0.045
P+0.044
 sunscreen+0.042
suits+0.041
coat+0.040
Neg logit contributions
 wasn-0.035
 probably-0.034
 ending-0.033
arter-0.033
 erupt-0.033
Token just
Token ID655
Feature activation-6.85e+00

Pos logit contributions
arter+0.025
ires+0.025
rated+0.024
 wasn+0.023
 meant+0.023
Neg logit contributions
P-0.046
aid-0.044
 sunscreen-0.043
 merry-0.043
coat-0.043
Token for
Token ID329
Feature activation-3.41e+00

Pos logit contributions
arter+0.157
ires+0.146
 wasn+0.144
 meant+0.140
 erupt+0.140
Neg logit contributions
 sunscreen-0.148
aid-0.144
 merry-0.143
P-0.141
 gown-0.135
Token him
Token ID683
Feature activation-8.25e-01

Pos logit contributions
rated+0.067
arter+0.064
 meant+0.062
ires+0.059
 wasn+0.059
Neg logit contributions
 sunscreen-0.089
P-0.086
aid-0.084
 merry-0.082
 seas-0.081
Token.
Token ID13
Feature activation-5.42e-01

Pos logit contributions
arter+0.010
ires+0.009
 wasn+0.009
 meant+0.009
 probably+0.009
Neg logit contributions
P-0.027
aid-0.027
 sunscreen-0.026
suits-0.026
coat-0.025
Token It
Token ID632
Feature activation-3.66e+00

Pos logit contributions
arter+0.012
rated+0.011
 ending+0.011
ires+0.011
 matter+0.011
Neg logit contributions
ival-0.013
suits-0.012
P-0.012
aid-0.012
 sunscreen-0.012
Token was
Token ID373
Feature activation-3.05e+00

Pos logit contributions
arter+0.078
ires+0.075
uders+0.063
iring+0.062
der+0.062
Neg logit contributions
P-0.091
 merry-0.091
aid-0.087
 gown-0.086
coat-0.084
Token easier
Token ID4577
Feature activation-1.01e+00

Pos logit contributions
arter+0.034
ires+0.032
 wasn+0.031
uders+0.031
 Next+0.030
Neg logit contributions
aid-0.035
 sunscreen-0.034
P-0.033
 merry-0.033
 gown-0.031
Token and
Token ID290
Feature activation-3.87e-01

Pos logit contributions
arter+0.022
 Andy+0.019
ires+0.019
 Next+0.019
uders+0.018
Neg logit contributions
 sunscreen-0.023
suits-0.022
aid-0.022
P-0.022
coat-0.021
Token more
Token ID517
Feature activation+9.05e-01

Pos logit contributions
arter+0.011
ires+0.010
 wasn+0.010
 meant+0.010
 probably+0.010
Neg logit contributions
aid-0.007
P-0.006
 sunscreen-0.006
 merry-0.006
suits-0.006
Token enjoyable
Token ID20050
Feature activation-4.69e-01

Pos logit contributions
 sunscreen+0.018
aid+0.018
 merry+0.017
 sunglasses+0.017
 fashion+0.017
Neg logit contributions
arter-0.021
cker-0.020
uders-0.019
 Andy-0.019
 probably-0.019
Token.
Token ID13
Feature activation+3.16e-01

Pos logit contributions
arter+0.007
 probably+0.006
ires+0.006
 wasn+0.006
 Andy+0.006
Neg logit contributions
 sunscreen-0.014
aid-0.013
P-0.013
suits-0.013
 merry-0.013
Token List
Token ID7343
Feature activation-6.84e+00

Pos logit contributions
aid+0.008
P+0.008
 Woo+0.007
Of+0.007
 sunscreen+0.007
Neg logit contributions
rated-0.007
arter-0.006
 wasn-0.006
cker-0.006
 meant-0.006
Tokenening
Token ID3101
Feature activation-1.79e+00

Pos logit contributions
arter+0.090
 Andy+0.082
 wasn+0.082
 He+0.082
 erupt+0.081
Neg logit contributions
 sunscreen-0.208
aid-0.208
suits-0.202
P-0.199
 merry-0.194
Token to
Token ID284
Feature activation+6.38e-01

Pos logit contributions
arter+0.033
 erupt+0.031
 meant+0.030
 probably+0.030
 wasn+0.030
Neg logit contributions
P-0.046
aid-0.046
 sunscreen-0.046
suits-0.044
 merry-0.044
Token others
Token ID1854
Feature activation+6.10e-01

Pos logit contributions
P+0.015
aid+0.015
 sunscreen+0.015
 merry+0.014
suits+0.014
Neg logit contributions
arter-0.013
 wasn-0.013
 meant-0.013
ires-0.013
 probably-0.013
Token and
Token ID290
Feature activation+3.48e-01

Pos logit contributions
aid+0.015
P+0.014
 sunscreen+0.014
suits+0.014
ival+0.014
Neg logit contributions
arter-0.013
 erupt-0.011
ires-0.011
 Andy-0.011
 probably-0.011
Token trying
Token ID2111
Feature activation+1.46e+00

Pos logit contributions
aid+0.007
 sunscreen+0.007
P+0.007
 merry+0.006
 gown+0.006
Neg logit contributions
arter-0.009
ires-0.008
 wasn-0.008
 meant-0.008
 erupt-0.008
Token there
Token ID612
Feature activation-3.40e-01

Pos logit contributions
P+0.033
aid+0.029
 merry+0.029
suits+0.029
 sunscreen+0.029
Neg logit contributions
arter-0.045
ires-0.042
 meant-0.042
 erupt-0.041
der-0.041
Token was
Token ID373
Feature activation-2.11e+00

Pos logit contributions
arter+0.008
ires+0.007
cker+0.007
rated+0.007
der+0.007
Neg logit contributions
 merry-0.008
P-0.008
 ha-0.007
 purse-0.007
 sunscreen-0.007
Token a
Token ID257
Feature activation+1.37e+00

Pos logit contributions
 meant+0.039
 wasn+0.039
 probably+0.039
 He+0.038
arter+0.038
Neg logit contributions
aid-0.055
suits-0.052
P-0.052
coat-0.051
ival-0.051
Token kind
Token ID1611
Feature activation+3.70e-01

Pos logit contributions
aid+0.041
P+0.040
 sunscreen+0.039
coat+0.038
suits+0.038
Neg logit contributions
arter-0.020
 wasn-0.020
 probably-0.019
 He-0.019
 meant-0.018
Token wood
Token ID4898
Feature activation-3.20e+00

Pos logit contributions
P+0.009
aid+0.009
 sunscreen+0.009
 merry+0.009
suits+0.009
Neg logit contributions
arter-0.007
 meant-0.007
 wasn-0.007
ires-0.007
 probably-0.007
Tokencut
Token ID8968
Feature activation-6.83e+00

Pos logit contributions
arter+0.066
ires+0.064
 meant+0.063
 wasn+0.062
 erupt+0.062
Neg logit contributions
P-0.078
aid-0.076
coat-0.072
 sunscreen-0.072
 merry-0.071
Tokenter
Token ID353
Feature activation-2.47e+00

Pos logit contributions
arter+0.172
ires+0.167
 erupt+0.166
 topple+0.162
 explode+0.161
Neg logit contributions
P-0.130
co-0.125
 merry-0.122
 band-0.120
aid-0.119
Token who
Token ID508
Feature activation-4.35e-01

Pos logit contributions
arter+0.061
ires+0.059
 meant+0.058
 erupt+0.058
 wasn+0.057
Neg logit contributions
P-0.052
aid-0.049
 sunscreen-0.045
coat-0.044
 merry-0.044
Token lived
Token ID5615
Feature activation-4.41e-01

Pos logit contributions
arter+0.008
ires+0.008
uders+0.007
 erupt+0.007
 probably+0.007
Neg logit contributions
P-0.012
aid-0.012
 sunscreen-0.011
 merry-0.011
suits-0.011
Token in
Token ID287
Feature activation-3.52e-01

Pos logit contributions
arter+0.009
 wasn+0.008
 meant+0.008
ires+0.008
 probably+0.008
Neg logit contributions
aid-0.011
P-0.011
 sunscreen-0.011
 merry-0.010
suits-0.010
Token the
Token ID262
Feature activation-3.16e-02

Pos logit contributions
arter+0.010
rated+0.010
ires+0.009
der+0.009
 wasn+0.009
Neg logit contributions
P-0.006
 sunscreen-0.006
aid-0.006
 merry-0.006
 gown-0.005
Token balloon
Token ID21190
Feature activation-1.89e+00

Pos logit contributions
arter+0.063
 wasn+0.062
 meant+0.061
 probably+0.059
ires+0.059
Neg logit contributions
P-0.093
aid-0.092
 sunscreen-0.089
 merry-0.086
suits-0.086
Token.
Token ID13
Feature activation-1.32e+00

Pos logit contributions
arter+0.031
 wasn+0.030
 meant+0.030
ires+0.029
 probably+0.029
Neg logit contributions
P-0.053
aid-0.053
 sunscreen-0.051
 merry-0.050
suits-0.049
Token
Token ID198
Feature activation-2.06e+00

Pos logit contributions
arter+0.021
 meant+0.019
 erupt+0.019
rated+0.019
ires+0.018
Neg logit contributions
P-0.040
 sunscreen-0.038
aid-0.038
suits-0.037
coat-0.036
Token
Token ID198
Feature activation-2.09e+00

Pos logit contributions
arter+0.037
 meant+0.034
 wasn+0.033
rated+0.032
 probably+0.032
Neg logit contributions
P-0.060
aid-0.057
 sunscreen-0.055
suits-0.053
 merry-0.053
TokenA
Token ID32
Feature activation+7.50e-01

Pos logit contributions
arter+0.044
 wasn+0.044
 meant+0.043
ires+0.043
 probably+0.043
Neg logit contributions
aid-0.048
P-0.047
 sunscreen-0.046
 merry-0.045
coat-0.044
Tokenstr
Token ID2536
Feature activation-6.79e+00

Pos logit contributions
 sunscreen+0.021
aid+0.021
P+0.021
coat+0.020
suits+0.020
Neg logit contributions
 meant-0.012
 wasn-0.012
 He-0.011
 ending-0.011
 erupt-0.011
Tokenid
Token ID312
Feature activation-3.07e+00

Pos logit contributions
arter+0.185
 He+0.182
 meant+0.181
 ending+0.177
 wasn+0.171
Neg logit contributions
P-0.131
aid-0.113
 merry-0.106
coat-0.106
 purse-0.105
Token held
Token ID2714
Feature activation+1.14e-01

Pos logit contributions
arter+0.061
ires+0.057
 erupt+0.053
 probably+0.050
 explode+0.050
Neg logit contributions
aid-0.081
P-0.079
 purse-0.074
coat-0.073
 merry-0.072
Token onto
Token ID4291
Feature activation-6.17e-01

Pos logit contributions
P+0.003
aid+0.003
 sunscreen+0.003
 merry+0.003
suits+0.003
Neg logit contributions
arter-0.002
 wasn-0.002
 meant-0.002
ires-0.002
 probably-0.002
Token her
Token ID607
Feature activation+1.65e+00

Pos logit contributions
arter+0.010
 wasn+0.010
 meant+0.010
ires+0.010
 probably+0.009
Neg logit contributions
P-0.017
aid-0.017
 sunscreen-0.017
 merry-0.016
coat-0.016
Token balloon
Token ID21190
Feature activation+1.01e+00

Pos logit contributions
P+0.036
aid+0.036
 sunscreen+0.033
suits+0.032
coat+0.032
Neg logit contributions
arter-0.038
 wasn-0.038
 probably-0.037
 meant-0.037
ires-0.037
Token voice
Token ID3809
Feature activation-1.33e+00

Pos logit contributions
 meant+0.017
 wasn+0.017
arter+0.017
 ending+0.016
 probably+0.016
Neg logit contributions
P-0.038
aid-0.037
 sunscreen-0.036
suits-0.035
coat-0.035
Token.
Token ID13
Feature activation-3.82e-01

Pos logit contributions
arter+0.020
ires+0.019
 probably+0.017
 erupt+0.017
 He+0.017
Neg logit contributions
aid-0.039
P-0.038
 sunscreen-0.038
 merry-0.037
coat-0.036
Token But
Token ID887
Feature activation-3.71e+00

Pos logit contributions
arter+0.006
 erupt+0.006
ires+0.006
 meant+0.006
rated+0.006
Neg logit contributions
aid-0.011
P-0.011
 sunscreen-0.010
ival-0.010
coat-0.010
Token then
Token ID788
Feature activation+4.99e-01

Pos logit contributions
arter+0.071
ires+0.066
 meant+0.064
 erupt+0.064
 probably+0.063
Neg logit contributions
P-0.096
aid-0.095
 sunscreen-0.093
 merry-0.092
coat-0.088
Token,
Token ID11
Feature activation+1.26e-01

Pos logit contributions
aid+0.014
P+0.014
 sunscreen+0.014
suits+0.013
coat+0.013
Neg logit contributions
 wasn-0.008
 meant-0.008
arter-0.008
 probably-0.008
 He-0.008
Token mirac
Token ID30352
Feature activation-6.79e+00

Pos logit contributions
P+0.003
aid+0.003
 sunscreen+0.003
 merry+0.003
suits+0.003
Neg logit contributions
arter-0.002
 wasn-0.002
 meant-0.002
ires-0.002
 probably-0.002
Tokenulously
Token ID18117
Feature activation-1.03e+00

Pos logit contributions
arter+0.095
 wasn+0.087
 meant+0.086
ires+0.085
 erupt+0.084
Neg logit contributions
P-0.212
aid-0.209
 sunscreen-0.202
 merry-0.199
suits-0.196
Token,
Token ID11
Feature activation+4.69e-01

Pos logit contributions
 wasn+0.016
 meant+0.015
arter+0.015
 probably+0.014
 He+0.014
Neg logit contributions
aid-0.031
P-0.031
suits-0.030
coat-0.029
 sunscreen-0.029
Token it
Token ID340
Feature activation-2.93e+00

Pos logit contributions
P+0.013
aid+0.013
 sunscreen+0.012
 merry+0.012
suits+0.012
Neg logit contributions
arter-0.008
 wasn-0.008
 meant-0.008
ires-0.008
 probably-0.008
Token returned
Token ID4504
Feature activation+1.53e+00

Pos logit contributions
arter+0.073
ires+0.071
 Next+0.062
 God+0.061
 Andy+0.061
Neg logit contributions
P-0.062
 band-0.058
aid-0.058
 merry-0.057
 gown-0.055
Token and
Token ID290
Feature activation-1.46e-01

Pos logit contributions
P+0.046
aid+0.045
 sunscreen+0.044
 merry+0.043
ival+0.043
Neg logit contributions
 wasn-0.023
arter-0.022
 meant-0.022
 probably-0.022
 He-0.021
Token"
Token ID1
Feature activation+2.71e+00

Pos logit contributions
arter+0.013
 wasn+0.012
 meant+0.012
ires+0.012
 probably+0.012
Neg logit contributions
P-0.028
aid-0.028
 sunscreen-0.027
 merry-0.027
suits-0.027
TokenBen
Token ID11696
Feature activation+9.15e-01

Pos logit contributions
P+0.057
aid+0.056
 sunscreen+0.053
 merry+0.050
coat+0.050
Neg logit contributions
arter-0.068
 wasn-0.064
ires-0.064
 meant-0.063
 erupt-0.062
Token,
Token ID11
Feature activation-6.09e-01

Pos logit contributions
P+0.022
aid+0.022
 sunscreen+0.021
suits+0.021
coat+0.020
Neg logit contributions
arter-0.019
ires-0.019
 erupt-0.017
 probably-0.017
 meant-0.017
Token it
Token ID340
Feature activation-3.56e+00

Pos logit contributions
arter+0.011
ires+0.010
 meant+0.010
 wasn+0.010
 probably+0.010
Neg logit contributions
P-0.016
aid-0.016
 sunscreen-0.016
 merry-0.016
suits-0.015
Token was
Token ID373
Feature activation-3.18e+00

Pos logit contributions
ires+0.078
arter+0.075
iring+0.072
 God+0.064
der+0.063
Neg logit contributions
P-0.086
 ha-0.077
suits-0.077
 band-0.075
 merry-0.074
Token just
Token ID655
Feature activation-6.77e+00

Pos logit contributions
arter+0.079
rated+0.079
ires+0.077
 wasn+0.075
iring+0.073
Neg logit contributions
 sunscreen-0.064
P-0.064
 merry-0.056
 sunglasses-0.056
aid-0.056
Token a
Token ID257
Feature activation+1.16e+00

Pos logit contributions
arter+0.171
ires+0.164
 wasn+0.162
 meant+0.159
rated+0.157
Neg logit contributions
 sunscreen-0.131
P-0.124
 merry-0.123
aid-0.122
 sunglasses-0.114
Token spider
Token ID19230
Feature activation-3.02e+00

Pos logit contributions
aid+0.029
P+0.027
ival+0.025
 sunscreen+0.025
coat+0.025
Neg logit contributions
arter-0.024
 probably-0.024
 wasn-0.024
 ending-0.024
 erupt-0.024
Token.
Token ID13
Feature activation-1.53e+00

Pos logit contributions
arter+0.049
ires+0.047
 wasn+0.046
 meant+0.045
 probably+0.044
Neg logit contributions
P-0.086
aid-0.085
 sunscreen-0.083
coat-0.081
 merry-0.081
Token It
Token ID632
Feature activation-3.04e+00

Pos logit contributions
rated+0.027
 ending+0.026
 meant+0.026
 wasn+0.025
 matter+0.025
Neg logit contributions
P-0.041
aid-0.041
 sunscreen-0.038
 Woo-0.037
 ha-0.037
Token was
Token ID373
Feature activation-2.80e+00

Pos logit contributions
ires+0.069
arter+0.066
iring+0.061
 ending+0.058
der+0.055
Neg logit contributions
P-0.073
 merry-0.072
aid-0.068
coat-0.067
 band-0.065
TokenOne
Token ID3198
Feature activation+1.46e+00

Pos logit contributions
arter+0.035
 meant+0.033
ires+0.032
 wasn+0.032
 probably+0.032
Neg logit contributions
P-0.028
aid-0.026
 sunscreen-0.025
 merry-0.024
ival-0.024
Token day
Token ID1110
Feature activation+1.70e+00

Pos logit contributions
P+0.030
aid+0.027
 sunscreen+0.024
 merry+0.024
 band+0.023
Neg logit contributions
uders-0.040
der-0.038
ires-0.038
arter-0.037
 probably-0.037
Token,
Token ID11
Feature activation+2.29e-01

Pos logit contributions
P+0.039
aid+0.038
 sunscreen+0.037
suits+0.035
 merry+0.035
Neg logit contributions
arter-0.038
ires-0.036
 probably-0.036
 meant-0.036
 wasn-0.035
Token the
Token ID262
Feature activation-2.67e+00

Pos logit contributions
P+0.005
 sunscreen+0.005
aid+0.005
 merry+0.005
ival+0.005
Neg logit contributions
arter-0.005
ires-0.005
rated-0.005
 probably-0.005
 meant-0.005
Token wood
Token ID4898
Feature activation-2.85e+00

Pos logit contributions
arter+0.075
ires+0.067
uders+0.067
 arose+0.066
 either+0.065
Neg logit contributions
P-0.050
aid-0.048
 sunscreen-0.048
 band-0.044
 saddle-0.042
Tokencut
Token ID8968
Feature activation-6.76e+00

Pos logit contributions
arter+0.064
ires+0.056
 Ste+0.048
uders+0.047
 matter+0.047
Neg logit contributions
P-0.076
aid-0.076
 band-0.070
 suit-0.064
suits-0.064
Tokenter
Token ID353
Feature activation-1.67e+00

Pos logit contributions
arter+0.174
ires+0.160
 erupt+0.156
 topple+0.156
 explode+0.151
Neg logit contributions
P-0.137
aid-0.134
 band-0.123
rob-0.122
 ha-0.116
Token got
Token ID1392
Feature activation+2.41e+00

Pos logit contributions
arter+0.050
ires+0.048
 ending+0.043
 erupt+0.043
 explode+0.043
Neg logit contributions
P-0.030
aid-0.026
suits-0.025
 sunscreen-0.024
coat-0.023
Token an
Token ID281
Feature activation+1.79e+00

Pos logit contributions
P+0.053
aid+0.052
 sunscreen+0.050
 merry+0.049
suits+0.049
Neg logit contributions
arter-0.055
 meant-0.054
 wasn-0.054
 probably-0.052
ires-0.052
Token idea
Token ID2126
Feature activation-1.27e+00

Pos logit contributions
aid+0.046
P+0.045
 sunscreen+0.044
 merry+0.041
coat+0.041
Neg logit contributions
arter-0.034
 wasn-0.033
ires-0.031
 meant-0.031
 probably-0.030
Token to
Token ID284
Feature activation-1.91e+00

Pos logit contributions
ires+0.030
arter+0.029
rated+0.028
iring+0.027
uders+0.027
Neg logit contributions
P-0.027
aid-0.026
coat-0.026
 gown-0.025
 band-0.025
Token wasn
Token ID2492
Feature activation-2.61e+00

Pos logit contributions
arter+0.003
ires+0.003
 He+0.002
 erupt+0.002
cker+0.002
Neg logit contributions
 merry-0.003
 sunscreen-0.003
P-0.003
aid-0.003
 band-0.003
Token't
Token ID470
Feature activation-4.92e+00

Pos logit contributions
arter+0.048
ires+0.046
rated+0.043
cker+0.042
 ending+0.041
Neg logit contributions
aid-0.070
P-0.069
 sunscreen-0.068
ival-0.067
 merry-0.066
Token so
Token ID523
Feature activation-9.82e-02

Pos logit contributions
arter+0.129
ires+0.128
 Next+0.117
 He+0.116
uders+0.114
Neg logit contributions
 sunscreen-0.090
P-0.087
 merry-0.087
aid-0.086
coat-0.081
Token bad
Token ID2089
Feature activation-9.63e-01

Pos logit contributions
arter+0.002
ires+0.002
 wasn+0.002
 probably+0.002
 meant+0.002
Neg logit contributions
aid-0.002
P-0.002
 sunscreen-0.002
 merry-0.002
coat-0.002
Token.
Token ID13
Feature activation-2.79e-01

Pos logit contributions
arter+0.021
ires+0.019
 erupt+0.018
 probably+0.018
cker+0.018
Neg logit contributions
P-0.022
 sunscreen-0.022
 merry-0.021
aid-0.021
coat-0.020
Token List
Token ID7343
Feature activation-6.73e+00

Pos logit contributions
arter+0.005
rated+0.005
 ending+0.005
 erupt+0.005
cker+0.005
Neg logit contributions
aid-0.007
Of-0.007
P-0.007
ival-0.007
 sunscreen-0.007
Tokenening
Token ID3101
Feature activation-1.56e+00

Pos logit contributions
arter+0.084
 He+0.078
 erupt+0.075
 probably+0.074
 wasn+0.073
Neg logit contributions
 sunscreen-0.206
aid-0.201
suits-0.197
P-0.193
coat-0.191
Token to
Token ID284
Feature activation+1.37e-01

Pos logit contributions
arter+0.029
ires+0.027
 meant+0.027
 erupt+0.027
 wasn+0.027
Neg logit contributions
P-0.040
aid-0.040
 sunscreen-0.039
 merry-0.038
suits-0.038
Token the
Token ID262
Feature activation-1.82e-02

Pos logit contributions
P+0.003
aid+0.003
 sunscreen+0.003
 merry+0.003
coat+0.003
Neg logit contributions
arter-0.003
 wasn-0.003
 meant-0.003
ires-0.003
 probably-0.003
Token song
Token ID3496
Feature activation-1.92e+00

Pos logit contributions
 wasn+0.000
arter+0.000
 meant+0.000
 probably+0.000
 erupt+0.000
Neg logit contributions
aid-0.000
P-0.000
 sunscreen-0.000
suits-0.000
coat-0.000
Token helped
Token ID4193
Feature activation+3.05e-02

Pos logit contributions
arter+0.035
ires+0.031
 erupt+0.030
 probably+0.030
 wasn+0.029
Neg logit contributions
aid-0.051
P-0.051
 sunscreen-0.049
 merry-0.049
coat-0.049
Token girl
Token ID2576
Feature activation+1.81e+00

Pos logit contributions
P+0.020
aid+0.020
 sunscreen+0.019
 merry+0.019
suits+0.018
Neg logit contributions
arter-0.024
 wasn-0.023
 meant-0.023
 probably-0.023
ires-0.023
Token.
Token ID13
Feature activation+8.90e-01

Pos logit contributions
P+0.061
aid+0.060
 sunscreen+0.059
 merry+0.058
suits+0.058
Neg logit contributions
arter-0.020
 wasn-0.018
 meant-0.018
ires-0.018
 probably-0.017
Token She
Token ID1375
Feature activation+3.34e-01

Pos logit contributions
P+0.025
aid+0.024
 sunscreen+0.024
coat+0.023
suits+0.023
Neg logit contributions
arter-0.016
 erupt-0.015
 meant-0.015
rated-0.015
 probably-0.014
Token hugged
Token ID41130
Feature activation+2.55e-01

Pos logit contributions
P+0.010
aid+0.010
 sunscreen+0.010
suits+0.010
 merry+0.010
Neg logit contributions
 wasn-0.005
 meant-0.005
 probably-0.004
 He-0.004
arter-0.004
Token her
Token ID607
Feature activation+1.93e+00

Pos logit contributions
P+0.007
aid+0.007
suits+0.007
 sunscreen+0.007
ival+0.007
Neg logit contributions
 wasn-0.004
 meant-0.004
 He-0.004
arter-0.004
 probably-0.004
Token toy
Token ID13373
Feature activation+4.69e+00

Pos logit contributions
P+0.062
aid+0.058
ival+0.055
 sunscreen+0.054
suits+0.053
Neg logit contributions
 wasn-0.030
 meant-0.029
 probably-0.026
 He-0.026
arter-0.026
Token tightly
Token ID17707
Feature activation-2.03e-01

Pos logit contributions
P+0.139
aid+0.132
 sunscreen+0.123
ival+0.120
suits+0.118
Neg logit contributions
 wasn-0.082
 He-0.080
 Andy-0.076
 probably-0.074
 meant-0.072
Token and
Token ID290
Feature activation+9.15e-01

Pos logit contributions
 wasn+0.003
 meant+0.003
 probably+0.003
 He+0.003
arter+0.002
Neg logit contributions
P-0.007
aid-0.006
 sunscreen-0.006
ival-0.006
 merry-0.006
Token said
Token ID531
Feature activation+1.22e+00

Pos logit contributions
P+0.024
aid+0.023
 sunscreen+0.022
suits+0.022
ival+0.022
Neg logit contributions
 wasn-0.018
 meant-0.017
 probably-0.016
arter-0.016
 He-0.016
Token goodbye
Token ID24829
Feature activation-6.10e-01

Pos logit contributions
P+0.035
suits+0.033
 sunscreen+0.033
aid+0.033
coat+0.032
Neg logit contributions
 wasn-0.021
 He-0.020
 meant-0.020
arter-0.019
 probably-0.019
Token to
Token ID284
Feature activation+1.64e+00

Pos logit contributions
 wasn+0.010
arter+0.010
 meant+0.010
 probably+0.010
 He+0.010
Neg logit contributions
P-0.017
aid-0.016
 sunscreen-0.016
 merry-0.015
suits-0.015
Token and
Token ID290
Feature activation-4.53e-01

Pos logit contributions
arter+0.014
ires+0.013
 meant+0.013
 probably+0.013
 wasn+0.013
Neg logit contributions
aid-0.020
P-0.020
 sunscreen-0.019
coat-0.019
suits-0.019
Token cried
Token ID16896
Feature activation-1.47e+00

Pos logit contributions
 wasn+0.009
arter+0.009
 meant+0.009
 probably+0.008
ires+0.008
Neg logit contributions
P-0.011
aid-0.011
 sunscreen-0.011
 merry-0.011
suits-0.010
Token.
Token ID13
Feature activation+3.00e-01

Pos logit contributions
arter+0.017
 wasn+0.016
 meant+0.015
ires+0.015
 probably+0.015
Neg logit contributions
aid-0.049
P-0.049
 sunscreen-0.047
 merry-0.046
suits-0.046
Token
Token ID198
Feature activation-3.38e-01

Pos logit contributions
P+0.009
aid+0.009
 sunscreen+0.008
coat+0.008
 merry+0.008
Neg logit contributions
arter-0.005
rated-0.005
 erupt-0.004
 meant-0.004
 probably-0.004
Token
Token ID198
Feature activation-8.08e-01

Pos logit contributions
arter+0.006
rated+0.006
 probably+0.006
 meant+0.006
 wasn+0.006
Neg logit contributions
P-0.009
aid-0.009
 sunscreen-0.009
 merry-0.008
suits-0.008
Token"
Token ID1
Feature activation+4.17e+00

Pos logit contributions
 wasn+0.018
ires+0.018
arter+0.018
 probably+0.017
 meant+0.017
Neg logit contributions
 merry-0.017
aid-0.017
 sunscreen-0.017
coat-0.017
suits-0.017
TokenHey
Token ID10814
Feature activation+5.35e-02

Pos logit contributions
P+0.085
aid+0.082
 sunscreen+0.078
 merry+0.075
coat+0.073
Neg logit contributions
arter-0.108
 wasn-0.102
 meant-0.102
ires-0.102
 erupt-0.100
Token,
Token ID11
Feature activation+8.66e-01

Pos logit contributions
P+0.002
aid+0.002
 sunscreen+0.001
suits+0.001
coat+0.001
Neg logit contributions
arter-0.001
 wasn-0.001
 meant-0.001
 erupt-0.001
 probably-0.001
Token what
Token ID644
Feature activation+4.14e-01

Pos logit contributions
P+0.023
aid+0.023
 sunscreen+0.022
suits+0.022
 merry+0.021
Neg logit contributions
arter-0.016
 wasn-0.015
 meant-0.015
 probably-0.015
ires-0.015
Token's
Token ID338
Feature activation-4.03e+00

Pos logit contributions
P+0.009
aid+0.008
 ha+0.007
ival+0.007
 sunscreen+0.007
Neg logit contributions
arter-0.010
ires-0.010
rated-0.010
cker-0.010
iring-0.009
Token wrong
Token ID2642
Feature activation-2.19e+00

Pos logit contributions
arter+0.087
 wasn+0.084
 meant+0.084
ires+0.083
 probably+0.082
Neg logit contributions
P-0.094
aid-0.093
 sunscreen-0.089
 merry-0.087
suits-0.086
Token.
Token ID13
Feature activation-3.88e-01

Pos logit contributions
P+0.016
aid+0.016
 sunscreen+0.016
 merry+0.016
coat+0.015
Neg logit contributions
arter-0.013
ires-0.013
 wasn-0.013
 meant-0.012
 probably-0.012
Token He
Token ID679
Feature activation+3.17e-02

Pos logit contributions
rated+0.007
 matter+0.007
arter+0.007
cker+0.007
 probably+0.006
Neg logit contributions
P-0.011
suits-0.010
aid-0.010
 sunscreen-0.010
 Woo-0.010
Token zoom
Token ID19792
Feature activation+1.45e+00

Pos logit contributions
P+0.001
aid+0.001
 sunscreen+0.001
 merry+0.001
suits+0.001
Neg logit contributions
arter-0.001
ires-0.001
 meant-0.001
 wasn-0.001
 probably-0.001
Tokened
Token ID276
Feature activation+2.13e+00

Pos logit contributions
P+0.029
aid+0.028
 sunscreen+0.028
 merry+0.025
 gown+0.025
Neg logit contributions
 wasn-0.038
 meant-0.037
 He-0.037
ires-0.034
rated-0.034
Token and
Token ID290
Feature activation+9.23e-01

Pos logit contributions
P+0.062
aid+0.061
 sunscreen+0.055
coat+0.054
ival+0.054
Neg logit contributions
 wasn-0.039
 He-0.038
 erupt-0.036
 meant-0.036
 Andy-0.036
Token raced
Token ID32897
Feature activation+4.56e+00

Pos logit contributions
P+0.024
aid+0.023
ival+0.022
 sunscreen+0.022
 purse+0.021
Neg logit contributions
 wasn-0.019
 meant-0.018
 He-0.018
 erupt-0.017
 Andy-0.017
Token on
Token ID319
Feature activation+1.22e+00

Pos logit contributions
aid+0.133
P+0.133
 sunscreen+0.118
ival+0.115
coat+0.112
Neg logit contributions
 He-0.086
 wasn-0.081
 Andy-0.076
 meant-0.075
 before-0.075
Token the
Token ID262
Feature activation+1.06e+00

Pos logit contributions
P+0.036
aid+0.036
 sunscreen+0.034
suits+0.033
coat+0.033
Neg logit contributions
arter-0.019
 wasn-0.019
 meant-0.018
 probably-0.018
 erupt-0.018
Token street
Token ID4675
Feature activation+1.25e+00

Pos logit contributions
P+0.031
aid+0.031
ival+0.027
 sunscreen+0.027
suits+0.026
Neg logit contributions
 wasn-0.017
 He-0.017
 erupt-0.017
 probably-0.017
 explode-0.017
Token.
Token ID13
Feature activation-4.76e-02

Pos logit contributions
aid+0.038
P+0.038
 sunscreen+0.037
 merry+0.037
suits+0.036
Neg logit contributions
arter-0.018
 meant-0.016
ires-0.016
 wasn-0.016
 probably-0.016
Token He
Token ID679
Feature activation+5.08e-01

Pos logit contributions
rated+0.001
arter+0.001
 matter+0.001
cker+0.001
 meant+0.001
Neg logit contributions
P-0.001
aid-0.001
 sunscreen-0.001
suits-0.001
ival-0.001
Token the
Token ID262
Feature activation+7.87e-01

Pos logit contributions
arter+0.021
ires+0.019
 wasn+0.019
 meant+0.019
 probably+0.019
Neg logit contributions
P-0.031
aid-0.031
 sunscreen-0.030
 merry-0.030
coat-0.029
Token playground
Token ID24817
Feature activation-4.17e-01

Pos logit contributions
aid+0.020
P+0.019
 sunscreen+0.018
suits+0.018
ival+0.017
Neg logit contributions
 erupt-0.015
 probably-0.015
 wasn-0.015
arter-0.015
 ending-0.015
Token?"
Token ID1701
Feature activation+1.31e-01

Pos logit contributions
arter+0.007
ires+0.007
 wasn+0.007
 probably+0.007
 meant+0.007
Neg logit contributions
P-0.011
aid-0.011
 sunscreen-0.011
 merry-0.011
coat-0.011
Token
Token ID198
Feature activation-1.28e+00

Pos logit contributions
P+0.003
aid+0.003
 sunscreen+0.003
 merry+0.003
 purse+0.003
Neg logit contributions
arter-0.003
uders-0.003
ires-0.003
rated-0.003
 erupt-0.003
Token
Token ID198
Feature activation-1.50e+00

Pos logit contributions
arter+0.020
 meant+0.018
ires+0.018
 wasn+0.017
 probably+0.017
Neg logit contributions
P-0.039
aid-0.038
 sunscreen-0.037
 merry-0.036
suits-0.036
Token"
Token ID1
Feature activation+4.68e+00

Pos logit contributions
 wasn+0.031
 probably+0.030
 meant+0.029
 He+0.029
 Andy+0.029
Neg logit contributions
aid-0.035
 sunscreen-0.035
suits-0.034
coat-0.034
P-0.034
TokenTo
Token ID2514
Feature activation-2.07e+00

Pos logit contributions
P+0.104
aid+0.104
 sunscreen+0.102
suits+0.099
 merry+0.099
Neg logit contributions
 meant-0.100
 wasn-0.100
arter-0.098
 probably-0.097
ires-0.096
Token get
Token ID651
Feature activation+2.91e+00

Pos logit contributions
arter+0.041
ires+0.040
 meant+0.040
 wasn+0.039
 He+0.038
Neg logit contributions
 sunscreen-0.050
aid-0.050
P-0.049
 merry-0.046
 gown-0.045
Token to
Token ID284
Feature activation-5.35e-01

Pos logit contributions
aid+0.074
P+0.074
suits+0.073
 merry+0.073
coat+0.073
Neg logit contributions
 meant-0.054
 wasn-0.054
 probably-0.051
 erupt-0.051
arter-0.049
Token the
Token ID262
Feature activation-3.39e-01

Pos logit contributions
arter+0.010
 wasn+0.009
ires+0.009
 meant+0.009
 probably+0.009
Neg logit contributions
P-0.014
aid-0.014
 sunscreen-0.014
 merry-0.013
coat-0.013
Token other
Token ID584
Feature activation+1.63e+00

Pos logit contributions
arter+0.008
 probably+0.007
 wasn+0.007
 meant+0.007
 erupt+0.007
Neg logit contributions
aid-0.008
P-0.007
 sunscreen-0.007
suits-0.007
 merry-0.007
Token to
Token ID284
Feature activation-1.74e+00

Pos logit contributions
P+0.096
aid+0.094
 sunscreen+0.090
 merry+0.088
coat+0.087
Neg logit contributions
arter-0.096
 wasn-0.094
 meant-0.093
ires-0.092
 probably-0.091
Token make
Token ID787
Feature activation+3.67e-01

Pos logit contributions
arter+0.034
 wasn+0.033
 meant+0.033
 probably+0.032
 erupt+0.032
Neg logit contributions
P-0.045
aid-0.043
suits-0.041
ival-0.041
 sunscreen-0.041
Token sand
Token ID6450
Feature activation+6.84e-01

Pos logit contributions
P+0.010
aid+0.009
coat+0.009
suits+0.009
 sunscreen+0.009
Neg logit contributions
 meant-0.007
arter-0.007
 wasn-0.007
ires-0.007
 erupt-0.007
Token castles
Token ID49203
Feature activation+2.40e-02

Pos logit contributions
P+0.015
aid+0.015
 sunscreen+0.014
 merry+0.014
coat+0.014
Neg logit contributions
arter-0.016
 wasn-0.015
 meant-0.015
ires-0.015
 erupt-0.015
Token,
Token ID11
Feature activation-9.41e-01

Pos logit contributions
P+0.001
aid+0.001
 sunscreen+0.001
 merry+0.001
suits+0.001
Neg logit contributions
arter-0.000
 wasn-0.000
 meant-0.000
ires-0.000
 probably-0.000
Token how
Token ID703
Feature activation+4.27e+00

Pos logit contributions
arter+0.016
 wasn+0.015
 meant+0.015
ires+0.015
 probably+0.015
Neg logit contributions
P-0.026
aid-0.026
 sunscreen-0.025
 merry-0.024
suits-0.024
Token to
Token ID284
Feature activation-2.38e+00

Pos logit contributions
aid+0.101
P+0.101
 sunscreen+0.097
 merry+0.094
suits+0.093
Neg logit contributions
arter-0.091
 wasn-0.086
 meant-0.086
 probably-0.085
ires-0.085
Token collect
Token ID2824
Feature activation+1.13e+00

Pos logit contributions
arter+0.045
 wasn+0.043
 meant+0.043
ires+0.043
 probably+0.042
Neg logit contributions
P-0.061
aid-0.061
 sunscreen-0.059
 merry-0.057
coat-0.056
Token shells
Token ID19679
Feature activation+1.10e+00

Pos logit contributions
P+0.024
coat+0.024
aid+0.024
ival+0.023
 merry+0.022
Neg logit contributions
 erupt-0.026
 meant-0.025
 wasn-0.025
ires-0.025
 He-0.025
Token,
Token ID11
Feature activation-8.24e-01

Pos logit contributions
P+0.030
aid+0.030
 sunscreen+0.029
 merry+0.028
suits+0.028
Neg logit contributions
arter-0.019
 wasn-0.019
 meant-0.019
ires-0.018
 probably-0.018
Token and
Token ID290
Feature activation-1.09e+00

Pos logit contributions
arter+0.014
 wasn+0.013
 meant+0.013
ires+0.013
 probably+0.013
Neg logit contributions
P-0.023
aid-0.023
 sunscreen-0.022
 merry-0.022
suits-0.021
Token mine
Token ID6164
Feature activation-1.31e+00

Pos logit contributions
arter+0.034
 wasn+0.033
 meant+0.032
ires+0.032
 probably+0.032
Neg logit contributions
P-0.048
aid-0.047
 sunscreen-0.046
 merry-0.045
suits-0.044
Token."
Token ID526
Feature activation+1.27e+00

Pos logit contributions
arter+0.021
 wasn+0.019
 meant+0.019
 probably+0.019
ires+0.019
Neg logit contributions
P-0.038
aid-0.037
 sunscreen-0.037
 merry-0.036
suits-0.036
Token
Token ID198
Feature activation-4.33e-01

Pos logit contributions
P+0.035
aid+0.034
 sunscreen+0.033
 merry+0.032
suits+0.032
Neg logit contributions
arter-0.023
rated-0.023
 probably-0.022
 meant-0.021
 erupt-0.021
Token
Token ID198
Feature activation-7.78e-01

Pos logit contributions
arter+0.008
rated+0.008
 meant+0.008
 ending+0.008
 probably+0.008
Neg logit contributions
P-0.012
aid-0.012
 sunscreen-0.011
suits-0.010
 merry-0.010
TokenL
Token ID43
Feature activation-9.10e-01

Pos logit contributions
arter+0.019
 wasn+0.019
 meant+0.019
ires+0.018
 probably+0.018
Neg logit contributions
aid-0.016
P-0.016
 sunscreen-0.015
 merry-0.014
suits-0.014
Tokenila
Token ID10102
Feature activation+7.09e-01

Pos logit contributions
arter+0.020
 wasn+0.020
 erupt+0.020
 meant+0.020
 probably+0.020
Neg logit contributions
P-0.021
aid-0.020
 sunscreen-0.019
coat-0.018
suits-0.018
Token and
Token ID290
Feature activation-5.32e-01

Pos logit contributions
P+0.021
aid+0.021
 sunscreen+0.020
 merry+0.020
coat+0.020
Neg logit contributions
arter-0.011
 wasn-0.010
 meant-0.010
ires-0.010
 probably-0.010
Token Max
Token ID5436
Feature activation+3.20e+00

Pos logit contributions
rated+0.010
arter+0.010
ires+0.009
iring+0.009
 probably+0.009
Neg logit contributions
 sunscreen-0.014
 merry-0.014
 sunglasses-0.013
 ha-0.013
 lipstick-0.013
Token feel
Token ID1254
Feature activation-5.49e-01

Pos logit contributions
P+0.089
aid+0.083
 sunscreen+0.080
ival+0.079
suits+0.078
Neg logit contributions
 wasn-0.058
 meant-0.058
 probably-0.055
 He-0.055
 erupt-0.055
Token sad
Token ID6507
Feature activation-9.24e-01

Pos logit contributions
 meant+0.010
arter+0.010
 wasn+0.010
 probably+0.009
ires+0.009
Neg logit contributions
P-0.015
aid-0.014
 sunscreen-0.014
coat-0.014
suits-0.014
Token.
Token ID13
Feature activation+7.42e-03

Pos logit contributions
arter+0.019
uders+0.018
 Andy+0.018
cker+0.018
rated+0.018
Neg logit contributions
aid-0.023
 band-0.020
P-0.020
suits-0.020
 sunscreen-0.020
Token the
Token ID262
Feature activation-1.35e+00

Pos logit contributions
arter+0.018
ires+0.018
 probably+0.017
rated+0.017
 erupt+0.016
Neg logit contributions
P-0.026
 sunscreen-0.026
aid-0.026
 merry-0.026
 purse-0.024
Token bird
Token ID6512
Feature activation-7.24e-01

Pos logit contributions
 wasn+0.025
arter+0.025
 meant+0.024
 probably+0.024
 ending+0.024
Neg logit contributions
P-0.034
aid-0.034
suits-0.032
 sunscreen-0.032
coat-0.032
Token became
Token ID2627
Feature activation+1.32e+00

Pos logit contributions
 wasn+0.013
 meant+0.012
arter+0.012
 probably+0.012
 erupt+0.011
Neg logit contributions
P-0.020
aid-0.019
 sunscreen-0.019
suits-0.019
ival-0.019
Token friends
Token ID2460
Feature activation+1.98e+00

Pos logit contributions
aid+0.029
suits+0.028
ival+0.028
coat+0.028
P+0.028
Neg logit contributions
 meant-0.030
 wasn-0.029
 probably-0.029
arter-0.028
 erupt-0.027
Token and
Token ID290
Feature activation-2.99e-01

Pos logit contributions
P+0.060
aid+0.060
 sunscreen+0.058
 merry+0.057
suits+0.056
Neg logit contributions
arter-0.028
 wasn-0.027
 meant-0.027
ires-0.026
 probably-0.026
Token would
Token ID561
Feature activation+1.06e+00

Pos logit contributions
 wasn+0.007
 meant+0.007
arter+0.006
 probably+0.006
 erupt+0.006
Neg logit contributions
P-0.007
aid-0.007
 sunscreen-0.006
suits-0.006
ival-0.006
Token whistle
Token ID16121
Feature activation+5.30e-01

Pos logit contributions
P+0.028
aid+0.028
 sunscreen+0.027
suits+0.027
 merry+0.026
Neg logit contributions
arter-0.019
 wasn-0.019
 meant-0.018
 probably-0.018
 erupt-0.018
Token together
Token ID1978
Feature activation+9.29e-01

Pos logit contributions
P+0.011
aid+0.011
 sunscreen+0.011
coat+0.010
 merry+0.010
Neg logit contributions
 wasn-0.012
 meant-0.012
arter-0.012
ires-0.012
 He-0.012
Token in
Token ID287
Feature activation+8.85e-01

Pos logit contributions
P+0.021
aid+0.020
 sunscreen+0.019
 merry+0.019
 gown+0.019
Neg logit contributions
arter-0.021
 meant-0.020
 wasn-0.019
 probably-0.019
ires-0.019
Token the
Token ID262
Feature activation+1.17e+00

Pos logit contributions
aid+0.026
P+0.026
 sunscreen+0.025
coat+0.025
suits+0.025
Neg logit contributions
arter-0.014
 wasn-0.013
 meant-0.013
ires-0.013
 probably-0.013
Token backyard
Token ID24296
Feature activation+9.77e-01

Pos logit contributions
aid+0.033
coat+0.032
suits+0.031
ival+0.031
P+0.030
Neg logit contributions
 wasn-0.019
 erupt-0.019
 probably-0.019
 ending-0.019
arter-0.019
TokenIt
Token ID1026
Feature activation-2.73e+00

Pos logit contributions
 wasn+0.017
 He+0.017
 probably+0.017
 meant+0.016
 Andy+0.016
Neg logit contributions
aid-0.021
 sunscreen-0.021
suits-0.021
 merry-0.021
coat-0.020
Token was
Token ID373
Feature activation-2.12e+00

Pos logit contributions
arter+0.044
ires+0.040
 erupt+0.035
uders+0.033
 probably+0.033
Neg logit contributions
P-0.081
aid-0.080
 merry-0.078
coat-0.077
 sunscreen-0.076
Token a
Token ID257
Feature activation+1.65e+00

Pos logit contributions
arter+0.051
ires+0.049
 wasn+0.047
 meant+0.047
 erupt+0.047
Neg logit contributions
P-0.043
 sunscreen-0.043
aid-0.042
 merry-0.041
coat-0.040
Token hot
Token ID3024
Feature activation-1.02e+00

Pos logit contributions
aid+0.042
P+0.040
rob+0.036
 sunscreen+0.036
suits+0.036
Neg logit contributions
 wasn-0.037
 ending-0.036
 probably-0.035
 matter-0.034
 erupt-0.033
Token summer
Token ID3931
Feature activation-9.53e-01

Pos logit contributions
arter+0.020
 wasn+0.019
 meant+0.019
ires+0.019
 probably+0.018
Neg logit contributions
P-0.026
aid-0.026
 sunscreen-0.025
 merry-0.024
suits-0.024
Token day
Token ID1110
Feature activation+3.12e-01

Pos logit contributions
arter+0.020
ires+0.019
 probably+0.018
 meant+0.018
 erupt+0.018
Neg logit contributions
P-0.023
coat-0.022
 sunscreen-0.022
aid-0.022
 merry-0.022
Token,
Token ID11
Feature activation-1.16e+00

Pos logit contributions
P+0.008
aid+0.008
 sunscreen+0.008
coat+0.008
 merry+0.008
Neg logit contributions
arter-0.006
ires-0.005
 meant-0.005
 wasn-0.005
 probably-0.005
Token and
Token ID290
Feature activation-2.33e+00

Pos logit contributions
arter+0.020
 meant+0.019
 wasn+0.019
ires+0.018
 probably+0.018
Neg logit contributions
P-0.032
aid-0.032
 sunscreen-0.031
 merry-0.030
coat-0.030
Token the
Token ID262
Feature activation-2.43e+00

Pos logit contributions
arter+0.043
ires+0.040
 meant+0.039
 erupt+0.039
 wasn+0.039
Neg logit contributions
P-0.062
aid-0.061
 sunscreen-0.060
 merry-0.059
coat-0.057
Token sun
Token ID4252
Feature activation-2.80e+00

Pos logit contributions
 wasn+0.046
 meant+0.045
arter+0.045
 probably+0.044
 ending+0.043
Neg logit contributions
P-0.063
aid-0.063
 sunscreen-0.058
suits-0.058
ival-0.057
Token was
Token ID373
Feature activation-2.02e+00

Pos logit contributions
arter+0.056
ires+0.055
 erupt+0.047
uders+0.046
der+0.044
Neg logit contributions
coat-0.073
 merry-0.071
 sunscreen-0.070
aid-0.070
P-0.069
Token.
Token ID13
Feature activation-5.17e-01

Pos logit contributions
aid+0.081
P+0.080
 sunscreen+0.078
suits+0.076
coat+0.076
Neg logit contributions
 wasn-0.030
arter-0.030
 meant-0.029
 probably-0.029
 He-0.028
Token
Token ID198
Feature activation-1.07e+00

Pos logit contributions
arter+0.008
 meant+0.007
 erupt+0.007
 wasn+0.007
 probably+0.007
Neg logit contributions
P-0.015
aid-0.015
 sunscreen-0.015
coat-0.014
suits-0.014
Token
Token ID198
Feature activation-1.41e+00

Pos logit contributions
arter+0.016
 meant+0.015
 wasn+0.015
 probably+0.014
ires+0.014
Neg logit contributions
P-0.033
aid-0.032
 sunscreen-0.031
suits-0.030
 merry-0.030
TokenOne
Token ID3198
Feature activation+1.49e+00

Pos logit contributions
 wasn+0.023
 probably+0.022
 He+0.022
ires+0.021
 Next+0.021
Neg logit contributions
aid-0.039
 sunscreen-0.038
 merry-0.038
coat-0.037
suits-0.037
Token day
Token ID1110
Feature activation+1.61e+00

Pos logit contributions
P+0.033
aid+0.032
 merry+0.031
 sunscreen+0.030
suits+0.029
Neg logit contributions
arter-0.033
 probably-0.033
ires-0.033
uders-0.032
 wasn-0.032
Token,
Token ID11
Feature activation+1.75e-01

Pos logit contributions
aid+0.046
P+0.045
 merry+0.044
 sunscreen+0.044
coat+0.043
Neg logit contributions
 wasn-0.026
 meant-0.026
arter-0.024
 He-0.024
 Andy-0.024
Token the
Token ID262
Feature activation-2.54e+00

Pos logit contributions
aid+0.005
P+0.005
suits+0.005
 merry+0.005
coat+0.005
Neg logit contributions
 wasn-0.002
 meant-0.002
 Andy-0.002
 He-0.002
 probably-0.002
Token little
Token ID1310
Feature activation-1.53e+00

Pos logit contributions
arter+0.044
 wasn+0.043
 meant+0.043
ires+0.042
 probably+0.041
Neg logit contributions
P-0.069
aid-0.069
 sunscreen-0.067
 merry-0.065
suits-0.064
Token girl
Token ID2576
Feature activation-4.95e-01

Pos logit contributions
 meant+0.029
 wasn+0.029
 He+0.026
 ending+0.026
 Ste+0.026
Neg logit contributions
aid-0.041
P-0.040
ival-0.039
 sunscreen-0.038
suits-0.035
Token got
Token ID1392
Feature activation+3.02e+00

Pos logit contributions
arter+0.010
 wasn+0.010
 meant+0.010
ires+0.009
 probably+0.009
Neg logit contributions
P-0.012
aid-0.012
 sunscreen-0.012
 merry-0.011
suits-0.011
Token tired
Token ID10032
Feature activation+2.24e-02

Pos logit contributions
ival+0.080
suits+0.079
aid+0.079
coat+0.078
rob+0.077
Neg logit contributions
 meant-0.058
 wasn-0.054
 probably-0.052
 He-0.050
 erupt-0.049
Token to
Token ID284
Feature activation-1.10e+00

Pos logit contributions
P+0.035
aid+0.034
suits+0.033
 sunscreen+0.033
 merry+0.032
Neg logit contributions
 wasn-0.030
 meant-0.029
 probably-0.028
 He-0.027
 erupt-0.026
Token give
Token ID1577
Feature activation+1.49e+00

Pos logit contributions
arter+0.025
 wasn+0.024
 meant+0.024
ires+0.024
 probably+0.024
Neg logit contributions
P-0.024
aid-0.024
 sunscreen-0.023
 merry-0.022
coat-0.022
Token her
Token ID607
Feature activation+8.49e-01

Pos logit contributions
P+0.043
 merry+0.042
suits+0.042
aid+0.041
ival+0.041
Neg logit contributions
 wasn-0.025
 meant-0.025
 He-0.022
 probably-0.022
 Andy-0.022
Token a
Token ID257
Feature activation+4.25e+00

Pos logit contributions
P+0.024
suits+0.023
 merry+0.023
coat+0.022
ival+0.022
Neg logit contributions
 meant-0.015
 wasn-0.014
 He-0.013
 probably-0.013
 either-0.012
Token special
Token ID2041
Feature activation+1.16e+00

Pos logit contributions
P+0.109
rob+0.105
co+0.101
suits+0.097
 sunscreen+0.093
Neg logit contributions
 meant-0.087
 wasn-0.083
 He-0.079
 probably-0.079
 matter-0.078
Token gift
Token ID6979
Feature activation+1.00e-01

Pos logit contributions
P+0.027
suits+0.025
aid+0.024
ival+0.024
 merry+0.024
Neg logit contributions
 meant-0.025
 wasn-0.025
 probably-0.025
 ending-0.025
arter-0.024
Token.
Token ID13
Feature activation-8.50e-01

Pos logit contributions
aid+0.003
P+0.003
 sunscreen+0.003
coat+0.003
 merry+0.003
Neg logit contributions
arter-0.001
 wasn-0.001
ires-0.001
 meant-0.001
 probably-0.001
Token She
Token ID1375
Feature activation-5.48e-01

Pos logit contributions
arter+0.015
 meant+0.014
 erupt+0.014
ires+0.014
 probably+0.013
Neg logit contributions
P-0.023
aid-0.023
 sunscreen-0.023
suits-0.022
coat-0.022
Token took
Token ID1718
Feature activation+1.99e+00

Pos logit contributions
 wasn+0.010
 meant+0.010
 probably+0.009
 He+0.009
 hadn+0.009
Neg logit contributions
P-0.015
aid-0.014
 sunscreen-0.014
 merry-0.014
suits-0.014
Token Lily
Token ID20037
Feature activation+2.09e+00

Pos logit contributions
P+0.065
 merry+0.064
coat+0.064
suits+0.064
aid+0.061
Neg logit contributions
 meant-0.025
 wasn-0.025
 He-0.025
 Andy-0.024
 Ted-0.022
Token to
Token ID284
Feature activation+1.11e+00

Pos logit contributions
P+0.062
 merry+0.060
aid+0.060
rob+0.058
 sunscreen+0.058
Neg logit contributions
 wasn-0.034
 meant-0.033
 probably-0.030
 He-0.030
 before-0.029
Token.
Token ID13
Feature activation+6.60e-01

Pos logit contributions
P+0.006
aid+0.006
 sunscreen+0.006
suits+0.006
 merry+0.006
Neg logit contributions
arter-0.002
 wasn-0.002
ires-0.002
 meant-0.002
 probably-0.002
Token
Token ID220
Feature activation+6.71e-01

Pos logit contributions
suits+0.014
 sunscreen+0.014
ival+0.014
P+0.014
aid+0.013
Neg logit contributions
arter-0.016
 meant-0.014
rated-0.014
 matter-0.014
cker-0.014
Token
Token ID198
Feature activation-8.52e-02

Pos logit contributions
aid+0.014
ival+0.014
P+0.014
suits+0.013
 sunscreen+0.013
Neg logit contributions
 matter-0.017
arter-0.017
 ending-0.016
rated-0.016
cker-0.014
Token
Token ID198
Feature activation-5.54e-01

Pos logit contributions
arter+0.001
 meant+0.001
 wasn+0.001
 probably+0.001
ires+0.001
Neg logit contributions
P-0.003
aid-0.003
 sunscreen-0.002
suits-0.002
ival-0.002
TokenA
Token ID32
Feature activation+2.07e+00

Pos logit contributions
 wasn+0.011
arter+0.011
 meant+0.010
 probably+0.010
ires+0.010
Neg logit contributions
aid-0.014
P-0.013
 sunscreen-0.013
 merry-0.013
coat-0.013
Token man
Token ID582
Feature activation-6.52e-01

Pos logit contributions
aid+0.042
 sunscreen+0.041
P+0.041
coat+0.039
suits+0.038
Neg logit contributions
 wasn-0.050
 meant-0.050
 erupt-0.048
 Andy-0.047
arter-0.047
Token stepped
Token ID10764
Feature activation+3.70e-01

Pos logit contributions
arter+0.014
ires+0.013
 meant+0.013
 wasn+0.013
 probably+0.013
Neg logit contributions
P-0.015
aid-0.015
 sunscreen-0.015
 merry-0.014
suits-0.014
Token out
Token ID503
Feature activation+1.02e+00

Pos logit contributions
aid+0.010
 merry+0.010
 sandbox+0.009
 sunscreen+0.009
coat+0.009
Neg logit contributions
 He-0.007
 wasn-0.006
 meant-0.006
 probably-0.006
 before-0.006
Token of
Token ID286
Feature activation+9.61e-02

Pos logit contributions
aid+0.029
P+0.028
 sunscreen+0.028
 merry+0.027
suits+0.027
Neg logit contributions
arter-0.017
 probably-0.016
ires-0.016
 wasn-0.016
rated-0.015
Token the
Token ID262
Feature activation+1.42e+00

Pos logit contributions
P+0.003
aid+0.003
 sunscreen+0.003
suits+0.003
 merry+0.003
Neg logit contributions
arter-0.001
 meant-0.001
 wasn-0.001
 probably-0.001
ires-0.001
Token jet
Token ID12644
Feature activation+2.10e-01

Pos logit contributions
aid+0.032
P+0.030
coat+0.029
 sunscreen+0.028
suits+0.028
Neg logit contributions
 meant-0.033
 wasn-0.032
 probably-0.032
arter-0.031
ires-0.031
Token to
Token ID284
Feature activation-2.64e+00

Pos logit contributions
arter+0.017
ires+0.016
 meant+0.016
 probably+0.015
 erupt+0.015
Neg logit contributions
P-0.019
aid-0.018
 sunscreen-0.018
suits-0.018
 merry-0.018
Token spread
Token ID4104
Feature activation+5.27e-01

Pos logit contributions
arter+0.059
ires+0.057
 wasn+0.057
 meant+0.057
 probably+0.054
Neg logit contributions
 sunscreen-0.058
P-0.058
aid-0.058
 merry-0.055
coat-0.053
Token it
Token ID340
Feature activation-1.48e+00

Pos logit contributions
P+0.013
aid+0.013
 merry+0.012
 sunscreen+0.012
suits+0.012
Neg logit contributions
 wasn-0.010
arter-0.010
 meant-0.010
 probably-0.010
ires-0.010
Token on
Token ID319
Feature activation-1.43e+00

Pos logit contributions
 wasn+0.029
arter+0.029
 meant+0.029
 probably+0.028
 erupt+0.028
Neg logit contributions
P-0.037
aid-0.036
 sunscreen-0.034
 merry-0.034
ival-0.033
Token the
Token ID262
Feature activation-2.87e-01

Pos logit contributions
arter+0.029
ires+0.027
 meant+0.027
 wasn+0.027
rated+0.027
Neg logit contributions
 sunscreen-0.035
P-0.035
 merry-0.033
aid-0.033
 gown-0.032
Token ground
Token ID2323
Feature activation-1.63e+00

Pos logit contributions
 probably+0.004
 erupt+0.004
 wasn+0.004
arter+0.004
 meant+0.004
Neg logit contributions
aid-0.009
P-0.009
ival-0.008
suits-0.008
coat-0.008
Token.
Token ID13
Feature activation-5.54e-01

Pos logit contributions
arter+0.023
ires+0.022
 wasn+0.021
 meant+0.021
 probably+0.021
Neg logit contributions
P-0.049
aid-0.049
 sunscreen-0.048
coat-0.047
suits-0.047
Token Suddenly
Token ID24975
Feature activation-1.27e+00

Pos logit contributions
arter+0.010
rated+0.009
 meant+0.009
 erupt+0.009
ires+0.009
Neg logit contributions
P-0.015
 sunscreen-0.015
aid-0.015
ival-0.014
coat-0.014
Token,
Token ID11
Feature activation-4.29e-02

Pos logit contributions
arter+0.026
ires+0.025
 meant+0.024
 wasn+0.024
 probably+0.024
Neg logit contributions
P-0.031
aid-0.031
 sunscreen-0.030
 merry-0.029
suits-0.029
Token a
Token ID257
Feature activation+1.37e+00

Pos logit contributions
arter+0.001
ires+0.001
 wasn+0.001
 meant+0.001
 probably+0.001
Neg logit contributions
P-0.001
aid-0.001
 sunscreen-0.001
 merry-0.001
coat-0.001
Token big
Token ID1263
Feature activation+9.10e-01

Pos logit contributions
P+0.036
aid+0.036
 sunscreen+0.036
suits+0.034
coat+0.034
Neg logit contributions
 wasn-0.024
arter-0.024
 meant-0.023
 probably-0.023
 erupt-0.023
Token But
Token ID887
Feature activation-3.45e+00

Pos logit contributions
aid+0.014
P+0.014
ival+0.014
 sunscreen+0.014
suits+0.013
Neg logit contributions
arter-0.010
rated-0.009
 matter-0.008
 ending-0.008
 meant-0.008
Token as
Token ID355
Feature activation+1.75e+00

Pos logit contributions
arter+0.078
ires+0.072
 meant+0.065
uders+0.065
rated+0.065
Neg logit contributions
 sunscreen-0.078
aid-0.076
P-0.075
 merry-0.074
 ha-0.073
Token he
Token ID339
Feature activation+1.30e+00

Pos logit contributions
P+0.049
 merry+0.048
aid+0.048
 sunscreen+0.047
 seas+0.045
Neg logit contributions
arter-0.030
ires-0.029
uders-0.027
 probably-0.027
der-0.026
Token climbed
Token ID19952
Feature activation+1.69e+00

Pos logit contributions
P+0.032
aid+0.032
 merry+0.029
 sunscreen+0.029
coat+0.028
Neg logit contributions
arter-0.029
ires-0.027
 erupt-0.025
 probably-0.025
 meant-0.025
Token,
Token ID11
Feature activation+1.07e+00

Pos logit contributions
P+0.038
aid+0.037
 sunscreen+0.036
suits+0.034
 merry+0.034
Neg logit contributions
arter-0.038
 wasn-0.037
 meant-0.037
 probably-0.036
ires-0.036
Token the
Token ID262
Feature activation-9.57e-01

Pos logit contributions
P+0.030
 sunscreen+0.030
aid+0.029
 seas+0.028
 merry+0.028
Neg logit contributions
arter-0.019
ires-0.018
rated-0.016
iny-0.016
uders-0.016
Token leaf
Token ID12835
Feature activation-8.42e-01

Pos logit contributions
arter+0.021
 wasn+0.021
 meant+0.021
ires+0.020
 probably+0.020
Neg logit contributions
P-0.022
aid-0.021
 sunscreen-0.021
 merry-0.020
coat-0.020
Token slipped
Token ID18859
Feature activation-1.13e+00

Pos logit contributions
arter+0.019
ires+0.018
 erupt+0.016
 God+0.016
 meant+0.016
Neg logit contributions
P-0.020
aid-0.020
coat-0.019
 merry-0.019
 sunscreen-0.019
Token and
Token ID290
Feature activation+2.72e-01

Pos logit contributions
arter+0.026
ires+0.025
uders+0.022
 Next+0.022
 meant+0.022
Neg logit contributions
P-0.026
aid-0.025
 sunscreen-0.025
coat-0.025
 merry-0.024
Token he
Token ID339
Feature activation+1.12e+00

Pos logit contributions
P+0.007
aid+0.007
 sunscreen+0.006
 merry+0.006
coat+0.006
Neg logit contributions
arter-0.006
ires-0.006
 meant-0.005
uders-0.005
 probably-0.005
Token fell
Token ID3214
Feature activation+8.18e-01

Pos logit contributions
P+0.027
aid+0.027
 merry+0.026
 sunscreen+0.025
coat+0.025
Neg logit contributions
arter-0.024
ires-0.023
 meant-0.021
 erupt-0.021
 probably-0.021
Token they
Token ID484
Feature activation+6.27e-02

Pos logit contributions
 sunscreen+0.009
P+0.009
aid+0.009
 merry+0.008
coat+0.008
Neg logit contributions
arter-0.005
ires-0.005
rated-0.005
uders-0.005
 meant-0.005
Token went
Token ID1816
Feature activation+3.92e+00

Pos logit contributions
aid+0.002
 merry+0.002
coat+0.002
P+0.002
 sunscreen+0.002
Neg logit contributions
 wasn-0.001
 meant-0.001
 He-0.001
 probably-0.001
 erupt-0.001
Token on
Token ID319
Feature activation+5.29e-01

Pos logit contributions
aid+0.123
P+0.117
ival+0.113
 merry+0.111
 sunscreen+0.109
Neg logit contributions
 He-0.064
 about-0.054
 Andy-0.053
arter-0.051
 meant-0.051
Token a
Token ID257
Feature activation+3.68e+00

Pos logit contributions
aid+0.016
P+0.014
coat+0.014
ival+0.014
suits+0.013
Neg logit contributions
 meant-0.009
 Andy-0.009
 wasn-0.009
 He-0.009
 probably-0.009
Token walk
Token ID2513
Feature activation+2.04e+00

Pos logit contributions
aid+0.101
P+0.082
suits+0.081
ival+0.079
eal+0.079
Neg logit contributions
 erupt-0.076
 probably-0.076
 wasn-0.073
 Andy-0.071
arter-0.070
Token and
Token ID290
Feature activation-5.07e-01

Pos logit contributions
aid+0.060
P+0.057
 sunscreen+0.057
 merry+0.055
ival+0.054
Neg logit contributions
 He-0.033
 meant-0.033
 wasn-0.032
arter-0.031
 Andy-0.030
Token saw
Token ID2497
Feature activation-1.55e+00

Pos logit contributions
 wasn+0.010
 Andy+0.010
 meant+0.009
 He+0.009
 probably+0.009
Neg logit contributions
aid-0.014
P-0.013
 sunscreen-0.012
coat-0.012
 merry-0.012
Token a
Token ID257
Feature activation+1.98e+00

Pos logit contributions
 meant+0.030
 He+0.029
 wasn+0.029
 Andy+0.028
arter+0.028
Neg logit contributions
P-0.040
aid-0.039
coat-0.039
suits-0.038
 gown-0.037
Token big
Token ID1263
Feature activation+8.81e-01

Pos logit contributions
aid+0.052
P+0.049
 sunscreen+0.048
shirts+0.046
coat+0.046
Neg logit contributions
 wasn-0.038
arter-0.038
 erupt-0.037
 probably-0.037
 ending-0.036
Token loop
Token ID9052
Feature activation-1.09e+00

Pos logit contributions
aid+0.023
P+0.022
 sunscreen+0.020
suits+0.020
ival+0.020
Neg logit contributions
arter-0.018
 wasn-0.017
 erupt-0.017
 meant-0.017
 probably-0.017
Token in
Token ID287
Feature activation-1.65e+00

Pos logit contributions
arter+0.025
ires+0.023
 wasn+0.023
 meant+0.023
 Next+0.023
Neg logit contributions
P-0.025
coat-0.024
aid-0.023
 sunscreen-0.022
suits-0.022
Token.
Token ID13
Feature activation-4.67e-01

Pos logit contributions
P+0.026
aid+0.025
coat+0.025
 sunscreen+0.025
suits+0.025
Neg logit contributions
arter-0.014
ires-0.013
 meant-0.013
rated-0.013
 wasn-0.012
Token He
Token ID679
Feature activation-3.57e-01

Pos logit contributions
arter+0.009
 probably+0.007
 meant+0.007
ires+0.007
rated+0.007
Neg logit contributions
aid-0.012
suits-0.012
coat-0.012
 sunscreen-0.012
ival-0.012
Token was
Token ID373
Feature activation-4.28e-01

Pos logit contributions
 wasn+0.007
 meant+0.007
 probably+0.006
 He+0.006
 erupt+0.006
Neg logit contributions
aid-0.010
P-0.009
 sunscreen-0.009
 gown-0.008
ival-0.008
Token so
Token ID523
Feature activation+8.46e-01

Pos logit contributions
 meant+0.009
 wasn+0.009
 probably+0.009
 He+0.009
 erupt+0.009
Neg logit contributions
P-0.010
aid-0.010
 sunscreen-0.009
suits-0.009
ival-0.009
Token happy
Token ID3772
Feature activation-1.97e+00

Pos logit contributions
P+0.017
aid+0.017
 sunscreen+0.016
 merry+0.015
suits+0.015
Neg logit contributions
arter-0.021
 wasn-0.021
 meant-0.021
ires-0.020
 probably-0.020
Token that
Token ID326
Feature activation-4.27e-01

Pos logit contributions
arter+0.053
rated+0.050
ires+0.047
 hasn+0.046
cker+0.046
Neg logit contributions
P-0.035
 sunscreen-0.033
 merry-0.033
coat-0.032
suits-0.032
Token he
Token ID339
Feature activation-8.12e-01

Pos logit contributions
arter+0.010
ires+0.009
uders+0.009
rated+0.009
cker+0.009
Neg logit contributions
 merry-0.010
P-0.010
 sunscreen-0.010
 ha-0.009
 seas-0.009
Token was
Token ID373
Feature activation-9.24e-02

Pos logit contributions
 wasn+0.014
 meant+0.014
arter+0.014
 probably+0.014
 He+0.013
Neg logit contributions
aid-0.022
P-0.022
 sunscreen-0.021
 merry-0.020
suits-0.020
Token getting
Token ID1972
Feature activation+2.73e+00

Pos logit contributions
 meant+0.002
 wasn+0.002
arter+0.002
 probably+0.002
 erupt+0.002
Neg logit contributions
P-0.002
aid-0.002
suits-0.002
 sunscreen-0.002
ival-0.002
Token faster
Token ID5443
Feature activation+8.59e-01

Pos logit contributions
suits+0.068
 gown+0.067
P+0.067
coat+0.066
ival+0.066
Neg logit contributions
 meant-0.053
 He-0.051
 wasn-0.050
 erupt-0.047
 before-0.047
Token and
Token ID290
Feature activation+4.84e-01

Pos logit contributions
P+0.026
aid+0.026
 sunscreen+0.026
 merry+0.025
suits+0.025
Neg logit contributions
arter-0.012
 wasn-0.011
 meant-0.011
ires-0.011
 probably-0.011
Token very
Token ID845
Feature activation+1.61e+00

Pos logit contributions
aid+0.014
 sunscreen+0.013
P+0.013
 merry+0.013
coat+0.011
Neg logit contributions
arter-0.023
 wasn-0.022
ires-0.022
 meant-0.022
 probably-0.021
Token happy
Token ID3772
Feature activation-9.68e-01

Pos logit contributions
aid+0.031
P+0.031
 sunscreen+0.029
 merry+0.028
suits+0.027
Neg logit contributions
arter-0.042
 wasn-0.041
 meant-0.040
 probably-0.040
ires-0.040
Token.
Token ID13
Feature activation+9.82e-01

Pos logit contributions
rated+0.019
arter+0.018
 probably+0.016
 meant+0.016
ires+0.016
Neg logit contributions
aid-0.025
P-0.024
 sunscreen-0.024
 merry-0.024
suits-0.023
Token<|endoftext|>
Token ID50256
Feature activation+3.05e-01

Pos logit contributions
P+0.024
aid+0.023
 sunscreen+0.022
suits+0.022
Of+0.021
Neg logit contributions
rated-0.022
arter-0.021
 probably-0.020
 matter-0.019
 wasn-0.019
TokenOnce
Token ID7454
Feature activation+2.17e+00

Pos logit contributions
P+0.007
suits+0.006
aid+0.006
ival+0.006
 merry+0.006
Neg logit contributions
arter-0.008
rated-0.007
 matter-0.007
 probably-0.007
ires-0.007
Token upon
Token ID2402
Feature activation-4.71e+00

Pos logit contributions
P+0.046
suits+0.046
 sunscreen+0.042
 merry+0.041
 home+0.041
Neg logit contributions
uders-0.047
 probably-0.047
rated-0.047
uding-0.046
 spew-0.046
Token a
Token ID257
Feature activation+3.50e+00

Pos logit contributions
arter+0.142
ires+0.134
rated+0.132
uders+0.131
 wasn+0.130
Neg logit contributions
P-0.078
aid-0.075
 merry-0.070
 sunscreen-0.069
 gown-0.063
Token time
Token ID640
Feature activation+1.62e+00

Pos logit contributions
P+0.103
suits+0.094
coat+0.094
 sunscreen+0.094
aid+0.092
Neg logit contributions
arter-0.059
 meant-0.058
ires-0.056
 wasn-0.055
iring-0.053
Token,
Token ID11
Feature activation+1.40e+00

Pos logit contributions
P+0.026
suits+0.020
 merry+0.020
 ha+0.020
 sunscreen+0.019
Neg logit contributions
rated-0.051
 probably-0.048
 meant-0.047
cker-0.047
arter-0.047
Token there
Token ID612
Feature activation-3.29e-01

Pos logit contributions
P+0.023
aid+0.019
suits+0.018
 merry+0.018
 sunscreen+0.017
Neg logit contributions
arter-0.041
der-0.040
uders-0.039
places-0.038
istent-0.038
Token was
Token ID373
Feature activation-2.30e+00

Pos logit contributions
arter+0.008
der+0.008
ires+0.008
cker+0.008
rated+0.008
Neg logit contributions
 merry-0.007
P-0.006
 dress-0.006
suits-0.006
 sandbox-0.006
Token be
Token ID307
Feature activation+7.69e-01

Pos logit contributions
arter+0.043
ires+0.041
 wasn+0.041
 meant+0.040
 erupt+0.038
Neg logit contributions
 sunscreen-0.064
aid-0.063
P-0.063
 merry-0.060
coat-0.059
Token sleeping
Token ID11029
Feature activation+6.96e-01

Pos logit contributions
ival+0.019
suits+0.018
coat+0.018
aid+0.017
P+0.017
Neg logit contributions
 meant-0.017
 probably-0.016
 wasn-0.016
 He-0.015
 actually-0.015
Token,
Token ID11
Feature activation-5.89e-01

Pos logit contributions
P+0.019
aid+0.019
 sunscreen+0.018
 merry+0.018
suits+0.018
Neg logit contributions
arter-0.012
 wasn-0.012
 meant-0.012
ires-0.011
 probably-0.011
Token it
Token ID340
Feature activation-3.03e+00

Pos logit contributions
arter+0.011
ires+0.011
 erupt+0.010
 meant+0.010
rated+0.010
Neg logit contributions
 sunscreen-0.015
 merry-0.015
aid-0.014
P-0.014
 gown-0.014
Token is
Token ID318
Feature activation-2.74e+00

Pos logit contributions
arter+0.076
ires+0.071
iring+0.063
 erupt+0.062
uders+0.062
Neg logit contributions
P-0.059
 merry-0.056
coat-0.055
suits-0.055
 gown-0.055
Token not
Token ID407
Feature activation-4.95e+00

Pos logit contributions
arter+0.064
ires+0.062
uders+0.057
iring+0.057
 arose+0.057
Neg logit contributions
 sunscreen-0.059
P-0.056
 merry-0.055
aid-0.053
 leash-0.051
Token time
Token ID640
Feature activation-8.75e-01

Pos logit contributions
arter+0.124
ires+0.122
 wasn+0.116
 erupt+0.113
 Next+0.111
Neg logit contributions
aid-0.095
 sunscreen-0.095
P-0.092
 merry-0.091
coat-0.086
Token to
Token ID284
Feature activation-2.01e+00

Pos logit contributions
arter+0.019
 meant+0.019
 wasn+0.019
ires+0.018
 probably+0.018
Neg logit contributions
P-0.020
aid-0.020
 sunscreen-0.019
 merry-0.018
suits-0.018
Token open
Token ID1280
Feature activation-6.36e-01

Pos logit contributions
ires+0.046
arter+0.044
 wasn+0.043
 meant+0.042
iring+0.041
Neg logit contributions
 sunscreen-0.045
P-0.044
aid-0.043
 merry-0.042
 gown-0.040
Token your
Token ID534
Feature activation+2.25e+00

Pos logit contributions
arter+0.010
 wasn+0.010
 meant+0.010
ires+0.010
 probably+0.010
Neg logit contributions
P-0.018
aid-0.018
 sunscreen-0.017
 merry-0.017
suits-0.017
Token gifts
Token ID13201
Feature activation-2.97e-01

Pos logit contributions
aid+0.067
P+0.067
 sunscreen+0.062
ival+0.061
coat+0.061
Neg logit contributions
arter-0.036
 wasn-0.034
 meant-0.033
 probably-0.033
ires-0.033
Token does
Token ID857
Feature activation-3.37e+00

Pos logit contributions
arter+0.028
ires+0.026
 meant+0.026
rated+0.026
 Andy+0.025
Neg logit contributions
aid-0.032
P-0.032
 merry-0.030
 sunscreen-0.030
 gown-0.029
Token not
Token ID407
Feature activation-3.45e+00

Pos logit contributions
arter+0.039
 wasn+0.036
 meant+0.036
ires+0.033
rated+0.033
Neg logit contributions
aid-0.113
P-0.110
 sunscreen-0.110
 merry-0.106
 gown-0.103
Token answer
Token ID3280
Feature activation-2.01e+00

Pos logit contributions
arter+0.070
 wasn+0.068
ires+0.067
 probably+0.065
 meant+0.065
Neg logit contributions
aid-0.086
P-0.084
 sunscreen-0.081
 merry-0.079
coat-0.078
Token.
Token ID13
Feature activation-1.24e+00

Pos logit contributions
arter+0.031
rated+0.029
ires+0.028
 erupt+0.027
 probably+0.027
Neg logit contributions
aid-0.058
 merry-0.056
 sunscreen-0.056
 gown-0.055
P-0.055
Token It
Token ID632
Feature activation-2.68e+00

Pos logit contributions
arter+0.018
rated+0.017
 erupt+0.017
uders+0.016
cker+0.016
Neg logit contributions
aid-0.038
P-0.037
 sunscreen-0.035
ival-0.034
 merry-0.034
Token just
Token ID655
Feature activation-4.46e+00

Pos logit contributions
arter+0.062
ires+0.057
 Andy+0.057
iring+0.056
 ending+0.055
Neg logit contributions
P-0.058
aid-0.057
 merry-0.057
 sunscreen-0.055
 gown-0.053
Token w
Token ID266
Feature activation-5.32e-01

Pos logit contributions
arter+0.101
ires+0.098
iring+0.095
 wasn+0.091
 probably+0.089
Neg logit contributions
aid-0.098
 sunscreen-0.095
 merry-0.092
P-0.091
 gown-0.087
Tokenigg
Token ID6950
Feature activation-2.40e+00

Pos logit contributions
arter+0.014
ires+0.014
 erupt+0.013
 probably+0.013
 meant+0.013
Neg logit contributions
P-0.010
aid-0.009
 merry-0.009
 sunscreen-0.009
coat-0.009
Tokenles
Token ID829
Feature activation+3.22e-03

Pos logit contributions
arter+0.062
 erupt+0.059
 wasn+0.059
 probably+0.059
 meant+0.058
Neg logit contributions
P-0.046
aid-0.046
 sunscreen-0.043
suits-0.042
 merry-0.040
Token some
Token ID617
Feature activation+1.12e+00

Pos logit contributions
aid+0.000
P+0.000
 sunscreen+0.000
 merry+0.000
suits+0.000
Neg logit contributions
arter-0.000
 wasn-0.000
ires-0.000
 probably-0.000
 meant-0.000
Token more
Token ID517
Feature activation+1.11e+00

Pos logit contributions
P+0.025
aid+0.025
 sunscreen+0.024
 merry+0.023
suits+0.023
Neg logit contributions
arter-0.025
 wasn-0.024
 meant-0.024
ires-0.024
 probably-0.024
TokenThe
Token ID464
Feature activation-2.80e+00

Pos logit contributions
 wasn+0.035
 meant+0.035
arter+0.035
 He+0.034
 probably+0.034
Neg logit contributions
aid-0.039
P-0.039
 sunscreen-0.038
 merry-0.038
suits-0.037
Token foolish
Token ID19538
Feature activation-2.43e-01

Pos logit contributions
arter+0.066
 wasn+0.064
 meant+0.064
ires+0.063
 probably+0.063
Neg logit contributions
P-0.060
aid-0.059
 sunscreen-0.056
 merry-0.054
suits-0.054
Token muff
Token ID27563
Feature activation-1.49e+00

Pos logit contributions
arter+0.006
 wasn+0.005
ires+0.005
 Andy+0.005
 meant+0.005
Neg logit contributions
aid-0.005
 merry-0.005
P-0.005
 ha-0.005
 sunscreen-0.005
Tokenin
Token ID259
Feature activation+5.52e-01

Pos logit contributions
arter+0.044
 wasn+0.042
 meant+0.042
 probably+0.042
 erupt+0.042
Neg logit contributions
P-0.024
aid-0.024
 sunscreen-0.021
ival-0.020
 merry-0.020
Token was
Token ID373
Feature activation-2.06e+00

Pos logit contributions
P+0.014
aid+0.014
suits+0.013
ival+0.013
coat+0.013
Neg logit contributions
arter-0.012
ires-0.011
 probably-0.009
 erupt-0.009
 ending-0.009
Token not
Token ID407
Feature activation-4.60e+00

Pos logit contributions
arter+0.054
ires+0.052
 wasn+0.049
 Next+0.049
rated+0.049
Neg logit contributions
aid-0.039
 sunscreen-0.037
 merry-0.036
P-0.036
coat-0.034
Token happy
Token ID3772
Feature activation-2.30e+00

Pos logit contributions
arter+0.118
ires+0.116
 wasn+0.109
 Next+0.107
uders+0.107
Neg logit contributions
aid-0.090
 sunscreen-0.085
P-0.084
 merry-0.082
coat-0.080
Token.
Token ID13
Feature activation-1.10e+00

Pos logit contributions
cker+0.053
rated+0.052
arter+0.049
iny+0.048
ires+0.047
Neg logit contributions
 sunglasses-0.045
 purse-0.042
 parties-0.042
 lipstick-0.042
 fashion-0.042
Token He
Token ID679
Feature activation-7.87e-01

Pos logit contributions
arter+0.016
rated+0.014
 ending+0.013
cker+0.013
 explode+0.013
Neg logit contributions
aid-0.033
ival-0.033
P-0.032
Of-0.031
 sunscreen-0.031
Token knew
Token ID2993
Feature activation-6.31e-01

Pos logit contributions
arter+0.017
ires+0.016
 erupt+0.014
 probably+0.014
 meant+0.014
Neg logit contributions
P-0.019
aid-0.019
 sunscreen-0.018
 merry-0.018
suits-0.018
Token he
Token ID339
Feature activation-4.11e-01

Pos logit contributions
arter+0.012
rated+0.012
ires+0.011
 erupt+0.011
uders+0.011
Neg logit contributions
aid-0.016
 sunscreen-0.016
P-0.016
 merry-0.015
 ha-0.015
Token a
Token ID257
Feature activation+4.37e+00

Pos logit contributions
P+0.107
suits+0.106
 merry+0.103
coat+0.102
aid+0.102
Neg logit contributions
 wasn-0.050
 meant-0.049
 probably-0.049
 He-0.045
 erupt-0.044
Token reward
Token ID6721
Feature activation+9.29e-01

Pos logit contributions
rob+0.113
P+0.112
aid+0.106
suits+0.102
 sunscreen+0.101
Neg logit contributions
 wasn-0.091
 ending-0.089
 probably-0.086
 matter-0.083
 erupt-0.083
Token.
Token ID13
Feature activation+1.31e+00

Pos logit contributions
P+0.029
aid+0.029
 sunscreen+0.028
coat+0.028
suits+0.028
Neg logit contributions
arter-0.012
ires-0.011
 wasn-0.011
 meant-0.011
 erupt-0.011
Token<|endoftext|>
Token ID50256
Feature activation-4.50e-02

Pos logit contributions
P+0.034
aid+0.033
 sunscreen+0.030
 merry+0.029
Of+0.029
Neg logit contributions
arter-0.027
rated-0.026
 erupt-0.025
ires-0.025
uders-0.024
TokenOnce
Token ID7454
Feature activation+2.18e+00

Pos logit contributions
arter+0.001
ires+0.001
rated+0.001
 erupt+0.001
 probably+0.001
Neg logit contributions
P-0.001
suits-0.001
aid-0.001
ival-0.001
 sunscreen-0.001
Token upon
Token ID2402
Feature activation-4.75e+00

Pos logit contributions
P+0.051
 merry+0.050
suits+0.050
 purse+0.049
 sunscreen+0.048
Neg logit contributions
uders-0.041
arter-0.041
 erupt-0.041
 probably-0.040
 spew-0.039
Token a
Token ID257
Feature activation+2.98e+00

Pos logit contributions
arter+0.131
ires+0.124
 wasn+0.121
 meant+0.120
 erupt+0.120
Neg logit contributions
P-0.087
aid-0.086
 sunscreen-0.082
 merry-0.080
coat-0.075
Token time
Token ID640
Feature activation+1.31e+00

Pos logit contributions
P+0.094
aid+0.089
 sunscreen+0.089
 merry+0.087
coat+0.087
Neg logit contributions
arter-0.042
 meant-0.039
ires-0.039
 wasn-0.038
 erupt-0.037
Token,
Token ID11
Feature activation+1.21e+00

Pos logit contributions
P+0.024
 merry+0.022
suits+0.022
 sunscreen+0.022
coat+0.021
Neg logit contributions
arter-0.035
rated-0.034
ires-0.034
 meant-0.034
 probably-0.034
Token a
Token ID257
Feature activation+1.63e+00

Pos logit contributions
P+0.024
aid+0.022
suits+0.022
 sunscreen+0.021
 merry+0.021
Neg logit contributions
arter-0.031
ires-0.030
 erupt-0.029
 meant-0.029
der-0.029
Token generous
Token ID14431
Feature activation-1.51e-02

Pos logit contributions
aid+0.045
P+0.045
 sunscreen+0.043
 merry+0.042
coat+0.042
Neg logit contributions
arter-0.028
 wasn-0.027
 meant-0.027
ires-0.026
 probably-0.026