TOP ACTIVATIONS
MAX = 0.095

the cake with oil and decorated it with sprink les .
and fun . They had decorated the plant . They had
to eat and so he slur ped up the apples au
she used to tie all of her toys together at the
The team decorated the room with balloons ,
see how much Lily had decorated her room . She told
worm with her eye and slur ped it up with a
and the way they had decorated it . <|endoftext|> Once upon
made a new cake and decorated it with candles and sprink
the stars and planets in outer space . So one day
that he painted silver and decorated with some black stripes .
, Lily and her family decorated the house with balloons and
The dog l icks Lily 's face . He
he lun ged forward and slur ped it all up .
so happy that she had decorated it and it was safe
group of tur keys had decorated the tree with colourful feathers
the ostr ich . They decorated the ostr ich with some
and colourful . They had decorated it perfectly ! Everyone was
always very colourful and beautifully decorated . <|endoftext|> Once upon a
it , and then you slur p it down your mouth

MIN ACTIVATIONS
MIN = -0.113

a time there was a Stub born Old Woman . She
just a spider ." The Stub born Old Woman shook her
she did so , the Stub born Old Woman said ,
." The Stub born Old Woman chuckled and
neck that read , " Fore ver Friends ".
They laughed and shouted ' fore ' a lot .
. <|endoftext|> Once upon a time , there was a little
. <|endoftext|> Once upon a time , there was a little
. <|endoftext|> Once upon a time , there was a happy
. <|endoftext|> Once upon a time , there was a kid
. <|endoftext|> Once upon a time , there was a big
." <|endoftext|> Once upon a time , there was a little
. <|endoftext|> Once upon a time , there was a little
. <|endoftext|> Once upon a time , there was a z
. <|endoftext|> Once upon a time , there was a little
. <|endoftext|> Once upon a time , there was a little
. <|endoftext|> Once upon a time , there was a little
. <|endoftext|> Once upon a time , there was a little
. <|endoftext|> Once upon a time , there was a little
. <|endoftext|> Once upon a time , there was a little

INTERVAL 0.043 - 0.095
CONTAINS 4.899%

fix it . The fairy waved her wand
's home . But when they got there
winner . The giant looked at Tom 's sand castle and
to find its prey . The bunny was frightened
had two steps . His father put the steps

INTERVAL -0.009 - 0.043
CONTAINS 60.338%

and approached the lion . He saw his big man e
" Yes please !" Johnny said with a smile , so
butterfly and show it to his mom . Benny ran after
. They 're so delicious !" But his mother could see
offered to give them a beautiful , soft carpet for their

INTERVAL -0.061 - -0.009
CONTAINS 33.501%

was not angry . He was cheerful . "
play with his ball . One day , Max 's ball
to them . But now is not the time . The
it was full of shiny , colourful things and surprises !
Tim 's friend , Sam , came to visit

INTERVAL -0.113 - -0.061
CONTAINS 1.263%

. <|endoftext|> Once upon a time there was a selfish man
basis . <|endoftext|> One day , a rh in oc eros
that this wouldn ’ t be the last time
Mom my said , " Because he can pay for anything
fit . The crystal is too big for the ring .
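
The four INTERVAL boundaries above are consistent with splitting the range from MIN (-0.113) to MAX (0.095) into equal-width bins. The sketch below checks that reading. It assumes equal-width binning, which the dump does not state, and the helper name interval_edges is hypothetical, not part of the tool that produced this output.

```python
# Assumption: the dashboard divides [MIN, MAX] into four equal-width bins.
# interval_edges is an illustrative helper, not an API of the original tool.

def interval_edges(lo, hi, n_bins=4):
    """Return the n_bins + 1 edges of equal-width bins over [lo, hi], rounded to 3 places."""
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]

# MIN and MAX as reported in the dump.
edges = interval_edges(-0.113, 0.095)

# CONTAINS percentages as reported, listed from the highest interval to the lowest.
shares = [4.899, 60.338, 33.501, 1.263]
total_share = sum(shares)

print(edges)
print(round(total_share, 3))
```

Under this assumption each bin is 0.052 wide, and the reported percentages sum to roughly 100% (the small excess is rounding in the displayed values).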
Token the
Token ID262
Feature activation+3.46e-02

Pos logit contributions
irs+0.097
ide+0.085
ushes+0.078
of+0.078
angles+0.078
Neg logit contributions
 afraid-0.071
 oh-0.071
 worried-0.067
 scared-0.064
 careless-0.063
Token cake
Token ID12187
Feature activation+1.78e-02

Pos logit contributions
irs+0.051
ide+0.048
 parties+0.047
 journals+0.045
ingu+0.045
Neg logit contributions
 afraid-0.066
 oh-0.064
 worried-0.058
 scared-0.057
Run-0.056
Token with
Token ID351
Feature activation+2.43e-02

Pos logit contributions
irs+0.046
aff+0.036
reci+0.035
ide+0.035
of+0.035
Neg logit contributions
 worried-0.020
 scared-0.020
 afraid-0.020
 very-0.020
 oh-0.019
Token oil
Token ID3056
Feature activation-1.17e-02

Pos logit contributions
irs+0.046
ide+0.042
aff+0.037
of+0.037
angles+0.036
Neg logit contributions
 afraid-0.042
 oh-0.042
 worried-0.040
 scared-0.039
 careless-0.038
Token and
Token ID290
Feature activation+8.54e-03

Pos logit contributions
 afraid+0.017
 oh+0.017
 worried+0.016
 scared+0.016
 careless+0.015
Neg logit contributions
irs-0.025
ide-0.022
ushes-0.020
angles-0.020
of-0.020
Token decorated
Token ID24789
Feature activation+8.92e-02

Pos logit contributions
irs+0.015
ide+0.012
ushes+0.011
angles+0.011
of+0.011
Neg logit contributions
 afraid-0.016
 oh-0.016
 worried-0.015
 scared-0.015
 careless-0.015
Token it
Token ID340
Feature activation+2.74e-02

Pos logit contributions
irs+0.223
ide+0.193
of+0.187
aff+0.183
ingu+0.183
Neg logit contributions
 oh-0.104
 afraid-0.088
 worried-0.084
 scared-0.083
 careless-0.075
Token with
Token ID351
Feature activation+2.25e-02

Pos logit contributions
irs+0.066
ingu+0.051
of+0.051
 ease+0.051
aters+0.051
Neg logit contributions
 afraid-0.036
 scared-0.036
 very-0.034
 worried-0.033
 oh-0.032
Token sprink
Token ID27562
Feature activation+3.89e-02

Pos logit contributions
irs+0.045
ide+0.040
of+0.036
aff+0.036
angles+0.036
Neg logit contributions
 afraid-0.037
 oh-0.037
 worried-0.035
 scared-0.034
 careless-0.033
Tokenles
Token ID829
Feature activation+5.21e-02

Pos logit contributions
irs+0.053
ide+0.042
 rebuild+0.039
ushes+0.039
 occasions+0.036
Neg logit contributions
 worried-0.080
 afraid-0.079
 scared-0.077
 careless-0.077
 foolish-0.075
Token.
Token ID13
Feature activation-6.98e-03

Pos logit contributions
irs+0.101
ide+0.084
angles+0.079
ushes+0.078
of+0.077
Neg logit contributions
 afraid-0.090
 oh-0.087
 worried-0.085
 scared-0.083
 careless-0.080
Token and
Token ID290
Feature activation-8.55e-03

Pos logit contributions
irs+0.046
ide+0.043
ushes+0.038
 parties+0.037
of+0.036
Neg logit contributions
 afraid-0.073
 oh-0.072
 worried-0.069
 careless-0.066
 scared-0.066
Token fun
Token ID1257
Feature activation+5.32e-03

Pos logit contributions
 afraid+0.014
 oh+0.014
 worried+0.013
Run+0.012
 scared+0.012
Neg logit contributions
irs-0.015
ushes-0.013
 parties-0.013
ide-0.013
 outfits-0.013
Token.
Token ID13
Feature activation-1.05e-02

Pos logit contributions
irs+0.010
ide+0.009
of+0.008
aff+0.008
angles+0.008
Neg logit contributions
 afraid-0.010
 oh-0.009
 worried-0.009
 scared-0.009
 careless-0.008
Token They
Token ID1119
Feature activation+2.50e-02

Pos logit contributions
 afraid+0.023
 oh+0.023
 worried+0.022
 scared+0.022
 careless+0.021
Neg logit contributions
irs-0.016
ide-0.013
of-0.012
angles-0.012
ades-0.011
Token had
Token ID550
Feature activation-4.43e-03

Pos logit contributions
irs+0.057
ide+0.049
ushes+0.046
angles+0.045
ades+0.045
Neg logit contributions
 afraid-0.037
 worried-0.035
 oh-0.034
 scared-0.034
 careless-0.033
Token decorated
Token ID24789
Feature activation+8.89e-02

Pos logit contributions
 afraid+0.006
 oh+0.005
 worried+0.005
 scared+0.005
 careless+0.005
Neg logit contributions
irs-0.011
ide-0.010
angles-0.009
aff-0.009
of-0.009
Token the
Token ID262
Feature activation+2.85e-02

Pos logit contributions
irs+0.179
ide+0.151
of+0.149
ingu+0.142
aff+0.141
Neg logit contributions
 oh-0.144
 afraid-0.140
 scared-0.132
 worried-0.131
 foolish-0.122
Token plant
Token ID4618
Feature activation+2.09e-02

Pos logit contributions
irs+0.048
ide+0.041
ushes+0.038
angles+0.037
aff+0.037
Neg logit contributions
 afraid-0.055
 oh-0.055
 worried-0.052
 scared-0.050
 careless-0.049
Token.
Token ID13
Feature activation-7.04e-03

Pos logit contributions
irs+0.046
ushes+0.035
 ease+0.035
aff+0.035
angles+0.035
Neg logit contributions
 afraid-0.032
 oh-0.030
 scared-0.030
 worried-0.030
 careless-0.029
Token They
Token ID1119
Feature activation+2.88e-02

Pos logit contributions
 afraid+0.014
 oh+0.013
 worried+0.013
 scared+0.013
 careless+0.012
Neg logit contributions
irs-0.012
ide-0.011
of-0.010
angles-0.009
ades-0.009
Token had
Token ID550
Feature activation-1.15e-03

Pos logit contributions
irs+0.068
ide+0.058
ades+0.054
ushes+0.054
aff+0.054
Neg logit contributions
 afraid-0.041
 worried-0.039
 scared-0.039
 oh-0.037
 careless-0.036
Token to
Token ID284
Feature activation+3.28e-02

Pos logit contributions
 afraid+0.084
 oh+0.080
 worried+0.076
 scared+0.074
 careless+0.072
Neg logit contributions
irs-0.074
ide-0.063
of-0.061
 parties-0.060
 ease-0.058
Token eat
Token ID4483
Feature activation+3.45e-02

Pos logit contributions
irs+0.057
ushes+0.049
lights+0.047
ades+0.046
ide+0.046
Neg logit contributions
 afraid-0.059
 oh-0.059
 scared-0.057
 worried-0.057
 careless-0.054
Token and
Token ID290
Feature activation-7.12e-04

Pos logit contributions
irs+0.060
ide+0.051
ushes+0.048
angles+0.047
aff+0.046
Neg logit contributions
 oh-0.066
 afraid-0.065
 worried-0.061
 scared-0.060
 careless-0.059
Token so
Token ID523
Feature activation-2.76e-02

Pos logit contributions
 afraid+0.001
 oh+0.001
 worried+0.001
 scared+0.001
 careless+0.001
Neg logit contributions
irs-0.001
ide-0.001
ushes-0.001
angles-0.001
aff-0.001
Token he
Token ID339
Feature activation+1.88e-02

Pos logit contributions
 afraid+0.046
 oh+0.045
 worried+0.043
 scared+0.041
 careless+0.040
Neg logit contributions
irs-0.055
ide-0.048
ushes-0.045
angles-0.044
aff-0.044
Token slur
Token ID40066
Feature activation+8.63e-02

Pos logit contributions
irs+0.036
ide+0.033
of+0.030
angles+0.030
ushes+0.030
Neg logit contributions
 afraid-0.031
 oh-0.031
 worried-0.028
 scared-0.027
 careless-0.026
Tokenped
Token ID9124
Feature activation+3.71e-02

Pos logit contributions
irs+0.182
ide+0.157
ushes+0.151
angles+0.147
 parties+0.146
Neg logit contributions
 oh-0.134
 afraid-0.131
 worried-0.121
 scared-0.119
 careless-0.115
Token up
Token ID510
Feature activation+2.69e-02

Pos logit contributions
irs+0.064
ide+0.056
of+0.051
ushes+0.051
 ease+0.051
Neg logit contributions
 afraid-0.069
 oh-0.067
 worried-0.064
 careless-0.061
 scared-0.061
Token the
Token ID262
Feature activation+1.87e-02

Pos logit contributions
irs+0.049
ide+0.041
ushes+0.038
angles+0.038
of+0.038
Neg logit contributions
 afraid-0.049
 oh-0.049
 worried-0.046
 scared-0.045
 careless-0.044
Token apples
Token ID22514
Feature activation+1.03e-02

Pos logit contributions
irs+0.039
ide+0.033
ushes+0.031
angles+0.031
of+0.030
Neg logit contributions
 afraid-0.030
 oh-0.030
 worried-0.029
 scared-0.028
 careless-0.027
Tokenau
Token ID559
Feature activation-2.79e-02

Pos logit contributions
irs+0.020
ide+0.017
ushes+0.016
angles+0.016
aff+0.016
Neg logit contributions
 afraid-0.018
 oh-0.018
 worried-0.017
 scared-0.016
 careless-0.016
Token she
Token ID673
Feature activation+3.35e-02

Pos logit contributions
irs+0.030
ide+0.026
aff+0.025
ushes+0.024
angles+0.024
Neg logit contributions
 afraid-0.020
 oh-0.020
 worried-0.019
 scared-0.018
 careless-0.018
Token used
Token ID973
Feature activation+2.03e-02

Pos logit contributions
irs+0.073
ide+0.063
ushes+0.059
aff+0.059
angles+0.058
Neg logit contributions
 afraid-0.051
 oh-0.050
 worried-0.047
 scared-0.046
 careless-0.045
Token to
Token ID284
Feature activation+4.43e-02

Pos logit contributions
irs+0.045
ide+0.040
angles+0.038
aff+0.037
ushes+0.037
Neg logit contributions
 afraid-0.029
 oh-0.028
 worried-0.027
 scared-0.026
 careless-0.025
Token tie
Token ID9839
Feature activation+2.88e-02

Pos logit contributions
irs+0.078
aff+0.068
ide+0.067
ushes+0.064
ingu+0.064
Neg logit contributions
 afraid-0.082
 oh-0.081
 scared-0.080
 worried-0.079
 mean-0.074
Token all
Token ID477
Feature activation+2.81e-02

Pos logit contributions
irs+0.057
ide+0.051
aff+0.046
angles+0.045
ingu+0.045
Neg logit contributions
 afraid-0.047
 oh-0.046
 worried-0.044
 scared-0.043
 careless-0.041
Token of
Token ID286
Feature activation+8.61e-02

Pos logit contributions
irs+0.055
ades+0.048
ingu+0.048
aff+0.047
ide+0.045
Neg logit contributions
 afraid-0.043
 oh-0.040
 worried-0.040
 scared-0.040
 very-0.038
Token her
Token ID607
Feature activation+2.95e-02

Pos logit contributions
irs+0.140
ades+0.131
ide+0.129
lights+0.128
aff+0.127
Neg logit contributions
 oh-0.149
 afraid-0.148
 scared-0.139
 worried-0.137
 very-0.132
Token toys
Token ID14958
Feature activation+1.46e-02

Pos logit contributions
irs+0.054
ide+0.049
aff+0.042
angles+0.041
oons+0.040
Neg logit contributions
 afraid-0.055
 oh-0.054
 worried-0.052
 scared-0.051
 careless-0.050
Token together
Token ID1978
Feature activation+2.45e-02

Pos logit contributions
irs+0.029
ide+0.025
aff+0.023
ushes+0.023
angles+0.023
Neg logit contributions
 afraid-0.024
 oh-0.024
 worried-0.023
 scared-0.022
 careless-0.022
Token at
Token ID379
Feature activation+5.29e-02

Pos logit contributions
irs+0.050
angles+0.039
ushes+0.039
ades+0.039
aff+0.038
Neg logit contributions
 afraid-0.041
 oh-0.040
 worried-0.038
 scared-0.038
 careless-0.036
Token the
Token ID262
Feature activation+2.44e-02

Pos logit contributions
irs+0.109
ide+0.095
ades+0.093
aff+0.093
of+0.091
Neg logit contributions
 afraid-0.082
 oh-0.079
 worried-0.076
 scared-0.074
 careless-0.071
Token
Token ID220
Feature activation+2.44e-02

Pos logit contributions
irs+0.028
ide+0.024
angles+0.023
ades+0.023
 outfits+0.022
Neg logit contributions
 afraid-0.027
 oh-0.026
 scared-0.026
 worried-0.025
 careless-0.023
Token
Token ID198
Feature activation+6.83e-02

Pos logit contributions
irs+0.050
 occasions+0.045
 journals+0.045
 outfits+0.044
 parties+0.043
Neg logit contributions
 afraid-0.036
 scared-0.033
 oh-0.033
 worried-0.032
 curious-0.029
Token
Token ID198
Feature activation+5.49e-02

Pos logit contributions
irs+0.130
 journals+0.112
angles+0.112
lights+0.108
 outfits+0.107
Neg logit contributions
 afraid-0.115
 scared-0.107
 worried-0.106
 oh-0.105
 careless-0.095
TokenThe
Token ID464
Feature activation+1.68e-02

Pos logit contributions
irs+0.090
ushes+0.070
angles+0.070
ide+0.069
lights+0.068
Neg logit contributions
 afraid-0.111
 oh-0.110
 worried-0.103
 scared-0.103
 careless-0.101
Token team
Token ID1074
Feature activation+4.48e-03

Pos logit contributions
irs+0.023
ide+0.019
 parties+0.018
ushes+0.018
angles+0.017
Neg logit contributions
 oh-0.037
 afraid-0.037
 worried-0.035
 scared-0.033
 careless-0.032
Token decorated
Token ID24789
Feature activation+8.61e-02

Pos logit contributions
irs+0.008
ide+0.008
of+0.007
angles+0.007
aff+0.007
Neg logit contributions
 oh-0.008
 afraid-0.007
 worried-0.007
 careless-0.007
 scared-0.006
Token the
Token ID262
Feature activation+2.90e-02

Pos logit contributions
irs+0.158
ide+0.126
of+0.120
aff+0.120
 ease+0.117
Neg logit contributions
 oh-0.161
 afraid-0.155
 worried-0.146
 scared-0.146
 careless-0.138
Token room
Token ID2119
Feature activation+1.04e-02

Pos logit contributions
irs+0.051
ide+0.043
of+0.040
ushes+0.039
angles+0.039
Neg logit contributions
 afraid-0.055
 oh-0.054
 worried-0.052
 scared-0.050
 careless-0.050
Token with
Token ID351
Feature activation+1.07e-02

Pos logit contributions
irs+0.020
ide+0.016
of+0.015
angles+0.015
aff+0.015
Neg logit contributions
 afraid-0.018
 oh-0.018
 worried-0.017
 scared-0.017
 careless-0.017
Token balloons
Token ID35485
Feature activation+3.91e-02

Pos logit contributions
irs+0.019
ide+0.017
of+0.015
angles+0.015
aff+0.015
Neg logit contributions
 afraid-0.020
 oh-0.019
 worried-0.019
 scared-0.018
 careless-0.018
Token,
Token ID11
Feature activation-2.48e-02

Pos logit contributions
irs+0.083
ide+0.073
ushes+0.068
angles+0.068
of+0.067
Neg logit contributions
 afraid-0.059
 oh-0.058
 worried-0.055
 scared-0.053
 careless-0.051
Token see
Token ID766
Feature activation+2.20e-02

Pos logit contributions
irs+0.000
ide+0.000
ushes+0.000
angles+0.000
ades+0.000
Neg logit contributions
 afraid-0.000
 oh-0.000
 worried-0.000
 scared-0.000
 careless-0.000
Token how
Token ID703
Feature activation-3.98e-02

Pos logit contributions
irs+0.037
ide+0.033
ades+0.031
angles+0.030
ushes+0.030
Neg logit contributions
 oh-0.044
 afraid-0.044
 worried-0.042
 scared-0.041
 careless-0.039
Token much
Token ID881
Feature activation-1.82e-02

Pos logit contributions
 afraid+0.059
 worried+0.056
 oh+0.056
 scared+0.054
 careless+0.053
Neg logit contributions
irs-0.090
ide-0.079
aff-0.075
angles-0.075
of-0.073
Token Lily
Token ID20037
Feature activation+1.02e-02

Pos logit contributions
 afraid+0.029
 oh+0.028
 worried+0.027
 scared+0.026
 careless+0.025
Neg logit contributions
irs-0.038
ide-0.033
ushes-0.031
angles-0.030
aff-0.030
Token had
Token ID550
Feature activation-5.00e-03

Pos logit contributions
irs+0.021
ide+0.019
ushes+0.018
of+0.017
angles+0.017
Neg logit contributions
 afraid-0.016
 oh-0.016
 worried-0.014
 scared-0.014
 careless-0.014
Token decorated
Token ID24789
Feature activation+8.60e-02

Pos logit contributions
 afraid+0.007
 oh+0.006
 worried+0.006
 scared+0.006
 careless+0.006
Neg logit contributions
irs-0.012
ide-0.011
angles-0.011
aff-0.010
of-0.010
Token her
Token ID607
Feature activation+1.01e-02

Pos logit contributions
irs+0.201
ingu+0.171
of+0.169
aff+0.169
ide+0.166
Neg logit contributions
 oh-0.115
 afraid-0.102
 scared-0.100
 worried-0.095
 very-0.090
Token room
Token ID2119
Feature activation+1.75e-02

Pos logit contributions
irs+0.021
ide+0.017
of+0.017
aff+0.016
angles+0.016
Neg logit contributions
 afraid-0.017
 oh-0.017
 worried-0.017
 scared-0.016
 careless-0.016
Token.
Token ID13
Feature activation-9.64e-04

Pos logit contributions
irs+0.033
ide+0.025
aff+0.025
angles+0.025
of+0.024
Neg logit contributions
 afraid-0.033
 oh-0.032
 worried-0.031
 scared-0.031
 careless-0.030
Token She
Token ID1375
Feature activation+2.88e-02

Pos logit contributions
 afraid+0.002
 oh+0.002
 worried+0.002
 scared+0.002
 careless+0.002
Neg logit contributions
irs-0.001
ide-0.001
ushes-0.001
aff-0.001
 parties-0.001
Token told
Token ID1297
Feature activation+1.65e-02

Pos logit contributions
irs+0.060
ide+0.052
ushes+0.049
angles+0.049
of+0.048
Neg logit contributions
 afraid-0.045
 oh-0.045
 worried-0.041
 scared-0.040
 careless-0.039
Token worm
Token ID23849
Feature activation-8.27e-03

Pos logit contributions
 halves+0.039
irs+0.039
 ease+0.038
of+0.038
ide+0.036
Neg logit contributions
 afraid-0.043
 oh-0.041
 worried-0.038
 scared-0.035
 adamant-0.035
Token with
Token ID351
Feature activation+2.29e-02

Pos logit contributions
 afraid+0.012
 oh+0.011
 worried+0.011
 scared+0.011
 careless+0.010
Neg logit contributions
irs-0.019
ide-0.016
ushes-0.015
angles-0.015
aff-0.015
Token her
Token ID607
Feature activation+1.48e-02

Pos logit contributions
irs+0.036
ide+0.034
oons+0.031
ades+0.031
aff+0.031
Neg logit contributions
 afraid-0.046
 oh-0.044
 scared-0.044
 worried-0.044
 careless-0.041
Token eye
Token ID4151
Feature activation+9.52e-04

Pos logit contributions
irs+0.032
ide+0.028
angles+0.026
aff+0.025
ushes+0.025
Neg logit contributions
 afraid-0.023
 oh-0.023
 worried-0.022
 scared-0.022
 careless-0.021
Token and
Token ID290
Feature activation+1.15e-02

Pos logit contributions
irs+0.002
ide+0.002
ushes+0.002
angles+0.002
aff+0.002
Neg logit contributions
 afraid-0.002
 oh-0.002
 worried-0.001
 scared-0.001
 careless-0.001
Token slur
Token ID40066
Feature activation+8.59e-02

Pos logit contributions
irs+0.022
ide+0.019
of+0.018
 ease+0.018
 outfits+0.018
Neg logit contributions
 afraid-0.019
 oh-0.019
 worried-0.018
 careless-0.017
 scared-0.016
Tokenped
Token ID9124
Feature activation+3.06e-02

Pos logit contributions
irs+0.194
ide+0.168
 parties+0.168
 journals+0.166
ushes+0.166
Neg logit contributions
 oh-0.112
 afraid-0.109
 scared-0.100
 worried-0.099
 careless-0.091
Token it
Token ID340
Feature activation+1.02e-02

Pos logit contributions
irs+0.059
ide+0.051
ushes+0.047
angles+0.047
of+0.046
Neg logit contributions
 afraid-0.053
 oh-0.052
 worried-0.049
 scared-0.047
 careless-0.047
Token up
Token ID510
Feature activation+2.71e-02

Pos logit contributions
irs+0.022
ide+0.018
angles+0.017
aff+0.017
ushes+0.017
Neg logit contributions
 afraid-0.016
 oh-0.016
 scared-0.015
 worried-0.015
 careless-0.014
Token with
Token ID351
Feature activation+1.78e-02

Pos logit contributions
irs+0.048
ide+0.040
angles+0.037
ushes+0.037
aff+0.036
Neg logit contributions
 afraid-0.052
 oh-0.051
 worried-0.049
 scared-0.047
 careless-0.046
Token a
Token ID257
Feature activation+1.18e-02

Pos logit contributions
irs+0.033
ide+0.031
aff+0.029
angles+0.029
oons+0.029
Neg logit contributions
 afraid-0.031
 oh-0.031
 scared-0.030
 worried-0.030
 careless-0.027
Token and
Token ID290
Feature activation+2.06e-03

Pos logit contributions
 afraid+0.005
 oh+0.005
 worried+0.005
 scared+0.005
 careless+0.005
Neg logit contributions
irs-0.007
ide-0.006
ushes-0.006
angles-0.006
of-0.006
Token the
Token ID262
Feature activation+1.63e-02

Pos logit contributions
irs+0.004
ide+0.003
of+0.003
angles+0.003
ushes+0.003
Neg logit contributions
 afraid-0.004
 oh-0.004
 worried-0.004
 scared-0.003
 careless-0.003
Token way
Token ID835
Feature activation-5.04e-02

Pos logit contributions
irs+0.030
ide+0.026
 parties+0.025
 outfits+0.025
 ease+0.025
Neg logit contributions
 afraid-0.028
 oh-0.028
 worried-0.026
 scared-0.025
 careless-0.024
Token they
Token ID484
Feature activation+1.85e-02

Pos logit contributions
 afraid+0.076
 oh+0.075
 worried+0.072
 scared+0.069
 careless+0.068
Neg logit contributions
irs-0.107
ide-0.093
aff-0.088
ushes-0.088
ades-0.086
Token had
Token ID550
Feature activation-7.04e-03

Pos logit contributions
irs+0.039
ide+0.033
ushes+0.032
ades+0.031
aff+0.031
Neg logit contributions
 afraid-0.029
 worried-0.028
 scared-0.027
 oh-0.026
 careless-0.026
Token decorated
Token ID24789
Feature activation+8.58e-02

Pos logit contributions
 afraid+0.010
 oh+0.010
 worried+0.009
 scared+0.009
 careless+0.009
Neg logit contributions
irs-0.016
ide-0.014
angles-0.013
aff-0.013
ushes-0.013
Token it
Token ID340
Feature activation+1.62e-02

Pos logit contributions
irs+0.161
ide+0.132
of+0.128
aff+0.126
ingu+0.126
Neg logit contributions
 oh-0.147
 afraid-0.140
 scared-0.133
 worried-0.132
 foolish-0.126
Token.
Token ID13
Feature activation+1.38e-02

Pos logit contributions
irs+0.033
aff+0.027
ide+0.026
ushes+0.026
ingu+0.026
Neg logit contributions
 afraid-0.026
 oh-0.025
 scared-0.025
 worried-0.024
 careless-0.023
Token<|endoftext|>
Token ID50256
Feature activation+2.56e-02

Pos logit contributions
irs+0.023
ide+0.020
angles+0.018
ades+0.018
of+0.018
Neg logit contributions
 afraid-0.026
 oh-0.025
 worried-0.025
 scared-0.024
 careless-0.023
TokenOnce
Token ID7454
Feature activation-3.87e-02

Pos logit contributions
irs+0.047
angles+0.039
ide+0.039
ushes+0.037
 outfits+0.037
Neg logit contributions
 afraid-0.045
 oh-0.043
 scared-0.043
 worried-0.043
 careless-0.039
Token upon
Token ID2402
Feature activation+2.58e-02

Pos logit contributions
 afraid+0.073
 worried+0.070
 oh+0.067
 scared+0.067
 careless+0.065
Neg logit contributions
irs-0.068
 occasions-0.057
bons-0.057
aff-0.055
 outfits-0.055
Token made
Token ID925
Feature activation+4.69e-02

Pos logit contributions
irs+0.068
ide+0.057
aff+0.055
ushes+0.055
angles+0.054
Neg logit contributions
 afraid-0.041
 worried-0.039
 oh-0.039
 scared-0.038
 careless-0.036
Token a
Token ID257
Feature activation+1.03e-02

Pos logit contributions
irs+0.097
ide+0.085
of+0.079
ushes+0.078
angles+0.078
Neg logit contributions
 afraid-0.074
 oh-0.074
 worried-0.069
 scared-0.067
 careless-0.065
Token new
Token ID649
Feature activation+2.54e-02

Pos logit contributions
irs+0.020
ide+0.018
ushes+0.016
angles+0.016
of+0.016
Neg logit contributions
 afraid-0.017
 oh-0.017
 worried-0.016
 scared-0.016
 careless-0.015
Token cake
Token ID12187
Feature activation+1.46e-02

Pos logit contributions
irs+0.039
ide+0.035
ushes+0.031
 outfits+0.031
ades+0.031
Neg logit contributions
 afraid-0.052
 oh-0.051
 worried-0.048
 careless-0.046
 scared-0.046
Token and
Token ID290
Feature activation+4.03e-03

Pos logit contributions
irs+0.034
 ease+0.027
ide+0.027
of+0.027
reci+0.026
Neg logit contributions
 oh-0.020
 afraid-0.020
 worried-0.019
 scared-0.019
 careless-0.017
Token decorated
Token ID24789
Feature activation+8.58e-02

Pos logit contributions
irs+0.007
ide+0.006
ushes+0.006
angles+0.006
 parties+0.005
Neg logit contributions
 afraid-0.008
 oh-0.007
 worried-0.007
 careless-0.007
 scared-0.007
Token it
Token ID340
Feature activation+2.58e-02

Pos logit contributions
irs+0.231
of+0.197
ide+0.196
ickets+0.196
reci+0.193
Neg logit contributions
 oh-0.084
 afraid-0.066
 scared-0.063
 worried-0.061
 very-0.059
Token with
Token ID351
Feature activation+1.91e-02

Pos logit contributions
irs+0.063
 ease+0.050
aff+0.049
of+0.049
ingu+0.049
Neg logit contributions
 afraid-0.034
 scared-0.033
 oh-0.031
 very-0.031
 worried-0.031
Token candles
Token ID32268
Feature activation+4.62e-02

Pos logit contributions
irs+0.039
ide+0.035
of+0.032
aff+0.031
angles+0.031
Neg logit contributions
 afraid-0.031
 oh-0.030
 worried-0.030
 scared-0.029
 careless-0.028
Token and
Token ID290
Feature activation-1.52e-04

Pos logit contributions
irs+0.092
ide+0.076
angles+0.071
ushes+0.070
of+0.070
Neg logit contributions
 afraid-0.077
 oh-0.074
 worried-0.073
 scared-0.072
 careless-0.068
Token sprink
Token ID27562
Feature activation+4.09e-02

Pos logit contributions
 afraid+0.000
 oh+0.000
 worried+0.000
 scared+0.000
 careless+0.000
Neg logit contributions
irs-0.000
ide-0.000
ushes-0.000
angles-0.000
of-0.000
Token the
Token ID262
Feature activation+2.51e-02

Pos logit contributions
irs+0.002
ide+0.002
ushes+0.002
aff+0.002
angles+0.002
Neg logit contributions
 afraid-0.002
 oh-0.002
 worried-0.002
 scared-0.002
 careless-0.002
Token stars
Token ID5788
Feature activation+1.50e-02

Pos logit contributions
irs+0.030
ide+0.025
 parties+0.021
angles+0.021
of+0.021
Neg logit contributions
 oh-0.061
 afraid-0.059
 worried-0.056
 careless-0.054
 scared-0.054
Token and
Token ID290
Feature activation+1.75e-03

Pos logit contributions
irs+0.031
ide+0.027
ushes+0.025
angles+0.025
of+0.025
Neg logit contributions
 afraid-0.024
 oh-0.024
 worried-0.022
 scared-0.021
 careless-0.021
Token planets
Token ID14705
Feature activation+4.29e-02

Pos logit contributions
irs+0.003
ide+0.003
 parties+0.002
 ease+0.002
 outfits+0.002
Neg logit contributions
 oh-0.003
 afraid-0.003
 worried-0.003
 careless-0.003
 scared-0.003
Token in
Token ID287
Feature activation+2.42e-02

Pos logit contributions
irs+0.085
ushes+0.069
angles+0.068
ide+0.068
ingu+0.067
Neg logit contributions
 afraid-0.077
 worried-0.071
 scared-0.071
 oh-0.070
 careless-0.066
Token outer
Token ID12076
Feature activation+8.57e-02

Pos logit contributions
aff+0.051
irs+0.049
ingu+0.047
ades+0.047
oons+0.046
Neg logit contributions
 afraid-0.038
 scared-0.036
 worried-0.036
 oh-0.031
 very-0.031
Token space
Token ID2272
Feature activation+1.85e-02

Pos logit contributions
irs+0.169
ide+0.155
uing+0.148
umbs+0.147
oons+0.147
Neg logit contributions
 afraid-0.127
 scared-0.117
 worried-0.113
 foolish-0.113
 careless-0.111
Token.
Token ID13
Feature activation+8.77e-03

Pos logit contributions
irs+0.034
ide+0.027
aff+0.027
ingu+0.027
angles+0.027
Neg logit contributions
 afraid-0.035
 worried-0.033
 oh-0.033
 scared-0.032
 careless-0.031
Token So
Token ID1406
Feature activation-5.29e-02

Pos logit contributions
irs+0.015
angles+0.012
ide+0.012
ades+0.012
aff+0.012
Neg logit contributions
 afraid-0.017
 oh-0.016
 worried-0.016
 scared-0.015
 careless-0.015
Token one
Token ID530
Feature activation-1.56e-02

Pos logit contributions
 afraid+0.098
 oh+0.089
 worried+0.089
 careless+0.086
 scared+0.085
Neg logit contributions
irs-0.090
ide-0.083
 parties-0.076
of-0.074
angles-0.074
Token day
Token ID1110
Feature activation-5.23e-02

Pos logit contributions
 afraid+0.024
 oh+0.022
 worried+0.022
 careless+0.020
 scared+0.020
Neg logit contributions
ide-0.027
of-0.026
irs-0.026
 parties-0.024
r-0.023
Token that
Token ID326
Feature activation+9.20e-03

Pos logit contributions
 afraid+0.051
 oh+0.050
 worried+0.048
 scared+0.046
 careless+0.045
Neg logit contributions
irs-0.062
ide-0.053
ushes-0.050
angles-0.049
aff-0.049
Token he
Token ID339
Feature activation+2.22e-02

Pos logit contributions
irs+0.019
ide+0.017
ushes+0.016
angles+0.015
of+0.015
Neg logit contributions
 afraid-0.014
 oh-0.014
 worried-0.013
 scared-0.013
 careless-0.013
Token painted
Token ID13055
Feature activation+5.35e-02

Pos logit contributions
irs+0.049
ide+0.043
ushes+0.040
aff+0.040
angles+0.040
Neg logit contributions
 afraid-0.033
 oh-0.032
 worried-0.030
 scared-0.029
 careless-0.028
Token silver
Token ID8465
Feature activation+4.06e-02

Pos logit contributions
irs+0.104
ide+0.089
ushes+0.083
of+0.082
aff+0.082
Neg logit contributions
 afraid-0.090
 oh-0.090
 worried-0.084
 scared-0.082
 careless-0.080
Token and
Token ID290
Feature activation-1.19e-02

Pos logit contributions
irs+0.081
ide+0.069
ushes+0.064
of+0.063
angles+0.063
Neg logit contributions
 afraid-0.068
 oh-0.067
 worried-0.063
 scared-0.061
 careless-0.060
Token decorated
Token ID24789
Feature activation+8.55e-02

Pos logit contributions
 afraid+0.021
 oh+0.021
 worried+0.020
 scared+0.019
 careless+0.019
Neg logit contributions
irs-0.022
ide-0.019
ushes-0.017
angles-0.017
of-0.017
Token with
Token ID351
Feature activation+9.72e-03

Pos logit contributions
irs+0.199
ide+0.167
of+0.164
ingu+0.161
ushes+0.161
Neg logit contributions
 oh-0.112
 afraid-0.103
 scared-0.097
 worried-0.095
 foolish-0.089
Token some
Token ID617
Feature activation+2.43e-02

Pos logit contributions
irs+0.019
ide+0.016
aff+0.015
angles+0.015
of+0.015
Neg logit contributions
 afraid-0.017
 oh-0.017
 worried-0.016
 scared-0.016
 careless-0.015
Token black
Token ID2042
Feature activation+4.19e-02

Pos logit contributions
irs+0.042
ide+0.036
angles+0.033
ushes+0.032
of+0.032
Neg logit contributions
 afraid-0.047
 oh-0.046
 worried-0.044
 scared-0.043
 careless-0.042
Token stripes
Token ID28806
Feature activation+4.20e-03

Pos logit contributions
irs+0.073
ide+0.062
angles+0.056
ushes+0.056
of+0.055
Neg logit contributions
 afraid-0.079
 oh-0.078
 worried-0.074
 scared-0.073
 careless-0.070
Token.
Token ID13
Feature activation+8.97e-03

Pos logit contributions
irs+0.008
ide+0.006
ushes+0.006
angles+0.006
of+0.006
Neg logit contributions
 afraid-0.008
 oh-0.007
 worried-0.007
 scared-0.007
 careless-0.007
Token,
Token ID11
Feature activation-5.15e-02

Pos logit contributions
irs+0.012
ide+0.010
ushes+0.010
angles+0.010
of+0.010
Neg logit contributions
 oh-0.006
 afraid-0.006
 worried-0.005
 scared-0.005
 careless-0.005
Token Lily
Token ID20037
Feature activation+2.45e-04

Pos logit contributions
 afraid+0.103
 oh+0.100
 worried+0.097
 scared+0.093
 careless+0.092
Neg logit contributions
irs-0.083
ide-0.070
ushes-0.063
angles-0.063
of-0.062
Token and
Token ID290
Feature activation+1.03e-02

Pos logit contributions
irs+0.000
ide+0.000
 parties+0.000
 outfits+0.000
of+0.000
Neg logit contributions
 oh-0.000
 afraid-0.000
 worried-0.000
 careless-0.000
 foolish-0.000
Token her
Token ID607
Feature activation+1.87e-02

Pos logit contributions
irs+0.019
ades+0.016
ide+0.016
aff+0.015
ushes+0.015
Neg logit contributions
 oh-0.021
 afraid-0.020
 worried-0.020
 scared-0.019
 careless-0.018
Token family
Token ID1641
Feature activation+1.74e-02

Pos logit contributions
irs+0.036
ide+0.031
ushes+0.029
angles+0.029
of+0.028
Neg logit contributions
 afraid-0.033
 oh-0.032
 worried-0.031
 scared-0.029
 careless-0.029
Token decorated
Token ID24789
Feature activation+8.54e-02

Pos logit contributions
irs+0.034
ide+0.030
 parties+0.028
angles+0.027
of+0.027
Neg logit contributions
 afraid-0.029
 oh-0.029
 worried-0.026
 scared-0.025
 careless-0.025
Token the
Token ID262
Feature activation+2.31e-02

Pos logit contributions
irs+0.170
ide+0.141
ickets+0.141
of+0.139
ands+0.137
Neg logit contributions
 oh-0.143
 afraid-0.127
 scared-0.126
 worried-0.124
 very-0.118
Token house
Token ID2156
Feature activation+5.86e-03

Pos logit contributions
irs+0.041
ide+0.034
ushes+0.032
angles+0.032
of+0.032
Neg logit contributions
 afraid-0.044
 oh-0.043
 worried-0.041
 scared-0.040
 careless-0.039
Token with
Token ID351
Feature activation+5.73e-03

Pos logit contributions
irs+0.012
ide+0.009
of+0.009
aff+0.009
 ease+0.009
Neg logit contributions
 afraid-0.010
 oh-0.010
 worried-0.010
 scared-0.010
 careless-0.009
Token balloons
Token ID35485
Feature activation+3.76e-02

Pos logit contributions
irs+0.010
ide+0.009
of+0.008
angles+0.008
aff+0.008
Neg logit contributions
 afraid-0.011
 oh-0.010
 worried-0.010
 scared-0.010
 careless-0.010
Token and
Token ID290
Feature activation+2.43e-03

Pos logit contributions
irs+0.077
ide+0.064
angles+0.060
of+0.060
ushes+0.059
Neg logit contributions
 afraid-0.062
 oh-0.060
 worried-0.059
 scared-0.058
 careless-0.055
Token
Token ID198
Feature activation+5.62e-02

Pos logit contributions
irs+0.028
ide+0.024
angles+0.022
 outfits+0.021
ades+0.021
Neg logit contributions
 afraid-0.035
 oh-0.035
 worried-0.034
 scared-0.033
 careless-0.032
Token
Token ID198
Feature activation+4.73e-02

Pos logit contributions
irs+0.111
ide+0.098
angles+0.096
 outfits+0.096
 parties+0.093
Neg logit contributions
 afraid-0.096
 scared-0.091
 oh-0.090
 worried-0.089
 careless-0.082
TokenThe
Token ID464
Feature activation+1.01e-02

Pos logit contributions
irs+0.086
ide+0.072
angles+0.066
 outfits+0.065
aff+0.065
Neg logit contributions
 oh-0.092
 afraid-0.091
 worried-0.085
 scared-0.084
 careless-0.083
Token dog
Token ID3290
Feature activation-3.15e-02

Pos logit contributions
irs+0.022
ide+0.019
ushes+0.019
of+0.019
Everyone+0.019
Neg logit contributions
 oh-0.013
 afraid-0.013
 worried-0.011
 scared-0.011
 careless-0.010
Token l
Token ID300
Feature activation-8.41e-03

Pos logit contributions
 afraid+0.052
 oh+0.051
 worried+0.049
 scared+0.047
 careless+0.046
Neg logit contributions
irs-0.064
ide-0.055
ushes-0.051
angles-0.051
aff-0.050
Tokenicks
Token ID3378
Feature activation+8.52e-02

Pos logit contributions
 afraid+0.019
 scared+0.018
 worried+0.018
 oh+0.018
 careless+0.017
Neg logit contributions
irs-0.012
ide-0.010
 outfits-0.009
 occasions-0.009
 parties-0.008
Token Lily
Token ID20037
Feature activation+1.34e-02

Pos logit contributions
irs+0.195
ide+0.171
 outfits+0.169
 occasions+0.169
 projects+0.162
Neg logit contributions
 oh-0.111
 afraid-0.108
 scared-0.104
 worried-0.102
 foolish-0.095
Token's
Token ID338
Feature activation+1.66e-02

Pos logit contributions
irs+0.029
ide+0.024
angles+0.023
ades+0.023
aff+0.022
Neg logit contributions
 afraid-0.021
 oh-0.020
 worried-0.020
 scared-0.019
 careless-0.018
Token face
Token ID1986
Feature activation+3.44e-02

Pos logit contributions
irs+0.032
ide+0.027
angles+0.025
 parties+0.024
ushes+0.024
Neg logit contributions
 afraid-0.030
 oh-0.030
 worried-0.029
 scared-0.028
 careless-0.027
Token.
Token ID13
Feature activation-4.83e-03

Pos logit contributions
irs+0.049
ide+0.042
ushes+0.038
of+0.037
aff+0.037
Neg logit contributions
 afraid-0.074
 oh-0.074
 worried-0.070
 scared-0.068
 careless-0.067
Token He
Token ID679
Feature activation+2.53e-02

Pos logit contributions
 afraid+0.009
 oh+0.008
 worried+0.008
 scared+0.008
 careless+0.008
Neg logit contributions
irs-0.009
ide-0.008
 outfits-0.007
angles-0.007
ades-0.007
Token he
Token ID339
Feature activation+8.92e-03

Pos logit contributions
 afraid+0.018
 worried+0.017
 oh+0.017
 careless+0.016
 scared+0.015
Neg logit contributions
irs-0.014
ide-0.012
 ease-0.012
of-0.011
 parties-0.011
Token lun
Token ID14678
Feature activation-2.63e-02

Pos logit contributions
irs+0.017
ide+0.015
of+0.015
ingu+0.014
aff+0.014
Neg logit contributions
 afraid-0.014
 oh-0.014
 worried-0.013
 careless-0.012
 scared-0.012
Tokenged
Token ID2004
Feature activation-1.01e-02

Pos logit contributions
 afraid+0.046
 oh+0.045
 worried+0.042
 scared+0.041
 careless+0.040
Neg logit contributions
irs-0.052
ide-0.044
of-0.042
aff-0.040
angles-0.040
Token forward
Token ID2651
Feature activation-3.91e-02

Pos logit contributions
 afraid+0.019
 oh+0.019
 worried+0.017
 careless+0.017
 scared+0.016
Neg logit contributions
irs-0.016
of-0.014
ide-0.013
 ease-0.013
r-0.013
Token and
Token ID290
Feature activation-1.74e-02

Pos logit contributions
 oh+0.078
 afraid+0.078
 worried+0.073
 careless+0.072
 foolish+0.070
Neg logit contributions
irs-0.059
ide-0.054
of-0.048
r-0.047
aff-0.046
Token slur
Token ID40066
Feature activation+8.51e-02

Pos logit contributions
 afraid+0.031
 oh+0.031
 worried+0.029
 careless+0.028
 foolish+0.027
Neg logit contributions
irs-0.028
of-0.024
 ease-0.023
ide-0.023
 parties-0.022
Tokenped
Token ID9124
Feature activation+2.61e-02

Pos logit contributions
irs+0.178
ide+0.154
ushes+0.151
angles+0.146
 parties+0.146
Neg logit contributions
 oh-0.130
 afraid-0.128
 worried-0.117
 scared-0.117
 careless-0.111
Token it
Token ID340
Feature activation+6.34e-03

Pos logit contributions
irs+0.044
ide+0.037
of+0.034
ushes+0.034
 ease+0.034
Neg logit contributions
 afraid-0.050
 oh-0.049
 worried-0.046
 careless-0.045
 scared-0.044
Token all
Token ID477
Feature activation+2.11e-02

Pos logit contributions
irs+0.011
ide+0.010
ushes+0.009
of+0.009
angles+0.009
Neg logit contributions
 afraid-0.012
 oh-0.012
 worried-0.011
 scared-0.011
 careless-0.011
Token up
Token ID510
Feature activation+2.14e-02

Pos logit contributions
irs+0.044
ide+0.037
ushes+0.036
angles+0.035
aff+0.035
Neg logit contributions
 afraid-0.034
 oh-0.033
 worried-0.032
 scared-0.031
 careless-0.030
Token.
Token ID13
Feature activation+8.73e-03

Pos logit contributions
irs+0.033
ide+0.028
ushes+0.025
of+0.025
angles+0.025
Neg logit contributions
 afraid-0.045
 oh-0.044
 worried-0.042
 scared-0.041
 careless-0.040
Token so
Token ID523
Feature activation-4.96e-02

Pos logit contributions
 oh+0.044
Run+0.039
 afraid+0.037
 worried+0.031
 careless+0.031
Neg logit contributions
irs-0.081
ide-0.074
ushes-0.073
 ease-0.072
 parties-0.070
Token happy
Token ID3772
Feature activation+1.06e-02

Pos logit contributions
 oh+0.065
 afraid+0.058
 worried+0.051
 careless+0.049
Run+0.048
Neg logit contributions
irs-0.114
ide-0.102
ushes-0.100
 ease-0.095
of-0.095
Token that
Token ID326
Feature activation-2.20e-02

Pos logit contributions
irs+0.022
ide+0.019
ushes+0.018
angles+0.018
of+0.018
Neg logit contributions
 afraid-0.017
 oh-0.017
 worried-0.016
 scared-0.015
 careless-0.015
Token she
Token ID673
Feature activation+1.66e-02

Pos logit contributions
 afraid+0.038
 oh+0.037
 worried+0.035
 careless+0.034
 scared+0.034
Neg logit contributions
irs-0.041
ide-0.036
of-0.033
 ease-0.033
 parties-0.033
Token had
Token ID550
Feature activation-1.23e-02

Pos logit contributions
irs+0.032
ide+0.029
of+0.027
 outfits+0.026
angles+0.026
Neg logit contributions
 oh-0.026
 afraid-0.026
 worried-0.024
 careless-0.022
 scared-0.022
Token decorated
Token ID24789
Feature activation+8.51e-02

Pos logit contributions
 afraid+0.017
 oh+0.017
 worried+0.016
 scared+0.015
 careless+0.015
Neg logit contributions
irs-0.029
ide-0.026
aff-0.025
angles-0.024
ushes-0.024
Token it
Token ID340
Feature activation+1.55e-02

Pos logit contributions
irs+0.209
ide+0.184
aff+0.180
of+0.178
ingu+0.177
Neg logit contributions
 oh-0.107
 afraid-0.096
 worried-0.089
 scared-0.088
 foolish-0.078
Token and
Token ID290
Feature activation-3.13e-03

Pos logit contributions
irs+0.032
aff+0.026
ushes+0.025
angles+0.025
ide+0.025
Neg logit contributions
 afraid-0.026
 oh-0.025
 scared-0.024
 worried-0.024
 careless-0.023
Token it
Token ID340
Feature activation+1.27e-02

Pos logit contributions
 afraid+0.005
 oh+0.005
 worried+0.005
 scared+0.004
 careless+0.004
Neg logit contributions
irs-0.006
ide-0.005
ushes-0.005
angles-0.005
of-0.005
Token was
Token ID373
Feature activation-2.95e-02

Pos logit contributions
irs+0.031
aff+0.026
ide+0.026
ushes+0.025
angles+0.025
Neg logit contributions
 afraid-0.017
 worried-0.016
 scared-0.016
 oh-0.016
 careless-0.015
Token safe
Token ID3338
Feature activation+5.17e-03

Pos logit contributions
 afraid+0.043
 oh+0.042
 worried+0.040
 scared+0.038
 careless+0.037
Neg logit contributions
irs-0.066
ide-0.058
aff-0.055
of-0.055
angles-0.055
Token group
Token ID1448
Feature activation+2.79e-02

Pos logit contributions
irs+0.039
ide+0.034
aff+0.032
ushes+0.032
angles+0.031
Neg logit contributions
 oh-0.031
 afraid-0.031
 worried-0.029
 scared-0.027
 careless-0.027
Token of
Token ID286
Feature activation+2.04e-02

Pos logit contributions
irs+0.047
ide+0.039
ushes+0.036
angles+0.036
of+0.035
Neg logit contributions
 afraid-0.055
 oh-0.054
 worried-0.052
 scared-0.051
 careless-0.050
Token tur
Token ID7858
Feature activation-1.05e-02

Pos logit contributions
irs+0.036
ide+0.031
ushes+0.028
angles+0.028
of+0.028
Neg logit contributions
 afraid-0.038
 oh-0.038
 worried-0.036
 scared-0.035
 careless-0.034
Tokenkeys
Token ID13083
Feature activation+2.49e-02

Pos logit contributions
 afraid+0.018
 oh+0.017
 worried+0.016
 scared+0.016
 careless+0.016
Neg logit contributions
irs-0.020
ushes-0.017
angles-0.017
 outfits-0.017
ide-0.017
Token had
Token ID550
Feature activation-1.63e-02

Pos logit contributions
irs+0.048
ushes+0.041
ide+0.040
angles+0.040
ades+0.038
Neg logit contributions
 afraid-0.044
 worried-0.042
 scared-0.042
 oh-0.040
 careless-0.038
Token decorated
Token ID24789
Feature activation+8.51e-02

Pos logit contributions
 afraid+0.024
 oh+0.023
 worried+0.022
 scared+0.021
 careless+0.020
Neg logit contributions
irs-0.037
ide-0.032
angles-0.031
ushes-0.030
of-0.030
Token the
Token ID262
Feature activation+2.45e-02

Pos logit contributions
irs+0.153
ide+0.121
of+0.119
ingu+0.119
ushes+0.115
Neg logit contributions
 afraid-0.156
 oh-0.155
 scared-0.146
 worried-0.145
 careless-0.137
Token tree
Token ID5509
Feature activation+5.25e-03

Pos logit contributions
irs+0.047
ide+0.041
ushes+0.038
angles+0.037
aff+0.037
Neg logit contributions
 oh-0.042
 afraid-0.042
 worried-0.039
 scared-0.038
 careless-0.037
Token with
Token ID351
Feature activation+8.64e-03

Pos logit contributions
irs+0.011
ide+0.008
aff+0.008
angles+0.008
ushes+0.008
Neg logit contributions
 afraid-0.009
 oh-0.008
 worried-0.008
 scared-0.008
 careless-0.008
Token colourful
Token ID44799
Feature activation+5.27e-02

Pos logit contributions
irs+0.016
ide+0.014
angles+0.013
of+0.013
ushes+0.013
Neg logit contributions
 afraid-0.015
 oh-0.015
 worried-0.014
 scared-0.014
 careless-0.014
Token feathers
Token ID27400
Feature activation+2.18e-02

Pos logit contributions
irs+0.095
ide+0.084
angles+0.078
ingu+0.076
ushes+0.075
Neg logit contributions
 afraid-0.093
 oh-0.089
 scared-0.088
 worried-0.088
 careless-0.083
Token the
Token ID262
Feature activation+2.64e-02

Pos logit contributions
irs+0.004
ide+0.004
of+0.003
angles+0.003
ushes+0.003
Neg logit contributions
 afraid-0.005
 oh-0.004
 worried-0.004
 scared-0.004
 careless-0.004
Token ostr
Token ID42387
Feature activation-2.67e-02

Pos logit contributions
irs+0.062
ide+0.055
aff+0.055
 parties+0.052
 ease+0.052
Neg logit contributions
 oh-0.033
 afraid-0.031
 worried-0.027
 scared-0.025
 careless-0.025
Tokenich
Token ID488
Feature activation+3.31e-03

Pos logit contributions
 afraid+0.025
 oh+0.025
 worried+0.022
 scared+0.021
 careless+0.020
Neg logit contributions
irs-0.072
ide-0.065
ushes-0.062
angles-0.061
of-0.061
Token.
Token ID13
Feature activation-5.39e-03

Pos logit contributions
irs+0.006
ide+0.005
ushes+0.005
angles+0.005
of+0.005
Neg logit contributions
 afraid-0.006
 oh-0.006
 worried-0.006
 scared-0.005
 careless-0.005
Token They
Token ID1119
Feature activation+3.26e-02

Pos logit contributions
 afraid+0.011
 oh+0.010
 scared+0.010
 worried+0.010
 careless+0.009
Neg logit contributions
irs-0.009
ide-0.008
ades-0.007
of-0.007
angles-0.007
Token decorated
Token ID24789
Feature activation+8.51e-02

Pos logit contributions
irs+0.074
ide+0.064
ushes+0.061
angles+0.060
ades+0.060
Neg logit contributions
 afraid-0.048
 oh-0.045
 worried-0.045
 scared-0.043
 careless-0.041
Token the
Token ID262
Feature activation+3.51e-02

Pos logit contributions
irs+0.165
ide+0.140
ushes+0.129
aff+0.129
ades+0.128
Neg logit contributions
 oh-0.145
 afraid-0.141
 worried-0.132
 scared-0.131
 careless-0.123
Token ostr
Token ID42387
Feature activation-3.13e-02

Pos logit contributions
irs+0.066
ide+0.056
angles+0.051
ushes+0.051
of+0.051
Neg logit contributions
 afraid-0.064
 oh-0.062
 worried-0.061
 scared-0.058
 careless-0.057
Tokenich
Token ID488
Feature activation-2.06e-03

Pos logit contributions
 afraid+0.030
 oh+0.029
 worried+0.026
 scared+0.024
 careless+0.024
Neg logit contributions
irs-0.085
ide-0.076
ushes-0.072
angles-0.072
 parties-0.071
Token with
Token ID351
Feature activation+1.83e-02

Pos logit contributions
 afraid+0.003
 scared+0.003
 worried+0.003
 oh+0.003
 careless+0.003
Neg logit contributions
irs-0.005
ushes-0.004
 ease-0.004
ide-0.004
angles-0.004
Token some
Token ID617
Feature activation+2.27e-02

Pos logit contributions
irs+0.038
ide+0.034
angles+0.031
aff+0.031
ushes+0.030
Neg logit contributions
 afraid-0.029
 oh-0.028
 worried-0.028
 scared-0.028
 careless-0.026
Token and
Token ID290
Feature activation-2.52e-02

Pos logit contributions
irs+0.034
ide+0.030
of+0.027
aff+0.027
ushes+0.027
Neg logit contributions
 afraid-0.035
 oh-0.035
 worried-0.033
 careless-0.031
 scared-0.031
Token colourful
Token ID44799
Feature activation+3.82e-02

Pos logit contributions
 afraid+0.043
 oh+0.042
 worried+0.039
 careless+0.038
 scared+0.037
Neg logit contributions
irs-0.044
ide-0.038
ushes-0.037
 ease-0.036
lights-0.036
Token.
Token ID13
Feature activation+1.06e-03

Pos logit contributions
irs+0.063
ide+0.053
ushes+0.048
angles+0.048
of+0.047
Neg logit contributions
 afraid-0.076
 oh-0.076
 worried-0.072
 scared-0.070
 careless-0.069
Token They
Token ID1119
Feature activation+2.18e-02

Pos logit contributions
irs+0.002
ide+0.002
angles+0.001
ades+0.001
of+0.001
Neg logit contributions
 afraid-0.002
 oh-0.002
 worried-0.002
 scared-0.002
 careless-0.002
Token had
Token ID550
Feature activation-7.55e-03

Pos logit contributions
irs+0.044
ide+0.037
ushes+0.036
ades+0.035
aff+0.035
Neg logit contributions
 afraid-0.037
 worried-0.036
 scared-0.035
 oh-0.035
 careless-0.033
Token decorated
Token ID24789
Feature activation+8.50e-02

Pos logit contributions
 afraid+0.010
 oh+0.010
 worried+0.009
 scared+0.009
 careless+0.009
Neg logit contributions
irs-0.018
ide-0.016
angles-0.015
aff-0.015
ushes-0.015
Token it
Token ID340
Feature activation+1.54e-02

Pos logit contributions
irs+0.171
ide+0.141
aff+0.135
of+0.135
ingu+0.134
Neg logit contributions
 oh-0.141
 afraid-0.133
 worried-0.125
 scared-0.124
 careless-0.116
Token perfectly
Token ID7138
Feature activation-1.79e-02

Pos logit contributions
irs+0.032
aff+0.026
ushes+0.025
ide+0.025
angles+0.025
Neg logit contributions
 afraid-0.024
 oh-0.023
 scared-0.023
 worried-0.023
 careless-0.022
Token!
Token ID0
Feature activation+1.19e-02

Pos logit contributions
 afraid+0.029
 oh+0.029
 worried+0.027
 scared+0.026
 careless+0.026
Neg logit contributions
irs-0.037
ide-0.031
angles-0.029
ushes-0.029
of-0.029
Token Everyone
Token ID11075
Feature activation+3.01e-02

Pos logit contributions
irs+0.018
ide+0.016
ades+0.015
angles+0.015
 outfits+0.014
Neg logit contributions
 afraid-0.024
 oh-0.023
 worried-0.022
 scared-0.022
 careless-0.021
Token was
Token ID373
Feature activation-4.18e-02

Pos logit contributions
irs+0.067
ide+0.058
ushes+0.055
angles+0.055
aff+0.055
Neg logit contributions
 afraid-0.045
 oh-0.044
 worried-0.042
 scared-0.041
 careless-0.039
Token always
Token ID1464
Feature activation-2.02e-02

Pos logit contributions
 afraid+0.047
 oh+0.047
 worried+0.044
 scared+0.042
 careless+0.041
Neg logit contributions
irs-0.081
ide-0.072
aff-0.067
angles-0.067
of-0.067
Token very
Token ID845
Feature activation-5.14e-02

Pos logit contributions
 afraid+0.027
 oh+0.026
 worried+0.024
Run+0.024
 careless+0.023
Neg logit contributions
irs-0.042
ide-0.038
ushes-0.037
 ease-0.036
 outfits-0.036
Token colourful
Token ID44799
Feature activation+3.98e-02

Pos logit contributions
 oh+0.073
 afraid+0.069
 worried+0.061
Run+0.060
 scared+0.058
Neg logit contributions
irs-0.108
ushes-0.096
ide-0.095
 parties-0.089
 ease-0.088
Token and
Token ID290
Feature activation-1.54e-02

Pos logit contributions
irs+0.079
ide+0.070
ushes+0.064
aff+0.064
of+0.063
Neg logit contributions
 afraid-0.065
 oh-0.064
 worried-0.060
 scared-0.058
 careless-0.057
Token beautifully
Token ID21104
Feature activation+5.08e-03

Pos logit contributions
 afraid+0.025
 oh+0.024
 worried+0.023
 scared+0.022
 careless+0.022
Neg logit contributions
irs-0.029
ide-0.025
ushes-0.024
 parties-0.024
 outfits-0.024
Token decorated
Token ID24789
Feature activation+8.49e-02

Pos logit contributions
irs+0.010
ide+0.008
angles+0.008
of+0.008
ushes+0.007
Neg logit contributions
 afraid-0.009
 oh-0.009
 worried-0.009
 scared-0.009
 careless-0.008
Token.
Token ID13
Feature activation+9.64e-03

Pos logit contributions
irs+0.170
of+0.136
ingu+0.135
aff+0.134
ide+0.133
Neg logit contributions
 oh-0.144
 afraid-0.134
 worried-0.128
 scared-0.128
 careless-0.119
Token<|endoftext|>
Token ID50256
Feature activation+2.81e-02

Pos logit contributions
irs+0.019
ide+0.017
ades+0.017
angles+0.017
aff+0.016
Neg logit contributions
 afraid-0.015
 oh-0.015
 worried-0.015
 scared-0.014
 careless-0.013
TokenOnce
Token ID7454
Feature activation-3.59e-02

Pos logit contributions
irs+0.053
angles+0.045
ide+0.043
ushes+0.043
 outfits+0.042
Neg logit contributions
 afraid-0.049
 scared-0.046
 oh-0.046
 worried-0.046
 careless-0.042
Token upon
Token ID2402
Feature activation+2.83e-02

Pos logit contributions
 afraid+0.066
 worried+0.064
 scared+0.061
 oh+0.058
 careless+0.058
Neg logit contributions
irs-0.066
 occasions-0.057
bons-0.057
aff-0.056
 outfits-0.055
Token a
Token ID257
Feature activation-3.31e-02

Pos logit contributions
irs+0.069
ide+0.061
ushes+0.058
angles+0.058
aff+0.057
Neg logit contributions
 afraid-0.035
 oh-0.034
 worried-0.031
 scared-0.030
 careless-0.029
Token it
Token ID340
Feature activation+7.81e-04

Pos logit contributions
irs+0.096
ushes+0.083
ide+0.082
 outfits+0.081
lights+0.080
Neg logit contributions
 oh-0.077
 afraid-0.068
 scared-0.064
 foolish-0.064
 worried-0.064
Token,
Token ID11
Feature activation-3.59e-02

Pos logit contributions
irs+0.002
ide+0.001
ushes+0.001
aff+0.001
angles+0.001
Neg logit contributions
 afraid-0.001
 oh-0.001
 worried-0.001
 scared-0.001
 careless-0.001
Token and
Token ID290
Feature activation-1.62e-02

Pos logit contributions
 afraid+0.054
 oh+0.053
 worried+0.050
 scared+0.047
 careless+0.047
Neg logit contributions
irs-0.073
ide-0.064
ushes-0.061
angles-0.061
of-0.060
Token then
Token ID788
Feature activation-3.37e-02

Pos logit contributions
 afraid+0.027
 oh+0.026
 worried+0.025
 scared+0.024
 careless+0.023
Neg logit contributions
irs-0.032
ide-0.027
ushes-0.026
angles-0.026
of-0.025
Token you
Token ID345
Feature activation-1.74e-02

Pos logit contributions
 afraid+0.056
 oh+0.055
 worried+0.052
 scared+0.050
 careless+0.049
Neg logit contributions
irs-0.067
ide-0.058
ushes-0.054
angles-0.053
of-0.053
Token slur
Token ID40066
Feature activation+8.48e-02

Pos logit contributions
 afraid+0.027
 worried+0.025
 scared+0.025
 oh+0.025
 careless+0.023
Neg logit contributions
irs-0.037
ide-0.032
ushes-0.030
angles-0.030
ades-0.030
Tokenp
Token ID79
Feature activation+7.41e-03

Pos logit contributions
irs+0.174
ide+0.148
ushes+0.144
 outfits+0.143
 parties+0.142
Neg logit contributions
 oh-0.128
 afraid-0.124
 scared-0.115
 worried-0.113
 careless-0.108
Token it
Token ID340
Feature activation-2.94e-03

Pos logit contributions
irs+0.015
 journals+0.012
ide+0.012
angles+0.012
 occasions+0.012
Neg logit contributions
 afraid-0.012
 oh-0.012
 scared-0.011
 worried-0.011
 careless-0.011
Token down
Token ID866
Feature activation+1.95e-02

Pos logit contributions
 afraid+0.004
 oh+0.004
 scared+0.004
 worried+0.004
 careless+0.004
Neg logit contributions
irs-0.007
ide-0.005
aff-0.005
ushes-0.005
angles-0.005
Token your
Token ID534
Feature activation-9.81e-03

Pos logit contributions
irs+0.034
ide+0.029
angles+0.027
ushes+0.026
of+0.026
Neg logit contributions
 afraid-0.037
 oh-0.037
 worried-0.035
 scared-0.034
 careless-0.033
Token mouth
Token ID5422
Feature activation-3.60e-03

Pos logit contributions
 afraid+0.018
 oh+0.018
 worried+0.017
 scared+0.017
 careless+0.016
Neg logit contributions
irs-0.017
ide-0.015
ushes-0.014
angles-0.014
of-0.013
Token a
Token ID257
Feature activation-3.24e-02

Pos logit contributions
irs+0.066
ide+0.059
ushes+0.055
of+0.055
angles+0.054
Neg logit contributions
 afraid-0.038
 oh-0.038
 worried-0.035
 scared-0.033
 careless-0.032
Token time
Token ID640
Feature activation-8.31e-02

Pos logit contributions
 afraid+0.049
 oh+0.049
 worried+0.045
 scared+0.043
 careless+0.043
Neg logit contributions
irs-0.068
ide-0.061
ushes-0.056
of-0.056
angles-0.056
Token there
Token ID612
Feature activation+1.32e-02

Pos logit contributions
 afraid+0.128
 oh+0.128
 worried+0.119
 careless+0.112
 scared+0.112
Neg logit contributions
irs-0.171
ide-0.157
of-0.141
 parties-0.136
ushes-0.136
Token was
Token ID373
Feature activation-7.29e-02

Pos logit contributions
irs+0.030
ide+0.025
aff+0.025
ushes+0.025
angles+0.025
Neg logit contributions
 afraid-0.018
 oh-0.018
 worried-0.017
 scared-0.017
 careless-0.016
Token a
Token ID257
Feature activation-2.83e-02

Pos logit contributions
 afraid+0.119
 oh+0.116
 worried+0.111
 careless+0.102
 scared+0.101
Neg logit contributions
irs-0.142
ide-0.127
 parties-0.118
 ease-0.116
ushes-0.112
Token Stub
Token ID41135
Feature activation-1.13e-01

Pos logit contributions
 oh+0.055
 afraid+0.054
 worried+0.048
Run+0.046
 scared+0.046
Neg logit contributions
ide-0.037
irs-0.036
aff-0.033
ingu-0.033
of-0.033
Tokenborn
Token ID6286
Feature activation-2.19e-02

Pos logit contributions
 oh+0.218
 afraid+0.202
 worried+0.195
 scared+0.187
 careless+0.181
Neg logit contributions
irs-0.201
ide-0.177
of-0.168
aff-0.164
oons-0.158
Token Old
Token ID5706
Feature activation-2.71e-02

Pos logit contributions
 afraid+0.042
 oh+0.041
 worried+0.039
 scared+0.037
 careless+0.037
Neg logit contributions
irs-0.033
ide-0.031
of-0.030
aff-0.030
r-0.028
Token Woman
Token ID14381
Feature activation-3.65e-02

Pos logit contributions
 afraid+0.048
 oh+0.048
 worried+0.045
 scared+0.043
 careless+0.042
Neg logit contributions
irs-0.050
ide-0.043
of-0.040
aff-0.040
ushes-0.040
Token.
Token ID13
Feature activation-1.65e-02

Pos logit contributions
 afraid+0.070
 oh+0.068
 worried+0.065
 scared+0.064
 careless+0.062
Neg logit contributions
irs-0.066
ushes-0.052
ide-0.051
angles-0.051
 ease-0.050
Token She
Token ID1375
Feature activation+2.63e-02

Pos logit contributions
 afraid+0.027
 oh+0.027
 worried+0.025
 scared+0.025
 careless+0.024
Neg logit contributions
irs-0.033
ide-0.029
ades-0.027
ushes-0.027
angles-0.027
Token just
Token ID655
Feature activation-3.37e-02

Pos logit contributions
Run+0.024
 oh+0.022
 afraid+0.021
 worried+0.019
 sur+0.018
Neg logit contributions
irs-0.024
ide-0.021
ushes-0.021
 parties-0.020
rap-0.020
Token a
Token ID257
Feature activation-7.44e-03

Pos logit contributions
 oh+0.050
 afraid+0.050
 worried+0.047
 careless+0.044
 scared+0.043
Neg logit contributions
irs-0.066
ide-0.055
ushes-0.053
 ease-0.053
angles-0.053
Token spider
Token ID19230
Feature activation-2.37e-02

Pos logit contributions
 oh+0.013
 afraid+0.013
Run+0.012
 worried+0.012
 scared+0.011
Neg logit contributions
irs-0.009
 craft-0.009
 cere-0.008
ide-0.008
 design-0.008
Token."
Token ID526
Feature activation+2.56e-02

Pos logit contributions
 oh+0.044
 afraid+0.044
 worried+0.042
 careless+0.039
 scared+0.039
Neg logit contributions
irs-0.039
ide-0.035
of-0.031
 parties-0.031
r-0.030
Token The
Token ID383
Feature activation+1.80e-02

Pos logit contributions
irs+0.037
ide+0.030
angles+0.028
ushes+0.027
ades+0.027
Neg logit contributions
 oh-0.057
 afraid-0.057
 worried-0.055
 scared-0.053
 careless-0.052
Token Stub
Token ID41135
Feature activation-1.07e-01

Pos logit contributions
irs+0.021
 parties+0.019
 outfits+0.019
ide+0.019
oons+0.019
Neg logit contributions
 oh-0.033
 afraid-0.032
 gonna-0.031
Run-0.031
 steep-0.029
Tokenborn
Token ID6286
Feature activation-1.62e-02

Pos logit contributions
 oh+0.216
 afraid+0.202
 worried+0.196
 scared+0.188
 careless+0.184
Neg logit contributions
irs-0.174
of-0.146
ide-0.145
aff-0.140
oons-0.135
Token Old
Token ID5706
Feature activation-3.31e-02

Pos logit contributions
 afraid+0.031
 oh+0.030
 worried+0.028
 careless+0.027
 scared+0.027
Neg logit contributions
irs-0.025
of-0.023
ide-0.023
aff-0.022
r-0.021
Token Woman
Token ID14381
Feature activation-3.07e-02

Pos logit contributions
 afraid+0.064
 oh+0.062
 worried+0.059
 scared+0.057
 careless+0.056
Neg logit contributions
irs-0.054
ide-0.048
of-0.046
aff-0.045
 outfits-0.043
Token shook
Token ID14682
Feature activation+2.60e-02

Pos logit contributions
 oh+0.044
 afraid+0.044
 worried+0.040
 scared+0.038
 careless+0.038
Neg logit contributions
irs-0.066
ide-0.060
of-0.055
ushes-0.055
aff-0.055
Token her
Token ID607
Feature activation-3.35e-03

Pos logit contributions
irs+0.026
ide+0.021
of+0.017
 ease+0.016
ushes+0.016
Neg logit contributions
 afraid-0.065
 oh-0.064
 careless-0.061
 worried-0.060
 foolish-0.059
Token she
Token ID673
Feature activation+1.32e-02

Pos logit contributions
 afraid+0.089
 oh+0.088
 worried+0.082
 scared+0.079
 careless+0.078
Neg logit contributions
irs-0.091
ide-0.076
of-0.072
 parties-0.071
ushes-0.070
Token did
Token ID750
Feature activation-3.91e-02

Pos logit contributions
irs+0.030
ide+0.026
angles+0.025
ushes+0.025
aff+0.025
Neg logit contributions
 oh-0.019
 afraid-0.019
 worried-0.018
 scared-0.017
 careless-0.017
Token so
Token ID523
Feature activation-4.12e-02

Pos logit contributions
 afraid+0.062
 oh+0.059
 worried+0.056
 careless+0.053
 scared+0.053
Neg logit contributions
irs-0.079
ide-0.069
 parties-0.062
of-0.062
ades-0.062
Token,
Token ID11
Feature activation-4.08e-02

Pos logit contributions
 oh+0.056
 afraid+0.055
 worried+0.050
 scared+0.047
 careless+0.047
Neg logit contributions
irs-0.094
ide-0.083
ushes-0.078
of-0.077
angles-0.077
Token the
Token ID262
Feature activation+1.23e-02

Pos logit contributions
 afraid+0.076
 oh+0.073
 worried+0.069
 scared+0.067
 careless+0.067
Neg logit contributions
irs-0.071
ide-0.060
of-0.056
ushes-0.055
 ease-0.055
Token Stub
Token ID41135
Feature activation-1.05e-01

Pos logit contributions
 halves+0.014
irs+0.014
 outfits+0.014
 patterns+0.014
ide+0.014
Neg logit contributions
 oh-0.021
 afraid-0.021
 gonna-0.020
Run-0.019
 worried-0.018
Tokenborn
Token ID6286
Feature activation-1.93e-02

Pos logit contributions
 oh+0.200
 afraid+0.195
 worried+0.187
 scared+0.178
 careless+0.171
Neg logit contributions
irs-0.177
of-0.149
ide-0.147
aff-0.144
oons-0.136
Token Old
Token ID5706
Feature activation-3.18e-02

Pos logit contributions
 afraid+0.036
 oh+0.035
 worried+0.032
 careless+0.031
 scared+0.031
Neg logit contributions
irs-0.031
of-0.028
ide-0.028
aff-0.027
 outfits-0.026
Token Woman
Token ID14381
Feature activation-3.28e-02

Pos logit contributions
 afraid+0.060
 oh+0.059
 worried+0.055
 scared+0.054
 careless+0.052
Neg logit contributions
irs-0.054
ide-0.047
of-0.045
aff-0.044
angles-0.042
Token said
Token ID531
Feature activation-3.08e-02

Pos logit contributions
 oh+0.050
 afraid+0.049
 worried+0.045
 scared+0.043
 careless+0.043
Neg logit contributions
irs-0.068
ide-0.061
of-0.057
ushes-0.056
angles-0.056
Token,
Token ID11
Feature activation-4.33e-02

Pos logit contributions
 afraid+0.051
 oh+0.047
 worried+0.045
 careless+0.044
 scared+0.043
Neg logit contributions
irs-0.056
ide-0.050
of-0.047
ades-0.046
ushes-0.046
Token."
Token ID526
Feature activation+2.78e-02

Pos logit contributions
 oh+0.028
 afraid+0.028
 worried+0.026
 careless+0.025
 scared+0.025
Neg logit contributions
irs-0.021
ide-0.019
 parties-0.017
of-0.017
 outfits-0.016
Token
Token ID220
Feature activation+2.85e-02

Pos logit contributions
irs+0.057
ide+0.051
ades+0.049
angles+0.048
oons+0.046
Neg logit contributions
 afraid-0.044
 oh-0.044
 worried-0.044
 scared-0.041
 careless-0.038
Token
Token ID198
Feature activation+7.28e-02

Pos logit contributions
irs+0.051
 journals+0.048
ide+0.046
 outfits+0.045
lights+0.045
Neg logit contributions
 afraid-0.049
 oh-0.047
 worried-0.046
 scared-0.046
 careless-0.043
Token
Token ID198
Feature activation+6.47e-02

Pos logit contributions
irs+0.153
angles+0.148
lights+0.146
 journals+0.141
ide+0.138
Neg logit contributions
 oh-0.096
 afraid-0.091
 worried-0.090
 scared-0.086
 very-0.083
TokenThe
Token ID464
Feature activation+1.41e-02

Pos logit contributions
irs+0.125
lights+0.113
angles+0.110
ide+0.108
aff+0.107
Neg logit contributions
 oh-0.110
 afraid-0.094
 worried-0.094
 determined-0.092
 careless-0.091
Token Stub
Token ID41135
Feature activation-1.01e-01

Pos logit contributions
irs+0.020
 parties+0.018
ide+0.017
ingu+0.017
 outfits+0.017
Neg logit contributions
 oh-0.027
 afraid-0.027
 scared-0.024
 worried-0.023
Run-0.023
Tokenborn
Token ID6286
Feature activation-1.08e-02

Pos logit contributions
 oh+0.184
 afraid+0.179
 worried+0.172
 scared+0.163
 careless+0.157
Neg logit contributions
irs-0.173
of-0.151
ide-0.146
aff-0.144
oons-0.138
Token Old
Token ID5706
Feature activation-2.99e-02

Pos logit contributions
 afraid+0.021
 oh+0.021
 worried+0.019
 careless+0.018
 scared+0.018
Neg logit contributions
irs-0.017
ide-0.015
of-0.015
aff-0.015
r-0.014
Token Woman
Token ID14381
Feature activation-2.30e-02

Pos logit contributions
 afraid+0.055
 oh+0.053
 worried+0.050
 scared+0.049
 careless+0.047
Neg logit contributions
irs-0.051
ide-0.045
of-0.044
aff-0.043
 outfits-0.042
Token chuckled
Token ID39889
Feature activation+3.84e-02

Pos logit contributions
 afraid+0.034
 oh+0.033
 worried+0.032
 scared+0.030
 careless+0.030
Neg logit contributions
irs-0.050
ide-0.043
ushes-0.041
angles-0.041
aff-0.040
Token and
Token ID290
Feature activation-2.98e-03

Pos logit contributions
irs+0.072
ide+0.066
ushes+0.060
of+0.060
aff+0.059
Neg logit contributions
 afraid-0.064
 oh-0.063
 worried-0.058
 careless-0.056
 scared-0.056
Token neck
Token ID7393
Feature activation+1.33e-02

Pos logit contributions
irs+0.048
ide+0.041
angles+0.038
ushes+0.038
of+0.038
Neg logit contributions
 afraid-0.044
 oh-0.043
 worried-0.041
 scared-0.040
 careless-0.039
Token that
Token ID326
Feature activation+9.39e-03

Pos logit contributions
irs+0.023
ide+0.019
angles+0.018
ushes+0.018
of+0.017
Neg logit contributions
 afraid-0.025
 oh-0.025
 worried-0.024
 scared-0.023
 careless-0.023
Token read
Token ID1100
Feature activation+3.81e-02

Pos logit contributions
irs+0.020
ide+0.018
ushes+0.017
angles+0.016
of+0.016
Neg logit contributions
 afraid-0.014
 oh-0.014
 worried-0.013
 scared-0.012
 careless-0.012
Token,
Token ID11
Feature activation-2.54e-02

Pos logit contributions
irs+0.070
ide+0.059
ushes+0.055
angles+0.054
aff+0.054
Neg logit contributions
 afraid-0.069
 oh-0.068
 worried-0.065
 scared-0.063
 careless-0.061
Token "
Token ID366
Feature activation+2.06e-03

Pos logit contributions
 oh+0.039
 afraid+0.038
 worried+0.035
 careless+0.034
 scared+0.033
Neg logit contributions
irs-0.052
ide-0.045
ushes-0.043
aff-0.042
angles-0.042
TokenFore
Token ID16351
Feature activation-9.38e-02

Pos logit contributions
irs+0.004
ide+0.003
angles+0.003
ingu+0.003
lights+0.003
Neg logit contributions
 oh-0.004
 afraid-0.003
 worried-0.003
 scared-0.003
Run-0.003
Tokenver
Token ID332
Feature activation-8.05e-02

Pos logit contributions
 afraid+0.149
 oh+0.147
 worried+0.141
 scared+0.137
 careless+0.132
Neg logit contributions
irs-0.187
ide-0.160
of-0.155
r-0.148
angles-0.146
Token Friends
Token ID14213
Feature activation-1.15e-02

Pos logit contributions
 afraid+0.144
 oh+0.143
 worried+0.137
 careless+0.130
 scared+0.128
Neg logit contributions
irs-0.137
ide-0.124
r-0.114
aff-0.112
of-0.112
Token".
Token ID1911
Feature activation+1.33e-02

Pos logit contributions
 afraid+0.021
 oh+0.020
 worried+0.020
 scared+0.019
 careless+0.019
Neg logit contributions
irs-0.021
ide-0.018
ushes-0.018
lights-0.017
ades-0.017
Token [space]
Token ID220
Feature activation+1.69e-02

Pos logit contributions
irs+0.025
ades+0.021
ide+0.021
angles+0.021
oons+0.020
Neg logit contributions
 oh-0.023
 afraid-0.023
 scared-0.022
 worried-0.022
 careless-0.020
Token [newline]
Token ID198
Feature activation+6.44e-02

Pos logit contributions
irs+0.032
 journals+0.029
 occasions+0.029
 outfits+0.029
 parties+0.028
Neg logit contributions
 afraid-0.027
 scared-0.025
 worried-0.025
 oh-0.024
 careless-0.022
Token They
Token ID1119
Feature activation+2.84e-02

Pos logit contributions
 afraid+0.017
 oh+0.017
 worried+0.016
 scared+0.016
 careless+0.015
Neg logit contributions
irs-0.015
ide-0.013
angles-0.013
 outfits-0.012
of-0.012
Token laughed
Token ID13818
Feature activation+3.01e-02

Pos logit contributions
irs+0.064
ide+0.055
ushes+0.052
aff+0.052
angles+0.051
Neg logit contributions
 afraid-0.042
 oh-0.040
 worried-0.039
 scared-0.038
 careless-0.036
Token and
Token ID290
Feature activation+7.79e-04

Pos logit contributions
irs+0.054
ide+0.049
ushes+0.045
angles+0.044
of+0.044
Neg logit contributions
 oh-0.052
 afraid-0.051
 worried-0.048
 careless-0.046
 scared-0.045
Token shouted
Token ID17293
Feature activation-8.78e-03

Pos logit contributions
irs+0.001
of+0.001
 ease+0.001
ide+0.001
 parties+0.001
Neg logit contributions
 oh-0.001
 afraid-0.001
 worried-0.001
 careless-0.001
 foolish-0.001
Token '
Token ID705
Feature activation-3.46e-02

Pos logit contributions
 afraid+0.016
 oh+0.016
 worried+0.015
 scared+0.014
 careless+0.014
Neg logit contributions
irs-0.016
ide-0.014
ushes-0.013
angles-0.012
of-0.012
Tokenfore
Token ID754
Feature activation-9.29e-02

Pos logit contributions
 afraid+0.059
 oh+0.058
 worried+0.055
 scared+0.053
 careless+0.052
Neg logit contributions
irs-0.067
ide-0.057
of-0.054
ushes-0.054
angles-0.053
Token'
Token ID6
Feature activation-8.07e-03

Pos logit contributions
 afraid+0.196
 oh+0.187
 scared+0.180
 worried+0.179
 fearful+0.172
Neg logit contributions
irs-0.137
of-0.123
ide-0.110
r-0.108
uing-0.107
Token a
Token ID257
Feature activation-2.38e-03

Pos logit contributions
 oh+0.015
 afraid+0.015
 worried+0.014
 careless+0.013
 scared+0.013
Neg logit contributions
irs-0.012
ide-0.011
of-0.011
aff-0.011
ushes-0.011
Token lot
Token ID1256
Feature activation-4.11e-02

Pos logit contributions
 oh+0.004
 afraid+0.004
 worried+0.004
 careless+0.004
 scared+0.004
Neg logit contributions
irs-0.004
of-0.003
ide-0.003
 ease-0.003
ushes-0.003
Token.
Token ID13
Feature activation-1.86e-02

Pos logit contributions
 afraid+0.080
 oh+0.079
 worried+0.075
 scared+0.073
 careless+0.072
Neg logit contributions
irs-0.069
ide-0.058
ushes-0.054
angles-0.054
 outfits-0.052
Token [newline]
Token ID198
Feature activation+4.94e-02

Pos logit contributions
 afraid+0.033
 oh+0.033
 worried+0.030
 scared+0.030
 careless+0.029
Neg logit contributions
irs-0.035
ide-0.031
angles-0.028
of-0.028
ushes-0.027
Token.
Token ID13
Feature activation-1.72e-03

Pos logit contributions
irs+0.058
angles+0.045
ingu+0.044
ushes+0.044
ide+0.044
Neg logit contributions
 afraid-0.046
 scared-0.045
 worried-0.044
 oh-0.043
 careless-0.040
Token<|endoftext|>
Token ID50256
Feature activation+2.05e-02

Pos logit contributions
 afraid+0.003
 oh+0.003
 worried+0.003
 scared+0.003
 careless+0.003
Neg logit contributions
irs-0.003
ide-0.003
ades-0.003
angles-0.003
 outfits-0.003
TokenOnce
Token ID7454
Feature activation-3.97e-02

Pos logit contributions
irs+0.040
angles+0.035
uing+0.033
 outfits+0.033
ide+0.032
Neg logit contributions
 afraid-0.035
 scared-0.033
 worried-0.032
 oh-0.031
 frightened-0.029
Token upon
Token ID2402
Feature activation+2.09e-02

Pos logit contributions
 afraid+0.072
 worried+0.070
 scared+0.067
 careless+0.065
 oh+0.063
Neg logit contributions
irs-0.074
 occasions-0.064
bons-0.064
 outfits-0.062
aff-0.061
Token a
Token ID257
Feature activation-4.24e-02

Pos logit contributions
irs+0.051
ide+0.045
angles+0.043
ushes+0.043
aff+0.043
Neg logit contributions
 afraid-0.026
 oh-0.025
 worried-0.023
 scared-0.022
 careless-0.021
Token time
Token ID640
Feature activation-9.28e-02

Pos logit contributions
 oh+0.067
 afraid+0.066
 worried+0.062
 scared+0.059
 careless+0.058
Neg logit contributions
irs-0.086
ide-0.078
of-0.071
ushes-0.071
angles-0.070
Token,
Token ID11
Feature activation-7.55e-02

Pos logit contributions
 afraid+0.137
 oh+0.137
 worried+0.127
 careless+0.120
 scared+0.120
Neg logit contributions
irs-0.197
ide-0.178
of-0.162
ushes-0.159
ades-0.158
Token there
Token ID612
Feature activation+9.82e-03

Pos logit contributions
 oh+0.103
 afraid+0.102
 worried+0.093
 careless+0.089
 scared+0.086
Neg logit contributions
irs-0.160
ide-0.150
ushes-0.137
of-0.133
angles-0.132
Token was
Token ID373
Feature activation-7.21e-02

Pos logit contributions
irs+0.024
aff+0.020
ushes+0.020
angles+0.020
ide+0.020
Neg logit contributions
 oh-0.012
 afraid-0.012
 worried-0.011
 scared-0.011
 careless-0.010
Token a
Token ID257
Feature activation-2.90e-02

Pos logit contributions
 afraid+0.118
 oh+0.115
 worried+0.110
 careless+0.102
 scared+0.101
Neg logit contributions
irs-0.140
ide-0.124
 parties-0.116
 ease-0.114
ushes-0.111
Token little
Token ID1310
Feature activation-3.90e-02

Pos logit contributions
 oh+0.055
 afraid+0.054
 worried+0.048
Run+0.047
 scared+0.046
Neg logit contributions
irs-0.038
ide-0.038
aff-0.035
ingu-0.035
of-0.034
Token.
Token ID13
Feature activation-6.34e-03

Pos logit contributions
 afraid+0.066
 oh+0.064
 worried+0.062
 scared+0.061
 careless+0.059
Neg logit contributions
irs-0.057
ide-0.046
angles-0.044
aff-0.043
ushes-0.041
Token<|endoftext|>
Token ID50256
Feature activation+2.11e-02

Pos logit contributions
 afraid+0.013
 oh+0.012
 worried+0.012
 scared+0.012
 careless+0.011
Neg logit contributions
irs-0.011
ide-0.009
angles-0.009
ades-0.009
 outfits-0.009
TokenOnce
Token ID7454
Feature activation-3.68e-02

Pos logit contributions
irs+0.040
angles+0.033
 outfits+0.032
ide+0.032
uing+0.031
Neg logit contributions
 afraid-0.038
 scared-0.036
 worried-0.035
 oh-0.035
 careless-0.032
Token upon
Token ID2402
Feature activation+2.51e-02

Pos logit contributions
 afraid+0.067
 worried+0.065
 scared+0.063
 careless+0.061
 oh+0.060
Neg logit contributions
irs-0.068
 occasions-0.058
bons-0.057
 outfits-0.056
ide-0.055
Token a
Token ID257
Feature activation-4.01e-02

Pos logit contributions
irs+0.062
ide+0.055
angles+0.052
ushes+0.052
aff+0.052
Neg logit contributions
 afraid-0.030
 oh-0.030
 worried-0.027
 scared-0.026
 careless-0.025
Token time
Token ID640
Feature activation-9.22e-02

Pos logit contributions
 oh+0.063
 afraid+0.063
 worried+0.058
 scared+0.056
 careless+0.055
Neg logit contributions
irs-0.082
ide-0.073
of-0.068
ushes-0.067
angles-0.066
Token,
Token ID11
Feature activation-7.66e-02

Pos logit contributions
 afraid+0.140
 oh+0.140
 worried+0.130
 careless+0.123
 scared+0.122
Neg logit contributions
irs-0.192
ide-0.173
of-0.159
ushes-0.155
 parties-0.154
Token there
Token ID612
Feature activation+1.09e-02

Pos logit contributions
 oh+0.104
 afraid+0.104
 worried+0.095
 careless+0.089
 scared+0.087
Neg logit contributions
irs-0.163
ide-0.153
ushes-0.139
of-0.137
angles-0.137
Token was
Token ID373
Feature activation-7.05e-02

Pos logit contributions
irs+0.025
ide+0.021
aff+0.021
ushes+0.021
angles+0.021
Neg logit contributions
 oh-0.015
 afraid-0.015
 worried-0.014
 scared-0.013
 careless-0.013
Token a
Token ID257
Feature activation-2.86e-02

Pos logit contributions
 afraid+0.112
 oh+0.108
 worried+0.104
 careless+0.096
 scared+0.095
Neg logit contributions
irs-0.142
ide-0.126
 parties-0.118
 ease-0.115
ushes-0.114
Token little
Token ID1310
Feature activation-3.73e-02

Pos logit contributions
 oh+0.055
 afraid+0.053
 worried+0.047
Run+0.047
 adamant+0.046
Neg logit contributions
ide-0.035
irs-0.035
ingu-0.033
aff-0.033
 parties-0.032
Token.
Token ID13
Feature activation+1.08e-02

Pos logit contributions
 afraid+0.034
 oh+0.034
 worried+0.033
 scared+0.032
 careless+0.031
Neg logit contributions
irs-0.038
ushes-0.031
angles-0.031
ide-0.031
ades-0.029
Token<|endoftext|>
Token ID50256
Feature activation+2.53e-02

Pos logit contributions
irs+0.021
ide+0.018
ades+0.018
r+0.018
 occasions+0.017
Neg logit contributions
 afraid-0.017
 worried-0.017
 scared-0.016
 oh-0.016
 careless-0.015
TokenOnce
Token ID7454
Feature activation-3.35e-02

Pos logit contributions
irs+0.051
angles+0.043
 outfits+0.041
uing+0.040
aff+0.040
Neg logit contributions
 afraid-0.042
 scared-0.041
 worried-0.039
 oh-0.038
 frightened-0.035
Token upon
Token ID2402
Feature activation+2.54e-02

Pos logit contributions
 afraid+0.057
 worried+0.056
 scared+0.054
 careless+0.052
 foolish+0.050
Neg logit contributions
irs-0.064
bons-0.058
 occasions-0.057
 outfits-0.055
aters-0.053
Token a
Token ID257
Feature activation-4.18e-02

Pos logit contributions
irs+0.063
ide+0.056
angles+0.053
ushes+0.053
aff+0.053
Neg logit contributions
 afraid-0.030
 oh-0.030
 worried-0.027
 scared-0.026
 careless-0.025
Token time
Token ID640
Feature activation-9.19e-02

Pos logit contributions
 afraid+0.063
 oh+0.063
 worried+0.058
 scared+0.056
 careless+0.055
Neg logit contributions
irs-0.089
ide-0.078
ushes-0.073
of-0.072
angles-0.072
Token,
Token ID11
Feature activation-7.39e-02

Pos logit contributions
 afraid+0.135
 oh+0.134
 worried+0.124
 scared+0.117
 careless+0.117
Neg logit contributions
irs-0.198
ide-0.178
of-0.164
ushes-0.162
angles-0.160
Token there
Token ID612
Feature activation+1.21e-02

Pos logit contributions
 afraid+0.097
 oh+0.096
 worried+0.089
 careless+0.083
 scared+0.082
Neg logit contributions
irs-0.165
ide-0.152
ushes-0.139
of-0.138
angles-0.137
Token was
Token ID373
Feature activation-7.12e-02

Pos logit contributions
irs+0.030
aff+0.025
ushes+0.025
ide+0.025
angles+0.024
Neg logit contributions
 oh-0.015
 afraid-0.015
 worried-0.014
 scared-0.014
 careless-0.013
Token a
Token ID257
Feature activation-2.88e-02

Pos logit contributions
 afraid+0.115
 oh+0.111
 worried+0.107
 careless+0.099
 scared+0.098
Neg logit contributions
irs-0.142
ide-0.126
 parties-0.117
 ease-0.115
ushes-0.113
Token happy
Token ID3772
Feature activation-2.52e-03

Pos logit contributions
 oh+0.055
 afraid+0.054
 worried+0.047
Run+0.047
 scared+0.045
Neg logit contributions
ide-0.036
irs-0.036
ingu-0.034
aff-0.034
 parties-0.033
Token.
Token ID13
Feature activation-3.70e-03

Pos logit contributions
irs+0.018
ide+0.015
ushes+0.014
angles+0.013
 parties+0.013
Neg logit contributions
 afraid-0.024
 oh-0.024
 worried-0.023
 scared-0.022
 careless-0.022
Token<|endoftext|>
Token ID50256
Feature activation+2.07e-02

Pos logit contributions
 afraid+0.007
 oh+0.007
 worried+0.007
 scared+0.007
 careless+0.006
Neg logit contributions
irs-0.006
ide-0.005
ades-0.005
angles-0.005
 outfits-0.005
TokenOnce
Token ID7454
Feature activation-3.88e-02

Pos logit contributions
irs+0.040
angles+0.034
 outfits+0.033
ide+0.032
uing+0.032
Neg logit contributions
 afraid-0.035
 scared-0.033
 worried-0.032
 oh-0.032
 frightened-0.029
Token upon
Token ID2402
Feature activation+2.36e-02

Pos logit contributions
 afraid+0.070
 worried+0.068
 scared+0.065
 careless+0.063
 oh+0.062
Neg logit contributions
irs-0.073
 occasions-0.063
bons-0.062
 outfits-0.061
aff-0.059
Token a
Token ID257
Feature activation-4.16e-02

Pos logit contributions
irs+0.058
ide+0.052
angles+0.049
ushes+0.049
aff+0.049
Neg logit contributions
 afraid-0.028
 oh-0.028
 worried-0.026
 scared-0.025
 careless-0.024
Token time
Token ID640
Feature activation-9.19e-02

Pos logit contributions
 oh+0.068
 afraid+0.067
 worried+0.062
 scared+0.060
 careless+0.058
Neg logit contributions
irs-0.082
ide-0.075
of-0.068
ushes-0.067
angles-0.066
Token,
Token ID11
Feature activation-7.63e-02

Pos logit contributions
 afraid+0.136
 oh+0.136
 worried+0.125
 scared+0.119
 careless+0.118
Neg logit contributions
irs-0.196
ide-0.177
of-0.162
ushes-0.159
angles-0.157
Token there
Token ID612
Feature activation+9.85e-03

Pos logit contributions
 oh+0.107
 afraid+0.106
 worried+0.097
 careless+0.092
 scared+0.089
Neg logit contributions
irs-0.157
ide-0.151
ushes-0.135
of-0.132
angles-0.131
Token was
Token ID373
Feature activation-7.31e-02

Pos logit contributions
irs+0.022
ide+0.019
aff+0.019
ushes+0.019
angles+0.018
Neg logit contributions
 afraid-0.014
 oh-0.014
 worried-0.013
 scared-0.013
 careless-0.012
Token a
Token ID257
Feature activation-3.04e-02

Pos logit contributions
 afraid+0.121
 oh+0.118
 worried+0.113
 careless+0.105
 scared+0.103
Neg logit contributions
irs-0.140
ide-0.124
 parties-0.117
 ease-0.114
ushes-0.110
Token kid
Token ID5141
Feature activation-6.05e-02

Pos logit contributions
 oh+0.059
 afraid+0.057
 worried+0.050
Run+0.050
 adamant+0.049
Neg logit contributions
ide-0.038
irs-0.037
aff-0.035
ingu-0.035
 parties-0.034
Token.
Token ID13
Feature activation+4.56e-04

Pos logit contributions
 oh+0.070
 afraid+0.070
 worried+0.067
 careless+0.064
 scared+0.064
Neg logit contributions
irs-0.043
ide-0.037
ushes-0.032
of-0.032
ades-0.031
Token<|endoftext|>
Token ID50256
Feature activation+2.25e-02

Pos logit contributions
irs+0.001
ide+0.001
angles+0.000
ushes+0.000
of+0.000
Neg logit contributions
 afraid-0.001
 oh-0.001
 worried-0.001
 scared-0.001
 careless-0.001
TokenOnce
Token ID7454
Feature activation-3.57e-02

Pos logit contributions
irs+0.041
angles+0.034
 outfits+0.033
ide+0.033
ushes+0.032
Neg logit contributions
 afraid-0.041
 scared-0.039
 worried-0.039
 oh-0.038
 careless-0.035
Token upon
Token ID2402
Feature activation+2.23e-02

Pos logit contributions
 afraid+0.064
 worried+0.062
 scared+0.060
 careless+0.058
 oh+0.057
Neg logit contributions
irs-0.066
 occasions-0.058
bons-0.057
 outfits-0.057
ands-0.054
Token a
Token ID257
Feature activation-4.20e-02

Pos logit contributions
irs+0.057
ide+0.050
angles+0.048
aff+0.048
ushes+0.048
Neg logit contributions
 afraid-0.026
 oh-0.025
 worried-0.023
 scared-0.022
 careless-0.021
Token time
Token ID640
Feature activation-9.19e-02

Pos logit contributions
 oh+0.067
 afraid+0.067
 worried+0.062
 scared+0.060
 careless+0.058
Neg logit contributions
irs-0.084
ide-0.076
of-0.070
ushes-0.069
angles-0.068
Token,
Token ID11
Feature activation-7.65e-02

Pos logit contributions
 oh+0.142
 afraid+0.142
 worried+0.131
 careless+0.125
 scared+0.124
Neg logit contributions
irs-0.190
ide-0.172
of-0.155
ushes-0.151
ades-0.151
Token there
Token ID612
Feature activation+1.04e-02

Pos logit contributions
 oh+0.111
 afraid+0.110
 worried+0.102
 careless+0.097
 scared+0.092
Neg logit contributions
irs-0.151
ide-0.147
ushes-0.126
of-0.125
angles-0.123
Token was
Token ID373
Feature activation-7.28e-02

Pos logit contributions
irs+0.023
ide+0.020
ushes+0.019
aff+0.019
angles+0.019
Neg logit contributions
 afraid-0.015
 oh-0.015
 worried-0.014
 scared-0.014
 careless-0.013
Token a
Token ID257
Feature activation-2.99e-02

Pos logit contributions
 afraid+0.116
 oh+0.113
 worried+0.107
 careless+0.100
 scared+0.098
Neg logit contributions
irs-0.146
ide-0.131
 parties-0.120
 ease-0.119
ingu-0.116
Token big
Token ID1263
Feature activation-2.15e-02

Pos logit contributions
 oh+0.058
 afraid+0.056
Run+0.049
 worried+0.049
 adamant+0.048
Neg logit contributions
ide-0.037
irs-0.036
ingu-0.034
aff-0.034
 parties-0.033
Token."
Token ID526
Feature activation+2.00e-02

Pos logit contributions
 afraid+0.006
 oh+0.006
 worried+0.005
 scared+0.005
 careless+0.005
Neg logit contributions
irs-0.004
ide-0.004
ushes-0.003
angles-0.003
of-0.003
Token<|endoftext|>
Token ID50256
Feature activation+2.55e-02

Pos logit contributions
irs+0.042
ide+0.035
ades+0.035
angles+0.033
 journals+0.032
Neg logit contributions
 afraid-0.031
 worried-0.030
 oh-0.030
 scared-0.029
 careless-0.027
TokenOnce
Token ID7454
Feature activation-3.41e-02

Pos logit contributions
irs+0.052
angles+0.045
uing+0.043
 outfits+0.043
ide+0.041
Neg logit contributions
 afraid-0.042
 scared-0.040
 worried-0.038
 oh-0.037
 frightened-0.035
Token upon
Token ID2402
Feature activation+2.41e-02

Pos logit contributions
 afraid+0.061
 worried+0.060
 scared+0.057
 careless+0.055
 oh+0.054
Neg logit contributions
irs-0.066
bons-0.058
 occasions-0.057
 outfits-0.055
aff-0.055
Token a
Token ID257
Feature activation-4.05e-02

Pos logit contributions
irs+0.061
ide+0.053
angles+0.051
aff+0.051
ushes+0.051
Neg logit contributions
 afraid-0.028
 oh-0.028
 worried-0.025
 scared-0.024
 careless-0.023
Token time
Token ID640
Feature activation-9.18e-02

Pos logit contributions
 afraid+0.063
 oh+0.062
 worried+0.058
 scared+0.055
 careless+0.054
Neg logit contributions
irs-0.084
ide-0.075
ushes-0.069
of-0.069
angles-0.068
Token,
Token ID11
Feature activation-7.32e-02

Pos logit contributions
 afraid+0.135
 oh+0.134
 worried+0.124
 scared+0.117
 careless+0.117
Neg logit contributions
irs-0.197
ide-0.177
of-0.162
ushes-0.160
angles-0.159
Token there
Token ID612
Feature activation+1.23e-02

Pos logit contributions
 afraid+0.095
 oh+0.094
 worried+0.087
 careless+0.081
 scared+0.080
Neg logit contributions
irs-0.164
ide-0.152
ushes-0.140
of-0.138
angles-0.138
Token was
Token ID373
Feature activation-7.00e-02

Pos logit contributions
irs+0.030
aff+0.026
ushes+0.025
angles+0.025
ide+0.025
Neg logit contributions
 oh-0.015
 afraid-0.015
 scared-0.014
 worried-0.014
 careless-0.013
Token a
Token ID257
Feature activation-2.77e-02

Pos logit contributions
 afraid+0.112
 oh+0.109
 worried+0.104
 careless+0.096
 scared+0.095
Neg logit contributions
irs-0.139
ide-0.124
 parties-0.116
 ease-0.113
ushes-0.112
Token little
Token ID1310
Feature activation-3.74e-02

Pos logit contributions
 oh+0.051
 afraid+0.050
 worried+0.045
 scared+0.043
Run+0.043
Neg logit contributions
irs-0.039
ide-0.039
aff-0.036
ingu-0.035
of-0.034
Token.
Token ID13
Feature activation+4.42e-03

Pos logit contributions
 afraid+0.025
 oh+0.025
 worried+0.024
 scared+0.023
 careless+0.023
Neg logit contributions
irs-0.022
ide-0.018
ushes-0.017
angles-0.016
aff-0.016
Token<|endoftext|>
Token ID50256
Feature activation+2.46e-02

Pos logit contributions
irs+0.007
ide+0.006
angles+0.005
ushes+0.005
ades+0.005
Neg logit contributions
 afraid-0.009
 oh-0.009
 worried-0.009
 scared-0.009
 careless-0.009
TokenOnce
Token ID7454
Feature activation-4.12e-02

Pos logit contributions
irs+0.045
angles+0.037
 outfits+0.036
ide+0.035
aff+0.035
Neg logit contributions
 afraid-0.046
 scared-0.044
 worried-0.043
 oh-0.042
 careless-0.039
Token upon
Token ID2402
Feature activation+2.45e-02

Pos logit contributions
 afraid+0.075
 worried+0.072
 scared+0.070
 careless+0.067
 oh+0.067
Neg logit contributions
irs-0.077
 occasions-0.066
bons-0.065
 outfits-0.065
aff-0.063
Token a
Token ID257
Feature activation-4.35e-02

Pos logit contributions
irs+0.062
ide+0.054
angles+0.052
aff+0.052
ushes+0.052
Neg logit contributions
 afraid-0.029
 oh-0.028
 worried-0.026
 scared-0.025
 careless-0.024
Token time
Token ID640
Feature activation-9.18e-02

Pos logit contributions
 oh+0.067
 afraid+0.067
 worried+0.062
 scared+0.060
 careless+0.058
Neg logit contributions
irs-0.090
ide-0.080
of-0.074
ushes-0.074
angles-0.073
Token,
Token ID11
Feature activation-7.58e-02

Pos logit contributions
 afraid+0.136
 oh+0.135
 worried+0.126
 scared+0.118
 careless+0.118
Neg logit contributions
irs-0.195
ide-0.176
of-0.161
ushes-0.159
angles-0.157
Token there
Token ID612
Feature activation+1.30e-02

Pos logit contributions
 afraid+0.103
 oh+0.103
 worried+0.094
 careless+0.089
 scared+0.086
Neg logit contributions
irs-0.160
ide-0.149
ushes-0.136
of-0.135
angles-0.134
Token was
Token ID373
Feature activation-7.23e-02

Pos logit contributions
irs+0.030
ide+0.025
aff+0.025
ushes+0.025
angles+0.025
Neg logit contributions
 afraid-0.018
 oh-0.018
 worried-0.017
 scared-0.016
 careless-0.015
Token a
Token ID257
Feature activation-3.05e-02

Pos logit contributions
 afraid+0.120
 oh+0.117
 worried+0.112
 careless+0.103
 scared+0.101
Neg logit contributions
irs-0.138
ide-0.122
 parties-0.114
 ease-0.111
ushes-0.109
Token little
Token ID1310
Feature activation-3.77e-02

Pos logit contributions
 oh+0.057
 afraid+0.055
 worried+0.048
Run+0.048
 scared+0.046
Neg logit contributions
ide-0.040
irs-0.040
aff-0.037
ingu-0.037
 parties-0.036
Token.
Token ID13
Feature activation+9.49e-03

Pos logit contributions
irs+0.030
ide+0.026
ushes+0.023
aff+0.022
angles+0.022
Neg logit contributions
 afraid-0.046
 oh-0.046
 worried-0.044
 scared-0.042
 careless-0.042
Token<|endoftext|>
Token ID50256
Feature activation+2.13e-02

Pos logit contributions
irs+0.018
ide+0.015
 outfits+0.014
ades+0.014
angles+0.014
Neg logit contributions
 afraid-0.018
 oh-0.017
 worried-0.016
 scared-0.016
 careless-0.015
TokenOnce
Token ID7454
Feature activation-4.16e-02

Pos logit contributions
irs+0.039
angles+0.032
 outfits+0.031
ide+0.031
aff+0.030
Neg logit contributions
 afraid-0.040
 scared-0.038
 worried-0.037
 oh-0.036
 careless-0.033
Token upon
Token ID2402
Feature activation+2.27e-02

Pos logit contributions
 afraid+0.078
 worried+0.075
 scared+0.073
 oh+0.070
 careless+0.070
Neg logit contributions
irs-0.076
 occasions-0.064
 outfits-0.062
bons-0.062
aff-0.061
Token a
Token ID257
Feature activation-4.12e-02

Pos logit contributions
irs+0.054
ide+0.048
ushes+0.045
angles+0.045
of+0.045
Neg logit contributions
 afraid-0.029
 oh-0.028
 worried-0.026
 scared-0.025
 careless-0.024
Token time
Token ID640
Feature activation-9.18e-02

Pos logit contributions
 oh+0.067
 afraid+0.066
 worried+0.062
 scared+0.059
 careless+0.058
Neg logit contributions
irs-0.082
ide-0.074
of-0.068
ushes-0.067
angles-0.066
Token,
Token ID11
Feature activation-7.44e-02

Pos logit contributions
 oh+0.144
 afraid+0.143
 worried+0.134
 careless+0.126
 scared+0.125
Neg logit contributions
irs-0.184
ide-0.169
of-0.153
ushes-0.148
ades-0.148
Token there
Token ID612
Feature activation+1.18e-02

Pos logit contributions
 oh+0.107
 afraid+0.106
 worried+0.098
 careless+0.092
 scared+0.089
Neg logit contributions
irs-0.148
ide-0.141
ushes-0.127
of-0.126
angles-0.125
Token was
Token ID373
Feature activation-7.31e-02

Pos logit contributions
irs+0.027
ide+0.023
aff+0.023
ushes+0.023
angles+0.022
Neg logit contributions
 afraid-0.017
 oh-0.017
 worried-0.015
 scared-0.015
 careless-0.014
Token a
Token ID257
Feature activation-3.15e-02

Pos logit contributions
 afraid+0.123
 oh+0.120
 worried+0.116
 careless+0.107
 scared+0.104
Neg logit contributions
irs-0.135
ide-0.121
 parties-0.113
 ease-0.110
ushes-0.109
Token z
Token ID1976
Feature activation-3.06e-02

Pos logit contributions
 oh+0.060
 afraid+0.058
 worried+0.051
Run+0.051
 adamant+0.050
Neg logit contributions
ide-0.039
irs-0.038
ingu-0.037
aff-0.036
 parties-0.035
Token.
Token ID13
Feature activation+3.62e-04

Pos logit contributions
irs+0.048
ushes+0.037
ide+0.037
angles+0.036
ades+0.036
Neg logit contributions
 afraid-0.048
 oh-0.048
 worried-0.046
 scared-0.045
 careless-0.043
Token<|endoftext|>
Token ID50256
Feature activation+2.40e-02

Pos logit contributions
irs+0.001
ide+0.001
ades+0.001
 outfits+0.000
angles+0.000
Neg logit contributions
 afraid-0.001
 oh-0.001
 worried-0.001
 scared-0.001
 careless-0.001
TokenOnce
Token ID7454
Feature activation-3.74e-02

Pos logit contributions
irs+0.049
angles+0.043
uing+0.042
 outfits+0.041
aff+0.040
Neg logit contributions
 afraid-0.038
 scared-0.037
 worried-0.035
 oh-0.033
 frightened-0.031
Token upon
Token ID2402
Feature activation+2.32e-02

Pos logit contributions
 afraid+0.067
 worried+0.065
 scared+0.062
 careless+0.060
 oh+0.058
Neg logit contributions
irs-0.071
 occasions-0.063
bons-0.061
 outfits-0.060
aff-0.058
Token a
Token ID257
Feature activation-4.24e-02

Pos logit contributions
irs+0.058
ide+0.051
angles+0.049
aff+0.049
ushes+0.049
Neg logit contributions
 afraid-0.028
 oh-0.027
 worried-0.025
 scared-0.024
 careless-0.023
Token time
Token ID640
Feature activation-9.17e-02

Pos logit contributions
 afraid+0.065
 oh+0.065
 worried+0.060
 scared+0.057
 careless+0.056
Neg logit contributions
irs-0.088
ide-0.079
ushes-0.073
of-0.073
angles-0.072
Token,
Token ID11
Feature activation-7.54e-02

Pos logit contributions
 oh+0.137
 afraid+0.137
 worried+0.127
 careless+0.120
 scared+0.119
Neg logit contributions
irs-0.193
ide-0.175
of-0.159
ushes-0.156
ades-0.155
Token there
Token ID612
Feature activation+1.12e-02

Pos logit contributions
 oh+0.104
 afraid+0.104
 worried+0.095
 careless+0.090
 scared+0.088
Neg logit contributions
irs-0.156
ide-0.148
ushes-0.133
of-0.131
angles-0.130
Token was
Token ID373
Feature activation-7.00e-02

Pos logit contributions
irs+0.026
ide+0.022
aff+0.022
ushes+0.022
angles+0.021
Neg logit contributions
 afraid-0.015
 oh-0.015
 worried-0.014
 scared-0.014
 careless-0.013
Token a
Token ID257
Feature activation-2.86e-02

Pos logit contributions
 afraid+0.120
 oh+0.114
 worried+0.112
 careless+0.103
 scared+0.102
Neg logit contributions
irs-0.129
ide-0.114
 parties-0.108
 ease-0.106
ushes-0.101
Token little
Token ID1310
Feature activation-3.71e-02

Pos logit contributions
 oh+0.055
 afraid+0.054
Run+0.047
 worried+0.047
 adamant+0.046
Neg logit contributions
ide-0.035
irs-0.034
ingu-0.033
aff-0.033
of-0.031
Token.
Token ID13
Feature activation+4.79e-03

Pos logit contributions
irs+0.000
ide+0.000
angles+0.000
ushes+0.000
of+0.000
Neg logit contributions
 afraid-0.000
 scared-0.000
 worried-0.000
 oh-0.000
 careless-0.000
Token<|endoftext|>
Token ID50256
Feature activation+2.29e-02

Pos logit contributions
irs+0.008
ide+0.007
 outfits+0.007
angles+0.007
ades+0.007
Neg logit contributions
 afraid-0.009
 oh-0.009
 worried-0.009
 scared-0.009
 careless-0.008
TokenOnce
Token ID7454
Feature activation-3.53e-02

Pos logit contributions
irs+0.044
angles+0.037
 outfits+0.036
ide+0.035
uing+0.035
Neg logit contributions
 afraid-0.040
 scared-0.039
 worried-0.037
 oh-0.037
 frightened-0.034
Token upon
Token ID2402
Feature activation+2.63e-02

Pos logit contributions
 afraid+0.065
 worried+0.063
 scared+0.061
 careless+0.059
 oh+0.058
Neg logit contributions
irs-0.066
 occasions-0.056
bons-0.055
 outfits-0.054
aff-0.053
Token a
Token ID257
Feature activation-3.97e-02

Pos logit contributions
irs+0.065
ide+0.057
angles+0.055
ushes+0.055
aff+0.055
Neg logit contributions
 afraid-0.032
 oh-0.031
 worried-0.029
 scared-0.027
 careless-0.026
Token time
Token ID640
Feature activation-9.17e-02

Pos logit contributions
 oh+0.065
 afraid+0.064
 worried+0.060
 scared+0.057
 careless+0.056
Neg logit contributions
irs-0.077
ide-0.071
of-0.065
ushes-0.064
angles-0.063
Token,
Token ID11
Feature activation-7.40e-02

Pos logit contributions
 oh+0.142
 afraid+0.142
 worried+0.132
 careless+0.125
 scared+0.123
Neg logit contributions
irs-0.186
ide-0.170
of-0.154
ushes-0.149
ades-0.149
Token there
Token ID612
Feature activation+1.13e-02

Pos logit contributions
 oh+0.103
 afraid+0.103
 worried+0.094
 careless+0.089
 scared+0.087
Neg logit contributions
irs-0.154
ide-0.145
ushes-0.133
of-0.129
angles-0.129
Token was
Token ID373
Feature activation-7.05e-02

Pos logit contributions
irs+0.027
aff+0.022
ide+0.022
ushes+0.022
angles+0.022
Neg logit contributions
 oh-0.015
 afraid-0.015
 worried-0.014
 scared-0.014
 careless-0.013
Token a
Token ID257
Feature activation-2.89e-02

Pos logit contributions
 afraid+0.119
 oh+0.115
 worried+0.111
 careless+0.103
 scared+0.100
Neg logit contributions
irs-0.130
ide-0.115
 parties-0.107
 ease-0.107
ushes-0.103
Token little
Token ID1310
Feature activation-3.57e-02

Pos logit contributions
 oh+0.055
 afraid+0.053
Run+0.047
 worried+0.046
 adamant+0.046
Neg logit contributions
ide-0.035
irs-0.034
ingu-0.034
aff-0.033
 parties-0.032
Token.
Token ID13
Feature activation-7.32e-04

Pos logit contributions
irs+0.000
ide+0.000
ushes+0.000
angles+0.000
of+0.000
Neg logit contributions
 afraid-0.000
 oh-0.000
 worried-0.000
 scared-0.000
 careless-0.000
Token<|endoftext|>
Token ID50256
Feature activation+2.34e-02

Pos logit contributions
 afraid+0.001
 oh+0.001
 worried+0.001
 scared+0.001
 careless+0.001
Neg logit contributions
irs-0.001
ide-0.001
 outfits-0.001
angles-0.001
ades-0.001
TokenOnce
Token ID7454
Feature activation-3.80e-02

Pos logit contributions
irs+0.045
angles+0.038
 outfits+0.037
ide+0.036
uing+0.036
Neg logit contributions
 afraid-0.040
 scared-0.039
 worried-0.038
 oh-0.037
 frightened-0.034
Token upon
Token ID2402
Feature activation+2.25e-02

Pos logit contributions
 afraid+0.070
 worried+0.067
 scared+0.065
 careless+0.062
 oh+0.062
Neg logit contributions
irs-0.072
 occasions-0.062
bons-0.061
 outfits-0.060
aff-0.059
Token a
Token ID257
Feature activation-4.09e-02

Pos logit contributions
irs+0.057
ide+0.050
angles+0.048
aff+0.048
ushes+0.048
Neg logit contributions
 afraid-0.027
 oh-0.026
 worried-0.024
 scared-0.023
 careless-0.022
Token time
Token ID640
Feature activation-9.16e-02

Pos logit contributions
 oh+0.065
 afraid+0.065
 worried+0.060
 scared+0.058
 careless+0.056
Neg logit contributions
irs-0.082
ide-0.074
of-0.068
ushes-0.067
angles-0.066
Token,
Token ID11
Feature activation-7.34e-02

Pos logit contributions
 oh+0.138
 afraid+0.138
 worried+0.128
 careless+0.121
 scared+0.120
Neg logit contributions
irs-0.192
ide-0.174
of-0.158
ushes-0.155
ades-0.154
Token there
Token ID612
Feature activation+1.08e-02

Pos logit contributions
 oh+0.102
 afraid+0.101
 worried+0.093
 careless+0.088
 scared+0.086
Neg logit contributions
irs-0.155
ide-0.145
ushes-0.133
angles-0.130
of-0.130
Token was
Token ID373
Feature activation-7.25e-02

Pos logit contributions
irs+0.027
aff+0.022
ushes+0.022
angles+0.022
ide+0.022
Neg logit contributions
 afraid-0.014
 oh-0.013
 worried-0.013
 scared-0.013
 careless-0.011
Token a
Token ID257
Feature activation-2.92e-02

Pos logit contributions
 afraid+0.120
 oh+0.116
 worried+0.112
 careless+0.104
 scared+0.102
Neg logit contributions
irs-0.139
ide-0.123
 parties-0.115
 ease-0.113
ushes-0.110
Token little
Token ID1310
Feature activation-3.84e-02

Pos logit contributions
 oh+0.055
 afraid+0.053
 worried+0.047
Run+0.047
 adamant+0.046
Neg logit contributions
ide-0.038
irs-0.037
aff-0.035
ingu-0.035
 parties-0.034
Token.
Token ID13
Feature activation+4.62e-03

Pos logit contributions
irs+0.033
ide+0.026
ushes+0.025
angles+0.025
aff+0.025
Neg logit contributions
 afraid-0.034
 oh-0.034
 worried-0.033
 scared-0.032
 careless-0.031
Token<|endoftext|>
Token ID50256
Feature activation+2.51e-02

Pos logit contributions
irs+0.009
ide+0.008
 outfits+0.007
angles+0.007
ades+0.007
Neg logit contributions
 afraid-0.008
 oh-0.008
 worried-0.007
 scared-0.007
 careless-0.007
TokenOnce
Token ID7454
Feature activation-3.74e-02

Pos logit contributions
irs+0.048
angles+0.041
 outfits+0.039
ide+0.039
uing+0.038
Neg logit contributions
 afraid-0.044
 scared-0.042
 worried-0.041
 oh-0.040
 frightened-0.037
Token upon
Token ID2402
Feature activation+2.30e-02

Pos logit contributions
 afraid+0.069
 worried+0.067
 scared+0.064
 careless+0.062
 oh+0.062
Neg logit contributions
irs-0.070
 occasions-0.060
bons-0.059
 outfits-0.058
aff-0.057
Token a
Token ID257
Feature activation-4.09e-02

Pos logit contributions
irs+0.055
ide+0.049
ushes+0.047
angles+0.046
aff+0.046
Neg logit contributions
 afraid-0.029
 oh-0.028
 worried-0.026
 scared-0.025
 careless-0.024
Token time
Token ID640
Feature activation-9.16e-02

Pos logit contributions
 oh+0.065
 afraid+0.065
 worried+0.060
 scared+0.058
 careless+0.056
Neg logit contributions
irs-0.082
ide-0.074
of-0.068
ushes-0.068
angles-0.067
Token,
Token ID11
Feature activation-7.48e-02

Pos logit contributions
 afraid+0.141
 oh+0.141
 worried+0.131
 careless+0.124
 scared+0.123
Neg logit contributions
irs-0.188
ide-0.171
of-0.155
ushes-0.150
ades-0.149
Token there
Token ID612
Feature activation+1.05e-02

Pos logit contributions
 oh+0.103
 afraid+0.102
 worried+0.094
 careless+0.089
 scared+0.087
Neg logit contributions
irs-0.156
ide-0.148
ushes-0.134
angles-0.131
of-0.131
Token was
Token ID373
Feature activation-7.14e-02

Pos logit contributions
irs+0.025
aff+0.021
ushes+0.021
ide+0.021
angles+0.021
Neg logit contributions
 oh-0.014
 afraid-0.014
 worried-0.013
 scared-0.013
 careless-0.012
Token a
Token ID257
Feature activation-2.99e-02

Pos logit contributions
 afraid+0.118
 oh+0.114
 worried+0.111
 careless+0.102
 scared+0.100
Neg logit contributions
irs-0.136
ide-0.121
 parties-0.113
 ease-0.113
ushes-0.107
Token little
Token ID1310
Feature activation-3.78e-02

Pos logit contributions
 oh+0.057
 afraid+0.055
Run+0.049
 worried+0.049
 adamant+0.048
Neg logit contributions
ide-0.036
irs-0.036
ingu-0.035
aff-0.034
 parties-0.033
Token.
Token ID13
Feature activation+3.58e-03

Pos logit contributions
 afraid+0.021
 oh+0.020
 worried+0.019
 scared+0.019
 careless+0.019
Neg logit contributions
irs-0.017
ide-0.013
ushes-0.013
angles-0.013
aff-0.012
Token<|endoftext|>
Token ID50256
Feature activation+2.38e-02

Pos logit contributions
irs+0.007
ide+0.006
ades+0.005
angles+0.005
aff+0.005
Neg logit contributions
 afraid-0.007
 oh-0.006
 worried-0.006
 scared-0.006
 careless-0.006
TokenOnce
Token ID7454
Feature activation-3.80e-02

Pos logit contributions
irs+0.046
angles+0.039
 outfits+0.037
ide+0.037
uing+0.036
Neg logit contributions
 afraid-0.042
 scared-0.040
 worried-0.039
 oh-0.038
 careless-0.035
Token upon
Token ID2402
Feature activation+2.53e-02

Pos logit contributions
 afraid+0.069
 worried+0.067
 scared+0.064
 careless+0.063
 oh+0.061
Neg logit contributions
irs-0.072
 occasions-0.062
bons-0.062
 outfits-0.060
aff-0.059
Token a
Token ID257
Feature activation-4.08e-02

Pos logit contributions
irs+0.064
ide+0.056
angles+0.054
aff+0.054
ushes+0.054
Neg logit contributions
 afraid-0.030
 oh-0.029
 worried-0.027
 scared-0.026
 careless-0.024
Token time
Token ID640
Feature activation-9.16e-02

Pos logit contributions
 afraid+0.059
 oh+0.058
 worried+0.054
 scared+0.052
 careless+0.051
Neg logit contributions
irs-0.090
ide-0.078
ushes-0.074
angles-0.073
aff-0.073
Token,
Token ID11
Feature activation-7.36e-02

Pos logit contributions
 afraid+0.130
 oh+0.128
 worried+0.119
 scared+0.113
 careless+0.111
Neg logit contributions
irs-0.203
ide-0.181
of-0.168
ushes-0.167
angles-0.166
Token there
Token ID612
Feature activation+9.56e-03

Pos logit contributions
 afraid+0.095
 oh+0.094
 worried+0.086
 scared+0.080
 careless+0.080
Neg logit contributions
irs-0.168
ide-0.154
ushes-0.142
of-0.141
angles-0.140
Token was
Token ID373
Feature activation-7.21e-02

Pos logit contributions
irs+0.024
aff+0.020
ushes+0.020
ide+0.020
angles+0.020
Neg logit contributions
 oh-0.012
 afraid-0.012
 worried-0.011
 scared-0.011
 careless-0.010
Token a
Token ID257
Feature activation-2.94e-02

Pos logit contributions
 afraid+0.115
 oh+0.111
 worried+0.107
 careless+0.098
 scared+0.098
Neg logit contributions
irs-0.147
ide-0.129
 parties-0.121
 ease-0.119
ushes-0.117
Token little
Token ID1310
Feature activation-3.84e-02

Pos logit contributions
 oh+0.056
 afraid+0.055
 worried+0.049
Run+0.047
 scared+0.047
Neg logit contributions
irs-0.038
ide-0.037
aff-0.035
ingu-0.035
of-0.034
Token.
Token ID13
Feature activation+9.06e-04

Pos logit contributions
irs+0.016
ide+0.013
of+0.012
ushes+0.012
angles+0.011
Neg logit contributions
 oh-0.024
 afraid-0.024
 worried-0.022
 scared-0.022
 careless-0.022
Token<|endoftext|>
Token ID50256
Feature activation+1.89e-02

Pos logit contributions
irs+0.002
ide+0.001
ades+0.001
angles+0.001
 outfits+0.001
Neg logit contributions
 afraid-0.002
 oh-0.002
 worried-0.002
 scared-0.002
 careless-0.001
TokenOnce
Token ID7454
Feature activation-3.57e-02

Pos logit contributions
irs+0.036
angles+0.030
 outfits+0.030
ide+0.029
uing+0.028
Neg logit contributions
 afraid-0.034
 scared-0.032
 worried-0.031
 oh-0.031
 careless-0.028
Token upon
Token ID2402
Feature activation+2.72e-02

Pos logit contributions
 afraid+0.065
 worried+0.063
 scared+0.060
 careless+0.059
 foolish+0.057
Neg logit contributions
irs-0.068
 occasions-0.058
bons-0.058
 outfits-0.057
aff-0.055
Token a
Token ID257
Feature activation-4.33e-02

Pos logit contributions
irs+0.067
ide+0.059
angles+0.057
ushes+0.056
aff+0.056
Neg logit contributions
 afraid-0.033
 oh-0.032
 worried-0.030
 scared-0.028
 careless-0.027
Token time
Token ID640
Feature activation-9.16e-02

Pos logit contributions
 oh+0.067
 afraid+0.067
 worried+0.062
 scared+0.060
 careless+0.058
Neg logit contributions
irs-0.089
ide-0.080
of-0.073
ushes-0.073
angles-0.073
Token,
Token ID11
Feature activation-7.41e-02

Pos logit contributions
 afraid+0.133
 oh+0.133
 worried+0.123
 scared+0.116
 careless+0.116
Neg logit contributions
irs-0.197
ide-0.178
of-0.163
ushes-0.161
angles-0.160
Token there
Token ID612
Feature activation+1.06e-02

Pos logit contributions
 oh+0.103
 afraid+0.102
 worried+0.094
 careless+0.088
 scared+0.087
Neg logit contributions
irs-0.155
ide-0.145
ushes-0.134
angles-0.131
of-0.129
Token was
Token ID373
Feature activation-7.28e-02

Pos logit contributions
irs+0.026
aff+0.022
ushes+0.022
ide+0.021
angles+0.021
Neg logit contributions
 oh-0.013
 afraid-0.013
 worried-0.012
 scared-0.012
 careless-0.011
Token a
Token ID257
Feature activation-3.00e-02

Pos logit contributions
 afraid+0.122
 oh+0.120
 worried+0.115
 careless+0.106
 scared+0.104
Neg logit contributions
irs-0.136
ide-0.118
 parties-0.112
 ease-0.111
ushes-0.107
Token little
Token ID1310
Feature activation-3.85e-02

Pos logit contributions
 oh+0.055
 afraid+0.054
 worried+0.048
Run+0.047
 scared+0.046
Neg logit contributions
irs-0.041
ide-0.041
aff-0.038
ingu-0.037
 parties-0.036
Token fix
Token ID4259
Feature activation+3.23e-02

Pos logit contributions
 afraid+0.023
 scared+0.023
 worried+0.022
 oh+0.021
 foolish+0.020
Neg logit contributions
irs-0.028
aff-0.025
angles-0.024
ushes-0.024
of-0.023
Token it
Token ID340
Feature activation+1.22e-04

Pos logit contributions
irs+0.063
ide+0.055
ushes+0.051
aff+0.051
of+0.051
Neg logit contributions
 oh-0.055
 afraid-0.055
 worried-0.052
 scared-0.051
 careless-0.048
Token.
Token ID13
Feature activation+1.32e-02

Pos logit contributions
irs+0.000
ide+0.000
aff+0.000
ushes+0.000
angles+0.000
Neg logit contributions
 afraid-0.000
 oh-0.000
 worried-0.000
 scared-0.000
 careless-0.000
Token
Token ID220
Feature activation+2.37e-02

Pos logit contributions
irs+0.029
angles+0.024
ades+0.024
 outfits+0.024
ide+0.023
Neg logit contributions
 afraid-0.019
 oh-0.019
 scared-0.019
 worried-0.019
 frightened-0.016
Token\n
Token ID198
Feature activation+6.49e-02

Pos logit contributions
irs+0.049
 outfits+0.047
 occasions+0.047
 journals+0.046
 parties+0.045
Neg logit contributions
 afraid-0.033
 scared-0.030
 worried-0.029
 oh-0.028
 frightened-0.025
Token\n
Token ID198
Feature activation+5.23e-02

Pos logit contributions
irs+0.128
angles+0.110
 journals+0.110
 outfits+0.108
lights+0.107
Neg logit contributions
 afraid-0.105
 oh-0.099
 scared-0.098
 worried-0.098
 careless-0.089
TokenThe
Token ID464
Feature activation+9.34e-03

Pos logit contributions
irs+0.094
angles+0.074
 parties+0.072
ushes+0.072
 outfits+0.071
Neg logit contributions
 oh-0.103
 afraid-0.102
 worried-0.097
 careless-0.097
 scared-0.096
Token fairy
Token ID25607
Feature activation+5.70e-03

Pos logit contributions
irs+0.014
ide+0.012
 parties+0.011
aff+0.011
 outfits+0.011
Neg logit contributions
 oh-0.019
 afraid-0.019
 worried-0.017
 scared-0.017
 careless-0.016
Token waved
Token ID26834
Feature activation+1.20e-02

Pos logit contributions
irs+0.011
ide+0.010
of+0.009
angles+0.009
aff+0.009
Neg logit contributions
 oh-0.009
 afraid-0.009
 worried-0.008
 careless-0.008
 scared-0.008
Token her
Token ID607
Feature activation+8.18e-03

Pos logit contributions
irs+0.021
ide+0.018
ushes+0.017
of+0.017
aff+0.017
Neg logit contributions
 afraid-0.022
 oh-0.022
 worried-0.021
 scared-0.020
 careless-0.020
Token wand
Token ID11569
Feature activation+1.51e-02

Pos logit contributions
irs+0.016
ide+0.013
of+0.012
angles+0.012
ushes+0.012
Neg logit contributions
 afraid-0.015
 oh-0.015
 worried-0.014
 scared-0.014
 careless-0.013
Token's
Token ID338
Feature activation+9.62e-04

Pos logit contributions
 afraid+0.058
 oh+0.058
 worried+0.053
 scared+0.051
 careless+0.051
Neg logit contributions
irs-0.063
ide-0.056
of-0.051
ushes-0.049
aff-0.049
Token home
Token ID1363
Feature activation-3.19e-03

Pos logit contributions
irs+0.002
ide+0.002
ushes+0.002
 outfits+0.002
 ease+0.002
Neg logit contributions
 oh-0.002
 afraid-0.001
 worried-0.001
 careless-0.001
 scared-0.001
Token.
Token ID13
Feature activation+6.78e-03

Pos logit contributions
 afraid+0.007
 oh+0.007
 worried+0.007
 careless+0.006
 scared+0.006
Neg logit contributions
irs-0.004
ide-0.004
of-0.003
ushes-0.003
ades-0.003
Token
Token ID220
Feature activation+2.17e-02

Pos logit contributions
irs+0.014
ide+0.012
angles+0.011
ushes+0.011
 outfits+0.011
Neg logit contributions
 afraid-0.011
 oh-0.011
 worried-0.011
 scared-0.010
 careless-0.010
Token\n
Token ID198
Feature activation+6.70e-02

Pos logit contributions
irs+0.042
 outfits+0.039
 occasions+0.039
 journals+0.039
 parties+0.038
Neg logit contributions
 afraid-0.035
 oh-0.033
 scared-0.033
 worried-0.032
 careless-0.029
Token\n
Token ID198
Feature activation+5.58e-02

Pos logit contributions
irs+0.130
 journals+0.116
angles+0.113
 outfits+0.112
lights+0.108
Neg logit contributions
 afraid-0.109
 oh-0.104
 scared-0.103
 worried-0.101
 careless-0.094
TokenBut
Token ID1537
Feature activation-3.31e-02

Pos logit contributions
irs+0.100
ide+0.081
ushes+0.079
angles+0.079
 outfits+0.078
Neg logit contributions
 oh-0.107
 afraid-0.105
 worried-0.099
 careless-0.097
 scared-0.097
Token when
Token ID618
Feature activation-5.95e-02

Pos logit contributions
 afraid+0.064
 worried+0.056
 seeker+0.056
 oh+0.055
Run+0.054
Neg logit contributions
irs-0.042
 ease-0.038
 them-0.037
 rebuild-0.037
ide-0.036
Token they
Token ID484
Feature activation-1.11e-02

Pos logit contributions
 afraid+0.125
 oh+0.120
 worried+0.112
 careless+0.109
 scared+0.105
Neg logit contributions
irs-0.084
 ease-0.069
ide-0.066
Everyone-0.064
 Both-0.063
Token got
Token ID1392
Feature activation-4.59e-02

Pos logit contributions
 afraid+0.015
 oh+0.015
 worried+0.015
 scared+0.014
 careless+0.013
Neg logit contributions
irs-0.025
ide-0.022
ushes-0.022
angles-0.022
ades-0.021
Token there
Token ID612
Feature activation+1.86e-02

Pos logit contributions
 oh+0.083
 afraid+0.079
 worried+0.069
 careless+0.068
Run+0.067
Neg logit contributions
irs-0.078
ide-0.066
 ease-0.063
 parties-0.062
of-0.062
Token winner
Token ID8464
Feature activation-1.56e-02

Pos logit contributions
irs+0.020
ide+0.017
ades+0.017
of+0.016
 outfits+0.016
Neg logit contributions
 oh-0.028
 afraid-0.027
 worried-0.025
 scared-0.024
 careless-0.024
Token.
Token ID13
Feature activation+6.10e-03

Pos logit contributions
 oh+0.029
 afraid+0.029
 worried+0.027
 scared+0.026
 careless+0.026
Neg logit contributions
irs-0.028
ide-0.023
ushes-0.022
of-0.021
angles-0.021
Token The
Token ID383
Feature activation+1.97e-02

Pos logit contributions
irs+0.012
ide+0.011
ushes+0.010
angles+0.010
aff+0.010
Neg logit contributions
 afraid-0.010
 oh-0.010
 worried-0.009
 scared-0.009
 careless-0.009
Token giant
Token ID6175
Feature activation+1.34e-03

Pos logit contributions
irs+0.029
ide+0.026
ades+0.025
 parties+0.025
 journals+0.024
Neg logit contributions
 oh-0.038
 afraid-0.038
 worried-0.034
 scared-0.033
 careless-0.032
Token looked
Token ID3114
Feature activation-1.24e-02

Pos logit contributions
irs+0.003
ide+0.002
of+0.002
angles+0.002
aff+0.002
Neg logit contributions
 oh-0.002
 afraid-0.002
 worried-0.002
 careless-0.002
 scared-0.002
Token at
Token ID379
Feature activation+4.65e-02

Pos logit contributions
 oh+0.019
 afraid+0.018
Run+0.016
 worried+0.015
 careless+0.015
Neg logit contributions
irs-0.024
ushes-0.021
angles-0.020
 ease-0.020
 parties-0.020
Token Tom
Token ID4186
Feature activation-1.31e-03

Pos logit contributions
irs+0.097
ide+0.085
ushes+0.079
aff+0.078
of+0.078
Neg logit contributions
 afraid-0.073
 oh-0.073
 worried-0.068
 scared-0.065
 careless-0.064
Token's
Token ID338
Feature activation-2.25e-03

Pos logit contributions
 afraid+0.002
 oh+0.002
 worried+0.002
 scared+0.002
 careless+0.002
Neg logit contributions
irs-0.003
ide-0.003
ushes-0.002
angles-0.002
aff-0.002
Token sand
Token ID6450
Feature activation+1.16e-02

Pos logit contributions
 oh+0.005
 afraid+0.005
 worried+0.004
 scared+0.004
 careless+0.004
Neg logit contributions
irs-0.004
ide-0.003
ushes-0.003
angles-0.003
 outfits-0.003
Token castle
Token ID16669
Feature activation+2.22e-02

Pos logit contributions
irs+0.018
ide+0.014
ushes+0.013
of+0.013
aff+0.013
Neg logit contributions
 afraid-0.025
 oh-0.024
 worried-0.023
 scared-0.023
 careless-0.022
Token and
Token ID290
Feature activation+8.70e-03

Pos logit contributions
irs+0.043
ide+0.038
ushes+0.035
angles+0.035
of+0.035
Neg logit contributions
 afraid-0.037
 oh-0.037
 worried-0.034
 scared-0.033
 careless-0.033
Token to
Token ID284
Feature activation+3.02e-02

Pos logit contributions
 oh+0.004
 afraid+0.004
 worried+0.004
 scared+0.004
 careless+0.004
Neg logit contributions
irs-0.005
ide-0.004
of-0.004
ushes-0.004
angles-0.004
Token find
Token ID1064
Feature activation+2.94e-02

Pos logit contributions
irs+0.061
ushes+0.053
oons+0.052
ide+0.051
angles+0.051
Neg logit contributions
 afraid-0.053
 oh-0.052
 scared-0.050
 worried-0.049
 careless-0.046
Token its
Token ID663
Feature activation+3.35e-02

Pos logit contributions
irs+0.064
ide+0.057
angles+0.053
ushes+0.053
aff+0.053
Neg logit contributions
 oh-0.044
 afraid-0.044
 worried-0.040
 scared-0.039
 careless-0.037
Token prey
Token ID15974
Feature activation-8.04e-03

Pos logit contributions
irs+0.063
ide+0.057
ushes+0.053
 ease+0.052
 outfits+0.052
Neg logit contributions
 oh-0.055
 afraid-0.054
 worried-0.050
 careless-0.047
 scared-0.047
Token.
Token ID13
Feature activation+2.92e-03

Pos logit contributions
 afraid+0.016
 oh+0.016
 worried+0.015
 scared+0.014
 careless+0.014
Neg logit contributions
irs-0.013
ide-0.011
of-0.010
ushes-0.010
angles-0.010
Token\n
Token ID198
Feature activation+6.33e-02

Pos logit contributions
irs+0.005
ide+0.004
angles+0.004
ushes+0.004
ades+0.004
Neg logit contributions
 afraid-0.006
 oh-0.006
 worried-0.006
 scared-0.005
 careless-0.005
Token\n
Token ID198
Feature activation+5.18e-02

Pos logit contributions
irs+0.105
angles+0.088
ide+0.085
 outfits+0.085
 parties+0.084
Neg logit contributions
 afraid-0.126
 oh-0.122
 worried-0.118
 scared-0.117
 careless-0.112
TokenThe
Token ID464
Feature activation+1.31e-02

Pos logit contributions
irs+0.071
ide+0.056
angles+0.052
ushes+0.051
aff+0.050
Neg logit contributions
 afraid-0.120
 oh-0.119
 worried-0.114
 scared-0.111
 careless-0.110
Token bunny
Token ID44915
Feature activation-2.01e-03

Pos logit contributions
irs+0.015
ide+0.014
 halves+0.014
Everyone+0.014
 ease+0.014
Neg logit contributions
 oh-0.027
 afraid-0.026
 worried-0.023
 gonna-0.022
 seeker-0.022
Token was
Token ID373
Feature activation-4.10e-02

Pos logit contributions
 oh+0.003
 afraid+0.003
 worried+0.003
 careless+0.003
 foolish+0.003
Neg logit contributions
irs-0.004
ide-0.003
of-0.003
aff-0.003
r-0.003
Token frightened
Token ID24776
Feature activation-3.34e-03

Pos logit contributions
 oh+0.049
Run+0.039
 seeker+0.035
 afraid+0.035
www+0.033
Neg logit contributions
irs-0.086
ide-0.083
 ease-0.080
ushes-0.079
r-0.075
Token had
Token ID550
Feature activation-2.67e-03

Pos logit contributions
irs+0.031
ide+0.027
ushes+0.025
angles+0.025
of+0.025
Neg logit contributions
 afraid-0.023
 oh-0.022
 worried-0.021
 scared-0.020
 careless-0.020
Token two
Token ID734
Feature activation+2.83e-02

Pos logit contributions
 afraid+0.004
 oh+0.003
 worried+0.003
 scared+0.003
 careless+0.003
Neg logit contributions
irs-0.006
ide-0.006
angles-0.005
ushes-0.005
aff-0.005
Token steps
Token ID4831
Feature activation-3.64e-02

Pos logit contributions
irs+0.054
ide+0.047
angles+0.043
of+0.042
ushes+0.042
Neg logit contributions
 afraid-0.050
 oh-0.049
 worried-0.047
 scared-0.046
 careless-0.045
Token.
Token ID13
Feature activation+7.74e-03

Pos logit contributions
 afraid+0.063
 oh+0.063
 worried+0.058
 scared+0.056
 careless+0.056
Neg logit contributions
irs-0.068
ide-0.059
of-0.055
ushes-0.055
angles-0.054
Token\n
Token ID198
Feature activation+6.29e-02

Pos logit contributions
irs+0.012
ide+0.010
angles+0.010
 outfits+0.010
ushes+0.009
Neg logit contributions
 afraid-0.016
 oh-0.016
 worried-0.015
 scared-0.015
 careless-0.014
Token\n
Token ID198
Feature activation+5.11e-02

Pos logit contributions
irs+0.108
 outfits+0.089
angles+0.089
ide+0.088
 journals+0.087
Neg logit contributions
 afraid-0.119
 oh-0.114
 scared-0.112
 worried-0.112
 careless-0.105
TokenHis
Token ID6653
Feature activation+1.11e-02

Pos logit contributions
irs+0.082
ide+0.064
ushes+0.063
angles+0.063
 outfits+0.063
Neg logit contributions
 oh-0.110
 afraid-0.108
 worried-0.102
 careless-0.101
 scared-0.100
Token father
Token ID2988
Feature activation-5.14e-03

Pos logit contributions
irs+0.016
 outfits+0.014
 journals+0.014
ingu+0.013
 collections+0.013
Neg logit contributions
 oh-0.022
 afraid-0.021
Run-0.020
 scared-0.019
 worried-0.019
Token put
Token ID1234
Feature activation+2.57e-02

Pos logit contributions
 afraid+0.009
 oh+0.009
 worried+0.008
 seeker+0.008
 careless+0.008
Neg logit contributions
irs-0.009
ide-0.008
of-0.008
r-0.008
aff-0.007
Token the
Token ID262
Feature activation+2.27e-02

Pos logit contributions
irs+0.045
ide+0.037
ushes+0.035
angles+0.035
of+0.034
Neg logit contributions
 afraid-0.049
 oh-0.048
 worried-0.046
 scared-0.044
 careless-0.044
Token steps
Token ID4831
Feature activation-4.01e-02

Pos logit contributions
irs+0.034
 outfits+0.031
ades+0.030
ushes+0.029
angles+0.028
Neg logit contributions
 afraid-0.043
 oh-0.041
 worried-0.037
 scared-0.036
 determined-0.035
Token and
Token ID290
Feature activation-5.79e-03

Pos logit contributions
 oh+0.032
 afraid+0.031
 worried+0.029
 careless+0.027
 scared+0.027
Neg logit contributions
irs-0.037
ide-0.034
ushes-0.031
of-0.030
 parties-0.030
Token approached
Token ID10448
Feature activation+2.16e-02

Pos logit contributions
 afraid+0.009
 oh+0.009
 worried+0.008
 careless+0.008
 foolish+0.008
Neg logit contributions
irs-0.010
ide-0.009
of-0.009
 ease-0.009
 parties-0.008
Token the
Token ID262
Feature activation+1.71e-02

Pos logit contributions
irs+0.034
ide+0.028
of+0.025
ushes+0.025
angles+0.025
Neg logit contributions
 afraid-0.045
 oh-0.044
 worried-0.042
 scared-0.041
 careless-0.041
Token lion
Token ID18744
Feature activation-2.30e-03

Pos logit contributions
irs+0.029
ide+0.026
aff+0.024
of+0.024
ushes+0.024
Neg logit contributions
 oh-0.030
 afraid-0.029
 worried-0.026
 scared-0.025
 careless-0.024
Token.
Token ID13
Feature activation+7.55e-03

Pos logit contributions
 afraid+0.004
 oh+0.004
 worried+0.004
 scared+0.004
 careless+0.004
Neg logit contributions
irs-0.004
ide-0.003
angles-0.003
ushes-0.003
aff-0.003
Token He
Token ID679
Feature activation+2.40e-02

Pos logit contributions
irs+0.012
ide+0.010
angles+0.009
ushes+0.009
of+0.009
Neg logit contributions
 afraid-0.016
 oh-0.016
 worried-0.015
 scared-0.015
 careless-0.014
Token saw
Token ID2497
Feature activation+3.34e-02

Pos logit contributions
irs+0.047
ide+0.042
of+0.040
ushes+0.039
angles+0.038
Neg logit contributions
 oh-0.038
 afraid-0.037
 worried-0.034
 careless-0.032
 scared-0.031
Token his
Token ID465
Feature activation+1.66e-03

Pos logit contributions
irs+0.060
ide+0.051
ushes+0.047
of+0.047
angles+0.047
Neg logit contributions
 afraid-0.060
 oh-0.060
 worried-0.057
 scared-0.054
 careless-0.054
Token big
Token ID1263
Feature activation+1.38e-02

Pos logit contributions
irs+0.003
ide+0.002
ushes+0.002
 outfits+0.002
 ease+0.002
Neg logit contributions
 oh-0.003
 afraid-0.003
 worried-0.003
 careless-0.003
Run-0.003
Token man
Token ID582
Feature activation-3.18e-02

Pos logit contributions
irs+0.018
ide+0.016
aff+0.016
 outfits+0.016
 ease+0.015
Neg logit contributions
 oh-0.028
 afraid-0.026
Run-0.024
 worried-0.024
 careless-0.023
Tokene
Token ID68
Feature activation+1.42e-02

Pos logit contributions
 afraid+0.048
 oh+0.048
 worried+0.044
 scared+0.042
 careless+0.042
Neg logit contributions
irs-0.067
ide-0.059
ushes-0.055
of-0.055
angles-0.054
Token"
Token ID1
Feature activation-1.40e-02

Pos logit contributions
irs+0.111
ide+0.096
ushes+0.090
angles+0.088
aff+0.087
Neg logit contributions
 oh-0.116
 afraid-0.114
 worried-0.109
 careless-0.104
 scared-0.104
TokenYes
Token ID5297
Feature activation-3.47e-02

Pos logit contributions
 oh+0.024
 afraid+0.023
 worried+0.022
 scared+0.021
 careless+0.021
Neg logit contributions
irs-0.028
ide-0.024
angles-0.022
ushes-0.022
aff-0.022
Token please
Token ID3387
Feature activation-1.06e-02

Pos logit contributions
 afraid+0.057
 oh+0.054
 worried+0.052
 scared+0.049
 careless+0.049
Neg logit contributions
irs-0.070
ide-0.063
of-0.059
aff-0.058
angles-0.057
Token!"
Token ID2474
Feature activation+2.86e-02

Pos logit contributions
 oh+0.018
 afraid+0.018
 worried+0.017
 careless+0.016
 scared+0.016
Neg logit contributions
irs-0.020
ide-0.017
of-0.016
aff-0.016
ushes-0.016
Token Johnny
Token ID15470
Feature activation-2.09e-02

Pos logit contributions
irs+0.063
ide+0.055
ushes+0.052
of+0.052
angles+0.052
Neg logit contributions
 afraid-0.041
 oh-0.041
 worried-0.037
 scared-0.036
 careless-0.035
Token said
Token ID531
Feature activation+5.97e-03

Pos logit contributions
 afraid+0.029
 oh+0.029
 worried+0.026
 scared+0.025
 careless+0.025
Neg logit contributions
irs-0.047
ide-0.042
of-0.040
aff-0.039
ushes-0.039
Token with
Token ID351
Feature activation+1.11e-02

Pos logit contributions
irs+0.008
ide+0.007
of+0.006
r+0.006
ushes+0.006
Neg logit contributions
 oh-0.013
 afraid-0.013
 worried-0.012
 careless-0.012
 foolish-0.012
Token a
Token ID257
Feature activation+1.15e-02

Pos logit contributions
irs+0.030
ide+0.028
aff+0.026
angles+0.026
oons+0.026
Neg logit contributions
 afraid-0.012
 oh-0.011
 worried-0.011
 scared-0.011
 careless-0.009
Token smile
Token ID8212
Feature activation+2.67e-02

Pos logit contributions
irs+0.013
ide+0.010
ushes+0.010
of+0.010
ades+0.009
Neg logit contributions
 oh-0.028
 afraid-0.027
 worried-0.025
 careless-0.025
 scared-0.024
Token,
Token ID11
Feature activation-1.84e-02

Pos logit contributions
irs+0.036
ide+0.032
of+0.028
aff+0.028
r+0.028
Neg logit contributions
 afraid-0.058
 oh-0.058
 worried-0.054
 careless-0.053
 foolish-0.052
Token so
Token ID523
Feature activation-2.18e-02

Pos logit contributions
 oh+0.030
 afraid+0.029
 worried+0.027
 careless+0.026
 scared+0.025
Neg logit contributions
irs-0.034
ide-0.030
of-0.028
ushes-0.028
aff-0.028
Token butterfly
Token ID35113
Feature activation-3.76e-03

Pos logit contributions
irs+0.036
ide+0.030
of+0.030
 outfits+0.028
 parties+0.028
Neg logit contributions
 afraid-0.044
 oh-0.044
 worried-0.041
 careless-0.038
 scared-0.038
Token and
Token ID290
Feature activation+4.19e-03

Pos logit contributions
 oh+0.006
 afraid+0.006
 worried+0.005
 careless+0.005
 scared+0.005
Neg logit contributions
irs-0.008
ide-0.007
of-0.006
ushes-0.006
 parties-0.006
Token show
Token ID905
Feature activation+2.91e-02

Pos logit contributions
irs+0.007
ide+0.006
ushes+0.005
angles+0.005
aff+0.005
Neg logit contributions
 afraid-0.008
 oh-0.008
 worried-0.008
 scared-0.008
 careless-0.008
Token it
Token ID340
Feature activation+9.87e-03

Pos logit contributions
irs+0.068
ide+0.062
ushes+0.059
ades+0.058
aff+0.058
Neg logit contributions
 oh-0.039
 afraid-0.038
 worried-0.035
 scared-0.035
 careless-0.032
Token to
Token ID284
Feature activation+4.96e-02

Pos logit contributions
irs+0.026
ide+0.023
aff+0.023
ingu+0.023
ades+0.023
Neg logit contributions
 afraid-0.012
 scared-0.011
 oh-0.010
 worried-0.010
 foolish-0.009
Token his
Token ID465
Feature activation+5.82e-03

Pos logit contributions
irs+0.098
ushes+0.090
aff+0.089
ades+0.086
aters+0.086
Neg logit contributions
 oh-0.092
 afraid-0.087
 scared-0.084
 worried-0.081
 foolish-0.079
Token mom
Token ID1995
Feature activation-8.82e-03

Pos logit contributions
irs+0.010
ide+0.009
ushes+0.008
angles+0.008
of+0.008
Neg logit contributions
 afraid-0.011
 oh-0.011
 worried-0.010
 scared-0.010
 careless-0.010
Token.
Token ID13
Feature activation+6.98e-04

Pos logit contributions
 afraid+0.014
 worried+0.013
 scared+0.013
 oh+0.013
 careless+0.011
Neg logit contributions
irs-0.019
ushes-0.016
ades-0.016
angles-0.016
ands-0.015
Token Benny
Token ID44275
Feature activation-8.24e-03

Pos logit contributions
irs+0.001
ide+0.001
ushes+0.001
angles+0.001
of+0.001
Neg logit contributions
 oh-0.001
 afraid-0.001
 worried-0.001
 scared-0.001
 careless-0.001
Token ran
Token ID4966
Feature activation-2.72e-02

Pos logit contributions
 oh+0.013
 afraid+0.013
 worried+0.012
 careless+0.011
 scared+0.011
Neg logit contributions
irs-0.015
ide-0.014
of-0.013
 outfits-0.013
aff-0.013
Token after
Token ID706
Feature activation-4.72e-03

Pos logit contributions
 oh+0.053
 afraid+0.051
 worried+0.047
 foolish+0.045
 careless+0.044
Neg logit contributions
irs-0.043
ide-0.039
of-0.036
 parties-0.035
aff-0.034
Token.
Token ID13
Feature activation-7.08e-03

Pos logit contributions
irs+0.055
ide+0.045
ushes+0.043
aff+0.043
angles+0.042
Neg logit contributions
 afraid-0.047
 oh-0.047
 worried-0.044
 scared-0.043
 careless-0.042
Token They
Token ID1119
Feature activation+3.13e-02

Pos logit contributions
 afraid+0.011
 oh+0.011
 worried+0.011
 scared+0.010
 careless+0.010
Neg logit contributions
irs-0.014
ide-0.012
ushes-0.012
of-0.011
aff-0.011
Token're
Token ID821
Feature activation-1.47e-02

Pos logit contributions
irs+0.053
ide+0.045
ushes+0.042
aff+0.042
ades+0.040
Neg logit contributions
 afraid-0.062
 oh-0.060
 worried-0.059
 scared-0.058
 careless-0.056
Token so
Token ID523
Feature activation-4.85e-02

Pos logit contributions
 oh+0.019
 afraid+0.019
 worried+0.017
 careless+0.016
 scared+0.016
Neg logit contributions
irs-0.033
ide-0.029
ushes-0.028
 ease-0.028
 parties-0.028
Token delicious
Token ID12625
Feature activation+1.18e-03

Pos logit contributions
 afraid+0.072
 oh+0.071
 worried+0.066
 scared+0.063
 careless+0.062
Neg logit contributions
irs-0.106
ide-0.093
ushes-0.087
angles-0.087
of-0.086
Token!"
Token ID2474
Feature activation+2.59e-02

Pos logit contributions
irs+0.002
ide+0.002
 parties+0.001
aff+0.001
oons+0.001
Neg logit contributions
 afraid-0.002
 worried-0.002
 oh-0.002
 careless-0.002
 scared-0.002
Token But
Token ID887
Feature activation-2.62e-02

Pos logit contributions
irs+0.049
angles+0.042
ide+0.041
ushes+0.040
aff+0.039
Neg logit contributions
 oh-0.049
 afraid-0.046
 worried-0.045
 scared-0.044
 careless-0.041
Token his
Token ID465
Feature activation+5.29e-03

Pos logit contributions
 afraid+0.051
 oh+0.048
 worried+0.046
 careless+0.045
 scared+0.044
Neg logit contributions
irs-0.041
ide-0.035
 ease-0.032
lights-0.031
of-0.031
Token mother
Token ID2802
Feature activation-6.80e-04

Pos logit contributions
irs+0.007
ushes+0.006
 outfits+0.006
ide+0.006
 parties+0.006
Neg logit contributions
 afraid-0.011
 oh-0.011
Run-0.010
 worried-0.010
 scared-0.010
Token could
Token ID714
Feature activation-6.13e-03

Pos logit contributions
 afraid+0.001
 oh+0.001
 seeker+0.001
 worried+0.001
Run+0.001
Neg logit contributions
ide-0.001
of-0.001
irs-0.001
r-0.001
aff-0.001
Token see
Token ID766
Feature activation+3.93e-02

Pos logit contributions
 afraid+0.006
 worried+0.005
 scared+0.005
 oh+0.005
 careless+0.004
Neg logit contributions
irs-0.018
ide-0.016
aff-0.016
ushes-0.015
angles-0.015
Token offered
Token ID4438
Feature activation+1.23e-02

Pos logit contributions
irs+0.050
ide+0.044
ushes+0.041
angles+0.041
of+0.041
Neg logit contributions
 afraid-0.037
 oh-0.037
 worried-0.034
 scared-0.033
 careless-0.032
Token to
Token ID284
Feature activation+3.29e-02

Pos logit contributions
irs+0.021
ide+0.018
ushes+0.016
angles+0.016
of+0.016
Neg logit contributions
 afraid-0.024
 oh-0.024
 worried-0.023
 scared-0.022
 careless-0.022
Token give
Token ID1577
Feature activation+1.01e-02

Pos logit contributions
irs+0.060
ide+0.056
ushes+0.056
aters+0.055
lights+0.054
Neg logit contributions
 oh-0.056
 scared-0.054
 afraid-0.053
 worried-0.052
 careless-0.049
Token them
Token ID606
Feature activation+9.26e-03

Pos logit contributions
irs+0.015
ide+0.012
ushes+0.011
angles+0.011
ades+0.011
Neg logit contributions
 oh-0.022
 afraid-0.022
 worried-0.021
 scared-0.020
 careless-0.020
Token a
Token ID257
Feature activation+1.48e-02

Pos logit contributions
irs+0.021
ide+0.019
aff+0.017
angles+0.017
ushes+0.017
Neg logit contributions
 oh-0.013
 afraid-0.012
 worried-0.012
 careless-0.012
 scared-0.012
Token beautiful
Token ID4950
Feature activation+4.01e-02

Pos logit contributions
irs+0.022
ide+0.018
ushes+0.017
of+0.017
ades+0.017
Neg logit contributions
 oh-0.031
 afraid-0.031
 worried-0.028
 scared-0.027
 careless-0.027
Token,
Token ID11
Feature activation-2.83e-02

Pos logit contributions
irs+0.061
ide+0.052
 outfits+0.048
aff+0.047
ushes+0.047
Neg logit contributions
 afraid-0.084
 oh-0.082
 worried-0.077
 careless-0.075
 scared-0.074
Token soft
Token ID2705
Feature activation+2.73e-02

Pos logit contributions
 oh+0.049
 afraid+0.049
 worried+0.044
 careless+0.042
 seeker+0.042
Neg logit contributions
irs-0.042
 journals-0.037
 outfits-0.037
ushes-0.037
 parties-0.036
Token carpet
Token ID20710
Feature activation+3.80e-03

Pos logit contributions
irs+0.042
ide+0.036
 outfits+0.033
aff+0.032
of+0.032
Neg logit contributions
 afraid-0.056
 oh-0.055
 worried-0.052
 careless-0.051
 scared-0.050
Token for
Token ID329
Feature activation+2.07e-02

Pos logit contributions
irs+0.007
ide+0.006
ushes+0.005
angles+0.005
of+0.005
Neg logit contributions
 afraid-0.007
 oh-0.007
 worried-0.006
 scared-0.006
 careless-0.006
Token their
Token ID511
Feature activation+6.03e-03

Pos logit contributions
irs+0.033
ide+0.031
ushes+0.029
of+0.029
angles+0.028
Neg logit contributions
 oh-0.042
 afraid-0.040
 worried-0.039
 scared-0.038
 careless-0.037
Token was
Token ID373
Feature activation-3.33e-02

Pos logit contributions
irs+0.043
ide+0.038
of+0.036
ushes+0.035
aff+0.035
Neg logit contributions
 oh-0.033
 afraid-0.032
 worried-0.029
 scared-0.028
 careless-0.028
Token not
Token ID407
Feature activation-3.42e-02

Pos logit contributions
 oh+0.030
 afraid+0.025
Run+0.023
 worried+0.020
 careless+0.019
Neg logit contributions
irs-0.084
ide-0.077
ushes-0.074
 ease-0.072
ingu-0.071
Token angry
Token ID7954
Feature activation-1.30e-02

Pos logit contributions
 oh+0.044
Run+0.043
 afraid+0.038
 worried+0.035
 seeker+0.035
Neg logit contributions
irs-0.067
ide-0.061
ushes-0.060
 ease-0.058
 parties-0.058
Token.
Token ID13
Feature activation-1.40e-02

Pos logit contributions
 oh+0.027
 afraid+0.027
 worried+0.026
 careless+0.024
 scared+0.024
Neg logit contributions
irs-0.018
ide-0.016
aff-0.014
of-0.014
 parties-0.014
Token He
Token ID679
Feature activation+1.76e-02

Pos logit contributions
 afraid+0.026
 oh+0.026
 worried+0.024
 scared+0.024
 careless+0.023
Neg logit contributions
irs-0.027
ide-0.023
angles-0.022
of-0.021
 outfits-0.021
Token was
Token ID373
Feature activation-3.41e-02

Pos logit contributions
irs+0.033
ide+0.030
of+0.029
aff+0.028
ushes+0.027
Neg logit contributions
 oh-0.028
 afraid-0.027
 worried-0.024
 careless-0.023
 seeker-0.023
Token cheerful
Token ID37999
Feature activation+2.22e-02

Pos logit contributions
 oh+0.032
Run+0.027
 afraid+0.023
 seeker+0.022
eware+0.021
Neg logit contributions
irs-0.080
ide-0.073
ushes-0.071
 ease-0.070
ingu-0.069
Token.
Token ID13
Feature activation-8.70e-03

Pos logit contributions
irs+0.045
ide+0.040
ushes+0.037
of+0.037
aff+0.037
Neg logit contributions
 oh-0.035
 afraid-0.035
 worried-0.032
 scared-0.031
 careless-0.031
Token
Token ID198
Feature activation+5.14e-02

Pos logit contributions
 afraid+0.016
 oh+0.015
 worried+0.015
 scared+0.014
 careless+0.014
Neg logit contributions
irs-0.018
ide-0.015
angles-0.015
 outfits-0.014
of-0.014
Token
Token ID198
Feature activation+3.90e-02

Pos logit contributions
irs+0.107
 outfits+0.100
angles+0.099
 journals+0.097
aters+0.095
Neg logit contributions
 afraid-0.079
 scared-0.075
 worried-0.074
 oh-0.068
 careless-0.065
Token"
Token ID1
Feature activation-1.36e-02

Pos logit contributions
irs+0.078
ide+0.067
angles+0.063
aff+0.063
 ease+0.062
Neg logit contributions
 oh-0.068
 afraid-0.068
 worried-0.064
 careless-0.062
 scared-0.062
Token play
Token ID711
Feature activation+2.50e-03

Pos logit contributions
irs+0.028
ushes+0.023
ide+0.022
angles+0.022
aff+0.022
Neg logit contributions
 afraid-0.032
 oh-0.032
 scared-0.031
 worried-0.030
 careless-0.029
Token with
Token ID351
Feature activation-1.66e-04

Pos logit contributions
irs+0.005
ide+0.004
ushes+0.004
angles+0.004
aff+0.004
Neg logit contributions
 oh-0.004
 afraid-0.004
 worried-0.004
 scared-0.004
 careless-0.004
Token his
Token ID465
Feature activation+9.69e-03

Pos logit contributions
 oh+0.000
 afraid+0.000
 scared+0.000
 careless+0.000
 worried+0.000
Neg logit contributions
irs-0.000
ide-0.000
ushes-0.000
aff-0.000
angles-0.000
Token ball
Token ID2613
Feature activation-1.42e-02

Pos logit contributions
irs+0.017
ide+0.015
ushes+0.014
 outfits+0.014
angles+0.014
Neg logit contributions
 afraid-0.017
 oh-0.017
 worried-0.016
 scared-0.015
 careless-0.015
Token.
Token ID13
Feature activation-7.80e-05

Pos logit contributions
 afraid+0.026
 oh+0.025
 worried+0.024
 scared+0.024
 careless+0.023
Neg logit contributions
irs-0.027
ide-0.022
ushes-0.022
aff-0.021
of-0.021
Token One
Token ID1881
Feature activation-1.58e-02

Pos logit contributions
 afraid+0.000
 oh+0.000
 scared+0.000
 worried+0.000
 foolish+0.000
Neg logit contributions
irs-0.000
ide-0.000
 outfits-0.000
ades-0.000
aff-0.000
Token day
Token ID1110
Feature activation-5.58e-02

Pos logit contributions
 afraid+0.025
 oh+0.025
 worried+0.023
 determined+0.021
 scared+0.021
Neg logit contributions
ide-0.028
of-0.026
irs-0.024
Everyone-0.023
r-0.022
Token,
Token ID11
Feature activation-5.22e-02

Pos logit contributions
 afraid+0.092
 oh+0.087
 worried+0.085
 scared+0.079
 careless+0.079
Neg logit contributions
irs-0.109
ide-0.096
of-0.090
angles-0.087
oons-0.087
Token Max
Token ID5436
Feature activation-3.13e-02

Pos logit contributions
 afraid+0.101
 worried+0.094
 oh+0.094
 careless+0.089
 scared+0.088
Neg logit contributions
irs-0.084
ide-0.072
angles-0.067
of-0.066
 parties-0.065
Token's
Token ID338
Feature activation-8.07e-03

Pos logit contributions
 oh+0.061
 afraid+0.058
 worried+0.054
 determined+0.050
 adamant+0.050
Neg logit contributions
irs-0.042
ide-0.042
of-0.039
 parties-0.037
 outfits-0.036
Token ball
Token ID2613
Feature activation-1.90e-02

Pos logit contributions
 afraid+0.016
 oh+0.016
 worried+0.014
 scared+0.013
Run+0.013
Neg logit contributions
irs-0.012
ide-0.011
ushes-0.010
 parties-0.010
angles-0.010
Token to
Token ID284
Feature activation+4.39e-02

Pos logit contributions
irs+0.032
ide+0.028
ushes+0.026
of+0.026
angles+0.026
Neg logit contributions
 oh-0.029
 afraid-0.029
 worried-0.027
 careless-0.026
 scared-0.026
Token them
Token ID606
Feature activation+1.52e-02

Pos logit contributions
irs+0.081
ushes+0.069
ide+0.069
ades+0.065
aff+0.064
Neg logit contributions
 oh-0.080
 afraid-0.078
 worried-0.073
 scared-0.072
 careless-0.070
Token.
Token ID13
Feature activation-9.28e-03

Pos logit contributions
irs+0.028
ide+0.024
ushes+0.022
angles+0.022
of+0.022
Neg logit contributions
 afraid-0.027
 oh-0.027
 worried-0.025
 scared-0.025
 careless-0.024
Token But
Token ID887
Feature activation-3.53e-02

Pos logit contributions
 oh+0.016
 afraid+0.016
 worried+0.015
 careless+0.015
 scared+0.015
Neg logit contributions
irs-0.017
ide-0.014
ushes-0.014
aff-0.013
of-0.013
Token now
Token ID783
Feature activation-3.80e-02

Pos logit contributions
 afraid+0.062
 worried+0.058
 oh+0.057
 scared+0.055
 careless+0.054
Neg logit contributions
irs-0.063
ide-0.054
angles-0.052
 parties-0.050
of-0.050
Token is
Token ID318
Feature activation-3.31e-02

Pos logit contributions
 afraid+0.061
 oh+0.060
 worried+0.056
 scared+0.054
 careless+0.053
Neg logit contributions
irs-0.077
ide-0.067
of-0.063
angles-0.063
ushes-0.062
Token not
Token ID407
Feature activation-3.95e-02

Pos logit contributions
 oh+0.053
 afraid+0.051
 worried+0.048
Run+0.046
 careless+0.045
Neg logit contributions
irs-0.060
ide-0.050
angles-0.048
 parties-0.048
ushes-0.047
Token the
Token ID262
Feature activation+2.11e-02

Pos logit contributions
 oh+0.057
 afraid+0.053
 worried+0.051
Run+0.048
 careless+0.048
Neg logit contributions
irs-0.076
ide-0.070
 parties-0.065
ushes-0.063
angles-0.063
Token time
Token ID640
Feature activation-4.31e-02

Pos logit contributions
irs+0.034
ide+0.032
 parties+0.031
 ease+0.031
 outfits+0.029
Neg logit contributions
 oh-0.036
 afraid-0.034
 worried-0.031
 careless-0.030
 scared-0.029
Token.
Token ID13
Feature activation-1.17e-02

Pos logit contributions
 afraid+0.077
 oh+0.076
 worried+0.072
 scared+0.069
 careless+0.068
Neg logit contributions
irs-0.079
ide-0.068
ushes-0.063
of-0.063
angles-0.062
Token The
Token ID383
Feature activation+2.33e-02

Pos logit contributions
 afraid+0.021
 oh+0.020
 worried+0.019
 scared+0.018
 careless+0.018
Neg logit contributions
irs-0.022
ide-0.019
ushes-0.018
angles-0.017
of-0.017
Token it
Token ID340
Feature activation+6.44e-03

Pos logit contributions
 oh+0.036
 afraid+0.036
 worried+0.033
 careless+0.032
 scared+0.032
Neg logit contributions
irs-0.035
ide-0.030
ushes-0.028
of-0.028
 ease-0.027
Token was
Token ID373
Feature activation-3.44e-02

Pos logit contributions
irs+0.013
ide+0.011
ushes+0.010
of+0.010
angles+0.010
Neg logit contributions
 afraid-0.010
 oh-0.010
 worried-0.010
 scared-0.009
 careless-0.009
Token full
Token ID1336
Feature activation+3.69e-02

Pos logit contributions
 oh+0.051
 afraid+0.050
 worried+0.045
 careless+0.044
 scared+0.043
Neg logit contributions
irs-0.070
ide-0.063
ushes-0.059
 ease-0.059
 parties-0.057
Token of
Token ID286
Feature activation+2.90e-02

Pos logit contributions
irs+0.046
ide+0.037
ushes+0.032
of+0.031
angles+0.031
Neg logit contributions
 afraid-0.088
 oh-0.088
 worried-0.084
 scared-0.081
 careless-0.081
Token shiny
Token ID22441
Feature activation+3.24e-02

Pos logit contributions
irs+0.053
ide+0.046
ushes+0.042
angles+0.042
of+0.042
Neg logit contributions
 afraid-0.053
 oh-0.052
 worried-0.049
 scared-0.048
 careless-0.047
Token,
Token ID11
Feature activation-3.28e-02

Pos logit contributions
irs+0.058
ide+0.049
ushes+0.045
angles+0.045
of+0.045
Neg logit contributions
 afraid-0.060
 oh-0.060
 worried-0.057
 scared-0.055
 careless-0.054
Token colourful
Token ID44799
Feature activation+4.10e-02

Pos logit contributions
 afraid+0.060
 oh+0.060
 worried+0.054
 careless+0.053
Run+0.051
Neg logit contributions
irs-0.048
 outfits-0.042
ushes-0.041
 patterns-0.041
 journals-0.041
Token things
Token ID1243
Feature activation-2.02e-02

Pos logit contributions
irs+0.073
ide+0.061
angles+0.057
ushes+0.056
of+0.055
Neg logit contributions
 afraid-0.077
 oh-0.076
 worried-0.073
 scared-0.072
 careless-0.070
Token and
Token ID290
Feature activation-1.93e-02

Pos logit contributions
 afraid+0.034
 oh+0.034
 worried+0.032
 scared+0.030
 careless+0.030
Neg logit contributions
irs-0.039
ide-0.034
ushes-0.032
angles-0.031
of-0.031
Token surprises
Token ID24072
Feature activation+1.38e-02

Pos logit contributions
 afraid+0.034
 oh+0.034
 worried+0.032
 careless+0.030
 scared+0.030
Neg logit contributions
irs-0.034
ide-0.029
ushes-0.028
 outfits-0.027
 parties-0.027
Token!
Token ID0
Feature activation+7.97e-03

Pos logit contributions
irs+0.026
ide+0.022
ushes+0.021
aff+0.021
angles+0.021
Neg logit contributions
 afraid-0.024
 oh-0.024
 worried-0.023
 scared-0.023
 careless-0.022
Token
Token ID198
Feature activation+6.37e-02

Pos logit contributions
irs+0.008
ide+0.006
angles+0.006
ushes+0.006
 outfits+0.006
Neg logit contributions
 oh-0.010
 afraid-0.010
 worried-0.009
 scared-0.009
 careless-0.009
Token
Token ID198
Feature activation+4.71e-02

Pos logit contributions
irs+0.118
 outfits+0.098
angles+0.097
 journals+0.096
aff+0.095
Neg logit contributions
 afraid-0.114
 oh-0.109
 scared-0.106
 worried-0.106
 careless-0.098
TokenTim
Token ID14967
Feature activation-5.60e-03

Pos logit contributions
irs+0.063
ide+0.053
of+0.045
ushes+0.045
angles+0.045
Neg logit contributions
 afraid-0.109
 oh-0.107
 worried-0.103
 scared-0.100
 careless-0.099
Token's
Token ID338
Feature activation-7.40e-03

Pos logit contributions
 oh+0.011
 afraid+0.010
 worried+0.009
 seeker+0.009
Run+0.009
Neg logit contributions
ide-0.008
irs-0.008
of-0.007
r-0.007
angles-0.007
Token friend
Token ID1545
Feature activation-1.06e-02

Pos logit contributions
 afraid+0.014
 oh+0.014
 worried+0.013
 scared+0.012
 careless+0.012
Neg logit contributions
irs-0.012
ide-0.011
ushes-0.010
 outfits-0.010
 parties-0.010
Token,
Token ID11
Feature activation-3.45e-02

Pos logit contributions
 afraid+0.019
 oh+0.018
 seeker+0.017
 worried+0.017
 determined+0.017
Neg logit contributions
irs-0.015
ide-0.015
of-0.014
r-0.013
aff-0.013
Token Sam
Token ID3409
Feature activation-3.03e-02

Pos logit contributions
 afraid+0.061
 determined+0.058
 worried+0.057
 oh+0.056
 careless+0.055
Neg logit contributions
irs-0.049
ide-0.044
of-0.043
 parties-0.041
Everyone-0.039
Token,
Token ID11
Feature activation-3.83e-02

Pos logit contributions
 afraid+0.043
 oh+0.043
 worried+0.039
 scared+0.037
 careless+0.037
Neg logit contributions
irs-0.067
ide-0.059
of-0.055
angles-0.055
aff-0.055
Token came
Token ID1625
Feature activation-2.57e-02

Pos logit contributions
 oh+0.066
 afraid+0.064
 seeker+0.060
 worried+0.059
 determined+0.059
Neg logit contributions
irs-0.057
ide-0.053
of-0.051
 parties-0.047
 ease-0.046
Token to
Token ID284
Feature activation+1.49e-02

Pos logit contributions
 oh+0.048
 afraid+0.045
 worried+0.043
 careless+0.041
 foolish+0.041
Neg logit contributions
irs-0.045
ide-0.040
ushes-0.036
of-0.036
r-0.035
Token visit
Token ID3187
Feature activation+5.73e-03

Pos logit contributions
irs+0.025
ide+0.022
ushes+0.021
oons+0.021
aff+0.020
Neg logit contributions
 oh-0.031
 afraid-0.029
 worried-0.028
 scared-0.028
 careless-0.027
Token.
Token ID13
Feature activation+1.05e-02

Pos logit contributions
 afraid+0.016
 oh+0.016
 worried+0.015
 scared+0.015
 careless+0.015
Neg logit contributions
irs-0.018
ide-0.015
ushes-0.015
aff-0.014
angles-0.014
Token<|endoftext|>
Token ID50256
Feature activation+2.50e-02

Pos logit contributions
irs+0.020
ide+0.017
ades+0.016
 outfits+0.016
angles+0.016
Neg logit contributions
 afraid-0.019
 oh-0.018
 worried-0.017
 scared-0.017
 careless-0.016
TokenOnce
Token ID7454
Feature activation-4.06e-02

Pos logit contributions
irs+0.046
angles+0.038
ide+0.037
 outfits+0.036
ushes+0.036
Neg logit contributions
 afraid-0.045
 scared-0.042
 oh-0.042
 worried-0.042
 careless-0.038
Token upon
Token ID2402
Feature activation+2.72e-02

Pos logit contributions
 afraid+0.076
 worried+0.073
 scared+0.071
 oh+0.069
 careless+0.068
Neg logit contributions
irs-0.074
 occasions-0.063
bons-0.061
 outfits-0.061
aff-0.060
Token a
Token ID257
Feature activation-3.57e-02

Pos logit contributions
irs+0.063
ide+0.057
ushes+0.053
of+0.053
angles+0.052
Neg logit contributions
 afraid-0.035
 oh-0.035
 worried-0.032
 scared-0.030
 careless-0.030
Token time
Token ID640
Feature activation-8.58e-02

Pos logit contributions
 oh+0.059
 afraid+0.058
 worried+0.054
 scared+0.052
 careless+0.051
Neg logit contributions
irs-0.070
ide-0.064
of-0.059
ushes-0.057
angles-0.056
Token there
Token ID612
Feature activation+1.51e-02

Pos logit contributions
 afraid+0.138
 oh+0.138
 worried+0.129
 careless+0.123
 scared+0.121
Neg logit contributions
irs-0.168
ide-0.156
of-0.139
 parties-0.133
ades-0.132
Token was
Token ID373
Feature activation-7.28e-02

Pos logit contributions
irs+0.033
ide+0.028
ushes+0.027
aff+0.027
angles+0.027
Neg logit contributions
 afraid-0.022
 oh-0.022
 worried-0.021
 scared-0.020
 careless-0.019
Token a
Token ID257
Feature activation-2.70e-02

Pos logit contributions
 afraid+0.121
 oh+0.117
 worried+0.114
 careless+0.106
 scared+0.103
Neg logit contributions
irs-0.139
ide-0.125
 ease-0.115
 parties-0.114
ushes-0.108
Token selfish
Token ID20363
Feature activation-3.86e-02

Pos logit contributions
 oh+0.053
 afraid+0.051
 worried+0.045
Run+0.044
 scared+0.044
Neg logit contributions
ide-0.034
irs-0.032
aff-0.031
of-0.030
ingu-0.030
Token man
Token ID582
Feature activation-3.83e-02

Pos logit contributions
 oh+0.063
 afraid+0.060
 worried+0.054
 determined+0.052
Run+0.051
Neg logit contributions
aff-0.069
irs-0.066
ide-0.065
ingu-0.063
of-0.061
Token basis
Token ID4308
Feature activation+2.34e-03

Pos logit contributions
irs+0.006
ide+0.006
of+0.005
aff+0.005
 outfits+0.005
Neg logit contributions
 afraid-0.006
 oh-0.006
 worried-0.006
 scared-0.006
 careless-0.006
Token.
Token ID13
Feature activation+1.24e-02

Pos logit contributions
irs+0.004
ide+0.003
ushes+0.003
angles+0.003
aff+0.003
Neg logit contributions
 afraid-0.004
 oh-0.004
 worried-0.004
 scared-0.004
 careless-0.004
Token<|endoftext|>
Token ID50256
Feature activation+2.32e-02

Pos logit contributions
irs+0.023
ide+0.019
ades+0.018
angles+0.018
 outfits+0.018
Neg logit contributions
 afraid-0.023
 oh-0.022
 worried-0.021
 scared-0.021
 careless-0.020
TokenOne
Token ID3198
Feature activation-7.85e-03

Pos logit contributions
irs+0.042
angles+0.036
ide+0.034
 outfits+0.034
ushes+0.033
Neg logit contributions
 afraid-0.042
 scared-0.039
 oh-0.039
 worried-0.039
 careless-0.035
Token day
Token ID1110
Feature activation-6.17e-02

Pos logit contributions
 oh+0.010
 afraid+0.010
 worried+0.009
 scared+0.008
 careless+0.008
Neg logit contributions
irs-0.017
ide-0.017
of-0.016
 parties-0.015
ushes-0.015
Token,
Token ID11
Feature activation-6.37e-02

Pos logit contributions
 afraid+0.097
 oh+0.094
 worried+0.088
 careless+0.084
 scared+0.084
Neg logit contributions
irs-0.126
ide-0.112
of-0.104
 parties-0.100
ades-0.099
Token a
Token ID257
Feature activation-2.22e-02

Pos logit contributions
 afraid+0.102
 oh+0.095
 worried+0.094
 careless+0.089
 scared+0.086
Neg logit contributions
irs-0.119
ide-0.112
of-0.101
 ease-0.098
ushes-0.096
Token rh
Token ID9529
Feature activation-2.56e-02

Pos logit contributions
 afraid+0.044
 oh+0.044
 worried+0.039
 adamant+0.039
 gonna+0.039
Neg logit contributions
of-0.024
Everyone-0.024
ide-0.024
aff-0.023
irs-0.022
Tokenin
Token ID259
Feature activation-1.07e-02

Pos logit contributions
 afraid+0.050
 oh+0.049
 worried+0.047
 scared+0.046
 careless+0.046
Neg logit contributions
irs-0.044
ide-0.035
ushes-0.034
angles-0.033
 outfits-0.033
Tokenoc
Token ID420
Feature activation-9.58e-03

Pos logit contributions
 afraid+0.026
 oh+0.026
 worried+0.025
 careless+0.024
 scared+0.024
Neg logit contributions
irs-0.011
ide-0.010
of-0.009
aff-0.009
r-0.008
Tokeneros
Token ID27498
Feature activation-1.87e-02

Pos logit contributions
 afraid+0.019
 oh+0.018
 worried+0.017
 scared+0.017
 careless+0.017
Neg logit contributions
irs-0.016
ide-0.014
aff-0.013
angles-0.013
of-0.013
Token that
Token ID326
Feature activation+8.65e-04

Pos logit contributions
 oh+0.055
 afraid+0.052
 worried+0.049
 careless+0.047
 scared+0.046
Neg logit contributions
irs-0.049
ide-0.043
of-0.040
r-0.039
 ease-0.039
Token this
Token ID428
Feature activation+5.44e-03

Pos logit contributions
irs+0.002
ide+0.001
of+0.001
ushes+0.001
 ease+0.001
Neg logit contributions
 oh-0.001
 afraid-0.001
 worried-0.001
 careless-0.001
 scared-0.001
Token wouldn
Token ID3636
Feature activation-5.44e-02

Pos logit contributions
irs+0.011
ide+0.010
of+0.009
ushes+0.009
aff+0.009
Neg logit contributions
 oh-0.009
 afraid-0.009
 worried-0.008
 scared-0.008
 careless-0.007
Tokenâ
Token ID22940
Feature activation-4.08e-02

Pos logit contributions
 afraid+0.087
 oh+0.081
 worried+0.081
 scared+0.081
 careless+0.075
Neg logit contributions
irs-0.113
ushes-0.097
ide-0.097
angles-0.096
 outfits-0.096
Token
Token ID26391
Feature activation-4.84e-02

Pos logit contributions
 afraid+0.063
 oh+0.060
 worried+0.057
 scared+0.055
 careless+0.054
Neg logit contributions
irs-0.086
ide-0.078
of-0.074
r-0.072
angles-0.071
Token
Token ID8151
Feature activation-6.68e-02

Pos logit contributions
 afraid+0.081
 oh+0.079
 worried+0.076
 scared+0.074
 careless+0.071
Neg logit contributions
irs-0.097
ide-0.083
angles-0.078
aff-0.076
ushes-0.076
Tokent
Token ID83
Feature activation-6.49e-03

Pos logit contributions
 afraid+0.100
 oh+0.100
 worried+0.092
 scared+0.088
 careless+0.087
Neg logit contributions
irs-0.142
ide-0.127
of-0.118
ushes-0.116
aff-0.116
Token be
Token ID307
Feature activation-3.94e-02

Pos logit contributions
 afraid+0.011
 oh+0.011
 worried+0.011
 scared+0.010
 careless+0.010
Neg logit contributions
irs-0.013
ide-0.011
ushes-0.011
angles-0.011
ades-0.011
Token the
Token ID262
Feature activation+1.92e-02

Pos logit contributions
 oh+0.047
 afraid+0.044
 worried+0.039
 careless+0.037
 scared+0.036
Neg logit contributions
irs-0.094
ide-0.084
ushes-0.080
of-0.079
 ease-0.079
Token last
Token ID938
Feature activation-1.05e-02

Pos logit contributions
irs+0.032
 ease+0.031
ide+0.031
of+0.029
 parties+0.029
Neg logit contributions
 oh-0.030
 afraid-0.029
 worried-0.026
Run-0.025
 scared-0.025
Token time
Token ID640
Feature activation-3.93e-02

Pos logit contributions
 oh+0.016
 afraid+0.016
 worried+0.014
 scared+0.014
 careless+0.014
Neg logit contributions
irs-0.022
ide-0.020
of-0.018
ushes-0.018
angles-0.018
TokenMom
Token ID29252
Feature activation+8.24e-03

Pos logit contributions
irs+0.138
angles+0.118
 journals+0.116
 outfits+0.115
 parties+0.113
Neg logit contributions
 oh-0.128
 afraid-0.125
 scared-0.120
 worried-0.118
 careless-0.109
Tokenmy
Token ID1820
Feature activation+5.07e-04

Pos logit contributions
irs+0.017
ide+0.013
ushes+0.013
angles+0.013
 parties+0.013
Neg logit contributions
 afraid-0.014
 worried-0.013
 oh-0.013
 scared-0.013
 careless-0.012
Token said
Token ID531
Feature activation-2.11e-02

Pos logit contributions
irs+0.001
ide+0.001
of+0.001
aff+0.001
angles+0.001
Neg logit contributions
 afraid-0.001
 oh-0.001
 worried-0.001
 careless-0.001
 scared-0.001
Token,
Token ID11
Feature activation-3.05e-02

Pos logit contributions
 afraid+0.034
 oh+0.032
 worried+0.031
 careless+0.030
 scared+0.029
Neg logit contributions
irs-0.040
ide-0.037
of-0.034
aff-0.033
ades-0.033
Token "
Token ID366
Feature activation-3.58e-03

Pos logit contributions
 afraid+0.053
 oh+0.051
 worried+0.048
 scared+0.047
 careless+0.047
Neg logit contributions
irs-0.055
ide-0.049
aff-0.045
of-0.044
ades-0.043
TokenBecause
Token ID8128
Feature activation-6.51e-02

Pos logit contributions
 oh+0.006
 afraid+0.006
 worried+0.005
 scared+0.005
 careless+0.005
Neg logit contributions
irs-0.007
ide-0.006
ushes-0.006
angles-0.006
oons-0.005
Token he
Token ID339
Feature activation+1.99e-02

Pos logit contributions
 afraid+0.107
 oh+0.105
 worried+0.099
 scared+0.095
 careless+0.093
Neg logit contributions
irs-0.128
ide-0.111
angles-0.103
ushes-0.103
of-0.102
Token can
Token ID460
Feature activation-1.52e-02

Pos logit contributions
irs+0.037
ide+0.032
of+0.030
angles+0.029
aff+0.029
Neg logit contributions
 oh-0.035
 afraid-0.034
 worried-0.032
 careless-0.030
 scared-0.030
Token pay
Token ID1414
Feature activation+1.05e-02

Pos logit contributions
 scared+0.023
 afraid+0.022
 worried+0.022
 oh+0.020
 very+0.020
Neg logit contributions
irs-0.032
ushes-0.029
 journals-0.028
rones-0.027
lights-0.027
Token for
Token ID329
Feature activation+1.77e-02

Pos logit contributions
irs+0.021
ide+0.018
ushes+0.017
angles+0.017
of+0.017
Neg logit contributions
 afraid-0.017
 oh-0.017
 worried-0.016
 scared-0.015
 careless-0.015
Token anything
Token ID1997
Feature activation-1.53e-02

Pos logit contributions
irs+0.033
ide+0.030
ushes+0.028
angles+0.027
aff+0.027
Neg logit contributions
 oh-0.032
 afraid-0.031
 worried-0.029
 scared-0.028
 careless-0.028
Token fit
Token ID4197
Feature activation-5.07e-03

Pos logit contributions
 afraid+0.055
 oh+0.053
 worried+0.051
 scared+0.050
 careless+0.048
Neg logit contributions
irs-0.067
ide-0.057
ushes-0.055
angles-0.054
ades-0.054
Token.
Token ID13
Feature activation-3.01e-02

Pos logit contributions
 afraid+0.009
 oh+0.009
 worried+0.008
 careless+0.008
 scared+0.008
Neg logit contributions
irs-0.010
ide-0.008
of-0.008
angles-0.008
ushes-0.008
Token The
Token ID383
Feature activation+1.96e-02

Pos logit contributions
 afraid+0.055
 oh+0.054
 worried+0.051
 scared+0.049
 careless+0.049
Neg logit contributions
irs-0.055
ide-0.047
ushes-0.043
angles-0.043
of-0.043
Token crystal
Token ID15121
Feature activation-2.42e-03

Pos logit contributions
irs+0.032
ide+0.027
ushes+0.024
angles+0.024
of+0.024
Neg logit contributions
 afraid-0.040
 oh-0.039
 worried-0.037
 scared-0.036
 careless-0.036
Token is
Token ID318
Feature activation-3.57e-02

Pos logit contributions
 oh+0.004
 afraid+0.004
 worried+0.004
 careless+0.004
 scared+0.004
Neg logit contributions
irs-0.004
ide-0.004
angles-0.004
of-0.004
ushes-0.004
Token too
Token ID1165
Feature activation-6.69e-02

Pos logit contributions
 oh+0.048
 afraid+0.044
 worried+0.040
Run+0.039
 careless+0.037
Neg logit contributions
irs-0.079
ide-0.068
ushes-0.066
angles-0.065
 ease-0.065
Token big
Token ID1263
Feature activation+2.76e-03

Pos logit contributions
 oh+0.110
 afraid+0.098
 worried+0.090
 careless+0.083
 scared+0.083
Neg logit contributions
irs-0.135
ide-0.120
ushes-0.114
of-0.113
ades-0.112
Token for
Token ID329
Feature activation+3.07e-03

Pos logit contributions
irs+0.004
ide+0.004
 parties+0.003
of+0.003
r+0.003
Neg logit contributions
 oh-0.005
 afraid-0.005
 worried-0.005
 careless-0.005
 scared-0.005
Token the
Token ID262
Feature activation+3.09e-02

Pos logit contributions
irs+0.006
ide+0.005
ushes+0.005
aff+0.005
of+0.005
Neg logit contributions
 oh-0.005
 afraid-0.005
 worried-0.005
 scared-0.005
 careless-0.005
Token ring
Token ID5858
Feature activation-6.96e-03

Pos logit contributions
irs+0.049
ide+0.044
 ease+0.042
 parties+0.042
 outfits+0.041
Neg logit contributions
 afraid-0.057
 oh-0.055
 worried-0.053
 scared-0.050
 careless-0.050
Token.
Token ID13
Feature activation-1.24e-02

Pos logit contributions
 afraid+0.014
 oh+0.014
 worried+0.013
 scared+0.013
 careless+0.013
Neg logit contributions
irs-0.011
ide-0.009
of-0.008
ushes-0.008
angles-0.008