TOP ACTIVATIONS
MAX = 10.636

long as we ’ re together , we can
When Jack ’ s friends saw the qu
dreamed of chocolate . <|endoftext|> Once upon a time , there
a bad ending . <|endoftext|> Once upon a time , there
and their friendship . <|endoftext|> Once upon a time , there
Then when you ’ re ready , decide what
safe and happy . <|endoftext|> Once upon a time , there
sharing is good . <|endoftext|> Once , there was a girl
bloom again too . <|endoftext|> Once upon a time , there
not be selfish . <|endoftext|> Once upon a time , a
. The end . <|endoftext|> Once upon a time , in
lot of fun . <|endoftext|> Once upon a time , a
happily ever after . <|endoftext|> Once upon a time , there
the sparkling stone . <|endoftext|> Once upon a time there was
and her friend . <|endoftext|> Once upon a time , there
a hard lesson . <|endoftext|> Once upon a time , there
to their family . <|endoftext|> Once upon a time , there
will be sad . <|endoftext|> Once upon a time , there
are not perfect . <|endoftext|> Once upon a time , there
love her anyway . <|endoftext|> Once there was a girl .

MIN ACTIVATIONS
MIN = -6.816

and sister . Sam was three and John was five .
named Tess . Tess was three years old and loved playing
came up to them and accused them of taking his mint
called Grace . Sam was three years old and Grace was
named Tom . Tom was three years old and he lived
. Charlie and Daisy were three years old and were best
named Ken . Ken was three years old , and he
named Jane . Jane was three years old and she was
named Lucy . She was three years old . She was
named Lucy . Lucy was three years old and excited to
ie . Mill ie was three years old and she loved
called Maggie . Maggie was three years old and she loved
named Daisy . Daisy was three years old and she loved
named Jane . Jane was three years old and she wanted
named John . He was three years old and loved to
named Jane . Jane was three years old and she liked
named Sally . Sally was three years old and loved to
called Grace . Grace was three years old and had the
named Jane . Jane was three years old and loved to
named Sarah . She was three years old and loved popcorn

INTERVAL 6.273 - 10.636
CONTAINS 0.385%

became best friends . <|endoftext|> Once upon a time , there
The end . <|endoftext|> Once upon a time there was
to the stadium ! <|endoftext|> Once upon a time , there
in the jungle . <|endoftext|> Once upon a time , there
he had seen . <|endoftext|> Once there was an elderly lady

INTERVAL 1.910 - 6.273
CONTAINS 12.752%

more money to get what you want . You just need
race . The big bee flew fast , but Buzz flew
want . Now you will be in big trouble . You
looking ." He waited for a few minutes .
They had a good ice cream . They were happy .

INTERVAL -2.453 - 1.910
CONTAINS 79.299%

their job with the machine . It was smooth and the
Dad 's hand and said , " Let 's try it
box and tucked the wet ornament inside it . She smiled
James had a long rope that he loved . He wanted
," Lily said . They carefully picked up the

INTERVAL -6.816 - -2.453
CONTAINS 7.564%

a time , there was a brave fish who lived in
. They surprised him with an amazing robot dog !
S que aky was also very lonely . Fl uffy and
. Ben 's mom and dad were worried too
over and gave the rabbit a big hug . But when
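
The four INTERVAL ranges above appear to be equal-width bins spanning MIN to MAX. This is inferred from the printed bounds and is not stated by the dashboard itself. A minimal Python sketch under that assumption reproduces the bounds and checks that the CONTAINS percentages account for all tokens. The function name `interval_edges` and the bin count of 4 are illustrative choices, not part of the source.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges from lo to hi, rounded to 3 decimals."""
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


# MIN and MAX exactly as printed in the dashboard.
edges = interval_edges(-6.816, 10.636)

# CONTAINS percentages as printed, highest interval first.
shares = [0.385, 12.752, 79.299, 7.564]
total = sum(shares)

print(edges)            # consecutive pairs give the interval bounds
print(round(total, 3))  # share of tokens covered by the four intervals
```

If the bins really are equal-width, each one spans (10.636 - (-6.816)) / 4 = 4.363 activation units, which is consistent with the middle bin around zero holding most of the tokens.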
Token long
Token ID890
Feature activation+3.44e+00

Pos logit contributions
 gestures+0.007
 ways+0.007
 kinds+0.007
 unison+0.007
 compare+0.007
Neg logit contributions
at-0.016
affle-0.014
 seeker-0.014
ip-0.014
bled-0.013
Token as
Token ID355
Feature activation+9.27e-01

Pos logit contributions
at+0.088
 seeker+0.082
affle+0.081
ip+0.079
bled+0.075
Neg logit contributions
 gestures-0.050
 ways-0.049
 kinds-0.048
 compare-0.047
 unison-0.046
Token we
Token ID356
Feature activation+3.14e+00

Pos logit contributions
at+0.028
affle+0.026
 seeker+0.024
ip+0.024
bled+0.023
Neg logit contributions
 gestures-0.010
 kinds-0.010
 ways-0.010
 compare-0.010
 who-0.009
Tokenâ
Token ID22940
Feature activation+3.65e+00

Pos logit contributions
at+0.079
 seeker+0.075
affle+0.075
ip+0.074
bled+0.071
Neg logit contributions
 gestures-0.045
 compare-0.044
 kinds-0.044
 ways-0.043
 unison-0.043
Token
Token ID26391
Feature activation+5.59e+00

Pos logit contributions
at+0.068
 seeker+0.067
bled+0.065
affle+0.065
cast+0.065
Neg logit contributions
 gestures-0.071
 kinds-0.071
 compare-0.070
 unison-0.070
 respect-0.068
Token
Token ID8151
Feature activation+1.06e+01

Pos logit contributions
at+0.119
affle+0.108
 seeker+0.108
ip+0.104
bled+0.101
Neg logit contributions
 gestures-0.104
 ways-0.103
 unison-0.101
 kinds-0.101
 compare-0.098
Tokenre
Token ID260
Feature activation+2.98e+00

Pos logit contributions
 seeker+0.290
affle+0.284
at+0.282
bled+0.265
atmeal+0.261
Neg logit contributions
 who-0.116
So-0.115
old-0.114
 else-0.105
lier-0.105
Token together
Token ID1978
Feature activation-6.80e-02

Pos logit contributions
at+0.067
 seeker+0.063
ip+0.062
affle+0.061
bled+0.057
Neg logit contributions
 gestures-0.052
 ways-0.051
 kinds-0.050
 unison-0.049
 distress-0.049
Token,
Token ID11
Feature activation+8.56e-01

Pos logit contributions
 gestures+0.001
 ways+0.001
 kinds+0.001
 unison+0.001
 distress+0.001
Neg logit contributions
at-0.002
 seeker-0.001
ip-0.001
affle-0.001
bled-0.001
Token we
Token ID356
Feature activation+9.05e-01

Pos logit contributions
at+0.026
affle+0.024
 seeker+0.023
ip+0.023
bled+0.022
Neg logit contributions
 ways-0.010
 gestures-0.009
 who-0.009
 unison-0.009
 kinds-0.009
Token can
Token ID460
Feature activation-6.07e-03

Pos logit contributions
at+0.022
affle+0.020
ip+0.020
 seeker+0.020
bled+0.019
Neg logit contributions
 gestures-0.014
 compare-0.014
 kinds-0.014
 who-0.014
 unison-0.014
Token
Token ID198
Feature activation-9.02e-01

Pos logit contributions
 gestures+0.009
 ways+0.009
 kinds+0.009
 unison+0.008
 compare+0.008
Neg logit contributions
at-0.015
 seeker-0.014
ip-0.014
affle-0.014
bled-0.013
TokenWhen
Token ID2215
Feature activation+8.00e-01

Pos logit contributions
 gestures+0.018
 ways+0.017
 kinds+0.017
 distress+0.016
 appearances+0.016
Neg logit contributions
at-0.019
affle-0.017
 seeker-0.017
ip-0.017
bled-0.016
Token Jack
Token ID3619
Feature activation+2.60e+00

Pos logit contributions
at+0.019
 seeker+0.018
ip+0.018
affle+0.018
bled+0.017
Neg logit contributions
 gestures-0.013
 ways-0.012
 kinds-0.012
 unison-0.012
 distress-0.012
Tokenâ
Token ID22940
Feature activation+2.82e+00

Pos logit contributions
at+0.058
 seeker+0.056
affle+0.055
ained+0.053
bled+0.053
Neg logit contributions
 distress-0.043
 who-0.042
 compare-0.042
 gestures-0.042
 kinds-0.041
Token
Token ID26391
Feature activation+5.34e+00

Pos logit contributions
at+0.058
 seeker+0.056
ip+0.055
affle+0.053
bled+0.051
Neg logit contributions
 gestures-0.051
 ways-0.051
 compare-0.051
 kinds-0.051
 unison-0.050
Token
Token ID8151
Feature activation+1.05e+01

Pos logit contributions
at+0.119
affle+0.109
 seeker+0.108
ip+0.105
bled+0.101
Neg logit contributions
 gestures-0.094
 ways-0.093
 unison-0.091
 kinds-0.091
 distress-0.090
Tokens
Token ID82
Feature activation+2.60e+00

Pos logit contributions
 seeker+0.242
affle+0.238
at+0.238
cast+0.220
belt+0.220
Neg logit contributions
 who-0.154
old-0.152
So-0.147
 Amy-0.146
 Susan-0.145
Token friends
Token ID2460
Feature activation+4.64e-01

Pos logit contributions
at+0.058
 seeker+0.055
ip+0.053
affle+0.053
bled+0.049
Neg logit contributions
 gestures-0.045
 kinds-0.044
 ways-0.044
 unison-0.043
 compare-0.043
Token saw
Token ID2497
Feature activation-1.73e+00

Pos logit contributions
at+0.010
 seeker+0.009
ip+0.009
affle+0.009
bled+0.009
Neg logit contributions
 gestures-0.008
 ways-0.008
 kinds-0.008
 compare-0.008
 distress-0.008
Token the
Token ID262
Feature activation-3.77e-01

Pos logit contributions
 gestures+0.029
 unison+0.027
 compare+0.027
 kinds+0.026
 distress+0.026
Neg logit contributions
at-0.040
 seeker-0.038
affle-0.037
ip-0.037
bled-0.035
Token qu
Token ID627
Feature activation+3.62e+00

Pos logit contributions
 gestures+0.007
 ways+0.007
 kinds+0.007
 unison+0.006
 distress+0.006
Neg logit contributions
at-0.008
 seeker-0.008
ip-0.008
affle-0.008
bled-0.007
Token dreamed
Token ID27947
Feature activation+3.04e+00

Pos logit contributions
at+0.060
 seeker+0.056
ip+0.055
affle+0.053
bled+0.051
Neg logit contributions
 gestures-0.044
 kinds-0.044
 ways-0.044
 distress-0.042
 unison-0.042
Token of
Token ID286
Feature activation-8.84e-01

Pos logit contributions
at+0.091
 seeker+0.086
ip+0.085
affle+0.084
bled+0.080
Neg logit contributions
 gestures-0.031
 ways-0.030
 kinds-0.030
 unison-0.028
 distress-0.028
Token chocolate
Token ID11311
Feature activation+2.78e+00

Pos logit contributions
 gestures+0.015
 kinds+0.015
 unison+0.014
Maria+0.014
 distress+0.014
Neg logit contributions
at-0.018
 seeker-0.018
ip-0.017
affle-0.017
 skating-0.017
Token.
Token ID13
Feature activation+3.34e-01

Pos logit contributions
at+0.082
 seeker+0.078
affle+0.076
ip+0.076
bled+0.074
Neg logit contributions
 gestures-0.028
 kinds-0.027
 ways-0.027
 compare-0.025
 unison-0.025
Token<|endoftext|>
Token ID50256
Feature activation+2.51e+00

Pos logit contributions
at+0.009
 seeker+0.009
ip+0.008
affle+0.008
ained+0.008
Neg logit contributions
 gestures-0.004
 kinds-0.004
 ways-0.004
 compare-0.004
 distress-0.004
TokenOnce
Token ID7454
Feature activation+1.04e+01

Pos logit contributions
at+0.042
affle+0.037
 seeker+0.037
ip+0.036
belt+0.036
Neg logit contributions
 ways-0.056
 who-0.056
 kinds-0.055
 distress-0.055
 gestures-0.055
Token upon
Token ID2402
Feature activation+4.63e-01

Pos logit contributions
at+0.180
affle+0.161
 seeker+0.161
atmeal+0.145
ip+0.141
Neg logit contributions
 ways-0.236
 kinds-0.233
 who-0.231
 its-0.227
 Amy-0.222
Token a
Token ID257
Feature activation+3.15e+00

Pos logit contributions
at+0.009
 seeker+0.009
affle+0.008
ained+0.008
ip+0.008
Neg logit contributions
 gestures-0.009
 compare-0.009
 sizes-0.009
 appearances-0.009
 kinds-0.009
Token time
Token ID640
Feature activation+5.88e+00

Pos logit contributions
at+0.053
 seeker+0.049
ip+0.048
affle+0.047
uch+0.041
Neg logit contributions
 gestures-0.074
 kinds-0.072
 ways-0.070
 compare-0.070
 distress-0.070
Token,
Token ID11
Feature activation+6.17e+00

Pos logit contributions
at+0.153
affle+0.144
 seeker+0.140
ip+0.139
bled+0.134
Neg logit contributions
 who-0.080
 ways-0.078
 gestures-0.076
 distress-0.076
 Susan-0.075
Token there
Token ID612
Feature activation+4.35e+00

Pos logit contributions
at+0.136
affle+0.124
ip+0.123
 seeker+0.120
atmeal+0.116
Neg logit contributions
 who-0.117
 its-0.112
 distress-0.110
 kinds-0.109
 ways-0.109
Token a
Token ID257
Feature activation-4.31e-01

Pos logit contributions
 seeker+0.048
at+0.047
ip+0.044
ained+0.043
uch+0.042
Neg logit contributions
 gestures-0.041
 unison-0.040
 kinds-0.039
 sentences-0.039
 sizes-0.038
Token bad
Token ID2089
Feature activation+1.32e+00

Pos logit contributions
 gestures+0.008
 ways+0.008
 kinds+0.008
 distress+0.008
 unison+0.007
Neg logit contributions
at-0.009
 seeker-0.009
ip-0.009
affle-0.008
bled-0.008
Token ending
Token ID7464
Feature activation+1.44e+00

Pos logit contributions
at+0.025
ip+0.022
bled+0.021
affle+0.021
 seeker+0.021
Neg logit contributions
 gestures-0.029
 ways-0.028
 kinds-0.028
 compare-0.028
 meaning-0.027
Token.
Token ID13
Feature activation+1.93e-03

Pos logit contributions
at+0.040
 seeker+0.037
ip+0.037
affle+0.036
bled+0.035
Neg logit contributions
 gestures-0.017
 distress-0.017
 unison-0.016
 kinds-0.016
 compare-0.016
Token<|endoftext|>
Token ID50256
Feature activation+2.46e+00

Pos logit contributions
 seeker+0.000
ip+0.000
ained+0.000
at+0.000
affle+0.000
Neg logit contributions
 kinds-0.000
 compare-0.000
 ways-0.000
 brace-0.000
 gestures-0.000
TokenOnce
Token ID7454
Feature activation+1.03e+01

Pos logit contributions
at+0.037
 seeker+0.033
ip+0.032
affle+0.032
belt+0.030
Neg logit contributions
 ways-0.060
 kinds-0.060
 gestures-0.059
 distress-0.059
 who-0.058
Token upon
Token ID2402
Feature activation+4.68e-01

Pos logit contributions
at+0.178
 seeker+0.159
affle+0.158
atmeal+0.145
belt+0.143
Neg logit contributions
 ways-0.234
 kinds-0.229
 who-0.229
 its-0.224
 gestures-0.219
Token a
Token ID257
Feature activation+3.43e+00

Pos logit contributions
at+0.009
 seeker+0.009
affle+0.008
ained+0.008
uch+0.008
Neg logit contributions
 gestures-0.009
 compare-0.009
 sizes-0.009
 kinds-0.009
 ways-0.009
Token time
Token ID640
Feature activation+6.00e+00

Pos logit contributions
at+0.061
 seeker+0.055
ip+0.054
affle+0.054
uch+0.047
Neg logit contributions
 gestures-0.078
 kinds-0.075
 ways-0.074
 compare-0.074
 distress-0.073
Token,
Token ID11
Feature activation+6.29e+00

Pos logit contributions
at+0.155
affle+0.144
 seeker+0.141
ip+0.140
belt+0.133
Neg logit contributions
 ways-0.084
 who-0.084
 gestures-0.080
 distress-0.080
 kinds-0.079
Token there
Token ID612
Feature activation+4.38e+00

Pos logit contributions
at+0.138
affle+0.126
ip+0.125
 seeker+0.122
atmeal+0.117
Neg logit contributions
 who-0.121
 distress-0.114
 its-0.114
 ways-0.113
 kinds-0.112
Token and
Token ID290
Feature activation+1.76e+00

Pos logit contributions
at+0.005
ip+0.005
 seeker+0.005
robe+0.004
ored+0.004
Neg logit contributions
 distress-0.004
 unison-0.004
Maria-0.004
 gestures-0.004
 appearances-0.004
Token their
Token ID511
Feature activation+2.71e+00

Pos logit contributions
 seeker+0.038
ained+0.032
robe+0.031
 skating+0.031
ip+0.030
Neg logit contributions
lier-0.030
 unison-0.029
So-0.027
 distress-0.027
Maria-0.026
Token friendship
Token ID14738
Feature activation+1.48e+00

Pos logit contributions
at+0.066
 seeker+0.063
ip+0.061
affle+0.061
bled+0.057
Neg logit contributions
 gestures-0.040
 kinds-0.040
 unison-0.039
 ways-0.039
 distress-0.038
Token.
Token ID13
Feature activation-2.90e-01

Pos logit contributions
at+0.033
ip+0.032
 seeker+0.031
ained+0.031
robe+0.031
Neg logit contributions
 distress-0.021
Maria-0.021
 Susan-0.021
 unison-0.021
Susan-0.021
Token<|endoftext|>
Token ID50256
Feature activation+2.26e+00

Pos logit contributions
 distress+0.004
 kinds+0.004
 compare+0.004
 ways+0.004
 gestures+0.004
Neg logit contributions
 seeker-0.007
ip-0.007
ained-0.006
at-0.006
affle-0.006
TokenOnce
Token ID7454
Feature activation+1.03e+01

Pos logit contributions
at+0.032
 seeker+0.028
affle+0.027
ip+0.027
bled+0.025
Neg logit contributions
 gestures-0.058
 ways-0.058
 kinds-0.057
 distress-0.056
 unison-0.056
Token upon
Token ID2402
Feature activation+3.72e-01

Pos logit contributions
at+0.177
affle+0.156
 seeker+0.154
atmeal+0.143
ip+0.139
Neg logit contributions
 ways-0.240
 kinds-0.236
 who-0.231
 its-0.228
 gestures-0.225
Token a
Token ID257
Feature activation+3.49e+00

Pos logit contributions
at+0.007
 seeker+0.007
affle+0.006
ained+0.006
ip+0.006
Neg logit contributions
 gestures-0.007
 compare-0.007
 sizes-0.007
 kinds-0.007
 appearances-0.007
Token time
Token ID640
Feature activation+5.72e+00

Pos logit contributions
at+0.060
 seeker+0.054
ip+0.054
affle+0.053
uch+0.047
Neg logit contributions
 gestures-0.081
 kinds-0.078
 ways-0.077
 compare-0.077
 distress-0.076
Token,
Token ID11
Feature activation+6.15e+00

Pos logit contributions
at+0.142
affle+0.131
 seeker+0.129
ip+0.128
bled+0.120
Neg logit contributions
 ways-0.087
 gestures-0.085
 who-0.084
 kinds-0.083
 distress-0.083
Token there
Token ID612
Feature activation+4.12e+00

Pos logit contributions
at+0.124
affle+0.111
ip+0.111
 seeker+0.110
atmeal+0.101
Neg logit contributions
 who-0.122
 gestures-0.121
 ways-0.121
 kinds-0.120
 distress-0.120
Token Then
Token ID3244
Feature activation-3.66e-01

Pos logit contributions
 gestures+0.002
 ways+0.002
 kinds+0.002
 distress+0.002
 compare+0.002
Neg logit contributions
at-0.003
 seeker-0.002
ip-0.002
affle-0.002
bled-0.002
Token when
Token ID618
Feature activation+1.97e-01

Pos logit contributions
 gestures+0.005
 ways+0.005
 who+0.005
 its+0.005
 compare+0.005
Neg logit contributions
at-0.010
affle-0.009
 seeker-0.009
ip-0.009
bled-0.009
Token you
Token ID345
Feature activation+3.45e+00

Pos logit contributions
at+0.006
affle+0.005
bled+0.005
ip+0.005
ored+0.005
Neg logit contributions
 gestures-0.002
 its-0.002
 ways-0.002
 compare-0.002
 heaven-0.002
Tokenâ
Token ID22940
Feature activation+3.27e+00

Pos logit contributions
at+0.091
 seeker+0.087
ip+0.086
affle+0.085
bled+0.084
Neg logit contributions
 gestures-0.044
 compare-0.044
 ways-0.043
 kinds-0.043
 unison-0.042
Token
Token ID26391
Feature activation+5.69e+00

Pos logit contributions
 seeker+0.059
ip+0.057
at+0.057
cast+0.056
bled+0.056
Neg logit contributions
 compare-0.065
 unison-0.064
 kinds-0.064
 gestures-0.063
 heaven-0.063
Token
Token ID8151
Feature activation+1.03e+01

Pos logit contributions
at+0.126
 seeker+0.116
affle+0.114
ip+0.113
bled+0.107
Neg logit contributions
 gestures-0.101
 ways-0.099
 kinds-0.098
 unison-0.097
 distress-0.096
Tokenre
Token ID260
Feature activation+3.27e+00

Pos logit contributions
 seeker+0.227
at+0.222
affle+0.218
bled+0.215
belt+0.199
Neg logit contributions
 who-0.163
old-0.161
So-0.156
 kinds-0.155
 ways-0.152
Token ready
Token ID3492
Feature activation+2.26e+00

Pos logit contributions
at+0.071
 seeker+0.067
ip+0.066
affle+0.066
bled+0.062
Neg logit contributions
 ways-0.059
 gestures-0.058
 kinds-0.058
 compare-0.055
 distress-0.055
Token,
Token ID11
Feature activation+7.49e-01

Pos logit contributions
at+0.061
affle+0.057
 seeker+0.057
ip+0.055
bled+0.054
Neg logit contributions
 ways-0.030
 who-0.028
 gestures-0.028
 kinds-0.028
 its-0.027
Token decide
Token ID5409
Feature activation-2.03e+00

Pos logit contributions
at+0.022
affle+0.020
 seeker+0.020
ip+0.020
bled+0.019
Neg logit contributions
 gestures-0.008
 ways-0.008
 kinds-0.008
 compare-0.008
 unison-0.008
Token what
Token ID644
Feature activation-1.06e+00

Pos logit contributions
 gestures+0.027
 kinds+0.026
 ways+0.026
 unison+0.025
 distress+0.025
Neg logit contributions
at-0.054
 seeker-0.051
ip-0.051
affle-0.050
bled-0.047
Token safe
Token ID3338
Feature activation-1.41e+00

Pos logit contributions
at+0.044
ip+0.040
 seeker+0.040
affle+0.039
or+0.037
Neg logit contributions
 gestures-0.031
 ways-0.031
 kinds-0.030
 distress-0.030
 compare-0.030
Token and
Token ID290
Feature activation-1.75e-01

Pos logit contributions
 gestures+0.022
 unison+0.022
 compare+0.022
 kinds+0.021
 sentences+0.021
Neg logit contributions
at-0.032
 seeker-0.032
ip-0.030
bled-0.029
ored-0.029
Token happy
Token ID3772
Feature activation-1.96e+00

Pos logit contributions
 gestures+0.003
 unison+0.003
 ways+0.003
 distress+0.003
 kinds+0.003
Neg logit contributions
at-0.004
 seeker-0.004
ip-0.004
affle-0.003
bled-0.003
Token.
Token ID13
Feature activation-3.71e-01

Pos logit contributions
 distress+0.027
 unison+0.026
Susan+0.026
 Susan+0.026
 kinds+0.026
Neg logit contributions
 seeker-0.047
at-0.046
ip-0.045
ored-0.044
cast-0.043
Token<|endoftext|>
Token ID50256
Feature activation+2.35e+00

Pos logit contributions
 gestures+0.004
 ways+0.004
 kinds+0.004
 distress+0.004
 unison+0.004
Neg logit contributions
at-0.010
 seeker-0.010
ip-0.010
affle-0.010
bled-0.009
TokenOnce
Token ID7454
Feature activation+1.02e+01

Pos logit contributions
at+0.036
 seeker+0.032
affle+0.032
ip+0.032
belt+0.031
Neg logit contributions
 ways-0.056
 kinds-0.056
 gestures-0.055
 distress-0.055
 who-0.054
Token upon
Token ID2402
Feature activation+4.75e-01

Pos logit contributions
at+0.173
affle+0.153
 seeker+0.151
atmeal+0.139
belt+0.135
Neg logit contributions
 ways-0.239
 kinds-0.235
 who-0.233
 its-0.230
 gestures-0.227
Token a
Token ID257
Feature activation+3.69e+00

Pos logit contributions
at+0.009
 seeker+0.009
affle+0.008
ained+0.008
ip+0.008
Neg logit contributions
 gestures-0.009
 compare-0.009
 sizes-0.009
 kinds-0.009
 appearances-0.009
Token time
Token ID640
Feature activation+5.96e+00

Pos logit contributions
at+0.062
 seeker+0.056
ip+0.055
affle+0.054
uch+0.048
Neg logit contributions
 gestures-0.088
 kinds-0.085
 compare-0.083
 ways-0.083
 distress-0.083
Token,
Token ID11
Feature activation+6.37e+00

Pos logit contributions
at+0.148
affle+0.137
 seeker+0.135
ip+0.134
bled+0.126
Neg logit contributions
 ways-0.089
 gestures-0.087
 who-0.087
 kinds-0.086
 distress-0.085
Token there
Token ID612
Feature activation+4.30e+00

Pos logit contributions
at+0.140
affle+0.128
ip+0.126
 seeker+0.124
atmeal+0.119
Neg logit contributions
 who-0.123
 its-0.116
 distress-0.114
 ways-0.114
 whose-0.113
Token sharing
Token ID7373
Feature activation+1.34e+00

Pos logit contributions
at+0.009
 seeker+0.009
ip+0.009
affle+0.008
ained+0.008
Neg logit contributions
 unison-0.007
 gestures-0.007
 kinds-0.007
 distress-0.007
 ways-0.007
Token is
Token ID318
Feature activation-5.29e-01

Pos logit contributions
at+0.021
 seeker+0.020
ip+0.020
affle+0.019
ained+0.019
Neg logit contributions
 unison-0.029
 distress-0.029
 gestures-0.028
Maria-0.027
 Jessie-0.027
Token good
Token ID922
Feature activation+5.33e-02

Pos logit contributions
 gestures+0.009
 kinds+0.009
 unison+0.009
 distress+0.009
 ways+0.008
Neg logit contributions
at-0.011
 seeker-0.011
ip-0.011
bled-0.011
affle-0.011
Token.
Token ID13
Feature activation-1.16e-01

Pos logit contributions
 seeker+0.001
ip+0.001
at+0.001
ained+0.001
ored+0.001
Neg logit contributions
Maria-0.001
 unison-0.001
Susan-0.001
 distress-0.001
 Jessie-0.001
Token<|endoftext|>
Token ID50256
Feature activation+2.59e+00

Pos logit contributions
 distress+0.002
 kinds+0.002
 ways+0.002
 compare+0.002
 gestures+0.002
Neg logit contributions
 seeker-0.003
ip-0.003
at-0.002
ained-0.002
affle-0.002
TokenOnce
Token ID7454
Feature activation+1.02e+01

Pos logit contributions
at+0.060
 seeker+0.056
ip+0.055
affle+0.055
bled+0.053
Neg logit contributions
 gestures-0.042
 ways-0.042
 kinds-0.042
 distress-0.041
 unison-0.040
Token,
Token ID11
Feature activation+7.10e+00

Pos logit contributions
at+0.191
affle+0.169
 seeker+0.168
atmeal+0.163
belt+0.153
Neg logit contributions
 who-0.222
 its-0.220
 ways-0.219
 kinds-0.213
 Amy-0.207
Token there
Token ID612
Feature activation+4.90e+00

Pos logit contributions
at+0.181
affle+0.168
atmeal+0.163
ip+0.157
rob+0.155
Neg logit contributions
 who-0.134
 whose-0.118
 its-0.116
 else-0.108
 Amy-0.107
Token was
Token ID373
Feature activation+2.14e+00

Pos logit contributions
at+0.117
 seeker+0.109
ip+0.107
affle+0.106
bled+0.100
Neg logit contributions
 gestures-0.079
 ways-0.077
 kinds-0.077
 distress-0.075
 unison-0.075
Token a
Token ID257
Feature activation-1.01e+00

Pos logit contributions
at+0.054
 seeker+0.052
ip+0.050
affle+0.050
bled+0.047
Neg logit contributions
 gestures-0.032
 kinds-0.030
 ways-0.030
 compare-0.030
 unison-0.029
Token girl
Token ID2576
Feature activation-2.57e+00

Pos logit contributions
 who+0.012
 gestures+0.012
 unison+0.011
 ways+0.011
 kinds+0.011
Neg logit contributions
at-0.029
 seeker-0.026
ip-0.026
affle-0.026
bled-0.025
Token bloom
Token ID29955
Feature activation+2.33e+00

Pos logit contributions
 kinds+0.024
 gestures+0.023
 sizes+0.023
 unison+0.023
ymes+0.023
Neg logit contributions
at-0.028
 seeker-0.028
bled-0.024
ained-0.023
ip-0.023
Token again
Token ID757
Feature activation-9.26e-01

Pos logit contributions
at+0.063
affle+0.062
 seeker+0.060
ip+0.060
bled+0.056
Neg logit contributions
 ways-0.028
 gestures-0.028
 kinds-0.028
 who-0.026
 heaven-0.026
Token too
Token ID1165
Feature activation-1.07e-01

Pos logit contributions
 gestures+0.014
 distress+0.013
 kinds+0.013
 appearances+0.013
 unison+0.013
Neg logit contributions
at-0.022
ip-0.021
 seeker-0.021
or-0.019
ig-0.019
Token.
Token ID13
Feature activation-1.24e-03

Pos logit contributions
 gestures+0.002
 appearances+0.002
 distress+0.001
 unison+0.001
 kinds+0.001
Neg logit contributions
at-0.003
 seeker-0.002
ip-0.002
ored-0.002
ding-0.002
Token<|endoftext|>
Token ID50256
Feature activation+2.35e+00

Pos logit contributions
 kinds+0.000
 ways+0.000
 distress+0.000
 gestures+0.000
 compare+0.000
Neg logit contributions
 seeker-0.000
ip-0.000
at-0.000
affle-0.000
ained-0.000
TokenOnce
Token ID7454
Feature activation+1.02e+01

Pos logit contributions
at+0.034
 seeker+0.030
affle+0.030
ip+0.030
bled+0.028
Neg logit contributions
 gestures-0.059
 ways-0.058
 kinds-0.058
 distress-0.057
 unison-0.057
Token upon
Token ID2402
Feature activation+1.65e-01

Pos logit contributions
at+0.174
affle+0.154
 seeker+0.153
atmeal+0.142
belt+0.137
Neg logit contributions
 ways-0.235
 kinds-0.231
 who-0.230
 its-0.228
 gestures-0.225
Token a
Token ID257
Feature activation+3.27e+00

Pos logit contributions
 seeker+0.003
at+0.003
affle+0.003
ained+0.003
ip+0.003
Neg logit contributions
 gestures-0.003
 compare-0.003
 sizes-0.003
 kinds-0.003
 appearances-0.003
Token time
Token ID640
Feature activation+5.72e+00

Pos logit contributions
at+0.056
 seeker+0.050
ip+0.049
affle+0.049
uch+0.043
Neg logit contributions
 gestures-0.077
 kinds-0.074
 ways-0.073
 compare-0.073
 distress-0.072
Token,
Token ID11
Feature activation+6.33e+00

Pos logit contributions
at+0.140
affle+0.129
 seeker+0.128
ip+0.127
bled+0.119
Neg logit contributions
 ways-0.087
 gestures-0.087
 kinds-0.085
 who-0.085
 distress-0.084
Token there
Token ID612
Feature activation+4.28e+00

Pos logit contributions
at+0.137
affle+0.125
ip+0.122
 seeker+0.120
atmeal+0.116
Neg logit contributions
 who-0.123
 distress-0.117
 its-0.116
 ways-0.116
 gestures-0.116
Token not
Token ID407
Feature activation+2.37e-01

Pos logit contributions
at+0.006
 seeker+0.006
ip+0.006
affle+0.006
bled+0.005
Neg logit contributions
 gestures-0.003
 ways-0.003
 kinds-0.003
 unison-0.003
 distress-0.003
Token be
Token ID307
Feature activation+2.60e+00

Pos logit contributions
at+0.005
 seeker+0.005
ip+0.005
 skating+0.004
ained+0.004
Neg logit contributions
 unison-0.004
 gestures-0.004
 kinds-0.004
 sentences-0.004
 appearances-0.004
Token selfish
Token ID20363
Feature activation-1.63e+00

Pos logit contributions
at+0.055
 seeker+0.052
ip+0.051
affle+0.049
ained+0.049
Neg logit contributions
 gestures-0.046
 kinds-0.045
 unison-0.045
 compare-0.044
 appearances-0.043
Token.
Token ID13
Feature activation+2.84e-01

Pos logit contributions
Maria+0.025
 unison+0.024
 Jessie+0.023
 Hope+0.023
 distress+0.023
Neg logit contributions
 seeker-0.037
ip-0.037
at-0.037
ained-0.036
ored-0.035
Token<|endoftext|>
Token ID50256
Feature activation+3.23e+00

Pos logit contributions
 seeker+0.007
at+0.007
ip+0.007
affle+0.006
ained+0.006
Neg logit contributions
 kinds-0.004
 gestures-0.004
 ways-0.004
 distress-0.004
 compare-0.004
TokenOnce
Token ID7454
Feature activation+1.02e+01

Pos logit contributions
at+0.083
ip+0.078
affle+0.078
 seeker+0.078
bled+0.075
Neg logit contributions
 kinds-0.043
 ways-0.043
 gestures-0.042
 distress-0.042
 who-0.042
Token upon
Token ID2402
Feature activation+6.52e-01

Pos logit contributions
at+0.207
affle+0.187
 seeker+0.183
atmeal+0.182
belt+0.173
Neg logit contributions
 its-0.208
 who-0.205
 ways-0.198
 kinds-0.196
 Amy-0.195
Token a
Token ID257
Feature activation+3.63e+00

Pos logit contributions
at+0.014
 seeker+0.014
affle+0.013
ip+0.013
bled+0.012
Neg logit contributions
 gestures-0.011
 compare-0.011
 kinds-0.011
 ways-0.011
 unison-0.011
Token time
Token ID640
Feature activation+6.60e+00

Pos logit contributions
at+0.064
 seeker+0.058
ip+0.057
affle+0.056
uch+0.049
Neg logit contributions
 gestures-0.084
 kinds-0.081
 ways-0.079
 distress-0.079
 compare-0.079
Token,
Token ID11
Feature activation+6.39e+00

Pos logit contributions
at+0.178
affle+0.167
ip+0.161
 seeker+0.159
belt+0.155
Neg logit contributions
 who-0.085
 ways-0.085
 Susan-0.080
 Amy-0.080
 its-0.079
Token a
Token ID257
Feature activation+6.36e-01

Pos logit contributions
at+0.165
affle+0.158
atmeal+0.153
ip+0.145
 seeker+0.140
Neg logit contributions
 who-0.114
 its-0.101
 whose-0.100
 Amy-0.090
 each-0.089
Token.
Token ID13
Feature activation-7.05e-02

Pos logit contributions
 distress+0.021
 unison+0.021
 gestures+0.020
 compare+0.020
 kinds+0.019
Neg logit contributions
at-0.038
ip-0.036
 seeker-0.035
or-0.034
ored-0.034
Token The
Token ID383
Feature activation+4.55e-01

Pos logit contributions
 kinds+0.001
 compare+0.001
 distress+0.001
 ways+0.001
 gestures+0.001
Neg logit contributions
 seeker-0.002
ip-0.002
ained-0.002
at-0.002
affle-0.001
Token end
Token ID886
Feature activation-1.72e+00

Pos logit contributions
at+0.010
ip+0.009
 seeker+0.009
affle+0.009
bled+0.008
Neg logit contributions
 gestures-0.009
 ways-0.008
 kinds-0.008
 distress-0.008
 unison-0.008
Token.
Token ID13
Feature activation+7.90e-01

Pos logit contributions
 kinds+0.024
 distress+0.024
 unison+0.024
 compare+0.024
 gestures+0.023
Neg logit contributions
at-0.042
ip-0.040
 seeker-0.039
affle-0.038
ained-0.038
Token<|endoftext|>
Token ID50256
Feature activation+2.05e+00

Pos logit contributions
 seeker+0.015
at+0.015
ip+0.015
ained+0.014
affle+0.014
Neg logit contributions
 kinds-0.015
 ways-0.015
 compare-0.014
 gestures-0.014
 distress-0.014
TokenOnce
Token ID7454
Feature activation+1.02e+01

Pos logit contributions
at+0.033
 seeker+0.029
affle+0.029
ip+0.029
bled+0.027
Neg logit contributions
 kinds-0.048
 gestures-0.048
 ways-0.048
 distress-0.047
 unison-0.046
Token upon
Token ID2402
Feature activation+2.67e-01

Pos logit contributions
at+0.176
 seeker+0.158
affle+0.158
atmeal+0.143
belt+0.140
Neg logit contributions
 ways-0.230
 kinds-0.229
 who-0.226
 its-0.224
 Amy-0.217
Token a
Token ID257
Feature activation+3.24e+00

Pos logit contributions
at+0.005
 seeker+0.005
affle+0.005
ip+0.005
ained+0.004
Neg logit contributions
 gestures-0.005
 compare-0.005
 kinds-0.005
 sizes-0.005
 ways-0.005
Token time
Token ID640
Feature activation+6.03e+00

Pos logit contributions
at+0.055
 seeker+0.050
ip+0.049
affle+0.049
uch+0.043
Neg logit contributions
 gestures-0.076
 kinds-0.073
 ways-0.072
 compare-0.072
 distress-0.071
Token,
Token ID11
Feature activation+6.25e+00

Pos logit contributions
at+0.154
affle+0.143
 seeker+0.141
ip+0.139
bled+0.133
Neg logit contributions
 ways-0.086
 who-0.085
 gestures-0.083
 kinds-0.082
 distress-0.081
Token in
Token ID287
Feature activation+2.02e+00

Pos logit contributions
at+0.134
affle+0.122
ip+0.121
 seeker+0.119
atmeal+0.112
Neg logit contributions
 who-0.122
 ways-0.116
 kinds-0.116
 its-0.116
 distress-0.115
Token lot
Token ID1256
Feature activation-5.40e-02

Pos logit contributions
 gestures+0.011
 unison+0.011
 ways+0.011
 compare+0.011
 kinds+0.011
Neg logit contributions
at-0.013
ip-0.011
affle-0.011
 seeker-0.011
bled-0.011
Token of
Token ID286
Feature activation-1.09e+00

Pos logit contributions
 distress+0.001
 gestures+0.001
 unison+0.001
 appearances+0.001
 compare+0.001
Neg logit contributions
at-0.001
ip-0.001
affle-0.001
 seeker-0.001
ored-0.001
Token fun
Token ID1257
Feature activation+4.78e-01

Pos logit contributions
 gestures+0.015
 ways+0.015
 kinds+0.014
 unison+0.014
 distress+0.014
Neg logit contributions
at-0.029
 seeker-0.027
ip-0.026
affle-0.026
bled-0.025
Token.
Token ID13
Feature activation+1.92e-01

Pos logit contributions
at+0.013
 seeker+0.012
ip+0.012
affle+0.012
 skating+0.012
Neg logit contributions
 gestures-0.006
 distress-0.006
 unison-0.006
 compare-0.006
 kinds-0.006
Token<|endoftext|>
Token ID50256
Feature activation+2.16e+00

Pos logit contributions
at+0.005
 seeker+0.005
ip+0.005
affle+0.005
ained+0.004
Neg logit contributions
 kinds-0.003
 gestures-0.003
 ways-0.003
 distress-0.002
 compare-0.002
TokenOnce
Token ID7454
Feature activation+1.01e+01

Pos logit contributions
at+0.034
affle+0.030
 seeker+0.030
ip+0.029
belt+0.029
Neg logit contributions
 kinds-0.050
 ways-0.050
 gestures-0.050
 distress-0.049
 who-0.049
Token upon
Token ID2402
Feature activation+1.73e-01

Pos logit contributions
at+0.173
affle+0.153
 seeker+0.151
atmeal+0.140
ip+0.138
Neg logit contributions
 ways-0.235
 kinds-0.232
 who-0.227
 its-0.224
 gestures-0.224
Token a
Token ID257
Feature activation+2.98e+00

Pos logit contributions
at+0.003
 seeker+0.003
affle+0.003
ip+0.003
ained+0.003
Neg logit contributions
 gestures-0.003
 compare-0.003
 sizes-0.003
 kinds-0.003
 appearances-0.003
Token time
Token ID640
Feature activation+5.67e+00

Pos logit contributions
at+0.050
 seeker+0.046
ip+0.045
affle+0.044
uch+0.039
Neg logit contributions
 gestures-0.070
 kinds-0.068
 ways-0.067
 compare-0.066
 distress-0.066
Token,
Token ID11
Feature activation+6.28e+00

Pos logit contributions
at+0.148
affle+0.139
ip+0.134
 seeker+0.133
bled+0.128
Neg logit contributions
 who-0.079
 ways-0.078
 gestures-0.075
 kinds-0.074
 distress-0.074
Token a
Token ID257
Feature activation+4.01e-01

Pos logit contributions
at+0.129
affle+0.117
ip+0.116
 seeker+0.115
atmeal+0.106
Neg logit contributions
 who-0.124
 ways-0.121
 kinds-0.120
 distress-0.120
 gestures-0.120
Token happily
Token ID18177
Feature activation+9.15e-01

Pos logit contributions
at+0.078
affle+0.072
ip+0.071
 seeker+0.070
uch+0.066
Neg logit contributions
 ways-0.036
 gestures-0.036
 kinds-0.035
 who-0.034
 distress-0.033
Token ever
Token ID1683
Feature activation+3.02e+00

Pos logit contributions
at+0.022
ip+0.021
 seeker+0.020
ored+0.019
affle+0.019
Neg logit contributions
 gestures-0.015
 distress-0.015
 unison-0.014
 ways-0.014
 kinds-0.014
Token after
Token ID706
Feature activation-1.34e+00

Pos logit contributions
at+0.062
 seeker+0.059
ip+0.057
affle+0.055
ored+0.053
Neg logit contributions
 gestures-0.057
 kinds-0.056
 ways-0.056
 unison-0.055
 distress-0.055
Token.
Token ID13
Feature activation+5.33e-01

Pos logit contributions
 gestures+0.015
 kinds+0.014
 distress+0.014
 ways+0.014
 unison+0.014
Neg logit contributions
at-0.038
 seeker-0.037
ip-0.036
affle-0.036
bled-0.035
Token<|endoftext|>
Token ID50256
Feature activation+2.40e+00

Pos logit contributions
at+0.013
 seeker+0.012
ip+0.012
affle+0.012
ained+0.011
Neg logit contributions
 gestures-0.008
 kinds-0.008
 ways-0.008
 distress-0.008
 compare-0.008
TokenOnce
Token ID7454
Feature activation+1.01e+01

Pos logit contributions
at+0.059
 seeker+0.055
ip+0.054
affle+0.054
bled+0.051
Neg logit contributions
 gestures-0.036
 ways-0.036
 kinds-0.035
 unison-0.035
 distress-0.034
Token upon
Token ID2402
Feature activation+5.56e-01

Pos logit contributions
at+0.174
affle+0.153
 seeker+0.152
atmeal+0.140
belt+0.135
Neg logit contributions
 ways-0.234
 who-0.230
 kinds-0.229
 its-0.225
 gestures-0.222
Token a
Token ID257
Feature activation+3.54e+00

Pos logit contributions
at+0.011
 seeker+0.011
affle+0.010
ained+0.010
ip+0.010
Neg logit contributions
 gestures-0.011
 compare-0.010
 kinds-0.010
 appearances-0.010
 sizes-0.010
Token time
Token ID640
Feature activation+6.06e+00

Pos logit contributions
at+0.060
 seeker+0.054
ip+0.054
affle+0.053
uch+0.047
Neg logit contributions
 gestures-0.084
 kinds-0.081
 ways-0.079
 compare-0.079
 distress-0.079
Token,
Token ID11
Feature activation+6.28e+00

Pos logit contributions
at+0.156
affle+0.144
 seeker+0.140
ip+0.140
belt+0.132
Neg logit contributions
 ways-0.088
 who-0.086
 gestures-0.084
 kinds-0.083
 distress-0.082
Token there
Token ID612
Feature activation+4.43e+00

Pos logit contributions
at+0.147
affle+0.135
ip+0.131
 seeker+0.127
atmeal+0.127
Neg logit contributions
 who-0.118
 its-0.109
 whose-0.107
 ways-0.104
 distress-0.104
Token the
Token ID262
Feature activation+9.31e-02

Pos logit contributions
at+0.043
 seeker+0.041
ip+0.041
affle+0.038
ained+0.037
Neg logit contributions
 gestures-0.032
 compare-0.031
 appearances-0.031
 distress-0.031
 kinds-0.030
Token sparkling
Token ID39072
Feature activation-5.76e-02

Pos logit contributions
at+0.002
ip+0.002
affle+0.001
 seeker+0.001
bled+0.001
Neg logit contributions
 ways-0.002
 gestures-0.002
 kinds-0.002
 unison-0.002
 distress-0.002
Token stone
Token ID7815
Feature activation+8.35e-01

Pos logit contributions
 ways+0.001
 kinds+0.001
 gestures+0.001
 distress+0.001
 unison+0.001
Neg logit contributions
at-0.001
ip-0.001
 seeker-0.001
affle-0.001
bled-0.001
Token.
Token ID13
Feature activation+6.83e-01

Pos logit contributions
at+0.020
ip+0.019
 seeker+0.018
affle+0.018
ig+0.017
Neg logit contributions
 gestures-0.014
 unison-0.013
 distress-0.013
 compare-0.013
 sentences-0.013
Token<|endoftext|>
Token ID50256
Feature activation+2.08e+00

Pos logit contributions
at+0.017
 seeker+0.016
ip+0.016
affle+0.015
ained+0.015
Neg logit contributions
 gestures-0.010
 kinds-0.010
 distress-0.010
 ways-0.010
 compare-0.010
TokenOnce
Token ID7454
Feature activation+1.01e+01

Pos logit contributions
at+0.036
affle+0.033
 seeker+0.033
ip+0.032
bled+0.031
Neg logit contributions
 kinds-0.046
 ways-0.045
 gestures-0.045
 distress-0.045
 who-0.045
Token upon
Token ID2402
Feature activation+4.31e-01

Pos logit contributions
at+0.168
affle+0.149
 seeker+0.148
atmeal+0.135
ip+0.132
Neg logit contributions
 ways-0.238
 who-0.235
 kinds-0.234
 its-0.229
 gestures-0.227
Token a
Token ID257
Feature activation+3.04e+00

Pos logit contributions
at+0.009
 seeker+0.009
affle+0.008
ip+0.008
ained+0.008
Neg logit contributions
 gestures-0.008
 compare-0.008
 kinds-0.008
 appearances-0.008
 ways-0.007
Token time
Token ID640
Feature activation+5.94e+00

Pos logit contributions
at+0.053
 seeker+0.048
ip+0.047
affle+0.046
uch+0.041
Neg logit contributions
 gestures-0.070
 kinds-0.068
 ways-0.067
 compare-0.066
 distress-0.066
Token there
Token ID612
Feature activation+4.08e+00

Pos logit contributions
at+0.159
affle+0.148
ip+0.143
 seeker+0.142
atmeal+0.138
Neg logit contributions
 who-0.081
 ways-0.078
 Susan-0.076
 gestures-0.076
 distress-0.075
Token was
Token ID373
Feature activation+8.55e-01

Pos logit contributions
at+0.097
 seeker+0.090
ip+0.089
affle+0.088
bled+0.083
Neg logit contributions
 gestures-0.065
 ways-0.064
 kinds-0.064
 distress-0.062
 unison-0.062
Token and
Token ID290
Feature activation+1.56e+00

Pos logit contributions
at+0.021
ip+0.020
 seeker+0.020
belt+0.018
affle+0.018
Neg logit contributions
 distress-0.014
 unison-0.014
 gestures-0.014
 appearances-0.013
 compare-0.013
Token her
Token ID607
Feature activation+2.40e+00

Pos logit contributions
 seeker+0.034
ip+0.033
at+0.033
affle+0.030
ained+0.030
Neg logit contributions
 unison-0.026
 distress-0.025
ever-0.025
 gestures-0.024
lier-0.023
Token friend
Token ID1545
Feature activation-2.29e+00

Pos logit contributions
at+0.050
 seeker+0.046
ip+0.046
affle+0.045
bled+0.042
Neg logit contributions
 gestures-0.046
 ways-0.045
 kinds-0.045
 unison-0.044
 distress-0.044
Token.
Token ID13
Feature activation+2.78e-01

Pos logit contributions
 distress+0.028
 appearances+0.028
 gestures+0.028
 unison+0.027
 compare+0.026
Neg logit contributions
at-0.060
ip-0.060
 seeker-0.058
affle-0.055
ored-0.053
Token<|endoftext|>
Token ID50256
Feature activation+2.27e+00

Pos logit contributions
 seeker+0.006
ip+0.006
at+0.006
affle+0.005
ained+0.005
Neg logit contributions
 kinds-0.005
 gestures-0.005
 distress-0.005
 compare-0.005
 harm-0.005
TokenOnce
Token ID7454
Feature activation+1.01e+01

Pos logit contributions
at+0.036
 seeker+0.032
affle+0.032
ip+0.032
bled+0.030
Neg logit contributions
 ways-0.053
 gestures-0.053
 kinds-0.053
 distress-0.052
 unison-0.051
Token upon
Token ID2402
Feature activation+2.26e-01

Pos logit contributions
at+0.168
affle+0.148
 seeker+0.145
ip+0.133
atmeal+0.133
Neg logit contributions
 ways-0.237
 kinds-0.234
 who-0.229
 gestures-0.226
 its-0.226
Token a
Token ID257
Feature activation+3.04e+00

Pos logit contributions
at+0.004
 seeker+0.004
affle+0.004
ained+0.004
ip+0.004
Neg logit contributions
 gestures-0.004
 compare-0.004
 appearances-0.004
 sizes-0.004
 kinds-0.004
Token time
Token ID640
Feature activation+5.65e+00

Pos logit contributions
at+0.051
 seeker+0.046
ip+0.045
affle+0.044
uch+0.039
Neg logit contributions
 gestures-0.073
 kinds-0.070
 compare-0.069
 ways-0.069
 distress-0.068
Token,
Token ID11
Feature activation+5.96e+00

Pos logit contributions
at+0.141
affle+0.131
 seeker+0.128
ip+0.128
bled+0.121
Neg logit contributions
 ways-0.084
 gestures-0.082
 who-0.082
 kinds-0.081
 distress-0.080
Token there
Token ID612
Feature activation+4.00e+00

Pos logit contributions
at+0.122
affle+0.110
ip+0.109
 seeker+0.108
atmeal+0.100
Neg logit contributions
 who-0.118
 ways-0.116
 gestures-0.116
 kinds-0.115
 distress-0.115
Token a
Token ID257
Feature activation-1.89e+00

Pos logit contributions
 kinds+0.045
 gestures+0.044
 unison+0.044
 sizes+0.043
 compare+0.043
Neg logit contributions
at-0.062
 seeker-0.060
ip-0.059
affle-0.057
bled-0.056
Token hard
Token ID1327
Feature activation+2.43e+00

Pos logit contributions
 gestures+0.037
 ways+0.036
 kinds+0.035
 unison+0.035
 compare+0.035
Neg logit contributions
at-0.039
 seeker-0.036
ip-0.035
affle-0.035
bled-0.033
Token lesson
Token ID11483
Feature activation-1.52e+00

Pos logit contributions
at+0.057
 seeker+0.054
affle+0.052
ip+0.052
bled+0.051
Neg logit contributions
 gestures-0.041
 kinds-0.039
 meaning-0.039
 words-0.039
 compare-0.038
Token.
Token ID13
Feature activation+9.31e-02

Pos logit contributions
 gestures+0.019
 kinds+0.018
 ways+0.018
 distress+0.018
 unison+0.018
Neg logit contributions
at-0.041
 seeker-0.039
ip-0.038
affle-0.037
bled-0.036
Token<|endoftext|>
Token ID50256
Feature activation+2.54e+00

Pos logit contributions
 seeker+0.002
ained+0.002
at+0.002
ip+0.002
affle+0.002
Neg logit contributions
 kinds-0.001
 compare-0.001
 ways-0.001
 gestures-0.001
 brace-0.001
TokenOnce
Token ID7454
Feature activation+1.01e+01

Pos logit contributions
at+0.042
 seeker+0.038
affle+0.038
ip+0.037
bled+0.036
Neg logit contributions
 kinds-0.057
 ways-0.057
 gestures-0.056
 distress-0.056
 who-0.056
Token upon
Token ID2402
Feature activation+2.51e-01

Pos logit contributions
at+0.173
 seeker+0.155
affle+0.153
atmeal+0.139
ip+0.139
Neg logit contributions
 ways-0.229
 kinds-0.227
 who-0.224
 gestures-0.219
 its-0.219
Token a
Token ID257
Feature activation+3.28e+00

Pos logit contributions
at+0.005
 seeker+0.005
affle+0.004
ip+0.004
ained+0.004
Neg logit contributions
 gestures-0.005
 compare-0.005
 kinds-0.005
 sizes-0.005
 ways-0.005
Token time
Token ID640
Feature activation+5.92e+00

Pos logit contributions
at+0.056
 seeker+0.051
ip+0.050
affle+0.049
uch+0.043
Neg logit contributions
 gestures-0.077
 kinds-0.074
 ways-0.073
 compare-0.073
 distress-0.072
Token,
Token ID11
Feature activation+6.20e+00

Pos logit contributions
at+0.155
affle+0.145
 seeker+0.141
ip+0.140
bled+0.134
Neg logit contributions
 who-0.081
 ways-0.081
 gestures-0.078
 kinds-0.076
 distress-0.076
Token there
Token ID612
Feature activation+4.14e+00

Pos logit contributions
at+0.125
affle+0.112
ip+0.112
 seeker+0.111
atmeal+0.102
Neg logit contributions
 who-0.124
 kinds-0.123
 gestures-0.122
 ways-0.122
 distress-0.121
Token to
Token ID284
Feature activation-8.51e-01

Pos logit contributions
at+0.004
 seeker+0.004
ip+0.004
affle+0.004
bled+0.004
Neg logit contributions
 gestures-0.002
 ways-0.002
 kinds-0.002
 unison-0.002
 compare-0.002
Token their
Token ID511
Feature activation+1.53e+00

Pos logit contributions
 distress+0.010
 gestures+0.010
 unison+0.010
 kinds+0.010
lier+0.009
Neg logit contributions
at-0.022
 seeker-0.021
ip-0.021
affle-0.021
bled-0.020
Token family
Token ID1641
Feature activation-1.53e+00

Pos logit contributions
at+0.033
 seeker+0.032
ip+0.031
affle+0.031
bled+0.029
Neg logit contributions
 kinds-0.026
 gestures-0.026
 unison-0.025
 ways-0.025
 distress-0.025
Token.
Token ID13
Feature activation-2.48e-02

Pos logit contributions
 gestures+0.020
 distress+0.020
 kinds+0.019
Maria+0.019
 unison+0.019
Neg logit contributions
at-0.039
ip-0.038
 seeker-0.036
or-0.036
affle-0.035
Token<|endoftext|>
Token ID50256
Feature activation+2.57e+00

Pos logit contributions
 kinds+0.000
 ways+0.000
 distress+0.000
 compare+0.000
 gestures+0.000
Neg logit contributions
 seeker-0.001
ip-0.001
at-0.001
ained-0.001
affle-0.001
TokenOnce
Token ID7454
Feature activation+1.01e+01

Pos logit contributions
at+0.055
 seeker+0.050
affle+0.050
ip+0.049
bled+0.047
Neg logit contributions
 ways-0.046
 kinds-0.046
 gestures-0.046
 distress-0.045
 who-0.045
Token upon
Token ID2402
Feature activation+6.03e-01

Pos logit contributions
at+0.176
 seeker+0.156
affle+0.155
atmeal+0.145
ip+0.137
Neg logit contributions
 ways-0.230
 who-0.228
 kinds-0.226
 its-0.224
 gestures-0.216
Token a
Token ID257
Feature activation+3.29e+00

Pos logit contributions
at+0.012
 seeker+0.012
affle+0.011
ip+0.011
ained+0.011
Neg logit contributions
 gestures-0.011
 compare-0.011
 kinds-0.011
 ways-0.010
 sizes-0.010
Token time
Token ID640
Feature activation+6.23e+00

Pos logit contributions
at+0.056
 seeker+0.051
ip+0.051
affle+0.050
uch+0.044
Neg logit contributions
 gestures-0.077
 kinds-0.075
 ways-0.073
 compare-0.073
 distress-0.072
Token,
Token ID11
Feature activation+6.20e+00

Pos logit contributions
at+0.162
affle+0.151
ip+0.146
 seeker+0.146
atmeal+0.138
Neg logit contributions
 ways-0.087
 who-0.086
 gestures-0.084
 distress-0.082
 kinds-0.081
Token there
Token ID612
Feature activation+4.31e+00

Pos logit contributions
at+0.160
affle+0.150
atmeal+0.147
ip+0.140
 seeker+0.136
Neg logit contributions
 who-0.110
 its-0.098
 whose-0.097
 Amy-0.089
 shapes-0.087
Token will
Token ID481
Feature activation+1.05e+00

Pos logit contributions
at+0.037
ip+0.035
 seeker+0.034
affle+0.033
ored+0.031
Neg logit contributions
 gestures-0.039
 unison-0.038
 kinds-0.038
 ways-0.038
 distress-0.038
Token be
Token ID307
Feature activation+2.89e+00

Pos logit contributions
 seeker+0.024
at+0.023
ip+0.022
affle+0.021
uch+0.021
Neg logit contributions
 kinds-0.018
 gestures-0.017
 unison-0.016
 ways-0.016
 sentences-0.015
Token sad
Token ID6507
Feature activation-1.90e+00

Pos logit contributions
at+0.051
 seeker+0.050
ip+0.050
bled+0.049
affle+0.047
Neg logit contributions
 unison-0.056
 kinds-0.056
 gestures-0.056
 compare-0.054
 appearances-0.054
Token.
Token ID13
Feature activation+4.00e-01

Pos logit contributions
 unison+0.029
 kinds+0.028
 distress+0.028
 gestures+0.028
Susan+0.027
Neg logit contributions
 seeker-0.043
at-0.042
ip-0.042
affle-0.040
ained-0.040
Token<|endoftext|>
Token ID50256
Feature activation+2.74e+00

Pos logit contributions
 seeker+0.008
ip+0.008
ained+0.007
at+0.007
ugg+0.007
Neg logit contributions
 kinds-0.008
 gestures-0.007
 compare-0.007
 unison-0.007
 ways-0.007
TokenOnce
Token ID7454
Feature activation+1.00e+01

Pos logit contributions
at+0.073
 seeker+0.069
ip+0.068
affle+0.067
bled+0.064
Neg logit contributions
 gestures-0.036
 ways-0.035
 kinds-0.035
 distress-0.034
 unison-0.034
Token upon
Token ID2402
Feature activation+5.74e-01

Pos logit contributions
at+0.164
 seeker+0.146
affle+0.143
atmeal+0.128
ip+0.128
Neg logit contributions
 ways-0.239
 kinds-0.235
 who-0.233
 gestures-0.229
 its-0.228
Token a
Token ID257
Feature activation+3.59e+00

Pos logit contributions
at+0.011
 seeker+0.011
affle+0.010
ip+0.010
ained+0.010
Neg logit contributions
 gestures-0.011
 compare-0.011
 kinds-0.010
 sizes-0.010
 appearances-0.010
Token time
Token ID640
Feature activation+6.04e+00

Pos logit contributions
at+0.061
 seeker+0.055
ip+0.055
affle+0.054
uch+0.048
Neg logit contributions
 gestures-0.085
 kinds-0.082
 distress-0.080
 ways-0.080
 compare-0.080
Token,
Token ID11
Feature activation+6.09e+00

Pos logit contributions
at+0.145
affle+0.133
 seeker+0.132
ip+0.130
bled+0.123
Neg logit contributions
 ways-0.097
 gestures-0.095
 kinds-0.094
 distress-0.091
 who-0.091
Token there
Token ID612
Feature activation+4.21e+00

Pos logit contributions
at+0.151
affle+0.140
ip+0.133
atmeal+0.133
 seeker+0.128
Neg logit contributions
 who-0.112
 its-0.102
 whose-0.099
 Amy-0.093
 distress-0.093
Token are
Token ID389
Feature activation+7.18e-01

Pos logit contributions
at+0.055
 seeker+0.050
ip+0.050
affle+0.050
bled+0.047
Neg logit contributions
 gestures-0.038
 ways-0.037
 unison-0.037
 kinds-0.037
 distress-0.036
Token not
Token ID407
Feature activation-3.20e-02

Pos logit contributions
at+0.015
 seeker+0.014
ip+0.014
affle+0.014
bled+0.013
Neg logit contributions
 gestures-0.013
 kinds-0.013
 unison-0.013
 ways-0.013
 distress-0.013
Token perfect
Token ID2818
Feature activation+3.93e-01

Pos logit contributions
 unison+0.001
Maria+0.001
 gestures+0.001
 distress+0.001
 sentences+0.001
Neg logit contributions
at-0.001
 seeker-0.001
ained-0.001
affle-0.001
ip-0.001
Token.
Token ID13
Feature activation+5.43e-01

Pos logit contributions
at+0.010
 seeker+0.010
ip+0.010
affle+0.009
bled+0.009
Neg logit contributions
 gestures-0.005
 unison-0.005
 kinds-0.005
 distress-0.005
 compare-0.005
Token<|endoftext|>
Token ID50256
Feature activation+2.18e+00

Pos logit contributions
at+0.011
ip+0.011
 seeker+0.011
ained+0.010
affle+0.010
Neg logit contributions
 distress-0.010
 gestures-0.009
 compare-0.009
ymes-0.009
 ways-0.009
TokenOnce
Token ID7454
Feature activation+1.00e+01

Pos logit contributions
at+0.057
 seeker+0.054
ip+0.053
affle+0.053
bled+0.050
Neg logit contributions
 gestures-0.029
 ways-0.029
 kinds-0.028
 distress-0.028
 unison-0.028
Token upon
Token ID2402
Feature activation+5.46e-01

Pos logit contributions
at+0.169
 seeker+0.150
affle+0.148
atmeal+0.136
ip+0.134
Neg logit contributions
 ways-0.235
 kinds-0.232
 who-0.229
 its-0.225
 gestures-0.224
Token a
Token ID257
Feature activation+3.18e+00

Pos logit contributions
at+0.010
 seeker+0.010
affle+0.009
ained+0.009
ip+0.009
Neg logit contributions
 gestures-0.011
 compare-0.011
 appearances-0.010
 sizes-0.010
 kinds-0.010
Token time
Token ID640
Feature activation+5.96e+00

Pos logit contributions
at+0.055
 seeker+0.050
ip+0.049
affle+0.049
uch+0.043
Neg logit contributions
 gestures-0.074
 kinds-0.071
 ways-0.070
 compare-0.070
 distress-0.069
Token,
Token ID11
Feature activation+6.17e+00

Pos logit contributions
at+0.150
affle+0.138
 seeker+0.136
ip+0.135
bled+0.127
Neg logit contributions
 ways-0.089
 gestures-0.087
 who-0.087
 kinds-0.085
 distress-0.085
Token there
Token ID612
Feature activation+4.09e+00

Pos logit contributions
at+0.146
affle+0.134
ip+0.130
atmeal+0.128
 seeker+0.126
Neg logit contributions
 who-0.116
 its-0.106
 whose-0.104
 distress-0.101
 ways-0.100
Token love
Token ID1842
Feature activation-1.66e+00

Pos logit contributions
 seeker+0.019
at+0.018
ip+0.017
affle+0.016
uch+0.016
Neg logit contributions
 kinds-0.015
 gestures-0.014
 ways-0.014
 unison-0.013
ymes-0.013
Token her
Token ID607
Feature activation-9.15e-02

Pos logit contributions
 gestures+0.025
 unison+0.025
 distress+0.024
 kinds+0.024
 sentences+0.023
Neg logit contributions
at-0.040
ip-0.037
 seeker-0.037
affle-0.036
bled-0.034
Token anyway
Token ID6949
Feature activation+4.03e-02

Pos logit contributions
 gestures+0.001
 unison+0.001
 kinds+0.001
 distress+0.001
 compare+0.001
Neg logit contributions
at-0.002
 seeker-0.002
ip-0.002
affle-0.002
belt-0.002
Token.
Token ID13
Feature activation+3.17e-01

Pos logit contributions
at+0.001
ip+0.001
 seeker+0.001
belt+0.001
or+0.001
Neg logit contributions
 distress-0.001
 unison-0.001
 gestures-0.001
 compare-0.001
 appearances-0.001
Token<|endoftext|>
Token ID50256
Feature activation+2.28e+00

Pos logit contributions
 seeker+0.007
at+0.007
ip+0.007
affle+0.007
ained+0.007
Neg logit contributions
 kinds-0.005
 gestures-0.005
 unison-0.005
 compare-0.005
 ways-0.005
TokenOnce
Token ID7454
Feature activation+1.00e+01

Pos logit contributions
at+0.033
 seeker+0.030
affle+0.029
ip+0.029
belt+0.027
Neg logit contributions
 ways-0.056
 kinds-0.056
 gestures-0.056
 distress-0.055
 who-0.055
Token there
Token ID612
Feature activation+4.89e+00

Pos logit contributions
at+0.169
affle+0.150
 seeker+0.149
atmeal+0.135
ip+0.134
Neg logit contributions
 ways-0.234
 kinds-0.230
 who-0.227
 its-0.222
 gestures-0.220
Token was
Token ID373
Feature activation+1.59e+00

Pos logit contributions
at+0.118
 seeker+0.110
ip+0.108
affle+0.108
bled+0.101
Neg logit contributions
 gestures-0.077
 ways-0.076
 kinds-0.076
 distress-0.073
 compare-0.073
Token a
Token ID257
Feature activation-1.26e+00

Pos logit contributions
at+0.036
 seeker+0.035
ip+0.033
affle+0.033
ained+0.032
Neg logit contributions
 gestures-0.028
 compare-0.026
 unison-0.026
 sentences-0.025
 ways-0.025
Token girl
Token ID2576
Feature activation-2.49e+00

Pos logit contributions
 gestures+0.017
 ways+0.017
 kinds+0.016
 who+0.016
 compare+0.016
Neg logit contributions
at-0.033
 seeker-0.030
ip-0.030
affle-0.030
bled-0.029
Token.
Token ID13
Feature activation+8.10e-01

Pos logit contributions
 gestures+0.060
 sentences+0.058
 unison+0.057
 kinds+0.056
 sizes+0.056
Neg logit contributions
at-0.040
ip-0.038
 seeker-0.037
ig-0.034
affle-0.031
Token and
Token ID290
Feature activation-4.01e-01

Pos logit contributions
 gestures+0.076
 compare+0.074
 distress+0.074
 appearances+0.073
 kinds+0.072
Neg logit contributions
at-0.101
ip-0.097
 seeker-0.093
affle-0.091
ored-0.086
Token sister
Token ID6621
Feature activation-5.47e+00

Pos logit contributions
 gestures+0.006
 ways+0.006
 distress+0.006
 kinds+0.006
 unison+0.006
Neg logit contributions
at-0.010
 seeker-0.009
ip-0.009
affle-0.009
bled-0.008
Token.
Token ID13
Feature activation-2.32e-01

Pos logit contributions
 gestures+0.077
 distress+0.074
 appearances+0.072
 kinds+0.071
ymes+0.070
Neg logit contributions
at-0.135
ip-0.131
 seeker-0.129
affle-0.123
ored-0.121
Token Sam
Token ID3409
Feature activation-2.83e+00

Pos logit contributions
ymes+0.004
 appearances+0.003
 brace+0.003
 gestures+0.003
 distress+0.003
Neg logit contributions
at-0.005
 seeker-0.005
ained-0.005
affle-0.005
ding-0.005
Token was
Token ID373
Feature activation-2.80e-01

Pos logit contributions
 gestures+0.055
 ways+0.053
 unison+0.053
 distress+0.052
 appearances+0.052
Neg logit contributions
at-0.059
ip-0.053
 seeker-0.052
affle-0.052
ored-0.048
Token three
Token ID1115
Feature activation-6.82e+00

Pos logit contributions
 gestures+0.004
 ways+0.004
 kinds+0.004
 distress+0.004
 compare+0.004
Neg logit contributions
at-0.007
 seeker-0.006
ip-0.006
affle-0.006
bled-0.006
Token and
Token ID290
Feature activation-2.22e+00

Pos logit contributions
 Hope+0.104
 appearances+0.100
 gestures+0.099
 compare+0.097
 target+0.096
Neg logit contributions
at-0.161
ip-0.156
affle-0.151
 skating-0.146
ig-0.141
Token John
Token ID1757
Feature activation-3.19e+00

Pos logit contributions
 gestures+0.033
 kinds+0.032
 ways+0.032
 unison+0.031
 compare+0.031
Neg logit contributions
at-0.056
 seeker-0.052
ip-0.051
affle-0.051
bled-0.047
Token was
Token ID373
Feature activation-2.04e-01

Pos logit contributions
 gestures+0.064
 appearances+0.062
 sentences+0.061
 ways+0.061
 unison+0.060
Neg logit contributions
at-0.064
ip-0.059
affle-0.055
ored-0.054
 skating-0.053
Token five
Token ID1936
Feature activation-4.77e+00

Pos logit contributions
 gestures+0.004
 ways+0.003
 kinds+0.003
 distress+0.003
 unison+0.003
Neg logit contributions
at-0.005
 seeker-0.004
ip-0.004
affle-0.004
bled-0.004
Token.
Token ID13
Feature activation-9.90e-01

Pos logit contributions
 Hope+0.076
 appearances+0.074
 Susan+0.073
 gestures+0.072
 distress+0.072
Neg logit contributions
ip-0.109
at-0.108
affle-0.102
ig-0.101
 skating-0.101
Token named
Token ID3706
Feature activation+1.68e+00

Pos logit contributions
 gestures+0.084
 sentences+0.083
 sizes+0.079
 unison+0.079
 distress+0.078
Neg logit contributions
at-0.042
ip-0.041
 seeker-0.039
ig-0.037
or-0.031
Token Tess
Token ID39412
Feature activation-7.76e-01

Pos logit contributions
at+0.042
 seeker+0.039
ip+0.038
affle+0.038
bled+0.036
Neg logit contributions
 gestures-0.026
 ways-0.025
 kinds-0.025
 unison-0.024
 distress-0.024
Token.
Token ID13
Feature activation-9.03e-02

Pos logit contributions
 gestures+0.015
 sentences+0.014
 unison+0.014
 distress+0.013
 harmony+0.013
Neg logit contributions
at-0.016
ip-0.015
or-0.014
 seeker-0.014
ig-0.014
Token Tess
Token ID39412
Feature activation-1.19e+00

Pos logit contributions
 gestures+0.001
 ways+0.001
 kinds+0.001
 unison+0.001
 compare+0.001
Neg logit contributions
at-0.003
affle-0.002
 seeker-0.002
ip-0.002
bled-0.002
Token was
Token ID373
Feature activation+8.28e-02

Pos logit contributions
 gestures+0.028
 unison+0.027
 sentences+0.026
 compare+0.026
 distress+0.026
Neg logit contributions
at-0.021
ip-0.018
 seeker-0.017
affle-0.017
ored-0.016
Token three
Token ID1115
Feature activation-6.71e+00

Pos logit contributions
at+0.002
ip+0.002
affle+0.002
 seeker+0.002
bled+0.002
Neg logit contributions
 gestures-0.001
 ways-0.001
 kinds-0.001
 distress-0.001
 who-0.001
Token years
Token ID812
Feature activation-4.54e+00

Pos logit contributions
 gestures+0.093
 unison+0.089
 compare+0.088
 appearances+0.088
 target+0.085
Neg logit contributions
at-0.175
ip-0.168
affle-0.163
ig-0.151
 seeker-0.149
Token old
Token ID1468
Feature activation-5.07e+00

Pos logit contributions
 gestures+0.063
 ways+0.060
 kinds+0.060
 unison+0.059
 distress+0.058
Neg logit contributions
at-0.119
 seeker-0.111
ip-0.110
affle-0.109
bled-0.103
Token and
Token ID290
Feature activation-1.64e+00

Pos logit contributions
 gestures+0.100
 compare+0.098
 appearances+0.098
 sentences+0.096
 unison+0.096
Neg logit contributions
 seeker-0.093
ip-0.091
at-0.084
ored-0.081
affle-0.077
Token loved
Token ID6151
Feature activation-2.97e+00

Pos logit contributions
 gestures+0.029
 ways+0.028
 kinds+0.028
 unison+0.028
 distress+0.027
Neg logit contributions
at-0.037
 seeker-0.034
ip-0.034
affle-0.033
bled-0.031
Token playing
Token ID2712
Feature activation-2.51e-01

Pos logit contributions
 unison+0.055
 gestures+0.051
 distress+0.048
 appearances+0.047
 compare+0.047
Neg logit contributions
at-0.063
ip-0.061
 seeker-0.060
 skating-0.058
affle-0.058
Token came
Token ID1625
Feature activation+2.27e+00

Pos logit contributions
 ways+0.041
 kinds+0.040
 gestures+0.039
 sizes+0.039
 appearances+0.038
Neg logit contributions
at-0.035
ip-0.031
 seeker-0.029
ored-0.029
affle-0.029
Token up
Token ID510
Feature activation-9.17e-01

Pos logit contributions
at+0.039
 seeker+0.034
 skating+0.033
ip+0.033
or+0.032
Neg logit contributions
 unison-0.049
 gestures-0.049
 sizes-0.048
 compare-0.048
 sentences-0.048
Token to
Token ID284
Feature activation-2.34e+00

Pos logit contributions
 sizes+0.018
 appearances+0.018
 ways+0.017
 gestures+0.017
 feelings+0.017
Neg logit contributions
at-0.017
 seeker-0.017
ip-0.016
ored-0.016
 skating-0.016
Token them
Token ID606
Feature activation-1.77e-01

Pos logit contributions
 sizes+0.046
 gestures+0.045
 appearances+0.045
 kinds+0.043
 ways+0.042
Neg logit contributions
at-0.045
 seeker-0.043
ored-0.041
affle-0.041
ip-0.039
Token and
Token ID290
Feature activation+1.10e+00

Pos logit contributions
 gestures+0.003
 kinds+0.003
 unison+0.003
 ways+0.003
 appearances+0.003
Neg logit contributions
at-0.004
ip-0.004
ored-0.003
affle-0.003
 seeker-0.003
Token accused
Token ID5371
Feature activation-6.70e+00

Pos logit contributions
at+0.029
ip+0.027
 seeker+0.027
affle+0.027
bled+0.025
Neg logit contributions
 gestures-0.015
 ways-0.014
 kinds-0.014
 unison-0.014
 distress-0.014
Token them
Token ID606
Feature activation-1.65e+00

Pos logit contributions
 sizes+0.137
 meaning+0.132
 kinds+0.131
 type+0.131
 compare+0.129
Neg logit contributions
at-0.114
ip-0.112
bled-0.109
ored-0.108
or-0.107
Token of
Token ID286
Feature activation-4.79e-01

Pos logit contributions
 unison+0.027
ymes+0.026
 harmony+0.025
 gestures+0.025
 sizes+0.025
Neg logit contributions
at-0.036
 seeker-0.033
ip-0.033
affle-0.033
ored-0.032
Token taking
Token ID2263
Feature activation+1.65e+00

Pos logit contributions
 gestures+0.007
 kinds+0.007
 unison+0.007
 ways+0.007
 compare+0.007
Neg logit contributions
at-0.011
 seeker-0.010
affle-0.010
ip-0.010
 skating-0.010
Token his
Token ID465
Feature activation+1.55e+00

Pos logit contributions
at+0.039
ip+0.037
 seeker+0.036
affle+0.036
ored+0.034
Neg logit contributions
 unison-0.024
 gestures-0.024
 compare-0.024
 kinds-0.024
 distress-0.023
Token mint
Token ID21043
Feature activation+2.41e+00

Pos logit contributions
at+0.034
ip+0.032
 seeker+0.031
affle+0.031
ored+0.029
Neg logit contributions
 gestures-0.027
 unison-0.027
 kinds-0.026
 ways-0.026
 compare-0.026
Token called
Token ID1444
Feature activation+6.10e-02

Pos logit contributions
 ways+0.072
 distress+0.072
 gestures+0.071
 appearances+0.070
 unison+0.069
Neg logit contributions
at-0.129
ip-0.125
ored-0.112
affle-0.111
ipop-0.110
Token Grace
Token ID16156
Feature activation-2.18e+00

Pos logit contributions
at+0.001
 seeker+0.001
ip+0.001
affle+0.001
or+0.001
Neg logit contributions
 gestures-0.001
 unison-0.001
 ways-0.001
 distress-0.001
 compare-0.001
Token.
Token ID13
Feature activation-7.77e-02

Pos logit contributions
 distress+0.028
 gestures+0.028
 ways+0.027
 compare+0.027
 sentences+0.027
Neg logit contributions
at-0.057
ip-0.056
 seeker-0.054
ored-0.051
affle-0.051
Token Sam
Token ID3409
Feature activation-2.08e+00

Pos logit contributions
 distress+0.001
 gestures+0.001
 ways+0.001
 compare+0.001
 appearances+0.001
Neg logit contributions
at-0.002
 seeker-0.002
ip-0.002
affle-0.002
ained-0.002
Token was
Token ID373
Feature activation-3.21e-01

Pos logit contributions
 gestures+0.036
 ways+0.035
 unison+0.035
 kinds+0.034
 distress+0.034
Neg logit contributions
at-0.047
 seeker-0.043
ip-0.043
affle-0.042
ored-0.039
Token three
Token ID1115
Feature activation-6.66e+00

Pos logit contributions
 gestures+0.005
 ways+0.005
 kinds+0.005
 distress+0.005
 compare+0.005
Neg logit contributions
at-0.008
 seeker-0.007
ip-0.007
affle-0.007
bled-0.007
Token years
Token ID812
Feature activation-4.63e+00

Pos logit contributions
 gestures+0.098
 compare+0.096
 appearances+0.096
 distress+0.095
 unison+0.094
Neg logit contributions
at-0.165
ip-0.157
affle-0.153
ig-0.141
 skating-0.140
Token old
Token ID1468
Feature activation-5.19e+00

Pos logit contributions
 gestures+0.058
 ways+0.057
 kinds+0.056
 distress+0.054
 compare+0.053
Neg logit contributions
at-0.128
 seeker-0.120
ip-0.118
affle-0.117
bled-0.112
Token and
Token ID290
Feature activation-1.91e+00

Pos logit contributions
 compare+0.104
 gestures+0.102
 appearances+0.102
 distress+0.101
ever+0.101
Neg logit contributions
ip-0.090
 seeker-0.088
at-0.085
affle-0.080
ored-0.078
Token Grace
Token ID16156
Feature activation-2.54e+00

Pos logit contributions
 gestures+0.033
 ways+0.032
 kinds+0.032
 unison+0.031
 distress+0.031
Neg logit contributions
at-0.043
 seeker-0.040
ip-0.040
affle-0.039
bled-0.037
Token was
Token ID373
Feature activation-5.13e-01

Pos logit contributions
 gestures+0.050
 appearances+0.050
 distress+0.049
 ways+0.049
 sentences+0.048
Neg logit contributions
at-0.051
ip-0.047
 seeker-0.045
ored-0.043
affle-0.043
Token named
Token ID3706
Feature activation+1.93e+00

Pos logit contributions
 gestures+0.050
 sentences+0.049
 unison+0.048
 distress+0.047
 ways+0.047
Neg logit contributions
at-0.032
ip-0.030
 seeker-0.027
ig-0.025
affle-0.025
Token Tom
Token ID4186
Feature activation-8.05e-01

Pos logit contributions
at+0.057
 seeker+0.054
ip+0.053
affle+0.053
bled+0.051
Neg logit contributions
 who-0.019
 kinds-0.019
 ways-0.019
 compare-0.019
 Susan-0.018
Token.
Token ID13
Feature activation+9.43e-01

Pos logit contributions
 gestures+0.014
 unison+0.013
 sentences+0.013
 harmony+0.012
 compare+0.012
Neg logit contributions
at-0.020
ip-0.018
affle-0.017
 seeker-0.017
ained-0.016
Token Tom
Token ID4186
Feature activation-2.08e+00

Pos logit contributions
at+0.025
 seeker+0.024
ip+0.024
affle+0.023
bled+0.022
Neg logit contributions
 gestures-0.012
 ways-0.012
 kinds-0.012
 compare-0.012
 unison-0.011
Token was
Token ID373
Feature activation+8.36e-02

Pos logit contributions
 gestures+0.036
 kinds+0.035
 ways+0.035
 distress+0.034
 compare+0.034
Neg logit contributions
at-0.047
 seeker-0.044
ip-0.043
affle-0.042
bled-0.040
Token three
Token ID1115
Feature activation-6.66e+00

Pos logit contributions
at+0.002
ip+0.002
affle+0.002
 seeker+0.002
or+0.002
Neg logit contributions
 who-0.001
 ways-0.001
 distress-0.001
 kinds-0.001
 gestures-0.001
Token years
Token ID812
Feature activation-4.46e+00

Pos logit contributions
 gestures+0.090
 Hope+0.087
 unison+0.087
 appearances+0.086
 compare+0.086
Neg logit contributions
at-0.177
ip-0.167
affle-0.165
 skating-0.153
ig-0.150
Token old
Token ID1468
Feature activation-5.11e+00

Pos logit contributions
 gestures+0.059
 ways+0.057
 kinds+0.057
 distress+0.054
 unison+0.054
Neg logit contributions
at-0.120
 seeker-0.113
ip-0.111
affle-0.110
bled-0.104
Token and
Token ID290
Feature activation-1.53e+00

Pos logit contributions
 gestures+0.086
 unison+0.083
 appearances+0.082
 harmony+0.082
 compare+0.081
Neg logit contributions
ip-0.110
 seeker-0.108
at-0.106
affle-0.102
 skating-0.099
Token he
Token ID339
Feature activation-2.81e-01

Pos logit contributions
 who+0.020
 kinds+0.020
 its+0.019
 type+0.018
 gestures+0.018
Neg logit contributions
at-0.043
 seeker-0.039
ip-0.038
affle-0.038
or-0.037
Token lived
Token ID5615
Feature activation+1.77e+00

Pos logit contributions
 kinds+0.004
 who+0.004
 gestures+0.004
 compare+0.004
 ways+0.004
Neg logit contributions
at-0.007
 seeker-0.007
ip-0.007
affle-0.007
bled-0.006
Token.
Token ID13
Feature activation+1.49e-01

Pos logit contributions
 sentences+0.047
ymes+0.044
 appearances+0.044
 gestures+0.043
 distress+0.042
Neg logit contributions
ip-0.049
at-0.046
affle-0.040
 seeker-0.040
ig-0.040
Token Charlie
Token ID11526
Feature activation-3.28e+00

Pos logit contributions
 seeker+0.003
ip+0.003
at+0.003
ained+0.003
 morning+0.003
Neg logit contributions
ymes-0.003
 else-0.002
ever-0.002
 appearances-0.002
 rob-0.002
Token and
Token ID290
Feature activation+1.69e+00

Pos logit contributions
 gestures+0.076
 unison+0.074
 sentences+0.074
 appearances+0.074
 ways+0.074
Neg logit contributions
at-0.057
ip-0.051
affle-0.049
 seeker-0.045
ored-0.043
Token Daisy
Token ID40355
Feature activation-1.80e+00

Pos logit contributions
 seeker+0.040
at+0.040
ip+0.038
affle+0.037
ipop+0.035
Neg logit contributions
 unison-0.025
 gestures-0.025
 compare-0.024
 sentences-0.024
 distress-0.023
Token were
Token ID547
Feature activation+4.08e-01

Pos logit contributions
 gestures+0.035
 sentences+0.034
 appearances+0.034
 distress+0.034
 ways+0.034
Neg logit contributions
at-0.038
ip-0.034
 seeker-0.034
affle-0.033
ored-0.031
Token three
Token ID1115
Feature activation-6.66e+00

Pos logit contributions
at+0.008
 seeker+0.007
ip+0.007
affle+0.007
bled+0.006
Neg logit contributions
 gestures-0.008
 ways-0.008
 kinds-0.008
 compare-0.008
 unison-0.008
Token years
Token ID812
Feature activation-4.03e+00

Pos logit contributions
 appearances+0.106
 compare+0.105
 mend+0.103
 gestures+0.103
 target+0.103
Neg logit contributions
at-0.152
ip-0.147
affle-0.145
ig-0.127
uddle-0.126
Token old
Token ID1468
Feature activation-5.06e+00

Pos logit contributions
 gestures+0.063
 unison+0.063
 distress+0.060
 kinds+0.060
 ways+0.060
Neg logit contributions
at-0.097
ip-0.091
affle-0.089
 seeker-0.089
bled-0.082
Token and
Token ID290
Feature activation-1.60e+00

Pos logit contributions
 express+0.099
 appearances+0.098
Maria+0.098
 compare+0.097
 advertise+0.097
Neg logit contributions
ip-0.085
 seeker-0.080
affle-0.075
at-0.073
 skating-0.071
Token were
Token ID547
Feature activation-3.50e-01

Pos logit contributions
 gestures+0.026
 ways+0.026
 distress+0.026
 compare+0.026
 unison+0.025
Neg logit contributions
at-0.036
ip-0.034
 seeker-0.034
affle-0.033
bled-0.032
Token best
Token ID1266
Feature activation+3.65e-03

Pos logit contributions
 gestures+0.007
 unison+0.007
 ways+0.007
 distress+0.007
 compare+0.007
Neg logit contributions
at-0.007
 seeker-0.006
ip-0.006
affle-0.006
bled-0.006
Token named
Token ID3706
Feature activation+2.06e+00

Pos logit contributions
 gestures+0.047
 sentences+0.047
 unison+0.045
 distress+0.044
 ways+0.044
Neg logit contributions
at-0.028
ip-0.026
 seeker-0.025
ig-0.023
 skating-0.022
Token Ken
Token ID7148
Feature activation+6.04e-01

Pos logit contributions
at+0.064
ip+0.060
 seeker+0.060
affle+0.059
bled+0.058
Neg logit contributions
 who-0.018
 ways-0.018
 kinds-0.017
 Susan-0.017
 Amy-0.017
Token.
Token ID13
Feature activation-6.22e-03

Pos logit contributions
at+0.018
ip+0.016
affle+0.016
 seeker+0.016
ored+0.015
Neg logit contributions
 gestures-0.007
 ways-0.007
 unison-0.007
 sentences-0.007
 kinds-0.007
Token Ken
Token ID7148
Feature activation-2.63e-01

Pos logit contributions
 kinds+0.000
 ways+0.000
 gestures+0.000
 sizes+0.000
 sorts+0.000
Neg logit contributions
at-0.000
affle-0.000
bled-0.000
 seeker-0.000
ored-0.000
Token was
Token ID373
Feature activation+1.58e-01

Pos logit contributions
 gestures+0.004
 kinds+0.004
 ways+0.004
 distress+0.004
 unison+0.004
Neg logit contributions
at-0.006
 seeker-0.006
ip-0.006
affle-0.006
bled-0.005
Token three
Token ID1115
Feature activation-6.62e+00

Pos logit contributions
at+0.004
ip+0.004
affle+0.004
 seeker+0.004
ored+0.004
Neg logit contributions
 who-0.002
 distress-0.002
 ways-0.002
 gestures-0.002
 kinds-0.002
Token years
Token ID812
Feature activation-4.48e+00

Pos logit contributions
 gestures+0.081
 compare+0.076
 unison+0.075
 appearances+0.074
 distress+0.074
Neg logit contributions
at-0.184
ip-0.175
affle-0.173
 seeker-0.163
 skating-0.159
Token old
Token ID1468
Feature activation-4.95e+00

Pos logit contributions
 gestures+0.058
 ways+0.056
 kinds+0.056
 distress+0.054
 compare+0.053
Neg logit contributions
at-0.122
 seeker-0.115
ip-0.112
affle-0.112
bled-0.106
Token,
Token ID11
Feature activation-2.12e-01

Pos logit contributions
 gestures+0.079
 compare+0.076
 appearances+0.074
 distress+0.074
 unison+0.074
Neg logit contributions
ip-0.110
at-0.109
 seeker-0.109
affle-0.104
 skating-0.100
Token and
Token ID290
Feature activation-1.33e+00

Pos logit contributions
 gestures+0.003
 ways+0.003
 kinds+0.003
 unison+0.003
 compare+0.003
Neg logit contributions
at-0.005
 seeker-0.005
ip-0.005
affle-0.005
bled-0.005
Token he
Token ID339
Feature activation-2.27e-01

Pos logit contributions
 who+0.021
 kinds+0.021
 gestures+0.020
 its+0.019
 ways+0.019
Neg logit contributions
at-0.034
 seeker-0.031
affle-0.031
ip-0.030
ored-0.030
Token named
Token ID3706
Feature activation+1.71e+00

Pos logit contributions
 gestures+0.085
 sentences+0.083
 unison+0.080
 sizes+0.080
 distress+0.080
Neg logit contributions
at-0.045
ip-0.043
 seeker-0.041
ig-0.038
ored-0.034
Token Jane
Token ID12091
Feature activation-1.52e+00

Pos logit contributions
at+0.042
 seeker+0.040
ip+0.039
affle+0.039
bled+0.036
Neg logit contributions
 gestures-0.026
 ways-0.026
 kinds-0.025
 unison-0.025
 distress-0.025
Token.
Token ID13
Feature activation+5.62e-01

Pos logit contributions
 gestures+0.027
 sentences+0.026
 unison+0.026
 appearances+0.025
 facts+0.024
Neg logit contributions
at-0.037
ip-0.033
ig-0.031
 seeker-0.031
ored-0.030
Token Jane
Token ID12091
Feature activation-2.44e+00

Pos logit contributions
at+0.014
 seeker+0.014
ained+0.013
ip+0.012
affle+0.012
Neg logit contributions
 gestures-0.008
 appearances-0.008
 unison-0.008
 distress-0.008
 sentences-0.007
Token was
Token ID373
Feature activation+2.00e-01

Pos logit contributions
 gestures+0.050
 unison+0.048
 ways+0.048
 sentences+0.047
 appearances+0.047
Neg logit contributions
at-0.050
ip-0.045
 seeker-0.044
affle-0.043
uch-0.041
Token three
Token ID1115
Feature activation-6.61e+00

Pos logit contributions
at+0.005
affle+0.005
ip+0.005
 seeker+0.005
or+0.004
Neg logit contributions
 ways-0.003
 who-0.003
 gestures-0.003
 kinds-0.003
 distress-0.003
Token years
Token ID812
Feature activation-4.51e+00

Pos logit contributions
 gestures+0.095
 appearances+0.091
 unison+0.090
 compare+0.088
 target+0.088
Neg logit contributions
at-0.170
ip-0.162
affle-0.157
ig-0.147
 seeker-0.144
Token old
Token ID1468
Feature activation-5.07e+00

Pos logit contributions
 gestures+0.061
 ways+0.059
 kinds+0.059
 unison+0.058
 distress+0.057
Neg logit contributions
at-0.119
 seeker-0.112
ip-0.110
affle-0.110
bled-0.103
Token and
Token ID290
Feature activation-1.51e+00

Pos logit contributions
 gestures+0.096
 appearances+0.095
 unison+0.093
 compare+0.092
 distress+0.091
Neg logit contributions
 seeker-0.099
ip-0.095
at-0.089
ored-0.086
affle-0.082
Token she
Token ID673
Feature activation-7.37e-01

Pos logit contributions
 kinds+0.021
 gestures+0.021
 who+0.021
 ways+0.021
 compare+0.020
Neg logit contributions
at-0.040
affle-0.036
 seeker-0.036
ip-0.036
or-0.034
Token was
Token ID373
Feature activation+3.62e-01

Pos logit contributions
 kinds+0.011
 gestures+0.011
 ways+0.011
 compare+0.010
 who+0.010
Neg logit contributions
at-0.018
 seeker-0.017
ip-0.017
affle-0.017
bled-0.016
Token named
Token ID3706
Feature activation+1.54e+00

Pos logit contributions
 gestures+0.095
 sentences+0.093
 distress+0.091
 unison+0.091
 ways+0.088
Neg logit contributions
at-0.046
ip-0.043
 seeker-0.040
ig-0.037
ugg-0.032
Token Lucy
Token ID22162
Feature activation-9.77e-01

Pos logit contributions
at+0.036
 seeker+0.034
ip+0.033
affle+0.032
ored+0.031
Neg logit contributions
 gestures-0.026
 distress-0.025
 kinds-0.025
 unison-0.025
 ways-0.025
Token.
Token ID13
Feature activation+5.10e-01

Pos logit contributions
 sentences+0.022
 gestures+0.022
 appearances+0.021
 unison+0.020
 distress+0.019
Neg logit contributions
at-0.019
ip-0.017
 seeker-0.016
 on-0.016
 morning-0.015
Token She
Token ID1375
Feature activation-1.53e-01

Pos logit contributions
 seeker+0.011
at+0.011
ained+0.010
ip+0.010
affle+0.010
Neg logit contributions
 appearances-0.009
 gestures-0.008
 compare-0.008
 sentences-0.008
ever-0.008
Token was
Token ID373
Feature activation+2.60e-01

Pos logit contributions
 gestures+0.002
 kinds+0.002
 ways+0.002
 compare+0.002
 distress+0.002
Neg logit contributions
at-0.004
 seeker-0.003
ip-0.003
affle-0.003
bled-0.003
Token three
Token ID1115
Feature activation-6.59e+00

Pos logit contributions
at+0.006
ip+0.006
affle+0.006
 seeker+0.006
bled+0.005
Neg logit contributions
 gestures-0.004
 kinds-0.004
 ways-0.004
 distress-0.004
 compare-0.004
Token years
Token ID812
Feature activation-4.46e+00

Pos logit contributions
 gestures+0.092
 appearances+0.089
 unison+0.089
 compare+0.089
 distress+0.087
Neg logit contributions
at-0.174
ip-0.165
affle-0.162
 skating-0.147
ig-0.147
Token old
Token ID1468
Feature activation-4.93e+00

Pos logit contributions
 gestures+0.070
 unison+0.069
 distress+0.066
 compare+0.066
 kinds+0.065
Neg logit contributions
at-0.108
ip-0.102
 seeker-0.100
affle-0.100
bled-0.092
Token.
Token ID13
Feature activation-1.14e+00

Pos logit contributions
 gestures+0.098
 appearances+0.096
 compare+0.095
ever+0.095
 distress+0.094
Neg logit contributions
 seeker-0.090
ip-0.088
at-0.080
affle-0.076
ored-0.076
Token She
Token ID1375
Feature activation-2.87e-01

Pos logit contributions
 gestures+0.021
 appearances+0.020
 compare+0.019
ever+0.019
 distress+0.019
Neg logit contributions
at-0.024
 seeker-0.023
affle-0.022
ip-0.021
ained-0.021
Token was
Token ID373
Feature activation+3.32e-01

Pos logit contributions
 kinds+0.004
 gestures+0.004
 who+0.004
 ways+0.004
 compare+0.004
Neg logit contributions
at-0.007
 seeker-0.007
ip-0.007
affle-0.007
bled-0.006
Token named
Token ID3706
Feature activation+1.62e+00

Pos logit contributions
 gestures+0.090
 sentences+0.089
 unison+0.086
 distress+0.085
 sizes+0.085
Neg logit contributions
at-0.044
ip-0.042
 seeker-0.040
ig-0.037
or-0.031
Token Lucy
Token ID22162
Feature activation-1.25e+00

Pos logit contributions
at+0.041
 seeker+0.038
ip+0.038
affle+0.037
bled+0.035
Neg logit contributions
 gestures-0.024
 ways-0.024
 kinds-0.023
 unison-0.023
 distress-0.023
Token.
Token ID13
Feature activation+2.42e-01

Pos logit contributions
 sentences+0.027
 gestures+0.026
 appearances+0.025
 unison+0.024
 facts+0.024
Neg logit contributions
at-0.024
ip-0.022
 seeker-0.021
ored-0.020
ig-0.020
Token Lucy
Token ID22162
Feature activation-1.93e+00

Pos logit contributions
at+0.006
 seeker+0.006
ip+0.005
affle+0.005
ained+0.005
Neg logit contributions
 gestures-0.004
 appearances-0.003
 unison-0.003
 sentences-0.003
 distress-0.003
Token was
Token ID373
Feature activation+1.24e-01

Pos logit contributions
 gestures+0.044
 unison+0.042
 sentences+0.042
 ways+0.042
 appearances+0.041
Neg logit contributions
at-0.036
ip-0.031
 seeker-0.031
affle-0.030
uch-0.028
Token three
Token ID1115
Feature activation-6.58e+00

Pos logit contributions
at+0.003
affle+0.003
ip+0.003
 seeker+0.003
bled+0.003
Neg logit contributions
 gestures-0.002
 ways-0.002
 kinds-0.002
 distress-0.002
 who-0.002
Token years
Token ID812
Feature activation-4.38e+00

Pos logit contributions
 gestures+0.089
 unison+0.085
 compare+0.084
 appearances+0.083
 harmony+0.082
Neg logit contributions
at-0.176
ip-0.167
affle-0.165
 seeker-0.152
ig-0.151
Token old
Token ID1468
Feature activation-4.95e+00

Pos logit contributions
 gestures+0.060
 ways+0.058
 kinds+0.057
 unison+0.057
 distress+0.056
Neg logit contributions
at-0.115
 seeker-0.108
ip-0.107
affle-0.106
bled-0.100
Token and
Token ID290
Feature activation-1.81e+00

Pos logit contributions
 gestures+0.095
 appearances+0.092
 unison+0.091
 compare+0.091
 harmony+0.090
Neg logit contributions
 seeker-0.095
ip-0.094
at-0.088
ored-0.082
affle-0.082
Token excited
Token ID6568
Feature activation-3.90e+00

Pos logit contributions
 gestures+0.028
 kinds+0.027
 ways+0.027
 compare+0.026
 who+0.026
Neg logit contributions
at-0.045
 seeker-0.041
ip-0.041
affle-0.041
bled-0.038
Token to
Token ID284
Feature activation-3.00e+00

Pos logit contributions
 unison+0.068
 gestures+0.064
 distress+0.064
 appearances+0.061
 sentences+0.061
Neg logit contributions
at-0.087
 seeker-0.082
ip-0.081
ored-0.080
ained-0.079
Tokenie
Token ID494
Feature activation-1.86e+00

Pos logit contributions
 gestures+0.022
 unison+0.021
 compare+0.021
 sentences+0.021
 distress+0.020
Neg logit contributions
at-0.013
ip-0.011
affle-0.011
 seeker-0.010
ig-0.010
Token.
Token ID13
Feature activation-2.96e-01

Pos logit contributions
 gestures+0.032
 sentences+0.031
 compare+0.029
 kinds+0.029
 cooperate+0.029
Neg logit contributions
at-0.041
ip-0.041
 seeker-0.039
ig-0.036
affle-0.036
Token Mill
Token ID9212
Feature activation-2.64e-01

Pos logit contributions
 gestures+0.004
 unison+0.004
 appearances+0.004
 compare+0.004
 distress+0.003
Neg logit contributions
at-0.008
 seeker-0.008
ip-0.007
affle-0.007
ained-0.007
Tokenie
Token ID494
Feature activation-1.91e+00

Pos logit contributions
 gestures+0.005
 ways+0.005
 kinds+0.005
 unison+0.005
 compare+0.005
Neg logit contributions
at-0.005
 seeker-0.005
ip-0.005
affle-0.005
bled-0.004
Token was
Token ID373
Feature activation+2.30e-02

Pos logit contributions
 gestures+0.032
 ways+0.032
 kinds+0.032
 unison+0.031
 distress+0.031
Neg logit contributions
at-0.044
 seeker-0.041
ip-0.040
affle-0.040
bled-0.037
Token three
Token ID1115
Feature activation-6.58e+00

Pos logit contributions
at+0.001
affle+0.001
ip+0.001
 seeker+0.001
bled+0.000
Neg logit contributions
 gestures-0.000
 ways-0.000
 kinds-0.000
 who-0.000
 distress-0.000
Token years
Token ID812
Feature activation-4.38e+00

Pos logit contributions
 gestures+0.085
 unison+0.082
 appearances+0.081
 compare+0.081
 distress+0.079
Neg logit contributions
at-0.177
ip-0.169
affle-0.166
 seeker-0.154
ig-0.153
Token old
Token ID1468
Feature activation-5.03e+00

Pos logit contributions
 gestures+0.060
 ways+0.058
 kinds+0.057
 unison+0.056
 distress+0.056
Neg logit contributions
at-0.115
 seeker-0.108
ip-0.107
affle-0.106
bled-0.100
Token and
Token ID290
Feature activation-1.78e+00

Pos logit contributions
 gestures+0.097
 appearances+0.096
 compare+0.095
ever+0.095
 unison+0.094
Neg logit contributions
 seeker-0.095
ip-0.091
at-0.084
ored-0.081
affle-0.079
Token she
Token ID673
Feature activation-6.01e-01

Pos logit contributions
 gestures+0.030
 ways+0.030
 kinds+0.030
 unison+0.029
 distress+0.029
Neg logit contributions
at-0.041
 seeker-0.038
ip-0.037
affle-0.037
bled-0.035
Token loved
Token ID6151
Feature activation-3.02e+00

Pos logit contributions
 kinds+0.009
 gestures+0.009
 ways+0.009
 who+0.008
 compare+0.008
Neg logit contributions
at-0.015
 seeker-0.014
ip-0.014
affle-0.014
bled-0.013
Token called
Token ID1444
Feature activation+3.93e-01

Pos logit contributions
 gestures+0.076
 sentences+0.075
 sizes+0.071
 distress+0.071
 kinds+0.071
Neg logit contributions
at-0.040
ip-0.039
 seeker-0.037
ig-0.036
or-0.030
Token Maggie
Token ID33014
Feature activation-1.25e+00

Pos logit contributions
 seeker+0.010
at+0.010
ip+0.009
affle+0.008
ored+0.008
Neg logit contributions
 gestures-0.006
 distress-0.006
 ways-0.006
 kinds-0.006
 unison-0.006
Token.
Token ID13
Feature activation+2.16e-01

Pos logit contributions
 sentences+0.023
 gestures+0.023
 unison+0.021
 distress+0.021
 compare+0.021
Neg logit contributions
at-0.030
ip-0.028
 seeker-0.025
ig-0.024
or-0.023
Token Maggie
Token ID33014
Feature activation-1.97e+00

Pos logit contributions
at+0.006
 seeker+0.006
ip+0.006
affle+0.006
bled+0.005
Neg logit contributions
 gestures-0.002
 ways-0.002
 unison-0.002
 kinds-0.002
 distress-0.002
Token was
Token ID373
Feature activation+1.45e-01

Pos logit contributions
 gestures+0.041
 unison+0.039
 ways+0.039
 sentences+0.039
 appearances+0.038
Neg logit contributions
at-0.040
ip-0.035
 seeker-0.034
affle-0.034
uch-0.032
Token three
Token ID1115
Feature activation-6.58e+00

Pos logit contributions
at+0.004
ip+0.003
affle+0.003
 seeker+0.003
ored+0.003
Neg logit contributions
 gestures-0.002
 ways-0.002
 kinds-0.002
 distress-0.002
 who-0.002
Token years
Token ID812
Feature activation-4.40e+00

Pos logit contributions
 gestures+0.090
 appearances+0.087
 compare+0.087
 unison+0.085
 target+0.085
Neg logit contributions
at-0.171
ip-0.162
affle-0.160
ig-0.148
 seeker-0.146
Token old
Token ID1468
Feature activation-5.09e+00

Pos logit contributions
 gestures+0.066
 unison+0.064
 kinds+0.062
 ways+0.062
 distress+0.062
Neg logit contributions
at-0.110
ip-0.103
 seeker-0.102
affle-0.101
bled-0.094
Token and
Token ID290
Feature activation-1.74e+00

Pos logit contributions
 gestures+0.093
 compare+0.092
 appearances+0.092
 sentences+0.090
ever+0.090
Neg logit contributions
 seeker-0.100
ip-0.097
at-0.093
ored-0.086
affle-0.086
Token she
Token ID673
Feature activation-4.67e-01

Pos logit contributions
 gestures+0.028
 kinds+0.028
 ways+0.027
 compare+0.027
 unison+0.026
Neg logit contributions
at-0.042
 seeker-0.038
ip-0.038
affle-0.038
bled-0.035
Token loved
Token ID6151
Feature activation-3.06e+00

Pos logit contributions
 gestures+0.008
 kinds+0.007
 ways+0.007
 compare+0.007
 distress+0.007
Neg logit contributions
at-0.011
 seeker-0.010
ip-0.010
affle-0.010
bled-0.010
Token named
Token ID3706
Feature activation+1.57e+00

Pos logit contributions
 gestures+0.088
 sentences+0.087
 sizes+0.083
 unison+0.082
 distress+0.082
Neg logit contributions
at-0.043
ip-0.042
 seeker-0.039
ig-0.038
or-0.031
Token Daisy
Token ID40355
Feature activation-1.89e+00

Pos logit contributions
at+0.039
 seeker+0.037
ip+0.036
affle+0.035
bled+0.033
Neg logit contributions
 gestures-0.024
 kinds-0.023
 ways-0.023
 unison-0.023
 distress-0.023
Token.
Token ID13
Feature activation+1.90e-01

Pos logit contributions
 sentences+0.038
 gestures+0.037
 appearances+0.035
 unison+0.034
 facts+0.034
Neg logit contributions
at-0.039
ip-0.037
 seeker-0.034
ig-0.033
ored-0.033
Token Daisy
Token ID40355
Feature activation-2.20e+00

Pos logit contributions
 seeker+0.005
at+0.005
ip+0.004
ained+0.004
affle+0.004
Neg logit contributions
 gestures-0.003
 appearances-0.003
 compare-0.003
 sentences-0.003
 unison-0.003
Token was
Token ID373
Feature activation+2.42e-02

Pos logit contributions
 gestures+0.046
 unison+0.044
 ways+0.044
 sentences+0.043
 distress+0.043
Neg logit contributions
at-0.043
 seeker-0.039
ip-0.039
affle-0.037
uch-0.035
Token three
Token ID1115
Feature activation-6.57e+00

Pos logit contributions
at+0.001
affle+0.001
ip+0.001
 seeker+0.001
or+0.001
Neg logit contributions
 ways-0.000
 gestures-0.000
 kinds-0.000
 distress-0.000
 who-0.000
Token years
Token ID812
Feature activation-4.38e+00

Pos logit contributions
 gestures+0.091
 compare+0.087
 unison+0.086
 appearances+0.086
 target+0.085
Neg logit contributions
at-0.170
ip-0.163
affle-0.159
ig-0.147
 seeker-0.145
Token old
Token ID1468
Feature activation-5.06e+00

Pos logit contributions
 gestures+0.062
 ways+0.060
 unison+0.059
 kinds+0.059
 distress+0.058
Neg logit contributions
at-0.113
 seeker-0.106
ip-0.105
affle-0.104
bled-0.097
Token and
Token ID290
Feature activation-1.68e+00

Pos logit contributions
 gestures+0.095
 compare+0.092
 appearances+0.092
 sentences+0.091
 unison+0.091
Neg logit contributions
 seeker-0.097
ip-0.096
at-0.089
ored-0.084
affle-0.083
Token she
Token ID673
Feature activation-5.79e-01

Pos logit contributions
 gestures+0.027
 kinds+0.027
 ways+0.027
 unison+0.026
 compare+0.026
Neg logit contributions
at-0.040
 seeker-0.037
ip-0.037
affle-0.036
bled-0.034
Token loved
Token ID6151
Feature activation-3.06e+00

Pos logit contributions
 gestures+0.009
 kinds+0.009
 ways+0.008
 compare+0.008
 who+0.008
Neg logit contributions
at-0.014
 seeker-0.013
ip-0.013
affle-0.013
bled-0.012
Token named
Token ID3706
Feature activation+1.65e+00

Pos logit contributions
 gestures+0.082
 sentences+0.080
 unison+0.077
 sizes+0.077
 distress+0.077
Neg logit contributions
at-0.044
ip-0.043
 seeker-0.040
ig-0.038
or-0.032
Token Jane
Token ID12091
Feature activation-1.31e+00

Pos logit contributions
at+0.040
 seeker+0.038
ip+0.037
affle+0.036
bled+0.034
Neg logit contributions
 gestures-0.026
 ways-0.025
 kinds-0.025
 unison-0.025
 distress-0.025
Token.
Token ID13
Feature activation+5.37e-01

Pos logit contributions
 gestures+0.024
 sentences+0.024
 unison+0.023
 appearances+0.022
 facts+0.021
Neg logit contributions
at-0.032
ip-0.028
ig-0.027
 seeker-0.026
or-0.025
Token Jane
Token ID12091
Feature activation-2.30e+00

Pos logit contributions
at+0.013
 seeker+0.013
ained+0.012
ip+0.012
affle+0.012
Neg logit contributions
 gestures-0.008
 appearances-0.008
 unison-0.007
 distress-0.007
 sentences-0.007
Token was
Token ID373
Feature activation+1.38e-01

Pos logit contributions
 gestures+0.050
 unison+0.048
 ways+0.047
 sentences+0.047
 appearances+0.046
Neg logit contributions
at-0.046
ip-0.041
 seeker-0.039
affle-0.038
uch-0.037
Token three
Token ID1115
Feature activation-6.57e+00

Pos logit contributions
at+0.004
affle+0.003
ip+0.003
 seeker+0.003
or+0.003
Neg logit contributions
 ways-0.002
 gestures-0.002
 kinds-0.002
 who-0.002
 distress-0.002
Token years
Token ID812
Feature activation-4.46e+00

Pos logit contributions
 gestures+0.095
 unison+0.091
 appearances+0.090
 target+0.090
 compare+0.089
Neg logit contributions
at-0.168
ip-0.161
affle-0.155
ig-0.146
 skating-0.142
Token old
Token ID1468
Feature activation-4.97e+00

Pos logit contributions
 gestures+0.062
 ways+0.059
 kinds+0.059
 unison+0.058
 distress+0.057
Neg logit contributions
at-0.117
 seeker-0.109
ip-0.108
affle-0.107
bled-0.101
Token and
Token ID290
Feature activation-1.68e+00

Pos logit contributions
 gestures+0.091
 appearances+0.089
 unison+0.087
 compare+0.086
 sentences+0.086
Neg logit contributions
 seeker-0.099
ip-0.097
at-0.091
ored-0.087
affle-0.084
Token she
Token ID673
Feature activation-6.06e-01

Pos logit contributions
 gestures+0.026
 kinds+0.026
 ways+0.026
 compare+0.025
 unison+0.025
Neg logit contributions
at-0.041
 seeker-0.038
ip-0.038
affle-0.037
bled-0.035
Token wanted
Token ID2227
Feature activation-1.72e+00

Pos logit contributions
 kinds+0.009
 gestures+0.009
 ways+0.009
 compare+0.009
 who+0.009
Neg logit contributions
at-0.015
 seeker-0.014
ip-0.014
affle-0.014
bled-0.013
Token named
Token ID3706
Feature activation+1.82e+00

Pos logit contributions
 gestures+0.055
 sentences+0.054
 unison+0.053
 distress+0.051
 ways+0.051
Neg logit contributions
at-0.032
ip-0.030
 seeker-0.027
ig-0.026
affle-0.025
Token John
Token ID1757
Feature activation-9.44e-01

Pos logit contributions
at+0.051
 seeker+0.048
ip+0.047
affle+0.046
bled+0.045
Neg logit contributions
 ways-0.022
 kinds-0.022
 gestures-0.022
 compare-0.021
 who-0.021
Token.
Token ID13
Feature activation+3.39e-01

Pos logit contributions
 gestures+0.015
 sentences+0.014
 unison+0.014
 appearances+0.013
 harmony+0.013
Neg logit contributions
at-0.024
ip-0.022
affle-0.021
 seeker-0.020
 skating-0.020
Token He
Token ID679
Feature activation-1.43e-01

Pos logit contributions
at+0.007
 seeker+0.006
affle+0.006
ained+0.006
ding+0.006
Neg logit contributions
 gestures-0.006
 appearances-0.006
 unison-0.006
ymes-0.006
 distress-0.006
Token was
Token ID373
Feature activation+2.57e-01

Pos logit contributions
 who+0.002
 kinds+0.002
 its+0.002
 gestures+0.002
 wondered+0.002
Neg logit contributions
at-0.004
 seeker-0.004
ip-0.004
affle-0.003
bled-0.003
Token three
Token ID1115
Feature activation-6.56e+00

Pos logit contributions
at+0.007
ip+0.006
affle+0.006
 seeker+0.006
or+0.006
Neg logit contributions
 who-0.003
 kinds-0.003
 ways-0.003
 gestures-0.003
 distress-0.003
Token years
Token ID812
Feature activation-4.39e+00

Pos logit contributions
 gestures+0.090
 unison+0.087
 Hope+0.086
 appearances+0.085
 compare+0.084
Neg logit contributions
at-0.173
ip-0.164
affle-0.162
 skating-0.151
ig-0.148
Token old
Token ID1468
Feature activation-5.02e+00

Pos logit contributions
 gestures+0.058
 ways+0.057
 kinds+0.056
 unison+0.055
 distress+0.054
Neg logit contributions
at-0.117
 seeker-0.110
ip-0.108
affle-0.108
bled-0.102
Token and
Token ID290
Feature activation-1.71e+00

Pos logit contributions
 gestures+0.084
 unison+0.081
 appearances+0.079
 compare+0.079
 harmony+0.079
Neg logit contributions
 seeker-0.108
ip-0.107
at-0.104
affle-0.101
 skating-0.097
Token loved
Token ID6151
Feature activation-3.21e+00

Pos logit contributions
 kinds+0.024
 who+0.023
 gestures+0.023
 ways+0.022
 type+0.022
Neg logit contributions
at-0.046
 seeker-0.042
ip-0.041
affle-0.041
ored-0.040
Token to
Token ID284
Feature activation-2.30e+00

Pos logit contributions
 unison+0.058
 gestures+0.054
 distress+0.051
 appearances+0.050
 compare+0.050
Neg logit contributions
at-0.071
ip-0.068
affle-0.067
 skating-0.066
 seeker-0.066
Token named
Token ID3706
Feature activation+1.62e+00

Pos logit contributions
 gestures+0.085
 sentences+0.083
 unison+0.080
 sizes+0.080
 distress+0.079
Neg logit contributions
at-0.041
ip-0.040
 seeker-0.038
ig-0.036
or-0.030
Token Jane
Token ID12091
Feature activation-1.26e+00

Pos logit contributions
at+0.040
 seeker+0.038
ip+0.037
affle+0.037
bled+0.035
Neg logit contributions
 gestures-0.025
 ways-0.024
 kinds-0.024
 unison-0.024
 distress-0.023
Token.
Token ID13
Feature activation+5.37e-01

Pos logit contributions
 gestures+0.024
 sentences+0.023
 unison+0.023
 appearances+0.022
 harmony+0.022
Neg logit contributions
at-0.030
ip-0.027
ig-0.025
 seeker-0.024
or-0.024
Token Jane
Token ID12091
Feature activation-2.23e+00

Pos logit contributions
at+0.013
 seeker+0.013
ained+0.012
ip+0.012
affle+0.012
Neg logit contributions
 gestures-0.008
 appearances-0.008
 unison-0.007
 distress-0.007
 sentences-0.007
Token was
Token ID373
Feature activation+1.82e-01

Pos logit contributions
 gestures+0.049
 unison+0.047
 ways+0.046
 sentences+0.046
 appearances+0.046
Neg logit contributions
at-0.044
ip-0.039
 seeker-0.037
affle-0.037
uch-0.035
Token three
Token ID1115
Feature activation-6.56e+00

Pos logit contributions
at+0.005
affle+0.004
ip+0.004
 seeker+0.004
bled+0.004
Neg logit contributions
 ways-0.003
 gestures-0.003
 kinds-0.003
 distress-0.003
 who-0.003
Token years
Token ID812
Feature activation-4.54e+00

Pos logit contributions
 gestures+0.097
 appearances+0.092
 unison+0.092
 compare+0.090
 harmony+0.090
Neg logit contributions
at-0.165
ip-0.159
affle-0.153
ig-0.143
 skating-0.141
Token old
Token ID1468
Feature activation-4.99e+00

Pos logit contributions
 gestures+0.066
 unison+0.063
 ways+0.063
 kinds+0.063
 distress+0.061
Neg logit contributions
at-0.115
 seeker-0.108
ip-0.108
affle-0.106
bled-0.100
Token and
Token ID290
Feature activation-1.60e+00

Pos logit contributions
 gestures+0.094
 appearances+0.091
 harmony+0.090
 compare+0.090
 unison+0.089
Neg logit contributions
 seeker-0.097
ip-0.096
at-0.087
ored-0.084
affle-0.082
Token she
Token ID673
Feature activation-5.64e-01

Pos logit contributions
 kinds+0.024
 gestures+0.023
 ways+0.023
 who+0.022
 compare+0.022
Neg logit contributions
at-0.041
 seeker-0.038
affle-0.037
ip-0.037
ored-0.034
Token liked
Token ID8288
Feature activation-2.49e+00

Pos logit contributions
 gestures+0.008
 kinds+0.008
 ways+0.008
 compare+0.008
 who+0.008
Neg logit contributions
at-0.014
 seeker-0.013
ip-0.013
affle-0.013
bled-0.012
Token named
Token ID3706
Feature activation+1.56e+00

Pos logit contributions
 gestures+0.081
 sentences+0.079
 kinds+0.076
 sizes+0.076
 distress+0.075
Neg logit contributions
at-0.039
ip-0.038
 seeker-0.038
ig-0.035
or-0.029
Token Sally
Token ID25737
Feature activation-1.13e+00

Pos logit contributions
at+0.037
 seeker+0.036
ip+0.035
affle+0.033
ored+0.032
Neg logit contributions
 gestures-0.025
 kinds-0.024
 ways-0.024
 unison-0.024
 distress-0.024
Token.
Token ID13
Feature activation+4.66e-01

Pos logit contributions
 gestures+0.023
 sentences+0.023
 unison+0.022
 appearances+0.020
 facts+0.020
Neg logit contributions
at-0.024
ip-0.023
 seeker-0.021
ig-0.021
or-0.020
Token Sally
Token ID25737
Feature activation-1.45e+00

Pos logit contributions
at+0.014
affle+0.013
ip+0.013
 seeker+0.013
bled+0.013
Neg logit contributions
 ways-0.005
 kinds-0.005
 gestures-0.005
 distress-0.004
 compare-0.004
Token was
Token ID373
Feature activation+1.37e-01

Pos logit contributions
 gestures+0.030
 unison+0.029
 ways+0.028
 sentences+0.028
 distress+0.028
Neg logit contributions
at-0.029
 seeker-0.026
ip-0.026
affle-0.025
ored-0.024
Token three
Token ID1115
Feature activation-6.56e+00

Pos logit contributions
at+0.004
affle+0.003
ip+0.003
 seeker+0.003
or+0.003
Neg logit contributions
 who-0.002
 ways-0.002
 gestures-0.002
 kinds-0.002
 distress-0.002
Token years
Token ID812
Feature activation-4.55e+00

Pos logit contributions
 gestures+0.090
 unison+0.087
 compare+0.086
 appearances+0.085
 Hope+0.084
Neg logit contributions
at-0.171
ip-0.164
affle-0.160
ig-0.149
 seeker-0.148
Token old
Token ID1468
Feature activation-5.02e+00

Pos logit contributions
 gestures+0.064
 ways+0.062
 kinds+0.061
 unison+0.061
 distress+0.060
Neg logit contributions
at-0.117
 seeker-0.110
ip-0.109
affle-0.108
bled-0.101
Token and
Token ID290
Feature activation-1.54e+00

Pos logit contributions
 gestures+0.092
 appearances+0.089
 compare+0.089
 unison+0.089
 sentences+0.088
Neg logit contributions
 seeker-0.101
ip-0.099
at-0.089
ored-0.086
ig-0.085
Token loved
Token ID6151
Feature activation-2.83e+00

Pos logit contributions
 kinds+0.022
 who+0.021
 gestures+0.021
 ways+0.021
 compare+0.020
Neg logit contributions
at-0.041
 seeker-0.037
affle-0.037
ip-0.037
or-0.035
Token to
Token ID284
Feature activation-2.11e+00

Pos logit contributions
 unison+0.053
 gestures+0.049
 distress+0.046
lier+0.045
 appearances+0.045
Neg logit contributions
at-0.060
ip-0.059
 seeker-0.058
affle-0.056
 skating-0.055
Token called
Token ID1444
Feature activation+2.31e-01

Pos logit contributions
 gestures+0.091
 sentences+0.090
 unison+0.087
 distress+0.086
 sizes+0.086
Neg logit contributions
at-0.044
ip-0.043
 seeker-0.041
ig-0.038
or-0.032
Token Grace
Token ID16156
Feature activation-1.10e+00

Pos logit contributions
 seeker+0.006
at+0.006
ip+0.005
affle+0.005
ored+0.005
Neg logit contributions
 gestures-0.004
 kinds-0.003
 ways-0.003
 distress-0.003
 unison-0.003
Token.
Token ID13
Feature activation-4.07e-01

Pos logit contributions
 sentences+0.021
 gestures+0.021
 distress+0.020
 compare+0.019
 appearances+0.019
Neg logit contributions
at-0.023
ip-0.023
 seeker-0.020
ored-0.020
ig-0.020
Token Grace
Token ID16156
Feature activation-1.37e+00

Pos logit contributions
 gestures+0.006
 kinds+0.005
 ways+0.005
 compare+0.005
 distress+0.005
Neg logit contributions
at-0.010
 seeker-0.010
ip-0.010
affle-0.009
ored-0.009
Token was
Token ID373
Feature activation-8.20e-02

Pos logit contributions
 gestures+0.031
 sentences+0.030
 ways+0.029
 unison+0.029
 appearances+0.029
Neg logit contributions
at-0.025
ip-0.022
 seeker-0.021
affle-0.020
ored-0.020
Token three
Token ID1115
Feature activation-6.56e+00

Pos logit contributions
 gestures+0.001
 ways+0.001
 kinds+0.001
 distress+0.001
 who+0.001
Neg logit contributions
at-0.002
ip-0.002
affle-0.002
 seeker-0.002
bled-0.002
Token years
Token ID812
Feature activation-4.44e+00

Pos logit contributions
 gestures+0.082
 compare+0.078
 unison+0.078
 appearances+0.076
 distress+0.076
Neg logit contributions
at-0.180
ip-0.171
affle-0.170
 seeker-0.158
ig-0.155
Token old
Token ID1468
Feature activation-5.10e+00

Pos logit contributions
 gestures+0.060
 ways+0.058
 kinds+0.058
 unison+0.056
 distress+0.056
Neg logit contributions
at-0.118
 seeker-0.110
ip-0.109
affle-0.108
bled-0.102
Token and
Token ID290
Feature activation-1.97e+00

Pos logit contributions
 gestures+0.099
 compare+0.097
 appearances+0.096
ever+0.095
 cooperate+0.095
Neg logit contributions
 seeker-0.097
ip-0.092
at-0.088
affle-0.082
ored-0.082
Token had
Token ID550
Feature activation+9.86e-02

Pos logit contributions
 gestures+0.034
 ways+0.033
 kinds+0.033
 unison+0.032
 distress+0.032
Neg logit contributions
at-0.045
 seeker-0.042
ip-0.041
affle-0.041
bled-0.038
Token the
Token ID262
Feature activation-2.54e+00

Pos logit contributions
 seeker+0.002
at+0.002
ip+0.002
ained+0.002
 skating+0.002
Neg logit contributions
 gestures-0.002
 compare-0.002
 unison-0.002
 kinds-0.002
 distress-0.002
Token named
Token ID3706
Feature activation+1.51e+00

Pos logit contributions
 gestures+0.093
 sentences+0.092
 unison+0.089
 sizes+0.088
 distress+0.087
Neg logit contributions
at-0.043
ip-0.041
 seeker-0.040
ig-0.036
ugg-0.031
Token Jane
Token ID12091
Feature activation-1.49e+00

Pos logit contributions
at+0.032
 seeker+0.032
ip+0.030
affle+0.029
ored+0.028
Neg logit contributions
 gestures-0.028
 kinds-0.027
 unison-0.027
 distress-0.027
 ways-0.026
Token.
Token ID13
Feature activation+2.63e-01

Pos logit contributions
 gestures+0.030
 sentences+0.029
 unison+0.028
 harmony+0.027
 facts+0.027
Neg logit contributions
at-0.033
ip-0.030
ig-0.028
 seeker-0.027
or-0.026
Token Jane
Token ID12091
Feature activation-2.16e+00

Pos logit contributions
at+0.006
 seeker+0.006
ained+0.006
ored+0.006
ip+0.006
Neg logit contributions
 gestures-0.004
 appearances-0.004
 distress-0.004
 unison-0.004
ymes-0.004
Token was
Token ID373
Feature activation+2.79e-01

Pos logit contributions
 gestures+0.047
 unison+0.045
 ways+0.045
 sentences+0.044
 appearances+0.044
Neg logit contributions
at-0.043
ip-0.037
 seeker-0.037
affle-0.036
uch-0.034
Token three
Token ID1115
Feature activation-6.55e+00

Pos logit contributions
at+0.007
affle+0.006
ip+0.006
 seeker+0.006
bled+0.006
Neg logit contributions
 ways-0.004
 gestures-0.004
 who-0.004
 kinds-0.004
 distress-0.004
Token years
Token ID812
Feature activation-4.46e+00

Pos logit contributions
 gestures+0.090
 unison+0.086
 compare+0.085
 appearances+0.085
 distress+0.083
Neg logit contributions
at-0.173
ip-0.165
affle-0.161
ig-0.149
 seeker-0.148
Token old
Token ID1468
Feature activation-4.86e+00

Pos logit contributions
 gestures+0.059
 ways+0.057
 kinds+0.057
 unison+0.055
 distress+0.055
Neg logit contributions
at-0.120
 seeker-0.112
ip-0.111
affle-0.110
bled-0.104
Token and
Token ID290
Feature activation-1.71e+00

Pos logit contributions
 gestures+0.094
 appearances+0.092
 compare+0.090
 unison+0.090
ever+0.090
Neg logit contributions
 seeker-0.092
ip-0.089
at-0.082
ored-0.079
affle-0.077
Token loved
Token ID6151
Feature activation-2.84e+00

Pos logit contributions
 gestures+0.027
 kinds+0.027
 ways+0.026
 compare+0.026
 unison+0.025
Neg logit contributions
at-0.041
 seeker-0.038
ip-0.038
affle-0.038
bled-0.035
Token to
Token ID284
Feature activation-2.10e+00

Pos logit contributions
 unison+0.053
 gestures+0.049
 distress+0.046
 appearances+0.045
 compare+0.045
Neg logit contributions
at-0.060
ip-0.058
 seeker-0.057
affle-0.055
 skating-0.055
Token named
Token ID3706
Feature activation+1.65e+00

Pos logit contributions
 gestures+0.079
 sentences+0.078
 sizes+0.075
 kinds+0.074
 facts+0.074
Neg logit contributions
at-0.042
ip-0.042
 seeker-0.041
ig-0.038
or-0.033
Token Sarah
Token ID10490
Feature activation-1.70e+00

Pos logit contributions
at+0.042
 seeker+0.039
ip+0.038
affle+0.038
bled+0.036
Neg logit contributions
 gestures-0.024
 ways-0.024
 kinds-0.024
 unison-0.023
 compare-0.023
Token.
Token ID13
Feature activation+3.85e-01

Pos logit contributions
 gestures+0.033
 sentences+0.031
 unison+0.029
 facts+0.029
 distress+0.028
Neg logit contributions
at-0.037
ip-0.035
ig-0.033
ored-0.032
 seeker-0.031
Token She
Token ID1375
Feature activation-3.96e-02

Pos logit contributions
at+0.010
 seeker+0.010
ained+0.009
ip+0.009
affle+0.009
Neg logit contributions
 gestures-0.005
 distress-0.005
 unison-0.005
 appearances-0.005
 ways-0.005
Token was
Token ID373
Feature activation+3.13e-01

Pos logit contributions
 who+0.001
 kinds+0.000
 wondered+0.000
 its+0.000
 shapes+0.000
Neg logit contributions
at-0.001
 seeker-0.001
ip-0.001
affle-0.001
bled-0.001
Token three
Token ID1115
Feature activation-6.55e+00

Pos logit contributions
at+0.008
affle+0.007
ip+0.007
 seeker+0.007
or+0.007
Neg logit contributions
 ways-0.004
 who-0.004
 kinds-0.004
 gestures-0.004
 distress-0.004
Token years
Token ID812
Feature activation-4.36e+00

Pos logit contributions
 gestures+0.096
 unison+0.092
 appearances+0.091
 compare+0.090
 distress+0.089
Neg logit contributions
at-0.166
ip-0.160
affle-0.154
ig-0.144
 skating-0.142
Token old
Token ID1468
Feature activation-4.93e+00

Pos logit contributions
 gestures+0.059
 ways+0.057
 kinds+0.057
 unison+0.056
 distress+0.055
Neg logit contributions
at-0.115
 seeker-0.108
ip-0.107
affle-0.106
bled-0.100
Token and
Token ID290
Feature activation-1.55e+00

Pos logit contributions
 gestures+0.093
 appearances+0.090
 unison+0.089
 compare+0.088
 sentences+0.088
Neg logit contributions
 seeker-0.096
ip-0.093
at-0.085
ored-0.083
affle-0.080
Token loved
Token ID6151
Feature activation-3.15e+00

Pos logit contributions
 gestures+0.025
 kinds+0.024
 ways+0.024
 compare+0.023
 unison+0.023
Neg logit contributions
at-0.038
 seeker-0.035
ip-0.035
affle-0.034
bled-0.032
Token popcorn
Token ID38914
Feature activation+9.82e-02

Pos logit contributions
 unison+0.061
 gestures+0.057
 distress+0.054
 appearances+0.053
 sentences+0.052
Neg logit contributions
at-0.064
ip-0.063
 seeker-0.061
 skating-0.060
affle-0.059
Token became
Token ID2627
Feature activation+2.07e-01

Pos logit contributions
at+0.016
affle+0.015
 seeker+0.015
ip+0.014
bled+0.014
Neg logit contributions
 gestures-0.010
 kinds-0.010
 who-0.010
 compare-0.010
 ways-0.009
Token best
Token ID1266
Feature activation+1.47e+00

Pos logit contributions
at+0.003
 seeker+0.003
ip+0.003
affle+0.003
ained+0.002
Neg logit contributions
 gestures-0.005
 unison-0.005
 distress-0.005
 compare-0.005
 kinds-0.005
Token friends
Token ID2460
Feature activation-2.29e+00

Pos logit contributions
ip+0.027
at+0.027
 seeker+0.027
affle+0.025
ig+0.024
Neg logit contributions
 compare-0.029
 unison-0.029
 gestures-0.029
 kinds-0.028
 meaning-0.028
Token.
Token ID13
Feature activation+5.49e-01

Pos logit contributions
 gestures+0.034
 compare+0.034
 distress+0.034
 kinds+0.032
 appearances+0.032
Neg logit contributions
ip-0.054
at-0.054
 seeker-0.053
ored-0.050
ig-0.050
Token<|endoftext|>
Token ID50256
Feature activation+1.29e+00

Pos logit contributions
at+0.013
ip+0.012
 seeker+0.012
affle+0.011
ained+0.011
Neg logit contributions
 gestures-0.009
 distress-0.009
 compare-0.009
 ways-0.009
 kinds-0.008
TokenOnce
Token ID7454
Feature activation+9.15e+00

Pos logit contributions
at+0.034
 seeker+0.032
ip+0.031
affle+0.031
bled+0.030
Neg logit contributions
 gestures-0.017
 ways-0.017
 kinds-0.017
 unison-0.016
 distress-0.016
Token upon
Token ID2402
Feature activation-1.70e-01

Pos logit contributions
at+0.144
 seeker+0.126
affle+0.126
ip+0.113
atmeal+0.110
Neg logit contributions
 ways-0.224
 kinds-0.223
 who-0.219
 gestures-0.218
 distress-0.214
Token a
Token ID257
Feature activation+2.34e+00

Pos logit contributions
 gestures+0.004
 compare+0.003
 appearances+0.003
 sizes+0.003
 kinds+0.003
Neg logit contributions
at-0.003
 seeker-0.003
affle-0.003
ip-0.003
ained-0.003
Token time
Token ID640
Feature activation+5.24e+00

Pos logit contributions
at+0.042
 seeker+0.038
ip+0.037
affle+0.037
uch+0.032
Neg logit contributions
 gestures-0.053
 kinds-0.051
 ways-0.051
 compare-0.050
 distress-0.050
Token,
Token ID11
Feature activation+5.48e+00

Pos logit contributions
at+0.138
affle+0.128
ip+0.123
 seeker+0.122
atmeal+0.118
Neg logit contributions
 who-0.073
 ways-0.072
 gestures-0.071
 Susan-0.070
 kinds-0.070
Token there
Token ID612
Feature activation+3.36e+00

Pos logit contributions
at+0.129
affle+0.120
ip+0.115
atmeal+0.111
 seeker+0.110
Neg logit contributions
 who-0.104
 its-0.096
 Amy-0.094
 whose-0.093
 distress-0.092
Token\n
Token ID198
Feature activation-1.15e+00

Pos logit contributions
 gestures+0.017
 ways+0.016
 kinds+0.016
 compare+0.016
 distress+0.016
Neg logit contributions
at-0.025
ip-0.024
affle-0.023
 seeker-0.023
bled-0.022
TokenThe
Token ID464
Feature activation-5.30e-01

Pos logit contributions
 gestures+0.019
 ways+0.018
 kinds+0.018
 sizes+0.018
 distress+0.017
Neg logit contributions
at-0.028
ip-0.026
affle-0.025
bled-0.025
or-0.024
Token end
Token ID886
Feature activation-2.03e+00

Pos logit contributions
 ways+0.006
 gestures+0.006
 kinds+0.006
 unison+0.005
 distress+0.005
Neg logit contributions
at-0.016
 seeker-0.015
ip-0.015
ained-0.015
affle-0.015
Token.
Token ID13
Feature activation+3.48e-01

Pos logit contributions
 gestures+0.022
 ways+0.021
 kinds+0.021
 compare+0.021
 distress+0.020
Neg logit contributions
at-0.059
ip-0.056
affle-0.055
 seeker-0.054
ored-0.051
Token<|endoftext|>
Token ID50256
Feature activation+7.22e-01

Pos logit contributions
at+0.007
ip+0.007
 seeker+0.006
ained+0.006
affle+0.006
Neg logit contributions
 gestures-0.007
 compare-0.006
 ways-0.006
 kinds-0.006
 brace-0.006
TokenOnce
Token ID7454
Feature activation+8.71e+00

Pos logit contributions
at+0.018
 seeker+0.017
affle+0.016
ip+0.016
bled+0.016
Neg logit contributions
 kinds-0.011
 gestures-0.011
 ways-0.011
 distress-0.010
 unison-0.010
Token upon
Token ID2402
Feature activation-2.34e-01

Pos logit contributions
at+0.134
 seeker+0.120
affle+0.118
ip+0.105
bled+0.103
Neg logit contributions
 ways-0.217
 kinds-0.216
 who-0.209
 gestures-0.208
 distress-0.205
Token a
Token ID257
Feature activation+2.17e+00

Pos logit contributions
 gestures+0.004
 compare+0.004
 appearances+0.004
 kinds+0.004
 sizes+0.004
Neg logit contributions
at-0.005
 seeker-0.005
affle-0.004
ip-0.004
 skating-0.004
Token time
Token ID640
Feature activation+5.10e+00

Pos logit contributions
at+0.036
 seeker+0.032
ip+0.032
affle+0.031
uch+0.027
Neg logit contributions
 gestures-0.052
 kinds-0.050
 ways-0.050
 compare-0.049
 distress-0.049
Token there
Token ID612
Feature activation+3.10e+00

Pos logit contributions
at+0.138
affle+0.128
 seeker+0.124
ip+0.123
atmeal+0.119
Neg logit contributions
 who-0.068
 ways-0.068
 Susan-0.066
 gestures-0.066
 kinds-0.065
Token was
Token ID373
Feature activation+9.87e-02

Pos logit contributions
at+0.077
 seeker+0.072
ip+0.071
affle+0.070
bled+0.066
Neg logit contributions
 kinds-0.046
 gestures-0.046
 ways-0.046
 distress-0.045
 unison-0.044
Token to
Token ID284
Feature activation-6.79e-01

Pos logit contributions
at+0.092
affle+0.085
ip+0.082
 seeker+0.081
bled+0.079
Neg logit contributions
 ways-0.071
 gestures-0.070
 kinds-0.066
 heaven-0.066
 who-0.066
Token the
Token ID262
Feature activation+1.51e+00

Pos logit contributions
 sizes+0.012
 gestures+0.012
 kinds+0.012
ever+0.012
lier+0.011
Neg logit contributions
 seeker-0.014
at-0.013
affle-0.012
ip-0.012
 skating-0.012
Token stadium
Token ID10308
Feature activation-4.14e-01

Pos logit contributions
at+0.030
ip+0.028
affle+0.026
 seeker+0.025
bled+0.024
Neg logit contributions
 ways-0.032
 unison-0.031
 gestures-0.031
 compare-0.031
 kinds-0.030
Token!
Token ID0
Feature activation+9.40e-01

Pos logit contributions
 gestures+0.007
 distress+0.007
 appearances+0.007
 unison+0.007
 kinds+0.007
Neg logit contributions
at-0.009
 seeker-0.009
ip-0.008
or-0.008
ored-0.008
Token<|endoftext|>
Token ID50256
Feature activation+1.15e+00

Pos logit contributions
at+0.020
ip+0.020
 seeker+0.019
affle+0.018
ained+0.018
Neg logit contributions
 gestures-0.017
 kinds-0.016
 ways-0.016
 compare-0.016
 distress-0.016
TokenOnce
Token ID7454
Feature activation+9.22e+00

Pos logit contributions
at+0.025
affle+0.023
 seeker+0.023
ip+0.023
bled+0.022
Neg logit contributions
 kinds-0.020
 who-0.020
 ways-0.020
 gestures-0.020
 unison-0.020
Token upon
Token ID2402
Feature activation-2.74e-01

Pos logit contributions
at+0.138
affle+0.121
 seeker+0.120
ip+0.110
bled+0.104
Neg logit contributions
 ways-0.233
 kinds-0.231
 gestures-0.226
 who-0.225
 distress-0.222
Token a
Token ID257
Feature activation+2.07e+00

Pos logit contributions
 gestures+0.005
 compare+0.005
 appearances+0.005
 sizes+0.005
 kinds+0.005
Neg logit contributions
at-0.005
 seeker-0.005
affle-0.005
ip-0.005
ained-0.005
Token time
Token ID640
Feature activation+5.00e+00

Pos logit contributions
at+0.034
 seeker+0.030
ip+0.030
affle+0.030
uch+0.026
Neg logit contributions
 gestures-0.050
 kinds-0.048
 ways-0.047
 compare-0.047
 sentences-0.047
Token,
Token ID11
Feature activation+5.22e+00

Pos logit contributions
at+0.137
affle+0.128
ip+0.122
 seeker+0.122
atmeal+0.119
Neg logit contributions
 who-0.066
 ways-0.064
 Susan-0.064
 Amy-0.063
 gestures-0.062
Token there
Token ID612
Feature activation+3.29e+00

Pos logit contributions
at+0.124
affle+0.116
ip+0.111
atmeal+0.109
 seeker+0.106
Neg logit contributions
 who-0.099
 its-0.091
 Amy-0.089
 whose-0.089
 Susan-0.086
Token in
Token ID287
Feature activation-1.23e+00

Pos logit contributions
 gestures+0.029
 kinds+0.028
 ways+0.028
 unison+0.028
 distress+0.027
Neg logit contributions
at-0.061
 seeker-0.057
ip-0.057
affle-0.056
bled-0.053
Token the
Token ID262
Feature activation+1.77e+00

Pos logit contributions
keys+0.019
old+0.018
lier+0.018
 gestures+0.018
 unison+0.018
Neg logit contributions
 seeker-0.027
at-0.025
 skating-0.024
ained-0.024
affle-0.023
Token jungle
Token ID20712
Feature activation+5.35e-01

Pos logit contributions
at+0.038
ip+0.036
 seeker+0.035
affle+0.035
bled+0.034
Neg logit contributions
 ways-0.034
 gestures-0.033
 kinds-0.032
 compare-0.032
 distress-0.031
Token.
Token ID13
Feature activation+2.60e-01

Pos logit contributions
at+0.013
 seeker+0.012
ip+0.012
cast+0.012
 skating+0.012
Neg logit contributions
 unison-0.007
 gestures-0.007
 distress-0.007
 kinds-0.007
 Susan-0.007
Token<|endoftext|>
Token ID50256
Feature activation+1.64e+00

Pos logit contributions
at+0.007
 seeker+0.006
affle+0.006
ip+0.006
bled+0.006
Neg logit contributions
 gestures-0.003
 ways-0.003
 kinds-0.003
 unison-0.003
 compare-0.003
TokenOnce
Token ID7454
Feature activation+9.21e+00

Pos logit contributions
at+0.043
 seeker+0.040
ip+0.039
affle+0.039
bled+0.037
Neg logit contributions
 gestures-0.023
 ways-0.022
 kinds-0.022
 unison-0.021
 compare-0.021
Token upon
Token ID2402
Feature activation-7.44e-02

Pos logit contributions
at+0.140
 seeker+0.126
affle+0.121
ip+0.111
bled+0.108
Neg logit contributions
 ways-0.229
 kinds-0.226
 who-0.221
 gestures-0.219
 distress-0.217
Token a
Token ID257
Feature activation+2.35e+00

Pos logit contributions
 gestures+0.001
 compare+0.001
 appearances+0.001
 sizes+0.001
 kinds+0.001
Neg logit contributions
at-0.001
 seeker-0.001
affle-0.001
ip-0.001
ained-0.001
Token time
Token ID640
Feature activation+5.59e+00

Pos logit contributions
at+0.038
 seeker+0.034
ip+0.034
affle+0.034
uch+0.030
Neg logit contributions
 gestures-0.058
 kinds-0.055
 ways-0.054
 sentences-0.054
 compare-0.054
Token,
Token ID11
Feature activation+5.70e+00

Pos logit contributions
at+0.136
affle+0.125
 seeker+0.124
ip+0.121
bled+0.115
Neg logit contributions
 ways-0.088
 gestures-0.086
 kinds-0.085
 who-0.084
 distress-0.083
Token there
Token ID612
Feature activation+3.64e+00

Pos logit contributions
at+0.123
affle+0.112
ip+0.110
 seeker+0.109
atmeal+0.102
Neg logit contributions
 who-0.113
 its-0.105
 ways-0.105
 distress-0.105
 kinds-0.104
Token he
Token ID339
Feature activation+1.25e+00

Pos logit contributions
 gestures+0.016
 unison+0.015
 kinds+0.015
 sentences+0.015
 ways+0.015
Neg logit contributions
at-0.024
ip-0.022
 seeker-0.022
affle-0.022
 skating-0.021
Token had
Token ID550
Feature activation+2.26e+00

Pos logit contributions
at+0.029
 seeker+0.028
affle+0.027
ip+0.027
bled+0.026
Neg logit contributions
 gestures-0.020
 distress-0.020
 who-0.020
 ways-0.020
 kinds-0.020
Token seen
Token ID1775
Feature activation-8.10e-01

Pos logit contributions
at+0.051
 seeker+0.047
ip+0.046
affle+0.045
bled+0.044
Neg logit contributions
 gestures-0.040
 kinds-0.038
 unison-0.038
 ways-0.037
 compare-0.037
Token.
Token ID13
Feature activation+2.88e-01

Pos logit contributions
 gestures+0.011
 ways+0.010
 kinds+0.010
 unison+0.010
 compare+0.010
Neg logit contributions
at-0.022
 seeker-0.020
ip-0.020
affle-0.020
bled-0.019
Token<|endoftext|>
Token ID50256
Feature activation+1.65e+00

Pos logit contributions
at+0.007
 seeker+0.007
ip+0.007
affle+0.006
bled+0.006
Neg logit contributions
 gestures-0.004
 ways-0.004
 kinds-0.004
 unison-0.004
 distress-0.004
TokenOnce
Token ID7454
Feature activation+8.91e+00

Pos logit contributions
at+0.044
affle+0.041
 seeker+0.041
ip+0.040
belt+0.039
Neg logit contributions
 who-0.021
 kinds-0.021
 ways-0.020
 distress-0.020
 unison-0.020
Token there
Token ID612
Feature activation+3.74e+00

Pos logit contributions
at+0.131
 seeker+0.116
affle+0.113
ip+0.106
bled+0.098
Neg logit contributions
 ways-0.226
 kinds-0.225
 gestures-0.221
 who-0.218
 distress-0.216
Token was
Token ID373
Feature activation+5.38e-01

Pos logit contributions
at+0.091
 seeker+0.085
ip+0.084
affle+0.083
bled+0.078
Neg logit contributions
 gestures-0.058
 kinds-0.057
 ways-0.057
 distress-0.055
 unison-0.055
Token an
Token ID281
Feature activation-2.88e+00

Pos logit contributions
at+0.012
 seeker+0.012
ip+0.011
affle+0.011
ained+0.011
Neg logit contributions
 gestures-0.009
 compare-0.009
 appearances-0.009
 sentences-0.009
 ways-0.009
Token elderly
Token ID13830
Feature activation-3.80e+00

Pos logit contributions
 gestures+0.050
 unison+0.048
 ways+0.047
 kinds+0.047
 compare+0.047
Neg logit contributions
at-0.066
 seeker-0.061
affle-0.060
ip-0.059
bled-0.054
Token lady
Token ID10846
Feature activation-3.92e+00

Pos logit contributions
 gestures+0.060
 ways+0.059
 kinds+0.057
 unison+0.057
 distress+0.057
Neg logit contributions
at-0.092
 seeker-0.087
ip-0.086
affle-0.084
bled-0.077
Token more
Token ID517
Feature activation-4.59e-01

Pos logit contributions
 unison+0.001
 kinds+0.001
 gestures+0.001
Maria+0.001
 distress+0.001
Neg logit contributions
ip-0.001
 seeker-0.001
affle-0.001
at-0.001
 skating-0.001
Token money
Token ID1637
Feature activation+8.58e-01

Pos logit contributions
 unison+0.009
Maria+0.009
Susan+0.008
 Hope+0.008
 distress+0.008
Neg logit contributions
ip-0.008
affle-0.008
 seeker-0.008
at-0.008
cast-0.008
Token to
Token ID284
Feature activation+1.67e-01

Pos logit contributions
at+0.020
ip+0.020
 seeker+0.019
affle+0.018
ored+0.018
Neg logit contributions
 unison-0.013
 gestures-0.013
 distress-0.013
 kinds-0.012
Maria-0.012
Token get
Token ID651
Feature activation+7.22e-01

Pos logit contributions
 seeker+0.003
at+0.003
affle+0.003
ip+0.003
or+0.003
Neg logit contributions
Maria-0.003
 gestures-0.003
 kinds-0.003
 unison-0.003
ymes-0.003
Token what
Token ID644
Feature activation-1.82e-01

Pos logit contributions
at+0.016
ip+0.015
 seeker+0.015
affle+0.014
bled+0.014
Neg logit contributions
 gestures-0.013
 unison-0.012
 kinds-0.012
 ways-0.012
 distress-0.012
Token you
Token ID345
Feature activation+2.55e+00

Pos logit contributions
 unison+0.004
 distress+0.003
 gestures+0.003
Maria+0.003
 heaven+0.003
Neg logit contributions
ip-0.003
at-0.003
uch-0.003
 seeker-0.003
or-0.003
Token want
Token ID765
Feature activation+1.54e-01

Pos logit contributions
at+0.069
 seeker+0.065
ip+0.064
affle+0.064
bled+0.060
Neg logit contributions
 gestures-0.033
 ways-0.031
 kinds-0.031
 unison-0.031
 distress-0.030
Token.
Token ID13
Feature activation+3.92e-01

Pos logit contributions
at+0.003
ip+0.003
 seeker+0.003
affle+0.003
or+0.003
Neg logit contributions
 unison-0.003
 gestures-0.003
 distress-0.003
Maria-0.003
 appearances-0.003
Token You
Token ID921
Feature activation+2.52e+00

Pos logit contributions
ip+0.008
 seeker+0.008
at+0.008
ained+0.008
or+0.007
Neg logit contributions
 gestures-0.007
 distress-0.007
 unison-0.007
 kinds-0.006
 compare-0.006
Token just
Token ID655
Feature activation+2.64e+00

Pos logit contributions
at+0.058
 seeker+0.056
ip+0.055
affle+0.053
or+0.051
Neg logit contributions
 gestures-0.040
 ways-0.039
 unison-0.039
 kinds-0.038
 distress-0.037
Token need
Token ID761
Feature activation+5.19e-01

Pos logit contributions
at+0.067
 seeker+0.066
affle+0.063
ip+0.063
ored+0.059
Neg logit contributions
 unison-0.034
 gestures-0.033
 ways-0.032
 distress-0.032
 kinds-0.030
Token race
Token ID3234
Feature activation+1.52e+00

Pos logit contributions
 gestures+0.036
 kinds+0.036
 ways+0.035
 unison+0.035
 sizes+0.034
Neg logit contributions
at-0.063
 seeker-0.061
affle-0.058
ip-0.056
bled-0.055
Token.
Token ID13
Feature activation-1.80e-01

Pos logit contributions
at+0.033
 seeker+0.032
ip+0.030
affle+0.029
bled+0.029
Neg logit contributions
 gestures-0.026
 unison-0.026
 kinds-0.025
 distress-0.025
 compare-0.025
Token The
Token ID383
Feature activation+5.65e-03

Pos logit contributions
 gestures+0.003
 distress+0.003
 kinds+0.003
 compare+0.003
 appearances+0.003
Neg logit contributions
at-0.004
ip-0.004
affle-0.003
 seeker-0.003
ored-0.003
Token big
Token ID1263
Feature activation-1.22e+00

Pos logit contributions
at+0.000
ip+0.000
 seeker+0.000
affle+0.000
ored+0.000
Neg logit contributions
 gestures-0.000
 ways-0.000
 distress-0.000
 kinds-0.000
 compare-0.000
Token bee
Token ID20697
Feature activation-8.66e-01

Pos logit contributions
 gestures+0.016
 ways+0.015
 kinds+0.015
 unison+0.015
 distress+0.015
Neg logit contributions
at-0.033
 seeker-0.031
ip-0.031
affle-0.031
bled-0.029
Token flew
Token ID13112
Feature activation+3.44e+00

Pos logit contributions
 gestures+0.014
 kinds+0.013
 ways+0.013
 compare+0.013
 distress+0.013
Neg logit contributions
at-0.021
 seeker-0.020
affle-0.019
ip-0.019
bled-0.018
Token fast
Token ID3049
Feature activation-5.75e-01

Pos logit contributions
at+0.071
affle+0.066
 seeker+0.065
ip+0.064
atmeal+0.061
Neg logit contributions
 gestures-0.067
 ways-0.066
 kinds-0.066
 its-0.064
 distress-0.063
Token,
Token ID11
Feature activation+1.90e+00

Pos logit contributions
 gestures+0.010
 unison+0.010
 kinds+0.010
 distress+0.010
 ways+0.010
Neg logit contributions
at-0.012
 seeker-0.011
ip-0.011
affle-0.011
ored-0.010
Token but
Token ID475
Feature activation+1.23e+00

Pos logit contributions
at+0.051
affle+0.047
 seeker+0.047
ip+0.046
bled+0.043
Neg logit contributions
 gestures-0.026
 ways-0.026
 kinds-0.025
 compare-0.025
 who-0.025
Token Buzz
Token ID15753
Feature activation+8.93e-02

Pos logit contributions
at+0.037
affle+0.035
 seeker+0.035
ip+0.034
belt+0.033
Neg logit contributions
 its-0.013
 gestures-0.012
 ways-0.012
 kinds-0.012
 Hope-0.012
Token flew
Token ID13112
Feature activation+3.62e+00

Pos logit contributions
at+0.002
ip+0.002
 seeker+0.002
affle+0.002
bled+0.002
Neg logit contributions
 gestures-0.002
 ways-0.002
 kinds-0.001
 unison-0.001
 compare-0.001
Token want
Token ID765
Feature activation-4.16e-01

Pos logit contributions
at+0.044
ip+0.040
 seeker+0.040
affle+0.039
bled+0.037
Neg logit contributions
 gestures-0.032
 kinds-0.031
 ways-0.031
 compare-0.030
 distress-0.030
Token.
Token ID13
Feature activation-3.87e-01

Pos logit contributions
 unison+0.007
 gestures+0.007
 distress+0.007
 appearances+0.006
 kinds+0.006
Neg logit contributions
at-0.010
ip-0.009
 seeker-0.009
affle-0.009
uch-0.008
Token Now
Token ID2735
Feature activation+1.85e+00

Pos logit contributions
 gestures+0.007
 kinds+0.007
 ways+0.007
 distress+0.007
 unison+0.007
Neg logit contributions
at-0.008
 seeker-0.008
ip-0.008
affle-0.007
ained-0.007
Token you
Token ID345
Feature activation+2.68e+00

Pos logit contributions
at+0.045
 seeker+0.044
ip+0.043
affle+0.041
uch+0.039
Neg logit contributions
 kinds-0.027
 gestures-0.027
 distress-0.026
 unison-0.026
 ways-0.026
Token will
Token ID481
Feature activation+8.07e-01

Pos logit contributions
at+0.062
 seeker+0.058
ip+0.057
affle+0.056
bled+0.053
Neg logit contributions
 gestures-0.045
 ways-0.044
 kinds-0.043
 unison-0.043
 distress-0.042
Token be
Token ID307
Feature activation+2.08e+00

Pos logit contributions
 seeker+0.018
at+0.018
affle+0.016
ip+0.016
uch+0.016
Neg logit contributions
 kinds-0.014
 gestures-0.014
 ways-0.013
 unison-0.013
ymes-0.013
Token in
Token ID287
Feature activation-4.92e-01

Pos logit contributions
at+0.042
 seeker+0.040
ip+0.038
affle+0.038
bled+0.036
Neg logit contributions
 gestures-0.040
 kinds-0.040
 unison-0.039
 distress-0.039
 compare-0.038
Token big
Token ID1263
Feature activation-2.65e-01

Pos logit contributions
 gestures+0.007
 unison+0.007
Maria+0.007
keys+0.007
 sizes+0.007
Neg logit contributions
 seeker-0.012
at-0.010
ip-0.010
affle-0.010
 skating-0.010
Token trouble
Token ID5876
Feature activation+1.10e-01

Pos logit contributions
 unison+0.004
Susan+0.004
 Hope+0.004
Maria+0.004
 Susan+0.004
Neg logit contributions
 seeker-0.006
at-0.006
ip-0.005
affle-0.005
cast-0.005
Token.
Token ID13
Feature activation-3.50e-01

Pos logit contributions
at+0.003
 seeker+0.003
ip+0.003
affle+0.003
bled+0.003
Neg logit contributions
 gestures-0.001
 unison-0.001
 kinds-0.001
 distress-0.001
 ways-0.001
Token You
Token ID921
Feature activation+2.52e+00

Pos logit contributions
 kinds+0.006
 gestures+0.006
 distress+0.006
 ways+0.006
 unison+0.006
Neg logit contributions
 seeker-0.007
at-0.007
ip-0.007
affle-0.006
uch-0.006
Token looking
Token ID2045
Feature activation+5.40e-01

Pos logit contributions
at+0.026
 seeker+0.024
ip+0.023
affle+0.023
bled+0.021
Neg logit contributions
 gestures-0.021
 ways-0.020
 kinds-0.020
 compare-0.020
 distress-0.020
Token."
Token ID526
Feature activation-1.19e+00

Pos logit contributions
at+0.016
 seeker+0.015
ip+0.015
affle+0.015
bled+0.015
Neg logit contributions
 ways-0.005
 gestures-0.005
 kinds-0.005
 who-0.005
 compare-0.005
Token\n
Token ID198
Feature activation-2.29e-01

Pos logit contributions
 gestures+0.018
 kinds+0.018
 ways+0.018
 unison+0.018
 compare+0.017
Neg logit contributions
at-0.029
 seeker-0.027
ip-0.027
affle-0.026
ored-0.025
Token\n
Token ID198
Feature activation-4.22e-01

Pos logit contributions
 gestures+0.004
 kinds+0.004
 ways+0.003
 compare+0.003
 unison+0.003
Neg logit contributions
at-0.006
ip-0.005
 seeker-0.005
affle-0.005
bled-0.005
TokenHe
Token ID1544
Feature activation-6.14e-02

Pos logit contributions
 gestures+0.006
 ways+0.006
 kinds+0.006
 unison+0.006
 compare+0.006
Neg logit contributions
at-0.011
 seeker-0.010
ip-0.010
affle-0.010
bled-0.009
Token waited
Token ID13488
Feature activation+3.17e+00

Pos logit contributions
 gestures+0.001
 ways+0.001
 kinds+0.001
 unison+0.001
 compare+0.001
Neg logit contributions
at-0.001
 seeker-0.001
ip-0.001
affle-0.001
ained-0.001
Token for
Token ID329
Feature activation-1.17e+00

Pos logit contributions
at+0.077
affle+0.070
 seeker+0.070
ip+0.069
uch+0.067
Neg logit contributions
 ways-0.053
 gestures-0.051
 who-0.051
 its-0.048
 compare-0.048
Token a
Token ID257
Feature activation-9.77e-01

Pos logit contributions
 gestures+0.019
 kinds+0.019
 sizes+0.018
 distress+0.018
 unison+0.018
Neg logit contributions
at-0.026
 seeker-0.026
affle-0.024
ip-0.024
bled-0.023
Token few
Token ID1178
Feature activation-1.42e+00

Pos logit contributions
 gestures+0.018
 ways+0.017
 kinds+0.017
 unison+0.017
 distress+0.017
Neg logit contributions
at-0.021
 seeker-0.020
ip-0.020
affle-0.019
bled-0.018
Token minutes
Token ID2431
Feature activation+2.61e+00

Pos logit contributions
 kinds+0.025
 gestures+0.025
 Susan+0.025
 distress+0.025
 unison+0.024
Neg logit contributions
at-0.029
ip-0.028
affle-0.028
 seeker-0.027
ig-0.025
Token.
Token ID13
Feature activation+4.67e-01

Pos logit contributions
at+0.057
ip+0.054
 seeker+0.054
bled+0.050
affle+0.050
Neg logit contributions
 gestures-0.046
 distress-0.044
 kinds-0.044
 unison-0.044
 appearances-0.043
Token They
Token ID1119
Feature activation+6.28e-02

Pos logit contributions
 kinds+0.009
 distress+0.008
 compare+0.008
 gestures+0.008
 brace+0.008
Neg logit contributions
 seeker-0.010
ip-0.010
at-0.009
ained-0.009
affle-0.009
Token had
Token ID550
Feature activation+1.80e+00

Pos logit contributions
at+0.001
 seeker+0.001
ip+0.001
affle+0.001
ored+0.001
Neg logit contributions
 gestures-0.001
 ways-0.001
 distress-0.001
 kinds-0.001
 unison-0.001
Token a
Token ID257
Feature activation-1.09e+00

Pos logit contributions
at+0.037
 seeker+0.037
ip+0.035
 skating+0.034
ained+0.033
Neg logit contributions
 gestures-0.031
 distress-0.031
 appearances-0.030
 compare-0.030
 kinds-0.029
Token good
Token ID922
Feature activation+1.23e+00

Pos logit contributions
 gestures+0.021
 ways+0.020
 unison+0.020
 kinds+0.020
 compare+0.020
Neg logit contributions
at-0.023
ip-0.020
 seeker-0.020
affle-0.020
bled-0.019
Token ice
Token ID4771
Feature activation+3.12e+00

Pos logit contributions
at+0.020
ip+0.017
 seeker+0.017
affle+0.017
bled+0.016
Neg logit contributions
 gestures-0.030
 ways-0.029
 kinds-0.029
 unison-0.029
 compare-0.028
Token cream
Token ID8566
Feature activation+2.09e+00

Pos logit contributions
at+0.065
 seeker+0.062
affle+0.059
ip+0.059
bled+0.058
Neg logit contributions
 kinds-0.057
 ways-0.056
 who-0.055
 gestures-0.054
 than-0.053
Token.
Token ID13
Feature activation-5.10e-01

Pos logit contributions
at+0.047
 seeker+0.044
ip+0.043
affle+0.042
bled+0.039
Neg logit contributions
 gestures-0.037
 ways-0.036
 kinds-0.036
 distress-0.035
 unison-0.035
Token They
Token ID1119
Feature activation+2.89e-01

Pos logit contributions
 kinds+0.009
 distress+0.009
 compare+0.008
 gestures+0.008
 brace+0.008
Neg logit contributions
 seeker-0.011
ip-0.011
at-0.010
ained-0.010
affle-0.010
Token were
Token ID547
Feature activation+1.48e+00

Pos logit contributions
at+0.006
 seeker+0.006
ip+0.006
affle+0.006
ored+0.005
Neg logit contributions
 gestures-0.005
 ways-0.005
 distress-0.005
 appearances-0.005
 kinds-0.005
Token happy
Token ID3772
Feature activation-2.43e+00

Pos logit contributions
at+0.032
 seeker+0.029
ip+0.029
affle+0.028
bled+0.027
Neg logit contributions
 gestures-0.028
 distress-0.027
 unison-0.027
 kinds-0.027
 ways-0.026
Token.
Token ID13
Feature activation-5.89e-01

Pos logit contributions
 distress+0.041
 unison+0.040
 gestures+0.039
 compare+0.039
 kinds+0.038
Neg logit contributions
 seeker-0.051
ip-0.049
cast-0.048
at-0.048
 skating-0.048
Token their
Token ID511
Feature activation-6.85e-01

Pos logit contributions
 kinds+0.007
 gestures+0.007
 ways+0.007
 distress+0.007
 unison+0.007
Neg logit contributions
at-0.010
 seeker-0.009
ip-0.009
affle-0.009
bled-0.009
Token job
Token ID1693
Feature activation-6.39e-01

Pos logit contributions
 gestures+0.014
 kinds+0.014
 unison+0.014
 ways+0.013
 distress+0.013
Neg logit contributions
at-0.013
 seeker-0.012
ip-0.012
affle-0.012
bled-0.011
Token with
Token ID351
Feature activation-1.20e+00

Pos logit contributions
 gestures+0.009
 kinds+0.009
 unison+0.009
 ways+0.009
 distress+0.009
Neg logit contributions
at-0.016
ip-0.015
 seeker-0.014
affle-0.014
bled-0.014
Token the
Token ID262
Feature activation-6.32e-01

Pos logit contributions
 gestures+0.017
 unison+0.017
 compare+0.016
 kinds+0.016
 sizes+0.016
Neg logit contributions
at-0.029
ip-0.027
 seeker-0.027
affle-0.026
bled-0.025
Token machine
Token ID4572
Feature activation+1.92e-01

Pos logit contributions
 gestures+0.013
 ways+0.013
 kinds+0.013
 unison+0.013
 distress+0.013
Neg logit contributions
at-0.012
 seeker-0.011
ip-0.011
affle-0.011
bled-0.010
Token.
Token ID13
Feature activation-1.01e+00

Pos logit contributions
at+0.005
ip+0.005
 seeker+0.005
affle+0.005
ored+0.004
Neg logit contributions
 gestures-0.002
 unison-0.002
 kinds-0.002
 ways-0.002
 distress-0.002
Token It
Token ID632
Feature activation+5.83e-01

Pos logit contributions
 distress+0.016
 gestures+0.016
ymes+0.016
 harm+0.015
 compare+0.015
Neg logit contributions
ip-0.023
at-0.022
ored-0.021
 seeker-0.021
ained-0.021
Token was
Token ID373
Feature activation-1.21e+00

Pos logit contributions
at+0.013
ip+0.012
 seeker+0.012
affle+0.012
bled+0.011
Neg logit contributions
 gestures-0.010
 ways-0.010
 unison-0.010
 kinds-0.010
 distress-0.010
Token smooth
Token ID7209
Feature activation-1.21e+00

Pos logit contributions
 gestures+0.019
 ways+0.018
 kinds+0.018
 unison+0.018
 compare+0.018
Neg logit contributions
at-0.029
 seeker-0.028
ip-0.027
affle-0.027
bled-0.025
Token and
Token ID290
Feature activation-9.32e-01

Pos logit contributions
 ways+0.013
 kinds+0.013
 gestures+0.013
 who+0.012
 sizes+0.012
Neg logit contributions
at-0.035
 seeker-0.033
affle-0.033
ip-0.032
bled-0.031
Token the
Token ID262
Feature activation-6.72e-01

Pos logit contributions
 gestures+0.014
 kinds+0.014
 ways+0.014
 compare+0.013
 sizes+0.013
Neg logit contributions
at-0.023
 seeker-0.022
affle-0.022
ip-0.021
ored-0.020
Token Dad
Token ID17415
Feature activation-2.14e+00

Pos logit contributions
 gestures+0.032
 kinds+0.031
 sizes+0.031
 appearances+0.029
 distress+0.029
Neg logit contributions
at-0.039
 seeker-0.038
ip-0.037
bled-0.037
affle-0.035
Token's
Token ID338
Feature activation+1.02e+00

Pos logit contributions
 gestures+0.036
 unison+0.036
 harmony+0.035
 appearances+0.034
 ways+0.034
Neg logit contributions
at-0.050
ip-0.048
or-0.043
ored-0.043
 skating-0.043
Token hand
Token ID1021
Feature activation+5.52e-01

Pos logit contributions
at+0.022
 seeker+0.021
affle+0.021
ip+0.020
belt+0.019
Neg logit contributions
 gestures-0.018
 kinds-0.017
 unison-0.017
 appearances-0.017
 compare-0.017
Token and
Token ID290
Feature activation+1.01e+00

Pos logit contributions
at+0.014
ip+0.014
 seeker+0.013
affle+0.013
bled+0.012
Neg logit contributions
 gestures-0.008
 kinds-0.007
 ways-0.007
 unison-0.007
 distress-0.007
Token said
Token ID531
Feature activation+2.38e-02

Pos logit contributions
at+0.023
ip+0.022
 seeker+0.021
bled+0.021
affle+0.020
Neg logit contributions
 gestures-0.017
 ways-0.016
 appearances-0.016
 kinds-0.016
 unison-0.016
Token,
Token ID11
Feature activation-2.08e-01

Pos logit contributions
at+0.001
 seeker+0.000
affle+0.000
ip+0.000
ained+0.000
Neg logit contributions
 gestures-0.000
 ways-0.000
 kinds-0.000
 unison-0.000
 compare-0.000
Token "
Token ID366
Feature activation-2.98e+00

Pos logit contributions
 who+0.002
 gestures+0.002
 compare+0.002
 ways+0.002
 unison+0.002
Neg logit contributions
at-0.006
 seeker-0.006
affle-0.005
ip-0.005
ained-0.005
TokenLet
Token ID5756
Feature activation-5.44e-01

Pos logit contributions
 gestures+0.049
 ways+0.048
 distress+0.047
 kinds+0.046
 compare+0.046
Neg logit contributions
at-0.071
ip-0.065
 seeker-0.064
affle-0.063
ored-0.060
Token's
Token ID338
Feature activation+9.34e-01

Pos logit contributions
 gestures+0.007
 kinds+0.007
 unison+0.007
 ways+0.006
 distress+0.006
Neg logit contributions
at-0.015
ip-0.014
 seeker-0.014
affle-0.014
uch-0.013
Token try
Token ID1949
Feature activation-7.45e-01

Pos logit contributions
at+0.023
 seeker+0.022
affle+0.021
ip+0.021
bled+0.020
Neg logit contributions
 gestures-0.014
 kinds-0.014
 ways-0.014
 unison-0.013
 distress-0.013
Token it
Token ID340
Feature activation+1.96e+00

Pos logit contributions
 ways+0.009
 gestures+0.008
 who+0.008
 kinds+0.008
 compare+0.008
Neg logit contributions
at-0.021
 seeker-0.020
affle-0.020
ip-0.019
bled-0.019
Token box
Token ID3091
Feature activation-1.06e-02

Pos logit contributions
 gestures+0.027
 kinds+0.025
 ways+0.025
 unison+0.025
 compare+0.024
Neg logit contributions
at-0.032
 seeker-0.031
ip-0.030
affle-0.029
belt-0.027
Token and
Token ID290
Feature activation+1.32e+00

Pos logit contributions
 gestures+0.000
 ways+0.000
 unison+0.000
 appearances+0.000
 kinds+0.000
Neg logit contributions
at-0.000
 seeker-0.000
ip-0.000
affle-0.000
bled-0.000
Token tucked
Token ID29779
Feature activation+8.61e-01

Pos logit contributions
at+0.030
ip+0.028
 seeker+0.028
affle+0.027
bled+0.027
Neg logit contributions
 gestures-0.022
 ways-0.021
 unison-0.021
 appearances-0.021
 distress-0.021
Token the
Token ID262
Feature activation+8.95e-01

Pos logit contributions
at+0.026
affle+0.026
ip+0.024
 seeker+0.024
atmeal+0.023
Neg logit contributions
 ways-0.009
 its-0.008
 who-0.008
 kinds-0.008
 gestures-0.008
Token wet
Token ID9583
Feature activation+2.91e-01

Pos logit contributions
at+0.021
 seeker+0.020
ip+0.019
affle+0.019
bled+0.018
Neg logit contributions
 gestures-0.015
 unison-0.014
 kinds-0.014
 ways-0.014
 compare-0.014
Token ornament
Token ID45618
Feature activation+9.89e-01

Pos logit contributions
at+0.007
 seeker+0.007
ip+0.007
affle+0.007
bled+0.006
Neg logit contributions
 gestures-0.004
 unison-0.004
 kinds-0.004
 ways-0.004
 compare-0.004
Token inside
Token ID2641
Feature activation-2.95e-01

Pos logit contributions
at+0.017
 seeker+0.015
ip+0.015
affle+0.014
bled+0.013
Neg logit contributions
 gestures-0.024
 unison-0.023
 kinds-0.023
 ways-0.023
 distress-0.023
Token it
Token ID340
Feature activation+2.16e+00

Pos logit contributions
 gestures+0.005
 appearances+0.005
 unison+0.005
 distress+0.005
 compare+0.004
Neg logit contributions
at-0.007
ip-0.006
 seeker-0.006
ored-0.006
or-0.006
Token.
Token ID13
Feature activation+1.87e-01

Pos logit contributions
at+0.064
 seeker+0.060
ip+0.060
affle+0.059
bled+0.056
Neg logit contributions
 gestures-0.023
 ways-0.021
 kinds-0.021
 unison-0.021
 distress-0.021
Token She
Token ID1375
Feature activation+2.16e-01

Pos logit contributions
at+0.004
 seeker+0.004
ip+0.004
affle+0.004
bled+0.004
Neg logit contributions
 gestures-0.003
 ways-0.003
 kinds-0.003
 unison-0.003
 distress-0.003
Token smiled
Token ID13541
Feature activation-1.58e+00

Pos logit contributions
at+0.005
ip+0.004
 seeker+0.004
affle+0.004
bled+0.004
Neg logit contributions
 gestures-0.004
 ways-0.004
 kinds-0.004
 unison-0.004
 distress-0.004
TokenJames
Token ID14731
Feature activation-1.64e-02

Pos logit contributions
at+0.036
 seeker+0.034
affle+0.034
ip+0.034
bled+0.032
Neg logit contributions
 gestures-0.015
 kinds-0.015
 ways-0.015
 distress-0.014
 unison-0.014
Token had
Token ID550
Feature activation+2.05e+00

Pos logit contributions
 gestures+0.000
 ways+0.000
 kinds+0.000
 unison+0.000
 compare+0.000
Neg logit contributions
at-0.000
 seeker-0.000
ip-0.000
affle-0.000
bled-0.000
Token a
Token ID257
Feature activation-2.36e+00

Pos logit contributions
at+0.057
 seeker+0.053
affle+0.052
ip+0.052
bled+0.049
Neg logit contributions
 ways-0.026
 gestures-0.025
 kinds-0.025
 distress-0.024
 unison-0.024
Token long
Token ID890
Feature activation-7.82e-01

Pos logit contributions
 gestures+0.036
 ways+0.036
 kinds+0.036
 unison+0.035
 distress+0.035
Neg logit contributions
at-0.059
bled-0.053
ip-0.051
affle-0.051
 seeker-0.051
Token rope
Token ID17182
Feature activation+5.44e-01

Pos logit contributions
 gestures+0.012
 unison+0.012
 ways+0.012
 who+0.012
 distress+0.012
Neg logit contributions
at-0.019
bled-0.018
ip-0.017
ored-0.017
affle-0.017
Token that
Token ID326
Feature activation+1.76e+00

Pos logit contributions
at+0.015
 seeker+0.014
ip+0.014
affle+0.014
belt+0.013
Neg logit contributions
 gestures-0.007
 ways-0.006
 unison-0.006
 kinds-0.006
 distress-0.006
Token he
Token ID339
Feature activation-2.91e-01

Pos logit contributions
at+0.042
 seeker+0.039
affle+0.038
ip+0.038
bled+0.036
Neg logit contributions
 kinds-0.028
 gestures-0.028
 ways-0.028
 compare-0.027
 distress-0.026
Token loved
Token ID6151
Feature activation-3.11e+00

Pos logit contributions
 gestures+0.005
 ways+0.005
 unison+0.005
 distress+0.005
 compare+0.005
Neg logit contributions
at-0.007
ip-0.006
 seeker-0.006
affle-0.006
bled-0.006
Token.
Token ID13
Feature activation-5.46e-01

Pos logit contributions
 gestures+0.048
 unison+0.047
 distress+0.045
 compare+0.044
 appearances+0.044
Neg logit contributions
at-0.076
ip-0.072
affle-0.070
 seeker-0.070
 skating-0.069
Token He
Token ID679
Feature activation-1.45e-01

Pos logit contributions
 gestures+0.011
 appearances+0.011
 distress+0.011
ymes+0.011
 brace+0.011
Neg logit contributions
at-0.009
affle-0.009
 seeker-0.009
ip-0.008
ained-0.008
Token wanted
Token ID2227
Feature activation-1.28e+00

Pos logit contributions
 gestures+0.003
 ways+0.002
 kinds+0.002
 unison+0.002
 distress+0.002
Neg logit contributions
at-0.003
 seeker-0.003
ip-0.003
affle-0.003
bled-0.003
Token,"
Token ID553
Feature activation-1.16e+00

Pos logit contributions
at+0.074
ip+0.070
 seeker+0.066
ored+0.064
affle+0.063
Neg logit contributions
 unison-0.038
 gestures-0.038
 distress-0.037
 compare-0.036
 ways-0.036
Token Lily
Token ID20037
Feature activation-2.87e+00

Pos logit contributions
 gestures+0.021
 ways+0.020
 appearances+0.020
 distress+0.020
 kinds+0.020
Neg logit contributions
at-0.024
ip-0.023
 seeker-0.023
affle-0.022
bled-0.021
Token said
Token ID531
Feature activation+2.04e-01

Pos logit contributions
 harmony+0.060
 appearances+0.060
 unison+0.058
 ways+0.057
 harm+0.057
Neg logit contributions
at-0.056
ip-0.055
affle-0.048
ipop-0.048
ored-0.048
Token.
Token ID13
Feature activation+1.59e-01

Pos logit contributions
ip+0.005
bled+0.004
at+0.004
ored+0.004
or+0.004
Neg logit contributions
 kinds-0.004
 sizes-0.004
 unison-0.004
 sorts-0.004
 distress-0.003
Token
Token ID198
Feature activation-1.63e-01

Pos logit contributions
ip+0.003
at+0.003
 seeker+0.003
ored+0.003
affle+0.003
Neg logit contributions
 kinds-0.003
 gestures-0.003
 distress-0.003
 ways-0.003
 unison-0.003
Token
Token ID198
Feature activation-3.42e-01

Pos logit contributions
 gestures+0.003
 kinds+0.003
 distress+0.003
 ways+0.003
 compare+0.003
Neg logit contributions
at-0.003
ip-0.003
affle-0.003
 seeker-0.003
or-0.003
TokenThey
Token ID2990
Feature activation+6.08e-01

Pos logit contributions
 gestures+0.007
 distress+0.007
 kinds+0.007
 acts+0.007
 ways+0.007
Neg logit contributions
at-0.006
ip-0.006
 seeker-0.006
 skating-0.005
ding-0.005
Token carefully
Token ID7773
Feature activation+4.83e-01

Pos logit contributions
at+0.013
ip+0.012
bled+0.011
 seeker+0.011
affle+0.011
Neg logit contributions
 gestures-0.012
 ways-0.011
 appearances-0.011
 distress-0.011
 kinds-0.011
Token picked
Token ID6497
Feature activation+1.70e+00

Pos logit contributions
at+0.011
ip+0.011
affle+0.010
 seeker+0.010
ored+0.010
Neg logit contributions
 gestures-0.008
 ways-0.008
 kinds-0.008
 unison-0.008
 appearances-0.008
Token up
Token ID510
Feature activation-1.05e+00

Pos logit contributions
at+0.029
 seeker+0.026
ip+0.026
affle+0.026
bled+0.023
Neg logit contributions
 gestures-0.039
 ways-0.038
 kinds-0.038
 unison-0.037
 distress-0.037
Token the
Token ID262
Feature activation+1.17e+00

Pos logit contributions
 gestures+0.015
 compare+0.015
 unison+0.015
 distress+0.015
 kinds+0.015
Neg logit contributions
at-0.026
ip-0.025
 seeker-0.024
affle-0.023
bled-0.022
Token a
Token ID257
Feature activation+2.07e+00

Pos logit contributions
 gestures+0.001
 compare+0.001
 appearances+0.001
 sizes+0.001
 kinds+0.001
Neg logit contributions
at-0.001
 seeker-0.001
affle-0.001
ip-0.001
ained-0.001
Token time
Token ID640
Feature activation+5.06e+00

Pos logit contributions
at+0.034
 seeker+0.030
ip+0.030
affle+0.029
uch+0.025
Neg logit contributions
 gestures-0.051
 kinds-0.049
 compare-0.048
 ways-0.048
 sentences-0.047
Token,
Token ID11
Feature activation+5.41e+00

Pos logit contributions
at+0.135
affle+0.126
ip+0.121
 seeker+0.121
atmeal+0.117
Neg logit contributions
 who-0.071
 ways-0.069
 gestures-0.068
 Susan-0.067
 kinds-0.066
Token there
Token ID612
Feature activation+3.29e+00

Pos logit contributions
at+0.125
affle+0.116
ip+0.113
atmeal+0.110
 seeker+0.107
Neg logit contributions
 who-0.104
 its-0.097
 whose-0.094
 distress-0.093
 Amy-0.093
Token was
Token ID373
Feature activation+2.45e-02

Pos logit contributions
at+0.077
 seeker+0.072
ip+0.070
affle+0.070
bled+0.065
Neg logit contributions
 gestures-0.054
 ways-0.053
 kinds-0.053
 unison-0.052
 distress-0.051
Token a
Token ID257
Feature activation-2.63e+00

Pos logit contributions
at+0.001
 seeker+0.001
ip+0.001
affle+0.000
ained+0.000
Neg logit contributions
 gestures-0.000
 compare-0.000
 ways-0.000
 kinds-0.000
 appearances-0.000
Token brave
Token ID14802
Feature activation-2.79e+00

Pos logit contributions
 who+0.030
 gestures+0.028
 unison+0.027
 compare+0.027
 ways+0.026
Neg logit contributions
at-0.078
ained-0.070
ip-0.070
affle-0.069
bled-0.069
Token fish
Token ID5916
Feature activation-1.74e+00

Pos logit contributions
 gestures+0.034
 who+0.034
 ways+0.033
 kinds+0.033
 unison+0.032
Neg logit contributions
at-0.077
bled-0.071
ained-0.070
ip-0.069
uch-0.068
Token who
Token ID508
Feature activation+1.43e+00

Pos logit contributions
 gestures+0.033
 sentences+0.032
 unison+0.032
 compare+0.032
 ways+0.032
Neg logit contributions
at-0.038
 seeker-0.034
ip-0.034
affle-0.033
ained-0.031
Token lived
Token ID5615
Feature activation+1.77e+00

Pos logit contributions
at+0.030
 seeker+0.028
ip+0.027
affle+0.027
bled+0.025
Neg logit contributions
 gestures-0.027
 ways-0.026
 kinds-0.026
 unison-0.026
 distress-0.026
Token in
Token ID287
Feature activation-8.43e-01

Pos logit contributions
at+0.041
ip+0.037
affle+0.037
 seeker+0.037
atmeal+0.035
Neg logit contributions
 ways-0.030
 who-0.030
 kinds-0.029
 gestures-0.029
 distress-0.029
Token.
Token ID13
Feature activation-9.35e-01

Pos logit contributions
 gestures+0.023
 unison+0.023
 compare+0.022
 appearances+0.021
 distress+0.021
Neg logit contributions
at-0.052
ip-0.049
 seeker-0.048
affle-0.048
ored-0.045
Token They
Token ID1119
Feature activation-5.29e-01

Pos logit contributions
ymes+0.019
 harm+0.018
 gestures+0.018
 distress+0.018
 unison+0.017
Neg logit contributions
at-0.017
ip-0.017
 seeker-0.016
affle-0.016
belt-0.015
Token surprised
Token ID6655
Feature activation-3.23e+00

Pos logit contributions
 gestures+0.009
 kinds+0.009
 ways+0.009
 unison+0.008
 compare+0.008
Neg logit contributions
at-0.012
 seeker-0.011
ip-0.011
affle-0.011
bled-0.010
Token him
Token ID683
Feature activation-2.47e-01

Pos logit contributions
 unison+0.067
 gestures+0.064
 kinds+0.064
 distress+0.064
ymes+0.063
Neg logit contributions
at-0.055
 seeker-0.052
cast-0.050
ored-0.049
or-0.049
Token with
Token ID351
Feature activation-1.10e+00

Pos logit contributions
 gestures+0.004
 kinds+0.004
 ways+0.004
 unison+0.004
 compare+0.004
Neg logit contributions
at-0.005
 seeker-0.005
ip-0.005
affle-0.005
bled-0.005
Token an
Token ID281
Feature activation-2.51e+00

Pos logit contributions
Maria+0.019
ever+0.018
ymes+0.018
keys+0.018
 appearances+0.018
Neg logit contributions
affle-0.020
 seeker-0.020
at-0.020
ip-0.020
bled-0.019
Token amazing
Token ID4998
Feature activation-8.85e-01

Pos logit contributions
 gestures+0.041
 unison+0.039
 ways+0.039
 kinds+0.038
 compare+0.038
Neg logit contributions
at-0.059
 seeker-0.055
affle-0.055
ip-0.054
bled-0.050
Token robot
Token ID9379
Feature activation-2.49e+00

Pos logit contributions
 gestures+0.013
 kinds+0.012
 ways+0.012
 distress+0.012
 unison+0.012
Neg logit contributions
at-0.023
ip-0.022
 seeker-0.021
affle-0.021
bled-0.021
Token dog
Token ID3290
Feature activation-1.33e+00

Pos logit contributions
 gestures+0.042
 distress+0.041
 ways+0.040
 kinds+0.040
 unison+0.040
Neg logit contributions
at-0.055
ip-0.055
 seeker-0.053
affle-0.049
ig-0.049
Token!
Token ID0
Feature activation-1.16e+00

Pos logit contributions
 gestures+0.022
 ways+0.022
 kinds+0.022
 unison+0.021
 distress+0.021
Neg logit contributions
ip-0.030
at-0.030
 seeker-0.030
affle-0.027
ored-0.026
Token
Token ID220
Feature activation+3.04e+00

Pos logit contributions
 gestures+0.019
 distress+0.018
ymes+0.018
 kinds+0.018
 ways+0.018
Neg logit contributions
at-0.026
ip-0.026
 seeker-0.024
affle-0.024
ored-0.023
Token S
Token ID311
Feature activation+2.27e-01

Pos logit contributions
 gestures+0.009
 kinds+0.009
 ways+0.008
 who+0.008
 distress+0.008
Neg logit contributions
at-0.014
affle-0.012
 seeker-0.012
bled-0.012
ip-0.012
Tokenque
Token ID4188
Feature activation+1.13e+00

Pos logit contributions
 seeker+0.005
at+0.005
bled+0.005
belt+0.005
ip+0.005
Neg logit contributions
 ways-0.003
 gestures-0.003
 kinds-0.003
 who-0.003
 distress-0.003
Tokenaky
Token ID15492
Feature activation-2.08e+00

Pos logit contributions
at+0.023
ip+0.021
 seeker+0.020
affle+0.020
bled+0.018
Neg logit contributions
 gestures-0.023
 unison-0.022
 ways-0.022
 kinds-0.022
 compare-0.022
Token was
Token ID373
Feature activation+5.04e-01

Pos logit contributions
 gestures+0.031
 ways+0.030
 kinds+0.030
 unison+0.029
 compare+0.029
Neg logit contributions
at-0.052
ip-0.048
 seeker-0.048
affle-0.047
bled-0.045
Token also
Token ID635
Feature activation+1.73e-01

Pos logit contributions
at+0.012
ip+0.011
 seeker+0.011
affle+0.011
or+0.011
Neg logit contributions
 gestures-0.008
 ways-0.008
 distress-0.008
 who-0.008
 kinds-0.008
Token very
Token ID845
Feature activation-2.49e+00

Pos logit contributions
at+0.005
ip+0.004
affle+0.004
 seeker+0.004
or+0.004
Neg logit contributions
 gestures-0.002
 who-0.002
 distress-0.002
 kinds-0.002
 ways-0.002
Token lonely
Token ID21757
Feature activation-2.50e+00

Pos logit contributions
 gestures+0.038
 ways+0.037
 kinds+0.037
 unison+0.036
 distress+0.036
Neg logit contributions
at-0.061
 seeker-0.057
ip-0.057
affle-0.056
bled-0.053
Token.
Token ID13
Feature activation-7.35e-01

Pos logit contributions
 gestures+0.040
 unison+0.038
 compare+0.037
 sentences+0.037
 distress+0.036
Neg logit contributions
 seeker-0.060
at-0.059
ip-0.056
affle-0.054
ained-0.052
Token Fl
Token ID1610
Feature activation+1.84e+00

Pos logit contributions
 gestures+0.010
 ways+0.010
 kinds+0.010
 distress+0.009
 who+0.009
Neg logit contributions
at-0.019
 seeker-0.018
affle-0.018
ip-0.017
bled-0.017
Tokenuffy
Token ID15352
Feature activation-1.36e+00

Pos logit contributions
at+0.020
 seeker+0.017
affle+0.016
ip+0.015
bled+0.014
Neg logit contributions
 gestures-0.053
 ways-0.052
 distress-0.052
 kinds-0.052
 unison-0.052
Token and
Token ID290
Feature activation+2.44e+00

Pos logit contributions
 gestures+0.017
 ways+0.017
 kinds+0.017
 distress+0.016
 compare+0.016
Neg logit contributions
at-0.037
 seeker-0.035
ip-0.034
affle-0.034
bled-0.032
Token.
Token ID13
Feature activation-1.46e+00

Pos logit contributions
 unison+0.053
 kinds+0.051
 distress+0.050
 gestures+0.049
 compare+0.048
Neg logit contributions
at-0.068
ip-0.067
 seeker-0.064
ored-0.062
or-0.062
Token
Token ID198
Feature activation+1.65e-01

Pos logit contributions
 kinds+0.020
 gestures+0.020
 compare+0.019
 ways+0.019
 distress+0.019
Neg logit contributions
 seeker-0.036
at-0.036
ip-0.035
affle-0.033
ained-0.033
Token
Token ID198
Feature activation-1.46e-01

Pos logit contributions
at+0.004
ip+0.004
 seeker+0.004
affle+0.004
bled+0.003
Neg logit contributions
 gestures-0.003
 ways-0.003
 kinds-0.003
 compare-0.003
 distress-0.003
TokenBen
Token ID11696
Feature activation+3.12e-01

Pos logit contributions
 gestures+0.002
 ways+0.002
 kinds+0.002
 distress+0.002
 compare+0.002
Neg logit contributions
at-0.003
 seeker-0.003
ip-0.003
affle-0.003
bled-0.003
Token's
Token ID338
Feature activation+1.72e+00

Pos logit contributions
at+0.007
 seeker+0.007
ip+0.006
affle+0.006
bled+0.006
Neg logit contributions
 gestures-0.005
 kinds-0.005
 ways-0.005
 compare-0.005
 unison-0.005
Token mom
Token ID1995
Feature activation-3.35e+00

Pos logit contributions
at+0.037
 seeker+0.035
affle+0.034
ip+0.034
bled+0.032
Neg logit contributions
 gestures-0.031
 kinds-0.031
 ways-0.030
 unison-0.030
 compare-0.029
Token and
Token ID290
Feature activation+1.79e+00

Pos logit contributions
 gestures+0.051
 kinds+0.050
 ways+0.049
 unison+0.047
 compare+0.047
Neg logit contributions
at-0.084
ip-0.077
 seeker-0.076
affle-0.075
ored-0.071
Token dad
Token ID9955
Feature activation-1.46e+00

Pos logit contributions
 seeker+0.035
at+0.032
ip+0.032
affle+0.030
ipop+0.030
Neg logit contributions
 kinds-0.034
 unison-0.034
 sizes-0.033
 gestures-0.033
ymes-0.033
Token were
Token ID547
Feature activation+4.79e-01

Pos logit contributions
 gestures+0.025
 kinds+0.025
 ways+0.025
 unison+0.024
 appearances+0.024
Neg logit contributions
at-0.033
 seeker-0.030
ip-0.030
affle-0.029
ored-0.028
Token worried
Token ID7960
Feature activation-3.54e+00

Pos logit contributions
at+0.011
 seeker+0.010
ip+0.010
affle+0.010
bled+0.009
Neg logit contributions
 gestures-0.009
 ways-0.008
 kinds-0.008
 unison-0.008
 distress-0.008
Token too
Token ID1165
Feature activation-1.66e+00

Pos logit contributions
 kinds+0.042
 gestures+0.042
 distress+0.041
 unison+0.041
 appearances+0.039
Neg logit contributions
at-0.093
ip-0.088
affle-0.086
 seeker-0.085
ored-0.084
Token over
Token ID625
Feature activation-4.22e-01

Pos logit contributions
at+0.050
atmeal+0.047
uch+0.046
affle+0.046
ip+0.045
Neg logit contributions
 its-0.028
 who-0.028
 each-0.027
 Susan-0.027
 than-0.027
Token and
Token ID290
Feature activation+1.72e+00

Pos logit contributions
 gestures+0.006
 unison+0.006
 appearances+0.006
 compare+0.006
 sizes+0.006
Neg logit contributions
at-0.010
 seeker-0.010
ip-0.010
affle-0.009
ored-0.009
Token gave
Token ID2921
Feature activation+1.99e-01

Pos logit contributions
at+0.047
affle+0.044
 seeker+0.044
ained+0.041
ip+0.041
Neg logit contributions
 who-0.024
 gestures-0.022
 kinds-0.022
 its-0.022
 compare-0.021
Token the
Token ID262
Feature activation-2.35e+00

Pos logit contributions
at+0.005
 seeker+0.005
affle+0.005
ip+0.005
ained+0.005
Neg logit contributions
 gestures-0.003
 ways-0.003
 kinds-0.002
 distress-0.002
 compare-0.002
Token rabbit
Token ID22746
Feature activation-2.71e+00

Pos logit contributions
 gestures+0.032
 ways+0.031
 kinds+0.031
 unison+0.030
 compare+0.030
Neg logit contributions
at-0.062
 seeker-0.058
ip-0.057
affle-0.057
bled-0.054
Token a
Token ID257
Feature activation-2.53e+00

Pos logit contributions
 gestures+0.042
 ways+0.040
 kinds+0.040
 unison+0.040
 compare+0.040
Neg logit contributions
at-0.066
 seeker-0.061
ip-0.061
affle-0.060
bled-0.056
Token big
Token ID1263
Feature activation-2.84e+00

Pos logit contributions
 gestures+0.037
 kinds+0.036
 unison+0.035
 distress+0.035
 ways+0.035
Neg logit contributions
at-0.064
 seeker-0.060
ip-0.058
affle-0.057
bled-0.057
Token hug
Token ID16225
Feature activation-1.45e+00

Pos logit contributions
 gestures+0.036
 kinds+0.034
 ways+0.033
 unison+0.033
 distress+0.033
Neg logit contributions
at-0.079
 seeker-0.073
ip-0.072
affle-0.071
bled-0.071
Token.
Token ID13
Feature activation-3.69e-01

Pos logit contributions
 gestures+0.013
 ways+0.013
 kinds+0.012
 unison+0.012
 distress+0.012
Neg logit contributions
at-0.045
 seeker-0.042
ip-0.042
affle-0.041
bled-0.040
Token But
Token ID887
Feature activation+7.62e-01

Pos logit contributions
 gestures+0.006
 ways+0.006
 kinds+0.006
 distress+0.005
 compare+0.005
Neg logit contributions
at-0.009
 seeker-0.008
ip-0.008
affle-0.008
bled-0.008
Token when
Token ID618
Feature activation+1.12e+00

Pos logit contributions
at+0.024
affle+0.023
ip+0.021
atmeal+0.021
 seeker+0.020
Neg logit contributions
 its-0.009
 who-0.009
 ways-0.007
 Hope-0.007
 Charlie-0.007