TOP ACTIVATIONS
MAX = 10.769

of water from the morning de w . Then
were wet from the morning de w . The
, soaking up the morning de w that gl owed on
laugh and shout , " Whe e !" They
laugh and shout , " Whe e ! That was fun
and slides down . " Whe e !" "
and slides down . " Whe e !" "
and slides down . " Whe e !" "
slides down fast . " Whe e !" he shouts .
smiles and shouts , " Whe e ! That was fun
slides down fast . " Whe e !" she shouts .
and slides down . " Whe e !" "
in the air . " Whe e , it is like
. It was covered in de w from the wet rain
different styles like stars or spir als . Everyone was so
, the farmer in the de ll ". Suddenly , he
and wet from the morning de w . She pointed at
so much fun . " Whe e !" he shouted .
gently and said , " Whe e , Brown ie ,
smiled and shouted , " Whe e !" L

MIN ACTIVATIONS
MIN = -7.860

. She said , " Dear Tom , I made this
. It said , " Dear Anna , I have a
her and said , " Dear daughter , I knew that
says : " Dear Lily and Ben , Happy
The thief replied " Dear little girl , I need
prayed . " Dear God , please help us
. She said : " Dear Bob , we love you
mom . " Dear Sara , I miss you
it . " Dear Lily and Ben , we
slowly . " Dear ... An - na ...
The letter said : " Dear Sara and Ben , I
it out loud . " Dear mom , we love you
prayed and said , " Dear God , please forgive us
said to Frank : " Dear all igator , if you
said : " Dear Sara , Thank
closed her eyes . " Dear God ," she said softly
. It says : " Dear Sam , I like you
to pray instead . " Dear God , please let me
together . " Dear God , bless us and
. It says : " Dear Tim , we love you

INTERVAL 6.111 - 10.769
CONTAINS 0.079%

lake . But it was all no use - the fog
time for dinner . <|endoftext|> S ara did not like broccoli
the ranch stopped ringing and all the animals were content .
day he accidentally spilled ink all over his paper . He
. Now she could tw irl and dance whenever she wanted

INTERVAL 1.454 - 6.111
CONTAINS 21.856%

girl . <|endoftext|> Anna and Ben are playing in the park
bird was so happy that it sang a beautiful song for
that mistakes can be fixed . <|endoftext|> Once there was a
Tim . Tim was sad because he had no friends .
snow men and snow balls and slide on the ice .

INTERVAL -3.203 - 1.454
CONTAINS 69.090%

was very sick . Lucy 's parents took Hop to the
She thanked the man and ran off with the shiny toy
had taken her . Sarah felt so brave and
and it was hungry . It wanted to eat the ot
the flashlight ," Tom says . " OK ,

INTERVAL -7.860 - -3.203
CONTAINS 8.975%

Look , I can make a circle with my fork and
Now , little fox , do you understand why your
. He knew he had to warn the sailors on the
he knew it was going to be a great day .
morning , she put on her special uniform and visited people
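
NOTE ON THE INTERVALS
The four INTERVAL ranges above look like an equal-width split of the span from MIN to MAX. The sketch below checks that reading. It assumes equal-width binning, which the dashboard itself does not state, and the helper name interval_edges is hypothetical. Because MIN and MAX are displayed rounded to three decimals, the recomputed edges are only expected to match the displayed ones to within about 0.001.

```python
# Assumption: the dashboard bins activations into equal-width intervals
# between the displayed MIN and MAX. This is inferred from the numbers,
# not documented in the dump.

def interval_edges(lo, hi, bins=4):
    """Return bins + 1 equally spaced edges from lo to hi."""
    width = (hi - lo) / bins
    return [lo + i * width for i in range(bins + 1)]

# Values copied from the MIN / MAX lines of the dump.
edges = interval_edges(-7.860, 10.769)

# Edges as displayed in the INTERVAL lines, lowest to highest.
displayed = [-7.860, -3.203, 1.454, 6.111, 10.769]

# Largest disagreement between recomputed and displayed edges.
max_gap = max(abs(e - d) for e, d in zip(edges, displayed))

# CONTAINS percentages, lowest interval to highest.
shares = [8.975, 69.090, 21.856, 0.079]
total_share = sum(shares)

print(edges, max_gap, total_share)
```

Under this reading, roughly 69% of sampled tokens fall in the bin that straddles zero, and fewer than 0.1% reach the top bin where the "de" and "Whe" examples sit.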
Token of
Token ID286
Feature activation+9.35e-01

Pos logit contributions
Looks+0.094
Any+0.089
|+0.089
You+0.086
ksh+0.085
Neg logit contributions
keeper-0.054
 keeper-0.049
 worker-0.046
ache-0.045
 doctor-0.044
Token water
Token ID1660
Feature activation+1.17e+00

Pos logit contributions
Looks+0.020
Any+0.020
|+0.019
hed+0.019
You+0.019
Neg logit contributions
keeper-0.021
 keeper-0.020
 hinges-0.019
 worker-0.019
 instincts-0.018
Token from
Token ID422
Feature activation+1.01e+00

Pos logit contributions
Looks+0.029
Any+0.029
You+0.027
|+0.027
hed+0.027
Neg logit contributions
keeper-0.023
 keeper-0.022
 worker-0.020
ache-0.020
 hinges-0.019
Token the
Token ID262
Feature activation+1.93e+00

Pos logit contributions
Looks+0.018
Any+0.018
hed+0.017
pill+0.017
You+0.017
Neg logit contributions
keeper-0.027
 keeper-0.026
 worker-0.024
 hinges-0.024
 doctor-0.024
Token morning
Token ID3329
Feature activation-1.78e+00

Pos logit contributions
Any+0.040
Looks+0.039
You+0.038
hed+0.037
pill+0.037
Neg logit contributions
keeper-0.045
 keeper-0.042
 worker-0.039
ache-0.039
 instincts-0.039
Token de
Token ID390
Feature activation+1.08e+01

Pos logit contributions
keeper+0.041
 keeper+0.038
 instincts+0.036
 hinges+0.036
 worker+0.036
Neg logit contributions
Any-0.037
hed-0.037
Looks-0.036
ee-0.035
You-0.035
Tokenw
Token ID86
Feature activation+2.29e+00

Pos logit contributions
Looks+0.162
pill+0.157
Any+0.156
 You+0.150
 Wed+0.150
Neg logit contributions
keeper-0.310
ache-0.295
 keeper-0.294
 worker-0.288
 code-0.273
Token.
Token ID13
Feature activation+6.98e-01

Pos logit contributions
Any+0.047
Looks+0.046
hed+0.045
You+0.044
ee+0.043
Neg logit contributions
keeper-0.055
 keeper-0.053
 worker-0.049
 code-0.048
 hinges-0.048
Token
Token ID198
Feature activation-1.61e+00

Pos logit contributions
Looks+0.014
Any+0.014
ee+0.013
hed+0.013
pill+0.013
Neg logit contributions
keeper-0.017
 keeper-0.016
 worker-0.015
 doctor-0.015
 hinges-0.015
Token
Token ID198
Feature activation-1.83e+00

Pos logit contributions
keeper+0.047
 keeper+0.045
 hinges+0.044
ache+0.044
 instincts+0.044
Neg logit contributions
Any-0.024
|-0.023
You-0.023
Looks-0.023
ee-0.022
TokenThen
Token ID6423
Feature activation-7.98e-01

Pos logit contributions
keeper+0.038
 keeper+0.036
 worker+0.033
ache+0.033
 hinges+0.033
Neg logit contributions
Looks-0.043
Any-0.043
hed-0.041
You-0.041
pill-0.040
Token were
Token ID547
Feature activation+8.42e-01

Pos logit contributions
Looks+0.052
Any+0.051
|+0.048
You+0.048
pill+0.047
Neg logit contributions
keeper-0.075
 keeper-0.071
ache-0.067
 worker-0.066
 hinges-0.064
Token wet
Token ID9583
Feature activation+2.13e+00

Pos logit contributions
Any+0.014
Looks+0.014
hed+0.013
You+0.013
ee+0.013
Neg logit contributions
keeper-0.023
 keeper-0.022
 worker-0.021
 code-0.021
 instincts-0.021
Token from
Token ID422
Feature activation+5.28e-01

Pos logit contributions
Looks+0.051
Any+0.051
You+0.048
pill+0.048
hed+0.048
Neg logit contributions
keeper-0.044
 keeper-0.041
 worker-0.038
ache-0.038
 hinges-0.037
Token the
Token ID262
Feature activation+1.25e+00

Pos logit contributions
Looks+0.010
Any+0.010
hed+0.009
pill+0.009
You+0.009
Neg logit contributions
keeper-0.014
 keeper-0.013
 worker-0.012
 hinges-0.012
 doctor-0.012
Token morning
Token ID3329
Feature activation-1.78e+00

Pos logit contributions
Any+0.027
Looks+0.026
You+0.025
hed+0.025
 You+0.025
Neg logit contributions
keeper-0.028
 keeper-0.026
 worker-0.025
ache-0.024
 hinges-0.024
Token de
Token ID390
Feature activation+1.07e+01

Pos logit contributions
keeper+0.040
 keeper+0.038
 instincts+0.035
 worker+0.035
 hinges+0.035
Neg logit contributions
Any-0.038
Looks-0.037
hed-0.037
You-0.036
ee-0.035
Tokenw
Token ID86
Feature activation+2.19e+00

Pos logit contributions
Looks+0.150
Any+0.147
pill+0.147
 Wed+0.144
|+0.138
Neg logit contributions
keeper-0.322
 keeper-0.307
ache-0.301
 worker-0.296
 code-0.284
Token.
Token ID13
Feature activation+9.57e-01

Pos logit contributions
Looks+0.044
Any+0.044
hed+0.041
You+0.041
pill+0.040
Neg logit contributions
keeper-0.054
 keeper-0.051
 worker-0.048
ache-0.047
 hinges-0.047
Token
Token ID198
Feature activation-1.01e+00

Pos logit contributions
Looks+0.018
Any+0.017
ee+0.016
hed+0.016
|+0.015
Neg logit contributions
keeper-0.026
 keeper-0.024
 doctor-0.023
 worker-0.023
 aid-0.022
Token
Token ID198
Feature activation-1.34e+00

Pos logit contributions
keeper+0.027
 keeper+0.026
 hinges+0.025
ache+0.025
 worker+0.025
Neg logit contributions
Any-0.018
Looks-0.017
You-0.017
hed-0.016
|-0.016
TokenThe
Token ID464
Feature activation+5.87e-01

Pos logit contributions
keeper+0.029
 keeper+0.028
 worker+0.025
ache+0.025
 aid+0.024
Neg logit contributions
Looks-0.030
Any-0.030
hed-0.029
|-0.029
pill-0.029
Token,
Token ID11
Feature activation-4.50e+00

Pos logit contributions
keeper+0.042
 keeper+0.039
 worker+0.037
 hinges+0.036
ache+0.036
Neg logit contributions
Any-0.043
Looks-0.043
hed-0.042
You-0.041
ee-0.040
Token soaking
Token ID41714
Feature activation+3.72e+00

Pos logit contributions
keeper+0.104
 keeper+0.100
 worker+0.091
 aid+0.090
 doctor+0.089
Neg logit contributions
Looks-0.096
Any-0.095
hed-0.091
pill-0.090
|-0.090
Token up
Token ID510
Feature activation+8.69e-01

Pos logit contributions
pill+0.088
Any+0.086
Looks+0.086
|+0.083
Whe+0.082
Neg logit contributions
keeper-0.074
 keeper-0.071
 door-0.068
 instincts-0.063
 hinges-0.061
Token the
Token ID262
Feature activation+9.02e-01

Pos logit contributions
Looks+0.021
Any+0.021
ee+0.020
pill+0.020
You+0.020
Neg logit contributions
keeper-0.017
 keeper-0.017
 door-0.016
 hinges-0.015
 worker-0.015
Token morning
Token ID3329
Feature activation-2.38e+00

Pos logit contributions
Any+0.019
Looks+0.019
hed+0.018
You+0.018
pill+0.018
Neg logit contributions
keeper-0.021
 keeper-0.020
 door-0.019
 worker-0.018
 hinges-0.018
Token de
Token ID390
Feature activation+1.06e+01

Pos logit contributions
keeper+0.050
 keeper+0.048
 worker+0.043
ache+0.043
 hinges+0.042
Neg logit contributions
Looks-0.057
Any-0.056
You-0.053
hed-0.053
|-0.052
Tokenw
Token ID86
Feature activation+1.79e+00

Pos logit contributions
 Wed+0.182
Looks+0.181
pill+0.178
Any+0.176
 imag+0.174
Neg logit contributions
keeper-0.286
 keeper-0.270
ache-0.269
 worker-0.255
 lady-0.240
Token that
Token ID326
Feature activation+1.30e-01

Pos logit contributions
Any+0.035
Looks+0.034
hed+0.034
ee+0.032
You+0.032
Neg logit contributions
keeper-0.044
 keeper-0.042
 worker-0.040
 code-0.039
 hinges-0.039
Token gl
Token ID1278
Feature activation+2.15e+00

Pos logit contributions
Looks+0.003
Any+0.003
pill+0.003
hed+0.003
|+0.003
Neg logit contributions
keeper-0.003
 keeper-0.003
 worker-0.003
ache-0.003
 doctor-0.003
Tokenowed
Token ID6972
Feature activation+6.39e-01

Pos logit contributions
Any+0.054
Looks+0.053
hed+0.051
ee+0.051
You+0.049
Neg logit contributions
keeper-0.041
 keeper-0.040
 worker-0.036
 dentist-0.036
 hinges-0.036
Token on
Token ID319
Feature activation+2.32e+00

Pos logit contributions
Looks+0.015
Any+0.014
pill+0.014
|+0.014
You+0.014
Neg logit contributions
keeper-0.014
 keeper-0.013
 worker-0.012
ache-0.012
 instincts-0.012
Token laugh
Token ID6487
Feature activation+2.52e+00

Pos logit contributions
Any+0.029
Looks+0.029
pill+0.029
ksh+0.026
hed+0.026
Neg logit contributions
 keeper-0.027
keeper-0.027
 code-0.024
 ring-0.024
 doctor-0.024
Token and
Token ID290
Feature activation+4.14e+00

Pos logit contributions
Looks+0.069
Any+0.065
pill+0.065
|+0.065
ee+0.063
Neg logit contributions
keeper-0.045
 keeper-0.042
 worker-0.038
ache-0.037
 door-0.037
Token shout
Token ID9191
Feature activation-2.50e-03

Pos logit contributions
pill+0.118
Any+0.114
Looks+0.114
ays+0.108
endo+0.106
Neg logit contributions
 keeper-0.068
 lady-0.062
 ring-0.061
 click-0.060
keeper-0.058
Token,
Token ID11
Feature activation-2.76e+00

Pos logit contributions
keeper+0.000
 keeper+0.000
 worker+0.000
ache+0.000
 hinges+0.000
Neg logit contributions
Any-0.000
Looks-0.000
hed-0.000
You-0.000
ee-0.000
Token "
Token ID366
Feature activation+1.32e+00

Pos logit contributions
keeper+0.039
 keeper+0.038
 worker+0.033
 instincts+0.031
 doctor+0.031
Neg logit contributions
Looks-0.083
ee-0.081
Any-0.080
|-0.080
hed-0.080
TokenWhe
Token ID10842
Feature activation+1.06e+01

Pos logit contributions
 Wed+0.017
pill+0.017
 sorts+0.017
 breaths+0.017
igg+0.016
Neg logit contributions
 keeper-0.034
keeper-0.033
 driver-0.031
 lady-0.030
man-0.030
Tokene
Token ID68
Feature activation+2.91e+00

Pos logit contributions
Looks+0.182
pill+0.173
Any+0.152
You+0.152
ksh+0.150
Neg logit contributions
 keeper-0.260
 aid-0.254
keeper-0.250
 worker-0.248
lf-0.246
Token!"
Token ID2474
Feature activation+1.01e+00

Pos logit contributions
Looks+0.079
Any+0.073
pill+0.070
You+0.069
|+0.068
Neg logit contributions
keeper-0.054
 keeper-0.053
ache-0.048
 worker-0.048
alo-0.047
Token
Token ID198
Feature activation-1.21e+00

Pos logit contributions
Looks+0.024
Any+0.023
pill+0.022
ee+0.022
hed+0.022
Neg logit contributions
keeper-0.021
 keeper-0.020
 aid-0.019
 doctor-0.019
 worker-0.018
Token
Token ID198
Feature activation-1.39e+00

Pos logit contributions
keeper+0.036
 keeper+0.034
ache+0.033
 hinges+0.033
 worker+0.033
Neg logit contributions
Any-0.018
You-0.017
Looks-0.017
|-0.017
hed-0.017
TokenThey
Token ID2990
Feature activation+4.97e-01

Pos logit contributions
keeper+0.030
 keeper+0.028
 worker+0.026
 hinges+0.026
ache+0.026
Neg logit contributions
Looks-0.033
Any-0.033
You-0.031
hed-0.031
ee-0.030
Token laugh
Token ID6487
Feature activation+2.67e+00

Pos logit contributions
Looks+0.037
Any+0.036
pill+0.035
ksh+0.034
 You+0.033
Neg logit contributions
 keeper-0.038
keeper-0.037
 doctor-0.034
ache-0.034
 ring-0.034
Token and
Token ID290
Feature activation+4.24e+00

Pos logit contributions
Looks+0.072
Any+0.069
|+0.068
pill+0.068
ee+0.066
Neg logit contributions
keeper-0.048
 keeper-0.046
 doctor-0.041
 worker-0.041
 door-0.040
Token shout
Token ID9191
Feature activation+1.26e-01

Pos logit contributions
pill+0.114
Looks+0.113
Any+0.110
ays+0.108
ash+0.103
Neg logit contributions
 keeper-0.076
 lady-0.072
 ring-0.069
 click-0.067
 man-0.066
Token,
Token ID11
Feature activation-2.74e+00

Pos logit contributions
Looks+0.003
Any+0.003
hed+0.003
You+0.003
pill+0.003
Neg logit contributions
keeper-0.003
 keeper-0.002
 worker-0.002
ache-0.002
 hinges-0.002
Token "
Token ID366
Feature activation+1.52e+00

Pos logit contributions
 keeper+0.036
keeper+0.036
 worker+0.030
 lady+0.028
 doctor+0.028
Neg logit contributions
ee-0.085
Looks-0.084
hed-0.083
|-0.082
pill-0.081
TokenWhe
Token ID10842
Feature activation+1.05e+01

Pos logit contributions
 sorts+0.021
 Wed+0.019
igg+0.019
 breaths+0.019
ays+0.018
Neg logit contributions
 keeper-0.038
keeper-0.038
ache-0.036
 driver-0.035
 lady-0.034
Tokene
Token ID68
Feature activation+2.94e+00

Pos logit contributions
Looks+0.187
pill+0.179
Any+0.160
ksh+0.157
You+0.155
Neg logit contributions
 keeper-0.260
 aid-0.249
keeper-0.248
 worker-0.248
 doctor-0.239
Token!
Token ID0
Feature activation+2.63e+00

Pos logit contributions
Looks+0.082
Any+0.075
pill+0.073
|+0.071
You+0.071
Neg logit contributions
keeper-0.053
 keeper-0.052
 worker-0.047
ache-0.047
alo-0.045
Token That
Token ID1320
Feature activation+4.95e+00

Pos logit contributions
Looks+0.051
eling+0.046
 sorts+0.043
 immensely+0.043
dd+0.043
Neg logit contributions
keeper-0.057
 aid-0.054
 doctor-0.054
 keeper-0.053
 click-0.053
Token was
Token ID373
Feature activation+5.83e-02

Pos logit contributions
Looks+0.112
Any+0.107
|+0.105
pill+0.103
You+0.099
Neg logit contributions
 keeper-0.111
keeper-0.110
 door-0.100
 worker-0.098
 doctor-0.098
Token fun
Token ID1257
Feature activation-1.88e-01

Pos logit contributions
hed+0.001
pill+0.001
Looks+0.001
|+0.001
Any+0.001
Neg logit contributions
 keeper-0.001
keeper-0.001
 worker-0.001
 code-0.001
 door-0.001
Token and
Token ID290
Feature activation+4.85e+00

Pos logit contributions
hed+0.021
Any+0.019
Looks+0.019
ee+0.019
 You+0.019
Neg logit contributions
 instincts-0.027
keeper-0.027
ache-0.026
 keeper-0.025
alo-0.025
Token slides
Token ID19392
Feature activation+4.17e+00

Pos logit contributions
Any+0.107
Looks+0.103
|+0.101
dd+0.099
ez+0.099
Neg logit contributions
keeper-0.110
 doctor-0.104
 door-0.103
 keeper-0.102
 lady-0.097
Token down
Token ID866
Feature activation+1.30e+00

Pos logit contributions
Looks+0.074
Any+0.073
hed+0.068
You+0.067
pill+0.067
Neg logit contributions
keeper-0.113
 keeper-0.108
 worker-0.102
 hinges-0.100
ache-0.099
Token.
Token ID13
Feature activation+3.38e-01

Pos logit contributions
Any+0.033
Looks+0.033
pill+0.031
ee+0.031
hed+0.031
Neg logit contributions
keeper-0.026
 keeper-0.025
 worker-0.023
 door-0.022
 doctor-0.021
Token "
Token ID366
Feature activation+1.57e+00

Pos logit contributions
 You+0.005
Looks+0.005
Any+0.005
You+0.005
 Ever+0.005
Neg logit contributions
ache-0.010
keeper-0.009
 instincts-0.009
 code-0.009
 keeper-0.009
TokenWhe
Token ID10842
Feature activation+1.05e+01

Pos logit contributions
pill+0.030
hed+0.028
Any+0.027
 Wed+0.027
Looks+0.026
Neg logit contributions
keeper-0.035
 keeper-0.035
 worker-0.033
 lady-0.032
 doctor-0.032
Tokene
Token ID68
Feature activation+2.88e+00

Pos logit contributions
Looks+0.175
pill+0.171
Any+0.155
ksh+0.154
 You+0.148
Neg logit contributions
 keeper-0.263
keeper-0.256
 worker-0.253
 aid-0.252
lf-0.245
Token!"
Token ID2474
Feature activation+1.91e-01

Pos logit contributions
Looks+0.076
Any+0.072
pill+0.067
You+0.066
|+0.065
Neg logit contributions
keeper-0.057
 keeper-0.055
 worker-0.050
ache-0.048
 door-0.048
Token
Token ID198
Feature activation-1.85e+00

Pos logit contributions
Looks+0.003
You+0.003
 You+0.003
Any+0.003
hed+0.003
Neg logit contributions
keeper-0.005
ache-0.005
 instincts-0.005
 code-0.005
 keeper-0.005
Token
Token ID198
Feature activation-1.89e+00

Pos logit contributions
ache+0.063
keeper+0.060
 instincts+0.060
 hinges+0.059
alo+0.058
Neg logit contributions
You-0.021
|-0.018
ee-0.018
 You-0.018
Looks-0.018
Token"
Token ID1
Feature activation-5.42e-01

Pos logit contributions
ache+0.048
alo+0.047
keeper+0.046
 instincts+0.045
 dentist+0.043
Neg logit contributions
You-0.037
Looks-0.037
Spl-0.033
ee-0.033
�-0.033
Token and
Token ID290
Feature activation+4.86e+00

Pos logit contributions
hed+0.041
Any+0.041
Looks+0.040
You+0.039
ee+0.038
Neg logit contributions
keeper-0.048
ache-0.045
 keeper-0.045
 instincts-0.044
 hinges-0.043
Token slides
Token ID19392
Feature activation+4.20e+00

Pos logit contributions
Any+0.107
Looks+0.103
|+0.100
ez+0.100
dd+0.098
Neg logit contributions
keeper-0.110
 door-0.105
 doctor-0.105
 keeper-0.103
 lady-0.098
Token down
Token ID866
Feature activation+1.36e+00

Pos logit contributions
Looks+0.074
Any+0.074
hed+0.069
You+0.068
pill+0.068
Neg logit contributions
keeper-0.113
 keeper-0.109
 worker-0.102
 hinges-0.100
ache-0.100
Token.
Token ID13
Feature activation+3.89e-01

Pos logit contributions
Any+0.035
Looks+0.035
pill+0.033
ee+0.032
hed+0.032
Neg logit contributions
keeper-0.027
 keeper-0.026
 worker-0.024
 door-0.022
 doctor-0.022
Token "
Token ID366
Feature activation+1.59e+00

Pos logit contributions
Looks+0.008
Any+0.008
hed+0.007
You+0.007
pill+0.007
Neg logit contributions
keeper-0.010
 keeper-0.009
 worker-0.009
 hinges-0.008
ache-0.008
TokenWhe
Token ID10842
Feature activation+1.04e+01

Pos logit contributions
pill+0.030
hed+0.029
Any+0.028
 Wed+0.027
ez+0.027
Neg logit contributions
keeper-0.035
 keeper-0.035
 worker-0.033
 lady-0.033
 vet-0.033
Tokene
Token ID68
Feature activation+2.85e+00

Pos logit contributions
Looks+0.175
pill+0.172
ksh+0.156
Any+0.155
 You+0.149
Neg logit contributions
 keeper-0.257
keeper-0.251
 worker-0.248
 aid-0.247
lf-0.242
Token!"
Token ID2474
Feature activation+1.66e-01

Pos logit contributions
Looks+0.074
Any+0.071
pill+0.066
You+0.065
|+0.064
Neg logit contributions
keeper-0.057
 keeper-0.055
 worker-0.050
ache-0.048
 door-0.048
Token
Token ID198
Feature activation-1.86e+00

Pos logit contributions
Looks+0.004
Any+0.004
|+0.004
hed+0.004
ksh+0.004
Neg logit contributions
keeper-0.004
 doctor-0.003
 keeper-0.003
 lady-0.003
 aid-0.003
Token
Token ID198
Feature activation-1.91e+00

Pos logit contributions
keeper+0.058
ache+0.055
 keeper+0.055
 hinges+0.054
 instincts+0.054
Neg logit contributions
You-0.025
Any-0.024
ee-0.023
 You-0.023
Looks-0.023
Token"
Token ID1
Feature activation-5.23e-01

Pos logit contributions
keeper+0.043
 keeper+0.040
ache+0.038
 hinges+0.038
alo+0.037
Neg logit contributions
Looks-0.043
Any-0.043
You-0.043
ee-0.041
hed-0.040
Token and
Token ID290
Feature activation+4.89e+00

Pos logit contributions
hed+0.020
Any+0.018
Looks+0.018
ee+0.018
�+0.018
Neg logit contributions
 instincts-0.026
ache-0.026
keeper-0.026
alo-0.025
 keeper-0.025
Token slides
Token ID19392
Feature activation+4.17e+00

Pos logit contributions
Any+0.110
Looks+0.106
|+0.103
ez+0.102
dd+0.102
Neg logit contributions
keeper-0.108
 door-0.103
 doctor-0.103
 keeper-0.101
 worker-0.096
Token down
Token ID866
Feature activation+1.26e+00

Pos logit contributions
Looks+0.074
Any+0.074
hed+0.068
pill+0.067
You+0.067
Neg logit contributions
keeper-0.112
 keeper-0.108
 worker-0.101
 hinges-0.099
ache-0.099
Token.
Token ID13
Feature activation+4.08e-01

Pos logit contributions
Any+0.032
Looks+0.032
pill+0.030
ee+0.030
hed+0.030
Neg logit contributions
keeper-0.025
 keeper-0.025
 worker-0.023
 door-0.021
 doctor-0.021
Token "
Token ID366
Feature activation+1.66e+00

Pos logit contributions
 You+0.007
Any+0.007
Looks+0.006
You+0.006
hed+0.006
Neg logit contributions
keeper-0.011
ache-0.011
 keeper-0.011
 instincts-0.011
 code-0.010
TokenWhe
Token ID10842
Feature activation+1.04e+01

Pos logit contributions
pill+0.031
hed+0.029
Any+0.029
 Wed+0.029
ez+0.027
Neg logit contributions
keeper-0.037
 keeper-0.037
 worker-0.035
 driver-0.034
 vet-0.034
Tokene
Token ID68
Feature activation+2.86e+00

Pos logit contributions
Looks+0.172
pill+0.168
Any+0.154
ksh+0.154
 You+0.145
Neg logit contributions
 keeper-0.265
keeper-0.257
 worker-0.254
 aid-0.252
lf-0.242
Token!"
Token ID2474
Feature activation+2.73e-01

Pos logit contributions
Looks+0.074
Any+0.071
pill+0.067
You+0.065
|+0.065
Neg logit contributions
keeper-0.057
 keeper-0.055
 worker-0.050
ache-0.048
 door-0.048
Token
Token ID198
Feature activation-1.79e+00

Pos logit contributions
Looks+0.005
Any+0.005
You+0.005
 You+0.005
hed+0.005
Neg logit contributions
keeper-0.007
 keeper-0.006
ache-0.006
 instincts-0.006
 hinges-0.006
Token
Token ID198
Feature activation-1.84e+00

Pos logit contributions
ache+0.058
keeper+0.057
 hinges+0.056
 instincts+0.056
alo+0.054
Neg logit contributions
You-0.022
ee-0.020
 You-0.019
Any-0.019
Looks-0.019
Token"
Token ID1
Feature activation-4.98e-01

Pos logit contributions
alo+0.046
keeper+0.045
ache+0.045
 instincts+0.042
 dentist+0.042
Neg logit contributions
You-0.037
Looks-0.036
ee-0.034
�-0.034
Any-0.034
Token slides
Token ID19392
Feature activation+3.95e+00

Pos logit contributions
keeper+0.012
 keeper+0.012
 doctor+0.012
 door+0.011
 dentist+0.011
Neg logit contributions
Any-0.012
Looks-0.011
|-0.011
ksh-0.011
dd-0.010
Token down
Token ID866
Feature activation+1.26e+00

Pos logit contributions
Looks+0.074
Any+0.074
pill+0.069
|+0.067
hed+0.067
Neg logit contributions
keeper-0.103
 keeper-0.100
 worker-0.093
 hinges-0.090
 door-0.090
Token fast
Token ID3049
Feature activation+3.28e+00

Pos logit contributions
Any+0.030
Looks+0.030
pill+0.029
hed+0.028
ee+0.028
Neg logit contributions
keeper-0.026
 keeper-0.026
 worker-0.024
 door-0.023
 doctor-0.023
Token.
Token ID13
Feature activation+8.87e-01

Pos logit contributions
Looks+0.088
Any+0.086
pill+0.081
|+0.080
ee+0.080
Neg logit contributions
keeper-0.060
 keeper-0.058
 worker-0.052
 hinges-0.050
ache-0.050
Token "
Token ID366
Feature activation+1.93e+00

Pos logit contributions
 You+0.012
You+0.011
Any+0.010
Looks+0.010
 Ever+0.009
Neg logit contributions
keeper-0.029
ache-0.028
 instincts-0.028
 keeper-0.028
 code-0.027
TokenWhe
Token ID10842
Feature activation+1.04e+01

Pos logit contributions
ays+0.029
pill+0.029
uses+0.027
 breaths+0.026
igg+0.026
Neg logit contributions
 driver-0.044
 keeper-0.044
 lady-0.044
 vet-0.043
 doctor-0.043
Tokene
Token ID68
Feature activation+2.92e+00

Pos logit contributions
Looks+0.171
pill+0.171
ksh+0.154
Any+0.152
 You+0.143
Neg logit contributions
 keeper-0.263
 worker-0.249
keeper-0.249
lf-0.246
 aid-0.245
Token!"
Token ID2474
Feature activation+7.12e-01

Pos logit contributions
Looks+0.077
Any+0.073
pill+0.070
You+0.068
|+0.067
Neg logit contributions
keeper-0.056
 keeper-0.054
 worker-0.049
ache-0.048
 door-0.047
Token he
Token ID339
Feature activation-4.70e-01

Pos logit contributions
Looks+0.013
You+0.013
 You+0.013
Any+0.012
hed+0.012
Neg logit contributions
keeper-0.018
 keeper-0.018
ache-0.017
 instincts-0.017
 code-0.017
Token shouts
Token ID34757
Feature activation-1.35e+00

Pos logit contributions
 keeper+0.011
keeper+0.011
 doctor+0.010
 door+0.010
 lady+0.010
Neg logit contributions
Any-0.010
Looks-0.010
dd-0.009
|-0.009
ksh-0.009
Token.
Token ID13
Feature activation+1.00e-01

Pos logit contributions
keeper+0.023
 keeper+0.021
 worker+0.019
 aid+0.019
 doctor+0.018
Neg logit contributions
Looks-0.040
Any-0.039
pill-0.038
|-0.036
ee-0.036
Token smiles
Token ID21845
Feature activation+2.09e+00

Pos logit contributions
 doctor+0.006
keeper+0.006
 keeper+0.006
 door+0.006
 dentist+0.006
Neg logit contributions
Looks-0.006
Any-0.006
|-0.006
ksh-0.006
dd-0.005
Token and
Token ID290
Feature activation+3.75e+00

Pos logit contributions
Looks+0.048
Any+0.047
pill+0.045
|+0.045
hed+0.045
Neg logit contributions
keeper-0.045
 keeper-0.043
 worker-0.040
ache-0.039
 doctor-0.039
Token shouts
Token ID34757
Feature activation-6.39e-01

Pos logit contributions
|+0.093
Looks+0.087
ee+0.083
dd+0.082
Any+0.081
Neg logit contributions
 keeper-0.072
 lady-0.072
keeper-0.070
 doctor-0.069
 door-0.067
Token,
Token ID11
Feature activation-3.26e+00

Pos logit contributions
keeper+0.010
 keeper+0.010
 worker+0.009
 instincts+0.008
ache+0.008
Neg logit contributions
Looks-0.018
Any-0.018
pill-0.017
|-0.017
hed-0.017
Token "
Token ID366
Feature activation+1.11e+00

Pos logit contributions
keeper+0.054
 keeper+0.051
 worker+0.046
ache+0.044
 instincts+0.044
Neg logit contributions
Looks-0.091
Any-0.090
hed-0.087
pill-0.086
ee-0.085
TokenWhe
Token ID10842
Feature activation+1.04e+01

Pos logit contributions
|+0.014
endo+0.013
pill+0.013
 Wed+0.013
 sorts+0.013
Neg logit contributions
 keeper-0.031
keeper-0.030
 lady-0.029
 vet-0.029
ache-0.028
Tokene
Token ID68
Feature activation+2.74e+00

Pos logit contributions
Looks+0.174
pill+0.173
ksh+0.152
Any+0.151
You+0.147
Neg logit contributions
 keeper-0.259
keeper-0.251
 aid-0.250
 worker-0.249
ress-0.241
Token!
Token ID0
Feature activation+2.01e+00

Pos logit contributions
Looks+0.075
Any+0.069
pill+0.067
You+0.065
|+0.065
Neg logit contributions
keeper-0.050
 keeper-0.049
ache-0.044
 worker-0.044
alo-0.043
Token That
Token ID1320
Feature activation+4.51e+00

Pos logit contributions
Looks+0.038
dd+0.033
|+0.032
hed+0.031
endo+0.031
Neg logit contributions
keeper-0.048
 doctor-0.047
 keeper-0.044
 aid-0.044
 click-0.044
Token was
Token ID373
Feature activation-1.10e-01

Pos logit contributions
Looks+0.100
Any+0.096
|+0.093
pill+0.092
You+0.089
Neg logit contributions
keeper-0.103
 keeper-0.102
 door-0.092
 doctor-0.091
 worker-0.091
Token fun
Token ID1257
Feature activation-4.83e-01

Pos logit contributions
 keeper+0.002
keeper+0.002
 door+0.002
 worker+0.002
 code+0.002
Neg logit contributions
hed-0.003
pill-0.003
Looks-0.003
ez-0.002
|-0.002
Token slides
Token ID19392
Feature activation+3.92e+00

Pos logit contributions
 doctor+0.008
keeper+0.008
 door+0.008
 keeper+0.008
 lady+0.008
Neg logit contributions
Looks-0.008
Any-0.008
|-0.008
ksh-0.008
dd-0.008
Token down
Token ID866
Feature activation+1.26e+00

Pos logit contributions
Looks+0.079
Any+0.078
pill+0.074
|+0.071
You+0.070
Neg logit contributions
keeper-0.098
 keeper-0.095
 worker-0.089
 door-0.086
 hinges-0.084
Token fast
Token ID3049
Feature activation+3.24e+00

Pos logit contributions
Any+0.031
Looks+0.031
pill+0.030
ksh+0.029
ez+0.029
Neg logit contributions
keeper-0.026
 keeper-0.026
 worker-0.024
 door-0.023
 lady-0.022
Token.
Token ID13
Feature activation+8.59e-01

Pos logit contributions
Looks+0.090
Any+0.087
|+0.082
pill+0.082
ee+0.081
Neg logit contributions
keeper-0.057
 keeper-0.055
 worker-0.050
 door-0.048
 hinges-0.047
Token "
Token ID366
Feature activation+1.70e+00

Pos logit contributions
Any+0.014
Looks+0.014
You+0.014
 You+0.014
hed+0.013
Neg logit contributions
keeper-0.024
 keeper-0.023
ache-0.022
 instincts-0.022
 worker-0.022
TokenWhe
Token ID10842
Feature activation+1.04e+01

Pos logit contributions
ays+0.027
pill+0.027
uses+0.025
Any+0.025
hed+0.025
Neg logit contributions
 lady-0.039
 keeper-0.038
 vet-0.038
 driver-0.038
 doctor-0.037
Tokene
Token ID68
Feature activation+2.94e+00

Pos logit contributions
Looks+0.176
pill+0.173
ksh+0.158
Any+0.154
 You+0.147
Neg logit contributions
 keeper-0.255
 worker-0.244
lf-0.242
keeper-0.242
 aid-0.239
Token!"
Token ID2474
Feature activation+4.35e-01

Pos logit contributions
Looks+0.077
Any+0.073
pill+0.069
You+0.068
|+0.067
Neg logit contributions
keeper-0.057
 keeper-0.056
 worker-0.050
ache-0.049
 door-0.048
Token she
Token ID673
Feature activation-2.17e-01

Pos logit contributions
Any+0.009
Looks+0.009
hed+0.009
ee+0.008
|+0.008
Neg logit contributions
keeper-0.010
 keeper-0.010
 worker-0.009
 hinges-0.009
 doctor-0.009
Token shouts
Token ID34757
Feature activation-1.34e+00

Pos logit contributions
keeper+0.004
 keeper+0.004
 doctor+0.004
 lady+0.004
 door+0.004
Neg logit contributions
Any-0.005
Looks-0.005
|-0.005
hed-0.005
ksh-0.005
Token.
Token ID13
Feature activation+4.11e-02

Pos logit contributions
keeper+0.023
 keeper+0.021
 worker+0.019
 aid+0.019
 doctor+0.018
Neg logit contributions
Looks-0.040
Any-0.039
pill-0.038
|-0.037
ee-0.036
Token and
Token ID290
Feature activation+4.77e+00

Pos logit contributions
hed+0.021
Any+0.020
Looks+0.019
You+0.019
ee+0.019
Neg logit contributions
keeper-0.026
 instincts-0.026
ache-0.025
 keeper-0.025
alo-0.024
Token slides
Token ID19392
Feature activation+4.13e+00

Pos logit contributions
Any+0.107
Looks+0.103
ez+0.099
|+0.099
dd+0.098
Neg logit contributions
keeper-0.108
 doctor-0.101
 door-0.101
 keeper-0.100
 lady-0.095
Token down
Token ID866
Feature activation+1.22e+00

Pos logit contributions
Looks+0.078
Any+0.077
pill+0.072
You+0.070
hed+0.070
Neg logit contributions
keeper-0.108
 keeper-0.104
 worker-0.097
 hinges-0.094
ache-0.093
Token.
Token ID13
Feature activation+4.14e-01

Pos logit contributions
Any+0.032
Looks+0.032
pill+0.031
ee+0.030
|+0.030
Neg logit contributions
keeper-0.023
 keeper-0.023
 worker-0.021
 door-0.019
 lady-0.019
Token "
Token ID366
Feature activation+1.53e+00

Pos logit contributions
Looks+0.008
Any+0.008
hed+0.008
You+0.007
pill+0.007
Neg logit contributions
keeper-0.010
 keeper-0.010
 worker-0.009
 hinges-0.009
ache-0.009
TokenWhe
Token ID10842
Feature activation+1.03e+01

Pos logit contributions
pill+0.029
hed+0.028
 Wed+0.027
Any+0.027
ez+0.026
Neg logit contributions
keeper-0.034
 keeper-0.033
 lady-0.032
 vet-0.032
 worker-0.031
Tokene
Token ID68
Feature activation+2.77e+00

Pos logit contributions
Looks+0.170
pill+0.167
Any+0.152
ksh+0.151
 You+0.146
Neg logit contributions
 keeper-0.256
keeper-0.250
 worker-0.246
 aid-0.245
lf-0.238
Token!"
Token ID2474
Feature activation+1.69e-01

Pos logit contributions
Looks+0.072
Any+0.068
pill+0.064
You+0.063
|+0.062
Neg logit contributions
keeper-0.055
 keeper-0.054
 worker-0.049
ache-0.047
 door-0.047
Token
Token ID198
Feature activation-1.78e+00

Pos logit contributions
Looks+0.004
Any+0.004
|+0.004
hed+0.004
ksh+0.004
Neg logit contributions
keeper-0.003
 doctor-0.003
 keeper-0.003
 lady-0.003
 aid-0.003
Token
Token ID198
Feature activation-1.82e+00

Pos logit contributions
keeper+0.053
 keeper+0.051
ache+0.051
 hinges+0.050
 worker+0.049
Neg logit contributions
You-0.025
Any-0.025
Looks-0.024
ee-0.023
 You-0.023
Token"
Token ID1
Feature activation-5.17e-01

Pos logit contributions
keeper+0.038
 keeper+0.036
ache+0.034
 hinges+0.033
alo+0.033
Neg logit contributions
Looks-0.043
Any-0.043
You-0.042
ee-0.041
hed-0.040
Token in
Token ID287
Feature activation+2.63e+00

Pos logit contributions
Looks+0.043
Any+0.043
pill+0.040
You+0.039
ee+0.039
Neg logit contributions
keeper-0.054
 keeper-0.052
 worker-0.048
 hinges-0.047
 door-0.047
Token the
Token ID262
Feature activation+2.36e+00

Pos logit contributions
Looks+0.032
You+0.030
Any+0.030
|+0.029
hed+0.028
Neg logit contributions
keeper-0.085
ache-0.082
 keeper-0.079
 instincts-0.079
alo-0.077
Token air
Token ID1633
Feature activation+6.39e-01

Pos logit contributions
Looks+0.049
Any+0.049
hed+0.046
You+0.045
pill+0.045
Neg logit contributions
keeper-0.056
 keeper-0.053
 worker-0.050
 hinges-0.049
ache-0.049
Token.
Token ID13
Feature activation+1.09e+00

Pos logit contributions
hed+0.012
You+0.011
Any+0.011
ee+0.011
 You+0.010
Neg logit contributions
keeper-0.016
 instincts-0.015
 hinges-0.015
 keeper-0.015
ache-0.015
Token "
Token ID366
Feature activation+1.95e+00

Pos logit contributions
 You+0.014
You+0.014
Any+0.014
 Ever+0.013
Looks+0.012
Neg logit contributions
ache-0.034
keeper-0.033
 code-0.033
 instincts-0.033
 hinges-0.032
TokenWhe
Token ID10842
Feature activation+1.02e+01

Pos logit contributions
Any+0.036
 Wed+0.036
pill+0.036
hed+0.035
Looks+0.034
Neg logit contributions
keeper-0.044
 keeper-0.042
 aid-0.041
 vet-0.041
 doctor-0.040
Tokene
Token ID68
Feature activation+2.98e+00

Pos logit contributions
Looks+0.165
pill+0.157
Any+0.148
You+0.139
ksh+0.139
Neg logit contributions
 keeper-0.269
keeper-0.266
 worker-0.258
 aid-0.256
lf-0.249
Token,
Token ID11
Feature activation-3.57e+00

Pos logit contributions
Looks+0.073
Any+0.070
pill+0.065
You+0.065
|+0.064
Neg logit contributions
keeper-0.063
 keeper-0.060
 worker-0.055
ache-0.055
 aid-0.053
Token it
Token ID340
Feature activation+2.36e+00

Pos logit contributions
keeper+0.078
 keeper+0.077
 doctor+0.072
 worker+0.071
 door+0.070
Neg logit contributions
Looks-0.080
Any-0.080
hed-0.078
pill-0.077
|-0.072
Token is
Token ID318
Feature activation+5.83e-03

Pos logit contributions
Looks+0.050
Any+0.048
|+0.046
You+0.045
Whe+0.044
Neg logit contributions
keeper-0.054
 keeper-0.053
ache-0.051
 door-0.049
 worker-0.049
Token like
Token ID588
Feature activation+1.01e+00

Pos logit contributions
Looks+0.000
Any+0.000
hed+0.000
pill+0.000
You+0.000
Neg logit contributions
keeper-0.000
 keeper-0.000
 hinges-0.000
 worker-0.000
ache-0.000
Token.
Token ID13
Feature activation+9.14e-01

Pos logit contributions
keeper+0.080
 keeper+0.076
 instincts+0.074
ache+0.073
 worker+0.072
Neg logit contributions
hed-0.062
Any-0.061
Looks-0.060
�-0.059
 You-0.058
Token It
Token ID632
Feature activation+9.13e-01

Pos logit contributions
Looks+0.017
Any+0.017
hed+0.016
pill+0.015
|+0.015
Neg logit contributions
keeper-0.024
 keeper-0.023
 worker-0.021
 hinges-0.021
 aid-0.021
Token was
Token ID373
Feature activation+3.83e-01

Pos logit contributions
Any+0.019
Looks+0.019
hed+0.018
You+0.017
pill+0.017
Neg logit contributions
keeper-0.022
 keeper-0.021
 worker-0.020
ache-0.019
 hinges-0.019
Token covered
Token ID5017
Feature activation-3.18e-01

Pos logit contributions
Any+0.006
Looks+0.006
You+0.005
hed+0.005
 You+0.005
Neg logit contributions
keeper-0.011
 keeper-0.011
ache-0.010
 instincts-0.010
 worker-0.010
Token in
Token ID287
Feature activation+7.62e-01

Pos logit contributions
keeper+0.007
 keeper+0.007
 worker+0.006
 hinges+0.006
 aid+0.006
Neg logit contributions
Looks-0.007
Any-0.007
pill-0.007
|-0.007
You-0.007
Token de
Token ID390
Feature activation+1.02e+01

Pos logit contributions
Any+0.015
Looks+0.015
hed+0.014
You+0.014
pill+0.014
Neg logit contributions
keeper-0.019
 keeper-0.018
 worker-0.016
 hinges-0.016
ache-0.016
Tokenw
Token ID86
Feature activation+1.50e+00

Pos logit contributions
Looks+0.191
Any+0.185
 Wed+0.183
pill+0.180
 You+0.178
Neg logit contributions
keeper-0.262
 keeper-0.248
ache-0.238
 worker-0.236
 code-0.224
Token from
Token ID422
Feature activation+5.60e-02

Pos logit contributions
Any+0.029
Looks+0.029
hed+0.028
You+0.027
ee+0.027
Neg logit contributions
keeper-0.038
 keeper-0.036
 worker-0.034
 hinges-0.033
 code-0.033
Token the
Token ID262
Feature activation+1.17e+00

Pos logit contributions
Looks+0.001
Any+0.001
hed+0.001
pill+0.001
You+0.001
Neg logit contributions
keeper-0.002
 keeper-0.001
 worker-0.001
 hinges-0.001
 instincts-0.001
Token wet
Token ID9583
Feature activation+4.88e-01

Pos logit contributions
Looks+0.027
Any+0.027
hed+0.026
You+0.025
pill+0.025
Neg logit contributions
keeper-0.025
 keeper-0.023
 worker-0.022
ache-0.021
 hinges-0.021
Token rain
Token ID6290
Feature activation+5.38e-01

Pos logit contributions
Looks+0.011
Any+0.011
hed+0.011
You+0.010
ee+0.010
Neg logit contributions
keeper-0.010
 keeper-0.010
 worker-0.009
ache-0.009
 hinges-0.009
Token different
Token ID1180
Feature activation-1.04e+00

Pos logit contributions
keeper+0.039
 keeper+0.037
 worker+0.035
ache+0.035
 hinges+0.034
Neg logit contributions
Looks-0.024
Any-0.024
hed-0.023
You-0.022
pill-0.022
Token styles
Token ID12186
Feature activation+3.24e+00

Pos logit contributions
keeper+0.027
 keeper+0.025
ache+0.024
 worker+0.024
alo+0.024
Neg logit contributions
Any-0.018
Looks-0.018
hed-0.018
pill-0.018
You-0.017
Token like
Token ID588
Feature activation+1.46e-01

Pos logit contributions
Any+0.072
Looks+0.071
hed+0.069
You+0.068
pill+0.067
Neg logit contributions
keeper-0.071
 keeper-0.067
 worker-0.062
 hinges-0.062
 instincts-0.062
Token stars
Token ID5788
Feature activation+3.45e+00

Pos logit contributions
Looks+0.003
Any+0.003
|+0.003
hed+0.003
ee+0.003
Neg logit contributions
keeper-0.004
 keeper-0.004
 worker-0.003
 door-0.003
 doctor-0.003
Token or
Token ID393
Feature activation+2.64e+00

Pos logit contributions
Looks+0.085
Any+0.083
You+0.078
|+0.078
hed+0.078
Neg logit contributions
keeper-0.069
 keeper-0.066
ache-0.060
 worker-0.060
 hinges-0.058
Token spir
Token ID9158
Feature activation+1.02e+01

Pos logit contributions
Looks+0.050
Any+0.050
hed+0.047
You+0.047
pill+0.046
Neg logit contributions
keeper-0.067
 keeper-0.064
 worker-0.060
ache-0.059
 hinges-0.059
Tokenals
Token ID874
Feature activation+3.48e+00

Pos logit contributions
Looks+0.258
Any+0.234
You+0.226
 Good+0.206
|+0.203
Neg logit contributions
keeper-0.193
ache-0.176
alo-0.171
 door-0.165
 hinges-0.160
Token.
Token ID13
Feature activation+9.87e-01

Pos logit contributions
Any+0.074
hed+0.072
Looks+0.071
pill+0.070
You+0.069
Neg logit contributions
keeper-0.080
 keeper-0.075
 worker-0.071
 instincts-0.070
 hinges-0.070
Token Everyone
Token ID11075
Feature activation+1.62e+00

Pos logit contributions
Looks+0.018
ee+0.017
Any+0.017
hed+0.016
ez+0.016
Neg logit contributions
keeper-0.027
 keeper-0.025
 doctor-0.024
 worker-0.024
 door-0.023
Token was
Token ID373
Feature activation+2.17e-01

Pos logit contributions
Looks+0.037
Any+0.035
pill+0.034
|+0.033
 You+0.033
Neg logit contributions
keeper-0.037
 keeper-0.035
 worker-0.032
ache-0.031
 doctor-0.031
Token so
Token ID523
Feature activation+2.53e+00

Pos logit contributions
Any+0.004
Looks+0.003
hed+0.003
You+0.003
ee+0.003
Neg logit contributions
keeper-0.006
 keeper-0.006
 hinges-0.006
 instincts-0.006
 code-0.005
Token,
Token ID11
Feature activation-3.16e+00

Pos logit contributions
Looks+0.051
Any+0.049
hed+0.048
pill+0.048
You+0.048
Neg logit contributions
keeper-0.044
 keeper-0.042
 worker-0.038
 instincts-0.037
ache-0.037
Token the
Token ID262
Feature activation+1.18e+00

Pos logit contributions
keeper+0.083
 keeper+0.081
 worker+0.075
 doctor+0.074
 instincts+0.074
Neg logit contributions
Looks-0.058
Any-0.055
hed-0.055
pill-0.053
|-0.053
Token farmer
Token ID18739
Feature activation-1.14e+00

Pos logit contributions
Looks+0.024
|+0.023
Any+0.022
hed+0.022
pill+0.022
Neg logit contributions
keeper-0.029
 keeper-0.029
 worker-0.027
 doctor-0.026
 door-0.026
Token in
Token ID287
Feature activation+2.88e+00

Pos logit contributions
keeper+0.027
 keeper+0.025
 worker+0.023
ache+0.023
 doctor+0.022
Neg logit contributions
Looks-0.026
Any-0.024
|-0.023
pill-0.022
You-0.022
Token the
Token ID262
Feature activation+2.35e+00

Pos logit contributions
Looks+0.058
Any+0.055
hed+0.053
pill+0.052
|+0.051
Neg logit contributions
keeper-0.070
 keeper-0.068
 doctor-0.065
 aid-0.065
 door-0.064
Token de
Token ID390
Feature activation+1.02e+01

Pos logit contributions
Looks+0.053
Any+0.053
hed+0.051
pill+0.050
|+0.050
Neg logit contributions
keeper-0.052
 keeper-0.050
 doctor-0.046
 worker-0.046
 door-0.046
Tokenll
Token ID297
Feature activation-4.23e+00

Pos logit contributions
Looks+0.230
 Wed+0.222
ksh+0.222
 imag+0.221
Any+0.219
Neg logit contributions
keeper-0.206
 keeper-0.195
ache-0.193
 worker-0.185
 lady-0.171
Token".
Token ID1911
Feature activation+7.39e-01

Pos logit contributions
keeper+0.094
 keeper+0.091
 worker+0.083
ache+0.083
 instincts+0.082
Neg logit contributions
Looks-0.094
Any-0.092
pill-0.088
hed-0.086
You-0.085
Token Suddenly
Token ID24975
Feature activation+6.42e-02

Pos logit contributions
Looks+0.015
Any+0.015
ee+0.014
pill+0.014
|+0.014
Neg logit contributions
keeper-0.018
 keeper-0.017
 worker-0.016
 aid-0.016
 doctor-0.015
Token,
Token ID11
Feature activation-2.60e+00

Pos logit contributions
Any+0.002
Looks+0.002
hed+0.002
You+0.002
ee+0.001
Neg logit contributions
keeper-0.001
 keeper-0.001
 hinges-0.001
ache-0.001
 code-0.001
Token he
Token ID339
Feature activation-1.01e+00

Pos logit contributions
keeper+0.057
 keeper+0.055
 worker+0.051
 doctor+0.049
 instincts+0.047
Neg logit contributions
pill-0.058
Any-0.058
Looks-0.058
|-0.057
ee-0.056
Token and
Token ID290
Feature activation+2.73e+00

Pos logit contributions
Any+0.042
Looks+0.042
hed+0.040
You+0.040
pill+0.040
Neg logit contributions
keeper-0.038
 keeper-0.035
 worker-0.033
ache-0.032
 hinges-0.032
Token wet
Token ID9583
Feature activation+3.88e-01

Pos logit contributions
|+0.060
hed+0.058
Any+0.058
Looks+0.057
ee+0.057
Neg logit contributions
keeper-0.061
 keeper-0.059
 worker-0.052
 doctor-0.051
 lady-0.051
Token from
Token ID422
Feature activation+2.59e-01

Pos logit contributions
Looks+0.009
Any+0.009
|+0.008
pill+0.008
hed+0.008
Neg logit contributions
keeper-0.008
 keeper-0.008
 worker-0.007
 hinges-0.007
 door-0.007
Token the
Token ID262
Feature activation+1.45e+00

Pos logit contributions
Looks+0.005
Any+0.005
hed+0.005
pill+0.005
|+0.005
Neg logit contributions
keeper-0.006
 keeper-0.006
 doctor-0.006
 worker-0.006
 door-0.006
Token morning
Token ID3329
Feature activation-1.96e+00

Pos logit contributions
Looks+0.030
Any+0.030
hed+0.028
You+0.028
pill+0.028
Neg logit contributions
keeper-0.034
 keeper-0.033
 worker-0.030
ache-0.030
 hinges-0.030
Token de
Token ID390
Feature activation+1.01e+01

Pos logit contributions
keeper+0.043
 keeper+0.041
 worker+0.037
 aid+0.036
 hinges+0.036
Neg logit contributions
Looks-0.045
Any-0.045
hed-0.042
You-0.042
|-0.042
Tokenw
Token ID86
Feature activation+1.88e+00

Pos logit contributions
Looks+0.147
Any+0.143
pill+0.141
 Wed+0.136
 You+0.135
Neg logit contributions
keeper-0.300
 keeper-0.287
ache-0.281
 worker-0.276
 code-0.262
Token.
Token ID13
Feature activation+8.38e-01

Pos logit contributions
Looks+0.037
Any+0.036
hed+0.034
You+0.034
pill+0.033
Neg logit contributions
keeper-0.047
 keeper-0.045
 worker-0.042
ache-0.042
 hinges-0.041
Token She
Token ID1375
Feature activation-1.38e+00

Pos logit contributions
Any+0.017
Looks+0.017
hed+0.016
ee+0.015
pill+0.015
Neg logit contributions
keeper-0.021
 keeper-0.020
 worker-0.018
 aid-0.018
 hinges-0.018
Token pointed
Token ID6235
Feature activation+1.08e-01

Pos logit contributions
keeper+0.033
 keeper+0.031
ache+0.029
 worker+0.029
 aid+0.028
Neg logit contributions
Looks-0.029
Any-0.028
hed-0.027
You-0.027
|-0.026
Token at
Token ID379
Feature activation-1.80e+00

Pos logit contributions
Any+0.002
Looks+0.002
hed+0.002
You+0.002
ee+0.002
Neg logit contributions
keeper-0.002
 keeper-0.002
ache-0.002
 hinges-0.002
 worker-0.002
Token so
Token ID523
Feature activation+1.46e+00

Pos logit contributions
Looks+0.057
Any+0.057
hed+0.054
You+0.052
|+0.052
Neg logit contributions
keeper-0.106
 keeper-0.102
 worker-0.096
 hinges-0.095
ache-0.094
Token much
Token ID881
Feature activation+2.42e+00

Pos logit contributions
ksh+0.028
Any+0.028
Looks+0.028
pill+0.026
 You+0.025
Neg logit contributions
keeper-0.033
 keeper-0.033
 worker-0.031
 code-0.031
ache-0.030
Token fun
Token ID1257
Feature activation-1.62e+00

Pos logit contributions
Looks+0.040
Any+0.040
hed+0.037
You+0.036
pill+0.036
Neg logit contributions
keeper-0.068
 keeper-0.065
 worker-0.061
ache-0.061
 hinges-0.061
Token.
Token ID13
Feature activation+4.12e-01

Pos logit contributions
keeper+0.032
 keeper+0.030
 worker+0.028
 instincts+0.028
 code+0.028
Neg logit contributions
Looks-0.041
Any-0.039
|-0.038
pill-0.038
hed-0.037
Token "
Token ID366
Feature activation+1.55e+00

Pos logit contributions
Looks+0.009
Any+0.009
hed+0.009
ee+0.009
pill+0.008
Neg logit contributions
keeper-0.009
 keeper-0.009
 doctor-0.008
 aid-0.008
 worker-0.008
TokenWhe
Token ID10842
Feature activation+1.01e+01

Pos logit contributions
pill+0.030
 breaths+0.027
hed+0.026
igg+0.026
ays+0.025
Neg logit contributions
keeper-0.033
 keeper-0.033
 aid-0.030
 driver-0.030
 vet-0.030
Tokene
Token ID68
Feature activation+2.56e+00

Pos logit contributions
pill+0.154
Looks+0.149
 You+0.131
You+0.129
ksh+0.129
Neg logit contributions
keeper-0.265
 keeper-0.258
 aid-0.253
lf-0.252
 worker-0.251
Token!"
Token ID2474
Feature activation+6.43e-01

Pos logit contributions
Looks+0.059
Any+0.058
You+0.054
pill+0.054
|+0.052
Neg logit contributions
keeper-0.057
 keeper-0.054
 worker-0.049
ache-0.049
 aid-0.048
Token he
Token ID339
Feature activation-1.33e+00

Pos logit contributions
Any+0.016
Looks+0.016
|+0.015
pill+0.015
ee+0.015
Neg logit contributions
keeper-0.013
 keeper-0.012
 doctor-0.011
 worker-0.011
 lady-0.011
Token shouted
Token ID17293
Feature activation-1.59e+00

Pos logit contributions
keeper+0.030
 keeper+0.028
 worker+0.026
ache+0.025
 hinges+0.025
Neg logit contributions
Looks-0.030
Any-0.030
hed-0.028
You-0.028
pill-0.027
Token.
Token ID13
Feature activation+2.67e-01

Pos logit contributions
keeper+0.034
 keeper+0.032
 worker+0.029
ache+0.028
 hinges+0.028
Neg logit contributions
Looks-0.038
Any-0.038
pill-0.036
hed-0.036
You-0.035
Token gently
Token ID15165
Feature activation-7.48e-01

Pos logit contributions
keeper+0.072
 keeper+0.069
 dentist+0.065
ache+0.064
 worker+0.064
Neg logit contributions
hed-0.050
Any-0.049
ee-0.047
You-0.047
�-0.047
Token and
Token ID290
Feature activation+3.51e+00

Pos logit contributions
keeper+0.020
 keeper+0.019
 instincts+0.018
 dentist+0.018
 worker+0.017
Neg logit contributions
hed-0.013
�-0.012
Any-0.012
ee-0.012
 You-0.012
Token said
Token ID531
Feature activation-3.43e+00

Pos logit contributions
Looks+0.060
Any+0.059
hed+0.058
You+0.056
ee+0.052
Neg logit contributions
keeper-0.093
 keeper-0.090
 instincts-0.086
 hinges-0.086
 aid-0.085
Token,
Token ID11
Feature activation-3.76e+00

Pos logit contributions
keeper+0.075
 keeper+0.072
 aid+0.071
 hinges+0.070
 code+0.068
Neg logit contributions
Any-0.074
 You-0.072
You-0.071
hed-0.070
pill-0.070
Token "
Token ID366
Feature activation+9.05e-01

Pos logit contributions
keeper+0.073
 keeper+0.069
 hinges+0.062
 worker+0.062
 code+0.062
Neg logit contributions
Any-0.094
Looks-0.092
You-0.090
hed-0.089
 You-0.089
TokenWhe
Token ID10842
Feature activation+1.01e+01

Pos logit contributions
Any+0.019
Looks+0.018
pill+0.017
|+0.017
ee+0.017
Neg logit contributions
keeper-0.020
 keeper-0.018
ache-0.018
 driver-0.017
 doctor-0.017
Tokene
Token ID68
Feature activation+2.76e+00

Pos logit contributions
Looks+0.166
pill+0.159
Any+0.145
ksh+0.143
 You+0.140
Neg logit contributions
 keeper-0.256
keeper-0.256
 aid-0.246
 worker-0.246
ress-0.239
Token,
Token ID11
Feature activation-3.80e+00

Pos logit contributions
Looks+0.073
Any+0.068
You+0.064
pill+0.064
|+0.062
Neg logit contributions
keeper-0.054
 keeper-0.052
 worker-0.047
ache-0.047
alo-0.045
Token Brown
Token ID4373
Feature activation+9.23e-01

Pos logit contributions
keeper+0.086
 keeper+0.082
 worker+0.075
 doctor+0.074
ache+0.074
Neg logit contributions
Looks-0.085
Any-0.084
hed-0.081
pill-0.078
You-0.076
Tokenie
Token ID494
Feature activation+3.67e-01

Pos logit contributions
Any+0.017
ee+0.017
Looks+0.017
hed+0.016
You+0.016
Neg logit contributions
keeper-0.023
 keeper-0.022
 hinges-0.021
 worker-0.021
 aid-0.021
Token,
Token ID11
Feature activation-3.78e+00

Pos logit contributions
Looks+0.008
Any+0.008
hed+0.007
You+0.007
pill+0.007
Neg logit contributions
keeper-0.009
 keeper-0.008
 worker-0.008
ache-0.007
 hinges-0.007
Token smiled
Token ID13541
Feature activation+1.01e+00

Pos logit contributions
keeper+0.012
 keeper+0.011
ache+0.010
 doctor+0.010
 dentist+0.010
Neg logit contributions
Looks-0.012
Any-0.012
pill-0.011
 You-0.011
|-0.011
Token and
Token ID290
Feature activation+3.66e+00

Pos logit contributions
Any+0.019
You+0.018
Looks+0.018
 You+0.018
hed+0.017
Neg logit contributions
keeper-0.025
 keeper-0.024
 worker-0.023
 hinges-0.023
 code-0.023
Token shouted
Token ID17293
Feature activation-5.78e-01

Pos logit contributions
Looks+0.082
Any+0.081
pill+0.080
|+0.079
ee+0.077
Neg logit contributions
keeper-0.079
 keeper-0.076
 worker-0.069
ache-0.068
 doctor-0.067
Token,
Token ID11
Feature activation-3.56e+00

Pos logit contributions
keeper+0.011
 keeper+0.010
 worker+0.009
 hinges+0.009
ache+0.009
Neg logit contributions
Any-0.015
Looks-0.014
hed-0.014
You-0.014
 You-0.014
Token "
Token ID366
Feature activation+1.00e+00

Pos logit contributions
keeper+0.071
 keeper+0.066
 hinges+0.061
 code+0.061
 worker+0.060
Neg logit contributions
Any-0.089
Looks-0.086
You-0.084
 You-0.083
hed-0.083
TokenWhe
Token ID10842
Feature activation+1.00e+01

Pos logit contributions
pill+0.014
|+0.013
 Wed+0.013
Any+0.013
Looks+0.013
Neg logit contributions
keeper-0.028
 keeper-0.027
 doctor-0.025
ache-0.025
 lady-0.025
Tokene
Token ID68
Feature activation+2.57e+00

Pos logit contributions
pill+0.166
Looks+0.165
ksh+0.146
Any+0.146
You+0.140
Neg logit contributions
 keeper-0.250
keeper-0.248
 aid-0.242
 worker-0.240
 doctor-0.238
Token!"
Token ID2474
Feature activation+9.19e-01

Pos logit contributions
Looks+0.069
Any+0.064
pill+0.062
You+0.060
|+0.059
Neg logit contributions
keeper-0.050
 keeper-0.048
ache-0.044
 worker-0.044
 door-0.043
Token\n
Token ID198
Feature activation-1.19e+00

Pos logit contributions
Any+0.016
Looks+0.016
You+0.015
 You+0.015
hed+0.015
Neg logit contributions
keeper-0.025
 keeper-0.024
 instincts-0.023
ache-0.023
 code-0.022
Token\n
Token ID198
Feature activation-1.38e+00

Pos logit contributions
keeper+0.035
 keeper+0.033
ache+0.033
 worker+0.032
 hinges+0.032
Neg logit contributions
You-0.018
Any-0.018
Looks-0.018
|-0.017
hed-0.017
TokenL
Token ID43
Feature activation+3.05e+00

Pos logit contributions
keeper+0.036
alo+0.036
 keeper+0.034
 instincts+0.034
 code+0.034
Neg logit contributions
You-0.027
Looks-0.027
Any-0.024
hed-0.022
�-0.022
Token.
Token ID13
Feature activation+6.03e-01

Pos logit contributions
Any+0.007
hed+0.007
Looks+0.007
You+0.006
pill+0.006
Neg logit contributions
keeper-0.008
 keeper-0.008
 hinges-0.007
 worker-0.007
 instincts-0.007
Token She
Token ID1375
Feature activation-5.84e-01

Pos logit contributions
 You+0.009
Any+0.009
You+0.009
Looks+0.009
hed+0.008
Neg logit contributions
keeper-0.018
ache-0.017
 keeper-0.017
 instincts-0.017
 hinges-0.016
Token said
Token ID531
Feature activation-3.25e+00

Pos logit contributions
keeper+0.014
 keeper+0.014
 instincts+0.013
 hinges+0.013
 worker+0.013
Neg logit contributions
Looks-0.012
Any-0.011
hed-0.011
You-0.011
ee-0.011
Token,
Token ID11
Feature activation-4.00e+00

Pos logit contributions
keeper+0.076
 hinges+0.074
 keeper+0.073
 aid+0.071
 code+0.069
Neg logit contributions
Any-0.065
 You-0.064
You-0.062
pill-0.059
�-0.059
Token "
Token ID366
Feature activation+7.54e-01

Pos logit contributions
keeper+0.087
 keeper+0.082
 hinges+0.079
 code+0.077
 instincts+0.077
Neg logit contributions
Any-0.090
 You-0.087
You-0.087
Looks-0.084
hed-0.082
TokenDear
Token ID20266
Feature activation-7.86e+00

Pos logit contributions
Any+0.013
Looks+0.012
pill+0.012
|+0.012
ee+0.012
Neg logit contributions
keeper-0.020
 keeper-0.019
 lady-0.018
 worker-0.017
 vet-0.017
Token Tom
Token ID4186
Feature activation+7.38e-02

Pos logit contributions
keeper+0.224
ache+0.211
 hinges+0.211
 instincts+0.204
 keeper+0.203
Neg logit contributions
 You-0.123
Any-0.114
You-0.113
ee-0.104
Looks-0.104
Token,
Token ID11
Feature activation-4.67e+00

Pos logit contributions
Any+0.002
 You+0.002
You+0.002
hed+0.002
Looks+0.002
Neg logit contributions
keeper-0.001
 keeper-0.001
 hinges-0.001
alo-0.001
 code-0.001
Token I
Token ID314
Feature activation+2.62e+00

Pos logit contributions
keeper+0.123
 keeper+0.117
 hinges+0.113
 instincts+0.110
ache+0.109
Neg logit contributions
Any-0.085
 You-0.081
You-0.081
Looks-0.079
ee-0.074
Token made
Token ID925
Feature activation+1.53e-01

Pos logit contributions
Looks+0.045
Any+0.044
pill+0.042
hed+0.042
You+0.040
Neg logit contributions
keeper-0.071
 keeper-0.069
 worker-0.064
ache-0.064
 code-0.064
Token this
Token ID428
Feature activation-2.67e+00

Pos logit contributions
Looks+0.003
Any+0.003
hed+0.003
|+0.003
pill+0.002
Neg logit contributions
keeper-0.004
 keeper-0.004
 doctor-0.004
 door-0.004
 worker-0.004
Token.
Token ID13
Feature activation+3.35e-01

Pos logit contributions
keeper+0.012
 keeper+0.012
 worker+0.011
 hinges+0.011
ache+0.011
Neg logit contributions
Any-0.012
Looks-0.012
hed-0.012
You-0.012
pill-0.011
Token It
Token ID632
Feature activation+1.43e+00

Pos logit contributions
Looks+0.006
Any+0.006
hed+0.006
ee+0.006
pill+0.006
Neg logit contributions
keeper-0.009
 keeper-0.008
 doctor-0.007
 worker-0.007
 aid-0.007
Token said
Token ID531
Feature activation-3.39e+00

Pos logit contributions
Looks+0.038
Any+0.037
pill+0.036
ee+0.035
hed+0.035
Neg logit contributions
keeper-0.027
 keeper-0.025
 doctor-0.023
 door-0.021
 worker-0.021
Token,
Token ID11
Feature activation-4.71e+00

Pos logit contributions
keeper+0.080
 keeper+0.076
 hinges+0.072
 worker+0.071
 code+0.070
Neg logit contributions
Any-0.071
Looks-0.067
You-0.066
 You-0.066
hed-0.064
Token "
Token ID366
Feature activation+1.13e+00

Pos logit contributions
keeper+0.093
 keeper+0.087
 hinges+0.080
 code+0.080
 worker+0.080
Neg logit contributions
Any-0.116
Looks-0.113
You-0.110
 You-0.110
hed-0.108
TokenDear
Token ID20266
Feature activation-7.83e+00

Pos logit contributions
pill+0.021
hed+0.021
 imag+0.020
Looks+0.019
|+0.019
Neg logit contributions
keeper-0.027
 keeper-0.025
 lady-0.023
 doctor-0.023
 aid-0.023
Token Anna
Token ID11735
Feature activation+4.76e-01

Pos logit contributions
keeper+0.200
 keeper+0.190
ache+0.180
 hinges+0.179
 worker+0.178
Neg logit contributions
Any-0.145
Looks-0.142
You-0.136
 You-0.134
hed-0.131
Token,
Token ID11
Feature activation-4.69e+00

Pos logit contributions
Any+0.012
hed+0.011
Looks+0.011
 You+0.011
You+0.011
Neg logit contributions
keeper-0.009
 keeper-0.009
 hinges-0.008
 worker-0.008
 code-0.008
Token I
Token ID314
Feature activation+2.77e+00

Pos logit contributions
keeper+0.106
 keeper+0.101
 hinges+0.093
 worker+0.092
ache+0.092
Neg logit contributions
Any-0.103
Looks-0.100
You-0.097
hed-0.095
 You-0.095
Token have
Token ID423
Feature activation+3.46e+00

Pos logit contributions
Looks+0.054
pill+0.053
Any+0.052
hed+0.050
You+0.048
Neg logit contributions
keeper-0.068
 keeper-0.066
ache-0.063
 doctor-0.062
 aid-0.062
Token a
Token ID257
Feature activation-2.75e+00

Pos logit contributions
Looks+0.068
Any+0.066
hed+0.063
|+0.063
pill+0.063
Neg logit contributions
keeper-0.087
 keeper-0.082
 doctor-0.078
ache-0.077
 dentist-0.077
Token her
Token ID607
Feature activation-3.95e+00

Pos logit contributions
keeper+0.065
 keeper+0.061
ache+0.060
 hinges+0.058
 worker+0.058
Neg logit contributions
Any-0.032
Looks-0.032
You-0.030
hed-0.030
|-0.029
Token and
Token ID290
Feature activation+2.57e+00

Pos logit contributions
keeper+0.096
 keeper+0.092
alo+0.090
ache+0.089
 dentist+0.089
Neg logit contributions
Any-0.070
�-0.068
 You-0.067
hed-0.066
You-0.064
Token said
Token ID531
Feature activation-4.51e+00

Pos logit contributions
Looks+0.055
Any+0.054
hed+0.052
You+0.051
pill+0.049
Neg logit contributions
keeper-0.058
 keeper-0.055
 hinges-0.053
 code-0.052
ache-0.052
Token,
Token ID11
Feature activation-4.79e+00

Pos logit contributions
keeper+0.087
 keeper+0.086
 hinges+0.085
 code+0.081
 aid+0.080
Neg logit contributions
Any-0.107
Looks-0.101
 You-0.101
You-0.101
hed-0.101
Token "
Token ID366
Feature activation+7.58e-01

Pos logit contributions
keeper+0.098
 keeper+0.093
 hinges+0.088
 code+0.087
ache+0.085
Neg logit contributions
Any-0.113
Looks-0.111
You-0.107
 You-0.105
hed-0.105
TokenDear
Token ID20266
Feature activation-7.82e+00

Pos logit contributions
hed+0.013
|+0.012
pill+0.012
Any+0.011
ee+0.011
Neg logit contributions
keeper-0.021
 keeper-0.020
 worker-0.018
 doctor-0.018
 lady-0.018
Token daughter
Token ID4957
Feature activation-1.86e+00

Pos logit contributions
keeper+0.198
 keeper+0.187
 hinges+0.180
ache+0.180
 worker+0.176
Neg logit contributions
Any-0.145
Looks-0.141
You-0.136
 You-0.136
pill-0.131
Token,
Token ID11
Feature activation-4.02e+00

Pos logit contributions
keeper+0.035
 keeper+0.033
 hinges+0.031
ache+0.030
 worker+0.030
Neg logit contributions
Any-0.046
Looks-0.045
 You-0.044
You-0.044
hed-0.044
Token I
Token ID314
Feature activation+2.21e+00

Pos logit contributions
keeper+0.092
 keeper+0.087
 worker+0.081
 hinges+0.080
ache+0.080
Neg logit contributions
Any-0.086
Looks-0.086
You-0.081
hed-0.080
ee-0.079
Token knew
Token ID2993
Feature activation-3.26e+00

Pos logit contributions
Looks+0.045
Any+0.044
pill+0.044
hed+0.042
Whe+0.042
Neg logit contributions
keeper-0.053
 keeper-0.051
ache-0.047
 aid-0.047
 doctor-0.047
Token that
Token ID326
Feature activation-4.00e-01

Pos logit contributions
keeper+0.091
 keeper+0.087
 worker+0.082
ache+0.081
 hinges+0.081
Neg logit contributions
Looks-0.054
Any-0.054
hed-0.051
You-0.049
pill-0.049
Token says
Token ID1139
Feature activation-3.39e+00

Pos logit contributions
Any+0.027
Looks+0.026
hed+0.025
ee+0.023
You+0.023
Neg logit contributions
keeper-0.038
 keeper-0.037
 doctor-0.034
 code-0.033
 worker-0.033
Token:
Token ID25
Feature activation-2.29e+00

Pos logit contributions
keeper+0.074
 hinges+0.070
 keeper+0.068
 aid+0.066
 detector+0.065
Neg logit contributions
Any-0.075
 You-0.073
You-0.071
Looks-0.066
�-0.066
Token\n
Token ID198
Feature activation-7.00e-01

Pos logit contributions
keeper+0.042
 keeper+0.040
 worker+0.036
ache+0.035
 doctor+0.035
Neg logit contributions
Looks-0.060
Any-0.059
hed-0.057
ee-0.056
|-0.056
Token\n
Token ID198
Feature activation-1.04e+00

Pos logit contributions
keeper+0.019
 keeper+0.018
 worker+0.017
 hinges+0.017
ache+0.017
Neg logit contributions
Any-0.012
Looks-0.012
You-0.012
hed-0.012
|-0.011
Token"
Token ID1
Feature activation-4.08e-01

Pos logit contributions
keeper+0.020
 keeper+0.019
 worker+0.017
 hinges+0.017
ache+0.017
Neg logit contributions
Looks-0.026
Any-0.026
You-0.025
hed-0.025
pill-0.025
TokenDear
Token ID20266
Feature activation-7.78e+00

Pos logit contributions
keeper+0.012
 keeper+0.011
 worker+0.011
ache+0.011
 hinges+0.010
Neg logit contributions
Any-0.006
Looks-0.006
hed-0.006
pill-0.006
|-0.006
Token Lily
Token ID20037
Feature activation+1.01e+00

Pos logit contributions
keeper+0.202
 keeper+0.188
 hinges+0.180
ache+0.180
 worker+0.177
Neg logit contributions
Any-0.145
Looks-0.141
You-0.137
 You-0.136
|-0.132
Token and
Token ID290
Feature activation+2.62e+00

Pos logit contributions
 You+0.018
Any+0.018
hed+0.017
ee+0.017
�+0.017
Neg logit contributions
keeper-0.026
 keeper-0.024
 hinges-0.024
 code-0.024
alo-0.023
Token Ben
Token ID3932
Feature activation+1.28e+00

Pos logit contributions
Looks+0.065
|+0.065
hed+0.062
Any+0.062
 sorts+0.060
Neg logit contributions
keeper-0.050
 doctor-0.050
 dentist-0.050
 lady-0.048
ache-0.047
Token,
Token ID11
Feature activation-4.00e+00

Pos logit contributions
Any+0.032
hed+0.031
 You+0.031
You+0.031
Looks+0.031
Neg logit contributions
keeper-0.025
 keeper-0.022
 hinges-0.021
 instincts-0.021
alo-0.020
Token Happy
Token ID14628
Feature activation-3.14e+00

Pos logit contributions
keeper+0.097
 keeper+0.094
 doctor+0.090
 worker+0.088
 code+0.087
Neg logit contributions
Looks-0.082
hed-0.078
Any-0.077
pill-0.075
ee-0.072
Token\n
Token ID198
Feature activation-1.80e+00

Pos logit contributions
keeper+0.048
 keeper+0.046
 code+0.045
 hinges+0.045
 instincts+0.045
Neg logit contributions
Any-0.027
You-0.027
Looks-0.026
hed-0.025
|-0.025
TokenThe
Token ID464
Feature activation+5.88e-01

Pos logit contributions
keeper+0.044
 keeper+0.042
 code+0.040
alo+0.040
 hinges+0.040
Neg logit contributions
Any-0.036
Looks-0.036
You-0.035
hed-0.034
|-0.033
Token thief
Token ID25906
Feature activation-2.17e+00

Pos logit contributions
Looks+0.017
You+0.017
Any+0.017
 You+0.017
hed+0.016
Neg logit contributions
keeper-0.008
 keeper-0.007
alo-0.007
 aid-0.007
 hinges-0.007
Token replied
Token ID8712
Feature activation-4.33e+00

Pos logit contributions
keeper+0.057
alo+0.054
 keeper+0.054
 hinges+0.054
 instincts+0.053
Neg logit contributions
Looks-0.037
hed-0.037
Any-0.035
You-0.034
�-0.033
Token "
Token ID366
Feature activation+5.25e-01

Pos logit contributions
keeper+0.085
 keeper+0.081
 hinges+0.081
 code+0.079
 instincts+0.076
Neg logit contributions
Any-0.102
hed-0.100
You-0.097
Looks-0.097
 You-0.096
TokenDear
Token ID20266
Feature activation-7.77e+00

Pos logit contributions
Any+0.009
pill+0.009
hed+0.009
Looks+0.009
 Ever+0.009
Neg logit contributions
keeper-0.014
 keeper-0.013
 worker-0.012
ache-0.012
 lady-0.011
Token little
Token ID1310
Feature activation-4.35e-03

Pos logit contributions
keeper+0.188
 hinges+0.180
ache+0.177
 keeper+0.174
 code+0.174
Neg logit contributions
 You-0.140
Any-0.139
You-0.135
Looks-0.133
|-0.130
Token girl
Token ID2576
Feature activation-1.90e+00

Pos logit contributions
keeper+0.000
 keeper+0.000
 worker+0.000
ache+0.000
 hinges+0.000
Neg logit contributions
Looks-0.000
Any-0.000
hed-0.000
You-0.000
pill-0.000
Token,
Token ID11
Feature activation-3.75e+00

Pos logit contributions
keeper+0.037
 keeper+0.035
 hinges+0.033
 code+0.032
ache+0.032
Neg logit contributions
Any-0.046
Looks-0.045
 You-0.044
You-0.044
hed-0.044
Token I
Token ID314
Feature activation+2.23e+00

Pos logit contributions
keeper+0.081
 keeper+0.077
 hinges+0.072
ache+0.071
 worker+0.070
Neg logit contributions
Any-0.085
Looks-0.083
You-0.081
 You-0.079
hed-0.078
Token need
Token ID761
Feature activation+1.63e+00

Pos logit contributions
Looks+0.042
Any+0.041
pill+0.040
hed+0.039
You+0.038
Neg logit contributions
keeper-0.057
 keeper-0.055
 worker-0.051
ache-0.051
 doctor-0.050
Token prayed
Token ID32168
Feature activation-1.91e+00

Pos logit contributions
keeper+0.019
 keeper+0.019
 worker+0.017
 hinges+0.017
 instincts+0.017
Neg logit contributions
Looks-0.017
Any-0.016
hed-0.016
You-0.015
pill-0.015
Token.
Token ID13
Feature activation+3.38e-01

Pos logit contributions
keeper+0.052
 keeper+0.051
 buyer+0.049
 aid+0.048
 hinges+0.048
Neg logit contributions
Any-0.027
hed-0.025
You-0.025
�-0.023
 Ever-0.023
Token\n
Token ID198
Feature activation-4.05e-01

Pos logit contributions
 You+0.005
Looks+0.005
Any+0.005
You+0.005
 Ever+0.005
Neg logit contributions
keeper-0.010
ache-0.009
 keeper-0.009
 worker-0.009
 instincts-0.009
Token\n
Token ID198
Feature activation-7.62e-01

Pos logit contributions
keeper+0.011
 keeper+0.011
ache+0.011
 worker+0.011
 hinges+0.010
Neg logit contributions
Looks-0.007
Any-0.006
You-0.006
|-0.006
hed-0.006
Token"
Token ID1
Feature activation-3.36e-01

Pos logit contributions
keeper+0.016
 keeper+0.015
ache+0.014
 hinges+0.014
 worker+0.014
Neg logit contributions
Looks-0.018
Any-0.017
You-0.017
hed-0.016
|-0.016
TokenDear
Token ID20266
Feature activation-7.67e+00

Pos logit contributions
keeper+0.009
 keeper+0.008
 worker+0.008
 hinges+0.008
 instincts+0.008
Neg logit contributions
Looks-0.006
Any-0.006
You-0.006
hed-0.006
|-0.006
Token God
Token ID1793
Feature activation-4.42e-01

Pos logit contributions
keeper+0.200
ache+0.185
 hinges+0.183
 keeper+0.183
 instincts+0.176
Neg logit contributions
 You-0.135
Any-0.133
You-0.131
Looks-0.128
|-0.123
Token,
Token ID11
Feature activation-4.00e+00

Pos logit contributions
keeper+0.009
 keeper+0.009
 hinges+0.009
 instincts+0.008
 worker+0.008
Neg logit contributions
Any-0.010
 You-0.010
ee-0.009
You-0.009
hed-0.009
Token please
Token ID3387
Feature activation-3.72e+00

Pos logit contributions
keeper+0.089
 keeper+0.083
 hinges+0.080
 worker+0.077
ache+0.076
Neg logit contributions
Any-0.087
 You-0.087
You-0.086
Looks-0.083
 Good-0.080
Token help
Token ID1037
Feature activation-2.80e+00

Pos logit contributions
keeper+0.091
 keeper+0.083
 worker+0.081
 hinges+0.079
alo+0.077
Neg logit contributions
 You-0.072
Any-0.071
�-0.070
You-0.069
 Good-0.066
Token us
Token ID514
Feature activation-3.10e+00

Pos logit contributions
keeper+0.097
 keeper+0.092
alo+0.092
 hinges+0.091
 worker+0.090
Neg logit contributions
ee-0.026
You-0.025
�-0.024
 You-0.023
|-0.022
Token.
Token ID13
Feature activation+5.94e-01

Pos logit contributions
Any+0.010
hed+0.010
ee+0.010
You+0.010
pill+0.010
Neg logit contributions
keeper-0.012
 keeper-0.012
 hinges-0.012
 instincts-0.011
alo-0.011
Token She
Token ID1375
Feature activation-6.14e-01

Pos logit contributions
 You+0.009
Any+0.009
You+0.009
Looks+0.008
 Ever+0.008
Neg logit contributions
keeper-0.018
ache-0.017
 keeper-0.017
 instincts-0.016
 hinges-0.016
Token said
Token ID531
Feature activation-3.23e+00

Pos logit contributions
keeper+0.015
 keeper+0.014
 instincts+0.014
 hinges+0.014
 worker+0.014
Neg logit contributions
Looks-0.012
Any-0.012
hed-0.011
You-0.011
ee-0.011
Token:
Token ID25
Feature activation-2.57e+00

Pos logit contributions
 hinges+0.072
keeper+0.071
 keeper+0.069
 code+0.067
 aid+0.066
Neg logit contributions
Any-0.066
 You-0.066
You-0.063
�-0.061
hed-0.060
Token "
Token ID366
Feature activation+6.77e-01

Pos logit contributions
keeper+0.052
 keeper+0.048
 worker+0.044
 hinges+0.044
ache+0.044
Neg logit contributions
Any-0.063
Looks-0.062
hed-0.059
You-0.059
pill-0.058
TokenDear
Token ID20266
Feature activation-7.64e+00

Pos logit contributions
Any+0.010
Looks+0.010
hed+0.010
pill+0.009
|+0.009
Neg logit contributions
keeper-0.020
 keeper-0.019
 worker-0.018
ache-0.018
 aid-0.017
Token Bob
Token ID5811
Feature activation+2.10e-01

Pos logit contributions
keeper+0.206
 hinges+0.195
ache+0.192
 keeper+0.184
 code+0.183
Neg logit contributions
 You-0.126
Any-0.124
You-0.121
Looks-0.115
 Wed-0.113
Token,
Token ID11
Feature activation-4.10e+00

Pos logit contributions
Any+0.005
Looks+0.005
You+0.005
ee+0.005
 You+0.005
Neg logit contributions
keeper-0.004
 keeper-0.004
 hinges-0.004
 code-0.003
alo-0.003
Token we
Token ID356
Feature activation+2.25e+00

Pos logit contributions
keeper+0.106
 keeper+0.101
 hinges+0.095
 worker+0.094
ache+0.094
Neg logit contributions
Any-0.076
Looks-0.074
You-0.072
 You-0.070
|-0.069
Token love
Token ID1842
Feature activation-7.23e-01

Pos logit contributions
Looks+0.039
Any+0.038
pill+0.037
ksh+0.035
Whe+0.034
Neg logit contributions
keeper-0.060
 keeper-0.058
 code-0.056
 doctor-0.056
ache-0.055
Token you
Token ID345
Feature activation+1.73e+00

Pos logit contributions
keeper+0.023
 keeper+0.022
 worker+0.021
 instincts+0.020
 hinges+0.020
Neg logit contributions
Looks-0.010
Any-0.010
hed-0.009
pill-0.008
|-0.008
Token mom
Token ID1995
Feature activation-1.18e+00

Pos logit contributions
keeper+0.056
 keeper+0.052
 hinges+0.049
 worker+0.049
 instincts+0.047
Neg logit contributions
Any-0.076
 You-0.074
You-0.073
ee-0.072
hed-0.072
Token.
Token ID13
Feature activation+2.86e-01

Pos logit contributions
keeper+0.028
 keeper+0.027
 instincts+0.025
 hinges+0.025
 worker+0.025
Neg logit contributions
Any-0.024
hed-0.023
Looks-0.023
You-0.023
 You-0.022
Token
Token ID198
Feature activation-5.93e-01

Pos logit contributions
Any+0.005
Looks+0.005
hed+0.005
You+0.004
pill+0.004
Neg logit contributions
keeper-0.008
 keeper-0.008
 worker-0.007
ache-0.007
 hinges-0.007
Token
Token ID198
Feature activation-7.58e-01

Pos logit contributions
keeper+0.015
 keeper+0.014
ache+0.014
 worker+0.014
 hinges+0.013
Neg logit contributions
Any-0.011
Looks-0.011
You-0.011
hed-0.010
|-0.010
Token"
Token ID1
Feature activation-5.44e-01

Pos logit contributions
keeper+0.014
 keeper+0.013
 worker+0.012
 aid+0.012
 doctor+0.011
Neg logit contributions
Any-0.020
Looks-0.019
hed-0.019
pill-0.019
|-0.018
TokenDear
Token ID20266
Feature activation-7.64e+00

Pos logit contributions
keeper+0.014
 keeper+0.013
 worker+0.012
ache+0.012
 hinges+0.012
Neg logit contributions
Any-0.010
Looks-0.010
hed-0.010
pill-0.010
|-0.009
Token Sara
Token ID24799
Feature activation+1.39e+00

Pos logit contributions
keeper+0.191
 keeper+0.177
ache+0.174
 hinges+0.174
 instincts+0.169
Neg logit contributions
Any-0.140
 You-0.140
You-0.136
Looks-0.134
|-0.129
Token,
Token ID11
Feature activation-4.64e+00

Pos logit contributions
 You+0.025
Any+0.024
ee+0.024
�+0.024
hed+0.023
Neg logit contributions
keeper-0.035
 hinges-0.033
 code-0.033
 keeper-0.032
alo-0.031
Token I
Token ID314
Feature activation+2.62e+00

Pos logit contributions
keeper+0.108
 keeper+0.102
 hinges+0.095
 instincts+0.094
 worker+0.092
Neg logit contributions
 You-0.099
Any-0.098
You-0.097
Looks-0.092
�-0.092
Token miss
Token ID2051
Feature activation+1.41e-01

Pos logit contributions
Looks+0.049
Any+0.047
pill+0.046
hed+0.045
You+0.043
Neg logit contributions
keeper-0.067
 keeper-0.065
ache-0.062
 doctor-0.062
 aid-0.061
Token you
Token ID345
Feature activation+1.50e+00

Pos logit contributions
Looks+0.002
Any+0.002
hed+0.002
pill+0.002
|+0.002
Neg logit contributions
keeper-0.004
 keeper-0.004
 worker-0.003
 doctor-0.003
 hinges-0.003
Token it
Token ID340
Feature activation+1.14e+00

Pos logit contributions
You+0.012
Any+0.011
hed+0.011
 You+0.011
Looks+0.010
Neg logit contributions
keeper-0.035
 keeper-0.034
ache-0.033
 hinges-0.033
 instincts-0.033
Token.
Token ID13
Feature activation+9.78e-01

Pos logit contributions
Any+0.026
hed+0.026
You+0.025
Looks+0.025
 You+0.025
Neg logit contributions
keeper-0.022
 keeper-0.021
ache-0.021
 instincts-0.020
 hinges-0.020
Token
Token ID198
Feature activation-3.02e-01

Pos logit contributions
Any+0.015
 You+0.014
You+0.014
Looks+0.014
pill+0.013
Neg logit contributions
keeper-0.029
 keeper-0.028
ache-0.027
 instincts-0.027
 hinges-0.027
Token
Token ID198
Feature activation-7.66e-01

Pos logit contributions
keeper+0.009
ache+0.009
 keeper+0.009
 instincts+0.009
 hinges+0.009
Neg logit contributions
You-0.004
Any-0.004
Looks-0.004
 You-0.004
|-0.004
Token"
Token ID1
Feature activation-2.93e-01

Pos logit contributions
keeper+0.015
 keeper+0.014
ache+0.012
 worker+0.012
 hinges+0.012
Neg logit contributions
Looks-0.020
Any-0.020
You-0.019
hed-0.019
ee-0.018
TokenDear
Token ID20266
Feature activation-7.60e+00

Pos logit contributions
keeper+0.009
 keeper+0.008
 worker+0.008
 code+0.008
 doctor+0.008
Neg logit contributions
Any-0.004
Looks-0.004
hed-0.004
|-0.004
ee-0.004
Token Lily
Token ID20037
Feature activation+9.75e-01

Pos logit contributions
keeper+0.227
 keeper+0.214
ache+0.210
 hinges+0.207
 worker+0.204
Neg logit contributions
Any-0.107
 You-0.107
You-0.101
Looks-0.098
pill-0.095
Token and
Token ID290
Feature activation+2.84e+00

Pos logit contributions
 You+0.020
Any+0.019
hed+0.019
ee+0.018
�+0.018
Neg logit contributions
keeper-0.023
 hinges-0.020
 instincts-0.020
 keeper-0.020
ache-0.020
Token Ben
Token ID3932
Feature activation+1.44e+00

Pos logit contributions
Looks+0.068
|+0.066
hed+0.065
Any+0.064
pill+0.061
Neg logit contributions
keeper-0.059
 keeper-0.056
 doctor-0.056
 dentist-0.056
 lady-0.054
Token,
Token ID11
Feature activation-3.84e+00

Pos logit contributions
Any+0.034
 You+0.033
hed+0.033
You+0.032
ee+0.032
Neg logit contributions
keeper-0.030
 keeper-0.027
 hinges-0.026
 instincts-0.026
alo-0.025
Token we
Token ID356
Feature activation+2.66e+00

Pos logit contributions
keeper+0.101
 keeper+0.097
 worker+0.091
 doctor+0.090
ache+0.089
Neg logit contributions
Looks-0.071
Any-0.069
hed-0.067
pill-0.064
ee-0.063
Token slowly
Token ID6364
Feature activation+1.82e-02

Pos logit contributions
hed+0.038
Any+0.038
You+0.037
Looks+0.037
ee+0.036
Neg logit contributions
keeper-0.048
 keeper-0.046
 instincts-0.044
 hinges-0.044
 worker-0.043
Token.
Token ID13
Feature activation+7.03e-01

Pos logit contributions
hed+0.000
Any+0.000
You+0.000
 You+0.000
Looks+0.000
Neg logit contributions
keeper-0.000
 keeper-0.000
 instincts-0.000
 hinges-0.000
 worker-0.000
Token
Token ID198
Feature activation-3.28e-01

Pos logit contributions
Any+0.011
Looks+0.011
You+0.011
 You+0.011
pill+0.010
Neg logit contributions
keeper-0.020
 keeper-0.019
ache-0.018
 worker-0.018
 instincts-0.018
Token
Token ID198
Feature activation-8.36e-01

Pos logit contributions
keeper+0.010
 keeper+0.009
ache+0.009
 instincts+0.009
 worker+0.009
Neg logit contributions
Any-0.005
You-0.005
Looks-0.005
 You-0.004
hed-0.004
Token"
Token ID1
Feature activation-4.87e-01

Pos logit contributions
keeper+0.017
 keeper+0.016
ache+0.015
 instincts+0.015
 worker+0.015
Neg logit contributions
Looks-0.020
Any-0.020
You-0.020
hed-0.019
ee-0.019
TokenDear
Token ID20266
Feature activation-7.60e+00

Pos logit contributions
keeper+0.013
 keeper+0.012
 worker+0.011
ache+0.011
 aid+0.011
Neg logit contributions
Any-0.009
Looks-0.009
hed-0.009
|-0.008
ee-0.008
Token...
Token ID986
Feature activation-8.98e-02

Pos logit contributions
keeper+0.199
 keeper+0.187
 hinges+0.181
ache+0.181
 worker+0.176
Neg logit contributions
Any-0.135
 You-0.131
You-0.129
Looks-0.127
pill-0.123
TokenAn
Token ID2025
Feature activation+2.71e+00

Pos logit contributions
keeper+0.002
 keeper+0.002
 worker+0.002
 doctor+0.002
ache+0.002
Neg logit contributions
Looks-0.002
Any-0.002
hed-0.002
ee-0.002
|-0.002
Token-
Token ID12
Feature activation+2.16e+00

Pos logit contributions
Looks+0.066
Any+0.066
hed+0.063
You+0.062
|+0.061
Neg logit contributions
keeper-0.055
 keeper-0.052
 worker-0.048
ache-0.047
 hinges-0.046
Tokenna
Token ID2616
Feature activation+2.33e+00

Pos logit contributions
Any+0.046
Looks+0.046
You+0.045
hed+0.043
ee+0.043
Neg logit contributions
keeper-0.048
 keeper-0.047
 hinges-0.045
 instincts-0.045
ache-0.043
Token...
Token ID986
Feature activation+3.66e-01

Pos logit contributions
Looks+0.046
Any+0.045
hed+0.042
You+0.041
pill+0.041
Neg logit contributions
keeper-0.059
 keeper-0.057
 worker-0.053
ache-0.052
 code-0.052
Token The
Token ID383
Feature activation+8.95e-01

Pos logit contributions
Looks+0.015
Any+0.015
hed+0.014
|+0.014
pill+0.014
Neg logit contributions
keeper-0.015
 keeper-0.014
 worker-0.013
 aid-0.013
 doctor-0.013
Token letter
Token ID3850
Feature activation-1.22e+00

Pos logit contributions
You+0.024
 You+0.023
Any+0.022
Looks+0.022
hed+0.022
Neg logit contributions
keeper-0.013
 instincts-0.013
 hinges-0.013
ache-0.013
 keeper-0.013
Token said
Token ID531
Feature activation-3.92e+00

Pos logit contributions
 instincts+0.034
 hinges+0.033
 aid+0.032
ache+0.031
keeper+0.031
Neg logit contributions
�-0.017
hed-0.017
Any-0.016
You-0.016
 You-0.016
Token:
Token ID25
Feature activation-2.15e+00

Pos logit contributions
 hinges+0.102
 lair+0.100
keeper+0.099
 instincts+0.097
 aid+0.096
Neg logit contributions
 You-0.065
Any-0.065
�-0.064
You-0.061
 Good-0.055
Token "
Token ID366
Feature activation+9.33e-01

Pos logit contributions
keeper+0.038
 keeper+0.035
 worker+0.032
 doctor+0.031
ache+0.031
Neg logit contributions
Looks-0.058
Any-0.057
hed-0.056
|-0.055
ee-0.055
TokenDear
Token ID20266
Feature activation-7.59e+00

Pos logit contributions
Looks+0.016
Any+0.015
hed+0.015
pill+0.015
|+0.014
Neg logit contributions
keeper-0.025
 keeper-0.024
 worker-0.022
 aid-0.021
 lady-0.021
Token Sara
Token ID24799
Feature activation+1.20e+00

Pos logit contributions
keeper+0.187
 keeper+0.176
ache+0.170
 hinges+0.169
 worker+0.165
Neg logit contributions
Any-0.147
 You-0.140
Looks-0.140
You-0.138
ee-0.132
Token and
Token ID290
Feature activation+2.62e+00

Pos logit contributions
Any+0.022
ee+0.021
hed+0.021
You+0.021
 You+0.021
Neg logit contributions
keeper-0.031
 keeper-0.029
 hinges-0.028
 code-0.027
alo-0.027
Token Ben
Token ID3932
Feature activation+1.05e+00

Pos logit contributions
Looks+0.066
|+0.065
hed+0.064
pill+0.062
Any+0.061
Neg logit contributions
keeper-0.052
 doctor-0.049
 keeper-0.049
 dentist-0.049
 lady-0.048
Token,
Token ID11
Feature activation-4.15e+00

Pos logit contributions
Looks+0.029
Any+0.028
pill+0.027
hed+0.027
|+0.026
Neg logit contributions
keeper-0.019
 keeper-0.018
 worker-0.016
ache-0.016
 doctor-0.015
Token I
Token ID314
Feature activation+2.89e+00

Pos logit contributions
keeper+0.103
 keeper+0.098
 doctor+0.094
 worker+0.093
 code+0.091
Neg logit contributions
Looks-0.085
hed-0.080
Any-0.078
pill-0.077
|-0.074
Token it
Token ID340
Feature activation+1.26e+00

Pos logit contributions
keeper+0.002
ache+0.002
 keeper+0.002
 hinges+0.002
 instincts+0.002
Neg logit contributions
hed-0.001
Any-0.001
You-0.001
Looks-0.001
 You-0.001
Token out
Token ID503
Feature activation-1.63e+00

Pos logit contributions
Any+0.023
hed+0.023
You+0.023
Looks+0.022
 You+0.022
Neg logit contributions
keeper-0.030
ache-0.029
 keeper-0.028
 instincts-0.028
 dentist-0.027
Token loud
Token ID7812
Feature activation+3.19e-01

Pos logit contributions
keeper+0.042
 keeper+0.039
 hinges+0.039
ache+0.039
 aid+0.038
Neg logit contributions
Looks-0.027
You-0.027
Any-0.026
 You-0.026
hed-0.024
Token.
Token ID13
Feature activation+4.27e-01

Pos logit contributions
hed+0.006
Any+0.006
 You+0.006
You+0.006
Looks+0.006
Neg logit contributions
keeper-0.008
 keeper-0.007
 instincts-0.007
ache-0.007
 worker-0.007
Token "
Token ID366
Feature activation+1.68e+00

Pos logit contributions
Any+0.006
 You+0.006
Looks+0.006
You+0.006
pill+0.006
Neg logit contributions
keeper-0.013
 keeper-0.012
ache-0.012
 instincts-0.012
 code-0.012
TokenDear
Token ID20266
Feature activation-7.58e+00

Pos logit contributions
hed+0.030
|+0.028
pill+0.028
Any+0.028
ee+0.028
Neg logit contributions
keeper-0.040
 keeper-0.039
 vet-0.036
 lady-0.036
 worker-0.035
Token mom
Token ID1995
Feature activation-7.15e-01

Pos logit contributions
keeper+0.204
 keeper+0.194
ache+0.187
 hinges+0.184
 worker+0.181
Neg logit contributions
Any-0.131
Looks-0.127
 You-0.123
You-0.122
pill-0.117
Token,
Token ID11
Feature activation-3.66e+00

Pos logit contributions
keeper+0.017
 keeper+0.016
 aid+0.015
 instincts+0.015
 code+0.015
Neg logit contributions
 You-0.014
Any-0.014
You-0.014
ee-0.013
Looks-0.013
Token we
Token ID356
Feature activation+2.15e+00

Pos logit contributions
keeper+0.097
 keeper+0.093
 hinges+0.088
ache+0.088
 worker+0.085
Neg logit contributions
Any-0.067
 You-0.063
You-0.063
Looks-0.062
�-0.058
Token love
Token ID1842
Feature activation-8.27e-01

Pos logit contributions
Looks+0.040
Any+0.040
pill+0.039
ksh+0.037
 Wed+0.036
Neg logit contributions
keeper-0.053
 keeper-0.051
 code-0.050
ache-0.050
 doctor-0.049
Token you
Token ID345
Feature activation+1.72e+00

Pos logit contributions
keeper+0.024
 keeper+0.023
 worker+0.022
 doctor+0.021
 instincts+0.021
Neg logit contributions
Looks-0.014
hed-0.013
Any-0.013
pill-0.012
|-0.012
Token prayed
Token ID32168
Feature activation-1.38e+00

Pos logit contributions
Looks+0.030
Any+0.030
hed+0.029
You+0.028
pill+0.028
Neg logit contributions
keeper-0.039
 keeper-0.038
 worker-0.035
 hinges-0.035
ache-0.035
Token and
Token ID290
Feature activation+3.67e+00

Pos logit contributions
keeper+0.037
 keeper+0.036
 hinges+0.035
 aid+0.035
ache+0.035
Neg logit contributions
Any-0.021
You-0.020
hed-0.020
 Ever-0.018
Looks-0.018
Token said
Token ID531
Feature activation-3.23e+00

Pos logit contributions
Looks+0.070
Any+0.069
You+0.067
hed+0.065
|+0.062
Neg logit contributions
keeper-0.094
 keeper-0.089
 worker-0.084
ache-0.083
 aid-0.082
Token,
Token ID11
Feature activation-3.76e+00

Pos logit contributions
keeper+0.069
 keeper+0.066
 hinges+0.062
 aid+0.061
 code+0.061
Neg logit contributions
Any-0.075
You-0.070
 You-0.069
Looks-0.069
hed-0.068
Token "
Token ID366
Feature activation+7.11e-01

Pos logit contributions
keeper+0.073
 keeper+0.068
 worker+0.062
 hinges+0.062
 code+0.062
Neg logit contributions
Any-0.096
Looks-0.093
You-0.091
hed-0.089
 You-0.089
TokenDear
Token ID20266
Feature activation-7.57e+00

Pos logit contributions
pill+0.012
hed+0.012
|+0.012
Looks+0.011
Any+0.011
Neg logit contributions
keeper-0.019
 keeper-0.018
ache-0.017
 doctor-0.017
 vet-0.016
Token God
Token ID1793
Feature activation-5.57e-02

Pos logit contributions
keeper+0.193
 keeper+0.181
ache+0.176
 hinges+0.176
 code+0.171
Neg logit contributions
Any-0.140
 You-0.134
Looks-0.133
You-0.133
 Wed-0.126
Token,
Token ID11
Feature activation-3.92e+00

Pos logit contributions
keeper+0.001
 keeper+0.001
 hinges+0.001
 instincts+0.001
 worker+0.001
Neg logit contributions
Any-0.001
hed-0.001
You-0.001
 You-0.001
ee-0.001
Token please
Token ID3387
Feature activation-3.78e+00

Pos logit contributions
keeper+0.108
 keeper+0.104
 worker+0.098
ache+0.096
 hinges+0.096
Neg logit contributions
Looks-0.067
Any-0.066
hed-0.063
You-0.060
pill-0.060
Token forgive
Token ID20927
Feature activation-6.46e-01

Pos logit contributions
keeper+0.088
 keeper+0.081
 hinges+0.076
 worker+0.076
alo+0.074
Neg logit contributions
Any-0.082
 You-0.079
You-0.079
Looks-0.075
ee-0.074
Token us
Token ID514
Feature activation-1.73e+00

Pos logit contributions
keeper+0.020
 keeper+0.019
 worker+0.018
 hinges+0.018
ache+0.017
Neg logit contributions
Any-0.009
Looks-0.009
You-0.009
hed-0.008
ee-0.008
Token said
Token ID531
Feature activation-4.63e+00

Pos logit contributions
Looks+0.085
Any+0.084
hed+0.081
You+0.080
pill+0.078
Neg logit contributions
keeper-0.088
 keeper-0.083
 worker-0.078
 hinges-0.078
 instincts-0.078
Token to
Token ID284
Feature activation-4.45e+00

Pos logit contributions
 hinges+0.099
 code+0.095
keeper+0.094
 keeper+0.092
 aid+0.091
Neg logit contributions
Any-0.098
�-0.096
You-0.096
 You-0.095
hed-0.094
Token Frank
Token ID5278
Feature activation-1.07e+00

Pos logit contributions
keeper+0.112
 keeper+0.109
 worker+0.100
 instincts+0.098
 aid+0.097
Neg logit contributions
Looks-0.087
Any-0.086
hed-0.081
pill-0.081
|-0.079
Token:
Token ID25
Feature activation-2.05e+00

Pos logit contributions
keeper+0.021
 hinges+0.021
alo+0.020
 code+0.020
 keeper+0.020
Neg logit contributions
�-0.024
Any-0.024
 You-0.023
hed-0.023
Looks-0.022
Token "
Token ID366
Feature activation+1.31e+00

Pos logit contributions
keeper+0.041
 keeper+0.039
 worker+0.035
ache+0.035
 hinges+0.035
Neg logit contributions
Looks-0.050
Any-0.050
hed-0.047
You-0.047
pill-0.046
TokenDear
Token ID20266
Feature activation-7.57e+00

Pos logit contributions
Any+0.023
hed+0.022
|+0.022
pill+0.022
Looks+0.021
Neg logit contributions
keeper-0.034
 keeper-0.033
ache-0.029
 worker-0.029
 aid-0.028
Token all
Token ID477
Feature activation+5.17e+00

Pos logit contributions
keeper+0.191
 hinges+0.177
 keeper+0.177
ache+0.177
 worker+0.171
Neg logit contributions
Any-0.137
You-0.133
Looks-0.133
 You-0.133
pill-0.123
Tokenigator
Token ID23823
Feature activation+5.86e-01

Pos logit contributions
Looks+0.111
Any+0.110
hed+0.106
ksh+0.104
|+0.100
Neg logit contributions
keeper-0.119
 keeper-0.116
 worker-0.108
 door-0.106
 doctor-0.106
Token,
Token ID11
Feature activation-3.32e+00

Pos logit contributions
Looks+0.016
Any+0.015
hed+0.015
You+0.014
|+0.014
Neg logit contributions
keeper-0.011
 keeper-0.010
 worker-0.009
ache-0.009
 instincts-0.009
Token if
Token ID611
Feature activation+2.00e+00

Pos logit contributions
keeper+0.082
 keeper+0.078
 worker+0.072
ache+0.071
 instincts+0.071
Neg logit contributions
Looks-0.066
Any-0.066
hed-0.063
pill-0.061
You-0.061
Token you
Token ID345
Feature activation+1.84e+00

Pos logit contributions
Looks+0.023
Any+0.023
hed+0.021
You+0.020
pill+0.020
Neg logit contributions
keeper-0.066
 keeper-0.063
 worker-0.060
 hinges-0.059
ache-0.059
Token said
Token ID531
Feature activation-3.66e+00

Pos logit contributions
 instincts+0.025
 hinges+0.024
 aid+0.023
keeper+0.023
 lair+0.022
Neg logit contributions
hed-0.014
�-0.013
Any-0.013
You-0.013
Looks-0.012
Token:
Token ID25
Feature activation-1.87e+00

Pos logit contributions
 hinges+0.092
 lair+0.091
 detector+0.088
 instincts+0.087
 aid+0.087
Neg logit contributions
Any-0.067
 You-0.066
�-0.064
You-0.063
 Good-0.059
Token
Token ID198
Feature activation-9.14e-01

Pos logit contributions
keeper+0.033
 keeper+0.031
 worker+0.027
 doctor+0.027
ache+0.027
Neg logit contributions
Looks-0.051
Any-0.049
hed-0.049
|-0.048
ee-0.048
Token
Token ID198
Feature activation-1.32e+00

Pos logit contributions
keeper+0.025
 keeper+0.024
 hinges+0.023
 instincts+0.023
ache+0.023
Neg logit contributions
Any-0.016
Looks-0.015
You-0.015
hed-0.015
|-0.015
Token"
Token ID1
Feature activation-1.89e-01

Pos logit contributions
keeper+0.026
 keeper+0.025
 worker+0.023
ache+0.022
 hinges+0.022
Neg logit contributions
Looks-0.033
Any-0.033
hed-0.031
You-0.031
pill-0.031
TokenDear
Token ID20266
Feature activation-7.56e+00

Pos logit contributions
keeper+0.005
 keeper+0.005
 worker+0.004
ache+0.004
 doctor+0.004
Neg logit contributions
Any-0.003
Looks-0.003
hed-0.003
|-0.003
pill-0.003
Token Sara
Token ID24799
Feature activation+1.12e+00

Pos logit contributions
keeper+0.196
 keeper+0.181
 hinges+0.180
ache+0.180
 code+0.175
Neg logit contributions
Any-0.133
 You-0.132
You-0.129
Looks-0.124
 Wed-0.122
Token,
Token ID11
Feature activation-4.13e+00

Pos logit contributions
Any+0.023
 You+0.022
�+0.022
ee+0.022
You+0.021
Neg logit contributions
keeper-0.026
 keeper-0.024
 hinges-0.023
 code-0.023
 worker-0.022
Token
Token ID198
Feature activation-1.19e+00

Pos logit contributions
keeper+0.092
 keeper+0.087
 worker+0.080
 hinges+0.080
ache+0.079
Neg logit contributions
Any-0.092
Looks-0.091
You-0.087
hed-0.086
pill-0.084
Token
Token ID198
Feature activation-1.30e+00

Pos logit contributions
keeper+0.030
 keeper+0.028
 instincts+0.028
ache+0.028
 hinges+0.028
Neg logit contributions
You-0.023
Looks-0.022
Any-0.022
|-0.021
 You-0.021
TokenThank
Token ID10449
Feature activation-1.60e+00

Pos logit contributions
keeper+0.031
 hinges+0.029
 keeper+0.029
 code+0.029
ache+0.029
Neg logit contributions
You-0.028
Looks-0.026
Any-0.025
�-0.024
ee-0.024
Token closed
Token ID4838
Feature activation-5.30e-01

Pos logit contributions
Any+0.074
Looks+0.073
pill+0.069
hed+0.068
You+0.068
Neg logit contributions
keeper-0.086
 keeper-0.081
 worker-0.075
 hinges-0.075
ache-0.074
Token her
Token ID607
Feature activation-3.66e+00

Pos logit contributions
keeper+0.016
 keeper+0.016
ache+0.015
 worker+0.015
 code+0.015
Neg logit contributions
Any-0.007
Looks-0.007
hed-0.006
You-0.006
 Wed-0.006
Token eyes
Token ID2951
Feature activation+1.03e+00

Pos logit contributions
keeper+0.076
 keeper+0.073
 worker+0.071
ache+0.068
 dentist+0.065
Neg logit contributions
Any-0.080
Looks-0.078
 You-0.078
ee-0.077
 Ever-0.076
Token.
Token ID13
Feature activation+1.16e+00

Pos logit contributions
hed+0.020
Any+0.019
�+0.019
Looks+0.019
You+0.018
Neg logit contributions
keeper-0.025
 keeper-0.024
 worker-0.023
 dentist-0.023
 instincts-0.023
Token "
Token ID366
Feature activation+1.87e+00

Pos logit contributions
Looks+0.020
Any+0.020
hed+0.019
pill+0.019
You+0.019
Neg logit contributions
keeper-0.031
 keeper-0.030
 worker-0.028
 hinges-0.028
 aid-0.027
TokenDear
Token ID20266
Feature activation-7.55e+00

Pos logit contributions
hed+0.038
pill+0.037
ays+0.035
|+0.034
irts+0.034
Neg logit contributions
keeper-0.039
 keeper-0.037
 dentist-0.035
ache-0.035
 doctor-0.034
Token God
Token ID1793
Feature activation-1.99e-01

Pos logit contributions
keeper+0.187
 keeper+0.175
ache+0.171
 hinges+0.170
 worker+0.168
Neg logit contributions
Any-0.141
 You-0.140
You-0.134
�-0.130
Looks-0.130
Token,"
Token ID553
Feature activation+3.07e-02

Pos logit contributions
keeper+0.004
 keeper+0.004
 hinges+0.004
 worker+0.003
 instincts+0.003
Neg logit contributions
Any-0.005
hed-0.004
Looks-0.004
 You-0.004
You-0.004
Token she
Token ID673
Feature activation-1.30e+00

Pos logit contributions
You+0.001
Looks+0.001
 You+0.001
�+0.001
Any+0.001
Neg logit contributions
keeper-0.001
 instincts-0.001
 keeper-0.001
 worker-0.001
ache-0.001
Token said
Token ID531
Feature activation-4.81e+00

Pos logit contributions
keeper+0.031
 keeper+0.029
 hinges+0.029
 instincts+0.029
 worker+0.029
Neg logit contributions
Looks-0.025
Any-0.024
You-0.023
�-0.023
hed-0.023
Token softly
Token ID26625
Feature activation-1.41e+00

Pos logit contributions
 keeper+0.110
keeper+0.109
 code+0.108
 hinges+0.107
 detector+0.106
Neg logit contributions
hed-0.087
Any-0.085
�-0.084
 You-0.083
You-0.082
Token.
Token ID13
Feature activation+5.19e-01

Pos logit contributions
 instincts+0.042
 hinges+0.041
ache+0.040
 stret+0.039
 aid+0.039
Neg logit contributions
 You-0.030
You-0.029
�-0.029
hed-0.029
 Ever-0.029
Token It
Token ID632
Feature activation+1.24e+00

Pos logit contributions
Any+0.007
Looks+0.007
You+0.007
 You+0.007
hed+0.006
Neg logit contributions
keeper-0.016
 keeper-0.015
ache-0.015
 instincts-0.015
 worker-0.015
Token says
Token ID1139
Feature activation-3.11e+00

Pos logit contributions
Looks+0.023
Any+0.023
hed+0.022
|+0.021
ee+0.021
Neg logit contributions
keeper-0.033
 keeper-0.031
 worker-0.029
 code-0.029
 doctor-0.029
Token:
Token ID25
Feature activation-2.10e+00

Pos logit contributions
keeper+0.069
 hinges+0.068
 keeper+0.065
 instincts+0.063
 aid+0.063
Neg logit contributions
Any-0.066
 You-0.066
You-0.063
�-0.059
 Ever-0.058
Token "
Token ID366
Feature activation+1.23e+00

Pos logit contributions
keeper+0.039
 keeper+0.037
 worker+0.033
ache+0.033
 hinges+0.032
Neg logit contributions
Looks-0.054
Any-0.054
hed-0.052
You-0.051
pill-0.051
TokenDear
Token ID20266
Feature activation-7.53e+00

Pos logit contributions
Looks+0.020
Any+0.020
hed+0.019
|+0.019
pill+0.018
Neg logit contributions
keeper-0.034
 keeper-0.033
ache-0.030
 worker-0.030
 doctor-0.030
Token Sam
Token ID3409
Feature activation+2.01e-01

Pos logit contributions
keeper+0.226
ache+0.211
 keeper+0.208
 hinges+0.206
 worker+0.200
Neg logit contributions
 You-0.106
Any-0.105
You-0.102
Looks-0.101
ee-0.097
Token,
Token ID11
Feature activation-4.37e+00

Pos logit contributions
Any+0.005
Looks+0.005
hed+0.005
You+0.005
 You+0.005
Neg logit contributions
keeper-0.004
 keeper-0.003
 hinges-0.003
 worker-0.003
 instincts-0.003
Token I
Token ID314
Feature activation+2.67e+00

Pos logit contributions
keeper+0.117
 keeper+0.111
 hinges+0.106
ache+0.104
 worker+0.104
Neg logit contributions
Any-0.078
Looks-0.075
You-0.074
 You-0.073
|-0.069
Token like
Token ID588
Feature activation+1.93e-01

Pos logit contributions
Any+0.044
Looks+0.044
hed+0.041
You+0.040
pill+0.039
Neg logit contributions
keeper-0.076
 keeper-0.072
 worker-0.068
 hinges-0.067
ache-0.067
Token you
Token ID345
Feature activation+2.30e+00

Pos logit contributions
Looks+0.003
Any+0.003
hed+0.003
You+0.003
pill+0.003
Neg logit contributions
keeper-0.005
 keeper-0.005
 worker-0.005
ache-0.005
 hinges-0.005
Token to
Token ID284
Feature activation-4.44e+00

Pos logit contributions
keeper+0.088
 keeper+0.083
 hinges+0.079
 buyer+0.078
 aid+0.078
Neg logit contributions
Any-0.090
You-0.090
 You-0.089
�-0.087
Looks-0.087
Token pray
Token ID12472
Feature activation-1.56e+00

Pos logit contributions
keeper+0.110
 keeper+0.099
ache+0.096
alo+0.094
 worker+0.093
Neg logit contributions
Any-0.087
You-0.086
Looks-0.084
 You-0.083
ee-0.082
Token instead
Token ID2427
Feature activation-5.80e-01

Pos logit contributions
keeper+0.041
 keeper+0.038
 buyer+0.036
ache+0.036
 worker+0.036
Neg logit contributions
Any-0.026
You-0.026
 You-0.025
ee-0.023
hed-0.023
Token.
Token ID13
Feature activation+1.88e-01

Pos logit contributions
keeper+0.014
 keeper+0.013
 worker+0.013
 instincts+0.012
 hinges+0.012
Neg logit contributions
Any-0.011
You-0.011
Looks-0.011
hed-0.010
 You-0.010
Token "
Token ID366
Feature activation+1.18e+00

Pos logit contributions
Any+0.003
Looks+0.003
You+0.003
hed+0.003
 You+0.003
Neg logit contributions
keeper-0.005
 keeper-0.004
ache-0.004
 code-0.004
 instincts-0.004
TokenDear
Token ID20266
Feature activation-7.51e+00

Pos logit contributions
hed+0.026
Any+0.025
pill+0.025
Looks+0.025
|+0.024
Neg logit contributions
keeper-0.025
 keeper-0.024
 worker-0.022
 vet-0.022
 doctor-0.022
Token God
Token ID1793
Feature activation-2.47e-01

Pos logit contributions
keeper+0.195
ache+0.181
 detector+0.172
 hinges+0.169
 code+0.168
Neg logit contributions
 You-0.141
Any-0.136
You-0.134
 summer-0.129
 Good-0.123
Token,
Token ID11
Feature activation-4.10e+00

Pos logit contributions
keeper+0.005
 keeper+0.005
 hinges+0.004
 aid+0.004
 instincts+0.004
Neg logit contributions
Any-0.006
 You-0.006
You-0.006
hed-0.006
ee-0.006
Token please
Token ID3387
Feature activation-3.95e+00

Pos logit contributions
keeper+0.099
 keeper+0.093
 hinges+0.088
ache+0.088
 aid+0.086
Neg logit contributions
Any-0.085
You-0.082
Looks-0.080
 You-0.080
|-0.074
Token let
Token ID1309
Feature activation+3.37e+00

Pos logit contributions
keeper+0.094
 keeper+0.087
 worker+0.081
 hinges+0.081
ache+0.078
Neg logit contributions
Any-0.083
 You-0.079
You-0.078
ee-0.076
Looks-0.075
Token me
Token ID502
Feature activation-2.78e+00

Pos logit contributions
Any+0.056
Looks+0.056
You+0.053
hed+0.053
ee+0.051
Neg logit contributions
keeper-0.093
 keeper-0.089
 hinges-0.084
 worker-0.084
 code-0.083
Token together
Token ID1978
Feature activation-3.90e-01

Pos logit contributions
Looks+0.076
Any+0.075
hed+0.071
You+0.071
pill+0.071
Neg logit contributions
keeper-0.068
 keeper-0.064
 worker-0.059
ache-0.058
 hinges-0.058
Token.
Token ID13
Feature activation+7.72e-01

Pos logit contributions
keeper+0.009
 keeper+0.008
 hinges+0.008
 worker+0.008
ache+0.008
Neg logit contributions
Any-0.008
Looks-0.008
hed-0.008
You-0.008
ee-0.008
Token
Token ID198
Feature activation-1.25e-02

Pos logit contributions
Looks+0.015
hed+0.014
Any+0.014
ee+0.014
pill+0.013
Neg logit contributions
keeper-0.020
 keeper-0.019
 doctor-0.017
 aid-0.017
 worker-0.017
Token
Token ID198
Feature activation-1.59e-01

Pos logit contributions
keeper+0.000
 keeper+0.000
 worker+0.000
 aid+0.000
 doctor+0.000
Neg logit contributions
Looks-0.000
Any-0.000
hed-0.000
pill-0.000
You-0.000
Token"
Token ID1
Feature activation-7.68e-02

Pos logit contributions
keeper+0.003
 keeper+0.003
 aid+0.003
 worker+0.003
 doctor+0.002
Neg logit contributions
hed-0.004
Any-0.004
Looks-0.004
pill-0.004
ksh-0.004
TokenDear
Token ID20266
Feature activation-7.51e+00

Pos logit contributions
keeper+0.002
 keeper+0.002
 worker+0.002
 doctor+0.002
 lady+0.002
Neg logit contributions
pill-0.001
hed-0.001
Any-0.001
ee-0.001
|-0.001
Token God
Token ID1793
Feature activation-4.93e-01

Pos logit contributions
keeper+0.188
 keeper+0.175
ache+0.172
 hinges+0.171
 worker+0.165
Neg logit contributions
Any-0.141
Looks-0.138
 You-0.135
You-0.134
|-0.129
Token,
Token ID11
Feature activation-4.22e+00

Pos logit contributions
keeper+0.010
 keeper+0.009
 hinges+0.009
 instincts+0.009
 code+0.009
Neg logit contributions
Any-0.011
 You-0.011
hed-0.011
ee-0.011
You-0.011
Token bless
Token ID12012
Feature activation-2.40e+00

Pos logit contributions
keeper+0.091
 keeper+0.086
 hinges+0.082
ache+0.080
 instincts+0.079
Neg logit contributions
Any-0.095
Looks-0.092
You-0.091
 You-0.090
|-0.086
Token us
Token ID514
Feature activation-1.02e+00

Pos logit contributions
keeper+0.078
 keeper+0.075
 worker+0.071
ache+0.071
 hinges+0.071
Neg logit contributions
Any-0.028
Looks-0.027
You-0.026
 You-0.025
hed-0.025
Token and
Token ID290
Feature activation+2.33e+00

Pos logit contributions
keeper+0.022
 keeper+0.021
 worker+0.019
 hinges+0.019
 instincts+0.019
Neg logit contributions
Any-0.023
Looks-0.022
hed-0.022
You-0.021
ee-0.021
Token.
Token ID13
Feature activation+1.13e+00

Pos logit contributions
hed+0.014
Any+0.014
 You+0.013
You+0.013
ee+0.013
Neg logit contributions
keeper-0.018
 keeper-0.017
 instincts-0.017
 hinges-0.016
 worker-0.016
Token It
Token ID632
Feature activation+1.21e+00

Pos logit contributions
 You+0.016
Any+0.015
You+0.015
Looks+0.014
 Ever+0.014
Neg logit contributions
keeper-0.034
ache-0.033
 keeper-0.033
 instincts-0.033
 worker-0.031
Token says
Token ID1139
Feature activation-3.30e+00

Pos logit contributions
Any+0.019
Looks+0.019
hed+0.018
You+0.017
pill+0.017
Neg logit contributions
keeper-0.035
 keeper-0.034
 worker-0.032
 code-0.031
 doctor-0.031
Token:
Token ID25
Feature activation-1.95e+00

Pos logit contributions
 hinges+0.080
keeper+0.079
 instincts+0.075
 doorstep+0.075
 aid+0.074
Neg logit contributions
 You-0.065
Any-0.060
You-0.057
�-0.057
 Happy-0.055
Token "
Token ID366
Feature activation+1.22e+00

Pos logit contributions
keeper+0.038
 keeper+0.035
 worker+0.032
 hinges+0.032
ache+0.032
Neg logit contributions
Any-0.049
Looks-0.048
You-0.047
hed-0.046
 You-0.046
TokenDear
Token ID20266
Feature activation-7.51e+00

Pos logit contributions
Any+0.020
Looks+0.020
pill+0.019
hed+0.019
ee+0.018
Neg logit contributions
keeper-0.034
 keeper-0.032
 worker-0.030
 doctor-0.029
ache-0.029
Token Tim
Token ID5045
Feature activation+7.44e-01

Pos logit contributions
keeper+0.214
ache+0.198
 keeper+0.198
 hinges+0.192
 worker+0.189
Neg logit contributions
Any-0.116
 You-0.116
Looks-0.109
You-0.109
|-0.103
Token,
Token ID11
Feature activation-4.09e+00

Pos logit contributions
Any+0.017
 You+0.017
You+0.017
hed+0.017
ee+0.016
Neg logit contributions
keeper-0.016
 keeper-0.014
alo-0.014
 hinges-0.014
 instincts-0.013
Token we
Token ID356
Feature activation+2.42e+00

Pos logit contributions
keeper+0.107
 keeper+0.101
 worker+0.094
 hinges+0.094
ache+0.094
Neg logit contributions
Any-0.076
Looks-0.074
You-0.071
 You-0.070
hed-0.069
Token love
Token ID1842
Feature activation-6.27e-01

Pos logit contributions
Looks+0.042
Any+0.041
pill+0.039
hed+0.037
ksh+0.036
Neg logit contributions
keeper-0.065
 keeper-0.062
 code-0.061
ache-0.061
 doctor-0.061
Token you
Token ID345
Feature activation+1.75e+00

Pos logit contributions
keeper+0.017
 keeper+0.017
 worker+0.016
 doctor+0.016
 dentist+0.016
Neg logit contributions
Looks-0.011
Any-0.011
hed-0.010
pill-0.010
|-0.010
Token lake
Token ID13546
Feature activation-1.29e+00

Pos logit contributions
keeper+0.007
 keeper+0.007
 worker+0.006
ache+0.006
 hinges+0.006
Neg logit contributions
Looks-0.005
Any-0.005
hed-0.005
You-0.005
pill-0.005
Token.
Token ID13
Feature activation+4.76e-01

Pos logit contributions
keeper+0.028
 keeper+0.026
 worker+0.025
ache+0.025
 hinges+0.024
Neg logit contributions
Any-0.028
Looks-0.028
hed-0.027
You-0.026
|-0.026
Token But
Token ID887
Feature activation-1.28e+00

Pos logit contributions
Any+0.009
Looks+0.009
ee+0.009
hed+0.008
pill+0.008
Neg logit contributions
keeper-0.012
 keeper-0.012
 worker-0.011
 doctor-0.011
 aid-0.011
Token it
Token ID340
Feature activation+1.53e+00

Pos logit contributions
keeper+0.034
 keeper+0.032
ache+0.031
 hinges+0.031
 code+0.031
Neg logit contributions
Any-0.022
Looks-0.022
You-0.021
hed-0.020
 You-0.020
Token was
Token ID373
Feature activation+2.48e-01

Pos logit contributions
Looks+0.032
Any+0.032
hed+0.030
You+0.030
pill+0.030
Neg logit contributions
keeper-0.037
 keeper-0.035
 worker-0.032
ache-0.032
 doctor-0.031
Token all
Token ID477
Feature activation+6.36e+00

Pos logit contributions
hed+0.005
Looks+0.005
pill+0.005
Any+0.005
ee+0.005
Neg logit contributions
keeper-0.006
 keeper-0.006
 worker-0.005
 hinges-0.005
ache-0.005
Token no
Token ID645
Feature activation+2.88e+00

Pos logit contributions
Looks+0.144
Any+0.141
pill+0.135
You+0.134
hed+0.133
Neg logit contributions
keeper-0.144
 keeper-0.140
 worker-0.128
 door-0.123
 code-0.123
Token use
Token ID779
Feature activation+2.41e+00

Pos logit contributions
Looks+0.060
Any+0.059
hed+0.055
You+0.055
ee+0.054
Neg logit contributions
keeper-0.070
 keeper-0.065
ache-0.061
 worker-0.060
 hinges-0.059
Token -
Token ID532
Feature activation-3.59e+00

Pos logit contributions
Looks+0.057
Any+0.056
hed+0.053
You+0.053
pill+0.052
Neg logit contributions
keeper-0.051
 keeper-0.048
 worker-0.044
ache-0.043
 hinges-0.043
Token the
Token ID262
Feature activation+5.13e-01

Pos logit contributions
keeper+0.083
 keeper+0.080
 doctor+0.074
 worker+0.073
 door+0.072
Neg logit contributions
Looks-0.077
Any-0.076
hed-0.076
|-0.075
pill-0.074
Token fog
Token ID19558
Feature activation-1.69e-01

Pos logit contributions
Looks+0.011
Any+0.011
hed+0.010
pill+0.010
|+0.010
Neg logit contributions
keeper-0.012
 keeper-0.012
 worker-0.011
 doctor-0.011
 instincts-0.011
Token time
Token ID640
Feature activation+3.57e+00

Pos logit contributions
Any+0.024
Looks+0.024
hed+0.022
You+0.022
|+0.021
Neg logit contributions
keeper-0.055
 keeper-0.052
 worker-0.049
ache-0.049
 instincts-0.049
Token for
Token ID329
Feature activation+2.62e+00

Pos logit contributions
Any+0.090
Looks+0.089
hed+0.087
You+0.086
ee+0.085
Neg logit contributions
keeper-0.067
 keeper-0.063
 worker-0.057
ache-0.057
 instincts-0.056
Token dinner
Token ID8073
Feature activation-6.20e-01

Pos logit contributions
Looks+0.051
Any+0.051
You+0.048
hed+0.048
|+0.048
Neg logit contributions
keeper-0.066
 keeper-0.062
ache-0.058
 worker-0.058
 instincts-0.056
Token.
Token ID13
Feature activation+1.60e+00

Pos logit contributions
keeper+0.016
 hinges+0.015
 keeper+0.015
 instincts+0.015
alo+0.015
Neg logit contributions
hed-0.011
Any-0.010
You-0.010
ee-0.010
 You-0.010
Token<|endoftext|>
Token ID50256
Feature activation+5.84e+00

Pos logit contributions
Looks+0.028
Any+0.028
hed+0.027
pill+0.026
ee+0.026
Neg logit contributions
keeper-0.043
 keeper-0.041
 worker-0.038
 hinges-0.038
 aid-0.038
TokenS
Token ID50
Feature activation+6.39e+00

Pos logit contributions
Looks+0.122
Any+0.107
ksh+0.105
pill+0.095
 AW+0.093
Neg logit contributions
keeper-0.137
 keeper-0.133
 doctor-0.129
 aid-0.127
 door-0.123
Tokenara
Token ID3301
Feature activation+3.17e+00

Pos logit contributions
ee+0.079
Any+0.077
Looks+0.076
hed+0.074
|+0.071
Neg logit contributions
keeper-0.199
 keeper-0.192
 instincts-0.192
 hinges-0.190
ache-0.187
Token did
Token ID750
Feature activation-8.77e-01

Pos logit contributions
Looks+0.081
Any+0.077
pill+0.073
You+0.072
 You+0.070
Neg logit contributions
keeper-0.064
 keeper-0.060
 aid-0.055
 worker-0.055
ache-0.054
Token not
Token ID407
Feature activation+3.08e-01

Pos logit contributions
keeper+0.012
 keeper+0.012
 doctor+0.011
 aid+0.011
 door+0.011
Neg logit contributions
Looks-0.025
Any-0.024
hed-0.024
pill-0.024
ksh-0.024
Token like
Token ID588
Feature activation-3.99e-01

Pos logit contributions
Any+0.007
Looks+0.007
pill+0.007
hed+0.007
ksh+0.007
Neg logit contributions
keeper-0.006
 keeper-0.006
 aid-0.006
 code-0.006
ache-0.006
Token broccoli
Token ID44653
Feature activation+2.56e+00

Pos logit contributions
keeper+0.006
 keeper+0.006
 worker+0.005
 hinges+0.005
ache+0.005
Neg logit contributions
Any-0.011
Looks-0.011
You-0.011
hed-0.011
pill-0.011
Token the
Token ID262
Feature activation+6.54e-01

Pos logit contributions
Looks+0.003
pill+0.002
Any+0.002
hed+0.002
You+0.002
Neg logit contributions
keeper-0.003
 keeper-0.003
 worker-0.003
 doctor-0.003
 dentist-0.003
Token ranch
Token ID28752
Feature activation-1.97e+00

Pos logit contributions
Looks+0.014
Any+0.014
hed+0.013
You+0.013
|+0.013
Neg logit contributions
keeper-0.015
 keeper-0.014
ache-0.013
 hinges-0.013
alo-0.013
Token stopped
Token ID5025
Feature activation+3.32e-01

Pos logit contributions
keeper+0.049
 keeper+0.046
 hinges+0.045
ache+0.044
 instincts+0.044
Neg logit contributions
Any-0.037
hed-0.036
Looks-0.036
You-0.034
|-0.033
Token ringing
Token ID32333
Feature activation+1.59e+00

Pos logit contributions
Looks+0.008
Any+0.008
hed+0.008
pill+0.008
|+0.008
Neg logit contributions
keeper-0.007
 keeper-0.006
 worker-0.006
ache-0.006
 hinges-0.006
Token and
Token ID290
Feature activation+4.34e+00

Pos logit contributions
Any+0.034
Looks+0.034
hed+0.034
You+0.032
ee+0.032
Neg logit contributions
keeper-0.035
 keeper-0.033
 worker-0.031
 hinges-0.031
 instincts-0.031
Token all
Token ID477
Feature activation+6.41e+00

Pos logit contributions
pill+0.089
Looks+0.088
Any+0.087
ee+0.084
|+0.084
Neg logit contributions
keeper-0.105
 keeper-0.101
 doctor-0.092
 worker-0.091
 door-0.091
Token the
Token ID262
Feature activation+1.79e+00

Pos logit contributions
Looks+0.144
Any+0.142
pill+0.140
ksh+0.128
You+0.127
Neg logit contributions
keeper-0.147
 keeper-0.145
 doctor-0.131
 worker-0.130
 door-0.129
Token animals
Token ID4695
Feature activation+1.20e+00

Pos logit contributions
Any+0.044
Looks+0.044
ee+0.041
hed+0.041
pill+0.041
Neg logit contributions
keeper-0.036
 keeper-0.035
 worker-0.032
 hinges-0.031
 instincts-0.031
Token were
Token ID547
Feature activation+5.40e-01

Pos logit contributions
Looks+0.022
Any+0.021
pill+0.020
You+0.019
 You+0.019
Neg logit contributions
keeper-0.032
 keeper-0.031
 worker-0.029
ache-0.029
 doctor-0.028
Token content
Token ID2695
Feature activation+5.27e-01

Pos logit contributions
Any+0.010
hed+0.009
Looks+0.009
You+0.009
ee+0.009
Neg logit contributions
keeper-0.014
 keeper-0.013
 hinges-0.013
 code-0.013
 worker-0.013
Token.
Token ID13
Feature activation+1.31e+00

Pos logit contributions
Any+0.012
Looks+0.011
hed+0.011
You+0.011
|+0.011
Neg logit contributions
keeper-0.012
 keeper-0.011
 worker-0.010
 hinges-0.010
 code-0.010
Token day
Token ID1110
Feature activation-2.49e-01

Pos logit contributions
keeper+0.027
 keeper+0.026
 hinges+0.026
ache+0.025
 instincts+0.025
Neg logit contributions
Any-0.006
Looks-0.005
You-0.005
ee-0.005
hed-0.005
Token he
Token ID339
Feature activation-6.80e-01

Pos logit contributions
keeper+0.005
 keeper+0.005
 worker+0.004
 hinges+0.004
ache+0.004
Neg logit contributions
Any-0.006
Looks-0.006
hed-0.006
You-0.006
pill-0.006
Token accidentally
Token ID14716
Feature activation+8.17e-01

Pos logit contributions
keeper+0.015
 keeper+0.014
 worker+0.013
ache+0.013
 hinges+0.012
Neg logit contributions
Looks-0.016
Any-0.016
hed-0.015
You-0.015
pill-0.015
Token spilled
Token ID31896
Feature activation+1.24e+00

Pos logit contributions
Looks+0.020
Any+0.020
You+0.018
hed+0.018
|+0.018
Neg logit contributions
keeper-0.017
 keeper-0.016
ache-0.015
 worker-0.015
 hinges-0.014
Token ink
Token ID16882
Feature activation+1.70e+00

Pos logit contributions
Any+0.028
Looks+0.027
|+0.027
pill+0.026
ee+0.025
Neg logit contributions
keeper-0.027
 door-0.026
 keeper-0.026
 doctor-0.025
 hinges-0.025
Token all
Token ID477
Feature activation+6.20e+00

Pos logit contributions
Any+0.028
Looks+0.028
hed+0.028
You+0.026
ee+0.026
Neg logit contributions
keeper-0.047
 keeper-0.045
 worker-0.042
 instincts-0.042
 hinges-0.042
Token over
Token ID625
Feature activation+2.78e+00

Pos logit contributions
Looks+0.149
pill+0.148
Any+0.144
ksh+0.131
Whe+0.128
Neg logit contributions
keeper-0.125
 keeper-0.119
 worker-0.114
 door-0.113
man-0.111
Token his
Token ID465
Feature activation-3.02e+00

Pos logit contributions
Looks+0.047
Any+0.046
hed+0.044
You+0.043
ee+0.042
Neg logit contributions
keeper-0.077
 keeper-0.073
ache-0.069
 worker-0.069
 instincts-0.068
Token paper
Token ID3348
Feature activation-2.50e-01

Pos logit contributions
keeper+0.057
 keeper+0.055
 instincts+0.049
 worker+0.049
ache+0.048
Neg logit contributions
ee-0.072
hed-0.069
Any-0.069
 You-0.069
pill-0.067
Token.
Token ID13
Feature activation+7.73e-01

Pos logit contributions
keeper+0.006
 keeper+0.006
 instincts+0.006
 worker+0.006
alo+0.006
Neg logit contributions
hed-0.004
�-0.004
ee-0.003
 You-0.003
pill-0.003
Token He
Token ID679
Feature activation-1.51e+00

Pos logit contributions
Any+0.013
Looks+0.013
You+0.013
hed+0.013
 You+0.012
Neg logit contributions
keeper-0.021
 keeper-0.020
ache-0.019
 worker-0.019
 instincts-0.019
Token.
Token ID13
Feature activation+7.25e-01

Pos logit contributions
keeper+0.009
 keeper+0.008
 worker+0.008
ache+0.008
 hinges+0.008
Neg logit contributions
Looks-0.008
Any-0.008
hed-0.008
You-0.008
|-0.008
Token Now
Token ID2735
Feature activation-8.30e-01

Pos logit contributions
Looks+0.017
Any+0.016
ee+0.015
ez+0.015
|+0.015
Neg logit contributions
keeper-0.016
 keeper-0.014
 doctor-0.014
 hinges-0.013
 door-0.013
Token she
Token ID673
Feature activation-3.46e-01

Pos logit contributions
keeper+0.017
 keeper+0.016
 worker+0.015
ache+0.015
 hinges+0.015
Neg logit contributions
Any-0.020
Looks-0.020
hed-0.019
You-0.019
pill-0.018
Token could
Token ID714
Feature activation-2.76e+00

Pos logit contributions
keeper+0.007
 keeper+0.007
 doctor+0.007
ache+0.007
 code+0.007
Neg logit contributions
Looks-0.008
ksh-0.007
Any-0.007
|-0.007
You-0.007
Token tw
Token ID665
Feature activation+4.70e+00

Pos logit contributions
keeper+0.068
 keeper+0.065
 worker+0.061
 hinges+0.060
ache+0.060
Neg logit contributions
Looks-0.054
Any-0.054
hed-0.051
pill-0.050
You-0.050
Tokenirl
Token ID1901
Feature activation+6.93e+00

Pos logit contributions
Looks+0.062
Any+0.058
hed+0.055
You+0.055
pill+0.055
Neg logit contributions
keeper-0.140
 keeper-0.137
 instincts-0.132
 doctor-0.130
 aid-0.130
Token and
Token ID290
Feature activation+3.11e+00

Pos logit contributions
Looks+0.178
Any+0.165
pill+0.163
ksh+0.154
|+0.154
Neg logit contributions
keeper-0.134
 keeper-0.126
ache-0.114
 door-0.114
 doctor-0.114
Token dance
Token ID9280
Feature activation+2.21e+00

Pos logit contributions
Looks+0.068
Any+0.067
pill+0.066
ksh+0.066
hed+0.065
Neg logit contributions
keeper-0.064
 keeper-0.062
 doctor-0.060
 door-0.059
 hinges-0.058
Token whenever
Token ID8797
Feature activation+1.72e+00

Pos logit contributions
Looks+0.046
Any+0.046
hed+0.043
You+0.043
pill+0.042
Neg logit contributions
keeper-0.052
 keeper-0.050
 worker-0.046
ache-0.045
 hinges-0.045
Token she
Token ID673
Feature activation+8.65e-01

Pos logit contributions
Looks+0.044
hed+0.044
Any+0.043
ksh+0.043
pill+0.042
Neg logit contributions
keeper-0.032
 keeper-0.031
 doctor-0.030
 worker-0.028
 dentist-0.028
Token wanted
Token ID2227
Feature activation+1.20e+00

Pos logit contributions
Looks+0.020
Any+0.019
ksh+0.018
 Good+0.018
You+0.018
Neg logit contributions
keeper-0.017
 keeper-0.017
ache-0.016
 doctor-0.016
 aid-0.016
Token girl
Token ID2576
Feature activation-1.54e+00

Pos logit contributions
keeper+0.005
 keeper+0.004
 hinges+0.004
ache+0.004
 instincts+0.004
Neg logit contributions
Any-0.006
Looks-0.006
You-0.005
hed-0.005
ee-0.005
Token.
Token ID13
Feature activation+1.11e+00

Pos logit contributions
keeper+0.036
 hinges+0.034
 keeper+0.034
 instincts+0.032
 code+0.032
Neg logit contributions
Any-0.032
You-0.030
 You-0.030
�-0.029
hed-0.029
Token<|endoftext|>
Token ID50256
Feature activation+5.83e+00

Pos logit contributions
Looks+0.025
Any+0.024
hed+0.024
ee+0.023
pill+0.022
Neg logit contributions
keeper-0.025
 keeper-0.023
 doctor-0.022
 aid-0.021
 worker-0.021
TokenAnna
Token ID31160
Feature activation+2.63e+00

Pos logit contributions
Looks+0.126
ksh+0.112
Any+0.110
 AW+0.099
pill+0.098
Neg logit contributions
keeper-0.131
 keeper-0.129
 doctor-0.125
 aid-0.122
 door-0.118
Token and
Token ID290
Feature activation+4.59e+00

Pos logit contributions
Looks+0.072
Any+0.066
pill+0.064
ksh+0.062
You+0.061
Neg logit contributions
keeper-0.048
 keeper-0.046
 aid-0.043
 doctor-0.043
 worker-0.042
Token Ben
Token ID3932
Feature activation+2.06e+00

Pos logit contributions
Looks+0.111
Any+0.107
pill+0.106
ksh+0.104
hed+0.101
Neg logit contributions
keeper-0.097
 keeper-0.092
 doctor-0.090
 lady-0.086
 dentist-0.084
Token are
Token ID389
Feature activation+4.19e-01

Pos logit contributions
Looks+0.054
pill+0.052
ksh+0.051
Any+0.051
 Wed+0.046
Neg logit contributions
 keeper-0.034
keeper-0.034
 driver-0.032
 doctor-0.031
 lady-0.031
Token playing
Token ID2712
Feature activation-5.58e-01

Pos logit contributions
Looks+0.011
hed+0.011
|+0.010
pill+0.010
Whe+0.010
Neg logit contributions
keeper-0.008
 doctor-0.007
 keeper-0.007
 door-0.007
 worker-0.007
Token in
Token ID287
Feature activation+2.55e+00

Pos logit contributions
keeper+0.011
 keeper+0.010
 worker+0.010
 doctor+0.010
 aid+0.009
Neg logit contributions
Looks-0.014
pill-0.013
Any-0.013
|-0.013
hed-0.013
Token the
Token ID262
Feature activation+2.38e+00

Pos logit contributions
Looks+0.058
Any+0.058
hed+0.058
ksh+0.056
pill+0.056
Neg logit contributions
 keeper-0.053
keeper-0.051
 doctor-0.049
 worker-0.049
 door-0.049
Token park
Token ID3952
Feature activation-7.41e-01

Pos logit contributions
Looks+0.064
Any+0.063
hed+0.062
pill+0.060
|+0.059
Neg logit contributions
keeper-0.041
 keeper-0.040
 doctor-0.037
 worker-0.036
 aid-0.036
Token bird
Token ID6512
Feature activation-4.14e-01

Pos logit contributions
Looks+0.028
Any+0.028
hed+0.027
You+0.026
pill+0.026
Neg logit contributions
keeper-0.022
 keeper-0.020
 worker-0.019
ache-0.018
 hinges-0.018
Token was
Token ID373
Feature activation-2.40e-01

Pos logit contributions
keeper+0.010
 keeper+0.010
ache+0.009
 worker+0.009
 doctor+0.009
Neg logit contributions
Looks-0.008
Any-0.008
You-0.008
hed-0.008
pill-0.007
Token so
Token ID523
Feature activation+2.33e+00

Pos logit contributions
keeper+0.006
 keeper+0.006
 hinges+0.006
 worker+0.006
 instincts+0.006
Neg logit contributions
Any-0.004
Looks-0.004
You-0.004
hed-0.004
 You-0.004
Token happy
Token ID3772
Feature activation-7.46e-01

Pos logit contributions
Looks+0.054
Any+0.054
hed+0.051
pill+0.050
You+0.050
Neg logit contributions
keeper-0.050
 keeper-0.048
 worker-0.043
ache-0.042
 instincts-0.041
Token that
Token ID326
Feature activation-8.01e-01

Pos logit contributions
keeper+0.014
 keeper+0.013
 worker+0.012
 hinges+0.012
ache+0.012
Neg logit contributions
Any-0.019
Looks-0.019
hed-0.019
You-0.018
pill-0.018
Token it
Token ID340
Feature activation+1.85e+00

Pos logit contributions
keeper+0.016
 keeper+0.015
 doctor+0.014
 worker+0.013
 lady+0.013
Neg logit contributions
Looks-0.021
hed-0.020
pill-0.020
ksh-0.020
Any-0.020
Token sang
Token ID25889
Feature activation+2.52e+00

Pos logit contributions
Looks+0.043
Whe+0.040
Any+0.040
ksh+0.039
ee+0.039
Neg logit contributions
keeper-0.041
 keeper-0.040
ache-0.036
 doctor-0.035
 worker-0.034
Token a
Token ID257
Feature activation-3.25e+00

Pos logit contributions
Looks+0.049
Any+0.048
hed+0.045
pill+0.045
You+0.045
Neg logit contributions
keeper-0.063
 keeper-0.061
 worker-0.056
ache-0.055
 instincts-0.055
Token beautiful
Token ID4950
Feature activation-7.68e-01

Pos logit contributions
keeper+0.068
 keeper+0.065
 worker+0.060
 hinges+0.059
ache+0.058
Neg logit contributions
Any-0.076
Looks-0.076
hed-0.072
You-0.071
pill-0.070
Token song
Token ID3496
Feature activation+3.80e-01

Pos logit contributions
keeper+0.013
 keeper+0.012
 worker+0.011
ache+0.010
 hinges+0.010
Neg logit contributions
Looks-0.021
Any-0.021
hed-0.020
You-0.020
pill-0.020
Token for
Token ID329
Feature activation+4.64e-01

Pos logit contributions
Any+0.008
Looks+0.008
hed+0.008
You+0.008
 You+0.008
Neg logit contributions
keeper-0.009
 keeper-0.008
 hinges-0.008
 worker-0.007
 instincts-0.007
Token that
Token ID326
Feature activation+3.84e-01

Pos logit contributions
Any+0.054
Looks+0.053
You+0.052
 You+0.050
hed+0.049
Neg logit contributions
keeper-0.041
 keeper-0.038
ache-0.036
 worker-0.035
 hinges-0.035
Token mistakes
Token ID10135
Feature activation+2.73e+00

Pos logit contributions
|+0.009
Looks+0.009
hed+0.009
ksh+0.008
Any+0.008
Neg logit contributions
keeper-0.008
 keeper-0.008
 doctor-0.008
 instincts-0.008
 worker-0.007
Token can
Token ID460
Feature activation-3.55e+00

Pos logit contributions
Looks+0.054
Any+0.052
ksh+0.048
ee+0.048
pill+0.048
Neg logit contributions
keeper-0.067
 keeper-0.065
 worker-0.061
 code-0.060
ache-0.060
Token be
Token ID307
Feature activation+8.30e-01

Pos logit contributions
keeper+0.081
 keeper+0.075
 worker+0.070
 hinges+0.069
 instincts+0.069
Neg logit contributions
Any-0.077
Looks-0.077
You-0.074
hed-0.071
|-0.071
Token fixed
Token ID5969
Feature activation-9.33e-01

Pos logit contributions
Any+0.017
Looks+0.017
You+0.016
hed+0.016
pill+0.015
Neg logit contributions
keeper-0.020
 keeper-0.019
 worker-0.018
 hinges-0.017
ache-0.017
Token.
Token ID13
Feature activation+1.65e+00

Pos logit contributions
keeper+0.020
 keeper+0.019
 worker+0.017
ache+0.017
 hinges+0.017
Neg logit contributions
Any-0.021
Looks-0.021
hed-0.020
You-0.020
pill-0.020
Token<|endoftext|>
Token ID50256
Feature activation+5.62e+00

Pos logit contributions
Looks+0.043
hed+0.041
ee+0.040
Any+0.040
pill+0.038
Neg logit contributions
keeper-0.032
 keeper-0.030
 doctor-0.026
 aid-0.025
 lady-0.025
TokenOnce
Token ID7454
Feature activation+2.28e+00

Pos logit contributions
Looks+0.128
Any+0.121
pill+0.114
ksh+0.112
ee+0.109
Neg logit contributions
keeper-0.123
 keeper-0.119
 aid-0.108
 doctor-0.108
 worker-0.107
Token there
Token ID612
Feature activation+7.35e-01

Pos logit contributions
Any+0.042
Looks+0.041
ee+0.040
|+0.040
hed+0.039
Neg logit contributions
keeper-0.057
 keeper-0.055
ache-0.053
 hinges-0.053
 worker-0.052
Token was
Token ID373
Feature activation+1.19e+00

Pos logit contributions
Any+0.015
Looks+0.014
hed+0.014
You+0.014
|+0.013
Neg logit contributions
keeper-0.018
 keeper-0.017
ache-0.016
 hinges-0.016
 worker-0.016
Token a
Token ID257
Feature activation-5.31e-01

Pos logit contributions
Looks+0.031
ksh+0.029
hed+0.028
 AW+0.028
pill+0.028
Neg logit contributions
 doctor-0.019
 aid-0.018
 dentist-0.018
 vet-0.017
 owner-0.016
Token Tim
Token ID5045
Feature activation+1.20e+00

Pos logit contributions
Any+0.037
Looks+0.037
hed+0.034
You+0.034
|+0.033
Neg logit contributions
keeper-0.073
 keeper-0.070
 worker-0.066
ache-0.066
 hinges-0.065
Token.
Token ID13
Feature activation+1.24e+00

Pos logit contributions
Looks+0.032
Any+0.031
pill+0.030
hed+0.029
|+0.029
Neg logit contributions
keeper-0.022
 keeper-0.021
 worker-0.019
ache-0.019
 aid-0.019
Token Tim
Token ID5045
Feature activation+3.18e-01

Pos logit contributions
Looks+0.024
Any+0.024
pill+0.022
hed+0.022
ee+0.022
Neg logit contributions
 keeper-0.030
keeper-0.030
 worker-0.028
 hinges-0.028
 doctor-0.027
Token was
Token ID373
Feature activation-2.08e-01

Pos logit contributions
Looks+0.008
pill+0.007
Any+0.007
ksh+0.007
Whe+0.006
Neg logit contributions
 keeper-0.007
keeper-0.006
man-0.006
 doctor-0.006
 worker-0.006
Token sad
Token ID6507
Feature activation-9.77e-01

Pos logit contributions
keeper+0.005
 doctor+0.004
 worker+0.004
 keeper+0.004
 dentist+0.004
Neg logit contributions
Looks-0.005
hed-0.004
pill-0.004
Any-0.004
ee-0.004
Token because
Token ID780
Feature activation+1.72e+00

Pos logit contributions
keeper+0.026
 keeper+0.025
 worker+0.024
ache+0.023
 hinges+0.023
Neg logit contributions
Looks-0.017
Any-0.017
hed-0.016
You-0.016
pill-0.016
Token he
Token ID339
Feature activation-9.99e-01

Pos logit contributions
Looks+0.049
hed+0.049
Any+0.049
pill+0.049
ksh+0.048
Neg logit contributions
keeper-0.027
 keeper-0.027
 doctor-0.025
 dentist-0.025
 worker-0.024
Token had
Token ID550
Feature activation+3.28e+00

Pos logit contributions
keeper+0.022
 keeper+0.021
ache+0.021
 doctor+0.020
 dentist+0.019
Neg logit contributions
Looks-0.023
pill-0.021
Any-0.020
hed-0.020
Whe-0.020
Token no
Token ID645
Feature activation+1.75e+00

Pos logit contributions
Any+0.058
Looks+0.058
You+0.054
hed+0.054
|+0.052
Neg logit contributions
keeper-0.088
 keeper-0.084
 worker-0.078
 hinges-0.077
ache-0.077
Token friends
Token ID2460
Feature activation-2.03e-01

Pos logit contributions
 summer+0.028
 night+0.028
 imag+0.027
Any+0.027
 morning+0.027
Neg logit contributions
keeper-0.047
 keeper-0.044
ache-0.042
cess-0.041
keepers-0.041
Token.
Token ID13
Feature activation+1.04e-01

Pos logit contributions
keeper+0.005
 hinges+0.005
 keeper+0.005
 instincts+0.004
alo+0.004
Neg logit contributions
Any-0.004
hed-0.004
You-0.004
�-0.004
 You-0.004
Token snow
Token ID6729
Feature activation-3.77e-01

Pos logit contributions
Looks+0.016
Any+0.016
hed+0.015
You+0.015
pill+0.015
Neg logit contributions
keeper-0.019
 keeper-0.018
 worker-0.017
ache-0.016
 hinges-0.016
Tokenmen
Token ID3653
Feature activation-1.83e+00

Pos logit contributions
keeper+0.007
 keeper+0.006
 aid+0.006
ache+0.006
 hinges+0.006
Neg logit contributions
Any-0.010
Looks-0.010
hed-0.010
You-0.010
ee-0.009
Token and
Token ID290
Feature activation+3.32e+00

Pos logit contributions
keeper+0.047
 keeper+0.043
 instincts+0.042
 hinges+0.041
alo+0.041
Neg logit contributions
hed-0.033
Any-0.033
ee-0.032
You-0.032
�-0.032
Token snow
Token ID6729
Feature activation+5.50e-01

Pos logit contributions
Any+0.075
Looks+0.075
hed+0.072
You+0.070
pill+0.070
Neg logit contributions
keeper-0.072
 keeper-0.069
 hinges-0.064
 worker-0.064
ache-0.063
Tokenballs
Token ID21591
Feature activation+2.45e+00

Pos logit contributions
hed+0.013
Any+0.013
You+0.013
ee+0.012
Looks+0.012
Neg logit contributions
keeper-0.011
 keeper-0.010
 aid-0.010
 code-0.009
 hinges-0.009
Token and
Token ID290
Feature activation+3.60e+00

Pos logit contributions
Looks+0.057
Any+0.057
hed+0.054
You+0.053
pill+0.053
Neg logit contributions
keeper-0.052
 keeper-0.049
 worker-0.045
ache-0.045
 hinges-0.044
Token slide
Token ID10649
Feature activation+5.22e+00

Pos logit contributions
Any+0.079
Looks+0.079
hed+0.075
pill+0.073
You+0.073
Neg logit contributions
keeper-0.081
 keeper-0.078
 hinges-0.072
 worker-0.072
ache-0.071
Token on
Token ID319
Feature activation+3.94e+00

Pos logit contributions
Any+0.100
hed+0.100
Looks+0.098
You+0.096
ee+0.095
Neg logit contributions
keeper-0.129
 keeper-0.122
ache-0.114
 instincts-0.114
 aid-0.113
Token the
Token ID262
Feature activation+1.24e+00

Pos logit contributions
Looks+0.060
Any+0.060
hed+0.055
You+0.055
pill+0.053
Neg logit contributions
keeper-0.116
 keeper-0.111
 worker-0.104
ache-0.103
 hinges-0.103
Token ice
Token ID4771
Feature activation+7.83e-01

Pos logit contributions
Any+0.026
Looks+0.026
hed+0.024
You+0.024
pill+0.024
Neg logit contributions
keeper-0.029
 keeper-0.028
 worker-0.026
ache-0.025
 hinges-0.025
Token.
Token ID13
Feature activation+1.01e+00

Pos logit contributions
hed+0.017
Any+0.016
Looks+0.016
ee+0.016
You+0.016
Neg logit contributions
keeper-0.017
 keeper-0.016
 instincts-0.015
ache-0.015
 aid-0.015
Token was
Token ID373
Feature activation-2.38e-01

Pos logit contributions
keeper+0.088
 keeper+0.085
 worker+0.078
ache+0.077
 doctor+0.077
Neg logit contributions
Looks-0.087
Any-0.085
hed-0.081
ee-0.079
|-0.079
Token very
Token ID845
Feature activation+3.74e-01

Pos logit contributions
keeper+0.006
 keeper+0.006
 worker+0.006
ache+0.005
 doctor+0.005
Neg logit contributions
Looks-0.004
Any-0.004
hed-0.004
You-0.004
pill-0.004
Token sick
Token ID6639
Feature activation-9.08e-01

Pos logit contributions
Looks+0.008
Any+0.008
hed+0.007
You+0.007
pill+0.007
Neg logit contributions
keeper-0.009
 keeper-0.008
 worker-0.008
 hinges-0.008
ache-0.008
Token.
Token ID13
Feature activation+6.70e-02

Pos logit contributions
keeper+0.020
 keeper+0.019
 worker+0.017
 hinges+0.017
 instincts+0.017
Neg logit contributions
Any-0.020
Looks-0.020
hed-0.019
You-0.019
pill-0.019
Token Lucy
Token ID22162
Feature activation+5.38e-01

Pos logit contributions
Looks+0.001
Any+0.001
hed+0.001
ee+0.001
You+0.001
Neg logit contributions
keeper-0.002
 keeper-0.002
 worker-0.002
 hinges-0.001
 doctor-0.001
Token's
Token ID338
Feature activation-2.74e+00

Pos logit contributions
Looks+0.011
Any+0.011
hed+0.010
You+0.010
pill+0.010
Neg logit contributions
keeper-0.013
 keeper-0.013
 worker-0.012
ache-0.012
 hinges-0.011
Token parents
Token ID3397
Feature activation-6.78e-01

Pos logit contributions
keeper+0.049
 keeper+0.044
 code+0.042
 hinges+0.042
 detector+0.041
Neg logit contributions
Any-0.069
 You-0.063
ee-0.063
You-0.063
hed-0.062
Token took
Token ID1718
Feature activation+1.98e+00

Pos logit contributions
keeper+0.017
 keeper+0.017
 hinges+0.016
 instincts+0.015
 worker+0.015
Neg logit contributions
Any-0.013
Looks-0.012
hed-0.012
You-0.012
ee-0.011
Token Hop
Token ID9996
Feature activation+1.57e-02

Pos logit contributions
Looks+0.039
Any+0.039
pill+0.036
ee+0.035
hed+0.035
Neg logit contributions
keeper-0.049
 keeper-0.047
 doctor-0.044
 worker-0.044
 hinges-0.043
Token to
Token ID284
Feature activation-4.79e+00

Pos logit contributions
Looks+0.000
Any+0.000
 You+0.000
pill+0.000
You+0.000
Neg logit contributions
keeper-0.000
 keeper-0.000
 worker-0.000
 aid-0.000
ache-0.000
Token the
Token ID262
Feature activation-1.65e-02

Pos logit contributions
keeper+0.148
 keeper+0.140
ache+0.135
 worker+0.132
 instincts+0.131
Neg logit contributions
Any-0.067
Looks-0.066
You-0.061
|-0.061
hed-0.058
Token She
Token ID1375
Feature activation-1.67e+00

Pos logit contributions
Looks+0.008
Any+0.008
ee+0.007
|+0.007
hed+0.007
Neg logit contributions
keeper-0.008
 keeper-0.008
 worker-0.007
 doctor-0.007
 door-0.007
Token thanked
Token ID26280
Feature activation-5.76e+00

Pos logit contributions
keeper+0.039
 keeper+0.036
 worker+0.033
ache+0.033
 hinges+0.032
Neg logit contributions
Looks-0.037
Any-0.036
You-0.034
|-0.034
hed-0.034
Token the
Token ID262
Feature activation-2.58e-01

Pos logit contributions
keeper+0.156
 keeper+0.152
 worker+0.141
 doctor+0.139
 instincts+0.136
Neg logit contributions
Looks-0.103
Any-0.101
hed-0.098
ee-0.093
pill-0.093
Token man
Token ID582
Feature activation-2.25e+00

Pos logit contributions
keeper+0.004
 hinges+0.003
 keeper+0.003
ache+0.003
alo+0.003
Neg logit contributions
You-0.007
Looks-0.007
Any-0.007
ee-0.007
hed-0.007
Token and
Token ID290
Feature activation+2.44e+00

Pos logit contributions
keeper+0.050
 keeper+0.047
 worker+0.043
ache+0.042
 instincts+0.041
Neg logit contributions
Looks-0.052
Any-0.051
hed-0.048
You-0.048
pill-0.048
Token ran
Token ID4966
Feature activation-9.51e-02

Pos logit contributions
Looks+0.048
Any+0.048
hed+0.045
You+0.045
pill+0.044
Neg logit contributions
keeper-0.061
 keeper-0.058
 worker-0.054
ache-0.053
 hinges-0.053
Token off
Token ID572
Feature activation-1.08e+00

Pos logit contributions
keeper+0.002
 keeper+0.002
 worker+0.001
 hinges+0.001
 aid+0.001
Neg logit contributions
Looks-0.003
Any-0.003
pill-0.002
hed-0.002
You-0.002
Token with
Token ID351
Feature activation+2.49e+00

Pos logit contributions
keeper+0.018
 keeper+0.016
ache+0.015
 worker+0.015
 hinges+0.015
Neg logit contributions
Any-0.030
Looks-0.030
hed-0.029
You-0.029
|-0.029
Token the
Token ID262
Feature activation+9.54e-01

Pos logit contributions
Any+0.047
Looks+0.046
hed+0.044
pill+0.043
|+0.042
Neg logit contributions
keeper-0.065
 keeper-0.063
 worker-0.058
 doctor-0.057
 instincts-0.057
Token shiny
Token ID22441
Feature activation+9.59e-01

Pos logit contributions
Any+0.024
Looks+0.024
hed+0.023
You+0.023
ee+0.023
Neg logit contributions
keeper-0.018
 keeper-0.017
 hinges-0.016
ache-0.016
 instincts-0.015
Token toy
Token ID13373
Feature activation-1.21e+00

Pos logit contributions
hed+0.021
Any+0.021
Looks+0.021
ee+0.020
You+0.020
Neg logit contributions
keeper-0.021
 keeper-0.020
ache-0.019
 hinges-0.018
 instincts-0.018
Token had
Token ID550
Feature activation+3.65e+00

Pos logit contributions
Looks+0.007
Any+0.007
hed+0.007
You+0.007
pill+0.007
Neg logit contributions
keeper-0.007
 keeper-0.006
 worker-0.006
ache-0.006
 hinges-0.006
Token taken
Token ID2077
Feature activation+3.37e+00

Pos logit contributions
Looks+0.071
Any+0.071
hed+0.067
You+0.066
pill+0.065
Neg logit contributions
keeper-0.091
 keeper-0.087
 worker-0.081
 hinges-0.080
ache-0.080
Token her
Token ID607
Feature activation-3.92e+00

Pos logit contributions
Looks+0.064
Any+0.064
hed+0.061
You+0.060
pill+0.059
Neg logit contributions
keeper-0.085
 keeper-0.081
 worker-0.076
ache-0.075
 hinges-0.075
Token.
Token ID13
Feature activation+8.74e-01

Pos logit contributions
keeper+0.091
 keeper+0.086
 worker+0.081
ache+0.078
 hinges+0.078
Neg logit contributions
Any-0.084
Looks-0.082
hed-0.079
|-0.077
You-0.077
Token\n
Token ID198
Feature activation-1.38e+00

Pos logit contributions
Any+0.015
Looks+0.015
hed+0.014
You+0.014
pill+0.014
Neg logit contributions
keeper-0.024
 keeper-0.023
ache-0.021
 worker-0.021
 instincts-0.021
Token\n
Token ID198
Feature activation-1.63e+00

Pos logit contributions
keeper+0.041
ache+0.039
 keeper+0.039
 instincts+0.038
 worker+0.038
Neg logit contributions
Any-0.020
|-0.020
You-0.019
Looks-0.019
hed-0.019
TokenSarah
Token ID29284
Feature activation+1.02e-01

Pos logit contributions
keeper+0.034
 keeper+0.032
 worker+0.029
ache+0.029
 hinges+0.029
Neg logit contributions
Looks-0.039
Any-0.039
hed-0.037
You-0.037
pill-0.036
Token felt
Token ID2936
Feature activation+3.19e+00

Pos logit contributions
Looks+0.002
Any+0.002
You+0.002
 You+0.002
pill+0.002
Neg logit contributions
keeper-0.002
 keeper-0.002
ache-0.002
 worker-0.002
 aid-0.002
Token so
Token ID523
Feature activation+2.73e+00

Pos logit contributions
Looks+0.071
pill+0.069
Any+0.068
ee+0.067
hed+0.067
Neg logit contributions
keeper-0.070
 keeper-0.067
 doctor-0.063
ache-0.061
 worker-0.060
Token brave
Token ID14802
Feature activation+5.95e-01

Pos logit contributions
pill+0.068
Looks+0.067
ksh+0.065
Any+0.065
hed+0.063
Neg logit contributions
keeper-0.053
 keeper-0.051
 instincts-0.046
 door-0.045
 doctor-0.045
Token and
Token ID290
Feature activation+2.43e+00

Pos logit contributions
Any+0.013
Looks+0.013
hed+0.013
You+0.012
 You+0.012
Neg logit contributions
keeper-0.013
 keeper-0.013
 hinges-0.012
 worker-0.012
 instincts-0.012
Token and
Token ID290
Feature activation+3.83e+00

Pos logit contributions
keeper+0.085
 keeper+0.082
 worker+0.075
ache+0.074
 hinges+0.074
Neg logit contributions
Looks-0.073
Any-0.073
hed-0.069
pill-0.068
|-0.067
Token it
Token ID340
Feature activation+2.38e+00

Pos logit contributions
Any+0.083
hed+0.080
Looks+0.080
|+0.080
ee+0.077
Neg logit contributions
keeper-0.088
 keeper-0.085
 worker-0.078
 doctor-0.077
ache-0.075
Token was
Token ID373
Feature activation+4.80e-01

Pos logit contributions
Any+0.050
Looks+0.050
hed+0.047
You+0.047
pill+0.047
Neg logit contributions
keeper-0.056
 keeper-0.054
 worker-0.049
ache-0.049
 doctor-0.048
Token hungry
Token ID14720
Feature activation-6.73e-01

Pos logit contributions
Any+0.008
Looks+0.008
You+0.008
 You+0.007
hed+0.007
Neg logit contributions
keeper-0.013
 keeper-0.012
 worker-0.012
 instincts-0.012
ache-0.012
Token.
Token ID13
Feature activation+6.16e-01

Pos logit contributions
keeper+0.015
 keeper+0.014
 worker+0.013
 instincts+0.013
ache+0.013
Neg logit contributions
Any-0.015
Looks-0.014
hed-0.014
You-0.014
 You-0.014
Token It
Token ID632
Feature activation+8.02e-01

Pos logit contributions
Any+0.009
Looks+0.009
You+0.008
hed+0.008
 You+0.008
Neg logit contributions
keeper-0.018
 keeper-0.018
ache-0.017
 worker-0.017
 instincts-0.017
Token wanted
Token ID2227
Feature activation-5.67e-01

Pos logit contributions
Any+0.015
Looks+0.015
hed+0.015
You+0.014
pill+0.014
Neg logit contributions
keeper-0.020
 keeper-0.019
 worker-0.018
ache-0.018
 hinges-0.018
Token to
Token ID284
Feature activation-4.44e+00

Pos logit contributions
keeper+0.011
ache+0.011
 keeper+0.010
alo+0.010
 instincts+0.010
Neg logit contributions
Any-0.013
You-0.013
Looks-0.012
 You-0.012
 Wed-0.012
Token eat
Token ID4483
Feature activation+4.13e-01

Pos logit contributions
keeper+0.119
 keeper+0.108
 worker+0.104
ache+0.104
 instincts+0.103
Neg logit contributions
|-0.079
Any-0.078
Looks-0.075
You-0.074
ee-0.074
Token the
Token ID262
Feature activation+7.15e-01

Pos logit contributions
Looks+0.008
Any+0.008
hed+0.007
You+0.007
pill+0.007
Neg logit contributions
keeper-0.011
 keeper-0.010
 worker-0.010
 hinges-0.010
 instincts-0.009
Token ot
Token ID30972
Feature activation+1.81e-01

Pos logit contributions
Any+0.015
Looks+0.015
hed+0.014
You+0.014
pill+0.014
Neg logit contributions
keeper-0.017
 keeper-0.016
 worker-0.015
ache-0.015
 hinges-0.015
Token the
Token ID262
Feature activation+1.17e+00

Pos logit contributions
Looks+0.064
Any+0.064
hed+0.060
pill+0.059
|+0.059
Neg logit contributions
keeper-0.082
 keeper-0.078
 hinges-0.074
 code-0.073
 worker-0.073
Token flashlight
Token ID38371
Feature activation+6.49e-02

Pos logit contributions
Looks+0.029
Any+0.029
You+0.028
hed+0.028
 You+0.028
Neg logit contributions
keeper-0.023
 keeper-0.021
 worker-0.019
ache-0.019
 aid-0.018
Token,"
Token ID553
Feature activation-6.00e-01

Pos logit contributions
hed+0.001
Any+0.001
Looks+0.001
ee+0.001
 You+0.001
Neg logit contributions
keeper-0.001
 keeper-0.001
 instincts-0.001
 worker-0.001
 hinges-0.001
Token Tom
Token ID4186
Feature activation-1.22e+00

Pos logit contributions
keeper+0.013
 keeper+0.013
 worker+0.012
 hinges+0.011
 doctor+0.011
Neg logit contributions
Any-0.013
Looks-0.013
hed-0.013
|-0.013
pill-0.012
Token says
Token ID1139
Feature activation-4.92e+00

Pos logit contributions
keeper+0.027
 keeper+0.026
 worker+0.024
 aid+0.023
 code+0.023
Neg logit contributions
Any-0.028
Looks-0.028
|-0.026
hed-0.025
pill-0.025
Token.
Token ID13
Feature activation-9.91e-01

Pos logit contributions
keeper+0.121
 keeper+0.116
 dentist+0.113
ache+0.113
 instincts+0.110
Neg logit contributions
Any-0.085
 You-0.085
hed-0.085
Looks-0.084
You-0.083
Token\n
Token ID198
Feature activation-1.90e+00

Pos logit contributions
keeper+0.027
ache+0.027
 instincts+0.026
 code+0.026
 hinges+0.025
Neg logit contributions
You-0.016
Looks-0.015
 You-0.015
Any-0.015
hed-0.014
Token\n
Token ID198
Feature activation-1.95e+00

Pos logit contributions
ache+0.060
keeper+0.059
 hinges+0.058
alo+0.058
 instincts+0.058
Neg logit contributions
You-0.025
ee-0.023
Looks-0.023
Any-0.023
�-0.022
Token"
Token ID1
Feature activation-6.06e-01

Pos logit contributions
alo+0.046
keeper+0.045
ache+0.044
 instincts+0.042
 hinges+0.041
Neg logit contributions
Looks-0.042
You-0.041
�-0.039
Any-0.039
ee-0.038
TokenOK
Token ID11380
Feature activation-2.42e-01

Pos logit contributions
keeper+0.016
 keeper+0.015
 worker+0.014
 hinges+0.014
ache+0.014
Neg logit contributions
Any-0.011
Looks-0.011
hed-0.011
|-0.010
pill-0.010
Token,
Token ID11
Feature activation-5.02e+00

Pos logit contributions
keeper+0.006
ache+0.005
 keeper+0.005
 lair+0.005
alo+0.005
Neg logit contributions
 You-0.005
�-0.005
Any-0.005
You-0.005
Looks-0.005
Token Look
Token ID6803
Feature activation+2.39e+00

Pos logit contributions
hed+0.023
Looks+0.023
Any+0.021
ee+0.020
|+0.020
Neg logit contributions
keeper-0.040
 doctor-0.040
 aid-0.038
 vet-0.038
 keeper-0.038
Token,
Token ID11
Feature activation-3.01e+00

Pos logit contributions
Any+0.046
ee+0.044
hed+0.044
 You+0.044
Looks+0.043
Neg logit contributions
keeper-0.060
 keeper-0.058
 hinges-0.054
 worker-0.053
 code-0.052
Token I
Token ID314
Feature activation+3.14e+00

Pos logit contributions
keeper+0.079
 keeper+0.076
 worker+0.071
 doctor+0.071
ache+0.070
Neg logit contributions
Looks-0.054
Any-0.054
hed-0.052
|-0.050
pill-0.050
Token can
Token ID460
Feature activation-2.81e+00

Pos logit contributions
Looks+0.062
Any+0.059
pill+0.058
hed+0.056
You+0.055
Neg logit contributions
keeper-0.076
 keeper-0.073
 worker-0.070
ache-0.069
 aid-0.069
Token make
Token ID787
Feature activation+1.06e-01

Pos logit contributions
keeper+0.067
 keeper+0.064
 worker+0.059
 hinges+0.058
 instincts+0.058
Neg logit contributions
Any-0.058
Looks-0.056
You-0.055
hed-0.055
ee-0.055
Token a
Token ID257
Feature activation-4.10e+00

Pos logit contributions
Looks+0.002
Any+0.002
|+0.002
hed+0.002
pill+0.002
Neg logit contributions
keeper-0.003
 keeper-0.002
 doctor-0.002
 dentist-0.002
ache-0.002
Token circle
Token ID9197
Feature activation+5.08e-02

Pos logit contributions
keeper+0.093
 keeper+0.086
 hinges+0.085
alo+0.084
ache+0.084
Neg logit contributions
ee-0.078
Any-0.078
Looks-0.076
 You-0.076
 Good-0.075
Token with
Token ID351
Feature activation+3.79e+00

Pos logit contributions
Any+0.001
hed+0.001
Looks+0.001
 You+0.001
You+0.001
Neg logit contributions
keeper-0.001
 keeper-0.001
 instincts-0.001
 hinges-0.001
ache-0.001
Token my
Token ID616
Feature activation-2.32e+00

Pos logit contributions
Looks+0.079
Any+0.076
|+0.075
hed+0.074
pill+0.070
Neg logit contributions
keeper-0.090
 keeper-0.085
 doctor-0.081
 door-0.081
 worker-0.079
Token fork
Token ID15563
Feature activation+1.02e-02

Pos logit contributions
keeper+0.043
 keeper+0.041
 worker+0.037
ache+0.036
 hinges+0.036
Neg logit contributions
Any-0.059
Looks-0.059
hed-0.056
You-0.056
ee-0.056
Token and
Token ID290
Feature activation+3.54e+00

Pos logit contributions
Any+0.000
 You+0.000
hed+0.000
You+0.000
ee+0.000
Neg logit contributions
keeper-0.000
 keeper-0.000
 hinges-0.000
 worker-0.000
 instincts-0.000
Token�
Token ID241
Feature activation+4.69e-01

Pos logit contributions
Looks+0.082
Any+0.077
pill+0.075
|+0.073
ee+0.072
Neg logit contributions
 keeper-0.075
keeper-0.074
 aid-0.071
ache-0.068
 worker-0.068
TokenNow
Token ID3844
Feature activation-9.15e-01

Pos logit contributions
hed+0.008
|+0.008
Looks+0.008
 imag+0.008
pill+0.008
Neg logit contributions
keeper-0.011
 keeper-0.011
 lady-0.010
 aid-0.010
 worker-0.010
Token,
Token ID11
Feature activation-3.71e+00

Pos logit contributions
keeper+0.022
 keeper+0.021
 doctor+0.019
 worker+0.019
 instincts+0.019
Neg logit contributions
Looks-0.019
Any-0.019
hed-0.018
pill-0.018
|-0.018
Token little
Token ID1310
Feature activation+2.55e-01

Pos logit contributions
keeper+0.088
 keeper+0.083
 doctor+0.078
 worker+0.077
 instincts+0.075
Neg logit contributions
Looks-0.077
hed-0.076
Any-0.075
pill-0.074
|-0.073
Token fox
Token ID21831
Feature activation+4.85e-01

Pos logit contributions
Looks+0.006
|+0.006
Any+0.006
hed+0.006
pill+0.005
Neg logit contributions
keeper-0.005
 keeper-0.005
 doctor-0.005
 worker-0.005
 instincts-0.005
Token,
Token ID11
Feature activation-3.26e+00

Pos logit contributions
Looks+0.012
Any+0.011
hed+0.011
You+0.011
pill+0.011
Neg logit contributions
keeper-0.010
 keeper-0.010
 worker-0.009
ache-0.009
 instincts-0.009
Token do
Token ID466
Feature activation-1.46e+00

Pos logit contributions
keeper+0.079
 keeper+0.075
 doctor+0.071
 worker+0.069
 instincts+0.068
Neg logit contributions
Looks-0.067
hed-0.064
Any-0.064
pill-0.063
|-0.062
Token you
Token ID345
Feature activation+1.27e+00

Pos logit contributions
keeper+0.033
 doctor+0.033
 keeper+0.033
 instincts+0.032
ache+0.031
Neg logit contributions
ksh-0.029
hed-0.028
Looks-0.028
|-0.028
Any-0.026
Token understand
Token ID1833
Feature activation-1.84e+00

Pos logit contributions
Looks+0.026
Any+0.025
pill+0.024
hed+0.023
You+0.023
Neg logit contributions
keeper-0.031
 keeper-0.029
ache-0.028
 code-0.027
 aid-0.027
Token why
Token ID1521
Feature activation+2.26e+00

Pos logit contributions
keeper+0.044
 keeper+0.041
 hinges+0.039
 worker+0.038
alo+0.038
Neg logit contributions
Any-0.038
Looks-0.036
You-0.036
ee-0.035
 You-0.035
Token your
Token ID534
Feature activation-3.20e+00

Pos logit contributions
Looks+0.047
hed+0.046
Any+0.044
|+0.043
ksh+0.043
Neg logit contributions
keeper-0.052
 keeper-0.051
 doctor-0.049
 instincts-0.048
 worker-0.047
Token.
Token ID13
Feature activation+1.04e-01

Pos logit contributions
Looks+0.006
Any+0.006
pill+0.006
hed+0.006
Whe+0.006
Neg logit contributions
keeper-0.005
 keeper-0.004
 worker-0.004
 door-0.004
 hinges-0.004
Token He
Token ID679
Feature activation-1.80e+00

Pos logit contributions
Looks+0.002
Any+0.002
hed+0.002
You+0.002
pill+0.002
Neg logit contributions
keeper-0.003
 keeper-0.003
 worker-0.002
 hinges-0.002
 doctor-0.002
Token knew
Token ID2993
Feature activation-3.15e+00

Pos logit contributions
keeper+0.042
 keeper+0.040
 worker+0.037
ache+0.036
 doctor+0.035
Neg logit contributions
Looks-0.039
Any-0.039
pill-0.037
hed-0.037
|-0.036
Token he
Token ID339
Feature activation-1.68e+00

Pos logit contributions
keeper+0.080
 keeper+0.075
 hinges+0.071
ache+0.071
alo+0.070
Neg logit contributions
Any-0.061
Looks-0.058
You-0.056
 You-0.053
ee-0.052
Token had
Token ID550
Feature activation+2.57e+00

Pos logit contributions
keeper+0.044
 keeper+0.043
 worker+0.040
ache+0.039
 hinges+0.039
Neg logit contributions
Looks-0.031
Any-0.031
hed-0.029
pill-0.029
You-0.028
Token to
Token ID284
Feature activation-4.36e+00

Pos logit contributions
Any+0.077
Looks+0.077
hed+0.074
You+0.073
pill+0.072
Neg logit contributions
keeper-0.038
 keeper-0.035
 worker-0.030
ache-0.030
 hinges-0.030
Token warn
Token ID9828
Feature activation-3.19e+00

Pos logit contributions
keeper+0.106
 keeper+0.097
ache+0.090
 worker+0.089
alo+0.089
Neg logit contributions
Any-0.088
Looks-0.087
You-0.085
ee-0.083
hed-0.082
Token the
Token ID262
Feature activation-3.54e-01

Pos logit contributions
keeper+0.088
 keeper+0.084
ache+0.080
 worker+0.079
 hinges+0.078
Neg logit contributions
Looks-0.052
Any-0.052
You-0.049
hed-0.049
 You-0.048
Token sailors
Token ID29996
Feature activation-1.53e+00

Pos logit contributions
keeper+0.007
 keeper+0.006
alo+0.006
 hinges+0.006
ache+0.006
Neg logit contributions
Any-0.008
Looks-0.008
You-0.008
ee-0.008
 You-0.008
Token on
Token ID319
Feature activation+2.74e+00

Pos logit contributions
keeper+0.034
 keeper+0.032
 hinges+0.029
 worker+0.029
ache+0.029
Neg logit contributions
Any-0.034
Looks-0.033
hed-0.032
You-0.032
ee-0.031
Token the
Token ID262
Feature activation+2.66e-01

Pos logit contributions
Looks+0.052
Any+0.051
hed+0.049
pill+0.048
You+0.046
Neg logit contributions
keeper-0.071
 keeper-0.069
 worker-0.064
 instincts-0.064
 aid-0.064
Token he
Token ID339
Feature activation-5.84e-01

Pos logit contributions
keeper+0.074
 keeper+0.071
 worker+0.066
 doctor+0.064
 hinges+0.063
Neg logit contributions
Any-0.082
Looks-0.081
hed-0.079
|-0.078
ee-0.078
Token knew
Token ID2993
Feature activation-3.00e+00

Pos logit contributions
keeper+0.013
 keeper+0.013
 worker+0.012
ache+0.012
 doctor+0.011
Neg logit contributions
Looks-0.013
Any-0.013
hed-0.012
You-0.012
|-0.012
Token it
Token ID340
Feature activation+1.62e+00

Pos logit contributions
keeper+0.073
 keeper+0.069
 hinges+0.069
ache+0.068
alo+0.068
Neg logit contributions
Any-0.060
You-0.056
Looks-0.055
 You-0.052
 Wed-0.051
Token was
Token ID373
Feature activation-1.61e-01

Pos logit contributions
Looks+0.035
Any+0.035
hed+0.033
You+0.032
pill+0.032
Neg logit contributions
keeper-0.037
 keeper-0.036
 worker-0.033
ache-0.032
 code-0.032
Token going
Token ID1016
Feature activation+1.07e+00

Pos logit contributions
keeper+0.004
 keeper+0.004
 worker+0.003
 code+0.003
 doctor+0.003
Neg logit contributions
Looks-0.003
hed-0.003
|-0.003
Any-0.003
pill-0.003
Token to
Token ID284
Feature activation-4.36e+00

Pos logit contributions
Looks+0.032
Any+0.032
hed+0.031
You+0.031
pill+0.031
Neg logit contributions
keeper-0.016
 keeper-0.014
 worker-0.013
ache-0.012
 hinges-0.012
Token be
Token ID307
Feature activation+1.35e+00

Pos logit contributions
keeper+0.082
 keeper+0.080
 worker+0.073
 doctor+0.072
 code+0.071
Neg logit contributions
Looks-0.111
Any-0.109
hed-0.107
pill-0.105
ee-0.103
Token a
Token ID257
Feature activation-2.91e+00

Pos logit contributions
Looks+0.025
Any+0.025
hed+0.024
You+0.023
pill+0.023
Neg logit contributions
keeper-0.035
 keeper-0.033
 worker-0.031
ache-0.031
 hinges-0.031
Token great
Token ID1049
Feature activation+5.49e-01

Pos logit contributions
keeper+0.073
 keeper+0.068
 hinges+0.068
alo+0.067
ache+0.065
Neg logit contributions
Any-0.055
Looks-0.051
ee-0.049
You-0.048
 You-0.047
Token day
Token ID1110
Feature activation-1.42e+00

Pos logit contributions
Looks+0.014
Any+0.014
hed+0.013
You+0.013
pill+0.013
Neg logit contributions
keeper-0.011
 keeper-0.010
 worker-0.009
 hinges-0.009
 code-0.009
Token.
Token ID13
Feature activation+1.32e+00

Pos logit contributions
keeper+0.029
 keeper+0.027
 worker+0.025
ache+0.025
 aid+0.024
Neg logit contributions
Looks-0.035
Any-0.034
|-0.032
pill-0.032
You-0.032
Token morning
Token ID3329
Feature activation-1.15e+00

Pos logit contributions
Looks+0.047
Any+0.041
pill+0.041
hed+0.040
ksh+0.040
Neg logit contributions
keeper-0.076
 keeper-0.076
 doctor-0.074
 worker-0.073
 aid-0.072
Token,
Token ID11
Feature activation-3.75e+00

Pos logit contributions
keeper+0.026
 keeper+0.025
 code+0.024
 hinges+0.024
 instincts+0.024
Neg logit contributions
Any-0.024
You-0.023
hed-0.023
 You-0.023
Looks-0.023
Token she
Token ID673
Feature activation-9.15e-01

Pos logit contributions
keeper+0.090
 keeper+0.085
 worker+0.080
 doctor+0.078
ache+0.078
Neg logit contributions
Looks-0.078
Any-0.077
hed-0.074
|-0.072
You-0.071
Token put
Token ID1234
Feature activation+2.12e+00

Pos logit contributions
keeper+0.021
 keeper+0.020
 worker+0.018
ache+0.018
 hinges+0.018
Neg logit contributions
Looks-0.020
Any-0.020
hed-0.019
You-0.019
pill-0.018
Token on
Token ID319
Feature activation+1.14e+00

Pos logit contributions
Looks+0.042
Any+0.042
pill+0.039
|+0.039
hed+0.037
Neg logit contributions
keeper-0.053
 keeper-0.050
 worker-0.048
 doctor-0.047
 door-0.046
Token her
Token ID607
Feature activation-4.14e+00

Pos logit contributions
Any+0.029
Looks+0.028
You+0.026
hed+0.026
|+0.026
Neg logit contributions
keeper-0.022
 doctor-0.021
 keeper-0.021
 door-0.020
 hinges-0.020
Token special
Token ID2041
Feature activation-2.91e+00

Pos logit contributions
keeper+0.085
 keeper+0.081
 worker+0.074
 hinges+0.073
ache+0.073
Neg logit contributions
Looks-0.100
Any-0.099
hed-0.095
You-0.093
pill-0.092
Token uniform
Token ID8187
Feature activation-7.35e-01

Pos logit contributions
keeper+0.054
 keeper+0.050
ache+0.045
 worker+0.044
 instincts+0.044
Neg logit contributions
Any-0.074
Looks-0.073
hed-0.071
You-0.070
|-0.069
Token and
Token ID290
Feature activation+3.23e+00

Pos logit contributions
keeper+0.018
 keeper+0.018
 instincts+0.017
 hinges+0.017
 worker+0.016
Neg logit contributions
hed-0.014
Any-0.013
 You-0.012
You-0.012
 Ever-0.012
Token visited
Token ID8672
Feature activation-3.19e+00

Pos logit contributions
Looks+0.058
Any+0.058
hed+0.054
You+0.054
pill+0.052
Neg logit contributions
keeper-0.085
 keeper-0.081
 worker-0.076
ache-0.075
 hinges-0.075
Token people
Token ID661
Feature activation-6.73e-02

Pos logit contributions
keeper+0.098
 keeper+0.094
ache+0.090
 hinges+0.089
 code+0.088
Neg logit contributions
Any-0.043
You-0.041
hed-0.040
Looks-0.040
 You-0.037