TOP ACTIVATIONS
MAX = 4.855

up to see a park ranger coming toward her . The
of flowers and smelled so sweet . The yellow zoom z
so pretty and smelled so sweet . She wanted to keep
, as it looked so sweet . She started playing catch
comb and it smelled so sweet . King Bob carefully poured
. She tried to hang Olive on a h anger ,
He runs and jumps and bar ks . Anna and Ben
That 's when Olive learned an important lesson -
could be and never mis beh aved . <|endoftext|> Once upon
He was nearing an elderly lady 's house . Jake enjoyed
slowly slowly reached out and pet ted the bird gently .
reached in and picked a sweet b erry . It tasted
pet als and smelled so sweet . He thought it was
It was warm and so sweet . She liked it very
see swings and slides and k ites . They see clouds
it smelled and tasted so sweet . Mom my was relieved
They picked up the fallen pet als and tried to put
, jumped rope and flew k ites . They laughed and
eventually he reached out and pet ted the wolf . To
colour and it smelled so sweet . She couldn 't wait

MIN ACTIVATIONS
MIN = -7.545

my had told him long ago . The bird spread out
velvet box all those years ago . They were glad that
the pigeon all those weeks ago . <|endoftext|> Mike and his
told him that many years ago , a big storm had
coins that people used long ago . They saw paintings and
ancestors who sang it long ago . Would you like to
how they were used long ago . Lily was fascinated and
found the truck many years ago and worked hard to fix
says that a long time ago , a man named Jesus
had been made many years ago , but it still looked
her that a long time ago , someone made the mark
" It was hidden long ago , and only seems fancy
says that a long time ago , a dragon used to
by a famous artist long ago . One
They were here a minute ago ." " Maybe
family from a long time ago ." " Maybe
so happy with her new seas hell and her new crab
how much you value your seas hell s , Lily .
her birthday a few days ago . She liked to practice
animals and people from long ago . Sam 's mom said

INTERVAL 1.755 - 4.855
CONTAINS 2.488%

Welcome . We like bubbles too . They are pretty and
. They were the fastest friends in the world ! <|endoftext|>
together and had so much fun . Max liked Lily 's
new bath robe and observed how warm and cozy it was
The pink worm stayed out all night , until the sun

INTERVAL -1.345 - 1.755
CONTAINS 90.303%

liked to play on . He would swing back and forth
out and hugged his mom . From then on , every
ers took the coin and bit it . " It 's
And even though they were scared at times , they were
they got home , mom my and Georg ie put all

INTERVAL -4.445 - -1.345
CONTAINS 7.206%

" Hello , Rex ," they said . " We
go play in the rain ?' His mum sighed
saw a box in the corner of the room and started
glasses . He quickly told his mom and dad and they
Tim my . Tim my loved to go to the lake

INTERVAL -7.545 - -4.445
CONTAINS 0.004%

He was here a minute ago . He was chasing the
him about the frog 's habitat , diet and how it
Louise . Her favourite past ime was playing in the park
a nice house by the seas ide . One
his feet and threw a tant rum . "
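
The four INTERVAL ranges above appear to be equal-width bins spanning the reported MIN (-7.545) and MAX (4.855). The sketch below reproduces the bin edges under that assumption. The function name `interval_edges` is hypothetical and is not taken from the tool that produced this listing.

```python
def interval_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced edges from lo to hi, rounded to 3 decimals."""
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


# MIN and MAX as reported in the listing; 4 bins is an assumption read off
# from the four INTERVAL blocks.
edges = interval_edges(-7.545, 4.855, 4)

# Pair consecutive edges into (low, high) ranges, highest interval first,
# matching the order used in the listing.
intervals = list(zip(edges[:-1], edges[1:]))[::-1]

# Shares of tokens per interval, copied from the CONTAINS lines (in percent).
shares = [2.488, 90.303, 7.206, 0.004]
total_share = sum(shares)  # close to 100; the small excess is consistent with rounding

for (low, high), share in zip(intervals, shares):
    print(f"INTERVAL {low} - {high}  CONTAINS {share}%")
```

Each bin is 3.1 wide, so nearly all tokens (about 90%) fall in the bin that straddles zero, while the extreme negative bin holds only 0.004%.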
Token up
Token ID510
Feature activation+1.73e+00

Pos logit contributions
bi+0.025
 Johnson+0.025
isi+0.024
 Betty+0.024
 Karen+0.023
Neg logit contributions
 tant-0.032
lights-0.032
obs-0.031
ware-0.030
renches-0.030
Token to
Token ID284
Feature activation+9.59e-01

Pos logit contributions
bi+0.076
 Betty+0.072
 Wendy+0.071
 Johnson+0.071
 Karen+0.069
Neg logit contributions
lights-0.096
obs-0.096
 tant-0.095
ware-0.094
renches-0.092
Token see
Token ID766
Feature activation-8.11e-01

Pos logit contributions
bi+0.049
 Johnson+0.046
live+0.045
isi+0.043
 Wendy+0.043
Neg logit contributions
 tant-0.041
 Some-0.040
obs-0.040
ware-0.039
lights-0.039
Token a
Token ID257
Feature activation+1.50e+00

Pos logit contributions
ware+0.046
lights+0.046
 tant+0.045
obs+0.045
renches+0.044
Neg logit contributions
bi-0.035
 Johnson-0.031
 Wendy-0.030
 reunited-0.029
isi-0.029
Token park
Token ID3952
Feature activation+1.37e+00

Pos logit contributions
bi+0.067
 Betty+0.062
 Susan+0.061
 Wendy+0.060
 Patty+0.059
Neg logit contributions
 tant-0.082
ware-0.079
obs-0.079
lights-0.078
renches-0.077
Token ranger
Token ID43570
Feature activation+4.86e+00

Pos logit contributions
bi+0.065
 Betty+0.061
Elizabeth+0.061
isi+0.061
 Patty+0.058
Neg logit contributions
lights-0.072
obs-0.069
ware-0.066
 boards-0.065
 tant-0.064
Token coming
Token ID2406
Feature activation+4.23e-01

Pos logit contributions
bi+0.222
 Betty+0.211
 Wendy+0.205
 Patty+0.202
isi+0.201
Neg logit contributions
ware-0.257
obs-0.246
lights-0.244
 tant-0.238
renches-0.237
Token toward
Token ID3812
Feature activation+8.03e-01

Pos logit contributions
live+0.027
isi+0.025
bi+0.025
atherine+0.024
Elizabeth+0.024
Neg logit contributions
controlled-0.012
obs-0.012
ro-0.011
 ago-0.011
 tant-0.011
Token her
Token ID607
Feature activation+1.45e+00

Pos logit contributions
bi+0.027
 Johnson+0.023
 Wendy+0.022
 Betty+0.022
my+0.022
Neg logit contributions
 tant-0.051
ware-0.051
lights-0.049
renches-0.049
obs-0.049
Token.
Token ID13
Feature activation-1.87e-02

Pos logit contributions
bi+0.066
 Betty+0.063
 Wendy+0.062
 Susan+0.062
 Karen+0.060
Neg logit contributions
obs-0.076
lights-0.073
 tant-0.073
ware-0.073
der-0.071
Token The
Token ID383
Feature activation-5.14e-01

Pos logit contributions
obs+0.001
lights+0.001
renches+0.001
 tant+0.001
ware+0.001
Neg logit contributions
bi-0.001
 Betty-0.001
 Wendy-0.001
 Johnson-0.001
my-0.001
Token of
Token ID286
Feature activation-1.18e+00

Pos logit contributions
 tant+0.027
renches+0.027
ware+0.026
obs+0.025
lights+0.025
Neg logit contributions
 Daisy-0.016
 Betty-0.016
my-0.016
 Sally-0.016
 Wendy-0.015
Token flowers
Token ID12734
Feature activation+4.18e-02

Pos logit contributions
ware+0.073
obs+0.073
renches+0.072
 tant+0.071
lights+0.071
Neg logit contributions
 Wendy-0.045
 Betty-0.045
bi-0.044
 Karen-0.043
 Daisy-0.043
Token and
Token ID290
Feature activation+3.05e-01

Pos logit contributions
bi+0.002
 Betty+0.002
 Wendy+0.002
 Johnson+0.002
 Susan+0.002
Neg logit contributions
lights-0.002
ware-0.002
obs-0.002
renches-0.002
 tant-0.002
Token smelled
Token ID37609
Feature activation+1.22e+00

Pos logit contributions
 Betty+0.012
bi+0.012
 Wendy+0.012
 Susan+0.012
 Daisy+0.011
Neg logit contributions
renches-0.019
 tant-0.018
ware-0.018
obs-0.018
lights-0.018
Token so
Token ID523
Feature activation+2.55e-01

Pos logit contributions
bi+0.041
 Wendy+0.040
 Betty+0.040
 Daisy+0.039
 Susan+0.038
Neg logit contributions
renches-0.081
ware-0.079
 tant-0.079
lights-0.079
obs-0.079
Token sweet
Token ID6029
Feature activation+4.72e+00

Pos logit contributions
bi+0.011
 Betty+0.010
 Wendy+0.010
 Johnson+0.010
 Susan+0.010
Neg logit contributions
lights-0.014
ware-0.014
renches-0.014
 tant-0.014
obs-0.014
Token.
Token ID13
Feature activation+5.24e-01

Pos logit contributions
bi+0.157
 Betty+0.150
 Wendy+0.146
 Johnson+0.144
 Susan+0.141
Neg logit contributions
renches-0.310
ware-0.310
lights-0.310
 tant-0.310
obs-0.307
Token The
Token ID383
Feature activation-5.31e-01

Pos logit contributions
 Betty+0.024
 Wendy+0.024
 Daisy+0.023
bi+0.023
 Susan+0.023
Neg logit contributions
ware-0.030
 tant-0.030
lights-0.030
renches-0.029
obs-0.028
Token yellow
Token ID7872
Feature activation-2.57e-01

Pos logit contributions
lights+0.027
ware+0.027
 tant+0.026
renches+0.026
obs+0.026
Neg logit contributions
bi-0.026
 Betty-0.025
 Wendy-0.025
 Johnson-0.025
 Susan-0.024
Token zoom
Token ID19792
Feature activation-1.20e+00

Pos logit contributions
renches+0.017
lights+0.016
ware+0.016
 tant+0.016
obs+0.016
Neg logit contributions
 Wendy-0.009
bi-0.009
 Betty-0.009
my-0.009
 Patty-0.008
Token z
Token ID1976
Feature activation-7.22e-01

Pos logit contributions
renches+0.081
obs+0.078
 tant+0.078
lights+0.078
ware+0.078
Neg logit contributions
bi-0.040
 Betty-0.039
 Wendy-0.038
my-0.037
 Karen-0.036
Token so
Token ID523
Feature activation-4.03e-01

Pos logit contributions
bi+0.003
 Betty+0.003
 Wendy+0.003
 Johnson+0.003
 Susan+0.003
Neg logit contributions
lights-0.004
ware-0.004
renches-0.004
 tant-0.004
obs-0.004
Token pretty
Token ID2495
Feature activation+2.27e+00

Pos logit contributions
lights+0.021
ware+0.021
 tant+0.020
renches+0.020
obs+0.020
Neg logit contributions
bi-0.019
 Betty-0.019
 Wendy-0.018
 Johnson-0.018
 Susan-0.018
Token and
Token ID290
Feature activation+1.90e-01

Pos logit contributions
bi+0.102
 Betty+0.096
 Wendy+0.095
 Johnson+0.093
 Susan+0.092
Neg logit contributions
obs-0.125
ware-0.125
lights-0.125
 tant-0.121
renches-0.120
Token smelled
Token ID37609
Feature activation+1.05e+00

Pos logit contributions
bi+0.009
 Johnson+0.009
 Betty+0.009
 Wendy+0.009
 Patty+0.008
Neg logit contributions
lights-0.010
ware-0.009
 tant-0.009
obs-0.009
renches-0.009
Token so
Token ID523
Feature activation-2.86e-02

Pos logit contributions
bi+0.035
 Betty+0.034
 Wendy+0.034
 Daisy+0.033
my+0.033
Neg logit contributions
renches-0.070
ware-0.069
lights-0.069
obs-0.068
 tant-0.068
Token sweet
Token ID6029
Feature activation+4.72e+00

Pos logit contributions
lights+0.001
ware+0.001
 tant+0.001
obs+0.001
renches+0.001
Neg logit contributions
bi-0.001
 Johnson-0.001
my-0.001
 Betty-0.001
 Wendy-0.001
Token.
Token ID13
Feature activation+3.42e-01

Pos logit contributions
bi+0.171
 Betty+0.160
 Wendy+0.158
 Johnson+0.155
 Susan+0.153
Neg logit contributions
lights-0.298
ware-0.297
obs-0.297
 tant-0.294
renches-0.292
Token She
Token ID1375
Feature activation+1.01e+00

Pos logit contributions
bi+0.013
 Betty+0.012
 Johnson+0.012
 Wendy+0.012
 Susan+0.012
Neg logit contributions
 tant-0.021
lights-0.020
ware-0.020
obs-0.020
renches-0.020
Token wanted
Token ID2227
Feature activation-3.78e-01

Pos logit contributions
bi+0.042
 Johnson+0.041
 Betty+0.040
 Wendy+0.040
 Karen+0.039
Neg logit contributions
ware-0.056
 tant-0.055
obs-0.055
renches-0.054
lights-0.054
Token to
Token ID284
Feature activation+3.37e-01

Pos logit contributions
lights+0.021
renches+0.021
 tant+0.021
ware+0.021
obs+0.020
Neg logit contributions
bi-0.017
 Betty-0.016
 Wendy-0.016
 Johnson-0.016
 Susan-0.016
Token keep
Token ID1394
Feature activation+9.56e-01

Pos logit contributions
bi+0.014
 Johnson+0.013
 Wendy+0.013
 Betty+0.012
 Karen+0.012
Neg logit contributions
ware-0.019
 tant-0.019
obs-0.019
renches-0.019
lights-0.018
Token,
Token ID11
Feature activation+2.54e-01

Pos logit contributions
obs+0.045
lights+0.044
ware+0.044
 tant+0.042
renches+0.042
Neg logit contributions
bi-0.037
 Wendy-0.034
 Betty-0.034
 Johnson-0.034
 Iris-0.034
Token as
Token ID355
Feature activation+2.10e+00

Pos logit contributions
bi+0.007
 Betty+0.006
 Johnson+0.006
 Wendy+0.006
 Susan+0.006
Neg logit contributions
ware-0.018
lights-0.018
 tant-0.018
obs-0.018
renches-0.018
Token it
Token ID340
Feature activation-9.90e-01

Pos logit contributions
bi+0.078
 Johnson+0.070
 reunited+0.065
isi+0.065
 Karen+0.064
Neg logit contributions
lights-0.131
ware-0.129
obs-0.127
 tant-0.126
renches-0.122
Token looked
Token ID3114
Feature activation-1.94e-01

Pos logit contributions
ware+0.064
lights+0.063
 tant+0.063
renches+0.063
obs+0.062
Neg logit contributions
bi-0.035
 Betty-0.034
 Wendy-0.033
 Johnson-0.032
 Susan-0.032
Token so
Token ID523
Feature activation+4.48e-02

Pos logit contributions
renches+0.013
ware+0.012
lights+0.012
obs+0.012
 tant+0.012
Neg logit contributions
bi-0.007
my-0.006
 Betty-0.006
 Wendy-0.006
 reunited-0.006
Token sweet
Token ID6029
Feature activation+4.61e+00

Pos logit contributions
bi+0.002
 Betty+0.002
 Wendy+0.002
 Johnson+0.002
 Susan+0.002
Neg logit contributions
ware-0.002
lights-0.002
renches-0.002
 tant-0.002
obs-0.002
Token.
Token ID13
Feature activation+1.43e-01

Pos logit contributions
bi+0.177
 Betty+0.166
 Wendy+0.164
 Johnson+0.161
 Susan+0.158
Neg logit contributions
lights-0.281
ware-0.281
obs-0.280
 tant-0.277
renches-0.275
Token She
Token ID1375
Feature activation+8.44e-01

Pos logit contributions
bi+0.006
 Johnson+0.006
 Karen+0.006
 Betty+0.006
 reunited+0.006
Neg logit contributions
ware-0.008
 tant-0.008
obs-0.008
lights-0.008
renches-0.008
Token started
Token ID2067
Feature activation+6.31e-01

Pos logit contributions
bi+0.030
 Betty+0.029
 Wendy+0.028
 Johnson+0.028
 Susan+0.027
Neg logit contributions
ware-0.054
lights-0.053
renches-0.053
 tant-0.053
obs-0.053
Token playing
Token ID2712
Feature activation-8.96e-01

Pos logit contributions
bi+0.027
 Betty+0.026
 Wendy+0.026
 Johnson+0.026
my+0.025
Neg logit contributions
lights-0.036
renches-0.036
ware-0.035
 tant-0.035
obs-0.035
Token catch
Token ID4929
Feature activation+7.25e-01

Pos logit contributions
ware+0.052
 tant+0.052
renches+0.051
lights+0.051
obs+0.051
Neg logit contributions
bi-0.039
 Betty-0.036
 Wendy-0.035
 Johnson-0.035
 Susan-0.034
Tokencomb
Token ID24011
Feature activation+2.11e+00

Pos logit contributions
bi+0.035
 Betty+0.032
 Wendy+0.032
 Johnson+0.031
 Susan+0.031
Neg logit contributions
lights-0.073
ware-0.073
obs-0.073
renches-0.073
 tant-0.072
Token and
Token ID290
Feature activation+3.56e-01

Pos logit contributions
bi+0.056
 Betty+0.054
my+0.052
 Daisy+0.051
 Johnson+0.051
Neg logit contributions
 tant-0.155
renches-0.154
obs-0.149
ware-0.147
lights-0.146
Token it
Token ID340
Feature activation-6.11e-01

Pos logit contributions
bi+0.013
 Betty+0.013
 Wendy+0.012
 Susan+0.012
 Johnson+0.012
Neg logit contributions
renches-0.022
 tant-0.022
obs-0.022
ware-0.022
lights-0.022
Token smelled
Token ID37609
Feature activation+9.18e-01

Pos logit contributions
ware+0.038
lights+0.037
 tant+0.036
renches+0.036
obs+0.036
Neg logit contributions
bi-0.023
 Betty-0.022
 Wendy-0.022
 Johnson-0.022
 Karen-0.021
Token so
Token ID523
Feature activation-4.53e-02

Pos logit contributions
 Daisy+0.026
 Betty+0.026
bi+0.026
my+0.025
 Wendy+0.025
Neg logit contributions
renches-0.064
ware-0.063
obs-0.063
lights-0.062
 tant-0.062
Token sweet
Token ID6029
Feature activation+4.61e+00

Pos logit contributions
lights+0.003
ware+0.003
renches+0.003
 tant+0.003
obs+0.003
Neg logit contributions
bi-0.002
 Betty-0.002
 Wendy-0.002
 Johnson-0.002
 Susan-0.002
Token.
Token ID13
Feature activation+3.67e-01

Pos logit contributions
bi+0.142
 Betty+0.136
 Wendy+0.131
 Johnson+0.130
 Susan+0.127
Neg logit contributions
renches-0.314
 tant-0.313
ware-0.312
lights-0.312
obs-0.309
Token King
Token ID2677
Feature activation-6.47e-01

Pos logit contributions
bi+0.015
 Betty+0.014
 Wendy+0.014
 Johnson+0.014
 Susan+0.014
Neg logit contributions
lights-0.022
 tant-0.022
ware-0.022
renches-0.022
obs-0.021
Token Bob
Token ID5811
Feature activation+1.39e+00

Pos logit contributions
ware+0.038
obs+0.037
lights+0.036
 tant+0.035
renches+0.035
Neg logit contributions
bi-0.024
 Wendy-0.023
 Iris-0.023
 Johnson-0.022
 Betty-0.022
Token carefully
Token ID7773
Feature activation-1.20e+00

Pos logit contributions
bi+0.050
 Wendy+0.050
 Betty+0.049
 Johnson+0.048
 Karen+0.048
Neg logit contributions
ware-0.086
obs-0.083
lights-0.083
 tant-0.083
renches-0.082
Token poured
Token ID19036
Feature activation+1.88e+00

Pos logit contributions
lights+0.081
 tant+0.079
renches+0.078
obs+0.077
ware+0.072
Neg logit contributions
bi-0.041
 Betty-0.039
my-0.039
 Daisy-0.036
 Sue-0.036
Token.
Token ID13
Feature activation-3.17e-02

Pos logit contributions
lights+0.050
ware+0.050
obs+0.049
 tant+0.049
renches+0.049
Neg logit contributions
bi-0.033
 Betty-0.032
 Wendy-0.031
 Susan-0.030
 Karen-0.030
Token She
Token ID1375
Feature activation+1.18e+00

Pos logit contributions
lights+0.002
ware+0.002
obs+0.002
renches+0.002
 tant+0.002
Neg logit contributions
 Wendy-0.001
 Betty-0.001
bi-0.001
 Sue-0.001
 Daisy-0.001
Token tried
Token ID3088
Feature activation+5.82e-01

Pos logit contributions
bi+0.041
 Wendy+0.038
 Betty+0.038
my+0.037
 Johnson+0.037
Neg logit contributions
lights-0.078
obs-0.077
renches-0.077
ware-0.076
 tant-0.075
Token to
Token ID284
Feature activation+2.00e-01

Pos logit contributions
bi+0.029
 Betty+0.027
 Wendy+0.027
 Susan+0.026
 Karen+0.026
Neg logit contributions
 tant-0.029
ware-0.029
lights-0.028
obs-0.028
renches-0.028
Token hang
Token ID8181
Feature activation+5.96e-01

Pos logit contributions
bi+0.011
 Johnson+0.009
 Betty+0.009
 Susan+0.009
 Karen+0.009
Neg logit contributions
 tant-0.009
ro-0.009
ware-0.009
renches-0.009
 boards-0.008
Token Olive
Token ID30012
Feature activation+4.57e+00

Pos logit contributions
bi+0.016
 Betty+0.013
 Johnson+0.012
 Susan+0.012
 reunited+0.012
Neg logit contributions
 tant-0.044
lights-0.044
ware-0.043
renches-0.042
obs-0.042
Token on
Token ID319
Feature activation-3.00e-01

Pos logit contributions
bi+0.163
 Betty+0.153
 Susan+0.149
 Karen+0.146
 Johnson+0.145
Neg logit contributions
ware-0.296
 tant-0.291
obs-0.287
lights-0.285
renches-0.281
Token a
Token ID257
Feature activation+9.44e-01

Pos logit contributions
 tant+0.020
ware+0.020
lights+0.019
renches+0.019
obs+0.019
Neg logit contributions
bi-0.010
my-0.008
 Betty-0.008
 Susan-0.008
 reunited-0.008
Token h
Token ID289
Feature activation+1.91e-01

Pos logit contributions
 Wendy+0.043
my+0.043
bi+0.043
 Johnson+0.042
 Betty+0.041
Neg logit contributions
renches-0.050
lights-0.049
obs-0.049
ware-0.045
der-0.044
Tokenanger
Token ID2564
Feature activation+1.56e-01

Pos logit contributions
bi+0.012
 Wendy+0.012
 Betty+0.012
 Johnson+0.012
 Patty+0.012
Neg logit contributions
 tant-0.006
lights-0.006
renches-0.006
ware-0.006
obs-0.006
Token,
Token ID11
Feature activation-2.11e-01

Pos logit contributions
 Betty+0.004
 Wendy+0.004
bi+0.004
 Johnson+0.004
my+0.004
Neg logit contributions
renches-0.011
lights-0.011
 tant-0.011
obs-0.011
ware-0.011
Token He
Token ID679
Feature activation-3.31e-01

Pos logit contributions
renches+0.030
ware+0.030
obs+0.030
lights+0.030
 tant+0.028
Neg logit contributions
 Sue-0.015
 Wendy-0.013
 Daisy-0.013
 Bob-0.013
 Olive-0.013
Token runs
Token ID4539
Feature activation+3.28e-01

Pos logit contributions
lights+0.020
obs+0.019
renches+0.019
ware+0.019
 tant+0.019
Neg logit contributions
bi-0.013
 Wendy-0.013
 Betty-0.013
my-0.012
 Patty-0.012
Token and
Token ID290
Feature activation-1.08e-01

Pos logit contributions
bi+0.011
 Betty+0.011
 Wendy+0.011
 Johnson+0.010
my+0.010
Neg logit contributions
ware-0.021
lights-0.021
renches-0.021
obs-0.021
 tant-0.021
Token jumps
Token ID18045
Feature activation+1.23e-01

Pos logit contributions
 tant+0.005
ware+0.005
lights+0.005
renches+0.005
obs+0.005
Neg logit contributions
bi-0.006
 Betty-0.005
 Johnson-0.005
 Susan-0.005
 Karen-0.005
Token and
Token ID290
Feature activation+7.06e-01

Pos logit contributions
 Wendy+0.004
 Betty+0.004
my+0.004
 Sue+0.004
bi+0.004
Neg logit contributions
renches-0.008
ware-0.008
lights-0.008
obs-0.008
 tant-0.008
Token bar
Token ID2318
Feature activation+4.55e+00

Pos logit contributions
bi+0.032
 Betty+0.031
 Wendy+0.031
 Johnson+0.030
 Susan+0.030
Neg logit contributions
lights-0.038
ware-0.037
renches-0.037
 tant-0.037
obs-0.037
Tokenks
Token ID591
Feature activation+3.62e-01

Pos logit contributions
bi+0.211
 Wendy+0.195
my+0.194
 Johnson+0.193
 Betty+0.193
Neg logit contributions
renches-0.240
lights-0.239
ware-0.237
 tant-0.231
obs-0.228
Token.
Token ID13
Feature activation-5.42e-01

Pos logit contributions
 Sue+0.008
my+0.007
 Betty+0.007
 Wendy+0.007
 Daisy+0.007
Neg logit contributions
renches-0.028
ware-0.028
obs-0.027
lights-0.027
 tant-0.026
Token Anna
Token ID11735
Feature activation+1.24e+00

Pos logit contributions
obs+0.040
renches+0.040
lights+0.039
ware+0.039
 tant+0.038
Neg logit contributions
 Sue-0.018
 Bob-0.016
 Daisy-0.015
 Wendy-0.015
 Olive-0.015
Token and
Token ID290
Feature activation-1.40e+00

Pos logit contributions
bi+0.054
 Betty+0.052
 Wendy+0.052
 Johnson+0.051
my+0.050
Neg logit contributions
lights-0.069
renches-0.068
ware-0.068
 tant-0.067
obs-0.067
Token Ben
Token ID3932
Feature activation-4.57e-01

Pos logit contributions
ware+0.085
 tant+0.084
renches+0.083
lights+0.083
obs+0.083
Neg logit contributions
bi-0.056
 Betty-0.053
 Johnson-0.051
 Susan-0.051
 Karen-0.051
Token
Token ID198
Feature activation+4.83e-01

Pos logit contributions
bi+0.018
 Betty+0.017
 Wendy+0.017
 Johnson+0.017
 Susan+0.016
Neg logit contributions
lights-0.031
ware-0.031
renches-0.031
 tant-0.031
obs-0.030
Token
Token ID198
Feature activation+4.15e-01

Pos logit contributions
bi+0.018
 Betty+0.017
 Wendy+0.017
 Johnson+0.017
 Susan+0.016
Neg logit contributions
lights-0.030
ware-0.030
 tant-0.030
renches-0.029
obs-0.029
TokenThat
Token ID2504
Feature activation+2.52e+00

Pos logit contributions
bi+0.018
 Betty+0.017
 Johnson+0.017
 Susan+0.017
 Karen+0.016
Neg logit contributions
 tant-0.023
ware-0.023
lights-0.023
renches-0.023
obs-0.023
Token's
Token ID338
Feature activation+1.19e-01

Pos logit contributions
 Johnson+0.116
bi+0.113
 Karen+0.113
 Betty+0.113
 Wendy+0.108
Neg logit contributions
lights-0.124
ware-0.123
 tant-0.123
obs-0.123
oons-0.117
Token when
Token ID618
Feature activation-3.51e-01

Pos logit contributions
bi+0.006
 Johnson+0.005
 Betty+0.005
my+0.005
 Wendy+0.005
Neg logit contributions
lights-0.006
 tant-0.006
renches-0.006
ware-0.006
der-0.006
Token Olive
Token ID30012
Feature activation+4.52e+00

Pos logit contributions
 tant+0.025
ware+0.025
lights+0.025
der+0.024
renches+0.024
Neg logit contributions
bi-0.008
my-0.008
 reunited-0.007
 Johnson-0.007
ma-0.007
Token learned
Token ID4499
Feature activation-1.11e+00

Pos logit contributions
bi+0.152
 Betty+0.144
 Wendy+0.140
 Johnson+0.138
 Susan+0.136
Neg logit contributions
lights-0.295
ware-0.295
renches-0.294
 tant-0.293
obs-0.292
Token an
Token ID281
Feature activation-6.36e-01

Pos logit contributions
lights+0.072
ware+0.072
 tant+0.072
renches+0.072
obs+0.071
Neg logit contributions
bi-0.038
 Betty-0.035
 Johnson-0.034
 Wendy-0.034
my-0.033
Token important
Token ID1593
Feature activation-2.64e-02

Pos logit contributions
renches+0.031
ware+0.031
lights+0.031
 tant+0.031
obs+0.030
Neg logit contributions
bi-0.032
 Betty-0.031
 Wendy-0.031
 Johnson-0.030
 Susan-0.030
Token lesson
Token ID11483
Feature activation+2.88e-01

Pos logit contributions
lights+0.001
renches+0.001
ware+0.001
obs+0.001
 tant+0.001
Neg logit contributions
bi-0.001
 Betty-0.001
 Wendy-0.001
 Johnson-0.001
 Susan-0.001
Token -
Token ID532
Feature activation+6.42e-01

Pos logit contributions
bi+0.012
 Betty+0.011
 Wendy+0.011
 Johnson+0.011
 Susan+0.011
Neg logit contributions
lights-0.017
ware-0.017
renches-0.016
obs-0.016
 tant-0.016
Token could
Token ID714
Feature activation-8.80e-01

Pos logit contributions
lights+0.165
ware+0.164
 tant+0.163
renches+0.163
obs+0.162
Neg logit contributions
bi-0.076
 Betty-0.072
 Wendy-0.070
 Johnson-0.068
my-0.068
Token be
Token ID307
Feature activation+1.05e+00

Pos logit contributions
ware+0.055
renches+0.055
lights+0.055
obs+0.054
 tant+0.054
Neg logit contributions
bi-0.032
 Betty-0.030
 Wendy-0.029
 Johnson-0.029
 Karen-0.028
Token and
Token ID290
Feature activation+5.08e-01

Pos logit contributions
 Betty+0.031
 reunited+0.030
bi+0.030
 Susan+0.030
my+0.029
Neg logit contributions
ware-0.076
lights-0.074
renches-0.073
 tant-0.073
obs-0.073
Token never
Token ID1239
Feature activation+4.48e-01

Pos logit contributions
 Betty+0.015
bi+0.014
my+0.014
 Susan+0.014
 Sue+0.014
Neg logit contributions
renches-0.037
obs-0.037
ware-0.036
lights-0.036
 tant-0.035
Token mis
Token ID2984
Feature activation+7.16e-01

Pos logit contributions
bi+0.016
 Betty+0.015
 Wendy+0.015
 Johnson+0.015
 Karen+0.015
Neg logit contributions
 tant-0.028
ware-0.027
renches-0.027
lights-0.027
obs-0.027
Tokenbeh
Token ID20709
Feature activation+4.50e+00

Pos logit contributions
bi+0.029
 Betty+0.028
 Wendy+0.027
 Johnson+0.027
 Susan+0.027
Neg logit contributions
lights-0.042
ware-0.042
renches-0.041
 tant-0.041
obs-0.041
Tokenaved
Token ID9586
Feature activation+8.02e-02

Pos logit contributions
bi+0.197
 reunited+0.193
my+0.192
 Betty+0.185
 Wendy+0.181
Neg logit contributions
ware-0.261
lights-0.255
renches-0.235
 tant-0.232
obs-0.229
Token.
Token ID13
Feature activation+9.45e-01

Pos logit contributions
 Betty+0.002
 Johnson+0.002
 Wendy+0.002
 Susan+0.002
my+0.002
Neg logit contributions
renches-0.006
obs-0.006
lights-0.006
ware-0.006
 tant-0.005
Token<|endoftext|>
Token ID50256
Feature activation+4.87e-01

Pos logit contributions
bi+0.040
 Wendy+0.038
 Betty+0.037
 Johnson+0.036
 Susan+0.036
Neg logit contributions
 tant-0.052
obs-0.052
lights-0.052
renches-0.052
ware-0.052
TokenOnce
Token ID7454
Feature activation+8.28e-01

Pos logit contributions
bi+0.022
 Betty+0.021
 Wendy+0.020
 Susan+0.020
 Karen+0.020
Neg logit contributions
ware-0.026
lights-0.025
obs-0.025
renches-0.025
 tant-0.024
Token upon
Token ID2402
Feature activation-1.27e+00

Pos logit contributions
bi+0.034
 Betty+0.033
 Wendy+0.033
 Johnson+0.033
 Iris+0.032
Neg logit contributions
lights-0.047
 tant-0.046
ware-0.045
renches-0.043
obs-0.043
Token He
Token ID679
Feature activation-5.92e-01

Pos logit contributions
my+0.006
 Sue+0.006
 Susan+0.005
 Betty+0.005
 Sally+0.005
Neg logit contributions
 tant-0.010
ware-0.010
lights-0.010
obs-0.009
renches-0.009
Token was
Token ID373
Feature activation-2.27e-01

Pos logit contributions
lights+0.045
renches+0.045
obs+0.045
ware+0.044
 tant+0.043
Neg logit contributions
my-0.015
bi-0.014
 Karen-0.012
 Wendy-0.012
 Johnson-0.012
Token nearing
Token ID37242
Feature activation+1.37e-01

Pos logit contributions
ware+0.016
lights+0.016
renches+0.016
obs+0.016
 tant+0.015
Neg logit contributions
bi-0.007
my-0.006
 Johnson-0.006
 Betty-0.006
 Wendy-0.006
Token an
Token ID281
Feature activation+1.42e+00

Pos logit contributions
 home+0.002
 Karen+0.002
 Wendy+0.002
 Sue+0.002
 Betty+0.002
Neg logit contributions
obs-0.012
renches-0.012
lights-0.011
 tant-0.011
ware-0.011
Token elderly
Token ID13830
Feature activation+1.83e+00

Pos logit contributions
my+0.057
 home+0.052
ma+0.051
bi+0.051
 Karen+0.050
Neg logit contributions
ware-0.089
obs-0.087
lights-0.086
 tant-0.084
renches-0.084
Token lady
Token ID10846
Feature activation+4.49e+00

Pos logit contributions
bi+0.075
 Betty+0.072
 Wendy+0.070
 Johnson+0.070
 Susan+0.069
Neg logit contributions
lights-0.106
ware-0.106
renches-0.106
 tant-0.106
obs-0.105
Token's
Token ID338
Feature activation+1.08e+00

Pos logit contributions
bi+0.149
my+0.144
 Johnson+0.137
 Betty+0.136
 Wendy+0.134
Neg logit contributions
lights-0.303
renches-0.301
 tant-0.300
ware-0.296
obs-0.292
Token house
Token ID2156
Feature activation+6.06e-01

Pos logit contributions
bi+0.042
 Betty+0.040
 Wendy+0.040
 Johnson+0.039
 Susan+0.039
Neg logit contributions
lights-0.065
ware-0.065
renches-0.065
 tant-0.065
obs-0.064
Token.
Token ID13
Feature activation-1.15e-01

Pos logit contributions
my+0.017
bi+0.017
 Johnson+0.016
 Betty+0.015
 Susan+0.015
Neg logit contributions
renches-0.044
 tant-0.044
lights-0.043
ware-0.043
obs-0.042
Token Jake
Token ID14757
Feature activation-1.70e+00

Pos logit contributions
ware+0.008
 tant+0.008
renches+0.007
lights+0.007
obs+0.007
Neg logit contributions
 Betty-0.004
bi-0.004
 Susan-0.004
my-0.004
 Wendy-0.004
Token enjoyed
Token ID8359
Feature activation-5.09e-01

Pos logit contributions
lights+0.117
renches+0.116
 tant+0.116
obs+0.115
ware+0.115
Neg logit contributions
bi-0.053
 Betty-0.050
 Wendy-0.048
my-0.048
 Johnson-0.048
Token slowly
Token ID6364
Feature activation-4.15e-01

Pos logit contributions
ware+0.087
 tant+0.085
obs+0.083
lights+0.082
renches+0.082
Neg logit contributions
bi-0.053
 Betty-0.052
 Johnson-0.052
 Wendy-0.051
 Karen-0.050
Token slowly
Token ID6364
Feature activation-1.05e-01

Pos logit contributions
lights+0.027
 tant+0.027
renches+0.027
ware+0.026
obs+0.026
Neg logit contributions
bi-0.015
 Betty-0.014
 Wendy-0.014
my-0.013
 Susan-0.013
Token reached
Token ID4251
Feature activation+7.55e-01

Pos logit contributions
lights+0.007
 tant+0.007
renches+0.007
ware+0.007
obs+0.007
Neg logit contributions
bi-0.004
 Betty-0.004
 Wendy-0.003
my-0.003
 Susan-0.003
Token out
Token ID503
Feature activation+3.64e-01

Pos logit contributions
bi+0.032
 Johnson+0.030
 Karen+0.029
isi+0.029
 Wendy+0.029
Neg logit contributions
ware-0.043
lights-0.042
 tant-0.041
renches-0.041
obs-0.041
Token and
Token ID290
Feature activation+7.87e-01

Pos logit contributions
bi+0.017
 Wendy+0.015
 Johnson+0.015
 Karen+0.015
isi+0.015
Neg logit contributions
lights-0.020
 tant-0.019
ware-0.019
obs-0.019
renches-0.019
Token pet
Token ID4273
Feature activation+4.47e+00

Pos logit contributions
bi+0.033
 Johnson+0.031
 Wendy+0.030
 Karen+0.029
 Betty+0.029
Neg logit contributions
ware-0.045
 tant-0.044
obs-0.044
lights-0.043
renches-0.043
Tokented
Token ID1513
Feature activation+1.26e+00

Pos logit contributions
bi+0.207
 Betty+0.189
 Johnson+0.187
 Wendy+0.186
 Susan+0.184
Neg logit contributions
ware-0.240
 tant-0.237
obs-0.235
lights-0.233
renches-0.230
Token the
Token ID262
Feature activation+8.23e-01

Pos logit contributions
bi+0.042
 Betty+0.037
 Wendy+0.037
 Johnson+0.037
my+0.036
Neg logit contributions
ware-0.083
lights-0.083
 tant-0.083
renches-0.083
obs-0.082
Token bird
Token ID6512
Feature activation+1.71e+00

Pos logit contributions
 Johnson+0.030
bi+0.029
my+0.028
 Susan+0.027
 Betty+0.027
Neg logit contributions
renches-0.055
lights-0.054
obs-0.053
ware-0.052
 tant-0.052
Token gently
Token ID15165
Feature activation+1.11e+00

Pos logit contributions
bi+0.062
 Betty+0.059
 Wendy+0.058
 Johnson+0.057
 Susan+0.056
Neg logit contributions
lights-0.107
ware-0.106
 tant-0.106
obs-0.105
renches-0.105
Token.
Token ID13
Feature activation+5.85e-01

Pos logit contributions
bi+0.029
 Betty+0.028
 Wendy+0.027
 Johnson+0.027
my+0.026
Neg logit contributions
renches-0.083
lights-0.083
 tant-0.080
obs-0.080
ware-0.079
Token reached
Token ID4251
Feature activation+9.57e-01

Pos logit contributions
bi+0.046
 Betty+0.044
 Wendy+0.043
 Johnson+0.043
 Susan+0.042
Neg logit contributions
ware-0.080
 tant-0.079
lights-0.079
obs-0.079
renches-0.078
Token in
Token ID287
Feature activation+6.92e-01

Pos logit contributions
bi+0.037
 Johnson+0.035
 Betty+0.035
 Wendy+0.035
 Karen+0.034
Neg logit contributions
ware-0.057
lights-0.056
obs-0.055
 tant-0.055
renches-0.055
Token and
Token ID290
Feature activation+1.01e+00

Pos logit contributions
bi+0.034
 Betty+0.032
 Wendy+0.032
 Johnson+0.031
 Karen+0.031
Neg logit contributions
 tant-0.035
ware-0.035
lights-0.034
obs-0.034
renches-0.033
Token picked
Token ID6497
Feature activation+1.73e+00

Pos logit contributions
bi+0.043
 Johnson+0.041
 Betty+0.039
 Wendy+0.039
 Karen+0.039
Neg logit contributions
ware-0.055
obs-0.055
 tant-0.054
lights-0.052
renches-0.052
Token a
Token ID257
Feature activation+1.66e+00

Pos logit contributions
bi+0.075
 Johnson+0.073
 Betty+0.070
my+0.069
 Susan+0.069
Neg logit contributions
ware-0.094
 tant-0.092
obs-0.092
renches-0.090
lights-0.087
Token sweet
Token ID6029
Feature activation+4.47e+00

Pos logit contributions
bi+0.071
 Betty+0.068
 Wendy+0.067
 Susan+0.066
 Johnson+0.066
Neg logit contributions
 tant-0.093
ware-0.092
obs-0.090
renches-0.089
lights-0.089
Token b
Token ID275
Feature activation+2.15e+00

Pos logit contributions
bi+0.171
 Betty+0.164
 Wendy+0.162
 Johnson+0.159
 Susan+0.157
Neg logit contributions
ware-0.269
 tant-0.268
obs-0.260
lights-0.259
renches-0.258
Tokenerry
Token ID6996
Feature activation+2.04e+00

Pos logit contributions
bi+0.083
 Betty+0.082
 Wendy+0.081
 Susan+0.079
 Johnson+0.079
Neg logit contributions
obs-0.129
 tant-0.128
ware-0.127
renches-0.127
lights-0.126
Token.
Token ID13
Feature activation+4.48e-01

Pos logit contributions
bi+0.081
 Betty+0.077
 Wendy+0.076
 Susan+0.076
 Karen+0.076
Neg logit contributions
ware-0.122
obs-0.119
 tant-0.118
lights-0.117
renches-0.115
Token It
Token ID632
Feature activation+5.60e-01

Pos logit contributions
bi+0.019
 Betty+0.018
 Wendy+0.018
 Johnson+0.018
my+0.018
Neg logit contributions
ware-0.025
obs-0.025
lights-0.025
renches-0.024
 tant-0.024
Token tasted
Token ID29187
Feature activation+1.04e+00

Pos logit contributions
bi+0.022
 Johnson+0.021
 Karen+0.021
 Wendy+0.020
 Betty+0.020
Neg logit contributions
ware-0.033
lights-0.032
 tant-0.031
obs-0.030
renches-0.030
Token pet
Token ID4273
Feature activation+3.07e+00

Pos logit contributions
renches+0.041
 tant+0.041
lights+0.040
ware+0.038
obs+0.037
Neg logit contributions
 Betty-0.027
 Wendy-0.026
 Sue-0.026
 Patty-0.026
 Daisy-0.025
Tokenals
Token ID874
Feature activation-6.67e-01

Pos logit contributions
bi+0.169
 Betty+0.163
 Wendy+0.161
 Johnson+0.160
 Susan+0.158
Neg logit contributions
lights-0.135
ware-0.135
renches-0.134
 tant-0.134
obs-0.134
Token and
Token ID290
Feature activation-3.25e-01

Pos logit contributions
renches+0.047
 tant+0.046
ware+0.045
lights+0.045
obs+0.044
Neg logit contributions
bi-0.021
my-0.021
 Betty-0.021
 Wendy-0.020
 Patty-0.019
Token smelled
Token ID37609
Feature activation+8.98e-01

Pos logit contributions
 tant+0.022
renches+0.022
lights+0.022
obs+0.022
ware+0.022
Neg logit contributions
 Betty-0.010
 Wendy-0.010
bi-0.010
 Sue-0.010
 Susan-0.010
Token so
Token ID523
Feature activation-1.93e-02

Pos logit contributions
bi+0.035
 Betty+0.034
 Wendy+0.033
 Johnson+0.033
 Susan+0.032
Neg logit contributions
lights-0.054
renches-0.054
ware-0.054
 tant-0.053
obs-0.053
Token sweet
Token ID6029
Feature activation+4.45e+00

Pos logit contributions
lights+0.001
 tant+0.001
ware+0.001
obs+0.001
renches+0.001
Neg logit contributions
bi-0.001
 Johnson-0.001
my-0.001
 Betty-0.001
 Wendy-0.001
Token.
Token ID13
Feature activation+3.18e-02

Pos logit contributions
bi+0.168
 Betty+0.158
 Wendy+0.156
 Johnson+0.153
 Susan+0.151
Neg logit contributions
lights-0.276
ware-0.275
obs-0.274
 tant-0.271
renches-0.269
Token He
Token ID679
Feature activation-4.92e-01

Pos logit contributions
bi+0.001
 Betty+0.001
 Wendy+0.001
 Johnson+0.001
 Susan+0.001
Neg logit contributions
lights-0.002
ware-0.002
renches-0.002
 tant-0.002
obs-0.002
Token thought
Token ID1807
Feature activation+1.55e-01

Pos logit contributions
ware+0.030
lights+0.029
 tant+0.029
renches+0.029
obs+0.029
Neg logit contributions
bi-0.019
 Betty-0.018
 Johnson-0.018
 Wendy-0.018
 Susan-0.017
Token it
Token ID340
Feature activation-2.03e+00

Pos logit contributions
 Sue+0.004
 Sally+0.003
 Daisy+0.003
 Betty+0.003
 her+0.003
Neg logit contributions
obs-0.013
renches-0.013
lights-0.013
 tant-0.012
ifying-0.012
Token was
Token ID373
Feature activation-8.94e-01

Pos logit contributions
ware+0.116
lights+0.114
obs+0.110
renches+0.110
 tant+0.108
Neg logit contributions
bi-0.085
 Johnson-0.077
 Wendy-0.077
 Betty-0.076
isi-0.076
Token It
Token ID632
Feature activation+6.06e-01

Pos logit contributions
 Wendy+0.011
 Betty+0.011
 Daisy+0.011
 Johnson+0.011
 Susan+0.011
Neg logit contributions
ware-0.014
 tant-0.014
renches-0.014
lights-0.014
obs-0.014
Token was
Token ID373
Feature activation+1.25e-01

Pos logit contributions
bi+0.023
 Johnson+0.022
 Betty+0.022
 Karen+0.022
 Wendy+0.022
Neg logit contributions
lights-0.036
ware-0.036
 tant-0.035
renches-0.034
obs-0.034
Token warm
Token ID5814
Feature activation+1.19e+00

Pos logit contributions
bi+0.004
 Betty+0.004
 Wendy+0.004
 Johnson+0.004
 Susan+0.004
Neg logit contributions
lights-0.008
ware-0.008
renches-0.008
 tant-0.008
obs-0.008
Token and
Token ID290
Feature activation+3.47e-02

Pos logit contributions
bi+0.054
 Betty+0.051
 Wendy+0.051
 Johnson+0.050
 Susan+0.049
Neg logit contributions
lights-0.065
ware-0.065
obs-0.064
 tant-0.064
renches-0.064
Token so
Token ID523
Feature activation+5.03e-02

Pos logit contributions
bi+0.001
 Betty+0.001
 Wendy+0.001
 Johnson+0.001
 Susan+0.001
Neg logit contributions
renches-0.002
ware-0.002
obs-0.002
 tant-0.002
lights-0.002
Token sweet
Token ID6029
Feature activation+4.44e+00

Pos logit contributions
bi+0.002
 Betty+0.002
 Wendy+0.002
 Johnson+0.002
 Susan+0.002
Neg logit contributions
lights-0.003
ware-0.003
 tant-0.003
renches-0.003
obs-0.003
Token.
Token ID13
Feature activation+3.28e-01

Pos logit contributions
bi+0.176
 Betty+0.162
 Wendy+0.161
 Johnson+0.158
 Karen+0.155
Neg logit contributions
lights-0.268
ware-0.267
obs-0.264
 tant-0.259
renches-0.256
Token She
Token ID1375
Feature activation+1.29e+00

Pos logit contributions
 Betty+0.014
 Wendy+0.013
bi+0.013
 Johnson+0.013
 Susan+0.013
Neg logit contributions
ware-0.019
 tant-0.019
renches-0.019
lights-0.019
obs-0.019
Token liked
Token ID8288
Feature activation-7.24e-01

Pos logit contributions
bi+0.043
 Betty+0.041
 Johnson+0.040
 Wendy+0.040
 Susan+0.039
Neg logit contributions
ware-0.084
lights-0.083
 tant-0.083
obs-0.082
renches-0.082
Token it
Token ID340
Feature activation-1.31e+00

Pos logit contributions
renches+0.056
obs+0.055
lights+0.055
 tant+0.054
ware+0.054
Neg logit contributions
 Betty-0.019
 Wendy-0.018
bi-0.017
 Sue-0.017
 Daisy-0.017
Token very
Token ID845
Feature activation-9.76e-01

Pos logit contributions
ware+0.076
lights+0.076
obs+0.074
 tant+0.074
renches+0.073
Neg logit contributions
bi-0.053
 Betty-0.050
 Wendy-0.049
 Johnson-0.049
 Karen-0.048
Token see
Token ID766
Feature activation-8.80e-01

Pos logit contributions
bi+0.033
 Betty+0.032
 Wendy+0.031
 Johnson+0.031
 Susan+0.030
Neg logit contributions
lights-0.054
ware-0.054
 tant-0.054
renches-0.053
obs-0.053
Token swings
Token ID26728
Feature activation+1.85e+00

Pos logit contributions
obs+0.069
renches+0.069
ware+0.067
 tant+0.065
lights+0.064
Neg logit contributions
 Sue-0.022
 Wendy-0.022
 Karen-0.021
 Daisy-0.020
 Iris-0.020
Token and
Token ID290
Feature activation+4.61e-01

Pos logit contributions
 Sue+0.052
bi+0.051
 Wendy+0.051
 Betty+0.051
 Johnson+0.051
Neg logit contributions
renches-0.134
obs-0.132
ware-0.131
 tant-0.131
lights-0.126
Token slides
Token ID19392
Feature activation+7.79e-01

Pos logit contributions
bi+0.023
 Betty+0.022
 Wendy+0.022
 Johnson+0.022
 Susan+0.021
Neg logit contributions
lights-0.023
ware-0.023
 tant-0.022
renches-0.022
obs-0.022
Token and
Token ID290
Feature activation+6.73e-01

Pos logit contributions
bi+0.027
 Johnson+0.026
 Betty+0.026
 Wendy+0.025
my+0.025
Neg logit contributions
renches-0.051
ware-0.050
 tant-0.050
obs-0.049
lights-0.049
Token k
Token ID479
Feature activation+4.44e+00

Pos logit contributions
bi+0.029
 Betty+0.028
my+0.028
 Sue+0.027
 Wendy+0.027
Neg logit contributions
renches-0.038
obs-0.038
ware-0.038
 tant-0.037
lights-0.037
Tokenites
Token ID2737
Feature activation+1.95e+00

Pos logit contributions
 Betty+0.292
bi+0.290
 Susan+0.288
 Wendy+0.286
 Daisy+0.284
Neg logit contributions
obs-0.160
lights-0.142
renches-0.140
 tant-0.137
 boards-0.133
Token.
Token ID13
Feature activation-1.80e-02

Pos logit contributions
bi+0.061
 Betty+0.058
my+0.057
 Wendy+0.057
 Johnson+0.056
Neg logit contributions
renches-0.133
 tant-0.131
ware-0.131
obs-0.129
lights-0.129
Token They
Token ID1119
Feature activation+1.03e+00

Pos logit contributions
ware+0.001
renches+0.001
obs+0.001
 tant+0.001
lights+0.001
Neg logit contributions
 Sue-0.001
 Wendy-0.001
 Iris-0.001
 Daisy-0.001
 Olive-0.001
Token see
Token ID766
Feature activation-8.00e-01

Pos logit contributions
bi+0.040
 Betty+0.038
 Johnson+0.038
 Wendy+0.037
 Susan+0.037
Neg logit contributions
 tant-0.061
lights-0.061
ware-0.061
renches-0.060
obs-0.060
Token clouds
Token ID15114
Feature activation+3.50e-01

Pos logit contributions
obs+0.062
renches+0.062
ware+0.060
 tant+0.059
inters+0.057
Neg logit contributions
 Sue-0.022
 Wendy-0.021
 Karen-0.020
 Daisy-0.019
 Iris-0.019
Token it
Token ID340
Feature activation-6.65e-01

Pos logit contributions
bi+0.025
 Betty+0.024
 Susan+0.024
 Wendy+0.023
 Johnson+0.023
Neg logit contributions
renches-0.043
ware-0.042
lights-0.041
obs-0.041
 tant-0.041
Token smelled
Token ID37609
Feature activation+3.14e-01

Pos logit contributions
renches+0.045
lights+0.045
 tant+0.045
obs+0.045
ware+0.045
Neg logit contributions
bi-0.021
 Betty-0.020
 Wendy-0.019
my-0.019
 Johnson-0.019
Token and
Token ID290
Feature activation+3.78e-01

Pos logit contributions
 Daisy+0.009
 Sally+0.008
my+0.008
 welcome+0.008
 Betty+0.008
Neg logit contributions
ware-0.023
obs-0.023
renches-0.023
lights-0.022
angers-0.022
Token tasted
Token ID29187
Feature activation+1.27e+00

Pos logit contributions
bi+0.013
 Betty+0.012
 Wendy+0.012
 Johnson+0.012
 Susan+0.011
Neg logit contributions
lights-0.025
ware-0.025
 tant-0.025
renches-0.025
obs-0.024
Token so
Token ID523
Feature activation-1.79e-01

Pos logit contributions
 Betty+0.039
bi+0.039
 Daisy+0.038
 Wendy+0.037
 Susan+0.037
Neg logit contributions
renches-0.086
ware-0.086
lights-0.085
obs-0.085
 tant-0.084
Token sweet
Token ID6029
Feature activation+4.42e+00

Pos logit contributions
ware+0.011
renches+0.011
lights+0.011
obs+0.011
 tant+0.011
Neg logit contributions
bi-0.007
 Betty-0.007
 Wendy-0.006
 Susan-0.006
 Johnson-0.006
Token.
Token ID13
Feature activation+5.77e-01

Pos logit contributions
bi+0.133
 Betty+0.126
 Wendy+0.122
 Johnson+0.120
 Susan+0.118
Neg logit contributions
renches-0.305
lights-0.304
ware-0.304
 tant-0.304
obs-0.301
Token Mom
Token ID11254
Feature activation+1.63e+00

Pos logit contributions
bi+0.022
 Betty+0.022
 Wendy+0.022
 Johnson+0.021
 Susan+0.021
Neg logit contributions
lights-0.035
renches-0.035
ware-0.035
 tant-0.035
obs-0.035
Tokenmy
Token ID1820
Feature activation-6.32e-01

Pos logit contributions
 Iris+0.033
 Wendy+0.033
 Betty+0.031
 Karen+0.029
 Patty+0.029
Neg logit contributions
ware-0.122
lights-0.121
obs-0.120
oons-0.117
 tant-0.117
Token was
Token ID373
Feature activation+2.11e-01

Pos logit contributions
ware+0.040
lights+0.039
obs+0.039
 tant+0.038
renches+0.038
Neg logit contributions
bi-0.022
 Wendy-0.021
 Betty-0.021
 Johnson-0.021
 Karen-0.020
Token relieved
Token ID22598
Feature activation+2.15e+00

Pos logit contributions
bi+0.008
 Betty+0.007
 Wendy+0.007
 Johnson+0.007
 Susan+0.007
Neg logit contributions
lights-0.013
ware-0.013
 tant-0.013
renches-0.013
obs-0.013
TokenThey
Token ID2990
Feature activation+8.97e-01

Pos logit contributions
renches+0.028
lights+0.028
obs+0.028
 boards+0.027
ware+0.027
Neg logit contributions
bi-0.024
 Betty-0.022
 Wendy-0.022
 Patty-0.022
 Karen-0.022
Token picked
Token ID6497
Feature activation+1.11e+00

Pos logit contributions
 Betty+0.038
bi+0.038
 Johnson+0.037
 Patty+0.037
 Wendy+0.037
Neg logit contributions
 tant-0.051
ware-0.050
renches-0.050
lights-0.049
obs-0.049
Token up
Token ID510
Feature activation+7.56e-01

Pos logit contributions
bi+0.037
 Johnson+0.033
 Betty+0.033
 Susan+0.032
 Karen+0.032
Neg logit contributions
 tant-0.075
obs-0.074
ware-0.073
renches-0.072
 boards-0.071
Token the
Token ID262
Feature activation+1.38e-01

Pos logit contributions
bi+0.023
 Betty+0.021
 Wendy+0.021
 Johnson+0.020
 Susan+0.020
Neg logit contributions
lights-0.052
 tant-0.052
ware-0.052
obs-0.051
renches-0.051
Token fallen
Token ID9292
Feature activation-7.15e-01

Pos logit contributions
bi+0.006
 Betty+0.005
 Wendy+0.005
 Johnson+0.005
 Susan+0.005
Neg logit contributions
lights-0.008
ware-0.008
renches-0.008
 tant-0.008
obs-0.008
Token pet
Token ID4273
Feature activation+4.40e+00

Pos logit contributions
renches+0.044
lights+0.043
obs+0.042
 tant+0.042
ware+0.041
Neg logit contributions
bi-0.028
 Johnson-0.027
 Betty-0.027
my-0.026
 Wendy-0.026
Tokenals
Token ID874
Feature activation+7.00e-01

Pos logit contributions
bi+0.196
 Betty+0.195
 Johnson+0.192
 Wendy+0.192
my+0.189
Neg logit contributions
renches-0.239
lights-0.235
 tant-0.233
ware-0.229
obs-0.228
Token and
Token ID290
Feature activation+5.59e-01

Pos logit contributions
my+0.023
bi+0.023
 Betty+0.023
 Sue+0.022
 Daisy+0.022
Neg logit contributions
renches-0.047
 tant-0.046
obs-0.044
ware-0.044
lights-0.043
Token tried
Token ID3088
Feature activation+1.26e+00

Pos logit contributions
bi+0.021
 Betty+0.019
 Wendy+0.019
 Johnson+0.019
 Susan+0.018
Neg logit contributions
ware-0.035
lights-0.035
 tant-0.034
renches-0.034
obs-0.034
Token to
Token ID284
Feature activation+6.87e-01

Pos logit contributions
bi+0.054
 Betty+0.052
 Johnson+0.051
 Wendy+0.051
my+0.050
Neg logit contributions
renches-0.073
lights-0.073
ware-0.072
 tant-0.072
obs-0.072
Token put
Token ID1234
Feature activation+1.57e+00

Pos logit contributions
bi+0.036
 Wendy+0.033
 Betty+0.033
 Patty+0.032
 Johnson+0.032
Neg logit contributions
 tant-0.033
renches-0.032
ware-0.032
lights-0.032
ro-0.030
Token,
Token ID11
Feature activation+5.11e-01

Pos logit contributions
ware+0.105
renches+0.104
obs+0.104
lights+0.102
 tant+0.099
Neg logit contributions
 Sue-0.053
 Wendy-0.053
 Betty-0.052
 Patty-0.050
bi-0.049
Token jumped
Token ID11687
Feature activation-2.07e-02

Pos logit contributions
bi+0.017
my+0.017
 Wendy+0.017
 Betty+0.017
 Sue+0.016
Neg logit contributions
renches-0.034
lights-0.034
ware-0.033
 tant-0.033
obs-0.033
Token rope
Token ID17182
Feature activation+1.55e+00

Pos logit contributions
renches+0.001
ware+0.001
lights+0.001
 tant+0.001
obs+0.001
Neg logit contributions
bi-0.001
 Wendy-0.001
 Betty-0.001
 Johnson-0.001
 Karen-0.001
Token and
Token ID290
Feature activation+7.75e-01

Pos logit contributions
bi+0.055
 Wendy+0.053
 Betty+0.053
 Johnson+0.051
my+0.051
Neg logit contributions
renches-0.100
ware-0.098
 tant-0.098
lights-0.097
obs-0.097
Token flew
Token ID13112
Feature activation+1.07e+00

Pos logit contributions
bi+0.028
 Betty+0.027
 Wendy+0.026
 Johnson+0.026
 Susan+0.026
Neg logit contributions
lights-0.049
ware-0.049
renches-0.048
 tant-0.048
obs-0.048
Token k
Token ID479
Feature activation+4.38e+00

Pos logit contributions
bi+0.045
 Betty+0.042
 Johnson+0.042
my+0.040
 Susan+0.040
Neg logit contributions
lights-0.062
 tant-0.060
obs-0.060
ware-0.059
renches-0.058
Tokenites
Token ID2737
Feature activation+2.12e+00

Pos logit contributions
bi+0.257
 Betty+0.257
 Susan+0.253
 Johnson+0.247
 Wendy+0.246
Neg logit contributions
obs-0.180
 tant-0.174
lights-0.172
ware-0.169
renches-0.168
Token.
Token ID13
Feature activation+3.40e-01

Pos logit contributions
bi+0.085
 Betty+0.082
 Wendy+0.080
 Johnson+0.079
 Susan+0.078
Neg logit contributions
lights-0.124
ware-0.124
obs-0.123
 tant-0.123
renches-0.122
Token They
Token ID1119
Feature activation+1.15e+00

Pos logit contributions
 Daisy+0.013
 Sally+0.012
 Wendy+0.012
 Betty+0.012
 Sue+0.012
Neg logit contributions
ware-0.024
 tant-0.024
renches-0.023
lights-0.023
obs-0.022
Token laughed
Token ID13818
Feature activation-1.67e-01

Pos logit contributions
bi+0.044
 Johnson+0.043
 Betty+0.043
 Susan+0.041
 Wendy+0.041
Neg logit contributions
 tant-0.069
ware-0.069
lights-0.068
obs-0.067
renches-0.066
Token and
Token ID290
Feature activation-7.49e-01

Pos logit contributions
renches+0.010
ware+0.010
 tant+0.010
lights+0.010
obs+0.010
Neg logit contributions
 Betty-0.006
 Wendy-0.006
bi-0.006
 Susan-0.006
my-0.006
Token eventually
Token ID4191
Feature activation-3.87e-01

Pos logit contributions
bi+0.020
 Betty+0.018
 Johnson+0.018
 Wendy+0.018
 Susan+0.018
Neg logit contributions
 tant-0.031
lights-0.031
ware-0.031
renches-0.031
obs-0.031
Token he
Token ID339
Feature activation-1.03e+00

Pos logit contributions
ware+0.015
 boards+0.015
der+0.015
obs+0.015
lights+0.015
Neg logit contributions
bi-0.021
 Johnson-0.021
my-0.020
isi-0.019
Elizabeth-0.019
Token reached
Token ID4251
Feature activation+7.88e-01

Pos logit contributions
ware+0.060
 tant+0.059
obs+0.059
lights+0.058
renches+0.057
Neg logit contributions
bi-0.040
 Johnson-0.039
 Betty-0.039
 Susan-0.038
 Wendy-0.038
Token out
Token ID503
Feature activation+1.97e-01

Pos logit contributions
bi+0.024
 Betty+0.023
 Wendy+0.022
 Johnson+0.022
 Susan+0.021
Neg logit contributions
lights-0.054
ware-0.054
renches-0.054
 tant-0.054
obs-0.054
Token and
Token ID290
Feature activation+7.98e-01

Pos logit contributions
bi+0.007
 Betty+0.007
 Wendy+0.007
 Susan+0.006
 Johnson+0.006
Neg logit contributions
renches-0.013
ware-0.013
obs-0.013
 tant-0.013
lights-0.013
Token pet
Token ID4273
Feature activation+4.36e+00

Pos logit contributions
bi+0.030
 Johnson+0.029
 Betty+0.027
 Wendy+0.027
isi+0.027
Neg logit contributions
 tant-0.047
ware-0.046
obs-0.046
lights-0.045
renches-0.045
Tokented
Token ID1513
Feature activation+1.19e+00

Pos logit contributions
bi+0.158
 Betty+0.149
 Wendy+0.146
 Johnson+0.144
 Susan+0.142
Neg logit contributions
lights-0.274
ware-0.274
 tant-0.273
renches-0.272
obs-0.271
Token the
Token ID262
Feature activation+7.38e-01

Pos logit contributions
bi+0.032
 Betty+0.032
 Wendy+0.030
 Johnson+0.030
 Susan+0.029
Neg logit contributions
lights-0.085
renches-0.085
ware-0.085
 tant-0.085
obs-0.085
Token wolf
Token ID17481
Feature activation+1.41e+00

Pos logit contributions
bi+0.022
 Betty+0.021
 Wendy+0.021
 Johnson+0.020
 Susan+0.020
Neg logit contributions
lights-0.051
 tant-0.051
ware-0.051
renches-0.051
obs-0.050
Token.
Token ID13
Feature activation+4.13e-01

Pos logit contributions
bi+0.054
 Betty+0.051
 Wendy+0.050
 Johnson+0.049
 Susan+0.049
Neg logit contributions
ware-0.084
lights-0.083
obs-0.082
 tant-0.082
renches-0.081
Token To
Token ID1675
Feature activation+6.45e-01

Pos logit contributions
 Betty+0.018
 Wendy+0.018
 Johnson+0.018
 Karen+0.017
 Sue+0.017
Neg logit contributions
lights-0.025
ware-0.024
 tant-0.024
renches-0.024
obs-0.023
Token colour
Token ID9568
Feature activation-5.78e-01

Pos logit contributions
bi+0.024
 Johnson+0.023
my+0.023
 Betty+0.023
 Wendy+0.023
Neg logit contributions
renches-0.037
lights-0.036
obs-0.034
 tant-0.034
ware-0.032
Token and
Token ID290
Feature activation-3.97e-01

Pos logit contributions
renches+0.042
 tant+0.041
lights+0.039
 Some+0.038
obs+0.038
Neg logit contributions
my-0.016
bi-0.016
 Johnson-0.015
 Betty-0.015
 Daisy-0.015
Token it
Token ID340
Feature activation-9.21e-01

Pos logit contributions
 tant+0.024
renches+0.024
obs+0.024
lights+0.024
ware+0.024
Neg logit contributions
bi-0.015
 Betty-0.015
 Wendy-0.015
 Susan-0.014
 Johnson-0.014
Token smelled
Token ID37609
Feature activation+5.63e-01

Pos logit contributions
renches+0.058
lights+0.058
 tant+0.058
ware+0.058
obs+0.058
Neg logit contributions
bi-0.033
 Betty-0.031
 Wendy-0.031
 Johnson-0.030
my-0.030
Token so
Token ID523
Feature activation-2.83e-01

Pos logit contributions
 Daisy+0.015
 Sally+0.014
my+0.014
 welcome+0.014
 Sue+0.014
Neg logit contributions
renches-0.041
obs-0.040
ware-0.040
lights-0.039
 tant-0.039
Token sweet
Token ID6029
Feature activation+4.36e+00

Pos logit contributions
renches+0.018
 tant+0.018
obs+0.018
ware+0.018
lights+0.018
Neg logit contributions
bi-0.010
 Betty-0.010
 Wendy-0.010
 Susan-0.009
 Karen-0.009
Token.
Token ID13
Feature activation+8.87e-02

Pos logit contributions
bi+0.150
 Betty+0.141
 Wendy+0.138
 Johnson+0.136
 Susan+0.134
Neg logit contributions
lights-0.284
ware-0.283
obs-0.281
 tant-0.280
renches-0.280
Token She
Token ID1375
Feature activation+8.72e-01

Pos logit contributions
bi+0.003
 Betty+0.003
 Wendy+0.003
 Johnson+0.002
 Susan+0.002
Neg logit contributions
lights-0.006
renches-0.006
ware-0.006
 tant-0.006
obs-0.006
Token couldn
Token ID3521
Feature activation-7.18e-01

Pos logit contributions
bi+0.029
 Betty+0.027
 Wendy+0.027
my+0.027
 Susan+0.026
Neg logit contributions
lights-0.058
renches-0.058
 tant-0.057
obs-0.057
ware-0.057
Token't
Token ID470
Feature activation+1.50e+00

Pos logit contributions
lights+0.048
 tant+0.047
renches+0.047
obs+0.047
ware+0.047
Neg logit contributions
bi-0.023
 Betty-0.023
 Wendy-0.023
 Johnson-0.022
 Susan-0.022
Token wait
Token ID4043
Feature activation+6.86e-01

Pos logit contributions
 Betty+0.032
 Daisy+0.030
bi+0.029
 Sue+0.029
my+0.029
Neg logit contributions
lights-0.119
renches-0.114
obs-0.113
 tant-0.112
ware-0.111
Tokenmy
Token ID1820
Feature activation-2.09e+00

Pos logit contributions
ware+0.009
lights+0.009
renches+0.009
obs+0.009
 tant+0.009
Neg logit contributions
bi-0.004
 Betty-0.004
 Wendy-0.004
 Johnson-0.004
 Patty-0.003
Token had
Token ID550
Feature activation-1.31e+00

Pos logit contributions
lights+0.146
 tant+0.146
ware+0.145
renches+0.144
obs+0.143
Neg logit contributions
bi-0.063
 Betty-0.059
my-0.058
 Johnson-0.057
 Wendy-0.057
Token told
Token ID1297
Feature activation-5.45e-01

Pos logit contributions
lights+0.101
 tant+0.099
ware+0.095
renches+0.095
obs+0.094
Neg logit contributions
my-0.034
bi-0.032
 Betty-0.032
 reunited-0.031
 Susan-0.031
Token him
Token ID683
Feature activation-2.74e+00

Pos logit contributions
lights+0.038
ware+0.038
renches+0.038
 tant+0.038
obs+0.038
Neg logit contributions
bi-0.016
 Betty-0.015
 Wendy-0.014
 Johnson-0.014
 Susan-0.014
Token long
Token ID890
Feature activation-4.43e+00

Pos logit contributions
lights+0.154
ware+0.154
 tant+0.153
renches+0.153
obs+0.152
Neg logit contributions
bi-0.117
 Betty-0.112
 Wendy-0.110
 Johnson-0.109
 Susan-0.108
Token ago
Token ID2084
Feature activation-7.55e+00

Pos logit contributions
lights+0.240
obs+0.233
ware+0.231
renches+0.229
 tant+0.220
Neg logit contributions
my-0.204
bi-0.200
 Betty-0.196
 Susan-0.189
 Sue-0.188
Token.
Token ID13
Feature activation+3.73e-01

Pos logit contributions
 tant+0.578
renches+0.576
lights+0.568
obs+0.557
ware+0.549
Neg logit contributions
bi-0.184
my-0.181
 Johnson-0.181
 Susan-0.176
 Sue-0.172
Token The
Token ID383
Feature activation-8.85e-01

Pos logit contributions
bi+0.014
 Betty+0.014
 Wendy+0.013
 Johnson+0.013
 Susan+0.013
Neg logit contributions
lights-0.023
ware-0.023
 tant-0.023
renches-0.023
obs-0.023
Token bird
Token ID6512
Feature activation+6.15e-01

Pos logit contributions
lights+0.053
obs+0.053
ware+0.053
renches+0.052
 tant+0.052
Neg logit contributions
bi-0.034
 Betty-0.033
 Wendy-0.033
 Johnson-0.033
my-0.033
Token spread
Token ID4104
Feature activation+3.27e-01

Pos logit contributions
bi+0.020
 Betty+0.019
 Wendy+0.018
 Johnson+0.018
 Susan+0.018
Neg logit contributions
lights-0.041
ware-0.041
 tant-0.041
renches-0.041
obs-0.041
Token out
Token ID503
Feature activation-7.43e-01

Pos logit contributions
 Betty+0.010
bi+0.010
 Wendy+0.010
 Susan+0.009
my+0.009
Neg logit contributions
renches-0.022
ware-0.022
obs-0.022
lights-0.022
 tant-0.022
Token velvet
Token ID47750
Feature activation+1.45e+00

Pos logit contributions
bi+0.023
 Betty+0.022
 Wendy+0.022
 Susan+0.021
 Johnson+0.021
Neg logit contributions
ware-0.028
 tant-0.028
lights-0.028
renches-0.028
obs-0.027
Token box
Token ID3091
Feature activation+1.46e+00

Pos logit contributions
bi+0.057
 Betty+0.054
 Wendy+0.053
 Johnson+0.052
 Susan+0.051
Neg logit contributions
lights-0.087
ware-0.087
 tant-0.086
renches-0.086
obs-0.086
Token all
Token ID477
Feature activation+1.55e+00

Pos logit contributions
bi+0.051
 Betty+0.049
 Wendy+0.048
 Johnson+0.047
 Susan+0.046
Neg logit contributions
renches-0.093
 tant-0.093
ware-0.093
lights-0.093
obs-0.092
Token those
Token ID883
Feature activation-1.80e+00

Pos logit contributions
 Wendy+0.060
 Betty+0.060
bi+0.059
 Johnson+0.059
 Patty+0.057
Neg logit contributions
lights-0.092
ware-0.088
 tant-0.087
obs-0.085
renches-0.082
Token years
Token ID812
Feature activation-1.66e+00

Pos logit contributions
renches+0.114
 tant+0.114
obs+0.113
ware+0.113
lights+0.111
Neg logit contributions
bi-0.063
my-0.062
 Susan-0.060
 Betty-0.060
 Daisy-0.059
Token ago
Token ID2084
Feature activation-7.18e+00

Pos logit contributions
 tant+0.110
renches+0.109
ware+0.106
lights+0.103
obs+0.102
Neg logit contributions
bi-0.056
 Johnson-0.051
 Betty-0.051
 Karen-0.051
 Susan-0.050
Token.
Token ID13
Feature activation+3.69e-01

Pos logit contributions
 tant+0.510
renches+0.506
lights+0.500
ware+0.492
obs+0.490
Neg logit contributions
bi-0.216
 Johnson-0.197
 Betty-0.196
 Susan-0.193
my-0.191
Token They
Token ID1119
Feature activation+1.03e+00

Pos logit contributions
bi+0.016
 Wendy+0.016
 Betty+0.016
 Karen+0.015
 Sally+0.015
Neg logit contributions
 tant-0.022
ware-0.022
lights-0.021
renches-0.021
obs-0.020
Token were
Token ID547
Feature activation+3.95e-01

Pos logit contributions
 Johnson+0.042
bi+0.041
 Wendy+0.040
 Betty+0.039
 Iris+0.038
Neg logit contributions
ware-0.056
lights-0.055
obs-0.055
renches-0.054
 tant-0.054
Token glad
Token ID9675
Feature activation+5.14e-01

Pos logit contributions
 Johnson+0.017
bi+0.017
isi+0.017
 Betty+0.016
 Wendy+0.016
Neg logit contributions
lights-0.020
obs-0.020
renches-0.019
 boards-0.019
 ago-0.019
Token that
Token ID326
Feature activation+1.28e+00

Pos logit contributions
bi+0.021
 Betty+0.021
 Wendy+0.020
 Johnson+0.020
 Susan+0.020
Neg logit contributions
renches-0.030
ware-0.030
 tant-0.030
lights-0.030
obs-0.029
Token the
Token ID262
Feature activation-1.96e-01

Pos logit contributions
bi+0.004
 Betty+0.004
 Johnson+0.004
 Susan+0.004
 Wendy+0.004
Neg logit contributions
ware-0.009
 tant-0.009
obs-0.009
renches-0.009
lights-0.009
Token pigeon
Token ID46387
Feature activation+9.13e-01

Pos logit contributions
ware+0.012
 tant+0.012
lights+0.012
renches+0.011
obs+0.011
Neg logit contributions
bi-0.008
 Betty-0.007
 Susan-0.007
 Johnson-0.007
 Karen-0.007
Token all
Token ID477
Feature activation+6.85e-01

Pos logit contributions
bi+0.036
 Wendy+0.034
 Betty+0.034
 Johnson+0.034
 Karen+0.034
Neg logit contributions
ware-0.053
lights-0.053
obs-0.052
 tant-0.052
renches-0.051
Token those
Token ID883
Feature activation-2.37e+00

Pos logit contributions
bi+0.022
 Betty+0.021
 Johnson+0.021
 Wendy+0.021
 Karen+0.020
Neg logit contributions
lights-0.045
 tant-0.045
ware-0.045
renches-0.044
obs-0.044
Token weeks
Token ID2745
Feature activation-2.73e+00

Pos logit contributions
renches+0.155
ware+0.154
obs+0.153
lights+0.152
 tant+0.151
Neg logit contributions
bi-0.081
my-0.079
 Betty-0.077
 Susan-0.076
 Wendy-0.076
Token ago
Token ID2084
Feature activation-7.03e+00

Pos logit contributions
 tant+0.184
lights+0.181
renches+0.181
ware+0.178
obs+0.176
Neg logit contributions
bi-0.090
 Betty-0.086
my-0.085
 Wendy-0.085
 Johnson-0.084
Token.
Token ID13
Feature activation+4.80e-01

Pos logit contributions
renches+0.496
 tant+0.494
lights+0.491
obs+0.483
ware+0.480
Neg logit contributions
bi-0.214
 Johnson-0.198
 Betty-0.198
my-0.195
 Susan-0.194
Token<|endoftext|>
Token ID50256
Feature activation+2.06e-01

Pos logit contributions
bi+0.021
 Johnson+0.019
 Betty+0.019
my+0.019
isi+0.019
Neg logit contributions
lights-0.026
ware-0.026
obs-0.026
 tant-0.025
renches-0.025
TokenMike
Token ID16073
Feature activation+4.94e-01

Pos logit contributions
bi+0.010
 Betty+0.009
 Susan+0.009
 Karen+0.009
 Wendy+0.009
Neg logit contributions
ware-0.010
lights-0.010
obs-0.010
renches-0.010
 tant-0.010
Token and
Token ID290
Feature activation-3.17e-02

Pos logit contributions
bi+0.017
 Betty+0.015
my+0.015
 Johnson+0.015
 Wendy+0.015
Neg logit contributions
lights-0.033
renches-0.033
 tant-0.033
ware-0.033
obs-0.032
Token his
Token ID465
Feature activation-2.23e-01

Pos logit contributions
 tant+0.002
lights+0.002
obs+0.002
ware+0.002
renches+0.002
Neg logit contributions
bi-0.001
 Johnson-0.001
 Wendy-0.001
 reunited-0.001
my-0.001
Token told
Token ID1297
Feature activation-5.79e-01

Pos logit contributions
 Iris+0.009
bi+0.009
 Wendy+0.009
 Johnson+0.009
 welcome+0.009
Neg logit contributions
ware-0.015
renches-0.014
lights-0.014
der-0.014
 tant-0.014
Token him
Token ID683
Feature activation-2.56e+00

Pos logit contributions
lights+0.039
ware+0.039
renches+0.039
 tant+0.039
obs+0.038
Neg logit contributions
bi-0.019
 Betty-0.018
 Wendy-0.017
 Johnson-0.017
 Susan-0.017
Token that
Token ID326
Feature activation-8.38e-02

Pos logit contributions
lights+0.139
ware+0.138
renches+0.136
obs+0.136
 tant+0.134
Neg logit contributions
bi-0.113
 Johnson-0.106
 Wendy-0.104
 Betty-0.104
my-0.101
Token many
Token ID867
Feature activation-3.03e+00

Pos logit contributions
renches+0.006
 tant+0.006
obs+0.006
lights+0.006
ware+0.006
Neg logit contributions
 Betty-0.003
bi-0.002
 Wendy-0.002
 Susan-0.002
 Johnson-0.002
Token years
Token ID812
Feature activation-1.64e+00

Pos logit contributions
obs+0.193
ware+0.190
 tant+0.187
lights+0.183
renches+0.182
Neg logit contributions
 Betty-0.119
 Susan-0.115
 Patty-0.111
bi-0.111
 Sue-0.111
Token ago
Token ID2084
Feature activation-7.01e+00

Pos logit contributions
 tant+0.106
renches+0.103
ware+0.102
obs+0.097
lights+0.094
Neg logit contributions
 Betty-0.060
 Susan-0.058
 Karen-0.058
 Sue-0.058
bi-0.057
Token,
Token ID11
Feature activation-3.50e-01

Pos logit contributions
 tant+0.530
renches+0.520
lights+0.514
obs+0.507
ware+0.504
Neg logit contributions
bi-0.191
 Johnson-0.179
my-0.178
 Susan-0.176
 Betty-0.176
Token a
Token ID257
Feature activation+5.80e-02

Pos logit contributions
 tant+0.024
obs+0.024
ware+0.024
lights+0.024
renches+0.023
Neg logit contributions
 Betty-0.013
 Daisy-0.012
 Sue-0.012
 Susan-0.012
 Sally-0.012
Token big
Token ID1263
Feature activation-6.24e-01

Pos logit contributions
bi+0.002
 Betty+0.002
 Wendy+0.002
 Johnson+0.002
 Susan+0.002
Neg logit contributions
lights-0.003
ware-0.003
 tant-0.003
renches-0.003
obs-0.003
Token storm
Token ID6388
Feature activation-1.23e+00

Pos logit contributions
 tant+0.042
ware+0.042
lights+0.041
renches+0.041
obs+0.041
Neg logit contributions
bi-0.020
 Betty-0.019
 Wendy-0.018
 Johnson-0.018
 Susan-0.018
Token had
Token ID550
Feature activation-4.61e-01

Pos logit contributions
renches+0.088
obs+0.087
 tant+0.087
ware+0.086
lights+0.085
Neg logit contributions
bi-0.035
 Betty-0.034
my-0.034
 Wendy-0.033
 Johnson-0.033
Token coins
Token ID10796
Feature activation+6.34e-01

Pos logit contributions
obs+0.014
renches+0.014
obb+0.013
 tant+0.013
ware+0.013
Neg logit contributions
 Sue-0.010
 Betty-0.010
my-0.010
 Daisy-0.009
 Susan-0.009
Token that
Token ID326
Feature activation+1.28e+00

Pos logit contributions
bi+0.020
 Betty+0.019
 Johnson+0.018
 Wendy+0.018
my+0.018
Neg logit contributions
 tant-0.043
renches-0.043
lights-0.043
ware-0.042
obs-0.042
Token people
Token ID661
Feature activation-1.63e+00

Pos logit contributions
bi+0.050
 Betty+0.047
 Wendy+0.047
 Johnson+0.046
 Susan+0.045
Neg logit contributions
lights-0.076
ware-0.076
renches-0.076
 tant-0.076
obs-0.075
Token used
Token ID973
Feature activation+3.21e-01

Pos logit contributions
lights+0.112
renches+0.112
 tant+0.111
ware+0.110
obs+0.110
Neg logit contributions
bi-0.051
my-0.048
 Betty-0.047
 Johnson-0.045
 Susan-0.045
Token long
Token ID890
Feature activation-3.17e+00

Pos logit contributions
bi+0.012
 Betty+0.011
 Sue+0.011
my+0.011
 Daisy+0.011
Neg logit contributions
renches-0.020
lights-0.020
obs-0.020
 tant-0.019
ware-0.019
Token ago
Token ID2084
Feature activation-6.93e+00

Pos logit contributions
obb+0.180
lights+0.179
mic+0.172
renches+0.170
 burner+0.167
Neg logit contributions
my-0.128
!-0.126
,-0.119
bi-0.114
 friends-0.112
Token.
Token ID13
Feature activation+3.00e-02

Pos logit contributions
 tant+0.564
renches+0.562
lights+0.543
obb+0.531
obs+0.530
Neg logit contributions
bi-0.145
my-0.132
 Sue-0.130
ma-0.124
 Johnson-0.122
Token They
Token ID1119
Feature activation+7.31e-01

Pos logit contributions
bi+0.001
 Sue+0.001
 Wendy+0.001
 Johnson+0.001
 Karen+0.001
Neg logit contributions
renches-0.002
lights-0.002
ware-0.002
 tant-0.002
obs-0.002
Token saw
Token ID2497
Feature activation-2.09e+00

Pos logit contributions
bi+0.029
 Wendy+0.028
 Betty+0.028
 Johnson+0.027
 Patty+0.027
Neg logit contributions
ware-0.043
renches-0.042
 tant-0.042
lights-0.042
obs-0.041
Token paintings
Token ID21641
Feature activation-5.91e-01

Pos logit contributions
renches+0.124
obs+0.123
 tant+0.122
lights+0.121
ware+0.117
Neg logit contributions
 Sue-0.081
 Betty-0.080
bi-0.079
my-0.078
 Wendy-0.077
Token and
Token ID290
Feature activation-2.82e-02

Pos logit contributions
 tant+0.043
renches+0.042
ware+0.041
obs+0.041
lights+0.041
Neg logit contributions
bi-0.016
my-0.016
 Betty-0.015
 Johnson-0.015
 Wendy-0.015
Token ancestors
Token ID18668
Feature activation-3.16e+00

Pos logit contributions
lights+0.012
renches+0.012
ware+0.012
 tant+0.012
obs+0.011
Neg logit contributions
 Betty-0.005
bi-0.005
 Wendy-0.005
 Johnson-0.005
my-0.005
Token who
Token ID508
Feature activation-7.88e-01

Pos logit contributions
renches+0.238
 tant+0.237
lights+0.232
ware+0.226
obs+0.226
Neg logit contributions
my-0.080
 Betty-0.079
bi-0.078
 Daisy-0.076
 Susan-0.075
Token sang
Token ID25889
Feature activation-1.42e+00

Pos logit contributions
lights+0.048
renches+0.048
ware+0.047
 tant+0.047
obs+0.046
Neg logit contributions
bi-0.031
 Betty-0.030
my-0.030
 Wendy-0.029
 Patty-0.029
Token it
Token ID340
Feature activation-1.93e+00

Pos logit contributions
renches+0.103
ware+0.103
lights+0.099
 tant+0.098
obs+0.098
Neg logit contributions
 Daisy-0.043
my-0.041
 Betty-0.041
 Wendy-0.040
 welcome-0.040
Token long
Token ID890
Feature activation-3.01e+00

Pos logit contributions
renches+0.134
ware+0.129
 tant+0.129
obs+0.128
lights+0.127
Neg logit contributions
my-0.060
bi-0.059
 Betty-0.059
 Daisy-0.058
 Wendy-0.056
Token ago
Token ID2084
Feature activation-6.93e+00

Pos logit contributions
renches+0.244
lights+0.232
ware+0.230
obs+0.222
 tant+0.216
Neg logit contributions
!-0.074
my-0.071
 home-0.067
 Daisy-0.062
 welcome-0.060
Token.
Token ID13
Feature activation+6.80e-02

Pos logit contributions
renches+0.534
 tant+0.528
lights+0.515
obs+0.503
ware+0.501
Neg logit contributions
bi-0.178
my-0.169
 Johnson-0.160
 Susan-0.158
 Betty-0.158
Token Would
Token ID10928
Feature activation-4.03e-01

Pos logit contributions
 Betty+0.003
 Wendy+0.003
bi+0.003
 Susan+0.003
 Johnson+0.003
Neg logit contributions
 tant-0.004
ware-0.003
lights-0.003
renches-0.003
obs-0.003
Token you
Token ID345
Feature activation-6.14e-01

Pos logit contributions
ware+0.025
renches+0.024
lights+0.024
obs+0.024
 tant+0.024
Neg logit contributions
bi-0.016
 Johnson-0.014
 Betty-0.014
 Wendy-0.014
my-0.014
Token like
Token ID588
Feature activation+7.47e-01

Pos logit contributions
lights+0.040
renches+0.039
ware+0.039
 tant+0.038
obs+0.037
Neg logit contributions
my-0.023
bi-0.023
 Betty-0.022
 welcome-0.022
 Daisy-0.022
Token to
Token ID284
Feature activation+9.37e-02

Pos logit contributions
 Daisy+0.021
my+0.020
 Betty+0.020
 her+0.019
 welcome+0.019
Neg logit contributions
renches-0.058
 tant-0.058
lights-0.057
obs-0.055
inters-0.054
Token how
Token ID703
Feature activation+2.63e+00

Pos logit contributions
renches+0.004
 tant+0.004
obs+0.004
lights+0.004
ware+0.004
Neg logit contributions
 Betty-0.002
 Wendy-0.002
bi-0.002
 Susan-0.002
 Sue-0.002
Token they
Token ID484
Feature activation+6.61e-02

Pos logit contributions
bi+0.107
 Betty+0.103
 Susan+0.099
 Wendy+0.099
 Karen+0.097
Neg logit contributions
ware-0.157
 tant-0.156
renches-0.155
lights-0.154
obs-0.152
Token were
Token ID547
Feature activation+3.19e-02

Pos logit contributions
bi+0.002
my+0.002
 Betty+0.002
 Susan+0.002
 Wendy+0.002
Neg logit contributions
lights-0.004
 tant-0.004
obs-0.004
renches-0.004
ware-0.004
Token used
Token ID973
Feature activation+1.96e-01

Pos logit contributions
bi+0.001
 Wendy+0.001
 Betty+0.001
 Johnson+0.001
 Karen+0.001
Neg logit contributions
obs-0.002
renches-0.002
ware-0.002
lights-0.002
 tant-0.002
Token long
Token ID890
Feature activation-2.75e+00

Pos logit contributions
 Betty+0.007
bi+0.007
 Sue+0.007
 Susan+0.006
 Wendy+0.006
Neg logit contributions
renches-0.013
ware-0.012
lights-0.012
 tant-0.012
obs-0.012
Token ago
Token ID2084
Feature activation-6.82e+00

Pos logit contributions
plets+0.140
 tant+0.138
inter+0.136
renches+0.135
obb+0.134
Neg logit contributions
!-0.120
,-0.117
my-0.116
 home-0.110
 friends-0.110
Token.
Token ID13
Feature activation+2.17e-01

Pos logit contributions
 tant+0.563
renches+0.554
lights+0.527
obs+0.525
ware+0.518
Neg logit contributions
bi-0.137
my-0.131
 Sue-0.127
ma-0.127
 Johnson-0.126
Token Lily
Token ID20037
Feature activation+9.52e-01

Pos logit contributions
 Wendy+0.008
 Sue+0.008
 Daisy+0.008
 Sally+0.008
 Betty+0.008
Neg logit contributions
ware-0.015
lights-0.014
 tant-0.014
renches-0.014
obs-0.014
Token was
Token ID373
Feature activation-4.44e-01

Pos logit contributions
bi+0.034
 Betty+0.033
 Wendy+0.032
 Johnson+0.032
 Susan+0.031
Neg logit contributions
ware-0.060
renches-0.059
obs-0.059
lights-0.059
 tant-0.059
Token fascinated
Token ID30103
Feature activation-1.38e+00

Pos logit contributions
obs+0.023
renches+0.023
lights+0.022
 tant+0.022
ware+0.022
Neg logit contributions
bi-0.020
 Betty-0.019
 Johnson-0.019
 Wendy-0.018
isi-0.018
Token and
Token ID290
Feature activation+4.51e-02

Pos logit contributions
lights+0.089
 tant+0.088
obs+0.088
ware+0.088
renches+0.087
Neg logit contributions
bi-0.049
 Betty-0.045
 Wendy-0.044
 Johnson-0.044
 Susan-0.043
Token found
Token ID1043
Feature activation-1.39e+00

Pos logit contributions
lights+0.082
renches+0.080
 tant+0.080
ware+0.080
obs+0.079
Neg logit contributions
bi-0.038
 Betty-0.037
 Susan-0.036
 Wendy-0.036
my-0.035
Token the
Token ID262
Feature activation-9.34e-02

Pos logit contributions
lights+0.090
renches+0.090
obs+0.089
ware+0.089
 tant+0.089
Neg logit contributions
bi-0.048
 Betty-0.047
 Wendy-0.045
 Susan-0.044
 Johnson-0.044
Token truck
Token ID7779
Feature activation+5.54e-01

Pos logit contributions
lights+0.006
renches+0.006
obs+0.005
ware+0.005
 tant+0.005
Neg logit contributions
bi-0.004
 Johnson-0.004
 Betty-0.004
my-0.004
 Wendy-0.003
Token many
Token ID867
Feature activation-2.48e+00

Pos logit contributions
bi+0.020
my+0.019
 Johnson+0.019
 Betty+0.019
 Susan+0.018
Neg logit contributions
 tant-0.036
renches-0.036
lights-0.035
ware-0.035
obs-0.035
Token years
Token ID812
Feature activation-1.32e+00

Pos logit contributions
obs+0.150
ware+0.149
lights+0.146
renches+0.146
der+0.140
Neg logit contributions
my-0.093
bi-0.093
ma-0.091
 Susan-0.089
 Karen-0.089
Token ago
Token ID2084
Feature activation-6.74e+00

Pos logit contributions
 tant+0.083
renches+0.083
ware+0.079
lights+0.077
obs+0.076
Neg logit contributions
bi-0.046
 Johnson-0.045
 Karen-0.044
 Susan-0.044
 Sally-0.044
Token and
Token ID290
Feature activation-1.48e-02

Pos logit contributions
 tant+0.505
renches+0.498
lights+0.494
obs+0.483
ware+0.479
Neg logit contributions
bi-0.185
 Johnson-0.176
 Susan-0.171
my-0.169
 Betty-0.167
Token worked
Token ID3111
Feature activation+9.07e-01

Pos logit contributions
 tant+0.001
renches+0.001
lights+0.001
obs+0.001
ware+0.001
Neg logit contributions
 Betty-0.000
bi-0.000
 Susan-0.000
 Wendy-0.000
 Sue-0.000
Token hard
Token ID1327
Feature activation-1.68e-02

Pos logit contributions
bi+0.032
 Betty+0.031
 Wendy+0.030
 Johnson+0.030
 Susan+0.029
Neg logit contributions
ware-0.058
lights-0.058
renches-0.057
 tant-0.057
obs-0.057
Token to
Token ID284
Feature activation+5.71e-01

Pos logit contributions
lights+0.001
ware+0.001
 tant+0.001
renches+0.001
obs+0.001
Neg logit contributions
 home-0.000
!-0.000
my-0.000
 Betty-0.000
 Susan-0.000
Token fix
Token ID4259
Feature activation+8.59e-01

Pos logit contributions
 Betty+0.019
bi+0.019
 Susan+0.018
 Wendy+0.018
 Johnson+0.018
Neg logit contributions
lights-0.038
obs-0.038
 tant-0.037
ware-0.037
renches-0.037
Token says
Token ID1139
Feature activation-2.09e-01

Pos logit contributions
obs+0.067
renches+0.067
ware+0.067
lights+0.065
 tant+0.065
Neg logit contributions
 Wendy-0.035
bi-0.034
 Betty-0.034
 Patty-0.032
 Iris-0.032
Token that
Token ID326
Feature activation-8.49e-01

Pos logit contributions
 tant+0.014
renches+0.014
lights+0.013
obs+0.013
ware+0.013
Neg logit contributions
bi-0.007
 Betty-0.007
 Wendy-0.007
 Johnson-0.007
 Susan-0.007
Token a
Token ID257
Feature activation-7.70e-01

Pos logit contributions
lights+0.054
renches+0.054
 tant+0.053
ware+0.053
obs+0.053
Neg logit contributions
 Betty-0.031
bi-0.031
 Karen-0.030
 Wendy-0.030
 Johnson-0.030
Token long
Token ID890
Feature activation-2.93e+00

Pos logit contributions
lights+0.047
renches+0.046
ware+0.046
obs+0.045
 tant+0.045
Neg logit contributions
bi-0.031
 Betty-0.030
 Johnson-0.029
 Wendy-0.029
 Karen-0.029
Token time
Token ID640
Feature activation-1.90e+00

Pos logit contributions
lights+0.193
obs+0.185
renches+0.182
ware+0.181
 tant+0.171
Neg logit contributions
bi-0.114
 Betty-0.109
 Sue-0.108
 Susan-0.106
 Karen-0.106
Token ago
Token ID2084
Feature activation-6.73e+00

Pos logit contributions
 tant+0.141
renches+0.140
lights+0.138
ware+0.136
obs+0.135
Neg logit contributions
bi-0.051
 Sue-0.050
 Betty-0.049
my-0.049
 Susan-0.049
Token,
Token ID11
Feature activation-5.82e-01

Pos logit contributions
 tant+0.496
renches+0.495
lights+0.492
obs+0.484
ware+0.476
Neg logit contributions
bi-0.189
my-0.172
 Johnson-0.171
 Sue-0.170
 Betty-0.169
Token a
Token ID257
Feature activation+3.59e-01

Pos logit contributions
obs+0.043
renches+0.043
 tant+0.041
lights+0.041
ware+0.040
Neg logit contributions
 Daisy-0.017
 Sue-0.017
 Wendy-0.016
 Betty-0.016
 Bob-0.016
Token man
Token ID582
Feature activation+1.14e+00

Pos logit contributions
bi+0.014
 Betty+0.013
 Wendy+0.013
 Johnson+0.013
 Karen+0.012
Neg logit contributions
lights-0.022
renches-0.022
ware-0.022
obs-0.022
 tant-0.022
Token named
Token ID3706
Feature activation-9.34e-01

Pos logit contributions
bi+0.039
my+0.037
 Betty+0.037
 Susan+0.035
 Johnson+0.035
Neg logit contributions
lights-0.077
renches-0.076
 tant-0.075
obs-0.074
ware-0.074
Token Jesus
Token ID5803
Feature activation-3.99e-01

Pos logit contributions
lights+0.065
 tant+0.065
renches+0.064
ware+0.064
obs+0.064
Neg logit contributions
bi-0.028
 Betty-0.027
 Wendy-0.026
 Johnson-0.026
 Susan-0.026
Token had
Token ID550
Feature activation-6.20e-01

Pos logit contributions
bi+0.005
 Betty+0.005
 Wendy+0.005
my+0.004
 Johnson+0.004
Neg logit contributions
 tant-0.012
renches-0.012
lights-0.012
obs-0.012
ware-0.012
Token been
Token ID587
Feature activation+9.20e-03

Pos logit contributions
 tant+0.044
ware+0.044
lights+0.044
renches+0.042
obs+0.040
Neg logit contributions
 Sue-0.019
 Karen-0.019
 Betty-0.019
 Susan-0.018
my-0.018
Token made
Token ID925
Feature activation-6.29e-01

Pos logit contributions
my+0.000
 reunited+0.000
bi+0.000
 home+0.000
 Betty+0.000
Neg logit contributions
 tant-0.001
lights-0.001
der-0.001
ware-0.001
obs-0.001
Token many
Token ID867
Feature activation-1.90e+00

Pos logit contributions
 tant+0.043
ware+0.041
lights+0.041
renches+0.040
obs+0.040
Neg logit contributions
my-0.021
 Betty-0.021
 welcome-0.020
 home-0.020
 Daisy-0.020
Token years
Token ID812
Feature activation-1.23e+00

Pos logit contributions
ware+0.112
obs+0.110
lights+0.108
 tant+0.108
renches+0.107
Neg logit contributions
my-0.074
bi-0.074
 Betty-0.071
 Wendy-0.070
 Karen-0.070
Token ago
Token ID2084
Feature activation-6.70e+00

Pos logit contributions
ware+0.086
 tant+0.085
renches+0.084
growth+0.081
 rect+0.080
Neg logit contributions
 home-0.038
!-0.029
 Sally-0.029
 friend-0.028
bi-0.028
Token,
Token ID11
Feature activation-2.64e-01

Pos logit contributions
 tant+0.559
renches+0.538
lights+0.534
ware+0.515
obb+0.506
Neg logit contributions
bi-0.135
 Johnson-0.135
my-0.134
 Susan-0.124
 Sue-0.122
Token but
Token ID475
Feature activation-3.19e-01

Pos logit contributions
 tant+0.020
renches+0.019
lights+0.019
ware+0.019
obs+0.019
Neg logit contributions
 Betty-0.008
 Sue-0.008
 Sally-0.008
 Susan-0.008
 Daisy-0.008
Token it
Token ID340
Feature activation-1.36e+00

Pos logit contributions
obs+0.022
 tant+0.022
renches+0.022
ware+0.021
lights+0.021
Neg logit contributions
 Betty-0.010
bi-0.010
 Wendy-0.010
 Susan-0.010
 Karen-0.010
Token still
Token ID991
Feature activation+6.45e-01

Pos logit contributions
ware+0.082
lights+0.080
renches+0.079
 tant+0.079
obs+0.079
Neg logit contributions
bi-0.053
 Betty-0.051
 Johnson-0.050
 Wendy-0.050
 Susan-0.049
Token looked
Token ID3114
Feature activation-2.80e-01

Pos logit contributions
bi+0.020
my+0.019
 Betty+0.019
 Susan+0.018
 Sue+0.018
Neg logit contributions
renches-0.045
 tant-0.044
ware-0.044
lights-0.044
obs-0.043
Token her
Token ID607
Feature activation-1.15e+00

Pos logit contributions
 tant+0.067
lights+0.066
renches+0.066
ware+0.066
obs+0.065
Neg logit contributions
bi-0.014
 Betty-0.011
 Johnson-0.011
 Wendy-0.011
 Susan-0.010
Token that
Token ID326
Feature activation-5.64e-01

Pos logit contributions
lights+0.063
ware+0.062
obs+0.061
 tant+0.061
renches+0.061
Neg logit contributions
bi-0.052
 Johnson-0.048
 Wendy-0.048
 Betty-0.048
my-0.046
Token a
Token ID257
Feature activation-9.37e-01

Pos logit contributions
renches+0.038
 tant+0.038
obs+0.038
lights+0.038
ware+0.037
Neg logit contributions
 Betty-0.018
bi-0.018
 Wendy-0.017
 Johnson-0.017
 Susan-0.017
Token long
Token ID890
Feature activation-2.97e+00

Pos logit contributions
lights+0.056
renches+0.056
ware+0.055
obs+0.055
 tant+0.055
Neg logit contributions
bi-0.038
 Betty-0.036
 Wendy-0.035
 Johnson-0.035
my-0.034
Token time
Token ID640
Feature activation-1.77e+00

Pos logit contributions
lights+0.187
obs+0.185
renches+0.183
ware+0.176
 tant+0.170
Neg logit contributions
bi-0.123
 Betty-0.116
my-0.115
 Sue-0.113
 Susan-0.110
Token ago
Token ID2084
Feature activation-6.67e+00

Pos logit contributions
 tant+0.130
renches+0.126
lights+0.125
ware+0.124
obs+0.122
Neg logit contributions
 Betty-0.051
 Sue-0.050
my-0.050
bi-0.050
 Wendy-0.048
Token,
Token ID11
Feature activation-6.94e-02

Pos logit contributions
renches+0.530
 tant+0.525
lights+0.521
obs+0.512
ware+0.502
Neg logit contributions
bi-0.155
 Johnson-0.147
my-0.141
 Sue-0.140
 Susan-0.137
Token someone
Token ID2130
Feature activation-1.87e+00

Pos logit contributions
 tant+0.005
lights+0.005
obs+0.005
ware+0.005
renches+0.005
Neg logit contributions
 Sue-0.002
 Sally-0.002
 Betty-0.002
 Daisy-0.002
 Wendy-0.002
Token made
Token ID925
Feature activation-7.79e-01

Pos logit contributions
ware+0.110
lights+0.110
renches+0.110
 tant+0.110
obs+0.109
Neg logit contributions
bi-0.074
 Betty-0.071
 Wendy-0.069
 Johnson-0.069
 Susan-0.067
Token the
Token ID262
Feature activation+2.73e-01

Pos logit contributions
renches+0.049
ware+0.049
obs+0.049
 tant+0.049
lights+0.049
Neg logit contributions
bi-0.028
 Johnson-0.025
 Betty-0.025
 Wendy-0.025
 Karen-0.024
Token mark
Token ID1317
Feature activation-9.52e-01

Pos logit contributions
bi+0.011
 Betty+0.011
 Wendy+0.011
 Johnson+0.010
 Susan+0.010
Neg logit contributions
ware-0.016
lights-0.016
renches-0.016
 tant-0.016
obs-0.016
Token "
Token ID366
Feature activation-3.37e-01

Pos logit contributions
lights+0.013
ware+0.013
 tant+0.013
renches+0.013
obs+0.013
Neg logit contributions
bi-0.012
 Betty-0.011
 Wendy-0.011
 Johnson-0.011
 Susan-0.011
TokenIt
Token ID1026
Feature activation+1.08e+00

Pos logit contributions
lights+0.026
renches+0.025
 tant+0.025
ware+0.024
obb+0.023
Neg logit contributions
my-0.010
Joy-0.008
bi-0.008
Susan-0.008
Bet-0.008
Token was
Token ID373
Feature activation-9.40e-01

Pos logit contributions
bi+0.053
 Johnson+0.052
 Wendy+0.052
 Karen+0.051
isi+0.050
Neg logit contributions
ware-0.055
renches-0.053
obs-0.052
lights-0.051
 tant-0.049
Token hidden
Token ID7104
Feature activation-1.03e+00

Pos logit contributions
ware+0.057
lights+0.057
renches+0.057
obs+0.056
 tant+0.056
Neg logit contributions
bi-0.036
 Betty-0.034
 Wendy-0.033
 Johnson-0.033
 Susan-0.032
Token long
Token ID890
Feature activation-2.52e+00

Pos logit contributions
renches+0.073
lights+0.071
ware+0.071
 tant+0.071
obs+0.071
Neg logit contributions
 Betty-0.029
bi-0.027
 Daisy-0.027
 Sue-0.026
my-0.026
Token ago
Token ID2084
Feature activation-6.66e+00

Pos logit contributions
 tant+0.194
renches+0.193
lights+0.192
obs+0.182
ware+0.180
Neg logit contributions
!-0.058
 home-0.054
my-0.053
 Daisy-0.050
 Betty-0.047
Token,
Token ID11
Feature activation-5.32e-01

Pos logit contributions
 tant+0.508
renches+0.506
lights+0.495
ware+0.485
obs+0.478
Neg logit contributions
bi-0.174
 Johnson-0.169
my-0.164
 Sue-0.158
 Susan-0.156
Token and
Token ID290
Feature activation-1.58e-01

Pos logit contributions
ware+0.034
 tant+0.034
obs+0.033
renches+0.033
lights+0.033
Neg logit contributions
 Betty-0.021
 Sue-0.020
 Sally-0.020
 Daisy-0.019
 Susan-0.019
Token only
Token ID691
Feature activation+1.25e+00

Pos logit contributions
 tant+0.011
obs+0.011
renches+0.011
ware+0.011
lights+0.011
Neg logit contributions
 Betty-0.005
 Daisy-0.005
 Susan-0.005
 Sue-0.005
 Sally-0.004
Token seems
Token ID2331
Feature activation+1.22e+00

Pos logit contributions
 Betty+0.039
bi+0.038
my+0.038
 Sue+0.037
 Susan+0.037
Neg logit contributions
renches-0.087
lights-0.087
 tant-0.086
obs-0.086
ware-0.084
Token fancy
Token ID14996
Feature activation+4.40e-01

Pos logit contributions
bi+0.043
 Betty+0.043
 Daisy+0.040
 Sue+0.040
 Wendy+0.040
Neg logit contributions
lights-0.079
renches-0.079
ware-0.078
 tant-0.078
obs-0.076
Token says
Token ID1139
Feature activation+6.40e-02

Pos logit contributions
bi+0.052
 Wendy+0.050
 Johnson+0.050
 Karen+0.048
isi+0.047
Neg logit contributions
ware-0.060
lights-0.060
renches-0.059
obs-0.058
 tant-0.057
Token that
Token ID326
Feature activation-3.52e-01

Pos logit contributions
 Betty+0.002
bi+0.002
 Wendy+0.002
 Susan+0.002
 Johnson+0.002
Neg logit contributions
 tant-0.004
renches-0.004
lights-0.004
ware-0.004
obs-0.004
Token a
Token ID257
Feature activation-8.05e-01

Pos logit contributions
lights+0.023
renches+0.022
obs+0.022
 tant+0.022
ware+0.022
Neg logit contributions
bi-0.012
 Betty-0.012
 Wendy-0.012
 Susan-0.012
 Johnson-0.012
Token long
Token ID890
Feature activation-2.82e+00

Pos logit contributions
lights+0.047
ware+0.047
renches+0.047
 tant+0.047
obs+0.047
Neg logit contributions
bi-0.032
 Betty-0.031
 Wendy-0.030
 Johnson-0.030
 Susan-0.030
Token time
Token ID640
Feature activation-1.60e+00

Pos logit contributions
lights+0.174
obs+0.173
renches+0.172
ware+0.166
 tant+0.165
Neg logit contributions
 Betty-0.108
bi-0.108
 Susan-0.105
 Johnson-0.104
my-0.103
Token ago
Token ID2084
Feature activation-6.53e+00

Pos logit contributions
 tant+0.111
renches+0.108
lights+0.107
ware+0.107
obs+0.105
Neg logit contributions
bi-0.051
 Betty-0.051
 Susan-0.051
 Sue-0.049
my-0.049
Token,
Token ID11
Feature activation-4.46e-01

Pos logit contributions
renches+0.493
 tant+0.492
lights+0.491
obs+0.481
ware+0.480
Neg logit contributions
bi-0.168
 Betty-0.153
 Susan-0.152
 Johnson-0.151
my-0.148
Token a
Token ID257
Feature activation+1.95e-01

Pos logit contributions
ware+0.030
obs+0.030
renches+0.030
 tant+0.030
lights+0.030
Neg logit contributions
 Betty-0.014
 Susan-0.014
 Wendy-0.014
 Karen-0.014
 Sue-0.014
Token dragon
Token ID10441
Feature activation-7.64e-01

Pos logit contributions
bi+0.008
 Betty+0.007
 Wendy+0.007
 Johnson+0.007
 Susan+0.007
Neg logit contributions
lights-0.012
renches-0.012
ware-0.011
obs-0.011
 tant-0.011
Token used
Token ID973
Feature activation+8.29e-02

Pos logit contributions
lights+0.051
renches+0.051
 tant+0.051
ware+0.051
obs+0.051
Neg logit contributions
bi-0.024
 Betty-0.023
 Wendy-0.022
 Johnson-0.022
 Susan-0.022
Token to
Token ID284
Feature activation-8.58e-01

Pos logit contributions
 Betty+0.003
 Susan+0.003
 Daisy+0.003
my+0.003
 Sue+0.003
Neg logit contributions
lights-0.006
renches-0.006
ware-0.006
obs-0.005
 tant-0.005
Token by
Token ID416
Feature activation+7.00e-02

Pos logit contributions
ware+0.035
 tant+0.035
lights+0.035
renches+0.035
obs+0.034
Neg logit contributions
 Betty-0.018
bi-0.018
 Wendy-0.018
 Sue-0.017
 Susan-0.017
Token a
Token ID257
Feature activation+4.25e-01

Pos logit contributions
my+0.002
 Betty+0.002
 Sue+0.002
bi+0.002
 Susan+0.002
Neg logit contributions
lights-0.005
ware-0.005
renches-0.005
obs-0.005
 tant-0.005
Token famous
Token ID5863
Feature activation-1.70e-01

Pos logit contributions
bi+0.019
 Betty+0.018
 Wendy+0.018
 Johnson+0.018
 Susan+0.018
Neg logit contributions
 tant-0.023
lights-0.023
ware-0.023
renches-0.023
obs-0.023
Token artist
Token ID6802
Feature activation-4.96e-01

Pos logit contributions
ware+0.010
obs+0.010
 tant+0.010
renches+0.010
lights+0.010
Neg logit contributions
bi-0.007
 Betty-0.006
 Wendy-0.006
my-0.006
 Susan-0.006
Token long
Token ID890
Feature activation-2.24e+00

Pos logit contributions
renches+0.035
 tant+0.035
lights+0.035
ware+0.034
obs+0.033
Neg logit contributions
bi-0.014
 Betty-0.014
my-0.014
 Susan-0.013
 Johnson-0.013
Token ago
Token ID2084
Feature activation-6.51e+00

Pos logit contributions
renches+0.146
lights+0.145
ware+0.140
obs+0.139
 tant+0.135
Neg logit contributions
 Betty-0.076
my-0.075
 Susan-0.075
bi-0.075
 Johnson-0.073
Token.
Token ID13
Feature activation-1.12e-01

Pos logit contributions
 tant+0.520
renches+0.516
lights+0.507
ware+0.490
obs+0.488
Neg logit contributions
bi-0.145
 Johnson-0.140
 Sue-0.132
 Susan-0.131
my-0.128
Token
Token ID220
Feature activation+1.87e-01

Pos logit contributions
 tant+0.008
ware+0.007
lights+0.007
renches+0.007
obs+0.007
Neg logit contributions
 Betty-0.004
 Sue-0.004
 Susan-0.004
bi-0.004
 Wendy-0.004
Token
Token ID198
Feature activation-1.77e-01

Pos logit contributions
bi+0.006
 Betty+0.006
 Wendy+0.006
 Johnson+0.006
my+0.006
Neg logit contributions
ware-0.012
 tant-0.012
lights-0.012
renches-0.012
obs-0.012
Token
Token ID198
Feature activation-2.31e-01

Pos logit contributions
lights+0.011
ware+0.011
renches+0.011
 tant+0.011
obs+0.011
Neg logit contributions
bi-0.006
 Betty-0.006
 Wendy-0.006
 Johnson-0.006
 Susan-0.006
TokenOne
Token ID3198
Feature activation-1.14e+00

Pos logit contributions
lights+0.014
obs+0.014
 tant+0.014
ware+0.014
renches+0.014
Neg logit contributions
bi-0.009
 Betty-0.009
my-0.009
 Wendy-0.009
 Johnson-0.009
TokenThey
Token ID2990
Feature activation+2.06e-02

Pos logit contributions
my+0.003
bi+0.003
 Johnson+0.003
Joy+0.003
 Sue+0.003
Neg logit contributions
renches-0.007
obs-0.007
lights-0.007
ware-0.007
 tant-0.007
Token were
Token ID547
Feature activation-4.42e-01

Pos logit contributions
 Wendy+0.001
bi+0.001
 Betty+0.001
 Patty+0.001
 Johnson+0.001
Neg logit contributions
renches-0.001
ware-0.001
obs-0.001
 tant-0.001
 boards-0.001
Token here
Token ID994
Feature activation+1.95e-01

Pos logit contributions
renches+0.024
ware+0.023
 tant+0.023
lights+0.023
obs+0.023
Neg logit contributions
bi-0.019
isi-0.017
 Wendy-0.017
 Betty-0.017
 Karen-0.017
Token a
Token ID257
Feature activation+1.18e-01

Pos logit contributions
bi+0.008
 Betty+0.007
 Wendy+0.007
 Johnson+0.007
 Susan+0.007
Neg logit contributions
lights-0.012
ware-0.012
 tant-0.012
renches-0.012
obs-0.012
Token minute
Token ID5664
Feature activation-8.45e-01

Pos logit contributions
bi+0.005
 Susan+0.005
 Betty+0.005
 reunited+0.005
 Karen+0.005
Neg logit contributions
 tant-0.006
ware-0.006
renches-0.006
obs-0.006
lights-0.006
Token ago
Token ID2084
Feature activation-6.51e+00

Pos logit contributions
 tant+0.063
renches+0.062
ware+0.060
lights+0.059
obs+0.059
Neg logit contributions
bi-0.025
my-0.023
 Betty-0.023
 Sue-0.022
 Johnson-0.022
Token."
Token ID526
Feature activation-9.61e-01

Pos logit contributions
 tant+0.491
renches+0.489
lights+0.482
ware+0.477
obs+0.474
Neg logit contributions
bi-0.171
 Johnson-0.157
 Betty-0.154
 Sue-0.151
 Susan-0.150
Token
Token ID198
Feature activation-5.40e-01

Pos logit contributions
 tant+0.077
renches+0.076
lights+0.075
ware+0.075
obs+0.072
Neg logit contributions
 Wendy-0.023
 Sue-0.023
 Betty-0.021
 Daisy-0.021
 Sally-0.020
Token
Token ID198
Feature activation-5.76e-01

Pos logit contributions
lights+0.034
ware+0.034
renches+0.034
 tant+0.034
obs+0.034
Neg logit contributions
bi-0.020
 Betty-0.019
 Wendy-0.018
 Johnson-0.018
 Susan-0.018
Token"
Token ID1
Feature activation+9.36e-01

Pos logit contributions
obs+0.035
lights+0.035
 tant+0.035
renches+0.035
ware+0.035
Neg logit contributions
bi-0.021
 Betty-0.021
 Susan-0.020
 Johnson-0.020
 Wendy-0.019
TokenMaybe
Token ID13300
Feature activation+1.19e+00

Pos logit contributions
bi+0.030
 Wendy+0.030
 Sue+0.029
 Johnson+0.029
Joy+0.029
Neg logit contributions
renches-0.063
obs-0.063
lights-0.061
ware-0.061
 tant-0.061
Token family
Token ID1641
Feature activation-7.93e-01

Pos logit contributions
 tant+0.015
obs+0.015
ware+0.015
lights+0.015
renches+0.015
Neg logit contributions
bi-0.009
 Betty-0.008
 Susan-0.008
 Wendy-0.008
 Johnson-0.008
Token from
Token ID422
Feature activation-7.06e-01

Pos logit contributions
renches+0.050
 tant+0.050
ware+0.050
lights+0.050
obs+0.049
Neg logit contributions
bi-0.029
 Betty-0.027
my-0.027
 Johnson-0.027
 Wendy-0.026
Token a
Token ID257
Feature activation-2.15e-01

Pos logit contributions
obs+0.053
renches+0.053
ware+0.052
 tant+0.051
lights+0.051
Neg logit contributions
 Betty-0.017
bi-0.017
 Wendy-0.017
 Johnson-0.017
 Daisy-0.017
Token long
Token ID890
Feature activation-2.38e+00

Pos logit contributions
lights+0.012
ware+0.012
renches+0.012
 tant+0.012
obs+0.012
Neg logit contributions
bi-0.009
 Betty-0.009
 Wendy-0.009
 Johnson-0.009
 Susan-0.009
Token time
Token ID640
Feature activation-1.61e+00

Pos logit contributions
renches+0.139
lights+0.135
obs+0.134
ware+0.132
der+0.126
Neg logit contributions
bi-0.103
my-0.103
 Johnson-0.098
ma-0.096
 Betty-0.096
Token ago
Token ID2084
Feature activation-6.49e+00

Pos logit contributions
 tant+0.110
renches+0.107
ware+0.105
lights+0.103
obs+0.100
Neg logit contributions
my-0.053
 home-0.050
bi-0.050
 Johnson-0.050
 Susan-0.048
Token."
Token ID526
Feature activation-9.89e-01

Pos logit contributions
renches+0.462
 tant+0.461
lights+0.453
ware+0.448
obs+0.448
Neg logit contributions
bi-0.193
 Johnson-0.176
 Betty-0.174
my-0.172
 Susan-0.171
Token
Token ID198
Feature activation-5.32e-01

Pos logit contributions
 tant+0.077
renches+0.077
ware+0.076
obs+0.076
lights+0.075
Neg logit contributions
 Sue-0.023
 Betty-0.022
 Wendy-0.022
 Daisy-0.021
bi-0.020
Token
Token ID198
Feature activation-5.60e-01

Pos logit contributions
lights+0.033
ware+0.033
renches+0.033
 tant+0.033
obs+0.033
Neg logit contributions
bi-0.019
 Betty-0.019
 Wendy-0.018
 Johnson-0.018
 Susan-0.018
Token"
Token ID1
Feature activation+8.65e-01

Pos logit contributions
lights+0.032
ware+0.031
renches+0.031
 tant+0.031
obs+0.031
Neg logit contributions
bi-0.024
 Betty-0.023
 Wendy-0.023
 Johnson-0.023
 Susan-0.022
TokenMaybe
Token ID13300
Feature activation+1.18e+00

Pos logit contributions
 Iris+0.035
Joy+0.035
 Wendy+0.034
 Karen+0.033
 Sue+0.033
Neg logit contributions
obs-0.051
renches-0.051
lights-0.049
 tant-0.048
ware-0.048
Token so
Token ID523
Feature activation-5.08e-01

Pos logit contributions
lights+0.010
obs+0.010
 tant+0.009
renches+0.009
 Some+0.009
Neg logit contributions
bi-0.010
 Betty-0.009
isi-0.009
 Johnson-0.009
 Wendy-0.009
Token happy
Token ID3772
Feature activation+2.70e-01

Pos logit contributions
 tant+0.030
obs+0.029
lights+0.029
renches+0.029
ware+0.028
Neg logit contributions
bi-0.020
 Johnson-0.019
 Betty-0.019
 Wendy-0.019
 Susan-0.018
Token with
Token ID351
Feature activation+4.71e-01

Pos logit contributions
bi+0.015
 Johnson+0.014
 Wendy+0.013
 Karen+0.013
 Betty+0.013
Neg logit contributions
obs-0.012
ware-0.012
lights-0.012
der-0.012
 tant-0.011
Token her
Token ID607
Feature activation-8.67e-02

Pos logit contributions
bi+0.017
 Johnson+0.014
ll+0.014
my+0.013
Bet+0.012
Neg logit contributions
lights-0.030
 tant-0.029
renches-0.029
der-0.028
 Some-0.028
Token new
Token ID649
Feature activation-1.01e+00

Pos logit contributions
ware+0.005
lights+0.005
 tant+0.005
renches+0.005
obs+0.005
Neg logit contributions
bi-0.004
 Betty-0.004
 Wendy-0.004
 Susan-0.004
 Karen-0.004
Token seas
Token ID21547
Feature activation-6.49e+00

Pos logit contributions
ware+0.046
 tant+0.044
obs+0.043
 boards+0.042
lights+0.042
Neg logit contributions
 Karen-0.055
 Johnson-0.055
 Betty-0.055
 Wendy-0.055
bi-0.055
Tokenhell
Token ID12758
Feature activation-7.63e-01

Pos logit contributions
lights+0.262
 tant+0.259
obs+0.257
ware+0.257
renches+0.248
Neg logit contributions
bi-0.381
 Betty-0.372
 Wendy-0.370
 Johnson-0.364
 Susan-0.359
Token and
Token ID290
Feature activation-8.18e-01

Pos logit contributions
obs+0.039
lights+0.038
ware+0.038
der+0.038
 tant+0.036
Neg logit contributions
bi-0.038
 Betty-0.035
 Wendy-0.035
 Karen-0.035
 Susan-0.035
Token her
Token ID607
Feature activation+2.29e-01

Pos logit contributions
ware+0.047
lights+0.046
obs+0.046
 tant+0.044
 boards+0.044
Neg logit contributions
bi-0.033
 Johnson-0.032
isi-0.030
my-0.030
 Karen-0.029
Token new
Token ID649
Feature activation-7.42e-01

Pos logit contributions
bi+0.008
 Betty+0.008
 Wendy+0.007
 Johnson+0.007
 Susan+0.007
Neg logit contributions
lights-0.015
ware-0.015
 tant-0.015
renches-0.015
obs-0.015
Token crab
Token ID32202
Feature activation-2.22e+00

Pos logit contributions
ware+0.037
 tant+0.037
obs+0.036
lights+0.036
 boards+0.034
Neg logit contributions
 Betty-0.036
bi-0.036
 Wendy-0.036
 Johnson-0.036
 Karen-0.036
Token how
Token ID703
Feature activation+2.63e+00

Pos logit contributions
renches+0.105
lights+0.104
 tant+0.103
obs+0.100
ware+0.099
Neg logit contributions
 Betty-0.039
my-0.039
 Wendy-0.038
 Gig-0.037
 Daisy-0.037
Token much
Token ID881
Feature activation-8.91e-02

Pos logit contributions
bi+0.076
 Betty+0.075
 Sally+0.074
 Sue+0.073
 Wendy+0.073
Neg logit contributions
renches-0.188
 tant-0.184
ware-0.182
lights-0.176
obs-0.173
Token you
Token ID345
Feature activation-8.53e-01

Pos logit contributions
renches+0.006
lights+0.006
ware+0.006
 tant+0.006
obs+0.006
Neg logit contributions
bi-0.003
 Betty-0.003
 Wendy-0.003
 Susan-0.003
 Johnson-0.003
Token value
Token ID1988
Feature activation+9.38e-01

Pos logit contributions
lights+0.053
 tant+0.053
ware+0.053
renches+0.053
obs+0.052
Neg logit contributions
bi-0.032
 Betty-0.030
 Wendy-0.029
my-0.029
 Johnson-0.029
Token your
Token ID534
Feature activation-1.77e+00

Pos logit contributions
my+0.026
 Wendy+0.024
bi+0.023
 Betty+0.023
 Sue+0.023
Neg logit contributions
renches-0.070
 tant-0.066
obs-0.066
lights-0.064
obb-0.064
Token seas
Token ID21547
Feature activation-6.47e+00

Pos logit contributions
renches+0.096
obs+0.092
lights+0.092
 tant+0.089
der+0.088
Neg logit contributions
bi-0.080
 home-0.078
my-0.078
ma-0.076
 welcome-0.075
Tokenhell
Token ID12758
Feature activation-8.93e-01

Pos logit contributions
renches+0.322
obs+0.310
 tant+0.309
ware+0.307
lights+0.304
Neg logit contributions
bi-0.331
my-0.319
 Betty-0.313
 Wendy-0.303
 Susan-0.302
Tokens
Token ID82
Feature activation-3.72e-01

Pos logit contributions
renches+0.051
 tant+0.051
lights+0.051
ware+0.050
obs+0.050
Neg logit contributions
bi-0.037
 Betty-0.036
 Wendy-0.035
my-0.035
 Johnson-0.035
Token,
Token ID11
Feature activation-8.11e-01

Pos logit contributions
renches+0.027
 tant+0.027
lights+0.026
ware+0.026
obs+0.026
Neg logit contributions
my-0.012
bi-0.010
 home-0.010
 welcome-0.010
 Johnson-0.010
Token Lily
Token ID20037
Feature activation+2.47e-01

Pos logit contributions
ware+0.056
renches+0.056
obs+0.056
lights+0.056
 tant+0.055
Neg logit contributions
bi-0.024
 Betty-0.024
 Wendy-0.023
 Johnson-0.022
my-0.022
Token.
Token ID13
Feature activation-4.29e-01

Pos logit contributions
bi+0.011
 Betty+0.011
 Wendy+0.011
 Karen+0.011
 Patty+0.011
Neg logit contributions
obs-0.013
ware-0.013
lights-0.012
 tant-0.012
renches-0.012
Token her
Token ID607
Feature activation-4.24e-01

Pos logit contributions
obs+0.040
 tant+0.040
lights+0.040
ware+0.040
renches+0.040
Neg logit contributions
 Betty-0.008
bi-0.008
 Wendy-0.008
 Susan-0.007
 Karen-0.007
Token birthday
Token ID10955
Feature activation-2.49e+00

Pos logit contributions
renches+0.031
lights+0.031
ware+0.030
 tant+0.030
obs+0.030
Neg logit contributions
my-0.011
 Johnson-0.011
bi-0.011
 Wendy-0.011
 Betty-0.010
Token a
Token ID257
Feature activation-9.09e-02

Pos logit contributions
renches+0.164
lights+0.163
 tant+0.163
ware+0.162
obs+0.160
Neg logit contributions
bi-0.081
 Betty-0.078
 Johnson-0.077
 Wendy-0.076
my-0.075
Token few
Token ID1178
Feature activation-2.61e+00

Pos logit contributions
renches+0.005
lights+0.005
obs+0.005
ware+0.005
 tant+0.005
Neg logit contributions
bi-0.004
 Wendy-0.004
 Johnson-0.004
 Betty-0.004
my-0.004
Token days
Token ID1528
Feature activation-1.93e+00

Pos logit contributions
obs+0.151
renches+0.149
inters+0.143
ware+0.143
lights+0.139
Neg logit contributions
 friends-0.109
 friend-0.103
ma-0.103
 home-0.099
 Daisy-0.099
Token ago
Token ID2084
Feature activation-6.33e+00

Pos logit contributions
 tant+0.144
renches+0.141
lights+0.140
ware+0.138
obb+0.131
Neg logit contributions
my-0.054
bi-0.053
 Johnson-0.050
 Karen-0.047
 reunited-0.047
Token.
Token ID13
Feature activation+3.07e-02

Pos logit contributions
 tant+0.462
renches+0.460
lights+0.455
ware+0.443
obs+0.442
Neg logit contributions
bi-0.177
 Johnson-0.165
 Betty-0.160
 Susan-0.160
my-0.156
Token She
Token ID1375
Feature activation+8.76e-01

Pos logit contributions
 Betty+0.001
 Susan+0.001
bi+0.001
 Wendy+0.001
 Daisy+0.001
Neg logit contributions
 tant-0.002
ware-0.002
renches-0.002
obs-0.002
lights-0.002
Token liked
Token ID8288
Feature activation-9.62e-01

Pos logit contributions
bi+0.026
 Betty+0.024
 Wendy+0.024
my+0.024
 Johnson+0.023
Neg logit contributions
renches-0.062
lights-0.061
 tant-0.061
obs-0.061
ware-0.060
Token to
Token ID284
Feature activation-2.03e-01

Pos logit contributions
renches+0.074
lights+0.072
obs+0.072
 tant+0.071
obb+0.070
Neg logit contributions
 Wendy-0.026
 Daisy-0.025
 Betty-0.025
 Karen-0.024
 her-0.024
Token practice
Token ID3357
Feature activation-1.42e+00

Pos logit contributions
obs+0.014
lights+0.014
renches+0.014
ware+0.014
 tant+0.014
Neg logit contributions
 Betty-0.006
 Wendy-0.006
 Susan-0.006
my-0.006
 Daisy-0.006
Token animals
Token ID4695
Feature activation-4.66e-01

Pos logit contributions
 tant+0.111
lights+0.110
obs+0.109
renches+0.108
ware+0.104
Neg logit contributions
 Betty-0.063
 Sue-0.060
 Susan-0.060
 Daisy-0.059
my-0.059
Token and
Token ID290
Feature activation-3.91e-01

Pos logit contributions
renches+0.034
 tant+0.034
lights+0.034
ware+0.033
obs+0.032
Neg logit contributions
bi-0.013
 Betty-0.013
my-0.013
 Wendy-0.012
 Johnson-0.012
Token people
Token ID661
Feature activation-1.57e+00

Pos logit contributions
renches+0.024
obs+0.024
obb+0.023
 tant+0.023
ware+0.023
Neg logit contributions
 Betty-0.016
 Sue-0.015
 Susan-0.015
 Daisy-0.015
my-0.015
Token from
Token ID422
Feature activation+4.28e-01

Pos logit contributions
renches+0.108
lights+0.107
 tant+0.107
ware+0.106
obs+0.104
Neg logit contributions
bi-0.049
my-0.047
 Betty-0.045
 Johnson-0.045
 Wendy-0.044
Token long
Token ID890
Feature activation-2.32e+00

Pos logit contributions
 Johnson+0.009
bi+0.009
 Sue+0.009
my+0.009
 Karen+0.009
Neg logit contributions
renches-0.033
obs-0.033
lights-0.032
ware-0.031
 tant-0.031
Token ago
Token ID2084
Feature activation-6.32e+00

Pos logit contributions
lights+0.132
ware+0.130
renches+0.129
obb+0.127
 tant+0.125
Neg logit contributions
my-0.092
bi-0.089
 home-0.088
 Sue-0.086
!-0.086
Token.
Token ID13
Feature activation+1.21e-01

Pos logit contributions
 tant+0.494
renches+0.491
lights+0.481
obs+0.473
ware+0.471
Neg logit contributions
bi-0.145
 Johnson-0.135
 Sue-0.131
my-0.131
 Susan-0.129
Token Sam
Token ID3409
Feature activation+1.57e-01

Pos logit contributions
 Betty+0.004
 Sue+0.004
 Sally+0.004
 Susan+0.004
 Daisy+0.004
Neg logit contributions
ware-0.009
 tant-0.009
lights-0.009
obs-0.008
oons-0.008
Token's
Token ID338
Feature activation-2.83e-01

Pos logit contributions
bi+0.006
 Betty+0.005
my+0.005
 Wendy+0.005
 Johnson+0.005
Neg logit contributions
lights-0.010
renches-0.010
 tant-0.010
ware-0.010
obs-0.010
Token mom
Token ID1995
Feature activation+2.73e-01

Pos logit contributions
lights+0.020
renches+0.020
obs+0.020
ware+0.020
 tant+0.019
Neg logit contributions
bi-0.008
 Betty-0.008
 Wendy-0.007
 Johnson-0.007
my-0.007
Token said
Token ID531
Feature activation+6.04e-01

Pos logit contributions
bi+0.009
 Betty+0.009
 Wendy+0.008
 Patty+0.008
 Johnson+0.008
Neg logit contributions
ware-0.018
renches-0.018
lights-0.018
obs-0.017
 tant-0.017
Token Welcome
Token ID19134
Feature activation-3.48e-01

Pos logit contributions
lights+0.071
ware+0.071
renches+0.070
 tant+0.070
obs+0.070
Neg logit contributions
bi-0.066
 Betty-0.063
 Wendy-0.063
 Johnson-0.062
 Susan-0.061
Token.
Token ID13
Feature activation-5.60e-01

Pos logit contributions
lights+0.020
ware+0.020
renches+0.020
 tant+0.020
obs+0.020
Neg logit contributions
bi-0.015
 Betty-0.014
 Wendy-0.014
 Johnson-0.014
 Susan-0.013
Token We
Token ID775
Feature activation+1.31e-01

Pos logit contributions
 tant+0.031
lights+0.030
ware+0.030
renches+0.030
obs+0.030
Neg logit contributions
bi-0.025
 Betty-0.025
 Wendy-0.025
 Susan-0.024
 Johnson-0.024
Token like
Token ID588
Feature activation+5.86e-01

Pos logit contributions
bi+0.006
 Johnson+0.006
 Betty+0.006
 Susan+0.006
 Wendy+0.006
Neg logit contributions
 tant-0.007
renches-0.007
ware-0.007
obs-0.006
 boards-0.006
Token bubbles
Token ID25037
Feature activation-5.02e-01

Pos logit contributions
 Daisy+0.014
 her+0.013
 Sue+0.013
 Betty+0.013
 Wendy+0.013
Neg logit contributions
renches-0.048
 tant-0.047
lights-0.046
obs-0.046
ware-0.044
Token too
Token ID1165
Feature activation+2.55e+00

Pos logit contributions
renches+0.036
 tant+0.036
obs+0.035
lights+0.034
ware+0.034
Neg logit contributions
my-0.014
 Daisy-0.013
 Wendy-0.013
 Betty-0.013
bi-0.013
Token.
Token ID13
Feature activation+1.36e-01

Pos logit contributions
my+0.066
bi+0.066
 Betty+0.066
 Sue+0.066
 Johnson+0.064
Neg logit contributions
ware-0.188
 tant-0.187
renches-0.186
obs-0.178
lights-0.176
Token They
Token ID1119
Feature activation+8.48e-01

Pos logit contributions
 Betty+0.006
 Wendy+0.006
 Daisy+0.006
 Sue+0.006
 Iris+0.006
Neg logit contributions
 tant-0.008
ware-0.008
lights-0.008
renches-0.008
oons-0.008
Token are
Token ID389
Feature activation+1.36e+00

Pos logit contributions
bi+0.025
my+0.025
 Betty+0.024
 Wendy+0.023
 Daisy+0.022
Neg logit contributions
lights-0.059
 tant-0.058
ware-0.058
renches-0.058
obs-0.057
Token pretty
Token ID2495
Feature activation+8.11e-01

Pos logit contributions
bi+0.058
 Betty+0.055
 Wendy+0.054
 Johnson+0.053
 Susan+0.053
Neg logit contributions
ware-0.077
lights-0.077
 tant-0.077
renches-0.077
obs-0.076
Token and
Token ID290
Feature activation-3.16e-01

Pos logit contributions
my+0.026
bi+0.026
 Johnson+0.024
 Daisy+0.024
 Betty+0.024
Neg logit contributions
renches-0.055
 tant-0.052
ware-0.052
obs-0.051
lights-0.050
Token.
Token ID13
Feature activation+7.57e-01

Pos logit contributions
bi+0.042
 Betty+0.040
 Wendy+0.039
my+0.038
 Susan+0.038
Neg logit contributions
renches-0.105
 tant-0.103
lights-0.103
ware-0.102
obs-0.101
Token They
Token ID1119
Feature activation+1.35e+00

Pos logit contributions
bi+0.035
 Betty+0.034
 Wendy+0.034
 Johnson+0.033
 Susan+0.033
Neg logit contributions
ware-0.042
 tant-0.042
lights-0.041
renches-0.041
obs-0.040
Token were
Token ID547
Feature activation+6.28e-01

Pos logit contributions
 Johnson+0.056
bi+0.054
 Betty+0.053
 Wendy+0.053
 Susan+0.051
Neg logit contributions
ware-0.075
 tant-0.074
lights-0.074
obs-0.073
 boards-0.071
Token the
Token ID262
Feature activation+2.93e-01

Pos logit contributions
 Johnson+0.029
isi+0.029
bi+0.029
 Wendy+0.027
 Betty+0.027
Neg logit contributions
lights-0.030
obs-0.029
 boards-0.028
renches-0.027
 rect-0.026
Token fastest
Token ID14162
Feature activation-1.08e+00

Pos logit contributions
bi+0.012
 Betty+0.012
 Susan+0.012
 Wendy+0.011
 Johnson+0.011
Neg logit contributions
renches-0.016
 tant-0.016
obs-0.016
lights-0.016
 boards-0.015
Token friends
Token ID2460
Feature activation+2.30e+00

Pos logit contributions
renches+0.069
ware+0.069
lights+0.069
 tant+0.068
obs+0.068
Neg logit contributions
bi-0.038
 Betty-0.037
 Wendy-0.036
 Johnson-0.036
 Susan-0.035
Token in
Token ID287
Feature activation-8.60e-01

Pos logit contributions
bi+0.082
 Betty+0.078
 Wendy+0.076
 Johnson+0.075
 Susan+0.074
Neg logit contributions
lights-0.145
ware-0.145
renches-0.145
 tant-0.145
obs-0.144
Token the
Token ID262
Feature activation+1.90e-01

Pos logit contributions
lights+0.063
ware+0.063
 tant+0.063
renches+0.062
obs+0.062
Neg logit contributions
bi-0.022
 Betty-0.021
 Wendy-0.020
 Johnson-0.020
my-0.019
Token world
Token ID995
Feature activation-1.08e+00

Pos logit contributions
bi+0.009
 Betty+0.009
 Wendy+0.009
 Johnson+0.009
 Susan+0.009
Neg logit contributions
lights-0.009
 tant-0.009
obs-0.009
renches-0.009
ware-0.009
Token!
Token ID0
Feature activation+3.10e-01

Pos logit contributions
renches+0.076
 tant+0.076
ware+0.076
lights+0.075
obs+0.075
Neg logit contributions
bi-0.031
 Betty-0.029
 Wendy-0.029
 Johnson-0.028
 Susan-0.028
Token<|endoftext|>
Token ID50256
Feature activation+5.11e-01

Pos logit contributions
bi+0.012
 Betty+0.012
 Johnson+0.011
 Wendy+0.011
my+0.011
Neg logit contributions
lights-0.018
obs-0.018
 tant-0.018
renches-0.017
ware-0.017
Token together
Token ID1978
Feature activation+1.13e+00

Pos logit contributions
lights+0.026
ware+0.026
renches+0.026
 tant+0.026
obs+0.026
Neg logit contributions
bi-0.016
 Betty-0.016
 Wendy-0.015
 Johnson-0.015
 Susan-0.015
Token and
Token ID290
Feature activation+1.59e-01

Pos logit contributions
bi+0.035
my+0.034
 Betty+0.033
 Wendy+0.033
 Daisy+0.032
Neg logit contributions
renches-0.078
lights-0.078
obs-0.077
 tant-0.075
ware-0.075
Token had
Token ID550
Feature activation+1.01e-02

Pos logit contributions
bi+0.006
 Johnson+0.005
 Betty+0.005
 Wendy+0.005
 Patty+0.005
Neg logit contributions
 tant-0.009
lights-0.009
ware-0.009
renches-0.009
obs-0.009
Token so
Token ID523
Feature activation-2.99e-01

Pos logit contributions
bi+0.000
 Betty+0.000
 Johnson+0.000
 Wendy+0.000
 Susan+0.000
Neg logit contributions
lights-0.001
renches-0.001
obs-0.001
 tant-0.001
ware-0.001
Token much
Token ID881
Feature activation+2.03e-01

Pos logit contributions
ware+0.017
lights+0.017
renches+0.017
obs+0.017
 tant+0.017
Neg logit contributions
bi-0.013
 Betty-0.012
 Wendy-0.012
 Johnson-0.012
my-0.012
Token fun
Token ID1257
Feature activation+2.33e+00

Pos logit contributions
 Betty+0.008
bi+0.008
 Wendy+0.007
 Susan+0.007
 Johnson+0.007
Neg logit contributions
ware-0.013
 tant-0.012
lights-0.012
renches-0.012
obs-0.012
Token.
Token ID13
Feature activation+3.80e-01

Pos logit contributions
 Betty+0.054
 Wendy+0.053
 Sue+0.051
 Susan+0.049
 Johnson+0.049
Neg logit contributions
renches-0.177
 tant-0.176
ware-0.173
lights-0.170
obs-0.170
Token Max
Token ID5436
Feature activation+9.11e-01

Pos logit contributions
 Wendy+0.011
 Sue+0.011
 Daisy+0.011
my+0.010
 Ellie+0.010
Neg logit contributions
ware-0.028
obs-0.027
renches-0.027
 tant-0.027
lights-0.027
Token liked
Token ID8288
Feature activation-9.53e-01

Pos logit contributions
bi+0.039
 Betty+0.038
 Johnson+0.036
 Susan+0.036
 Wendy+0.036
Neg logit contributions
ware-0.049
obs-0.049
 tant-0.049
renches-0.048
lights-0.047
Token Lily
Token ID20037
Feature activation+5.74e-01

Pos logit contributions
renches+0.061
lights+0.061
obs+0.060
ware+0.060
 tant+0.059
Neg logit contributions
 Betty-0.036
 Wendy-0.035
 Sue-0.035
bi-0.034
 Daisy-0.034
Token's
Token ID338
Feature activation-8.32e-01

Pos logit contributions
bi+0.022
 Betty+0.022
 Wendy+0.021
 Johnson+0.021
 Susan+0.020
Neg logit contributions
 tant-0.034
obs-0.034
ware-0.034
renches-0.034
lights-0.034
Token new
Token ID649
Feature activation-2.74e-01

Pos logit contributions
my+0.031
 Wendy+0.031
 home+0.031
 her+0.030
 friend+0.029
Neg logit contributions
obs-0.065
renches-0.065
ues-0.061
oons-0.060
der-0.059
Token bath
Token ID7837
Feature activation-2.21e+00

Pos logit contributions
obs+0.016
renches+0.016
lights+0.015
der+0.015
 tant+0.015
Neg logit contributions
 Wendy-0.012
my-0.011
 home-0.011
 Daisy-0.011
 friend-0.011
Tokenrobe
Token ID25481
Feature activation-1.57e+00

Pos logit contributions
renches+0.127
 Some+0.119
mic+0.118
 tant+0.118
obs+0.116
Neg logit contributions
my-0.093
 home-0.088
ma-0.087
 Wendy-0.086
 Daisy-0.080
Token and
Token ID290
Feature activation-4.52e-01

Pos logit contributions
renches+0.115
lights+0.112
 tant+0.111
ware+0.108
obs+0.105
Neg logit contributions
my-0.046
 Johnson-0.042
 Daisy-0.042
 Wendy-0.041
 Sally-0.040
Token observed
Token ID6515
Feature activation-4.27e-01

Pos logit contributions
lights+0.029
ware+0.029
 tant+0.029
renches+0.029
obs+0.028
Neg logit contributions
bi-0.016
 Betty-0.015
 Wendy-0.015
 Johnson-0.015
 Susan-0.014
Token how
Token ID703
Feature activation+2.79e+00

Pos logit contributions
renches+0.033
obs+0.032
ware+0.032
 tant+0.031
lights+0.031
Neg logit contributions
 Wendy-0.010
 Betty-0.010
 Karen-0.009
 Daisy-0.009
my-0.009
Token warm
Token ID5814
Feature activation+2.32e+00

Pos logit contributions
bi+0.110
 Betty+0.104
 Wendy+0.104
my+0.102
 Susan+0.100
Neg logit contributions
renches-0.168
ware-0.166
 tant-0.164
obs-0.163
lights-0.162
Token and
Token ID290
Feature activation-2.63e-01

Pos logit contributions
 Betty+0.069
bi+0.068
 Wendy+0.067
 Susan+0.065
 Daisy+0.064
Neg logit contributions
renches-0.165
 tant-0.163
obs-0.161
ware-0.160
lights-0.157
Token cozy
Token ID37438
Feature activation+2.25e+00

Pos logit contributions
renches+0.015
 tant+0.015
obs+0.015
ware+0.015
lights+0.014
Neg logit contributions
my-0.011
 Wendy-0.011
bi-0.010
 Betty-0.010
 Karen-0.010
Token it
Token ID340
Feature activation-5.71e-01

Pos logit contributions
bi+0.071
 Betty+0.070
 Wendy+0.070
 Johnson+0.068
 Karen+0.066
Neg logit contributions
renches-0.153
 tant-0.151
obs-0.148
ware-0.147
lights-0.147
Token was
Token ID373
Feature activation-7.11e-02

Pos logit contributions
renches+0.040
obs+0.040
 tant+0.040
lights+0.039
ware+0.039
Neg logit contributions
bi-0.016
my-0.015
 Betty-0.015
 Wendy-0.015
 Johnson-0.014
Token The
Token ID383
Feature activation-4.70e-01

Pos logit contributions
 Betty+0.025
 Wendy+0.025
bi+0.024
 Susan+0.024
 Daisy+0.024
Neg logit contributions
ware-0.030
 tant-0.030
renches-0.030
lights-0.029
obs-0.029
Token pink
Token ID11398
Feature activation+5.40e-02

Pos logit contributions
lights+0.028
ware+0.028
renches+0.028
 tant+0.028
obs+0.028
Neg logit contributions
bi-0.019
 Betty-0.018
 Wendy-0.018
 Johnson-0.017
my-0.017
Token worm
Token ID23849
Feature activation+5.92e-01

Pos logit contributions
bi+0.002
 Betty+0.002
my+0.002
 Johnson+0.002
 Patty+0.002
Neg logit contributions
renches-0.004
lights-0.004
 tant-0.004
obs-0.003
ware-0.003
Token stayed
Token ID9658
Feature activation+1.04e+00

Pos logit contributions
bi+0.017
my+0.016
 Betty+0.016
 Wendy+0.015
 Daisy+0.015
Neg logit contributions
renches-0.043
obs-0.042
 tant-0.042
lights-0.041
ware-0.041
Token out
Token ID503
Feature activation-5.60e-01

Pos logit contributions
bi+0.038
 Betty+0.036
 Wendy+0.036
 Johnson+0.035
 Susan+0.035
Neg logit contributions
lights-0.065
ware-0.064
renches-0.064
 tant-0.064
obs-0.064
Token all
Token ID477
Feature activation+2.07e+00

Pos logit contributions
renches+0.039
ware+0.038
obs+0.038
 tant+0.037
oons+0.036
Neg logit contributions
my-0.018
 reunited-0.018
 Betty-0.018
 Daisy-0.018
 home-0.018
Token night
Token ID1755
Feature activation-5.42e-01

Pos logit contributions
bi+0.072
 Betty+0.067
my+0.066
 Wendy+0.066
 Susan+0.064
Neg logit contributions
renches-0.137
ware-0.136
obs-0.136
 tant-0.135
lights-0.133
Token,
Token ID11
Feature activation+4.94e-01

Pos logit contributions
 tant+0.039
renches+0.039
ware+0.038
lights+0.038
 Some+0.036
Neg logit contributions
 Daisy-0.015
 Betty-0.014
bi-0.014
 reunited-0.014
 home-0.014
Token until
Token ID1566
Feature activation-5.32e-01

Pos logit contributions
bi+0.017
 Betty+0.017
 Wendy+0.016
my+0.016
 Susan+0.016
Neg logit contributions
renches-0.033
ware-0.032
 tant-0.032
lights-0.032
obs-0.032
Token the
Token ID262
Feature activation+6.72e-01

Pos logit contributions
lights+0.032
ware+0.031
obs+0.031
 tant+0.031
renches+0.031
Neg logit contributions
bi-0.022
my-0.019
 Johnson-0.019
 Betty-0.019
 Wendy-0.019
Token sun
Token ID4252
Feature activation+1.38e+00

Pos logit contributions
bi+0.031
 Betty+0.030
my+0.029
 Johnson+0.029
 Wendy+0.029
Neg logit contributions
lights-0.036
renches-0.036
 tant-0.036
ware-0.036
obs-0.035
Token liked
Token ID8288
Feature activation-1.44e+00

Pos logit contributions
lights+0.129
renches+0.126
 tant+0.126
obs+0.126
ware+0.126
Neg logit contributions
bi-0.079
my-0.074
 Betty-0.073
 Wendy-0.072
 Johnson-0.071
Token to
Token ID284
Feature activation-5.62e-01

Pos logit contributions
renches+0.106
 tant+0.105
lights+0.103
ware+0.099
obs+0.099
Neg logit contributions
 Sue-0.045
 Betty-0.044
 Johnson-0.043
my-0.042
 Bob-0.041
Token play
Token ID711
Feature activation-1.34e-01

Pos logit contributions
lights+0.040
 tant+0.039
obs+0.039
renches+0.039
ware+0.039
Neg logit contributions
bi-0.017
 Betty-0.016
my-0.016
 Sue-0.016
 Wendy-0.016
Token on
Token ID319
Feature activation-9.66e-01

Pos logit contributions
obs+0.009
 tant+0.009
lights+0.009
renches+0.009
ware+0.009
Neg logit contributions
 Betty-0.004
 Wendy-0.004
 Sue-0.004
my-0.004
 Johnson-0.004
Token.
Token ID13
Feature activation+6.00e-02

Pos logit contributions
obs+0.088
renches+0.087
 tant+0.086
lights+0.083
ware+0.082
Neg logit contributions
 Sue-0.012
 Betty-0.010
 her-0.009
 Bob-0.009
 Karen-0.009
Token He
Token ID679
Feature activation-5.30e-01

Pos logit contributions
bi+0.002
my+0.002
 Betty+0.002
 Johnson+0.002
 Sue+0.002
Neg logit contributions
 tant-0.004
ware-0.004
renches-0.004
obs-0.004
lights-0.004
Token would
Token ID561
Feature activation-1.29e-01

Pos logit contributions
lights+0.032
renches+0.032
ware+0.032
 tant+0.032
obs+0.032
Neg logit contributions
bi-0.021
 Betty-0.020
 Wendy-0.020
my-0.019
 Johnson-0.019
Token swing
Token ID9628
Feature activation+8.91e-01

Pos logit contributions
lights+0.009
obs+0.009
ware+0.009
renches+0.009
 tant+0.009
Neg logit contributions
bi-0.004
 Betty-0.004
my-0.004
 Wendy-0.004
 Susan-0.004
Token back
Token ID736
Feature activation+1.11e+00

Pos logit contributions
bi+0.031
 Betty+0.030
 Wendy+0.030
 Johnson+0.029
my+0.029
Neg logit contributions
ware-0.057
renches-0.057
lights-0.057
 tant-0.056
obs-0.056
Token and
Token ID290
Feature activation-2.55e-01

Pos logit contributions
bi+0.044
 Betty+0.042
my+0.041
 Wendy+0.041
 Johnson+0.041
Neg logit contributions
renches-0.069
ware-0.067
lights-0.067
 tant-0.066
obs-0.066
Token forth
Token ID6071
Feature activation+1.60e-01

Pos logit contributions
lights+0.017
 tant+0.017
ware+0.017
obs+0.017
renches+0.017
Neg logit contributions
bi-0.007
 Johnson-0.007
 Betty-0.007
 Wendy-0.007
 Karen-0.006
Token out
Token ID503
Feature activation-4.63e-01

Pos logit contributions
 tant+0.018
obs+0.017
lights+0.017
ware+0.017
renches+0.016
Neg logit contributions
bi-0.013
 Betty-0.012
 Johnson-0.011
 Susan-0.011
isi-0.011
Token and
Token ID290
Feature activation+5.04e-02

Pos logit contributions
renches+0.025
ware+0.025
obs+0.025
lights+0.025
 tant+0.024
Neg logit contributions
bi-0.021
 Betty-0.021
 Wendy-0.020
my-0.020
 Susan-0.020
Token hugged
Token ID41130
Feature activation+1.28e+00

Pos logit contributions
bi+0.002
isi+0.002
 Johnson+0.002
 Betty+0.002
 Susan+0.002
Neg logit contributions
 tant-0.003
obs-0.003
ware-0.003
lights-0.003
 boards-0.003
Token his
Token ID465
Feature activation-9.93e-01

Pos logit contributions
bi+0.039
 Betty+0.038
 Wendy+0.037
 Johnson+0.036
 Karen+0.036
Neg logit contributions
renches-0.089
obs-0.088
ware-0.088
lights-0.088
 tant-0.087
Token mom
Token ID1995
Feature activation+5.09e-01

Pos logit contributions
lights+0.069
ware+0.069
 tant+0.069
renches+0.069
obs+0.068
Neg logit contributions
bi-0.030
 Betty-0.028
 Wendy-0.028
 Susan-0.027
 Karen-0.027
Token.
Token ID13
Feature activation-1.22e-01

Pos logit contributions
 Betty+0.024
bi+0.024
 Wendy+0.023
 Patty+0.023
Jud+0.023
Neg logit contributions
lights-0.025
 tant-0.024
obs-0.024
ware-0.022
der-0.022
Token From
Token ID3574
Feature activation-2.49e-01

Pos logit contributions
lights+0.006
obs+0.006
ware+0.006
 tant+0.006
renches+0.006
Neg logit contributions
bi-0.006
 Betty-0.006
 Wendy-0.005
 Johnson-0.005
my-0.005
Token then
Token ID788
Feature activation-1.23e+00

Pos logit contributions
lights+0.012
ware+0.011
obs+0.011
renches+0.011
 tant+0.011
Neg logit contributions
bi-0.013
 Betty-0.012
 Johnson-0.012
 Susan-0.012
 Wendy-0.012
Token on
Token ID319
Feature activation-9.32e-01

Pos logit contributions
lights+0.075
ware+0.075
renches+0.075
 tant+0.075
obs+0.075
Neg logit contributions
bi-0.047
 Betty-0.045
 Wendy-0.044
 Johnson-0.043
 Susan-0.043
Token,
Token ID11
Feature activation+7.21e-02

Pos logit contributions
lights+0.068
ware+0.068
renches+0.068
 tant+0.068
obs+0.067
Neg logit contributions
bi-0.025
 Betty-0.023
 Wendy-0.022
 Johnson-0.022
 Susan-0.021
Token every
Token ID790
Feature activation-4.90e-01

Pos logit contributions
bi+0.003
 Wendy+0.003
 Betty+0.003
 reunited+0.003
isi+0.003
Neg logit contributions
 tant-0.004
lights-0.004
ware-0.004
renches-0.004
obs-0.004
Tokeners
Token ID364
Feature activation+1.97e-01

Pos logit contributions
 Wendy+0.043
bi+0.043
 Johnson+0.043
 Betty+0.042
 Karen+0.042
Neg logit contributions
ware-0.016
oons-0.013
lights-0.012
ands-0.012
renches-0.012
Token took
Token ID1718
Feature activation-3.57e-01

Pos logit contributions
bi+0.007
 Wendy+0.007
 Betty+0.007
 Johnson+0.007
 Patty+0.007
Neg logit contributions
ware-0.012
 tant-0.012
renches-0.012
obs-0.011
lights-0.011
Token the
Token ID262
Feature activation+2.32e-01

Pos logit contributions
 tant+0.022
renches+0.021
ware+0.020
 Some+0.020
obs+0.020
Neg logit contributions
bi-0.015
 Johnson-0.014
 Susan-0.013
isi-0.013
 Betty-0.013
Token coin
Token ID10752
Feature activation+1.88e-01

Pos logit contributions
bi+0.011
 Betty+0.011
 Johnson+0.011
my+0.011
 Wendy+0.010
Neg logit contributions
lights-0.012
renches-0.012
obs-0.012
ware-0.011
 tant-0.011
Token and
Token ID290
Feature activation+4.43e-02

Pos logit contributions
bi+0.008
 Wendy+0.008
 Karen+0.008
 Betty+0.008
 Susan+0.008
Neg logit contributions
ware-0.011
obs-0.010
 tant-0.010
lights-0.010
renches-0.010
Token bit
Token ID1643
Feature activation+1.58e+00

Pos logit contributions
bi+0.002
isi+0.002
 Johnson+0.002
 Susan+0.002
 Patty+0.002
Neg logit contributions
ware-0.003
 tant-0.002
lights-0.002
renches-0.002
obs-0.002
Token it
Token ID340
Feature activation-3.60e-01

Pos logit contributions
bi+0.062
 Susan+0.058
 Betty+0.057
 Johnson+0.056
 Karen+0.055
Neg logit contributions
 tant-0.097
ware-0.095
obs-0.094
lights-0.094
renches-0.093
Token.
Token ID13
Feature activation+3.57e-01

Pos logit contributions
lights+0.026
renches+0.026
obs+0.026
ware+0.026
 tant+0.026
Neg logit contributions
 Betty-0.009
my-0.009
bi-0.009
 Johnson-0.009
 Wendy-0.009
Token "
Token ID366
Feature activation+1.77e-01

Pos logit contributions
 Betty+0.015
 Wendy+0.015
 Sue+0.014
 Johnson+0.014
 Karen+0.014
Neg logit contributions
 tant-0.021
lights-0.021
ware-0.021
obs-0.021
renches-0.021
TokenIt
Token ID1026
Feature activation+1.47e+00

Pos logit contributions
my+0.006
bi+0.006
 Johnson+0.006
 Wendy+0.006
 Betty+0.006
Neg logit contributions
lights-0.012
ware-0.011
renches-0.011
obs-0.011
 tant-0.011
Token's
Token ID338
Feature activation-2.17e-01

Pos logit contributions
bi+0.071
 Wendy+0.070
 Karen+0.070
 Johnson+0.068
 Tammy+0.067
Neg logit contributions
ware-0.075
obs-0.071
renches-0.071
 tant-0.070
lights-0.068
Token And
Token ID843
Feature activation+1.22e+00

Pos logit contributions
bi+0.024
 Betty+0.023
 Wendy+0.023
 Johnson+0.022
 Susan+0.022
Neg logit contributions
ware-0.030
 tant-0.030
lights-0.030
renches-0.030
obs-0.030
Token even
Token ID772
Feature activation-9.81e-01

Pos logit contributions
bi+0.052
 Johnson+0.047
 Betty+0.046
 Wendy+0.046
my+0.046
Neg logit contributions
lights-0.069
 tant-0.068
renches-0.068
obs-0.067
ware-0.067
Token though
Token ID996
Feature activation-5.29e-01

Pos logit contributions
obs+0.057
renches+0.056
ware+0.056
lights+0.055
 boards+0.053
Neg logit contributions
 Betty-0.044
 Daisy-0.042
 Susan-0.042
bi-0.042
 Wendy-0.041
Token they
Token ID484
Feature activation+3.37e-01

Pos logit contributions
lights+0.031
renches+0.031
ware+0.031
 tant+0.030
obs+0.030
Neg logit contributions
bi-0.022
 Betty-0.020
 Wendy-0.020
 Johnson-0.020
my-0.019
Token were
Token ID547
Feature activation+6.36e-01

Pos logit contributions
bi+0.013
 Betty+0.012
 Wendy+0.012
 Johnson+0.012
 Susan+0.012
Neg logit contributions
 tant-0.020
ware-0.020
lights-0.020
renches-0.020
obs-0.020
Token scared
Token ID12008
Feature activation+4.77e-01

Pos logit contributions
bi+0.022
 Betty+0.021
 Wendy+0.021
 Johnson+0.020
 Susan+0.020
Neg logit contributions
lights-0.041
ware-0.041
renches-0.040
 tant-0.040
obs-0.040
Token at
Token ID379
Feature activation+3.59e-01

Pos logit contributions
 Betty+0.012
 Daisy+0.012
 Wendy+0.012
 Susan+0.011
 Sally+0.011
Neg logit contributions
ware-0.037
renches-0.036
obs-0.036
 tant-0.035
 apr-0.034
Token times
Token ID1661
Feature activation+4.43e-02

Pos logit contributions
bi+0.009
 Betty+0.009
 Wendy+0.009
 Johnson+0.008
 Susan+0.008
Neg logit contributions
lights-0.026
ware-0.026
 tant-0.026
renches-0.026
obs-0.026
Token,
Token ID11
Feature activation+6.81e-01

Pos logit contributions
 Betty+0.001
 Wendy+0.001
 Susan+0.001
bi+0.001
 Karen+0.001
Neg logit contributions
ware-0.004
 tant-0.003
obs-0.003
renches-0.003
lights-0.003
Token they
Token ID484
Feature activation-6.57e-02

Pos logit contributions
bi+0.027
 Betty+0.025
 Wendy+0.025
 Johnson+0.025
my+0.024
Neg logit contributions
lights-0.041
renches-0.040
ware-0.040
 tant-0.040
obs-0.040
Token were
Token ID547
Feature activation+2.14e-01

Pos logit contributions
 tant+0.004
lights+0.003
ware+0.003
obs+0.003
der+0.003
Neg logit contributions
 Johnson-0.003
bi-0.003
 Wendy-0.003
 Betty-0.003
 Karen-0.003
Token they
Token ID484
Feature activation+1.07e+00

Pos logit contributions
bi+0.047
 Betty+0.042
isi+0.042
 Susan+0.041
 Johnson+0.041
Neg logit contributions
lights-0.051
ware-0.049
obs-0.048
renches-0.047
 tant-0.046
Token got
Token ID1392
Feature activation+2.98e-01

Pos logit contributions
bi+0.046
 Betty+0.045
 Wendy+0.044
 Johnson+0.043
 Patty+0.043
Neg logit contributions
 tant-0.059
ware-0.059
obs-0.057
renches-0.057
lights-0.056
Token home
Token ID1363
Feature activation+1.65e+00

Pos logit contributions
 Betty+0.010
 Wendy+0.010
 Patty+0.010
isi+0.009
 Tammy+0.009
Neg logit contributions
obs-0.019
ware-0.019
lights-0.019
 tant-0.019
 boards-0.019
Token,
Token ID11
Feature activation+7.75e-01

Pos logit contributions
bi+0.047
 Betty+0.044
 Wendy+0.042
 Johnson+0.041
 Susan+0.041
Neg logit contributions
lights-0.117
ware-0.116
renches-0.115
 tant-0.115
obs-0.115
Token mom
Token ID1995
Feature activation+1.37e+00

Pos logit contributions
bi+0.029
 Betty+0.028
 Wendy+0.027
 Johnson+0.027
 Susan+0.026
Neg logit contributions
renches-0.048
ware-0.048
 tant-0.048
lights-0.048
obs-0.047
Tokenmy
Token ID1820
Feature activation-5.68e-01

Pos logit contributions
 Iris+0.027
 Wendy+0.026
 Patty+0.025
 Betty+0.022
Joy+0.022
Neg logit contributions
ware-0.107
lights-0.106
obs-0.106
renches-0.104
oons-0.103
Token and
Token ID290
Feature activation-2.49e-01

Pos logit contributions
ware+0.034
obs+0.033
renches+0.033
lights+0.032
 tant+0.031
Neg logit contributions
bi-0.022
 Wendy-0.021
 Betty-0.021
 Patty-0.020
 Iris-0.020
Token Georg
Token ID6850
Feature activation-8.39e-01

Pos logit contributions
lights+0.016
ware+0.016
 tant+0.016
renches+0.016
obs+0.015
Neg logit contributions
bi-0.009
 Betty-0.009
 Wendy-0.008
 Johnson-0.008
 Susan-0.008
Tokenie
Token ID494
Feature activation-1.16e+00

Pos logit contributions
lights+0.067
ware+0.066
 tant+0.066
renches+0.066
obs+0.065
Neg logit contributions
bi-0.016
 Betty-0.016
 Wendy-0.016
 Johnson-0.015
 Susan-0.014
Token put
Token ID1234
Feature activation+1.03e+00

Pos logit contributions
ware+0.067
obs+0.065
renches+0.064
 boards+0.063
lights+0.062
Neg logit contributions
 Betty-0.047
bi-0.046
 Wendy-0.046
 Patty-0.045
 Tammy-0.044
Token all
Token ID477
Feature activation+2.03e+00

Pos logit contributions
bi+0.036
 Betty+0.034
 Wendy+0.033
 Johnson+0.032
 Susan+0.032
Neg logit contributions
ware-0.066
 tant-0.066
lights-0.066
renches-0.065
obs-0.065
Token
Token ID198
Feature activation-2.78e-01

Pos logit contributions
ware+0.012
lights+0.011
 tant+0.011
renches+0.011
obs+0.011
Neg logit contributions
bi-0.005
 Betty-0.005
 Wendy-0.005
my-0.005
 Johnson-0.004
Token"
Token ID1
Feature activation+9.45e-01

Pos logit contributions
ware+0.019
lights+0.019
renches+0.018
 tant+0.018
obs+0.018
Neg logit contributions
bi-0.009
 Betty-0.009
 Wendy-0.009
 Johnson-0.008
my-0.008
TokenHello
Token ID15496
Feature activation-1.97e+00

Pos logit contributions
bi+0.033
 Betty+0.032
 Wendy+0.031
 Johnson+0.031
my+0.030
Neg logit contributions
renches-0.061
lights-0.061
obs-0.061
 tant-0.061
ware-0.060
Token,
Token ID11
Feature activation-1.18e+00

Pos logit contributions
lights+0.109
ware+0.109
obs+0.105
renches+0.103
 tant+0.102
Neg logit contributions
bi-0.081
Susan-0.074
 Susan-0.074
isi-0.073
 reunited-0.073
Token Rex
Token ID17853
Feature activation-6.80e-01

Pos logit contributions
 tant+0.069
lights+0.069
ware+0.069
renches+0.068
obs+0.067
Neg logit contributions
bi-0.048
 Betty-0.042
 Susan-0.042
 Johnson-0.042
 reunited-0.041
Token,"
Token ID553
Feature activation-2.23e+00

Pos logit contributions
ware+0.028
ro+0.028
obs+0.028
lights+0.027
der+0.026
Neg logit contributions
 Wendy-0.034
 Iris-0.033
bi-0.033
isi-0.032
 Karen-0.032
Token they
Token ID484
Feature activation-1.27e+00

Pos logit contributions
lights+0.154
ware+0.153
 tant+0.153
renches+0.153
obs+0.152
Neg logit contributions
bi-0.068
 Betty-0.063
 Wendy-0.062
 Johnson-0.061
 Susan-0.060
Token said
Token ID531
Feature activation+2.62e-01

Pos logit contributions
 tant+0.089
lights+0.085
renches+0.085
ware+0.085
obs+0.084
Neg logit contributions
bi-0.039
 Johnson-0.037
 Betty-0.034
 Karen-0.033
 Susan-0.033
Token.
Token ID13
Feature activation-8.71e-01

Pos logit contributions
bi+0.010
 Betty+0.009
 Wendy+0.009
 Johnson+0.009
 Susan+0.009
Neg logit contributions
lights-0.016
ware-0.016
 tant-0.016
renches-0.016
obs-0.016
Token "
Token ID366
Feature activation-2.54e-01

Pos logit contributions
 tant+0.064
renches+0.064
ware+0.062
lights+0.062
obs+0.061
Neg logit contributions
 Sue-0.028
 Wendy-0.027
 Betty-0.027
 Johnson-0.027
 Bob-0.026
TokenWe
Token ID1135
Feature activation-2.46e-01

Pos logit contributions
renches+0.017
lights+0.017
 tant+0.016
ware+0.016
obs+0.016
Neg logit contributions
my-0.009
bi-0.009
 Johnson-0.009
Joy-0.008
 Squirrel-0.008
Token go
Token ID467
Feature activation-2.93e-01

Pos logit contributions
my+0.012
 Betty+0.011
 Wendy+0.011
 Daisy+0.011
bi+0.011
Neg logit contributions
lights-0.033
 tant-0.032
ware-0.032
obs-0.032
renches-0.032
Token play
Token ID711
Feature activation+3.67e-01

Pos logit contributions
lights+0.020
 tant+0.020
renches+0.020
obs+0.020
ware+0.020
Neg logit contributions
my-0.010
bi-0.009
 Betty-0.009
 Wendy-0.009
 Daisy-0.009
Token in
Token ID287
Feature activation-3.43e-01

Pos logit contributions
 Daisy+0.011
my+0.010
 Sally+0.010
 Wendy+0.010
ma+0.010
Neg logit contributions
obs-0.027
lights-0.027
renches-0.026
inters-0.025
ware-0.024
Token the
Token ID262
Feature activation+3.81e-01

Pos logit contributions
obs+0.030
renches+0.030
lights+0.029
 tant+0.029
ware+0.029
Neg logit contributions
 Daisy-0.005
my-0.005
ma-0.005
 Sally-0.005
 Susan-0.005
Token rain
Token ID6290
Feature activation-5.49e-01

Pos logit contributions
bi+0.016
 Betty+0.015
 Wendy+0.015
 Johnson+0.015
my+0.014
Neg logit contributions
lights-0.022
ware-0.022
 tant-0.022
renches-0.022
obs-0.022
Token?'
Token ID8348
Feature activation-2.09e+00

Pos logit contributions
renches+0.037
obs+0.037
lights+0.036
ware+0.036
 tant+0.035
Neg logit contributions
my-0.020
bi-0.018
 Johnson-0.018
 Wendy-0.017
 Daisy-0.017
Token
Token ID198
Feature activation-1.89e-01

Pos logit contributions
lights+0.141
renches+0.141
ware+0.140
obs+0.140
 tant+0.139
Neg logit contributions
bi-0.068
 Betty-0.066
 Susan-0.064
 Wendy-0.063
 Johnson-0.063
Token
Token ID198
Feature activation-2.21e-01

Pos logit contributions
lights+0.012
ware+0.012
renches+0.012
 tant+0.012
obs+0.012
Neg logit contributions
bi-0.006
 Betty-0.006
 Wendy-0.006
 Johnson-0.006
 Susan-0.006
TokenHis
Token ID6653
Feature activation-1.27e+00

Pos logit contributions
lights+0.015
ware+0.014
renches+0.014
obs+0.014
 tant+0.014
Neg logit contributions
bi-0.008
 Betty-0.007
 Susan-0.007
 Johnson-0.007
 Wendy-0.007
Token mum
Token ID25682
Feature activation-3.63e-01

Pos logit contributions
lights+0.089
ware+0.088
 tant+0.088
obs+0.088
renches+0.088
Neg logit contributions
bi-0.036
 Betty-0.035
 Johnson-0.034
 Wendy-0.033
 Susan-0.033
Token sighed
Token ID21893
Feature activation-7.77e-01

Pos logit contributions
ware+0.024
lights+0.024
 tant+0.024
renches+0.024
obs+0.024
Neg logit contributions
bi-0.011
 Betty-0.011
 Wendy-0.011
 Johnson-0.010
 Patty-0.010
Token saw
Token ID2497
Feature activation-2.03e+00

Pos logit contributions
lights+0.026
renches+0.026
 tant+0.026
obs+0.026
ware+0.025
Neg logit contributions
bi-0.012
 Betty-0.011
my-0.011
 Wendy-0.011
 Susan-0.010
Token a
Token ID257
Feature activation+1.07e+00

Pos logit contributions
renches+0.147
obs+0.146
lights+0.144
 tant+0.142
ware+0.141
Neg logit contributions
 Betty-0.056
my-0.055
 Wendy-0.054
 Sue-0.054
 Daisy-0.053
Token box
Token ID3091
Feature activation+1.62e+00

Pos logit contributions
bi+0.052
 Betty+0.049
 Wendy+0.047
 Susan+0.047
 Karen+0.047
Neg logit contributions
ware-0.055
 tant-0.055
lights-0.054
obs-0.053
renches-0.053
Token in
Token ID287
Feature activation-6.56e-01

Pos logit contributions
bi+0.070
 Betty+0.067
 Wendy+0.066
 Johnson+0.064
 Susan+0.064
Neg logit contributions
lights-0.091
ware-0.091
obs-0.090
renches-0.089
 tant-0.088
Token the
Token ID262
Feature activation+4.31e-01

Pos logit contributions
lights+0.047
ware+0.047
renches+0.047
 tant+0.047
obs+0.047
Neg logit contributions
bi-0.018
 Betty-0.017
 Wendy-0.016
 Johnson-0.016
 Susan-0.016
Token corner
Token ID5228
Feature activation-2.03e+00

Pos logit contributions
bi+0.023
 Betty+0.022
 Wendy+0.022
 Johnson+0.022
 Susan+0.022
Neg logit contributions
lights-0.020
ware-0.019
renches-0.019
 tant-0.019
obs-0.019
Token of
Token ID286
Feature activation-1.13e+00

Pos logit contributions
renches+0.138
 tant+0.138
ware+0.134
obs+0.128
lights+0.128
Neg logit contributions
my-0.070
 Sue-0.066
bi-0.065
ma-0.064
 Betty-0.063
Token the
Token ID262
Feature activation+5.09e-01

Pos logit contributions
obs+0.095
renches+0.094
lights+0.090
ues+0.089
der+0.089
Neg logit contributions
my-0.021
 Sue-0.020
 Johnson-0.019
 Sally-0.019
 Daisy-0.018
Token room
Token ID2119
Feature activation-2.93e-02

Pos logit contributions
bi+0.025
 Betty+0.024
 Wendy+0.024
 Johnson+0.024
my+0.023
Neg logit contributions
lights-0.026
renches-0.026
ware-0.026
 tant-0.025
obs-0.025
Token and
Token ID290
Feature activation+2.17e-01

Pos logit contributions
 tant+0.002
renches+0.002
lights+0.002
ware+0.002
obs+0.002
Neg logit contributions
my-0.001
bi-0.001
 Johnson-0.001
 Sue-0.001
 Betty-0.001
Token started
Token ID2067
Feature activation+1.13e+00

Pos logit contributions
bi+0.007
 Betty+0.007
 Wendy+0.007
 Susan+0.007
my+0.006
Neg logit contributions
renches-0.015
 tant-0.015
lights-0.015
obs-0.014
ware-0.014
Token glasses
Token ID15232
Feature activation-7.80e-02

Pos logit contributions
lights+0.123
obs+0.120
renches+0.118
ware+0.118
 tant+0.115
Neg logit contributions
 Betty-0.090
bi-0.089
 Wendy-0.089
 Johnson-0.086
 Susan-0.085
Token.
Token ID13
Feature activation+4.31e-01

Pos logit contributions
renches+0.006
 tant+0.006
lights+0.005
ware+0.005
obs+0.005
Neg logit contributions
my-0.002
 Johnson-0.002
bi-0.002
 Betty-0.002
 Wendy-0.002
Token He
Token ID679
Feature activation-2.61e-01

Pos logit contributions
bi+0.020
 Betty+0.020
 Wendy+0.019
 Johnson+0.019
 Susan+0.019
Neg logit contributions
ware-0.023
 tant-0.023
renches-0.023
lights-0.023
obs-0.022
Token quickly
Token ID2952
Feature activation-7.72e-01

Pos logit contributions
lights+0.018
ware+0.018
renches+0.018
 tant+0.017
obs+0.017
Neg logit contributions
bi-0.008
 Betty-0.008
 Wendy-0.008
 Johnson-0.007
 Susan-0.007
Token told
Token ID1297
Feature activation-1.83e-01

Pos logit contributions
lights+0.049
ware+0.049
renches+0.049
 tant+0.049
obs+0.049
Neg logit contributions
bi-0.027
 Betty-0.026
 Wendy-0.026
 Johnson-0.025
 Susan-0.025
Token his
Token ID465
Feature activation-2.21e+00

Pos logit contributions
lights+0.012
ware+0.012
 tant+0.012
renches+0.012
obs+0.012
Neg logit contributions
bi-0.006
 Betty-0.005
 Wendy-0.005
 Johnson-0.005
 Susan-0.005
Token mom
Token ID1995
Feature activation-6.12e-01

Pos logit contributions
 tant+0.145
renches+0.145
ware+0.145
obs+0.144
lights+0.144
Neg logit contributions
bi-0.074
 Betty-0.068
 Johnson-0.067
 Wendy-0.066
 Susan-0.065
Token and
Token ID290
Feature activation-1.12e+00

Pos logit contributions
lights+0.035
ware+0.034
obs+0.034
 tant+0.033
renches+0.033
Neg logit contributions
bi-0.025
 Wendy-0.024
 Betty-0.024
 Iris-0.023
 Johnson-0.023
Token dad
Token ID9955
Feature activation-9.73e-01

Pos logit contributions
ware+0.061
 tant+0.060
lights+0.060
obs+0.058
 boards+0.057
Neg logit contributions
bi-0.048
 Johnson-0.044
 Wendy-0.044
 Betty-0.044
 Patty-0.043
Token and
Token ID290
Feature activation-2.49e-01

Pos logit contributions
ware+0.054
lights+0.054
 tant+0.054
obs+0.054
renches+0.053
Neg logit contributions
bi-0.041
 Betty-0.040
 Johnson-0.039
 Wendy-0.039
 Patty-0.037
Token they
Token ID484
Feature activation-1.16e-02

Pos logit contributions
ware+0.015
 tant+0.014
lights+0.014
obs+0.014
renches+0.014
Neg logit contributions
bi-0.010
 Johnson-0.009
 Wendy-0.009
 Betty-0.009
 Iris-0.009
Token Tim
Token ID5045
Feature activation+1.37e+00

Pos logit contributions
lights+0.017
ware+0.017
renches+0.017
 tant+0.016
obs+0.016
Neg logit contributions
bi-0.005
 Betty-0.005
 Wendy-0.004
 Johnson-0.004
 Susan-0.004
Tokenmy
Token ID1820
Feature activation-4.58e-01

Pos logit contributions
 Wendy+0.043
 Betty+0.039
 Iris+0.039
 Patty+0.039
 Karen+0.038
Neg logit contributions
 tant-0.092
ware-0.091
obs-0.090
lights-0.090
renches-0.089
Token.
Token ID13
Feature activation-1.20e-01

Pos logit contributions
obs+0.024
 tant+0.024
ware+0.024
lights+0.024
renches+0.023
Neg logit contributions
 Wendy-0.021
bi-0.020
 Betty-0.020
 Patty-0.020
 Karen-0.019
Token Tim
Token ID5045
Feature activation+9.76e-01

Pos logit contributions
lights+0.009
ware+0.009
 tant+0.009
renches+0.009
obs+0.009
Neg logit contributions
bi-0.003
 Betty-0.003
 Wendy-0.003
 Johnson-0.003
 Susan-0.003
Tokenmy
Token ID1820
Feature activation-9.53e-01

Pos logit contributions
 Wendy+0.018
 Patty+0.016
 Betty+0.016
 Iris+0.016
 Karen+0.014
Neg logit contributions
 tant-0.078
lights-0.077
renches-0.077
ware-0.077
obs-0.076
Token loved
Token ID6151
Feature activation-1.54e+00

Pos logit contributions
obs+0.048
ware+0.047
renches+0.047
lights+0.047
 tant+0.047
Neg logit contributions
bi-0.045
 Wendy-0.044
 Betty-0.044
 Patty-0.043
 Johnson-0.042
Token to
Token ID284
Feature activation-5.34e-01

Pos logit contributions
lights+0.100
 tant+0.096
renches+0.096
ware+0.094
obs+0.094
Neg logit contributions
 Betty-0.057
bi-0.055
my-0.054
 Johnson-0.054
 Sue-0.053
Token go
Token ID467
Feature activation+3.58e-02

Pos logit contributions
lights+0.041
 tant+0.041
obs+0.040
renches+0.039
der+0.039
Neg logit contributions
 Betty-0.014
bi-0.013
 Sue-0.013
 Susan-0.013
my-0.013
Token to
Token ID284
Feature activation+4.37e-01

Pos logit contributions
bi+0.002
 Betty+0.001
 Wendy+0.001
 Johnson+0.001
 Susan+0.001
Neg logit contributions
lights-0.002
ware-0.002
renches-0.002
obs-0.002
 tant-0.002
Token the
Token ID262
Feature activation+1.03e+00

Pos logit contributions
bi+0.013
 Betty+0.012
 Wendy+0.012
 Johnson+0.012
 Susan+0.011
Neg logit contributions
ware-0.030
renches-0.030
 tant-0.030
obs-0.030
lights-0.030
Token lake
Token ID13546
Feature activation+5.91e-01

Pos logit contributions
bi+0.041
 Betty+0.040
 Wendy+0.039
 Johnson+0.038
 Susan+0.038
Neg logit contributions
lights-0.060
ware-0.060
renches-0.060
 tant-0.060
obs-0.060
Token He
Token ID679
Feature activation-5.43e-01

Pos logit contributions
 tant+0.003
ware+0.003
renches+0.003
lights+0.003
obs+0.003
Neg logit contributions
 Betty-0.002
 Sue-0.002
 Wendy-0.002
 Susan-0.002
 Johnson-0.002
Token was
Token ID373
Feature activation-1.67e-01

Pos logit contributions
ware+0.031
lights+0.031
renches+0.031
 tant+0.031
obs+0.031
Neg logit contributions
bi-0.023
 Betty-0.022
 Wendy-0.021
 Johnson-0.021
 Susan-0.021
Token here
Token ID994
Feature activation+8.29e-01

Pos logit contributions
lights+0.011
ware+0.011
obs+0.011
 tant+0.011
renches+0.011
Neg logit contributions
bi-0.005
 Betty-0.005
 Wendy-0.005
 Susan-0.005
 Johnson-0.005
Token a
Token ID257
Feature activation+4.70e-01

Pos logit contributions
my+0.025
 Johnson+0.024
bi+0.024
 Sue+0.023
 Betty+0.023
Neg logit contributions
renches-0.060
 tant-0.060
ware-0.059
lights-0.059
obs-0.058
Token minute
Token ID5664
Feature activation-2.49e-01

Pos logit contributions
bi+0.019
 Betty+0.018
 Wendy+0.018
 Johnson+0.018
 Susan+0.018
Neg logit contributions
lights-0.027
renches-0.027
obs-0.027
ware-0.027
 tant-0.027
Token ago
Token ID2084
Feature activation-5.96e+00

Pos logit contributions
renches+0.019
 tant+0.019
lights+0.018
obs+0.018
ware+0.018
Neg logit contributions
my-0.007
bi-0.006
 Sue-0.006
 Johnson-0.006
 Betty-0.006
Token.
Token ID13
Feature activation+3.81e-01

Pos logit contributions
renches+0.470
 tant+0.468
lights+0.464
obs+0.451
ware+0.442
Neg logit contributions
bi-0.140
 Johnson-0.132
my-0.130
 Sue-0.126
 Susan-0.124
Token He
Token ID679
Feature activation-9.51e-02

Pos logit contributions
 Sue+0.016
 Betty+0.016
 Susan+0.016
 Daisy+0.016
my+0.016
Neg logit contributions
 tant-0.023
ware-0.023
lights-0.022
obs-0.022
renches-0.022
Token was
Token ID373
Feature activation-1.30e-02

Pos logit contributions
lights+0.006
 tant+0.006
renches+0.006
ware+0.006
obs+0.006
Neg logit contributions
bi-0.003
 Betty-0.003
 Wendy-0.003
my-0.003
 Johnson-0.003
Token chasing
Token ID20023
Feature activation-8.33e-01

Pos logit contributions
lights+0.001
ware+0.001
obs+0.001
 tant+0.001
renches+0.001
Neg logit contributions
bi-0.000
 Betty-0.000
 Wendy-0.000
 Susan-0.000
my-0.000
Token the
Token ID262
Feature activation-2.03e-01

Pos logit contributions
renches+0.061
lights+0.061
ware+0.060
obs+0.060
 tant+0.057
Neg logit contributions
 Betty-0.023
 Wendy-0.022
bi-0.022
 Daisy-0.021
 Sue-0.021
Token him
Token ID683
Feature activation-2.49e+00

Pos logit contributions
lights+0.037
ware+0.036
renches+0.036
obs+0.036
 tant+0.036
Neg logit contributions
bi-0.016
 Betty-0.015
 Wendy-0.015
 Karen-0.014
 Susan-0.014
Token about
Token ID546
Feature activation+6.76e-01

Pos logit contributions
renches+0.158
lights+0.156
 tant+0.155
obs+0.155
ware+0.153
Neg logit contributions
bi-0.091
 Betty-0.090
 Wendy-0.086
 Sue-0.084
 Patty-0.084
Token the
Token ID262
Feature activation-7.17e-01

Pos logit contributions
 Betty+0.019
bi+0.018
 Daisy+0.018
 Wendy+0.017
 Susan+0.017
Neg logit contributions
obs-0.050
lights-0.050
renches-0.049
 tant-0.049
ware-0.047
Token frog
Token ID21264
Feature activation+1.11e+00

Pos logit contributions
lights+0.042
obs+0.041
renches+0.041
ware+0.040
 tant+0.040
Neg logit contributions
bi-0.030
 Betty-0.029
 Wendy-0.028
 Johnson-0.028
my-0.028
Token's
Token ID338
Feature activation-5.93e-01

Pos logit contributions
bi+0.039
 Betty+0.038
 Wendy+0.036
my+0.036
 Johnson+0.036
Neg logit contributions
renches-0.072
lights-0.071
 tant-0.071
ware-0.071
obs-0.071
Token habitat
Token ID20018
Feature activation-5.24e+00

Pos logit contributions
lights+0.038
renches+0.035
ware+0.035
obs+0.035
 tant+0.035
Neg logit contributions
bi-0.023
 Betty-0.023
 home-0.022
 Johnson-0.021
my-0.021
Token,
Token ID11
Feature activation-5.36e-01

Pos logit contributions
renches+0.403
ware+0.397
 tant+0.389
 apr+0.374
obs+0.371
Neg logit contributions
my-0.120
!-0.112
ma-0.099
 welcome-0.098
bi-0.098
Token diet
Token ID5496
Feature activation-2.56e+00

Pos logit contributions
renches+0.037
obs+0.036
ware+0.036
lights+0.035
 tant+0.035
Neg logit contributions
 Betty-0.019
 Daisy-0.017
 Sue-0.017
 Wendy-0.017
my-0.017
Token and
Token ID290
Feature activation-1.71e-01

Pos logit contributions
 tant+0.170
renches+0.170
lights+0.168
ware+0.167
obs+0.161
Neg logit contributions
my-0.081
bi-0.078
 Betty-0.076
 Wendy-0.074
 Patty-0.072
Token how
Token ID703
Feature activation+2.56e+00

Pos logit contributions
renches+0.011
obs+0.011
lights+0.011
 tant+0.011
ware+0.011
Neg logit contributions
 Betty-0.006
bi-0.006
my-0.006
 Susan-0.006
 Wendy-0.005
Token it
Token ID340
Feature activation-8.70e-01

Pos logit contributions
bi+0.091
 Betty+0.088
 Susan+0.084
 Sue+0.081
my+0.081
Neg logit contributions
ware-0.167
renches-0.166
 tant-0.166
lights-0.165
obs-0.163
Token Louise
Token ID31774
Feature activation+1.57e-01

Pos logit contributions
 Some+0.008
 ago+0.008
ro+0.008
ware+0.008
lights+0.008
Neg logit contributions
bi-0.008
isi-0.007
 reunited-0.007
 Chuck-0.007
rey-0.007
Token.
Token ID13
Feature activation-3.69e-02

Pos logit contributions
bi+0.007
 Wendy+0.007
 Betty+0.007
 Patty+0.007
 Johnson+0.007
Neg logit contributions
ware-0.008
lights-0.008
 tant-0.008
obs-0.008
renches-0.007
Token Her
Token ID2332
Feature activation+1.27e-01

Pos logit contributions
obs+0.002
lights+0.002
renches+0.002
ware+0.002
 tant+0.002
Neg logit contributions
bi-0.001
 Johnson-0.001
 Karen-0.001
 reunited-0.001
 Wendy-0.001
Token favourite
Token ID12507
Feature activation-1.58e+00

Pos logit contributions
bi+0.006
 Susan+0.006
 Lib+0.006
 Betty+0.006
Joy+0.006
Neg logit contributions
obs-0.006
renches-0.006
 tant-0.006
ware-0.006
lights-0.006
Token past
Token ID1613
Feature activation-9.47e-01

Pos logit contributions
lights+0.112
renches+0.112
obs+0.110
ware+0.109
 tant+0.108
Neg logit contributions
bi-0.047
 Betty-0.044
 Wendy-0.043
 Johnson-0.043
my-0.042
Tokenime
Token ID524
Feature activation-4.93e+00

Pos logit contributions
lights+0.055
ware+0.055
 tant+0.054
renches+0.054
obs+0.054
Neg logit contributions
bi-0.039
 Betty-0.037
 Wendy-0.037
 Johnson-0.036
 Susan-0.036
Token was
Token ID373
Feature activation-8.08e-01

Pos logit contributions
lights+0.323
ware+0.322
 tant+0.321
renches+0.321
obs+0.320
Neg logit contributions
bi-0.166
 Betty-0.158
 Wendy-0.154
 Johnson-0.152
 Susan-0.149
Token playing
Token ID2712
Feature activation-1.24e+00

Pos logit contributions
lights+0.053
ware+0.052
renches+0.052
 tant+0.052
obs+0.052
Neg logit contributions
bi-0.028
 Betty-0.026
 Wendy-0.026
 Johnson-0.025
 Susan-0.025
Token in
Token ID287
Feature activation-7.07e-01

Pos logit contributions
lights+0.080
obs+0.078
renches+0.078
ware+0.078
 tant+0.077
Neg logit contributions
 Betty-0.043
bi-0.043
 Wendy-0.042
 Johnson-0.041
my-0.040
Token the
Token ID262
Feature activation+7.77e-02

Pos logit contributions
ware+0.053
lights+0.053
 tant+0.052
renches+0.052
obs+0.051
Neg logit contributions
bi-0.018
 Betty-0.015
 Wendy-0.014
 Johnson-0.014
 Patty-0.013
Token park
Token ID3952
Feature activation+2.89e-01

Pos logit contributions
bi+0.003
 Betty+0.003
 Wendy+0.003
 Johnson+0.003
 Susan+0.003
Neg logit contributions
lights-0.005
ware-0.005
renches-0.005
 tant-0.005
obs-0.005
Token a
Token ID257
Feature activation+7.01e-01

Pos logit contributions
 tant+0.061
lights+0.061
ware+0.061
renches+0.061
obs+0.060
Neg logit contributions
bi-0.035
 Betty-0.033
 Wendy-0.031
 Johnson-0.031
my-0.031
Token nice
Token ID3621
Feature activation+1.72e+00

Pos logit contributions
bi+0.034
 Betty+0.033
 Wendy+0.032
 Johnson+0.032
 Susan+0.031
Neg logit contributions
 tant-0.036
ware-0.035
lights-0.035
renches-0.035
obs-0.035
Token house
Token ID2156
Feature activation+4.13e-01

Pos logit contributions
bi+0.072
 Betty+0.071
 Wendy+0.069
 Patty+0.067
 Susan+0.067
Neg logit contributions
 tant-0.097
ware-0.096
lights-0.096
obs-0.095
renches-0.093
Token by
Token ID416
Feature activation-1.04e-01

Pos logit contributions
bi+0.013
 Johnson+0.012
 Betty+0.012
my+0.012
 Susan+0.011
Neg logit contributions
renches-0.029
 tant-0.029
ware-0.028
lights-0.028
obs-0.028
Token the
Token ID262
Feature activation-1.88e-01

Pos logit contributions
obs+0.008
renches+0.008
 tant+0.008
ware+0.008
lights+0.008
Neg logit contributions
 Betty-0.002
bi-0.002
 Susan-0.002
 Wendy-0.002
 Karen-0.002
Token seas
Token ID21547
Feature activation-5.02e+00

Pos logit contributions
lights+0.010
ware+0.010
renches+0.010
 tant+0.010
obs+0.010
Neg logit contributions
bi-0.009
 Betty-0.008
 Wendy-0.008
 Johnson-0.008
 Susan-0.008
Tokenide
Token ID485
Feature activation-5.61e-01

Pos logit contributions
lights+0.316
ware+0.314
 tant+0.313
renches+0.312
obs+0.312
Neg logit contributions
bi-0.182
 Betty-0.174
 Wendy-0.171
 Johnson-0.168
 Susan-0.165
Token.
Token ID13
Feature activation-2.40e-01

Pos logit contributions
renches+0.039
 tant+0.038
lights+0.038
ware+0.037
obs+0.037
Neg logit contributions
bi-0.018
 Betty-0.017
my-0.016
 Johnson-0.016
 Wendy-0.016
Token
Token ID198
Feature activation-1.98e-01

Pos logit contributions
ware+0.015
lights+0.015
 tant+0.015
renches+0.015
obs+0.015
Neg logit contributions
bi-0.009
 Betty-0.009
 Wendy-0.008
 Johnson-0.008
 Susan-0.008
Token
Token ID198
Feature activation-2.80e-01

Pos logit contributions
lights+0.011
ware+0.011
 tant+0.011
renches+0.011
obs+0.011
Neg logit contributions
bi-0.008
 Betty-0.008
 Wendy-0.008
 Johnson-0.007
 Susan-0.007
TokenOne
Token ID3198
Feature activation-1.15e+00

Pos logit contributions
lights+0.014
renches+0.014
ware+0.014
 tant+0.014
obs+0.014
Neg logit contributions
bi-0.013
 Betty-0.013
 Wendy-0.013
 Johnson-0.013
 Susan-0.012
Token his
Token ID465
Feature activation-5.73e-01

Pos logit contributions
 Sue+0.018
 Sally+0.016
 Betty+0.016
 Wendy+0.015
bi+0.015
Neg logit contributions
obs-0.059
renches-0.059
ware-0.058
lights-0.056
mic-0.055
Token feet
Token ID3625
Feature activation-2.18e+00

Pos logit contributions
renches+0.032
ware+0.032
lights+0.032
obs+0.032
 tant+0.030
Neg logit contributions
bi-0.026
 Johnson-0.024
 Betty-0.024
 Wendy-0.024
my-0.023
Token and
Token ID290
Feature activation-7.68e-01

Pos logit contributions
renches+0.150
ware+0.150
lights+0.146
obs+0.141
 tant+0.139
Neg logit contributions
my-0.074
bi-0.069
 Johnson-0.068
 Wendy-0.064
 Betty-0.063
Token threw
Token ID9617
Feature activation+3.68e-01

Pos logit contributions
renches+0.054
lights+0.053
obs+0.053
ware+0.053
 tant+0.052
Neg logit contributions
 Wendy-0.023
bi-0.023
 Betty-0.023
 Sue-0.022
 Johnson-0.021
Token a
Token ID257
Feature activation+1.24e+00

Pos logit contributions
 Sue+0.010
bi+0.010
 Betty+0.010
 Wendy+0.010
my+0.010
Neg logit contributions
renches-0.027
obs-0.027
lights-0.026
ware-0.026
oons-0.024
Token tant
Token ID24246
Feature activation-4.56e+00

Pos logit contributions
bi+0.059
my+0.055
 Betty+0.054
 Wendy+0.054
 Johnson+0.053
Neg logit contributions
renches-0.068
lights-0.067
obs-0.065
ware-0.064
der-0.061
Tokenrum
Token ID6582
Feature activation+1.41e+00

Pos logit contributions
 tant+0.208
ware+0.206
lights+0.205
renches+0.205
obs+0.202
Neg logit contributions
bi-0.242
 Wendy-0.236
 Betty-0.235
 Johnson-0.232
 Karen-0.228
Token.
Token ID13
Feature activation-7.21e-01

Pos logit contributions
bi+0.039
 Johnson+0.037
my+0.037
 Betty+0.037
 Wendy+0.036
Neg logit contributions
renches-0.103
 tant-0.102
ware-0.099
obs-0.099
lights-0.097
Token
Token ID198
Feature activation-2.23e-01

Pos logit contributions
lights+0.055
obs+0.055
ware+0.055
renches+0.054
 tant+0.054
Neg logit contributions
 Sue-0.021
 Wendy-0.019
 Betty-0.019
 Sally-0.019
 Daisy-0.018
Token
Token ID198
Feature activation-2.49e-01

Pos logit contributions
ware+0.017
obs+0.017
lights+0.016
 tant+0.016
renches+0.016
Neg logit contributions
 Betty-0.006
bi-0.006
 Wendy-0.006
 Sue-0.006
my-0.005
Token"
Token ID1
Feature activation+1.08e+00

Pos logit contributions
ware+0.017
lights+0.017
obs+0.016
renches+0.016
 tant+0.016
Neg logit contributions
bi-0.008
 Betty-0.008
 Wendy-0.008
 Johnson-0.007
my-0.007