TOP ACTIVATIONS
MAX = 0.098

toys . She wanted her bag , her book , her
went home and put her box on her dress er .
it kept her warm and cos y all night long .
her brother and get the blanket back . Lily nodded her
if she could have her blanket back and her brother gave
she forgot to put the bowl away . As
the box and put the lid on . They ran to
He forgot to put his books and lunch in the bag
and wanted to get her ball back . Amy
, and how light and boun cy it was .
it kept her warm and cos y all night long .
yawn ed and rubbed her eyes . She wanted to go
They we pt until they felt better . Then they buried
. It was an ash tr ay . Fl uffy did
and made him drop his basket . She felt sorry and
he sp ied an ash tr ay . He picked up
dress , and a spark ly cape . Tom thought Elsa
, the bird put her blanket back inside . She was
Ben said , holding his bowl tightly . L
washed the dress until it was perfect , just like new

MIN ACTIVATIONS
MIN = -0.132

She loved to explore and try new things . Daisy wanted
Fox y . Fox y liked to dance in the forest
She loved to explore and try new things . One day
She loved to explore and try new things .
Lily . She loved to try new things and have adventures
to study nature . She liked to look at the trees
loved animals . She especially liked bees ! One
mom and dad . He liked to laugh and cl ap
was very curious . He liked to explore the world around
He liked to explore and try new things . He would
Jane . She liked to try new things , but sometimes
Paw and Tree . They liked to play together in the
She loved to explore and try new things . Today ,
curious little boy . He liked to explore his neighborhood .
play in nature . They liked to run on the grass
was very curious . He liked to explore his surroundings .
who loved nature . She liked to play in the trees
on the farm . They liked to run , jump ,
very brave and liked to try new things .
little girl who loved to try new things . She was

INTERVAL 0.040 - 0.098
CONTAINS 3.206%

took the phone and turned it off . She said :
cookie and smiled . It was the best night ever !
Sure . But be careful . It is very loud ."
into the living room and threw Anna on the couch .
nods , but she still feels sad . She loves to

INTERVAL -0.017 - 0.040
CONTAINS 74.137%

went to the park with her mom and forgot to bring
a break and relax ! <|endoftext|> One gloomy day , there
for help , but no one was available to save her
He sprayed some water on the battery and it started to
She loved to play outside in her garden . One day

INTERVAL -0.075 - -0.017
CONTAINS 22.264%

are good friends . <|endoftext|> L ily liked to jump in
was excited ! Mom my and Daddy said it
at home . His friends thought it was very foolish of
lion . That is two animals . Then we have one
again . Sam asked the cat why he was doing this

INTERVAL -0.132 - -0.075
CONTAINS 0.393%

a game . I 'll count to three and on three
Tom and Tara . Tom liked to play with the fire
Tim my and his friends loved to play on the train
girl named Lily . She loved to eat oranges . One
worry , Lily . Just try it out and you 'll
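
The four INTERVAL ranges above look like equal-width bins spanning MIN = -0.132 to MAX = 0.098. That reading is inferred from the printed numbers; the dump does not state how the bins were chosen. A minimal Python sketch under that assumption (the helper name `bin_edges` is hypothetical) recomputes the edges and sums the CONTAINS shares:

```python
# Assumption: the dashboard splits [MIN, MAX] into four equal-width bins.
# MIN, MAX, and the shares are copied from the listing; bin_edges is a
# hypothetical helper, not part of any tool named in this dump.

def bin_edges(lo, hi, n_bins):
    """Return the n_bins + 1 edges of equal-width bins covering [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]

MIN_ACT = -0.132
MAX_ACT = 0.098

edges = bin_edges(MIN_ACT, MAX_ACT, 4)

# CONTAINS percentages, ordered from the lowest interval to the highest.
shares = [0.393, 22.264, 74.137, 3.206]
total_share = sum(shares)

for lo, hi, share in zip(edges[:-1], edges[1:], shares):
    print(f"{lo:+.4f} to {hi:+.4f}: {share:.3f}%")
print(f"total: {total_share:.3f}%")
```

The recomputed edges agree with the printed ones to within rounding of the displayed MIN and MAX, and the shares cover all sampled tokens. About 96% of activations fall in the two middle bins, so the TOP and MIN example lists describe the tails of the distribution, not typical behaviour.
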
Token toys
Token ID14958
Feature activation+4.56e-02

Pos logit contributions
ivery+0.043
kin+0.043
ifying+0.041
ishy+0.038
 bruised+0.038
Neg logit contributions
 guessing-0.036
 facts-0.035
 tours-0.034
 motorcycles-0.033
 adventures-0.033
Token.
Token ID13
Feature activation+2.27e-02

Pos logit contributions
kin+0.092
ifying+0.091
ivery+0.090
ishy+0.087
 bruised+0.079
Neg logit contributions
 guessing-0.064
 adventures-0.060
 tours-0.059
 bikes-0.059
 facts-0.059
Token She
Token ID1375
Feature activation+4.19e-02

Pos logit contributions
kin+0.040
ivery+0.035
ifying+0.034
ishy+0.033
 bruised+0.032
Neg logit contributions
 topics-0.041
 guessing-0.041
 facts-0.041
 tours-0.040
 discovering-0.039
Token wanted
Token ID2227
Feature activation-9.39e-03

Pos logit contributions
kin+0.095
ivery+0.095
ifying+0.094
ishy+0.086
rier+0.084
Neg logit contributions
 guessing-0.051
 facts-0.048
 walks-0.048
 rides-0.048
 adventures-0.048
Token her
Token ID607
Feature activation+1.33e-03

Pos logit contributions
 guessing+0.014
 facts+0.013
 tours+0.013
 topics+0.013
 discovering+0.013
Neg logit contributions
kin-0.019
ivery-0.017
 bruised-0.017
ifying-0.017
robe-0.016
Token bag
Token ID6131
Feature activation+9.80e-02

Pos logit contributions
kin+0.003
ivery+0.002
ifying+0.002
 bruised+0.002
ishy+0.002
Neg logit contributions
 guessing-0.002
 facts-0.002
 tours-0.002
 motorcycles-0.002
 discovering-0.002
Token,
Token ID11
Feature activation+1.23e-02

Pos logit contributions
ifying+0.231
ishy+0.204
ivery+0.202
rified+0.193
kin+0.193
Neg logit contributions
 to-0.131
ups-0.107
 adventures-0.106
 bikes-0.095
 rules-0.095
Token her
Token ID607
Feature activation+2.48e-02

Pos logit contributions
kin+0.025
ivery+0.023
ifying+0.023
ishy+0.022
 bruised+0.021
Neg logit contributions
 guessing-0.019
 facts-0.018
 tours-0.018
 discovering-0.017
 motorcycles-0.017
Token book
Token ID1492
Feature activation+5.78e-02

Pos logit contributions
ivery+0.047
kin+0.046
ifying+0.044
ishy+0.041
 bruised+0.041
Neg logit contributions
 guessing-0.039
 facts-0.039
 tours-0.038
 motorcycles-0.037
 adventures-0.037
Token,
Token ID11
Feature activation+5.83e-03

Pos logit contributions
ifying+0.127
ivery+0.121
ishy+0.119
kin+0.117
 bruised+0.108
Neg logit contributions
 guessing-0.074
 adventures-0.074
 facts-0.072
 bikes-0.071
 tours-0.070
Token her
Token ID607
Feature activation+2.11e-02

Pos logit contributions
kin+0.011
ivery+0.011
ifying+0.010
 bruised+0.010
ishy+0.010
Neg logit contributions
 guessing-0.009
 facts-0.009
 tours-0.009
 discovering-0.009
 motorcycles-0.008
Token went
Token ID1816
Feature activation-2.69e-02

Pos logit contributions
 guessing+0.009
 facts+0.008
 tours+0.008
 topics+0.008
 discovering+0.008
Neg logit contributions
kin-0.012
ivery-0.011
ifying-0.010
 bruised-0.010
ishy-0.010
Token home
Token ID1363
Feature activation+2.80e-03

Pos logit contributions
 facts+0.041
 topics+0.040
 motorcycles+0.038
bors+0.038
 mills+0.037
Neg logit contributions
kin-0.051
robe-0.047
 bruised-0.047
ivery-0.046
ifying-0.046
Token and
Token ID290
Feature activation+1.95e-02

Pos logit contributions
kin+0.006
ivery+0.005
ifying+0.005
 bruised+0.005
ishy+0.005
Neg logit contributions
 guessing-0.004
 discovering-0.004
 tours-0.004
 facts-0.004
 topics-0.004
Token put
Token ID1234
Feature activation+4.62e-02

Pos logit contributions
kin+0.040
ivery+0.037
ifying+0.036
ishy+0.035
 bruised+0.033
Neg logit contributions
 guessing-0.031
 discovering-0.030
 seeing-0.028
 tours-0.028
 motorcycles-0.027
Token her
Token ID607
Feature activation+2.54e-02

Pos logit contributions
ivery+0.115
kin+0.107
ifying+0.105
istered+0.102
ishy+0.098
Neg logit contributions
 adventures-0.049
 motorcycles-0.049
ons-0.049
 guessing-0.049
 bikes-0.049
Token box
Token ID3091
Feature activation+9.70e-02

Pos logit contributions
ivery+0.057
kin+0.053
ifying+0.050
ishy+0.049
 bruised+0.048
Neg logit contributions
 guessing-0.035
 tours-0.032
 motorcycles-0.031
 adventures-0.031
 bikes-0.030
Token on
Token ID319
Feature activation+1.86e-02

Pos logit contributions
ivery+0.193
kin+0.190
ifying+0.184
ishy+0.181
 Deal+0.181
Neg logit contributions
 to-0.131
 airplanes-0.113
 guessing-0.111
 adventures-0.110
 exploring-0.109
Token her
Token ID607
Feature activation+4.48e-03

Pos logit contributions
kin+0.039
ivery+0.037
ishy+0.036
 bruised+0.034
ifying+0.034
Neg logit contributions
 guessing-0.028
 tours-0.027
 discovering-0.026
 motorcycles-0.026
 topics-0.026
Token dress
Token ID6576
Feature activation+5.57e-02

Pos logit contributions
kin+0.008
ivery+0.008
ifying+0.007
 bruised+0.007
ishy+0.007
Neg logit contributions
 guessing-0.008
 tours-0.007
 facts-0.007
 discovering-0.007
 motorcycles-0.007
Tokener
Token ID263
Feature activation+3.68e-02

Pos logit contributions
kin+0.114
ivery+0.109
ishy+0.108
ifying+0.107
 bruised+0.106
Neg logit contributions
 adventures-0.072
 guessing-0.071
 tours-0.070
 bikes-0.070
 motorcycles-0.069
Token.
Token ID13
Feature activation+2.30e-02

Pos logit contributions
kin+0.070
ivery+0.067
ishy+0.064
ifying+0.062
 bruised+0.062
Neg logit contributions
 guessing-0.057
 discovering-0.056
 tours-0.053
 facts-0.053
 motorcycles-0.052
Token it
Token ID340
Feature activation+2.11e-03

Pos logit contributions
kin+0.014
ivery+0.013
ifying+0.012
 bruised+0.012
ishy+0.012
Neg logit contributions
 guessing-0.011
 discovering-0.010
 facts-0.010
 tours-0.010
 topics-0.010
Token kept
Token ID4030
Feature activation+1.79e-02

Pos logit contributions
kin+0.004
ivery+0.004
ifying+0.004
 bruised+0.004
ishy+0.003
Neg logit contributions
 guessing-0.003
 facts-0.003
 tours-0.003
 discovering-0.003
 topics-0.003
Token her
Token ID607
Feature activation+3.35e-02

Pos logit contributions
kin+0.035
ivery+0.032
ifying+0.030
ishy+0.030
robe+0.028
Neg logit contributions
 guessing-0.030
 discovering-0.028
 tours-0.027
 facts-0.027
 adventures-0.026
Token warm
Token ID5814
Feature activation+6.04e-02

Pos logit contributions
kin+0.056
ivery+0.055
ifying+0.051
ishy+0.048
 bruised+0.046
Neg logit contributions
 adventures-0.060
 guessing-0.059
 discovering-0.056
 tours-0.056
 journeys-0.055
Token and
Token ID290
Feature activation+1.97e-02

Pos logit contributions
kin+0.123
ivery+0.120
ifying+0.115
istered+0.110
ishy+0.108
Neg logit contributions
 adventures-0.085
 journeys-0.075
 guessing-0.075
 exploring-0.075
 discovering-0.074
Token cos
Token ID8615
Feature activation+9.60e-02

Pos logit contributions
kin+0.030
ivery+0.028
ifying+0.026
ishy+0.025
robe+0.024
Neg logit contributions
 guessing-0.040
 discovering-0.039
 facts-0.038
 tours-0.037
 adventures-0.037
Tokeny
Token ID88
Feature activation+3.67e-02

Pos logit contributions
 Deal+0.219
 limp+0.199
 trigger+0.199
 sorry+0.198
 spoiler+0.196
Neg logit contributions
 to-0.087
we-0.087
 discovering-0.081
and-0.076
 seeing-0.075
Token all
Token ID477
Feature activation+5.00e-02

Pos logit contributions
kin+0.064
ivery+0.063
ifying+0.059
ishy+0.056
 bruised+0.055
Neg logit contributions
 guessing-0.061
 adventures-0.059
 discovering-0.058
 facts-0.057
 topics-0.056
Token night
Token ID1755
Feature activation+1.13e-02

Pos logit contributions
kin+0.103
ivery+0.098
ifying+0.092
istered+0.090
ishy+0.087
Neg logit contributions
 discovering-0.073
 adventures-0.072
 guessing-0.071
 exploring-0.069
 tours-0.066
Token long
Token ID890
Feature activation+1.59e-02

Pos logit contributions
kin+0.019
ivery+0.018
 bruised+0.017
ifying+0.017
ishy+0.016
Neg logit contributions
 guessing-0.021
 tours-0.020
 discovering-0.019
 facts-0.019
 topics-0.019
Token.
Token ID13
Feature activation+1.80e-02

Pos logit contributions
kin+0.027
ivery+0.026
ifying+0.024
 bruised+0.023
ishy+0.023
Neg logit contributions
 guessing-0.028
 discovering-0.027
 facts-0.027
 tours-0.027
 topics-0.027
Token her
Token ID607
Feature activation+2.19e-02

Pos logit contributions
kin+0.022
ivery+0.021
ifying+0.020
 bruised+0.019
ishy+0.019
Neg logit contributions
 guessing-0.016
 tours-0.016
 motorcycles-0.016
 facts-0.016
 discovering-0.015
Token brother
Token ID3956
Feature activation+1.86e-02

Pos logit contributions
kin+0.038
ivery+0.037
ifying+0.034
 bruised+0.032
ishy+0.032
Neg logit contributions
 guessing-0.039
 tours-0.038
 adventures-0.038
 facts-0.037
 discovering-0.036
Token and
Token ID290
Feature activation-4.69e-03

Pos logit contributions
kin+0.040
ivery+0.039
ifying+0.037
 bruised+0.035
ishy+0.035
Neg logit contributions
 guessing-0.025
 facts-0.024
 tours-0.024
 discovering-0.024
 topics-0.023
Token get
Token ID651
Feature activation+3.29e-02

Pos logit contributions
 guessing+0.007
 facts+0.007
 tours+0.007
 topics+0.007
 discovering+0.007
Neg logit contributions
kin-0.009
ivery-0.008
 bruised-0.008
ifying-0.008
ishy-0.008
Token the
Token ID262
Feature activation+3.51e-02

Pos logit contributions
kin+0.081
ivery+0.077
ifying+0.070
ishy+0.069
robe+0.066
Neg logit contributions
 guessing-0.041
 motorcycles-0.037
 discovering-0.037
 tours-0.035
 exploring-0.035
Token blanket
Token ID18447
Feature activation+9.53e-02

Pos logit contributions
kin+0.064
ivery+0.062
ifying+0.057
 bruised+0.056
ishy+0.056
Neg logit contributions
 adventures-0.058
 guessing-0.057
 tours-0.056
 bikes-0.055
 motorcycles-0.055
Token back
Token ID736
Feature activation+4.73e-02

Pos logit contributions
kin+0.212
ivery+0.209
ishy+0.207
 disgust+0.203
ifying+0.201
Neg logit contributions
 to-0.119
 airplanes-0.089
 guessing-0.086
 adventures-0.081
 exploring-0.076
Token.
Token ID13
Feature activation+3.00e-02

Pos logit contributions
kin+0.101
ivery+0.097
ishy+0.089
ifying+0.088
istered+0.086
Neg logit contributions
 guessing-0.065
 discovering-0.062
 exploring-0.058
 to-0.058
 seeing-0.058
Token Lily
Token ID20037
Feature activation-2.49e-02

Pos logit contributions
kin+0.049
ivery+0.042
ifying+0.040
 bruised+0.037
ishy+0.035
Neg logit contributions
 guessing-0.061
 topics-0.057
 discovering-0.056
 tours-0.056
 motorcycles-0.056
Token nodded
Token ID14464
Feature activation+2.93e-02

Pos logit contributions
 facts+0.048
 tours+0.048
 guessing+0.047
 discovering+0.046
 topics+0.046
Neg logit contributions
kin-0.040
ivery-0.035
 bruised-0.033
cloth-0.032
 limp-0.031
Token her
Token ID607
Feature activation+2.88e-02

Pos logit contributions
kin+0.069
ifying+0.068
ivery+0.066
ishy+0.064
 bruised+0.062
Neg logit contributions
 guessing-0.033
 discovering-0.031
 seeing-0.030
 motorcycles-0.027
 adventures-0.027
Token if
Token ID611
Feature activation-1.49e-02

Pos logit contributions
 tours+0.103
 guessing+0.096
 motorcycles+0.096
 mills+0.096
 journeys+0.095
Neg logit contributions
kin-0.082
cloth-0.068
robe-0.066
 bruised-0.063
ifying-0.061
Token she
Token ID673
Feature activation+2.20e-02

Pos logit contributions
 guessing+0.025
 facts+0.025
 tours+0.024
 topics+0.024
 discovering+0.024
Neg logit contributions
kin-0.027
 bruised-0.023
ivery-0.023
ifying-0.023
robe-0.022
Token could
Token ID714
Feature activation+3.18e-02

Pos logit contributions
kin+0.044
ivery+0.043
ifying+0.041
 bruised+0.038
robe+0.037
Neg logit contributions
 guessing-0.033
 facts-0.032
arm-0.031
 motorcycles-0.031
 discovering-0.030
Token have
Token ID423
Feature activation-2.20e-03

Pos logit contributions
kin+0.066
ivery+0.063
ifying+0.059
ishy+0.059
 bruised+0.059
Neg logit contributions
 guessing-0.044
 discovering-0.041
 seeing-0.039
 tours-0.038
 adventures-0.038
Token her
Token ID607
Feature activation+2.77e-02

Pos logit contributions
 guessing+0.004
 facts+0.004
 tours+0.004
 discovering+0.004
 topics+0.003
Neg logit contributions
kin-0.004
ivery-0.004
ifying-0.004
 bruised-0.004
ishy-0.003
Token blanket
Token ID18447
Feature activation+9.43e-02

Pos logit contributions
kin+0.048
ivery+0.048
ifying+0.043
 bruised+0.042
ishy+0.041
Neg logit contributions
 guessing-0.048
 adventures-0.047
 tours-0.046
 facts-0.045
 discovering-0.044
Token back
Token ID736
Feature activation+3.60e-02

Pos logit contributions
ivery+0.231
kin+0.218
ishy+0.216
ifying+0.214
 disgust+0.209
Neg logit contributions
 to-0.104
 guessing-0.082
 adventures-0.076
 discovering-0.070
 exploring-0.070
Token and
Token ID290
Feature activation+1.49e-02

Pos logit contributions
kin+0.075
ivery+0.073
ifying+0.066
ishy+0.066
 bruised+0.065
Neg logit contributions
 guessing-0.052
 discovering-0.050
 tours-0.047
 adventures-0.047
 seeing-0.046
Token her
Token ID607
Feature activation+3.70e-02

Pos logit contributions
kin+0.030
ivery+0.028
ifying+0.027
ishy+0.026
 bruised+0.025
Neg logit contributions
 guessing-0.024
 discovering-0.022
 facts-0.022
 motorcycles-0.022
 tours-0.021
Token brother
Token ID3956
Feature activation+4.75e-02

Pos logit contributions
ivery+0.074
ifying+0.071
kin+0.070
ishy+0.066
he+0.065
Neg logit contributions
 adventures-0.056
 guessing-0.056
 motorcycles-0.052
 facts-0.050
 computers-0.050
Token gave
Token ID2921
Feature activation+3.99e-02

Pos logit contributions
ivery+0.099
kin+0.098
ifying+0.096
ishy+0.088
 bruised+0.084
Neg logit contributions
 guessing-0.066
 facts-0.064
 adventures-0.061
 motorcycles-0.060
 discovering-0.060
Token she
Token ID673
Feature activation+2.42e-02

Pos logit contributions
ivery+0.074
ifying+0.071
kin+0.070
ishy+0.069
robe+0.065
Neg logit contributions
 guessing-0.042
 discovering-0.040
 seeing-0.039
 learning-0.035
 motorcycles-0.035
Token forgot
Token ID16453
Feature activation-2.29e-02

Pos logit contributions
ifying+0.048
ivery+0.047
kin+0.046
 bruised+0.043
ishy+0.042
Neg logit contributions
 guessing-0.035
arm-0.032
 discovering-0.031
 motorcycles-0.031
 adventures-0.030
Token to
Token ID284
Feature activation+4.37e-02

Pos logit contributions
 guessing+0.030
 tours+0.029
 topics+0.029
 facts+0.028
icians+0.028
Neg logit contributions
kin-0.049
 bruised-0.046
ifying-0.045
ivery-0.045
cloth-0.043
Token put
Token ID1234
Feature activation+4.93e-02

Pos logit contributions
ifying+0.098
kin+0.094
robe+0.094
ivery+0.089
ishy+0.084
Neg logit contributions
 guessing-0.057
 seeing-0.054
 discovering-0.053
 facts-0.052
 adventures-0.051
Token the
Token ID262
Feature activation+2.12e-02

Pos logit contributions
ivery+0.132
kin+0.126
ifying+0.121
robe+0.116
istered+0.116
Neg logit contributions
 guessing-0.040
ons-0.040
 tours-0.038
 motorcycles-0.038
 discovering-0.037
Token bowl
Token ID9396
Feature activation+9.42e-02

Pos logit contributions
kin+0.043
ivery+0.043
ifying+0.040
 bruised+0.037
robe+0.037
Neg logit contributions
 guessing-0.032
 facts-0.029
 motorcycles-0.029
 tours-0.028
 topics-0.028
Token away
Token ID1497
Feature activation+5.87e-02

Pos logit contributions
ivery+0.198
 Deal+0.194
 spoiler+0.187
kin+0.187
 relief+0.185
Neg logit contributions
 to-0.130
ups-0.100
 guessing-0.093
ons-0.091
 discovering-0.083
Token.
Token ID13
Feature activation+1.64e-02

Pos logit contributions
ivery+0.131
kin+0.127
ishy+0.122
ifying+0.118
 bruised+0.116
Neg logit contributions
 to-0.069
 guessing-0.069
 discovering-0.066
 seeing-0.063
 facts-0.059
Token
Token ID198
Feature activation-1.51e-02

Pos logit contributions
kin+0.036
ivery+0.033
ifying+0.032
 bruised+0.031
ishy+0.031
Neg logit contributions
 guessing-0.023
 tours-0.021
 discovering-0.021
 topics-0.021
 motorcycles-0.021
Token
Token ID198
Feature activation-7.29e-03

Pos logit contributions
 guessing+0.026
 facts+0.026
 tours+0.025
 topics+0.025
 discovering+0.025
Neg logit contributions
kin-0.027
ivery-0.024
ifying-0.023
 bruised-0.023
robe-0.022
TokenAs
Token ID1722
Feature activation+1.53e-02

Pos logit contributions
 guessing+0.011
 facts+0.011
 tours+0.011
 motorcycles+0.010
 discovering+0.010
Neg logit contributions
kin-0.015
ivery-0.014
ifying-0.013
 bruised-0.013
ishy-0.012
Token the
Token ID262
Feature activation+2.78e-02

Pos logit contributions
ivery+0.067
kin+0.065
ishy+0.063
ifying+0.059
robe+0.059
Neg logit contributions
 guessing-0.032
 tours-0.030
 discovering-0.029
 topics-0.029
 adventures-0.028
Token box
Token ID3091
Feature activation+3.30e-02

Pos logit contributions
kin+0.062
ivery+0.058
ishy+0.053
 bruised+0.051
ifying+0.051
Neg logit contributions
 guessing-0.037
 tours-0.037
 motorcycles-0.036
 topics-0.035
 bikes-0.035
Token and
Token ID290
Feature activation+1.92e-02

Pos logit contributions
kin+0.063
ivery+0.062
ifying+0.058
ishy+0.057
 bruised+0.056
Neg logit contributions
 guessing-0.053
 discovering-0.050
 facts-0.048
 tours-0.047
 seeing-0.046
Token put
Token ID1234
Feature activation+4.85e-02

Pos logit contributions
ivery+0.037
kin+0.037
ifying+0.034
ishy+0.033
 bruised+0.030
Neg logit contributions
 guessing-0.032
 discovering-0.030
 facts-0.030
 tours-0.029
 seeing-0.029
Token the
Token ID262
Feature activation+5.64e-02

Pos logit contributions
ivery+0.152
kin+0.142
ishy+0.131
istered+0.130
ifying+0.129
Neg logit contributions
 bikes-0.021
 guessing-0.020
 motorcycles-0.020
 tours-0.018
ons-0.018
Token lid
Token ID19789
Feature activation+9.35e-02

Pos logit contributions
kin+0.107
ivery+0.104
ishy+0.095
gan+0.093
ifying+0.092
Neg logit contributions
 bikes-0.087
 guessing-0.085
 instructions-0.084
 adventures-0.084
 motorcycles-0.083
Token on
Token ID319
Feature activation+3.95e-02

Pos logit contributions
ivery+0.205
kin+0.191
 seeker+0.181
ishy+0.175
 Deal+0.174
Neg logit contributions
 to-0.126
ups-0.102
 guessing-0.099
 airplanes-0.093
 instructions-0.091
Token.
Token ID13
Feature activation+7.58e-03

Pos logit contributions
ivery+0.104
kin+0.100
ishy+0.097
 bruised+0.094
ifying+0.091
Neg logit contributions
 guessing-0.036
 tours-0.033
 motorcycles-0.032
 bikes-0.031
 computers-0.030
Token They
Token ID1119
Feature activation+3.72e-02

Pos logit contributions
kin+0.013
ivery+0.012
 bruised+0.011
ifying+0.011
ishy+0.011
Neg logit contributions
 guessing-0.014
 tours-0.014
 facts-0.014
 discovering-0.013
 topics-0.013
Token ran
Token ID4966
Feature activation-8.43e-03

Pos logit contributions
kin+0.074
ivery+0.074
ifying+0.069
ishy+0.067
robe+0.065
Neg logit contributions
 guessing-0.053
 mills-0.050
 seeing-0.049
arm-0.049
 discovering-0.049
Token to
Token ID284
Feature activation+9.28e-03

Pos logit contributions
 guessing+0.011
 facts+0.011
 tours+0.010
 topics+0.010
 motorcycles+0.010
Neg logit contributions
kin-0.019
ivery-0.017
ifying-0.017
 bruised-0.017
ishy-0.016
Token He
Token ID679
Feature activation+2.33e-02

Pos logit contributions
kin+0.026
ivery+0.023
ifying+0.022
 bruised+0.021
ishy+0.021
Neg logit contributions
 guessing-0.026
 motorcycles-0.026
 tours-0.025
 facts-0.025
 mills-0.025
Token forgot
Token ID16453
Feature activation-4.22e-02

Pos logit contributions
kin+0.047
ivery+0.044
ifying+0.044
 bruised+0.041
ishy+0.040
Neg logit contributions
 guessing-0.037
 motorcycles-0.034
 adventures-0.033
 topics-0.033
 discovering-0.033
Token to
Token ID284
Feature activation+5.69e-02

Pos logit contributions
 topics+0.063
icians+0.062
 facts+0.061
 guessing+0.061
 tours+0.060
Neg logit contributions
kin-0.077
 bruised-0.075
cloth-0.070
ifying-0.070
 OK-0.069
Token put
Token ID1234
Feature activation+3.18e-02

Pos logit contributions
ifying+0.131
kin+0.119
ishy+0.118
ivery+0.117
 bruised+0.116
Neg logit contributions
 guessing-0.074
 tours-0.071
 adventures-0.068
 events-0.068
 discovering-0.067
Token his
Token ID465
Feature activation+1.99e-02

Pos logit contributions
ivery+0.074
kin+0.071
ifying+0.067
ishy+0.063
istered+0.063
Neg logit contributions
 motorcycles-0.041
 guessing-0.039
 tours-0.038
 bikes-0.036
 adventures-0.035
Token books
Token ID3835
Feature activation+9.31e-02

Pos logit contributions
ivery+0.037
kin+0.037
ifying+0.034
 bruised+0.034
ishy+0.032
Neg logit contributions
 guessing-0.032
 tours-0.031
 motorcycles-0.030
 facts-0.029
 bikes-0.029
Token and
Token ID290
Feature activation+2.42e-02

Pos logit contributions
kin+0.167
ivery+0.165
 thirst+0.160
 scalp+0.158
ishy+0.158
Neg logit contributions
 to-0.149
 guessing-0.126
 tours-0.118
 bikes-0.117
 motorcycles-0.117
Token lunch
Token ID9965
Feature activation+5.05e-02

Pos logit contributions
kin+0.053
ivery+0.053
ifying+0.050
ishy+0.049
 bruised+0.047
Neg logit contributions
 guessing-0.031
 tours-0.031
 motorcycles-0.031
 facts-0.030
 bikes-0.029
Token in
Token ID287
Feature activation+3.51e-02

Pos logit contributions
ivery+0.107
ifying+0.102
kin+0.099
ishy+0.098
 bruised+0.098
Neg logit contributions
 guessing-0.070
 tours-0.062
 adventures-0.062
 bikes-0.061
 motorcycles-0.060
Token the
Token ID262
Feature activation+2.34e-02

Pos logit contributions
ivery+0.086
kin+0.083
ishy+0.081
ifying+0.077
 bruised+0.075
Neg logit contributions
 guessing-0.041
 motorcycles-0.040
 adventures-0.040
 tours-0.039
 events-0.036
Token bag
Token ID6131
Feature activation+6.38e-02

Pos logit contributions
kin+0.054
ivery+0.051
 bruised+0.048
ishy+0.047
ifying+0.046
Neg logit contributions
 guessing-0.030
 tours-0.030
 motorcycles-0.028
 topics-0.027
 discovering-0.027
Token and
Token ID290
Feature activation+3.74e-02

Pos logit contributions
kin+0.007
ivery+0.007
ifying+0.006
 bruised+0.006
ishy+0.006
Neg logit contributions
 guessing-0.006
 facts-0.005
 tours-0.005
 discovering-0.005
 topics-0.005
Token wanted
Token ID2227
Feature activation-3.05e-02

Pos logit contributions
kin+0.075
ivery+0.067
ifying+0.067
ishy+0.062
robe+0.060
Neg logit contributions
 guessing-0.062
 discovering-0.059
 seeing-0.059
 facts-0.057
 learning-0.056
Token to
Token ID284
Feature activation+2.04e-02

Pos logit contributions
 tours+0.034
 guessing+0.034
 topics+0.034
icians+0.033
arm+0.033
Neg logit contributions
kin-0.067
 bruised-0.064
 sill-0.059
 limp-0.059
ivery-0.059
Token get
Token ID651
Feature activation+2.41e-02

Pos logit contributions
kin+0.045
ifying+0.043
ivery+0.042
ishy+0.041
 bruised+0.041
Neg logit contributions
 guessing-0.028
 tours-0.026
 facts-0.025
 adventures-0.025
 discovering-0.024
Token her
Token ID607
Feature activation+3.14e-02

Pos logit contributions
kin+0.052
ivery+0.048
ifying+0.044
ishy+0.042
robe+0.042
Neg logit contributions
 guessing-0.037
 discovering-0.035
 tours-0.033
 motorcycles-0.033
 facts-0.033
Token ball
Token ID2613
Feature activation+9.28e-02

Pos logit contributions
ivery+0.062
kin+0.061
ifying+0.056
ishy+0.054
 bruised+0.052
Neg logit contributions
 guessing-0.050
 tours-0.046
 adventures-0.046
 motorcycles-0.045
 facts-0.045
Token back
Token ID736
Feature activation+4.58e-02

Pos logit contributions
ifying+0.208
kin+0.199
ivery+0.192
 disgust+0.191
ishy+0.186
Neg logit contributions
 to-0.128
 adventures-0.096
 guessing-0.096
 bikes-0.095
 motorcycles-0.085
Token.
Token ID13
Feature activation-1.25e-02

Pos logit contributions
kin+0.102
ivery+0.097
ifying+0.092
ishy+0.088
istered+0.088
Neg logit contributions
 guessing-0.061
 discovering-0.059
 adventures-0.057
 tours-0.055
 exploring-0.055
Token
Token ID198
Feature activation-2.07e-02

Pos logit contributions
 guessing+0.026
 facts+0.026
 tours+0.025
 discovering+0.025
 motorcycles+0.025
Neg logit contributions
kin-0.017
ivery-0.015
 bruised-0.014
ifying-0.014
ishy-0.013
Token
Token ID198
Feature activation-2.04e-02

Pos logit contributions
 guessing+0.038
 facts+0.038
 tours+0.036
 motorcycles+0.036
 topics+0.036
Neg logit contributions
kin-0.034
ivery-0.031
he-0.028
 bruised-0.028
ifying-0.028
TokenAmy
Token ID40797
Feature activation-3.68e-03

Pos logit contributions
 facts+0.034
 guessing+0.033
 tours+0.033
 motorcycles+0.032
 discovering+0.032
Neg logit contributions
kin-0.038
ivery-0.036
ifying-0.033
he-0.031
 bruised-0.031
Token,
Token ID11
Feature activation-4.61e-03

Pos logit contributions
 guessing+0.008
 facts+0.008
 tours+0.008
 topics+0.008
 discovering+0.008
Neg logit contributions
kin-0.008
ivery-0.007
ifying-0.007
 bruised-0.007
ishy-0.006
Token and
Token ID290
Feature activation-1.92e-03

Pos logit contributions
 guessing+0.008
 facts+0.008
 tours+0.008
 topics+0.008
 discovering+0.008
Neg logit contributions
kin-0.008
ivery-0.007
ifying-0.007
 bruised-0.007
ishy-0.007
Token how
Token ID703
Feature activation+2.38e-02

Pos logit contributions
 guessing+0.003
 facts+0.003
 tours+0.003
 discovering+0.003
 topics+0.003
Neg logit contributions
kin-0.004
ivery-0.003
ifying-0.003
 bruised-0.003
ishy-0.003
Token light
Token ID1657
Feature activation+1.96e-02

Pos logit contributions
kin+0.048
ivery+0.046
ifying+0.045
 bruised+0.043
robe+0.042
Neg logit contributions
 guessing-0.034
 tours-0.034
 facts-0.034
 discovering-0.033
 adventures-0.032
Token and
Token ID290
Feature activation-1.04e-02

Pos logit contributions
kin+0.042
ivery+0.040
ifying+0.038
 bruised+0.037
ishy+0.036
Neg logit contributions
 guessing-0.027
 facts-0.026
 tours-0.026
 discovering-0.025
 motorcycles-0.025
Token boun
Token ID31283
Feature activation+9.27e-02

Pos logit contributions
 guessing+0.016
 tours+0.016
 facts+0.016
 topics+0.015
 motorcycles+0.015
Neg logit contributions
kin-0.020
ivery-0.018
 bruised-0.018
ifying-0.018
ishy-0.017
Tokency
Token ID948
Feature activation-1.09e-02

Pos logit contributions
 OK+0.134
 sorry+0.132
ivery+0.132
 bruised+0.128
istered+0.127
Neg logit contributions
ons-0.155
ites-0.145
st-0.144
cles-0.142
 topics-0.142
Token it
Token ID340
Feature activation+2.47e-02

Pos logit contributions
 guessing+0.018
 facts+0.017
 tours+0.017
 topics+0.017
 motorcycles+0.017
Neg logit contributions
kin-0.021
ivery-0.019
ifying-0.018
 bruised-0.018
ishy-0.017
Token was
Token ID373
Feature activation+5.24e-03

Pos logit contributions
kin+0.049
ivery+0.047
ifying+0.047
 bruised+0.044
ishy+0.043
Neg logit contributions
 guessing-0.037
 facts-0.035
 adventures-0.035
 tours-0.034
 discovering-0.033
Token.
Token ID13
Feature activation-2.66e-03

Pos logit contributions
kin+0.010
ivery+0.009
ifying+0.009
 bruised+0.008
ishy+0.008
Neg logit contributions
 guessing-0.009
 facts-0.008
 discovering-0.008
 tours-0.008
 motorcycles-0.008
Token
Token ID220
Feature activation+5.92e-03

Pos logit contributions
 guessing+0.005
 facts+0.005
 tours+0.005
 discovering+0.005
 topics+0.005
Neg logit contributions
kin-0.004
ivery-0.004
ifying-0.003
 bruised-0.003
ishy-0.003
Token it
Token ID340
Feature activation+1.02e-04

Pos logit contributions
kin+0.022
ivery+0.021
ifying+0.019
 bruised+0.019
ishy+0.019
Neg logit contributions
 guessing-0.016
 discovering-0.015
 facts-0.015
 tours-0.015
 topics-0.015
Token kept
Token ID4030
Feature activation+1.16e-02

Pos logit contributions
kin+0.000
ivery+0.000
ifying+0.000
 bruised+0.000
ishy+0.000
Neg logit contributions
 guessing-0.000
 facts-0.000
 tours-0.000
 discovering-0.000
 topics-0.000
Token her
Token ID607
Feature activation+3.33e-02

Pos logit contributions
kin+0.022
ivery+0.020
ifying+0.019
ishy+0.018
 bruised+0.018
Neg logit contributions
 guessing-0.020
 discovering-0.019
 tours-0.019
 facts-0.018
 motorcycles-0.018
Token warm
Token ID5814
Feature activation+6.17e-02

Pos logit contributions
kin+0.057
ivery+0.056
ifying+0.051
ishy+0.049
 bruised+0.047
Neg logit contributions
 guessing-0.058
 adventures-0.058
 discovering-0.055
 facts-0.055
 tours-0.055
Token and
Token ID290
Feature activation+1.71e-02

Pos logit contributions
kin+0.125
ivery+0.124
ifying+0.115
istered+0.112
ishy+0.110
Neg logit contributions
 adventures-0.087
 discovering-0.080
 guessing-0.080
 exploring-0.079
 journeys-0.077
Token cos
Token ID8615
Feature activation+9.23e-02

Pos logit contributions
kin+0.026
ivery+0.024
ifying+0.023
ishy+0.022
robe+0.021
Neg logit contributions
 guessing-0.034
 discovering-0.033
 facts-0.032
 tours-0.032
 adventures-0.031
Tokeny
Token ID88
Feature activation+4.01e-02

Pos logit contributions
 Deal+0.209
 sorry+0.196
 limp+0.195
 trigger+0.194
 spoiler+0.190
Neg logit contributions
we-0.082
 to-0.080
 discovering-0.078
 seeing-0.070
and-0.068
Token all
Token ID477
Feature activation+4.41e-02

Pos logit contributions
kin+0.071
ivery+0.070
ifying+0.064
ishy+0.062
 bruised+0.061
Neg logit contributions
 guessing-0.067
 adventures-0.063
 discovering-0.063
 facts-0.062
 topics-0.061
Token night
Token ID1755
Feature activation+1.13e-02

Pos logit contributions
kin+0.089
ivery+0.084
ifying+0.077
istered+0.076
ishy+0.075
Neg logit contributions
 discovering-0.067
 guessing-0.066
 adventures-0.064
 exploring-0.062
 tours-0.062
Token long
Token ID890
Feature activation+1.48e-02

Pos logit contributions
kin+0.019
ivery+0.018
 bruised+0.017
ifying+0.017
ishy+0.016
Neg logit contributions
 guessing-0.021
 discovering-0.020
 tours-0.019
 facts-0.019
 topics-0.019
Token.
Token ID13
Feature activation+2.17e-02

Pos logit contributions
kin+0.025
ivery+0.024
ifying+0.022
 bruised+0.022
ishy+0.021
Neg logit contributions
 guessing-0.027
 discovering-0.026
 facts-0.026
 tours-0.025
 topics-0.025
Token yawn
Token ID40908
Feature activation+3.94e-02

Pos logit contributions
 guessing+0.018
 facts+0.017
 tours+0.017
 topics+0.017
 discovering+0.017
Neg logit contributions
kin-0.019
ivery-0.017
 bruised-0.017
ifying-0.017
ishy-0.016
Tokened
Token ID276
Feature activation+4.76e-02

Pos logit contributions
kin+0.070
ivery+0.061
ifying+0.061
 bruised+0.060
ishy+0.057
Neg logit contributions
 guessing-0.064
 facts-0.061
 motorcycles-0.061
ups-0.060
arm-0.059
Token and
Token ID290
Feature activation+4.92e-02

Pos logit contributions
kin+0.132
ivery+0.126
ifying+0.120
ishy+0.117
robe+0.112
Neg logit contributions
 guessing-0.036
 discovering-0.033
 motorcycles-0.033
 adventures-0.032
 seeing-0.032
Token rubbed
Token ID31862
Feature activation+4.42e-02

Pos logit contributions
kin+0.098
ivery+0.096
ifying+0.087
ishy+0.084
he+0.081
Neg logit contributions
 guessing-0.075
 seeing-0.070
 facts-0.069
 discovering-0.068
 walks-0.066
Token her
Token ID607
Feature activation+3.79e-02

Pos logit contributions
ivery+0.095
kin+0.095
ishy+0.087
ifying+0.086
 bruised+0.083
Neg logit contributions
 facts-0.050
 bikes-0.048
 guessing-0.048
 motorcycles-0.048
 topics-0.047
Token eyes
Token ID2951
Feature activation+9.21e-02

Pos logit contributions
ivery+0.068
kin+0.062
he+0.056
ishy+0.056
ifying+0.055
Neg logit contributions
 guessing-0.063
 motorcycles-0.062
 facts-0.062
 tricks-0.061
 bikes-0.060
Token.
Token ID13
Feature activation+1.63e-02

Pos logit contributions
kin+0.221
ivery+0.204
 seeker+0.200
robe+0.193
idget+0.192
Neg logit contributions
 to-0.098
 adventures-0.077
 guessing-0.074
 airplanes-0.072
 instructions-0.070
Token She
Token ID1375
Feature activation+3.02e-02

Pos logit contributions
kin+0.029
ivery+0.027
ifying+0.024
 bruised+0.024
ishy+0.024
Neg logit contributions
 guessing-0.029
 tours-0.028
 discovering-0.028
 facts-0.028
 motorcycles-0.028
Token wanted
Token ID2227
Feature activation-2.12e-02

Pos logit contributions
kin+0.054
ivery+0.054
ifying+0.050
ishy+0.047
 bruised+0.047
Neg logit contributions
 guessing-0.050
 facts-0.049
 adventures-0.047
 motorcycles-0.046
 discovering-0.046
Token to
Token ID284
Feature activation+3.34e-02

Pos logit contributions
 guessing+0.023
 tours+0.022
 topics+0.022
 facts+0.021
 discovering+0.021
Neg logit contributions
kin-0.050
 bruised-0.047
ifying-0.045
ivery-0.045
 sill-0.043
Token go
Token ID467
Feature activation+1.03e-02

Pos logit contributions
kin+0.069
ivery+0.066
ifying+0.066
ishy+0.064
 bruised+0.063
Neg logit contributions
 guessing-0.048
 tours-0.044
 facts-0.043
 discovering-0.043
 adventures-0.043
Token They
Token ID1119
Feature activation+5.01e-02

Pos logit contributions
kin+0.075
ivery+0.068
ishy+0.062
 bruised+0.061
he+0.058
Neg logit contributions
 mills-0.048
 facts-0.048
 topics-0.047
 museums-0.047
 tours-0.047
Token we
Token ID356
Feature activation+5.29e-02

Pos logit contributions
kin+0.105
ifying+0.102
ivery+0.100
ishy+0.096
rier+0.096
Neg logit contributions
 adventures-0.067
 guessing-0.066
 facts-0.065
 seeing-0.062
 discovering-0.061
Tokenpt
Token ID457
Feature activation+2.61e-02

Pos logit contributions
kin+0.093
rier+0.079
ivery+0.079
ifying+0.077
robe+0.075
Neg logit contributions
 adventures-0.090
 guessing-0.087
 facts-0.083
 discovering-0.083
 motorcycles-0.083
Token until
Token ID1566
Feature activation+4.64e-02

Pos logit contributions
kin+0.055
ifying+0.050
ivery+0.049
ishy+0.048
 bruised+0.047
Neg logit contributions
 guessing-0.037
 discovering-0.034
 facts-0.034
 mills-0.033
 tours-0.033
Token they
Token ID484
Feature activation+6.87e-02

Pos logit contributions
ivery+0.122
ishy+0.117
kin+0.113
suit+0.112
ifying+0.109
Neg logit contributions
 guessing-0.044
 seeing-0.041
 tours-0.040
 discovering-0.039
 adventures-0.039
Token felt
Token ID2936
Feature activation+9.17e-02

Pos logit contributions
rier+0.153
ishy+0.149
ivery+0.147
ifying+0.146
kin+0.144
Neg logit contributions
 adventures-0.078
 seeing-0.078
 guessing-0.072
 whales-0.068
 airplanes-0.065
Token better
Token ID1365
Feature activation+3.50e-02

Pos logit contributions
ifying+0.174
kin+0.171
ivery+0.169
cy+0.167
suit+0.158
Neg logit contributions
 adventures-0.128
 guessing-0.127
 seeing-0.123
 to-0.123
 discovering-0.119
Token.
Token ID13
Feature activation+3.39e-02

Pos logit contributions
kin+0.071
ivery+0.068
ifying+0.066
ishy+0.065
robe+0.061
Neg logit contributions
 guessing-0.050
 discovering-0.048
 adventures-0.046
 seeing-0.046
 facts-0.045
Token Then
Token ID3244
Feature activation+1.98e-02

Pos logit contributions
kin+0.078
ivery+0.072
ishy+0.065
 bruised+0.063
he+0.062
Neg logit contributions
 mills-0.042
 topics-0.042
 discovering-0.042
 guessing-0.041
 museums-0.041
Token they
Token ID484
Feature activation+5.89e-02

Pos logit contributions
kin+0.050
ivery+0.049
ifying+0.046
ishy+0.046
 bruised+0.044
Neg logit contributions
 guessing-0.020
 tours-0.019
 facts-0.018
 discovering-0.018
 mills-0.017
Token buried
Token ID11694
Feature activation+5.04e-02

Pos logit contributions
ivery+0.123
kin+0.121
ishy+0.115
ifying+0.113
rier+0.113
Neg logit contributions
 airplanes-0.074
 planes-0.073
 whales-0.073
 adventures-0.073
 walks-0.072
Token.
Token ID13
Feature activation+1.51e-02

Pos logit contributions
ivery+0.072
kin+0.071
ishy+0.068
 bruised+0.068
ifying+0.067
Neg logit contributions
 guessing-0.062
 discovering-0.057
 facts-0.057
 tours-0.055
 mills-0.055
Token It
Token ID632
Feature activation+6.92e-03

Pos logit contributions
kin+0.024
ivery+0.021
ifying+0.019
 bruised+0.019
ishy+0.019
Neg logit contributions
 guessing-0.029
 facts-0.029
 tours-0.028
 topics-0.028
 discovering-0.028
Token was
Token ID373
Feature activation+2.69e-02

Pos logit contributions
kin+0.015
ivery+0.014
ifying+0.014
 bruised+0.013
ishy+0.013
Neg logit contributions
 guessing-0.010
 facts-0.009
 discovering-0.009
 tours-0.009
 motorcycles-0.009
Token an
Token ID281
Feature activation+1.27e-02

Pos logit contributions
ivery+0.052
kin+0.050
ishy+0.046
ifying+0.046
he+0.044
Neg logit contributions
 guessing-0.047
 discovering-0.043
 facts-0.041
 seeing-0.040
 tours-0.040
Token ash
Token ID12530
Feature activation+1.66e-02

Pos logit contributions
kin+0.024
ivery+0.022
ifying+0.022
 bruised+0.021
robe+0.020
Neg logit contributions
 guessing-0.022
 discovering-0.020
 facts-0.020
 tours-0.020
 motorcycles-0.020
Tokentr
Token ID2213
Feature activation+9.14e-02

Pos logit contributions
kin+0.020
ivery+0.019
 bruised+0.016
ifying+0.016
ishy+0.016
Neg logit contributions
 guessing-0.038
 facts-0.036
 tours-0.036
 discovering-0.035
 motorcycles-0.035
Tokenay
Token ID323
Feature activation+1.84e-02

Pos logit contributions
kin+0.207
 thirst+0.205
 limp+0.200
 relief+0.191
ivery+0.190
Neg logit contributions
ons-0.101
 motorcycles-0.074
ades-0.073
ths-0.073
cles-0.069
Token.
Token ID13
Feature activation+1.84e-02

Pos logit contributions
kin+0.036
ivery+0.034
ishy+0.032
ifying+0.031
 bruised+0.031
Neg logit contributions
 guessing-0.029
 discovering-0.027
 motorcycles-0.027
 tours-0.026
 facts-0.026
Token Fl
Token ID1610
Feature activation+1.53e-02

Pos logit contributions
kin+0.030
ivery+0.027
ifying+0.024
ishy+0.024
 bruised+0.024
Neg logit contributions
 guessing-0.034
 facts-0.033
 tours-0.033
 topics-0.033
 motorcycles-0.033
Tokenuffy
Token ID15352
Feature activation-4.41e-02

Pos logit contributions
kin+0.023
ifying+0.021
ivery+0.020
 bruised+0.020
cloth+0.019
Neg logit contributions
 guessing-0.030
arm-0.028
 tours-0.028
 topics-0.028
 facts-0.028
Token did
Token ID750
Feature activation-2.89e-02

Pos logit contributions
 tours+0.098
 facts+0.095
 topics+0.092
 motorcycles+0.091
 events+0.091
Neg logit contributions
kin-0.058
 bruised-0.049
ivery-0.046
 ripped-0.045
ishy-0.045
Token and
Token ID290
Feature activation+1.19e-02

Pos logit contributions
 facts+0.060
 tours+0.058
 topics+0.057
 mills+0.055
 motorcycles+0.055
Neg logit contributions
kin-0.054
 bruised-0.052
ivery-0.050
ifying-0.050
 limp-0.047
Token made
Token ID925
Feature activation+1.17e-02

Pos logit contributions
kin+0.023
ivery+0.021
ifying+0.020
ishy+0.019
 bruised+0.019
Neg logit contributions
 guessing-0.020
 motorcycles-0.019
 facts-0.019
 discovering-0.018
 tours-0.018
Token him
Token ID683
Feature activation+3.21e-02

Pos logit contributions
kin+0.021
ivery+0.019
ifying+0.018
 bruised+0.018
ishy+0.017
Neg logit contributions
 guessing-0.021
 tours-0.020
 facts-0.020
 motorcycles-0.020
 discovering-0.019
Token drop
Token ID4268
Feature activation+4.92e-02

Pos logit contributions
kin+0.059
ivery+0.052
ifying+0.050
robe+0.047
ishy+0.046
Neg logit contributions
 guessing-0.058
 tours-0.054
 facts-0.053
 motorcycles-0.053
 discovering-0.052
Token his
Token ID465
Feature activation+1.79e-02

Pos logit contributions
kin+0.121
ivery+0.117
robe+0.102
 seeker+0.098
ifying+0.096
Neg logit contributions
 motorcycles-0.059
 tours-0.051
 instructions-0.051
 facts-0.050
 bikes-0.049
Token basket
Token ID7988
Feature activation+9.05e-02

Pos logit contributions
kin+0.034
ivery+0.033
ifying+0.030
 bruised+0.029
robe+0.029
Neg logit contributions
 guessing-0.028
 tours-0.028
 motorcycles-0.027
 facts-0.027
 discovering-0.026
Token.
Token ID13
Feature activation+8.69e-03

Pos logit contributions
kin+0.200
ivery+0.196
 relief+0.196
 Deal+0.186
ishy+0.184
Neg logit contributions
 to-0.104
 guessing-0.088
 climbing-0.073
 exploring-0.071
ups-0.069
Token She
Token ID1375
Feature activation+1.31e-03

Pos logit contributions
kin+0.014
ivery+0.013
ifying+0.012
 bruised+0.012
ishy+0.011
Neg logit contributions
 guessing-0.017
 facts-0.016
 tours-0.016
 motorcycles-0.016
 topics-0.016
Token felt
Token ID2936
Feature activation+5.22e-02

Pos logit contributions
kin+0.002
ivery+0.002
ifying+0.002
 bruised+0.002
ishy+0.002
Neg logit contributions
 guessing-0.002
 facts-0.002
 tours-0.002
 motorcycles-0.002
 discovering-0.002
Token sorry
Token ID7926
Feature activation+1.37e-02

Pos logit contributions
kin+0.097
suit+0.083
ivery+0.083
cy+0.081
ifying+0.081
Neg logit contributions
 guessing-0.090
 learning-0.086
 discovering-0.086
 seeing-0.085
 facts-0.082
Token and
Token ID290
Feature activation+2.75e-02

Pos logit contributions
kin+0.032
ivery+0.030
ifying+0.029
 bruised+0.027
ishy+0.027
Neg logit contributions
 guessing-0.017
 facts-0.016
 discovering-0.016
 motorcycles-0.015
 tours-0.015
Token he
Token ID339
Feature activation+1.57e-02

Pos logit contributions
kin+0.018
ivery+0.017
ifying+0.016
 bruised+0.016
ishy+0.016
Neg logit contributions
 guessing-0.011
 tours-0.010
 motorcycles-0.010
 facts-0.010
 discovering-0.010
Token sp
Token ID599
Feature activation+2.69e-02

Pos logit contributions
kin+0.032
ivery+0.030
ifying+0.029
 bruised+0.029
ishy+0.028
Neg logit contributions
 guessing-0.024
 tours-0.022
 motorcycles-0.021
 discovering-0.021
 facts-0.021
Tokenied
Token ID798
Feature activation+1.84e-02

Pos logit contributions
kin+0.055
 bruised+0.054
ifying+0.053
ivery+0.052
 limp+0.049
Neg logit contributions
 guessing-0.035
 tours-0.034
 motorcycles-0.033
 facts-0.033
ons-0.031
Token an
Token ID281
Feature activation+6.20e-03

Pos logit contributions
kin+0.035
ivery+0.034
ifying+0.032
ishy+0.030
 bruised+0.030
Neg logit contributions
 guessing-0.030
 tours-0.028
 motorcycles-0.028
 facts-0.028
 discovering-0.028
Token ash
Token ID12530
Feature activation+2.19e-02

Pos logit contributions
kin+0.012
ivery+0.011
ifying+0.010
 bruised+0.010
ishy+0.010
Neg logit contributions
 guessing-0.010
 facts-0.010
 tours-0.010
 motorcycles-0.010
 discovering-0.010
Tokentr
Token ID2213
Feature activation+9.02e-02

Pos logit contributions
kin+0.028
ivery+0.026
 bruised+0.023
ishy+0.023
ifying+0.022
Neg logit contributions
 guessing-0.049
 facts-0.046
 tours-0.046
 motorcycles-0.046
 mills-0.045
Tokenay
Token ID323
Feature activation+3.06e-02

Pos logit contributions
kin+0.199
 thirst+0.192
 limp+0.185
ivery+0.184
 relief+0.179
Neg logit contributions
ons-0.108
 motorcycles-0.090
ades-0.083
ths-0.079
 airplanes-0.078
Token.
Token ID13
Feature activation-6.09e-03

Pos logit contributions
kin+0.058
ivery+0.056
ishy+0.052
 bruised+0.051
ifying+0.051
Neg logit contributions
 guessing-0.049
 motorcycles-0.047
 discovering-0.046
 facts-0.045
 tours-0.045
Token He
Token ID679
Feature activation-8.75e-03

Pos logit contributions
 guessing+0.013
 facts+0.013
 tours+0.012
 discovering+0.012
 topics+0.012
Neg logit contributions
kin-0.009
ivery-0.008
ifying-0.007
 bruised-0.007
ishy-0.007
Token picked
Token ID6497
Feature activation-1.78e-02

Pos logit contributions
 guessing+0.016
 facts+0.016
 tours+0.016
 topics+0.015
 discovering+0.015
Neg logit contributions
kin-0.015
ivery-0.013
 bruised-0.013
ifying-0.013
ishy-0.012
Token up
Token ID510
Feature activation-1.19e-02

Pos logit contributions
 facts+0.027
 guessing+0.027
 topics+0.027
 tours+0.027
 mills+0.026
Neg logit contributions
kin-0.034
 bruised-0.031
ivery-0.030
ishy-0.030
ifying-0.030
Token dress
Token ID6576
Feature activation+4.09e-02

Pos logit contributions
ivery+0.128
kin+0.115
ishy+0.113
istered+0.110
ifying+0.105
Neg logit contributions
 guessing-0.079
 bikes-0.078
 motorcycles-0.073
 occasions-0.071
 counting-0.071
Token,
Token ID11
Feature activation+2.11e-02

Pos logit contributions
ivery+0.101
kin+0.097
ishy+0.092
ifying+0.092
 bruised+0.091
Neg logit contributions
 guessing-0.042
 facts-0.037
 discovering-0.036
 seeing-0.036
 bikes-0.035
Token and
Token ID290
Feature activation+2.82e-02

Pos logit contributions
kin+0.048
ivery+0.048
ifying+0.044
ishy+0.043
 bruised+0.042
Neg logit contributions
 guessing-0.027
 discovering-0.025
 facts-0.025
 motorcycles-0.024
 tours-0.024
Token a
Token ID257
Feature activation+5.35e-02

Pos logit contributions
ivery+0.053
kin+0.052
ishy+0.045
ifying+0.045
 bruised+0.044
Neg logit contributions
 guessing-0.048
 facts-0.047
 motorcycles-0.046
 discovering-0.045
 tours-0.044
Token spark
Token ID9009
Feature activation+5.59e-02

Pos logit contributions
ivery+0.106
kin+0.099
ishy+0.094
ifying+0.091
he+0.088
Neg logit contributions
 guessing-0.087
 facts-0.076
 motorcycles-0.074
 tricks-0.073
 discovering-0.072
Tokenly
Token ID306
Feature activation+9.02e-02

Pos logit contributions
ivery+0.128
ifying+0.111
 bruised+0.109
 Deal+0.108
kin+0.107
Neg logit contributions
 facts-0.070
 tricks-0.069
 guessing-0.068
is-0.061
 occasions-0.061
Token cape
Token ID36023
Feature activation+5.19e-02

Pos logit contributions
ivery+0.199
ifying+0.169
istered+0.168
idget+0.163
ishy+0.160
Neg logit contributions
 guessing-0.120
 tricks-0.119
 facts-0.112
 bikes-0.112
 occasions-0.110
Token.
Token ID13
Feature activation+4.71e-02

Pos logit contributions
ivery+0.109
kin+0.104
ifying+0.099
ishy+0.099
 bruised+0.095
Neg logit contributions
 guessing-0.073
 facts-0.068
 discovering-0.066
 seeing-0.065
 tricks-0.061
Token Tom
Token ID4186
Feature activation-2.52e-02

Pos logit contributions
kin+0.093
ivery+0.092
ishy+0.082
 bruised+0.078
ifying+0.077
Neg logit contributions
 motorcycles-0.073
 guessing-0.069
 museums-0.069
 tours-0.068
 topics-0.068
Token thought
Token ID1807
Feature activation-6.89e-03

Pos logit contributions
 tours+0.050
 guessing+0.049
 facts+0.049
 topics+0.049
 discovering+0.048
Neg logit contributions
kin-0.038
 bruised-0.033
ivery-0.031
robe-0.030
ifying-0.030
Token Elsa
Token ID19226
Feature activation-2.72e-03

Pos logit contributions
 guessing+0.011
 facts+0.010
 tours+0.010
 topics+0.010
 discovering+0.010
Neg logit contributions
kin-0.014
ivery-0.012
 bruised-0.012
ifying-0.012
ishy-0.011
Token,
Token ID11
Feature activation+3.45e-02

Pos logit contributions
kin+0.048
ifying+0.045
ivery+0.045
ishy+0.042
 bruised+0.042
Neg logit contributions
 guessing-0.027
 discovering-0.025
 tours-0.025
 facts-0.024
 adventures-0.024
Token the
Token ID262
Feature activation+8.65e-03

Pos logit contributions
ivery+0.088
kin+0.086
ifying+0.083
ishy+0.082
istered+0.079
Neg logit contributions
 guessing-0.032
 tours-0.032
 adventures-0.031
 discovering-0.030
 museums-0.030
Token bird
Token ID6512
Feature activation+2.90e-04

Pos logit contributions
kin+0.018
ifying+0.017
ivery+0.017
ishy+0.016
 bruised+0.016
Neg logit contributions
 guessing-0.013
 tours-0.012
 facts-0.011
 discovering-0.011
 motorcycles-0.011
Token put
Token ID1234
Feature activation+7.01e-02

Pos logit contributions
kin+0.001
ivery+0.000
ifying+0.000
 bruised+0.000
ishy+0.000
Neg logit contributions
 guessing-0.001
 facts-0.001
 tours-0.000
 discovering-0.000
 motorcycles-0.000
Token her
Token ID607
Feature activation+4.37e-02

Pos logit contributions
ivery+0.197
ifying+0.187
istered+0.180
ishy+0.167
 Deal+0.165
Neg logit contributions
ons-0.058
 things-0.054
 adventures-0.050
 to-0.046
 bikes-0.045
Token blanket
Token ID18447
Feature activation+9.01e-02

Pos logit contributions
ivery+0.080
kin+0.076
ifying+0.072
ishy+0.071
istered+0.068
Neg logit contributions
 adventures-0.067
 guessing-0.065
 bikes-0.062
 tours-0.060
 journeys-0.060
Token back
Token ID736
Feature activation+3.60e-02

Pos logit contributions
ivery+0.217
ishy+0.202
 Deal+0.202
ifying+0.199
kin+0.199
Neg logit contributions
 to-0.101
 guessing-0.074
 airplanes-0.069
 exploring-0.068
 adventures-0.066
Token inside
Token ID2641
Feature activation+1.89e-02

Pos logit contributions
kin+0.071
ivery+0.071
ifying+0.064
ishy+0.064
istered+0.063
Neg logit contributions
 guessing-0.052
 discovering-0.050
 exploring-0.047
 tours-0.047
 motorcycles-0.047
Token.
Token ID13
Feature activation+7.17e-03

Pos logit contributions
kin+0.036
ivery+0.035
ifying+0.033
ishy+0.032
 bruised+0.031
Neg logit contributions
 guessing-0.031
 discovering-0.030
 facts-0.028
 seeing-0.028
 tours-0.028
Token She
Token ID1375
Feature activation-8.26e-03

Pos logit contributions
kin+0.011
ivery+0.010
ifying+0.009
 bruised+0.009
ishy+0.008
Neg logit contributions
 guessing-0.015
 facts-0.015
 tours-0.014
 topics-0.014
 discovering-0.014
Token was
Token ID373
Feature activation+4.68e-02

Pos logit contributions
 guessing+0.015
 facts+0.015
 tours+0.015
 topics+0.015
 discovering+0.015
Neg logit contributions
kin-0.014
ivery-0.012
 bruised-0.012
ifying-0.012
ishy-0.011
Token Ben
Token ID3932
Feature activation+1.88e-02

Pos logit contributions
ivery+0.039
kin+0.038
ifying+0.037
ishy+0.033
robe+0.033
Neg logit contributions
 guessing-0.036
 motorcycles-0.035
 facts-0.034
 tours-0.033
 discovering-0.033
Token said
Token ID531
Feature activation+4.33e-02

Pos logit contributions
kin+0.038
ifying+0.037
ivery+0.036
 bruised+0.033
ishy+0.032
Neg logit contributions
 guessing-0.029
 facts-0.027
 motorcycles-0.026
arm-0.026
 mills-0.026
Token,
Token ID11
Feature activation+7.55e-03

Pos logit contributions
ifying+0.109
kin+0.106
ivery+0.105
ishy+0.097
robe+0.093
Neg logit contributions
 guessing-0.044
 to-0.041
 counting-0.039
 instructions-0.037
 discovering-0.035
Token holding
Token ID4769
Feature activation+4.62e-02

Pos logit contributions
kin+0.018
ivery+0.016
ifying+0.016
ishy+0.016
 bruised+0.015
Neg logit contributions
 guessing-0.009
 discovering-0.009
 facts-0.008
 tours-0.008
 motorcycles-0.008
Token his
Token ID465
Feature activation-1.80e-02

Pos logit contributions
ivery+0.119
kin+0.106
ifying+0.104
 bruised+0.101
istered+0.100
Neg logit contributions
 quizz-0.046
 guessing-0.043
 motorcycles-0.041
 walks-0.041
 tricks-0.041
Token bowl
Token ID9396
Feature activation+8.98e-02

Pos logit contributions
 guessing+0.032
 facts+0.032
 tours+0.031
 discovering+0.031
 topics+0.031
Neg logit contributions
kin-0.032
 bruised-0.027
ivery-0.026
ifying-0.026
ishy-0.026
Token tightly
Token ID17707
Feature activation+5.89e-02

Pos logit contributions
ivery+0.199
ifying+0.195
kin+0.174
 Deal+0.169
Uh+0.168
Neg logit contributions
 to-0.112
ups-0.096
 instructions-0.092
 guessing-0.089
 questions-0.084
Token.
Token ID13
Feature activation+3.03e-02

Pos logit contributions
ivery+0.137
kin+0.134
ishy+0.115
ifying+0.114
robe+0.114
Neg logit contributions
 guessing-0.073
 to-0.070
 counting-0.065
 discovering-0.064
 motorcycles-0.063
Token\n
Token ID198
Feature activation-3.74e-03

Pos logit contributions
kin+0.075
ivery+0.074
ifying+0.070
ishy+0.069
 bruised+0.068
Neg logit contributions
 guessing-0.031
 motorcycles-0.027
 facts-0.026
 topics-0.026
 mills-0.025
Token\n
Token ID198
Feature activation+7.35e-04

Pos logit contributions
 guessing+0.006
 facts+0.006
 tours+0.006
 discovering+0.006
 topics+0.006
Neg logit contributions
kin-0.007
ivery-0.007
ifying-0.006
 bruised-0.006
ishy-0.006
TokenL
Token ID43
Feature activation+3.48e-02

Pos logit contributions
kin+0.001
ivery+0.001
ifying+0.001
 bruised+0.001
ishy+0.001
Neg logit contributions
 guessing-0.001
 facts-0.001
 tours-0.001
 topics-0.001
 motorcycles-0.001
Token washed
Token ID18989
Feature activation+3.39e-02

Pos logit contributions
kin+0.060
ivery+0.059
ifying+0.055
rier+0.053
ishy+0.051
Neg logit contributions
 guessing-0.055
 discovering-0.051
 seeing-0.047
 exploring-0.046
 computers-0.046
Token the
Token ID262
Feature activation+5.12e-03

Pos logit contributions
kin+0.080
ivery+0.079
ifying+0.074
 bruised+0.071
istered+0.069
Neg logit contributions
 topics-0.035
 guessing-0.034
 instructions-0.033
 facts-0.033
 bikes-0.032
Token dress
Token ID6576
Feature activation+3.94e-02

Pos logit contributions
kin+0.010
ivery+0.009
ifying+0.009
 bruised+0.008
ishy+0.008
Neg logit contributions
 guessing-0.008
 tours-0.008
 facts-0.008
 discovering-0.008
 topics-0.008
Token until
Token ID1566
Feature activation+4.43e-02

Pos logit contributions
kin+0.078
ivery+0.077
ifying+0.077
 bruised+0.073
ishy+0.070
Neg logit contributions
 guessing-0.054
ups-0.049
 tours-0.048
 facts-0.048
 instructions-0.048
Token it
Token ID340
Feature activation+6.31e-02

Pos logit contributions
ivery+0.121
ishy+0.114
ifying+0.110
kin+0.110
 bruised+0.108
Neg logit contributions
 guessing-0.035
 discovering-0.029
 seeing-0.029
 tours-0.027
 adventures-0.027
Token was
Token ID373
Feature activation+8.93e-02

Pos logit contributions
ifying+0.171
ivery+0.157
ishy+0.156
arl+0.156
 bruised+0.155
Neg logit contributions
ups-0.040
 guessing-0.040
ves-0.038
 to-0.034
 adventures-0.032
Token perfect
Token ID2818
Feature activation+2.73e-02

Pos logit contributions
rified+0.186
ifying+0.172
suit+0.167
cy+0.165
hy+0.164
Neg logit contributions
 guessing-0.120
 adventures-0.106
 exploring-0.105
 to-0.105
 seeing-0.104
Token,
Token ID11
Feature activation+2.63e-02

Pos logit contributions
kin+0.053
ivery+0.053
ifying+0.051
ishy+0.047
 bruised+0.047
Neg logit contributions
 guessing-0.043
 discovering-0.039
 seeing-0.036
 tours-0.036
 adventures-0.036
Token just
Token ID655
Feature activation+5.60e-02

Pos logit contributions
kin+0.061
ivery+0.059
ifying+0.056
ishy+0.054
robe+0.052
Neg logit contributions
 guessing-0.034
 discovering-0.032
 seeing-0.029
 facts-0.028
 motorcycles-0.027
Token like
Token ID588
Feature activation+8.35e-03

Pos logit contributions
ivery+0.114
kin+0.110
he+0.103
ifying+0.098
robe+0.096
Neg logit contributions
 guessing-0.090
 discovering-0.090
 seeing-0.087
 exploring-0.082
 learning-0.080
Token new
Token ID649
Feature activation-6.88e-03

Pos logit contributions
kin+0.015
ivery+0.014
ifying+0.013
ishy+0.013
 bruised+0.013
Neg logit contributions
 guessing-0.014
 facts-0.014
 tours-0.014
 discovering-0.014
 motorcycles-0.013
Token She
Token ID1375
Feature activation-2.19e-02

Pos logit contributions
kin+0.004
ivery+0.004
ifying+0.003
 bruised+0.003
ishy+0.003
Neg logit contributions
 guessing-0.007
 facts-0.007
 tours-0.007
 discovering-0.007
 topics-0.007
Token loved
Token ID6151
Feature activation-1.18e-01

Pos logit contributions
 facts+0.042
 guessing+0.042
 tours+0.041
 motorcycles+0.041
 topics+0.040
Neg logit contributions
kin-0.035
ivery-0.031
robe-0.028
 bruised-0.028
ishy-0.028
Token to
Token ID284
Feature activation-3.69e-02

Pos logit contributions
ves+0.257
icians+0.242
bors+0.235
isting+0.226
 Street+0.225
Neg logit contributions
 Lily-0.135
 cold-0.123
 wet-0.121
 dry-0.121
 empty-0.118
Token explore
Token ID7301
Feature activation-7.31e-02

Pos logit contributions
ups+0.068
 guessing+0.067
icians+0.067
 facts+0.067
ths+0.066
Neg logit contributions
kin-0.057
ivery-0.051
 limp-0.047
robe-0.046
ishy-0.045
Token and
Token ID290
Feature activation-4.68e-02

Pos logit contributions
icians+0.180
 facts+0.167
 Exhibit+0.166
cles+0.164
bors+0.163
Neg logit contributions
he-0.066
 empty-0.061
 alone-0.058
 dry-0.053
 like-0.052
Token try
Token ID1949
Feature activation-1.32e-01

Pos logit contributions
 motorcycles+0.084
icians+0.082
cles+0.081
 facts+0.080
arm+0.080
Neg logit contributions
kin-0.065
 sill-0.063
ivery-0.063
 limp-0.061
ifying-0.061
Token new
Token ID649
Feature activation-8.30e-02

Pos logit contributions
 occasions+0.286
cles+0.284
icians+0.274
bors+0.272
ths+0.255
Neg logit contributions
 out-0.134
 low-0.134
 down-0.132
 okay-0.131
 sad-0.123
Token things
Token ID1243
Feature activation-2.82e-02

Pos logit contributions
icians+0.141
 Exhibit+0.134
arm+0.132
ending+0.131
bors+0.129
Neg logit contributions
 eye-0.108
ishy-0.106
 away-0.105
 sill-0.103
 like-0.100
Token.
Token ID13
Feature activation-1.72e-02

Pos logit contributions
 guessing+0.060
 topics+0.060
 motorcycles+0.059
 facts+0.059
 mills+0.059
Neg logit contributions
kin-0.035
ivery-0.029
 bruised-0.029
ifying-0.028
robe-0.028
Token Daisy
Token ID40355
Feature activation-6.24e-02

Pos logit contributions
 guessing+0.037
 facts+0.037
 tours+0.036
 motorcycles+0.036
 mills+0.035
Neg logit contributions
kin-0.022
ivery-0.020
ifying-0.018
 bruised-0.018
ishy-0.018
Token wanted
Token ID2227
Feature activation-8.44e-02

Pos logit contributions
 facts+0.136
 motorcycles+0.129
 tours+0.128
 guessing+0.127
 topics+0.126
Neg logit contributions
kin-0.078
ishy-0.066
cy-0.066
 still-0.065
ivery-0.064
Token Fox
Token ID5426
Feature activation-1.48e-02

Pos logit contributions
 guessing+0.007
 facts+0.006
 tours+0.006
 discovering+0.006
 topics+0.006
Neg logit contributions
kin-0.006
ivery-0.005
ifying-0.005
 bruised-0.005
ishy-0.004
Tokeny
Token ID88
Feature activation-3.39e-03

Pos logit contributions
 guessing+0.033
 tours+0.032
 facts+0.032
 topics+0.031
 discovering+0.031
Neg logit contributions
kin-0.020
ivery-0.017
ifying-0.016
 bruised-0.016
ishy-0.015
Token.
Token ID13
Feature activation-2.17e-02

Pos logit contributions
 guessing+0.007
 facts+0.007
 tours+0.007
 discovering+0.006
 topics+0.006
Neg logit contributions
kin-0.005
ivery-0.005
ifying-0.004
 bruised-0.004
ishy-0.004
Token Fox
Token ID5426
Feature activation+2.62e-04

Pos logit contributions
 guessing+0.046
 facts+0.043
 tours+0.043
 discovering+0.043
 topics+0.042
Neg logit contributions
kin-0.029
ivery-0.027
ifying-0.025
 bruised-0.025
ishy-0.024
Tokeny
Token ID88
Feature activation-2.01e-02

Pos logit contributions
kin+0.000
ivery+0.000
ifying+0.000
 bruised+0.000
ishy+0.000
Neg logit contributions
 guessing-0.001
 facts-0.001
 tours-0.001
 discovering-0.001
 topics-0.001
Token liked
Token ID8288
Feature activation-1.32e-01

Pos logit contributions
 guessing+0.038
 tours+0.037
 facts+0.037
 topics+0.036
 discovering+0.036
Neg logit contributions
kin-0.033
ivery-0.029
 bruised-0.027
ifying-0.027
robe-0.027
Token to
Token ID284
Feature activation-2.83e-02

Pos logit contributions
ves+0.310
icians+0.304
bors+0.300
ises+0.296
 occasions+0.293
Neg logit contributions
 Lily-0.105
 a-0.092
 cold-0.091
 hungry-0.090
 full-0.088
Token dance
Token ID9280
Feature activation-1.51e-02

Pos logit contributions
 guessing+0.049
 topics+0.047
 facts+0.047
 discovering+0.047
 motorcycles+0.046
Neg logit contributions
kin-0.049
ivery-0.044
 bruised-0.041
 limp-0.040
robe-0.040
Token in
Token ID287
Feature activation-4.80e-02

Pos logit contributions
 guessing+0.025
 facts+0.024
 mills+0.024
 discovering+0.024
 motorcycles+0.024
Neg logit contributions
kin-0.028
ivery-0.025
ifying-0.024
robe-0.024
 bruised-0.023
Token the
Token ID262
Feature activation-2.73e-02

Pos logit contributions
 facts+0.104
arm+0.102
ups+0.101
 mills+0.098
 topics+0.097
Neg logit contributions
kin-0.051
 sill-0.048
ifying-0.046
 bruised-0.045
 empty-0.043
Token forest
Token ID8222
Feature activation-3.38e-02

Pos logit contributions
 facts+0.047
 guessing+0.045
 motorcycles+0.044
 topics+0.044
 mills+0.044
Neg logit contributions
kin-0.047
ivery-0.045
ifying-0.043
ishy-0.041
 bruised-0.041
Token She
Token ID1375
Feature activation+9.17e-04

Pos logit contributions
kin+0.049
ivery+0.045
ifying+0.043
 bruised+0.041
ishy+0.041
Neg logit contributions
 topics-0.054
 facts-0.053
 discovering-0.053
 tours-0.052
 guessing-0.052
Token loved
Token ID6151
Feature activation-1.02e-01

Pos logit contributions
kin+0.002
ivery+0.002
ifying+0.002
 bruised+0.002
ishy+0.001
Neg logit contributions
 guessing-0.002
 facts-0.001
 tours-0.001
 discovering-0.001
 topics-0.001
Token to
Token ID284
Feature activation-3.69e-02

Pos logit contributions
icians+0.199
ves+0.197
bors+0.186
arm+0.186
isting+0.177
Neg logit contributions
 Lily-0.135
 cold-0.129
 empty-0.126
 wet-0.126
 bruised-0.124
Token explore
Token ID7301
Feature activation-6.72e-02

Pos logit contributions
icians+0.070
 guessing+0.069
 facts+0.068
ths+0.068
ups+0.068
Neg logit contributions
kin-0.056
ivery-0.049
 limp-0.046
 bruised-0.044
robe-0.044
Token and
Token ID290
Feature activation-4.87e-02

Pos logit contributions
icians+0.164
 guessing+0.149
arm+0.149
 Exhibit+0.149
cles+0.148
Neg logit contributions
 empty-0.059
he-0.057
 alone-0.054
 dry-0.050
 Lily-0.050
Token try
Token ID1949
Feature activation-1.32e-01

Pos logit contributions
icians+0.088
 motorcycles+0.088
cles+0.086
arm+0.086
 facts+0.083
Neg logit contributions
 sill-0.066
kin-0.065
 bruised-0.064
 limp-0.064
ivery-0.064
Token new
Token ID649
Feature activation-7.75e-02

Pos logit contributions
 occasions+0.285
cles+0.284
icians+0.277
bors+0.277
ths+0.259
Neg logit contributions
 low-0.134
 down-0.132
 out-0.131
 okay-0.128
 sad-0.124
Token things
Token ID1243
Feature activation-1.97e-02

Pos logit contributions
icians+0.133
arm+0.126
 Exhibit+0.122
ending+0.122
ths+0.122
Neg logit contributions
 eye-0.101
 sill-0.100
ishy-0.097
 away-0.095
 idea-0.094
Token.
Token ID13
Feature activation-1.20e-02

Pos logit contributions
 guessing+0.042
 facts+0.041
 topics+0.041
 motorcycles+0.041
 tours+0.041
Neg logit contributions
kin-0.026
ivery-0.022
 bruised-0.021
ifying-0.021
robe-0.020
Token One
Token ID1881
Feature activation+7.32e-03

Pos logit contributions
 guessing+0.026
 facts+0.025
 tours+0.025
 motorcycles+0.025
 discovering+0.025
Neg logit contributions
kin-0.016
ivery-0.014
ifying-0.013
 bruised-0.013
robe-0.012
Token day
Token ID1110
Feature activation-1.71e-02

Pos logit contributions
kin+0.015
ivery+0.014
ifying+0.014
 bruised+0.014
ishy+0.013
Neg logit contributions
 guessing-0.011
 tours-0.010
 facts-0.010
 discovering-0.010
 topics-0.010
Token She
Token ID1375
Feature activation-5.53e-03

Pos logit contributions
kin+0.063
ivery+0.059
ifying+0.055
 bruised+0.053
ishy+0.052
Neg logit contributions
 topics-0.061
 facts-0.060
 discovering-0.060
 tours-0.058
 museums-0.058
Token loved
Token ID6151
Feature activation-1.00e-01

Pos logit contributions
 guessing+0.009
 facts+0.009
 tours+0.009
 discovering+0.009
 topics+0.009
Neg logit contributions
kin-0.010
ivery-0.009
ifying-0.009
 bruised-0.009
ishy-0.008
Token to
Token ID284
Feature activation-2.48e-02

Pos logit contributions
icians+0.193
bors+0.185
ves+0.185
arm+0.183
 Street+0.173
Neg logit contributions
 Lily-0.129
 empty-0.126
 bruised-0.124
 cold-0.124
 wet-0.123
Token explore
Token ID7301
Feature activation-6.96e-02

Pos logit contributions
 guessing+0.044
 facts+0.043
 topics+0.042
 discovering+0.041
 tours+0.041
Neg logit contributions
kin-0.044
ivery-0.038
 bruised-0.035
ifying-0.035
ishy-0.034
Token and
Token ID290
Feature activation-4.54e-02

Pos logit contributions
icians+0.174
 facts+0.162
arm+0.161
 Exhibit+0.160
cles+0.159
Neg logit contributions
he-0.056
 empty-0.055
 alone-0.050
 dry-0.048
kin-0.048
Token try
Token ID1949
Feature activation-1.31e-01

Pos logit contributions
 motorcycles+0.082
icians+0.081
arm+0.079
cles+0.079
 facts+0.078
Neg logit contributions
kin-0.063
 sill-0.061
ivery-0.059
ifying-0.059
 limp-0.059
Token new
Token ID649
Feature activation-7.82e-02

Pos logit contributions
cles+0.281
 occasions+0.281
bors+0.277
icians+0.276
ths+0.265
Neg logit contributions
 low-0.131
 down-0.124
 out-0.123
 sad-0.116
 okay-0.112
Token things
Token ID1243
Feature activation-2.16e-02

Pos logit contributions
icians+0.132
arm+0.127
 Exhibit+0.127
ending+0.126
bors+0.124
Neg logit contributions
 eye-0.100
 sill-0.097
 away-0.095
ishy-0.094
 like-0.092
Token.
Token ID13
Feature activation-8.83e-03

Pos logit contributions
 guessing+0.047
 facts+0.046
 topics+0.046
 motorcycles+0.046
 tours+0.046
Neg logit contributions
kin-0.027
ivery-0.023
 bruised-0.022
ifying-0.022
robe-0.021
Token\n
Token ID198
Feature activation-4.44e-02

Pos logit contributions
 guessing+0.019
 facts+0.018
 tours+0.018
 discovering+0.017
 motorcycles+0.017
Neg logit contributions
kin-0.012
ivery-0.011
ifying-0.011
 bruised-0.010
ishy-0.010
Token\n
Token ID198
Feature activation-4.18e-02

Pos logit contributions
 facts+0.091
 guessing+0.087
 motorcycles+0.087
 tours+0.086
 topics+0.085
Neg logit contributions
kin-0.062
he-0.058
ivery-0.055
robe-0.048
ifying-0.048
Token Lily
Token ID20037
Feature activation+1.46e-02

Pos logit contributions
kin+0.020
ivery+0.018
ifying+0.017
 bruised+0.016
ishy+0.015
Neg logit contributions
 guessing-0.038
 discovering-0.036
 tours-0.036
 mills-0.035
 topics-0.035
Token.
Token ID13
Feature activation+3.71e-02

Pos logit contributions
kin+0.027
ivery+0.025
ifying+0.025
ishy+0.023
 bruised+0.023
Neg logit contributions
 guessing-0.024
 discovering-0.023
 motorcycles-0.023
 facts-0.023
 tours-0.022
Token She
Token ID1375
Feature activation-5.22e-05

Pos logit contributions
kin+0.066
ivery+0.059
ifying+0.059
rier+0.056
 bruised+0.055
Neg logit contributions
 guessing-0.063
 discovering-0.062
 topics-0.061
 museums-0.060
 tours-0.059
Token loved
Token ID6151
Feature activation-1.09e-01

Pos logit contributions
 guessing+0.000
 facts+0.000
 tours+0.000
 discovering+0.000
 topics+0.000
Neg logit contributions
kin-0.000
ivery-0.000
ifying-0.000
 bruised-0.000
ishy-0.000
Token to
Token ID284
Feature activation-2.30e-02

Pos logit contributions
icians+0.219
ves+0.212
bors+0.210
arm+0.201
 Street+0.197
Neg logit contributions
 Lily-0.135
 wet-0.129
 dry-0.129
 cold-0.128
 bruised-0.127
Token try
Token ID1949
Feature activation-1.31e-01

Pos logit contributions
 guessing+0.041
 facts+0.040
 topics+0.039
 discovering+0.039
icians+0.039
Neg logit contributions
kin-0.040
ivery-0.035
 bruised-0.032
ifying-0.032
robe-0.032
Token new
Token ID649
Feature activation-8.76e-02

Pos logit contributions
bors+0.294
cles+0.288
 occasions+0.282
icians+0.281
ager+0.278
Neg logit contributions
 warm-0.108
 out-0.105
 cold-0.103
 wet-0.103
 dark-0.103
Token things
Token ID1243
Feature activation-1.84e-02

Pos logit contributions
icians+0.165
ending+0.165
ll+0.164
bors+0.161
 Exhibit+0.157
Neg logit contributions
 eye-0.095
 idea-0.088
 Lily-0.084
 bruised-0.081
ishy-0.081
Token and
Token ID290
Feature activation-4.40e-02

Pos logit contributions
 guessing+0.037
 facts+0.036
 topics+0.036
 motorcycles+0.036
 tours+0.035
Neg logit contributions
kin-0.027
ivery-0.024
 bruised-0.023
ifying-0.022
ishy-0.021
Token have
Token ID423
Feature activation-5.37e-02

Pos logit contributions
 motorcycles+0.088
arm+0.087
icians+0.087
cles+0.085
 mills+0.085
Neg logit contributions
kin-0.058
 sill-0.050
 bruised-0.050
 limp-0.048
ivery-0.046
Token adventures
Token ID17545
Feature activation-1.42e-02

Pos logit contributions
arm+0.102
cles+0.098
 mills+0.098
ule+0.098
 topics+0.095
Neg logit contributions
kin-0.066
 empty-0.063
 OK-0.060
 sill-0.059
 bruised-0.059
Token to
Token ID284
Feature activation-2.11e-02

Pos logit contributions
icians+0.178
bors+0.173
ves+0.163
arm+0.159
 Street+0.157
Neg logit contributions
 bruised-0.134
 Lily-0.131
 empty-0.125
 itch-0.123
 Mom-0.123
Token study
Token ID2050
Feature activation-4.40e-02

Pos logit contributions
 guessing+0.037
 facts+0.036
 topics+0.036
 discovering+0.036
 tours+0.035
Neg logit contributions
kin-0.037
ivery-0.033
 bruised-0.031
robe-0.030
ifying-0.030
Token nature
Token ID3450
Feature activation-4.00e-02

Pos logit contributions
 guessing+0.065
bors+0.062
 whales+0.062
 discovering+0.062
icians+0.062
Neg logit contributions
kin-0.083
he-0.072
robe-0.071
ifying-0.070
 bruised-0.069
Token.
Token ID13
Feature activation+7.42e-03

Pos logit contributions
 guessing+0.087
 discovering+0.084
icians+0.083
 motorcycles+0.083
 mills+0.083
Neg logit contributions
kin-0.050
robe-0.042
he-0.041
ivery-0.039
ifying-0.039
Token She
Token ID1375
Feature activation-3.58e-02

Pos logit contributions
kin+0.011
ivery+0.010
ifying+0.009
 bruised+0.009
ishy+0.009
Neg logit contributions
 guessing-0.015
 facts-0.015
 tours-0.015
 topics-0.015
 discovering-0.015
Token liked
Token ID8288
Feature activation-1.30e-01

Pos logit contributions
 facts+0.073
 tours+0.073
 guessing+0.073
 topics+0.071
 discovering+0.070
Neg logit contributions
kin-0.049
robe-0.040
ivery-0.039
 bruised-0.039
ishy-0.038
Token to
Token ID284
Feature activation-3.48e-02

Pos logit contributions
bors+0.325
ises+0.314
icians+0.307
isting+0.304
ves+0.301
Neg logit contributions
 Lily-0.108
 her-0.096
 Mom-0.092
 like-0.087
 full-0.087
Token look
Token ID804
Feature activation+1.08e-03

Pos logit contributions
 guessing+0.055
 motorcycles+0.054
 facts+0.053
icians+0.053
bors+0.053
Neg logit contributions
kin-0.062
ivery-0.056
 sill-0.054
he-0.051
 limp-0.051
Token at
Token ID379
Feature activation-4.76e-02

Pos logit contributions
kin+0.002
ivery+0.002
ifying+0.002
 bruised+0.002
ishy+0.002
Neg logit contributions
 guessing-0.002
 tours-0.002
 facts-0.002
 discovering-0.002
 motorcycles-0.002
Token the
Token ID262
Feature activation-5.15e-02

Pos logit contributions
 guessing+0.091
 facts+0.084
icians+0.083
bors+0.083
 topics+0.082
Neg logit contributions
 bruised-0.064
kin-0.064
ifying-0.062
 sill-0.057
 limp-0.056
Token trees
Token ID7150
Feature activation-2.76e-02

Pos logit contributions
 guessing+0.090
icians+0.086
bors+0.085
 discovering+0.085
 motorcycles+0.085
Neg logit contributions
ifying-0.082
kin-0.078
ishy-0.073
he-0.071
robe-0.070
Token loved
Token ID6151
Feature activation-1.10e-01

Pos logit contributions
 guessing+0.041
 motorcycles+0.040
 facts+0.040
 tours+0.040
icians+0.039
Neg logit contributions
kin-0.035
ivery-0.031
 bruised-0.030
ifying-0.030
ishy-0.029
Token animals
Token ID4695
Feature activation-4.15e-02

Pos logit contributions
ves+0.224
icians+0.224
bors+0.214
is+0.213
arm+0.205
Neg logit contributions
 dark-0.122
 cold-0.117
 Lily-0.114
 full-0.112
 gloomy-0.112
Token.
Token ID13
Feature activation-1.48e-02

Pos logit contributions
 motorcycles+0.104
 tours+0.104
ths+0.103
 facts+0.103
 mills+0.103
Neg logit contributions
kin-0.034
 bruised-0.026
ivery-0.025
he-0.024
ifying-0.024
Token She
Token ID1375
Feature activation-2.69e-02

Pos logit contributions
 guessing+0.033
 facts+0.032
 motorcycles+0.032
 tours+0.031
icians+0.031
Neg logit contributions
kin-0.018
ivery-0.016
ifying-0.015
 bruised-0.015
ishy-0.014
Token especially
Token ID2592
Feature activation-2.00e-02

Pos logit contributions
 facts+0.056
 guessing+0.055
 motorcycles+0.054
 topics+0.054
 tours+0.054
Neg logit contributions
kin-0.038
ivery-0.032
 bruised-0.030
ishy-0.030
robe-0.029
Token liked
Token ID8288
Feature activation-1.30e-01

Pos logit contributions
 guessing+0.041
 tours+0.039
 motorcycles+0.039
 facts+0.039
 mills+0.038
Neg logit contributions
kin-0.030
ivery-0.025
 bruised-0.025
ishy-0.024
ifying-0.023
Token bees
Token ID17002
Feature activation-5.05e-02

Pos logit contributions
icians+0.259
cles+0.257
ves+0.256
is+0.252
iting+0.247
Neg logit contributions
 Lily-0.133
 like-0.122
 dark-0.118
 cold-0.116
 full-0.115
Token!
Token ID0
Feature activation-3.91e-03

Pos logit contributions
 motorcycles+0.105
 topics+0.100
 tours+0.100
 mills+0.098
 guessing+0.098
Neg logit contributions
kin-0.072
 bruised-0.061
ivery-0.061
ifying-0.057
ishy-0.056
Token\n
Token ID198
Feature activation-2.93e-02

Pos logit contributions
 guessing+0.007
 facts+0.007
 tours+0.007
 discovering+0.007
 topics+0.007
Neg logit contributions
kin-0.006
ivery-0.006
ifying-0.006
 bruised-0.005
ishy-0.005
Token\n
Token ID198
Feature activation-3.35e-02

Pos logit contributions
 facts+0.055
 guessing+0.054
 motorcycles+0.053
 tours+0.052
 topics+0.052
Neg logit contributions
kin-0.046
ivery-0.041
he-0.040
ifying-0.038
 bruised-0.037
TokenOne
Token ID3198
Feature activation+8.42e-03

Pos logit contributions
 facts+0.055
 guessing+0.054
 motorcycles+0.054
icians+0.053
 tours+0.050
Neg logit contributions
kin-0.058
ivery-0.057
he-0.053
ifying-0.050
ishy-0.050
Token mom
Token ID1995
Feature activation-1.31e-02

Pos logit contributions
 discovering+0.037
 guessing+0.037
 tours+0.036
 facts+0.036
 topics+0.036
Neg logit contributions
kin-0.037
ivery-0.031
ifying-0.031
robe-0.031
 bruised-0.031
Token and
Token ID290
Feature activation-4.40e-02

Pos logit contributions
 guessing+0.027
 facts+0.027
 tours+0.026
 topics+0.026
 discovering+0.026
Neg logit contributions
kin-0.019
ivery-0.017
ifying-0.016
 bruised-0.016
ishy-0.015
Token dad
Token ID9955
Feature activation-2.53e-02

Pos logit contributions
 topics+0.094
 facts+0.093
 tours+0.092
 guessing+0.092
arm+0.092
Neg logit contributions
kin-0.055
 bruised-0.044
 sorry-0.042
cloth-0.040
robe-0.040
Token.
Token ID13
Feature activation+4.86e-03

Pos logit contributions
 tours+0.057
 facts+0.057
 guessing+0.056
 mills+0.055
 motorcycles+0.055
Neg logit contributions
kin-0.031
ivery-0.025
ifying-0.024
 bruised-0.023
cloth-0.022
Token He
Token ID679
Feature activation-3.85e-02

Pos logit contributions
kin+0.008
ivery+0.007
ifying+0.007
 bruised+0.007
ishy+0.007
Neg logit contributions
 guessing-0.009
 facts-0.009
 tours-0.009
 motorcycles-0.009
 topics-0.009
Token liked
Token ID8288
Feature activation-1.29e-01

Pos logit contributions
 tours+0.080
 facts+0.079
 topics+0.078
 heights+0.078
 guessing+0.077
Neg logit contributions
kin-0.057
robe-0.044
ivery-0.042
 bruised-0.042
ishy-0.041
Token to
Token ID284
Feature activation-3.18e-02

Pos logit contributions
bors+0.275
ises+0.269
ths+0.269
icians+0.263
ves+0.259
Neg logit contributions
 Lily-0.123
 a-0.103
 Mom-0.103
 like-0.102
 sleepy-0.102
Token laugh
Token ID6487
Feature activation-1.49e-02

Pos logit contributions
 guessing+0.051
 discovering+0.051
 motorcycles+0.050
 facts+0.050
 topics+0.050
Neg logit contributions
kin-0.058
ivery-0.051
ifying-0.046
cloth-0.046
ishy-0.046
Token and
Token ID290
Feature activation-2.32e-02

Pos logit contributions
 guessing+0.023
 facts+0.023
 discovering+0.022
 tours+0.022
 mills+0.022
Neg logit contributions
kin-0.029
ivery-0.026
ifying-0.025
robe-0.024
 bruised-0.024
Token cl
Token ID537
Feature activation+1.53e-02

Pos logit contributions
 tours+0.038
 guessing+0.038
 facts+0.037
 mills+0.037
 topics+0.037
Neg logit contributions
kin-0.043
ivery-0.036
ifying-0.036
 bruised-0.036
robe-0.034
Tokenap
Token ID499
Feature activation-2.81e-03

Pos logit contributions
kin+0.028
 bruised+0.026
ivery+0.026
ifying+0.025
 limp+0.024
Neg logit contributions
 guessing-0.025
 tours-0.024
 topics-0.024
 discovering-0.024
 facts-0.024
Token was
Token ID373
Feature activation+3.52e-02

Pos logit contributions
kin+0.004
ivery+0.004
ifying+0.004
 bruised+0.004
ishy+0.003
Neg logit contributions
 guessing-0.003
 facts-0.003
 tours-0.003
 discovering-0.003
 motorcycles-0.003
Token very
Token ID845
Feature activation+2.93e-02

Pos logit contributions
kin+0.074
ivery+0.070
ifying+0.068
ishy+0.063
cloth+0.062
Neg logit contributions
 guessing-0.058
 discovering-0.052
 learning-0.051
 exploring-0.049
 weighing-0.049
Token curious
Token ID11040
Feature activation-1.92e-02

Pos logit contributions
kin+0.053
ifying+0.052
ivery+0.050
robe+0.046
cloth+0.046
Neg logit contributions
 guessing-0.051
 discovering-0.045
 tours-0.044
 weighing-0.043
 learning-0.043
Token.
Token ID13
Feature activation-5.24e-03

Pos logit contributions
 guessing+0.035
 tours+0.034
 facts+0.033
 topics+0.033
icians+0.033
Neg logit contributions
kin-0.032
 bruised-0.028
ivery-0.028
ifying-0.027
ishy-0.026
Token He
Token ID679
Feature activation-2.20e-02

Pos logit contributions
 guessing+0.012
 facts+0.012
 tours+0.012
 discovering+0.012
 topics+0.012
Neg logit contributions
kin-0.006
ivery-0.005
ifying-0.005
 bruised-0.005
ishy-0.004
Token liked
Token ID8288
Feature activation-1.28e-01

Pos logit contributions
 guessing+0.045
 facts+0.045
 tours+0.044
 topics+0.043
 motorcycles+0.043
Neg logit contributions
kin-0.031
ivery-0.027
 bruised-0.026
ishy-0.025
ifying-0.025
Token to
Token ID284
Feature activation-4.33e-02

Pos logit contributions
ves+0.299
bors+0.289
is+0.285
arm+0.281
ises+0.280
Neg logit contributions
 alone-0.108
 dark-0.108
 empty-0.104
 Lily-0.104
 cold-0.101
Token explore
Token ID7301
Feature activation-7.46e-02

Pos logit contributions
ups+0.075
icians+0.074
ths+0.073
 guessing+0.072
 facts+0.072
Neg logit contributions
kin-0.069
ivery-0.063
 limp-0.059
 sill-0.058
 eye-0.057
Token the
Token ID262
Feature activation-5.20e-02

Pos logit contributions
icians+0.178
bors+0.170
ending+0.170
arm+0.167
 occasions+0.167
Neg logit contributions
 empty-0.062
he-0.058
 alone-0.058
 like-0.053
 dark-0.053
Token world
Token ID995
Feature activation-4.58e-02

Pos logit contributions
icians+0.096
bors+0.094
 facts+0.091
ves+0.091
 occasions+0.088
Neg logit contributions
he-0.072
 seeker-0.071
 empty-0.071
kin-0.069
 sill-0.069
Token around
Token ID1088
Feature activation-2.15e-02

Pos logit contributions
icians+0.100
 guessing+0.100
bors+0.099
 mills+0.099
 facts+0.099
Neg logit contributions
kin-0.047
he-0.047
 sill-0.041
robe-0.038
 alone-0.038
Token He
Token ID679
Feature activation-8.52e-03

Pos logit contributions
kin+0.019
ivery+0.017
ifying+0.016
 bruised+0.016
ishy+0.015
Neg logit contributions
 facts-0.020
 guessing-0.020
 tours-0.020
 motorcycles-0.020
 topics-0.020
Token liked
Token ID8288
Feature activation-1.07e-01

Pos logit contributions
 guessing+0.016
 facts+0.016
 tours+0.016
 topics+0.016
 discovering+0.015
Neg logit contributions
kin-0.014
ivery-0.012
 bruised-0.012
ifying-0.012
robe-0.011
Token to
Token ID284
Feature activation-1.75e-02

Pos logit contributions
arm+0.233
icians+0.225
bors+0.223
 topics+0.221
ves+0.221
Neg logit contributions
 bruised-0.114
 alone-0.114
 full-0.111
 empty-0.105
 wet-0.104
Token explore
Token ID7301
Feature activation-7.12e-02

Pos logit contributions
 guessing+0.030
 facts+0.030
 topics+0.030
 tours+0.029
 motorcycles+0.029
Neg logit contributions
kin-0.031
ivery-0.028
 bruised-0.026
robe-0.025
ifying-0.025
Token and
Token ID290
Feature activation-4.75e-02

Pos logit contributions
icians+0.169
 occasions+0.162
arm+0.161
 facts+0.161
bors+0.160
Neg logit contributions
he-0.057
 alone-0.056
 empty-0.055
 like-0.050
 low-0.050
Token try
Token ID1949
Feature activation-1.28e-01

Pos logit contributions
 motorcycles+0.087
arm+0.087
 tours+0.086
 facts+0.086
cles+0.085
Neg logit contributions
kin-0.065
 sill-0.059
 bruised-0.059
ifying-0.058
 limp-0.058
Token new
Token ID649
Feature activation-8.19e-02

Pos logit contributions
 occasions+0.278
bors+0.265
cles+0.263
ths+0.261
pill+0.252
Neg logit contributions
 low-0.141
 down-0.136
 out-0.135
 okay-0.130
 sad-0.122
Token things
Token ID1243
Feature activation-2.71e-02

Pos logit contributions
arm+0.137
 Exhibit+0.130
ending+0.128
icians+0.128
ths+0.127
Neg logit contributions
 eye-0.106
 sill-0.105
 away-0.104
 like-0.097
 OK-0.096
Token.
Token ID13
Feature activation-7.12e-03

Pos logit contributions
 topics+0.059
 guessing+0.059
 facts+0.059
 mills+0.058
 tours+0.058
Neg logit contributions
kin-0.033
 bruised-0.027
ivery-0.027
ifying-0.026
robe-0.026
Token He
Token ID679
Feature activation-2.95e-02

Pos logit contributions
 guessing+0.015
 facts+0.014
 tours+0.014
 discovering+0.014
 topics+0.014
Neg logit contributions
kin-0.010
ivery-0.009
ifying-0.009
 bruised-0.009
ishy-0.008
Token would
Token ID561
Feature activation-3.01e-02

Pos logit contributions
 facts+0.056
 tours+0.056
 guessing+0.055
 topics+0.055
 motorcycles+0.054
Neg logit contributions
kin-0.049
ivery-0.039
robe-0.038
 bruised-0.038
ishy-0.037
Token Jane
Token ID12091
Feature activation+1.02e-02

Pos logit contributions
kin+0.023
ivery+0.022
ifying+0.020
 bruised+0.020
ishy+0.019
Neg logit contributions
 guessing-0.031
 tours-0.031
 discovering-0.030
 facts-0.030
 topics-0.030
Token.
Token ID13
Feature activation+1.82e-02

Pos logit contributions
kin+0.018
ivery+0.017
ifying+0.016
 bruised+0.015
ishy+0.015
Neg logit contributions
 guessing-0.018
 facts-0.017
 discovering-0.017
 tours-0.017
 motorcycles-0.017
Token She
Token ID1375
Feature activation-3.02e-03

Pos logit contributions
kin+0.031
ivery+0.028
ifying+0.027
 bruised+0.027
robe+0.024
Neg logit contributions
 guessing-0.034
 discovering-0.034
 facts-0.034
 tours-0.034
 topics-0.034
Token liked
Token ID8288
Feature activation-1.10e-01

Pos logit contributions
 guessing+0.005
 facts+0.005
 tours+0.005
 discovering+0.005
 topics+0.005
Neg logit contributions
kin-0.006
ivery-0.005
ifying-0.005
 bruised-0.005
ishy-0.005
Token to
Token ID284
Feature activation-8.72e-03

Pos logit contributions
icians+0.223
bors+0.220
ves+0.216
arm+0.215
ises+0.209
Neg logit contributions
 Lily-0.134
 Mom-0.119
 wet-0.119
 empty-0.118
 cold-0.116
Token try
Token ID1949
Feature activation-1.27e-01

Pos logit contributions
 guessing+0.014
 facts+0.014
 tours+0.013
 discovering+0.013
 topics+0.013
Neg logit contributions
kin-0.017
ivery-0.015
 bruised-0.014
ifying-0.014
ishy-0.014
Token new
Token ID649
Feature activation-8.28e-02

Pos logit contributions
cles+0.280
bors+0.278
 occasions+0.274
icians+0.265
aro+0.260
Neg logit contributions
 out-0.113
 dark-0.109
 sad-0.109
 cold-0.106
 wet-0.103
Token things
Token ID1243
Feature activation-2.05e-02

Pos logit contributions
 Exhibit+0.150
icians+0.149
ending+0.148
bors+0.145
arm+0.145
Neg logit contributions
 eye-0.096
 idea-0.090
ishy-0.087
 away-0.086
 like-0.085
Token,
Token ID11
Feature activation-2.64e-02

Pos logit contributions
 guessing+0.041
 facts+0.040
 topics+0.040
 motorcycles+0.040
 tours+0.040
Neg logit contributions
kin-0.030
ivery-0.026
 bruised-0.025
ifying-0.025
robe-0.024
Token but
Token ID475
Feature activation-3.24e-03

Pos logit contributions
 guessing+0.049
 topics+0.048
 facts+0.048
arm+0.048
 tours+0.048
Neg logit contributions
kin-0.042
ivery-0.035
 bruised-0.035
ifying-0.034
ishy-0.033
Token sometimes
Token ID3360
Feature activation-4.23e-02

Pos logit contributions
 guessing+0.005
 facts+0.005
 tours+0.005
 discovering+0.005
 topics+0.005
Neg logit contributions
kin-0.006
ivery-0.006
ifying-0.005
 bruised-0.005
ishy-0.005
Token Paw
Token ID35553
Feature activation-2.57e-02

Pos logit contributions
 facts+0.030
 guessing+0.030
 tours+0.029
 topics+0.029
 discovering+0.028
Neg logit contributions
kin-0.020
ivery-0.017
ifying-0.017
 bruised-0.017
robe-0.015
Token and
Token ID290
Feature activation+1.45e-02

Pos logit contributions
 guessing+0.042
 tours+0.041
 topics+0.041
 facts+0.041
 mills+0.040
Neg logit contributions
kin-0.049
ivery-0.041
 bruised-0.041
ifying-0.040
ishy-0.040
Token Tree
Token ID12200
Feature activation-2.51e-02

Pos logit contributions
kin+0.024
ivery+0.022
ifying+0.021
 bruised+0.020
ishy+0.020
Neg logit contributions
 guessing-0.028
 facts-0.027
 discovering-0.027
 tours-0.027
 motorcycles-0.026
Token.
Token ID13
Feature activation-1.72e-02

Pos logit contributions
 guessing+0.053
 tours+0.052
 facts+0.051
 topics+0.051
 discovering+0.051
Neg logit contributions
kin-0.034
ifying-0.029
ivery-0.029
 bruised-0.028
ishy-0.026
Token They
Token ID1119
Feature activation+3.16e-03

Pos logit contributions
 guessing+0.036
 facts+0.034
 discovering+0.034
 topics+0.033
 tours+0.033
Neg logit contributions
kin-0.024
ivery-0.021
ifying-0.021
ishy-0.020
 bruised-0.020
Token liked
Token ID8288
Feature activation-1.27e-01

Pos logit contributions
kin+0.006
ivery+0.005
ifying+0.005
 bruised+0.005
ishy+0.005
Neg logit contributions
 guessing-0.006
 facts-0.005
 tours-0.005
 discovering-0.005
 topics-0.005
Token to
Token ID284
Feature activation-3.28e-02

Pos logit contributions
is+0.297
ves+0.294
icians+0.293
 occasions+0.285
bors+0.284
Neg logit contributions
 like-0.105
 Lily-0.105
 a-0.096
 Tim-0.094
 full-0.092
Token play
Token ID711
Feature activation-4.55e-02

Pos logit contributions
 guessing+0.057
 facts+0.055
 discovering+0.055
 motorcycles+0.055
 tours+0.054
Neg logit contributions
kin-0.055
ivery-0.051
robe-0.045
 limp-0.045
 bruised-0.045
Token together
Token ID1978
Feature activation-3.17e-02

Pos logit contributions
 facts+0.093
 cultures+0.088
ups+0.088
 motorcycles+0.087
 topics+0.086
Neg logit contributions
kin-0.066
robe-0.056
he-0.053
ivery-0.053
 seeker-0.046
Token in
Token ID287
Feature activation-4.11e-02

Pos logit contributions
 facts+0.057
 tours+0.057
 motorcycles+0.057
 mills+0.056
 guessing+0.056
Neg logit contributions
kin-0.054
ivery-0.043
ifying-0.042
he-0.041
robe-0.041
Token the
Token ID262
Feature activation-2.70e-02

Pos logit contributions
 facts+0.088
ups+0.086
arm+0.086
ending+0.082
cles+0.082
Neg logit contributions
kin-0.047
 sill-0.039
 bruised-0.039
he-0.039
ifying-0.038
Token She
Token ID1375
Feature activation-1.49e-03

Pos logit contributions
kin+0.023
ivery+0.019
ifying+0.019
 bruised+0.018
ishy+0.017
Neg logit contributions
 guessing-0.030
 tours-0.030
 facts-0.030
 discovering-0.029
 topics-0.029
Token loved
Token ID6151
Feature activation-1.08e-01

Pos logit contributions
 guessing+0.002
 facts+0.002
 tours+0.002
 discovering+0.002
 topics+0.002
Neg logit contributions
kin-0.003
ivery-0.003
ifying-0.002
 bruised-0.002
ishy-0.002
Token to
Token ID284
Feature activation-2.54e-02

Pos logit contributions
icians+0.216
ves+0.215
bors+0.212
arm+0.198
 Street+0.196
Neg logit contributions
 Lily-0.135
 cold-0.128
 dry-0.128
 bruised-0.127
 wet-0.127
Token explore
Token ID7301
Feature activation-6.94e-02

Pos logit contributions
 guessing+0.046
 facts+0.045
 topics+0.044
 discovering+0.044
icians+0.044
Neg logit contributions
kin-0.043
ivery-0.038
 bruised-0.034
ifying-0.034
ishy-0.034
Token and
Token ID290
Feature activation-4.34e-02

Pos logit contributions
icians+0.170
 facts+0.160
bors+0.156
 Exhibit+0.155
 guessing+0.155
Neg logit contributions
he-0.060
 empty-0.059
 dry-0.053
kin-0.050
 alone-0.050
Token try
Token ID1949
Feature activation-1.27e-01

Pos logit contributions
 motorcycles+0.081
icians+0.080
cles+0.078
 facts+0.078
arm+0.077
Neg logit contributions
kin-0.060
 sill-0.056
ivery-0.055
 limp-0.054
 bruised-0.054
Token new
Token ID649
Feature activation-7.93e-02

Pos logit contributions
bors+0.274
 occasions+0.272
cles+0.270
icians+0.269
ths+0.252
Neg logit contributions
 low-0.121
 down-0.120
 out-0.116
 okay-0.114
 sad-0.109
Token things
Token ID1243
Feature activation-2.13e-02

Pos logit contributions
icians+0.131
 Exhibit+0.127
arm+0.127
bors+0.125
ending+0.124
Neg logit contributions
 eye-0.104
 sill-0.100
ishy-0.100
 away-0.097
 limp-0.095
Token.
Token ID13
Feature activation-1.64e-02

Pos logit contributions
 guessing+0.046
 topics+0.045
 facts+0.045
 motorcycles+0.045
 mills+0.045
Neg logit contributions
kin-0.027
ivery-0.023
 bruised-0.022
ifying-0.022
robe-0.021
Token Today
Token ID6288
Feature activation-5.45e-03

Pos logit contributions
 guessing+0.035
 facts+0.033
 tours+0.033
 discovering+0.033
 topics+0.032
Neg logit contributions
kin-0.022
ivery-0.020
 bruised-0.019
ifying-0.019
ishy-0.018
Token,
Token ID11
Feature activation-1.77e-02

Pos logit contributions
 guessing+0.008
 facts+0.008
 tours+0.008
 discovering+0.008
 topics+0.008
Neg logit contributions
kin-0.011
ivery-0.010
ifying-0.009
 bruised-0.009
ishy-0.009
Token curious
Token ID11040
Feature activation+7.29e-04

Pos logit contributions
kin+0.046
ivery+0.044
ifying+0.040
 bruised+0.040
robe+0.040
Neg logit contributions
 guessing-0.041
 motorcycles-0.038
 discovering-0.038
 tours-0.037
 topics-0.037
Token little
Token ID1310
Feature activation+4.10e-03

Pos logit contributions
kin+0.001
ivery+0.001
ifying+0.001
 bruised+0.001
ishy+0.001
Neg logit contributions
 guessing-0.001
 facts-0.001
 tours-0.001
 discovering-0.001
 topics-0.001
Token boy
Token ID2933
Feature activation-8.89e-03

Pos logit contributions
kin+0.006
ivery+0.005
ifying+0.005
 bruised+0.005
ishy+0.004
Neg logit contributions
 guessing-0.009
 facts-0.008
 motorcycles-0.008
 tours-0.008
 discovering-0.008
Token.
Token ID13
Feature activation+1.04e-02

Pos logit contributions
 guessing+0.017
 facts+0.017
 tours+0.017
 topics+0.016
 motorcycles+0.016
Neg logit contributions
kin-0.014
ivery-0.012
ifying-0.012
 bruised-0.012
ishy-0.011
Token He
Token ID679
Feature activation-1.90e-02

Pos logit contributions
kin+0.016
ivery+0.014
 bruised+0.013
ifying+0.013
ishy+0.012
Neg logit contributions
 guessing-0.021
 motorcycles-0.021
 tours-0.021
 topics-0.021
 facts-0.021
Token liked
Token ID8288
Feature activation-1.27e-01

Pos logit contributions
 guessing+0.036
 facts+0.035
 tours+0.035
 topics+0.034
 motorcycles+0.034
Neg logit contributions
kin-0.030
ivery-0.027
 bruised-0.026
ifying-0.026
ishy-0.025
Token to
Token ID284
Feature activation-4.74e-02

Pos logit contributions
ves+0.270
arm+0.259
bors+0.251
icians+0.244
is+0.241
Neg logit contributions
 empty-0.126
 full-0.124
 alone-0.121
 Lily-0.120
 like-0.113
Token explore
Token ID7301
Feature activation-7.53e-02

Pos logit contributions
ups+0.075
 guessing+0.074
 facts+0.072
icians+0.072
ths+0.071
Neg logit contributions
kin-0.082
ivery-0.074
 limp-0.070
 sill-0.068
ifying-0.067
Token his
Token ID465
Feature activation-3.20e-02

Pos logit contributions
icians+0.167
arm+0.161
ending+0.161
bors+0.159
 Exhibit+0.158
Neg logit contributions
 empty-0.067
he-0.064
 alone-0.062
 like-0.059
kin-0.056
Token neighborhood
Token ID6232
Feature activation-5.42e-02

Pos logit contributions
icians+0.059
 facts+0.059
bors+0.056
 guessing+0.056
 mills+0.055
Neg logit contributions
kin-0.050
robe-0.043
 eye-0.043
ishy-0.043
ifying-0.042
Token.
Token ID13
Feature activation+2.02e-03

Pos logit contributions
 facts+0.109
 guessing+0.109
bors+0.109
icians+0.107
arm+0.106
Neg logit contributions
kin-0.063
ifying-0.057
 sill-0.056
robe-0.055
ivery-0.053
Token play
Token ID711
Feature activation-1.44e-02

Pos logit contributions
 guessing+0.029
 facts+0.029
 discovering+0.028
 tours+0.028
 topics+0.028
Neg logit contributions
kin-0.033
ivery-0.029
ifying-0.028
 bruised-0.027
ishy-0.027
Token in
Token ID287
Feature activation-3.46e-02

Pos logit contributions
 facts+0.026
 guessing+0.025
 tours+0.025
 discovering+0.025
 topics+0.025
Neg logit contributions
kin-0.025
ivery-0.022
ifying-0.021
robe-0.021
 bruised-0.020
Token nature
Token ID3450
Feature activation-2.58e-02

Pos logit contributions
 facts+0.067
arm+0.064
ups+0.063
icians+0.063
 guessing+0.063
Neg logit contributions
kin-0.049
ifying-0.042
 bruised-0.041
he-0.041
ivery-0.041
Token.
Token ID13
Feature activation+2.88e-02

Pos logit contributions
 guessing+0.057
 facts+0.056
 tours+0.055
 discovering+0.055
 topics+0.055
Neg logit contributions
kin-0.032
ivery-0.027
ifying-0.026
 bruised-0.025
robe-0.024
Token They
Token ID1119
Feature activation+2.82e-03

Pos logit contributions
kin+0.061
ivery+0.058
 bruised+0.055
ifying+0.052
 bleed+0.051
Neg logit contributions
 tours-0.043
 motorcycles-0.042
 topics-0.042
 facts-0.041
 guessing-0.041
Token liked
Token ID8288
Feature activation-1.26e-01

Pos logit contributions
kin+0.005
ivery+0.005
ifying+0.004
 bruised+0.004
ishy+0.004
Neg logit contributions
 guessing-0.005
 facts-0.005
 tours-0.005
 discovering-0.005
 topics-0.005
Token to
Token ID284
Feature activation-4.19e-02

Pos logit contributions
icians+0.287
 occasions+0.285
ves+0.279
bors+0.273
ises+0.272
Neg logit contributions
 like-0.121
 full-0.112
 wet-0.109
 Lily-0.106
 dry-0.103
Token run
Token ID1057
Feature activation-2.39e-02

Pos logit contributions
 motorcycles+0.073
 facts+0.073
 guessing+0.071
ups+0.071
icians+0.070
Neg logit contributions
kin-0.068
ivery-0.060
ishy-0.058
 limp-0.058
 sill-0.054
Token on
Token ID319
Feature activation-7.21e-02

Pos logit contributions
 guessing+0.043
 facts+0.043
 motorcycles+0.041
 topics+0.041
 discovering+0.041
Neg logit contributions
kin-0.039
ivery-0.034
robe-0.034
ishy-0.034
ifying-0.033
Token the
Token ID262
Feature activation-4.36e-02

Pos logit contributions
icians+0.140
cles+0.135
arm+0.134
ending+0.130
 guessing+0.127
Neg logit contributions
 empty-0.094
 limp-0.092
kin-0.084
 wet-0.084
 dark-0.083
Token grass
Token ID8701
Feature activation+8.03e-03

Pos logit contributions
icians+0.078
 topics+0.076
 facts+0.074
 occasions+0.072
 guessing+0.071
Neg logit contributions
kin-0.068
ishy-0.063
robe-0.063
ivery-0.062
 limp-0.062
Token was
Token ID373
Feature activation+1.12e-02

Pos logit contributions
kin+0.002
ivery+0.002
ifying+0.002
 bruised+0.002
ishy+0.002
Neg logit contributions
 guessing-0.002
 facts-0.002
 tours-0.002
 discovering-0.002
 topics-0.002
Token very
Token ID845
Feature activation+1.95e-02

Pos logit contributions
kin+0.022
ivery+0.020
ifying+0.019
ishy+0.018
 bruised+0.018
Neg logit contributions
 guessing-0.020
 discovering-0.019
 tours-0.018
 seeing-0.017
 motorcycles-0.017
Token curious
Token ID11040
Feature activation-2.19e-02

Pos logit contributions
kin+0.035
ifying+0.032
ivery+0.032
robe+0.030
 bruised+0.030
Neg logit contributions
 guessing-0.035
 discovering-0.032
 tours-0.031
 facts-0.031
 topics-0.030
Token.
Token ID13
Feature activation-3.44e-02

Pos logit contributions
 guessing+0.047
 tours+0.046
icians+0.045
 facts+0.045
 discovering+0.045
Neg logit contributions
kin-0.028
 bruised-0.024
ifying-0.024
ivery-0.024
ishy-0.023
Token He
Token ID679
Feature activation-1.63e-02

Pos logit contributions
 guessing+0.084
icians+0.083
 facts+0.081
 tours+0.079
bors+0.079
Neg logit contributions
kin-0.029
 sill-0.027
ivery-0.025
ishy-0.024
ifying-0.023
Token liked
Token ID8288
Feature activation-1.26e-01

Pos logit contributions
 guessing+0.031
 facts+0.031
 tours+0.030
 topics+0.030
 motorcycles+0.030
Neg logit contributions
kin-0.026
ivery-0.023
 bruised-0.022
ifying-0.021
ishy-0.021
Token to
Token ID284
Feature activation-3.42e-02

Pos logit contributions
ves+0.298
bors+0.284
ises+0.280
is+0.277
icians+0.275
Neg logit contributions
 Lily-0.108
 cold-0.105
 dark-0.100
 a-0.100
 alone-0.099
Token explore
Token ID7301
Feature activation-7.50e-02

Pos logit contributions
 guessing+0.055
 facts+0.054
icians+0.054
 motorcycles+0.053
ups+0.053
Neg logit contributions
kin-0.060
ivery-0.055
 limp-0.051
 sill-0.050
robe-0.048
Token his
Token ID465
Feature activation-4.00e-02

Pos logit contributions
icians+0.177
bors+0.169
ending+0.167
 occasions+0.166
ths+0.164
Neg logit contributions
 empty-0.063
he-0.062
 alone-0.059
 like-0.058
 dark-0.055
Token surroundings
Token ID21334
Feature activation-4.61e-02

Pos logit contributions
icians+0.080
bors+0.078
 facts+0.077
ves+0.073
 mills+0.072
Neg logit contributions
kin-0.057
 eye-0.052
robe-0.051
 empty-0.050
 seeker-0.049
Token.
Token ID13
Feature activation-3.21e-02

Pos logit contributions
icians+0.091
bors+0.091
 guessing+0.090
 tours+0.089
 facts+0.089
Neg logit contributions
kin-0.063
he-0.052
 sill-0.051
 bruised-0.050
robe-0.049
Token who
Token ID508
Feature activation+2.97e-02

Pos logit contributions
kin+0.011
ivery+0.010
ifying+0.010
 bruised+0.009
ishy+0.009
Neg logit contributions
 guessing-0.013
 discovering-0.013
 facts-0.013
 tours-0.013
 topics-0.013
Token loved
Token ID6151
Feature activation-8.52e-02

Pos logit contributions
kin+0.070
ifying+0.068
ivery+0.067
rier+0.065
 bruised+0.063
Neg logit contributions
 guessing-0.032
 discovering-0.031
 adventures-0.031
 topics-0.031
 facts-0.030
Token nature
Token ID3450
Feature activation-1.54e-03

Pos logit contributions
icians+0.153
bors+0.146
ves+0.143
arm+0.142
 Street+0.129
Neg logit contributions
 bruised-0.129
 empty-0.121
 Lily-0.120
 full-0.117
 itch-0.117
Token.
Token ID13
Feature activation+6.25e-03

Pos logit contributions
 guessing+0.003
 facts+0.003
 tours+0.003
 discovering+0.003
 topics+0.003
Neg logit contributions
kin-0.002
ivery-0.002
ifying-0.002
 bruised-0.002
ishy-0.002
Token She
Token ID1375
Feature activation-2.59e-02

Pos logit contributions
kin+0.008
ivery+0.007
ifying+0.007
 bruised+0.007
ishy+0.006
Neg logit contributions
 guessing-0.014
 facts-0.014
 tours-0.014
 topics-0.013
 discovering-0.013
Token liked
Token ID8288
Feature activation-1.25e-01

Pos logit contributions
 facts+0.052
 guessing+0.052
 motorcycles+0.051
 topics+0.051
 tours+0.051
Neg logit contributions
kin-0.041
ivery-0.033
ishy-0.031
 bruised-0.031
robe-0.031
Token to
Token ID284
Feature activation-3.24e-02

Pos logit contributions
icians+0.277
cles+0.257
 Street+0.256
bors+0.253
 site+0.251
Neg logit contributions
 full-0.125
 wind-0.124
 dry-0.119
 wet-0.118
 cold-0.118
Token play
Token ID711
Feature activation-4.41e-02

Pos logit contributions
 motorcycles+0.057
 guessing+0.057
 facts+0.055
 topics+0.055
icians+0.054
Neg logit contributions
kin-0.055
ivery-0.049
ishy-0.044
 limp-0.044
 bruised-0.043
Token in
Token ID287
Feature activation-5.02e-02

Pos logit contributions
 facts+0.100
 motorcycles+0.097
 topics+0.095
 cultures+0.094
arm+0.092
Neg logit contributions
kin-0.059
ivery-0.046
robe-0.046
he-0.045
ishy-0.038
Token the
Token ID262
Feature activation-3.15e-02

Pos logit contributions
 facts+0.119
arm+0.115
cles+0.113
icians+0.113
ups+0.112
Neg logit contributions
kin-0.048
 empty-0.043
he-0.042
 Lily-0.041
 sill-0.040
Token trees
Token ID7150
Feature activation-4.99e-03

Pos logit contributions
 facts+0.055
 guessing+0.051
icians+0.051
 topics+0.050
 motorcycles+0.049
Neg logit contributions
kin-0.054
ivery-0.051
ishy-0.048
ifying-0.048
 bruised-0.046
Token on
Token ID319
Feature activation-4.56e-02

Pos logit contributions
 guessing+0.012
 facts+0.012
 tours+0.012
 discovering+0.012
 topics+0.012
Neg logit contributions
kin-0.013
ivery-0.012
ifying-0.011
 bruised-0.011
robe-0.011
Token the
Token ID262
Feature activation-4.68e-02

Pos logit contributions
 facts+0.089
arm+0.088
cles+0.087
icians+0.087
 guessing+0.086
Neg logit contributions
kin-0.063
ifying-0.049
he-0.049
 bruised-0.048
 empty-0.048
Token farm
Token ID5318
Feature activation-2.97e-02

Pos logit contributions
 facts+0.066
 topics+0.064
icians+0.063
 discovering+0.060
arm+0.060
Neg logit contributions
kin-0.093
ivery-0.084
robe-0.083
 empty-0.079
 limp-0.078
Token.
Token ID13
Feature activation-4.21e-03

Pos logit contributions
 guessing+0.063
 facts+0.062
 motorcycles+0.061
 topics+0.061
 tours+0.060
Neg logit contributions
kin-0.040
ivery-0.033
robe-0.031
 bruised-0.030
ifying-0.030
Token They
Token ID1119
Feature activation+3.61e-03

Pos logit contributions
 guessing+0.009
 facts+0.008
 tours+0.008
 discovering+0.008
 topics+0.008
Neg logit contributions
kin-0.006
ivery-0.006
ifying-0.005
 bruised-0.005
ishy-0.005
Token liked
Token ID8288
Feature activation-1.25e-01

Pos logit contributions
kin+0.006
ivery+0.006
ifying+0.006
 bruised+0.006
ishy+0.005
Neg logit contributions
 guessing-0.006
 facts-0.006
 tours-0.006
 discovering-0.006
 motorcycles-0.006
Token to
Token ID284
Feature activation-2.21e-02

Pos logit contributions
icians+0.288
 occasions+0.287
ises+0.283
ves+0.281
is+0.277
Neg logit contributions
 Lily-0.104
 full-0.100
 like-0.097
 a-0.096
 wet-0.096
Token run
Token ID1057
Feature activation-2.86e-02

Pos logit contributions
 guessing+0.036
 facts+0.036
 topics+0.035
 motorcycles+0.035
 discovering+0.034
Neg logit contributions
kin-0.040
ivery-0.036
 bruised-0.034
ifying-0.034
ishy-0.034
Token,
Token ID11
Feature activation-3.99e-02

Pos logit contributions
 guessing+0.052
 facts+0.051
 discovering+0.050
 topics+0.049
 motorcycles+0.049
Neg logit contributions
kin-0.047
robe-0.041
suit-0.041
ivery-0.040
ishy-0.040
Token jump
Token ID4391
Feature activation-2.23e-02

Pos logit contributions
 topics+0.081
 facts+0.081
cles+0.079
 occasions+0.079
ths+0.078
Neg logit contributions
kin-0.052
 bruised-0.048
 limp-0.047
 sill-0.045
cloth-0.044
Token,
Token ID11
Feature activation-3.36e-02

Pos logit contributions
 facts+0.038
 guessing+0.037
 topics+0.037
 discovering+0.037
 motorcycles+0.036
Neg logit contributions
kin-0.042
ivery-0.036
ifying-0.035
robe-0.035
 bruised-0.034
Token very
Token ID845
Feature activation+2.74e-02

Pos logit contributions
kin+0.049
ivery+0.045
ifying+0.043
ishy+0.041
 bruised+0.039
Neg logit contributions
 guessing-0.043
 discovering-0.040
 tours-0.037
 learning-0.037
 exploring-0.037
Token brave
Token ID14802
Feature activation-3.15e-03

Pos logit contributions
kin+0.050
ifying+0.047
ivery+0.046
 bruised+0.043
robe+0.042
Neg logit contributions
 guessing-0.048
 discovering-0.044
 tours-0.043
 seeing-0.041
 learning-0.041
Token and
Token ID290
Feature activation-8.08e-03

Pos logit contributions
 guessing+0.004
 facts+0.004
 tours+0.004
 discovering+0.004
 topics+0.004
Neg logit contributions
kin-0.007
ivery-0.006
ifying-0.006
 bruised-0.006
ishy-0.006
Token liked
Token ID8288
Feature activation-1.06e-01

Pos logit contributions
 guessing+0.014
 facts+0.014
 tours+0.014
 motorcycles+0.013
 topics+0.013
Neg logit contributions
kin-0.014
ivery-0.013
ifying-0.012
 bruised-0.012
ishy-0.012
Token to
Token ID284
Feature activation-1.46e-02

Pos logit contributions
bors+0.228
icians+0.225
arm+0.224
ves+0.220
ises+0.219
Neg logit contributions
 Lily-0.111
 alone-0.101
 full-0.099
 asleep-0.096
 hers-0.092
Token try
Token ID1949
Feature activation-1.25e-01

Pos logit contributions
 guessing+0.022
 facts+0.022
 motorcycles+0.022
 discovering+0.021
 tours+0.021
Neg logit contributions
kin-0.028
ivery-0.026
 bruised-0.024
ifying-0.024
robe-0.023
Token new
Token ID649
Feature activation-8.39e-02

Pos logit contributions
bors+0.268
cles+0.265
 occasions+0.258
icians+0.248
ths+0.247
Neg logit contributions
 low-0.118
 out-0.117
 dark-0.117
 down-0.115
 okay-0.112
Token things
Token ID1243
Feature activation-2.24e-02

Pos logit contributions
arm+0.140
icians+0.139
ending+0.139
bors+0.139
 Exhibit+0.139
Neg logit contributions
 idea-0.100
 eye-0.097
 away-0.097
 sill-0.096
 like-0.095
Token.
Token ID13
Feature activation+1.34e-02

Pos logit contributions
 guessing+0.048
 facts+0.047
 topics+0.047
 tours+0.047
 motorcycles+0.047
Neg logit contributions
kin-0.029
ivery-0.025
 bruised-0.024
ifying-0.023
robe-0.022
Token\n
Token ID198
Feature activation-2.20e-02

Pos logit contributions
kin+0.025
ivery+0.022
ifying+0.021
 bruised+0.021
ishy+0.020
Neg logit contributions
 guessing-0.023
 tours-0.023
 discovering-0.022
 topics-0.022
 facts-0.022
Token\n
Token ID198
Feature activation-1.42e-02

Pos logit contributions
 facts+0.040
 guessing+0.039
 motorcycles+0.038
 tours+0.038
 topics+0.038
Neg logit contributions
kin-0.037
ivery-0.033
ifying-0.031
he-0.031
 bruised-0.030
Token little
Token ID1310
Feature activation+2.35e-02

Pos logit contributions
kin+0.012
ivery+0.011
ifying+0.011
 bruised+0.010
ishy+0.010
Neg logit contributions
 guessing-0.012
 tours-0.012
 facts-0.012
 motorcycles-0.012
 discovering-0.011
Token girl
Token ID2576
Feature activation-1.61e-04

Pos logit contributions
kin+0.036
ivery+0.035
ifying+0.033
 bruised+0.030
robe+0.030
Neg logit contributions
 guessing-0.045
 motorcycles-0.043
 mills-0.042
 whales-0.042
 tours-0.042
Token who
Token ID508
Feature activation+4.59e-02

Pos logit contributions
 guessing+0.000
 facts+0.000
 tours+0.000
 discovering+0.000
 topics+0.000
Neg logit contributions
kin-0.000
ivery-0.000
ifying-0.000
 bruised-0.000
ishy-0.000
Token loved
Token ID6151
Feature activation-8.75e-02

Pos logit contributions
kin+0.118
rier+0.116
ivery+0.113
ifying+0.113
rily+0.105
Neg logit contributions
 adventures-0.038
 topics-0.036
 guessing-0.035
 discovering-0.034
 facts-0.032
Token to
Token ID284
Feature activation-3.31e-02

Pos logit contributions
icians+0.150
bors+0.144
arm+0.144
ves+0.142
 Street+0.127
Neg logit contributions
 bruised-0.134
 empty-0.126
 Lily-0.125
 itch-0.122
 wet-0.121
Token try
Token ID1949
Feature activation-1.25e-01

Pos logit contributions
 guessing+0.059
 facts+0.059
icians+0.058
 topics+0.058
arm+0.057
Neg logit contributions
kin-0.055
ivery-0.047
 bruised-0.044
ifying-0.043
 limp-0.043
Token new
Token ID649
Feature activation-8.59e-02

Pos logit contributions
cles+0.272
bors+0.270
icians+0.264
ths+0.264
 occasions+0.260
Neg logit contributions
 warm-0.107
 cooler-0.104
 empty-0.104
 cold-0.104
 dark-0.103
Token things
Token ID1243
Feature activation-1.62e-02

Pos logit contributions
icians+0.151
ending+0.148
bors+0.146
arm+0.144
 Exhibit+0.144
Neg logit contributions
 idea-0.100
 eye-0.100
ishy-0.095
 bruised-0.093
 limp-0.091
Token.
Token ID13
Feature activation+1.87e-03

Pos logit contributions
 guessing+0.035
 facts+0.034
 topics+0.034
 tours+0.034
 motorcycles+0.034
Neg logit contributions
kin-0.022
ivery-0.018
 bruised-0.018
ifying-0.017
ishy-0.017
Token She
Token ID1375
Feature activation-2.89e-02

Pos logit contributions
kin+0.002
ivery+0.002
ifying+0.002
 bruised+0.002
ishy+0.002
Neg logit contributions
 guessing-0.004
 facts-0.004
 tours-0.004
 topics-0.004
 discovering-0.004
Token was
Token ID373
Feature activation+1.61e-03

Pos logit contributions
 facts+0.055
 tours+0.055
 guessing+0.055
 topics+0.053
 motorcycles+0.052
Neg logit contributions
kin-0.048
ivery-0.039
ishy-0.037
robe-0.037
 bruised-0.037
Token took
Token ID1718
Feature activation+1.41e-02

Pos logit contributions
 guessing+0.009
 facts+0.009
 tours+0.009
 topics+0.008
 discovering+0.008
Neg logit contributions
kin-0.009
ivery-0.008
 bruised-0.008
ifying-0.008
ishy-0.007
Token the
Token ID262
Feature activation+6.91e-03

Pos logit contributions
kin+0.030
ivery+0.029
ifying+0.026
 bruised+0.026
ishy+0.026
Neg logit contributions
 guessing-0.020
 tours-0.020
 motorcycles-0.019
 facts-0.019
 topics-0.019
Token phone
Token ID3072
Feature activation+1.39e-02

Pos logit contributions
kin+0.017
ivery+0.016
ifying+0.016
 bruised+0.015
ishy+0.015
Neg logit contributions
 guessing-0.007
 facts-0.007
 tours-0.007
 motorcycles-0.006
 mills-0.006
Token and
Token ID290
Feature activation-1.89e-04

Pos logit contributions
kin+0.031
ivery+0.029
ifying+0.027
 bruised+0.027
ishy+0.027
Neg logit contributions
 guessing-0.018
 tours-0.017
 facts-0.017
 motorcycles-0.017
 discovering-0.017
Token turned
Token ID2900
Feature activation+1.90e-03

Pos logit contributions
 guessing+0.000
 facts+0.000
 tours+0.000
 discovering+0.000
 motorcycles+0.000
Neg logit contributions
kin-0.000
ivery-0.000
ifying-0.000
 bruised-0.000
ishy-0.000
Token it
Token ID340
Feature activation+4.16e-02

Pos logit contributions
kin+0.004
ivery+0.004
ifying+0.004
 bruised+0.004
ishy+0.004
Neg logit contributions
 guessing-0.003
 facts-0.002
 tours-0.002
 motorcycles-0.002
 discovering-0.002
Token off
Token ID572
Feature activation+2.89e-02

Pos logit contributions
kin+0.087
ivery+0.083
ifying+0.078
ishy+0.076
 bruised+0.072
Neg logit contributions
 guessing-0.060
 tours-0.055
 to-0.053
 motorcycles-0.053
 bikes-0.053
Token.
Token ID13
Feature activation-1.74e-04

Pos logit contributions
kin+0.063
ivery+0.063
ishy+0.058
 bruised+0.056
ifying+0.055
Neg logit contributions
 tours-0.037
 guessing-0.036
 motorcycles-0.035
 facts-0.035
 mills-0.034
Token She
Token ID1375
Feature activation+3.99e-03

Pos logit contributions
 guessing+0.000
 facts+0.000
 tours+0.000
 discovering+0.000
 topics+0.000
Neg logit contributions
kin-0.000
ivery-0.000
ifying-0.000
 bruised-0.000
ishy-0.000
Token said
Token ID531
Feature activation-3.10e-02

Pos logit contributions
kin+0.007
ivery+0.007
ifying+0.006
 bruised+0.006
ishy+0.006
Neg logit contributions
 guessing-0.007
 facts-0.007
 tours-0.006
 motorcycles-0.006
 discovering-0.006
Token:
Token ID25
Feature activation-5.31e-02

Pos logit contributions
 guessing+0.057
 topics+0.057
 facts+0.056
cles+0.055
 tours+0.054
Neg logit contributions
kin-0.052
 sorry-0.045
 bruised-0.045
 OK-0.043
 okay-0.041
Token cookie
Token ID19751
Feature activation-3.28e-03

Pos logit contributions
ivery+0.036
kin+0.036
 bruised+0.033
ifying+0.032
ishy+0.031
Neg logit contributions
 guessing-0.030
 facts-0.028
 tours-0.027
 motorcycles-0.027
 topics-0.027
Token and
Token ID290
Feature activation+1.52e-02

Pos logit contributions
 guessing+0.005
 facts+0.004
 tours+0.004
 discovering+0.004
 topics+0.004
Neg logit contributions
kin-0.007
ivery-0.006
ifying-0.006
 bruised-0.006
ishy-0.006
Token smiled
Token ID13541
Feature activation+2.98e-02

Pos logit contributions
kin+0.028
ivery+0.028
ifying+0.025
 bruised+0.025
ishy+0.025
Neg logit contributions
 guessing-0.025
 facts-0.024
 motorcycles-0.024
 discovering-0.024
 tours-0.023
Token.
Token ID13
Feature activation-1.47e-03

Pos logit contributions
kin+0.060
ivery+0.058
 bruised+0.054
ishy+0.052
ifying+0.052
Neg logit contributions
 guessing-0.046
 discovering-0.043
 facts-0.043
 tours-0.042
 seeing-0.042
Token It
Token ID632
Feature activation+1.73e-02

Pos logit contributions
 guessing+0.003
 facts+0.003
 tours+0.003
 discovering+0.003
 topics+0.003
Neg logit contributions
kin-0.002
ivery-0.002
ifying-0.002
 bruised-0.002
ishy-0.002
Token was
Token ID373
Feature activation+4.23e-02

Pos logit contributions
kin+0.038
ivery+0.037
ifying+0.036
 bruised+0.034
ishy+0.034
Neg logit contributions
 guessing-0.023
 facts-0.021
 discovering-0.020
 topics-0.020
 motorcycles-0.020
Token the
Token ID262
Feature activation+2.45e-02

Pos logit contributions
ivery+0.092
kin+0.086
ifying+0.081
ishy+0.079
robe+0.078
Neg logit contributions
 guessing-0.062
 facts-0.058
 discovering-0.057
 adventures-0.056
 tours-0.055
Token best
Token ID1266
Feature activation-7.12e-05

Pos logit contributions
kin+0.045
ivery+0.044
ifying+0.039
ishy+0.039
 bruised+0.039
Neg logit contributions
 guessing-0.043
 facts-0.040
 tours-0.040
 motorcycles-0.040
 adventures-0.039
Token night
Token ID1755
Feature activation-1.07e-03

Pos logit contributions
 guessing+0.000
 facts+0.000
 tours+0.000
 discovering+0.000
 topics+0.000
Neg logit contributions
kin-0.000
ivery-0.000
ifying-0.000
 bruised-0.000
ishy-0.000
Token ever
Token ID1683
Feature activation+8.80e-03

Pos logit contributions
 guessing+0.002
 facts+0.002
 tours+0.002
 discovering+0.002
 topics+0.002
Neg logit contributions
kin-0.002
ivery-0.002
ifying-0.002
 bruised-0.002
ishy-0.002
Token!
Token ID0
Feature activation+1.83e-02

Pos logit contributions
kin+0.017
ivery+0.016
ifying+0.015
 bruised+0.015
ishy+0.015
Neg logit contributions
 guessing-0.014
 tours-0.013
 facts-0.013
 discovering-0.013
 motorcycles-0.013
TokenSure
Token ID19457
Feature activation+7.53e-03

Pos logit contributions
 guessing+0.002
 facts+0.002
 tours+0.002
 discovering+0.002
 topics+0.002
Neg logit contributions
kin-0.002
ivery-0.002
ifying-0.002
 bruised-0.002
ishy-0.002
Token.
Token ID13
Feature activation+3.07e-02

Pos logit contributions
kin+0.016
ivery+0.016
ifying+0.015
 bruised+0.014
robe+0.014
Neg logit contributions
 guessing-0.010
 facts-0.010
 tours-0.010
 discovering-0.010
 motorcycles-0.009
Token But
Token ID887
Feature activation-3.55e-03

Pos logit contributions
kin+0.063
ivery+0.061
ishy+0.053
 bruised+0.053
ifying+0.052
Neg logit contributions
 guessing-0.043
 facts-0.043
 discovering-0.042
 topics-0.042
 motorcycles-0.042
Token be
Token ID307
Feature activation+2.54e-03

Pos logit contributions
 guessing+0.006
 facts+0.006
 tours+0.006
 discovering+0.006
 topics+0.006
Neg logit contributions
kin-0.007
ivery-0.006
ifying-0.006
 bruised-0.006
ishy-0.006
Token careful
Token ID8161
Feature activation-1.79e-02

Pos logit contributions
kin+0.005
ivery+0.005
ifying+0.005
 bruised+0.005
ishy+0.005
Neg logit contributions
 guessing-0.004
 facts-0.003
 discovering-0.003
 tours-0.003
 motorcycles-0.003
Token.
Token ID13
Feature activation+4.19e-02

Pos logit contributions
 guessing+0.037
 facts+0.037
 tours+0.037
 topics+0.036
 motorcycles+0.036
Neg logit contributions
kin-0.026
ivery-0.022
 bruised-0.021
ifying-0.020
cloth-0.020
Token It
Token ID632
Feature activation-1.29e-02

Pos logit contributions
kin+0.081
ivery+0.074
ishy+0.067
robe+0.065
 bruised+0.065
Neg logit contributions
 mills-0.063
 motorcycles-0.063
 facts-0.062
arm-0.062
 guessing-0.061
Token is
Token ID318
Feature activation+1.94e-02

Pos logit contributions
 guessing+0.020
 facts+0.019
 discovering+0.019
 tours+0.019
 topics+0.019
Neg logit contributions
kin-0.026
ivery-0.023
 bruised-0.023
ifying-0.023
ishy-0.022
Token very
Token ID845
Feature activation+1.83e-02

Pos logit contributions
kin+0.037
ivery+0.037
ifying+0.033
robe+0.032
he+0.032
Neg logit contributions
 guessing-0.032
 facts-0.029
 discovering-0.029
 tours-0.029
 motorcycles-0.028
Token loud
Token ID7812
Feature activation-1.47e-02

Pos logit contributions
kin+0.032
ivery+0.031
ifying+0.029
robe+0.028
 bruised+0.028
Neg logit contributions
 guessing-0.032
 facts-0.030
 tours-0.030
 motorcycles-0.029
 discovering-0.028
Token."
Token ID526
Feature activation+2.54e-02

Pos logit contributions
 guessing+0.028
 facts+0.027
 tours+0.027
 topics+0.026
 discovering+0.026
Neg logit contributions
kin-0.024
ivery-0.021
 bruised-0.020
ifying-0.020
ishy-0.019
Token into
Token ID656
Feature activation+1.76e-02

Pos logit contributions
kin+0.025
ivery+0.023
ifying+0.022
 bruised+0.022
ishy+0.021
Neg logit contributions
 guessing-0.011
 tours-0.010
 facts-0.009
 discovering-0.009
 topics-0.009
Token the
Token ID262
Feature activation+8.89e-03

Pos logit contributions
kin+0.038
ivery+0.037
ifying+0.034
ishy+0.034
 bruised+0.033
Neg logit contributions
 guessing-0.025
 tours-0.023
 topics-0.023
 facts-0.023
 discovering-0.022
Token living
Token ID2877
Feature activation+7.54e-03

Pos logit contributions
kin+0.020
ivery+0.018
ifying+0.017
 bruised+0.017
ishy+0.017
Neg logit contributions
 guessing-0.013
 tours-0.012
 topics-0.012
 facts-0.012
 motorcycles-0.012
Token room
Token ID2119
Feature activation+8.07e-04

Pos logit contributions
kin+0.015
ivery+0.014
ifying+0.014
 bruised+0.013
ishy+0.013
Neg logit contributions
 guessing-0.012
 topics-0.011
 facts-0.011
 tours-0.011
 discovering-0.011
Token and
Token ID290
Feature activation+7.20e-03

Pos logit contributions
kin+0.002
ivery+0.002
ifying+0.002
 bruised+0.002
ishy+0.001
Neg logit contributions
 guessing-0.001
 facts-0.001
 tours-0.001
 discovering-0.001
 topics-0.001
Token threw
Token ID9617
Feature activation+5.19e-02

Pos logit contributions
kin+0.013
ivery+0.012
ifying+0.012
ishy+0.011
 bruised+0.011
Neg logit contributions
 guessing-0.013
 discovering-0.012
 facts-0.012
 tours-0.012
 motorcycles-0.012
Token Anna
Token ID11735
Feature activation-2.38e-03

Pos logit contributions
ivery+0.133
kin+0.124
ifying+0.111
istered+0.108
ishy+0.106
Neg logit contributions
 bikes-0.059
 guessing-0.058
 to-0.055
 facts-0.055
 instructions-0.054
Token on
Token ID319
Feature activation-6.58e-04

Pos logit contributions
 guessing+0.004
 facts+0.004
 tours+0.004
 discovering+0.004
 topics+0.004
Neg logit contributions
kin-0.004
ivery-0.004
ifying-0.004
 bruised-0.003
ishy-0.003
Token the
Token ID262
Feature activation+1.36e-02

Pos logit contributions
 guessing+0.001
 facts+0.001
 tours+0.001
 discovering+0.001
 topics+0.001
Neg logit contributions
kin-0.001
ivery-0.001
ifying-0.001
 bruised-0.001
ishy-0.001
Token couch
Token ID18507
Feature activation+4.01e-02

Pos logit contributions
kin+0.029
ivery+0.028
ifying+0.027
ishy+0.026
 bruised+0.026
Neg logit contributions
 guessing-0.019
 facts-0.017
 tours-0.017
 motorcycles-0.017
 discovering-0.017
Token.
Token ID13
Feature activation+3.28e-02

Pos logit contributions
kin+0.079
ivery+0.076
ifying+0.072
ishy+0.070
 bruised+0.069
Neg logit contributions
 guessing-0.062
 facts-0.056
 tours-0.056
 motorcycles-0.055
 topics-0.054
Token nods
Token ID34833
Feature activation+2.99e-02

Pos logit contributions
 guessing+0.026
 tours+0.026
 facts+0.025
 discovering+0.025
 topics+0.025
Neg logit contributions
kin-0.031
ivery-0.027
 bruised-0.027
ishy-0.026
cloth-0.026
Token,
Token ID11
Feature activation+1.11e-02

Pos logit contributions
ifying+0.070
ivery+0.069
kin+0.069
ishy+0.067
 bruised+0.065
Neg logit contributions
 guessing-0.034
 seeing-0.031
 facts-0.031
 discovering-0.031
 adventures-0.030
Token but
Token ID475
Feature activation+2.97e-02

Pos logit contributions
kin+0.023
ivery+0.022
ifying+0.021
ishy+0.020
robe+0.020
Neg logit contributions
 guessing-0.017
 discovering-0.016
 facts-0.015
 seeing-0.015
 tours-0.015
Token she
Token ID673
Feature activation-5.87e-03

Pos logit contributions
ivery+0.074
ifying+0.070
kin+0.069
ishy+0.067
robe+0.067
Neg logit contributions
 guessing-0.030
 discovering-0.030
 seeing-0.029
 bikes-0.027
 facts-0.027
Token still
Token ID991
Feature activation+2.04e-02

Pos logit contributions
 guessing+0.012
 tours+0.011
 facts+0.011
 discovering+0.011
 topics+0.011
Neg logit contributions
kin-0.009
ivery-0.008
 bruised-0.008
ifying-0.008
ishy-0.007
Token feels
Token ID5300
Feature activation+4.97e-02

Pos logit contributions
kin+0.041
ivery+0.038
ifying+0.036
robe+0.034
 bruised+0.034
Neg logit contributions
 guessing-0.031
 facts-0.031
 discovering-0.029
 motorcycles-0.029
 topics-0.029
Token sad
Token ID6507
Feature activation+3.70e-03

Pos logit contributions
kin+0.079
ivery+0.075
ifying+0.073
robe+0.068
cy+0.064
Neg logit contributions
 guessing-0.092
 discovering-0.090
 seeing-0.090
 adventures-0.088
 learning-0.088
Token.
Token ID13
Feature activation+1.71e-02

Pos logit contributions
kin+0.006
ivery+0.006
ifying+0.005
 bruised+0.005
ishy+0.005
Neg logit contributions
 guessing-0.007
 facts-0.007
 tours-0.007
 discovering-0.007
 topics-0.006
Token She
Token ID1375
Feature activation+2.41e-03

Pos logit contributions
kin+0.026
ivery+0.023
ifying+0.021
 bruised+0.020
ishy+0.020
Neg logit contributions
 guessing-0.035
 tours-0.035
 facts-0.034
 topics-0.034
 discovering-0.034
Token loves
Token ID10408
Feature activation-5.92e-02

Pos logit contributions
kin+0.005
ivery+0.005
ifying+0.005
 bruised+0.005
ishy+0.004
Neg logit contributions
 guessing-0.003
 facts-0.003
 tours-0.003
 motorcycles-0.003
 discovering-0.003
Token to
Token ID284
Feature activation+3.57e-03

Pos logit contributions
bors+0.129
icians+0.128
arm+0.127
 feats+0.125
ves+0.125
Neg logit contributions
 bruised-0.056
kin-0.054
 sill-0.052
 Lily-0.050
 Sara-0.049
Token went
Token ID1816
Feature activation-4.70e-02

Pos logit contributions
kin+0.038
ivery+0.035
ifying+0.035
 bruised+0.033
ishy+0.032
Neg logit contributions
 guessing-0.030
 tours-0.028
 discovering-0.028
 motorcycles-0.028
 adventures-0.027
Token to
Token ID284
Feature activation-2.05e-02

Pos logit contributions
 topics+0.067
 facts+0.067
bors+0.066
icians+0.063
 cultures+0.063
Neg logit contributions
robe-0.089
kin-0.088
 bruised-0.087
 limp-0.083
ifying-0.082
Token the
Token ID262
Feature activation-2.38e-02

Pos logit contributions
 facts+0.038
 guessing+0.036
 topics+0.036
cles+0.035
 motorcycles+0.035
Neg logit contributions
kin-0.032
ivery-0.029
 bruised-0.029
 limp-0.028
ifying-0.027
Token park
Token ID3952
Feature activation-2.80e-02

Pos logit contributions
 facts+0.036
 topics+0.034
 guessing+0.034
 discovering+0.033
icians+0.033
Neg logit contributions
kin-0.045
ifying-0.043
ivery-0.042
 bruised-0.041
robe-0.041
Token with
Token ID351
Feature activation+3.26e-02

Pos logit contributions
 facts+0.040
 motorcycles+0.039
 topics+0.038
icians+0.038
 tours+0.037
Neg logit contributions
kin-0.056
 sill-0.051
ivery-0.050
 bruised-0.050
he-0.050
Token her
Token ID607
Feature activation-1.85e-03

Pos logit contributions
kin+0.080
ifying+0.077
ivery+0.076
rier+0.073
ishy+0.073
Neg logit contributions
 tours-0.038
 guessing-0.037
 adventures-0.037
 discovering-0.034
 seeing-0.034
Token mom
Token ID1995
Feature activation-1.43e-02

Pos logit contributions
 guessing+0.004
 facts+0.004
 tours+0.004
 discovering+0.003
 topics+0.003
Neg logit contributions
kin-0.003
ivery-0.003
ifying-0.002
 bruised-0.002
ishy-0.002
Token and
Token ID290
Feature activation-1.16e-02

Pos logit contributions
 facts+0.030
 guessing+0.029
 tours+0.029
 topics+0.029
 discovering+0.028
Neg logit contributions
kin-0.020
ivery-0.018
 bruised-0.017
ifying-0.017
ishy-0.016
Token forgot
Token ID16453
Feature activation-4.02e-02

Pos logit contributions
 facts+0.021
 guessing+0.021
 topics+0.021
 tours+0.020
 motorcycles+0.020
Neg logit contributions
kin-0.019
ivery-0.018
 bruised-0.017
ifying-0.016
robe-0.016
Token to
Token ID284
Feature activation+3.67e-02

Pos logit contributions
icians+0.070
cles+0.067
 topics+0.066
arm+0.066
 facts+0.064
Neg logit contributions
kin-0.066
 bruised-0.063
 OK-0.059
cloth-0.059
ifying-0.059
Token bring
Token ID2222
Feature activation+6.23e-03

Pos logit contributions
ifying+0.080
kin+0.079
ivery+0.076
robe+0.075
 bruised+0.073
Neg logit contributions
 guessing-0.049
 seeing-0.045
 tours-0.045
 discovering-0.045
 facts-0.043
Token a
Token ID257
Feature activation+4.80e-04

Pos logit contributions
kin+0.055
ivery+0.053
ifying+0.048
 bruised+0.047
ishy+0.046
Neg logit contributions
 tours-0.041
 guessing-0.041
 adventures-0.039
 facts-0.039
 discovering-0.039
Token break
Token ID2270
Feature activation+2.20e-02

Pos logit contributions
kin+0.001
ivery+0.001
ifying+0.001
 bruised+0.001
ishy+0.001
Neg logit contributions
 guessing-0.001
 facts-0.001
 tours-0.001
 discovering-0.001
 motorcycles-0.001
Token and
Token ID290
Feature activation+3.52e-02

Pos logit contributions
kin+0.046
ivery+0.044
ifying+0.041
 bruised+0.040
ishy+0.040
Neg logit contributions
 guessing-0.030
 tours-0.029
 facts-0.029
 discovering-0.028
 topics-0.028
Token relax
Token ID8960
Feature activation+4.15e-02

Pos logit contributions
kin+0.068
ivery+0.066
ifying+0.060
ishy+0.060
 bruised+0.057
Neg logit contributions
 guessing-0.058
 discovering-0.054
 facts-0.052
 tours-0.052
 adventures-0.051
Token!
Token ID0
Feature activation+3.06e-02

Pos logit contributions
kin+0.079
ivery+0.078
ishy+0.072
 bruised+0.071
ifying+0.071
Neg logit contributions
 guessing-0.063
 discovering-0.061
 adventures-0.057
 facts-0.057
 tours-0.057
Token<|endoftext|>
Token ID50256
Feature activation+2.52e-02

Pos logit contributions
kin+0.072
ivery+0.067
ishy+0.062
 bruised+0.061
ifying+0.059
Neg logit contributions
 discovering-0.037
 guessing-0.036
 tours-0.036
 topics-0.035
arm-0.035
TokenOne
Token ID3198
Feature activation+1.47e-02

Pos logit contributions
kin+0.058
ivery+0.056
 bruised+0.054
ishy+0.053
ifying+0.052
Neg logit contributions
 guessing-0.030
 tours-0.029
 discovering-0.028
 topics-0.028
 motorcycles-0.027
Token gloomy
Token ID46400
Feature activation-1.40e-02

Pos logit contributions
kin+0.031
ivery+0.030
ifying+0.029
 bruised+0.028
robe+0.027
Neg logit contributions
 guessing-0.020
 tours-0.019
 facts-0.019
 discovering-0.019
 adventures-0.019
Token day
Token ID1110
Feature activation+8.44e-03

Pos logit contributions
 guessing+0.025
 facts+0.024
 tours+0.023
 motorcycles+0.023
 discovering+0.023
Neg logit contributions
kin-0.025
ivery-0.022
ifying-0.021
 bruised-0.021
ishy-0.020
Token,
Token ID11
Feature activation+5.13e-02

Pos logit contributions
kin+0.018
ivery+0.017
 bruised+0.016
ifying+0.016
ishy+0.016
Neg logit contributions
 guessing-0.011
 tours-0.011
 discovering-0.011
 facts-0.010
 topics-0.010
Token there
Token ID612
Feature activation+2.47e-02

Pos logit contributions
ivery+0.114
kin+0.108
 bruised+0.106
ishy+0.104
rier+0.102
Neg logit contributions
 guessing-0.067
 discovering-0.064
 exploring-0.063
 seeing-0.059
 tours-0.058
Token for
Token ID329
Feature activation+1.66e-03

Pos logit contributions
kin+0.081
ivery+0.073
ifying+0.073
ishy+0.070
 bruised+0.067
Neg logit contributions
 guessing-0.038
 discovering-0.036
 tours-0.034
 exploring-0.034
 seeing-0.034
Token help
Token ID1037
Feature activation-9.17e-04

Pos logit contributions
kin+0.003
ivery+0.003
ifying+0.003
 bruised+0.003
ishy+0.003
Neg logit contributions
 guessing-0.003
 facts-0.003
 tours-0.003
 discovering-0.003
 topics-0.002
Token,
Token ID11
Feature activation-1.56e-02

Pos logit contributions
 guessing+0.001
 facts+0.001
 tours+0.001
 discovering+0.001
 topics+0.001
Neg logit contributions
kin-0.002
ivery-0.002
ifying-0.002
 bruised-0.002
ishy-0.002
Token but
Token ID475
Feature activation+2.33e-02

Pos logit contributions
 facts+0.028
 guessing+0.028
 tours+0.028
 topics+0.027
 motorcycles+0.027
Neg logit contributions
kin-0.026
ivery-0.024
 bruised-0.023
ifying-0.022
 limp-0.021
Token no
Token ID645
Feature activation+8.24e-03

Pos logit contributions
ivery+0.044
kin+0.043
ishy+0.042
ifying+0.041
robe+0.037
Neg logit contributions
 guessing-0.038
 discovering-0.037
 seeing-0.035
 mills-0.034
 tours-0.034
Token one
Token ID530
Feature activation+4.58e-03

Pos logit contributions
kin+0.016
ivery+0.015
ifying+0.014
 bruised+0.014
robe+0.014
Neg logit contributions
 guessing-0.014
 facts-0.013
 motorcycles-0.013
 discovering-0.013
 tours-0.013
Token was
Token ID373
Feature activation+2.15e-02

Pos logit contributions
kin+0.009
ivery+0.008
ifying+0.008
 bruised+0.008
ishy+0.007
Neg logit contributions
 guessing-0.007
 facts-0.007
 tours-0.007
 discovering-0.007
 topics-0.007
Token available
Token ID1695
Feature activation+1.15e-02

Pos logit contributions
kin+0.042
ivery+0.041
ifying+0.039
ishy+0.037
robe+0.037
Neg logit contributions
 guessing-0.035
 discovering-0.033
 tours-0.031
 seeing-0.031
 motorcycles-0.030
Token to
Token ID284
Feature activation+3.04e-02

Pos logit contributions
kin+0.020
ivery+0.019
ifying+0.018
 bruised+0.017
ishy+0.017
Neg logit contributions
 guessing-0.021
 discovering-0.020
 facts-0.019
 tours-0.019
 motorcycles-0.019
Token save
Token ID3613
Feature activation+1.56e-02

Pos logit contributions
kin+0.062
ifying+0.062
ishy+0.059
ivery+0.059
 bruised+0.056
Neg logit contributions
 guessing-0.043
 tours-0.041
 seeing-0.040
 discovering-0.040
 adventures-0.039
Token her
Token ID607
Feature activation+8.42e-03

Pos logit contributions
kin+0.030
ivery+0.028
ifying+0.027
ishy+0.025
 bruised+0.025
Neg logit contributions
 guessing-0.026
 tours-0.025
 facts-0.025
 discovering-0.024
 topics-0.024
Token He
Token ID679
Feature activation-4.93e-03

Pos logit contributions
 guessing+0.005
 facts+0.005
 tours+0.005
 discovering+0.005
 topics+0.005
Neg logit contributions
kin-0.003
ivery-0.002
ifying-0.002
 bruised-0.002
ishy-0.002
Token sprayed
Token ID35729
Feature activation+3.00e-02

Pos logit contributions
 guessing+0.009
 facts+0.009
 tours+0.009
 topics+0.009
 discovering+0.009
Neg logit contributions
kin-0.008
ivery-0.007
ifying-0.007
 bruised-0.007
ishy-0.006
Token some
Token ID617
Feature activation+2.48e-02

Pos logit contributions
kin+0.069
ivery+0.068
ifying+0.061
istered+0.060
 bruised+0.058
Neg logit contributions
 motorcycles-0.036
 guessing-0.036
 facts-0.034
 tours-0.034
 mills-0.033
Token water
Token ID1660
Feature activation+1.40e-02

Pos logit contributions
kin+0.050
ivery+0.048
ifying+0.044
 bruised+0.043
ishy+0.042
Neg logit contributions
 guessing-0.037
 facts-0.035
 motorcycles-0.035
 tours-0.033
 topics-0.033
Token on
Token ID319
Feature activation+1.81e-02

Pos logit contributions
kin+0.027
ivery+0.026
ifying+0.024
 bruised+0.024
ishy+0.023
Neg logit contributions
 guessing-0.022
 facts-0.021
 tours-0.021
 discovering-0.020
 motorcycles-0.020
Token the
Token ID262
Feature activation+1.97e-02

Pos logit contributions
kin+0.042
ivery+0.041
ishy+0.038
ifying+0.038
 bruised+0.038
Neg logit contributions
 guessing-0.021
 motorcycles-0.021
 tours-0.020
 mills-0.019
 topics-0.019
Token battery
Token ID6555
Feature activation+1.07e-02

Pos logit contributions
kin+0.039
ivery+0.037
ifying+0.034
ishy+0.034
 bruised+0.034
Neg logit contributions
 guessing-0.030
 motorcycles-0.028
 discovering-0.027
 tours-0.027
 topics-0.027
Token and
Token ID290
Feature activation+8.39e-03

Pos logit contributions
kin+0.023
ivery+0.022
ifying+0.021
 bruised+0.021
ishy+0.020
Neg logit contributions
 guessing-0.015
 facts-0.014
 tours-0.013
 discovering-0.013
 motorcycles-0.013
Token it
Token ID340
Feature activation+1.84e-02

Pos logit contributions
kin+0.015
ivery+0.015
ifying+0.014
 bruised+0.013
ishy+0.013
Neg logit contributions
 guessing-0.014
 facts-0.014
 motorcycles-0.014
 discovering-0.014
 tours-0.013
Token started
Token ID2067
Feature activation-2.54e-02

Pos logit contributions
kin+0.033
ivery+0.031
ifying+0.030
 bruised+0.028
ishy+0.028
Neg logit contributions
 guessing-0.032
 facts-0.029
 tours-0.029
 adventures-0.029
 discovering-0.029
Token to
Token ID284
Feature activation+3.33e-02

Pos logit contributions
 facts+0.024
 topics+0.023
 tours+0.023
arm+0.022
 motorcycles+0.022
Neg logit contributions
kin-0.063
 bruised-0.061
ivery-0.059
ifying-0.059
 limp-0.057
Token She
Token ID1375
Feature activation-3.68e-03

Pos logit contributions
kin+0.087
ifying+0.080
ivery+0.077
rier+0.076
 bruised+0.075
Neg logit contributions
 guessing-0.071
 topics-0.070
 discovering-0.068
 museums-0.067
 adventures-0.066
Token loved
Token ID6151
Feature activation-1.03e-01

Pos logit contributions
 guessing+0.006
 facts+0.006
 tours+0.006
 discovering+0.006
 topics+0.006
Neg logit contributions
kin-0.007
ivery-0.006
ifying-0.006
 bruised-0.006
ishy-0.006
Token to
Token ID284
Feature activation-1.99e-02

Pos logit contributions
icians+0.193
bors+0.192
ves+0.190
arm+0.179
 Street+0.177
Neg logit contributions
 Lily-0.137
 bruised-0.133
 Mom-0.126
 wet-0.125
 dry-0.125
Token play
Token ID711
Feature activation-3.30e-02

Pos logit contributions
 guessing+0.035
 facts+0.034
 topics+0.033
 discovering+0.033
 tours+0.033
Neg logit contributions
kin-0.036
ivery-0.031
 bruised-0.029
robe-0.029
ifying-0.029
Token outside
Token ID2354
Feature activation-1.37e-02

Pos logit contributions
 facts+0.070
 cultures+0.066
 topics+0.065
 mills+0.065
 motorcycles+0.064
Neg logit contributions
kin-0.050
ivery-0.043
he-0.041
robe-0.040
 bruised-0.036
Token in
Token ID287
Feature activation-1.60e-02

Pos logit contributions
 facts+0.028
 guessing+0.027
 tours+0.027
 topics+0.027
 motorcycles+0.027
Neg logit contributions
kin-0.020
ivery-0.018
ifying-0.017
 bruised-0.017
ishy-0.016
Token her
Token ID607
Feature activation-7.54e-03

Pos logit contributions
 facts+0.028
 guessing+0.028
 topics+0.027
 motorcycles+0.026
 tours+0.026
Neg logit contributions
kin-0.028
ivery-0.025
 bruised-0.024
ifying-0.024
cloth-0.023
Token garden
Token ID11376
Feature activation-2.71e-02

Pos logit contributions
 guessing+0.011
 facts+0.011
 tours+0.011
 topics+0.011
 motorcycles+0.011
Neg logit contributions
kin-0.015
ivery-0.014
ifying-0.013
 bruised-0.013
ishy-0.013
Token.
Token ID13
Feature activation+8.80e-03

Pos logit contributions
 facts+0.062
 motorcycles+0.060
 topics+0.060
 guessing+0.060
icians+0.059
Neg logit contributions
kin-0.032
ivery-0.028
 bruised-0.026
cloth-0.025
 sill-0.024
Token One
Token ID1881
Feature activation+5.00e-03

Pos logit contributions
kin+0.016
ivery+0.014
ifying+0.013
 bruised+0.013
ishy+0.012
Neg logit contributions
 guessing-0.016
 tours-0.015
 facts-0.015
 topics-0.015
 discovering-0.015
Token day
Token ID1110
Feature activation-4.41e-03

Pos logit contributions
kin+0.010
ivery+0.009
ifying+0.009
 bruised+0.009
ishy+0.008
Neg logit contributions
 guessing-0.008
 tours-0.007
 facts-0.007
 discovering-0.007
 topics-0.007
Token are
Token ID389
Feature activation+2.69e-02

Pos logit contributions
kin+0.057
ivery+0.053
ifying+0.051
 bruised+0.049
robe+0.047
Neg logit contributions
 guessing-0.050
 facts-0.048
 tours-0.047
 discovering-0.046
 motorcycles-0.045
Token good
Token ID922
Feature activation-2.41e-02

Pos logit contributions
ivery+0.053
kin+0.052
ifying+0.051
ishy+0.047
robe+0.046
Neg logit contributions
 guessing-0.045
 discovering-0.041
 learning-0.040
 tours-0.039
 seeing-0.039
Token friends
Token ID2460
Feature activation+3.92e-03

Pos logit contributions
 guessing+0.036
 topics+0.036
 facts+0.035
 discovering+0.034
 mills+0.034
Neg logit contributions
kin-0.046
ivery-0.041
 bruised-0.041
ishy-0.039
ifying-0.039
Token.
Token ID13
Feature activation+4.10e-03

Pos logit contributions
kin+0.007
ivery+0.006
ifying+0.006
 bruised+0.006
ishy+0.005
Neg logit contributions
 guessing-0.007
 facts-0.007
 tours-0.007
 discovering-0.007
 motorcycles-0.007
Token<|endoftext|>
Token ID50256
Feature activation-2.92e-03

Pos logit contributions
kin+0.008
ivery+0.008
 bruised+0.007
ifying+0.007
ishy+0.007
Neg logit contributions
 guessing-0.006
 facts-0.006
 tours-0.006
 topics-0.006
 discovering-0.006
TokenL
Token ID43
Feature activation-1.80e-02

Pos logit contributions
 guessing+0.004
 facts+0.004
 tours+0.004
 discovering+0.004
 topics+0.004
Neg logit contributions
kin-0.006
ivery-0.005
ifying-0.005
 bruised-0.005
ishy-0.005
Tokenily
Token ID813
Feature activation-4.22e-02

Pos logit contributions
 guessing+0.038
 facts+0.038
 topics+0.037
 discovering+0.037
 tours+0.036
Neg logit contributions
kin-0.026
ivery-0.020
cloth-0.019
ishy-0.019
ifying-0.019
Token liked
Token ID8288
Feature activation-8.33e-02

Pos logit contributions
 guessing+0.074
 facts+0.071
 topics+0.069
 tours+0.069
 motorcycles+0.068
Neg logit contributions
kin-0.073
 bruised-0.063
ishy-0.062
robe-0.060
ivery-0.060
Token to
Token ID284
Feature activation-1.57e-02

Pos logit contributions
icians+0.159
bors+0.154
arm+0.150
 Street+0.140
 topics+0.138
Neg logit contributions
 Lily-0.118
 Mom-0.117
 full-0.113
 empty-0.111
 bruised-0.108
Token jump
Token ID4391
Feature activation+1.33e-02

Pos logit contributions
 guessing+0.026
 facts+0.025
 tours+0.025
 discovering+0.025
 topics+0.025
Neg logit contributions
kin-0.029
ivery-0.026
ifying-0.025
 bruised-0.024
ishy-0.024
Token in
Token ID287
Feature activation-2.29e-02

Pos logit contributions
kin+0.026
ivery+0.025
 bruised+0.024
ifying+0.023
ishy+0.022
Neg logit contributions
 guessing-0.021
 tours-0.020
 motorcycles-0.019
 facts-0.019
 topics-0.019
Token was
Token ID373
Feature activation+2.49e-02

Pos logit contributions
 tours+0.056
 guessing+0.053
 facts+0.053
 mills+0.052
 motorcycles+0.052
Neg logit contributions
kin-0.049
 bruised-0.042
ifying-0.040
ivery-0.040
ishy-0.038
Token excited
Token ID6568
Feature activation-3.22e-02

Pos logit contributions
kin+0.048
ivery+0.046
ifying+0.044
ishy+0.043
robe+0.042
Neg logit contributions
 guessing-0.043
 discovering-0.041
 seeing-0.039
 learning-0.038
 visiting-0.037
Token!
Token ID0
Feature activation+1.54e-02

Pos logit contributions
icians+0.045
 mills+0.044
 tours+0.043
 facts+0.043
arm+0.043
Neg logit contributions
kin-0.062
 bruised-0.055
 sill-0.055
ivery-0.055
cloth-0.054
Token\n
Token ID198
Feature activation-3.57e-02

Pos logit contributions
kin+0.029
ivery+0.027
 bruised+0.025
ishy+0.024
ifying+0.023
Neg logit contributions
 guessing-0.026
 motorcycles-0.026
 discovering-0.025
 tours-0.025
 facts-0.025
Token\n
Token ID198
Feature activation-3.45e-02

Pos logit contributions
 facts+0.077
 mills+0.074
 heights+0.074
 topics+0.074
 tours+0.074
Neg logit contributions
kin-0.048
he-0.044
ivery-0.040
ifying-0.037
robe-0.037
TokenMom
Token ID29252
Feature activation-2.62e-02

Pos logit contributions
 facts+0.058
 tours+0.056
 heights+0.054
 mills+0.054
 guessing+0.054
Neg logit contributions
kin-0.065
ivery-0.059
ifying-0.058
he-0.057
cloth-0.053
Tokenmy
Token ID1820
Feature activation-4.71e-02

Pos logit contributions
 tours+0.063
 facts+0.063
 guessing+0.063
 topics+0.061
 discovering+0.060
Neg logit contributions
kin-0.029
ivery-0.024
 bruised-0.024
ifying-0.022
robe-0.021
Token and
Token ID290
Feature activation-1.00e-02

Pos logit contributions
 tours+0.092
 facts+0.092
 topics+0.090
 guessing+0.087
 heights+0.084
Neg logit contributions
kin-0.075
 bruised-0.062
ivery-0.059
cloth-0.059
robe-0.059
Token Daddy
Token ID26926
Feature activation-8.98e-03

Pos logit contributions
 guessing+0.021
 facts+0.021
 tours+0.021
 topics+0.020
 discovering+0.020
Neg logit contributions
kin-0.014
ivery-0.012
ifying-0.012
 bruised-0.011
ishy-0.011
Token said
Token ID531
Feature activation-2.73e-02

Pos logit contributions
 guessing+0.017
 facts+0.017
 tours+0.017
 topics+0.017
 discovering+0.017
Neg logit contributions
kin-0.014
ivery-0.013
ifying-0.012
 bruised-0.012
ishy-0.011
Token it
Token ID340
Feature activation-1.66e-02

Pos logit contributions
 guessing+0.042
 tours+0.041
 topics+0.041
 facts+0.040
 discovering+0.039
Neg logit contributions
kin-0.053
 bruised-0.046
ifying-0.046
 sorry-0.046
ivery-0.044
Token at
Token ID379
Feature activation-8.39e-03

Pos logit contributions
kin+0.034
ivery+0.032
ifying+0.031
 bruised+0.030
robe+0.029
Neg logit contributions
 guessing-0.032
 discovering-0.030
 tours-0.030
 adventures-0.029
 motorcycles-0.028
Token home
Token ID1363
Feature activation+2.57e-03

Pos logit contributions
 guessing+0.013
 facts+0.013
 topics+0.013
 tours+0.013
 motorcycles+0.012
Neg logit contributions
kin-0.016
ivery-0.015
 bruised-0.014
ifying-0.014
robe-0.013
Token.
Token ID13
Feature activation-2.32e-02

Pos logit contributions
kin+0.005
ivery+0.004
ifying+0.004
 bruised+0.004
ishy+0.004
Neg logit contributions
 guessing-0.004
 facts-0.004
 tours-0.004
 discovering-0.004
 topics-0.004
Token His
Token ID2399
Feature activation-9.44e-03

Pos logit contributions
 guessing+0.047
 facts+0.046
 discovering+0.045
 tours+0.045
 topics+0.044
Neg logit contributions
kin-0.032
ivery-0.029
 bruised-0.028
ifying-0.027
robe-0.026
Token friends
Token ID2460
Feature activation+1.04e-02

Pos logit contributions
 guessing+0.013
 facts+0.013
 discovering+0.013
 tours+0.013
 topics+0.013
Neg logit contributions
kin-0.020
ivery-0.018
 bruised-0.017
ifying-0.017
robe-0.017
Token thought
Token ID1807
Feature activation-4.78e-02

Pos logit contributions
kin+0.019
ivery+0.017
ifying+0.017
 bruised+0.016
ishy+0.016
Neg logit contributions
 guessing-0.018
 discovering-0.017
 facts-0.017
 tours-0.017
 motorcycles-0.017
Token it
Token ID340
Feature activation-5.74e-03

Pos logit contributions
 topics+0.087
 guessing+0.085
icians+0.084
 cultures+0.083
 facts+0.083
Neg logit contributions
kin-0.080
he-0.065
 bruised-0.064
cloth-0.063
ishy-0.063
Token was
Token ID373
Feature activation+1.26e-02

Pos logit contributions
 guessing+0.009
 facts+0.008
 tours+0.008
 discovering+0.008
 topics+0.008
Neg logit contributions
kin-0.012
ivery-0.011
 bruised-0.010
ifying-0.010
ishy-0.010
Token very
Token ID845
Feature activation+2.01e-02

Pos logit contributions
kin+0.025
ivery+0.024
ifying+0.022
 bruised+0.022
ishy+0.021
Neg logit contributions
 guessing-0.020
 facts-0.019
 tours-0.019
 discovering-0.018
 motorcycles-0.018
Token foolish
Token ID19538
Feature activation-3.38e-02

Pos logit contributions
kin+0.042
ivery+0.040
ifying+0.040
 bruised+0.038
robe+0.037
Neg logit contributions
 guessing-0.029
 tours-0.028
 facts-0.027
 motorcycles-0.026
 topics-0.026
Token of
Token ID286
Feature activation-4.89e-02

Pos logit contributions
 tours+0.060
 guessing+0.060
icians+0.058
 mills+0.058
 airports+0.057
Neg logit contributions
kin-0.055
 bruised-0.048
cloth-0.047
ishy-0.046
robe-0.045
Token lion
Token ID18744
Feature activation-3.52e-02

Pos logit contributions
 guessing+0.029
 tours+0.029
 facts+0.029
 discovering+0.029
 topics+0.028
Neg logit contributions
kin-0.027
ifying-0.023
ivery-0.023
 bruised-0.023
ishy-0.022
Token.
Token ID13
Feature activation+1.15e-02

Pos logit contributions
 guessing+0.065
 tours+0.063
 topics+0.063
 discovering+0.063
 facts+0.062
Neg logit contributions
kin-0.058
ifying-0.050
 bruised-0.049
ivery-0.047
robe-0.046
Token That
Token ID1320
Feature activation-3.85e-02

Pos logit contributions
kin+0.020
ivery+0.018
 bruised+0.017
ifying+0.017
ishy+0.016
Neg logit contributions
 guessing-0.021
 facts-0.021
 tours-0.020
 motorcycles-0.020
 discovering-0.020
Token is
Token ID318
Feature activation-2.11e-02

Pos logit contributions
 tours+0.060
 discovering+0.060
 guessing+0.060
 topics+0.058
 mills+0.056
Neg logit contributions
kin-0.070
ifying-0.062
 bruised-0.061
robe-0.059
 sill-0.059
Token two
Token ID734
Feature activation-3.02e-02

Pos logit contributions
 guessing+0.037
 tours+0.037
 topics+0.037
 facts+0.036
 mills+0.036
Neg logit contributions
kin-0.035
 bruised-0.031
ifying-0.031
ivery-0.030
ishy-0.028
Token animals
Token ID4695
Feature activation-2.70e-02

Pos logit contributions
 guessing+0.052
 tours+0.051
 topics+0.050
 discovering+0.050
 facts+0.049
Neg logit contributions
kin-0.051
 bruised-0.046
ifying-0.046
ivery-0.045
ishy-0.043
Token.
Token ID13
Feature activation+3.15e-03

Pos logit contributions
 guessing+0.053
 tours+0.053
 topics+0.052
 motorcycles+0.052
 facts+0.052
Neg logit contributions
kin-0.041
ifying-0.035
 bruised-0.034
ivery-0.034
ishy-0.032
Token Then
Token ID3244
Feature activation-1.84e-02

Pos logit contributions
kin+0.006
ivery+0.005
 bruised+0.005
ifying+0.005
ishy+0.005
Neg logit contributions
 guessing-0.006
 facts-0.005
 tours-0.005
 discovering-0.005
 motorcycles-0.005
Token we
Token ID356
Feature activation+1.55e-02

Pos logit contributions
 guessing+0.026
 facts+0.024
 topics+0.024
 tours+0.024
 discovering+0.024
Neg logit contributions
kin-0.039
ivery-0.035
 bruised-0.035
ifying-0.034
ishy-0.032
Token have
Token ID423
Feature activation-5.06e-02

Pos logit contributions
kin+0.028
ivery+0.026
 bruised+0.024
ifying+0.024
robe+0.023
Neg logit contributions
 guessing-0.027
 facts-0.026
 tours-0.026
 motorcycles-0.025
 topics-0.025
Token one
Token ID530
Feature activation-1.87e-02

Pos logit contributions
 topics+0.079
 guessing+0.078
 discovering+0.075
arm+0.075
 facts+0.074
Neg logit contributions
kin-0.085
ifying-0.084
 bruised-0.081
 OK-0.074
ishy-0.073
Token again
Token ID757
Feature activation-1.36e-02

Pos logit contributions
 guessing+0.025
 facts+0.024
 tours+0.024
 topics+0.023
 motorcycles+0.023
Neg logit contributions
kin-0.031
ifying-0.028
ivery-0.028
 bruised-0.028
ishy-0.026
Token.
Token ID13
Feature activation+2.14e-02

Pos logit contributions
 guessing+0.027
 facts+0.026
 tours+0.026
 motorcycles+0.026
 topics+0.026
Neg logit contributions
kin-0.021
ivery-0.018
ifying-0.018
 bruised-0.017
ishy-0.016
Token Sam
Token ID3409
Feature activation-1.38e-02

Pos logit contributions
kin+0.040
ivery+0.037
 bruised+0.034
ishy+0.034
ifying+0.033
Neg logit contributions
 guessing-0.036
 motorcycles-0.035
 discovering-0.035
 tours-0.035
 facts-0.035
Token asked
Token ID1965
Feature activation-9.19e-03

Pos logit contributions
 guessing+0.024
 facts+0.023
 tours+0.023
 topics+0.023
 discovering+0.023
Neg logit contributions
kin-0.024
ivery-0.022
 bruised-0.021
ifying-0.021
ishy-0.020
Token the
Token ID262
Feature activation+4.50e-03

Pos logit contributions
 guessing+0.015
 facts+0.015
 tours+0.015
 topics+0.014
 discovering+0.014
Neg logit contributions
kin-0.017
ivery-0.015
 bruised-0.015
ifying-0.015
ishy-0.014
Token cat
Token ID3797
Feature activation-2.40e-02

Pos logit contributions
kin+0.009
ivery+0.009
ifying+0.008
 bruised+0.008
ishy+0.008
Neg logit contributions
 guessing-0.007
 facts-0.007
 tours-0.006
 topics-0.006
 motorcycles-0.006
Token why
Token ID1521
Feature activation+7.01e-03

Pos logit contributions
 guessing+0.034
 tours+0.033
 facts+0.032
 motorcycles+0.032
 topics+0.032
Neg logit contributions
kin-0.049
 bruised-0.044
ivery-0.044
ifying-0.043
robe-0.042
Token he
Token ID339
Feature activation+1.04e-02

Pos logit contributions
kin+0.015
ivery+0.014
ifying+0.014
 bruised+0.013
ishy+0.013
Neg logit contributions
 guessing-0.009
 facts-0.009
 tours-0.009
 motorcycles-0.009
 discovering-0.009
Token was
Token ID373
Feature activation+1.19e-02

Pos logit contributions
kin+0.021
ivery+0.020
ifying+0.019
 bruised+0.018
ishy+0.018
Neg logit contributions
 guessing-0.016
 facts-0.015
 discovering-0.015
 tours-0.015
 motorcycles-0.015
Token doing
Token ID1804
Feature activation-3.74e-02

Pos logit contributions
kin+0.024
ivery+0.022
ifying+0.021
 bruised+0.020
ishy+0.020
Neg logit contributions
 guessing-0.019
 discovering-0.017
 tours-0.017
 facts-0.017
 motorcycles-0.017
Token this
Token ID428
Feature activation-2.02e-02

Pos logit contributions
 facts+0.064
 guessing+0.063
 tours+0.063
 topics+0.062
 mills+0.062
Neg logit contributions
kin-0.065
 bruised-0.055
ivery-0.055
ifying-0.054
robe-0.051
Token a
Token ID257
Feature activation-6.11e-02

Pos logit contributions
 facts+0.041
 guessing+0.040
 discovering+0.040
 tours+0.040
 topics+0.039
Neg logit contributions
kin-0.036
ivery-0.030
ifying-0.029
 bruised-0.028
robe-0.028
Token game
Token ID983
Feature activation-2.65e-02

Pos logit contributions
 discovering+0.100
 mills+0.096
icians+0.096
 cultures+0.095
 facts+0.094
Neg logit contributions
kin-0.120
suit-0.091
ivery-0.090
robe-0.090
ifying-0.089
Token.
Token ID13
Feature activation+1.99e-02

Pos logit contributions
 facts+0.046
 guessing+0.045
 tours+0.045
 topics+0.045
 motorcycles+0.044
Neg logit contributions
kin-0.046
ivery-0.039
ifying-0.039
robe-0.038
cloth-0.037
Token I
Token ID314
Feature activation+2.86e-02

Pos logit contributions
kin+0.039
ivery+0.036
 bruised+0.033
ishy+0.032
ifying+0.032
Neg logit contributions
 guessing-0.032
 motorcycles-0.031
 facts-0.030
 discovering-0.030
 tours-0.030
Token'll
Token ID1183
Feature activation-1.06e-02

Pos logit contributions
kin+0.056
ivery+0.051
 bruised+0.048
ifying+0.048
robe+0.046
Neg logit contributions
 guessing-0.046
 discovering-0.043
 topics-0.042
 tours-0.042
 facts-0.042
Token count
Token ID954
Feature activation-7.83e-02

Pos logit contributions
 guessing+0.018
 facts+0.018
 tours+0.018
 topics+0.017
 motorcycles+0.017
Neg logit contributions
kin-0.019
ivery-0.017
ifying-0.016
 bruised-0.016
ishy-0.016
Token to
Token ID284
Feature activation-4.07e-02

Pos logit contributions
 topics+0.135
bors+0.135
 facts+0.131
 discovering+0.130
 guessing+0.128
Neg logit contributions
ifying-0.118
kin-0.115
 sill-0.098
 OK-0.097
cloth-0.097
Token three
Token ID1115
Feature activation-8.35e-02

Pos logit contributions
 facts+0.071
 tours+0.069
 mills+0.067
 topics+0.067
arm+0.067
Neg logit contributions
kin-0.063
ifying-0.057
 sill-0.056
 bruised-0.056
ivery-0.055
Token and
Token ID290
Feature activation-1.68e-02

Pos logit contributions
 discovering+0.148
 guessing+0.144
 mills+0.136
 tours+0.136
 facts+0.134
Neg logit contributions
kin-0.131
cloth-0.117
ishy-0.103
 sill-0.099
ifying-0.097
Token on
Token ID319
Feature activation-4.01e-02

Pos logit contributions
 guessing+0.029
 facts+0.029
 tours+0.029
 topics+0.028
 motorcycles+0.028
Neg logit contributions
kin-0.029
ivery-0.025
ifying-0.025
 bruised-0.025
ishy-0.023
Token three
Token ID1115
Feature activation-7.42e-02

Pos logit contributions
 facts+0.073
 guessing+0.070
 topics+0.069
 discovering+0.069
 tours+0.067
Neg logit contributions
kin-0.065
ifying-0.060
cloth-0.054
ivery-0.054
 bruised-0.053
Token Tom
Token ID4186
Feature activation-2.49e-02

Pos logit contributions
ivery+0.070
kin+0.069
ishy+0.059
ifying+0.059
 bruised+0.059
Neg logit contributions
 guessing-0.062
 motorcycles-0.059
 discovering-0.058
 whales-0.057
 tours-0.056
Token and
Token ID290
Feature activation+1.50e-02

Pos logit contributions
 guessing+0.043
 topics+0.042
 tours+0.042
 facts+0.042
 motorcycles+0.040
Neg logit contributions
kin-0.044
 bruised-0.039
ivery-0.038
cloth-0.036
ishy-0.036
Token Tara
Token ID37723
Feature activation-9.76e-03

Pos logit contributions
kin+0.025
ivery+0.024
ifying+0.023
 bruised+0.021
ishy+0.021
Neg logit contributions
 guessing-0.028
 facts-0.027
 tours-0.027
 discovering-0.027
 motorcycles-0.027
Token.
Token ID13
Feature activation-1.25e-02

Pos logit contributions
 guessing+0.020
 facts+0.020
 tours+0.020
 topics+0.020
 motorcycles+0.019
Neg logit contributions
kin-0.014
ivery-0.012
 bruised-0.012
ifying-0.011
ishy-0.011
Token Tom
Token ID4186
Feature activation-1.77e-02

Pos logit contributions
 guessing+0.028
 facts+0.027
 discovering+0.026
 tours+0.026
 topics+0.026
Neg logit contributions
kin-0.016
ivery-0.014
ifying-0.013
 bruised-0.013
ishy-0.013
Token liked
Token ID8288
Feature activation-1.07e-01

Pos logit contributions
 guessing+0.034
 facts+0.033
 tours+0.033
 topics+0.033
 discovering+0.032
Neg logit contributions
kin-0.028
ivery-0.024
 bruised-0.024
ishy-0.023
robe-0.022
Token to
Token ID284
Feature activation-1.44e-02

Pos logit contributions
bors+0.222
ves+0.221
icians+0.215
ises+0.210
arm+0.207
Neg logit contributions
 Lily-0.135
 empty-0.113
 Mom-0.110
 full-0.108
 Tim-0.108
Token play
Token ID711
Feature activation-3.27e-02

Pos logit contributions
 guessing+0.023
 facts+0.022
 discovering+0.022
 tours+0.022
 motorcycles+0.022
Neg logit contributions
kin-0.027
ivery-0.025
 bruised-0.023
ifying-0.023
ishy-0.023
Token with
Token ID351
Feature activation-1.90e-02

Pos logit contributions
 facts+0.066
 cultures+0.063
 topics+0.062
 mills+0.062
 discovering+0.061
Neg logit contributions
kin-0.051
robe-0.042
ivery-0.041
he-0.040
ishy-0.036
Token the
Token ID262
Feature activation-7.19e-03

Pos logit contributions
 guessing+0.032
 facts+0.032
 discovering+0.032
 tours+0.031
 topics+0.030
Neg logit contributions
kin-0.034
 bruised-0.029
ivery-0.029
ifying-0.028
ishy-0.028
Token fire
Token ID2046
Feature activation-7.54e-03

Pos logit contributions
 guessing+0.010
 facts+0.010
 tours+0.010
 discovering+0.010
 topics+0.010
Neg logit contributions
kin-0.015
ivery-0.014
ifying-0.013
 bruised-0.013
ishy-0.013
Token Tim
Token ID5045
Feature activation-1.09e-02

Pos logit contributions
kin+0.005
ivery+0.004
ifying+0.004
 bruised+0.004
ishy+0.004
Neg logit contributions
 guessing-0.005
 tours-0.005
 facts-0.005
 discovering-0.005
 topics-0.005
Tokenmy
Token ID1820
Feature activation-4.90e-02

Pos logit contributions
 guessing+0.024
 facts+0.023
 tours+0.023
 topics+0.023
 motorcycles+0.023
Neg logit contributions
kin-0.015
ivery-0.013
 bruised-0.012
ifying-0.012
ishy-0.011
Token and
Token ID290
Feature activation+1.07e-03

Pos logit contributions
 facts+0.093
 tours+0.093
 guessing+0.091
 topics+0.091
 motorcycles+0.090
Neg logit contributions
kin-0.075
 bruised-0.064
robe-0.062
ivery-0.061
ishy-0.060
Token his
Token ID465
Feature activation-1.56e-02

Pos logit contributions
kin+0.002
ivery+0.002
ifying+0.002
 bruised+0.002
ishy+0.001
Neg logit contributions
 guessing-0.002
 facts-0.002
 tours-0.002
 discovering-0.002
 topics-0.002
Token friends
Token ID2460
Feature activation+6.64e-03

Pos logit contributions
 facts+0.021
 guessing+0.021
 discovering+0.021
 motorcycles+0.020
 tours+0.020
Neg logit contributions
kin-0.034
ivery-0.030
 bruised-0.029
robe-0.028
ishy-0.028
Token loved
Token ID6151
Feature activation-8.68e-02

Pos logit contributions
kin+0.013
ivery+0.012
ifying+0.011
 bruised+0.011
ishy+0.011
Neg logit contributions
 guessing-0.011
 discovering-0.010
 facts-0.010
 tours-0.010
 topics-0.010
Token to
Token ID284
Feature activation-2.37e-02

Pos logit contributions
is+0.135
ups+0.134
icians+0.133
arm+0.132
ves+0.132
Neg logit contributions
 Lily-0.124
 like-0.117
 sorry-0.115
 okay-0.115
 bruised-0.111
Token play
Token ID711
Feature activation-1.76e-02

Pos logit contributions
 motorcycles+0.033
 guessing+0.032
 facts+0.032
 topics+0.031
 tours+0.031
Neg logit contributions
kin-0.049
ivery-0.044
 bruised-0.042
ifying-0.042
 limp-0.041
Token on
Token ID319
Feature activation-4.61e-02

Pos logit contributions
 facts+0.028
 guessing+0.027
 motorcycles+0.027
 topics+0.027
 tours+0.026
Neg logit contributions
kin-0.034
ivery-0.030
robe-0.028
 bruised-0.028
ifying-0.027
Token the
Token ID262
Feature activation-5.20e-02

Pos logit contributions
 facts+0.087
arm+0.085
cles+0.083
ups+0.083
icians+0.082
Neg logit contributions
kin-0.069
 bruised-0.054
 empty-0.053
he-0.052
ifying-0.051
Token train
Token ID4512
Feature activation-3.03e-02

Pos logit contributions
 facts+0.081
 topics+0.075
 cultures+0.071
 motorcycles+0.070
 recipes+0.069
Neg logit contributions
kin-0.095
ivery-0.092
 empty-0.086
robe-0.083
 seeker-0.083
Token girl
Token ID2576
Feature activation+7.04e-03

Pos logit contributions
kin+0.043
ivery+0.043
ifying+0.041
 bruised+0.037
robe+0.036
Neg logit contributions
 guessing-0.051
 motorcycles-0.049
 whales-0.049
 mills-0.048
 tours-0.047
Token named
Token ID3706
Feature activation+9.98e-03

Pos logit contributions
kin+0.011
ivery+0.010
ifying+0.010
 bruised+0.010
ishy+0.009
Neg logit contributions
 guessing-0.014
 discovering-0.013
 facts-0.013
 tours-0.013
 topics-0.013
Token Lily
Token ID20037
Feature activation+1.58e-02

Pos logit contributions
kin+0.011
ivery+0.010
ifying+0.009
 bruised+0.009
ishy+0.008
Neg logit contributions
 guessing-0.025
 tours-0.024
 discovering-0.024
 topics-0.023
 facts-0.023
Token.
Token ID13
Feature activation+3.74e-02

Pos logit contributions
kin+0.030
ivery+0.027
ifying+0.027
ishy+0.025
 bruised+0.025
Neg logit contributions
 guessing-0.026
 discovering-0.025
 motorcycles-0.025
 facts-0.025
 topics-0.024
Token She
Token ID1375
Feature activation-2.90e-03

Pos logit contributions
kin+0.068
ivery+0.059
ifying+0.059
 bruised+0.057
rier+0.056
Neg logit contributions
 discovering-0.063
 guessing-0.063
 topics-0.062
 museums-0.061
 tours-0.061
Token loved
Token ID6151
Feature activation-1.12e-01

Pos logit contributions
 guessing+0.005
 facts+0.005
 tours+0.005
 discovering+0.004
 topics+0.004
Neg logit contributions
kin-0.006
ivery-0.005
ifying-0.005
 bruised-0.005
ishy-0.005
Token to
Token ID284
Feature activation-2.85e-02

Pos logit contributions
icians+0.227
ves+0.220
bors+0.218
 Street+0.211
arm+0.205
Neg logit contributions
 Lily-0.133
 cold-0.128
 wet-0.128
 dry-0.124
 bruised-0.122
Token eat
Token ID4483
Feature activation-5.82e-02

Pos logit contributions
 guessing+0.052
 facts+0.052
icians+0.051
 topics+0.051
 discovering+0.050
Neg logit contributions
kin-0.047
ivery-0.041
 bruised-0.037
robe-0.037
 limp-0.037
Token oranges
Token ID48389
Feature activation-1.72e-02

Pos logit contributions
icians+0.119
 topics+0.109
arm+0.109
 discovering+0.108
 cultures+0.108
Neg logit contributions
kin-0.079
 bruised-0.070
ishy-0.069
 empty-0.067
 lump-0.065
Token.
Token ID13
Feature activation+1.05e-02

Pos logit contributions
 guessing+0.031
 facts+0.030
 discovering+0.030
 topics+0.030
 tours+0.029
Neg logit contributions
kin-0.030
ivery-0.027
 bruised-0.025
ifying-0.025
ishy-0.024
Token One
Token ID1881
Feature activation+7.63e-03

Pos logit contributions
kin+0.019
ivery+0.017
ifying+0.016
 bruised+0.016
ishy+0.015
Neg logit contributions
 guessing-0.018
 facts-0.018
 tours-0.018
 topics-0.018
 discovering-0.018
Token worry
Token ID5490
Feature activation+7.96e-03

Pos logit contributions
 facts+0.040
 guessing+0.039
 tours+0.039
 discovering+0.038
 topics+0.038
Neg logit contributions
kin-0.017
ivery-0.016
 bruised-0.015
ifying-0.014
ishy-0.013
Token,
Token ID11
Feature activation-4.69e-03

Pos logit contributions
kin+0.017
ivery+0.016
ifying+0.016
 bruised+0.015
ishy+0.015
Neg logit contributions
 guessing-0.011
 facts-0.010
 tours-0.010
 topics-0.010
 motorcycles-0.010
Token Lily
Token ID20037
Feature activation+4.52e-03

Pos logit contributions
 guessing+0.010
 facts+0.010
 tours+0.010
 discovering+0.010
 topics+0.010
Neg logit contributions
kin-0.007
ivery-0.006
ifying-0.005
 bruised-0.005
ishy-0.005
Token.
Token ID13
Feature activation+9.73e-03

Pos logit contributions
kin+0.008
ivery+0.007
ifying+0.007
 bruised+0.006
ishy+0.006
Neg logit contributions
 guessing-0.008
 facts-0.008
 tours-0.008
 discovering-0.008
 topics-0.008
Token Just
Token ID2329
Feature activation+2.99e-02

Pos logit contributions
kin+0.017
ivery+0.016
ifying+0.015
 bruised+0.014
ishy+0.014
Neg logit contributions
 guessing-0.017
 facts-0.016
 tours-0.016
 discovering-0.016
 topics-0.016
Token try
Token ID1949
Feature activation-8.61e-02

Pos logit contributions
kin+0.059
ivery+0.057
ifying+0.054
robe+0.052
ishy+0.051
Neg logit contributions
 guessing-0.046
 discovering-0.043
 facts-0.041
 seeing-0.040
 motorcycles-0.040
Token it
Token ID340
Feature activation-7.46e-03

Pos logit contributions
 facts+0.160
 topics+0.151
 occasions+0.150
 tours+0.149
cles+0.147
Neg logit contributions
 okay-0.124
 bruised-0.123
 OK-0.122
 Lily-0.112
kin-0.108
Token out
Token ID503
Feature activation-1.36e-02

Pos logit contributions
 guessing+0.013
 facts+0.013
 tours+0.013
 discovering+0.013
 topics+0.013
Neg logit contributions
kin-0.013
ivery-0.012
 bruised-0.011
ifying-0.011
ishy-0.011
Token and
Token ID290
Feature activation-9.95e-03

Pos logit contributions
 guessing+0.023
 facts+0.022
 tours+0.022
 topics+0.022
 motorcycles+0.022
Neg logit contributions
kin-0.025
ivery-0.022
ifying-0.021
 bruised-0.021
ishy-0.020
Token you
Token ID345
Feature activation+8.40e-03

Pos logit contributions
 guessing+0.016
 facts+0.016
 tours+0.016
 topics+0.016
 discovering+0.016
Neg logit contributions
kin-0.018
ivery-0.017
 bruised-0.016
ifying-0.016
ishy-0.015
Token'll
Token ID1183
Feature activation+1.33e-02

Pos logit contributions
kin+0.015
ivery+0.014
ifying+0.013
 bruised+0.013
ishy+0.013
Neg logit contributions
 guessing-0.014
 facts-0.013
 tours-0.013
 motorcycles-0.013
 discovering-0.013