TOP ACTIVATIONS
MAX = 0.186

line get farther and farther away . He didn 't know
's mom was even ang rier and Lily felt even sad
could be heard from miles away . <|endoftext|> Once upon a
When Daisy put the magn ifying glass down , the cater
and Lily felt even sad der . From that day on
pedal broke off and flew away . His friends said they
work , bravery and cle vern ess ! From now on
as Molly smelled the fresh air she started barking and running
play catch near the hut anymore . They learn to respect
path , further and further away from the farm .
of milk wob bled and almost fell down .
her orange arms in the air to get their attention .
. Something was blocking the hole . It was big and
frogs shaking off their wet ness . The sun was shining
Then something even we ir der happened . The animal suddenly
felt a chill down his spine . He suddenly remembered the
't sign at the traffic anymore ." Jess
a chill run down his spine . He could not believe
it completely disappeared in the night sky . <|endoftext|> Once upon
dust floating around in the air . It was coming from

MIN ACTIVATIONS
MIN = -0.097

a prayer . He said " Dear God , please make
Mum smiled and said : " I will help you get
polite . She always said " please " and " thank
a walk . He said ' hello ' to everyone he
a big map and said " Look , this map is
the car , Maggie said " What do you think Uncle
One day , Mrs Jones took her pet dog to the
his grandma . She said " I have something very tasty
<|endoftext|> M um and Dad took Daisy to school one day
, " Can you lend me a hand ? I
truck and their chief said " We need to quit this
bird . The bird said " thank you " to its
Mar tha smiled and said " Thank you ! I would
likes them all and says " mine , mine , mine
time . She said , " Let 's get your books
She hugged him and said " thank you !" The family
Jane was happy and said " thank you " to her
and a tag that said " For Mom ". Sara was
approached the snake and said " hello ". To his surprise
the ostr ich and said " bang !".

INTERVAL 0.115 - 0.186
CONTAINS 0.003%

INTERVAL 0.044 - 0.115
CONTAINS 3.818%

. She was so happy and hugged her grandma tightly .
ball . But it was too high for her .
a place to go , but all he saw was an
children safely . She cried too . <|endoftext|> L ily and
the blood . He is very scared . "

INTERVAL -0.026 - 0.044
CONTAINS 81.532%

back out ! The camp director reassured her that there would
and slides instead . They had so much fun together and
he was wet and cold . L ily
at all the different flavours . Her mom asked , "
and cool . But the sun was determined .

INTERVAL -0.097 - -0.026
CONTAINS 14.647%

pron again . She always used her a pron wisely !
every last cr umb and declared it was the best they
boy named Tim my wanted to watch a movie . He
the kingdom . She wanted to explore the magical world around
so many exciting things to buy . The dragon
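
NOTE ON INTERVAL BOUNDARIES

The dump does not state how the four intervals were chosen. The boundaries listed above are consistent with splitting the range from MIN to MAX into four equal-width bins, and the four CONTAINS percentages add up to 100%. The sketch below checks both under that assumption; the names `bin_edges`, `MIN_ACT`, `MAX_ACT` and `N_BINS` are illustrative and do not come from the tool that produced this page.

```python
# Sketch: rebuild the interval boundaries from the reported MIN and MAX.
# Assumption (not stated in the dump): [MIN, MAX] is split into four
# equal-width bins.
MIN_ACT = -0.097
MAX_ACT = 0.186
N_BINS = 4


def bin_edges(lo, hi, n):
    """Return the n + 1 boundaries of n equal-width bins over [lo, hi]."""
    width = (hi - lo) / n
    return [lo + i * width for i in range(n + 1)]


edges = bin_edges(MIN_ACT, MAX_ACT, N_BINS)

# Shares reported for each interval, listed from the highest bin down.
shares = [0.003, 3.818, 81.532, 14.647]
total = sum(shares)

if __name__ == "__main__":
    print([round(e, 3) for e in edges])
    print(round(total, 3))
```

Under this reading each bin is about 0.071 wide, and the bin that contains zero holds most of the token positions.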

TOKEN DETAILS FOR TOP ACTIVATIONS

Tokenline
Token ID1370
Feature activation+9.12e-02

Pos logit contributions
sing+0.099
 spine+0.087
 footing+0.087
ipop+0.087
sed+0.086
Neg logit contributions
 languages-0.105
These-0.103
Always-0.100
 parks-0.099
Class-0.095
Token get
Token ID651
Feature activation+4.20e-02

Pos logit contributions
sing+0.159
fish+0.144
 spine+0.144
oppy+0.142
sed+0.142
Neg logit contributions
These-0.146
 languages-0.143
Always-0.142
If-0.135
 parks-0.134
Token farther
Token ID18485
Feature activation+7.55e-02

Pos logit contributions
sing+0.078
 spine+0.074
fish+0.073
orse+0.073
ipop+0.071
Neg logit contributions
These-0.071
Always-0.069
 languages-0.065
Would-0.065
If-0.064
Token and
Token ID290
Feature activation+7.14e-02

Pos logit contributions
sing+0.160
sed+0.146
orse+0.141
ipop+0.141
 tum+0.132
Neg logit contributions
 languages-0.116
If-0.110
Always-0.105
These-0.104
 parks-0.101
Token farther
Token ID18485
Feature activation+7.06e-02

Pos logit contributions
sing+0.150
kin+0.142
ipop+0.142
fish+0.141
ower+0.138
Neg logit contributions
These-0.104
If-0.099
 languages-0.098
 parks-0.098
Always-0.097
Token away
Token ID1497
Feature activation+9.83e-02

Pos logit contributions
sing+0.153
ipop+0.139
orse+0.137
sed+0.136
kin+0.129
Neg logit contributions
 languages-0.102
If-0.097
 parks-0.093
Always-0.090
These-0.089
Token.
Token ID13
Feature activation+3.13e-02

Pos logit contributions
ipop+0.215
sing+0.202
kin+0.195
orse+0.194
OO+0.185
Neg logit contributions
 languages-0.118
 parks-0.118
 towns-0.116
 circles-0.111
Always-0.111
Token He
Token ID679
Feature activation+2.82e-02

Pos logit contributions
sing+0.061
orse+0.058
fish+0.057
 d+0.057
 spine+0.056
Neg logit contributions
 languages-0.047
These-0.046
Always-0.044
If-0.042
Class-0.042
Token didn
Token ID1422
Feature activation+4.77e-02

Pos logit contributions
sing+0.048
fish+0.045
less+0.045
 spine+0.043
ies+0.043
Neg logit contributions
 languages-0.050
Always-0.049
These-0.048
If-0.047
Will-0.046
Token't
Token ID470
Feature activation+5.43e-02

Pos logit contributions
sing+0.128
ies+0.126
fish+0.125
 spine+0.120
 d+0.117
Neg logit contributions
These-0.040
Always-0.039
If-0.038
 Every-0.037
 Whenever-0.036
Token know
Token ID760
Feature activation+1.77e-02

Pos logit contributions
sing+0.105
fish+0.101
ower+0.099
ipop+0.099
sed+0.099
Neg logit contributions
 languages-0.078
If-0.077
Class-0.075
Always-0.075
These-0.075
Token's
Token ID338
Feature activation+6.27e-02

Pos logit contributions
These+0.054
 languages+0.053
If+0.052
Always+0.052
Class+0.051
Neg logit contributions
fish-0.042
sing-0.042
 d-0.039
ies-0.038
orse-0.038
Token mom
Token ID1995
Feature activation+1.27e-02

Pos logit contributions
fish+0.094
sing+0.092
less+0.085
ies+0.084
 d+0.084
Neg logit contributions
 languages-0.123
These-0.120
Always-0.118
If-0.114
 alphabet-0.113
Token was
Token ID373
Feature activation+3.65e-02

Pos logit contributions
sing+0.020
fish+0.020
 d+0.019
ies+0.019
less+0.018
Neg logit contributions
These-0.023
 languages-0.023
Always-0.023
If-0.022
Class-0.022
Token even
Token ID772
Feature activation+5.92e-02

Pos logit contributions
fish+0.057
sing+0.054
ipop+0.050
ies+0.050
 spine+0.050
Neg logit contributions
These-0.070
Always-0.070
 languages-0.067
If-0.067
Class-0.066
Token ang
Token ID3550
Feature activation+7.66e-02

Pos logit contributions
sing+0.104
fish+0.101
 spine+0.096
 footing+0.094
orse+0.092
Neg logit contributions
These-0.102
Always-0.100
 languages-0.100
If-0.099
Class-0.096
Tokenrier
Token ID5277
Feature activation+9.76e-02

Pos logit contributions
sing+0.125
orse+0.117
 footing+0.114
fish+0.114
ipop+0.110
Neg logit contributions
These-0.138
If-0.137
 languages-0.133
Class-0.126
Always-0.125
Token and
Token ID290
Feature activation+5.73e-02

Pos logit contributions
sing+0.180
 footing+0.169
 raft+0.169
 thirst+0.168
orse+0.159
Neg logit contributions
 languages-0.151
If-0.149
Always-0.141
Will-0.135
These-0.135
Token Lily
Token ID20037
Feature activation+1.90e-02

Pos logit contributions
sing+0.098
fish+0.090
 footing+0.089
 spine+0.088
orse+0.086
Neg logit contributions
These-0.105
Always-0.104
If-0.104
 languages-0.102
Class-0.101
Token felt
Token ID2936
Feature activation+4.35e-02

Pos logit contributions
sing+0.032
fish+0.030
 footing+0.029
 spine+0.028
 raft+0.028
Neg logit contributions
Always-0.034
These-0.034
If-0.033
 languages-0.033
Will-0.032
Token even
Token ID772
Feature activation+6.44e-02

Pos logit contributions
sing+0.073
 spine+0.068
fish+0.068
ies+0.065
 footing+0.064
Neg logit contributions
These-0.084
Always-0.083
If-0.083
 languages-0.080
Here-0.078
Token sad
Token ID6507
Feature activation+7.11e-02

Pos logit contributions
sing+0.110
fish+0.106
 spine+0.101
 footing+0.098
kin+0.096
Neg logit contributions
These-0.115
If-0.115
Always-0.114
 languages-0.113
Class-0.110
Token could
Token ID714
Feature activation+2.82e-02

Pos logit contributions
sing+0.089
fish+0.084
less+0.082
 footing+0.081
 spine+0.081
Neg logit contributions
 languages-0.089
Always-0.088
These-0.087
If-0.084
Class-0.083
Token be
Token ID307
Feature activation+4.17e-02

Pos logit contributions
ipop+0.044
less+0.043
fish+0.043
sing+0.043
ies+0.042
Neg logit contributions
Always-0.050
These-0.049
 languages-0.048
If-0.048
Anne-0.047
Token heard
Token ID2982
Feature activation+1.18e-02

Pos logit contributions
fish+0.064
sing+0.061
ipop+0.059
 spine+0.056
 footing+0.055
Neg logit contributions
 languages-0.081
Always-0.080
These-0.079
If-0.077
Anne-0.076
Token from
Token ID422
Feature activation+2.25e-02

Pos logit contributions
sing+0.017
fish+0.017
 d+0.016
ies+0.016
less+0.015
Neg logit contributions
These-0.023
 languages-0.023
Always-0.023
If-0.022
Class-0.022
Token miles
Token ID4608
Feature activation+7.19e-02

Pos logit contributions
sing+0.035
fish+0.034
 d+0.032
 footing+0.032
ies+0.031
Neg logit contributions
These-0.043
 languages-0.042
Always-0.042
If-0.041
Class-0.041
Token away
Token ID1497
Feature activation+9.61e-02

Pos logit contributions
sing+0.127
ipop+0.120
fish+0.116
orse+0.115
kin+0.112
Neg logit contributions
 languages-0.127
If-0.119
Always-0.118
 parks-0.117
These-0.117
Token.
Token ID13
Feature activation+2.28e-02

Pos logit contributions
ipop+0.182
sing+0.175
orse+0.168
kin+0.165
fish+0.158
Neg logit contributions
 languages-0.155
Always-0.149
 parks-0.145
These-0.143
If-0.142
Token<|endoftext|>
Token ID50256
Feature activation-1.57e-02

Pos logit contributions
sing+0.041
orse+0.040
fish+0.038
 d+0.038
 spine+0.037
Neg logit contributions
These-0.036
 languages-0.036
Always-0.034
If-0.033
Class-0.032
TokenOnce
Token ID7454
Feature activation+6.24e-02

Pos logit contributions
 languages+0.034
These+0.032
Always+0.031
aries+0.031
Class+0.031
Neg logit contributions
fish-0.020
 d-0.019
kin-0.018
orse-0.018
 flo-0.018
Token upon
Token ID2402
Feature activation+2.57e-02

Pos logit contributions
sing+0.086
 raft+0.073
fish+0.072
 spine+0.072
 thirst+0.071
Neg logit contributions
 languages-0.127
 Every-0.126
These-0.124
 Everyday-0.123
Anne-0.122
Token a
Token ID257
Feature activation+3.00e-02

Pos logit contributions
sing+0.039
fish+0.038
 d+0.036
ies+0.035
 footing+0.035
Neg logit contributions
These-0.050
 languages-0.050
Always-0.048
If-0.048
Class-0.047
Token When
Token ID1649
Feature activation+5.78e-02

Pos logit contributions
sing+0.064
fish+0.062
orse+0.061
 d+0.060
ipop+0.058
Neg logit contributions
These-0.047
 languages-0.046
Always-0.045
If-0.043
Class-0.043
Token Daisy
Token ID40355
Feature activation+1.74e-02

Pos logit contributions
sing+0.081
fish+0.076
 spine+0.070
less+0.070
 footing+0.070
Neg logit contributions
These-0.118
 languages-0.117
Class-0.114
Always-0.114
Anne-0.113
Token put
Token ID1234
Feature activation-2.26e-02

Pos logit contributions
fish+0.026
sing+0.025
 d+0.024
ies+0.023
orse+0.023
Neg logit contributions
These-0.035
 languages-0.035
Always-0.034
If-0.034
Class-0.033
Token the
Token ID262
Feature activation+1.14e-02

Pos logit contributions
These+0.042
 languages+0.041
Always+0.040
If+0.040
Class+0.039
Neg logit contributions
sing-0.037
fish-0.036
 d-0.034
ies-0.034
 footing-0.033
Token magn
Token ID7842
Feature activation+2.95e-02

Pos logit contributions
sing+0.019
orse+0.018
fish+0.018
ies+0.017
 footing+0.017
Neg logit contributions
These-0.020
 languages-0.020
 alphabet-0.019
Always-0.019
Here-0.018
Tokenifying
Token ID4035
Feature activation+9.53e-02

Pos logit contributions
sing+0.045
fish+0.044
 d+0.040
ies+0.040
 footing+0.040
Neg logit contributions
These-0.058
 languages-0.057
Always-0.056
If-0.056
Class-0.054
Token glass
Token ID5405
Feature activation+2.56e-02

Pos logit contributions
sing+0.178
ies+0.176
fish+0.171
 footing+0.167
orse+0.165
Neg logit contributions
These-0.156
Here-0.143
 Every-0.142
Always-0.140
 Everyday-0.137
Token down
Token ID866
Feature activation+2.43e-02

Pos logit contributions
sing+0.036
fish+0.031
 footing+0.030
orse+0.030
 spine+0.029
Neg logit contributions
These-0.055
 languages-0.054
Always-0.053
If-0.052
Class-0.051
Token,
Token ID11
Feature activation+5.83e-02

Pos logit contributions
sing+0.035
fish+0.032
orse+0.031
 footing+0.029
 spine+0.029
Neg logit contributions
These-0.051
 languages-0.050
Always-0.049
If-0.049
Anne-0.047
Token the
Token ID262
Feature activation+2.91e-02

Pos logit contributions
sing+0.087
fish+0.083
 d+0.076
orse+0.075
less+0.075
Neg logit contributions
These-0.118
 languages-0.116
Always-0.114
If-0.113
Class-0.111
Token cater
Token ID20825
Feature activation+2.65e-02

Pos logit contributions
sing+0.050
fish+0.049
orse+0.045
 footing+0.045
ies+0.045
Neg logit contributions
These-0.050
 languages-0.050
Always-0.048
If-0.048
Class-0.047
Token and
Token ID290
Feature activation+5.73e-02

Pos logit contributions
sing+0.180
 footing+0.169
 raft+0.169
 thirst+0.168
orse+0.159
Neg logit contributions
 languages-0.151
If-0.149
Always-0.141
Will-0.135
These-0.135
Token Lily
Token ID20037
Feature activation+1.90e-02

Pos logit contributions
sing+0.098
fish+0.090
 footing+0.089
 spine+0.088
orse+0.086
Neg logit contributions
These-0.105
Always-0.104
If-0.104
 languages-0.102
Class-0.101
Token felt
Token ID2936
Feature activation+4.35e-02

Pos logit contributions
sing+0.032
fish+0.030
 footing+0.029
 spine+0.028
 raft+0.028
Neg logit contributions
Always-0.034
These-0.034
If-0.033
 languages-0.033
Will-0.032
Token even
Token ID772
Feature activation+6.44e-02

Pos logit contributions
sing+0.073
 spine+0.068
fish+0.068
ies+0.065
 footing+0.064
Neg logit contributions
These-0.084
Always-0.083
If-0.083
 languages-0.080
Here-0.078
Token sad
Token ID6507
Feature activation+7.11e-02

Pos logit contributions
sing+0.110
fish+0.106
 spine+0.101
 footing+0.098
kin+0.096
Neg logit contributions
These-0.115
If-0.115
Always-0.114
 languages-0.113
Class-0.110
Tokender
Token ID1082
Feature activation+9.51e-02

Pos logit contributions
sing+0.123
orse+0.114
fish+0.113
 footing+0.111
kin+0.107
Neg logit contributions
 languages-0.129
Always-0.128
These-0.124
If-0.122
 Every-0.115
Token.
Token ID13
Feature activation+2.12e-02

Pos logit contributions
sing+0.149
fish+0.127
 footing+0.126
 spine+0.126
 raft+0.125
Neg logit contributions
Always-0.195
These-0.192
If-0.190
 languages-0.190
Class-0.177
Token From
Token ID3574
Feature activation+1.53e-02

Pos logit contributions
sing+0.041
orse+0.038
 d+0.037
 spine+0.037
fish+0.037
Neg logit contributions
 languages-0.033
These-0.031
Always-0.030
Class-0.029
If-0.029
Token that
Token ID326
Feature activation+1.21e-02

Pos logit contributions
sing+0.026
fish+0.024
 spine+0.023
less+0.023
orse+0.023
Neg logit contributions
Always-0.028
These-0.028
If-0.027
 languages-0.027
Here-0.026
Token day
Token ID1110
Feature activation+3.56e-02

Pos logit contributions
sing+0.019
fish+0.018
 d+0.017
ies+0.017
 footing+0.017
Neg logit contributions
These-0.024
 languages-0.023
Always-0.023
If-0.022
Class-0.022
Token on
Token ID319
Feature activation+2.69e-02

Pos logit contributions
sing+0.067
fish+0.065
ies+0.060
 footing+0.059
 d+0.059
Neg logit contributions
These-0.059
Always-0.056
 languages-0.056
If-0.056
Class-0.054
Token pedal
Token ID26667
Feature activation+3.91e-02

Pos logit contributions
 languages+0.061
These+0.061
Always+0.059
If+0.058
Class+0.058
Neg logit contributions
sing-0.050
fish-0.047
orse-0.044
 spine-0.044
 d-0.043
Token broke
Token ID6265
Feature activation+5.22e-02

Pos logit contributions
sing+0.063
fish+0.060
orse+0.057
ies+0.056
 d+0.055
Neg logit contributions
 languages-0.073
If-0.071
Always-0.071
These-0.070
Class-0.067
Token off
Token ID572
Feature activation+7.62e-02

Pos logit contributions
sing+0.090
fish+0.090
orse+0.087
ipop+0.080
kin+0.078
Neg logit contributions
 languages-0.096
These-0.090
Always-0.089
If-0.088
 alphabet-0.085
Token and
Token ID290
Feature activation+6.90e-02

Pos logit contributions
fish+0.137
sing+0.133
orse+0.127
ipop+0.123
kin+0.121
Neg logit contributions
 languages-0.128
These-0.124
Always-0.122
If-0.120
Anne-0.118
Token flew
Token ID13112
Feature activation+2.86e-02

Pos logit contributions
fish+0.111
sing+0.111
orse+0.100
less+0.100
 footing+0.098
Neg logit contributions
 languages-0.128
These-0.126
Always-0.123
If-0.122
Class-0.122
Token away
Token ID1497
Feature activation+9.38e-02

Pos logit contributions
orse+0.056
sing+0.055
fish+0.054
ipop+0.050
OO+0.049
Neg logit contributions
 languages-0.043
Always-0.041
If-0.041
These-0.041
 parks-0.039
Token.
Token ID13
Feature activation+2.50e-02

Pos logit contributions
orse+0.183
ipop+0.181
fish+0.179
kin+0.176
sing+0.174
Neg logit contributions
 languages-0.138
Always-0.122
 Every-0.121
 parks-0.119
 circles-0.116
Token His
Token ID2399
Feature activation+3.24e-02

Pos logit contributions
sing+0.045
orse+0.043
fish+0.043
 d+0.042
 spine+0.041
Neg logit contributions
 languages-0.040
These-0.040
Always-0.039
If-0.038
Class-0.037
Token friends
Token ID2460
Feature activation+1.89e-02

Pos logit contributions
fish+0.046
sing+0.044
ies+0.040
orse+0.040
less+0.040
Neg logit contributions
 languages-0.067
These-0.065
Always-0.063
If-0.062
 alphabet-0.062
Token said
Token ID531
Feature activation+9.47e-03

Pos logit contributions
sing+0.030
fish+0.030
 footing+0.027
ies+0.027
 d+0.027
Neg logit contributions
These-0.035
 languages-0.035
Always-0.034
If-0.034
Class-0.033
Token they
Token ID484
Feature activation+8.29e-03

Pos logit contributions
sing+0.014
 spine+0.012
fish+0.012
 footing+0.012
ies+0.011
Neg logit contributions
 languages-0.020
Will-0.019
These-0.019
Always-0.018
Would-0.018
Token work
Token ID670
Feature activation+4.65e-02

Pos logit contributions
sing+0.113
fish+0.111
orse+0.103
 d+0.101
less+0.100
Neg logit contributions
 languages-0.136
These-0.127
Class-0.124
 alphabet-0.124
Always-0.123
Token,
Token ID11
Feature activation+4.88e-02

Pos logit contributions
sing+0.056
fish+0.055
 d+0.050
ies+0.049
 footing+0.048
Neg logit contributions
These-0.105
 languages-0.104
Always-0.102
If-0.102
Class-0.100
Token bravery
Token ID37808
Feature activation+5.47e-02

Pos logit contributions
sing+0.064
fish+0.061
 d+0.057
ies+0.056
 footing+0.055
Neg logit contributions
These-0.106
 languages-0.105
Always-0.103
If-0.102
Class-0.100
Token and
Token ID290
Feature activation+5.22e-02

Pos logit contributions
sing+0.078
fish+0.074
orse+0.068
 footing+0.067
 d+0.066
Neg logit contributions
 languages-0.114
These-0.112
Always-0.110
If-0.109
Class-0.107
Token cle
Token ID1190
Feature activation+6.35e-02

Pos logit contributions
sing+0.080
fish+0.078
 d+0.077
ies+0.074
less+0.073
Neg logit contributions
These-0.099
Always-0.096
If-0.095
 languages-0.095
Class-0.093
Tokenvern
Token ID933
Feature activation+9.29e-02

Pos logit contributions
sing+0.111
fish+0.099
 spine+0.096
 d+0.095
sed+0.093
Neg logit contributions
 languages-0.118
If-0.115
These-0.113
Always-0.112
Hope-0.108
Tokeness
Token ID408
Feature activation+5.25e-02

Pos logit contributions
sing+0.154
fish+0.152
 shovel+0.152
 raft+0.151
orse+0.148
Neg logit contributions
 languages-0.173
If-0.163
Always-0.156
These-0.153
Class-0.152
Token!
Token ID0
Feature activation+2.34e-02

Pos logit contributions
sing+0.076
fish+0.069
 footing+0.066
orse+0.064
 spine+0.064
Neg logit contributions
 languages-0.109
These-0.107
Always-0.105
If-0.104
Class-0.102
Token From
Token ID3574
Feature activation+1.18e-02

Pos logit contributions
orse+0.040
sing+0.039
fish+0.038
 d+0.038
ipop+0.037
Neg logit contributions
 languages-0.037
These-0.037
Always-0.034
If-0.033
Class-0.033
Token now
Token ID783
Feature activation+3.36e-02

Pos logit contributions
sing+0.016
fish+0.016
 d+0.014
ies+0.014
 footing+0.014
Neg logit contributions
These-0.025
 languages-0.025
Always-0.024
If-0.024
Class-0.023
Token on
Token ID319
Feature activation+2.45e-02

Pos logit contributions
 d+0.039
fish+0.039
 footing+0.037
ies+0.037
sing+0.036
Neg logit contributions
These-0.075
 languages-0.072
If-0.071
Would-0.071
Class-0.070
Token as
Token ID355
Feature activation+8.71e-02

Pos logit contributions
sing+0.133
fish+0.128
 footing+0.121
kin+0.118
 raft+0.118
Neg logit contributions
 languages-0.171
Class-0.166
These-0.162
Always-0.161
Anne-0.160
Token Molly
Token ID30236
Feature activation+4.05e-02

Pos logit contributions
sing+0.128
fish+0.120
 spine+0.111
orse+0.111
 footing+0.110
Neg logit contributions
These-0.172
 languages-0.172
Always-0.170
Anne-0.167
If-0.166
Token smelled
Token ID37609
Feature activation+3.17e-02

Pos logit contributions
fish+0.058
sing+0.057
 d+0.054
ies+0.052
orse+0.051
Neg logit contributions
These-0.083
 languages-0.082
Always-0.079
If-0.079
Class-0.077
Token the
Token ID262
Feature activation+5.69e-02

Pos logit contributions
sing+0.066
fish+0.060
orse+0.060
 spine+0.059
sed+0.057
Neg logit contributions
These-0.046
 languages-0.044
If-0.043
Always-0.042
Class-0.041
Token fresh
Token ID4713
Feature activation+4.31e-02

Pos logit contributions
sing+0.093
fish+0.090
orse+0.089
ies+0.085
 footing+0.084
Neg logit contributions
These-0.100
 languages-0.100
 alphabet-0.095
Always-0.094
If-0.091
Token air
Token ID1633
Feature activation+9.29e-02

Pos logit contributions
sing+0.095
fish+0.093
orse+0.088
 footing+0.087
ies+0.086
Neg logit contributions
These-0.056
 languages-0.053
Always-0.051
If-0.051
Class-0.049
Token she
Token ID673
Feature activation+5.06e-02

Pos logit contributions
sing+0.150
orse+0.136
fish+0.134
 footing+0.129
ipop+0.128
Neg logit contributions
These-0.171
 languages-0.166
Always-0.163
Class-0.156
If-0.154
Token started
Token ID2067
Feature activation+4.20e-02

Pos logit contributions
sing+0.083
fish+0.078
 spine+0.074
 footing+0.073
ies+0.073
Neg logit contributions
These-0.092
Always-0.089
 languages-0.088
If-0.087
Class-0.086
Token barking
Token ID48961
Feature activation+3.35e-02

Pos logit contributions
sing+0.065
fish+0.065
ipop+0.060
orse+0.059
kin+0.058
Neg logit contributions
 languages-0.077
These-0.076
Always-0.074
If-0.072
Class-0.071
Token and
Token ID290
Feature activation+6.17e-02

Pos logit contributions
sing+0.050
fish+0.049
orse+0.047
 footing+0.046
 spine+0.045
Neg logit contributions
 languages-0.065
These-0.064
If-0.061
Always-0.061
Class-0.059
Token running
Token ID2491
Feature activation+1.73e-02

Pos logit contributions
fish+0.099
sing+0.096
 footing+0.092
kin+0.091
 spine+0.090
Neg logit contributions
These-0.108
 languages-0.108
Anne-0.104
Class-0.104
Always-0.104
Token play
Token ID711
Feature activation+1.46e-02

Pos logit contributions
sing+0.166
fish+0.164
kin+0.155
ower+0.154
 spine+0.152
Neg logit contributions
 languages-0.169
Always-0.154
 Friendship-0.154
If-0.152
 Share-0.151
Token catch
Token ID4929
Feature activation+2.03e-02

Pos logit contributions
sing+0.026
fish+0.025
orse+0.023
 spine+0.023
 footing+0.023
Neg logit contributions
These-0.025
 languages-0.025
Always-0.024
If-0.024
Class-0.023
Token near
Token ID1474
Feature activation+2.59e-02

Pos logit contributions
sing+0.036
fish+0.033
 spine+0.033
 footing+0.032
orse+0.031
Neg logit contributions
 languages-0.036
These-0.036
Always-0.035
If-0.034
Class-0.033
Token the
Token ID262
Feature activation+6.59e-02

Pos logit contributions
sing+0.042
fish+0.041
 d+0.039
ies+0.038
 footing+0.038
Neg logit contributions
These-0.048
 languages-0.047
Always-0.046
If-0.046
Class-0.045
Token hut
Token ID40812
Feature activation+7.45e-02

Pos logit contributions
sing+0.116
fish+0.114
orse+0.112
ies+0.112
less+0.111
Neg logit contributions
 languages-0.107
These-0.106
Always-0.098
 alphabet-0.097
 parks-0.095
Token anymore
Token ID7471
Feature activation+9.29e-02

Pos logit contributions
sing+0.117
orse+0.107
fish+0.107
 d+0.105
ipop+0.104
Neg logit contributions
These-0.135
 languages-0.135
Always-0.135
If-0.131
Class-0.126
Token.
Token ID13
Feature activation+2.51e-02

Pos logit contributions
sing+0.146
less+0.129
 spine+0.127
 footing+0.125
fish+0.124
Neg logit contributions
 languages-0.181
Always-0.176
These-0.173
If-0.171
Class-0.169
Token They
Token ID1119
Feature activation+3.27e-02

Pos logit contributions
sing+0.043
orse+0.042
fish+0.039
 d+0.039
 spine+0.038
Neg logit contributions
 languages-0.042
These-0.042
Always-0.039
If-0.038
 alphabet-0.038
Token learn
Token ID2193
Feature activation+2.13e-02

Pos logit contributions
sing+0.057
fish+0.056
less+0.053
 footing+0.052
ies+0.052
Neg logit contributions
These-0.057
 languages-0.057
Always-0.055
If-0.054
Class-0.052
Token to
Token ID284
Feature activation-6.89e-03

Pos logit contributions
sing+0.035
fish+0.033
 spine+0.032
 footing+0.031
orse+0.031
Neg logit contributions
 languages-0.042
Always-0.040
These-0.040
If-0.038
 Friendship-0.038
Token respect
Token ID2461
Feature activation+9.88e-03

Pos logit contributions
 languages+0.013
These+0.013
Always+0.013
If+0.012
Class+0.012
Neg logit contributions
sing-0.011
fish-0.011
ipop-0.010
less-0.010
 footing-0.010
Token path
Token ID3108
Feature activation+6.62e-02

Pos logit contributions
sing+0.159
orse+0.158
fish+0.155
less+0.152
ipop+0.151
Neg logit contributions
These-0.129
 languages-0.126
 alphabet-0.120
 parks-0.118
Always-0.117
Token,
Token ID11
Feature activation+5.16e-02

Pos logit contributions
sing+0.104
fish+0.097
orse+0.092
ies+0.090
ipop+0.090
Neg logit contributions
 languages-0.122
These-0.121
Always-0.119
If-0.118
Class-0.113
Token further
Token ID2252
Feature activation+6.62e-02

Pos logit contributions
fish+0.079
sing+0.077
kin+0.072
ipop+0.071
orse+0.071
Neg logit contributions
These-0.098
 languages-0.096
Always-0.096
If-0.094
Would-0.092
Token and
Token ID290
Feature activation+6.33e-02

Pos logit contributions
sing+0.127
ipop+0.118
orse+0.118
kin+0.116
OO+0.111
Neg logit contributions
 languages-0.108
 parks-0.103
If-0.100
Always-0.099
These-0.094
Token further
Token ID2252
Feature activation+6.67e-02

Pos logit contributions
sing+0.110
fish+0.109
ipop+0.100
ies+0.100
orse+0.100
Neg logit contributions
These-0.110
 languages-0.109
Always-0.107
If-0.107
Class-0.104
Token away
Token ID1497
Feature activation+9.28e-02

Pos logit contributions
sing+0.125
ipop+0.117
kin+0.114
orse+0.112
OO+0.110
Neg logit contributions
 languages-0.112
 parks-0.108
If-0.104
Always-0.103
These-0.099
Token from
Token ID422
Feature activation+3.40e-02

Pos logit contributions
ipop+0.189
kin+0.181
sing+0.181
orse+0.178
fish+0.165
Neg logit contributions
 parks-0.122
 languages-0.119
 towns-0.117
Always-0.113
 circles-0.112
Token the
Token ID262
Feature activation+4.98e-02

Pos logit contributions
fish+0.060
sing+0.060
ies+0.056
orse+0.055
less+0.055
Neg logit contributions
These-0.057
 languages-0.057
Always-0.055
If-0.055
Class-0.053
Token farm
Token ID5318
Feature activation+4.02e-02

Pos logit contributions
fish+0.093
sing+0.091
ies+0.089
orse+0.088
ipop+0.088
Neg logit contributions
These-0.078
 languages-0.077
Always-0.072
If-0.072
 alphabet-0.071
Token.
Token ID13
Feature activation+2.09e-02

Pos logit contributions
sing+0.056
fish+0.054
orse+0.049
ipop+0.049
 d+0.048
Neg logit contributions
 languages-0.082
These-0.081
Always-0.079
If-0.078
Class-0.076
Token
Token ID220
Feature activation+6.47e-02

Pos logit contributions
sing+0.039
orse+0.037
 spine+0.036
 d+0.035
fish+0.035
Neg logit contributions
 languages-0.034
These-0.032
Always-0.031
Class-0.030
If-0.030
Token of
Token ID286
Feature activation+3.54e-02

Pos logit contributions
sing+0.139
orse+0.129
 footing+0.125
 spine+0.122
mit+0.118
Neg logit contributions
These-0.122
Always-0.119
 languages-0.112
If-0.108
 magazines-0.107
Token milk
Token ID7545
Feature activation+5.39e-02

Pos logit contributions
sing+0.062
fish+0.059
 spine+0.056
orse+0.056
 footing+0.056
Neg logit contributions
These-0.064
Always-0.061
 languages-0.061
If-0.059
Class-0.058
Token wob
Token ID46867
Feature activation+7.92e-02

Pos logit contributions
sing+0.096
orse+0.088
 footing+0.086
fish+0.084
 spine+0.082
Neg logit contributions
These-0.094
Always-0.092
If-0.084
 magazines-0.083
 languages-0.083
Tokenbled
Token ID9342
Feature activation+3.37e-02

Pos logit contributions
sing+0.090
 raft+0.089
 shovel+0.085
 crab+0.085
orse+0.085
Neg logit contributions
Always-0.161
These-0.160
If-0.156
Will-0.152
Anne-0.149
Token and
Token ID290
Feature activation+6.43e-02

Pos logit contributions
sing+0.062
orse+0.058
fish+0.058
 footing+0.055
 spine+0.055
Neg logit contributions
These-0.055
 languages-0.054
Always-0.053
If-0.052
Will-0.049
Token almost
Token ID2048
Feature activation+9.28e-02

Pos logit contributions
sing+0.104
fish+0.102
 footing+0.094
 spine+0.094
orse+0.092
Neg logit contributions
These-0.116
Always-0.113
Will-0.109
If-0.109
Would-0.109
Token fell
Token ID3214
Feature activation+6.07e-02

Pos logit contributions
 raft+0.146
sing+0.139
fish+0.135
ower+0.130
 footing+0.125
Neg logit contributions
 reads-0.153
These-0.148
 draws-0.145
Always-0.143
Will-0.143
Token down
Token ID866
Feature activation+4.94e-02

Pos logit contributions
orse+0.122
fish+0.119
sing+0.118
mit+0.108
kin+0.107
Neg logit contributions
 circles-0.083
Always-0.082
These-0.080
Will-0.079
 Every-0.078
Token.
Token ID13
Feature activation+2.95e-02

Pos logit contributions
sing+0.082
fish+0.081
orse+0.077
ies+0.073
 footing+0.071
Neg logit contributions
These-0.088
Always-0.087
If-0.085
 languages-0.084
Will-0.082
Token
Token ID198
Feature activation+3.92e-02

Pos logit contributions
sing+0.058
fish+0.056
orse+0.054
 d+0.054
 spine+0.053
Neg logit contributions
 languages-0.044
These-0.044
Always-0.042
If-0.041
Class-0.040
Token
Token ID198
Feature activation+3.33e-02

Pos logit contributions
fish+0.073
sing+0.072
orse+0.071
 d+0.069
less+0.067
Neg logit contributions
 languages-0.061
These-0.060
Always-0.058
If-0.055
 alphabet-0.055
Token her
Token ID607
Feature activation+7.56e-02

Pos logit contributions
sing+0.088
 footing+0.079
fish+0.078
kin+0.077
ower+0.077
Neg logit contributions
 languages-0.110
If-0.100
These-0.100
Always-0.099
 Friendship-0.097
Token orange
Token ID10912
Feature activation+5.70e-02

Pos logit contributions
fish+0.118
sing+0.115
orse+0.111
less+0.110
OO+0.108
Neg logit contributions
 languages-0.140
These-0.126
Always-0.124
 alphabet-0.124
Hope-0.123
Token arms
Token ID5101
Feature activation+8.00e-02

Pos logit contributions
sing+0.086
fish+0.083
orse+0.082
 footing+0.078
 d+0.078
Neg logit contributions
 languages-0.105
These-0.101
Always-0.099
 alphabet-0.096
Will-0.095
Token in
Token ID287
Feature activation+2.54e-02

Pos logit contributions
sing+0.129
fish+0.123
orse+0.119
 footing+0.118
kin+0.115
Neg logit contributions
These-0.139
Always-0.139
 languages-0.139
If-0.133
 Every-0.128
Token the
Token ID262
Feature activation+6.76e-02

Pos logit contributions
sing+0.035
 d+0.035
fish+0.033
 grip+0.032
less+0.032
Neg logit contributions
Always-0.051
These-0.051
Class-0.049
If-0.049
 languages-0.048
Token air
Token ID1633
Feature activation+9.27e-02

Pos logit contributions
sing+0.133
fish+0.130
 d+0.124
ies+0.121
 footing+0.120
Neg logit contributions
These-0.102
 languages-0.100
Always-0.099
If-0.098
Class-0.095
Token to
Token ID284
Feature activation+1.13e-02

Pos logit contributions
sing+0.146
fish+0.131
 footing+0.130
orse+0.129
ipop+0.126
Neg logit contributions
 languages-0.170
These-0.167
Always-0.165
If-0.158
Class-0.157
Token get
Token ID651
Feature activation+2.45e-02

Pos logit contributions
sing+0.018
fish+0.018
ies+0.017
less+0.017
 d+0.016
Neg logit contributions
These-0.021
 languages-0.020
Always-0.020
If-0.020
Class-0.019
Token their
Token ID511
Feature activation+4.73e-02

Pos logit contributions
sing+0.040
fish+0.038
orse+0.036
 spine+0.036
kin+0.036
Neg logit contributions
These-0.046
Always-0.045
 languages-0.045
If-0.044
Would-0.042
Token attention
Token ID3241
Feature activation+5.68e-02

Pos logit contributions
fish+0.070
sing+0.070
orse+0.065
 d+0.064
ies+0.063
Neg logit contributions
 languages-0.092
These-0.091
Always-0.088
If-0.086
Class-0.085
Token.
Token ID13
Feature activation+1.96e-02

Pos logit contributions
sing+0.086
fish+0.080
 footing+0.076
orse+0.075
 spine+0.074
Neg logit contributions
 languages-0.109
These-0.108
Always-0.105
If-0.103
Will-0.102
Token.
Token ID13
Feature activation+4.64e-02

Pos logit contributions
These+0.022
 languages+0.022
Always+0.022
If+0.021
Class+0.021
Neg logit contributions
sing-0.020
fish-0.020
 d-0.019
ies-0.018
orse-0.018
Token Something
Token ID13742
Feature activation+3.97e-02

Pos logit contributions
sing+0.092
fish+0.089
orse+0.087
 d+0.086
ipop+0.082
Neg logit contributions
These-0.067
 languages-0.066
Always-0.064
If-0.062
Class-0.061
Token was
Token ID373
Feature activation+3.62e-02

Pos logit contributions
sing+0.078
 spine+0.072
 footing+0.070
orse+0.069
fish+0.069
Neg logit contributions
 languages-0.062
These-0.060
Always-0.059
If-0.055
 Friendship-0.055
Token blocking
Token ID12013
Feature activation+2.78e-02

Pos logit contributions
sing+0.058
fish+0.056
 spine+0.053
ies+0.053
ipop+0.053
Neg logit contributions
 languages-0.065
These-0.064
Always-0.063
 alphabet-0.061
If-0.060
Token the
Token ID262
Feature activation+6.10e-02

Pos logit contributions
sing+0.046
fish+0.045
 d+0.042
ies+0.041
orse+0.041
Neg logit contributions
These-0.051
 languages-0.050
Always-0.049
If-0.049
Class-0.047
Token hole
Token ID7604
Feature activation+9.25e-02

Pos logit contributions
sing+0.104
fish+0.102
ies+0.097
 d+0.095
orse+0.095
Neg logit contributions
These-0.106
 languages-0.106
Always-0.102
If-0.100
Class-0.099
Token.
Token ID13
Feature activation+1.79e-02

Pos logit contributions
sing+0.149
orse+0.138
fish+0.136
ipop+0.136
sed+0.129
Neg logit contributions
 languages-0.160
These-0.155
Always-0.155
If-0.149
 parks-0.144
Token It
Token ID632
Feature activation+3.38e-02

Pos logit contributions
sing+0.032
orse+0.031
fish+0.031
 d+0.031
ipop+0.029
Neg logit contributions
 languages-0.029
These-0.028
Always-0.027
If-0.026
Class-0.026
Token was
Token ID373
Feature activation+3.28e-02

Pos logit contributions
sing+0.061
fish+0.058
 spine+0.055
less+0.055
orse+0.055
Neg logit contributions
These-0.056
 languages-0.055
Always-0.055
If-0.053
Class-0.052
Token big
Token ID1263
Feature activation+1.88e-02

Pos logit contributions
fish+0.057
orse+0.052
ipop+0.051
ies+0.051
sing+0.050
Neg logit contributions
These-0.056
 languages-0.056
Always-0.055
If-0.053
 Wisdom-0.053
Token and
Token ID290
Feature activation+5.19e-02

Pos logit contributions
sing+0.024
fish+0.024
 d+0.022
ies+0.022
less+0.021
Neg logit contributions
These-0.041
 languages-0.040
Always-0.040
If-0.040
Class-0.039
Token frogs
Token ID37475
Feature activation+7.24e-02

Pos logit contributions
sing+0.012
fish+0.012
ies+0.011
 footing+0.011
orse+0.011
Neg logit contributions
Always-0.017
These-0.017
 languages-0.016
If-0.016
Class-0.016
Token shaking
Token ID17275
Feature activation+4.17e-02

Pos logit contributions
sing+0.112
less+0.102
 footing+0.101
 spine+0.098
fish+0.097
Neg logit contributions
 languages-0.136
Always-0.136
These-0.135
If-0.133
Class-0.128
Token off
Token ID572
Feature activation+7.52e-02

Pos logit contributions
sing+0.064
fish+0.063
 footing+0.059
orse+0.059
ipop+0.058
Neg logit contributions
 languages-0.080
These-0.078
Always-0.076
If-0.076
Anne-0.073
Token their
Token ID511
Feature activation+8.06e-02

Pos logit contributions
sing+0.120
fish+0.118
orse+0.112
ipop+0.112
 footing+0.111
Neg logit contributions
 languages-0.134
These-0.129
Always-0.128
If-0.124
Anne-0.122
Token wet
Token ID9583
Feature activation+7.18e-02

Pos logit contributions
fish+0.127
less+0.120
sing+0.119
orse+0.119
ipop+0.117
Neg logit contributions
 languages-0.152
These-0.142
 alphabet-0.135
Always-0.134
 parks-0.132
Tokenness
Token ID1108
Feature activation+9.13e-02

Pos logit contributions
orse+0.115
sing+0.114
fish+0.113
ipop+0.107
less+0.103
Neg logit contributions
 languages-0.127
 circles-0.115
These-0.114
Always-0.113
 alphabet-0.112
Token.
Token ID13
Feature activation+2.80e-02

Pos logit contributions
sing+0.157
orse+0.142
fish+0.138
ipop+0.136
sed+0.134
Neg logit contributions
 languages-0.158
Always-0.156
These-0.156
If-0.149
 alphabet-0.142
Token The
Token ID383
Feature activation+2.27e-02

Pos logit contributions
sing+0.055
fish+0.054
 d+0.052
orse+0.052
ies+0.050
Neg logit contributions
These-0.041
 languages-0.041
Always-0.040
If-0.039
Class-0.038
Token sun
Token ID4252
Feature activation+3.84e-02

Pos logit contributions
sing+0.038
fish+0.036
 footing+0.034
 spine+0.034
orse+0.034
Neg logit contributions
 languages-0.040
These-0.040
Always-0.038
If-0.037
Class-0.037
Token was
Token ID373
Feature activation+3.14e-02

Pos logit contributions
sing+0.063
fish+0.059
 footing+0.057
 spine+0.057
orse+0.056
Neg logit contributions
These-0.070
 languages-0.068
Always-0.068
If-0.065
Class-0.065
Token shining
Token ID22751
Feature activation+3.45e-02

Pos logit contributions
fish+0.052
ipop+0.050
sing+0.049
orse+0.048
 spine+0.048
Neg logit contributions
These-0.056
Always-0.056
Will-0.053
 languages-0.052
If-0.052
TokenThen
Token ID6423
Feature activation+4.97e-02

Pos logit contributions
orse+0.057
fish+0.057
 d+0.055
sing+0.054
less+0.052
Neg logit contributions
 languages-0.047
These-0.042
Always-0.042
 alphabet-0.042
aries-0.040
Token something
Token ID1223
Feature activation+5.25e-02

Pos logit contributions
sing+0.069
fish+0.068
 footing+0.061
less+0.060
 spine+0.060
Neg logit contributions
These-0.103
 languages-0.103
Always-0.100
If-0.099
Class-0.098
Token even
Token ID772
Feature activation+4.28e-02

Pos logit contributions
sing+0.090
 spine+0.085
ower+0.077
 footing+0.077
OO+0.077
Neg logit contributions
 languages-0.097
These-0.092
Always-0.090
Class-0.088
If-0.087
Token we
Token ID356
Feature activation+2.58e-02

Pos logit contributions
sing+0.070
fish+0.068
 footing+0.062
 d+0.062
 spine+0.062
Neg logit contributions
These-0.079
 languages-0.078
Always-0.077
If-0.076
Class-0.074
Tokenir
Token ID343
Feature activation+4.42e-02

Pos logit contributions
sing+0.046
fish+0.042
less+0.042
 spine+0.042
 footing+0.041
Neg logit contributions
These-0.043
 languages-0.043
Always-0.043
If-0.041
Class-0.039
Tokender
Token ID1082
Feature activation+9.13e-02

Pos logit contributions
sing+0.078
fish+0.073
 footing+0.071
kin+0.070
less+0.070
Neg logit contributions
 languages-0.077
Always-0.074
If-0.074
These-0.073
Class-0.069
Token happened
Token ID3022
Feature activation+2.82e-02

Pos logit contributions
sing+0.147
 spine+0.128
ipop+0.126
less+0.125
 footing+0.125
Neg logit contributions
Always-0.179
 languages-0.177
If-0.173
These-0.170
Class-0.161
Token.
Token ID13
Feature activation+1.49e-02

Pos logit contributions
sing+0.039
fish+0.038
 footing+0.034
ipop+0.034
 spine+0.034
Neg logit contributions
 languages-0.059
These-0.058
Always-0.058
If-0.057
Class-0.055
Token The
Token ID383
Feature activation+1.65e-02

Pos logit contributions
sing+0.029
 d+0.027
orse+0.027
fish+0.027
less+0.026
Neg logit contributions
 languages-0.022
These-0.021
Always-0.021
Class-0.020
If-0.020
Token animal
Token ID5044
Feature activation+1.32e-02

Pos logit contributions
sing+0.028
fish+0.027
ies+0.025
orse+0.025
less+0.025
Neg logit contributions
 languages-0.029
These-0.029
Always-0.027
If-0.027
 alphabet-0.027
Token suddenly
Token ID6451
Feature activation+4.45e-02

Pos logit contributions
sing+0.023
fish+0.021
less+0.021
 spine+0.021
 footing+0.021
Neg logit contributions
 languages-0.023
These-0.022
Always-0.022
If-0.021
Class-0.021
Token felt
Token ID2936
Feature activation+2.51e-02

Pos logit contributions
Always+0.007
These+0.007
 languages+0.007
If+0.007
Will+0.007
Neg logit contributions
sing-0.007
fish-0.006
 footing-0.006
 spine-0.006
less-0.006
Token a
Token ID257
Feature activation+2.45e-02

Pos logit contributions
sing+0.048
ies+0.045
fish+0.045
ower+0.045
 spine+0.044
Neg logit contributions
These-0.042
Always-0.041
If-0.041
Would-0.040
Here-0.040
Token chill
Token ID20493
Feature activation+3.25e-02

Pos logit contributions
ies+0.043
fish+0.042
ipop+0.041
ower+0.040
sing+0.040
Neg logit contributions
Will-0.041
Hope-0.036
These-0.035
Always-0.034
If-0.033
Token down
Token ID866
Feature activation+3.78e-02

Pos logit contributions
sing+0.053
fish+0.051
orse+0.051
ipop+0.051
sed+0.050
Neg logit contributions
Will-0.050
These-0.049
Always-0.049
If-0.047
 circles-0.047
Token his
Token ID465
Feature activation+3.48e-02

Pos logit contributions
sing+0.066
fish+0.063
orse+0.061
 footing+0.057
ies+0.057
Neg logit contributions
Always-0.065
These-0.065
 languages-0.063
If-0.063
Will-0.060
Token spine
Token ID19656
Feature activation+9.11e-02

Pos logit contributions
fish+0.049
orse+0.040
ipop+0.040
ies+0.040
kin+0.038
Neg logit contributions
 alphabet-0.071
 languages-0.071
These-0.065
bons-0.065
 magazines-0.065
Token.
Token ID13
Feature activation-6.68e-03

Pos logit contributions
ipop+0.163
fish+0.158
sing+0.157
 raft+0.154
aa+0.152
Neg logit contributions
 circles-0.127
 languages-0.123
Always-0.120
 Every-0.116
These-0.116
Token He
Token ID679
Feature activation+2.03e-02

Pos logit contributions
 languages+0.011
These+0.011
Always+0.011
If+0.010
Class+0.010
Neg logit contributions
sing-0.012
orse-0.011
 d-0.011
fish-0.011
 spine-0.011
Token suddenly
Token ID6451
Feature activation+5.19e-02

Pos logit contributions
sing+0.034
less+0.031
fish+0.031
 spine+0.030
 footing+0.030
Neg logit contributions
Always-0.036
Will-0.036
These-0.036
 languages-0.035
If-0.035
Token remembered
Token ID12086
Feature activation+6.30e-03

Pos logit contributions
sing+0.104
ower+0.087
 footing+0.085
fish+0.085
 spine+0.084
Neg logit contributions
Always-0.077
These-0.076
If-0.075
 languages-0.075
Anne-0.075
Token the
Token ID262
Feature activation+3.58e-03

Pos logit contributions
sing+0.012
fish+0.011
kin+0.011
ower+0.011
 spine+0.011
Neg logit contributions
Always-0.010
 languages-0.010
These-0.009
 Wisdom-0.009
If-0.009
Token't
Token ID470
Feature activation+6.84e-02

Pos logit contributions
sing+0.151
fish+0.147
 d+0.140
ies+0.138
 footing+0.138
Neg logit contributions
These-0.085
 languages-0.084
Always-0.083
If-0.081
Class-0.077
Token sign
Token ID1051
Feature activation+3.36e-02

Pos logit contributions
sing+0.120
fish+0.118
kin+0.109
 d+0.109
 footing+0.109
Neg logit contributions
 languages-0.116
Always-0.111
These-0.110
If-0.109
Class-0.108
Token at
Token ID379
Feature activation+1.47e-02

Pos logit contributions
sing+0.049
fish+0.047
 footing+0.043
 d+0.042
orse+0.042
Neg logit contributions
 languages-0.069
These-0.069
Always-0.067
If-0.066
Class-0.065
Token the
Token ID262
Feature activation+4.75e-02

Pos logit contributions
sing+0.024
fish+0.023
 d+0.022
ies+0.022
 footing+0.021
Neg logit contributions
These-0.027
 languages-0.027
Always-0.026
If-0.026
Class-0.025
Token traffic
Token ID4979
Feature activation+6.32e-02

Pos logit contributions
sing+0.086
fish+0.084
ies+0.078
 d+0.078
 footing+0.078
Neg logit contributions
These-0.079
 languages-0.078
Always-0.075
If-0.074
Class-0.073
Token anymore
Token ID7471
Feature activation+9.09e-02

Pos logit contributions
sing+0.091
fish+0.086
 footing+0.080
orse+0.080
 d+0.079
Neg logit contributions
 languages-0.129
These-0.127
Always-0.122
If-0.122
Class-0.121
Token."
Token ID526
Feature activation+2.38e-02

Pos logit contributions
sing+0.137
fish+0.120
 footing+0.118
less+0.118
 spine+0.118
Neg logit contributions
 languages-0.185
Always-0.178
These-0.178
Class-0.174
If-0.174
Token
Token ID220
Feature activation+6.84e-02

Pos logit contributions
sing+0.047
fish+0.046
 d+0.044
orse+0.044
ies+0.043
Neg logit contributions
These-0.035
 languages-0.035
Always-0.033
If-0.033
Class-0.032
Token
Token ID198
Feature activation+4.26e-02

Pos logit contributions
sing+0.135
fish+0.133
orse+0.126
 d+0.126
ies+0.125
Neg logit contributions
These-0.100
 languages-0.099
Always-0.096
If-0.093
Class-0.092
Token
Token ID198
Feature activation+3.72e-02

Pos logit contributions
fish+0.073
orse+0.073
sing+0.071
 d+0.068
ies+0.065
Neg logit contributions
 languages-0.072
These-0.068
Always-0.066
 alphabet-0.065
aries-0.064
TokenJess
Token ID34648
Feature activation+2.35e-02

Pos logit contributions
fish+0.067
orse+0.065
sing+0.064
 d+0.063
less+0.061
Neg logit contributions
 languages-0.060
These-0.055
 alphabet-0.054
Always-0.053
aries-0.052
Token a
Token ID257
Feature activation+1.14e-02

Pos logit contributions
sing+0.018
fish+0.018
ies+0.016
 spine+0.016
orse+0.016
Neg logit contributions
These-0.021
 languages-0.020
Always-0.020
If-0.020
Would-0.019
Token chill
Token ID20493
Feature activation+2.38e-02

Pos logit contributions
fish+0.018
sing+0.017
ies+0.017
ipop+0.017
orse+0.016
Neg logit contributions
Will-0.021
These-0.020
If-0.020
Always-0.020
 languages-0.019
Token run
Token ID1057
Feature activation-4.55e-04

Pos logit contributions
fish+0.039
sing+0.038
orse+0.038
ipop+0.037
sed+0.036
Neg logit contributions
These-0.037
Will-0.036
 languages-0.036
Always-0.035
 circles-0.035
Token down
Token ID866
Feature activation+2.86e-02

Pos logit contributions
 languages+0.001
These+0.001
If+0.001
Always+0.001
Class+0.001
Neg logit contributions
fish-0.001
sing-0.001
orse-0.001
 footing-0.001
ies-0.001
Token his
Token ID465
Feature activation+3.69e-02

Pos logit contributions
sing+0.039
fish+0.038
 d+0.035
ies+0.035
 footing+0.034
Neg logit contributions
These-0.060
 languages-0.059
Always-0.059
If-0.058
Class-0.057
Token spine
Token ID19656
Feature activation+9.08e-02

Pos logit contributions
fish+0.038
sing+0.034
 d+0.032
orse+0.032
ies+0.032
Neg logit contributions
 languages-0.091
These-0.088
Always-0.086
 alphabet-0.086
Hope-0.085
Token.
Token ID13
Feature activation-4.17e-03

Pos logit contributions
ipop+0.157
fish+0.155
 raft+0.149
sing+0.147
aa+0.147
Neg logit contributions
 languages-0.137
 circles-0.135
 Every-0.125
These-0.123
Always-0.122
Token He
Token ID679
Feature activation+1.50e-02

Pos logit contributions
These+0.007
 languages+0.007
Always+0.007
If+0.006
Hope+0.006
Neg logit contributions
sing-0.008
 spine-0.007
 d-0.007
less-0.007
 footing-0.007
Token could
Token ID714
Feature activation+1.16e-02

Pos logit contributions
sing+0.022
fish+0.021
 d+0.020
ies+0.019
 footing+0.019
Neg logit contributions
These-0.030
 languages-0.030
Always-0.029
If-0.029
Class-0.028
Token not
Token ID407
Feature activation+3.47e-02

Pos logit contributions
fish+0.019
less+0.018
ies+0.018
sing+0.018
ipop+0.018
Neg logit contributions
Always-0.020
Anne-0.020
These-0.020
If-0.019
 languages-0.019
Token believe
Token ID1975
Feature activation+1.59e-02

Pos logit contributions
sing+0.055
fish+0.055
ies+0.052
less+0.052
 d+0.051
Neg logit contributions
These-0.063
 languages-0.062
Always-0.061
If-0.061
Class-0.060
Token it
Token ID340
Feature activation+5.53e-02

Pos logit contributions
fish+0.106
sing+0.106
ies+0.098
 spine+0.097
less+0.096
Neg logit contributions
These-0.112
 languages-0.108
Always-0.104
Class-0.103
Anne-0.103
Token completely
Token ID3190
Feature activation+8.17e-02

Pos logit contributions
sing+0.095
fish+0.092
 footing+0.085
ies+0.085
 d+0.085
Neg logit contributions
These-0.097
 languages-0.095
Always-0.094
If-0.093
Class-0.091
Token disappeared
Token ID12120
Feature activation+6.99e-02

Pos logit contributions
sing+0.134
orse+0.130
ipop+0.124
 spine+0.121
 footing+0.120
Neg logit contributions
Always-0.140
 languages-0.129
These-0.128
Hope-0.127
 reads-0.126
Token in
Token ID287
Feature activation+3.97e-02

Pos logit contributions
orse+0.140
ipop+0.128
sing+0.121
fish+0.121
aa+0.111
Neg logit contributions
 languages-0.105
Always-0.098
 parks-0.097
 circles-0.095
These-0.092
Token the
Token ID262
Feature activation+7.02e-02

Pos logit contributions
sing+0.055
fish+0.054
 d+0.053
 footing+0.049
less+0.049
Neg logit contributions
These-0.080
 languages-0.080
Always-0.079
If-0.079
Class-0.078
Token night
Token ID1755
Feature activation+9.07e-02

Pos logit contributions
sing+0.151
fish+0.149
ies+0.140
orse+0.140
 footing+0.140
Neg logit contributions
These-0.094
 languages-0.092
Always-0.088
If-0.085
Class-0.084
Token sky
Token ID6766
Feature activation+7.42e-02

Pos logit contributions
sing+0.175
fish+0.165
 footing+0.162
ipop+0.162
 raft+0.159
Neg logit contributions
These-0.135
 languages-0.132
Class-0.131
Always-0.129
Anne-0.126
Token.
Token ID13
Feature activation+1.80e-02

Pos logit contributions
sing+0.114
fish+0.107
ipop+0.106
 footing+0.106
 raft+0.103
Neg logit contributions
 languages-0.134
These-0.131
Always-0.130
Class-0.126
Anne-0.123
Token<|endoftext|>
Token ID50256
Feature activation-1.32e-02

Pos logit contributions
sing+0.033
orse+0.031
 d+0.030
fish+0.029
 flo+0.029
Neg logit contributions
 languages-0.030
These-0.029
Always-0.029
If-0.028
Class-0.027
TokenOnce
Token ID7454
Feature activation+6.14e-02

Pos logit contributions
 languages+0.028
These+0.027
If+0.026
Always+0.026
Class+0.025
Neg logit contributions
fish-0.018
 d-0.017
kin-0.016
orse-0.015
 flo-0.015
Token upon
Token ID2402
Feature activation+3.00e-02

Pos logit contributions
sing+0.083
 raft+0.071
 spine+0.070
fish+0.070
 thirst+0.069
Neg logit contributions
 languages-0.127
 Every-0.125
 Everyday-0.122
Anne-0.122
These-0.122
Token dust
Token ID8977
Feature activation+7.34e-02

Pos logit contributions
sing+0.099
 footing+0.087
orse+0.087
fish+0.086
 spine+0.084
Neg logit contributions
 languages-0.101
These-0.096
Always-0.094
 alphabet-0.090
Class-0.089
Token floating
Token ID12462
Feature activation+3.38e-02

Pos logit contributions
sing+0.122
orse+0.110
 footing+0.108
fish+0.104
ipop+0.104
Neg logit contributions
Always-0.129
 languages-0.128
These-0.127
If-0.120
Will-0.118
Token around
Token ID1088
Feature activation+3.99e-02

Pos logit contributions
sing+0.062
orse+0.057
 footing+0.054
fish+0.054
 spine+0.054
Neg logit contributions
 languages-0.057
These-0.057
Always-0.056
If-0.054
Will-0.051
Token in
Token ID287
Feature activation+2.95e-02

Pos logit contributions
sing+0.062
fish+0.060
 d+0.055
ies+0.055
 footing+0.055
Neg logit contributions
These-0.077
 languages-0.076
Always-0.075
If-0.074
Class-0.072
Token the
Token ID262
Feature activation+6.24e-02

Pos logit contributions
sing+0.044
fish+0.044
 d+0.043
less+0.040
ies+0.040
Neg logit contributions
These-0.057
 languages-0.056
Always-0.055
If-0.055
Class-0.054
Token air
Token ID1633
Feature activation+9.05e-02

Pos logit contributions
sing+0.126
fish+0.123
ipop+0.117
orse+0.117
 footing+0.116
Neg logit contributions
These-0.093
 languages-0.090
Always-0.087
If-0.084
Class-0.083
Token.
Token ID13
Feature activation+3.15e-02

Pos logit contributions
sing+0.153
ipop+0.135
 footing+0.135
orse+0.131
sed+0.130
Neg logit contributions
Always-0.153
These-0.153
 languages-0.152
If-0.144
Class-0.143
Token It
Token ID632
Feature activation+2.60e-02

Pos logit contributions
sing+0.062
fish+0.060
orse+0.059
 d+0.058
 spine+0.056
Neg logit contributions
 languages-0.047
These-0.046
Always-0.044
If-0.043
Class-0.042
Token was
Token ID373
Feature activation+2.18e-02

Pos logit contributions
sing+0.049
fish+0.046
 spine+0.045
less+0.045
 footing+0.045
Neg logit contributions
These-0.041
 languages-0.041
Always-0.041
If-0.039
Anne-0.039
Token coming
Token ID2406
Feature activation+1.18e-02

Pos logit contributions
fish+0.038
sing+0.037
ipop+0.036
ies+0.035
 spine+0.034
Neg logit contributions
These-0.037
Always-0.037
 languages-0.037
If-0.035
Anne-0.035
Token from
Token ID422
Feature activation+2.44e-02

Pos logit contributions
orse+0.022
fish+0.021
ipop+0.021
kin+0.021
sing+0.020
Neg logit contributions
These-0.018
Always-0.018
 languages-0.018
Would-0.018
Will-0.018
Token a
Token ID257
Feature activation-3.74e-02

Pos logit contributions
 languages+0.110
These+0.109
If+0.106
Always+0.106
Will+0.104
Neg logit contributions
sing-0.060
fish-0.056
 footing-0.051
 spine-0.051
ies-0.050
Token prayer
Token ID11443
Feature activation-2.60e-03

Pos logit contributions
These+0.071
 languages+0.071
Always+0.069
If+0.068
Class+0.067
Neg logit contributions
sing-0.058
fish-0.057
ies-0.052
 footing-0.052
orse-0.052
Token.
Token ID13
Feature activation-4.00e-02

Pos logit contributions
 languages+0.005
These+0.005
Always+0.005
 alphabet+0.005
If+0.005
Neg logit contributions
sing-0.004
fish-0.003
 footing-0.003
orse-0.003
 raft-0.003
Token He
Token ID679
Feature activation-1.98e-02

Pos logit contributions
 languages+0.088
Always+0.087
June+0.083
These+0.083
Hope+0.081
Neg logit contributions
sing-0.047
 d-0.045
 tum-0.045
 spine-0.044
 shovel-0.044
Token said
Token ID531
Feature activation-4.75e-02

Pos logit contributions
These+0.036
 languages+0.035
If+0.034
Always+0.034
Class+0.033
Neg logit contributions
fish-0.033
sing-0.033
 d-0.032
ies-0.031
orse-0.030
Token "
Token ID366
Feature activation-9.69e-02

Pos logit contributions
These+0.107
 languages+0.107
Always+0.105
If+0.104
Class+0.102
Neg logit contributions
sing-0.058
fish-0.056
 footing-0.050
ies-0.049
 d-0.049
TokenDear
Token ID20266
Feature activation-5.46e-02

Pos logit contributions
 languages+0.165
 Friendship+0.132
aries+0.131
 Share+0.129
 Learn+0.129
Neg logit contributions
 d-0.171
fish-0.168
less-0.167
ies-0.161
aa-0.160
Token God
Token ID1793
Feature activation-3.51e-02

Pos logit contributions
These+0.094
If+0.090
Always+0.090
 languages+0.089
June+0.086
Neg logit contributions
sing-0.099
fish-0.097
 d-0.095
orse-0.092
ies-0.089
Token,
Token ID11
Feature activation-2.25e-02

Pos logit contributions
These+0.079
 languages+0.078
Always+0.077
If+0.076
Class+0.075
Neg logit contributions
sing-0.043
fish-0.042
 d-0.038
ies-0.037
 footing-0.037
Token please
Token ID3387
Feature activation-2.24e-02

Pos logit contributions
These+0.047
 languages+0.047
If+0.045
Always+0.045
Class+0.044
Neg logit contributions
sing-0.031
fish-0.030
 d-0.029
less-0.027
orse-0.027
Token make
Token ID787
Feature activation-7.99e-02

Pos logit contributions
These+0.040
 languages+0.040
Always+0.039
If+0.039
Class+0.038
Neg logit contributions
sing-0.037
fish-0.037
 d-0.034
 footing-0.034
ies-0.034
Token Mum
Token ID38367
Feature activation-4.80e-02

Pos logit contributions
 languages+0.083
These+0.080
 alphabet+0.077
Class+0.075
Always+0.074
Neg logit contributions
sing-0.057
 d-0.055
 as-0.055
 shovel-0.054
 raft-0.053
Token smiled
Token ID13541
Feature activation-1.67e-02

Pos logit contributions
These+0.089
 languages+0.089
Always+0.089
If+0.086
Class+0.085
Neg logit contributions
sing-0.077
fish-0.071
 footing-0.069
 spine-0.069
less-0.067
Token and
Token ID290
Feature activation+3.87e-03

Pos logit contributions
Always+0.027
 languages+0.026
These+0.026
If+0.026
Will+0.026
Neg logit contributions
sing-0.031
fish-0.030
 footing-0.029
orse-0.029
 spine-0.028
Token said
Token ID531
Feature activation-4.75e-02

Pos logit contributions
sing+0.007
 footing+0.006
 spine+0.006
fish+0.006
less+0.006
Neg logit contributions
Always-0.007
These-0.007
Would-0.006
Class-0.006
If-0.006
Token:
Token ID25
Feature activation-3.95e-02

Pos logit contributions
These+0.085
Will+0.083
 languages+0.083
If+0.083
Would+0.082
Neg logit contributions
sing-0.084
 spine-0.075
 footing-0.075
ies-0.073
fish-0.071
Token "
Token ID366
Feature activation-9.66e-02

Pos logit contributions
These+0.067
 languages+0.066
Always+0.065
If+0.064
Class+0.063
Neg logit contributions
sing-0.070
fish-0.068
 d-0.064
ies-0.063
 footing-0.063
TokenI
Token ID40
Feature activation-3.50e-02

Pos logit contributions
 languages+0.188
aries+0.168
 alphabet+0.163
 channels+0.162
 parks+0.161
Neg logit contributions
fish-0.147
less-0.145
aa-0.138
 d-0.138
ipop-0.135
Token will
Token ID481
Feature activation-3.86e-02

Pos logit contributions
Always+0.060
These+0.059
If+0.058
Class+0.057
 languages+0.057
Neg logit contributions
sing-0.061
fish-0.060
 footing-0.056
 spine-0.056
ies-0.055
Token help
Token ID1037
Feature activation-6.93e-02

Pos logit contributions
These+0.068
Always+0.067
 languages+0.067
If+0.065
Class+0.063
Neg logit contributions
sing-0.066
fish-0.064
ies-0.059
 footing-0.059
 spine-0.059
Token you
Token ID345
Feature activation-2.20e-02

Pos logit contributions
These+0.134
 languages+0.133
Always+0.129
If+0.128
Class+0.125
Neg logit contributions
fish-0.108
sing-0.107
 d-0.100
ies-0.099
orse-0.096
Token get
Token ID651
Feature activation-4.14e-02

Pos logit contributions
These+0.039
Always+0.039
 languages+0.038
If+0.038
Class+0.037
Neg logit contributions
sing-0.037
fish-0.035
 footing-0.033
 spine-0.033
 d-0.033
Token polite
Token ID23507
Feature activation-8.40e-03

Pos logit contributions
These+0.002
Class+0.002
Always+0.002
Hope+0.002
 languages+0.002
Neg logit contributions
fish-0.002
sing-0.002
 footing-0.002
 raft-0.001
 spine-0.001
Token.
Token ID13
Feature activation-3.86e-02

Pos logit contributions
These+0.017
 languages+0.017
Always+0.017
If+0.016
Class+0.016
Neg logit contributions
sing-0.012
fish-0.012
 footing-0.011
orse-0.011
 d-0.011
Token She
Token ID1375
Feature activation-1.51e-02

Pos logit contributions
aries+0.071
Always+0.070
Brian+0.068
 languages+0.067
Will+0.067
Neg logit contributions
sing-0.061
 tum-0.057
 d-0.055
ipop-0.054
 shovel-0.053
Token always
Token ID1464
Feature activation-8.38e-03

Pos logit contributions
These+0.027
Always+0.026
 languages+0.026
If+0.025
Class+0.025
Neg logit contributions
sing-0.026
fish-0.026
 footing-0.024
ies-0.024
less-0.024
Token said
Token ID531
Feature activation-4.79e-02

Pos logit contributions
These+0.015
 languages+0.015
Always+0.014
If+0.014
Class+0.014
Neg logit contributions
sing-0.014
fish-0.014
 footing-0.013
ies-0.013
 d-0.013
Token "
Token ID366
Feature activation-9.66e-02

Pos logit contributions
 languages+0.102
These+0.102
Always+0.099
If+0.098
Class+0.097
Neg logit contributions
sing-0.065
fish-0.062
 footing-0.057
 d-0.056
ies-0.056
Tokenplease
Token ID29688
Feature activation-4.90e-02

Pos logit contributions
 languages+0.202
 Learn+0.176
 Friendship+0.175
aries+0.171
 parks+0.169
Neg logit contributions
sing-0.131
ower-0.129
 d-0.128
ies-0.127
less-0.127
Token"
Token ID1
Feature activation-5.11e-02

Pos logit contributions
 languages+0.099
These+0.097
Always+0.095
Class+0.093
Anne+0.093
Neg logit contributions
 d-0.070
fish-0.069
sing-0.068
less-0.066
orse-0.063
Token and
Token ID290
Feature activation-1.94e-03

Pos logit contributions
These+0.120
 languages+0.118
Always+0.114
If+0.113
Here+0.113
Neg logit contributions
sing-0.057
fish-0.056
ies-0.054
 d-0.052
less-0.051
Token "
Token ID366
Feature activation-7.39e-02

Pos logit contributions
 languages+0.004
These+0.004
Always+0.004
If+0.003
Class+0.003
Neg logit contributions
sing-0.003
fish-0.003
 d-0.003
ies-0.003
less-0.003
Tokenthank
Token ID40716
Feature activation-5.29e-02

Pos logit contributions
 languages+0.156
Anne+0.136
 Friendship+0.136
These+0.135
June+0.134
Neg logit contributions
fish-0.105
sing-0.098
ies-0.098
 d-0.097
less-0.096
Token a
Token ID257
Feature activation-9.83e-03

Pos logit contributions
These+0.090
 languages+0.090
Always+0.088
If+0.088
Class+0.085
Neg logit contributions
sing-0.054
 d-0.051
fish-0.048
orse-0.048
 footing-0.046
Token walk
Token ID2513
Feature activation-4.54e-02

Pos logit contributions
Hope+0.014
Will+0.013
These+0.013
 Wisdom+0.013
Class+0.013
Neg logit contributions
ble-0.018
ies-0.018
fish-0.018
kin-0.018
sing-0.018
Token.
Token ID13
Feature activation-3.03e-02

Pos logit contributions
These+0.089
 languages+0.087
Always+0.087
If+0.086
Class+0.083
Neg logit contributions
sing-0.070
fish-0.067
orse-0.063
 footing-0.062
 d-0.061
Token He
Token ID679
Feature activation-9.86e-03

Pos logit contributions
 languages+0.058
Always+0.054
These+0.052
ymes+0.050
aries+0.050
Neg logit contributions
sing-0.051
 spine-0.048
 tum-0.045
orse-0.044
 flo-0.043
Token said
Token ID531
Feature activation-4.17e-02

Pos logit contributions
These+0.020
 languages+0.019
Always+0.019
If+0.019
Class+0.019
Neg logit contributions
sing-0.015
fish-0.014
 d-0.013
ies-0.013
 footing-0.013
Token '
Token ID705
Feature activation-9.52e-02

Pos logit contributions
 languages+0.082
These+0.076
Will+0.075
 Share+0.074
 Wisdom+0.074
Neg logit contributions
sing-0.067
 spine-0.061
 footing-0.059
fish-0.059
kin-0.057
Tokenhello
Token ID31373
Feature activation-4.84e-02

Pos logit contributions
 languages+0.161
aries+0.131
 Friendship+0.126
 teaches+0.124
Anne+0.124
Neg logit contributions
less-0.174
fish-0.173
 d-0.172
orse-0.169
aa-0.167
Token'
Token ID6
Feature activation-3.63e-02

Pos logit contributions
These+0.089
 languages+0.088
Always+0.087
If+0.084
Class+0.084
Neg logit contributions
fish-0.075
 d-0.073
sing-0.073
less-0.071
ies-0.069
Token to
Token ID284
Feature activation-6.56e-02

Pos logit contributions
These+0.079
 languages+0.077
Always+0.076
If+0.076
Class+0.074
Neg logit contributions
sing-0.047
fish-0.046
 d-0.043
ies-0.042
less-0.041
Token everyone
Token ID2506
Feature activation-1.13e-02

Pos logit contributions
These+0.126
Always+0.126
 languages+0.122
If+0.119
Anne+0.118
Neg logit contributions
 d-0.093
fish-0.089
sing-0.087
orse-0.086
ies-0.084
Token he
Token ID339
Feature activation-1.22e-02

Pos logit contributions
These+0.024
 languages+0.023
Always+0.023
If+0.023
Class+0.022
Neg logit contributions
fish-0.016
sing-0.015
 d-0.014
ies-0.014
less-0.014
Token a
Token ID257
Feature activation-1.96e-02

Pos logit contributions
These+0.065
 languages+0.064
Always+0.062
If+0.062
Class+0.060
Neg logit contributions
sing-0.058
fish-0.056
 spine-0.052
 d-0.052
 footing-0.052
Token big
Token ID1263
Feature activation-2.58e-02

Pos logit contributions
 languages+0.033
 alphabet+0.032
These+0.032
Always+0.031
Will+0.030
Neg logit contributions
sing-0.033
fish-0.032
orse-0.032
ies-0.032
 spine-0.031
Token map
Token ID3975
Feature activation-1.48e-02

Pos logit contributions
 languages+0.043
These+0.041
 alphabet+0.040
Always+0.039
Class+0.038
Neg logit contributions
fish-0.045
sing-0.045
orse-0.043
ies-0.042
 footing-0.042
Token and
Token ID290
Feature activation+2.40e-02

Pos logit contributions
These+0.032
 languages+0.032
Always+0.032
If+0.031
Class+0.031
Neg logit contributions
sing-0.019
fish-0.018
 d-0.017
ies-0.016
 footing-0.016
Token said
Token ID531
Feature activation-4.05e-02

Pos logit contributions
sing+0.039
fish+0.039
 d+0.035
 footing+0.035
ies+0.035
Neg logit contributions
These-0.044
 languages-0.044
Always-0.043
If-0.043
Class-0.042
Token "
Token ID366
Feature activation-9.23e-02

Pos logit contributions
 languages+0.086
These+0.084
If+0.081
Will+0.081
 Friendship+0.081
Neg logit contributions
sing-0.058
 footing-0.052
 spine-0.051
fish-0.050
 grip-0.046
TokenLook
Token ID8567
Feature activation-1.55e-02

Pos logit contributions
 languages+0.129
 Friendship+0.103
 Share+0.101
 Learn+0.100
 reads+0.099
Neg logit contributions
less-0.186
fish-0.179
 d-0.178
aa-0.172
sing-0.167
Token,
Token ID11
Feature activation-1.16e-03

Pos logit contributions
 languages+0.030
These+0.030
 Wisdom+0.029
Will+0.029
Anne+0.029
Neg logit contributions
sing-0.022
fish-0.022
 spine-0.020
 footing-0.019
kin-0.019
Token this
Token ID428
Feature activation-1.65e-02

Pos logit contributions
These+0.002
 languages+0.002
Always+0.002
If+0.002
Class+0.002
Neg logit contributions
sing-0.002
fish-0.002
ies-0.002
 footing-0.002
 d-0.002
Token map
Token ID3975
Feature activation-2.81e-02

Pos logit contributions
These+0.031
 languages+0.030
Always+0.030
If+0.030
Class+0.029
Neg logit contributions
sing-0.026
fish-0.026
 d-0.024
ies-0.024
 footing-0.023
Token is
Token ID318
Feature activation-1.47e-02

Pos logit contributions
These+0.045
 languages+0.045
Always+0.043
If+0.043
Class+0.042
Neg logit contributions
sing-0.052
fish-0.051
 d-0.048
ies-0.048
 footing-0.047
Token the
Token ID262
Feature activation+1.04e-02

Pos logit contributions
 languages+0.055
Always+0.055
These+0.054
If+0.054
Class+0.053
Neg logit contributions
sing-0.039
 d-0.037
fish-0.036
 footing-0.034
 raft-0.034
Token car
Token ID1097
Feature activation-1.16e-02

Pos logit contributions
sing+0.018
 d+0.017
 raft+0.017
 spine+0.016
less+0.016
Neg logit contributions
 languages-0.018
If-0.017
Always-0.017
These-0.016
Anne-0.016
Token,
Token ID11
Feature activation+2.10e-02

Pos logit contributions
These+0.025
 languages+0.025
Always+0.024
If+0.024
Class+0.024
Neg logit contributions
sing-0.015
fish-0.015
 footing-0.013
ies-0.013
less-0.013
Token Maggie
Token ID33014
Feature activation-9.58e-03

Pos logit contributions
sing+0.030
fish+0.028
 spine+0.026
 footing+0.025
orse+0.025
Neg logit contributions
These-0.045
 languages-0.043
If-0.043
Always-0.043
Anne-0.042
Token said
Token ID531
Feature activation-3.32e-02

Pos logit contributions
These+0.017
 languages+0.017
Always+0.017
If+0.017
Class+0.017
Neg logit contributions
sing-0.016
fish-0.015
 footing-0.014
less-0.014
 spine-0.014
Token "
Token ID366
Feature activation-9.19e-02

Pos logit contributions
 languages+0.065
These+0.064
Will+0.062
Class+0.062
 Share+0.061
Neg logit contributions
sing-0.053
 spine-0.048
 footing-0.047
fish-0.047
 grip-0.044
TokenWhat
Token ID2061
Feature activation-2.21e-02

Pos logit contributions
 languages+0.137
Always+0.112
 magazines+0.109
 Share+0.109
 Friendship+0.108
Neg logit contributions
 d-0.180
less-0.177
sing-0.174
fish-0.172
aa-0.169
Token do
Token ID466
Feature activation-3.52e-02

Pos logit contributions
These+0.039
Always+0.038
If+0.038
Class+0.037
June+0.037
Neg logit contributions
 d-0.037
fish-0.036
sing-0.034
ies-0.033
less-0.033
Token you
Token ID345
Feature activation-1.45e-02

Pos logit contributions
These+0.076
 languages+0.075
Always+0.074
If+0.073
Class+0.071
Neg logit contributions
fish-0.046
sing-0.044
orse-0.041
 d-0.041
ies-0.040
Token think
Token ID892
Feature activation-2.76e-02

Pos logit contributions
These+0.029
 languages+0.028
Always+0.028
If+0.027
Class+0.027
Neg logit contributions
sing-0.022
fish-0.022
ies-0.020
 d-0.019
 footing-0.019
Token Uncle
Token ID23169
Feature activation-6.06e-02

Pos logit contributions
These+0.056
 languages+0.055
Always+0.054
If+0.054
Class+0.052
Neg logit contributions
sing-0.040
fish-0.039
 d-0.036
ies-0.036
 footing-0.035
TokenOne
Token ID3198
Feature activation-1.21e-02

Pos logit contributions
 languages+0.126
aries+0.115
These+0.114
 alphabet+0.114
Always+0.113
Neg logit contributions
fish-0.057
 d-0.057
 flo-0.052
orse-0.051
 fell-0.050
Token day
Token ID1110
Feature activation+2.13e-02

Pos logit contributions
These+0.022
 languages+0.021
Always+0.021
If+0.020
Class+0.020
Neg logit contributions
sing-0.021
fish-0.020
 spine-0.019
ies-0.018
 footing-0.018
Token,
Token ID11
Feature activation+1.12e-02

Pos logit contributions
sing+0.049
 spine+0.042
 raft+0.040
 footing+0.040
castle+0.040
Neg logit contributions
Will-0.026
These-0.026
If-0.025
Anne-0.024
 Every-0.024
Token Mrs
Token ID9074
Feature activation-4.78e-02

Pos logit contributions
sing+0.022
fish+0.022
 spine+0.021
ies+0.020
orse+0.020
Neg logit contributions
These-0.018
If-0.018
Always-0.016
Hope-0.016
Would-0.016
Token Jones
Token ID5437
Feature activation-3.73e-02

Pos logit contributions
These+0.118
Always+0.116
 languages+0.116
If+0.116
Class+0.112
Neg logit contributions
sing-0.046
fish-0.045
 d-0.043
ies-0.043
less-0.039
Token took
Token ID1718
Feature activation-9.17e-02

Pos logit contributions
These+0.073
 languages+0.072
Always+0.071
If+0.070
Class+0.069
Neg logit contributions
fish-0.056
sing-0.055
 d-0.052
ies-0.051
orse-0.050
Token her
Token ID607
Feature activation-1.86e-02

Pos logit contributions
 languages+0.191
Always+0.186
If+0.182
These+0.181
Would+0.174
Neg logit contributions
 d-0.120
ies-0.108
 as-0.106
less-0.105
aa-0.103
Token pet
Token ID4273
Feature activation-4.53e-02

Pos logit contributions
These+0.038
 languages+0.038
Always+0.038
If+0.037
Class+0.037
Neg logit contributions
sing-0.027
fish-0.025
 d-0.025
ies-0.023
 footing-0.023
Token dog
Token ID3290
Feature activation-5.39e-02

Pos logit contributions
 languages+0.084
These+0.084
Always+0.083
If+0.082
Class+0.080
Neg logit contributions
fish-0.071
sing-0.069
 d-0.068
ies-0.067
 raft-0.066
Token to
Token ID284
Feature activation-4.17e-02

Pos logit contributions
 languages+0.121
If+0.120
These+0.120
Always+0.119
Class+0.115
Neg logit contributions
sing-0.064
fish-0.062
 d-0.062
ies-0.059
less-0.057
Token the
Token ID262
Feature activation-1.52e-02

Pos logit contributions
 languages+0.089
Always+0.084
These+0.084
If+0.084
Class+0.082
Neg logit contributions
sing-0.053
 d-0.052
fish-0.049
 raft-0.048
orse-0.046
Token his
Token ID465
Feature activation+7.68e-04

Pos logit contributions
These+0.018
 languages+0.018
Always+0.017
If+0.017
Class+0.017
Neg logit contributions
sing-0.015
fish-0.015
ies-0.013
 d-0.013
orse-0.013
Token grandma
Token ID49890
Feature activation-4.05e-02

Pos logit contributions
fish+0.001
sing+0.001
orse+0.001
ies+0.001
 d+0.001
Neg logit contributions
 languages-0.002
These-0.002
Always-0.002
If-0.001
 alphabet-0.001
Token.
Token ID13
Feature activation-3.24e-02

Pos logit contributions
These+0.078
 languages+0.077
If+0.075
Always+0.075
Class+0.073
Neg logit contributions
sing-0.064
fish-0.063
 d-0.059
ies-0.058
less-0.057
Token She
Token ID1375
Feature activation-1.08e-02

Pos logit contributions
 languages+0.057
These+0.052
Always+0.052
If+0.051
 alphabet+0.049
Neg logit contributions
sing-0.054
 d-0.053
orse-0.052
fish-0.052
 shovel-0.050
Token said
Token ID531
Feature activation-4.17e-02

Pos logit contributions
These+0.021
 languages+0.020
Always+0.020
If+0.020
Class+0.019
Neg logit contributions
sing-0.017
fish-0.017
 d-0.016
ies-0.015
orse-0.015
Token "
Token ID366
Feature activation-9.16e-02

Pos logit contributions
 languages+0.089
These+0.088
If+0.085
Always+0.085
Class+0.085
Neg logit contributions
sing-0.059
fish-0.053
 footing-0.052
 spine-0.052
kin-0.048
TokenI
Token ID40
Feature activation-3.56e-02

Pos logit contributions
 languages+0.120
 Friendship+0.090
aries+0.090
 Books+0.089
 alphabet+0.089
Neg logit contributions
fish-0.198
less-0.196
 d-0.189
ies-0.185
aa-0.184
Token have
Token ID423
Feature activation-4.35e-02

Pos logit contributions
These+0.063
 languages+0.061
Always+0.061
If+0.060
Class+0.059
Neg logit contributions
sing-0.061
fish-0.060
 footing-0.056
ies-0.055
 spine-0.055
Token something
Token ID1223
Feature activation-8.13e-03

Pos logit contributions
These+0.083
 languages+0.082
Always+0.080
If+0.080
Class+0.078
Neg logit contributions
sing-0.068
fish-0.067
 d-0.062
ies-0.061
 footing-0.061
Token very
Token ID845
Feature activation+9.40e-03

Pos logit contributions
These+0.015
 languages+0.015
Always+0.015
 alphabet+0.014
 Wisdom+0.014
Neg logit contributions
sing-0.013
fish-0.012
 spine-0.012
 footing-0.012
orse-0.011
Token tasty
Token ID25103
Feature activation-1.65e-02

Pos logit contributions
sing+0.017
fish+0.015
 spine+0.015
 footing+0.015
orse+0.015
Neg logit contributions
 languages-0.016
These-0.015
 alphabet-0.015
 Wisdom-0.015
Always-0.014
Token<|endoftext|>
Token ID50256
Feature activation-5.39e-02

Pos logit contributions
 languages+0.054
These+0.051
Always+0.049
Class+0.049
If+0.048
Neg logit contributions
 d-0.045
orse-0.044
sing-0.044
less-0.044
ipop-0.043
TokenM
Token ID44
Feature activation-1.25e-02

Pos logit contributions
 languages+0.128
aries+0.117
Always+0.116
These+0.116
 alphabet+0.116
Neg logit contributions
fish-0.059
 d-0.057
 flo-0.053
kin-0.052
orse-0.052
Tokenum
Token ID388
Feature activation-5.63e-02

Pos logit contributions
These+0.026
 languages+0.026
If+0.025
Always+0.025
Class+0.024
Neg logit contributions
fish-0.017
sing-0.017
 d-0.016
ies-0.016
orse-0.016
Token and
Token ID290
Feature activation-1.60e-02

Pos logit contributions
 languages+0.121
These+0.120
If+0.116
Always+0.115
Class+0.113
Neg logit contributions
fish-0.075
sing-0.071
 d-0.071
ies-0.070
orse-0.065
Token Dad
Token ID17415
Feature activation-5.82e-02

Pos logit contributions
These+0.029
 languages+0.029
Always+0.028
If+0.028
Class+0.027
Neg logit contributions
sing-0.027
fish-0.026
 footing-0.024
 d-0.024
 spine-0.024
Token took
Token ID1718
Feature activation-9.11e-02

Pos logit contributions
 languages+0.119
These+0.119
Always+0.115
If+0.114
Class+0.112
Neg logit contributions
fish-0.082
sing-0.081
 d-0.078
ies-0.076
less-0.073
Token Daisy
Token ID40355
Feature activation-4.59e-02

Pos logit contributions
 languages+0.178
Always+0.177
These+0.176
If+0.174
Class+0.166
Neg logit contributions
sing-0.130
 d-0.128
fish-0.127
ies-0.123
less-0.118
Token to
Token ID284
Feature activation-4.85e-02

Pos logit contributions
These+0.101
 languages+0.100
Always+0.099
If+0.098
Class+0.096
Neg logit contributions
sing-0.058
fish-0.056
 d-0.052
ies-0.051
 footing-0.050
Token school
Token ID1524
Feature activation-3.64e-02

Pos logit contributions
 languages+0.094
These+0.093
Always+0.091
If+0.091
Class+0.088
Neg logit contributions
sing-0.074
fish-0.071
 d-0.069
orse-0.065
 footing-0.065
Token one
Token ID530
Feature activation-2.80e-03

Pos logit contributions
These+0.087
 languages+0.086
Always+0.086
If+0.085
June+0.083
Neg logit contributions
sing-0.037
fish-0.037
 d-0.035
ies-0.033
less-0.032
Token day
Token ID1110
Feature activation+1.48e-02

Pos logit contributions
These+0.006
Always+0.006
 languages+0.006
If+0.006
Class+0.006
Neg logit contributions
sing-0.003
fish-0.003
 d-0.003
less-0.003
ies-0.003
Token,
Token ID11
Feature activation+3.04e-03

Pos logit contributions
 languages+0.065
If+0.063
These+0.063
 Friendship+0.062
Would+0.062
Neg logit contributions
sing-0.047
 footing-0.041
 spine-0.041
fish-0.040
orse-0.038
Token\n
Token ID198
Feature activation-6.08e-03

Pos logit contributions
fish+0.006
sing+0.006
 footing+0.006
ies+0.005
kin+0.005
Neg logit contributions
These-0.005
If-0.004
Always-0.004
 languages-0.004
Would-0.004
Token"
Token ID1
Feature activation-6.53e-02

Pos logit contributions
 languages+0.012
These+0.011
Always+0.011
 magazines+0.011
aries+0.011
Neg logit contributions
fish-0.008
orse-0.008
 d-0.008
ble-0.008
ank-0.008
TokenCan
Token ID6090
Feature activation-4.69e-02

Pos logit contributions
 languages+0.125
These+0.113
aries+0.113
 alphabet+0.111
 Share+0.109
Neg logit contributions
fish-0.103
less-0.096
 d-0.096
sing-0.092
ipop-0.089
Token you
Token ID345
Feature activation-2.67e-02

Pos logit contributions
 languages+0.093
These+0.093
Always+0.090
If+0.089
Class+0.088
Neg logit contributions
fish-0.068
sing-0.068
 d-0.063
orse-0.061
less-0.061
Token lend
Token ID22096
Feature activation-9.08e-02

Pos logit contributions
These+0.045
Always+0.045
 languages+0.044
If+0.043
Class+0.042
Neg logit contributions
sing-0.047
fish-0.046
ies-0.043
 footing-0.043
 spine-0.043
Token me
Token ID502
Feature activation-4.15e-02

Pos logit contributions
Always+0.164
These+0.163
 languages+0.160
If+0.158
Class+0.156
Neg logit contributions
fish-0.140
 d-0.139
ies-0.133
sing-0.133
less-0.127
Token a
Token ID257
Feature activation-3.45e-02

Pos logit contributions
These+0.073
 languages+0.071
Always+0.070
If+0.069
Class+0.065
Neg logit contributions
sing-0.075
fish-0.071
 spine-0.067
orse-0.067
 footing-0.066
Token hand
Token ID1021
Feature activation-1.94e-02

Pos logit contributions
 alphabet+0.058
 languages+0.057
Will+0.055
 Wisdom+0.055
These+0.055
Neg logit contributions
sing-0.058
fish-0.055
ies-0.053
orse-0.053
 footing-0.053
Token?
Token ID30
Feature activation-4.09e-02

Pos logit contributions
Always+0.035
These+0.035
 languages+0.035
If+0.034
Class+0.033
Neg logit contributions
sing-0.031
fish-0.030
 footing-0.028
 d-0.028
orse-0.028
Token I
Token ID314
Feature activation-1.90e-02

Pos logit contributions
 languages+0.075
These+0.072
Class+0.069
Always+0.066
Anne+0.065
Neg logit contributions
sing-0.063
ipop-0.060
less-0.059
 d-0.059
fish-0.058
Token truck
Token ID7779
Feature activation-1.80e-02

Pos logit contributions
fish+0.012
sing+0.012
 footing+0.011
 d+0.011
orse+0.011
Neg logit contributions
These-0.015
 languages-0.015
Always-0.014
If-0.014
Class-0.014
Token and
Token ID290
Feature activation+2.13e-02

Pos logit contributions
These+0.038
 languages+0.037
Always+0.037
If+0.036
Class+0.036
Neg logit contributions
sing-0.025
fish-0.024
 d-0.022
ies-0.022
 footing-0.021
Token their
Token ID511
Feature activation-3.28e-03

Pos logit contributions
sing+0.037
fish+0.036
less+0.033
 footing+0.033
kin+0.033
Neg logit contributions
These-0.037
 languages-0.035
Always-0.035
If-0.035
Class-0.035
Token chief
Token ID4039
Feature activation-2.42e-02

Pos logit contributions
These+0.006
 languages+0.006
Always+0.006
If+0.006
Class+0.006
Neg logit contributions
fish-0.005
sing-0.005
ies-0.005
 d-0.005
orse-0.005
Token said
Token ID531
Feature activation-3.80e-02

Pos logit contributions
These+0.046
 languages+0.046
Always+0.045
If+0.044
Class+0.043
Neg logit contributions
sing-0.038
fish-0.037
 d-0.035
ies-0.034
 footing-0.034
Token "
Token ID366
Feature activation-9.08e-02

Pos logit contributions
 languages+0.075
Will+0.072
These+0.071
 Share+0.071
 Friendship+0.071
Neg logit contributions
sing-0.059
 spine-0.054
 footing-0.054
fish-0.053
 grip-0.050
TokenWe
Token ID1135
Feature activation-4.47e-02

Pos logit contributions
 languages+0.128
 Friendship+0.103
 magazines+0.102
 alphabet+0.102
 reads+0.101
Neg logit contributions
fish-0.187
 d-0.185
less-0.182
aa-0.178
ies-0.175
Token need
Token ID761
Feature activation-5.41e-02

Pos logit contributions
These+0.076
 languages+0.074
Always+0.073
If+0.072
Class+0.071
Neg logit contributions
sing-0.080
fish-0.079
 d-0.074
ies-0.073
 footing-0.072
Token to
Token ID284
Feature activation-5.17e-02

Pos logit contributions
These+0.120
 languages+0.119
Always+0.117
If+0.116
Class+0.114
Neg logit contributions
sing-0.067
fish-0.066
 d-0.061
ies-0.059
 footing-0.059
Token quit
Token ID11238
Feature activation-2.34e-02

Pos logit contributions
These+0.099
 languages+0.097
Always+0.096
If+0.095
Class+0.093
Neg logit contributions
sing-0.081
fish-0.079
 d-0.073
ies-0.072
 footing-0.072
Token this
Token ID428
Feature activation-1.43e-02

Pos logit contributions
These+0.045
 languages+0.044
Always+0.044
If+0.043
Class+0.042
Neg logit contributions
sing-0.036
fish-0.036
 d-0.033
ies-0.032
 footing-0.032
Token bird
Token ID6512
Feature activation-1.54e-02

Pos logit contributions
These+0.026
If+0.026
Anne+0.026
Always+0.025
 languages+0.025
Neg logit contributions
sing-0.020
 d-0.020
fish-0.020
ies-0.019
 raft-0.018
Token.
Token ID13
Feature activation-3.29e-02

Pos logit contributions
These+0.032
 languages+0.031
Always+0.031
If+0.030
Class+0.030
Neg logit contributions
sing-0.022
fish-0.022
 d-0.020
ies-0.020
 footing-0.019
Token The
Token ID383
Feature activation-2.43e-02

Pos logit contributions
 languages+0.062
Anne+0.061
June+0.061
Class+0.061
Always+0.061
Neg logit contributions
sing-0.048
 d-0.046
orse-0.043
 tum-0.042
sed-0.041
Token bird
Token ID6512
Feature activation-2.83e-02

Pos logit contributions
If+0.047
 languages+0.047
These+0.047
Always+0.047
Anne+0.046
Neg logit contributions
sing-0.037
fish-0.036
 d-0.036
ies-0.034
 raft-0.034
Token said
Token ID531
Feature activation-3.86e-02

Pos logit contributions
These+0.056
 languages+0.055
If+0.053
Always+0.053
Class+0.052
Neg logit contributions
fish-0.043
sing-0.042
 d-0.040
ies-0.039
orse-0.039
Token "
Token ID366
Feature activation-9.07e-02

Pos logit contributions
 languages+0.089
These+0.088
Always+0.087
If+0.086
Class+0.084
Neg logit contributions
sing-0.046
fish-0.043
 footing-0.041
 spine-0.040
ies-0.038
Tokenthank
Token ID40716
Feature activation-4.37e-02

Pos logit contributions
 languages+0.155
 parks+0.133
 magazines+0.133
 lawyers+0.128
 alphabet+0.127
Neg logit contributions
fish-0.167
 d-0.151
orse-0.148
less-0.148
ies-0.143
Token you
Token ID345
Feature activation-3.19e-02

Pos logit contributions
 alphabet+0.107
These+0.106
 languages+0.105
Always+0.105
 channels+0.104
Neg logit contributions
fish-0.041
 d-0.037
orse-0.031
 as-0.029
ies-0.029
Token"
Token ID1
Feature activation-4.63e-02

Pos logit contributions
These+0.065
 languages+0.064
Always+0.063
If+0.062
Class+0.061
Neg logit contributions
sing-0.046
fish-0.045
 footing-0.042
 d-0.041
 spine-0.040
Token to
Token ID284
Feature activation-6.72e-02

Pos logit contributions
These+0.111
 languages+0.111
If+0.106
Always+0.106
Class+0.104
Neg logit contributions
fish-0.047
ies-0.045
 d-0.044
sing-0.044
less-0.044
Token its
Token ID663
Feature activation-9.53e-03

Pos logit contributions
These+0.125
 languages+0.122
Always+0.121
If+0.120
Class+0.117
Neg logit contributions
sing-0.111
fish-0.109
 d-0.102
ies-0.101
orse-0.099
TokenMar
Token ID7676
Feature activation-2.38e-02

Pos logit contributions
 languages+0.025
These+0.024
Always+0.023
 alphabet+0.023
aries+0.022
Neg logit contributions
 d-0.023
fish-0.023
ies-0.022
sing-0.022
 raft-0.022
Tokentha
Token ID12898
Feature activation-3.43e-02

Pos logit contributions
These+0.061
 languages+0.061
Always+0.060
If+0.059
Class+0.059
Neg logit contributions
sing-0.022
fish-0.021
 d-0.018
ies-0.018
 footing-0.017
Token smiled
Token ID13541
Feature activation+2.02e-04

Pos logit contributions
These+0.064
 languages+0.063
Always+0.063
If+0.062
Class+0.060
Neg logit contributions
sing-0.055
fish-0.053
 footing-0.050
 spine-0.049
less-0.049
Token and
Token ID290
Feature activation+1.76e-02

Pos logit contributions
sing+0.000
 footing+0.000
fish+0.000
orse+0.000
ipop+0.000
Neg logit contributions
Always-0.000
 languages-0.000
 Every-0.000
 Wisdom-0.000
These-0.000
Token said
Token ID531
Feature activation-2.93e-02

Pos logit contributions
sing+0.034
 footing+0.032
less+0.031
 spine+0.031
fish+0.030
Neg logit contributions
Always-0.029
These-0.028
 Wisdom-0.027
Hope-0.027
If-0.027
Token "
Token ID366
Feature activation-9.05e-02

Pos logit contributions
 languages+0.055
 Share+0.053
 Friendship+0.052
Will+0.052
 Wisdom+0.052
Neg logit contributions
sing-0.048
 footing-0.044
 spine-0.043
fish-0.041
orse-0.041
TokenThank
Token ID10449
Feature activation-4.57e-02

Pos logit contributions
 languages+0.147
 magazines+0.127
aries+0.126
 parks+0.125
 lawyers+0.120
Neg logit contributions
fish-0.175
less-0.159
 d-0.157
orse-0.152
ies-0.152
Token you
Token ID345
Feature activation-2.95e-02

Pos logit contributions
 languages+0.108
 alphabet+0.106
These+0.104
Always+0.102
Class+0.102
Neg logit contributions
fish-0.050
 d-0.044
less-0.040
ies-0.039
orse-0.039
Token!
Token ID0
Feature activation-2.91e-02

Pos logit contributions
 languages+0.065
These+0.063
Anne+0.062
Always+0.062
 Wisdom+0.061
Neg logit contributions
fish-0.037
sing-0.036
 footing-0.034
orse-0.032
 spine-0.031
Token I
Token ID314
Feature activation-2.38e-02

Pos logit contributions
These+0.059
 languages+0.059
Always+0.054
Class+0.054
If+0.052
Neg logit contributions
sing-0.042
less-0.039
 d-0.038
ipop-0.038
fish-0.038
Token would
Token ID561
Feature activation-3.10e-02

Pos logit contributions
Always+0.036
These+0.033
If+0.032
Will+0.032
 languages+0.031
Neg logit contributions
sing-0.047
 spine-0.045
 footing-0.045
fish-0.045
 raft-0.044
Token likes
Token ID7832
Feature activation-5.39e-02

Pos logit contributions
These+0.031
 languages+0.030
Always+0.030
If+0.029
Class+0.028
Neg logit contributions
sing-0.044
fish-0.043
 d-0.042
ies-0.040
orse-0.040
Token them
Token ID606
Feature activation-2.79e-02

Pos logit contributions
If+0.119
Always+0.118
These+0.117
Hope+0.112
 languages+0.110
Neg logit contributions
 d-0.067
fish-0.067
ies-0.063
sing-0.062
 spl-0.059
Token all
Token ID477
Feature activation-1.76e-02

Pos logit contributions
These+0.065
 languages+0.064
If+0.064
Always+0.064
Class+0.063
Neg logit contributions
sing-0.032
fish-0.031
 d-0.029
ies-0.028
less-0.026
Token and
Token ID290
Feature activation+6.61e-04

Pos logit contributions
These+0.041
Always+0.041
 languages+0.040
If+0.040
Class+0.039
Neg logit contributions
sing-0.020
fish-0.019
 d-0.018
ies-0.017
less-0.017
Token says
Token ID1139
Feature activation-3.97e-02

Pos logit contributions
sing+0.001
fish+0.001
 d+0.001
ies+0.001
orse+0.001
Neg logit contributions
These-0.001
Always-0.001
If-0.001
 languages-0.001
Hope-0.001
Token "
Token ID366
Feature activation-9.04e-02

Pos logit contributions
These+0.091
 languages+0.089
Always+0.088
If+0.088
Class+0.086
Neg logit contributions
sing-0.047
fish-0.047
 d-0.043
ies-0.041
orse-0.041
Tokenmine
Token ID3810
Feature activation-3.02e-02

Pos logit contributions
 languages+0.130
Always+0.111
 parks+0.107
If+0.106
Anne+0.105
Neg logit contributions
fish-0.182
less-0.173
sing-0.173
 d-0.173
ipop-0.169
Token,
Token ID11
Feature activation-1.62e-02

Pos logit contributions
These+0.057
 languages+0.056
Always+0.055
If+0.054
Class+0.053
Neg logit contributions
sing-0.048
fish-0.047
 footing-0.043
 d-0.043
 spine-0.042
Token mine
Token ID6164
Feature activation-1.46e-02

Pos logit contributions
These+0.034
If+0.034
Always+0.034
 languages+0.033
Class+0.032
Neg logit contributions
sing-0.022
fish-0.022
 d-0.021
orse-0.020
ies-0.019
Token,
Token ID11
Feature activation-1.62e-02

Pos logit contributions
 languages+0.030
These+0.030
Always+0.029
If+0.029
Class+0.028
Neg logit contributions
sing-0.021
fish-0.020
orse-0.019
 spine-0.019
 footing-0.019
Token mine
Token ID6164
Feature activation-1.00e-02

Pos logit contributions
These+0.033
 languages+0.032
Always+0.032
If+0.032
Class+0.031
Neg logit contributions
sing-0.024
fish-0.023
 d-0.022
ies-0.021
orse-0.021
Token time
Token ID640
Feature activation+1.74e-02

Pos logit contributions
These+0.020
 languages+0.019
If+0.019
Always+0.019
Class+0.018
Neg logit contributions
sing-0.017
fish-0.017
 d-0.016
ies-0.015
less-0.015
Token.
Token ID13
Feature activation-2.94e-02

Pos logit contributions
fish+0.030
sing+0.030
 raft+0.029
kin+0.028
 thirst+0.027
Neg logit contributions
 languages-0.028
 alphabet-0.027
These-0.027
 Every-0.026
 Friendship-0.026
Token She
Token ID1375
Feature activation-9.72e-03

Pos logit contributions
 languages+0.038
These+0.035
Always+0.035
If+0.033
June+0.033
Neg logit contributions
sing-0.065
 d-0.062
fish-0.061
less-0.060
ies-0.060
Token said
Token ID531
Feature activation-4.88e-02

Pos logit contributions
These+0.018
 languages+0.017
Always+0.017
If+0.017
Class+0.017
Neg logit contributions
sing-0.016
fish-0.016
 footing-0.014
 spine-0.014
less-0.014
Token,
Token ID11
Feature activation-1.87e-02

Pos logit contributions
 languages+0.100
These+0.100
If+0.097
Always+0.096
Class+0.095
Neg logit contributions
sing-0.072
fish-0.065
 spine-0.063
 footing-0.062
ies-0.060
Token "
Token ID366
Feature activation-9.04e-02

Pos logit contributions
These+0.030
 languages+0.029
Always+0.028
If+0.028
Class+0.028
Neg logit contributions
sing-0.035
fish-0.034
 footing-0.032
ies-0.032
 d-0.031
TokenLet
Token ID5756
Feature activation-6.14e-02

Pos logit contributions
 languages+0.123
 Wisdom+0.096
aries+0.096
 Share+0.094
 Friendship+0.093
Neg logit contributions
less-0.196
 d-0.195
fish-0.182
sing-0.180
ies-0.177
Token's
Token ID338
Feature activation-3.64e-02

Pos logit contributions
These+0.073
 languages+0.072
Always+0.069
If+0.068
Class+0.065
Neg logit contributions
sing-0.140
fish-0.139
 d-0.133
ies-0.131
orse-0.130
Token get
Token ID651
Feature activation-4.99e-02

Pos logit contributions
 languages+0.083
If+0.081
These+0.080
Class+0.079
Always+0.078
Neg logit contributions
sing-0.043
 d-0.041
fish-0.041
 raft-0.040
 shovel-0.038
Token your
Token ID534
Feature activation-2.21e-02

Pos logit contributions
These+0.094
 languages+0.091
Always+0.090
If+0.089
Class+0.087
Neg logit contributions
sing-0.082
fish-0.080
 footing-0.073
 d-0.072
 spine-0.072
Token books
Token ID3835
Feature activation-1.20e-02

Pos logit contributions
These+0.039
Always+0.039
If+0.038
 languages+0.038
Class+0.037
Neg logit contributions
sing-0.039
fish-0.037
 d-0.036
ipop-0.035
 footing-0.035
Token She
Token ID1375
Feature activation-6.36e-03

Pos logit contributions
 languages+0.043
These+0.041
Always+0.039
If+0.039
Class+0.038
Neg logit contributions
sing-0.051
 d-0.049
 spine-0.048
less-0.048
ipop-0.048
Token hugged
Token ID41130
Feature activation-5.05e-02

Pos logit contributions
These+0.012
 languages+0.012
Always+0.011
If+0.011
Class+0.011
Neg logit contributions
sing-0.010
fish-0.010
 d-0.009
ies-0.009
 footing-0.009
Token him
Token ID683
Feature activation-3.34e-02

Pos logit contributions
 languages+0.100
These+0.099
Always+0.097
Class+0.096
If+0.094
Neg logit contributions
 d-0.071
sing-0.070
fish-0.069
ies-0.069
 as-0.063
Token and
Token ID290
Feature activation+1.68e-02

Pos logit contributions
These+0.074
 languages+0.073
Always+0.071
If+0.071
Class+0.070
Neg logit contributions
sing-0.043
fish-0.042
 d-0.039
ies-0.038
 footing-0.037
Token said
Token ID531
Feature activation-3.15e-02

Pos logit contributions
sing+0.029
fish+0.027
 spine+0.026
 footing+0.026
less+0.025
Neg logit contributions
These-0.030
Always-0.030
 languages-0.029
If-0.029
Class-0.028
Token "
Token ID366
Feature activation-9.04e-02

Pos logit contributions
 languages+0.067
These+0.066
Always+0.065
If+0.065
Will+0.064
Neg logit contributions
sing-0.046
 spine-0.040
fish-0.040
 footing-0.040
orse-0.036
Tokenthank
Token ID40716
Feature activation-4.84e-02

Pos logit contributions
 languages+0.155
 alphabet+0.125
 parks+0.124
 Friendship+0.124
 Learn+0.123
Neg logit contributions
fish-0.163
less-0.162
 d-0.160
sing-0.154
ies-0.153
Token you
Token ID345
Feature activation-3.25e-02

Pos logit contributions
 languages+0.118
Always+0.113
 alphabet+0.113
These+0.112
If+0.110
Neg logit contributions
fish-0.043
 d-0.042
less-0.036
orse-0.035
ies-0.034
Token!"
Token ID2474
Feature activation-3.36e-02

Pos logit contributions
These+0.064
 languages+0.063
Always+0.062
Class+0.061
If+0.060
Neg logit contributions
sing-0.048
fish-0.048
 footing-0.045
 spine-0.042
 flo-0.042
Token The
Token ID383
Feature activation-3.30e-02

Pos logit contributions
 languages+0.051
These+0.050
If+0.048
Always+0.047
Class+0.047
Neg logit contributions
sing-0.066
fish-0.065
 d-0.062
ies-0.061
orse-0.061
Token family
Token ID1641
Feature activation-2.55e-02

Pos logit contributions
Always+0.081
If+0.079
These+0.077
Class+0.075
 languages+0.075
Neg logit contributions
sing-0.036
 d-0.035
fish-0.031
ies-0.030
 spine-0.029
Token Jane
Token ID12091
Feature activation-2.04e-02

Pos logit contributions
 languages+0.054
Always+0.054
These+0.053
June+0.051
Class+0.050
Neg logit contributions
sing-0.062
 d-0.059
fish-0.056
less-0.056
 shovel-0.055
Token was
Token ID373
Feature activation-1.35e-02

Pos logit contributions
These+0.039
 languages+0.039
Always+0.037
If+0.037
Class+0.037
Neg logit contributions
sing-0.032
fish-0.032
 d-0.030
ies-0.029
orse-0.029
Token happy
Token ID3772
Feature activation+1.33e-02

Pos logit contributions
These+0.030
 languages+0.030
Always+0.030
If+0.029
Class+0.029
Neg logit contributions
sing-0.017
fish-0.017
 d-0.015
ies-0.015
 footing-0.014
Token and
Token ID290
Feature activation+2.06e-02

Pos logit contributions
sing+0.023
fish+0.022
 footing+0.020
 raft+0.020
orse+0.020
Neg logit contributions
Always-0.024
These-0.023
 languages-0.023
If-0.022
Class-0.022
Token said
Token ID531
Feature activation-3.41e-02

Pos logit contributions
sing+0.041
fish+0.041
 spine+0.038
 footing+0.038
OO+0.037
Neg logit contributions
Always-0.031
These-0.030
 Wisdom-0.028
Would-0.028
If-0.028
Token "
Token ID366
Feature activation-9.01e-02

Pos logit contributions
 languages+0.075
These+0.074
If+0.072
Always+0.072
 Friendship+0.071
Neg logit contributions
sing-0.046
fish-0.042
 footing-0.040
 spine-0.040
orse-0.037
Tokenthank
Token ID40716
Feature activation-4.48e-02

Pos logit contributions
 languages+0.170
 Learn+0.138
aries+0.138
 films+0.136
 magazines+0.136
Neg logit contributions
less-0.155
 d-0.152
ies-0.146
fish-0.144
sing-0.141
Token you
Token ID345
Feature activation-3.18e-02

Pos logit contributions
 languages+0.105
Always+0.103
These+0.100
 alphabet+0.099
If+0.099
Neg logit contributions
fish-0.046
 d-0.045
less-0.042
ipop-0.040
sing-0.039
Token"
Token ID1
Feature activation-4.76e-02

Pos logit contributions
These+0.068
 languages+0.067
Always+0.065
Class+0.064
If+0.064
Neg logit contributions
sing-0.043
fish-0.043
 footing-0.039
orse-0.037
 spine-0.037
Token to
Token ID284
Feature activation-6.72e-02

Pos logit contributions
These+0.111
 languages+0.110
Always+0.107
If+0.106
Class+0.104
Neg logit contributions
sing-0.052
fish-0.052
ies-0.049
 d-0.048
less-0.047
Token her
Token ID607
Feature activation-1.97e-02

Pos logit contributions
These+0.125
 languages+0.124
Always+0.122
If+0.121
Class+0.118
Neg logit contributions
sing-0.109
fish-0.104
 d-0.103
ies-0.101
 footing-0.097
Token and
Token ID290
Feature activation+1.80e-02

Pos logit contributions
These+0.036
 languages+0.035
Always+0.034
 alphabet+0.033
Will+0.033
Neg logit contributions
sing-0.043
fish-0.040
orse-0.039
 thirst-0.039
 footing-0.039
Token a
Token ID257
Feature activation-8.03e-03

Pos logit contributions
sing+0.031
fish+0.031
 footing+0.029
less+0.028
 spine+0.028
Neg logit contributions
These-0.032
 languages-0.031
Always-0.030
If-0.030
Class-0.029
Token tag
Token ID7621
Feature activation-1.15e-02

Pos logit contributions
 languages+0.014
These+0.014
 alphabet+0.013
Will+0.013
Always+0.013
Neg logit contributions
sing-0.013
fish-0.013
 footing-0.012
ies-0.012
orse-0.012
Token that
Token ID326
Feature activation+8.47e-04

Pos logit contributions
 languages+0.021
These+0.021
Always+0.020
 alphabet+0.020
If+0.020
Neg logit contributions
sing-0.018
fish-0.017
orse-0.016
 footing-0.016
 spine-0.016
Token said
Token ID531
Feature activation-4.06e-02

Pos logit contributions
sing+0.001
fish+0.001
 d+0.001
ies+0.001
 footing+0.001
Neg logit contributions
These-0.002
 languages-0.002
Always-0.002
If-0.002
Class-0.002
Token "
Token ID366
Feature activation-9.00e-02

Pos logit contributions
 languages+0.072
These+0.072
If+0.068
Class+0.068
Always+0.068
Neg logit contributions
sing-0.069
fish-0.066
 footing-0.063
ies-0.062
 spine-0.061
TokenFor
Token ID1890
Feature activation-5.80e-02

Pos logit contributions
 languages+0.144
 Friendship+0.118
 parks+0.118
Anne+0.118
 films+0.115
Neg logit contributions
less-0.169
fish-0.163
aa-0.161
 d-0.160
ipop-0.158
Token Mom
Token ID11254
Feature activation-3.64e-02

Pos logit contributions
Always+0.113
These+0.112
If+0.111
 languages+0.109
Class+0.105
Neg logit contributions
 d-0.085
sing-0.083
fish-0.078
less-0.076
ies-0.075
Token".
Token ID1911
Feature activation-3.26e-02

Pos logit contributions
These+0.082
 languages+0.081
Always+0.080
If+0.080
Class+0.077
Neg logit contributions
sing-0.043
fish-0.042
ies-0.040
 d-0.040
less-0.039
Token Sara
Token ID24799
Feature activation-2.19e-02

Pos logit contributions
These+0.035
 languages+0.034
If+0.033
Always+0.033
Class+0.031
Neg logit contributions
sing-0.079
fish-0.079
orse-0.074
 d-0.074
ies-0.073
Token was
Token ID373
Feature activation-1.99e-02

Pos logit contributions
These+0.040
 languages+0.040
Always+0.039
If+0.039
Class+0.038
Neg logit contributions
sing-0.036
fish-0.035
 d-0.032
ies-0.032
 footing-0.032
Token approached
Token ID10448
Feature activation-5.69e-02

Pos logit contributions
sing+0.009
fish+0.009
 d+0.008
ies+0.008
 footing+0.008
Neg logit contributions
These-0.011
 languages-0.011
Always-0.010
If-0.010
Class-0.010
Token the
Token ID262
Feature activation-3.23e-02

Pos logit contributions
These+0.119
 languages+0.118
Always+0.117
Class+0.115
If+0.112
Neg logit contributions
fish-0.079
sing-0.078
 d-0.076
ies-0.071
orse-0.070
Token snake
Token ID17522
Feature activation-1.85e-02

Pos logit contributions
 languages+0.068
These+0.068
Always+0.067
If+0.065
Class+0.065
Neg logit contributions
sing-0.046
fish-0.045
 d-0.042
ies-0.039
less-0.039
Token and
Token ID290
Feature activation+2.24e-02

Pos logit contributions
These+0.040
 languages+0.040
Always+0.039
If+0.039
Class+0.038
Neg logit contributions
sing-0.024
fish-0.024
 d-0.021
ies-0.021
 footing-0.021
Token said
Token ID531
Feature activation-3.03e-02

Pos logit contributions
sing+0.040
less+0.037
fish+0.037
 footing+0.037
kin+0.037
Neg logit contributions
These-0.037
If-0.037
Always-0.036
Will-0.036
Would-0.036
Token "
Token ID366
Feature activation-8.99e-02

Pos logit contributions
 languages+0.058
Will+0.056
 Wisdom+0.056
 Friendship+0.055
If+0.055
Neg logit contributions
sing-0.049
 spine-0.046
 footing-0.044
fish-0.041
 grip-0.041
Tokenhello
Token ID31373
Feature activation-4.58e-02

Pos logit contributions
 languages+0.118
aries+0.098
 Friendship+0.093
 magazines+0.093
 Books+0.090
Neg logit contributions
fish-0.206
 d-0.192
less-0.186
sing-0.186
ho-0.185
Token".
Token ID1911
Feature activation-3.23e-02

Pos logit contributions
Always+0.113
These+0.113
 languages+0.111
If+0.108
Class+0.107
Neg logit contributions
fish-0.048
 d-0.043
sing-0.042
ies-0.039
orse-0.039
Token To
Token ID1675
Feature activation-6.74e-03

Pos logit contributions
These+0.055
 languages+0.054
Always+0.053
If+0.053
Class+0.051
Neg logit contributions
sing-0.057
fish-0.057
 d-0.053
ies-0.052
 footing-0.052
Token his
Token ID465
Feature activation-2.83e-03

Pos logit contributions
These+0.013
 languages+0.013
Always+0.013
If+0.013
Class+0.013
Neg logit contributions
sing-0.010
fish-0.010
less-0.009
 footing-0.009
ies-0.009
Token surprise
Token ID5975
Feature activation-4.30e-03

Pos logit contributions
These+0.005
Always+0.005
If+0.005
Class+0.005
 languages+0.005
Neg logit contributions
sing-0.005
fish-0.005
 d-0.005
 raft-0.004
ies-0.004
Token the
Token ID262
Feature activation-1.87e-02

Pos logit contributions
These+0.083
 languages+0.081
Always+0.081
Class+0.079
If+0.078
Neg logit contributions
sing-0.067
fish-0.065
 d-0.063
 footing-0.060
ies-0.059
Token ostr
Token ID42387
Feature activation+1.27e-02

Pos logit contributions
These+0.033
 languages+0.032
If+0.031
Always+0.031
 alphabet+0.030
Neg logit contributions
kin-0.030
sing-0.030
fish-0.030
ipop-0.029
orse-0.029
Tokenich
Token ID488
Feature activation-2.24e-02

Pos logit contributions
fish+0.017
kin+0.016
sing+0.016
 footing+0.015
less+0.015
Neg logit contributions
Always-0.026
These-0.026
 languages-0.025
If-0.025
June-0.024
Token and
Token ID290
Feature activation+2.01e-02

Pos logit contributions
These+0.050
 languages+0.049
Always+0.048
If+0.048
Class+0.047
Neg logit contributions
sing-0.028
fish-0.028
 d-0.025
ies-0.025
orse-0.024
Token said
Token ID531
Feature activation-3.41e-02

Pos logit contributions
sing+0.036
 spine+0.034
 footing+0.033
less+0.033
kin+0.033
Neg logit contributions
These-0.033
 languages-0.033
Always-0.032
Will-0.032
Would-0.032
Token "
Token ID366
Feature activation-8.96e-02

Pos logit contributions
 languages+0.068
 Friendship+0.066
 Share+0.064
 Wisdom+0.063
Will+0.063
Neg logit contributions
sing-0.055
 spine-0.050
 footing-0.050
fish-0.046
ies-0.046
Tokenbang
Token ID36668
Feature activation-3.88e-02

Pos logit contributions
 languages+0.111
Anne+0.092
aries+0.089
Class+0.088
 magazines+0.088
Neg logit contributions
fish-0.207
 d-0.192
sing-0.190
less-0.190
orse-0.189
Token!".
Token ID48220
Feature activation-2.91e-02

Pos logit contributions
 languages+0.081
These+0.081
If+0.079
Class+0.078
Always+0.078
Neg logit contributions
sing-0.053
fish-0.049
 footing-0.048
 spine-0.046
ies-0.045
Token
Token ID220
Feature activation+2.92e-02

Pos logit contributions
 languages+0.039
These+0.039
Always+0.037
If+0.036
Class+0.036
Neg logit contributions
sing-0.063
fish-0.061
 d-0.059
orse-0.059
 spine-0.058
Token
Token ID198
Feature activation-4.10e-04

Pos logit contributions
sing+0.060
fish+0.059
 d+0.056
orse+0.055
ies+0.055
Neg logit contributions
These-0.042
 languages-0.041
Always-0.040
If-0.039
Class-0.038
Token
Token ID198
Feature activation-1.36e-02

Pos logit contributions
 languages+0.001
These+0.001
Always+0.001
 alphabet+0.001
 magazines+0.001
Neg logit contributions
orse-0.001
fish-0.001
 d-0.001
ble-0.001
sing-0.001
Token.
Token ID13
Feature activation+8.51e-03

Pos logit contributions
sing+0.063
fish+0.061
 d+0.054
 footing+0.054
ies+0.054
Neg logit contributions
These-0.110
 languages-0.109
Always-0.108
If-0.106
Class-0.104
Token She
Token ID1375
Feature activation+2.22e-02

Pos logit contributions
sing+0.016
fish+0.016
 d+0.016
orse+0.016
ipop+0.015
Neg logit contributions
 languages-0.013
These-0.013
Always-0.012
If-0.012
Class-0.011
Token was
Token ID373
Feature activation+1.76e-02

Pos logit contributions
sing+0.037
fish+0.036
 footing+0.034
 spine+0.034
less+0.034
Neg logit contributions
These-0.040
 languages-0.039
Always-0.039
If-0.038
Class-0.037
Token so
Token ID523
Feature activation+5.10e-02

Pos logit contributions
fish+0.025
sing+0.025
ies+0.022
 spine+0.022
 footing+0.022
Neg logit contributions
These-0.036
Always-0.036
 languages-0.036
If-0.035
Class-0.034
Token happy
Token ID3772
Feature activation+5.79e-02

Pos logit contributions
fish+0.075
sing+0.072
 spine+0.069
 footing+0.068
 raft+0.067
Neg logit contributions
These-0.105
Always-0.101
 languages-0.099
If-0.098
Anne-0.097
Token and
Token ID290
Feature activation+4.94e-02

Pos logit contributions
sing+0.115
kin+0.108
 footing+0.105
fish+0.103
 spine+0.099
Neg logit contributions
Always-0.092
If-0.085
Class-0.085
 languages-0.084
 Wisdom-0.083
Token hugged
Token ID41130
Feature activation-7.25e-03

Pos logit contributions
fish+0.095
sing+0.095
 spine+0.090
 footing+0.089
OO+0.089
Neg logit contributions
Always-0.075
These-0.074
 languages-0.071
Would-0.071
 Wisdom-0.071
Token her
Token ID607
Feature activation+1.44e-02

Pos logit contributions
These+0.013
 languages+0.013
Always+0.013
If+0.013
Class+0.012
Neg logit contributions
sing-0.012
fish-0.012
 d-0.011
ies-0.011
 footing-0.011
Token grandma
Token ID49890
Feature activation-2.57e-03

Pos logit contributions
sing+0.019
fish+0.019
orse+0.017
 footing+0.017
 d+0.017
Neg logit contributions
These-0.031
 languages-0.031
Always-0.030
If-0.030
Class-0.029
Token tightly
Token ID17707
Feature activation+4.71e-02

Pos logit contributions
These+0.006
 languages+0.005
Always+0.005
If+0.005
Class+0.005
Neg logit contributions
sing-0.003
fish-0.003
 d-0.003
 footing-0.003
ies-0.003
Token.
Token ID13
Feature activation-2.64e-03

Pos logit contributions
fish+0.066
sing+0.065
orse+0.060
ies+0.058
 footing+0.058
Neg logit contributions
These-0.095
 languages-0.095
Always-0.094
If-0.092
Class-0.089
Token ball
Token ID2613
Feature activation+7.58e-03

Pos logit contributions
fish+0.039
orse+0.038
sing+0.037
ies+0.036
 d+0.034
Neg logit contributions
These-0.045
 languages-0.043
Always-0.042
If-0.042
 alphabet-0.041
Token.
Token ID13
Feature activation-9.43e-03

Pos logit contributions
fish+0.012
sing+0.012
orse+0.012
 footing+0.011
ipop+0.011
Neg logit contributions
These-0.013
 languages-0.013
Always-0.013
If-0.013
 Wisdom-0.012
Token But
Token ID887
Feature activation+2.60e-02

Pos logit contributions
 languages+0.015
These+0.015
Always+0.014
If+0.013
Class+0.013
Neg logit contributions
sing-0.019
 d-0.017
 spine-0.017
fish-0.017
less-0.017
Token it
Token ID340
Feature activation+2.21e-02

Pos logit contributions
fish+0.051
sing+0.051
orse+0.047
ies+0.046
ower+0.046
Neg logit contributions
Will-0.037
These-0.037
Anne-0.035
 Wisdom-0.035
Always-0.035
Token was
Token ID373
Feature activation+1.83e-02

Pos logit contributions
sing+0.043
fish+0.042
 spine+0.041
orse+0.041
less+0.040
Neg logit contributions
These-0.031
Always-0.031
Will-0.031
If-0.030
 languages-0.030
Token too
Token ID1165
Feature activation+5.56e-02

Pos logit contributions
fish+0.033
orse+0.031
ies+0.030
sing+0.030
ower+0.029
Neg logit contributions
Always-0.031
These-0.030
If-0.029
 languages-0.029
 Wisdom-0.029
Token high
Token ID1029
Feature activation+4.75e-02

Pos logit contributions
sing+0.109
fish+0.109
orse+0.107
 spine+0.101
mit+0.100
Neg logit contributions
If-0.082
Always-0.078
These-0.078
Hope-0.076
 Every-0.076
Token for
Token ID329
Feature activation-1.09e-02

Pos logit contributions
orse+0.081
fish+0.076
sing+0.074
kin+0.071
 thirst+0.068
Neg logit contributions
Always-0.086
If-0.084
These-0.083
 languages-0.083
 Wisdom-0.079
Token her
Token ID607
Feature activation+2.12e-02

Pos logit contributions
These+0.019
 languages+0.019
Always+0.019
If+0.018
Class+0.018
Neg logit contributions
sing-0.019
fish-0.018
 d-0.018
ies-0.018
orse-0.017
Token.
Token ID13
Feature activation-5.89e-03

Pos logit contributions
fish+0.032
sing+0.029
orse+0.028
 d+0.028
ies+0.028
Neg logit contributions
 languages-0.040
These-0.037
 alphabet-0.037
Always-0.036
Will-0.036
Token
Token ID198
Feature activation+2.73e-02

Pos logit contributions
 languages+0.009
These+0.008
Always+0.008
Class+0.008
If+0.008
Neg logit contributions
sing-0.012
 spine-0.011
 d-0.011
less-0.011
 tum-0.011
Token a
Token ID257
Feature activation+1.46e-02

Pos logit contributions
These+0.028
 languages+0.027
Always+0.027
If+0.027
Class+0.026
Neg logit contributions
sing-0.020
fish-0.020
 d-0.019
 footing-0.018
ies-0.018
Token place
Token ID1295
Feature activation+4.49e-02

Pos logit contributions
sing+0.022
fish+0.021
ies+0.020
orse+0.020
 footing+0.020
Neg logit contributions
 languages-0.027
These-0.027
Will-0.026
Always-0.026
If-0.026
Token to
Token ID284
Feature activation-6.09e-03

Pos logit contributions
fish+0.071
sing+0.069
orse+0.067
OO+0.066
ies+0.064
Neg logit contributions
 languages-0.079
These-0.076
If-0.075
Always-0.074
 parks-0.073
Token go
Token ID467
Feature activation-6.89e-03

Pos logit contributions
These+0.011
 languages+0.011
Always+0.011
If+0.011
Class+0.011
Neg logit contributions
sing-0.010
fish-0.010
ies-0.009
 d-0.009
less-0.009
Token,
Token ID11
Feature activation+4.48e-02

Pos logit contributions
Always+0.012
 languages+0.012
These+0.012
 parks+0.011
If+0.011
Neg logit contributions
sing-0.012
fish-0.011
sed-0.011
kin-0.011
orse-0.011
Token but
Token ID475
Feature activation+4.46e-02

Pos logit contributions
fish+0.070
sing+0.065
kin+0.064
ies+0.063
ipop+0.062
Neg logit contributions
These-0.086
Always-0.084
 languages-0.083
If-0.083
Would-0.081
Token all
Token ID477
Feature activation+2.52e-02

Pos logit contributions
sing+0.061
fish+0.060
ies+0.057
less+0.056
ower+0.053
Neg logit contributions
These-0.090
If-0.089
Always-0.087
 languages-0.087
Will-0.086
Token he
Token ID339
Feature activation+3.63e-02

Pos logit contributions
sing+0.033
fish+0.032
 d+0.029
ies+0.028
 footing+0.028
Neg logit contributions
These-0.055
 languages-0.055
Always-0.054
If-0.053
Class-0.052
Token saw
Token ID2497
Feature activation-7.29e-03

Pos logit contributions
sing+0.055
fish+0.050
less+0.050
 spine+0.049
ies+0.048
Neg logit contributions
Always-0.069
If-0.068
Will-0.068
These-0.067
Would-0.066
Token was
Token ID373
Feature activation+2.07e-02

Pos logit contributions
These+0.014
 languages+0.014
Always+0.014
If+0.014
Class+0.013
Neg logit contributions
sing-0.011
fish-0.011
 d-0.010
ies-0.010
 footing-0.010
Token an
Token ID281
Feature activation+1.28e-02

Pos logit contributions
fish+0.032
sing+0.031
ies+0.029
orse+0.029
ipop+0.028
Neg logit contributions
These-0.039
 languages-0.039
Always-0.039
If-0.038
Will-0.037
Token children
Token ID1751
Feature activation+1.99e-03

Pos logit contributions
sing+0.033
fish+0.032
orse+0.030
 d+0.029
less+0.029
Neg logit contributions
 languages-0.039
These-0.039
Always-0.037
If-0.036
Class-0.035
Token safely
Token ID11512
Feature activation+3.31e-02

Pos logit contributions
sing+0.003
fish+0.003
 footing+0.003
orse+0.003
sed+0.003
Neg logit contributions
 languages-0.004
These-0.004
Always-0.004
Class-0.004
Hope-0.004
Token.
Token ID13
Feature activation+5.23e-04

Pos logit contributions
sing+0.049
fish+0.046
orse+0.043
 footing+0.043
ipop+0.042
Neg logit contributions
 languages-0.065
Always-0.064
These-0.064
Class-0.061
If-0.061
Token She
Token ID1375
Feature activation+3.11e-02

Pos logit contributions
sing+0.001
 spine+0.001
 d+0.001
sed+0.001
orse+0.001
Neg logit contributions
 languages-0.001
These-0.001
Always-0.001
aries-0.001
 alphabet-0.001
Token cried
Token ID16896
Feature activation+2.19e-02

Pos logit contributions
sing+0.048
fish+0.047
 d+0.044
ies+0.043
 footing+0.043
Neg logit contributions
These-0.060
 languages-0.060
Always-0.058
If-0.058
Class-0.057
Token too
Token ID1165
Feature activation+5.82e-02

Pos logit contributions
sing+0.030
fish+0.030
 d+0.027
 footing+0.027
orse+0.027
Neg logit contributions
These-0.046
 languages-0.045
Always-0.045
If-0.044
Class-0.043
Token.
Token ID13
Feature activation-3.11e-04

Pos logit contributions
fish+0.082
sing+0.080
orse+0.072
ipop+0.071
ies+0.071
Neg logit contributions
 languages-0.121
These-0.119
Always-0.118
If-0.118
Class-0.113
Token<|endoftext|>
Token ID50256
Feature activation-2.77e-02

Pos logit contributions
 languages+0.001
These+0.001
Always+0.001
aries+0.000
Class+0.000
Neg logit contributions
sing-0.001
 spine-0.000
 d-0.000
sed-0.000
 tum-0.000
TokenL
Token ID43
Feature activation-4.68e-03

Pos logit contributions
 languages+0.062
aries+0.059
If+0.058
June+0.057
Class+0.057
Neg logit contributions
fish-0.034
 d-0.030
kin-0.030
ble-0.027
 upon-0.026
Tokenily
Token ID813
Feature activation-8.54e-03

Pos logit contributions
These+0.008
 languages+0.008
Always+0.008
If+0.008
Class+0.007
Neg logit contributions
sing-0.008
fish-0.008
 d-0.008
ies-0.008
 footing-0.008
Token and
Token ID290
Feature activation+2.12e-02

Pos logit contributions
These+0.017
 languages+0.017
Always+0.016
If+0.016
Class+0.016
Neg logit contributions
sing-0.013
fish-0.012
 d-0.012
ies-0.011
 footing-0.011
Token the
Token ID262
Feature activation+1.02e-02

Pos logit contributions
These+0.052
 languages+0.051
If+0.050
Always+0.049
Class+0.048
Neg logit contributions
sing-0.053
fish-0.052
 footing-0.048
 spine-0.048
ies-0.047
Token blood
Token ID2910
Feature activation+2.25e-02

Pos logit contributions
fish+0.017
sing+0.017
orse+0.016
ies+0.016
 footing+0.016
Neg logit contributions
These-0.018
 languages-0.018
Always-0.017
If-0.017
Class-0.016
Token.
Token ID13
Feature activation-1.54e-02

Pos logit contributions
sing+0.030
fish+0.030
 d+0.027
ies+0.027
 footing+0.026
Neg logit contributions
These-0.048
 languages-0.048
Always-0.047
If-0.046
Class-0.046
Token He
Token ID679
Feature activation+4.94e-03

Pos logit contributions
 languages+0.027
These+0.026
Always+0.025
If+0.025
Class+0.025
Neg logit contributions
sing-0.027
fish-0.026
 d-0.025
orse-0.025
 spine-0.024
Token is
Token ID318
Feature activation+1.51e-02

Pos logit contributions
sing+0.011
fish+0.010
less+0.010
 footing+0.010
 spine+0.010
Neg logit contributions
These-0.007
 languages-0.007
Always-0.006
If-0.006
Class-0.006
Token very
Token ID845
Feature activation+4.46e-02

Pos logit contributions
fish+0.025
ipop+0.023
 spine+0.023
sing+0.023
ies+0.023
Neg logit contributions
These-0.027
Always-0.026
 languages-0.026
Will-0.025
If-0.025
Token scared
Token ID12008
Feature activation+2.96e-02

Pos logit contributions
 spine+0.085
fish+0.085
 footing+0.083
sing+0.082
ipop+0.082
Neg logit contributions
These-0.055
 languages-0.055
 asks-0.055
Will-0.054
 Share-0.054
Token.
Token ID13
Feature activation-1.20e-02

Pos logit contributions
sing+0.040
fish+0.038
 footing+0.034
 d+0.034
orse+0.034
Neg logit contributions
These-0.064
 languages-0.064
Always-0.063
If-0.062
Class-0.061
Token
Token ID198
Feature activation+2.39e-02

Pos logit contributions
 languages+0.021
These+0.020
Always+0.020
Class+0.019
If+0.019
Neg logit contributions
sing-0.021
fish-0.021
 d-0.020
orse-0.020
 spine-0.019
Token
Token ID198
Feature activation+1.99e-02

Pos logit contributions
fish+0.048
sing+0.047
 d+0.045
orse+0.045
ies+0.044
Neg logit contributions
 languages-0.035
These-0.035
Always-0.034
If-0.032
Class-0.032
Token"
Token ID1
Feature activation-3.25e-02

Pos logit contributions
sing+0.037
fish+0.036
 d+0.034
ies+0.034
 footing+0.034
Neg logit contributions
These-0.033
 languages-0.032
Always-0.031
If-0.031
Class-0.030
Token back
Token ID736
Feature activation+3.48e-03

Pos logit contributions
These+0.006
Always+0.006
Will+0.005
If+0.005
 languages+0.005
Neg logit contributions
sing-0.006
 spine-0.006
fish-0.006
OO-0.006
kin-0.006
Token out
Token ID503
Feature activation+3.39e-02

Pos logit contributions
sing+0.006
fish+0.005
orse+0.005
kin+0.005
 spine+0.005
Neg logit contributions
 languages-0.007
These-0.007
Always-0.007
If-0.006
 parks-0.006
Token!
Token ID0
Feature activation-2.77e-03

Pos logit contributions
sing+0.060
fish+0.056
orse+0.052
 spine+0.051
kin+0.050
Neg logit contributions
These-0.059
 languages-0.059
Always-0.058
If-0.056
Will-0.055
Token The
Token ID383
Feature activation-6.26e-03

Pos logit contributions
 languages+0.004
These+0.004
Always+0.004
Class+0.004
If+0.004
Neg logit contributions
sing-0.005
 d-0.005
 spine-0.005
ipop-0.005
less-0.005
Token camp
Token ID1413
Feature activation+1.34e-02

Pos logit contributions
These+0.011
 languages+0.011
Always+0.011
If+0.011
Class+0.011
Neg logit contributions
sing-0.011
fish-0.010
orse-0.009
 footing-0.009
ies-0.009
Token director
Token ID3437
Feature activation-5.91e-03

Pos logit contributions
sing+0.023
fish+0.021
 spine+0.021
 footing+0.020
ies+0.020
Neg logit contributions
These-0.023
 languages-0.023
Always-0.022
If-0.021
Class-0.021
Token reassured
Token ID48171
Feature activation-1.87e-02

Pos logit contributions
These+0.011
 languages+0.011
Always+0.011
If+0.010
Class+0.010
Neg logit contributions
sing-0.010
fish-0.009
 footing-0.009
 spine-0.009
ies-0.008
Token her
Token ID607
Feature activation-8.95e-03

Pos logit contributions
These+0.032
 languages+0.031
Always+0.031
If+0.030
Class+0.030
Neg logit contributions
sing-0.032
fish-0.032
 d-0.031
ies-0.030
less-0.030
Token that
Token ID326
Feature activation-2.50e-02

Pos logit contributions
 languages+0.017
Always+0.017
These+0.016
Will+0.016
If+0.016
Neg logit contributions
sing-0.015
fish-0.013
 spine-0.012
orse-0.012
 footing-0.012
Token there
Token ID612
Feature activation-1.43e-02

Pos logit contributions
These+0.054
 languages+0.053
Always+0.052
If+0.052
Class+0.051
Neg logit contributions
fish-0.032
sing-0.032
 d-0.030
ies-0.029
 footing-0.029
Token would
Token ID561
Feature activation-9.40e-03

Pos logit contributions
These+0.028
 languages+0.027
If+0.027
Always+0.027
Class+0.027
Neg logit contributions
fish-0.021
sing-0.021
 d-0.020
ies-0.020
 footing-0.019
Token and
Token ID290
Feature activation+3.97e-02

Pos logit contributions
sing+0.002
fish+0.002
 d+0.002
ies+0.002
 footing+0.002
Neg logit contributions
These-0.003
 languages-0.003
Always-0.003
If-0.003
Class-0.003
Token slides
Token ID19392
Feature activation+2.18e-02

Pos logit contributions
sing+0.062
fish+0.062
less+0.057
 footing+0.055
ies+0.055
Neg logit contributions
These-0.076
 languages-0.073
Always-0.072
If-0.071
Class-0.071
Token instead
Token ID2427
Feature activation+3.12e-02

Pos logit contributions
sing+0.040
orse+0.039
fish+0.039
OO+0.037
mit+0.036
Neg logit contributions
These-0.032
 parks-0.031
 languages-0.030
Always-0.029
If-0.029
Token.
Token ID13
Feature activation-7.89e-03

Pos logit contributions
sing+0.048
fish+0.047
 spine+0.042
orse+0.042
ies+0.042
Neg logit contributions
These-0.060
 languages-0.059
If-0.058
Always-0.058
Class-0.056
Token They
Token ID1119
Feature activation+2.50e-02

Pos logit contributions
Always+0.014
Anne+0.014
If+0.013
 languages+0.013
These+0.013
Neg logit contributions
sing-0.015
 d-0.013
sed-0.013
 spine-0.013
 tum-0.012
Token had
Token ID550
Feature activation-6.39e-03

Pos logit contributions
sing+0.045
fish+0.043
 spine+0.042
less+0.042
 footing+0.041
Neg logit contributions
These-0.043
Always-0.042
If-0.040
 languages-0.040
Class-0.039
Token so
Token ID523
Feature activation+4.80e-02

Pos logit contributions
These+0.013
 languages+0.013
Always+0.013
If+0.013
Class+0.012
Neg logit contributions
sing-0.009
fish-0.009
 d-0.008
 footing-0.008
ies-0.008
Token much
Token ID881
Feature activation+3.18e-02

Pos logit contributions
sing+0.057
fish+0.056
 d+0.050
ies+0.050
 footing+0.049
Neg logit contributions
These-0.110
 languages-0.109
Always-0.107
If-0.106
Class-0.104
Token fun
Token ID1257
Feature activation+3.33e-02

Pos logit contributions
sing+0.057
fish+0.056
 spine+0.052
kin+0.050
 footing+0.050
Neg logit contributions
These-0.055
Always-0.054
 languages-0.054
If-0.052
Class-0.052
Token together
Token ID1978
Feature activation+2.02e-02

Pos logit contributions
sing+0.063
 spine+0.059
ipop+0.058
fish+0.058
kin+0.055
Neg logit contributions
Always-0.053
These-0.052
If-0.049
Will-0.049
 languages-0.049
Token and
Token ID290
Feature activation+4.42e-02

Pos logit contributions
sing+0.030
fish+0.028
 spine+0.026
 footing+0.026
orse+0.026
Neg logit contributions
These-0.040
Always-0.039
 languages-0.039
If-0.038
Will-0.037
Token he
Token ID339
Feature activation+1.26e-02

Pos logit contributions
sing+0.052
fish+0.049
 spine+0.047
ower+0.046
 footing+0.046
Neg logit contributions
These-0.061
Always-0.061
If-0.060
 languages-0.060
Would-0.059
Token was
Token ID373
Feature activation+1.08e-02

Pos logit contributions
sing+0.022
 spine+0.021
fish+0.020
less+0.020
 footing+0.020
Neg logit contributions
Always-0.021
Will-0.021
These-0.021
Would-0.020
If-0.020
Token wet
Token ID9583
Feature activation+2.45e-02

Pos logit contributions
fish+0.017
sing+0.016
ies+0.015
 spine+0.015
ipop+0.015
Neg logit contributions
Always-0.021
These-0.021
If-0.020
Anne-0.020
Will-0.020
Token and
Token ID290
Feature activation+4.01e-02

Pos logit contributions
sing+0.041
fish+0.040
orse+0.040
ipop+0.037
kin+0.036
Neg logit contributions
Always-0.044
These-0.043
If-0.042
 languages-0.041
Class-0.040
Token cold
Token ID4692
Feature activation+3.26e-02

Pos logit contributions
fish+0.056
sing+0.055
OO+0.054
 spine+0.052
ipop+0.050
Neg logit contributions
These-0.077
Always-0.077
Will-0.075
Would-0.074
If-0.073
Token.
Token ID13
Feature activation-1.28e-02

Pos logit contributions
sing+0.054
fish+0.052
orse+0.051
kin+0.048
ipop+0.047
Neg logit contributions
These-0.060
Always-0.059
If-0.058
 languages-0.056
Class-0.055
Token
Token ID198
Feature activation+2.19e-02

Pos logit contributions
These+0.022
 languages+0.022
Always+0.021
If+0.020
Class+0.020
Neg logit contributions
sing-0.023
fish-0.022
 d-0.021
orse-0.021
ies-0.021
Token
Token ID220
Feature activation+4.14e-02

Pos logit contributions
fish+0.042
sing+0.041
orse+0.040
 d+0.039
ies+0.038
Neg logit contributions
 languages-0.034
These-0.033
Always-0.032
If-0.031
 alphabet-0.030
Token
Token ID198
Feature activation+1.19e-02

Pos logit contributions
sing+0.095
fish+0.095
 d+0.089
ies+0.089
 footing+0.089
Neg logit contributions
These-0.049
 languages-0.047
If-0.046
Always-0.046
Class-0.044
TokenL
Token ID43
Feature activation-7.26e-04

Pos logit contributions
fish+0.021
sing+0.021
 d+0.020
orse+0.020
less+0.019
Neg logit contributions
 languages-0.020
These-0.019
Always-0.019
If-0.018
 alphabet-0.018
Tokenily
Token ID813
Feature activation-1.40e-02

Pos logit contributions
These+0.001
If+0.001
Always+0.001
 languages+0.001
Class+0.001
Neg logit contributions
sing-0.001
fish-0.001
 d-0.001
 footing-0.001
 spine-0.001
Token at
Token ID379
Feature activation-2.09e-02

Pos logit contributions
 languages+0.002
These+0.001
 parks+0.001
Always+0.001
If+0.001
Neg logit contributions
sing-0.001
orse-0.001
fish-0.001
kin-0.001
 spine-0.001
Token all
Token ID477
Feature activation+1.46e-02

Pos logit contributions
These+0.040
Always+0.040
 languages+0.039
If+0.039
Class+0.039
Neg logit contributions
sing-0.031
fish-0.031
 d-0.030
ies-0.030
 footing-0.029
Token the
Token ID262
Feature activation+2.18e-02

Pos logit contributions
fish+0.025
sing+0.024
ies+0.024
 d+0.024
less+0.023
Neg logit contributions
These-0.025
Always-0.024
If-0.024
 languages-0.024
Class-0.023
Token different
Token ID1180
Feature activation+2.10e-02

Pos logit contributions
 d+0.034
ies+0.033
fish+0.033
sing+0.032
less+0.031
Neg logit contributions
Always-0.040
If-0.040
These-0.040
June-0.039
Anne-0.039
Token flavours
Token ID46946
Feature activation+1.26e-02

Pos logit contributions
sing+0.042
fish+0.042
 d+0.040
ies+0.039
 footing+0.039
Neg logit contributions
These-0.031
 languages-0.030
Always-0.030
If-0.029
Class-0.028
Token.
Token ID13
Feature activation-8.25e-03

Pos logit contributions
sing+0.021
 spine+0.019
orse+0.018
 footing+0.018
sed+0.018
Neg logit contributions
These-0.024
 languages-0.023
Always-0.022
If-0.022
 parks-0.021
Token Her
Token ID2332
Feature activation+1.40e-02

Pos logit contributions
 languages+0.014
Always+0.014
These+0.014
If+0.013
Will+0.013
Neg logit contributions
sing-0.013
 d-0.013
less-0.012
orse-0.012
 spine-0.012
Token mom
Token ID1995
Feature activation-1.92e-02

Pos logit contributions
fish+0.020
sing+0.018
orse+0.018
 spine+0.016
ies+0.016
Neg logit contributions
 languages-0.029
These-0.028
 alphabet-0.027
Will-0.027
Always-0.027
Token asked
Token ID1965
Feature activation-4.63e-02

Pos logit contributions
These+0.040
 languages+0.040
If+0.038
Always+0.038
Class+0.037
Neg logit contributions
fish-0.027
 d-0.026
sing-0.026
ies-0.025
orse-0.024
Token,
Token ID11
Feature activation+8.73e-03

Pos logit contributions
These+0.092
Always+0.090
 languages+0.088
If+0.087
Class+0.086
Neg logit contributions
 d-0.065
ies-0.065
sing-0.065
fish-0.064
less-0.064
Token "
Token ID366
Feature activation-5.74e-02

Pos logit contributions
sing+0.014
fish+0.013
 d+0.013
ies+0.012
 footing+0.012
Neg logit contributions
These-0.017
 languages-0.016
Always-0.016
If-0.016
Class-0.016
Token and
Token ID290
Feature activation+2.18e-02

Pos logit contributions
sing+0.018
orse+0.017
fish+0.017
 raft+0.016
 footing+0.016
Neg logit contributions
 languages-0.019
Always-0.019
These-0.018
If-0.018
Class-0.017
Token cool
Token ID3608
Feature activation+9.56e-03

Pos logit contributions
fish+0.031
sing+0.030
ipop+0.028
OO+0.028
orse+0.027
Neg logit contributions
These-0.044
Always-0.044
 languages-0.041
Would-0.041
Will-0.041
Token.
Token ID13
Feature activation-2.28e-02

Pos logit contributions
sing+0.015
orse+0.015
fish+0.014
kin+0.013
 footing+0.013
Neg logit contributions
 languages-0.018
These-0.017
Always-0.017
If-0.017
Class-0.016
Token
Token ID198
Feature activation+1.14e-02

Pos logit contributions
 languages+0.039
These+0.037
Always+0.036
Class+0.035
If+0.034
Neg logit contributions
sing-0.041
orse-0.039
 d-0.039
 spine-0.038
fish-0.038
Token
Token ID198
Feature activation-6.00e-04

Pos logit contributions
fish+0.020
orse+0.020
sing+0.019
 d+0.018
 footing+0.018
Neg logit contributions
 languages-0.020
These-0.019
Always-0.018
 alphabet-0.018
Class-0.018
TokenBut
Token ID1537
Feature activation+2.03e-02

Pos logit contributions
 languages+0.001
These+0.001
Always+0.001
 alphabet+0.001
aries+0.001
Neg logit contributions
fish-0.001
sing-0.001
orse-0.001
 d-0.001
 footing-0.001
Token the
Token ID262
Feature activation-2.62e-02

Pos logit contributions
fish+0.036
sing+0.035
less+0.034
ies+0.033
ower+0.032
Neg logit contributions
These-0.033
Always-0.032
If-0.032
Will-0.032
 languages-0.031
Token sun
Token ID4252
Feature activation+6.03e-03

Pos logit contributions
 languages+0.052
These+0.052
Always+0.052
If+0.051
Class+0.050
Neg logit contributions
sing-0.040
fish-0.037
 d-0.037
 footing-0.035
less-0.035
Token was
Token ID373
Feature activation+5.27e-03

Pos logit contributions
sing+0.011
fish+0.011
 spine+0.010
 footing+0.010
less+0.010
Neg logit contributions
These-0.009
Always-0.009
Will-0.009
 languages-0.009
If-0.009
Token determined
Token ID5295
Feature activation+1.25e-02

Pos logit contributions
fish+0.009
orse+0.009
ipop+0.009
sing+0.008
mit+0.008
Neg logit contributions
Always-0.009
These-0.009
Will-0.008
If-0.008
 Every-0.008
Token.
Token ID13
Feature activation-2.09e-02

Pos logit contributions
sing+0.022
fish+0.022
kin+0.021
ipop+0.021
orse+0.020
Neg logit contributions
 languages-0.020
Always-0.020
If-0.019
These-0.019
 Wisdom-0.018
Tokenpron
Token ID31186
Feature activation+2.21e-02

Pos logit contributions
sing+0.011
fish+0.011
ies+0.010
orse+0.010
ower+0.010
Neg logit contributions
Will-0.012
Always-0.011
Hope-0.011
If-0.011
These-0.011
Token again
Token ID757
Feature activation+3.53e-02

Pos logit contributions
sing+0.036
fish+0.035
orse+0.033
ies+0.033
 d+0.032
Neg logit contributions
These-0.039
 languages-0.039
Always-0.038
If-0.037
Class-0.037
Token.
Token ID13
Feature activation-2.47e-02

Pos logit contributions
sing+0.040
fish+0.039
 d+0.035
ies+0.034
 footing+0.034
Neg logit contributions
These-0.082
 languages-0.082
Always-0.081
If-0.080
Class-0.078
Token She
Token ID1375
Feature activation+2.60e-03

Pos logit contributions
 languages+0.041
These+0.039
Always+0.038
If+0.037
Class+0.036
Neg logit contributions
sing-0.042
 d-0.041
less-0.041
 spine-0.040
orse-0.039
Token always
Token ID1464
Feature activation+1.67e-02

Pos logit contributions
sing+0.004
fish+0.004
ies+0.004
less+0.004
 spine+0.004
Neg logit contributions
These-0.005
Always-0.005
 languages-0.005
If-0.004
Class-0.004
Token used
Token ID973
Feature activation-3.49e-02

Pos logit contributions
sing+0.026
fish+0.026
 footing+0.023
ies+0.023
less+0.023
Neg logit contributions
These-0.032
 languages-0.032
Always-0.032
If-0.031
Class-0.031
Token her
Token ID607
Feature activation-9.29e-03

Pos logit contributions
These+0.068
 languages+0.067
Always+0.066
If+0.065
Class+0.065
Neg logit contributions
sing-0.052
fish-0.050
 d-0.050
less-0.048
 footing-0.048
Token a
Token ID257
Feature activation-3.96e-03

Pos logit contributions
These+0.019
 languages+0.019
Always+0.018
If+0.018
Class+0.018
Neg logit contributions
sing-0.013
fish-0.013
 d-0.012
 footing-0.012
ies-0.012
Tokenpron
Token ID31186
Feature activation+9.78e-03

Pos logit contributions
Will+0.007
These+0.007
 alphabet+0.007
Always+0.007
Hope+0.007
Neg logit contributions
sing-0.006
fish-0.006
ies-0.006
orse-0.006
 footing-0.006
Token wisely
Token ID32773
Feature activation+1.32e-02

Pos logit contributions
sing+0.015
fish+0.015
orse+0.014
ies+0.013
 d+0.013
Neg logit contributions
 languages-0.019
These-0.019
Always-0.018
If-0.018
Class-0.017
Token!
Token ID0
Feature activation-2.17e-02

Pos logit contributions
sing+0.018
fish+0.018
ies+0.016
orse+0.016
 d+0.016
Neg logit contributions
These-0.027
 languages-0.027
Always-0.027
If-0.027
Class-0.026
Token every
Token ID790
Feature activation-5.22e-03

Pos logit contributions
These+0.028
 languages+0.026
Always+0.026
If+0.025
Class+0.025
Neg logit contributions
sing-0.025
fish-0.025
 spine-0.023
 footing-0.023
orse-0.022
Token last
Token ID938
Feature activation+4.01e-03

Pos logit contributions
These+0.009
 languages+0.009
Always+0.009
If+0.009
Class+0.009
Neg logit contributions
sing-0.009
fish-0.008
ies-0.008
orse-0.008
 spine-0.008
Token cr
Token ID1067
Feature activation-1.84e-03

Pos logit contributions
sing+0.007
fish+0.007
orse+0.006
 spine+0.006
 footing+0.006
Neg logit contributions
These-0.007
 languages-0.007
Always-0.007
Class-0.007
If-0.007
Tokenumb
Token ID2178
Feature activation+8.86e-03

Pos logit contributions
These+0.002
Always+0.002
 languages+0.002
If+0.002
Class+0.002
Neg logit contributions
sing-0.004
fish-0.004
 footing-0.004
 raft-0.004
 d-0.004
Token and
Token ID290
Feature activation+2.91e-02

Pos logit contributions
sing+0.015
 footing+0.013
orse+0.012
fish+0.012
sed+0.012
Neg logit contributions
These-0.017
 languages-0.016
Always-0.016
If-0.015
Class-0.015
Token declared
Token ID6875
Feature activation-2.81e-02

Pos logit contributions
sing+0.050
fish+0.047
 spine+0.044
 footing+0.044
less+0.042
Neg logit contributions
These-0.054
 languages-0.052
Always-0.051
Class-0.050
If-0.050
Token it
Token ID340
Feature activation-2.06e-02

Pos logit contributions
These+0.057
 languages+0.057
Always+0.055
If+0.054
Class+0.054
Neg logit contributions
sing-0.043
fish-0.040
 footing-0.038
 spine-0.038
orse-0.036
Token was
Token ID373
Feature activation-1.08e-02

Pos logit contributions
These+0.036
 languages+0.035
Always+0.034
If+0.034
Class+0.033
Neg logit contributions
sing-0.036
fish-0.035
 footing-0.033
 spine-0.032
orse-0.032
Token the
Token ID262
Feature activation+5.48e-03

Pos logit contributions
These+0.019
Always+0.018
 languages+0.018
If+0.017
 alphabet+0.017
Neg logit contributions
sing-0.019
fish-0.018
 spine-0.018
 footing-0.017
orse-0.017
Token best
Token ID1266
Feature activation+1.26e-02

Pos logit contributions
sing+0.009
fish+0.009
 d+0.008
ies+0.008
 footing+0.008
Neg logit contributions
These-0.010
 languages-0.010
Always-0.010
If-0.010
Class-0.009
Token they
Token ID484
Feature activation+7.34e-03

Pos logit contributions
sing+0.022
fish+0.022
orse+0.020
 spine+0.020
 footing+0.019
Neg logit contributions
These-0.022
 languages-0.021
Always-0.021
 alphabet-0.020
Class-0.020
Token boy
Token ID2933
Feature activation-3.97e-02

Pos logit contributions
These+0.000
Always+0.000
 languages+0.000
Hope+0.000
Class+0.000
Neg logit contributions
sing-0.000
 footing-0.000
fish-0.000
 spine-0.000
orse-0.000
Token named
Token ID3706
Feature activation-3.66e-02

Pos logit contributions
These+0.080
 languages+0.080
Always+0.078
If+0.077
Class+0.076
Neg logit contributions
sing-0.057
fish-0.057
 d-0.054
ies-0.052
orse-0.052
Token Tim
Token ID5045
Feature activation-5.34e-02

Pos logit contributions
These+0.073
 languages+0.070
If+0.069
Always+0.069
Class+0.068
Neg logit contributions
fish-0.055
sing-0.055
ies-0.050
 footing-0.050
 d-0.049
Tokenmy
Token ID1820
Feature activation-3.42e-02

Pos logit contributions
These+0.065
 languages+0.064
Always+0.061
If+0.061
Class+0.059
Neg logit contributions
sing-0.120
fish-0.119
 d-0.115
ies-0.113
orse-0.111
Token wanted
Token ID2227
Feature activation-5.21e-02

Pos logit contributions
These+0.057
Always+0.055
If+0.055
 languages+0.053
Class+0.053
Neg logit contributions
sing-0.064
fish-0.062
 footing-0.060
 spine-0.059
less-0.057
Token to
Token ID284
Feature activation-4.46e-02

Pos logit contributions
These+0.127
 languages+0.127
Always+0.124
If+0.124
Class+0.121
Neg logit contributions
sing-0.054
fish-0.052
 d-0.049
less-0.046
orse-0.046
Token watch
Token ID2342
Feature activation-4.71e-02

Pos logit contributions
 languages+0.088
These+0.087
Always+0.087
If+0.086
Class+0.083
Neg logit contributions
sing-0.065
fish-0.065
 d-0.061
orse-0.060
 raft-0.060
Token a
Token ID257
Feature activation-5.34e-03

Pos logit contributions
These+0.087
 languages+0.086
Always+0.084
If+0.083
Class+0.082
Neg logit contributions
sing-0.076
fish-0.075
 d-0.069
ies-0.068
 footing-0.068
Token movie
Token ID3807
Feature activation+1.44e-03

Pos logit contributions
 languages+0.009
These+0.009
Class+0.008
Always+0.008
 alphabet+0.008
Neg logit contributions
sing-0.010
fish-0.009
 footing-0.009
 spine-0.009
ies-0.009
Token.
Token ID13
Feature activation-2.18e-02

Pos logit contributions
sing+0.002
fish+0.002
 d+0.002
ies+0.002
less+0.002
Neg logit contributions
These-0.003
 languages-0.003
Always-0.003
If-0.003
Class-0.003
Token He
Token ID679
Feature activation-1.09e-02

Pos logit contributions
 languages+0.041
Always+0.040
These+0.040
Hope+0.039
Anne+0.039
Neg logit contributions
sing-0.036
 d-0.033
less-0.032
 raft-0.032
 shovel-0.032
Token the
Token ID262
Feature activation+3.82e-03

Pos logit contributions
These+0.057
If+0.055
Always+0.054
Class+0.053
 languages+0.052
Neg logit contributions
 d-0.029
fish-0.027
sing-0.026
ies-0.026
 grip-0.026
Token kingdom
Token ID13239
Feature activation+1.14e-03

Pos logit contributions
fish+0.008
sing+0.008
 d+0.007
ies+0.007
 raft+0.007
Neg logit contributions
If-0.006
These-0.006
Always-0.005
 languages-0.005
Anne-0.005
Token.
Token ID13
Feature activation-3.03e-02

Pos logit contributions
sing+0.001
fish+0.001
 d+0.001
 footing+0.001
orse+0.001
Neg logit contributions
These-0.002
 languages-0.002
Always-0.002
If-0.002
Class-0.002
Token She
Token ID1375
Feature activation-1.08e-02

Pos logit contributions
Always+0.048
These+0.048
 languages+0.045
Will+0.044
Brian+0.042
Neg logit contributions
sing-0.054
 spine-0.054
 as-0.051
 tum-0.050
 d-0.050
Token wanted
Token ID2227
Feature activation-5.70e-02

Pos logit contributions
 languages+0.019
These+0.019
Always+0.019
If+0.018
Class+0.018
Neg logit contributions
sing-0.019
fish-0.018
less-0.017
 spine-0.017
 footing-0.017
Token to
Token ID284
Feature activation-4.33e-02

Pos logit contributions
These+0.147
Always+0.142
 languages+0.141
If+0.141
Here+0.137
Neg logit contributions
fish-0.049
 d-0.048
sing-0.048
ies-0.044
less-0.042
Token explore
Token ID7301
Feature activation-3.46e-02

Pos logit contributions
 languages+0.076
These+0.075
Always+0.074
If+0.072
Class+0.071
Neg logit contributions
sing-0.075
fish-0.073
less-0.069
ipop-0.069
 spine-0.068
Token the
Token ID262
Feature activation-9.02e-03

Pos logit contributions
These+0.073
 languages+0.070
If+0.070
Always+0.070
Class+0.069
Neg logit contributions
fish-0.048
 d-0.045
sing-0.045
ies-0.044
 footing-0.041
Token magical
Token ID10883
Feature activation-3.34e-03

Pos logit contributions
If+0.014
These+0.014
Always+0.013
June+0.013
Anne+0.013
Neg logit contributions
fish-0.018
 d-0.017
sing-0.016
ies-0.016
 spine-0.016
Token world
Token ID995
Feature activation+1.49e-03

Pos logit contributions
These+0.005
If+0.005
Always+0.005
 languages+0.005
Here+0.004
Neg logit contributions
fish-0.007
sing-0.007
ies-0.006
 d-0.006
 spine-0.006
Token around
Token ID1088
Feature activation-1.24e-02

Pos logit contributions
fish+0.002
sing+0.001
 d+0.001
ies+0.001
less+0.001
Neg logit contributions
These-0.004
Always-0.004
If-0.004
 languages-0.004
Here-0.004
Token so
Token ID523
Feature activation+4.78e-02

Pos logit contributions
sing+0.052
fish+0.051
 d+0.047
ies+0.046
 footing+0.046
Neg logit contributions
These-0.068
 languages-0.068
Always-0.067
If-0.066
Class-0.064
Token many
Token ID867
Feature activation-2.87e-03

Pos logit contributions
sing+0.062
fish+0.060
ies+0.055
 spine+0.054
less+0.054
Neg logit contributions
These-0.103
 languages-0.103
Always-0.100
If-0.099
Class-0.097
Token exciting
Token ID7895
Feature activation+2.64e-02

Pos logit contributions
These+0.006
June+0.006
Always+0.006
If+0.006
Anne+0.005
Neg logit contributions
fish-0.004
 d-0.004
ies-0.004
orse-0.004
 raft-0.004
Token things
Token ID1243
Feature activation+3.35e-02

Pos logit contributions
sing+0.049
fish+0.049
 d+0.046
ies+0.046
 footing+0.045
Neg logit contributions
These-0.042
 languages-0.041
Always-0.041
If-0.040
Class-0.039
Token to
Token ID284
Feature activation-1.93e-02

Pos logit contributions
sing+0.058
 spine+0.050
kin+0.050
OO+0.049
fish+0.048
Neg logit contributions
 languages-0.060
These-0.059
Always-0.059
If-0.058
 parks-0.056
Token buy
Token ID2822
Feature activation-3.46e-02

Pos logit contributions
These+0.040
Always+0.039
 languages+0.039
If+0.039
Anne+0.038
Neg logit contributions
fish-0.027
sing-0.027
 d-0.025
orse-0.024
 footing-0.024
Token.
Token ID13
Feature activation-1.01e-02

Pos logit contributions
These+0.077
 languages+0.076
Always+0.074
If+0.073
Class+0.072
Neg logit contributions
sing-0.045
fish-0.043
 d-0.038
 footing-0.038
orse-0.038
Token
Token ID198
Feature activation+1.16e-02

Pos logit contributions
 languages+0.020
Always+0.019
These+0.019
Class+0.018
June+0.017
Neg logit contributions
sing-0.016
orse-0.015
 spine-0.014
 tum-0.014
 flo-0.014
Token
Token ID198
Feature activation+3.39e-03

Pos logit contributions
orse+0.018
fish+0.018
sing+0.016
 d+0.016
 flo+0.016
Neg logit contributions
 languages-0.023
These-0.022
 alphabet-0.021
Always-0.021
aries-0.021
TokenThe
Token ID464
Feature activation-2.13e-02

Pos logit contributions
fish+0.005
orse+0.005
 d+0.005
 flo+0.005
sing+0.005
Neg logit contributions
 languages-0.006
aries-0.006
 alphabet-0.006
 Friendship-0.005
 magazines-0.005
Token dragon
Token ID10441
Feature activation-6.91e-03

Pos logit contributions
These+0.045
If+0.045
Always+0.044
 languages+0.044
Class+0.043
Neg logit contributions
fish-0.031
sing-0.029
 d-0.028
kin-0.026
ies-0.025