TOP ACTIVATIONS
MAX = 0.092

, behind the rocks , and in the tunnel . She
under leaves , in trees and even in rocks . But
trees , behind bushes , and even in the sandbox .
, under the benches , and in the bushes . But
trees , under benches , and in the sandbox . They
, under the benches , and in the bushes . He
, behind the flowers , and in the sand . They
rocks , in trees , and even in the grass .
, on the swings , and in the sand . Finally
and put on his cape , feeling very strong .
, under the chairs , and even in the refrigerator !
, behind the bushes , and in the sandbox . Finally
, under the benches , and in the bushes . He
, behind the trees , and even in the flowers .
, under the trees , and near the pond . They
the grass , the trees and a little flower . Suddenly
They looked under the flowers and behind the trees . Tim
sun shone on the pillow , making it dry and fresh
people nearby were so scared , they all ran away .
looked everywhere , behind trees and under rocks , until he

MIN ACTIVATIONS
MIN = -0.138

" Oh , I hadn 't thought of that ,"
ally , you know you shouldn 't keep the gum .
. Mama said if she wasn 't careful she could get
Hey little boy , you shouldn 't be picking up that
so mom said if he wasn 't going to help ,
" The sign says ' CA UTION '."
Lily . " Don 't worry , Lily .
new words every day . Conf used is just a word
face and said , " Don 't worry , Amy !
" No , you didn 't , Tom ! I
She told Bobby that he wasn 't ready for TV time
to it and said " Sl ide !" He went down
and told him that he wasn 't alone .
had warned her but she hadn ’ t listened
face and said , " Don 't worry , I will
. He said , " Don 't worry little one ,
Mom sighed , " Sl ipping means to try to
to the bird , " Don 't worry , little bird
" Fred , maybe you shouldn 't be blowing that loud
and said , " I wasn 't paying attention . I

INTERVAL 0.035 - 0.092
CONTAINS 6.785%

some lip bal m on her lips to make them soft
was shiny and new , and the bird ie was happy
garden . He saw Lily and Rex and the trap .
leaves and were green , but he was a sun flower
't mind its loud noises , and they played together every

INTERVAL -0.023 - 0.035
CONTAINS 73.454%

The big boy gets angry . He says ,
and she will help you ". She called mummy and told
grow taller and taller . Soon , it had a brown
movies together . They soon forgot all about the delayed work
Little did Sam know that this new toy will help him

INTERVAL -0.080 - -0.023
CONTAINS 19.300%

fit through the circle . At the end of
this story is : never cut up too many things if
. <|endoftext|> Once upon a time , there was a little
Lily felt happy that she could help return the letter .
was very happy and excited to receive such a unique gift

INTERVAL -0.138 - -0.080
CONTAINS 0.461%

" Hey , what did you do ?" he shouts
grandma said , " Why don 't you open it and
?" Tim my replied , " I just flap my wings
his toys , but he didn 't listen .
that day on , Lily wasn 't embarrassed anymore and felt
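
NOTE ON THE INTERVALS
The four interval boundaries listed above look like equal-width bins between MIN and MAX: (0.092 - (-0.138)) / 4 = 0.0575 per bin. The sketch below is only an illustration under that assumption and is not taken from the tool that produced this dump. The helper name equal_width_edges is hypothetical, and the inputs are the rounded values printed above, so the last digit of a computed edge can differ from the displayed one.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 evenly spaced bin edges from lo to hi (assumed binning)."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# MIN and MAX as printed in this dump (already rounded to three decimals).
edges = equal_width_edges(-0.138, 0.092, 4)

# "CONTAINS" percentages as printed, ordered from the lowest interval to the highest.
shares = [0.461, 19.300, 73.454, 6.785]

for (left, right), share in zip(zip(edges, edges[1:]), shares):
    print(f"{left:+.4f} to {right:+.4f}: {share:.3f}%")
```

The printed shares add up to 100%, which is consistent with the four intervals covering the whole observed activation range.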
Token,
Token ID11
Feature activation+6.90e-02

Pos logit contributions
bread+0.032
 he+0.030
 ju+0.029
rily+0.028
 cuc+0.027
Neg logit contributions
ases-0.056
ee-0.055
room-0.055
prise-0.055
rr-0.055
Token behind
Token ID2157
Feature activation+3.74e-02

Pos logit contributions
bread+0.160
der+0.126
ment+0.123
rily+0.122
oc+0.120
Neg logit contributions
e-0.093
w-0.085
 cannot-0.080
ee-0.080
ases-0.079
Token the
Token ID262
Feature activation+3.10e-02

Pos logit contributions
bread+0.044
 he+0.040
 ju+0.039
 cuc+0.037
rily+0.037
Neg logit contributions
ases-0.085
prise-0.084
ee-0.084
rr-0.083
room-0.082
Token rocks
Token ID12586
Feature activation+6.15e-05

Pos logit contributions
bread+0.038
rily+0.035
 energ+0.035
 he+0.034
 ju+0.033
Neg logit contributions
rr-0.066
ases-0.065
prise-0.065
ee-0.064
room-0.064
Token,
Token ID11
Feature activation+6.22e-02

Pos logit contributions
bread+0.000
 he+0.000
 ju+0.000
 cuc+0.000
rily+0.000
Neg logit contributions
ases-0.000
ee-0.000
prise-0.000
rr-0.000
uch-0.000
Token and
Token ID290
Feature activation+8.70e-02

Pos logit contributions
bread+0.134
 ju+0.106
rily+0.105
der+0.104
ment+0.103
Neg logit contributions
e-0.084
ases-0.081
w-0.081
ee-0.079
ow-0.074
Token in
Token ID287
Feature activation+3.99e-02

Pos logit contributions
bread+0.172
rily+0.150
yt+0.135
 cuc+0.134
 ju+0.133
Neg logit contributions
ases-0.130
w-0.124
e-0.122
prise-0.120
ee-0.119
Token the
Token ID262
Feature activation+4.11e-02

Pos logit contributions
bread+0.047
 he+0.040
 ju+0.040
 cuc+0.038
rily+0.037
Neg logit contributions
ases-0.093
prise-0.092
ee-0.091
rr-0.090
room-0.090
Token tunnel
Token ID13275
Feature activation+1.05e-02

Pos logit contributions
bread+0.049
rily+0.045
 he+0.043
 energ+0.040
 ju+0.039
Neg logit contributions
prise-0.092
ee-0.091
rr-0.091
ases-0.090
room-0.088
Token.
Token ID13
Feature activation+3.55e-02

Pos logit contributions
bread+0.014
 ju+0.013
 he+0.013
 cuc+0.013
 energ+0.013
Neg logit contributions
ases-0.022
room-0.021
rr-0.021
ee-0.021
prise-0.021
Token She
Token ID1375
Feature activation+2.58e-02

Pos logit contributions
 he+0.041
bread+0.040
 ju+0.036
 cuc+0.035
 until+0.035
Neg logit contributions
prise-0.081
ases-0.080
ee-0.080
rr-0.079
uch-0.079
Token under
Token ID739
Feature activation+5.56e-02

Pos logit contributions
bread+0.043
 current+0.037
ment+0.035
der+0.034
lest+0.033
Neg logit contributions
ee-0.032
prise-0.031
ases-0.029
rr-0.027
w-0.026
Token leaves
Token ID5667
Feature activation+2.48e-02

Pos logit contributions
bread+0.087
 ju+0.074
rily+0.072
 energ+0.072
lest+0.068
Neg logit contributions
ases-0.105
ee-0.105
room-0.101
prise-0.098
e-0.097
Token,
Token ID11
Feature activation+6.89e-02

Pos logit contributions
bread+0.038
 ju+0.034
 cuc+0.033
 tum+0.032
 energ+0.032
Neg logit contributions
ee-0.047
room-0.046
ases-0.044
oh-0.043
w-0.043
Token in
Token ID287
Feature activation+3.55e-02

Pos logit contributions
bread+0.115
rily+0.101
ment+0.101
der+0.100
 tum+0.097
Neg logit contributions
e-0.111
ee-0.102
w-0.100
as-0.094
oh-0.092
Token trees
Token ID7150
Feature activation+3.11e-02

Pos logit contributions
bread+0.042
 he+0.039
 ju+0.036
 cuc+0.034
rily+0.034
Neg logit contributions
ases-0.081
prise-0.081
ee-0.080
rr-0.079
room-0.079
Token and
Token ID290
Feature activation+8.46e-02

Pos logit contributions
bread+0.050
 ju+0.047
rily+0.047
 energ+0.045
 current+0.045
Neg logit contributions
room-0.054
ee-0.053
ases-0.051
e-0.051
rr-0.051
Token even
Token ID772
Feature activation+5.65e-02

Pos logit contributions
rily+0.137
bread+0.134
der+0.118
rier+0.111
 attent+0.107
Neg logit contributions
w-0.143
e-0.142
prise-0.134
ases-0.133
ee-0.133
Token in
Token ID287
Feature activation+3.55e-02

Pos logit contributions
bread+0.096
rily+0.089
der+0.082
 current+0.079
 tum+0.077
Neg logit contributions
ases-0.101
prise-0.097
rr-0.094
ee-0.094
e-0.093
Token rocks
Token ID12586
Feature activation+6.56e-03

Pos logit contributions
bread+0.037
 he+0.035
 ju+0.032
 cuc+0.030
rily+0.030
Neg logit contributions
prise-0.085
ases-0.085
ee-0.084
rr-0.083
room-0.082
Token.
Token ID13
Feature activation+3.14e-02

Pos logit contributions
bread+0.009
 ju+0.009
 cuc+0.008
 he+0.008
 current+0.008
Neg logit contributions
ee-0.013
ases-0.013
uch-0.012
prise-0.012
rr-0.012
Token But
Token ID887
Feature activation+2.62e-02

Pos logit contributions
bread+0.038
 he+0.037
 ju+0.034
 cuc+0.033
rily+0.033
Neg logit contributions
ases-0.069
prise-0.068
ee-0.068
rr-0.068
room-0.067
Token trees
Token ID7150
Feature activation+3.40e-02

Pos logit contributions
bread+0.089
 energ+0.077
 ju+0.077
rily+0.074
lest+0.071
Neg logit contributions
ases-0.100
ee-0.096
room-0.096
e-0.093
prise-0.090
Token,
Token ID11
Feature activation+6.42e-02

Pos logit contributions
bread+0.053
 ju+0.049
ment+0.049
rily+0.048
 cuc+0.048
Neg logit contributions
ee-0.062
room-0.062
rr-0.059
ases-0.059
e-0.057
Token behind
Token ID2157
Feature activation+4.14e-02

Pos logit contributions
bread+0.120
ment+0.115
rily+0.102
der+0.099
 tum+0.096
Neg logit contributions
e-0.101
w-0.097
ee-0.096
uch-0.091
room-0.090
Token bushes
Token ID37413
Feature activation+2.30e-02

Pos logit contributions
bread+0.045
 he+0.043
 ju+0.039
rily+0.037
 cuc+0.037
Neg logit contributions
prise-0.097
ases-0.097
ee-0.097
rr-0.096
uch-0.095
Token,
Token ID11
Feature activation+6.36e-02

Pos logit contributions
bread+0.030
 he+0.028
 ju+0.028
rily+0.027
 cuc+0.026
Neg logit contributions
ases-0.048
ee-0.047
prise-0.047
rr-0.047
room-0.047
Token and
Token ID290
Feature activation+8.44e-02

Pos logit contributions
bread+0.123
ment+0.116
rily+0.103
 tum+0.102
 current+0.099
Neg logit contributions
e-0.092
ee-0.089
w-0.088
room-0.088
oh-0.085
Token even
Token ID772
Feature activation+5.22e-02

Pos logit contributions
bread+0.139
rily+0.129
ment+0.122
 cuc+0.118
 tal+0.114
Neg logit contributions
prise-0.140
room-0.137
e-0.137
ases-0.132
w-0.130
Token in
Token ID287
Feature activation+3.57e-02

Pos logit contributions
bread+0.093
rily+0.084
 current+0.077
der+0.077
 tum+0.076
Neg logit contributions
ases-0.089
prise-0.086
rr-0.083
room-0.081
e-0.080
Token the
Token ID262
Feature activation+2.80e-02

Pos logit contributions
bread+0.036
 he+0.035
 ju+0.031
 cuc+0.030
rily+0.030
Neg logit contributions
prise-0.087
ases-0.087
ee-0.086
rr-0.085
room-0.084
Token sandbox
Token ID35204
Feature activation+2.51e-02

Pos logit contributions
bread+0.041
rily+0.038
 he+0.038
 cuc+0.035
ment+0.035
Neg logit contributions
prise-0.056
ee-0.055
uch-0.055
ases-0.054
rr-0.054
Token.
Token ID13
Feature activation+3.31e-02

Pos logit contributions
bread+0.031
 he+0.030
 ju+0.028
 cuc+0.027
rily+0.027
Neg logit contributions
prise-0.054
ases-0.054
ee-0.054
rr-0.053
uch-0.053
Token,
Token ID11
Feature activation+6.06e-02

Pos logit contributions
bread+0.045
 ju+0.041
ment+0.040
 cuc+0.040
rily+0.040
Neg logit contributions
room-0.059
ee-0.059
ases-0.058
rr-0.057
prise-0.057
Token under
Token ID739
Feature activation+4.92e-02

Pos logit contributions
bread+0.120
der+0.101
rily+0.101
ment+0.100
 current+0.089
Neg logit contributions
e-0.095
w-0.089
ma-0.088
ee-0.086
room-0.082
Token the
Token ID262
Feature activation+2.50e-02

Pos logit contributions
bread+0.072
 ju+0.065
 energ+0.064
rily+0.062
 cuc+0.060
Neg logit contributions
ases-0.099
room-0.098
ee-0.094
rr-0.094
prise-0.092
Token benches
Token ID43183
Feature activation+1.44e-02

Pos logit contributions
bread+0.028
 he+0.026
rily+0.025
 ju+0.023
 cuc+0.023
Neg logit contributions
ee-0.058
ases-0.057
rr-0.057
prise-0.057
uch-0.056
Token,
Token ID11
Feature activation+6.09e-02

Pos logit contributions
bread+0.018
 he+0.018
 ju+0.017
rily+0.016
 cuc+0.016
Neg logit contributions
ee-0.031
prise-0.031
ases-0.030
rr-0.030
uch-0.030
Token and
Token ID290
Feature activation+8.40e-02

Pos logit contributions
bread+0.123
ment+0.105
rily+0.101
 tum+0.099
der+0.098
Neg logit contributions
w-0.086
e-0.086
ee-0.085
oh-0.081
room-0.079
Token in
Token ID287
Feature activation+3.52e-02

Pos logit contributions
bread+0.145
rily+0.127
 cuc+0.121
ment+0.117
 current+0.116
Neg logit contributions
prise-0.139
w-0.139
e-0.136
room-0.135
ases-0.135
Token the
Token ID262
Feature activation+3.65e-02

Pos logit contributions
bread+0.041
 he+0.036
 ju+0.035
 cuc+0.034
rily+0.033
Neg logit contributions
ases-0.082
prise-0.081
ee-0.081
room-0.079
rr-0.079
Token bushes
Token ID37413
Feature activation+1.81e-02

Pos logit contributions
bread+0.039
 he+0.037
rily+0.035
 ju+0.032
 cuc+0.032
Neg logit contributions
prise-0.087
ee-0.086
ases-0.085
rr-0.084
uch-0.084
Token.
Token ID13
Feature activation+3.07e-02

Pos logit contributions
bread+0.023
 he+0.022
 ju+0.020
 cuc+0.020
rily+0.020
Neg logit contributions
prise-0.039
ases-0.039
ee-0.039
rr-0.039
room-0.038
Token But
Token ID887
Feature activation+2.63e-02

Pos logit contributions
 he+0.036
bread+0.036
 ju+0.032
 cuc+0.031
rily+0.031
Neg logit contributions
prise-0.069
ases-0.068
ee-0.068
rr-0.068
uch-0.068
Token trees
Token ID7150
Feature activation+2.87e-02

Pos logit contributions
bread+0.070
 ju+0.059
 cuc+0.058
 energ+0.057
rily+0.056
Neg logit contributions
ases-0.100
ee-0.097
room-0.096
rr-0.092
prise-0.092
Token,
Token ID11
Feature activation+5.78e-02

Pos logit contributions
bread+0.046
 ju+0.042
 cuc+0.041
ment+0.041
rily+0.041
Neg logit contributions
room-0.051
ee-0.051
ases-0.049
rr-0.048
ow-0.048
Token under
Token ID739
Feature activation+4.62e-02

Pos logit contributions
bread+0.108
ment+0.095
rily+0.093
der+0.090
rier+0.082
Neg logit contributions
e-0.093
w-0.089
ee-0.086
ma-0.084
ow-0.083
Token benches
Token ID43183
Feature activation+1.02e-02

Pos logit contributions
bread+0.068
rily+0.059
 ju+0.058
 energ+0.058
 cuc+0.055
Neg logit contributions
ases-0.094
room-0.091
rr-0.090
ee-0.089
e-0.087
Token,
Token ID11
Feature activation+5.85e-02

Pos logit contributions
bread+0.013
 he+0.013
 ju+0.012
rily+0.012
 cuc+0.012
Neg logit contributions
ee-0.021
prise-0.021
ases-0.021
rr-0.021
uch-0.021
Token and
Token ID290
Feature activation+8.39e-02

Pos logit contributions
bread+0.113
ment+0.102
rily+0.096
 tum+0.091
der+0.090
Neg logit contributions
w-0.084
ee-0.082
e-0.081
oh-0.078
room-0.076
Token in
Token ID287
Feature activation+3.60e-02

Pos logit contributions
bread+0.147
rily+0.136
 cuc+0.124
ment+0.123
yt+0.115
Neg logit contributions
e-0.129
prise-0.127
room-0.126
w-0.123
ases-0.122
Token the
Token ID262
Feature activation+3.22e-02

Pos logit contributions
bread+0.046
 he+0.039
 ju+0.038
 cuc+0.037
rily+0.037
Neg logit contributions
ases-0.081
ee-0.079
prise-0.079
room-0.078
rr-0.078
Token sandbox
Token ID35204
Feature activation+2.33e-02

Pos logit contributions
bread+0.035
rily+0.032
 he+0.031
 cuc+0.028
ment+0.028
Neg logit contributions
prise-0.075
ee-0.075
ases-0.075
rr-0.075
uch-0.074
Token.
Token ID13
Feature activation+3.40e-02

Pos logit contributions
bread+0.034
 ju+0.032
 cuc+0.031
 energ+0.030
rily+0.029
Neg logit contributions
room-0.047
ases-0.046
prise-0.045
rr-0.045
ee-0.044
Token They
Token ID1119
Feature activation+2.51e-02

Pos logit contributions
bread+0.047
 cuc+0.043
ment+0.041
 he+0.041
 energ+0.040
Neg logit contributions
room-0.070
ases-0.067
rr-0.067
prise-0.065
ee-0.065
Token,
Token ID11
Feature activation+5.31e-02

Pos logit contributions
bread+0.034
 ju+0.031
 cuc+0.029
 he+0.029
rily+0.029
Neg logit contributions
ee-0.056
ases-0.055
room-0.055
rr-0.055
prise-0.054
Token under
Token ID739
Feature activation+4.04e-02

Pos logit contributions
bread+0.095
rily+0.079
der+0.074
ment+0.073
 attent+0.068
Neg logit contributions
e-0.096
w-0.095
ases-0.092
ee-0.092
uch-0.091
Token the
Token ID262
Feature activation+1.84e-02

Pos logit contributions
bread+0.062
 energ+0.055
 ju+0.055
rily+0.052
lest+0.051
Neg logit contributions
ases-0.079
room-0.078
rr-0.075
ee-0.073
e-0.072
Token benches
Token ID43183
Feature activation+7.87e-03

Pos logit contributions
bread+0.020
 he+0.018
rily+0.018
 ju+0.017
 energ+0.017
Neg logit contributions
ee-0.042
ases-0.042
rr-0.042
prise-0.042
room-0.041
Token,
Token ID11
Feature activation+5.63e-02

Pos logit contributions
 he+0.009
bread+0.009
 until+0.008
 ju+0.008
 which+0.008
Neg logit contributions
prise-0.018
ases-0.018
uch-0.018
ee-0.018
rr-0.018
Token and
Token ID290
Feature activation+8.31e-02

Pos logit contributions
bread+0.106
ment+0.088
rily+0.088
der+0.082
 ju+0.082
Neg logit contributions
w-0.093
ee-0.091
e-0.088
ases-0.086
oh-0.086
Token in
Token ID287
Feature activation+3.17e-02

Pos logit contributions
bread+0.136
rily+0.119
 cuc+0.106
ment+0.105
 energ+0.103
Neg logit contributions
prise-0.144
ases-0.143
w-0.141
oh-0.141
e-0.139
Token the
Token ID262
Feature activation+3.15e-02

Pos logit contributions
bread+0.039
 ju+0.033
 he+0.032
 cuc+0.031
rily+0.031
Neg logit contributions
ases-0.073
ee-0.071
prise-0.071
rr-0.070
room-0.070
Token bushes
Token ID37413
Feature activation+1.19e-02

Pos logit contributions
bread+0.033
 he+0.031
rily+0.030
 ju+0.027
 cuc+0.027
Neg logit contributions
prise-0.074
ee-0.074
ases-0.074
rr-0.073
uch-0.073
Token.
Token ID13
Feature activation+2.99e-02

Pos logit contributions
 he+0.013
bread+0.013
 ju+0.012
 until+0.011
 cuc+0.011
Neg logit contributions
prise-0.028
ases-0.027
ee-0.027
uch-0.027
rr-0.027
Token He
Token ID679
Feature activation+2.00e-02

Pos logit contributions
bread+0.045
lest+0.040
 knowing+0.038
ment+0.038
rily+0.038
Neg logit contributions
room-0.056
ases-0.056
rr-0.055
ee-0.054
oh-0.053
Token,
Token ID11
Feature activation+5.82e-02

Pos logit contributions
bread+0.015
 he+0.015
 ju+0.014
rily+0.013
 cuc+0.013
Neg logit contributions
ee-0.029
prise-0.029
ases-0.029
rr-0.029
uch-0.029
Token behind
Token ID2157
Feature activation+2.86e-02

Pos logit contributions
bread+0.112
ment+0.099
rily+0.092
der+0.092
oc+0.089
Neg logit contributions
ee-0.088
w-0.087
e-0.085
oh-0.080
uch-0.079
Token the
Token ID262
Feature activation+2.01e-02

Pos logit contributions
bread+0.030
 he+0.030
 ju+0.027
 cuc+0.025
rily+0.025
Neg logit contributions
prise-0.068
ases-0.068
ee-0.067
rr-0.067
uch-0.066
Token flowers
Token ID12734
Feature activation+1.62e-02

Pos logit contributions
bread+0.021
 he+0.021
rily+0.019
 ju+0.018
 cuc+0.018
Neg logit contributions
prise-0.047
ases-0.047
ee-0.046
rr-0.046
uch-0.045
Token,
Token ID11
Feature activation+5.60e-02

Pos logit contributions
bread+0.022
 he+0.021
 ju+0.020
rily+0.019
 cuc+0.019
Neg logit contributions
ee-0.033
prise-0.033
ases-0.033
rr-0.033
uch-0.033
Token and
Token ID290
Feature activation+8.29e-02

Pos logit contributions
bread+0.110
rily+0.093
ment+0.093
 ju+0.088
 tum+0.088
Neg logit contributions
ee-0.080
w-0.080
e-0.079
ases-0.079
room-0.073
Token in
Token ID287
Feature activation+3.64e-02

Pos logit contributions
bread+0.138
rily+0.130
 cuc+0.116
yt+0.116
ment+0.113
Neg logit contributions
ases-0.133
e-0.133
ee-0.132
prise-0.131
w-0.126
Token the
Token ID262
Feature activation+3.61e-02

Pos logit contributions
bread+0.044
 he+0.039
 ju+0.038
 cuc+0.037
rily+0.036
Neg logit contributions
ases-0.083
ee-0.082
prise-0.081
room-0.080
rr-0.080
Token sand
Token ID6450
Feature activation-5.53e-03

Pos logit contributions
bread+0.039
 he+0.037
rily+0.037
 ju+0.032
 cuc+0.032
Neg logit contributions
prise-0.084
ee-0.083
ases-0.083
rr-0.082
uch-0.082
Token.
Token ID13
Feature activation+3.35e-02

Pos logit contributions
ases+0.012
prise+0.012
rr+0.012
ee+0.012
uch+0.012
Neg logit contributions
bread-0.007
 he-0.007
 ju-0.006
 cuc-0.006
 energ-0.006
Token They
Token ID1119
Feature activation+1.48e-02

Pos logit contributions
bread+0.044
 he+0.040
 ju+0.038
rily+0.038
 knowing+0.037
Neg logit contributions
ases-0.071
room-0.069
ee-0.069
rr-0.068
prise-0.065
Token rocks
Token ID12586
Feature activation+1.01e-02

Pos logit contributions
bread+0.089
 energ+0.076
 ju+0.075
rily+0.073
der+0.069
Neg logit contributions
ases-0.107
ee-0.104
room-0.099
prise-0.099
e-0.098
Token,
Token ID11
Feature activation+6.77e-02

Pos logit contributions
bread+0.014
 ju+0.013
 tum+0.013
 cuc+0.012
ment+0.012
Neg logit contributions
ee-0.020
rr-0.019
ases-0.019
room-0.019
prise-0.019
Token in
Token ID287
Feature activation+3.82e-02

Pos logit contributions
bread+0.114
ment+0.105
der+0.099
 tum+0.098
 ending+0.093
Neg logit contributions
e-0.111
ee-0.106
room-0.102
ases-0.100
rr-0.098
Token trees
Token ID7150
Feature activation+2.97e-02

Pos logit contributions
bread+0.049
 he+0.042
 ju+0.040
 energ+0.038
 cuc+0.038
Neg logit contributions
ases-0.086
prise-0.084
ee-0.084
room-0.083
rr-0.082
Token,
Token ID11
Feature activation+6.39e-02

Pos logit contributions
bread+0.047
 ju+0.046
rily+0.044
 energ+0.044
ment+0.043
Neg logit contributions
room-0.051
ee-0.051
rr-0.050
e-0.049
ases-0.049
Token and
Token ID290
Feature activation+8.27e-02

Pos logit contributions
bread+0.104
ment+0.094
 tum+0.092
rily+0.091
der+0.091
Neg logit contributions
e-0.105
ee-0.104
ases-0.100
room-0.100
rr-0.097
Token even
Token ID772
Feature activation+5.66e-02

Pos logit contributions
bread+0.124
rily+0.118
der+0.106
ment+0.103
 tum+0.099
Neg logit contributions
prise-0.145
e-0.144
ases-0.143
ee-0.142
room-0.139
Token in
Token ID287
Feature activation+3.77e-02

Pos logit contributions
bread+0.092
rily+0.086
der+0.078
 current+0.078
 tum+0.077
Neg logit contributions
ases-0.102
prise-0.099
rr-0.097
ee-0.094
e-0.094
Token the
Token ID262
Feature activation+2.86e-02

Pos logit contributions
bread+0.041
 he+0.038
 ju+0.035
rily+0.033
 cuc+0.033
Neg logit contributions
prise-0.089
ases-0.089
ee-0.088
rr-0.087
room-0.086
Token grass
Token ID8701
Feature activation+1.55e-03

Pos logit contributions
bread+0.036
 he+0.034
rily+0.033
ment+0.030
der+0.029
Neg logit contributions
ee-0.063
prise-0.062
uch-0.061
rr-0.061
ases-0.060
Token.
Token ID13
Feature activation+3.21e-02

Pos logit contributions
bread+0.002
 he+0.002
 ju+0.002
 cuc+0.002
rily+0.002
Neg logit contributions
ee-0.003
prise-0.003
ases-0.003
rr-0.003
uch-0.003
Token,
Token ID11
Feature activation+7.01e-02

Pos logit contributions
bread+0.024
 ju+0.021
 he+0.021
 cuc+0.021
 energ+0.020
Neg logit contributions
room-0.041
ases-0.041
ee-0.041
rr-0.040
prise-0.040
Token on
Token ID319
Feature activation+4.24e-02

Pos logit contributions
bread+0.153
ment+0.139
oc+0.120
 tal+0.113
der+0.113
Neg logit contributions
e-0.095
w-0.085
ee-0.083
 cannot-0.082
oh-0.081
Token the
Token ID262
Feature activation+4.12e-02

Pos logit contributions
bread+0.043
 he+0.040
 ju+0.037
rily+0.035
 cuc+0.035
Neg logit contributions
ases-0.102
ee-0.102
prise-0.101
room-0.100
rr-0.100
Token swings
Token ID26728
Feature activation+1.90e-02

Pos logit contributions
bread+0.058
 he+0.053
rily+0.053
ment+0.051
 energ+0.050
Neg logit contributions
ee-0.081
rr-0.080
room-0.079
prise-0.079
ases-0.079
Token,
Token ID11
Feature activation+6.61e-02

Pos logit contributions
bread+0.026
 he+0.024
 ju+0.023
 cuc+0.023
rily+0.022
Neg logit contributions
ases-0.039
ee-0.039
room-0.039
prise-0.038
rr-0.038
Token and
Token ID290
Feature activation+8.20e-02

Pos logit contributions
bread+0.144
ment+0.127
 tum+0.111
rily+0.111
 former+0.111
Neg logit contributions
w-0.087
ases-0.084
e-0.083
ee-0.083
ips-0.080
Token in
Token ID287
Feature activation+3.78e-02

Pos logit contributions
bread+0.148
rily+0.138
ment+0.127
 cuc+0.120
 attent+0.120
Neg logit contributions
ases-0.127
prise-0.125
w-0.124
room-0.120
e-0.119
Token the
Token ID262
Feature activation+3.51e-02

Pos logit contributions
bread+0.035
 he+0.034
 ju+0.030
 cuc+0.028
rily+0.028
Neg logit contributions
prise-0.095
ases-0.095
ee-0.094
rr-0.093
uch-0.093
Token sand
Token ID6450
Feature activation-1.43e-04

Pos logit contributions
bread+0.051
 he+0.049
rily+0.047
 ju+0.044
 cuc+0.044
Neg logit contributions
prise-0.070
ee-0.070
ases-0.069
rr-0.069
uch-0.068
Token.
Token ID13
Feature activation+3.32e-02

Pos logit contributions
ases+0.000
prise+0.000
room+0.000
ee+0.000
rr+0.000
Neg logit contributions
bread-0.000
 he-0.000
 ju-0.000
 energ-0.000
 cuc-0.000
Token Finally
Token ID9461
Feature activation+1.35e-02

Pos logit contributions
bread+0.046
 he+0.042
ment+0.040
 current+0.039
 cuc+0.039
Neg logit contributions
room-0.069
ases-0.067
rr-0.064
prise-0.064
ee-0.064
Token and
Token ID290
Feature activation+6.07e-02

Pos logit contributions
bread+0.065
 he+0.063
 ju+0.058
 cuc+0.055
rily+0.055
Neg logit contributions
prise-0.124
ases-0.123
ee-0.122
uch-0.122
rr-0.121
Token put
Token ID1234
Feature activation+3.72e-02

Pos logit contributions
bread+0.085
rily+0.075
 ju+0.071
 current+0.070
ment+0.070
Neg logit contributions
prise-0.123
ases-0.121
ee-0.118
uch-0.117
room-0.116
Token on
Token ID319
Feature activation+4.06e-02

Pos logit contributions
bread+0.047
 he+0.045
 ju+0.041
 current+0.040
 cuc+0.040
Neg logit contributions
ee-0.080
prise-0.080
ases-0.080
uch-0.078
rr-0.078
Token his
Token ID465
Feature activation+5.87e-02

Pos logit contributions
bread+0.046
 he+0.044
 ju+0.040
 current+0.039
 cuc+0.038
Neg logit contributions
ee-0.092
prise-0.092
ases-0.092
rr-0.090
room-0.089
Token cape
Token ID36023
Feature activation+2.71e-02

Pos logit contributions
 he+0.079
bread+0.079
 ju+0.072
rily+0.069
 until+0.069
Neg logit contributions
prise-0.121
ee-0.119
ases-0.119
uch-0.119
rr-0.118
Token,
Token ID11
Feature activation+8.17e-02

Pos logit contributions
bread+0.035
 ju+0.032
 cuc+0.032
 he+0.032
 current+0.031
Neg logit contributions
ee-0.056
ases-0.056
room-0.056
rr-0.056
uch-0.056
Token feeling
Token ID4203
Feature activation+1.99e-02

Pos logit contributions
bread+0.105
rily+0.086
ment+0.085
 current+0.083
 ju+0.080
Neg logit contributions
ee-0.159
e-0.158
ases-0.156
room-0.155
prise-0.155
Token very
Token ID845
Feature activation+2.70e-02

Pos logit contributions
bread+0.032
rily+0.028
 ju+0.028
 current+0.027
 cuc+0.027
Neg logit contributions
ee-0.036
ases-0.036
prise-0.036
e-0.035
uch-0.034
Token strong
Token ID1913
Feature activation+2.01e-02

Pos logit contributions
bread+0.039
 ju+0.035
 cuc+0.035
rily+0.034
 current+0.033
Neg logit contributions
prise-0.053
uch-0.053
ee-0.053
ases-0.050
rr-0.050
Token.
Token ID13
Feature activation+2.68e-02

Pos logit contributions
bread+0.030
 ju+0.029
 cuc+0.029
rily+0.028
 lettuce+0.028
Neg logit contributions
ee-0.036
ases-0.036
rr-0.034
room-0.034
w-0.034
Token \n
Token ID198
Feature activation-9.48e-03

Pos logit contributions
bread+0.029
 he+0.028
 ju+0.025
rily+0.024
 cuc+0.024
Neg logit contributions
prise-0.062
ee-0.062
ases-0.062
rr-0.061
uch-0.061
Token,
Token ID11
Feature activation+6.68e-02

Pos logit contributions
bread+0.047
 ju+0.045
 he+0.044
 cuc+0.043
 energ+0.042
Neg logit contributions
room-0.088
ee-0.086
rr-0.084
prise-0.084
ases-0.084
Token under
Token ID739
Feature activation+5.60e-02

Pos logit contributions
bread+0.115
der+0.100
 cuc+0.096
 current+0.096
ment+0.095
Neg logit contributions
e-0.117
ee-0.111
room-0.108
ow-0.106
prise-0.101
Token the
Token ID262
Feature activation+4.31e-02

Pos logit contributions
bread+0.076
 ju+0.070
 energ+0.067
rily+0.065
 cuc+0.064
Neg logit contributions
room-0.116
ases-0.115
ee-0.114
rr-0.111
prise-0.111
Token chairs
Token ID18791
Feature activation+1.89e-02

Pos logit contributions
bread+0.057
 he+0.055
 ju+0.052
rily+0.051
 cuc+0.050
Neg logit contributions
prise-0.089
ee-0.089
room-0.087
ases-0.086
rr-0.086
Token,
Token ID11
Feature activation+6.68e-02

Pos logit contributions
bread+0.026
 ju+0.024
 he+0.024
 cuc+0.023
rily+0.022
Neg logit contributions
ee-0.038
room-0.038
ases-0.038
rr-0.038
prise-0.037
Token and
Token ID290
Feature activation+8.16e-02

Pos logit contributions
bread+0.117
 current+0.105
der+0.104
 cuc+0.102
 ju+0.102
Neg logit contributions
ee-0.106
e-0.106
room-0.101
w-0.101
ases-0.097
Token even
Token ID772
Feature activation+5.62e-02

Pos logit contributions
bread+0.126
rily+0.119
 cuc+0.109
der+0.108
 ju+0.105
Neg logit contributions
prise-0.149
ee-0.142
ases-0.140
e-0.139
room-0.139
Token in
Token ID287
Feature activation+4.06e-02

Pos logit contributions
bread+0.088
rily+0.081
 current+0.077
der+0.075
 ju+0.074
Neg logit contributions
prise-0.105
ases-0.104
ee-0.100
rr-0.099
room-0.097
Token the
Token ID262
Feature activation+4.59e-02

Pos logit contributions
bread+0.039
 he+0.039
 ju+0.034
 cuc+0.032
rily+0.032
Neg logit contributions
prise-0.100
ases-0.099
ee-0.099
rr-0.098
uch-0.097
Token refrigerator
Token ID30500
Feature activation+2.66e-02

Pos logit contributions
bread+0.064
 he+0.063
rily+0.058
 ju+0.057
 cuc+0.056
Neg logit contributions
prise-0.093
ee-0.093
ases-0.091
rr-0.090
uch-0.089
Token!
Token ID0
Feature activation+2.87e-02

Pos logit contributions
bread+0.034
 he+0.032
 ju+0.031
 cuc+0.030
rily+0.030
Neg logit contributions
prise-0.056
ee-0.056
rr-0.055
ases-0.055
uch-0.055
Token,
Token ID11
Feature activation+6.55e-02

Pos logit contributions
bread+0.059
ment+0.056
 ju+0.056
rily+0.055
 energ+0.055
Neg logit contributions
room-0.063
ee-0.061
ow-0.059
e-0.058
rr-0.058
Token behind
Token ID2157
Feature activation+4.31e-02

Pos logit contributions
bread+0.119
ment+0.109
der+0.105
 tum+0.099
oc+0.098
Neg logit contributions
e-0.104
ee-0.099
w-0.095
room-0.092
uch-0.088
Token the
Token ID262
Feature activation+4.01e-02

Pos logit contributions
bread+0.054
 he+0.049
 ju+0.048
rily+0.046
 cuc+0.045
Neg logit contributions
ases-0.095
ee-0.094
prise-0.093
room-0.093
rr-0.092
Token bushes
Token ID37413
Feature activation+2.81e-02

Pos logit contributions
bread+0.046
 he+0.044
rily+0.044
 energ+0.042
ment+0.041
Neg logit contributions
ee-0.087
rr-0.085
prise-0.085
ases-0.085
room-0.084
Token,
Token ID11
Feature activation+6.53e-02

Pos logit contributions
bread+0.041
 ju+0.038
rily+0.037
 energ+0.037
 he+0.036
Neg logit contributions
room-0.054
ee-0.053
ases-0.053
rr-0.052
prise-0.051
Token and
Token ID290
Feature activation+8.08e-02

Pos logit contributions
bread+0.120
ment+0.107
 tum+0.100
der+0.099
rily+0.098
Neg logit contributions
e-0.100
ee-0.099
room-0.095
w-0.093
oh-0.092
Token in
Token ID287
Feature activation+4.07e-02

Pos logit contributions
bread+0.140
rily+0.125
der+0.117
ment+0.115
 cuc+0.111
Neg logit contributions
prise-0.132
e-0.131
ee-0.128
ases-0.124
room-0.124
Token the
Token ID262
Feature activation+4.11e-02

Pos logit contributions
bread+0.054
 ju+0.046
 he+0.045
 cuc+0.044
 energ+0.043
Neg logit contributions
ases-0.090
ee-0.088
prise-0.087
room-0.087
rr-0.085
Token sandbox
Token ID35204
Feature activation+3.10e-02

Pos logit contributions
bread+0.049
rily+0.047
 he+0.044
 energ+0.041
ment+0.040
Neg logit contributions
prise-0.091
ee-0.091
rr-0.089
ases-0.089
uch-0.089
Token.
Token ID13
Feature activation+3.77e-02

Pos logit contributions
bread+0.047
 ju+0.044
 cuc+0.041
 energ+0.041
 current+0.041
Neg logit contributions
room-0.061
ases-0.059
rr-0.057
prise-0.057
ee-0.057
Token Finally
Token ID9461
Feature activation+1.50e-02

Pos logit contributions
bread+0.046
 he+0.044
 cuc+0.040
 ju+0.039
rily+0.038
Neg logit contributions
room-0.083
ases-0.083
rr-0.082
prise-0.082
ee-0.081
Token,
Token ID11
Feature activation+6.34e-02

Pos logit contributions
bread+0.055
 ju+0.050
rily+0.050
ment+0.050
 cuc+0.048
Neg logit contributions
room-0.068
ee-0.067
ases-0.066
rr-0.064
e-0.063
Token under
Token ID739
Feature activation+4.99e-02

Pos logit contributions
bread+0.122
ment+0.104
rily+0.096
 tum+0.091
der+0.091
Neg logit contributions
e-0.097
w-0.095
ee-0.095
ma-0.091
oh-0.090
Token the
Token ID262
Feature activation+3.07e-02

Pos logit contributions
bread+0.072
 ju+0.065
 energ+0.061
rily+0.061
 satisfied+0.057
Neg logit contributions
ases-0.104
room-0.103
rr-0.099
ee-0.099
prise-0.096
Token benches
Token ID43183
Feature activation+1.76e-02

Pos logit contributions
bread+0.038
rily+0.035
 he+0.033
ment+0.033
 energ+0.032
Neg logit contributions
ee-0.066
rr-0.066
ases-0.066
prise-0.065
room-0.064
Token,
Token ID11
Feature activation+6.22e-02

Pos logit contributions
bread+0.025
 he+0.023
 ju+0.023
rily+0.022
 cuc+0.021
Neg logit contributions
ee-0.035
ases-0.035
prise-0.035
rr-0.035
uch-0.035
Token and
Token ID290
Feature activation+8.05e-02

Pos logit contributions
bread+0.126
ment+0.109
rily+0.104
 tum+0.103
 ju+0.099
Neg logit contributions
w-0.089
ee-0.088
oh-0.086
e-0.085
room-0.080
Token in
Token ID287
Feature activation+3.72e-02

Pos logit contributions
bread+0.140
rily+0.117
ment+0.108
 cuc+0.106
der+0.105
Neg logit contributions
w-0.135
e-0.134
prise-0.133
ee-0.133
room-0.131
Token the
Token ID262
Feature activation+4.09e-02

Pos logit contributions
bread+0.048
 ju+0.040
 cuc+0.038
 he+0.038
rily+0.037
Neg logit contributions
ases-0.085
ee-0.083
room-0.082
prise-0.082
rr-0.081
Token bushes
Token ID37413
Feature activation+2.19e-02

Pos logit contributions
bread+0.046
 he+0.043
rily+0.041
 ju+0.039
 cuc+0.038
Neg logit contributions
prise-0.095
ee-0.094
ases-0.094
rr-0.093
uch-0.093
Token.
Token ID13
Feature activation+3.54e-02

Pos logit contributions
bread+0.029
 he+0.026
 ju+0.026
rily+0.025
 cuc+0.025
Neg logit contributions
ases-0.046
prise-0.046
ee-0.046
rr-0.045
room-0.045
Token He
Token ID679
Feature activation+2.94e-02

Pos logit contributions
bread+0.037
rily+0.030
 ju+0.029
 energ+0.029
 cuc+0.029
Neg logit contributions
room-0.085
ases-0.084
rr-0.083
prise-0.080
ee-0.080
Token,
Token ID11
Feature activation+6.08e-02

Pos logit contributions
bread+0.036
 ju+0.032
rily+0.031
 he+0.031
ment+0.031
Neg logit contributions
room-0.056
ee-0.056
prise-0.055
ases-0.055
rr-0.055
Token behind
Token ID2157
Feature activation+4.11e-02

Pos logit contributions
bread+0.112
ment+0.104
der+0.090
 tum+0.088
rily+0.083
Neg logit contributions
e-0.092
ee-0.091
w-0.090
oh-0.088
room-0.084
Token the
Token ID262
Feature activation+2.41e-02

Pos logit contributions
bread+0.061
 ju+0.053
 he+0.053
 cuc+0.051
rily+0.051
Neg logit contributions
ases-0.082
room-0.081
ee-0.081
prise-0.081
rr-0.078
Token trees
Token ID7150
Feature activation+2.69e-02

Pos logit contributions
bread+0.029
rily+0.027
ment+0.026
 he+0.026
 energ+0.026
Neg logit contributions
rr-0.051
ee-0.051
prise-0.050
room-0.050
ases-0.050
Token,
Token ID11
Feature activation+5.55e-02

Pos logit contributions
bread+0.044
 ju+0.041
rily+0.041
 energ+0.040
ment+0.040
Neg logit contributions
room-0.047
ee-0.047
rr-0.045
ases-0.045
prise-0.044
Token and
Token ID290
Feature activation+8.04e-02

Pos logit contributions
bread+0.111
ment+0.101
 tum+0.095
rily+0.093
der+0.090
Neg logit contributions
ee-0.076
w-0.075
e-0.073
oh-0.071
room-0.068
Token even
Token ID772
Feature activation+5.28e-02

Pos logit contributions
bread+0.148
rily+0.135
der+0.126
ment+0.126
 tal+0.121
Neg logit contributions
prise-0.123
w-0.116
e-0.112
ee-0.112
room-0.112
Token in
Token ID287
Feature activation+3.88e-02

Pos logit contributions
bread+0.098
der+0.084
rily+0.084
 tum+0.080
 current+0.079
Neg logit contributions
prise-0.090
ases-0.086
rr-0.081
render-0.080
ee-0.079
Token the
Token ID262
Feature activation+3.41e-02

Pos logit contributions
bread+0.047
 he+0.041
 ju+0.039
 cuc+0.038
rily+0.038
Neg logit contributions
prise-0.088
ases-0.088
ee-0.087
room-0.086
rr-0.085
Token flowers
Token ID12734
Feature activation+1.81e-02

Pos logit contributions
bread+0.046
rily+0.043
ment+0.038
 he+0.038
der+0.036
Neg logit contributions
prise-0.072
ee-0.071
rr-0.070
room-0.069
uch-0.069
Token.
Token ID13
Feature activation+2.60e-02

Pos logit contributions
bread+0.025
 ju+0.023
 he+0.023
rily+0.022
 cuc+0.022
Neg logit contributions
prise-0.037
ee-0.037
ases-0.036
uch-0.036
room-0.036
Token,
Token ID11
Feature activation+6.83e-02

Pos logit contributions
bread+0.031
 ju+0.027
 energ+0.027
ment+0.026
 satisfied+0.026
Neg logit contributions
room-0.049
ases-0.047
rr-0.046
ee-0.046
uch-0.046
Token under
Token ID739
Feature activation+5.44e-02

Pos logit contributions
bread+0.120
ment+0.109
der+0.102
 tal+0.089
rily+0.089
Neg logit contributions
e-0.115
room-0.107
ee-0.105
w-0.098
ma-0.097
Token the
Token ID262
Feature activation+3.23e-02

Pos logit contributions
bread+0.081
 energ+0.071
 ju+0.070
rily+0.067
 satisfied+0.066
Neg logit contributions
room-0.108
ases-0.107
ee-0.104
rr-0.101
e-0.100
Token trees
Token ID7150
Feature activation+3.33e-02

Pos logit contributions
bread+0.041
 he+0.037
rily+0.036
ment+0.035
 energ+0.034
Neg logit contributions
prise-0.069
ee-0.068
rr-0.068
ases-0.068
room-0.067
Token,
Token ID11
Feature activation+6.53e-02

Pos logit contributions
bread+0.055
 ju+0.052
rily+0.051
ment+0.051
 energ+0.050
Neg logit contributions
room-0.057
ee-0.056
rr-0.054
ases-0.054
e-0.053
Token and
Token ID290
Feature activation+8.04e-02

Pos logit contributions
bread+0.121
ment+0.107
der+0.102
rily+0.100
 tum+0.098
Neg logit contributions
e-0.100
ee-0.098
room-0.095
w-0.095
ases-0.093
Token near
Token ID1474
Feature activation+2.50e-02

Pos logit contributions
bread+0.126
rily+0.113
der+0.104
ment+0.102
yt+0.096
Neg logit contributions
prise-0.141
e-0.140
ee-0.139
ases-0.139
room-0.138
Token the
Token ID262
Feature activation+3.53e-02

Pos logit contributions
bread+0.029
 he+0.026
 ju+0.024
 cuc+0.024
 energ+0.023
Neg logit contributions
ases-0.058
ee-0.058
prise-0.057
room-0.057
rr-0.057
Token pond
Token ID16723
Feature activation+3.32e-02

Pos logit contributions
bread+0.040
 he+0.040
rily+0.035
 ju+0.035
ment+0.034
Neg logit contributions
prise-0.079
ee-0.079
ases-0.078
rr-0.078
uch-0.077
Token.
Token ID13
Feature activation+3.17e-02

Pos logit contributions
bread+0.053
 ju+0.048
 cuc+0.046
 energ+0.045
rily+0.045
Neg logit contributions
room-0.061
ases-0.060
rr-0.059
ee-0.058
prise-0.058
Token They
Token ID1119
Feature activation+2.55e-02

Pos logit contributions
bread+0.036
 he+0.035
 ju+0.032
 cuc+0.031
rily+0.031
Neg logit contributions
prise-0.072
ases-0.071
ee-0.071
rr-0.071
uch-0.070
Token the
Token ID262
Feature activation+3.11e-02

Pos logit contributions
bread+0.009
rily+0.007
 satisfied+0.007
 energ+0.007
 cuc+0.007
Neg logit contributions
ee-0.013
ases-0.013
room-0.012
w-0.012
e-0.012
Token grass
Token ID8701
Feature activation+1.68e-02

Pos logit contributions
 he+0.033
bread+0.032
 ju+0.028
rily+0.027
 until+0.027
Neg logit contributions
prise-0.074
ee-0.073
ases-0.073
uch-0.073
rr-0.073
Token,
Token ID11
Feature activation+5.69e-02

Pos logit contributions
bread+0.022
 ju+0.019
 energ+0.019
 he+0.019
 satisfied+0.018
Neg logit contributions
room-0.035
ee-0.035
ases-0.034
uch-0.034
rr-0.034
Token the
Token ID262
Feature activation+3.63e-02

Pos logit contributions
bread+0.101
rily+0.093
ment+0.085
rier+0.081
 attent+0.080
Neg logit contributions
e-0.090
ee-0.088
w-0.086
ases-0.081
 yourself-0.080
Token trees
Token ID7150
Feature activation+2.98e-02

Pos logit contributions
bread+0.039
 he+0.039
rily+0.034
 ju+0.033
 cuc+0.033
Neg logit contributions
ee-0.083
prise-0.083
ases-0.082
rr-0.082
uch-0.082
Token and
Token ID290
Feature activation+8.04e-02

Pos logit contributions
bread+0.046
rily+0.043
 ju+0.043
ment+0.042
 cuc+0.041
Neg logit contributions
ee-0.053
room-0.052
uch-0.052
ases-0.051
rr-0.050
Token a
Token ID257
Feature activation+4.44e-02

Pos logit contributions
bread+0.133
rily+0.133
 attent+0.114
der+0.114
rier+0.112
Neg logit contributions
ee-0.126
w-0.122
ases-0.121
e-0.120
 you-0.114
Token little
Token ID1310
Feature activation+2.44e-02

Pos logit contributions
 he+0.054
bread+0.054
rily+0.048
 ju+0.047
 until+0.047
Neg logit contributions
prise-0.097
ee-0.096
rr-0.094
uch-0.094
ases-0.092
Token flower
Token ID15061
Feature activation+2.42e-02

Pos logit contributions
 he+0.029
bread+0.029
rily+0.028
ment+0.026
 energ+0.025
Neg logit contributions
ee-0.051
prise-0.050
rr-0.050
ases-0.049
room-0.048
Token.
Token ID13
Feature activation+2.46e-02

Pos logit contributions
bread+0.032
 he+0.030
 ju+0.030
 energ+0.029
rily+0.029
Neg logit contributions
ases-0.049
ee-0.048
prise-0.048
room-0.047
uch-0.047
Token Suddenly
Token ID24975
Feature activation+2.72e-02

Pos logit contributions
bread+0.033
 he+0.029
 cuc+0.029
 ju+0.029
rily+0.029
Neg logit contributions
room-0.051
ases-0.051
ee-0.051
rr-0.051
uch-0.050
Token They
Token ID1119
Feature activation+2.73e-02

Pos logit contributions
bread+0.042
 energ+0.036
rily+0.036
ment+0.036
 cuc+0.035
Neg logit contributions
room-0.063
ases-0.062
rr-0.060
ee-0.058
prise-0.058
Token looked
Token ID3114
Feature activation+2.01e-02

Pos logit contributions
bread+0.044
rily+0.039
ment+0.039
 he+0.038
 ju+0.037
Neg logit contributions
prise-0.051
ee-0.050
ases-0.049
room-0.048
rr-0.047
Token under
Token ID739
Feature activation+5.47e-02

Pos logit contributions
bread+0.043
ment+0.036
 current+0.035
der+0.034
lest+0.032
Neg logit contributions
ee-0.029
prise-0.029
ases-0.026
rr-0.024
w-0.024
Token the
Token ID262
Feature activation+3.14e-02

Pos logit contributions
bread+0.098
 energ+0.084
 ju+0.081
rily+0.079
lest+0.077
Neg logit contributions
ases-0.095
room-0.091
ee-0.090
e-0.086
prise-0.083
Token flowers
Token ID12734
Feature activation+3.28e-02

Pos logit contributions
bread+0.042
 he+0.040
rily+0.036
 energ+0.036
ment+0.036
Neg logit contributions
ee-0.064
prise-0.064
ases-0.063
rr-0.062
room-0.061
Token and
Token ID290
Feature activation+8.02e-02

Pos logit contributions
bread+0.049
 ju+0.044
ment+0.043
rily+0.043
 energ+0.042
Neg logit contributions
ee-0.063
room-0.062
prise-0.060
rr-0.060
ases-0.060
Token behind
Token ID2157
Feature activation+4.33e-02

Pos logit contributions
bread+0.164
der+0.154
rily+0.143
rier+0.131
ment+0.130
Neg logit contributions
w-0.110
e-0.110
ee-0.102
prise-0.101
oh-0.094
Token the
Token ID262
Feature activation+3.55e-02

Pos logit contributions
bread+0.064
 energ+0.055
 ju+0.054
 cuc+0.052
rily+0.051
Neg logit contributions
ases-0.089
room-0.089
ee-0.087
prise-0.083
rr-0.083
Token trees
Token ID7150
Feature activation+3.38e-02

Pos logit contributions
bread+0.043
 he+0.042
rily+0.037
 energ+0.037
ment+0.037
Neg logit contributions
ee-0.076
prise-0.076
ases-0.075
rr-0.075
room-0.073
Token.
Token ID13
Feature activation+3.90e-02

Pos logit contributions
bread+0.054
 ju+0.050
 energ+0.050
 cuc+0.049
ment+0.048
Neg logit contributions
room-0.059
ee-0.058
ases-0.058
rr-0.056
e-0.056
Token Tim
Token ID5045
Feature activation+4.37e-03

Pos logit contributions
bread+0.061
rily+0.053
 energ+0.053
 cuc+0.053
ment+0.051
Neg logit contributions
room-0.075
ases-0.072
rr-0.069
ee-0.067
prise-0.066
Token sun
Token ID4252
Feature activation+5.42e-02

Pos logit contributions
bread+0.029
rily+0.028
ment+0.028
 cuc+0.027
 ju+0.026
Neg logit contributions
ee-0.044
uch-0.043
room-0.041
oh-0.041
ases-0.041
Token shone
Token ID44193
Feature activation+4.60e-02

Pos logit contributions
bread+0.072
 ju+0.069
 current+0.069
 cuc+0.066
 he+0.066
Neg logit contributions
ee-0.109
uch-0.105
w-0.104
rr-0.101
ases-0.100
Token on
Token ID319
Feature activation+5.07e-02

Pos logit contributions
bread+0.075
 current+0.072
 cuc+0.069
 spine+0.066
 he+0.066
Neg logit contributions
ee-0.076
room-0.070
w-0.067
rr-0.066
e-0.064
Token the
Token ID262
Feature activation+5.93e-02

Pos logit contributions
bread+0.073
 ju+0.064
 energ+0.062
 current+0.062
 he+0.060
Neg logit contributions
ee-0.101
room-0.099
e-0.096
ases-0.094
prise-0.094
Token pillow
Token ID28774
Feature activation+6.23e-02

Pos logit contributions
 he+0.068
bread+0.067
 ju+0.059
rily+0.059
 cuc+0.058
Neg logit contributions
ee-0.133
prise-0.131
uch-0.129
rr-0.128
room-0.125
Token,
Token ID11
Feature activation+8.01e-02

Pos logit contributions
bread+0.082
 ju+0.080
 cuc+0.076
 he+0.075
 current+0.075
Neg logit contributions
ee-0.122
room-0.121
rr-0.120
uch-0.119
prise-0.117
Token making
Token ID1642
Feature activation+2.13e-02

Pos logit contributions
bread+0.123
 current+0.103
 cuc+0.097
 ju+0.096
ment+0.096
Neg logit contributions
ee-0.136
e-0.134
w-0.128
ow-0.121
prise-0.121
Token it
Token ID340
Feature activation+2.26e-02

Pos logit contributions
bread+0.029
 he+0.026
 ju+0.026
 cuc+0.025
 current+0.025
Neg logit contributions
ee-0.045
prise-0.043
ases-0.043
room-0.042
rr-0.041
Token dry
Token ID5894
Feature activation+2.28e-02

Pos logit contributions
bread+0.033
 cuc+0.031
rily+0.030
 current+0.030
 ju+0.030
Neg logit contributions
prise-0.042
ee-0.041
rr-0.040
ases-0.040
uch-0.040
Token and
Token ID290
Feature activation+4.05e-02

Pos logit contributions
bread+0.031
 he+0.028
 ju+0.027
 cuc+0.027
 current+0.027
Neg logit contributions
ee-0.046
room-0.045
ases-0.045
uch-0.044
w-0.044
Token fresh
Token ID4713
Feature activation+1.96e-02

Pos logit contributions
bread+0.062
rily+0.054
 cuc+0.052
der+0.051
 Ark+0.050
Neg logit contributions
prise-0.075
uch-0.073
w-0.072
ee-0.071
 yourself-0.068
Token people
Token ID661
Feature activation+2.07e-02

Pos logit contributions
bread+0.066
 he+0.065
 ju+0.059
rily+0.058
 until+0.056
Neg logit contributions
prise-0.118
ee-0.117
rr-0.116
ases-0.115
uch-0.115
Token nearby
Token ID6716
Feature activation+4.05e-02

Pos logit contributions
 he+0.029
bread+0.028
 ju+0.025
 until+0.025
 cuc+0.025
Neg logit contributions
prise-0.043
ases-0.042
ee-0.042
uch-0.042
rr-0.042
Token were
Token ID547
Feature activation+1.14e-03

Pos logit contributions
 he+0.056
bread+0.054
 ju+0.049
 until+0.049
 cuc+0.047
Neg logit contributions
prise-0.084
ases-0.083
uch-0.083
ee-0.083
rr-0.082
Token so
Token ID523
Feature activation+1.33e-02

Pos logit contributions
bread+0.002
 he+0.002
 ju+0.002
 cuc+0.001
rily+0.001
Neg logit contributions
prise-0.002
ases-0.002
ee-0.002
rr-0.002
uch-0.002
Token scared
Token ID12008
Feature activation+3.05e-02

Pos logit contributions
 he+0.015
bread+0.015
 ju+0.013
 until+0.013
 cuc+0.013
Neg logit contributions
prise-0.031
ee-0.030
ases-0.030
rr-0.030
uch-0.030
Token,
Token ID11
Feature activation+8.01e-02

Pos logit contributions
bread+0.040
 ju+0.039
 cuc+0.037
 energ+0.035
rily+0.034
Neg logit contributions
ee-0.062
ases-0.061
rr-0.060
prise-0.060
room-0.059
Token they
Token ID484
Feature activation+1.61e-02

Pos logit contributions
bread+0.087
 ju+0.074
 cuc+0.073
rily+0.073
ment+0.068
Neg logit contributions
ee-0.175
prise-0.169
rr-0.168
oh-0.167
ases-0.166
Token all
Token ID477
Feature activation+2.19e-02

Pos logit contributions
bread+0.023
 he+0.021
 ju+0.021
 cuc+0.020
rily+0.020
Neg logit contributions
ee-0.032
prise-0.032
uch-0.031
ases-0.031
rr-0.031
Token ran
Token ID4966
Feature activation+5.04e-03

Pos logit contributions
bread+0.032
 ju+0.029
rily+0.029
 cuc+0.028
 he+0.027
Neg logit contributions
ee-0.043
prise-0.043
ases-0.042
rr-0.042
uch-0.041
Token away
Token ID1497
Feature activation+4.13e-02

Pos logit contributions
bread+0.007
 he+0.007
 ju+0.006
 cuc+0.006
rily+0.006
Neg logit contributions
prise-0.010
ases-0.010
ee-0.010
rr-0.010
uch-0.010
Token.
Token ID13
Feature activation+2.32e-02

Pos logit contributions
bread+0.063
 ju+0.061
 cuc+0.058
 redd+0.054
 lettuce+0.054
Neg logit contributions
rr-0.076
ases-0.076
prise-0.076
ee-0.074
room-0.071
Token looked
Token ID3114
Feature activation+1.07e-02

Pos logit contributions
 he+0.032
bread+0.031
 ju+0.028
 until+0.027
 cuc+0.027
Neg logit contributions
prise-0.062
ases-0.061
ee-0.061
uch-0.061
rr-0.061
Token everywhere
Token ID8347
Feature activation+1.32e-02

Pos logit contributions
bread+0.021
 current+0.018
ment+0.018
der+0.017
 ju+0.017
Neg logit contributions
prise-0.017
ee-0.017
ases-0.015
w-0.015
rr-0.015
Token,
Token ID11
Feature activation+6.01e-02

Pos logit contributions
bread+0.020
 ju+0.018
rily+0.017
 cuc+0.016
 current+0.016
Neg logit contributions
rr-0.025
ee-0.025
prise-0.025
room-0.024
w-0.024
Token behind
Token ID2157
Feature activation+3.45e-02

Pos logit contributions
bread+0.081
ment+0.069
 cuc+0.063
rily+0.061
 current+0.060
Neg logit contributions
e-0.125
room-0.120
ee-0.119
prise-0.117
ow-0.116
Token trees
Token ID7150
Feature activation+2.29e-02

Pos logit contributions
bread+0.050
 ju+0.043
 energ+0.041
 cuc+0.040
rily+0.039
Neg logit contributions
room-0.070
ases-0.068
ee-0.068
prise-0.067
rr-0.066
Token and
Token ID290
Feature activation+8.00e-02

Pos logit contributions
bread+0.035
 ju+0.033
rily+0.032
 energ+0.032
ment+0.032
Neg logit contributions
room-0.042
ee-0.042
rr-0.041
ases-0.041
prise-0.040
Token under
Token ID739
Feature activation+4.22e-02

Pos logit contributions
bread+0.130
rily+0.125
ment+0.116
der+0.115
 current+0.108
Neg logit contributions
prise-0.134
w-0.131
e-0.128
ee-0.124
uch-0.123
Token rocks
Token ID12586
Feature activation-6.73e-04

Pos logit contributions
bread+0.067
 energ+0.059
 ju+0.059
rily+0.056
lest+0.054
Neg logit contributions
room-0.077
ases-0.074
rr-0.074
ee-0.073
e-0.072
Token,
Token ID11
Feature activation+5.98e-02

Pos logit contributions
ee+0.001
ases+0.001
rr+0.001
room+0.001
prise+0.001
Neg logit contributions
bread-0.001
 ju-0.001
 energ-0.001
 cuc-0.001
ment-0.001
Token until
Token ID1566
Feature activation+3.48e-02

Pos logit contributions
bread+0.091
ment+0.085
rily+0.079
 tum+0.077
 current+0.075
Neg logit contributions
e-0.100
room-0.099
ee-0.098
w-0.096
oh-0.095
Token he
Token ID339
Feature activation-6.48e-04

Pos logit contributions
bread+0.035
 ju+0.024
rily+0.024
der+0.023
 knowing+0.021
Neg logit contributions
prise-0.089
uch-0.088
ee-0.086
w-0.086
oh-0.084
Token (newline)
Token ID198
Feature activation-4.44e-02

Pos logit contributions
prise+0.111
uch+0.110
ases+0.109
ee+0.105
w+0.105
Neg logit contributions
 he-0.033
 until-0.024
bread-0.021
 "-0.020
 which-0.020
Token"
Token ID1
Feature activation-6.83e-02

Pos logit contributions
prise+0.119
ases+0.118
uch+0.117
ee+0.116
w+0.113
Neg logit contributions
 he-0.041
bread-0.036
 which-0.030
 ju-0.029
 until-0.028
TokenOh
Token ID5812
Feature activation-4.85e-02

Pos logit contributions
prise+0.135
ases+0.132
rr+0.130
ee+0.125
uch+0.124
Neg logit contributions
 he-0.115
 ju-0.108
 which-0.100
bread-0.100
 though-0.094
Token,
Token ID11
Feature activation-3.29e-02

Pos logit contributions
prise+0.103
ases+0.102
ee+0.102
uch+0.101
rr+0.101
Neg logit contributions
 he-0.064
bread-0.063
 ju-0.058
 until-0.056
 cuc-0.056
Token I
Token ID314
Feature activation-9.10e-02

Pos logit contributions
prise+0.053
ee+0.052
rr+0.050
uch+0.050
ases+0.049
Neg logit contributions
bread-0.061
 he-0.057
rily-0.056
 ju-0.055
 widened-0.055
Token hadn
Token ID8020
Feature activation-1.38e-01

Pos logit contributions
ases+0.170
prise+0.162
rr+0.161
room+0.159
uch+0.157
Neg logit contributions
 he-0.153
 until-0.139
bread-0.137
 but-0.133
 as-0.131
Token't
Token ID470
Feature activation-6.17e-02

Pos logit contributions
ases+0.230
ymes+0.223
prise+0.221
ee+0.208
w+0.205
Neg logit contributions
 he-0.270
 but-0.234
 as-0.230
 until-0.225
 feeling-0.225
Token thought
Token ID1807
Feature activation-2.96e-02

Pos logit contributions
prise+0.121
ases+0.119
ee+0.119
uch+0.118
rr+0.118
Neg logit contributions
 he-0.092
bread-0.091
 ju-0.084
 until-0.082
 cuc-0.081
Token of
Token ID286
Feature activation-5.44e-02

Pos logit contributions
prise+0.068
ases+0.067
uch+0.066
rr+0.066
ee+0.066
Neg logit contributions
 he-0.039
bread-0.034
 until-0.032
 which-0.032
 as-0.032
Token that
Token ID326
Feature activation-8.64e-03

Pos logit contributions
prise+0.128
uch+0.125
ee+0.124
ases+0.124
rr+0.122
Neg logit contributions
 he-0.068
 which-0.060
 as-0.058
 until-0.057
bread-0.054
Token,"
Token ID553
Feature activation-1.20e-02

Pos logit contributions
prise+0.017
ee+0.017
rr+0.017
ases+0.017
uch+0.017
Neg logit contributions
bread-0.013
 he-0.013
 ju-0.012
rily-0.011
 cuc-0.011
Tokenally
Token ID453
Feature activation-6.69e-02

Pos logit contributions
ases+0.150
prise+0.148
rr+0.147
ee+0.147
uch+0.145
Neg logit contributions
bread-0.083
 he-0.082
 ju-0.078
 until-0.074
 cuc-0.072
Token,
Token ID11
Feature activation-2.88e-02

Pos logit contributions
prise+0.143
ee+0.142
ases+0.140
uch+0.140
rr+0.140
Neg logit contributions
 he-0.086
bread-0.086
 ju-0.078
rily-0.075
 until-0.075
Token you
Token ID345
Feature activation-8.28e-02

Pos logit contributions
prise+0.054
ee+0.053
uch+0.052
ases+0.052
rr+0.052
Neg logit contributions
bread-0.046
 he-0.044
rily-0.041
 ju-0.041
 cuc-0.040
Token know
Token ID760
Feature activation-5.26e-02

Pos logit contributions
ases+0.156
rr+0.148
prise+0.147
room+0.147
ee+0.145
Neg logit contributions
 he-0.137
bread-0.124
 ju-0.121
 tum-0.115
 as-0.115
Token you
Token ID345
Feature activation-7.40e-02

Pos logit contributions
ases+0.121
prise+0.120
ee+0.118
rr+0.118
uch+0.117
Neg logit contributions
 he-0.065
bread-0.059
 which-0.054
 until-0.054
 ju-0.053
Token shouldn
Token ID6584
Feature activation-1.29e-01

Pos logit contributions
ases+0.140
rr+0.136
prise+0.133
room+0.132
ee+0.128
Neg logit contributions
 he-0.125
bread-0.114
 ju-0.109
 which-0.107
 as-0.107
Token't
Token ID470
Feature activation-7.86e-02

Pos logit contributions
prise+0.154
ases+0.149
rr+0.138
ee+0.138
ymes+0.128
Neg logit contributions
 he-0.315
bread-0.282
 ju-0.277
 which-0.275
 current-0.275
Token keep
Token ID1394
Feature activation-5.96e-02

Pos logit contributions
prise+0.157
ases+0.155
ee+0.154
rr+0.153
uch+0.153
Neg logit contributions
 he-0.116
bread-0.112
 ju-0.103
 until-0.101
 which-0.100
Token the
Token ID262
Feature activation-1.18e-02

Pos logit contributions
prise+0.135
rr+0.133
ases+0.132
ee+0.132
uch+0.131
Neg logit contributions
 he-0.073
bread-0.069
 ju-0.063
 until-0.062
 which-0.061
Token gum
Token ID27142
Feature activation-9.80e-03

Pos logit contributions
uch+0.020
ee+0.019
prise+0.019
ases+0.019
rr+0.019
Neg logit contributions
bread-0.021
 he-0.021
rily-0.019
 ju-0.019
 until-0.019
Token.
Token ID13
Feature activation-5.24e-02

Pos logit contributions
prise+0.021
ases+0.021
ee+0.021
rr+0.021
uch+0.021
Neg logit contributions
 he-0.012
bread-0.012
 ju-0.011
 until-0.011
rily-0.010
Token.
Token ID13
Feature activation+8.01e-03

Pos logit contributions
ee+0.043
ases+0.043
rr+0.043
prise+0.043
uch+0.042
Neg logit contributions
bread-0.025
 he-0.023
 ju-0.023
 cuc-0.022
 current-0.021
Token Mama
Token ID40084
Feature activation+8.63e-04

Pos logit contributions
bread+0.010
 he+0.010
 cuc+0.009
 current+0.009
 ju+0.009
Neg logit contributions
prise-0.017
ee-0.016
room-0.016
rr-0.016
ases-0.016
Token said
Token ID531
Feature activation-1.34e-02

Pos logit contributions
bread+0.001
 he+0.001
 ju+0.001
rily+0.001
 cuc+0.001
Neg logit contributions
ee-0.002
prise-0.002
rr-0.002
uch-0.002
ases-0.002
Token if
Token ID611
Feature activation-1.77e-02

Pos logit contributions
prise+0.030
ee+0.030
uch+0.030
ases+0.029
room+0.029
Neg logit contributions
bread-0.015
 ju-0.014
 he-0.014
rily-0.014
 cuc-0.013
Token she
Token ID673
Feature activation-2.75e-02

Pos logit contributions
prise+0.047
ee+0.045
uch+0.045
w+0.044
rr+0.043
Neg logit contributions
bread-0.015
rily-0.013
 cuc-0.012
 he-0.012
 current-0.012
Token wasn
Token ID2492
Feature activation-1.27e-01

Pos logit contributions
prise+0.070
ases+0.070
rr+0.070
room+0.068
uch+0.065
Neg logit contributions
 he-0.031
bread-0.024
 tum-0.022
 ju-0.022
 as-0.022
Token't
Token ID470
Feature activation-7.87e-02

Pos logit contributions
ymes+0.222
ases+0.208
rr+0.206
prise+0.205
w+0.192
Neg logit contributions
 he-0.263
 as-0.228
 until-0.223
 but-0.214
 they-0.209
Token careful
Token ID8161
Feature activation-4.77e-02

Pos logit contributions
prise+0.147
ee+0.145
rr+0.144
ases+0.144
uch+0.143
Neg logit contributions
bread-0.124
 he-0.122
 ju-0.114
rily-0.112
 cuc-0.111
Token she
Token ID673
Feature activation-2.19e-02

Pos logit contributions
prise+0.107
ases+0.106
ee+0.105
uch+0.105
rr+0.105
Neg logit contributions
 he-0.057
bread-0.056
 ju-0.050
 until-0.049
 cuc-0.048
Token could
Token ID714
Feature activation-7.62e-02

Pos logit contributions
prise+0.056
rr+0.056
ases+0.054
room+0.053
ee+0.051
Neg logit contributions
 he-0.025
 as-0.020
 until-0.019
bread-0.019
 which-0.019
Token get
Token ID651
Feature activation-3.01e-02

Pos logit contributions
prise+0.169
rr+0.165
ee+0.156
ases+0.156
ymes+0.155
Neg logit contributions
 he-0.114
 until-0.103
 as-0.097
 but-0.091
 which-0.091
TokenHey
Token ID10814
Feature activation-3.39e-02

Pos logit contributions
ases+0.136
prise+0.135
uch+0.134
rr+0.131
ee+0.130
Neg logit contributions
 he-0.128
bread-0.119
 ju-0.114
 which-0.109
 until-0.108
Token little
Token ID1310
Feature activation-3.27e-02

Pos logit contributions
prise+0.071
ee+0.070
rr+0.069
ases+0.069
uch+0.069
Neg logit contributions
bread-0.045
 he-0.045
 ju-0.041
rily-0.040
 cuc-0.040
Token boy
Token ID2933
Feature activation-3.78e-02

Pos logit contributions
prise+0.070
ases+0.070
rr+0.070
ee+0.069
uch+0.068
Neg logit contributions
bread-0.041
 he-0.040
 ju-0.037
rily-0.037
 until-0.036
Token,
Token ID11
Feature activation-2.61e-02

Pos logit contributions
prise+0.083
ases+0.082
uch+0.082
ee+0.082
rr+0.082
Neg logit contributions
 he-0.047
bread-0.046
 ju-0.041
 until-0.041
 cuc-0.039
Token you
Token ID345
Feature activation-8.06e-02

Pos logit contributions
ases+0.049
prise+0.049
uch+0.049
ee+0.049
rr+0.049
Neg logit contributions
 he-0.041
bread-0.039
 ju-0.037
 until-0.036
 which-0.036
Token shouldn
Token ID6584
Feature activation-1.25e-01

Pos logit contributions
ases+0.156
rr+0.143
room+0.141
prise+0.140
ips+0.137
Neg logit contributions
 he-0.135
 ju-0.112
 tum-0.112
 which-0.111
 as-0.111
Token't
Token ID470
Feature activation-7.24e-02

Pos logit contributions
ases+0.161
prise+0.157
rr+0.148
ee+0.147
ymes+0.142
Neg logit contributions
 he-0.294
 ju-0.254
bread-0.253
 which-0.250
 current-0.248
Token be
Token ID307
Feature activation-7.06e-02

Pos logit contributions
prise+0.158
ases+0.158
ee+0.155
rr+0.152
uch+0.149
Neg logit contributions
 he-0.099
bread-0.085
 which-0.084
 as-0.083
 ju-0.081
Token picking
Token ID10868
Feature activation-3.71e-02

Pos logit contributions
prise+0.150
ee+0.149
ases+0.149
uch+0.148
rr+0.148
Neg logit contributions
 he-0.094
bread-0.090
 ju-0.083
 until-0.082
 which-0.081
Token up
Token ID510
Feature activation+4.81e-04

Pos logit contributions
prise+0.082
ases+0.081
ee+0.081
uch+0.080
rr+0.080
Neg logit contributions
 he-0.045
bread-0.045
 ju-0.040
 until-0.039
 cuc-0.039
Token that
Token ID326
Feature activation-9.59e-04

Pos logit contributions
bread+0.001
 he+0.001
 ju+0.001
 cuc+0.001
rily+0.001
Neg logit contributions
prise-0.001
ases-0.001
ee-0.001
rr-0.001
uch-0.001
Token so
Token ID523
Feature activation+4.81e-02

Pos logit contributions
bread+0.062
rily+0.055
 cuc+0.054
 ju+0.052
ment+0.051
Neg logit contributions
ases-0.103
ee-0.103
room-0.102
rr-0.102
prise-0.100
Token mom
Token ID1995
Feature activation-6.51e-04

Pos logit contributions
bread+0.049
 ju+0.045
 cuc+0.044
rily+0.040
 lettuce+0.037
Neg logit contributions
room-0.116
w-0.115
ee-0.114
ases-0.113
uch-0.111
Token said
Token ID531
Feature activation-9.03e-03

Pos logit contributions
prise+0.001
ee+0.001
ases+0.001
uch+0.001
rr+0.001
Neg logit contributions
 he-0.001
bread-0.001
 ju-0.001
 until-0.001
 cuc-0.001
Token if
Token ID611
Feature activation-7.87e-03

Pos logit contributions
prise+0.023
uch+0.023
ee+0.022
rr+0.022
ases+0.022
Neg logit contributions
bread-0.008
 ju-0.007
rily-0.007
 cuc-0.006
 current-0.006
Token he
Token ID339
Feature activation-2.25e-02

Pos logit contributions
prise+0.025
ee+0.025
uch+0.025
ases+0.025
rr+0.025
Neg logit contributions
bread-0.002
 he-0.001
rily-0.001
 ju-0.001
 cuc-0.001
Token wasn
Token ID2492
Feature activation-1.24e-01

Pos logit contributions
prise+0.058
ases+0.058
rr+0.057
room+0.055
uch+0.053
Neg logit contributions
 he-0.026
bread-0.019
 tum-0.018
 as-0.017
 ju-0.017
Token't
Token ID470
Feature activation-7.04e-02

Pos logit contributions
ymes+0.232
ases+0.217
prise+0.210
rr+0.208
w+0.207
Neg logit contributions
 he-0.246
 as-0.218
 until-0.206
 they-0.197
 but-0.192
Token going
Token ID1016
Feature activation-3.27e-02

Pos logit contributions
prise+0.136
rr+0.133
ee+0.133
ases+0.132
uch+0.131
Neg logit contributions
bread-0.106
 he-0.105
 ju-0.098
 cuc-0.096
rily-0.095
Token to
Token ID284
Feature activation-4.26e-02

Pos logit contributions
prise+0.071
ases+0.070
ee+0.070
uch+0.070
rr+0.070
Neg logit contributions
 he-0.041
bread-0.041
 ju-0.037
 until-0.036
 cuc-0.035
Token help
Token ID1037
Feature activation-1.30e-02

Pos logit contributions
prise+0.090
ases+0.089
ee+0.089
rr+0.088
uch+0.087
Neg logit contributions
 he-0.062
bread-0.054
 until-0.053
 as-0.052
 which-0.051
Token,
Token ID11
Feature activation+4.01e-02

Pos logit contributions
prise+0.031
ee+0.030
uch+0.030
ases+0.030
rr+0.030
Neg logit contributions
 he-0.016
bread-0.014
 until-0.014
 which-0.013
 as-0.013
Token "
Token ID366
Feature activation-6.52e-02

Pos logit contributions
bread+0.029
 current+0.026
 ju+0.026
ment+0.026
 crops+0.025
Neg logit contributions
rr-0.055
oh-0.054
room-0.053
prise-0.052
ee-0.051
TokenThe
Token ID464
Feature activation+3.85e-03

Pos logit contributions
prise+0.109
ases+0.107
ee+0.107
uch+0.106
rr+0.106
Neg logit contributions
 he-0.116
bread-0.115
 ju-0.107
 until-0.105
 cuc-0.104
Token sign
Token ID1051
Feature activation-1.00e-02

Pos logit contributions
bread+0.004
 he+0.004
rily+0.004
 ju+0.004
 cuc+0.004
Neg logit contributions
prise-0.009
ee-0.009
rr-0.009
uch-0.009
ases-0.009
Token says
Token ID1139
Feature activation-4.24e-02

Pos logit contributions
prise+0.017
ases+0.017
rr+0.017
uch+0.017
ee+0.017
Neg logit contributions
 he-0.018
bread-0.017
 which-0.016
 until-0.016
 ju-0.016
Token '
Token ID705
Feature activation-7.44e-02

Pos logit contributions
ases+0.087
prise+0.087
ee+0.086
uch+0.085
rr+0.085
Neg logit contributions
 he-0.063
bread-0.059
 until-0.053
 which-0.053
 ju-0.053
TokenCA
Token ID8141
Feature activation-1.24e-01

Pos logit contributions
ases+0.128
prise+0.127
rr+0.126
room+0.120
uch+0.120
Neg logit contributions
 he-0.131
bread-0.126
 ju-0.121
 though-0.112
 until-0.112
TokenUTION
Token ID35354
Feature activation-2.19e-02

Pos logit contributions
ases+0.208
rr+0.201
prise+0.189
w+0.187
ee+0.186
Neg logit contributions
 he-0.202
bread-0.190
 ju-0.190
 which-0.189
 as-0.187
Token'."
Token ID30827
Feature activation+2.18e-02

Pos logit contributions
prise+0.038
ases+0.037
uch+0.036
rr+0.035
ee+0.035
Neg logit contributions
bread-0.038
 he-0.038
 until-0.035
rily-0.034
 ju-0.034
Token (space)
Token ID220
Feature activation-3.48e-02

Pos logit contributions
bread+0.032
 current+0.029
 ju+0.028
rily+0.028
 cuc+0.028
Neg logit contributions
room-0.041
prise-0.040
ases-0.040
ee-0.039
rr-0.039
Token (newline)
Token ID198
Feature activation-3.83e-02

Pos logit contributions
prise+0.079
uch+0.077
ases+0.077
ee+0.076
rr+0.076
Neg logit contributions
 he-0.043
bread-0.040
 until-0.036
 ju-0.036
rily-0.035
Token (newline)
Token ID198
Feature activation-4.26e-02

Pos logit contributions
uch+0.092
prise+0.091
ases+0.091
ee+0.090
rr+0.089
Neg logit contributions
 he-0.043
bread-0.038
 until-0.036
 ju-0.034
 which-0.033
Token Lily
Token ID20037
Feature activation-2.70e-03

Pos logit contributions
ases+0.009
rr+0.009
prise+0.009
ee+0.009
room+0.009
Neg logit contributions
bread-0.006
 ju-0.005
 he-0.005
 cuc-0.005
rily-0.005
Token.
Token ID13
Feature activation-8.49e-03

Pos logit contributions
rr+0.005
ee+0.005
prise+0.005
uch+0.005
ases+0.005
Neg logit contributions
bread-0.004
 ju-0.004
 cuc-0.004
 he-0.004
 lettuce-0.003
Token (newline)
Token ID198
Feature activation-3.37e-02

Pos logit contributions
prise+0.020
rr+0.019
ases+0.019
ee+0.019
uch+0.019
Neg logit contributions
bread-0.010
 he-0.009
 ju-0.008
rily-0.008
 current-0.008
Token (newline)
Token ID198
Feature activation-3.95e-02

Pos logit contributions
ases+0.087
prise+0.087
uch+0.086
ee+0.085
rr+0.083
Neg logit contributions
 he-0.034
 until-0.028
 which-0.024
bread-0.023
 but-0.022
Token"
Token ID1
Feature activation-6.67e-02

Pos logit contributions
ases+0.070
prise+0.069
ee+0.069
uch+0.068
rr+0.066
Neg logit contributions
 he-0.068
bread-0.064
 until-0.061
 which-0.060
 ju-0.060
TokenDon
Token ID3987
Feature activation-1.23e-01

Pos logit contributions
prise+0.117
ee+0.116
uch+0.115
ases+0.115
rr+0.113
Neg logit contributions
bread-0.112
 he-0.112
 ju-0.102
 until-0.101
rily-0.101
Token't
Token ID470
Feature activation-7.44e-02

Pos logit contributions
ases+0.166
prise+0.163
ee+0.153
rr+0.151
w+0.136
Neg logit contributions
 he-0.278
 as-0.243
bread-0.240
 they-0.237
 until-0.237
Token worry
Token ID5490
Feature activation-5.43e-02

Pos logit contributions
ases+0.143
prise+0.141
ee+0.141
rr+0.136
room+0.135
Neg logit contributions
 he-0.116
bread-0.108
 ju-0.102
 until-0.102
 tum-0.102
Token,
Token ID11
Feature activation-1.89e-02

Pos logit contributions
prise+0.116
ee+0.115
rr+0.114
ases+0.114
uch+0.114
Neg logit contributions
 he-0.070
bread-0.070
 ju-0.063
 cuc-0.061
 until-0.060
Token Lily
Token ID20037
Feature activation-5.91e-02

Pos logit contributions
prise+0.040
rr+0.038
ee+0.037
room+0.037
uch+0.036
Neg logit contributions
bread-0.026
ment-0.023
rily-0.023
 he-0.022
 current-0.022
Token.
Token ID13
Feature activation-7.50e-02

Pos logit contributions
ee+0.113
prise+0.112
rr+0.111
uch+0.110
w+0.108
Neg logit contributions
bread-0.091
 ju-0.083
 cuc-0.083
 he-0.083
rily-0.081
Token new
Token ID649
Feature activation-2.34e-02

Pos logit contributions
ee+0.077
uch+0.077
prise+0.077
ases+0.075
rr+0.074
Neg logit contributions
 he-0.038
bread-0.032
 which-0.032
 until-0.032
 as-0.031
Token words
Token ID2456
Feature activation-1.70e-02

Pos logit contributions
prise+0.054
uch+0.054
ee+0.053
ases+0.052
rr+0.051
Neg logit contributions
 he-0.028
bread-0.026
 until-0.025
 which-0.024
 as-0.023
Token every
Token ID790
Feature activation-3.38e-02

Pos logit contributions
prise+0.043
uch+0.042
ee+0.042
ases+0.041
rr+0.041
Neg logit contributions
 he-0.019
bread-0.016
 until-0.015
 which-0.015
 as-0.014
Token day
Token ID1110
Feature activation+4.96e-03

Pos logit contributions
prise+0.077
ases+0.077
uch+0.076
room+0.074
ee+0.074
Neg logit contributions
 he-0.040
bread-0.034
 until-0.033
 which-0.032
 as-0.032
Token.
Token ID13
Feature activation-4.52e-02

Pos logit contributions
 he+0.005
bread+0.004
 until+0.004
 as+0.004
 which+0.004
Neg logit contributions
prise-0.013
uch-0.012
ee-0.012
ases-0.012
rr-0.012
Token Conf
Token ID7326
Feature activation-1.23e-01

Pos logit contributions
prise+0.082
uch+0.081
ee+0.080
ases+0.078
rr+0.077
Neg logit contributions
 he-0.076
bread-0.073
 ju-0.068
 until-0.068
 which-0.066
Tokenused
Token ID1484
Feature activation-1.96e-02

Pos logit contributions
ymes+0.203
prise+0.197
ee+0.193
rr+0.189
room+0.188
Neg logit contributions
 he-0.184
 as-0.181
 which-0.168
ment-0.161
 until-0.160
Token is
Token ID318
Feature activation-1.94e-02

Pos logit contributions
prise+0.045
uch+0.044
ee+0.044
ases+0.043
rr+0.042
Neg logit contributions
 he-0.025
 until-0.022
bread-0.022
 which-0.021
 as-0.020
Token just
Token ID655
Feature activation-1.39e-03

Pos logit contributions
prise+0.041
ee+0.040
ases+0.040
uch+0.040
rr+0.040
Neg logit contributions
 he-0.026
bread-0.025
 until-0.023
 ju-0.023
 which-0.023
Token a
Token ID257
Feature activation-1.69e-03

Pos logit contributions
ases+0.003
prise+0.003
rr+0.003
ee+0.003
uch+0.003
Neg logit contributions
bread-0.002
 he-0.002
rily-0.002
 ju-0.002
 widened-0.002
Token word
Token ID1573
Feature activation-1.47e-02

Pos logit contributions
ases+0.003
prise+0.003
uch+0.003
ee+0.003
rr+0.003
Neg logit contributions
 he-0.002
bread-0.002
 ju-0.002
 until-0.002
 cuc-0.002
Token face
Token ID1986
Feature activation+2.39e-02

Pos logit contributions
 ju+0.026
bread+0.025
 he+0.024
 cuc+0.024
rily+0.022
Neg logit contributions
rr-0.056
room-0.054
ases-0.053
ee-0.053
prise-0.052
Token and
Token ID290
Feature activation+2.64e-02

Pos logit contributions
bread+0.026
 he+0.025
 ju+0.023
 cuc+0.023
rily+0.022
Neg logit contributions
uch-0.055
ee-0.055
prise-0.055
ases-0.055
rr-0.055
Token said
Token ID531
Feature activation-2.11e-02

Pos logit contributions
bread+0.042
 ju+0.039
rily+0.038
 current+0.036
ment+0.036
Neg logit contributions
prise-0.049
rr-0.048
ee-0.047
ases-0.046
w-0.046
Token,
Token ID11
Feature activation+1.62e-02

Pos logit contributions
prise+0.045
ee+0.045
ases+0.045
uch+0.045
rr+0.044
Neg logit contributions
 he-0.027
bread-0.027
 ju-0.024
rily-0.023
 until-0.023
Token "
Token ID366
Feature activation-7.80e-02

Pos logit contributions
bread+0.013
 ju+0.012
rily+0.012
 current+0.011
 energ+0.011
Neg logit contributions
rr-0.041
prise-0.041
ee-0.040
room-0.040
uch-0.039
TokenDon
Token ID3987
Feature activation-1.23e-01

Pos logit contributions
prise+0.117
ases+0.116
ee+0.115
rr+0.114
uch+0.114
Neg logit contributions
 he-0.153
bread-0.149
 ju-0.141
 until-0.138
 which-0.136
Token't
Token ID470
Feature activation-5.76e-02

Pos logit contributions
ases+0.172
prise+0.170
ee+0.159
rr+0.156
w+0.145
Neg logit contributions
 he-0.272
 as-0.236
 they-0.236
 until-0.232
bread-0.229
Token worry
Token ID5490
Feature activation-5.16e-02

Pos logit contributions
ases+0.107
prise+0.107
ee+0.107
uch+0.105
rr+0.102
Neg logit contributions
 he-0.092
bread-0.087
 ju-0.082
 tum-0.081
 until-0.081
Token,
Token ID11
Feature activation-2.38e-02

Pos logit contributions
prise+0.117
ases+0.115
ee+0.115
uch+0.114
rr+0.114
Neg logit contributions
 he-0.062
bread-0.060
 until-0.054
 ju-0.053
rily-0.051
Token Amy
Token ID14235
Feature activation-4.94e-02

Pos logit contributions
prise+0.046
ee+0.044
rr+0.044
uch+0.044
ases+0.044
Neg logit contributions
bread-0.036
 he-0.034
rily-0.032
 cuc-0.032
 ju-0.032
Token!
Token ID0
Feature activation-7.64e-02

Pos logit contributions
prise+0.112
ases+0.111
ee+0.111
uch+0.110
rr+0.110
Neg logit contributions
 he-0.057
bread-0.056
 ju-0.050
 until-0.049
 cuc-0.048
Token
Token ID198
Feature activation-3.40e-02

Pos logit contributions
uch+0.072
prise+0.072
ases+0.071
ee+0.071
rr+0.069
Neg logit contributions
 he-0.034
 until-0.028
bread-0.027
 which-0.025
 ju-0.024
Token"
Token ID1
Feature activation-6.38e-02

Pos logit contributions
prise+0.049
rr+0.048
ases+0.048
ee+0.048
room+0.047
Neg logit contributions
bread-0.068
 he-0.066
 ju-0.063
 cuc-0.063
rily-0.062
TokenNo
Token ID2949
Feature activation-3.94e-02

Pos logit contributions
prise+0.111
rr+0.105
uch+0.103
ases+0.102
ee+0.099
Neg logit contributions
 he-0.124
 ju-0.109
bread-0.107
 current-0.103
 cuc-0.103
Token,
Token ID11
Feature activation-5.53e-03

Pos logit contributions
prise+0.092
ases+0.091
ee+0.090
uch+0.090
rr+0.089
Neg logit contributions
 he-0.047
bread-0.043
 until-0.040
 which-0.038
 ju-0.037
Token you
Token ID345
Feature activation-7.53e-02

Pos logit contributions
prise+0.009
ee+0.009
uch+0.009
ases+0.009
rr+0.009
Neg logit contributions
bread-0.010
rily-0.009
ment-0.009
 energ-0.009
 widened-0.009
Token didn
Token ID1422
Feature activation-1.22e-01

Pos logit contributions
prise+0.137
ases+0.134
loe+0.132
room+0.127
ymes+0.127
Neg logit contributions
 he-0.132
 tum-0.113
 as-0.112
bread-0.110
 ju-0.108
Token't
Token ID470
Feature activation-9.53e-02

Pos logit contributions
prise+0.208
ymes+0.189
ases+0.185
ee+0.184
rr+0.181
Neg logit contributions
 he-0.261
 as-0.214
 they-0.209
 which-0.205
 until-0.197
Token,
Token ID11
Feature activation-1.19e-02

Pos logit contributions
loe+0.188
prise+0.187
uch+0.179
ee+0.177
ases+0.173
Neg logit contributions
 he-0.158
 as-0.137
 until-0.134
 but-0.128
bread-0.124
Token Tom
Token ID4186
Feature activation-5.24e-02

Pos logit contributions
prise+0.019
ee+0.019
ases+0.019
uch+0.019
rr+0.019
Neg logit contributions
bread-0.021
 he-0.020
rily-0.019
 ju-0.019
 cuc-0.019
Token!
Token ID0
Feature activation-7.26e-02

Pos logit contributions
prise+0.114
ases+0.111
uch+0.110
ee+0.110
rr+0.110
Neg logit contributions
 he-0.071
bread-0.064
 until-0.060
 which-0.058
 ju-0.057
Token I
Token ID314
Feature activation-8.08e-02

Pos logit contributions
prise+0.139
uch+0.128
ee+0.125
w+0.124
rr+0.121
Neg logit contributions
 he-0.126
bread-0.111
 tum-0.106
 until-0.104
 which-0.104
Token She
Token ID1375
Feature activation+1.64e-02

Pos logit contributions
bread+0.017
ment+0.015
 cuc+0.015
 ju+0.015
 satisfied+0.015
Neg logit contributions
rr-0.021
ee-0.021
room-0.021
loe-0.021
oh-0.021
Token told
Token ID1297
Feature activation-1.78e-02

Pos logit contributions
bread+0.023
 he+0.021
 ju+0.020
rily+0.020
 cuc+0.020
Neg logit contributions
ee-0.034
prise-0.034
uch-0.033
rr-0.032
ases-0.032
Token Bobby
Token ID17155
Feature activation-2.59e-02

Pos logit contributions
prise+0.032
ee+0.031
ases+0.031
rr+0.031
uch+0.031
Neg logit contributions
bread-0.029
 he-0.029
 ju-0.027
 cuc-0.026
 until-0.026
Token that
Token ID326
Feature activation+8.32e-03

Pos logit contributions
ee+0.055
prise+0.054
uch+0.054
ases+0.053
w+0.053
Neg logit contributions
bread-0.032
 ju-0.030
 cuc-0.030
 widened-0.029
rily-0.028
Token he
Token ID339
Feature activation-2.54e-02

Pos logit contributions
bread+0.008
rily+0.006
 widened+0.006
 ju+0.006
 cuc+0.006
Neg logit contributions
prise-0.021
ee-0.021
uch-0.020
ases-0.020
rr-0.020
Token wasn
Token ID2492
Feature activation-1.22e-01

Pos logit contributions
prise+0.058
ases+0.058
rr+0.057
room+0.056
ee+0.055
Neg logit contributions
 he-0.033
bread-0.029
 ju-0.026
 until-0.025
 which-0.025
Token't
Token ID470
Feature activation-7.13e-02

Pos logit contributions
ymes+0.200
prise+0.197
ases+0.194
rr+0.192
w+0.177
Neg logit contributions
 he-0.257
 as-0.217
 which-0.207
 they-0.207
 until-0.206
Token ready
Token ID3492
Feature activation-2.25e-02

Pos logit contributions
prise+0.121
rr+0.119
ee+0.118
ases+0.116
uch+0.115
Neg logit contributions
bread-0.123
 he-0.117
 widened-0.114
 ju-0.113
rily-0.113
Token for
Token ID329
Feature activation-2.57e-02

Pos logit contributions
prise+0.050
ases+0.049
ee+0.049
rr+0.049
uch+0.049
Neg logit contributions
bread-0.027
 he-0.027
 ju-0.024
 cuc-0.023
rily-0.023
Token TV
Token ID3195
Feature activation-1.34e-02

Pos logit contributions
ee+0.062
uch+0.061
ases+0.060
prise+0.060
w+0.059
Neg logit contributions
 he-0.033
 until-0.028
 as-0.028
bread-0.027
 which-0.026
Token time
Token ID640
Feature activation+2.02e-03

Pos logit contributions
prise+0.032
ee+0.032
ases+0.032
uch+0.032
rr+0.032
Neg logit contributions
 he-0.016
bread-0.014
 until-0.013
 which-0.012
 as-0.012
Token to
Token ID284
Feature activation+1.36e-02

Pos logit contributions
prise+0.022
ases+0.022
ee+0.022
rr+0.022
uch+0.022
Neg logit contributions
 he-0.012
bread-0.012
 ju-0.010
 until-0.010
 cuc-0.010
Token it
Token ID340
Feature activation+1.46e-02

Pos logit contributions
 he+0.013
bread+0.010
 until+0.010
 which+0.009
 as+0.009
Neg logit contributions
prise-0.036
uch-0.036
rr-0.035
ases-0.035
ee-0.035
Token and
Token ID290
Feature activation+3.80e-02

Pos logit contributions
bread+0.017
 he+0.015
 ju+0.015
 cuc+0.014
rily+0.014
Neg logit contributions
ee-0.033
ases-0.033
prise-0.033
rr-0.033
uch-0.032
Token said
Token ID531
Feature activation-1.26e-02

Pos logit contributions
bread+0.059
rily+0.054
ment+0.053
 cuc+0.050
 current+0.050
Neg logit contributions
ee-0.075
uch-0.071
w-0.070
oh-0.070
prise-0.069
Token "
Token ID366
Feature activation-7.30e-02

Pos logit contributions
prise+0.029
ases+0.029
ee+0.029
uch+0.029
rr+0.029
Neg logit contributions
 he-0.014
bread-0.014
 ju-0.012
 until-0.012
 cuc-0.012
TokenSl
Token ID11122
Feature activation-1.22e-01

Pos logit contributions
prise+0.144
ases+0.140
rr+0.135
uch+0.130
loe+0.129
Neg logit contributions
 he-0.119
 ju-0.111
bread-0.108
 until-0.104
 which-0.103
Tokenide
Token ID485
Feature activation-7.36e-02

Pos logit contributions
w+0.208
prise+0.204
ases+0.203
ymes+0.199
uch+0.188
Neg logit contributions
rily-0.186
 as-0.171
 attent-0.168
 which-0.167
 int-0.165
Token!"
Token ID2474
Feature activation+2.52e-02

Pos logit contributions
prise+0.141
uch+0.136
ases+0.133
rr+0.128
ee+0.127
Neg logit contributions
 he-0.120
bread-0.113
 until-0.105
 as-0.104
 which-0.104
Token He
Token ID679
Feature activation+1.10e-02

Pos logit contributions
bread+0.039
 cuc+0.036
 ju+0.035
rily+0.033
 current+0.033
Neg logit contributions
room-0.050
ee-0.048
ases-0.048
rr-0.047
uch-0.046
Token went
Token ID1816
Feature activation-2.97e-02

Pos logit contributions
 he+0.013
bread+0.013
 ju+0.012
 until+0.011
 cuc+0.011
Neg logit contributions
prise-0.025
ases-0.025
rr-0.024
ee-0.024
uch-0.024
Token down
Token ID866
Feature activation+8.50e-03

Pos logit contributions
prise+0.067
uch+0.065
ases+0.065
rr+0.065
ee+0.065
Neg logit contributions
 he-0.037
bread-0.035
 until-0.032
 ju-0.032
 which-0.031
Token and
Token ID290
Feature activation+4.07e-02

Pos logit contributions
bread+0.066
ment+0.056
rily+0.054
 current+0.053
 cuc+0.052
Neg logit contributions
room-0.099
ee-0.099
prise-0.099
ases-0.097
e-0.097
Token told
Token ID1297
Feature activation-1.14e-02

Pos logit contributions
bread+0.048
rily+0.039
 ju+0.038
 widened+0.037
 current+0.037
Neg logit contributions
prise-0.087
ee-0.086
room-0.085
e-0.085
w-0.084
Token him
Token ID683
Feature activation-2.87e-03

Pos logit contributions
prise+0.015
ee+0.015
ases+0.015
rr+0.014
uch+0.014
Neg logit contributions
bread-0.024
 he-0.024
 ju-0.023
 cuc-0.022
rily-0.022
Token that
Token ID326
Feature activation+1.48e-02

Pos logit contributions
prise+0.006
ee+0.006
ases+0.006
rr+0.006
room+0.006
Neg logit contributions
bread-0.003
 ju-0.003
 cuc-0.003
 current-0.003
rily-0.003
Token he
Token ID339
Feature activation-1.46e-02

Pos logit contributions
bread+0.012
rily+0.010
 widened+0.009
 ju+0.008
ment+0.008
Neg logit contributions
prise-0.039
ases-0.038
ee-0.038
uch-0.038
rr-0.037
Token wasn
Token ID2492
Feature activation-1.22e-01

Pos logit contributions
prise+0.034
ases+0.033
rr+0.033
room+0.032
ee+0.032
Neg logit contributions
 he-0.020
bread-0.016
 ju-0.015
 until-0.015
 as-0.015
Token't
Token ID470
Feature activation-7.77e-02

Pos logit contributions
ymes+0.198
ases+0.197
prise+0.194
rr+0.193
w+0.181
Neg logit contributions
 he-0.255
 as-0.220
bread-0.207
 until-0.204
 which-0.204
Token alone
Token ID3436
Feature activation-2.28e-02

Pos logit contributions
prise+0.141
rr+0.140
ee+0.138
ases+0.137
uch+0.135
Neg logit contributions
bread-0.125
 he-0.118
 ju-0.112
 widened-0.112
ment-0.111
Token.
Token ID13
Feature activation+4.54e-03

Pos logit contributions
prise+0.052
ee+0.051
ases+0.051
uch+0.051
rr+0.051
Neg logit contributions
 he-0.027
bread-0.026
 ju-0.023
 until-0.023
rily-0.022
Token
Token ID198
Feature activation-3.20e-02

Pos logit contributions
bread+0.005
 he+0.005
 ju+0.005
rily+0.004
 cuc+0.004
Neg logit contributions
ases-0.010
prise-0.010
ee-0.010
rr-0.010
room-0.010
Token
Token ID198
Feature activation-3.12e-02

Pos logit contributions
prise+0.086
ases+0.085
uch+0.085
rr+0.082
ee+0.081
Neg logit contributions
 he-0.031
 until-0.023
 which-0.022
bread-0.019
 but-0.019
Token had
Token ID550
Feature activation-3.97e-02

Pos logit contributions
prise+0.019
ases+0.019
rr+0.018
ee+0.018
uch+0.018
Neg logit contributions
 he-0.010
bread-0.010
 ju-0.009
 until-0.009
 which-0.009
Token warned
Token ID7728
Feature activation-2.48e-02

Pos logit contributions
prise+0.082
ases+0.082
ee+0.081
rr+0.081
uch+0.080
Neg logit contributions
 he-0.057
bread-0.053
 ju-0.050
 which-0.048
 until-0.048
Token her
Token ID607
Feature activation-1.54e-02

Pos logit contributions
prise+0.058
ases+0.057
uch+0.057
rr+0.056
ee+0.056
Neg logit contributions
 he-0.029
bread-0.026
 until-0.026
 as-0.025
 which-0.024
Token but
Token ID475
Feature activation-5.13e-03

Pos logit contributions
prise+0.027
uch+0.027
ee+0.027
rr+0.027
ases+0.026
Neg logit contributions
 ju-0.023
 cuc-0.023
bread-0.023
 widened-0.022
rily-0.021
Token she
Token ID673
Feature activation-2.78e-02

Pos logit contributions
prise+0.012
ee+0.012
uch+0.012
w+0.011
ases+0.011
Neg logit contributions
bread-0.006
rily-0.005
 cuc-0.005
 ju-0.005
 current-0.004
Token hadn
Token ID8020
Feature activation-1.22e-01

Pos logit contributions
prise+0.059
ases+0.059
rr+0.059
room+0.056
ee+0.056
Neg logit contributions
 he-0.039
bread-0.036
 ju-0.032
 until-0.032
 which-0.031
Tokenâ
Token ID22940
Feature activation-7.87e-02

Pos logit contributions
prise+0.191
ases+0.182
ymes+0.177
rr+0.165
ee+0.164
Neg logit contributions
 he-0.264
 until-0.222
bread-0.218
 but-0.215
 as-0.214
Token€
Token ID26391
Feature activation-1.66e-02

Pos logit contributions
ases+0.129
prise+0.129
rr+0.122
room+0.118
ee+0.117
Neg logit contributions
 he-0.134
bread-0.130
 energ-0.123
 until-0.121
 ju-0.120
Token™
Token ID8151
Feature activation-5.21e-02

Pos logit contributions
ases+0.030
rr+0.030
prise+0.030
ee+0.030
uch+0.028
Neg logit contributions
 he-0.028
bread-0.026
 until-0.025
 which-0.024
 ju-0.024
Tokent
Token ID83
Feature activation-5.12e-02

Pos logit contributions
ases+0.087
prise+0.086
rr+0.083
ee+0.078
uch+0.076
Neg logit contributions
 he-0.101
 until-0.092
bread-0.091
 which-0.088
 as-0.086
Token listened
Token ID16399
Feature activation+3.20e-03

Pos logit contributions
prise+0.095
ee+0.094
uch+0.094
ases+0.093
rr+0.093
Neg logit contributions
bread-0.080
 he-0.080
rily-0.075
 ju-0.075
 cuc-0.074
Token face
Token ID1986
Feature activation+2.23e-02

Pos logit contributions
 he+0.032
bread+0.029
 until+0.028
 tum+0.027
 knowing+0.026
Neg logit contributions
prise-0.055
ases-0.054
uch-0.053
ee-0.053
rr-0.053
Token and
Token ID290
Feature activation+2.96e-02

Pos logit contributions
bread+0.026
 he+0.025
 ju+0.024
 cuc+0.024
rily+0.023
Neg logit contributions
ee-0.049
prise-0.049
uch-0.049
ases-0.048
rr-0.048
Token said
Token ID531
Feature activation-1.11e-02

Pos logit contributions
bread+0.046
 ju+0.042
rily+0.041
 current+0.038
 cuc+0.038
Neg logit contributions
prise-0.057
ee-0.053
ases-0.052
rr-0.051
room-0.051
Token,
Token ID11
Feature activation+2.34e-02

Pos logit contributions
prise+0.023
rr+0.022
ee+0.022
uch+0.022
room+0.022
Neg logit contributions
bread-0.014
 ju-0.013
 current-0.013
rily-0.013
 he-0.012
Token "
Token ID366
Feature activation-7.91e-02

Pos logit contributions
bread+0.024
 ju+0.021
 current+0.021
rily+0.020
ment+0.019
Neg logit contributions
rr-0.052
prise-0.051
oh-0.051
ee-0.051
room-0.051
TokenDon
Token ID3987
Feature activation-1.21e-01

Pos logit contributions
prise+0.138
ases+0.138
rr+0.135
uch+0.135
ee+0.135
Neg logit contributions
 he-0.136
bread-0.132
 ju-0.125
 until-0.121
rily-0.119
Token't
Token ID470
Feature activation-5.81e-02

Pos logit contributions
ases+0.184
prise+0.177
rr+0.161
ee+0.157
w+0.152
Neg logit contributions
 he-0.266
 as-0.231
bread-0.224
 until-0.222
 they-0.220
Token worry
Token ID5490
Feature activation-4.58e-02

Pos logit contributions
prise+0.105
ases+0.104
ee+0.103
uch+0.102
rr+0.101
Neg logit contributions
 he-0.097
bread-0.093
 ju-0.087
 until-0.085
 which-0.084
Token,
Token ID11
Feature activation-1.75e-02

Pos logit contributions
prise+0.097
ee+0.096
rr+0.096
ases+0.095
uch+0.095
Neg logit contributions
bread-0.059
 he-0.057
 ju-0.053
 cuc-0.052
 until-0.050
Token I
Token ID314
Feature activation-8.35e-02

Pos logit contributions
prise+0.031
ee+0.031
uch+0.030
rr+0.030
room+0.028
Neg logit contributions
bread-0.029
 cuc-0.025
ment-0.025
 he-0.025
rily-0.025
Token will
Token ID481
Feature activation-7.54e-02

Pos logit contributions
ases+0.133
rr+0.130
uch+0.127
w+0.125
prise+0.124
Neg logit contributions
 he-0.162
 until-0.149
bread-0.144
 as-0.139
 feeling-0.136
Token.
Token ID13
Feature activation+1.29e-02

Pos logit contributions
prise+0.029
ases+0.029
ee+0.029
rr+0.029
room+0.028
Neg logit contributions
bread-0.020
 he-0.019
 ju-0.018
 cuc-0.017
rily-0.017
Token He
Token ID679
Feature activation+1.28e-02

Pos logit contributions
bread+0.021
ment+0.018
lights+0.018
 ju+0.018
rily+0.017
Neg logit contributions
oh-0.020
rr-0.020
 Go-0.020
ow-0.019
loe-0.018
Token said
Token ID531
Feature activation-4.56e-03

Pos logit contributions
bread+0.020
ment+0.018
rily+0.018
 ju+0.018
 atmosphere+0.017
Neg logit contributions
ee-0.023
prise-0.023
uch-0.023
rr-0.023
w-0.023
Token,
Token ID11
Feature activation+1.74e-02

Pos logit contributions
prise+0.008
rr+0.008
uch+0.008
w+0.007
ases+0.007
Neg logit contributions
bread-0.007
 ju-0.007
rily-0.007
 current-0.007
 widened-0.007
Token "
Token ID366
Feature activation-8.47e-02

Pos logit contributions
bread+0.020
 ju+0.019
lights+0.018
ment+0.018
 current+0.017
Neg logit contributions
rr-0.037
oh-0.036
prise-0.035
ee-0.034
ow-0.034
TokenDon
Token ID3987
Feature activation-1.21e-01

Pos logit contributions
prise+0.143
ee+0.141
uch+0.139
w+0.139
rr+0.137
Neg logit contributions
bread-0.151
 he-0.146
 until-0.137
rily-0.135
 tum-0.134
Token't
Token ID470
Feature activation-5.90e-02

Pos logit contributions
ases+0.139
prise+0.137
rr+0.127
ee+0.127
room+0.120
Neg logit contributions
 he-0.297
bread-0.269
 ju-0.263
 as-0.261
 until-0.258
Token worry
Token ID5490
Feature activation-5.01e-02

Pos logit contributions
prise+0.100
ases+0.098
ee+0.098
rr+0.098
uch+0.098
Neg logit contributions
 he-0.103
bread-0.102
 ju-0.095
 until-0.093
rily-0.092
Token little
Token ID1310
Feature activation-3.89e-02

Pos logit contributions
rr+0.096
ee+0.095
prise+0.095
uch+0.092
ases+0.092
Neg logit contributions
bread-0.073
 he-0.068
 ju-0.067
 energ-0.065
 cuc-0.065
Token one
Token ID530
Feature activation-3.04e-02

Pos logit contributions
prise+0.088
ases+0.087
ee+0.087
rr+0.086
uch+0.085
Neg logit contributions
bread-0.046
 he-0.044
rily-0.040
 ju-0.039
 cuc-0.039
Token,
Token ID11
Feature activation-2.36e-02

Pos logit contributions
prise+0.070
ases+0.069
ee+0.069
uch+0.068
rr+0.068
Neg logit contributions
 he-0.036
bread-0.035
 ju-0.031
 until-0.030
 cuc-0.029
Token
Token ID198
Feature activation-3.43e-02

Pos logit contributions
ases+0.075
prise+0.074
uch+0.072
rr+0.072
ee+0.072
Neg logit contributions
 he-0.033
bread-0.025
 until-0.025
 tum-0.024
ment-0.023
TokenMom
Token ID29252
Feature activation-1.87e-02

Pos logit contributions
ases+0.076
prise+0.074
ee+0.073
rr+0.073
uch+0.071
Neg logit contributions
 he-0.047
bread-0.041
 until-0.038
 tum-0.037
ment-0.037
Token sighed
Token ID21893
Feature activation+1.94e-02

Pos logit contributions
prise+0.045
ases+0.045
uch+0.044
rr+0.044
ee+0.044
Neg logit contributions
 he-0.020
bread-0.018
 until-0.016
 ju-0.016
 which-0.015
Token,
Token ID11
Feature activation+3.91e-02

Pos logit contributions
bread+0.033
 ju+0.032
 cuc+0.029
 redd+0.029
 lettuce+0.028
Neg logit contributions
rr-0.033
room-0.032
ee-0.032
e-0.031
prise-0.031
Token "
Token ID366
Feature activation-7.90e-02

Pos logit contributions
bread+0.057
 ju+0.049
 current+0.047
lights+0.046
 cuc+0.046
Neg logit contributions
oh-0.070
ow-0.066
ee-0.066
rr-0.065
oo-0.065
TokenSl
Token ID11122
Feature activation-1.21e-01

Pos logit contributions
prise+0.146
ases+0.141
rr+0.140
room+0.133
uch+0.131
Neg logit contributions
 he-0.138
bread-0.127
 ju-0.127
 which-0.120
 until-0.116
Tokenipping
Token ID4501
Feature activation-1.63e-02

Pos logit contributions
prise+0.204
ases+0.193
ymes+0.193
w+0.192
uch+0.186
Neg logit contributions
rily-0.195
 until-0.170
 as-0.170
 which-0.168
 attent-0.166
Token means
Token ID1724
Feature activation-2.71e-02

Pos logit contributions
prise+0.034
ases+0.033
ee+0.033
uch+0.033
rr+0.033
Neg logit contributions
 he-0.022
bread-0.021
 ju-0.020
 until-0.019
rily-0.019
Token to
Token ID284
Feature activation-4.92e-02

Pos logit contributions
prise+0.060
ases+0.060
rr+0.059
ee+0.059
uch+0.058
Neg logit contributions
 he-0.037
bread-0.032
 until-0.030
 which-0.030
 as-0.029
Token try
Token ID1949
Feature activation-3.09e-02

Pos logit contributions
prise+0.096
ases+0.094
ee+0.094
rr+0.094
uch+0.093
Neg logit contributions
 he-0.075
bread-0.071
 ju-0.066
 until-0.066
 which-0.064
Token to
Token ID284
Feature activation-4.97e-02

Pos logit contributions
prise+0.066
ee+0.065
ases+0.065
uch+0.065
rr+0.065
Neg logit contributions
 he-0.040
bread-0.040
 ju-0.036
 cuc-0.035
rily-0.035
Token to
Token ID284
Feature activation-2.24e-02

Pos logit contributions
prise+0.007
rr+0.007
uch+0.007
ases+0.007
ee+0.007
Neg logit contributions
bread-0.006
rily-0.005
 ju-0.005
 widened-0.005
 he-0.005
Token the
Token ID262
Feature activation-6.18e-03

Pos logit contributions
prise+0.057
ases+0.057
ee+0.056
rr+0.056
uch+0.056
Neg logit contributions
bread-0.020
 he-0.019
 ju-0.017
rily-0.016
 until-0.016
Token bird
Token ID6512
Feature activation+5.80e-03

Pos logit contributions
prise+0.015
ases+0.015
ee+0.015
uch+0.015
rr+0.015
Neg logit contributions
 he-0.006
bread-0.006
 ju-0.005
 cuc-0.005
 until-0.005
Token,
Token ID11
Feature activation+2.53e-02

Pos logit contributions
bread+0.009
 ju+0.009
 widened+0.008
rily+0.008
 energ+0.008
Neg logit contributions
rr-0.011
ases-0.011
prise-0.011
ee-0.011
uch-0.010
Token "
Token ID366
Feature activation-8.09e-02

Pos logit contributions
bread+0.025
rily+0.021
 ju+0.021
ment+0.021
 widened+0.020
Neg logit contributions
rr-0.060
ee-0.058
prise-0.058
ases-0.057
oh-0.057
TokenDon
Token ID3987
Feature activation-1.21e-01

Pos logit contributions
prise+0.126
ee+0.125
uch+0.125
ases+0.122
rr+0.121
Neg logit contributions
bread-0.151
 he-0.150
rily-0.139
 until-0.139
 ju-0.139
Token't
Token ID470
Feature activation-6.57e-02

Pos logit contributions
prise+0.172
ases+0.167
ee+0.152
rr+0.151
room+0.138
Neg logit contributions
 he-0.279
 as-0.235
bread-0.233
 they-0.231
 ju-0.229
Token worry
Token ID5490
Feature activation-5.49e-02

Pos logit contributions
prise+0.119
ases+0.117
ee+0.117
uch+0.116
rr+0.115
Neg logit contributions
 he-0.108
bread-0.106
 ju-0.098
 until-0.096
 cuc-0.096
Token,
Token ID11
Feature activation-2.70e-02

Pos logit contributions
prise+0.118
ee+0.117
ases+0.117
rr+0.116
uch+0.116
Neg logit contributions
 he-0.070
bread-0.069
 ju-0.063
 cuc-0.061
 until-0.060
Token little
Token ID1310
Feature activation-4.79e-02

Pos logit contributions
ee+0.046
prise+0.045
uch+0.045
rr+0.045
ases+0.043
Neg logit contributions
bread-0.046
 he-0.042
rily-0.042
 widened-0.042
 until-0.041
Token bird
Token ID6512
Feature activation-4.35e-02

Pos logit contributions
prise+0.109
ee+0.108
rr+0.107
ases+0.107
uch+0.107
Neg logit contributions
 he-0.055
bread-0.055
rily-0.049
 ju-0.049
 until-0.048
Token "
Token ID366
Feature activation-7.23e-02

Pos logit contributions
bread+0.014
 ju+0.012
rily+0.012
 energ+0.011
ment+0.011
Neg logit contributions
rr-0.031
ee-0.030
oh-0.030
room-0.030
prise-0.030
TokenFred
Token ID30847
Feature activation-6.90e-02

Pos logit contributions
ee+0.111
uch+0.109
prise+0.109
ases+0.108
rr+0.106
Neg logit contributions
bread-0.138
 he-0.137
rily-0.128
 cuc-0.125
 until-0.125
Token,
Token ID11
Feature activation-2.71e-02

Pos logit contributions
prise+0.142
ee+0.141
ases+0.141
rr+0.140
uch+0.140
Neg logit contributions
 he-0.094
bread-0.094
 ju-0.085
 until-0.082
rily-0.082
Token maybe
Token ID3863
Feature activation-2.76e-02

Pos logit contributions
prise+0.051
ee+0.050
ases+0.050
uch+0.050
rr+0.050
Neg logit contributions
bread-0.043
 he-0.041
 ju-0.038
rily-0.038
 cuc-0.037
Token you
Token ID345
Feature activation-7.47e-02

Pos logit contributions
prise+0.046
ee+0.045
ases+0.045
rr+0.045
uch+0.045
Neg logit contributions
bread-0.049
 he-0.047
 ju-0.045
rily-0.045
 until-0.043
Token shouldn
Token ID6584
Feature activation-1.21e-01

Pos logit contributions
prise+0.131
ases+0.126
room+0.125
loe+0.123
rr+0.123
Neg logit contributions
 he-0.127
bread-0.121
 as-0.117
 until-0.114
 ju-0.113
Token't
Token ID470
Feature activation-7.19e-02

Pos logit contributions
prise+0.164
ases+0.147
rr+0.138
ymes+0.136
ee+0.133
Neg logit contributions
 he-0.287
 current-0.254
bread-0.248
 as-0.240
 tum-0.239
Token be
Token ID307
Feature activation-6.16e-02

Pos logit contributions
prise+0.153
ases+0.147
ee+0.144
rr+0.143
uch+0.142
Neg logit contributions
 he-0.102
bread-0.094
 until-0.087
 current-0.087
 as-0.086
Token blowing
Token ID19280
Feature activation-4.30e-02

Pos logit contributions
prise+0.134
ee+0.132
ases+0.131
rr+0.130
uch+0.130
Neg logit contributions
 he-0.081
bread-0.077
 ju-0.071
 until-0.070
 which-0.068
Token that
Token ID326
Feature activation-7.87e-03

Pos logit contributions
prise+0.105
uch+0.097
ases+0.097
rr+0.097
ee+0.095
Neg logit contributions
 he-0.051
 until-0.045
bread-0.044
 as-0.044
 ju-0.043
Token loud
Token ID7812
Feature activation-1.08e-02

Pos logit contributions
ee+0.016
prise+0.016
uch+0.016
rr+0.016
ases+0.016
Neg logit contributions
bread-0.011
 he-0.011
 ju-0.010
rily-0.010
 cuc-0.010
Token and
Token ID290
Feature activation+2.28e-02

Pos logit contributions
bread+0.035
 ju+0.035
 redd+0.032
 cuc+0.032
 lettuce+0.031
Neg logit contributions
room-0.036
rr-0.036
ee-0.035
e-0.034
prise-0.034
Token said
Token ID531
Feature activation-9.11e-03

Pos logit contributions
bread+0.039
 ju+0.033
rily+0.033
 cuc+0.032
 lettuce+0.032
Neg logit contributions
ee-0.042
prise-0.042
oh-0.039
w-0.038
ow-0.037
Token,
Token ID11
Feature activation+2.93e-02

Pos logit contributions
ee+0.018
prise+0.017
room+0.017
uch+0.017
ases+0.017
Neg logit contributions
bread-0.013
 ju-0.012
 current-0.012
 he-0.011
rily-0.011
Token "
Token ID366
Feature activation-8.01e-02

Pos logit contributions
bread+0.033
 ju+0.029
 current+0.027
ment+0.027
 crops+0.027
Neg logit contributions
oh-0.062
rr-0.062
room-0.061
ee-0.060
loe-0.059
TokenI
Token ID40
Feature activation-7.57e-02

Pos logit contributions
prise+0.142
ases+0.141
rr+0.138
uch+0.134
ee+0.132
Neg logit contributions
 he-0.142
bread-0.134
 ju-0.129
 which-0.122
 until-0.119
Token wasn
Token ID2492
Feature activation-1.21e-01

Pos logit contributions
ases+0.126
rr+0.123
prise+0.121
room+0.120
uch+0.119
Neg logit contributions
 he-0.140
bread-0.131
 until-0.124
 ju-0.122
 widened-0.121
Token't
Token ID470
Feature activation-7.65e-02

Pos logit contributions
ases+0.202
ymes+0.197
rr+0.192
prise+0.188
w+0.180
Neg logit contributions
 he-0.250
 as-0.215
 they-0.205
 which-0.205
 until-0.200
Token paying
Token ID5989
Feature activation-3.82e-02

Pos logit contributions
prise+0.155
ases+0.152
ee+0.152
rr+0.151
uch+0.151
Neg logit contributions
 he-0.109
bread-0.107
 ju-0.098
 until-0.095
 cuc-0.095
Token attention
Token ID3241
Feature activation-1.41e-02

Pos logit contributions
prise+0.086
ases+0.086
uch+0.085
room+0.085
ee+0.085
Neg logit contributions
 he-0.048
bread-0.043
 until-0.042
 as-0.039
 ju-0.039
Token.
Token ID13
Feature activation-4.21e-02

Pos logit contributions
prise+0.031
ases+0.031
uch+0.031
ee+0.031
rr+0.031
Neg logit contributions
 he-0.018
bread-0.017
 until-0.016
 which-0.015
 ju-0.015
Token I
Token ID314
Feature activation-7.12e-02

Pos logit contributions
prise+0.083
ases+0.082
ee+0.081
uch+0.081
rr+0.080
Neg logit contributions
 he-0.063
bread-0.061
 ju-0.056
 until-0.055
 cuc-0.054
Token some
Token ID617
Feature activation+4.59e-02

Pos logit contributions
bread+0.024
 he+0.024
 ju+0.022
 cuc+0.021
 until+0.021
Neg logit contributions
prise-0.036
ee-0.035
ases-0.035
uch-0.035
rr-0.035
Token lip
Token ID10645
Feature activation+1.12e-02

Pos logit contributions
 he+0.060
bread+0.057
 ju+0.052
 until+0.052
 cuc+0.051
Neg logit contributions
uch-0.099
prise-0.098
ases-0.098
rr-0.098
ee-0.098
Token bal
Token ID3652
Feature activation-8.24e-02

Pos logit contributions
 he+0.016
bread+0.016
 ju+0.014
rily+0.014
 cuc+0.014
Neg logit contributions
prise-0.022
ee-0.022
uch-0.022
ases-0.022
rr-0.022
Tokenm
Token ID76
Feature activation-2.57e-02

Pos logit contributions
ases+0.136
rr+0.132
room+0.126
ee+0.125
loe+0.122
Neg logit contributions
 he-0.136
 though-0.133
ment-0.130
bread-0.127
 which-0.126
Token on
Token ID319
Feature activation+2.09e-02

Pos logit contributions
prise+0.053
ases+0.051
rr+0.051
uch+0.051
ee+0.050
Neg logit contributions
 he-0.036
bread-0.036
 until-0.033
ment-0.032
rily-0.032
Token her
Token ID607
Feature activation+3.78e-02

Pos logit contributions
 he+0.022
bread+0.021
 until+0.018
 ju+0.018
 which+0.018
Neg logit contributions
prise-0.051
uch-0.050
ases-0.050
rr-0.050
ee-0.050
Token lips
Token ID11914
Feature activation+1.86e-02

Pos logit contributions
 he+0.050
bread+0.049
 until+0.045
 ju+0.045
 tum+0.044
Neg logit contributions
prise-0.081
ases-0.080
ee-0.080
rr-0.079
uch-0.079
Token to
Token ID284
Feature activation-1.05e-02

Pos logit contributions
bread+0.020
 he+0.020
 ju+0.018
 cuc+0.018
 current+0.017
Neg logit contributions
ee-0.043
prise-0.043
ases-0.042
uch-0.042
rr-0.042
Token make
Token ID787
Feature activation-8.94e-03

Pos logit contributions
prise+0.021
ee+0.021
ases+0.021
uch+0.020
rr+0.020
Neg logit contributions
 he-0.015
bread-0.015
 ju-0.014
 until-0.013
 cuc-0.013
Token them
Token ID606
Feature activation+1.81e-02

Pos logit contributions
prise+0.019
ee+0.019
ases+0.019
rr+0.019
uch+0.019
Neg logit contributions
 he-0.011
bread-0.011
 ju-0.010
 cuc-0.010
rily-0.010
Token soft
Token ID2705
Feature activation+1.11e-02

Pos logit contributions
bread+0.025
 he+0.024
 cuc+0.023
 current+0.023
rily+0.023
Neg logit contributions
prise-0.036
ee-0.036
uch-0.035
ases-0.035
rr-0.035
Token was
Token ID373
Feature activation-1.74e-02

Pos logit contributions
 he+0.016
bread+0.015
 ju+0.014
 until+0.014
 cuc+0.013
Neg logit contributions
prise-0.024
ases-0.024
ee-0.023
rr-0.023
uch-0.023
Token shiny
Token ID22441
Feature activation+8.09e-03

Pos logit contributions
prise+0.031
uch+0.030
ee+0.030
ases+0.030
rr+0.030
Neg logit contributions
bread-0.028
 he-0.026
ment-0.025
rily-0.025
 current-0.025
Token and
Token ID290
Feature activation+2.81e-02

Pos logit contributions
bread+0.010
 he+0.009
 ju+0.009
 current+0.009
 cuc+0.009
Neg logit contributions
ee-0.017
ases-0.017
uch-0.017
room-0.017
prise-0.016
Token new
Token ID649
Feature activation+8.69e-03

Pos logit contributions
bread+0.042
rily+0.038
der+0.035
ment+0.034
 current+0.033
Neg logit contributions
prise-0.056
ee-0.053
uch-0.052
ases-0.051
rr-0.051
Token,
Token ID11
Feature activation+6.23e-02

Pos logit contributions
bread+0.011
 he+0.010
 ju+0.009
rily+0.009
 cuc+0.009
Neg logit contributions
prise-0.019
ases-0.019
ee-0.019
rr-0.019
uch-0.019
Token and
Token ID290
Feature activation+5.03e-02

Pos logit contributions
bread+0.080
ment+0.068
 current+0.067
 cuc+0.066
rily+0.066
Neg logit contributions
room-0.130
ee-0.129
ases-0.127
prise-0.123
e-0.122
Token the
Token ID262
Feature activation+3.46e-02

Pos logit contributions
bread+0.073
rily+0.064
 widened+0.062
 spine+0.060
 current+0.058
Neg logit contributions
ee-0.095
prise-0.094
ases-0.093
room-0.093
e-0.088
Token bird
Token ID6512
Feature activation+1.66e-02

Pos logit contributions
bread+0.043
 he+0.043
rily+0.038
 ju+0.038
 until+0.036
Neg logit contributions
prise-0.075
ee-0.075
ases-0.074
rr-0.074
uch-0.073
Tokenie
Token ID494
Feature activation+9.62e-03

Pos logit contributions
bread+0.025
 he+0.023
 ju+0.023
 cuc+0.022
 current+0.022
Neg logit contributions
ee-0.032
uch-0.032
prise-0.031
ases-0.031
rr-0.031
Token was
Token ID373
Feature activation-2.60e-02

Pos logit contributions
 he+0.012
bread+0.012
 ju+0.011
 until+0.011
 cuc+0.010
Neg logit contributions
prise-0.021
ases-0.021
ee-0.021
rr-0.020
uch-0.020
Token happy
Token ID3772
Feature activation+1.75e-02

Pos logit contributions
prise+0.049
ee+0.048
ases+0.048
uch+0.047
rr+0.047
Neg logit contributions
bread-0.040
 he-0.038
 current-0.036
 ju-0.035
ment-0.035
Token garden
Token ID11376
Feature activation+3.14e-02

Pos logit contributions
 he+0.033
bread+0.033
 ju+0.030
rily+0.030
 until+0.029
Neg logit contributions
prise-0.043
ee-0.043
ases-0.043
uch-0.042
rr-0.042
Token.
Token ID13
Feature activation+2.26e-02

Pos logit contributions
bread+0.047
 ju+0.043
 energ+0.043
 widened+0.042
 cuc+0.042
Neg logit contributions
room-0.057
prise-0.055
ases-0.055
rr-0.055
ee-0.054
Token He
Token ID679
Feature activation+2.03e-02

Pos logit contributions
bread+0.026
 ju+0.023
 cuc+0.023
rily+0.022
 energ+0.021
Neg logit contributions
ee-0.051
ases-0.051
rr-0.051
room-0.049
prise-0.049
Token saw
Token ID2497
Feature activation+2.94e-02

Pos logit contributions
bread+0.027
 he+0.025
rily+0.024
 ju+0.024
 cuc+0.023
Neg logit contributions
ee-0.043
prise-0.043
ases-0.042
rr-0.041
uch-0.041
Token Lily
Token ID20037
Feature activation+1.09e-02

Pos logit contributions
bread+0.048
 ju+0.044
 he+0.043
rily+0.042
 cuc+0.042
Neg logit contributions
ases-0.054
ee-0.054
prise-0.052
rr-0.051
room-0.051
Token and
Token ID290
Feature activation+3.81e-02

Pos logit contributions
bread+0.012
 he+0.012
 ju+0.011
 cuc+0.011
rily+0.010
Neg logit contributions
prise-0.025
ee-0.025
ases-0.025
uch-0.025
rr-0.025
Token Rex
Token ID17853
Feature activation+1.69e-02

Pos logit contributions
bread+0.079
rily+0.075
 ju+0.074
 energ+0.069
 redd+0.068
Neg logit contributions
ee-0.052
ases-0.051
prise-0.049
e-0.048
rr-0.048
Token and
Token ID290
Feature activation+4.44e-02

Pos logit contributions
bread+0.022
 he+0.022
 ju+0.020
 cuc+0.019
rily+0.019
Neg logit contributions
prise-0.036
ases-0.036
ee-0.035
rr-0.035
uch-0.035
Token the
Token ID262
Feature activation+3.77e-02

Pos logit contributions
rily+0.084
bread+0.081
 ju+0.072
 energ+0.070
 cuc+0.069
Neg logit contributions
ases-0.072
ee-0.069
e-0.068
ow-0.067
w-0.067
Token trap
Token ID12840
Feature activation+3.45e-02

Pos logit contributions
 he+0.047
bread+0.046
 ju+0.042
 until+0.041
rily+0.040
Neg logit contributions
prise-0.082
ee-0.081
ases-0.081
rr-0.081
uch-0.080
Token.
Token ID13
Feature activation+2.88e-02

Pos logit contributions
bread+0.050
 ju+0.047
 cuc+0.044
 energ+0.042
 he+0.042
Neg logit contributions
ases-0.069
room-0.067
ee-0.067
prise-0.067
rr-0.066
Token leaves
Token ID5667
Feature activation+1.88e-02

Pos logit contributions
prise+0.041
ases+0.041
ee+0.041
uch+0.041
rr+0.041
Neg logit contributions
 he-0.030
bread-0.030
 ju-0.027
rily-0.027
 cuc-0.027
Token and
Token ID290
Feature activation+5.08e-02

Pos logit contributions
bread+0.024
 he+0.022
 ju+0.021
 cuc+0.021
rily+0.021
Neg logit contributions
ases-0.040
ee-0.039
uch-0.039
prise-0.039
rr-0.039
Token were
Token ID547
Feature activation-7.84e-03

Pos logit contributions
bread+0.065
rily+0.062
 ju+0.055
 he+0.054
 energ+0.054
Neg logit contributions
ases-0.105
prise-0.102
ee-0.102
rr-0.100
uch-0.100
Token green
Token ID4077
Feature activation+1.01e-02

Pos logit contributions
ases+0.014
prise+0.014
ee+0.014
rr+0.014
uch+0.014
Neg logit contributions
bread-0.013
 he-0.012
 ju-0.011
rily-0.011
 cuc-0.011
Token,
Token ID11
Feature activation+5.87e-02

Pos logit contributions
 he+0.011
bread+0.011
 ju+0.010
rily+0.010
 satisfied+0.010
Neg logit contributions
ases-0.023
rr-0.022
ee-0.022
prise-0.022
uch-0.022
Token but
Token ID475
Feature activation+3.53e-02

Pos logit contributions
bread+0.080
ment+0.067
rily+0.067
 cuc+0.065
 ju+0.064
Neg logit contributions
room-0.119
ases-0.119
uch-0.116
ee-0.116
w-0.112
Token he
Token ID339
Feature activation-2.98e-04

Pos logit contributions
bread+0.035
rily+0.030
ment+0.025
 ju+0.025
 cuc+0.025
Neg logit contributions
uch-0.085
ee-0.084
prise-0.084
ases-0.083
w-0.083
Token was
Token ID373
Feature activation-3.38e-02

Pos logit contributions
prise+0.001
rr+0.001
ases+0.001
room+0.001
uch+0.001
Neg logit contributions
 he-0.000
bread-0.000
 until-0.000
 ju-0.000
 which-0.000
Token a
Token ID257
Feature activation+5.54e-03

Pos logit contributions
prise+0.060
ases+0.060
ee+0.060
uch+0.059
rr+0.059
Neg logit contributions
bread-0.056
 he-0.052
 ju-0.050
 cuc-0.049
ment-0.049
Token sun
Token ID4252
Feature activation-1.71e-02

Pos logit contributions
 he+0.006
bread+0.006
rily+0.005
ment+0.005
 ju+0.005
Neg logit contributions
prise-0.013
ee-0.013
uch-0.013
rr-0.013
ases-0.013
Tokenflower
Token ID25547
Feature activation+1.20e-02

Pos logit contributions
ee+0.029
room+0.028
uch+0.028
rr+0.027
prise+0.026
Neg logit contributions
bread-0.030
 he-0.029
 energ-0.028
 ju-0.028
 widened-0.028
Token't
Token ID470
Feature activation-6.48e-02

Pos logit contributions
prise+0.130
ymes+0.114
rr+0.113
ases+0.109
w+0.105
Neg logit contributions
 he-0.190
 as-0.163
bread-0.156
 until-0.156
 they-0.154
Token mind
Token ID2000
Feature activation-2.62e-02

Pos logit contributions
prise+0.108
rr+0.106
ases+0.105
ee+0.104
uch+0.103
Neg logit contributions
bread-0.115
rily-0.111
 ju-0.106
ment-0.106
 cuc-0.105
Token its
Token ID663
Feature activation+1.46e-02

Pos logit contributions
ee+0.054
ases+0.054
rr+0.053
room+0.051
uch+0.051
Neg logit contributions
bread-0.036
 ju-0.032
 widened-0.031
 cuc-0.031
 current-0.031
Token loud
Token ID7812
Feature activation+4.49e-03

Pos logit contributions
bread+0.020
 he+0.020
 cuc+0.018
 ju+0.018
rily+0.018
Neg logit contributions
rr-0.030
uch-0.030
ases-0.029
prise-0.029
ee-0.029
Token noises
Token ID26782
Feature activation+1.02e-02

Pos logit contributions
bread+0.005
 he+0.005
 ju+0.005
 cuc+0.005
rily+0.005
Neg logit contributions
rr-0.010
ases-0.010
ee-0.010
uch-0.010
room-0.010
Token,
Token ID11
Feature activation+5.07e-02

Pos logit contributions
 he+0.012
bread+0.011
 ju+0.010
 until+0.010
 cuc+0.010
Neg logit contributions
prise-0.023
ases-0.023
ee-0.023
rr-0.023
uch-0.023
Token and
Token ID290
Feature activation+5.11e-02

Pos logit contributions
bread+0.063
 cuc+0.054
rily+0.054
ment+0.054
 widened+0.048
Neg logit contributions
ee-0.104
ases-0.102
uch-0.101
prise-0.099
room-0.099
Token they
Token ID484
Feature activation+1.00e-02

Pos logit contributions
bread+0.069
rily+0.058
 cuc+0.057
 widened+0.054
ment+0.053
Neg logit contributions
ases-0.102
ee-0.101
prise-0.100
uch-0.099
w-0.099
Token played
Token ID2826
Feature activation-2.15e-03

Pos logit contributions
bread+0.015
 he+0.015
 ju+0.014
 cuc+0.014
rily+0.014
Neg logit contributions
ee-0.019
prise-0.019
ases-0.019
uch-0.019
rr-0.019
Token together
Token ID1978
Feature activation+2.39e-02

Pos logit contributions
ases+0.004
prise+0.004
ee+0.004
room+0.004
rr+0.004
Neg logit contributions
bread-0.004
 he-0.004
 ju-0.003
 cuc-0.003
rily-0.003
Token every
Token ID790
Feature activation-9.15e-03

Pos logit contributions
bread+0.028
 he+0.027
 ju+0.025
 cuc+0.024
rily+0.023
Neg logit contributions
ases-0.054
prise-0.054
ee-0.053
rr-0.053
uch-0.053
Token
Token ID198
Feature activation-1.62e-02

Pos logit contributions
bread+0.042
 he+0.041
 ju+0.038
rily+0.037
 cuc+0.037
Neg logit contributions
ases-0.071
prise-0.071
ee-0.071
rr-0.070
uch-0.070
Token
Token ID198
Feature activation-2.20e-02

Pos logit contributions
prise+0.039
uch+0.038
ee+0.038
ases+0.037
rr+0.037
Neg logit contributions
 he-0.019
bread-0.017
 until-0.016
 which-0.015
 ju-0.014
TokenThe
Token ID464
Feature activation+7.27e-03

Pos logit contributions
prise+0.034
ee+0.033
ases+0.033
uch+0.033
rr+0.032
Neg logit contributions
 he-0.044
bread-0.042
 until-0.040
 which-0.039
 ju-0.038
Token big
Token ID1263
Feature activation+1.24e-02

Pos logit contributions
 he+0.008
bread+0.007
 ju+0.006
 current+0.006
 until+0.006
Neg logit contributions
prise-0.018
ases-0.018
ee-0.018
uch-0.018
rr-0.017
Token boy
Token ID2933
Feature activation+1.35e-02

Pos logit contributions
bread+0.015
 he+0.014
 ju+0.013
rily+0.013
 until+0.012
Neg logit contributions
rr-0.028
ases-0.028
prise-0.028
ee-0.028
room-0.027
Token gets
Token ID3011
Feature activation+2.29e-03

Pos logit contributions
 he+0.025
bread+0.022
 until+0.022
 which+0.021
 as+0.021
Neg logit contributions
prise-0.024
uch-0.023
rr-0.023
ee-0.023
ases-0.022
Token angry
Token ID7954
Feature activation+3.21e-03

Pos logit contributions
 he+0.003
bread+0.003
 ju+0.003
 until+0.003
 cuc+0.003
Neg logit contributions
prise-0.005
rr-0.005
uch-0.005
ee-0.005
ases-0.005
Token.
Token ID13
Feature activation+2.88e-02

Pos logit contributions
bread+0.003
 he+0.003
 ju+0.003
 cuc+0.003
rily+0.003
Neg logit contributions
ases-0.008
prise-0.008
rr-0.008
ee-0.008
uch-0.008
Token He
Token ID679
Feature activation+1.56e-02

Pos logit contributions
bread+0.027
 ju+0.023
rily+0.022
 cuc+0.022
 energ+0.021
Neg logit contributions
ases-0.074
room-0.073
rr-0.071
ee-0.070
prise-0.067
Token says
Token ID1139
Feature activation-1.17e-02

Pos logit contributions
 he+0.027
bread+0.027
 ju+0.025
 until+0.024
rily+0.024
Neg logit contributions
prise-0.027
ases-0.027
ee-0.026
rr-0.026
uch-0.026
Token,
Token ID11
Feature activation+1.92e-02

Pos logit contributions
prise+0.026
ee+0.026
ases+0.026
uch+0.025
rr+0.025
Neg logit contributions
 he-0.014
bread-0.014
 ju-0.013
 until-0.012
 cuc-0.012
Token and
Token ID290
Feature activation+7.01e-03

Pos logit contributions
prise+0.025
ases+0.025
ee+0.025
rr+0.025
uch+0.025
Neg logit contributions
 he-0.014
bread-0.014
 ju-0.012
 until-0.012
 cuc-0.012
Token she
Token ID673
Feature activation-3.76e-02

Pos logit contributions
bread+0.011
 he+0.011
 ju+0.010
rily+0.010
 until+0.010
Neg logit contributions
prise-0.013
ee-0.013
ases-0.013
rr-0.013
uch-0.012
Token will
Token ID481
Feature activation-5.79e-02

Pos logit contributions
rr+0.054
prise+0.053
ases+0.053
room+0.052
ee+0.049
Neg logit contributions
 he-0.079
bread-0.074
 until-0.070
 as-0.069
 ju-0.068
Token help
Token ID1037
Feature activation-3.22e-02

Pos logit contributions
prise+0.106
ases+0.105
ee+0.104
rr+0.104
uch+0.103
Neg logit contributions
 he-0.096
bread-0.091
 until-0.085
 ju-0.084
 which-0.083
Token you
Token ID345
Feature activation-2.69e-02

Pos logit contributions
ases+0.032
prise+0.032
uch+0.031
rr+0.030
ee+0.030
Neg logit contributions
 he-0.082
bread-0.076
 until-0.076
 as-0.074
 which-0.073
Token".
Token ID1911
Feature activation+1.04e-02

Pos logit contributions
prise+0.053
ases+0.052
ee+0.052
rr+0.051
uch+0.051
Neg logit contributions
 he-0.041
bread-0.039
 until-0.036
 ju-0.036
 which-0.035
Token She
Token ID1375
Feature activation+1.37e-02

Pos logit contributions
bread+0.013
 he+0.012
 ju+0.012
 cuc+0.011
rily+0.011
Neg logit contributions
ee-0.023
ases-0.023
rr-0.022
prise-0.022
uch-0.022
Token called
Token ID1444
Feature activation+5.37e-03

Pos logit contributions
bread+0.020
 he+0.018
 ju+0.018
rily+0.017
 cuc+0.017
Neg logit contributions
ee-0.027
prise-0.027
ases-0.026
uch-0.026
rr-0.026
Token mummy
Token ID38567
Feature activation+1.08e-02

Pos logit contributions
bread+0.007
 ju+0.007
 he+0.006
 cuc+0.006
 widened+0.006
Neg logit contributions
ee-0.011
prise-0.011
ases-0.011
rr-0.011
room-0.011
Token and
Token ID290
Feature activation+2.88e-02

Pos logit contributions
bread+0.015
 ju+0.014
 current+0.013
 he+0.013
 cuc+0.013
Neg logit contributions
ee-0.021
rr-0.021
prise-0.021
ases-0.021
room-0.021
Token told
Token ID1297
Feature activation-9.89e-03

Pos logit contributions
bread+0.050
 ju+0.046
 redd+0.042
rily+0.041
 current+0.041
Neg logit contributions
prise-0.052
ee-0.050
rr-0.048
ases-0.046
w-0.045
Token grow
Token ID1663
Feature activation-1.05e-02

Pos logit contributions
rr+0.044
prise+0.043
uch+0.043
ee+0.042
ases+0.042
Neg logit contributions
bread-0.033
 he-0.033
ment-0.031
rily-0.031
 couple-0.029
Token taller
Token ID25242
Feature activation+2.73e-02

Pos logit contributions
rr+0.020
ee+0.020
ases+0.019
uch+0.019
w+0.019
Neg logit contributions
 he-0.014
bread-0.014
 couple-0.013
 current-0.013
 spine-0.013
Token and
Token ID290
Feature activation+3.88e-02

Pos logit contributions
 he+0.040
 energ+0.040
bread+0.039
 satisfied+0.039
 current+0.038
Neg logit contributions
ases-0.048
rr-0.045
room-0.043
ee-0.042
e-0.041
Token taller
Token ID25242
Feature activation+2.53e-02

Pos logit contributions
bread+0.058
rily+0.056
ment+0.056
 couple+0.054
 spine+0.052
Neg logit contributions
oh-0.069
uch-0.065
prise-0.064
w-0.063
ases-0.063
Token.
Token ID13
Feature activation+2.63e-02

Pos logit contributions
 energ+0.038
bread+0.037
 eerie+0.037
 ju+0.037
 current+0.036
Neg logit contributions
ases-0.045
rr-0.041
room-0.039
e-0.038
ee-0.037
Token Soon
Token ID15894
Feature activation+9.69e-03

Pos logit contributions
bread+0.035
 he+0.033
 current+0.031
 energ+0.031
rily+0.031
Neg logit contributions
room-0.053
ee-0.051
rr-0.051
ases-0.051
uch-0.050
Token,
Token ID11
Feature activation+5.03e-02

Pos logit contributions
bread+0.016
rily+0.016
 ju+0.015
 current+0.015
 cuc+0.015
Neg logit contributions
prise-0.016
ee-0.016
ases-0.016
uch-0.015
w-0.015
Token it
Token ID340
Feature activation+6.12e-03

Pos logit contributions
bread+0.077
rily+0.072
ment+0.066
 current+0.066
 widened+0.065
Neg logit contributions
prise-0.094
w-0.088
room-0.087
uch-0.087
ee-0.087
Token had
Token ID550
Feature activation-3.28e-02

Pos logit contributions
bread+0.010
 he+0.010
 knowing+0.010
ment+0.010
rily+0.009
Neg logit contributions
uch-0.011
oh-0.010
ee-0.010
w-0.010
ases-0.009
Token a
Token ID257
Feature activation+1.92e-02

Pos logit contributions
rr+0.059
ases+0.058
uch+0.058
ee+0.058
prise+0.058
Neg logit contributions
bread-0.049
 he-0.048
rily-0.047
ment-0.045
 though-0.044
Token brown
Token ID7586
Feature activation+5.98e-03

Pos logit contributions
 he+0.026
bread+0.026
ment+0.023
rily+0.023
 ju+0.022
Neg logit contributions
prise-0.040
ee-0.039
rr-0.039
uch-0.038
ases-0.037
Token movies
Token ID6918
Feature activation+1.94e-02

Pos logit contributions
bread+0.011
 ju+0.011
 he+0.010
 current+0.010
 cuc+0.010
Neg logit contributions
room-0.019
ee-0.018
ases-0.018
rr-0.018
prise-0.018
Token together
Token ID1978
Feature activation+3.06e-02

Pos logit contributions
 he+0.024
bread+0.023
 ju+0.022
 cuc+0.021
rily+0.021
Neg logit contributions
prise-0.042
ee-0.042
ases-0.042
rr-0.042
uch-0.041
Token.
Token ID13
Feature activation+9.74e-03

Pos logit contributions
bread+0.037
 he+0.035
 ju+0.034
 current+0.032
 cuc+0.032
Neg logit contributions
ee-0.067
ases-0.066
rr-0.066
prise-0.065
room-0.065
Token They
Token ID1119
Feature activation+2.01e-02

Pos logit contributions
bread+0.013
 he+0.013
 ju+0.012
 current+0.012
rily+0.012
Neg logit contributions
ee-0.020
rr-0.020
prise-0.020
ases-0.019
room-0.019
Token soon
Token ID2582
Feature activation+8.66e-03

Pos logit contributions
bread+0.028
 he+0.028
 ju+0.026
rily+0.025
 cuc+0.025
Neg logit contributions
ee-0.040
prise-0.040
ases-0.040
uch-0.039
rr-0.039
Token forgot
Token ID16453
Feature activation+7.08e-03

Pos logit contributions
bread+0.012
rily+0.011
 ju+0.011
 he+0.011
 cuc+0.011
Neg logit contributions
ee-0.018
prise-0.017
ases-0.017
uch-0.017
rr-0.017
Token all
Token ID477
Feature activation+6.49e-03

Pos logit contributions
bread+0.011
 current+0.010
 cuc+0.010
 ju+0.010
rily+0.009
Neg logit contributions
ee-0.012
room-0.012
prise-0.012
ases-0.012
rr-0.012
Token about
Token ID546
Feature activation+9.03e-03

Pos logit contributions
bread+0.009
 ju+0.009
rily+0.009
 cuc+0.008
 current+0.008
Neg logit contributions
ee-0.013
ases-0.013
prise-0.013
rr-0.013
room-0.012
Token the
Token ID262
Feature activation+2.59e-02

Pos logit contributions
bread+0.011
 he+0.010
 ju+0.010
rily+0.010
 current+0.009
Neg logit contributions
prise-0.020
ases-0.019
ee-0.019
rr-0.019
room-0.019
Token delayed
Token ID11038
Feature activation+9.75e-03

Pos logit contributions
 he+0.037
bread+0.036
 ju+0.034
 until+0.033
 cuc+0.032
Neg logit contributions
ases-0.053
prise-0.053
rr-0.052
uch-0.052
ee-0.052
Token work
Token ID670
Feature activation+1.93e-02

Pos logit contributions
bread+0.014
 he+0.014
 ju+0.013
 cuc+0.012
rily+0.012
Neg logit contributions
prise-0.019
ee-0.019
ases-0.019
rr-0.019
uch-0.019
Token Little
Token ID7703
Feature activation-7.34e-03

Pos logit contributions
bread+0.014
rily+0.012
 ju+0.012
 he+0.012
 cuc+0.012
Neg logit contributions
ases-0.030
ee-0.030
rr-0.030
room-0.029
prise-0.029
Token did
Token ID750
Feature activation-1.82e-02

Pos logit contributions
ee+0.014
uch+0.014
prise+0.014
rr+0.014
ases+0.014
Neg logit contributions
bread-0.011
 he-0.010
rily-0.010
 cuc-0.010
 ju-0.010
Token Sam
Token ID3409
Feature activation-1.27e-02

Pos logit contributions
ee+0.048
uch+0.048
prise+0.048
ases+0.048
rr+0.047
Neg logit contributions
bread-0.014
 ju-0.012
rily-0.011
 cuc-0.011
 current-0.010
Token know
Token ID760
Feature activation+7.14e-03

Pos logit contributions
prise+0.026
ases+0.026
ee+0.025
rr+0.025
uch+0.025
Neg logit contributions
 he-0.018
bread-0.018
 ju-0.016
 until-0.016
 cuc-0.015
Token that
Token ID326
Feature activation+2.98e-02

Pos logit contributions
bread+0.009
 ju+0.008
rily+0.008
 current+0.007
 cuc+0.007
Neg logit contributions
prise-0.016
ases-0.015
ee-0.015
rr-0.015
room-0.015
Token this
Token ID428
Feature activation-5.45e-03

Pos logit contributions
bread+0.025
rily+0.022
 widened+0.019
 ju+0.017
 current+0.016
Neg logit contributions
ee-0.078
prise-0.078
room-0.076
ases-0.076
rr-0.075
Token new
Token ID649
Feature activation-4.80e-03

Pos logit contributions
ee+0.011
prise+0.010
room+0.010
uch+0.010
ases+0.010
Neg logit contributions
bread-0.008
rily-0.007
 he-0.007
 ju-0.007
 current-0.007
Token toy
Token ID13373
Feature activation-7.77e-03

Pos logit contributions
prise+0.011
ases+0.011
ee+0.011
uch+0.011
rr+0.011
Neg logit contributions
 he-0.006
bread-0.006
 ju-0.005
 until-0.005
 cuc-0.005
Token will
Token ID481
Feature activation-5.14e-02

Pos logit contributions
prise+0.017
rr+0.017
ases+0.017
ee+0.017
uch+0.017
Neg logit contributions
 he-0.011
bread-0.009
 until-0.009
 which-0.008
 ju-0.008
Token help
Token ID1037
Feature activation-1.35e-02

Pos logit contributions
prise+0.096
ases+0.095
ee+0.094
rr+0.093
uch+0.093
Neg logit contributions
 he-0.083
bread-0.079
 ju-0.073
 until-0.073
 which-0.071
Token him
Token ID683
Feature activation-6.06e-03

Pos logit contributions
prise+0.021
uch+0.021
ases+0.021
ee+0.020
rr+0.020
Neg logit contributions
 he-0.027
 until-0.024
bread-0.023
 as-0.023
 which-0.023
Token fit
Token ID4197
Feature activation-2.38e-03

Pos logit contributions
 he+0.008
bread+0.007
 ju+0.007
 until+0.006
 cuc+0.006
Neg logit contributions
prise-0.014
ases-0.014
ee-0.014
rr-0.014
uch-0.014
Token through
Token ID832
Feature activation+2.54e-02

Pos logit contributions
prise+0.006
uch+0.006
rr+0.006
ases+0.005
ee+0.005
Neg logit contributions
 he-0.003
 until-0.003
bread-0.002
 as-0.002
 which-0.002
Token the
Token ID262
Feature activation+7.03e-04

Pos logit contributions
bread+0.029
 ju+0.026
 he+0.025
 cuc+0.023
 current+0.023
Neg logit contributions
ases-0.059
ee-0.058
prise-0.058
room-0.057
rr-0.057
Token circle
Token ID9197
Feature activation+2.56e-02

Pos logit contributions
 he+0.001
bread+0.001
 current+0.001
 ju+0.001
 wooden+0.001
Neg logit contributions
prise-0.002
ases-0.001
uch-0.001
rr-0.001
ee-0.001
Token.
Token ID13
Feature activation+6.46e-03

Pos logit contributions
bread+0.030
 he+0.029
 ju+0.027
 cuc+0.026
rily+0.025
Neg logit contributions
prise-0.057
ases-0.057
ee-0.057
rr-0.056
uch-0.056
Token
Token ID198
Feature activation-3.11e-02

Pos logit contributions
bread+0.007
 he+0.007
 ju+0.007
 cuc+0.006
rily+0.006
Neg logit contributions
prise-0.015
ee-0.015
ases-0.015
rr-0.014
uch-0.014
Token
Token ID198
Feature activation-3.57e-02

Pos logit contributions
prise+0.081
ases+0.081
uch+0.081
ee+0.077
rr+0.077
Neg logit contributions
 he-0.032
 until-0.026
bread-0.023
 which-0.023
 they-0.021
TokenAt
Token ID2953
Feature activation-8.17e-03

Pos logit contributions
ases+0.077
prise+0.075
uch+0.075
rr+0.073
ee+0.073
Neg logit contributions
 he-0.050
bread-0.046
 until-0.043
 ju-0.042
 which-0.042
Token the
Token ID262
Feature activation-6.07e-03

Pos logit contributions
prise+0.019
ases+0.019
rr+0.019
uch+0.019
ee+0.019
Neg logit contributions
 he-0.009
bread-0.009
 until-0.008
 ju-0.008
 which-0.007
Token end
Token ID886
Feature activation-5.57e-03

Pos logit contributions
prise+0.014
ases+0.014
rr+0.013
w+0.013
uch+0.013
Neg logit contributions
 he-0.008
bread-0.007
 until-0.007
 ju-0.007
 current-0.007
Token of
Token ID286
Feature activation+3.87e-03

Pos logit contributions
uch+0.013
prise+0.012
ases+0.012
ee+0.012
rr+0.012
Neg logit contributions
 he-0.007
bread-0.006
 until-0.006
 as-0.006
 ju-0.006
Token this
Token ID428
Feature activation-3.48e-02

Pos logit contributions
prise+0.069
ee+0.069
ases+0.067
rr+0.067
uch+0.067
Neg logit contributions
bread-0.032
 he-0.029
 ju-0.028
rily-0.026
 cuc-0.026
Token story
Token ID1621
Feature activation-2.92e-02

Pos logit contributions
prise+0.049
ee+0.048
ases+0.048
rr+0.047
uch+0.047
Neg logit contributions
 he-0.071
bread-0.071
 ju-0.067
 until-0.065
 cuc-0.065
Token is
Token ID318
Feature activation-3.72e-02

Pos logit contributions
prise+0.055
uch+0.051
ee+0.050
ases+0.050
w+0.047
Neg logit contributions
 he-0.055
 "-0.047
der-0.046
 until-0.046
 they-0.045
Token:
Token ID25
Feature activation-1.82e-02

Pos logit contributions
prise+0.096
ases+0.096
uch+0.095
ee+0.095
rr+0.094
Neg logit contributions
 he-0.037
bread-0.031
 until-0.028
 which-0.028
 knowing-0.027
Token never
Token ID1239
Feature activation-3.24e-02

Pos logit contributions
prise+0.033
ases+0.032
uch+0.032
ee+0.032
rr+0.032
Neg logit contributions
 he-0.030
bread-0.029
 ju-0.027
 until-0.027
 cuc-0.026
Token cut
Token ID2005
Feature activation-3.61e-02

Pos logit contributions
rr+0.055
prise+0.053
uch+0.052
ee+0.051
ases+0.051
Neg logit contributions
rily-0.058
bread-0.057
 he-0.052
 ju-0.052
ment-0.051
Token up
Token ID510
Feature activation+1.35e-02

Pos logit contributions
prise+0.078
ee+0.076
ases+0.075
uch+0.075
rr+0.075
Neg logit contributions
 he-0.048
bread-0.046
 until-0.042
 ju-0.042
 which-0.041
Token too
Token ID1165
Feature activation-1.77e-03

Pos logit contributions
bread+0.018
 he+0.017
 ju+0.017
rily+0.016
 cuc+0.016
Neg logit contributions
ases-0.027
prise-0.027
rr-0.027
ee-0.026
uch-0.026
Token many
Token ID867
Feature activation+1.51e-02

Pos logit contributions
prise+0.004
ases+0.004
ee+0.004
rr+0.004
uch+0.004
Neg logit contributions
 he-0.003
bread-0.002
 until-0.002
 ju-0.002
 which-0.002
Token things
Token ID1243
Feature activation+3.16e-03

Pos logit contributions
bread+0.021
 he+0.021
 ju+0.019
rily+0.019
 cuc+0.018
Neg logit contributions
prise-0.030
ases-0.030
ee-0.030
rr-0.030
uch-0.029
Token if
Token ID611
Feature activation-1.62e-02

Pos logit contributions
 he+0.004
bread+0.004
 ju+0.003
 until+0.003
 which+0.003
Neg logit contributions
prise-0.007
ases-0.007
ee-0.007
rr-0.007
uch-0.007
Token.
Token ID13
Feature activation+1.26e-02

Pos logit contributions
bread+0.031
 he+0.030
 ju+0.029
 cuc+0.028
rily+0.027
Neg logit contributions
prise-0.052
ee-0.051
ases-0.051
uch-0.050
rr-0.050
Token<|endoftext|>
Token ID50256
Feature activation-4.08e-02

Pos logit contributions
bread+0.017
 he+0.016
 ju+0.015
rily+0.015
 cuc+0.015
Neg logit contributions
prise-0.026
ee-0.026
ases-0.026
rr-0.026
uch-0.025
TokenOnce
Token ID7454
Feature activation-1.52e-02

Pos logit contributions
prise+0.109
uch+0.105
ases+0.104
rr+0.102
w+0.099
Neg logit contributions
 he-0.042
 until-0.032
 He-0.028
 "-0.026
 which-0.026
Token upon
Token ID2402
Feature activation-9.62e-03

Pos logit contributions
ee+0.028
prise+0.027
rr+0.027
ases+0.027
uch+0.026
Neg logit contributions
bread-0.025
 he-0.024
 ju-0.023
 cuc-0.023
 current-0.022
Token a
Token ID257
Feature activation+2.60e-03

Pos logit contributions
prise+0.020
ee+0.020
ases+0.020
rr+0.019
uch+0.019
Neg logit contributions
 he-0.013
bread-0.013
 ju-0.012
 until-0.011
 cuc-0.011
Token time
Token ID640
Feature activation-3.05e-02

Pos logit contributions
 he+0.004
 until+0.004
 tum+0.004
bread+0.004
 as+0.003
Neg logit contributions
ases-0.005
prise-0.005
rr-0.005
uch-0.005
room-0.005
Token,
Token ID11
Feature activation-1.78e-02

Pos logit contributions
ases+0.086
prise+0.083
ee+0.082
uch+0.081
ymes+0.080
Neg logit contributions
 he-0.029
 until-0.022
 but-0.020
 as-0.019
 they-0.018
Token there
Token ID612
Feature activation-2.93e-02

Pos logit contributions
ee+0.035
prise+0.035
rr+0.035
uch+0.035
ases+0.034
Neg logit contributions
bread-0.026
 he-0.025
 ju-0.024
 cuc-0.023
 current-0.023
Token was
Token ID373
Feature activation-8.12e-03

Pos logit contributions
prise+0.076
rr+0.074
ases+0.074
uch+0.071
w+0.068
Neg logit contributions
 he-0.034
 until-0.025
 which-0.023
 tum-0.023
 they-0.022
Token a
Token ID257
Feature activation+1.82e-02

Pos logit contributions
prise+0.018
ases+0.018
uch+0.018
rr+0.018
ee+0.018
Neg logit contributions
 he-0.010
bread-0.009
 until-0.009
 ju-0.009
 cuc-0.008
Token little
Token ID1310
Feature activation+3.27e-03

Pos logit contributions
 he+0.016
bread+0.015
 ju+0.013
rily+0.013
 until+0.013
Neg logit contributions
prise-0.047
ee-0.047
rr-0.046
uch-0.046
ases-0.045
Token Lily
Token ID20037
Feature activation-2.56e-05

Pos logit contributions
bread+0.030
 current+0.030
 eerie+0.028
 ju+0.028
 crops+0.028
Neg logit contributions
room-0.027
rr-0.024
ases-0.024
prise-0.024
ee-0.023
Token felt
Token ID2936
Feature activation+4.72e-03

Pos logit contributions
ee+0.000
prise+0.000
uch+0.000
rr+0.000
ases+0.000
Neg logit contributions
bread-0.000
 he-0.000
 ju-0.000
rily-0.000
 current-0.000
Token happy
Token ID3772
Feature activation+1.71e-02

Pos logit contributions
bread+0.007
 current+0.007
rily+0.006
 spine+0.006
yt+0.006
Neg logit contributions
prise-0.009
ases-0.008
ee-0.008
uch-0.008
e-0.008
Token that
Token ID326
Feature activation+3.89e-02

Pos logit contributions
bread+0.032
 ju+0.029
rily+0.029
 current+0.028
 redd+0.028
Neg logit contributions
prise-0.027
ases-0.026
rr-0.025
room-0.024
ee-0.024
Token she
Token ID673
Feature activation-2.42e-02

Pos logit contributions
rily+0.054
bread+0.053
 current+0.044
 ju+0.044
 cuc+0.044
Neg logit contributions
room-0.077
prise-0.076
ee-0.076
ases-0.074
uch-0.073
Token could
Token ID714
Feature activation-6.13e-02

Pos logit contributions
prise+0.052
ases+0.052
rr+0.051
ee+0.051
uch+0.050
Neg logit contributions
 he-0.032
bread-0.030
 ju-0.028
 until-0.027
 which-0.026
Token help
Token ID1037
Feature activation+1.10e-03

Pos logit contributions
prise+0.101
ee+0.100
ases+0.100
rr+0.099
uch+0.099
Neg logit contributions
 he-0.111
bread-0.109
 ju-0.101
 until-0.100
 cuc-0.098
Token return
Token ID1441
Feature activation+1.44e-02

Pos logit contributions
bread+0.001
 ju+0.001
rily+0.001
 cuc+0.001
 current+0.001
Neg logit contributions
room-0.002
ases-0.002
prise-0.002
rr-0.002
ee-0.002
Token the
Token ID262
Feature activation+1.00e-02

Pos logit contributions
bread+0.018
 ju+0.016
 cuc+0.015
 energ+0.015
 current+0.015
Neg logit contributions
ases-0.031
prise-0.031
ee-0.031
room-0.030
uch-0.030
Token letter
Token ID3850
Feature activation+2.92e-02

Pos logit contributions
 he+0.016
bread+0.016
 ju+0.015
rily+0.014
 cuc+0.014
Neg logit contributions
prise-0.019
uch-0.018
ee-0.018
rr-0.018
ases-0.018
Token.
Token ID13
Feature activation+7.72e-03

Pos logit contributions
bread+0.039
 he+0.036
 cuc+0.036
 ju+0.035
 current+0.035
Neg logit contributions
prise-0.060
ee-0.059
ases-0.059
uch-0.059
rr-0.059
Token was
Token ID373
Feature activation-1.99e-02

Pos logit contributions
prise+0.003
ases+0.003
ee+0.003
rr+0.003
uch+0.003
Neg logit contributions
 he-0.002
bread-0.002
 ju-0.001
 until-0.001
 cuc-0.001
Token very
Token ID845
Feature activation-6.45e-03

Pos logit contributions
ee+0.038
prise+0.038
uch+0.037
ases+0.037
rr+0.037
Neg logit contributions
bread-0.031
 he-0.028
 cuc-0.027
 ju-0.027
rily-0.026
Token happy
Token ID3772
Feature activation+2.17e-02

Pos logit contributions
prise+0.015
ee+0.014
rr+0.014
uch+0.014
ases+0.014
Neg logit contributions
bread-0.008
 he-0.007
 ju-0.007
 cuc-0.007
rily-0.006
Token and
Token ID290
Feature activation+3.37e-02

Pos logit contributions
bread+0.035
 ju+0.033
 redd+0.032
 cuc+0.031
rily+0.031
Neg logit contributions
ee-0.038
prise-0.037
rr-0.036
room-0.036
ases-0.036
Token excited
Token ID6568
Feature activation+8.77e-03

Pos logit contributions
bread+0.046
 cuc+0.040
der+0.038
rily+0.038
 ju+0.036
Neg logit contributions
ee-0.073
uch-0.070
oh-0.069
prise-0.068
e-0.068
Token to
Token ID284
Feature activation-3.07e-02

Pos logit contributions
bread+0.014
 ju+0.013
 cuc+0.013
 current+0.012
rily+0.012
Neg logit contributions
ee-0.016
rr-0.016
prise-0.015
room-0.015
ases-0.015
Token receive
Token ID3328
Feature activation+2.06e-02

Pos logit contributions
prise+0.049
rr+0.049
ases+0.049
ee+0.048
uch+0.046
Neg logit contributions
bread-0.058
 ju-0.055
rily-0.052
 he-0.052
 cuc-0.051
Token such
Token ID884
Feature activation+2.20e-03

Pos logit contributions
bread+0.028
 he+0.024
 current+0.024
rily+0.024
 cuc+0.024
Neg logit contributions
ee-0.043
prise-0.043
ases-0.043
room-0.041
rr-0.041
Token a
Token ID257
Feature activation+7.88e-03

Pos logit contributions
bread+0.004
 he+0.004
rily+0.003
 until+0.003
 current+0.003
Neg logit contributions
prise-0.004
ee-0.004
ases-0.003
room-0.003
rr-0.003
Token unique
Token ID3748
Feature activation+1.34e-03

Pos logit contributions
 he+0.010
bread+0.009
 ju+0.008
 until+0.008
 cuc+0.008
Neg logit contributions
prise-0.018
ases-0.017
ee-0.017
uch-0.017
rr-0.017
Token gift
Token ID6979
Feature activation+1.62e-02

Pos logit contributions
bread+0.002
 he+0.002
 ju+0.002
rily+0.002
 cuc+0.002
Neg logit contributions
prise-0.003
ee-0.003
rr-0.003
uch-0.003
ases-0.003
Token\n
Token ID198
Feature activation-2.76e-02

Pos logit contributions
prise+0.047
uch+0.047
ases+0.046
ee+0.046
rr+0.045
Neg logit contributions
 he-0.021
 until-0.017
bread-0.016
 which-0.015
 ju-0.014
Token"
Token ID1
Feature activation-6.91e-02

Pos logit contributions
prise+0.036
rr+0.035
ee+0.035
ases+0.035
uch+0.034
Neg logit contributions
bread-0.059
 he-0.056
 ju-0.056
rily-0.055
 knowing-0.055
TokenHey
Token ID10814
Feature activation-4.43e-02

Pos logit contributions
prise+0.096
ases+0.094
rr+0.093
ee+0.093
uch+0.092
Neg logit contributions
 he-0.144
bread-0.141
 ju-0.134
 cuc-0.131
 until-0.130
Token,
Token ID11
Feature activation-3.59e-02

Pos logit contributions
prise+0.098
ases+0.097
ee+0.097
rr+0.096
uch+0.096
Neg logit contributions
 he-0.055
bread-0.053
 ju-0.048
 until-0.047
 cuc-0.046
Token what
Token ID644
Feature activation-6.72e-02

Pos logit contributions
ases+0.075
uch+0.075
prise+0.074
rr+0.074
ee+0.074
Neg logit contributions
 he-0.050
bread-0.046
 ju-0.042
 which-0.042
 cuc-0.042
Token did
Token ID750
Feature activation-8.50e-02

Pos logit contributions
loe+0.103
ases+0.103
prise+0.101
ippy+0.099
ips+0.095
Neg logit contributions
 he-0.131
 they-0.108
bread-0.105
 which-0.104
 tum-0.104
Token you
Token ID345
Feature activation-7.85e-02

Pos logit contributions
prise+0.081
ases+0.081
rr+0.080
ee+0.079
uch+0.077
Neg logit contributions
 he-0.224
bread-0.204
 which-0.197
 ju-0.196
 as-0.196
Token do
Token ID466
Feature activation-8.13e-02

Pos logit contributions
ases+0.149
loe+0.146
ymes+0.146
prise+0.143
rr+0.141
Neg logit contributions
 he-0.107
 which-0.100
 ju-0.099
 as-0.097
 but-0.096
Token?"
Token ID1701
Feature activation+9.20e-03

Pos logit contributions
ases+0.129
prise+0.127
rr+0.127
ee+0.126
uch+0.123
Neg logit contributions
 he-0.161
 which-0.142
 as-0.138
 but-0.135
 though-0.135
Token he
Token ID339
Feature activation-2.31e-02

Pos logit contributions
bread+0.013
rily+0.012
 ju+0.012
 knowing+0.011
 crops+0.011
Neg logit contributions
ee-0.017
ases-0.017
prise-0.017
 you-0.017
rr-0.016
Token shouts
Token ID34757
Feature activation+4.67e-03

Pos logit contributions
prise+0.046
uch+0.045
ee+0.044
ases+0.044
room+0.043
Neg logit contributions
 he-0.036
bread-0.033
 current-0.030
 until-0.030
 ju-0.030
Token grandma
Token ID49890
Feature activation+7.77e-04

Pos logit contributions
 he+0.027
bread+0.027
 until+0.024
 tum+0.024
 ju+0.024
Neg logit contributions
ases-0.044
prise-0.043
ee-0.043
rr-0.043
uch-0.043
Token said
Token ID531
Feature activation-6.19e-03

Pos logit contributions
bread+0.001
 he+0.001
 ju+0.001
rily+0.001
 cuc+0.001
Neg logit contributions
prise-0.002
ee-0.002
rr-0.002
ases-0.002
uch-0.002
Token,
Token ID11
Feature activation+3.08e-02

Pos logit contributions
prise+0.012
ee+0.012
room+0.012
rr+0.012
ases+0.012
Neg logit contributions
bread-0.009
 current-0.008
 ju-0.008
 he-0.008
rily-0.008
Token "
Token ID366
Feature activation-7.42e-02

Pos logit contributions
bread+0.029
 current+0.027
 ju+0.025
 tum+0.025
rily+0.024
Neg logit contributions
rr-0.072
prise-0.071
ee-0.070
room-0.070
oh-0.068
TokenWhy
Token ID5195
Feature activation-5.18e-02

Pos logit contributions
prise+0.130
ases+0.129
uch+0.128
rr+0.127
ee+0.126
Neg logit contributions
 he-0.126
bread-0.124
 ju-0.118
 until-0.113
 cuc-0.112
Token don
Token ID836
Feature activation-9.56e-02

Pos logit contributions
ases+0.085
prise+0.083
rr+0.081
ee+0.079
uch+0.078
Neg logit contributions
 he-0.100
bread-0.093
 which-0.089
 until-0.088
 as-0.087
Token't
Token ID470
Feature activation-5.39e-02

Pos logit contributions
prise+0.119
ases+0.114
rr+0.110
ee+0.107
uch+0.100
Neg logit contributions
 he-0.234
bread-0.211
 as-0.204
 they-0.204
 which-0.202
Token you
Token ID345
Feature activation-7.05e-02

Pos logit contributions
prise+0.059
ases+0.058
ee+0.058
uch+0.057
rr+0.057
Neg logit contributions
 he-0.127
bread-0.125
 ju-0.119
 until-0.117
 cuc-0.117
Token open
Token ID1280
Feature activation-1.56e-02

Pos logit contributions
ases+0.163
ymes+0.155
prise+0.154
rr+0.154
ips+0.154
Neg logit contributions
 he-0.082
 as-0.079
 which-0.072
bread-0.072
 until-0.070
Token it
Token ID340
Feature activation-2.99e-02

Pos logit contributions
prise+0.035
uch+0.034
rr+0.034
ases+0.034
ee+0.034
Neg logit contributions
 he-0.019
bread-0.018
 until-0.016
 ju-0.016
 which-0.016
Token and
Token ID290
Feature activation-5.95e-03

Pos logit contributions
ases+0.064
prise+0.062
ee+0.061
rr+0.061
uch+0.057
Neg logit contributions
 he-0.044
 until-0.039
 which-0.038
bread-0.038
 as-0.038
Token?"
Token ID1701
Feature activation-1.20e-03

Pos logit contributions
ases+0.028
uch+0.028
ee+0.027
prise+0.027
rr+0.027
Neg logit contributions
bread-0.048
 he-0.047
 ju-0.045
 cuc-0.044
rily-0.044
Token Tim
Token ID5045
Feature activation-3.30e-02

Pos logit contributions
uch+0.002
room+0.002
ee+0.002
prise+0.002
w+0.002
Neg logit contributions
bread-0.002
rily-0.002
ment-0.002
 current-0.002
 ju-0.002
Tokenmy
Token ID1820
Feature activation-4.33e-03

Pos logit contributions
prise+0.055
ases+0.054
ee+0.051
rr+0.051
uch+0.051
Neg logit contributions
 he-0.060
bread-0.057
 until-0.055
 ju-0.054
 which-0.054
Token replied
Token ID8712
Feature activation-2.70e-03

Pos logit contributions
uch+0.010
ee+0.010
rr+0.009
prise+0.009
w+0.009
Neg logit contributions
bread-0.005
 he-0.005
 ju-0.005
rily-0.005
ment-0.004
Token,
Token ID11
Feature activation+2.20e-02

Pos logit contributions
uch+0.005
room+0.005
prise+0.005
rr+0.005
ee+0.005
Neg logit contributions
bread-0.004
 ju-0.004
 current-0.004
 he-0.004
rily-0.004
Token "
Token ID366
Feature activation-8.32e-02

Pos logit contributions
bread+0.024
ment+0.021
 ju+0.020
rily+0.019
 current+0.019
Neg logit contributions
uch-0.048
room-0.048
rr-0.048
oh-0.048
ee-0.048
TokenI
Token ID40
Feature activation-7.52e-02

Pos logit contributions
prise+0.133
ases+0.124
rr+0.123
ee+0.116
uch+0.114
Neg logit contributions
 he-0.170
bread-0.161
 ju-0.160
 which-0.152
 until-0.149
Token just
Token ID655
Feature activation-3.12e-02

Pos logit contributions
ases+0.131
prise+0.130
rr+0.129
room+0.126
loe+0.118
Neg logit contributions
 he-0.135
bread-0.128
 ju-0.119
 until-0.118
 as-0.116
Token flap
Token ID37699
Feature activation-6.18e-03

Pos logit contributions
prise+0.060
ases+0.059
rr+0.058
ee+0.058
uch+0.058
Neg logit contributions
 he-0.048
bread-0.046
 ju-0.043
 until-0.043
 as-0.042
Token my
Token ID616
Feature activation+1.42e-02

Pos logit contributions
prise+0.011
ases+0.011
ee+0.011
uch+0.011
rr+0.011
Neg logit contributions
 he-0.010
bread-0.010
 ju-0.009
 until-0.009
rily-0.009
Token wings
Token ID12098
Feature activation-1.29e-03

Pos logit contributions
 he+0.013
bread+0.012
 until+0.011
 tum+0.011
 ju+0.011
Neg logit contributions
prise-0.037
ases-0.035
room-0.035
uch-0.035
ee-0.035
Token his
Token ID465
Feature activation+1.48e-02

Pos logit contributions
prise+0.036
rr+0.035
ee+0.035
uch+0.035
ases+0.035
Neg logit contributions
 he-0.015
bread-0.013
 until-0.012
 which-0.012
 ju-0.012
Token toys
Token ID14958
Feature activation+2.42e-03

Pos logit contributions
 he+0.023
bread+0.023
 ju+0.021
 until+0.021
 cuc+0.021
Neg logit contributions
prise-0.028
ee-0.028
ases-0.028
rr-0.027
uch-0.027
Token,
Token ID11
Feature activation+2.43e-02

Pos logit contributions
 he+0.003
bread+0.002
 until+0.002
 which+0.002
 ju+0.002
Neg logit contributions
prise-0.006
ases-0.006
ee-0.006
rr-0.006
uch-0.006
Token but
Token ID475
Feature activation+2.53e-03

Pos logit contributions
bread+0.024
rily+0.021
 he+0.020
 cuc+0.020
 ju+0.020
Neg logit contributions
prise-0.058
ee-0.057
ases-0.057
uch-0.057
rr-0.057
Token he
Token ID339
Feature activation-1.32e-02

Pos logit contributions
bread+0.003
rily+0.002
 widened+0.002
ment+0.002
 cuc+0.002
Neg logit contributions
prise-0.006
uch-0.006
w-0.006
ee-0.006
rr-0.006
Token didn
Token ID1422
Feature activation-1.03e-01

Pos logit contributions
prise+0.029
ases+0.029
rr+0.029
room+0.028
ee+0.028
Neg logit contributions
 he-0.018
bread-0.015
 ju-0.014
 until-0.014
 tum-0.014
Token't
Token ID470
Feature activation-7.51e-02

Pos logit contributions
prise+0.184
ases+0.165
ymes+0.164
rr+0.162
w+0.155
Neg logit contributions
 he-0.220
 as-0.176
 they-0.174
 He-0.172
 which-0.170
Token listen
Token ID6004
Feature activation+8.82e-03

Pos logit contributions
prise+0.130
ases+0.129
ee+0.128
uch+0.127
rr+0.126
Neg logit contributions
 he-0.130
bread-0.126
 until-0.117
 ju-0.117
 which-0.114
Token.
Token ID13
Feature activation+4.23e-03

Pos logit contributions
bread+0.012
 ju+0.011
 cuc+0.011
 he+0.011
 current+0.011
Neg logit contributions
ee-0.017
rr-0.017
prise-0.017
uch-0.017
ases-0.017
Token\n
Token ID198
Feature activation-2.83e-02

Pos logit contributions
bread+0.006
 he+0.006
 ju+0.005
rily+0.005
 cuc+0.005
Neg logit contributions
prise-0.009
rr-0.008
ee-0.008
ases-0.008
room-0.008
Token\n
Token ID198
Feature activation-2.81e-02

Pos logit contributions
prise+0.070
ases+0.070
uch+0.069
rr+0.068
ee+0.067
Neg logit contributions
 he-0.031
 until-0.025
 which-0.024
bread-0.023
 tum-0.022
Token that
Token ID326
Feature activation-4.17e-03

Pos logit contributions
bread+0.018
 he+0.016
 ju+0.016
 cuc+0.015
rily+0.015
Neg logit contributions
prise-0.035
ee-0.035
uch-0.035
ases-0.034
w-0.034
Token day
Token ID1110
Feature activation-1.63e-02

Pos logit contributions
prise+0.008
ee+0.008
ases+0.008
uch+0.008
rr+0.008
Neg logit contributions
 he-0.006
bread-0.006
 ju-0.006
rily-0.006
 cuc-0.006
Token on
Token ID319
Feature activation+1.41e-02

Pos logit contributions
prise+0.039
rr+0.038
ases+0.038
ee+0.037
uch+0.036
Neg logit contributions
 he-0.019
 until-0.017
bread-0.015
 as-0.015
 knowing-0.014
Token,
Token ID11
Feature activation+3.77e-02

Pos logit contributions
bread+0.018
 he+0.018
 ju+0.017
 cuc+0.016
rily+0.016
Neg logit contributions
prise-0.030
ee-0.030
ases-0.030
uch-0.030
rr-0.030
Token Lily
Token ID20037
Feature activation+8.74e-04

Pos logit contributions
bread+0.069
 current+0.063
ment+0.062
rily+0.062
 eerie+0.061
Neg logit contributions
room-0.062
prise-0.062
ases-0.059
ee-0.057
 you-0.057
Token wasn
Token ID2492
Feature activation-8.52e-02

Pos logit contributions
 he+0.001
bread+0.001
 ju+0.001
rily+0.001
 until+0.001
Neg logit contributions
prise-0.002
ee-0.002
ases-0.002
uch-0.002
rr-0.002
Token't
Token ID470
Feature activation-4.26e-02

Pos logit contributions
rr+0.116
prise+0.116
ases+0.113
ymes+0.105
ee+0.104
Neg logit contributions
 he-0.200
 as-0.177
bread-0.174
 which-0.173
 until-0.170
Token embarrassed
Token ID21100
Feature activation-2.35e-02

Pos logit contributions
rr+0.069
ee+0.068
ases+0.066
prise+0.063
ow+0.062
Neg logit contributions
bread-0.069
rily-0.069
ment-0.065
lights-0.065
 cuc-0.063
Token anymore
Token ID7471
Feature activation+1.48e-02

Pos logit contributions
prise+0.049
ases+0.049
ee+0.049
rr+0.048
uch+0.048
Neg logit contributions
bread-0.031
 he-0.031
 ju-0.028
 cuc-0.027
rily-0.027
Token and
Token ID290
Feature activation+2.52e-02

Pos logit contributions
bread+0.017
 ju+0.017
rily+0.016
 he+0.016
 cuc+0.016
Neg logit contributions
rr-0.032
prise-0.032
ases-0.032
ee-0.032
room-0.031
Token felt
Token ID2936
Feature activation+7.21e-03

Pos logit contributions
bread+0.044
rily+0.041
 ju+0.037
 current+0.037
 cuc+0.037
Neg logit contributions
ee-0.044
ases-0.044
uch-0.043
w-0.043
prise-0.042