TOP ACTIVATIONS
MAX = 9.782

I can find something gl immer ing ?"
the apple . It looks y ummy ." Ben
grabbed the dog ’ s leash . They all
table . It looks very y ummy and fancy . It
red strawberry . It looks y ummy . "
and lettuce . It looks y ummy . "
green and yellow and look y ummy . "
. We will share the shoes . We will be compassionate
is this ? It looks y ucky . I don 't
grabbed the man ’ s hand with the other
" Wow , it looks y ummy ," Amy said .
. I want to wear shorts ." Jack 's mom said
lobby . The cake looks y ummy . They want to
It was red and looked y ummy . Max wanted to
jam . " It looks y ummy !" They
wanted to share her pr unes with him . She ran
cake . It looks so y ummy . Can we have
chili and thought it looked y ummy . The bird ate
red and shiny and looks y ummy . Ben says ,
the table . It looked y ummy . "

MIN ACTIVATIONS
MIN = -8.580

L ily and Ben also wake up and look out
a thrill of joy . They think they are flying .
played and had fun . She saw the same bird from
feeling proud of themselves . They had learned something new about
along . The nice people even shared their candy with the
, happy and sad . She whist led all the time
dream . Every night , she would close her eyes and
discussing their images . They even found a pretty butterfly and
m ies hurt . They even shared an ice cream cone
dizz y and happy . They hugged and smiled . They
felt proud and happy . They had discovered a new place
was all wet . She even made a new friend ,
, and very amazed . They hugged their dad , and
sw am around . They even played a game of tag
, happy and warm . They were not lonely anymore .
forget the time . They also forget to eat . They
up happy and refreshed . They got ready for school and
they forget the time . They also forget to eat .
in the tent . They even stayed up late into the
feeling happy and grateful . They had a wonderful morning .

INTERVAL 5.192 - 9.782
CONTAINS 0.858%

very loud and wear funny clothes . L ily
But the lock was too strong and they could not open
. But it was too heavy and hard . She could
cannot take it from the museum ." He felt
in her backyard and pick flowers . One day , she

INTERVAL 0.601 - 5.192
CONTAINS 45.214%

" Dad , can I taste surf ?" His
the sun , it spark led . M ia
are going home right now . You are in big trouble
card and put the bracelet inside . Her friend was so
. It ran around and sniff ed the grass and bushes

INTERVAL -3.990 - 0.601
CONTAINS 44.174%

slide . She ran to the slide and said , "
Dan very much . She does not want to live without
. He was able to get a lot of fish !
one else has to see it like this ."
it , fascinated . His mom smiled . " Go ahead

INTERVAL -8.580 - -3.990
CONTAINS 9.755%

not want to chew . They wanted to play .
. The old man smiled and put it in his pocket
and walked away . Tim my and his friends felt sad
to try something new . She said , " Can we
the kitten couldn 't help but go back to explore .

TOKEN DETAILS

Token I
Token ID314
Feature activation-1.60e+00

Pos logit contributions
 forgot+0.000
 knew+0.000
 remembered+0.000
 wished+0.000
 known+0.000
Neg logit contributions
he-0.000
ing-0.000
ays-0.000
room-0.000
ane-0.000
Token can
Token ID460
Feature activation+4.08e-01

Pos logit contributions
 forgot+0.029
 remembered+0.023
 wished+0.023
 knew+0.023
 pap+0.023
Neg logit contributions
he-0.046
room-0.042
ays-0.040
ore-0.040
ing-0.040
Token find
Token ID1064
Feature activation+1.74e+00

Pos logit contributions
he+0.010
ing+0.010
ays+0.010
room+0.010
ane+0.009
Neg logit contributions
 forgot-0.009
 knew-0.007
 remembered-0.007
 wished-0.007
 pap-0.006
Token something
Token ID1223
Feature activation+1.07e+00

Pos logit contributions
he+0.040
room+0.036
ays+0.036
ing+0.036
ane+0.034
Neg logit contributions
 forgot-0.041
 knew-0.037
 remembered-0.035
 wished-0.035
 pap-0.033
Token gl
Token ID1278
Feature activation+6.09e+00

Pos logit contributions
he+0.020
room+0.018
ays+0.018
ing+0.017
ane+0.016
Neg logit contributions
 forgot-0.031
 knew-0.028
 remembered-0.027
 wished-0.026
 pap-0.025
Tokenimmer
Token ID10957
Feature activation+9.78e+00

Pos logit contributions
he+0.105
ays+0.097
room+0.096
ing+0.085
eddy+0.082
Neg logit contributions
 forgot-0.149
 remembered-0.134
 knew-0.132
 pap-0.130
'd-0.129
Tokening
Token ID278
Feature activation+3.60e+00

Pos logit contributions
ooming+0.204
overed+0.198
room+0.189
uggle+0.188
ays+0.187
Neg logit contributions
 bit-0.179
 forgot-0.178
 knew-0.167
 always-0.163
 needed-0.162
Token?"
Token ID1701
Feature activation+3.85e+00

Pos logit contributions
he+0.061
ays+0.058
room+0.057
ing+0.055
eddy+0.053
Neg logit contributions
 forgot-0.101
 knew-0.092
 remembered-0.090
 bit-0.088
 wished-0.087
Token (space)
Token ID220
Feature activation+4.82e+00

Pos logit contributions
he+0.084
ays+0.082
ing+0.079
room+0.078
eddy+0.076
Neg logit contributions
 forgot-0.094
 knew-0.086
 wished-0.081
 remembered-0.081
 learned-0.076
Token (newline)
Token ID198
Feature activation+4.85e+00

Pos logit contributions
room+0.111
ays+0.110
ane+0.107
eddy+0.107
ipping+0.106
Neg logit contributions
 forgot-0.089
 knew-0.087
 told-0.077
 wished-0.073
 never-0.073
Token (newline)
Token ID198
Feature activation+3.51e+00

Pos logit contributions
ays+0.124
ane+0.105
ement+0.103
ing+0.102
 anymore+0.101
Neg logit contributions
 knew-0.104
 forgot-0.100
 learned-0.093
 never-0.089
 remembered-0.088
Token the
Token ID262
Feature activation-1.51e+00

Pos logit contributions
he+0.045
room+0.041
ays+0.041
ing+0.041
ane+0.039
Neg logit contributions
 forgot-0.046
 knew-0.040
 remembered-0.039
 wished-0.038
 pap-0.036
Token apple
Token ID17180
Feature activation+5.80e+00

Pos logit contributions
 forgot+0.035
 knew+0.030
 remembered+0.030
 wished+0.029
 pap+0.027
Neg logit contributions
he-0.036
ays-0.033
room-0.033
ing-0.033
ane-0.031
Token.
Token ID13
Feature activation+2.34e+00

Pos logit contributions
ement+0.112
he+0.104
ays+0.101
overed+0.099
apping+0.099
Neg logit contributions
 forgot-0.144
 bit-0.134
 knew-0.129
 way-0.122
 never-0.119
Token It
Token ID632
Feature activation-1.33e+00

Pos logit contributions
he+0.050
ing+0.047
ays+0.043
ane+0.043
ings+0.043
Neg logit contributions
 forgot-0.059
 knew-0.052
 remembered-0.052
 bit-0.051
 pap-0.050
Token looks
Token ID3073
Feature activation+6.50e-04

Pos logit contributions
 forgot+0.033
 wished+0.029
 known+0.029
 remembered+0.029
 knew+0.028
Neg logit contributions
he-0.027
ays-0.026
ing-0.026
ings-0.024
room-0.024
Token y
Token ID331
Feature activation+8.57e+00

Pos logit contributions
he+0.000
ing+0.000
ays+0.000
room+0.000
ings+0.000
Neg logit contributions
 forgot-0.000
 knew-0.000
 remembered-0.000
 wished-0.000
 pap-0.000
Tokenummy
Token ID13513
Feature activation+6.30e+00

Pos logit contributions
he+0.140
overed+0.138
ane+0.135
ooming+0.132
but+0.123
Neg logit contributions
 bit-0.181
 forgot-0.177
 knew-0.162
 vitamins-0.161
 sp-0.159
Token."
Token ID526
Feature activation+4.72e+00

Pos logit contributions
he+0.136
room+0.136
ane+0.125
ays+0.124
ings+0.123
Neg logit contributions
 forgot-0.153
 bit-0.137
 knew-0.134
 remembered-0.130
 wished-0.129
Token (newline)
Token ID198
Feature activation+5.24e+00

Pos logit contributions
he+0.154
ane+0.152
ing+0.144
eddy+0.144
room+0.142
Neg logit contributions
 forgot-0.070
 knew-0.053
 remembered-0.048
 learned-0.046
 pap-0.043
Token (newline)
Token ID198
Feature activation+4.03e+00

Pos logit contributions
ane+0.128
ays+0.126
ement+0.123
overed+0.119
he+0.112
Neg logit contributions
 forgot-0.109
 knew-0.101
 learned-0.087
 grown-0.087
 told-0.086
TokenBen
Token ID11696
Feature activation-3.54e+00

Pos logit contributions
he+0.098
ays+0.096
ane+0.094
ing+0.090
ore+0.090
Neg logit contributions
 forgot-0.093
 knew-0.078
 remembered-0.077
 wished-0.071
 bit-0.070
Token grabbed
Token ID13176
Feature activation+1.12e+00

Pos logit contributions
 forgot+0.104
 knew+0.092
 known+0.092
 fact+0.089
 remembered+0.086
Neg logit contributions
he-0.114
room-0.106
ing-0.105
hat-0.101
ays-0.101
Token the
Token ID262
Feature activation-1.82e+00

Pos logit contributions
he+0.026
ing+0.024
ays+0.024
room+0.024
ane+0.023
Neg logit contributions
 forgot-0.026
 knew-0.023
 remembered-0.022
 wished-0.022
 pap-0.020
Token dog
Token ID3290
Feature activation+3.55e+00

Pos logit contributions
 forgot+0.043
 remembered+0.036
 knew+0.036
 wished+0.034
 known+0.034
Neg logit contributions
he-0.044
ing-0.040
room-0.039
ays-0.038
ane-0.038
Tokenâ
Token ID22940
Feature activation+2.27e+00

Pos logit contributions
he+0.069
room+0.063
ays+0.062
ane+0.059
ing+0.058
Neg logit contributions
 forgot-0.096
 knew-0.085
 remembered-0.083
 wished-0.083
 bit-0.083
Token
Token ID26391
Feature activation+3.77e+00

Pos logit contributions
he+0.051
ing+0.049
ays+0.049
room+0.046
eddy+0.045
Neg logit contributions
 forgot-0.053
 knew-0.047
 remembered-0.045
 wished-0.044
 learned-0.042
Token
Token ID8151
Feature activation+8.50e+00

Pos logit contributions
he+0.077
ays+0.076
ing+0.072
ying+0.070
ement+0.069
Neg logit contributions
 forgot-0.104
 knew-0.083
 remembered-0.083
 wished-0.079
 knocked-0.078
Tokens
Token ID82
Feature activation+1.56e+00

Pos logit contributions
ays+0.142
room+0.136
he+0.136
overed+0.125
uggle+0.124
Neg logit contributions
 forgot-0.226
 knew-0.206
 never-0.193
 forgotten-0.190
 remembered-0.186
Token leash
Token ID40529
Feature activation+6.12e+00

Pos logit contributions
he+0.023
ays+0.020
room+0.020
ing+0.020
ane+0.019
Neg logit contributions
 forgot-0.049
 knew-0.045
 remembered-0.044
 wished-0.044
 pap-0.042
Token.
Token ID13
Feature activation+4.18e+00

Pos logit contributions
he+0.094
ays+0.091
room+0.089
uggle+0.084
eddy+0.081
Neg logit contributions
 forgot-0.177
 knew-0.166
 wished-0.158
 remembered-0.156
 bit-0.156
Token They
Token ID1119
Feature activation-4.88e+00

Pos logit contributions
he+0.095
ing+0.091
ane+0.088
room+0.088
ays+0.085
Neg logit contributions
 forgot-0.096
 knew-0.085
 pap-0.081
 wished-0.079
 remembered-0.078
Token all
Token ID477
Feature activation-3.59e+00

Pos logit contributions
 forgot+0.088
 pap+0.078
 fact+0.076
 known+0.076
 cruel+0.073
Neg logit contributions
he-0.134
ing-0.114
room-0.110
ays-0.107
hat-0.105
Token table
Token ID3084
Feature activation+4.43e+00

Pos logit contributions
 forgot+0.034
 remembered+0.029
 knew+0.029
 wished+0.028
 known+0.028
Neg logit contributions
he-0.043
room-0.040
ing-0.039
eddy-0.037
ays-0.037
Token.
Token ID13
Feature activation+3.31e+00

Pos logit contributions
he+0.059
eddy+0.055
ing+0.054
ays+0.053
uggle+0.052
Neg logit contributions
 forgot-0.140
 knew-0.129
 remembered-0.125
 wished-0.125
 bit-0.122
Token It
Token ID632
Feature activation-2.98e+00

Pos logit contributions
he+0.072
ing+0.068
ays+0.065
eddy+0.064
ane+0.064
Neg logit contributions
 forgot-0.081
 knew-0.072
 wished-0.068
 remembered-0.068
 knocked-0.066
Token looks
Token ID3073
Feature activation-1.10e+00

Pos logit contributions
 forgot+0.080
 known+0.073
 remembered+0.068
 forgotten+0.067
 pap+0.067
Neg logit contributions
ays-0.054
he-0.051
ing-0.046
ore-0.045
ings-0.043
Token very
Token ID845
Feature activation-4.64e-01

Pos logit contributions
 forgot+0.024
 pap+0.020
 knew+0.019
 remembered+0.019
 known+0.019
Neg logit contributions
he-0.025
ing-0.023
ays-0.023
 anymore-0.023
ings-0.022
Token y
Token ID331
Feature activation+8.46e+00

Pos logit contributions
 forgot+0.008
 knew+0.006
 remembered+0.006
 wished+0.006
 known+0.006
Neg logit contributions
he-0.014
ing-0.013
ays-0.013
room-0.012
ings-0.012
Tokenummy
Token ID13513
Feature activation+4.89e+00

Pos logit contributions
he+0.134
ane+0.126
ute+0.112
uggle+0.110
overed+0.110
Neg logit contributions
 forgot-0.202
 bit-0.195
 knew-0.189
 forgotten-0.182
 sp-0.179
Token and
Token ID290
Feature activation-4.55e+00

Pos logit contributions
he+0.080
room+0.074
ays+0.071
ane+0.069
uggle+0.066
Neg logit contributions
 forgot-0.147
 knew-0.135
 bit-0.131
 wished-0.130
 remembered-0.129
Token fancy
Token ID14996
Feature activation+2.79e+00

Pos logit contributions
 forgot+0.087
 known+0.085
 knew+0.076
 remembered+0.075
 unknow+0.074
Neg logit contributions
 with-0.095
he-0.093
ays-0.092
ing-0.091
ings-0.087
Token.
Token ID13
Feature activation+3.66e+00

Pos logit contributions
he+0.041
room+0.034
ays+0.034
ane+0.033
uggle+0.032
Neg logit contributions
 forgot-0.090
 knew-0.082
 remembered-0.080
 wished-0.080
 bit-0.079
Token It
Token ID632
Feature activation-3.71e+00

Pos logit contributions
he+0.084
ing+0.078
ays+0.075
ane+0.074
eddy+0.074
Neg logit contributions
 forgot-0.088
 knew-0.077
 wished-0.073
 remembered-0.073
 knocked-0.070
Token red
Token ID2266
Feature activation+4.45e-01

Pos logit contributions
 forgot+0.039
 knew+0.032
 wished+0.032
 remembered+0.032
 known+0.031
Neg logit contributions
he-0.036
room-0.034
ing-0.034
ays-0.032
ings-0.032
Token strawberry
Token ID41236
Feature activation+3.71e+00

Pos logit contributions
he+0.012
ays+0.011
ing+0.011
room+0.011
ane+0.010
Neg logit contributions
 forgot-0.008
 knew-0.007
 bit-0.007
 remembered-0.007
 wished-0.007
Token.
Token ID13
Feature activation+4.09e+00

Pos logit contributions
he+0.058
room+0.053
ays+0.051
ane+0.049
uggle+0.048
Neg logit contributions
 forgot-0.112
 knew-0.104
 bit-0.100
 remembered-0.099
 wished-0.097
Token It
Token ID632
Feature activation-2.30e+00

Pos logit contributions
he+0.099
ane+0.094
ing+0.093
room+0.087
eddy+0.087
Neg logit contributions
 forgot-0.092
 knew-0.079
 remembered-0.075
 wished-0.072
 learned-0.072
Token looks
Token ID3073
Feature activation-7.15e-01

Pos logit contributions
 forgot+0.062
 known+0.054
 wished+0.053
 remembered+0.053
 forgotten+0.052
Neg logit contributions
he-0.042
ays-0.041
ing-0.038
room-0.036
ore-0.035
Token y
Token ID331
Feature activation+8.44e+00

Pos logit contributions
 forgot+0.017
 knew+0.014
 remembered+0.014
 pap+0.014
 wished+0.013
Neg logit contributions
he-0.016
ing-0.015
ays-0.014
room-0.014
ings-0.013
Tokenummy
Token ID13513
Feature activation+5.24e+00

Pos logit contributions
he+0.137
ane+0.134
overed+0.116
ute+0.115
room+0.111
Neg logit contributions
 forgot-0.188
 bit-0.184
 knew-0.174
 sp-0.170
 forgotten-0.167
Token.
Token ID13
Feature activation+4.17e+00

Pos logit contributions
he+0.093
room+0.090
ane+0.083
ays+0.083
uggle+0.080
Neg logit contributions
 forgot-0.149
 knew-0.136
 bit-0.136
 remembered-0.132
 wished-0.131
Token (newline)
Token ID198
Feature activation+4.77e+00

Pos logit contributions
he+0.098
ane+0.090
ing+0.088
room+0.088
uggle+0.083
Neg logit contributions
 forgot-0.101
 knew-0.087
 remembered-0.083
 wished-0.081
 learned-0.078
Token (newline)
Token ID198
Feature activation+3.83e+00

Pos logit contributions
ays+0.111
ane+0.106
ipping+0.105
ing+0.103
ement+0.103
Neg logit contributions
 forgot-0.107
 knew-0.098
 had-0.089
 learned-0.088
 never-0.087
Token"
Token ID1
Feature activation+4.66e-01

Pos logit contributions
he+0.062
ays+0.058
ing+0.055
room+0.055
ane+0.054
Neg logit contributions
 forgot-0.118
 knew-0.105
 remembered-0.104
 wished-0.101
 bit-0.097
Token and
Token ID290
Feature activation-4.38e+00

Pos logit contributions
he+0.090
room+0.074
uggle+0.073
ane+0.072
ogging+0.071
Neg logit contributions
 forgot-0.137
 knew-0.123
 remembered-0.123
 pap-0.116
 wished-0.113
Token lettuce
Token ID39406
Feature activation+3.82e+00

Pos logit contributions
 forgot+0.105
 known+0.096
 knew+0.090
 unknow+0.088
 forgotten+0.087
Neg logit contributions
ays-0.077
ing-0.074
he-0.073
room-0.071
eddy-0.071
Token.
Token ID13
Feature activation+3.86e+00

Pos logit contributions
he+0.049
room+0.042
ing+0.040
ays+0.040
ane+0.038
Neg logit contributions
 forgot-0.131
 knew-0.119
 remembered-0.117
 wished-0.115
 pap-0.111
Token It
Token ID632
Feature activation-3.25e+00

Pos logit contributions
he+0.100
room+0.092
ane+0.091
ing+0.091
eddy+0.084
Neg logit contributions
 forgot-0.080
 knew-0.074
 wished-0.071
 pap-0.068
 remembered-0.066
Token looks
Token ID3073
Feature activation-8.78e-01

Pos logit contributions
 forgot+0.082
 known+0.078
 forgotten+0.072
 unknow+0.071
 remembered+0.070
Neg logit contributions
ays-0.062
he-0.058
ing-0.051
ore-0.050
iffs-0.050
Token y
Token ID331
Feature activation+8.44e+00

Pos logit contributions
 forgot+0.017
 knew+0.014
 remembered+0.014
 pap+0.014
 known+0.013
Neg logit contributions
he-0.023
ing-0.021
ays-0.021
room-0.020
ore-0.020
Tokenummy
Token ID13513
Feature activation+4.73e+00

Pos logit contributions
he+0.131
ane+0.126
ute+0.114
overed+0.112
uggle+0.109
Neg logit contributions
 forgot-0.204
 bit-0.190
 knew-0.188
 sp-0.182
 forgotten-0.178
Token.
Token ID13
Feature activation+3.73e+00

Pos logit contributions
he+0.074
room+0.069
ays+0.064
ane+0.063
uggle+0.060
Neg logit contributions
 forgot-0.146
 knew-0.133
 remembered-0.129
 bit-0.129
 wished-0.128
Token (newline)
Token ID198
Feature activation+4.41e+00

Pos logit contributions
he+0.091
room+0.084
ing+0.081
ane+0.080
uggle+0.075
Neg logit contributions
 forgot-0.084
 knew-0.077
 wished-0.074
 pap-0.072
 remembered-0.071
Token (newline)
Token ID198
Feature activation+3.51e+00

Pos logit contributions
ays+0.105
ane+0.103
ipping+0.101
ing+0.101
ement+0.099
Neg logit contributions
 forgot-0.092
 knew-0.089
 wished-0.079
 also-0.079
 never-0.078
Token"
Token ID1
Feature activation+5.21e-01

Pos logit contributions
he+0.060
ays+0.057
room+0.055
ane+0.055
ing+0.054
Neg logit contributions
 forgot-0.104
 knew-0.093
 remembered-0.091
 wished-0.090
 bit-0.086
Token green
Token ID4077
Feature activation+5.19e+00

Pos logit contributions
 forgot+0.041
 wished+0.035
 knew+0.035
 remembered+0.035
 known+0.033
Neg logit contributions
he-0.040
ing-0.037
room-0.036
ings-0.036
ays-0.035
Token and
Token ID290
Feature activation-2.88e+00

Pos logit contributions
he+0.112
uggle+0.101
ane+0.100
ore+0.100
room+0.099
Neg logit contributions
 forgot-0.123
 bit-0.113
 knew-0.112
 wished-0.108
 remembered-0.108
Token yellow
Token ID7872
Feature activation+3.73e+00

Pos logit contributions
 forgot+0.068
 known+0.061
 remembered+0.060
 knew+0.059
 unknow+0.058
Neg logit contributions
he-0.053
ays-0.050
 with-0.048
ing-0.047
room-0.047
Token and
Token ID290
Feature activation-2.97e+00

Pos logit contributions
he+0.073
uggle+0.063
ane+0.062
eddy+0.061
ore+0.060
Neg logit contributions
 forgot-0.098
 knew-0.092
 bit-0.091
 remembered-0.087
 wished-0.086
Token look
Token ID804
Feature activation-2.02e-01

Pos logit contributions
 forgot+0.066
 known+0.059
 knew+0.058
 remembered+0.058
 unknow+0.057
Neg logit contributions
he-0.058
ays-0.056
room-0.054
 with-0.054
ing-0.054
Token y
Token ID331
Feature activation+8.43e+00

Pos logit contributions
 forgot+0.004
 knew+0.003
 remembered+0.003
 pap+0.003
 wished+0.003
Neg logit contributions
he-0.005
ing-0.005
ays-0.005
room-0.005
ings-0.005
Tokenummy
Token ID13513
Feature activation+5.14e+00

Pos logit contributions
he+0.137
ane+0.120
room+0.115
ute+0.107
uggle+0.105
Neg logit contributions
 forgot-0.214
 bit-0.193
 forgotten-0.193
 knew-0.193
 sp-0.184
Token.
Token ID13
Feature activation+3.92e+00

Pos logit contributions
he+0.083
room+0.077
ane+0.073
ays+0.070
uggle+0.069
Neg logit contributions
 forgot-0.155
 knew-0.142
 bit-0.139
 wished-0.138
 remembered-0.137
Token (newline)
Token ID198
Feature activation+4.75e+00

Pos logit contributions
ing+0.094
he+0.092
ane+0.090
ipping+0.089
ute+0.089
Neg logit contributions
 forgot-0.087
 knew-0.072
 learned-0.069
 wished-0.069
 remembered-0.068
Token (newline)
Token ID198
Feature activation+3.77e+00

Pos logit contributions
ays+0.123
ipping+0.122
ing+0.116
ane+0.114
ooming+0.114
Neg logit contributions
 forgot-0.089
 knew-0.085
 never-0.080
 also-0.079
 learned-0.077
Token"
Token ID1
Feature activation+7.55e-01

Pos logit contributions
ays+0.075
ane+0.070
ing+0.069
he+0.069
ore+0.069
Neg logit contributions
 forgot-0.102
 knew-0.088
 remembered-0.087
 wished-0.083
 learned-0.082
Token.
Token ID13
Feature activation+2.87e+00

Pos logit contributions
he+0.066
ays+0.057
room+0.055
ore+0.052
uggle+0.052
Neg logit contributions
 forgot-0.118
 knew-0.109
 remembered-0.107
 wished-0.099
 bit-0.098
Token We
Token ID775
Feature activation-2.59e+00

Pos logit contributions
he+0.067
ays+0.059
ing+0.059
room+0.058
ings+0.056
Neg logit contributions
 forgot-0.070
 knew-0.061
 remembered-0.060
 wished-0.059
 pap-0.057
Token will
Token ID481
Feature activation-1.13e+00

Pos logit contributions
 forgot+0.049
 pap+0.041
 wished+0.040
 knew+0.040
 remembered+0.039
Neg logit contributions
he-0.070
room-0.064
uggle-0.061
ing-0.061
ings-0.060
Token share
Token ID2648
Feature activation+3.74e+00

Pos logit contributions
 forgot+0.020
 knew+0.017
 remembered+0.016
 wished+0.016
 pap+0.015
Neg logit contributions
he-0.032
room-0.030
ing-0.029
ings-0.029
ays-0.028
Token the
Token ID262
Feature activation-2.53e+00

Pos logit contributions
he+0.086
ane+0.079
ays+0.078
ing+0.077
room+0.076
Neg logit contributions
 forgot-0.089
 knew-0.079
 wished-0.077
 remembered-0.076
 bit-0.074
Token shoes
Token ID10012
Feature activation+8.41e+00

Pos logit contributions
 forgot+0.056
 knew+0.047
 remembered+0.047
 wished+0.045
 known+0.043
Neg logit contributions
he-0.064
room-0.059
ing-0.058
ays-0.057
ane-0.055
Token.
Token ID13
Feature activation+2.45e+00

Pos logit contributions
ays+0.144
room+0.134
he+0.128
overed+0.128
ane+0.127
Neg logit contributions
 forgot-0.229
 bit-0.207
 knew-0.205
 never-0.194
 learned-0.193
Token We
Token ID775
Feature activation-2.70e+00

Pos logit contributions
he+0.059
ing+0.054
ays+0.052
ings+0.051
room+0.051
Neg logit contributions
 forgot-0.057
 wished-0.050
 knew-0.050
 remembered-0.049
 pap-0.048
Token will
Token ID481
Feature activation-1.57e+00

Pos logit contributions
 forgot+0.053
 pap+0.046
 wished+0.045
 knew+0.044
 remembered+0.044
Neg logit contributions
he-0.071
room-0.064
uggle-0.062
ings-0.060
ing-0.060
Token be
Token ID307
Feature activation-6.92e-01

Pos logit contributions
 forgot+0.027
 knew+0.023
 remembered+0.022
 wished+0.022
 pap+0.020
Neg logit contributions
he-0.044
room-0.042
ings-0.040
uggle-0.039
eddy-0.038
Token compassionate
Token ID32533
Feature activation+5.57e+00

Pos logit contributions
 forgot+0.013
 knew+0.011
 wished+0.010
 remembered+0.010
 pap+0.009
Neg logit contributions
he-0.019
ing-0.018
room-0.018
ays-0.018
eddy-0.017
Token is
Token ID318
Feature activation+2.41e-01

Pos logit contributions
he+0.005
ays+0.005
ing+0.005
room+0.005
eddy+0.005
Neg logit contributions
 forgot-0.004
 knew-0.003
 remembered-0.003
 wished-0.003
 pap-0.003
Token this
Token ID428
Feature activation-4.43e-01

Pos logit contributions
he+0.005
ing+0.005
ays+0.005
room+0.005
ore+0.004
Neg logit contributions
 forgot-0.006
 knew-0.005
 remembered-0.005
 wished-0.005
 known-0.005
Token?
Token ID30
Feature activation+2.38e+00

Pos logit contributions
 forgot+0.014
 knew+0.012
 remembered+0.012
 wished+0.012
 pap+0.011
Neg logit contributions
he-0.007
ays-0.006
room-0.006
ing-0.006
ane-0.006
Token It
Token ID632
Feature activation-1.86e+00

Pos logit contributions
he+0.054
ing+0.048
room+0.046
ays+0.045
ings+0.045
Neg logit contributions
 forgot-0.057
 knew-0.052
 remembered-0.050
 pap-0.050
 wished-0.050
Token looks
Token ID3073
Feature activation+1.83e-01

Pos logit contributions
 forgot+0.050
 wished+0.044
 known+0.044
 remembered+0.043
 knew+0.043
Neg logit contributions
ing-0.034
ays-0.034
he-0.032
ore-0.031
ings-0.028
Token y
Token ID331
Feature activation+8.39e+00

Pos logit contributions
he+0.005
ing+0.005
ays+0.005
room+0.004
ings+0.004
Neg logit contributions
 forgot-0.004
 knew-0.003
 remembered-0.003
 wished-0.003
 pap-0.003
Tokenucky
Token ID5309
Feature activation+4.71e+00

Pos logit contributions
he+0.145
ane+0.135
overed+0.131
ute+0.121
but+0.121
Neg logit contributions
 bit-0.181
 forgot-0.177
 sp-0.173
 knew-0.166
 f-0.162
Token.
Token ID13
Feature activation+2.62e+00

Pos logit contributions
he+0.091
room+0.084
ane+0.080
ays+0.079
uggle+0.076
Neg logit contributions
 forgot-0.127
 knew-0.113
 bit-0.110
 remembered-0.110
 wished-0.107
Token I
Token ID314
Feature activation-3.42e+00

Pos logit contributions
he+0.061
ing+0.054
ays+0.052
room+0.051
ane+0.051
Neg logit contributions
 forgot-0.063
 knew-0.057
 remembered-0.055
 wished-0.054
 pap-0.054
Token don
Token ID836
Feature activation+4.68e+00

Pos logit contributions
 forgot+0.066
 pap+0.066
 robbers+0.061
 women+0.060
 elves+0.057
Neg logit contributions
he-0.084
ore-0.077
room-0.069
ings-0.067
ays-0.067
Token't
Token ID470
Feature activation-2.36e+00

Pos logit contributions
ays+0.122
ment+0.122
ement+0.120
eddy+0.120
room+0.120
Neg logit contributions
 forgot-0.061
 knew-0.061
 knocked-0.054
 never-0.054
 also-0.052
Token grabbed
Token ID13176
Feature activation+9.52e-01

Pos logit contributions
 forgot+0.091
 known+0.084
 fact+0.082
 pap+0.074
 robbers+0.072
Neg logit contributions
he-0.155
room-0.143
ing-0.142
ays-0.140
ore-0.136
Token the
Token ID262
Feature activation-2.02e+00

Pos logit contributions
he+0.022
room+0.021
ays+0.020
ing+0.020
ane+0.020
Neg logit contributions
 forgot-0.022
 knew-0.019
 remembered-0.019
 wished-0.018
 pap-0.017
Token man
Token ID582
Feature activation+3.30e+00

Pos logit contributions
 forgot+0.046
 remembered+0.039
 knew+0.039
 wished+0.038
 known+0.038
Neg logit contributions
he-0.049
ing-0.044
ays-0.042
room-0.042
ings-0.041
Tokenâ
Token ID22940
Feature activation+2.23e+00

Pos logit contributions
he+0.074
ays+0.070
room+0.069
ing+0.067
ane+0.066
Neg logit contributions
 forgot-0.077
 knew-0.069
 remembered-0.066
 wished-0.065
 bit-0.064
Token
Token ID26391
Feature activation+3.66e+00

Pos logit contributions
he+0.050
ays+0.049
ing+0.048
room+0.046
eddy+0.046
Neg logit contributions
 forgot-0.052
 knew-0.046
 remembered-0.044
 wished-0.043
 learned-0.042
Token
Token ID8151
Feature activation+8.39e+00

Pos logit contributions
he+0.074
ays+0.070
ing+0.068
room+0.066
ement+0.063
Neg logit contributions
 forgot-0.101
 knew-0.085
 remembered-0.083
 wished-0.081
 learned-0.078
Tokens
Token ID82
Feature activation+7.90e-01

Pos logit contributions
ays+0.129
he+0.127
room+0.126
uggle+0.120
overed+0.115
Neg logit contributions
 forgot-0.230
 knew-0.216
 never-0.200
 forgotten-0.196
 remembered-0.195
Token hand
Token ID1021
Feature activation+5.29e+00

Pos logit contributions
he+0.013
ing+0.012
room+0.012
ays+0.012
ings+0.011
Neg logit contributions
 forgot-0.024
 knew-0.021
 remembered-0.021
 wished-0.020
 pap-0.020
Token with
Token ID351
Feature activation-1.90e+00

Pos logit contributions
he+0.085
ays+0.076
room+0.075
uggle+0.070
ane+0.069
Neg logit contributions
 forgot-0.154
 knew-0.146
 bit-0.141
 remembered-0.138
 wished-0.135
Token the
Token ID262
Feature activation-2.85e+00

Pos logit contributions
 forgot+0.050
 knew+0.040
 remembered+0.040
 wished+0.039
 pap+0.037
Neg logit contributions
he-0.041
ay-0.037
room-0.036
ays-0.035
ings-0.035
Token other
Token ID584
Feature activation-7.29e-01

Pos logit contributions
 forgot+0.075
 remembered+0.064
 knew+0.062
 wished+0.062
 known+0.061
Neg logit contributions
he-0.058
ing-0.051
ings-0.050
ay-0.049
room-0.046
Token"
Token ID1
Feature activation-2.22e-01

Pos logit contributions
ays+0.037
he+0.037
ing+0.035
ane+0.035
ings+0.034
Neg logit contributions
 forgot-0.058
 knew-0.052
 remembered-0.050
 wished-0.049
 learned-0.048
TokenWow
Token ID22017
Feature activation+2.96e+00

Pos logit contributions
 forgot+0.006
 knew+0.005
 remembered+0.005
 wished+0.005
 pap+0.005
Neg logit contributions
he-0.005
room-0.004
ays-0.004
ing-0.004
ane-0.004
Token,
Token ID11
Feature activation-1.53e+00

Pos logit contributions
he+0.060
ays+0.056
eddy+0.056
room+0.054
ane+0.054
Neg logit contributions
 forgot-0.080
 remembered-0.070
 knew-0.070
 bit-0.064
 knocked-0.062
Token it
Token ID340
Feature activation-1.32e+00

Pos logit contributions
 forgot+0.046
 known+0.043
 remembered+0.042
 knew+0.041
 wished+0.039
Neg logit contributions
he-0.023
room-0.020
 anymore-0.019
ing-0.019
ay-0.017
Token looks
Token ID3073
Feature activation-1.98e-02

Pos logit contributions
 forgot+0.031
 wished+0.027
 knew+0.026
 remembered+0.026
 known+0.026
Neg logit contributions
he-0.030
ing-0.028
room-0.028
ays-0.026
ings-0.025
Token y
Token ID331
Feature activation+8.38e+00

Pos logit contributions
 forgot+0.000
 knew+0.000
 remembered+0.000
 wished+0.000
 pap+0.000
Neg logit contributions
he-0.001
ing-0.001
room-0.001
ays-0.001
ings-0.000
Tokenummy
Token ID13513
Feature activation+5.73e+00

Pos logit contributions
ane+0.133
he+0.128
ute+0.121
but+0.118
overed+0.116
Neg logit contributions
 forgot-0.196
 bit-0.192
 sp-0.174
 knew-0.173
 fact-0.164
Token,"
Token ID553
Feature activation+5.09e-01

Pos logit contributions
he+0.112
room+0.107
ays+0.104
ane+0.102
uggle+0.098
Neg logit contributions
 forgot-0.150
 knew-0.133
 remembered-0.131
 bit-0.130
 wished-0.125
Token Amy
Token ID14235
Feature activation-3.17e+00

Pos logit contributions
he+0.008
room+0.007
ing+0.007
ays+0.007
ane+0.006
Neg logit contributions
 forgot-0.015
 knew-0.014
 remembered-0.014
 pap-0.013
 known-0.013
Token said
Token ID531
Feature activation+3.37e+00

Pos logit contributions
 forgot+0.051
 known+0.040
 knew+0.040
 remembered+0.040
 wished+0.038
Neg logit contributions
he-0.093
ing-0.088
room-0.086
ays-0.084
ore-0.081
Token.
Token ID13
Feature activation+5.33e+00

Pos logit contributions
he+0.069
ays+0.066
ane+0.062
room+0.061
ay+0.060
Neg logit contributions
 forgot-0.086
 knew-0.079
 remembered-0.076
 bit-0.076
 knocked-0.074
Token.
Token ID13
Feature activation+2.24e+00

Pos logit contributions
ays+0.159
overed+0.147
ement+0.146
ore+0.144
ings+0.144
Neg logit contributions
 forgot-0.213
 knew-0.177
 bit-0.172
 never-0.171
 forgotten-0.169
Token I
Token ID314
Feature activation-2.71e+00

Pos logit contributions
he+0.046
ing+0.041
ays+0.041
room+0.040
ings+0.039
Neg logit contributions
 forgot-0.059
 knew-0.052
 remembered-0.051
 wished-0.050
 pap-0.049
Token want
Token ID765
Feature activation+7.72e-01

Pos logit contributions
 forgot+0.051
 pap+0.049
 robbers+0.044
 remembered+0.043
 knew+0.042
Neg logit contributions
he-0.073
ore-0.061
uggle-0.061
room-0.060
ings-0.060
Token to
Token ID284
Feature activation-2.06e+00

Pos logit contributions
he+0.012
ing+0.011
ays+0.011
room+0.011
eddy+0.010
Neg logit contributions
 forgot-0.024
 knew-0.021
 remembered-0.021
 wished-0.020
 pap-0.019
Token wear
Token ID5806
Feature activation+2.40e+00

Pos logit contributions
 forgot+0.042
 knew+0.036
 remembered+0.035
 wished+0.034
 pap+0.032
Neg logit contributions
he-0.055
room-0.051
ing-0.049
ays-0.049
ings-0.048
Token shorts
Token ID22078
Feature activation+8.37e+00

Pos logit contributions
he+0.061
ing+0.056
room+0.056
ays+0.056
ane+0.054
Neg logit contributions
 forgot-0.052
 knew-0.044
 remembered-0.043
 wished-0.042
 pap-0.039
Token."
Token ID526
Feature activation+3.96e+00

Pos logit contributions
ays+0.167
he+0.160
ogging+0.159
ings+0.158
overed+0.157
Neg logit contributions
 forgot-0.199
 knew-0.177
 remembered-0.169
 bit-0.167
 always-0.165
Token Jack
Token ID3619
Feature activation-4.11e+00

Pos logit contributions
he+0.082
ays+0.082
ing+0.079
ane+0.079
eddy+0.073
Neg logit contributions
 forgot-0.101
 knew-0.092
 remembered-0.087
 learned-0.084
 wished-0.081
Token's
Token ID338
Feature activation-1.34e+00

Pos logit contributions
 forgot+0.065
 known+0.061
 robbers+0.057
 pap+0.057
 forgotten+0.053
Neg logit contributions
he-0.112
ing-0.107
room-0.102
uck-0.098
ings-0.095
Token mom
Token ID1995
Feature activation-3.08e+00

Pos logit contributions
 forgot+0.029
 knew+0.024
 remembered+0.024
 wished+0.024
 known+0.023
Neg logit contributions
he-0.033
room-0.032
ing-0.031
ings-0.030
ays-0.029
Token said
Token ID531
Feature activation+1.17e+00

Pos logit contributions
 forgot+0.049
 pap+0.046
 known+0.046
 robbers+0.044
 fact+0.042
Neg logit contributions
he-0.087
ing-0.083
room-0.075
eddy-0.071
uck-0.070
Token lobby
Token ID10866
Feature activation+3.77e+00

Pos logit contributions
he+0.043
ing+0.036
ays+0.036
room+0.036
ane+0.032
Neg logit contributions
 forgot-0.116
 knew-0.105
 remembered-0.104
 wished-0.102
 pap-0.099
Token.
Token ID13
Feature activation+3.07e+00

Pos logit contributions
he+0.050
eddy+0.045
ays+0.045
ing+0.044
room+0.044
Neg logit contributions
 forgot-0.122
 knew-0.112
 remembered-0.106
 wished-0.106
 bit-0.103
Token The
Token ID383
Feature activation-2.82e+00

Pos logit contributions
he+0.073
ing+0.065
room+0.063
ane+0.063
ays+0.060
Neg logit contributions
 forgot-0.071
 knew-0.064
 pap-0.061
 remembered-0.060
 wished-0.060
Token cake
Token ID12187
Feature activation-9.56e-01

Pos logit contributions
 forgot+0.078
 remembered+0.068
 known+0.064
 knew+0.063
 wished+0.062
Neg logit contributions
he-0.058
room-0.050
time-0.046
ing-0.045
ays-0.045
Token looks
Token ID3073
Feature activation-1.15e+00

Pos logit contributions
 forgot+0.023
 remembered+0.019
 knew+0.019
 known+0.019
 wished+0.019
Neg logit contributions
he-0.022
ays-0.020
ing-0.020
room-0.020
ane-0.019
Token y
Token ID331
Feature activation+8.36e+00

Pos logit contributions
 forgot+0.022
 knew+0.018
 remembered+0.018
 pap+0.017
 known+0.017
Neg logit contributions
he-0.031
ing-0.029
ays-0.028
room-0.027
ings-0.027
Tokenummy
Token ID13513
Feature activation+4.81e+00

Pos logit contributions
he+0.133
ane+0.122
ute+0.112
overed+0.111
uggle+0.111
Neg logit contributions
 forgot-0.195
 bit-0.184
 knew-0.181
 sp-0.173
 forgotten-0.173
Token.
Token ID13
Feature activation+3.18e+00

Pos logit contributions
he+0.078
room+0.074
ays+0.071
ane+0.069
uggle+0.066
Neg logit contributions
 forgot-0.144
 knew-0.131
 bit-0.128
 remembered-0.126
 wished-0.126
Token They
Token ID1119
Feature activation-6.11e+00

Pos logit contributions
he+0.082
ing+0.074
room+0.073
ane+0.072
uggle+0.069
Neg logit contributions
 forgot-0.067
 knew-0.059
 pap-0.057
 wished-0.056
 remembered-0.055
Token want
Token ID765
Feature activation+7.58e-01

Pos logit contributions
 forgot+0.125
 known+0.122
 pap+0.115
 forgotten+0.111
 remembered+0.110
Neg logit contributions
he-0.149
room-0.121
ays-0.109
 with-0.108
ing-0.103
Token to
Token ID284
Feature activation-2.47e+00

Pos logit contributions
he+0.011
ing+0.010
room+0.010
ays+0.010
ings+0.009
Neg logit contributions
 forgot-0.024
 remembered-0.021
 knew-0.021
 wished-0.020
 pap-0.020
Token It
Token ID632
Feature activation-2.43e+00

Pos logit contributions
he+0.080
ays+0.078
ing+0.077
ings+0.074
ane+0.074
Neg logit contributions
 forgot-0.088
 knew-0.079
 remembered-0.076
 learned-0.073
 wished-0.073
Token was
Token ID373
Feature activation-1.69e+00

Pos logit contributions
 forgot+0.054
 known+0.049
 pap+0.048
 women+0.047
 forgotten+0.046
Neg logit contributions
he-0.053
room-0.046
ing-0.046
ays-0.044
 with-0.043
Token red
Token ID2266
Feature activation+4.91e+00

Pos logit contributions
 forgot+0.037
 knew+0.030
 remembered+0.030
 wished+0.029
 pap+0.027
Neg logit contributions
he-0.041
room-0.038
ays-0.037
ing-0.037
ings-0.036
Token and
Token ID290
Feature activation-3.26e+00

Pos logit contributions
he+0.112
ane+0.100
ays+0.098
uggle+0.098
room+0.098
Neg logit contributions
 forgot-0.106
 bit-0.100
 knew-0.098
 sc-0.092
 remembered-0.090
Token looked
Token ID3114
Feature activation-4.85e-01

Pos logit contributions
 forgot+0.060
 known+0.057
 unknow+0.053
 punished+0.051
 prisoners+0.050
Neg logit contributions
he-0.074
 with-0.071
ing-0.069
 on-0.067
ays-0.065
Token y
Token ID331
Feature activation+8.35e+00

Pos logit contributions
 forgot+0.009
 pap+0.008
 knew+0.008
 remembered+0.008
 known+0.007
Neg logit contributions
he-0.012
ing-0.012
ays-0.011
room-0.011
 anymore-0.011
Tokenummy
Token ID13513
Feature activation+4.88e+00

Pos logit contributions
he+0.131
ane+0.122
room+0.111
ute+0.108
overed+0.107
Neg logit contributions
 forgot-0.213
 bit-0.197
 knew-0.196
 forgotten-0.190
 spoiled-0.183
Token.
Token ID13
Feature activation+3.99e+00

Pos logit contributions
he+0.081
room+0.077
ane+0.073
ays+0.072
uggle+0.068
Neg logit contributions
 forgot-0.147
 knew-0.134
 bit-0.133
 wished-0.130
 remembered-0.129
Token Max
Token ID5436
Feature activation-4.29e+00

Pos logit contributions
he+0.092
ays+0.087
ings+0.085
ing+0.084
ane+0.083
Neg logit contributions
 forgot-0.095
 knew-0.086
 remembered-0.081
 wished-0.079
 learned-0.078
Token wanted
Token ID2227
Feature activation+1.04e+00

Pos logit contributions
 known+0.064
 fact+0.062
 forgot+0.059
 forgotten+0.056
 pap+0.055
Neg logit contributions
ing-0.105
 with-0.101
he-0.100
room-0.099
uck-0.092
Token to
Token ID284
Feature activation-1.44e+00

Pos logit contributions
he+0.015
ing+0.014
room+0.014
ays+0.013
ore+0.013
Neg logit contributions
 forgot-0.033
 knew-0.029
 remembered-0.029
 wished-0.028
 pap-0.028
Token jam
Token ID16853
Feature activation+5.35e+00

Pos logit contributions
he+0.046
room+0.041
uggle+0.041
ays+0.041
ing+0.041
Neg logit contributions
 forgot-0.036
 knew-0.033
 remembered-0.032
 bit-0.030
 wished-0.029
Token.
Token ID13
Feature activation+4.59e+00

Pos logit contributions
he+0.079
room+0.077
uggle+0.077
ays+0.072
eddy+0.071
Neg logit contributions
 forgot-0.156
 knew-0.143
 bit-0.138
 remembered-0.137
 wished-0.136
Token "
Token ID366
Feature activation+1.99e+00

Pos logit contributions
he+0.115
ing+0.107
ays+0.107
ane+0.106
uggle+0.102
Neg logit contributions
 forgot-0.100
 knew-0.088
 remembered-0.080
 knocked-0.080
 wished-0.077
TokenIt
Token ID1026
Feature activation-1.71e+00

Pos logit contributions
he+0.045
ing+0.043
ays+0.043
ings+0.042
room+0.041
Neg logit contributions
 forgot-0.046
 knew-0.042
 wished-0.040
 remembered-0.039
 bit-0.038
Token looks
Token ID3073
Feature activation-6.96e-01

Pos logit contributions
 forgot+0.040
 wished+0.035
 known+0.034
 knew+0.034
 remembered+0.034
Neg logit contributions
ing-0.037
he-0.037
ays-0.037
room-0.033
ore-0.032
Token y
Token ID331
Feature activation+8.35e+00

Pos logit contributions
 forgot+0.014
 knew+0.012
 remembered+0.012
 wished+0.011
 known+0.011
Neg logit contributions
he-0.017
ing-0.016
ays-0.016
room-0.016
ings-0.015
Tokenummy
Token ID13513
Feature activation+5.77e+00

Pos logit contributions
he+0.139
ooming+0.134
ane+0.133
overed+0.132
ute+0.127
Neg logit contributions
 bit-0.162
 forgot-0.157
 sp-0.153
 knew-0.149
 vitamins-0.145
Token!"
Token ID2474
Feature activation+3.46e+00

Pos logit contributions
he+0.115
room+0.111
uggle+0.103
ays+0.102
ane+0.102
Neg logit contributions
 forgot-0.149
 knew-0.133
 bit-0.131
 remembered-0.131
 wished-0.129
Token\n
Token ID198
Feature activation+4.37e+00

Pos logit contributions
ays+0.069
ane+0.068
he+0.067
ing+0.066
room+0.064
Neg logit contributions
 forgot-0.092
 knew-0.082
 remembered-0.076
 wished-0.075
 learned-0.074
Token\n
Token ID198
Feature activation+3.18e+00

Pos logit contributions
ane+0.112
ays+0.112
owers+0.108
ipping+0.108
ement+0.103
Neg logit contributions
 forgot-0.080
 knew-0.079
 told-0.066
 learned-0.066
 grown-0.063
TokenThey
Token ID2990
Feature activation-6.02e+00

Pos logit contributions
ays+0.070
he+0.070
ane+0.067
ore+0.064
uggle+0.063
Neg logit contributions
 forgot-0.081
 knew-0.070
 remembered-0.068
 wished-0.064
 learned-0.064
Token wanted
Token ID2227
Feature activation+1.51e+00

Pos logit contributions
 least+0.087
 unknow+0.081
 fact+0.080
 payoff+0.079
 tresp+0.076
Neg logit contributions
 with-0.098
.-0.088
 at-0.081
 down-0.081
 on-0.080
Token to
Token ID284
Feature activation-1.63e+00

Pos logit contributions
ing+0.017
 anymore+0.015
he+0.014
ings+0.014
ays+0.014
Neg logit contributions
 forgot-0.051
 remembered-0.044
 knew-0.044
 wished-0.043
 pap-0.042
Token share
Token ID2648
Feature activation+3.86e+00

Pos logit contributions
 forgot+0.036
 wished+0.030
 remembered+0.030
 knew+0.029
 pap+0.029
Neg logit contributions
he-0.039
room-0.036
uggle-0.034
ings-0.034
ore-0.033
Token her
Token ID607
Feature activation+5.00e-01

Pos logit contributions
he+0.084
ane+0.078
ays+0.078
room+0.076
ing+0.075
Neg logit contributions
 forgot-0.095
 knew-0.086
 remembered-0.082
 wished-0.082
 bit-0.079
Token pr
Token ID778
Feature activation+5.48e+00

Pos logit contributions
he+0.013
ing+0.012
room+0.012
ays+0.012
ings+0.012
Neg logit contributions
 forgot-0.010
 knew-0.008
 remembered-0.008
 wished-0.008
 pap-0.008
Tokenunes
Token ID4015
Feature activation+8.33e+00

Pos logit contributions
he+0.107
room+0.092
ing+0.091
ays+0.088
ane+0.085
Neg logit contributions
 forgot-0.145
 knew-0.130
 remembered-0.125
 pap-0.125
 bit-0.119
Token with
Token ID351
Feature activation+9.91e-01

Pos logit contributions
ays+0.141
ane+0.132
room+0.128
he+0.120
ing+0.118
Neg logit contributions
 forgot-0.230
 bit-0.203
 knew-0.198
 never-0.194
 forget-0.190
Token him
Token ID683
Feature activation+3.09e+00

Pos logit contributions
he+0.021
room+0.019
ing+0.019
ays+0.019
ings+0.018
Neg logit contributions
 forgot-0.026
 knew-0.022
 remembered-0.022
 wished-0.021
 pap-0.021
Token.
Token ID13
Feature activation+4.18e+00

Pos logit contributions
he+0.050
ays+0.045
room+0.044
ane+0.042
ing+0.041
Neg logit contributions
 forgot-0.096
 knew-0.088
 remembered-0.087
 bit-0.083
 wished-0.083
Token She
Token ID1375
Feature activation-6.23e+00

Pos logit contributions
he+0.079
room+0.071
ays+0.070
ane+0.069
ing+0.069
Neg logit contributions
 forgot-0.120
 knew-0.108
 remembered-0.102
 wished-0.102
 learned-0.097
Token ran
Token ID4966
Feature activation+2.82e+00

Pos logit contributions
 fact+0.122
 supposed+0.112
 known+0.110
 forget+0.108
 supervision+0.106
Neg logit contributions
 with-0.123
 again-0.114
ing-0.111
room-0.110
ed-0.108
Token cake
Token ID12187
Feature activation+6.01e+00

Pos logit contributions
he+0.020
ing+0.019
ays+0.018
room+0.018
ane+0.018
Neg logit contributions
 forgot-0.014
 knew-0.013
 wished-0.012
 remembered-0.012
 bit-0.012
Token.
Token ID13
Feature activation+2.79e+00

Pos logit contributions
he+0.101
ays+0.099
ane+0.094
overed+0.093
ing+0.092
Neg logit contributions
 forgot-0.165
 knew-0.148
 bit-0.145
 remembered-0.141
 pap-0.136
Token It
Token ID632
Feature activation-1.54e+00

Pos logit contributions
he+0.070
ing+0.067
ays+0.063
ings+0.062
ane+0.061
Neg logit contributions
 forgot-0.059
 knew-0.053
 wished-0.052
 remembered-0.051
 pap-0.050
Token looks
Token ID3073
Feature activation-4.73e-01

Pos logit contributions
 forgot+0.044
 known+0.040
 remembered+0.040
 wished+0.039
 knew+0.038
Neg logit contributions
he-0.024
ays-0.024
ing-0.022
room-0.021
ore-0.020
Token so
Token ID523
Feature activation-7.35e-01

Pos logit contributions
 forgot+0.010
 knew+0.008
 remembered+0.008
 pap+0.008
 wished+0.008
Neg logit contributions
he-0.012
ing-0.011
ays-0.011
 anymore-0.010
room-0.010
Token y
Token ID331
Feature activation+8.33e+00

Pos logit contributions
 forgot+0.014
 remembered+0.012
 knew+0.011
 wished+0.011
 known+0.011
Neg logit contributions
he-0.020
ing-0.018
ays-0.018
room-0.018
ore-0.018
Tokenummy
Token ID13513
Feature activation+6.20e+00

Pos logit contributions
overed+0.139
he+0.139
ane+0.134
ooming+0.133
but+0.128
Neg logit contributions
 bit-0.166
 forgot-0.163
 sp-0.155
 knew-0.151
 f-0.151
Token.
Token ID13
Feature activation+2.83e+00

Pos logit contributions
he+0.117
room+0.111
ays+0.105
ane+0.103
ings+0.102
Neg logit contributions
 forgot-0.168
 knew-0.155
 bit-0.149
 wished-0.148
 remembered-0.147
Token Can
Token ID1680
Feature activation+2.26e+00

Pos logit contributions
he+0.073
ing+0.067
ays+0.065
ings+0.064
ane+0.064
Neg logit contributions
 forgot-0.061
 knew-0.054
 wished-0.053
 remembered-0.052
 pap-0.052
Token we
Token ID356
Feature activation-2.25e+00

Pos logit contributions
he+0.061
ing+0.058
ays+0.058
room+0.058
ings+0.056
Neg logit contributions
 forgot-0.046
 knew-0.038
 remembered-0.037
 wished-0.036
 bit-0.034
Token have
Token ID423
Feature activation-8.80e-02

Pos logit contributions
 forgot+0.043
 wished+0.036
 knew+0.036
 remembered+0.035
 pap+0.034
Neg logit contributions
he-0.063
room-0.056
uggle-0.053
ay-0.052
ays-0.052
Token chili
Token ID32728
Feature activation+3.53e+00

Pos logit contributions
he+0.009
ays+0.009
ing+0.009
room+0.009
ane+0.008
Neg logit contributions
 forgot-0.007
 knew-0.006
 remembered-0.006
 bit-0.006
 wished-0.006
Token and
Token ID290
Feature activation-3.67e+00

Pos logit contributions
he+0.100
ays+0.092
room+0.091
ane+0.088
ing+0.087
Neg logit contributions
 forgot-0.062
 knew-0.057
 remembered-0.051
 bit-0.051
 pap-0.050
Token thought
Token ID1807
Feature activation+1.26e+00

Pos logit contributions
 least+0.063
 forgot+0.063
 fact+0.061
 forget+0.060
 known+0.059
Neg logit contributions
 with-0.078
ing-0.076
.-0.070
it-0.069
he-0.068
Token it
Token ID340
Feature activation-6.67e-01

Pos logit contributions
he+0.032
ays+0.031
room+0.030
eddy+0.029
ane+0.029
Neg logit contributions
 forgot-0.028
 knew-0.026
 remembered-0.023
 wished-0.023
 knocked-0.022
Token looked
Token ID3114
Feature activation-5.83e-01

Pos logit contributions
 forgot+0.012
 remembered+0.009
 knew+0.009
 wished+0.009
 pap+0.009
Neg logit contributions
he-0.019
ing-0.018
room-0.018
ays-0.017
ore-0.017
Token y
Token ID331
Feature activation+8.33e+00

Pos logit contributions
 forgot+0.011
 knew+0.009
 remembered+0.009
 pap+0.009
 wished+0.009
Neg logit contributions
he-0.015
ing-0.015
ays-0.014
room-0.014
ings-0.014
Tokenummy
Token ID13513
Feature activation+4.66e+00

Pos logit contributions
he+0.132
ane+0.130
overed+0.115
room+0.114
ute+0.113
Neg logit contributions
 forgot-0.201
 bit-0.193
 knew-0.191
 forgotten-0.182
 spoiled-0.175
Token.
Token ID13
Feature activation+4.00e+00

Pos logit contributions
he+0.082
room+0.080
ane+0.076
ays+0.073
uggle+0.070
Neg logit contributions
 forgot-0.135
 bit-0.125
 knew-0.125
 wished-0.119
 remembered-0.119
Token The
Token ID383
Feature activation-3.12e+00

Pos logit contributions
he+0.080
room+0.073
ays+0.072
ing+0.071
ane+0.068
Neg logit contributions
 forgot-0.109
 knew-0.101
 wished-0.096
 remembered-0.094
 pap-0.090
Token bird
Token ID6512
Feature activation-3.38e+00

Pos logit contributions
 forgot+0.082
 forgotten+0.067
 remembered+0.066
 known+0.066
 wished+0.065
Neg logit contributions
he-0.061
ing-0.056
room-0.054
eddy-0.053
ane-0.052
Token ate
Token ID15063
Feature activation+2.13e+00

Pos logit contributions
 forgot+0.049
 known+0.046
 fact+0.046
 pap+0.044
 forgotten+0.042
Neg logit contributions
he-0.093
ing-0.092
 with-0.083
ore-0.081
ays-0.080
Token red
Token ID2266
Feature activation+4.77e+00

Pos logit contributions
 forgot+0.028
 knew+0.023
 remembered+0.023
 wished+0.022
 known+0.021
Neg logit contributions
he-0.029
room-0.027
ays-0.027
ing-0.027
ings-0.027
Token and
Token ID290
Feature activation-3.40e+00

Pos logit contributions
he+0.099
ane+0.087
uggle+0.085
room+0.085
ays+0.085
Neg logit contributions
 forgot-0.114
 bit-0.105
 knew-0.104
 remembered-0.098
 wished-0.097
Token shiny
Token ID22441
Feature activation+3.75e+00

Pos logit contributions
 forgot+0.073
 known+0.069
 knew+0.063
 remembered+0.062
 unknow+0.061
Neg logit contributions
he-0.069
ays-0.067
 with-0.066
ing-0.066
ings-0.063
Token and
Token ID290
Feature activation-3.62e+00

Pos logit contributions
he+0.069
room+0.062
uggle+0.061
ane+0.060
ore+0.057
Neg logit contributions
 forgot-0.105
 knew-0.097
 bit-0.096
 wished-0.095
 remembered-0.095
Token looks
Token ID3073
Feature activation-6.12e-01

Pos logit contributions
 forgot+0.071
 known+0.066
 knew+0.060
 remembered+0.060
 unknow+0.060
Neg logit contributions
he-0.081
ing-0.078
 with-0.077
ays-0.077
room-0.074
Token y
Token ID331
Feature activation+8.33e+00

Pos logit contributions
 forgot+0.013
 knew+0.011
 pap+0.011
 remembered+0.011
 known+0.011
Neg logit contributions
he-0.014
ing-0.013
 anymore-0.013
ays-0.013
ings-0.012
Tokenummy
Token ID13513
Feature activation+4.37e+00

Pos logit contributions
he+0.133
ane+0.122
uggle+0.108
overed+0.107
room+0.106
Neg logit contributions
 forgot-0.207
 bit-0.196
 knew-0.188
 forgotten-0.187
 sp-0.181
Token.
Token ID13
Feature activation+3.51e+00

Pos logit contributions
he+0.066
room+0.061
ane+0.058
ays+0.057
uggle+0.054
Neg logit contributions
 forgot-0.136
 knew-0.123
 bit-0.122
 wished-0.120
 remembered-0.120
Token Ben
Token ID3932
Feature activation-4.05e+00

Pos logit contributions
he+0.097
ing+0.087
room+0.086
ane+0.086
eddy+0.085
Neg logit contributions
 forgot-0.071
 knew-0.061
 remembered-0.057
 wished-0.057
 pap-0.055
Token says
Token ID1139
Feature activation+5.12e-01

Pos logit contributions
 forgot+0.128
 known+0.121
 forgotten+0.113
 knew+0.111
 remembered+0.111
Neg logit contributions
he-0.048
ays-0.047
ing-0.046
room-0.043
ore-0.038
Token,
Token ID11
Feature activation+1.43e+00

Pos logit contributions
he+0.010
ays+0.009
room+0.008
eddy+0.008
ing+0.008
Neg logit contributions
 forgot-0.015
 knew-0.014
 remembered-0.014
 wished-0.013
 bit-0.013
Token the
Token ID262
Feature activation-4.02e-01

Pos logit contributions
he+0.043
ing+0.043
ane+0.042
ays+0.041
ogging+0.041
Neg logit contributions
 forgot-0.030
 knew-0.027
 wished-0.025
 remembered-0.025
 bit-0.025
Token table
Token ID3084
Feature activation+4.83e+00

Pos logit contributions
 forgot+0.008
 knew+0.007
 remembered+0.007
 wished+0.006
 known+0.006
Neg logit contributions
he-0.011
room-0.010
ing-0.010
ays-0.010
eddy-0.010
Token.
Token ID13
Feature activation+3.84e+00

Pos logit contributions
he+0.062
ays+0.056
ing+0.056
eddy+0.054
uggle+0.053
Neg logit contributions
 forgot-0.159
 knew-0.145
 remembered-0.140
 wished-0.138
 bit-0.136
Token It
Token ID632
Feature activation-2.47e+00

Pos logit contributions
he+0.088
ing+0.081
ane+0.079
uggle+0.078
eddy+0.078
Neg logit contributions
 forgot-0.092
 knew-0.085
 remembered-0.078
 wished-0.075
 learned-0.073
Token looked
Token ID3114
Feature activation-6.62e-01

Pos logit contributions
 forgot+0.058
 known+0.051
 pap+0.050
 robbers+0.050
 wished+0.049
Neg logit contributions
he-0.053
room-0.046
ing-0.046
ays-0.044
eddy-0.043
Token y
Token ID331
Feature activation+8.32e+00

Pos logit contributions
 forgot+0.013
 knew+0.010
 remembered+0.010
 pap+0.010
 known+0.010
Neg logit contributions
he-0.018
ing-0.016
ays-0.016
room-0.016
ings-0.015
Tokenummy
Token ID13513
Feature activation+4.73e+00

Pos logit contributions
he+0.135
ane+0.129
ute+0.114
overed+0.111
uggle+0.109
Neg logit contributions
 forgot-0.197
 bit-0.189
 knew-0.183
 sp-0.182
 forgotten-0.174
Token.
Token ID13
Feature activation+3.75e+00

Pos logit contributions
he+0.078
room+0.073
ane+0.070
ays+0.070
uggle+0.066
Neg logit contributions
 forgot-0.139
 knew-0.127
 bit-0.126
 remembered-0.122
 wished-0.120
Token\n
Token ID198
Feature activation+4.63e+00

Pos logit contributions
he+0.083
ing+0.075
uggle+0.073
ane+0.073
eddy+0.073
Neg logit contributions
 forgot-0.094
 knew-0.086
 remembered-0.080
 wished-0.078
 learned-0.075
Token\n
Token ID198
Feature activation+3.37e+00

Pos logit contributions
ays+0.106
ane+0.102
ipping+0.100
ing+0.098
ement+0.096
Neg logit contributions
 forgot-0.101
 knew-0.098
 learned-0.085
 never-0.082
 also-0.081
Token"
Token ID1
Feature activation+6.30e-01

Pos logit contributions
he+0.063
ays+0.059
ing+0.057
ane+0.056
room+0.056
Neg logit contributions
 forgot-0.095
 knew-0.085
 remembered-0.082
 wished-0.080
 learned-0.076
Token\n
Token ID198
Feature activation+2.30e+00

Pos logit contributions
ays+0.051
ing+0.049
he+0.049
room+0.048
ane+0.048
Neg logit contributions
 forgot-0.077
 knew-0.073
 wished-0.068
 remembered-0.066
 learned-0.065
TokenL
Token ID43
Feature activation-2.19e-01

Pos logit contributions
he+0.043
ays+0.039
ing+0.039
room+0.039
ane+0.036
Neg logit contributions
 forgot-0.065
 knew-0.058
 remembered-0.056
 wished-0.055
 pap-0.053
Tokenily
Token ID813
Feature activation-4.81e+00

Pos logit contributions
 forgot+0.006
 knew+0.005
 remembered+0.005
 wished+0.005
 learned+0.004
Neg logit contributions
he-0.004
ays-0.004
ing-0.004
ings-0.004
room-0.004
Token and
Token ID290
Feature activation-1.87e+00

Pos logit contributions
 forgot+0.105
 known+0.099
 pap+0.094
 forgotten+0.092
 stolen+0.092
Neg logit contributions
he-0.095
ing-0.094
ays-0.093
ane-0.085
room-0.083
Token Ben
Token ID3932
Feature activation-6.32e+00

Pos logit contributions
 forgot+0.043
 knew+0.037
 remembered+0.036
 wished+0.035
 pap+0.033
Neg logit contributions
he-0.045
ing-0.042
ays-0.042
room-0.042
ane-0.040
Token also
Token ID635
Feature activation-8.36e+00

Pos logit contributions
 forgot+0.119
 pap+0.115
 known+0.106
 robbers+0.102
 shown+0.102
Neg logit contributions
he-0.159
room-0.127
 anymore-0.123
ing-0.121
ane-0.121
Token wake
Token ID7765
Feature activation+1.91e+00

Pos logit contributions
 Indians+0.134
 pap+0.134
 robbers+0.132
 crew+0.125
 miners+0.117
Neg logit contributions
he-0.185
.-0.163
 at-0.157
aw-0.156
 with-0.156
Token up
Token ID510
Feature activation+2.31e+00

Pos logit contributions
he+0.037
ane+0.036
unch+0.034
room+0.033
ite+0.032
Neg logit contributions
 forgot-0.050
 knew-0.046
 remembered-0.044
 wished-0.041
 learned-0.041
Token and
Token ID290
Feature activation-5.73e+00

Pos logit contributions
he+0.046
ays+0.045
room+0.044
ane+0.043
eddy+0.039
Neg logit contributions
 forgot-0.059
 knew-0.056
 remembered-0.053
 pap-0.051
 bit-0.051
Token look
Token ID804
Feature activation-2.96e-01

Pos logit contributions
 forgot+0.105
 knew+0.088
 known+0.088
 shown+0.083
 remembered+0.082
Neg logit contributions
he-0.150
ays-0.136
ing-0.135
uggle-0.133
ore-0.131
Token out
Token ID503
Feature activation+6.73e-01

Pos logit contributions
 forgot+0.009
 knew+0.009
 remembered+0.008
 wished+0.008
 bit+0.008
Neg logit contributions
he-0.005
room-0.004
ays-0.004
ane-0.004
ing-0.004
Token a
Token ID257
Feature activation-2.54e+00

Pos logit contributions
 forgot+0.119
 knew+0.106
 pap+0.099
 remembered+0.098
 stolen+0.096
Neg logit contributions
he-0.134
room-0.128
aps-0.120
ays-0.119
ings-0.119
Token thrill
Token ID28576
Feature activation+2.21e+00

Pos logit contributions
 forgot+0.054
 known+0.046
 wished+0.045
 knew+0.045
 remembered+0.044
Neg logit contributions
he-0.063
ing-0.057
room-0.055
ings-0.055
ay-0.054
Token of
Token ID286
Feature activation-2.86e+00

Pos logit contributions
he+0.030
room+0.026
ays+0.025
uggle+0.025
eddy+0.024
Neg logit contributions
 forgot-0.073
 knew-0.067
 remembered-0.066
 wished-0.064
 bit-0.062
Token joy
Token ID8716
Feature activation+1.83e+00

Pos logit contributions
 forgot+0.066
 remembered+0.054
 knew+0.053
 pap+0.052
 known+0.049
Neg logit contributions
he-0.067
room-0.061
ays-0.058
aps-0.058
ing-0.058
Token.
Token ID13
Feature activation+3.73e+00

Pos logit contributions
he+0.024
room+0.022
ays+0.022
uggle+0.020
ane+0.020
Neg logit contributions
 forgot-0.061
 knew-0.056
 remembered-0.054
 wished-0.054
 bit-0.052
Token They
Token ID1119
Feature activation-8.16e+00

Pos logit contributions
he+0.084
ing+0.070
ane+0.070
room+0.070
ays+0.069
Neg logit contributions
 forgot-0.096
 knew-0.089
 wished-0.084
 remembered-0.083
 pap-0.082
Token think
Token ID892
Feature activation+1.10e+00

Pos logit contributions
 pap+0.155
 forgot+0.152
 stolen+0.149
 forgotten+0.140
 robbers+0.138
Neg logit contributions
he-0.209
room-0.158
ane-0.154
hel-0.152
uggle-0.151
Token they
Token ID484
Feature activation-4.44e+00

Pos logit contributions
he+0.027
ays+0.024
room+0.024
eddy+0.023
ing+0.023
Neg logit contributions
 forgot-0.025
 knew-0.022
 remembered-0.021
 wished-0.021
 pap-0.020
Token are
Token ID389
Feature activation-2.19e+00

Pos logit contributions
 pap+0.109
 forgot+0.105
 known+0.096
 robbers+0.094
 remembered+0.093
Neg logit contributions
he-0.098
room-0.077
ane-0.070
ays-0.070
uggle-0.069
Token flying
Token ID7348
Feature activation+3.50e+00

Pos logit contributions
 forgot+0.047
 knew+0.040
 wished+0.038
 remembered+0.037
 bit+0.036
Neg logit contributions
he-0.047
ing-0.046
room-0.045
ays-0.044
aps-0.043
Token.
Token ID13
Feature activation+3.52e+00

Pos logit contributions
he+0.064
ays+0.059
eddy+0.057
room+0.056
ane+0.054
Neg logit contributions
 forgot-0.100
 knew-0.091
 remembered-0.088
 wished-0.086
 bit-0.085
Token played
Token ID2826
Feature activation+4.08e+00

Pos logit contributions
 fact+0.085
 mistake+0.079
 enemy+0.078
 pap+0.077
 willingness+0.076
Neg logit contributions
 with-0.091
ing-0.088
 on-0.088
 together-0.087
 down-0.086
Token and
Token ID290
Feature activation-4.16e+00

Pos logit contributions
ays+0.084
eddy+0.081
ane+0.081
ing+0.068
he+0.068
Neg logit contributions
 forgot-0.106
 knew-0.099
 learned-0.097
 grown-0.094
 never-0.093
Token had
Token ID550
Feature activation-7.17e-01

Pos logit contributions
 pap+0.066
 fact+0.058
 forgot+0.056
 least+0.055
 punished+0.054
Neg logit contributions
he-0.105
 with-0.103
ing-0.099
room-0.098
 on-0.096
Token fun
Token ID1257
Feature activation+3.84e+00

Pos logit contributions
 forgot+0.013
 knew+0.011
 remembered+0.010
 pap+0.010
 wished+0.010
Neg logit contributions
he-0.020
room-0.019
ing-0.018
ays-0.017
ings-0.017
Token.
Token ID13
Feature activation+2.79e+00

Pos logit contributions
eddy+0.074
ays+0.074
ane+0.072
he+0.068
ay+0.067
Neg logit contributions
 forgot-0.093
 knew-0.086
 wished-0.086
 bit-0.084
 always-0.084
Token She
Token ID1375
Feature activation-8.09e+00

Pos logit contributions
he+0.058
ing+0.054
room+0.054
eddy+0.052
ays+0.051
Neg logit contributions
 forgot-0.073
 knew-0.065
 wished-0.062
 remembered-0.061
 learned-0.058
Token saw
Token ID2497
Feature activation-2.24e+00

Pos logit contributions
 pap+0.118
 robbers+0.118
chie+0.116
 fact+0.107
 persever+0.104
Neg logit contributions
he-0.198
 with-0.186
 again-0.179
room-0.176
ing-0.163
Token the
Token ID262
Feature activation-3.11e+00

Pos logit contributions
 forgot+0.058
 remembered+0.048
 knew+0.046
 forgotten+0.045
 bit+0.045
Neg logit contributions
ing-0.045
he-0.045
room-0.043
ays-0.041
ane-0.038
Token same
Token ID976
Feature activation-1.86e+00

Pos logit contributions
 forgot+0.081
 remembered+0.071
 pap+0.069
 knew+0.067
 wished+0.066
Neg logit contributions
he-0.063
ing-0.054
ane-0.054
room-0.053
 anymore-0.050
Token bird
Token ID6512
Feature activation+7.48e-01

Pos logit contributions
 forgot+0.041
 knew+0.034
 remembered+0.034
 wished+0.033
 pap+0.032
Neg logit contributions
he-0.045
ing-0.043
room-0.043
ings-0.041
ays-0.041
Token from
Token ID422
Feature activation-1.70e+00

Pos logit contributions
he+0.013
room+0.012
ays+0.011
ing+0.011
ane+0.011
Neg logit contributions
 forgot-0.022
 knew-0.020
 remembered-0.020
 wished-0.019
 bit-0.018
Token feeling
Token ID4203
Feature activation-3.07e+00

Pos logit contributions
 forgot+0.036
 knew+0.032
 wished+0.031
 remembered+0.030
 known+0.030
Neg logit contributions
he-0.029
ing-0.025
room-0.025
ays-0.025
 with-0.024
Token proud
Token ID6613
Feature activation+1.95e+00

Pos logit contributions
 forgot+0.054
 knew+0.043
 remembered+0.042
 pap+0.039
 wished+0.039
Neg logit contributions
he-0.089
ing-0.082
ays-0.080
room-0.079
ore-0.078
Token of
Token ID286
Feature activation+7.85e-01

Pos logit contributions
he+0.031
room+0.027
ays+0.027
ings+0.025
ane+0.025
Neg logit contributions
 forgot-0.061
 knew-0.057
 wished-0.055
 remembered-0.055
 pap-0.052
Token themselves
Token ID2405
Feature activation+2.17e+00

Pos logit contributions
he+0.017
room+0.015
ing+0.015
ays+0.015
ings+0.014
Neg logit contributions
 forgot-0.021
 knew-0.018
 remembered-0.018
 wished-0.017
 pap-0.016
Token.
Token ID13
Feature activation+2.17e+00

Pos logit contributions
he+0.032
ays+0.029
room+0.029
ane+0.026
eddy+0.026
Neg logit contributions
 forgot-0.069
 knew-0.063
 remembered-0.061
 wished-0.060
 pap-0.058
Token They
Token ID1119
Feature activation-8.05e+00

Pos logit contributions
he+0.042
ing+0.037
room+0.037
ays+0.037
ane+0.035
Neg logit contributions
 forgot-0.060
 knew-0.054
 remembered-0.053
 wished-0.052
 pap-0.049
Token had
Token ID550
Feature activation-3.03e+00

Pos logit contributions
 pap+0.112
 robbers+0.110
 chemicals+0.099
 error+0.098
 younger+0.097
Neg logit contributions
he-0.240
hel-0.188
room-0.176
ane-0.169
.-0.169
Token learned
Token ID4499
Feature activation+3.22e-01

Pos logit contributions
 pap+0.039
 robbers+0.038
 forgot+0.037
 fact+0.035
 bit+0.034
Neg logit contributions
he-0.073
room-0.072
time-0.067
aw-0.067
 with-0.064
Token something
Token ID1223
Feature activation+1.25e-01

Pos logit contributions
he+0.007
ing+0.007
ays+0.007
room+0.007
ane+0.006
Neg logit contributions
 forgot-0.008
 knew-0.007
 remembered-0.007
 wished-0.007
 pap-0.006
Token new
Token ID649
Feature activation+2.54e+00

Pos logit contributions
he+0.003
ing+0.003
ays+0.003
room+0.003
ane+0.003
Neg logit contributions
 forgot-0.002
 knew-0.002
 remembered-0.002
 wished-0.002
 pap-0.002
Token about
Token ID546
Feature activation-1.72e+00

Pos logit contributions
he+0.048
ays+0.044
room+0.042
ing+0.040
eddy+0.039
Neg logit contributions
 forgot-0.071
 knew-0.068
 remembered-0.064
 wished-0.063
 learned-0.060
Token along
Token ID1863
Feature activation+1.00e+00

Pos logit contributions
he+0.046
ays+0.046
room+0.043
ane+0.041
ing+0.039
Neg logit contributions
 forgot-0.090
 knew-0.083
 remembered-0.080
 wished-0.079
 learned-0.076
Token.
Token ID13
Feature activation+1.87e+00

Pos logit contributions
he+0.019
ing+0.017
room+0.017
ays+0.016
ings+0.015
Neg logit contributions
 forgot-0.029
 knew-0.025
 remembered-0.025
 wished-0.024
 pap-0.023
Token The
Token ID383
Feature activation-4.00e+00

Pos logit contributions
he+0.036
ing+0.034
ays+0.032
room+0.032
ings+0.031
Neg logit contributions
 forgot-0.050
 knew-0.046
 wished-0.044
 remembered-0.044
 pap-0.042
Token nice
Token ID3621
Feature activation-3.29e+00

Pos logit contributions
 forgot+0.087
 remembered+0.070
 knew+0.068
 known+0.066
 forgotten+0.064
Neg logit contributions
he-0.101
ing-0.092
room-0.089
eddy-0.088
ane-0.087
Token people
Token ID661
Feature activation-5.50e+00

Pos logit contributions
 forgot+0.070
 remembered+0.058
 knew+0.056
 wished+0.055
 known+0.054
Neg logit contributions
he-0.084
ing-0.080
room-0.077
eddy-0.076
ane-0.075
Token even
Token ID772
Feature activation-8.04e+00

Pos logit contributions
 forgot+0.112
 pap+0.099
 remembered+0.094
 knew+0.092
 wished+0.090
Neg logit contributions
he-0.145
ing-0.129
ane-0.123
room-0.123
ays-0.120
Token shared
Token ID4888
Feature activation+8.59e-01

Pos logit contributions
 fact+0.106
 pap+0.104
 forgot+0.098
 forgotten+0.097
 stolen+0.093
Neg logit contributions
ing-0.217
he-0.216
 with-0.195
room-0.194
 again-0.189
Token their
Token ID511
Feature activation-3.51e+00

Pos logit contributions
he+0.019
room+0.018
ing+0.018
ays+0.018
ings+0.017
Neg logit contributions
 forgot-0.021
 knew-0.018
 remembered-0.018
 wished-0.017
 pap-0.017
Token candy
Token ID18550
Feature activation+4.73e+00

Pos logit contributions
 forgot+0.077
 remembered+0.064
 knew+0.063
 known+0.060
 wished+0.059
Neg logit contributions
he-0.084
room-0.079
ing-0.077
ings-0.075
uggle-0.075
Token with
Token ID351
Feature activation-1.79e+00

Pos logit contributions
he+0.056
ays+0.054
room+0.050
ing+0.049
ane+0.047
Neg logit contributions
 forgot-0.161
 knew-0.146
 remembered-0.141
 wished-0.139
 bit-0.138
Token the
Token ID262
Feature activation-3.80e+00

Pos logit contributions
 forgot+0.043
 remembered+0.036
 knew+0.035
 wished+0.035
 knocked+0.034
Neg logit contributions
he-0.039
room-0.037
ing-0.036
ings-0.034
ays-0.033
Token,
Token ID11
Feature activation-1.45e+00

Pos logit contributions
he+0.035
ays+0.031
room+0.030
ing+0.029
ane+0.028
Neg logit contributions
 forgot-0.068
 knew-0.063
 remembered-0.061
 wished-0.060
 pap-0.058
Token happy
Token ID3772
Feature activation+8.03e-01

Pos logit contributions
 forgot+0.046
 knew+0.041
 remembered+0.041
 wished+0.040
 pap+0.039
Neg logit contributions
he-0.021
ing-0.019
room-0.018
ays-0.018
ane-0.016
Token and
Token ID290
Feature activation-4.58e+00

Pos logit contributions
he+0.014
ays+0.013
room+0.012
ane+0.012
ing+0.012
Neg logit contributions
 forgot-0.024
 knew-0.022
 wished-0.021
 remembered-0.021
 bit-0.020
Token sad
Token ID6507
Feature activation+2.88e+00

Pos logit contributions
 forgot+0.083
 pap+0.074
 remembered+0.073
 knew+0.072
 wished+0.068
Neg logit contributions
he-0.121
ing-0.110
uggle-0.102
room-0.102
ore-0.100
Token.
Token ID13
Feature activation+3.09e+00

Pos logit contributions
he+0.047
ays+0.043
room+0.040
ane+0.040
uggle+0.037
Neg logit contributions
 forgot-0.088
 knew-0.082
 wished-0.079
 remembered-0.077
 bit-0.076
Token She
Token ID1375
Feature activation-8.03e+00

Pos logit contributions
he+0.062
ing+0.057
ays+0.056
room+0.056
ings+0.054
Neg logit contributions
 forgot-0.082
 knew-0.076
 wished-0.072
 remembered-0.071
 pap-0.068
Token whist
Token ID32262
Feature activation+4.29e-01

Pos logit contributions
 robbers+0.130
 pap+0.127
 payoff+0.118
 permission+0.118
 setup+0.117
Neg logit contributions
he-0.182
 with-0.169
.-0.156
room-0.154
 again-0.151
Tokenled
Token ID992
Feature activation+1.53e+00

Pos logit contributions
he+0.007
room+0.006
ays+0.006
ing+0.006
ane+0.006
Neg logit contributions
 forgot-0.013
 knew-0.012
 remembered-0.012
 wished-0.011
 learned-0.011
Token all
Token ID477
Feature activation-9.55e-01

Pos logit contributions
he+0.033
ays+0.031
room+0.030
ane+0.030
ing+0.029
Neg logit contributions
 forgot-0.039
 knew-0.035
 remembered-0.034
 wished-0.033
 bit-0.031
Token the
Token ID262
Feature activation-1.59e+00

Pos logit contributions
 forgot+0.021
 knew+0.018
 remembered+0.018
 wished+0.017
 pap+0.016
Neg logit contributions
he-0.024
ing-0.022
room-0.022
ays-0.022
ane-0.021
Token time
Token ID640
Feature activation+1.61e+00

Pos logit contributions
 forgot+0.043
 remembered+0.040
 knew+0.038
 wished+0.037
 known+0.036
Neg logit contributions
he-0.030
ing-0.026
ay-0.025
room-0.024
eddy-0.023
Token dream
Token ID4320
Feature activation+5.29e+00

Pos logit contributions
 forgot+0.035
 knew+0.029
 remembered+0.029
 wished+0.028
 pap+0.026
Neg logit contributions
he-0.050
room-0.047
ing-0.046
ays-0.046
ane-0.044
Token.
Token ID13
Feature activation+2.91e+00

Pos logit contributions
ays+0.096
eddy+0.082
he+0.080
room+0.078
ane+0.075
Neg logit contributions
 forgot-0.136
 knew-0.131
 always-0.125
 grown-0.123
 way-0.117
Token Every
Token ID3887
Feature activation-1.52e+00

Pos logit contributions
ing+0.073
he+0.069
ooming+0.068
ays+0.067
ane+0.066
Neg logit contributions
 forgot-0.057
 knew-0.054
 wished-0.049
 grown-0.049
 always-0.048
Token night
Token ID1755
Feature activation+3.05e+00

Pos logit contributions
 forgot+0.033
 knew+0.028
 remembered+0.028
 wished+0.027
 pap+0.025
Neg logit contributions
he-0.039
ing-0.036
room-0.035
ays-0.035
ane-0.034
Token,
Token ID11
Feature activation+5.44e-01

Pos logit contributions
he+0.064
ays+0.060
room+0.060
ane+0.055
eddy+0.054
Neg logit contributions
 forgot-0.081
 knew-0.072
 remembered-0.068
 pap-0.066
 wished-0.065
Token she
Token ID673
Feature activation-8.01e+00

Pos logit contributions
he+0.014
room+0.013
ays+0.013
ing+0.013
ane+0.012
Neg logit contributions
 forgot-0.012
 knew-0.010
 remembered-0.010
 wished-0.010
 pap-0.009
Token would
Token ID561
Feature activation-3.73e+00

Pos logit contributions
 forgot+0.125
 fact+0.111
 stolen+0.111
 robbers+0.107
 comm+0.107
Neg logit contributions
he-0.202
ane-0.188
ing-0.182
room-0.179
ore-0.179
Token close
Token ID1969
Feature activation+1.56e+00

Pos logit contributions
 forgot+0.074
 remembered+0.059
 knew+0.059
 wished+0.058
 known+0.056
Neg logit contributions
he-0.099
room-0.090
uggle-0.086
ore-0.084
ane-0.083
Token her
Token ID607
Feature activation-5.01e-01

Pos logit contributions
he+0.043
room+0.041
ing+0.039
ays+0.038
ane+0.038
Neg logit contributions
 forgot-0.028
 knew-0.025
 remembered-0.024
 wished-0.023
 pap-0.023
Token eyes
Token ID2951
Feature activation+4.36e+00

Pos logit contributions
 forgot+0.010
 knew+0.009
 remembered+0.009
 wished+0.008
 pap+0.008
Neg logit contributions
he-0.013
ing-0.012
ays-0.012
room-0.012
ane-0.012
Token and
Token ID290
Feature activation-7.34e+00

Pos logit contributions
ays+0.096
room+0.088
he+0.086
eddy+0.085
ane+0.084
Neg logit contributions
 forgot-0.096
 knew-0.093
 bit-0.091
 remembered-0.084
 wished-0.082
Token discussing
Token ID11142
Feature activation+7.18e-01

Pos logit contributions
ays+0.091
he+0.091
eddy+0.088
room+0.082
ucks+0.080
Neg logit contributions
 forgot-0.082
 knew-0.077
 remembered-0.072
 had-0.068
 knocked-0.067
Token their
Token ID511
Feature activation-3.95e+00

Pos logit contributions
he+0.015
ing+0.013
room+0.013
ays+0.013
ings+0.013
Neg logit contributions
 forgot-0.019
 knew-0.017
 remembered-0.016
 wished-0.016
 pap-0.015
Token images
Token ID4263
Feature activation+2.38e+00

Pos logit contributions
 forgot+0.083
 remembered+0.070
 knew+0.067
 wished+0.065
 known+0.064
Neg logit contributions
he-0.095
ings-0.092
room-0.091
ing-0.090
ay-0.086
Token.
Token ID13
Feature activation+2.52e+00

Pos logit contributions
he+0.032
ays+0.027
ing+0.027
room+0.027
ane+0.024
Neg logit contributions
 forgot-0.079
 knew-0.072
 remembered-0.071
 wished-0.070
 pap-0.067
Token They
Token ID1119
Feature activation-7.11e+00

Pos logit contributions
he+0.051
ing+0.047
ays+0.045
room+0.044
ings+0.043
Neg logit contributions
 forgot-0.066
 knew-0.060
 remembered-0.057
 wished-0.057
 pap-0.055
Token even
Token ID772
Feature activation-8.00e+00

Pos logit contributions
 pap+0.117
 forgot+0.113
 known+0.099
 stolen+0.096
 robbers+0.096
Neg logit contributions
he-0.195
room-0.166
 with-0.147
ane-0.146
ing-0.143
Token found
Token ID1043
Feature activation-1.27e+00

Pos logit contributions
 pap+0.115
 fact+0.103
 mir+0.099
 forgot+0.095
 tresp+0.095
Neg logit contributions
he-0.194
 with-0.188
ing-0.181
room-0.181
 again-0.176
Token a
Token ID257
Feature activation-4.14e+00

Pos logit contributions
 forgot+0.035
 remembered+0.029
 knew+0.029
 wished+0.028
 pap+0.027
Neg logit contributions
he-0.025
ing-0.025
room-0.024
ays-0.024
ings-0.022
Token pretty
Token ID2495
Feature activation-2.42e+00

Pos logit contributions
 forgot+0.093
 wished+0.082
 remembered+0.078
 struggled+0.075
 missed+0.074
Neg logit contributions
he-0.092
ay-0.086
room-0.086
ings-0.084
ing-0.084
Token butterfly
Token ID35113
Feature activation+2.29e+00

Pos logit contributions
 forgot+0.056
 wished+0.047
 remembered+0.046
 knew+0.045
 known+0.045
Neg logit contributions
he-0.058
room-0.054
ing-0.054
ay-0.052
ings-0.052
Token and
Token ID290
Feature activation-5.33e+00

Pos logit contributions
he+0.037
ays+0.033
room+0.033
ing+0.033
ings+0.030
Neg logit contributions
 forgot-0.069
 knew-0.062
 remembered-0.061
 wished-0.060
 bit-0.058
Tokenm
Token ID76
Feature activation-9.82e-01

Pos logit contributions
he+0.063
uggle+0.062
room+0.062
ore+0.061
ays+0.061
Neg logit contributions
 forgot-0.039
 knew-0.033
 remembered-0.030
 known-0.027
 learned-0.026
Tokenies
Token ID444
Feature activation-3.60e+00

Pos logit contributions
 forgot+0.023
 knew+0.020
 remembered+0.019
 wished+0.019
 pap+0.018
Neg logit contributions
he-0.024
room-0.022
ays-0.021
ing-0.021
ane-0.021
Token hurt
Token ID5938
Feature activation+2.07e+00

Pos logit contributions
 forgot+0.068
 pap+0.057
 knew+0.056
 remembered+0.056
 wished+0.055
Neg logit contributions
he-0.098
ing-0.088
room-0.087
ays-0.085
ane-0.083
Token.
Token ID13
Feature activation+3.20e+00

Pos logit contributions
he+0.030
ays+0.028
ane+0.025
room+0.024
uggle+0.024
Neg logit contributions
 forgot-0.066
 knew-0.062
 remembered-0.060
 wished-0.058
 bit-0.057
Token They
Token ID1119
Feature activation-6.55e+00

Pos logit contributions
he+0.059
ing+0.053
ays+0.053
room+0.052
ane+0.049
Neg logit contributions
 forgot-0.091
 knew-0.082
 remembered-0.079
 wished-0.079
 pap-0.075
Token even
Token ID772
Feature activation-7.99e+00

Pos logit contributions
 pap+0.126
 fact+0.118
 majesty+0.118
 robbers+0.111
 vitamins+0.110
Neg logit contributions
he-0.171
room-0.126
 anymore-0.121
 again-0.120
ing-0.119
Token shared
Token ID4888
Feature activation+8.20e-01

Pos logit contributions
 fact+0.131
 majesty+0.120
 pap+0.120
sed+0.110
 Majesty+0.110
Neg logit contributions
he-0.193
ing-0.182
 again-0.176
 with-0.171
room-0.170
Token an
Token ID281
Feature activation-2.26e+00

Pos logit contributions
he+0.019
ing+0.018
ays+0.018
room+0.018
ane+0.017
Neg logit contributions
 forgot-0.019
 knew-0.017
 remembered-0.016
 wished-0.016
 pap-0.015
Token ice
Token ID4771
Feature activation+3.68e+00

Pos logit contributions
 forgot+0.042
 knew+0.034
 remembered+0.033
 wished+0.032
 pap+0.030
Neg logit contributions
he-0.065
room-0.060
ing-0.060
ays-0.060
eddy-0.058
Token cream
Token ID8566
Feature activation+4.16e+00

Pos logit contributions
ays+0.077
he+0.076
room+0.072
ane+0.070
ing+0.069
Neg logit contributions
 forgot-0.089
 knew-0.079
 bit-0.075
 wished-0.072
 remembered-0.071
Token cone
Token ID27763
Feature activation+5.78e+00

Pos logit contributions
he+0.069
ays+0.067
room+0.063
ing+0.062
ane+0.060
Neg logit contributions
 forgot-0.120
 knew-0.109
 bit-0.104
 remembered-0.103
 wished-0.102
Token dizz
Token ID29527
Feature activation+4.28e+00

Pos logit contributions
 forgot+0.028
 knew+0.024
 wished+0.023
 remembered+0.023
 pap+0.022
Neg logit contributions
he-0.038
ing-0.035
ays-0.034
room-0.034
ings-0.033
Tokeny
Token ID88
Feature activation+1.99e+00

Pos logit contributions
uggle+0.092
ement+0.082
he+0.081
room+0.081
overed+0.079
Neg logit contributions
 forgot-0.098
 knew-0.084
 remembered-0.082
 wasted-0.077
 forgotten-0.075
Token and
Token ID290
Feature activation-5.65e+00

Pos logit contributions
he+0.030
ays+0.026
room+0.026
ane+0.025
uggle+0.024
Neg logit contributions
 forgot-0.063
 knew-0.056
 remembered-0.055
 wished-0.054
 bit-0.054
Token happy
Token ID3772
Feature activation+1.47e+00

Pos logit contributions
 forgot+0.087
 robbers+0.081
 pap+0.081
 known+0.079
 shown+0.079
Neg logit contributions
he-0.136
ing-0.133
 with-0.126
 anymore-0.125
room-0.122
Token.
Token ID13
Feature activation+2.72e+00

Pos logit contributions
he+0.022
ays+0.019
room+0.018
ane+0.018
uggle+0.017
Neg logit contributions
 forgot-0.047
 knew-0.043
 wished-0.042
 remembered-0.042
 bit-0.040
Token They
Token ID1119
Feature activation-7.99e+00

Pos logit contributions
he+0.059
ing+0.053
room+0.052
ays+0.051
ane+0.050
Neg logit contributions
 forgot-0.070
 knew-0.062
 wished-0.061
 remembered-0.061
 pap-0.058
Token hugged
Token ID41130
Feature activation+1.89e+00

Pos logit contributions
 pap+0.121
 robbers+0.110
 fact+0.106
 vitamins+0.098
 known+0.096
Neg logit contributions
he-0.237
room-0.190
hel-0.176
 with-0.175
ing-0.172
Token and
Token ID290
Feature activation-5.45e+00

Pos logit contributions
he+0.051
ays+0.049
ane+0.048
room+0.047
ing+0.046
Neg logit contributions
 forgot-0.036
 knew-0.034
 wished-0.031
 remembered-0.031
 pap-0.030
Token smiled
Token ID13541
Feature activation+2.70e+00

Pos logit contributions
 fact+0.087
 forgot+0.083
 known+0.083
 stolen+0.080
 pap+0.079
Neg logit contributions
he-0.128
ing-0.114
 with-0.114
 again-0.109
room-0.109
Token.
Token ID13
Feature activation+3.02e+00

Pos logit contributions
he+0.061
ays+0.055
room+0.055
ore+0.054
ane+0.053
Neg logit contributions
 forgot-0.066
 knew-0.065
 remembered-0.064
 wished-0.059
 pap-0.058
Token They
Token ID1119
Feature activation-7.23e+00

Pos logit contributions
he+0.072
ing+0.062
room+0.062
ane+0.060
ays+0.059
Neg logit contributions
 forgot-0.073
 knew-0.067
 wished-0.066
 remembered-0.065
 pap-0.064
Token felt
Token ID2936
Feature activation-3.29e+00

Pos logit contributions
 pap+0.113
 forgot+0.097
 fact+0.095
 vitamins+0.090
 majesty+0.084
Neg logit contributions
he-0.181
room-0.158
hel-0.142
ing-0.139
aw-0.137
Token proud
Token ID6613
Feature activation+1.72e+00

Pos logit contributions
 forgot+0.067
 remembered+0.054
 knew+0.054
 pap+0.051
 knocked+0.049
Neg logit contributions
he-0.082
ing-0.073
room-0.071
 anymore-0.071
ays-0.070
Token and
Token ID290
Feature activation-4.71e+00

Pos logit contributions
he+0.027
room+0.023
ays+0.023
ing+0.022
ane+0.021
Neg logit contributions
 forgot-0.054
 knew-0.050
 remembered-0.048
 wished-0.048
 pap-0.046
Token happy
Token ID3772
Feature activation+1.16e+00

Pos logit contributions
 pap+0.085
 forgot+0.079
 fact+0.076
 robbers+0.074
 er+0.068
Neg logit contributions
he-0.106
 with-0.099
ing-0.095
apping-0.092
room-0.089
Token.
Token ID13
Feature activation+2.98e+00

Pos logit contributions
he+0.015
ays+0.013
room+0.013
ing+0.012
ane+0.011
Neg logit contributions
 forgot-0.040
 knew-0.037
 remembered-0.036
 wished-0.036
 pap-0.034
Token They
Token ID1119
Feature activation-7.98e+00

Pos logit contributions
he+0.070
ing+0.064
room+0.061
ays+0.060
ane+0.060
Neg logit contributions
 forgot-0.072
 knew-0.064
 wished-0.063
 remembered-0.061
 pap-0.058
Token had
Token ID550
Feature activation-2.92e+00

Pos logit contributions
 pap+0.139
 fact+0.108
 vitamins+0.105
 robbers+0.105
 persever+0.105
Neg logit contributions
he-0.218
room-0.170
hel-0.169
 with-0.159
aw-0.157
Token discovered
Token ID5071
Feature activation-1.27e+00

Pos logit contributions
 pap+0.043
 parent+0.033
 fact+0.033
 collective+0.033
 nurses+0.033
Neg logit contributions
room-0.071
he-0.067
aw-0.065
.-0.065
 with-0.064
Token a
Token ID257
Feature activation-4.07e+00

Pos logit contributions
 forgot+0.033
 remembered+0.028
 knew+0.027
 wished+0.026
 pap+0.025
Neg logit contributions
he-0.027
ing-0.026
ays-0.025
room-0.025
ane-0.023
Token new
Token ID649
Feature activation-2.22e+00

Pos logit contributions
 forgot+0.090
 struggled+0.077
 wished+0.077
 remembered+0.076
 wasted+0.072
Neg logit contributions
he-0.084
room-0.081
time-0.080
ing-0.079
ay-0.078
Token place
Token ID1295
Feature activation+3.08e+00

Pos logit contributions
 forgot+0.050
 remembered+0.040
 knew+0.040
 wished+0.039
 pap+0.037
Neg logit contributions
he-0.054
room-0.052
ing-0.050
ings-0.050
ane-0.049
Token was
Token ID373
Feature activation-2.25e+00

Pos logit contributions
 forgot+0.094
 pap+0.093
 robbers+0.089
 fact+0.083
 wished+0.081
Neg logit contributions
he-0.175
room-0.159
ing-0.156
ays-0.152
ane-0.145
Token all
Token ID477
Feature activation-2.79e+00

Pos logit contributions
 forgot+0.042
 knew+0.038
 wished+0.038
 pap+0.036
 remembered+0.035
Neg logit contributions
he-0.058
ing-0.053
ays-0.051
apping-0.051
room-0.051
Token wet
Token ID9583
Feature activation+4.40e+00

Pos logit contributions
 forgot+0.057
 pap+0.053
 knew+0.050
 remembered+0.049
 wished+0.048
Neg logit contributions
he-0.070
ing-0.063
room-0.058
ays-0.058
 anymore-0.058
Token.
Token ID13
Feature activation+2.53e+00

Pos logit contributions
he+0.078
ane+0.076
uggle+0.073
room+0.072
ays+0.071
Neg logit contributions
 forgot-0.120
 knew-0.108
 wished-0.108
 remembered-0.107
 bit-0.103
Token She
Token ID1375
Feature activation-7.42e+00

Pos logit contributions
he+0.046
ing+0.042
ays+0.041
room+0.041
ane+0.039
Neg logit contributions
 forgot-0.072
 knew-0.066
 remembered-0.063
 wished-0.063
 pap-0.060
Token even
Token ID772
Feature activation-7.98e+00

Pos logit contributions
 robbers+0.134
 pap+0.124
 known+0.117
 setup+0.113
 prisoners+0.110
Neg logit contributions
he-0.180
 with-0.156
room-0.153
ing-0.149
 again-0.148
Token made
Token ID925
Feature activation-4.04e-01

Pos logit contributions
 fact+0.121
 pap+0.119
 robbers+0.108
 majesty+0.107
 tresp+0.105
Neg logit contributions
ing-0.194
he-0.193
 with-0.176
 again-0.175
room-0.171
Token a
Token ID257
Feature activation-4.21e+00

Pos logit contributions
 forgot+0.011
 remembered+0.009
 knew+0.009
 wished+0.009
 bit+0.009
Neg logit contributions
he-0.008
ing-0.008
room-0.008
ays-0.007
ings-0.007
Token new
Token ID649
Feature activation-2.62e+00

Pos logit contributions
 forgot+0.091
 known+0.083
 remembered+0.082
 wished+0.080
 struggled+0.078
Neg logit contributions
he-0.093
ings-0.087
ing-0.086
ute-0.084
eddy-0.084
Token friend
Token ID1545
Feature activation+7.41e-01

Pos logit contributions
 forgot+0.056
 remembered+0.046
 knew+0.046
 wished+0.045
 known+0.043
Neg logit contributions
he-0.064
ing-0.062
room-0.062
ings-0.061
eddy-0.060
Token,
Token ID11
Feature activation-3.21e+00

Pos logit contributions
he+0.011
ays+0.010
room+0.010
ing+0.009
ane+0.009
Neg logit contributions
 forgot-0.024
 knew-0.022
 remembered-0.021
 wished-0.021
 bit-0.020
Token,
Token ID11
Feature activation-1.20e+00

Pos logit contributions
he+0.038
ays+0.033
room+0.032
ane+0.031
ing+0.029
Neg logit contributions
 forgot-0.060
 knew-0.055
 remembered-0.053
 wished-0.053
 pap-0.050
Token and
Token ID290
Feature activation-3.67e+00

Pos logit contributions
 forgot+0.039
 knew+0.035
 remembered+0.035
 wished+0.034
 pap+0.033
Neg logit contributions
he-0.016
ing-0.014
room-0.014
ays-0.013
eddy-0.012
Token very
Token ID845
Feature activation-1.97e+00

Pos logit contributions
 forgot+0.073
 fact+0.069
 robbers+0.066
 stolen+0.064
 known+0.063
Neg logit contributions
ing-0.075
he-0.074
 with-0.072
apping-0.068
aps-0.066
Token amazed
Token ID24562
Feature activation+3.54e+00

Pos logit contributions
 forgot+0.034
 remembered+0.029
 fact+0.028
 robbers+0.028
 knew+0.027
Neg logit contributions
he-0.054
ing-0.051
ore-0.048
apping-0.048
aps-0.047
Token.
Token ID13
Feature activation+2.70e+00

Pos logit contributions
he+0.072
room+0.066
uggle+0.065
ays+0.064
ane+0.063
Neg logit contributions
 forgot-0.094
 knew-0.088
 remembered-0.081
 bit-0.080
 wished-0.079
Token They
Token ID1119
Feature activation-7.95e+00

Pos logit contributions
he+0.059
ing+0.055
room+0.051
ays+0.051
ane+0.051
Neg logit contributions
 forgot-0.068
 knew-0.061
 wished-0.060
 remembered-0.058
 learned-0.056
Token hugged
Token ID41130
Feature activation+1.80e+00

Pos logit contributions
 pap+0.125
 robbers+0.112
 wasteful+0.106
 comm+0.105
 unfair+0.105
Neg logit contributions
he-0.229
room-0.175
hel-0.173
 with-0.166
ane-0.166
Token their
Token ID511
Feature activation-2.03e+00

Pos logit contributions
he+0.050
ays+0.048
ane+0.046
room+0.045
ing+0.045
Neg logit contributions
 forgot-0.033
 knew-0.031
 wished-0.029
 remembered-0.029
 learned-0.027
Token dad
Token ID9955
Feature activation+3.24e+00

Pos logit contributions
 forgot+0.045
 known+0.039
 remembered+0.036
 knew+0.035
 forgotten+0.035
Neg logit contributions
he-0.046
eddy-0.042
ing-0.042
ane-0.042
ings-0.041
Token,
Token ID11
Feature activation-2.39e+00

Pos logit contributions
he+0.054
ays+0.049
room+0.049
ing+0.046
ane+0.045
Neg logit contributions
 forgot-0.097
 knew-0.088
 remembered-0.085
 wished-0.083
 pap-0.081
Token and
Token ID290
Feature activation-5.30e+00

Pos logit contributions
 forgot+0.073
 knew+0.063
 stolen+0.063
 remembered+0.062
 forgotten+0.062
Neg logit contributions
he-0.032
 with-0.025
room-0.025
ays-0.024
ing-0.023
Token sw
Token ID1509
Feature activation+1.71e+00

Pos logit contributions
 pap+0.075
 fact+0.073
 known+0.073
 forgot+0.072
 cle+0.067
Neg logit contributions
he-0.120
ing-0.120
 with-0.119
ays-0.112
 again-0.111
Tokenam
Token ID321
Feature activation+3.05e+00

Pos logit contributions
he+0.027
ing+0.025
ays+0.024
room+0.024
ings+0.022
Neg logit contributions
 forgot-0.053
 knew-0.047
 remembered-0.046
 wished-0.046
 pap-0.044
Token around
Token ID1088
Feature activation+1.65e+00

Pos logit contributions
he+0.055
ays+0.050
room+0.050
ing+0.049
ane+0.047
Neg logit contributions
 forgot-0.088
 knew-0.079
 remembered-0.077
 wished-0.075
 pap-0.071
Token.
Token ID13
Feature activation+3.07e+00

Pos logit contributions
he+0.025
ays+0.023
ing+0.022
room+0.022
ane+0.021
Neg logit contributions
 forgot-0.052
 knew-0.047
 remembered-0.046
 wished-0.046
 bit-0.044
Token They
Token ID1119
Feature activation-6.74e+00

Pos logit contributions
he+0.062
ing+0.057
ays+0.055
room+0.055
ings+0.053
Neg logit contributions
 forgot-0.081
 knew-0.072
 remembered-0.070
 wished-0.069
 pap-0.066
Token even
Token ID772
Feature activation-7.94e+00

Pos logit contributions
 pap+0.118
 fact+0.099
 error+0.098
 majesty+0.098
 payoff+0.097
Neg logit contributions
he-0.176
room-0.143
 with-0.140
 on-0.136
ing-0.135
Token played
Token ID2826
Feature activation+2.59e+00

Pos logit contributions
 pap+0.121
 fact+0.120
 majesty+0.108
 Majesty+0.098
 barely+0.097
Neg logit contributions
ing-0.191
he-0.184
room-0.176
 with-0.176
aw-0.171
Token a
Token ID257
Feature activation-4.49e-01

Pos logit contributions
ays+0.046
he+0.045
ane+0.044
eddy+0.043
ing+0.040
Neg logit contributions
 forgot-0.074
 knew-0.068
 remembered-0.066
 learned-0.064
 wished-0.064
Token game
Token ID983
Feature activation+3.54e+00

Pos logit contributions
 forgot+0.009
 remembered+0.008
 wished+0.008
 knew+0.008
 pap+0.008
Neg logit contributions
he-0.012
room-0.011
ing-0.011
uggle-0.010
ings-0.010
Token of
Token ID286
Feature activation-2.01e+00

Pos logit contributions
he+0.054
ays+0.048
ing+0.046
room+0.045
eddy+0.045
Neg logit contributions
 forgot-0.111
 knew-0.100
 remembered-0.097
 wished-0.096
 bit-0.092
Token tag
Token ID7621
Feature activation+3.13e+00

Pos logit contributions
 forgot+0.044
 knew+0.038
 remembered+0.038
 pap+0.037
 wished+0.036
Neg logit contributions
he-0.048
room-0.046
ing-0.044
ays-0.041
ings-0.041
Token,
Token ID11
Feature activation-2.26e+00

Pos logit contributions
ays+0.005
he+0.005
ing+0.005
room+0.004
eddy+0.004
Neg logit contributions
 forgot-0.010
 knew-0.009
 wished-0.009
 remembered-0.009
 bit-0.009
Token happy
Token ID3772
Feature activation+1.11e+00

Pos logit contributions
 forgot+0.044
 knew+0.039
 remembered+0.037
 pap+0.036
 known+0.036
Neg logit contributions
he-0.056
 with-0.048
ing-0.047
ays-0.047
room-0.047
Token and
Token ID290
Feature activation-3.93e+00

Pos logit contributions
he+0.020
ays+0.019
room+0.018
ane+0.017
ing+0.017
Neg logit contributions
 forgot-0.032
 knew-0.030
 wished-0.029
 remembered-0.028
 bit-0.027
Token warm
Token ID5814
Feature activation+2.33e+00

Pos logit contributions
 forgot+0.069
 knew+0.060
 remembered+0.058
 pap+0.057
 fact+0.055
Neg logit contributions
he-0.108
uggle-0.096
ing-0.095
room-0.092
ore-0.090
Token.
Token ID13
Feature activation+2.19e+00

Pos logit contributions
he+0.039
ays+0.039
room+0.038
ane+0.038
ay+0.035
Neg logit contributions
 forgot-0.065
 knew-0.060
 bit-0.059
 wished-0.058
 remembered-0.058
Token They
Token ID1119
Feature activation-7.94e+00

Pos logit contributions
he+0.043
ing+0.039
ays+0.038
room+0.038
ane+0.037
Neg logit contributions
 forgot-0.060
 knew-0.054
 remembered-0.052
 wished-0.052
 pap-0.049
Token were
Token ID547
Feature activation-3.08e+00

Pos logit contributions
 pap+0.117
 robbers+0.111
 fact+0.103
 error+0.103
 forgot+0.103
Neg logit contributions
he-0.235
room-0.182
hel-0.177
uggle-0.174
ane-0.168
Token not
Token ID407
Feature activation-3.24e+00

Pos logit contributions
 forgot+0.058
 knew+0.049
 bit+0.047
 wished+0.047
 knocked+0.045
Neg logit contributions
he-0.080
ing-0.073
eddy-0.072
room-0.071
ays-0.070
Token lonely
Token ID21757
Feature activation+3.05e+00

Pos logit contributions
 forgot+0.051
 knew+0.045
 fact+0.042
 pap+0.040
 wished+0.040
Neg logit contributions
ing-0.082
he-0.082
 anymore-0.081
eddy-0.076
apping-0.075
Token anymore
Token ID7471
Feature activation+2.03e+00

Pos logit contributions
he+0.037
ays+0.035
room+0.033
ane+0.031
uggle+0.031
Neg logit contributions
 forgot-0.106
 knew-0.095
 wished-0.094
 remembered-0.094
 bit-0.092
Token.
Token ID13
Feature activation+2.31e+00

Pos logit contributions
he+0.031
ays+0.029
room+0.028
ing+0.026
ane+0.026
Neg logit contributions
 forgot-0.066
 knew-0.061
 wished-0.059
 remembered-0.059
 learned-0.055
Token forget
Token ID6044
Feature activation+9.63e-01

Pos logit contributions
 pap+0.140
 forgot+0.129
 remembered+0.121
 knew+0.121
 fact+0.121
Neg logit contributions
he-0.188
uggle-0.143
room-0.141
 anymore-0.141
ane-0.135
Token the
Token ID262
Feature activation-3.58e+00

Pos logit contributions
he+0.018
ays+0.017
room+0.016
ing+0.016
ane+0.016
Neg logit contributions
 forgot-0.027
 knew-0.025
 remembered-0.024
 wished-0.024
 learned-0.023
Token time
Token ID640
Feature activation+2.28e+00

Pos logit contributions
 forgot+0.086
 remembered+0.075
 knew+0.069
 pap+0.067
 wished+0.065
Neg logit contributions
he-0.084
room-0.072
ay-0.071
time-0.070
uggle-0.070
Token.
Token ID13
Feature activation+1.86e+00

Pos logit contributions
he+0.034
room+0.031
ays+0.029
uggle+0.028
ing+0.027
Neg logit contributions
 forgot-0.072
 knew-0.066
 wished-0.064
 remembered-0.062
 learned-0.060
Token They
Token ID1119
Feature activation-7.91e+00

Pos logit contributions
he+0.044
ing+0.042
room+0.039
ays+0.038
ane+0.038
Neg logit contributions
 forgot-0.044
 knew-0.039
 wished-0.039
 remembered-0.037
 pap-0.036
Token also
Token ID635
Feature activation-7.94e+00

Pos logit contributions
 forgot+0.150
 pap+0.144
 remembered+0.136
 stolen+0.135
 knew+0.130
Neg logit contributions
he-0.210
uggle-0.162
room-0.162
 anymore-0.155
ane-0.151
Token forget
Token ID6044
Feature activation+8.75e-01

Pos logit contributions
 forgot+0.149
 pap+0.136
 knew+0.131
 struggled+0.129
 remembered+0.128
Neg logit contributions
he-0.192
uggle-0.154
room-0.152
 with-0.149
 anymore-0.147
Token to
Token ID284
Feature activation-5.52e+00

Pos logit contributions
he+0.016
ays+0.014
room+0.014
ing+0.014
ane+0.013
Neg logit contributions
 forgot-0.026
 knew-0.023
 remembered-0.023
 wished-0.022
 learned-0.021
Token eat
Token ID4483
Feature activation+2.22e+00

Pos logit contributions
 forgot+0.115
 knew+0.098
 wished+0.096
 remembered+0.094
 pap+0.088
Neg logit contributions
he-0.144
room-0.132
uggle-0.123
ings-0.119
time-0.112
Token.
Token ID13
Feature activation+2.22e+00

Pos logit contributions
he+0.059
ane+0.057
ays+0.056
ing+0.055
room+0.054
Neg logit contributions
 forgot-0.040
 knew-0.036
 wished-0.034
 bit-0.033
 pap-0.033
Token They
Token ID1119
Feature activation-7.36e+00

Pos logit contributions
he+0.057
ing+0.054
room+0.050
ane+0.049
ays+0.048
Neg logit contributions
 forgot-0.048
 wished-0.042
 knew-0.042
 remembered-0.040
 pap-0.040
Token up
Token ID510
Feature activation+9.83e-01

Pos logit contributions
room+0.034
he+0.034
ay+0.032
ane+0.031
ase+0.030
Neg logit contributions
 forgot-0.037
 pap-0.033
 knew-0.033
 remembered-0.033
 wished-0.032
Token happy
Token ID3772
Feature activation+7.78e-01

Pos logit contributions
room+0.018
he+0.018
ays+0.017
ane+0.016
ing+0.016
Neg logit contributions
 forgot-0.027
 knew-0.025
 remembered-0.024
 wished-0.023
 pap-0.023
Token and
Token ID290
Feature activation-5.07e+00

Pos logit contributions
he+0.014
ays+0.013
room+0.013
ane+0.012
ing+0.012
Neg logit contributions
 forgot-0.022
 knew-0.021
 wished-0.020
 remembered-0.020
 bit-0.019
Token refreshed
Token ID47193
Feature activation+4.29e+00

Pos logit contributions
 forgot+0.090
 knew+0.076
 pap+0.075
 known+0.072
 remembered+0.072
Neg logit contributions
he-0.134
ing-0.125
ays-0.121
ore-0.116
uggle-0.115
Token.
Token ID13
Feature activation+2.37e+00

Pos logit contributions
ays+0.084
ane+0.080
room+0.079
uggle+0.078
he+0.077
Neg logit contributions
 forgot-0.105
 knew-0.103
 remembered-0.093
 bit-0.092
 wished-0.092
Token They
Token ID1119
Feature activation-7.94e+00

Pos logit contributions
he+0.053
ing+0.050
room+0.047
ays+0.047
ane+0.046
Neg logit contributions
 forgot-0.059
 knew-0.052
 wished-0.051
 remembered-0.050
 pap-0.048
Token got
Token ID1392
Feature activation-1.57e+00

Pos logit contributions
 fact+0.107
 unfair+0.104
 stolen+0.103
 pap+0.103
 wasteful+0.102
Neg logit contributions
he-0.228
ane-0.174
room-0.170
hel-0.169
 with-0.166
Token ready
Token ID3492
Feature activation+1.22e+00

Pos logit contributions
 forgot+0.050
 knew+0.045
 remembered+0.044
 wished+0.044
 pap+0.042
Neg logit contributions
he-0.023
ing-0.020
ays-0.020
room-0.020
ane-0.018
Token for
Token ID329
Feature activation-1.00e+00

Pos logit contributions
he+0.026
ays+0.025
room+0.024
ing+0.023
ane+0.023
Neg logit contributions
 forgot-0.031
 knew-0.028
 remembered-0.027
 wished-0.026
 pap-0.025
Token school
Token ID1524
Feature activation+3.04e+00

Pos logit contributions
 forgot+0.025
 knew+0.022
 remembered+0.021
 wished+0.020
 bit+0.020
Neg logit contributions
he-0.021
room-0.019
ays-0.019
ing-0.019
ore-0.018
Token and
Token ID290
Feature activation-5.41e+00

Pos logit contributions
he+0.044
ays+0.040
room+0.038
ing+0.038
ane+0.035
Neg logit contributions
 forgot-0.098
 knew-0.089
 remembered-0.086
 wished-0.086
 learned-0.082
Token they
Token ID484
Feature activation-7.20e+00

Pos logit contributions
 forgot+0.030
 knew+0.026
 remembered+0.026
 wished+0.025
 pap+0.024
Neg logit contributions
he-0.027
ing-0.025
ays-0.025
room-0.025
ane-0.024
Token forget
Token ID6044
Feature activation+9.63e-01

Pos logit contributions
 pap+0.140
 forgot+0.129
 remembered+0.121
 knew+0.121
 fact+0.121
Neg logit contributions
he-0.188
uggle-0.143
room-0.141
 anymore-0.141
ane-0.135
Token the
Token ID262
Feature activation-3.58e+00

Pos logit contributions
he+0.018
ays+0.017
room+0.016
ing+0.016
ane+0.016
Neg logit contributions
 forgot-0.027
 knew-0.025
 remembered-0.024
 wished-0.024
 learned-0.023
Token time
Token ID640
Feature activation+2.28e+00

Pos logit contributions
 forgot+0.086
 remembered+0.075
 knew+0.069
 pap+0.067
 wished+0.065
Neg logit contributions
he-0.084
room-0.072
ay-0.071
time-0.070
uggle-0.070
Token.
Token ID13
Feature activation+1.86e+00

Pos logit contributions
he+0.034
room+0.031
ays+0.029
uggle+0.028
ing+0.027
Neg logit contributions
 forgot-0.072
 knew-0.066
 wished-0.064
 remembered-0.062
 learned-0.060
Token They
Token ID1119
Feature activation-7.91e+00

Pos logit contributions
he+0.044
ing+0.042
room+0.039
ays+0.038
ane+0.038
Neg logit contributions
 forgot-0.044
 knew-0.039
 wished-0.039
 remembered-0.037
 pap-0.036
Token also
Token ID635
Feature activation-7.94e+00

Pos logit contributions
 forgot+0.150
 pap+0.144
 remembered+0.136
 stolen+0.135
 knew+0.130
Neg logit contributions
he-0.210
uggle-0.162
room-0.162
 anymore-0.155
ane-0.151
Token forget
Token ID6044
Feature activation+8.75e-01

Pos logit contributions
 forgot+0.149
 pap+0.136
 knew+0.131
 struggled+0.129
 remembered+0.128
Neg logit contributions
he-0.192
uggle-0.154
room-0.152
 with-0.149
 anymore-0.147
Token to
Token ID284
Feature activation-5.52e+00

Pos logit contributions
he+0.016
ays+0.014
room+0.014
ing+0.014
ane+0.013
Neg logit contributions
 forgot-0.026
 knew-0.023
 remembered-0.023
 wished-0.022
 learned-0.021
Token eat
Token ID4483
Feature activation+2.22e+00

Pos logit contributions
 forgot+0.115
 knew+0.098
 wished+0.096
 remembered+0.094
 pap+0.088
Neg logit contributions
he-0.144
room-0.132
uggle-0.123
ings-0.119
time-0.112
Token.
Token ID13
Feature activation+2.22e+00

Pos logit contributions
he+0.059
ane+0.057
ays+0.056
ing+0.055
room+0.054
Neg logit contributions
 forgot-0.040
 knew-0.036
 wished-0.034
 bit-0.033
 pap-0.033
Token in
Token ID287
Feature activation-1.26e+00

Pos logit contributions
he+0.020
ays+0.018
room+0.018
ing+0.018
ane+0.017
Neg logit contributions
 forgot-0.024
 knew-0.021
 remembered-0.020
 wished-0.020
 pap-0.019
Token the
Token ID262
Feature activation-3.53e+00

Pos logit contributions
 forgot+0.035
 knew+0.030
 remembered+0.029
 wished+0.028
 pap+0.027
Neg logit contributions
he-0.025
room-0.023
ays-0.022
ing-0.021
ings-0.021
Token tent
Token ID11105
Feature activation+2.26e+00

Pos logit contributions
 forgot+0.089
 remembered+0.074
 knew+0.074
 wished+0.071
 pap+0.068
Neg logit contributions
he-0.082
room-0.073
ing-0.072
ay-0.070
time-0.069
Token.
Token ID13
Feature activation+2.70e+00

Pos logit contributions
he+0.030
ing+0.026
ays+0.026
room+0.026
ane+0.023
Neg logit contributions
 forgot-0.075
 knew-0.068
 remembered-0.067
 wished-0.066
 pap-0.064
Token They
Token ID1119
Feature activation-6.84e+00

Pos logit contributions
he+0.054
ing+0.051
room+0.049
ays+0.048
ane+0.046
Neg logit contributions
 forgot-0.072
 knew-0.064
 remembered-0.062
 wished-0.061
 pap-0.059
Token even
Token ID772
Feature activation-7.90e+00

Pos logit contributions
 pap+0.114
 comm+0.105
 vitamins+0.105
 fact+0.103
 cruel+0.102
Neg logit contributions
he-0.184
room-0.148
hel-0.135
 with-0.132
ing-0.132
Token stayed
Token ID9658
Feature activation+2.47e-01

Pos logit contributions
 fact+0.119
 pap+0.116
 majesty+0.102
 willingness+0.100
 vitamins+0.099
Neg logit contributions
he-0.190
ing-0.182
room-0.176
 with-0.171
aw-0.166
Token up
Token ID510
Feature activation+1.23e+00

Pos logit contributions
he+0.003
ing+0.003
ays+0.003
room+0.003
ings+0.003
Neg logit contributions
 forgot-0.008
 knew-0.007
 wished-0.007
 remembered-0.007
 pap-0.007
Token late
Token ID2739
Feature activation+1.76e+00

Pos logit contributions
he+0.021
ays+0.019
room+0.019
ing+0.019
ane+0.018
Neg logit contributions
 forgot-0.036
 knew-0.033
 remembered-0.032
 wished-0.031
 pap-0.030
Token into
Token ID656
Feature activation-2.47e+00

Pos logit contributions
he+0.033
ays+0.031
room+0.030
eddy+0.028
ane+0.028
Neg logit contributions
 forgot-0.049
 knew-0.044
 remembered-0.043
 wished-0.043
 bit-0.041
Token the
Token ID262
Feature activation-3.66e+00

Pos logit contributions
 forgot+0.072
 knew+0.062
 remembered+0.061
 wished+0.057
 pap+0.055
Neg logit contributions
he-0.043
room-0.040
ays-0.037
ing-0.037
 anymore-0.037
Token feeling
Token ID4203
Feature activation-3.02e+00

Pos logit contributions
 forgot+0.045
 knew+0.038
 wished+0.037
 remembered+0.037
 pap+0.035
Neg logit contributions
he-0.056
ing-0.050
ays-0.048
room-0.048
eddy-0.047
Token happy
Token ID3772
Feature activation+1.01e+00

Pos logit contributions
 forgot+0.054
 knew+0.043
 remembered+0.042
 wished+0.039
 pap+0.038
Neg logit contributions
he-0.087
ing-0.080
ays-0.077
room-0.077
uggle-0.075
Token and
Token ID290
Feature activation-3.23e+00

Pos logit contributions
he+0.017
ays+0.015
room+0.015
ing+0.014
ane+0.014
Neg logit contributions
 forgot-0.031
 knew-0.028
 remembered-0.027
 wished-0.027
 pap-0.025
Token grateful
Token ID14066
Feature activation+2.44e+00

Pos logit contributions
 forgot+0.059
 knew+0.050
 remembered+0.049
 pap+0.048
 wished+0.046
Neg logit contributions
he-0.086
ing-0.079
uggle-0.076
ore-0.075
ays-0.074
Token.
Token ID13
Feature activation+2.01e+00

Pos logit contributions
he+0.037
ays+0.033
room+0.033
ing+0.030
ane+0.030
Neg logit contributions
 forgot-0.078
 knew-0.072
 remembered-0.069
 wished-0.068
 pap-0.066
Token They
Token ID1119
Feature activation-7.90e+00

Pos logit contributions
he+0.032
ing+0.028
room+0.028
ays+0.028
ane+0.026
Neg logit contributions
 forgot-0.062
 knew-0.056
 remembered-0.055
 wished-0.054
 pap-0.052
Token had
Token ID550
Feature activation-3.30e+00

Pos logit contributions
 robbers+0.107
 pap+0.100
 chemicals+0.100
 destructive+0.097
 stolen+0.096
Neg logit contributions
he-0.245
hel-0.190
ane-0.185
room-0.181
uggle-0.178
Token a
Token ID257
Feature activation-4.12e+00

Pos logit contributions
 forgot+0.055
 robbers+0.050
 pap+0.049
 bit+0.046
 fact+0.045
Neg logit contributions
he-0.082
room-0.078
ing-0.073
time-0.072
ays-0.070
Token wonderful
Token ID7932
Feature activation-3.32e+00

Pos logit contributions
 forgot+0.082
 struggled+0.067
 wished+0.067
 remembered+0.066
 knew+0.063
Neg logit contributions
he-0.102
uggle-0.092
ay-0.090
time-0.090
eddy-0.085
Token morning
Token ID3329
Feature activation+1.11e+00

Pos logit contributions
 forgot+0.072
 knew+0.061
 remembered+0.060
 wished+0.058
 pap+0.055
Neg logit contributions
he-0.084
room-0.077
ing-0.076
ays-0.075
ane-0.073
Token.
Token ID13
Feature activation+1.84e+00

Pos logit contributions
he+0.018
room+0.017
ays+0.017
ing+0.016
ings+0.015
Neg logit contributions
 forgot-0.033
 knew-0.030
 remembered-0.029
 wished-0.029
 bit-0.028
Token very
Token ID845
Feature activation+4.61e-01

Pos logit contributions
he+0.063
ays+0.061
ane+0.054
ing+0.054
room+0.054
Neg logit contributions
 forgot-0.092
 knew-0.085
 remembered-0.081
 wished-0.079
 learned-0.077
Token loud
Token ID7812
Feature activation+4.04e+00

Pos logit contributions
he+0.012
ays+0.011
room+0.011
ing+0.011
ane+0.011
Neg logit contributions
 forgot-0.009
 knew-0.008
 remembered-0.008
 wished-0.008
 pap-0.007
Token and
Token ID290
Feature activation-4.18e+00

Pos logit contributions
he+0.073
ays+0.065
ane+0.061
uggle+0.060
eddy+0.059
Neg logit contributions
 forgot-0.111
 knew-0.106
 wished-0.104
 remembered-0.099
 bit-0.097
Token wear
Token ID5806
Feature activation+1.35e+00

Pos logit contributions
 forgot+0.078
 pap+0.072
 remembered+0.071
 knew+0.069
 stolen+0.067
Neg logit contributions
he-0.101
ing-0.092
room-0.087
 anymore-0.087
 with-0.086
Token funny
Token ID8258
Feature activation+1.68e+00

Pos logit contributions
he+0.031
ing+0.029
room+0.029
ays+0.028
ings+0.028
Neg logit contributions
 forgot-0.032
 knew-0.027
 remembered-0.027
 wished-0.026
 pap-0.025
Token clothes
Token ID8242
Feature activation+5.61e+00

Pos logit contributions
he+0.044
ays+0.041
ing+0.040
room+0.040
ane+0.039
Neg logit contributions
 forgot-0.034
 knew-0.030
 remembered-0.028
 wished-0.028
 pap-0.026
Token.
Token ID13
Feature activation+3.05e+00

Pos logit contributions
he+0.080
ays+0.076
room+0.076
ing+0.068
uggle+0.068
Neg logit contributions
 forgot-0.175
 knew-0.158
 remembered-0.152
 wished-0.151
 forgotten-0.148
Token\n
Token ID198
Feature activation+3.56e+00

Pos logit contributions
he+0.058
ing+0.056
ays+0.053
ings+0.051
ane+0.050
Neg logit contributions
 forgot-0.081
 knew-0.073
 wished-0.071
 remembered-0.069
 pap-0.068
Token\n
Token ID198
Feature activation+2.90e+00

Pos logit contributions
ays+0.073
ing+0.070
ipping+0.068
ane+0.067
he+0.065
Neg logit contributions
 forgot-0.086
 knew-0.083
 wished-0.076
 never-0.075
 learned-0.074
TokenL
Token ID43
Feature activation+3.08e-01

Pos logit contributions
he+0.053
room+0.048
ing+0.047
ays+0.046
ane+0.043
Neg logit contributions
 forgot-0.083
 knew-0.074
 remembered-0.073
 wished-0.072
 pap-0.069
Tokenily
Token ID813
Feature activation-4.34e+00

Pos logit contributions
he+0.006
ing+0.006
ays+0.006
uggle+0.006
ings+0.006
Neg logit contributions
 forgot-0.008
 knew-0.007
 remembered-0.006
 wished-0.006
 learned-0.006
TokenBut
Token ID1537
Feature activation-1.24e+00

Pos logit contributions
ays+0.063
he+0.061
ane+0.059
ore+0.058
ing+0.055
Neg logit contributions
 forgot-0.078
 knew-0.067
 remembered-0.066
 wished-0.064
 learned-0.064
Token the
Token ID262
Feature activation-3.48e+00

Pos logit contributions
 forgot+0.032
 knew+0.028
 remembered+0.028
 wished+0.027
 pap+0.026
Neg logit contributions
he-0.026
ing-0.023
room-0.023
ays-0.023
ane-0.022
Token lock
Token ID5793
Feature activation-2.43e+00

Pos logit contributions
 forgot+0.092
 known+0.079
 pap+0.077
 remembered+0.074
 knew+0.073
Neg logit contributions
he-0.065
room-0.060
ings-0.057
ing-0.056
 with-0.055
Token was
Token ID373
Feature activation-2.89e+00

Pos logit contributions
 forgot+0.044
 knew+0.037
 remembered+0.036
 wished+0.035
 pap+0.035
Neg logit contributions
he-0.068
ing-0.064
room-0.064
ays-0.064
ings-0.061
Token too
Token ID1165
Feature activation-9.29e-01

Pos logit contributions
 forgot+0.065
 knew+0.056
 pap+0.056
 wished+0.055
 known+0.051
Neg logit contributions
he-0.059
ing-0.056
 anymore-0.053
ays-0.052
 against-0.052
Token strong
Token ID1913
Feature activation+5.63e+00

Pos logit contributions
 forgot+0.016
 knew+0.013
 remembered+0.013
 wished+0.012
 pap+0.011
Neg logit contributions
he-0.028
ays-0.026
room-0.026
ing-0.026
ane-0.025
Token and
Token ID290
Feature activation-4.70e+00

Pos logit contributions
he+0.100
room+0.093
ays+0.092
uggle+0.089
ane+0.088
Neg logit contributions
 forgot-0.155
 knew-0.142
 wished-0.133
 remembered-0.133
 bit-0.133
Token they
Token ID484
Feature activation-5.21e+00

Pos logit contributions
 forgot+0.104
 known+0.097
 pap+0.094
 knew+0.092
 fact+0.091
Neg logit contributions
ing-0.088
ays-0.083
he-0.079
 with-0.076
ings-0.075
Token could
Token ID714
Feature activation-5.40e-01

Pos logit contributions
 pap+0.096
 forgot+0.090
 known+0.087
 fact+0.083
 shown+0.080
Neg logit contributions
he-0.141
room-0.124
ing-0.117
ays-0.117
ane-0.113
Token not
Token ID407
Feature activation-1.99e+00

Pos logit contributions
 forgot+0.007
 knew+0.005
 remembered+0.005
 wished+0.005
 pap+0.004
Neg logit contributions
he-0.019
room-0.017
ays-0.017
ing-0.017
ings-0.017
Token open
Token ID1280
Feature activation+1.68e+00

Pos logit contributions
 forgot+0.035
 knew+0.030
 remembered+0.028
 wished+0.026
 bit+0.025
Neg logit contributions
ays-0.059
he-0.057
ing-0.057
eddy-0.055
room-0.055
Token.
Token ID13
Feature activation+3.63e+00

Pos logit contributions
ane+0.105
he+0.099
eddy+0.097
anes+0.096
uggle+0.095
Neg logit contributions
 forgot-0.104
 way-0.102
 bit-0.102
 knew-0.101
 never-0.096
Token But
Token ID887
Feature activation-1.10e+00

Pos logit contributions
he+0.092
eddy+0.086
ing+0.083
ane+0.082
ays+0.082
Neg logit contributions
 forgot-0.080
 knew-0.073
 wished-0.067
 remembered-0.066
 knocked-0.064
Token it
Token ID340
Feature activation-2.73e+00

Pos logit contributions
 forgot+0.027
 knew+0.023
 remembered+0.023
 wished+0.022
 pap+0.021
Neg logit contributions
he-0.026
ays-0.024
room-0.024
ing-0.023
eddy-0.023
Token was
Token ID373
Feature activation-1.36e+00

Pos logit contributions
 forgot+0.059
 pap+0.052
 known+0.051
 wished+0.051
 knew+0.050
Neg logit contributions
he-0.061
ing-0.058
room-0.056
ays-0.054
ings-0.053
Token too
Token ID1165
Feature activation-3.38e-01

Pos logit contributions
 forgot+0.031
 knew+0.027
 wished+0.026
 remembered+0.025
 pap+0.025
Neg logit contributions
he-0.029
ays-0.027
ing-0.027
room-0.026
ings-0.026
Token heavy
Token ID4334
Feature activation+5.74e+00

Pos logit contributions
 forgot+0.007
 knew+0.006
 remembered+0.006
 wished+0.005
 bit+0.005
Neg logit contributions
he-0.009
ays-0.009
room-0.009
ing-0.008
ane-0.008
Token and
Token ID290
Feature activation-3.87e+00

Pos logit contributions
he+0.126
room+0.118
ays+0.117
ane+0.114
uggle+0.111
Neg logit contributions
 forgot-0.130
 knew-0.123
 bit-0.112
 wished-0.112
 remembered-0.112
Token hard
Token ID1327
Feature activation+3.84e+00

Pos logit contributions
 forgot+0.085
 known+0.077
 knew+0.074
 pap+0.073
 shown+0.071
Neg logit contributions
ing-0.078
ays-0.076
he-0.075
ings-0.069
 anymore-0.069
Token.
Token ID13
Feature activation+3.33e+00

Pos logit contributions
he+0.063
room+0.058
ays+0.058
ane+0.055
uggle+0.054
Neg logit contributions
 forgot-0.112
 knew-0.106
 wished-0.100
 remembered-0.099
 knocked-0.098
Token She
Token ID1375
Feature activation-6.18e+00

Pos logit contributions
he+0.082
ane+0.075
ing+0.074
eddy+0.074
room+0.071
Neg logit contributions
 forgot-0.072
 knew-0.065
 wished-0.061
 remembered-0.060
 pap-0.060
Token could
Token ID714
Feature activation-2.88e-01

Pos logit contributions
 known+0.118
 pap+0.111
 payoff+0.108
 disl+0.103
 fact+0.102
Neg logit contributions
he-0.134
 with-0.117
room-0.116
es-0.114
ays-0.114
Token cannot
Token ID2314
Feature activation-2.08e+00

Pos logit contributions
 pap+0.057
 forgot+0.054
 remembered+0.047
 known+0.046
 wished+0.046
Neg logit contributions
he-0.077
ane-0.063
 anymore-0.062
ore-0.061
uggle-0.061
Token take
Token ID1011
Feature activation+6.31e-01

Pos logit contributions
 forgot+0.040
 knew+0.033
 remembered+0.033
 wished+0.032
 pap+0.031
Neg logit contributions
he-0.058
room-0.053
ing-0.050
ings-0.050
ays-0.050
Token it
Token ID340
Feature activation+3.84e+00

Pos logit contributions
he+0.013
ing+0.012
ays+0.012
room+0.012
ings+0.012
Neg logit contributions
 forgot-0.016
 knew-0.014
 remembered-0.014
 wished-0.013
 pap-0.013
Token from
Token ID422
Feature activation-1.50e-01

Pos logit contributions
ays+0.071
he+0.069
room+0.066
ane+0.064
ement+0.063
Neg logit contributions
 forgot-0.104
 knew-0.090
 forgotten-0.089
 bit-0.087
 never-0.086
Token the
Token ID262
Feature activation-2.51e+00

Pos logit contributions
 forgot+0.004
 knew+0.003
 remembered+0.003
 wished+0.003
 pap+0.003
Neg logit contributions
he-0.003
room-0.003
ing-0.003
ays-0.003
ings-0.003
Token museum
Token ID13257
Feature activation+5.87e+00

Pos logit contributions
 forgot+0.061
 remembered+0.054
 knew+0.052
 wished+0.049
 pap+0.048
Neg logit contributions
he-0.059
room-0.052
ing-0.050
ore-0.050
 anymore-0.048
Token."
Token ID526
Feature activation+3.57e+00

Pos logit contributions
he+0.088
ays+0.080
room+0.078
ing+0.077
ings+0.071
Neg logit contributions
 forgot-0.185
 knew-0.167
 remembered-0.161
 wished-0.160
 bit-0.155
Token\n
Token ID198
Feature activation+4.30e+00

Pos logit contributions
he+0.080
ays+0.076
ing+0.075
ane+0.074
room+0.073
Neg logit contributions
 forgot-0.088
 knew-0.078
 remembered-0.074
 wished-0.072
 learned-0.069
Token\n
Token ID198
Feature activation+2.90e+00

Pos logit contributions
ays+0.095
ane+0.089
ing+0.085
ipping+0.082
ings+0.080
Neg logit contributions
 forgot-0.106
 knew-0.101
 learned-0.090
 wished-0.088
 told-0.087
TokenHe
Token ID1544
Feature activation-5.34e+00

Pos logit contributions
he+0.054
ing+0.049
room+0.049
ays+0.048
ane+0.045
Neg logit contributions
 forgot-0.082
 knew-0.073
 remembered-0.071
 wished-0.070
 pap-0.067
Token felt
Token ID2936
Feature activation-3.17e+00

Pos logit contributions
 forgot+0.085
 pap+0.079
 fact+0.078
 forget+0.077
 known+0.073
Neg logit contributions
he-0.133
room-0.125
ing-0.123
 with-0.120
 again-0.111
Token in
Token ID287
Feature activation+1.60e+00

Pos logit contributions
ays+0.072
ane+0.069
ings+0.063
ing+0.062
ement+0.061
Neg logit contributions
 forgot-0.084
 sc-0.081
 knew-0.079
 always-0.077
 grown-0.077
Token her
Token ID607
Feature activation-1.56e+00

Pos logit contributions
he+0.040
ing+0.039
ays+0.039
ane+0.037
room+0.037
Neg logit contributions
 forgot-0.033
 knew-0.029
 wished-0.028
 remembered-0.027
 pap-0.026
Token backyard
Token ID24296
Feature activation+5.11e+00

Pos logit contributions
 forgot+0.028
 knew+0.023
 remembered+0.022
 wished+0.021
 pap+0.020
Neg logit contributions
he-0.046
ing-0.043
ays-0.043
room-0.043
ane-0.041
Token and
Token ID290
Feature activation-3.77e+00

Pos logit contributions
he+0.077
ays+0.075
ane+0.071
eddy+0.068
ing+0.068
Neg logit contributions
 forgot-0.150
 knew-0.141
 bit-0.135
 remembered-0.134
 wished-0.133
Token pick
Token ID2298
Feature activation+2.72e+00

Pos logit contributions
 forgot+0.085
 tricked+0.076
 forgotten+0.074
 wasted+0.074
 shown+0.074
Neg logit contributions
he-0.074
 with-0.069
uggle-0.067
room-0.064
 on-0.063
Token flowers
Token ID12734
Feature activation+5.45e+00

Pos logit contributions
he+0.085
room+0.078
ane+0.078
ays+0.076
apping+0.076
Neg logit contributions
 forgot-0.037
 knew-0.035
 wished-0.030
 remembered-0.030
 learned-0.028
Token.
Token ID13
Feature activation+3.70e+00

Pos logit contributions
room+0.112
ane+0.109
he+0.105
ays+0.104
ement+0.102
Neg logit contributions
 forgot-0.122
 knew-0.114
 sc-0.114
 told-0.111
 bit-0.111
Token One
Token ID1881
Feature activation-1.43e+00

Pos logit contributions
he+0.102
ing+0.102
ays+0.099
ings+0.098
ane+0.096
Neg logit contributions
 forgot-0.063
 knew-0.058
 remembered-0.050
 always-0.050
 way-0.049
Token day
Token ID1110
Feature activation+3.24e+00

Pos logit contributions
 forgot+0.026
 knew+0.023
 remembered+0.021
 wished+0.020
 bit+0.019
Neg logit contributions
he-0.040
ays-0.040
room-0.038
ing-0.038
ings-0.038
Token,
Token ID11
Feature activation+1.57e+00

Pos logit contributions
he+0.071
ays+0.068
room+0.064
eddy+0.063
ane+0.062
Neg logit contributions
 forgot-0.086
 knew-0.078
 remembered-0.076
 bit-0.071
 knocked-0.070
Token she
Token ID673
Feature activation-5.81e+00

Pos logit contributions
he+0.039
ays+0.037
room+0.036
ing+0.036
ane+0.036
Neg logit contributions
 forgot-0.036
 knew-0.031
 remembered-0.030
 wished-0.029
 bit-0.028
Token"
Token ID1
Feature activation+3.62e-01

Pos logit contributions
ays+0.058
he+0.056
ing+0.053
room+0.050
ane+0.050
Neg logit contributions
 forgot-0.081
 knew-0.075
 remembered-0.072
 wished-0.070
 learned-0.067
TokenDad
Token ID46270
Feature activation+1.06e-01

Pos logit contributions
he+0.007
ing+0.007
room+0.007
ays+0.007
ane+0.006
Neg logit contributions
 forgot-0.010
 knew-0.008
 remembered-0.008
 wished-0.008
 pap-0.008
Token,
Token ID11
Feature activation-7.67e-01

Pos logit contributions
he+0.001
ing+0.001
room+0.001
ays+0.001
ane+0.001
Neg logit contributions
 forgot-0.004
 knew-0.003
 remembered-0.003
 wished-0.003
 pap-0.003
Token can
Token ID460
Feature activation+1.24e+00

Pos logit contributions
 forgot+0.020
 remembered+0.017
 known+0.017
 wished+0.017
 knew+0.016
Neg logit contributions
he-0.015
ing-0.014
room-0.013
ore-0.013
ane-0.013
Token I
Token ID314
Feature activation-1.87e+00

Pos logit contributions
he+0.029
ing+0.026
room+0.026
ays+0.026
ane+0.025
Neg logit contributions
 forgot-0.030
 knew-0.026
 remembered-0.025
 wished-0.025
 pap-0.023
Token taste
Token ID6938
Feature activation+2.88e+00

Pos logit contributions
 forgot+0.033
 knew+0.026
 wished+0.026
 remembered+0.026
 pap+0.024
Neg logit contributions
he-0.055
room-0.050
ing-0.050
ays-0.049
ore-0.048
Token surf
Token ID9053
Feature activation+7.07e+00

Pos logit contributions
he+0.076
room+0.074
ane+0.071
ays+0.070
ing+0.068
Neg logit contributions
 forgot-0.055
 knew-0.050
 wished-0.047
 remembered-0.047
 pap-0.046
Token?"
Token ID1701
Feature activation+3.02e+00

Pos logit contributions
ays+0.127
eddy+0.122
ings+0.120
room+0.119
aps+0.117
Neg logit contributions
 forgot-0.164
 bit-0.154
 knew-0.149
 packed-0.144
 remembered-0.144
Token\n
Token ID198
Feature activation+3.93e+00

Pos logit contributions
he+0.076
ing+0.070
room+0.070
ays+0.070
ane+0.066
Neg logit contributions
 forgot-0.066
 knew-0.056
 remembered-0.055
 wished-0.053
 pap-0.050
Token\n
Token ID198
Feature activation+2.89e+00

Pos logit contributions
ays+0.085
ing+0.079
he+0.077
room+0.075
 anymore+0.073
Neg logit contributions
 knew-0.095
 forgot-0.093
 wished-0.085
 remembered-0.085
 pap-0.082
TokenHis
Token ID6653
Feature activation-1.16e+00

Pos logit contributions
ays+0.063
he+0.061
ing+0.059
room+0.056
ane+0.055
Neg logit contributions
 forgot-0.072
 knew-0.068
 remembered-0.064
 wished-0.064
 pap-0.060
Token the
Token ID262
Feature activation-2.71e+00

Pos logit contributions
 forgot+0.025
 wished+0.021
 remembered+0.021
 stolen+0.020
 knew+0.020
Neg logit contributions
he-0.017
room-0.016
ing-0.015
ays-0.015
ings-0.014
Token sun
Token ID4252
Feature activation+1.57e+00

Pos logit contributions
 forgot+0.062
 knew+0.051
 remembered+0.051
 wished+0.048
 known+0.047
Neg logit contributions
he-0.068
room-0.060
ing-0.060
eddy-0.058
ane-0.058
Token,
Token ID11
Feature activation-7.80e-01

Pos logit contributions
he+0.028
ays+0.025
room+0.024
ore+0.024
ing+0.023
Neg logit contributions
 forgot-0.045
 knew-0.041
 remembered-0.040
 pap-0.039
 wished-0.038
Token it
Token ID340
Feature activation-4.81e+00

Pos logit contributions
 forgot+0.020
 knew+0.018
 remembered+0.017
 wished+0.017
 pap+0.016
Neg logit contributions
he-0.016
ing-0.015
ays-0.015
room-0.015
ane-0.014
Token spark
Token ID9009
Feature activation+6.85e-01

Pos logit contributions
 robbers+0.085
 forgot+0.082
 pap+0.073
 stolen+0.072
 known+0.072
Neg logit contributions
he-0.127
ing-0.120
ays-0.113
ane-0.108
room-0.108
Tokenled
Token ID992
Feature activation+2.92e+00

Pos logit contributions
he+0.012
room+0.011
ays+0.011
uggle+0.011
ore+0.010
Neg logit contributions
 forgot-0.020
 knew-0.018
 remembered-0.018
 wished-0.017
 bit-0.016
Token.
Token ID13
Feature activation+3.17e+00

Pos logit contributions
he+0.058
ays+0.054
room+0.053
ane+0.052
uggle+0.051
Neg logit contributions
 forgot-0.078
 knew-0.073
 remembered-0.072
 bit-0.070
 wished-0.069
Token\n
Token ID198
Feature activation+4.49e+00

Pos logit contributions
he+0.068
ing+0.062
eddy+0.061
ays+0.060
room+0.059
Neg logit contributions
 forgot-0.080
 knew-0.074
 remembered-0.068
 wished-0.067
 learned-0.065
Token\n
Token ID198
Feature activation+3.07e+00

Pos logit contributions
ays+0.097
ing+0.091
ane+0.090
he+0.088
ipping+0.088
Neg logit contributions
 knew-0.101
 forgot-0.100
 known-0.087
 learned-0.086
 remembered-0.085
TokenM
Token ID44
Feature activation+8.83e-01

Pos logit contributions
he+0.061
ays+0.058
ing+0.056
eddy+0.055
ane+0.054
Neg logit contributions
 forgot-0.082
 knew-0.074
 remembered-0.071
 wished-0.068
 known-0.066
Tokenia
Token ID544
Feature activation-5.16e+00

Pos logit contributions
he+0.022
ays+0.020
room+0.019
ing+0.019
eddy+0.019
Neg logit contributions
 forgot-0.019
 knew-0.017
 remembered-0.016
 wished-0.015
 pap-0.015
Token are
Token ID389
Feature activation-1.70e+00

Pos logit contributions
 pap+0.080
 forgot+0.070
 majesty+0.063
 remembered+0.062
 elves+0.061
Neg logit contributions
he-0.094
room-0.076
uggle-0.073
ore-0.072
 anymore-0.071
Token going
Token ID1016
Feature activation+1.84e+00

Pos logit contributions
 forgot+0.038
 knew+0.034
 pap+0.034
 remembered+0.032
 wished+0.032
Neg logit contributions
he-0.036
ing-0.034
 with-0.034
 at-0.033
 on-0.033
Token home
Token ID1363
Feature activation+3.95e+00

Pos logit contributions
he+0.034
ing+0.031
ays+0.030
room+0.030
ane+0.028
Neg logit contributions
 forgot-0.052
 knew-0.047
 remembered-0.046
 wished-0.045
 pap-0.043
Token right
Token ID826
Feature activation+2.56e+00

Pos logit contributions
he+0.066
ays+0.065
ane+0.061
room+0.060
ing+0.059
Neg logit contributions
 forgot-0.117
 knew-0.105
 remembered-0.102
 wished-0.100
 learned-0.098
Token now
Token ID783
Feature activation+2.80e+00

Pos logit contributions
he+0.051
ays+0.046
room+0.045
ane+0.044
uggle+0.043
Neg logit contributions
 forgot-0.071
 knew-0.065
 remembered-0.064
 wished-0.062
 bit-0.058
Token.
Token ID13
Feature activation+2.14e+00

Pos logit contributions
he+0.046
ays+0.045
ane+0.043
ing+0.041
eddy+0.040
Neg logit contributions
 forgot-0.084
 knew-0.076
 wished-0.075
 remembered-0.074
 bit-0.072
Token You
Token ID921
Feature activation-2.81e+00

Pos logit contributions
he+0.040
ing+0.036
ays+0.035
room+0.035
ane+0.033
Neg logit contributions
 forgot-0.061
 knew-0.054
 remembered-0.053
 wished-0.052
 pap-0.050
Token are
Token ID389
Feature activation-1.82e+00

Pos logit contributions
 pap+0.062
 forgot+0.055
 known+0.051
 remembered+0.051
 fact+0.050
Neg logit contributions
he-0.071
room-0.059
ay-0.053
uggle-0.053
ane-0.053
Token in
Token ID287
Feature activation-5.18e-01

Pos logit contributions
 forgot+0.042
 knew+0.038
 wished+0.036
 remembered+0.035
 pap+0.035
Neg logit contributions
he-0.038
 on-0.035
 with-0.035
ing-0.033
 anymore-0.033
Token big
Token ID1263
Feature activation-7.81e-01

Pos logit contributions
 forgot+0.011
 knew+0.010
 remembered+0.009
 wished+0.009
 pap+0.008
Neg logit contributions
he-0.013
room-0.012
ays-0.012
ing-0.012
ings-0.012
Token trouble
Token ID5876
Feature activation+2.86e+00

Pos logit contributions
 forgot+0.016
 knew+0.013
 remembered+0.013
 wished+0.012
 known+0.012
Neg logit contributions
he-0.021
ays-0.019
room-0.019
ing-0.019
ings-0.018
Token card
Token ID2657
Feature activation+3.30e+00

Pos logit contributions
 forgot+0.065
 remembered+0.053
 wished+0.052
 known+0.052
 knew+0.052
Neg logit contributions
he-0.065
room-0.061
eddy-0.061
ing-0.059
ay-0.058
Token and
Token ID290
Feature activation-5.41e+00

Pos logit contributions
he+0.043
ays+0.039
ing+0.037
room+0.037
ane+0.034
Neg logit contributions
 forgot-0.110
 knew-0.100
 remembered-0.098
 wished-0.096
 pap-0.094
Token put
Token ID1234
Feature activation+4.94e-01

Pos logit contributions
 forgot+0.104
 fact+0.095
 forgotten+0.088
 unknow+0.087
 pap+0.086
Neg logit contributions
 with-0.109
he-0.108
room-0.107
ing-0.106
eddy-0.100
Token the
Token ID262
Feature activation-3.15e+00

Pos logit contributions
he+0.010
ing+0.009
ays+0.009
room+0.009
ane+0.008
Neg logit contributions
 forgot-0.013
 knew-0.012
 remembered-0.011
 wished-0.011
 pap-0.011
Token bracelet
Token ID42893
Feature activation+4.04e+00

Pos logit contributions
 forgot+0.074
 knew+0.063
 remembered+0.062
 pap+0.058
 wished+0.058
Neg logit contributions
he-0.077
room-0.068
ing-0.066
eddy-0.066
ane-0.065
Token inside
Token ID2641
Feature activation+1.97e+00

Pos logit contributions
ays+0.069
he+0.065
room+0.065
ore+0.060
ing+0.059
Neg logit contributions
 forgot-0.115
 knew-0.102
 remembered-0.101
 bit-0.098
 never-0.095
Token.
Token ID13
Feature activation+3.03e+00

Pos logit contributions
he+0.035
ays+0.033
ing+0.032
room+0.032
ane+0.031
Neg logit contributions
 forgot-0.055
 knew-0.051
 remembered-0.049
 wished-0.048
 bit-0.046
Token Her
Token ID2332
Feature activation-2.94e+00

Pos logit contributions
he+0.071
ing+0.069
ays+0.066
ings+0.065
ane+0.065
Neg logit contributions
 forgot-0.066
 knew-0.061
 wished-0.056
 learned-0.055
 remembered-0.055
Token friend
Token ID1545
Feature activation-3.52e+00

Pos logit contributions
 forgot+0.066
 known+0.055
 remembered+0.055
 knew+0.054
 wished+0.052
Neg logit contributions
he-0.067
eddy-0.063
room-0.062
ay-0.061
uggle-0.060
Token was
Token ID373
Feature activation-2.05e+00

Pos logit contributions
 forgot+0.066
 pap+0.057
 fact+0.055
 robbers+0.055
 forgotten+0.055
Neg logit contributions
he-0.087
room-0.083
ing-0.081
eddy-0.075
ane-0.073
Token so
Token ID523
Feature activation-1.50e+00

Pos logit contributions
 forgot+0.047
 robbers+0.038
 knew+0.037
 remembered+0.037
 wished+0.037
Neg logit contributions
he-0.049
room-0.044
ing-0.043
ays-0.041
eddy-0.041
Token.
Token ID13
Feature activation+3.27e+00

Pos logit contributions
he+0.050
ays+0.045
room+0.044
ing+0.042
ane+0.041
Neg logit contributions
 forgot-0.095
 knew-0.087
 remembered-0.085
 wished-0.084
 bit-0.079
Token It
Token ID632
Feature activation-4.54e+00

Pos logit contributions
he+0.068
ing+0.065
ays+0.063
room+0.062
ane+0.060
Neg logit contributions
 forgot-0.084
 knew-0.076
 remembered-0.072
 wished-0.071
 pap-0.069
Token ran
Token ID4966
Feature activation+2.60e+00

Pos logit contributions
 forgot+0.080
 setup+0.080
 pap+0.078
 robbers+0.077
 fact+0.077
Neg logit contributions
he-0.101
ing-0.093
 with-0.090
ore-0.083
room-0.082
Token around
Token ID1088
Feature activation+1.01e+00

Pos logit contributions
he+0.045
ays+0.042
room+0.040
ing+0.039
ane+0.038
Neg logit contributions
 forgot-0.077
 knew-0.070
 remembered-0.067
 wished-0.066
 bit-0.064
Token and
Token ID290
Feature activation-3.11e+00

Pos logit contributions
he+0.019
ays+0.017
ing+0.017
room+0.017
ane+0.016
Neg logit contributions
 forgot-0.028
 knew-0.025
 remembered-0.025
 wished-0.024
 pap-0.023
Token sniff
Token ID26300
Feature activation+3.20e+00

Pos logit contributions
 pap+0.077
 forgot+0.076
 fact+0.069
 known+0.069
 wasted+0.069
Neg logit contributions
he-0.057
ing-0.053
 with-0.046
ore-0.045
 down-0.045
Tokened
Token ID276
Feature activation+1.68e+00

Pos logit contributions
he+0.050
room+0.046
ays+0.046
ane+0.045
uggle+0.042
Neg logit contributions
 forgot-0.097
 knew-0.095
 wished-0.088
 remembered-0.088
 learned-0.085
Token the
Token ID262
Feature activation-1.99e+00

Pos logit contributions
he+0.031
ays+0.028
room+0.028
ing+0.028
ane+0.026
Neg logit contributions
 forgot-0.048
 knew-0.042
 remembered-0.041
 wished-0.041
 pap-0.039
Token grass
Token ID8701
Feature activation+4.48e+00

Pos logit contributions
 forgot+0.044
 remembered+0.037
 knew+0.037
 wished+0.035
 pap+0.034
Neg logit contributions
he-0.051
ing-0.047
room-0.046
eddy-0.045
ays-0.043
Token and
Token ID290
Feature activation-4.76e+00

Pos logit contributions
he+0.072
ays+0.066
room+0.065
eddy+0.065
uggle+0.063
Neg logit contributions
 forgot-0.128
 knew-0.120
 bit-0.115
 remembered-0.112
 wished-0.110
Token bushes
Token ID37413
Feature activation+3.12e+00

Pos logit contributions
 forgot+0.081
 pap+0.076
 known+0.074
 unknow+0.074
 wasted+0.073
Neg logit contributions
 with-0.107
ing-0.102
 down-0.098
he-0.096
 against-0.095
Token slide
Token ID10649
Feature activation+3.68e+00

Pos logit contributions
he+0.024
ays+0.022
room+0.022
ing+0.022
ings+0.021
Neg logit contributions
 forgot-0.021
 knew-0.018
 remembered-0.017
 wished-0.017
 pap-0.016
Token.
Token ID13
Feature activation+2.97e+00

Pos logit contributions
he+0.051
room+0.047
uggle+0.047
eddy+0.046
ays+0.045
Neg logit contributions
 forgot-0.116
 knew-0.106
 remembered-0.102
 wished-0.102
 bit-0.101
Token She
Token ID1375
Feature activation-5.17e+00

Pos logit contributions
he+0.048
ays+0.043
ing+0.043
room+0.041
ings+0.039
Neg logit contributions
 forgot-0.090
 knew-0.082
 remembered-0.079
 wished-0.079
 pap-0.075
Token ran
Token ID4966
Feature activation+2.75e+00

Pos logit contributions
 forgot+0.093
 pap+0.092
 known+0.091
 fact+0.086
 least+0.086
Neg logit contributions
he-0.111
ing-0.109
 with-0.103
room-0.100
ane-0.098
Token to
Token ID284
Feature activation-5.17e-02

Pos logit contributions
he+0.045
ing+0.040
ays+0.040
room+0.040
ane+0.037
Neg logit contributions
 forgot-0.084
 knew-0.075
 remembered-0.074
 wished-0.073
 pap-0.070
Token the
Token ID262
Feature activation-2.20e+00

Pos logit contributions
 forgot+0.001
 remembered+0.001
 knew+0.001
 wished+0.001
 pap+0.001
Neg logit contributions
he-0.001
room-0.001
ing-0.001
ays-0.001
ore-0.001
Token slide
Token ID10649
Feature activation+4.94e+00

Pos logit contributions
 forgot+0.053
 remembered+0.045
 knew+0.044
 wished+0.043
 pap+0.043
Neg logit contributions
he-0.052
ing-0.047
room-0.047
ays-0.045
ane-0.044
Token and
Token ID290
Feature activation-3.85e+00

Pos logit contributions
he+0.088
ays+0.083
room+0.083
eddy+0.082
uggle+0.081
Neg logit contributions
 forgot-0.133
 knew-0.124
 bit-0.115
 remembered-0.115
 wished-0.113
Token said
Token ID531
Feature activation+1.33e+00

Pos logit contributions
 pap+0.063
 forgot+0.059
 known+0.058
 fact+0.058
 cruel+0.056
Neg logit contributions
he-0.100
ing-0.095
 with-0.088
 down-0.087
ays-0.084
Token,
Token ID11
Feature activation+2.35e+00

Pos logit contributions
he+0.027
ays+0.025
eddy+0.023
room+0.023
uggle+0.021
Neg logit contributions
 forgot-0.037
 knew-0.035
 remembered-0.033
 wished-0.032
 pap-0.031
Token "
Token ID366
Feature activation+1.10e+00

Pos logit contributions
he+0.056
ays+0.052
room+0.051
ing+0.050
ane+0.048
Neg logit contributions
 forgot-0.054
 knew-0.048
 remembered-0.046
 wished-0.045
 pap-0.043
Token Dan
Token ID6035
Feature activation+9.61e-01

Pos logit contributions
he+0.032
ing+0.029
room+0.029
ays+0.029
ane+0.028
Neg logit contributions
 forgot-0.038
 knew-0.033
 remembered-0.032
 wished-0.031
 pap-0.030
Token very
Token ID845
Feature activation+1.24e+00

Pos logit contributions
he+0.013
ing+0.011
ays+0.011
room+0.011
ane+0.010
Neg logit contributions
 forgot-0.032
 knew-0.029
 remembered-0.028
 wished-0.028
 pap-0.027
Token much
Token ID881
Feature activation+2.13e+00

Pos logit contributions
he+0.030
ays+0.028
room+0.028
ing+0.028
ane+0.026
Neg logit contributions
 forgot-0.028
 knew-0.024
 remembered-0.023
 wished-0.023
 pap-0.021
Token.
Token ID13
Feature activation+2.04e+00

Pos logit contributions
he+0.028
ays+0.026
room+0.025
ing+0.024
ane+0.023
Neg logit contributions
 forgot-0.072
 knew-0.066
 wished-0.064
 remembered-0.064
 bit-0.062
Token She
Token ID1375
Feature activation-6.63e+00

Pos logit contributions
he+0.041
ing+0.038
ays+0.036
eddy+0.036
room+0.035
Neg logit contributions
 forgot-0.054
 knew-0.051
 wished-0.049
 remembered-0.047
 pap-0.046
Token does
Token ID857
Feature activation-5.41e-01

Pos logit contributions
 forgot+0.143
 known+0.134
 pap+0.133
 robbers+0.126
 crew+0.119
Neg logit contributions
ays-0.128
he-0.126
 anymore-0.113
ore-0.111
aps-0.109
Token not
Token ID407
Feature activation-3.24e+00

Pos logit contributions
 forgot+0.008
 knew+0.006
 remembered+0.006
 wished+0.006
 pap+0.006
Neg logit contributions
he-0.017
ings-0.015
room-0.015
ing-0.015
ays-0.015
Token want
Token ID765
Feature activation+5.14e-01

Pos logit contributions
 forgot+0.059
 pap+0.052
 remembered+0.049
 knew+0.049
 wished+0.048
Neg logit contributions
he-0.090
room-0.078
 anymore-0.075
ing-0.075
ore-0.074
Token to
Token ID284
Feature activation-2.46e+00

Pos logit contributions
he+0.006
ing+0.006
room+0.006
ings+0.006
ore+0.005
Neg logit contributions
 forgot-0.017
 remembered-0.015
 wished-0.015
 knew-0.015
 pap-0.014
Token live
Token ID2107
Feature activation+3.05e+00

Pos logit contributions
 forgot+0.041
 knew+0.034
 remembered+0.033
 wished+0.031
 pap+0.030
Neg logit contributions
he-0.074
room-0.069
ing-0.066
ings-0.065
ays-0.065
Token without
Token ID1231
Feature activation-2.42e+00

Pos logit contributions
ays+0.054
he+0.051
ane+0.050
ay+0.047
uggle+0.047
Neg logit contributions
 forgot-0.086
 knew-0.079
 remembered-0.079
 wished-0.077
 forgotten-0.075
Token.
Token ID13
Feature activation+4.04e+00

Pos logit contributions
he+0.048
room+0.042
ays+0.041
ane+0.038
eddy+0.038
Neg logit contributions
 forgot-0.113
 knew-0.104
 remembered-0.101
 wished-0.099
 bit-0.097
Token He
Token ID679
Feature activation-6.65e+00

Pos logit contributions
he+0.078
ays+0.071
ing+0.070
room+0.070
ane+0.066
Neg logit contributions
 forgot-0.112
 knew-0.101
 remembered-0.096
 wished-0.096
 pap-0.091
Token was
Token ID373
Feature activation-2.80e+00

Pos logit contributions
 pap+0.111
 fact+0.110
 dishonest+0.105
 unfair+0.101
 robbers+0.101
Neg logit contributions
ing-0.146
 with-0.144
he-0.143
 again-0.139
room-0.135
Token able
Token ID1498
Feature activation+9.70e-02

Pos logit contributions
 forgot+0.069
 knew+0.059
 wished+0.058
 pap+0.058
 remembered+0.057
Neg logit contributions
he-0.054
ing-0.050
 with-0.049
room-0.048
apping-0.048
Token to
Token ID284
Feature activation-3.40e+00

Pos logit contributions
ing+0.001
he+0.001
 anymore+0.001
ings+0.001
ogging+0.001
Neg logit contributions
 forgot-0.003
 wished-0.003
 knew-0.003
 pap-0.003
 fact-0.003
Token get
Token ID651
Feature activation-7.28e-01

Pos logit contributions
 forgot+0.084
 wished+0.076
 knew+0.075
 pap+0.074
 remembered+0.074
Neg logit contributions
he-0.073
room-0.063
uggle-0.060
ings-0.057
ing-0.057
Token a
Token ID257
Feature activation-3.42e+00

Pos logit contributions
 forgot+0.021
 knew+0.018
 remembered+0.018
 wished+0.017
 fact+0.017
Neg logit contributions
ing-0.013
he-0.012
ays-0.011
ore-0.011
uggle-0.011
Token lot
Token ID1256
Feature activation+1.05e+00

Pos logit contributions
 forgot+0.075
 wished+0.066
 remembered+0.064
 knew+0.063
 known+0.061
Neg logit contributions
he-0.079
ing-0.077
ore-0.075
uggle-0.072
ings-0.072
Token of
Token ID286
Feature activation-2.67e+00

Pos logit contributions
he+0.015
room+0.013
ays+0.013
ing+0.012
ane+0.012
Neg logit contributions
 forgot-0.033
 knew-0.031
 remembered-0.030
 wished-0.029
 bit-0.029
Token fish
Token ID5916
Feature activation+3.79e+00

Pos logit contributions
 forgot+0.060
 remembered+0.050
 knew+0.049
'd+0.046
 pap+0.046
Neg logit contributions
he-0.060
ore-0.056
ays-0.055
uggle-0.055
ogging-0.055
Token!
Token ID0
Feature activation+3.48e+00

Pos logit contributions
he+0.060
ays+0.053
room+0.052
ing+0.050
ane+0.048
Neg logit contributions
 forgot-0.118
 knew-0.107
 remembered-0.104
 wished-0.102
 bit-0.099
Token one
Token ID530
Feature activation-1.87e+00

Pos logit contributions
 forgot+0.014
 knew+0.012
 remembered+0.011
 wished+0.011
 pap+0.011
Neg logit contributions
he-0.014
ing-0.013
ays-0.013
room-0.013
eddy-0.012
Token else
Token ID2073
Feature activation-7.26e-02

Pos logit contributions
 forgot+0.041
 remembered+0.034
 knew+0.034
 pap+0.034
 wished+0.033
Neg logit contributions
he-0.045
ing-0.042
ore-0.041
ays-0.041
ane-0.040
Token has
Token ID468
Feature activation-1.82e+00

Pos logit contributions
 forgot+0.002
 knew+0.001
 remembered+0.001
 pap+0.001
 wished+0.001
Neg logit contributions
he-0.002
ing-0.002
ays-0.001
ore-0.001
room-0.001
Token to
Token ID284
Feature activation-3.08e+00

Pos logit contributions
 forgot+0.038
 pap+0.032
 knew+0.031
 remembered+0.031
 wished+0.030
Neg logit contributions
he-0.042
ing-0.040
ays-0.039
room-0.039
ings-0.038
Token see
Token ID766
Feature activation+3.69e-01

Pos logit contributions
 forgot+0.057
 remembered+0.046
 knew+0.046
 wished+0.043
 pap+0.043
Neg logit contributions
he-0.089
room-0.082
ings-0.077
ing-0.076
ays-0.075
Token it
Token ID340
Feature activation-4.51e-01

Pos logit contributions
he+0.008
room+0.007
ays+0.007
ing+0.007
ane+0.007
Neg logit contributions
 forgot-0.009
 knew-0.008
 remembered-0.008
 wished-0.008
 pap-0.007
Token like
Token ID588
Feature activation+2.63e-01

Pos logit contributions
 forgot+0.013
 knew+0.012
 remembered+0.011
 wished+0.011
 bit+0.011
Neg logit contributions
he-0.008
room-0.007
ays-0.007
ing-0.007
eddy-0.007
Token this
Token ID428
Feature activation-6.73e-01

Pos logit contributions
he+0.006
ing+0.005
room+0.005
ays+0.005
ane+0.005
Neg logit contributions
 forgot-0.007
 knew-0.006
 remembered-0.005
 wished-0.005
 pap-0.005
Token."
Token ID526
Feature activation+3.23e+00

Pos logit contributions
 forgot+0.020
 knew+0.018
 remembered+0.018
 wished+0.017
 pap+0.017
Neg logit contributions
he-0.012
ing-0.010
ays-0.010
room-0.010
ane-0.010
Token
Token ID220
Feature activation+4.36e+00

Pos logit contributions
he+0.069
ays+0.065
ing+0.065
room+0.064
ane+0.062
Neg logit contributions
 forgot-0.081
 knew-0.072
 remembered-0.069
 wished-0.069
 learned-0.065
Token
Token ID198
Feature activation+3.60e+00

Pos logit contributions
he+0.086
ing+0.085
ays+0.084
room+0.081
ane+0.080
Neg logit contributions
 forgot-0.099
 knew-0.092
 learned-0.084
 known-0.083
 wished-0.082
Token it
Token ID340
Feature activation+1.08e+00

Pos logit contributions
 forgot+0.030
 knew+0.025
 remembered+0.025
 wished+0.024
 pap+0.023
Neg logit contributions
he-0.025
ing-0.023
room-0.023
ays-0.023
ings-0.021
Token,
Token ID11
Feature activation-1.93e+00

Pos logit contributions
he+0.017
ays+0.015
room+0.015
ane+0.014
eddy+0.013
Neg logit contributions
 forgot-0.034
 knew-0.031
 remembered-0.030
 wished-0.030
 bit-0.029
Token fascinated
Token ID30103
Feature activation+2.05e+00

Pos logit contributions
 forgot+0.045
 knew+0.037
 remembered+0.037
 wished+0.036
 known+0.035
Neg logit contributions
he-0.041
ing-0.039
ays-0.037
ings-0.036
room-0.036
Token.
Token ID13
Feature activation+2.90e+00

Pos logit contributions
he+0.034
ays+0.034
uggle+0.033
room+0.032
ane+0.030
Neg logit contributions
 forgot-0.058
 knew-0.056
 wished-0.054
 remembered-0.052
 bit-0.050
Token His
Token ID2399
Feature activation-3.37e+00

Pos logit contributions
he+0.056
ing+0.051
ays+0.051
room+0.050
ane+0.047
Neg logit contributions
 forgot-0.079
 knew-0.071
 remembered-0.069
 wished-0.068
 pap-0.065
Token mom
Token ID1995
Feature activation-3.60e+00

Pos logit contributions
 forgot+0.073
 known+0.057
 knew+0.056
 remembered+0.056
 wished+0.055
Neg logit contributions
he-0.078
ing-0.077
ings-0.075
room-0.074
eddy-0.073
Token smiled
Token ID13541
Feature activation+1.19e+00

Pos logit contributions
 forgot+0.065
 robbers+0.051
 pap+0.051
 known+0.050
 wished+0.050
Neg logit contributions
he-0.098
ing-0.095
room-0.087
ays-0.083
eddy-0.083
Token.
Token ID13
Feature activation+2.67e+00

Pos logit contributions
he+0.023
ays+0.021
room+0.021
ane+0.020
uggle+0.020
Neg logit contributions
 forgot-0.033
 knew-0.031
 remembered-0.031
 wished-0.029
 pap-0.029
Token "
Token ID366
Feature activation-8.79e-02

Pos logit contributions
he+0.055
ing+0.050
room+0.050
ays+0.050
ane+0.047
Neg logit contributions
 forgot-0.069
 knew-0.063
 remembered-0.061
 wished-0.060
 pap-0.058
TokenGo
Token ID5247
Feature activation+4.38e-01

Pos logit contributions
 forgot+0.002
 knew+0.002
 remembered+0.002
 wished+0.002
 pap+0.002
Neg logit contributions
he-0.002
ing-0.002
room-0.002
ays-0.002
ane-0.002
Token ahead
Token ID4058
Feature activation+1.49e+00

Pos logit contributions
he+0.009
ays+0.009
room+0.008
ing+0.008
ane+0.008
Neg logit contributions
 forgot-0.011
 knew-0.010
 remembered-0.010
 wished-0.009
 bit-0.009
Token not
Token ID407
Feature activation-2.71e+00

Pos logit contributions
he+0.012
ing+0.012
ays+0.012
room+0.012
ane+0.011
Neg logit contributions
 forgot-0.004
 knew-0.003
 remembered-0.003
 wished-0.003
 pap-0.003
Token want
Token ID765
Feature activation+1.00e+00

Pos logit contributions
 forgot+0.058
 pap+0.053
 wished+0.053
 remembered+0.052
 knew+0.052
Neg logit contributions
he-0.065
room-0.054
 anymore-0.051
ore-0.050
ings-0.049
Token to
Token ID284
Feature activation-1.95e+00

Pos logit contributions
he+0.017
ing+0.016
ays+0.016
room+0.016
ings+0.015
Neg logit contributions
 forgot-0.029
 knew-0.026
 remembered-0.026
 wished-0.025
 pap-0.024
Token chew
Token ID34722
Feature activation+4.73e+00

Pos logit contributions
 forgot+0.034
 knew+0.028
 remembered+0.027
 wished+0.027
 pap+0.025
Neg logit contributions
he-0.057
room-0.053
ays-0.053
ing-0.053
ings-0.051
Token.
Token ID13
Feature activation+3.34e+00

Pos logit contributions
ane+0.102
he+0.100
ogging+0.095
apping+0.095
uggle+0.094
Neg logit contributions
 forgot-0.100
 bit-0.092
 knew-0.090
 remembered-0.088
 sp-0.085
Token They
Token ID1119
Feature activation-6.17e+00

Pos logit contributions
ing+0.088
he+0.088
ipping+0.083
ane+0.082
ooming+0.079
Neg logit contributions
 forgot-0.064
 wished-0.057
 knew-0.057
 remembered-0.052
 learned-0.052
Token wanted
Token ID2227
Feature activation+7.59e-01

Pos logit contributions
 pap+0.115
 crew+0.103
 robbers+0.101
 Indians+0.101
 payoff+0.097
Neg logit contributions
he-0.158
.-0.121
room-0.117
 anymore-0.115
aw-0.112
Token to
Token ID284
Feature activation-2.44e+00

Pos logit contributions
he+0.011
ing+0.011
ays+0.011
room+0.011
ings+0.010
Neg logit contributions
 forgot-0.023
 knew-0.021
 remembered-0.020
 wished-0.020
 known-0.019
Token play
Token ID711
Feature activation+2.84e+00

Pos logit contributions
 forgot+0.049
 remembered+0.044
 wished+0.043
 knew+0.043
 known+0.039
Neg logit contributions
he-0.063
room-0.057
uggle-0.054
ings-0.053
 anymore-0.050
Token.
Token ID13
Feature activation+3.46e+00

Pos logit contributions
ane+0.061
ays+0.056
eddy+0.054
ing+0.054
umbled+0.054
Neg logit contributions
 forgot-0.068
 bit-0.062
 remembered-0.062
 knew-0.061
 learned-0.061
Token
Token ID198
Feature activation+2.83e+00

Pos logit contributions
he+0.084
ing+0.083
ane+0.078
ipping+0.078
ays+0.076
Neg logit contributions
 forgot-0.076
 knew-0.068
 wished-0.068
 remembered-0.064
 pap-0.061
Token.
Token ID13
Feature activation+3.44e+00

Pos logit contributions
he+0.092
ays+0.089
room+0.085
uggle+0.084
eddy+0.078
Neg logit contributions
 forgot-0.175
 knew-0.160
 bit-0.151
 remembered-0.150
 stolen-0.150
Token The
Token ID383
Feature activation-2.84e+00

Pos logit contributions
he+0.077
ays+0.072
ing+0.070
ings+0.068
room+0.067
Neg logit contributions
 forgot-0.081
 knew-0.075
 wished-0.070
 remembered-0.069
 pap-0.067
Token old
Token ID1468
Feature activation-2.53e+00

Pos logit contributions
 forgot+0.057
 remembered+0.046
 knew+0.046
 known+0.044
 wished+0.043
Neg logit contributions
he-0.075
ing-0.069
room-0.068
eddy-0.067
ays-0.064
Token man
Token ID582
Feature activation-4.51e+00

Pos logit contributions
 forgot+0.058
 knew+0.047
 remembered+0.046
 known+0.046
 wished+0.046
Neg logit contributions
he-0.059
ing-0.055
room-0.053
ings-0.052
eddy-0.051
Token smiled
Token ID13541
Feature activation+2.13e+00

Pos logit contributions
 forgot+0.083
 pap+0.077
 robbers+0.075
 fact+0.074
 known+0.074
Neg logit contributions
he-0.108
ing-0.106
 with-0.097
room-0.097
ays-0.093
Token and
Token ID290
Feature activation-4.95e+00

Pos logit contributions
he+0.050
uggle+0.046
room+0.046
ays+0.045
ore+0.043
Neg logit contributions
 knew-0.052
 remembered-0.052
 forgot-0.051
 wished-0.046
 pap-0.044
Token put
Token ID1234
Feature activation+2.28e-01

Pos logit contributions
 fact+0.093
 forgot+0.085
 known+0.080
 tresp+0.079
 forgotten+0.079
Neg logit contributions
ing-0.110
 with-0.109
overed-0.099
he-0.099
 at-0.096
Token it
Token ID340
Feature activation+1.19e+00

Pos logit contributions
he+0.005
ing+0.004
ays+0.004
room+0.004
ings+0.004
Neg logit contributions
 forgot-0.006
 knew-0.005
 remembered-0.005
 wished-0.005
 pap-0.005
Token in
Token ID287
Feature activation+1.07e-01

Pos logit contributions
he+0.022
room+0.020
ays+0.019
ane+0.018
ings+0.018
Neg logit contributions
 forgot-0.034
 knew-0.031
 remembered-0.031
 wished-0.029
 bit-0.029
Token his
Token ID465
Feature activation-1.50e+00

Pos logit contributions
he+0.003
room+0.002
ays+0.002
ing+0.002
ings+0.002
Neg logit contributions
 forgot-0.003
 knew-0.002
 remembered-0.002
 wished-0.002
 pap-0.002
Token pocket
Token ID10000
Feature activation+4.22e+00

Pos logit contributions
 forgot+0.036
 known+0.029
 knew+0.029
 remembered+0.029
 wished+0.028
Neg logit contributions
he-0.033
ings-0.032
room-0.032
ay-0.032
ore-0.032
Token and
Token ID290
Feature activation-4.43e+00

Pos logit contributions
he+0.055
ays+0.054
room+0.051
ane+0.049
uggle+0.047
Neg logit contributions
 forgot-0.091
 knew-0.084
 remembered-0.080
 wished-0.079
 forgotten-0.076
Token walked
Token ID6807
Feature activation+2.08e+00

Pos logit contributions
 fact+0.079
 pap+0.077
 forgot+0.073
 vitamins+0.069
 majesty+0.068
Neg logit contributions
ing-0.102
he-0.098
 again-0.094
 with-0.092
room-0.086
Token away
Token ID1497
Feature activation+1.52e+00

Pos logit contributions
he+0.043
ing+0.039
room+0.039
ays+0.038
ings+0.036
Neg logit contributions
 forgot-0.055
 knew-0.049
 remembered-0.048
 wished-0.047
 pap-0.045
Token.
Token ID13
Feature activation+2.92e+00

Pos logit contributions
he+0.018
ays+0.016
room+0.016
ing+0.015
ane+0.014
Neg logit contributions
 forgot-0.053
 knew-0.049
 remembered-0.048
 wished-0.047
 bit-0.045
Token Tim
Token ID5045
Feature activation-3.36e+00

Pos logit contributions
he+0.056
ing+0.050
ays+0.050
room+0.050
ane+0.047
Neg logit contributions
 forgot-0.081
 knew-0.072
 remembered-0.071
 wished-0.070
 pap-0.066
Tokenmy
Token ID1820
Feature activation-5.46e+00

Pos logit contributions
 forgot+0.080
 pap+0.074
 wished+0.070
 known+0.069
 knew+0.068
Neg logit contributions
he-0.073
ing-0.067
room-0.063
eddy-0.060
ays-0.060
Token and
Token ID290
Feature activation-1.53e+00

Pos logit contributions
 pap+0.097
 known+0.087
 fact+0.085
 robbers+0.085
 forgot+0.084
Neg logit contributions
he-0.135
ing-0.124
 again-0.117
 with-0.115
room-0.115
Token his
Token ID465
Feature activation-2.88e+00

Pos logit contributions
 forgot+0.047
 bit+0.042
 forgotten+0.041
'd+0.041
 wished+0.041
Neg logit contributions
he-0.019
room-0.017
ing-0.016
 anymore-0.015
 with-0.014
Token friends
Token ID2460
Feature activation-4.22e+00

Pos logit contributions
 forgot+0.062
 wished+0.054
 knew+0.051
 remembered+0.050
 known+0.050
Neg logit contributions
he-0.064
room-0.064
ing-0.064
ings-0.060
eddy-0.059
Token felt
Token ID2936
Feature activation-3.33e+00

Pos logit contributions
 pap+0.072
 forgot+0.067
 fact+0.060
 majesty+0.058
 known+0.056
Neg logit contributions
he-0.122
ing-0.105
room-0.104
ane-0.096
 anymore-0.096
Token sad
Token ID6507
Feature activation+2.47e+00

Pos logit contributions
 forgot+0.060
 pap+0.053
 remembered+0.048
 knew+0.047
 knocked+0.044
Neg logit contributions
he-0.092
ing-0.080
 anymore-0.080
uggle-0.078
ore-0.076
Token to
Token ID284
Feature activation-2.84e+00

Pos logit contributions
he+0.025
ing+0.022
room+0.022
ays+0.022
ings+0.020
Neg logit contributions
 forgot-0.039
 knew-0.034
 remembered-0.034
 wished-0.033
 pap-0.032
Token try
Token ID1949
Feature activation+2.08e+00

Pos logit contributions
 forgot+0.058
 wished+0.050
 remembered+0.049
 knew+0.048
 known+0.045
Neg logit contributions
he-0.070
room-0.067
uggle-0.062
ings-0.060
ore-0.059
Token something
Token ID1223
Feature activation+1.00e+00

Pos logit contributions
he+0.050
room+0.046
ays+0.045
ane+0.044
ing+0.044
Neg logit contributions
 forgot-0.048
 knew-0.042
 remembered-0.042
 wished-0.039
 pap-0.039
Token new
Token ID649
Feature activation+3.17e+00

Pos logit contributions
he+0.026
ays+0.025
room+0.025
ane+0.024
ing+0.023
Neg logit contributions
 forgot-0.021
 knew-0.018
 remembered-0.017
 wished-0.017
 bit-0.016
Token.
Token ID13
Feature activation+3.30e+00

Pos logit contributions
he+0.060
ays+0.057
ane+0.054
room+0.053
eddy+0.051
Neg logit contributions
 forgot-0.085
 knew-0.081
 bit-0.078
 remembered-0.076
 wished-0.075
Token She
Token ID1375
Feature activation-5.64e+00

Pos logit contributions
he+0.054
ays+0.048
room+0.047
ing+0.047
ane+0.044
Neg logit contributions
 forgot-0.102
 knew-0.092
 remembered-0.089
 wished-0.088
 pap-0.085
Token said
Token ID531
Feature activation+1.17e+00

Pos logit contributions
 known+0.101
 fact+0.100
 forget+0.089
 forgot+0.089
 willingness+0.087
Neg logit contributions
he-0.114
 with-0.110
ing-0.108
uck-0.104
room-0.103
Token,
Token ID11
Feature activation+2.32e+00

Pos logit contributions
he+0.027
ays+0.024
eddy+0.022
room+0.022
inning+0.021
Neg logit contributions
 forgot-0.030
 knew-0.029
 remembered-0.028
 pap-0.026
 wished-0.025
Token "
Token ID366
Feature activation+1.32e+00

Pos logit contributions
he+0.061
ays+0.057
room+0.056
ing+0.054
ings+0.053
Neg logit contributions
 forgot-0.049
 knew-0.044
 remembered-0.042
 wished-0.040
 pap-0.039
TokenCan
Token ID6090
Feature activation+1.37e+00

Pos logit contributions
he+0.030
ays+0.028
ing+0.028
room+0.027
ings+0.026
Neg logit contributions
 forgot-0.032
 knew-0.028
 remembered-0.027
 wished-0.027
 pap-0.025
Token we
Token ID356
Feature activation-2.03e+00

Pos logit contributions
he+0.030
ing+0.028
room+0.028
ays+0.027
ane+0.026
Neg logit contributions
 forgot-0.034
 knew-0.030
 remembered-0.029
 wished-0.028
 pap-0.027
Token the
Token ID262
Feature activation-3.23e+00

Pos logit contributions
 forgot+0.074
 remembered+0.061
 pap+0.061
 known+0.060
 knew+0.060
Neg logit contributions
ing-0.086
he-0.078
 with-0.076
time-0.075
ore-0.072
Token kitten
Token ID42143
Feature activation-3.49e+00

Pos logit contributions
 forgot+0.070
 remembered+0.059
 known+0.056
 knew+0.056
 wished+0.056
Neg logit contributions
ing-0.076
he-0.073
ore-0.067
time-0.065
room-0.065
Token couldn
Token ID3521
Feature activation+2.74e+00

Pos logit contributions
 forgot+0.055
 pap+0.050
 known+0.049
 fact+0.045
 robbers+0.045
Neg logit contributions
he-0.090
ing-0.089
room-0.082
ings-0.081
 anymore-0.081
Token't
Token ID470
Feature activation-2.12e+00

Pos logit contributions
he+0.062
ing+0.060
room+0.060
ays+0.060
eddy+0.058
Neg logit contributions
 forgot-0.061
 knew-0.054
 remembered-0.051
 wished-0.050
 never-0.048
Token help
Token ID1037
Feature activation+1.50e+00

Pos logit contributions
 forgot+0.038
 knew+0.031
 remembered+0.030
 wished+0.029
 pap+0.028
Neg logit contributions
he-0.062
room-0.057
ing-0.056
ays-0.055
ore-0.054
Token but
Token ID475
Feature activation-4.30e+00

Pos logit contributions
he+0.027
ays+0.026
room+0.024
ane+0.024
ing+0.024
Neg logit contributions
 forgot-0.043
 knew-0.039
 remembered-0.038
 wished-0.037
 bit-0.036
Token go
Token ID467
Feature activation+1.20e+00

Pos logit contributions
 forgot+0.080
 remembered+0.066
 pap+0.065
 known+0.065
 wished+0.064
Neg logit contributions
he-0.113
ore-0.099
 anymore-0.096
uggle-0.096
ing-0.095
Token back
Token ID736
Feature activation+1.31e+00

Pos logit contributions
he+0.024
ays+0.022
room+0.022
ing+0.022
ane+0.020
Neg logit contributions
 forgot-0.032
 knew-0.029
 remembered-0.028
 wished-0.027
 pap-0.026
Token to
Token ID284
Feature activation-1.46e+00

Pos logit contributions
he+0.026
room+0.023
ays+0.023
ing+0.023
ane+0.022
Neg logit contributions
 forgot-0.036
 knew-0.032
 remembered-0.031
 wished-0.030
 forgotten-0.029
Token explore
Token ID7301
Feature activation+2.61e+00

Pos logit contributions
 forgot+0.040
 remembered+0.033
 knew+0.032
 wished+0.031
 bit+0.031
Neg logit contributions
he-0.028
room-0.025
ings-0.024
ore-0.024
 anymore-0.024
Token.
Token ID13
Feature activation+2.65e+00

Pos logit contributions
he+0.046
ays+0.046
room+0.043
ane+0.043
ing+0.041
Neg logit contributions
 forgot-0.073
 knew-0.068
 wished-0.065
 remembered-0.065
 bit-0.062