TOP ACTIVATIONS
MAX = 0.156

upon a time , a long animal lived in the forest
Suddenly , an evil man appeared . He tried
who wanted to catch a sneaky fox . He had an
a time there was a long snake . It lived in
time , there was a long snake named Sam . Sam
<|endoftext|> Once there was a long silver river that spread all
nothing . Suddenly , an evil storm blew in and destroyed
a time , in a big forest , there was a
upon a time , a long lion lived in a big
o asis . After a long walk , he heard a
time , there was a long plane . It was very
time there was a very long tree . Every day the
Tim found a great , long string in his toy box
a time , in a big forest , there was a
time , there was a long , green snake named Sam
After a long time , Bobby was nearly
unexpected . He spotted an evil dragon that was lurking outside
a time , in a big forest , there was an
stay in . After a long search , they found a
After a long day of serving , Max

MIN ACTIVATIONS
MIN = -0.266

. They were called ost ric hes . Lily and Ben
ostr ich pen . Ost ric hes are big birds with
garden and admire his sun fl owers . They also see
them band - aid s and ice packs . She tells
l aces , brace lets and ear rings . They liked
John chose some carrots and cuc umbers to take home .
and gives them band - aid s and ice packs .
ostr ich pen . Ost ric hes are big birds with
put it on his night stand . He did not want
butterfly 's wings and gently stro ked them . The butterfly
arre ling over some caul if l owers ?"
tape will hold the zero and close the hole . Then
cried . " The ost ric hes are mean ! They
big , beautiful rainbow - col oured butterfly ! The little
ready to share our happiness and play together !
promised he would come back and play again soon . He
her a band - aid and a hug . He said
you would share the toys or not . He was testing
better to share your words and your friendship . <|endoftext|> L
better by giving him hugs and telling him it was okay

INTERVAL 0.050 - 0.156
CONTAINS 11.205%

had a dream . It was a big , tall dream
Molly to come with them . Molly nods and follows them
the arm chair . It was huge , with big eyes
could zoom very , very fast . The car zoom ed
wanted to go to school . He asked his mom ,

INTERVAL -0.055 - 0.050
CONTAINS 78.351%

said . Her mum was surprised and asked : " What
, I 'm sorry , but I didn 't make you
are hurt . Let me see . We need to clean
and everyone was happy . <|endoftext|> T ara and Jack were
Mean Island , the people were not so kind to one

INTERVAL -0.161 - -0.055
CONTAINS 10.330%

She gave him the stamp and took another one . It
, saw what was happening and came over to encourage them
It can talk and sing and laugh . Do you like
The knight defeated the monster and saved Molly .
tea for herself and then sit by the window to enjoy

INTERVAL -0.266 - -0.161
CONTAINS 0.114%

. She hugged the fairy and said goodbye . From that
not go to the toy and book stalls at all .
They say they love them and the rabbits . They play
happy that he stopped crying and started laughing ! He was
she put on her coat and followed Mom my to the
Token upon
Token ID2402
Feature activation+3.40e-02

Pos logit contributions
 sunscreen+0.013
 oats+0.011
 Nem+0.010
ric+0.010
ans+0.010
Neg logit contributions
case-0.016
 code-0.016
 student-0.016
 reader-0.015
 should-0.015
Token a
Token ID257
Feature activation+1.33e-01

Pos logit contributions
case+0.035
 code+0.035
 reader+0.034
 student+0.033
Two+0.032
Neg logit contributions
 sunscreen-0.063
lights-0.062
 oats-0.061
 Nem-0.060
 them-0.059
Token time
Token ID640
Feature activation+1.37e-02

Pos logit contributions
 code+0.155
case+0.153
 reader+0.150
 student+0.146
Two+0.139
Neg logit contributions
 sunscreen-0.231
 them-0.224
 Nem-0.221
 oats-0.220
lights-0.219
Token,
Token ID11
Feature activation+4.34e-02

Pos logit contributions
case+0.010
 code+0.010
 reader+0.009
 student+0.009
ird+0.009
Neg logit contributions
 sunscreen-0.031
 oats-0.029
lights-0.029
 them-0.028
 Nem-0.028
Token a
Token ID257
Feature activation+1.38e-01

Pos logit contributions
 code+0.081
 reader+0.078
 student+0.076
case+0.075
 equation+0.074
Neg logit contributions
 them-0.050
 sunscreen-0.049
 Nem-0.047
 oats-0.044
lights-0.042
Token long
Token ID890
Feature activation+1.56e-01

Pos logit contributions
 code+0.214
case+0.201
 reader+0.197
 student+0.193
Warning+0.190
Neg logit contributions
 sunscreen-0.196
 them-0.190
 Nem-0.189
 oats-0.180
lights-0.168
Token animal
Token ID5044
Feature activation+8.64e-02

Pos logit contributions
 code+0.216
 reader+0.205
case+0.203
 student+0.201
 equation+0.189
Neg logit contributions
 sunscreen-0.245
 them-0.238
 Nem-0.230
 oats-0.229
lights-0.223
Token lived
Token ID5615
Feature activation+7.53e-02

Pos logit contributions
 code+0.138
Warning+0.131
 equation+0.129
 reader+0.125
 student+0.122
Neg logit contributions
 them-0.125
 Nem-0.123
 oats-0.114
 sunscreen-0.108
 they-0.107
Token in
Token ID287
Feature activation+6.49e-02

Pos logit contributions
 code+0.122
 reader+0.117
case+0.115
 student+0.115
 equation+0.109
Neg logit contributions
 sunscreen-0.102
 them-0.099
 Nem-0.095
 oats-0.094
lights-0.091
Token the
Token ID262
Feature activation+3.87e-02

Pos logit contributions
 code+0.100
 reader+0.095
case+0.095
 student+0.094
 equation+0.090
Neg logit contributions
 sunscreen-0.093
 them-0.092
 Nem-0.087
 oats-0.086
lights-0.082
Token forest
Token ID8222
Feature activation+5.79e-02

Pos logit contributions
 code+0.062
ird+0.060
 reader+0.060
 student+0.058
 liar+0.057
Neg logit contributions
 sunscreen-0.055
 them-0.054
 Nem-0.053
 oats-0.051
 everything-0.048
Token\n
Token ID198
Feature activation-1.10e-02

Pos logit contributions
 sunscreen+0.078
 oats+0.075
 Nem+0.070
lights+0.070
 them+0.067
Neg logit contributions
case-0.051
 code-0.046
 student-0.042
room-0.042
Two-0.041
Token\n
Token ID198
Feature activation-6.52e-03

Pos logit contributions
 sunscreen+0.016
 them+0.015
 oats+0.015
 Nem+0.015
lights+0.015
Neg logit contributions
 code-0.016
case-0.016
 reader-0.016
 student-0.015
Two-0.015
TokenSuddenly
Token ID38582
Feature activation+2.60e-02

Pos logit contributions
 sunscreen+0.007
 them+0.007
 Nem+0.006
 oats+0.006
 everything+0.005
Neg logit contributions
 code-0.013
 reader-0.012
 liar-0.012
 student-0.012
ird-0.012
Token,
Token ID11
Feature activation+2.82e-02

Pos logit contributions
 code+0.033
 reader+0.031
case+0.030
 student+0.030
 equation+0.028
Neg logit contributions
 sunscreen-0.045
 them-0.044
 Nem-0.042
 oats-0.042
lights-0.041
Token an
Token ID281
Feature activation+1.11e-01

Pos logit contributions
 code+0.047
 reader+0.046
case+0.045
 student+0.045
Two+0.043
Neg logit contributions
 sunscreen-0.035
 them-0.033
lights-0.033
 oats-0.032
 Nem-0.031
Token evil
Token ID6181
Feature activation+1.54e-01

Pos logit contributions
 code+0.196
 reader+0.189
Two+0.185
 student+0.183
ird+0.177
Neg logit contributions
 sunscreen-0.132
 oats-0.130
 Nem-0.130
 them-0.125
 as-0.115
Token man
Token ID582
Feature activation+8.22e-02

Pos logit contributions
case+0.215
ird+0.211
Warning+0.210
shop+0.210
 code+0.205
Neg logit contributions
 them-0.219
 oats-0.214
 Nem-0.211
 sunscreen-0.203
 everything-0.189
Token appeared
Token ID4120
Feature activation+7.80e-02

Pos logit contributions
 student+0.142
 reader+0.141
case+0.140
 code+0.138
 virtue+0.135
Neg logit contributions
 them-0.111
 oats-0.088
uses-0.086
 Nem-0.085
atted-0.083
Token.
Token ID13
Feature activation+6.48e-02

Pos logit contributions
 code+0.144
 reader+0.142
 student+0.142
Two+0.140
case+0.140
Neg logit contributions
 them-0.092
 sunscreen-0.079
 as-0.078
 Nem-0.076
 oats-0.075
Token He
Token ID679
Feature activation+4.84e-02

Pos logit contributions
 code+0.136
 reader+0.128
ird+0.125
 student+0.125
 clue+0.121
Neg logit contributions
 Nem-0.072
 them-0.060
 sunscreen-0.056
 oats-0.056
 as-0.053
Token tried
Token ID3088
Feature activation+3.27e-02

Pos logit contributions
 reader+0.081
 code+0.078
 student+0.078
room+0.078
ird+0.075
Neg logit contributions
 them-0.066
 Nem-0.064
 oats-0.056
 everything-0.055
 as-0.052
Token who
Token ID508
Feature activation+7.98e-03

Pos logit contributions
 code+0.135
case+0.130
 equation+0.125
Warning+0.124
Two+0.122
Neg logit contributions
 them-0.131
 Nem-0.122
 sunscreen-0.117
 oats-0.114
ric-0.110
Token wanted
Token ID2227
Feature activation+9.14e-02

Pos logit contributions
 code+0.014
Warning+0.013
 student+0.013
 equation+0.013
 reader+0.013
Neg logit contributions
 Nem-0.010
 them-0.009
 sunscreen-0.008
 everything-0.008
 oats-0.008
Token to
Token ID284
Feature activation+7.49e-02

Pos logit contributions
 code+0.145
 reader+0.138
 student+0.138
case+0.133
Two+0.129
Neg logit contributions
 sunscreen-0.122
lights-0.119
 them-0.113
 Nem-0.112
ric-0.111
Token catch
Token ID4929
Feature activation+3.29e-02

Pos logit contributions
 code+0.110
 reader+0.108
 equation+0.104
 student+0.103
case+0.103
Neg logit contributions
 sunscreen-0.113
 them-0.111
 Nem-0.104
 oats-0.099
 everything-0.090
Token a
Token ID257
Feature activation+1.02e-01

Pos logit contributions
 code+0.041
 reader+0.040
 student+0.040
 character+0.038
case+0.038
Neg logit contributions
lights-0.051
 sunscreen-0.050
ric-0.050
bur-0.046
atted-0.045
Token sneaky
Token ID48930
Feature activation+1.52e-01

Pos logit contributions
 code+0.160
case+0.152
 reader+0.151
Warning+0.150
 student+0.146
Neg logit contributions
 sunscreen-0.141
 oats-0.135
 Nem-0.134
 them-0.133
lights-0.118
Token fox
Token ID21831
Feature activation+8.00e-02

Pos logit contributions
 code+0.225
case+0.223
Warning+0.221
Two+0.209
 reader+0.207
Neg logit contributions
 oats-0.220
 sunscreen-0.219
 Nem-0.216
 them-0.214
 seas-0.187
Token.
Token ID13
Feature activation+7.56e-02

Pos logit contributions
 code+0.123
Two+0.119
case+0.116
 reader+0.114
 equation+0.114
Neg logit contributions
 them-0.116
 sunscreen-0.108
 Nem-0.107
 oats-0.107
 as-0.099
Token He
Token ID679
Feature activation+5.11e-02

Pos logit contributions
 code+0.130
 reader+0.123
 student+0.119
 liar+0.118
case+0.117
Neg logit contributions
 them-0.098
 Nem-0.096
 sunscreen-0.095
 oats-0.087
lights-0.084
Token had
Token ID550
Feature activation+8.14e-02

Pos logit contributions
 code+0.086
 reader+0.082
 student+0.078
 equation+0.077
case+0.075
Neg logit contributions
 Nem-0.075
 them-0.075
 oats-0.069
 sunscreen-0.069
 everything-0.057
Token an
Token ID281
Feature activation+9.49e-02

Pos logit contributions
 code+0.111
 reader+0.105
case+0.104
 student+0.104
Two+0.097
Neg logit contributions
 sunscreen-0.130
 them-0.126
 Nem-0.123
 oats-0.122
lights-0.120
Token a
Token ID257
Feature activation+1.33e-01

Pos logit contributions
 code+0.038
case+0.037
 reader+0.037
 student+0.036
Two+0.034
Neg logit contributions
 sunscreen-0.061
lights-0.058
 oats-0.058
 them-0.058
 Nem-0.057
Token time
Token ID640
Feature activation+1.43e-02

Pos logit contributions
 code+0.172
 reader+0.166
case+0.166
 student+0.162
Two+0.153
Neg logit contributions
 sunscreen-0.220
 them-0.213
 Nem-0.208
 oats-0.207
lights-0.204
Token there
Token ID612
Feature activation+2.81e-02

Pos logit contributions
case+0.011
 code+0.010
 student+0.009
 reader+0.009
Two+0.009
Neg logit contributions
 sunscreen-0.032
 oats-0.031
lights-0.030
ans-0.029
 them-0.029
Token was
Token ID373
Feature activation+9.98e-02

Pos logit contributions
 code+0.045
 reader+0.045
 student+0.043
 liar+0.042
 equation+0.040
Neg logit contributions
 them-0.042
 Nem-0.040
 sunscreen-0.037
 everything-0.036
 oats-0.035
Token a
Token ID257
Feature activation+1.34e-01

Pos logit contributions
 code+0.134
 reader+0.127
case+0.126
 student+0.125
 equation+0.117
Neg logit contributions
 sunscreen-0.163
 them-0.158
 Nem-0.153
 oats-0.152
lights-0.148
Token long
Token ID890
Feature activation+1.51e-01

Pos logit contributions
 code+0.237
Warning+0.227
case+0.223
ird+0.212
 reader+0.210
Neg logit contributions
 Nem-0.169
 sunscreen-0.167
 oats-0.156
 them-0.153
 everything-0.132
Token snake
Token ID17522
Feature activation+7.07e-02

Pos logit contributions
 code+0.227
Warning+0.213
 reader+0.207
case+0.203
 student+0.202
Neg logit contributions
 sunscreen-0.231
 Nem-0.220
 them-0.216
 oats-0.213
lights-0.192
Token.
Token ID13
Feature activation+9.15e-02

Pos logit contributions
 code+0.115
Warning+0.111
 reader+0.109
 equation+0.107
shop+0.105
Neg logit contributions
 them-0.092
 Nem-0.087
 oats-0.084
uses-0.082
 sunscreen-0.082
Token It
Token ID632
Feature activation+6.86e-04

Pos logit contributions
 code+0.207
ird+0.187
 reader+0.185
 student+0.184
 liar+0.183
Neg logit contributions
 Nem-0.089
 them-0.071
 sunscreen-0.067
 Wed-0.060
 everything-0.060
Token lived
Token ID5615
Feature activation+7.42e-02

Pos logit contributions
 code+0.001
 reader+0.001
 student+0.001
case+0.001
 equation+0.001
Neg logit contributions
 sunscreen-0.001
 them-0.001
 Nem-0.001
 oats-0.001
lights-0.001
Token in
Token ID287
Feature activation+6.26e-02

Pos logit contributions
 code+0.130
 reader+0.125
case+0.124
 student+0.123
ird+0.120
Neg logit contributions
 sunscreen-0.089
 them-0.089
 Nem-0.084
 oats-0.082
cast-0.077
Token time
Token ID640
Feature activation+1.48e-02

Pos logit contributions
 code+0.173
case+0.166
 reader+0.166
 student+0.162
Two+0.153
Neg logit contributions
 sunscreen-0.220
 them-0.213
 Nem-0.208
 oats-0.207
lights-0.205
Token,
Token ID11
Feature activation+4.37e-02

Pos logit contributions
case+0.012
 code+0.012
 reader+0.010
 student+0.010
Two+0.010
Neg logit contributions
 sunscreen-0.033
 oats-0.031
lights-0.030
 them-0.030
 Nem-0.030
Token there
Token ID612
Feature activation+2.81e-02

Pos logit contributions
 code+0.080
 reader+0.078
 student+0.076
case+0.075
 equation+0.073
Neg logit contributions
 them-0.050
 sunscreen-0.049
 Nem-0.046
 oats-0.044
lights-0.043
Token was
Token ID373
Feature activation+1.01e-01

Pos logit contributions
 code+0.038
 reader+0.037
 student+0.036
case+0.034
 equation+0.034
Neg logit contributions
 them-0.047
 sunscreen-0.044
 Nem-0.044
 oats-0.042
lights-0.040
Token a
Token ID257
Feature activation+1.34e-01

Pos logit contributions
 code+0.132
 reader+0.126
case+0.125
 student+0.124
Two+0.116
Neg logit contributions
 sunscreen-0.167
 them-0.162
 Nem-0.157
 oats-0.157
lights-0.153
Token long
Token ID890
Feature activation+1.49e-01

Pos logit contributions
 code+0.227
Warning+0.216
case+0.213
ird+0.202
 equation+0.202
Neg logit contributions
 sunscreen-0.174
 Nem-0.171
 oats-0.164
 them-0.163
 everything-0.140
Token snake
Token ID17522
Feature activation+6.59e-02

Pos logit contributions
 code+0.228
Warning+0.224
ird+0.204
 reader+0.200
case+0.200
Neg logit contributions
 sunscreen-0.223
 Nem-0.211
 oats-0.209
 them-0.206
 everything-0.179
Token named
Token ID3706
Feature activation+8.13e-03

Pos logit contributions
 code+0.107
Warning+0.105
 equation+0.100
shop+0.098
 reader+0.098
Neg logit contributions
 them-0.085
 Nem-0.079
 oats-0.076
ric-0.076
 sunscreen-0.075
Token Sam
Token ID3409
Feature activation+8.49e-02

Pos logit contributions
 code+0.015
 reader+0.014
 student+0.014
case+0.013
 equation+0.013
Neg logit contributions
 them-0.010
 Nem-0.010
 sunscreen-0.010
 oats-0.009
lights-0.008
Token.
Token ID13
Feature activation+8.82e-02

Pos logit contributions
 code+0.100
 reader+0.095
case+0.094
 student+0.093
Two+0.087
Neg logit contributions
 sunscreen-0.151
 them-0.149
 Nem-0.144
 oats-0.142
lights-0.138
Token Sam
Token ID3409
Feature activation+9.31e-02

Pos logit contributions
ird+0.177
 code+0.177
 reader+0.167
 liar+0.167
 student+0.165
Neg logit contributions
 them-0.083
 sunscreen-0.079
 Nem-0.077
cast-0.072
 everything-0.068
Token<|endoftext|>
Token ID50256
Feature activation+3.25e-02

Pos logit contributions
 code+0.064
 student+0.060
 reader+0.060
 liar+0.059
 equation+0.056
Neg logit contributions
 them-0.060
 sunscreen-0.059
 Nem-0.056
 everything-0.054
lights-0.054
TokenOnce
Token ID7454
Feature activation-9.36e-03

Pos logit contributions
 code+0.058
 reader+0.057
 student+0.056
 liar+0.054
 equation+0.053
Neg logit contributions
 them-0.039
 sunscreen-0.036
 Nem-0.036
 oats-0.033
lights-0.032
Token there
Token ID612
Feature activation+2.39e-02

Pos logit contributions
 sunscreen+0.013
 oats+0.012
 Nem+0.011
lights+0.010
ans+0.010
Neg logit contributions
case-0.016
 code-0.015
 student-0.015
 reader-0.014
 should-0.014
Token was
Token ID373
Feature activation+1.00e-01

Pos logit contributions
 code+0.023
case+0.022
 reader+0.021
 student+0.021
Two+0.019
Neg logit contributions
 sunscreen-0.048
 them-0.045
 oats-0.045
lights-0.045
 Nem-0.044
Token a
Token ID257
Feature activation+1.34e-01

Pos logit contributions
 reader+0.109
 code+0.107
case+0.107
 student+0.105
Two+0.100
Neg logit contributions
 sunscreen-0.177
lights-0.174
 them-0.172
 oats-0.169
 Nem-0.168
Token long
Token ID890
Feature activation+1.49e-01

Pos logit contributions
 code+0.236
Warning+0.223
case+0.217
 equation+0.207
ird+0.206
Neg logit contributions
 sunscreen-0.167
 Nem-0.165
 oats-0.157
 them-0.154
 everything-0.134
Token silver
Token ID8465
Feature activation+4.97e-02

Pos logit contributions
 code+0.224
Warning+0.207
 reader+0.200
case+0.197
 student+0.196
Neg logit contributions
 sunscreen-0.228
 Nem-0.214
 them-0.212
 oats-0.210
lights-0.187
Token river
Token ID7850
Feature activation+5.34e-02

Pos logit contributions
 code+0.071
Warning+0.070
case+0.069
 reader+0.067
ird+0.066
Neg logit contributions
 sunscreen-0.072
 oats-0.071
 Nem-0.069
 them-0.068
 seas-0.064
Token that
Token ID326
Feature activation+3.10e-02

Pos logit contributions
 code+0.075
 reader+0.072
 student+0.069
case+0.067
ird+0.066
Neg logit contributions
 them-0.081
 sunscreen-0.081
 Nem-0.079
 oats-0.077
lights-0.074
Token spread
Token ID4104
Feature activation+1.56e-02

Pos logit contributions
 code+0.030
case+0.029
 reader+0.026
 student+0.026
Two+0.025
Neg logit contributions
 sunscreen-0.062
 them-0.059
 oats-0.058
lights-0.058
 Nem-0.057
Token all
Token ID477
Feature activation-9.49e-03

Pos logit contributions
 code+0.020
case+0.020
 student+0.019
 reader+0.019
room+0.018
Neg logit contributions
 sunscreen-0.025
lights-0.023
 Nem-0.023
 oats-0.023
 them-0.022
Token nothing
Token ID2147
Feature activation+5.85e-03

Pos logit contributions
 code+0.133
 reader+0.128
 student+0.127
case+0.127
 equation+0.121
Neg logit contributions
 sunscreen-0.075
 them-0.071
 Nem-0.068
 oats-0.068
lights-0.067
Token.
Token ID13
Feature activation+5.26e-02

Pos logit contributions
 code+0.007
 reader+0.007
 student+0.007
case+0.007
Two+0.007
Neg logit contributions
 sunscreen-0.010
 them-0.010
 Nem-0.009
 oats-0.009
lights-0.009
Token Suddenly
Token ID24975
Feature activation+2.48e-02

Pos logit contributions
case+0.071
 code+0.070
 reader+0.067
 student+0.067
Two+0.065
Neg logit contributions
 sunscreen-0.086
 them-0.080
 oats-0.080
lights-0.078
 Nem-0.075
Token,
Token ID11
Feature activation+2.16e-02

Pos logit contributions
 code+0.029
 reader+0.027
case+0.027
 student+0.027
Two+0.025
Neg logit contributions
 sunscreen-0.044
 them-0.043
 Nem-0.041
 oats-0.041
lights-0.041
Token an
Token ID281
Feature activation+1.06e-01

Pos logit contributions
 code+0.037
 reader+0.037
 student+0.037
case+0.036
Two+0.035
Neg logit contributions
 sunscreen-0.024
lights-0.023
 them-0.023
 oats-0.022
ric-0.022
Token evil
Token ID6181
Feature activation+1.48e-01

Pos logit contributions
 code+0.175
case+0.169
 reader+0.168
Two+0.164
 student+0.164
Neg logit contributions
 oats-0.144
 sunscreen-0.140
 Nem-0.139
 them-0.138
ans-0.118
Token storm
Token ID6388
Feature activation+5.60e-02

Pos logit contributions
case+0.201
 code+0.194
 reader+0.187
shop+0.186
 student+0.181
Neg logit contributions
 them-0.234
 oats-0.233
 sunscreen-0.228
 Nem-0.226
lights-0.202
Token blew
Token ID17948
Feature activation+1.35e-02

Pos logit contributions
 reader+0.091
 code+0.090
 student+0.088
case+0.083
shop+0.082
Neg logit contributions
 them-0.079
 Nem-0.069
lights-0.069
 sunscreen-0.068
 oats-0.066
Token in
Token ID287
Feature activation+3.62e-02

Pos logit contributions
 code+0.026
 reader+0.025
 student+0.025
case+0.025
Two+0.024
Neg logit contributions
 them-0.015
 sunscreen-0.014
 Nem-0.013
 oats-0.013
lights-0.012
Token and
Token ID290
Feature activation-7.39e-02

Pos logit contributions
case+0.079
 reader+0.079
 code+0.078
 student+0.076
shop+0.075
Neg logit contributions
 them-0.035
 sunscreen-0.031
 oats-0.028
 Nem-0.028
 everything-0.024
Token destroyed
Token ID6572
Feature activation+6.46e-02

Pos logit contributions
 sunscreen+0.093
 them+0.089
 oats+0.085
 Nem+0.085
lights+0.083
Neg logit contributions
 code-0.125
 reader-0.120
case-0.120
 student-0.119
Two-0.113
Token a
Token ID257
Feature activation+1.40e-01

Pos logit contributions
 code+0.045
case+0.044
 reader+0.043
 student+0.042
Two+0.040
Neg logit contributions
 sunscreen-0.075
lights-0.072
 oats-0.071
 them-0.071
 Nem-0.071
Token time
Token ID640
Feature activation+2.19e-02

Pos logit contributions
 code+0.186
case+0.178
 reader+0.178
 student+0.174
Two+0.165
Neg logit contributions
 sunscreen-0.227
 them-0.220
 Nem-0.214
 oats-0.214
lights-0.210
Token,
Token ID11
Feature activation+4.93e-02

Pos logit contributions
 code+0.019
case+0.019
 reader+0.017
 student+0.017
Two+0.016
Neg logit contributions
 sunscreen-0.046
 oats-0.044
lights-0.043
 them-0.043
 Nem-0.042
Token in
Token ID287
Feature activation+7.78e-02

Pos logit contributions
 code+0.090
 reader+0.089
 student+0.087
case+0.085
 equation+0.083
Neg logit contributions
 them-0.057
 sunscreen-0.056
 Nem-0.053
 oats-0.050
lights-0.048
Token a
Token ID257
Feature activation+1.33e-01

Pos logit contributions
case+0.043
 reader+0.041
 x+0.041
 student+0.039
 code+0.038
Neg logit contributions
cast-0.181
lights-0.180
uses-0.178
atted-0.173
oming-0.172
Token big
Token ID1263
Feature activation+1.47e-01

Pos logit contributions
 code+0.192
 reader+0.181
case+0.177
 student+0.176
ird+0.169
Neg logit contributions
 sunscreen-0.208
 them-0.204
 Nem-0.194
 oats-0.192
lights-0.179
Token forest
Token ID8222
Feature activation+7.53e-02

Pos logit contributions
 code+0.209
ird+0.193
case+0.189
 liar+0.189
Warning+0.189
Neg logit contributions
 sunscreen-0.237
 them-0.231
 Nem-0.221
 oats-0.221
 everything-0.206
Token,
Token ID11
Feature activation+3.93e-02

Pos logit contributions
 code+0.105
 reader+0.103
 student+0.100
case+0.097
 equation+0.095
Neg logit contributions
 them-0.116
 sunscreen-0.114
 Nem-0.110
 oats-0.107
lights-0.107
Token there
Token ID612
Feature activation+3.28e-02

Pos logit contributions
 reader+0.075
 code+0.074
 student+0.073
 equation+0.070
case+0.069
Neg logit contributions
 them-0.042
 sunscreen-0.041
 Nem-0.039
 oats-0.037
lights-0.035
Token was
Token ID373
Feature activation+1.01e-01

Pos logit contributions
 reader+0.049
 code+0.049
 student+0.047
 liar+0.045
 client+0.045
Neg logit contributions
 them-0.052
 Nem-0.048
 sunscreen-0.047
 everything-0.045
 oats-0.043
Token a
Token ID257
Feature activation+1.31e-01

Pos logit contributions
 code+0.186
 reader+0.179
 student+0.175
ird+0.171
 client+0.169
Neg logit contributions
 sunscreen-0.124
 them-0.122
 Nem-0.114
 oats-0.111
 everything-0.104
Token upon
Token ID2402
Feature activation+2.32e-02

Pos logit contributions
 sunscreen+0.024
 oats+0.021
ans+0.018
 Nem+0.018
ric+0.017
Neg logit contributions
case-0.033
 code-0.031
 student-0.030
 reader-0.029
 should-0.029
Token a
Token ID257
Feature activation+1.25e-01

Pos logit contributions
 code+0.028
 reader+0.027
case+0.027
 student+0.026
Two+0.025
Neg logit contributions
 sunscreen-0.040
 them-0.038
 Nem-0.038
 oats-0.038
lights-0.038
Token time
Token ID640
Feature activation+7.40e-03

Pos logit contributions
 code+0.161
 reader+0.155
case+0.155
 student+0.152
Two+0.143
Neg logit contributions
 sunscreen-0.206
 them-0.200
 Nem-0.195
 oats-0.195
lights-0.192
Token,
Token ID11
Feature activation+3.61e-02

Pos logit contributions
case+0.006
 code+0.006
 reader+0.005
 student+0.005
Two+0.005
Neg logit contributions
 sunscreen-0.016
 oats-0.016
lights-0.015
 them-0.015
ans-0.015
Token a
Token ID257
Feature activation+1.29e-01

Pos logit contributions
 code+0.066
 reader+0.063
 student+0.062
case+0.061
 equation+0.060
Neg logit contributions
 them-0.042
 sunscreen-0.042
 Nem-0.040
 oats-0.038
lights-0.037
Token long
Token ID890
Feature activation+1.47e-01

Pos logit contributions
 code+0.204
case+0.193
 reader+0.186
Warning+0.184
 student+0.181
Neg logit contributions
 sunscreen-0.182
 Nem-0.177
 them-0.175
 oats-0.168
lights-0.153
Token lion
Token ID18744
Feature activation+7.14e-02

Pos logit contributions
 code+0.202
 reader+0.193
case+0.190
 student+0.189
Two+0.177
Neg logit contributions
 sunscreen-0.231
 them-0.224
 Nem-0.217
 oats-0.216
lights-0.211
Token lived
Token ID5615
Feature activation+6.65e-02

Pos logit contributions
 code+0.106
 student+0.098
 reader+0.098
 equation+0.097
case+0.096
Neg logit contributions
 them-0.108
 Nem-0.104
 sunscreen-0.094
uses-0.093
 oats-0.093
Token in
Token ID287
Feature activation+5.39e-02

Pos logit contributions
 code+0.106
 reader+0.102
case+0.101
 student+0.100
Two+0.095
Neg logit contributions
 sunscreen-0.091
 them-0.088
 Nem-0.085
 oats-0.084
lights-0.082
Token a
Token ID257
Feature activation+1.05e-01

Pos logit contributions
 code+0.080
 reader+0.076
case+0.075
 student+0.075
 equation+0.071
Neg logit contributions
 sunscreen-0.080
 them-0.079
 Nem-0.075
 oats-0.074
lights-0.071
Token big
Token ID1263
Feature activation+1.21e-01

Pos logit contributions
 code+0.165
 reader+0.153
ird+0.151
 student+0.150
 liar+0.146
Neg logit contributions
 them-0.150
 sunscreen-0.148
 Nem-0.142
 oats-0.138
 everything-0.129
Token o
Token ID267
Feature activation+2.99e-03

Pos logit contributions
 code+0.142
 student+0.141
 reader+0.139
Two+0.138
case+0.137
Neg logit contributions
 sunscreen-0.099
 oats-0.098
 Nem-0.097
 them-0.086
 peanuts-0.082
Tokenasis
Token ID17765
Feature activation+6.56e-03

Pos logit contributions
 code+0.004
case+0.004
 reader+0.004
Two+0.004
ird+0.004
Neg logit contributions
 them-0.005
 sunscreen-0.005
 Nem-0.004
lights-0.004
 oats-0.004
Token.
Token ID13
Feature activation+6.24e-02

Pos logit contributions
 code+0.008
 reader+0.007
case+0.007
 student+0.007
 equation+0.007
Neg logit contributions
 sunscreen-0.012
 them-0.011
 Nem-0.011
 oats-0.011
lights-0.011
Token After
Token ID2293
Feature activation+3.37e-02

Pos logit contributions
case+0.070
 code+0.064
Two+0.063
 reader+0.061
shop+0.060
Neg logit contributions
 sunscreen-0.116
lights-0.113
 them-0.112
 oats-0.112
cast-0.104
Token a
Token ID257
Feature activation+1.23e-01

Pos logit contributions
 code+0.046
 reader+0.044
case+0.043
 student+0.043
Two+0.041
Neg logit contributions
 sunscreen-0.053
 them-0.050
lights-0.050
 Nem-0.050
 oats-0.049
Token long
Token ID890
Feature activation+1.47e-01

Pos logit contributions
 code+0.189
 reader+0.179
 student+0.178
case+0.176
Two+0.171
Neg logit contributions
 sunscreen-0.174
 them-0.168
lights-0.163
 oats-0.161
 Nem-0.160
Token walk
Token ID2513
Feature activation+4.45e-02

Pos logit contributions
 code+0.173
case+0.163
 student+0.160
 reader+0.158
 figure+0.154
Neg logit contributions
lights-0.249
 sunscreen-0.249
 them-0.245
cast-0.239
 oats-0.238
Token,
Token ID11
Feature activation+1.67e-02

Pos logit contributions
 code+0.058
 reader+0.055
 student+0.054
case+0.054
 equation+0.051
Neg logit contributions
 sunscreen-0.074
 them-0.073
 Nem-0.071
 oats-0.070
lights-0.068
Token he
Token ID339
Feature activation+4.62e-02

Pos logit contributions
case+0.027
 code+0.026
Two+0.026
 figure+0.026
 should+0.025
Neg logit contributions
 sunscreen-0.019
ric-0.019
lights-0.019
ans-0.019
 oats-0.018
Token heard
Token ID2982
Feature activation+6.97e-02

Pos logit contributions
 code+0.041
case+0.038
 reader+0.038
 student+0.037
Two+0.034
Neg logit contributions
 sunscreen-0.096
 them-0.092
lights-0.092
 oats-0.090
 Nem-0.089
Token a
Token ID257
Feature activation+1.00e-01

Pos logit contributions
 code+0.084
case+0.080
 reader+0.079
 student+0.078
Two+0.073
Neg logit contributions
 sunscreen-0.121
 them-0.113
 Nem-0.113
 oats-0.112
lights-0.112
Token time
Token ID640
Feature activation+1.22e-02

Pos logit contributions
 code+0.165
case+0.161
 reader+0.157
 student+0.154
Two+0.146
Neg logit contributions
 sunscreen-0.224
 them-0.217
 Nem-0.213
 oats-0.212
lights-0.210
Token,
Token ID11
Feature activation+4.14e-02

Pos logit contributions
case+0.009
 code+0.008
Two+0.007
 reader+0.007
 student+0.007
Neg logit contributions
 sunscreen-0.028
 oats-0.027
lights-0.026
 them-0.025
 Nem-0.025
Token there
Token ID612
Feature activation+2.59e-02

Pos logit contributions
 code+0.073
 reader+0.071
 student+0.070
case+0.069
 equation+0.066
Neg logit contributions
 them-0.049
 sunscreen-0.049
 Nem-0.046
 oats-0.045
lights-0.044
Token was
Token ID373
Feature activation+9.98e-02

Pos logit contributions
 code+0.034
 reader+0.033
 student+0.032
case+0.030
 liar+0.030
Neg logit contributions
 them-0.044
 sunscreen-0.042
 Nem-0.041
 oats-0.039
lights-0.038
Token a
Token ID257
Feature activation+1.31e-01

Pos logit contributions
 code+0.138
 reader+0.131
case+0.129
 student+0.128
 equation+0.121
Neg logit contributions
 sunscreen-0.160
 them-0.156
 Nem-0.151
 oats-0.149
lights-0.143
Token long
Token ID890
Feature activation+1.46e-01

Pos logit contributions
 code+0.219
Warning+0.211
case+0.208
ird+0.198
 equation+0.196
Neg logit contributions
 sunscreen-0.171
 Nem-0.170
 oats-0.161
 them-0.161
 everything-0.138
Token plane
Token ID6614
Feature activation+3.20e-02

Pos logit contributions
 code+0.218
Warning+0.211
 reader+0.195
ird+0.195
case+0.195
Neg logit contributions
 sunscreen-0.223
 Nem-0.212
 them-0.209
 oats-0.207
lights-0.182
Token.
Token ID13
Feature activation+9.09e-02

Pos logit contributions
 code+0.051
 reader+0.049
 student+0.047
 equation+0.047
Warning+0.046
Neg logit contributions
 them-0.043
 sunscreen-0.041
lights-0.039
 Nem-0.039
 oats-0.038
Token It
Token ID632
Feature activation-2.99e-03

Pos logit contributions
 code+0.194
 liar+0.187
 equation+0.182
 student+0.181
 reader+0.179
Neg logit contributions
 Nem-0.074
 them-0.071
 as-0.065
 sunscreen-0.062
icles-0.060
Token was
Token ID373
Feature activation+9.34e-02

Pos logit contributions
 sunscreen+0.006
 them+0.006
 oats+0.006
 Nem+0.006
lights+0.005
Neg logit contributions
 code-0.003
case-0.003
 reader-0.003
 student-0.003
Two-0.003
Token very
Token ID845
Feature activation+6.90e-02

Pos logit contributions
 code+0.130
 reader+0.123
case+0.123
 student+0.121
 equation+0.113
Neg logit contributions
 sunscreen-0.147
 them-0.142
 Nem-0.138
 oats-0.138
lights-0.135
Token time
Token ID640
Feature activation+1.16e-02

Pos logit contributions
 code+0.171
 reader+0.164
case+0.163
 student+0.161
Two+0.151
Neg logit contributions
 sunscreen-0.208
 them-0.202
 Nem-0.196
 oats-0.195
lights-0.192
Token there
Token ID612
Feature activation+2.60e-02

Pos logit contributions
case+0.009
 code+0.009
 reader+0.008
 student+0.008
ird+0.007
Neg logit contributions
 sunscreen-0.026
 oats-0.024
lights-0.024
ans-0.023
 them-0.023
Token was
Token ID373
Feature activation+9.82e-02

Pos logit contributions
 code+0.043
 reader+0.042
 student+0.041
 liar+0.040
 client+0.038
Neg logit contributions
 them-0.038
 Nem-0.036
 everything-0.033
 sunscreen-0.032
 oats-0.031
Token a
Token ID257
Feature activation+1.32e-01

Pos logit contributions
 code+0.123
 reader+0.118
case+0.116
 student+0.116
Two+0.108
Neg logit contributions
 sunscreen-0.165
 them-0.160
 oats-0.156
 Nem-0.155
lights-0.155
Token very
Token ID845
Feature activation+8.20e-02

Pos logit contributions
 code+0.230
Warning+0.221
case+0.216
ird+0.205
 equation+0.204
Neg logit contributions
 Nem-0.167
 sunscreen-0.165
 oats-0.155
 them-0.152
 everything-0.131
Token long
Token ID890
Feature activation+1.46e-01

Pos logit contributions
 code+0.132
 reader+0.122
case+0.122
Warning+0.122
 student+0.120
Neg logit contributions
 sunscreen-0.112
 Nem-0.109
 oats-0.105
 them-0.104
lights-0.096
Token tree
Token ID5509
Feature activation+7.19e-02

Pos logit contributions
 code+0.215
Warning+0.207
case+0.194
ird+0.193
 reader+0.192
Neg logit contributions
 sunscreen-0.222
 Nem-0.212
 oats-0.208
 them-0.206
lights-0.185
Token.
Token ID13
Feature activation+9.15e-02

Pos logit contributions
 code+0.105
 reader+0.101
 student+0.099
 liar+0.095
 equation+0.095
Neg logit contributions
 them-0.104
 sunscreen-0.102
 Nem-0.099
 oats-0.098
lights-0.097
Token Every
Token ID3887
Feature activation+3.83e-02

Pos logit contributions
 code+0.181
 liar+0.167
 reader+0.158
 student+0.158
 client+0.157
Neg logit contributions
 Nem-0.098
 them-0.089
 sunscreen-0.088
 Wed-0.082
 everything-0.077
Token day
Token ID1110
Feature activation+5.17e-02

Pos logit contributions
 code+0.074
 reader+0.069
 student+0.068
ird+0.068
case+0.067
Neg logit contributions
 sunscreen-0.039
 oats-0.037
 them-0.037
 Nem-0.035
lights-0.030
Token the
Token ID262
Feature activation+3.40e-02

Pos logit contributions
 code+0.086
 reader+0.082
case+0.081
 student+0.080
 equation+0.078
Neg logit contributions
 sunscreen-0.069
 them-0.069
 Nem-0.065
 oats-0.063
lights-0.059
Token Tim
Token ID5045
Feature activation+6.35e-02

Pos logit contributions
 code+0.032
 reader+0.031
case+0.030
 student+0.030
 equation+0.029
Neg logit contributions
 sunscreen-0.021
 them-0.020
 Nem-0.019
 oats-0.019
lights-0.018
Token found
Token ID1043
Feature activation+6.90e-02

Pos logit contributions
 code+0.115
case+0.110
 student+0.110
 reader+0.109
 liar+0.105
Neg logit contributions
 them-0.079
 sunscreen-0.074
 Nem-0.074
 oats-0.067
 everything-0.063
Token a
Token ID257
Feature activation+1.16e-01

Pos logit contributions
 code+0.069
 student+0.065
case+0.065
 reader+0.063
Two+0.061
Neg logit contributions
lights-0.130
 sunscreen-0.129
cast-0.123
 oats-0.121
 them-0.119
Token great
Token ID1049
Feature activation+9.97e-02

Pos logit contributions
Warning+0.179
 code+0.177
 reader+0.175
ird+0.175
case+0.173
Neg logit contributions
 sunscreen-0.156
 Nem-0.149
 oats-0.148
 them-0.141
 seas-0.131
Token,
Token ID11
Feature activation+2.04e-02

Pos logit contributions
Warning+0.129
 code+0.129
ird+0.125
case+0.125
Two+0.122
Neg logit contributions
 sunscreen-0.158
 Nem-0.154
 oats-0.151
 them-0.148
 seas-0.139
Token long
Token ID890
Feature activation+1.45e-01

Pos logit contributions
 code+0.031
 reader+0.030
case+0.030
 student+0.030
Two+0.028
Neg logit contributions
 sunscreen-0.029
 them-0.028
 Nem-0.027
 oats-0.027
lights-0.025
Token string
Token ID4731
Feature activation+2.99e-02

Pos logit contributions
Warning+0.185
 code+0.184
 reader+0.179
 student+0.176
ird+0.175
Neg logit contributions
 sunscreen-0.235
 Nem-0.218
 oats-0.217
 them-0.210
 seas-0.198
Token in
Token ID287
Feature activation+4.28e-02

Pos logit contributions
 reader+0.045
 code+0.045
 student+0.044
Two+0.042
 equation+0.042
Neg logit contributions
 them-0.042
 sunscreen-0.041
 Nem-0.039
 oats-0.039
lights-0.038
Token his
Token ID465
Feature activation+5.19e-02

Pos logit contributions
case+0.108
 reader+0.108
 code+0.106
 student+0.104
lder+0.103
Neg logit contributions
 them-0.024
 sunscreen-0.021
 Nem-0.017
 oats-0.015
 everything-0.013
Token toy
Token ID13373
Feature activation+4.67e-02

Pos logit contributions
 code+0.058
 reader+0.055
case+0.053
 student+0.053
 equation+0.050
Neg logit contributions
 sunscreen-0.097
 them-0.093
 Nem-0.091
 oats-0.090
lights-0.087
Token box
Token ID3091
Feature activation+6.31e-02

Pos logit contributions
 reader+0.053
 equation+0.050
 liar+0.047
 code+0.047
 student+0.046
Neg logit contributions
 them-0.090
 sunscreen-0.086
 Nem-0.083
 oats-0.082
 seas-0.079
Token a
Token ID257
Feature activation+1.37e-01

Pos logit contributions
 code+0.053
 reader+0.050
case+0.050
 student+0.049
Two+0.046
Neg logit contributions
 sunscreen-0.067
 them-0.065
 Nem-0.063
 oats-0.063
lights-0.061
Token time
Token ID640
Feature activation+2.02e-02

Pos logit contributions
 code+0.183
 reader+0.175
case+0.175
 student+0.172
Two+0.162
Neg logit contributions
 sunscreen-0.220
 them-0.213
 Nem-0.207
 oats-0.206
lights-0.202
Token,
Token ID11
Feature activation+4.84e-02

Pos logit contributions
case+0.016
 code+0.015
 reader+0.013
 student+0.013
Two+0.013
Neg logit contributions
 sunscreen-0.045
 oats-0.043
lights-0.042
 Nem-0.042
 them-0.042
Token in
Token ID287
Feature activation+7.73e-02

Pos logit contributions
 code+0.093
 reader+0.090
 student+0.088
case+0.085
 equation+0.085
Neg logit contributions
 them-0.052
 sunscreen-0.050
 Nem-0.047
 oats-0.045
lights-0.044
Token a
Token ID257
Feature activation+1.31e-01

Pos logit contributions
 reader+0.062
case+0.062
 student+0.061
 code+0.060
Two+0.055
Neg logit contributions
cast-0.161
lights-0.159
 Nem-0.156
uses-0.155
 sunscreen-0.155
Token big
Token ID1263
Feature activation+1.45e-01

Pos logit contributions
 code+0.197
 reader+0.183
case+0.179
 student+0.178
ird+0.173
Neg logit contributions
 sunscreen-0.199
 them-0.194
 Nem-0.183
 oats-0.182
lights-0.169
Token forest
Token ID8222
Feature activation+7.42e-02

Pos logit contributions
 code+0.212
ird+0.194
Warning+0.193
case+0.190
 liar+0.188
Neg logit contributions
 sunscreen-0.228
 them-0.223
 Nem-0.213
 oats-0.212
 everything-0.199
Token,
Token ID11
Feature activation+4.06e-02

Pos logit contributions
 code+0.099
 reader+0.095
 student+0.092
case+0.091
 equation+0.087
Neg logit contributions
 sunscreen-0.119
 them-0.119
 Nem-0.113
 oats-0.112
lights-0.111
Token there
Token ID612
Feature activation+3.23e-02

Pos logit contributions
 code+0.077
 reader+0.077
 student+0.075
 equation+0.072
case+0.071
Neg logit contributions
 them-0.043
 sunscreen-0.042
 Nem-0.040
 oats-0.038
lights-0.036
Token was
Token ID373
Feature activation+1.01e-01

Pos logit contributions
 code+0.047
 reader+0.045
 student+0.044
 liar+0.042
 equation+0.042
Neg logit contributions
 them-0.052
 sunscreen-0.048
 Nem-0.048
 everything-0.046
 oats-0.045
Token a
Token ID257
Feature activation+1.29e-01

Pos logit contributions
 code+0.184
 reader+0.174
 student+0.170
ird+0.167
case+0.166
Neg logit contributions
 sunscreen-0.127
 them-0.122
 oats-0.113
 Nem-0.113
 everything-0.105
Token time
Token ID640
Feature activation+1.15e-02

Pos logit contributions
 code+0.159
case+0.157
 reader+0.154
 student+0.150
Two+0.142
Neg logit contributions
 sunscreen-0.217
 them-0.210
 Nem-0.206
 oats-0.206
lights-0.203
Token,
Token ID11
Feature activation+4.06e-02

Pos logit contributions
case+0.008
 code+0.007
 student+0.006
 reader+0.006
Two+0.006
Neg logit contributions
 sunscreen-0.027
 oats-0.026
lights-0.025
 Nem-0.024
 them-0.024
Token there
Token ID612
Feature activation+2.47e-02

Pos logit contributions
 code+0.071
 reader+0.069
 student+0.067
case+0.067
 equation+0.064
Neg logit contributions
 sunscreen-0.049
 them-0.049
 Nem-0.046
 oats-0.045
lights-0.044
Token was
Token ID373
Feature activation+9.73e-02

Pos logit contributions
 code+0.031
 reader+0.030
 student+0.029
case+0.028
 equation+0.027
Neg logit contributions
 them-0.043
 sunscreen-0.042
 Nem-0.041
 oats-0.039
lights-0.038
Token a
Token ID257
Feature activation+1.29e-01

Pos logit contributions
 code+0.131
 reader+0.124
case+0.122
 student+0.122
 equation+0.114
Neg logit contributions
 sunscreen-0.159
 them-0.154
 Nem-0.149
 oats-0.148
lights-0.144
Token long
Token ID890
Feature activation+1.45e-01

Pos logit contributions
 code+0.215
Warning+0.203
case+0.203
 reader+0.193
 equation+0.192
Neg logit contributions
 sunscreen-0.172
 Nem-0.168
 them-0.161
 oats-0.161
cast-0.139
Token,
Token ID11
Feature activation+3.30e-02

Pos logit contributions
 code+0.220
Warning+0.212
case+0.196
 reader+0.196
ird+0.195
Neg logit contributions
 sunscreen-0.217
 Nem-0.205
 oats-0.202
 them-0.201
lights-0.174
Token green
Token ID4077
Feature activation+5.98e-02

Pos logit contributions
 code+0.056
 reader+0.052
 student+0.051
case+0.051
Warning+0.050
Neg logit contributions
 sunscreen-0.043
 Nem-0.042
 oats-0.040
 them-0.039
lights-0.035
Token snake
Token ID17522
Feature activation+5.40e-02

Pos logit contributions
 code+0.090
Warning+0.083
case+0.082
ird+0.081
 reader+0.081
Neg logit contributions
 sunscreen-0.090
 oats-0.087
 Nem-0.086
 them-0.085
lights-0.076
Token named
Token ID3706
Feature activation-2.56e-03

Pos logit contributions
 code+0.084
Warning+0.082
 equation+0.079
 reader+0.077
shop+0.076
Neg logit contributions
 them-0.075
 Nem-0.070
 oats-0.067
ric-0.067
 sunscreen-0.067
Token Sam
Token ID3409
Feature activation+7.17e-02

Pos logit contributions
 sunscreen+0.003
 them+0.003
 Nem+0.003
 oats+0.003
lights+0.003
Neg logit contributions
 code-0.004
 reader-0.004
case-0.004
 student-0.004
 equation-0.004
Token
Token ID220
Feature activation-4.59e-02

Pos logit contributions
case+0.071
 code+0.068
Two+0.067
 student+0.063
 reader+0.062
Neg logit contributions
 sunscreen-0.081
 oats-0.075
 them-0.074
lights-0.072
cast-0.069
Token\n
Token ID198
Feature activation-1.76e-02

Pos logit contributions
 sunscreen+0.081
 oats+0.077
lights+0.073
 them+0.072
 Nem+0.072
Neg logit contributions
case-0.057
 code-0.055
Two-0.050
 student-0.050
ird-0.049
Token\n
Token ID198
Feature activation-1.29e-02

Pos logit contributions
 sunscreen+0.027
 oats+0.025
 them+0.025
lights+0.025
 Nem+0.025
Neg logit contributions
 code-0.025
case-0.024
 reader-0.023
 student-0.023
Two-0.023
TokenAfter
Token ID3260
Feature activation+2.87e-02

Pos logit contributions
 sunscreen+0.019
 them+0.018
 Nem+0.018
 oats+0.017
lights+0.017
Neg logit contributions
 code-0.020
 reader-0.019
 student-0.019
case-0.018
 liar-0.018
Token a
Token ID257
Feature activation+1.20e-01

Pos logit contributions
 code+0.051
 reader+0.050
case+0.049
 student+0.048
 equation+0.046
Neg logit contributions
 sunscreen-0.035
 them-0.035
 Nem-0.033
 oats-0.032
lights-0.030
Token long
Token ID890
Feature activation+1.44e-01

Pos logit contributions
 code+0.214
 reader+0.207
case+0.205
 student+0.202
 equation+0.194
Neg logit contributions
 sunscreen-0.139
 them-0.134
 Nem-0.129
 oats-0.127
lights-0.122
Token time
Token ID640
Feature activation+7.35e-03

Pos logit contributions
 code+0.170
 student+0.166
case+0.162
 reader+0.161
 equation+0.154
Neg logit contributions
 sunscreen-0.248
lights-0.238
 them-0.236
 oats-0.234
 Nem-0.232
Token,
Token ID11
Feature activation+2.98e-02

Pos logit contributions
 code+0.009
 reader+0.009
case+0.008
 student+0.008
Two+0.008
Neg logit contributions
 sunscreen-0.013
 them-0.012
 oats-0.012
 Nem-0.012
lights-0.012
Token Bobby
Token ID17155
Feature activation+5.90e-02

Pos logit contributions
case+0.039
Two+0.038
 character+0.038
 mouse+0.038
 should+0.037
Neg logit contributions
ans-0.044
 sunscreen-0.041
ric-0.041
atted-0.039
 oats-0.038
Token was
Token ID373
Feature activation+7.49e-02

Pos logit contributions
 code+0.107
 liar+0.098
 reader+0.097
 student+0.096
 clue+0.092
Neg logit contributions
 Nem-0.074
 them-0.072
cast-0.069
 sunscreen-0.068
 oats-0.067
Token nearly
Token ID3016
Feature activation+1.49e-02

Pos logit contributions
 code+0.121
case+0.118
 reader+0.117
 student+0.116
 equation+0.110
Neg logit contributions
 sunscreen-0.099
 them-0.096
 Nem-0.093
lights-0.092
 oats-0.092
Token unexpected
Token ID10059
Feature activation+7.44e-02

Pos logit contributions
 reader+0.046
 code+0.045
 student+0.044
case+0.043
Two+0.042
Neg logit contributions
 them-0.038
 sunscreen-0.038
 oats-0.036
 Nem-0.036
lights-0.034
Token.
Token ID13
Feature activation+6.10e-02

Pos logit contributions
case+0.099
 reader+0.099
 code+0.099
 student+0.097
Two+0.095
Neg logit contributions
 them-0.119
 sunscreen-0.114
 Nem-0.109
 oats-0.109
lights-0.106
Token He
Token ID679
Feature activation+4.61e-02

Pos logit contributions
 code+0.106
 reader+0.102
 student+0.098
 liar+0.098
ird+0.092
Neg logit contributions
 them-0.075
 Nem-0.073
 sunscreen-0.067
lights-0.064
 everything-0.062
Token spotted
Token ID13489
Feature activation+7.06e-02

Pos logit contributions
 code+0.059
 reader+0.058
 student+0.056
case+0.054
 liar+0.052
Neg logit contributions
 them-0.077
 sunscreen-0.076
 Nem-0.075
 oats-0.072
lights-0.069
Token an
Token ID281
Feature activation+9.75e-02

Pos logit contributions
 code+0.097
 reader+0.092
case+0.091
 student+0.091
Two+0.085
Neg logit contributions
 sunscreen-0.112
 them-0.107
 Nem-0.105
 oats-0.104
lights-0.103
Token evil
Token ID6181
Feature activation+1.44e-01

Pos logit contributions
 code+0.176
Two+0.176
Warning+0.173
anish+0.170
 reader+0.169
Neg logit contributions
 oats-0.106
 sunscreen-0.101
 Nem-0.098
 them-0.092
 as-0.090
Token dragon
Token ID10441
Feature activation+5.04e-02

Pos logit contributions
Warning+0.182
Two+0.177
ird+0.175
case+0.174
 code+0.171
Neg logit contributions
 oats-0.223
 them-0.221
 Nem-0.215
 sunscreen-0.213
 seas-0.198
Token that
Token ID326
Feature activation+1.89e-02

Pos logit contributions
 code+0.070
Two+0.069
 student+0.068
 reader+0.068
 liar+0.066
Neg logit contributions
 them-0.074
 sunscreen-0.067
 oats-0.067
 Nem-0.066
ric-0.065
Token was
Token ID373
Feature activation+7.77e-02

Pos logit contributions
 reader+0.028
 student+0.027
Two+0.026
 code+0.026
ird+0.025
Neg logit contributions
 them-0.028
 Nem-0.027
lights-0.025
 sunscreen-0.025
 everything-0.025
Token lurking
Token ID36105
Feature activation+6.50e-02

Pos logit contributions
 reader+0.116
 code+0.116
 student+0.112
case+0.112
Two+0.111
Neg logit contributions
 them-0.113
 sunscreen-0.110
 oats-0.105
 Nem-0.103
lights-0.097
Token outside
Token ID2354
Feature activation+2.97e-02

Pos logit contributions
 code+0.121
 reader+0.119
 student+0.117
ird+0.115
case+0.115
Neg logit contributions
 them-0.077
 Nem-0.069
 sunscreen-0.069
 oats-0.065
lights-0.062
Token a
Token ID257
Feature activation+1.37e-01

Pos logit contributions
 code+0.043
case+0.042
 reader+0.042
 student+0.041
Two+0.038
Neg logit contributions
 sunscreen-0.067
 them-0.064
lights-0.064
 oats-0.064
 Nem-0.064
Token time
Token ID640
Feature activation+1.86e-02

Pos logit contributions
 code+0.184
 reader+0.176
case+0.175
 student+0.172
Two+0.163
Neg logit contributions
 sunscreen-0.219
 them-0.212
 Nem-0.206
 oats-0.205
lights-0.201
Token,
Token ID11
Feature activation+4.69e-02

Pos logit contributions
case+0.015
 code+0.015
 reader+0.013
 student+0.012
Two+0.012
Neg logit contributions
 sunscreen-0.041
 oats-0.039
lights-0.038
 them-0.038
 Nem-0.037
Token in
Token ID287
Feature activation+7.59e-02

Pos logit contributions
 code+0.085
 reader+0.083
 student+0.081
case+0.080
 equation+0.078
Neg logit contributions
 them-0.054
 sunscreen-0.054
 Nem-0.051
 oats-0.049
lights-0.047
Token a
Token ID257
Feature activation+1.30e-01

Pos logit contributions
case+0.051
 reader+0.051
 student+0.048
 code+0.047
 x+0.045
Neg logit contributions
cast-0.169
uses-0.166
lights-0.166
ric-0.161
atted-0.160
Token big
Token ID1263
Feature activation+1.44e-01

Pos logit contributions
 code+0.199
 reader+0.184
case+0.179
 student+0.179
ird+0.177
Neg logit contributions
 sunscreen-0.193
 them-0.190
 Nem-0.179
 oats-0.177
 everything-0.163
Token forest
Token ID8222
Feature activation+7.24e-02

Pos logit contributions
 code+0.205
Warning+0.185
ird+0.185
case+0.183
 liar+0.182
Neg logit contributions
 sunscreen-0.234
 them-0.227
 oats-0.220
 Nem-0.219
 everything-0.203
Token,
Token ID11
Feature activation+3.81e-02

Pos logit contributions
 code+0.103
 reader+0.101
 student+0.098
case+0.095
 equation+0.094
Neg logit contributions
 them-0.109
 sunscreen-0.107
 Nem-0.103
 oats-0.102
lights-0.101
Token there
Token ID612
Feature activation+2.95e-02

Pos logit contributions
 reader+0.072
 code+0.072
 student+0.071
 equation+0.068
case+0.067
Neg logit contributions
 them-0.041
 sunscreen-0.040
 Nem-0.038
 oats-0.036
lights-0.034
Token was
Token ID373
Feature activation+9.75e-02

Pos logit contributions
 code+0.042
 reader+0.042
 student+0.040
 liar+0.038
 equation+0.038
Neg logit contributions
 them-0.048
 Nem-0.045
 sunscreen-0.044
 everything-0.042
 oats-0.041
Token an
Token ID281
Feature activation+1.12e-01

Pos logit contributions
 code+0.176
 reader+0.168
 student+0.164
ird+0.161
 liar+0.159
Neg logit contributions
 sunscreen-0.122
 them-0.120
 Nem-0.113
 oats-0.110
 everything-0.103
Token stay
Token ID2652
Feature activation-2.08e-02

Pos logit contributions
 code+0.062
 reader+0.060
 student+0.059
case+0.057
 equation+0.056
Neg logit contributions
 sunscreen-0.071
 them-0.070
 Nem-0.066
 oats-0.066
lights-0.063
Token in
Token ID287
Feature activation+2.27e-02

Pos logit contributions
 sunscreen+0.031
 them+0.030
 Nem+0.029
 oats+0.029
lights+0.028
Neg logit contributions
 code-0.031
 reader-0.029
case-0.029
 student-0.029
 equation-0.027
Token.
Token ID13
Feature activation+5.51e-02

Pos logit contributions
 reader+0.044
case+0.044
 code+0.043
 student+0.043
lder+0.042
Neg logit contributions
 them-0.026
 sunscreen-0.024
 Nem-0.022
 oats-0.021
lights-0.020
Token After
Token ID2293
Feature activation+3.02e-02

Pos logit contributions
case+0.072
 code+0.070
Two+0.066
 student+0.065
 reader+0.064
Neg logit contributions
 sunscreen-0.088
 oats-0.086
lights-0.083
 them-0.081
 Nem-0.080
Token a
Token ID257
Feature activation+1.18e-01

Pos logit contributions
 code+0.044
 reader+0.042
case+0.042
 student+0.042
 equation+0.039
Neg logit contributions
 sunscreen-0.045
 them-0.044
 Nem-0.042
 oats-0.042
lights-0.041
Token long
Token ID890
Feature activation+1.44e-01

Pos logit contributions
 code+0.213
 reader+0.205
case+0.203
 student+0.202
 equation+0.193
Neg logit contributions
 sunscreen-0.136
 them-0.130
 Nem-0.125
 oats-0.124
lights-0.119
Token search
Token ID2989
Feature activation+4.46e-02

Pos logit contributions
 code+0.152
case+0.146
 student+0.145
 reader+0.140
 figure+0.135
Neg logit contributions
 sunscreen-0.252
lights-0.250
ric-0.249
 them-0.248
cast-0.248
Token,
Token ID11
Feature activation+1.29e-02

Pos logit contributions
 code+0.053
 reader+0.050
case+0.049
 student+0.049
Two+0.045
Neg logit contributions
 sunscreen-0.079
 them-0.076
 Nem-0.074
 oats-0.074
lights-0.072
Token they
Token ID484
Feature activation+1.89e-02

Pos logit contributions
 code+0.024
case+0.024
 student+0.024
 reader+0.024
Two+0.023
Neg logit contributions
 sunscreen-0.011
ric-0.011
 oats-0.010
lights-0.010
ans-0.010
Token found
Token ID1043
Feature activation+4.63e-02

Pos logit contributions
 code+0.026
 reader+0.025
 student+0.024
 equation+0.024
case+0.024
Neg logit contributions
 them-0.032
 sunscreen-0.031
 Nem-0.031
 oats-0.029
cast-0.028
Token a
Token ID257
Feature activation+9.76e-02

Pos logit contributions
 code+0.066
 reader+0.064
case+0.063
 student+0.063
 equation+0.059
Neg logit contributions
 sunscreen-0.071
 them-0.070
 Nem-0.067
 oats-0.067
lights-0.065
Token
Token ID220
Feature activation-5.35e-02

Pos logit contributions
 code+0.056
 reader+0.053
 student+0.052
case+0.052
Two+0.049
Neg logit contributions
 sunscreen-0.058
 them-0.056
 Nem-0.054
 oats-0.053
lights-0.052
Token\n
Token ID198
Feature activation-2.56e-02

Pos logit contributions
 sunscreen+0.088
 oats+0.084
 them+0.080
lights+0.080
 Nem+0.079
Neg logit contributions
case-0.071
 code-0.071
 student-0.066
 reader-0.066
Two-0.064
Token\n
Token ID198
Feature activation-1.91e-02

Pos logit contributions
 them+0.032
 sunscreen+0.032
 Nem+0.030
 oats+0.029
lights+0.028
Neg logit contributions
 code-0.043
 reader-0.042
 student-0.041
case-0.040
 liar-0.039
TokenAfter
Token ID3260
Feature activation+2.84e-02

Pos logit contributions
 them+0.025
 sunscreen+0.024
 Nem+0.021
lights+0.021
 everything+0.020
Neg logit contributions
 code-0.032
 reader-0.030
 liar-0.029
 student-0.029
ird-0.029
Token a
Token ID257
Feature activation+1.19e-01

Pos logit contributions
 code+0.063
case+0.060
 reader+0.060
 student+0.059
 equation+0.057
Neg logit contributions
 them-0.025
 sunscreen-0.024
 Nem-0.022
 oats-0.021
lights-0.017
Token long
Token ID890
Feature activation+1.44e-01

Pos logit contributions
 code+0.206
 reader+0.198
case+0.196
 student+0.196
Two+0.187
Neg logit contributions
 sunscreen-0.146
 them-0.140
 Nem-0.134
 oats-0.134
lights-0.131
Token day
Token ID1110
Feature activation+5.02e-02

Pos logit contributions
 code+0.178
 student+0.170
case+0.170
 reader+0.169
Two+0.157
Neg logit contributions
 sunscreen-0.237
lights-0.228
 them-0.228
cast-0.227
 Nem-0.225
Token of
Token ID286
Feature activation+3.84e-02

Pos logit contributions
 code+0.084
 reader+0.077
case+0.077
 student+0.075
 equation+0.074
Neg logit contributions
 them-0.069
 sunscreen-0.065
 Nem-0.065
 oats-0.061
 as-0.057
Token serving
Token ID7351
Feature activation+3.23e-02

Pos logit contributions
 code+0.065
 reader+0.062
 student+0.061
case+0.061
 equation+0.058
Neg logit contributions
 sunscreen-0.049
 them-0.047
 Nem-0.046
 oats-0.045
lights-0.045
Token,
Token ID11
Feature activation+1.79e-02

Pos logit contributions
 code+0.064
 reader+0.063
case+0.062
Two+0.061
 student+0.061
Neg logit contributions
 them-0.034
 sunscreen-0.031
 Nem-0.030
 oats-0.029
 everything-0.025
Token Max
Token ID5436
Feature activation+7.86e-02

Pos logit contributions
 code+0.026
 reader+0.025
case+0.024
 student+0.024
Two+0.023
Neg logit contributions
 sunscreen-0.027
 them-0.027
 Nem-0.026
 oats-0.026
lights-0.025
Token.
Token ID13
Feature activation+6.54e-03

Pos logit contributions
 sunscreen+0.070
 them+0.069
 Nem+0.066
 oats+0.066
lights+0.065
Neg logit contributions
 code-0.047
 reader-0.045
 student-0.044
case-0.043
 equation-0.041
Token They
Token ID1119
Feature activation-3.11e-02

Pos logit contributions
 code+0.010
 reader+0.010
 student+0.010
case+0.009
 equation+0.009
Neg logit contributions
 them-0.009
 Nem-0.009
 sunscreen-0.009
 oats-0.008
lights-0.008
Token were
Token ID547
Feature activation+9.92e-03

Pos logit contributions
 them+0.053
 sunscreen+0.052
 Nem+0.050
 oats+0.048
cast+0.046
Neg logit contributions
 code-0.041
 reader-0.039
case-0.038
 student-0.037
 equation-0.037
Token called
Token ID1444
Feature activation-1.72e-02

Pos logit contributions
 code+0.013
case+0.013
 reader+0.013
 student+0.013
 liar+0.011
Neg logit contributions
 sunscreen-0.015
lights-0.015
 Nem-0.015
 oats-0.014
 them-0.014
Token ost
Token ID23619
Feature activation-1.03e-01

Pos logit contributions
 sunscreen+0.023
lights+0.022
 Nem+0.022
 oats+0.022
 them+0.021
Neg logit contributions
 code-0.027
 reader-0.026
 student-0.025
case-0.025
 liar-0.024
Tokenric
Token ID1173
Feature activation-2.07e-01

Pos logit contributions
 sunscreen+0.077
 Nem+0.070
 oats+0.065
 them+0.058
lights+0.055
Neg logit contributions
 code-0.232
 reader-0.218
case-0.217
 student-0.214
Two-0.209
Tokenhes
Token ID956
Feature activation-8.35e-02

Pos logit contributions
 sunscreen+0.316
 Nem+0.297
 oats+0.285
lights+0.281
 them+0.273
Neg logit contributions
 code-0.302
case-0.300
 reader-0.283
 student-0.280
Two-0.270
Token.
Token ID13
Feature activation+9.40e-03

Pos logit contributions
 them+0.122
 sunscreen+0.110
 oats+0.106
 Nem+0.105
lights+0.101
Neg logit contributions
 code-0.137
 reader-0.132
 student-0.128
 equation-0.127
case-0.127
Token Lily
Token ID20037
Feature activation+7.69e-03

Pos logit contributions
 code+0.015
 reader+0.014
 student+0.014
case+0.014
 equation+0.013
Neg logit contributions
 them-0.014
 sunscreen-0.013
 Nem-0.013
 oats-0.012
lights-0.012
Token and
Token ID290
Feature activation-1.08e-01

Pos logit contributions
 code+0.013
 reader+0.012
 liar+0.012
Warning+0.012
 student+0.012
Neg logit contributions
 them-0.010
 sunscreen-0.010
 Nem-0.009
cast-0.009
 oats-0.009
Token Ben
Token ID3932
Feature activation+7.87e-03

Pos logit contributions
 sunscreen+0.164
lights+0.158
 them+0.156
 oats+0.154
 Nem+0.152
Neg logit contributions
 code-0.151
 reader-0.148
 student-0.147
case-0.141
 liar-0.135
Token ostr
Token ID42387
Feature activation-1.01e-01

Pos logit contributions
 sunscreen+0.063
 them+0.063
 oats+0.061
 Nem+0.058
cast+0.053
Neg logit contributions
ird-0.064
 code-0.061
 reader-0.061
case-0.061
 student-0.060
Tokenich
Token ID488
Feature activation-2.38e-02

Pos logit contributions
 sunscreen+0.166
 them+0.153
 Nem+0.153
 oats+0.148
lights+0.146
Neg logit contributions
 code-0.139
 reader-0.133
case-0.131
 student-0.126
Two-0.126
Token pen
Token ID3112
Feature activation-2.62e-02

Pos logit contributions
 sunscreen+0.037
 them+0.037
 Nem+0.035
 oats+0.035
lights+0.034
Neg logit contributions
 code-0.033
 reader-0.031
 student-0.031
case-0.031
 equation-0.029
Token.
Token ID13
Feature activation+1.24e-02

Pos logit contributions
 them+0.037
 sunscreen+0.035
 oats+0.034
 Nem+0.033
lights+0.032
Neg logit contributions
 code-0.041
 reader-0.040
 equation-0.040
 student-0.039
Two-0.039
Token Ost
Token ID38919
Feature activation-5.98e-02

Pos logit contributions
 code+0.019
 reader+0.018
 student+0.018
case+0.018
 equation+0.017
Neg logit contributions
 them-0.018
 sunscreen-0.018
 Nem-0.018
lights-0.016
 oats-0.016
Tokenric
Token ID1173
Feature activation-2.02e-01

Pos logit contributions
 sunscreen+0.028
 oats+0.022
 Nem+0.022
 them+0.022
lights+0.020
Neg logit contributions
 code-0.150
 reader-0.144
case-0.144
 student-0.143
Two-0.140
Tokenhes
Token ID956
Feature activation-6.51e-02

Pos logit contributions
 sunscreen+0.316
 Nem+0.298
 oats+0.284
lights+0.282
 them+0.273
Neg logit contributions
case-0.293
 code-0.289
 reader-0.272
 student-0.268
Two-0.262
Token are
Token ID389
Feature activation+1.14e-02

Pos logit contributions
 sunscreen+0.112
 them+0.109
 Nem+0.106
 oats+0.105
lights+0.103
Neg logit contributions
 code-0.080
 reader-0.076
case-0.075
 student-0.074
Two-0.069
Token big
Token ID1263
Feature activation+5.27e-02

Pos logit contributions
 code+0.018
 reader+0.018
 student+0.017
case+0.017
Two+0.016
Neg logit contributions
 sunscreen-0.016
 them-0.016
 oats-0.015
 Nem-0.014
lights-0.014
Token birds
Token ID10087
Feature activation-3.84e-02

Pos logit contributions
case+0.103
 code+0.098
shop+0.097
 student+0.094
 reader+0.093
Neg logit contributions
 them-0.062
 oats-0.059
 sunscreen-0.057
 Nem-0.052
cast-0.049
Token with
Token ID351
Feature activation-4.31e-02

Pos logit contributions
 sunscreen+0.068
 them+0.066
 Nem+0.064
 oats+0.064
lights+0.062
Neg logit contributions
 code-0.046
 reader-0.043
case-0.042
 student-0.042
Two-0.039
Token garden
Token ID11376
Feature activation-9.07e-02

Pos logit contributions
 sunscreen+0.134
 them+0.128
 oats+0.124
 Nem+0.122
lights+0.116
Neg logit contributions
 code-0.136
 reader-0.131
case-0.129
 student-0.126
Warning-0.124
Token and
Token ID290
Feature activation-1.62e-01

Pos logit contributions
 sunscreen+0.137
 them+0.136
 oats+0.131
 Nem+0.129
lights+0.127
Neg logit contributions
 code-0.129
 reader-0.126
 student-0.122
Two-0.117
case-0.116
Token admire
Token ID21099
Feature activation-8.88e-02

Pos logit contributions
 sunscreen+0.234
 them+0.227
 Nem+0.218
 oats+0.217
lights+0.210
Neg logit contributions
 code-0.246
 reader-0.236
case-0.233
 student-0.232
Two-0.220
Token his
Token ID465
Feature activation-5.51e-02

Pos logit contributions
 sunscreen+0.097
 oats+0.091
ans+0.089
oming+0.087
 Nem+0.085
Neg logit contributions
 code-0.143
 should-0.135
 student-0.134
 character-0.132
 reader-0.131
Token sun
Token ID4252
Feature activation-1.55e-01

Pos logit contributions
 sunscreen+0.077
 them+0.074
 oats+0.071
 Nem+0.070
lights+0.068
Neg logit contributions
 code-0.086
 reader-0.083
case-0.083
 student-0.082
Two-0.078
Tokenfl
Token ID2704
Feature activation-2.00e-01

Pos logit contributions
 sunscreen+0.247
 Nem+0.235
 them+0.230
 oats+0.228
lights+0.217
Neg logit contributions
 code-0.198
case-0.195
 reader-0.192
 student-0.184
 equation-0.172
Tokenowers
Token ID3618
Feature activation-1.26e-01

Pos logit contributions
 sunscreen+0.337
 Nem+0.330
 them+0.319
 oats+0.314
lights+0.291
Neg logit contributions
case-0.243
room-0.224
 code-0.222
 reader-0.221
Two-0.211
Token.
Token ID13
Feature activation-4.95e-03

Pos logit contributions
 sunscreen+0.226
 them+0.214
 Nem+0.213
 oats+0.210
lights+0.208
Neg logit contributions
 code-0.144
case-0.136
 reader-0.134
 student-0.130
 equation-0.121
Token They
Token ID1119
Feature activation-4.64e-02

Pos logit contributions
 sunscreen+0.009
 oats+0.008
 them+0.008
lights+0.008
 Nem+0.008
Neg logit contributions
case-0.006
 code-0.006
 reader-0.006
 student-0.005
Two-0.005
Token also
Token ID635
Feature activation-8.22e-02

Pos logit contributions
 sunscreen+0.072
 them+0.071
 Nem+0.067
 oats+0.065
lights+0.062
Neg logit contributions
 code-0.067
 reader-0.064
 student-0.063
case-0.062
 equation-0.060
Token see
Token ID766
Feature activation-2.94e-02

Pos logit contributions
 sunscreen+0.131
 them+0.130
 Nem+0.121
 oats+0.119
cast+0.113
Neg logit contributions
 code-0.112
case-0.108
 reader-0.107
 student-0.106
Two-0.102
Token them
Token ID606
Feature activation-1.71e-01

Pos logit contributions
lights+0.121
 sunscreen+0.115
cast+0.112
ric+0.109
oming+0.104
Neg logit contributions
 code-0.217
 student-0.216
 reader-0.213
 character-0.208
 should-0.206
Token band
Token ID4097
Feature activation-1.41e-01

Pos logit contributions
lights+0.229
 sunscreen+0.226
 Nem+0.219
 them+0.213
cast+0.212
Neg logit contributions
 code-0.252
 reader-0.242
 student-0.240
case-0.237
 liar-0.220
Token-
Token ID12
Feature activation-1.70e-01

Pos logit contributions
 sunscreen+0.209
 them+0.198
 oats+0.197
 Nem+0.196
lights+0.188
Neg logit contributions
 code-0.209
 reader-0.201
case-0.200
 student-0.198
ird-0.185
Tokenaid
Token ID1698
Feature activation-1.95e-01

Pos logit contributions
 sunscreen+0.251
 Nem+0.236
 oats+0.234
lights+0.232
 them+0.225
Neg logit contributions
 code-0.250
 reader-0.241
 student-0.239
case-0.235
Two-0.232
Tokens
Token ID82
Feature activation-1.18e-01

Pos logit contributions
 sunscreen+0.264
 Nem+0.244
 oats+0.238
 them+0.228
lights+0.220
Neg logit contributions
 code-0.308
case-0.301
 reader-0.292
 student-0.290
ird-0.283
Token and
Token ID290
Feature activation-1.98e-01

Pos logit contributions
 sunscreen+0.204
 them+0.197
 oats+0.194
 Nem+0.193
lights+0.188
Neg logit contributions
 code-0.142
case-0.134
 reader-0.130
 student-0.127
ird-0.119
Token ice
Token ID4771
Feature activation-1.31e-01

Pos logit contributions
 sunscreen+0.312
 them+0.305
 Nem+0.305
 oats+0.298
lights+0.297
Neg logit contributions
 code-0.265
 reader-0.238
 student-0.237
case-0.235
 liar-0.223
Token packs
Token ID15186
Feature activation-8.30e-02

Pos logit contributions
 sunscreen+0.203
 Nem+0.200
 them+0.197
 Wed+0.189
 oats+0.186
Neg logit contributions
 code-0.162
case-0.155
room-0.145
ird-0.144
 reader-0.141
Token.
Token ID13
Feature activation-1.75e-02

Pos logit contributions
 sunscreen+0.140
 Nem+0.131
 oats+0.131
 them+0.131
lights+0.131
Neg logit contributions
 code-0.103
 reader-0.097
case-0.096
 student-0.095
 equation-0.087
Token She
Token ID1375
Feature activation-4.01e-02

Pos logit contributions
 sunscreen+0.027
 them+0.026
 oats+0.026
 Nem+0.025
lights+0.025
Neg logit contributions
 code-0.024
 reader-0.023
case-0.023
 student-0.023
Two-0.022
Token tells
Token ID4952
Feature activation-4.25e-02

Pos logit contributions
 sunscreen+0.072
 them+0.070
 Nem+0.068
 oats+0.067
lights+0.067
Neg logit contributions
 code-0.047
 reader-0.044
case-0.044
 student-0.043
 equation-0.040
Tokenl
Token ID75
Feature activation-7.20e-02

Pos logit contributions
 sunscreen+0.147
 Nem+0.140
 oats+0.135
lights+0.129
 them+0.128
Neg logit contributions
 code-0.108
case-0.105
 reader-0.095
 student-0.091
 should-0.089
Tokenaces
Token ID2114
Feature activation-7.05e-02

Pos logit contributions
 sunscreen+0.113
 them+0.110
 Nem+0.106
 oats+0.106
lights+0.104
Neg logit contributions
 code-0.100
 reader-0.095
case-0.094
 student-0.093
Two-0.088
Token,
Token ID11
Feature activation-8.46e-02

Pos logit contributions
 sunscreen+0.107
 them+0.105
 Nem+0.100
 oats+0.099
lights+0.098
Neg logit contributions
 code-0.100
 reader-0.097
 student-0.096
case-0.093
 equation-0.090
Token brace
Token ID22581
Feature activation-1.34e-01

Pos logit contributions
 Nem+0.135
 sunscreen+0.131
 oats+0.123
ores+0.123
ext+0.122
Neg logit contributions
 code-0.112
 reader-0.098
case-0.097
 figure-0.095
 student-0.094
Tokenlets
Token ID5289
Feature activation-8.78e-02

Pos logit contributions
 sunscreen+0.265
 Nem+0.256
 oats+0.254
 them+0.232
lights+0.232
Neg logit contributions
 code-0.147
 reader-0.137
case-0.127
 student-0.124
 clue-0.121
Token and
Token ID290
Feature activation-1.96e-01

Pos logit contributions
 sunscreen+0.143
 them+0.136
 Nem+0.134
 oats+0.133
lights+0.130
Neg logit contributions
 code-0.117
 reader-0.111
case-0.111
 student-0.109
Two-0.101
Token ear
Token ID1027
Feature activation-6.66e-02

Pos logit contributions
 sunscreen+0.273
 Nem+0.263
 them+0.262
 oats+0.259
lights+0.253
Neg logit contributions
 code-0.307
 reader-0.284
 student-0.281
case-0.278
 clue-0.263
Tokenrings
Token ID33173
Feature activation-1.26e-01

Pos logit contributions
 sunscreen+0.076
 Nem+0.072
 oats+0.070
 refreshing+0.066
lights+0.064
Neg logit contributions
 code-0.122
 reader-0.115
 student-0.112
ird-0.107
 should-0.107
Token.
Token ID13
Feature activation-9.20e-03

Pos logit contributions
 sunscreen+0.212
 them+0.210
 oats+0.199
 Nem+0.199
lights+0.196
Neg logit contributions
 code-0.157
 reader-0.151
 student-0.149
case-0.148
 equation-0.141
Token They
Token ID1119
Feature activation-4.18e-02

Pos logit contributions
 them+0.014
 sunscreen+0.014
 Nem+0.013
 oats+0.013
lights+0.013
Neg logit contributions
 code-0.013
 reader-0.013
case-0.013
 student-0.013
 equation-0.012
Token liked
Token ID8288
Feature activation-4.94e-03

Pos logit contributions
 them+0.067
 sunscreen+0.064
 Nem+0.060
 oats+0.058
cast+0.057
Neg logit contributions
 code-0.059
 reader-0.057
 equation-0.056
case-0.056
 student-0.055
Token John
Token ID1757
Feature activation-9.91e-02

Pos logit contributions
 code+0.112
 x+0.105
 reader+0.104
 liar+0.103
 equation+0.102
Neg logit contributions
 them-0.072
 sunscreen-0.062
 as-0.059
 Nem-0.058
 everything-0.058
Token chose
Token ID7690
Feature activation-7.94e-02

Pos logit contributions
 them+0.161
 sunscreen+0.158
 Nem+0.153
 oats+0.148
lights+0.140
Neg logit contributions
 code-0.136
 reader-0.126
case-0.125
 student-0.124
 equation-0.119
Token some
Token ID617
Feature activation-8.24e-02

Pos logit contributions
 sunscreen+0.104
 them+0.099
 Nem+0.097
lights+0.097
 oats+0.096
Neg logit contributions
 code-0.127
 reader-0.122
 student-0.121
case-0.120
 equation-0.113
Token carrots
Token ID34397
Feature activation-8.76e-02

Pos logit contributions
 sunscreen+0.105
 them+0.101
 Nem+0.097
 oats+0.096
lights+0.093
Neg logit contributions
 code-0.138
 reader-0.133
case-0.132
 student-0.131
Two-0.125
Token and
Token ID290
Feature activation-1.61e-01

Pos logit contributions
 sunscreen+0.136
 them+0.129
 Nem+0.126
lights+0.125
 oats+0.125
Neg logit contributions
 code-0.121
case-0.115
 reader-0.115
 student-0.113
Two-0.105
Token cuc
Token ID38421
Feature activation-1.96e-01

Pos logit contributions
 sunscreen+0.205
lights+0.199
 them+0.195
 Nem+0.186
ric+0.185
Neg logit contributions
 code-0.246
 reader-0.242
 student-0.240
case-0.239
 equation-0.229
Tokenumbers
Token ID17024
Feature activation-7.76e-02

Pos logit contributions
 sunscreen+0.307
lights+0.297
 Nem+0.281
 them+0.281
 oats+0.269
Neg logit contributions
 reader-0.260
ird-0.253
 code-0.249
 student-0.247
Two-0.247
Token to
Token ID284
Feature activation-5.92e-03

Pos logit contributions
 sunscreen+0.126
 them+0.117
 Nem+0.117
 oats+0.116
lights+0.115
Neg logit contributions
 code-0.102
case-0.096
 reader-0.096
 student-0.094
Two-0.088
Token take
Token ID1011
Feature activation-6.53e-02

Pos logit contributions
 sunscreen+0.009
lights+0.009
 them+0.009
 oats+0.009
 Nem+0.009
Neg logit contributions
 code-0.008
case-0.007
 reader-0.007
 student-0.007
Two-0.007
Token home
Token ID1363
Feature activation-5.12e-02

Pos logit contributions
 sunscreen+0.065
lights+0.062
 oats+0.057
ric+0.057
 Nem+0.056
Neg logit contributions
 code-0.122
 reader-0.118
case-0.116
 student-0.116
Two-0.112
Token.
Token ID13
Feature activation+3.59e-04

Pos logit contributions
 sunscreen+0.087
 Nem+0.081
 them+0.081
lights+0.081
 oats+0.080
Neg logit contributions
 code-0.061
 reader-0.058
 student-0.058
case-0.057
ird-0.053
Token and
Token ID290
Feature activation-1.84e-01

Pos logit contributions
case+0.140
 code+0.137
Warning+0.137
 reader+0.137
 student+0.131
Neg logit contributions
 sunscreen-0.151
 oats-0.139
 them-0.136
 Nem-0.129
lights-0.118
Token gives
Token ID3607
Feature activation-1.24e-01

Pos logit contributions
 sunscreen+0.270
 them+0.260
 Nem+0.254
 oats+0.254
lights+0.251
Neg logit contributions
 code-0.266
 reader-0.254
case-0.253
 student-0.252
 equation-0.235
Token them
Token ID606
Feature activation-1.71e-01

Pos logit contributions
lights+0.121
 sunscreen+0.115
cast+0.112
ric+0.109
oming+0.104
Neg logit contributions
 code-0.217
 student-0.216
 reader-0.213
 character-0.208
 should-0.206
Token band
Token ID4097
Feature activation-1.41e-01

Pos logit contributions
lights+0.229
 sunscreen+0.226
 Nem+0.219
 them+0.213
cast+0.212
Neg logit contributions
 code-0.252
 reader-0.242
 student-0.240
case-0.237
 liar-0.220
Token-
Token ID12
Feature activation-1.70e-01

Pos logit contributions
 sunscreen+0.209
 them+0.198
 oats+0.197
 Nem+0.196
lights+0.188
Neg logit contributions
 code-0.209
 reader-0.201
case-0.200
 student-0.198
ird-0.185
Tokenaid
Token ID1698
Feature activation-1.95e-01

Pos logit contributions
 sunscreen+0.251
 Nem+0.236
 oats+0.234
lights+0.232
 them+0.225
Neg logit contributions
 code-0.250
 reader-0.241
 student-0.239
case-0.235
Two-0.232
Tokens
Token ID82
Feature activation-1.18e-01

Pos logit contributions
 sunscreen+0.264
 Nem+0.244
 oats+0.238
 them+0.228
lights+0.220
Neg logit contributions
 code-0.308
case-0.301
 reader-0.292
 student-0.290
ird-0.283
Token and
Token ID290
Feature activation-1.98e-01

Pos logit contributions
 sunscreen+0.204
 them+0.197
 oats+0.194
 Nem+0.193
lights+0.188
Neg logit contributions
 code-0.142
case-0.134
 reader-0.130
 student-0.127
ird-0.119
Token ice
Token ID4771
Feature activation-1.31e-01

Pos logit contributions
 sunscreen+0.312
 them+0.305
 Nem+0.305
 oats+0.298
lights+0.297
Neg logit contributions
 code-0.265
 reader-0.238
 student-0.237
case-0.235
 liar-0.223
Token packs
Token ID15186
Feature activation-8.30e-02

Pos logit contributions
 sunscreen+0.203
 Nem+0.200
 them+0.197
 Wed+0.189
 oats+0.186
Neg logit contributions
 code-0.162
case-0.155
room-0.145
ird-0.144
 reader-0.141
Token.
Token ID13
Feature activation-1.75e-02

Pos logit contributions
 sunscreen+0.140
 Nem+0.131
 oats+0.131
 them+0.131
lights+0.131
Neg logit contributions
 code-0.103
 reader-0.097
case-0.096
 student-0.095
 equation-0.087
Token ostr
Token ID42387
Feature activation-9.57e-02

Pos logit contributions
 sunscreen+0.054
 them+0.052
 oats+0.051
 Nem+0.048
ric+0.044
Neg logit contributions
ird-0.054
 reader-0.051
 code-0.051
Warning-0.050
 student-0.050
Tokenich
Token ID488
Feature activation-1.80e-02

Pos logit contributions
 sunscreen+0.158
 them+0.147
 Nem+0.146
 oats+0.141
lights+0.139
Neg logit contributions
 code-0.132
 reader-0.126
case-0.126
 student-0.120
Two-0.120
Token pen
Token ID3112
Feature activation-2.07e-02

Pos logit contributions
 them+0.027
 sunscreen+0.027
 oats+0.026
 Nem+0.025
lights+0.024
Neg logit contributions
 code-0.026
 reader-0.025
 student-0.024
 liar-0.024
 equation-0.024
Token.
Token ID13
Feature activation+1.63e-02

Pos logit contributions
 them+0.028
 sunscreen+0.025
 oats+0.025
 Nem+0.024
lights+0.023
Neg logit contributions
 code-0.034
 reader-0.033
 equation-0.033
Two-0.033
 student-0.033
Token Ost
Token ID38919
Feature activation-5.71e-02

Pos logit contributions
 code+0.027
 reader+0.025
 student+0.025
case+0.024
 equation+0.024
Neg logit contributions
 them-0.023
 Nem-0.023
 sunscreen-0.022
 oats-0.020
lights-0.020
Tokenric
Token ID1173
Feature activation-1.95e-01

Pos logit contributions
 sunscreen+0.030
 oats+0.024
 them+0.024
 Nem+0.024
lights+0.022
Neg logit contributions
 code-0.140
case-0.135
 reader-0.135
 student-0.134
Two-0.131
Tokenhes
Token ID956
Feature activation-5.96e-02

Pos logit contributions
 sunscreen+0.300
 Nem+0.280
 oats+0.271
lights+0.265
 them+0.261
Neg logit contributions
case-0.290
 code-0.285
 reader-0.268
 student-0.265
Two-0.259
Token are
Token ID389
Feature activation+1.63e-02

Pos logit contributions
 them+0.104
 sunscreen+0.101
 oats+0.097
 Nem+0.096
lights+0.092
Neg logit contributions
 code-0.075
 reader-0.073
 student-0.071
case-0.069
 equation-0.067
Token big
Token ID1263
Feature activation+5.66e-02

Pos logit contributions
 code+0.025
 reader+0.025
case+0.024
 student+0.024
Two+0.023
Neg logit contributions
 sunscreen-0.023
 them-0.023
 oats-0.022
 Nem-0.021
lights-0.020
Token birds
Token ID10087
Feature activation-3.45e-02

Pos logit contributions
case+0.111
 code+0.106
shop+0.104
 student+0.100
Two+0.100
Neg logit contributions
 them-0.067
 oats-0.064
 sunscreen-0.061
 Nem-0.057
cast-0.053
Token with
Token ID351
Feature activation-3.90e-02

Pos logit contributions
 sunscreen+0.059
 them+0.057
 Nem+0.056
 oats+0.056
lights+0.054
Neg logit contributions
 code-0.043
 reader-0.041
case-0.040
 student-0.040
Two-0.037
Token put
Token ID1234
Feature activation-1.04e-01

Pos logit contributions
 sunscreen+0.255
 them+0.246
 Nem+0.239
 oats+0.238
lights+0.233
Neg logit contributions
 code-0.233
 reader-0.222
case-0.220
 student-0.218
 equation-0.205
Token it
Token ID340
Feature activation-1.32e-01

Pos logit contributions
 sunscreen+0.114
ric+0.112
lights+0.111
ext+0.105
cast+0.103
Neg logit contributions
 code-0.168
 box-0.155
 character-0.155
case-0.153
 should-0.152
Token on
Token ID319
Feature activation-4.04e-02

Pos logit contributions
 sunscreen+0.197
 Nem+0.190
 them+0.187
 oats+0.183
lights+0.180
Neg logit contributions
 code-0.187
case-0.181
 reader-0.166
 student-0.164
 clue-0.158
Token his
Token ID465
Feature activation-6.46e-02

Pos logit contributions
 them+0.041
 sunscreen+0.035
 Nem+0.032
 oats+0.029
 everything+0.028
Neg logit contributions
 code-0.085
 reader-0.085
Two-0.083
case-0.082
 liar-0.082
Token night
Token ID1755
Feature activation-7.63e-02

Pos logit contributions
 sunscreen+0.099
 them+0.094
 oats+0.092
 Nem+0.090
lights+0.088
Neg logit contributions
 code-0.092
 reader-0.091
 student-0.088
case-0.086
Two-0.085
Tokenstand
Token ID1481
Feature activation-1.95e-01

Pos logit contributions
 sunscreen+0.108
 them+0.104
 Nem+0.102
 oats+0.100
lights+0.096
Neg logit contributions
 code-0.118
case-0.115
 reader-0.112
 student-0.111
 equation-0.104
Token.
Token ID13
Feature activation-4.18e-03

Pos logit contributions
 sunscreen+0.319
 them+0.318
lights+0.302
 Nem+0.298
 oats+0.298
Neg logit contributions
 code-0.250
 reader-0.241
 student-0.235
Two-0.227
 equation-0.226
Token He
Token ID679
Feature activation-2.81e-02

Pos logit contributions
 sunscreen+0.006
 them+0.006
 oats+0.006
lights+0.006
 Nem+0.006
Neg logit contributions
 code-0.006
case-0.006
 reader-0.006
 student-0.006
Two-0.005
Token did
Token ID750
Feature activation-4.48e-03

Pos logit contributions
 sunscreen+0.046
 them+0.045
 Nem+0.043
 oats+0.043
lights+0.041
Neg logit contributions
 code-0.038
 reader-0.036
 student-0.035
case-0.035
Two-0.033
Token not
Token ID407
Feature activation-7.48e-02

Pos logit contributions
 sunscreen+0.006
 Nem+0.006
 oats+0.006
lights+0.006
ishes+0.006
Neg logit contributions
 code-0.007
case-0.007
 reader-0.006
 student-0.006
 equation-0.006
Token want
Token ID765
Feature activation-1.67e-02

Pos logit contributions
 sunscreen+0.117
 them+0.116
 Nem+0.111
 oats+0.108
lights+0.101
Neg logit contributions
 code-0.103
case-0.099
 reader-0.097
 student-0.095
Two-0.094
Token butterfly
Token ID35113
Feature activation-8.59e-02

Pos logit contributions
 sunscreen+0.138
 them+0.130
 oats+0.130
 Nem+0.126
lights+0.121
Neg logit contributions
 code-0.111
 reader-0.111
 student-0.107
case-0.105
Two-0.104
Token's
Token ID338
Feature activation-7.12e-02

Pos logit contributions
 sunscreen+0.117
 them+0.115
 oats+0.108
 Nem+0.107
lights+0.104
Neg logit contributions
 code-0.136
 reader-0.131
 student-0.128
Two-0.127
case-0.125
Token wings
Token ID12098
Feature activation-1.19e-01

Pos logit contributions
 sunscreen+0.109
 them+0.105
 oats+0.101
 Nem+0.100
lights+0.097
Neg logit contributions
 code-0.101
 reader-0.100
 student-0.097
case-0.097
Two-0.094
Token and
Token ID290
Feature activation-1.59e-01

Pos logit contributions
 sunscreen+0.178
 them+0.171
 Nem+0.167
 oats+0.166
lights+0.161
Neg logit contributions
 code-0.172
 reader-0.164
case-0.164
 student-0.162
 equation-0.151
Token gently
Token ID15165
Feature activation-5.20e-02

Pos logit contributions
 sunscreen+0.253
 Nem+0.248
 oats+0.248
lights+0.246
 them+0.245
Neg logit contributions
 code-0.208
 reader-0.194
case-0.191
 student-0.190
 figure-0.182
Token stro
Token ID9660
Feature activation-1.93e-01

Pos logit contributions
 sunscreen+0.081
lights+0.079
 Nem+0.079
 oats+0.079
 them+0.078
Neg logit contributions
 code-0.070
 reader-0.065
 student-0.064
case-0.064
ird-0.061
Tokenked
Token ID9091
Feature activation-9.16e-02

Pos logit contributions
 oats+0.295
 sunscreen+0.291
lights+0.267
 Nem+0.259
 peanuts+0.243
Neg logit contributions
 reader-0.258
 code-0.254
 student-0.250
room-0.247
case-0.241
Token them
Token ID606
Feature activation-1.28e-01

Pos logit contributions
 sunscreen+0.107
 Nem+0.097
lights+0.097
 oats+0.097
 them+0.093
Neg logit contributions
 code-0.161
 reader-0.155
 student-0.153
case-0.152
 equation-0.145
Token.
Token ID13
Feature activation-1.20e-02

Pos logit contributions
 sunscreen+0.193
 them+0.188
 Nem+0.181
 oats+0.179
lights+0.176
Neg logit contributions
 code-0.187
 reader-0.179
case-0.176
 student-0.176
Two-0.166
Token The
Token ID383
Feature activation-6.90e-02

Pos logit contributions
 sunscreen+0.021
 them+0.020
 oats+0.020
lights+0.019
 Nem+0.019
Neg logit contributions
 code-0.015
 reader-0.014
case-0.014
 student-0.014
Two-0.013
Token butterfly
Token ID35113
Feature activation-4.36e-02

Pos logit contributions
 sunscreen+0.102
 them+0.097
 Nem+0.093
 oats+0.093
lights+0.088
Neg logit contributions
 code-0.103
case-0.099
 reader-0.098
 student-0.094
ird-0.094
Tokenarre
Token ID9624
Feature activation-2.49e-01

Pos logit contributions
 them+0.010
 sunscreen+0.009
lights+0.009
 Nem+0.008
 oats+0.008
Neg logit contributions
 code-0.007
 reader-0.007
 liar-0.007
 student-0.007
case-0.006
Tokenling
Token ID1359
Feature activation-1.61e-01

Pos logit contributions
 sunscreen+0.488
 oats+0.462
lights+0.428
 Nem+0.428
bur+0.415
Neg logit contributions
 student-0.222
 code-0.221
case-0.218
 equation-0.217
room-0.217
Token over
Token ID625
Feature activation-1.19e-01

Pos logit contributions
 sunscreen+0.242
 oats+0.224
 them+0.224
 Nem+0.220
lights+0.218
Neg logit contributions
 code-0.230
case-0.220
 student-0.219
 reader-0.218
 equation-0.205
Token some
Token ID617
Feature activation-9.03e-02

Pos logit contributions
 sunscreen+0.165
 oats+0.153
 them+0.150
lights+0.150
 Nem+0.150
Neg logit contributions
 code-0.182
 student-0.173
 reader-0.173
case-0.170
 equation-0.164
Token caul
Token ID46087
Feature activation-1.41e-01

Pos logit contributions
 sunscreen+0.124
 them+0.119
 Nem+0.115
 oats+0.114
lights+0.112
Neg logit contributions
 code-0.141
 reader-0.134
case-0.134
 student-0.133
 equation-0.126
Tokenif
Token ID361
Feature activation-1.90e-01

Pos logit contributions
 sunscreen+0.216
 them+0.195
 Nem+0.193
lights+0.185
 oats+0.181
Neg logit contributions
 code-0.200
case-0.192
 reader-0.190
 student-0.186
 equation-0.184
Tokenl
Token ID75
Feature activation-1.09e-01

Pos logit contributions
 sunscreen+0.273
 oats+0.245
 them+0.234
lights+0.233
 Nem+0.231
Neg logit contributions
case-0.285
 code-0.281
 should-0.271
 reader-0.269
 student-0.268
Tokenowers
Token ID3618
Feature activation-1.41e-01

Pos logit contributions
 sunscreen+0.165
 them+0.161
 Nem+0.156
 oats+0.154
lights+0.150
Neg logit contributions
 code-0.157
 reader-0.149
 student-0.147
case-0.146
Two-0.138
Token?"
Token ID1701
Feature activation-3.54e-02

Pos logit contributions
 sunscreen+0.235
lights+0.218
 oats+0.216
 Nem+0.216
 them+0.216
Neg logit contributions
 code-0.177
case-0.170
 reader-0.161
 student-0.159
 should-0.154
Token\n
Token ID198
Feature activation-8.91e-02

Pos logit contributions
 them+0.046
 sunscreen+0.046
 Nem+0.045
 oats+0.042
lights+0.040
Neg logit contributions
 code-0.060
 reader-0.056
case-0.056
 student-0.055
 equation-0.053
Token\n
Token ID198
Feature activation-8.04e-02

Pos logit contributions
 them+0.116
 sunscreen+0.111
 Nem+0.111
 oats+0.099
lights+0.095
Neg logit contributions
 code-0.152
 reader-0.143
case-0.140
 student-0.138
 equation-0.134
Token tape
Token ID9154
Feature activation-1.50e-01

Pos logit contributions
 code+0.038
 student+0.036
case+0.035
 reader+0.035
Two+0.035
Neg logit contributions
 sunscreen-0.051
 Nem-0.049
 oats-0.049
 them-0.048
lights-0.048
Token will
Token ID481
Feature activation-9.71e-02

Pos logit contributions
 sunscreen+0.214
 them+0.211
 oats+0.199
 Nem+0.199
lights+0.195
Neg logit contributions
 code-0.228
 reader-0.221
 student-0.217
case-0.215
Two-0.207
Token hold
Token ID1745
Feature activation-9.55e-02

Pos logit contributions
 them+0.143
 sunscreen+0.142
 Nem+0.132
 oats+0.128
lights+0.125
Neg logit contributions
 code-0.144
 reader-0.142
 student-0.137
case-0.135
Two-0.132
Token the
Token ID262
Feature activation-1.02e-01

Pos logit contributions
 sunscreen+0.095
 oats+0.087
 Nem+0.087
lights+0.085
cast+0.083
Neg logit contributions
 code-0.181
 student-0.174
 reader-0.174
case-0.172
ird-0.167
Token zero
Token ID6632
Feature activation-9.20e-02

Pos logit contributions
 sunscreen+0.174
 them+0.166
 oats+0.164
 Nem+0.156
lights+0.153
Neg logit contributions
 code-0.126
 reader-0.124
case-0.120
 student-0.117
Two-0.113
Token and
Token ID290
Feature activation-1.89e-01

Pos logit contributions
 sunscreen+0.134
 them+0.132
 oats+0.125
lights+0.123
 Nem+0.123
Neg logit contributions
 code-0.136
 reader-0.131
 student-0.128
case-0.127
Two-0.123
Token close
Token ID1969
Feature activation-8.70e-02

Pos logit contributions
 sunscreen+0.284
 them+0.272
 Nem+0.270
 oats+0.270
lights+0.261
Neg logit contributions
 code-0.273
case-0.257
 reader-0.256
 student-0.254
 equation-0.237
Token the
Token ID262
Feature activation-9.45e-02

Pos logit contributions
 sunscreen+0.117
 them+0.110
 Nem+0.109
 oats+0.108
lights+0.105
Neg logit contributions
 code-0.140
 reader-0.134
case-0.133
 student-0.132
 equation-0.125
Token hole
Token ID7604
Feature activation-4.50e-02

Pos logit contributions
 sunscreen+0.158
 them+0.156
 oats+0.151
ric+0.141
 attention+0.140
Neg logit contributions
 reader-0.114
lder-0.112
 student-0.106
 code-0.106
shop-0.105
Token.
Token ID13
Feature activation-1.31e-02

Pos logit contributions
 them+0.072
 sunscreen+0.069
lights+0.067
 oats+0.065
 Nem+0.065
Neg logit contributions
 code-0.061
 reader-0.061
 student-0.059
Two-0.057
case-0.057
Token Then
Token ID3244
Feature activation-3.82e-02

Pos logit contributions
 sunscreen+0.020
 them+0.020
 Nem+0.019
 oats+0.018
lights+0.018
Neg logit contributions
 code-0.019
 reader-0.018
 student-0.017
case-0.017
 equation-0.017
Token cried
Token ID16896
Feature activation-1.17e-02

Pos logit contributions
 them+0.009
 sunscreen+0.008
 oats+0.008
 Nem+0.008
 as+0.007
Neg logit contributions
 reader-0.008
 code-0.008
 student-0.008
 equation-0.007
shop-0.007
Token.
Token ID13
Feature activation+5.11e-03

Pos logit contributions
 sunscreen+0.023
 oats+0.022
 Nem+0.022
 them+0.021
lights+0.021
Neg logit contributions
 code-0.011
 reader-0.010
 student-0.010
case-0.010
Two-0.009
Token "
Token ID366
Feature activation-2.52e-02

Pos logit contributions
 code+0.008
 reader+0.008
 student+0.008
case+0.008
 equation+0.008
Neg logit contributions
 them-0.007
 sunscreen-0.007
 Nem-0.006
lights-0.006
 oats-0.006
TokenThe
Token ID464
Feature activation-4.74e-02

Pos logit contributions
 sunscreen+0.044
 oats+0.043
 them+0.042
 Nem+0.042
lights+0.042
Neg logit contributions
 code-0.028
case-0.027
Two-0.027
 reader-0.027
 student-0.027
Token ost
Token ID23619
Feature activation-7.67e-02

Pos logit contributions
 sunscreen+0.078
 oats+0.075
 them+0.074
 Nem+0.071
lights+0.065
Neg logit contributions
case-0.062
 code-0.062
 reader-0.061
 student-0.059
ird-0.058
Tokenric
Token ID1173
Feature activation-1.88e-01

Pos logit contributions
 sunscreen+0.048
 Nem+0.046
 oats+0.038
 them+0.036
lights+0.035
Neg logit contributions
 code-0.181
case-0.171
 reader-0.170
 student-0.169
room-0.168
Tokenhes
Token ID956
Feature activation-6.86e-02

Pos logit contributions
 sunscreen+0.286
 Nem+0.269
 oats+0.258
lights+0.254
 them+0.252
Neg logit contributions
 code-0.276
case-0.274
 reader-0.260
 student-0.258
 liar-0.249
Token are
Token ID389
Feature activation+4.20e-03

Pos logit contributions
 them+0.116
 sunscreen+0.108
 oats+0.105
 Nem+0.103
lights+0.099
Neg logit contributions
 code-0.093
 reader-0.091
 student-0.087
 equation-0.085
shop-0.084
Token mean
Token ID1612
Feature activation+5.07e-02

Pos logit contributions
 code+0.005
 student+0.005
case+0.005
 reader+0.005
 liar+0.005
Neg logit contributions
 sunscreen-0.007
lights-0.007
 Nem-0.007
cast-0.007
 them-0.007
Token!
Token ID0
Feature activation+2.05e-02

Pos logit contributions
 code+0.065
case+0.063
 reader+0.062
 student+0.060
shop+0.058
Neg logit contributions
 them-0.087
 sunscreen-0.086
 oats-0.082
 Nem-0.080
lights-0.078
Token They
Token ID1119
Feature activation-1.44e-02

Pos logit contributions
 code+0.032
 reader+0.029
case+0.029
 student+0.029
shop+0.028
Neg logit contributions
 them-0.031
 sunscreen-0.030
 Nem-0.029
 oats-0.028
lights-0.027
Token big
Token ID1263
Feature activation+4.46e-02

Pos logit contributions
Warning+0.052
 reader+0.050
 code+0.049
 student+0.047
 Warning+0.047
Neg logit contributions
 sunscreen-0.043
 oats-0.041
 them-0.041
 Nem-0.041
 refreshing-0.037
Token,
Token ID11
Feature activation-5.19e-02

Pos logit contributions
 code+0.069
Warning+0.067
case+0.067
 reader+0.065
 student+0.064
Neg logit contributions
 sunscreen-0.064
 Nem-0.063
 them-0.062
 oats-0.061
 peanuts-0.053
Token beautiful
Token ID4950
Feature activation-2.51e-02

Pos logit contributions
 sunscreen+0.069
 Nem+0.064
 them+0.064
 oats+0.063
lights+0.057
Neg logit contributions
 code-0.087
 reader-0.085
 student-0.083
case-0.083
ird-0.079
Token rainbow
Token ID27223
Feature activation-7.66e-02

Pos logit contributions
 sunscreen+0.037
 them+0.036
 Nem+0.035
 oats+0.035
lights+0.032
Neg logit contributions
 code-0.038
 reader-0.036
case-0.036
 student-0.036
Warning-0.035
Token-
Token ID12
Feature activation-9.33e-02

Pos logit contributions
 them+0.120
 sunscreen+0.116
lights+0.112
 oats+0.111
 Nem+0.110
Neg logit contributions
 code-0.108
 reader-0.104
 student-0.102
 liar-0.099
case-0.098
Tokencol
Token ID4033
Feature activation-1.88e-01

Pos logit contributions
 sunscreen+0.139
 oats+0.129
 them+0.126
 Nem+0.126
lights+0.124
Neg logit contributions
 code-0.139
case-0.136
 reader-0.135
 student-0.131
Two-0.128
Tokenoured
Token ID8167
Feature activation-5.03e-02

Pos logit contributions
 sunscreen+0.265
 oats+0.252
 Nem+0.249
 them+0.249
lights+0.245
Neg logit contributions
 reader-0.269
ird-0.265
 code-0.264
case-0.256
 student-0.250
Token butterfly
Token ID35113
Feature activation-7.56e-02

Pos logit contributions
 sunscreen+0.067
 them+0.066
 Nem+0.063
 oats+0.062
lights+0.060
Neg logit contributions
 code-0.082
 reader-0.079
case-0.078
 student-0.077
 equation-0.074
Token!
Token ID0
Feature activation+1.26e-03

Pos logit contributions
 them+0.117
 sunscreen+0.116
 Nem+0.110
 oats+0.110
lights+0.106
Neg logit contributions
 code-0.106
 reader-0.101
 student-0.100
case-0.096
 liar-0.095
Token The
Token ID383
Feature activation-6.08e-02

Pos logit contributions
 code+0.002
 reader+0.002
 student+0.002
case+0.002
 equation+0.002
Neg logit contributions
 sunscreen-0.002
 them-0.002
 Nem-0.002
 oats-0.002
lights-0.002
Token little
Token ID1310
Feature activation-4.76e-03

Pos logit contributions
 sunscreen+0.089
 them+0.085
 Nem+0.084
 oats+0.080
lights+0.076
Neg logit contributions
 code-0.093
 reader-0.090
case-0.087
 student-0.086
 equation-0.085
Token ready
Token ID3492
Feature activation-4.82e-02

Pos logit contributions
 sunscreen+0.178
lights+0.175
 oats+0.170
 Nem+0.167
cast+0.165
Neg logit contributions
 code-0.169
 student-0.158
 reader-0.156
case-0.154
 liar-0.150
Token to
Token ID284
Feature activation-3.50e-02

Pos logit contributions
 sunscreen+0.067
 Nem+0.064
 them+0.063
 oats+0.062
lights+0.062
Neg logit contributions
 code-0.074
case-0.070
 reader-0.070
 student-0.069
 equation-0.065
Token share
Token ID2648
Feature activation-7.96e-02

Pos logit contributions
 sunscreen+0.052
 them+0.051
 Nem+0.048
 oats+0.048
lights+0.046
Neg logit contributions
 code-0.052
 reader-0.050
 student-0.049
case-0.049
Two-0.047
Token our
Token ID674
Feature activation-1.14e-01

Pos logit contributions
 sunscreen+0.105
 Nem+0.102
 oats+0.101
ric+0.101
lights+0.101
Neg logit contributions
 code-0.121
case-0.114
 reader-0.114
 student-0.113
 character-0.106
Token happiness
Token ID12157
Feature activation-1.34e-01

Pos logit contributions
 sunscreen+0.150
 them+0.147
 Nem+0.143
lights+0.140
 oats+0.140
Neg logit contributions
 code-0.183
 reader-0.173
 student-0.172
case-0.171
 equation-0.161
Token and
Token ID290
Feature activation-1.87e-01

Pos logit contributions
 sunscreen+0.238
 oats+0.228
 Nem+0.228
ric+0.218
 them+0.213
Neg logit contributions
 code-0.153
case-0.149
 student-0.147
 reader-0.146
 character-0.128
Token play
Token ID711
Feature activation-5.45e-02

Pos logit contributions
 sunscreen+0.293
 oats+0.281
 them+0.280
 Nem+0.278
lights+0.275
Neg logit contributions
 code-0.260
 student-0.243
 reader-0.241
case-0.238
 figure-0.220
Token together
Token ID1978
Feature activation-8.04e-02

Pos logit contributions
 them+0.071
 sunscreen+0.064
 Nem+0.062
lights+0.057
 oats+0.057
Neg logit contributions
 code-0.096
 reader-0.093
case-0.091
 student-0.090
Two-0.089
Token!
Token ID0
Feature activation-1.79e-02

Pos logit contributions
 them+0.133
 sunscreen+0.128
 Nem+0.122
 oats+0.119
lights+0.119
Neg logit contributions
 code-0.107
 reader-0.103
 student-0.099
case-0.099
Two-0.098
Tokenâ
Token ID22940
Feature activation-5.69e-02

Pos logit contributions
 sunscreen+0.026
 them+0.025
 Nem+0.024
 oats+0.024
lights+0.023
Neg logit contributions
 code-0.027
 reader-0.026
case-0.026
 student-0.026
 equation-0.024
Token€
Token ID26391
Feature activation-2.42e-02

Pos logit contributions
 sunscreen+0.084
 them+0.082
 Nem+0.081
lights+0.076
ric+0.074
Neg logit contributions
 code-0.083
 reader-0.081
 student-0.079
 liar-0.077
case-0.076
Token promised
Token ID8072
Feature activation-1.30e-01

Pos logit contributions
 code+0.133
case+0.124
 reader+0.121
 student+0.118
 equation+0.116
Neg logit contributions
 sunscreen-0.186
 them-0.182
 Nem-0.180
 oats-0.177
lights-0.167
Token he
Token ID339
Feature activation-9.82e-02

Pos logit contributions
lights+0.203
 sunscreen+0.200
 oats+0.194
cast+0.191
ric+0.189
Neg logit contributions
case-0.172
 code-0.168
 should-0.163
 reader-0.156
 client-0.155
Token would
Token ID561
Feature activation-7.58e-02

Pos logit contributions
 sunscreen+0.162
lights+0.149
 oats+0.148
 them+0.147
 Nem+0.144
Neg logit contributions
case-0.130
 code-0.129
 reader-0.121
 should-0.120
 student-0.118
Token come
Token ID1282
Feature activation-1.39e-01

Pos logit contributions
 sunscreen+0.118
 them+0.112
 oats+0.111
 Nem+0.110
lights+0.110
Neg logit contributions
 code-0.105
case-0.101
 reader-0.100
 student-0.098
Two-0.092
Token back
Token ID736
Feature activation-9.68e-02

Pos logit contributions
lights+0.200
 sunscreen+0.198
 oats+0.195
 Nem+0.185
icles+0.183
Neg logit contributions
 code-0.197
 figure-0.191
 reader-0.186
 student-0.179
case-0.177
Token and
Token ID290
Feature activation-1.87e-01

Pos logit contributions
 sunscreen+0.136
 them+0.128
 oats+0.126
 Nem+0.125
lights+0.124
Neg logit contributions
 code-0.149
 reader-0.144
 student-0.141
case-0.141
Two-0.131
Token play
Token ID711
Feature activation-5.26e-02

Pos logit contributions
lights+0.336
 sunscreen+0.317
ocked+0.314
 oats+0.313
ric+0.311
Neg logit contributions
case-0.213
 code-0.209
 figure-0.198
 should-0.193
 read-0.187
Token again
Token ID757
Feature activation-8.97e-02

Pos logit contributions
 sunscreen+0.066
 them+0.065
 Nem+0.061
 oats+0.060
lights+0.058
Neg logit contributions
 code-0.090
 reader-0.087
 student-0.086
case-0.086
Two-0.082
Token soon
Token ID2582
Feature activation-4.94e-02

Pos logit contributions
 sunscreen+0.152
 them+0.148
 Nem+0.144
 oats+0.143
lights+0.140
Neg logit contributions
 code-0.113
 reader-0.108
case-0.106
 student-0.105
Two-0.098
Token.
Token ID13
Feature activation-2.19e-02

Pos logit contributions
 sunscreen+0.089
 them+0.087
 Nem+0.084
 oats+0.084
lights+0.083
Neg logit contributions
 code-0.057
 reader-0.053
case-0.053
 student-0.052
 equation-0.048
Token He
Token ID679
Feature activation-3.53e-02

Pos logit contributions
 sunscreen+0.034
 them+0.033
 oats+0.032
lights+0.031
 Nem+0.030
Neg logit contributions
 code-0.030
case-0.030
 reader-0.029
 student-0.029
Two-0.027
Token her
Token ID607
Feature activation-1.06e-01

Pos logit contributions
lights+0.173
cast+0.164
 sunscreen+0.162
ric+0.161
 oats+0.150
Neg logit contributions
 should-0.224
 student-0.218
 character-0.214
 code-0.214
 reader-0.213
Token a
Token ID257
Feature activation-2.47e-02

Pos logit contributions
 sunscreen+0.156
 them+0.152
lights+0.149
 Nem+0.147
 oats+0.147
Neg logit contributions
 code-0.149
 reader-0.143
case-0.143
 student-0.142
ird-0.130
Token band
Token ID4097
Feature activation-1.25e-01

Pos logit contributions
 sunscreen+0.037
 them+0.035
 oats+0.035
 Nem+0.034
lights+0.032
Neg logit contributions
 code-0.036
 reader-0.035
case-0.034
 student-0.034
 equation-0.032
Token-
Token ID12
Feature activation-1.57e-01

Pos logit contributions
 sunscreen+0.235
 Nem+0.229
 them+0.229
lights+0.226
 oats+0.225
Neg logit contributions
 code-0.128
 reader-0.124
 student-0.123
case-0.122
ird-0.112
Tokenaid
Token ID1698
Feature activation-1.85e-01

Pos logit contributions
cast+0.231
 sunscreen+0.230
lights+0.229
 Nem+0.221
 oats+0.214
Neg logit contributions
 reader-0.218
 student-0.217
Two-0.215
 code-0.213
ird-0.207
Token and
Token ID290
Feature activation-1.87e-01

Pos logit contributions
 sunscreen+0.303
 them+0.284
 Nem+0.282
 oats+0.279
lights+0.273
Neg logit contributions
 code-0.232
case-0.229
 reader-0.219
 student-0.217
ird-0.208
Token a
Token ID257
Feature activation+2.43e-03

Pos logit contributions
 sunscreen+0.317
 them+0.310
 Nem+0.300
 oats+0.299
lights+0.298
Neg logit contributions
 code-0.228
 reader-0.214
 student-0.212
case-0.212
ird-0.195
Token hug
Token ID16225
Feature activation-8.73e-02

Pos logit contributions
 code+0.004
 reader+0.004
case+0.004
 student+0.004
 equation+0.003
Neg logit contributions
 sunscreen-0.003
 them-0.003
 oats-0.003
 Nem-0.003
lights-0.003
Token.
Token ID13
Feature activation-1.82e-02

Pos logit contributions
 them+0.142
 sunscreen+0.141
 Nem+0.134
 oats+0.133
lights+0.129
Neg logit contributions
 code-0.118
case-0.111
 reader-0.111
 student-0.108
Two-0.105
Token He
Token ID679
Feature activation-3.72e-02

Pos logit contributions
 sunscreen+0.031
 them+0.030
 oats+0.030
lights+0.029
 Nem+0.029
Neg logit contributions
 code-0.022
 reader-0.021
case-0.021
 student-0.021
Two-0.020
Token said
Token ID531
Feature activation-2.73e-02

Pos logit contributions
 sunscreen+0.061
 them+0.060
 Nem+0.058
 oats+0.058
lights+0.055
Neg logit contributions
 code-0.049
 reader-0.046
 student-0.045
case-0.045
Two-0.043
Token you
Token ID345
Feature activation-1.06e-01

Pos logit contributions
 sunscreen+0.107
lights+0.096
 them+0.096
 oats+0.093
 Nem+0.093
Neg logit contributions
 code-0.211
 reader-0.205
 student-0.203
case-0.202
Two-0.192
Token would
Token ID561
Feature activation-7.15e-02

Pos logit contributions
 them+0.164
 sunscreen+0.162
 Nem+0.156
 oats+0.150
cast+0.139
Neg logit contributions
 code-0.149
 reader-0.144
 student-0.142
case-0.138
 equation-0.137
Token share
Token ID2648
Feature activation-9.09e-02

Pos logit contributions
 them+0.109
 sunscreen+0.107
 Nem+0.101
 oats+0.098
lights+0.091
Neg logit contributions
 code-0.103
 reader-0.100
 student-0.098
case-0.097
 equation-0.094
Token the
Token ID262
Feature activation-1.06e-01

Pos logit contributions
 sunscreen+0.128
lights+0.120
 oats+0.120
 Nem+0.119
 them+0.115
Neg logit contributions
 code-0.138
 reader-0.131
case-0.129
 student-0.129
 equation-0.120
Token toys
Token ID14958
Feature activation-8.42e-02

Pos logit contributions
 sunscreen+0.159
 them+0.152
 oats+0.148
 Nem+0.147
lights+0.140
Neg logit contributions
 code-0.154
 reader-0.148
case-0.148
 student-0.145
Two-0.138
Token or
Token ID393
Feature activation-1.87e-01

Pos logit contributions
 sunscreen+0.135
 them+0.130
 Nem+0.127
 oats+0.127
lights+0.124
Neg logit contributions
 code-0.113
 reader-0.108
case-0.107
 student-0.106
 equation-0.098
Token not
Token ID407
Feature activation-1.06e-01

Pos logit contributions
 sunscreen+0.291
lights+0.281
 them+0.278
 oats+0.276
 Nem+0.276
Neg logit contributions
 code-0.254
 student-0.234
 reader-0.234
case-0.232
 liar-0.219
Token.
Token ID13
Feature activation-1.23e-02

Pos logit contributions
 sunscreen+0.157
 them+0.157
 Nem+0.149
 oats+0.146
lights+0.133
Neg logit contributions
 code-0.157
 reader-0.154
case-0.151
 student-0.149
Two-0.144
Token He
Token ID679
Feature activation-3.23e-02

Pos logit contributions
 sunscreen+0.019
 them+0.019
 Nem+0.018
 oats+0.017
lights+0.017
Neg logit contributions
 code-0.017
 reader-0.016
case-0.016
 student-0.016
 equation-0.015
Token was
Token ID373
Feature activation-2.81e-03

Pos logit contributions
 them+0.051
 Nem+0.050
 sunscreen+0.050
 oats+0.049
cast+0.045
Neg logit contributions
 code-0.041
 student-0.041
 reader-0.040
 equation-0.038
room-0.038
Token testing
Token ID4856
Feature activation-6.50e-02

Pos logit contributions
lights+0.005
 sunscreen+0.005
cast+0.004
 oats+0.004
 Nem+0.004
Neg logit contributions
 code-0.004
 student-0.003
 reader-0.003
 liar-0.003
case-0.003
Token better
Token ID1365
Feature activation-9.96e-02

Pos logit contributions
 sunscreen+0.095
 them+0.092
 Nem+0.089
 oats+0.088
lights+0.086
Neg logit contributions
 code-0.104
 reader-0.100
case-0.099
 student-0.098
Two-0.093
Token to
Token ID284
Feature activation-3.80e-02

Pos logit contributions
 sunscreen+0.152
 them+0.150
 Nem+0.143
 oats+0.143
lights+0.136
Neg logit contributions
 code-0.142
case-0.137
 reader-0.135
 student-0.132
Two-0.128
Token share
Token ID2648
Feature activation-8.04e-02

Pos logit contributions
 sunscreen+0.055
 them+0.055
 Nem+0.052
 oats+0.050
lights+0.045
Neg logit contributions
 code-0.058
 reader-0.057
 student-0.056
case-0.055
 equation-0.053
Token your
Token ID534
Feature activation-1.09e-01

Pos logit contributions
 sunscreen+0.117
lights+0.115
ric+0.113
 oats+0.112
 Nem+0.111
Neg logit contributions
 code-0.114
 reader-0.108
 student-0.106
case-0.105
 character-0.098
Token words
Token ID2456
Feature activation-1.15e-01

Pos logit contributions
 sunscreen+0.149
 them+0.148
 Nem+0.142
lights+0.142
 oats+0.139
Neg logit contributions
 code-0.169
 reader-0.159
 student-0.158
case-0.156
 liar-0.148
Token and
Token ID290
Feature activation-1.86e-01

Pos logit contributions
 sunscreen+0.186
 them+0.178
 oats+0.175
 Nem+0.175
lights+0.169
Neg logit contributions
 code-0.152
case-0.146
 reader-0.145
 student-0.143
 equation-0.132
Token your
Token ID534
Feature activation-9.51e-02

Pos logit contributions
 sunscreen+0.300
lights+0.293
 Nem+0.291
 oats+0.289
 them+0.282
Neg logit contributions
 code-0.251
 student-0.229
 reader-0.228
case-0.222
 liar-0.209
Token friendship
Token ID14738
Feature activation-1.08e-01

Pos logit contributions
 sunscreen+0.139
 them+0.135
 Nem+0.131
 oats+0.130
lights+0.127
Neg logit contributions
 code-0.142
 reader-0.135
 student-0.133
case-0.133
 equation-0.125
Token.
Token ID13
Feature activation-1.49e-02

Pos logit contributions
 sunscreen+0.197
 them+0.188
 Nem+0.186
 oats+0.186
lights+0.179
Neg logit contributions
 code-0.126
case-0.118
 reader-0.117
 student-0.115
 equation-0.105
Token<|endoftext|>
Token ID50256
Feature activation-3.46e-02

Pos logit contributions
 sunscreen+0.024
 them+0.024
 Nem+0.023
 oats+0.023
lights+0.022
Neg logit contributions
 code-0.020
 reader-0.019
case-0.018
 student-0.018
 equation-0.017
TokenL
Token ID43
Feature activation-7.88e-03

Pos logit contributions
 them+0.042
 sunscreen+0.039
 Nem+0.039
 oats+0.036
cast+0.033
Neg logit contributions
 code-0.063
 reader-0.060
 student-0.060
case-0.057
 liar-0.057
Token better
Token ID1365
Feature activation-1.03e-01

Pos logit contributions
 sunscreen+0.206
lights+0.198
ric+0.191
cast+0.190
icles+0.190
Neg logit contributions
 code-0.127
 should-0.124
 liar-0.118
case-0.115
 student-0.113
Token by
Token ID416
Feature activation-7.97e-02

Pos logit contributions
 them+0.167
 sunscreen+0.166
 Nem+0.160
 oats+0.158
lights+0.151
Neg logit contributions
 code-0.139
case-0.133
 reader-0.129
 student-0.128
Two-0.124
Token giving
Token ID3501
Feature activation-1.26e-01

Pos logit contributions
 sunscreen+0.107
lights+0.102
 oats+0.100
cast+0.091
 Nem+0.090
Neg logit contributions
 code-0.121
 student-0.116
 figure-0.113
 reader-0.112
 liar-0.110
Token him
Token ID683
Feature activation-9.35e-02

Pos logit contributions
lights+0.135
 sunscreen+0.133
 oats+0.125
ric+0.121
cast+0.115
Neg logit contributions
 code-0.217
 should-0.211
 reader-0.210
 student-0.210
 character-0.200
Token hugs
Token ID40657
Feature activation-1.29e-01

Pos logit contributions
lights+0.148
 sunscreen+0.139
cast+0.139
ric+0.138
 Nem+0.134
Neg logit contributions
 code-0.113
 reader-0.108
 student-0.108
 liar-0.101
case-0.100
Token and
Token ID290
Feature activation-1.86e-01

Pos logit contributions
 sunscreen+0.197
 oats+0.182
lights+0.179
 Nem+0.175
cast+0.174
Neg logit contributions
 code-0.179
 reader-0.174
 student-0.171
case-0.166
 should-0.161
Token telling
Token ID5149
Feature activation-6.55e-02

Pos logit contributions
 sunscreen+0.254
 them+0.248
lights+0.245
 Nem+0.240
 oats+0.239
Neg logit contributions
 code-0.283
 student-0.264
 reader-0.263
case-0.255
 should-0.250
Token him
Token ID683
Feature activation-8.11e-02

Pos logit contributions
 sunscreen+0.060
lights+0.057
cast+0.053
 oats+0.052
oming+0.050
Neg logit contributions
 code-0.126
 student-0.123
 reader-0.121
 character-0.119
 should-0.119
Token it
Token ID340
Feature activation-1.34e-01

Pos logit contributions
 sunscreen+0.118
lights+0.109
 Nem+0.106
 oats+0.106
bur+0.102
Neg logit contributions
case-0.117
 code-0.117
 should-0.113
 reader-0.112
 student-0.111
Token was
Token ID373
Feature activation-2.30e-02

Pos logit contributions
 sunscreen+0.241
 them+0.225
 oats+0.224
 Nem+0.224
lights+0.219
Neg logit contributions
 code-0.160
case-0.153
 reader-0.148
 student-0.145
 equation-0.135
Token okay
Token ID8788
Feature activation-6.37e-02

Pos logit contributions
 sunscreen+0.034
lights+0.032
 oats+0.032
 them+0.032
 Nem+0.031
Neg logit contributions
 code-0.033
 reader-0.031
 student-0.031
case-0.031
 liar-0.029
Token had
Token ID550
Feature activation+7.54e-02

Pos logit contributions
 code+0.128
case+0.120
 reader+0.119
 student+0.119
 liar+0.117
Neg logit contributions
 sunscreen-0.083
 them-0.083
 Nem-0.079
 oats-0.075
cast-0.068
Token a
Token ID257
Feature activation+1.06e-01

Pos logit contributions
 code+0.103
 reader+0.098
case+0.097
 student+0.096
 equation+0.090
Neg logit contributions
 sunscreen-0.121
 them-0.117
 Nem-0.113
 oats-0.113
lights-0.109
Token dream
Token ID4320
Feature activation+4.36e-02

Pos logit contributions
 code+0.175
Warning+0.164
case+0.163
 reader+0.158
 student+0.156
Neg logit contributions
 sunscreen-0.148
 oats-0.134
 Nem-0.134
 them-0.133
cast-0.112
Token.
Token ID13
Feature activation+7.26e-02

Pos logit contributions
 code+0.059
 reader+0.056
 student+0.055
case+0.055
Two+0.052
Neg logit contributions
 sunscreen-0.069
 them-0.068
 oats-0.064
 Nem-0.064
lights-0.064
Token It
Token ID632
Feature activation-2.32e-02

Pos logit contributions
 code+0.106
 liar+0.097
 reader+0.095
 student+0.093
 client+0.090
Neg logit contributions
 them-0.112
 sunscreen-0.107
 Nem-0.106
lights-0.100
 oats-0.094
Token was
Token ID373
Feature activation+7.37e-02

Pos logit contributions
 them+0.036
 Nem+0.035
 sunscreen+0.034
lights+0.034
 oats+0.033
Neg logit contributions
 code-0.031
 reader-0.031
 student-0.030
 liar-0.030
 equation-0.028
Token a
Token ID257
Feature activation+1.04e-01

Pos logit contributions
 code+0.100
 reader+0.095
case+0.094
 student+0.094
 equation+0.087
Neg logit contributions
 sunscreen-0.118
 them-0.114
 Nem-0.110
 oats-0.110
lights-0.108
Token big
Token ID1263
Feature activation+1.08e-01

Pos logit contributions
 code+0.171
Warning+0.171
case+0.164
 reader+0.163
ird+0.162
Neg logit contributions
 sunscreen-0.131
 Nem-0.127
 oats-0.120
 them-0.119
 seas-0.110
Token,
Token ID11
Feature activation+6.28e-03

Pos logit contributions
 code+0.172
Warning+0.167
case+0.167
ird+0.163
Two+0.161
Neg logit contributions
 sunscreen-0.145
 Nem-0.141
 them-0.137
 oats-0.134
lights-0.125
Token tall
Token ID7331
Feature activation+1.02e-01

Pos logit contributions
 code+0.010
 reader+0.009
case+0.009
 student+0.009
Two+0.009
Neg logit contributions
 sunscreen-0.009
 them-0.009
 Nem-0.008
 oats-0.008
lights-0.008
Token dream
Token ID4320
Feature activation+3.72e-02

Pos logit contributions
 code+0.154
Warning+0.147
case+0.145
ird+0.141
 reader+0.140
Neg logit contributions
 sunscreen-0.147
 Nem-0.141
 oats-0.141
 them-0.137
lights-0.131
Token Molly
Token ID30236
Feature activation+3.07e-02

Pos logit contributions
case+0.018
 code+0.018
 reader+0.017
Two+0.017
 equation+0.017
Neg logit contributions
 them-0.012
 sunscreen-0.010
 Nem-0.009
 oats-0.009
lights-0.009
Token to
Token ID284
Feature activation+4.28e-02

Pos logit contributions
 code+0.053
 reader+0.051
case+0.050
 student+0.050
Two+0.048
Neg logit contributions
 sunscreen-0.038
 them-0.036
 Nem-0.035
 oats-0.035
lights-0.034
Token come
Token ID1282
Feature activation-3.55e-02

Pos logit contributions
 code+0.060
 reader+0.059
case+0.058
 student+0.056
 equation+0.055
Neg logit contributions
 sunscreen-0.067
 them-0.066
 Nem-0.062
 oats-0.061
lights-0.057
Token with
Token ID351
Feature activation-2.07e-02

Pos logit contributions
 sunscreen+0.048
 oats+0.045
 them+0.045
 Nem+0.045
lights+0.044
Neg logit contributions
 code-0.057
 reader-0.055
 student-0.054
case-0.053
Two-0.050
Token them
Token ID606
Feature activation-4.32e-02

Pos logit contributions
 them+0.018
 sunscreen+0.018
 oats+0.016
 Nem+0.015
lights+0.014
Neg logit contributions
 code-0.044
case-0.044
 reader-0.043
 student-0.042
Two-0.041
Token.
Token ID13
Feature activation+5.04e-02

Pos logit contributions
 them+0.052
 sunscreen+0.045
 as+0.044
 oats+0.043
 Nem+0.041
Neg logit contributions
 liar-0.083
Two-0.082
 code-0.081
lder-0.079
 equation-0.079
Token Molly
Token ID30236
Feature activation+5.94e-02

Pos logit contributions
 code+0.088
 reader+0.081
case+0.080
 liar+0.080
 equation+0.078
Neg logit contributions
 them-0.069
 sunscreen-0.065
 Nem-0.064
 oats-0.058
lights-0.057
Token nods
Token ID34833
Feature activation-2.53e-02

Pos logit contributions
 code+0.076
 reader+0.071
case+0.070
 student+0.070
 liar+0.067
Neg logit contributions
 them-0.099
 sunscreen-0.099
 oats-0.094
 Nem-0.093
cast-0.088
Token and
Token ID290
Feature activation-8.90e-02

Pos logit contributions
 sunscreen+0.034
 them+0.033
 Nem+0.031
 oats+0.031
lights+0.031
Neg logit contributions
 code-0.041
 reader-0.040
case-0.039
 student-0.039
Two-0.037
Token follows
Token ID5679
Feature activation+2.33e-02

Pos logit contributions
lights+0.167
 sunscreen+0.166
 Nem+0.164
 oats+0.161
 them+0.159
Neg logit contributions
 reader-0.088
 code-0.087
 student-0.086
case-0.080
 equation-0.076
Token them
Token ID606
Feature activation-4.13e-02

Pos logit contributions
 code+0.045
 reader+0.044
 student+0.044
case+0.042
Two+0.041
Neg logit contributions
 sunscreen-0.023
 oats-0.021
lights-0.021
 Nem-0.021
 them-0.021
Token the
Token ID262
Feature activation-6.97e-03

Pos logit contributions
 code+0.071
 reader+0.069
 student+0.069
case+0.067
 equation+0.066
Neg logit contributions
 them-0.041
 sunscreen-0.038
 Nem-0.036
 oats-0.035
lights-0.031
Token arm
Token ID3211
Feature activation+2.93e-02

Pos logit contributions
 sunscreen+0.010
 oats+0.009
 them+0.009
 everything+0.008
 attention+0.008
Neg logit contributions
 reader-0.010
lder-0.010
ird-0.010
 code-0.010
 liar-0.010
Tokenchair
Token ID16337
Feature activation+3.19e-02

Pos logit contributions
lder+0.051
 liar+0.048
 reader+0.047
 code+0.047
 vain+0.046
Neg logit contributions
lights-0.037
 he-0.030
 them-0.030
 as-0.029
icles-0.029
Token.
Token ID13
Feature activation+4.77e-02

Pos logit contributions
Two+0.061
 code+0.059
 liar+0.059
 reader+0.059
lder+0.059
Neg logit contributions
 them-0.035
 as-0.032
lights-0.026
 Nem-0.025
uses-0.025
Token It
Token ID632
Feature activation-2.52e-02

Pos logit contributions
 code+0.079
 reader+0.075
 student+0.074
case+0.074
ird+0.071
Neg logit contributions
 sunscreen-0.063
 Nem-0.063
 them-0.062
 oats-0.060
lights-0.055
Token was
Token ID373
Feature activation+6.79e-02

Pos logit contributions
 them+0.034
 Nem+0.033
 oats+0.031
cast+0.031
 sunscreen+0.031
Neg logit contributions
 reader-0.041
 code-0.040
 student-0.039
 liar-0.037
ird-0.037
Token huge
Token ID3236
Feature activation+8.80e-02

Pos logit contributions
 code+0.115
 reader+0.114
case+0.112
 student+0.110
Two+0.109
Neg logit contributions
 sunscreen-0.087
 them-0.086
 Nem-0.081
 oats-0.081
lights-0.073
Token,
Token ID11
Feature activation+1.61e-03

Pos logit contributions
 code+0.153
Two+0.150
 student+0.147
case+0.146
 reader+0.145
Neg logit contributions
 them-0.106
 oats-0.104
 sunscreen-0.103
 Nem-0.100
lights-0.098
Token with
Token ID351
Feature activation+8.76e-04

Pos logit contributions
 code+0.003
 reader+0.002
 student+0.002
case+0.002
Two+0.002
Neg logit contributions
 sunscreen-0.002
 them-0.002
 Nem-0.002
 oats-0.002
lights-0.002
Token big
Token ID1263
Feature activation+9.96e-02

Pos logit contributions
 code+0.001
 reader+0.001
case+0.001
 student+0.001
 equation+0.001
Neg logit contributions
 sunscreen-0.001
 them-0.001
 oats-0.001
 Nem-0.001
lights-0.001
Token eyes
Token ID2951
Feature activation+8.54e-03

Pos logit contributions
 code+0.210
case+0.209
 student+0.205
 reader+0.204
shop+0.203
Neg logit contributions
 sunscreen-0.084
 them-0.080
 oats-0.079
 Nem-0.069
lights-0.065
Token could
Token ID714
Feature activation+7.43e-02

Pos logit contributions
 sunscreen+0.022
 them+0.021
 Nem+0.021
 oats+0.020
lights+0.020
Neg logit contributions
 code-0.015
 reader-0.014
 student-0.014
case-0.013
 equation-0.013
Token zoom
Token ID19792
Feature activation+7.52e-04

Pos logit contributions
 code+0.101
case+0.098
 reader+0.094
 student+0.093
Two+0.088
Neg logit contributions
 sunscreen-0.118
 Nem-0.113
 them-0.113
 oats-0.113
lights-0.109
Token very
Token ID845
Feature activation+4.30e-02

Pos logit contributions
 code+0.001
 reader+0.001
case+0.001
 student+0.001
Two+0.001
Neg logit contributions
 sunscreen-0.001
 them-0.001
 Nem-0.001
 oats-0.001
lights-0.001
Token,
Token ID11
Feature activation+9.55e-04

Pos logit contributions
 code+0.070
 reader+0.069
 student+0.068
case+0.065
ird+0.065
Neg logit contributions
 sunscreen-0.055
 them-0.055
 Nem-0.052
lights-0.051
 oats-0.050
Token very
Token ID845
Feature activation+5.64e-02

Pos logit contributions
 code+0.001
case+0.001
 figure+0.001
 student+0.001
 reader+0.001
Neg logit contributions
 sunscreen-0.001
 oats-0.001
lights-0.001
 Nem-0.001
ishes-0.001
Token fast
Token ID3049
Feature activation+8.02e-02

Pos logit contributions
 code+0.090
 reader+0.088
 student+0.086
case+0.085
ird+0.082
Neg logit contributions
 sunscreen-0.076
 them-0.075
 Nem-0.071
lights-0.070
 oats-0.069
Token.
Token ID13
Feature activation+7.15e-02

Pos logit contributions
 code+0.120
 reader+0.118
Two+0.117
case+0.114
 student+0.114
Neg logit contributions
 them-0.121
 sunscreen-0.114
 Nem-0.108
lights-0.107
 as-0.104
Token The
Token ID383
Feature activation-4.57e-04

Pos logit contributions
 code+0.125
 reader+0.119
 student+0.116
 liar+0.114
 equation+0.111
Neg logit contributions
 them-0.093
 sunscreen-0.090
 Nem-0.090
lights-0.078
 oats-0.078
Token car
Token ID1097
Feature activation+5.01e-02

Pos logit contributions
 sunscreen+0.001
 them+0.001
 Nem+0.001
 oats+0.001
lights+0.001
Neg logit contributions
 code-0.001
 reader-0.001
case-0.001
 student-0.001
Two-0.001
Token zoom
Token ID19792
Feature activation+4.37e-03

Pos logit contributions
 reader+0.070
 code+0.069
 student+0.066
Warning+0.066
 equation+0.065
Neg logit contributions
 them-0.075
 sunscreen-0.072
 Nem-0.069
lights-0.069
cast-0.067
Tokened
Token ID276
Feature activation+3.10e-02

Pos logit contributions
 code+0.008
 reader+0.007
 student+0.007
 equation+0.007
case+0.007
Neg logit contributions
 them-0.006
 sunscreen-0.005
 Nem-0.005
lights-0.005
cast-0.005
Token wanted
Token ID2227
Feature activation+8.09e-02

Pos logit contributions
 sunscreen+0.002
 Nem+0.002
 them+0.002
 oats+0.002
 everything+0.002
Neg logit contributions
 code-0.002
case-0.002
 student-0.002
 reader-0.002
ird-0.002
Token to
Token ID284
Feature activation+6.86e-02

Pos logit contributions
 code+0.142
 reader+0.137
 student+0.135
case+0.135
Two+0.128
Neg logit contributions
 sunscreen-0.097
 them-0.093
 Nem-0.090
 oats-0.089
lights-0.088
Token go
Token ID467
Feature activation+3.21e-02

Pos logit contributions
 code+0.115
 reader+0.114
 vain+0.109
 equation+0.109
 liar+0.108
Neg logit contributions
 sunscreen-0.092
 them-0.085
 Nem-0.079
 oats-0.076
 everything-0.069
Token to
Token ID284
Feature activation+5.72e-02

Pos logit contributions
 code+0.041
 reader+0.039
case+0.037
 student+0.037
 clue+0.036
Neg logit contributions
lights-0.052
 Nem-0.050
 sunscreen-0.050
ric-0.049
 them-0.047
Token school
Token ID1524
Feature activation+7.69e-02

Pos logit contributions
 code+0.103
 reader+0.099
case+0.098
 student+0.097
Two+0.093
Neg logit contributions
 sunscreen-0.067
 them-0.064
 Nem-0.061
 oats-0.061
lights-0.058
Token.
Token ID13
Feature activation+6.71e-02

Pos logit contributions
 code+0.120
Warning+0.118
 liar+0.115
lder+0.113
Two+0.112
Neg logit contributions
 them-0.108
 sunscreen-0.099
 as-0.094
cast-0.092
lights-0.091
Token He
Token ID679
Feature activation+4.43e-02

Pos logit contributions
 code+0.099
 reader+0.094
 student+0.092
case+0.092
 liar+0.087
Neg logit contributions
 sunscreen-0.100
 them-0.098
 Nem-0.094
 oats-0.092
lights-0.090
Token asked
Token ID1965
Feature activation+4.35e-02

Pos logit contributions
 code+0.076
 liar+0.071
Warning+0.070
 reader+0.069
room+0.067
Neg logit contributions
 them-0.059
 sunscreen-0.055
 Nem-0.055
 oats-0.050
 everything-0.050
Token his
Token ID465
Feature activation+3.39e-02

Pos logit contributions
 code+0.107
case+0.106
 reader+0.105
 student+0.103
Two+0.102
Neg logit contributions
 them-0.023
 sunscreen-0.021
 Nem-0.018
 oats-0.017
lights-0.014
Token mom
Token ID1995
Feature activation+2.42e-02

Pos logit contributions
Warning+0.073
case+0.071
ird+0.070
ified+0.070
shop+0.067
Neg logit contributions
 sunscreen-0.026
 them-0.025
 oats-0.022
 Nem-0.022
 attention-0.021
Token,
Token ID11
Feature activation-1.03e-02

Pos logit contributions
 code+0.037
 reader+0.036
 student+0.036
case+0.035
Two+0.033
Neg logit contributions
 sunscreen-0.035
 them-0.032
 Nem-0.032
 oats-0.032
lights-0.032
Token said
Token ID531
Feature activation+1.46e-02

Pos logit contributions
 code+0.003
 reader+0.002
case+0.002
 liar+0.002
room+0.002
Neg logit contributions
 them-0.003
 sunscreen-0.003
 Nem-0.003
 as-0.002
 oats-0.002
Token.
Token ID13
Feature activation+1.45e-02

Pos logit contributions
 code+0.021
 reader+0.020
case+0.020
Two+0.020
 student+0.019
Neg logit contributions
 them-0.022
 sunscreen-0.021
 Nem-0.021
 oats-0.019
lights-0.019
Token Her
Token ID2332
Feature activation-1.65e-02

Pos logit contributions
 code+0.026
 reader+0.025
 liar+0.024
 student+0.024
case+0.024
Neg logit contributions
 them-0.018
 sunscreen-0.017
 Nem-0.017
lights-0.015
 oats-0.015
Token mum
Token ID25682
Feature activation+3.52e-03

Pos logit contributions
 sunscreen+0.015
 them+0.015
 Nem+0.014
 oats+0.013
lights+0.013
Neg logit contributions
 code-0.033
 reader-0.032
 student-0.032
case-0.032
 equation-0.031
Token was
Token ID373
Feature activation+3.63e-02

Pos logit contributions
 code+0.006
 reader+0.005
case+0.005
 student+0.005
 liar+0.005
Neg logit contributions
 them-0.005
 sunscreen-0.005
 Nem-0.005
 oats-0.005
lights-0.004
Token surprised
Token ID6655
Feature activation-3.42e-03

Pos logit contributions
 code+0.059
 reader+0.056
case+0.055
 student+0.054
Two+0.052
Neg logit contributions
 sunscreen-0.050
 them-0.050
 Nem-0.046
 oats-0.045
lights-0.043
Token and
Token ID290
Feature activation-9.86e-02

Pos logit contributions
 sunscreen+0.006
 oats+0.006
 Nem+0.005
ric+0.005
lights+0.005
Neg logit contributions
 code-0.004
 reader-0.004
 student-0.004
case-0.004
 figure-0.004
Token asked
Token ID1965
Feature activation+1.64e-02

Pos logit contributions
 sunscreen+0.172
 them+0.166
 oats+0.162
 Nem+0.161
lights+0.158
Neg logit contributions
 code-0.120
 reader-0.114
case-0.113
 student-0.112
 equation-0.104
Token:
Token ID25
Feature activation+9.62e-04

Pos logit contributions
case+0.029
 reader+0.028
 code+0.028
Two+0.027
 liar+0.027
Neg logit contributions
 them-0.024
 sunscreen-0.018
 everything-0.018
lights-0.018
 Nem-0.018
Token "
Token ID366
Feature activation-1.27e-02

Pos logit contributions
 code+0.001
case+0.001
 reader+0.001
 student+0.001
Two+0.001
Neg logit contributions
 sunscreen-0.002
 oats-0.002
lights-0.002
 them-0.001
 Nem-0.001
TokenWhat
Token ID2061
Feature activation+7.57e-03

Pos logit contributions
 sunscreen+0.019
 them+0.019
 Nem+0.018
 oats+0.018
lights+0.018
Neg logit contributions
 code-0.018
 reader-0.017
case-0.017
 student-0.017
 equation-0.016
Token,
Token ID11
Feature activation+3.74e-03

Pos logit contributions
 code+0.048
 student+0.047
 reader+0.047
shop+0.046
Warning+0.046
Neg logit contributions
 them-0.034
 Nem-0.033
 sunscreen-0.031
 oats-0.030
 as-0.028
Token I
Token ID314
Feature activation+5.55e-02

Pos logit contributions
 code+0.005
 figure+0.004
 x+0.004
 liar+0.004
 cheat+0.004
Neg logit contributions
lights-0.007
cast-0.007
 sunscreen-0.007
ishes-0.006
ocked-0.006
Token'm
Token ID1101
Feature activation+3.39e-02

Pos logit contributions
 code+0.076
case+0.072
 reader+0.071
 student+0.071
shop+0.069
Neg logit contributions
 them-0.088
 sunscreen-0.086
 Nem-0.083
 oats-0.079
ric-0.075
Token sorry
Token ID7926
Feature activation+7.11e-04

Pos logit contributions
 liar+0.035
case+0.035
 code+0.034
 student+0.033
 x+0.033
Neg logit contributions
lights-0.065
 sunscreen-0.062
cast-0.060
 oats-0.058
icles-0.057
Token,
Token ID11
Feature activation-1.53e-03

Pos logit contributions
 code+0.001
case+0.001
 reader+0.001
 student+0.001
 liar+0.001
Neg logit contributions
 sunscreen-0.001
 oats-0.001
lights-0.001
 Nem-0.001
 them-0.001
Token but
Token ID475
Feature activation-3.37e-02

Pos logit contributions
 sunscreen+0.003
lights+0.002
 oats+0.002
cast+0.002
ric+0.002
Neg logit contributions
 code-0.002
 student-0.002
case-0.002
 liar-0.002
 figure-0.002
Token I
Token ID314
Feature activation+5.30e-02

Pos logit contributions
 sunscreen+0.060
lights+0.060
ans+0.056
cast+0.056
ocked+0.055
Neg logit contributions
 should-0.041
 code-0.041
 figure-0.039
 student-0.036
 mystery-0.036
Token didn
Token ID1422
Feature activation+7.61e-03

Pos logit contributions
 code+0.077
 reader+0.073
 student+0.072
case+0.071
shop+0.068
Neg logit contributions
 them-0.078
 sunscreen-0.078
 Nem-0.077
 oats-0.072
ric-0.069
Token't
Token ID470
Feature activation-1.58e-02

Pos logit contributions
 code+0.011
 reader+0.011
 student+0.010
case+0.010
 equation+0.010
Neg logit contributions
 sunscreen-0.011
 them-0.011
 Nem-0.011
 oats-0.011
lights-0.010
Token make
Token ID787
Feature activation+1.39e-03

Pos logit contributions
 them+0.030
 sunscreen+0.029
 Nem+0.027
 oats+0.027
 attention+0.026
Neg logit contributions
case-0.018
 code-0.017
 reader-0.016
shop-0.016
 student-0.016
Token you
Token ID345
Feature activation+1.20e-02

Pos logit contributions
 code+0.002
 reader+0.002
 student+0.002
case+0.002
 equation+0.002
Neg logit contributions
 sunscreen-0.002
lights-0.002
 oats-0.002
 Nem-0.001
cast-0.001
Token are
Token ID389
Feature activation+5.39e-03

Pos logit contributions
 sunscreen+0.027
 them+0.027
 Nem+0.026
 oats+0.025
 everything+0.024
Neg logit contributions
 code-0.025
 reader-0.024
 student-0.024
 equation-0.024
shop-0.023
Token hurt
Token ID5938
Feature activation-7.11e-03

Pos logit contributions
 code+0.007
 reader+0.007
case+0.007
 student+0.007
Two+0.006
Neg logit contributions
 sunscreen-0.009
 them-0.008
 Nem-0.008
 oats-0.008
lights-0.008
Token.
Token ID13
Feature activation+1.44e-02

Pos logit contributions
 them+0.010
 sunscreen+0.008
 Nem+0.008
lights+0.008
 everything+0.008
Neg logit contributions
 code-0.013
 reader-0.013
shop-0.013
case-0.012
 student-0.012
Token Let
Token ID3914
Feature activation-4.95e-02

Pos logit contributions
 code+0.020
 reader+0.019
case+0.019
 student+0.018
 equation+0.017
Neg logit contributions
 sunscreen-0.023
 them-0.022
 Nem-0.022
 oats-0.021
lights-0.021
Token me
Token ID502
Feature activation-6.88e-03

Pos logit contributions
 sunscreen+0.091
lights+0.087
 Nem+0.086
 oats+0.086
cast+0.084
Neg logit contributions
 code-0.053
case-0.052
 student-0.049
 reader-0.049
Two-0.047
Token see
Token ID766
Feature activation-3.03e-02

Pos logit contributions
 sunscreen+0.010
 them+0.010
 Nem+0.009
 oats+0.009
lights+0.009
Neg logit contributions
 code-0.010
 reader-0.010
 student-0.010
case-0.010
 equation-0.009
Token.
Token ID13
Feature activation+1.02e-02

Pos logit contributions
 sunscreen+0.036
lights+0.034
 oats+0.033
 Nem+0.032
ric+0.031
Neg logit contributions
 code-0.052
 reader-0.049
 student-0.049
case-0.048
ird-0.048
Token We
Token ID775
Feature activation-2.33e-02

Pos logit contributions
 code+0.014
 reader+0.013
case+0.013
 student+0.013
 equation+0.012
Neg logit contributions
 sunscreen-0.017
 them-0.016
 Nem-0.016
 oats-0.015
lights-0.015
Token need
Token ID761
Feature activation+2.63e-02

Pos logit contributions
 them+0.039
 sunscreen+0.036
 Nem+0.035
 oats+0.034
 everything+0.032
Neg logit contributions
 code-0.031
 student-0.031
 reader-0.031
 equation-0.030
shop-0.029
Token to
Token ID284
Feature activation+1.06e-02

Pos logit contributions
 code+0.042
 student+0.039
 reader+0.038
case+0.038
 clue+0.037
Neg logit contributions
 sunscreen-0.034
lights-0.034
cast-0.033
 Nem-0.033
 oats-0.032
Token clean
Token ID3424
Feature activation-5.75e-02

Pos logit contributions
 reader+0.017
 code+0.017
 student+0.017
case+0.016
 equation+0.016
Neg logit contributions
 sunscreen-0.015
 them-0.015
 Nem-0.013
 oats-0.013
 everything-0.012
Token and
Token ID290
Feature activation-9.74e-02

Pos logit contributions
 code+0.031
 reader+0.029
case+0.029
 student+0.029
 equation+0.027
Neg logit contributions
 sunscreen-0.031
 oats-0.029
 them-0.028
 Nem-0.028
lights-0.028
Token everyone
Token ID2506
Feature activation-1.29e-02

Pos logit contributions
 sunscreen+0.160
 oats+0.146
lights+0.144
cast+0.140
ans+0.139
Neg logit contributions
 code-0.133
 should-0.123
 student-0.123
case-0.121
 reader-0.119
Token was
Token ID373
Feature activation+3.02e-02

Pos logit contributions
 sunscreen+0.023
 them+0.022
 oats+0.021
 Nem+0.021
lights+0.021
Neg logit contributions
 code-0.015
 reader-0.015
case-0.014
 student-0.014
 equation-0.013
Token happy
Token ID3772
Feature activation-2.74e-02

Pos logit contributions
 code+0.035
case+0.034
 student+0.033
 reader+0.033
 liar+0.031
Neg logit contributions
 sunscreen-0.052
lights-0.051
cast-0.049
 Nem-0.049
 oats-0.049
Token.
Token ID13
Feature activation+3.53e-02

Pos logit contributions
 sunscreen+0.047
 oats+0.043
 Nem+0.042
 them+0.042
lights+0.041
Neg logit contributions
 code-0.036
 reader-0.034
 student-0.034
case-0.033
 equation-0.030
Token<|endoftext|>
Token ID50256
Feature activation+1.62e-02

Pos logit contributions
 reader+0.039
case+0.039
 student+0.039
 code+0.038
Two+0.038
Neg logit contributions
 sunscreen-0.067
 oats-0.064
lights-0.060
 them-0.058
cast-0.058
TokenT
Token ID51
Feature activation+2.90e-02

Pos logit contributions
 code+0.031
 reader+0.030
 student+0.030
 liar+0.029
 client+0.028
Neg logit contributions
 them-0.020
 Nem-0.018
 sunscreen-0.016
 oats-0.015
cast-0.015
Tokenara
Token ID3301
Feature activation+4.96e-02

Pos logit contributions
 code+0.046
 reader+0.044
 student+0.044
 liar+0.044
case+0.043
Neg logit contributions
 them-0.040
 Nem-0.038
 sunscreen-0.038
 oats-0.035
cast-0.035
Token and
Token ID290
Feature activation-5.52e-02

Pos logit contributions
 code+0.089
 liar+0.082
 student+0.081
case+0.080
 reader+0.080
Neg logit contributions
 them-0.062
 Nem-0.061
 sunscreen-0.060
cast-0.053
 oats-0.053
Token Jack
Token ID3619
Feature activation+6.41e-02

Pos logit contributions
 sunscreen+0.068
lights+0.067
 oats+0.066
 them+0.065
 Nem+0.063
Neg logit contributions
 code-0.091
 reader-0.091
 student-0.089
case-0.088
Two-0.083
Token were
Token ID547
Feature activation+7.35e-02

Pos logit contributions
 code+0.103
 student+0.095
 reader+0.093
 equation+0.091
case+0.091
Neg logit contributions
 them-0.098
 Nem-0.095
 sunscreen-0.095
 oats-0.083
cast-0.082
Token Mean
Token ID22728
Feature activation-3.24e-02

Pos logit contributions
 sunscreen+0.037
 them+0.036
 oats+0.034
 Nem+0.034
lights+0.032
Neg logit contributions
 code-0.041
 reader-0.040
case-0.039
 student-0.039
 liar-0.038
Token Island
Token ID5451
Feature activation-4.33e-02

Pos logit contributions
 sunscreen+0.052
 them+0.051
 Nem+0.049
 oats+0.049
lights+0.047
Neg logit contributions
 code-0.044
case-0.042
 reader-0.041
 student-0.041
 equation-0.039
Token,
Token ID11
Feature activation-3.48e-02

Pos logit contributions
 sunscreen+0.067
 them+0.067
 Nem+0.065
 oats+0.064
lights+0.062
Neg logit contributions
 code-0.059
 reader-0.056
case-0.056
 student-0.055
 equation-0.053
Token the
Token ID262
Feature activation-3.04e-02

Pos logit contributions
 sunscreen+0.042
 them+0.039
 oats+0.038
 Nem+0.038
lights+0.037
Neg logit contributions
 code-0.061
 reader-0.059
 student-0.059
case-0.058
Two-0.056
Token people
Token ID661
Feature activation-2.79e-02

Pos logit contributions
 sunscreen+0.047
 them+0.045
 Nem+0.044
 oats+0.044
lights+0.040
Neg logit contributions
 code-0.044
case-0.044
 reader-0.042
ird-0.040
 student-0.040
Token were
Token ID547
Feature activation+2.58e-04

Pos logit contributions
 them+0.044
 sunscreen+0.041
 Nem+0.040
 oats+0.039
lights+0.038
Neg logit contributions
 code-0.040
 reader-0.038
case-0.037
 student-0.037
 equation-0.036
Token not
Token ID407
Feature activation-5.61e-02

Pos logit contributions
 code+0.000
 reader+0.000
case+0.000
 student+0.000
 equation+0.000
Neg logit contributions
 sunscreen-0.000
 them-0.000
 Nem-0.000
 oats-0.000
lights-0.000
Token so
Token ID523
Feature activation-3.74e-02

Pos logit contributions
 sunscreen+0.073
 them+0.073
 Nem+0.069
 oats+0.068
lights+0.063
Neg logit contributions
 code-0.093
case-0.090
 reader-0.089
 student-0.086
Two-0.085
Token kind
Token ID1611
Feature activation+1.39e-02

Pos logit contributions
 sunscreen+0.057
 them+0.057
 Nem+0.055
 oats+0.054
lights+0.052
Neg logit contributions
 code-0.053
 reader-0.051
case-0.050
 student-0.050
 equation-0.047
Token to
Token ID284
Feature activation+1.86e-02

Pos logit contributions
 code+0.022
case+0.021
 reader+0.020
Two+0.020
 equation+0.020
Neg logit contributions
 them-0.020
 sunscreen-0.018
 Nem-0.018
 oats-0.018
lights-0.017
Token one
Token ID530
Feature activation-5.36e-02

Pos logit contributions
 code+0.036
 reader+0.035
 student+0.035
case+0.034
Two+0.032
Neg logit contributions
 sunscreen-0.019
 oats-0.017
lights-0.017
 Nem-0.017
 them-0.017
Token She
Token ID1375
Feature activation+1.88e-02

Pos logit contributions
 code+0.048
 reader+0.046
 student+0.044
case+0.043
 liar+0.042
Neg logit contributions
 them-0.048
 sunscreen-0.047
 Nem-0.045
lights-0.042
 oats-0.042
Token gave
Token ID2921
Feature activation-1.71e-02

Pos logit contributions
 code+0.029
 liar+0.027
 reader+0.027
Warning+0.027
 student+0.026
Neg logit contributions
 them-0.026
 sunscreen-0.025
 Nem-0.023
cast-0.023
 oats-0.022
Token him
Token ID683
Feature activation+4.43e-03

Pos logit contributions
lights+0.028
ric+0.027
cast+0.026
rim+0.025
oming+0.025
Neg logit contributions
 student-0.020
room-0.020
 reader-0.020
 x-0.019
 code-0.019
Token the
Token ID262
Feature activation-1.30e-02

Pos logit contributions
 student+0.006
 reader+0.006
 code+0.005
case+0.005
Two+0.005
Neg logit contributions
lights-0.007
ric-0.007
 Nem-0.007
cast-0.007
 sunscreen-0.007
Token stamp
Token ID17977
Feature activation-1.30e-02

Pos logit contributions
 sunscreen+0.025
 them+0.025
 oats+0.024
 Nem+0.023
lights+0.023
Neg logit contributions
 code-0.013
case-0.013
 reader-0.013
 student-0.012
 equation-0.011
Token and
Token ID290
Feature activation-1.10e-01

Pos logit contributions
 sunscreen+0.018
 them+0.017
 Nem+0.016
 oats+0.016
lights+0.016
Neg logit contributions
 code-0.021
 reader-0.020
case-0.020
 student-0.020
 equation-0.019
Token took
Token ID1718
Feature activation+7.81e-03

Pos logit contributions
 sunscreen+0.154
 them+0.150
 Nem+0.143
 oats+0.142
lights+0.137
Neg logit contributions
 code-0.173
 reader-0.166
case-0.164
 student-0.162
Two-0.156
Token another
Token ID1194
Feature activation+6.08e-02

Pos logit contributions
 code+0.014
 reader+0.014
 student+0.014
case+0.014
Two+0.013
Neg logit contributions
 sunscreen-0.008
lights-0.008
 oats-0.008
ric-0.008
 Nem-0.007
Token one
Token ID530
Feature activation-1.14e-02

Pos logit contributions
 code+0.058
case+0.055
 reader+0.055
 student+0.053
 liar+0.050
Neg logit contributions
 sunscreen-0.119
 them-0.118
 oats-0.113
 Nem-0.112
lights-0.109
Token.
Token ID13
Feature activation+4.06e-02

Pos logit contributions
 them+0.017
 sunscreen+0.017
 oats+0.016
 Nem+0.015
lights+0.015
Neg logit contributions
 code-0.016
 reader-0.016
 liar-0.016
case-0.015
Two-0.015
Token It
Token ID632
Feature activation-2.67e-02

Pos logit contributions
case+0.048
 code+0.048
 reader+0.047
 student+0.046
Two+0.044
Neg logit contributions
 sunscreen-0.070
 oats-0.067
 them-0.067
lights-0.065
 Nem-0.065
Token,
Token ID11
Feature activation-7.44e-03

Pos logit contributions
Warning+0.087
 code+0.086
 reader+0.082
shop+0.082
case+0.080
Neg logit contributions
 them-0.056
 they-0.046
 as-0.044
atted-0.044
 Nem-0.044
Token saw
Token ID2497
Feature activation+6.80e-02

Pos logit contributions
 them+0.012
 sunscreen+0.011
 Nem+0.011
 oats+0.011
 everything+0.010
Neg logit contributions
 code-0.011
 reader-0.010
case-0.010
 student-0.010
 equation-0.010
Token what
Token ID644
Feature activation-6.27e-03

Pos logit contributions
 code+0.122
 should+0.115
room+0.115
 reader+0.115
 student+0.114
Neg logit contributions
 sunscreen-0.071
lights-0.066
 oats-0.061
cast-0.061
 Nem-0.060
Token was
Token ID373
Feature activation+5.23e-02

Pos logit contributions
 them+0.010
 Nem+0.009
 sunscreen+0.009
 oats+0.008
 everything+0.008
Neg logit contributions
 code-0.010
 reader-0.009
 student-0.009
case-0.008
 equation-0.008
Token happening
Token ID5836
Feature activation+2.20e-02

Pos logit contributions
 code+0.076
 reader+0.073
case+0.071
 student+0.071
Two+0.067
Neg logit contributions
 sunscreen-0.080
 them-0.077
 Nem-0.075
 oats-0.074
lights-0.072
Token and
Token ID290
Feature activation-8.62e-02

Pos logit contributions
 code+0.039
 reader+0.038
 student+0.037
case+0.037
 liar+0.037
Neg logit contributions
 them-0.029
 sunscreen-0.025
 Nem-0.024
 oats-0.024
lights-0.023
Token came
Token ID1625
Feature activation-6.40e-04

Pos logit contributions
 them+0.126
 sunscreen+0.126
 Nem+0.121
 oats+0.118
lights+0.114
Neg logit contributions
 code-0.131
 reader-0.125
case-0.123
 student-0.122
Two-0.116
Token over
Token ID625
Feature activation-8.79e-03

Pos logit contributions
 sunscreen+0.001
 them+0.001
 Nem+0.001
 oats+0.001
lights+0.001
Neg logit contributions
 code-0.001
 reader-0.001
case-0.001
 student-0.001
Two-0.001
Token to
Token ID284
Feature activation+4.55e-02

Pos logit contributions
 them+0.012
 sunscreen+0.012
 Nem+0.011
 oats+0.011
lights+0.010
Neg logit contributions
 code-0.015
case-0.015
 reader-0.014
 student-0.014
shop-0.013
Token encourage
Token ID7898
Feature activation+7.95e-03

Pos logit contributions
 code+0.075
 reader+0.073
case+0.073
 student+0.072
 equation+0.068
Neg logit contributions
 sunscreen-0.060
 them-0.060
 Nem-0.057
 oats-0.055
lights-0.052
Token them
Token ID606
Feature activation-5.82e-02

Pos logit contributions
 code+0.015
 reader+0.014
 student+0.014
case+0.014
 equation+0.014
Neg logit contributions
 sunscreen-0.009
 them-0.008
 oats-0.008
 Nem-0.008
lights-0.008
Token It
Token ID632
Feature activation-3.88e-02

Pos logit contributions
 code+0.053
 reader+0.050
case+0.050
 student+0.050
 equation+0.047
Neg logit contributions
 sunscreen-0.062
 them-0.061
 Nem-0.059
 oats-0.057
lights-0.056
Token can
Token ID460
Feature activation+2.66e-02

Pos logit contributions
 sunscreen+0.075
 them+0.072
 Nem+0.071
 oats+0.070
lights+0.069
Neg logit contributions
 code-0.040
 reader-0.038
case-0.037
 student-0.037
Two-0.034
Token talk
Token ID1561
Feature activation+1.62e-02

Pos logit contributions
 code+0.034
case+0.033
 reader+0.032
 student+0.031
Two+0.030
Neg logit contributions
 sunscreen-0.044
 Nem-0.042
 them-0.042
 oats-0.042
lights-0.041
Token and
Token ID290
Feature activation-1.18e-01

Pos logit contributions
 code+0.027
 reader+0.027
 student+0.026
case+0.026
Two+0.026
Neg logit contributions
 them-0.021
 sunscreen-0.019
 oats-0.018
lights-0.018
 Nem-0.017
Token sing
Token ID1702
Feature activation-4.12e-02

Pos logit contributions
 sunscreen+0.191
 Nem+0.185
 them+0.184
 oats+0.183
lights+0.179
Neg logit contributions
 code-0.158
 reader-0.148
 student-0.147
case-0.144
 equation-0.135
Token and
Token ID290
Feature activation-1.24e-01

Pos logit contributions
 them+0.063
 sunscreen+0.059
lights+0.054
 Nem+0.054
 oats+0.054
Neg logit contributions
 code-0.062
 reader-0.059
case-0.058
 student-0.058
Two-0.057
Token laugh
Token ID6487
Feature activation-5.52e-02

Pos logit contributions
 sunscreen+0.189
 them+0.182
 Nem+0.179
 oats+0.179
lights+0.174
Neg logit contributions
 code-0.177
 reader-0.168
 student-0.166
case-0.165
 equation-0.155
Token.
Token ID13
Feature activation+3.65e-02

Pos logit contributions
 them+0.082
 Nem+0.067
 as+0.067
 sunscreen+0.065
lights+0.065
Neg logit contributions
 code-0.095
ird-0.089
Warning-0.088
Two-0.087
 liar-0.086
Token Do
Token ID2141
Feature activation+1.14e-02

Pos logit contributions
 code+0.048
 reader+0.045
case+0.045
 student+0.045
 equation+0.042
Neg logit contributions
 sunscreen-0.060
 them-0.058
 Nem-0.057
 oats-0.056
lights-0.055
Token you
Token ID345
Feature activation+1.64e-02

Pos logit contributions
 code+0.016
case+0.016
 should+0.016
 do+0.016
 student+0.015
Neg logit contributions
lights-0.018
 sunscreen-0.016
 Nem-0.016
 oats-0.016
atted-0.016
Token like
Token ID588
Feature activation-7.27e-03

Pos logit contributions
 code+0.020
 reader+0.019
case+0.019
 student+0.019
Two+0.017
Neg logit contributions
 sunscreen-0.029
 them-0.028
 Nem-0.027
 oats-0.027
lights-0.027
Token The
Token ID383
Feature activation-1.88e-02

Pos logit contributions
case+0.054
 code+0.054
 reader+0.053
 student+0.053
Two+0.050
Neg logit contributions
 sunscreen-0.073
 them-0.069
 oats-0.067
lights-0.065
 Nem-0.064
Token knight
Token ID22062
Feature activation+5.47e-02

Pos logit contributions
 them+0.031
 sunscreen+0.030
 Nem+0.030
 oats+0.029
lights+0.029
Neg logit contributions
ird-0.022
 code-0.022
case-0.022
lder-0.021
 reader-0.021
Token defeated
Token ID9772
Feature activation+5.94e-02

Pos logit contributions
 code+0.081
 reader+0.079
Warning+0.078
 student+0.076
 equation+0.076
Neg logit contributions
 them-0.078
 Nem-0.075
cast-0.070
 sunscreen-0.070
 oats-0.069
Token the
Token ID262
Feature activation-2.04e-03

Pos logit contributions
 code+0.094
 student+0.093
 reader+0.092
 character+0.088
case+0.087
Neg logit contributions
 sunscreen-0.078
lights-0.069
 oats-0.069
ric-0.065
 Nem-0.065
Token monster
Token ID9234
Feature activation+6.34e-02

Pos logit contributions
 them+0.003
 sunscreen+0.003
 Nem+0.003
 oats+0.003
 seas+0.003
Neg logit contributions
ird-0.003
case-0.003
shop-0.003
room-0.003
Two-0.003
Token and
Token ID290
Feature activation-7.64e-02

Pos logit contributions
 code+0.108
 reader+0.105
Two+0.104
 student+0.104
 liar+0.102
Neg logit contributions
 them-0.077
 sunscreen-0.072
 Nem-0.070
 oats-0.069
cast-0.066
Token saved
Token ID7448
Feature activation+3.03e-02

Pos logit contributions
 sunscreen+0.114
 them+0.110
 Nem+0.106
 oats+0.106
lights+0.103
Neg logit contributions
 code-0.113
 reader-0.108
case-0.107
 student-0.106
Two-0.100
Token Molly
Token ID30236
Feature activation+4.72e-02

Pos logit contributions
 code+0.052
 student+0.051
 reader+0.050
case+0.047
 character+0.047
Neg logit contributions
 sunscreen-0.037
lights-0.034
ric-0.032
 oats-0.032
 Nem-0.030
Token.
Token ID13
Feature activation+4.37e-02

Pos logit contributions
 code+0.076
 reader+0.073
case+0.072
 student+0.071
Two+0.068
Neg logit contributions
 sunscreen-0.064
 them-0.062
 Nem-0.059
 oats-0.059
lights-0.057
Token
Token ID220
Feature activation-5.25e-02

Pos logit contributions
case+0.052
 reader+0.051
 code+0.051
 student+0.051
Two+0.048
Neg logit contributions
 sunscreen-0.077
 oats-0.072
 them-0.071
lights-0.069
ric-0.067
Token
Token ID198
Feature activation-2.74e-02

Pos logit contributions
 sunscreen+0.086
 oats+0.082
 them+0.080
lights+0.079
 Nem+0.077
Neg logit contributions
case-0.070
 code-0.069
 reader-0.066
 student-0.065
Two-0.064
Token tea
Token ID8887
Feature activation-5.12e-02

Pos logit contributions
 sunscreen+0.028
 Nem+0.027
lights+0.026
 them+0.026
cast+0.025
Neg logit contributions
 code-0.036
 student-0.035
 reader-0.035
case-0.034
Two-0.033
Token for
Token ID329
Feature activation+2.76e-02

Pos logit contributions
 sunscreen+0.067
 them+0.067
 oats+0.062
 Nem+0.060
lights+0.060
Neg logit contributions
 code-0.085
 reader-0.081
 student-0.080
case-0.079
Two-0.078
Token herself
Token ID5223
Feature activation-1.64e-02

Pos logit contributions
 code+0.047
 student+0.046
 reader+0.046
case+0.045
ird+0.043
Neg logit contributions
 sunscreen-0.031
lights-0.030
 Nem-0.030
 oats-0.029
 them-0.029
Token and
Token ID290
Feature activation-1.19e-01

Pos logit contributions
 them+0.024
 sunscreen+0.024
 oats+0.023
 Nem+0.021
lights+0.021
Neg logit contributions
 code-0.025
Warning-0.023
Two-0.023
 reader-0.023
 student-0.023
Token then
Token ID788
Feature activation-2.86e-02

Pos logit contributions
 sunscreen+0.167
 Nem+0.164
lights+0.163
 them+0.158
cast+0.158
Neg logit contributions
 code-0.168
 student-0.164
 reader-0.163
case-0.159
ird-0.152
Token sit
Token ID1650
Feature activation-7.33e-02

Pos logit contributions
 sunscreen+0.042
 them+0.040
 Nem+0.039
 oats+0.039
lights+0.038
Neg logit contributions
 code-0.043
 reader-0.041
case-0.041
 student-0.041
 equation-0.038
Token by
Token ID416
Feature activation-3.26e-02

Pos logit contributions
 sunscreen+0.108
 Nem+0.103
 them+0.101
 oats+0.100
lights+0.099
Neg logit contributions
 reader-0.104
 code-0.104
 student-0.101
case-0.100
Two-0.093
Token the
Token ID262
Feature activation-3.91e-02

Pos logit contributions
 sunscreen+0.036
 them+0.034
 Nem+0.032
 oats+0.032
lights+0.031
Neg logit contributions
 code-0.061
 reader-0.059
case-0.059
 student-0.058
Two-0.056
Token window
Token ID4324
Feature activation-7.80e-03

Pos logit contributions
 sunscreen+0.062
 oats+0.059
 them+0.059
 Nem+0.054
lights+0.053
Neg logit contributions
 code-0.056
 reader-0.054
ird-0.053
Warning-0.052
 student-0.051
Token to
Token ID284
Feature activation+1.88e-02

Pos logit contributions
 them+0.011
lights+0.010
 sunscreen+0.010
 oats+0.010
 Nem+0.009
Neg logit contributions
 code-0.012
 liar-0.011
Warning-0.011
 reader-0.011
Two-0.011
Token enjoy
Token ID2883
Feature activation-3.84e-02

Pos logit contributions
 code+0.025
 reader+0.024
case+0.023
 student+0.023
ird+0.022
Neg logit contributions
 sunscreen-0.030
 Nem-0.029
lights-0.028
 oats-0.028
 them-0.028
Token.
Token ID13
Feature activation-3.33e-02

Pos logit contributions
 sunscreen+0.107
 Nem+0.097
 them+0.097
 oats+0.095
lights+0.095
Neg logit contributions
 code-0.127
 reader-0.126
 student-0.124
case-0.121
ird-0.113
Token She
Token ID1375
Feature activation-4.40e-02

Pos logit contributions
 sunscreen+0.052
 them+0.051
 Nem+0.049
 oats+0.049
lights+0.048
Neg logit contributions
 code-0.046
 reader-0.044
case-0.044
 student-0.043
Two-0.041
Token hugged
Token ID41130
Feature activation-9.15e-02

Pos logit contributions
 sunscreen+0.070
 them+0.069
 Nem+0.067
 oats+0.065
lights+0.062
Neg logit contributions
 code-0.061
 reader-0.058
 student-0.057
case-0.056
 equation-0.055
Token the
Token ID262
Feature activation-7.53e-02

Pos logit contributions
 sunscreen+0.127
lights+0.126
 oats+0.120
ric+0.119
cast+0.113
Neg logit contributions
 reader-0.129
 student-0.127
 mouse-0.124
 figure-0.122
 character-0.120
Token fairy
Token ID25607
Feature activation-5.84e-02

Pos logit contributions
 sunscreen+0.132
 them+0.128
 oats+0.126
 Nem+0.125
lights+0.119
Neg logit contributions
 code-0.088
case-0.085
 reader-0.084
 student-0.082
 equation-0.078
Token and
Token ID290
Feature activation-1.62e-01

Pos logit contributions
 sunscreen+0.094
 Nem+0.088
 them+0.087
 oats+0.085
lights+0.085
Neg logit contributions
 reader-0.078
 code-0.077
 student-0.076
case-0.076
 figure-0.070
Token said
Token ID531
Feature activation-3.09e-02

Pos logit contributions
 sunscreen+0.273
 them+0.264
lights+0.260
 oats+0.258
 Nem+0.255
Neg logit contributions
 code-0.193
 reader-0.184
 student-0.183
case-0.179
ird-0.166
Token goodbye
Token ID24829
Feature activation-7.37e-02

Pos logit contributions
 sunscreen+0.050
 oats+0.047
lights+0.047
ric+0.046
 Nem+0.045
Neg logit contributions
 code-0.039
case-0.038
 reader-0.037
 student-0.036
 liar-0.036
Token.
Token ID13
Feature activation-1.85e-02

Pos logit contributions
 sunscreen+0.134
 oats+0.128
 them+0.124
 Nem+0.124
lights+0.123
Neg logit contributions
 code-0.084
case-0.079
 reader-0.079
 student-0.077
shop-0.068
Token From
Token ID3574
Feature activation-6.97e-02

Pos logit contributions
 sunscreen+0.033
 them+0.032
 oats+0.031
 Nem+0.031
lights+0.030
Neg logit contributions
 code-0.021
 reader-0.020
case-0.020
 student-0.020
Two-0.019
Token that
Token ID326
Feature activation-2.89e-02

Pos logit contributions
 sunscreen+0.099
 oats+0.095
 them+0.093
 Nem+0.092
lights+0.092
Neg logit contributions
 code-0.103
case-0.097
 reader-0.095
 student-0.093
 should-0.090
Token not
Token ID407
Feature activation-1.10e-01

Pos logit contributions
 sunscreen+0.102
 them+0.100
 Nem+0.094
 oats+0.092
lights+0.088
Neg logit contributions
 code-0.100
 reader-0.098
 student-0.095
case-0.094
 equation-0.091
Token go
Token ID467
Feature activation-6.80e-02

Pos logit contributions
 sunscreen+0.170
 them+0.168
 Nem+0.160
 oats+0.156
lights+0.146
Neg logit contributions
 code-0.156
 reader-0.152
 student-0.147
case-0.147
 equation-0.140
Token to
Token ID284
Feature activation-3.60e-02

Pos logit contributions
 sunscreen+0.088
 oats+0.082
lights+0.082
 Nem+0.082
 them+0.081
Neg logit contributions
 code-0.110
 reader-0.105
case-0.104
 student-0.104
Two-0.098
Token the
Token ID262
Feature activation-7.80e-02

Pos logit contributions
 sunscreen+0.039
 them+0.039
 Nem+0.036
 oats+0.035
lights+0.032
Neg logit contributions
 code-0.069
 reader-0.067
case-0.066
 student-0.065
 equation-0.063
Token toy
Token ID13373
Feature activation-5.15e-02

Pos logit contributions
 sunscreen+0.125
 oats+0.118
 them+0.113
 Nem+0.110
 seas+0.098
Neg logit contributions
ird-0.111
 reader-0.111
 code-0.108
Warning-0.107
lder-0.105
Token and
Token ID290
Feature activation-1.66e-01

Pos logit contributions
 them+0.075
 Nem+0.060
 sunscreen+0.058
 oats+0.057
 seas+0.056
Neg logit contributions
lder-0.087
Two-0.084
 equation-0.083
unts-0.081
 code-0.080
Token book
Token ID1492
Feature activation-5.77e-02

Pos logit contributions
 sunscreen+0.239
lights+0.235
 them+0.235
 oats+0.232
 Nem+0.230
Neg logit contributions
 code-0.246
case-0.232
 student-0.229
 reader-0.228
 liar-0.215
Token stalls
Token ID40308
Feature activation-5.31e-02

Pos logit contributions
 them+0.076
lights+0.068
 sunscreen+0.068
 Nem+0.062
 as+0.061
Neg logit contributions
 reader-0.095
Two-0.094
Warning-0.093
 equation-0.092
 code-0.092
Token at
Token ID379
Feature activation-3.97e-02

Pos logit contributions
 them+0.082
 sunscreen+0.081
 Nem+0.076
 oats+0.076
lights+0.073
Neg logit contributions
 code-0.074
 reader-0.072
 student-0.071
Two-0.069
case-0.069
Token all
Token ID477
Feature activation-8.08e-02

Pos logit contributions
 them+0.041
 sunscreen+0.037
 Nem+0.035
 oats+0.031
 everything+0.030
Neg logit contributions
 code-0.081
 reader-0.081
Two-0.078
 student-0.078
 liar-0.078
Token.
Token ID13
Feature activation-7.24e-03

Pos logit contributions
 them+0.116
 sunscreen+0.116
 Nem+0.109
 oats+0.109
lights+0.107
Neg logit contributions
 code-0.123
 reader-0.119
case-0.116
 student-0.116
 equation-0.110
Token They
Token ID1119
Feature activation-6.92e-02

Pos logit contributions
 sunscreen+0.055
 them+0.054
 Nem+0.053
 oats+0.050
lights+0.050
Neg logit contributions
 code-0.055
 reader-0.052
 student-0.051
case-0.050
 equation-0.048
Token say
Token ID910
Feature activation-5.64e-02

Pos logit contributions
 them+0.107
 sunscreen+0.105
 Nem+0.098
 oats+0.091
cast+0.088
Neg logit contributions
 code-0.104
 reader-0.100
 student-0.097
 equation-0.095
Warning-0.094
Token they
Token ID484
Feature activation-8.11e-02

Pos logit contributions
 sunscreen+0.064
 them+0.060
 oats+0.058
 Nem+0.058
lights+0.057
Neg logit contributions
 code-0.103
 reader-0.099
case-0.098
 student-0.098
Two-0.093
Token love
Token ID1842
Feature activation-7.92e-02

Pos logit contributions
 them+0.132
 sunscreen+0.121
 Nem+0.112
 oats+0.110
 everything+0.106
Neg logit contributions
 code-0.120
 student-0.114
 reader-0.113
 equation-0.111
case-0.109
Token them
Token ID606
Feature activation-1.24e-01

Pos logit contributions
 sunscreen+0.078
oming+0.076
lights+0.076
cast+0.070
 Nem+0.068
Neg logit contributions
 code-0.140
 student-0.137
 reader-0.135
 character-0.132
 should-0.131
Token and
Token ID290
Feature activation-1.65e-01

Pos logit contributions
 them+0.167
 sunscreen+0.164
 Nem+0.151
 oats+0.151
lights+0.145
Neg logit contributions
 code-0.208
 reader-0.197
 student-0.195
 equation-0.193
case-0.192
Token the
Token ID262
Feature activation-7.89e-02

Pos logit contributions
 sunscreen+0.199
 them+0.188
 oats+0.186
 Nem+0.186
lights+0.183
Neg logit contributions
 code-0.283
 reader-0.271
 student-0.270
case-0.269
 liar-0.254
Token rabbits
Token ID33043
Feature activation-2.29e-02

Pos logit contributions
 sunscreen+0.123
 them+0.118
 oats+0.114
 Nem+0.113
lights+0.108
Neg logit contributions
 code-0.111
 reader-0.107
case-0.107
 student-0.105
 equation-0.099
Token.
Token ID13
Feature activation-1.50e-02

Pos logit contributions
 them+0.037
 sunscreen+0.037
 oats+0.035
 Nem+0.035
lights+0.033
Neg logit contributions
 code-0.030
 reader-0.029
 student-0.028
case-0.028
 equation-0.028
Token They
Token ID1119
Feature activation-4.71e-02

Pos logit contributions
 sunscreen+0.023
 them+0.023
 Nem+0.022
 oats+0.022
lights+0.021
Neg logit contributions
 code-0.021
 reader-0.020
case-0.020
 student-0.020
 equation-0.019
Token play
Token ID711
Feature activation-2.03e-02

Pos logit contributions
 sunscreen+0.071
 them+0.069
 Nem+0.065
 oats+0.064
cast+0.060
Neg logit contributions
 code-0.070
 reader-0.067
 student-0.066
case-0.065
 equation-0.064
Token happy
Token ID3772
Feature activation-1.19e-01

Pos logit contributions
 code+0.108
 reader+0.104
case+0.102
 student+0.102
Two+0.096
Neg logit contributions
 sunscreen-0.091
 them-0.088
 Nem-0.085
 oats-0.084
lights-0.082
Token that
Token ID326
Feature activation-9.76e-02

Pos logit contributions
 sunscreen+0.206
 oats+0.190
 them+0.189
lights+0.187
 Nem+0.186
Neg logit contributions
 code-0.142
 reader-0.139
 student-0.138
case-0.134
 liar-0.123
Token he
Token ID339
Feature activation-6.30e-02

Pos logit contributions
 sunscreen+0.106
 them+0.094
lights+0.094
 oats+0.093
 Nem+0.091
Neg logit contributions
 code-0.182
 reader-0.176
 student-0.175
case-0.174
 liar-0.165
Token stopped
Token ID5025
Feature activation-7.53e-02

Pos logit contributions
 them+0.106
 Nem+0.105
 sunscreen+0.104
 oats+0.099
cast+0.096
Neg logit contributions
 code-0.080
 reader-0.074
 student-0.073
case-0.072
Two-0.070
Token crying
Token ID13774
Feature activation-7.85e-02

Pos logit contributions
 sunscreen+0.110
lights+0.106
 oats+0.102
 Nem+0.101
 them+0.096
Neg logit contributions
 code-0.106
 reader-0.103
 student-0.102
 figure-0.099
 equation-0.096
Token and
Token ID290
Feature activation-1.63e-01

Pos logit contributions
 them+0.109
 sunscreen+0.100
 Nem+0.098
lights+0.097
 oats+0.095
Neg logit contributions
 code-0.131
 reader-0.124
Two-0.123
case-0.121
 student-0.121
Token started
Token ID2067
Feature activation-1.52e-02

Pos logit contributions
 sunscreen+0.270
 them+0.260
 oats+0.253
 Nem+0.251
lights+0.250
Neg logit contributions
 code-0.211
 reader-0.200
case-0.198
 student-0.197
 equation-0.183
Token laughing
Token ID14376
Feature activation-1.02e-01

Pos logit contributions
 sunscreen+0.021
lights+0.020
 Nem+0.020
 oats+0.019
 them+0.019
Neg logit contributions
 code-0.023
 student-0.022
 reader-0.022
case-0.022
 equation-0.021
Token!
Token ID0
Feature activation-2.06e-02

Pos logit contributions
 sunscreen+0.156
 them+0.153
 Nem+0.147
 oats+0.145
lights+0.143
Neg logit contributions
 code-0.146
 reader-0.139
case-0.137
 student-0.137
Two-0.129
Token He
Token ID679
Feature activation-2.71e-02

Pos logit contributions
 sunscreen+0.030
 them+0.029
 Nem+0.028
 oats+0.027
lights+0.027
Neg logit contributions
 code-0.031
 reader-0.030
case-0.030
 student-0.029
 equation-0.028
Token was
Token ID373
Feature activation+2.46e-03

Pos logit contributions
 them+0.043
 sunscreen+0.043
 Nem+0.042
 oats+0.040
lights+0.039
Neg logit contributions
 code-0.038
 reader-0.036
 student-0.035
case-0.034
Two-0.033
Token she
Token ID673
Feature activation-4.61e-02

Pos logit contributions
 sunscreen+0.133
lights+0.132
 oats+0.121
 them+0.121
cast+0.118
Neg logit contributions
 code-0.187
 student-0.183
 reader-0.182
case-0.181
 should-0.176
Token put
Token ID1234
Feature activation-9.40e-02

Pos logit contributions
 them+0.072
 sunscreen+0.071
 Nem+0.070
 oats+0.068
cast+0.062
Neg logit contributions
 code-0.065
 reader-0.061
 student-0.059
Warning-0.058
 equation-0.056
Token on
Token ID319
Feature activation-3.26e-02

Pos logit contributions
lights+0.115
 sunscreen+0.111
 Nem+0.110
ric+0.109
cast+0.108
Neg logit contributions
 code-0.143
 reader-0.134
case-0.134
 student-0.133
 character-0.131
Token her
Token ID607
Feature activation-5.57e-02

Pos logit contributions
lights+0.042
 sunscreen+0.040
 Nem+0.040
 oats+0.039
cast+0.038
Neg logit contributions
 code-0.050
case-0.048
 reader-0.048
 student-0.048
 equation-0.045
Token coat
Token ID13209
Feature activation-8.01e-02

Pos logit contributions
 sunscreen+0.078
 them+0.078
 Nem+0.076
lights+0.074
 oats+0.073
Neg logit contributions
 code-0.084
case-0.079
 reader-0.079
 student-0.078
ird-0.074
Token and
Token ID290
Feature activation-1.64e-01

Pos logit contributions
 sunscreen+0.111
 them+0.109
 Nem+0.102
 oats+0.102
lights+0.100
Neg logit contributions
 code-0.128
 reader-0.123
 student-0.121
Two-0.117
case-0.116
Token followed
Token ID3940
Feature activation-2.63e-02

Pos logit contributions
 Nem+0.248
 them+0.247
lights+0.245
 sunscreen+0.239
ric+0.237
Neg logit contributions
 code-0.217
 reader-0.206
case-0.205
 student-0.204
ird-0.199
Token Mom
Token ID11254
Feature activation-5.70e-02

Pos logit contributions
 sunscreen+0.031
lights+0.030
 oats+0.029
ric+0.028
cast+0.028
Neg logit contributions
 code-0.044
 reader-0.043
 student-0.043
case-0.042
 figure-0.041
Tokenmy
Token ID1820
Feature activation-5.63e-02

Pos logit contributions
 sunscreen+0.078
 them+0.074
 Nem+0.074
 oats+0.072
lights+0.071
Neg logit contributions
 code-0.089
 reader-0.086
 student-0.085
case-0.085
ird-0.080
Token to
Token ID284
Feature activation-6.18e-03

Pos logit contributions
 sunscreen+0.073
 them+0.070
 Nem+0.068
 oats+0.068
lights+0.066
Neg logit contributions
 code-0.093
 reader-0.089
case-0.088
 student-0.088
 equation-0.083
Token the
Token ID262
Feature activation-6.11e-02

Pos logit contributions
lights+0.009
 sunscreen+0.008
 oats+0.008
cast+0.008
 Nem+0.007
Neg logit contributions
 code-0.010
 student-0.009
 reader-0.009
 figure-0.009
case-0.009