TOP ACTIVATIONS
MAX = 0.173

, next to her magn ifying glass . She looked at
hat and grabbed her magn ifying glass . She was ready
the necklace and the magn ifying glass and started to examine
When Daisy put the magn ifying glass down , the cater
the necklace and the magn ifying glass and told him that
the watch with the magn ifying glass . He saw the
borrowed her mom 's magn ifying glass to take a closer
He pointed his magn ifying glass at different classmates and
, handing him the magn ifying glass . She looked around
examine it with the magn ifying glass ?" "
flashlight , and a magn ifying glass . As
He brought his magn ifying glass and looked closely .
She takes out a magn ifying glass . She looks at
didn 't stop until their stomach s were full .
it she had a magn ifying glass , a tiny torch
and handed him the magn ifying glass . He
crack ers to help your stomach feel better ." Tim my
chili pepper before and his stomach was starting to trem ble
He took out a magn ifying glass and a notebook and
cried , dropping the magn ifying glass . "

MIN ACTIVATIONS
MIN = -0.179

elephants , lions and ac rob ats . They are amazed
was three - years - old . She had a lot
spotted a row of books hel ves in her grandparents '
clown s juggling and ac rob ats doing flips . But
the animals , the ac rob ats and the clown s
thought they looked like ac rob ats . She enjoyed the
a 3 - year - old boy who had a new
clown s and the ac rob ats and the lions and
. The vet took an x - ray of Wh isk
walked over to the books hel ves and began to read
of fun watching the ac rob ats and clown s .
on the row of books hel ves . <|endoftext|> Tom was
idea - he wanted to touch the glowing
. He found a squee gee and some wax .
The doctor took the x - ray and then Molly
a 3 - year - old little boy . He had
saw clown s and ac rob ats and lions and elephants
lots of big juicy pine con es , so Penny started
, the doctor took another x - ray and found out
She had a big books he lf in her room with

INTERVAL 0.085 - 0.173
CONTAINS 0.225%

is it ?" P enny smiled and said ,
The next day , Lily woke up early and put on
cry as he watched the crab move into his shell .
The boy looked sad and lonely . Lily felt sorry for
went outside and when he smelled the wonderful wet grass ,

INTERVAL -0.003 - 0.085
CONTAINS 56.885%

eat the y ummy fruit . <|endoftext|> Once upon a time
, their nos y neighbor knocked on the door . She
day , the weight would be her best friend . <|endoftext|>
look . <|endoftext|> Once , there was a loyal automobile .
said , " Yes , I did ! Can we bake

INTERVAL -0.091 - -0.003
CONTAINS 42.670%

be fun ." Ben hesitated . He didn 't
Snow y was no longer white . He was red and
. This story teaches us that it 's important
go to the slide ." Tom , Lily and
a little girl named Lily . She loved to play with

INTERVAL -0.179 - -0.091
CONTAINS 0.220%

Finn . F inn asked , " Do you
people . He held the rope and flew his k ite
, I can do this pose and this pose and this
They had a lot of fun making noises and pretending to
attic . They found all sorts of things : old toys
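The four INTERVAL headers above are consistent with equal-width bins spanning MIN to MAX: (0.173 - (-0.179)) / 4 = 0.088 per bin, and the four CONTAINS percentages sum to 100%. A minimal sketch that reproduces the bin edges under that assumption (the helper name equal_width_bins is hypothetical and is not taken from the tool that produced this dump):

```python
def equal_width_bins(lo, hi, n_bins):
    """Return n_bins (start, end) pairs that evenly partition [lo, hi]."""
    width = (hi - lo) / n_bins
    # Round to 3 decimals to match the precision shown in the dump.
    edges = [round(lo + i * width, 3) for i in range(n_bins + 1)]
    return list(zip(edges[:-1], edges[1:]))


MIN_ACT, MAX_ACT = -0.179, 0.173  # MIN and MAX as reported above
bins = equal_width_bins(MIN_ACT, MAX_ACT, 4)

# Print from the highest bin down, in the same order as the dump.
for start, end in reversed(bins):
    print(f"INTERVAL {start:.3f} - {end:.3f}")

# Reported share of tokens per bin, highest bin first.
shares = [0.225, 56.885, 42.670, 0.220]
total_share = round(sum(shares), 3)
```

Read this way, about 99.5% of tokens fall in the two middle bins, so the snippets in the outer bins are the rare cases.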
Token,
Token ID11
Feature activation+1.75e-02

Pos logit contributions
old+0.064
Years+0.058
iger+0.058
bench+0.057
ritz+0.057
Neg logit contributions
 woke-0.037
ifying-0.036
 Look-0.033
 reapp-0.032
're-0.032
Token next
Token ID1306
Feature activation+1.67e-02

Pos logit contributions
AAAA+0.030
ifying+0.030
're+0.028
 Look+0.028
 roy+0.028
Neg logit contributions
old-0.025
bench-0.025
 student-0.022
Years-0.022
 homework-0.022
Token to
Token ID284
Feature activation+1.89e-02

Pos logit contributions
 woke+0.021
 Look+0.020
ifying+0.018
 c+0.016
 burst+0.015
Neg logit contributions
old-0.033
bench-0.030
chairs-0.029
ritz-0.029
smanship-0.029
Token her
Token ID607
Feature activation+2.92e-02

Pos logit contributions
 Look+0.025
ifying+0.025
 woke+0.024
AAAA+0.024
 D+0.023
Neg logit contributions
bench-0.035
old-0.034
Years-0.030
ritz-0.030
smanship-0.029
Token magn
Token ID7842
Feature activation+2.60e-02

Pos logit contributions
ifying+0.043
 Look+0.041
 roy+0.040
 woke+0.040
 D+0.039
Neg logit contributions
bench-0.050
old-0.049
Years-0.045
ritz-0.043
iger-0.043
Tokenifying
Token ID4035
Feature activation+1.73e-01

Pos logit contributions
ifying+0.001
 Look-0.003
AAAA-0.005
're-0.005
 roy-0.006
Neg logit contributions
old-0.085
bench-0.084
Years-0.080
 homework-0.078
 student-0.078
Token glass
Token ID5405
Feature activation+3.47e-02

Pos logit contributions
AAAA+0.229
're+0.214
 Look+0.208
 roy+0.205
ifying+0.205
Neg logit contributions
bench-0.303
old-0.302
 tutor-0.282
 math-0.281
 student-0.279
Token.
Token ID13
Feature activation+1.07e-02

Pos logit contributions
ifying+0.048
 woke+0.045
 Look+0.044
AAAA+0.044
 roy+0.043
Neg logit contributions
old-0.062
bench-0.058
Years-0.056
 student-0.052
ritz-0.052
Token She
Token ID1375
Feature activation+3.34e-02

Pos logit contributions
 Look+0.014
ifying+0.014
AAAA+0.013
 Are+0.013
 woke+0.012
Neg logit contributions
old-0.020
bench-0.020
Years-0.018
ritz-0.018
ggie-0.018
Token looked
Token ID3114
Feature activation+4.27e-02

Pos logit contributions
ifying+0.038
AAAA+0.037
 Look+0.036
're+0.034
 roy+0.034
Neg logit contributions
old-0.068
bench-0.066
Years-0.061
 student-0.060
 homework-0.059
Token at
Token ID379
Feature activation+1.89e-03

Pos logit contributions
AAAA+0.088
ifying+0.083
 Sorry+0.081
're+0.081
 Look+0.080
Neg logit contributions
old-0.050
bench-0.042
Years-0.041
 tutor-0.041
 aloud-0.041
Token hat
Token ID6877
Feature activation+2.21e-02

Pos logit contributions
AAAA+0.018
ifying+0.018
 Look+0.017
're+0.017
 roy+0.016
Neg logit contributions
bench-0.017
old-0.017
 homework-0.016
 tutor-0.016
 student-0.015
Token and
Token ID290
Feature activation+9.91e-03

Pos logit contributions
ifying+0.029
 woke+0.028
 Look+0.028
're+0.027
AAAA+0.027
Neg logit contributions
old-0.041
bench-0.039
Years-0.036
 student-0.035
ritz-0.035
Token grabbed
Token ID13176
Feature activation+7.79e-03

Pos logit contributions
ifying+0.015
AAAA+0.014
 Look+0.014
're+0.014
 Are+0.013
Neg logit contributions
bench-0.016
old-0.016
 homework-0.015
 student-0.015
 tutor-0.014
Token her
Token ID607
Feature activation+1.19e-02

Pos logit contributions
ifying+0.011
 Look+0.011
AAAA+0.011
 roy+0.010
're+0.010
Neg logit contributions
old-0.013
bench-0.013
Years-0.012
 student-0.012
 worker-0.011
Token magn
Token ID7842
Feature activation+2.15e-02

Pos logit contributions
ifying+0.018
AAAA+0.018
 Look+0.017
're+0.017
 roy+0.016
Neg logit contributions
old-0.020
bench-0.019
Years-0.017
 student-0.017
ritz-0.016
Tokenifying
Token ID4035
Feature activation+1.72e-01

Pos logit contributions
AAAA+0.003
ifying+0.002
 Look+0.002
're+0.000
 woke+0.000
Neg logit contributions
old-0.065
bench-0.064
Years-0.060
 student-0.060
ritz-0.059
Token glass
Token ID5405
Feature activation+3.29e-02

Pos logit contributions
AAAA+0.255
cel+0.238
're+0.236
ifying+0.225
 Are+0.223
Neg logit contributions
bench-0.279
old-0.261
 math-0.253
 tutor-0.251
case-0.248
Token.
Token ID13
Feature activation+4.67e-03

Pos logit contributions
ifying+0.052
 woke+0.047
 roy+0.047
 Look+0.047
AAAA+0.047
Neg logit contributions
old-0.054
bench-0.050
Years-0.048
 student-0.045
 research-0.045
Token She
Token ID1375
Feature activation+3.56e-02

Pos logit contributions
ifying+0.006
AAAA+0.006
 Look+0.006
're+0.006
 woke+0.006
Neg logit contributions
old-0.008
bench-0.008
Years-0.007
 student-0.007
ggie-0.007
Token was
Token ID373
Feature activation-5.79e-03

Pos logit contributions
ifying+0.041
AAAA+0.040
 Look+0.039
're+0.036
 roy+0.035
Neg logit contributions
old-0.072
bench-0.070
Years-0.065
 student-0.065
ggie-0.063
Token ready
Token ID3492
Feature activation+3.70e-02

Pos logit contributions
old+0.009
bench+0.009
Years+0.008
 student+0.008
 homework+0.008
Neg logit contributions
ifying-0.009
AAAA-0.009
 Look-0.009
're-0.008
 roy-0.008
Token the
Token ID262
Feature activation-5.60e-03

Pos logit contributions
old+0.001
bench+0.001
Years+0.001
 student+0.001
 homework+0.001
Neg logit contributions
ifying-0.002
AAAA-0.002
 Look-0.002
're-0.001
 roy-0.001
Token necklace
Token ID38987
Feature activation+7.76e-03

Pos logit contributions
bench+0.008
old+0.008
 worker+0.007
 homework+0.007
 student+0.007
Neg logit contributions
AAAA-0.009
ifying-0.009
 Look-0.009
're-0.008
 Are-0.008
Token and
Token ID290
Feature activation+1.98e-02

Pos logit contributions
ifying+0.008
 woke+0.008
're+0.007
 D+0.007
als+0.007
Neg logit contributions
old-0.016
bench-0.015
Years-0.015
 research-0.015
 student-0.014
Token the
Token ID262
Feature activation+4.40e-03

Pos logit contributions
ifying+0.030
AAAA+0.027
 Look+0.026
're+0.025
 Are+0.024
Neg logit contributions
old-0.034
bench-0.033
 homework-0.030
 student-0.030
 worker-0.029
Token magn
Token ID7842
Feature activation+2.20e-02

Pos logit contributions
ifying+0.006
AAAA+0.006
 Look+0.006
're+0.006
 woke+0.006
Neg logit contributions
bench-0.007
old-0.007
 student-0.006
 worker-0.006
 tutor-0.006
Tokenifying
Token ID4035
Feature activation+1.69e-01

Pos logit contributions
AAAA+0.007
 Look+0.006
ifying+0.006
 roy+0.004
're+0.004
Neg logit contributions
old-0.061
bench-0.061
 student-0.057
Years-0.057
ritz-0.056
Token glass
Token ID5405
Feature activation+3.82e-02

Pos logit contributions
cel+0.272
're+0.259
AAAA+0.258
 roy+0.249
 Look+0.248
Neg logit contributions
bench-0.223
case-0.222
 math-0.215
 dentist-0.211
 site-0.206
Token and
Token ID290
Feature activation+2.14e-02

Pos logit contributions
ifying+0.063
AAAA+0.062
 Look+0.061
 woke+0.059
 roy+0.059
Neg logit contributions
old-0.057
bench-0.054
Years-0.050
 student-0.048
 homework-0.047
Token started
Token ID2067
Feature activation-1.12e-02

Pos logit contributions
ifying+0.031
AAAA+0.029
 Look+0.027
're+0.027
 Sorry+0.025
Neg logit contributions
old-0.038
bench-0.036
 homework-0.033
 student-0.033
 worker-0.032
Token to
Token ID284
Feature activation+1.36e-02

Pos logit contributions
old+0.016
bench+0.015
Years+0.013
 student+0.013
ritz+0.013
Neg logit contributions
ifying-0.020
AAAA-0.020
 Look-0.019
 roy-0.019
 woke-0.019
Token examine
Token ID10716
Feature activation+5.69e-03

Pos logit contributions
ifying+0.023
AAAA+0.022
 Look+0.021
 roy+0.019
're+0.019
Neg logit contributions
bench-0.020
old-0.019
 research-0.019
 homework-0.019
 math-0.018
Token When
Token ID1649
Feature activation-3.63e-02

Pos logit contributions
old+0.026
bench+0.024
 student+0.022
Years+0.022
 tutor+0.021
Neg logit contributions
ifying-0.022
AAAA-0.021
 roy-0.020
 Look-0.020
're-0.020
Token Daisy
Token ID40355
Feature activation+3.64e-02

Pos logit contributions
old+0.045
bench+0.044
 homework+0.038
 student+0.038
Years+0.037
Neg logit contributions
ifying-0.068
AAAA-0.068
 Look-0.065
 roy-0.064
're-0.063
Token put
Token ID1234
Feature activation-9.46e-04

Pos logit contributions
ifying+0.045
AAAA+0.042
 Look+0.040
 roy+0.038
 Sorry+0.036
Neg logit contributions
old-0.075
bench-0.067
 student-0.063
 worker-0.062
ggie-0.062
Token the
Token ID262
Feature activation-2.10e-03

Pos logit contributions
old+0.002
bench+0.002
Years+0.002
 student+0.002
ritz+0.001
Neg logit contributions
ifying-0.001
AAAA-0.001
 Look-0.001
're-0.001
 woke-0.001
Token magn
Token ID7842
Feature activation+1.39e-02

Pos logit contributions
old+0.003
bench+0.003
Years+0.003
 student+0.003
 homework+0.003
Neg logit contributions
ifying-0.003
AAAA-0.003
 Look-0.003
're-0.003
 roy-0.003
Tokenifying
Token ID4035
Feature activation+1.68e-01

Pos logit contributions
AAAA+0.004
 Look+0.003
 Are+0.002
 roy+0.002
 woke+0.002
Neg logit contributions
old-0.039
bench-0.039
bars-0.037
 student-0.037
ritz-0.036
Token glass
Token ID5405
Feature activation+2.75e-02

Pos logit contributions
AAAA+0.252
cel+0.243
 roy+0.238
 Look+0.228
 Are+0.224
Neg logit contributions
bench-0.256
case-0.244
old-0.243
 math-0.233
ae-0.231
Token down
Token ID866
Feature activation+2.06e-02

Pos logit contributions
AAAA+0.039
ifying+0.038
 Look+0.037
 roy+0.035
're+0.035
Neg logit contributions
bench-0.048
old-0.048
 student-0.041
 worker-0.041
 homework-0.041
Token,
Token ID11
Feature activation+1.09e-02

Pos logit contributions
AAAA+0.032
ifying+0.031
 Look+0.030
 roy+0.029
're+0.029
Neg logit contributions
old-0.033
bench-0.032
 worker-0.029
 student-0.028
 attendant-0.028
Token the
Token ID262
Feature activation+2.38e-03

Pos logit contributions
ifying+0.021
AAAA+0.021
 Look+0.020
 roy+0.020
're+0.020
Neg logit contributions
old-0.013
bench-0.012
Years-0.011
 student-0.011
 homework-0.010
Token cater
Token ID20825
Feature activation+4.71e-02

Pos logit contributions
AAAA+0.003
ifying+0.003
 Look+0.003
're+0.003
 roy+0.003
Neg logit contributions
old-0.004
bench-0.004
 student-0.004
Years-0.004
 worker-0.003
Token the
Token ID262
Feature activation-1.49e-02

Pos logit contributions
bench+0.043
old+0.042
Years+0.038
 student+0.037
ritz+0.036
Neg logit contributions
ifying-0.036
AAAA-0.035
 Look-0.035
're-0.034
 roy-0.033
Token necklace
Token ID38987
Feature activation+4.29e-03

Pos logit contributions
bench+0.025
old+0.024
 student+0.022
 homework+0.021
Years+0.021
Neg logit contributions
ifying-0.023
AAAA-0.022
 Look-0.021
 woke-0.020
 Sorry-0.020
Token and
Token ID290
Feature activation+1.72e-02

Pos logit contributions
 woke+0.005
ifying+0.005
're+0.004
 reapp+0.004
 Look+0.004
Neg logit contributions
old-0.008
bench-0.008
Years-0.008
 student-0.007
 research-0.007
Token the
Token ID262
Feature activation-6.85e-05

Pos logit contributions
ifying+0.026
AAAA+0.025
 Look+0.024
're+0.023
 roy+0.023
Neg logit contributions
old-0.030
bench-0.028
 student-0.026
 worker-0.025
Years-0.025
Token magn
Token ID7842
Feature activation+1.61e-02

Pos logit contributions
bench+0.000
old+0.000
 student+0.000
 worker+0.000
 homework+0.000
Neg logit contributions
ifying-0.000
AAAA-0.000
 Look-0.000
're-0.000
 woke-0.000
Tokenifying
Token ID4035
Feature activation+1.67e-01

Pos logit contributions
AAAA+0.005
 Look+0.004
ifying+0.004
're+0.003
 roy+0.003
Neg logit contributions
old-0.045
bench-0.045
 student-0.042
Years-0.042
ritz-0.041
Token glass
Token ID5405
Feature activation+3.47e-02

Pos logit contributions
're+0.264
cel+0.258
AAAA+0.256
 Look+0.242
 roy+0.238
Neg logit contributions
bench-0.224
 math-0.214
case-0.210
 dentist-0.207
 worker-0.207
Token and
Token ID290
Feature activation+1.81e-02

Pos logit contributions
ifying+0.057
AAAA+0.054
 Look+0.054
 woke+0.053
 roy+0.052
Neg logit contributions
old-0.053
bench-0.050
Years-0.047
 student-0.044
 homework-0.043
Token told
Token ID1297
Feature activation+1.75e-02

Pos logit contributions
ifying+0.025
AAAA+0.024
 Look+0.023
're+0.022
 roy+0.022
Neg logit contributions
old-0.033
bench-0.032
 student-0.029
 worker-0.029
Years-0.028
Token him
Token ID683
Feature activation-1.51e-02

Pos logit contributions
 D+0.024
 Look+0.023
 Po+0.023
 woke+0.023
 P+0.023
Neg logit contributions
bench-0.033
old-0.031
Years-0.029
 rider-0.029
smanship-0.028
Token that
Token ID326
Feature activation+6.00e-03

Pos logit contributions
old+0.023
bench+0.023
Years+0.020
 student+0.020
ritz+0.019
Neg logit contributions
ifying-0.024
 Look-0.024
AAAA-0.024
're-0.023
 woke-0.023
Token the
Token ID262
Feature activation-3.36e-03

Pos logit contributions
old+0.014
bench+0.014
 student+0.012
ggie+0.012
ae+0.012
Neg logit contributions
 D-0.007
AAAA-0.006
 Look-0.006
ifying-0.006
 P-0.006
Token watch
Token ID2342
Feature activation-5.54e-02

Pos logit contributions
old+0.006
bench+0.006
Years+0.005
ritz+0.005
 student+0.005
Neg logit contributions
ifying-0.005
AAAA-0.005
 Look-0.005
 woke-0.004
're-0.004
Token with
Token ID351
Feature activation+2.96e-03

Pos logit contributions
old+0.104
bench+0.097
Years+0.097
ritz+0.089
 research+0.086
Neg logit contributions
ifying-0.076
 woke-0.072
 Look-0.069
're-0.067
als-0.066
Token the
Token ID262
Feature activation-1.83e-03

Pos logit contributions
ifying+0.004
 Look+0.004
AAAA+0.004
 D+0.003
're+0.003
Neg logit contributions
old-0.006
bench-0.005
Years-0.005
 student-0.005
ritz-0.005
Token magn
Token ID7842
Feature activation+7.56e-03

Pos logit contributions
old+0.003
bench+0.003
Years+0.003
ritz+0.003
bars+0.003
Neg logit contributions
ifying-0.002
 Look-0.002
AAAA-0.002
're-0.002
 woke-0.002
Tokenifying
Token ID4035
Feature activation+1.65e-01

Pos logit contributions
ifying+0.000
 Look-0.001
AAAA-0.002
're-0.002
 woke-0.002
Neg logit contributions
old-0.025
bench-0.025
Years-0.024
 homework-0.023
 research-0.023
Token glass
Token ID5405
Feature activation+2.64e-02

Pos logit contributions
AAAA+0.224
're+0.219
cel+0.217
 roy+0.209
 Look+0.207
Neg logit contributions
bench-0.267
old-0.261
 math-0.249
case-0.247
 tutor-0.246
Token.
Token ID13
Feature activation+5.64e-03

Pos logit contributions
ifying+0.038
 woke+0.036
 Look+0.035
AAAA+0.035
're+0.035
Neg logit contributions
old-0.046
bench-0.042
Years-0.041
 student-0.038
 research-0.038
Token He
Token ID679
Feature activation+1.41e-02

Pos logit contributions
 Look+0.007
AAAA+0.007
ifying+0.007
 D+0.006
 Are+0.006
Neg logit contributions
old-0.011
bench-0.010
 student-0.009
Years-0.009
ggie-0.009
Token saw
Token ID2497
Feature activation+1.82e-02

Pos logit contributions
ifying+0.014
AAAA+0.013
 Look+0.013
're+0.012
 roy+0.012
Neg logit contributions
old-0.031
bench-0.030
Years-0.028
 student-0.028
 worker-0.027
Token the
Token ID262
Feature activation+3.67e-03

Pos logit contributions
ifying+0.024
AAAA+0.024
 Look+0.024
 D+0.023
 woke+0.022
Neg logit contributions
bench-0.033
old-0.032
Years-0.029
 student-0.029
ritz-0.028
Token borrowed
Token ID22546
Feature activation+3.00e-02

Pos logit contributions
ifying+0.026
 roy+0.023
 Look+0.022
're+0.022
AAAA+0.022
Neg logit contributions
old-0.030
Years-0.027
bench-0.027
ggie-0.026
 student-0.025
Token her
Token ID607
Feature activation+4.60e-03

Pos logit contributions
ifying+0.050
AAAA+0.049
 Look+0.047
 woke+0.046
're+0.045
Neg logit contributions
old-0.045
bench-0.042
 student-0.038
Years-0.038
 worker-0.037
Token mom
Token ID1995
Feature activation+2.20e-02

Pos logit contributions
ifying+0.007
AAAA+0.007
 Look+0.007
 Sorry+0.007
 Are+0.007
Neg logit contributions
old-0.007
 tutor-0.006
 student-0.006
 math-0.006
Years-0.006
Token's
Token ID338
Feature activation+3.99e-03

Pos logit contributions
ifying+0.031
 Look+0.029
're+0.029
AAAA+0.028
 woke+0.028
Neg logit contributions
bench-0.038
old-0.038
Years-0.035
 student-0.034
 homework-0.033
Token magn
Token ID7842
Feature activation+8.81e-03

Pos logit contributions
ifying+0.005
AAAA+0.005
 Look+0.005
're+0.005
 woke+0.005
Neg logit contributions
old-0.007
bench-0.007
 student-0.006
Years-0.006
 tutor-0.006
Tokenifying
Token ID4035
Feature activation+1.65e-01

Pos logit contributions
AAAA+0.002
 Look+0.001
 Are+0.001
 woke+0.001
 roy+0.001
Neg logit contributions
old-0.025
bench-0.025
ritz-0.023
 student-0.023
Years-0.023
Token glass
Token ID5405
Feature activation+2.66e-02

Pos logit contributions
AAAA+0.245
're+0.232
cel+0.228
 roy+0.221
ifying+0.217
Neg logit contributions
 math-0.244
bench-0.233
 tutor-0.232
old-0.231
 x-0.224
Token to
Token ID284
Feature activation+1.05e-02

Pos logit contributions
ifying+0.042
AAAA+0.041
 Look+0.040
're+0.039
 roy+0.038
Neg logit contributions
old-0.042
bench-0.041
Years-0.036
 student-0.036
 worker-0.035
Token take
Token ID1011
Feature activation+3.11e-02

Pos logit contributions
ifying+0.015
AAAA+0.015
 Look+0.014
 roy+0.013
 Are+0.012
Neg logit contributions
old-0.019
bench-0.018
Years-0.017
 research-0.017
 student-0.016
Token a
Token ID257
Feature activation+5.13e-03

Pos logit contributions
ifying+0.049
AAAA+0.048
 Look+0.047
're+0.045
 roy+0.044
Neg logit contributions
old-0.050
bench-0.048
Years-0.043
 student-0.042
 worker-0.041
Token closer
Token ID5699
Feature activation-3.24e-02

Pos logit contributions
ifying+0.007
AAAA+0.007
 Look+0.007
're+0.006
 roy+0.006
Neg logit contributions
old-0.009
bench-0.009
Years-0.008
 student-0.008
 worker-0.008
Token\n
Token ID198
Feature activation-8.29e-03

Pos logit contributions
bars+0.027
bench+0.027
 salon+0.026
 homework+0.026
smanship+0.026
Neg logit contributions
AAAA-0.010
 woke-0.009
 Look-0.008
 Are-0.007
 See-0.007
TokenHe
Token ID1544
Feature activation+6.82e-04

Pos logit contributions
bench+0.010
old+0.008
 homework+0.008
bars+0.008
smanship+0.008
Neg logit contributions
AAAA-0.017
 Look-0.016
 woke-0.016
 Are-0.016
ifying-0.016
Token pointed
Token ID6235
Feature activation-3.74e-03

Pos logit contributions
ifying+0.001
AAAA+0.001
 Look+0.001
're+0.001
 roy+0.001
Neg logit contributions
old-0.001
bench-0.001
Years-0.001
 student-0.001
 worker-0.001
Token his
Token ID465
Feature activation-9.40e-03

Pos logit contributions
bench+0.006
old+0.006
Years+0.006
 student+0.005
 homework+0.005
Neg logit contributions
ifying-0.005
 Look-0.005
AAAA-0.005
 woke-0.005
're-0.005
Token magn
Token ID7842
Feature activation+6.28e-03

Pos logit contributions
old+0.016
bench+0.016
 student+0.014
 homework+0.014
 tutor+0.014
Neg logit contributions
AAAA-0.013
ifying-0.013
 Look-0.013
 roy-0.012
're-0.012
Tokenifying
Token ID4035
Feature activation+1.63e-01

Pos logit contributions
AAAA+0.001
 Look+0.001
 Are+0.000
 woke+0.000
 roy+0.000
Neg logit contributions
old-0.018
bench-0.018
 student-0.017
ritz-0.017
bars-0.017
Token glass
Token ID5405
Feature activation+2.03e-02

Pos logit contributions
AAAA+0.238
cel+0.227
're+0.221
 Look+0.209
ifying+0.208
Neg logit contributions
bench-0.259
old-0.255
 math-0.246
 tutor-0.244
 student-0.243
Token at
Token ID379
Feature activation-5.72e-03

Pos logit contributions
ifying+0.030
AAAA+0.030
 Look+0.029
're+0.028
 roy+0.027
Neg logit contributions
old-0.034
bench-0.033
Years-0.029
 student-0.029
 worker-0.029
Token different
Token ID1180
Feature activation-1.76e-02

Pos logit contributions
old+0.012
bench+0.011
Years+0.011
 appointment+0.011
 student+0.010
Neg logit contributions
ifying-0.006
 woke-0.006
 Look-0.006
 D-0.006
're-0.005
Token classmates
Token ID28999
Feature activation-1.64e-02

Pos logit contributions
old+0.028
bench+0.027
Years+0.025
 student+0.023
ggie+0.023
Neg logit contributions
ifying-0.027
 Look-0.027
AAAA-0.027
 roy-0.026
're-0.026
Token and
Token ID290
Feature activation-6.43e-03

Pos logit contributions
Years+0.034
bench+0.033
old+0.032
ritz+0.031
iger+0.030
Neg logit contributions
ifying-0.017
're-0.016
 Look-0.016
 woke-0.016
 roy-0.014
Token,
Token ID11
Feature activation-1.97e-02

Pos logit contributions
old+0.115
bench+0.114
Years+0.106
 student+0.103
ritz+0.103
Neg logit contributions
 Look-0.065
ifying-0.064
 woke-0.062
're-0.062
AAAA-0.061
Token handing
Token ID22786
Feature activation+3.12e-02

Pos logit contributions
old+0.032
bench+0.031
Years+0.028
 student+0.028
 worker+0.027
Neg logit contributions
ifying-0.030
AAAA-0.030
 Look-0.029
're-0.028
 roy-0.028
Token him
Token ID683
Feature activation-2.03e-02

Pos logit contributions
 D+0.045
 Look+0.044
 Pink+0.043
 P+0.042
ifying+0.041
Neg logit contributions
old-0.056
bench-0.056
 outage-0.051
Years-0.049
ae-0.049
Token the
Token ID262
Feature activation-1.06e-02

Pos logit contributions
old+0.035
bench+0.033
Years+0.030
 student+0.030
 homework+0.029
Neg logit contributions
ifying-0.029
AAAA-0.029
 Look-0.028
're-0.027
 roy-0.027
Token magn
Token ID7842
Feature activation+8.84e-03

Pos logit contributions
old+0.018
bench+0.017
Years+0.016
ritz+0.015
 student+0.015
Neg logit contributions
ifying-0.016
AAAA-0.015
 Look-0.015
're-0.015
 roy-0.014
Tokenifying
Token ID4035
Feature activation+1.60e-01

Pos logit contributions
ifying+0.000
 Look-0.002
're-0.002
AAAA-0.002
 woke-0.002
Neg logit contributions
old-0.030
bench-0.029
Years-0.028
 homework-0.027
 research-0.027
Token glass
Token ID5405
Feature activation+2.52e-02

Pos logit contributions
AAAA+0.214
're+0.206
cel+0.202
 roy+0.199
 Look+0.197
Neg logit contributions
bench-0.271
old-0.265
 math-0.250
 tutor-0.248
 student-0.245
Token.
Token ID13
Feature activation-4.74e-03

Pos logit contributions
ifying+0.042
AAAA+0.041
 Look+0.040
're+0.039
 woke+0.038
Neg logit contributions
old-0.038
bench-0.036
Years-0.033
 student-0.032
 homework-0.031
Token She
Token ID1375
Feature activation+2.20e-02

Pos logit contributions
old+0.008
bench+0.008
Years+0.008
 student+0.007
ggie+0.007
Neg logit contributions
AAAA-0.006
 Look-0.006
ifying-0.006
're-0.006
 D-0.006
Token looked
Token ID3114
Feature activation+4.05e-02

Pos logit contributions
ifying+0.030
AAAA+0.028
 Look+0.028
're+0.026
 roy+0.026
Neg logit contributions
old-0.041
bench-0.039
Years-0.036
 student-0.036
ggie-0.035
Token around
Token ID1088
Feature activation-3.65e-02

Pos logit contributions
AAAA+0.080
ifying+0.077
're+0.074
 Look+0.074
 Sorry+0.073
Neg logit contributions
old-0.050
bench-0.044
 aloud-0.041
 tutor-0.041
Years-0.040
Token examine
Token ID10716
Feature activation-1.82e-02

Pos logit contributions
AAAA+0.094
ifying+0.085
 roy+0.085
 pulp+0.082
 Look+0.082
Neg logit contributions
old-0.102
 research-0.091
 student-0.088
bench-0.088
 tutor-0.087
Token it
Token ID340
Feature activation+6.53e-03

Pos logit contributions
old+0.024
bench+0.022
Years+0.020
 student+0.020
 homework+0.020
Neg logit contributions
AAAA-0.033
ifying-0.033
 Look-0.031
're-0.031
 roy-0.030
Token with
Token ID351
Feature activation+4.32e-03

Pos logit contributions
ifying+0.008
AAAA+0.008
 Look+0.007
 roy+0.007
're+0.007
Neg logit contributions
old-0.013
bench-0.012
Years-0.011
 student-0.011
 homework-0.011
Token the
Token ID262
Feature activation-1.04e-02

Pos logit contributions
AAAA+0.008
ifying+0.007
 roy+0.007
 Are+0.007
 Look+0.007
Neg logit contributions
old-0.006
 homework-0.005
 math-0.005
bench-0.005
 research-0.005
Token magn
Token ID7842
Feature activation+5.41e-03

Pos logit contributions
old+0.018
bench+0.018
 student+0.016
Years+0.016
 homework+0.016
Neg logit contributions
ifying-0.014
AAAA-0.014
 Look-0.014
're-0.013
 roy-0.013
Tokenifying
Token ID4035
Feature activation+1.60e-01

Pos logit contributions
AAAA+0.002
 Look+0.001
 Are+0.001
 roy+0.001
 D+0.001
Neg logit contributions
bench-0.015
old-0.015
 student-0.014
bars-0.014
iger-0.014
Token glass
Token ID5405
Feature activation+2.41e-02

Pos logit contributions
AAAA+0.238
cel+0.235
're+0.224
 forgot+0.219
ifying+0.216
Neg logit contributions
bench-0.232
 math-0.220
old-0.217
case-0.216
 x-0.212
Token?"
Token ID1701
Feature activation-6.88e-03

Pos logit contributions
AAAA+0.032
ifying+0.031
 Look+0.030
 roy+0.028
're+0.028
Neg logit contributions
bench-0.044
old-0.044
 worker-0.040
 student-0.039
guard-0.039
Token\n
Token ID198
Feature activation-1.45e-02

Pos logit contributions
old+0.012
bench+0.012
Years+0.011
 student+0.011
 homework+0.010
Neg logit contributions
AAAA-0.010
ifying-0.009
 Look-0.009
 woke-0.009
're-0.009
Token\n
Token ID198
Feature activation-1.13e-02

Pos logit contributions
old+0.025
bench+0.025
 student+0.022
Years+0.022
 homework+0.022
Neg logit contributions
AAAA-0.020
ifying-0.020
 Look-0.020
 woke-0.019
 D-0.019
Token"
Token ID1
Feature activation+3.11e-02

Pos logit contributions
old+0.017
Years+0.016
bench+0.015
 worker+0.015
ggie+0.015
Neg logit contributions
ifying-0.019
're-0.019
AAAA-0.018
 roy-0.017
 Look-0.017
Token flashlight
Token ID38371
Feature activation-5.51e-02

Pos logit contributions
ifying+0.004
AAAA+0.004
 Look+0.004
're+0.004
 roy+0.004
Neg logit contributions
old-0.004
bench-0.004
 student-0.004
Years-0.004
 worker-0.004
Token,
Token ID11
Feature activation-1.75e-04

Pos logit contributions
old+0.087
bench+0.083
Years+0.076
 student+0.074
ritz+0.072
Neg logit contributions
ifying-0.087
AAAA-0.083
 Look-0.083
 woke-0.081
're-0.080
Token and
Token ID290
Feature activation+1.54e-03

Pos logit contributions
old+0.000
bench+0.000
Years+0.000
 student+0.000
 homework+0.000
Neg logit contributions
ifying-0.000
AAAA-0.000
 Look-0.000
 roy-0.000
're-0.000
Token a
Token ID257
Feature activation+2.20e-03

Pos logit contributions
AAAA+0.002
ifying+0.002
 Look+0.002
 roy+0.002
're+0.002
Neg logit contributions
old-0.003
bench-0.002
 student-0.002
Years-0.002
 homework-0.002
Token magn
Token ID7842
Feature activation+8.51e-03

Pos logit contributions
AAAA+0.003
ifying+0.003
 Look+0.003
're+0.003
 roy+0.003
Neg logit contributions
old-0.003
bench-0.003
 student-0.003
Years-0.003
 tutor-0.003
Tokenifying
Token ID4035
Feature activation+1.59e-01

Pos logit contributions
ifying+0.003
 Look+0.002
AAAA+0.002
're+0.002
 woke+0.002
Neg logit contributions
old-0.024
bench-0.024
Years-0.023
 student-0.022
 homework-0.022
Token glass
Token ID5405
Feature activation+2.07e-02

Pos logit contributions
AAAA+0.244
cel+0.237
 roy+0.233
 Look+0.223
 Are+0.219
Neg logit contributions
bench-0.240
case-0.236
 math-0.216
old-0.210
 site-0.208
Token.
Token ID13
Feature activation+8.38e-03

Pos logit contributions
ifying+0.035
AAAA+0.035
 Look+0.034
're+0.032
 roy+0.032
Neg logit contributions
old-0.030
bench-0.029
Years-0.026
 student-0.026
 worker-0.025
Token\n
Token ID198
Feature activation+8.62e-03

Pos logit contributions
ifying+0.011
AAAA+0.011
 Look+0.011
're+0.010
 roy+0.010
Neg logit contributions
old-0.015
bench-0.015
Years-0.013
 student-0.013
 worker-0.013
Token\n
Token ID198
Feature activation+6.68e-03

Pos logit contributions
AAAA+0.008
 Look+0.008
 D+0.008
 woke+0.007
 Are+0.007
Neg logit contributions
bench-0.018
 homework-0.018
 rider-0.018
smanship-0.018
bars-0.017
TokenAs
Token ID1722
Feature activation-2.20e-02

Pos logit contributions
AAAA+0.008
ifying+0.008
 Look+0.008
 woke+0.007
're+0.007
Neg logit contributions
bench-0.013
old-0.012
 homework-0.011
Years-0.011
 student-0.011
Token\n
Token ID198
Feature activation-1.13e-02

Pos logit contributions
bars+0.032
bench+0.032
 salon+0.031
 homework+0.031
 student+0.031
Neg logit contributions
 woke-0.008
AAAA-0.007
 D-0.007
 Look-0.006
 Are-0.005
TokenHe
Token ID1544
Feature activation-9.18e-03

Pos logit contributions
bench+0.020
old+0.020
 homework+0.018
bars+0.018
 tutor+0.018
Neg logit contributions
AAAA-0.015
ifying-0.014
 Look-0.014
 woke-0.014
 Are-0.013
Token brought
Token ID3181
Feature activation+3.91e-02

Pos logit contributions
old+0.020
bench+0.019
Years+0.018
 student+0.017
 worker+0.017
Neg logit contributions
ifying-0.010
AAAA-0.009
 Look-0.009
're-0.008
 roy-0.008
Token his
Token ID465
Feature activation-1.16e-02

Pos logit contributions
 Look+0.047
AAAA+0.046
ifying+0.046
 D+0.045
 woke+0.045
Neg logit contributions
old-0.075
bench-0.075
 student-0.068
Years-0.068
 homework-0.067
Token magn
Token ID7842
Feature activation+1.88e-03

Pos logit contributions
old+0.022
bench+0.022
Years+0.020
ritz+0.019
 student+0.019
Neg logit contributions
ifying-0.014
AAAA-0.014
 Look-0.014
're-0.014
 roy-0.013
Tokenifying
Token ID4035
Feature activation+1.57e-01

Pos logit contributions
AAAA+0.001
 Look+0.000
 woke+0.000
 Are+0.000
 roy+0.000
Neg logit contributions
old-0.005
bench-0.005
Years-0.005
 student-0.005
ritz-0.005
Token glass
Token ID5405
Feature activation+1.31e-02

Pos logit contributions
AAAA+0.215
cel+0.207
're+0.200
 Look+0.199
ifying+0.197
Neg logit contributions
bench-0.259
old-0.258
 math-0.242
 tutor-0.239
 student-0.238
Token and
Token ID290
Feature activation-2.09e-03

Pos logit contributions
AAAA+0.020
ifying+0.020
 Look+0.020
're+0.019
 roy+0.019
Neg logit contributions
old-0.021
bench-0.020
Years-0.018
 student-0.018
 worker-0.018
Token looked
Token ID3114
Feature activation+4.02e-02

Pos logit contributions
old+0.004
bench+0.004
Years+0.004
 student+0.004
 worker+0.004
Neg logit contributions
ifying-0.003
AAAA-0.003
 Look-0.002
're-0.002
 roy-0.002
Token closely
Token ID7173
Feature activation-6.02e-02

Pos logit contributions
AAAA+0.082
ifying+0.079
 Look+0.078
 Sorry+0.077
 Are+0.074
Neg logit contributions
old-0.046
bench-0.042
Years-0.040
 aloud-0.037
 worker-0.037
Token.
Token ID13
Feature activation-7.77e-03

Pos logit contributions
bench+0.116
old+0.113
 student+0.107
Years+0.104
 professor+0.104
Neg logit contributions
ifying-0.074
're-0.070
 Look-0.069
 D-0.069
 woke-0.068
Token She
Token ID1375
Feature activation+1.09e-02

Pos logit contributions
old+0.026
bench+0.025
 homework+0.025
ggie+0.024
smanship+0.024
Neg logit contributions
 Look-0.014
AAAA-0.013
 D-0.012
 Are-0.012
ifying-0.011
Token takes
Token ID2753
Feature activation+1.21e-02

Pos logit contributions
AAAA+0.013
ifying+0.013
 Look+0.013
're+0.012
 woke+0.012
Neg logit contributions
old-0.021
bench-0.021
Years-0.019
 student-0.019
ggie-0.018
Token out
Token ID503
Feature activation-6.92e-04

Pos logit contributions
 Look+0.013
ifying+0.013
 D+0.012
AAAA+0.012
 roy+0.012
Neg logit contributions
old-0.025
bench-0.025
Years-0.023
guard-0.022
ggie-0.022
Token a
Token ID257
Feature activation+8.80e-03

Pos logit contributions
old+0.001
bench+0.001
Years+0.001
ggie+0.001
 rider+0.001
Neg logit contributions
ifying-0.001
AAAA-0.001
 Look-0.001
 roy-0.001
 D-0.001
Token magn
Token ID7842
Feature activation-1.79e-04

Pos logit contributions
ifying+0.013
AAAA+0.013
 Look+0.013
 woke+0.012
're+0.012
Neg logit contributions
old-0.014
bench-0.014
Years-0.013
 student-0.012
ritz-0.012
Tokenifying
Token ID4035
Feature activation+1.56e-01

Pos logit contributions
old+0.001
bench+0.001
Years+0.001
 student+0.001
 homework+0.001
Neg logit contributions
ifying-0.000
AAAA-0.000
 Look-0.000
're-0.000
 roy+0.000
Token glass
Token ID5405
Feature activation+1.49e-02

Pos logit contributions
AAAA+0.223
cel+0.217
're+0.216
 roy+0.213
 Look+0.207
Neg logit contributions
bench-0.248
 math-0.233
old-0.229
case-0.228
 tutor-0.224
Token.
Token ID13
Feature activation+5.56e-03

Pos logit contributions
ifying+0.023
AAAA+0.022
 Look+0.022
 woke+0.021
're+0.021
Neg logit contributions
old-0.024
bench-0.023
Years-0.021
 student-0.021
 homework-0.020
Token She
Token ID1375
Feature activation+2.65e-02

Pos logit contributions
 Look+0.006
AAAA+0.006
 D+0.006
ifying+0.006
 Are+0.005
Neg logit contributions
old-0.012
bench-0.011
Years-0.010
ggie-0.010
 rider-0.010
Token looks
Token ID3073
Feature activation+1.47e-02

Pos logit contributions
ifying+0.035
AAAA+0.034
 Look+0.033
're+0.032
 roy+0.031
Neg logit contributions
old-0.049
bench-0.047
Years-0.043
 student-0.043
 homework-0.042
Token at
Token ID379
Feature activation+2.54e-04

Pos logit contributions
AAAA+0.028
ifying+0.028
 Sorry+0.027
 Look+0.026
're+0.026
Neg logit contributions
old-0.018
bench-0.017
Years-0.016
 aloud-0.016
 math-0.015
Token didn
Token ID1422
Feature activation-1.70e-02

Pos logit contributions
AAAA+0.015
 Look+0.014
ifying+0.014
 roy+0.013
 Are+0.013
Neg logit contributions
old-0.020
bench-0.020
Years-0.018
 student-0.018
 worker-0.018
Token't
Token ID470
Feature activation+5.94e-03

Pos logit contributions
old+0.034
bench+0.032
 student+0.030
 professor+0.028
Years+0.028
Neg logit contributions
ifying-0.021
're-0.021
AAAA-0.020
 Look-0.019
 woke-0.019
Token stop
Token ID2245
Feature activation-2.35e-02

Pos logit contributions
ifying+0.008
AAAA+0.008
 Look+0.007
 roy+0.007
 Are+0.006
Neg logit contributions
old-0.011
bench-0.011
 student-0.010
Years-0.010
 worker-0.010
Token until
Token ID1566
Feature activation-1.65e-02

Pos logit contributions
old+0.031
bench+0.029
 student+0.026
 homework+0.026
 research+0.026
Neg logit contributions
AAAA-0.043
ifying-0.041
 Look-0.038
 woke-0.038
're-0.037
Token their
Token ID511
Feature activation+2.61e-02

Pos logit contributions
old+0.020
bench+0.020
 homework+0.017
Years+0.017
 worker+0.017
Neg logit contributions
AAAA-0.031
ifying-0.030
 Look-0.029
 roy-0.029
're-0.028
Token stomach
Token ID11384
Feature activation+1.55e-01

Pos logit contributions
ifying+0.039
 Look+0.037
AAAA+0.037
 woke+0.036
 roy+0.036
Neg logit contributions
old-0.044
bench-0.044
Years-0.037
ritz-0.037
 student-0.037
Tokens
Token ID82
Feature activation+1.07e-02

Pos logit contributions
AAAA+0.195
 Look+0.189
ifying+0.189
 roy+0.175
're+0.169
Neg logit contributions
old-0.288
bench-0.286
Years-0.257
 worker-0.255
 student-0.252
Token were
Token ID547
Feature activation+3.88e-02

Pos logit contributions
AAAA+0.016
ifying+0.016
 Look+0.016
 roy+0.015
're+0.015
Neg logit contributions
old-0.017
bench-0.017
Years-0.015
 student-0.015
 homework-0.015
Token full
Token ID1336
Feature activation+8.76e-02

Pos logit contributions
AAAA+0.058
 Look+0.055
ifying+0.054
 Are+0.052
 roy+0.052
Neg logit contributions
bench-0.066
old-0.064
Years-0.058
 worker-0.057
 homework-0.056
Token.
Token ID13
Feature activation-8.38e-04

Pos logit contributions
ifying+0.154
AAAA+0.148
 Look+0.148
're+0.144
 woke+0.143
Neg logit contributions
old-0.122
bench-0.118
Years-0.105
 student-0.103
 homework-0.100
Token\n
Token ID198
Feature activation+1.13e-03

Pos logit contributions
old+0.001
bench+0.001
Years+0.001
 student+0.001
 tutor+0.001
Neg logit contributions
ifying-0.001
AAAA-0.001
 roy-0.001
 Look-0.001
 woke-0.001
Token it
Token ID340
Feature activation+1.98e-02

Pos logit contributions
ifying+0.008
AAAA+0.008
 Look+0.008
 D+0.007
 woke+0.007
Neg logit contributions
old-0.012
bench-0.011
Years-0.011
ritz-0.010
ggie-0.010
Token she
Token ID673
Feature activation+1.94e-02

Pos logit contributions
ifying+0.027
AAAA+0.025
 Look+0.024
 roy+0.023
're+0.022
Neg logit contributions
old-0.038
bench-0.036
Years-0.033
 student-0.033
 worker-0.033
Token had
Token ID550
Feature activation+4.82e-02

Pos logit contributions
ifying+0.026
AAAA+0.023
 Look+0.022
 roy+0.022
're+0.021
Neg logit contributions
old-0.038
bench-0.035
 student-0.033
Years-0.032
 homework-0.032
Token a
Token ID257
Feature activation+7.70e-03

Pos logit contributions
ifying+0.081
AAAA+0.079
 Look+0.075
 Sorry+0.075
 roy+0.075
Neg logit contributions
old-0.073
bench-0.070
Years-0.063
ggie-0.061
 student-0.060
Token magn
Token ID7842
Feature activation-3.16e-03

Pos logit contributions
ifying+0.011
AAAA+0.011
 Look+0.011
're+0.010
 roy+0.010
Neg logit contributions
old-0.013
bench-0.013
 student-0.012
Years-0.011
 tutor-0.011
Tokenifying
Token ID4035
Feature activation+1.54e-01

Pos logit contributions
old+0.009
bench+0.009
Years+0.009
 student+0.009
ritz+0.009
Neg logit contributions
AAAA-0.001
ifying-0.001
 Look-0.000
're-0.000
 roy-0.000
Token glass
Token ID5405
Feature activation+1.58e-02

Pos logit contributions
AAAA+0.222
 roy+0.213
 Look+0.207
cel+0.204
ifying+0.194
Neg logit contributions
bench-0.242
case-0.239
 math-0.223
 dentist-0.222
old-0.221
Token,
Token ID11
Feature activation-3.10e-04

Pos logit contributions
ifying+0.026
AAAA+0.025
 Look+0.024
 woke+0.024
're+0.024
Neg logit contributions
old-0.024
bench-0.023
Years-0.021
 student-0.021
 homework-0.020
Token a
Token ID257
Feature activation+7.18e-03

Pos logit contributions
old+0.000
bench+0.000
Years+0.000
 student+0.000
 homework+0.000
Neg logit contributions
AAAA-0.000
ifying-0.000
 Look-0.000
're-0.000
 roy-0.000
Token tiny
Token ID7009
Feature activation+2.06e-03

Pos logit contributions
AAAA+0.011
ifying+0.011
 Look+0.010
're+0.010
 roy+0.010
Neg logit contributions
bench-0.011
old-0.011
 student-0.011
 tutor-0.010
 worker-0.010
Token torch
Token ID28034
Feature activation-6.60e-02

Pos logit contributions
AAAA+0.003
ifying+0.003
're+0.003
 Look+0.003
 Sorry+0.003
Neg logit contributions
old-0.003
bench-0.003
 student-0.003
 worker-0.003
 tutor-0.003
Token and
Token ID290
Feature activation-9.31e-03

Pos logit contributions
 woke+0.027
're+0.024
 Look+0.023
 D+0.021
 c+0.020
Neg logit contributions
bench-0.071
old-0.067
Years-0.065
ritz-0.064
 student-0.063
Token handed
Token ID10158
Feature activation+1.94e-02

Pos logit contributions
old+0.019
bench+0.018
 student+0.017
Years+0.017
 worker+0.016
Neg logit contributions
ifying-0.011
AAAA-0.011
 Look-0.010
're-0.010
 roy-0.010
Token him
Token ID683
Feature activation-2.64e-02

Pos logit contributions
 D+0.023
 Look+0.023
 P+0.022
 Po+0.022
 woke+0.021
Neg logit contributions
bench-0.040
 outage-0.039
Years-0.038
old-0.037
 appointment-0.037
Token the
Token ID262
Feature activation-1.42e-02

Pos logit contributions
old+0.045
bench+0.043
Years+0.039
 student+0.039
 homework+0.038
Neg logit contributions
ifying-0.038
AAAA-0.038
 Look-0.037
're-0.035
 woke-0.035
Token magn
Token ID7842
Feature activation+1.74e-03

Pos logit contributions
bench+0.021
old+0.020
 student+0.019
 homework+0.018
 worker+0.018
Neg logit contributions
ifying-0.023
AAAA-0.023
 Look-0.022
 Sorry-0.021
 woke-0.021
Tokenifying
Token ID4035
Feature activation+1.54e-01

Pos logit contributions
AAAA+0.001
 Look+0.000
ifying+0.000
 woke+0.000
're+0.000
Neg logit contributions
old-0.005
bench-0.005
 student-0.005
Years-0.005
ritz-0.005
Token glass
Token ID5405
Feature activation+2.20e-02

Pos logit contributions
AAAA+0.221
're+0.220
cel+0.212
 Look+0.202
 roy+0.199
Neg logit contributions
bench-0.233
old-0.228
 math-0.222
 student-0.218
 tutor-0.216
Token.
Token ID13
Feature activation+6.21e-04

Pos logit contributions
ifying+0.038
AAAA+0.037
 Look+0.036
're+0.035
 roy+0.035
Neg logit contributions
old-0.032
bench-0.030
Years-0.027
 student-0.027
 homework-0.026
Token\n
Token ID198
Feature activation-1.21e-03

Pos logit contributions
 Look+0.001
ifying+0.001
AAAA+0.001
 D+0.001
 Are+0.001
Neg logit contributions
old-0.001
bench-0.001
Years-0.001
ggie-0.001
ritz-0.001
Token\n
Token ID198
Feature activation-6.76e-04

Pos logit contributions
bench+0.003
 rider+0.003
smanship+0.003
 appointment+0.003
 student+0.003
Neg logit contributions
 D-0.001
 woke-0.001
AAAA-0.001
 Look-0.001
 P-0.001
TokenHe
Token ID1544
Feature activation+7.10e-03

Pos logit contributions
old+0.001
bench+0.001
Years+0.001
 homework+0.001
iger+0.001
Neg logit contributions
ifying-0.001
AAAA-0.001
 Look-0.001
 woke-0.001
 roy-0.001
Token crack
Token ID8469
Feature activation-4.31e-02

Pos logit contributions
ifying+0.003
AAAA+0.003
 Look+0.003
're+0.003
 woke+0.003
Neg logit contributions
old-0.003
bench-0.003
Years-0.003
 homework-0.003
 student-0.003
Tokeners
Token ID364
Feature activation-1.61e-02

Pos logit contributions
old+0.076
bench+0.074
Years+0.067
 student+0.066
 homework+0.064
Neg logit contributions
ifying-0.060
AAAA-0.059
 Look-0.057
're-0.055
 roy-0.055
Token to
Token ID284
Feature activation+2.22e-02

Pos logit contributions
old+0.025
bench+0.025
 homework+0.022
 worker+0.022
 student+0.022
Neg logit contributions
AAAA-0.025
ifying-0.024
 Look-0.024
 Are-0.022
 roy-0.022
Token help
Token ID1037
Feature activation-8.67e-03

Pos logit contributions
ifying+0.028
AAAA+0.028
 Look+0.027
 roy+0.025
're+0.025
Neg logit contributions
old-0.042
bench-0.040
Years-0.037
 student-0.037
 homework-0.036
Token your
Token ID534
Feature activation+6.07e-02

Pos logit contributions
bench+0.017
old+0.017
Years+0.016
 student+0.015
ritz+0.015
Neg logit contributions
ifying-0.010
 Look-0.009
AAAA-0.009
're-0.009
 D-0.009
Token stomach
Token ID11384
Feature activation+1.53e-01

Pos logit contributions
 Look+0.085
ifying+0.085
AAAA+0.084
 roy+0.082
 woke+0.080
Neg logit contributions
old-0.104
bench-0.103
Years-0.093
ritz-0.089
 student-0.088
Token feel
Token ID1254
Feature activation+3.84e-02

Pos logit contributions
AAAA+0.221
ifying+0.215
 Look+0.209
 roy+0.196
 D+0.187
Neg logit contributions
old-0.263
bench-0.261
 homework-0.234
Years-0.231
 worker-0.228
Token better
Token ID1365
Feature activation-2.00e-02

Pos logit contributions
ifying+0.093
AAAA+0.092
 Look+0.089
cel+0.087
 roy+0.086
Neg logit contributions
old-0.027
bench-0.026
 research-0.020
 homework-0.019
Years-0.018
Token."
Token ID526
Feature activation+2.04e-02

Pos logit contributions
bench+0.034
old+0.033
 homework+0.029
 worker+0.029
 student+0.029
Neg logit contributions
AAAA-0.029
ifying-0.029
 Look-0.027
're-0.025
 roy-0.025
Token Tim
Token ID5045
Feature activation-5.86e-02

Pos logit contributions
AAAA+0.027
ifying+0.027
 Look+0.027
 D+0.026
 woke+0.026
Neg logit contributions
old-0.036
bench-0.035
Years-0.033
 student-0.032
ritz-0.031
Tokenmy
Token ID1820
Feature activation-4.24e-02

Pos logit contributions
old+0.097
bench+0.096
Years+0.086
 student+0.085
 homework+0.083
Neg logit contributions
ifying-0.084
AAAA-0.084
 Look-0.083
 woke-0.081
're-0.080
Token chili
Token ID32728
Feature activation-6.30e-04

Pos logit contributions
ifying+0.028
AAAA+0.027
 Look+0.026
 roy+0.025
're+0.025
Neg logit contributions
bench-0.033
old-0.033
 worker-0.031
 student-0.030
 tutor-0.030
Token pepper
Token ID13385
Feature activation+3.16e-02

Pos logit contributions
old+0.001
bench+0.001
Years+0.001
 student+0.001
 worker+0.001
Neg logit contributions
ifying-0.001
AAAA-0.001
 Look-0.001
're-0.001
 roy-0.001
Token before
Token ID878
Feature activation-1.66e-02

Pos logit contributions
ifying+0.052
AAAA+0.050
 Look+0.050
're+0.048
 woke+0.048
Neg logit contributions
old-0.048
bench-0.046
Years-0.041
 student-0.040
 homework-0.039
Token and
Token ID290
Feature activation+2.19e-02

Pos logit contributions
old+0.026
bench+0.025
Years+0.023
 student+0.022
ritz+0.022
Neg logit contributions
ifying-0.026
AAAA-0.026
 Look-0.025
're-0.024
 woke-0.024
Token his
Token ID465
Feature activation+3.18e-02

Pos logit contributions
AAAA+0.038
ifying+0.038
 Look+0.036
 roy+0.035
're+0.035
Neg logit contributions
old-0.032
bench-0.030
 student-0.026
Years-0.026
 homework-0.026
Token stomach
Token ID11384
Feature activation+1.52e-01

Pos logit contributions
ifying+0.045
AAAA+0.043
 Look+0.043
're+0.041
 woke+0.041
Neg logit contributions
old-0.056
bench-0.054
Years-0.049
 student-0.048
ritz-0.047
Token was
Token ID373
Feature activation+6.02e-05

Pos logit contributions
AAAA+0.202
 Look+0.191
ifying+0.182
 roy+0.181
're+0.168
Neg logit contributions
bench-0.276
old-0.273
ggie-0.250
 worker-0.250
Years-0.247
Token starting
Token ID3599
Feature activation-5.67e-02

Pos logit contributions
AAAA+0.000
ifying+0.000
 Look+0.000
 roy+0.000
're+0.000
Neg logit contributions
old-0.000
bench-0.000
Years-0.000
 student-0.000
ggie-0.000
Token to
Token ID284
Feature activation+1.41e-02

Pos logit contributions
old+0.070
bench+0.068
Years+0.057
 student+0.057
 homework+0.056
Neg logit contributions
AAAA-0.108
ifying-0.108
 Look-0.105
're-0.102
 roy-0.102
Token trem
Token ID10023
Feature activation+6.65e-02

Pos logit contributions
ifying+0.021
 Look+0.019
AAAA+0.019
 roy+0.019
als+0.018
Neg logit contributions
bench-0.023
old-0.022
 homework-0.021
 research-0.021
 appointment-0.020
Tokenble
Token ID903
Feature activation+6.36e-03

Pos logit contributions
ifying+0.103
AAAA+0.100
 Look+0.098
 roy+0.094
 woke+0.093
Neg logit contributions
old-0.108
bench-0.105
Years-0.094
 student-0.092
 homework-0.090
Token He
Token ID679
Feature activation-5.95e-03

Pos logit contributions
 Look+0.019
AAAA+0.018
 Are+0.017
 woke+0.017
're+0.016
Neg logit contributions
bench-0.027
old-0.026
Years-0.025
ritz-0.025
bars-0.024
Token took
Token ID1718
Feature activation+1.95e-02

Pos logit contributions
old+0.012
bench+0.012
Years+0.011
 student+0.011
 homework+0.010
Neg logit contributions
AAAA-0.007
ifying-0.007
 Look-0.006
 woke-0.006
're-0.006
Token out
Token ID503
Feature activation-1.47e-02

Pos logit contributions
 Look+0.022
ifying+0.021
AAAA+0.021
 woke+0.021
're+0.020
Neg logit contributions
bench-0.038
old-0.037
Years-0.034
 salon-0.034
bars-0.034
Token a
Token ID257
Feature activation-5.78e-03

Pos logit contributions
bench+0.026
old+0.026
Years+0.024
ritz+0.023
ggie+0.023
Neg logit contributions
ifying-0.020
 Look-0.018
AAAA-0.018
 woke-0.018
 roy-0.018
Token magn
Token ID7842
Feature activation-7.04e-03

Pos logit contributions
old+0.009
bench+0.009
Years+0.008
 student+0.008
 worker+0.008
Neg logit contributions
ifying-0.009
AAAA-0.009
 Look-0.009
're-0.008
 roy-0.008
Tokenifying
Token ID4035
Feature activation+1.51e-01

Pos logit contributions
old+0.023
bench+0.022
Years+0.021
 homework+0.021
 student+0.021
Neg logit contributions
ifying-0.000
 Look+0.000
AAAA+0.001
're+0.001
 woke+0.001
Token glass
Token ID5405
Feature activation+8.21e-03

Pos logit contributions
AAAA+0.211
're+0.209
cel+0.203
 roy+0.201
 Look+0.196
Neg logit contributions
bench-0.238
old-0.229
 student-0.225
 tutor-0.225
 math-0.225
Token and
Token ID290
Feature activation+2.13e-03

Pos logit contributions
ifying+0.012
AAAA+0.012
 Look+0.012
 woke+0.011
're+0.011
Neg logit contributions
old-0.014
bench-0.013
Years-0.012
 student-0.012
 homework-0.011
Token a
Token ID257
Feature activation+4.87e-03

Pos logit contributions
ifying+0.003
AAAA+0.002
 Look+0.002
're+0.002
 woke+0.002
Neg logit contributions
old-0.004
bench-0.004
Years-0.004
 student-0.004
 homework-0.004
Token notebook
Token ID20922
Feature activation+3.61e-02

Pos logit contributions
AAAA+0.007
ifying+0.007
 Look+0.007
're+0.007
 roy+0.007
Neg logit contributions
old-0.008
bench-0.008
 student-0.007
 tutor-0.007
 worker-0.007
Token and
Token ID290
Feature activation+6.49e-03

Pos logit contributions
ifying+0.051
 woke+0.051
 Look+0.050
're+0.048
AAAA+0.047
Neg logit contributions
old-0.062
bench-0.057
Years-0.055
ritz-0.053
iger-0.052
Token cried
Token ID16896
Feature activation+1.57e-02

Pos logit contributions
old+0.070
bench+0.068
Years+0.066
 homework+0.062
 worker+0.061
Neg logit contributions
 Look-0.039
 woke-0.039
AAAA-0.038
're-0.035
 Are-0.035
Token,
Token ID11
Feature activation-1.20e-02

Pos logit contributions
 D+0.021
 Look+0.020
 roy+0.020
AAAA+0.020
ifying+0.020
Neg logit contributions
old-0.028
bench-0.028
Years-0.026
ae-0.025
 homework-0.025
Token dropping
Token ID12047
Feature activation-1.58e-02

Pos logit contributions
old+0.018
bench+0.018
Years+0.016
 student+0.016
 worker+0.015
Neg logit contributions
ifying-0.020
AAAA-0.019
 Look-0.019
're-0.018
 roy-0.018
Token the
Token ID262
Feature activation-1.30e-02

Pos logit contributions
old+0.029
bench+0.028
Years+0.026
ae+0.025
 student+0.025
Neg logit contributions
ifying-0.021
AAAA-0.021
 Look-0.020
 D-0.020
 roy-0.019
Token magn
Token ID7842
Feature activation-4.91e-03

Pos logit contributions
old+0.023
bench+0.022
 student+0.020
Years+0.020
 worker+0.020
Neg logit contributions
ifying-0.018
AAAA-0.018
 Look-0.018
're-0.017
 woke-0.017
Tokenifying
Token ID4035
Feature activation+1.50e-01

Pos logit contributions
old+0.015
bench+0.015
Years+0.014
 student+0.014
 worker+0.014
Neg logit contributions
ifying-0.000
AAAA-0.000
 Look+0.000
're+0.000
 roy+0.000
Token glass
Token ID5405
Feature activation+1.65e-02

Pos logit contributions
're+0.202
AAAA+0.201
cel+0.193
 Look+0.190
ifying+0.187
Neg logit contributions
bench-0.245
old-0.243
 worker-0.230
 student-0.230
 tutor-0.227
Token.
Token ID13
Feature activation-3.04e-03

Pos logit contributions
ifying+0.027
AAAA+0.027
 Look+0.026
're+0.025
 roy+0.025
Neg logit contributions
old-0.025
bench-0.024
Years-0.021
 student-0.021
 worker-0.020
Token\n
Token ID198
Feature activation-7.48e-03

Pos logit contributions
old+0.005
bench+0.005
Years+0.004
 student+0.004
 homework+0.004
Neg logit contributions
ifying-0.004
AAAA-0.004
 Look-0.004
're-0.004
 woke-0.004
Token\n
Token ID198
Feature activation-5.40e-03

Pos logit contributions
bench+0.015
 student+0.014
old+0.014
smanship+0.014
 homework+0.014
Neg logit contributions
AAAA-0.009
 Look-0.008
 D-0.008
 woke-0.007
 Are-0.007
Token"
Token ID1
Feature activation+3.63e-02

Pos logit contributions
old+0.009
bench+0.009
Years+0.008
 student+0.008
 worker+0.008
Neg logit contributions
ifying-0.008
AAAA-0.007
 Look-0.007
're-0.007
 roy-0.007
Token elephants
Token ID25368
Feature activation+1.32e-04

Pos logit contributions
AAAA+0.020
ifying+0.019
 roy+0.018
 Look+0.018
 Sorry+0.018
Neg logit contributions
old-0.015
bench-0.013
 student-0.013
 worker-0.012
 homework-0.012
Token,
Token ID11
Feature activation-1.99e-03

Pos logit contributions
ifying+0.000
 Look+0.000
're+0.000
AAAA+0.000
 woke+0.000
Neg logit contributions
bench-0.000
old-0.000
Years-0.000
 student-0.000
ritz-0.000
Token lions
Token ID31420
Feature activation-1.37e-02

Pos logit contributions
old+0.002
bench+0.002
 worker+0.002
 student+0.002
 rider+0.002
Neg logit contributions
 Sorry-0.004
ifying-0.004
AAAA-0.004
 Look-0.003
 roy-0.003
Token and
Token ID290
Feature activation-6.48e-04

Pos logit contributions
old+0.029
bench+0.029
Years+0.027
 student+0.026
 homework+0.025
Neg logit contributions
ifying-0.014
AAAA-0.014
 Look-0.013
're-0.013
 woke-0.012
Token ac
Token ID936
Feature activation+3.73e-02

Pos logit contributions
old+0.001
 student+0.001
bench+0.001
 math+0.001
 tutor+0.001
Neg logit contributions
ifying-0.001
AAAA-0.001
 Look-0.001
're-0.001
 Sorry-0.001
Tokenrob
Token ID22609
Feature activation-1.59e-01

Pos logit contributions
AAAA+0.062
 woke+0.058
ifying+0.057
als+0.057
cel+0.057
Neg logit contributions
old-0.053
 student-0.052
bench-0.051
 tutor-0.049
Years-0.049
Tokenats
Token ID1381
Feature activation-2.87e-02

Pos logit contributions
bench+0.212
old+0.207
Years+0.188
 attendant+0.186
 student+0.185
Neg logit contributions
AAAA-0.273
ifying-0.271
're-0.268
 Look-0.266
 roy-0.262
Token.
Token ID13
Feature activation-4.87e-03

Pos logit contributions
old+0.045
bench+0.045
Years+0.041
 student+0.040
 homework+0.038
Neg logit contributions
ifying-0.045
 Look-0.043
're-0.043
AAAA-0.042
 woke-0.042
Token They
Token ID1119
Feature activation+8.19e-03

Pos logit contributions
old+0.009
bench+0.009
Years+0.008
 student+0.008
 worker+0.008
Neg logit contributions
ifying-0.006
AAAA-0.006
 Look-0.006
're-0.006
 roy-0.006
Token are
Token ID389
Feature activation-7.89e-03

Pos logit contributions
AAAA+0.011
ifying+0.010
 Look+0.010
 roy+0.010
 D+0.009
Neg logit contributions
old-0.015
bench-0.014
 student-0.014
 research-0.013
 professor-0.013
Token amazed
Token ID24562
Feature activation-7.43e-03

Pos logit contributions
old+0.014
bench+0.013
Years+0.012
 student+0.012
ritz+0.012
Neg logit contributions
ifying-0.011
AAAA-0.011
 Look-0.011
're-0.010
 woke-0.010
Token was
Token ID373
Feature activation-2.31e-02

Pos logit contributions
bench+0.057
old+0.056
 student+0.052
ae+0.052
ggie+0.051
Neg logit contributions
 woke-0.026
AAAA-0.025
 Look-0.023
 roy-0.023
 D-0.023
Token three
Token ID1115
Feature activation-5.12e-02

Pos logit contributions
old+0.040
bench+0.039
Years+0.035
 student+0.035
ritz+0.034
Neg logit contributions
ifying-0.033
AAAA-0.032
 Look-0.031
're-0.030
 woke-0.030
Token-
Token ID12
Feature activation-1.08e-01

Pos logit contributions
bench+0.088
old+0.082
Years+0.078
iger+0.076
ritz+0.075
Neg logit contributions
ifying-0.069
 Look-0.068
AAAA-0.067
 woke-0.067
 roy-0.067
Tokenyears
Token ID19002
Feature activation-3.97e-02

Pos logit contributions
bench+0.198
 homework+0.194
case+0.184
 tutor+0.179
 rider+0.177
Neg logit contributions
 ah-0.119
 Look-0.118
ifying-0.112
cr-0.105
 roy-0.105
Token-
Token ID12
Feature activation-1.18e-01

Pos logit contributions
bench+0.071
 homework+0.060
Years+0.059
 attendant+0.058
 salon+0.057
Neg logit contributions
 roy-0.048
 peanut-0.047
're-0.043
 c-0.043
 woke-0.043
Tokenold
Token ID727
Feature activation-1.54e-01

Pos logit contributions
bench+0.121
 homework+0.106
 tutor+0.086
 cinema+0.084
smanship+0.079
Neg logit contributions
 Look-0.221
cel-0.213
 peanut-0.212
 ate-0.212
 c-0.211
Token.
Token ID13
Feature activation-2.60e-02

Pos logit contributions
bench+0.271
Years+0.254
old+0.250
 homework+0.238
ritz+0.235
Neg logit contributions
 Look-0.210
ifying-0.204
 woke-0.200
're-0.198
 roy-0.195
Token She
Token ID1375
Feature activation+3.38e-04

Pos logit contributions
old+0.047
bench+0.046
Years+0.042
 homework+0.041
ggie+0.041
Neg logit contributions
ifying-0.035
 Look-0.034
AAAA-0.034
're-0.031
 Are-0.031
Token had
Token ID550
Feature activation+2.92e-02

Pos logit contributions
ifying+0.000
AAAA+0.000
 Look+0.000
 roy+0.000
're+0.000
Neg logit contributions
old-0.001
bench-0.001
Years-0.001
 student-0.001
 worker-0.001
Token a
Token ID257
Feature activation-9.15e-03

Pos logit contributions
ifying+0.039
AAAA+0.038
 Look+0.038
're+0.036
 woke+0.036
Neg logit contributions
old-0.053
bench-0.051
Years-0.046
 student-0.046
ritz-0.045
Token lot
Token ID1256
Feature activation-5.20e-04

Pos logit contributions
old+0.016
bench+0.015
Years+0.013
ritz+0.013
 student+0.013
Neg logit contributions
ifying-0.013
AAAA-0.013
 Look-0.013
 woke-0.012
're-0.012
Token spotted
Token ID13489
Feature activation+2.96e-02

Pos logit contributions
ifying+0.002
AAAA+0.002
 Look+0.001
 D+0.001
 roy+0.001
Neg logit contributions
old-0.003
bench-0.002
 student-0.002
 tutor-0.002
 homework-0.002
Token a
Token ID257
Feature activation+1.38e-02

Pos logit contributions
ifying+0.030
 D+0.029
 woke+0.029
 Look+0.029
AAAA+0.028
Neg logit contributions
bench-0.063
old-0.060
Years-0.056
ritz-0.055
 student-0.055
Token row
Token ID5752
Feature activation-4.36e-02

Pos logit contributions
ifying+0.019
AAAA+0.019
 Look+0.018
 woke+0.018
 roy+0.017
Neg logit contributions
old-0.025
bench-0.024
Years-0.021
ritz-0.021
 student-0.021
Token of
Token ID286
Feature activation-2.53e-02

Pos logit contributions
old+0.064
bench+0.061
Years+0.058
 homework+0.056
 research+0.054
Neg logit contributions
ifying-0.075
 woke-0.068
're-0.068
 Look-0.067
AAAA-0.066
Token books
Token ID3835
Feature activation-3.74e-02

Pos logit contributions
old+0.044
bench+0.044
Years+0.040
ritz+0.038
ae+0.038
Neg logit contributions
ifying-0.035
AAAA-0.034
 Look-0.034
 roy-0.032
 woke-0.032
Tokenhel
Token ID2978
Feature activation-1.53e-01

Pos logit contributions
old+0.050
Years+0.044
iger+0.040
 student+0.040
 research+0.039
Neg logit contributions
ifying-0.071
 woke-0.070
're-0.067
 ate-0.063
 reapp-0.062
Tokenves
Token ID1158
Feature activation-7.79e-02

Pos logit contributions
old+0.219
bench+0.203
 professor+0.196
 technician+0.195
 student+0.194
Neg logit contributions
 woke-0.249
ifying-0.225
 landed-0.222
 hatched-0.220
 reapp-0.218
Token in
Token ID287
Feature activation+1.18e-02

Pos logit contributions
old+0.134
bench+0.129
Years+0.119
 student+0.116
 homework+0.113
Neg logit contributions
ifying-0.113
AAAA-0.106
 Look-0.105
're-0.105
 woke-0.103
Token her
Token ID607
Feature activation+2.56e-02

Pos logit contributions
ifying+0.015
AAAA+0.014
 Look+0.014
 woke+0.013
 roy+0.013
Neg logit contributions
old-0.023
bench-0.022
Years-0.020
ritz-0.019
 homework-0.019
Token grandparents
Token ID28571
Feature activation+4.11e-03

Pos logit contributions
ifying+0.042
 Look+0.040
AAAA+0.039
 woke+0.038
're+0.038
Neg logit contributions
old-0.040
bench-0.039
Years-0.035
ritz-0.034
 student-0.033
Token'
Token ID6
Feature activation-3.82e-02

Pos logit contributions
ifying+0.005
're+0.005
 Look+0.004
 woke+0.004
 roy+0.004
Neg logit contributions
Years-0.008
bench-0.008
old-0.008
ritz-0.008
iger-0.007
Token clown
Token ID25573
Feature activation+6.68e-02

Pos logit contributions
 Look+0.005
ifying+0.005
 woke+0.005
 D+0.005
AAAA+0.005
Neg logit contributions
bench-0.008
old-0.007
Years-0.007
ritz-0.007
smanship-0.007
Tokens
Token ID82
Feature activation-2.26e-02

Pos logit contributions
ifying+0.086
're+0.082
 Look+0.082
 woke+0.081
AAAA+0.076
Neg logit contributions
bench-0.131
old-0.123
Years-0.117
ritz-0.109
 homework-0.108
Token juggling
Token ID48097
Feature activation+5.85e-03

Pos logit contributions
old+0.042
bench+0.042
Years+0.038
 student+0.036
ritz+0.036
Neg logit contributions
ifying-0.030
 Look-0.028
AAAA-0.028
're-0.027
 woke-0.027
Token and
Token ID290
Feature activation+9.04e-03

Pos logit contributions
ifying+0.009
AAAA+0.009
 Look+0.009
're+0.008
 roy+0.008
Neg logit contributions
old-0.010
bench-0.009
Years-0.008
 student-0.008
 homework-0.008
Token ac
Token ID936
Feature activation+4.06e-02

Pos logit contributions
ifying+0.014
AAAA+0.014
 Look+0.014
 roy+0.013
're+0.013
Neg logit contributions
old-0.014
bench-0.013
 student-0.012
Years-0.012
 worker-0.012
Tokenrob
Token ID22609
Feature activation-1.53e-01

Pos logit contributions
 escaped+0.054
 ate+0.054
 woke+0.053
ifying+0.053
AAAA+0.052
Neg logit contributions
 tutor-0.068
 student-0.065
old-0.063
 research-0.063
 appointment-0.063
Tokenats
Token ID1381
Feature activation-3.08e-02

Pos logit contributions
bench+0.207
old+0.201
 attendant+0.190
Years+0.186
 worker+0.185
Neg logit contributions
ifying-0.252
're-0.252
AAAA-0.251
 Look-0.244
 roy-0.243
Token doing
Token ID1804
Feature activation-2.53e-02

Pos logit contributions
old+0.053
bench+0.051
Years+0.048
 student+0.046
ritz+0.045
Neg logit contributions
ifying-0.044
're-0.042
 woke-0.042
 Look-0.042
AAAA-0.040
Token flips
Token ID45971
Feature activation-3.48e-02

Pos logit contributions
old+0.044
bench+0.044
ritz+0.038
 student+0.038
iger+0.037
Neg logit contributions
ifying-0.034
AAAA-0.034
 Look-0.034
 D-0.033
're-0.032
Token.
Token ID13
Feature activation-8.45e-03

Pos logit contributions
old+0.066
bench+0.063
Years+0.057
 student+0.056
 appointment+0.056
Neg logit contributions
ifying-0.046
 D-0.041
 Look-0.041
 woke-0.041
're-0.040
Token But
Token ID887
Feature activation-3.83e-02

Pos logit contributions
bench+0.017
old+0.017
Years+0.016
 homework+0.015
 student+0.015
Neg logit contributions
 Look-0.010
AAAA-0.009
ifying-0.009
 D-0.009
 Are-0.009
Token the
Token ID262
Feature activation-3.44e-03

Pos logit contributions
bench+0.015
old+0.014
Years+0.013
 student+0.013
ritz+0.013
Neg logit contributions
 Look-0.011
ifying-0.011
AAAA-0.011
 woke-0.011
 roy-0.011
Token animals
Token ID4695
Feature activation+9.62e-03

Pos logit contributions
old+0.006
bench+0.006
Years+0.005
 student+0.005
ritz+0.005
Neg logit contributions
ifying-0.005
 Look-0.005
AAAA-0.005
 roy-0.005
're-0.005
Token,
Token ID11
Feature activation+7.31e-03

Pos logit contributions
ifying+0.012
're+0.011
 woke+0.011
 Look+0.011
AAAA+0.011
Neg logit contributions
bench-0.018
old-0.018
Years-0.017
 student-0.016
ritz-0.016
Token the
Token ID262
Feature activation-6.00e-03

Pos logit contributions
AAAA+0.011
ifying+0.011
 Look+0.010
 roy+0.010
 woke+0.009
Neg logit contributions
old-0.012
bench-0.012
 student-0.011
 homework-0.011
Years-0.011
Token ac
Token ID936
Feature activation+4.10e-02

Pos logit contributions
old+0.010
bench+0.009
Years+0.008
 student+0.008
ritz+0.008
Neg logit contributions
ifying-0.009
AAAA-0.009
 Look-0.009
're-0.009
 roy-0.009
Tokenrob
Token ID22609
Feature activation-1.52e-01

Pos logit contributions
 woke+0.062
AAAA+0.061
 peanut+0.059
cel+0.059
're+0.058
Neg logit contributions
old-0.059
 student-0.059
Years-0.057
 tutor-0.056
bench-0.055
Tokenats
Token ID1381
Feature activation-2.21e-02

Pos logit contributions
bench+0.226
old+0.211
Years+0.206
 attendant+0.204
 professor+0.199
Neg logit contributions
're-0.234
ifying-0.228
 roy-0.227
AAAA-0.225
 woke-0.223
Token and
Token ID290
Feature activation+2.41e-03

Pos logit contributions
old+0.043
bench+0.043
Years+0.039
 student+0.038
 homework+0.037
Neg logit contributions
ifying-0.026
AAAA-0.025
 Look-0.025
're-0.024
 woke-0.024
Token the
Token ID262
Feature activation-8.74e-03

Pos logit contributions
ifying+0.004
AAAA+0.004
 Look+0.004
 roy+0.004
're+0.004
Neg logit contributions
old-0.004
bench-0.003
 student-0.003
 tutor-0.003
Years-0.003
Token clown
Token ID25573
Feature activation+6.73e-02

Pos logit contributions
old+0.014
bench+0.013
 student+0.012
Years+0.012
 worker+0.011
Neg logit contributions
ifying-0.014
AAAA-0.014
 Look-0.013
're-0.013
 woke-0.013
Tokens
Token ID82
Feature activation-1.21e-02

Pos logit contributions
ifying+0.087
 woke+0.085
're+0.085
 Look+0.082
 D+0.079
Neg logit contributions
bench-0.126
old-0.120
Years-0.117
ritz-0.109
 student-0.106
Token thought
Token ID1807
Feature activation-2.29e-02

Pos logit contributions
old+0.055
bench+0.054
Years+0.049
 student+0.049
 homework+0.048
Neg logit contributions
ifying-0.030
AAAA-0.030
 Look-0.029
're-0.027
 woke-0.027
Token they
Token ID484
Feature activation+5.71e-03

Pos logit contributions
bench+0.046
old+0.043
iger+0.041
 attendant+0.041
 rider+0.040
Neg logit contributions
 woke-0.027
're-0.025
 D-0.025
 Look-0.024
AAAA-0.024
Token looked
Token ID3114
Feature activation+2.78e-02

Pos logit contributions
AAAA+0.007
ifying+0.007
 Look+0.006
 roy+0.006
 D+0.006
Neg logit contributions
old-0.011
bench-0.011
Years-0.010
 student-0.010
 homework-0.010
Token like
Token ID588
Feature activation+3.19e-03

Pos logit contributions
AAAA+0.051
ifying+0.048
 Look+0.047
 woke+0.046
 Sorry+0.046
Neg logit contributions
old-0.037
Years-0.034
 math-0.033
bench-0.033
 homework-0.031
Token ac
Token ID936
Feature activation+3.52e-02

Pos logit contributions
 Look+0.004
 D+0.004
AAAA+0.004
 woke+0.004
ifying+0.004
Neg logit contributions
bench-0.006
old-0.006
ritz-0.005
Years-0.005
iger-0.005
Tokenrob
Token ID22609
Feature activation-1.51e-01

Pos logit contributions
 woke+0.058
're+0.057
ifying+0.057
cel+0.056
AAAA+0.056
Neg logit contributions
old-0.047
 student-0.045
Years-0.044
 tutor-0.044
bench-0.043
Tokenats
Token ID1381
Feature activation-3.26e-02

Pos logit contributions
bench+0.208
old+0.200
 attendant+0.192
Years+0.185
 worker+0.184
Neg logit contributions
're-0.250
ifying-0.248
AAAA-0.247
 roy-0.244
 Look-0.240
Token.
Token ID13
Feature activation-1.15e-02

Pos logit contributions
old+0.056
bench+0.055
Years+0.050
 student+0.049
ritz+0.048
Neg logit contributions
ifying-0.047
AAAA-0.045
 Look-0.044
're-0.044
 woke-0.043
Token She
Token ID1375
Feature activation+1.91e-02

Pos logit contributions
old+0.022
bench+0.022
ggie+0.020
Years+0.019
 student+0.019
Neg logit contributions
 Look-0.015
ifying-0.014
AAAA-0.014
're-0.013
 woke-0.013
Token enjoyed
Token ID8359
Feature activation+2.49e-03

Pos logit contributions
AAAA+0.019
ifying+0.019
 Look+0.018
 woke+0.017
're+0.017
Neg logit contributions
old-0.041
bench-0.040
Years-0.037
 student-0.037
 homework-0.036
Token the
Token ID262
Feature activation-3.25e-02

Pos logit contributions
AAAA+0.003
 D+0.003
 Look+0.003
ifying+0.003
 woke+0.003
Neg logit contributions
old-0.005
bench-0.004
ritz-0.004
bars-0.004
case-0.004
Token a
Token ID257
Feature activation-6.16e-04

Pos logit contributions
old+0.002
bench+0.002
Years+0.002
 student+0.002
 homework+0.002
Neg logit contributions
ifying-0.002
AAAA-0.002
 Look-0.001
're-0.001
 woke-0.001
Token 3
Token ID513
Feature activation-5.48e-02

Pos logit contributions
old+0.001
bench+0.001
Years+0.001
 student+0.001
 homework+0.001
Neg logit contributions
ifying-0.001
AAAA-0.001
 Look-0.001
're-0.001
 roy-0.001
Token-
Token ID12
Feature activation-9.69e-02

Pos logit contributions
bench+0.069
old+0.068
Years+0.062
 homework+0.059
 student+0.058
Neg logit contributions
AAAA-0.098
 Look-0.096
 woke-0.096
ifying-0.096
 roy-0.092
Tokenyear
Token ID1941
Feature activation-8.48e-02

Pos logit contributions
bench+0.143
 homework+0.134
iger+0.125
ritz+0.122
 site+0.121
Neg logit contributions
 Look-0.147
ifying-0.141
 ah-0.137
 woke-0.134
 ate-0.131
Token-
Token ID12
Feature activation-1.01e-01

Pos logit contributions
bench+0.146
 homework+0.138
Years+0.132
 student+0.130
James+0.129
Neg logit contributions
 woke-0.096
 peanut-0.095
 roy-0.090
 c-0.088
ifying-0.088
Tokenold
Token ID727
Feature activation-1.51e-01

Pos logit contributions
bench+0.098
 homework+0.084
case+0.076
 tutor+0.076
 rider+0.073
Neg logit contributions
 Look-0.197
 woke-0.191
 ah-0.189
 ate-0.186
ifying-0.184
Token boy
Token ID2933
Feature activation-5.82e-02

Pos logit contributions
bench+0.245
old+0.233
Years+0.225
 homework+0.211
ritz+0.210
Neg logit contributions
 Look-0.223
ifying-0.220
 woke-0.215
AAAA-0.214
're-0.212
Token who
Token ID508
Feature activation-2.25e-02

Pos logit contributions
old+0.106
bench+0.103
Years+0.094
 student+0.093
 homework+0.090
Neg logit contributions
ifying-0.077
AAAA-0.075
 Look-0.074
're-0.070
 roy-0.070
Token had
Token ID550
Feature activation+2.92e-02

Pos logit contributions
old+0.045
bench+0.044
Years+0.041
 student+0.040
 homework+0.039
Neg logit contributions
ifying-0.025
AAAA-0.025
 Look-0.024
're-0.023
 woke-0.023
Token a
Token ID257
Feature activation-5.77e-03

Pos logit contributions
ifying+0.041
AAAA+0.040
 Look+0.039
're+0.037
 roy+0.037
Neg logit contributions
old-0.051
bench-0.050
Years-0.045
 student-0.044
 homework-0.043
Token new
Token ID649
Feature activation-9.56e-02

Pos logit contributions
old+0.010
bench+0.009
Years+0.008
ritz+0.008
 student+0.008
Neg logit contributions
ifying-0.008
AAAA-0.008
 Look-0.008
 woke-0.008
're-0.008
Token clown
Token ID25573
Feature activation+6.30e-02

Pos logit contributions
ifying+0.003
AAAA+0.003
 Look+0.003
 woke+0.003
 roy+0.003
Neg logit contributions
old-0.003
bench-0.003
Years-0.002
ritz-0.002
bars-0.002
Tokens
Token ID82
Feature activation-8.52e-03

Pos logit contributions
ifying+0.085
 Look+0.080
 woke+0.080
're+0.079
AAAA+0.078
Neg logit contributions
bench-0.114
old-0.113
Years-0.105
 student-0.100
ritz-0.100
Token and
Token ID290
Feature activation+1.64e-02

Pos logit contributions
old+0.015
bench+0.014
Years+0.013
 student+0.013
ritz+0.013
Neg logit contributions
ifying-0.012
AAAA-0.012
 Look-0.011
 woke-0.011
're-0.011
Token the
Token ID262
Feature activation+2.49e-03

Pos logit contributions
ifying+0.027
AAAA+0.026
 Look+0.025
're+0.024
 roy+0.024
Neg logit contributions
old-0.026
bench-0.024
 student-0.023
 worker-0.022
 tutor-0.022
Token ac
Token ID936
Feature activation+5.24e-02

Pos logit contributions
ifying+0.004
AAAA+0.004
 Look+0.004
're+0.004
 roy+0.004
Neg logit contributions
old-0.004
bench-0.004
Years-0.003
 student-0.003
 homework-0.003
Tokenrob
Token ID22609
Feature activation-1.47e-01

Pos logit contributions
 woke+0.078
ifying+0.076
AAAA+0.075
cel+0.073
 c+0.071
Neg logit contributions
 student-0.077
Years-0.077
old-0.077
 tutor-0.076
 research-0.072
Tokenats
Token ID1381
Feature activation-1.42e-02

Pos logit contributions
bench+0.218
Years+0.204
old+0.202
 attendant+0.199
 tutor+0.198
Neg logit contributions
're-0.228
ifying-0.221
 roy-0.217
AAAA-0.215
 woke-0.212
Token and
Token ID290
Feature activation+9.80e-03

Pos logit contributions
old+0.021
bench+0.021
Years+0.018
 student+0.018
 worker+0.018
Neg logit contributions
ifying-0.023
AAAA-0.023
 Look-0.023
're-0.022
 roy-0.022
Token the
Token ID262
Feature activation-7.96e-05

Pos logit contributions
ifying+0.017
AAAA+0.016
 Look+0.016
're+0.015
 roy+0.015
Neg logit contributions
old-0.014
bench-0.013
 student-0.013
 worker-0.012
 tutor-0.012
Token lions
Token ID31420
Feature activation-3.02e-03

Pos logit contributions
old+0.000
bench+0.000
 student+0.000
Years+0.000
 worker+0.000
Neg logit contributions
ifying-0.000
AAAA-0.000
 Look-0.000
're-0.000
 roy-0.000
Token and
Token ID290
Feature activation+1.57e-02

Pos logit contributions
old+0.005
bench+0.005
Years+0.005
 student+0.004
ritz+0.004
Neg logit contributions
ifying-0.005
 woke-0.004
're-0.004
AAAA-0.004
 Look-0.004
Token.
Token ID13
Feature activation-5.78e-03

Pos logit contributions
bench+0.070
old+0.070
Years+0.066
ritz+0.062
 student+0.062
Neg logit contributions
 Look-0.062
ifying-0.061
 woke-0.061
're-0.061
 D-0.060
Token The
Token ID383
Feature activation+8.81e-03

Pos logit contributions
old+0.011
bench+0.011
 student+0.010
Years+0.010
ritz+0.010
Neg logit contributions
 Look-0.007
AAAA-0.007
ifying-0.007
 woke-0.006
 D-0.006
Token vet
Token ID20202
Feature activation-4.78e-02

Pos logit contributions
AAAA+0.018
ifying+0.017
 woke+0.017
 Look+0.017
're+0.017
Neg logit contributions
old-0.010
bench-0.009
ritz-0.008
ae-0.008
 student-0.008
Token took
Token ID1718
Feature activation+2.30e-02

Pos logit contributions
old+0.095
bench+0.091
 worker+0.083
 student+0.083
Years+0.082
Neg logit contributions
ifying-0.060
AAAA-0.057
 Look-0.054
 roy-0.051
're-0.051
Token an
Token ID281
Feature activation+2.32e-03

Pos logit contributions
ifying+0.034
AAAA+0.034
 Look+0.033
're+0.032
 roy+0.031
Neg logit contributions
old-0.038
bench-0.037
Years-0.033
 student-0.033
 homework-0.032
Token x
Token ID2124
Feature activation-1.47e-01

Pos logit contributions
AAAA+0.005
 Look+0.004
 woke+0.004
 roy+0.004
ifying+0.004
Neg logit contributions
bench-0.002
old-0.002
ae-0.002
 student-0.002
Years-0.002
Token-
Token ID12
Feature activation-4.83e-02

Pos logit contributions
old+0.199
bench+0.191
Years+0.171
 student+0.165
bars+0.161
Neg logit contributions
AAAA-0.260
ifying-0.254
 woke-0.250
 Look-0.249
 roy-0.242
Tokenray
Token ID2433
Feature activation-6.84e-02

Pos logit contributions
bench+0.082
old+0.077
 student+0.073
 homework+0.072
 research+0.071
Neg logit contributions
ifying-0.070
AAAA-0.067
 woke-0.066
 Look-0.065
 roy-0.061
Token of
Token ID286
Feature activation+3.14e-03

Pos logit contributions
old+0.113
bench+0.107
Years+0.103
ggie+0.100
ritz+0.098
Neg logit contributions
ifying-0.102
 woke-0.098
 Look-0.094
AAAA-0.093
're-0.090
Token Wh
Token ID854
Feature activation-3.08e-02

Pos logit contributions
ifying+0.005
AAAA+0.004
 Look+0.004
're+0.004
 roy+0.004
Neg logit contributions
old-0.005
bench-0.005
Years-0.005
 student-0.005
 homework-0.005
Tokenisk
Token ID1984
Feature activation-4.26e-02

Pos logit contributions
bench+0.061
old+0.060
 student+0.055
Years+0.055
guard+0.055
Neg logit contributions
ifying-0.033
're-0.033
AAAA-0.033
 Look-0.032
 woke-0.032
Token walked
Token ID6807
Feature activation+6.85e-03

Pos logit contributions
AAAA+0.016
ifying+0.016
 Look+0.016
 woke+0.016
're+0.015
Neg logit contributions
old-0.043
bench-0.043
Years-0.039
 student-0.039
ritz-0.038
Token over
Token ID625
Feature activation-1.04e-02

Pos logit contributions
 woke+0.008
 Look+0.008
ifying+0.007
 D+0.007
 c+0.007
Neg logit contributions
old-0.014
bench-0.013
bars-0.012
 homework-0.012
ggie-0.012
Token to
Token ID284
Feature activation+1.01e-02

Pos logit contributions
bench+0.022
ggie+0.020
old+0.020
ritz+0.020
iger+0.019
Neg logit contributions
 woke-0.011
 Look-0.011
 c-0.009
als-0.009
 D-0.009
Token the
Token ID262
Feature activation-2.76e-03

Pos logit contributions
 Look+0.011
 woke+0.011
're+0.010
 D+0.010
ifying+0.010
Neg logit contributions
bench-0.021
old-0.020
Years-0.019
ggie-0.019
ritz-0.018
Token books
Token ID3835
Feature activation-2.67e-02

Pos logit contributions
old+0.004
bench+0.004
Years+0.003
 student+0.003
ritz+0.003
Neg logit contributions
ifying-0.005
AAAA-0.005
 Look-0.005
're-0.005
 woke-0.005
Tokenhel
Token ID2978
Feature activation-1.46e-01

Pos logit contributions
old+0.056
Years+0.053
iger+0.050
iddles+0.050
ggie+0.049
Neg logit contributions
 woke-0.032
're-0.028
ifying-0.027
als-0.025
 smelled-0.025
Tokenves
Token ID1158
Feature activation-6.92e-02

Pos logit contributions
old+0.213
bench+0.204
 student+0.185
 professor+0.185
 tutor+0.182
Neg logit contributions
 woke-0.251
 landed-0.216
AAAA-0.215
ifying-0.214
 reapp-0.211
Token and
Token ID290
Feature activation+9.59e-03

Pos logit contributions
old+0.140
bench+0.135
Years+0.130
 student+0.124
ae+0.123
Neg logit contributions
're-0.079
ifying-0.077
 woke-0.075
 Look-0.071
 reapp-0.067
Token began
Token ID2540
Feature activation-3.62e-02

Pos logit contributions
ifying+0.012
AAAA+0.012
 Look+0.012
 woke+0.011
're+0.011
Neg logit contributions
old-0.018
bench-0.018
Years-0.016
 student-0.016
ritz-0.016
Token to
Token ID284
Feature activation+6.07e-03

Pos logit contributions
old+0.062
bench+0.061
ritz+0.057
Years+0.057
iger+0.055
Neg logit contributions
 Look-0.049
ifying-0.048
 woke-0.048
 D-0.047
 roy-0.046
Token read
Token ID1100
Feature activation-2.47e-02

Pos logit contributions
ifying+0.009
AAAA+0.008
 Look+0.008
 roy+0.008
're+0.008
Neg logit contributions
old-0.011
bench-0.010
Years-0.009
 student-0.009
 homework-0.009
Token of
Token ID286
Feature activation-3.14e-02

Pos logit contributions
ifying+0.025
're+0.023
 D+0.022
 peanut+0.022
 woke+0.021
Neg logit contributions
bench-0.040
wen-0.039
xy-0.039
ritz-0.038
Years-0.038
Token fun
Token ID1257
Feature activation-7.82e-02

Pos logit contributions
bench+0.046
old+0.045
ritz+0.044
bars+0.044
 student+0.043
Neg logit contributions
 Look-0.048
AAAA-0.047
're-0.046
 woke-0.044
 roy-0.044
Token watching
Token ID4964
Feature activation+9.18e-03

Pos logit contributions
old+0.130
bench+0.127
Years+0.117
 student+0.113
ritz+0.112
Neg logit contributions
ifying-0.113
 Look-0.110
AAAA-0.110
 roy-0.107
 woke-0.106
Token the
Token ID262
Feature activation+1.47e-02

Pos logit contributions
ifying+0.014
 Look+0.014
 D+0.013
AAAA+0.013
 woke+0.013
Neg logit contributions
bench-0.015
old-0.015
Years-0.013
ritz-0.012
 student-0.012
Token ac
Token ID936
Feature activation+4.59e-02

Pos logit contributions
ifying+0.025
 Look+0.025
AAAA+0.025
 roy+0.024
're+0.024
Neg logit contributions
old-0.021
bench-0.021
Years-0.018
 student-0.018
ritz-0.018
Tokenrob
Token ID22609
Feature activation-1.45e-01

Pos logit contributions
 woke+0.058
cel+0.057
AAAA+0.056
 ate+0.056
als+0.054
Neg logit contributions
 student-0.076
old-0.071
bench-0.071
 tutor-0.070
 aloud-0.070
Tokenats
Token ID1381
Feature activation-1.24e-02

Pos logit contributions
bench+0.247
 attendant+0.227
Years+0.221
 professor+0.217
iger+0.216
Neg logit contributions
're-0.195
 ate-0.185
 roy-0.182
 woke-0.172
 peanut-0.171
Token and
Token ID290
Feature activation+2.01e-02

Pos logit contributions
old+0.021
bench+0.020
Years+0.018
 student+0.018
ritz+0.017
Neg logit contributions
ifying-0.019
 Look-0.018
're-0.017
AAAA-0.017
 woke-0.017
Token clown
Token ID25573
Feature activation+7.38e-02

Pos logit contributions
ifying+0.032
AAAA+0.032
 Look+0.031
're+0.030
 roy+0.029
Neg logit contributions
old-0.031
bench-0.030
 student-0.027
Years-0.027
 homework-0.026
Tokens
Token ID82
Feature activation+4.55e-03

Pos logit contributions
ifying+0.103
're+0.097
 woke+0.095
 Look+0.094
 roy+0.090
Neg logit contributions
bench-0.133
old-0.128
Years-0.122
ritz-0.115
 student-0.112
Token.
Token ID13
Feature activation+4.30e-03

Pos logit contributions
ifying+0.008
AAAA+0.007
 Look+0.007
're+0.007
 woke+0.007
Neg logit contributions
old-0.007
bench-0.007
Years-0.006
 student-0.006
ritz-0.006
Token on
Token ID319
Feature activation-1.43e-02

Pos logit contributions
old+0.061
bench+0.059
Years+0.053
ritz+0.051
 student+0.051
Neg logit contributions
ifying-0.069
 Look-0.066
're-0.064
AAAA-0.064
 woke-0.064
Token the
Token ID262
Feature activation-1.37e-03

Pos logit contributions
old+0.027
bench+0.025
Years+0.023
ritz+0.022
iger+0.022
Neg logit contributions
ifying-0.020
 Look-0.018
 woke-0.018
 D-0.018
AAAA-0.018
Token row
Token ID5752
Feature activation-3.75e-02

Pos logit contributions
old+0.002
bench+0.002
ritz+0.002
Years+0.002
iddles+0.002
Neg logit contributions
ifying-0.002
 Look-0.002
 woke-0.002
 roy-0.002
're-0.002
Token of
Token ID286
Feature activation-8.35e-03

Pos logit contributions
old+0.056
bench+0.054
Years+0.051
 homework+0.048
ritz+0.048
Neg logit contributions
ifying-0.061
're-0.058
 woke-0.057
 Look-0.056
AAAA-0.054
Token books
Token ID3835
Feature activation-2.98e-02

Pos logit contributions
old+0.014
bench+0.013
Years+0.012
ritz+0.012
 attendant+0.012
Neg logit contributions
 Look-0.012
 woke-0.012
 c-0.012
ifying-0.012
AAAA-0.012
Tokenhel
Token ID2978
Feature activation-1.43e-01

Pos logit contributions
old+0.035
bench+0.031
Years+0.030
iger+0.028
ggie+0.028
Neg logit contributions
ifying-0.061
're-0.058
 woke-0.058
 Look-0.055
 reapp-0.054
Tokenves
Token ID1158
Feature activation-7.01e-02

Pos logit contributions
old+0.206
bench+0.185
 technician+0.175
 professor+0.170
 tutor+0.165
Neg logit contributions
 woke-0.240
ifying-0.228
 reapp-0.220
 burst-0.210
 roy-0.209
Token.
Token ID13
Feature activation+3.22e-03

Pos logit contributions
old+0.115
bench+0.111
Years+0.102
 student+0.099
ritz+0.097
Neg logit contributions
ifying-0.107
're-0.101
 Look-0.099
AAAA-0.099
 woke-0.098
Token<|endoftext|>
Token ID50256
Feature activation+2.82e-02

Pos logit contributions
 Look+0.004
ifying+0.004
AAAA+0.004
're+0.004
 D+0.004
Neg logit contributions
bench-0.006
old-0.006
Years-0.006
 homework-0.006
 worker-0.006
TokenTom
Token ID13787
Feature activation-3.68e-02

Pos logit contributions
ifying+0.032
 Look+0.032
AAAA+0.032
 woke+0.031
 D+0.029
Neg logit contributions
old-0.058
bench-0.055
Years-0.054
 homework-0.051
 student-0.051
Token was
Token ID373
Feature activation-1.54e-02

Pos logit contributions
old+0.049
bench+0.042
 student+0.037
ritz+0.034
 homework+0.034
Neg logit contributions
ifying-0.073
AAAA-0.073
 Sorry-0.067
 roy-0.067
 Look-0.066
Token idea
Token ID2126
Feature activation-9.11e-02

Pos logit contributions
AAAA+0.010
 Look+0.010
're+0.010
ifying+0.010
 woke+0.010
Neg logit contributions
bench-0.007
old-0.007
 student-0.006
bars-0.006
 worker-0.006
Tokenâ
Token ID22940
Feature activation-2.17e-02

Pos logit contributions
bench+0.163
old+0.158
Years+0.155
 salon+0.149
bars+0.149
Neg logit contributions
 woke-0.115
 Look-0.109
're-0.108
AAAA-0.107
ifying-0.104
Token€
Token ID26391
Feature activation-4.57e-02

Pos logit contributions
old+0.035
bench+0.033
 tutor+0.032
 student+0.031
 homework+0.031
Neg logit contributions
AAAA-0.032
 Look-0.031
ifying-0.030
're-0.030
 woke-0.030
Token�
Token ID447
Feature activation+1.91e-02

Pos logit contributions
old+0.079
bench+0.075
Years+0.067
 student+0.067
ggie+0.066
Neg logit contributions
ifying-0.067
AAAA-0.065
 Look-0.063
 roy-0.061
're-0.059
Token�
Token ID251
Feature activation-2.53e-02

Pos logit contributions
ifying+0.028
AAAA+0.027
 Look+0.027
 roy+0.026
're+0.025
Neg logit contributions
bench-0.031
old-0.031
Years-0.028
 worker-0.027
 student-0.027
Tokenhe
Token ID258
Feature activation-1.43e-01

Pos logit contributions
old+0.040
bench+0.039
Years+0.035
 student+0.034
 worker+0.034
Neg logit contributions
ifying-0.040
AAAA-0.040
 Look-0.038
 roy-0.037
're-0.037
Token wanted
Token ID2227
Feature activation+4.56e-03

Pos logit contributions
old+0.257
bench+0.250
Years+0.226
 student+0.220
 homework+0.218
Neg logit contributions
AAAA-0.193
ifying-0.192
 Look-0.187
 woke-0.182
 roy-0.175
Token to
Token ID284
Feature activation-9.81e-03

Pos logit contributions
 woke+0.006
 D+0.006
 Look+0.005
 P+0.005
 c+0.005
Neg logit contributions
bench-0.009
old-0.008
ritz-0.008
 student-0.008
 rider-0.008
Token touch
Token ID3638
Feature activation+9.61e-03

Pos logit contributions
old+0.016
bench+0.015
 student+0.014
 research+0.014
ggie+0.014
Neg logit contributions
ifying-0.016
AAAA-0.015
 Look-0.014
 roy-0.014
're-0.013
Token the
Token ID262
Feature activation-7.70e-03

Pos logit contributions
ifying+0.014
AAAA+0.014
 Look+0.014
're+0.013
 roy+0.013
Neg logit contributions
old-0.016
bench-0.015
Years-0.014
 student-0.014
 homework-0.013
Token glowing
Token ID21377
Feature activation+2.35e-02

Pos logit contributions
old+0.012
bench+0.012
 worker+0.011
iger+0.011
 attendant+0.011
Neg logit contributions
ifying-0.012
AAAA-0.011
 Look-0.011
 roy-0.011
 D-0.011
Token.
Token ID13
Feature activation-9.49e-03

Pos logit contributions
old+0.044
bench+0.040
Years+0.039
 student+0.038
ae+0.037
Neg logit contributions
ifying-0.035
're-0.033
 Look-0.032
 woke-0.031
 D-0.030
Token He
Token ID679
Feature activation+1.30e-04

Pos logit contributions
old+0.020
bench+0.019
ae+0.019
 homework+0.019
ggie+0.019
Neg logit contributions
 Look-0.011
 Are-0.010
AAAA-0.009
ifying-0.009
're-0.009
Token found
Token ID1043
Feature activation+1.77e-02

Pos logit contributions
 woke+0.000
 Look+0.000
AAAA+0.000
ifying+0.000
're+0.000
Neg logit contributions
old-0.000
bench-0.000
iger-0.000
 tutor-0.000
 professor-0.000
Token a
Token ID257
Feature activation+4.42e-03

Pos logit contributions
ifying+0.018
 Look+0.018
 D+0.017
're+0.017
 peanut+0.016
Neg logit contributions
bench-0.036
old-0.036
 student-0.034
Years-0.033
 tutor-0.033
Token squee
Token ID13170
Feature activation-6.31e-02

Pos logit contributions
ifying+0.007
AAAA+0.006
 Look+0.006
're+0.006
 roy+0.006
Neg logit contributions
old-0.007
bench-0.007
Years-0.006
 student-0.006
 homework-0.006
Tokengee
Token ID29622
Feature activation-1.43e-01

Pos logit contributions
old+0.102
bench+0.097
Years+0.087
 student+0.087
 tutor+0.083
Neg logit contributions
AAAA-0.096
ifying-0.096
 Look-0.094
 woke-0.090
 roy-0.089
Token and
Token ID290
Feature activation-9.77e-03

Pos logit contributions
old+0.242
bench+0.232
Years+0.213
 student+0.206
 homework+0.203
Neg logit contributions
ifying-0.207
AAAA-0.203
 Look-0.201
 woke-0.194
're-0.193
Token some
Token ID617
Feature activation-1.19e-02

Pos logit contributions
old+0.018
bench+0.017
Years+0.015
 student+0.015
 homework+0.015
Neg logit contributions
ifying-0.013
AAAA-0.013
 Look-0.013
're-0.012
 roy-0.012
Token wax
Token ID20302
Feature activation-8.91e-03

Pos logit contributions
old+0.020
bench+0.019
Years+0.018
 student+0.017
ae+0.017
Neg logit contributions
ifying-0.018
AAAA-0.017
 Look-0.017
're-0.016
 woke-0.016
Token.
Token ID13
Feature activation-1.17e-02

Pos logit contributions
old+0.014
bench+0.013
Years+0.013
 student+0.012
 tutor+0.012
Neg logit contributions
ifying-0.014
're-0.013
AAAA-0.013
 Look-0.013
 woke-0.013
Token\n
Token ID198
Feature activation-9.37e-03

Pos logit contributions
old+0.024
bench+0.023
ae+0.022
 homework+0.022
 student+0.022
Neg logit contributions
 Look-0.014
 Are-0.013
AAAA-0.013
ifying-0.012
're-0.012
Token\n
Token ID198
Feature activation-1.04e-02

Pos logit contributions
bench+0.026
old+0.025
 rider+0.025
bars+0.024
 student+0.024
Neg logit contributions
AAAA-0.014
 D-0.014
 Look-0.013
 woke-0.013
ifying-0.013
TokenThe
Token ID464
Feature activation+8.25e-03

Pos logit contributions
old+0.019
bench+0.018
Years+0.017
 student+0.017
ritz+0.016
Neg logit contributions
ifying-0.014
AAAA-0.014
 Look-0.013
're-0.013
 roy-0.013
Token doctor
Token ID6253
Feature activation-5.00e-02

Pos logit contributions
 woke+0.014
 Are+0.014
AAAA+0.014
 Look+0.014
 c+0.014
Neg logit contributions
old-0.011
bench-0.010
xy-0.010
ritz-0.010
ide-0.009
Token took
Token ID1718
Feature activation+2.25e-02

Pos logit contributions
old+0.097
bench+0.097
Years+0.089
 student+0.086
ritz+0.086
Neg logit contributions
ifying-0.056
 Look-0.056
AAAA-0.056
 woke-0.055
're-0.053
Token the
Token ID262
Feature activation-2.46e-02

Pos logit contributions
ifying+0.030
 Look+0.029
AAAA+0.028
 D+0.028
 woke+0.028
Neg logit contributions
old-0.040
bench-0.040
Years-0.037
 student-0.035
 rider-0.035
Token x
Token ID2124
Feature activation-1.43e-01

Pos logit contributions
old+0.025
bench+0.025
Years+0.022
ritz+0.020
ide+0.020
Neg logit contributions
 woke-0.052
 Look-0.051
 Are-0.050
AAAA-0.049
ifying-0.049
Token-
Token ID12
Feature activation-4.81e-02

Pos logit contributions
bench+0.225
old+0.223
Years+0.211
bars+0.200
ritz+0.197
Neg logit contributions
 woke-0.203
AAAA-0.189
ifying-0.181
 roy-0.179
 Look-0.178
Tokenray
Token ID2433
Feature activation-6.54e-02

Pos logit contributions
bench+0.090
old+0.080
 research+0.079
 homework+0.079
 tutor+0.078
Neg logit contributions
ifying-0.061
 woke-0.058
 Look-0.056
AAAA-0.054
als-0.052
Token and
Token ID290
Feature activation+1.57e-02

Pos logit contributions
old+0.105
bench+0.099
Years+0.096
ggie+0.095
iger+0.092
Neg logit contributions
ifying-0.097
 woke-0.095
 Look-0.091
AAAA-0.088
're-0.088
Token then
Token ID788
Feature activation-3.04e-02

Pos logit contributions
ifying+0.021
AAAA+0.020
 Look+0.019
're+0.019
 roy+0.018
Neg logit contributions
old-0.029
bench-0.028
Years-0.026
 student-0.026
 worker-0.025
Token Molly
Token ID30236
Feature activation-1.68e-03

Pos logit contributions
old+0.052
bench+0.051
 student+0.046
Years+0.045
 homework+0.044
Neg logit contributions
ifying-0.044
AAAA-0.044
 Look-0.042
're-0.040
 roy-0.040
Token a
Token ID257
Feature activation+1.25e-02

Pos logit contributions
ifying+0.024
AAAA+0.023
 roy+0.022
 Look+0.022
're+0.021
Neg logit contributions
old-0.021
bench-0.021
 student-0.019
 worker-0.018
 tutor-0.018
Token 3
Token ID513
Feature activation-4.85e-02

Pos logit contributions
ifying+0.017
AAAA+0.017
 Look+0.016
're+0.016
 roy+0.016
Neg logit contributions
old-0.022
bench-0.022
 student-0.020
Years-0.020
 worker-0.019
Token-
Token ID12
Feature activation-9.12e-02

Pos logit contributions
old+0.058
bench+0.057
Years+0.050
 student+0.048
 homework+0.047
Neg logit contributions
AAAA-0.092
ifying-0.092
 Look-0.090
 woke-0.088
're-0.087
Tokenyear
Token ID1941
Feature activation-7.69e-02

Pos logit contributions
bench+0.124
 homework+0.116
ritz+0.107
iger+0.106
 site+0.106
Neg logit contributions
 Look-0.149
AAAA-0.140
ifying-0.140
 ah-0.140
 woke-0.137
Token-
Token ID12
Feature activation-9.64e-02

Pos logit contributions
bench+0.125
 homework+0.117
Years+0.113
 student+0.112
James+0.110
Neg logit contributions
 woke-0.099
 peanut-0.098
 D-0.095
 roy-0.095
AAAA-0.095
Tokenold
Token ID727
Feature activation-1.42e-01

Pos logit contributions
bench+0.078
 homework+0.071
 tutor+0.063
iger+0.059
odge+0.058
Neg logit contributions
 Look-0.198
 ah-0.196
 ate-0.193
 woke-0.192
 peanut-0.190
Token little
Token ID1310
Feature activation+1.15e-02

Pos logit contributions
old+0.229
bench+0.229
Years+0.207
 student+0.199
 homework+0.197
Neg logit contributions
ifying-0.211
AAAA-0.208
 Look-0.207
 woke-0.198
're-0.198
Token boy
Token ID2933
Feature activation-4.90e-02

Pos logit contributions
ifying+0.016
AAAA+0.016
 Look+0.015
 roy+0.015
're+0.015
Neg logit contributions
old-0.020
bench-0.020
Years-0.018
 student-0.017
 homework-0.017
Token.
Token ID13
Feature activation-2.32e-02

Pos logit contributions
old+0.096
bench+0.088
 student+0.082
 worker+0.082
ggie+0.081
Neg logit contributions
ifying-0.063
AAAA-0.062
 Look-0.057
 roy-0.056
're-0.054
Token He
Token ID679
Feature activation-9.02e-03

Pos logit contributions
old+0.040
bench+0.039
 student+0.035
Years+0.035
 worker+0.034
Neg logit contributions
ifying-0.033
AAAA-0.033
 Look-0.031
 roy-0.030
're-0.030
Token had
Token ID550
Feature activation+3.26e-02

Pos logit contributions
old+0.017
bench+0.016
Years+0.015
 student+0.014
ggie+0.014
Neg logit contributions
ifying-0.012
AAAA-0.012
 Look-0.011
 roy-0.011
're-0.011
Token saw
Token ID2497
Feature activation+1.37e-02

Pos logit contributions
ifying+0.021
AAAA+0.020
 Look+0.019
're+0.018
 roy+0.018
Neg logit contributions
old-0.037
bench-0.035
 student-0.033
Years-0.032
 worker-0.032
Token clown
Token ID25573
Feature activation+7.66e-02

Pos logit contributions
ifying+0.019
AAAA+0.019
 Look+0.019
 woke+0.018
 D+0.018
Neg logit contributions
bench-0.023
old-0.023
Years-0.021
 student-0.020
ritz-0.020
Tokens
Token ID82
Feature activation-6.10e-03

Pos logit contributions
ifying+0.108
 Look+0.102
AAAA+0.100
're+0.100
 woke+0.099
Neg logit contributions
bench-0.137
old-0.134
Years-0.124
 student-0.118
ritz-0.116
Token and
Token ID290
Feature activation+1.66e-02

Pos logit contributions
old+0.011
bench+0.011
Years+0.010
 student+0.010
ritz+0.009
Neg logit contributions
ifying-0.008
AAAA-0.008
 Look-0.008
 woke-0.008
're-0.008
Token ac
Token ID936
Feature activation+5.51e-02

Pos logit contributions
ifying+0.027
AAAA+0.027
're+0.025
 Look+0.025
 roy+0.024
Neg logit contributions
old-0.025
bench-0.023
 student-0.023
 worker-0.022
 tutor-0.022
Tokenrob
Token ID22609
Feature activation-1.42e-01

Pos logit contributions
ifying+0.079
AAAA+0.079
 woke+0.077
cel+0.075
 escaped+0.074
Neg logit contributions
 student-0.088
old-0.087
 tutor-0.085
Years-0.085
ae-0.082
Tokenats
Token ID1381
Feature activation-1.49e-02

Pos logit contributions
bench+0.188
old+0.183
Years+0.167
 student+0.165
 attendant+0.165
Neg logit contributions
ifying-0.245
AAAA-0.243
're-0.239
 Look-0.238
 roy-0.235
Token and
Token ID290
Feature activation+4.99e-03

Pos logit contributions
old+0.025
bench+0.024
Years+0.023
 student+0.022
 homework+0.021
Neg logit contributions
ifying-0.022
 Look-0.021
AAAA-0.021
 woke-0.021
're-0.020
Token lions
Token ID31420
Feature activation-4.29e-04

Pos logit contributions
AAAA+0.008
ifying+0.008
 Look+0.008
're+0.007
 roy+0.007
Neg logit contributions
old-0.007
 student-0.007
bench-0.006
 tutor-0.006
 homework-0.006
Token and
Token ID290
Feature activation+9.20e-03

Pos logit contributions
old+0.001
bench+0.001
Years+0.001
 student+0.001
ritz+0.001
Neg logit contributions
ifying-0.001
AAAA-0.001
 Look-0.001
 woke-0.001
're-0.001
Token elephants
Token ID25368
Feature activation+4.38e-03

Pos logit contributions
AAAA+0.016
ifying+0.015
 Look+0.015
're+0.014
 roy+0.014
Neg logit contributions
old-0.013
bench-0.012
 student-0.011
 homework-0.011
 worker-0.011
Token lots
Token ID6041
Feature activation+2.24e-02

Pos logit contributions
ifying+0.074
AAAA+0.072
 roy+0.070
 Sorry+0.069
 Look+0.068
Neg logit contributions
old-0.062
bench-0.060
Years-0.057
ggie-0.054
 worker-0.053
Token of
Token ID286
Feature activation-4.19e-02

Pos logit contributions
ifying+0.046
AAAA+0.045
 Look+0.045
're+0.043
 roy+0.043
Neg logit contributions
old-0.026
bench-0.024
Years-0.020
 student-0.020
 worker-0.019
Token big
Token ID1263
Feature activation+4.02e-03

Pos logit contributions
old+0.076
bench+0.073
Years+0.067
 student+0.066
 homework+0.065
Neg logit contributions
ifying-0.057
AAAA-0.056
 Look-0.054
're-0.051
 roy-0.051
Token juicy
Token ID35075
Feature activation-4.34e-03

Pos logit contributions
ifying+0.007
 Sorry+0.007
AAAA+0.006
 Look+0.006
 roy+0.006
Neg logit contributions
old-0.006
bench-0.005
chairs-0.005
 math-0.005
 research-0.005
Token pine
Token ID20161
Feature activation-5.62e-02

Pos logit contributions
old+0.006
bench+0.006
 worker+0.005
 math+0.005
 research+0.005
Neg logit contributions
ifying-0.008
AAAA-0.007
 Sorry-0.007
 Look-0.007
're-0.006
Tokencon
Token ID1102
Feature activation-1.42e-01

Pos logit contributions
old+0.090
bench+0.088
Years+0.083
 student+0.082
 tutor+0.077
Neg logit contributions
ifying-0.086
 woke-0.082
 peanut-0.081
're-0.080
AAAA-0.080
Tokenes
Token ID274
Feature activation-6.18e-02

Pos logit contributions
old+0.288
Years+0.287
bench+0.284
 tutor+0.271
 student+0.270
Neg logit contributions
ifying-0.146
're-0.139
cel-0.132
 roy-0.131
 D-0.128
Token,
Token ID11
Feature activation-2.91e-03

Pos logit contributions
old+0.109
bench+0.106
Years+0.098
 student+0.095
 homework+0.093
Neg logit contributions
ifying-0.087
 Look-0.082
AAAA-0.081
're-0.081
 woke-0.080
Token so
Token ID523
Feature activation+3.28e-02

Pos logit contributions
bench+0.004
old+0.004
 worker+0.004
Years+0.004
 student+0.004
Neg logit contributions
ifying-0.005
AAAA-0.005
're-0.004
 Sorry-0.004
 Look-0.004
Token Penny
Token ID25965
Feature activation+3.00e-02

Pos logit contributions
ifying+0.065
AAAA+0.065
 Look+0.061
're+0.060
 roy+0.060
Neg logit contributions
old-0.038
bench-0.036
 student-0.033
Years-0.032
 tutor-0.032
Token started
Token ID2067
Feature activation-4.87e-02

Pos logit contributions
ifying+0.037
AAAA+0.034
 Look+0.034
 Sorry+0.031
're+0.031
Neg logit contributions
old-0.060
bench-0.057
Years-0.055
ggie-0.053
 student-0.053
Token,
Token ID11
Feature activation+1.41e-02

Pos logit contributions
old+0.056
bench+0.055
 math+0.054
 research+0.053
 student+0.053
Neg logit contributions
ifying-0.088
AAAA-0.082
 roy-0.077
 Look-0.077
 pulp-0.076
Token the
Token ID262
Feature activation+3.75e-03

Pos logit contributions
ifying+0.024
AAAA+0.024
 Look+0.023
 roy+0.023
're+0.022
Neg logit contributions
old-0.020
bench-0.019
Years-0.017
 student-0.017
 homework-0.017
Token doctor
Token ID6253
Feature activation-4.12e-02

Pos logit contributions
 woke+0.007
 Are+0.007
 Look+0.007
AAAA+0.007
ifying+0.006
Neg logit contributions
old-0.005
ritz-0.004
bench-0.004
xy-0.004
ide-0.004
Token took
Token ID1718
Feature activation+3.65e-02

Pos logit contributions
old+0.086
bench+0.085
Years+0.078
 student+0.077
ritz+0.076
Neg logit contributions
ifying-0.042
AAAA-0.042
 Look-0.041
 woke-0.039
're-0.038
Token another
Token ID1194
Feature activation-4.85e-02

Pos logit contributions
ifying+0.052
AAAA+0.051
 Look+0.050
're+0.048
 woke+0.048
Neg logit contributions
old-0.062
bench-0.061
Years-0.055
 student-0.054
 worker-0.053
Token x
Token ID2124
Feature activation-1.42e-01

Pos logit contributions
old+0.078
bench+0.078
xy+0.072
Years+0.072
iddles+0.071
Neg logit contributions
 woke-0.071
ifying-0.069
 Look-0.068
AAAA-0.067
 Are-0.067
Token-
Token ID12
Feature activation-5.06e-02

Pos logit contributions
old+0.206
bench+0.204
Years+0.186
bars+0.178
ritz+0.177
Neg logit contributions
 woke-0.222
AAAA-0.219
ifying-0.214
 Look-0.211
 roy-0.207
Tokenray
Token ID2433
Feature activation-6.35e-02

Pos logit contributions
bench+0.087
old+0.080
 homework+0.076
 student+0.075
 site+0.074
Neg logit contributions
ifying-0.073
 Look-0.069
 woke-0.068
AAAA-0.068
 roy-0.063
Token and
Token ID290
Feature activation+1.41e-02

Pos logit contributions
old+0.105
bench+0.100
Years+0.095
iger+0.092
ritz+0.091
Neg logit contributions
ifying-0.094
 woke-0.090
 Look-0.088
AAAA-0.086
're-0.085
Token found
Token ID1043
Feature activation+3.42e-02

Pos logit contributions
ifying+0.020
AAAA+0.020
 Look+0.018
're+0.018
 roy+0.018
Neg logit contributions
old-0.025
bench-0.024
Years-0.022
 student-0.022
 worker-0.022
Token out
Token ID503
Feature activation+1.33e-02

Pos logit contributions
ifying+0.049
AAAA+0.048
 Look+0.048
're+0.045
 roy+0.045
Neg logit contributions
old-0.058
bench-0.057
Years-0.051
 student-0.051
 homework-0.050
Token She
Token ID1375
Feature activation+1.15e-02

Pos logit contributions
bench+0.010
old+0.010
 student+0.009
Years+0.009
 tutor+0.009
Neg logit contributions
AAAA-0.012
ifying-0.012
 roy-0.011
're-0.011
 Look-0.011
Token had
Token ID550
Feature activation+3.04e-02

Pos logit contributions
ifying+0.021
AAAA+0.018
 Look+0.018
 roy+0.017
're+0.017
Neg logit contributions
old-0.018
bench-0.017
ggie-0.016
Years-0.016
 homework-0.015
Token a
Token ID257
Feature activation-1.25e-02

Pos logit contributions
ifying+0.045
AAAA+0.045
 Look+0.043
're+0.042
 roy+0.042
Neg logit contributions
old-0.050
bench-0.049
Years-0.044
 student-0.044
 homework-0.043
Token big
Token ID1263
Feature activation-5.66e-03

Pos logit contributions
old+0.023
bench+0.022
Years+0.020
 student+0.019
ritz+0.019
Neg logit contributions
ifying-0.017
AAAA-0.017
 Look-0.016
 roy-0.015
 woke-0.015
Token books
Token ID3835
Feature activation-5.09e-02

Pos logit contributions
old+0.009
bench+0.009
 student+0.008
 tutor+0.008
 homework+0.008
Neg logit contributions
ifying-0.009
AAAA-0.008
 Look-0.008
're-0.008
 roy-0.008
Tokenhe
Token ID258
Feature activation-1.42e-01

Pos logit contributions
old+0.056
bench+0.048
Years+0.045
iger+0.043
ritz+0.042
Neg logit contributions
ifying-0.105
 woke-0.101
 Look-0.099
're-0.097
AAAA-0.094
Tokenlf
Token ID1652
Feature activation-1.10e-01

Pos logit contributions
old+0.151
bench+0.128
Years+0.115
bars+0.107
iger+0.105
Neg logit contributions
 woke-0.304
ifying-0.292
AAAA-0.290
 Look-0.287
 roy-0.284
Token in
Token ID287
Feature activation-7.73e-03

Pos logit contributions
old+0.245
bench+0.234
iger+0.230
Years+0.230
wen+0.223
Neg logit contributions
ifying-0.093
 woke-0.081
 ate-0.079
 c-0.073
 drank-0.072
Token her
Token ID607
Feature activation+1.17e-02

Pos logit contributions
old+0.011
bench+0.011
Years+0.010
 student+0.010
 homework+0.009
Neg logit contributions
ifying-0.013
AAAA-0.013
 Look-0.013
're-0.012
 roy-0.012
Token room
Token ID2119
Feature activation-3.38e-02

Pos logit contributions
ifying+0.022
AAAA+0.021
 Look+0.021
 woke+0.020
're+0.020
Neg logit contributions
old-0.015
bench-0.015
Years-0.013
 student-0.012
ritz-0.012
Token with
Token ID351
Feature activation-4.05e-03

Pos logit contributions
old+0.053
bench+0.052
Years+0.046
 student+0.045
ritz+0.044
Neg logit contributions
ifying-0.053
AAAA-0.052
 Look-0.051
 woke-0.049
're-0.049
Token is
Token ID318
Feature activation-1.51e-02

Pos logit contributions
ifying+0.016
AAAA+0.016
 Look+0.015
're+0.014
 roy+0.014
Neg logit contributions
old-0.028
bench-0.028
Years-0.026
 student-0.025
 worker-0.025
Token it
Token ID340
Feature activation+1.39e-02

Pos logit contributions
old+0.025
bench+0.024
Years+0.022
 student+0.022
ritz+0.021
Neg logit contributions
ifying-0.023
AAAA-0.022
 Look-0.022
're-0.021
 roy-0.021
Token?"
Token ID1701
Feature activation-8.67e-03

Pos logit contributions
AAAA+0.021
ifying+0.021
 Look+0.019
 roy+0.019
 woke+0.018
Neg logit contributions
old-0.023
bench-0.022
 homework-0.021
 tutor-0.021
 math-0.020
Token
Token ID198
Feature activation-1.15e-02

Pos logit contributions
old+0.012
bench+0.012
 tutor+0.011
 student+0.010
Years+0.010
Neg logit contributions
ifying-0.016
AAAA-0.015
 roy-0.014
're-0.014
 Look-0.014
Token
Token ID198
Feature activation-8.47e-03

Pos logit contributions
old+0.017
bench+0.016
Years+0.015
 student+0.014
 tutor+0.014
Neg logit contributions
ifying-0.020
AAAA-0.019
 Look-0.019
 roy-0.018
're-0.018
TokenP
Token ID47
Feature activation+8.66e-02

Pos logit contributions
old+0.015
Years+0.014
bench+0.014
 tutor+0.013
 student+0.013
Neg logit contributions
ifying-0.012
 Look-0.012
're-0.011
 roy-0.011
AAAA-0.011
Tokenenny
Token ID11870
Feature activation-4.35e-02

Pos logit contributions
ifying+0.166
AAAA+0.163
 Look+0.161
're+0.156
 roy+0.155
Neg logit contributions
old-0.106
bench-0.103
Years-0.088
 student-0.087
 worker-0.085
Token smiled
Token ID13541
Feature activation+3.13e-02

Pos logit contributions
old+0.073
bench+0.068
Years+0.064
ggie+0.063
 student+0.062
Neg logit contributions
ifying-0.069
AAAA-0.064
 roy-0.063
 Look-0.061
're-0.059
Token and
Token ID290
Feature activation-1.25e-02

Pos logit contributions
AAAA+0.057
ifying+0.056
 Look+0.053
're+0.053
 roy+0.052
Neg logit contributions
old-0.041
bench-0.039
Years-0.036
 student-0.036
 tutor-0.036
Token said
Token ID531
Feature activation-2.98e-02

Pos logit contributions
old+0.018
bench+0.015
Years+0.015
 student+0.014
 aloud+0.014
Neg logit contributions
ifying-0.024
AAAA-0.022
 roy-0.022
 Look-0.021
're-0.021
Token,
Token ID11
Feature activation-8.73e-03

Pos logit contributions
old+0.050
bench+0.046
 aloud+0.043
 worker+0.043
 tutor+0.043
Neg logit contributions
ifying-0.048
AAAA-0.046
 roy-0.043
 Look-0.042
're-0.041
TokenThe
Token ID464
Feature activation+1.75e-02

Pos logit contributions
bench+0.008
old+0.008
 homework+0.007
bars+0.007
 manager+0.007
Neg logit contributions
AAAA-0.005
ifying-0.005
 Look-0.005
 Are-0.005
 woke-0.005
Token next
Token ID1306
Feature activation-9.08e-04

Pos logit contributions
ifying+0.029
AAAA+0.029
 Look+0.028
're+0.027
 roy+0.027
Neg logit contributions
old-0.026
bench-0.025
Years-0.022
 student-0.022
ritz-0.021
Token day
Token ID1110
Feature activation-4.25e-03

Pos logit contributions
old+0.001
bench+0.001
ritz+0.001
chairs+0.001
Years+0.001
Neg logit contributions
 Look-0.002
AAAA-0.002
ifying-0.002
 woke-0.002
 Sorry-0.002
Token,
Token ID11
Feature activation+9.11e-03

Pos logit contributions
old+0.007
bench+0.007
 student+0.007
 worker+0.006
ggie+0.006
Neg logit contributions
ifying-0.006
AAAA-0.006
're-0.006
 Look-0.006
 roy-0.005
Token Lily
Token ID20037
Feature activation-2.75e-02

Pos logit contributions
ifying+0.012
AAAA+0.011
 Look+0.011
're+0.010
 woke+0.010
Neg logit contributions
old-0.017
bench-0.017
 student-0.016
Years-0.016
 homework-0.015
Token woke
Token ID19092
Feature activation+1.12e-01

Pos logit contributions
old+0.053
bench+0.047
 student+0.045
 tutor+0.043
 worker+0.043
Neg logit contributions
ifying-0.040
AAAA-0.035
 Look-0.032
 Sorry-0.032
 tweet-0.030
Token up
Token ID510
Feature activation-1.43e-02

Pos logit contributions
 D+0.137
 Pink+0.127
 roy+0.123
 P+0.121
 Po+0.119
Neg logit contributions
bench-0.233
old-0.222
 student-0.202
Years-0.198
bars-0.193
Token early
Token ID1903
Feature activation+1.75e-02

Pos logit contributions
old+0.023
bench+0.022
Years+0.020
 student+0.020
ritz+0.019
Neg logit contributions
ifying-0.022
AAAA-0.021
 Look-0.021
 woke-0.020
're-0.020
Token and
Token ID290
Feature activation+1.27e-02

Pos logit contributions
 woke+0.022
ifying+0.022
 Look+0.021
AAAA+0.020
 D+0.020
Neg logit contributions
old-0.034
bench-0.030
Years-0.030
 student-0.028
ritz-0.028
Token put
Token ID1234
Feature activation-1.12e-03

Pos logit contributions
ifying+0.017
AAAA+0.017
 roy+0.015
 Look+0.014
're+0.014
Neg logit contributions
old-0.023
bench-0.022
 student-0.020
 homework-0.020
 salon-0.020
Token on
Token ID319
Feature activation-1.09e-05

Pos logit contributions
old+0.002
bench+0.002
Years+0.002
 student+0.002
ritz+0.002
Neg logit contributions
ifying-0.001
 Look-0.001
AAAA-0.001
 woke-0.001
 D-0.001
Token cry
Token ID3960
Feature activation+4.33e-02

Pos logit contributions
old+0.016
bench+0.015
 research+0.014
 homework+0.013
ggie+0.013
Neg logit contributions
ifying-0.012
AAAA-0.011
 Look-0.010
 roy-0.010
're-0.009
Token as
Token ID355
Feature activation-5.73e-02

Pos logit contributions
ifying+0.067
AAAA+0.067
 Look+0.064
're+0.062
 roy+0.061
Neg logit contributions
old-0.069
bench-0.067
Years-0.060
 student-0.060
 worker-0.058
Token he
Token ID339
Feature activation-1.50e-02

Pos logit contributions
old+0.090
bench+0.089
Years+0.077
 student+0.076
 appointment+0.076
Neg logit contributions
AAAA-0.092
 D-0.091
ifying-0.088
 woke-0.086
 Look-0.086
Token watched
Token ID7342
Feature activation+4.05e-03

Pos logit contributions
old+0.030
bench+0.028
Years+0.027
ggie+0.026
 student+0.026
Neg logit contributions
ifying-0.017
AAAA-0.017
 Look-0.016
 roy-0.015
're-0.015
Token the
Token ID262
Feature activation+9.90e-03

Pos logit contributions
ifying+0.006
AAAA+0.006
 Look+0.006
're+0.005
 D+0.005
Neg logit contributions
old-0.007
bench-0.007
Years-0.006
 student-0.006
 homework-0.006
Token crab
Token ID32202
Feature activation+8.73e-02

Pos logit contributions
ifying+0.011
AAAA+0.010
 Look+0.010
're+0.010
 roy+0.009
Neg logit contributions
old-0.020
bench-0.020
Years-0.018
 student-0.018
 worker-0.018
Token move
Token ID1445
Feature activation-2.52e-02

Pos logit contributions
ifying+0.114
AAAA+0.111
 Look+0.109
're+0.106
 woke+0.105
Neg logit contributions
old-0.160
bench-0.157
Years-0.142
 student-0.140
ritz-0.136
Token into
Token ID656
Feature activation+3.62e-02

Pos logit contributions
old+0.047
bench+0.046
Years+0.043
 student+0.042
 homework+0.042
Neg logit contributions
ifying-0.033
 Look-0.030
 woke-0.029
 D-0.029
AAAA-0.029
Token his
Token ID465
Feature activation+1.89e-02

Pos logit contributions
ifying+0.046
AAAA+0.045
 Look+0.044
 D+0.043
 woke+0.042
Neg logit contributions
old-0.069
bench-0.067
Years-0.062
bars-0.060
ritz-0.059
Token shell
Token ID7582
Feature activation+8.24e-02

Pos logit contributions
AAAA+0.024
 Look+0.022
ifying+0.022
're+0.022
 Are+0.021
Neg logit contributions
old-0.036
bench-0.035
 homework-0.032
 student-0.032
 tutor-0.032
Token.
Token ID13
Feature activation+4.01e-03

Pos logit contributions
ifying+0.125
 woke+0.114
're+0.113
 Look+0.112
 roy+0.110
Neg logit contributions
old-0.138
bench-0.129
Years-0.121
 student-0.117
ritz-0.114
Token The
Token ID383
Feature activation+7.41e-03

Pos logit contributions
old+0.019
bench+0.018
 tutor+0.016
 worker+0.016
 student+0.016
Neg logit contributions
ifying-0.016
AAAA-0.016
're-0.014
 roy-0.014
 Look-0.014
Token boy
Token ID2933
Feature activation-1.79e-02

Pos logit contributions
ifying+0.013
AAAA+0.013
 Look+0.012
're+0.012
 roy+0.012
Neg logit contributions
old-0.010
bench-0.010
 student-0.009
Years-0.009
 worker-0.009
Token looked
Token ID3114
Feature activation+2.31e-02

Pos logit contributions
old+0.035
bench+0.033
Years+0.030
 student+0.030
ggie+0.030
Neg logit contributions
ifying-0.023
AAAA-0.022
 Look-0.021
 roy-0.020
're-0.020
Token sad
Token ID6507
Feature activation+5.97e-02

Pos logit contributions
AAAA+0.041
ifying+0.038
 roy+0.036
 pulp+0.036
 Look+0.035
Neg logit contributions
old-0.034
 tutor-0.028
 math-0.027
 homework-0.027
 research-0.027
Token and
Token ID290
Feature activation-9.42e-03

Pos logit contributions
ifying+0.083
AAAA+0.082
 Look+0.081
're+0.078
 woke+0.077
Neg logit contributions
old-0.105
bench-0.102
Years-0.092
 student-0.091
ritz-0.088
Token lonely
Token ID21757
Feature activation+9.70e-02

Pos logit contributions
old+0.018
bench+0.018
 homework+0.016
Years+0.016
 student+0.016
Neg logit contributions
ifying-0.012
AAAA-0.012
 Look-0.011
're-0.011
 roy-0.011
Token.
Token ID13
Feature activation-1.13e-02

Pos logit contributions
ifying+0.152
 Look+0.147
AAAA+0.146
 woke+0.144
're+0.144
Neg logit contributions
old-0.155
bench-0.148
Years-0.136
ritz-0.130
 student-0.129
Token Lily
Token ID20037
Feature activation-3.34e-02

Pos logit contributions
old+0.021
bench+0.020
Years+0.018
 student+0.018
 worker+0.018
Neg logit contributions
ifying-0.015
AAAA-0.015
're-0.014
 roy-0.014
 Look-0.014
Token felt
Token ID2936
Feature activation+2.22e-02

Pos logit contributions
old+0.064
bench+0.057
 student+0.055
Years+0.054
ggie+0.054
Neg logit contributions
ifying-0.050
AAAA-0.043
 Look-0.040
 roy-0.039
 Sorry-0.038
Token sorry
Token ID7926
Feature activation+4.46e-02

Pos logit contributions
ifying+0.037
AAAA+0.035
 Look+0.034
 roy+0.034
 AW+0.031
Neg logit contributions
old-0.036
bench-0.031
 research-0.029
 tutor-0.029
 homework-0.029
Token for
Token ID329
Feature activation-7.75e-02

Pos logit contributions
ifying+0.076
AAAA+0.075
 Look+0.074
're+0.072
 woke+0.072
Neg logit contributions
old-0.065
bench-0.063
Years-0.055
 student-0.054
ggie-0.053
Token went
Token ID1816
Feature activation+3.47e-02

Pos logit contributions
old+0.046
bench+0.042
ggie+0.040
 student+0.040
 worker+0.040
Neg logit contributions
AAAA-0.034
ifying-0.034
 Look-0.032
 roy-0.030
 Are-0.029
Token outside
Token ID2354
Feature activation-6.03e-02

Pos logit contributions
ifying+0.041
 Look+0.040
AAAA+0.039
 woke+0.038
 roy+0.037
Neg logit contributions
old-0.068
bench-0.066
Years-0.060
 student-0.060
ritz-0.060
Token and
Token ID290
Feature activation+6.42e-03

Pos logit contributions
old+0.108
bench+0.108
Years+0.100
ritz+0.096
iger+0.095
Neg logit contributions
 woke-0.079
 Look-0.077
ifying-0.075
 D-0.074
're-0.073
Token when
Token ID618
Feature activation-2.67e-02

Pos logit contributions
ifying+0.009
AAAA+0.009
 Look+0.008
 roy+0.008
're+0.007
Neg logit contributions
old-0.012
bench-0.011
 worker-0.010
 homework-0.010
 rider-0.010
Token he
Token ID339
Feature activation+3.17e-02

Pos logit contributions
old+0.031
bench+0.030
Years+0.026
 student+0.025
ritz+0.024
Neg logit contributions
ifying-0.053
AAAA-0.053
 Look-0.052
're-0.050
 woke-0.050
Token smelled
Token ID37609
Feature activation+9.29e-02

Pos logit contributions
ifying+0.042
AAAA+0.038
 Look+0.037
 roy+0.036
 Sorry+0.036
Neg logit contributions
old-0.060
bench-0.053
ggie-0.053
 student-0.052
ae-0.052
Token the
Token ID262
Feature activation+1.25e-03

Pos logit contributions
AAAA+0.183
ifying+0.182
 roy+0.176
 Look+0.170
cel+0.169
Neg logit contributions
old-0.106
 math-0.095
 research-0.092
bench-0.092
ggie-0.091
Token wonderful
Token ID7932
Feature activation+1.12e-02

Pos logit contributions
AAAA+0.002
ifying+0.002
 Look+0.002
 roy+0.002
're+0.002
Neg logit contributions
old-0.002
bench-0.002
Years-0.002
 student-0.002
 homework-0.002
Token wet
Token ID9583
Feature activation+5.33e-02

Pos logit contributions
ifying+0.018
AAAA+0.018
 Look+0.017
're+0.016
 roy+0.016
Neg logit contributions
old-0.017
bench-0.017
 homework-0.015
 research-0.015
 worker-0.015
Token grass
Token ID8701
Feature activation-4.07e-02

Pos logit contributions
AAAA+0.085
ifying+0.081
 Look+0.075
 roy+0.072
cel+0.072
Neg logit contributions
bench-0.079
 worker-0.075
old-0.072
guard-0.072
smanship-0.071
Token,
Token ID11
Feature activation+1.08e-02

Pos logit contributions
old+0.069
bench+0.067
Years+0.060
 student+0.060
 homework+0.058
Neg logit contributions
ifying-0.059
AAAA-0.058
 Look-0.057
're-0.055
 roy-0.054
Token eat
Token ID4483
Feature activation+1.93e-02

Pos logit contributions
ifying+0.021
AAAA+0.021
 Look+0.020
 roy+0.019
're+0.018
Neg logit contributions
old-0.040
bench-0.039
Years-0.036
 student-0.036
 homework-0.035
Token the
Token ID262
Feature activation-1.08e-02

Pos logit contributions
ifying+0.028
AAAA+0.027
 Look+0.027
're+0.026
 roy+0.025
Neg logit contributions
old-0.033
bench-0.032
Years-0.029
 student-0.029
 homework-0.028
Token y
Token ID331
Feature activation-1.53e-03

Pos logit contributions
old+0.020
bench+0.019
Years+0.017
 student+0.017
 worker+0.017
Neg logit contributions
ifying-0.015
AAAA-0.014
 Look-0.014
 roy-0.013
're-0.013
Tokenummy
Token ID13513
Feature activation+2.18e-02

Pos logit contributions
old+0.002
bench+0.002
Years+0.001
 student+0.001
ritz+0.001
Neg logit contributions
AAAA-0.003
ifying-0.003
 Look-0.003
 woke-0.003
're-0.003
Token fruit
Token ID8234
Feature activation+7.43e-03

Pos logit contributions
ifying+0.035
AAAA+0.034
 Look+0.031
 roy+0.029
 Sorry+0.029
Neg logit contributions
old-0.035
bench-0.034
 worker-0.031
 homework-0.031
 research-0.031
Token.
Token ID13
Feature activation-1.06e-03

Pos logit contributions
ifying+0.010
 D+0.010
're+0.010
 woke+0.010
als+0.010
Neg logit contributions
old-0.013
bench-0.012
 student-0.011
Years-0.011
iger-0.011
Token<|endoftext|>
Token ID50256
Feature activation+2.90e-02

Pos logit contributions
old+0.002
bench+0.002
Years+0.001
 student+0.001
 tutor+0.001
Neg logit contributions
ifying-0.002
AAAA-0.002
 Look-0.002
 roy-0.002
're-0.002
TokenOnce
Token ID7454
Feature activation+7.11e-02

Pos logit contributions
 woke+0.029
AAAA+0.027
 Look+0.027
 D+0.026
ifying+0.025
Neg logit contributions
old-0.063
Years-0.060
bench-0.060
bars-0.060
 student-0.059
Token upon
Token ID2402
Feature activation+3.21e-02

Pos logit contributions
 woke+0.057
 D+0.057
're+0.056
ifying+0.051
 Look+0.050
Neg logit contributions
bench-0.163
old-0.156
Years-0.152
ggie-0.149
 employee-0.148
Token a
Token ID257
Feature activation+2.41e-02

Pos logit contributions
AAAA+0.038
ifying+0.037
 Look+0.037
 woke+0.037
're+0.036
Neg logit contributions
old-0.063
bench-0.062
Years-0.057
 worker-0.056
 attendant-0.056
Token time
Token ID640
Feature activation-1.34e-03

Pos logit contributions
AAAA+0.036
 woke+0.035
 Look+0.034
ifying+0.034
're+0.034
Neg logit contributions
old-0.039
bench-0.039
bars-0.036
 attendant-0.036
 professor-0.035
Token,
Token ID11
Feature activation+1.72e-02

Pos logit contributions
ifying+0.018
AAAA+0.016
 Look+0.015
're+0.015
 roy+0.015
Neg logit contributions
old-0.013
 math-0.012
bench-0.011
 worker-0.011
 tutor-0.011
Token their
Token ID511
Feature activation+4.29e-03

Pos logit contributions
AAAA+0.031
ifying+0.031
 Look+0.030
're+0.029
 roy+0.029
Neg logit contributions
old-0.021
bench-0.021
 student-0.019
 homework-0.019
 tutor-0.019
Token nos
Token ID43630
Feature activation+6.84e-02

Pos logit contributions
ifying+0.006
 Look+0.006
 woke+0.005
 roy+0.005
AAAA+0.005
Neg logit contributions
old-0.008
bench-0.008
Years-0.007
ritz-0.007
iger-0.007
Tokeny
Token ID88
Feature activation+5.83e-02

Pos logit contributions
ifying+0.087
 Look+0.082
AAAA+0.081
're+0.079
 roy+0.078
Neg logit contributions
bench-0.128
old-0.127
Years-0.120
 homework-0.115
 research-0.114
Token neighbor
Token ID4780
Feature activation+4.67e-02

Pos logit contributions
AAAA+0.086
ifying+0.085
 Look+0.080
're+0.077
 roy+0.076
Neg logit contributions
old-0.098
bench-0.092
 student-0.089
 worker-0.087
 tutor-0.087
Token knocked
Token ID13642
Feature activation+8.73e-03

Pos logit contributions
ifying+0.066
AAAA+0.066
 Look+0.061
 roy+0.060
're+0.060
Neg logit contributions
old-0.082
bench-0.078
 student-0.071
 worker-0.070
ggie-0.070
Token on
Token ID319
Feature activation+1.78e-02

Pos logit contributions
ifying+0.011
 Look+0.010
AAAA+0.010
 woke+0.010
're+0.010
Neg logit contributions
old-0.017
bench-0.016
Years-0.015
 student-0.015
 research-0.015
Token the
Token ID262
Feature activation+1.89e-02

Pos logit contributions
ifying+0.024
AAAA+0.023
 Look+0.023
 woke+0.022
 D+0.022
Neg logit contributions
old-0.033
bench-0.032
Years-0.029
 student-0.028
bars-0.028
Token door
Token ID3420
Feature activation-6.60e-03

Pos logit contributions
AAAA+0.037
ifying+0.034
're+0.034
 Look+0.034
 roy+0.033
Neg logit contributions
old-0.023
bench-0.022
 tutor-0.021
 worker-0.021
 manager-0.021
Token.
Token ID13
Feature activation+3.12e-03

Pos logit contributions
old+0.012
bench+0.011
Years+0.011
 homework+0.010
ritz+0.010
Neg logit contributions
ifying-0.009
 Look-0.009
 woke-0.009
're-0.008
AAAA-0.008
Token She
Token ID1375
Feature activation+3.12e-02

Pos logit contributions
ifying+0.005
AAAA+0.005
're+0.005
 roy+0.004
 Look+0.004
Neg logit contributions
bench-0.005
old-0.005
 worker-0.005
 tutor-0.004
 manager-0.004
Token day
Token ID1110
Feature activation+1.31e-02

Pos logit contributions
bench+0.017
old+0.016
Years+0.015
ritz+0.015
ae+0.014
Neg logit contributions
're-0.016
AAAA-0.016
ifying-0.016
 woke-0.016
 Look-0.016
Token,
Token ID11
Feature activation+1.44e-02

Pos logit contributions
AAAA+0.017
ifying+0.017
 Look+0.017
 woke+0.016
're+0.016
Neg logit contributions
old-0.023
bench-0.023
Years-0.021
 student-0.020
ritz-0.020
Token the
Token ID262
Feature activation+2.97e-03

Pos logit contributions
ifying+0.021
 Look+0.021
're+0.020
 Are+0.020
 woke+0.020
Neg logit contributions
bench-0.025
old-0.023
Years-0.020
ritz-0.020
iger-0.019
Token weight
Token ID3463
Feature activation-1.85e-02

Pos logit contributions
ifying+0.004
 Look+0.004
're+0.004
 Are+0.004
AAAA+0.004
Neg logit contributions
old-0.005
bench-0.005
Years-0.005
ritz-0.005
xy-0.005
Token would
Token ID561
Feature activation-2.37e-02

Pos logit contributions
old+0.036
bench+0.035
Years+0.034
iger+0.033
 student+0.033
Neg logit contributions
ifying-0.020
 woke-0.020
're-0.020
AAAA-0.019
 Look-0.018
Token be
Token ID307
Feature activation+2.12e-02

Pos logit contributions
old+0.046
bench+0.045
Years+0.041
 student+0.040
 homework+0.040
Neg logit contributions
ifying-0.029
AAAA-0.029
 Look-0.028
 roy-0.026
're-0.025
Token her
Token ID607
Feature activation+1.50e-02

Pos logit contributions
ifying+0.032
AAAA+0.031
 Look+0.031
're+0.029
 roy+0.029
Neg logit contributions
old-0.035
bench-0.035
Years-0.031
 student-0.031
 worker-0.030
Token best
Token ID1266
Feature activation+2.39e-05

Pos logit contributions
ifying+0.021
're+0.020
 Look+0.018
 roy+0.018
 D+0.018
Neg logit contributions
bench-0.027
old-0.027
Years-0.023
xy-0.023
iger-0.023
Token friend
Token ID1545
Feature activation+1.24e-02

Pos logit contributions
ifying+0.000
're+0.000
 Look+0.000
 woke+0.000
 roy+0.000
Neg logit contributions
bench-0.000
old-0.000
Years-0.000
iger-0.000
ritz-0.000
Token.
Token ID13
Feature activation-2.99e-03

Pos logit contributions
ifying+0.017
 Look+0.016
're+0.016
 woke+0.016
AAAA+0.016
Neg logit contributions
old-0.021
bench-0.021
Years-0.019
ritz-0.019
 homework-0.018
Token<|endoftext|>
Token ID50256
Feature activation+2.94e-02

Pos logit contributions
bench+0.006
old+0.006
Years+0.005
ritz+0.005
 worker+0.005
Neg logit contributions
 Look-0.004
ifying-0.004
AAAA-0.004
're-0.003
 Are-0.003
Token look
Token ID804
Feature activation+2.38e-02

Pos logit contributions
ifying+0.016
AAAA+0.015
 Look+0.015
 roy+0.014
 D+0.013
Neg logit contributions
old-0.030
bench-0.029
Years-0.027
 student-0.027
 homework-0.027
Token.
Token ID13
Feature activation-7.62e-03

Pos logit contributions
AAAA+0.043
ifying+0.040
 Look+0.040
 roy+0.037
 Are+0.037
Neg logit contributions
old-0.034
bench-0.031
 student-0.028
 homework-0.028
 worker-0.028
Token<|endoftext|>
Token ID50256
Feature activation+3.01e-02

Pos logit contributions
old+0.013
bench+0.013
Years+0.011
 student+0.011
 homework+0.011
Neg logit contributions
ifying-0.011
AAAA-0.011
 Look-0.010
 roy-0.010
're-0.010
TokenOnce
Token ID7454
Feature activation+6.62e-02

Pos logit contributions
ifying+0.035
AAAA+0.034
 Look+0.034
 woke+0.033
're+0.032
Neg logit contributions
old-0.061
bench-0.058
Years-0.057
 student-0.054
 homework-0.054
Token,
Token ID11
Feature activation+1.48e-02

Pos logit contributions
're+0.050
 D+0.049
 woke+0.046
ifying+0.045
 Look+0.041
Neg logit contributions
bench-0.157
Years-0.152
old-0.150
ggie-0.147
 poses-0.145
Token there
Token ID612
Feature activation+4.56e-03

Pos logit contributions
AAAA+0.022
ifying+0.022
 roy+0.021
 Look+0.020
're+0.020
Neg logit contributions
old-0.024
bench-0.022
 student-0.020
 worker-0.020
Years-0.020
Token was
Token ID373
Feature activation-6.06e-03

Pos logit contributions
ifying+0.006
AAAA+0.006
 Look+0.006
're+0.005
 roy+0.005
Neg logit contributions
old-0.008
bench-0.008
Years-0.007
 student-0.007
 homework-0.007
Token a
Token ID257
Feature activation-3.42e-04

Pos logit contributions
bench+0.012
old+0.012
Years+0.011
bars+0.011
 professor+0.011
Neg logit contributions
ifying-0.007
 Look-0.007
 D-0.007
AAAA-0.007
 roy-0.006
Token loyal
Token ID9112
Feature activation-4.03e-03

Pos logit contributions
old+0.001
bench+0.001
Years+0.001
 student+0.001
 worker+0.001
Neg logit contributions
ifying-0.000
AAAA-0.000
 Look-0.000
're-0.000
 roy-0.000
Token automobile
Token ID27930
Feature activation-8.93e-03

Pos logit contributions
bench+0.008
old+0.007
Years+0.007
 homework+0.007
bars+0.007
Neg logit contributions
 Look-0.005
 D-0.005
AAAA-0.005
ifying-0.005
 woke-0.005
Token.
Token ID13
Feature activation-4.84e-03

Pos logit contributions
old+0.018
bench+0.018
Years+0.017
ritz+0.016
 homework+0.016
Neg logit contributions
ifying-0.010
 Look-0.009
AAAA-0.009
 woke-0.009
're-0.009
Token said
Token ID531
Feature activation-3.40e-02

Pos logit contributions
old+0.019
bench+0.017
Years+0.016
 student+0.015
 tutor+0.015
Neg logit contributions
ifying-0.029
AAAA-0.026
 Look-0.025
 roy-0.025
're-0.024
Token,
Token ID11
Feature activation-1.19e-02

Pos logit contributions
old+0.059
bench+0.056
 worker+0.051
 tutor+0.051
 student+0.051
Neg logit contributions
ifying-0.051
AAAA-0.049
 Look-0.045
 roy-0.045
're-0.043
Token "
Token ID366
Feature activation-2.80e-03

Pos logit contributions
old+0.016
bench+0.016
Years+0.014
 worker+0.014
 tutor+0.014
Neg logit contributions
ifying-0.022
AAAA-0.020
 roy-0.020
're-0.019
 Look-0.019
TokenYes
Token ID5297
Feature activation-1.02e-02

Pos logit contributions
old+0.005
bench+0.004
Years+0.004
 student+0.004
 homework+0.004
Neg logit contributions
ifying-0.004
AAAA-0.004
 Look-0.004
're-0.004
 roy-0.004
Token,
Token ID11
Feature activation-9.64e-03

Pos logit contributions
old+0.013
 tutor+0.012
 worker+0.012
ggie+0.012
 math+0.012
Neg logit contributions
AAAA-0.019
ifying-0.018
 roy-0.016
 Are-0.016
cel-0.016
Token I
Token ID314
Feature activation+3.58e-02

Pos logit contributions
old+0.019
bench+0.019
Years+0.017
 student+0.017
 homework+0.017
Neg logit contributions
ifying-0.011
AAAA-0.011
 Look-0.010
're-0.010
 roy-0.010
Token did
Token ID750
Feature activation-4.41e-02

Pos logit contributions
ifying+0.045
AAAA+0.043
 Look+0.041
 roy+0.040
 D+0.037
Neg logit contributions
old-0.070
bench-0.068
Years-0.063
 student-0.062
ggie-0.061
Token!
Token ID0
Feature activation+9.06e-03

Pos logit contributions
old+0.067
bench+0.064
ggie+0.063
 homework+0.062
 math+0.061
Neg logit contributions
AAAA-0.072
ifying-0.071
 Look-0.065
 woke-0.064
 roy-0.064
Token Can
Token ID1680
Feature activation+4.50e-02

Pos logit contributions
AAAA+0.010
ifying+0.010
 roy+0.009
 woke+0.009
 pulp+0.009
Neg logit contributions
old-0.018
bench-0.017
 student-0.016
Years-0.016
 tutor-0.016
Token we
Token ID356
Feature activation+3.75e-02

Pos logit contributions
ifying+0.052
AAAA+0.052
 roy+0.047
 Look+0.047
 woke+0.045
Neg logit contributions
old-0.090
bench-0.085
Years-0.081
 student-0.079
 homework-0.079
Token bake
Token ID28450
Feature activation+2.31e-02

Pos logit contributions
AAAA+0.063
 roy+0.056
ifying+0.055
 Look+0.054
 Are+0.050
Neg logit contributions
old-0.058
 research-0.054
 homework-0.053
Years-0.052
 math-0.052
Token be
Token ID307
Feature activation+2.13e-02

Pos logit contributions
bench+0.040
Years+0.039
old+0.038
 homework+0.038
 worker+0.037
Neg logit contributions
ifying-0.030
AAAA-0.028
 Look-0.028
 roy-0.027
 Are-0.024
Token fun
Token ID1257
Feature activation-4.83e-02

Pos logit contributions
ifying+0.048
AAAA+0.046
 Look+0.046
're+0.044
 Sorry+0.044
Neg logit contributions
bench-0.018
old-0.018
Years-0.016
 homework-0.015
 math-0.014
Token."
Token ID526
Feature activation+2.13e-02

Pos logit contributions
old+0.112
bench+0.110
Years+0.102
 student+0.099
ggie+0.099
Neg logit contributions
 Look-0.039
ifying-0.038
're-0.037
 roy-0.037
 woke-0.036
Token
Token ID198
Feature activation-1.79e-02

Pos logit contributions
AAAA+0.029
 Look+0.029
ifying+0.029
're+0.028
 D+0.027
Neg logit contributions
old-0.037
bench-0.037
Years-0.034
 student-0.033
ggie-0.033
Token
Token ID198
Feature activation-1.39e-02

Pos logit contributions
bench+0.036
old+0.034
 student+0.033
 homework+0.032
 worker+0.032
Neg logit contributions
AAAA-0.021
 D-0.019
 Look-0.019
 woke-0.019
 Are-0.018
TokenBen
Token ID11696
Feature activation-2.68e-02

Pos logit contributions
old+0.022
bench+0.022
Years+0.019
 student+0.019
 homework+0.018
Neg logit contributions
ifying-0.022
AAAA-0.021
 Look-0.021
're-0.020
 roy-0.020
Token hesitated
Token ID41723
Feature activation-2.85e-02

Pos logit contributions
old+0.050
bench+0.049
Years+0.045
 student+0.044
 worker+0.043
Neg logit contributions
ifying-0.034
AAAA-0.033
 Look-0.032
're-0.031
 roy-0.031
Token.
Token ID13
Feature activation-1.20e-02

Pos logit contributions
bench+0.053
old+0.052
 student+0.047
Years+0.047
ritz+0.047
Neg logit contributions
 D-0.035
 Look-0.035
ifying-0.035
're-0.035
 woke-0.035
Token He
Token ID679
Feature activation+1.64e-03

Pos logit contributions
old+0.024
bench+0.023
ggie+0.021
 homework+0.021
 student+0.021
Neg logit contributions
 Look-0.015
AAAA-0.014
 Are-0.014
 D-0.013
ifying-0.013
Token didn
Token ID1422
Feature activation-2.06e-02

Pos logit contributions
AAAA+0.001
 woke+0.001
ifying+0.001
 Look+0.001
're+0.001
Neg logit contributions
bench-0.004
old-0.004
 student-0.003
 homework-0.003
 tutor-0.003
Token't
Token ID470
Feature activation-2.18e-04

Pos logit contributions
old+0.046
bench+0.043
 student+0.041
smanship+0.040
Years+0.039
Neg logit contributions
AAAA-0.022
ifying-0.021
're-0.019
 D-0.019
 Look-0.019
Token Snow
Token ID7967
Feature activation+1.91e-02

Pos logit contributions
ifying+0.009
AAAA+0.009
 Look+0.009
're+0.008
 roy+0.008
Neg logit contributions
old-0.011
bench-0.010
Years-0.009
 student-0.009
 homework-0.009
Tokeny
Token ID88
Feature activation+6.17e-02

Pos logit contributions
ifying+0.028
AAAA+0.028
 Look+0.027
're+0.026
 roy+0.026
Neg logit contributions
old-0.032
bench-0.031
 student-0.027
Years-0.027
ritz-0.027
Token was
Token ID373
Feature activation-1.48e-02

Pos logit contributions
ifying+0.085
AAAA+0.081
 Look+0.079
 roy+0.072
're+0.072
Neg logit contributions
old-0.112
bench-0.108
 student-0.100
 homework-0.099
ggie-0.098
Token no
Token ID645
Feature activation+4.88e-02

Pos logit contributions
old+0.026
bench+0.025
Years+0.023
 student+0.022
 homework+0.022
Neg logit contributions
ifying-0.021
AAAA-0.021
 Look-0.020
're-0.019
 roy-0.019
Token longer
Token ID2392
Feature activation-2.31e-02

Pos logit contributions
ifying+0.089
AAAA+0.088
 Look+0.087
 roy+0.085
're+0.084
Neg logit contributions
old-0.064
bench-0.062
Years-0.055
 student-0.053
ritz-0.051
Token white
Token ID2330
Feature activation-3.24e-03

Pos logit contributions
old+0.040
bench+0.040
 student+0.035
 worker+0.034
 homework+0.034
Neg logit contributions
AAAA-0.033
ifying-0.033
 Look-0.032
're-0.031
 Sorry-0.030
Token.
Token ID13
Feature activation-2.25e-03

Pos logit contributions
old+0.005
bench+0.005
Years+0.004
 student+0.004
ritz+0.004
Neg logit contributions
ifying-0.005
AAAA-0.005
 Look-0.005
're-0.005
 woke-0.005
Token He
Token ID679
Feature activation+1.24e-02

Pos logit contributions
old+0.003
bench+0.003
 student+0.003
Years+0.003
 homework+0.003
Neg logit contributions
ifying-0.004
AAAA-0.004
 roy-0.004
're-0.003
 Look-0.003
Token was
Token ID373
Feature activation-8.57e-03

Pos logit contributions
ifying+0.015
AAAA+0.015
 Look+0.014
 roy+0.014
're+0.013
Neg logit contributions
old-0.024
bench-0.023
 student-0.021
Years-0.021
ggie-0.021
Token red
Token ID2266
Feature activation+1.56e-02

Pos logit contributions
old+0.015
bench+0.015
Years+0.013
 student+0.013
 homework+0.013
Neg logit contributions
ifying-0.012
AAAA-0.012
 Look-0.011
're-0.011
 roy-0.011
Token and
Token ID290
Feature activation-8.37e-03

Pos logit contributions
ifying+0.022
 Look+0.021
AAAA+0.021
 roy+0.020
 woke+0.020
Neg logit contributions
old-0.028
bench-0.027
Years-0.025
 student-0.024
ritz-0.023
Token.
Token ID13
Feature activation-4.24e-03

Pos logit contributions
old+0.063
bench+0.061
 student+0.054
Years+0.054
 homework+0.053
Neg logit contributions
AAAA-0.066
ifying-0.065
 Look-0.064
 woke-0.061
 roy-0.060
Token\n
Token ID198
Feature activation-1.02e-03

Pos logit contributions
bench+0.009
old+0.009
Years+0.008
 homework+0.008
 student+0.008
Neg logit contributions
 Look-0.005
ifying-0.005
 D-0.005
AAAA-0.004
're-0.004
Token\n
Token ID198
Feature activation+3.02e-05

Pos logit contributions
bench+0.002
bars+0.002
 rider+0.002
 homework+0.002
 student+0.002
Neg logit contributions
 D-0.001
AAAA-0.001
 woke-0.001
 Look-0.001
're-0.001
TokenThis
Token ID1212
Feature activation+2.69e-03

Pos logit contributions
AAAA+0.000
ifying+0.000
 Look+0.000
're+0.000
 D+0.000
Neg logit contributions
bench-0.000
old-0.000
 homework-0.000
bars-0.000
 appointment-0.000
Token story
Token ID1621
Feature activation+5.22e-02

Pos logit contributions
 woke+0.004
AAAA+0.004
're+0.004
 Look+0.003
ifying+0.003
Neg logit contributions
old-0.005
bench-0.004
bars-0.004
Years-0.004
iger-0.004
Token teaches
Token ID17324
Feature activation-2.31e-02

Pos logit contributions
ifying+0.081
AAAA+0.080
 woke+0.078
 Look+0.078
're+0.077
Neg logit contributions
old-0.082
bench-0.078
Years-0.070
 student-0.069
ritz-0.068
Token us
Token ID514
Feature activation-2.14e-03

Pos logit contributions
bench+0.040
old+0.039
Years+0.035
 student+0.035
 worker+0.034
Neg logit contributions
ifying-0.032
AAAA-0.032
 Look-0.031
're-0.030
 woke-0.030
Token that
Token ID326
Feature activation-1.93e-02

Pos logit contributions
old+0.003
bench+0.003
Years+0.002
 student+0.002
 homework+0.002
Neg logit contributions
ifying-0.004
AAAA-0.004
 Look-0.004
 roy-0.004
're-0.004
Token it
Token ID340
Feature activation-6.72e-03

Pos logit contributions
old+0.034
bench+0.033
Years+0.030
ritz+0.029
ae+0.028
Neg logit contributions
 Look-0.027
AAAA-0.026
ifying-0.026
're-0.025
 D-0.025
Token's
Token ID338
Feature activation+1.75e-02

Pos logit contributions
old+0.013
bench+0.013
Years+0.012
 student+0.012
 homework+0.011
Neg logit contributions
ifying-0.008
AAAA-0.008
 Look-0.008
're-0.007
 roy-0.007
Token important
Token ID1593
Feature activation+8.12e-03

Pos logit contributions
ifying+0.031
AAAA+0.031
 Look+0.030
're+0.029
 roy+0.029
Neg logit contributions
old-0.024
bench-0.023
 student-0.020
Years-0.020
 worker-0.019
Token go
Token ID467
Feature activation+1.60e-02

Pos logit contributions
AAAA+0.039
ifying+0.037
 roy+0.034
 Look+0.033
 Are+0.033
Neg logit contributions
old-0.031
 homework-0.029
bench-0.028
 student-0.028
ggie-0.027
Token to
Token ID284
Feature activation+1.13e-02

Pos logit contributions
ifying+0.028
AAAA+0.027
 Look+0.027
 roy+0.025
're+0.025
Neg logit contributions
old-0.023
bench-0.022
 student-0.019
Years-0.019
 worker-0.019
Token the
Token ID262
Feature activation+1.03e-03

Pos logit contributions
 D+0.012
 Look+0.012
AAAA+0.012
ifying+0.011
 roy+0.011
Neg logit contributions
bench-0.024
old-0.023
Years-0.022
ritz-0.021
bars-0.021
Token slide
Token ID10649
Feature activation-1.23e-02

Pos logit contributions
ifying+0.002
AAAA+0.002
 Look+0.002
're+0.002
 roy+0.002
Neg logit contributions
old-0.001
bench-0.001
Years-0.001
 student-0.001
 worker-0.001
Token."
Token ID526
Feature activation+2.69e-02

Pos logit contributions
old+0.026
bench+0.025
Years+0.024
ggie+0.024
 research+0.024
Neg logit contributions
 Look-0.012
 woke-0.012
ifying-0.012
're-0.012
 D-0.011
Token\n
Token ID198
Feature activation-1.90e-02

Pos logit contributions
 Look+0.026
 D+0.025
AAAA+0.025
're+0.024
 Are+0.024
Neg logit contributions
bench-0.059
old-0.057
 student-0.056
ggie-0.055
heet-0.055
Token\n
Token ID198
Feature activation-1.58e-02

Pos logit contributions
bench+0.043
 student+0.041
smanship+0.040
heet+0.040
 homework+0.040
Neg logit contributions
AAAA-0.016
 D-0.015
 woke-0.014
 Look-0.014
 Are-0.013
TokenTom
Token ID13787
Feature activation-5.42e-02

Pos logit contributions
bench+0.032
old+0.031
 student+0.029
ritz+0.029
 homework+0.029
Neg logit contributions
AAAA-0.018
 roy-0.015
ifying-0.015
 Look-0.015
 Are-0.014
Token,
Token ID11
Feature activation+3.82e-03

Pos logit contributions
bench+0.103
old+0.103
Years+0.093
ggie+0.092
 student+0.091
Neg logit contributions
AAAA-0.065
 Look-0.064
 woke-0.062
're-0.061
ifying-0.061
Token Lily
Token ID20037
Feature activation-3.64e-02

Pos logit contributions
ifying+0.006
AAAA+0.006
 Look+0.006
're+0.006
 woke+0.006
Neg logit contributions
old-0.006
bench-0.005
Years-0.005
 student-0.005
ritz-0.005
Token and
Token ID290
Feature activation-9.26e-03

Pos logit contributions
bench+0.071
old+0.071
Years+0.066
ggie+0.064
ritz+0.064
Neg logit contributions
AAAA-0.042
 Look-0.041
 woke-0.040
're-0.040
 roy-0.039
Token a
Token ID257
Feature activation-3.32e-04

Pos logit contributions
old+0.012
bench+0.012
Years+0.011
bars+0.011
 student+0.011
Neg logit contributions
AAAA-0.008
ifying-0.008
 Look-0.007
 D-0.007
're-0.007
Token little
Token ID1310
Feature activation+2.01e-02

Pos logit contributions
old+0.001
bench+0.001
Years+0.001
 student+0.001
 worker+0.001
Neg logit contributions
ifying-0.000
AAAA-0.000
 Look-0.000
're-0.000
 roy-0.000
Token girl
Token ID2576
Feature activation-2.72e-03

Pos logit contributions
ifying+0.028
AAAA+0.028
 Look+0.027
're+0.026
 roy+0.026
Neg logit contributions
old-0.035
bench-0.034
Years-0.031
 student-0.030
 homework-0.030
Token named
Token ID3706
Feature activation+1.82e-02

Pos logit contributions
old+0.006
bench+0.006
 student+0.005
 worker+0.005
Years+0.005
Neg logit contributions
ifying-0.003
AAAA-0.003
're-0.003
 Look-0.003
 roy-0.002
Token Lily
Token ID20037
Feature activation-3.95e-02

Pos logit contributions
AAAA+0.031
ifying+0.030
 roy+0.026
're+0.026
 pulp+0.025
Neg logit contributions
old-0.027
bench-0.025
 student-0.024
 tutor-0.024
 worker-0.023
Token.
Token ID13
Feature activation-6.02e-03

Pos logit contributions
bench+0.062
old+0.061
Years+0.056
 homework+0.055
 attendant+0.054
Neg logit contributions
 woke-0.060
 Look-0.059
ifying-0.059
AAAA-0.058
 roy-0.057
Token She
Token ID1375
Feature activation+1.18e-02

Pos logit contributions
bench+0.009
old+0.009
 student+0.007
Years+0.007
 tutor+0.007
Neg logit contributions
AAAA-0.010
ifying-0.010
 roy-0.009
 Look-0.009
're-0.009
Token loved
Token ID6151
Feature activation-2.50e-02

Pos logit contributions
ifying+0.022
AAAA+0.019
 roy+0.019
 Look+0.019
're+0.018
Neg logit contributions
old-0.017
bench-0.016
ggie-0.016
Years-0.015
 student-0.015
Token to
Token ID284
Feature activation-1.55e-02

Pos logit contributions
old+0.038
bench+0.038
ritz+0.033
Years+0.033
 attendant+0.032
Neg logit contributions
 Look-0.040
AAAA-0.039
 woke-0.039
ifying-0.039
 D-0.038
Token play
Token ID711
Feature activation-2.95e-02

Pos logit contributions
old+0.024
bench+0.023
 homework+0.023
 tutor+0.022
 math+0.022
Neg logit contributions
ifying-0.027
AAAA-0.025
 Look-0.023
 roy-0.022
 Are-0.022
Token with
Token ID351
Feature activation-1.79e-02

Pos logit contributions
old+0.042
bench+0.041
Years+0.036
 student+0.036
 homework+0.034
Neg logit contributions
ifying-0.051
AAAA-0.050
 Look-0.049
're-0.047
 roy-0.047
Token Finn
Token ID15368
Feature activation-4.85e-02

Pos logit contributions
ifying+0.029
AAAA+0.028
 woke+0.026
 Look+0.026
 roy+0.025
Neg logit contributions
old-0.029
bench-0.027
 student-0.025
ggie-0.025
iger-0.025
Token.
Token ID13
Feature activation-1.70e-02

Pos logit contributions
old+0.058
bench+0.052
 student+0.045
 worker+0.045
Years+0.043
Neg logit contributions
ifying-0.098
AAAA-0.097
 Look-0.093
 Are-0.088
 roy-0.088
Token\n
Token ID198
Feature activation-1.30e-02

Pos logit contributions
old+0.026
bench+0.025
Years+0.023
 student+0.022
 worker+0.022
Neg logit contributions
ifying-0.028
AAAA-0.026
 roy-0.025
 Look-0.024
're-0.024
Token\n
Token ID198
Feature activation-1.20e-02

Pos logit contributions
bench+0.029
 homework+0.028
old+0.028
bars+0.028
 student+0.028
Neg logit contributions
 woke-0.011
 D-0.011
AAAA-0.011
 Look-0.010
 Are-0.009
TokenF
Token ID37
Feature activation-1.23e-02

Pos logit contributions
old+0.019
bench+0.018
Years+0.017
 worker+0.016
ritz+0.016
Neg logit contributions
ifying-0.019
AAAA-0.018
 Look-0.018
're-0.018
 roy-0.018
Tokeninn
Token ID3732
Feature activation-1.09e-01

Pos logit contributions
bench+0.015
old+0.015
 student+0.013
 tutor+0.013
Years+0.013
Neg logit contributions
ifying-0.023
AAAA-0.023
 Look-0.022
 D-0.022
 woke-0.022
Token asked
Token ID1965
Feature activation+1.85e-03

Pos logit contributions
old+0.203
bench+0.189
Years+0.171
 student+0.171
 worker+0.170
Neg logit contributions
ifying-0.151
AAAA-0.142
 Look-0.140
 Are-0.131
 roy-0.130
Token,
Token ID11
Feature activation-4.00e-04

Pos logit contributions
ifying+0.003
AAAA+0.002
 Look+0.002
're+0.002
 roy+0.002
Neg logit contributions
old-0.003
bench-0.003
Years-0.003
 student-0.003
 worker-0.003
Token "
Token ID366
Feature activation-4.65e-03

Pos logit contributions
old+0.001
bench+0.001
Years+0.001
 student+0.001
 worker+0.001
Neg logit contributions
ifying-0.001
AAAA-0.001
 Look-0.001
're-0.001
 roy-0.001
TokenDo
Token ID5211
Feature activation+9.37e-03

Pos logit contributions
Years+0.007
old+0.006
guard+0.006
 worker+0.006
Jacob+0.006
Neg logit contributions
ifying-0.007
're-0.007
AAAA-0.007
 Sorry-0.007
 roy-0.007
Token you
Token ID345
Feature activation+4.04e-02

Pos logit contributions
ifying+0.012
AAAA+0.012
 Look+0.012
 roy+0.011
're+0.011
Neg logit contributions
old-0.017
bench-0.017
Years-0.015
 student-0.015
 homework-0.015
Token people
Token ID661
Feature activation+1.60e-02

Pos logit contributions
 D+0.056
 woke+0.055
 Pink+0.054
 Look+0.053
 th+0.052
Neg logit contributions
 homework-0.057
bench-0.056
Years-0.054
ae-0.053
chairs-0.051
Token.
Token ID13
Feature activation-1.08e-02

Pos logit contributions
ifying+0.020
're+0.020
 D+0.019
 woke+0.018
 Look+0.018
Neg logit contributions
bench-0.031
Years-0.029
old-0.027
ggie-0.027
 homework-0.027
Token He
Token ID679
Feature activation+4.75e-03

Pos logit contributions
old+0.022
bench+0.022
 student+0.020
Years+0.020
 homework+0.020
Neg logit contributions
 Look-0.012
ifying-0.012
AAAA-0.012
 woke-0.011
're-0.011
Token held
Token ID2714
Feature activation-2.99e-02

Pos logit contributions
ifying+0.006
AAAA+0.006
 Look+0.006
 roy+0.005
're+0.005
Neg logit contributions
old-0.009
bench-0.009
Years-0.008
 student-0.008
ggie-0.008
Token the
Token ID262
Feature activation-2.43e-02

Pos logit contributions
old+0.057
bench+0.056
 student+0.051
Years+0.051
 salon+0.051
Neg logit contributions
ifying-0.038
 Look-0.037
AAAA-0.036
 D-0.034
 woke-0.034
Token rope
Token ID17182
Feature activation-1.01e-01

Pos logit contributions
old+0.045
bench+0.041
ae+0.038
Years+0.037
 homework+0.037
Neg logit contributions
AAAA-0.035
ifying-0.033
're-0.032
 Look-0.031
 roy-0.030
Token and
Token ID290
Feature activation+3.81e-03

Pos logit contributions
old+0.183
bench+0.176
 student+0.162
 homework+0.159
Years+0.156
Neg logit contributions
ifying-0.136
're-0.125
 Look-0.124
 woke-0.122
 D-0.121
Token flew
Token ID13112
Feature activation+5.84e-02

Pos logit contributions
ifying+0.004
AAAA+0.004
 Look+0.004
 woke+0.004
're+0.004
Neg logit contributions
old-0.008
bench-0.008
Years-0.007
 student-0.007
 homework-0.007
Token his
Token ID465
Feature activation+1.16e-02

Pos logit contributions
ifying+0.061
 D+0.061
 Look+0.058
AAAA+0.056
're+0.055
Neg logit contributions
old-0.125
bench-0.119
 student-0.112
 homework-0.111
 tutor-0.109
Token k
Token ID479
Feature activation-1.18e-02

Pos logit contributions
ifying+0.014
AAAA+0.014
're+0.013
 Look+0.013
 woke+0.012
Neg logit contributions
old-0.023
bench-0.022
ritz-0.020
Years-0.020
 student-0.020
Tokenite
Token ID578
Feature activation+4.35e-02

Pos logit contributions
bench+0.019
old+0.019
Years+0.017
 homework+0.017
 student+0.017
Neg logit contributions
ifying-0.019
AAAA-0.018
 Look-0.017
're-0.017
 roy-0.016
Token,
Token ID11
Feature activation+9.50e-03

Pos logit contributions
ifying+0.125
AAAA+0.123
're+0.109
 Look+0.107
 Are+0.106
Neg logit contributions
old-0.120
 math-0.104
 homework-0.101
 tutor-0.100
 student-0.099
Token I
Token ID314
Feature activation+4.92e-02

Pos logit contributions
ifying+0.015
AAAA+0.015
 roy+0.014
're+0.013
 woke+0.013
Neg logit contributions
 math-0.013
old-0.013
 homework-0.013
 student-0.013
 tutor-0.013
Token can
Token ID460
Feature activation-3.99e-02

Pos logit contributions
ifying+0.062
AAAA+0.059
 roy+0.055
 Look+0.055
 pulp+0.053
Neg logit contributions
old-0.095
bench-0.090
 student-0.086
 homework-0.085
 research-0.084
Token do
Token ID466
Feature activation-9.72e-02

Pos logit contributions
old+0.072
bench+0.071
Years+0.064
 student+0.063
 worker+0.061
Neg logit contributions
ifying-0.053
AAAA-0.052
 Look-0.052
're-0.050
 woke-0.049
Token this
Token ID428
Feature activation-9.46e-03

Pos logit contributions
old+0.141
bench+0.134
Years+0.120
 student+0.118
 homework+0.117
Neg logit contributions
ifying-0.166
AAAA-0.164
 Look-0.160
're-0.154
 roy-0.154
Token pose
Token ID12705
Feature activation-9.47e-02

Pos logit contributions
old+0.014
bench+0.013
bars+0.010
Years+0.010
ritz+0.010
Neg logit contributions
 Look-0.016
AAAA-0.016
 woke-0.016
ifying-0.016
're-0.016
Token and
Token ID290
Feature activation+4.82e-03

Pos logit contributions
old+0.172
bench+0.167
Years+0.152
 student+0.148
ritz+0.146
Neg logit contributions
ifying-0.125
AAAA-0.121
 Look-0.121
're-0.116
 woke-0.114
Token this
Token ID428
Feature activation-4.59e-04

Pos logit contributions
ifying+0.006
AAAA+0.006
 Look+0.006
 roy+0.006
're+0.006
Neg logit contributions
old-0.009
bench-0.009
 student-0.008
 homework-0.008
Years-0.008
Token pose
Token ID12705
Feature activation-8.41e-02

Pos logit contributions
old+0.001
bench+0.001
Years+0.001
ritz+0.001
 attendant+0.001
Neg logit contributions
 Look-0.001
AAAA-0.001
 woke-0.001
ifying-0.001
're-0.001
Token and
Token ID290
Feature activation+5.99e-03

Pos logit contributions
old+0.158
bench+0.154
Years+0.141
 student+0.139
 homework+0.136
Neg logit contributions
ifying-0.106
AAAA-0.104
 Look-0.101
're-0.096
 roy-0.095
Token this
Token ID428
Feature activation+2.31e-03

Pos logit contributions
ifying+0.008
AAAA+0.008
 Look+0.007
 roy+0.007
're+0.007
Neg logit contributions
old-0.011
bench-0.011
 student-0.010
 homework-0.010
Years-0.010
Token They
Token ID1119
Feature activation+4.73e-05

Pos logit contributions
old+0.003
bench+0.003
Years+0.003
 student+0.002
 worker+0.002
Neg logit contributions
ifying-0.002
AAAA-0.002
 roy-0.002
 Look-0.002
 woke-0.002
Token had
Token ID550
Feature activation+3.17e-02

Pos logit contributions
ifying+0.000
AAAA+0.000
 Look+0.000
 roy+0.000
 Sorry+0.000
Neg logit contributions
old-0.000
bench-0.000
 student-0.000
 worker-0.000
Years-0.000
Token a
Token ID257
Feature activation-1.04e-02

Pos logit contributions
ifying+0.044
AAAA+0.043
 Look+0.043
're+0.041
 woke+0.040
Neg logit contributions
old-0.056
bench-0.054
Years-0.049
 student-0.049
ritz-0.048
Token lot
Token ID1256
Feature activation+7.02e-04

Pos logit contributions
old+0.019
bench+0.019
Years+0.017
 student+0.017
ritz+0.016
Neg logit contributions
ifying-0.014
AAAA-0.014
 Look-0.013
 roy-0.012
're-0.012
Token of
Token ID286
Feature activation-4.30e-02

Pos logit contributions
ifying+0.001
're+0.001
 D+0.001
 woke+0.001
als+0.001
Neg logit contributions
bench-0.001
 appointment-0.001
old-0.001
Years-0.001
wen-0.001
Token fun
Token ID1257
Feature activation-9.22e-02

Pos logit contributions
old+0.065
ae+0.061
bench+0.060
bars+0.059
wen+0.057
Neg logit contributions
AAAA-0.067
 Look-0.066
 roy-0.064
're-0.064
 D-0.062
Token making
Token ID1642
Feature activation-3.03e-02

Pos logit contributions
old+0.173
bench+0.162
 attendant+0.151
 professor+0.151
bars+0.150
Neg logit contributions
AAAA-0.120
 Look-0.112
 roy-0.112
 woke-0.110
ifying-0.110
Token noises
Token ID26782
Feature activation-2.57e-02

Pos logit contributions
bench+0.057
old+0.054
Years+0.051
bars+0.049
ritz+0.049
Neg logit contributions
 D-0.037
 Look-0.037
 roy-0.037
ifying-0.036
AAAA-0.036
Token and
Token ID290
Feature activation-4.65e-03

Pos logit contributions
old+0.052
bench+0.051
Years+0.048
ritz+0.046
ae+0.046
Neg logit contributions
 woke-0.027
ifying-0.027
're-0.026
 Look-0.025
 roy-0.025
Token pretending
Token ID23662
Feature activation-3.05e-02

Pos logit contributions
old+0.007
bench+0.007
 student+0.006
Years+0.006
 homework+0.006
Neg logit contributions
ifying-0.008
AAAA-0.008
 Look-0.007
 roy-0.007
're-0.007
Token to
Token ID284
Feature activation+2.61e-02

Pos logit contributions
old+0.047
bench+0.045
Years+0.040
 student+0.039
ritz+0.039
Neg logit contributions
ifying-0.049
AAAA-0.048
 Look-0.048
 woke-0.048
're-0.046
Token attic
Token ID46113
Feature activation-4.16e-02

Pos logit contributions
old+0.017
bench+0.016
Years+0.013
 student+0.013
ggie+0.013
Neg logit contributions
ifying-0.030
AAAA-0.030
 Look-0.030
 roy-0.029
 woke-0.029
Token.
Token ID13
Feature activation-7.07e-03

Pos logit contributions
old+0.083
bench+0.079
Years+0.079
ggie+0.076
iger+0.075
Neg logit contributions
ifying-0.046
're-0.045
 woke-0.044
 Look-0.043
 reapp-0.042
Token They
Token ID1119
Feature activation+6.37e-03

Pos logit contributions
old+0.014
bench+0.013
Years+0.012
 student+0.012
 homework+0.012
Neg logit contributions
ifying-0.009
AAAA-0.008
 Look-0.008
're-0.008
 roy-0.008
Token found
Token ID1043
Feature activation+1.94e-02

Pos logit contributions
ifying+0.007
AAAA+0.006
 Look+0.006
 roy+0.006
 Are+0.005
Neg logit contributions
old-0.014
bench-0.013
Years-0.012
 student-0.012
 worker-0.012
Token all
Token ID477
Feature activation-3.32e-03

Pos logit contributions
ifying+0.024
 Look+0.024
AAAA+0.024
're+0.023
 D+0.023
Neg logit contributions
bench-0.036
old-0.035
Years-0.032
 student-0.032
ritz-0.031
Token sorts
Token ID10524
Feature activation-1.11e-01

Pos logit contributions
bench+0.005
old+0.005
Years+0.005
ggie+0.005
ae+0.005
Neg logit contributions
 woke-0.005
're-0.005
ifying-0.005
AAAA-0.005
 Look-0.005
Token of
Token ID286
Feature activation-5.94e-02

Pos logit contributions
bench+0.153
old+0.145
Years+0.133
 student+0.132
 homework+0.131
Neg logit contributions
ifying-0.189
AAAA-0.183
're-0.182
 Look-0.181
 roy-0.177
Token things
Token ID1243
Feature activation-4.22e-02

Pos logit contributions
bench+0.097
old+0.094
ae+0.086
 student+0.085
Years+0.085
Neg logit contributions
AAAA-0.086
 Look-0.084
ifying-0.083
 roy-0.083
 woke-0.083
Token:
Token ID25
Feature activation-1.69e-03

Pos logit contributions
old+0.074
bench+0.073
Years+0.068
 student+0.067
ae+0.065
Neg logit contributions
ifying-0.056
're-0.055
AAAA-0.054
 Look-0.053
 woke-0.053
Token old
Token ID1468
Feature activation-1.75e-02

Pos logit contributions
old+0.003
bench+0.003
Years+0.002
 student+0.002
ritz+0.002
Neg logit contributions
ifying-0.003
AAAA-0.003
 Look-0.003
're-0.002
 roy-0.002
Token toys
Token ID14958
Feature activation+4.35e-03

Pos logit contributions
old+0.027
bench+0.026
Years+0.023
 student+0.023
 homework+0.022
Neg logit contributions
ifying-0.028
AAAA-0.028
 Look-0.027
're-0.026
 roy-0.026