TOP ACTIVATIONS
MAX = 0.099

the side . He leans too much and loses his balance
happy . When faced with lack , we must not be
not careful . He swung too hard and missed the branch
pieces . Tom fell down too . He was cold and
The thread was too strong and Jay couldn â
down . Tom falls down too . He hurts his knee
said , holding up the large knot . Her
She was too fast and too high . They
every day , he was too weak and fell straight back
and interesting plants . One time she noticed something strange in
You have to respect the rules and stay away from places
to climb down but was too dizz y from the height
down . Ben fell down too . He hit his head
One day , she bounced too hard and fell off the
want you to respect their rules and their things . If
slid . But she slid too fast and lost her balance
" I am thinking about wisdom ," said the owl .
jet ! But she was too young . One day ,
replied . He took a long piece of string and tied
pulled hard . Lily pulled too . They pulled and pulled

MIN ACTIVATIONS
MIN = -0.101

fire work . <|endoftext|> E dd ie was a three year
shouting " Ah oy , mate y !" and " Sh
the shore and watches the sw an . She thinks it
peaceful . The sw an looked around and saw
the pond and watch the sw ans swim . She loved
O ll ie . O ll ie loved to play all
resist ! I will try the o at !"
was mean . He said sun fl owers were alive .
around the lake . The sw an wanted to join the
walked until they finally reached the o asis . They drank
said , " Hi O ll ie ". <|endoftext|> Once ,
the park to study the sw an and other animals .
by the lake . The sw an was so graceful that
sun came up , the sw an would wake up again
When Twe ety arrived at the o asis , she saw
the park , watching the sw an and talking about all
One morning , the sw an discovered something very unique
the pond and saw the sw an again . She wanted
a big bite of the hamb urger . " M mm
clear and calm . The sw an flew through the sky

INTERVAL 0.049 - 0.099
CONTAINS 0.400%

it is so pretty !" Lily says . "
. But the stick is too short . They look around
that sharing is better than winning . From that day on
The girl was so relieved . She hugged Sharon and thanked
Can I touch them ?" Lily asked . "

INTERVAL -0.001 - 0.049
CONTAINS 50.205%

rabbit hopped out of the bushes . L ola was delighted
time , there was a little girl . She was very
pulled it away from the tree . The car was still
their trip . <|endoftext|> Once upon a time , there was
Finally they did it ! With the help of his friends

INTERVAL -0.051 - -0.001
CONTAINS 49.164%

But then Mom said , " Oh no , the car
, there was a happy clay . He lived in a
for the boy and he was happy to help . He
time with him and they both had a great day .
said I couldn 't because I don 't know how to

INTERVAL -0.101 - -0.051
CONTAINS 0.231%

was flying ! She pretended her t eddy bear could fly
, they finished building the ig loo . Tim my was
there was a tank . It was big and shiny !
Yes Bella . This is our beautiful porch . Come ,
He asked his mom , " Mom , are there dinosaurs
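
The four INTERVAL blocks above are consistent with splitting the range from MIN (-0.101) to MAX (0.099) into four equal-width bins of 0.05 and reporting the share of activations that falls into each. A minimal sketch of that binning follows, assuming equal-width bins; the function name `interval_shares` and the sample values are hypothetical and are not taken from the tool that produced this dump.

```python
def interval_shares(activations, n_bins=4):
    """Split [min, max] into n_bins equal-width intervals.

    Returns a list of (low_edge, high_edge, percent_of_activations).
    Hypothetical helper: equal-width binning is an assumption.
    """
    acts = list(activations)
    if not acts:
        raise ValueError("need at least one activation")
    lo, hi = min(acts), max(acts)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for a in acts:
        if width == 0:
            idx = n_bins - 1  # all values identical: use the last bin
        else:
            # The maximum value lands exactly on the top edge, so clamp it
            # into the last bin instead of creating an extra one.
            idx = min(int((a - lo) / width), n_bins - 1)
        counts[idx] += 1
    return [
        (lo + i * width, lo + (i + 1) * width, 100.0 * counts[i] / len(acts))
        for i in range(n_bins)
    ]


# Made-up activations spanning the same range as the dump (-0.101 to 0.099).
sample = [-0.101, -0.06, -0.01, 0.0, 0.02, 0.03, 0.06, 0.099]
for low, high, pct in interval_shares(sample):
    print(f"INTERVAL {low:.3f} - {high:.3f}  CONTAINS {pct:.3f}%")
```

Applied to the full set of activations rather than this toy sample, the same procedure would be expected to give percentages that sum to 100%, as the four CONTAINS values above do (0.400 + 50.205 + 49.164 + 0.231).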
Token the
Token ID262
Feature activation-2.61e-04

Pos logit contributions
bs+0.057
Scar+0.052
 cane+0.051
powers+0.050
able+0.050
Neg logit contributions
icles-0.051
ator-0.051
urg-0.049
icle-0.045
zie-0.042
Token side
Token ID1735
Feature activation+2.22e-02

Pos logit contributions
urg+0.000
ator+0.000
icles+0.000
rim+0.000
icle+0.000
Neg logit contributions
bs-0.000
 cane-0.000
Scar-0.000
 arrows-0.000
Aud-0.000
Token.
Token ID13
Feature activation+2.43e-03

Pos logit contributions
Scar+0.034
bs+0.033
 cane+0.033
 arrows+0.030
Aud+0.030
Neg logit contributions
ator-0.042
icles-0.041
urg-0.040
icle-0.036
zie-0.034
Token He
Token ID679
Feature activation+2.22e-02

Pos logit contributions
 cane+0.004
bs+0.004
Scar+0.003
 arrows+0.003
 arrow+0.003
Neg logit contributions
urg-0.005
ator-0.005
icles-0.004
rim-0.004
icle-0.004
Token leans
Token ID39416
Feature activation-1.24e-02

Pos logit contributions
bs+0.032
 cane+0.031
Scar+0.030
Michelle+0.030
Har+0.029
Neg logit contributions
urg-0.041
ator-0.040
icles-0.039
ric-0.035
zie-0.034
Token too
Token ID1165
Feature activation+9.72e-02

Pos logit contributions
urg+0.024
ator+0.023
icles+0.022
rim+0.020
zie+0.019
Neg logit contributions
 cane-0.019
bs-0.019
Scar-0.019
 arrows-0.017
 confident-0.017
Token much
Token ID881
Feature activation+7.82e-02

Pos logit contributions
nam+0.182
Aud+0.181
pill+0.177
 diary+0.177
 stamps+0.177
Neg logit contributions
 now-0.110
ator-0.107
icles-0.103
 next-0.101
 first-0.101
Token and
Token ID290
Feature activation+4.24e-02

Pos logit contributions
nam+0.178
Jennifer+0.163
Scar+0.159
Terry+0.158
Michelle+0.158
Neg logit contributions
icles-0.087
icle-0.079
ator-0.071
 them-0.067
 side-0.067
Token loses
Token ID14754
Feature activation+1.40e-02

Pos logit contributions
Mic+0.073
Michelle+0.067
able+0.066
 cane+0.065
Har+0.065
Neg logit contributions
icles-0.066
urg-0.066
ator-0.058
ric-0.057
icle-0.056
Token his
Token ID465
Feature activation+3.98e-02

Pos logit contributions
bs+0.028
Scar+0.026
 cane+0.026
Aud+0.024
able+0.024
Neg logit contributions
urg-0.021
ator-0.021
icles-0.020
icle-0.017
 Esk-0.016
Token balance
Token ID5236
Feature activation+3.93e-02

Pos logit contributions
bs+0.061
Scar+0.060
Michelle+0.058
Charl+0.058
Aud+0.058
Neg logit contributions
ator-0.071
urg-0.071
icles-0.067
icle-0.061
 ig-0.058
Token happy
Token ID3772
Feature activation-1.27e-02

Pos logit contributions
 cane+0.041
Scar+0.041
bs+0.038
Har+0.038
Mic+0.038
Neg logit contributions
ator-0.055
urg-0.054
icles-0.050
rim-0.047
ric-0.046
Token.
Token ID13
Feature activation+1.77e-03

Pos logit contributions
urg+0.027
ator+0.026
icles+0.025
rim+0.023
icle+0.022
Neg logit contributions
bs-0.017
 cane-0.017
Scar-0.016
 arrows-0.015
 arrow-0.015
Token When
Token ID1649
Feature activation-1.59e-02

Pos logit contributions
bs+0.002
 cane+0.002
Scar+0.002
 arrows+0.002
 arrow+0.002
Neg logit contributions
urg-0.004
ator-0.004
icles-0.004
rim-0.003
icle-0.003
Token faced
Token ID7452
Feature activation+1.29e-02

Pos logit contributions
urg+0.034
ator+0.034
icles+0.032
rim+0.030
icle+0.028
Neg logit contributions
 cane-0.020
bs-0.020
Scar-0.020
 arrows-0.018
 arrow-0.018
Token with
Token ID351
Feature activation+2.79e-02

Pos logit contributions
 cane+0.018
bs+0.018
Scar+0.018
Aud+0.016
 confident+0.016
Neg logit contributions
ator-0.025
urg-0.025
icles-0.024
rim-0.021
icle-0.021
Token lack
Token ID3092
Feature activation+9.43e-02

Pos logit contributions
Scar+0.043
bs+0.041
 cane+0.040
Aud+0.038
Har+0.038
Neg logit contributions
ator-0.052
urg-0.050
icles-0.049
icle-0.043
rim-0.042
Token,
Token ID11
Feature activation+3.73e-02

Pos logit contributions
Scar+0.182
 cane+0.175
nam+0.173
Har+0.173
Mic+0.169
Neg logit contributions
 them-0.113
icles-0.105
ator-0.102
 us-0.093
urg-0.091
Token we
Token ID356
Feature activation+4.77e-03

Pos logit contributions
 cane+0.056
bs+0.056
Scar+0.055
Aud+0.053
Har+0.051
Neg logit contributions
ator-0.068
urg-0.065
icles-0.065
icle-0.054
 Esk-0.053
Token must
Token ID1276
Feature activation-1.17e-02

Pos logit contributions
 cane+0.007
bs+0.007
Scar+0.007
 arrows+0.006
Aud+0.006
Neg logit contributions
ator-0.009
urg-0.009
icles-0.008
rim-0.007
zie-0.007
Token not
Token ID407
Feature activation+2.12e-02

Pos logit contributions
urg+0.021
ator+0.021
icles+0.020
rim+0.018
icle+0.017
Neg logit contributions
bs-0.019
 cane-0.019
Scar-0.018
 arrows-0.017
 arrow-0.017
Token be
Token ID307
Feature activation+3.24e-02

Pos logit contributions
Scar+0.037
 cane+0.036
bs+0.035
Aud+0.033
able+0.033
Neg logit contributions
urg-0.035
ator-0.034
icles-0.032
rim-0.029
ric-0.028
Token not
Token ID407
Feature activation+4.04e-02

Pos logit contributions
 cane+0.040
bs+0.040
Scar+0.037
 arrows+0.037
able+0.036
Neg logit contributions
urg-0.030
ator-0.030
icles-0.028
 Esk-0.025
icle-0.024
Token careful
Token ID8161
Feature activation+2.18e-03

Pos logit contributions
bs+0.062
 cane+0.061
pill+0.059
Scar+0.059
Har+0.058
Neg logit contributions
urg-0.071
ator-0.067
 Esk-0.063
icles-0.061
rim-0.055
Token.
Token ID13
Feature activation-1.25e-02

Pos logit contributions
bs+0.003
 cane+0.003
Scar+0.003
 arrows+0.003
 arrow+0.003
Neg logit contributions
urg-0.005
ator-0.005
icles-0.004
rim-0.004
icle-0.004
Token He
Token ID679
Feature activation+1.51e-02

Pos logit contributions
urg+0.026
ator+0.026
icles+0.025
rim+0.023
icle+0.022
Neg logit contributions
bs-0.017
 cane-0.016
Scar-0.016
 arrows-0.015
 arrow-0.015
Token swung
Token ID28507
Feature activation-4.92e-03

Pos logit contributions
Scar+0.024
 cane+0.023
bs+0.023
Har+0.022
Aud+0.022
Neg logit contributions
urg-0.026
ator-0.025
icles-0.025
rim-0.022
icle-0.021
Token too
Token ID1165
Feature activation+9.21e-02

Pos logit contributions
urg+0.010
ator+0.010
icles+0.010
rim+0.009
icle+0.008
Neg logit contributions
bs-0.007
 cane-0.007
Scar-0.006
 arrows-0.006
 arrow-0.006
Token hard
Token ID1327
Feature activation+4.60e-02

Pos logit contributions
nam+0.176
Scar+0.171
Aud+0.170
Jennifer+0.167
pill+0.164
Neg logit contributions
 now-0.096
ator-0.091
 them-0.091
icles-0.086
 next-0.084
Token and
Token ID290
Feature activation+3.10e-02

Pos logit contributions
Scar+0.075
bs+0.074
Aud+0.072
 cane+0.072
Jennifer+0.071
Neg logit contributions
ator-0.076
urg-0.076
icles-0.074
icle-0.063
io-0.058
Token missed
Token ID6825
Feature activation-1.90e-02

Pos logit contributions
 cane+0.048
Scar+0.046
Har+0.046
Michelle+0.045
bs+0.043
Neg logit contributions
urg-0.052
icles-0.051
ator-0.048
 overfl-0.045
icle-0.044
Token the
Token ID262
Feature activation+2.10e-02

Pos logit contributions
urg+0.038
ator+0.037
icles+0.035
rim+0.033
icle+0.030
Neg logit contributions
 cane-0.028
bs-0.028
Scar-0.026
 arrows-0.026
 arrow-0.025
Token branch
Token ID8478
Feature activation+2.06e-02

Pos logit contributions
bs+0.025
Scar+0.024
Har+0.022
 cane+0.021
Aud+0.020
Neg logit contributions
ator-0.045
urg-0.045
icles-0.044
icle-0.040
 ig-0.039
Token pieces
Token ID5207
Feature activation+4.10e-02

Pos logit contributions
bs+0.001
 cane+0.001
Scar+0.001
 arrows+0.001
 confident+0.001
Neg logit contributions
urg-0.002
ator-0.002
icles-0.002
rim-0.002
icle-0.002
Token.
Token ID13
Feature activation+8.08e-03

Pos logit contributions
Scar+0.070
Aud+0.065
Har+0.065
Michelle+0.063
 cane+0.062
Neg logit contributions
ator-0.070
icles-0.069
urg-0.068
rim-0.059
icle-0.058
Token Tom
Token ID4186
Feature activation+1.39e-02

Pos logit contributions
 cane+0.013
bs+0.013
Scar+0.012
 arrows+0.012
 arrow+0.012
Neg logit contributions
urg-0.015
icles-0.014
ator-0.014
icle-0.012
rim-0.012
Token fell
Token ID3214
Feature activation-5.83e-03

Pos logit contributions
Scar+0.019
 cane+0.019
bs+0.019
Aud+0.017
Har+0.017
Neg logit contributions
urg-0.028
ator-0.027
icles-0.026
rim-0.023
icle-0.023
Token down
Token ID866
Feature activation+4.68e-03

Pos logit contributions
urg+0.011
ator+0.011
icles+0.010
rim+0.009
icle+0.009
Neg logit contributions
bs-0.009
 cane-0.009
Scar-0.009
 arrows-0.008
 arrow-0.008
Token too
Token ID1165
Feature activation+9.21e-02

Pos logit contributions
bs+0.006
 cane+0.006
Scar+0.005
 arrows+0.005
Aud+0.005
Neg logit contributions
urg-0.010
ator-0.010
icles-0.010
rim-0.009
icle-0.009
Token.
Token ID13
Feature activation-5.53e-03

Pos logit contributions
Aud+0.208
Scar+0.202
pill+0.193
Jennifer+0.192
 stamps+0.185
Neg logit contributions
urg-0.078
ator-0.077
 now-0.076
icles-0.075
 side-0.070
Token He
Token ID679
Feature activation+2.73e-03

Pos logit contributions
urg+0.011
ator+0.011
icles+0.011
rim+0.010
icle+0.009
Neg logit contributions
 cane-0.007
bs-0.007
Scar-0.007
 arrows-0.007
 arrow-0.007
Token was
Token ID373
Feature activation+1.09e-03

Pos logit contributions
Scar+0.003
 cane+0.003
bs+0.003
Aud+0.003
 arrows+0.003
Neg logit contributions
urg-0.006
ator-0.006
icles-0.006
rim-0.005
icle-0.005
Token cold
Token ID4692
Feature activation+1.59e-02

Pos logit contributions
bs+0.001
 cane+0.001
Scar+0.001
 arrows+0.001
Har+0.001
Neg logit contributions
urg-0.002
ator-0.002
icles-0.002
rim-0.002
icle-0.002
Token and
Token ID290
Feature activation-1.46e-02

Pos logit contributions
Scar+0.021
bs+0.021
 cane+0.020
 arrows+0.019
Aud+0.018
Neg logit contributions
urg-0.034
ator-0.033
icles-0.033
icle-0.029
rim-0.029
Token
Token ID220
Feature activation+5.92e-03

Pos logit contributions
ator+0.078
urg+0.078
rim+0.073
icles+0.072
icle+0.065
Neg logit contributions
 cane-0.037
bs-0.036
Scar-0.035
 arrows-0.035
 arrow-0.033
Token
Token ID198
Feature activation-1.40e-02

Pos logit contributions
 cane+0.007
bs+0.007
Scar+0.007
 arrows+0.006
 arrow+0.006
Neg logit contributions
urg-0.013
ator-0.013
icles-0.013
rim-0.011
icle-0.011
TokenThe
Token ID464
Feature activation+1.80e-02

Pos logit contributions
urg+0.034
ator+0.034
icles+0.033
rim+0.030
icle+0.029
Neg logit contributions
bs-0.014
 cane-0.014
Scar-0.013
 arrows-0.012
Har-0.011
Token thread
Token ID4704
Feature activation+2.58e-02

Pos logit contributions
bs+0.032
Scar+0.030
 cane+0.029
Aud+0.028
Har+0.028
Neg logit contributions
ator-0.029
urg-0.028
icles-0.027
icle-0.024
zie-0.023
Token was
Token ID373
Feature activation+1.20e-02

Pos logit contributions
Scar+0.047
 cane+0.044
Har+0.044
bs+0.044
Aud+0.043
Neg logit contributions
ator-0.041
urg-0.040
icles-0.039
rim-0.033
icle-0.033
Token too
Token ID1165
Feature activation+9.16e-02

Pos logit contributions
bs+0.016
 cane+0.016
Scar+0.015
 arrows+0.014
Har+0.014
Neg logit contributions
urg-0.025
ator-0.024
icles-0.024
rim-0.021
icle-0.021
Token strong
Token ID1913
Feature activation+4.58e-02

Pos logit contributions
 Bow+0.160
Jennifer+0.159
 jury+0.159
 Heidi+0.158
Aud+0.157
Neg logit contributions
urg-0.110
ie-0.103
 setting-0.094
 hyd-0.092
ail-0.092
Token and
Token ID290
Feature activation-1.44e-03

Pos logit contributions
Scar+0.086
nam+0.083
bs+0.082
Har+0.081
Jennifer+0.079
Neg logit contributions
urg-0.069
icles-0.066
ator-0.065
icle-0.056
 setting-0.055
Token Jay
Token ID9180
Feature activation+1.88e-02

Pos logit contributions
urg+0.003
ator+0.003
icles+0.003
rim+0.002
icle+0.002
Neg logit contributions
bs-0.002
 cane-0.002
Scar-0.002
 arrows-0.002
Har-0.002
Token couldn
Token ID3521
Feature activation+2.97e-02

Pos logit contributions
 cane+0.033
Scar+0.032
bs+0.032
Aud+0.030
 arrows+0.030
Neg logit contributions
urg-0.030
ator-0.029
icles-0.028
rim-0.024
io-0.023
Tokenâ
Token ID22940
Feature activation+6.47e-03

Pos logit contributions
bs+0.066
 cane+0.064
nam+0.063
 arrows+0.062
Scar+0.061
Neg logit contributions
ator-0.037
icles-0.035
urg-0.035
icle-0.029
zie-0.025
Token down
Token ID866
Feature activation+4.19e-03

Pos logit contributions
bs+0.010
Scar+0.010
 cane+0.010
 arrows+0.009
Jennifer+0.009
Neg logit contributions
urg-0.011
ator-0.010
icles-0.010
icle-0.009
rim-0.008
Token.
Token ID13
Feature activation+1.05e-02

Pos logit contributions
bs+0.006
 cane+0.006
Scar+0.006
 arrows+0.005
Aud+0.005
Neg logit contributions
urg-0.009
ator-0.009
icles-0.008
rim-0.007
icle-0.007
Token Tom
Token ID4186
Feature activation+6.74e-03

Pos logit contributions
 cane+0.017
bs+0.017
Scar+0.016
 arrows+0.016
able+0.015
Neg logit contributions
urg-0.019
ator-0.018
icles-0.018
icle-0.016
rim-0.015
Token falls
Token ID8953
Feature activation+7.38e-03

Pos logit contributions
bs+0.010
 cane+0.010
Scar+0.010
 arrows+0.009
Aud+0.009
Neg logit contributions
urg-0.013
ator-0.013
icles-0.012
rim-0.011
ric-0.011
Token down
Token ID866
Feature activation+1.44e-02

Pos logit contributions
bs+0.013
Scar+0.012
 cane+0.012
 arrows+0.012
Aud+0.011
Neg logit contributions
urg-0.012
ator-0.012
icles-0.012
icle-0.010
rim-0.010
Token too
Token ID1165
Feature activation+9.05e-02

Pos logit contributions
bs+0.020
 cane+0.019
Scar+0.019
Aud+0.017
 arrows+0.017
Neg logit contributions
ator-0.029
urg-0.029
icles-0.028
icle-0.025
rim-0.025
Token.
Token ID13
Feature activation-1.42e-03

Pos logit contributions
Aud+0.189
nam+0.185
Scar+0.182
 stamps+0.182
pill+0.173
Neg logit contributions
ator-0.091
 now-0.090
urg-0.089
icles-0.087
 himself-0.079
Token He
Token ID679
Feature activation+7.41e-03

Pos logit contributions
urg+0.003
ator+0.003
icles+0.003
rim+0.002
icle+0.002
Neg logit contributions
bs-0.002
 cane-0.002
Scar-0.002
 arrows-0.002
 arrow-0.002
Token hurts
Token ID20406
Feature activation-6.40e-03

Pos logit contributions
bs+0.009
Scar+0.009
 cane+0.009
Aud+0.008
Har+0.008
Neg logit contributions
urg-0.016
ator-0.015
icles-0.015
rim-0.013
ric-0.013
Token his
Token ID465
Feature activation+1.23e-02

Pos logit contributions
urg+0.011
ator+0.011
icles+0.010
rim+0.009
icle+0.009
Neg logit contributions
bs-0.011
 cane-0.011
Scar-0.011
 arrows-0.010
 arrow-0.010
Token knee
Token ID10329
Feature activation+3.57e-02

Pos logit contributions
bs+0.023
Scar+0.023
Aud+0.021
able+0.020
 cane+0.020
Neg logit contributions
urg-0.020
ator-0.020
icles-0.019
icle-0.016
 ig-0.016
Token said
Token ID531
Feature activation-3.15e-03

Pos logit contributions
bs+0.031
 cane+0.030
Scar+0.030
 arrows+0.029
Har+0.028
Neg logit contributions
urg-0.035
ator-0.035
icles-0.033
rim-0.030
zie-0.029
Token,
Token ID11
Feature activation-1.52e-02

Pos logit contributions
urg+0.007
ator+0.007
icles+0.007
rim+0.006
icle+0.006
Neg logit contributions
 cane-0.004
bs-0.004
Scar-0.004
 arrows-0.003
 arrow-0.003
Token holding
Token ID4769
Feature activation-4.87e-03

Pos logit contributions
urg+0.030
ator+0.030
icles+0.028
rim+0.026
icle+0.024
Neg logit contributions
 cane-0.022
bs-0.022
Scar-0.021
 arrows-0.020
 arrow-0.019
Token up
Token ID510
Feature activation-3.77e-03

Pos logit contributions
urg+0.010
ator+0.010
icles+0.009
rim+0.009
icle+0.008
Neg logit contributions
bs-0.007
 cane-0.007
Scar-0.006
 arrows-0.006
 arrow-0.006
Token the
Token ID262
Feature activation+7.50e-02

Pos logit contributions
urg+0.008
ator+0.008
icles+0.008
rim+0.007
icle+0.007
Neg logit contributions
 cane-0.005
bs-0.005
Scar-0.004
 arrows-0.004
 arrow-0.004
Token large
Token ID1588
Feature activation+8.97e-02

Pos logit contributions
 Jerry+0.086
bs+0.085
old+0.082
Har+0.081
Terry+0.080
Neg logit contributions
ator-0.129
 edge-0.121
 side-0.121
 surface-0.119
icles-0.118
Token knot
Token ID29654
Feature activation+2.59e-02

Pos logit contributions
Har+0.118
Terry+0.116
bs+0.113
Mic+0.111
ver+0.110
Neg logit contributions
ator-0.144
 side-0.142
icles-0.138
 surface-0.137
 size-0.136
Token.
Token ID13
Feature activation-5.17e-03

Pos logit contributions
Scar+0.036
bs+0.036
 cane+0.035
Har+0.033
 arrows+0.032
Neg logit contributions
ator-0.051
urg-0.050
icles-0.049
icle-0.043
rim-0.043
Token
Token ID198
Feature activation-2.82e-03

Pos logit contributions
urg+0.011
ator+0.011
icles+0.010
rim+0.009
icle+0.009
Neg logit contributions
bs-0.007
 cane-0.007
Scar-0.007
 arrows-0.006
 arrow-0.006
Token
Token ID198
Feature activation-9.46e-04

Pos logit contributions
urg+0.006
ator+0.006
icles+0.006
rim+0.006
icle+0.005
Neg logit contributions
 cane-0.003
bs-0.003
Scar-0.003
 arrows-0.003
 arrow-0.003
TokenHer
Token ID9360
Feature activation-6.46e-04

Pos logit contributions
urg+0.002
ator+0.002
icles+0.002
rim+0.002
icle+0.002
Neg logit contributions
 cane-0.001
bs-0.001
Scar-0.001
 arrows-0.001
 arrow-0.001
Token She
Token ID1375
Feature activation+1.31e-02

Pos logit contributions
urg+0.003
ator+0.003
icles+0.003
rim+0.003
icle+0.003
Neg logit contributions
 cane-0.002
bs-0.002
Scar-0.002
 arrows-0.002
 arrow-0.002
Token was
Token ID373
Feature activation-1.54e-02

Pos logit contributions
 cane+0.019
bs+0.019
Scar+0.019
 arrows+0.017
Har+0.017
Neg logit contributions
urg-0.025
ator-0.024
icles-0.023
rim-0.021
zie-0.020
Token too
Token ID1165
Feature activation+6.63e-02

Pos logit contributions
urg+0.034
ator+0.033
icles+0.032
rim+0.029
icle+0.028
Neg logit contributions
bs-0.019
 cane-0.019
Scar-0.018
 confident-0.017
 arrows-0.017
Token fast
Token ID3049
Feature activation+3.20e-02

Pos logit contributions
 arrows+0.111
 arrow+0.098
 cane+0.097
Scar+0.096
Aud+0.096
Neg logit contributions
urg-0.105
ator-0.097
rim-0.097
 Esk-0.091
allows-0.090
Token and
Token ID290
Feature activation+3.41e-02

Pos logit contributions
Scar+0.048
bs+0.048
nam+0.046
 cane+0.046
Har+0.045
Neg logit contributions
urg-0.057
ator-0.055
icles-0.055
icle-0.047
zie-0.046
Token too
Token ID1165
Feature activation+8.96e-02

Pos logit contributions
 cane+0.043
Scar+0.043
Har+0.043
bs+0.039
Mic+0.038
Neg logit contributions
urg-0.064
icles-0.064
ator-0.063
zie-0.058
icle-0.055
Token high
Token ID1029
Feature activation+6.38e-03

Pos logit contributions
Mic+0.131
 arrows+0.126
nam+0.124
Scar+0.124
Har+0.122
Neg logit contributions
urg-0.120
ie-0.116
ator-0.114
icles-0.113
 M-0.112
Token.
Token ID13
Feature activation+2.22e-02

Pos logit contributions
bs+0.009
 cane+0.009
Scar+0.008
 arrows+0.008
 arrow+0.008
Neg logit contributions
urg-0.013
ator-0.013
icles-0.013
icle-0.011
rim-0.011
Token
Token ID198
Feature activation+2.10e-02

Pos logit contributions
 cane+0.030
bs+0.029
 arrows+0.028
Scar+0.028
 arrow+0.027
Neg logit contributions
icles-0.043
ator-0.043
urg-0.043
zie-0.038
icle-0.038
Token
Token ID198
Feature activation+2.76e-02

Pos logit contributions
 cane+0.031
bs+0.029
 arrows+0.028
Scar+0.028
 arrow+0.027
Neg logit contributions
ator-0.040
urg-0.039
icles-0.038
rim-0.035
zie-0.034
TokenThey
Token ID2990
Feature activation+9.34e-03

Pos logit contributions
 cane+0.043
bs+0.041
 arrows+0.039
 stamps+0.038
 arrow+0.038
Neg logit contributions
ator-0.048
urg-0.045
icles-0.044
rim-0.043
zie-0.040
Token every
Token ID790
Feature activation+2.14e-02

Pos logit contributions
bs+0.002
Scar+0.002
 cane+0.002
Aud+0.002
Har+0.002
Neg logit contributions
urg-0.003
ator-0.003
icles-0.003
icle-0.003
rim-0.003
Token day
Token ID1110
Feature activation+1.72e-02

Pos logit contributions
Scar+0.045
bs+0.044
nam+0.043
 arrows+0.042
 cane+0.042
Neg logit contributions
urg-0.025
ator-0.025
icles-0.024
icle-0.021
rim-0.020
Token,
Token ID11
Feature activation+4.33e-03

Pos logit contributions
Scar+0.024
bs+0.023
 cane+0.023
Aud+0.022
Jennifer+0.022
Neg logit contributions
icles-0.034
urg-0.034
ator-0.032
icle-0.030
 Esk-0.029
Token he
Token ID339
Feature activation+2.05e-02

Pos logit contributions
bs+0.005
Scar+0.005
 cane+0.005
Aud+0.005
Har+0.005
Neg logit contributions
urg-0.009
ator-0.009
icles-0.009
icle-0.008
rim-0.008
Token was
Token ID373
Feature activation+3.35e-02

Pos logit contributions
Scar+0.037
 cane+0.036
bs+0.036
Aud+0.035
Har+0.034
Neg logit contributions
urg-0.031
ator-0.030
icles-0.028
rim-0.025
ric-0.024
Token too
Token ID1165
Feature activation+8.88e-02

Pos logit contributions
bs+0.050
 cane+0.049
Scar+0.049
 arrows+0.046
able+0.046
Neg logit contributions
urg-0.060
ator-0.057
icles-0.052
ric-0.052
 setting-0.049
Token weak
Token ID4939
Feature activation+3.16e-02

Pos logit contributions
bench+0.144
 arrows+0.143
ird+0.143
Scar+0.142
 Faith+0.142
Neg logit contributions
urg-0.130
 hyd-0.108
ric-0.107
ipples-0.107
ie-0.106
Token and
Token ID290
Feature activation+5.17e-03

Pos logit contributions
Scar+0.059
bs+0.056
Jennifer+0.054
nam+0.054
Har+0.054
Neg logit contributions
urg-0.048
icles-0.047
ator-0.046
icle-0.040
zie-0.038
Token fell
Token ID3214
Feature activation+1.52e-02

Pos logit contributions
Scar+0.007
 cane+0.007
bs+0.007
Har+0.007
Aud+0.006
Neg logit contributions
urg-0.010
ator-0.010
icles-0.010
rim-0.009
icle-0.009
Token straight
Token ID3892
Feature activation-1.54e-02

Pos logit contributions
bs+0.026
Scar+0.026
 cane+0.025
Jennifer+0.024
Har+0.024
Neg logit contributions
urg-0.026
icles-0.025
ator-0.025
icle-0.021
zie-0.020
Token back
Token ID736
Feature activation+1.56e-02

Pos logit contributions
urg+0.032
ator+0.032
icles+0.031
rim+0.028
icle+0.026
Neg logit contributions
 cane-0.021
bs-0.020
Scar-0.020
 arrows-0.019
 arrow-0.018
Token and
Token ID290
Feature activation+7.51e-03

Pos logit contributions
Scar+0.050
bs+0.047
 cane+0.047
Har+0.045
Jennifer+0.044
Neg logit contributions
ator-0.053
urg-0.053
icles-0.050
rim-0.047
fl-0.044
Token interesting
Token ID3499
Feature activation-2.08e-02

Pos logit contributions
Scar+0.011
bs+0.011
 cane+0.011
Har+0.010
Aud+0.010
Neg logit contributions
urg-0.014
ator-0.014
icles-0.013
rim-0.012
zie-0.011
Token plants
Token ID6134
Feature activation+1.11e-02

Pos logit contributions
urg+0.038
ator+0.038
icles+0.036
rim+0.032
icle+0.031
Neg logit contributions
 cane-0.033
bs-0.033
 arrows-0.032
Scar-0.031
 arrow-0.031
Token.
Token ID13
Feature activation+5.46e-02

Pos logit contributions
Scar+0.015
bs+0.015
 cane+0.015
Har+0.014
 arrows+0.013
Neg logit contributions
urg-0.022
ator-0.022
icles-0.021
rim-0.019
icle-0.018
Token One
Token ID1881
Feature activation-6.39e-03

Pos logit contributions
bs+0.092
 tangled+0.083
able+0.082
br+0.081
ner+0.080
Neg logit contributions
icles-0.082
urg-0.076
zie-0.075
ator-0.072
 Later-0.071
Token time
Token ID640
Feature activation+8.85e-02

Pos logit contributions
urg+0.010
ator+0.010
icles+0.009
rim+0.008
icle+0.008
Neg logit contributions
bs-0.012
 cane-0.012
Scar-0.012
 arrows-0.011
 arrow-0.011
Token she
Token ID673
Feature activation-1.16e-02

Pos logit contributions
powers+0.149
Mic+0.137
pill+0.137
azz+0.134
 pencil+0.131
Neg logit contributions
 them-0.122
icles-0.119
 setting-0.110
 him-0.095
fl-0.094
Token noticed
Token ID6810
Feature activation+3.35e-02

Pos logit contributions
urg+0.022
ator+0.022
icles+0.021
rim+0.019
icle+0.018
Neg logit contributions
 cane-0.018
bs-0.018
Scar-0.017
 arrows-0.016
 arrow-0.016
Token something
Token ID1223
Feature activation+1.58e-02

Pos logit contributions
bs+0.049
Scar+0.048
 cane+0.045
powers+0.044
 confident+0.043
Neg logit contributions
icles-0.064
urg-0.063
ator-0.062
icle-0.054
zie-0.053
Token strange
Token ID6283
Feature activation-1.74e-03

Pos logit contributions
Scar+0.023
bs+0.022
 cane+0.022
Har+0.020
Aud+0.020
Neg logit contributions
urg-0.030
ator-0.030
icles-0.029
rim-0.026
icle-0.025
Token in
Token ID287
Feature activation+2.61e-02

Pos logit contributions
urg+0.004
ator+0.004
icles+0.003
rim+0.003
icle+0.003
Neg logit contributions
bs-0.002
 cane-0.002
Scar-0.002
 arrows-0.002
 arrow-0.002
Token You
Token ID921
Feature activation+1.69e-02

Pos logit contributions
 cane+0.039
Scar+0.038
bs+0.038
 arrows+0.036
 confident+0.036
Neg logit contributions
urg-0.029
ator-0.028
icles-0.028
icle-0.024
rim-0.023
Token have
Token ID423
Feature activation+1.35e-02

Pos logit contributions
Scar+0.027
 cane+0.027
Aud+0.025
bs+0.025
 confident+0.025
Neg logit contributions
urg-0.029
ator-0.029
icles-0.028
rim-0.024
icle-0.024
Token to
Token ID284
Feature activation+9.69e-03

Pos logit contributions
bs+0.022
Scar+0.022
 cane+0.022
 arrows+0.020
 confident+0.020
Neg logit contributions
urg-0.023
icles-0.022
ator-0.022
icle-0.019
rim-0.019
Token respect
Token ID2461
Feature activation+1.29e-02

Pos logit contributions
bs+0.018
 cane+0.018
Scar+0.017
 arrows+0.017
powers+0.016
Neg logit contributions
urg-0.015
ator-0.015
icles-0.014
rim-0.013
icle-0.012
Token the
Token ID262
Feature activation-1.35e-02

Pos logit contributions
bs+0.021
Scar+0.021
 cane+0.020
 confident+0.019
Aud+0.019
Neg logit contributions
ator-0.022
urg-0.022
icles-0.022
rim-0.019
icle-0.019
Token rules
Token ID3173
Feature activation+8.82e-02

Pos logit contributions
urg+0.027
ator+0.026
icles+0.025
rim+0.023
icle+0.022
Neg logit contributions
 cane-0.020
bs-0.020
Scar-0.019
 arrows-0.018
 arrow-0.018
Token and
Token ID290
Feature activation+9.30e-03

Pos logit contributions
Scar+0.160
 cane+0.155
Aud+0.151
 arrows+0.148
Har+0.147
Neg logit contributions
ator-0.136
icles-0.129
urg-0.108
 us-0.105
icle-0.104
Token stay
Token ID2652
Feature activation-4.71e-03

Pos logit contributions
Scar+0.015
bs+0.015
 cane+0.015
Har+0.014
Aud+0.014
Neg logit contributions
urg-0.016
ator-0.016
icles-0.015
rim-0.013
icle-0.013
Token away
Token ID1497
Feature activation+1.88e-02

Pos logit contributions
urg+0.009
ator+0.008
icles+0.008
rim+0.007
icle+0.007
Neg logit contributions
 cane-0.008
bs-0.008
Scar-0.007
 arrows-0.007
 arrow-0.007
Token from
Token ID422
Feature activation-1.22e-02

Pos logit contributions
Scar+0.033
bs+0.033
 cane+0.032
Aud+0.031
 arrows+0.030
Neg logit contributions
urg-0.032
ator-0.031
icles-0.029
rim-0.026
 Esk-0.024
Token places
Token ID4113
Feature activation+5.27e-02

Pos logit contributions
urg+0.024
ator+0.023
icles+0.022
rim+0.020
icle+0.019
Neg logit contributions
 cane-0.018
bs-0.018
Scar-0.017
 arrows-0.016
 arrow-0.016
Token to
Token ID284
Feature activation+2.00e-02

Pos logit contributions
bs+0.024
Scar+0.023
 cane+0.022
able+0.021
Aud+0.021
Neg logit contributions
urg-0.023
icles-0.023
ator-0.022
icle-0.019
zie-0.019
Token climb
Token ID12080
Feature activation+7.67e-04

Pos logit contributions
bs+0.033
Scar+0.032
 cane+0.030
able+0.030
Aud+0.029
Neg logit contributions
ator-0.036
urg-0.036
icles-0.034
icle-0.030
zie-0.030
Token down
Token ID866
Feature activation+9.64e-03

Pos logit contributions
bs+0.001
 cane+0.001
Scar+0.001
 arrows+0.001
 confident+0.001
Neg logit contributions
ator-0.001
urg-0.001
icles-0.001
icle-0.001
rim-0.001
Token but
Token ID475
Feature activation+6.34e-03

Pos logit contributions
bs+0.013
Scar+0.013
 cane+0.012
Aud+0.011
Har+0.011
Neg logit contributions
ator-0.020
urg-0.020
icles-0.020
icle-0.017
zie-0.017
Token was
Token ID373
Feature activation+2.03e-02

Pos logit contributions
Scar+0.009
 cane+0.009
bs+0.009
 arrows+0.008
Har+0.008
Neg logit contributions
ator-0.012
urg-0.012
icles-0.012
icle-0.010
zie-0.010
Token too
Token ID1165
Feature activation+8.82e-02

Pos logit contributions
bs+0.037
Scar+0.037
 cane+0.036
Har+0.035
 arrows+0.034
Neg logit contributions
urg-0.030
ator-0.030
icles-0.029
icle-0.025
zie-0.025
Token dizz
Token ID29527
Feature activation+1.39e-02

Pos logit contributions
 arrows+0.136
 brace+0.136
Aud+0.132
 stamps+0.131
 Faith+0.129
Neg logit contributions
urg-0.128
 Esk-0.120
ator-0.118
zie-0.115
icles-0.113
Tokeny
Token ID88
Feature activation+6.92e-03

Pos logit contributions
 cane+0.027
Scar+0.027
bs+0.027
 arrows+0.026
 arrow+0.026
Neg logit contributions
urg-0.021
ator-0.020
icles-0.020
rim-0.017
icle-0.016
Token from
Token ID422
Feature activation-5.37e-03

Pos logit contributions
bs+0.009
Scar+0.009
 cane+0.009
 arrows+0.009
Aud+0.008
Neg logit contributions
urg-0.014
ator-0.014
icles-0.013
rim-0.012
icle-0.012
Token the
Token ID262
Feature activation+1.43e-02

Pos logit contributions
urg+0.011
ator+0.011
icles+0.010
rim+0.009
icle+0.009
Neg logit contributions
bs-0.008
 cane-0.008
Scar-0.008
 arrows-0.007
 arrow-0.007
Token height
Token ID6001
Feature activation+2.79e-02

Pos logit contributions
bs+0.020
Scar+0.019
 cane+0.018
Har+0.017
Aud+0.017
Neg logit contributions
ator-0.029
urg-0.028
icles-0.028
icle-0.024
zie-0.024
Token down
Token ID866
Feature activation-6.73e-03

Pos logit contributions
urg+0.002
ator+0.002
icles+0.002
rim+0.001
icle+0.001
Neg logit contributions
bs-0.001
 cane-0.001
Scar-0.001
 arrows-0.001
 confident-0.001
Token.
Token ID13
Feature activation+1.23e-02

Pos logit contributions
urg+0.015
ator+0.015
icles+0.015
rim+0.013
icle+0.013
Neg logit contributions
bs-0.008
 cane-0.008
Scar-0.008
 arrows-0.007
 arrow-0.007
Token Ben
Token ID3932
Feature activation+3.74e-02

Pos logit contributions
 cane+0.018
bs+0.017
Scar+0.017
 arrows+0.016
 arrow+0.016
Neg logit contributions
urg-0.024
icles-0.023
ator-0.023
icle-0.020
rim-0.020
Token fell
Token ID3214
Feature activation-1.80e-03

Pos logit contributions
Scar+0.055
bs+0.052
nam+0.052
 cane+0.052
Har+0.052
Neg logit contributions
urg-0.064
icles-0.058
ator-0.057
rim-0.053
 overfl-0.053
Token down
Token ID866
Feature activation-1.77e-04

Pos logit contributions
urg+0.003
ator+0.003
icles+0.003
rim+0.003
icle+0.003
Neg logit contributions
bs-0.003
 cane-0.003
Scar-0.003
 arrows-0.003
 arrow-0.003
Token too
Token ID1165
Feature activation+8.79e-02

Pos logit contributions
urg+0.000
ator+0.000
icles+0.000
rim+0.000
icle+0.000
Neg logit contributions
bs-0.000
 cane-0.000
Scar-0.000
 arrows-0.000
 arrow-0.000
Token.
Token ID13
Feature activation-4.52e-03

Pos logit contributions
Aud+0.214
Scar+0.210
nam+0.208
Jennifer+0.202
 stamps+0.197
Neg logit contributions
urg-0.056
ator-0.053
icles-0.053
 now-0.052
 Esk-0.049
Token He
Token ID679
Feature activation+1.11e-02

Pos logit contributions
urg+0.010
ator+0.010
icles+0.010
rim+0.009
icle+0.008
Neg logit contributions
 cane-0.005
bs-0.005
Scar-0.005
 arrows-0.005
 arrow-0.005
Token hit
Token ID2277
Feature activation-6.28e-03

Pos logit contributions
Scar+0.014
 cane+0.013
bs+0.013
Aud+0.012
 confident+0.012
Neg logit contributions
urg-0.023
icles-0.022
ator-0.022
rim-0.020
icle-0.019
Token his
Token ID465
Feature activation-2.05e-02

Pos logit contributions
urg+0.014
ator+0.014
icles+0.013
rim+0.012
icle+0.012
Neg logit contributions
bs-0.008
 cane-0.008
Scar-0.007
 arrows-0.007
 arrow-0.007
Token head
Token ID1182
Feature activation+1.03e-02

Pos logit contributions
urg+0.033
ator+0.033
icles+0.031
rim+0.029
zie+0.026
Neg logit contributions
 cane-0.038
bs-0.036
Scar-0.035
 arrow-0.035
 string-0.034
TokenOne
Token ID3198
Feature activation+5.69e-03

Pos logit contributions
urg+0.011
ator+0.011
icles+0.010
rim+0.009
icle+0.009
Neg logit contributions
 cane-0.007
bs-0.007
Scar-0.007
 arrows-0.006
 arrow-0.006
Token day
Token ID1110
Feature activation+2.41e-02

Pos logit contributions
Scar+0.012
 cane+0.012
bs+0.012
 arrows+0.011
Har+0.011
Neg logit contributions
urg-0.007
ator-0.007
icles-0.007
rim-0.006
icle-0.006
Token,
Token ID11
Feature activation-1.03e-03

Pos logit contributions
 cane+0.033
Scar+0.032
 confident+0.030
powers+0.030
bench+0.030
Neg logit contributions
urg-0.046
icles-0.045
ator-0.044
icle-0.039
io-0.038
Token she
Token ID673
Feature activation+5.83e-03

Pos logit contributions
urg+0.002
ator+0.002
icles+0.002
rim+0.002
icle+0.002
Neg logit contributions
 cane-0.001
bs-0.001
Scar-0.001
 arrows-0.001
 arrow-0.001
Token bounced
Token ID31318
Feature activation+3.71e-03

Pos logit contributions
 cane+0.009
bs+0.009
Scar+0.009
 arrows+0.008
Aud+0.008
Neg logit contributions
urg-0.010
ator-0.010
icles-0.010
rim-0.009
zie-0.008
Token too
Token ID1165
Feature activation+8.77e-02

Pos logit contributions
bs+0.006
 cane+0.006
Scar+0.006
Har+0.005
 arrows+0.005
Neg logit contributions
urg-0.006
ator-0.006
icles-0.006
icle-0.005
rim-0.005
Token hard
Token ID1327
Feature activation+4.40e-02

Pos logit contributions
nam+0.161
Scar+0.160
 stamps+0.157
 jury+0.156
Jennifer+0.153
Neg logit contributions
 now-0.104
ator-0.088
icles-0.087
ie-0.087
 soon-0.086
Token and
Token ID290
Feature activation+1.60e-02

Pos logit contributions
Scar+0.080
nam+0.078
Jennifer+0.077
bs+0.076
 cane+0.076
Neg logit contributions
urg-0.066
icles-0.061
ator-0.061
fl-0.052
icle-0.051
Token fell
Token ID3214
Feature activation+5.39e-03

Pos logit contributions
 cane+0.024
Scar+0.023
bs+0.023
Har+0.021
able+0.021
Neg logit contributions
urg-0.029
icles-0.029
ator-0.029
zie-0.025
icle-0.025
Token off
Token ID572
Feature activation+1.14e-02

Pos logit contributions
bs+0.009
Scar+0.009
 cane+0.009
 arrows+0.008
Har+0.008
Neg logit contributions
urg-0.009
ator-0.009
icles-0.009
icle-0.008
rim-0.008
Token the
Token ID262
Feature activation+1.49e-02

Pos logit contributions
bs+0.016
Scar+0.016
 cane+0.016
 arrows+0.015
Har+0.015
Neg logit contributions
ator-0.022
urg-0.022
icles-0.021
icle-0.019
zie-0.018
Token want
Token ID765
Feature activation-5.31e-03

Pos logit contributions
Scar+0.037
 cane+0.037
bs+0.035
 arrows+0.035
Aud+0.034
Neg logit contributions
urg-0.047
ator-0.044
icles-0.042
rim-0.041
ric-0.041
Token you
Token ID345
Feature activation+1.68e-02

Pos logit contributions
urg+0.011
ator+0.011
icles+0.010
rim+0.009
icle+0.009
Neg logit contributions
bs-0.007
 cane-0.007
Scar-0.007
 arrows-0.007
 arrow-0.007
Token to
Token ID284
Feature activation+1.32e-02

Pos logit contributions
Scar+0.027
 cane+0.026
bs+0.025
Har+0.025
 arrows+0.024
Neg logit contributions
urg-0.029
icles-0.028
ator-0.028
rim-0.024
icle-0.024
Token respect
Token ID2461
Feature activation+1.10e-03

Pos logit contributions
bs+0.024
 cane+0.024
Scar+0.024
 arrows+0.023
able+0.023
Neg logit contributions
urg-0.020
ator-0.020
icles-0.019
rim-0.016
zie-0.016
Token their
Token ID511
Feature activation-1.64e-02

Pos logit contributions
bs+0.002
 cane+0.002
Scar+0.002
 arrows+0.002
 confident+0.002
Neg logit contributions
urg-0.002
ator-0.002
icles-0.002
rim-0.002
icle-0.002
Token rules
Token ID3173
Feature activation+8.75e-02

Pos logit contributions
urg+0.030
ator+0.030
icles+0.029
rim+0.025
icle+0.025
Neg logit contributions
 cane-0.027
bs-0.027
Scar-0.025
 arrows-0.025
 arrow-0.024
Token and
Token ID290
Feature activation+1.82e-02

Pos logit contributions
Scar+0.169
Aud+0.161
bs+0.158
 cane+0.158
Mic+0.158
Neg logit contributions
ator-0.119
icles-0.114
urg-0.096
 us-0.091
 them-0.090
Token their
Token ID511
Feature activation-1.60e-02

Pos logit contributions
Scar+0.031
bs+0.030
 cane+0.029
Har+0.028
Aud+0.028
Neg logit contributions
urg-0.030
ator-0.029
icles-0.028
rim-0.025
ric-0.024
Token things
Token ID1243
Feature activation+3.98e-02

Pos logit contributions
urg+0.028
ator+0.027
icles+0.026
rim+0.023
icle+0.022
Neg logit contributions
 cane-0.028
bs-0.027
Scar-0.026
 arrows-0.025
 arrow-0.025
Token.
Token ID13
Feature activation+5.51e-04

Pos logit contributions
Scar+0.069
 cane+0.066
Har+0.065
Jennifer+0.065
Aud+0.064
Neg logit contributions
ator-0.063
icles-0.059
urg-0.059
rim-0.049
 hyd-0.049
Token If
Token ID1002
Feature activation-5.39e-03

Pos logit contributions
 cane+0.001
bs+0.001
Scar+0.001
 arrows+0.001
 arrow+0.001
Neg logit contributions
urg-0.001
ator-0.001
icles-0.001
rim-0.001
icle-0.001
Token slid
Token ID27803
Feature activation-1.91e-02

Pos logit contributions
Mic+0.075
able+0.070
powers+0.070
 cane+0.070
pill+0.070
Neg logit contributions
urg-0.067
icles-0.060
ator-0.059
zie-0.053
 hyd-0.052
Token.
Token ID13
Feature activation+1.38e-02

Pos logit contributions
urg+0.039
ator+0.038
icles+0.035
rim+0.035
icle+0.031
Neg logit contributions
 cane-0.028
bs-0.027
Scar-0.026
 arrows-0.025
 arrow-0.025
Token But
Token ID887
Feature activation+1.71e-02

Pos logit contributions
 cane+0.017
bs+0.017
able+0.016
 arrows+0.016
Scar+0.016
Neg logit contributions
urg-0.028
icles-0.028
ator-0.028
icle-0.024
zie-0.024
Token she
Token ID673
Feature activation-7.81e-03

Pos logit contributions
 cane+0.022
powers+0.021
bs+0.021
Scar+0.021
able+0.020
Neg logit contributions
ator-0.035
urg-0.035
icles-0.035
zie-0.030
icle-0.030
Token slid
Token ID27803
Feature activation-1.30e-02

Pos logit contributions
urg+0.015
ator+0.015
icles+0.014
rim+0.012
icle+0.012
Neg logit contributions
bs-0.012
 cane-0.012
Scar-0.012
 arrows-0.011
 arrow-0.011
Token too
Token ID1165
Feature activation+8.73e-02

Pos logit contributions
urg+0.025
ator+0.025
icles+0.023
rim+0.022
icle+0.020
Neg logit contributions
 cane-0.020
bs-0.019
Scar-0.019
 arrows-0.018
 arrow-0.018
Token fast
Token ID3049
Feature activation+2.43e-02

Pos logit contributions
Scar+0.157
pill+0.154
Aud+0.152
 stamps+0.151
Jennifer+0.151
Neg logit contributions
 now-0.102
icles-0.101
ator-0.098
 them-0.092
pl-0.091
Token and
Token ID290
Feature activation+1.83e-02

Pos logit contributions
Scar+0.037
bs+0.037
 cane+0.037
Jennifer+0.035
Har+0.035
Neg logit contributions
urg-0.043
ator-0.043
icles-0.042
icle-0.036
zie-0.034
Token lost
Token ID2626
Feature activation-6.26e-03

Pos logit contributions
 cane+0.025
Scar+0.025
Har+0.024
bs+0.023
Michelle+0.022
Neg logit contributions
urg-0.035
icles-0.034
ator-0.033
zie-0.030
icle-0.029
Token her
Token ID607
Feature activation+6.16e-03

Pos logit contributions
urg+0.014
ator+0.013
icles+0.013
rim+0.012
icle+0.011
Neg logit contributions
bs-0.008
 cane-0.008
Scar-0.008
 arrows-0.007
 arrow-0.007
Token balance
Token ID5236
Feature activation+1.80e-02

Pos logit contributions
bs+0.009
Scar+0.009
 cane+0.009
Har+0.008
Aud+0.008
Neg logit contributions
ator-0.012
urg-0.011
icles-0.011
rim-0.010
icle-0.010
Token"
Token ID1
Feature activation+1.63e-02

Pos logit contributions
bs+0.022
 cane+0.021
Scar+0.020
 confident+0.019
 arrows+0.019
Neg logit contributions
ator-0.030
urg-0.029
icles-0.027
rim-0.025
zie-0.023
TokenI
Token ID40
Feature activation+6.11e-03

Pos logit contributions
bs+0.030
 cane+0.029
Scar+0.027
 confident+0.026
 arrows+0.026
Neg logit contributions
ator-0.026
icles-0.024
urg-0.023
rim-0.022
icle-0.021
Token am
Token ID716
Feature activation+1.57e-02

Pos logit contributions
 cane+0.011
Scar+0.011
bs+0.011
 arrows+0.010
Aud+0.010
Neg logit contributions
urg-0.010
ator-0.009
icles-0.009
rim-0.008
 Esk-0.007
Token thinking
Token ID3612
Feature activation+3.12e-02

Pos logit contributions
bs+0.024
 cane+0.023
Scar+0.022
able+0.021
 arrows+0.021
Neg logit contributions
urg-0.030
ator-0.030
icles-0.027
rim-0.025
 Esk-0.025
Token about
Token ID546
Feature activation+1.61e-02

Pos logit contributions
Scar+0.053
 cane+0.051
bs+0.050
Har+0.049
Aud+0.048
Neg logit contributions
icles-0.054
urg-0.054
ator-0.053
icle-0.044
rim-0.043
Token wisdom
Token ID11501
Feature activation+8.70e-02

Pos logit contributions
bs+0.026
Scar+0.025
 cane+0.025
Aud+0.023
able+0.023
Neg logit contributions
urg-0.030
ator-0.029
icles-0.028
rim-0.024
 Esk-0.024
Token,"
Token ID553
Feature activation+2.44e-02

Pos logit contributions
Scar+0.156
Aud+0.146
Har+0.144
Jennifer+0.143
Charl+0.134
Neg logit contributions
icles-0.135
ator-0.130
ie-0.122
urg-0.118
 us-0.112
Token said
Token ID531
Feature activation+3.08e-02

Pos logit contributions
Scar+0.035
bs+0.035
 cane+0.034
powers+0.033
Aud+0.033
Neg logit contributions
urg-0.046
ator-0.045
icles-0.044
rim-0.039
icle-0.038
Token the
Token ID262
Feature activation+4.01e-02

Pos logit contributions
powers+0.051
bs+0.051
Scar+0.051
Aud+0.050
 cane+0.050
Neg logit contributions
ator-0.050
urg-0.049
icles-0.048
icle-0.043
ie-0.040
Token owl
Token ID39610
Feature activation+2.32e-02

Pos logit contributions
bs+0.068
Aud+0.062
Scar+0.059
powers+0.053
Jennifer+0.053
Neg logit contributions
ator-0.074
urg-0.067
icles-0.062
 Esk-0.060
fl-0.060
Token.
Token ID13
Feature activation-1.02e-03

Pos logit contributions
Scar+0.036
 cane+0.034
bs+0.033
Aud+0.033
Har+0.032
Neg logit contributions
urg-0.043
ator-0.043
icles-0.041
rim-0.036
icle-0.036
Token jet
Token ID12644
Feature activation+3.31e-02

Pos logit contributions
bs+0.064
Scar+0.060
Aud+0.055
 Jerry+0.053
Jennifer+0.053
Neg logit contributions
ator-0.076
urg-0.070
icles-0.067
 ig-0.066
 waterfall-0.066
Token!
Token ID0
Feature activation+6.97e-03

Pos logit contributions
bs+0.052
Scar+0.051
 cane+0.049
Har+0.047
powers+0.046
Neg logit contributions
ator-0.059
urg-0.056
icles-0.055
zie-0.048
icle-0.047
Token But
Token ID887
Feature activation+4.99e-03

Pos logit contributions
bs+0.008
 cane+0.008
Scar+0.008
 arrows+0.008
able+0.007
Neg logit contributions
ator-0.015
urg-0.015
icles-0.015
zie-0.013
icle-0.013
Token she
Token ID673
Feature activation+6.37e-03

Pos logit contributions
bs+0.007
 cane+0.007
Scar+0.007
powers+0.006
 arrows+0.006
Neg logit contributions
urg-0.010
ator-0.010
icles-0.010
zie-0.008
icle-0.008
Token was
Token ID373
Feature activation+3.39e-02

Pos logit contributions
bs+0.012
 cane+0.012
Scar+0.012
 arrows+0.011
Aud+0.011
Neg logit contributions
urg-0.010
ator-0.010
icles-0.009
rim-0.008
zie-0.008
Token too
Token ID1165
Feature activation+8.68e-02

Pos logit contributions
bs+0.060
 cane+0.057
 arrows+0.055
Scar+0.053
Mic+0.053
Neg logit contributions
ator-0.053
urg-0.052
icles-0.045
 Esk-0.044
zie-0.043
Token young
Token ID1862
Feature activation+3.08e-02

Pos logit contributions
 arrows+0.147
 brace+0.140
pill+0.138
Mic+0.136
Aud+0.134
Neg logit contributions
urg-0.117
ator-0.117
 hyd-0.102
ie-0.102
 Esk-0.100
Token.
Token ID13
Feature activation+1.61e-02

Pos logit contributions
bs+0.052
Scar+0.050
 cane+0.048
 arrows+0.047
Mic+0.047
Neg logit contributions
ator-0.049
urg-0.048
icles-0.045
rim-0.039
zie-0.039
Token One
Token ID1881
Feature activation+2.70e-02

Pos logit contributions
bs+0.021
 cane+0.021
Scar+0.020
 arrows+0.019
 arrow+0.018
Neg logit contributions
ator-0.033
icles-0.033
urg-0.032
zie-0.029
icle-0.028
Token day
Token ID1110
Feature activation+3.12e-02

Pos logit contributions
Scar+0.068
Aud+0.065
Har+0.064
Charl+0.063
bs+0.062
Neg logit contributions
ator-0.024
urg-0.022
icles-0.022
 ig-0.018
icle-0.018
Token,
Token ID11
Feature activation+2.67e-02

Pos logit contributions
 cane+0.045
powers+0.044
Scar+0.043
Aud+0.042
 arrows+0.041
Neg logit contributions
icles-0.055
urg-0.055
ator-0.055
 setting-0.047
zie-0.047
Token replied
Token ID8712
Feature activation+5.79e-03

Pos logit contributions
 cane+0.042
Scar+0.040
bs+0.039
Har+0.037
Aud+0.037
Neg logit contributions
urg-0.057
ator-0.053
icles-0.051
io-0.046
rim-0.045
Token.
Token ID13
Feature activation-9.45e-03

Pos logit contributions
 cane+0.008
bs+0.008
Scar+0.008
 arrows+0.007
Aud+0.007
Neg logit contributions
urg-0.012
ator-0.012
icles-0.011
rim-0.010
icle-0.010
Token He
Token ID679
Feature activation+2.48e-02

Pos logit contributions
urg+0.020
ator+0.020
icles+0.019
rim+0.017
icle+0.016
Neg logit contributions
bs-0.012
 cane-0.012
Scar-0.012
 arrows-0.011
 arrow-0.011
Token took
Token ID1718
Feature activation+2.78e-03

Pos logit contributions
bs+0.039
Scar+0.038
 cane+0.038
nam+0.036
Har+0.036
Neg logit contributions
urg-0.042
ator-0.042
icles-0.037
io-0.034
rim-0.033
Token a
Token ID257
Feature activation+8.20e-02

Pos logit contributions
 cane+0.004
bs+0.003
Scar+0.003
 arrows+0.003
 arrow+0.003
Neg logit contributions
urg-0.006
ator-0.006
icles-0.006
icle-0.005
rim-0.005
Token long
Token ID890
Feature activation+8.63e-02

Pos logit contributions
aces+0.124
 cheated+0.122
 reckless+0.120
pill+0.119
uts+0.118
Neg logit contributions
ail-0.098
 waterfall-0.096
 setting-0.095
ator-0.095
ie-0.094
Token piece
Token ID3704
Feature activation+2.74e-02

Pos logit contributions
nam+0.148
 reckless+0.133
 cheated+0.131
powers+0.125
 lawyers+0.121
Neg logit contributions
urg-0.129
ator-0.127
zie-0.122
icles-0.120
icle-0.119
Token of
Token ID286
Feature activation+5.67e-02

Pos logit contributions
Scar+0.043
bs+0.040
 cane+0.040
Har+0.039
nam+0.039
Neg logit contributions
urg-0.050
ator-0.050
icles-0.049
icle-0.045
rim-0.042
Token string
Token ID4731
Feature activation+3.47e-02

Pos logit contributions
Scar+0.074
Har+0.073
ver+0.071
 cheated+0.066
Terry+0.065
Neg logit contributions
icles-0.104
icle-0.098
urg-0.095
ator-0.093
asted-0.089
Token and
Token ID290
Feature activation+5.23e-02

Pos logit contributions
Scar+0.053
bs+0.051
Har+0.049
Aud+0.049
Jennifer+0.048
Neg logit contributions
icles-0.065
ator-0.065
urg-0.064
icle-0.058
rim-0.052
Token tied
Token ID8165
Feature activation+3.61e-03

Pos logit contributions
Har+0.082
Scar+0.081
 cane+0.078
able+0.076
Mic+0.074
Neg logit contributions
urg-0.082
icles-0.078
ator-0.074
 setting-0.070
bled-0.069
Token pulled
Token ID5954
Feature activation+4.52e-03

Pos logit contributions
Har+0.083
Scar+0.082
able+0.078
vern+0.078
Charl+0.078
Neg logit contributions
icles-0.073
urg-0.071
icle-0.062
ator-0.059
 overfl-0.058
Token hard
Token ID1327
Feature activation+5.88e-02

Pos logit contributions
 cane+0.007
Scar+0.007
bs+0.007
Aud+0.006
Har+0.006
Neg logit contributions
urg-0.008
ator-0.008
icles-0.008
icle-0.007
zie-0.007
Token.
Token ID13
Feature activation-1.89e-02

Pos logit contributions
Jennifer+0.119
Scar+0.115
nam+0.114
Michelle+0.111
Aud+0.109
Neg logit contributions
icles-0.077
urg-0.072
ator-0.069
icle-0.065
 them-0.063
Token Lily
Token ID20037
Feature activation+4.66e-02

Pos logit contributions
urg+0.042
ator+0.041
icles+0.039
rim+0.037
zie+0.035
Neg logit contributions
bs-0.024
 cane-0.023
Scar-0.022
 string-0.020
 arrows-0.020
Token pulled
Token ID5954
Feature activation+2.51e-02

Pos logit contributions
Scar+0.081
Har+0.080
 cane+0.078
nam+0.077
bs+0.076
Neg logit contributions
urg-0.065
ator-0.063
icles-0.061
we-0.054
 overfl-0.054
Token too
Token ID1165
Feature activation+8.63e-02

Pos logit contributions
powers+0.041
Scar+0.041
 cane+0.041
bs+0.041
Har+0.039
Neg logit contributions
icles-0.041
urg-0.039
ator-0.038
icle-0.035
 them-0.034
Token.
Token ID13
Feature activation+1.63e-02

Pos logit contributions
nam+0.164
Scar+0.159
Aud+0.156
Jennifer+0.155
 Scarlett+0.150
Neg logit contributions
icles-0.098
urg-0.098
ator-0.091
 them-0.089
 Esk-0.087
Token They
Token ID1119
Feature activation+3.40e-02

Pos logit contributions
 cane+0.025
bs+0.024
able+0.024
Scar+0.024
 arrows+0.023
Neg logit contributions
urg-0.029
icles-0.029
ator-0.029
icle-0.025
zie-0.024
Token pulled
Token ID5954
Feature activation+2.98e-02

Pos logit contributions
Scar+0.060
 cane+0.057
Har+0.056
Jennifer+0.055
Aud+0.055
Neg logit contributions
urg-0.049
icles-0.049
ator-0.045
icle-0.040
zie-0.039
Token and
Token ID290
Feature activation+4.35e-02

Pos logit contributions
Scar+0.048
powers+0.045
Har+0.045
 cane+0.045
bs+0.044
Neg logit contributions
icles-0.050
urg-0.049
ator-0.047
 them-0.043
icle-0.042
Token pulled
Token ID5954
Feature activation+2.25e-02

Pos logit contributions
 cane+0.073
Scar+0.069
Har+0.067
powers+0.066
Jennifer+0.065
Neg logit contributions
icles-0.069
urg-0.065
icle-0.058
 overfl-0.055
 setting-0.055
Token fire
Token ID2046
Feature activation-7.34e-02

Pos logit contributions
urg+0.101
icles+0.097
io+0.090
zie+0.088
ator+0.088
Neg logit contributions
 cane-0.066
 string-0.064
 long-0.063
 arrow-0.061
 stamp-0.060
Tokenwork
Token ID1818
Feature activation-3.26e-02

Pos logit contributions
urg+0.099
rim+0.091
icles+0.090
ator+0.086
io+0.084
Neg logit contributions
bs-0.127
 cane-0.121
 stamp-0.118
 arrow-0.113
-0.113
Token.
Token ID13
Feature activation-1.58e-02

Pos logit contributions
urg+0.068
ator+0.068
icles+0.066
rim+0.061
icle+0.058
Neg logit contributions
bs-0.043
 cane-0.042
Scar-0.038
 string-0.038
 confident-0.037
Token<|endoftext|>
Token ID50256
Feature activation+3.72e-03

Pos logit contributions
urg+0.039
ator+0.038
icles+0.037
rim+0.034
icle+0.032
Neg logit contributions
bs-0.016
 cane-0.016
Scar-0.015
 confident-0.013
 arrow-0.013
TokenE
Token ID36
Feature activation-1.83e-02

Pos logit contributions
bs+0.005
 cane+0.005
Scar+0.004
 arrows+0.004
 arrow+0.004
Neg logit contributions
ator-0.008
urg-0.008
icles-0.008
rim-0.007
icle-0.007
Tokendd
Token ID1860
Feature activation-1.01e-01

Pos logit contributions
urg+0.036
ator+0.035
icles+0.034
rim+0.030
icle+0.030
Neg logit contributions
 cane-0.027
bs-0.026
Scar-0.026
Har-0.023
able-0.023
Tokenie
Token ID494
Feature activation+1.39e-02

Pos logit contributions
umm+0.161
urg+0.140
ator+0.137
icles+0.132
rim+0.124
Neg logit contributions
able-0.154
 tied-0.154
bs-0.151
 cane-0.149
 arrow-0.146
Token was
Token ID373
Feature activation+1.19e-02

Pos logit contributions
bs+0.020
 cane+0.020
Scar+0.020
 arrows+0.018
Aud+0.018
Neg logit contributions
urg-0.026
ator-0.026
icles-0.024
rim-0.022
zie-0.022
Token a
Token ID257
Feature activation+1.86e-02

Pos logit contributions
bs+0.016
 cane+0.015
Scar+0.015
 arrows+0.014
powers+0.014
Neg logit contributions
urg-0.025
ator-0.025
icles-0.023
rim-0.022
icle-0.021
Token three
Token ID1115
Feature activation-9.35e-04

Pos logit contributions
bs+0.025
Scar+0.024
 arrows+0.023
 cane+0.023
Aud+0.023
Neg logit contributions
ator-0.037
urg-0.036
icles-0.033
 Esk-0.031
rim-0.031
Token year
Token ID614
Feature activation+1.76e-02

Pos logit contributions
urg+0.002
ator+0.002
icles+0.001
rim+0.001
icle+0.001
Neg logit contributions
bs-0.002
 cane-0.002
Scar-0.002
 arrows-0.001
Aud-0.001
Token shouting
Token ID19642
Feature activation-2.91e-02

Pos logit contributions
urg+0.023
ator+0.022
icles+0.021
rim+0.019
icle+0.018
Neg logit contributions
 cane-0.019
bs-0.019
Scar-0.018
 arrows-0.017
 arrow-0.017
Token "
Token ID366
Feature activation-3.11e-02

Pos logit contributions
ator+0.072
urg+0.070
icles+0.068
rim+0.066
zie+0.062
Neg logit contributions
 cane-0.029
bs-0.029
Scar-0.027
 arrows-0.026
 arrow-0.025
TokenAh
Token ID10910
Feature activation-5.85e-03

Pos logit contributions
urg+0.061
ator+0.059
icles+0.054
rim+0.052
zie+0.049
Neg logit contributions
 cane-0.045
bs-0.044
Scar-0.044
 arrows-0.042
 string-0.042
Tokenoy
Token ID726
Feature activation-1.96e-02

Pos logit contributions
urg+0.011
ator+0.011
icles+0.010
rim+0.009
icle+0.009
Neg logit contributions
 cane-0.009
bs-0.009
Scar-0.009
 arrows-0.008
 arrow-0.008
Token,
Token ID11
Feature activation-3.68e-02

Pos logit contributions
urg+0.044
ator+0.044
icles+0.042
rim+0.040
zie+0.037
Neg logit contributions
 cane-0.023
bs-0.022
Scar-0.022
 arrows-0.020
 arrow-0.019
Token mate
Token ID16133
Feature activation-9.66e-02

Pos logit contributions
urg+0.072
ator+0.072
icles+0.067
rim+0.067
zie+0.059
Neg logit contributions
 cane-0.052
 arrow-0.052
bs-0.052
 arrows-0.051
Scar-0.049
Tokeny
Token ID88
Feature activation-2.82e-02

Pos logit contributions
urg+0.195
ator+0.189
rim+0.189
icles+0.187
umm+0.175
Neg logit contributions
 cane-0.087
able-0.087
bs-0.086
ver-0.084
,-0.079
Token!"
Token ID2474
Feature activation-4.19e-02

Pos logit contributions
urg+0.056
ator+0.056
icles+0.054
rim+0.050
zie+0.047
Neg logit contributions
 cane-0.040
bs-0.038
Scar-0.037
 arrow-0.036
 arrows-0.036
Token and
Token ID290
Feature activation-6.46e-02

Pos logit contributions
urg+0.095
ator+0.094
rim+0.085
icles+0.084
urger+0.079
Neg logit contributions
bs-0.051
 Jerry-0.046
 she-0.041
 string-0.041
 cane-0.041
Token "
Token ID366
Feature activation-2.41e-02

Pos logit contributions
ator+0.163
urg+0.152
icles+0.146
rim+0.146
umm+0.144
Neg logit contributions
bs-0.067
 she-0.058
 arrow-0.056
 arrows-0.055
 short-0.054
TokenSh
Token ID2484
Feature activation-2.76e-02

Pos logit contributions
urg+0.050
ator+0.048
icles+0.046
rim+0.042
zie+0.040
Neg logit contributions
Scar-0.034
 cane-0.033
bs-0.033
 arrows-0.031
 arrow-0.031
Token the
Token ID262
Feature activation-4.43e-02

Pos logit contributions
bs+0.028
 cane+0.026
Scar+0.025
powers+0.024
able+0.024
Neg logit contributions
urg-0.035
ator-0.035
icles-0.034
icle-0.030
io-0.029
Token shore
Token ID15191
Feature activation-9.99e-03

Pos logit contributions
urg+0.085
ator+0.083
icles+0.082
rol+0.074
rim+0.070
Neg logit contributions
 cane-0.075
 arrow-0.073
 string-0.071
 arrows-0.068
 long-0.065
Token and
Token ID290
Feature activation+9.41e-03

Pos logit contributions
urg+0.024
ator+0.024
icles+0.023
rim+0.021
icle+0.021
Neg logit contributions
bs-0.010
 cane-0.010
Scar-0.009
 arrows-0.008
 arrow-0.008
Token watches
Token ID16860
Feature activation-1.32e-02

Pos logit contributions
Scar+0.014
 cane+0.014
bs+0.014
Har+0.013
 arrows+0.013
Neg logit contributions
urg-0.017
ator-0.017
icles-0.016
rim-0.014
ric-0.014
Token the
Token ID262
Feature activation-6.11e-02

Pos logit contributions
urg+0.027
ator+0.027
icles+0.026
rim+0.024
zie+0.022
Neg logit contributions
bs-0.018
 cane-0.018
Scar-0.017
 arrows-0.016
 arrow-0.016
Token sw
Token ID1509
Feature activation-9.61e-02

Pos logit contributions
ator+0.124
icles+0.123
urg+0.122
zie+0.109
icle+0.107
Neg logit contributions
 arrow-0.091
 arrows-0.089
 string-0.087
 cane-0.086
 pigeon-0.084
Tokenan
Token ID272
Feature activation-6.05e-02

Pos logit contributions
icles+0.145
ator+0.140
 Manny+0.138
why+0.127
ators+0.126
Neg logit contributions
 arrows-0.149
 arrow-0.148
 ar-0.141
ak-0.139
bs-0.138
Token.
Token ID13
Feature activation+1.68e-02

Pos logit contributions
ator+0.155
urg+0.146
icles+0.143
rim+0.135
zie+0.133
Neg logit contributions
bs-0.060
 cane-0.053
 arrows-0.051
 arrow-0.048
 string-0.046
Token She
Token ID1375
Feature activation+2.24e-03

Pos logit contributions
bs+0.021
 cane+0.021
 arrows+0.020
Scar+0.020
able+0.020
Neg logit contributions
urg-0.034
icles-0.033
ator-0.032
icle-0.029
zie-0.029
Token thinks
Token ID6834
Feature activation+1.06e-02

Pos logit contributions
bs+0.003
 cane+0.003
Scar+0.003
 arrows+0.003
 arrow+0.003
Neg logit contributions
urg-0.004
ator-0.004
icles-0.004
rim-0.003
ric-0.003
Token it
Token ID340
Feature activation-3.14e-02

Pos logit contributions
Scar+0.016
 cane+0.016
bs+0.016
Har+0.015
Aud+0.015
Neg logit contributions
urg-0.020
ator-0.019
icles-0.019
icle-0.016
rim-0.016
Token peaceful
Token ID12309
Feature activation-3.37e-02

Pos logit contributions
urg+0.043
ator+0.042
icles+0.040
rim+0.037
icle+0.036
Neg logit contributions
bs-0.020
 cane-0.020
Scar-0.018
 confident-0.017
 arrow-0.017
Token.
Token ID13
Feature activation-6.92e-05

Pos logit contributions
urg+0.087
ator+0.085
icles+0.082
rim+0.079
zie+0.075
Neg logit contributions
 cane-0.028
bs-0.026
 arrows-0.024
 confident-0.024
 arrow-0.023
Token\n
Token ID198
Feature activation+1.85e-03

Pos logit contributions
urg+0.000
ator+0.000
icles+0.000
rim+0.000
icle+0.000
Neg logit contributions
 cane-0.000
bs-0.000
Scar-0.000
 arrows-0.000
 arrow-0.000
Token\n
Token ID198
Feature activation-7.63e-03

Pos logit contributions
 cane+0.002
bs+0.002
Scar+0.002
 arrows+0.002
 arrow+0.002
Neg logit contributions
urg-0.004
ator-0.004
icles-0.004
rim-0.004
icle-0.003
TokenThe
Token ID464
Feature activation-6.13e-02

Pos logit contributions
urg+0.017
ator+0.017
icles+0.016
rim+0.015
icle+0.014
Neg logit contributions
bs-0.009
 cane-0.009
Scar-0.009
 arrows-0.008
 arrow-0.008
Token sw
Token ID1509
Feature activation-9.60e-02

Pos logit contributions
urg+0.139
icles+0.136
ator+0.131
rol+0.119
ators+0.117
Neg logit contributions
 arrow-0.081
 pigeon-0.077
 long-0.076
 arrows-0.076
 string-0.075
Tokenan
Token ID272
Feature activation-9.04e-03

Pos logit contributions
icles+0.155
riage+0.136
 Manny+0.135
icle+0.132
why+0.129
Neg logit contributions
 arrows-0.143
 arrow-0.139
 ar-0.138
powers-0.132
 pigeon-0.131
Token looked
Token ID3114
Feature activation-4.55e-03

Pos logit contributions
urg+0.018
ator+0.018
icles+0.018
rim+0.016
icle+0.015
Neg logit contributions
bs-0.012
 cane-0.012
Scar-0.012
 arrows-0.011
 arrow-0.011
Token around
Token ID1088
Feature activation-3.61e-02

Pos logit contributions
urg+0.008
ator+0.008
icles+0.007
rim+0.007
icle+0.006
Neg logit contributions
 cane-0.008
bs-0.008
Scar-0.007
 arrows-0.007
 arrow-0.007
Token and
Token ID290
Feature activation-1.74e-02

Pos logit contributions
ator+0.097
urg+0.094
icles+0.094
rim+0.087
icle+0.084
Neg logit contributions
 cane-0.030
bs-0.029
 confident-0.027
 long-0.027
 short-0.026
Token saw
Token ID2497
Feature activation-9.97e-03

Pos logit contributions
urg+0.039
ator+0.038
icles+0.037
rim+0.034
icle+0.033
Neg logit contributions
bs-0.022
 cane-0.021
Scar-0.020
 confident-0.019
 arrows-0.019
Token the
Token ID262
Feature activation-5.39e-02

Pos logit contributions
bs+0.002
 cane+0.002
Scar+0.002
 arrows+0.001
 arrow+0.001
Neg logit contributions
urg-0.003
ator-0.002
icles-0.002
rim-0.002
icle-0.002
Token pond
Token ID16723
Feature activation-4.39e-03

Pos logit contributions
urg+0.128
icles+0.124
ator+0.120
icle+0.108
zie+0.107
Neg logit contributions
 cane-0.065
 pigeon-0.063
 arrow-0.063
 string-0.061
 arrows-0.061
Token and
Token ID290
Feature activation-7.99e-03

Pos logit contributions
urg+0.009
ator+0.009
icles+0.009
rim+0.008
icle+0.008
Neg logit contributions
 cane-0.006
bs-0.006
Scar-0.006
 arrows-0.005
 arrow-0.005
Token watch
Token ID2342
Feature activation-2.96e-02

Pos logit contributions
urg+0.016
ator+0.016
icles+0.015
rim+0.013
icle+0.013
Neg logit contributions
bs-0.012
 cane-0.012
Scar-0.011
 arrows-0.010
 arrow-0.010
Token the
Token ID262
Feature activation-5.86e-02

Pos logit contributions
ator+0.066
urg+0.065
icles+0.061
rim+0.058
zie+0.055
Neg logit contributions
bs-0.037
 cane-0.036
Scar-0.033
 arrows-0.033
 arrow-0.032
Token sw
Token ID1509
Feature activation-9.51e-02

Pos logit contributions
urg+0.129
ator+0.127
icles+0.125
zie+0.115
icle+0.111
Neg logit contributions
 pigeon-0.076
 arrows-0.076
 arrow-0.073
 string-0.073
 long-0.071
Tokenans
Token ID504
Feature activation-2.39e-02

Pos logit contributions
icles+0.148
 Manny+0.146
ator+0.142
ators+0.127
icle+0.125
Neg logit contributions
 arrows-0.147
 ar-0.138
 arrow-0.137
bs-0.135
 pigeon-0.134
Token swim
Token ID9422
Feature activation-3.04e-02

Pos logit contributions
urg+0.050
ator+0.050
icles+0.048
rim+0.043
icle+0.042
Neg logit contributions
bs-0.032
 cane-0.031
Scar-0.029
 arrows-0.028
 confident-0.028
Token.
Token ID13
Feature activation-7.75e-03

Pos logit contributions
urg+0.072
ator+0.071
icles+0.068
zie+0.062
rim+0.062
Neg logit contributions
bs-0.035
 cane-0.032
 arrows-0.031
able-0.029
Scar-0.029
Token She
Token ID1375
Feature activation-1.50e-02

Pos logit contributions
urg+0.016
ator+0.016
icles+0.015
rim+0.014
icle+0.013
Neg logit contributions
bs-0.011
 cane-0.011
Scar-0.010
 arrows-0.010
 arrow-0.009
Token loved
Token ID6151
Feature activation-2.83e-02

Pos logit contributions
urg+0.030
ator+0.030
icles+0.029
rim+0.026
icle+0.025
Neg logit contributions
bs-0.021
 cane-0.021
Scar-0.021
 arrows-0.019
 arrow-0.019
Token O
Token ID440
Feature activation-2.90e-02

Pos logit contributions
ator+0.068
urg+0.068
rim+0.060
icles+0.059
ators+0.054
Neg logit contributions
 Jerry-0.049
bs-0.046
 cane-0.045
 arrow-0.045
 Bow-0.045
Tokenll
Token ID297
Feature activation-8.69e-02

Pos logit contributions
urg+0.039
icles+0.038
ator+0.038
icle+0.031
rim+0.029
Neg logit contributions
bs-0.058
Scar-0.056
 cane-0.056
Har-0.053
able-0.052
Tokenie
Token ID494
Feature activation-3.02e-02

Pos logit contributions
urg+0.120
ator+0.114
umm+0.112
omm+0.107
 department+0.102
Neg logit contributions
able-0.151
 arrow-0.147
 string-0.139
 cane-0.139
 arrows-0.134
Token.
Token ID13
Feature activation-6.18e-02

Pos logit contributions
urg+0.078
ator+0.077
icles+0.076
rim+0.068
icle+0.067
Neg logit contributions
 cane-0.026
bs-0.026
 arrow-0.022
Scar-0.021
 arrows-0.021
Token O
Token ID440
Feature activation-3.50e-02

Pos logit contributions
ator+0.122
urg+0.107
ators+0.096
rina+0.094
aby+0.093
Neg logit contributions
 long-0.099
bs-0.095
 short-0.095
 Jerry-0.094
 high-0.091
Tokenll
Token ID297
Feature activation-9.48e-02

Pos logit contributions
urg+0.049
icles+0.049
ator+0.047
icle+0.041
umm+0.037
Neg logit contributions
bs-0.065
Scar-0.064
 cane-0.063
able-0.062
Har-0.060
Tokenie
Token ID494
Feature activation-1.26e-02

Pos logit contributions
urg+0.129
ator+0.117
umm+0.117
 department+0.114
icles+0.106
Neg logit contributions
able-0.165
 arrow-0.164
 cane-0.155
 string-0.153
 tied-0.148
Token loved
Token ID6151
Feature activation-2.46e-02

Pos logit contributions
urg+0.024
ator+0.023
icles+0.023
rim+0.020
icle+0.019
Neg logit contributions
bs-0.019
 cane-0.019
Scar-0.019
 arrows-0.018
 arrow-0.018
Token to
Token ID284
Feature activation-2.09e-02

Pos logit contributions
ator+0.059
urg+0.057
icles+0.054
rim+0.050
zie+0.049
Neg logit contributions
 cane-0.029
bs-0.026
 arrow-0.025
 arrows-0.025
 string-0.025
Token play
Token ID711
Feature activation-2.95e-02

Pos logit contributions
ator+0.038
urg+0.037
icles+0.035
rim+0.031
zie+0.030
Neg logit contributions
 cane-0.035
bs-0.034
 arrow-0.032
Scar-0.032
 string-0.032
Token all
Token ID477
Feature activation-1.88e-02

Pos logit contributions
urg+0.071
ator+0.069
icles+0.068
zie+0.063
ric+0.063
Neg logit contributions
 cane-0.031
 arrows-0.031
bs-0.029
 arrow-0.028
 string-0.028
Token resist
Token ID4180
Feature activation-3.22e-02

Pos logit contributions
Scar+0.010
bs+0.010
 cane+0.010
 arrows+0.009
Aud+0.009
Neg logit contributions
urg-0.010
ator-0.009
icles-0.009
rim-0.008
icle-0.008
Token!
Token ID0
Feature activation-1.92e-02

Pos logit contributions
ator+0.067
urg+0.066
icles+0.063
rim+0.057
zie+0.056
Neg logit contributions
 cane-0.042
bs-0.041
Scar-0.039
 arrows-0.037
 string-0.037
Token I
Token ID314
Feature activation-2.12e-02

Pos logit contributions
urg+0.039
ator+0.038
icles+0.036
rim+0.034
zie+0.031
Neg logit contributions
bs-0.027
 cane-0.027
Scar-0.026
 arrow-0.024
 arrows-0.024
Token will
Token ID481
Feature activation-2.61e-02

Pos logit contributions
urg+0.044
ator+0.044
icles+0.042
rim+0.037
icle+0.036
Neg logit contributions
bs-0.030
 cane-0.029
Scar-0.028
 confident-0.026
able-0.026
Token try
Token ID1949
Feature activation-6.24e-03

Pos logit contributions
urg+0.052
ator+0.052
icles+0.050
rim+0.044
icle+0.042
Neg logit contributions
 cane-0.038
bs-0.037
Scar-0.035
 arrow-0.034
 string-0.034
Token the
Token ID262
Feature activation-9.40e-02

Pos logit contributions
urg+0.012
ator+0.012
icles+0.012
rim+0.011
icle+0.010
Neg logit contributions
bs-0.009
 cane-0.009
Scar-0.009
 arrows-0.008
 arrow-0.008
Token o
Token ID267
Feature activation-6.90e-02

Pos logit contributions
ator+0.198
urg+0.184
bled+0.172
aby+0.167
icles+0.159
Neg logit contributions
 long-0.143
 short-0.143
 cane-0.122
 small-0.121
 high-0.119
Tokenat
Token ID265
Feature activation-5.00e-02

Pos logit contributions
urg+0.092
ator+0.091
icles+0.078
maid+0.076
opher+0.071
Neg logit contributions
bs-0.134
br-0.124
 cane-0.120
 short-0.116
ite-0.115
Token!"
Token ID2474
Feature activation+1.06e-02

Pos logit contributions
urg+0.099
ator+0.099
icles+0.089
rim+0.089
ric+0.081
Neg logit contributions
bs-0.069
 cane-0.068
Scar-0.060
 string-0.059
 short-0.059
Token
Token ID198
Feature activation+6.15e-03

Pos logit contributions
 cane+0.015
Scar+0.015
bs+0.015
Aud+0.014
 arrows+0.014
Neg logit contributions
urg-0.021
icles-0.020
ator-0.020
icle-0.018
rim-0.017
Token
Token ID198
Feature activation-5.12e-03

Pos logit contributions
 cane+0.008
bs+0.008
Scar+0.007
 arrows+0.007
 arrow+0.007
Neg logit contributions
urg-0.013
ator-0.013
icles-0.012
rim-0.011
icle-0.011
Token was
Token ID373
Feature activation-4.60e-03

Pos logit contributions
urg+0.010
ator+0.010
icles+0.009
rim+0.008
icle+0.008
Neg logit contributions
bs-0.008
 cane-0.008
Scar-0.008
 arrows-0.008
 arrow-0.007
Token mean
Token ID1612
Feature activation-1.91e-02

Pos logit contributions
urg+0.010
ator+0.010
icles+0.009
rim+0.009
icle+0.008
Neg logit contributions
bs-0.006
 cane-0.006
Scar-0.006
 arrows-0.005
 arrow-0.005
Token.
Token ID13
Feature activation-6.73e-03

Pos logit contributions
ator+0.047
urg+0.047
icles+0.044
rim+0.042
zie+0.039
Neg logit contributions
 cane-0.019
bs-0.019
Scar-0.017
 arrows-0.017
 arrow-0.016
Token He
Token ID679
Feature activation-8.18e-04

Pos logit contributions
urg+0.014
ator+0.013
icles+0.013
rim+0.012
icle+0.011
Neg logit contributions
bs-0.010
 cane-0.010
Scar-0.009
 arrows-0.009
 arrow-0.008
Token said
Token ID531
Feature activation-4.07e-03

Pos logit contributions
urg+0.002
ator+0.002
icles+0.002
rim+0.002
icle+0.002
Neg logit contributions
 cane-0.001
bs-0.001
Scar-0.001
 arrows-0.001
Aud-0.001
Token sun
Token ID4252
Feature activation-9.39e-02

Pos logit contributions
urg+0.010
ator+0.009
icles+0.009
rim+0.008
icle+0.008
Neg logit contributions
 cane-0.004
bs-0.004
Scar-0.004
 arrows-0.004
 arrow-0.004
Tokenfl
Token ID2704
Feature activation-4.15e-02

Pos logit contributions
urg+0.147
ator+0.140
rim+0.133
 Esk+0.116
 aquarium+0.115
Neg logit contributions
bs-0.149
 she-0.145
 cane-0.142
ite-0.140
 stamp-0.132
Tokenowers
Token ID3618
Feature activation-9.17e-03

Pos logit contributions
urg+0.096
ator+0.095
icles+0.085
rim+0.084
 Esk+0.081
Neg logit contributions
bs-0.046
 cane-0.037
able-0.036
ite-0.036
 arrows-0.033
Token were
Token ID547
Feature activation-2.47e-02

Pos logit contributions
urg+0.018
ator+0.018
icles+0.017
rim+0.016
icle+0.015
Neg logit contributions
bs-0.013
 cane-0.013
Scar-0.013
 arrows-0.012
 arrow-0.012
Token alive
Token ID6776
Feature activation+4.02e-04

Pos logit contributions
ator+0.053
urg+0.053
icles+0.050
rim+0.046
zie+0.044
Neg logit contributions
 cane-0.031
bs-0.031
Scar-0.029
 arrow-0.028
 arrows-0.028
Token.
Token ID13
Feature activation-7.77e-03

Pos logit contributions
bs+0.001
 cane+0.001
Scar+0.000
 arrows+0.000
Har+0.000
Neg logit contributions
urg-0.001
ator-0.001
icles-0.001
rim-0.001
icle-0.001
Token around
Token ID1088
Feature activation-2.11e-02

Pos logit contributions
bs+0.002
 cane+0.002
Scar+0.002
Har+0.002
 confident+0.001
Neg logit contributions
urg-0.002
ator-0.002
icles-0.002
rim-0.002
icle-0.002
Token the
Token ID262
Feature activation-3.89e-02

Pos logit contributions
urg+0.050
ator+0.049
icles+0.047
rim+0.043
icle+0.042
Neg logit contributions
 cane-0.023
bs-0.023
 arrows-0.021
Scar-0.021
 arrow-0.020
Token lake
Token ID13546
Feature activation-2.13e-02

Pos logit contributions
urg+0.074
icles+0.069
ator+0.066
icle+0.062
omm+0.058
Neg logit contributions
 cane-0.062
 string-0.060
 arrow-0.059
 arrows-0.059
Scar-0.055
Token.
Token ID13
Feature activation-4.48e-03

Pos logit contributions
urg+0.051
ator+0.050
icles+0.048
rim+0.044
icle+0.043
Neg logit contributions
bs-0.022
 cane-0.022
Scar-0.020
 arrows-0.019
 arrow-0.019
Token The
Token ID383
Feature activation-6.05e-02

Pos logit contributions
urg+0.009
ator+0.009
icles+0.008
rim+0.007
icle+0.007
Neg logit contributions
bs-0.007
 cane-0.007
Scar-0.006
 arrows-0.006
 arrow-0.006
Token sw
Token ID1509
Feature activation-9.38e-02

Pos logit contributions
urg+0.143
icles+0.139
ator+0.136
icle+0.123
ators+0.121
Neg logit contributions
 arrow-0.073
 arrows-0.073
 pigeon-0.072
 long-0.071
 string-0.069
Tokenan
Token ID272
Feature activation-4.32e-03

Pos logit contributions
icles+0.147
riage+0.131
 Manny+0.130
icle+0.128
ator+0.126
Neg logit contributions
 arrows-0.150
 arrow-0.142
 ar-0.140
powers-0.133
 pigeon-0.129
Token wanted
Token ID2227
Feature activation-1.35e-02

Pos logit contributions
urg+0.008
ator+0.008
icles+0.008
rim+0.007
icle+0.007
Neg logit contributions
 cane-0.006
bs-0.006
Scar-0.006
 arrows-0.006
 arrow-0.006
Token to
Token ID284
Feature activation-1.39e-02

Pos logit contributions
urg+0.029
ator+0.029
icles+0.028
rim+0.026
zie+0.024
Neg logit contributions
 cane-0.017
bs-0.017
Scar-0.016
 arrows-0.016
 arrow-0.015
Token join
Token ID4654
Feature activation-2.17e-02

Pos logit contributions
urg+0.026
ator+0.025
icles+0.024
rim+0.021
zie+0.020
Neg logit contributions
 cane-0.022
bs-0.022
Scar-0.021
 arrow-0.020
 arrows-0.020
Token the
Token ID262
Feature activation-4.05e-02

Pos logit contributions
urg+0.036
ator+0.035
icles+0.033
rim+0.030
zie+0.028
Neg logit contributions
bs-0.039
 cane-0.038
 arrows-0.037
Scar-0.036
 arrow-0.036
Token walked
Token ID6807
Feature activation-2.31e-02

Pos logit contributions
ator+0.058
urg+0.057
icles+0.054
rim+0.051
icle+0.048
Neg logit contributions
bs-0.032
 cane-0.029
Scar-0.028
 arrows-0.026
 arrow-0.026
Token until
Token ID1566
Feature activation-6.04e-03

Pos logit contributions
urg+0.049
ator+0.047
icles+0.045
rim+0.043
ric+0.040
Neg logit contributions
bs-0.032
 cane-0.032
 arrows-0.030
Scar-0.030
 confident-0.029
Token they
Token ID484
Feature activation-3.20e-02

Pos logit contributions
urg+0.014
ator+0.014
icles+0.013
rim+0.012
icle+0.012
Neg logit contributions
 cane-0.007
bs-0.007
Scar-0.007
 arrows-0.006
 arrow-0.006
Token finally
Token ID3443
Feature activation-3.43e-03

Pos logit contributions
ator+0.076
urg+0.076
icles+0.073
rim+0.068
icle+0.065
Neg logit contributions
bs-0.039
Scar-0.035
 cane-0.033
 short-0.030
Har-0.030
Token reached
Token ID4251
Feature activation-2.63e-02

Pos logit contributions
urg+0.007
ator+0.007
icles+0.006
rim+0.006
icle+0.005
Neg logit contributions
 cane-0.005
bs-0.005
Scar-0.005
 arrows-0.005
 arrow-0.004
Token the
Token ID262
Feature activation-9.37e-02

Pos logit contributions
ator+0.068
urg+0.067
icles+0.064
rim+0.062
icle+0.058
Neg logit contributions
 cane-0.022
bs-0.021
Scar-0.021
 arrows-0.018
 string-0.018
Token o
Token ID267
Feature activation-8.18e-02

Pos logit contributions
urg+0.200
ator+0.189
abies+0.182
icles+0.177
rim+0.169
Neg logit contributions
 long-0.134
 far-0.131
 small-0.128
 big-0.126
 short-0.125
Tokenasis
Token ID17765
Feature activation-4.11e-02

Pos logit contributions
urg+0.115
ator+0.108
icles+0.102
maid+0.092
opher+0.092
Neg logit contributions
bs-0.140
br-0.135
ite-0.132
ak-0.124
 cane-0.120
Token.
Token ID13
Feature activation-1.18e-02

Pos logit contributions
urg+0.103
ator+0.100
icles+0.096
rim+0.093
icle+0.088
Neg logit contributions
bs-0.033
 cane-0.032
 confident-0.029
 short-0.028
Scar-0.028
Token They
Token ID1119
Feature activation-1.19e-02

Pos logit contributions
urg+0.024
ator+0.024
icles+0.023
rim+0.021
icle+0.020
Neg logit contributions
bs-0.016
 cane-0.016
Scar-0.016
 arrows-0.014
 arrow-0.014
Token drank
Token ID24070
Feature activation-1.21e-02

Pos logit contributions
urg+0.024
ator+0.024
icles+0.023
rim+0.021
icle+0.020
Neg logit contributions
bs-0.017
 cane-0.017
Scar-0.016
 arrows-0.015
 arrow-0.015
Token said
Token ID531
Feature activation+9.44e-03

Pos logit contributions
 cane+0.019
bs+0.019
Scar+0.019
able+0.017
powers+0.017
Neg logit contributions
urg-0.031
ator-0.030
icles-0.029
rim-0.026
zie-0.025
Token,
Token ID11
Feature activation-2.24e-02

Pos logit contributions
 cane+0.011
bs+0.011
Scar+0.011
 arrow+0.009
 arrows+0.009
Neg logit contributions
urg-0.021
ator-0.021
icles-0.020
icle-0.018
rim-0.018
Token "
Token ID366
Feature activation-1.64e-02

Pos logit contributions
urg+0.045
ator+0.045
icles+0.043
rim+0.040
zie+0.037
Neg logit contributions
 cane-0.031
bs-0.031
Scar-0.030
 confident-0.028
 arrows-0.027
TokenHi
Token ID17250
Feature activation-4.92e-02

Pos logit contributions
urg+0.033
ator+0.032
icles+0.031
rim+0.028
zie+0.026
Neg logit contributions
 cane-0.023
Scar-0.023
bs-0.023
 arrows-0.021
 confident-0.021
Token O
Token ID440
Feature activation-4.72e-02

Pos logit contributions
ator+0.101
urg+0.096
icles+0.091
rim+0.088
zie+0.083
Neg logit contributions
 cane-0.065
bs-0.064
 short-0.062
 arrow-0.061
 arrows-0.061
Tokenll
Token ID297
Feature activation-9.35e-02

Pos logit contributions
urg+0.065
ator+0.064
icles+0.062
icle+0.051
ators+0.049
Neg logit contributions
bs-0.092
able-0.086
Scar-0.084
 cane-0.084
ite-0.082
Tokenie
Token ID494
Feature activation-3.78e-02

Pos logit contributions
urg+0.143
 overfl+0.139
 department+0.134
umm+0.131
ksh+0.126
Neg logit contributions
 arrow-0.139
 stamp-0.128
able-0.127
 cane-0.124
,"-0.121
Token".
Token ID1911
Feature activation-1.83e-02

Pos logit contributions
urg+0.092
ator+0.089
icles+0.088
rim+0.082
icle+0.078
Neg logit contributions
 cane-0.037
bs-0.035
Scar-0.033
 arrow-0.031
able-0.030
Token<|endoftext|>
Token ID50256
Feature activation+6.50e-03

Pos logit contributions
ator+0.042
urg+0.042
icles+0.039
rim+0.037
zie+0.035
Neg logit contributions
bs-0.021
 cane-0.020
Scar-0.019
 arrows-0.018
 arrow-0.017
TokenOnce
Token ID7454
Feature activation-1.06e-02

Pos logit contributions
bs+0.010
 cane+0.009
Scar+0.009
 arrow+0.009
 arrows+0.009
Neg logit contributions
ator-0.012
urg-0.012
icles-0.012
rim-0.011
icle-0.010
Token,
Token ID11
Feature activation-4.01e-02

Pos logit contributions
urg+0.016
ator+0.015
icles+0.015
rim+0.013
icle+0.012
Neg logit contributions
bs-0.021
 cane-0.020
Scar-0.020
 arrows-0.019
 arrow-0.019
Token the
Token ID262
Feature activation-5.56e-02

Pos logit contributions
bs+0.039
 cane+0.036
able+0.035
Scar+0.035
powers+0.034
Neg logit contributions
urg-0.044
ator-0.042
icles-0.038
ric-0.035
 Esk-0.035
Token park
Token ID3952
Feature activation+3.55e-03

Pos logit contributions
urg+0.128
icles+0.124
ator+0.119
icle+0.109
zie+0.103
Neg logit contributions
 cane-0.072
 arrow-0.069
 string-0.068
 pigeon-0.068
 arrows-0.068
Token to
Token ID284
Feature activation+4.00e-03

Pos logit contributions
bs+0.005
 cane+0.005
Scar+0.005
 arrows+0.004
 arrow+0.004
Neg logit contributions
urg-0.007
ator-0.007
icles-0.007
rim-0.006
icle-0.006
Token study
Token ID2050
Feature activation+1.31e-02

Pos logit contributions
bs+0.006
 cane+0.006
Scar+0.006
 arrows+0.006
able+0.005
Neg logit contributions
urg-0.008
ator-0.007
icles-0.007
rim-0.006
zie-0.006
Token the
Token ID262
Feature activation-5.33e-02

Pos logit contributions
 cane+0.018
Scar+0.018
bs+0.018
Har+0.016
 arrow+0.016
Neg logit contributions
urg-0.026
ator-0.026
icles-0.025
rim-0.022
icle-0.021
Token sw
Token ID1509
Feature activation-9.34e-02

Pos logit contributions
urg+0.115
icles+0.110
ator+0.107
icle+0.097
zie+0.093
Neg logit contributions
 cane-0.076
 arrows-0.075
 arrow-0.075
 string-0.074
 long-0.071
Tokenan
Token ID272
Feature activation-3.06e-02

Pos logit contributions
icles+0.142
 Manny+0.129
ator+0.125
icle+0.118
urg+0.118
Neg logit contributions
 arrows-0.154
 arrow-0.149
 ar-0.147
ak-0.141
 cane-0.138
Token and
Token ID290
Feature activation+7.25e-03

Pos logit contributions
urg+0.069
ator+0.068
icles+0.066
rim+0.061
icle+0.059
Neg logit contributions
bs-0.037
 cane-0.036
 arrows-0.033
Scar-0.032
 arrow-0.032
Token other
Token ID584
Feature activation-5.09e-02

Pos logit contributions
bs+0.011
 cane+0.011
Scar+0.011
able+0.010
Aud+0.010
Neg logit contributions
urg-0.013
ator-0.013
icles-0.012
rim-0.011
ric-0.010
Token animals
Token ID4695
Feature activation+6.37e-04

Pos logit contributions
ator+0.105
urg+0.102
icles+0.099
icle+0.092
zie+0.087
Neg logit contributions
 arrows-0.075
 cane-0.071
able-0.070
bs-0.070
 arrow-0.068
Token.
Token ID13
Feature activation-1.77e-02

Pos logit contributions
 cane+0.001
bs+0.001
Scar+0.001
 arrows+0.001
 arrow+0.001
Neg logit contributions
urg-0.001
ator-0.001
icles-0.001
rim-0.001
icle-0.001
Token by
Token ID416
Feature activation-1.67e-02

Pos logit contributions
Scar+0.073
 cane+0.070
Jennifer+0.068
bs+0.067
Michelle+0.066
Neg logit contributions
ator-0.057
urg-0.056
rim-0.052
icles-0.049
 Esk-0.044
Token the
Token ID262
Feature activation-5.71e-02

Pos logit contributions
ator+0.034
urg+0.034
icles+0.033
rim+0.029
icle+0.028
Neg logit contributions
 cane-0.023
bs-0.022
Scar-0.021
 arrows-0.021
 arrow-0.021
Token lake
Token ID13546
Feature activation-2.62e-02

Pos logit contributions
icles+0.115
ator+0.115
urg+0.114
icle+0.100
zie+0.099
Neg logit contributions
 cane-0.083
 arrows-0.083
 arrow-0.083
 long-0.081
 string-0.081
Token.
Token ID13
Feature activation+4.36e-03

Pos logit contributions
urg+0.063
ator+0.062
icles+0.060
rim+0.055
icle+0.054
Neg logit contributions
bs-0.028
 cane-0.027
 arrows-0.024
 confident-0.023
 arrow-0.023
Token The
Token ID383
Feature activation-5.88e-02

Pos logit contributions
bs+0.007
 cane+0.007
Scar+0.006
 arrows+0.006
 confident+0.006
Neg logit contributions
urg-0.008
ator-0.008
icles-0.008
icle-0.007
rim-0.007
Token sw
Token ID1509
Feature activation-9.34e-02

Pos logit contributions
urg+0.127
ator+0.124
icles+0.122
ators+0.110
rol+0.104
Neg logit contributions
 arrow-0.083
 string-0.080
 arrows-0.076
 long-0.075
 pigeon-0.074
Tokenan
Token ID272
Feature activation-2.10e-02

Pos logit contributions
icles+0.144
riage+0.127
 Manny+0.122
icle+0.122
why+0.120
Neg logit contributions
 arrows-0.145
 arrow-0.141
 ar-0.136
powers-0.133
 pigeon-0.129
Token was
Token ID373
Feature activation-1.80e-02

Pos logit contributions
urg+0.044
ator+0.042
icles+0.041
rim+0.037
icle+0.036
Neg logit contributions
bs-0.030
 cane-0.029
Scar-0.028
 arrows-0.027
 arrow-0.027
Token so
Token ID523
Feature activation+7.76e-04

Pos logit contributions
ator+0.042
urg+0.042
icles+0.040
rim+0.036
icle+0.035
Neg logit contributions
 cane-0.020
bs-0.019
Scar-0.019
 confident-0.018
 arrow-0.018
Token graceful
Token ID44363
Feature activation-1.26e-03

Pos logit contributions
 cane+0.001
Scar+0.001
bs+0.001
 arrows+0.001
 arrow+0.001
Neg logit contributions
urg-0.002
ator-0.002
icles-0.002
rim-0.002
icle-0.002
Token that
Token ID326
Feature activation-2.14e-02

Pos logit contributions
urg+0.003
ator+0.003
icles+0.002
rim+0.002
icle+0.002
Neg logit contributions
bs-0.002
 cane-0.002
Scar-0.002
 arrows-0.001
 arrow-0.001
Token sun
Token ID4252
Feature activation-4.79e-02

Pos logit contributions
urg+0.114
ator+0.112
icles+0.108
ators+0.098
rol+0.094
Neg logit contributions
 arrow-0.072
 string-0.067
 cane-0.067
 arrows-0.067
 pigeon-0.066
Token came
Token ID1625
Feature activation-2.75e-02

Pos logit contributions
ator+0.100
urg+0.100
icles+0.093
rim+0.087
zie+0.082
Neg logit contributions
 cane-0.064
bs-0.064
Scar-0.056
 string-0.056
 arrow-0.055
Token up
Token ID510
Feature activation+2.06e-03

Pos logit contributions
ator+0.058
urg+0.058
icles+0.054
rim+0.052
io+0.048
Neg logit contributions
 cane-0.036
bs-0.034
Scar-0.033
 arrows-0.033
 arrow-0.032
Token,
Token ID11
Feature activation-1.04e-03

Pos logit contributions
bs+0.002
 cane+0.002
Scar+0.002
 arrows+0.002
Aud+0.002
Neg logit contributions
urg-0.005
ator-0.005
icles-0.004
rim-0.004
icle-0.004
Token the
Token ID262
Feature activation-3.74e-02

Pos logit contributions
urg+0.002
ator+0.002
icles+0.002
rim+0.002
icle+0.002
Neg logit contributions
bs-0.001
 cane-0.001
Scar-0.001
 arrows-0.001
Aud-0.001
Token sw
Token ID1509
Feature activation-9.32e-02

Pos logit contributions
urg+0.074
ator+0.072
icles+0.070
rim+0.060
zie+0.059
Neg logit contributions
 arrow-0.058
 cane-0.058
 string-0.056
 arrows-0.056
 pigeon-0.052
Tokenan
Token ID272
Feature activation-3.24e-02

Pos logit contributions
icles+0.141
ator+0.132
 Manny+0.129
ators+0.121
days+0.118
Neg logit contributions
 arrows-0.152
 arrow-0.149
 ar-0.141
 pigeon-0.140
powers-0.138
Token would
Token ID561
Feature activation-3.88e-02

Pos logit contributions
urg+0.066
ator+0.065
icles+0.063
rim+0.055
icle+0.054
Neg logit contributions
bs-0.047
 cane-0.047
Scar-0.044
 arrows-0.042
 arrow-0.042
Token wake
Token ID7765
Feature activation-4.71e-02

Pos logit contributions
ator+0.076
urg+0.076
icles+0.074
rim+0.066
io+0.065
Neg logit contributions
 cane-0.056
bs-0.054
Scar-0.051
 arrow-0.051
 string-0.050
Token up
Token ID510
Feature activation-7.92e-03

Pos logit contributions
ator+0.088
urg+0.086
icles+0.081
rim+0.080
umm+0.076
Neg logit contributions
 cane-0.064
bs-0.061
 arrows-0.061
 arrow-0.060
Scar-0.058
Token again
Token ID757
Feature activation+1.22e-03

Pos logit contributions
urg+0.018
ator+0.017
icles+0.017
rim+0.015
icle+0.015
Neg logit contributions
bs-0.010
 cane-0.009
Scar-0.009
 arrows-0.008
 arrow-0.008
TokenWhen
Token ID2215
Feature activation-6.24e-03

Pos logit contributions
bs+0.038
 cane+0.037
Scar+0.035
 arrows+0.033
 pencil+0.033
Neg logit contributions
ator-0.042
urg-0.041
icles-0.041
rim-0.038
 Esk-0.035
Token Twe
Token ID24205
Feature activation+7.05e-03

Pos logit contributions
urg+0.015
ator+0.015
icles+0.014
rim+0.013
icle+0.013
Neg logit contributions
bs-0.006
 cane-0.006
Scar-0.006
 arrows-0.006
 arrow-0.005
Tokenety
Token ID2963
Feature activation-1.84e-02

Pos logit contributions
 cane+0.012
bs+0.012
Scar+0.011
 arrows+0.011
 confident+0.011
Neg logit contributions
urg-0.012
ator-0.012
icles-0.011
rim-0.010
zie-0.010
Token arrived
Token ID5284
Feature activation-4.53e-03

Pos logit contributions
ator+0.038
urg+0.038
icles+0.037
rim+0.033
icle+0.032
Neg logit contributions
bs-0.025
 cane-0.025
Scar-0.023
 arrow-0.022
 arrows-0.022
Token at
Token ID379
Feature activation-2.51e-02

Pos logit contributions
urg+0.009
ator+0.009
icles+0.008
rim+0.007
icle+0.007
Neg logit contributions
 cane-0.007
bs-0.007
Scar-0.007
 arrows-0.006
 arrow-0.006
Token the
Token ID262
Feature activation-9.27e-02

Pos logit contributions
urg+0.063
ator+0.061
icles+0.058
rim+0.056
io+0.052
Neg logit contributions
 cane-0.024
bs-0.022
Scar-0.022
 arrow-0.021
 string-0.020
Token o
Token ID267
Feature activation-7.87e-02

Pos logit contributions
urg+0.191
ator+0.183
icles+0.180
abies+0.177
ators+0.172
Neg logit contributions
 far-0.133
 big-0.132
 long-0.128
 small-0.128
 high-0.121
Tokenasis
Token ID17765
Feature activation-2.52e-02

Pos logit contributions
urg+0.116
icles+0.105
ator+0.105
icle+0.092
maid+0.092
Neg logit contributions
bs-0.134
ite-0.129
br-0.124
able-0.120
ak-0.119
Token,
Token ID11
Feature activation+1.79e-02

Pos logit contributions
urg+0.068
ator+0.067
icles+0.065
rim+0.062
icle+0.059
Neg logit contributions
bs-0.017
 cane-0.017
Scar-0.014
 arrows-0.013
 arrow-0.013
Token she
Token ID673
Feature activation-1.89e-02

Pos logit contributions
bs+0.017
Scar+0.017
 cane+0.016
Aud+0.016
powers+0.014
Neg logit contributions
icles-0.043
ator-0.042
urg-0.042
 Esk-0.038
icle-0.037
Token saw
Token ID2497
Feature activation-7.93e-03

Pos logit contributions
ator+0.041
urg+0.041
icles+0.040
rim+0.036
icle+0.035
Neg logit contributions
bs-0.024
 cane-0.024
Scar-0.023
 arrows-0.021
 arrow-0.021
Token the
Token ID262
Feature activation-3.12e-02

Pos logit contributions
bs+0.011
 cane+0.011
Scar+0.011
 arrows+0.010
powers+0.010
Neg logit contributions
ator-0.015
urg-0.015
icles-0.014
rim-0.013
zie-0.013
Token park
Token ID3952
Feature activation+1.27e-03

Pos logit contributions
urg+0.073
icles+0.070
ator+0.070
icle+0.061
rim+0.059
Neg logit contributions
 cane-0.038
Scar-0.035
 arrow-0.035
bs-0.034
 arrows-0.034
Token,
Token ID11
Feature activation-4.24e-03

Pos logit contributions
 cane+0.002
bs+0.002
Scar+0.002
 arrows+0.002
 arrow+0.001
Neg logit contributions
urg-0.003
ator-0.003
icles-0.002
rim-0.002
zie-0.002
Token watching
Token ID4964
Feature activation-2.62e-02

Pos logit contributions
urg+0.008
ator+0.008
icles+0.007
rim+0.007
icle+0.006
Neg logit contributions
bs-0.007
 cane-0.007
Scar-0.006
 arrows-0.006
 arrow-0.006
Token the
Token ID262
Feature activation-5.73e-02

Pos logit contributions
ator+0.058
urg+0.057
icles+0.054
rim+0.051
zie+0.049
Neg logit contributions
bs-0.032
 cane-0.031
 arrows-0.030
 arrow-0.029
Scar-0.029
Token sw
Token ID1509
Feature activation-9.25e-02

Pos logit contributions
urg+0.117
ator+0.115
icles+0.115
icle+0.102
zie+0.099
Neg logit contributions
 arrow-0.084
 arrows-0.083
 pigeon-0.081
 string-0.081
 long-0.078
Tokenan
Token ID272
Feature activation-3.83e-02

Pos logit contributions
icles+0.145
 Manny+0.131
ator+0.130
riage+0.121
icle+0.121
Neg logit contributions
 arrows-0.149
 arrow-0.145
 ar-0.142
ak-0.138
 pigeon-0.133
Token and
Token ID290
Feature activation-2.33e-02

Pos logit contributions
ator+0.086
urg+0.084
icles+0.082
rim+0.075
icle+0.073
Neg logit contributions
bs-0.046
 cane-0.043
 arrows-0.041
 arrow-0.041
 confident-0.039
Token talking
Token ID3375
Feature activation-2.83e-02

Pos logit contributions
urg+0.046
ator+0.046
icles+0.044
rim+0.040
icle+0.038
Neg logit contributions
bs-0.033
 cane-0.033
Scar-0.032
 arrows-0.031
 arrow-0.031
Token about
Token ID546
Feature activation-1.46e-02

Pos logit contributions
ator+0.062
urg+0.062
icles+0.059
rim+0.056
zie+0.054
Neg logit contributions
bs-0.035
 cane-0.034
Scar-0.032
 arrows-0.032
 arrow-0.030
Token all
Token ID477
Feature activation-3.35e-03

Pos logit contributions
urg+0.030
ator+0.030
icles+0.029
rim+0.026
icle+0.025
Neg logit contributions
 cane-0.020
bs-0.020
Scar-0.019
 arrows-0.018
 arrow-0.018
Token
Token ID198
Feature activation-1.88e-04

Pos logit contributions
 cane+0.013
bs+0.013
Scar+0.012
 arrows+0.012
 arrow+0.012
Neg logit contributions
ator-0.021
urg-0.021
icles-0.020
rim-0.018
zie-0.018
TokenOne
Token ID3198
Feature activation-4.54e-03

Pos logit contributions
ator+0.000
urg+0.000
icles+0.000
rim+0.000
zie+0.000
Neg logit contributions
bs-0.000
 cane-0.000
Scar-0.000
 arrows-0.000
 arrow-0.000
Token morning
Token ID3329
Feature activation+1.17e-02

Pos logit contributions
urg+0.007
ator+0.007
icles+0.007
rim+0.006
icle+0.005
Neg logit contributions
 cane-0.009
bs-0.009
Scar-0.008
 arrows-0.008
 arrow-0.008
Token,
Token ID11
Feature activation+4.78e-03

Pos logit contributions
 cane+0.014
Scar+0.014
bs+0.014
 arrows+0.013
 confident+0.013
Neg logit contributions
urg-0.025
ator-0.024
icles-0.024
icle-0.022
rim-0.022
Token the
Token ID262
Feature activation-4.67e-02

Pos logit contributions
 cane+0.006
Scar+0.006
bs+0.006
 arrows+0.005
powers+0.005
Neg logit contributions
urg-0.010
ator-0.010
icles-0.010
icle-0.008
rim-0.008
Token sw
Token ID1509
Feature activation-9.25e-02

Pos logit contributions
urg+0.098
icles+0.095
ator+0.095
zie+0.081
ators+0.081
Neg logit contributions
 arrow-0.068
 string-0.066
 pigeon-0.065
 cane-0.064
 long-0.062
Tokenan
Token ID272
Feature activation-3.49e-02

Pos logit contributions
icles+0.154
 Manny+0.135
riage+0.131
icle+0.128
ator+0.126
Neg logit contributions
 arrows-0.134
 arrow-0.130
 ar-0.129
 pigeon-0.126
powers-0.123
Token discovered
Token ID5071
Feature activation+1.93e-04

Pos logit contributions
ator+0.079
urg+0.079
icles+0.077
rim+0.068
icle+0.068
Neg logit contributions
bs-0.043
 cane-0.042
Scar-0.038
 arrow-0.037
 arrows-0.037
Token something
Token ID1223
Feature activation+1.53e-02

Pos logit contributions
bs+0.000
 cane+0.000
Scar+0.000
 arrows+0.000
 confident+0.000
Neg logit contributions
urg-0.000
ator-0.000
icles-0.000
rim-0.000
icle-0.000
Token very
Token ID845
Feature activation+8.30e-03

Pos logit contributions
bs+0.023
Scar+0.022
 cane+0.022
Har+0.020
Aud+0.020
Neg logit contributions
urg-0.028
ator-0.028
icles-0.027
rim-0.025
icle-0.023
Token unique
Token ID3748
Feature activation+4.14e-03

Pos logit contributions
Scar+0.011
 cane+0.011
bs+0.011
Aud+0.010
Har+0.010
Neg logit contributions
urg-0.017
ator-0.016
icles-0.015
rim-0.014
 Esk-0.013
Token the
Token ID262
Feature activation-4.53e-02

Pos logit contributions
urg+0.013
ator+0.013
icles+0.012
rim+0.011
icle+0.011
Neg logit contributions
bs-0.008
 cane-0.008
Scar-0.008
 arrows-0.007
 arrow-0.007
Token pond
Token ID16723
Feature activation-1.10e-02

Pos logit contributions
icles+0.104
urg+0.104
ator+0.103
icle+0.091
zie+0.088
Neg logit contributions
 cane-0.059
 string-0.056
 arrow-0.054
 long-0.053
 pigeon-0.052
Token and
Token ID290
Feature activation-2.79e-02

Pos logit contributions
urg+0.026
ator+0.025
icles+0.024
rim+0.022
icle+0.022
Neg logit contributions
bs-0.012
 cane-0.012
Scar-0.011
 arrows-0.011
 arrow-0.010
Token saw
Token ID2497
Feature activation-3.59e-03

Pos logit contributions
ator+0.067
urg+0.066
icles+0.063
rim+0.057
icle+0.056
Neg logit contributions
bs-0.034
 cane-0.032
Scar-0.030
 arrow-0.028
 confident-0.028
Token the
Token ID262
Feature activation-5.02e-02

Pos logit contributions
urg+0.008
ator+0.008
icles+0.008
rim+0.007
icle+0.007
Neg logit contributions
 cane-0.004
bs-0.004
Scar-0.004
 arrows-0.003
 arrow-0.003
Token sw
Token ID1509
Feature activation-9.23e-02

Pos logit contributions
urg+0.106
ator+0.106
icles+0.104
icle+0.091
ators+0.091
Neg logit contributions
 arrows-0.069
 arrow-0.069
 cane-0.068
 long-0.068
 string-0.067
Tokenan
Token ID272
Feature activation-5.22e-02

Pos logit contributions
icles+0.142
 Manny+0.140
ator+0.136
riage+0.124
ators+0.122
Neg logit contributions
 arrows-0.140
ak-0.134
 ar-0.133
 arrow-0.130
 pigeon-0.130
Token again
Token ID757
Feature activation+3.31e-03

Pos logit contributions
ator+0.116
urg+0.110
icles+0.108
umm+0.097
zie+0.096
Neg logit contributions
bs-0.065
 cane-0.064
 arrows-0.061
 arrow-0.060
 string-0.057
Token.
Token ID13
Feature activation-1.11e-02

Pos logit contributions
 cane+0.005
Scar+0.005
bs+0.005
 arrows+0.004
Aud+0.004
Neg logit contributions
urg-0.007
ator-0.006
icles-0.006
rim-0.006
icle-0.005
Token She
Token ID1375
Feature activation-1.06e-02

Pos logit contributions
urg+0.028
ator+0.027
icles+0.026
rim+0.025
icle+0.023
Neg logit contributions
 cane-0.011
bs-0.011
Scar-0.010
 arrows-0.009
 arrow-0.009
Token wanted
Token ID2227
Feature activation-8.48e-03

Pos logit contributions
urg+0.022
ator+0.022
icles+0.021
rim+0.019
icle+0.018
Neg logit contributions
 cane-0.014
bs-0.014
Scar-0.014
 arrows-0.013
 arrow-0.013
Token a
Token ID257
Feature activation+1.14e-02

Pos logit contributions
 cane+0.002
bs+0.002
Scar+0.002
 arrows+0.001
 arrow+0.001
Neg logit contributions
urg-0.003
ator-0.003
icles-0.003
icle-0.002
rim-0.002
Token big
Token ID1263
Feature activation+2.98e-03

Pos logit contributions
Scar+0.015
bs+0.014
 cane+0.014
 arrows+0.014
 arrow+0.013
Neg logit contributions
urg-0.023
ator-0.022
icles-0.022
rim-0.020
icle-0.019
Token bite
Token ID13197
Feature activation+2.49e-03

Pos logit contributions
Scar+0.004
bs+0.004
 cane+0.004
 arrows+0.004
 confident+0.004
Neg logit contributions
urg-0.006
ator-0.006
icles-0.006
rim-0.005
icle-0.005
Token of
Token ID286
Feature activation-4.38e-02

Pos logit contributions
Scar+0.003
bs+0.003
 cane+0.003
 arrows+0.003
Har+0.003
Neg logit contributions
urg-0.006
ator-0.005
icles-0.005
rim-0.005
icle-0.005
Token the
Token ID262
Feature activation-5.63e-02

Pos logit contributions
ator+0.112
urg+0.108
icles+0.101
rim+0.098
zie+0.096
Neg logit contributions
 cane-0.039
 string-0.036
bs-0.035
 stamp-0.035
 stamps-0.033
Token hamb
Token ID41478
Feature activation-9.21e-02

Pos logit contributions
ator+0.126
urg+0.119
icles+0.110
zie+0.107
ric+0.104
Neg logit contributions
 cane-0.075
 long-0.070
 short-0.066
bs-0.066
 newspaper-0.065
Tokenurger
Token ID32650
Feature activation-1.18e-02

Pos logit contributions
ator+0.105
zie+0.091
 waterfall+0.089
rol+0.083
riage+0.081
Neg logit contributions
 cane-0.183
bs-0.171
 stamp-0.170
 pencil-0.169
 stamps-0.164
Token.
Token ID13
Feature activation-1.30e-02

Pos logit contributions
urg+0.028
ator+0.027
icles+0.026
rim+0.024
zie+0.023
Neg logit contributions
 cane-0.013
bs-0.013
Scar-0.012
 arrows-0.011
 arrow-0.011
Token "
Token ID366
Feature activation-2.76e-02

Pos logit contributions
urg+0.029
ator+0.029
icles+0.028
rim+0.026
icle+0.024
Neg logit contributions
 cane-0.015
bs-0.015
Scar-0.015
 arrows-0.013
 arrow-0.013
TokenM
Token ID44
Feature activation-4.41e-02

Pos logit contributions
urg+0.053
ator+0.052
icles+0.047
rim+0.042
zie+0.042
Neg logit contributions
 cane-0.044
bs-0.044
Scar-0.044
Har-0.040
 string-0.040
Tokenmm
Token ID3020
Feature activation-6.45e-03

Pos logit contributions
ator+0.072
urg+0.067
icles+0.064
zie+0.063
ric+0.060
Neg logit contributions
bs-0.070
 cane-0.067
Scar-0.063
 pencil-0.062
powers-0.062
Token clear
Token ID1598
Feature activation+8.52e-03

Pos logit contributions
urg+0.058
ator+0.058
icles+0.055
rim+0.051
io+0.050
Neg logit contributions
 cane-0.023
bs-0.022
Scar-0.021
 short-0.021
 arrow-0.020
Token and
Token ID290
Feature activation-1.62e-02

Pos logit contributions
Scar+0.011
bs+0.011
 cane+0.010
Aud+0.009
Har+0.009
Neg logit contributions
urg-0.019
ator-0.019
icles-0.018
rim-0.016
icle-0.016
Token calm
Token ID9480
Feature activation-2.58e-02

Pos logit contributions
urg+0.038
ator+0.037
icles+0.036
rim+0.033
icle+0.032
Neg logit contributions
bs-0.018
 cane-0.018
Scar-0.016
 arrows-0.015
 arrow-0.015
Token.
Token ID13
Feature activation+1.87e-03

Pos logit contributions
urg+0.064
ator+0.064
icles+0.059
rim+0.057
zie+0.054
Neg logit contributions
 cane-0.026
bs-0.025
 arrows-0.022
Scar-0.022
 arrow-0.021
Token The
Token ID383
Feature activation-5.77e-02

Pos logit contributions
 cane+0.003
bs+0.003
Scar+0.003
 arrows+0.002
 confident+0.002
Neg logit contributions
urg-0.004
ator-0.004
icles-0.004
icle-0.003
rim-0.003
Token sw
Token ID1509
Feature activation-9.21e-02

Pos logit contributions
urg+0.130
ator+0.127
icles+0.124
ators+0.109
rol+0.109
Neg logit contributions
 arrow-0.076
 pigeon-0.073
 string-0.072
 arrows-0.072
 long-0.072
Tokenan
Token ID272
Feature activation-4.91e-03

Pos logit contributions
icles+0.138
 Manny+0.117
riage+0.117
icle+0.117
ator+0.114
Neg logit contributions
 arrows-0.151
 arrow-0.146
 ar-0.143
powers-0.141
 pigeon-0.138
Token flew
Token ID13112
Feature activation-1.29e-02

Pos logit contributions
urg+0.009
ator+0.009
icles+0.009
rim+0.008
icle+0.008
Neg logit contributions
 cane-0.007
bs-0.007
Scar-0.007
 arrows-0.007
 arrow-0.006
Token through
Token ID832
Feature activation-2.21e-02

Pos logit contributions
urg+0.029
ator+0.029
icles+0.028
rim+0.026
icle+0.025
Neg logit contributions
 cane-0.015
bs-0.015
Scar-0.014
 arrows-0.013
 arrow-0.013
Token the
Token ID262
Feature activation+1.16e-02

Pos logit contributions
urg+0.055
ator+0.055
icles+0.053
rim+0.050
icle+0.048
Neg logit contributions
 cane-0.021
bs-0.019
 arrows-0.019
 string-0.018
 arrow-0.018
Token sky
Token ID6766
Feature activation-5.38e-03

Pos logit contributions
bs+0.021
Scar+0.020
 cane+0.020
Aud+0.018
 arrows+0.018
Neg logit contributions
ator-0.019
urg-0.019
icles-0.018
rim-0.016
zie-0.015
Token it
Token ID340
Feature activation-4.57e-03

Pos logit contributions
urg+0.043
ator+0.042
icles+0.041
rim+0.038
zie+0.036
Neg logit contributions
bs-0.024
 cane-0.024
Scar-0.022
 arrows-0.022
 arrow-0.022
Token is
Token ID318
Feature activation-1.64e-02

Pos logit contributions
urg+0.008
ator+0.008
icles+0.007
rim+0.007
icle+0.006
Neg logit contributions
bs-0.008
 cane-0.008
Scar-0.007
 arrows-0.007
 arrow-0.007
Token so
Token ID523
Feature activation-8.55e-03

Pos logit contributions
urg+0.037
ator+0.037
icles+0.036
rim+0.033
icle+0.032
Neg logit contributions
 cane-0.019
Scar-0.018
bs-0.018
 arrow-0.017
 arrows-0.017
Token pretty
Token ID2495
Feature activation+3.05e-02

Pos logit contributions
urg+0.020
ator+0.019
icles+0.019
rim+0.017
icle+0.017
Neg logit contributions
bs-0.010
 cane-0.010
Scar-0.009
 arrows-0.008
 arrow-0.008
Token!"
Token ID2474
Feature activation+1.15e-02

Pos logit contributions
bs+0.052
Scar+0.052
 cane+0.049
Har+0.049
Jennifer+0.047
Neg logit contributions
ator-0.052
urg-0.051
icles-0.050
fl-0.042
rim-0.041
Token Lily
Token ID20037
Feature activation+4.96e-02

Pos logit contributions
 cane+0.021
Scar+0.020
bs+0.020
Aud+0.019
Har+0.019
Neg logit contributions
urg-0.017
icles-0.017
ator-0.016
rim-0.014
icle-0.014
Token says
Token ID1139
Feature activation+3.57e-02

Pos logit contributions
bs+0.082
 cane+0.079
Mic+0.078
nam+0.077
Har+0.076
Neg logit contributions
urg-0.078
icles-0.073
ator-0.073
zie-0.066
fl-0.064
Token.
Token ID13
Feature activation+3.13e-02

Pos logit contributions
Scar+0.067
 cane+0.067
Mic+0.066
Charl+0.064
Aud+0.064
Neg logit contributions
urg-0.054
ator-0.049
icles-0.049
icle-0.040
aby-0.040
Token
Token ID198
Feature activation+2.64e-02

Pos logit contributions
bs+0.053
able+0.052
 cane+0.052
 arrows+0.052
 arrow+0.051
Neg logit contributions
icles-0.043
urg-0.043
ator-0.043
icle-0.037
 them-0.037
Token
Token ID198
Feature activation+2.13e-02

Pos logit contributions
 cane+0.043
bs+0.040
Scar+0.040
 arrows+0.039
nam+0.039
Neg logit contributions
urg-0.046
ator-0.045
icles-0.043
rim-0.038
aby-0.037
Token"
Token ID1
Feature activation-7.12e-03

Pos logit contributions
 cane+0.035
bs+0.034
 confident+0.032
Scar+0.032
powers+0.031
Neg logit contributions
ator-0.035
urg-0.035
icles-0.032
rim-0.029
io-0.027
Token.
Token ID13
Feature activation-2.39e-03

Pos logit contributions
bs+0.003
 cane+0.003
Scar+0.003
Har+0.002
 arrows+0.002
Neg logit contributions
urg-0.004
ator-0.004
icles-0.004
rim-0.003
icle-0.003
Token But
Token ID887
Feature activation-1.12e-02

Pos logit contributions
urg+0.005
ator+0.004
icles+0.004
rim+0.004
icle+0.004
Neg logit contributions
 cane-0.004
bs-0.004
Scar-0.004
 arrows-0.003
 arrow-0.003
Token the
Token ID262
Feature activation+2.20e-03

Pos logit contributions
urg+0.025
ator+0.025
icles+0.024
rim+0.022
icle+0.021
Neg logit contributions
bs-0.014
 cane-0.013
Scar-0.013
 arrows-0.012
 arrow-0.012
Token stick
Token ID4859
Feature activation+5.44e-03

Pos logit contributions
bs+0.003
Scar+0.003
 cane+0.003
Aud+0.003
Har+0.003
Neg logit contributions
urg-0.004
ator-0.004
icles-0.004
rim-0.004
icle-0.004
Token is
Token ID318
Feature activation+1.08e-02

Pos logit contributions
 cane+0.009
bs+0.009
Scar+0.009
Har+0.008
 confident+0.008
Neg logit contributions
urg-0.009
ator-0.009
icles-0.009
rim-0.008
icle-0.007
Token too
Token ID1165
Feature activation+5.30e-02

Pos logit contributions
bs+0.018
Scar+0.018
 cane+0.018
Har+0.017
Jennifer+0.016
Neg logit contributions
urg-0.019
ator-0.018
icles-0.018
icle-0.015
rim-0.015
Token short
Token ID1790
Feature activation+3.37e-02

Pos logit contributions
 Faith+0.079
Scar+0.078
pill+0.078
ird+0.076
Aud+0.075
Neg logit contributions
urg-0.093
icles-0.086
 hyd-0.083
ator-0.082
 Esk-0.081
Token.
Token ID13
Feature activation+1.35e-02

Pos logit contributions
Scar+0.053
bs+0.051
Har+0.050
nam+0.049
Michelle+0.048
Neg logit contributions
urg-0.061
icles-0.060
ator-0.058
icle-0.052
rim-0.049
Token They
Token ID1119
Feature activation+2.05e-02

Pos logit contributions
 cane+0.022
bs+0.021
Scar+0.021
 arrows+0.020
 arrow+0.020
Neg logit contributions
urg-0.024
icles-0.024
ator-0.023
icle-0.021
 Esk-0.019
Token look
Token ID804
Feature activation+7.14e-03

Pos logit contributions
Scar+0.031
 cane+0.028
Har+0.028
Jennifer+0.028
bs+0.028
Neg logit contributions
urg-0.037
icles-0.035
ator-0.035
ric-0.032
icle-0.031
Token around
Token ID1088
Feature activation-2.75e-02

Pos logit contributions
Scar+0.011
 cane+0.011
bs+0.010
Har+0.010
Aud+0.010
Neg logit contributions
urg-0.013
ator-0.012
icles-0.012
rim-0.011
icle-0.010
Token that
Token ID326
Feature activation+2.16e-02

Pos logit contributions
bs+0.021
 cane+0.020
Scar+0.020
powers+0.019
Aud+0.018
Neg logit contributions
ator-0.024
urg-0.024
icles-0.023
rim-0.020
icle-0.020
Token sharing
Token ID7373
Feature activation+2.30e-02

Pos logit contributions
bs+0.033
 cane+0.032
Scar+0.032
powers+0.031
Aud+0.029
Neg logit contributions
ator-0.039
urg-0.039
icles-0.038
rim-0.033
icle-0.032
Token is
Token ID318
Feature activation-3.00e-03

Pos logit contributions
bs+0.047
 cane+0.047
Scar+0.047
Har+0.045
powers+0.044
Neg logit contributions
ator-0.029
urg-0.027
icles-0.027
rim-0.023
zie-0.021
Token better
Token ID1365
Feature activation+8.39e-03

Pos logit contributions
urg+0.006
ator+0.006
icles+0.006
rim+0.005
icle+0.005
Neg logit contributions
 cane-0.004
bs-0.004
Scar-0.004
 arrows-0.003
 arrow-0.003
Token than
Token ID621
Feature activation+1.22e-02

Pos logit contributions
Scar+0.010
 cane+0.010
bs+0.010
Har+0.009
Aud+0.009
Neg logit contributions
ator-0.018
urg-0.017
icles-0.016
rim-0.015
io-0.015
Token winning
Token ID5442
Feature activation+4.94e-02

Pos logit contributions
 cane+0.021
bs+0.021
Scar+0.020
 arrows+0.019
Aud+0.019
Neg logit contributions
urg-0.021
ator-0.020
icles-0.019
rim-0.017
icle-0.016
Token.
Token ID13
Feature activation+5.52e-03

Pos logit contributions
bs+0.079
Scar+0.076
 cane+0.076
Michelle+0.073
Aud+0.072
Neg logit contributions
ator-0.082
urg-0.080
icles-0.078
ric-0.064
fl-0.064
Token From
Token ID3574
Feature activation+1.26e-02

Pos logit contributions
 cane+0.008
bs+0.008
Scar+0.008
 arrows+0.007
 arrow+0.007
Neg logit contributions
urg-0.011
ator-0.011
icles-0.010
zie-0.009
rim-0.009
Token that
Token ID326
Feature activation+4.77e-03

Pos logit contributions
bs+0.021
Scar+0.021
 cane+0.021
powers+0.020
Aud+0.019
Neg logit contributions
ator-0.021
urg-0.021
icles-0.020
rim-0.018
zie-0.017
Token day
Token ID1110
Feature activation+1.71e-02

Pos logit contributions
bs+0.010
 cane+0.010
Scar+0.010
 arrows+0.009
powers+0.009
Neg logit contributions
ator-0.006
urg-0.006
icles-0.005
rim-0.005
zie-0.005
Token on
Token ID319
Feature activation+1.67e-02

Pos logit contributions
 cane+0.024
Scar+0.023
bs+0.023
powers+0.023
 arrows+0.022
Neg logit contributions
urg-0.033
ator-0.032
icles-0.030
icle-0.027
rim-0.027
TokenThe
Token ID464
Feature activation+8.34e-03

Pos logit contributions
bs+0.081
 cane+0.079
 arrow+0.078
 arrows+0.077
 pencil+0.076
Neg logit contributions
ator-0.072
urg-0.065
icles-0.062
rim-0.060
why-0.057
Token girl
Token ID2576
Feature activation-5.08e-03

Pos logit contributions
bs+0.013
Scar+0.013
 cane+0.013
 arrows+0.012
Har+0.012
Neg logit contributions
ator-0.015
urg-0.014
icles-0.014
rim-0.012
icle-0.012
Token was
Token ID373
Feature activation+5.07e-03

Pos logit contributions
urg+0.010
ator+0.010
icles+0.010
rim+0.009
icle+0.008
Neg logit contributions
 cane-0.007
bs-0.007
Scar-0.007
 arrows-0.007
 arrow-0.006
Token so
Token ID523
Feature activation+2.17e-02

Pos logit contributions
 cane+0.007
bs+0.007
Scar+0.007
 arrows+0.007
 arrow+0.006
Neg logit contributions
urg-0.010
ator-0.010
icles-0.009
rim-0.009
icle-0.008
Token relieved
Token ID22598
Feature activation-1.04e-02

Pos logit contributions
 cane+0.034
Scar+0.032
 arrows+0.030
Har+0.029
bs+0.029
Neg logit contributions
urg-0.042
ator-0.038
rim-0.037
icles-0.037
 Esk-0.033
Token.
Token ID13
Feature activation+5.02e-02

Pos logit contributions
urg+0.024
ator+0.024
icles+0.023
rim+0.021
icle+0.020
Neg logit contributions
bs-0.011
 cane-0.011
Scar-0.011
 arrows-0.010
 arrow-0.010
Token She
Token ID1375
Feature activation-3.37e-04

Pos logit contributions
able+0.082
bs+0.081
 arrows+0.074
 arrow+0.074
 pencil+0.072
Neg logit contributions
urg-0.078
ator-0.076
icles-0.074
omm-0.068
zie-0.065
Token hugged
Token ID41130
Feature activation+3.19e-02

Pos logit contributions
urg+0.001
ator+0.001
icles+0.001
rim+0.001
icle+0.001
Neg logit contributions
 cane-0.000
bs-0.000
Scar-0.000
 arrows-0.000
 arrow-0.000
Token Sharon
Token ID26918
Feature activation+1.84e-02

Pos logit contributions
powers+0.053
bs+0.052
 cane+0.048
able+0.048
 arrow+0.047
Neg logit contributions
ator-0.051
icles-0.051
urg-0.050
 them-0.043
io-0.042
Token and
Token ID290
Feature activation+3.78e-02

Pos logit contributions
bs+0.024
Scar+0.023
 cane+0.023
 arrows+0.022
powers+0.022
Neg logit contributions
urg-0.037
ator-0.036
icles-0.035
zie-0.031
io-0.030
Token thanked
Token ID26280
Feature activation+2.07e-02

Pos logit contributions
powers+0.053
Scar+0.053
 cane+0.053
bs+0.051
Har+0.048
Neg logit contributions
urg-0.070
icles-0.065
ator-0.065
rim-0.057
zie-0.057
Token Can
Token ID1680
Feature activation-1.99e-02

Pos logit contributions
urg+0.054
ator+0.053
icles+0.049
rim+0.046
zie+0.043
Neg logit contributions
bs-0.038
 cane-0.037
Scar-0.036
 arrows-0.034
 arrow-0.034
Token I
Token ID314
Feature activation-1.37e-02

Pos logit contributions
urg+0.047
ator+0.046
icles+0.043
rim+0.042
zie+0.039
Neg logit contributions
bs-0.022
 cane-0.022
Scar-0.021
 arrows-0.019
 arrow-0.019
Token touch
Token ID3638
Feature activation-4.70e-03

Pos logit contributions
urg+0.030
ator+0.029
icles+0.028
rim+0.026
zie+0.025
Neg logit contributions
bs-0.018
 cane-0.017
Scar-0.017
 arrow-0.015
 arrows-0.015
Token them
Token ID606
Feature activation-9.03e-03

Pos logit contributions
urg+0.008
ator+0.008
icles+0.007
rim+0.006
icle+0.006
Neg logit contributions
bs-0.008
 cane-0.008
Scar-0.008
 arrows-0.008
 arrow-0.008
Token?"
Token ID1701
Feature activation+2.56e-02

Pos logit contributions
urg+0.018
ator+0.017
icles+0.016
rim+0.015
icle+0.014
Neg logit contributions
 cane-0.013
bs-0.013
Scar-0.013
 arrows-0.012
 arrow-0.012
Token Lily
Token ID20037
Feature activation+4.93e-02

Pos logit contributions
 cane+0.052
Scar+0.050
bs+0.049
Aud+0.048
Charl+0.048
Neg logit contributions
icles-0.030
urg-0.029
ator-0.028
 Esk-0.025
 M-0.024
Token asked
Token ID1965
Feature activation+4.60e-02

Pos logit contributions
bs+0.092
 cane+0.087
powers+0.087
Scar+0.086
 arrows+0.086
Neg logit contributions
urg-0.067
ator-0.060
icles-0.060
 M-0.054
ie-0.053
Token.
Token ID13
Feature activation+5.21e-02

Pos logit contributions
 cane+0.085
Aud+0.085
Scar+0.085
powers+0.085
bs+0.083
Neg logit contributions
icles-0.066
urg-0.061
ator-0.060
io-0.054
icle-0.053
Token
Token ID198
Feature activation+4.01e-02

Pos logit contributions
able+0.101
bs+0.101
 arrow+0.100
 cane+0.100
 arrows+0.099
Neg logit contributions
icles-0.057
ator-0.052
 them-0.050
icle-0.047
 M-0.046
Token
Token ID198
Feature activation+4.61e-02

Pos logit contributions
 cane+0.072
nam+0.070
bs+0.068
 confident+0.067
able+0.066
Neg logit contributions
ator-0.061
urg-0.059
icles-0.057
 them-0.048
aby-0.048
Token"
Token ID1
Feature activation+1.14e-03

Pos logit contributions
bs+0.087
 cane+0.083
nam+0.082
 confident+0.082
 gust+0.082
Neg logit contributions
ator-0.053
urg-0.049
icles-0.048
io-0.043
man-0.040
Token rabbit
Token ID22746
Feature activation-4.27e-03

Pos logit contributions
bs+0.055
Scar+0.053
Aud+0.049
 cane+0.049
able+0.047
Neg logit contributions
ator-0.076
urg-0.076
icles-0.069
 ig-0.065
rim-0.065
Token hopped
Token ID45230
Feature activation-1.83e-02

Pos logit contributions
urg+0.007
ator+0.007
icles+0.007
rim+0.006
icle+0.006
Neg logit contributions
bs-0.007
 cane-0.007
Scar-0.007
 arrows-0.007
 arrow-0.007
Token out
Token ID503
Feature activation-6.27e-03

Pos logit contributions
urg+0.036
ator+0.036
icles+0.034
rim+0.031
ric+0.029
Neg logit contributions
 cane-0.027
bs-0.027
Scar-0.026
 arrows-0.025
 arrow-0.025
Token of
Token ID286
Feature activation+8.64e-03

Pos logit contributions
urg+0.014
ator+0.014
icles+0.013
rim+0.012
icle+0.012
Neg logit contributions
bs-0.007
 cane-0.007
Scar-0.007
 arrows-0.007
 arrow-0.006
Token the
Token ID262
Feature activation+5.62e-03

Pos logit contributions
bs+0.013
Scar+0.013
 cane+0.012
 confident+0.011
powers+0.011
Neg logit contributions
urg-0.017
ator-0.016
icles-0.016
rim-0.014
zie-0.014
Token bushes
Token ID37413
Feature activation+2.86e-02

Pos logit contributions
bs+0.010
Scar+0.010
 cane+0.010
 confident+0.009
Aud+0.009
Neg logit contributions
ator-0.009
urg-0.009
icles-0.009
rim-0.008
zie-0.007
Token.
Token ID13
Feature activation+9.59e-03

Pos logit contributions
Scar+0.044
bs+0.042
 cane+0.042
Har+0.039
Aud+0.039
Neg logit contributions
urg-0.052
icles-0.052
ator-0.052
icle-0.044
zie-0.044
Token L
Token ID406
Feature activation-2.10e-02

Pos logit contributions
bs+0.015
 cane+0.015
Scar+0.014
 arrows+0.014
able+0.013
Neg logit contributions
urg-0.017
ator-0.017
icles-0.017
icle-0.014
rim-0.014
Tokenola
Token ID5708
Feature activation+1.85e-02

Pos logit contributions
urg+0.037
ator+0.036
icles+0.035
rim+0.030
icle+0.029
Neg logit contributions
bs-0.035
Scar-0.034
 cane-0.033
Har-0.030
able-0.030
Token was
Token ID373
Feature activation+1.64e-02

Pos logit contributions
 cane+0.030
bs+0.030
Scar+0.029
 arrows+0.027
Aud+0.027
Neg logit contributions
urg-0.032
ator-0.031
icles-0.029
rim-0.027
zie-0.026
Token delighted
Token ID20707
Feature activation-8.96e-03

Pos logit contributions
bs+0.024
 cane+0.023
Scar+0.022
able+0.021
 arrows+0.021
Neg logit contributions
urg-0.033
ator-0.032
icles-0.030
rim-0.028
zie-0.027
Token time
Token ID640
Feature activation+1.64e-02

Pos logit contributions
bs+0.045
 cane+0.045
 arrows+0.044
nam+0.043
Scar+0.043
Neg logit contributions
urg-0.041
ator-0.040
icles-0.036
rim-0.035
zie-0.034
Token,
Token ID11
Feature activation-6.27e-03

Pos logit contributions
 cane+0.021
Scar+0.020
bs+0.020
 arrows+0.019
 confident+0.019
Neg logit contributions
ator-0.035
urg-0.035
icles-0.033
rim-0.030
zie-0.029
Token there
Token ID612
Feature activation-3.60e-02

Pos logit contributions
urg+0.014
ator+0.014
icles+0.013
rim+0.012
icle+0.012
Neg logit contributions
bs-0.008
 cane-0.008
Scar-0.007
 arrows-0.007
 arrow-0.007
Token was
Token ID373
Feature activation+2.54e-02

Pos logit contributions
urg+0.074
ator+0.070
icles+0.067
rim+0.062
icle+0.057
Neg logit contributions
 cane-0.050
bs-0.048
Scar-0.047
-0.045
 string-0.045
Token a
Token ID257
Feature activation+1.30e-02

Pos logit contributions
bs+0.033
 cane+0.031
Scar+0.029
 arrows+0.027
powers+0.027
Neg logit contributions
urg-0.054
ator-0.052
icles-0.052
icle-0.047
 Esk-0.047
Token little
Token ID1310
Feature activation+3.63e-02

Pos logit contributions
bs+0.019
Scar+0.017
 cane+0.017
 arrows+0.016
Aud+0.016
Neg logit contributions
urg-0.025
ator-0.025
icles-0.023
rim-0.022
 Esk-0.021
Token girl
Token ID2576
Feature activation+6.14e-03

Pos logit contributions
Scar+0.071
bs+0.068
Aud+0.067
Jennifer+0.064
ver+0.063
Neg logit contributions
ator-0.050
urg-0.049
icles-0.046
 ig-0.045
 polar-0.044
Token.
Token ID13
Feature activation-1.13e-03

Pos logit contributions
Scar+0.010
 cane+0.010
bs+0.009
 arrows+0.009
Aud+0.009
Neg logit contributions
ator-0.011
urg-0.011
icles-0.010
rim-0.010
zie-0.009
Token She
Token ID1375
Feature activation-4.18e-03

Pos logit contributions
urg+0.003
ator+0.003
icles+0.003
rim+0.002
icle+0.002
Neg logit contributions
 cane-0.001
bs-0.001
Scar-0.001
 arrows-0.001
 arrow-0.001
Token was
Token ID373
Feature activation-1.20e-03

Pos logit contributions
urg+0.008
ator+0.008
icles+0.007
rim+0.007
icle+0.006
Neg logit contributions
bs-0.006
 cane-0.006
Scar-0.006
 arrows-0.006
 arrow-0.006
Token very
Token ID845
Feature activation-2.14e-04

Pos logit contributions
urg+0.003
ator+0.003
icles+0.002
rim+0.002
icle+0.002
Neg logit contributions
bs-0.002
 cane-0.002
Scar-0.001
 arrows-0.001
 arrow-0.001
Token pulled
Token ID5954
Feature activation+1.30e-02

Pos logit contributions
Scar+0.037
 cane+0.036
bs+0.036
able+0.034
Har+0.034
Neg logit contributions
urg-0.040
ator-0.040
icles-0.039
icle-0.034
 Esk-0.032
Token it
Token ID340
Feature activation-3.22e-03

Pos logit contributions
Scar+0.021
bs+0.021
 cane+0.021
Aud+0.020
Har+0.020
Neg logit contributions
urg-0.022
icles-0.022
ator-0.022
icle-0.019
zie-0.017
Token away
Token ID1497
Feature activation+8.11e-03

Pos logit contributions
urg+0.006
ator+0.006
icles+0.005
rim+0.005
icle+0.005
Neg logit contributions
bs-0.005
 cane-0.005
Scar-0.005
 arrows-0.005
 arrow-0.005
Token from
Token ID422
Feature activation-2.51e-03

Pos logit contributions
bs+0.012
Scar+0.012
 cane+0.012
Aud+0.011
Jennifer+0.011
Neg logit contributions
urg-0.016
ator-0.016
icles-0.015
icle-0.013
rim-0.013
Token the
Token ID262
Feature activation+7.68e-03

Pos logit contributions
urg+0.005
ator+0.005
icles+0.005
rim+0.004
icle+0.004
Neg logit contributions
bs-0.004
 cane-0.004
Scar-0.003
 arrows-0.003
Aud-0.003
Token tree
Token ID5509
Feature activation+2.55e-02

Pos logit contributions
bs+0.012
Scar+0.012
 cane+0.011
Aud+0.011
Jennifer+0.011
Neg logit contributions
ator-0.014
urg-0.014
icles-0.013
icle-0.012
 Esk-0.011
Token.
Token ID13
Feature activation+2.35e-02

Pos logit contributions
Scar+0.040
bs+0.039
 cane+0.038
Har+0.036
Jennifer+0.035
Neg logit contributions
urg-0.047
ator-0.047
icles-0.045
icle-0.039
rim-0.038
Token The
Token ID383
Feature activation+6.74e-03

Pos logit contributions
bs+0.038
 cane+0.037
Scar+0.036
able+0.035
 arrows+0.034
Neg logit contributions
icles-0.041
ator-0.040
urg-0.040
icle-0.035
 Esk-0.033
Token car
Token ID1097
Feature activation+5.65e-04

Pos logit contributions
bs+0.012
Scar+0.011
 cane+0.011
Aud+0.010
Har+0.010
Neg logit contributions
ator-0.012
urg-0.012
icles-0.011
icle-0.010
 Esk-0.010
Token was
Token ID373
Feature activation-2.49e-03

Pos logit contributions
bs+0.001
 cane+0.001
Scar+0.001
 arrows+0.001
Har+0.001
Neg logit contributions
urg-0.001
ator-0.001
icles-0.001
rim-0.001
icle-0.001
Token still
Token ID991
Feature activation+2.31e-02

Pos logit contributions
urg+0.005
ator+0.005
icles+0.005
rim+0.005
icle+0.004
Neg logit contributions
bs-0.003
 cane-0.003
Scar-0.003
 arrows-0.003
Har-0.003
Token their
Token ID511
Feature activation-9.29e-03

Pos logit contributions
bs+0.017
Scar+0.016
 cane+0.016
 confident+0.015
 arrows+0.015
Neg logit contributions
ator-0.023
urg-0.022
icles-0.021
rim-0.018
icle-0.018
Token trip
Token ID5296
Feature activation+4.67e-03

Pos logit contributions
urg+0.018
ator+0.017
icles+0.017
rim+0.015
icle+0.014
Neg logit contributions
 cane-0.014
bs-0.014
Scar-0.014
 arrows-0.013
 arrow-0.013
Token.
Token ID13
Feature activation+7.57e-03

Pos logit contributions
bs+0.006
 cane+0.006
Scar+0.006
 arrows+0.006
 arrow+0.005
Neg logit contributions
ator-0.010
urg-0.010
icles-0.009
rim-0.008
icle-0.008
Token<|endoftext|>
Token ID50256
Feature activation+1.74e-02

Pos logit contributions
 cane+0.011
bs+0.011
Scar+0.010
 arrows+0.010
 arrow+0.010
Neg logit contributions
urg-0.015
ator-0.015
icles-0.014
rim-0.013
icle-0.012
TokenOnce
Token ID7454
Feature activation-5.91e-03

Pos logit contributions
bs+0.025
 cane+0.025
Scar+0.023
 arrow+0.023
 arrows+0.023
Neg logit contributions
ator-0.034
urg-0.033
icles-0.032
rim-0.029
icle-0.028
Token upon
Token ID2402
Feature activation+4.21e-02

Pos logit contributions
urg+0.008
ator+0.008
icles+0.008
rim+0.006
icle+0.006
Neg logit contributions
 cane-0.012
bs-0.012
Scar-0.012
 arrows-0.011
 arrow-0.011
Token a
Token ID257
Feature activation+2.69e-02

Pos logit contributions
bs+0.061
 cane+0.055
Scar+0.054
Aud+0.050
 arrows+0.050
Neg logit contributions
urg-0.079
icles-0.077
ator-0.076
 Esk-0.075
icle-0.069
Token time
Token ID640
Feature activation+2.27e-02

Pos logit contributions
bs+0.046
 cane+0.046
 arrows+0.045
nam+0.045
Scar+0.044
Neg logit contributions
urg-0.043
ator-0.041
icles-0.038
rim-0.036
zie-0.035
Token,
Token ID11
Feature activation+3.68e-03

Pos logit contributions
 cane+0.032
bs+0.030
Scar+0.030
 arrows+0.028
 confident+0.028
Neg logit contributions
ator-0.047
urg-0.046
icles-0.044
rim-0.039
zie-0.039
Token there
Token ID612
Feature activation-3.30e-02

Pos logit contributions
 cane+0.005
bs+0.005
Scar+0.005
 arrows+0.004
 confident+0.004
Neg logit contributions
urg-0.008
ator-0.008
icles-0.007
rim-0.006
icle-0.006
Token was
Token ID373
Feature activation+2.81e-02

Pos logit contributions
urg+0.066
ator+0.062
icles+0.060
rim+0.055
icle+0.051
Neg logit contributions
 cane-0.047
bs-0.046
Scar-0.045
 string-0.043
able-0.043
Token Finally
Token ID9461
Feature activation-1.03e-02

Pos logit contributions
urg+0.034
ator+0.033
icles+0.031
rim+0.029
icle+0.027
Neg logit contributions
bs-0.022
 cane-0.021
Scar-0.021
 arrows-0.019
 arrow-0.019
Token they
Token ID484
Feature activation+6.93e-03

Pos logit contributions
urg+0.026
ator+0.026
icles+0.025
rim+0.023
icle+0.022
Neg logit contributions
bs-0.009
 cane-0.009
Scar-0.009
 arrows-0.008
 arrow-0.008
Token did
Token ID750
Feature activation+8.50e-03

Pos logit contributions
 cane+0.011
Scar+0.011
bs+0.011
 arrows+0.010
powers+0.010
Neg logit contributions
urg-0.013
ator-0.012
icles-0.012
rim-0.010
zie-0.010
Token it
Token ID340
Feature activation-1.38e-02

Pos logit contributions
bs+0.011
 cane+0.011
Scar+0.011
able+0.010
 arrows+0.010
Neg logit contributions
urg-0.018
ator-0.018
icles-0.017
rim-0.015
icle-0.015
Token!
Token ID0
Feature activation+5.12e-04

Pos logit contributions
urg+0.028
ator+0.028
icles+0.027
rim+0.025
icle+0.023
Neg logit contributions
 cane-0.019
bs-0.018
Scar-0.018
 arrows-0.017
 arrow-0.017
Token With
Token ID2080
Feature activation+1.43e-02

Pos logit contributions
 cane+0.001
bs+0.001
Scar+0.001
 arrows+0.001
 arrow+0.001
Neg logit contributions
urg-0.001
ator-0.001
icles-0.001
rim-0.001
icle-0.001
Token the
Token ID262
Feature activation+2.20e-02

Pos logit contributions
Scar+0.021
 cane+0.021
bs+0.021
Har+0.019
powers+0.019
Neg logit contributions
urg-0.027
ator-0.026
icles-0.026
icle-0.023
rim-0.022
Token help
Token ID1037
Feature activation+1.44e-02

Pos logit contributions
bs+0.033
Scar+0.032
Aud+0.031
Har+0.029
Jennifer+0.029
Neg logit contributions
ator-0.041
urg-0.040
icles-0.039
zie-0.034
icle-0.034
Token of
Token ID286
Feature activation-1.73e-02

Pos logit contributions
Scar+0.018
 cane+0.018
bs+0.018
Har+0.016
Aud+0.016
Neg logit contributions
urg-0.031
ator-0.031
icles-0.030
rim-0.026
icle-0.026
Token his
Token ID465
Feature activation+9.07e-03

Pos logit contributions
urg+0.036
ator+0.035
icles+0.034
rim+0.031
icle+0.029
Neg logit contributions
 cane-0.023
bs-0.023
Scar-0.022
 arrows-0.021
 arrow-0.021
Token friends
Token ID2460
Feature activation+6.46e-03

Pos logit contributions
bs+0.013
Scar+0.013
 cane+0.012
Aud+0.012
Har+0.012
Neg logit contributions
urg-0.018
ator-0.018
icles-0.017
rim-0.014
icle-0.014
TokenBut
Token ID1537
Feature activation-4.04e-02

Pos logit contributions
urg+0.033
ator+0.032
icles+0.031
rim+0.028
icle+0.027
Neg logit contributions
bs-0.020
 cane-0.019
Scar-0.019
 arrows-0.017
Har-0.017
Token then
Token ID788
Feature activation+6.42e-03

Pos logit contributions
ator+0.095
urg+0.093
rim+0.085
icles+0.084
ric+0.077
Neg logit contributions
bs-0.049
 cane-0.045
Scar-0.042
 arrow-0.042
 string-0.041
Token Mom
Token ID11254
Feature activation-4.90e-03

Pos logit contributions
 cane+0.007
Scar+0.007
bs+0.007
 confident+0.007
Aud+0.007
Neg logit contributions
urg-0.014
ator-0.014
icles-0.014
rim-0.012
icle-0.012
Token said
Token ID531
Feature activation-1.22e-02

Pos logit contributions
urg+0.011
ator+0.011
icles+0.011
rim+0.010
icle+0.009
Neg logit contributions
 cane-0.006
bs-0.006
Scar-0.005
 arrows-0.005
 arrow-0.005
Token,
Token ID11
Feature activation-1.18e-02

Pos logit contributions
urg+0.029
ator+0.029
icles+0.028
rim+0.026
zie+0.025
Neg logit contributions
bs-0.013
 cane-0.013
Scar-0.012
 arrows-0.011
 arrow-0.010
Token "
Token ID366
Feature activation-9.26e-03

Pos logit contributions
urg+0.023
ator+0.023
icles+0.021
rim+0.020
icle+0.018
Neg logit contributions
bs-0.018
 cane-0.018
Scar-0.017
 arrows-0.016
 arrow-0.016
TokenOh
Token ID5812
Feature activation-1.61e-02

Pos logit contributions
urg+0.019
ator+0.019
icles+0.018
rim+0.016
icle+0.016
Neg logit contributions
bs-0.013
 cane-0.013
Scar-0.013
 arrows-0.011
 arrow-0.011
Token no
Token ID645
Feature activation-1.11e-03

Pos logit contributions
ator+0.034
urg+0.034
icles+0.032
rim+0.030
icle+0.028
Neg logit contributions
bs-0.022
 cane-0.021
Scar-0.020
 arrows-0.019
 arrow-0.019
Token,
Token ID11
Feature activation-2.12e-02

Pos logit contributions
urg+0.002
ator+0.002
icles+0.002
rim+0.002
icle+0.002
Neg logit contributions
 cane-0.001
bs-0.001
Scar-0.001
 arrows-0.001
 arrow-0.001
Token the
Token ID262
Feature activation-1.99e-02

Pos logit contributions
ator+0.046
urg+0.046
icles+0.043
rim+0.041
zie+0.038
Neg logit contributions
bs-0.027
 cane-0.026
Scar-0.025
 arrows-0.024
 string-0.024
Token car
Token ID1097
Feature activation-1.48e-02

Pos logit contributions
urg+0.046
ator+0.045
icles+0.042
rim+0.040
icle+0.037
Neg logit contributions
 cane-0.024
Scar-0.022
bs-0.022
 arrows-0.021
 string-0.021
Token,
Token ID11
Feature activation-2.48e-05

Pos logit contributions
 cane+0.025
Scar+0.024
bs+0.024
 confident+0.022
 arrows+0.022
Neg logit contributions
urg-0.039
ator-0.039
icles-0.038
icle-0.033
rim-0.033
Token there
Token ID612
Feature activation-3.77e-02

Pos logit contributions
urg+0.000
ator+0.000
icles+0.000
rim+0.000
icle+0.000
Neg logit contributions
 cane-0.000
bs-0.000
Scar-0.000
 arrows-0.000
 confident-0.000
Token was
Token ID373
Feature activation+2.85e-02

Pos logit contributions
urg+0.077
ator+0.074
icles+0.070
rim+0.067
urger+0.060
Neg logit contributions
 cane-0.051
bs-0.049
-0.047
Scar-0.047
 string-0.047
Token a
Token ID257
Feature activation+1.39e-02

Pos logit contributions
bs+0.039
 cane+0.036
Scar+0.034
powers+0.032
 confident+0.032
Neg logit contributions
urg-0.058
icles-0.057
ator-0.056
icle-0.052
 Esk-0.051
Token happy
Token ID3772
Feature activation+9.00e-03

Pos logit contributions
bs+0.020
Scar+0.019
 cane+0.018
 arrows+0.017
Aud+0.017
Neg logit contributions
urg-0.027
ator-0.027
icles-0.025
rim-0.023
 Esk-0.023
Token clay
Token ID21558
Feature activation-4.01e-02

Pos logit contributions
bs+0.014
Scar+0.013
 cane+0.013
Aud+0.012
 arrows+0.012
Neg logit contributions
urg-0.017
ator-0.017
icles-0.016
rim-0.015
icle-0.014
Token.
Token ID13
Feature activation+4.07e-04

Pos logit contributions
urg+0.092
icles+0.085
ator+0.085
rim+0.077
zie+0.077
Neg logit contributions
 cane-0.052
 arrow-0.047
bs-0.045
 arrows-0.045
 stamp-0.044
Token He
Token ID679
Feature activation-2.30e-02

Pos logit contributions
bs+0.001
 cane+0.001
Scar+0.001
 arrows+0.001
 confident+0.001
Neg logit contributions
urg-0.001
ator-0.001
icles-0.001
icle-0.001
rim-0.001
Token lived
Token ID5615
Feature activation-7.96e-03

Pos logit contributions
urg+0.048
ator+0.048
icles+0.047
rim+0.041
icle+0.040
Neg logit contributions
 cane-0.032
bs-0.031
 arrow-0.030
Scar-0.030
 arrows-0.029
Token in
Token ID287
Feature activation+2.14e-03

Pos logit contributions
urg+0.016
ator+0.016
icles+0.015
rim+0.014
icle+0.013
Neg logit contributions
 cane-0.011
bs-0.011
Scar-0.011
 arrows-0.010
 arrow-0.010
Token a
Token ID257
Feature activation-3.63e-03

Pos logit contributions
bs+0.002
Scar+0.002
 cane+0.002
 arrows+0.002
Aud+0.002
Neg logit contributions
urg-0.005
ator-0.005
icles-0.005
rim-0.005
icle-0.004
Token for
Token ID329
Feature activation-3.93e-02

Pos logit contributions
ator+0.060
urg+0.059
icles+0.056
rim+0.053
ric+0.050
Neg logit contributions
bs-0.036
 cane-0.035
Scar-0.032
 arrow-0.031
 confident-0.031
Token the
Token ID262
Feature activation+1.91e-03

Pos logit contributions
urg+0.073
ator+0.072
icles+0.068
rim+0.064
icle+0.059
Neg logit contributions
 cane-0.058
Scar-0.057
bs-0.056
Har-0.054
 short-0.054
Token boy
Token ID2933
Feature activation+2.17e-02

Pos logit contributions
bs+0.003
 cane+0.003
Scar+0.003
 arrows+0.002
Aud+0.002
Neg logit contributions
urg-0.004
ator-0.004
icles-0.004
rim-0.003
icle-0.003
Token and
Token ID290
Feature activation-3.72e-03

Pos logit contributions
Scar+0.031
 cane+0.030
bs+0.029
 arrows+0.028
Har+0.028
Neg logit contributions
urg-0.041
ator-0.040
icles-0.040
rim-0.036
icle-0.035
Token he
Token ID339
Feature activation-1.03e-02

Pos logit contributions
urg+0.008
ator+0.008
icles+0.008
rim+0.007
icle+0.007
Neg logit contributions
 cane-0.005
bs-0.005
Scar-0.004
 arrows-0.004
 arrow-0.004
Token was
Token ID373
Feature activation-2.23e-03

Pos logit contributions
urg+0.022
ator+0.022
icles+0.021
rim+0.019
icle+0.018
Neg logit contributions
bs-0.013
 cane-0.013
Scar-0.013
 arrows-0.012
 arrow-0.012
Token happy
Token ID3772
Feature activation-3.79e-02

Pos logit contributions
urg+0.005
ator+0.005
icles+0.005
rim+0.004
icle+0.004
Neg logit contributions
 cane-0.003
bs-0.003
Scar-0.003
 arrows-0.002
 arrow-0.002
Token to
Token ID284
Feature activation-1.97e-02

Pos logit contributions
urg+0.097
ator+0.094
icles+0.091
rim+0.085
icer+0.082
Neg logit contributions
 cane-0.035
bs-0.034
 confident-0.031
 she-0.031
Scar-0.029
Token help
Token ID1037
Feature activation-1.80e-02

Pos logit contributions
urg+0.043
ator+0.042
icles+0.041
rim+0.037
icle+0.036
Neg logit contributions
bs-0.025
 cane-0.024
Scar-0.024
 arrow-0.022
 confident-0.022
Token.
Token ID13
Feature activation-1.04e-02

Pos logit contributions
urg+0.038
ator+0.037
icles+0.035
rim+0.032
icle+0.031
Neg logit contributions
 cane-0.024
bs-0.024
Scar-0.023
 arrows-0.021
 arrow-0.021
Token He
Token ID679
Feature activation-8.55e-04

Pos logit contributions
urg+0.022
ator+0.022
icles+0.021
rim+0.019
icle+0.018
Neg logit contributions
bs-0.014
 cane-0.014
Scar-0.013
 arrows-0.012
 arrow-0.012
Token time
Token ID640
Feature activation+5.54e-02

Pos logit contributions
bs+0.059
 cane+0.056
ite+0.053
powers+0.053
Scar+0.053
Neg logit contributions
ator-0.049
urg-0.045
icles-0.042
zie-0.038
rim-0.038
Token with
Token ID351
Feature activation-9.61e-03

Pos logit contributions
 cane+0.114
Scar+0.107
Har+0.103
bs+0.102
Michelle+0.101
Neg logit contributions
ator-0.067
urg-0.059
we-0.059
rim-0.055
balls-0.051
Token him
Token ID683
Feature activation+4.12e-05

Pos logit contributions
urg+0.019
ator+0.018
icles+0.018
rim+0.016
icle+0.015
Neg logit contributions
 cane-0.014
bs-0.014
Scar-0.014
 arrows-0.013
 arrow-0.013
Token and
Token ID290
Feature activation-1.23e-02

Pos logit contributions
bs+0.000
 cane+0.000
Scar+0.000
 arrows+0.000
 arrow+0.000
Neg logit contributions
urg-0.000
ator-0.000
icles-0.000
rim-0.000
icle-0.000
Token they
Token ID484
Feature activation-1.87e-02

Pos logit contributions
urg+0.023
ator+0.023
icles+0.022
rim+0.020
icle+0.019
Neg logit contributions
bs-0.019
 cane-0.019
Scar-0.018
 arrows-0.017
 arrow-0.017
Token both
Token ID1111
Feature activation-9.14e-03

Pos logit contributions
urg+0.037
ator+0.036
icles+0.035
rim+0.032
icle+0.030
Neg logit contributions
bs-0.029
 cane-0.028
Scar-0.027
 arrows-0.026
 arrow-0.026
Token had
Token ID550
Feature activation+1.41e-02

Pos logit contributions
urg+0.019
ator+0.019
icles+0.018
rim+0.016
icle+0.016
Neg logit contributions
bs-0.013
 cane-0.012
Scar-0.012
 arrows-0.011
 arrow-0.011
Token a
Token ID257
Feature activation+1.77e-02

Pos logit contributions
 cane+0.020
bs+0.020
Scar+0.018
Har+0.017
 arrow+0.016
Neg logit contributions
urg-0.029
ator-0.027
icles-0.026
rim-0.023
 Esk-0.023
Token great
Token ID1049
Feature activation+3.43e-03

Pos logit contributions
bs+0.029
 cane+0.028
Scar+0.027
 arrows+0.026
 arrow+0.026
Neg logit contributions
ator-0.030
urg-0.029
icles-0.026
rim-0.024
 Esk-0.024
Token day
Token ID1110
Feature activation+3.41e-02

Pos logit contributions
bs+0.006
 cane+0.006
Scar+0.006
 arrows+0.005
Aud+0.005
Neg logit contributions
urg-0.006
ator-0.006
icles-0.006
rim-0.005
icle-0.005
Token.
Token ID13
Feature activation-2.06e-02

Pos logit contributions
 cane+0.063
Scar+0.063
Har+0.056
Mic+0.056
Michelle+0.056
Neg logit contributions
urg-0.052
ator-0.050
icles-0.045
rim-0.043
 setting-0.041
Token said
Token ID531
Feature activation-3.25e-02

Pos logit contributions
urg+0.013
ator+0.013
icles+0.013
rim+0.011
icle+0.011
Neg logit contributions
bs-0.011
 cane-0.011
Scar-0.011
 arrows-0.010
 arrow-0.010
Token I
Token ID314
Feature activation-9.51e-03

Pos logit contributions
ator+0.077
urg+0.075
icles+0.071
rim+0.068
zie+0.064
Neg logit contributions
bs-0.037
 cane-0.036
 arrows-0.033
Scar-0.033
 arrow-0.032
Token couldn
Token ID3521
Feature activation+9.45e-03

Pos logit contributions
urg+0.021
ator+0.021
icles+0.020
rim+0.018
icle+0.017
Neg logit contributions
bs-0.012
 cane-0.012
Scar-0.011
 arrows-0.010
 arrow-0.010
Token't
Token ID470
Feature activation-8.34e-03

Pos logit contributions
bs+0.018
 cane+0.018
Scar+0.018
 arrows+0.017
 confident+0.017
Neg logit contributions
ator-0.014
urg-0.014
icles-0.013
rim-0.011
icle-0.011
Token because
Token ID780
Feature activation+6.85e-03

Pos logit contributions
urg+0.016
ator+0.016
icles+0.015
rim+0.014
icle+0.013
Neg logit contributions
bs-0.012
 cane-0.012
Scar-0.012
 arrows-0.011
 arrow-0.011
Token I
Token ID314
Feature activation-2.41e-02

Pos logit contributions
bs+0.009
Scar+0.009
 cane+0.009
 arrows+0.008
Aud+0.008
Neg logit contributions
urg-0.014
ator-0.014
icles-0.013
rim-0.012
icle-0.012
Token don
Token ID836
Feature activation+2.07e-02

Pos logit contributions
urg+0.053
ator+0.053
icles+0.050
rim+0.045
icle+0.043
Neg logit contributions
bs-0.033
 cane-0.031
Scar-0.029
 arrow-0.027
 confident-0.027
Token't
Token ID470
Feature activation+1.02e-02

Pos logit contributions
bs+0.043
 cane+0.042
Scar+0.041
 arrows+0.040
able+0.039
Neg logit contributions
ator-0.029
urg-0.029
icles-0.027
rim-0.022
icle-0.022
Token know
Token ID760
Feature activation-4.23e-03

Pos logit contributions
 cane+0.013
Scar+0.013
bs+0.013
 arrows+0.012
Aud+0.012
Neg logit contributions
urg-0.021
ator-0.021
icles-0.020
rim-0.019
zie-0.018
Token how
Token ID703
Feature activation-1.20e-02

Pos logit contributions
urg+0.007
ator+0.007
icles+0.007
rim+0.006
icle+0.006
Neg logit contributions
 cane-0.007
bs-0.007
Scar-0.007
 arrows-0.006
 arrow-0.006
Token to
Token ID284
Feature activation-2.07e-02

Pos logit contributions
urg+0.025
ator+0.025
icles+0.024
rim+0.022
icle+0.021
Neg logit contributions
bs-0.016
 cane-0.016
Scar-0.015
 arrows-0.014
 arrow-0.014
Token was
Token ID373
Feature activation-1.55e-02

Pos logit contributions
urg+0.025
ator+0.024
icles+0.023
rim+0.021
icle+0.020
Neg logit contributions
 cane-0.024
bs-0.023
Scar-0.023
 arrows-0.022
 arrow-0.022
Token flying
Token ID7348
Feature activation-4.70e-03

Pos logit contributions
urg+0.037
ator+0.036
icles+0.035
rim+0.032
icle+0.031
Neg logit contributions
 cane-0.017
bs-0.016
Scar-0.016
 arrows-0.014
 confident-0.014
Token!
Token ID0
Feature activation-4.32e-03

Pos logit contributions
urg+0.010
ator+0.010
icles+0.009
rim+0.008
icle+0.008
Neg logit contributions
bs-0.006
 cane-0.006
Scar-0.006
 arrows-0.006
 arrow-0.006
Token She
Token ID1375
Feature activation-1.64e-03

Pos logit contributions
urg+0.010
ator+0.009
icles+0.009
rim+0.008
icle+0.008
Neg logit contributions
 cane-0.005
bs-0.005
Scar-0.005
 arrows-0.005
 arrow-0.005
Token pretended
Token ID36766
Feature activation-8.17e-03

Pos logit contributions
urg+0.003
ator+0.003
icles+0.003
rim+0.002
icle+0.002
Neg logit contributions
 cane-0.003
bs-0.003
Scar-0.003
 arrows-0.002
Aud-0.002
Token her
Token ID607
Feature activation-5.78e-02

Pos logit contributions
urg+0.018
ator+0.018
icles+0.017
rim+0.015
icle+0.015
Neg logit contributions
 cane-0.010
bs-0.010
Scar-0.010
 arrows-0.009
 arrow-0.009
Token t
Token ID256
Feature activation-4.93e-02

Pos logit contributions
urg+0.144
rim+0.131
ator+0.129
icles+0.128
ric+0.116
Neg logit contributions
 cane-0.077
 arrow-0.063
 string-0.062
 pencil-0.061
 short-0.057
Tokeneddy
Token ID21874
Feature activation-5.40e-02

Pos logit contributions
urg+0.078
ator+0.072
icles+0.068
rim+0.062
zie+0.061
Neg logit contributions
 cane-0.091
bs-0.083
 arrow-0.081
powers-0.079
 pencil-0.079
Token bear
Token ID6842
Feature activation-1.07e-02

Pos logit contributions
urg+0.105
ator+0.097
icles+0.092
rim+0.091
add+0.084
Neg logit contributions
 cane-0.093
 arrow-0.086
 pencil-0.084
 arrows-0.083
 string-0.081
Token could
Token ID714
Feature activation-2.04e-02

Pos logit contributions
urg+0.018
ator+0.017
icles+0.017
rim+0.015
icle+0.014
Neg logit contributions
 cane-0.019
bs-0.019
Scar-0.018
 arrows-0.017
 arrow-0.017
Token fly
Token ID6129
Feature activation+2.27e-03

Pos logit contributions
urg+0.044
ator+0.044
icles+0.042
rim+0.039
zie+0.037
Neg logit contributions
 cane-0.026
bs-0.026
Scar-0.024
 arrow-0.024
 arrows-0.023
Token,
Token ID11
Feature activation-2.01e-02

Pos logit contributions
Scar+0.009
 cane+0.009
bs+0.008
Aud+0.008
powers+0.008
Neg logit contributions
urg-0.016
ator-0.016
icles-0.016
icle-0.014
rim-0.014
Token they
Token ID484
Feature activation+1.61e-03

Pos logit contributions
urg+0.039
ator+0.038
icles+0.035
rim+0.034
zie+0.030
Neg logit contributions
 cane-0.031
bs-0.031
Scar-0.029
 arrow-0.029
 arrows-0.029
Token finished
Token ID5201
Feature activation-1.94e-03

Pos logit contributions
 cane+0.003
bs+0.003
Scar+0.002
 arrows+0.002
powers+0.002
Neg logit contributions
urg-0.003
ator-0.003
icles-0.003
rim-0.002
zie-0.002
Token building
Token ID2615
Feature activation+2.36e-03

Pos logit contributions
urg+0.004
ator+0.004
icles+0.003
rim+0.003
icle+0.003
Neg logit contributions
bs-0.003
 cane-0.003
Scar-0.003
 arrows-0.003
 arrow-0.003
Token the
Token ID262
Feature activation-5.04e-03

Pos logit contributions
bs+0.003
 cane+0.003
Scar+0.003
 arrows+0.003
Aud+0.003
Neg logit contributions
urg-0.005
ator-0.005
icles-0.005
rim-0.004
icle-0.004
Token ig
Token ID45329
Feature activation-7.19e-02

Pos logit contributions
urg+0.009
ator+0.009
icles+0.008
rim+0.007
icle+0.007
Neg logit contributions
bs-0.008
 cane-0.008
Scar-0.008
 arrows-0.008
 arrow-0.007
Tokenloo
Token ID29680
Feature activation-5.26e-03

Pos logit contributions
urg+0.100
bet+0.083
maid+0.079
 vows+0.077
 perfection+0.075
Neg logit contributions
ite-0.134
able-0.124
bs-0.122
br-0.114
chair-0.112
Token.
Token ID13
Feature activation-3.05e-02

Pos logit contributions
urg+0.012
ator+0.011
icles+0.011
rim+0.010
icle+0.010
Neg logit contributions
bs-0.006
 cane-0.006
Scar-0.006
 arrows-0.006
 arrow-0.006
Token Tim
Token ID5045
Feature activation-1.94e-02

Pos logit contributions
ator+0.064
urg+0.064
rim+0.058
icles+0.058
io+0.053
Neg logit contributions
bs-0.040
 Jerry-0.039
 cane-0.038
Scar-0.037
 arrow-0.037
Tokenmy
Token ID1820
Feature activation+9.43e-04

Pos logit contributions
urg+0.037
ator+0.037
icles+0.036
rim+0.031
icle+0.030
Neg logit contributions
 cane-0.029
bs-0.028
Scar-0.028
 arrow-0.027
able-0.026
Token was
Token ID373
Feature activation-1.03e-02

Pos logit contributions
bs+0.002
 cane+0.002
Scar+0.002
 arrows+0.001
Aud+0.001
Neg logit contributions
urg-0.002
ator-0.002
icles-0.002
rim-0.001
zie-0.001
Token there
Token ID612
Feature activation-3.67e-02

Pos logit contributions
 cane+0.020
bs+0.019
Scar+0.019
 arrows+0.017
 confident+0.017
Neg logit contributions
urg-0.033
ator-0.033
icles-0.032
rim-0.028
zie-0.028
Token was
Token ID373
Feature activation+3.03e-02

Pos logit contributions
urg+0.075
ator+0.070
icles+0.067
rim+0.062
icle+0.058
Neg logit contributions
bs-0.052
 cane-0.052
Scar-0.050
able-0.047
-0.047
Token a
Token ID257
Feature activation+1.86e-02

Pos logit contributions
 cane+0.038
bs+0.038
Scar+0.035
 arrows+0.034
powers+0.033
Neg logit contributions
urg-0.062
ator-0.059
icles-0.058
 Esk-0.054
icle-0.053
Token tank
Token ID6873
Feature activation-2.41e-02

Pos logit contributions
bs+0.028
Scar+0.026
 cane+0.025
 arrows+0.025
Aud+0.024
Neg logit contributions
urg-0.035
ator-0.035
icles-0.031
rim-0.030
 Esk-0.030
Token.
Token ID13
Feature activation-9.16e-03

Pos logit contributions
urg+0.057
ator+0.055
icles+0.054
zie+0.048
rim+0.048
Neg logit contributions
 cane-0.026
bs-0.025
 arrow-0.023
Scar-0.022
 string-0.022
Token It
Token ID632
Feature activation-5.86e-02

Pos logit contributions
urg+0.019
ator+0.019
icles+0.018
rim+0.016
icle+0.016
Neg logit contributions
 cane-0.012
bs-0.012
Scar-0.012
 arrows-0.011
 arrow-0.011
Token was
Token ID373
Feature activation-2.91e-02

Pos logit contributions
urg+0.150
ator+0.141
icles+0.139
umm+0.128
zie+0.124
Neg logit contributions
-0.059
 string-0.055
 tied-0.055
 arrow-0.054
 cane-0.052
Token big
Token ID1263
Feature activation+7.69e-03

Pos logit contributions
ator+0.079
urg+0.078
icles+0.074
rim+0.070
zie+0.069
Neg logit contributions
 cane-0.021
 arrow-0.020
 long-0.019
 string-0.019
 short-0.019
Token and
Token ID290
Feature activation-5.70e-03

Pos logit contributions
bs+0.010
Scar+0.009
 cane+0.009
Aud+0.008
Har+0.008
Neg logit contributions
urg-0.017
ator-0.017
icles-0.016
rim-0.015
icle-0.014
Token shiny
Token ID22441
Feature activation+8.92e-03

Pos logit contributions
urg+0.014
ator+0.013
icles+0.013
rim+0.012
icle+0.011
Neg logit contributions
bs-0.006
 cane-0.006
Scar-0.006
 arrows-0.005
 arrow-0.005
Token!
Token ID0
Feature activation+1.48e-02

Pos logit contributions
Scar+0.013
bs+0.013
Aud+0.012
 cane+0.012
Jennifer+0.012
Neg logit contributions
ator-0.018
urg-0.018
icles-0.017
icle-0.015
rim-0.015
TokenYes
Token ID5297
Feature activation-4.48e-02

Pos logit contributions
urg+0.032
ator+0.031
icles+0.029
rim+0.026
icle+0.025
Neg logit contributions
bs-0.022
 cane-0.022
Scar-0.022
 arrows-0.020
 arrow-0.020
Token Bella
Token ID39304
Feature activation-8.02e-03

Pos logit contributions
ator+0.138
urg+0.134
icles+0.130
rim+0.129
ric+0.128
Neg logit contributions
bs-0.021
 long-0.012
 she-0.011
 short-0.011
 cane-0.011
Token.
Token ID13
Feature activation-4.33e-02

Pos logit contributions
urg+0.019
ator+0.019
icles+0.018
rim+0.017
icle+0.016
Neg logit contributions
bs-0.008
 cane-0.008
Scar-0.008
 arrows-0.007
 arrow-0.007
Token This
Token ID770
Feature activation-1.09e-02

Pos logit contributions
urg+0.094
ator+0.090
rim+0.086
icles+0.084
io+0.076
Neg logit contributions
bs-0.056
 cane-0.052
Scar-0.050
 arrow-0.048
 Jerry-0.048
Token is
Token ID318
Feature activation-1.84e-02

Pos logit contributions
urg+0.020
ator+0.019
icles+0.019
rim+0.017
icle+0.016
Neg logit contributions
 cane-0.018
bs-0.018
Scar-0.017
 arrows-0.016
 arrow-0.016
Token our
Token ID674
Feature activation-5.82e-02

Pos logit contributions
urg+0.045
ator+0.044
icles+0.042
rim+0.040
io+0.038
Neg logit contributions
 cane-0.018
Scar-0.018
bs-0.018
 arrow-0.016
 arrows-0.016
Token beautiful
Token ID4950
Feature activation-5.07e-02

Pos logit contributions
urg+0.133
icles+0.121
ator+0.120
rim+0.115
icle+0.108
Neg logit contributions
 cane-0.085
 string-0.072
 long-0.071
 newspaper-0.071
 dictionary-0.068
Token porch
Token ID33179
Feature activation-1.65e-02

Pos logit contributions
urg+0.104
ator+0.099
icles+0.095
rim+0.091
bled+0.086
Neg logit contributions
 cane-0.076
 arrow-0.063
 string-0.063
 newspaper-0.063
bs-0.062
Token.
Token ID13
Feature activation-3.80e-02

Pos logit contributions
urg+0.034
ator+0.033
icles+0.032
rim+0.030
icle+0.028
Neg logit contributions
bs-0.022
 cane-0.022
Scar-0.021
 arrow-0.019
 arrows-0.019
Token Come
Token ID7911
Feature activation-6.08e-02

Pos logit contributions
urg+0.080
ator+0.078
icles+0.072
rim+0.070
io+0.066
Neg logit contributions
bs-0.051
 cane-0.048
Scar-0.047
-0.046
 arrow-0.045
Token,
Token ID11
Feature activation-4.63e-02

Pos logit contributions
urg+0.167
ator+0.160
icles+0.157
rim+0.155
zie+0.142
Neg logit contributions
 cane-0.052
bs-0.049
 tie-0.044
-0.044
Scar-0.042
Token He
Token ID679
Feature activation-2.16e-02

Pos logit contributions
urg+0.019
ator+0.019
icles+0.018
rim+0.016
icle+0.015
Neg logit contributions
 cane-0.015
bs-0.015
Scar-0.015
 arrows-0.014
 arrow-0.014
Token asked
Token ID1965
Feature activation-2.65e-02

Pos logit contributions
urg+0.048
ator+0.047
icles+0.046
rim+0.042
icle+0.040
Neg logit contributions
 cane-0.027
bs-0.027
Scar-0.026
 arrow-0.024
 arrows-0.024
Token his
Token ID465
Feature activation-4.50e-02

Pos logit contributions
ator+0.056
urg+0.054
icles+0.051
rim+0.049
ric+0.046
Neg logit contributions
bs-0.037
 cane-0.037
Scar-0.035
 arrows-0.033
 confident-0.032
Token mom
Token ID1995
Feature activation-1.62e-02

Pos logit contributions
urg+0.098
ator+0.096
icles+0.096
rim+0.091
rol+0.087
Neg logit contributions
 cane-0.061
 pencil-0.059
 short-0.058
 long-0.056
 arrow-0.055
Token,
Token ID11
Feature activation-1.31e-02

Pos logit contributions
urg+0.037
ator+0.036
icles+0.034
rim+0.033
icle+0.031
Neg logit contributions
bs-0.019
 cane-0.019
Scar-0.018
 arrows-0.016
 arrow-0.016
Token "
Token ID366
Feature activation-5.51e-02

Pos logit contributions
urg+0.026
ator+0.026
icles+0.024
rim+0.022
zie+0.021
Neg logit contributions
 cane-0.019
bs-0.019
Scar-0.019
 arrows-0.017
 confident-0.017
TokenMom
Token ID29252
Feature activation-3.95e-02

Pos logit contributions
urg+0.117
ator+0.109
rim+0.099
opher+0.097
icles+0.096
Neg logit contributions
Scar-0.082
 cane-0.081
Har-0.078
 pencil-0.071
Mic-0.070
Token,
Token ID11
Feature activation-4.40e-02

Pos logit contributions
urg+0.094
ator+0.093
icles+0.088
rim+0.086
icle+0.078
Neg logit contributions
 cane-0.039
bs-0.038
Scar-0.036
 pencil-0.033
 short-0.032
Token are
Token ID389
Feature activation-3.34e-02

Pos logit contributions
urg+0.105
ator+0.103
rim+0.101
icles+0.096
io+0.089
Neg logit contributions
 cane-0.053
bs-0.045
 pencil-0.045
 arrow-0.045
Scar-0.045
Token there
Token ID612
Feature activation-4.16e-02

Pos logit contributions
ator+0.074
urg+0.072
icles+0.068
rim+0.064
zie+0.063
Neg logit contributions
bs-0.041
 cane-0.041
 arrows-0.039
Scar-0.039
 arrow-0.038
Token dinosaurs
Token ID27199
Feature activation-9.52e-03

Pos logit contributions
urg+0.092
ator+0.091
icles+0.086
rim+0.083
io+0.077
Neg logit contributions
 cane-0.050
Scar-0.044
bs-0.044
 arrow-0.042
 short-0.041