TOP ACTIVATIONS
MAX = 6.549

the garden . A subtle tranqu ility filled the air and
, being surrounded by the tranqu ility of nature .
, feeling the peace and tranqu ility of the night ,
feather to decor ate the peac ock ’ s
closer , he heard a voice . " Give
, he heard a deep voice from somewhere behind him .
higher , he heard a voice above him . It was
journey , they heard a voice . It was coming from
Suddenly , they heard a voice calling to them . It
while . Soon , the tad p oles became frogs and
. He could smell the peac eness of the cave .
all , he heard a voice calling out his name .
and suddenly heard a deep voice . The voice demanded Wally
arrived , he heard a voice say , " Hello !
up , it heard a voice and asked , " Who
Suddenly , he heard a voice from inside the boat .
you ?" she heard a voice outside . It was her
Who 's there ?" a voice said . L
Suddenly , Lisa heard a voice behind her say , "
pathway , he heard a voice . " Be

MIN ACTIVATIONS
MIN = -4.948

. He says , " Max , please . Don 't
ha , I catch your snowball , Ben !" Lily says
" Anna ! Max !" Lily says . "
" Dad , dad ! Max is at the grill !"
" Mom , Ben pushed me and took the
" Twenty - nine , thirty , thirty -
. " Sp ike , come back ! Please
. They shout , " Spot , stop ! You are
He said , " R over , I have a treat
at her and said " ny ah , n yah !"
each other . They were on a trip with their mom
and said , " B enny , I have a present
, " L ily and Luke , it is time to
, " S ara and Ben , you are both good
, " L ily and Ben , why did you fight
say , " No , Max ! These are our blocks
my , mom my ! Max broke my dress !" Sara
are twins . They live on a farm with their mom
and said , " Fin ny , why do you always
a boy and his family on a vacation . Every day

INTERVAL 3.675 - 6.549
CONTAINS 0.045%

so he wanted to get rid of it . But it
was a close call ," mom said . "
it . Lily and the babys itter looked inside and saw
day , she heard a sound that she had never heard
up when he heard a voice . " Let

INTERVAL 0.800 - 3.675
CONTAINS 18.981%

and enjoy the view . The tire had
was a little scared , but he agreed . The two
, Pete used his paints to paint a picture on the
Once upon a time there was a daddy who loved his
the drawer and was ready to give it a new home

INTERVAL -2.074 - 0.800
CONTAINS 78.996%

birds and trees and people . But then ,
with it . She screamed and cried . The nails hurt
air . Everything was damp and wet . Jack
you should always try something at least twice . You might
to eat . All my friends have sweets ."

INTERVAL -4.948 - -2.074
CONTAINS 1.978%

tongue . Sue cried out in pain and quickly ran to
idea . " Ben , I have a plan
l ilies . Cl ara leapt onto the
started to run . But then , she tri pped and
cheerful voice , Look at that
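
Note on the intervals above: the four INTERVAL ranges look like equal-width bins spanning MIN to MAX, and the four CONTAINS shares look like they partition all tokens. The page does not state this; it is inferred from the printed numbers. A minimal Python sketch to check it, assuming four equal-width bins (the names `bin_edges`, `listed_edges`, and `listed_shares` are illustrative and not part of the dashboard):

```python
# Values copied from the MIN / MAX lines of the dashboard text.
MIN_ACT = -4.948
MAX_ACT = 6.549
N_BINS = 4  # assumption: four equal-width bins, inferred from the listed intervals


def bin_edges(lo, hi, n):
    """Return the n + 1 edges of n equal-width bins spanning [lo, hi]."""
    width = (hi - lo) / n
    return [lo + i * width for i in range(n + 1)]


edges = bin_edges(MIN_ACT, MAX_ACT, N_BINS)

# Interval boundaries as printed on the page, lowest to highest.
listed_edges = [-4.948, -2.074, 0.800, 3.675, 6.549]

# Share of tokens (percent) reported for each interval, lowest to highest.
listed_shares = [1.978, 78.996, 18.981, 0.045]

# Largest disagreement between the computed and the printed boundaries.
max_edge_gap = max(abs(a - b) for a, b in zip(edges, listed_edges))
total_share = sum(listed_shares)

print([round(e, 3) for e in edges])
print(f"largest gap to listed edges: {max_edge_gap:.4f}")
print(f"total share: {total_share:.3f}%")
```

If the assumption holds, the gap should stay below the three-decimal rounding of the printed values and the shares should add up to 100%.
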
Token the
Token ID262
Feature activation+1.64e-01

Pos logit contributions
+0.007
 top+0.006
 puzzles+0.005
bes+0.004
 rh+0.004
Neg logit contributions
year-0.014
irk-0.014
and-0.013
old-0.013
ules-0.012
Token garden
Token ID11376
Feature activation+4.58e-01

Pos logit contributions
year+0.006
irk+0.006
and+0.006
old+0.005
ax+0.005
Neg logit contributions
-0.004
 top-0.003
bes-0.002
 puzzles-0.002
-0.002
Token.
Token ID13
Feature activation+4.56e-01

Pos logit contributions
year+0.017
irk+0.017
old+0.015
and+0.015
izz+0.015
Neg logit contributions
-0.010
 top-0.009
 puzzles-0.007
 ones-0.007
 none-0.006
Token A
Token ID317
Feature activation+1.02e+00

Pos logit contributions
year+0.015
irk+0.015
and+0.014
old+0.014
ules+0.013
Neg logit contributions
-0.012
 top-0.009
 puzzles-0.008
bes-0.008
-0.008
Token subtle
Token ID11800
Feature activation+6.78e-01

Pos logit contributions
year+0.036
irk+0.035
and+0.033
old+0.032
ax+0.031
Neg logit contributions
-0.026
 top-0.021
 puzzles-0.018
-0.017
 rh-0.017
Token tranqu
Token ID31882
Feature activation+6.55e+00

Pos logit contributions
year+0.025
irk+0.023
and+0.023
old+0.022
izz+0.021
Neg logit contributions
-0.017
 top-0.014
 puzzles-0.011
 rh-0.011
bes-0.010
Tokenility
Token ID879
Feature activation+1.13e+00

Pos logit contributions
old+0.156
irk+0.155
and+0.138
ingle+0.131
ules+0.130
Neg logit contributions
-0.203
 top-0.141
-0.136
 none-0.135
 End-0.127
Token filled
Token ID5901
Feature activation+1.06e+00

Pos logit contributions
year+0.029
irk+0.028
and+0.025
old+0.025
ax+0.023
Neg logit contributions
-0.040
 top-0.033
 puzzles-0.030
bes-0.029
-0.028
Token the
Token ID262
Feature activation+1.01e+00

Pos logit contributions
year+0.041
irk+0.040
and+0.038
old+0.038
ules+0.036
Neg logit contributions
-0.023
 top-0.017
 puzzles-0.014
bes-0.013
-0.012
Token air
Token ID1633
Feature activation+4.54e-01

Pos logit contributions
year+0.037
irk+0.036
and+0.034
old+0.034
ules+0.032
Neg logit contributions
-0.025
 top-0.019
 puzzles-0.016
bes-0.015
 rh-0.014
Token and
Token ID290
Feature activation+3.64e-01

Pos logit contributions
year+0.016
irk+0.016
old+0.015
and+0.015
izz+0.014
Neg logit contributions
-0.012
 top-0.009
 puzzles-0.008
bes-0.007
-0.007
Token,
Token ID11
Feature activation-1.67e-01

Pos logit contributions
+0.017
 top+0.013
 puzzles+0.010
bes+0.009
 ones+0.009
Neg logit contributions
year-0.032
irk-0.031
old-0.029
and-0.029
izz-0.028
Token being
Token ID852
Feature activation+7.29e-01

Pos logit contributions
+0.004
 top+0.004
 puzzles+0.003
+0.003
bes+0.003
Neg logit contributions
year-0.006
irk-0.006
old-0.005
and-0.005
auc-0.005
Token surrounded
Token ID11191
Feature activation+2.30e+00

Pos logit contributions
irk+0.025
year+0.024
and+0.022
old+0.022
ingle+0.021
Neg logit contributions
-0.019
 top-0.015
 puzzles-0.013
 ones-0.012
bes-0.012
Token by
Token ID416
Feature activation+2.80e-01

Pos logit contributions
irk+0.072
year+0.071
old+0.063
ules+0.063
auc+0.063
Neg logit contributions
-0.061
 top-0.052
 puzzles-0.045
 ones-0.044
gy-0.041
Token the
Token ID262
Feature activation+2.33e-01

Pos logit contributions
year+0.011
irk+0.010
and+0.010
old+0.010
ax+0.009
Neg logit contributions
-0.006
 top-0.005
 puzzles-0.004
bes-0.004
 rh-0.004
Token tranqu
Token ID31882
Feature activation+6.15e+00

Pos logit contributions
year+0.009
irk+0.008
and+0.008
old+0.008
ules+0.007
Neg logit contributions
-0.006
 top-0.005
 puzzles-0.004
 rh-0.004
 ones-0.004
Tokenility
Token ID879
Feature activation+4.48e-01

Pos logit contributions
irk+0.149
old+0.123
eal+0.112
year+0.112
ingle+0.107
Neg logit contributions
-0.201
 top-0.150
-0.135
 zoo-0.131
illas-0.128
Token of
Token ID286
Feature activation-1.13e-02

Pos logit contributions
year+0.010
irk+0.010
and+0.009
old+0.008
ax+0.008
Neg logit contributions
-0.017
 top-0.015
 puzzles-0.013
bes-0.013
-0.013
Token nature
Token ID3450
Feature activation+1.59e-01

Pos logit contributions
+0.000
 top+0.000
 puzzles+0.000
 rh+0.000
 ones+0.000
Neg logit contributions
year-0.000
irk-0.000
and-0.000
old-0.000
enny-0.000
Token.
Token ID13
Feature activation+8.40e-01

Pos logit contributions
year+0.005
irk+0.005
and+0.005
old+0.005
ax+0.004
Neg logit contributions
-0.005
 top-0.004
 puzzles-0.003
bes-0.003
-0.003
Token
Token ID220
Feature activation+5.91e-01

Pos logit contributions
year+0.028
irk+0.027
and+0.025
old+0.025
ax+0.023
Neg logit contributions
-0.023
 top-0.018
 puzzles-0.016
bes-0.015
-0.015
Token,
Token ID11
Feature activation+6.93e-01

Pos logit contributions
irk+0.031
year+0.031
old+0.029
auc+0.029
and+0.028
Neg logit contributions
-0.016
 top-0.012
-0.009
 ones-0.007
 none-0.007
Token feeling
Token ID4203
Feature activation+8.73e-01

Pos logit contributions
irk+0.024
year+0.024
auc+0.022
old+0.022
and+0.022
Neg logit contributions
-0.018
 top-0.014
-0.012
 puzzles-0.011
 none-0.011
Token the
Token ID262
Feature activation+6.40e-01

Pos logit contributions
irk+0.028
year+0.027
and+0.026
old+0.025
ules+0.024
Neg logit contributions
-0.025
 top-0.020
 puzzles-0.016
bes-0.016
-0.016
Token peace
Token ID4167
Feature activation+1.69e+00

Pos logit contributions
year+0.020
irk+0.019
and+0.018
old+0.018
enny+0.017
Neg logit contributions
-0.019
 top-0.016
 puzzles-0.014
 ones-0.013
bes-0.013
Token and
Token ID290
Feature activation+7.42e-02

Pos logit contributions
year+0.051
irk+0.049
and+0.046
old+0.046
ax+0.042
Neg logit contributions
-0.053
 top-0.043
 puzzles-0.037
 ones-0.036
-0.036
Token tranqu
Token ID31882
Feature activation+6.11e+00

Pos logit contributions
irk+0.002
year+0.002
and+0.002
old+0.002
auc+0.002
Neg logit contributions
-0.002
 top-0.002
 puzzles-0.001
 ones-0.001
 none-0.001
Tokenility
Token ID879
Feature activation+1.03e+00

Pos logit contributions
irk+0.161
old+0.159
and+0.139
ax+0.132
ules+0.131
Neg logit contributions
-0.183
 top-0.129
-0.125
gy-0.120
 none-0.106
Token of
Token ID286
Feature activation-2.47e-01

Pos logit contributions
year+0.029
irk+0.028
and+0.025
old+0.025
ax+0.024
Neg logit contributions
-0.035
 top-0.029
 puzzles-0.025
bes-0.024
 ones-0.024
Token the
Token ID262
Feature activation+8.41e-01

Pos logit contributions
+0.005
 top+0.004
 puzzles+0.003
bes+0.003
 ones+0.003
Neg logit contributions
year-0.010
irk-0.010
and-0.009
old-0.009
ax-0.009
Token night
Token ID1755
Feature activation+1.52e+00

Pos logit contributions
year+0.029
irk+0.027
and+0.026
old+0.026
ax+0.024
Neg logit contributions
-0.023
 top-0.018
 puzzles-0.016
bes-0.015
 rh-0.014
Token,
Token ID11
Feature activation+3.08e-01

Pos logit contributions
year+0.059
irk+0.057
and+0.053
old+0.053
ax+0.050
Neg logit contributions
-0.037
 top-0.028
 ones-0.022
 puzzles-0.021
-0.021
Token feather
Token ID17972
Feature activation+1.43e+00

Pos logit contributions
year+0.031
irk+0.030
and+0.028
old+0.028
ax+0.026
Neg logit contributions
-0.023
 top-0.018
 puzzles-0.015
bes-0.015
 rh-0.014
Token to
Token ID284
Feature activation+1.51e+00

Pos logit contributions
year+0.051
irk+0.049
old+0.045
and+0.045
izz+0.043
Neg logit contributions
-0.036
 top-0.029
 puzzles-0.024
bes-0.024
 ones-0.023
Token decor
Token ID11705
Feature activation+1.79e+00

Pos logit contributions
year+0.053
irk+0.053
and+0.050
old+0.047
enny+0.046
Neg logit contributions
-0.039
 top-0.034
 puzzles-0.027
 rh-0.026
bes-0.026
Tokenate
Token ID378
Feature activation+5.04e-01

Pos logit contributions
year+0.055
irk+0.053
and+0.051
old+0.050
ax+0.046
Neg logit contributions
-0.053
 top-0.042
 puzzles-0.037
bes-0.036
-0.035
Token the
Token ID262
Feature activation+6.55e-01

Pos logit contributions
year+0.018
irk+0.018
and+0.017
old+0.016
ules+0.015
Neg logit contributions
-0.012
 top-0.010
 puzzles-0.008
bes-0.008
 ones-0.008
Token peac
Token ID41961
Feature activation+5.71e+00

Pos logit contributions
year+0.023
irk+0.022
and+0.021
old+0.021
ax+0.020
Neg logit contributions
-0.017
 top-0.012
 puzzles-0.011
bes-0.010
-0.010
Tokenock
Token ID735
Feature activation+1.48e+00

Pos logit contributions
irk+0.108
year+0.100
old+0.076
auc+0.069
and+0.068
Neg logit contributions
-0.223
 top-0.196
 puzzles-0.173
bes-0.169
 rh-0.168
Tokenâ
Token ID22940
Feature activation+8.83e-01

Pos logit contributions
year+0.050
irk+0.049
old+0.044
and+0.044
ax+0.042
Neg logit contributions
-0.041
 top-0.033
bes-0.028
 puzzles-0.028
-0.025
Token
Token ID26391
Feature activation-1.47e+00

Pos logit contributions
year+0.028
irk+0.027
and+0.026
old+0.025
ules+0.024
Neg logit contributions
-0.027
 top-0.020
bes-0.018
 puzzles-0.017
-0.017
Token
Token ID8151
Feature activation+1.89e-01

Pos logit contributions
+0.040
 top+0.032
 puzzles+0.030
adoes+0.027
 rh+0.026
Neg logit contributions
year-0.044
irk-0.044
and-0.043
old-0.042
ules-0.039
Tokens
Token ID82
Feature activation-2.91e-01

Pos logit contributions
irk+0.006
year+0.006
leaf+0.005
old+0.005
and+0.005
Neg logit contributions
-0.005
 top-0.004
-0.003
-0.003
bes-0.003
Token closer
Token ID5699
Feature activation-4.75e-01

Pos logit contributions
irk+0.030
year+0.029
and+0.027
old+0.026
ules+0.025
Neg logit contributions
-0.035
 top-0.032
 puzzles-0.027
 none-0.026
 ones-0.026
Token,
Token ID11
Feature activation+3.72e-01

Pos logit contributions
+0.008
 top+0.006
 puzzles+0.004
+0.004
 ones+0.004
Neg logit contributions
year-0.021
irk-0.020
old-0.019
and-0.019
izz-0.019
Token he
Token ID339
Feature activation+4.39e-01

Pos logit contributions
year+0.012
irk+0.011
and+0.011
old+0.011
ax+0.010
Neg logit contributions
-0.011
 top-0.008
bes-0.007
 puzzles-0.007
-0.007
Token heard
Token ID2982
Feature activation+1.20e+00

Pos logit contributions
year+0.013
irk+0.013
and+0.012
old+0.012
auc+0.011
Neg logit contributions
-0.014
 top-0.011
 puzzles-0.009
-0.009
bes-0.009
Token a
Token ID257
Feature activation+1.74e+00

Pos logit contributions
irk+0.057
year+0.054
and+0.053
enny+0.052
ash+0.051
Neg logit contributions
 top-0.013
-0.011
 rh-0.006
 puzzles-0.006
 Raven-0.004
Token voice
Token ID3809
Feature activation+5.59e+00

Pos logit contributions
year+0.067
irk+0.063
old+0.063
and+0.062
ash+0.061
Neg logit contributions
-0.037
 top-0.031
-0.025
 rh-0.024
 hawk-0.022
Token.
Token ID13
Feature activation+6.63e-01

Pos logit contributions
year+0.213
irk+0.209
enny+0.206
and+0.202
ash+0.188
Neg logit contributions
-0.091
 top-0.079
 rh-0.061
 Raven-0.057
 one-0.055
Token
Token ID198
Feature activation+1.52e-01

Pos logit contributions
year+0.025
irk+0.025
and+0.024
old+0.023
ax+0.022
Neg logit contributions
-0.015
 top-0.011
-0.009
 rh-0.009
 puzzles-0.008
Token
Token ID198
Feature activation-1.22e+00

Pos logit contributions
year+0.006
irk+0.006
and+0.005
old+0.005
ax+0.005
Neg logit contributions
-0.003
 top-0.003
-0.002
bes-0.002
 puzzles-0.002
Token"
Token ID1
Feature activation+2.29e-01

Pos logit contributions
+0.024
 top+0.017
 puzzles+0.013
bes+0.013
+0.012
Neg logit contributions
year-0.050
irk-0.048
and-0.046
old-0.046
ax-0.043
TokenGive
Token ID23318
Feature activation-5.22e-01

Pos logit contributions
year+0.008
irk+0.008
and+0.008
old+0.007
ax+0.007
Neg logit contributions
-0.006
 top-0.004
 puzzles-0.004
bes-0.004
-0.003
Token,
Token ID11
Feature activation+8.71e-01

Pos logit contributions
+0.011
 top+0.008
 puzzles+0.007
bes+0.007
+0.006
Neg logit contributions
year-0.017
irk-0.016
and-0.016
old-0.016
ax-0.015
Token he
Token ID339
Feature activation+3.31e-01

Pos logit contributions
year+0.029
irk+0.029
and+0.027
old+0.027
ax+0.025
Neg logit contributions
-0.023
 top-0.018
 puzzles-0.016
bes-0.015
-0.014
Token heard
Token ID2982
Feature activation+8.18e-01

Pos logit contributions
year+0.010
irk+0.010
and+0.009
old+0.009
auc+0.008
Neg logit contributions
-0.010
 top-0.008
-0.007
 puzzles-0.007
bes-0.007
Token a
Token ID257
Feature activation+1.45e+00

Pos logit contributions
irk+0.037
year+0.036
and+0.035
ax+0.033
ashing+0.033
Neg logit contributions
-0.010
 top-0.010
 puzzles-0.006
 rh-0.006
adoes-0.004
Token deep
Token ID2769
Feature activation+1.23e+00

Pos logit contributions
year+0.056
irk+0.054
and+0.053
old+0.052
ax+0.051
Neg logit contributions
-0.030
 top-0.025
-0.019
 rh-0.019
 puzzles-0.018
Token voice
Token ID3809
Feature activation+5.56e+00

Pos logit contributions
year+0.040
irk+0.039
and+0.036
old+0.035
ax+0.033
Neg logit contributions
-0.035
 top-0.027
 puzzles-0.024
bes-0.024
-0.022
Token from
Token ID422
Feature activation+3.38e-01

Pos logit contributions
year+0.207
irk+0.205
enny+0.199
and+0.197
ash+0.182
Neg logit contributions
-0.099
 top-0.082
 rh-0.065
gy-0.061
 Raven-0.060
Token somewhere
Token ID7382
Feature activation+3.04e+00

Pos logit contributions
year+0.011
irk+0.010
and+0.010
old+0.009
ules+0.009
Neg logit contributions
-0.010
 top-0.008
 puzzles-0.007
bes-0.007
 rh-0.006
Token behind
Token ID2157
Feature activation-6.62e-01

Pos logit contributions
irk+0.123
year+0.121
and+0.120
eal+0.112
enny+0.111
Neg logit contributions
-0.047
 top-0.041
-0.029
 rh-0.027
 hawk-0.025
Token him
Token ID683
Feature activation+9.69e-01

Pos logit contributions
+0.019
 top+0.015
 puzzles+0.013
bes+0.012
 rh+0.012
Neg logit contributions
year-0.021
irk-0.021
and-0.020
old-0.019
ax-0.018
Token.
Token ID13
Feature activation-4.42e-02

Pos logit contributions
year+0.037
irk+0.035
and+0.033
old+0.033
izz+0.031
Neg logit contributions
-0.022
 top-0.018
gy-0.014
-0.013
bes-0.013
Token higher
Token ID2440
Feature activation-7.57e-01

Pos logit contributions
+0.021
 top+0.017
 puzzles+0.016
bes+0.015
+0.014
Neg logit contributions
year-0.019
irk-0.018
and-0.017
old-0.016
ax-0.015
Token,
Token ID11
Feature activation+2.75e-01

Pos logit contributions
+0.020
 top+0.015
bes+0.013
 puzzles+0.013
+0.012
Neg logit contributions
year-0.025
irk-0.025
and-0.024
old-0.023
ax-0.022
Token he
Token ID339
Feature activation+3.99e-02

Pos logit contributions
year+0.007
irk+0.006
old+0.006
and+0.006
ax+0.006
Neg logit contributions
-0.010
bes-0.007
 top-0.007
 necks-0.007
-0.007
Token heard
Token ID2982
Feature activation+9.93e-01

Pos logit contributions
year+0.001
irk+0.001
and+0.001
old+0.001
ules+0.001
Neg logit contributions
-0.001
 top-0.001
 puzzles-0.001
bes-0.001
-0.001
Token a
Token ID257
Feature activation+1.43e+00

Pos logit contributions
irk+0.046
year+0.044
and+0.042
enny+0.041
izz+0.040
Neg logit contributions
 top-0.012
-0.012
 puzzles-0.007
 rh-0.006
adoes-0.005
Token voice
Token ID3809
Feature activation+5.55e+00

Pos logit contributions
year+0.051
irk+0.048
and+0.047
old+0.046
ax+0.044
Neg logit contributions
-0.036
 top-0.031
-0.024
 rh-0.024
 puzzles-0.023
Token above
Token ID2029
Feature activation+7.88e-01

Pos logit contributions
year+0.195
irk+0.194
enny+0.189
and+0.185
ash+0.172
Neg logit contributions
-0.112
 top-0.096
 rh-0.077
 Raven-0.071
 hawk-0.071
Token him
Token ID683
Feature activation+1.80e+00

Pos logit contributions
year+0.026
irk+0.025
and+0.025
old+0.024
enny+0.023
Neg logit contributions
-0.021
 top-0.018
 rh-0.013
 puzzles-0.013
gy-0.013
Token.
Token ID13
Feature activation+3.83e-01

Pos logit contributions
year+0.065
irk+0.062
and+0.059
old+0.058
izz+0.055
Neg logit contributions
-0.045
 top-0.035
-0.028
gy-0.028
bes-0.028
Token It
Token ID632
Feature activation+1.56e+00

Pos logit contributions
year+0.014
irk+0.014
and+0.013
old+0.013
ules+0.012
Neg logit contributions
-0.009
 top-0.007
-0.006
 rh-0.005
bes-0.005
Token was
Token ID373
Feature activation+1.10e+00

Pos logit contributions
year+0.060
irk+0.058
and+0.057
old+0.055
ax+0.052
Neg logit contributions
-0.036
 top-0.027
-0.021
bes-0.021
 puzzles-0.021
Token journey
Token ID7002
Feature activation-6.30e-01

Pos logit contributions
+0.034
 top+0.026
 puzzles+0.023
bes+0.023
+0.021
Neg logit contributions
year-0.040
irk-0.039
and-0.037
old-0.037
ules-0.034
Token,
Token ID11
Feature activation+6.08e-02

Pos logit contributions
+0.011
 top+0.008
 puzzles+0.006
 ones+0.005
+0.005
Neg logit contributions
year-0.028
irk-0.027
and-0.025
old-0.025
izz-0.024
Token they
Token ID484
Feature activation+6.10e-01

Pos logit contributions
year+0.002
irk+0.002
old+0.002
and+0.002
ules+0.001
Neg logit contributions
-0.002
bes-0.001
 top-0.001
 puzzles-0.001
icer-0.001
Token heard
Token ID2982
Feature activation+3.28e-01

Pos logit contributions
irk+0.021
year+0.020
and+0.019
old+0.018
auc+0.018
Neg logit contributions
-0.016
 top-0.013
 none-0.011
-0.011
 puzzles-0.011
Token a
Token ID257
Feature activation+1.24e+00

Pos logit contributions
irk+0.014
year+0.014
and+0.013
enny+0.013
ax+0.013
Neg logit contributions
-0.005
 top-0.004
 puzzles-0.003
 rh-0.003
 ones-0.002
Token voice
Token ID3809
Feature activation+5.54e+00

Pos logit contributions
year+0.042
irk+0.040
and+0.039
old+0.038
ax+0.036
Neg logit contributions
-0.033
 top-0.028
 puzzles-0.022
-0.022
 rh-0.022
Token.
Token ID13
Feature activation+8.13e-01

Pos logit contributions
irk+0.202
year+0.200
enny+0.193
and+0.189
ash+0.175
Neg logit contributions
-0.109
 top-0.094
 rh-0.074
 one-0.068
 Raven-0.067
Token It
Token ID632
Feature activation+2.05e+00

Pos logit contributions
year+0.028
irk+0.027
and+0.026
old+0.025
ax+0.023
Neg logit contributions
-0.022
 top-0.017
 puzzles-0.014
bes-0.014
-0.014
Token was
Token ID373
Feature activation+1.19e+00

Pos logit contributions
year+0.073
irk+0.072
and+0.070
old+0.067
ax+0.063
Neg logit contributions
-0.052
 top-0.039
-0.032
 puzzles-0.032
bes-0.032
Token coming
Token ID2406
Feature activation+1.70e+00

Pos logit contributions
year+0.045
irk+0.044
and+0.043
old+0.041
ax+0.039
Neg logit contributions
-0.027
 top-0.020
 puzzles-0.016
bes-0.016
 rh-0.015
Token from
Token ID422
Feature activation-8.08e-01

Pos logit contributions
year+0.057
irk+0.057
old+0.053
and+0.051
ules+0.051
Neg logit contributions
-0.040
 top-0.038
gy-0.032
 rh-0.028
 Blue-0.027
Token Suddenly
Token ID24975
Feature activation-1.17e+00

Pos logit contributions
year+0.015
irk+0.015
and+0.014
old+0.014
ax+0.013
Neg logit contributions
-0.012
 top-0.009
 puzzles-0.008
bes-0.008
-0.008
Token,
Token ID11
Feature activation+6.93e-01

Pos logit contributions
+0.031
 top+0.022
bes+0.021
 puzzles+0.019
+0.018
Neg logit contributions
year-0.039
irk-0.037
old-0.036
and-0.035
ules-0.033
Token they
Token ID484
Feature activation+7.75e-01

Pos logit contributions
year+0.025
irk+0.024
and+0.023
old+0.022
ax+0.021
Neg logit contributions
-0.017
 top-0.013
 puzzles-0.011
bes-0.011
-0.010
Token heard
Token ID2982
Feature activation+2.64e-01

Pos logit contributions
irk+0.025
year+0.025
and+0.024
old+0.024
auc+0.022
Neg logit contributions
-0.020
 top-0.018
-0.014
 none-0.014
 puzzles-0.014
Token a
Token ID257
Feature activation+8.96e-01

Pos logit contributions
irk+0.011
year+0.011
and+0.010
old+0.010
ax+0.010
Neg logit contributions
-0.004
 top-0.004
 puzzles-0.003
 rh-0.002
 ones-0.002
Token voice
Token ID3809
Feature activation+5.51e+00

Pos logit contributions
year+0.030
irk+0.029
and+0.028
old+0.028
ax+0.026
Neg logit contributions
-0.024
 top-0.020
 puzzles-0.016
-0.016
 rh-0.016
Token calling
Token ID4585
Feature activation+1.82e+00

Pos logit contributions
irk+0.200
year+0.196
enny+0.191
and+0.189
ash+0.173
Neg logit contributions
-0.107
 top-0.093
 rh-0.070
 one-0.069
 Raven-0.066
Token to
Token ID284
Feature activation+1.66e+00

Pos logit contributions
year+0.062
irk+0.060
and+0.058
old+0.058
ax+0.055
Neg logit contributions
-0.044
 top-0.040
adoes-0.029
 puzzles-0.029
 ones-0.028
Token them
Token ID606
Feature activation+1.02e+00

Pos logit contributions
year+0.060
irk+0.058
and+0.055
old+0.054
auc+0.051
Neg logit contributions
-0.042
 top-0.033
 puzzles-0.026
 ones-0.025
 rh-0.025
Token.
Token ID13
Feature activation+5.14e-01

Pos logit contributions
year+0.040
irk+0.039
and+0.036
old+0.035
leaf+0.034
Neg logit contributions
-0.021
 top-0.018
-0.013
gy-0.012
 puzzles-0.012
Token It
Token ID632
Feature activation+1.75e+00

Pos logit contributions
year+0.019
irk+0.018
and+0.017
old+0.017
ax+0.016
Neg logit contributions
-0.013
 top-0.010
-0.008
 puzzles-0.007
bes-0.007
Token while
Token ID981
Feature activation-1.57e-02

Pos logit contributions
year+0.026
irk+0.025
and+0.023
old+0.023
ules+0.022
Neg logit contributions
-0.021
 top-0.017
 puzzles-0.013
 rh-0.013
-0.013
Token.
Token ID13
Feature activation-1.19e-01

Pos logit contributions
+0.000
 top+0.000
+0.000
 puzzles+0.000
 ones+0.000
Neg logit contributions
year-0.001
irk-0.001
old-0.001
and-0.001
auc-0.001
Token Soon
Token ID15894
Feature activation-1.68e+00

Pos logit contributions
+0.003
 top+0.002
 puzzles+0.002
bes+0.002
+0.002
Neg logit contributions
year-0.004
irk-0.004
and-0.004
old-0.004
ules-0.003
Token,
Token ID11
Feature activation+5.81e-01

Pos logit contributions
+0.045
 top+0.032
bes+0.031
 puzzles+0.029
+0.027
Neg logit contributions
year-0.056
irk-0.054
and-0.052
old-0.052
ax-0.049
Token the
Token ID262
Feature activation+3.91e-01

Pos logit contributions
year+0.021
irk+0.021
and+0.020
old+0.019
ax+0.018
Neg logit contributions
-0.014
 top-0.011
 puzzles-0.009
bes-0.008
-0.008
Token tad
Token ID36264
Feature activation+5.50e+00

Pos logit contributions
year+0.012
irk+0.012
and+0.011
old+0.011
ax+0.010
Neg logit contributions
-0.011
 top-0.008
bes-0.008
 puzzles-0.008
adoes-0.007
Tokenp
Token ID79
Feature activation+1.98e+00

Pos logit contributions
irk+0.164
year+0.154
enny+0.145
old+0.140
ax+0.136
Neg logit contributions
-0.158
 top-0.136
 rh-0.114
 hawk-0.112
 tails-0.109
Tokenoles
Token ID4316
Feature activation+2.32e+00

Pos logit contributions
year+0.056
irk+0.054
and+0.050
old+0.049
ax+0.046
Neg logit contributions
-0.065
 top-0.053
 puzzles-0.046
bes-0.046
-0.045
Token became
Token ID2627
Feature activation+4.00e-01

Pos logit contributions
year+0.077
irk+0.075
and+0.071
old+0.068
auc+0.063
Neg logit contributions
-0.064
 top-0.051
 puzzles-0.043
-0.042
 rh-0.041
Token frogs
Token ID37475
Feature activation+7.30e-01

Pos logit contributions
year+0.013
irk+0.013
and+0.012
old+0.012
ax+0.011
Neg logit contributions
-0.011
 top-0.008
bes-0.008
 puzzles-0.007
-0.007
Token and
Token ID290
Feature activation-6.30e-01

Pos logit contributions
year+0.027
irk+0.025
and+0.024
old+0.023
izz+0.023
Neg logit contributions
-0.018
 top-0.014
 puzzles-0.011
 rh-0.011
bes-0.011
Token.
Token ID13
Feature activation+1.86e-01

Pos logit contributions
year+0.038
irk+0.037
old+0.034
and+0.034
ax+0.032
Neg logit contributions
-0.031
 top-0.024
 puzzles-0.021
bes-0.020
-0.019
Token He
Token ID679
Feature activation-8.39e-02

Pos logit contributions
year+0.006
irk+0.006
and+0.006
old+0.006
ax+0.005
Neg logit contributions
-0.005
 top-0.004
 puzzles-0.003
bes-0.003
 rh-0.003
Token could
Token ID714
Feature activation+6.09e-01

Pos logit contributions
+0.003
 top+0.002
 puzzles+0.002
bes+0.002
+0.002
Neg logit contributions
year-0.003
irk-0.002
and-0.002
old-0.002
ax-0.002
Token smell
Token ID8508
Feature activation+2.00e+00

Pos logit contributions
year+0.016
irk+0.016
and+0.015
old+0.014
ax+0.013
Neg logit contributions
-0.021
 top-0.017
 puzzles-0.015
bes-0.015
-0.014
Token the
Token ID262
Feature activation+1.09e+00

Pos logit contributions
year+0.078
irk+0.077
and+0.072
old+0.072
enny+0.068
Neg logit contributions
-0.040
 top-0.033
 puzzles-0.027
 rh-0.024
 ones-0.024
Token peac
Token ID41961
Feature activation+5.49e+00

Pos logit contributions
year+0.036
irk+0.035
and+0.033
old+0.033
ules+0.030
Neg logit contributions
-0.030
 top-0.024
 puzzles-0.021
bes-0.020
 rh-0.019
Tokeneness
Token ID9449
Feature activation+8.29e-01

Pos logit contributions
irk+0.100
year+0.099
old+0.078
and+0.068
auc+0.066
Neg logit contributions
-0.199
 top-0.177
 puzzles-0.162
 ones-0.158
 arrest-0.157
Token of
Token ID286
Feature activation+9.80e-01

Pos logit contributions
year+0.032
irk+0.030
and+0.028
old+0.028
izz+0.027
Neg logit contributions
-0.019
 top-0.016
 puzzles-0.013
 ones-0.012
 rh-0.012
Token the
Token ID262
Feature activation+1.10e+00

Pos logit contributions
year+0.036
irk+0.036
and+0.034
old+0.034
enny+0.032
Neg logit contributions
-0.021
 top-0.018
 puzzles-0.016
 rh-0.014
 ones-0.014
Token cave
Token ID11527
Feature activation+1.15e+00

Pos logit contributions
year+0.039
irk+0.037
old+0.036
and+0.035
ules+0.033
Neg logit contributions
-0.027
 top-0.023
 puzzles-0.019
 rh-0.018
 ones-0.018
Token.
Token ID13
Feature activation+2.03e-01

Pos logit contributions
year+0.043
irk+0.043
old+0.040
and+0.039
izz+0.037
Neg logit contributions
-0.026
 top-0.022
 puzzles-0.018
bes-0.016
 ones-0.016
Token all
Token ID477
Feature activation+1.51e+00

Pos logit contributions
+0.007
 top+0.005
 puzzles+0.004
bes+0.004
 ones+0.003
Neg logit contributions
year-0.014
irk-0.014
and-0.013
old-0.013
izz-0.013
Token,
Token ID11
Feature activation+8.20e-01

Pos logit contributions
year+0.053
irk+0.051
and+0.049
enny+0.046
izz+0.045
Neg logit contributions
-0.036
 top-0.030
 puzzles-0.027
 ones-0.025
 rh-0.023
Token he
Token ID339
Feature activation+4.46e-01

Pos logit contributions
year+0.023
irk+0.021
and+0.020
old+0.020
ules+0.018
Neg logit contributions
-0.027
 top-0.021
bes-0.020
 puzzles-0.018
-0.018
Token heard
Token ID2982
Feature activation+8.57e-01

Pos logit contributions
irk+0.014
year+0.014
and+0.014
old+0.013
auc+0.012
Neg logit contributions
-0.012
 top-0.010
 puzzles-0.008
-0.008
 none-0.008
Token a
Token ID257
Feature activation+1.36e+00

Pos logit contributions
irk+0.038
year+0.036
and+0.035
izz+0.033
enny+0.033
Neg logit contributions
-0.012
 top-0.012
 puzzles-0.008
 rh-0.007
 none-0.006
Token voice
Token ID3809
Feature activation+5.48e+00

Pos logit contributions
year+0.050
irk+0.048
and+0.046
old+0.046
ax+0.045
Neg logit contributions
-0.032
 top-0.027
-0.021
 rh-0.021
 puzzles-0.020
Token calling
Token ID4585
Feature activation+1.97e+00

Pos logit contributions
irk+0.192
year+0.189
enny+0.186
and+0.179
ash+0.170
Neg logit contributions
-0.110
 top-0.095
 rh-0.077
 Raven-0.071
 helicopter-0.069
Token out
Token ID503
Feature activation+1.62e+00

Pos logit contributions
irk+0.063
year+0.063
and+0.061
old+0.059
izz+0.059
Neg logit contributions
-0.049
 top-0.046
adoes-0.035
 puzzles-0.034
 Blue-0.034
Token his
Token ID465
Feature activation+4.25e-01

Pos logit contributions
year+0.050
irk+0.050
and+0.048
old+0.047
ax+0.046
Neg logit contributions
-0.043
 top-0.037
 puzzles-0.031
 rh-0.030
-0.029
Token name
Token ID1438
Feature activation+6.14e-01

Pos logit contributions
year+0.016
and+0.015
irk+0.015
old+0.014
ax+0.014
Neg logit contributions
-0.010
 top-0.008
 rh-0.006
 puzzles-0.006
-0.006
Token.
Token ID13
Feature activation-6.52e-02

Pos logit contributions
year+0.022
irk+0.022
and+0.021
old+0.021
izz+0.020
Neg logit contributions
-0.015
 top-0.012
 puzzles-0.009
 rh-0.009
-0.009
Token and
Token ID290
Feature activation+1.09e-01

Pos logit contributions
year+0.064
irk+0.063
and+0.059
old+0.057
enny+0.055
Neg logit contributions
-0.042
 top-0.032
 puzzles-0.029
bes-0.026
 ones-0.025
Token suddenly
Token ID6451
Feature activation-9.94e-01

Pos logit contributions
irk+0.004
year+0.004
and+0.004
old+0.003
izz+0.003
Neg logit contributions
-0.003
 top-0.002
 puzzles-0.002
 ones-0.002
 rh-0.002
Token heard
Token ID2982
Feature activation+1.00e+00

Pos logit contributions
+0.028
 top+0.022
 puzzles+0.020
bes+0.019
+0.018
Neg logit contributions
year-0.033
irk-0.031
and-0.030
old-0.029
ax-0.027
Token a
Token ID257
Feature activation+1.27e+00

Pos logit contributions
irk+0.042
year+0.041
and+0.039
old+0.038
izz+0.037
Neg logit contributions
-0.017
 top-0.014
 puzzles-0.011
 rh-0.010
 ones-0.009
Token deep
Token ID2769
Feature activation+7.80e-01

Pos logit contributions
year+0.044
irk+0.043
and+0.041
old+0.041
ax+0.039
Neg logit contributions
-0.032
 top-0.028
 rh-0.022
 puzzles-0.022
-0.022
Token voice
Token ID3809
Feature activation+5.47e+00

Pos logit contributions
year+0.024
irk+0.024
and+0.022
old+0.022
ax+0.020
Neg logit contributions
-0.023
 top-0.017
 puzzles-0.016
bes-0.015
-0.015
Token.
Token ID13
Feature activation+7.16e-01

Pos logit contributions
irk+0.205
year+0.200
enny+0.194
and+0.189
ax+0.175
Neg logit contributions
-0.098
 top-0.087
 rh-0.069
 Raven-0.061
 one-0.060
Token The
Token ID383
Feature activation+7.93e-01

Pos logit contributions
year+0.025
irk+0.024
and+0.023
old+0.022
ules+0.021
Neg logit contributions
-0.019
 top-0.015
 puzzles-0.012
-0.012
bes-0.012
Token voice
Token ID3809
Feature activation+4.41e+00

Pos logit contributions
year+0.029
irk+0.028
and+0.027
old+0.026
ax+0.025
Neg logit contributions
-0.019
 top-0.015
 puzzles-0.012
 rh-0.012
bes-0.012
Token demanded
Token ID12284
Feature activation-3.26e-01

Pos logit contributions
irk+0.133
and+0.129
year+0.126
ax+0.114
enny+0.112
Neg logit contributions
-0.126
 top-0.105
-0.091
 rh-0.090
 k-0.086
Token Wally
Token ID50117
Feature activation-3.88e-01

Pos logit contributions
+0.008
 top+0.006
 puzzles+0.005
bes+0.005
+0.004
Neg logit contributions
year-0.012
irk-0.012
and-0.011
old-0.011
ules-0.010
Token arrived
Token ID5284
Feature activation-4.92e-01

Pos logit contributions
irk+0.008
year+0.008
and+0.008
old+0.008
ules+0.007
Neg logit contributions
-0.007
 top-0.005
-0.004
 rh-0.004
bes-0.004
Token,
Token ID11
Feature activation+9.50e-01

Pos logit contributions
+0.012
 top+0.009
 puzzles+0.007
+0.007
bes+0.006
Neg logit contributions
year-0.019
irk-0.018
and-0.017
old-0.017
ules-0.016
Token he
Token ID339
Feature activation+5.46e-01

Pos logit contributions
year+0.032
irk+0.031
and+0.029
old+0.028
ax+0.026
Neg logit contributions
-0.026
 top-0.020
 puzzles-0.018
bes-0.017
-0.016
Token heard
Token ID2982
Feature activation+1.34e+00

Pos logit contributions
year+0.018
irk+0.017
and+0.016
old+0.016
auc+0.015
Neg logit contributions
-0.015
 top-0.013
-0.011
 puzzles-0.010
bes-0.010
Token a
Token ID257
Feature activation+1.67e+00

Pos logit contributions
irk+0.062
year+0.060
and+0.057
enny+0.056
izz+0.056
Neg logit contributions
 top-0.016
-0.013
 puzzles-0.008
 rh-0.008
 things-0.006
Token voice
Token ID3809
Feature activation+5.47e+00

Pos logit contributions
year+0.061
irk+0.059
old+0.057
and+0.057
ax+0.054
Neg logit contributions
-0.039
 top-0.034
-0.027
 rh-0.026
 puzzles-0.024
Token say
Token ID910
Feature activation+5.31e-01

Pos logit contributions
irk+0.202
year+0.201
enny+0.197
and+0.190
ash+0.179
Neg logit contributions
-0.093
 top-0.084
 rh-0.065
 one-0.062
 Raven-0.059
Token,
Token ID11
Feature activation+2.23e-01

Pos logit contributions
year+0.021
irk+0.021
and+0.020
old+0.019
ingle+0.018
Neg logit contributions
-0.011
 top-0.008
-0.006
 rh-0.006
 puzzles-0.006
Token "
Token ID366
Feature activation+9.98e-01

Pos logit contributions
year+0.009
irk+0.009
and+0.008
old+0.008
ingle+0.008
Neg logit contributions
-0.005
 top-0.003
-0.003
bes-0.003
 rh-0.003
TokenHello
Token ID15496
Feature activation+2.39e-02

Pos logit contributions
year+0.037
irk+0.036
old+0.035
and+0.034
auc+0.032
Neg logit contributions
-0.024
 top-0.016
-0.014
 ones-0.012
 rh-0.012
Token!
Token ID0
Feature activation+1.78e+00

Pos logit contributions
year+0.001
irk+0.001
and+0.001
old+0.001
ax+0.001
Neg logit contributions
-0.001
 top-0.000
 puzzles-0.000
bes-0.000
-0.000
Token up
Token ID510
Feature activation-5.48e-01

Pos logit contributions
year+0.004
irk+0.003
old+0.003
ax+0.003
and+0.003
Neg logit contributions
-0.006
bes-0.004
 puzzles-0.004
 top-0.004
 necks-0.004
Token,
Token ID11
Feature activation+1.44e-01

Pos logit contributions
+0.011
 top+0.008
 puzzles+0.007
bes+0.006
+0.006
Neg logit contributions
year-0.022
irk-0.021
and-0.021
old-0.020
ules-0.019
Token it
Token ID340
Feature activation+9.96e-01

Pos logit contributions
year+0.005
irk+0.005
and+0.004
old+0.004
ax+0.004
Neg logit contributions
-0.004
 top-0.003
bes-0.003
 puzzles-0.003
-0.002
Token heard
Token ID2982
Feature activation+1.01e+00

Pos logit contributions
year+0.032
irk+0.031
and+0.030
old+0.029
ax+0.027
Neg logit contributions
-0.029
 top-0.023
 puzzles-0.020
bes-0.019
-0.019
Token a
Token ID257
Feature activation+1.41e+00

Pos logit contributions
irk+0.047
year+0.045
and+0.043
old+0.042
ax+0.042
Neg logit contributions
 top-0.011
-0.011
 puzzles-0.006
 rh-0.005
adoes-0.004
Token voice
Token ID3809
Feature activation+5.45e+00

Pos logit contributions
year+0.049
irk+0.048
old+0.046
and+0.045
ax+0.044
Neg logit contributions
-0.035
 top-0.031
 rh-0.024
-0.024
 puzzles-0.024
Token and
Token ID290
Feature activation+3.54e-01

Pos logit contributions
irk+0.201
year+0.200
enny+0.196
and+0.187
ax+0.175
Neg logit contributions
-0.099
 top-0.087
 rh-0.067
 helicopter-0.063
 Raven-0.063
Token asked
Token ID1965
Feature activation+1.53e+00

Pos logit contributions
irk+0.011
year+0.011
and+0.011
old+0.010
ash+0.009
Neg logit contributions
-0.011
 top-0.009
-0.008
 rh-0.008
 puzzles-0.007
Token,
Token ID11
Feature activation+1.13e+00

Pos logit contributions
year+0.058
irk+0.058
and+0.054
old+0.053
ax+0.052
Neg logit contributions
-0.034
 top-0.025
 puzzles-0.021
-0.018
 ones-0.018
Token "
Token ID366
Feature activation+8.79e-01

Pos logit contributions
irk+0.046
year+0.046
and+0.044
old+0.044
ingle+0.042
Neg logit contributions
-0.022
 top-0.016
-0.013
 rh-0.010
 puzzles-0.010
TokenWho
Token ID8241
Feature activation-4.47e-01

Pos logit contributions
year+0.033
irk+0.032
and+0.031
old+0.030
ax+0.029
Neg logit contributions
-0.020
 top-0.015
 puzzles-0.013
bes-0.012
adoes-0.011
TokenSuddenly
Token ID38582
Feature activation-4.20e-01

Pos logit contributions
+0.040
 top+0.030
 puzzles+0.027
bes+0.026
adoes+0.026
Neg logit contributions
year-0.060
and-0.057
irk-0.055
old-0.054
ax-0.050
Token,
Token ID11
Feature activation+7.79e-01

Pos logit contributions
+0.011
 top+0.008
bes+0.007
 puzzles+0.007
adoes+0.006
Neg logit contributions
year-0.015
irk-0.014
old-0.013
and-0.013
ax-0.012
Token he
Token ID339
Feature activation+3.14e-01

Pos logit contributions
year+0.028
irk+0.026
and+0.025
old+0.025
ax+0.024
Neg logit contributions
-0.020
 top-0.014
bes-0.013
 puzzles-0.013
-0.011
Token heard
Token ID2982
Feature activation+1.09e+00

Pos logit contributions
year+0.010
irk+0.009
and+0.009
old+0.009
auc+0.008
Neg logit contributions
-0.010
 top-0.008
-0.007
 puzzles-0.007
bes-0.007
Token a
Token ID257
Feature activation+1.67e+00

Pos logit contributions
irk+0.051
year+0.049
and+0.047
old+0.045
enny+0.045
Neg logit contributions
 top-0.012
-0.012
 rh-0.006
 puzzles-0.006
 Raven-0.005
Token voice
Token ID3809
Feature activation+5.45e+00

Pos logit contributions
year+0.064
irk+0.060
and+0.060
old+0.060
ax+0.057
Neg logit contributions
-0.036
 top-0.030
-0.024
 rh-0.023
 hawk-0.021
Token from
Token ID422
Feature activation+5.41e-01

Pos logit contributions
year+0.196
irk+0.192
enny+0.187
and+0.186
ash+0.171
Neg logit contributions
-0.106
 top-0.089
 rh-0.073
 Raven-0.068
gy-0.066
Token inside
Token ID2641
Feature activation+1.37e+00

Pos logit contributions
year+0.018
irk+0.017
and+0.017
old+0.016
ules+0.015
Neg logit contributions
-0.015
 top-0.012
 puzzles-0.010
bes-0.010
 rh-0.009
Token the
Token ID262
Feature activation+1.28e+00

Pos logit contributions
year+0.059
and+0.058
irk+0.056
old+0.053
ash+0.053
Neg logit contributions
-0.021
 top-0.018
 rh-0.010
 arrest-0.010
-0.010
Token boat
Token ID8848
Feature activation+1.76e+00

Pos logit contributions
year+0.050
irk+0.048
and+0.047
old+0.046
ax+0.043
Neg logit contributions
-0.028
 top-0.020
 puzzles-0.016
bes-0.016
-0.015
Token.
Token ID13
Feature activation+1.96e-01

Pos logit contributions
year+0.069
irk+0.067
and+0.064
old+0.062
izz+0.060
Neg logit contributions
-0.038
 top-0.032
-0.024
gy-0.023
 rh-0.023
Token you
Token ID345
Feature activation+6.44e-01

Pos logit contributions
+0.022
bes+0.017
 top+0.017
 necks+0.015
adoes+0.015
Neg logit contributions
year-0.017
old-0.016
and-0.015
ax-0.015
irk-0.015
Token?"
Token ID1701
Feature activation+9.09e-01

Pos logit contributions
year+0.022
irk+0.022
and+0.021
old+0.020
ules+0.019
Neg logit contributions
-0.017
 top-0.013
 puzzles-0.011
bes-0.011
-0.010
Token she
Token ID673
Feature activation+1.65e+00

Pos logit contributions
year+0.029
irk+0.028
and+0.027
old+0.025
ules+0.024
Neg logit contributions
-0.027
 top-0.021
 puzzles-0.018
bes-0.018
-0.017
Token heard
Token ID2982
Feature activation+1.55e+00

Pos logit contributions
year+0.054
and+0.051
irk+0.051
old+0.048
ash+0.046
Neg logit contributions
-0.045
 top-0.038
 puzzles-0.033
bes-0.032
 rh-0.032
Token a
Token ID257
Feature activation+1.91e+00

Pos logit contributions
year+0.057
irk+0.054
and+0.053
old+0.050
ules+0.048
Neg logit contributions
-0.034
 top-0.031
 puzzles-0.024
 rh-0.023
bes-0.021
Token voice
Token ID3809
Feature activation+5.45e+00

Pos logit contributions
year+0.077
irk+0.071
ash+0.070
old+0.070
and+0.069
Neg logit contributions
-0.035
 top-0.031
-0.024
 rh-0.022
 puzzles-0.022
Token outside
Token ID2354
Feature activation+1.28e+00

Pos logit contributions
year+0.205
irk+0.200
enny+0.197
and+0.188
ash+0.180
Neg logit contributions
-0.093
 top-0.081
 puzzles-0.057
bes-0.056
 rh-0.056
Token.
Token ID13
Feature activation+7.78e-01

Pos logit contributions
year+0.048
irk+0.047
and+0.045
old+0.044
ax+0.043
Neg logit contributions
-0.027
 top-0.023
 puzzles-0.018
adoes-0.016
bes-0.016
Token It
Token ID632
Feature activation+1.06e+00

Pos logit contributions
year+0.029
irk+0.027
and+0.027
old+0.026
ax+0.025
Neg logit contributions
-0.019
 top-0.013
-0.011
 rh-0.010
bes-0.010
Token was
Token ID373
Feature activation+8.78e-01

Pos logit contributions
year+0.039
irk+0.037
and+0.036
old+0.035
ax+0.033
Neg logit contributions
-0.026
 top-0.020
 puzzles-0.017
bes-0.016
-0.015
Token her
Token ID607
Feature activation-1.22e-01

Pos logit contributions
year+0.031
irk+0.030
and+0.029
old+0.029
ules+0.027
Neg logit contributions
-0.021
 top-0.017
 puzzles-0.014
bes-0.013
 rh-0.013
Token Who
Token ID5338
Feature activation-6.82e-01

Pos logit contributions
year+0.007
irk+0.006
and+0.006
old+0.006
auc+0.005
Neg logit contributions
-0.006
 top-0.005
 puzzles-0.005
bes-0.005
adoes-0.004
Token's
Token ID338
Feature activation+6.48e-02

Pos logit contributions
+0.019
 top+0.015
 puzzles+0.013
bes+0.013
+0.012
Neg logit contributions
year-0.022
irk-0.021
and-0.020
old-0.020
ax-0.018
Token there
Token ID612
Feature activation-4.61e-01

Pos logit contributions
year+0.002
irk+0.002
old+0.002
and+0.002
izz+0.002
Neg logit contributions
-0.002
 top-0.001
bes-0.001
-0.001
 necks-0.001
Token?"
Token ID1701
Feature activation+6.54e-01

Pos logit contributions
+0.011
 top+0.008
 puzzles+0.007
bes+0.006
+0.006
Neg logit contributions
year-0.017
irk-0.017
and-0.016
old-0.015
ax-0.014
Token a
Token ID257
Feature activation+2.39e+00

Pos logit contributions
year+0.021
irk+0.020
and+0.019
old+0.018
ules+0.017
Neg logit contributions
-0.019
 top-0.016
 puzzles-0.014
 rh-0.013
-0.013
Token voice
Token ID3809
Feature activation+5.43e+00

Pos logit contributions
year+0.099
and+0.097
ash+0.094
irk+0.092
old+0.091
Neg logit contributions
-0.041
 top-0.033
 rh-0.027
-0.026
 puzzles-0.021
Token said
Token ID531
Feature activation+1.79e+00

Pos logit contributions
and+0.158
irk+0.156
year+0.150
enny+0.145
ashing+0.142
Neg logit contributions
-0.136
 top-0.115
 rh-0.104
 Blue-0.098
 Raven-0.097
Token.
Token ID13
Feature activation+9.38e-01

Pos logit contributions
year+0.072
irk+0.069
and+0.066
ules+0.065
old+0.064
Neg logit contributions
-0.034
 top-0.027
 puzzles-0.019
gy-0.019
-0.019
Token\n
Token ID198
Feature activation-1.95e-01

Pos logit contributions
year+0.034
irk+0.033
and+0.032
old+0.032
ax+0.030
Neg logit contributions
-0.023
 top-0.017
 puzzles-0.014
-0.014
bes-0.013
Token\n
Token ID198
Feature activation-1.40e+00

Pos logit contributions
+0.004
 top+0.003
+0.002
 puzzles+0.002
bes+0.002
Neg logit contributions
year-0.008
irk-0.007
and-0.007
old-0.007
ules-0.007
TokenL
Token ID43
Feature activation+1.64e+00

Pos logit contributions
+0.033
 top+0.024
 puzzles+0.020
bes+0.019
+0.018
Neg logit contributions
year-0.053
irk-0.051
and-0.049
old-0.048
ax-0.045
TokenSuddenly
Token ID38582
Feature activation-3.68e-01

Pos logit contributions
+0.032
 top+0.025
 puzzles+0.021
bes+0.021
+0.020
Neg logit contributions
year-0.040
irk-0.039
and-0.037
old-0.036
ax-0.034
Token,
Token ID11
Feature activation+9.94e-01

Pos logit contributions
+0.010
 top+0.007
bes+0.006
 puzzles+0.006
 rh+0.006
Neg logit contributions
year-0.013
irk-0.012
and-0.011
old-0.011
ax-0.010
Token Lisa
Token ID15378
Feature activation+1.34e-01

Pos logit contributions
year+0.029
irk+0.028
and+0.026
old+0.026
ax+0.024
Neg logit contributions
-0.031
 top-0.025
 puzzles-0.023
bes-0.022
-0.021
Token heard
Token ID2982
Feature activation+6.37e-01

Pos logit contributions
year+0.004
irk+0.004
and+0.004
old+0.004
ax+0.003
Neg logit contributions
-0.004
 top-0.003
 puzzles-0.003
bes-0.003
-0.003
Token a
Token ID257
Feature activation+1.03e+00

Pos logit contributions
irk+0.028
year+0.027
and+0.026
ax+0.025
izz+0.025
Neg logit contributions
-0.009
 top-0.009
 puzzles-0.006
 ones-0.004
adoes-0.004
Token voice
Token ID3809
Feature activation+5.42e+00

Pos logit contributions
year+0.036
irk+0.035
and+0.033
old+0.033
izz+0.031
Neg logit contributions
-0.026
 top-0.022
-0.017
 puzzles-0.017
 rh-0.017
Token behind
Token ID2157
Feature activation-7.25e-01

Pos logit contributions
irk+0.196
year+0.193
enny+0.190
and+0.183
ash+0.173
Neg logit contributions
-0.103
 top-0.091
 one-0.067
 rh-0.065
 Raven-0.063
Token her
Token ID607
Feature activation+1.21e+00

Pos logit contributions
+0.019
 top+0.015
 puzzles+0.013
bes+0.012
+0.012
Neg logit contributions
year-0.025
irk-0.024
and-0.023
old-0.022
ules-0.021
Token say
Token ID910
Feature activation+2.99e-01

Pos logit contributions
year+0.044
irk+0.040
and+0.039
old+0.038
izz+0.037
Neg logit contributions
-0.031
 top-0.025
 puzzles-0.020
bes-0.019
 rh-0.019
Token,
Token ID11
Feature activation-2.65e-01

Pos logit contributions
year+0.011
irk+0.011
and+0.010
old+0.010
ingle+0.010
Neg logit contributions
-0.007
 top-0.006
-0.004
 puzzles-0.004
bes-0.004
Token "
Token ID366
Feature activation+8.55e-01

Pos logit contributions
+0.006
 top+0.005
+0.004
bes+0.004
 puzzles+0.004
Neg logit contributions
year-0.010
irk-0.010
and-0.009
old-0.009
ax-0.008
Token pathway
Token ID21182
Feature activation+1.61e-01

Pos logit contributions
year+0.034
irk+0.033
and+0.031
old+0.031
ax+0.029
Neg logit contributions
-0.027
 top-0.021
 puzzles-0.018
bes-0.018
-0.017
Token,
Token ID11
Feature activation+3.94e-01

Pos logit contributions
year+0.007
irk+0.007
and+0.006
old+0.006
izz+0.006
Neg logit contributions
-0.003
 top-0.002
 ones-0.001
 puzzles-0.001
 none-0.001
Token he
Token ID339
Feature activation-9.46e-02

Pos logit contributions
year+0.010
old+0.008
irk+0.008
and+0.008
ax+0.008
Neg logit contributions
-0.013
bes-0.011
 top-0.010
gy-0.010
adoes-0.010
Token heard
Token ID2982
Feature activation+1.09e+00

Pos logit contributions
+0.003
 top+0.002
+0.002
 puzzles+0.002
bes+0.002
Neg logit contributions
irk-0.003
year-0.003
and-0.003
old-0.003
leaf-0.003
Token a
Token ID257
Feature activation+1.42e+00

Pos logit contributions
irk+0.048
year+0.047
and+0.044
izz+0.043
old+0.043
Neg logit contributions
-0.016
 top-0.014
 puzzles-0.010
 rh-0.008
 ones-0.007
Token voice
Token ID3809
Feature activation+5.42e+00

Pos logit contributions
year+0.050
irk+0.048
and+0.046
old+0.046
ax+0.044
Neg logit contributions
-0.036
 top-0.031
-0.024
 rh-0.024
 puzzles-0.024
Token.
Token ID13
Feature activation+1.04e-01

Pos logit contributions
irk+0.196
year+0.195
enny+0.189
and+0.182
ax+0.171
Neg logit contributions
-0.105
 top-0.092
 rh-0.075
 Raven-0.066
gy-0.064
Token\n
Token ID198
Feature activation+2.47e-01

Pos logit contributions
year+0.003
irk+0.003
old+0.003
and+0.003
ax+0.003
Neg logit contributions
-0.003
 top-0.002
 puzzles-0.002
bes-0.002
adoes-0.002
Token\n
Token ID198
Feature activation-1.81e+00

Pos logit contributions
year+0.009
irk+0.008
and+0.008
old+0.008
ules+0.007
Neg logit contributions
-0.006
 top-0.005
 puzzles-0.004
bes-0.004
-0.004
Token"
Token ID1
Feature activation+4.76e-01

Pos logit contributions
+0.042
 top+0.033
 puzzles+0.030
bes+0.029
adoes+0.029
Neg logit contributions
year-0.063
irk-0.060
old-0.059
and-0.058
ax-0.054
TokenBe
Token ID3856
Feature activation+1.04e+00

Pos logit contributions
year+0.018
irk+0.017
and+0.016
old+0.016
auc+0.015
Neg logit contributions
-0.012
 top-0.008
-0.007
 puzzles-0.007
 rh-0.006
Token.
Token ID13
Feature activation+9.68e-02

Pos logit contributions
+0.006
 top+0.004
bes+0.004
+0.004
 puzzles+0.004
Neg logit contributions
year-0.005
irk-0.005
and-0.005
old-0.004
ax-0.004
Token He
Token ID679
Feature activation-1.74e+00

Pos logit contributions
year+0.003
irk+0.003
and+0.003
old+0.003
ax+0.002
Neg logit contributions
-0.003
 top-0.002
 puzzles-0.002
bes-0.002
-0.002
Token says
Token ID1139
Feature activation-9.49e-01

Pos logit contributions
+0.061
 top+0.045
bes+0.042
gy+0.041
+0.041
Neg logit contributions
irk-0.043
old-0.041
year-0.041
ax-0.036
auc-0.036
Token,
Token ID11
Feature activation-1.14e+00

Pos logit contributions
+0.025
 top+0.018
bes+0.016
 puzzles+0.016
+0.014
Neg logit contributions
year-0.033
irk-0.032
and-0.030
old-0.030
ax-0.028
Token "
Token ID366
Feature activation-1.08e+00

Pos logit contributions
+0.031
 top+0.024
 puzzles+0.022
bes+0.021
 arrest+0.020
Neg logit contributions
year-0.037
irk-0.035
and-0.034
old-0.032
ules-0.031
TokenMax
Token ID11518
Feature activation-4.95e+00

Pos logit contributions
+0.031
 top+0.026
 puzzles+0.025
bes+0.023
gy+0.022
Neg logit contributions
irk-0.032
year-0.031
and-0.030
old-0.028
enny-0.026
Token,
Token ID11
Feature activation-6.58e-01

Pos logit contributions
+0.151
 top+0.130
bes+0.119
 puzzles+0.117
adoes+0.107
Neg logit contributions
irk-0.126
old-0.125
oke-0.109
year-0.108
ax-0.107
Token please
Token ID3387
Feature activation-1.44e+00

Pos logit contributions
+0.020
 top+0.016
bes+0.014
 puzzles+0.013
+0.013
Neg logit contributions
year-0.019
irk-0.019
and-0.018
old-0.018
ax-0.017
Token.
Token ID13
Feature activation+1.84e-01

Pos logit contributions
+0.043
 top+0.033
 puzzles+0.031
bes+0.030
 rh+0.028
Neg logit contributions
year-0.043
irk-0.042
old-0.040
and-0.040
ax-0.036
Token Don
Token ID2094
Feature activation-6.14e-01

Pos logit contributions
irk+0.005
year+0.005
and+0.005
old+0.005
enny+0.004
Neg logit contributions
-0.006
 top-0.005
 puzzles-0.004
bes-0.004
-0.004
Token't
Token ID470
Feature activation+7.86e-01

Pos logit contributions
+0.016
 necks+0.011
bes+0.011
 top+0.010
 tails+0.010
Neg logit contributions
year-0.020
old-0.019
and-0.019
irk-0.019
ax-0.018
Token ha
Token ID387
Feature activation-1.01e+00

Pos logit contributions
+0.062
 top+0.048
bes+0.046
 puzzles+0.045
adoes+0.041
Neg logit contributions
irk-0.081
year-0.079
old-0.078
and-0.072
ax-0.070
Token,
Token ID11
Feature activation-1.12e-01

Pos logit contributions
+0.024
 top+0.017
 puzzles+0.015
bes+0.014
 rh+0.013
Neg logit contributions
year-0.037
irk-0.036
and-0.035
old-0.035
ax-0.033
Token I
Token ID314
Feature activation-5.46e-01

Pos logit contributions
+0.003
 top+0.002
bes+0.002
 puzzles+0.002
+0.002
Neg logit contributions
year-0.003
irk-0.003
old-0.003
and-0.003
ax-0.003
Token catch
Token ID4929
Feature activation-2.45e-01

Pos logit contributions
+0.015
 top+0.012
 puzzles+0.011
bes+0.010
 rh+0.010
Neg logit contributions
year-0.018
irk-0.017
and-0.016
old-0.016
ules-0.015
Token your
Token ID534
Feature activation-1.59e+00

Pos logit contributions
+0.007
 top+0.005
bes+0.004
 puzzles+0.004
+0.004
Neg logit contributions
year-0.008
irk-0.008
and-0.007
old-0.007
ax-0.007
Token snowball
Token ID46275
Feature activation-4.92e+00

Pos logit contributions
+0.044
 top+0.028
+0.026
bes+0.026
 puzzles+0.026
Neg logit contributions
irk-0.051
year-0.050
and-0.048
old-0.047
ax-0.045
Token,
Token ID11
Feature activation+8.10e-02

Pos logit contributions
+0.163
 top+0.114
 puzzles+0.110
bes+0.107
+0.104
Neg logit contributions
irk-0.128
and-0.124
year-0.123
old-0.117
ax-0.112
Token Ben
Token ID3932
Feature activation-2.78e+00

Pos logit contributions
year+0.002
irk+0.002
and+0.002
old+0.002
ax+0.002
Neg logit contributions
-0.002
 top-0.002
 puzzles-0.002
bes-0.002
-0.002
Token!"
Token ID2474
Feature activation+3.96e-01

Pos logit contributions
+0.081
 top+0.063
 puzzles+0.056
bes+0.054
+0.051
Neg logit contributions
year-0.086
irk-0.085
and-0.080
old-0.079
ax-0.074
Token Lily
Token ID20037
Feature activation+7.77e-01

Pos logit contributions
year+0.012
irk+0.012
and+0.011
old+0.011
ules+0.010
Neg logit contributions
-0.012
 top-0.010
 puzzles-0.008
bes-0.008
-0.008
Token says
Token ID1139
Feature activation+8.11e-01

Pos logit contributions
year+0.016
irk+0.016
and+0.014
old+0.014
ax+0.012
Neg logit contributions
-0.031
 top-0.026
 puzzles-0.024
bes-0.023
-0.023
Token\n
Token ID198
Feature activation-4.45e-01

Pos logit contributions
irk+0.003
year+0.003
and+0.003
old+0.003
enny+0.002
Neg logit contributions
-0.003
 top-0.003
 puzzles-0.003
bes-0.002
 arrest-0.002
Token\n
Token ID198
Feature activation-2.02e+00

Pos logit contributions
+0.010
 top+0.008
bes+0.006
 puzzles+0.006
+0.006
Neg logit contributions
year-0.017
irk-0.016
and-0.015
old-0.015
ax-0.015
Token"
Token ID1
Feature activation-1.76e+00

Pos logit contributions
+0.051
 top+0.040
 puzzles+0.037
adoes+0.034
bes+0.034
Neg logit contributions
year-0.070
irk-0.065
old-0.064
and-0.063
izz-0.059
TokenAnna
Token ID31160
Feature activation-4.15e+00

Pos logit contributions
+0.054
 top+0.045
 puzzles+0.042
gy+0.041
 ranch+0.038
Neg logit contributions
year-0.046
irk-0.044
izz-0.040
and-0.040
leaf-0.039
Token!
Token ID0
Feature activation-6.39e-01

Pos logit contributions
+0.106
 top+0.079
 puzzles+0.069
bes+0.066
adoes+0.061
Neg logit contributions
irk-0.143
year-0.141
old-0.135
and-0.132
ax-0.125
Token Max
Token ID5436
Feature activation-4.91e+00

Pos logit contributions
+0.020
 top+0.016
 puzzles+0.015
bes+0.014
 none+0.013
Neg logit contributions
irk-0.018
year-0.018
and-0.017
old-0.016
enny-0.015
Token!"
Token ID2474
Feature activation+3.09e-01

Pos logit contributions
+0.136
 top+0.100
 puzzles+0.089
bes+0.087
adoes+0.083
Neg logit contributions
irk-0.158
year-0.151
old-0.149
and-0.142
ax-0.139
Token Lily
Token ID20037
Feature activation+1.93e-01

Pos logit contributions
year+0.008
irk+0.008
old+0.007
and+0.007
ax+0.007
Neg logit contributions
-0.011
 top-0.008
 puzzles-0.008
bes-0.008
-0.007
Token says
Token ID1139
Feature activation+8.92e-01

Pos logit contributions
year+0.004
irk+0.004
and+0.003
old+0.003
ax+0.003
Neg logit contributions
-0.008
 top-0.006
 puzzles-0.006
bes-0.006
-0.006
Token.
Token ID13
Feature activation+2.24e-01

Pos logit contributions
year+0.038
irk+0.035
ules+0.034
izz+0.034
old+0.033
Neg logit contributions
-0.015
 top-0.012
-0.009
bes-0.008
 puzzles-0.008
Token "
Token ID366
Feature activation-1.41e+00

Pos logit contributions
irk+0.007
year+0.007
and+0.006
old+0.006
ules+0.005
Neg logit contributions
-0.006
 top-0.005
 puzzles-0.005
bes-0.005
 rh-0.004
Token "
Token ID366
Feature activation-6.53e-01

Pos logit contributions
+0.004
 top+0.003
 puzzles+0.002
bes+0.002
 rh+0.002
Neg logit contributions
year-0.005
irk-0.005
and-0.005
old-0.004
ules-0.004
TokenDad
Token ID46270
Feature activation-2.85e+00

Pos logit contributions
+0.019
 top+0.016
 puzzles+0.014
bes+0.014
adoes+0.013
Neg logit contributions
irk-0.019
year-0.019
and-0.018
old-0.018
ax-0.016
Token,
Token ID11
Feature activation-6.67e-01

Pos logit contributions
+0.064
 top+0.046
 puzzles+0.038
bes+0.036
+0.034
Neg logit contributions
year-0.110
irk-0.106
and-0.102
old-0.100
ax-0.094
Token dad
Token ID9955
Feature activation-2.30e+00

Pos logit contributions
+0.022
 top+0.017
bes+0.016
 puzzles+0.015
adoes+0.014
Neg logit contributions
year-0.018
irk-0.018
old-0.017
and-0.017
ax-0.016
Token!
Token ID0
Feature activation-4.70e-01

Pos logit contributions
+0.056
 top+0.042
 puzzles+0.035
+0.034
 rh+0.033
Neg logit contributions
year-0.086
irk-0.081
and-0.078
old-0.073
ules-0.071
Token Max
Token ID5436
Feature activation-4.86e+00

Pos logit contributions
+0.014
 top+0.011
 puzzles+0.010
bes+0.009
+0.009
Neg logit contributions
year-0.015
irk-0.014
and-0.014
old-0.013
ax-0.012
Token is
Token ID318
Feature activation-1.27e+00

Pos logit contributions
+0.140
 top+0.103
 puzzles+0.096
bes+0.094
adoes+0.091
Neg logit contributions
irk-0.146
old-0.141
year-0.141
and-0.134
ax-0.131
Token at
Token ID379
Feature activation-2.10e+00

Pos logit contributions
+0.039
 top+0.029
bes+0.025
 puzzles+0.025
+0.025
Neg logit contributions
year-0.039
irk-0.038
and-0.036
old-0.035
ax-0.033
Token the
Token ID262
Feature activation-6.59e-01

Pos logit contributions
+0.062
 top+0.043
bes+0.041
+0.039
 puzzles+0.039
Neg logit contributions
year-0.066
irk-0.064
and-0.061
old-0.060
ax-0.057
Token grill
Token ID29901
Feature activation-2.04e+00

Pos logit contributions
+0.020
bes+0.014
 top+0.014
 puzzles+0.013
adoes+0.013
Neg logit contributions
year-0.019
irk-0.019
old-0.018
and-0.018
ax-0.017
Token!"
Token ID2474
Feature activation-1.13e-01

Pos logit contributions
+0.062
 top+0.048
 puzzles+0.043
bes+0.043
+0.040
Neg logit contributions
year-0.062
irk-0.059
and-0.057
old-0.055
ax-0.051
Token\n
Token ID198
Feature activation+1.47e-01

Pos logit contributions
year+0.021
irk+0.020
and+0.019
old+0.019
ax+0.018
Neg logit contributions
-0.018
 top-0.014
 puzzles-0.012
bes-0.012
-0.011
Token\n
Token ID198
Feature activation-1.75e+00

Pos logit contributions
year+0.006
irk+0.005
and+0.005
old+0.005
ax+0.005
Neg logit contributions
-0.003
 top-0.002
bes-0.002
 puzzles-0.002
-0.002
Token"
Token ID1
Feature activation-1.72e+00

Pos logit contributions
+0.043
 top+0.033
 puzzles+0.031
bes+0.028
adoes+0.027
Neg logit contributions
year-0.060
irk-0.059
old-0.057
and-0.057
izz-0.052
TokenMom
Token ID29252
Feature activation-2.42e+00

Pos logit contributions
+0.051
 top+0.041
 puzzles+0.040
gy+0.038
bes+0.037
Neg logit contributions
year-0.050
irk-0.049
and-0.045
izz-0.044
old-0.043
Token,
Token ID11
Feature activation-1.72e+00

Pos logit contributions
+0.050
 top+0.037
 puzzles+0.029
+0.027
bes+0.026
Neg logit contributions
year-0.099
irk-0.093
and-0.091
old-0.086
ules-0.083
Token Ben
Token ID3932
Feature activation-4.85e+00

Pos logit contributions
+0.056
bes+0.046
 top+0.045
opes+0.041
adoes+0.040
Neg logit contributions
irk-0.042
year-0.042
old-0.040
and-0.040
ax-0.038
Token pushed
Token ID7121
Feature activation-6.17e-01

Pos logit contributions
+0.132
 top+0.099
 puzzles+0.090
bes+0.087
adoes+0.081
Neg logit contributions
irk-0.157
year-0.150
old-0.147
and-0.141
ax-0.138
Token me
Token ID502
Feature activation-9.08e-01

Pos logit contributions
+0.017
 top+0.012
bes+0.011
 puzzles+0.011
+0.011
Neg logit contributions
year-0.020
irk-0.020
and-0.019
old-0.019
ax-0.018
Token and
Token ID290
Feature activation-1.38e+00

Pos logit contributions
+0.020
 top+0.017
 puzzles+0.013
gy+0.012
bes+0.012
Neg logit contributions
year-0.035
irk-0.033
old-0.030
and-0.030
leaf-0.029
Token took
Token ID1718
Feature activation-1.08e+00

Pos logit contributions
+0.036
 top+0.028
 puzzles+0.024
bes+0.023
+0.022
Neg logit contributions
year-0.048
irk-0.046
and-0.044
old-0.043
ules-0.040
Token the
Token ID262
Feature activation+2.28e-03

Pos logit contributions
+0.029
 top+0.019
bes+0.018
+0.018
 puzzles+0.017
Neg logit contributions
year-0.037
irk-0.036
and-0.035
old-0.034
ax-0.032
Token\n
Token ID198
Feature activation-5.02e-01

Pos logit contributions
year+0.019
irk+0.018
and+0.017
old+0.017
ax+0.016
Neg logit contributions
-0.015
 top-0.012
 puzzles-0.010
bes-0.010
-0.009
Token\n
Token ID198
Feature activation-1.45e+00

Pos logit contributions
+0.012
 top+0.010
 puzzles+0.008
bes+0.008
+0.007
Neg logit contributions
year-0.018
irk-0.018
and-0.017
old-0.016
ax-0.016
Token"
Token ID1
Feature activation-5.68e-01

Pos logit contributions
+0.035
 top+0.026
 puzzles+0.024
bes+0.023
adoes+0.022
Neg logit contributions
year-0.053
irk-0.049
and-0.048
old-0.048
izz-0.044
TokenTwenty
Token ID34096
Feature activation-5.94e-01

Pos logit contributions
+0.016
 top+0.013
gy+0.012
bes+0.012
 puzzles+0.011
Neg logit contributions
year-0.018
irk-0.017
and-0.016
old-0.014
izz-0.014
Token-
Token ID12
Feature activation+1.06e+00

Pos logit contributions
+0.018
 top+0.014
bes+0.012
 puzzles+0.012
+0.012
Neg logit contributions
year-0.017
irk-0.017
and-0.016
old-0.016
ax-0.015
Tokennine
Token ID30888
Feature activation-4.82e+00

Pos logit contributions
year+0.030
irk+0.029
and+0.027
old+0.026
ules+0.025
Neg logit contributions
-0.034
 top-0.028
 puzzles-0.025
bes-0.024
-0.023
Token,
Token ID11
Feature activation+1.28e-02

Pos logit contributions
+0.132
 top+0.105
bes+0.095
adoes+0.093
 puzzles+0.091
Neg logit contributions
irk-0.139
old-0.134
year-0.131
ax-0.130
and-0.127
Token thirty
Token ID12277
Feature activation-1.57e+00

Pos logit contributions
year+0.000
irk+0.000
and+0.000
old+0.000
ax+0.000
Neg logit contributions
-0.000
 top-0.000
bes-0.000
 puzzles-0.000
adoes-0.000
Token,
Token ID11
Feature activation-1.44e-01

Pos logit contributions
+0.047
 top+0.036
bes+0.033
 puzzles+0.032
adoes+0.032
Neg logit contributions
year-0.046
irk-0.045
and-0.043
old-0.043
ax-0.040
Token thirty
Token ID12277
Feature activation-1.64e+00

Pos logit contributions
+0.004
 top+0.003
bes+0.003
 puzzles+0.003
adoes+0.002
Neg logit contributions
year-0.005
irk-0.004
and-0.004
old-0.004
ax-0.004
Token-
Token ID12
Feature activation+1.37e+00

Pos logit contributions
+0.050
 top+0.038
bes+0.034
 puzzles+0.034
adoes+0.033
Neg logit contributions
year-0.048
irk-0.047
and-0.045
old-0.045
ax-0.042
Token.
Token ID13
Feature activation+3.21e-01

Pos logit contributions
+0.091
 top+0.072
bes+0.068
 puzzles+0.066
+0.066
Neg logit contributions
year-0.060
and-0.060
irk-0.059
old-0.057
ax-0.048
Token\n
Token ID198
Feature activation-1.11e-01

Pos logit contributions
year+0.010
irk+0.009
and+0.009
old+0.009
enny+0.008
Neg logit contributions
-0.010
 top-0.007
 puzzles-0.007
bes-0.007
adoes-0.006
Token\n
Token ID198
Feature activation-1.63e+00

Pos logit contributions
+0.003
 top+0.002
bes+0.002
 puzzles+0.002
+0.001
Neg logit contributions
year-0.004
irk-0.004
and-0.004
old-0.004
ax-0.004
Token"
Token ID1
Feature activation-7.82e-01

Pos logit contributions
+0.043
 top+0.034
 puzzles+0.034
adoes+0.032
bes+0.030
Neg logit contributions
year-0.052
irk-0.049
and-0.048
old-0.047
enny-0.044
TokenSp
Token ID4561
Feature activation+7.42e-01

Pos logit contributions
+0.024
 top+0.020
 puzzles+0.019
bes+0.017
opes+0.017
Neg logit contributions
irk-0.021
year-0.021
and-0.019
old-0.017
izz-0.017
Tokenike
Token ID522
Feature activation-4.80e+00

Pos logit contributions
year+0.017
irk+0.017
and+0.016
old+0.015
enny+0.014
Neg logit contributions
-0.028
 top-0.024
 puzzles-0.021
bes-0.020
 rh-0.020
Token,
Token ID11
Feature activation-3.90e-01

Pos logit contributions
+0.139
 top+0.105
 puzzles+0.099
bes+0.099
adoes+0.093
Neg logit contributions
irk-0.137
old-0.133
year-0.127
ax-0.120
and-0.118
Token come
Token ID1282
Feature activation+6.89e-01

Pos logit contributions
+0.013
 top+0.010
bes+0.009
 puzzles+0.009
+0.009
Neg logit contributions
year-0.011
irk-0.010
and-0.010
old-0.010
ax-0.009
Token back
Token ID736
Feature activation+2.54e-01

Pos logit contributions
year+0.020
irk+0.019
and+0.019
old+0.018
ax+0.017
Neg logit contributions
-0.022
 top-0.017
 puzzles-0.016
bes-0.015
 rh-0.014
Token!
Token ID0
Feature activation+3.43e-01

Pos logit contributions
year+0.009
irk+0.008
and+0.008
old+0.008
ules+0.007
Neg logit contributions
-0.007
 top-0.005
 puzzles-0.004
bes-0.004
-0.004
Token Please
Token ID4222
Feature activation-1.15e-01

Pos logit contributions
year+0.011
irk+0.010
and+0.010
old+0.010
ax+0.009
Neg logit contributions
-0.010
 top-0.008
 puzzles-0.007
bes-0.007
 ones-0.006
Token.
Token ID13
Feature activation-7.33e-01

Pos logit contributions
+0.002
 top+0.002
 puzzles+0.001
bes+0.001
+0.001
Neg logit contributions
year-0.002
irk-0.002
and-0.002
old-0.002
ax-0.002
Token They
Token ID1119
Feature activation-9.22e-01

Pos logit contributions
+0.026
 top+0.021
 puzzles+0.019
bes+0.019
 arrest+0.018
Neg logit contributions
irk-0.018
year-0.016
and-0.015
old-0.015
ax-0.014
Token shout
Token ID9191
Feature activation-1.22e+00

Pos logit contributions
+0.033
 top+0.027
 puzzles+0.025
bes+0.024
+0.023
Neg logit contributions
year-0.023
irk-0.022
and-0.020
old-0.020
ax-0.018
Token,
Token ID11
Feature activation-1.60e+00

Pos logit contributions
+0.035
 top+0.027
 puzzles+0.024
bes+0.023
 rh+0.022
Neg logit contributions
year-0.039
irk-0.038
and-0.036
old-0.035
ax-0.033
Token "
Token ID366
Feature activation-2.03e+00

Pos logit contributions
+0.046
 top+0.036
 puzzles+0.033
bes+0.031
 rh+0.029
Neg logit contributions
year-0.048
irk-0.047
and-0.043
old-0.042
ules-0.042
TokenSpot
Token ID32565
Feature activation-4.75e+00

Pos logit contributions
+0.064
 puzzles+0.050
bes+0.048
 top+0.048
gy+0.047
Neg logit contributions
irk-0.048
and-0.047
year-0.045
izz-0.044
ax-0.044
Token,
Token ID11
Feature activation-1.69e+00

Pos logit contributions
+0.149
 top+0.124
bes+0.118
 puzzles+0.117
adoes+0.109
Neg logit contributions
irk-0.118
old-0.112
ax-0.108
year-0.105
oke-0.100
Token stop
Token ID2245
Feature activation-9.82e-01

Pos logit contributions
+0.055
 top+0.042
bes+0.040
 puzzles+0.038
 necks+0.037
Neg logit contributions
irk-0.043
year-0.042
old-0.041
and-0.041
ax-0.040
Token!
Token ID0
Feature activation-8.88e-01

Pos logit contributions
+0.032
 top+0.024
 puzzles+0.022
bes+0.022
 rh+0.020
Neg logit contributions
year-0.027
irk-0.027
and-0.025
old-0.025
ax-0.024
Token You
Token ID921
Feature activation-1.80e+00

Pos logit contributions
+0.028
 top+0.023
 puzzles+0.022
bes+0.021
 necks+0.020
Neg logit contributions
irk-0.024
year-0.023
old-0.022
and-0.022
ax-0.021
Token are
Token ID389
Feature activation-2.16e+00

Pos logit contributions
+0.058
 top+0.044
bes+0.041
 puzzles+0.039
+0.038
Neg logit contributions
year-0.052
irk-0.049
old-0.047
and-0.047
ax-0.044
Token He
Token ID679
Feature activation-5.37e-02

Pos logit contributions
+0.005
 top+0.004
 puzzles+0.004
bes+0.004
adoes+0.003
Neg logit contributions
irk-0.005
year-0.004
old-0.004
and-0.004
ax-0.004
Token said
Token ID531
Feature activation-5.72e-01

Pos logit contributions
+0.002
 top+0.001
 puzzles+0.001
bes+0.001
 rh+0.001
Neg logit contributions
year-0.002
irk-0.001
and-0.001
old-0.001
ax-0.001
Token,
Token ID11
Feature activation-9.89e-01

Pos logit contributions
+0.015
 top+0.011
bes+0.010
 puzzles+0.010
 arrest+0.009
Neg logit contributions
year-0.020
irk-0.018
and-0.018
old-0.018
ax-0.016
Token "
Token ID366
Feature activation+4.97e-01

Pos logit contributions
+0.022
 top+0.016
 puzzles+0.013
+0.013
bes+0.012
Neg logit contributions
year-0.038
irk-0.037
and-0.035
old-0.035
ax-0.033
TokenR
Token ID49
Feature activation+2.30e-02

Pos logit contributions
year+0.017
irk+0.017
and+0.016
old+0.016
ules+0.015
Neg logit contributions
-0.013
 top-0.010
 puzzles-0.008
bes-0.008
-0.008
Tokenover
Token ID2502
Feature activation-4.73e+00

Pos logit contributions
year+0.001
irk+0.001
and+0.001
old+0.001
ax+0.000
Neg logit contributions
-0.001
 top-0.001
 puzzles-0.001
bes-0.001
-0.001
Token,
Token ID11
Feature activation-1.39e-01

Pos logit contributions
+0.137
 top+0.105
 puzzles+0.098
bes+0.096
adoes+0.092
Neg logit contributions
irk-0.126
year-0.125
old-0.123
and-0.113
ax-0.110
Token I
Token ID314
Feature activation-9.42e-02

Pos logit contributions
+0.004
 top+0.003
 puzzles+0.003
bes+0.003
+0.003
Neg logit contributions
year-0.004
irk-0.004
and-0.004
old-0.004
ax-0.004
Token have
Token ID423
Feature activation-1.78e+00

Pos logit contributions
+0.003
 top+0.002
 puzzles+0.002
 rh+0.002
+0.002
Neg logit contributions
year-0.003
irk-0.003
and-0.003
old-0.003
ules-0.003
Token a
Token ID257
Feature activation-5.56e-01

Pos logit contributions
+0.045
 top+0.033
bes+0.028
 puzzles+0.027
+0.026
Neg logit contributions
year-0.063
irk-0.060
and-0.058
old-0.058
ax-0.054
Token treat
Token ID2190
Feature activation+5.43e-01

Pos logit contributions
+0.016
 top+0.012
bes+0.011
 puzzles+0.011
adoes+0.011
Neg logit contributions
year-0.017
irk-0.017
old-0.015
and-0.015
ax-0.014
Token at
Token ID379
Feature activation-4.10e-01

Pos logit contributions
irk+0.023
year+0.022
old+0.021
and+0.020
ax+0.019
Neg logit contributions
-0.026
 top-0.024
 puzzles-0.021
adoes-0.019
bes-0.019
Token her
Token ID607
Feature activation-2.15e-01

Pos logit contributions
+0.008
 top+0.007
 puzzles+0.006
 ones+0.005
bes+0.005
Neg logit contributions
year-0.016
irk-0.016
old-0.015
and-0.015
ules-0.014
Token and
Token ID290
Feature activation-7.27e-01

Pos logit contributions
+0.006
 top+0.005
 puzzles+0.004
bes+0.004
+0.004
Neg logit contributions
year-0.007
irk-0.007
and-0.007
old-0.006
ax-0.006
Token said
Token ID531
Feature activation-6.30e-01

Pos logit contributions
+0.021
 top+0.017
 puzzles+0.015
bes+0.015
+0.014
Neg logit contributions
year-0.023
irk-0.022
and-0.021
old-0.021
ules-0.019
Token "
Token ID366
Feature activation+6.53e-01

Pos logit contributions
+0.015
 top+0.011
 puzzles+0.009
bes+0.009
+0.008
Neg logit contributions
year-0.023
irk-0.022
and-0.021
old-0.021
ax-0.020
Tokenny
Token ID3281
Feature activation-4.73e+00

Pos logit contributions
year+0.023
irk+0.023
old+0.021
and+0.021
ules+0.020
Neg logit contributions
-0.017
 top-0.012
bes-0.010
 puzzles-0.010
-0.010
Tokenah
Token ID993
Feature activation-1.70e+00

Pos logit contributions
+0.128
 top+0.100
 puzzles+0.093
bes+0.088
 hawk+0.081
Neg logit contributions
irk-0.137
year-0.136
old-0.132
and-0.125
enny-0.122
Token,
Token ID11
Feature activation-3.31e-01

Pos logit contributions
+0.045
 top+0.034
 puzzles+0.030
bes+0.029
 rh+0.027
Neg logit contributions
year-0.057
irk-0.054
and-0.053
old-0.052
ules-0.049
Token n
Token ID299
Feature activation+3.08e+00

Pos logit contributions
+0.009
 top+0.007
 puzzles+0.006
bes+0.006
+0.006
Neg logit contributions
year-0.011
irk-0.010
and-0.010
old-0.010
ax-0.009
Tokenyah
Token ID46848
Feature activation-2.19e+00

Pos logit contributions
year+0.054
irk+0.051
old+0.051
leaf+0.050
and+0.046
Neg logit contributions
-0.125
 top-0.100
gy-0.096
-0.094
bes-0.093
Token!"
Token ID2474
Feature activation+3.07e-01

Pos logit contributions
+0.061
 top+0.047
 puzzles+0.042
bes+0.040
 rh+0.038
Neg logit contributions
year-0.070
irk-0.068
and-0.064
old-0.064
ules-0.060
Token each
Token ID1123
Feature activation-1.31e+00

Pos logit contributions
+0.024
 top+0.018
bes+0.016
+0.016
 rh+0.016
Neg logit contributions
year-0.022
irk-0.021
and-0.021
old-0.020
ax-0.018
Token other
Token ID584
Feature activation-1.89e+00

Pos logit contributions
+0.039
bes+0.028
 top+0.028
 puzzles+0.026
adoes+0.025
Neg logit contributions
year-0.041
irk-0.038
and-0.038
old-0.038
ules-0.034
Token.
Token ID13
Feature activation+2.80e-01

Pos logit contributions
+0.060
 top+0.046
bes+0.042
 puzzles+0.042
 arrest+0.041
Neg logit contributions
year-0.054
irk-0.051
and-0.050
old-0.047
ax-0.043
Token They
Token ID1119
Feature activation-9.84e-01

Pos logit contributions
year+0.012
irk+0.012
and+0.011
old+0.011
ules+0.011
Neg logit contributions
-0.005
 top-0.003
 puzzles-0.002
bes-0.002
-0.002
Token were
Token ID547
Feature activation-2.76e-01

Pos logit contributions
+0.029
 top+0.022
 puzzles+0.020
bes+0.019
+0.018
Neg logit contributions
year-0.031
irk-0.030
and-0.029
old-0.028
ax-0.026
Token on
Token ID319
Feature activation-4.65e+00

Pos logit contributions
+0.009
 top+0.006
bes+0.006
+0.006
 arrest+0.006
Neg logit contributions
year-0.008
old-0.007
irk-0.007
and-0.007
ules-0.007
Token a
Token ID257
Feature activation+5.00e-01

Pos logit contributions
+0.153
+0.107
eros+0.104
bes+0.104
adoes+0.101
Neg logit contributions
year-0.121
and-0.110
irk-0.107
old-0.107
 Jas-0.100
Token trip
Token ID5296
Feature activation+1.63e-01

Pos logit contributions
year+0.013
irk+0.013
and+0.013
old+0.013
ax+0.011
Neg logit contributions
-0.016
adoes-0.011
 top-0.011
bes-0.011
 puzzles-0.010
Token with
Token ID351
Feature activation-6.54e-01

Pos logit contributions
irk+0.006
year+0.006
old+0.005
and+0.005
izz+0.005
Neg logit contributions
-0.004
 top-0.003
 puzzles-0.003
 ones-0.002
gy-0.002
Token their
Token ID511
Feature activation-1.30e+00

Pos logit contributions
+0.015
 top+0.012
 puzzles+0.011
bes+0.009
 ones+0.009
Neg logit contributions
year-0.024
irk-0.024
and-0.022
old-0.022
ax-0.021
Token mom
Token ID1995
Feature activation+1.18e+00

Pos logit contributions
+0.035
bes+0.023
 top+0.022
adoes+0.020
+0.020
Neg logit contributions
irk-0.041
year-0.040
old-0.039
and-0.039
ules-0.035
Token and
Token ID290
Feature activation-4.40e-01

Pos logit contributions
year+0.024
irk+0.023
and+0.023
old+0.022
ax+0.020
Neg logit contributions
-0.024
 top-0.019
 puzzles-0.017
bes-0.016
-0.016
Token said
Token ID531
Feature activation-1.12e+00

Pos logit contributions
+0.017
 top+0.014
 puzzles+0.013
bes+0.012
+0.012
Neg logit contributions
year-0.010
irk-0.010
and-0.009
old-0.009
ax-0.008
Token,
Token ID11
Feature activation-1.03e+00

Pos logit contributions
+0.029
 top+0.022
bes+0.020
 puzzles+0.019
 necks+0.018
Neg logit contributions
year-0.038
irk-0.036
and-0.035
old-0.034
ax-0.032
Token "
Token ID366
Feature activation+4.79e-01

Pos logit contributions
+0.024
 top+0.018
 puzzles+0.015
bes+0.014
+0.014
Neg logit contributions
year-0.038
irk-0.037
and-0.036
old-0.035
ax-0.033
TokenB
Token ID33
Feature activation+2.72e-02

Pos logit contributions
year+0.016
irk+0.016
and+0.015
old+0.015
ax+0.014
Neg logit contributions
-0.013
 top-0.010
 puzzles-0.009
bes-0.009
-0.008
Tokenenny
Token ID11870
Feature activation-4.65e+00

Pos logit contributions
year+0.000
irk+0.000
old+0.000
and+0.000
ules+0.000
Neg logit contributions
-0.001
 top-0.001
-0.001
bes-0.001
 puzzles-0.001
Token,
Token ID11
Feature activation-6.73e-01

Pos logit contributions
+0.141
 top+0.113
 puzzles+0.103
bes+0.102
How+0.093
Neg logit contributions
irk-0.123
old-0.123
year-0.118
enny-0.106
and-0.106
Token I
Token ID314
Feature activation-2.62e-01

Pos logit contributions
+0.019
 top+0.015
 puzzles+0.013
bes+0.013
+0.012
Neg logit contributions
year-0.021
irk-0.021
and-0.020
old-0.019
ax-0.018
Token have
Token ID423
Feature activation-2.06e+00

Pos logit contributions
+0.007
 top+0.006
 puzzles+0.005
bes+0.005
 rh+0.005
Neg logit contributions
year-0.009
irk-0.008
and-0.008
old-0.007
ules-0.007
Token a
Token ID257
Feature activation-6.16e-01

Pos logit contributions
+0.060
 top+0.043
bes+0.041
+0.038
adoes+0.037
Neg logit contributions
year-0.064
and-0.062
irk-0.060
old-0.059
ax-0.056
Token present
Token ID1944
Feature activation+4.96e-01

Pos logit contributions
+0.017
 top+0.012
bes+0.011
 puzzles+0.011
+0.010
Neg logit contributions
year-0.020
irk-0.020
and-0.019
old-0.019
ax-0.017
Token,
Token ID11
Feature activation-2.93e-01

Pos logit contributions
+0.035
 top+0.027
bes+0.024
 puzzles+0.023
adoes+0.022
Neg logit contributions
year-0.044
irk-0.042
and-0.040
old-0.040
ax-0.037
Token "
Token ID366
Feature activation+1.38e-01

Pos logit contributions
+0.007
 top+0.005
 puzzles+0.005
bes+0.005
 rh+0.004
Neg logit contributions
year-0.010
irk-0.010
and-0.010
old-0.009
ax-0.009
TokenL
Token ID43
Feature activation+1.40e+00

Pos logit contributions
year+0.005
irk+0.005
old+0.004
ules+0.004
and+0.004
Neg logit contributions
-0.004
 top-0.003
-0.002
 ones-0.002
 puzzles-0.002
Tokenily
Token ID813
Feature activation-2.73e+00

Pos logit contributions
year+0.028
irk+0.027
and+0.023
old+0.023
ules+0.022
Neg logit contributions
-0.055
 top-0.044
 puzzles-0.044
bes-0.043
 ones-0.040
Token and
Token ID290
Feature activation-9.27e-01

Pos logit contributions
+0.072
 top+0.055
 puzzles+0.048
bes+0.046
+0.043
Neg logit contributions
year-0.093
irk-0.090
and-0.086
old-0.085
ax-0.079
Token Luke
Token ID11336
Feature activation-4.63e+00

Pos logit contributions
+0.032
 top+0.026
bes+0.024
 puzzles+0.023
 rh+0.023
Neg logit contributions
year-0.023
irk-0.023
and-0.021
old-0.021
ax-0.019
Token,
Token ID11
Feature activation-7.90e-01

Pos logit contributions
+0.120
 top+0.096
bes+0.082
 puzzles+0.079
adoes+0.077
Neg logit contributions
year-0.148
irk-0.147
old-0.145
and-0.139
ax-0.130
Token it
Token ID340
Feature activation-1.24e+00

Pos logit contributions
+0.022
 top+0.017
 puzzles+0.014
bes+0.014
+0.013
Neg logit contributions
year-0.026
irk-0.025
and-0.024
old-0.024
ax-0.022
Token is
Token ID318
Feature activation-7.24e-01

Pos logit contributions
+0.036
 top+0.027
bes+0.025
 arrest+0.024
 puzzles+0.023
Neg logit contributions
year-0.037
old-0.035
irk-0.034
and-0.034
auc-0.033
Token time
Token ID640
Feature activation-6.42e-01

Pos logit contributions
+0.025
bes+0.018
 arrest+0.017
 top+0.017
+0.017
Neg logit contributions
year-0.019
irk-0.017
and-0.017
ax-0.016
old-0.016
Token to
Token ID284
Feature activation+4.88e-01

Pos logit contributions
+0.015
 top+0.013
 puzzles+0.012
+0.010
 ones+0.010
Neg logit contributions
irk-0.022
year-0.022
old-0.020
izz-0.019
ax-0.019
Token,
Token ID11
Feature activation-1.06e+00

Pos logit contributions
+0.027
 top+0.020
 puzzles+0.016
bes+0.016
+0.015
Neg logit contributions
year-0.047
irk-0.046
and-0.044
old-0.043
ax-0.041
Token "
Token ID366
Feature activation-1.45e+00

Pos logit contributions
+0.030
 top+0.025
 puzzles+0.023
bes+0.022
 arrest+0.021
Neg logit contributions
year-0.031
irk-0.030
and-0.028
old-0.027
ules-0.026
TokenS
Token ID50
Feature activation-8.16e-01

Pos logit contributions
+0.041
 top+0.032
 puzzles+0.029
bes+0.028
 rh+0.026
Neg logit contributions
year-0.047
irk-0.046
and-0.043
old-0.042
ax-0.040
Tokenara
Token ID3301
Feature activation-3.11e+00

Pos logit contributions
+0.025
 top+0.019
 puzzles+0.017
bes+0.017
+0.016
Neg logit contributions
year-0.025
irk-0.024
and-0.023
old-0.022
ax-0.021
Token and
Token ID290
Feature activation-1.44e+00

Pos logit contributions
+0.077
 top+0.058
 puzzles+0.048
bes+0.048
+0.043
Neg logit contributions
year-0.109
irk-0.109
old-0.104
and-0.102
ax-0.095
Token Ben
Token ID3932
Feature activation-4.61e+00

Pos logit contributions
+0.049
 top+0.040
 puzzles+0.036
bes+0.035
+0.034
Neg logit contributions
year-0.039
irk-0.037
and-0.035
old-0.034
ax-0.031
Token,
Token ID11
Feature activation-9.95e-01

Pos logit contributions
+0.119
 top+0.092
bes+0.080
 puzzles+0.079
adoes+0.073
Neg logit contributions
irk-0.153
year-0.150
old-0.145
and-0.141
ax-0.133
Token you
Token ID345
Feature activation-6.80e-01

Pos logit contributions
+0.028
 top+0.021
 puzzles+0.019
bes+0.018
+0.017
Neg logit contributions
year-0.033
irk-0.032
and-0.030
old-0.030
ax-0.028
Token are
Token ID389
Feature activation-9.27e-01

Pos logit contributions
+0.023
 top+0.018
bes+0.017
 puzzles+0.016
+0.016
Neg logit contributions
year-0.018
irk-0.018
old-0.017
and-0.016
ax-0.015
Token both
Token ID1111
Feature activation-1.65e+00

Pos logit contributions
+0.031
 top+0.022
bes+0.021
+0.020
gy+0.019
Neg logit contributions
year-0.027
irk-0.025
old-0.025
and-0.025
ax-0.023
Token good
Token ID922
Feature activation-6.39e-01

Pos logit contributions
+0.046
 top+0.036
 puzzles+0.031
bes+0.031
+0.029
Neg logit contributions
year-0.054
irk-0.052
and-0.050
old-0.049
ax-0.046
Token,
Token ID11
Feature activation-1.74e-01

Pos logit contributions
+0.025
 top+0.019
bes+0.017
 puzzles+0.016
 rh+0.015
Neg logit contributions
year-0.031
irk-0.030
and-0.029
old-0.029
ax-0.027
Token "
Token ID366
Feature activation-8.53e-01

Pos logit contributions
+0.004
 top+0.003
 puzzles+0.003
bes+0.003
 rh+0.002
Neg logit contributions
year-0.006
irk-0.006
and-0.006
old-0.006
ax-0.005
TokenL
Token ID43
Feature activation+1.13e+00

Pos logit contributions
+0.025
 top+0.019
 puzzles+0.017
bes+0.017
 rh+0.016
Neg logit contributions
year-0.027
irk-0.027
and-0.025
old-0.024
ax-0.023
Tokenily
Token ID813
Feature activation-3.65e+00

Pos logit contributions
year+0.024
irk+0.022
and+0.020
old+0.019
ules+0.018
Neg logit contributions
-0.043
 top-0.036
 puzzles-0.033
bes-0.033
 arrest-0.032
Token and
Token ID290
Feature activation-9.13e-01

Pos logit contributions
+0.095
 top+0.072
 puzzles+0.062
bes+0.062
adoes+0.056
Neg logit contributions
year-0.122
irk-0.121
old-0.118
and-0.114
ax-0.108
Token Ben
Token ID3932
Feature activation-4.60e+00

Pos logit contributions
+0.026
 top+0.021
 puzzles+0.019
 ones+0.018
+0.017
Neg logit contributions
year-0.029
irk-0.027
and-0.027
old-0.025
ules-0.024
Token,
Token ID11
Feature activation-7.26e-01

Pos logit contributions
+0.120
 top+0.090
 puzzles+0.079
bes+0.079
adoes+0.071
Neg logit contributions
year-0.153
irk-0.153
old-0.147
and-0.144
ax-0.136
Token why
Token ID1521
Feature activation-1.02e+00

Pos logit contributions
+0.020
 top+0.015
 puzzles+0.013
bes+0.013
+0.012
Neg logit contributions
year-0.024
irk-0.024
and-0.022
old-0.022
ax-0.021
Token did
Token ID750
Feature activation-2.97e+00

Pos logit contributions
+0.030
 top+0.024
bes+0.021
 puzzles+0.020
+0.019
Neg logit contributions
year-0.032
irk-0.031
and-0.029
old-0.029
ax-0.027
Token you
Token ID345
Feature activation+1.68e-01

Pos logit contributions
+0.120
bes+0.092
 necks+0.090
 top+0.089
gy+0.089
Neg logit contributions
old-0.050
 Jill-0.047
 Jenny-0.044
year-0.043
 expect-0.043
Token fight
Token ID1907
Feature activation-5.60e-01

Pos logit contributions
year+0.006
irk+0.005
and+0.005
old+0.005
ules+0.005
Neg logit contributions
-0.005
 top-0.004
 puzzles-0.003
bes-0.003
-0.003
Token say
Token ID910
Feature activation-1.57e+00

Pos logit contributions
+0.041
 top+0.033
 puzzles+0.030
bes+0.030
+0.029
Neg logit contributions
year-0.029
irk-0.028
and-0.026
old-0.026
ax-0.023
Token,
Token ID11
Feature activation-1.69e+00

Pos logit contributions
+0.045
 top+0.034
bes+0.033
 puzzles+0.029
icer+0.028
Neg logit contributions
year-0.049
irk-0.047
old-0.046
and-0.045
auc-0.041
Token "
Token ID366
Feature activation-1.63e+00

Pos logit contributions
+0.046
 top+0.036
 puzzles+0.033
bes+0.032
 arrest+0.030
Neg logit contributions
year-0.054
irk-0.051
and-0.048
old-0.046
ules-0.045
TokenNo
Token ID2949
Feature activation-2.41e+00

Pos logit contributions
+0.048
 top+0.042
 puzzles+0.039
bes+0.036
gy+0.036
Neg logit contributions
irk-0.043
year-0.042
and-0.042
old-0.039
ax-0.037
Token,
Token ID11
Feature activation-8.09e-01

Pos logit contributions
+0.070
bes+0.049
 top+0.048
 puzzles+0.044
+0.042
Neg logit contributions
irk-0.075
old-0.072
year-0.070
and-0.068
ax-0.066
Token Max
Token ID5436
Feature activation-4.58e+00

Pos logit contributions
+0.025
 top+0.019
bes+0.018
gy+0.017
+0.016
Neg logit contributions
year-0.022
irk-0.022
old-0.020
and-0.020
ax-0.019
Token!
Token ID0
Feature activation-7.28e-01

Pos logit contributions
+0.149
 top+0.123
bes+0.114
 puzzles+0.114
adoes+0.105
Neg logit contributions
irk-0.116
old-0.113
year-0.106
and-0.099
ax-0.098
Token These
Token ID2312
Feature activation-2.08e+00

Pos logit contributions
+0.022
 top+0.018
bes+0.017
 puzzles+0.017
 hawk+0.015
Neg logit contributions
irk-0.020
year-0.020
and-0.019
old-0.019
enny-0.017
Token are
Token ID389
Feature activation-1.43e+00

Pos logit contributions
+0.068
bes+0.051
 top+0.049
 Raven+0.047
 hawk+0.046
Neg logit contributions
year-0.053
and-0.052
irk-0.051
old-0.051
auc-0.048
Token our
Token ID674
Feature activation-1.70e+00

Pos logit contributions
+0.038
 top+0.025
bes+0.024
+0.022
+0.021
Neg logit contributions
year-0.050
irk-0.046
and-0.045
old-0.045
ax-0.043
Token blocks
Token ID7021
Feature activation-2.41e+00

Pos logit contributions
+0.047
bes+0.032
+0.028
 top+0.026
 hawk+0.026
Neg logit contributions
and-0.054
year-0.053
irk-0.052
old-0.052
auc-0.049
Tokenmy
Token ID1820
Feature activation-3.70e+00

Pos logit contributions
+0.045
 top+0.035
 puzzles+0.030
+0.028
 rh+0.028
Neg logit contributions
year-0.064
irk-0.060
and-0.058
old-0.055
leaf-0.052
Token,
Token ID11
Feature activation-9.11e-01

Pos logit contributions
+0.086
 top+0.062
 puzzles+0.052
bes+0.051
+0.047
Neg logit contributions
year-0.139
irk-0.135
and-0.129
old-0.127
ax-0.120
Token mom
Token ID1995
Feature activation-9.58e-01

Pos logit contributions
+0.029
 top+0.023
bes+0.023
 puzzles+0.020
adoes+0.020
Neg logit contributions
year-0.024
irk-0.024
old-0.023
and-0.022
ax-0.021
Tokenmy
Token ID1820
Feature activation-3.63e+00

Pos logit contributions
+0.032
 top+0.026
 puzzles+0.024
+0.023
 rh+0.023
Neg logit contributions
year-0.027
irk-0.025
and-0.024
old-0.022
leaf-0.021
Token!
Token ID0
Feature activation-1.11e+00

Pos logit contributions
+0.087
 top+0.064
 puzzles+0.054
bes+0.052
+0.049
Neg logit contributions
year-0.134
irk-0.129
and-0.123
old-0.122
ax-0.114
Token Max
Token ID5436
Feature activation-4.58e+00

Pos logit contributions
+0.035
 top+0.028
 puzzles+0.027
bes+0.026
 ones+0.024
Neg logit contributions
irk-0.030
year-0.030
and-0.028
old-0.027
enny-0.025
Token broke
Token ID6265
Feature activation-1.06e+00

Pos logit contributions
+0.128
 top+0.095
 puzzles+0.086
bes+0.083
adoes+0.080
Neg logit contributions
irk-0.142
year-0.137
old-0.136
and-0.130
ax-0.126
Token my
Token ID616
Feature activation-8.00e-02

Pos logit contributions
+0.029
 top+0.018
bes+0.018
 puzzles+0.017
+0.017
Neg logit contributions
year-0.035
irk-0.034
and-0.034
old-0.033
ax-0.032
Token dress
Token ID6576
Feature activation-2.85e+00

Pos logit contributions
+0.002
 top+0.002
+0.002
 puzzles+0.002
adoes+0.001
Neg logit contributions
irk-0.002
year-0.002
and-0.002
old-0.002
auc-0.002
Token!"
Token ID2474
Feature activation+6.93e-01

Pos logit contributions
+0.081
 top+0.063
 puzzles+0.055
bes+0.054
+0.051
Neg logit contributions
year-0.092
irk-0.088
and-0.084
old-0.082
ax-0.077
Token Sara
Token ID24799
Feature activation+7.63e-01

Pos logit contributions
year+0.019
irk+0.018
and+0.017
old+0.017
ax+0.015
Neg logit contributions
-0.023
 top-0.018
 puzzles-0.017
bes-0.017
-0.016
Token are
Token ID389
Feature activation-2.50e-01

Pos logit contributions
+0.063
 top+0.053
 puzzles+0.048
bes+0.047
adoes+0.045
Neg logit contributions
year-0.045
irk-0.044
old-0.042
and-0.039
ax-0.037
Token twins
Token ID20345
Feature activation-7.69e-01

Pos logit contributions
+0.007
 top+0.006
 puzzles+0.005
bes+0.005
+0.005
Neg logit contributions
year-0.008
irk-0.008
and-0.007
old-0.007
ax-0.007
Token.
Token ID13
Feature activation+5.19e-01

Pos logit contributions
+0.027
 top+0.021
adoes+0.020
 puzzles+0.020
bes+0.020
Neg logit contributions
year-0.018
irk-0.018
and-0.017
old-0.017
ax-0.015
Token They
Token ID1119
Feature activation-9.93e-01

Pos logit contributions
year+0.019
irk+0.018
and+0.017
old+0.017
ules+0.016
Neg logit contributions
-0.013
 top-0.010
 puzzles-0.008
bes-0.008
-0.008
Token live
Token ID2107
Feature activation+9.00e-01

Pos logit contributions
+0.031
 top+0.025
 puzzles+0.023
bes+0.022
+0.021
Neg logit contributions
year-0.029
irk-0.028
and-0.026
old-0.025
ules-0.024
Token on
Token ID319
Feature activation-4.57e+00

Pos logit contributions
year+0.031
irk+0.031
old+0.030
and+0.029
ules+0.029
Neg logit contributions
-0.022
 top-0.017
 puzzles-0.015
bes-0.014
-0.014
Token a
Token ID257
Feature activation+5.44e-01

Pos logit contributions
+0.159
+0.112
adoes+0.099
bes+0.097
 top+0.094
Neg logit contributions
year-0.120
and-0.109
irk-0.106
old-0.104
 Jill-0.098
Token farm
Token ID5318
Feature activation-1.26e+00

Pos logit contributions
year+0.016
irk+0.015
and+0.015
old+0.015
auc+0.013
Neg logit contributions
-0.017
 top-0.012
adoes-0.011
bes-0.011
 puzzles-0.011
Token with
Token ID351
Feature activation-3.80e-01

Pos logit contributions
+0.038
 top+0.029
 puzzles+0.025
bes+0.025
adoes+0.025
Neg logit contributions
year-0.038
and-0.036
irk-0.036
old-0.034
ax-0.030
Token their
Token ID511
Feature activation-1.31e+00

Pos logit contributions
+0.009
 top+0.007
 puzzles+0.006
bes+0.006
 rh+0.006
Neg logit contributions
year-0.013
irk-0.013
and-0.013
old-0.012
ax-0.012
Token mom
Token ID1995
Feature activation+1.31e+00

Pos logit contributions
+0.039
bes+0.027
 top+0.026
adoes+0.025
+0.023
Neg logit contributions
year-0.037
irk-0.037
old-0.035
and-0.034
leaf-0.031
Token and
Token ID290
Feature activation-3.52e-01

Pos logit contributions
+0.043
 top+0.033
 puzzles+0.030
bes+0.030
+0.028
Neg logit contributions
year-0.038
irk-0.037
and-0.036
old-0.034
ax-0.032
Token said
Token ID531
Feature activation-1.04e+00

Pos logit contributions
+0.011
 top+0.009
 puzzles+0.008
bes+0.008
+0.008
Neg logit contributions
year-0.010
irk-0.010
and-0.009
old-0.009
ax-0.008
Token,
Token ID11
Feature activation-1.07e+00

Pos logit contributions
+0.029
bes+0.020
 top+0.020
 puzzles+0.019
 arrest+0.018
Neg logit contributions
year-0.034
irk-0.032
and-0.031
old-0.031
auc-0.028
Token "
Token ID366
Feature activation+4.68e-01

Pos logit contributions
+0.025
 top+0.019
 puzzles+0.016
bes+0.015
+0.014
Neg logit contributions
year-0.040
irk-0.038
and-0.037
old-0.036
ax-0.034
TokenFin
Token ID18467
Feature activation-8.96e-01

Pos logit contributions
year+0.014
irk+0.014
and+0.013
ax+0.012
old+0.012
Neg logit contributions
-0.013
 puzzles-0.011
 top-0.011
bes-0.010
adoes-0.009
Tokenny
Token ID3281
Feature activation-4.57e+00

Pos logit contributions
+0.029
 top+0.023
 puzzles+0.021
bes+0.021
adoes+0.019
Neg logit contributions
year-0.025
irk-0.025
and-0.023
old-0.023
ax-0.022
Token,
Token ID11
Feature activation-1.88e-01

Pos logit contributions
+0.132
 top+0.101
bes+0.097
 puzzles+0.096
adoes+0.085
Neg logit contributions
irk-0.133
year-0.126
old-0.126
and-0.122
ax-0.117
Token why
Token ID1521
Feature activation-4.07e-01

Pos logit contributions
+0.006
 top+0.005
bes+0.004
 puzzles+0.004
adoes+0.004
Neg logit contributions
year-0.006
irk-0.005
and-0.005
old-0.005
ax-0.005
Token do
Token ID466
Feature activation-9.96e-01

Pos logit contributions
+0.013
 top+0.010
bes+0.009
 puzzles+0.009
+0.008
Neg logit contributions
year-0.012
irk-0.012
old-0.011
and-0.011
ax-0.010
Token you
Token ID345
Feature activation+1.14e+00

Pos logit contributions
+0.035
 top+0.028
bes+0.027
 necks+0.025
gy+0.025
Neg logit contributions
old-0.021
and-0.021
year-0.021
irk-0.019
ax-0.019
Token always
Token ID1464
Feature activation+5.61e-01

Pos logit contributions
year+0.036
irk+0.034
and+0.032
old+0.031
enny+0.029
Neg logit contributions
-0.033
 top-0.028
 puzzles-0.024
 rh-0.023
bes-0.023
Token a
Token ID257
Feature activation+1.52e-01

Pos logit contributions
year+0.036
irk+0.034
and+0.033
old+0.033
ax+0.030
Neg logit contributions
-0.024
 top-0.017
bes-0.014
 puzzles-0.014
-0.014
Token boy
Token ID2933
Feature activation+2.68e-01

Pos logit contributions
year+0.005
irk+0.005
and+0.005
old+0.005
ax+0.004
Neg logit contributions
-0.004
 top-0.003
 puzzles-0.003
bes-0.003
adoes-0.003
Token and
Token ID290
Feature activation-1.34e+00

Pos logit contributions
year+0.009
irk+0.009
and+0.008
old+0.008
ax+0.007
Neg logit contributions
-0.007
 top-0.006
 puzzles-0.005
bes-0.005
-0.005
Token his
Token ID465
Feature activation-5.57e-01

Pos logit contributions
+0.037
 top+0.028
 puzzles+0.025
bes+0.024
+0.023
Neg logit contributions
year-0.045
irk-0.043
and-0.041
old-0.040
ax-0.038
Token family
Token ID1641
Feature activation-4.85e-01

Pos logit contributions
+0.013
 top+0.010
 puzzles+0.008
bes+0.007
 rh+0.007
Neg logit contributions
year-0.021
irk-0.020
and-0.020
old-0.019
ax-0.018
Token on
Token ID319
Feature activation-4.55e+00

Pos logit contributions
+0.015
 top+0.012
bes+0.011
 puzzles+0.011
adoes+0.010
Neg logit contributions
year-0.014
irk-0.013
old-0.013
and-0.013
ax-0.012
Token a
Token ID257
Feature activation-5.24e-01

Pos logit contributions
+0.170
arians+0.127
+0.126
adoes+0.115
ados+0.114
Neg logit contributions
year-0.077
old-0.071
 Jill-0.067
 Jas-0.066
 Jenny-0.065
Token vacation
Token ID14600
Feature activation-5.37e-01

Pos logit contributions
+0.015
 top+0.010
adoes+0.010
bes+0.010
 puzzles+0.010
Neg logit contributions
year-0.016
irk-0.015
and-0.015
old-0.015
ax-0.013
Token.
Token ID13
Feature activation-3.79e-01

Pos logit contributions
+0.014
 top+0.011
 puzzles+0.009
+0.009
 ones+0.009
Neg logit contributions
year-0.019
irk-0.018
and-0.017
old-0.017
ules-0.016
Token Every
Token ID3887
Feature activation-7.44e-01

Pos logit contributions
+0.011
 top+0.009
bes+0.008
 puzzles+0.008
+0.007
Neg logit contributions
year-0.012
irk-0.011
and-0.010
old-0.010
ax-0.010
Token day
Token ID1110
Feature activation-7.62e-01

Pos logit contributions
+0.017
 top+0.015
 ones+0.011
 rh+0.011
 zoo+0.010
Neg logit contributions
year-0.027
irk-0.027
and-0.025
old-0.025
ax-0.024
Token so
Token ID523
Feature activation+5.06e-01

Pos logit contributions
+0.002
 top+0.002
 puzzles+0.002
 ones+0.002
bes+0.001
Neg logit contributions
year-0.004
irk-0.003
old-0.003
and-0.003
izz-0.003
Token he
Token ID339
Feature activation+1.90e-01

Pos logit contributions
year+0.018
irk+0.018
and+0.017
old+0.016
ules+0.016
Neg logit contributions
-0.013
 top-0.010
 puzzles-0.008
 rh-0.008
 none-0.008
Token wanted
Token ID2227
Feature activation+4.50e-01

Pos logit contributions
irk+0.006
year+0.006
and+0.006
old+0.006
ules+0.005
Neg logit contributions
-0.005
 top-0.004
 puzzles-0.004
bes-0.003
-0.003
Token to
Token ID284
Feature activation+9.42e-01

Pos logit contributions
year+0.015
irk+0.015
old+0.013
and+0.013
ules+0.013
Neg logit contributions
-0.012
 top-0.011
 puzzles-0.009
 ones-0.008
 none-0.008
Token get
Token ID651
Feature activation+1.30e+00

Pos logit contributions
year+0.031
irk+0.029
and+0.027
old+0.026
ules+0.025
Neg logit contributions
-0.027
 top-0.024
 puzzles-0.020
 arrest-0.019
bes-0.019
Token rid
Token ID5755
Feature activation+4.36e+00

Pos logit contributions
irk+0.050
year+0.049
and+0.047
ules+0.046
old+0.045
Neg logit contributions
-0.025
 top-0.023
 puzzles-0.016
 ones-0.016
 none-0.016
Token of
Token ID286
Feature activation+3.11e-01

Pos logit contributions
irk+0.090
year+0.088
leaf+0.076
old+0.076
ules+0.075
Neg logit contributions
-0.145
 top-0.128
adoes-0.115
 things-0.115
 ones-0.115
Token it
Token ID340
Feature activation+2.36e-01

Pos logit contributions
year+0.011
irk+0.011
and+0.011
old+0.010
ules+0.010
Neg logit contributions
-0.007
 top-0.006
 puzzles-0.005
bes-0.005
 rh-0.004
Token.
Token ID13
Feature activation-3.17e-01

Pos logit contributions
year+0.008
irk+0.008
old+0.007
and+0.007
ules+0.007
Neg logit contributions
-0.006
 top-0.005
 puzzles-0.004
bes-0.004
 ones-0.004
Token But
Token ID887
Feature activation-3.58e-01

Pos logit contributions
+0.009
 top+0.007
 puzzles+0.006
bes+0.006
 rh+0.006
Neg logit contributions
year-0.010
irk-0.010
and-0.009
old-0.009
ax-0.008
Token it
Token ID340
Feature activation+8.32e-01

Pos logit contributions
+0.010
 top+0.008
bes+0.007
 puzzles+0.007
adoes+0.007
Neg logit contributions
year-0.011
irk-0.010
and-0.010
old-0.010
ax-0.009
Token was
Token ID373
Feature activation-2.28e-01

Pos logit contributions
+0.017
 top+0.012
bes+0.011
 puzzles+0.011
+0.010
Neg logit contributions
year-0.025
irk-0.024
and-0.023
old-0.023
auc-0.021
Token a
Token ID257
Feature activation-4.28e-01

Pos logit contributions
+0.006
 top+0.004
 puzzles+0.004
bes+0.004
+0.004
Neg logit contributions
year-0.008
irk-0.008
and-0.007
old-0.007
ax-0.007
Token close
Token ID1969
Feature activation+1.56e+00

Pos logit contributions
+0.012
 top+0.008
bes+0.008
 puzzles+0.008
adoes+0.007
Neg logit contributions
year-0.014
irk-0.014
and-0.013
old-0.013
ax-0.012
Token call
Token ID869
Feature activation+4.54e-01

Pos logit contributions
year+0.049
irk+0.048
and+0.046
old+0.044
ax+0.041
Neg logit contributions
-0.044
 top-0.037
 puzzles-0.031
 rh-0.031
 ones-0.030
Token,"
Token ID553
Feature activation+6.33e-01

Pos logit contributions
year+0.016
irk+0.016
and+0.015
old+0.015
izz+0.014
Neg logit contributions
-0.011
 top-0.009
 puzzles-0.007
-0.007
 rh-0.007
Token mom
Token ID1995
Feature activation+3.93e+00

Pos logit contributions
year+0.020
irk+0.019
and+0.019
old+0.018
ules+0.017
Neg logit contributions
-0.018
 top-0.015
 puzzles-0.013
bes-0.012
 rh-0.012
Token said
Token ID531
Feature activation+1.37e+00

Pos logit contributions
irk+0.116
year+0.116
and+0.108
olt+0.107
ashing+0.103
Neg logit contributions
-0.087
 top-0.072
-0.067
 puzzles-0.064
 one-0.063
Token.
Token ID13
Feature activation-3.48e-01

Pos logit contributions
year+0.056
irk+0.056
ash+0.051
ules+0.051
weet+0.051
Neg logit contributions
-0.021
 top-0.015
-0.012
 none-0.011
bes-0.010
Token
Token ID198
Feature activation-4.15e-01

Pos logit contributions
+0.008
 top+0.006
 puzzles+0.005
bes+0.005
+0.005
Neg logit contributions
year-0.013
irk-0.013
and-0.012
old-0.012
ax-0.011
Token
Token ID198
Feature activation-1.51e+00

Pos logit contributions
+0.009
 top+0.007
bes+0.006
 puzzles+0.006
+0.006
Neg logit contributions
year-0.016
irk-0.015
and-0.014
old-0.014
ax-0.013
Token"
Token ID1
Feature activation-1.31e+00

Pos logit contributions
+0.036
 top+0.027
 puzzles+0.024
bes+0.022
 rh+0.021
Neg logit contributions
year-0.055
irk-0.053
and-0.051
old-0.051
ax-0.047
Token it
Token ID340
Feature activation+2.92e-01

Pos logit contributions
+0.035
 top+0.025
bes+0.024
+0.024
 puzzles+0.023
Neg logit contributions
year-0.030
irk-0.029
and-0.026
old-0.026
ax-0.024
Token.
Token ID13
Feature activation-1.15e-01

Pos logit contributions
year+0.010
irk+0.010
and+0.009
old+0.009
ules+0.008
Neg logit contributions
-0.008
 top-0.006
 puzzles-0.005
bes-0.005
-0.005
Token Lily
Token ID20037
Feature activation+4.56e-01

Pos logit contributions
+0.003
 top+0.003
 puzzles+0.002
bes+0.002
+0.002
Neg logit contributions
year-0.004
irk-0.004
and-0.003
old-0.003
ules-0.003
Token and
Token ID290
Feature activation-8.64e-01

Pos logit contributions
year+0.014
irk+0.013
and+0.013
old+0.012
ax+0.011
Neg logit contributions
-0.014
 top-0.011
 puzzles-0.010
bes-0.009
adoes-0.009
Token the
Token ID262
Feature activation-1.66e-01

Pos logit contributions
+0.029
 top+0.023
bes+0.022
 puzzles+0.020
+0.020
Neg logit contributions
irk-0.023
year-0.022
old-0.021
and-0.020
ax-0.019
Token babys
Token ID46711
Feature activation+3.91e+00

Pos logit contributions
+0.005
 top+0.004
 puzzles+0.003
bes+0.003
+0.003
Neg logit contributions
year-0.005
irk-0.005
and-0.005
old-0.005
ax-0.004
Tokenitter
Token ID1967
Feature activation+6.17e-01

Pos logit contributions
year+0.079
irk+0.078
and+0.062
leaf+0.061
ax+0.060
Neg logit contributions
-0.148
 top-0.119
gy-0.116
-0.116
adoes-0.113
Token looked
Token ID3114
Feature activation-1.15e-01

Pos logit contributions
irk+0.021
year+0.020
and+0.019
old+0.018
ax+0.018
Neg logit contributions
-0.015
 top-0.012
 puzzles-0.011
-0.010
 rh-0.010
Token inside
Token ID2641
Feature activation-7.54e-03

Pos logit contributions
+0.004
 top+0.003
 puzzles+0.003
bes+0.003
+0.002
Neg logit contributions
year-0.003
irk-0.003
and-0.003
old-0.003
ules-0.003
Token and
Token ID290
Feature activation-2.03e-01

Pos logit contributions
+0.000
 top+0.000
 puzzles+0.000
+0.000
 ones+0.000
Neg logit contributions
year-0.000
irk-0.000
and-0.000
old-0.000
ules-0.000
Token saw
Token ID2497
Feature activation+7.49e-01

Pos logit contributions
+0.005
 top+0.004
 puzzles+0.003
 none+0.003
+0.003
Neg logit contributions
irk-0.008
year-0.007
and-0.007
old-0.007
ules-0.007
Token day
Token ID1110
Feature activation-4.95e-01

Pos logit contributions
year+0.065
irk+0.065
and+0.060
old+0.059
ax+0.058
Neg logit contributions
-0.051
 top-0.041
 puzzles-0.032
 rh-0.032
 ones-0.032
Token,
Token ID11
Feature activation+4.26e-01

Pos logit contributions
+0.013
 top+0.010
bes+0.010
 puzzles+0.009
adoes+0.009
Neg logit contributions
year-0.016
irk-0.015
and-0.015
old-0.015
enny-0.014
Token she
Token ID673
Feature activation+6.60e-02

Pos logit contributions
year+0.007
 Jenny+0.006
and+0.006
irk+0.006
 Jill+0.006
Neg logit contributions
-0.017
bes-0.014
adoes-0.013
opes-0.013
 hawk-0.013
Token heard
Token ID2982
Feature activation+9.45e-01

Pos logit contributions
year+0.002
irk+0.002
and+0.002
old+0.002
leaf+0.002
Neg logit contributions
-0.002
 top-0.002
 puzzles-0.001
-0.001
bes-0.001
Token a
Token ID257
Feature activation+9.82e-01

Pos logit contributions
irk+0.044
year+0.042
and+0.040
ax+0.039
ules+0.038
Neg logit contributions
-0.011
 top-0.010
 puzzles-0.007
 rh-0.005
adoes-0.005
Token sound
Token ID2128
Feature activation+4.01e+00

Pos logit contributions
year+0.033
irk+0.031
and+0.030
old+0.029
ax+0.028
Neg logit contributions
-0.027
 top-0.022
 puzzles-0.019
 rh-0.018
-0.018
Token that
Token ID326
Feature activation+1.06e+00

Pos logit contributions
irk+0.128
year+0.124
and+0.116
old+0.112
enny+0.112
Neg logit contributions
-0.106
 top-0.093
 puzzles-0.073
 rh-0.072
 ones-0.070
Token she
Token ID673
Feature activation+9.08e-01

Pos logit contributions
irk+0.037
year+0.036
and+0.035
old+0.033
ax+0.033
Neg logit contributions
-0.026
 top-0.023
 puzzles-0.019
 rh-0.017
 ones-0.017
Token had
Token ID550
Feature activation-9.11e-01

Pos logit contributions
year+0.032
irk+0.032
and+0.031
old+0.030
ax+0.028
Neg logit contributions
-0.022
 top-0.018
 puzzles-0.014
 rh-0.014
-0.014
Token never
Token ID1239
Feature activation+1.57e+00

Pos logit contributions
+0.021
 top+0.014
 puzzles+0.013
bes+0.012
+0.012
Neg logit contributions
year-0.035
irk-0.033
and-0.032
old-0.032
ules-0.030
Token heard
Token ID2982
Feature activation+1.09e+00

Pos logit contributions
year+0.043
irk+0.043
auc+0.040
old+0.037
ingle+0.037
Neg logit contributions
-0.044
 top-0.038
 ones-0.036
 puzzles-0.036
gy-0.034
Token up
Token ID510
Feature activation-3.16e-01

Pos logit contributions
year+0.023
irk+0.022
and+0.021
old+0.021
ules+0.019
Neg logit contributions
-0.021
 top-0.017
 puzzles-0.015
bes-0.014
-0.014
Token when
Token ID618
Feature activation+1.00e-01

Pos logit contributions
+0.008
 top+0.006
 puzzles+0.006
bes+0.005
+0.005
Neg logit contributions
year-0.011
irk-0.010
and-0.010
old-0.010
ax-0.009
Token he
Token ID339
Feature activation+7.80e-02

Pos logit contributions
year+0.002
irk+0.002
old+0.002
and+0.002
ules+0.002
Neg logit contributions
-0.004
bes-0.003
 puzzles-0.003
-0.002
adoes-0.002
Token heard
Token ID2982
Feature activation+7.22e-01

Pos logit contributions
year+0.002
irk+0.002
and+0.002
old+0.002
ules+0.002
Neg logit contributions
-0.002
 top-0.002
 puzzles-0.002
bes-0.002
-0.002
Token a
Token ID257
Feature activation+9.73e-01

Pos logit contributions
year+0.029
irk+0.029
and+0.027
old+0.026
enny+0.025
Neg logit contributions
-0.014
 top-0.011
 puzzles-0.009
 rh-0.008
bes-0.007
Token voice
Token ID3809
Feature activation+4.67e+00

Pos logit contributions
year+0.036
irk+0.033
and+0.033
old+0.033
ax+0.032
Neg logit contributions
-0.023
 top-0.019
 rh-0.015
-0.015
 puzzles-0.015
Token.
Token ID13
Feature activation+8.37e-01

Pos logit contributions
year+0.176
irk+0.174
enny+0.165
and+0.165
old+0.151
Neg logit contributions
-0.090
 top-0.076
 rh-0.060
 Raven-0.052
 hawk-0.052
Token
Token ID198
Feature activation+2.88e-01

Pos logit contributions
year+0.028
irk+0.027
and+0.026
old+0.025
ax+0.023
Neg logit contributions
-0.023
 top-0.018
 puzzles-0.015
bes-0.015
-0.015
Token
Token ID198
Feature activation-7.11e-01

Pos logit contributions
year+0.010
irk+0.010
and+0.009
old+0.009
ax+0.009
Neg logit contributions
-0.007
 top-0.005
bes-0.004
-0.004
 puzzles-0.004
Token"
Token ID1
Feature activation+3.70e-01

Pos logit contributions
+0.017
 top+0.013
 puzzles+0.012
bes+0.011
 ones+0.010
Neg logit contributions
year-0.026
irk-0.025
old-0.024
and-0.024
ax-0.022
TokenLet
Token ID5756
Feature activation-3.96e-01

Pos logit contributions
year+0.013
irk+0.013
and+0.012
old+0.012
ules+0.011
Neg logit contributions
-0.010
 top-0.007
 puzzles-0.006
bes-0.006
-0.006
Token and
Token ID290
Feature activation-6.69e-01

Pos logit contributions
year+0.005
irk+0.005
old+0.005
and+0.005
izz+0.005
Neg logit contributions
-0.003
 top-0.003
 puzzles-0.002
 ones-0.002
-0.002
Token enjoy
Token ID2883
Feature activation-3.85e-01

Pos logit contributions
+0.017
 top+0.015
 puzzles+0.012
bes+0.011
 rh+0.011
Neg logit contributions
irk-0.023
year-0.023
and-0.021
old-0.020
ules-0.020
Token the
Token ID262
Feature activation+8.29e-01

Pos logit contributions
+0.009
 top+0.007
 puzzles+0.006
bes+0.006
+0.005
Neg logit contributions
year-0.014
irk-0.014
and-0.013
old-0.013
ax-0.012
Token view
Token ID1570
Feature activation+1.83e+00

Pos logit contributions
year+0.028
irk+0.027
and+0.025
old+0.025
ax+0.023
Neg logit contributions
-0.023
 top-0.018
 puzzles-0.016
bes-0.015
-0.014
Token.
Token ID13
Feature activation+1.22e-01

Pos logit contributions
irk+0.067
year+0.066
old+0.060
izz+0.057
ules+0.056
Neg logit contributions
-0.040
 top-0.038
 puzzles-0.030
 ones-0.028
 rh-0.025
Token
Token ID220
Feature activation+1.01e+00

Pos logit contributions
year+0.004
irk+0.004
and+0.004
old+0.004
auc+0.004
Neg logit contributions
-0.003
 top-0.002
 puzzles-0.002
bes-0.002
-0.002
Token
Token ID198
Feature activation+2.80e-01

Pos logit contributions
year+0.040
irk+0.038
and+0.034
old+0.034
leaf+0.033
Neg logit contributions
-0.022
 top-0.017
bes-0.013
-0.012
 puzzles-0.012
Token
Token ID198
Feature activation-1.20e+00

Pos logit contributions
year+0.011
irk+0.011
auc+0.010
old+0.010
and+0.010
Neg logit contributions
-0.005
 top-0.004
bes-0.003
-0.003
 puzzles-0.003
TokenThe
Token ID464
Feature activation-2.65e-01

Pos logit contributions
+0.032
 top+0.025
 puzzles+0.022
bes+0.021
+0.020
Neg logit contributions
year-0.040
irk-0.039
and-0.037
old-0.036
ules-0.034
Token tire
Token ID15867
Feature activation-2.60e-01

Pos logit contributions
+0.007
bes+0.005
gy+0.005
adoes+0.005
 puzzles+0.005
Neg logit contributions
year-0.008
irk-0.008
old-0.007
and-0.007
ax-0.007
Token had
Token ID550
Feature activation-8.24e-01

Pos logit contributions
+0.007
 top+0.006
 puzzles+0.005
bes+0.005
+0.005
Neg logit contributions
year-0.008
irk-0.008
and-0.008
old-0.008
ax-0.007
Token was
Token ID373
Feature activation+1.02e+00

Pos logit contributions
year+0.019
irk+0.019
and+0.018
old+0.017
ules+0.016
Neg logit contributions
-0.022
 top-0.017
 puzzles-0.016
bes-0.015
-0.015
Token a
Token ID257
Feature activation+1.00e-01

Pos logit contributions
year+0.029
irk+0.028
and+0.026
old+0.025
ax+0.023
Neg logit contributions
-0.034
 top-0.027
 puzzles-0.024
bes-0.024
-0.023
Token little
Token ID1310
Feature activation+6.12e-01

Pos logit contributions
year+0.003
irk+0.002
old+0.002
and+0.002
enny+0.002
Neg logit contributions
-0.003
bes-0.003
adoes-0.003
opes-0.002
 top-0.002
Token scared
Token ID12008
Feature activation+1.56e+00

Pos logit contributions
year+0.019
irk+0.018
and+0.017
old+0.016
ules+0.015
Neg logit contributions
-0.019
 top-0.015
 puzzles-0.013
-0.013
 rh-0.013
Token,
Token ID11
Feature activation-7.24e-02

Pos logit contributions
year+0.062
irk+0.058
and+0.058
old+0.056
leaf+0.055
Neg logit contributions
-0.030
 top-0.023
 ones-0.017
 puzzles-0.017
 rh-0.017
Token but
Token ID475
Feature activation+1.04e+00

Pos logit contributions
+0.002
 top+0.001
 puzzles+0.001
 rh+0.001
+0.001
Neg logit contributions
year-0.003
irk-0.003
and-0.002
old-0.002
izz-0.002
Token he
Token ID339
Feature activation+9.34e-01

Pos logit contributions
irk+0.038
year+0.037
and+0.036
auc+0.034
old+0.034
Neg logit contributions
-0.023
 top-0.018
 puzzles-0.015
 none-0.014
-0.014
Token agreed
Token ID4987
Feature activation-6.10e-02

Pos logit contributions
year+0.027
irk+0.027
and+0.025
old+0.024
auc+0.023
Neg logit contributions
-0.029
 top-0.023
-0.020
 puzzles-0.020
bes-0.019
Token.
Token ID13
Feature activation-4.65e-02

Pos logit contributions
+0.001
 top+0.001
 puzzles+0.001
bes+0.001
+0.001
Neg logit contributions
year-0.002
irk-0.002
old-0.002
izz-0.002
and-0.002
Token The
Token ID383
Feature activation+3.71e-01

Pos logit contributions
+0.001
 top+0.001
 puzzles+0.001
+0.001
bes+0.001
Neg logit contributions
year-0.002
irk-0.002
old-0.002
and-0.002
izz-0.001
Token two
Token ID734
Feature activation+7.41e-01

Pos logit contributions
year+0.016
irk+0.015
and+0.015
old+0.015
auc+0.014
Neg logit contributions
-0.007
 top-0.004
bes-0.003
 puzzles-0.003
-0.003
Token,
Token ID11
Feature activation+1.43e+00

Pos logit contributions
+0.036
 top+0.026
 puzzles+0.022
bes+0.022
+0.021
Neg logit contributions
year-0.052
irk-0.050
and-0.048
old-0.047
ax-0.044
Token Pete
Token ID17542
Feature activation+4.18e-02

Pos logit contributions
year+0.035
irk+0.034
and+0.030
old+0.030
ax+0.029
Neg logit contributions
-0.050
bes-0.039
 top-0.039
 puzzles-0.039
 rh-0.036
Token used
Token ID973
Feature activation-5.71e-01

Pos logit contributions
year+0.001
irk+0.001
and+0.001
old+0.001
ax+0.001
Neg logit contributions
-0.001
 top-0.001
 puzzles-0.001
bes-0.001
-0.001
Token his
Token ID465
Feature activation-8.53e-01

Pos logit contributions
+0.012
 top+0.010
 puzzles+0.008
bes+0.007
 ones+0.007
Neg logit contributions
year-0.022
irk-0.022
and-0.021
old-0.020
ules-0.019
Token paints
Token ID32281
Feature activation-1.56e+00

Pos logit contributions
+0.024
 top+0.017
bes+0.016
 rh+0.016
gy+0.016
Neg logit contributions
year-0.025
irk-0.025
and-0.023
old-0.023
enny-0.021
Token to
Token ID284
Feature activation+1.07e+00

Pos logit contributions
+0.043
 top+0.035
 puzzles+0.030
bes+0.028
 ones+0.027
Neg logit contributions
year-0.052
irk-0.050
and-0.047
old-0.047
ules-0.044
Token paint
Token ID7521
Feature activation-1.43e-01

Pos logit contributions
irk+0.039
year+0.038
and+0.036
old+0.035
ules+0.034
Neg logit contributions
-0.026
 top-0.024
 puzzles-0.020
bes-0.017
 none-0.016
Token a
Token ID257
Feature activation+5.11e-01

Pos logit contributions
+0.003
 top+0.003
 puzzles+0.002
 ones+0.002
bes+0.002
Neg logit contributions
year-0.005
irk-0.005
and-0.005
old-0.005
ules-0.005
Token picture
Token ID4286
Feature activation+1.35e+00

Pos logit contributions
year+0.018
irk+0.017
and+0.016
old+0.016
ules+0.015
Neg logit contributions
-0.013
 top-0.011
 puzzles-0.009
bes-0.008
-0.008
Token on
Token ID319
Feature activation-1.58e+00

Pos logit contributions
year+0.041
irk+0.040
and+0.035
old+0.035
izz+0.034
Neg logit contributions
-0.039
 top-0.035
 puzzles-0.032
 ones-0.029
bes-0.027
Token the
Token ID262
Feature activation+1.22e-01

Pos logit contributions
+0.043
bes+0.029
+0.027
 top+0.027
 puzzles+0.026
Neg logit contributions
year-0.051
irk-0.048
and-0.046
old-0.045
ax-0.043
TokenOnce
Token ID7454
Feature activation-1.83e-02

Pos logit contributions
year+0.001
irk+0.001
old+0.001
and+0.001
ax+0.001
Neg logit contributions
-0.000
 top-0.000
 puzzles-0.000
bes-0.000
-0.000
Token upon
Token ID2402
Feature activation+1.27e+00

Pos logit contributions
+0.001
 top+0.000
 puzzles+0.000
+0.000
 ones+0.000
Neg logit contributions
year-0.001
irk-0.001
and-0.000
leaf-0.000
old-0.000
Token a
Token ID257
Feature activation-3.74e-01

Pos logit contributions
year+0.051
irk+0.050
and+0.047
old+0.047
ules+0.044
Neg logit contributions
-0.025
 top-0.018
 puzzles-0.015
bes-0.014
 rh-0.013
Token time
Token ID640
Feature activation-1.21e+00

Pos logit contributions
+0.011
 top+0.008
 puzzles+0.007
bes+0.007
adoes+0.007
Neg logit contributions
year-0.012
irk-0.012
and-0.011
old-0.011
ax-0.010
Token there
Token ID612
Feature activation+4.11e-01

Pos logit contributions
+0.032
 top+0.024
bes+0.022
 puzzles+0.022
adoes+0.020
Neg logit contributions
year-0.040
and-0.039
irk-0.038
old-0.037
enny-0.035
Token was
Token ID373
Feature activation+8.91e-01

Pos logit contributions
year+0.015
irk+0.015
and+0.014
old+0.014
ax+0.013
Neg logit contributions
-0.010
 top-0.007
 puzzles-0.006
bes-0.006
-0.006
Token a
Token ID257
Feature activation+1.50e-01

Pos logit contributions
year+0.033
irk+0.031
and+0.030
old+0.030
ax+0.028
Neg logit contributions
-0.022
 top-0.015
bes-0.013
 puzzles-0.013
-0.013
Token daddy
Token ID35822
Feature activation+1.04e+00

Pos logit contributions
year+0.005
irk+0.005
and+0.005
old+0.005
ax+0.004
Neg logit contributions
-0.004
 top-0.003
 puzzles-0.003
bes-0.002
-0.002
Token who
Token ID508
Feature activation-4.27e-01

Pos logit contributions
year+0.038
irk+0.038
and+0.034
old+0.033
ax+0.033
Neg logit contributions
-0.024
 top-0.017
 rh-0.016
gy-0.015
 puzzles-0.015
Token loved
Token ID6151
Feature activation-1.01e+00

Pos logit contributions
+0.012
 top+0.009
 puzzles+0.008
bes+0.008
 rh+0.008
Neg logit contributions
year-0.014
irk-0.014
and-0.013
old-0.013
ax-0.012
Token his
Token ID465
Feature activation-6.05e-01

Pos logit contributions
+0.034
 top+0.027
 puzzles+0.024
bes+0.024
+0.023
Neg logit contributions
year-0.028
irk-0.027
and-0.025
old-0.025
ules-0.023
Token the
Token ID262
Feature activation+5.95e-01

Pos logit contributions
+0.011
 top+0.010
 puzzles+0.008
 ones+0.007
bes+0.007
Neg logit contributions
year-0.022
irk-0.021
and-0.021
old-0.021
auc-0.020
Token drawer
Token ID33451
Feature activation-1.96e-01

Pos logit contributions
year+0.022
irk+0.020
and+0.020
old+0.020
ules+0.019
Neg logit contributions
-0.015
 top-0.012
 puzzles-0.010
bes-0.009
 ones-0.009
Token and
Token ID290
Feature activation-2.61e-01

Pos logit contributions
+0.004
 top+0.004
 puzzles+0.003
bes+0.003
 none+0.003
Neg logit contributions
irk-0.007
year-0.007
old-0.006
izz-0.006
and-0.006
Token was
Token ID373
Feature activation+1.18e+00

Pos logit contributions
+0.007
 top+0.006
 puzzles+0.005
bes+0.005
 rh+0.005
Neg logit contributions
irk-0.009
year-0.009
and-0.008
old-0.008
ules-0.008
Token ready
Token ID3492
Feature activation+4.54e-01

Pos logit contributions
irk+0.043
and+0.039
year+0.039
enny+0.037
auc+0.037
Neg logit contributions
-0.025
 top-0.024
 none-0.019
 puzzles-0.018
 rh-0.018
Token to
Token ID284
Feature activation+1.90e+00

Pos logit contributions
irk+0.018
year+0.017
old+0.017
izz+0.016
auc+0.016
Neg logit contributions
-0.008
 top-0.008
 not-0.005
 puzzles-0.005
 one-0.005
Token give
Token ID1577
Feature activation+1.62e+00

Pos logit contributions
year+0.064
irk+0.062
and+0.059
old+0.055
ules+0.054
Neg logit contributions
-0.052
 top-0.046
 puzzles-0.037
 arrest-0.036
bes-0.035
Token it
Token ID340
Feature activation-8.64e-02

Pos logit contributions
irk+0.069
year+0.069
oke+0.065
auc+0.064
ules+0.063
Neg logit contributions
 top-0.026
-0.026
 things-0.017
 ones-0.017
 one-0.015
Token a
Token ID257
Feature activation+1.92e-01

Pos logit contributions
+0.002
 top+0.002
 puzzles+0.002
 ones+0.001
bes+0.001
Neg logit contributions
year-0.003
irk-0.003
and-0.003
old-0.003
ules-0.002
Token new
Token ID649
Feature activation+3.59e-01

Pos logit contributions
year+0.007
irk+0.006
and+0.006
old+0.006
ules+0.006
Neg logit contributions
-0.005
 top-0.004
 puzzles-0.003
 rh-0.003
 lot-0.003
Token home
Token ID1363
Feature activation+2.19e+00

Pos logit contributions
year+0.012
irk+0.012
and+0.011
old+0.011
ax+0.010
Neg logit contributions
-0.010
 top-0.008
 puzzles-0.006
bes-0.006
 rh-0.006
Token birds
Token ID10087
Feature activation+2.42e-01

Pos logit contributions
+0.007
 top+0.005
 puzzles+0.005
bes+0.004
 ones+0.004
Neg logit contributions
year-0.011
irk-0.010
and-0.010
old-0.010
ax-0.009
Token and
Token ID290
Feature activation-1.38e+00

Pos logit contributions
year+0.008
irk+0.008
and+0.008
old+0.007
ules+0.007
Neg logit contributions
-0.006
 top-0.005
 puzzles-0.004
bes-0.004
-0.004
Token trees
Token ID7150
Feature activation-9.58e-01

Pos logit contributions
+0.033
 top+0.024
 puzzles+0.020
bes+0.020
+0.018
Neg logit contributions
year-0.051
irk-0.049
and-0.047
old-0.047
ules-0.044
Token and
Token ID290
Feature activation-1.04e+00

Pos logit contributions
+0.029
 top+0.022
 puzzles+0.019
bes+0.019
+0.019
Neg logit contributions
year-0.029
irk-0.028
and-0.027
old-0.026
ules-0.024
Token people
Token ID661
Feature activation+1.59e+00

Pos logit contributions
+0.022
 top+0.016
 puzzles+0.014
bes+0.012
 rh+0.012
Neg logit contributions
year-0.041
irk-0.040
and-0.038
old-0.038
ax-0.036
Token.
Token ID13
Feature activation+2.87e-01

Pos logit contributions
year+0.056
irk+0.056
and+0.051
old+0.050
ax+0.048
Neg logit contributions
-0.039
 top-0.030
 puzzles-0.028
bes-0.026
 ones-0.025
Token
Token ID198
Feature activation-1.23e-01

Pos logit contributions
year+0.010
irk+0.009
and+0.009
old+0.008
ax+0.008
Neg logit contributions
-0.008
 top-0.006
 puzzles-0.005
bes-0.005
-0.005
Token
Token ID198
Feature activation-1.38e+00

Pos logit contributions
+0.003
 top+0.002
bes+0.002
 puzzles+0.002
+0.002
Neg logit contributions
year-0.005
irk-0.004
and-0.004
old-0.004
ax-0.004
TokenBut
Token ID1537
Feature activation-5.73e-01

Pos logit contributions
+0.033
 top+0.024
 puzzles+0.022
bes+0.021
adoes+0.020
Neg logit contributions
year-0.051
irk-0.048
and-0.047
old-0.047
ules-0.043
Token then
Token ID788
Feature activation-1.60e+00

Pos logit contributions
+0.015
 top+0.010
bes+0.010
 puzzles+0.010
adoes+0.009
Neg logit contributions
year-0.019
irk-0.017
old-0.017
and-0.017
ax-0.016
Token,
Token ID11
Feature activation+1.69e-01

Pos logit contributions
+0.050
bes+0.040
 top+0.037
+0.035
adoes+0.035
Neg logit contributions
year-0.039
old-0.039
irk-0.038
 Jill-0.036
 Jenny-0.036
Token with
Token ID351
Feature activation-1.79e+00

Pos logit contributions
+0.049
 top+0.037
 puzzles+0.033
bes+0.033
+0.031
Neg logit contributions
year-0.055
irk-0.053
and-0.050
old-0.049
ax-0.046
Token it
Token ID340
Feature activation-2.28e+00

Pos logit contributions
+0.045
 top+0.031
 puzzles+0.027
bes+0.027
+0.026
Neg logit contributions
year-0.065
irk-0.063
and-0.059
old-0.059
ax-0.055
Token.
Token ID13
Feature activation-1.26e-01

Pos logit contributions
+0.070
 top+0.053
 puzzles+0.048
bes+0.047
+0.046
Neg logit contributions
year-0.069
irk-0.066
and-0.064
old-0.061
ax-0.057
Token She
Token ID1375
Feature activation-1.38e+00

Pos logit contributions
+0.004
 top+0.003
 puzzles+0.003
bes+0.002
+0.002
Neg logit contributions
year-0.004
irk-0.004
and-0.004
old-0.004
ax-0.003
Token screamed
Token ID25421
Feature activation-7.51e-01

Pos logit contributions
+0.050
 top+0.039
 puzzles+0.036
bes+0.035
+0.035
Neg logit contributions
year-0.034
irk-0.033
and-0.031
old-0.030
ax-0.027
Token and
Token ID290
Feature activation-1.10e+00

Pos logit contributions
+0.019
 top+0.015
 puzzles+0.013
bes+0.012
+0.012
Neg logit contributions
year-0.026
irk-0.025
old-0.024
and-0.024
ules-0.023
Token cried
Token ID16896
Feature activation+1.05e+00

Pos logit contributions
+0.032
 top+0.027
 puzzles+0.023
bes+0.023
 rh+0.022
Neg logit contributions
year-0.034
irk-0.032
and-0.032
old-0.030
ules-0.028
Token.
Token ID13
Feature activation-1.58e-01

Pos logit contributions
year+0.039
irk+0.037
old+0.036
and+0.035
ules+0.034
Neg logit contributions
-0.024
 top-0.020
 puzzles-0.016
 ones-0.015
adoes-0.014
Token The
Token ID383
Feature activation-1.04e+00

Pos logit contributions
+0.005
 top+0.004
 puzzles+0.003
bes+0.003
 rh+0.003
Neg logit contributions
year-0.005
irk-0.005
and-0.004
old-0.004
enny-0.004
Token nails
Token ID23361
Feature activation-1.06e+00

Pos logit contributions
+0.032
 top+0.023
bes+0.023
 puzzles+0.022
gy+0.021
Neg logit contributions
year-0.030
irk-0.030
old-0.028
and-0.027
ax-0.025
Token hurt
Token ID5938
Feature activation-1.43e-01

Pos logit contributions
+0.029
 top+0.024
 puzzles+0.020
bes+0.020
+0.019
Neg logit contributions
year-0.035
irk-0.034
and-0.032
old-0.031
ules-0.029
Token air
Token ID1633
Feature activation+4.41e-01

Pos logit contributions
year+0.069
irk+0.067
and+0.065
old+0.064
enny+0.060
Neg logit contributions
-0.039
 top-0.033
 ones-0.023
 puzzles-0.023
 rh-0.022
Token.
Token ID13
Feature activation+2.16e-01

Pos logit contributions
year+0.016
irk+0.016
and+0.015
old+0.015
enny+0.014
Neg logit contributions
-0.011
 top-0.009
bes-0.007
 puzzles-0.007
 ones-0.007
Token Everything
Token ID11391
Feature activation+7.06e-01

Pos logit contributions
year+0.006
irk+0.006
and+0.005
old+0.005
ules+0.005
Neg logit contributions
-0.007
 top-0.006
 puzzles-0.005
-0.005
bes-0.005
Token was
Token ID373
Feature activation+1.08e+00

Pos logit contributions
year+0.027
irk+0.027
and+0.025
old+0.024
ax+0.023
Neg logit contributions
-0.016
 top-0.012
 puzzles-0.010
bes-0.009
-0.009
Token damp
Token ID21151
Feature activation-4.66e-01

Pos logit contributions
year+0.036
irk+0.035
and+0.033
old+0.033
enny+0.031
Neg logit contributions
-0.029
 top-0.024
 puzzles-0.020
bes-0.019
 rh-0.019
Token and
Token ID290
Feature activation-1.08e+00

Pos logit contributions
+0.012
 top+0.010
 puzzles+0.008
bes+0.008
+0.008
Neg logit contributions
year-0.016
irk-0.015
and-0.015
old-0.015
ax-0.013
Token wet
Token ID9583
Feature activation-7.52e-01

Pos logit contributions
+0.028
 top+0.023
 puzzles+0.020
bes+0.018
+0.018
Neg logit contributions
year-0.037
irk-0.036
and-0.035
old-0.034
ax-0.031
Token.
Token ID13
Feature activation+4.57e-01

Pos logit contributions
+0.021
 top+0.017
 puzzles+0.014
bes+0.014
+0.013
Neg logit contributions
year-0.025
irk-0.024
and-0.023
old-0.022
ax-0.021
Token
Token ID198
Feature activation+3.52e-01

Pos logit contributions
year+0.013
irk+0.013
and+0.012
old+0.011
ules+0.011
Neg logit contributions
-0.015
 top-0.012
 puzzles-0.011
-0.010
bes-0.010
Token
Token ID198
Feature activation-1.35e+00

Pos logit contributions
year+0.014
irk+0.014
old+0.012
and+0.012
auc+0.012
Neg logit contributions
-0.007
 top-0.006
-0.004
bes-0.004
 puzzles-0.004
TokenJack
Token ID14295
Feature activation-4.73e-01

Pos logit contributions
+0.049
 top+0.041
 puzzles+0.038
bes+0.037
 rh+0.036
Neg logit contributions
year-0.033
irk-0.031
and-0.029
old-0.029
ax-0.026
Token you
Token ID345
Feature activation+7.20e-01

Pos logit contributions
year+0.016
irk+0.015
and+0.014
old+0.014
ules+0.013
Neg logit contributions
-0.011
 top-0.008
 puzzles-0.007
-0.007
bes-0.006
Token should
Token ID815
Feature activation-1.17e+00

Pos logit contributions
year+0.022
irk+0.021
and+0.020
old+0.020
ax+0.018
Neg logit contributions
-0.022
 top-0.017
 puzzles-0.015
bes-0.015
-0.014
Token always
Token ID1464
Feature activation+6.09e-01

Pos logit contributions
+0.031
 top+0.023
 puzzles+0.020
bes+0.020
+0.019
Neg logit contributions
year-0.040
irk-0.039
old-0.037
and-0.037
ax-0.035
Token try
Token ID1949
Feature activation-4.06e-01

Pos logit contributions
year+0.021
irk+0.020
and+0.019
old+0.018
ules+0.018
Neg logit contributions
-0.017
 top-0.013
 puzzles-0.011
bes-0.011
 rh-0.010
Token something
Token ID1223
Feature activation+5.77e-01

Pos logit contributions
+0.012
 top+0.009
 puzzles+0.008
bes+0.008
+0.008
Neg logit contributions
year-0.012
irk-0.012
and-0.012
old-0.011
ax-0.010
Token at
Token ID379
Feature activation-6.54e-01

Pos logit contributions
year+0.019
irk+0.018
old+0.017
and+0.017
ules+0.016
Neg logit contributions
-0.016
 top-0.013
 puzzles-0.011
bes-0.011
-0.010
Token least
Token ID1551
Feature activation+1.07e+00

Pos logit contributions
+0.015
 top+0.012
 puzzles+0.010
bes+0.009
 ones+0.009
Neg logit contributions
year-0.025
irk-0.024
and-0.023
old-0.023
ules-0.021
Token twice
Token ID5403
Feature activation+6.85e-01

Pos logit contributions
year+0.041
irk+0.040
and+0.036
auc+0.035
enny+0.035
Neg logit contributions
-0.025
 top-0.020
 puzzles-0.016
-0.014
bes-0.014
Token.
Token ID13
Feature activation+8.44e-01

Pos logit contributions
year+0.023
irk+0.022
and+0.021
old+0.020
ax+0.019
Neg logit contributions
-0.019
 top-0.015
 puzzles-0.013
bes-0.013
-0.012
Token You
Token ID921
Feature activation+4.76e-01

Pos logit contributions
year+0.028
irk+0.027
and+0.026
old+0.025
ax+0.024
Neg logit contributions
-0.023
 top-0.017
 puzzles-0.015
bes-0.015
-0.014
Token might
Token ID1244
Feature activation-5.89e-01

Pos logit contributions
year+0.014
irk+0.014
old+0.013
and+0.013
ax+0.012
Neg logit contributions
-0.014
 top-0.011
bes-0.010
 puzzles-0.010
-0.009
Token to
Token ID284
Feature activation+1.29e+00

Pos logit contributions
year+0.028
irk+0.027
and+0.025
old+0.025
ules+0.023
Neg logit contributions
-0.026
 top-0.021
 puzzles-0.019
bes-0.018
-0.017
Token eat
Token ID4483
Feature activation+5.03e-01

Pos logit contributions
year+0.045
irk+0.043
and+0.040
old+0.039
ules+0.037
Neg logit contributions
-0.035
 top-0.029
 puzzles-0.024
bes-0.023
 rh-0.022
Token.
Token ID13
Feature activation+7.64e-01

Pos logit contributions
year+0.017
irk+0.017
and+0.016
old+0.016
ax+0.015
Neg logit contributions
-0.013
 top-0.010
 puzzles-0.009
bes-0.009
-0.008
Token All
Token ID1439
Feature activation+1.57e+00

Pos logit contributions
year+0.025
irk+0.024
and+0.023
old+0.023
ax+0.021
Neg logit contributions
-0.021
 top-0.016
 puzzles-0.014
bes-0.014
-0.013
Token my
Token ID616
Feature activation+2.62e-01

Pos logit contributions
year+0.053
irk+0.052
and+0.049
old+0.047
ules+0.045
Neg logit contributions
-0.042
 top-0.032
 puzzles-0.029
bes-0.026
 ones-0.026
Token friends
Token ID2460
Feature activation-1.31e+00

Pos logit contributions
year+0.009
irk+0.009
and+0.009
old+0.009
ax+0.008
Neg logit contributions
-0.006
 top-0.005
 puzzles-0.004
bes-0.004
-0.004
Token have
Token ID423
Feature activation-1.60e+00

Pos logit contributions
+0.038
 top+0.029
 puzzles+0.026
bes+0.025
+0.024
Neg logit contributions
year-0.042
irk-0.040
and-0.038
old-0.037
ax-0.034
Token sweets
Token ID42402
Feature activation-2.38e+00

Pos logit contributions
+0.046
 top+0.034
bes+0.030
 puzzles+0.029
+0.029
Neg logit contributions
year-0.051
irk-0.049
and-0.048
old-0.047
ax-0.044
Token."
Token ID526
Feature activation+4.22e-02

Pos logit contributions
+0.072
 top+0.057
bes+0.050
 puzzles+0.049
 rh+0.047
Neg logit contributions
year-0.070
irk-0.067
and-0.066
old-0.063
ax-0.060
Token
Token ID198
Feature activation+1.22e-01

Pos logit contributions
year+0.001
irk+0.001
and+0.001
old+0.001
ules+0.001
Neg logit contributions
-0.001
 top-0.001
 puzzles-0.001
bes-0.001
-0.001
Token
Token ID198
Feature activation-1.71e+00

Pos logit contributions
year+0.005
irk+0.005
and+0.004
old+0.004
auc+0.004
Neg logit contributions
-0.003
 top-0.002
-0.002
bes-0.001
 puzzles-0.001
Token tongue
Token ID11880
Feature activation-1.38e+00

Pos logit contributions
+0.014
 top+0.011
 puzzles+0.010
bes+0.009
+0.009
Neg logit contributions
year-0.017
irk-0.017
and-0.016
old-0.015
ax-0.014
Token.
Token ID13
Feature activation+4.92e-01

Pos logit contributions
+0.038
 top+0.031
 puzzles+0.026
bes+0.025
+0.024
Neg logit contributions
year-0.046
irk-0.045
old-0.042
and-0.042
ules-0.039
Token Sue
Token ID26089
Feature activation+5.49e-01

Pos logit contributions
year+0.016
irk+0.015
and+0.014
old+0.014
ax+0.013
Neg logit contributions
-0.014
 top-0.011
 puzzles-0.010
bes-0.010
-0.009
Token cried
Token ID16896
Feature activation+2.47e-01

Pos logit contributions
year+0.016
irk+0.016
old+0.015
and+0.015
ax+0.014
Neg logit contributions
-0.017
 top-0.013
 puzzles-0.012
bes-0.011
adoes-0.011
Token out
Token ID503
Feature activation+4.74e-02

Pos logit contributions
year+0.009
irk+0.009
old+0.009
and+0.009
ules+0.009
Neg logit contributions
-0.005
 top-0.004
 puzzles-0.003
-0.003
gy-0.003
Token in
Token ID287
Feature activation-2.28e+00

Pos logit contributions
year+0.001
irk+0.001
and+0.001
old+0.001
ules+0.001
Neg logit contributions
-0.001
 top-0.001
bes-0.001
 puzzles-0.001
-0.001
Token pain
Token ID2356
Feature activation-8.57e-01

Pos logit contributions
+0.077
 top+0.056
+0.054
bes+0.054
 puzzles+0.052
Neg logit contributions
irk-0.065
year-0.061
old-0.054
and-0.053
olt-0.049
Token and
Token ID290
Feature activation-8.31e-01

Pos logit contributions
+0.019
 top+0.014
 puzzles+0.011
bes+0.011
+0.011
Neg logit contributions
year-0.034
irk-0.032
and-0.030
old-0.030
izz-0.028
Token quickly
Token ID2952
Feature activation-6.43e-01

Pos logit contributions
+0.024
 top+0.020
 puzzles+0.017
bes+0.017
 none+0.016
Neg logit contributions
year-0.026
irk-0.025
and-0.024
old-0.023
ules-0.022
Token ran
Token ID4966
Feature activation-4.45e-02

Pos logit contributions
+0.018
 top+0.015
 puzzles+0.013
bes+0.013
 arrest+0.012
Neg logit contributions
year-0.021
irk-0.019
and-0.019
old-0.018
ules-0.018
Token to
Token ID284
Feature activation+8.37e-01

Pos logit contributions
+0.001
 top+0.001
 puzzles+0.001
bes+0.001
gy+0.001
Neg logit contributions
year-0.001
irk-0.001
and-0.001
ules-0.001
old-0.001
Token idea
Token ID2126
Feature activation+6.74e-01

Pos logit contributions
year+0.028
irk+0.027
and+0.025
old+0.025
ax+0.023
Neg logit contributions
-0.036
 top-0.029
 puzzles-0.026
bes-0.026
-0.025
Token.
Token ID13
Feature activation+7.07e-01

Pos logit contributions
year+0.023
irk+0.022
and+0.020
old+0.020
ax+0.019
Neg logit contributions
-0.018
 top-0.014
 puzzles-0.012
bes-0.012
-0.011
Token
Token ID198
Feature activation-8.67e-02

Pos logit contributions
year+0.026
irk+0.024
and+0.023
old+0.023
ax+0.022
Neg logit contributions
-0.018
 top-0.013
-0.011
 puzzles-0.011
bes-0.010
Token
Token ID198
Feature activation-1.71e+00

Pos logit contributions
+0.002
 top+0.001
+0.001
bes+0.001
 puzzles+0.001
Neg logit contributions
year-0.004
irk-0.003
and-0.003
old-0.003
ash-0.003
Token"
Token ID1
Feature activation-1.87e-01

Pos logit contributions
+0.037
 top+0.026
 puzzles+0.022
bes+0.021
 rh+0.019
Neg logit contributions
year-0.067
irk-0.065
and-0.063
old-0.062
ax-0.058
TokenBen
Token ID11696
Feature activation-3.28e+00

Pos logit contributions
+0.006
 top+0.004
 puzzles+0.004
bes+0.004
+0.004
Neg logit contributions
year-0.006
irk-0.006
and-0.006
old-0.005
ules-0.005
Token,
Token ID11
Feature activation+7.22e-02

Pos logit contributions
+0.086
 top+0.067
 puzzles+0.059
bes+0.058
adoes+0.053
Neg logit contributions
irk-0.107
year-0.105
old-0.101
and-0.097
ax-0.095
Token I
Token ID314
Feature activation-4.76e-01

Pos logit contributions
year+0.002
irk+0.002
and+0.002
old+0.002
ax+0.002
Neg logit contributions
-0.002
 top-0.002
 puzzles-0.001
bes-0.001
-0.001
Token have
Token ID423
Feature activation-1.97e+00

Pos logit contributions
+0.014
 top+0.011
 puzzles+0.009
+0.009
bes+0.009
Neg logit contributions
year-0.015
irk-0.014
and-0.014
old-0.013
ules-0.013
Token a
Token ID257
Feature activation-6.39e-01

Pos logit contributions
+0.056
 top+0.041
bes+0.038
adoes+0.036
gy+0.035
Neg logit contributions
year-0.062
irk-0.058
and-0.057
old-0.057
ax-0.054
Token plan
Token ID1410
Feature activation+2.11e-01

Pos logit contributions
+0.017
 top+0.013
 puzzles+0.012
bes+0.011
+0.011
Neg logit contributions
year-0.021
irk-0.021
and-0.020
old-0.019
ax-0.018
Token l
Token ID300
Feature activation+2.56e+00

Pos logit contributions
year+0.005
irk+0.005
and+0.005
old+0.005
ax+0.004
Neg logit contributions
-0.004
 top-0.003
 puzzles-0.003
bes-0.003
-0.003
Tokenilies
Token ID3922
Feature activation+4.78e-01

Pos logit contributions
year+0.058
irk+0.057
and+0.053
old+0.049
leaf+0.046
Neg logit contributions
-0.082
 top-0.074
bes-0.067
 puzzles-0.062
 arrest-0.062
Token.
Token ID13
Feature activation-5.21e-02

Pos logit contributions
year+0.017
irk+0.016
and+0.015
old+0.015
izz+0.014
Neg logit contributions
-0.012
 top-0.010
 puzzles-0.008
bes-0.008
-0.007
Token
Token ID220
Feature activation+1.51e+00

Pos logit contributions
+0.002
 top+0.001
 puzzles+0.001
bes+0.001
 rh+0.001
Neg logit contributions
year-0.002
irk-0.002
and-0.002
old-0.001
ax-0.001
Token
Token ID198
Feature activation-2.33e-01

Pos logit contributions
year+0.058
irk+0.056
old+0.051
and+0.050
leaf+0.049
Neg logit contributions
-0.034
 top-0.025
bes-0.020
 puzzles-0.019
-0.019
Token
Token ID198
Feature activation-2.10e+00

Pos logit contributions
+0.005
 top+0.004
+0.003
bes+0.003
 puzzles+0.003
Neg logit contributions
year-0.009
irk-0.009
old-0.008
auc-0.008
and-0.008
TokenCl
Token ID2601
Feature activation-6.97e-01

Pos logit contributions
+0.071
 top+0.058
 puzzles+0.054
bes+0.052
adoes+0.051
Neg logit contributions
year-0.056
irk-0.053
and-0.051
old-0.049
ax-0.045
Tokenara
Token ID3301
Feature activation+5.61e-01

Pos logit contributions
+0.021
 top+0.017
 puzzles+0.015
bes+0.015
 rh+0.014
Neg logit contributions
year-0.021
irk-0.020
and-0.019
old-0.019
ax-0.017
Token leapt
Token ID43713
Feature activation+1.13e-02

Pos logit contributions
year+0.017
irk+0.017
and+0.016
old+0.015
ules+0.014
Neg logit contributions
-0.017
 top-0.013
 puzzles-0.012
bes-0.012
-0.011
Token onto
Token ID4291
Feature activation-8.03e-01

Pos logit contributions
year+0.000
irk+0.000
and+0.000
old+0.000
ax+0.000
Neg logit contributions
-0.000
 top-0.000
 puzzles-0.000
bes-0.000
 rh-0.000
Token the
Token ID262
Feature activation+1.49e+00

Pos logit contributions
+0.025
+0.017
bes+0.017
adoes+0.016
 puzzles+0.016
Neg logit contributions
year-0.023
irk-0.021
and-0.020
old-0.020
izz-0.019
Token started
Token ID2067
Feature activation-4.27e-01

Pos logit contributions
year+0.002
irk+0.002
and+0.002
old+0.002
ules+0.002
Neg logit contributions
-0.002
 top-0.001
 puzzles-0.001
-0.001
bes-0.001
Token to
Token ID284
Feature activation+1.68e-01

Pos logit contributions
+0.014
 top+0.012
 puzzles+0.011
bes+0.010
+0.010
Neg logit contributions
year-0.012
irk-0.011
and-0.010
old-0.010
ax-0.009
Token run
Token ID1057
Feature activation-2.36e-01

Pos logit contributions
year+0.005
irk+0.005
and+0.005
old+0.005
enny+0.004
Neg logit contributions
-0.005
 top-0.004
 puzzles-0.003
 rh-0.003
 arrest-0.003
Token.
Token ID13
Feature activation-4.66e-01

Pos logit contributions
+0.008
 top+0.006
 puzzles+0.006
bes+0.006
+0.006
Neg logit contributions
year-0.006
irk-0.006
and-0.006
old-0.005
ax-0.005
Token But
Token ID887
Feature activation-4.49e-01

Pos logit contributions
+0.014
 top+0.011
 puzzles+0.009
bes+0.009
+0.009
Neg logit contributions
year-0.015
irk-0.014
and-0.014
old-0.013
ax-0.012
Token then
Token ID788
Feature activation-2.27e+00

Pos logit contributions
+0.012
bes+0.009
 top+0.009
 puzzles+0.009
 rh+0.008
Neg logit contributions
year-0.014
old-0.013
irk-0.012
and-0.012
ax-0.011
Token,
Token ID11
Feature activation-2.98e-01

Pos logit contributions
+0.069
bes+0.056
 top+0.052
 puzzles+0.050
+0.049
Neg logit contributions
year-0.061
old-0.058
irk-0.058
ax-0.052
 Jill-0.051
Token she
Token ID673
Feature activation-6.46e-01

Pos logit contributions
+0.010
bes+0.008
 necks+0.008
 top+0.008
 puzzles+0.007
Neg logit contributions
year-0.007
old-0.006
irk-0.006
and-0.006
 Jill-0.006
Token tri
Token ID1333
Feature activation+1.10e+00

Pos logit contributions
+0.020
 top+0.016
 puzzles+0.014
bes+0.014
+0.013
Neg logit contributions
year-0.020
irk-0.019
and-0.018
old-0.017
ules-0.016
Tokenpped
Token ID1496
Feature activation-1.31e+00

Pos logit contributions
year+0.036
irk+0.035
and+0.033
old+0.033
ax+0.030
Neg logit contributions
-0.031
 top-0.024
 puzzles-0.021
bes-0.021
-0.020
Token and
Token ID290
Feature activation-1.90e+00

Pos logit contributions
+0.041
 top+0.030
 puzzles+0.029
bes+0.028
+0.027
Neg logit contributions
year-0.039
irk-0.037
and-0.036
old-0.034
ax-0.032
Token cheerful
Token ID37999
Feature activation+3.67e-01

Pos logit contributions
year+0.030
irk+0.028
and+0.027
old+0.026
ax+0.026
Neg logit contributions
-0.024
 top-0.020
 puzzles-0.017
-0.016
 rh-0.015
Token voice
Token ID3809
Feature activation+2.61e+00

Pos logit contributions
year+0.014
irk+0.013
and+0.012
old+0.012
izz+0.012
Neg logit contributions
-0.009
 top-0.007
 puzzles-0.006
 ones-0.005
bes-0.005
Token,
Token ID11
Feature activation-1.01e-01

Pos logit contributions
year+0.101
irk+0.100
and+0.096
old+0.090
enny+0.088
Neg logit contributions
-0.054
 top-0.039
-0.031
bes-0.028
 ones-0.028
Token
Token ID6184
Feature activation-4.71e-01

Pos logit contributions
+0.002
 top+0.002
+0.001
 puzzles+0.001
 rh+0.001
Neg logit contributions
year-0.004
irk-0.004
and-0.004
old-0.004
ingle-0.004
Token
Token ID95
Feature activation-2.27e-01

Pos logit contributions
+0.014
 top+0.011
 puzzles+0.010
bes+0.010
 rh+0.009
Neg logit contributions
year-0.014
irk-0.014
and-0.013
old-0.013
ax-0.012
Token
Token ID26391
Feature activation-2.13e+00

Pos logit contributions
+0.007
 top+0.005
bes+0.005
 puzzles+0.004
+0.004
Neg logit contributions
year-0.007
irk-0.007
and-0.007
old-0.006
ax-0.006
Token
Token ID129
Feature activation-1.38e+00

Pos logit contributions
+0.057
 top+0.044
 puzzles+0.040
bes+0.037
 rh+0.036
Neg logit contributions
year-0.070
irk-0.068
and-0.065
old-0.064
ules-0.061
Token
Token ID241
Feature activation-2.56e-01

Pos logit contributions
+0.038
 top+0.029
 puzzles+0.026
bes+0.025
+0.024
Neg logit contributions
year-0.046
irk-0.044
and-0.042
old-0.041
ax-0.039
TokenLook
Token ID8567
Feature activation-2.15e-01

Pos logit contributions
+0.006
 top+0.005
+0.004
bes+0.003
 ones+0.003
Neg logit contributions
year-0.009
irk-0.009
old-0.009
and-0.008
auc-0.008
Token at
Token ID379
Feature activation-1.06e+00

Pos logit contributions
+0.006
 top+0.004
 puzzles+0.004
bes+0.004
+0.004
Neg logit contributions
year-0.007
irk-0.007
and-0.007
old-0.007
ules-0.006
Token that
Token ID326
Feature activation-3.40e-01

Pos logit contributions
+0.022
 top+0.017
 puzzles+0.014
bes+0.012
 rh+0.012
Neg logit contributions
year-0.042
irk-0.041
and-0.039
old-0.038
ules-0.035