TOP ACTIVATIONS
MAX = 9.970

back the village to its former state . With the help
the village back to its former glory !"
to be restored to its former glory ."
, leaving it to its former state .
the tree to its once former beauty . He was happy
Cr ust y . Cr ust y lived in a big
suggested putting it in his bur row , but Nut ty
that day on , Cr ust y tried his best to
. One day , Cr ust y found a shiny shell
but couldn 't . Fr ust rated , he started to
peek ed out of his bur row . He
ge . Fr ust rated , the little ant
stay with them in their bur row . The elderly rabbit
proudly put it in his bur row . He thought it
and put them in his bur row . One
he said no . Fr ust rated , Bob decided to
or too hard . Fr ust rated , she kept searching
a little crab named Cr ust y . Cr ust y
print so badly . Fr ust rated tears were well ing
back . Fr ust rated , Jane went back

MIN ACTIVATIONS
MIN = -6.998

be cow boys and cow girls . They put on hats
. " These muff ins don 't taste good ." Her
't hurt her , please don 't hurt me !"
sk ates and helmets and pads . They use a puck
their gear : helmets , pads and gloves . They knew
pictures but the park keeper didn 't believe him .
told him , " we don 't buy things from the
They learned that snow men don 't last forever , but
. They wear helmets and pads to be safe .
mad at us . Please don 't hurt us ."
, lion . But please don 't eat our pets .
. " Fl owers don 't have a taste ,"
. But be careful , don 't hit your head on
No , Anna , please don 't tell mom . She
" Tom my , please don 't be noisy and stubborn
they said . " But don 't take our rope again
" Se as hell s don 't taste good , Lily
Please help yourself but just don 't eat too many ".
to go home ! Please don 't make me !" His
, " No , please don 't shoot me ! I

INTERVAL 5.728 - 9.970
CONTAINS 0.003%

a secret place in the hollow tree and told her to
, and the egg looked back at them . The animals
the small writing on the back of the picture .
Suddenly , a pant her appeared . He was
the small writing on the back of the picture said ,

INTERVAL 1.486 - 5.728
CONTAINS 13.831%

named Sarah was playing in her backyard . Suddenly , she
loved spending time with her mom my . She even taught
me ?" A unt Lisa smiled . She said
on , Tim was careful with the unusual sauce . He
So , D ippy had to give up all

INTERVAL -2.756 - 1.486
CONTAINS 83.252%

same . They blew their horns and squeezed their balloons .
wanted . The raven was so grateful and
and the truck driver joined in . They sang a happy
her head and hugged her doll . She wished Ben would
so exciting ! They saw lots of different fish , birds

INTERVAL -6.998 - -2.756
CONTAINS 2.913%

re so smart . ” May hugged
They were very happy and played with it every day .
brothers . They like to play outside in the yard .
says . " You have played enough . You need to
were twins who liked to play together . But sometimes they

Token back
Token ID736
Feature activation+4.53e+00

Pos logit contributions
arter+0.019
 policy+0.017
 polar+0.017
run+0.017
asured+0.017
Neg logit contributions
ics-0.010
he-0.009
 top-0.009
ars-0.009
case-0.008
Token the
Token ID262
Feature activation-1.61e+00

Pos logit contributions
arter+0.161
 policy+0.150
 polar+0.145
run+0.142
asured+0.139
Neg logit contributions
ics-0.068
 top-0.059
he-0.053
ars-0.053
hed-0.043
Token village
Token ID7404
Feature activation+1.23e+00

Pos logit contributions
ics+0.041
he+0.037
ars+0.035
case+0.034
 top+0.033
Neg logit contributions
 policy-0.038
arter-0.037
 polar-0.036
run-0.034
asured-0.033
Token to
Token ID284
Feature activation+2.52e-01

Pos logit contributions
arter+0.034
 policy+0.033
run+0.032
 polar+0.032
asured+0.031
Neg logit contributions
ics-0.024
he-0.022
ars-0.022
 top-0.021
case-0.019
Token its
Token ID663
Feature activation+2.24e+00

Pos logit contributions
arter+0.008
run+0.008
 policy+0.008
 polar+0.008
asured+0.008
Neg logit contributions
ics-0.004
 top-0.003
he-0.003
ars-0.003
'-0.002
Token former
Token ID1966
Feature activation+9.97e+00

Pos logit contributions
arter+0.056
run+0.054
 polar+0.051
 policy+0.050
asured+0.046
Neg logit contributions
 top-0.055
ics-0.052
case-0.046
ars-0.046
he-0.045
Token state
Token ID1181
Feature activation+2.62e+00

Pos logit contributions
arter+0.235
 polar+0.222
run+0.213
 policy+0.204
 Bobby+0.191
Neg logit contributions
 top-0.239
ics-0.231
ars-0.222
case-0.203
g-0.197
Token.
Token ID13
Feature activation-2.40e-01

Pos logit contributions
arter+0.080
 policy+0.076
 polar+0.075
run+0.073
asured+0.072
Neg logit contributions
ics-0.047
he-0.042
ars-0.041
 top-0.040
case-0.037
Token With
Token ID2080
Feature activation-6.88e-01

Pos logit contributions
ics+0.003
he+0.003
ars+0.003
 top+0.003
case+0.003
Neg logit contributions
arter-0.008
 policy-0.008
 polar-0.008
run-0.008
asured-0.008
Token the
Token ID262
Feature activation-1.03e+00

Pos logit contributions
ics+0.010
 top+0.009
he+0.009
ars+0.009
case+0.007
Neg logit contributions
arter-0.023
 policy-0.022
 polar-0.021
run-0.021
asured-0.021
Token help
Token ID1037
Feature activation+1.83e+00

Pos logit contributions
ics+0.025
he+0.022
case+0.021
ars+0.020
hed+0.019
Neg logit contributions
arter-0.026
 policy-0.025
 polar-0.025
run-0.024
 rightful-0.023
Token the
Token ID262
Feature activation-2.33e+00

Pos logit contributions
arter+0.049
 polar+0.046
 policy+0.046
run+0.046
asured+0.045
Neg logit contributions
ics-0.024
 top-0.022
he-0.021
ars-0.020
'-0.017
Token village
Token ID7404
Feature activation+1.18e+00

Pos logit contributions
ics+0.052
he+0.047
ars+0.047
 top+0.046
case+0.044
Neg logit contributions
arter-0.061
 policy-0.057
 polar-0.056
run-0.056
asured-0.054
Token back
Token ID736
Feature activation+5.02e+00

Pos logit contributions
arter+0.034
 policy+0.032
 polar+0.032
run+0.031
asured+0.031
Neg logit contributions
ics-0.023
he-0.020
ars-0.020
 top-0.020
case-0.018
Token to
Token ID284
Feature activation+1.71e-01

Pos logit contributions
arter+0.150
 polar+0.140
 policy+0.140
run+0.134
asured+0.131
Neg logit contributions
ics-0.099
 top-0.086
he-0.086
ars-0.082
case-0.078
Token its
Token ID663
Feature activation+1.82e+00

Pos logit contributions
run+0.005
arter+0.005
 policy+0.005
 polar+0.005
asured+0.005
Neg logit contributions
 top-0.003
ics-0.003
he-0.003
ars-0.002
't-0.002
Token former
Token ID1966
Feature activation+9.75e+00

Pos logit contributions
arter+0.044
run+0.042
 polar+0.041
 policy+0.040
asured+0.038
Neg logit contributions
 top-0.044
ics-0.044
case-0.039
ars-0.038
he-0.038
Token glory
Token ID13476
Feature activation+9.48e-01

Pos logit contributions
arter+0.224
 polar+0.216
run+0.204
 policy+0.196
 Bobby+0.184
Neg logit contributions
 top-0.239
ics-0.235
ars-0.221
case-0.205
ers-0.197
Token!"
Token ID2474
Feature activation-2.04e+00

Pos logit contributions
arter+0.027
 policy+0.025
 polar+0.025
run+0.025
asured+0.024
Neg logit contributions
ics-0.019
he-0.017
ars-0.017
 top-0.016
case-0.015
Token
Token ID220
Feature activation-7.09e-01

Pos logit contributions
ics+0.041
ars+0.033
he+0.032
case+0.032
 top+0.032
Neg logit contributions
run-0.056
arter-0.054
 polar-0.054
 Bobby-0.053
 Joey-0.053
Token
Token ID198
Feature activation-4.37e-01

Pos logit contributions
ics+0.014
ars+0.012
he+0.012
 top+0.012
vation+0.011
Neg logit contributions
run-0.019
arter-0.019
 policy-0.018
 polar-0.018
 Joey-0.017
Token
Token ID198
Feature activation+3.09e-01

Pos logit contributions
ics+0.010
 top+0.008
ars+0.008
case+0.008
he+0.008
Neg logit contributions
run-0.011
 polar-0.011
 Bobby-0.010
 policy-0.010
arter-0.010
Token to
Token ID284
Feature activation+3.81e-01

Pos logit contributions
arter+0.017
 policy+0.017
 polar+0.016
run+0.016
asured+0.016
Neg logit contributions
ics-0.011
he-0.010
ars-0.010
 top-0.010
case-0.009
Token be
Token ID307
Feature activation+1.65e+00

Pos logit contributions
arter+0.012
 policy+0.011
 polar+0.011
run+0.011
asured+0.011
Neg logit contributions
ics-0.007
he-0.006
 top-0.006
ars-0.006
case-0.005
Token restored
Token ID15032
Feature activation+3.36e-01

Pos logit contributions
arter+0.051
 policy+0.046
 polar+0.045
run+0.043
asured+0.042
Neg logit contributions
ics-0.032
ars-0.029
 top-0.027
he-0.026
case-0.025
Token to
Token ID284
Feature activation+6.18e-01

Pos logit contributions
arter+0.011
 polar+0.010
 policy+0.010
run+0.010
asured+0.009
Neg logit contributions
ics-0.006
he-0.005
 top-0.005
ars-0.005
case-0.005
Token its
Token ID663
Feature activation+2.01e+00

Pos logit contributions
arter+0.021
 polar+0.020
run+0.019
 policy+0.019
omo+0.019
Neg logit contributions
ics-0.010
 top-0.009
he-0.008
ars-0.007
case-0.007
Token former
Token ID1966
Feature activation+9.68e+00

Pos logit contributions
arter+0.049
run+0.048
 polar+0.045
 policy+0.041
ocol+0.041
Neg logit contributions
 top-0.052
ics-0.048
case-0.044
ars-0.041
 swing-0.041
Token glory
Token ID13476
Feature activation+8.00e-01

Pos logit contributions
arter+0.227
 polar+0.226
run+0.211
 policy+0.198
 Bobby+0.194
Neg logit contributions
 top-0.229
ics-0.224
ars-0.216
case-0.201
ers-0.185
Token."
Token ID526
Feature activation-5.34e-01

Pos logit contributions
arter+0.024
 policy+0.022
 polar+0.022
run+0.022
asured+0.021
Neg logit contributions
ics-0.015
he-0.014
ars-0.013
 top-0.013
case-0.012
Token
Token ID220
Feature activation+1.52e-01

Pos logit contributions
ics+0.006
he+0.005
 top+0.005
ars+0.005
case+0.004
Neg logit contributions
arter-0.020
 policy-0.020
 polar-0.019
asured-0.019
run-0.018
Token
Token ID198
Feature activation+3.19e-01

Pos logit contributions
arter+0.004
run+0.004
 policy+0.004
 polar+0.004
asured+0.004
Neg logit contributions
ics-0.003
ars-0.002
he-0.002
 top-0.002
case-0.002
Token
Token ID198
Feature activation+1.09e+00

Pos logit contributions
arter+0.010
 polar+0.009
 policy+0.009
run+0.009
asured+0.009
Neg logit contributions
ics-0.006
ars-0.005
he-0.005
 top-0.005
case-0.004
Token,
Token ID11
Feature activation-1.36e+00

Pos logit contributions
ics+0.008
he+0.007
ars+0.007
 top+0.007
case+0.007
Neg logit contributions
arter-0.013
 policy-0.012
 polar-0.012
run-0.012
asured-0.012
Token leaving
Token ID4305
Feature activation+1.35e+00

Pos logit contributions
ics+0.025
he+0.022
ars+0.022
 top+0.021
case+0.020
Neg logit contributions
arter-0.041
 policy-0.039
 polar-0.038
run-0.038
asured-0.037
Token it
Token ID340
Feature activation+5.27e-01

Pos logit contributions
arter+0.044
 policy+0.042
 polar+0.041
run+0.041
asured+0.041
Neg logit contributions
ics-0.021
 top-0.019
he-0.019
ars-0.018
case-0.016
Token to
Token ID284
Feature activation+2.79e-02

Pos logit contributions
arter+0.015
 policy+0.014
 polar+0.014
run+0.014
asured+0.013
Neg logit contributions
ics-0.011
he-0.010
ars-0.009
 top-0.009
case-0.009
Token its
Token ID663
Feature activation+1.90e+00

Pos logit contributions
arter+0.001
 policy+0.001
run+0.001
asured+0.001
 importantly+0.001
Neg logit contributions
ics-0.000
 top-0.000
he-0.000
ars-0.000
 swing-0.000
Token former
Token ID1966
Feature activation+9.54e+00

Pos logit contributions
arter+0.050
run+0.050
 policy+0.046
ocol+0.045
 importantly+0.043
Neg logit contributions
 top-0.045
ics-0.038
ars-0.034
 edge-0.034
 higher-0.033
Token state
Token ID1181
Feature activation+2.66e+00

Pos logit contributions
arter+0.236
run+0.221
 polar+0.219
 policy+0.212
 Bobby+0.200
Neg logit contributions
 top-0.212
ics-0.196
ars-0.193
ers-0.171
 beams-0.170
Token.
Token ID13
Feature activation-1.58e-01

Pos logit contributions
arter+0.083
 policy+0.078
 polar+0.078
run+0.076
asured+0.075
Neg logit contributions
ics-0.047
he-0.042
 top-0.040
ars-0.040
case-0.036
Token
Token ID220
Feature activation-2.92e-01

Pos logit contributions
ics+0.002
he+0.002
ars+0.001
 top+0.001
case+0.001
Neg logit contributions
arter-0.006
 policy-0.006
 polar-0.006
asured-0.005
run-0.005
Token
Token ID198
Feature activation-1.49e-01

Pos logit contributions
ics+0.006
he+0.005
ars+0.005
 top+0.005
case+0.004
Neg logit contributions
run-0.008
arter-0.008
 policy-0.008
 polar-0.008
asured-0.007
Token
Token ID198
Feature activation+6.38e-01

Pos logit contributions
ics+0.003
 top+0.003
ars+0.003
he+0.003
case+0.003
Neg logit contributions
run-0.004
 polar-0.004
arter-0.004
 policy-0.003
 Joey-0.003
Token the
Token ID262
Feature activation-1.33e+00

Pos logit contributions
arter+0.057
 policy+0.053
 polar+0.050
run+0.050
asured+0.050
Neg logit contributions
ics-0.025
 top-0.024
he-0.022
ars-0.020
 higher-0.018
Token tree
Token ID5509
Feature activation+7.43e-01

Pos logit contributions
ics+0.022
he+0.019
ars+0.019
 top+0.018
case+0.017
Neg logit contributions
arter-0.042
 policy-0.041
 polar-0.040
run-0.039
asured-0.038
Token to
Token ID284
Feature activation+3.49e-01

Pos logit contributions
arter+0.022
 policy+0.021
 polar+0.021
run+0.021
asured+0.020
Neg logit contributions
ics-0.014
he-0.012
ars-0.012
 top-0.011
case-0.011
Token its
Token ID663
Feature activation+2.28e+00

Pos logit contributions
arter+0.011
 policy+0.011
run+0.010
asured+0.010
 polar+0.010
Neg logit contributions
ics-0.005
 top-0.005
he-0.005
ars-0.004
't-0.004
Token once
Token ID1752
Feature activation+2.93e+00

Pos logit contributions
arter+0.059
run+0.055
 policy+0.054
 polar+0.053
asured+0.050
Neg logit contributions
ics-0.052
 top-0.052
case-0.046
ars-0.045
he-0.045
Token former
Token ID1966
Feature activation+9.21e+00

Pos logit contributions
arter+0.087
 policy+0.082
 polar+0.081
run+0.079
asured+0.077
Neg logit contributions
ics-0.056
he-0.052
ars-0.049
 top-0.049
case-0.045
Token beauty
Token ID8737
Feature activation+1.04e+00

Pos logit contributions
arter+0.207
 polar+0.198
run+0.185
 policy+0.184
 Bobby+0.164
Neg logit contributions
ics-0.237
 top-0.225
ars-0.220
case-0.204
he-0.196
Token.
Token ID13
Feature activation+2.88e-01

Pos logit contributions
arter+0.032
 policy+0.030
 polar+0.030
run+0.029
asured+0.029
Neg logit contributions
ics-0.019
he-0.017
ars-0.016
 top-0.016
case-0.015
Token He
Token ID679
Feature activation+1.15e+00

Pos logit contributions
arter+0.009
 polar+0.008
 policy+0.008
asured+0.007
ocol+0.007
Neg logit contributions
 top-0.005
he-0.005
ics-0.005
case-0.005
 Higher-0.005
Token was
Token ID373
Feature activation-1.09e+00

Pos logit contributions
arter+0.032
 policy+0.030
 polar+0.029
run+0.029
asured+0.028
Neg logit contributions
ics-0.024
he-0.022
ars-0.022
 top-0.021
case-0.020
Token happy
Token ID3772
Feature activation-1.48e+00

Pos logit contributions
ics+0.020
he+0.018
ars+0.018
 top+0.017
case+0.016
Neg logit contributions
arter-0.032
 policy-0.031
 polar-0.030
run-0.030
asured-0.029
Token Cr
Token ID3864
Feature activation+2.64e+00

Pos logit contributions
arter+0.026
 policy+0.025
run+0.023
asured+0.022
 polar+0.022
Neg logit contributions
ics-0.064
 top-0.060
he-0.059
ars-0.058
case-0.056
Tokenust
Token ID436
Feature activation+7.73e+00

Pos logit contributions
arter+0.039
asured+0.034
 policy+0.033
run+0.033
 polar+0.031
Neg logit contributions
ics-0.087
 top-0.083
ars-0.080
he-0.078
 Higher-0.078
Tokeny
Token ID88
Feature activation+1.22e+00

Pos logit contributions
arter+0.182
 policy+0.172
run+0.170
 importantly+0.162
 polar+0.162
Neg logit contributions
ics-0.192
 top-0.180
he-0.178
ars-0.173
case-0.156
Token.
Token ID13
Feature activation+1.64e+00

Pos logit contributions
arter+0.037
 policy+0.036
run+0.035
 polar+0.035
asured+0.034
Neg logit contributions
ics-0.021
he-0.019
 top-0.018
ars-0.018
case-0.017
Token Cr
Token ID3864
Feature activation+2.78e+00

Pos logit contributions
arter+0.023
 policy+0.021
 polar+0.020
run+0.019
asured+0.018
Neg logit contributions
ics-0.057
he-0.053
ars-0.053
 top-0.052
case-0.050
Tokenust
Token ID436
Feature activation+8.15e+00

Pos logit contributions
arter+0.032
asured+0.027
 policy+0.027
run+0.026
 polar+0.023
Neg logit contributions
ics-0.101
 top-0.097
ars-0.095
he-0.093
 Higher-0.090
Tokeny
Token ID88
Feature activation+1.57e+00

Pos logit contributions
arter+0.206
 policy+0.189
run+0.188
 polar+0.186
 importantly+0.182
Neg logit contributions
ics-0.190
he-0.178
 top-0.172
ars-0.169
case-0.160
Token lived
Token ID5615
Feature activation+1.91e-02

Pos logit contributions
arter+0.034
 polar+0.033
 policy+0.032
run+0.031
asured+0.030
Neg logit contributions
ics-0.041
he-0.037
ars-0.037
 top-0.036
case-0.034
Token in
Token ID287
Feature activation+1.37e+00

Pos logit contributions
arter+0.001
 polar+0.000
asured+0.000
 policy+0.000
run+0.000
Neg logit contributions
ics-0.000
ars-0.000
he-0.000
case-0.000
vation-0.000
Token a
Token ID257
Feature activation+6.05e-01

Pos logit contributions
 polar+0.042
arter+0.041
 policy+0.037
asured+0.037
 Joey+0.036
Neg logit contributions
ics-0.024
ars-0.022
he-0.020
case-0.020
vation-0.019
Token big
Token ID1263
Feature activation+8.55e-01

Pos logit contributions
arter+0.018
 polar+0.017
 policy+0.017
run+0.016
asured+0.016
Neg logit contributions
ics-0.011
he-0.010
ars-0.010
 top-0.009
case-0.009
Token suggested
Token ID5220
Feature activation-2.34e+00

Pos logit contributions
ics+0.006
he+0.006
ars+0.005
 top+0.005
case+0.005
Neg logit contributions
 polar-0.005
run-0.005
asured-0.005
arter-0.005
 Bobby-0.005
Token putting
Token ID5137
Feature activation-4.86e-01

Pos logit contributions
ics+0.051
he+0.047
ars+0.047
 top+0.043
case+0.043
Neg logit contributions
arter-0.061
 policy-0.058
run-0.058
 polar-0.057
asured-0.054
Token it
Token ID340
Feature activation+1.86e-01

Pos logit contributions
ics+0.007
 top+0.007
he+0.007
ars+0.006
hed+0.005
Neg logit contributions
arter-0.017
 policy-0.016
run-0.015
 polar-0.015
asured-0.015
Token in
Token ID287
Feature activation+8.96e-01

Pos logit contributions
arter+0.006
 policy+0.005
 polar+0.005
run+0.005
asured+0.005
Neg logit contributions
ics-0.003
he-0.003
ars-0.003
 top-0.003
case-0.003
Token his
Token ID465
Feature activation+2.75e+00

Pos logit contributions
arter+0.033
 policy+0.030
run+0.030
asured+0.029
 importantly+0.029
Neg logit contributions
 top-0.012
ics-0.010
he-0.010
g-0.008
 beam-0.007
Token bur
Token ID4356
Feature activation+8.11e+00

Pos logit contributions
arter+0.086
 policy+0.082
asured+0.079
 importantly+0.078
run+0.077
Neg logit contributions
 top-0.051
ars-0.042
case-0.041
ics-0.040
 swing-0.037
Tokenrow
Token ID808
Feature activation+2.31e+00

Pos logit contributions
arter+0.200
 policy+0.186
 polar+0.176
 importantly+0.169
asured+0.162
Neg logit contributions
ics-0.190
ars-0.187
 top-0.172
he-0.171
ers-0.158
Token,
Token ID11
Feature activation-1.12e+00

Pos logit contributions
arter+0.071
 policy+0.068
 polar+0.068
run+0.068
asured+0.065
Neg logit contributions
ics-0.040
ars-0.034
he-0.034
 top-0.033
case-0.030
Token but
Token ID475
Feature activation+1.17e+00

Pos logit contributions
ics+0.024
he+0.021
ars+0.021
 top+0.020
case+0.019
Neg logit contributions
arter-0.030
 policy-0.029
 polar-0.029
run-0.028
asured-0.027
Token Nut
Token ID11959
Feature activation+1.43e+00

Pos logit contributions
arter+0.034
 policy+0.030
asured+0.029
run+0.028
ocol+0.028
Neg logit contributions
 top-0.026
ics-0.023
't-0.022
 higher-0.021
The-0.020
Tokenty
Token ID774
Feature activation+8.23e-01

Pos logit contributions
arter+0.038
 polar+0.037
run+0.036
 policy+0.036
asured+0.035
Neg logit contributions
ics-0.031
he-0.028
ars-0.026
 top-0.026
case-0.025
Token that
Token ID326
Feature activation-1.60e-01

Pos logit contributions
arter+0.014
 policy+0.013
run+0.013
 polar+0.013
asured+0.012
Neg logit contributions
ics-0.006
he-0.005
 top-0.005
ars-0.005
case-0.005
Token day
Token ID1110
Feature activation-2.98e+00

Pos logit contributions
ics+0.004
ars+0.004
he+0.004
 top+0.003
case+0.003
Neg logit contributions
arter-0.004
 policy-0.004
 polar-0.004
run-0.003
asured-0.003
Token on
Token ID319
Feature activation-8.81e-01

Pos logit contributions
ics+0.051
he+0.042
ars+0.040
 top+0.039
case+0.039
Neg logit contributions
arter-0.093
 policy-0.092
run-0.089
 polar-0.088
 importantly-0.083
Token,
Token ID11
Feature activation-6.42e-01

Pos logit contributions
ics+0.009
 top+0.008
he+0.007
't+0.006
ars+0.006
Neg logit contributions
arter-0.034
asured-0.033
 polar-0.032
 former-0.032
run-0.032
Token Cr
Token ID3864
Feature activation+3.21e+00

Pos logit contributions
ics+0.018
 top+0.017
he+0.016
ars+0.016
't+0.015
Neg logit contributions
arter-0.014
 policy-0.012
run-0.012
 polar-0.012
asured-0.012
Tokenust
Token ID436
Feature activation+8.09e+00

Pos logit contributions
arter+0.047
 polar+0.043
 policy+0.043
run+0.039
asured+0.036
Neg logit contributions
ics-0.108
he-0.102
ars-0.101
 top-0.097
case-0.095
Tokeny
Token ID88
Feature activation+5.04e-01

Pos logit contributions
arter+0.215
 polar+0.207
ocol+0.198
 importantly+0.189
 policy+0.189
Neg logit contributions
ics-0.171
ars-0.155
he-0.149
 top-0.144
case-0.131
Token tried
Token ID3088
Feature activation+9.11e-03

Pos logit contributions
arter+0.013
 policy+0.013
 polar+0.012
run+0.012
asured+0.012
Neg logit contributions
ics-0.011
he-0.010
ars-0.010
 top-0.010
case-0.009
Token his
Token ID465
Feature activation+1.63e+00

Pos logit contributions
arter+0.000
 polar+0.000
 policy+0.000
ocol+0.000
run+0.000
Neg logit contributions
ics-0.000
 top-0.000
 higher-0.000
ars-0.000
he-0.000
Token best
Token ID1266
Feature activation+1.54e+00

Pos logit contributions
arter+0.050
 policy+0.047
 polar+0.047
run+0.047
asured+0.045
Neg logit contributions
ics-0.029
 top-0.025
ars-0.025
he-0.025
case-0.023
Token to
Token ID284
Feature activation+1.42e-01

Pos logit contributions
arter+0.048
 polar+0.045
ocol+0.043
 policy+0.041
 former+0.041
Neg logit contributions
ics-0.026
 top-0.025
ars-0.023
he-0.022
'-0.021
Token.
Token ID13
Feature activation+1.09e+00

Pos logit contributions
ics+0.008
he+0.007
ars+0.007
 top+0.006
case+0.006
Neg logit contributions
arter-0.013
 policy-0.012
 polar-0.012
run-0.012
asured-0.011
Token One
Token ID1881
Feature activation+2.70e+00

Pos logit contributions
arter+0.028
 policy+0.027
 polar+0.026
run+0.026
asured+0.025
Neg logit contributions
ics-0.025
he-0.023
ars-0.022
 top-0.022
case-0.021
Token day
Token ID1110
Feature activation-8.14e-01

Pos logit contributions
run+0.085
arter+0.085
 policy+0.083
 former+0.080
ocol+0.079
Neg logit contributions
ics-0.042
he-0.040
 top-0.040
ars-0.038
'-0.032
Token,
Token ID11
Feature activation+1.14e+00

Pos logit contributions
ics+0.013
he+0.012
ars+0.011
 top+0.011
case+0.010
Neg logit contributions
arter-0.026
 policy-0.025
 polar-0.025
run-0.024
asured-0.024
Token Cr
Token ID3864
Feature activation+3.40e+00

Pos logit contributions
arter+0.024
 policy+0.022
 polar+0.022
run+0.021
asured+0.020
Neg logit contributions
ics-0.031
he-0.029
ars-0.028
case-0.027
 top-0.027
Tokenust
Token ID436
Feature activation+8.08e+00

Pos logit contributions
arter+0.040
 policy+0.034
asured+0.033
run+0.032
 polar+0.030
Neg logit contributions
ics-0.123
 top-0.117
ars-0.115
he-0.115
 Higher-0.109
Tokeny
Token ID88
Feature activation+1.17e+00

Pos logit contributions
arter+0.197
run+0.188
 policy+0.186
 polar+0.180
 importantly+0.178
Neg logit contributions
ics-0.197
he-0.175
 top-0.174
ars-0.170
case-0.155
Token found
Token ID1043
Feature activation+2.72e+00

Pos logit contributions
arter+0.030
 polar+0.028
 policy+0.028
run+0.027
asured+0.027
Neg logit contributions
ics-0.026
he-0.024
ars-0.023
 top-0.023
case-0.022
Token a
Token ID257
Feature activation+7.31e-01

Pos logit contributions
arter+0.105
 policy+0.103
run+0.100
 polar+0.098
asured+0.097
Neg logit contributions
ics-0.027
 top-0.022
he-0.021
ars-0.020
case-0.015
Token shiny
Token ID22441
Feature activation+2.68e+00

Pos logit contributions
arter+0.023
run+0.022
asured+0.022
 policy+0.022
 polar+0.021
Neg logit contributions
 top-0.012
ics-0.012
he-0.010
ars-0.010
case-0.009
Token shell
Token ID7582
Feature activation+1.09e+00

Pos logit contributions
arter+0.068
run+0.067
 policy+0.065
asured+0.063
 polar+0.062
Neg logit contributions
 top-0.058
ics-0.058
he-0.054
ars-0.052
case-0.049
Token but
Token ID475
Feature activation+1.37e+00

Pos logit contributions
ics+0.006
he+0.006
ars+0.006
 top+0.005
case+0.005
Neg logit contributions
arter-0.005
 policy-0.005
 polar-0.005
run-0.005
asured-0.005
Token couldn
Token ID3521
Feature activation-4.59e+00

Pos logit contributions
arter+0.046
 policy+0.043
 polar+0.042
 former+0.041
asured+0.041
Neg logit contributions
 top-0.022
ics-0.021
he-0.020
ars-0.019
case-0.017
Token't
Token ID470
Feature activation+1.17e+00

Pos logit contributions
ics+0.062
ars+0.059
amed+0.051
iece+0.049
he+0.045
Neg logit contributions
arter-0.145
run-0.141
 Bobby-0.139
 polar-0.138
 policy-0.138
Token.
Token ID13
Feature activation+1.17e+00

Pos logit contributions
arter+0.035
 policy+0.033
 polar+0.033
run+0.033
asured+0.032
Neg logit contributions
ics-0.022
ars-0.019
he-0.018
 top-0.017
case-0.017
Token Fr
Token ID1305
Feature activation+1.76e+00

Pos logit contributions
arter+0.028
 policy+0.026
 polar+0.025
asured+0.024
run+0.024
Neg logit contributions
ics-0.029
 top-0.029
ars-0.027
he-0.027
case-0.027
Tokenust
Token ID436
Feature activation+8.01e+00

Pos logit contributions
arter+0.029
 policy+0.027
 polar+0.026
run+0.026
asured+0.024
Neg logit contributions
ics-0.056
he-0.053
ars-0.052
 top-0.051
case-0.049
Tokenrated
Token ID4111
Feature activation-1.18e+00

Pos logit contributions
arter+0.184
 policy+0.167
 polar+0.167
ocol+0.160
 importantly+0.159
Neg logit contributions
ics-0.198
 top-0.187
ars-0.186
he-0.185
case-0.174
Token,
Token ID11
Feature activation-2.28e-01

Pos logit contributions
ics+0.015
 top+0.014
he+0.013
ars+0.012
case+0.011
Neg logit contributions
arter-0.042
 policy-0.040
run-0.040
asured-0.040
 polar-0.039
Token he
Token ID339
Feature activation+1.34e+00

Pos logit contributions
 top+0.005
ics+0.004
he+0.004
ars+0.004
 beam+0.004
Neg logit contributions
arter-0.007
 policy-0.006
asured-0.006
run-0.006
 former-0.006
Token started
Token ID2067
Feature activation-4.04e-01

Pos logit contributions
arter+0.039
 policy+0.037
 polar+0.035
run+0.035
asured+0.034
Neg logit contributions
ics-0.027
ars-0.024
he-0.024
 top-0.023
case-0.022
Token to
Token ID284
Feature activation-6.16e-01

Pos logit contributions
ics+0.010
he+0.009
ars+0.009
 top+0.008
case+0.008
Neg logit contributions
arter-0.010
run-0.009
 policy-0.009
 polar-0.009
asured-0.009
Token peek
Token ID27185
Feature activation+2.29e+00

Pos logit contributions
arter+0.054
 policy+0.051
 polar+0.051
run+0.050
asured+0.048
Neg logit contributions
ics-0.041
he-0.037
ars-0.036
 top-0.036
case-0.033
Tokened
Token ID276
Feature activation+1.30e+00

Pos logit contributions
arter+0.063
 polar+0.061
run+0.060
 policy+0.060
asured+0.057
Neg logit contributions
ics-0.047
he-0.041
ars-0.041
 top-0.039
case-0.038
Token out
Token ID503
Feature activation+4.48e+00

Pos logit contributions
arter+0.044
 policy+0.042
run+0.041
 polar+0.041
asured+0.040
Neg logit contributions
ics-0.018
 top-0.017
he-0.017
ars-0.015
case-0.014
Token of
Token ID286
Feature activation+1.89e+00

Pos logit contributions
arter+0.131
 policy+0.125
 polar+0.120
asured+0.119
run+0.117
Neg logit contributions
ics-0.083
he-0.080
 top-0.078
ars-0.072
case-0.065
Token his
Token ID465
Feature activation+3.11e+00

Pos logit contributions
arter+0.072
 policy+0.068
run+0.067
asured+0.066
 polar+0.065
Neg logit contributions
 top-0.019
he-0.018
ics-0.018
 higher-0.011
 stage-0.011
Token bur
Token ID4356
Feature activation+7.89e+00

Pos logit contributions
arter+0.089
run+0.083
 polar+0.083
 policy+0.083
asured+0.082
Neg logit contributions
ics-0.059
 top-0.058
ars-0.054
he-0.053
case-0.051
Tokenrow
Token ID808
Feature activation+2.26e+00

Pos logit contributions
arter+0.181
 policy+0.170
 polar+0.162
run+0.156
asured+0.152
Neg logit contributions
ics-0.202
ars-0.189
he-0.180
 top-0.180
case-0.164
Token.
Token ID13
Feature activation+5.74e-01

Pos logit contributions
arter+0.062
 policy+0.059
 polar+0.059
run+0.059
asured+0.056
Neg logit contributions
ics-0.045
ars-0.040
he-0.039
 top-0.037
 Higher-0.036
Token
Token ID198
Feature activation+7.71e-01

Pos logit contributions
arter+0.017
 policy+0.016
asured+0.016
 polar+0.015
 rightful+0.014
Neg logit contributions
he-0.010
ics-0.010
 top-0.010
case-0.010
ars-0.009
Token
Token ID198
Feature activation+1.43e+00

Pos logit contributions
arter+0.025
 policy+0.024
 polar+0.024
run+0.023
asured+0.023
Neg logit contributions
ics-0.012
he-0.011
ars-0.010
 top-0.010
case-0.009
TokenHe
Token ID1544
Feature activation+1.20e+00

Pos logit contributions
arter+0.047
 policy+0.045
 polar+0.044
run+0.044
asured+0.043
Neg logit contributions
ics-0.023
he-0.019
ars-0.019
 top-0.018
case-0.017
Tokenge
Token ID469
Feature activation+6.28e-01

Pos logit contributions
arter+0.076
 polar+0.068
 policy+0.067
 importantly+0.061
 former+0.060
Neg logit contributions
ics-0.140
he-0.129
 top-0.125
ars-0.123
case-0.118
Token.
Token ID13
Feature activation+1.61e-01

Pos logit contributions
arter+0.019
 policy+0.018
 polar+0.018
run+0.017
asured+0.017
Neg logit contributions
ics-0.012
he-0.010
ars-0.010
 top-0.010
case-0.009
Token
Token ID198
Feature activation+4.55e-01

Pos logit contributions
run+0.004
 policy+0.004
arter+0.004
 Bobby+0.004
 Joey+0.004
Neg logit contributions
ics-0.003
ars-0.003
he-0.003
vation-0.003
 top-0.003
Token
Token ID198
Feature activation+1.52e+00

Pos logit contributions
arter+0.013
run+0.013
 policy+0.013
 polar+0.012
asured+0.011
Neg logit contributions
ics-0.009
ars-0.007
he-0.007
 top-0.007
case-0.007
TokenFr
Token ID6732
Feature activation+3.42e+00

Pos logit contributions
run+0.039
 policy+0.038
 importantly+0.038
emy+0.037
arter+0.037
Neg logit contributions
ics-0.031
he-0.024
hed-0.023
ong-0.023
 poster-0.023
Tokenust
Token ID436
Feature activation+7.85e+00

Pos logit contributions
arter+0.046
 policy+0.041
 polar+0.040
run+0.039
asured+0.036
Neg logit contributions
ics-0.120
he-0.113
ars-0.112
 top-0.110
case-0.107
Tokenrated
Token ID4111
Feature activation-1.22e+00

Pos logit contributions
arter+0.180
 polar+0.165
 policy+0.163
run+0.157
ocol+0.155
Neg logit contributions
ics-0.192
ars-0.182
 top-0.182
he-0.178
case-0.168
Token,
Token ID11
Feature activation-1.90e-01

Pos logit contributions
ics+0.019
he+0.017
ars+0.016
 top+0.016
case+0.014
Neg logit contributions
arter-0.040
 policy-0.038
 polar-0.038
run-0.038
asured-0.037
Token the
Token ID262
Feature activation-1.42e+00

Pos logit contributions
ics+0.002
 top+0.002
he+0.002
ars+0.002
case+0.002
Neg logit contributions
arter-0.007
 policy-0.007
asured-0.007
run-0.006
 polar-0.006
Token little
Token ID1310
Feature activation-6.94e-01

Pos logit contributions
ics+0.045
he+0.041
ars+0.041
 top+0.039
case+0.039
Neg logit contributions
arter-0.025
 policy-0.023
 polar-0.022
run-0.021
asured-0.020
Token ant
Token ID1885
Feature activation+1.00e+00

Pos logit contributions
ics+0.020
he+0.019
 top+0.019
ars+0.019
case+0.018
Neg logit contributions
arter-0.013
 policy-0.012
 polar-0.012
run-0.012
asured-0.011
Token stay
Token ID2652
Feature activation+1.11e+00

Pos logit contributions
ics+0.013
 top+0.011
he+0.010
ars+0.010
 sing+0.010
Neg logit contributions
arter-0.027
 policy-0.025
run-0.025
ocol-0.024
asured-0.024
Token with
Token ID351
Feature activation+1.22e+00

Pos logit contributions
arter+0.041
 policy+0.039
run+0.037
ocol+0.036
 polar+0.035
Neg logit contributions
ics-0.013
ars-0.011
he-0.011
 top-0.011
g-0.010
Token them
Token ID606
Feature activation+4.74e-03

Pos logit contributions
arter+0.038
 policy+0.037
run+0.035
asured+0.035
 polar+0.034
Neg logit contributions
ics-0.022
 top-0.020
he-0.019
ars-0.018
case-0.016
Token in
Token ID287
Feature activation+1.73e+00

Pos logit contributions
arter+0.000
 policy+0.000
 polar+0.000
run+0.000
asured+0.000
Neg logit contributions
ics-0.000
he-0.000
ars-0.000
 top-0.000
case-0.000
Token their
Token ID511
Feature activation+2.16e+00

Pos logit contributions
arter+0.058
 policy+0.054
asured+0.054
run+0.053
 polar+0.052
Neg logit contributions
 top-0.024
ics-0.024
he-0.023
ars-0.019
't-0.018
Token bur
Token ID4356
Feature activation+7.81e+00

Pos logit contributions
arter+0.065
 policy+0.060
run+0.060
asured+0.058
 polar+0.056
Neg logit contributions
 top-0.039
ars-0.037
ics-0.037
 rides-0.031
'-0.031
Tokenrow
Token ID808
Feature activation+2.24e+00

Pos logit contributions
arter+0.191
 policy+0.182
 polar+0.173
 Bobby+0.163
 importantly+0.163
Neg logit contributions
ics-0.187
ars-0.179
he-0.158
 top-0.156
ers-0.151
Token.
Token ID13
Feature activation+4.32e-01

Pos logit contributions
arter+0.070
 policy+0.067
 polar+0.066
run+0.064
asured+0.063
Neg logit contributions
ics-0.039
he-0.034
ars-0.033
 top-0.033
case-0.030
Token The
Token ID383
Feature activation-5.45e-01

Pos logit contributions
arter+0.013
 policy+0.013
 polar+0.012
asured+0.012
run+0.012
Neg logit contributions
ics-0.008
he-0.007
 top-0.007
ars-0.007
case-0.007
Token elderly
Token ID13830
Feature activation-3.70e-01

Pos logit contributions
 top+0.012
ics+0.012
he+0.011
ars+0.010
 sky+0.010
Neg logit contributions
run-0.015
arter-0.014
 policy-0.013
asured-0.012
ocol-0.012
Token rabbit
Token ID22746
Feature activation+8.41e-01

Pos logit contributions
ics+0.009
he+0.008
 top+0.008
ars+0.008
case+0.008
Neg logit contributions
arter-0.009
run-0.009
 policy-0.008
asured-0.008
 polar-0.008
Token proudly
Token ID23017
Feature activation+1.81e+00

Pos logit contributions
ics+0.003
 top+0.003
he+0.003
ars+0.003
case+0.003
Neg logit contributions
arter-0.006
 policy-0.006
ocol-0.006
run-0.005
 polar-0.005
Token put
Token ID1234
Feature activation-1.74e-01

Pos logit contributions
arter+0.052
 policy+0.049
 polar+0.047
run+0.046
asured+0.045
Neg logit contributions
ics-0.037
ars-0.032
he-0.032
 top-0.031
hed-0.029
Token it
Token ID340
Feature activation+2.12e-01

Pos logit contributions
ics+0.003
he+0.003
ars+0.003
 top+0.003
case+0.002
Neg logit contributions
arter-0.006
 policy-0.005
 polar-0.005
run-0.005
asured-0.005
Token in
Token ID287
Feature activation+9.09e-01

Pos logit contributions
arter+0.006
run+0.006
 policy+0.006
 polar+0.006
asured+0.006
Neg logit contributions
ics-0.004
he-0.003
ars-0.003
 top-0.003
case-0.003
Token his
Token ID465
Feature activation+2.49e+00

Pos logit contributions
arter+0.028
 policy+0.025
run+0.024
asured+0.024
 polar+0.024
Neg logit contributions
 top-0.017
ics-0.016
he-0.016
g-0.013
ars-0.013
Token bur
Token ID4356
Feature activation+7.79e+00

Pos logit contributions
arter+0.070
 policy+0.066
 polar+0.064
asured+0.063
run+0.062
Neg logit contributions
 top-0.050
ics-0.048
ars-0.047
case-0.044
he-0.043
Tokenrow
Token ID808
Feature activation+1.84e+00

Pos logit contributions
arter+0.193
 policy+0.180
 polar+0.171
 importantly+0.160
asured+0.157
Neg logit contributions
ics-0.185
ars-0.179
 top-0.164
he-0.164
ers-0.154
Token.
Token ID13
Feature activation+1.49e-01

Pos logit contributions
arter+0.051
 policy+0.049
run+0.049
 polar+0.049
asured+0.047
Neg logit contributions
ics-0.036
ars-0.032
he-0.032
 top-0.030
case-0.029
Token He
Token ID679
Feature activation+9.18e-01

Pos logit contributions
arter+0.005
 policy+0.004
 polar+0.004
asured+0.004
run+0.004
Neg logit contributions
ics-0.003
he-0.002
 top-0.002
ars-0.002
case-0.002
Token thought
Token ID1807
Feature activation-1.66e-01

Pos logit contributions
arter+0.028
 policy+0.027
run+0.025
 polar+0.025
ocol+0.024
Neg logit contributions
ics-0.017
 top-0.016
he-0.015
ars-0.015
case-0.014
Token it
Token ID340
Feature activation+1.46e+00

Pos logit contributions
ics+0.003
he+0.003
 top+0.003
ars+0.003
case+0.002
Neg logit contributions
arter-0.005
 policy-0.005
 polar-0.005
asured-0.005
run-0.005
Token and
Token ID290
Feature activation-5.55e-01

Pos logit contributions
ics+0.028
ars+0.024
he+0.024
 top+0.023
case+0.022
Neg logit contributions
arter-0.045
 policy-0.043
 polar-0.043
run-0.043
asured-0.042
Token put
Token ID1234
Feature activation-4.61e-01

Pos logit contributions
ics+0.010
 top+0.010
ars+0.009
 swing+0.008
 fly+0.008
Neg logit contributions
arter-0.017
 policy-0.017
run-0.015
asured-0.015
 importantly-0.015
Token them
Token ID606
Feature activation-1.02e+00

Pos logit contributions
ics+0.007
 top+0.006
ars+0.006
he+0.006
g+0.005
Neg logit contributions
arter-0.015
 policy-0.015
asured-0.014
 polar-0.014
run-0.014
Token in
Token ID287
Feature activation+9.35e-01

Pos logit contributions
ics+0.018
he+0.016
ars+0.016
 top+0.015
case+0.015
Neg logit contributions
arter-0.031
 polar-0.029
 policy-0.029
run-0.029
asured-0.028
Token his
Token ID465
Feature activation+2.65e+00

Pos logit contributions
arter+0.032
 policy+0.030
run+0.029
asured+0.029
 polar+0.029
Neg logit contributions
ics-0.014
 top-0.013
he-0.012
ars-0.011
g-0.010
Token bur
Token ID4356
Feature activation+7.78e+00

Pos logit contributions
arter+0.083
 policy+0.079
asured+0.076
run+0.074
 importantly+0.074
Neg logit contributions
 top-0.047
ars-0.044
ics-0.042
case-0.038
 swing-0.036
Tokenrow
Token ID808
Feature activation+1.75e+00

Pos logit contributions
arter+0.203
 policy+0.193
 polar+0.176
 importantly+0.174
 rightful+0.168
Neg logit contributions
ics-0.174
ars-0.171
 top-0.152
he-0.152
ers-0.151
Token.
Token ID13
Feature activation+3.66e-01

Pos logit contributions
arter+0.052
 policy+0.050
 polar+0.049
run+0.049
asured+0.047
Neg logit contributions
ics-0.032
he-0.028
ars-0.028
 top-0.027
case-0.025
Token\n
Token ID198
Feature activation+4.44e-01

Pos logit contributions
arter+0.011
 policy+0.010
asured+0.010
 polar+0.010
run+0.009
Neg logit contributions
ics-0.007
 top-0.007
he-0.007
ars-0.006
case-0.006
Token\n
Token ID198
Feature activation+1.30e+00

Pos logit contributions
run+0.012
 polar+0.011
arter+0.011
 policy+0.011
 Bobby+0.010
Neg logit contributions
ics-0.009
ars-0.008
 top-0.008
he-0.008
vation-0.007
TokenOne
Token ID3198
Feature activation+2.73e+00

Pos logit contributions
 polar+0.037
run+0.037
arter+0.036
 policy+0.035
asured+0.034
Neg logit contributions
ics-0.024
he-0.019
 top-0.019
ars-0.018
hed-0.018
Token he
Token ID339
Feature activation+1.30e+00

Pos logit contributions
arter+0.045
 policy+0.041
run+0.040
 former+0.039
 polar+0.039
Neg logit contributions
ics-0.025
 top-0.024
he-0.021
ars-0.020
case-0.019
Token said
Token ID531
Feature activation-6.34e-01

Pos logit contributions
arter+0.040
 policy+0.038
ocol+0.037
 polar+0.036
 former+0.035
Neg logit contributions
ics-0.023
 top-0.022
ars-0.021
he-0.021
case-0.020
Token no
Token ID645
Feature activation+4.53e-01

Pos logit contributions
ics+0.012
 top+0.010
he+0.010
ars+0.010
case+0.009
Neg logit contributions
arter-0.019
 policy-0.018
 polar-0.018
asured-0.018
run-0.017
Token.
Token ID13
Feature activation+6.64e-01

Pos logit contributions
arter+0.017
 former+0.016
 polar+0.016
asured+0.016
ocol+0.015
Neg logit contributions
 top-0.003
's-0.003
'-0.003
ics-0.003
he-0.002
Token Fr
Token ID1305
Feature activation+1.33e+00

Pos logit contributions
arter+0.017
 policy+0.016
 polar+0.016
run+0.015
asured+0.015
Neg logit contributions
ics-0.015
 top-0.014
he-0.014
ars-0.013
case-0.013
Tokenust
Token ID436
Feature activation+7.75e+00

Pos logit contributions
arter+0.023
 polar+0.021
 policy+0.021
run+0.020
asured+0.018
Neg logit contributions
ics-0.042
he-0.040
ars-0.039
 top-0.037
case-0.037
Tokenrated
Token ID4111
Feature activation-1.36e+00

Pos logit contributions
arter+0.190
 polar+0.177
 policy+0.171
 importantly+0.168
ocol+0.168
Neg logit contributions
ics-0.180
ars-0.168
 top-0.166
he-0.159
 higher-0.158
Token,
Token ID11
Feature activation-7.94e-01

Pos logit contributions
ics+0.017
 top+0.015
he+0.015
ars+0.015
case+0.013
Neg logit contributions
arter-0.049
 policy-0.046
run-0.046
 polar-0.045
asured-0.045
Token Bob
Token ID5811
Feature activation-3.12e-02

Pos logit contributions
 top+0.018
ics+0.017
he+0.016
't+0.015
 higher+0.014
Neg logit contributions
arter-0.022
 policy-0.020
asured-0.020
run-0.020
 former-0.019
Token decided
Token ID3066
Feature activation-1.23e+00

Pos logit contributions
ics+0.001
he+0.001
ars+0.001
 top+0.001
case+0.001
Neg logit contributions
arter-0.001
 policy-0.001
 polar-0.001
run-0.001
asured-0.001
Token to
Token ID284
Feature activation-8.07e-01

Pos logit contributions
ics+0.025
 top+0.022
he+0.021
ars+0.021
case+0.020
Neg logit contributions
arter-0.036
 polar-0.033
 policy-0.033
run-0.032
asured-0.032
Token or
Token ID393
Feature activation+4.68e-01

Pos logit contributions
ics+0.021
ars+0.019
he+0.018
 top+0.017
case+0.016
Neg logit contributions
arter-0.034
 policy-0.033
 polar-0.032
run-0.032
asured-0.031
Token too
Token ID1165
Feature activation+1.23e+00

Pos logit contributions
arter+0.015
 policy+0.014
 polar+0.013
run+0.013
asured+0.013
Neg logit contributions
ics-0.008
 top-0.007
ars-0.007
 higher-0.007
he-0.006
Token hard
Token ID1327
Feature activation-2.39e+00

Pos logit contributions
arter+0.038
 policy+0.036
 polar+0.036
run+0.035
asured+0.034
Neg logit contributions
ics-0.022
he-0.019
ars-0.019
 top-0.018
case-0.017
Token.
Token ID13
Feature activation+1.05e+00

Pos logit contributions
ics+0.051
he+0.046
ars+0.045
 top+0.044
case+0.041
Neg logit contributions
arter-0.065
 policy-0.062
run-0.061
 polar-0.060
asured-0.058
Token Fr
Token ID1305
Feature activation+1.41e+00

Pos logit contributions
arter+0.031
 policy+0.030
run+0.030
 polar+0.029
asured+0.028
Neg logit contributions
ics-0.020
ars-0.017
he-0.017
 top-0.016
case-0.015
Tokenust
Token ID436
Feature activation+7.73e+00

Pos logit contributions
arter+0.028
 policy+0.027
run+0.027
asured+0.026
 polar+0.026
Neg logit contributions
ics-0.038
 top-0.035
ars-0.035
he-0.035
case-0.033
Tokenrated
Token ID4111
Feature activation-1.26e+00

Pos logit contributions
arter+0.178
 policy+0.162
 polar+0.162
run+0.155
 importantly+0.152
Neg logit contributions
ics-0.198
he-0.184
ars-0.181
 top-0.180
case-0.169
Token,
Token ID11
Feature activation-6.03e-02

Pos logit contributions
ics+0.021
he+0.018
ars+0.017
 top+0.017
case+0.016
Neg logit contributions
arter-0.041
 policy-0.039
 polar-0.039
run-0.038
asured-0.037
Token she
Token ID673
Feature activation+1.24e+00

Pos logit contributions
ics+0.001
he+0.001
 top+0.001
ars+0.001
case+0.001
Neg logit contributions
arter-0.002
 policy-0.002
run-0.002
 polar-0.002
asured-0.002
Token kept
Token ID4030
Feature activation-4.56e-01

Pos logit contributions
arter+0.030
 polar+0.030
 policy+0.029
run+0.029
asured+0.028
Neg logit contributions
ics-0.029
he-0.027
ars-0.025
 top-0.025
case-0.024
Token searching
Token ID10342
Feature activation+3.00e+00

Pos logit contributions
ics+0.010
he+0.010
case+0.010
ars+0.010
 top+0.008
Neg logit contributions
run-0.011
arter-0.011
 polar-0.011
asured-0.011
 policy-0.010
Token a
Token ID257
Feature activation+1.10e-01

Pos logit contributions
arter+0.008
 polar+0.008
 policy+0.007
asured+0.007
 Bobby+0.007
Neg logit contributions
ics-0.005
ars-0.004
vation-0.004
he-0.004
case-0.004
Token little
Token ID1310
Feature activation+8.20e-02

Pos logit contributions
arter+0.003
 policy+0.003
run+0.003
 polar+0.003
asured+0.003
Neg logit contributions
ics-0.002
he-0.002
ars-0.002
 top-0.002
case-0.002
Token crab
Token ID32202
Feature activation+1.21e+00

Pos logit contributions
arter+0.002
run+0.002
 policy+0.002
asured+0.002
 former+0.002
Neg logit contributions
ics-0.002
he-0.002
 top-0.002
ars-0.002
case-0.002
Token named
Token ID3706
Feature activation+1.85e+00

Pos logit contributions
arter+0.023
 policy+0.021
 polar+0.021
run+0.021
asured+0.020
Neg logit contributions
ics-0.036
he-0.033
ars-0.033
 top-0.032
case-0.031
Token Cr
Token ID3864
Feature activation+2.64e+00

Pos logit contributions
arter+0.026
 policy+0.025
run+0.023
asured+0.022
 polar+0.022
Neg logit contributions
ics-0.064
 top-0.060
he-0.059
ars-0.058
case-0.056
Tokenust
Token ID436
Feature activation+7.73e+00

Pos logit contributions
arter+0.039
asured+0.034
 policy+0.033
run+0.033
 polar+0.031
Neg logit contributions
ics-0.087
 top-0.083
ars-0.080
he-0.078
 Higher-0.078
Tokeny
Token ID88
Feature activation+1.22e+00

Pos logit contributions
arter+0.182
 policy+0.172
run+0.170
 importantly+0.162
 polar+0.162
Neg logit contributions
ics-0.192
 top-0.180
he-0.178
ars-0.173
case-0.156
Token.
Token ID13
Feature activation+1.64e+00

Pos logit contributions
arter+0.037
 policy+0.036
run+0.035
 polar+0.035
asured+0.034
Neg logit contributions
ics-0.021
he-0.019
 top-0.018
ars-0.018
case-0.017
Token Cr
Token ID3864
Feature activation+2.78e+00

Pos logit contributions
arter+0.023
 policy+0.021
 polar+0.020
run+0.019
asured+0.018
Neg logit contributions
ics-0.057
he-0.053
ars-0.053
 top-0.052
case-0.050
Tokenust
Token ID436
Feature activation+8.15e+00

Pos logit contributions
arter+0.032
asured+0.027
 policy+0.027
run+0.026
 polar+0.023
Neg logit contributions
ics-0.101
 top-0.097
ars-0.095
he-0.093
 Higher-0.090
Tokeny
Token ID88
Feature activation+1.57e+00

Pos logit contributions
arter+0.206
 policy+0.189
run+0.188
 polar+0.186
 importantly+0.182
Neg logit contributions
ics-0.190
he-0.178
 top-0.172
ars-0.169
case-0.160
Token print
Token ID3601
Feature activation-1.22e+00

Pos logit contributions
ics+0.006
he+0.005
ars+0.005
 top+0.005
case+0.004
Neg logit contributions
arter-0.008
 policy-0.008
 polar-0.008
run-0.008
asured-0.008
Token so
Token ID523
Feature activation+9.95e-01

Pos logit contributions
ics+0.024
ars+0.021
he+0.020
 top+0.019
case+0.019
Neg logit contributions
arter-0.035
 policy-0.034
 polar-0.033
run-0.033
asured-0.032
Token badly
Token ID11234
Feature activation+3.06e+00

Pos logit contributions
arter+0.030
 polar+0.029
 policy+0.028
asured+0.027
run+0.027
Neg logit contributions
ics-0.018
 top-0.018
he-0.018
ars-0.016
case-0.015
Token.
Token ID13
Feature activation+6.56e-01

Pos logit contributions
arter+0.101
 polar+0.096
 policy+0.094
run+0.094
asured+0.092
Neg logit contributions
ics-0.049
he-0.044
 top-0.042
ars-0.040
case-0.037
Token Fr
Token ID1305
Feature activation+1.65e+00

Pos logit contributions
arter+0.021
 policy+0.019
 polar+0.019
asured+0.018
run+0.018
Neg logit contributions
ics-0.011
 top-0.011
he-0.010
ars-0.010
case-0.010
Tokenust
Token ID436
Feature activation+7.69e+00

Pos logit contributions
arter+0.028
 policy+0.026
 polar+0.025
run+0.025
asured+0.024
Neg logit contributions
ics-0.052
he-0.048
ars-0.048
 top-0.047
case-0.045
Tokenrated
Token ID4111
Feature activation-1.42e+00

Pos logit contributions
arter+0.182
 polar+0.165
 policy+0.161
run+0.156
ocol+0.155
Neg logit contributions
ics-0.194
he-0.182
ars-0.174
 top-0.171
case-0.164
Token tears
Token ID10953
Feature activation+3.26e+00

Pos logit contributions
ics+0.021
he+0.019
 top+0.018
ars+0.018
case+0.016
Neg logit contributions
arter-0.048
 policy-0.045
 polar-0.045
run-0.045
asured-0.044
Token were
Token ID547
Feature activation-4.64e-01

Pos logit contributions
arter+0.099
 policy+0.094
 polar+0.093
run+0.091
asured+0.090
Neg logit contributions
ics-0.059
he-0.053
ars-0.051
 top-0.050
case-0.046
Token well
Token ID880
Feature activation+7.77e-01

Pos logit contributions
ics+0.008
 top+0.007
he+0.007
ars+0.006
case+0.006
Neg logit contributions
arter-0.015
 policy-0.014
 polar-0.014
run-0.013
asured-0.013
Tokening
Token ID278
Feature activation+6.76e-01

Pos logit contributions
arter+0.025
 policy+0.024
 polar+0.023
run+0.023
asured+0.023
Neg logit contributions
ics-0.013
he-0.011
ars-0.011
 top-0.011
case-0.010
Token back
Token ID736
Feature activation+5.11e+00

Pos logit contributions
arter+0.006
 polar+0.006
 policy+0.006
run+0.006
asured+0.005
Neg logit contributions
ics-0.003
he-0.003
 top-0.003
ars-0.003
case-0.002
Token.
Token ID13
Feature activation-8.78e-02

Pos logit contributions
arter+0.191
 polar+0.184
 policy+0.171
ocol+0.166
 importantly+0.165
Neg logit contributions
ics-0.064
 top-0.059
he-0.055
ars-0.044
'-0.043
Token\n
Token ID198
Feature activation+3.90e-01

Pos logit contributions
ics+0.001
he+0.001
 top+0.001
ars+0.001
case+0.001
Neg logit contributions
arter-0.003
 polar-0.003
 policy-0.003
run-0.003
asured-0.003
Token\n
Token ID198
Feature activation+1.15e+00

Pos logit contributions
run+0.011
arter+0.011
 policy+0.010
 polar+0.010
asured+0.010
Neg logit contributions
ics-0.008
ars-0.007
 top-0.006
he-0.006
case-0.006
TokenFr
Token ID6732
Feature activation+3.35e+00

Pos logit contributions
run+0.033
 policy+0.033
arter+0.032
asured+0.032
 importantly+0.031
Neg logit contributions
ics-0.020
ars-0.017
he-0.016
 top-0.015
ong-0.015
Tokenust
Token ID436
Feature activation+7.68e+00

Pos logit contributions
arter+0.042
 policy+0.037
 polar+0.037
run+0.035
asured+0.032
Neg logit contributions
ics-0.121
he-0.114
ars-0.112
 top-0.111
case-0.108
Tokenrated
Token ID4111
Feature activation-1.45e+00

Pos logit contributions
arter+0.180
 polar+0.166
 policy+0.157
run+0.151
ocol+0.150
Neg logit contributions
ics-0.194
 top-0.178
ars-0.178
he-0.177
case-0.167
Token,
Token ID11
Feature activation-4.42e-01

Pos logit contributions
ics+0.022
he+0.019
 top+0.019
ars+0.019
case+0.017
Neg logit contributions
arter-0.049
 policy-0.046
 polar-0.046
run-0.045
asured-0.045
Token Jane
Token ID12091
Feature activation-9.47e-02

Pos logit contributions
 top+0.009
ics+0.009
he+0.009
case+0.008
ars+0.008
Neg logit contributions
arter-0.012
 polar-0.011
 policy-0.011
asured-0.011
run-0.011
Token went
Token ID1816
Feature activation-3.07e-01

Pos logit contributions
ics+0.002
he+0.002
ars+0.002
 top+0.002
 Higher+0.002
Neg logit contributions
run-0.002
 polar-0.002
arter-0.002
asured-0.002
 policy-0.002
Token back
Token ID736
Feature activation+3.67e+00

Pos logit contributions
ics+0.006
he+0.006
ars+0.006
 top+0.005
case+0.005
Neg logit contributions
arter-0.008
 policy-0.008
 polar-0.008
run-0.008
asured-0.008
Token be
Token ID307
Feature activation+1.37e+00

Pos logit contributions
ics+0.018
ars+0.017
he+0.017
hed+0.015
 top+0.015
Neg logit contributions
arter-0.026
 polar-0.025
 policy-0.025
run-0.024
asured-0.024
Token cow
Token ID9875
Feature activation-1.28e+00

Pos logit contributions
arter+0.045
 policy+0.044
run+0.042
 polar+0.041
asured+0.040
Neg logit contributions
ics-0.022
ars-0.018
he-0.018
 top-0.018
case-0.016
Tokenboys
Token ID13202
Feature activation-2.05e+00

Pos logit contributions
ics+0.022
he+0.021
ars+0.020
 top+0.019
case+0.018
Neg logit contributions
arter-0.038
 polar-0.037
 policy-0.036
run-0.036
asured-0.035
Token and
Token ID290
Feature activation-1.81e+00

Pos logit contributions
ics+0.040
ars+0.035
 top+0.035
he+0.034
 Higher+0.033
Neg logit contributions
arter-0.056
 polar-0.055
run-0.053
 policy-0.052
asured-0.052
Token cow
Token ID9875
Feature activation-1.26e+00

Pos logit contributions
ics+0.042
 top+0.038
ars+0.038
 swing+0.036
 sing+0.035
Neg logit contributions
arter-0.046
 policy-0.044
asured-0.041
run-0.041
 importantly-0.037
Tokengirls
Token ID36960
Feature activation-7.00e+00

Pos logit contributions
ics+0.030
he+0.027
ars+0.027
 top+0.026
case+0.025
Neg logit contributions
arter-0.031
 policy-0.029
 polar-0.029
run-0.029
asured-0.028
Token.
Token ID13
Feature activation+7.39e-01

Pos logit contributions
ics+0.151
he+0.142
ars+0.139
 top+0.134
vation+0.133
Neg logit contributions
arter-0.165
 polar-0.161
run-0.157
 policy-0.156
asured-0.152
Token They
Token ID1119
Feature activation-1.59e+00

Pos logit contributions
arter+0.021
 policy+0.020
asured+0.020
 polar+0.019
 former+0.019
Neg logit contributions
 top-0.014
ics-0.014
he-0.013
ars-0.013
case-0.012
Token put
Token ID1234
Feature activation-9.26e-01

Pos logit contributions
ics+0.035
he+0.030
 top+0.029
ars+0.029
case+0.027
Neg logit contributions
 polar-0.042
arter-0.042
 policy-0.039
run-0.039
asured-0.038
Token on
Token ID319
Feature activation+2.10e-01

Pos logit contributions
ics+0.017
he+0.015
ars+0.014
 top+0.014
case+0.014
Neg logit contributions
arter-0.028
 polar-0.027
 policy-0.026
run-0.026
asured-0.025
Token hats
Token ID23910
Feature activation-3.03e+00

Pos logit contributions
 polar+0.005
arter+0.005
run+0.004
 Bobby+0.004
 policy+0.004
Neg logit contributions
he-0.005
ics-0.005
ars-0.005
case-0.005
vation-0.004
Token.
Token ID13
Feature activation+5.56e-01

Pos logit contributions
ics+0.012
he+0.010
ars+0.010
 top+0.010
case+0.009
Neg logit contributions
arter-0.019
 policy-0.018
 polar-0.018
run-0.018
asured-0.017
Token "
Token ID366
Feature activation-1.62e+00

Pos logit contributions
arter+0.014
 policy+0.013
run+0.013
 polar+0.013
asured+0.012
Neg logit contributions
ics-0.013
he-0.012
ars-0.012
 top-0.011
case-0.011
TokenThese
Token ID4711
Feature activation-3.06e-02

Pos logit contributions
ics+0.031
he+0.026
ars+0.026
 top+0.026
case+0.025
Neg logit contributions
arter-0.047
 policy-0.045
 polar-0.044
run-0.044
asured-0.043
Token muff
Token ID27563
Feature activation+2.13e+00

Pos logit contributions
ics+0.001
he+0.001
ars+0.000
 top+0.000
case+0.000
Neg logit contributions
arter-0.001
 policy-0.001
 polar-0.001
run-0.001
asured-0.001
Tokenins
Token ID1040
Feature activation+3.48e-01

Pos logit contributions
arter+0.049
 polar+0.047
 policy+0.047
 former+0.043
run+0.043
Neg logit contributions
ics-0.056
he-0.053
ars-0.051
 top-0.047
hed-0.045
Token don
Token ID836
Feature activation-6.38e+00

Pos logit contributions
arter+0.011
 policy+0.011
 polar+0.010
run+0.010
asured+0.010
Neg logit contributions
ics-0.006
he-0.005
ars-0.005
 top-0.005
case-0.004
Token't
Token ID470
Feature activation+9.06e-01

Pos logit contributions
iece+0.095
ars+0.091
 headset+0.086
 Higher+0.085
ics+0.084
Neg logit contributions
arter-0.182
 policy-0.178
run-0.164
 Bobby-0.163
 Joey-0.160
Token taste
Token ID6938
Feature activation+9.21e-01

Pos logit contributions
arter+0.026
 policy+0.026
run+0.025
 polar+0.025
asured+0.025
Neg logit contributions
ics-0.017
he-0.016
ars-0.015
 top-0.015
't-0.013
Token good
Token ID922
Feature activation-1.29e+00

Pos logit contributions
arter+0.026
 policy+0.025
 polar+0.025
run+0.025
asured+0.024
Neg logit contributions
ics-0.018
he-0.016
 top-0.016
ars-0.016
case-0.014
Token."
Token ID526
Feature activation-6.93e-01

Pos logit contributions
ics+0.021
he+0.019
 top+0.018
ars+0.017
hed+0.016
Neg logit contributions
arter-0.041
 polar-0.040
run-0.039
 policy-0.038
asured-0.037
Token Her
Token ID2332
Feature activation+2.31e+00

Pos logit contributions
ics+0.017
ars+0.016
he+0.016
vation+0.014
case+0.014
Neg logit contributions
arter-0.016
run-0.015
 policy-0.015
 Bobby-0.014
 Joey-0.014
Token't
Token ID470
Feature activation+9.81e-01

Pos logit contributions
ars+0.092
ics+0.087
iece+0.085
amed+0.076
 higher+0.076
Neg logit contributions
arter-0.178
 policy-0.173
run-0.166
 Bobby-0.165
 polar-0.161
Token hurt
Token ID5938
Feature activation+7.23e-01

Pos logit contributions
arter+0.031
run+0.031
asured+0.030
 policy+0.030
 polar+0.030
Neg logit contributions
ics-0.015
 top-0.014
he-0.014
't-0.013
ars-0.013
Token her
Token ID607
Feature activation+9.84e-01

Pos logit contributions
arter+0.024
run+0.022
 policy+0.022
asured+0.022
 polar+0.022
Neg logit contributions
ics-0.012
 top-0.011
he-0.010
't-0.009
ars-0.009
Token,
Token ID11
Feature activation-2.12e+00

Pos logit contributions
arter+0.028
 policy+0.027
 polar+0.026
run+0.026
asured+0.025
Neg logit contributions
ics-0.020
he-0.018
 top-0.017
ars-0.017
case-0.016
Token please
Token ID3387
Feature activation+5.07e-01

Pos logit contributions
ics+0.043
he+0.042
 top+0.041
case+0.040
't+0.039
Neg logit contributions
arter-0.056
asured-0.053
 policy-0.052
run-0.050
 polar-0.050
Token don
Token ID836
Feature activation-6.37e+00

Pos logit contributions
arter+0.015
 policy+0.014
 polar+0.014
run+0.014
asured+0.013
Neg logit contributions
ics-0.010
he-0.009
ars-0.009
 top-0.009
case-0.008
Token't
Token ID470
Feature activation+8.50e-01

Pos logit contributions
ars+0.091
ics+0.090
iece+0.084
amed+0.078
 Higher+0.076
Neg logit contributions
arter-0.191
 policy-0.189
run-0.180
 Bobby-0.173
 polar-0.171
Token hurt
Token ID5938
Feature activation+4.57e-01

Pos logit contributions
arter+0.027
run+0.026
asured+0.026
 policy+0.026
 polar+0.025
Neg logit contributions
ics-0.013
 top-0.012
he-0.012
't-0.012
ars-0.011
Token me
Token ID502
Feature activation-3.20e-01

Pos logit contributions
arter+0.015
run+0.015
 policy+0.014
 polar+0.014
asured+0.014
Neg logit contributions
ics-0.007
 top-0.006
he-0.006
't-0.005
ars-0.005
Token!"
Token ID2474
Feature activation-1.78e+00

Pos logit contributions
ics+0.007
ars+0.006
he+0.006
 top+0.006
case+0.006
Neg logit contributions
arter-0.008
 policy-0.008
 polar-0.008
run-0.008
asured-0.008
Token\n
Token ID198
Feature activation+4.14e-02

Pos logit contributions
ics+0.042
ars+0.036
he+0.035
 top+0.033
case+0.033
Neg logit contributions
arter-0.045
 policy-0.043
run-0.042
 polar-0.041
 Bobby-0.040
Token sk
Token ID1341
Feature activation+9.11e-01

Pos logit contributions
ics+0.000
he+0.000
ars+0.000
 top+0.000
case+0.000
Neg logit contributions
arter-0.000
 policy-0.000
 polar-0.000
run-0.000
asured-0.000
Tokenates
Token ID689
Feature activation-9.08e-01

Pos logit contributions
arter+0.025
run+0.024
 policy+0.023
 polar+0.023
asured+0.023
Neg logit contributions
ics-0.019
he-0.017
 top-0.017
ars-0.016
case-0.016
Token and
Token ID290
Feature activation-1.16e+00

Pos logit contributions
ics+0.015
he+0.013
ars+0.013
 top+0.012
case+0.012
Neg logit contributions
arter-0.028
 policy-0.027
run-0.027
 polar-0.027
asured-0.026
Token helmets
Token ID28359
Feature activation+7.32e-01

Pos logit contributions
ics+0.020
ars+0.019
 top+0.018
he+0.017
case+0.016
Neg logit contributions
arter-0.037
 policy-0.034
 polar-0.033
run-0.033
asured-0.033
Token and
Token ID290
Feature activation-1.16e+00

Pos logit contributions
arter+0.017
run+0.017
 policy+0.017
 polar+0.017
asured+0.016
Neg logit contributions
ics-0.016
ars-0.014
he-0.014
 top-0.014
vation-0.013
Token pads
Token ID21226
Feature activation-6.35e+00

Pos logit contributions
ics+0.018
ars+0.018
 top+0.017
he+0.015
case+0.015
Neg logit contributions
arter-0.038
 policy-0.035
 polar-0.034
asured-0.034
run-0.034
Token.
Token ID13
Feature activation+8.76e-01

Pos logit contributions
ics+0.159
vation+0.146
ceed+0.136
he+0.135
 Higher+0.134
Neg logit contributions
run-0.129
 policy-0.127
arter-0.115
 polar-0.112
asured-0.106
Token They
Token ID1119
Feature activation-1.52e+00

Pos logit contributions
arter+0.024
 policy+0.023
asured+0.022
 polar+0.021
run+0.021
Neg logit contributions
ars-0.017
 top-0.017
ics-0.017
 Higher-0.016
't-0.016
Token use
Token ID779
Feature activation+6.24e-01

Pos logit contributions
ics+0.029
he+0.025
ars+0.024
 top+0.024
case+0.022
Neg logit contributions
arter-0.043
 polar-0.043
 policy-0.041
run-0.041
asured-0.040
Token a
Token ID257
Feature activation+9.18e-01

Pos logit contributions
arter+0.019
 policy+0.017
 polar+0.017
asured+0.017
run+0.016
Neg logit contributions
ics-0.012
 top-0.011
ars-0.010
he-0.010
case-0.009
Token puck
Token ID20695
Feature activation-2.59e+00

Pos logit contributions
arter+0.029
 policy+0.028
 polar+0.027
run+0.027
asured+0.027
Neg logit contributions
ics-0.015
he-0.013
ars-0.013
 top-0.013
case-0.012
Token their
Token ID511
Feature activation+1.09e+00

Pos logit contributions
 polar+0.009
run+0.009
arter+0.009
 policy+0.008
asured+0.008
Neg logit contributions
ics-0.007
he-0.007
ars-0.006
case-0.006
 top-0.006
Token gear
Token ID7733
Feature activation-2.05e+00

Pos logit contributions
arter+0.030
 policy+0.029
 polar+0.029
run+0.029
asured+0.027
Neg logit contributions
ics-0.022
he-0.020
ars-0.019
 top-0.018
case-0.018
Token:
Token ID25
Feature activation-8.49e-01

Pos logit contributions
ics+0.041
he+0.037
ars+0.036
 Higher+0.035
ceed+0.035
Neg logit contributions
run-0.050
 policy-0.050
asured-0.049
arter-0.048
 polar-0.048
Token helmets
Token ID28359
Feature activation+6.34e-01

Pos logit contributions
ics+0.018
he+0.016
ars+0.016
vation+0.015
case+0.015
Neg logit contributions
 polar-0.022
run-0.022
arter-0.022
 policy-0.021
asured-0.020
Token,
Token ID11
Feature activation-1.06e+00

Pos logit contributions
 policy+0.016
arter+0.016
run+0.015
 polar+0.015
asured+0.015
Neg logit contributions
ics-0.012
ars-0.012
he-0.011
ceed-0.011
 Higher-0.011
Token pads
Token ID21226
Feature activation-6.35e+00

Pos logit contributions
ics+0.021
he+0.019
ars+0.018
 top+0.018
case+0.017
Neg logit contributions
arter-0.030
 policy-0.029
 polar-0.028
run-0.028
asured-0.027
Token and
Token ID290
Feature activation-1.37e+00

Pos logit contributions
ics+0.139
vation+0.128
ceed+0.126
 Higher+0.123
Ta+0.122
Neg logit contributions
 policy-0.139
run-0.136
 polar-0.131
arter-0.123
 Bobby-0.115
Token gloves
Token ID18051
Feature activation-2.74e+00

Pos logit contributions
ics+0.022
ars+0.020
 top+0.019
he+0.018
case+0.017
Neg logit contributions
arter-0.046
 policy-0.042
 polar-0.041
asured-0.040
run-0.040
Token.
Token ID13
Feature activation+5.88e-01

Pos logit contributions
ics+0.057
ars+0.052
 top+0.051
he+0.049
 Higher+0.049
Neg logit contributions
 policy-0.067
run-0.067
arter-0.066
 polar-0.065
asured-0.061
Token They
Token ID1119
Feature activation-1.75e+00

Pos logit contributions
arter+0.021
 policy+0.021
asured+0.020
 polar+0.020
run+0.019
Neg logit contributions
 top-0.007
he-0.006
ics-0.006
ars-0.006
 Higher-0.006
Token knew
Token ID2993
Feature activation+1.20e+00

Pos logit contributions
ics+0.042
ars+0.037
 top+0.037
he+0.037
vation+0.034
Neg logit contributions
 polar-0.040
arter-0.039
run-0.038
 policy-0.038
asured-0.038
Token pictures
Token ID5986
Feature activation-2.17e+00

Pos logit contributions
arter+0.002
 policy+0.001
 polar+0.001
run+0.001
asured+0.001
Neg logit contributions
ics-0.001
 top-0.001
ars-0.001
he-0.001
case-0.001
Token but
Token ID475
Feature activation+3.75e-01

Pos logit contributions
ics+0.050
ars+0.045
 top+0.043
he+0.042
 Higher+0.040
Neg logit contributions
run-0.052
arter-0.052
 policy-0.051
 polar-0.048
asured-0.047
Token the
Token ID262
Feature activation-2.01e+00

Pos logit contributions
arter+0.012
 polar+0.012
 policy+0.011
asured+0.011
 former+0.011
Neg logit contributions
ics-0.006
 top-0.006
he-0.005
't-0.005
case-0.005
Token park
Token ID3952
Feature activation-1.47e-01

Pos logit contributions
ics+0.042
he+0.038
ars+0.038
hed+0.033
case+0.033
Neg logit contributions
arter-0.054
 policy-0.054
 polar-0.050
 rightful-0.049
run-0.049
Token keeper
Token ID28356
Feature activation+1.01e+00

Pos logit contributions
ics+0.003
 top+0.003
he+0.002
case+0.002
ars+0.002
Neg logit contributions
arter-0.004
 polar-0.004
 policy-0.004
run-0.004
asured-0.004
Token didn
Token ID1422
Feature activation-6.34e+00

Pos logit contributions
arter+0.029
 policy+0.027
 polar+0.027
run+0.026
asured+0.025
Neg logit contributions
ics-0.020
he-0.019
ars-0.018
 top-0.018
case-0.017
Token't
Token ID470
Feature activation+2.47e-01

Pos logit contributions
ars+0.107
ics+0.105
iece+0.101
amed+0.096
 headset+0.089
Neg logit contributions
 policy-0.171
run-0.162
arter-0.161
 Bobby-0.158
 Joey-0.149
Token believe
Token ID1975
Feature activation+2.12e+00

Pos logit contributions
arter+0.007
 policy+0.007
 polar+0.007
run+0.007
asured+0.006
Neg logit contributions
ics-0.005
he-0.004
ars-0.004
 top-0.004
case-0.004
Token him
Token ID683
Feature activation+1.14e+00

Pos logit contributions
arter+0.062
 former+0.061
 policy+0.061
 polar+0.060
asured+0.059
Neg logit contributions
ics-0.037
 top-0.035
he-0.035
ars-0.029
't-0.028
Token.
Token ID13
Feature activation-1.40e-01

Pos logit contributions
arter+0.035
 policy+0.034
 polar+0.033
run+0.033
asured+0.032
Neg logit contributions
ics-0.020
he-0.018
ars-0.017
 top-0.017
case-0.016
Token
Token ID220
Feature activation-3.38e-01

Pos logit contributions
ics+0.002
he+0.002
ars+0.002
 top+0.002
case+0.002
Neg logit contributions
arter-0.005
 policy-0.004
 polar-0.004
asured-0.004
run-0.004
Token told
Token ID1297
Feature activation+2.08e-01

Pos logit contributions
arter+0.032
 policy+0.029
run+0.028
 polar+0.028
ocol+0.027
Neg logit contributions
ics-0.020
he-0.018
ars-0.017
 top-0.017
case-0.017
Token him
Token ID683
Feature activation-8.38e-01

Pos logit contributions
arter+0.005
 polar+0.005
 policy+0.004
run+0.004
asured+0.004
Neg logit contributions
ics-0.005
ars-0.005
he-0.005
case-0.005
 top-0.004
Token,
Token ID11
Feature activation-7.97e-01

Pos logit contributions
ics+0.015
he+0.013
 top+0.013
ars+0.012
case+0.012
Neg logit contributions
arter-0.026
 policy-0.024
 polar-0.024
run-0.024
asured-0.024
Token "
Token ID366
Feature activation-1.46e+00

Pos logit contributions
ics+0.015
he+0.014
 top+0.013
ars+0.013
case+0.012
Neg logit contributions
arter-0.024
 policy-0.022
 polar-0.022
run-0.021
asured-0.021
Tokenwe
Token ID732
Feature activation-2.87e+00

Pos logit contributions
ics+0.031
ars+0.026
he+0.026
 top+0.026
case+0.025
Neg logit contributions
arter-0.040
 policy-0.039
 polar-0.038
run-0.038
asured-0.037
Token don
Token ID836
Feature activation-6.32e+00

Pos logit contributions
ics+0.067
he+0.060
ars+0.059
 top+0.059
case+0.056
Neg logit contributions
arter-0.072
 policy-0.068
 polar-0.067
run-0.066
asured-0.064
Token't
Token ID470
Feature activation+1.96e-01

Pos logit contributions
ars+0.096
iece+0.094
ics+0.093
amed+0.083
 higher+0.082
Neg logit contributions
 policy-0.173
arter-0.171
run-0.165
 Bobby-0.164
 importantly-0.156
Token buy
Token ID2822
Feature activation+1.46e-01

Pos logit contributions
arter+0.006
 polar+0.006
run+0.006
 policy+0.005
asured+0.005
Neg logit contributions
ics-0.003
he-0.003
ars-0.003
 top-0.003
case-0.003
Token things
Token ID1243
Feature activation-6.17e-01

Pos logit contributions
arter+0.005
 policy+0.004
 polar+0.004
run+0.004
asured+0.004
Neg logit contributions
ics-0.003
 top-0.002
he-0.002
ars-0.002
 rides-0.002
Token from
Token ID422
Feature activation+7.48e-01

Pos logit contributions
ics+0.011
he+0.010
 top+0.009
ars+0.009
case+0.009
Neg logit contributions
arter-0.020
 polar-0.019
 policy-0.018
run-0.018
asured-0.018
Token the
Token ID262
Feature activation-3.02e+00

Pos logit contributions
arter+0.025
run+0.024
 policy+0.023
 polar+0.023
 importantly+0.022
Neg logit contributions
ics-0.012
 top-0.011
he-0.009
 rides-0.008
't-0.008
TokenThey
Token ID2990
Feature activation-2.35e+00

Pos logit contributions
 polar+0.036
 policy+0.036
arter+0.036
run+0.035
asured+0.034
Neg logit contributions
ics-0.022
he-0.018
 top-0.018
case-0.017
ong-0.016
Token learned
Token ID4499
Feature activation-5.69e-01

Pos logit contributions
ics+0.056
ars+0.049
he+0.048
 top+0.048
case+0.045
Neg logit contributions
 polar-0.056
arter-0.053
asured-0.051
 policy-0.051
run-0.050
Token that
Token ID326
Feature activation+8.13e-01

Pos logit contributions
ics+0.009
 top+0.008
he+0.007
ars+0.007
case+0.007
Neg logit contributions
arter-0.019
 policy-0.018
run-0.018
 polar-0.017
asured-0.017
Token snow
Token ID6729
Feature activation-1.69e+00

Pos logit contributions
arter+0.024
run+0.023
 policy+0.021
 polar+0.021
asured+0.021
Neg logit contributions
ics-0.016
 top-0.015
he-0.013
ars-0.013
't-0.012
Tokenmen
Token ID3653
Feature activation-5.16e-01

Pos logit contributions
ics+0.031
he+0.028
ars+0.027
 top+0.026
case+0.025
Neg logit contributions
arter-0.050
 policy-0.048
 polar-0.047
run-0.047
asured-0.046
Token don
Token ID836
Feature activation-6.31e+00

Pos logit contributions
ics+0.009
he+0.008
ars+0.008
 top+0.007
case+0.007
Neg logit contributions
arter-0.016
 policy-0.015
 polar-0.015
run-0.015
asured-0.015
Token't
Token ID470
Feature activation+6.86e-01

Pos logit contributions
iece+0.108
ics+0.095
ars+0.093
 Higher+0.092
 headset+0.092
Neg logit contributions
 policy-0.154
arter-0.152
 Bobby-0.146
run-0.139
 polar-0.137
Token last
Token ID938
Feature activation-6.59e-01

Pos logit contributions
arter+0.021
 policy+0.020
run+0.019
 polar+0.019
asured+0.019
Neg logit contributions
ics-0.012
he-0.012
ars-0.011
 top-0.011
't-0.010
Token forever
Token ID8097
Feature activation+2.01e+00

Pos logit contributions
 top+0.010
ics+0.008
ars+0.008
'+0.008
 higher+0.007
Neg logit contributions
arter-0.022
omo-0.019
 polar-0.019
asured-0.019
ocol-0.019
Token,
Token ID11
Feature activation-1.83e+00

Pos logit contributions
arter+0.071
 polar+0.068
 policy+0.066
run+0.066
asured+0.065
Neg logit contributions
ics-0.026
ars-0.023
he-0.023
 top-0.022
'-0.020
Token but
Token ID475
Feature activation+7.05e-01

Pos logit contributions
ics+0.028
 top+0.027
he+0.025
ars+0.025
 higher+0.023
Neg logit contributions
arter-0.058
 policy-0.054
 polar-0.053
run-0.053
asured-0.051
Token.
Token ID13
Feature activation+9.49e-01

Pos logit contributions
arter+0.059
run+0.056
 policy+0.056
 polar+0.056
asured+0.054
Neg logit contributions
ics-0.044
ars-0.039
he-0.038
 top-0.035
case-0.034
Token They
Token ID1119
Feature activation-1.84e+00

Pos logit contributions
arter+0.026
 policy+0.025
asured+0.024
 polar+0.023
 possession+0.022
Neg logit contributions
 top-0.019
he-0.019
ics-0.018
case-0.018
ars-0.017
Token wear
Token ID5806
Feature activation-4.14e-01

Pos logit contributions
ics+0.046
vation+0.040
ceed+0.036
ers+0.036
ars+0.036
Neg logit contributions
 polar-0.040
asured-0.037
 Bobby-0.037
run-0.036
 Joey-0.036
Token helmets
Token ID28359
Feature activation-5.57e-03

Pos logit contributions
ics+0.010
he+0.009
vation+0.009
ars+0.009
case+0.009
Neg logit contributions
 polar-0.009
run-0.009
 Bobby-0.008
 policy-0.008
asured-0.008
Token and
Token ID290
Feature activation-6.39e-01

Pos logit contributions
ars+0.000
ics+0.000
vation+0.000
ceed+0.000
he+0.000
Neg logit contributions
run-0.000
asured-0.000
 policy-0.000
arter-0.000
 polar-0.000
Token pads
Token ID21226
Feature activation-6.31e+00

Pos logit contributions
ics+0.011
he+0.010
ars+0.010
 top+0.010
case+0.009
Neg logit contributions
arter-0.020
 policy-0.019
 polar-0.018
run-0.018
asured-0.018
Token to
Token ID284
Feature activation+9.56e-01

Pos logit contributions
vation+0.166
ics+0.161
ceed+0.153
Ta+0.150
Georgia+0.139
Neg logit contributions
run-0.116
 policy-0.104
 polar-0.094
 theirs-0.089
asured-0.085
Token be
Token ID307
Feature activation+1.28e+00

Pos logit contributions
arter+0.029
 polar+0.027
 policy+0.027
run+0.027
asured+0.026
Neg logit contributions
ics-0.017
he-0.015
ars-0.015
 top-0.014
case-0.014
Token safe
Token ID3338
Feature activation+6.87e-01

Pos logit contributions
arter+0.039
 policy+0.037
 polar+0.036
run+0.036
asured+0.035
Neg logit contributions
ics-0.023
he-0.020
ars-0.020
 top-0.019
case-0.018
Token.
Token ID13
Feature activation+8.15e-01

Pos logit contributions
arter+0.024
run+0.022
 polar+0.022
 policy+0.022
ocol+0.021
Neg logit contributions
ics-0.010
he-0.009
 top-0.008
ars-0.008
case-0.007
Token
Token ID198
Feature activation+4.24e-01

Pos logit contributions
arter+0.031
 policy+0.029
asured+0.029
 polar+0.028
run+0.027
Neg logit contributions
 top-0.009
he-0.008
ics-0.008
case-0.007
 Higher-0.007
Token mad
Token ID8805
Feature activation-2.39e-01

Pos logit contributions
 policy+0.036
arter+0.035
 former+0.032
 polar+0.032
ocol+0.031
Neg logit contributions
ics-0.016
ars-0.014
 higher-0.014
 top-0.013
amed-0.013
Token at
Token ID379
Feature activation-1.69e-01

Pos logit contributions
ics+0.005
he+0.004
ars+0.004
 top+0.004
case+0.004
Neg logit contributions
arter-0.007
 polar-0.007
 policy-0.006
run-0.006
asured-0.006
Token us
Token ID514
Feature activation-1.59e+00

Pos logit contributions
ics+0.003
 top+0.003
he+0.003
ars+0.003
case+0.002
Neg logit contributions
arter-0.005
 policy-0.005
 polar-0.005
run-0.005
asured-0.005
Token.
Token ID13
Feature activation+4.45e-01

Pos logit contributions
ics+0.034
ars+0.031
he+0.030
 top+0.029
case+0.028
Neg logit contributions
arter-0.041
 policy-0.040
 polar-0.039
run-0.039
asured-0.038
Token Please
Token ID4222
Feature activation-1.50e+00

Pos logit contributions
arter+0.012
asured+0.010
 polar+0.010
run+0.010
 policy+0.010
Neg logit contributions
ars-0.009
ics-0.009
 top-0.009
't-0.009
case-0.009
Token don
Token ID836
Feature activation-6.31e+00

Pos logit contributions
ics+0.032
he+0.028
ars+0.028
 top+0.027
case+0.026
Neg logit contributions
arter-0.041
 policy-0.039
 polar-0.039
run-0.038
asured-0.037
Token't
Token ID470
Feature activation+4.60e-01

Pos logit contributions
ars+0.095
iece+0.093
ics+0.087
amed+0.082
 higher+0.080
Neg logit contributions
arter-0.178
 policy-0.175
 Bobby-0.166
run-0.161
 polar-0.161
Token hurt
Token ID5938
Feature activation+4.86e-01

Pos logit contributions
arter+0.014
run+0.014
 policy+0.014
 polar+0.013
asured+0.013
Neg logit contributions
ics-0.008
 top-0.007
he-0.007
ars-0.007
't-0.006
Token us
Token ID514
Feature activation-1.27e+00

Pos logit contributions
arter+0.014
 policy+0.013
 polar+0.013
run+0.013
asured+0.013
Neg logit contributions
ics-0.010
 top-0.009
he-0.008
ars-0.008
case-0.007
Token."
Token ID526
Feature activation-2.60e+00

Pos logit contributions
ics+0.029
ars+0.026
he+0.025
 top+0.025
case+0.024
Neg logit contributions
arter-0.032
 policy-0.031
 polar-0.030
run-0.030
asured-0.029
Token
Token ID198
Feature activation-6.33e-01

Pos logit contributions
ics+0.041
he+0.035
ars+0.034
 top+0.034
case+0.031
Neg logit contributions
arter-0.085
 policy-0.082
 polar-0.081
run-0.080
asured-0.078
Token,
Token ID11
Feature activation-1.48e+00

Pos logit contributions
ics+0.011
he+0.010
ars+0.010
 top+0.009
case+0.009
Neg logit contributions
arter-0.021
 policy-0.020
 polar-0.020
run-0.020
asured-0.019
Token lion
Token ID18744
Feature activation-1.30e+00

Pos logit contributions
 top+0.026
ics+0.026
ars+0.024
he+0.024
case+0.023
Neg logit contributions
 policy-0.041
arter-0.041
asured-0.040
run-0.040
 former-0.038
Token.
Token ID13
Feature activation+9.40e-01

Pos logit contributions
ics+0.022
ars+0.019
he+0.019
 top+0.018
case+0.017
Neg logit contributions
arter-0.041
 policy-0.039
run-0.038
 polar-0.038
asured-0.038
Token But
Token ID887
Feature activation+1.17e+00

Pos logit contributions
arter+0.027
asured+0.026
 policy+0.025
run+0.025
 polar+0.024
Neg logit contributions
ars-0.017
ics-0.016
 top-0.015
 Higher-0.015
he-0.015
Token please
Token ID3387
Feature activation-5.88e-02

Pos logit contributions
arter+0.040
run+0.039
asured+0.038
 policy+0.038
 polar+0.038
Neg logit contributions
 top-0.015
ics-0.014
't-0.012
ars-0.012
 fly-0.012
Token don
Token ID836
Feature activation-6.27e+00

Pos logit contributions
ics+0.001
he+0.001
ars+0.001
 top+0.001
case+0.001
Neg logit contributions
arter-0.002
 policy-0.002
 polar-0.002
run-0.001
asured-0.001
Token't
Token ID470
Feature activation+6.77e-03

Pos logit contributions
ars+0.091
iece+0.090
amed+0.089
ics+0.089
hed+0.078
Neg logit contributions
arter-0.184
 policy-0.178
run-0.167
 Bobby-0.165
 polar-0.165
Token eat
Token ID4483
Feature activation-1.40e+00

Pos logit contributions
arter+0.000
run+0.000
 policy+0.000
asured+0.000
 polar+0.000
Neg logit contributions
ics-0.000
 top-0.000
he-0.000
ars-0.000
't-0.000
Token our
Token ID674
Feature activation+2.80e-01

Pos logit contributions
ics+0.027
he+0.023
 top+0.023
ars+0.023
case+0.020
Neg logit contributions
arter-0.042
 policy-0.040
 polar-0.039
run-0.039
asured-0.038
Token pets
Token ID17252
Feature activation-1.33e+00

Pos logit contributions
arter+0.008
 policy+0.007
 polar+0.007
run+0.007
asured+0.007
Neg logit contributions
ics-0.006
ars-0.005
he-0.005
 top-0.005
case-0.005
Token.
Token ID13
Feature activation+5.62e-01

Pos logit contributions
ics+0.028
he+0.025
ars+0.025
 top+0.024
case+0.023
Neg logit contributions
arter-0.037
 policy-0.035
 polar-0.034
run-0.034
asured-0.033
Token.
Token ID13
Feature activation+2.97e-01

Pos logit contributions
ics+0.038
 top+0.034
he+0.033
ars+0.031
case+0.029
Neg logit contributions
arter-0.072
 policy-0.068
run-0.066
 polar-0.066
asured-0.066
Token
Token ID198
Feature activation+1.42e+00

Pos logit contributions
arter+0.010
 policy+0.009
 polar+0.009
asured+0.009
run+0.009
Neg logit contributions
ics-0.004
 top-0.004
he-0.004
ars-0.004
case-0.004
Token"
Token ID1
Feature activation-3.60e-01

Pos logit contributions
arter+0.044
 policy+0.042
run+0.042
 polar+0.042
asured+0.040
Neg logit contributions
ics-0.024
he-0.020
ars-0.020
 top-0.019
case-0.018
TokenFl
Token ID7414
Feature activation+5.90e-01

Pos logit contributions
ics+0.006
he+0.005
ars+0.005
 top+0.005
case+0.004
Neg logit contributions
arter-0.012
 policy-0.011
 polar-0.011
run-0.011
asured-0.011
Tokenowers
Token ID3618
Feature activation+1.61e-01

Pos logit contributions
arter+0.015
 policy+0.013
run+0.013
 polar+0.013
asured+0.013
Neg logit contributions
ics-0.014
he-0.013
ars-0.012
 top-0.012
case-0.012
Token don
Token ID836
Feature activation-6.27e+00

Pos logit contributions
arter+0.005
 policy+0.005
 polar+0.005
run+0.005
asured+0.005
Neg logit contributions
ics-0.002
he-0.002
ars-0.002
 top-0.002
case-0.002
Token't
Token ID470
Feature activation+5.97e-01

Pos logit contributions
iece+0.086
ars+0.085
ics+0.082
 Higher+0.073
 headset+0.072
Neg logit contributions
arter-0.182
 policy-0.179
 Bobby-0.171
run-0.169
 Joey-0.168
Token have
Token ID423
Feature activation-3.23e-01

Pos logit contributions
arter+0.017
 polar+0.016
 policy+0.016
run+0.016
asured+0.016
Neg logit contributions
ics-0.012
he-0.011
ars-0.010
 top-0.010
case-0.009
Token a
Token ID257
Feature activation+7.82e-01

Pos logit contributions
ics+0.006
 top+0.005
ars+0.005
he+0.005
case+0.005
Neg logit contributions
arter-0.010
 policy-0.009
 polar-0.009
run-0.009
asured-0.009
Token taste
Token ID6938
Feature activation+2.42e+00

Pos logit contributions
arter+0.023
 policy+0.022
asured+0.021
 polar+0.021
run+0.021
Neg logit contributions
ics-0.014
 top-0.014
ars-0.013
he-0.013
case-0.012
Token,"
Token ID553
Feature activation-1.02e+00

Pos logit contributions
arter+0.077
 polar+0.073
 policy+0.073
run+0.072
asured+0.070
Neg logit contributions
ics-0.041
he-0.037
 top-0.035
ars-0.035
case-0.031
Token.
Token ID13
Feature activation+8.57e-01

Pos logit contributions
ics+0.042
he+0.037
 top+0.036
ars+0.035
case+0.033
Neg logit contributions
arter-0.059
 policy-0.057
 polar-0.056
run-0.055
asured-0.053
Token But
Token ID887
Feature activation+8.85e-01

Pos logit contributions
arter+0.022
 policy+0.021
 polar+0.020
asured+0.020
run+0.019
Neg logit contributions
ics-0.019
ars-0.018
 top-0.018
he-0.017
case-0.016
Token be
Token ID307
Feature activation+1.17e+00

Pos logit contributions
arter+0.029
 policy+0.027
 polar+0.027
run+0.026
asured+0.026
Neg logit contributions
ics-0.014
 top-0.013
ars-0.012
he-0.011
't-0.011
Token careful
Token ID8161
Feature activation-1.96e+00

Pos logit contributions
arter+0.026
run+0.025
 polar+0.025
asured+0.025
 policy+0.024
Neg logit contributions
ics-0.031
he-0.029
case-0.026
 top-0.026
ars-0.026
Token,
Token ID11
Feature activation-1.60e+00

Pos logit contributions
ics+0.040
he+0.039
ars+0.036
 top+0.035
hed+0.034
Neg logit contributions
arter-0.052
 policy-0.050
run-0.050
 polar-0.048
asured-0.047
Token don
Token ID836
Feature activation-6.27e+00

Pos logit contributions
ics+0.023
 top+0.023
ars+0.022
he+0.021
'+0.020
Neg logit contributions
arter-0.050
 policy-0.048
asured-0.047
 polar-0.046
run-0.045
Token't
Token ID470
Feature activation+5.31e-01

Pos logit contributions
ics+0.080
ars+0.070
amed+0.068
iece+0.065
he+0.062
Neg logit contributions
arter-0.203
 policy-0.201
run-0.196
 polar-0.184
 Bobby-0.181
Token hit
Token ID2277
Feature activation-1.73e+00

Pos logit contributions
arter+0.018
 policy+0.017
run+0.017
asured+0.017
 polar+0.017
Neg logit contributions
ics-0.008
 top-0.007
ars-0.006
he-0.006
 swing-0.006
Token your
Token ID534
Feature activation+1.06e+00

Pos logit contributions
ics+0.031
 top+0.028
ars+0.027
he+0.027
case+0.023
Neg logit contributions
arter-0.053
 policy-0.052
 polar-0.049
run-0.049
asured-0.049
Token head
Token ID1182
Feature activation+1.14e+00

Pos logit contributions
 policy+0.028
arter+0.027
run+0.027
 importantly+0.026
asured+0.025
Neg logit contributions
 top-0.023
ics-0.022
ars-0.021
he-0.018
 higher-0.018
Token on
Token ID319
Feature activation+3.16e-01

Pos logit contributions
arter+0.033
 policy+0.031
 polar+0.031
run+0.031
asured+0.030
Neg logit contributions
ics-0.022
he-0.020
ars-0.019
 top-0.019
case-0.018
TokenNo
Token ID2949
Feature activation-1.25e+00

Pos logit contributions
ics+0.016
he+0.014
ars+0.014
 top+0.013
case+0.012
Neg logit contributions
arter-0.025
 policy-0.023
 polar-0.023
run-0.023
asured-0.022
Token,
Token ID11
Feature activation-1.76e+00

Pos logit contributions
 top+0.014
ics+0.014
ars+0.012
he+0.012
't+0.010
Neg logit contributions
arter-0.045
asured-0.044
 polar-0.043
run-0.043
 former-0.042
Token Anna
Token ID11735
Feature activation-1.40e+00

Pos logit contributions
 top+0.029
ics+0.028
he+0.028
ars+0.026
case+0.026
Neg logit contributions
arter-0.054
asured-0.051
 polar-0.051
run-0.050
 former-0.050
Token,
Token ID11
Feature activation-1.65e+00

Pos logit contributions
ics+0.019
ars+0.017
he+0.016
 top+0.016
case+0.014
Neg logit contributions
arter-0.048
 polar-0.046
 policy-0.045
asured-0.045
run-0.045
Token please
Token ID3387
Feature activation+1.92e-01

Pos logit contributions
ics+0.032
he+0.029
 top+0.029
ars+0.028
case+0.026
Neg logit contributions
arter-0.047
 policy-0.045
 polar-0.044
run-0.044
asured-0.043
Token don
Token ID836
Feature activation-6.26e+00

Pos logit contributions
arter+0.005
 policy+0.005
 polar+0.005
run+0.005
asured+0.005
Neg logit contributions
ics-0.004
he-0.004
ars-0.003
 top-0.003
case-0.003
Token't
Token ID470
Feature activation+4.99e-01

Pos logit contributions
ics+0.092
ars+0.085
iece+0.081
amed+0.078
 higher+0.071
Neg logit contributions
 policy-0.187
arter-0.184
run-0.179
 Bobby-0.169
 importantly-0.163
Token tell
Token ID1560
Feature activation+1.20e+00

Pos logit contributions
arter+0.016
 polar+0.015
run+0.015
 policy+0.015
asured+0.015
Neg logit contributions
ics-0.008
he-0.007
ars-0.007
 top-0.007
case-0.006
Token mom
Token ID1995
Feature activation+5.05e-01

Pos logit contributions
arter+0.031
 policy+0.029
 polar+0.029
run+0.029
asured+0.027
Neg logit contributions
ics-0.027
he-0.024
ars-0.024
 top-0.023
case-0.022
Token.
Token ID13
Feature activation+6.75e-01

Pos logit contributions
arter+0.015
 policy+0.014
 polar+0.014
run+0.014
asured+0.014
Neg logit contributions
ics-0.010
he-0.008
ars-0.008
 top-0.008
case-0.008
Token She
Token ID1375
Feature activation+1.37e+00

Pos logit contributions
arter+0.018
asured+0.017
 polar+0.017
 policy+0.017
run+0.017
Neg logit contributions
ics-0.013
 top-0.013
case-0.013
ars-0.013
he-0.012
Token "
Token ID366
Feature activation-1.93e+00

Pos logit contributions
ics+0.035
he+0.031
ars+0.030
 top+0.030
case+0.028
Neg logit contributions
arter-0.044
 policy-0.042
 polar-0.042
run-0.041
asured-0.040
TokenTom
Token ID13787
Feature activation+3.36e-02

Pos logit contributions
ics+0.037
he+0.032
ars+0.032
 top+0.031
case+0.029
Neg logit contributions
arter-0.057
 policy-0.055
 polar-0.054
run-0.053
asured-0.052
Tokenmy
Token ID1820
Feature activation-8.18e-01

Pos logit contributions
arter+0.001
 policy+0.001
 polar+0.001
run+0.001
asured+0.001
Neg logit contributions
ics-0.001
he-0.001
ars-0.001
 top-0.001
case-0.001
Token,
Token ID11
Feature activation-1.47e+00

Pos logit contributions
ics+0.018
he+0.017
ars+0.016
 top+0.015
case+0.015
Neg logit contributions
arter-0.021
 policy-0.021
run-0.020
 polar-0.019
asured-0.018
Token please
Token ID3387
Feature activation+2.27e-01

Pos logit contributions
ics+0.026
he+0.024
 top+0.024
ars+0.023
case+0.021
Neg logit contributions
arter-0.044
 policy-0.041
 polar-0.041
asured-0.041
run-0.041
Token don
Token ID836
Feature activation-6.26e+00

Pos logit contributions
arter+0.005
 policy+0.005
run+0.005
asured+0.005
 polar+0.005
Neg logit contributions
ics-0.006
he-0.005
case-0.005
ars-0.005
 top-0.005
Token't
Token ID470
Feature activation+4.87e-01

Pos logit contributions
ics+0.101
ars+0.096
amed+0.092
iece+0.092
 higher+0.080
Neg logit contributions
 policy-0.174
arter-0.169
run-0.165
 Bobby-0.158
 importantly-0.155
Token be
Token ID307
Feature activation+1.26e+00

Pos logit contributions
arter+0.014
 policy+0.013
 polar+0.013
run+0.013
asured+0.013
Neg logit contributions
ics-0.010
he-0.009
ars-0.008
 top-0.008
case-0.008
Token noisy
Token ID31210
Feature activation-2.84e+00

Pos logit contributions
arter+0.032
 policy+0.031
 polar+0.030
run+0.029
asured+0.028
Neg logit contributions
ics-0.030
he-0.027
ars-0.026
 top-0.026
case-0.024
Token and
Token ID290
Feature activation-1.10e+00

Pos logit contributions
ics+0.041
he+0.037
ars+0.034
 top+0.034
case+0.031
Neg logit contributions
arter-0.095
 polar-0.090
 policy-0.090
run-0.088
asured-0.086
Token stubborn
Token ID24444
Feature activation-1.96e+00

Pos logit contributions
ics+0.021
ars+0.020
 top+0.019
he+0.019
case+0.017
Neg logit contributions
arter-0.031
 policy-0.031
 polar-0.029
asured-0.028
run-0.028
Token they
Token ID484
Feature activation-2.16e+00

Pos logit contributions
ics+0.026
ars+0.024
he+0.023
 top+0.023
case+0.020
Neg logit contributions
arter-0.048
 policy-0.044
asured-0.044
run-0.044
 polar-0.042
Token said
Token ID531
Feature activation-6.75e-01

Pos logit contributions
ics+0.053
 top+0.049
he+0.047
ars+0.047
 Higher+0.045
Neg logit contributions
 polar-0.047
run-0.044
arter-0.044
 policy-0.043
asured-0.042
Token.
Token ID13
Feature activation+3.74e-01

Pos logit contributions
ics+0.016
he+0.015
ars+0.015
 top+0.014
case+0.014
Neg logit contributions
run-0.016
 policy-0.015
arter-0.015
 polar-0.015
asured-0.014
Token "
Token ID366
Feature activation-2.16e+00

Pos logit contributions
arter+0.010
 policy+0.010
 polar+0.010
run+0.010
asured+0.009
Neg logit contributions
ics-0.008
he-0.007
ars-0.007
 top-0.007
case-0.006
TokenBut
Token ID1537
Feature activation+6.40e-01

Pos logit contributions
ics+0.049
 top+0.042
case+0.040
ong+0.040
ars+0.038
Neg logit contributions
 policy-0.050
arter-0.050
asured-0.049
 polar-0.048
run-0.048
Token don
Token ID836
Feature activation-6.26e+00

Pos logit contributions
arter+0.021
run+0.020
 policy+0.020
 polar+0.019
asured+0.019
Neg logit contributions
ics-0.009
 top-0.009
't-0.008
ars-0.007
'-0.007
Token't
Token ID470
Feature activation-3.66e-01

Pos logit contributions
ics+0.090
iece+0.090
ars+0.090
amed+0.087
hed+0.079
Neg logit contributions
arter-0.180
 policy-0.179
run-0.169
 Bobby-0.167
 importantly-0.162
Token take
Token ID1011
Feature activation+1.91e-03

Pos logit contributions
ics+0.006
 top+0.006
he+0.005
't+0.005
 swing+0.005
Neg logit contributions
arter-0.012
run-0.011
asured-0.011
 policy-0.011
 polar-0.011
Token our
Token ID674
Feature activation-4.70e-01

Pos logit contributions
arter+0.000
 policy+0.000
 polar+0.000
run+0.000
asured+0.000
Neg logit contributions
ics-0.000
 top-0.000
ars-0.000
he-0.000
case-0.000
Token rope
Token ID17182
Feature activation-1.84e+00

Pos logit contributions
ics+0.011
he+0.010
case+0.009
hed+0.009
amed+0.009
Neg logit contributions
 policy-0.012
run-0.011
arter-0.011
 polar-0.011
 rightful-0.010
Token again
Token ID757
Feature activation+2.00e+00

Pos logit contributions
ics+0.039
ars+0.035
he+0.035
 top+0.033
case+0.032
Neg logit contributions
arter-0.048
 policy-0.047
 polar-0.046
run-0.046
asured-0.045
Token "
Token ID366
Feature activation-1.04e+00

Pos logit contributions
ics+0.023
he+0.022
ars+0.021
case+0.020
vation+0.019
Neg logit contributions
arter-0.018
run-0.017
 polar-0.017
 policy-0.016
 Bobby-0.016
TokenSe
Token ID4653
Feature activation+1.83e+00

Pos logit contributions
ics+0.020
he+0.017
ars+0.017
 top+0.017
case+0.015
Neg logit contributions
arter-0.031
 policy-0.029
 polar-0.029
run-0.029
asured-0.028
Tokenas
Token ID292
Feature activation-1.79e+00

Pos logit contributions
run+0.047
arter+0.045
ocol+0.043
asured+0.042
 polar+0.039
Neg logit contributions
ics-0.043
 top-0.038
he-0.038
case-0.037
't-0.032
Tokenhell
Token ID12758
Feature activation+1.08e+00

Pos logit contributions
ics+0.027
he+0.023
ars+0.023
 top+0.022
case+0.020
Neg logit contributions
arter-0.059
 policy-0.056
 polar-0.056
run-0.056
asured-0.054
Tokens
Token ID82
Feature activation+1.11e-01

Pos logit contributions
arter+0.027
run+0.025
 policy+0.025
 polar+0.023
asured+0.022
Neg logit contributions
ics-0.024
he-0.022
ars-0.021
 top-0.021
case-0.021
Token don
Token ID836
Feature activation-6.25e+00

Pos logit contributions
arter+0.003
 policy+0.003
 polar+0.003
run+0.003
asured+0.003
Neg logit contributions
ics-0.002
he-0.002
ars-0.002
 top-0.002
case-0.002
Token't
Token ID470
Feature activation+6.77e-01

Pos logit contributions
ics+0.087
iece+0.086
ars+0.085
 Higher+0.078
 headset+0.076
Neg logit contributions
arter-0.181
 policy-0.177
run-0.170
 Bobby-0.162
 polar-0.158
Token taste
Token ID6938
Feature activation+1.20e+00

Pos logit contributions
arter+0.020
 policy+0.019
run+0.019
 polar+0.019
asured+0.018
Neg logit contributions
ics-0.012
he-0.011
ars-0.011
 top-0.011
't-0.009
Token good
Token ID922
Feature activation-9.60e-01

Pos logit contributions
arter+0.037
 policy+0.036
 polar+0.035
run+0.034
asured+0.033
Neg logit contributions
ics-0.020
 top-0.019
ars-0.018
he-0.018
g-0.015
Token,
Token ID11
Feature activation-7.25e-01

Pos logit contributions
ics+0.013
 top+0.011
ars+0.011
he+0.010
hed+0.010
Neg logit contributions
arter-0.032
 polar-0.031
run-0.030
 former-0.029
 rightful-0.029
Token Lily
Token ID20037
Feature activation-6.94e-01

Pos logit contributions
 top+0.015
ics+0.014
he+0.014
case+0.014
ars+0.013
Neg logit contributions
arter-0.020
 policy-0.018
asured-0.018
 polar-0.017
ocol-0.017
Token Please
Token ID4222
Feature activation-9.47e-01

Pos logit contributions
arter+0.039
 policy+0.036
 polar+0.035
asured+0.034
run+0.033
Neg logit contributions
ics-0.028
 top-0.028
ars-0.028
case-0.027
he-0.027
Token help
Token ID1037
Feature activation+1.35e+00

Pos logit contributions
ics+0.021
he+0.019
ars+0.018
case+0.018
 top+0.017
Neg logit contributions
arter-0.024
 policy-0.023
run-0.022
 polar-0.022
asured-0.021
Token yourself
Token ID3511
Feature activation+1.07e+00

Pos logit contributions
arter+0.037
 policy+0.035
 polar+0.035
run+0.034
asured+0.033
Neg logit contributions
ics-0.028
he-0.025
ars-0.025
 top-0.024
case-0.023
Token but
Token ID475
Feature activation+1.25e+00

Pos logit contributions
arter+0.030
 policy+0.029
 polar+0.028
run+0.028
asured+0.027
Neg logit contributions
ics-0.021
he-0.019
ars-0.019
 top-0.018
case-0.017
Token just
Token ID655
Feature activation+3.25e+00

Pos logit contributions
arter+0.043
 polar+0.041
run+0.040
asured+0.040
 policy+0.039
Neg logit contributions
ics-0.017
 top-0.017
't-0.016
'-0.014
 fly-0.014
Token don
Token ID836
Feature activation-6.25e+00

Pos logit contributions
arter+0.103
 policy+0.100
 polar+0.100
run+0.092
asured+0.092
Neg logit contributions
ics-0.051
 top-0.047
ars-0.041
 higher-0.041
 fly-0.041
Token't
Token ID470
Feature activation+8.76e-03

Pos logit contributions
iece+0.089
ars+0.088
ics+0.082
amed+0.077
 Higher+0.075
Neg logit contributions
arter-0.185
 policy-0.180
run-0.173
 Bobby-0.167
 Joey-0.163
Token eat
Token ID4483
Feature activation-9.85e-01

Pos logit contributions
arter+0.000
run+0.000
 polar+0.000
 policy+0.000
asured+0.000
Neg logit contributions
ics-0.000
 top-0.000
he-0.000
ars-0.000
't-0.000
Token too
Token ID1165
Feature activation-4.25e-01

Pos logit contributions
ics+0.016
 top+0.015
he+0.013
ars+0.013
case+0.011
Neg logit contributions
arter-0.032
 policy-0.031
 polar-0.030
run-0.030
asured-0.029
Token many
Token ID867
Feature activation-1.09e-01

Pos logit contributions
ics+0.009
he+0.008
ars+0.008
 top+0.008
case+0.007
Neg logit contributions
arter-0.011
 policy-0.011
 polar-0.011
run-0.011
asured-0.010
Token".
Token ID1911
Feature activation-8.43e-01

Pos logit contributions
ics+0.002
 top+0.002
ars+0.001
ers+0.001
he+0.001
Neg logit contributions
arter-0.004
 polar-0.003
 policy-0.003
run-0.003
asured-0.003
Token to
Token ID284
Feature activation-6.63e-01

Pos logit contributions
 top+0.022
ics+0.021
't+0.018
he+0.017
ars+0.016
Neg logit contributions
arter-0.051
 polar-0.046
 policy-0.046
asured-0.045
run-0.045
Token go
Token ID467
Feature activation-9.09e-01

Pos logit contributions
ics+0.010
 top+0.009
he+0.009
case+0.008
ars+0.008
Neg logit contributions
arter-0.022
run-0.021
 policy-0.021
 polar-0.020
asured-0.020
Token home
Token ID1363
Feature activation+1.55e+00

Pos logit contributions
ics+0.015
he+0.014
 top+0.013
't+0.012
case+0.012
Neg logit contributions
arter-0.030
 policy-0.028
 importantly-0.027
run-0.027
ocol-0.026
Token!
Token ID0
Feature activation+5.23e-01

Pos logit contributions
arter+0.045
 polar+0.043
 policy+0.042
run+0.042
asured+0.039
Neg logit contributions
ics-0.031
he-0.029
 top-0.028
ars-0.026
case-0.025
Token Please
Token ID4222
Feature activation-1.24e+00

Pos logit contributions
arter+0.015
 policy+0.014
 polar+0.013
asured+0.013
ocol+0.013
Neg logit contributions
case-0.010
 top-0.010
't-0.009
he-0.009
ics-0.009
Token don
Token ID836
Feature activation-6.23e+00

Pos logit contributions
ics+0.024
he+0.021
ars+0.021
 top+0.020
case+0.019
Neg logit contributions
arter-0.036
 policy-0.035
 polar-0.034
run-0.034
asured-0.033
Token't
Token ID470
Feature activation+4.58e-01

Pos logit contributions
ars+0.090
iece+0.086
ics+0.085
amed+0.076
 higher+0.072
Neg logit contributions
arter-0.179
 policy-0.179
run-0.177
 Bobby-0.174
 polar-0.165
Token make
Token ID787
Feature activation-2.79e+00

Pos logit contributions
arter+0.015
run+0.015
 policy+0.014
asured+0.014
 polar+0.014
Neg logit contributions
ics-0.007
he-0.006
 top-0.006
case-0.006
't-0.006
Token me
Token ID502
Feature activation-9.89e-01

Pos logit contributions
ics+0.059
he+0.055
ars+0.054
 top+0.050
case+0.050
Neg logit contributions
arter-0.074
 polar-0.070
 policy-0.069
run-0.069
asured-0.067
Token!"
Token ID2474
Feature activation-1.62e+00

Pos logit contributions
ics+0.021
he+0.019
ars+0.019
 top+0.019
case+0.018
Neg logit contributions
arter-0.027
 policy-0.025
 polar-0.025
run-0.024
asured-0.024
Token His
Token ID2399
Feature activation+2.34e+00

Pos logit contributions
ics+0.040
he+0.038
 top+0.037
ars+0.036
case+0.034
Neg logit contributions
arter-0.040
 polar-0.037
 policy-0.036
run-0.035
asured-0.035
Token,
Token ID11
Feature activation-8.28e-01

Pos logit contributions
ics+0.008
 top+0.007
ars+0.007
he+0.007
case+0.006
Neg logit contributions
arter-0.020
 policy-0.019
 polar-0.018
asured-0.018
run-0.018
Token "
Token ID366
Feature activation-1.65e+00

Pos logit contributions
ics+0.020
he+0.018
ars+0.018
case+0.017
 top+0.017
Neg logit contributions
arter-0.020
run-0.019
 polar-0.019
 policy-0.019
asured-0.017
TokenNo
Token ID2949
Feature activation-1.10e+00

Pos logit contributions
ics+0.038
he+0.032
ars+0.032
 top+0.032
case+0.031
Neg logit contributions
arter-0.041
 policy-0.041
run-0.040
 polar-0.040
asured-0.039
Token,
Token ID11
Feature activation-1.92e+00

Pos logit contributions
ics+0.012
 top+0.012
ars+0.010
he+0.009
't+0.009
Neg logit contributions
arter-0.040
asured-0.039
 former-0.037
run-0.037
 polar-0.037
Token please
Token ID3387
Feature activation+1.63e-01

Pos logit contributions
 top+0.030
ics+0.029
he+0.027
ars+0.027
't+0.026
Neg logit contributions
arter-0.060
asured-0.058
 former-0.056
 policy-0.055
 polar-0.054
Token don
Token ID836
Feature activation-6.23e+00

Pos logit contributions
arter+0.005
 policy+0.004
 polar+0.004
run+0.004
asured+0.004
Neg logit contributions
ics-0.003
he-0.003
ars-0.003
 top-0.003
case-0.003
Token't
Token ID470
Feature activation+4.51e-01

Pos logit contributions
ics+0.089
ars+0.081
iece+0.080
amed+0.073
 higher+0.069
Neg logit contributions
 policy-0.186
run-0.185
arter-0.184
 Bobby-0.174
 polar-0.171
Token shoot
Token ID2686
Feature activation-8.58e-01

Pos logit contributions
arter+0.014
 policy+0.013
 polar+0.013
run+0.013
asured+0.012
Neg logit contributions
ics-0.008
he-0.007
ars-0.007
 top-0.007
case-0.006
Token me
Token ID502
Feature activation-1.33e-02

Pos logit contributions
ics+0.018
he+0.016
ars+0.016
 top+0.015
case+0.014
Neg logit contributions
arter-0.024
 policy-0.023
 polar-0.023
run-0.022
asured-0.022
Token!
Token ID0
Feature activation+3.76e-01

Pos logit contributions
ics+0.000
he+0.000
ars+0.000
 top+0.000
case+0.000
Neg logit contributions
arter-0.000
 policy-0.000
 polar-0.000
run-0.000
asured-0.000
Token I
Token ID314
Feature activation+8.00e-01

Pos logit contributions
arter+0.011
 policy+0.010
 polar+0.010
asured+0.010
 former+0.009
Neg logit contributions
ics-0.007
't-0.007
ars-0.006
case-0.006
 top-0.006
Token a
Token ID257
Feature activation+4.05e-01

Pos logit contributions
arter+0.070
 policy+0.066
 polar+0.064
run+0.064
asured+0.063
Neg logit contributions
ics-0.051
he-0.046
 top-0.045
ars-0.045
case-0.042
Token secret
Token ID3200
Feature activation+3.51e+00

Pos logit contributions
arter+0.012
 policy+0.011
 polar+0.011
run+0.011
asured+0.010
Neg logit contributions
ics-0.008
he-0.007
ars-0.007
 top-0.007
case-0.006
Token place
Token ID1295
Feature activation+1.95e+00

Pos logit contributions
arter+0.099
 polar+0.092
 policy+0.091
run+0.090
asured+0.089
Neg logit contributions
ics-0.071
he-0.063
 top-0.063
ars-0.061
case-0.061
Token in
Token ID287
Feature activation+1.12e+00

Pos logit contributions
arter+0.056
 polar+0.052
 policy+0.052
run+0.051
asured+0.050
Neg logit contributions
ics-0.039
he-0.035
ars-0.034
 top-0.033
case-0.032
Token the
Token ID262
Feature activation-1.21e+00

Pos logit contributions
arter+0.042
 polar+0.038
run+0.038
asured+0.038
 policy+0.038
Neg logit contributions
 top-0.012
ics-0.012
he-0.011
case-0.008
 stage-0.007
Token hollow
Token ID20596
Feature activation+5.89e+00

Pos logit contributions
 top+0.023
he+0.021
ics+0.021
ars+0.021
case+0.020
Neg logit contributions
arter-0.036
run-0.034
asured-0.033
 policy-0.032
 polar-0.031
Token tree
Token ID5509
Feature activation+1.28e+00

Pos logit contributions
arter+0.190
 policy+0.174
 polar+0.173
run+0.172
asured+0.170
Neg logit contributions
ics-0.102
he-0.093
 top-0.092
ars-0.083
case-0.082
Token and
Token ID290
Feature activation-8.39e-01

Pos logit contributions
arter+0.038
 policy+0.036
 polar+0.036
run+0.035
asured+0.034
Neg logit contributions
ics-0.024
he-0.021
ars-0.021
 top-0.020
case-0.019
Token told
Token ID1297
Feature activation+1.54e+00

Pos logit contributions
ics+0.016
 top+0.014
he+0.014
ars+0.013
case+0.012
Neg logit contributions
arter-0.026
 policy-0.024
 polar-0.023
run-0.023
asured-0.022
Token her
Token ID607
Feature activation+8.20e-01

Pos logit contributions
arter+0.040
 policy+0.037
 polar+0.037
run+0.036
asured+0.035
Neg logit contributions
ics-0.035
he-0.032
 top-0.031
ars-0.031
case-0.029
Token to
Token ID284
Feature activation+1.75e-02

Pos logit contributions
arter+0.025
 policy+0.023
 polar+0.023
run+0.023
asured+0.023
Neg logit contributions
ics-0.015
 top-0.013
he-0.013
case-0.012
ars-0.012
Token,
Token ID11
Feature activation-1.18e+00

Pos logit contributions
arter+0.014
run+0.014
 policy+0.014
asured+0.013
 polar+0.013
Neg logit contributions
ics-0.013
ars-0.012
he-0.012
vation-0.010
 Higher-0.010
Token and
Token ID290
Feature activation-1.09e+00

Pos logit contributions
ics+0.024
he+0.022
ars+0.021
 top+0.021
case+0.020
Neg logit contributions
arter-0.033
 policy-0.031
 polar-0.031
run-0.031
asured-0.030
Token the
Token ID262
Feature activation-1.16e+00

Pos logit contributions
ics+0.016
 top+0.015
he+0.014
ars+0.013
 higher+0.012
Neg logit contributions
arter-0.038
 policy-0.037
 polar-0.034
run-0.034
asured-0.033
Token egg
Token ID5935
Feature activation+3.94e+00

Pos logit contributions
ics+0.027
he+0.025
ars+0.024
 top+0.024
case+0.023
Neg logit contributions
arter-0.030
 policy-0.028
 polar-0.027
run-0.027
asured-0.026
Token looked
Token ID3114
Feature activation+3.02e+00

Pos logit contributions
arter+0.101
 policy+0.096
 polar+0.095
run+0.094
asured+0.091
Neg logit contributions
ics-0.087
he-0.080
ars-0.077
 top-0.075
case-0.071
Token back
Token ID736
Feature activation+6.00e+00

Pos logit contributions
arter+0.101
 policy+0.100
 polar+0.092
run+0.091
asured+0.091
Neg logit contributions
 top-0.041
ics-0.039
he-0.038
 higher-0.036
ars-0.036
Token at
Token ID379
Feature activation+1.27e+00

Pos logit contributions
arter+0.184
 polar+0.176
 policy+0.175
run+0.163
asured+0.161
Neg logit contributions
ics-0.110
 top-0.101
he-0.097
ars-0.092
case-0.086
Token them
Token ID606
Feature activation-6.24e-01

Pos logit contributions
arter+0.043
 policy+0.041
run+0.041
 polar+0.040
asured+0.039
Neg logit contributions
ics-0.019
 top-0.019
he-0.017
ars-0.015
hed-0.013
Token.
Token ID13
Feature activation-6.43e-03

Pos logit contributions
ics+0.013
ars+0.012
he+0.012
 top+0.011
case+0.010
Neg logit contributions
arter-0.017
run-0.016
 policy-0.016
 polar-0.016
asured-0.015
Token The
Token ID383
Feature activation-1.23e+00

Pos logit contributions
ics+0.000
 top+0.000
he+0.000
ars+0.000
case+0.000
Neg logit contributions
arter-0.000
 policy-0.000
 polar-0.000
asured-0.000
run-0.000
Token animals
Token ID4695
Feature activation-1.17e+00

Pos logit contributions
ics+0.031
he+0.028
ars+0.027
 top+0.027
case+0.026
Neg logit contributions
arter-0.029
 polar-0.028
 policy-0.027
run-0.026
asured-0.026
Token the
Token ID262
Feature activation-1.77e+00

Pos logit contributions
arter+0.044
 policy+0.042
 polar+0.042
run+0.041
asured+0.040
Neg logit contributions
ics-0.025
he-0.022
 top-0.022
ars-0.021
case-0.019
Token small
Token ID1402
Feature activation+9.62e-01

Pos logit contributions
ics+0.042
ars+0.037
he+0.037
 top+0.033
case+0.033
Neg logit contributions
arter-0.045
 policy-0.043
 polar-0.042
run-0.040
asured-0.039
Token writing
Token ID3597
Feature activation+9.22e-01

Pos logit contributions
arter+0.028
run+0.027
asured+0.027
 polar+0.026
 policy+0.026
Neg logit contributions
 top-0.017
ics-0.017
he-0.016
ars-0.015
case-0.015
Token on
Token ID319
Feature activation+1.08e+00

Pos logit contributions
arter+0.026
 policy+0.025
run+0.025
 polar+0.023
asured+0.023
Neg logit contributions
ics-0.018
ars-0.016
 Higher-0.015
he-0.014
 top-0.014
Token the
Token ID262
Feature activation-2.03e+00

Pos logit contributions
arter+0.033
 policy+0.032
 polar+0.031
run+0.031
asured+0.030
Neg logit contributions
ics-0.020
ars-0.017
he-0.017
 top-0.015
case-0.015
Token back
Token ID736
Feature activation+5.85e+00

Pos logit contributions
ics+0.049
he+0.041
ars+0.040
hed+0.038
case+0.035
Neg logit contributions
arter-0.052
 policy-0.051
 polar-0.050
 rightful-0.045
run-0.044
Token of
Token ID286
Feature activation+4.96e-01

Pos logit contributions
arter+0.170
 policy+0.161
 polar+0.160
run+0.157
asured+0.154
Neg logit contributions
ics-0.114
he-0.102
ars-0.099
 top-0.098
case-0.092
Token the
Token ID262
Feature activation-1.22e+00

Pos logit contributions
arter+0.018
run+0.017
 policy+0.017
 polar+0.017
asured+0.017
Neg logit contributions
 top-0.006
ics-0.005
he-0.005
case-0.004
 stage-0.004
Token picture
Token ID4286
Feature activation-6.13e-01

Pos logit contributions
ics+0.027
he+0.023
ars+0.023
hed+0.021
 top+0.020
Neg logit contributions
arter-0.034
 policy-0.033
 polar-0.031
run-0.029
 rightful-0.028
Token.
Token ID13
Feature activation+3.65e-01

Pos logit contributions
ics+0.014
ars+0.013
 top+0.011
 Higher+0.011
he+0.011
Neg logit contributions
arter-0.015
 policy-0.015
run-0.015
 polar-0.013
asured-0.013
Token
Token ID198
Feature activation+2.90e-01

Pos logit contributions
arter+0.011
 policy+0.011
 polar+0.011
run+0.011
asured+0.010
Neg logit contributions
ics-0.006
he-0.006
ars-0.005
 top-0.005
case-0.005
Token
Token ID198
Feature activation+5.80e-01

Pos logit contributions
arter+0.009
 policy+0.008
 polar+0.008
asured+0.008
run+0.008
Neg logit contributions
ics-0.007
he-0.006
 top-0.006
ars-0.006
case-0.006
Token
Token ID198
Feature activation+1.64e+00

Pos logit contributions
arter+0.018
 policy+0.017
 polar+0.017
run+0.017
asured+0.016
Neg logit contributions
ics-0.010
he-0.009
ars-0.009
 top-0.008
case-0.008
TokenSuddenly
Token ID38582
Feature activation+6.01e-03

Pos logit contributions
arter+0.048
 polar+0.046
 policy+0.046
run+0.046
asured+0.044
Neg logit contributions
ics-0.031
he-0.027
ars-0.026
 top-0.025
case-0.024
Token,
Token ID11
Feature activation+4.94e-01

Pos logit contributions
arter+0.000
 polar+0.000
 policy+0.000
run+0.000
asured+0.000
Neg logit contributions
ics-0.000
he-0.000
ars-0.000
 top-0.000
case-0.000
Token a
Token ID257
Feature activation+7.51e-02

Pos logit contributions
arter+0.016
 policy+0.015
 polar+0.015
run+0.015
asured+0.014
Neg logit contributions
ics-0.008
he-0.007
ars-0.007
 top-0.007
case-0.006
Token pant
Token ID15857
Feature activation+6.28e+00

Pos logit contributions
arter+0.002
 policy+0.002
asured+0.002
run+0.002
 polar+0.002
Neg logit contributions
ics-0.001
he-0.001
 top-0.001
ars-0.001
case-0.001
Tokenher
Token ID372
Feature activation+2.82e+00

Pos logit contributions
arter+0.142
 policy+0.140
 former+0.139
run+0.137
asured+0.136
Neg logit contributions
ics-0.148
he-0.146
ars-0.143
 top-0.136
ong-0.123
Token appeared
Token ID4120
Feature activation+1.50e+00

Pos logit contributions
arter+0.078
 policy+0.073
run+0.072
 polar+0.071
asured+0.070
Neg logit contributions
ics-0.060
ars-0.053
he-0.053
 top-0.052
case-0.048
Token.
Token ID13
Feature activation+6.62e-01

Pos logit contributions
arter+0.040
 policy+0.039
run+0.039
 polar+0.038
asured+0.036
Neg logit contributions
ics-0.032
ars-0.029
he-0.027
case-0.027
 top-0.026
Token He
Token ID679
Feature activation+1.18e+00

Pos logit contributions
arter+0.018
 policy+0.018
asured+0.017
 polar+0.016
run+0.016
Neg logit contributions
 top-0.013
ics-0.013
he-0.012
ars-0.012
case-0.011
Token was
Token ID373
Feature activation-9.28e-01

Pos logit contributions
arter+0.040
 policy+0.038
run+0.035
ocol+0.034
 former+0.034
Neg logit contributions
ics-0.018
 top-0.017
ars-0.017
he-0.017
case-0.016
Token the
Token ID262
Feature activation-1.27e+00

Pos logit contributions
arter+0.030
run+0.029
 polar+0.028
asured+0.028
 policy+0.028
Neg logit contributions
ics-0.012
 top-0.010
he-0.010
case-0.008
't-0.008
Token small
Token ID1402
Feature activation+1.84e+00

Pos logit contributions
ics+0.030
he+0.027
ars+0.027
 top+0.025
case+0.024
Neg logit contributions
arter-0.032
 policy-0.031
 polar-0.030
run-0.029
asured-0.028
Token writing
Token ID3597
Feature activation+1.50e+00

Pos logit contributions
arter+0.054
run+0.050
 polar+0.050
asured+0.050
 policy+0.050
Neg logit contributions
ics-0.034
 top-0.033
he-0.031
ars-0.030
case-0.029
Token on
Token ID319
Feature activation+1.68e+00

Pos logit contributions
arter+0.037
 policy+0.036
run+0.034
 polar+0.034
asured+0.033
Neg logit contributions
ics-0.035
ars-0.031
he-0.031
 top-0.031
 Higher-0.028
Token the
Token ID262
Feature activation-1.80e+00

Pos logit contributions
arter+0.055
 policy+0.052
 polar+0.052
run+0.051
asured+0.050
Neg logit contributions
ics-0.027
he-0.024
 top-0.023
ars-0.023
case-0.021
Token back
Token ID736
Feature activation+5.76e+00

Pos logit contributions
ics+0.044
he+0.037
ars+0.037
hed+0.033
case+0.033
Neg logit contributions
arter-0.046
 policy-0.045
 polar-0.043
run-0.040
 rightful-0.039
Token of
Token ID286
Feature activation+9.76e-01

Pos logit contributions
arter+0.171
 polar+0.161
 policy+0.161
run+0.158
asured+0.156
Neg logit contributions
ics-0.108
he-0.098
 top-0.094
ars-0.094
case-0.088
Token the
Token ID262
Feature activation-1.27e+00

Pos logit contributions
arter+0.036
run+0.035
 policy+0.035
 polar+0.035
asured+0.034
Neg logit contributions
 top-0.010
ics-0.009
he-0.008
 stage-0.006
 swing-0.006
Token picture
Token ID4286
Feature activation-7.74e-01

Pos logit contributions
ics+0.029
he+0.024
ars+0.024
 top+0.022
case+0.022
Neg logit contributions
arter-0.034
 policy-0.033
 polar-0.032
run-0.030
asured-0.029
Token said
Token ID531
Feature activation-8.02e-01

Pos logit contributions
ics+0.017
ars+0.015
 top+0.014
he+0.014
 Higher+0.013
Neg logit contributions
arter-0.020
 policy-0.020
run-0.019
 polar-0.018
asured-0.018
Token,
Token ID11
Feature activation-1.40e+00

Pos logit contributions
ics+0.018
he+0.016
ars+0.016
 top+0.014
hed+0.014
Neg logit contributions
arter-0.021
 policy-0.020
run-0.020
 polar-0.020
 importantly-0.018
Token named
Token ID3706
Feature activation+1.80e+00

Pos logit contributions
arter+0.020
 former+0.018
asured+0.018
run+0.018
 policy+0.018
Neg logit contributions
ics-0.018
he-0.018
'-0.017
 top-0.016
hed-0.016
Token Sarah
Token ID10490
Feature activation+2.36e-01

Pos logit contributions
arter+0.038
 polar+0.036
 policy+0.035
asured+0.035
run+0.034
Neg logit contributions
ics-0.049
he-0.046
 top-0.045
case-0.044
ars-0.043
Token was
Token ID373
Feature activation+1.15e-01

Pos logit contributions
arter+0.006
 policy+0.006
 polar+0.006
run+0.006
asured+0.006
Neg logit contributions
ics-0.005
ars-0.005
he-0.005
 top-0.005
case-0.004
Token playing
Token ID2712
Feature activation-3.68e+00

Pos logit contributions
arter+0.003
 polar+0.003
 policy+0.002
run+0.002
asured+0.002
Neg logit contributions
ics-0.003
ars-0.003
he-0.003
 top-0.002
case-0.002
Token in
Token ID287
Feature activation+6.68e-01

Pos logit contributions
ics+0.088
ars+0.080
vation+0.076
case+0.073
he+0.070
Neg logit contributions
arter-0.084
 polar-0.082
run-0.079
 policy-0.078
asured-0.076
Token her
Token ID607
Feature activation+2.31e+00

Pos logit contributions
arter+0.020
 polar+0.019
 policy+0.019
run+0.018
asured+0.018
Neg logit contributions
ics-0.013
ars-0.011
he-0.011
 top-0.010
case-0.010
Token backyard
Token ID24296
Feature activation+7.28e-01

Pos logit contributions
arter+0.068
run+0.066
 policy+0.065
 polar+0.063
asured+0.062
Neg logit contributions
ics-0.044
 top-0.040
he-0.038
case-0.036
ars-0.036
Token.
Token ID13
Feature activation+1.07e+00

Pos logit contributions
arter+0.020
 policy+0.019
 polar+0.019
run+0.019
asured+0.018
Neg logit contributions
ics-0.015
ars-0.014
he-0.013
 top-0.012
case-0.012
Token Suddenly
Token ID24975
Feature activation+5.94e-01

Pos logit contributions
arter+0.030
 policy+0.029
 polar+0.028
run+0.028
asured+0.027
Neg logit contributions
ics-0.022
ars-0.019
he-0.019
 top-0.018
case-0.017
Token,
Token ID11
Feature activation+1.12e+00

Pos logit contributions
arter+0.017
 policy+0.016
 polar+0.016
run+0.016
 Bobby+0.015
Neg logit contributions
ics-0.012
ars-0.011
he-0.010
 top-0.009
case-0.009
Token she
Token ID673
Feature activation+1.85e+00

Pos logit contributions
arter+0.032
 policy+0.031
 polar+0.030
run+0.030
asured+0.028
Neg logit contributions
ics-0.022
ars-0.020
he-0.019
 top-0.017
case-0.017
Token loved
Token ID6151
Feature activation-1.56e+00

Pos logit contributions
ics+0.002
he+0.001
ars+0.001
 top+0.001
case+0.001
Neg logit contributions
arter-0.003
 policy-0.002
 polar-0.002
run-0.002
asured-0.002
Token spending
Token ID4581
Feature activation-1.31e+00

Pos logit contributions
ics+0.031
he+0.027
 top+0.027
ars+0.025
case+0.024
Neg logit contributions
arter-0.046
 policy-0.044
 polar-0.043
run-0.042
asured-0.041
Token time
Token ID640
Feature activation-2.42e+00

Pos logit contributions
ics+0.024
he+0.021
ars+0.021
 top+0.020
case+0.019
Neg logit contributions
arter-0.039
 policy-0.038
 polar-0.037
run-0.037
asured-0.036
Token with
Token ID351
Feature activation+1.86e+00

Pos logit contributions
ics+0.055
ars+0.051
he+0.049
 top+0.048
 Higher+0.047
Neg logit contributions
run-0.053
arter-0.051
 policy-0.050
asured-0.050
 importantly-0.049
Token her
Token ID607
Feature activation+2.21e+00

Pos logit contributions
arter+0.071
 policy+0.070
run+0.068
 polar+0.068
asured+0.065
Neg logit contributions
ics-0.019
 top-0.018
he-0.014
 stage-0.010
't-0.010
Token mom
Token ID1995
Feature activation+1.55e+00

Pos logit contributions
arter+0.061
 polar+0.056
 policy+0.056
run+0.055
asured+0.054
Neg logit contributions
ics-0.046
 top-0.044
he-0.042
ars-0.039
case-0.039
Tokenmy
Token ID1820
Feature activation-5.28e-02

Pos logit contributions
arter+0.042
 policy+0.040
 polar+0.039
run+0.038
asured+0.037
Neg logit contributions
ics-0.034
he-0.031
ars-0.029
 top-0.029
case-0.028
Token.
Token ID13
Feature activation+5.05e-02

Pos logit contributions
ics+0.001
ars+0.001
he+0.001
 top+0.001
case+0.001
Neg logit contributions
arter-0.002
 policy-0.001
 polar-0.001
run-0.001
asured-0.001
Token She
Token ID1375
Feature activation+8.71e-01

Pos logit contributions
arter+0.001
 policy+0.001
 polar+0.001
run+0.001
asured+0.001
Neg logit contributions
ics-0.001
he-0.001
 top-0.001
ars-0.001
case-0.001
Token even
Token ID772
Feature activation+3.18e+00

Pos logit contributions
arter+0.020
 polar+0.019
asured+0.019
 policy+0.019
run+0.018
Neg logit contributions
ics-0.022
he-0.019
ars-0.019
 top-0.019
case-0.018
Token taught
Token ID7817
Feature activation-9.46e-01

Pos logit contributions
arter+0.100
 policy+0.095
 polar+0.089
run+0.086
ocol+0.084
Neg logit contributions
ics-0.056
he-0.051
ars-0.051
 top-0.047
hed-0.045
Token me
Token ID502
Feature activation-3.12e-02

Pos logit contributions
ics+0.017
 top+0.015
he+0.014
ars+0.013
case+0.012
Neg logit contributions
arter-0.030
run-0.029
 policy-0.029
 polar-0.029
asured-0.028
Token?"
Token ID1701
Feature activation-4.19e-01

Pos logit contributions
ics+0.001
he+0.001
ars+0.001
 top+0.001
case+0.001
Neg logit contributions
arter-0.001
 policy-0.001
 polar-0.001
run-0.001
asured-0.001
Token
Token ID198
Feature activation+5.92e-01

Pos logit contributions
ics+0.011
ars+0.009
he+0.009
vation+0.008
hed+0.008
Neg logit contributions
run-0.009
 policy-0.009
arter-0.009
 Bobby-0.009
 importantly-0.009
Token
Token ID198
Feature activation+1.39e+00

Pos logit contributions
run+0.013
 policy+0.012
arter+0.012
 polar+0.012
 importantly+0.011
Neg logit contributions
ics-0.015
ars-0.013
hed-0.012
he-0.012
vation-0.012
TokenA
Token ID32
Feature activation+4.11e-01

Pos logit contributions
 policy+0.029
run+0.029
 importantly+0.028
emy+0.026
arter+0.026
Neg logit contributions
ics-0.033
hed-0.029
he-0.026
ong-0.026
string-0.025
Tokenunt
Token ID2797
Feature activation+1.90e+00

Pos logit contributions
arter+0.013
 policy+0.013
 polar+0.012
run+0.012
asured+0.012
Neg logit contributions
ics-0.007
he-0.006
ars-0.006
 top-0.006
case-0.005
Token Lisa
Token ID15378
Feature activation-8.74e-01

Pos logit contributions
arter+0.049
 policy+0.046
 polar+0.046
run+0.045
asured+0.044
Neg logit contributions
ics-0.043
he-0.039
ars-0.038
 top-0.038
case-0.036
Token smiled
Token ID13541
Feature activation+9.04e-01

Pos logit contributions
ics+0.020
ars+0.018
he+0.018
 top+0.017
 Higher+0.016
Neg logit contributions
arter-0.021
 polar-0.020
run-0.020
 policy-0.019
asured-0.019
Token.
Token ID13
Feature activation+7.05e-01

Pos logit contributions
arter+0.028
 policy+0.027
 polar+0.027
run+0.026
asured+0.026
Neg logit contributions
ics-0.016
he-0.014
ars-0.013
 top-0.013
case-0.012
Token She
Token ID1375
Feature activation+1.03e+00

Pos logit contributions
arter+0.022
 policy+0.021
 polar+0.020
run+0.020
asured+0.020
Neg logit contributions
ics-0.013
he-0.011
ars-0.011
 top-0.011
case-0.010
Token said
Token ID531
Feature activation-7.83e-01

Pos logit contributions
arter+0.032
 policy+0.030
 polar+0.029
run+0.028
asured+0.028
Neg logit contributions
ics-0.019
he-0.017
ars-0.016
 top-0.016
case-0.015
Token on
Token ID319
Feature activation-6.63e-01

Pos logit contributions
ics+0.041
he+0.034
ars+0.033
 top+0.032
case+0.030
Neg logit contributions
arter-0.094
 policy-0.090
 polar-0.089
run-0.088
asured-0.085
Token,
Token ID11
Feature activation-1.02e-02

Pos logit contributions
ics+0.006
 top+0.006
he+0.005
't+0.004
case+0.004
Neg logit contributions
asured-0.026
arter-0.026
 former-0.025
run-0.025
 polar-0.025
Token Tim
Token ID5045
Feature activation-5.63e-01

Pos logit contributions
 top+0.000
ics+0.000
he+0.000
 Higher+0.000
't+0.000
Neg logit contributions
arter-0.000
 policy-0.000
asured-0.000
 polar-0.000
run-0.000
Token was
Token ID373
Feature activation-3.63e-01

Pos logit contributions
ics+0.013
ars+0.011
he+0.011
 top+0.011
case+0.011
Neg logit contributions
arter-0.014
 polar-0.014
asured-0.013
 policy-0.013
run-0.012
Token careful
Token ID8161
Feature activation-1.01e+00

Pos logit contributions
ics+0.007
he+0.007
ars+0.006
 top+0.006
case+0.006
Neg logit contributions
arter-0.010
 polar-0.010
 policy-0.010
run-0.010
asured-0.009
Token with
Token ID351
Feature activation+2.07e+00

Pos logit contributions
ics+0.021
he+0.020
ars+0.019
 top+0.018
case+0.017
Neg logit contributions
arter-0.027
 policy-0.026
run-0.026
asured-0.025
 polar-0.025
Token the
Token ID262
Feature activation-1.11e+00

Pos logit contributions
 policy+0.067
arter+0.066
obby+0.063
 former+0.060
 importantly+0.059
Neg logit contributions
 top-0.031
ics-0.026
 a-0.025
 high-0.023
 up-0.022
Token unusual
Token ID8468
Feature activation+1.67e+00

Pos logit contributions
ics+0.024
 top+0.022
he+0.022
ars+0.022
case+0.020
Neg logit contributions
arter-0.030
run-0.028
 policy-0.028
 polar-0.027
asured-0.027
Token sauce
Token ID10746
Feature activation-2.09e+00

Pos logit contributions
arter+0.045
run+0.044
asured+0.043
 policy+0.041
 polar+0.041
Neg logit contributions
 top-0.034
ics-0.033
ars-0.030
he-0.029
g-0.027
Token.
Token ID13
Feature activation+1.08e+00

Pos logit contributions
ics+0.040
ars+0.035
he+0.035
 top+0.033
case+0.031
Neg logit contributions
arter-0.060
 policy-0.058
 polar-0.056
run-0.056
asured-0.054
Token He
Token ID679
Feature activation+1.09e+00

Pos logit contributions
arter+0.026
 polar+0.026
 policy+0.024
asured+0.024
ocol+0.024
Neg logit contributions
 top-0.023
he-0.022
ics-0.022
case-0.022
 Higher-0.022
Token
Token ID198
Feature activation+6.84e-01

Pos logit contributions
arter+0.015
 policy+0.014
asured+0.013
ocol+0.013
 polar+0.012
Neg logit contributions
 top-0.009
ars-0.009
ics-0.008
he-0.008
case-0.008
Token
Token ID198
Feature activation+1.37e+00

Pos logit contributions
arter+0.022
 policy+0.021
 polar+0.021
run+0.021
asured+0.020
Neg logit contributions
ics-0.011
he-0.010
ars-0.009
 top-0.009
case-0.008
TokenSo
Token ID2396
Feature activation+6.90e-02

Pos logit contributions
arter+0.041
 policy+0.039
 polar+0.039
run+0.038
asured+0.037
Neg logit contributions
ics-0.026
he-0.022
ars-0.022
 top-0.021
case-0.020
Token,
Token ID11
Feature activation-5.03e-01

Pos logit contributions
arter+0.002
asured+0.002
run+0.002
 policy+0.002
ocol+0.002
Neg logit contributions
 top-0.001
ars-0.001
ics-0.001
he-0.001
't-0.001
Token D
Token ID360
Feature activation-1.50e+00

Pos logit contributions
 top+0.012
he+0.011
ics+0.011
ars+0.011
't+0.010
Neg logit contributions
arter-0.013
 policy-0.012
asured-0.011
run-0.011
ocol-0.011
Tokenippy
Token ID41214
Feature activation+2.61e+00

Pos logit contributions
ics+0.029
he+0.027
ars+0.027
 top+0.026
case+0.024
Neg logit contributions
arter-0.043
 policy-0.040
 polar-0.040
run-0.039
asured-0.038
Token had
Token ID550
Feature activation-5.88e-01

Pos logit contributions
arter+0.073
 policy+0.070
 polar+0.070
run+0.069
asured+0.067
Neg logit contributions
ics-0.053
he-0.047
ars-0.046
 top-0.045
case-0.042
Token to
Token ID284
Feature activation+2.67e-01

Pos logit contributions
ics+0.010
 top+0.009
ars+0.008
he+0.008
case+0.008
Neg logit contributions
arter-0.020
 policy-0.018
 polar-0.018
run-0.017
ocol-0.017
Token give
Token ID1577
Feature activation+2.03e+00

Pos logit contributions
arter+0.008
 policy+0.007
 polar+0.007
run+0.007
asured+0.007
Neg logit contributions
ics-0.005
he-0.005
ars-0.004
 top-0.004
case-0.004
Token up
Token ID510
Feature activation+1.48e+00

Pos logit contributions
arter+0.055
 policy+0.052
run+0.049
 polar+0.048
asured+0.047
Neg logit contributions
ics-0.045
 top-0.041
ars-0.041
he-0.040
hed-0.035
Token all
Token ID477
Feature activation+4.11e+00

Pos logit contributions
arter+0.051
 policy+0.047
 polar+0.047
asured+0.045
run+0.045
Neg logit contributions
ics-0.020
 top-0.019
ars-0.019
he-0.017
g-0.016
Token same
Token ID976
Feature activation-5.69e-01

Pos logit contributions
case+0.068
 Higher+0.067
ars+0.067
hel+0.067
ives+0.066
Neg logit contributions
 polar-0.057
arter-0.054
 policy-0.049
 rightful-0.046
cest-0.046
Token.
Token ID13
Feature activation+8.19e-01

Pos logit contributions
ics+0.012
he+0.010
ars+0.010
 top+0.010
case+0.010
Neg logit contributions
arter-0.016
 policy-0.015
 polar-0.015
run-0.015
asured-0.014
Token They
Token ID1119
Feature activation-2.02e+00

Pos logit contributions
arter+0.020
 policy+0.019
 polar+0.019
run+0.018
asured+0.018
Neg logit contributions
ics-0.020
he-0.018
 top-0.018
ars-0.018
case-0.017
Token blew
Token ID17948
Feature activation-8.69e-01

Pos logit contributions
ics+0.045
he+0.038
ars+0.038
 top+0.037
case+0.036
Neg logit contributions
 polar-0.051
arter-0.051
run-0.050
 policy-0.048
asured-0.048
Token their
Token ID511
Feature activation+5.31e-01

Pos logit contributions
ics+0.019
case+0.016
ars+0.015
he+0.015
vation+0.014
Neg logit contributions
 polar-0.024
run-0.023
arter-0.022
 policy-0.022
 Bobby-0.021
Token horns
Token ID26771
Feature activation-1.18e+00

Pos logit contributions
arter+0.013
 polar+0.012
run+0.012
 policy+0.012
asured+0.011
Neg logit contributions
ics-0.012
he-0.011
case-0.011
hel-0.010
ars-0.010
Token and
Token ID290
Feature activation-1.60e+00

Pos logit contributions
ics+0.029
vation+0.025
ars+0.023
ceed+0.023
he+0.023
Neg logit contributions
run-0.028
 polar-0.024
arter-0.024
 Bobby-0.023
asured-0.023
Token squeezed
Token ID29650
Feature activation+1.26e+00

Pos logit contributions
ics+0.029
ars+0.026
 top+0.026
he+0.026
case+0.023
Neg logit contributions
arter-0.049
 policy-0.047
run-0.045
 polar-0.044
asured-0.044
Token their
Token ID511
Feature activation+6.17e-01

Pos logit contributions
arter+0.036
 polar+0.034
 policy+0.034
run+0.034
asured+0.033
Neg logit contributions
ics-0.025
he-0.022
ars-0.022
 top-0.020
case-0.020
Token balloons
Token ID35485
Feature activation-1.53e+00

Pos logit contributions
arter+0.013
 polar+0.012
 policy+0.012
run+0.012
asured+0.011
Neg logit contributions
ics-0.017
he-0.016
case-0.015
ars-0.014
 top-0.014
Token.
Token ID13
Feature activation+2.76e-01

Pos logit contributions
ics+0.036
ars+0.033
he+0.031
 Higher+0.031
vation+0.031
Neg logit contributions
arter-0.036
run-0.034
 polar-0.033
 policy-0.033
asured-0.032
Token wanted
Token ID2227
Feature activation-3.19e-01

Pos logit contributions
ics+0.001
he+0.001
ars+0.001
 top+0.001
case+0.001
Neg logit contributions
arter-0.001
 policy-0.001
 polar-0.001
run-0.001
asured-0.001
Token.
Token ID13
Feature activation-8.27e-02

Pos logit contributions
ics+0.005
 top+0.005
ars+0.004
he+0.004
case+0.004
Neg logit contributions
arter-0.010
 polar-0.010
 policy-0.010
asured-0.009
run-0.009
Token
Token ID220
Feature activation+4.20e-02

Pos logit contributions
ics+0.002
he+0.001
ars+0.001
 top+0.001
case+0.001
Neg logit contributions
arter-0.002
 policy-0.002
 polar-0.002
run-0.002
asured-0.002
Token
Token ID198
Feature activation+1.70e-01

Pos logit contributions
arter+0.001
run+0.001
 policy+0.001
 polar+0.001
asured+0.001
Neg logit contributions
ics-0.001
ars-0.001
he-0.001
 top-0.001
case-0.001
Token
Token ID198
Feature activation+1.05e+00

Pos logit contributions
run+0.004
 polar+0.003
 Bobby+0.003
 Joey+0.003
 policy+0.003
Neg logit contributions
ics-0.004
 top-0.004
ars-0.004
 banner-0.003
 poster-0.003
TokenThe
Token ID464
Feature activation-1.85e+00

Pos logit contributions
run+0.027
 importantly+0.026
 polar+0.026
 policy+0.026
asured+0.025
Neg logit contributions
ics-0.021
ong-0.017
 top-0.016
 beams-0.016
ars-0.016
Token raven
Token ID37735
Feature activation+2.44e+00

Pos logit contributions
ics+0.052
ars+0.044
hed+0.043
uits+0.043
case+0.041
Neg logit contributions
 polar-0.037
arter-0.034
 policy-0.034
 importantly-0.032
asured-0.032
Token was
Token ID373
Feature activation-1.05e+00

Pos logit contributions
arter+0.060
 polar+0.059
run+0.058
 policy+0.057
asured+0.056
Neg logit contributions
ics-0.056
he-0.050
ars-0.048
 top-0.047
case-0.045
Token so
Token ID523
Feature activation+5.52e-01

Pos logit contributions
ics+0.021
ars+0.019
he+0.019
 top+0.018
case+0.017
Neg logit contributions
arter-0.029
 polar-0.028
run-0.028
asured-0.028
 policy-0.027
Token grateful
Token ID14066
Feature activation+6.43e-01

Pos logit contributions
arter+0.015
 polar+0.014
run+0.014
 policy+0.014
asured+0.014
Neg logit contributions
ics-0.012
ars-0.010
 top-0.010
he-0.010
case-0.009
Token and
Token ID290
Feature activation+1.26e-01

Pos logit contributions
arter+0.023
 polar+0.021
 policy+0.021
asured+0.020
run+0.020
Neg logit contributions
ics-0.008
he-0.008
 top-0.007
ars-0.007
case-0.006
Token and
Token ID290
Feature activation-5.71e-01

Pos logit contributions
arter+0.006
run+0.006
 policy+0.006
 polar+0.006
asured+0.006
Neg logit contributions
ics-0.005
ars-0.004
he-0.004
 top-0.004
vation-0.004
Token the
Token ID262
Feature activation-1.08e+00

Pos logit contributions
ics+0.010
 top+0.008
he+0.008
ars+0.008
case+0.007
Neg logit contributions
arter-0.018
 policy-0.018
 polar-0.017
run-0.017
asured-0.017
Token truck
Token ID7779
Feature activation+6.65e-01

Pos logit contributions
ics+0.024
he+0.022
ars+0.021
case+0.020
 top+0.019
Neg logit contributions
arter-0.027
 polar-0.027
 policy-0.026
run-0.025
 former-0.024
Token driver
Token ID4639
Feature activation+9.79e-01

Pos logit contributions
arter+0.017
 polar+0.017
run+0.016
 policy+0.016
asured+0.016
Neg logit contributions
ics-0.014
he-0.013
ars-0.013
 top-0.012
case-0.011
Token joined
Token ID5399
Feature activation-9.75e-01

Pos logit contributions
arter+0.027
 polar+0.026
 policy+0.026
run+0.025
asured+0.025
Neg logit contributions
ics-0.020
he-0.018
ars-0.018
 top-0.017
case-0.016
Token in
Token ID287
Feature activation-4.66e-01

Pos logit contributions
ics+0.017
he+0.015
 top+0.014
ars+0.014
case+0.013
Neg logit contributions
arter-0.031
 policy-0.029
 polar-0.029
run-0.028
asured-0.028
Token.
Token ID13
Feature activation+5.09e-01

Pos logit contributions
ics+0.008
he+0.007
ars+0.007
 top+0.007
case+0.007
Neg logit contributions
arter-0.014
 policy-0.013
 polar-0.013
run-0.013
asured-0.013
Token They
Token ID1119
Feature activation-1.87e+00

Pos logit contributions
arter+0.014
 policy+0.014
 polar+0.013
run+0.013
asured+0.013
Neg logit contributions
ics-0.010
he-0.009
 top-0.009
ars-0.009
case-0.009
Token sang
Token ID25889
Feature activation-1.21e+00

Pos logit contributions
ics+0.039
ars+0.034
he+0.034
 top+0.033
case+0.032
Neg logit contributions
 polar-0.049
arter-0.048
run-0.046
asured-0.045
 policy-0.045
Token a
Token ID257
Feature activation+9.32e-03

Pos logit contributions
ics+0.028
he+0.024
ars+0.024
case+0.023
vation+0.023
Neg logit contributions
 polar-0.029
run-0.029
 Bobby-0.028
arter-0.026
 importantly-0.026
Token happy
Token ID3772
Feature activation+3.66e-01

Pos logit contributions
 polar+0.000
 importantly+0.000
arter+0.000
run+0.000
 former+0.000
Neg logit contributions
ics-0.000
he-0.000
iece-0.000
hed-0.000
hel-0.000
Token her
Token ID607
Feature activation+1.87e+00

Pos logit contributions
arter+0.059
 policy+0.056
 polar+0.055
run+0.055
asured+0.053
Neg logit contributions
ics-0.034
he-0.030
 top-0.029
ars-0.029
case-0.026
Token head
Token ID1182
Feature activation+1.07e+00

Pos logit contributions
arter+0.041
 polar+0.039
 policy+0.038
run+0.038
asured+0.038
Neg logit contributions
ics-0.050
he-0.044
ival-0.044
vation-0.043
ars-0.043
Token and
Token ID290
Feature activation-1.66e+00

Pos logit contributions
run+0.024
arter+0.022
asured+0.021
 polar+0.021
 policy+0.021
Neg logit contributions
ics-0.028
vation-0.024
ars-0.024
he-0.022
ceed-0.022
Token hugged
Token ID41130
Feature activation+1.30e+00

Pos logit contributions
ics+0.034
he+0.030
ars+0.030
 top+0.029
case+0.027
Neg logit contributions
arter-0.047
 policy-0.045
 polar-0.044
run-0.044
asured-0.043
Token her
Token ID607
Feature activation+1.55e+00

Pos logit contributions
arter+0.035
 policy+0.033
 polar+0.033
run+0.032
asured+0.031
Neg logit contributions
ics-0.028
he-0.026
ars-0.025
 top-0.024
case-0.023
Token doll
Token ID3654
Feature activation-4.59e-01

Pos logit contributions
arter+0.035
 polar+0.034
 policy+0.034
run+0.032
asured+0.031
Neg logit contributions
ics-0.041
he-0.035
ars-0.035
vation-0.034
hed-0.034
Token.
Token ID13
Feature activation+3.06e-01

Pos logit contributions
ics+0.011
he+0.010
ars+0.010
vation+0.009
 Higher+0.009
Neg logit contributions
arter-0.010
 polar-0.010
run-0.010
asured-0.010
 policy-0.010
Token She
Token ID1375
Feature activation+7.93e-01

Pos logit contributions
arter+0.009
 policy+0.009
 polar+0.009
run+0.008
asured+0.008
Neg logit contributions
ics-0.006
he-0.005
ars-0.005
 top-0.005
case-0.005
Token wished
Token ID16555
Feature activation+5.91e-01

Pos logit contributions
arter+0.025
 policy+0.024
run+0.022
 polar+0.022
ocol+0.021
Neg logit contributions
ics-0.014
ars-0.013
he-0.013
 top-0.013
case-0.012
Token Ben
Token ID3932
Feature activation-2.61e+00

Pos logit contributions
arter+0.019
 policy+0.019
 polar+0.018
run+0.018
asured+0.018
Neg logit contributions
ics-0.009
 top-0.008
he-0.008
ars-0.008
case-0.007
Token would
Token ID561
Feature activation-2.92e+00

Pos logit contributions
ics+0.057
he+0.052
ars+0.051
 top+0.050
case+0.047
Neg logit contributions
arter-0.069
 policy-0.065
 polar-0.064
run-0.064
asured-0.062
Token so
Token ID523
Feature activation+4.58e-01

Pos logit contributions
ics+0.017
 top+0.015
he+0.015
ars+0.014
case+0.013
Neg logit contributions
arter-0.033
 policy-0.032
run-0.031
 polar-0.031
asured-0.030
Token exciting
Token ID7895
Feature activation-5.05e-01

Pos logit contributions
arter+0.015
 policy+0.014
run+0.014
 polar+0.013
asured+0.013
Neg logit contributions
ics-0.008
he-0.007
ars-0.007
 top-0.007
case-0.006
Token!
Token ID0
Feature activation+3.36e-01

Pos logit contributions
ics+0.010
he+0.009
ars+0.008
 top+0.008
case+0.008
Neg logit contributions
arter-0.015
 policy-0.014
 polar-0.014
run-0.014
asured-0.013
Token They
Token ID1119
Feature activation-1.94e+00

Pos logit contributions
arter+0.009
 policy+0.009
 polar+0.009
run+0.008
asured+0.008
Neg logit contributions
ics-0.007
he-0.006
ars-0.006
 top-0.006
case-0.006
Token saw
Token ID2497
Feature activation-3.88e-01

Pos logit contributions
ics+0.046
he+0.041
ars+0.041
 top+0.040
case+0.039
Neg logit contributions
arter-0.046
 polar-0.046
 policy-0.044
run-0.043
asured-0.043
Token lots
Token ID6041
Feature activation-5.45e-01

Pos logit contributions
ics+0.006
he+0.006
ars+0.005
 top+0.005
case+0.005
Neg logit contributions
arter-0.012
 policy-0.012
 polar-0.012
run-0.012
asured-0.011
Token of
Token ID286
Feature activation-2.91e-01

Pos logit contributions
ics+0.013
he+0.012
ars+0.012
 top+0.012
case+0.011
Neg logit contributions
arter-0.013
 policy-0.013
 polar-0.012
run-0.012
asured-0.012
Token different
Token ID1180
Feature activation+5.42e-01

Pos logit contributions
ics+0.005
 top+0.005
ars+0.005
he+0.005
 rides+0.004
Neg logit contributions
arter-0.009
run-0.009
 policy-0.009
asured-0.008
 former-0.008
Token fish
Token ID5916
Feature activation-1.29e+00

Pos logit contributions
arter+0.017
run+0.015
 policy+0.015
 polar+0.014
asured+0.014
Neg logit contributions
ics-0.011
 top-0.010
ars-0.009
he-0.009
 rides-0.008
Token,
Token ID11
Feature activation-9.74e-01

Pos logit contributions
ics+0.024
he+0.021
ars+0.021
 top+0.020
case+0.019
Neg logit contributions
arter-0.039
 policy-0.037
 polar-0.036
run-0.036
asured-0.035
Token birds
Token ID10087
Feature activation-1.37e+00

Pos logit contributions
ics+0.019
 top+0.017
ars+0.016
he+0.016
case+0.015
Neg logit contributions
arter-0.029
 policy-0.027
run-0.026
asured-0.026
 polar-0.026
Tokenre
Token ID260
Feature activation+4.20e-01

Pos logit contributions
ics+0.088
he+0.079
ars+0.078
 top+0.076
case+0.071
Neg logit contributions
arter-0.111
 policy-0.105
 polar-0.104
run-0.103
asured-0.099
Token so
Token ID523
Feature activation+2.34e-01

Pos logit contributions
arter+0.012
 policy+0.012
run+0.012
 polar+0.012
asured+0.011
Neg logit contributions
ics-0.008
he-0.007
ars-0.007
 top-0.007
case-0.006
Token smart
Token ID4451
Feature activation-1.59e+00

Pos logit contributions
arter+0.007
 policy+0.006
 polar+0.006
run+0.006
asured+0.006
Neg logit contributions
ics-0.005
he-0.004
ars-0.004
 top-0.004
case-0.004
Token.
Token ID13
Feature activation+3.54e-01

Pos logit contributions
ics+0.026
he+0.024
ars+0.022
 top+0.022
case+0.020
Neg logit contributions
arter-0.052
 polar-0.048
 policy-0.047
run-0.046
asured-0.046
Tokenâ
Token ID22940
Feature activation-1.11e+00

Pos logit contributions
arter+0.010
 policy+0.010
 polar+0.009
asured+0.009
run+0.009
Neg logit contributions
ics-0.007
he-0.006
 top-0.006
ars-0.006
case-0.006
Token€
Token ID26391
Feature activation-2.93e+00

Pos logit contributions
ics+0.023
 top+0.021
case+0.019
ars+0.019
he+0.018
Neg logit contributions
arter-0.028
asured-0.027
 policy-0.027
run-0.027
 polar-0.026
Token
Token ID220
Feature activation-2.56e-01

Pos logit contributions
ics+0.060
he+0.054
 top+0.054
ars+0.051
iece+0.050
Neg logit contributions
run-0.075
arter-0.071
 policy-0.069
 polar-0.067
omo-0.066
Token
Token ID198
Feature activation-6.74e-01

Pos logit contributions
ics+0.005
ars+0.004
he+0.004
 top+0.004
case+0.004
Neg logit contributions
arter-0.007
run-0.007
 policy-0.007
 polar-0.007
asured-0.007
Token
Token ID198
Feature activation+3.95e-01

Pos logit contributions
ics+0.012
he+0.010
ars+0.010
 top+0.009
case+0.009
Neg logit contributions
arter-0.021
 policy-0.020
run-0.020
 polar-0.020
asured-0.019
TokenMay
Token ID6747
Feature activation+4.04e-01

Pos logit contributions
arter+0.013
 policy+0.012
 polar+0.012
run+0.012
asured+0.011
Neg logit contributions
ics-0.007
he-0.006
ars-0.006
 top-0.005
case-0.005
Token hugged
Token ID41130
Feature activation+1.42e+00

Pos logit contributions
arter+0.011
 policy+0.011
 polar+0.010
run+0.010
asured+0.010
Neg logit contributions
ics-0.008
he-0.007
ars-0.007
 top-0.007
case-0.007
Token They
Token ID1119
Feature activation-1.76e+00

Pos logit contributions
arter+0.013
 policy+0.013
 polar+0.012
run+0.012
asured+0.012
Neg logit contributions
ics-0.013
he-0.012
ars-0.012
 top-0.012
case-0.011
Token were
Token ID547
Feature activation-1.34e+00

Pos logit contributions
ics+0.038
ars+0.033
he+0.033
 top+0.032
case+0.030
Neg logit contributions
arter-0.045
 polar-0.044
run-0.043
 policy-0.043
asured-0.043
Token very
Token ID845
Feature activation+1.41e+00

Pos logit contributions
ics+0.023
he+0.020
ars+0.020
 top+0.019
case+0.018
Neg logit contributions
arter-0.042
 policy-0.040
 polar-0.040
run-0.039
asured-0.038
Token happy
Token ID3772
Feature activation-1.60e+00

Pos logit contributions
arter+0.053
 policy+0.051
run+0.051
 polar+0.049
ocol+0.048
Neg logit contributions
ics-0.015
ars-0.013
he-0.012
 top-0.010
case-0.010
Token and
Token ID290
Feature activation-6.48e-01

Pos logit contributions
ics+0.020
he+0.016
 top+0.016
ars+0.015
case+0.013
Neg logit contributions
arter-0.058
run-0.055
 polar-0.055
 policy-0.054
ocol-0.053
Token played
Token ID2826
Feature activation-4.18e+00

Pos logit contributions
ics+0.012
ars+0.010
he+0.010
 top+0.010
case+0.009
Neg logit contributions
arter-0.020
 policy-0.019
run-0.019
 polar-0.019
ocol-0.018
Token with
Token ID351
Feature activation+1.31e+00

Pos logit contributions
ics+0.094
ars+0.082
he+0.079
vation+0.077
case+0.074
Neg logit contributions
arter-0.100
 polar-0.097
 policy-0.096
run-0.094
asured-0.092
Token it
Token ID340
Feature activation+2.74e-01

Pos logit contributions
arter+0.045
 policy+0.043
 polar+0.042
run+0.042
asured+0.041
Neg logit contributions
ics-0.020
 top-0.017
he-0.017
ars-0.016
case-0.014
Token every
Token ID790
Feature activation+4.12e-01

Pos logit contributions
arter+0.006
run+0.006
 policy+0.006
asured+0.006
 polar+0.006
Neg logit contributions
ics-0.006
ars-0.006
he-0.005
vation-0.005
ceed-0.005
Token day
Token ID1110
Feature activation-1.97e+00

Pos logit contributions
arter+0.012
 policy+0.011
 polar+0.011
run+0.011
asured+0.011
Neg logit contributions
ics-0.008
 top-0.007
he-0.007
ars-0.007
case-0.007
Token.
Token ID13
Feature activation+1.55e-01

Pos logit contributions
ics+0.037
he+0.032
ars+0.032
 top+0.031
case+0.029
Neg logit contributions
arter-0.059
 policy-0.056
 polar-0.056
run-0.055
asured-0.054
Token brothers
Token ID9397
Feature activation-2.83e+00

Pos logit contributions
ics+0.016
he+0.014
ars+0.014
 top+0.014
case+0.014
Neg logit contributions
arter-0.012
 polar-0.011
 policy-0.011
run-0.011
asured-0.010
Token.
Token ID13
Feature activation+3.12e-01

Pos logit contributions
ics+0.059
ars+0.051
he+0.051
 top+0.049
case+0.047
Neg logit contributions
arter-0.077
 polar-0.074
 policy-0.073
run-0.072
asured-0.071
Token They
Token ID1119
Feature activation-1.69e+00

Pos logit contributions
 policy+0.010
arter+0.010
asured+0.010
 former+0.010
ocol+0.010
Neg logit contributions
he-0.004
ars-0.004
 top-0.004
ics-0.003
The-0.003
Token like
Token ID588
Feature activation-2.56e+00

Pos logit contributions
ics+0.044
 top+0.039
ars+0.037
vation+0.037
case+0.037
Neg logit contributions
arter-0.036
 polar-0.035
 policy-0.033
asured-0.032
run-0.031
Token to
Token ID284
Feature activation-3.03e-01

Pos logit contributions
ics+0.056
he+0.050
ars+0.049
 top+0.048
case+0.046
Neg logit contributions
arter-0.069
 policy-0.065
 polar-0.064
run-0.063
asured-0.061
Token play
Token ID711
Feature activation-4.30e+00

Pos logit contributions
ics+0.006
he+0.006
ars+0.006
 top+0.005
case+0.005
Neg logit contributions
arter-0.008
 policy-0.008
 polar-0.008
run-0.008
asured-0.008
Token outside
Token ID2354
Feature activation-7.18e-01

Pos logit contributions
ics+0.095
he+0.082
ars+0.082
case+0.078
vation+0.077
Neg logit contributions
arter-0.108
 polar-0.104
 policy-0.102
run-0.100
asured-0.098
Token in
Token ID287
Feature activation+1.53e-01

Pos logit contributions
ics+0.014
he+0.012
ars+0.012
 top+0.011
case+0.011
Neg logit contributions
arter-0.021
 policy-0.020
 polar-0.020
run-0.020
asured-0.019
Token the
Token ID262
Feature activation-1.62e+00

Pos logit contributions
arter+0.005
 policy+0.005
run+0.005
 polar+0.005
asured+0.005
Neg logit contributions
ics-0.002
 top-0.002
he-0.002
ars-0.002
case-0.001
Token yard
Token ID12699
Feature activation-1.51e+00

Pos logit contributions
ics+0.030
he+0.027
ars+0.026
 top+0.026
case+0.024
Neg logit contributions
arter-0.048
 policy-0.046
 polar-0.045
run-0.045
asured-0.044
Token.
Token ID13
Feature activation+7.25e-01

Pos logit contributions
ics+0.034
ars+0.033
he+0.031
vation+0.030
 Higher+0.029
Neg logit contributions
arter-0.034
 policy-0.033
 polar-0.032
run-0.032
asured-0.032
Token says
Token ID1139
Feature activation-9.66e-01

Pos logit contributions
ics+0.019
ars+0.018
 top+0.017
he+0.016
 Higher+0.016
Neg logit contributions
 polar-0.015
arter-0.015
run-0.015
asured-0.014
 policy-0.013
Token.
Token ID13
Feature activation+3.97e-01

Pos logit contributions
ics+0.024
ars+0.022
he+0.022
 top+0.021
case+0.021
Neg logit contributions
run-0.023
arter-0.022
 policy-0.021
 polar-0.020
 Bobby-0.019
Token "
Token ID366
Feature activation-2.16e+00

Pos logit contributions
arter+0.012
 policy+0.011
 polar+0.011
run+0.011
asured+0.011
Neg logit contributions
ics-0.007
he-0.007
 top-0.006
ars-0.006
case-0.006
TokenYou
Token ID1639
Feature activation-4.26e-01

Pos logit contributions
ics+0.051
case+0.041
 top+0.040
ars+0.038
ong+0.037
Neg logit contributions
 policy-0.052
run-0.051
 polar-0.051
arter-0.051
asured-0.051
Token have
Token ID423
Feature activation-2.45e-01

Pos logit contributions
ics+0.010
 top+0.008
he+0.008
ars+0.008
 Higher+0.008
Neg logit contributions
arter-0.010
 polar-0.010
 policy-0.010
run-0.010
asured-0.009
Token played
Token ID2826
Feature activation-4.40e+00

Pos logit contributions
ics+0.005
he+0.004
ars+0.004
 top+0.004
case+0.004
Neg logit contributions
arter-0.007
 policy-0.007
 polar-0.007
run-0.006
asured-0.006
Token enough
Token ID1576
Feature activation+6.67e-01

Pos logit contributions
ics+0.097
ars+0.085
he+0.082
hed+0.077
case+0.075
Neg logit contributions
arter-0.110
 polar-0.107
 policy-0.105
run-0.104
asured-0.102
Token.
Token ID13
Feature activation+1.53e+00

Pos logit contributions
arter+0.021
run+0.018
asured+0.018
 policy+0.018
 polar+0.018
Neg logit contributions
ics-0.012
 top-0.011
he-0.011
ars-0.010
 rides-0.010
Token You
Token ID921
Feature activation-9.22e-01

Pos logit contributions
arter+0.042
 policy+0.039
 polar+0.039
run+0.038
asured+0.038
Neg logit contributions
ics-0.029
ars-0.028
he-0.028
 top-0.028
 Higher-0.026
Token need
Token ID761
Feature activation+7.67e-01

Pos logit contributions
ics+0.020
 top+0.017
he+0.016
ars+0.015
 Higher+0.015
Neg logit contributions
 polar-0.024
arter-0.024
 policy-0.023
run-0.023
asured-0.023
Token to
Token ID284
Feature activation+5.14e-01

Pos logit contributions
arter+0.025
 polar+0.023
 policy+0.023
run+0.023
asured+0.022
Neg logit contributions
ics-0.013
 top-0.011
ars-0.011
he-0.010
case-0.009
Token were
Token ID547
Feature activation-3.15e-01

Pos logit contributions
ics+0.062
ars+0.052
 top+0.052
he+0.052
case+0.049
Neg logit contributions
arter-0.081
 polar-0.079
 policy-0.077
asured-0.074
run-0.073
Token twins
Token ID20345
Feature activation-2.82e+00

Pos logit contributions
ics+0.007
he+0.007
ars+0.007
 top+0.006
case+0.006
Neg logit contributions
arter-0.008
 polar-0.007
 policy-0.007
run-0.007
asured-0.007
Token who
Token ID508
Feature activation+1.20e+00

Pos logit contributions
ics+0.067
he+0.060
ars+0.060
 top+0.058
case+0.055
Neg logit contributions
arter-0.069
 polar-0.065
 policy-0.065
run-0.064
asured-0.062
Token liked
Token ID8288
Feature activation-1.80e+00

Pos logit contributions
arter+0.031
 policy+0.030
 polar+0.029
run+0.029
asured+0.028
Neg logit contributions
ics-0.027
he-0.024
ars-0.023
 top-0.023
case-0.022
Token to
Token ID284
Feature activation-2.21e-01

Pos logit contributions
ics+0.033
 top+0.027
he+0.026
 swing+0.023
case+0.023
Neg logit contributions
arter-0.057
 policy-0.053
run-0.053
 polar-0.051
asured-0.050
Token play
Token ID711
Feature activation-4.18e+00

Pos logit contributions
ics+0.004
he+0.004
 top+0.004
ars+0.003
case+0.003
Neg logit contributions
arter-0.007
 policy-0.006
run-0.006
 polar-0.006
asured-0.006
Token together
Token ID1978
Feature activation-7.63e-01

Pos logit contributions
ics+0.079
he+0.069
ars+0.069
 top+0.065
case+0.062
Neg logit contributions
arter-0.123
 polar-0.117
 policy-0.117
run-0.114
asured-0.111
Token.
Token ID13
Feature activation+1.05e+00

Pos logit contributions
ics+0.016
ars+0.015
he+0.014
vation+0.013
 top+0.013
Neg logit contributions
arter-0.020
 policy-0.019
run-0.018
 polar-0.018
asured-0.018
Token But
Token ID887
Feature activation+1.40e+00

Pos logit contributions
arter+0.030
 policy+0.029
 polar+0.029
run+0.028
asured+0.027
Neg logit contributions
ics-0.020
 top-0.019
he-0.018
case-0.017
ars-0.017
Token sometimes
Token ID3360
Feature activation-1.02e+00

Pos logit contributions
arter+0.050
run+0.047
 policy+0.045
 polar+0.044
asured+0.043
Neg logit contributions
 top-0.019
ics-0.019
he-0.015
'-0.014
't-0.014
Token they
Token ID484
Feature activation-1.74e+00

Pos logit contributions
ics+0.017
 top+0.013
ars+0.012
he+0.012
case+0.011
Neg logit contributions
arter-0.035
run-0.033
 polar-0.032
asured-0.032
 policy-0.031