TOP ACTIVATIONS
MAX = 0.138

But the ost ric hes did not like that
" O st ric hes do not like people
the problem - an old seaf aring vessel had crashed onto
shouted . " The ost ric hes are so funny !"
cried . " The ost ric hes are mean ! They
" Mom , are the pizz as ready ?" Lily asked
Look , mom , ost ric hes !" "
that said " O st ric hes ". Sara and Tom
Yes , they are ost ric hes ," mom said .
ostr ich pen . Ost ric hes are big birds with
. One of the ost ric hes looked different from the
busy . They have many pizz as to deliver . Ben
. They were called ost ric hes . Lily and Ben
but be careful . Ost ric hes can be gr umpy
They put the pizz as in the oven and
but he also knew cater pill ars were not pets .
. Tim my found a seas hell and showed it to
ostr ich pen . Ost ric hes are big birds with
too close . The ost ric hes might pe ck you
. He was carrying their pizz as on a big tray

MIN ACTIVATIONS
MIN = -0.131

aded in the water delic ately with his fishing rod .
. Then we can swing to the other tree . The
" Can we sit close to the fireworks , Mom ?"
apped and shouted " ready to go !" She was ready
had been able to climb to the top . After a
he had managed to climb to the top . And he
He wished he could climb to the top !
backyard and wanted to climb to the top . She
. They want to climb to the top and slide down
. She wanted to climb to the top of a tall
He liked to climb up to high places and swing on
will be able to climb to the top of the tallest
climb down the steep rocks to get to it .
use it to climb up to the balloon .
this oil on the slide to make it go faster !"
heat . Then I climb to the top and I can
tall buildings to climb up to the best view on the
add more blocks on top to make it even taller !"
so she could climb up to get the apples . Mom
Can I open you up to have a look ?

INTERVAL 0.071 - 0.138
CONTAINS 3.645%

hated that idea . He tried to sneak away
eating her favorite food . She was happy to
safely on the ground . The plane was happy
Bob felt relieved . Bob had been wise and
each other secrets . One day , Lily 's

INTERVAL 0.004 - 0.071
CONTAINS 40.494%

toys , but he didn 't like to share them with
Sam pulled back . They both fell down . The doll
do it . " Let me help you ,"
the door . She had a kind face and a big
share it nicely . She also said they had to be

INTERVAL -0.064 - 0.004
CONTAINS 51.987%

. Then I will cook it for dinner ."
to play with all the time . One day , her
Do you want to go this way ?" M imi me
my , asked her , " Why do you write in
home . The other people tried to apologize to

INTERVAL -0.131 - -0.064
CONTAINS 3.874%

, the spot was visited by many animals . And all
had to measure the price carefully . The little boy thought
twins who liked to play outside . They had many toys
proud that Andy was able to pick the grape all by
she would get the meat from the cold box . Kitty
Token\n
Token ID198
Feature activation+8.98e-02

Pos logit contributions
keeping+0.038
 itself+0.031
 him+0.031
 home+0.029
ie+0.028
Neg logit contributions
istered-0.055
cases-0.054
ider-0.051
plets-0.051
arest-0.051
Token\n
Token ID198
Feature activation+8.02e-02

Pos logit contributions
istered+0.179
cases+0.177
arest+0.165
ider+0.165
plets+0.163
Neg logit contributions
keeping-0.154
 itself-0.128
 him-0.125
 home-0.119
ie-0.119
TokenBut
Token ID1537
Feature activation-3.19e-02

Pos logit contributions
istered+0.160
cases+0.156
plets+0.149
br+0.142
icious+0.141
Neg logit contributions
keeping-0.133
 itself-0.119
 him-0.115
 home-0.114
gy-0.103
Token the
Token ID262
Feature activation+5.41e-02

Pos logit contributions
keeping+0.054
 itself+0.042
 him+0.039
ie+0.038
 generation+0.036
Neg logit contributions
cases-0.066
istered-0.066
ider-0.062
plets-0.061
arest-0.060
Token ost
Token ID23619
Feature activation+4.73e-02

Pos logit contributions
istered+0.119
cases+0.118
plets+0.108
arest+0.108
ider+0.107
Neg logit contributions
keeping-0.081
 home-0.070
 itself-0.069
 him-0.068
ie-0.064
Tokenric
Token ID1173
Feature activation+1.38e-01

Pos logit contributions
istered+0.127
cases+0.127
arest+0.120
br+0.120
ider+0.118
Neg logit contributions
keeping-0.048
 itself-0.037
gy-0.037
ie-0.035
 him-0.034
Tokenhes
Token ID956
Feature activation+2.15e-02

Pos logit contributions
istered+0.127
cases+0.110
ider+0.095
icious+0.094
plets+0.093
Neg logit contributions
keeping-0.373
 itself-0.371
 him-0.360
 himself-0.336
ie-0.335
Token did
Token ID750
Feature activation+2.57e-02

Pos logit contributions
istered+0.045
cases+0.045
ider+0.042
plets+0.042
arest+0.042
Neg logit contributions
keeping-0.034
 itself-0.028
 him-0.027
 home-0.026
 Joey-0.024
Token not
Token ID407
Feature activation+4.30e-03

Pos logit contributions
cases+0.036
istered+0.036
plets+0.034
arest+0.033
lic+0.032
Neg logit contributions
keeping-0.055
 itself-0.055
 him-0.054
 home-0.051
 his-0.050
Token like
Token ID588
Feature activation-5.87e-03

Pos logit contributions
cases+0.010
istered+0.010
ider+0.009
plets+0.009
arest+0.009
Neg logit contributions
keeping-0.006
 him-0.006
 itself-0.006
 home-0.005
ie-0.005
Token that
Token ID326
Feature activation-3.36e-02

Pos logit contributions
keeping+0.007
 itself+0.005
 him+0.005
gy+0.005
ie+0.005
Neg logit contributions
cases-0.014
istered-0.014
plets-0.013
br-0.013
ider-0.013
Token\n
Token ID198
Feature activation+9.08e-02

Pos logit contributions
keeping+0.032
 him+0.028
 itself+0.027
 home+0.025
 O+0.023
Neg logit contributions
istered-0.059
cases-0.057
arest-0.056
plets-0.055
ider-0.055
Token\n
Token ID198
Feature activation+8.11e-02

Pos logit contributions
istered+0.180
cases+0.178
arest+0.166
ider+0.165
plets+0.164
Neg logit contributions
keeping-0.155
 itself-0.129
 him-0.126
 home-0.121
ie-0.119
Token"
Token ID1
Feature activation+2.43e-02

Pos logit contributions
istered+0.143
cases+0.139
plets+0.132
ider+0.127
arest+0.126
Neg logit contributions
keeping-0.157
 him-0.140
 itself-0.139
 home-0.133
ie-0.125
TokenO
Token ID46
Feature activation+9.53e-04

Pos logit contributions
istered+0.053
cases+0.052
aj+0.047
arest+0.046
plets+0.046
Neg logit contributions
keeping-0.037
 itself-0.033
 him-0.033
 Twe-0.028
 generation-0.028
Tokenst
Token ID301
Feature activation+3.16e-02

Pos logit contributions
istered+0.001
cases+0.001
ider+0.001
plets+0.001
br+0.001
Neg logit contributions
keeping-0.002
 itself-0.002
 him-0.002
ie-0.002
gy-0.002
Tokenric
Token ID1173
Feature activation+1.30e-01

Pos logit contributions
istered+0.057
cases+0.056
ider+0.052
arest+0.051
plets+0.051
Neg logit contributions
keeping-0.060
 itself-0.053
 him-0.052
 home-0.050
ie-0.049
Tokenhes
Token ID956
Feature activation+2.35e-02

Pos logit contributions
istered+0.127
cases+0.108
plets+0.097
ider+0.095
arest+0.094
Neg logit contributions
 itself-0.345
keeping-0.345
 him-0.337
 himself-0.316
 his-0.310
Token do
Token ID466
Feature activation-1.22e-02

Pos logit contributions
cases+0.040
istered+0.040
arest+0.037
plets+0.036
ider+0.036
Neg logit contributions
keeping-0.047
 itself-0.039
 him-0.039
 home-0.037
 generation-0.036
Token not
Token ID407
Feature activation+1.17e-03

Pos logit contributions
 itself+0.024
 him+0.022
keeping+0.022
 home+0.021
ie+0.021
Neg logit contributions
istered-0.021
cases-0.020
ider-0.019
plets-0.018
lic-0.018
Token like
Token ID588
Feature activation-1.68e-02

Pos logit contributions
cases+0.003
istered+0.003
ider+0.002
plets+0.002
arest+0.002
Neg logit contributions
keeping-0.002
 him-0.002
 itself-0.002
 home-0.001
ie-0.001
Token people
Token ID661
Feature activation-1.23e-03

Pos logit contributions
keeping+0.025
 itself+0.022
 him+0.022
 home+0.021
ie+0.019
Neg logit contributions
istered-0.037
cases-0.036
ider-0.035
arest-0.034
plets-0.034
Token the
Token ID262
Feature activation+2.36e-02

Pos logit contributions
keeping+0.061
 itself+0.047
 him+0.045
ie+0.043
 home+0.042
Neg logit contributions
istered-0.110
cases-0.108
ider-0.101
plets-0.101
arest-0.101
Token problem
Token ID1917
Feature activation-1.00e-02

Pos logit contributions
istered+0.053
cases+0.052
ider+0.049
plets+0.048
uts+0.048
Neg logit contributions
keeping-0.036
 home-0.031
 itself-0.029
 him-0.029
ie-0.027
Token -
Token ID532
Feature activation-2.18e-02

Pos logit contributions
keeping+0.016
 itself+0.013
 him+0.013
 home+0.012
ie+0.011
Neg logit contributions
istered-0.022
cases-0.021
ider-0.020
plets-0.020
arest-0.020
Token an
Token ID281
Feature activation+5.62e-02

Pos logit contributions
keeping+0.031
 itself+0.026
 him+0.026
 home+0.025
 Joey+0.023
Neg logit contributions
istered-0.050
cases-0.049
ider-0.047
arest-0.046
plets-0.046
Token old
Token ID1468
Feature activation+2.31e-05

Pos logit contributions
istered+0.090
cases+0.088
ider+0.081
arest+0.081
plets+0.080
Neg logit contributions
keeping-0.118
 itself-0.102
 him-0.101
 home-0.096
ie-0.095
Token seaf
Token ID25127
Feature activation+1.29e-01

Pos logit contributions
istered+0.000
cases+0.000
ider+0.000
plets+0.000
arest+0.000
Neg logit contributions
keeping-0.000
 itself-0.000
 him-0.000
 home-0.000
ie-0.000
Tokenaring
Token ID1723
Feature activation-5.09e-03

Pos logit contributions
istered+0.190
icious+0.184
cases+0.162
aj+0.159
plets+0.155
Neg logit contributions
keeping-0.249
away-0.244
 home-0.242
 itself-0.234
ie-0.225
Token vessel
Token ID8837
Feature activation+5.24e-02

Pos logit contributions
keeping+0.009
 itself+0.008
 him+0.008
 home+0.007
gy+0.007
Neg logit contributions
istered-0.010
cases-0.010
plets-0.009
arest-0.009
ider-0.009
Token had
Token ID550
Feature activation+3.65e-03

Pos logit contributions
istered+0.111
cases+0.106
ider+0.101
arest+0.100
plets+0.100
Neg logit contributions
keeping-0.085
 itself-0.073
 him-0.073
 home-0.069
ie-0.066
Token crashed
Token ID14997
Feature activation+1.24e-03

Pos logit contributions
istered+0.008
cases+0.007
ider+0.007
plets+0.007
arest+0.007
Neg logit contributions
keeping-0.006
 itself-0.006
 him-0.005
 home-0.005
 Joey-0.005
Token onto
Token ID4291
Feature activation-6.44e-02

Pos logit contributions
istered+0.003
cases+0.003
ider+0.003
arest+0.003
plets+0.003
Neg logit contributions
keeping-0.002
 itself-0.001
 him-0.001
 home-0.001
ie-0.001
Token shouted
Token ID17293
Feature activation+1.83e-02

Pos logit contributions
keeping+0.043
 him+0.038
 itself+0.037
 home+0.034
ie+0.033
Neg logit contributions
istered-0.056
arest-0.053
icious-0.052
plets-0.051
cases-0.051
Token.
Token ID13
Feature activation-2.94e-02

Pos logit contributions
istered+0.035
cases+0.034
ider+0.032
plets+0.030
 short+0.030
Neg logit contributions
keeping-0.035
 itself-0.027
 generation-0.026
ie-0.026
 him-0.025
Token "
Token ID366
Feature activation+2.00e-02

Pos logit contributions
keeping+0.051
 itself+0.042
 him+0.042
 home+0.039
ie+0.039
Neg logit contributions
istered-0.059
cases-0.058
ider-0.054
plets-0.054
arest-0.053
TokenThe
Token ID464
Feature activation+4.09e-02

Pos logit contributions
istered+0.030
cases+0.030
ider+0.027
plets+0.027
arest+0.027
Neg logit contributions
keeping-0.044
 itself-0.039
 him-0.038
 home-0.037
ie-0.036
Token ost
Token ID23619
Feature activation+3.58e-02

Pos logit contributions
cases+0.088
istered+0.087
ider+0.081
arest+0.079
plets+0.078
Neg logit contributions
keeping-0.067
 home-0.061
 itself-0.059
 him-0.058
 O-0.052
Tokenric
Token ID1173
Feature activation+1.27e-01

Pos logit contributions
istered+0.095
cases+0.094
arest+0.088
ider+0.087
br+0.087
Neg logit contributions
keeping-0.038
 itself-0.031
gy-0.029
 him-0.029
ie-0.027
Tokenhes
Token ID956
Feature activation+1.38e-02

Pos logit contributions
istered+0.129
cases+0.108
ider+0.096
icious+0.093
arest+0.089
Neg logit contributions
keeping-0.336
 itself-0.336
 him-0.328
 himself-0.305
 his-0.304
Token are
Token ID389
Feature activation+4.18e-02

Pos logit contributions
istered+0.026
cases+0.026
ider+0.024
plets+0.024
arest+0.024
Neg logit contributions
keeping-0.025
 itself-0.021
 him-0.021
 home-0.020
ie-0.020
Token so
Token ID523
Feature activation+2.66e-02

Pos logit contributions
cases+0.079
istered+0.075
arest+0.073
ider+0.073
plets+0.071
Neg logit contributions
keeping-0.075
 itself-0.074
 home-0.072
 him-0.071
 its-0.069
Token funny
Token ID8258
Feature activation-8.96e-03

Pos logit contributions
istered+0.043
cases+0.042
ider+0.042
icious+0.039
plets+0.038
Neg logit contributions
 itself-0.051
keeping-0.050
 its-0.049
 home-0.049
 him-0.047
Token!"
Token ID2474
Feature activation-1.27e-02

Pos logit contributions
keeping+0.015
 itself+0.014
 him+0.013
 home+0.012
ie+0.012
Neg logit contributions
istered-0.019
cases-0.018
arest-0.017
ider-0.017
icious-0.016
Token cried
Token ID16896
Feature activation+1.26e-02

Pos logit contributions
 him+0.037
keeping+0.035
 itself+0.032
ie+0.031
 home+0.031
Neg logit contributions
istered-0.047
ider-0.046
cases-0.046
arest-0.046
icious-0.044
Token.
Token ID13
Feature activation-2.76e-02

Pos logit contributions
istered+0.026
cases+0.026
ider+0.024
plets+0.024
arest+0.024
Neg logit contributions
keeping-0.021
 itself-0.017
 him-0.016
 home-0.015
ie-0.015
Token "
Token ID366
Feature activation+2.06e-02

Pos logit contributions
keeping+0.039
 itself+0.034
 him+0.033
 home+0.030
gy+0.029
Neg logit contributions
istered-0.063
cases-0.062
arest-0.060
ider-0.058
plets-0.058
TokenThe
Token ID464
Feature activation+4.05e-02

Pos logit contributions
cases+0.036
istered+0.035
ider+0.033
arest+0.032
br+0.031
Neg logit contributions
keeping-0.040
 him-0.036
 itself-0.036
 home-0.032
gy-0.031
Token ost
Token ID23619
Feature activation+3.26e-02

Pos logit contributions
cases+0.077
istered+0.075
plets+0.068
arest+0.068
ider+0.068
Neg logit contributions
keeping-0.075
 him-0.065
 itself-0.065
 home-0.064
ie-0.058
Tokenric
Token ID1173
Feature activation+1.25e-01

Pos logit contributions
istered+0.083
cases+0.083
arest+0.078
ider+0.077
br+0.077
Neg logit contributions
keeping-0.038
 itself-0.030
 him-0.030
ie-0.028
gy-0.027
Tokenhes
Token ID956
Feature activation+1.37e-02

Pos logit contributions
istered+0.125
cases+0.121
ider+0.103
icious+0.095
arest+0.095
Neg logit contributions
 itself-0.325
keeping-0.323
 him-0.321
 his-0.298
 himself-0.297
Token are
Token ID389
Feature activation+4.02e-02

Pos logit contributions
istered+0.025
cases+0.024
ider+0.022
plets+0.022
arest+0.022
Neg logit contributions
keeping-0.026
 itself-0.022
 him-0.022
 home-0.021
ie-0.020
Token mean
Token ID1612
Feature activation-1.78e-02

Pos logit contributions
cases+0.079
ider+0.068
istered+0.065
arest+0.065
plets+0.063
Neg logit contributions
 itself-0.077
 him-0.075
 its-0.074
 home-0.072
 his-0.071
Token!
Token ID0
Feature activation-4.00e-02

Pos logit contributions
keeping+0.029
 him+0.026
 itself+0.025
 home+0.024
ie+0.024
Neg logit contributions
istered-0.038
cases-0.038
ider-0.036
arest-0.036
icious-0.035
Token They
Token ID1119
Feature activation+1.16e-02

Pos logit contributions
keeping+0.055
 itself+0.046
 him+0.045
 home+0.040
ie+0.039
Neg logit contributions
istered-0.093
cases-0.092
arest-0.087
ider-0.087
plets-0.086
Token"
Token ID1
Feature activation+2.35e-02

Pos logit contributions
cases+0.170
istered+0.165
plets+0.159
br+0.149
uts+0.146
Neg logit contributions
keeping-0.104
 him-0.102
 itself-0.100
 home-0.099
ie-0.095
TokenMom
Token ID29252
Feature activation+1.16e-02

Pos logit contributions
cases+0.050
istered+0.049
plets+0.045
arest+0.045
uts+0.045
Neg logit contributions
keeping-0.036
 him-0.032
 itself-0.031
 home-0.028
gent-0.027
Token,
Token ID11
Feature activation+3.61e-02

Pos logit contributions
istered+0.028
cases+0.027
ider+0.026
arest+0.026
plets+0.026
Neg logit contributions
keeping-0.015
 him-0.013
 itself-0.012
 home-0.012
ie-0.011
Token are
Token ID389
Feature activation+5.66e-02

Pos logit contributions
istered+0.053
cases+0.053
arest+0.048
ider+0.048
br+0.048
Neg logit contributions
keeping-0.082
 itself-0.069
 him-0.065
gy-0.065
ie-0.064
Token the
Token ID262
Feature activation+4.98e-02

Pos logit contributions
istered+0.104
cases+0.102
ider+0.095
arest+0.094
plets+0.094
Neg logit contributions
keeping-0.106
 itself-0.091
 him-0.089
 home-0.085
ie-0.083
Token pizz
Token ID38748
Feature activation+1.25e-01

Pos logit contributions
istered+0.097
cases+0.097
plets+0.091
arest+0.089
ider+0.088
Neg logit contributions
keeping-0.085
 him-0.078
 home-0.075
 itself-0.072
ie-0.069
Tokenas
Token ID292
Feature activation+2.10e-02

Pos logit contributions
 annoying+0.139
istered+0.138
cases+0.125
plets+0.122
 childish+0.110
Neg logit contributions
keeping-0.299
 him-0.270
 home-0.262
ie-0.257
 itself-0.250
Token ready
Token ID3492
Feature activation-5.98e-03

Pos logit contributions
istered+0.044
cases+0.043
plets+0.041
arest+0.041
ider+0.040
Neg logit contributions
keeping-0.033
 him-0.029
 home-0.028
 itself-0.028
ie-0.028
Token?"
Token ID1701
Feature activation+6.09e-03

Pos logit contributions
keeping+0.010
 him+0.009
 itself+0.009
 home+0.008
ie+0.008
Neg logit contributions
istered-0.012
cases-0.012
plets-0.011
ider-0.011
arest-0.011
Token Lily
Token ID20037
Feature activation-2.29e-02

Pos logit contributions
istered+0.015
cases+0.015
ider+0.014
plets+0.014
arest+0.014
Neg logit contributions
keeping-0.007
 itself-0.006
 him-0.005
 home-0.005
ie-0.005
Token asked
Token ID1965
Feature activation-2.58e-02

Pos logit contributions
keeping+0.028
 him+0.024
 itself+0.022
ie+0.021
 home+0.021
Neg logit contributions
istered-0.058
arest-0.055
cases-0.055
plets-0.055
br-0.054
TokenLook
Token ID8567
Feature activation+1.66e-02

Pos logit contributions
istered+0.029
cases+0.029
plets+0.028
arest+0.027
br+0.027
Neg logit contributions
keeping-0.022
 itself-0.018
 him-0.017
 home-0.017
ie-0.016
Token,
Token ID11
Feature activation+3.02e-02

Pos logit contributions
cases+0.033
istered+0.033
arest+0.030
br+0.030
plets+0.030
Neg logit contributions
keeping-0.030
 itself-0.024
 generation-0.021
gent-0.021
 him-0.021
Token mom
Token ID1995
Feature activation-1.12e-04

Pos logit contributions
cases+0.061
istered+0.060
arest+0.056
plets+0.056
ider+0.055
Neg logit contributions
keeping-0.051
 itself-0.042
 him-0.040
 home-0.038
ie-0.038
Token,
Token ID11
Feature activation+3.17e-02

Pos logit contributions
keeping+0.000
 him+0.000
ie+0.000
 itself+0.000
 home+0.000
Neg logit contributions
istered-0.000
cases-0.000
ider-0.000
arest-0.000
plets-0.000
Token ost
Token ID23619
Feature activation+3.74e-02

Pos logit contributions
cases+0.062
istered+0.062
br+0.057
arest+0.057
plets+0.057
Neg logit contributions
keeping-0.055
 itself-0.045
 him-0.043
ie-0.042
 home-0.041
Tokenric
Token ID1173
Feature activation+1.25e-01

Pos logit contributions
istered+0.096
cases+0.095
arest+0.090
ider+0.090
plets+0.088
Neg logit contributions
keeping-0.043
 itself-0.035
 him-0.033
gy-0.032
ie-0.031
Tokenhes
Token ID956
Feature activation+1.21e-02

Pos logit contributions
istered+0.124
cases+0.108
ider+0.101
arest+0.091
icious+0.090
Neg logit contributions
keeping-0.338
 itself-0.332
 him-0.324
 home-0.303
 its-0.298
Token!"
Token ID2474
Feature activation-2.01e-02

Pos logit contributions
istered+0.025
cases+0.024
ider+0.023
arest+0.023
plets+0.023
Neg logit contributions
keeping-0.020
 itself-0.017
 him-0.017
 home-0.016
ie-0.016
Token\n
Token ID198
Feature activation+7.63e-02

Pos logit contributions
keeping+0.037
 itself+0.028
ie+0.027
 him+0.027
 generation+0.027
Neg logit contributions
cases-0.039
istered-0.037
ider-0.036
uts-0.035
plets-0.035
Token\n
Token ID198
Feature activation+6.85e-02

Pos logit contributions
istered+0.144
cases+0.143
arest+0.134
ider+0.132
plets+0.132
Neg logit contributions
keeping-0.138
 itself-0.114
 him-0.110
 home-0.105
ie-0.103
Token"
Token ID1
Feature activation+2.00e-02

Pos logit contributions
istered+0.113
cases+0.111
ider+0.103
arest+0.102
plets+0.102
Neg logit contributions
keeping-0.142
 itself-0.123
 him-0.122
 home-0.117
ie-0.114
Token that
Token ID326
Feature activation-2.99e-02

Pos logit contributions
istered+0.099
icious+0.091
cases+0.090
plets+0.089
arest+0.087
Neg logit contributions
keeping-0.052
 itself-0.046
 him-0.046
 home-0.043
 his-0.038
Token said
Token ID531
Feature activation+7.52e-03

Pos logit contributions
keeping+0.051
 him+0.048
 itself+0.047
 O+0.044
 Joey+0.044
Neg logit contributions
istered-0.061
ider-0.056
icious-0.056
arest-0.053
plets-0.052
Token "
Token ID366
Feature activation+1.15e-05

Pos logit contributions
istered+0.015
cases+0.015
ider+0.014
arest+0.014
plets+0.014
Neg logit contributions
keeping-0.013
 itself-0.011
 him-0.011
 home-0.010
ie-0.010
TokenO
Token ID46
Feature activation-7.11e-03

Pos logit contributions
istered+0.000
cases+0.000
ider+0.000
plets+0.000
arest+0.000
Neg logit contributions
keeping-0.000
 itself-0.000
 him-0.000
 home-0.000
ie-0.000
Tokenst
Token ID301
Feature activation+2.34e-02

Pos logit contributions
keeping+0.014
gy+0.013
 itself+0.013
ie+0.013
 him+0.012
Neg logit contributions
istered-0.012
cases-0.011
ider-0.010
arest-0.010
plets-0.010
Tokenric
Token ID1173
Feature activation+1.22e-01

Pos logit contributions
istered+0.043
cases+0.041
ider+0.038
arest+0.037
plets+0.037
Neg logit contributions
keeping-0.044
 itself-0.040
 him-0.039
 home-0.038
gy-0.037
Tokenhes
Token ID956
Feature activation+1.49e-02

Pos logit contributions
istered+0.139
cases+0.120
ider+0.110
icious+0.101
arest+0.100
Neg logit contributions
keeping-0.309
 itself-0.302
 him-0.294
 home-0.280
 its-0.270
Token".
Token ID1911
Feature activation-2.52e-02

Pos logit contributions
istered+0.030
cases+0.029
ider+0.028
arest+0.027
plets+0.027
Neg logit contributions
keeping-0.025
 itself-0.022
 him-0.022
 home-0.021
ie-0.020
Token Sara
Token ID24799
Feature activation-9.75e-03

Pos logit contributions
keeping+0.041
ie+0.034
 itself+0.034
 him+0.033
 home+0.031
Neg logit contributions
cases-0.053
istered-0.051
ider-0.048
plets-0.047
arest-0.045
Token and
Token ID290
Feature activation-6.96e-03

Pos logit contributions
keeping+0.012
gy+0.011
 him+0.011
 itself+0.011
 home+0.010
Neg logit contributions
istered-0.023
arest-0.022
cases-0.022
plets-0.021
ider-0.021
Token Tom
Token ID4186
Feature activation-2.39e-02

Pos logit contributions
keeping+0.014
 itself+0.012
ie+0.011
 him+0.011
 home+0.011
Neg logit contributions
cases-0.012
istered-0.012
arest-0.011
ider-0.011
plets-0.011
TokenYes
Token ID5297
Feature activation+1.16e-03

Pos logit contributions
istered+0.050
cases+0.049
ider+0.045
arest+0.044
aj+0.044
Neg logit contributions
keeping-0.025
 itself-0.021
 him-0.020
 home-0.017
 O-0.017
Token,
Token ID11
Feature activation+3.20e-02

Pos logit contributions
istered+0.003
cases+0.003
ider+0.002
plets+0.002
arest+0.002
Neg logit contributions
keeping-0.002
 itself-0.001
 him-0.001
 home-0.001
ie-0.001
Token they
Token ID484
Feature activation+2.73e-02

Pos logit contributions
istered+0.073
cases+0.070
ider+0.067
arest+0.066
plets+0.065
Neg logit contributions
keeping-0.045
 itself-0.042
 him-0.040
ie-0.037
 home-0.037
Token are
Token ID389
Feature activation+4.67e-02

Pos logit contributions
cases+0.062
ider+0.062
istered+0.061
plets+0.057
aj+0.055
Neg logit contributions
keeping-0.036
 him-0.036
 itself-0.033
 home-0.032
ie-0.031
Token ost
Token ID23619
Feature activation+3.24e-02

Pos logit contributions
istered+0.099
cases+0.098
ider+0.091
plets+0.090
arest+0.088
Neg logit contributions
keeping-0.075
 itself-0.069
 home-0.066
 him-0.065
 O-0.060
Tokenric
Token ID1173
Feature activation+1.21e-01

Pos logit contributions
istered+0.082
cases+0.081
arest+0.076
ider+0.076
br+0.075
Neg logit contributions
keeping-0.038
 itself-0.031
gy-0.029
 him-0.029
ie-0.029
Tokenhes
Token ID956
Feature activation+3.87e-03

Pos logit contributions
istered+0.118
cases+0.103
ider+0.094
plets+0.088
icious+0.085
Neg logit contributions
keeping-0.330
 itself-0.322
 him-0.315
 home-0.297
 himself-0.289
Token,"
Token ID553
Feature activation-3.20e-03

Pos logit contributions
istered+0.008
cases+0.008
ider+0.008
plets+0.008
arest+0.008
Neg logit contributions
keeping-0.006
 itself-0.005
 him-0.005
 home-0.005
ie-0.005
Token mom
Token ID1995
Feature activation-9.18e-03

Pos logit contributions
keeping+0.006
 generation+0.005
 itself+0.004
gy+0.004
 home+0.004
Neg logit contributions
istered-0.006
cases-0.006
 short-0.006
keys-0.005
ider-0.005
Token said
Token ID531
Feature activation-1.37e-03

Pos logit contributions
keeping+0.020
 him+0.018
 itself+0.018
ie+0.017
 home+0.017
Neg logit contributions
istered-0.014
cases-0.014
arest-0.013
ider-0.013
plets-0.013
Token.
Token ID13
Feature activation-3.39e-02

Pos logit contributions
keeping+0.003
 generation+0.002
 itself+0.002
 him+0.002
ie+0.002
Neg logit contributions
istered-0.002
cases-0.002
ider-0.002
uts-0.002
 short-0.002
Token ostr
Token ID42387
Feature activation+2.13e-02

Pos logit contributions
istered+0.031
cases+0.031
plets+0.029
ider+0.029
arest+0.029
Neg logit contributions
keeping-0.023
 itself-0.020
 him-0.019
 home-0.019
ie-0.017
Tokenich
Token ID488
Feature activation-1.09e-02

Pos logit contributions
istered+0.031
cases+0.030
arest+0.028
br+0.027
plets+0.027
Neg logit contributions
keeping-0.046
gy-0.042
 him-0.041
 home-0.041
 itself-0.041
Token pen
Token ID3112
Feature activation+2.33e-02

Pos logit contributions
keeping+0.022
 itself+0.019
 him+0.017
 Joey+0.017
gent+0.016
Neg logit contributions
cases-0.018
istered-0.017
ider-0.017
arest-0.015
plets-0.015
Token.
Token ID13
Feature activation-3.39e-02

Pos logit contributions
istered+0.057
cases+0.053
plets+0.052
arest+0.052
icious+0.050
Neg logit contributions
keeping-0.032
 itself-0.027
 him-0.027
 home-0.026
ie-0.025
Token Ost
Token ID38919
Feature activation-3.92e-03

Pos logit contributions
keeping+0.059
ie+0.049
 home+0.046
 itself+0.046
 him+0.044
Neg logit contributions
cases-0.067
istered-0.066
plets-0.062
lic-0.061
ider-0.061
Tokenric
Token ID1173
Feature activation+1.20e-01

Pos logit contributions
keeping+0.006
 him+0.005
 itself+0.005
 home+0.005
 Joey+0.005
Neg logit contributions
cases-0.008
istered-0.008
icious-0.007
plets-0.007
arest-0.007
Tokenhes
Token ID956
Feature activation+2.23e-02

Pos logit contributions
istered+0.115
cases+0.101
ider+0.096
arest+0.086
icious+0.084
Neg logit contributions
keeping-0.326
 itself-0.320
 him-0.316
 himself-0.295
 his-0.291
Token are
Token ID389
Feature activation+5.42e-02

Pos logit contributions
cases+0.042
istered+0.041
plets+0.038
ider+0.038
arest+0.038
Neg logit contributions
keeping-0.040
 itself-0.034
 him-0.033
 home-0.032
 Joey-0.031
Token big
Token ID1263
Feature activation+5.32e-03

Pos logit contributions
cases+0.098
arest+0.094
plets+0.094
istered+0.092
ider+0.090
Neg logit contributions
keeping-0.101
 home-0.099
 itself-0.099
 him-0.098
 his-0.095
Token birds
Token ID10087
Feature activation+2.81e-02

Pos logit contributions
istered+0.011
cases+0.011
arest+0.010
ider+0.010
plets+0.010
Neg logit contributions
keeping-0.009
 him-0.008
 itself-0.008
 home-0.007
ie-0.007
Token with
Token ID351
Feature activation-5.13e-03

Pos logit contributions
cases+0.069
istered+0.069
ider+0.066
arest+0.065
plets+0.065
Neg logit contributions
keeping-0.034
 him-0.030
 itself-0.029
 home-0.027
ie-0.027
Token.
Token ID13
Feature activation-3.75e-02

Pos logit contributions
istered+0.122
cases+0.122
plets+0.114
ider+0.113
icious+0.110
Neg logit contributions
keeping-0.145
 itself-0.142
 him-0.142
 home-0.132
ie-0.131
Token One
Token ID1881
Feature activation+5.34e-02

Pos logit contributions
keeping+0.073
 itself+0.060
 him+0.060
ie+0.060
 home+0.058
Neg logit contributions
istered-0.067
cases-0.066
ider-0.062
plets-0.062
lic-0.060
Token of
Token ID286
Feature activation-3.55e-02

Pos logit contributions
istered+0.092
cases+0.091
ider+0.084
plets+0.083
arest+0.083
Neg logit contributions
keeping-0.106
 itself-0.092
 him-0.091
 home-0.087
ie-0.085
Token the
Token ID262
Feature activation+3.06e-02

Pos logit contributions
keeping+0.044
gy+0.027
 itself+0.026
 home+0.026
 him+0.024
Neg logit contributions
istered-0.085
cases-0.084
br-0.082
 short-0.081
arest-0.081
Token ost
Token ID23619
Feature activation+3.59e-02

Pos logit contributions
istered+0.069
cases+0.068
ider+0.063
plets+0.063
uts+0.062
Neg logit contributions
keeping-0.044
 home-0.039
 itself-0.037
 him-0.037
ie-0.036
Tokenric
Token ID1173
Feature activation+1.19e-01

Pos logit contributions
istered+0.093
cases+0.092
arest+0.087
ider+0.086
br+0.085
Neg logit contributions
keeping-0.041
 itself-0.032
 him-0.030
gy-0.030
 home-0.029
Tokenhes
Token ID956
Feature activation+1.41e-02

Pos logit contributions
istered+0.108
cases+0.090
ider+0.079
icious+0.078
plets+0.077
Neg logit contributions
keeping-0.328
 itself-0.323
 him-0.314
 home-0.296
ie-0.291
Token looked
Token ID3114
Feature activation+3.35e-02

Pos logit contributions
cases+0.028
istered+0.028
ider+0.026
plets+0.026
arest+0.026
Neg logit contributions
keeping-0.023
 itself-0.019
 him-0.019
 home-0.018
 Joey-0.017
Token different
Token ID1180
Feature activation-6.69e-03

Pos logit contributions
istered+0.064
cases+0.064
plets+0.061
arest+0.060
ider+0.060
Neg logit contributions
keeping-0.056
 him-0.052
 itself-0.052
 home-0.049
ie-0.047
Token from
Token ID422
Feature activation-5.83e-02

Pos logit contributions
keeping+0.014
 him+0.011
 itself+0.011
 home+0.011
 Joey+0.011
Neg logit contributions
cases-0.011
istered-0.011
ider-0.010
plets-0.010
arest-0.010
Token the
Token ID262
Feature activation+1.86e-02

Pos logit contributions
keeping+0.106
 him+0.078
 itself+0.078
ie+0.077
 Joey+0.075
Neg logit contributions
istered-0.105
cases-0.101
 short-0.101
 long-0.099
ider-0.097
Token busy
Token ID8179
Feature activation-9.19e-03

Pos logit contributions
plets+0.110
br+0.109
Aud+0.104
icious+0.104
cases+0.103
Neg logit contributions
 home-0.115
 him-0.101
 his-0.099
keeping-0.098
 Joey-0.095
Token.
Token ID13
Feature activation-3.65e-02

Pos logit contributions
keeping+0.015
 itself+0.013
 him+0.012
 home+0.012
ie+0.011
Neg logit contributions
istered-0.019
cases-0.019
ider-0.017
plets-0.017
arest-0.017
Token They
Token ID1119
Feature activation+1.81e-02

Pos logit contributions
keeping+0.060
 itself+0.049
 him+0.047
ie+0.047
 home+0.046
Neg logit contributions
istered-0.076
cases-0.075
ider-0.070
plets-0.070
arest-0.068
Token have
Token ID423
Feature activation+3.97e-02

Pos logit contributions
istered+0.040
cases+0.039
ider+0.038
plets+0.038
arest+0.037
Neg logit contributions
keeping-0.027
 him-0.023
 itself-0.022
 home-0.021
ie-0.020
Token many
Token ID867
Feature activation+1.45e-02

Pos logit contributions
istered+0.095
plets+0.092
cases+0.091
ider+0.091
arest+0.089
Neg logit contributions
keeping-0.056
 him-0.048
 home-0.046
 itself-0.045
 his-0.041
Token pizz
Token ID38748
Feature activation+1.19e-01

Pos logit contributions
plets+0.029
istered+0.029
ider+0.029
cases+0.027
uts+0.027
Neg logit contributions
 him-0.024
 his-0.022
keeping-0.022
 home-0.021
ie-0.020
Tokenas
Token ID292
Feature activation+6.71e-03

Pos logit contributions
 annoying+0.141
istered+0.133
plets+0.121
 swollen+0.116
 childish+0.115
Neg logit contributions
keeping-0.288
 him-0.264
 home-0.249
 itself-0.236
ie-0.235
Token to
Token ID284
Feature activation-8.50e-02

Pos logit contributions
istered+0.014
cases+0.014
ider+0.013
arest+0.013
plets+0.013
Neg logit contributions
keeping-0.011
 him-0.009
 itself-0.009
 home-0.008
ie-0.008
Token deliver
Token ID5203
Feature activation-1.65e-02

Pos logit contributions
keeping+0.121
 itself+0.098
 him+0.094
 home+0.089
ie+0.087
Neg logit contributions
istered-0.192
cases-0.190
arest-0.179
ider-0.178
plets-0.177
Token.
Token ID13
Feature activation-3.08e-02

Pos logit contributions
keeping+0.025
 itself+0.020
 him+0.019
ie+0.018
 home+0.018
Neg logit contributions
cases-0.036
istered-0.036
arest-0.033
ider-0.033
plets-0.033
Token Ben
Token ID3932
Feature activation-1.10e-02

Pos logit contributions
keeping+0.045
 him+0.040
 itself+0.039
 home+0.036
 Joey+0.035
Neg logit contributions
istered-0.068
cases-0.066
arest-0.064
ider-0.063
plets-0.062
Token.
Token ID13
Feature activation-3.16e-02

Pos logit contributions
istered+0.071
icious+0.068
arest+0.067
cases+0.067
plets+0.066
Neg logit contributions
keeping-0.029
 him-0.025
 itself-0.024
 home-0.023
ie-0.020
Token They
Token ID1119
Feature activation+1.72e-02

Pos logit contributions
keeping+0.050
 itself+0.040
 him+0.040
ie+0.038
 home+0.038
Neg logit contributions
cases-0.067
istered-0.067
ider-0.063
plets-0.062
arest-0.061
Token were
Token ID547
Feature activation+7.10e-02

Pos logit contributions
istered+0.039
ider+0.037
cases+0.037
arest+0.037
icious+0.036
Neg logit contributions
 him-0.024
keeping-0.024
 itself-0.021
ie-0.020
 home-0.020
Token called
Token ID1444
Feature activation+1.47e-02

Pos logit contributions
istered+0.144
cases+0.142
arest+0.140
ider+0.138
plets+0.138
Neg logit contributions
 home-0.119
keeping-0.115
 him-0.113
 itself-0.108
 his-0.104
Token ost
Token ID23619
Feature activation+3.01e-02

Pos logit contributions
istered+0.028
cases+0.027
ider+0.025
plets+0.025
arest+0.025
Neg logit contributions
keeping-0.026
 itself-0.022
 him-0.022
 home-0.021
ie-0.021
Tokenric
Token ID1173
Feature activation+1.18e-01

Pos logit contributions
istered+0.080
cases+0.077
arest+0.076
br+0.075
ider+0.074
Neg logit contributions
keeping-0.032
gy-0.026
 itself-0.026
 him-0.025
ie-0.024
Tokenhes
Token ID956
Feature activation+2.47e-02

Pos logit contributions
istered+0.121
cases+0.103
ider+0.098
icious+0.095
arest+0.090
Neg logit contributions
keeping-0.315
 itself-0.309
 him-0.304
 home-0.281
 himself-0.280
Token.
Token ID13
Feature activation-2.75e-02

Pos logit contributions
istered+0.055
cases+0.053
ider+0.051
arest+0.051
plets+0.050
Neg logit contributions
keeping-0.037
 him-0.032
 itself-0.032
 home-0.029
ie-0.029
Token Lily
Token ID20037
Feature activation-2.11e-02

Pos logit contributions
keeping+0.040
 itself+0.033
 him+0.032
 home+0.030
ie+0.029
Neg logit contributions
istered-0.062
cases-0.061
ider-0.058
arest-0.057
plets-0.057
Token and
Token ID290
Feature activation-4.51e-03

Pos logit contributions
keeping+0.028
 him+0.024
 itself+0.023
gy+0.023
ie+0.023
Neg logit contributions
istered-0.050
arest-0.048
cases-0.047
ider-0.046
plets-0.046
Token Ben
Token ID3932
Feature activation-1.13e-02

Pos logit contributions
keeping+0.008
 itself+0.007
ie+0.007
 home+0.007
 him+0.006
Neg logit contributions
cases-0.008
istered-0.008
plets-0.007
ider-0.007
 short-0.007
Token but
Token ID475
Feature activation-4.36e-02

Pos logit contributions
istered+0.072
cases+0.067
ider+0.067
plets+0.064
arest+0.063
Neg logit contributions
 him-0.041
 itself-0.035
keeping-0.034
ie-0.032
 Joey-0.031
Token be
Token ID307
Feature activation+3.45e-02

Pos logit contributions
keeping+0.094
 itself+0.074
 him+0.072
gent+0.072
ie+0.071
Neg logit contributions
istered-0.065
ider-0.063
cases-0.063
arest-0.060
 short-0.060
Token careful
Token ID8161
Feature activation-2.45e-02

Pos logit contributions
cases+0.056
plets+0.050
istered+0.049
ider+0.047
arest+0.047
Neg logit contributions
keeping-0.074
 itself-0.072
 home-0.070
 him-0.070
 his-0.066
Token.
Token ID13
Feature activation-3.33e-02

Pos logit contributions
keeping+0.036
 itself+0.031
 him+0.030
 home+0.028
ie+0.027
Neg logit contributions
istered-0.056
cases-0.055
arest-0.052
plets-0.052
ider-0.051
Token Ost
Token ID38919
Feature activation-9.02e-03

Pos logit contributions
keeping+0.050
 itself+0.038
 him+0.038
 home+0.038
ie+0.038
Neg logit contributions
istered-0.073
cases-0.070
ider-0.069
plets-0.069
arest-0.067
Tokenric
Token ID1173
Feature activation+1.17e-01

Pos logit contributions
keeping+0.014
 him+0.011
 itself+0.011
 home+0.011
ie+0.010
Neg logit contributions
istered-0.018
cases-0.018
plets-0.018
icious-0.017
arest-0.017
Tokenhes
Token ID956
Feature activation+1.46e-02

Pos logit contributions
istered+0.108
cases+0.098
ider+0.088
plets+0.076
arest+0.076
Neg logit contributions
keeping-0.322
 itself-0.319
 him-0.315
 himself-0.291
 home-0.291
Token can
Token ID460
Feature activation-2.26e-03

Pos logit contributions
istered+0.026
cases+0.026
plets+0.024
arest+0.024
ider+0.024
Neg logit contributions
keeping-0.027
 itself-0.023
 him-0.022
 home-0.021
 Joey-0.021
Token be
Token ID307
Feature activation+2.55e-02

Pos logit contributions
 him+0.003
 itself+0.003
keeping+0.003
 home+0.003
gy+0.003
Neg logit contributions
cases-0.005
istered-0.005
ider-0.005
lic-0.005
plets-0.005
Token gr
Token ID1036
Feature activation+3.63e-02

Pos logit contributions
cases+0.043
istered+0.040
ider+0.039
arest+0.038
plets+0.038
Neg logit contributions
keeping-0.053
 itself-0.049
 home-0.048
 him-0.048
gy-0.045
Tokenumpy
Token ID32152
Feature activation-2.08e-02

Pos logit contributions
cases+0.058
istered+0.057
ider+0.052
arest+0.050
br+0.047
Neg logit contributions
keeping-0.079
 itself-0.070
ie-0.067
 him-0.066
 home-0.064
Token\n
Token ID198
Feature activation+8.55e-02

Pos logit contributions
keeping+0.027
 him+0.025
 itself+0.023
 home+0.023
 Joey+0.022
Neg logit contributions
istered-0.068
cases-0.068
arest-0.067
ider-0.064
icious-0.064
Token\n
Token ID198
Feature activation+7.79e-02

Pos logit contributions
istered+0.199
cases+0.197
ider+0.189
plets+0.184
arest+0.182
Neg logit contributions
keeping-0.116
 him-0.102
 itself-0.101
 home-0.093
ie-0.087
TokenThey
Token ID2990
Feature activation+1.48e-02

Pos logit contributions
cases+0.145
istered+0.138
plets+0.128
ider+0.126
lic+0.123
Neg logit contributions
keeping-0.138
 itself-0.132
 him-0.127
 home-0.122
ie-0.118
Token put
Token ID1234
Feature activation+1.23e-02

Pos logit contributions
istered+0.030
cases+0.029
ider+0.028
arest+0.027
plets+0.027
Neg logit contributions
keeping-0.025
 itself-0.021
 him-0.021
 home-0.020
ie-0.019
Token the
Token ID262
Feature activation+4.63e-02

Pos logit contributions
cases+0.027
istered+0.027
hes+0.026
plets+0.025
 short+0.025
Neg logit contributions
keeping-0.018
 itself-0.013
 generation-0.012
ie-0.012
 Acc-0.012
Token pizz
Token ID38748
Feature activation+1.17e-01

Pos logit contributions
istered+0.083
cases+0.082
plets+0.076
ider+0.075
arest+0.074
Neg logit contributions
keeping-0.087
 him-0.077
 itself-0.076
 home-0.074
ie-0.070
Tokenas
Token ID292
Feature activation-2.72e-04

Pos logit contributions
istered+0.111
 annoying+0.105
cases+0.097
plets+0.092
icious+0.084
Neg logit contributions
keeping-0.307
 him-0.283
 itself-0.280
 home-0.277
ie-0.267
Token in
Token ID287
Feature activation-4.82e-02

Pos logit contributions
keeping+0.000
 itself+0.000
 him+0.000
 generation+0.000
ie+0.000
Neg logit contributions
cases-0.001
istered-0.001
ider-0.001
arest-0.001
plets-0.001
Token the
Token ID262
Feature activation+5.45e-03

Pos logit contributions
keeping+0.056
 itself+0.041
 him+0.040
ie+0.038
 home+0.035
Neg logit contributions
istered-0.122
cases-0.119
plets-0.113
arest-0.112
ider-0.112
Token oven
Token ID14361
Feature activation+1.22e-02

Pos logit contributions
istered+0.008
cases+0.008
ider+0.007
br+0.007
arest+0.007
Neg logit contributions
keeping-0.013
 itself-0.011
 him-0.011
ie-0.010
 Joey-0.010
Token and
Token ID290
Feature activation-2.85e-02

Pos logit contributions
istered+0.025
cases+0.024
ider+0.023
arest+0.023
plets+0.023
Neg logit contributions
keeping-0.020
 itself-0.017
 him-0.017
 home-0.016
ie-0.015
Token but
Token ID475
Feature activation-3.39e-02

Pos logit contributions
istered+0.075
cases+0.073
arest+0.071
plets+0.068
ider+0.068
Neg logit contributions
keeping-0.052
 him-0.051
 itself-0.051
 his-0.046
 himself-0.044
Token he
Token ID339
Feature activation+1.95e-02

Pos logit contributions
keeping+0.056
 itself+0.048
 him+0.048
 home+0.045
ie+0.043
Neg logit contributions
istered-0.070
cases-0.069
plets-0.065
arest-0.065
ider-0.064
Token also
Token ID635
Feature activation+1.73e-02

Pos logit contributions
istered+0.040
cases+0.039
plets+0.038
arest+0.038
ider+0.037
Neg logit contributions
keeping-0.031
 him-0.029
 itself-0.028
ie-0.027
 home-0.025
Token knew
Token ID2993
Feature activation+3.77e-03

Pos logit contributions
cases+0.038
istered+0.037
plets+0.036
arest+0.036
ider+0.035
Neg logit contributions
keeping-0.025
 him-0.023
 itself-0.023
 his-0.021
ie-0.020
Token cater
Token ID20825
Feature activation+5.69e-02

Pos logit contributions
istered+0.008
cases+0.008
ider+0.008
arest+0.007
plets+0.007
Neg logit contributions
keeping-0.006
 itself-0.004
 him-0.004
 home-0.004
gy-0.004
Tokenpill
Token ID27215
Feature activation+1.16e-01

Pos logit contributions
cases+0.076
istered+0.070
ider+0.062
arest+0.060
icious+0.058
Neg logit contributions
keeping-0.138
 him-0.128
ie-0.127
 itself-0.125
 Joey-0.116
Tokenars
Token ID945
Feature activation+4.85e-02

Pos logit contributions
cases+0.148
istered+0.134
aj+0.134
plets+0.123
ider+0.115
Neg logit contributions
keeping-0.270
 itself-0.262
 him-0.254
ie-0.235
 himself-0.231
Token were
Token ID547
Feature activation+5.17e-02

Pos logit contributions
istered+0.104
cases+0.102
ider+0.096
arest+0.095
plets+0.095
Neg logit contributions
keeping-0.075
 itself-0.062
 him-0.061
 home-0.058
ie-0.056
Token not
Token ID407
Feature activation+2.34e-02

Pos logit contributions
cases+0.086
plets+0.083
istered+0.083
arest+0.079
ider+0.075
Neg logit contributions
keeping-0.104
 itself-0.103
 him-0.101
 his-0.095
 home-0.095
Token pets
Token ID17252
Feature activation+2.16e-02

Pos logit contributions
cases+0.042
istered+0.042
plets+0.039
ider+0.038
arest+0.038
Neg logit contributions
keeping-0.045
 itself-0.040
 him-0.039
 home-0.038
ie-0.036
Token.
Token ID13
Feature activation-2.82e-02

Pos logit contributions
istered+0.046
cases+0.046
ider+0.043
plets+0.043
arest+0.043
Neg logit contributions
keeping-0.033
 itself-0.028
 him-0.027
 home-0.026
ie-0.025
Token.
Token ID13
Feature activation-3.34e-02

Pos logit contributions
keeping+0.002
 itself+0.002
 him+0.002
 home+0.002
ie+0.001
Neg logit contributions
istered-0.003
cases-0.003
arest-0.003
ider-0.003
plets-0.003
Token Tim
Token ID5045
Feature activation-3.20e-02

Pos logit contributions
keeping+0.047
 itself+0.037
ie+0.035
 him+0.035
 home+0.033
Neg logit contributions
cases-0.078
istered-0.076
plets-0.073
lic-0.071
ider-0.070
Tokenmy
Token ID1820
Feature activation-2.22e-03

Pos logit contributions
keeping+0.052
 itself+0.044
 him+0.041
 generation+0.039
 Joey+0.037
Neg logit contributions
istered-0.065
cases-0.065
aj-0.059
ider-0.058
icious-0.058
Token found
Token ID1043
Feature activation+9.91e-03

Pos logit contributions
keeping+0.003
 him+0.003
 itself+0.003
 home+0.003
ie+0.002
Neg logit contributions
istered-0.005
cases-0.005
ider-0.005
arest-0.005
plets-0.005
Token a
Token ID257
Feature activation+6.60e-02

Pos logit contributions
istered+0.017
cases+0.017
arest+0.016
plets+0.016
br+0.015
Neg logit contributions
keeping-0.019
 itself-0.015
ie-0.015
gy-0.015
 generation-0.015
Token seas
Token ID21547
Feature activation+1.16e-01

Pos logit contributions
cases+0.122
istered+0.120
ider+0.114
plets+0.108
uts+0.107
Neg logit contributions
 home-0.128
keeping-0.113
 Joey-0.107
 itself-0.105
 his-0.103
Tokenhell
Token ID12758
Feature activation+6.83e-02

Pos logit contributions
istered+0.198
cases+0.184
ider+0.178
plets+0.176
icious+0.167
Neg logit contributions
keeping-0.229
 itself-0.216
 him-0.215
ie-0.213
 home-0.207
Token and
Token ID290
Feature activation-2.18e-02

Pos logit contributions
istered+0.186
icious+0.175
 Deal+0.169
aj+0.168
ider+0.168
Neg logit contributions
 him-0.071
 home-0.070
ie-0.067
keeping-0.059
 itself-0.056
Token showed
Token ID3751
Feature activation-2.29e-02

Pos logit contributions
keeping+0.035
 him+0.030
 itself+0.030
 home+0.028
ie+0.027
Neg logit contributions
istered-0.046
cases-0.044
ider-0.043
arest-0.042
plets-0.042
Token it
Token ID340
Feature activation+5.99e-03

Pos logit contributions
keeping+0.037
 Acc+0.027
gy+0.025
gent+0.025
stack+0.025
Neg logit contributions
cases-0.048
br-0.045
 short-0.043
arest-0.043
 long-0.043
Token to
Token ID284
Feature activation-8.27e-02

Pos logit contributions
cases+0.012
istered+0.012
uts+0.011
 short+0.011
lic+0.011
Neg logit contributions
keeping-0.010
 itself-0.007
 generation-0.007
gent-0.007
ie-0.007
Token ostr
Token ID42387
Feature activation+1.80e-02

Pos logit contributions
istered+0.028
cases+0.028
ider+0.026
arest+0.026
plets+0.026
Neg logit contributions
keeping-0.023
 itself-0.019
 him-0.019
 home-0.018
ie-0.017
Tokenich
Token ID488
Feature activation-1.52e-02

Pos logit contributions
istered+0.026
cases+0.025
arest+0.023
plets+0.023
br+0.023
Neg logit contributions
keeping-0.040
gy-0.036
 him-0.035
 itself-0.035
ie-0.035
Token pen
Token ID3112
Feature activation+2.07e-02

Pos logit contributions
keeping+0.029
 itself+0.025
 Joey+0.023
 him+0.023
 Sus+0.022
Neg logit contributions
cases-0.027
istered-0.026
ider-0.025
arest-0.023
plets-0.023
Token.
Token ID13
Feature activation-3.36e-02

Pos logit contributions
istered+0.049
cases+0.046
plets+0.045
arest+0.045
icious+0.044
Neg logit contributions
keeping-0.029
 itself-0.025
 him-0.025
 home-0.024
ie-0.022
Token Ost
Token ID38919
Feature activation-8.07e-03

Pos logit contributions
keeping+0.061
ie+0.052
 home+0.048
 itself+0.048
 him+0.046
Neg logit contributions
cases-0.063
istered-0.062
plets-0.058
ider-0.058
br-0.057
Tokenric
Token ID1173
Feature activation+1.16e-01

Pos logit contributions
keeping+0.013
 him+0.010
 home+0.010
 itself+0.010
 Joey+0.010
Neg logit contributions
istered-0.015
cases-0.015
icious-0.015
plets-0.015
arest-0.015
Tokenhes
Token ID956
Feature activation+1.78e-02

Pos logit contributions
istered+0.113
cases+0.101
ider+0.095
arest+0.085
icious+0.083
Neg logit contributions
keeping-0.310
 itself-0.306
 him-0.304
 himself-0.281
 his-0.280
Token are
Token ID389
Feature activation+5.05e-02

Pos logit contributions
istered+0.032
cases+0.032
plets+0.029
ider+0.029
arest+0.029
Neg logit contributions
keeping-0.033
 itself-0.027
 him-0.027
 home-0.026
 Joey-0.025
Token big
Token ID1263
Feature activation+4.67e-03

Pos logit contributions
cases+0.087
plets+0.082
arest+0.082
istered+0.082
ider+0.079
Neg logit contributions
keeping-0.100
 itself-0.097
 home-0.096
 him-0.094
 his-0.091
Token birds
Token ID10087
Feature activation+2.53e-02

Pos logit contributions
istered+0.009
cases+0.009
arest+0.008
ider+0.008
plets+0.008
Neg logit contributions
keeping-0.008
 itself-0.007
 him-0.007
 home-0.007
ie-0.006
Token with
Token ID351
Feature activation-6.89e-03

Pos logit contributions
cases+0.063
istered+0.063
ider+0.060
arest+0.059
plets+0.059
Neg logit contributions
keeping-0.030
 itself-0.026
 him-0.026
 home-0.024
ie-0.023
Token too
Token ID1165
Feature activation+1.80e-02

Pos logit contributions
keeping+0.013
 itself+0.011
 him+0.010
 home+0.010
ie+0.010
Neg logit contributions
istered-0.017
cases-0.017
ider-0.016
plets-0.015
arest-0.015
Token close
Token ID1969
Feature activation-4.62e-02

Pos logit contributions
istered+0.038
cases+0.036
icious+0.036
plets+0.035
ider+0.035
Neg logit contributions
 home-0.030
keeping-0.029
 him-0.027
 itself-0.026
 his-0.025
Token.
Token ID13
Feature activation-3.60e-02

Pos logit contributions
keeping+0.067
 itself+0.054
 him+0.053
 home+0.049
ie+0.048
Neg logit contributions
istered-0.102
cases-0.102
ider-0.096
plets-0.095
arest-0.095
Token The
Token ID383
Feature activation+3.19e-02

Pos logit contributions
keeping+0.061
 itself+0.050
 him+0.049
ie+0.049
 home+0.048
Neg logit contributions
istered-0.072
cases-0.071
ider-0.068
plets-0.067
arest-0.065
Token ost
Token ID23619
Feature activation+2.81e-02

Pos logit contributions
istered+0.071
cases+0.070
ider+0.064
arest+0.063
uts+0.062
Neg logit contributions
keeping-0.049
 home-0.044
 itself-0.042
 him-0.040
 side-0.037
Tokenric
Token ID1173
Feature activation+1.16e-01

Pos logit contributions
istered+0.075
cases+0.073
arest+0.070
ider+0.069
br+0.069
Neg logit contributions
keeping-0.030
 itself-0.024
gy-0.023
 him-0.022
ie-0.021
Tokenhes
Token ID956
Feature activation+1.34e-02

Pos logit contributions
istered+0.115
cases+0.094
ider+0.084
icious+0.078
plets+0.077
Neg logit contributions
keeping-0.310
 itself-0.305
 him-0.300
 home-0.282
 its-0.279
Token might
Token ID1244
Feature activation+1.28e-03

Pos logit contributions
istered+0.025
cases+0.024
ider+0.023
plets+0.023
arest+0.023
Neg logit contributions
keeping-0.025
 itself-0.021
 him-0.021
 home-0.020
ie-0.020
Token pe
Token ID613
Feature activation+3.44e-02

Pos logit contributions
cases+0.003
istered+0.003
ider+0.003
plets+0.003
lic+0.003
Neg logit contributions
keeping-0.002
 itself-0.001
 home-0.001
 him-0.001
gy-0.001
Tokenck
Token ID694
Feature activation-7.80e-03

Pos logit contributions
istered+0.069
cases+0.065
ider+0.061
plets+0.060
icious+0.060
Neg logit contributions
keeping-0.056
 itself-0.054
 him-0.051
ie-0.050
 home-0.048
Token you
Token ID345
Feature activation-3.90e-02

Pos logit contributions
keeping+0.013
 itself+0.011
 him+0.011
 home+0.010
ie+0.010
Neg logit contributions
istered-0.016
cases-0.016
ider-0.015
arest-0.015
plets-0.015
Token.
Token ID13
Feature activation-3.44e-02

Pos logit contributions
keeping+0.001
 itself+0.001
 him+0.001
 home+0.001
gy+0.001
Neg logit contributions
istered-0.001
cases-0.001
ider-0.001
plets-0.001
arest-0.001
Token He
Token ID679
Feature activation-1.03e-03

Pos logit contributions
keeping+0.046
 itself+0.038
 him+0.038
 home+0.034
 Joey+0.033
Neg logit contributions
istered-0.080
cases-0.080
arest-0.076
ider-0.075
plets-0.075
Token was
Token ID373
Feature activation+6.43e-02

Pos logit contributions
 him+0.001
keeping+0.001
ie+0.001
 his+0.001
 itself+0.001
Neg logit contributions
istered-0.002
arest-0.002
plets-0.002
cases-0.002
icious-0.002
Token carrying
Token ID6872
Feature activation-1.33e-02

Pos logit contributions
cases+0.133
plets+0.128
arest+0.125
istered+0.120
uts+0.112
Neg logit contributions
 his-0.117
 him-0.114
 home-0.101
 her-0.101
 its-0.100
Token their
Token ID511
Feature activation-1.80e-03

Pos logit contributions
keeping+0.020
 itself+0.016
 him+0.015
ie+0.015
 home+0.014
Neg logit contributions
istered-0.029
cases-0.029
plets-0.027
arest-0.026
ider-0.026
Token pizz
Token ID38748
Feature activation+1.16e-01

Pos logit contributions
keeping+0.004
 itself+0.003
 him+0.003
 home+0.003
ie+0.003
Neg logit contributions
istered-0.003
cases-0.003
ider-0.003
plets-0.003
arest-0.003
Tokenas
Token ID292
Feature activation+1.17e-02

Pos logit contributions
 annoying+0.126
plets+0.114
istered+0.114
cases+0.108
icious+0.106
Neg logit contributions
keeping-0.296
 him-0.257
 itself-0.256
 home-0.254
ie-0.240
Token on
Token ID319
Feature activation-6.67e-02

Pos logit contributions
cases+0.024
istered+0.024
ider+0.022
plets+0.022
arest+0.022
Neg logit contributions
keeping-0.019
 itself-0.016
 him-0.015
 home-0.014
ie-0.014
Token a
Token ID257
Feature activation+1.88e-02

Pos logit contributions
keeping+0.071
 itself+0.053
ie+0.052
 Joey+0.047
 him+0.046
Neg logit contributions
istered-0.163
cases-0.162
 short-0.154
br-0.152
 long-0.151
Token big
Token ID1263
Feature activation-9.67e-03

Pos logit contributions
istered+0.030
cases+0.030
plets+0.028
ider+0.028
arest+0.027
Neg logit contributions
keeping-0.039
 home-0.034
 him-0.034
 itself-0.033
ie-0.032
Token tray
Token ID26473
Feature activation+1.65e-02

Pos logit contributions
keeping+0.022
 itself+0.020
 him+0.018
 Joey+0.018
gy+0.018
Neg logit contributions
istered-0.013
cases-0.013
br-0.012
arest-0.012
plets-0.012
Tokenaded
Token ID5286
Feature activation-7.54e-03

Pos logit contributions
istered+0.115
cases+0.115
ider+0.105
plets+0.103
arest+0.100
Neg logit contributions
keeping-0.138
 him-0.125
 itself-0.123
 home-0.118
ie-0.114
Token in
Token ID287
Feature activation-5.77e-02

Pos logit contributions
keeping+0.013
 generation+0.010
 Acc+0.010
 itself+0.010
 Sus+0.009
Neg logit contributions
cases-0.014
plets-0.013
istered-0.013
 long-0.013
 short-0.013
Token the
Token ID262
Feature activation-5.45e-03

Pos logit contributions
keeping+0.080
 itself+0.061
ie+0.059
gy+0.059
 generation+0.056
Neg logit contributions
 long-0.123
 short-0.122
istered-0.120
cases-0.115
br-0.114
Token water
Token ID1660
Feature activation+2.24e-02

Pos logit contributions
keeping+0.009
 itself+0.008
 him+0.008
 home+0.007
ie+0.007
Neg logit contributions
istered-0.011
cases-0.011
ider-0.010
plets-0.010
arest-0.010
Token delic
Token ID8675
Feature activation-1.35e-02

Pos logit contributions
istered+0.050
cases+0.049
arest+0.046
ider+0.046
plets+0.046
Neg logit contributions
keeping-0.033
 itself-0.028
 him-0.028
 home-0.027
ie-0.025
Tokenately
Token ID1286
Feature activation-1.31e-01

Pos logit contributions
keeping+0.022
 itself+0.018
 him+0.018
 home+0.017
ie+0.016
Neg logit contributions
istered-0.029
cases-0.028
ider-0.027
plets-0.026
arest-0.026
Token with
Token ID351
Feature activation-3.94e-02

Pos logit contributions
keeping+0.213
 itself+0.177
 him+0.175
 home+0.163
ie+0.159
Neg logit contributions
istered-0.265
cases-0.265
ider-0.248
plets-0.248
arest-0.247
Token his
Token ID465
Feature activation-1.87e-02

Pos logit contributions
keeping+0.048
 itself+0.037
 him+0.036
 home+0.033
ie+0.032
Neg logit contributions
istered-0.098
cases-0.097
ider-0.091
arest-0.091
plets-0.091
Token fishing
Token ID12478
Feature activation+7.36e-03

Pos logit contributions
keeping+0.033
 home+0.030
 him+0.030
 itself+0.029
 Joey+0.027
Neg logit contributions
cases-0.037
istered-0.036
ider-0.034
plets-0.034
icious-0.033
Token rod
Token ID15299
Feature activation+2.01e-02

Pos logit contributions
istered+0.013
cases+0.013
plets+0.012
arest+0.012
ider+0.012
Neg logit contributions
keeping-0.014
 itself-0.012
 him-0.012
 home-0.012
ie-0.011
Token.
Token ID13
Feature activation-2.77e-02

Pos logit contributions
istered+0.050
cases+0.048
icious+0.048
arest+0.047
plets+0.047
Neg logit contributions
keeping-0.024
 itself-0.021
 him-0.020
 home-0.019
ie-0.017
Token.
Token ID13
Feature activation-4.19e-02

Pos logit contributions
keeping+0.024
 itself+0.020
 him+0.020
 home+0.019
ie+0.018
Neg logit contributions
istered-0.035
cases-0.034
arest-0.032
plets-0.032
ider-0.032
Token Then
Token ID3244
Feature activation-1.48e-02

Pos logit contributions
keeping+0.064
 itself+0.054
 him+0.052
 home+0.049
 Joey+0.046
Neg logit contributions
istered-0.091
cases-0.090
ider-0.084
arest-0.084
plets-0.084
Token we
Token ID356
Feature activation-1.64e-02

Pos logit contributions
keeping+0.031
 itself+0.026
 him+0.026
 home+0.025
ie+0.025
Neg logit contributions
istered-0.023
cases-0.023
ider-0.021
arest-0.021
plets-0.021
Token can
Token ID460
Feature activation-1.60e-02

Pos logit contributions
keeping+0.028
 him+0.026
 itself+0.025
 home+0.024
ie+0.023
Neg logit contributions
cases-0.031
istered-0.031
ider-0.029
arest-0.028
plets-0.028
Token swing
Token ID9628
Feature activation+2.69e-03

Pos logit contributions
keeping+0.020
 him+0.019
 itself+0.018
 home+0.017
gy+0.016
Neg logit contributions
istered-0.038
cases-0.038
lic-0.036
plets-0.036
ider-0.036
Token to
Token ID284
Feature activation-1.25e-01

Pos logit contributions
cases+0.006
istered+0.006
br+0.005
plets+0.005
ider+0.005
Neg logit contributions
keeping-0.004
 itself-0.003
 him-0.003
ie-0.003
 Joey-0.003
Token the
Token ID262
Feature activation-7.56e-03

Pos logit contributions
keeping+0.136
 itself+0.107
 him+0.105
 home+0.095
ie+0.088
Neg logit contributions
istered-0.329
cases-0.325
ider-0.310
plets-0.306
arest-0.305
Token other
Token ID584
Feature activation-7.37e-02

Pos logit contributions
keeping+0.012
 itself+0.011
 home+0.010
 him+0.010
ie+0.009
Neg logit contributions
cases-0.016
istered-0.016
ider-0.015
lic-0.014
arest-0.014
Token tree
Token ID5509
Feature activation-2.36e-02

Pos logit contributions
keeping+0.066
 itself+0.047
 him+0.045
 home+0.040
ie+0.037
Neg logit contributions
istered-0.207
cases-0.204
ider-0.195
plets-0.194
arest-0.194
Token.
Token ID13
Feature activation-3.83e-02

Pos logit contributions
keeping+0.037
 itself+0.031
 him+0.030
 home+0.029
ie+0.028
Neg logit contributions
istered-0.051
cases-0.050
arest-0.047
plets-0.046
ider-0.046
Token The
Token ID383
Feature activation+4.09e-02

Pos logit contributions
keeping+0.059
 itself+0.049
 him+0.048
 home+0.045
ie+0.043
Neg logit contributions
istered-0.083
cases-0.082
ider-0.077
arest-0.076
plets-0.076
Token"
Token ID1
Feature activation+1.35e-02

Pos logit contributions
istered+0.094
cases+0.092
ider+0.083
plets+0.083
arest+0.082
Neg logit contributions
keeping-0.153
 itself-0.136
 him-0.135
 home-0.130
ie-0.127
TokenCan
Token ID6090
Feature activation-5.15e-02

Pos logit contributions
istered+0.030
cases+0.029
plets+0.027
arest+0.027
ider+0.027
Neg logit contributions
keeping-0.021
 itself-0.018
 him-0.017
 home-0.016
ie-0.015
Token we
Token ID356
Feature activation-4.56e-02

Pos logit contributions
keeping+0.122
 itself+0.104
 him+0.101
 home+0.100
ie+0.099
Neg logit contributions
cases-0.070
istered-0.068
ider-0.064
arest-0.062
br-0.060
Token sit
Token ID1650
Feature activation+1.11e-02

Pos logit contributions
keeping+0.054
 him+0.046
 itself+0.043
 home+0.042
ie+0.038
Neg logit contributions
istered-0.115
cases-0.113
plets-0.108
ider-0.107
arest-0.106
Token close
Token ID1969
Feature activation-6.97e-02

Pos logit contributions
istered+0.022
cases+0.022
ider+0.021
plets+0.021
arest+0.021
Neg logit contributions
keeping-0.019
 itself-0.016
 him-0.016
 home-0.015
ie-0.014
Token to
Token ID284
Feature activation-1.25e-01

Pos logit contributions
keeping+0.112
 itself+0.093
 him+0.091
 home+0.085
ie+0.083
Neg logit contributions
istered-0.144
cases-0.144
ider-0.134
arest-0.133
plets-0.133
Token the
Token ID262
Feature activation-9.97e-03

Pos logit contributions
keeping+0.112
 him+0.087
 itself+0.084
 home+0.076
ie+0.068
Neg logit contributions
istered-0.354
cases-0.345
ider-0.336
plets-0.331
arest-0.328
Token fireworks
Token ID26056
Feature activation+2.75e-03

Pos logit contributions
keeping+0.019
 itself+0.016
 him+0.016
 home+0.015
ie+0.015
Neg logit contributions
istered-0.018
cases-0.018
ider-0.017
plets-0.016
arest-0.016
Token,
Token ID11
Feature activation+9.75e-04

Pos logit contributions
istered+0.006
cases+0.006
ider+0.005
plets+0.005
arest+0.005
Neg logit contributions
keeping-0.004
 him-0.004
 itself-0.004
 home-0.004
ie-0.003
Token Mom
Token ID11254
Feature activation-2.07e-02

Pos logit contributions
istered+0.002
cases+0.002
ider+0.002
plets+0.002
arest+0.002
Neg logit contributions
keeping-0.001
 him-0.001
 itself-0.001
 home-0.001
 Joey-0.001
Token?"
Token ID1701
Feature activation-6.49e-03

Pos logit contributions
keeping+0.034
 him+0.029
 itself+0.028
 home+0.027
ie+0.027
Neg logit contributions
istered-0.043
cases-0.042
ider-0.039
arest-0.039
plets-0.039
Tokenapped
Token ID6320
Feature activation+5.33e-02

Pos logit contributions
istered+0.083
cases+0.081
ider+0.077
plets+0.076
arest+0.075
Neg logit contributions
keeping-0.081
 him-0.068
 itself-0.068
 home-0.064
ie-0.062
Token and
Token ID290
Feature activation-2.17e-02

Pos logit contributions
istered+0.135
cases+0.134
ider+0.127
arest+0.126
plets+0.125
Neg logit contributions
keeping-0.062
 itself-0.043
 him-0.042
gy-0.038
 home-0.038
Token shouted
Token ID17293
Feature activation+2.71e-02

Pos logit contributions
keeping+0.040
 itself+0.034
 him+0.034
 home+0.032
ie+0.031
Neg logit contributions
istered-0.041
cases-0.040
ider-0.037
plets-0.037
arest-0.037
Token "
Token ID366
Feature activation+1.20e-02

Pos logit contributions
istered+0.058
cases+0.058
ider+0.054
plets+0.054
arest+0.054
Neg logit contributions
keeping-0.042
 itself-0.034
 him-0.033
 home-0.031
ie-0.031
Tokenready
Token ID1493
Feature activation+3.42e-03

Pos logit contributions
istered+0.022
cases+0.022
ider+0.020
arest+0.020
plets+0.020
Neg logit contributions
keeping-0.022
 itself-0.019
 him-0.019
 home-0.018
ie-0.017
Token to
Token ID284
Feature activation-1.25e-01

Pos logit contributions
istered+0.007
cases+0.007
ider+0.006
plets+0.006
arest+0.006
Neg logit contributions
keeping-0.006
 itself-0.005
 him-0.005
 home-0.004
ie-0.004
Token go
Token ID467
Feature activation-4.95e-02

Pos logit contributions
keeping+0.191
 itself+0.156
 him+0.151
 home+0.143
ie+0.139
Neg logit contributions
istered-0.270
cases-0.268
arest-0.252
plets-0.251
ider-0.251
Token!"
Token ID2474
Feature activation-2.49e-02

Pos logit contributions
keeping+0.102
 itself+0.089
 him+0.084
 generation+0.081
gent+0.079
Neg logit contributions
cases-0.076
ider-0.071
istered-0.070
br-0.070
plets-0.070
Token She
Token ID1375
Feature activation-2.04e-02

Pos logit contributions
keeping+0.043
 itself+0.036
ie+0.035
 him+0.034
 home+0.034
Neg logit contributions
cases-0.048
istered-0.046
ider-0.044
plets-0.044
uts-0.043
Token was
Token ID373
Feature activation+4.41e-02

Pos logit contributions
keeping+0.030
 itself+0.026
 him+0.026
 home+0.024
ie+0.024
Neg logit contributions
istered-0.047
cases-0.042
ider-0.042
arest-0.042
icious-0.042
Token ready
Token ID3492
Feature activation-2.71e-03

Pos logit contributions
istered+0.089
cases+0.085
ider+0.081
arest+0.081
plets+0.080
Neg logit contributions
keeping-0.075
 home-0.068
 itself-0.067
 him-0.067
 Sus-0.060
Token had
Token ID550
Feature activation+2.83e-03

Pos logit contributions
keeping+0.019
 itself+0.017
 him+0.017
 home+0.016
ie+0.015
Neg logit contributions
istered-0.023
cases-0.022
ider-0.021
plets-0.021
arest-0.021
Token been
Token ID587
Feature activation+2.59e-02

Pos logit contributions
istered+0.006
cases+0.006
plets+0.006
ider+0.006
arest+0.006
Neg logit contributions
keeping-0.005
 him-0.004
 itself-0.004
 home-0.004
 Joey-0.003
Token able
Token ID1498
Feature activation+9.51e-04

Pos logit contributions
cases+0.048
istered+0.048
plets+0.045
ider+0.044
arest+0.044
Neg logit contributions
keeping-0.048
 him-0.042
 itself-0.042
 home-0.042
ie-0.038
Token to
Token ID284
Feature activation-8.72e-02

Pos logit contributions
istered+0.002
cases+0.002
ider+0.002
plets+0.002
arest+0.002
Neg logit contributions
keeping-0.001
 itself-0.001
 him-0.001
 home-0.001
ie-0.001
Token climb
Token ID12080
Feature activation-6.19e-02

Pos logit contributions
keeping+0.129
 itself+0.104
 him+0.100
ie+0.095
 home+0.095
Neg logit contributions
istered-0.185
cases-0.183
arest-0.175
ider-0.173
plets-0.171
Token to
Token ID284
Feature activation-1.25e-01

Pos logit contributions
keeping+0.085
 itself+0.060
ie+0.060
 Acc+0.059
gent+0.058
Neg logit contributions
cases-0.139
br-0.130
istered-0.130
plets-0.128
 short-0.125
Token the
Token ID262
Feature activation-1.62e-02

Pos logit contributions
keeping+0.163
ie+0.120
 itself+0.109
gy+0.108
gent+0.107
Neg logit contributions
cases-0.270
istered-0.267
 short-0.265
arest-0.262
 long-0.259
Token top
Token ID1353
Feature activation+1.54e-02

Pos logit contributions
keeping+0.028
 itself+0.024
 him+0.024
 home+0.022
ie+0.022
Neg logit contributions
istered-0.031
cases-0.031
ider-0.029
arest-0.029
plets-0.029
Token.
Token ID13
Feature activation-2.81e-02

Pos logit contributions
istered+0.027
cases+0.025
ider+0.023
arest+0.023
plets+0.023
Neg logit contributions
keeping-0.031
 itself-0.028
 him-0.028
 home-0.027
ie-0.027
Token After
Token ID2293
Feature activation-1.85e-02

Pos logit contributions
keeping+0.048
ie+0.041
 itself+0.040
 him+0.038
 home+0.038
Neg logit contributions
cases-0.054
istered-0.051
ider-0.049
plets-0.049
lic-0.048
Token a
Token ID257
Feature activation+5.18e-02

Pos logit contributions
keeping+0.034
 itself+0.027
ie+0.026
 him+0.026
gy+0.025
Neg logit contributions
cases-0.035
istered-0.034
ider-0.032
br-0.032
arest-0.032
Token he
Token ID339
Feature activation-9.74e-03

Pos logit contributions
keeping+0.054
 itself+0.044
 him+0.042
gy+0.041
 home+0.040
Neg logit contributions
cases-0.060
istered-0.059
ider-0.055
arest-0.055
br-0.055
Token had
Token ID550
Feature activation+9.60e-03

Pos logit contributions
keeping+0.018
 itself+0.015
 him+0.015
 home+0.014
ie+0.014
Neg logit contributions
istered-0.019
cases-0.018
ider-0.017
plets-0.017
arest-0.017
Token managed
Token ID5257
Feature activation-2.63e-02

Pos logit contributions
istered+0.020
cases+0.020
plets+0.019
ider+0.019
arest+0.018
Neg logit contributions
keeping-0.016
 him-0.014
 itself-0.014
 home-0.013
 his-0.012
Token to
Token ID284
Feature activation-8.50e-02

Pos logit contributions
keeping+0.028
 him+0.024
 itself+0.023
 home+0.022
 his+0.018
Neg logit contributions
istered-0.070
cases-0.068
ider-0.067
plets-0.066
arest-0.065
Token climb
Token ID12080
Feature activation-5.76e-02

Pos logit contributions
keeping+0.125
 itself+0.104
 him+0.103
 home+0.096
ie+0.092
Neg logit contributions
istered-0.191
cases-0.188
ider-0.177
plets-0.176
arest-0.175
Token to
Token ID284
Feature activation-1.24e-01

Pos logit contributions
keeping+0.078
 itself+0.058
ie+0.054
 him+0.053
 generation+0.052
Neg logit contributions
cases-0.131
istered-0.126
br-0.123
plets-0.121
ider-0.118
Token the
Token ID262
Feature activation-9.15e-03

Pos logit contributions
keeping+0.151
 itself+0.105
ie+0.101
 him+0.100
gy+0.099
Neg logit contributions
cases-0.288
istered-0.287
arest-0.279
br-0.275
 short-0.274
Token top
Token ID1353
Feature activation+1.05e-02

Pos logit contributions
keeping+0.016
 itself+0.014
 him+0.013
 home+0.012
ie+0.012
Neg logit contributions
istered-0.018
cases-0.018
ider-0.016
arest-0.016
plets-0.016
Token.
Token ID13
Feature activation-3.36e-02

Pos logit contributions
istered+0.018
cases+0.017
ider+0.016
arest+0.016
plets+0.016
Neg logit contributions
keeping-0.021
 itself-0.019
 him-0.019
 home-0.018
ie-0.018
Token And
Token ID843
Feature activation-2.21e-02

Pos logit contributions
keeping+0.059
 itself+0.049
ie+0.049
 him+0.047
 home+0.046
Neg logit contributions
cases-0.063
istered-0.061
plets-0.058
ider-0.058
lic-0.057
Token he
Token ID339
Feature activation+1.56e-02

Pos logit contributions
keeping+0.041
 itself+0.034
gy+0.033
ie+0.033
gent+0.033
Neg logit contributions
cases-0.038
istered-0.035
ider-0.033
arest-0.033
uts-0.033
Token He
Token ID679
Feature activation-7.00e-03

Pos logit contributions
keeping+0.050
 itself+0.041
ie+0.040
 him+0.039
 home+0.039
Neg logit contributions
cases-0.058
istered-0.056
ider-0.054
plets-0.053
arest-0.052
Token wished
Token ID16555
Feature activation+1.20e-02

Pos logit contributions
keeping+0.011
 him+0.010
 itself+0.009
 home+0.009
ie+0.009
Neg logit contributions
istered-0.015
cases-0.014
ider-0.014
plets-0.014
arest-0.014
Token he
Token ID339
Feature activation-7.23e-04

Pos logit contributions
cases+0.024
istered+0.024
ider+0.022
arest+0.022
plets+0.022
Neg logit contributions
keeping-0.020
 itself-0.017
 him-0.016
 home-0.016
ie-0.015
Token could
Token ID714
Feature activation+1.50e-02

Pos logit contributions
keeping+0.001
 him+0.001
 itself+0.001
 home+0.001
ie+0.001
Neg logit contributions
istered-0.001
cases-0.001
ider-0.001
plets-0.001
arest-0.001
Token climb
Token ID12080
Feature activation-4.63e-02

Pos logit contributions
istered+0.036
cases+0.035
plets+0.033
ider+0.033
arest+0.033
Neg logit contributions
keeping-0.020
 him-0.017
 itself-0.017
 home-0.016
ie-0.014
Token to
Token ID284
Feature activation-1.23e-01

Pos logit contributions
keeping+0.058
 itself+0.040
 Acc+0.037
gy+0.037
ie+0.037
Neg logit contributions
cases-0.110
istered-0.105
br-0.105
plets-0.102
 short-0.100
Token the
Token ID262
Feature activation-1.61e-02

Pos logit contributions
keeping+0.142
 itself+0.096
ie+0.092
gy+0.091
 him+0.089
Neg logit contributions
cases-0.298
istered-0.296
arest-0.287
br-0.284
 short-0.282
Token top
Token ID1353
Feature activation+8.66e-03

Pos logit contributions
keeping+0.029
 itself+0.024
 him+0.024
 home+0.023
ie+0.022
Neg logit contributions
istered-0.031
cases-0.031
ider-0.029
plets-0.028
arest-0.028
Token!
Token ID0
Feature activation-4.38e-02

Pos logit contributions
istered+0.015
cases+0.014
ider+0.013
arest+0.013
plets+0.013
Neg logit contributions
keeping-0.017
 itself-0.015
 him-0.015
 home-0.015
ie-0.014
Token (newline)
Token ID198
Feature activation+7.30e-02

Pos logit contributions
keeping+0.078
ie+0.067
 itself+0.066
 him+0.062
 home+0.061
Neg logit contributions
cases-0.080
istered-0.076
plets-0.072
ider-0.072
lic-0.070
Token (newline)
Token ID198
Feature activation+6.61e-02

Pos logit contributions
istered+0.138
cases+0.137
arest+0.131
ider+0.127
lic+0.127
Neg logit contributions
keeping-0.129
 itself-0.105
 him-0.101
ie-0.099
 home-0.098
Token backyard
Token ID24296
Feature activation-1.75e-02

Pos logit contributions
keeping+0.092
 itself+0.078
 him+0.077
 home+0.072
ie+0.070
Neg logit contributions
istered-0.103
cases-0.101
ider-0.094
arest-0.094
plets-0.094
Token and
Token ID290
Feature activation-3.81e-02

Pos logit contributions
keeping+0.029
 itself+0.023
 him+0.023
 home+0.021
 Joey+0.021
Neg logit contributions
istered-0.035
cases-0.035
ider-0.034
plets-0.033
arest-0.032
Token wanted
Token ID2227
Feature activation-1.94e-03

Pos logit contributions
keeping+0.081
 itself+0.068
 him+0.063
 home+0.062
gy+0.061
Neg logit contributions
cases-0.062
istered-0.056
 short-0.053
 long-0.052
ider-0.051
Token to
Token ID284
Feature activation-7.99e-02

Pos logit contributions
keeping+0.003
gy+0.003
ie+0.003
 itself+0.003
 generation+0.003
Neg logit contributions
cases-0.004
 short-0.003
istered-0.003
 long-0.003
br-0.003
Token climb
Token ID12080
Feature activation-5.41e-02

Pos logit contributions
keeping+0.145
 itself+0.118
 him+0.111
gy+0.108
ie+0.107
Neg logit contributions
cases-0.144
istered-0.139
arest-0.137
 short-0.136
br-0.135
Token to
Token ID284
Feature activation-1.23e-01

Pos logit contributions
keeping+0.087
 Acc+0.070
 generation+0.064
gent+0.062
ie+0.059
Neg logit contributions
 long-0.107
cases-0.106
br-0.105
 short-0.104
plets-0.102
Token the
Token ID262
Feature activation-1.22e-02

Pos logit contributions
keeping+0.217
gy+0.157
 Acc+0.154
gent+0.151
 itself+0.147
Neg logit contributions
 short-0.230
 long-0.223
cases-0.222
arest-0.219
istered-0.213
Token top
Token ID1353
Feature activation+5.56e-03

Pos logit contributions
keeping+0.019
 itself+0.016
 him+0.016
 home+0.015
ie+0.014
Neg logit contributions
istered-0.027
cases-0.026
ider-0.025
plets-0.024
arest-0.024
Token.
Token ID13
Feature activation-3.13e-02

Pos logit contributions
istered+0.009
cases+0.009
ider+0.008
plets+0.008
arest+0.008
Neg logit contributions
keeping-0.011
 itself-0.010
 him-0.010
 home-0.009
ie-0.009
Token (single space)
Token ID220
Feature activation+9.12e-02

Pos logit contributions
keeping+0.067
 itself+0.054
 generation+0.050
ie+0.049
 him+0.049
Neg logit contributions
cases-0.052
istered-0.048
 short-0.046
aj-0.045
 long-0.045
Token She
Token ID1375
Feature activation-9.64e-03

Pos logit contributions
istered+0.186
cases+0.184
ider+0.171
arest+0.171
plets+0.170
Neg logit contributions
keeping-0.150
 itself-0.126
 him-0.123
 home-0.117
ie-0.114
Token.
Token ID13
Feature activation-3.68e-02

Pos logit contributions
istered+0.100
cases+0.090
arest+0.090
icious+0.088
plets+0.088
Neg logit contributions
keeping-0.042
 him-0.038
 home-0.036
 itself-0.036
ie-0.035
Token They
Token ID1119
Feature activation+1.44e-02

Pos logit contributions
keeping+0.060
 itself+0.049
 him+0.047
ie+0.045
 home+0.045
Neg logit contributions
istered-0.077
cases-0.077
plets-0.072
ider-0.072
arest-0.070
Token want
Token ID765
Feature activation+6.44e-03

Pos logit contributions
istered+0.028
cases+0.027
ider+0.026
plets+0.025
arest+0.025
Neg logit contributions
keeping-0.026
 him-0.023
 itself-0.022
 home-0.021
ie-0.020
Token to
Token ID284
Feature activation-7.40e-02

Pos logit contributions
cases+0.015
istered+0.015
arest+0.014
plets+0.014
ider+0.014
Neg logit contributions
keeping-0.009
 itself-0.007
 him-0.007
ie-0.007
gy-0.006
Token climb
Token ID12080
Feature activation-4.79e-02

Pos logit contributions
keeping+0.126
 itself+0.103
 him+0.095
ie+0.092
gy+0.091
Neg logit contributions
cases-0.149
istered-0.145
arest-0.138
plets-0.135
ider-0.135
Token to
Token ID284
Feature activation-1.22e-01

Pos logit contributions
keeping+0.071
 Acc+0.051
 generation+0.051
 itself+0.049
ie+0.048
Neg logit contributions
cases-0.106
plets-0.098
br-0.098
 long-0.098
 short-0.097
Token the
Token ID262
Feature activation-1.22e-02

Pos logit contributions
keeping+0.163
 itself+0.107
gy+0.099
ie+0.097
gent+0.097
Neg logit contributions
cases-0.288
istered-0.276
 short-0.274
arest-0.273
plets-0.268
Token top
Token ID1353
Feature activation+8.25e-03

Pos logit contributions
keeping+0.018
 him+0.016
 home+0.016
 itself+0.016
ie+0.014
Neg logit contributions
istered-0.027
cases-0.026
ider-0.025
plets-0.024
arest-0.024
Token and
Token ID290
Feature activation-4.63e-02

Pos logit contributions
istered+0.016
cases+0.015
ider+0.014
arest+0.014
plets+0.014
Neg logit contributions
keeping-0.014
 itself-0.013
 him-0.013
 home-0.012
ie-0.012
Token slide
Token ID10649
Feature activation+1.96e-02

Pos logit contributions
keeping+0.083
 itself+0.069
ie+0.065
gy+0.063
 him+0.062
Neg logit contributions
cases-0.090
istered-0.083
arest-0.078
plets-0.078
br-0.077
Token down
Token ID866
Feature activation-3.29e-02

Pos logit contributions
cases+0.048
istered+0.046
ider+0.044
plets+0.044
arest+0.043
Neg logit contributions
keeping-0.025
 itself-0.019
 generation-0.017
 him-0.017
 Joey-0.016
Token.
Token ID13
Feature activation-3.03e-02

Pos logit contributions
istered+0.040
cases+0.040
arest+0.038
plets+0.038
ider+0.037
Neg logit contributions
keeping-0.027
 him-0.022
 itself-0.022
 home-0.021
ie-0.020
Token She
Token ID1375
Feature activation-6.65e-03

Pos logit contributions
keeping+0.058
 itself+0.048
 him+0.048
ie+0.047
 home+0.046
Neg logit contributions
cases-0.052
istered-0.051
ider-0.049
plets-0.048
lic-0.047
Token wanted
Token ID2227
Feature activation+6.93e-03

Pos logit contributions
keeping+0.010
 him+0.009
 itself+0.009
 home+0.008
ie+0.008
Neg logit contributions
istered-0.015
cases-0.014
plets-0.013
arest-0.013
ider-0.013
Token to
Token ID284
Feature activation-7.17e-02

Pos logit contributions
istered+0.016
cases+0.016
ider+0.015
arest+0.015
plets+0.015
Neg logit contributions
keeping-0.009
 itself-0.007
 him-0.007
 home-0.007
ie-0.006
Token climb
Token ID12080
Feature activation-4.80e-02

Pos logit contributions
keeping+0.102
 itself+0.082
 him+0.078
ie+0.074
 home+0.074
Neg logit contributions
istered-0.159
cases-0.158
arest-0.150
br-0.147
ider-0.147
Token to
Token ID284
Feature activation-1.22e-01

Pos logit contributions
keeping+0.074
 itself+0.054
 Acc+0.053
 generation+0.052
gent+0.051
Neg logit contributions
cases-0.100
br-0.096
istered-0.094
 short-0.093
 long-0.092
Token the
Token ID262
Feature activation-1.30e-02

Pos logit contributions
keeping+0.161
gy+0.106
 itself+0.104
ie+0.103
gent+0.100
Neg logit contributions
cases-0.273
 short-0.271
istered-0.267
arest-0.266
 long-0.265
Token top
Token ID1353
Feature activation+1.45e-02

Pos logit contributions
keeping+0.023
 itself+0.020
 him+0.019
ie+0.018
 home+0.018
Neg logit contributions
istered-0.025
cases-0.024
ider-0.023
arest-0.023
plets-0.023
Token of
Token ID286
Feature activation-8.93e-02

Pos logit contributions
istered+0.023
cases+0.022
ider+0.020
arest+0.020
plets+0.020
Neg logit contributions
keeping-0.031
 itself-0.028
 him-0.028
 home-0.027
ie-0.027
Token a
Token ID257
Feature activation+2.77e-02

Pos logit contributions
keeping+0.096
gy+0.066
ie+0.063
 itself+0.062
 him+0.060
Neg logit contributions
istered-0.220
cases-0.220
plets-0.212
arest-0.212
 short-0.211
Token tall
Token ID7331
Feature activation-5.51e-03

Pos logit contributions
istered+0.053
cases+0.053
ider+0.048
plets+0.048
arest+0.047
Neg logit contributions
keeping-0.048
 home-0.045
 itself-0.042
 him-0.042
ie-0.040
Token He
Token ID679
Feature activation-7.78e-03

Pos logit contributions
keeping+0.047
 itself+0.038
 him+0.037
ie+0.037
 home+0.036
Neg logit contributions
cases-0.055
istered-0.054
ider-0.052
plets-0.051
lic-0.049
Token liked
Token ID8288
Feature activation-4.85e-05

Pos logit contributions
keeping+0.013
 him+0.011
 itself+0.011
 home+0.010
ie+0.010
Neg logit contributions
istered-0.016
cases-0.015
arest-0.015
plets-0.015
ider-0.015
Token to
Token ID284
Feature activation-6.55e-02

Pos logit contributions
keeping+0.000
 itself+0.000
gy+0.000
 him+0.000
 home+0.000
Neg logit contributions
cases-0.000
istered-0.000
br-0.000
plets-0.000
ider-0.000
Token climb
Token ID12080
Feature activation-4.81e-02

Pos logit contributions
keeping+0.103
 itself+0.084
 him+0.081
 home+0.078
ie+0.075
Neg logit contributions
istered-0.137
cases-0.135
ider-0.127
arest-0.127
plets-0.126
Token up
Token ID510
Feature activation-5.33e-02

Pos logit contributions
keeping+0.075
 itself+0.057
 generation+0.054
ie+0.053
 him+0.053
Neg logit contributions
cases-0.101
istered-0.096
br-0.096
plets-0.093
 short-0.090
Token to
Token ID284
Feature activation-1.21e-01

Pos logit contributions
keeping+0.081
 itself+0.063
ie+0.060
 him+0.060
gy+0.058
Neg logit contributions
cases-0.113
istered-0.109
br-0.106
plets-0.103
ider-0.102
Token high
Token ID1029
Feature activation-3.90e-02

Pos logit contributions
keeping+0.156
gy+0.104
 itself+0.101
ie+0.101
 Acc+0.093
Neg logit contributions
cases-0.267
 short-0.266
 long-0.263
istered-0.260
br-0.258
Token places
Token ID4113
Feature activation-2.24e-02

Pos logit contributions
keeping+0.067
 itself+0.053
 him+0.053
 generation+0.051
 home+0.050
Neg logit contributions
cases-0.075
istered-0.072
ider-0.070
br-0.068
plets-0.068
Token and
Token ID290
Feature activation-5.19e-02

Pos logit contributions
keeping+0.037
 itself+0.031
 him+0.030
 home+0.029
ie+0.028
Neg logit contributions
istered-0.046
cases-0.045
ider-0.043
plets-0.042
arest-0.042
Token swing
Token ID9628
Feature activation+1.52e-03

Pos logit contributions
keeping+0.079
 itself+0.065
 him+0.063
 home+0.059
ie+0.059
Neg logit contributions
istered-0.112
cases-0.111
ider-0.104
plets-0.103
arest-0.103
Token on
Token ID319
Feature activation-1.01e-01

Pos logit contributions
cases+0.003
istered+0.003
ider+0.003
br+0.003
plets+0.003
Neg logit contributions
keeping-0.002
 itself-0.002
 him-0.002
 home-0.002
ie-0.002
Token will
Token ID481
Feature activation+2.00e-03

Pos logit contributions
keeping+0.008
 him+0.008
 home+0.008
 itself+0.008
ie+0.007
Neg logit contributions
istered-0.011
cases-0.011
plets-0.011
ider-0.011
icious-0.010
Token be
Token ID307
Feature activation+3.35e-02

Pos logit contributions
cases+0.005
istered+0.005
plets+0.004
ider+0.004
arest+0.004
Neg logit contributions
keeping-0.003
 him-0.002
 itself-0.002
 home-0.002
ie-0.002
Token able
Token ID1498
Feature activation+1.25e-02

Pos logit contributions
cases+0.058
istered+0.056
plets+0.053
ider+0.052
arest+0.052
Neg logit contributions
keeping-0.068
 itself-0.061
 home-0.061
 him-0.060
ie-0.055
Token to
Token ID284
Feature activation-7.99e-02

Pos logit contributions
istered+0.035
cases+0.035
plets+0.033
ider+0.033
icious+0.033
Neg logit contributions
keeping-0.012
 home-0.010
 him-0.009
 itself-0.009
 Joey-0.007
Token climb
Token ID12080
Feature activation-5.71e-02

Pos logit contributions
keeping+0.108
 him+0.092
 itself+0.090
 home+0.085
ie+0.080
Neg logit contributions
istered-0.190
cases-0.188
plets-0.178
ider-0.176
arest-0.173
Token to
Token ID284
Feature activation-1.21e-01

Pos logit contributions
keeping+0.080
 itself+0.062
 him+0.059
ie+0.057
gy+0.055
Neg logit contributions
cases-0.128
istered-0.126
br-0.119
plets-0.119
ider-0.118
Token the
Token ID262
Feature activation-9.07e-03

Pos logit contributions
keeping+0.143
 itself+0.104
 him+0.102
ie+0.094
gy+0.092
Neg logit contributions
istered-0.297
cases-0.295
arest-0.283
plets-0.279
ider-0.277
Token top
Token ID1353
Feature activation+6.96e-03

Pos logit contributions
keeping+0.015
 home+0.013
 itself+0.013
 him+0.013
 generation+0.011
Neg logit contributions
istered-0.019
cases-0.019
ider-0.018
plets-0.017
arest-0.017
Token of
Token ID286
Feature activation-9.38e-02

Pos logit contributions
istered+0.013
cases+0.012
arest+0.011
ider+0.011
plets+0.011
Neg logit contributions
keeping-0.013
 itself-0.012
 him-0.011
 home-0.011
ie-0.011
Token the
Token ID262
Feature activation-3.08e-04

Pos logit contributions
keeping+0.082
 itself+0.058
 him+0.055
 home+0.049
ie+0.044
Neg logit contributions
istered-0.266
cases-0.263
ider-0.252
plets-0.250
arest-0.250
Token tallest
Token ID38760
Feature activation+4.55e-03

Pos logit contributions
keeping+0.000
 home+0.000
 itself+0.000
 him+0.000
 generation+0.000
Neg logit contributions
istered-0.001
cases-0.001
ider-0.001
plets-0.001
lic-0.001
Token climb
Token ID12080
Feature activation-4.67e-02

Pos logit contributions
keeping+0.115
 itself+0.093
 him+0.090
 home+0.085
ie+0.083
Neg logit contributions
istered-0.161
cases-0.160
arest-0.151
ider-0.150
plets-0.150
Token down
Token ID866
Feature activation-4.73e-02

Pos logit contributions
keeping+0.078
 Acc+0.066
gent+0.056
 generation+0.056
gy+0.055
Neg logit contributions
 long-0.087
 short-0.085
br-0.085
plets-0.083
cases-0.083
Token the
Token ID262
Feature activation-1.88e-02

Pos logit contributions
keeping+0.070
gy+0.056
 Acc+0.051
ie+0.051
gent+0.050
Neg logit contributions
cases-0.097
br-0.093
 short-0.093
 long-0.092
ider-0.091
Token steep
Token ID14559
Feature activation-5.00e-02

Pos logit contributions
keeping+0.030
 itself+0.025
 him+0.024
 home+0.023
ie+0.022
Neg logit contributions
istered-0.040
cases-0.040
ider-0.037
plets-0.037
arest-0.037
Token rocks
Token ID12586
Feature activation+2.79e-03

Pos logit contributions
keeping+0.085
 Acc+0.076
 him+0.075
 itself+0.074
 Sus+0.069
Neg logit contributions
cases-0.084
plets-0.082
ider-0.080
br-0.079
 long-0.076
Token to
Token ID284
Feature activation-1.21e-01

Pos logit contributions
istered+0.006
cases+0.006
plets+0.006
arest+0.006
ider+0.006
Neg logit contributions
keeping-0.004
 itself-0.003
 him-0.003
 home-0.003
ie-0.003
Token get
Token ID651
Feature activation+6.71e-03

Pos logit contributions
keeping+0.210
ie+0.151
 itself+0.150
gy+0.149
gent+0.145
Neg logit contributions
istered-0.226
arest-0.222
cases-0.220
plets-0.216
 short-0.214
Token to
Token ID284
Feature activation-9.23e-02

Pos logit contributions
istered+0.014
cases+0.013
arest+0.013
ider+0.012
plets+0.012
Neg logit contributions
keeping-0.011
 itself-0.009
gy-0.008
ie-0.008
 him-0.008
Token it
Token ID340
Feature activation-1.67e-02

Pos logit contributions
keeping+0.124
ie+0.085
gy+0.083
 itself+0.082
gent+0.080
Neg logit contributions
istered-0.211
cases-0.205
arest-0.198
 short-0.197
br-0.196
Token.
Token ID13
Feature activation-3.42e-02

Pos logit contributions
keeping+0.027
 itself+0.022
 him+0.022
 home+0.020
ie+0.020
Neg logit contributions
istered-0.035
cases-0.034
ider-0.032
arest-0.032
plets-0.032
Token (newline)
Token ID198
Feature activation+7.25e-02

Pos logit contributions
keeping+0.064
ie+0.057
 itself+0.055
 him+0.053
pa+0.051
Neg logit contributions
cases-0.059
istered-0.054
lic-0.053
plets-0.053
uts-0.052
Token use
Token ID779
Feature activation-2.83e-02

Pos logit contributions
istered+0.050
cases+0.050
plets+0.047
ider+0.047
arest+0.046
Neg logit contributions
keeping-0.036
 itself-0.030
 him-0.030
 home-0.028
ie-0.027
Token it
Token ID340
Feature activation+2.53e-03

Pos logit contributions
keeping+0.029
 itself+0.021
 him+0.019
ie+0.018
 home+0.018
Neg logit contributions
istered-0.076
cases-0.075
arest-0.071
ider-0.071
plets-0.071
Token to
Token ID284
Feature activation-8.04e-02

Pos logit contributions
cases+0.005
istered+0.005
ider+0.005
arest+0.005
plets+0.005
Neg logit contributions
keeping-0.004
 itself-0.003
 him-0.003
ie-0.003
 home-0.003
Token climb
Token ID12080
Feature activation-5.19e-02

Pos logit contributions
keeping+0.139
 itself+0.110
 him+0.109
ie+0.108
 Joey+0.104
Neg logit contributions
istered-0.154
cases-0.152
arest-0.144
ider-0.143
br-0.141
Token up
Token ID510
Feature activation-5.29e-02

Pos logit contributions
keeping+0.073
ie+0.054
 Acc+0.051
 itself+0.050
gent+0.050
Neg logit contributions
cases-0.116
istered-0.107
 long-0.106
br-0.106
 short-0.106
Token to
Token ID284
Feature activation-1.21e-01

Pos logit contributions
keeping+0.082
ie+0.064
 him+0.062
 itself+0.061
 generation+0.059
Neg logit contributions
cases-0.110
istered-0.105
br-0.101
 short-0.101
ider-0.100
Token the
Token ID262
Feature activation-3.18e-03

Pos logit contributions
keeping+0.175
ie+0.131
gy+0.120
gent+0.114
 Acc+0.113
Neg logit contributions
 short-0.247
 long-0.243
cases-0.241
istered-0.239
arest-0.235
Token balloon
Token ID21190
Feature activation-1.18e-02

Pos logit contributions
keeping+0.005
 itself+0.005
 home+0.004
 him+0.004
ie+0.004
Neg logit contributions
istered-0.007
cases-0.006
ider-0.006
arest-0.006
plets-0.006
Token.
Token ID13
Feature activation-2.67e-02

Pos logit contributions
keeping+0.020
 itself+0.016
 him+0.016
 Joey+0.015
ie+0.015
Neg logit contributions
cases-0.023
istered-0.022
ider-0.021
plets-0.021
br-0.021
Token (newline)
Token ID198
Feature activation+8.79e-02

Pos logit contributions
keeping+0.051
ie+0.045
 itself+0.043
 him+0.041
 home+0.041
Neg logit contributions
cases-0.047
istered-0.047
ider-0.043
plets-0.042
lic-0.041
Token (newline)
Token ID198
Feature activation+7.85e-02

Pos logit contributions
istered+0.173
cases+0.172
arest+0.160
ider+0.159
plets+0.159
Neg logit contributions
keeping-0.151
 itself-0.126
 him-0.123
 home-0.118
ie-0.117
Token this
Token ID428
Feature activation-1.62e-02

Pos logit contributions
istered+0.003
cases+0.003
plets+0.002
arest+0.002
br+0.002
Neg logit contributions
keeping-0.002
 itself-0.001
 him-0.001
gy-0.001
ie-0.001
Token oil
Token ID3056
Feature activation-8.81e-03

Pos logit contributions
keeping+0.031
 itself+0.027
 home+0.027
 him+0.026
ie+0.025
Neg logit contributions
istered-0.029
cases-0.028
ider-0.027
arest-0.026
plets-0.026
Token on
Token ID319
Feature activation-8.06e-02

Pos logit contributions
keeping+0.013
 itself+0.011
 him+0.011
 home+0.010
ie+0.010
Neg logit contributions
istered-0.020
cases-0.019
ider-0.018
arest-0.018
plets-0.018
Token the
Token ID262
Feature activation-2.25e-02

Pos logit contributions
keeping+0.090
 itself+0.065
 him+0.063
 home+0.057
ie+0.056
Neg logit contributions
istered-0.205
cases-0.204
plets-0.194
arest-0.193
ider-0.192
Token slide
Token ID10649
Feature activation-5.11e-03

Pos logit contributions
keeping+0.042
 itself+0.036
 him+0.035
 home+0.034
ie+0.033
Neg logit contributions
istered-0.042
cases-0.041
ider-0.038
arest-0.038
plets-0.038
Token to
Token ID284
Feature activation-1.21e-01

Pos logit contributions
keeping+0.009
 itself+0.007
 him+0.007
 home+0.007
ie+0.007
Neg logit contributions
istered-0.010
cases-0.010
arest-0.009
ider-0.009
plets-0.009
Token make
Token ID787
Feature activation-2.06e-02

Pos logit contributions
keeping+0.182
 itself+0.144
 him+0.141
 home+0.133
ie+0.130
Neg logit contributions
istered-0.265
cases-0.260
arest-0.247
plets-0.247
ider-0.244
Token it
Token ID340
Feature activation-1.30e-02

Pos logit contributions
keeping+0.028
 itself+0.023
 him+0.022
 home+0.021
ie+0.020
Neg logit contributions
istered-0.048
cases-0.048
ider-0.045
plets-0.045
arest-0.045
Token go
Token ID467
Feature activation-2.39e-02

Pos logit contributions
keeping+0.023
 itself+0.021
 home+0.021
 him+0.021
ie+0.019
Neg logit contributions
istered-0.024
cases-0.024
ider-0.023
plets-0.022
arest-0.022
Token faster
Token ID5443
Feature activation-2.53e-02

Pos logit contributions
keeping+0.042
 itself+0.037
 him+0.036
 home+0.035
ie+0.034
Neg logit contributions
istered-0.046
cases-0.045
ider-0.042
arest-0.042
plets-0.042
Token!"
Token ID2474
Feature activation-2.59e-02

Pos logit contributions
keeping+0.047
 itself+0.040
 him+0.040
 home+0.038
ie+0.037
Neg logit contributions
istered-0.047
cases-0.046
ider-0.043
plets-0.043
arest-0.043
Token heat
Token ID4894
Feature activation+6.51e-03

Pos logit contributions
istered+0.045
cases+0.045
ider+0.041
plets+0.040
arest+0.040
Neg logit contributions
keeping-0.048
 home-0.043
 him-0.042
 itself-0.042
ie-0.039
Token.
Token ID13
Feature activation-3.87e-02

Pos logit contributions
istered+0.015
cases+0.014
ider+0.014
lic+0.013
icious+0.013
Neg logit contributions
keeping-0.009
 itself-0.008
 him-0.008
 home-0.008
ie-0.008
Token Then
Token ID3244
Feature activation-1.66e-02

Pos logit contributions
keeping+0.062
 itself+0.053
 him+0.052
 home+0.048
 Joey+0.047
Neg logit contributions
istered-0.082
cases-0.080
arest-0.074
ider-0.074
plets-0.074
Token I
Token ID314
Feature activation-1.77e-02

Pos logit contributions
keeping+0.033
 itself+0.028
 him+0.027
 home+0.026
ie+0.025
Neg logit contributions
istered-0.029
cases-0.028
ider-0.026
arest-0.026
plets-0.026
Token climb
Token ID12080
Feature activation-3.65e-02

Pos logit contributions
keeping+0.029
 him+0.025
 itself+0.025
 home+0.024
ie+0.023
Neg logit contributions
istered-0.036
cases-0.036
plets-0.033
ider-0.033
arest-0.033
Token to
Token ID284
Feature activation-1.21e-01

Pos logit contributions
keeping+0.051
 itself+0.038
 him+0.037
ie+0.036
gy+0.034
Neg logit contributions
cases-0.082
istered-0.080
plets-0.078
br-0.077
ider-0.077
Token the
Token ID262
Feature activation-9.51e-03

Pos logit contributions
keeping+0.157
 itself+0.109
 him+0.108
ie+0.108
gy+0.102
Neg logit contributions
istered-0.275
cases-0.275
arest-0.266
plets-0.261
br-0.258
Token top
Token ID1353
Feature activation+1.76e-02

Pos logit contributions
keeping+0.014
 itself+0.012
 him+0.011
 home+0.011
ie+0.010
Neg logit contributions
istered-0.021
cases-0.021
ider-0.020
plets-0.020
arest-0.020
Token and
Token ID290
Feature activation-4.49e-02

Pos logit contributions
istered+0.035
cases+0.031
ider+0.030
arest+0.030
icious+0.029
Neg logit contributions
keeping-0.031
 itself-0.028
 home-0.028
 him-0.028
ie-0.027
Token I
Token ID314
Feature activation-2.56e-02

Pos logit contributions
keeping+0.073
 itself+0.061
 him+0.059
ie+0.056
 home+0.056
Neg logit contributions
cases-0.091
istered-0.091
ider-0.085
plets-0.085
arest-0.084
Token can
Token ID460
Feature activation-2.54e-02

Pos logit contributions
keeping+0.044
 itself+0.038
 him+0.037
 home+0.036
ie+0.035
Neg logit contributions
istered-0.050
cases-0.050
ider-0.046
plets-0.046
arest-0.046
Token tall
Token ID7331
Feature activation-9.16e-04

Pos logit contributions
istered+0.035
cases+0.033
ider+0.031
plets+0.031
arest+0.030
Neg logit contributions
keeping-0.039
 him-0.035
 itself-0.034
 home-0.034
ie-0.031
Token buildings
Token ID6832
Feature activation+2.80e-02

Pos logit contributions
keeping+0.002
 itself+0.001
 him+0.001
 home+0.001
ie+0.001
Neg logit contributions
istered-0.002
cases-0.002
ider-0.002
plets-0.002
arest-0.002
Token to
Token ID284
Feature activation-9.34e-02

Pos logit contributions
istered+0.063
cases+0.061
ider+0.058
plets+0.058
arest+0.058
Neg logit contributions
keeping-0.041
 itself-0.034
 him-0.034
 home-0.032
ie-0.030
Token climb
Token ID12080
Feature activation-4.42e-02

Pos logit contributions
keeping+0.151
 itself+0.117
 him+0.110
ie+0.110
 home+0.107
Neg logit contributions
cases-0.193
istered-0.187
arest-0.179
br-0.177
 short-0.173
Token up
Token ID510
Feature activation-4.73e-02

Pos logit contributions
keeping+0.069
 itself+0.052
ie+0.050
 him+0.049
 Acc+0.048
Neg logit contributions
cases-0.096
br-0.088
istered-0.088
plets-0.085
 short-0.083
Token to
Token ID284
Feature activation-1.21e-01

Pos logit contributions
keeping+0.077
ie+0.059
 itself+0.058
 him+0.058
 Acc+0.057
Neg logit contributions
cases-0.096
br-0.088
istered-0.088
 long-0.086
 short-0.086
Token the
Token ID262
Feature activation-7.75e-03

Pos logit contributions
keeping+0.168
ie+0.126
gy+0.113
 itself+0.113
 Acc+0.111
Neg logit contributions
cases-0.258
 short-0.249
 long-0.248
istered-0.246
arest-0.243
Token best
Token ID1266
Feature activation-1.78e-02

Pos logit contributions
keeping+0.014
 itself+0.012
 him+0.012
 home+0.012
ie+0.011
Neg logit contributions
istered-0.014
cases-0.014
ider-0.013
plets-0.013
arest-0.013
Token view
Token ID1570
Feature activation+5.93e-02

Pos logit contributions
keeping+0.028
 itself+0.024
 him+0.023
 home+0.022
ie+0.021
Neg logit contributions
istered-0.038
cases-0.037
ider-0.035
plets-0.035
arest-0.035
Token on
Token ID319
Feature activation-8.63e-02

Pos logit contributions
istered+0.138
cases+0.131
ider+0.126
arest+0.124
plets+0.124
Neg logit contributions
keeping-0.085
 itself-0.072
 him-0.071
 home-0.067
ie-0.064
Token the
Token ID262
Feature activation-2.27e-02

Pos logit contributions
keeping+0.102
ie+0.069
 itself+0.067
gy+0.064
 home+0.063
Neg logit contributions
cases-0.208
istered-0.207
br-0.196
plets-0.196
arest-0.195
Token add
Token ID751
Feature activation-1.70e-02

Pos logit contributions
keeping+0.119
 itself+0.094
 him+0.091
 home+0.086
ie+0.084
Neg logit contributions
istered-0.184
cases-0.181
ider-0.172
arest-0.172
plets-0.170
Token more
Token ID517
Feature activation-1.78e-02

Pos logit contributions
keeping+0.027
 itself+0.023
 him+0.022
 home+0.021
ie+0.020
Neg logit contributions
istered-0.036
cases-0.035
ider-0.033
plets-0.033
arest-0.033
Token blocks
Token ID7021
Feature activation+1.46e-02

Pos logit contributions
keeping+0.032
 itself+0.028
 him+0.028
 home+0.027
ie+0.026
Neg logit contributions
istered-0.034
cases-0.033
ider-0.031
plets-0.031
arest-0.030
Token on
Token ID319
Feature activation-8.96e-02

Pos logit contributions
istered+0.033
cases+0.032
ider+0.030
plets+0.030
arest+0.030
Neg logit contributions
keeping-0.022
 itself-0.018
 him-0.018
 home-0.017
ie-0.016
Token top
Token ID1353
Feature activation-1.58e-02

Pos logit contributions
keeping+0.089
 itself+0.069
 him+0.068
 home+0.060
ie+0.054
Neg logit contributions
istered-0.244
cases-0.239
ider-0.230
plets-0.228
arest-0.227
Token to
Token ID284
Feature activation-1.21e-01

Pos logit contributions
keeping+0.033
 itself+0.029
 him+0.028
 home+0.028
ie+0.027
Neg logit contributions
istered-0.027
cases-0.025
ider-0.023
arest-0.023
plets-0.023
Token make
Token ID787
Feature activation-2.19e-02

Pos logit contributions
keeping+0.185
 itself+0.135
 him+0.132
 home+0.128
ie+0.128
Neg logit contributions
cases-0.258
istered-0.257
arest-0.248
br-0.242
plets-0.241
Token it
Token ID340
Feature activation-1.35e-02

Pos logit contributions
keeping+0.032
 itself+0.027
 him+0.026
 home+0.024
ie+0.023
Neg logit contributions
istered-0.050
cases-0.049
ider-0.047
plets-0.046
arest-0.046
Token even
Token ID772
Feature activation-3.05e-03

Pos logit contributions
keeping+0.024
 itself+0.022
 him+0.022
 home+0.021
ie+0.020
Neg logit contributions
istered-0.025
cases-0.024
plets-0.023
ider-0.023
arest-0.023
Token taller
Token ID25242
Feature activation-1.09e-03

Pos logit contributions
keeping+0.006
 itself+0.005
 him+0.005
 home+0.005
ie+0.005
Neg logit contributions
istered-0.005
cases-0.005
plets-0.005
ider-0.005
arest-0.005
Token!"
Token ID2474
Feature activation-1.97e-02

Pos logit contributions
keeping+0.002
 itself+0.002
 him+0.002
 home+0.002
ie+0.002
Neg logit contributions
istered-0.002
cases-0.002
arest-0.002
plets-0.002
ider-0.002
Token so
Token ID523
Feature activation+4.44e-03

Pos logit contributions
istered+0.049
cases+0.048
ider+0.045
plets+0.045
arest+0.045
Neg logit contributions
keeping-0.034
 itself-0.028
 him-0.027
 home-0.026
ie-0.025
Token she
Token ID673
Feature activation-6.65e-03

Pos logit contributions
istered+0.009
cases+0.008
plets+0.008
ider+0.008
arest+0.008
Neg logit contributions
keeping-0.008
 itself-0.007
 him-0.007
 home-0.007
ie-0.006
Token could
Token ID714
Feature activation+2.85e-03

Pos logit contributions
keeping+0.010
 him+0.009
 itself+0.009
ie+0.009
 home+0.009
Neg logit contributions
istered-0.014
cases-0.013
plets-0.013
arest-0.013
ider-0.013
Token climb
Token ID12080
Feature activation-4.68e-02

Pos logit contributions
istered+0.007
cases+0.007
plets+0.007
ider+0.007
arest+0.007
Neg logit contributions
keeping-0.003
 him-0.003
 itself-0.003
 home-0.003
ie-0.002
Token up
Token ID510
Feature activation-5.35e-02

Pos logit contributions
keeping+0.068
 Acc+0.050
 itself+0.049
ie+0.046
 zone+0.046
Neg logit contributions
cases-0.102
br-0.097
 short-0.095
plets-0.095
istered-0.095
Token to
Token ID284
Feature activation-1.21e-01

Pos logit contributions
keeping+0.082
 him+0.062
 itself+0.062
ie+0.061
 Acc+0.060
Neg logit contributions
cases-0.111
 short-0.104
istered-0.104
 long-0.103
br-0.102
Token get
Token ID651
Feature activation+3.72e-03

Pos logit contributions
keeping+0.177
 itself+0.123
ie+0.118
gy+0.118
 zone+0.109
Neg logit contributions
 short-0.254
cases-0.249
 long-0.247
istered-0.246
arest-0.240
Token the
Token ID262
Feature activation+8.53e-03

Pos logit contributions
istered+0.009
cases+0.009
plets+0.008
arest+0.008
br+0.008
Neg logit contributions
keeping-0.005
 itself-0.004
gy-0.003
ie-0.003
 him-0.003
Token apples
Token ID22514
Feature activation-1.62e-02

Pos logit contributions
istered+0.016
cases+0.016
lic+0.015
ider+0.015
arest+0.015
Neg logit contributions
keeping-0.015
 home-0.014
 him-0.014
 itself-0.013
ie-0.012
Token.
Token ID13
Feature activation-3.49e-02

Pos logit contributions
keeping+0.022
 itself+0.018
 him+0.018
 home+0.018
ie+0.015
Neg logit contributions
istered-0.039
cases-0.037
arest-0.036
ider-0.036
plets-0.036
Token Mom
Token ID11254
Feature activation+5.41e-04

Pos logit contributions
keeping+0.068
 itself+0.055
ie+0.053
 him+0.052
 generation+0.051
Neg logit contributions
cases-0.062
istered-0.058
uts-0.055
ider-0.054
plets-0.054
Token Can
Token ID1680
Feature activation-6.57e-02

Pos logit contributions
keeping+0.091
ie+0.075
 itself+0.075
 home+0.073
 him+0.072
Neg logit contributions
istered-0.077
ider-0.077
plets-0.075
cases-0.075
arest-0.072
Token I
Token ID314
Feature activation-4.23e-02

Pos logit contributions
keeping+0.161
 itself+0.133
 home+0.131
ie+0.126
 him+0.126
Neg logit contributions
cases-0.083
istered-0.082
ider-0.082
arest-0.078
plets-0.075
Token open
Token ID1280
Feature activation-6.48e-02

Pos logit contributions
keeping+0.049
 him+0.044
 home+0.042
 itself+0.040
ie+0.038
Neg logit contributions
cases-0.106
istered-0.106
plets-0.099
ider-0.098
arest-0.097
Token you
Token ID345
Feature activation-4.90e-02

Pos logit contributions
keeping+0.090
gy+0.065
 itself+0.063
 generation+0.058
ie+0.057
Neg logit contributions
cases-0.144
istered-0.140
arest-0.139
plets-0.136
ider-0.134
Token up
Token ID510
Feature activation-4.82e-02

Pos logit contributions
keeping+0.068
 him+0.064
 home+0.063
 itself+0.061
ie+0.058
Neg logit contributions
istered-0.112
cases-0.106
plets-0.103
arest-0.100
ider-0.099
Token to
Token ID284
Feature activation-1.20e-01

Pos logit contributions
keeping+0.088
 itself+0.074
 him+0.072
 home+0.068
ie+0.067
Neg logit contributions
istered-0.089
cases-0.089
ider-0.083
arest-0.083
plets-0.082
Token have
Token ID423
Feature activation-4.06e-02

Pos logit contributions
keeping+0.182
 itself+0.135
gy+0.126
 him+0.117
gent+0.116
Neg logit contributions
istered-0.254
cases-0.254
arest-0.251
plets-0.240
br-0.239
Token a
Token ID257
Feature activation+2.41e-02

Pos logit contributions
keeping+0.055
 him+0.050
 itself+0.048
 home+0.047
 Joey+0.041
Neg logit contributions
istered-0.096
cases-0.094
ider-0.090
plets-0.088
arest-0.086
Token look
Token ID804
Feature activation+3.60e-02

Pos logit contributions
cases+0.047
istered+0.046
plets+0.043
ider+0.043
arest+0.041
Neg logit contributions
keeping-0.041
 home-0.038
 him-0.036
 itself-0.036
 Joey-0.034
Token?
Token ID30
Feature activation-4.23e-02

Pos logit contributions
istered+0.087
cases+0.082
plets+0.079
ider+0.076
icious+0.076
Neg logit contributions
keeping-0.045
 him-0.044
 home-0.043
ie-0.040
 itself-0.039
Token
Token ID220
Feature activation+6.71e-02

Pos logit contributions
keeping+0.083
 itself+0.070
 him+0.069
 home+0.067
ie+0.066
Neg logit contributions
istered-0.074
cases-0.073
ider-0.069
plets-0.068
arest-0.068
Token hated
Token ID16563
Feature activation-3.29e-02

Pos logit contributions
keeping+0.030
 him+0.026
 itself+0.025
ie+0.024
 home+0.024
Neg logit contributions
istered-0.044
cases-0.042
arest-0.042
plets-0.042
ider-0.041
Token that
Token ID326
Feature activation-3.31e-02

Pos logit contributions
keeping+0.044
 itself+0.035
 him+0.032
gy+0.031
 home+0.031
Neg logit contributions
cases-0.077
istered-0.076
ider-0.071
plets-0.070
arest-0.069
Token idea
Token ID2126
Feature activation+3.58e-02

Pos logit contributions
keeping+0.053
 itself+0.044
 him+0.042
ie+0.039
 home+0.039
Neg logit contributions
istered-0.070
cases-0.070
ider-0.064
plets-0.063
arest-0.063
Token.
Token ID13
Feature activation-2.85e-02

Pos logit contributions
istered+0.079
cases+0.077
ider+0.073
plets+0.073
arest+0.073
Neg logit contributions
keeping-0.054
 him-0.045
 itself-0.045
 home-0.042
ie-0.040
Token
Token ID198
Feature activation+8.84e-02

Pos logit contributions
keeping+0.050
 itself+0.043
ie+0.041
 him+0.040
 home+0.039
Neg logit contributions
cases-0.055
istered-0.053
ider-0.050
uts-0.049
plets-0.049
Token
Token ID198
Feature activation+7.69e-02

Pos logit contributions
istered+0.174
cases+0.174
arest+0.161
ider+0.161
plets+0.159
Neg logit contributions
keeping-0.152
 itself-0.128
 him-0.123
 home-0.118
ie-0.116
TokenHe
Token ID1544
Feature activation-1.38e-03

Pos logit contributions
istered+0.147
plets+0.140
cases+0.131
arest+0.125
ider+0.125
Neg logit contributions
keeping-0.129
 him-0.127
 his-0.119
 himself-0.115
 itself-0.114
Token tried
Token ID3088
Feature activation+9.84e-03

Pos logit contributions
keeping+0.002
 him+0.002
 itself+0.002
 home+0.002
ie+0.002
Neg logit contributions
istered-0.003
cases-0.003
plets-0.003
arest-0.003
ider-0.003
Token to
Token ID284
Feature activation-7.10e-02

Pos logit contributions
istered+0.023
cases+0.023
ider+0.021
arest+0.021
br+0.021
Neg logit contributions
keeping-0.014
 itself-0.011
 him-0.010
ie-0.010
 Joey-0.009
Token sneak
Token ID20528
Feature activation+4.28e-04

Pos logit contributions
keeping+0.107
 itself+0.089
 him+0.088
 home+0.083
ie+0.080
Neg logit contributions
istered-0.157
cases-0.155
plets-0.145
ider-0.145
arest-0.144
Token away
Token ID1497
Feature activation-4.28e-02

Pos logit contributions
cases+0.001
istered+0.001
 short+0.001
ider+0.001
br+0.001
Neg logit contributions
keeping-0.001
 itself-0.000
ie-0.000
 Joey-0.000
 Acc-0.000
Token eating
Token ID6600
Feature activation-4.28e-02

Pos logit contributions
keeping+0.035
gy+0.025
ie+0.024
 itself+0.022
 Joey+0.022
Neg logit contributions
istered-0.076
cases-0.072
 short-0.072
 long-0.072
br-0.071
Token her
Token ID607
Feature activation-3.76e-02

Pos logit contributions
keeping+0.071
 itself+0.057
gy+0.055
 him+0.054
 home+0.053
Neg logit contributions
istered-0.086
cases-0.085
br-0.079
plets-0.078
arest-0.078
Token favorite
Token ID4004
Feature activation-1.64e-02

Pos logit contributions
keeping+0.081
 itself+0.071
 him+0.070
 home+0.068
ie+0.066
Neg logit contributions
istered-0.058
cases-0.057
ider-0.053
plets-0.052
arest-0.052
Token food
Token ID2057
Feature activation+1.82e-02

Pos logit contributions
keeping+0.035
 itself+0.030
 him+0.029
 home+0.028
 Joey+0.027
Neg logit contributions
istered-0.025
cases-0.024
arest-0.023
br-0.022
ider-0.022
Token.
Token ID13
Feature activation-3.43e-02

Pos logit contributions
istered+0.041
cases+0.040
plets+0.038
ider+0.038
arest+0.038
Neg logit contributions
keeping-0.027
 itself-0.022
 him-0.022
 home-0.021
ie-0.020
Token
Token ID198
Feature activation+7.96e-02

Pos logit contributions
keeping+0.059
 itself+0.049
 him+0.047
ie+0.046
 home+0.045
Neg logit contributions
istered-0.068
cases-0.067
plets-0.063
arest-0.062
ider-0.061
Token
Token ID198
Feature activation+7.17e-02

Pos logit contributions
istered+0.160
cases+0.158
arest+0.152
plets+0.147
br+0.146
Neg logit contributions
keeping-0.133
 itself-0.106
 him-0.101
gy-0.099
 home-0.098
TokenShe
Token ID3347
Feature activation-3.43e-03

Pos logit contributions
istered+0.113
cases+0.111
ider+0.103
plets+0.100
arest+0.097
Neg logit contributions
keeping-0.152
 itself-0.137
 him-0.136
 home-0.129
ie-0.127
Token was
Token ID373
Feature activation+6.58e-02

Pos logit contributions
keeping+0.006
 itself+0.005
 him+0.005
 home+0.004
ie+0.004
Neg logit contributions
istered-0.007
cases-0.007
ider-0.007
plets-0.007
arest-0.007
Token happy
Token ID3772
Feature activation+8.53e-03

Pos logit contributions
istered+0.119
cases+0.118
ider+0.109
plets+0.109
arest+0.108
Neg logit contributions
keeping-0.125
 itself-0.109
 him-0.108
 home-0.106
ie-0.099
Token to
Token ID284
Feature activation-4.14e-02

Pos logit contributions
istered+0.016
cases+0.016
ider+0.015
plets+0.015
arest+0.014
Neg logit contributions
keeping-0.015
 itself-0.013
 him-0.012
gy-0.012
ie-0.011
Token safely
Token ID11512
Feature activation-3.87e-03

Pos logit contributions
istered+0.077
cases+0.076
arest+0.071
ider+0.071
plets+0.071
Neg logit contributions
keeping-0.053
 itself-0.045
 him-0.044
 home-0.042
ie-0.039
Token on
Token ID319
Feature activation-7.32e-02

Pos logit contributions
keeping+0.005
 itself+0.004
 home+0.004
 him+0.004
ie+0.003
Neg logit contributions
istered-0.010
cases-0.009
arest-0.009
plets-0.009
icious-0.009
Token the
Token ID262
Feature activation-6.10e-03

Pos logit contributions
keeping+0.116
ie+0.091
gy+0.082
gent+0.078
 Acc+0.077
Neg logit contributions
 long-0.147
 short-0.145
istered-0.140
br-0.138
plets-0.136
Token ground
Token ID2323
Feature activation-1.03e-02

Pos logit contributions
keeping+0.010
 home+0.009
 itself+0.009
 him+0.008
 side+0.008
Neg logit contributions
istered-0.013
cases-0.013
lic-0.012
ider-0.012
plets-0.012
Token.
Token ID13
Feature activation-3.04e-02

Pos logit contributions
keeping+0.014
 itself+0.012
 him+0.011
 home+0.011
ie+0.010
Neg logit contributions
istered-0.025
cases-0.024
arest-0.023
plets-0.023
icious-0.022
Token
Token ID198
Feature activation+8.52e-02

Pos logit contributions
keeping+0.049
ie+0.042
 him+0.039
 home+0.037
 itself+0.037
Neg logit contributions
cases-0.061
istered-0.061
plets-0.061
ider-0.058
arest-0.058
Token
Token ID198
Feature activation+7.54e-02

Pos logit contributions
istered+0.158
cases+0.155
arest+0.148
plets+0.145
br+0.144
Neg logit contributions
keeping-0.155
 itself-0.128
 him-0.125
ie-0.122
 home-0.121
TokenThe
Token ID464
Feature activation+4.65e-02

Pos logit contributions
istered+0.123
cases+0.123
ider+0.108
plets+0.107
arest+0.104
Neg logit contributions
keeping-0.156
 itself-0.143
 him-0.137
 home-0.130
ie-0.127
Token plane
Token ID6614
Feature activation+6.40e-02

Pos logit contributions
istered+0.103
cases+0.097
lic+0.094
ider+0.091
uts+0.091
Neg logit contributions
 home-0.067
keeping-0.065
 itself-0.063
 him-0.060
 side-0.059
Token was
Token ID373
Feature activation+6.32e-02

Pos logit contributions
istered+0.137
cases+0.136
ider+0.128
plets+0.127
arest+0.127
Neg logit contributions
keeping-0.099
 itself-0.082
 him-0.081
 home-0.076
ie-0.073
Token happy
Token ID3772
Feature activation+1.47e-02

Pos logit contributions
cases+0.122
istered+0.120
ider+0.114
plets+0.111
arest+0.110
Neg logit contributions
keeping-0.112
 itself-0.105
 home-0.104
 him-0.100
gy-0.091
Token Bob
Token ID5811
Feature activation-2.67e-02

Pos logit contributions
istered+0.043
cases+0.042
ider+0.040
arest+0.040
plets+0.040
Neg logit contributions
keeping-0.019
 itself-0.015
 him-0.015
 home-0.013
 Joey-0.012
Token felt
Token ID2936
Feature activation+7.26e-02

Pos logit contributions
keeping+0.044
 itself+0.037
 him+0.037
 home+0.035
ie+0.034
Neg logit contributions
istered-0.055
cases-0.054
ider-0.051
arest-0.050
plets-0.050
Token relieved
Token ID22598
Feature activation-6.94e-03

Pos logit contributions
istered+0.125
cases+0.124
ider+0.115
plets+0.113
arest+0.113
Neg logit contributions
keeping-0.143
 itself-0.127
 him-0.125
 home-0.119
ie-0.114
Token.
Token ID13
Feature activation-2.81e-02

Pos logit contributions
keeping+0.013
 itself+0.010
gy+0.010
 him+0.010
ie+0.010
Neg logit contributions
istered-0.013
cases-0.012
ider-0.012
plets-0.011
arest-0.011
Token
Token ID198
Feature activation+8.72e-02

Pos logit contributions
keeping+0.047
ie+0.041
 itself+0.039
 him+0.038
 home+0.037
Neg logit contributions
cases-0.055
istered-0.053
ider-0.051
plets-0.050
uts-0.049
Token
Token ID198
Feature activation+7.71e-02

Pos logit contributions
cases+0.158
istered+0.156
arest+0.149
lic+0.147
plets+0.145
Neg logit contributions
keeping-0.160
 itself-0.132
 him-0.128
ie-0.127
 home-0.125
TokenBob
Token ID18861
Feature activation-8.86e-03

Pos logit contributions
istered+0.129
cases+0.125
ider+0.114
plets+0.114
arest+0.111
Neg logit contributions
keeping-0.159
 itself-0.142
 him-0.140
 home-0.131
 Joey-0.128
Token had
Token ID550
Feature activation+2.91e-02

Pos logit contributions
keeping+0.015
 itself+0.013
 him+0.013
 home+0.012
ie+0.011
Neg logit contributions
istered-0.017
cases-0.017
ider-0.016
plets-0.016
arest-0.016
Token been
Token ID587
Feature activation+5.09e-02

Pos logit contributions
istered+0.058
cases+0.057
ider+0.054
plets+0.053
arest+0.052
Neg logit contributions
keeping-0.051
 itself-0.045
 him-0.044
 home-0.042
 Joey-0.039
Token wise
Token ID10787
Feature activation+2.11e-02

Pos logit contributions
cases+0.086
istered+0.086
ider+0.078
plets+0.077
arest+0.076
Neg logit contributions
keeping-0.104
 itself-0.092
 him-0.091
 home-0.090
ie-0.084
Token and
Token ID290
Feature activation-9.34e-03

Pos logit contributions
istered+0.044
cases+0.043
plets+0.040
ider+0.040
arest+0.040
Neg logit contributions
keeping-0.035
 itself-0.029
 him-0.029
 home-0.028
ie-0.026
Token each
Token ID1123
Feature activation-1.52e-02

Pos logit contributions
keeping+0.039
 itself+0.031
 him+0.030
 home+0.029
gy+0.028
Neg logit contributions
istered-0.055
cases-0.054
ider-0.050
arest-0.050
plets-0.050
Token other
Token ID584
Feature activation-3.39e-02

Pos logit contributions
keeping+0.028
 itself+0.025
 him+0.024
 home+0.023
ie+0.023
Neg logit contributions
istered-0.028
cases-0.028
ider-0.026
plets-0.025
arest-0.025
Token secrets
Token ID13141
Feature activation+2.67e-02

Pos logit contributions
keeping+0.062
 itself+0.052
 him+0.051
 home+0.049
ie+0.048
Neg logit contributions
istered-0.063
cases-0.063
ider-0.057
arest-0.057
plets-0.057
Token.
Token ID13
Feature activation-2.91e-02

Pos logit contributions
istered+0.065
cases+0.061
plets+0.061
ider+0.060
arest+0.059
Neg logit contributions
keeping-0.034
 itself-0.030
 him-0.029
 home-0.028
ie-0.027
Token
Token ID198
Feature activation+8.54e-02

Pos logit contributions
keeping+0.050
 itself+0.041
 him+0.040
 home+0.037
ie+0.037
Neg logit contributions
cases-0.058
istered-0.058
ider-0.054
plets-0.053
br-0.053
Token
Token ID198
Feature activation+7.22e-02

Pos logit contributions
istered+0.206
cases+0.200
ider+0.190
plets+0.190
arest+0.183
Neg logit contributions
keeping-0.107
 him-0.098
 itself-0.093
 home-0.089
ie-0.085
TokenOne
Token ID3198
Feature activation+5.88e-02

Pos logit contributions
istered+0.135
cases+0.134
plets+0.123
uts+0.114
ider+0.112
Neg logit contributions
 him-0.118
 itself-0.117
keeping-0.115
ie-0.115
 home-0.112
Token day
Token ID1110
Feature activation+2.97e-02

Pos logit contributions
istered+0.169
cases+0.167
ider+0.159
plets+0.159
arest+0.158
Neg logit contributions
keeping-0.050
 itself-0.035
 him-0.033
 home-0.029
ie-0.027
Token,
Token ID11
Feature activation+2.17e-02

Pos logit contributions
istered+0.072
cases+0.070
ider+0.066
arest+0.066
plets+0.066
Neg logit contributions
keeping-0.038
 him-0.031
 itself-0.031
 home-0.029
ie-0.027
Token Lily
Token ID20037
Feature activation-2.32e-02

Pos logit contributions
istered+0.058
cases+0.057
arest+0.054
ider+0.054
plets+0.054
Neg logit contributions
keeping-0.022
 him-0.017
 itself-0.017
 home-0.016
ie-0.014
Token's
Token ID338
Feature activation+2.66e-02

Pos logit contributions
keeping+0.036
 itself+0.030
 him+0.030
 home+0.028
ie+0.027
Neg logit contributions
istered-0.050
cases-0.049
ider-0.046
arest-0.046
plets-0.046
Token toys
Token ID14958
Feature activation+7.20e-03

Pos logit contributions
keeping+0.031
 home+0.028
 him+0.028
 itself+0.026
gy+0.025
Neg logit contributions
istered-0.040
cases-0.040
plets-0.039
ider-0.038
arest-0.038
Token,
Token ID11
Feature activation+9.58e-03

Pos logit contributions
istered+0.017
cases+0.017
plets+0.016
arest+0.016
ider+0.016
Neg logit contributions
keeping-0.009
 him-0.008
 home-0.008
 itself-0.008
ie-0.007
Token but
Token ID475
Feature activation-4.06e-02

Pos logit contributions
istered+0.020
ider+0.019
cases+0.018
arest+0.018
plets+0.018
Neg logit contributions
 him-0.016
keeping-0.016
 his-0.015
ie-0.014
 itself-0.014
Token he
Token ID339
Feature activation+6.87e-03

Pos logit contributions
keeping+0.076
 itself+0.057
 him+0.051
 home+0.051
gent+0.051
Neg logit contributions
cases-0.075
istered-0.073
 short-0.071
ider-0.070
br-0.069
Token didn
Token ID1422
Feature activation+4.70e-02

Pos logit contributions
istered+0.012
cases+0.012
ider+0.011
arest+0.011
plets+0.011
Neg logit contributions
keeping-0.014
 itself-0.012
 him-0.011
 home-0.011
 Joey-0.010
Token't
Token ID470
Feature activation+1.86e-02

Pos logit contributions
istered+0.084
cases+0.077
ider+0.074
aj+0.071
plets+0.071
Neg logit contributions
 him-0.088
keeping-0.086
 itself-0.083
 himself-0.076
 home-0.076
Token like
Token ID588
Feature activation-1.60e-02

Pos logit contributions
istered+0.034
cases+0.033
ider+0.031
arest+0.031
plets+0.031
Neg logit contributions
keeping-0.035
 itself-0.030
 him-0.029
 home-0.027
ie-0.027
Token to
Token ID284
Feature activation-6.80e-02

Pos logit contributions
keeping+0.022
 itself+0.018
gy+0.017
 home+0.016
ie+0.016
Neg logit contributions
cases-0.035
istered-0.035
br-0.033
 annoying-0.032
uts-0.032
Token share
Token ID2648
Feature activation-2.80e-02

Pos logit contributions
keeping+0.105
 itself+0.084
 him+0.074
 home+0.073
 Joey+0.072
Neg logit contributions
istered-0.147
cases-0.142
ider-0.136
arest-0.134
icious-0.134
Token them
Token ID606
Feature activation-2.51e-02

Pos logit contributions
keeping+0.023
 itself+0.015
 him+0.014
 home+0.012
ie+0.011
Neg logit contributions
istered-0.081
cases-0.080
ider-0.077
arest-0.076
plets-0.076
Token with
Token ID351
Feature activation-4.44e-02

Pos logit contributions
keeping+0.036
 him+0.030
 itself+0.029
 home+0.028
ie+0.026
Neg logit contributions
istered-0.057
cases-0.056
plets-0.054
ider-0.053
arest-0.053
Token Sam
Token ID3409
Feature activation-1.85e-02

Pos logit contributions
keeping+0.067
 itself+0.054
 him+0.052
ie+0.051
 home+0.050
Neg logit contributions
istered-0.091
cases-0.090
plets-0.085
ider-0.085
arest-0.083
Token pulled
Token ID5954
Feature activation-1.21e-02

Pos logit contributions
keeping+0.028
 him+0.025
 itself+0.024
 home+0.022
ie+0.022
Neg logit contributions
istered-0.040
cases-0.038
arest-0.038
ider-0.037
plets-0.037
Token back
Token ID736
Feature activation-5.59e-02

Pos logit contributions
keeping+0.021
 generation+0.015
gy+0.015
stack+0.013
bank+0.013
Neg logit contributions
istered-0.025
 short-0.025
cases-0.024
 long-0.024
br-0.023
Token.
Token ID13
Feature activation-3.97e-02

Pos logit contributions
keeping+0.097
 itself+0.080
 Acc+0.078
 generation+0.076
gent+0.073
Neg logit contributions
cases-0.104
 short-0.095
istered-0.093
 long-0.092
br-0.089
Token They
Token ID1119
Feature activation+1.44e-02

Pos logit contributions
keeping+0.067
ie+0.053
 itself+0.052
 him+0.051
 home+0.049
Neg logit contributions
cases-0.080
istered-0.079
lic-0.075
plets-0.075
ider-0.075
Token both
Token ID1111
Feature activation+2.23e-02

Pos logit contributions
istered+0.023
cases+0.023
ider+0.021
plets+0.021
arest+0.020
Neg logit contributions
keeping-0.031
 itself-0.027
 him-0.026
 home-0.025
ie-0.025
Token fell
Token ID3214
Feature activation+5.31e-03

Pos logit contributions
istered+0.038
cases+0.038
ider+0.035
arest+0.035
plets+0.035
Neg logit contributions
keeping-0.044
 itself-0.038
 him-0.038
 home-0.036
ie-0.035
Token down
Token ID866
Feature activation-1.34e-02

Pos logit contributions
istered+0.012
cases+0.012
ider+0.011
plets+0.011
br+0.011
Neg logit contributions
keeping-0.008
 itself-0.006
 him-0.006
ie-0.005
 Joey-0.005
Token.
Token ID13
Feature activation-3.61e-02

Pos logit contributions
keeping+0.024
 itself+0.020
 him+0.020
ie+0.019
 Joey+0.019
Neg logit contributions
istered-0.025
cases-0.025
ider-0.023
plets-0.022
br-0.022
Token The
Token ID383
Feature activation+4.33e-02

Pos logit contributions
keeping+0.054
 itself+0.044
 him+0.043
 home+0.041
ie+0.040
Neg logit contributions
istered-0.079
cases-0.078
ider-0.074
plets-0.074
arest-0.073
Token doll
Token ID3654
Feature activation+3.03e-02

Pos logit contributions
istered+0.094
cases+0.093
plets+0.090
arest+0.087
lic+0.087
Neg logit contributions
keeping-0.066
 home-0.056
 him-0.055
 itself-0.054
ie-0.050
Token do
Token ID466
Feature activation-3.53e-02

Pos logit contributions
keeping+0.015
 itself+0.012
 him+0.012
 home+0.011
ie+0.011
Neg logit contributions
istered-0.019
cases-0.019
arest-0.018
ider-0.018
plets-0.018
Token it
Token ID340
Feature activation-1.59e-02

Pos logit contributions
keeping+0.037
 itself+0.026
 him+0.025
 home+0.023
ie+0.023
Neg logit contributions
istered-0.094
cases-0.093
arest-0.088
ider-0.088
plets-0.088
Token.
Token ID13
Feature activation-3.36e-02

Pos logit contributions
keeping+0.025
 itself+0.021
 him+0.021
 home+0.020
ie+0.019
Neg logit contributions
istered-0.033
cases-0.033
ider-0.031
arest-0.031
plets-0.031
Token
Token ID198
Feature activation+8.30e-02

Pos logit contributions
keeping+0.062
ie+0.052
 itself+0.051
 him+0.049
gy+0.049
Neg logit contributions
cases-0.060
istered-0.056
plets-0.054
uts-0.053
ider-0.052
Token
Token ID198
Feature activation+7.34e-02

Pos logit contributions
cases+0.143
istered+0.139
arest+0.133
lic+0.129
plets+0.127
Neg logit contributions
keeping-0.159
 itself-0.133
 him-0.127
gy-0.126
ie-0.125
Token"
Token ID1
Feature activation+2.38e-02

Pos logit contributions
istered+0.109
cases+0.107
ider+0.098
arest+0.097
plets+0.097
Neg logit contributions
keeping-0.163
 itself-0.144
 him-0.142
 home-0.137
ie-0.134
TokenLet
Token ID5756
Feature activation-4.39e-02

Pos logit contributions
istered+0.049
cases+0.048
ider+0.044
arest+0.043
plets+0.043
Neg logit contributions
keeping-0.041
 itself-0.035
 him-0.034
 home-0.032
 Joey-0.030
Token me
Token ID502
Feature activation-4.74e-02

Pos logit contributions
keeping+0.081
 itself+0.070
 him+0.069
 home+0.066
ie+0.064
Neg logit contributions
istered-0.082
cases-0.081
ider-0.076
plets-0.075
arest-0.074
Token help
Token ID1037
Feature activation-5.53e-02

Pos logit contributions
 home+0.054
keeping+0.053
 him+0.052
 itself+0.048
ie+0.045
Neg logit contributions
istered-0.121
cases-0.119
ider-0.108
plets-0.107
lic-0.105
Token you
Token ID345
Feature activation-4.22e-02

Pos logit contributions
keeping+0.115
 itself+0.093
 generation+0.091
gy+0.091
gent+0.089
Neg logit contributions
istered-0.080
cases-0.079
arest-0.079
keys-0.077
plets-0.076
Token,"
Token ID553
Feature activation-2.90e-03

Pos logit contributions
keeping+0.067
 itself+0.058
 him+0.057
 home+0.056
ie+0.052
Neg logit contributions
istered-0.090
cases-0.088
ider-0.083
plets-0.082
arest-0.081
Token the
Token ID262
Feature activation+6.99e-03

Pos logit contributions
keeping+0.036
ie+0.023
 itself+0.023
 generation+0.022
 Acc+0.022
Neg logit contributions
cases-0.066
istered-0.063
 short-0.061
 long-0.061
br-0.058
Token door
Token ID3420
Feature activation+3.91e-03

Pos logit contributions
istered+0.011
cases+0.011
ider+0.010
arest+0.009
plets+0.009
Neg logit contributions
keeping-0.015
 itself-0.013
 him-0.013
ie-0.012
 Joey-0.012
Token.
Token ID13
Feature activation-3.83e-02

Pos logit contributions
istered+0.008
cases+0.008
plets+0.008
arest+0.008
ider+0.008
Neg logit contributions
keeping-0.006
 itself-0.005
 him-0.005
 home-0.005
ie-0.005
Token She
Token ID1375
Feature activation+2.76e-04

Pos logit contributions
keeping+0.079
ie+0.071
 him+0.065
gy+0.065
 home+0.065
Neg logit contributions
cases-0.062
istered-0.056
lic-0.056
br-0.054
plets-0.054
Token had
Token ID550
Feature activation+3.01e-02

Pos logit contributions
istered+0.001
cases+0.001
arest+0.001
ider+0.001
plets+0.001
Neg logit contributions
keeping-0.000
 him-0.000
 itself-0.000
 home-0.000
ie-0.000
Token a
Token ID257
Feature activation+6.87e-02

Pos logit contributions
istered+0.063
ider+0.062
cases+0.062
plets+0.061
arest+0.060
Neg logit contributions
keeping-0.050
 him-0.044
 itself-0.043
 home-0.042
 his-0.037
Token kind
Token ID1611
Feature activation+1.35e-02

Pos logit contributions
cases+0.112
plets+0.106
ider+0.104
istered+0.102
uts+0.099
Neg logit contributions
 home-0.139
keeping-0.133
 him-0.124
gy-0.118
 his-0.118
Token face
Token ID1986
Feature activation+1.45e-02

Pos logit contributions
istered+0.027
cases+0.027
ider+0.025
plets+0.025
arest+0.025
Neg logit contributions
keeping-0.023
 itself-0.020
 him-0.019
 home-0.019
ie-0.018
Token and
Token ID290
Feature activation-1.07e-02

Pos logit contributions
istered+0.032
cases+0.031
arest+0.030
plets+0.029
icious+0.029
Neg logit contributions
keeping-0.022
 itself-0.018
 him-0.018
 home-0.017
ie-0.016
Token a
Token ID257
Feature activation+6.50e-02

Pos logit contributions
keeping+0.016
 itself+0.014
 him+0.014
 home+0.013
 his+0.012
Neg logit contributions
istered-0.024
cases-0.023
ider-0.022
plets-0.022
arest-0.022
Token big
Token ID1263
Feature activation+2.16e-02

Pos logit contributions
cases+0.110
plets+0.099
istered+0.099
ider+0.099
arest+0.094
Neg logit contributions
 home-0.125
keeping-0.124
 him-0.111
 his-0.110
 itself-0.110
Token share
Token ID2648
Feature activation-2.89e-02

Pos logit contributions
keeping+0.031
 itself+0.025
 him+0.025
 home+0.024
ie+0.023
Neg logit contributions
istered-0.042
cases-0.041
ider-0.039
plets-0.038
arest-0.038
Token it
Token ID340
Feature activation-1.65e-02

Pos logit contributions
keeping+0.037
 itself+0.028
 him+0.027
gy+0.026
ie+0.026
Neg logit contributions
istered-0.068
cases-0.067
br-0.062
arest-0.062
plets-0.062
Token nicely
Token ID16576
Feature activation-3.40e-02

Pos logit contributions
keeping+0.025
 itself+0.021
 him+0.021
 home+0.019
ie+0.019
Neg logit contributions
istered-0.036
cases-0.035
ider-0.033
arest-0.033
plets-0.033
Token.
Token ID13
Feature activation-2.92e-02

Pos logit contributions
keeping+0.052
 itself+0.043
 him+0.042
 home+0.040
ie+0.038
Neg logit contributions
istered-0.074
cases-0.073
ider-0.069
plets-0.068
arest-0.068
Token She
Token ID1375
Feature activation+4.68e-03

Pos logit contributions
keeping+0.036
 him+0.032
 itself+0.031
 home+0.028
 O+0.026
Neg logit contributions
istered-0.070
cases-0.069
arest-0.069
plets-0.066
br-0.066
Token also
Token ID635
Feature activation+1.90e-02

Pos logit contributions
istered+0.009
plets+0.009
arest+0.009
ider+0.009
cases+0.009
Neg logit contributions
keeping-0.007
 him-0.007
 itself-0.007
ie-0.007
 home-0.007
Token said
Token ID531
Feature activation+2.21e-02

Pos logit contributions
istered+0.037
plets+0.035
cases+0.035
arest+0.035
ider+0.035
Neg logit contributions
keeping-0.032
 him-0.029
 itself-0.028
 home-0.027
ie-0.027
Token they
Token ID484
Feature activation+3.96e-02

Pos logit contributions
istered+0.039
cases+0.039
ider+0.036
plets+0.036
arest+0.035
Neg logit contributions
keeping-0.043
 itself-0.036
 him-0.035
 home-0.034
ie-0.033
Token had
Token ID550
Feature activation+2.84e-02

Pos logit contributions
istered+0.081
cases+0.079
ider+0.078
plets+0.078
arest+0.076
Neg logit contributions
keeping-0.063
 him-0.059
 itself-0.054
 home-0.051
ie-0.050
Token to
Token ID284
Feature activation-6.62e-02

Pos logit contributions
istered+0.070
cases+0.069
ider+0.066
plets+0.066
arest+0.066
Neg logit contributions
keeping-0.036
 itself-0.029
 him-0.028
 home-0.026
ie-0.024
Token be
Token ID307
Feature activation+3.11e-02

Pos logit contributions
keeping+0.101
 itself+0.082
 him+0.080
 home+0.075
ie+0.074
Neg logit contributions
istered-0.142
cases-0.139
ider-0.132
arest-0.131
plets-0.129
Token.
Token ID13
Feature activation-4.63e-02

Pos logit contributions
istered+0.035
cases+0.034
arest+0.033
plets+0.032
br+0.032
Neg logit contributions
keeping-0.022
 home-0.018
 itself-0.017
 him-0.017
ie-0.017
Token Then
Token ID3244
Feature activation-1.76e-02

Pos logit contributions
keeping+0.058
 itself+0.049
 him+0.047
 home+0.042
 Joey+0.042
Neg logit contributions
cases-0.112
istered-0.112
arest-0.105
ider-0.104
br-0.103
Token I
Token ID314
Feature activation-1.40e-02

Pos logit contributions
keeping+0.032
 itself+0.027
 him+0.027
 home+0.025
ie+0.025
Neg logit contributions
istered-0.033
cases-0.033
ider-0.031
arest-0.030
plets-0.030
Token will
Token ID481
Feature activation-1.03e-02

Pos logit contributions
keeping+0.021
 him+0.019
 home+0.018
 itself+0.018
ie+0.017
Neg logit contributions
cases-0.030
istered-0.028
plets-0.027
ider-0.027
arest-0.027
Token cook
Token ID4255
Feature activation-6.68e-03

Pos logit contributions
keeping+0.014
 him+0.012
 itself+0.011
 home+0.011
 Joey+0.010
Neg logit contributions
cases-0.025
istered-0.025
plets-0.023
ider-0.023
arest-0.023
Token it
Token ID340
Feature activation-1.09e-02

Pos logit contributions
keeping+0.008
 itself+0.007
 him+0.006
 home+0.006
ie+0.006
Neg logit contributions
istered-0.017
cases-0.016
ider-0.016
plets-0.015
arest-0.015
Token for
Token ID329
Feature activation-7.50e-02

Pos logit contributions
keeping+0.015
 itself+0.013
 him+0.013
 home+0.013
ie+0.012
Neg logit contributions
istered-0.025
cases-0.025
plets-0.023
arest-0.023
ider-0.023
Token dinner
Token ID8073
Feature activation-1.23e-02

Pos logit contributions
keeping+0.118
 itself+0.101
 him+0.100
 home+0.094
ie+0.089
Neg logit contributions
istered-0.161
cases-0.160
ider-0.148
plets-0.148
arest-0.146
Token."
Token ID526
Feature activation-1.74e-02

Pos logit contributions
keeping+0.016
 home+0.014
 him+0.013
 itself+0.012
ie+0.011
Neg logit contributions
istered-0.031
cases-0.029
plets-0.029
arest-0.028
lic-0.027
Token
Token ID198
Feature activation+7.83e-02

Pos logit contributions
keeping+0.035
 itself+0.029
gy+0.027
ie+0.027
 generation+0.027
Neg logit contributions
cases-0.030
istered-0.029
ider-0.027
aj-0.027
plets-0.026
Token
Token ID198
Feature activation+6.78e-02

Pos logit contributions
istered+0.183
cases+0.181
plets+0.171
ider+0.171
arest+0.166
Neg logit contributions
keeping-0.105
 him-0.094
 itself-0.089
 home-0.086
ie-0.081
Token to
Token ID284
Feature activation-7.37e-02

Pos logit contributions
keeping+0.043
 Joey+0.039
 itself+0.038
riage+0.036
 Suz+0.036
Neg logit contributions
 long-0.028
 short-0.026
uts-0.024
istered-0.023
cases-0.023
Token play
Token ID711
Feature activation-2.56e-02

Pos logit contributions
keeping+0.091
 itself+0.067
 him+0.058
 home+0.057
 Joey+0.056
Neg logit contributions
istered-0.184
cases-0.183
ider-0.175
arest-0.174
br-0.172
Token with
Token ID351
Feature activation-5.69e-02

Pos logit contributions
keeping+0.045
 itself+0.037
 Joey+0.035
 him+0.033
ie+0.033
Neg logit contributions
istered-0.050
cases-0.049
ider-0.047
br-0.047
arest-0.045
Token all
Token ID477
Feature activation-2.78e-02

Pos logit contributions
keeping+0.106
 itself+0.085
 Joey+0.082
gy+0.079
 him+0.076
Neg logit contributions
cases-0.104
istered-0.103
ider-0.097
br-0.095
 short-0.095
Token the
Token ID262
Feature activation-5.08e-03

Pos logit contributions
keeping+0.044
 itself+0.032
ASH+0.030
gy+0.030
 generation+0.028
Neg logit contributions
br-0.055
 long-0.053
 short-0.053
hes-0.053
 annoying-0.052
Token time
Token ID640
Feature activation-3.42e-02

Pos logit contributions
keeping+0.008
 him+0.008
 home+0.007
 itself+0.007
ie+0.007
Neg logit contributions
istered-0.011
cases-0.010
plets-0.010
ider-0.010
arest-0.010
Token.
Token ID13
Feature activation-3.46e-02

Pos logit contributions
keeping+0.049
 itself+0.041
 him+0.041
 home+0.038
ie+0.036
Neg logit contributions
istered-0.079
cases-0.075
ider-0.073
arest-0.073
plets-0.073
Token One
Token ID1881
Feature activation+4.44e-02

Pos logit contributions
keeping+0.077
 itself+0.060
 him+0.055
away+0.054
ie+0.054
Neg logit contributions
cases-0.054
istered-0.052
br-0.049
 short-0.049
icious-0.048
Token day
Token ID1110
Feature activation+2.40e-02

Pos logit contributions
istered+0.110
cases+0.108
arest+0.104
ider+0.103
 short+0.103
Neg logit contributions
keeping-0.057
 itself-0.041
 him-0.037
 Joey-0.031
 generation-0.030
Token,
Token ID11
Feature activation+1.59e-02

Pos logit contributions
istered+0.052
cases+0.051
ider+0.048
br+0.048
plets+0.048
Neg logit contributions
keeping-0.038
 itself-0.031
 him-0.029
 home-0.027
ie-0.026
Token her
Token ID607
Feature activation-1.07e-03

Pos logit contributions
istered+0.032
cases+0.031
ider+0.030
 short+0.029
br+0.029
Neg logit contributions
keeping-0.029
 itself-0.023
gy-0.020
 generation-0.020
 him-0.020
TokenDo
Token ID5211
Feature activation-2.16e-02

Pos logit contributions
istered+0.025
cases+0.024
plets+0.024
ider+0.024
arest+0.023
Neg logit contributions
keeping-0.024
 itself-0.021
 him-0.020
 home-0.020
ie-0.018
Token you
Token ID345
Feature activation-1.05e-02

Pos logit contributions
keeping+0.035
 him+0.033
 itself+0.032
 home+0.029
ie+0.029
Neg logit contributions
istered-0.044
cases-0.043
plets-0.040
ider-0.040
arest-0.039
Token want
Token ID765
Feature activation-1.59e-03

Pos logit contributions
keeping+0.020
 itself+0.017
 him+0.017
 home+0.016
ie+0.016
Neg logit contributions
istered-0.019
cases-0.019
ider-0.017
arest-0.017
plets-0.017
Token to
Token ID284
Feature activation-7.32e-02

Pos logit contributions
keeping+0.002
 itself+0.002
 him+0.002
 home+0.002
ie+0.001
Neg logit contributions
istered-0.004
cases-0.004
ider-0.004
arest-0.004
plets-0.004
Token go
Token ID467
Feature activation-2.84e-02

Pos logit contributions
keeping+0.099
 itself+0.080
 him+0.079
 home+0.073
ie+0.070
Neg logit contributions
istered-0.173
cases-0.171
ider-0.161
plets-0.160
arest-0.160
Token this
Token ID428
Feature activation-1.98e-02

Pos logit contributions
keeping+0.046
 itself+0.038
 him+0.036
 home+0.033
ie+0.033
Neg logit contributions
istered-0.059
cases-0.059
ider-0.056
arest-0.055
plets-0.055
Token way
Token ID835
Feature activation-5.37e-02

Pos logit contributions
keeping+0.032
 itself+0.028
 him+0.027
 home+0.027
ie+0.026
Neg logit contributions
istered-0.042
cases-0.040
plets-0.038
ider-0.038
arest-0.037
Token?"
Token ID1701
Feature activation-1.32e-02

Pos logit contributions
keeping+0.091
 itself+0.076
 him+0.075
 home+0.070
ie+0.068
Neg logit contributions
istered-0.106
cases-0.106
ider-0.099
arest-0.099
plets-0.099
Token M
Token ID337
Feature activation+1.41e-02

Pos logit contributions
keeping+0.025
 generation+0.020
 itself+0.019
away+0.018
 home+0.017
Neg logit contributions
lic-0.024
istered-0.022
 long-0.022
 short-0.022
aj-0.022
Tokenimi
Token ID25236
Feature activation-2.11e-03

Pos logit contributions
istered+0.018
cases+0.018
ider+0.016
plets+0.016
arest+0.016
Neg logit contributions
keeping-0.034
 itself-0.030
 him-0.030
 home-0.029
ie-0.028
Token me
Token ID502
Feature activation+1.28e-02

Pos logit contributions
keeping+0.004
 itself+0.004
 him+0.004
 home+0.003
ie+0.003
Neg logit contributions
istered-0.004
cases-0.004
ider-0.003
plets-0.003
arest-0.003
Tokenmy
Token ID1820
Feature activation-7.57e-03

Pos logit contributions
keeping+0.067
 itself+0.057
 generation+0.053
gent+0.049
 Joey+0.048
Neg logit contributions
istered-0.061
cases-0.060
 short-0.055
opy-0.055
ider-0.055
Token,
Token ID11
Feature activation+2.97e-02

Pos logit contributions
keeping+0.012
 itself+0.010
 him+0.009
 home+0.009
ie+0.009
Neg logit contributions
istered-0.016
cases-0.016
ider-0.015
plets-0.015
br-0.014
Token asked
Token ID1965
Feature activation-3.30e-02

Pos logit contributions
istered+0.058
cases+0.058
ider+0.054
plets+0.053
arest+0.053
Neg logit contributions
keeping-0.051
 itself-0.043
 him-0.041
 home-0.040
ie-0.039
Token her
Token ID607
Feature activation-1.13e-02

Pos logit contributions
keeping+0.062
 generation+0.046
|+0.045
gy+0.044
gent+0.043
Neg logit contributions
 long-0.056
 sweet-0.055
 short-0.055
 loud-0.055
cases-0.053
Token,
Token ID11
Feature activation+1.63e-02

Pos logit contributions
keeping+0.019
 itself+0.015
 him+0.014
ie+0.014
 generation+0.013
Neg logit contributions
cases-0.023
istered-0.023
ider-0.022
br-0.021
arest-0.021
Token "
Token ID366
Feature activation+3.00e-04

Pos logit contributions
istered+0.029
cases+0.028
ider+0.027
plets+0.026
arest+0.026
Neg logit contributions
keeping-0.032
 itself-0.026
 him-0.025
ie-0.025
 home-0.024
TokenWhy
Token ID5195
Feature activation-2.84e-02

Pos logit contributions
istered+0.001
cases+0.001
ider+0.001
plets+0.001
arest+0.001
Neg logit contributions
keeping-0.001
 itself-0.000
 him-0.000
 home-0.000
ie-0.000
Token do
Token ID466
Feature activation-2.60e-02

Pos logit contributions
 him+0.040
keeping+0.040
 itself+0.036
 home+0.034
ie+0.034
Neg logit contributions
istered-0.063
ider-0.062
plets-0.060
arest-0.060
cases-0.059
Token you
Token ID345
Feature activation-3.23e-02

Pos logit contributions
 him+0.037
keeping+0.037
 itself+0.034
 his+0.031
 Joey+0.031
Neg logit contributions
istered-0.058
cases-0.054
ider-0.053
plets-0.052
arest-0.051
Token write
Token ID3551
Feature activation+6.21e-03

Pos logit contributions
 him+0.041
keeping+0.040
 home+0.035
ie+0.034
gy+0.034
Neg logit contributions
cases-0.078
istered-0.077
ider-0.074
plets-0.073
br-0.073
Token in
Token ID287
Feature activation-6.67e-02

Pos logit contributions
istered+0.015
cases+0.014
plets+0.014
ider+0.014
arest+0.013
Neg logit contributions
 him-0.008
keeping-0.008
 itself-0.008
 home-0.007
ie-0.007
Token home
Token ID1363
Feature activation-3.21e-02

Pos logit contributions
keeping+0.031
 itself+0.024
 him+0.023
ie+0.021
 home+0.021
Neg logit contributions
istered-0.056
cases-0.056
ider-0.053
arest-0.052
plets-0.052
Token.
Token ID13
Feature activation-2.77e-02

Pos logit contributions
keeping+0.047
 itself+0.039
 him+0.038
 home+0.037
ie+0.034
Neg logit contributions
istered-0.072
cases-0.071
plets-0.068
arest-0.067
ider-0.067
Token
Token ID198
Feature activation+8.66e-02

Pos logit contributions
keeping+0.051
ie+0.044
 itself+0.043
 him+0.041
gy+0.039
Neg logit contributions
cases-0.049
istered-0.046
plets-0.044
uts-0.044
lic-0.044
Token
Token ID198
Feature activation+7.64e-02

Pos logit contributions
istered+0.166
cases+0.165
arest+0.156
ider+0.152
plets+0.152
Neg logit contributions
keeping-0.152
 itself-0.126
 him-0.122
 home-0.118
ie-0.117
TokenThe
Token ID464
Feature activation+4.33e-02

Pos logit contributions
istered+0.122
cases+0.119
ider+0.109
plets+0.108
arest+0.106
Neg logit contributions
keeping-0.162
 itself-0.144
 him-0.142
 home-0.136
ie-0.132
Token other
Token ID584
Feature activation-2.57e-03

Pos logit contributions
istered+0.101
cases+0.096
plets+0.091
icious+0.087
ider+0.087
Neg logit contributions
keeping-0.064
 home-0.057
 him-0.054
 itself-0.054
ie-0.048
Token people
Token ID661
Feature activation+2.93e-02

Pos logit contributions
keeping+0.004
 itself+0.004
 him+0.004
 home+0.004
ie+0.003
Neg logit contributions
istered-0.005
cases-0.005
ider-0.005
plets-0.005
arest-0.005
Token tried
Token ID3088
Feature activation+5.14e-03

Pos logit contributions
istered+0.057
cases+0.056
ider+0.053
arest+0.052
plets+0.052
Neg logit contributions
keeping-0.051
 itself-0.043
 him-0.042
 home-0.040
ie-0.039
Token to
Token ID284
Feature activation-7.26e-02

Pos logit contributions
istered+0.012
cases+0.012
ider+0.011
arest+0.011
plets+0.011
Neg logit contributions
keeping-0.007
 itself-0.005
 him-0.005
 home-0.005
ie-0.005
Token apologize
Token ID16521
Feature activation-2.72e-02

Pos logit contributions
keeping+0.097
 him+0.086
 itself+0.083
 home+0.078
ie+0.074
Neg logit contributions
istered-0.175
cases-0.173
plets-0.166
ider-0.163
arest-0.158
Token to
Token ID284
Feature activation-6.63e-02

Pos logit contributions
keeping+0.045
 itself+0.036
 him+0.034
ie+0.033
 generation+0.033
Neg logit contributions
cases-0.056
istered-0.054
ider-0.050
br-0.049
arest-0.048
Token,
Token ID11
Feature activation+1.64e-02

Pos logit contributions
keeping+0.075
 itself+0.058
 him+0.054
 generation+0.053
gent+0.052
Neg logit contributions
cases-0.073
istered-0.071
br-0.069
ider-0.069
 long-0.066
Token the
Token ID262
Feature activation+3.31e-02

Pos logit contributions
cases+0.037
istered+0.037
plets+0.036
br+0.036
 short+0.036
Neg logit contributions
keeping-0.022
 itself-0.016
 him-0.013
gent-0.013
ie-0.013
Token spot
Token ID4136
Feature activation+4.18e-02

Pos logit contributions
istered+0.069
cases+0.068
ider+0.063
plets+0.063
arest+0.063
Neg logit contributions
keeping-0.054
 itself-0.046
 him-0.045
 home-0.044
ie-0.042
Token was
Token ID373
Feature activation+4.43e-02

Pos logit contributions
cases+0.083
istered+0.081
ider+0.077
plets+0.074
br+0.074
Neg logit contributions
keeping-0.070
 itself-0.057
 him-0.055
 generation-0.052
 Joey-0.052
Token visited
Token ID8672
Feature activation-6.07e-03

Pos logit contributions
cases+0.086
istered+0.086
ider+0.081
plets+0.080
arest+0.079
Neg logit contributions
keeping-0.077
 itself-0.070
 him-0.070
 home-0.068
ie-0.063
Token by
Token ID416
Feature activation-6.73e-02

Pos logit contributions
keeping+0.010
 itself+0.007
 him+0.007
ie+0.007
gent+0.007
Neg logit contributions
cases-0.012
istered-0.012
br-0.011
 short-0.011
ider-0.011
Token many
Token ID867
Feature activation-1.46e-02

Pos logit contributions
keeping+0.102
 him+0.078
 itself+0.076
ie+0.075
 home+0.072
Neg logit contributions
istered-0.140
cases-0.134
ider-0.129
plets-0.128
arest-0.127
Token animals
Token ID4695
Feature activation-6.22e-03

Pos logit contributions
keeping+0.025
 itself+0.022
 him+0.022
 home+0.021
ie+0.020
Neg logit contributions
istered-0.029
cases-0.029
ider-0.027
plets-0.027
arest-0.026
Token.
Token ID13
Feature activation-3.45e-02

Pos logit contributions
keeping+0.010
 itself+0.008
 him+0.008
 home+0.008
ie+0.007
Neg logit contributions
istered-0.013
cases-0.013
ider-0.012
plets-0.012
arest-0.012
Token And
Token ID843
Feature activation-2.33e-02

Pos logit contributions
keeping+0.058
ie+0.049
 home+0.048
 itself+0.047
 him+0.046
Neg logit contributions
cases-0.065
istered-0.063
plets-0.063
uts-0.060
br-0.060
Token all
Token ID477
Feature activation+2.10e-02

Pos logit contributions
keeping+0.042
 itself+0.032
gy+0.032
 him+0.032
ie+0.031
Neg logit contributions
cases-0.043
istered-0.041
br-0.039
arest-0.038
 importantly-0.038
Token had
Token ID550
Feature activation+1.25e-02

Pos logit contributions
keeping+0.013
 him+0.011
 itself+0.011
 home+0.010
ie+0.010
Neg logit contributions
istered-0.015
cases-0.015
plets-0.014
ider-0.014
arest-0.014
Token to
Token ID284
Feature activation-7.34e-02

Pos logit contributions
istered+0.027
cases+0.027
ider+0.025
arest+0.025
plets+0.025
Neg logit contributions
keeping-0.019
 itself-0.016
 him-0.015
 home-0.014
ie-0.014
Token measure
Token ID3953
Feature activation-2.33e-02

Pos logit contributions
keeping+0.111
 itself+0.090
 him+0.085
ie+0.083
 home+0.082
Neg logit contributions
istered-0.155
cases-0.153
arest-0.145
ider-0.144
icious-0.142
Token the
Token ID262
Feature activation+8.19e-03

Pos logit contributions
keeping+0.029
 itself+0.021
 him+0.020
gy+0.019
ie+0.019
Neg logit contributions
cases-0.057
istered-0.056
arest-0.053
plets-0.052
ider-0.052
Token price
Token ID2756
Feature activation-2.23e-02

Pos logit contributions
istered+0.016
cases+0.016
plets+0.015
ider+0.015
arest+0.015
Neg logit contributions
keeping-0.014
 him-0.012
 itself-0.012
 home-0.012
ie-0.011
Token carefully
Token ID7773
Feature activation-6.54e-02

Pos logit contributions
keeping+0.035
 itself+0.030
 him+0.029
 home+0.028
ie+0.026
Neg logit contributions
istered-0.047
cases-0.046
plets-0.044
ider-0.044
arest-0.043
Token.
Token ID13
Feature activation-3.58e-02

Pos logit contributions
keeping+0.107
 itself+0.088
 him+0.087
ie+0.083
 Joey+0.081
Neg logit contributions
cases-0.132
istered-0.130
ider-0.124
arest-0.120
plets-0.119
Token The
Token ID383
Feature activation+3.81e-02

Pos logit contributions
keeping+0.049
 itself+0.039
ie+0.037
 him+0.037
 home+0.035
Neg logit contributions
cases-0.084
istered-0.083
ider-0.078
plets-0.078
arest-0.076
Token little
Token ID1310
Feature activation+2.60e-03

Pos logit contributions
istered+0.085
cases+0.081
plets+0.079
arest+0.077
br+0.076
Neg logit contributions
keeping-0.057
 him-0.047
 home-0.047
 itself-0.046
ie-0.041
Token boy
Token ID2933
Feature activation-1.06e-02

Pos logit contributions
cases+0.005
istered+0.005
ider+0.005
plets+0.005
br+0.005
Neg logit contributions
keeping-0.004
 itself-0.004
 him-0.004
ie-0.003
 Joey-0.003
Token thought
Token ID1807
Feature activation+4.36e-02

Pos logit contributions
keeping+0.018
 itself+0.014
 him+0.014
 home+0.013
 Joey+0.013
Neg logit contributions
cases-0.022
istered-0.022
ider-0.020
arest-0.020
plets-0.020
Token twins
Token ID20345
Feature activation+3.53e-02

Pos logit contributions
plets+0.103
cases+0.102
istered+0.099
br+0.095
arest+0.095
Neg logit contributions
 home-0.092
 him-0.087
keeping-0.087
ie-0.077
 his-0.075
Token who
Token ID508
Feature activation-8.61e-03

Pos logit contributions
cases+0.079
istered+0.079
plets+0.075
arest+0.074
ider+0.074
Neg logit contributions
keeping-0.050
 him-0.043
 itself-0.042
 home-0.040
ie-0.038
Token liked
Token ID8288
Feature activation+4.39e-03

Pos logit contributions
keeping+0.016
 him+0.014
 itself+0.014
 home+0.013
ie+0.013
Neg logit contributions
istered-0.016
cases-0.016
ider-0.015
plets-0.014
arest-0.014
Token to
Token ID284
Feature activation-7.41e-02

Pos logit contributions
cases+0.007
 long+0.007
 short+0.007
istered+0.007
 y+0.007
Neg logit contributions
keeping-0.008
 itself-0.007
gy-0.007
ie-0.007
 generation-0.006
Token play
Token ID711
Feature activation-2.41e-02

Pos logit contributions
keeping+0.105
 itself+0.083
 home+0.072
ie+0.070
 him+0.070
Neg logit contributions
cases-0.164
istered-0.163
br-0.156
arest-0.155
 short-0.154
Token outside
Token ID2354
Feature activation-7.37e-02

Pos logit contributions
keeping+0.045
 itself+0.035
 Joey+0.033
ie+0.032
gy+0.032
Neg logit contributions
istered-0.046
cases-0.045
ider-0.043
br-0.043
arest-0.041
Token.
Token ID13
Feature activation-3.61e-02

Pos logit contributions
keeping+0.144
 itself+0.117
 Joey+0.106
 him+0.105
gent+0.103
Neg logit contributions
istered-0.129
cases-0.127
ider-0.125
 short-0.117
br-0.117
Token They
Token ID1119
Feature activation+1.20e-02

Pos logit contributions
keeping+0.069
 itself+0.056
 him+0.053
 home+0.051
ie+0.051
Neg logit contributions
istered-0.068
cases-0.067
ider-0.063
plets-0.060
br-0.060
Token had
Token ID550
Feature activation+3.03e-02

Pos logit contributions
istered+0.021
cases+0.021
ider+0.020
plets+0.019
arest+0.019
Neg logit contributions
keeping-0.023
 itself-0.020
 him-0.020
 home-0.019
ie-0.018
Token many
Token ID867
Feature activation+8.87e-03

Pos logit contributions
istered+0.067
cases+0.065
ider+0.064
plets+0.064
arest+0.062
Neg logit contributions
keeping-0.048
 him-0.041
 itself-0.041
 home-0.040
 his-0.036
Token toys
Token ID14958
Feature activation+2.08e-02

Pos logit contributions
istered+0.018
plets+0.017
cases+0.017
ider+0.016
arest+0.016
Neg logit contributions
keeping-0.015
 him-0.014
 home-0.013
 itself-0.013
ie-0.013
Token proud
Token ID6613
Feature activation+2.81e-02

Pos logit contributions
cases+0.107
plets+0.102
istered+0.099
ider+0.099
br+0.098
Neg logit contributions
keeping-0.080
 him-0.075
 itself-0.074
 home-0.074
 Joey-0.072
Token that
Token ID326
Feature activation-2.08e-02

Pos logit contributions
istered+0.055
cases+0.052
arest+0.049
plets+0.048
icious+0.047
Neg logit contributions
keeping-0.051
 him-0.049
 itself-0.046
 home-0.046
ie-0.044
Token Andy
Token ID12382
Feature activation-4.54e-02

Pos logit contributions
keeping+0.041
 itself+0.034
gy+0.032
ie+0.032
 him+0.031
Neg logit contributions
istered-0.035
cases-0.035
arest-0.032
ider-0.031
br-0.031
Token was
Token ID373
Feature activation+5.67e-02

Pos logit contributions
keeping+0.082
 itself+0.070
 him+0.069
 home+0.066
ie+0.064
Neg logit contributions
istered-0.086
cases-0.085
ider-0.079
plets-0.078
arest-0.078
Token able
Token ID1498
Feature activation+2.16e-02

Pos logit contributions
istered+0.096
cases+0.096
plets+0.087
ider+0.087
arest+0.085
Neg logit contributions
keeping-0.115
 itself-0.103
 him-0.101
 home-0.099
 Joey-0.093
Token to
Token ID284
Feature activation-6.50e-02

Pos logit contributions
istered+0.055
cases+0.055
ider+0.052
plets+0.052
arest+0.052
Neg logit contributions
keeping-0.025
 itself-0.020
 him-0.020
 home-0.019
 Joey-0.017
Token pick
Token ID2298
Feature activation-4.36e-02

Pos logit contributions
keeping+0.097
 itself+0.080
 him+0.078
 home+0.073
ie+0.071
Neg logit contributions
istered-0.143
cases-0.140
ider-0.133
arest-0.132
plets-0.132
Token the
Token ID262
Feature activation-8.14e-04

Pos logit contributions
keeping+0.061
 itself+0.044
gy+0.043
 him+0.041
ie+0.040
Neg logit contributions
istered-0.098
cases-0.097
br-0.092
plets-0.092
arest-0.091
Token grape
Token ID30777
Feature activation-5.91e-03

Pos logit contributions
keeping+0.002
 itself+0.001
 home+0.001
 him+0.001
ie+0.001
Neg logit contributions
cases-0.001
istered-0.001
ider-0.001
plets-0.001
arest-0.001
Token all
Token ID477
Feature activation-3.42e-02

Pos logit contributions
keeping+0.009
 itself+0.008
 him+0.007
 home+0.007
ie+0.007
Neg logit contributions
istered-0.013
cases-0.013
ider-0.012
plets-0.012
arest-0.012
Token by
Token ID416
Feature activation-9.52e-02

Pos logit contributions
keeping+0.051
 itself+0.038
 generation+0.036
 him+0.036
gy+0.035
Neg logit contributions
cases-0.073
istered-0.071
arest-0.069
br-0.067
ider-0.066
Token she
Token ID673
Feature activation+2.06e-03

Pos logit contributions
cases+0.014
istered+0.013
ider+0.012
br+0.012
plets+0.011
Neg logit contributions
keeping-0.016
 itself-0.013
 him-0.012
ie-0.012
 generation-0.012
Token would
Token ID561
Feature activation+3.86e-03

Pos logit contributions
istered+0.004
cases+0.004
arest+0.004
plets+0.004
ider+0.004
Neg logit contributions
keeping-0.003
 him-0.003
 itself-0.003
 home-0.003
ie-0.003
Token get
Token ID651
Feature activation+3.23e-02

Pos logit contributions
istered+0.009
cases+0.008
ider+0.008
plets+0.008
arest+0.008
Neg logit contributions
keeping-0.006
 itself-0.005
 him-0.005
 home-0.004
ie-0.004
Token the
Token ID262
Feature activation+2.17e-02

Pos logit contributions
istered+0.066
cases+0.063
hes+0.061
 short+0.060
 long+0.058
Neg logit contributions
keeping-0.053
 itself-0.044
gy-0.040
 zone-0.040
 generation-0.040
Token meat
Token ID6174
Feature activation+5.57e-03

Pos logit contributions
cases+0.044
istered+0.043
plets+0.042
arest+0.040
ider+0.040
Neg logit contributions
keeping-0.036
 home-0.033
 him-0.032
 itself-0.030
 Joey-0.028
Token from
Token ID422
Feature activation-7.40e-02

Pos logit contributions
istered+0.012
cases+0.012
ider+0.011
plets+0.011
arest+0.011
Neg logit contributions
keeping-0.008
 itself-0.007
 him-0.007
 home-0.006
ie-0.006
Token the
Token ID262
Feature activation+7.49e-03

Pos logit contributions
keeping+0.086
 itself+0.062
ie+0.057
roc+0.057
 Acc+0.057
Neg logit contributions
istered-0.164
 long-0.162
 short-0.159
 annoying-0.159
cases-0.158
Token cold
Token ID4692
Feature activation-1.84e-02

Pos logit contributions
istered+0.014
cases+0.014
arest+0.013
ider+0.013
plets+0.013
Neg logit contributions
keeping-0.013
 him-0.011
 itself-0.011
 home-0.011
ie-0.010
Token box
Token ID3091
Feature activation+1.08e-02

Pos logit contributions
keeping+0.030
 itself+0.026
 him+0.025
 home+0.023
 Joey+0.023
Neg logit contributions
istered-0.036
cases-0.036
plets-0.034
ider-0.034
br-0.033
Token.
Token ID13
Feature activation-3.62e-02

Pos logit contributions
istered+0.026
cases+0.025
plets+0.024
arest+0.024
icious+0.024
Neg logit contributions
keeping-0.015
 home-0.012
 itself-0.012
 him-0.012
ie-0.011
Token Kitty
Token ID21168
Feature activation-2.56e-02

Pos logit contributions
keeping+0.052
 itself+0.042
 him+0.041
 home+0.038
ie+0.038
Neg logit contributions
istered-0.082
cases-0.082
ider-0.077
plets-0.076
arest-0.075