TOP ACTIVATIONS
MAX = 5.347

day he found a pe b ble on the ground and
idea to bake a pe b ble . She gathered all
want to share my pe b ble ? We can both
Once there was a pe b ble . The pe b
Alice decided to grab some pe aches from the tree in
she saw a shiny pe b ble on the ground .
there was a small pe b ble that was very excited
so he asked the ch ame leon to play with him
" I found a pe b ble at the beach ,"
So , Lily gave the pe b ble to Jack and
, Lily gave the pe b ble to Jack and they
then she grabbed a pe b ble . She looked up
you want to share my pe b ble ? We can
and picked up a pe b ble too . He threw
Tim saw a small pe b ble in his shoe .
, " What makes pe b bles special ?" His mother
, she found a pe b ble that was very deep
the idea to bake a pe b ble . She gathered
I am a magical pe b ble and I ’
she found a small pe b ble on the ground .

MIN ACTIVATIONS
MIN = -6.731

grandma 's house for safe keeping . When
it to me for safe keeping . He told me to
the sofa , in the closet , but she could not
the bed , in the closet , and even in the
her bed , in her closet , and in her toy
her bed , in her closet , and in her toy
the bed , in the closet , and behind the curtains
the bed , in the closet , and on the couch
the bed , in the closet , and even outside .
it and enjoying it for years to come ! <|endoftext|> Once
the bed , in the closet , and even outside in
the bed , in the closet , and even in the
the bed , in the closet , and even in the
his bed , in the closet , and even outside .
the bed and in the closet , but the toy was
butterfly back in its natural habitat . He waved goodbye to
the bed , in the closet , and even in the
the bed , in the closet , in the draw ers
see animals in their natural habitat . She ran around the
the bed , in the closet . She could not find

INTERVAL 2.328 - 5.347
CONTAINS 0.577%

be good friends and help each other . They learned that
you !" he said to all the birds . The lake
bird , so he climbed up a tree and called out
. She watched the k ang aroo from afar and imagined
he was determined to make it work !

INTERVAL -0.692 - 2.328
CONTAINS 79.456%

could . He stayed there for a long
" I 'm counting the planets in the sky ," the
bark . It was Max ! He had been stuck in
my in the kitchen . One day , her mom my
The bird replied , " My wings are too tired from

INTERVAL -3.711 - -0.692
CONTAINS 19.961%

Lily were playing in the park . They saw a giant
fear , not knowing what to do . The
and find a nice spot under a tree . They spread
too heavy for her . She tried to pull it out
I forgive you . But you have to be more careful

INTERVAL -6.731 - -3.711
CONTAINS 0.006%

. He looked under the bed , in the toy box
and had lived for many years . One day , he
play or watch TV or read books . They only had
. He looked under the bed and in the closet ,
, and even in the closet , but he couldn 't
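
The four INTERVAL ranges above look like equal-width bins spanning MIN to MAX. The dump does not say how the bins were chosen, so equal-width binning is an assumption here, and `equal_width_edges` is a hypothetical helper, not part of any dashboard tooling. A minimal sketch that recomputes the edges from the printed MIN and MAX and checks that the printed CONTAINS shares add up:

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 evenly spaced bin edges from lo to hi."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# MIN and MAX exactly as printed in the dump.
edges = equal_width_edges(-6.731, 5.347, 4)

# CONTAINS percentages as printed, highest interval first.
shares = [0.577, 79.456, 19.961, 0.006]
total_share = sum(shares)

# Print each interval (lowest first) and the summed share.
for low, high in zip(edges[:-1], edges[1:]):
    print(f"{low:.3f} to {high:.3f}")
print(f"total share: {total_share:.3f}%")
```

Under this assumption each bin is about 3.02 activation units wide, and the recomputed edges agree with the printed ones to within display rounding. Almost all of the mass falls in the two middle intervals.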
Token day
Token ID1110
Feature activation+3.32e-01

Pos logit contributions
 nods+0.019
 nod+0.017
 nodded+0.017
 shakes+0.016
 smiles+0.015
Neg logit contributions
keeping-0.014
handed-0.011
grand-0.011
-0.011
hell-0.010
Token he
Token ID339
Feature activation+8.39e-01

Pos logit contributions
 nods+0.008
 nod+0.007
 nodded+0.007
 shakes+0.006
 smiles+0.006
Neg logit contributions
keeping-0.012
-0.010
 arrest-0.009
handed-0.009
hell-0.009
Token found
Token ID1043
Feature activation+1.09e+00

Pos logit contributions
 nods+0.021
 nod+0.019
 nodded+0.018
 shakes+0.016
 smiles+0.015
Neg logit contributions
keeping-0.030
-0.024
handed-0.024
 arrest-0.024
 rubble-0.023
Token a
Token ID257
Feature activation-5.79e-01

Pos logit contributions
 nods+0.025
 nod+0.022
 nodded+0.021
 shakes+0.018
ap+0.016
Neg logit contributions
keeping-0.041
handed-0.036
 arrest-0.034
 rubble-0.033
-0.033
Token pe
Token ID613
Feature activation+4.80e+00

Pos logit contributions
keeping+0.019
+0.015
handed+0.015
 arrest+0.015
 rubble+0.015
Neg logit contributions
 nods-0.016
 nod-0.014
 nodded-0.014
 shakes-0.012
 smiles-0.012
Tokenb
Token ID65
Feature activation+5.35e+00

Pos logit contributions
 nods+0.143
 nod+0.131
 nodded+0.124
 shakes+0.115
 smiles+0.111
Neg logit contributions
keeping-0.143
handed-0.116
hell-0.112
-0.112
 anywhere-0.108
Tokenble
Token ID903
Feature activation+3.17e+00

Pos logit contributions
 nods+0.123
 nod+0.107
 nodded+0.101
 shakes+0.093
 smiles+0.090
Neg logit contributions
keeping-0.197
-0.166
handed-0.161
hell-0.158
 anywhere-0.158
Token on
Token ID319
Feature activation+4.29e-01

Pos logit contributions
 nods+0.073
 nod+0.063
 nodded+0.060
 shakes+0.051
 smiles+0.050
Neg logit contributions
keeping-0.121
handed-0.100
 arrest-0.099
-0.098
 unatt-0.096
Token the
Token ID262
Feature activation-3.88e-01

Pos logit contributions
 nods+0.007
 nod+0.006
 nodded+0.006
 shakes+0.005
 smiles+0.004
Neg logit contributions
keeping-0.019
handed-0.016
 arrest-0.016
-0.016
 rubble-0.015
Token ground
Token ID2323
Feature activation+4.37e-01

Pos logit contributions
keeping+0.011
handed+0.008
+0.008
 arrest+0.008
 rubble+0.007
Neg logit contributions
 nods-0.013
 nod-0.012
 nodded-0.012
 shakes-0.011
 smiles-0.011
Token and
Token ID290
Feature activation+3.67e-01

Pos logit contributions
 nods+0.011
 nod+0.010
 nodded+0.009
 shakes+0.008
 smiles+0.008
Neg logit contributions
keeping-0.016
handed-0.013
-0.013
 arrest-0.012
 rubble-0.012
Token idea
Token ID2126
Feature activation+1.56e+00

Pos logit contributions
 nods+0.041
 nodded+0.037
 nod+0.035
 shakes+0.031
 smiles+0.030
Neg logit contributions
keeping-0.056
handed-0.046
-0.045
 arrest-0.044
 rubble-0.043
Token to
Token ID284
Feature activation-5.31e-01

Pos logit contributions
 nods+0.040
 nod+0.035
 nodded+0.034
 shakes+0.030
 smiles+0.028
Neg logit contributions
keeping-0.056
-0.045
handed-0.044
 arrest-0.044
 rubble-0.043
Token bake
Token ID28450
Feature activation+7.05e-01

Pos logit contributions
keeping+0.019
handed+0.016
 arrest+0.015
+0.015
 rubble+0.015
Neg logit contributions
 nods-0.013
 nodded-0.011
 nod-0.011
 shakes-0.010
 smiles-0.010
Token a
Token ID257
Feature activation+2.22e-01

Pos logit contributions
 nods+0.018
 nod+0.015
 nodded+0.015
 shakes+0.013
 fisherman+0.013
Neg logit contributions
keeping-0.024
-0.021
handed-0.020
 anywhere-0.020
hell-0.020
Token pe
Token ID613
Feature activation+5.07e+00

Pos logit contributions
 nods+0.006
 nod+0.006
 nodded+0.006
 shakes+0.005
 smiles+0.005
Neg logit contributions
keeping-0.007
-0.006
handed-0.005
 arrest-0.005
 rubble-0.005
Tokenb
Token ID65
Feature activation+5.34e+00

Pos logit contributions
 nods+0.167
 nod+0.151
 nodded+0.147
 shakes+0.138
 smiles+0.133
Neg logit contributions
keeping-0.135
handed-0.108
-0.104
hell-0.101
grand-0.099
Tokenble
Token ID903
Feature activation+3.84e+00

Pos logit contributions
 nods+0.127
 nod+0.108
 nodded+0.107
 shakes+0.099
 smiles+0.096
Neg logit contributions
keeping-0.195
-0.163
handed-0.160
hell-0.155
ork-0.154
Token.
Token ID13
Feature activation-6.51e-01

Pos logit contributions
 nods+0.088
 nod+0.076
 nodded+0.074
 shakes+0.064
 smiles+0.060
Neg logit contributions
keeping-0.147
handed-0.120
-0.120
 arrest-0.118
 rubble-0.115
Token She
Token ID1375
Feature activation+3.59e-01

Pos logit contributions
keeping+0.021
+0.016
handed+0.016
 arrest+0.015
grand+0.015
Neg logit contributions
 nods-0.018
 nod-0.016
 nodded-0.016
 shakes-0.015
 smiles-0.014
Token gathered
Token ID9272
Feature activation+1.04e+00

Pos logit contributions
 nods+0.010
 nod+0.010
 nodded+0.008
 shakes+0.008
ap+0.008
Neg logit contributions
keeping-0.011
 arrest-0.009
-0.008
 drowned-0.008
grand-0.008
Token all
Token ID477
Feature activation+2.45e+00

Pos logit contributions
 nods+0.025
 nod+0.022
 nodded+0.021
rey+0.019
 fisherman+0.018
Neg logit contributions
keeping-0.037
 rubble-0.030
handed-0.028
hell-0.028
 arrest-0.028
Token want
Token ID765
Feature activation+6.11e-01

Pos logit contributions
 nods+0.043
 nod+0.040
 nodded+0.038
 shakes+0.034
 smiles+0.032
Neg logit contributions
keeping-0.059
-0.048
handed-0.047
 arrest-0.046
 rubble-0.045
Token to
Token ID284
Feature activation+2.78e-01

Pos logit contributions
 nods+0.014
 nod+0.013
 nodded+0.012
 shakes+0.011
 smiles+0.010
Neg logit contributions
keeping-0.023
-0.019
handed-0.019
 arrest-0.018
 rubble-0.018
Token share
Token ID2648
Feature activation-4.21e-01

Pos logit contributions
 nods+0.005
 nod+0.005
 nodded+0.004
 shakes+0.004
 cl+0.003
Neg logit contributions
keeping-0.012
-0.010
handed-0.010
 rubble-0.010
 unatt-0.010
Token my
Token ID616
Feature activation+8.47e-01

Pos logit contributions
keeping+0.016
+0.013
handed+0.013
 arrest+0.013
 rubble+0.013
Neg logit contributions
 nods-0.009
 nod-0.008
 nodded-0.008
 shakes-0.007
 smiles-0.006
Token pe
Token ID613
Feature activation+5.13e+00

Pos logit contributions
 nods+0.018
 nod+0.015
 nodded+0.015
 shakes+0.013
 smiles+0.012
Neg logit contributions
keeping-0.034
-0.028
handed-0.028
 arrest-0.028
 rubble-0.027
Tokenb
Token ID65
Feature activation+5.32e+00

Pos logit contributions
 nods+0.147
 nod+0.134
 nodded+0.130
 shakes+0.117
 smiles+0.111
Neg logit contributions
keeping-0.162
-0.127
handed-0.127
hell-0.125
 arrest-0.122
Tokenble
Token ID903
Feature activation+3.74e+00

Pos logit contributions
 nods+0.137
 nod+0.121
 nodded+0.119
 shakes+0.110
 smiles+0.106
Neg logit contributions
keeping-0.180
hell-0.152
-0.147
uts-0.145
handed-0.142
Token?
Token ID30
Feature activation+1.23e+00

Pos logit contributions
 nods+0.083
 nod+0.073
 nodded+0.070
 shakes+0.061
 smiles+0.057
Neg logit contributions
keeping-0.146
-0.119
handed-0.119
 arrest-0.116
 rubble-0.113
Token We
Token ID775
Feature activation+7.92e-01

Pos logit contributions
 nods+0.036
 nod+0.033
 nodded+0.033
 shakes+0.029
 smiles+0.028
Neg logit contributions
keeping-0.037
-0.030
 arrest-0.028
handed-0.028
grand-0.027
Token can
Token ID460
Feature activation+8.26e-01

Pos logit contributions
 nods+0.019
 nod+0.017
 nodded+0.016
 shakes+0.015
 smiles+0.014
Neg logit contributions
keeping-0.030
-0.024
handed-0.024
 arrest-0.023
 rubble-0.023
Token both
Token ID1111
Feature activation+2.10e+00

Pos logit contributions
 nods+0.021
 nod+0.018
 nodded+0.018
 shakes+0.016
 smiles+0.015
Neg logit contributions
keeping-0.029
handed-0.023
 arrest-0.023
-0.023
 rubble-0.022
TokenOnce
Token ID7454
Feature activation+6.94e-01

Pos logit contributions
keeping+0.000
+0.000
handed+0.000
 arrest+0.000
hell+0.000
Neg logit contributions
 nods-0.000
 nod-0.000
 nodded-0.000
 shakes-0.000
 smiles-0.000
Token there
Token ID612
Feature activation-2.52e-01

Pos logit contributions
 nods+0.012
 nod+0.010
 nodded+0.010
 shakes+0.008
 smiles+0.007
Neg logit contributions
keeping-0.029
-0.025
handed-0.024
hell-0.024
 anywhere-0.024
Token was
Token ID373
Feature activation+2.58e-01

Pos logit contributions
keeping+0.009
+0.008
handed+0.008
 arrest+0.007
hell+0.007
Neg logit contributions
 nods-0.006
 nod-0.005
 nodded-0.005
 shakes-0.004
 smiles-0.004
Token a
Token ID257
Feature activation+4.06e-02

Pos logit contributions
 nods+0.006
 nod+0.005
 nodded+0.005
 shakes+0.005
rey+0.004
Neg logit contributions
keeping-0.010
handed-0.008
-0.008
 arrest-0.008
hell-0.008
Token pe
Token ID613
Feature activation+4.45e+00

Pos logit contributions
 nods+0.001
 nod+0.001
 nodded+0.001
 shakes+0.001
 smiles+0.001
Neg logit contributions
keeping-0.001
-0.001
 arrest-0.001
handed-0.001
hell-0.001
Tokenb
Token ID65
Feature activation+5.31e+00

Pos logit contributions
 nods+0.117
 nod+0.105
 nodded+0.100
 shakes+0.092
 smiles+0.086
Neg logit contributions
keeping-0.153
handed-0.125
-0.124
 arrest-0.120
hell-0.118
Tokenble
Token ID903
Feature activation+3.41e+00

Pos logit contributions
 nods+0.120
 nod+0.103
 nodded+0.098
 shakes+0.092
 smiles+0.087
Neg logit contributions
keeping-0.201
-0.170
handed-0.168
 arrest-0.163
hell-0.159
Token.
Token ID13
Feature activation-8.50e-01

Pos logit contributions
 nods+0.092
 nod+0.082
 nodded+0.078
 shakes+0.072
 smiles+0.069
Neg logit contributions
keeping-0.113
-0.092
handed-0.092
 arrest-0.091
hell-0.087
Token The
Token ID383
Feature activation+1.41e+00

Pos logit contributions
keeping+0.028
+0.022
handed+0.022
 arrest+0.022
 rubble+0.021
Neg logit contributions
 nods-0.023
 nod-0.021
 nodded-0.020
 shakes-0.018
 smiles-0.017
Token pe
Token ID613
Feature activation+3.69e+00

Pos logit contributions
 nods+0.030
 nod+0.026
 nodded+0.025
 shakes+0.022
 smiles+0.020
Neg logit contributions
keeping-0.055
-0.046
handed-0.045
 arrest-0.044
 rubble-0.043
Tokenb
Token ID65
Feature activation+4.28e+00

Pos logit contributions
 nods+0.098
 nod+0.089
 nodded+0.085
 shakes+0.077
 smiles+0.072
Neg logit contributions
keeping-0.126
-0.101
handed-0.101
 arrest-0.097
hell-0.095
Token Alice
Token ID14862
Feature activation-1.73e+00

Pos logit contributions
keeping+0.125
ache+0.116
lor+0.113
+0.112
handed+0.112
Neg logit contributions
 nods-0.032
 nod-0.032
 up-0.030
 three-0.029
 the-0.027
Token decided
Token ID3066
Feature activation-4.71e-01

Pos logit contributions
keeping+0.069
+0.058
handed+0.057
 rubble+0.056
hell+0.055
Neg logit contributions
 nods-0.037
 nodded-0.032
 nod-0.032
 shakes-0.027
 smiles-0.027
Token to
Token ID284
Feature activation+5.80e-01

Pos logit contributions
keeping+0.018
+0.015
handed+0.015
 arrest+0.014
 rubble+0.014
Neg logit contributions
 nods-0.011
 nod-0.010
 nodded-0.009
 shakes-0.008
 smiles-0.008
Token grab
Token ID5552
Feature activation+2.02e+00

Pos logit contributions
 nods+0.013
 nod+0.012
 nodded+0.011
 shakes+0.010
 smiles+0.009
Neg logit contributions
keeping-0.022
-0.018
handed-0.018
 rubble-0.018
 arrest-0.018
Token some
Token ID617
Feature activation+1.01e+00

Pos logit contributions
 nods+0.041
 nod+0.036
 nodded+0.035
 shakes+0.028
ap+0.025
Neg logit contributions
keeping-0.083
handed-0.068
-0.067
 arrest-0.066
 rubble-0.065
Token pe
Token ID613
Feature activation+5.30e+00

Pos logit contributions
 nods+0.031
 nod+0.028
 nodded+0.028
 shakes+0.025
 smiles+0.023
Neg logit contributions
keeping-0.030
handed-0.023
 arrest-0.023
-0.023
 rubble-0.023
Tokenaches
Token ID3694
Feature activation+2.14e+00

Pos logit contributions
 nods+0.164
 nod+0.151
 nodded+0.145
 shakes+0.133
 smiles+0.127
Neg logit contributions
keeping-0.153
handed-0.121
-0.118
hell-0.116
 arrest-0.112
Token from
Token ID422
Feature activation+6.40e-01

Pos logit contributions
 nods+0.062
 nod+0.054
 nodded+0.052
 shakes+0.047
 smiles+0.046
Neg logit contributions
keeping-0.073
handed-0.057
 anywhere-0.054
hell-0.054
 unatt-0.053
Token the
Token ID262
Feature activation-6.86e-01

Pos logit contributions
 nods+0.008
 nod+0.007
 nodded+0.006
 shakes+0.005
 smiles+0.005
Neg logit contributions
keeping-0.030
-0.026
handed-0.025
hell-0.025
 arrest-0.025
Token tree
Token ID5509
Feature activation+7.21e-01

Pos logit contributions
keeping+0.024
+0.020
hell+0.019
 arrest+0.019
handed+0.019
Neg logit contributions
 nods-0.017
 nod-0.015
 nodded-0.015
 shakes-0.013
 smiles-0.013
Token in
Token ID287
Feature activation-1.31e-01

Pos logit contributions
 nods+0.017
 nod+0.015
 nodded+0.015
 shakes+0.013
 smiles+0.012
Neg logit contributions
keeping-0.027
-0.022
handed-0.021
 arrest-0.021
 rubble-0.021
Token she
Token ID673
Feature activation-4.48e-01

Pos logit contributions
keeping+0.011
+0.009
handed+0.009
 arrest+0.009
 rubble+0.008
Neg logit contributions
 nods-0.009
 nod-0.008
 nodded-0.008
 shakes-0.007
 smiles-0.007
Token saw
Token ID2497
Feature activation+1.01e+00

Pos logit contributions
keeping+0.018
+0.014
handed+0.014
 arrest+0.014
 rubble+0.014
Neg logit contributions
 nods-0.010
 nod-0.009
 nodded-0.009
 shakes-0.007
 smiles-0.007
Token a
Token ID257
Feature activation-4.38e-02

Pos logit contributions
 nods+0.017
 nod+0.015
 nodded+0.014
 shakes+0.011
 smiles+0.010
Neg logit contributions
keeping-0.045
-0.038
handed-0.037
 arrest-0.037
 rubble-0.036
Token shiny
Token ID22441
Feature activation+2.20e+00

Pos logit contributions
keeping+0.002
+0.001
handed+0.001
 arrest+0.001
 rubble+0.001
Neg logit contributions
 nods-0.001
 nod-0.001
 nodded-0.001
 shakes-0.001
 smiles-0.001
Token pe
Token ID613
Feature activation+4.47e+00

Pos logit contributions
 nods+0.067
 nod+0.060
 nodded+0.059
 shakes+0.052
 smiles+0.051
Neg logit contributions
keeping-0.063
 arrest-0.049
hell-0.049
handed-0.049
ork-0.048
Tokenb
Token ID65
Feature activation+5.29e+00

Pos logit contributions
 nods+0.120
 nod+0.108
 nodded+0.104
 shakes+0.093
 smiles+0.089
Neg logit contributions
keeping-0.152
-0.121
handed-0.121
 arrest-0.118
hell-0.116
Tokenble
Token ID903
Feature activation+3.17e+00

Pos logit contributions
 nods+0.111
 nod+0.095
 nodded+0.091
 shakes+0.080
 smiles+0.077
Neg logit contributions
keeping-0.210
-0.176
handed-0.173
 arrest-0.169
hell-0.168
Token on
Token ID319
Feature activation-1.27e-02

Pos logit contributions
 nods+0.068
 nod+0.058
 nodded+0.056
 shakes+0.047
 smiles+0.045
Neg logit contributions
keeping-0.126
-0.104
handed-0.104
 arrest-0.103
 rubble-0.100
Token the
Token ID262
Feature activation-1.20e+00

Pos logit contributions
keeping+0.001
+0.001
handed+0.001
hell+0.001
 arrest+0.001
Neg logit contributions
 nods-0.000
 nod-0.000
 nodded-0.000
 shakes-0.000
 smiles-0.000
Token ground
Token ID2323
Feature activation+2.72e-01

Pos logit contributions
keeping+0.033
handed+0.025
+0.025
 arrest+0.024
 rubble+0.023
Neg logit contributions
 nods-0.040
 nod-0.037
 nodded-0.036
 shakes-0.033
 smiles-0.032
Token.
Token ID13
Feature activation-2.38e-01

Pos logit contributions
 nods+0.007
 nod+0.006
 nodded+0.006
 shakes+0.005
 smiles+0.005
Neg logit contributions
keeping-0.010
handed-0.008
-0.008
 arrest-0.008
 rubble-0.007
Token there
Token ID612
Feature activation-2.85e-01

Pos logit contributions
 nods+0.000
 nod+0.000
 nodded+0.000
 shakes+0.000
 agrees+0.000
Neg logit contributions
keeping-0.000
 anywhere-0.000
-0.000
hell-0.000
 anymore-0.000
Token was
Token ID373
Feature activation+1.96e-01

Pos logit contributions
keeping+0.010
+0.009
handed+0.008
 arrest+0.008
hell+0.008
Neg logit contributions
 nods-0.007
 nod-0.006
 nodded-0.006
 shakes-0.005
 smiles-0.005
Token a
Token ID257
Feature activation-2.67e-02

Pos logit contributions
 nods+0.004
 nod+0.004
 nodded+0.004
 shakes+0.003
 smiles+0.003
Neg logit contributions
keeping-0.008
-0.006
handed-0.006
 arrest-0.006
hell-0.006
Token small
Token ID1402
Feature activation+1.43e+00

Pos logit contributions
keeping+0.001
+0.001
 arrest+0.001
handed+0.001
 rubble+0.001
Neg logit contributions
 nods-0.001
 nod-0.001
 nodded-0.001
 shakes-0.001
 smiles-0.001
Token pe
Token ID613
Feature activation+4.57e+00

Pos logit contributions
 nods+0.034
 nod+0.030
 nodded+0.030
 shakes+0.026
 smiles+0.024
Neg logit contributions
keeping-0.050
 arrest-0.041
-0.041
 rubble-0.040
handed-0.040
Tokenb
Token ID65
Feature activation+5.25e+00

Pos logit contributions
 nods+0.121
 nod+0.110
 nodded+0.105
 shakes+0.095
 smiles+0.090
Neg logit contributions
keeping-0.157
handed-0.127
-0.126
 arrest-0.122
hell-0.120
Tokenble
Token ID903
Feature activation+3.57e+00

Pos logit contributions
 nods+0.110
 nod+0.095
 nodded+0.090
 shakes+0.081
 smiles+0.077
Neg logit contributions
keeping-0.209
-0.175
handed-0.173
 arrest-0.168
hell-0.165
Token that
Token ID326
Feature activation+1.99e-01

Pos logit contributions
 nods+0.096
 nod+0.085
 nodded+0.082
 shakes+0.074
 smiles+0.071
Neg logit contributions
keeping-0.121
-0.098
handed-0.097
 arrest-0.095
 rubble-0.092
Token was
Token ID373
Feature activation+4.06e-01

Pos logit contributions
 nods+0.006
 nod+0.005
 nodded+0.005
 shakes+0.005
ap+0.004
Neg logit contributions
keeping-0.006
-0.005
 rubble-0.004
handed-0.004
 drowned-0.004
Token very
Token ID845
Feature activation+5.28e-01

Pos logit contributions
 nods+0.011
 nod+0.010
 nodded+0.010
 shakes+0.009
 smiles+0.008
Neg logit contributions
keeping-0.013
-0.011
handed-0.010
 rubble-0.010
 arrest-0.010
Token excited
Token ID6568
Feature activation+6.98e-01

Pos logit contributions
 nods+0.014
 nod+0.013
 nodded+0.012
 shakes+0.011
 smiles+0.010
Neg logit contributions
keeping-0.018
handed-0.014
-0.014
 arrest-0.014
 rubble-0.014
Token so
Token ID523
Feature activation+8.82e-01

Pos logit contributions
 nods+0.004
 nod+0.003
 nodded+0.003
 shakes+0.003
 smiles+0.003
Neg logit contributions
keeping-0.005
-0.004
handed-0.004
 arrest-0.004
 rubble-0.003
Token he
Token ID339
Feature activation+1.11e+00

Pos logit contributions
 nods+0.019
 nod+0.017
 nodded+0.017
 shakes+0.014
 smiles+0.014
Neg logit contributions
keeping-0.035
-0.029
handed-0.028
 arrest-0.028
 rubble-0.027
Token asked
Token ID1965
Feature activation-3.87e-01

Pos logit contributions
 nods+0.031
 nod+0.028
 nodded+0.025
 shakes+0.024
 smiles+0.023
Neg logit contributions
keeping-0.035
-0.027
 arrest-0.027
handed-0.027
 drowned-0.027
Token the
Token ID262
Feature activation+1.75e+00

Pos logit contributions
keeping+0.014
 anywhere+0.011
handed+0.011
 arrest+0.011
+0.011
Neg logit contributions
 nods-0.009
 nod-0.007
 nodded-0.007
 shakes-0.007
rey-0.006
Token ch
Token ID442
Feature activation+2.80e+00

Pos logit contributions
 nods+0.044
 nodded+0.039
 nod+0.038
apped+0.035
 shakes+0.034
Neg logit contributions
keeping-0.054
handed-0.044
 arrest-0.043
 rubble-0.041
hell-0.040
Tokename
Token ID480
Feature activation+5.18e+00

Pos logit contributions
 nods+0.086
 nodded+0.082
 nod+0.081
 shakes+0.068
 smiles+0.068
Neg logit contributions
keeping-0.074
 arrest-0.061
hell-0.057
handed-0.056
ork-0.055
Tokenleon
Token ID38970
Feature activation+2.83e+00

Pos logit contributions
 nods+0.129
 nod+0.107
 nodded+0.103
 shakes+0.102
 smiles+0.090
Neg logit contributions
keeping-0.165
hell-0.141
 anywhere-0.140
ork-0.138
handed-0.136
Token to
Token ID284
Feature activation+3.02e-01

Pos logit contributions
 nods+0.072
 nod+0.063
 nodded+0.062
 shakes+0.058
 smiles+0.052
Neg logit contributions
keeping-0.098
-0.078
handed-0.077
hell-0.075
 anywhere-0.074
Token play
Token ID711
Feature activation+5.46e-01

Pos logit contributions
 nods+0.008
 nodded+0.007
 nod+0.007
 shakes+0.006
apped+0.006
Neg logit contributions
keeping-0.010
 arrest-0.008
handed-0.008
 unatt-0.007
 anywhere-0.007
Token with
Token ID351
Feature activation+5.24e-01

Pos logit contributions
 nods+0.013
 nod+0.011
 nodded+0.011
 shakes+0.010
 smiles+0.009
Neg logit contributions
keeping-0.020
handed-0.016
-0.016
hell-0.016
 unatt-0.016
Token him
Token ID683
Feature activation+9.08e-01

Pos logit contributions
 nods+0.009
 nod+0.008
 nodded+0.008
 shakes+0.006
 smiles+0.006
Neg logit contributions
keeping-0.023
-0.019
handed-0.019
 arrest-0.018
 rubble-0.018
Token"
Token ID1
Feature activation-3.43e-01

Pos logit contributions
keeping+0.044
handed+0.036
+0.036
 arrest+0.035
 rubble+0.034
Neg logit contributions
 nods-0.028
 nod-0.025
 nodded-0.024
 shakes-0.021
 smiles-0.020
TokenI
Token ID40
Feature activation+8.73e-01

Pos logit contributions
keeping+0.012
+0.010
handed+0.010
 arrest+0.009
grand+0.009
Neg logit contributions
 nods-0.009
 nod-0.008
 nodded-0.008
 shakes-0.007
 smiles-0.007
Token found
Token ID1043
Feature activation+1.05e+00

Pos logit contributions
 nods+0.023
 nod+0.021
 nodded+0.020
 shakes+0.018
 smiles+0.017
Neg logit contributions
keeping-0.031
-0.025
handed-0.025
 rubble-0.024
 arrest-0.024
Token a
Token ID257
Feature activation-5.62e-01

Pos logit contributions
 nods+0.020
 nod+0.018
 nodded+0.017
 shakes+0.014
 smiles+0.013
Neg logit contributions
keeping-0.044
-0.036
handed-0.036
 arrest-0.035
 rubble-0.034
Token pe
Token ID613
Feature activation+4.78e+00

Pos logit contributions
keeping+0.022
+0.018
handed+0.017
 arrest+0.017
 rubble+0.017
Neg logit contributions
 nods-0.013
 nod-0.012
 nodded-0.011
 shakes-0.010
 smiles-0.009
Tokenb
Token ID65
Feature activation+5.18e+00

Pos logit contributions
 nods+0.139
 nod+0.126
 nodded+0.121
 shakes+0.110
 smiles+0.106
Neg logit contributions
keeping-0.147
handed-0.116
-0.116
hell-0.114
 arrest-0.110
Tokenble
Token ID903
Feature activation+2.62e+00

Pos logit contributions
 nods+0.126
 nod+0.110
 nodded+0.106
 shakes+0.096
 smiles+0.094
Neg logit contributions
keeping-0.185
-0.150
hell-0.148
handed-0.146
 anywhere-0.144
Token at
Token ID379
Feature activation+6.11e-01

Pos logit contributions
 nods+0.067
 nod+0.060
 nodded+0.058
 shakes+0.051
 smiles+0.048
Neg logit contributions
keeping-0.093
handed-0.074
-0.074
 arrest-0.073
 rubble-0.071
Token the
Token ID262
Feature activation-1.49e-01

Pos logit contributions
 nods+0.010
 nod+0.008
 nodded+0.007
 shakes+0.006
 smiles+0.005
Neg logit contributions
keeping-0.028
-0.024
handed-0.023
 arrest-0.023
 rubble-0.022
Token beach
Token ID10481
Feature activation-9.06e-01

Pos logit contributions
keeping+0.005
+0.004
handed+0.004
 arrest+0.004
 rubble+0.004
Neg logit contributions
 nods-0.004
 nod-0.004
 nodded-0.003
 shakes-0.003
 smiles-0.003
Token,"
Token ID553
Feature activation-2.55e-01

Pos logit contributions
keeping+0.030
+0.024
handed+0.023
 arrest+0.023
 rubble+0.022
Neg logit contributions
 nods-0.025
 nod-0.023
 nodded-0.022
 shakes-0.020
 smiles-0.019
Token So
Token ID1406
Feature activation+4.15e-01

Pos logit contributions
 nods+0.015
 nod+0.013
 nodded+0.013
 shakes+0.011
 smiles+0.011
Neg logit contributions
keeping-0.027
 arrest-0.022
-0.022
handed-0.022
 rubble-0.021
Token,
Token ID11
Feature activation+9.29e-01

Pos logit contributions
 nods+0.009
 nod+0.008
 nodded+0.008
 shakes+0.006
 smiles+0.006
Neg logit contributions
keeping-0.017
handed-0.014
-0.014
 arrest-0.014
 rubble-0.014
Token Lily
Token ID20037
Feature activation-5.52e-01

Pos logit contributions
 nods+0.021
 nod+0.019
 nodded+0.019
 shakes+0.016
 smiles+0.015
Neg logit contributions
keeping-0.036
-0.029
handed-0.029
 arrest-0.029
 rubble-0.028
Token gave
Token ID2921
Feature activation+2.05e-01

Pos logit contributions
keeping+0.020
+0.017
handed+0.016
 arrest+0.016
 rubble+0.016
Neg logit contributions
 nods-0.013
 nod-0.012
 nodded-0.011
 shakes-0.010
 smiles-0.009
Token the
Token ID262
Feature activation+1.63e+00

Pos logit contributions
 nods+0.005
 nod+0.004
 nodded+0.004
 shakes+0.004
ap+0.003
Neg logit contributions
keeping-0.008
handed-0.006
 arrest-0.006
-0.006
 rubble-0.006
Token pe
Token ID613
Feature activation+5.18e+00

Pos logit contributions
 nods+0.034
 nod+0.030
 nodded+0.029
 shakes+0.025
 smiles+0.022
Neg logit contributions
keeping-0.064
handed-0.053
-0.053
 arrest-0.052
hell-0.051
Tokenb
Token ID65
Feature activation+5.16e+00

Pos logit contributions
 nods+0.154
 nod+0.143
 nodded+0.136
 shakes+0.125
 smiles+0.118
Neg logit contributions
keeping-0.154
handed-0.119
hell-0.119
-0.119
 arrest-0.113
Tokenble
Token ID903
Feature activation+3.32e+00

Pos logit contributions
 nods+0.124
 nod+0.111
 nodded+0.104
 shakes+0.097
 smiles+0.091
Neg logit contributions
keeping-0.178
hell-0.148
-0.145
uts-0.143
handed-0.140
Token to
Token ID284
Feature activation-4.57e-01

Pos logit contributions
 nods+0.083
 nod+0.072
 nodded+0.071
 shakes+0.064
rey+0.060
Neg logit contributions
keeping-0.116
handed-0.096
 arrest-0.093
hell-0.091
-0.091
Token Jack
Token ID3619
Feature activation+1.39e+00

Pos logit contributions
keeping+0.017
handed+0.014
 arrest+0.014
+0.013
 rubble+0.013
Neg logit contributions
 nods-0.011
 nod-0.009
 nodded-0.009
 shakes-0.008
 fisherman-0.007
Token and
Token ID290
Feature activation+4.53e-01

Pos logit contributions
 nods+0.029
 nod+0.024
 nodded+0.024
 shakes+0.021
 smiles+0.018
Neg logit contributions
keeping-0.055
handed-0.046
-0.044
 arrest-0.044
hell-0.043
Token,
Token ID11
Feature activation+9.29e-01

Pos logit contributions
 nods+0.009
 nod+0.008
 nodded+0.008
 shakes+0.006
 smiles+0.006
Neg logit contributions
keeping-0.017
handed-0.014
-0.014
 arrest-0.014
 rubble-0.014
Token Lily
Token ID20037
Feature activation-5.52e-01

Pos logit contributions
 nods+0.021
 nod+0.019
 nodded+0.019
 shakes+0.016
 smiles+0.015
Neg logit contributions
keeping-0.036
-0.029
handed-0.029
 arrest-0.029
 rubble-0.028
Token gave
Token ID2921
Feature activation+2.05e-01

Pos logit contributions
keeping+0.020
+0.017
handed+0.016
 arrest+0.016
 rubble+0.016
Neg logit contributions
 nods-0.013
 nod-0.012
 nodded-0.011
 shakes-0.010
 smiles-0.009
Token the
Token ID262
Feature activation+1.63e+00

Pos logit contributions
 nods+0.005
 nod+0.004
 nodded+0.004
 shakes+0.004
ap+0.003
Neg logit contributions
keeping-0.008
handed-0.006
 arrest-0.006
-0.006
 rubble-0.006
Token pe
Token ID613
Feature activation+5.18e+00

Pos logit contributions
 nods+0.034
 nod+0.030
 nodded+0.029
 shakes+0.025
 smiles+0.022
Neg logit contributions
keeping-0.064
handed-0.053
-0.053
 arrest-0.052
hell-0.051
Tokenb
Token ID65
Feature activation+5.16e+00

Pos logit contributions
 nods+0.154
 nod+0.143
 nodded+0.136
 shakes+0.125
 smiles+0.118
Neg logit contributions
keeping-0.154
handed-0.119
hell-0.119
-0.119
 arrest-0.113
Tokenble
Token ID903
Feature activation+3.32e+00

Pos logit contributions
 nods+0.124
 nod+0.111
 nodded+0.104
 shakes+0.097
 smiles+0.091
Neg logit contributions
keeping-0.178
hell-0.148
-0.145
uts-0.143
handed-0.140
Token to
Token ID284
Feature activation-4.57e-01

Pos logit contributions
 nods+0.083
 nod+0.072
 nodded+0.071
 shakes+0.064
rey+0.060
Neg logit contributions
keeping-0.116
handed-0.096
 arrest-0.093
hell-0.091
-0.091
Token Jack
Token ID3619
Feature activation+1.39e+00

Pos logit contributions
keeping+0.017
handed+0.014
 arrest+0.014
+0.013
 rubble+0.013
Neg logit contributions
 nods-0.011
 nod-0.009
 nodded-0.009
 shakes-0.008
 fisherman-0.007
Token and
Token ID290
Feature activation+4.53e-01

Pos logit contributions
 nods+0.029
 nod+0.024
 nodded+0.024
 shakes+0.021
 smiles+0.018
Neg logit contributions
keeping-0.055
handed-0.046
-0.044
 arrest-0.044
hell-0.043
Token they
Token ID484
Feature activation+9.20e-01

Pos logit contributions
 nods+0.011
 nod+0.009
 nodded+0.009
 shakes+0.008
 smiles+0.007
Neg logit contributions
keeping-0.017
handed-0.014
-0.014
 arrest-0.014
 rubble-0.013
Token then
Token ID788
Feature activation+2.83e-01

Pos logit contributions
 nods+0.009
 nod+0.008
 nodded+0.007
rey+0.007
 shakes+0.007
Neg logit contributions
keeping-0.011
 arrest-0.008
 rubble-0.008
 unatt-0.008
handed-0.008
Token she
Token ID673
Feature activation-4.41e-02

Pos logit contributions
 nods+0.008
 nod+0.007
 nodded+0.006
rey+0.006
 shakes+0.006
Neg logit contributions
keeping-0.009
-0.007
 arrest-0.007
handed-0.007
 rubble-0.006
Token grabbed
Token ID13176
Feature activation+1.43e+00

Pos logit contributions
keeping+0.001
+0.001
 arrest+0.001
handed+0.001
 drowned+0.001
Neg logit contributions
 nods-0.001
 nod-0.001
 nodded-0.001
 shakes-0.001
 smiles-0.001
Token a
Token ID257
Feature activation-2.68e-01

Pos logit contributions
 nods+0.034
 nod+0.029
 nodded+0.028
rey+0.024
 shakes+0.023
Neg logit contributions
keeping-0.053
 rubble-0.042
handed-0.042
-0.042
 arrest-0.041
Token pe
Token ID613
Feature activation+4.82e+00

Pos logit contributions
keeping+0.008
hell+0.007
 rubble+0.007
ork+0.007
+0.007
Neg logit contributions
 nods-0.008
 nodded-0.007
 nod-0.006
 shakes-0.006
rey-0.006
Tokenb
Token ID65
Feature activation+5.15e+00

Pos logit contributions
 nods+0.163
 nod+0.148
 nodded+0.142
 smiles+0.134
 shakes+0.132
Neg logit contributions
keeping-0.112
hell-0.095
ork-0.092
handed-0.087
-0.086
Tokenble
Token ID903
Feature activation+2.90e+00

Pos logit contributions
 nods+0.141
 nod+0.119
 nodded+0.116
 smiles+0.113
 shakes+0.111
Neg logit contributions
keeping-0.161
-0.136
hell-0.134
ork-0.134
uts-0.129
Token.
Token ID13
Feature activation-5.21e-01

Pos logit contributions
 nods+0.073
 nod+0.060
 nodded+0.060
 shakes+0.052
 smiles+0.052
Neg logit contributions
keeping-0.106
handed-0.085
 arrest-0.084
hell-0.083
-0.083
Token She
Token ID1375
Feature activation+2.50e-01

Pos logit contributions
keeping+0.020
handed+0.016
+0.016
 arrest+0.016
hell+0.016
Neg logit contributions
 nods-0.012
 nod-0.011
 nodded-0.010
 shakes-0.009
 smiles-0.009
Token looked
Token ID3114
Feature activation+5.62e-01

Pos logit contributions
 nods+0.007
 nod+0.007
ags+0.006
ap+0.005
 shakes+0.005
Neg logit contributions
keeping-0.007
 drowned-0.006
-0.006
 arrest-0.006
 rubble-0.005
Token up
Token ID510
Feature activation+1.97e+00

Pos logit contributions
 nods+0.014
 nod+0.012
 nodded+0.012
rey+0.011
ap+0.010
Neg logit contributions
keeping-0.017
 anywhere-0.015
 rubble-0.015
 unatt-0.014
hell-0.014
Token you
Token ID345
Feature activation+1.66e+00

Pos logit contributions
 nods+0.034
 nod+0.031
 nodded+0.030
 shakes+0.028
 smiles+0.026
Neg logit contributions
keeping-0.035
-0.029
handed-0.027
 arrest-0.027
hell-0.026
Token want
Token ID765
Feature activation+6.11e-01

Pos logit contributions
 nods+0.043
 nod+0.040
 nodded+0.038
 shakes+0.034
 smiles+0.032
Neg logit contributions
keeping-0.059
-0.048
handed-0.047
 arrest-0.046
 rubble-0.045
Token to
Token ID284
Feature activation+2.78e-01

Pos logit contributions
 nods+0.014
 nod+0.013
 nodded+0.012
 shakes+0.011
 smiles+0.010
Neg logit contributions
keeping-0.023
-0.019
handed-0.019
 arrest-0.018
 rubble-0.018
Token share
Token ID2648
Feature activation-4.21e-01

Pos logit contributions
 nods+0.005
 nod+0.005
 nodded+0.004
 shakes+0.004
 cl+0.003
Neg logit contributions
keeping-0.012
-0.010
handed-0.010
 rubble-0.010
 unatt-0.010
Token my
Token ID616
Feature activation+8.47e-01

Pos logit contributions
keeping+0.016
+0.013
handed+0.013
 arrest+0.013
 rubble+0.013
Neg logit contributions
 nods-0.009
 nod-0.008
 nodded-0.008
 shakes-0.007
 smiles-0.006
Token pe
Token ID613
Feature activation+5.13e+00

Pos logit contributions
 nods+0.018
 nod+0.015
 nodded+0.015
 shakes+0.013
 smiles+0.012
Neg logit contributions
keeping-0.034
-0.028
handed-0.028
 arrest-0.028
 rubble-0.027
Tokenb
Token ID65
Feature activation+5.32e+00

Pos logit contributions
 nods+0.147
 nod+0.134
 nodded+0.130
 shakes+0.117
 smiles+0.111
Neg logit contributions
keeping-0.162
-0.127
handed-0.127
hell-0.125
 arrest-0.122
Tokenble
Token ID903
Feature activation+3.74e+00

Pos logit contributions
 nods+0.137
 nod+0.121
 nodded+0.119
 shakes+0.110
 smiles+0.106
Neg logit contributions
keeping-0.180
hell-0.152
-0.147
uts-0.145
handed-0.142
Token?
Token ID30
Feature activation+1.23e+00

Pos logit contributions
 nods+0.083
 nod+0.073
 nodded+0.070
 shakes+0.061
 smiles+0.057
Neg logit contributions
keeping-0.146
-0.119
handed-0.119
 arrest-0.116
 rubble-0.113
Token We
Token ID775
Feature activation+7.92e-01

Pos logit contributions
 nods+0.036
 nod+0.033
 nodded+0.033
 shakes+0.029
 smiles+0.028
Neg logit contributions
keeping-0.037
-0.030
 arrest-0.028
handed-0.028
grand-0.027
Token can
Token ID460
Feature activation+8.26e-01

Pos logit contributions
 nods+0.019
 nod+0.017
 nodded+0.016
 shakes+0.015
 smiles+0.014
Neg logit contributions
keeping-0.030
-0.024
handed-0.024
 arrest-0.023
 rubble-0.023
Token and
Token ID290
Feature activation+1.28e+00

Pos logit contributions
 nods+0.063
 shakes+0.052
 nod+0.051
osh+0.049
leaf+0.047
Neg logit contributions
keeping-0.065
hell-0.050
 unatt-0.048
-0.048
 anymore-0.047
Token picked
Token ID6497
Feature activation+1.26e+00

Pos logit contributions
 nods+0.032
 nod+0.028
rey+0.027
ags+0.027
 particular+0.025
Neg logit contributions
keeping-0.038
-0.031
 arrest-0.030
 drowned-0.030
 recovered-0.029
Token up
Token ID510
Feature activation+4.14e+00

Pos logit contributions
 nods+0.023
 nodded+0.021
 nod+0.021
rey+0.016
 shakes+0.016
Neg logit contributions
keeping-0.046
hell-0.040
handed-0.039
 anywhere-0.039
-0.038
Token a
Token ID257
Feature activation+7.18e-01

Pos logit contributions
 nods+0.074
 nod+0.062
 nodded+0.059
 shakes+0.050
ap+0.042
Neg logit contributions
keeping-0.174
-0.147
handed-0.146
 arrest-0.145
hell-0.141
Token pe
Token ID613
Feature activation+4.88e+00

Pos logit contributions
 nods+0.018
 nod+0.015
 nodded+0.015
 shakes+0.013
rey+0.013
Neg logit contributions
keeping-0.023
hell-0.020
-0.019
ork-0.019
handed-0.019
Tokenb
Token ID65
Feature activation+5.10e+00

Pos logit contributions
 nods+0.158
 nod+0.146
 nodded+0.141
 shakes+0.128
 smiles+0.127
Neg logit contributions
keeping-0.118
hell-0.102
ork-0.094
-0.092
handed-0.092
Tokenble
Token ID903
Feature activation+2.98e+00

Pos logit contributions
 nods+0.131
 nod+0.113
 nodded+0.110
 smiles+0.104
 shakes+0.102
Neg logit contributions
keeping-0.164
hell-0.139
-0.139
ork-0.135
uts-0.132
Token too
Token ID1165
Feature activation-4.50e-01

Pos logit contributions
 nods+0.085
 nodded+0.071
 nod+0.071
 smiles+0.064
 shakes+0.064
Neg logit contributions
keeping-0.092
hell-0.075
handed-0.073
-0.072
 arrest-0.071
Token.
Token ID13
Feature activation+4.58e-01

Pos logit contributions
keeping+0.017
+0.014
handed+0.014
 arrest+0.014
 rubble+0.013
Neg logit contributions
 nods-0.010
 nod-0.009
 nodded-0.009
 shakes-0.008
 smiles-0.007
Token He
Token ID679
Feature activation+4.85e-01

Pos logit contributions
 nods+0.011
 nod+0.010
 nodded+0.010
 shakes+0.009
 smiles+0.008
Neg logit contributions
keeping-0.016
-0.013
handed-0.013
 arrest-0.012
hell-0.012
Token threw
Token ID9617
Feature activation+1.36e+00

Pos logit contributions
 nods+0.013
 nod+0.012
 nodded+0.010
ap+0.010
ags+0.010
Neg logit contributions
keeping-0.015
-0.012
 drowned-0.012
 arrest-0.012
hell-0.012
Token Tim
Token ID5045
Feature activation-2.83e-01

Pos logit contributions
keeping+0.038
+0.030
 arrest+0.029
handed+0.028
 rubble+0.028
Neg logit contributions
 nods-0.040
 nod-0.036
 nodded-0.035
 shakes-0.032
 smiles-0.030
Token saw
Token ID2497
Feature activation+7.18e-01

Pos logit contributions
keeping+0.012
handed+0.010
 rubble+0.010
+0.010
hell+0.009
Neg logit contributions
 nods-0.005
 nodded-0.005
 nod-0.005
 shakes-0.004
 up-0.004
Token a
Token ID257
Feature activation-4.89e-01

Pos logit contributions
 nods+0.014
 nod+0.012
 nodded+0.011
 shakes+0.009
 smiles+0.008
Neg logit contributions
keeping-0.030
handed-0.025
-0.025
 arrest-0.025
 rubble-0.025
Token small
Token ID1402
Feature activation+8.05e-01

Pos logit contributions
keeping+0.018
+0.015
handed+0.014
 arrest+0.014
 rubble+0.014
Neg logit contributions
 nods-0.012
 nod-0.010
 nodded-0.010
 shakes-0.009
 smiles-0.008
Token pe
Token ID613
Feature activation+4.42e+00

Pos logit contributions
 nods+0.021
 nod+0.019
 nodded+0.018
 shakes+0.016
 smiles+0.015
Neg logit contributions
keeping-0.028
-0.022
 arrest-0.022
handed-0.022
 rubble-0.021
Tokenb
Token ID65
Feature activation+5.10e+00

Pos logit contributions
 nods+0.123
 nod+0.112
 nodded+0.108
 shakes+0.097
 smiles+0.093
Neg logit contributions
keeping-0.144
handed-0.115
-0.115
 arrest-0.112
hell-0.110
Tokenble
Token ID903
Feature activation+3.12e+00

Pos logit contributions
 nods+0.105
 nod+0.091
 nodded+0.086
 shakes+0.076
 smiles+0.073
Neg logit contributions
keeping-0.203
-0.172
handed-0.168
 arrest-0.165
hell-0.162
Token in
Token ID287
Feature activation-6.00e-02

Pos logit contributions
 nods+0.074
 nod+0.065
 nodded+0.062
 shakes+0.053
 smiles+0.052
Neg logit contributions
keeping-0.115
handed-0.095
-0.095
 arrest-0.094
hell-0.091
Token his
Token ID465
Feature activation-1.30e+00

Pos logit contributions
keeping+0.003
+0.002
handed+0.002
 arrest+0.002
hell+0.002
Neg logit contributions
 nods-0.001
 nod-0.001
 nodded-0.001
 shakes-0.000
 smiles-0.000
Token shoe
Token ID17292
Feature activation-5.75e-01

Pos logit contributions
keeping+0.052
+0.041
hell+0.040
handed+0.040
 rubble+0.040
Neg logit contributions
 nods-0.029
 nod-0.026
 nodded-0.025
 shakes-0.021
 smiles-0.020
Token.
Token ID13
Feature activation-8.12e-01

Pos logit contributions
keeping+0.023
+0.019
handed+0.019
 arrest+0.018
 rubble+0.018
Neg logit contributions
 nods-0.012
 nod-0.011
 nodded-0.010
 shakes-0.009
 smiles-0.008
Token,
Token ID11
Feature activation+1.27e+00

Pos logit contributions
keeping+0.020
 anywhere+0.016
handed+0.015
+0.015
hell+0.015
Neg logit contributions
 nods-0.014
 nod-0.011
 nodded-0.011
 shakes-0.011
 smiles-0.009
Token "
Token ID366
Feature activation-1.86e-01

Pos logit contributions
 nods+0.037
 shakes+0.029
 smiles+0.028
 nod+0.027
rey+0.027
Neg logit contributions
keeping-0.034
-0.030
 anywhere-0.027
ache-0.027
 unatt-0.026
TokenWhat
Token ID2061
Feature activation+4.00e-01

Pos logit contributions
keeping+0.006
handed+0.005
+0.005
 arrest+0.005
 unatt+0.005
Neg logit contributions
 nods-0.005
 nod-0.004
 nodded-0.004
 shakes-0.004
 smiles-0.004
Token makes
Token ID1838
Feature activation+1.09e+00

Pos logit contributions
 nods+0.010
 nod+0.009
 nodded+0.009
 shakes+0.008
rey+0.007
Neg logit contributions
keeping-0.013
-0.011
hell-0.011
grand-0.010
 anywhere-0.010
Token pe
Token ID613
Feature activation+4.57e+00

Pos logit contributions
 nods+0.030
 nodded+0.027
 nod+0.026
 shakes+0.022
 smiles+0.022
Neg logit contributions
keeping-0.035
 arrest-0.029
-0.028
handed-0.028
 rubble-0.028
Tokenb
Token ID65
Feature activation+5.09e+00

Pos logit contributions
 nods+0.131
 nod+0.118
 nodded+0.115
 shakes+0.103
 smiles+0.098
Neg logit contributions
keeping-0.144
handed-0.115
-0.114
 arrest-0.110
hell-0.110
Tokenbles
Token ID7689
Feature activation+3.47e+00

Pos logit contributions
 nods+0.121
 nodded+0.105
 nod+0.104
 shakes+0.093
 smiles+0.089
Neg logit contributions
keeping-0.180
-0.150
hell-0.147
uts-0.146
handed-0.144
Token special
Token ID2041
Feature activation+1.71e+00

Pos logit contributions
 nods+0.086
 nod+0.076
 nodded+0.074
 shakes+0.066
 smiles+0.062
Neg logit contributions
keeping-0.125
-0.101
handed-0.100
 arrest-0.099
 rubble-0.096
Token?"
Token ID1701
Feature activation+1.01e+00

Pos logit contributions
 nods+0.049
 nodded+0.044
 nod+0.043
 shakes+0.038
 smiles+0.036
Neg logit contributions
keeping-0.054
 anywhere-0.042
-0.041
handed-0.040
hell-0.040
Token His
Token ID2399
Feature activation+8.20e-01

Pos logit contributions
 nods+0.027
 nod+0.022
 nodded+0.022
 shakes+0.021
 smiles+0.020
Neg logit contributions
keeping-0.033
handed-0.027
-0.026
grand-0.025
 arrest-0.025
Token mother
Token ID2802
Feature activation-4.79e-01

Pos logit contributions
 nods+0.018
 nod+0.015
 nodded+0.015
 shakes+0.013
 smiles+0.012
Neg logit contributions
keeping-0.031
-0.025
handed-0.024
 arrest-0.024
 rubble-0.023
Token,
Token ID11
Feature activation-4.82e-01

Pos logit contributions
keeping+0.000
handed+0.000
+0.000
 arrest+0.000
 rubble+0.000
Neg logit contributions
 nods-0.000
 nod-0.000
 nodded-0.000
 shakes-0.000
 smiles-0.000
Token she
Token ID673
Feature activation-5.81e-01

Pos logit contributions
keeping+0.018
+0.014
handed+0.014
 arrest+0.014
 rubble+0.014
Neg logit contributions
 nods-0.012
 nod-0.011
 nodded-0.010
 shakes-0.009
 smiles-0.008
Token found
Token ID1043
Feature activation+1.21e+00

Pos logit contributions
keeping+0.023
+0.019
handed+0.019
 arrest+0.018
 rubble+0.018
Neg logit contributions
 nods-0.012
 nod-0.011
 nodded-0.011
 shakes-0.009
 smiles-0.009
Token a
Token ID257
Feature activation-3.29e-01

Pos logit contributions
 nods+0.026
 nod+0.023
 nodded+0.021
 shakes+0.018
 smiles+0.016
Neg logit contributions
keeping-0.048
handed-0.041
 arrest-0.039
-0.039
 rubble-0.039
Token pe
Token ID613
Feature activation+4.66e+00

Pos logit contributions
keeping+0.011
+0.009
handed+0.009
 arrest+0.009
hell+0.009
Neg logit contributions
 nods-0.009
 nod-0.008
 nodded-0.007
 shakes-0.007
 smiles-0.006
Tokenb
Token ID65
Feature activation+5.08e+00

Pos logit contributions
 nods+0.139
 nod+0.127
 nodded+0.121
 shakes+0.110
 smiles+0.107
Neg logit contributions
keeping-0.140
handed-0.111
-0.110
hell-0.110
ork-0.105
Tokenble
Token ID903
Feature activation+3.07e+00

Pos logit contributions
 nods+0.118
 nod+0.102
 nodded+0.097
 shakes+0.088
 smiles+0.086
Neg logit contributions
keeping-0.187
-0.158
handed-0.152
hell-0.151
 anywhere-0.151
Token that
Token ID326
Feature activation-8.43e-01

Pos logit contributions
 nods+0.076
 nod+0.066
 nodded+0.063
 shakes+0.054
 smiles+0.053
Neg logit contributions
keeping-0.112
handed-0.092
-0.091
 arrest-0.090
hell-0.088
Token was
Token ID373
Feature activation+3.54e-01

Pos logit contributions
keeping+0.030
+0.025
handed+0.024
 rubble+0.024
 arrest+0.023
Neg logit contributions
 nods-0.021
 nod-0.019
 nodded-0.017
 shakes-0.015
 smiles-0.014
Token very
Token ID845
Feature activation+3.87e-01

Pos logit contributions
 nods+0.010
 nod+0.009
 nodded+0.009
 shakes+0.008
 smiles+0.007
Neg logit contributions
keeping-0.011
-0.009
handed-0.009
 rubble-0.009
 anywhere-0.009
Token deep
Token ID2769
Feature activation+1.83e+00

Pos logit contributions
 nods+0.011
 nod+0.010
 nodded+0.010
 shakes+0.009
 smiles+0.008
Neg logit contributions
keeping-0.012
handed-0.009
-0.009
hell-0.009
 rubble-0.009
Token the
Token ID262
Feature activation+1.62e+00

Pos logit contributions
 nods+0.006
 nod+0.005
 nodded+0.005
 shakes+0.003
 smiles+0.003
Neg logit contributions
keeping-0.026
handed-0.022
hell-0.022
-0.022
 rubble-0.022
Token idea
Token ID2126
Feature activation+1.56e+00

Pos logit contributions
 nods+0.041
 nodded+0.037
 nod+0.035
 shakes+0.031
 smiles+0.030
Neg logit contributions
keeping-0.056
handed-0.046
-0.045
 arrest-0.044
 rubble-0.043
Token to
Token ID284
Feature activation-5.31e-01

Pos logit contributions
 nods+0.040
 nod+0.035
 nodded+0.034
 shakes+0.030
 smiles+0.028
Neg logit contributions
keeping-0.056
-0.045
handed-0.044
 arrest-0.044
 rubble-0.043
Token bake
Token ID28450
Feature activation+7.05e-01

Pos logit contributions
keeping+0.019
handed+0.016
 arrest+0.015
+0.015
 rubble+0.015
Neg logit contributions
 nods-0.013
 nodded-0.011
 nod-0.011
 shakes-0.010
 smiles-0.010
Token a
Token ID257
Feature activation+2.22e-01

Pos logit contributions
 nods+0.018
 nod+0.015
 nodded+0.015
 shakes+0.013
 fisherman+0.013
Neg logit contributions
keeping-0.024
-0.021
handed-0.020
 anywhere-0.020
hell-0.020
Token pe
Token ID613
Feature activation+5.07e+00

Pos logit contributions
 nods+0.006
 nod+0.006
 nodded+0.006
 shakes+0.005
 smiles+0.005
Neg logit contributions
keeping-0.007
-0.006
handed-0.005
 arrest-0.005
 rubble-0.005
Tokenb
Token ID65
Feature activation+5.34e+00

Pos logit contributions
 nods+0.167
 nod+0.151
 nodded+0.147
 shakes+0.138
 smiles+0.133
Neg logit contributions
keeping-0.135
handed-0.108
-0.104
hell-0.101
grand-0.099
Tokenble
Token ID903
Feature activation+3.84e+00

Pos logit contributions
 nods+0.127
 nod+0.108
 nodded+0.107
 shakes+0.099
 smiles+0.096
Neg logit contributions
keeping-0.195
-0.163
handed-0.160
hell-0.155
ork-0.154
Token.
Token ID13
Feature activation-6.51e-01

Pos logit contributions
 nods+0.088
 nod+0.076
 nodded+0.074
 shakes+0.064
 smiles+0.060
Neg logit contributions
keeping-0.147
handed-0.120
-0.120
 arrest-0.118
 rubble-0.115
Token She
Token ID1375
Feature activation+3.59e-01

Pos logit contributions
keeping+0.021
+0.016
handed+0.016
 arrest+0.015
grand+0.015
Neg logit contributions
 nods-0.018
 nod-0.016
 nodded-0.016
 shakes-0.015
 smiles-0.014
Token gathered
Token ID9272
Feature activation+1.04e+00

Pos logit contributions
 nods+0.010
 nod+0.010
 nodded+0.008
 shakes+0.008
ap+0.008
Neg logit contributions
keeping-0.011
 arrest-0.009
-0.008
 drowned-0.008
grand-0.008
TokenI
Token ID40
Feature activation+1.04e+00

Pos logit contributions
 nods+0.037
 nod+0.033
 nodded+0.032
 shakes+0.029
 smiles+0.027
Neg logit contributions
keeping-0.054
-0.044
handed-0.042
 arrest-0.041
hell-0.040
Token am
Token ID716
Feature activation+6.44e-01

Pos logit contributions
 nods+0.030
 nod+0.027
 nodded+0.026
 shakes+0.024
 smiles+0.022
Neg logit contributions
keeping-0.034
-0.027
handed-0.026
 arrest-0.026
 rubble-0.025
Token a
Token ID257
Feature activation-2.30e-01

Pos logit contributions
 nods+0.014
 nod+0.012
 nodded+0.012
 shakes+0.010
 smiles+0.009
Neg logit contributions
keeping-0.025
-0.021
handed-0.021
 arrest-0.020
 rubble-0.020
Token magical
Token ID10883
Feature activation+1.32e+00

Pos logit contributions
keeping+0.009
+0.007
handed+0.007
 arrest+0.007
 rubble+0.007
Neg logit contributions
 nods-0.005
 nod-0.005
 nodded-0.005
 shakes-0.004
 cl-0.004
Token pe
Token ID613
Feature activation+4.74e+00

Pos logit contributions
 nods+0.033
 nod+0.029
 nodded+0.029
 shakes+0.025
 smiles+0.024
Neg logit contributions
keeping-0.047
-0.038
handed-0.037
 arrest-0.037
 rubble-0.036
Tokenb
Token ID65
Feature activation+5.03e+00

Pos logit contributions
 nods+0.136
 nod+0.123
 nodded+0.119
 shakes+0.110
 smiles+0.105
Neg logit contributions
keeping-0.149
handed-0.120
-0.118
hell-0.114
 arrest-0.114
Tokenble
Token ID903
Feature activation+3.07e+00

Pos logit contributions
 nods+0.122
 nod+0.104
 nodded+0.101
 shakes+0.096
 smiles+0.092
Neg logit contributions
keeping-0.184
-0.152
handed-0.148
hell-0.145
 anywhere-0.142
Token and
Token ID290
Feature activation+4.36e-01

Pos logit contributions
 nods+0.085
 nod+0.076
 nodded+0.074
 shakes+0.067
 smiles+0.063
Neg logit contributions
keeping-0.102
-0.080
handed-0.080
 arrest-0.079
 rubble-0.076
Token I
Token ID314
Feature activation+1.29e+00

Pos logit contributions
 nods+0.012
 nod+0.011
 nodded+0.010
 shakes+0.009
 smiles+0.009
Neg logit contributions
keeping-0.014
-0.011
handed-0.011
 arrest-0.011
 rubble-0.011
Tokenâ
Token ID22940
Feature activation+6.12e-01

Pos logit contributions
 nods+0.035
 nod+0.032
 nodded+0.031
 shakes+0.028
 smiles+0.026
Neg logit contributions
keeping-0.043
-0.034
handed-0.034
 arrest-0.033
 rubble-0.032
Token
Token ID26391
Feature activation-7.99e-02

Pos logit contributions
 nods+0.019
 nod+0.017
 nodded+0.017
 shakes+0.016
 smiles+0.015
Neg logit contributions
keeping-0.019
handed-0.014
-0.014
 arrest-0.013
grand-0.013
Token she
Token ID673
Feature activation-5.15e-01

Pos logit contributions
keeping+0.011
+0.009
handed+0.009
 arrest+0.009
 rubble+0.008
Neg logit contributions
 nods-0.009
 nod-0.008
 nodded-0.008
 shakes-0.007
 smiles-0.007
Token found
Token ID1043
Feature activation+1.07e+00

Pos logit contributions
keeping+0.019
+0.016
handed+0.016
 arrest+0.015
 rubble+0.015
Neg logit contributions
 nods-0.012
 nod-0.011
 nodded-0.010
 shakes-0.009
 smiles-0.009
Token a
Token ID257
Feature activation-6.13e-01

Pos logit contributions
 nods+0.019
 nod+0.017
 nodded+0.016
 shakes+0.013
 smiles+0.012
Neg logit contributions
keeping-0.046
handed-0.039
-0.039
 arrest-0.038
 rubble-0.038
Token small
Token ID1402
Feature activation+7.84e-01

Pos logit contributions
keeping+0.022
+0.018
handed+0.017
 arrest+0.017
 rubble+0.017
Neg logit contributions
 nods-0.015
 nod-0.014
 nodded-0.013
 shakes-0.012
 smiles-0.011
Token pe
Token ID613
Feature activation+4.42e+00

Pos logit contributions
 nods+0.023
 nod+0.020
 nodded+0.020
 shakes+0.018
 smiles+0.017
Neg logit contributions
keeping-0.025
-0.019
 arrest-0.019
handed-0.019
 rubble-0.018
Tokenb
Token ID65
Feature activation+5.03e+00

Pos logit contributions
 nods+0.124
 nod+0.113
 nodded+0.107
 shakes+0.097
 smiles+0.094
Neg logit contributions
keeping-0.144
handed-0.115
-0.114
hell-0.111
 arrest-0.110
Tokenble
Token ID903
Feature activation+2.91e+00

Pos logit contributions
 nods+0.104
 nod+0.090
 nodded+0.085
 shakes+0.075
 smiles+0.072
Neg logit contributions
keeping-0.201
-0.169
handed-0.165
 arrest-0.162
hell-0.160
Token on
Token ID319
Feature activation+1.28e-01

Pos logit contributions
 nods+0.059
 nod+0.051
 nodded+0.049
 shakes+0.041
 smiles+0.038
Neg logit contributions
keeping-0.119
-0.099
handed-0.099
 arrest-0.097
 rubble-0.095
Token the
Token ID262
Feature activation-1.27e+00

Pos logit contributions
 nods+0.002
 nod+0.001
 nodded+0.001
 shakes+0.001
 smiles+0.001
Neg logit contributions
keeping-0.006
-0.005
handed-0.005
 arrest-0.005
hell-0.005
Token ground
Token ID2323
Feature activation+1.58e-01

Pos logit contributions
keeping+0.036
+0.027
handed+0.027
 arrest+0.026
 rubble+0.025
Neg logit contributions
 nods-0.042
 nod-0.039
 nodded-0.038
 shakes-0.034
 smiles-0.033
Token.
Token ID13
Feature activation-2.92e-01

Pos logit contributions
 nods+0.004
 nod+0.004
 nodded+0.003
 shakes+0.003
 smiles+0.003
Neg logit contributions
keeping-0.006
handed-0.005
-0.005
 arrest-0.004
 rubble-0.004
Token grandma
Token ID49890
Feature activation-8.09e-01

Pos logit contributions
keeping+0.009
+0.008
handed+0.008
hell+0.008
 arrest+0.007
Neg logit contributions
 nods-0.004
 nod-0.003
 nodded-0.003
 shakes-0.002
 smiles-0.002
Token's
Token ID338
Feature activation-1.55e+00

Pos logit contributions
keeping+0.032
+0.027
handed+0.026
 arrest+0.026
 rubble+0.026
Neg logit contributions
 nods-0.017
 nod-0.015
 nodded-0.015
 shakes-0.012
 smiles-0.012
Token house
Token ID2156
Feature activation-1.64e+00

Pos logit contributions
keeping+0.056
+0.047
handed+0.045
hell+0.044
 arrest+0.043
Neg logit contributions
 nods-0.038
 nod-0.035
 nodded-0.032
 shakes-0.029
 smiles-0.029
Token for
Token ID329
Feature activation-6.36e-01

Pos logit contributions
keeping+0.068
+0.058
handed+0.056
 arrest+0.056
 rubble+0.055
Neg logit contributions
 nods-0.030
 nod-0.027
 nodded-0.026
 shakes-0.022
 smiles-0.020
Token safe
Token ID3338
Feature activation-1.35e+00

Pos logit contributions
keeping+0.030
+0.028
hell+0.027
handed+0.027
lor+0.026
Neg logit contributions
 nods-0.004
 nod-0.004
 their-0.004
 next-0.004
 cl-0.004
Tokenkeeping
Token ID19934
Feature activation-6.73e+00

Pos logit contributions
keeping+0.035
+0.035
hell+0.034
handed+0.032
 arrest+0.032
Neg logit contributions
 nods-0.036
 nod-0.033
 nodded-0.033
 up-0.032
 smiles-0.031
Token.
Token ID13
Feature activation-2.38e-01

Pos logit contributions
keeping+0.272
+0.249
 rubble+0.249
handed+0.243
 arrest+0.241
Neg logit contributions
 nods-0.093
 nod-0.093
 nodded-0.087
 up-0.069
 cl-0.065
Token [space]
Token ID220
Feature activation+6.80e-01

Pos logit contributions
keeping+0.010
+0.008
 arrest+0.008
 unatt+0.008
hell+0.008
Neg logit contributions
 nods-0.005
 nod-0.004
 nodded-0.004
 shakes-0.003
 smiles-0.003
Token [newline]
Token ID198
Feature activation-4.57e-01

Pos logit contributions
 nods+0.015
 nod+0.013
 nodded+0.013
 shakes+0.011
 smiles+0.010
Neg logit contributions
keeping-0.027
-0.022
handed-0.022
 arrest-0.021
 rubble-0.021
Token [newline]
Token ID198
Feature activation-1.20e+00

Pos logit contributions
keeping+0.021
 arrest+0.017
+0.017
 rubble+0.017
 unatt+0.017
Neg logit contributions
 nods-0.007
 nod-0.006
 nodded-0.006
 shakes-0.004
 smiles-0.004
TokenWhen
Token ID2215
Feature activation-2.53e-01

Pos logit contributions
keeping+0.047
+0.037
 rubble+0.036
handed+0.036
 anywhere+0.036
Neg logit contributions
 nods-0.025
 nod-0.024
 nodded-0.023
 cl-0.018
 shakes-0.018
Token it
Token ID340
Feature activation+1.76e+00

Pos logit contributions
 nods+0.002
 nod+0.002
 nodded+0.002
 shakes+0.002
 smiles+0.002
Neg logit contributions
keeping-0.003
handed-0.003
-0.003
 arrest-0.003
 rubble-0.003
Token to
Token ID284
Feature activation-1.35e+00

Pos logit contributions
 nods+0.044
 nod+0.039
 nodded+0.039
 shakes+0.034
rey+0.031
Neg logit contributions
keeping-0.063
handed-0.050
 arrest-0.049
-0.048
 rubble-0.047
Token me
Token ID502
Feature activation+1.15e+00

Pos logit contributions
keeping+0.052
 arrest+0.043
handed+0.042
+0.041
 rubble+0.041
Neg logit contributions
 nods-0.030
 nod-0.026
 nodded-0.025
 shakes-0.022
 smiles-0.019
Token for
Token ID329
Feature activation-4.78e-02

Pos logit contributions
 nods+0.030
 nod+0.026
 nodded+0.026
 shakes+0.023
 smiles+0.021
Neg logit contributions
keeping-0.040
handed-0.032
-0.032
 arrest-0.031
hell-0.030
Token safe
Token ID3338
Feature activation-6.72e-01

Pos logit contributions
keeping+0.002
+0.002
handed+0.002
 arrest+0.002
hell+0.002
Neg logit contributions
 nods-0.001
 nod-0.001
 nodded-0.001
 shakes-0.001
 smiles-0.001
Tokenkeeping
Token ID19934
Feature activation-5.65e+00

Pos logit contributions
keeping+0.012
+0.011
handed+0.010
hell+0.010
 rubble+0.009
Neg logit contributions
 nods-0.025
 nod-0.023
 nodded-0.023
 smiles-0.022
 shakes-0.022
Token.
Token ID13
Feature activation+4.09e-02

Pos logit contributions
keeping+0.229
 rubble+0.210
+0.207
handed+0.203
 unnecessary+0.200
Neg logit contributions
 nods-0.093
 nod-0.083
 nodded-0.081
 smiles-0.063
 shakes-0.062
Token He
Token ID679
Feature activation+4.72e-01

Pos logit contributions
 nods+0.001
 nod+0.001
 nodded+0.001
 shakes+0.001
 smiles+0.001
Neg logit contributions
keeping-0.002
-0.001
handed-0.001
 arrest-0.001
 rubble-0.001
Token told
Token ID1297
Feature activation-8.81e-01

Pos logit contributions
 nods+0.010
 nod+0.009
 nodded+0.008
 shakes+0.007
ap+0.007
Neg logit contributions
keeping-0.018
 arrest-0.015
handed-0.015
-0.014
grand-0.014
Token me
Token ID502
Feature activation+8.87e-01

Pos logit contributions
keeping+0.036
handed+0.030
+0.029
 arrest+0.029
 rubble+0.029
Neg logit contributions
 nods-0.018
 nod-0.016
 nodded-0.015
 shakes-0.012
 smiles-0.011
Token to
Token ID284
Feature activation-1.29e+00

Pos logit contributions
 nods+0.024
 nod+0.022
 nodded+0.021
 shakes+0.019
 smiles+0.018
Neg logit contributions
keeping-0.030
-0.024
handed-0.024
 arrest-0.023
 rubble-0.023
Token the
Token ID262
Feature activation-2.53e+00

Pos logit contributions
keeping+0.019
handed+0.016
 arrest+0.016
+0.016
 rubble+0.015
Neg logit contributions
 nods-0.008
 nod-0.007
 nodded-0.007
 shakes-0.005
ap-0.004
Token sofa
Token ID34902
Feature activation-3.89e+00

Pos logit contributions
keeping+0.083
+0.070
handed+0.067
 arrest+0.064
grand+0.062
Neg logit contributions
 nods-0.069
 nod-0.065
 nodded-0.059
 shakes-0.057
 smiles-0.056
Token,
Token ID11
Feature activation-9.61e-01

Pos logit contributions
keeping+0.161
+0.137
handed+0.135
 arrest+0.131
grand+0.131
Neg logit contributions
 nods-0.073
 nod-0.066
 nodded-0.063
 shakes-0.054
 smiles-0.048
Token in
Token ID287
Feature activation-1.11e+00

Pos logit contributions
keeping+0.024
 anywhere+0.021
 rubble+0.019
 unatt+0.019
 arrest+0.019
Neg logit contributions
 nods-0.029
 nod-0.025
rey-0.025
ap-0.024
 nodded-0.024
Token the
Token ID262
Feature activation-1.45e+00

Pos logit contributions
keeping+0.051
+0.044
handed+0.043
 arrest+0.043
 rubble+0.042
Neg logit contributions
 nods-0.017
 nod-0.014
 nodded-0.013
 shakes-0.010
 smiles-0.009
Token closet
Token ID21615
Feature activation-5.57e+00

Pos logit contributions
keeping+0.043
+0.033
handed+0.032
 arrest+0.032
 rubble+0.030
Neg logit contributions
 nods-0.046
 nod-0.042
 nodded-0.041
 shakes-0.037
 smiles-0.036
Token,
Token ID11
Feature activation-1.32e+00

Pos logit contributions
keeping+0.251
+0.218
handed+0.215
grand+0.212
 unnecessary+0.211
Neg logit contributions
 nods-0.082
 nodded-0.076
 nod-0.075
 shakes-0.061
 up-0.050
Token but
Token ID475
Feature activation-7.09e-01

Pos logit contributions
keeping+0.035
 anywhere+0.029
 arrest+0.029
 unatt+0.027
 rubble+0.027
Neg logit contributions
 nods-0.037
 nod-0.033
rey-0.031
 nodded-0.031
 fisherman-0.030
Token she
Token ID673
Feature activation-2.38e+00

Pos logit contributions
keeping+0.021
 anywhere+0.018
+0.018
 unatt+0.017
 rubble+0.017
Neg logit contributions
 nods-0.020
 nod-0.018
 nodded-0.016
 shakes-0.014
 smiles-0.014
Token could
Token ID714
Feature activation-8.49e-01

Pos logit contributions
keeping+0.076
handed+0.058
+0.058
 arrest+0.057
 rubble+0.055
Neg logit contributions
 nods-0.070
 nod-0.064
 nodded-0.063
 shakes-0.057
 smiles-0.054
Token not
Token ID407
Feature activation-7.56e-01

Pos logit contributions
keeping+0.026
+0.021
 arrest+0.020
handed+0.020
 rubble+0.019
Neg logit contributions
 nods-0.025
 nod-0.022
 nodded-0.022
 shakes-0.020
 smiles-0.019
Token the
Token ID262
Feature activation-2.16e+00

Pos logit contributions
keeping+0.000
+0.000
handed+0.000
 arrest+0.000
hell+0.000
Neg logit contributions
 nods-0.000
 nod-0.000
 nodded-0.000
 shakes-0.000
 smiles-0.000
Token bed
Token ID3996
Feature activation-4.42e+00

Pos logit contributions
keeping+0.076
+0.065
handed+0.061
 arrest+0.058
 drowned+0.056
Neg logit contributions
 nods-0.054
 nod-0.050
 nodded-0.046
 shakes-0.044
 smiles-0.042
Token,
Token ID11
Feature activation-1.48e+00

Pos logit contributions
keeping+0.211
+0.186
handed+0.180
grand+0.179
 arrest+0.177
Neg logit contributions
 nods-0.044
 nod-0.044
 nodded-0.043
 up-0.033
 shakes-0.032
Token in
Token ID287
Feature activation-9.80e-01

Pos logit contributions
keeping+0.046
 unatt+0.036
 rubble+0.036
 arrest+0.035
handed+0.035
Neg logit contributions
 nods-0.044
 nod-0.038
 nodded-0.036
 shakes-0.033
ap-0.033
Token the
Token ID262
Feature activation-1.58e+00

Pos logit contributions
keeping+0.046
+0.040
handed+0.039
 arrest+0.038
hell+0.038
Neg logit contributions
 nods-0.014
 nod-0.012
 nodded-0.011
 shakes-0.009
 smiles-0.008
Token closet
Token ID21615
Feature activation-5.28e+00

Pos logit contributions
keeping+0.053
+0.044
handed+0.041
 arrest+0.040
hell+0.039
Neg logit contributions
 nods-0.043
 nod-0.040
 nodded-0.038
 shakes-0.035
 smiles-0.034
Token,
Token ID11
Feature activation-1.61e+00

Pos logit contributions
keeping+0.250
+0.218
grand+0.216
 unnecessary+0.214
handed+0.212
Neg logit contributions
 nodded-0.058
 nods-0.057
 nod-0.057
 shakes-0.041
 up-0.039
Token and
Token ID290
Feature activation-1.56e+00

Pos logit contributions
keeping+0.053
 unatt+0.042
 rubble+0.041
 anywhere+0.041
 arrest+0.041
Neg logit contributions
 nods-0.045
 nod-0.039
 nodded-0.036
 shakes-0.033
ap-0.033
Token even
Token ID772
Feature activation+9.18e-01

Pos logit contributions
keeping+0.062
handed+0.050
+0.050
 rubble+0.049
 arrest+0.049
Neg logit contributions
 nods-0.034
 nod-0.029
 nodded-0.027
 shakes-0.024
 smiles-0.022
Token in
Token ID287
Feature activation-6.70e-01

Pos logit contributions
 nods+0.021
 nod+0.019
 nodded+0.017
 shakes+0.015
 smiles+0.014
Neg logit contributions
keeping-0.035
handed-0.028
 unatt-0.028
-0.028
 rubble-0.028
Token the
Token ID262
Feature activation-1.19e+00

Pos logit contributions
keeping+0.034
+0.030
handed+0.029
grand+0.029
hell+0.028
Neg logit contributions
 nods-0.006
 nod-0.005
 nodded-0.004
 smiles-0.004
 shakes-0.004
Token her
Token ID607
Feature activation-2.07e+00

Pos logit contributions
keeping+0.001
+0.001
handed+0.001
 arrest+0.001
hell+0.001
Neg logit contributions
 nods-0.000
 nod-0.000
 nodded-0.000
 shakes-0.000
 smiles-0.000
Token bed
Token ID3996
Feature activation-4.19e+00

Pos logit contributions
keeping+0.064
+0.051
handed+0.050
 arrest+0.046
 drowned+0.045
Neg logit contributions
 nods-0.062
 nod-0.057
 nodded-0.054
 shakes-0.051
 smiles-0.050
Token,
Token ID11
Feature activation-1.04e+00

Pos logit contributions
keeping+0.196
+0.179
handed+0.175
 arrest+0.175
 unnecessary+0.172
Neg logit contributions
 nods-0.042
 nodded-0.042
 nod-0.042
 shakes-0.030
 up-0.027
Token in
Token ID287
Feature activation-1.11e+00

Pos logit contributions
keeping+0.033
 unatt+0.026
 anywhere+0.026
 rubble+0.025
 arrest+0.025
Neg logit contributions
 nods-0.030
 nod-0.026
 nodded-0.025
rey-0.022
ap-0.022
Token her
Token ID607
Feature activation-2.23e+00

Pos logit contributions
keeping+0.054
+0.047
handed+0.046
 arrest+0.045
grand+0.045
Neg logit contributions
 nods-0.014
 nod-0.011
 nodded-0.010
 shakes-0.008
 smiles-0.008
Token closet
Token ID21615
Feature activation-5.02e+00

Pos logit contributions
keeping+0.081
+0.066
handed+0.066
hell+0.061
 arrest+0.061
Neg logit contributions
 nods-0.056
 nod-0.051
 nodded-0.048
 shakes-0.044
 smiles-0.043
Token,
Token ID11
Feature activation-1.42e+00

Pos logit contributions
keeping+0.237
+0.210
 unnecessary+0.207
handed+0.206
grand+0.204
Neg logit contributions
 nods-0.057
 nodded-0.055
 nod-0.052
 shakes-0.040
 up-0.034
Token and
Token ID290
Feature activation-1.72e+00

Pos logit contributions
keeping+0.046
 unatt+0.037
 anywhere+0.037
 arrest+0.035
 rubble+0.035
Neg logit contributions
 nods-0.038
 nod-0.033
 nodded-0.031
rey-0.029
 fisherman-0.028
Token in
Token ID287
Feature activation-7.65e-01

Pos logit contributions
keeping+0.064
 unatt+0.051
 rubble+0.050
 anywhere+0.050
+0.050
Neg logit contributions
 nods-0.041
 nod-0.035
 nodded-0.032
 shakes-0.029
rey-0.028
Token her
Token ID607
Feature activation-2.21e+00

Pos logit contributions
keeping+0.039
+0.034
handed+0.033
grand+0.032
 arrest+0.032
Neg logit contributions
 nods-0.008
 nod-0.007
 nodded-0.006
 shakes-0.005
 smiles-0.005
Token toy
Token ID13373
Feature activation-1.15e+00

Pos logit contributions
keeping+0.088
handed+0.073
+0.072
hell+0.070
uts+0.069
Neg logit contributions
 nods-0.049
 nod-0.043
 nodded-0.042
 shakes-0.036
 smiles-0.036
Token her
Token ID607
Feature activation-1.96e+00

Pos logit contributions
 nods+0.003
 nod+0.002
 nodded+0.002
 shakes+0.002
 smiles+0.002
Neg logit contributions
keeping-0.010
-0.009
handed-0.009
 arrest-0.009
hell-0.008
Token bed
Token ID3996
Feature activation-4.19e+00

Pos logit contributions
keeping+0.062
+0.047
handed+0.047
 arrest+0.043
 drowned+0.043
Neg logit contributions
 nods-0.059
 nod-0.055
 nodded-0.052
 shakes-0.050
 smiles-0.049
Token,
Token ID11
Feature activation-1.31e+00

Pos logit contributions
keeping+0.201
+0.178
handed+0.172
 arrest+0.171
 unnecessary+0.170
Neg logit contributions
 nods-0.044
 nodded-0.043
 nod-0.042
 shakes-0.031
 up-0.028
Token in
Token ID287
Feature activation-8.72e-01

Pos logit contributions
keeping+0.041
 unatt+0.032
 anywhere+0.032
 rubble+0.032
 arrest+0.032
Neg logit contributions
 nods-0.038
 nod-0.033
 nodded-0.032
 shakes-0.028
ap-0.028
Token her
Token ID607
Feature activation-2.05e+00

Pos logit contributions
keeping+0.041
+0.035
handed+0.035
 arrest+0.034
hell+0.034
Neg logit contributions
 nods-0.012
 nod-0.010
 nodded-0.009
 shakes-0.008
 smiles-0.007
Token closet
Token ID21615
Feature activation-4.94e+00

Pos logit contributions
keeping+0.075
handed+0.059
+0.059
 arrest+0.056
hell+0.055
Neg logit contributions
 nods-0.053
 nod-0.047
 nodded-0.046
 shakes-0.041
 smiles-0.041
Token,
Token ID11
Feature activation-1.66e+00

Pos logit contributions
keeping+0.236
+0.203
 unnecessary+0.201
handed+0.200
grand+0.198
Neg logit contributions
 nods-0.058
 nodded-0.057
 nod-0.053
 shakes-0.042
 up-0.036
Token and
Token ID290
Feature activation-1.62e+00

Pos logit contributions
keeping+0.055
 unatt+0.044
 anywhere+0.043
 arrest+0.042
 rubble+0.042
Neg logit contributions
 nods-0.046
 nod-0.039
 nodded-0.037
ap-0.033
 shakes-0.033
Token in
Token ID287
Feature activation-5.42e-01

Pos logit contributions
keeping+0.062
+0.049
handed+0.049
 unatt+0.049
 rubble+0.049
Neg logit contributions
 nods-0.037
 nod-0.031
 nodded-0.029
 shakes-0.026
rey-0.024
Token her
Token ID607
Feature activation-2.03e+00

Pos logit contributions
keeping+0.027
+0.023
handed+0.023
 arrest+0.022
grand+0.022
Neg logit contributions
 nods-0.007
 nod-0.005
 nodded-0.005
 shakes-0.004
 smiles-0.004
Token toy
Token ID13373
Feature activation-8.07e-01

Pos logit contributions
keeping+0.080
handed+0.065
+0.063
hell+0.062
 arrest+0.062
Neg logit contributions
 nods-0.047
 nod-0.041
 nodded-0.041
 smiles-0.035
 shakes-0.035
Token the
Token ID262
Feature activation-2.16e+00

Pos logit contributions
 nods+0.006
 nod+0.005
 nodded+0.004
 shakes+0.004
 smiles+0.003
Neg logit contributions
keeping-0.023
-0.021
handed-0.020
grand-0.019
hell-0.019
Token bed
Token ID3996
Feature activation-3.94e+00

Pos logit contributions
keeping+0.076
+0.066
handed+0.060
 arrest+0.058
 drowned+0.057
Neg logit contributions
 nods-0.053
 nod-0.049
 shakes-0.044
 nodded-0.044
 up-0.043
Token,
Token ID11
Feature activation-1.16e+00

Pos logit contributions
keeping+0.189
+0.166
handed+0.160
grand+0.159
 arrest+0.158
Neg logit contributions
 nods-0.043
 nodded-0.040
 nod-0.040
 shakes-0.028
 up-0.027
Token in
Token ID287
Feature activation-7.05e-01

Pos logit contributions
keeping+0.037
 rubble+0.030
 unatt+0.030
 arrest+0.029
 anywhere+0.029
Neg logit contributions
 nods-0.033
 nod-0.028
 nodded-0.027
ap-0.025
 shakes-0.024
Token the
Token ID262
Feature activation-1.42e+00

Pos logit contributions
keeping+0.033
+0.029
handed+0.028
 arrest+0.028
hell+0.027
Neg logit contributions
 nods-0.010
 nod-0.008
 nodded-0.007
 shakes-0.006
 smiles-0.005
Token closet
Token ID21615
Feature activation-4.93e+00

Pos logit contributions
keeping+0.049
+0.040
handed+0.038
 arrest+0.038
hell+0.036
Neg logit contributions
 nods-0.038
 nod-0.035
 nodded-0.033
 shakes-0.030
 smiles-0.029
Token,
Token ID11
Feature activation-1.37e+00

Pos logit contributions
keeping+0.226
+0.196
grand+0.191
 unnecessary+0.190
handed+0.188
Neg logit contributions
 nods-0.067
 nodded-0.066
 nod-0.064
 shakes-0.048
 smiled-0.043
Token and
Token ID290
Feature activation-1.29e+00

Pos logit contributions
keeping+0.044
 unatt+0.036
 anywhere+0.035
 rubble+0.035
 arrest+0.035
Neg logit contributions
 nods-0.037
 nod-0.032
 nodded-0.029
ap-0.028
rey-0.027
Token behind
Token ID2157
Feature activation+3.39e-01

Pos logit contributions
keeping+0.052
+0.042
handed+0.042
 arrest+0.041
 rubble+0.041
Neg logit contributions
 nods-0.028
 nod-0.024
 nodded-0.023
 shakes-0.020
 smiles-0.019
Token the
Token ID262
Feature activation-2.01e+00

Pos logit contributions
 nods+0.004
 nod+0.003
 nodded+0.003
 smiles+0.002
 shakes+0.002
Neg logit contributions
keeping-0.016
-0.015
handed-0.014
grand-0.014
 arrest-0.014
Token curtains
Token ID41160
Feature activation-1.94e+00

Pos logit contributions
keeping+0.073
+0.066
handed+0.060
 arrest+0.060
 drowned+0.060
Neg logit contributions
 nods-0.044
 nod-0.043
 smiles-0.037
 up-0.037
 shakes-0.037
Token the
Token ID262
Feature activation-1.70e+00

Pos logit contributions
 nods+0.002
 nod+0.002
 nodded+0.002
 shakes+0.001
 smiles+0.001
Neg logit contributions
keeping-0.005
-0.004
 arrest-0.004
handed-0.004
 rubble-0.004
Token bed
Token ID3996
Feature activation-3.94e+00

Pos logit contributions
keeping+0.057
+0.047
handed+0.045
 arrest+0.044
grand+0.042
Neg logit contributions
 nods-0.046
 nod-0.042
 nodded-0.040
 shakes-0.037
 smiles-0.036
Token,
Token ID11
Feature activation-4.43e-01

Pos logit contributions
keeping+0.176
+0.151
handed+0.150
 arrest+0.148
grand+0.144
Neg logit contributions
 nods-0.059
 nod-0.055
 nodded-0.054
 shakes-0.041
 up-0.035
Token in
Token ID287
Feature activation-8.69e-01

Pos logit contributions
keeping+0.013
 rubble+0.010
 unatt+0.010
 anywhere+0.010
 arrest+0.010
Neg logit contributions
 nods-0.013
 nod-0.011
 nodded-0.010
ap-0.010
rey-0.010
Token the
Token ID262
Feature activation-1.20e+00

Pos logit contributions
keeping+0.040
+0.034
handed+0.033
 arrest+0.033
 rubble+0.032
Neg logit contributions
 nods-0.013
 nod-0.011
 nodded-0.011
 shakes-0.008
 smiles-0.007
Token closet
Token ID21615
Feature activation-4.93e+00

Pos logit contributions
keeping+0.036
+0.028
handed+0.027
 arrest+0.026
hell+0.025
Neg logit contributions
 nods-0.038
 nod-0.035
 nodded-0.034
 shakes-0.031
 smiles-0.030
Token,
Token ID11
Feature activation-7.09e-01

Pos logit contributions
keeping+0.214
+0.183
handed+0.181
 arrest+0.177
grand+0.177
Neg logit contributions
 nods-0.082
 nodded-0.077
 nod-0.075
 shakes-0.059
 smiles-0.053
Token and
Token ID290
Feature activation-1.10e+00

Pos logit contributions
keeping+0.021
 anywhere+0.017
 unatt+0.017
 arrest+0.016
 rubble+0.016
Neg logit contributions
 nods-0.019
 nod-0.016
rey-0.015
ap-0.015
 nodded-0.015
Token on
Token ID319
Feature activation+1.25e-01

Pos logit contributions
keeping+0.041
+0.032
handed+0.031
 rubble+0.031
 arrest+0.031
Neg logit contributions
 nods-0.027
 nod-0.023
 nodded-0.021
 shakes-0.020
 smiles-0.018
Token the
Token ID262
Feature activation-1.36e+00

Pos logit contributions
 nods+0.002
 nod+0.002
 nodded+0.002
 shakes+0.001
 smiles+0.001
Neg logit contributions
keeping-0.006
-0.005
handed-0.005
 arrest-0.005
 rubble-0.005
Token couch
Token ID18507
Feature activation-2.91e+00

Pos logit contributions
keeping+0.043
+0.034
handed+0.033
 arrest+0.032
hell+0.030
Neg logit contributions
 nods-0.040
 nod-0.037
 nodded-0.035
 shakes-0.033
 smiles-0.031
Token the
Token ID262
Feature activation-1.80e+00

Pos logit contributions
 nods+0.003
 nod+0.002
 nodded+0.002
 smiles+0.001
 shakes+0.001
Neg logit contributions
keeping-0.013
-0.011
handed-0.011
hell-0.011
grand-0.011
Token bed
Token ID3996
Feature activation-3.94e+00

Pos logit contributions
keeping+0.062
+0.057
handed+0.050
uts+0.049
uries+0.048
Neg logit contributions
 nods-0.041
 nod-0.039
 up-0.038
 shakes-0.037
 smiles-0.035
Token,
Token ID11
Feature activation-9.46e-01

Pos logit contributions
keeping+0.193
+0.173
handed+0.170
 unnecessary+0.168
 drowned+0.166
Neg logit contributions
 nodded-0.031
 nods-0.030
 nod-0.030
 for-0.023
 up-0.022
Token in
Token ID287
Feature activation-7.80e-01

Pos logit contributions
keeping+0.033
 rubble+0.026
 arrest+0.026
handed+0.025
+0.025
Neg logit contributions
 nods-0.025
 nod-0.022
 nodded-0.021
 shakes-0.019
ap-0.018
Token the
Token ID262
Feature activation-1.54e+00

Pos logit contributions
keeping+0.039
+0.035
handed+0.034
hell+0.033
grand+0.033
Neg logit contributions
 nods-0.008
 nod-0.006
 nodded-0.005
 shakes-0.004
 smiles-0.004
Token closet
Token ID21615
Feature activation-4.91e+00

Pos logit contributions
keeping+0.052
+0.046
uts+0.042
hell+0.040
 arrest+0.040
Neg logit contributions
 nods-0.040
 nod-0.038
 nodded-0.035
 shakes-0.033
 smiles-0.033
Token,
Token ID11
Feature activation-1.12e+00

Pos logit contributions
keeping+0.231
+0.205
 unnecessary+0.203
handed+0.201
grand+0.200
Neg logit contributions
 nodded-0.053
 nods-0.052
 nod-0.049
 shakes-0.035
 smiled-0.034
Token and
Token ID290
Feature activation-1.25e+00

Pos logit contributions
keeping+0.040
 rubble+0.032
 arrest+0.032
 anywhere+0.031
 unatt+0.031
Neg logit contributions
 nods-0.028
 nod-0.024
 nodded-0.022
 shakes-0.020
ap-0.019
Token even
Token ID772
Feature activation+1.26e+00

Pos logit contributions
keeping+0.051
+0.042
handed+0.042
 arrest+0.041
 rubble+0.040
Neg logit contributions
 nods-0.026
 nod-0.023
 nodded-0.022
 shakes-0.019
 smiles-0.017
Token outside
Token ID2354
Feature activation+2.74e-01

Pos logit contributions
 nods+0.025
 nod+0.021
 nodded+0.021
 shakes+0.018
 smiles+0.016
Neg logit contributions
keeping-0.052
-0.044
handed-0.043
 arrest-0.042
hell-0.041
Token.
Token ID13
Feature activation-4.50e-01

Pos logit contributions
 nods+0.004
 nodded+0.003
 nod+0.003
 smiles+0.003
 shakes+0.003
Neg logit contributions
keeping-0.012
-0.011
handed-0.011
hell-0.010
 arrest-0.010
Token it
Token ID340
Feature activation-5.58e-02

Pos logit contributions
keeping+0.026
+0.021
handed+0.021
 arrest+0.021
 rubble+0.020
Neg logit contributions
 nods-0.016
 nod-0.014
 nodded-0.014
 shakes-0.012
 smiles-0.012
Token and
Token ID290
Feature activation-2.65e-01

Pos logit contributions
keeping+0.002
+0.002
handed+0.002
 arrest+0.002
 rubble+0.002
Neg logit contributions
 nods-0.001
 nod-0.001
 nodded-0.001
 shakes-0.001
 smiles-0.001
Token enjoying
Token ID13226
Feature activation-1.37e+00

Pos logit contributions
keeping+0.010
+0.008
handed+0.008
 arrest+0.008
 rubble+0.008
Neg logit contributions
 nods-0.006
 nod-0.006
 nodded-0.006
 shakes-0.005
 smiles-0.005
Token it
Token ID340
Feature activation+2.39e-01

Pos logit contributions
keeping+0.055
+0.045
handed+0.045
 arrest+0.044
 rubble+0.043
Neg logit contributions
 nods-0.029
 nod-0.025
 nodded-0.024
 shakes-0.021
 smiles-0.019
Token for
Token ID329
Feature activation-9.29e-01

Pos logit contributions
 nods+0.005
 nod+0.005
 nodded+0.005
 shakes+0.004
 smiles+0.004
Neg logit contributions
keeping-0.009
-0.007
handed-0.007
 arrest-0.007
 rubble-0.007
Token years
Token ID812
Feature activation-4.90e+00

Pos logit contributions
keeping+0.040
+0.036
handed+0.035
grand+0.034
 arrest+0.034
Neg logit contributions
 nods-0.012
 nod-0.011
 nodded-0.010
 smiles-0.009
 cl-0.009
Token to
Token ID284
Feature activation-1.85e+00

Pos logit contributions
keeping+0.201
+0.177
handed+0.171
 arrest+0.167
 rubble+0.166
Neg logit contributions
 nods-0.087
 nodded-0.078
 nod-0.076
 smiles-0.063
 shakes-0.061
Token come
Token ID1282
Feature activation-1.29e+00

Pos logit contributions
keeping+0.068
+0.055
handed+0.055
 arrest+0.054
 rubble+0.052
Neg logit contributions
 nods-0.045
 nod-0.040
 nodded-0.039
 shakes-0.034
 smiles-0.032
Token!
Token ID0
Feature activation+6.03e-01

Pos logit contributions
keeping+0.051
+0.041
handed+0.041
 arrest+0.040
 rubble+0.039
Neg logit contributions
 nods-0.029
 nod-0.025
 nodded-0.025
 shakes-0.022
 smiles-0.020
Token<|endoftext|>
Token ID50256
Feature activation-1.31e-01

Pos logit contributions
 nods+0.013
 nod+0.012
 nodded+0.011
 shakes+0.009
 smiles+0.009
Neg logit contributions
keeping-0.024
-0.020
handed-0.020
 arrest-0.019
 rubble-0.019
TokenOnce
Token ID7454
Feature activation-1.38e-01

Pos logit contributions
keeping+0.005
+0.004
handed+0.004
 arrest+0.004
hell+0.004
Neg logit contributions
 nods-0.003
 nod-0.003
 nodded-0.002
 shakes-0.002
 smiles-0.002
Token the
Token ID262
Feature activation-1.36e+00

Pos logit contributions
 nods+0.003
 nod+0.003
 nodded+0.003
 shakes+0.002
 smiles+0.002
Neg logit contributions
keeping-0.013
-0.011
handed-0.011
grand-0.011
hell-0.011
Token bed
Token ID3996
Feature activation-3.98e+00

Pos logit contributions
keeping+0.047
+0.041
handed+0.038
 arrest+0.037
grand+0.035
Neg logit contributions
 nods-0.034
 nod-0.032
 nodded-0.030
 shakes-0.028
 smiles-0.026
Token,
Token ID11
Feature activation-1.01e+00

Pos logit contributions
keeping+0.178
+0.153
handed+0.148
 arrest+0.146
grand+0.145
Neg logit contributions
 nods-0.060
 nod-0.056
 nodded-0.055
 shakes-0.043
 up-0.038
Token in
Token ID287
Feature activation-4.65e-01

Pos logit contributions
keeping+0.031
 rubble+0.024
 unatt+0.024
 arrest+0.024
 anywhere+0.024
Neg logit contributions
 nods-0.031
 nod-0.026
 nodded-0.025
ap-0.024
 shakes-0.023
Token the
Token ID262
Feature activation-8.34e-01

Pos logit contributions
keeping+0.022
+0.019
handed+0.018
 arrest+0.018
hell+0.018
Neg logit contributions
 nods-0.007
 nod-0.006
 nodded-0.005
 shakes-0.004
 smiles-0.004
Token closet
Token ID21615
Feature activation-4.89e+00

Pos logit contributions
keeping+0.028
+0.022
handed+0.021
 arrest+0.021
hell+0.020
Neg logit contributions
 nods-0.024
 nod-0.022
 nodded-0.021
 shakes-0.019
 smiles-0.018
Token,
Token ID11
Feature activation-1.15e+00

Pos logit contributions
keeping+0.215
+0.185
handed+0.178
grand+0.178
 arrest+0.174
Neg logit contributions
 nods-0.078
 nodded-0.074
 nod-0.074
 shakes-0.056
 smiled-0.051
Token and
Token ID290
Feature activation-6.03e-01

Pos logit contributions
keeping+0.036
 unatt+0.029
 anywhere+0.028
 rubble+0.028
 arrest+0.028
Neg logit contributions
 nods-0.032
 nod-0.027
ap-0.025
 nodded-0.025
rey-0.024
Token even
Token ID772
Feature activation+1.20e+00

Pos logit contributions
keeping+0.023
+0.019
handed+0.019
 arrest+0.018
 rubble+0.018
Neg logit contributions
 nods-0.014
 nod-0.012
 nodded-0.012
 shakes-0.010
 smiles-0.010
Token outside
Token ID2354
Feature activation+1.55e-01

Pos logit contributions
 nods+0.027
 nod+0.023
 nodded+0.022
 shakes+0.019
 smiles+0.017
Neg logit contributions
keeping-0.047
handed-0.038
-0.038
 arrest-0.037
 rubble-0.037
Token in
Token ID287
Feature activation-8.23e-01

Pos logit contributions
 nods+0.003
 nod+0.003
 nodded+0.003
 shakes+0.002
 smiles+0.002
Neg logit contributions
keeping-0.006
-0.005
handed-0.005
 arrest-0.005
hell-0.005
Token the
Token ID262
Feature activation-1.87e+00

Pos logit contributions
 nods+0.005
 nod+0.004
 nodded+0.003
 shakes+0.003
 smiles+0.003
Neg logit contributions
keeping-0.025
-0.022
handed-0.022
grand-0.021
hell-0.021
Token bed
Token ID3996
Feature activation-3.85e+00

Pos logit contributions
keeping+0.064
+0.055
handed+0.051
 arrest+0.049
 drowned+0.048
Neg logit contributions
 nods-0.047
 nod-0.044
 shakes-0.040
 nodded-0.039
 up-0.038
Token,
Token ID11
Feature activation-1.31e+00

Pos logit contributions
keeping+0.189
+0.165
handed+0.159
 unnecessary+0.156
 arrest+0.156
Neg logit contributions
 nods-0.040
 nodded-0.037
 nod-0.036
 shakes-0.026
 up-0.025
Token in
Token ID287
Feature activation-8.12e-01

Pos logit contributions
keeping+0.045
handed+0.035
 rubble+0.035
 arrest+0.035
+0.035
Neg logit contributions
 nods-0.035
 nod-0.031
 nodded-0.030
 shakes-0.026
ap-0.025
Token the
Token ID262
Feature activation-1.41e+00

Pos logit contributions
keeping+0.041
+0.036
handed+0.035
hell+0.035
grand+0.035
Neg logit contributions
 nods-0.008
 nod-0.006
 nodded-0.005
 shakes-0.005
 smiles-0.005
Token closet
Token ID21615
Feature activation-4.88e+00

Pos logit contributions
keeping+0.049
+0.041
handed+0.037
 arrest+0.037
hell+0.037
Neg logit contributions
 nods-0.037
 nod-0.034
 nodded-0.031
 shakes-0.030
 smiles-0.028
Token,
Token ID11
Feature activation-1.43e+00

Pos logit contributions
keeping+0.233
+0.203
 unnecessary+0.199
handed+0.195
grand+0.194
Neg logit contributions
 nods-0.057
 nodded-0.056
 nod-0.053
 shakes-0.041
 up-0.039
Token and
Token ID290
Feature activation-1.50e+00

Pos logit contributions
keeping+0.050
 rubble+0.040
 arrest+0.039
+0.039
 unatt+0.039
Neg logit contributions
 nods-0.037
 nod-0.032
 nodded-0.031
 shakes-0.027
ap-0.027
Token even
Token ID772
Feature activation+1.09e+00

Pos logit contributions
keeping+0.062
+0.052
handed+0.052
 arrest+0.051
hell+0.049
Neg logit contributions
 nods-0.029
 nod-0.026
 nodded-0.025
 shakes-0.021
 smiles-0.019
Token in
Token ID287
Feature activation-7.54e-01

Pos logit contributions
 nods+0.023
 nod+0.020
 nodded+0.019
 shakes+0.016
 smiles+0.015
Neg logit contributions
keeping-0.044
-0.036
handed-0.036
 arrest-0.035
 rubble-0.035
Token the
Token ID262
Feature activation-1.07e+00

Pos logit contributions
keeping+0.039
+0.035
handed+0.034
grand+0.034
ache+0.034
Neg logit contributions
 nods-0.006
 nod-0.004
 smiles-0.004
 nodded-0.003
 shakes-0.003
Token the
Token ID262
Feature activation-1.88e+00

Pos logit contributions
 nods+0.001
 nod+0.001
 nodded+0.001
 shakes+0.000
 smiles+0.000
Neg logit contributions
keeping-0.003
-0.003
handed-0.003
hell-0.003
grand-0.003
Token bed
Token ID3996
Feature activation-3.92e+00

Pos logit contributions
keeping+0.065
+0.058
handed+0.052
lor+0.051
 arrest+0.050
Neg logit contributions
 nods-0.046
 nod-0.043
 shakes-0.039
 up-0.039
 nodded-0.038
Token,
Token ID11
Feature activation-8.64e-01

Pos logit contributions
keeping+0.189
+0.168
handed+0.163
 arrest+0.160
 unnecessary+0.159
Neg logit contributions
 nods-0.041
 nodded-0.038
 nod-0.038
 shakes-0.025
 up-0.023
Token in
Token ID287
Feature activation-6.77e-01

Pos logit contributions
keeping+0.028
 rubble+0.022
 arrest+0.022
 unatt+0.021
handed+0.021
Neg logit contributions
 nods-0.025
 nod-0.021
 nodded-0.020
 shakes-0.018
ap-0.018
Token the
Token ID262
Feature activation-1.33e+00

Pos logit contributions
keeping+0.033
+0.028
handed+0.028
 arrest+0.027
hell+0.027
Neg logit contributions
 nods-0.009
 nod-0.007
 nodded-0.006
 shakes-0.005
 smiles-0.005
Token closet
Token ID21615
Feature activation-4.86e+00

Pos logit contributions
keeping+0.045
+0.037
handed+0.034
 arrest+0.034
uts+0.033
Neg logit contributions
 nods-0.037
 nod-0.034
 nodded-0.032
 shakes-0.029
 smiles-0.029
Token,
Token ID11
Feature activation-1.04e+00

Pos logit contributions
keeping+0.220
+0.193
handed+0.188
 unnecessary+0.186
grand+0.185
Neg logit contributions
 nods-0.069
 nodded-0.066
 nod-0.062
 shakes-0.048
 smiled-0.045
Token and
Token ID290
Feature activation-1.02e+00

Pos logit contributions
keeping+0.035
 rubble+0.027
 arrest+0.027
 anywhere+0.027
 unatt+0.027
Neg logit contributions
 nods-0.028
 nod-0.024
 nodded-0.022
ap-0.021
 shakes-0.021
Token even
Token ID772
Feature activation+1.29e+00

Pos logit contributions
keeping+0.040
+0.032
handed+0.032
 arrest+0.032
 rubble+0.031
Neg logit contributions
 nods-0.023
 nod-0.020
 nodded-0.019
 shakes-0.017
 smiles-0.016
Token in
Token ID287
Feature activation-4.31e-01

Pos logit contributions
 nods+0.028
 nod+0.024
 nodded+0.023
 shakes+0.020
 smiles+0.019
Neg logit contributions
keeping-0.051
-0.042
handed-0.042
 arrest-0.041
 rubble-0.040
Token the
Token ID262
Feature activation-1.10e+00

Pos logit contributions
keeping+0.022
+0.019
handed+0.019
grand+0.019
hell+0.019
Neg logit contributions
 nods-0.004
 nod-0.003
 nodded-0.003
 smiles-0.002
 shakes-0.002
Token his
Token ID465
Feature activation-1.81e+00

Pos logit contributions
 nods+0.003
 nod+0.003
 nodded+0.002
 shakes+0.002
 smiles+0.002
Neg logit contributions
keeping-0.011
-0.009
handed-0.009
 arrest-0.009
hell-0.009
Token bed
Token ID3996
Feature activation-4.10e+00

Pos logit contributions
keeping+0.062
+0.050
handed+0.048
 endings+0.045
 unnecessary+0.043
Neg logit contributions
 nods-0.046
 nod-0.044
 nodded-0.040
 smiles-0.039
 shakes-0.039
Token,
Token ID11
Feature activation-9.86e-01

Pos logit contributions
keeping+0.190
+0.169
handed+0.162
 arrest+0.162
 unnecessary+0.159
Neg logit contributions
 nods-0.048
 nod-0.046
 nodded-0.046
 shakes-0.032
 up-0.030
Token in
Token ID287
Feature activation-7.31e-01

Pos logit contributions
keeping+0.032
 rubble+0.025
 unatt+0.025
 arrest+0.024
handed+0.024
Neg logit contributions
 nods-0.028
 nod-0.024
 nodded-0.023
ap-0.021
 shakes-0.021
Token the
Token ID262
Feature activation-1.23e+00

Pos logit contributions
keeping+0.034
+0.029
handed+0.029
 arrest+0.028
hell+0.028
Neg logit contributions
 nods-0.011
 nod-0.009
 nodded-0.008
 shakes-0.007
 smiles-0.006
Token closet
Token ID21615
Feature activation-4.85e+00

Pos logit contributions
keeping+0.041
+0.034
handed+0.032
 arrest+0.031
hell+0.031
Neg logit contributions
 nods-0.033
 nod-0.031
 nodded-0.029
 shakes-0.027
 smiles-0.026
Token,
Token ID11
Feature activation-1.09e+00

Pos logit contributions
keeping+0.213
+0.184
handed+0.177
grand+0.176
 unnecessary+0.176
Neg logit contributions
 nods-0.076
 nodded-0.072
 nod-0.072
 shakes-0.055
 smiled-0.049
Token and
Token ID290
Feature activation-9.34e-01

Pos logit contributions
keeping+0.036
 unatt+0.029
 rubble+0.029
 arrest+0.028
 anywhere+0.028
Neg logit contributions
 nods-0.029
 nod-0.025
 nodded-0.022
ap-0.022
rey-0.021
Token even
Token ID772
Feature activation+7.94e-01

Pos logit contributions
keeping+0.034
handed+0.027
 rubble+0.026
 unatt+0.026
+0.026
Neg logit contributions
 nods-0.024
 nod-0.020
 nodded-0.018
 shakes-0.017
ap-0.017
Token outside
Token ID2354
Feature activation-4.32e-02

Pos logit contributions
 nods+0.018
 nod+0.015
 nodded+0.014
 shakes+0.012
 smiles+0.011
Neg logit contributions
keeping-0.031
handed-0.026
-0.025
 rubble-0.025
 unatt-0.025
Token.
Token ID13
Feature activation-3.45e-01

Pos logit contributions
keeping+0.002
+0.001
handed+0.001
 arrest+0.001
hell+0.001
Neg logit contributions
 nods-0.001
 nod-0.001
 nodded-0.001
 shakes-0.001
 smiles-0.001
Token the
Token ID262
Feature activation-1.76e+00

Pos logit contributions
 nods+0.002
 nod+0.002
 nodded+0.002
 shakes+0.001
 smiles+0.001
Neg logit contributions
keeping-0.006
-0.005
handed-0.005
 arrest-0.005
 rubble-0.005
Token bed
Token ID3996
Feature activation-4.20e+00

Pos logit contributions
keeping+0.059
+0.049
handed+0.046
 arrest+0.045
grand+0.043
Neg logit contributions
 nods-0.047
 nod-0.044
 nodded-0.040
 shakes-0.038
 smiles-0.037
Token and
Token ID290
Feature activation-1.35e+00

Pos logit contributions
keeping+0.187
+0.163
handed+0.160
 arrest+0.158
grand+0.155
Neg logit contributions
 nods-0.060
 nod-0.056
 nodded-0.054
 shakes-0.042
 up-0.037
Token in
Token ID287
Feature activation-7.14e-01

Pos logit contributions
keeping+0.046
+0.036
handed+0.036
 arrest+0.035
 rubble+0.035
Neg logit contributions
 nods-0.036
 nod-0.032
 nodded-0.030
 shakes-0.028
 smiles-0.026
Token the
Token ID262
Feature activation-1.33e+00

Pos logit contributions
keeping+0.033
+0.028
handed+0.028
 arrest+0.027
 rubble+0.027
Neg logit contributions
 nods-0.011
 nod-0.009
 nodded-0.009
 shakes-0.007
 smiles-0.006
Token closet
Token ID21615
Feature activation-4.85e+00

Pos logit contributions
keeping+0.042
+0.033
handed+0.031
 arrest+0.031
hell+0.030
Neg logit contributions
 nods-0.040
 nod-0.037
 nodded-0.035
 shakes-0.032
 smiles-0.032
Token,
Token ID11
Feature activation-8.09e-01

Pos logit contributions
keeping+0.201
+0.173
 arrest+0.167
handed+0.166
grand+0.163
Neg logit contributions
 nods-0.086
 nodded-0.084
 nod-0.080
 shakes-0.063
 smiled-0.062
Token but
Token ID475
Feature activation-3.02e-01

Pos logit contributions
keeping+0.028
 arrest+0.022
 rubble+0.022
+0.022
handed+0.022
Neg logit contributions
 nods-0.021
 nod-0.018
 nodded-0.017
 shakes-0.015
rey-0.015
Token the
Token ID262
Feature activation-1.15e+00

Pos logit contributions
keeping+0.010
+0.008
handed+0.008
 rubble+0.008
 arrest+0.008
Neg logit contributions
 nods-0.008
 nod-0.007
 nodded-0.007
 shakes-0.006
 smiles-0.006
Token toy
Token ID13373
Feature activation-1.67e+00

Pos logit contributions
keeping+0.039
+0.032
handed+0.031
 arrest+0.030
hell+0.029
Neg logit contributions
 nods-0.031
 nod-0.028
 nodded-0.027
 shakes-0.025
 smiles-0.024
Token was
Token ID373
Feature activation-1.68e+00

Pos logit contributions
keeping+0.068
+0.056
handed+0.056
 arrest+0.055
hell+0.053
Neg logit contributions
 nods-0.034
 nod-0.030
 nodded-0.029
 shakes-0.025
 smiles-0.023
Token butterfly
Token ID35113
Feature activation+8.78e-01

Pos logit contributions
keeping+0.005
handed+0.004
+0.004
 rubble+0.004
 arrest+0.004
Neg logit contributions
 nods-0.004
 nod-0.004
 nodded-0.004
 shakes-0.003
 smiles-0.003
Token back
Token ID736
Feature activation+2.17e-01

Pos logit contributions
 nods+0.021
 nod+0.018
 nodded+0.017
 shakes+0.016
rey+0.015
Neg logit contributions
keeping-0.032
-0.025
handed-0.025
hell-0.025
 anywhere-0.025
Token in
Token ID287
Feature activation-1.80e-01

Pos logit contributions
 nods+0.005
 nod+0.005
 nodded+0.004
 shakes+0.004
rey+0.003
Neg logit contributions
keeping-0.008
handed-0.006
 anywhere-0.006
-0.006
 unatt-0.006
Token its
Token ID663
Feature activation-1.19e+00

Pos logit contributions
keeping+0.008
handed+0.006
+0.006
 arrest+0.006
 rubble+0.006
Neg logit contributions
 nods-0.003
 nod-0.003
 nodded-0.003
 shakes-0.002
 smiles-0.002
Token natural
Token ID3288
Feature activation-9.85e-01

Pos logit contributions
keeping+0.038
handed+0.030
 arrest+0.029
 rubble+0.029
+0.029
Neg logit contributions
 nods-0.033
 nod-0.029
 nodded-0.029
 shakes-0.026
 smiles-0.024
Token habitat
Token ID20018
Feature activation-4.83e+00

Pos logit contributions
keeping+0.032
+0.025
handed+0.025
 arrest+0.024
 rubble+0.023
Neg logit contributions
 nods-0.028
 nod-0.026
 nodded-0.025
 shakes-0.023
 smiles-0.021
Token.
Token ID13
Feature activation-4.51e-01

Pos logit contributions
keeping+0.192
+0.167
 arrest+0.161
handed+0.160
 rubble+0.153
Neg logit contributions
 nods-0.097
 nod-0.086
 nodded-0.084
 shakes-0.070
 smiles-0.068
Token He
Token ID679
Feature activation+7.87e-01

Pos logit contributions
keeping+0.016
+0.013
handed+0.013
 arrest+0.013
 rubble+0.012
Neg logit contributions
 nods-0.011
 nod-0.010
 nodded-0.010
 shakes-0.009
 smiles-0.008
Token waved
Token ID26834
Feature activation-9.68e-02

Pos logit contributions
 nods+0.020
 nod+0.018
 nodded+0.015
 shakes+0.015
ap+0.014
Neg logit contributions
keeping-0.026
-0.021
 arrest-0.021
 drowned-0.021
handed-0.020
Token goodbye
Token ID24829
Feature activation-1.63e+00

Pos logit contributions
keeping+0.003
 arrest+0.002
handed+0.002
+0.002
hell+0.002
Neg logit contributions
 nods-0.003
 nod-0.002
 shakes-0.002
 nodded-0.002
rey-0.002
Token to
Token ID284
Feature activation-7.80e-01

Pos logit contributions
keeping+0.060
handed+0.047
+0.047
 arrest+0.047
 rubble+0.044
Neg logit contributions
 nods-0.040
 nod-0.035
 nodded-0.033
 shakes-0.031
 smiles-0.028
Token the
Token ID262
Feature activation-1.87e+00

Pos logit contributions
 nods+0.001
 nod+0.001
 nodded+0.000
 smiles+0.000
 shakes+0.000
Neg logit contributions
keeping-0.003
-0.003
handed-0.003
grand-0.003
hell-0.003
Token bed
Token ID3996
Feature activation-3.83e+00

Pos logit contributions
keeping+0.061
+0.054
handed+0.048
 arrest+0.046
uts+0.046
Neg logit contributions
 nods-0.049
 nod-0.046
 nodded-0.042
 shakes-0.041
 smiles-0.041
Token,
Token ID11
Feature activation-6.75e-01

Pos logit contributions
keeping+0.178
+0.156
handed+0.152
 arrest+0.150
grand+0.150
Neg logit contributions
 nods-0.046
 nodded-0.044
 nod-0.043
 shakes-0.031
 up-0.028
Token in
Token ID287
Feature activation-7.01e-01

Pos logit contributions
keeping+0.022
 rubble+0.018
 arrest+0.017
 unatt+0.017
handed+0.017
Neg logit contributions
 nods-0.019
 nod-0.016
 nodded-0.015
rey-0.014
ap-0.014
Token the
Token ID262
Feature activation-1.07e+00

Pos logit contributions
keeping+0.034
+0.030
handed+0.029
grand+0.029
hell+0.029
Neg logit contributions
 nods-0.008
 nod-0.007
 nodded-0.006
 shakes-0.005
 smiles-0.005
Token closet
Token ID21615
Feature activation-4.82e+00

Pos logit contributions
keeping+0.035
+0.028
handed+0.026
 arrest+0.026
hell+0.025
Neg logit contributions
 nods-0.031
 nod-0.028
 nodded-0.027
 shakes-0.025
 smiles-0.024
Token,
Token ID11
Feature activation-8.52e-01

Pos logit contributions
keeping+0.214
+0.186
grand+0.181
handed+0.181
 unnecessary+0.180
Neg logit contributions
 nods-0.071
 nodded-0.069
 nod-0.065
 shakes-0.051
 smiled-0.048
Token and
Token ID290
Feature activation-9.69e-01

Pos logit contributions
keeping+0.029
 arrest+0.023
 rubble+0.023
 unatt+0.023
 anywhere+0.022
Neg logit contributions
 nods-0.022
 nod-0.019
 nodded-0.017
ap-0.016
rey-0.016
Token even
Token ID772
Feature activation+1.18e+00

Pos logit contributions
keeping+0.038
+0.031
handed+0.031
 arrest+0.031
 rubble+0.030
Neg logit contributions
 nods-0.021
 nod-0.018
 nodded-0.017
 shakes-0.015
 smiles-0.014
Token in
Token ID287
Feature activation-6.09e-01

Pos logit contributions
 nods+0.025
 nod+0.022
 nodded+0.020
 shakes+0.017
 smiles+0.016
Neg logit contributions
keeping-0.047
handed-0.039
-0.039
 arrest-0.038
 rubble-0.037
Token the
Token ID262
Feature activation-1.02e+00

Pos logit contributions
keeping+0.030
+0.027
handed+0.026
grand+0.026
hell+0.026
Neg logit contributions
 nods-0.006
 nod-0.005
 nodded-0.004
 smiles-0.004
 shakes-0.004
Token the
Token ID262
Feature activation-1.89e+00

Pos logit contributions
keeping+0.005
 arrest+0.004
+0.004
handed+0.004
 rubble+0.004
Neg logit contributions
 nods-0.002
 nod-0.002
 nodded-0.002
 shakes-0.001
ap-0.001
Token bed
Token ID3996
Feature activation-4.21e+00

Pos logit contributions
keeping+0.061
+0.051
handed+0.049
 arrest+0.045
hell+0.044
Neg logit contributions
 nods-0.053
 nod-0.049
 nodded-0.045
 shakes-0.043
 smiles-0.042
Token,
Token ID11
Feature activation-8.42e-01

Pos logit contributions
keeping+0.195
+0.173
handed+0.171
 arrest+0.169
 unnecessary+0.166
Neg logit contributions
 nods-0.050
 nod-0.047
 nodded-0.045
 shakes-0.035
 next-0.031
Token in
Token ID287
Feature activation-7.55e-01

Pos logit contributions
keeping+0.024
 unatt+0.019
 anywhere+0.019
 rubble+0.019
 arrest+0.018
Neg logit contributions
 nods-0.026
 nod-0.022
 nodded-0.021
rey-0.021
ap-0.020
Token the
Token ID262
Feature activation-8.59e-01

Pos logit contributions
keeping+0.035
+0.029
handed+0.029
 arrest+0.029
 rubble+0.028
Neg logit contributions
 nods-0.012
 nod-0.010
 nodded-0.009
 shakes-0.007
 smiles-0.006
Token closet
Token ID21615
Feature activation-4.81e+00

Pos logit contributions
keeping+0.026
+0.020
handed+0.020
 arrest+0.020
 rubble+0.019
Neg logit contributions
 nods-0.026
 nod-0.024
 nodded-0.024
 shakes-0.021
 smiles-0.020
Token,
Token ID11
Feature activation-9.96e-01

Pos logit contributions
keeping+0.221
+0.191
handed+0.189
grand+0.187
 unnecessary+0.186
Neg logit contributions
 nods-0.069
 nodded-0.062
 nod-0.062
 shakes-0.050
 smiles-0.041
Token in
Token ID287
Feature activation-7.42e-01

Pos logit contributions
keeping+0.029
 unatt+0.024
 anywhere+0.024
 arrest+0.023
 rubble+0.022
Neg logit contributions
 nods-0.029
 nod-0.024
 nodded-0.023
rey-0.022
 agrees-0.022
Token the
Token ID262
Feature activation-8.19e-01

Pos logit contributions
keeping+0.035
+0.030
handed+0.030
 arrest+0.029
hell+0.029
Neg logit contributions
 nods-0.010
 nod-0.009
 nodded-0.008
 shakes-0.006
 smiles-0.006
Token draw
Token ID3197
Feature activation-2.08e+00

Pos logit contributions
keeping+0.026
+0.020
handed+0.020
 arrest+0.019
hell+0.018
Neg logit contributions
 nods-0.024
 nod-0.022
 nodded-0.022
 shakes-0.020
 smiles-0.019
Tokeners
Token ID364
Feature activation-1.32e+00

Pos logit contributions
keeping+0.080
+0.065
handed+0.065
 arrest+0.063
 rubble+0.062
Neg logit contributions
 nods-0.048
 nod-0.042
 nodded-0.041
 shakes-0.036
 smiles-0.034
Token see
Token ID766
Feature activation-1.20e+00

Pos logit contributions
keeping+0.043
handed+0.035
+0.035
 arrest+0.035
hell+0.034
Neg logit contributions
 nods-0.025
 nod-0.022
 nodded-0.022
 shakes-0.019
 smiles-0.018
Token animals
Token ID4695
Feature activation+6.52e-01

Pos logit contributions
keeping+0.052
+0.044
handed+0.043
 arrest+0.042
ork+0.041
Neg logit contributions
 nods-0.022
 nod-0.018
 nodded-0.017
 up-0.016
 smiles-0.015
Token in
Token ID287
Feature activation-7.90e-01

Pos logit contributions
 nods+0.014
 nod+0.013
 nodded+0.012
 shakes+0.010
 smiles+0.010
Neg logit contributions
keeping-0.026
-0.021
handed-0.021
 arrest-0.020
 rubble-0.020
Token their
Token ID511
Feature activation-2.36e+00

Pos logit contributions
keeping+0.040
+0.035
handed+0.035
ache+0.034
hell+0.034
Neg logit contributions
 nods-0.008
 nod-0.006
 nodded-0.005
 smiles-0.004
 shakes-0.003
Token natural
Token ID3288
Feature activation-7.79e-01

Pos logit contributions
keeping+0.082
+0.062
uts+0.062
hell+0.060
handed+0.060
Neg logit contributions
 nods-0.068
 nod-0.059
 nodded-0.058
 smiles-0.056
 shakes-0.052
Token habitat
Token ID20018
Feature activation-4.81e+00

Pos logit contributions
keeping+0.017
+0.013
handed+0.013
hell+0.012
ache+0.012
Neg logit contributions
 nods-0.029
 nod-0.026
 nodded-0.026
 smiles-0.025
 shakes-0.025
Token.
Token ID13
Feature activation-8.54e-01

Pos logit contributions
keeping+0.207
+0.185
 arrest+0.181
handed+0.176
 drowned+0.174
Neg logit contributions
 nods-0.075
 nod-0.063
 nodded-0.059
 shakes-0.051
 smiles-0.051
Token She
Token ID1375
Feature activation-4.50e-01

Pos logit contributions
keeping+0.037
 rubble+0.031
 arrest+0.031
handed+0.030
 unatt+0.030
Neg logit contributions
 nods-0.016
 nod-0.015
 nodded-0.014
 shakes-0.011
 smiles-0.010
Token ran
Token ID4966
Feature activation-1.08e+00

Pos logit contributions
keeping+0.015
+0.012
 arrest+0.012
handed+0.012
 drowned+0.011
Neg logit contributions
 nods-0.012
 nod-0.011
 nodded-0.010
 shakes-0.009
ap-0.009
Token around
Token ID1088
Feature activation+1.47e+00

Pos logit contributions
keeping+0.047
+0.040
handed+0.040
 arrest+0.038
 rubble+0.038
Neg logit contributions
 nods-0.019
 nod-0.016
 nodded-0.015
 shakes-0.013
 smiles-0.012
Token the
Token ID262
Feature activation-1.15e+00

Pos logit contributions
 nods+0.034
 nod+0.030
 nodded+0.029
 shakes+0.025
ap+0.021
Neg logit contributions
keeping-0.055
 arrest-0.046
 rubble-0.043
 anywhere-0.043
handed-0.042
Token the
Token ID262
Feature activation-1.47e+00

Pos logit contributions
keeping+0.036
+0.032
handed+0.032
grand+0.031
hell+0.030
Neg logit contributions
 nods-0.006
 nod-0.004
 nodded-0.004
 smiles-0.004
 shakes-0.003
Token bed
Token ID3996
Feature activation-4.27e+00

Pos logit contributions
keeping+0.044
+0.036
handed+0.034
 arrest+0.033
 drowned+0.031
Neg logit contributions
 nods-0.045
 nod-0.042
 nodded-0.040
 shakes-0.038
 smiles-0.036
Token,
Token ID11
Feature activation-1.14e+00

Pos logit contributions
keeping+0.195
+0.170
handed+0.166
 arrest+0.164
 unnecessary+0.163
Neg logit contributions
 nods-0.052
 nod-0.051
 nodded-0.050
 shakes-0.037
ap-0.034
Token in
Token ID287
Feature activation-7.43e-01

Pos logit contributions
keeping+0.042
 arrest+0.034
 unatt+0.034
 rubble+0.034
handed+0.033
Neg logit contributions
 nods-0.027
 nod-0.023
 nodded-0.022
rey-0.018
 shakes-0.018
Token the
Token ID262
Feature activation-9.03e-01

Pos logit contributions
keeping+0.037
+0.032
handed+0.032
grand+0.031
hell+0.031
Neg logit contributions
 nods-0.008
 nod-0.007
 nodded-0.006
 shakes-0.005
 smiles-0.005
Token closet
Token ID21615
Feature activation-4.78e+00

Pos logit contributions
keeping+0.028
+0.022
handed+0.021
 arrest+0.021
 rubble+0.020
Neg logit contributions
 nods-0.027
 nod-0.025
 nodded-0.024
 shakes-0.022
 smiles-0.021
Token.
Token ID13
Feature activation-3.96e-01

Pos logit contributions
keeping+0.219
+0.189
 unnecessary+0.184
handed+0.184
grand+0.182
Neg logit contributions
 nods-0.065
 nodded-0.062
 nod-0.060
 shakes-0.047
 up-0.041
Token She
Token ID1375
Feature activation-9.47e-01

Pos logit contributions
keeping+0.017
 arrest+0.014
handed+0.014
 rubble+0.014
+0.014
Neg logit contributions
 nods-0.007
 nod-0.007
 nodded-0.006
 shakes-0.005
 three-0.005
Token could
Token ID714
Feature activation-7.02e-02

Pos logit contributions
keeping+0.034
+0.028
handed+0.027
 arrest+0.027
 rubble+0.026
Neg logit contributions
 nods-0.023
 nod-0.021
 nodded-0.019
 shakes-0.017
 smiles-0.016
Token not
Token ID407
Feature activation-3.06e-01

Pos logit contributions
keeping+0.002
+0.002
handed+0.002
 arrest+0.002
 rubble+0.002
Neg logit contributions
 nods-0.002
 nod-0.002
 nodded-0.002
 shakes-0.002
 smiles-0.001
Token find
Token ID1064
Feature activation-5.79e-01

Pos logit contributions
keeping+0.012
+0.010
handed+0.010
 arrest+0.010
lor+0.009
Neg logit contributions
 nod-0.006
 nods-0.006
 nodded-0.006
 cl-0.005
 shakes-0.005
Token be
Token ID307
Feature activation+7.84e-01

Pos logit contributions
keeping+0.018
+0.015
handed+0.015
 arrest+0.014
 rubble+0.014
Neg logit contributions
 nods-0.009
 nod-0.008
 nodded-0.007
 shakes-0.006
 smiles-0.006
Token good
Token ID922
Feature activation+5.70e-01

Pos logit contributions
 nods+0.021
 nod+0.019
 nodded+0.018
 shakes+0.017
 smiles+0.016
Neg logit contributions
keeping-0.026
-0.021
handed-0.021
 arrest-0.020
 rubble-0.020
Token friends
Token ID2460
Feature activation-1.59e+00

Pos logit contributions
 nods+0.010
 nod+0.009
 nodded+0.008
 shakes+0.007
 smiles+0.006
Neg logit contributions
keeping-0.025
handed-0.021
-0.021
hell-0.021
 rubble-0.021
Token and
Token ID290
Feature activation-2.37e-01

Pos logit contributions
keeping+0.065
+0.053
handed+0.053
 arrest+0.052
 rubble+0.051
Neg logit contributions
 nods-0.033
 nod-0.029
 nodded-0.027
 shakes-0.023
 smiles-0.022
Token help
Token ID1037
Feature activation-5.87e-01

Pos logit contributions
keeping+0.008
+0.007
handed+0.007
 arrest+0.007
 rubble+0.006
Neg logit contributions
 nods-0.006
 nod-0.005
 nodded-0.005
 shakes-0.005
 smiles-0.004
Token each
Token ID1123
Feature activation+2.53e+00

Pos logit contributions
keeping+0.025
+0.021
handed+0.021
 arrest+0.021
 rubble+0.020
Neg logit contributions
 nods-0.011
 nod-0.009
 nodded-0.009
 shakes-0.007
 smiles-0.007
Token other
Token ID584
Feature activation-3.41e-01

Pos logit contributions
 nods+0.047
 nod+0.043
 nodded+0.039
 shakes+0.034
 smiles+0.032
Neg logit contributions
keeping-0.106
-0.090
handed-0.087
hell-0.083
 arrest-0.083
Token.
Token ID13
Feature activation+7.31e-01

Pos logit contributions
keeping+0.014
+0.012
 arrest+0.011
handed+0.011
 rubble+0.011
Neg logit contributions
 nods-0.007
 nod-0.006
 nodded-0.006
 shakes-0.005
 smiles-0.005
Token They
Token ID1119
Feature activation+3.84e-01

Pos logit contributions
 nods+0.015
 nod+0.013
 nodded+0.012
 shakes+0.010
 smiles+0.010
Neg logit contributions
keeping-0.030
handed-0.025
-0.025
 arrest-0.025
 rubble-0.025
Token learned
Token ID4499
Feature activation-1.27e+00

Pos logit contributions
 nods+0.011
 nod+0.010
 nodded+0.010
 shakes+0.009
 smiles+0.008
Neg logit contributions
keeping-0.012
-0.010
handed-0.010
 arrest-0.009
 rubble-0.009
Token that
Token ID326
Feature activation+2.23e-01

Pos logit contributions
keeping+0.042
+0.033
handed+0.033
 arrest+0.032
 rubble+0.031
Neg logit contributions
 nods-0.036
 nod-0.033
 nodded-0.032
 shakes-0.028
 smiles-0.027
Token you
Token ID345
Feature activation+4.14e-01

Pos logit contributions
 nods+0.008
 nod+0.007
 nodded+0.007
 shakes+0.006
 smiles+0.006
Neg logit contributions
keeping-0.004
 arrest-0.003
handed-0.003
-0.003
hell-0.003
Token!"
Token ID2474
Feature activation+1.21e+00

Pos logit contributions
 nods+0.010
 nod+0.009
 nodded+0.008
 shakes+0.007
 smiles+0.007
Neg logit contributions
keeping-0.016
-0.013
handed-0.012
 arrest-0.012
 rubble-0.012
Token he
Token ID339
Feature activation+1.87e+00

Pos logit contributions
 nods+0.031
 nod+0.027
 nodded+0.025
 shakes+0.024
rey+0.022
Neg logit contributions
keeping-0.043
 arrest-0.034
handed-0.033
-0.032
 rubble-0.031
Token said
Token ID531
Feature activation-1.35e+00

Pos logit contributions
 nods+0.040
 nod+0.035
 shakes+0.030
 nodded+0.030
ap+0.028
Neg logit contributions
keeping-0.071
-0.057
 arrest-0.056
handed-0.056
hell-0.056
Token to
Token ID284
Feature activation+3.40e-01

Pos logit contributions
keeping+0.043
handed+0.033
+0.032
hell+0.032
 arrest+0.031
Neg logit contributions
 nods-0.039
 nod-0.032
 shakes-0.032
 nodded-0.031
rey-0.029
Token all
Token ID477
Feature activation+2.97e+00

Pos logit contributions
 nods+0.008
 nodded+0.007
 nod+0.007
 shakes+0.006
ap+0.005
Neg logit contributions
keeping-0.013
 arrest-0.011
handed-0.010
 rubble-0.010
-0.010
Token the
Token ID262
Feature activation+9.37e-01

Pos logit contributions
 nods+0.061
 nod+0.051
 nodded+0.051
 shakes+0.045
rey+0.038
Neg logit contributions
keeping-0.117
handed-0.094
 rubble-0.094
hell-0.093
 arrest-0.093
Token birds
Token ID10087
Feature activation+1.90e+00

Pos logit contributions
 nods+0.023
 nod+0.020
 nodded+0.019
 shakes+0.017
 smiles+0.016
Neg logit contributions
keeping-0.035
handed-0.028
-0.028
 arrest-0.028
 rubble-0.027
Token.
Token ID13
Feature activation+9.46e-01

Pos logit contributions
 nods+0.049
 nod+0.041
 shakes+0.041
 nodded+0.039
rey+0.037
Neg logit contributions
keeping-0.062
hell-0.049
handed-0.048
 anywhere-0.048
 arrest-0.048
Token The
Token ID383
Feature activation+9.20e-01

Pos logit contributions
 nods+0.025
 nod+0.023
 nodded+0.022
 shakes+0.021
 smiles+0.019
Neg logit contributions
keeping-0.031
-0.025
handed-0.024
 arrest-0.024
uts-0.022
Token lake
Token ID13546
Feature activation-8.10e-02

Pos logit contributions
 nods+0.023
 nod+0.021
 nodded+0.020
 shakes+0.018
 smiles+0.017
Neg logit contributions
keeping-0.033
handed-0.026
-0.026
 arrest-0.026
 rubble-0.025
Token bird
Token ID6512
Feature activation+2.13e+00

Pos logit contributions
 nods+0.006
 nod+0.005
 nodded+0.005
 shakes+0.004
 smiles+0.004
Neg logit contributions
keeping-0.010
-0.008
handed-0.008
 arrest-0.008
 rubble-0.007
Token,
Token ID11
Feature activation+1.27e+00

Pos logit contributions
 nods+0.045
 nod+0.040
 nodded+0.038
 shakes+0.032
 smiles+0.030
Neg logit contributions
keeping-0.087
-0.071
handed-0.071
 arrest-0.070
 rubble-0.069
Token so
Token ID523
Feature activation+1.66e+00

Pos logit contributions
 nods+0.041
 nodded+0.036
 nod+0.035
 shakes+0.033
rey+0.033
Neg logit contributions
keeping-0.034
-0.027
hell-0.025
 arrest-0.025
 anywhere-0.025
Token he
Token ID339
Feature activation+1.04e+00

Pos logit contributions
 nods+0.047
 nod+0.043
 nodded+0.039
 shakes+0.038
 smiles+0.036
Neg logit contributions
keeping-0.051
-0.041
 arrest-0.040
handed-0.040
hell-0.038
Token climbed
Token ID19952
Feature activation+1.55e+00

Pos logit contributions
 nods+0.029
 nod+0.027
ap+0.023
 shakes+0.022
ags+0.022
Neg logit contributions
keeping-0.030
-0.025
 drowned-0.024
 arrest-0.023
 recovered-0.023
Token up
Token ID510
Feature activation+2.59e+00

Pos logit contributions
 nods+0.037
 nod+0.033
 nodded+0.033
ap+0.027
 shakes+0.023
Neg logit contributions
keeping-0.058
 anywhere-0.047
 arrest-0.046
ork-0.043
-0.042
Token a
Token ID257
Feature activation-1.07e+00

Pos logit contributions
 nods+0.048
 nod+0.041
 nodded+0.039
 shakes+0.032
 smiles+0.028
Neg logit contributions
keeping-0.112
 arrest-0.092
handed-0.091
-0.091
 rubble-0.089
Token tree
Token ID5509
Feature activation+7.11e-01

Pos logit contributions
keeping+0.041
+0.033
handed+0.032
 arrest+0.031
 endings+0.030
Neg logit contributions
 nods-0.025
 nod-0.024
 nodded-0.022
 shakes-0.020
 cl-0.019
Token and
Token ID290
Feature activation+4.47e-01

Pos logit contributions
 nods+0.018
 nod+0.015
 nodded+0.015
 shakes+0.013
 smiles+0.013
Neg logit contributions
keeping-0.026
handed-0.021
 arrest-0.021
-0.020
 rubble-0.020
Token called
Token ID1444
Feature activation-9.00e-01

Pos logit contributions
 nods+0.012
 nod+0.010
 nodded+0.010
ap+0.009
 shakes+0.009
Neg logit contributions
keeping-0.015
 arrest-0.012
-0.012
handed-0.012
 rubble-0.011
Token out
Token ID503
Feature activation+2.51e+00

Pos logit contributions
keeping+0.027
 anywhere+0.023
 jail+0.021
 arrest+0.020
 anymore+0.020
Neg logit contributions
 nods-0.024
 nodded-0.023
rey-0.019
 shakes-0.019
 nod-0.018
Token.
Token ID13
Feature activation+5.25e-02

Pos logit contributions
 nods+0.000
 nod+0.000
 nodded+0.000
 shakes+0.000
 smiles+0.000
Neg logit contributions
keeping-0.000
-0.000
handed-0.000
 arrest-0.000
 rubble-0.000
Token She
Token ID1375
Feature activation-3.75e-01

Pos logit contributions
 nods+0.001
 nod+0.001
 nodded+0.001
 shakes+0.001
 smiles+0.001
Neg logit contributions
keeping-0.002
-0.001
handed-0.001
 arrest-0.001
 rubble-0.001
Token watched
Token ID7342
Feature activation+1.42e+00

Pos logit contributions
keeping+0.013
+0.010
handed+0.010
 arrest+0.010
 rubble+0.009
Neg logit contributions
 nods-0.010
 nod-0.009
 nodded-0.009
 shakes-0.008
 smiles-0.008
Token the
Token ID262
Feature activation+9.06e-01

Pos logit contributions
 nods+0.037
 nodded+0.031
 nod+0.029
 shakes+0.026
 agrees+0.024
Neg logit contributions
keeping-0.045
 arrest-0.040
 anymore-0.039
 unatt-0.037
 anywhere-0.037
Token k
Token ID479
Feature activation+7.54e-01

Pos logit contributions
 nods+0.022
 nod+0.020
 nodded+0.020
 shakes+0.016
ap+0.016
Neg logit contributions
keeping-0.030
 arrest-0.026
handed-0.025
-0.025
 drowned-0.024
Tokenang
Token ID648
Feature activation+4.19e+00

Pos logit contributions
 nods+0.021
 nod+0.020
 nodded+0.019
 shakes+0.017
 smiles+0.017
Neg logit contributions
keeping-0.022
 arrest-0.020
-0.019
handed-0.019
uts-0.018
Tokenaroo
Token ID38049
Feature activation+2.55e+00

Pos logit contributions
 nods+0.134
 nod+0.118
 smiles+0.113
 nodded+0.113
 shakes+0.109
Neg logit contributions
keeping-0.091
 arrest-0.078
 anywhere-0.075
handed-0.072
hell-0.071
Token from
Token ID422
Feature activation+3.25e-01

Pos logit contributions
 nods+0.060
 nodded+0.052
 nod+0.049
 shakes+0.045
 smiles+0.041
Neg logit contributions
keeping-0.092
handed-0.075
 arrest-0.075
-0.073
 rubble-0.073
Token afar
Token ID44246
Feature activation-4.55e-01

Pos logit contributions
 nods+0.007
 nod+0.006
 nodded+0.006
ap+0.005
 shakes+0.005
Neg logit contributions
keeping-0.013
 arrest-0.010
 rubble-0.010
handed-0.010
 unatt-0.010
Token and
Token ID290
Feature activation+5.92e-02

Pos logit contributions
keeping+0.019
+0.016
handed+0.016
 arrest+0.016
 rubble+0.015
Neg logit contributions
 nods-0.009
 nod-0.008
 nodded-0.007
 shakes-0.006
 smiles-0.006
Token imagined
Token ID15758
Feature activation-4.50e-01

Pos logit contributions
 nods+0.001
 nod+0.001
 nodded+0.001
 shakes+0.001
ap+0.001
Neg logit contributions
keeping-0.002
-0.002
 arrest-0.002
handed-0.002
 rubble-0.002
Token he
Token ID339
Feature activation-4.54e-01

Pos logit contributions
keeping+0.005
+0.004
handed+0.004
 arrest+0.004
 rubble+0.004
Neg logit contributions
 nods-0.004
 nod-0.003
 nodded-0.003
 shakes-0.003
 smiles-0.003
Token was
Token ID373
Feature activation+5.89e-02

Pos logit contributions
keeping+0.013
+0.011
handed+0.010
 drowned+0.010
 arrest+0.010
Neg logit contributions
 nods-0.013
 nod-0.012
 nodded-0.011
 shakes-0.011
 smiles-0.010
Token determined
Token ID5295
Feature activation+6.58e-01

Pos logit contributions
 nods+0.002
 nod+0.001
 nodded+0.001
 shakes+0.001
ap+0.001
Neg logit contributions
keeping-0.002
handed-0.001
-0.001
 drowned-0.001
 anywhere-0.001
Token to
Token ID284
Feature activation-9.17e-01

Pos logit contributions
 nods+0.016
 nod+0.014
 nodded+0.014
 shakes+0.012
 smiles+0.012
Neg logit contributions
keeping-0.024
-0.019
handed-0.019
 arrest-0.019
 rubble-0.018
Token make
Token ID787
Feature activation+9.04e-01

Pos logit contributions
keeping+0.033
+0.026
handed+0.026
 arrest+0.025
 rubble+0.025
Neg logit contributions
 nods-0.023
 nod-0.021
 nodded-0.020
 shakes-0.018
 smiles-0.017
Token it
Token ID340
Feature activation+2.46e+00

Pos logit contributions
 nods+0.018
 nod+0.016
 nodded+0.015
 shakes+0.013
 smiles+0.012
Neg logit contributions
keeping-0.037
-0.031
handed-0.031
 arrest-0.030
 rubble-0.029
Token work
Token ID670
Feature activation-2.72e-01

Pos logit contributions
 nods+0.055
 nod+0.050
 nodded+0.047
 shakes+0.041
 smiles+0.039
Neg logit contributions
keeping-0.095
-0.077
handed-0.076
 arrest-0.076
 rubble-0.073
Token!
Token ID0
Feature activation+5.15e-01

Pos logit contributions
keeping+0.010
+0.008
handed+0.008
 arrest+0.008
 rubble+0.008
Neg logit contributions
 nods-0.006
 nod-0.006
 nodded-0.005
 shakes-0.005
 smiles-0.005
Token
Token ID220
Feature activation+7.96e-01

Pos logit contributions
 nods+0.012
 nod+0.010
 nodded+0.010
 shakes+0.009
 smiles+0.008
Neg logit contributions
keeping-0.020
-0.016
handed-0.016
 arrest-0.016
 rubble-0.016
Token
Token ID198
Feature activation-1.01e-01

Pos logit contributions
 nods+0.020
 nod+0.018
 nodded+0.018
 shakes+0.016
 smiles+0.014
Neg logit contributions
keeping-0.028
-0.023
handed-0.023
 arrest-0.022
hell-0.021
Token
Token ID198
Feature activation-7.76e-01

Pos logit contributions
keeping+0.004
+0.003
handed+0.003
 arrest+0.003
hell+0.003
Neg logit contributions
 nods-0.002
 nod-0.002
 nodded-0.002
 shakes-0.002
 smiles-0.002
Token could
Token ID714
Feature activation-9.72e-01

Pos logit contributions
 nods+0.004
 nod+0.004
 nodded+0.004
 shakes+0.003
 smiles+0.003
Neg logit contributions
keeping-0.005
handed-0.004
-0.004
 arrest-0.003
 rubble-0.003
Token.
Token ID13
Feature activation+4.92e-01

Pos logit contributions
keeping+0.035
+0.028
handed+0.028
 arrest+0.028
 rubble+0.026
Neg logit contributions
 nods-0.024
 nod-0.022
 nodded-0.021
 shakes-0.019
 smiles-0.018
Token
Token ID198
Feature activation+3.09e-01

Pos logit contributions
 nods+0.011
 nod+0.010
 nodded+0.009
 shakes+0.008
 smiles+0.007
Neg logit contributions
keeping-0.019
 arrest-0.016
handed-0.016
-0.016
 rubble-0.016
Token
Token ID220
Feature activation+1.27e+00

Pos logit contributions
 nods+0.007
 nod+0.006
 nodded+0.006
 shakes+0.005
 smiles+0.004
Neg logit contributions
keeping-0.012
-0.010
handed-0.010
 arrest-0.010
 rubble-0.010
Token
Token ID198
Feature activation-5.27e-01

Pos logit contributions
 nods+0.030
 nod+0.026
 nodded+0.026
 shakes+0.022
 smiles+0.021
Neg logit contributions
keeping-0.048
-0.039
handed-0.039
 arrest-0.038
hell-0.036
TokenHe
Token ID1544
Feature activation+1.04e+00

Pos logit contributions
keeping+0.019
+0.015
handed+0.015
 arrest+0.015
 rubble+0.015
Neg logit contributions
 nods-0.013
 nod-0.012
 nodded-0.012
 shakes-0.010
 smiles-0.010
Token stayed
Token ID9658
Feature activation-2.82e-01

Pos logit contributions
 nods+0.027
 nod+0.025
 nodded+0.021
 shakes+0.021
ap+0.020
Neg logit contributions
keeping-0.034
-0.027
 arrest-0.026
 drowned-0.026
handed-0.025
Token there
Token ID612
Feature activation-1.69e-01

Pos logit contributions
keeping+0.009
 anywhere+0.007
 unatt+0.007
 arrest+0.007
handed+0.007
Neg logit contributions
 nods-0.007
 nod-0.006
 nodded-0.006
 shakes-0.006
 smiles-0.005
Token for
Token ID329
Feature activation-4.87e-01

Pos logit contributions
keeping+0.005
 anywhere+0.004
 unatt+0.004
 arrest+0.004
handed+0.004
Neg logit contributions
 nods-0.005
 nod-0.004
 nodded-0.004
 shakes-0.004
 smiles-0.003
Token a
Token ID257
Feature activation-1.53e+00

Pos logit contributions
keeping+0.022
+0.019
uts+0.018
handed+0.018
grand+0.018
Neg logit contributions
 nods-0.006
 nod-0.006
 nodded-0.005
 three-0.005
 smiles-0.005
Token long
Token ID890
Feature activation+2.87e-01

Pos logit contributions
keeping+0.059
handed+0.046
+0.045
 arrest+0.044
 rubble+0.042
Neg logit contributions
 nod-0.036
 nods-0.035
 nodded-0.033
 three-0.028
 cl-0.028
Token"
Token ID1
Feature activation-7.39e-01

Pos logit contributions
keeping+0.047
handed+0.039
+0.039
 arrest+0.039
hell+0.036
Neg logit contributions
 nods-0.030
 nod-0.026
 nodded-0.025
 shakes-0.023
 smiles-0.022
TokenI
Token ID40
Feature activation+9.93e-01

Pos logit contributions
keeping+0.029
+0.024
handed+0.024
 arrest+0.024
 rubble+0.023
Neg logit contributions
 nods-0.016
 nod-0.014
 nodded-0.014
 shakes-0.012
 smiles-0.011
Token'm
Token ID1101
Feature activation+1.11e+00

Pos logit contributions
 nods+0.024
 nod+0.022
 nodded+0.021
 shakes+0.019
 smiles+0.018
Neg logit contributions
keeping-0.038
handed-0.030
 rubble-0.029
-0.029
 arrest-0.028
Token counting
Token ID14143
Feature activation+1.72e+00

Pos logit contributions
 nods+0.031
 nod+0.027
 nodded+0.027
 shakes+0.024
 smiles+0.022
Neg logit contributions
keeping-0.037
-0.030
 arrest-0.029
handed-0.029
 rubble-0.028
Token the
Token ID262
Feature activation-6.16e-01

Pos logit contributions
 nods+0.043
 nod+0.040
 nodded+0.039
 shakes+0.031
rey+0.031
Neg logit contributions
keeping-0.060
-0.048
 arrest-0.047
 rubble-0.046
 anywhere-0.046
Token planets
Token ID14705
Feature activation+1.35e+00

Pos logit contributions
keeping+0.022
+0.018
handed+0.018
 arrest+0.017
 rubble+0.017
Neg logit contributions
 nods-0.015
 nod-0.014
 nodded-0.013
 shakes-0.012
 smiles-0.011
Token in
Token ID287
Feature activation-7.24e-01

Pos logit contributions
 nods+0.034
 nod+0.030
 nodded+0.029
 shakes+0.025
 smiles+0.024
Neg logit contributions
keeping-0.048
 arrest-0.038
handed-0.038
-0.038
 rubble-0.037
Token the
Token ID262
Feature activation-7.74e-01

Pos logit contributions
keeping+0.034
+0.029
handed+0.029
hell+0.028
 arrest+0.027
Neg logit contributions
 nods-0.010
 nod-0.008
 nodded-0.007
 shakes-0.006
 smiles-0.006
Token sky
Token ID6766
Feature activation+3.03e-01

Pos logit contributions
keeping+0.030
+0.024
handed+0.024
 arrest+0.024
hell+0.022
Neg logit contributions
 nods-0.018
 nod-0.016
 nodded-0.015
 shakes-0.014
 smiles-0.013
Token,"
Token ID553
Feature activation-6.50e-01

Pos logit contributions
 nods+0.009
 nod+0.008
 nodded+0.008
 shakes+0.007
 smiles+0.007
Neg logit contributions
keeping-0.010
-0.008
handed-0.007
 arrest-0.007
hell-0.007
Token the
Token ID262
Feature activation+6.59e-01

Pos logit contributions
keeping+0.019
 arrest+0.016
grand+0.016
 Bun+0.016
 rubble+0.015
Neg logit contributions
 nods-0.015
 nod-0.013
 shakes-0.012
ap-0.012
ags-0.012
Token bark
Token ID21405
Feature activation-9.18e-02

Pos logit contributions
keeping+0.008
+0.006
handed+0.006
 arrest+0.006
 rubble+0.006
Neg logit contributions
 nods-0.006
 nod-0.005
 nodded-0.005
 shakes-0.004
 smiles-0.004
Token.
Token ID13
Feature activation-4.97e-01

Pos logit contributions
keeping+0.003
+0.003
handed+0.003
hell+0.002
 arrest+0.002
Neg logit contributions
 nods-0.002
 nod-0.002
 nodded-0.002
 shakes-0.002
 smiles-0.002
Token It
Token ID632
Feature activation+6.55e-01

Pos logit contributions
keeping+0.019
handed+0.015
+0.015
 arrest+0.015
 rubble+0.014
Neg logit contributions
 nods-0.012
 nod-0.010
 nodded-0.010
 shakes-0.009
 smiles-0.008
Token was
Token ID373
Feature activation-4.56e-01

Pos logit contributions
 nods+0.017
 nod+0.015
 nodded+0.014
ap+0.013
 shakes+0.013
Neg logit contributions
keeping-0.020
-0.018
 drowned-0.017
handed-0.016
 rubble-0.016
Token Max
Token ID5436
Feature activation-2.41e-01

Pos logit contributions
keeping+0.017
+0.014
 anywhere+0.013
 rubble+0.013
 arrest+0.013
Neg logit contributions
 nods-0.011
 nod-0.009
 nodded-0.009
 shakes-0.008
ap-0.007
Token!
Token ID0
Feature activation+2.12e-01

Pos logit contributions
keeping+0.009
+0.007
handed+0.007
 arrest+0.007
 rubble+0.007
Neg logit contributions
 nods-0.006
 nod-0.005
 nodded-0.005
 shakes-0.004
 smiles-0.004
Token He
Token ID679
Feature activation+1.06e-01

Pos logit contributions
 nods+0.004
 nod+0.004
 nodded+0.004
 shakes+0.003
 smiles+0.003
Neg logit contributions
keeping-0.009
handed-0.007
 arrest-0.007
-0.007
 unatt-0.007
Token had
Token ID550
Feature activation-8.02e-01

Pos logit contributions
 nods+0.003
 nod+0.003
ap+0.002
ags+0.002
 nodded+0.002
Neg logit contributions
keeping-0.003
 drowned-0.003
-0.003
 arrest-0.003
 recovered-0.002
Token been
Token ID587
Feature activation+2.66e-01

Pos logit contributions
keeping+0.028
handed+0.023
+0.022
hell+0.021
grand+0.021
Neg logit contributions
 nods-0.021
 nodded-0.019
 nod-0.018
 smiles-0.017
 shakes-0.017
Token stuck
Token ID7819
Feature activation+1.76e-01

Pos logit contributions
 nods+0.007
 nod+0.006
 nodded+0.006
 shakes+0.005
 smiles+0.005
Neg logit contributions
keeping-0.009
-0.007
handed-0.007
 arrest-0.007
 rubble-0.007
Token in
Token ID287
Feature activation-4.14e-01

Pos logit contributions
 nods+0.004
 nod+0.003
 nodded+0.003
 shakes+0.002
 smiles+0.002
Neg logit contributions
keeping-0.007
handed-0.006
-0.006
 arrest-0.006
 unatt-0.006
Tokenmy
Token ID1820
Feature activation+9.88e-01

Pos logit contributions
keeping+0.025
handed+0.020
+0.020
 arrest+0.019
 unatt+0.019
Neg logit contributions
 nods-0.017
 nod-0.015
 nodded-0.015
 shakes-0.013
 smiles-0.013
Token in
Token ID287
Feature activation+4.78e-01

Pos logit contributions
 nods+0.024
 nodded+0.020
 nod+0.020
 shakes+0.018
 smiles+0.017
Neg logit contributions
keeping-0.036
handed-0.030
-0.029
 arrest-0.028
 unatt-0.028
Token the
Token ID262
Feature activation+9.11e-01

Pos logit contributions
 nods+0.008
 nod+0.007
 nodded+0.007
 shakes+0.005
 smiles+0.004
Neg logit contributions
keeping-0.021
 arrest-0.018
-0.018
handed-0.018
 rubble-0.017
Token kitchen
Token ID9592
Feature activation-9.27e-01

Pos logit contributions
 nods+0.025
 nod+0.023
 nodded+0.022
 shakes+0.020
 smiles+0.018
Neg logit contributions
keeping-0.031
-0.025
handed-0.024
 arrest-0.024
hell-0.023
Token.
Token ID13
Feature activation-4.31e-01

Pos logit contributions
keeping+0.032
handed+0.026
+0.025
 arrest+0.025
 rubble+0.024
Neg logit contributions
 nods-0.024
 nod-0.022
 nodded-0.021
 shakes-0.019
 smiles-0.018
Token One
Token ID1881
Feature activation+8.06e-01

Pos logit contributions
keeping+0.013
+0.011
handed+0.010
 arrest+0.010
hell+0.010
Neg logit contributions
 nods-0.012
 nodded-0.011
 nod-0.011
 shakes-0.010
 smiles-0.009
Token day
Token ID1110
Feature activation+3.40e-01

Pos logit contributions
 nods+0.028
 nodded+0.024
 nod+0.024
 shakes+0.023
 smiles+0.022
Neg logit contributions
keeping-0.019
grand-0.015
handed-0.015
-0.014
 rubble-0.014
Token,
Token ID11
Feature activation-4.33e-02

Pos logit contributions
 nods+0.008
 nod+0.007
 nodded+0.007
 shakes+0.006
 smiles+0.005
Neg logit contributions
keeping-0.013
handed-0.011
-0.011
 arrest-0.010
 rubble-0.010
Token her
Token ID607
Feature activation-3.96e-01

Pos logit contributions
keeping+0.001
+0.001
 arrest+0.001
handed+0.001
 rubble+0.001
Neg logit contributions
 nods-0.001
 nod-0.001
 nodded-0.001
 shakes-0.001
 smiles-0.001
Token mom
Token ID1995
Feature activation-2.09e+00

Pos logit contributions
keeping+0.015
+0.013
 arrest+0.012
handed+0.012
 rubble+0.012
Neg logit contributions
 nods-0.009
 nod-0.008
 nodded-0.007
 shakes-0.006
 smiles-0.006
Tokenmy
Token ID1820
Feature activation-5.82e-01

Pos logit contributions
keeping+0.075
+0.061
handed+0.060
 arrest+0.058
 rubble+0.056
Neg logit contributions
 nods-0.052
 nod-0.046
 nodded-0.043
 shakes-0.039
 smiles-0.037
TokenThe
Token ID464
Feature activation+8.70e-01

Pos logit contributions
keeping+0.043
handed+0.036
+0.035
 arrest+0.035
hell+0.033
Neg logit contributions
 nods-0.034
 nod-0.030
 nodded-0.028
 shakes-0.027
 smiles-0.025
Token bird
Token ID6512
Feature activation+1.56e+00

Pos logit contributions
 nods+0.020
 nod+0.018
 nodded+0.017
 shakes+0.015
 smiles+0.014
Neg logit contributions
keeping-0.033
-0.027
handed-0.026
 arrest-0.026
 rubble-0.025
Token replied
Token ID8712
Feature activation-1.29e+00

Pos logit contributions
 nods+0.031
 nod+0.030
ags+0.029
ap+0.027
 fisherman+0.025
Neg logit contributions
keeping-0.046
hell-0.042
 drowned-0.041
 recovered-0.040
ache-0.039
Token,
Token ID11
Feature activation+1.01e+00

Pos logit contributions
keeping+0.030
hell+0.025
ork+0.024
+0.024
 anywhere+0.024
Neg logit contributions
 nods-0.041
 nod-0.037
 shakes-0.035
 nodded-0.035
rey-0.034
Token "
Token ID366
Feature activation-4.83e-01

Pos logit contributions
 nods+0.029
 nod+0.025
 shakes+0.023
rey+0.023
ap+0.022
Neg logit contributions
keeping-0.024
-0.022
uts-0.019
ache-0.019
cel-0.019
TokenMy
Token ID3666
Feature activation+3.44e-01

Pos logit contributions
keeping+0.015
handed+0.013
grand+0.012
+0.012
 unatt+0.012
Neg logit contributions
 nods-0.013
 nod-0.011
 shakes-0.011
 nodded-0.010
ap-0.010
Token wings
Token ID12098
Feature activation+7.47e-02

Pos logit contributions
 nods+0.006
 nodded+0.005
 nod+0.005
 shakes+0.004
 smiles+0.004
Neg logit contributions
keeping-0.014
-0.012
 arrest-0.012
handed-0.012
hell-0.011
Token are
Token ID389
Feature activation-4.81e-01

Pos logit contributions
 nods+0.002
 nod+0.002
 nodded+0.002
 shakes+0.001
 smiles+0.001
Neg logit contributions
keeping-0.003
-0.002
handed-0.002
 arrest-0.002
hell-0.002
Token too
Token ID1165
Feature activation-4.27e-01

Pos logit contributions
keeping+0.016
+0.012
handed+0.012
 arrest+0.012
 rubble+0.011
Neg logit contributions
 nods-0.014
 nod-0.013
 nodded-0.012
 shakes-0.011
 smiles-0.011
Token tired
Token ID10032
Feature activation+8.45e-02

Pos logit contributions
keeping+0.016
+0.013
handed+0.013
 arrest+0.013
 rubble+0.012
Neg logit contributions
 nods-0.010
 nod-0.009
 nodded-0.009
 shakes-0.008
 smiles-0.007
Token from
Token ID422
Feature activation-1.93e-02

Pos logit contributions
 nods+0.001
 nod+0.001
 nodded+0.001
 shakes+0.001
 up+0.001
Neg logit contributions
keeping-0.004
-0.003
handed-0.003
 arrest-0.003
 rubble-0.003
Token Lily
Token ID20037
Feature activation-1.01e+00

Pos logit contributions
 nods+0.006
 nod+0.005
 nodded+0.005
 shakes+0.004
rey+0.004
Neg logit contributions
keeping-0.010
 arrest-0.008
grand-0.008
-0.007
handed-0.007
Token were
Token ID547
Feature activation+7.50e-01

Pos logit contributions
keeping+0.035
handed+0.027
 arrest+0.027
+0.027
 rubble+0.026
Neg logit contributions
 nods-0.026
 nod-0.023
 nodded-0.022
 shakes-0.020
 smiles-0.019
Token playing
Token ID2712
Feature activation+1.64e+00

Pos logit contributions
 nods+0.021
 nod+0.019
 nodded+0.019
 shakes+0.017
rey+0.015
Neg logit contributions
keeping-0.023
-0.019
 arrest-0.019
handed-0.019
 unatt-0.018
Token in
Token ID287
Feature activation+3.22e-02

Pos logit contributions
 nods+0.045
 nod+0.037
 nodded+0.037
 shakes+0.035
 smiles+0.033
Neg logit contributions
keeping-0.050
 arrest-0.044
 unatt-0.042
handed-0.042
 anywhere-0.041
Token the
Token ID262
Feature activation+8.01e-01

Pos logit contributions
 nods+0.001
 nodded+0.001
 nod+0.001
 shakes+0.001
apped+0.000
Neg logit contributions
keeping-0.001
 arrest-0.001
handed-0.001
 rubble-0.001
-0.001
Token park
Token ID3952
Feature activation-8.90e-01

Pos logit contributions
 nods+0.023
 nod+0.021
 nodded+0.021
 shakes+0.019
 smiles+0.018
Neg logit contributions
keeping-0.025
handed-0.020
 arrest-0.020
-0.019
 rubble-0.019
Token.
Token ID13
Feature activation-5.79e-01

Pos logit contributions
keeping+0.025
handed+0.019
 arrest+0.018
+0.018
 unatt+0.017
Neg logit contributions
 nods-0.029
 nod-0.025
 nodded-0.025
 shakes-0.024
 smiles-0.023
Token They
Token ID1119
Feature activation-9.88e-02

Pos logit contributions
keeping+0.021
+0.017
handed+0.016
 arrest+0.016
 rubble+0.016
Neg logit contributions
 nods-0.014
 nod-0.013
 nodded-0.012
 shakes-0.011
 smiles-0.010
Token saw
Token ID2497
Feature activation+6.96e-01

Pos logit contributions
keeping+0.003
 arrest+0.002
+0.002
 rubble+0.002
 anywhere+0.002
Neg logit contributions
 nods-0.003
 nod-0.003
 shakes-0.002
 nodded-0.002
rey-0.002
Token a
Token ID257
Feature activation-7.05e-01

Pos logit contributions
 nods+0.015
 nod+0.014
 nodded+0.013
 shakes+0.011
rey+0.011
Neg logit contributions
keeping-0.025
-0.021
handed-0.021
 rubble-0.021
 arrest-0.020
Token giant
Token ID6175
Feature activation+1.67e+00

Pos logit contributions
keeping+0.025
+0.020
handed+0.020
 arrest+0.020
 rubble+0.019
Neg logit contributions
 nods-0.018
 nod-0.016
 nodded-0.015
 shakes-0.014
 smiles-0.013
Token fear
Token ID3252
Feature activation-1.47e+00

Pos logit contributions
 nods+0.006
 nod+0.005
 nodded+0.005
 shakes+0.004
 smiles+0.004
Neg logit contributions
keeping-0.014
handed-0.011
-0.011
 arrest-0.011
 rubble-0.011
Token,
Token ID11
Feature activation-5.55e-01

Pos logit contributions
keeping+0.055
+0.045
handed+0.044
 arrest+0.044
 rubble+0.042
Neg logit contributions
 nods-0.035
 nod-0.031
 nodded-0.030
 shakes-0.026
 smiles-0.024
Token not
Token ID407
Feature activation-9.08e-01

Pos logit contributions
keeping+0.019
+0.015
 arrest+0.015
handed+0.015
 rubble+0.014
Neg logit contributions
 nods-0.015
 nod-0.013
 nodded-0.012
 shakes-0.011
rey-0.010
Token knowing
Token ID6970
Feature activation-1.69e+00

Pos logit contributions
keeping+0.033
+0.027
handed+0.027
 arrest+0.026
 rubble+0.025
Neg logit contributions
 nods-0.022
 nod-0.020
 nodded-0.019
 shakes-0.017
 smiles-0.016
Token what
Token ID644
Feature activation-4.69e-01

Pos logit contributions
keeping+0.061
+0.049
handed+0.049
 arrest+0.048
 rubble+0.046
Neg logit contributions
 nods-0.042
 nod-0.038
 nodded-0.037
 shakes-0.033
 smiles-0.032
Token to
Token ID284
Feature activation-2.44e+00

Pos logit contributions
keeping+0.016
+0.012
handed+0.012
 arrest+0.012
 rubble+0.012
Neg logit contributions
 nods-0.013
 nod-0.012
 nodded-0.011
 shakes-0.010
ap-0.009
Token do
Token ID466
Feature activation-4.41e-01

Pos logit contributions
keeping+0.096
handed+0.078
+0.078
 arrest+0.077
hell+0.074
Neg logit contributions
 nods-0.053
 nod-0.045
 nodded-0.045
 shakes-0.038
 smiles-0.036
Token.
Token ID13
Feature activation-1.56e-01

Pos logit contributions
keeping+0.017
+0.014
handed+0.014
 arrest+0.014
 rubble+0.014
Neg logit contributions
 nods-0.009
 nod-0.008
 nodded-0.008
 shakes-0.007
 smiles-0.007
Token
Token ID198
Feature activation-2.43e-02

Pos logit contributions
keeping+0.005
+0.004
handed+0.004
 arrest+0.004
 rubble+0.004
Neg logit contributions
 nods-0.004
 nod-0.004
 nodded-0.004
 shakes-0.003
rey-0.003
Token
Token ID198
Feature activation-8.15e-01

Pos logit contributions
keeping+0.001
handed+0.001
+0.001
 arrest+0.001
hell+0.001
Neg logit contributions
 nods-0.001
 nod-0.001
 nodded-0.001
 shakes-0.000
 smiles-0.000
TokenThe
Token ID464
Feature activation+1.71e-01

Pos logit contributions
keeping+0.024
handed+0.021
+0.019
 arrest+0.019
ork+0.018
Neg logit contributions
 nods-0.024
 nod-0.022
 shakes-0.021
 nodded-0.020
 smiles-0.020
Token and
Token ID290
Feature activation+3.10e-01

Pos logit contributions
keeping+0.036
handed+0.027
+0.026
 arrest+0.026
 unatt+0.025
Neg logit contributions
 nods-0.030
 nod-0.027
 nodded-0.026
 shakes-0.024
rey-0.023
Token find
Token ID1064
Feature activation-1.99e-01

Pos logit contributions
 nods+0.008
 nodded+0.007
 nod+0.007
rey+0.007
 shakes+0.006
Neg logit contributions
keeping-0.010
 arrest-0.008
 anywhere-0.007
-0.007
handed-0.007
Token a
Token ID257
Feature activation-1.98e+00

Pos logit contributions
keeping+0.007
handed+0.006
 rubble+0.006
+0.006
 anywhere+0.005
Neg logit contributions
 nods-0.005
 nod-0.004
 nodded-0.004
 shakes-0.004
rey-0.003
Token nice
Token ID3621
Feature activation-6.51e-01

Pos logit contributions
keeping+0.070
+0.056
handed+0.056
 arrest+0.055
 rubble+0.053
Neg logit contributions
 nods-0.050
 nod-0.045
 nodded-0.044
 shakes-0.039
 smiles-0.037
Token spot
Token ID4136
Feature activation+7.72e-01

Pos logit contributions
keeping+0.019
+0.014
handed+0.014
 arrest+0.014
hell+0.014
Neg logit contributions
 nods-0.020
 nodded-0.018
 nod-0.018
 shakes-0.016
 smiles-0.015
Token under
Token ID739
Feature activation-9.08e-01

Pos logit contributions
 nods+0.022
 nod+0.019
 nodded+0.018
 shakes+0.016
rey+0.016
Neg logit contributions
keeping-0.027
handed-0.020
-0.020
 anywhere-0.020
 unatt-0.019
Token a
Token ID257
Feature activation-3.25e+00

Pos logit contributions
keeping+0.043
+0.036
handed+0.036
 arrest+0.035
 rubble+0.035
Neg logit contributions
 nods-0.013
 nod-0.010
 nodded-0.010
 shakes-0.007
 smiles-0.007
Token tree
Token ID5509
Feature activation-6.51e-01

Pos logit contributions
keeping+0.121
+0.095
handed+0.093
 arrest+0.092
grand+0.087
Neg logit contributions
 nods-0.081
 nod-0.074
 nodded-0.069
 shakes-0.062
 smiles-0.059
Token.
Token ID13
Feature activation-5.92e-01

Pos logit contributions
keeping+0.024
+0.020
handed+0.019
 arrest+0.019
 rubble+0.019
Neg logit contributions
 nods-0.016
 nod-0.014
 nodded-0.013
 shakes-0.012
 smiles-0.011
Token They
Token ID1119
Feature activation+2.94e-02

Pos logit contributions
keeping+0.021
+0.017
handed+0.017
 arrest+0.016
 rubble+0.016
Neg logit contributions
 nods-0.015
 nod-0.014
 nodded-0.013
 shakes-0.012
 smiles-0.011
Token spread
Token ID4104
Feature activation+5.56e-01

Pos logit contributions
 nods+0.001
 nodded+0.001
 nod+0.001
apped+0.001
 shakes+0.001
Neg logit contributions
keeping-0.001
 arrest-0.001
 rubble-0.001
 anywhere-0.001
-0.001
Token too
Token ID1165
Feature activation-1.06e+00

Pos logit contributions
keeping+0.020
 rubble+0.017
 anywhere+0.016
 unatt+0.016
+0.016
Neg logit contributions
 nods-0.019
 nod-0.017
 nodded-0.015
ap-0.014
apped-0.013
Token heavy
Token ID4334
Feature activation+1.25e+00

Pos logit contributions
keeping+0.035
+0.028
handed+0.028
hell+0.027
 rubble+0.027
Neg logit contributions
 nods-0.029
 nod-0.026
 nodded-0.025
 shakes-0.022
 smiles-0.021
Token for
Token ID329
Feature activation-5.81e-01

Pos logit contributions
 nods+0.024
 nod+0.021
 nodded+0.020
 shakes+0.018
 smiles+0.016
Neg logit contributions
keeping-0.051
-0.043
handed-0.043
 arrest-0.042
 rubble-0.041
Token her
Token ID607
Feature activation-2.05e+00

Pos logit contributions
keeping+0.027
+0.023
handed+0.023
 arrest+0.022
grand+0.022
Neg logit contributions
 nods-0.008
 nod-0.007
 nodded-0.006
 shakes-0.005
 smiles-0.005
Token.
Token ID13
Feature activation-3.70e-01

Pos logit contributions
keeping+0.089
handed+0.078
+0.076
 rubble+0.075
 arrest+0.074
Neg logit contributions
 nods-0.031
 nod-0.030
 nodded-0.027
 shakes-0.022
 three-0.020
Token She
Token ID1375
Feature activation-7.93e-01

Pos logit contributions
keeping+0.012
+0.010
handed+0.010
 arrest+0.010
 rubble+0.009
Neg logit contributions
 nods-0.010
 nod-0.009
 nodded-0.008
 shakes-0.008
 smiles-0.007
Token tried
Token ID3088
Feature activation+5.44e-01

Pos logit contributions
keeping+0.024
 drowned+0.019
 arrest+0.019
+0.019
 rubble+0.019
Neg logit contributions
 nods-0.022
 nod-0.021
 nodded-0.018
ap-0.017
 shakes-0.016
Token to
Token ID284
Feature activation+5.55e-01

Pos logit contributions
 nods+0.013
 nod+0.012
 nodded+0.011
 shakes+0.010
 smiles+0.009
Neg logit contributions
keeping-0.020
-0.016
handed-0.016
 arrest-0.016
 rubble-0.015
Token pull
Token ID2834
Feature activation+9.06e-01

Pos logit contributions
 nods+0.014
 nodded+0.012
 nod+0.012
 shakes+0.010
 smiles+0.010
Neg logit contributions
keeping-0.020
 arrest-0.016
handed-0.016
-0.016
 rubble-0.016
Token it
Token ID340
Feature activation+4.91e-01

Pos logit contributions
 nods+0.018
 nod+0.016
 nodded+0.015
 shakes+0.013
 smiles+0.012
Neg logit contributions
keeping-0.037
-0.031
handed-0.031
 arrest-0.030
 rubble-0.030
Token out
Token ID503
Feature activation+1.16e+00

Pos logit contributions
 nods+0.007
 nod+0.007
 up+0.006
 nodded+0.005
 shakes+0.005
Neg logit contributions
keeping-0.023
handed-0.020
-0.019
 arrest-0.019
grand-0.019
Token I
Token ID314
Feature activation+4.11e-01

Pos logit contributions
keeping+0.028
+0.023
handed+0.023
 arrest+0.022
 rubble+0.022
Neg logit contributions
 nods-0.021
 nod-0.019
 nodded-0.018
 shakes-0.016
 smiles-0.015
Token forgive
Token ID20927
Feature activation-8.73e-02

Pos logit contributions
 nods+0.011
 nod+0.010
 nodded+0.009
 shakes+0.008
 smiles+0.008
Neg logit contributions
keeping-0.015
-0.012
handed-0.012
 arrest-0.012
 rubble-0.011
Token you
Token ID345
Feature activation-1.57e-01

Pos logit contributions
keeping+0.002
 arrest+0.002
+0.002
handed+0.002
 anywhere+0.002
Neg logit contributions
 nods-0.003
 nodded-0.003
 nod-0.003
 shakes-0.002
apped-0.002
Token.
Token ID13
Feature activation-8.64e-01

Pos logit contributions
keeping+0.006
+0.005
handed+0.005
 arrest+0.005
 rubble+0.005
Neg logit contributions
 nods-0.003
 nod-0.003
 nodded-0.003
 shakes-0.002
 smiles-0.002
Token But
Token ID887
Feature activation-6.93e-02

Pos logit contributions
keeping+0.028
+0.022
 arrest+0.022
handed+0.021
 rubble+0.020
Neg logit contributions
 nods-0.024
 nod-0.022
 nodded-0.021
 shakes-0.019
 smiles-0.018
Token you
Token ID345
Feature activation-7.40e-01

Pos logit contributions
keeping+0.001
 unatt+0.001
 anywhere+0.001
 arrest+0.001
+0.001
Neg logit contributions
 nods-0.002
 nodded-0.002
 nod-0.002
rey-0.002
apped-0.002
Token have
Token ID423
Feature activation-8.07e-01

Pos logit contributions
keeping+0.029
handed+0.023
+0.023
 arrest+0.022
 unatt+0.021
Neg logit contributions
 nods-0.017
 nod-0.017
 nodded-0.015
 shakes-0.014
 smiles-0.013
Token to
Token ID284
Feature activation-4.48e-01

Pos logit contributions
keeping+0.030
+0.024
handed+0.024
 arrest+0.023
 rubble+0.023
Neg logit contributions
 nods-0.020
 nod-0.018
 nodded-0.017
 shakes-0.015
 smiles-0.014
Token be
Token ID307
Feature activation+9.46e-01

Pos logit contributions
keeping+0.018
handed+0.014
+0.014
 arrest+0.014
 rubble+0.014
Neg logit contributions
 nods-0.010
 nod-0.008
 nodded-0.008
 shakes-0.007
 smiles-0.007
Token more
Token ID517
Feature activation-1.28e-01

Pos logit contributions
 nods+0.025
 nod+0.023
 nodded+0.021
 shakes+0.020
rey+0.019
Neg logit contributions
keeping-0.031
handed-0.025
-0.024
 unatt-0.024
 arrest-0.024
Token careful
Token ID8161
Feature activation+1.02e+00

Pos logit contributions
keeping+0.005
+0.004
handed+0.004
 arrest+0.004
 rubble+0.004
Neg logit contributions
 nods-0.003
 nod-0.003
 nodded-0.003
 shakes-0.002
 smiles-0.002
Token.
Token ID13
Feature activation-8.09e-01

Pos logit contributions
keeping+0.035
+0.031
handed+0.030
 arrest+0.030
 rubble+0.029
Neg logit contributions
 nods-0.013
 nod-0.012
 nodded-0.011
 shakes-0.009
 up-0.009
Token He
Token ID679
Feature activation-1.09e-02

Pos logit contributions
keeping+0.032
handed+0.026
+0.026
 arrest+0.026
hell+0.025
Neg logit contributions
 nods-0.018
 nod-0.016
 nodded-0.016
 shakes-0.014
 smiles-0.013
Token looked
Token ID3114
Feature activation+9.72e-02

Pos logit contributions
keeping+0.000
+0.000
handed+0.000
 arrest+0.000
 rubble+0.000
Neg logit contributions
 nods-0.000
 nod-0.000
 nodded-0.000
 shakes-0.000
 smiles-0.000
Token under
Token ID739
Feature activation+4.56e-01

Pos logit contributions
 nods+0.002
 nod+0.002
 nodded+0.002
 shakes+0.002
 smiles+0.002
Neg logit contributions
keeping-0.004
-0.003
handed-0.003
 arrest-0.003
 rubble-0.003
Token the
Token ID262
Feature activation-1.65e+00

Pos logit contributions
 nods+0.005
 nod+0.004
 nodded+0.003
 shakes+0.003
 smiles+0.003
Neg logit contributions
keeping-0.023
-0.020
handed-0.020
grand-0.019
hell-0.019
Token bed
Token ID3996
Feature activation-3.92e+00

Pos logit contributions
keeping+0.062
+0.057
handed+0.051
uries+0.049
uts+0.049
Neg logit contributions
 nods-0.035
 nod-0.033
 shakes-0.029
 nodded-0.029
 up-0.028
Token,
Token ID11
Feature activation-1.09e+00

Pos logit contributions
keeping+0.187
+0.165
handed+0.159
grand+0.157
 arrest+0.156
Neg logit contributions
 nods-0.042
 nod-0.042
 nodded-0.040
 shakes-0.029
 up-0.028
Token in
Token ID287
Feature activation-4.48e-01

Pos logit contributions
keeping+0.036
 rubble+0.028
 unatt+0.028
 arrest+0.028
handed+0.028
Neg logit contributions
 nods-0.030
 nod-0.026
 nodded-0.024
ap-0.022
 shakes-0.022
Token the
Token ID262
Feature activation-1.16e+00

Pos logit contributions
keeping+0.021
+0.018
handed+0.018
 arrest+0.018
hell+0.017
Neg logit contributions
 nods-0.006
 nod-0.005
 nodded-0.005
 shakes-0.004
 smiles-0.003
Token toy
Token ID13373
Feature activation-1.45e+00

Pos logit contributions
keeping+0.039
+0.034
handed+0.030
 arrest+0.030
uts+0.030
Neg logit contributions
 nods-0.031
 nod-0.028
 nodded-0.027
 shakes-0.025
 smiles-0.024
Token box
Token ID3091
Feature activation-2.34e+00

Pos logit contributions
keeping+0.061
+0.054
ache+0.050
grand+0.050
hell+0.049
Neg logit contributions
 nods-0.024
 nod-0.023
 nodded-0.021
 shakes-0.017
 smiles-0.016
Token and
Token ID290
Feature activation-5.60e-01

Pos logit contributions
 nods+0.001
 nod+0.001
 nodded+0.001
 shakes+0.001
 smiles+0.001
Neg logit contributions
keeping-0.002
-0.001
handed-0.001
 arrest-0.001
 rubble-0.001
Token had
Token ID550
Feature activation-8.16e-01

Pos logit contributions
keeping+0.020
+0.016
handed+0.016
 arrest+0.015
 rubble+0.015
Neg logit contributions
 nods-0.014
 nod-0.013
 nodded-0.013
 shakes-0.011
 smiles-0.011
Token lived
Token ID5615
Feature activation-8.89e-01

Pos logit contributions
keeping+0.030
handed+0.024
+0.024
 unatt+0.023
hell+0.023
Neg logit contributions
 nods-0.019
 nodded-0.017
 nod-0.017
 smiles-0.015
 shakes-0.015
Token for
Token ID329
Feature activation-1.95e-01

Pos logit contributions
keeping+0.032
 anywhere+0.027
handed+0.026
 unatt+0.026
+0.025
Neg logit contributions
 nods-0.023
 nod-0.020
 nodded-0.018
 shakes-0.016
 smiles-0.015
Token many
Token ID867
Feature activation+3.06e-01

Pos logit contributions
keeping+0.008
+0.007
handed+0.007
grand+0.007
 unatt+0.007
Neg logit contributions
 nods-0.003
 nod-0.003
 nodded-0.003
 smiles-0.002
 shakes-0.002
Token years
Token ID812
Feature activation-4.11e+00

Pos logit contributions
 nods+0.008
 nod+0.007
 nodded+0.007
 shakes+0.006
 smiles+0.006
Neg logit contributions
keeping-0.011
-0.009
handed-0.009
 arrest-0.008
 rubble-0.008
Token.
Token ID13
Feature activation+5.19e-02

Pos logit contributions
keeping+0.168
+0.143
handed+0.139
 arrest+0.136
hell+0.132
Neg logit contributions
 nods-0.080
 nod-0.069
 nodded-0.068
 shakes-0.058
 smiles-0.055
Token One
Token ID1881
Feature activation+3.50e-01

Pos logit contributions
 nods+0.001
 nod+0.001
 nodded+0.001
 shakes+0.001
 smiles+0.001
Neg logit contributions
keeping-0.002
-0.002
handed-0.002
 arrest-0.002
 rubble-0.002
Token day
Token ID1110
Feature activation+3.29e-01

Pos logit contributions
 nods+0.011
 nod+0.010
 nodded+0.010
 shakes+0.009
 smiles+0.009
Neg logit contributions
keeping-0.010
handed-0.008
-0.008
grand-0.007
 rubble-0.007
Token,
Token ID11
Feature activation-1.24e-01

Pos logit contributions
 nods+0.009
 nod+0.008
 nodded+0.008
 shakes+0.007
 smiles+0.007
Neg logit contributions
keeping-0.011
 arrest-0.009
-0.009
handed-0.008
 rubble-0.008
Token he
Token ID339
Feature activation+5.51e-01

Pos logit contributions
keeping+0.005
+0.004
handed+0.004
 arrest+0.004
 rubble+0.004
Neg logit contributions
 nods-0.003
 nod-0.003
 nodded-0.002
 shakes-0.002
 smiles-0.002
Token play
Token ID711
Feature activation-1.26e+00

Pos logit contributions
 nods+0.004
 nod+0.004
 nodded+0.004
 shakes+0.003
 smiles+0.003
Neg logit contributions
keeping-0.007
-0.005
handed-0.005
 arrest-0.005
 rubble-0.005
Token or
Token ID393
Feature activation-1.22e+00

Pos logit contributions
keeping+0.051
+0.042
handed+0.042
 arrest+0.041
 rubble+0.040
Neg logit contributions
 nods-0.026
 nod-0.023
 nodded-0.022
 shakes-0.019
 smiles-0.018
Token watch
Token ID2342
Feature activation-1.33e+00

Pos logit contributions
keeping+0.040
+0.032
handed+0.032
 arrest+0.030
hell+0.030
Neg logit contributions
 nods-0.033
 nod-0.032
 nodded-0.030
 shakes-0.027
 smiles-0.026
Token TV
Token ID3195
Feature activation-9.55e-01

Pos logit contributions
keeping+0.054
handed+0.045
+0.044
 arrest+0.043
 rubble+0.042
Neg logit contributions
 nods-0.027
 nod-0.024
 nodded-0.023
 shakes-0.020
 smiles-0.019
Token or
Token ID393
Feature activation-1.28e+00

Pos logit contributions
keeping+0.038
handed+0.031
+0.031
 arrest+0.030
 rubble+0.029
Neg logit contributions
 nods-0.021
 nod-0.019
 nodded-0.018
 shakes-0.016
 up-0.015
Token read
Token ID1100
Feature activation-3.83e+00

Pos logit contributions
keeping+0.046
+0.037
handed+0.037
 arrest+0.035
hell+0.035
Neg logit contributions
 nods-0.032
 nod-0.029
 nodded-0.028
 shakes-0.025
 smiles-0.024
Token books
Token ID3835
Feature activation-2.56e+00

Pos logit contributions
keeping+0.160
handed+0.136
+0.131
 arrest+0.130
 rubble+0.124
Neg logit contributions
 nods-0.075
 nod-0.065
 nodded-0.062
 smiles-0.055
 shakes-0.055
Token.
Token ID13
Feature activation-4.31e-01

Pos logit contributions
keeping+0.109
+0.091
handed+0.090
 arrest+0.090
 rubble+0.088
Neg logit contributions
 nods-0.046
 nod-0.041
 nodded-0.040
 shakes-0.033
 up-0.031
Token They
Token ID1119
Feature activation-5.69e-01

Pos logit contributions
keeping+0.018
handed+0.015
 arrest+0.015
 rubble+0.015
 unatt+0.015
Neg logit contributions
 nods-0.008
 nod-0.007
 nodded-0.007
 shakes-0.006
 smiles-0.005
Token only
Token ID691
Feature activation-5.46e-02

Pos logit contributions
keeping+0.021
+0.017
handed+0.017
 arrest+0.017
 rubble+0.016
Neg logit contributions
 nods-0.014
 nod-0.013
 nodded-0.012
 shakes-0.011
 smiles-0.010
Token had
Token ID550
Feature activation-1.08e+00

Pos logit contributions
keeping+0.002
handed+0.002
+0.002
 arrest+0.002
hell+0.002
Neg logit contributions
 nods-0.001
 nod-0.001
 nodded-0.001
 shakes-0.001
 smiles-0.001
Token.
Token ID13
Feature activation-5.55e-01

Pos logit contributions
keeping+0.063
 arrest+0.055
handed+0.054
+0.053
 rubble+0.053
Neg logit contributions
 nods-0.025
 nod-0.023
 up-0.022
 nodded-0.020
 shakes-0.019
Token He
Token ID679
Feature activation+4.66e-01

Pos logit contributions
keeping+0.023
handed+0.019
 arrest+0.019
+0.019
 rubble+0.018
Neg logit contributions
 nods-0.011
 nod-0.010
 nodded-0.010
 shakes-0.008
 up-0.007
Token looked
Token ID3114
Feature activation+4.94e-02

Pos logit contributions
 nods+0.011
 nod+0.010
 nodded+0.010
 shakes+0.009
 smiles+0.008
Neg logit contributions
keeping-0.017
-0.014
handed-0.014
 arrest-0.013
 rubble-0.013
Token under
Token ID739
Feature activation+4.01e-01

Pos logit contributions
 nods+0.002
 nod+0.001
 nodded+0.001
 shakes+0.001
 smiles+0.001
Neg logit contributions
keeping-0.001
 rubble-0.001
-0.001
 anywhere-0.001
 arrest-0.001
Token the
Token ID262
Feature activation-1.90e+00

Pos logit contributions
 nods+0.005
 nod+0.004
 nodded+0.004
 shakes+0.003
 smiles+0.003
Neg logit contributions
keeping-0.019
-0.017
handed-0.017
hell-0.016
grand-0.016
Token bed
Token ID3996
Feature activation-3.90e+00

Pos logit contributions
keeping+0.069
+0.059
handed+0.056
 arrest+0.053
uries+0.052
Neg logit contributions
 nods-0.044
 nod-0.042
 shakes-0.038
 nodded-0.036
 up-0.036
Token and
Token ID290
Feature activation-1.55e+00

Pos logit contributions
keeping+0.190
+0.166
handed+0.161
 arrest+0.158
 unnecessary+0.158
Neg logit contributions
 nods-0.039
 nod-0.038
 nodded-0.038
 shakes-0.027
 up-0.026
Token in
Token ID287
Feature activation-4.94e-01

Pos logit contributions
keeping+0.060
+0.049
handed+0.049
 arrest+0.048
 rubble+0.046
Neg logit contributions
 nods-0.035
 nod-0.031
 nodded-0.030
 shakes-0.026
 smiles-0.024
Token the
Token ID262
Feature activation-1.34e+00

Pos logit contributions
keeping+0.025
+0.022
handed+0.022
grand+0.021
 arrest+0.021
Neg logit contributions
 nods-0.005
 nod-0.004
 nodded-0.004
 shakes-0.003
 smiles-0.003
Token closet
Token ID21615
Feature activation-4.77e+00

Pos logit contributions
keeping+0.046
+0.038
 arrest+0.035
handed+0.035
uts+0.034
Neg logit contributions
 nods-0.036
 nod-0.033
 nodded-0.031
 shakes-0.029
 smiles-0.028
Token,
Token ID11
Feature activation-1.33e+00

Pos logit contributions
keeping+0.226
+0.195
 unnecessary+0.192
handed+0.189
 arrest+0.187
Neg logit contributions
 nodded-0.059
 nods-0.055
 nod-0.054
 shakes-0.039
 smiled-0.039
Token,
Token ID11
Feature activation-1.24e+00

Pos logit contributions
keeping+0.114
+0.093
handed+0.092
 arrest+0.091
 rubble+0.088
Neg logit contributions
 nods-0.066
 nod-0.059
 nodded-0.057
 shakes-0.050
 smiles-0.046
Token and
Token ID290
Feature activation-5.84e-01

Pos logit contributions
keeping+0.044
 arrest+0.035
 unatt+0.035
handed+0.034
 rubble+0.034
Neg logit contributions
 nods-0.031
 nod-0.026
 nodded-0.025
ap-0.022
 shakes-0.022
Token even
Token ID772
Feature activation+9.81e-01

Pos logit contributions
keeping+0.022
+0.018
handed+0.017
 arrest+0.017
 rubble+0.017
Neg logit contributions
 nods-0.014
 nod-0.012
 nodded-0.012
 shakes-0.011
 smiles-0.010
Token in
Token ID287
Feature activation-5.93e-01

Pos logit contributions
 nods+0.022
 nod+0.019
 nodded+0.018
 shakes+0.015
 smiles+0.014
Neg logit contributions
keeping-0.039
handed-0.031
-0.031
 arrest-0.031
 rubble-0.030
Token the
Token ID262
Feature activation-7.13e-01

Pos logit contributions
keeping+0.028
+0.024
handed+0.024
grand+0.023
hell+0.023
Neg logit contributions
 nods-0.008
 nod-0.006
 nodded-0.006
 smiles-0.005
 shakes-0.005
Token closet
Token ID21615
Feature activation-3.92e+00

Pos logit contributions
keeping+0.024
+0.019
handed+0.019
 arrest+0.019
hell+0.018
Neg logit contributions
 nods-0.020
 nod-0.018
 nodded-0.017
 shakes-0.015
 smiles-0.015
Token,
Token ID11
Feature activation-5.74e-01

Pos logit contributions
keeping+0.163
+0.140
 arrest+0.136
 rubble+0.132
 unnecessary+0.132
Neg logit contributions
 nods-0.067
 nodded-0.065
 nod-0.061
 shakes-0.047
 smiled-0.046
Token but
Token ID475
Feature activation+6.59e-02

Pos logit contributions
keeping+0.021
handed+0.017
 arrest+0.016
+0.016
 rubble+0.016
Neg logit contributions
 nods-0.014
 nod-0.012
 nodded-0.012
 shakes-0.010
ap-0.010
Token he
Token ID339
Feature activation-6.30e-01

Pos logit contributions
 nods+0.002
 nod+0.002
 nodded+0.002
 shakes+0.001
 smiles+0.001
Neg logit contributions
keeping-0.002
-0.002
handed-0.002
 arrest-0.002
 rubble-0.002
Token couldn
Token ID3521
Feature activation-3.46e-01

Pos logit contributions
keeping+0.019
+0.015
handed+0.015
 arrest+0.014
 rubble+0.014
Neg logit contributions
 nods-0.019
 nod-0.018
 nodded-0.017
 shakes-0.015
 smiles-0.015
Token't
Token ID470
Feature activation-3.64e-01

Pos logit contributions
keeping+0.009
+0.007
handed+0.007
 arrest+0.007
 rubble+0.006
Neg logit contributions
 nods-0.012
 nod-0.011
 nodded-0.011
 shakes-0.010
 smiles-0.009