TOP ACTIVATIONS
MAX = 0.198

did something special . On Wed nes days she liked to
said yes and gave her permission to do so .
and her mum gave her permission to start .
King and asked for his permission . The King said ,
! Her mom gave her permission to use the oven .
man who had given her permission to come .
, but they gave her permission to go ahead .
, Miss Sarah taught the pupils about the different colors .
doctor and asked her for permission to do the x -
her mom my gave her permission to open the envelope .
excited . She completed the tasks and opened the door .
his mom and asked for permission to take the symbol home
They packed all of their belongings and moved to a new
As they packed all their belongings , the family was looking
to work , attaching all kinds of things to the walls
her family sorted through their belongings , picking out all of
, they packed up their belongings and headed home . The
s important to ask for permission before taking something that doesn
Miss Sarah gave all the pupils a hug and said ,
He asked his mom for permission and she agreed . She

MIN ACTIVATIONS
MIN = -0.177

, fifteen , sixteen , seventeen , eighteen , nineteen ,
, fifteen , sixteen , seventeen , eighteen , nineteen ,
And it is not too high . I will go first
slippery and the hill is steep . It could be dangerous
shiny , but it is blue and soft and makes a
ball . It is not red and shiny , but it
careful . The hill is high and steep ."
bike . It was not red and shiny and fast anymore
slide . It is too high and fast . Can we
milk . It is too high ." Mom hears
hard . The hill is steep and rocky . Tom and
. " It is not blue anymore . It has many
careful . The hill is steep and the wagon is heavy
careful . The hill is steep ." They hold
and the clock is too high . We will fall and
. It was not too steep or too high . She
He thinks it is too high and too fast .
But the tower is too high and it falls down with
slide . It is too high for you and Max ."
sky . The sky is high and wide . Your block

INTERVAL 0.105 - 0.198
CONTAINS 0.209%

Mama gave baby some delicious foods to eat . Baby ate
together every day and spent lots of time together . Sus
to everyone , no matter how they look or smell .
careful ." M oral : It is important to
get the scales to see how much it weighed . Her

INTERVAL 0.011 - 0.105
CONTAINS 38.335%

took a spoon , scoop ed a handful of cereal and
those markers ", said Mark . Jimmy got
play everyday . They both liked to play outdoors in the
the world . It hopped off the peg and began walking
spent the rest of the day listening to the sun flower

INTERVAL -0.083 - 0.011
CONTAINS 60.673%

asking for help from others . And sometimes , things that
Welcome to my office ." Joe and Molly were impressed by
a blue flower , and a pink flower . They picked
were little . But recently Max and Sarah had begun to
they arrived at the airport , Benny noticed that it was

INTERVAL -0.177 - -0.083
CONTAINS 0.784%

. The doors did not open . Sara and Tom screamed
and the clouds and the sky . They had
touched it . It was soft and fluffy . She said
the garage ? It is dry and clean and the dog
on her . It was heavy and hurt her a little
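
The four INTERVAL blocks above appear to split the range between MIN (-0.177) and MAX (0.198) into equal-width bins, with CONTAINS giving the share of activations in each bin. The following is a minimal sketch of that bucketing under this assumption. The function names `interval_edges` and `interval_shares` are hypothetical and are not taken from the tool that produced this dump. The edges are computed from the rounded MIN and MAX shown here, so they can differ from the displayed boundaries in the last digit.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges from lo to hi (ascending)."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def interval_shares(activations, lo, hi, n_bins=4):
    """Percentage of activations that fall into each equal-width bin."""
    if not activations:
        return [0.0] * n_bins
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for a in activations:
        # Clamp the index so values at (or beyond) the ends land in the
        # first or last bin instead of falling outside the list.
        idx = int((a - lo) / width)
        idx = max(0, min(idx, n_bins - 1))
        counts[idx] += 1
    total = len(activations)
    return [100.0 * c / total for c in counts]


# Assumed inputs: the rounded MIN and MAX values listed above.
edges = interval_edges(-0.177, 0.198)
print([round(e, 3) for e in edges])
```

Passing the full list of activation values to `interval_shares` would give per-bin percentages comparable to the CONTAINS lines. Note that the sketch returns bins in ascending order, while the dump lists them from highest to lowest.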

Token did
Token ID750
Feature activation-4.05e-02

Pos logit contributions
 compass+0.043
edly+0.042
 reins+0.042
unn+0.042
amb+0.042
Neg logit contributions
 barking-0.027
 mess-0.026
 biting-0.026
 Sam-0.025
 Tom-0.025
Token something
Token ID1223
Feature activation+5.41e-02

Pos logit contributions
 barking+0.039
 biting+0.039
ue+0.036
 Sam+0.032
 mess+0.032
Neg logit contributions
unn-0.097
amb-0.096
ovable-0.095
 lamb-0.094
oll-0.094
Token special
Token ID2041
Feature activation+1.00e-02

Pos logit contributions
amb+0.098
unn+0.098
edly+0.092
 lamb+0.091
sie+0.091
Neg logit contributions
 barking-0.087
 biting-0.086
 mess-0.081
ue-0.079
 Sam-0.079
Token.
Token ID13
Feature activation+1.90e-02

Pos logit contributions
unn+0.028
 lamb+0.027
amb+0.027
sie+0.027
ovable+0.027
Neg logit contributions
 barking-0.006
 biting-0.006
 Sam-0.005
ue-0.004
 again-0.004
Token On
Token ID1550
Feature activation+3.05e-02

Pos logit contributions
unn+0.039
amb+0.039
ovable+0.038
 lamb+0.037
 luggage+0.037
Neg logit contributions
 biting-0.025
 barking-0.025
ue-0.023
 attacking-0.022
 mess-0.022
Token Wed
Token ID3300
Feature activation+1.82e-01

Pos logit contributions
unn+0.071
amb+0.070
ovable+0.068
oll+0.067
 lamb+0.067
Neg logit contributions
 barking-0.034
 biting-0.033
ue-0.032
 Sam-0.030
 mess-0.028
Tokennes
Token ID2516
Feature activation+2.98e-02

Pos logit contributions
unn+0.178
 lamb+0.172
 luggage+0.167
amb+0.155
ovable+0.152
Neg logit contributions
ue-0.439
 barking-0.430
 biting-0.426
 erupt-0.412
 Sam-0.408
Tokendays
Token ID12545
Feature activation+1.02e-01

Pos logit contributions
unn+0.045
oll+0.043
 lamb+0.043
nam+0.042
 luggage+0.042
Neg logit contributions
 barking-0.057
ue-0.055
 biting-0.054
 Sam-0.052
 erupt-0.051
Token she
Token ID673
Feature activation+5.83e-02

Pos logit contributions
 compass+0.259
nam+0.254
 lamb+0.254
unn+0.251
sie+0.251
Neg logit contributions
 barking-0.085
ue-0.081
 Sam-0.079
 Tom-0.074
 biting-0.074
Token liked
Token ID8288
Feature activation+1.98e-02

Pos logit contributions
unn+0.119
edly+0.116
uster+0.112
nam+0.112
 compass+0.109
Neg logit contributions
 barking-0.081
 Sam-0.080
 mess-0.078
 Tom-0.077
 biting-0.075
Token to
Token ID284
Feature activation-1.88e-02

Pos logit contributions
ovable+0.048
unn+0.048
amb+0.047
sie+0.047
oll+0.047
Neg logit contributions
 barking-0.021
 biting-0.021
 Sam-0.018
 Tom-0.016
 mess-0.016
Token said
Token ID531
Feature activation+1.92e-02

Pos logit contributions
unn+0.020
amb+0.020
ovable+0.019
 luggage+0.019
 lamb+0.019
Neg logit contributions
 barking-0.014
 biting-0.014
ue-0.014
 Sam-0.013
 mess-0.012
Token yes
Token ID3763
Feature activation-7.11e-03

Pos logit contributions
unn+0.037
amb+0.037
ovable+0.036
 lamb+0.035
sie+0.035
Neg logit contributions
 barking-0.028
 biting-0.028
ue-0.026
 Sam-0.025
 attacking-0.024
Token and
Token ID290
Feature activation+4.92e-02

Pos logit contributions
 biting+0.008
 barking+0.008
 attacking+0.007
ue+0.007
 Sam+0.007
Neg logit contributions
amb-0.016
unn-0.015
 luggage-0.015
oll-0.015
ovable-0.015
Token gave
Token ID2921
Feature activation+3.69e-02

Pos logit contributions
unn+0.116
ovable+0.115
amb+0.115
 luggage+0.110
edly+0.110
Neg logit contributions
 Sam-0.052
 barking-0.051
 biting-0.051
 Tom-0.047
ue-0.047
Token her
Token ID607
Feature activation+6.39e-02

Pos logit contributions
unn+0.076
amb+0.075
ovable+0.074
 lamb+0.072
oll+0.072
Neg logit contributions
 barking-0.049
 biting-0.049
ue-0.046
 Sam-0.045
 mess-0.043
Token permission
Token ID7170
Feature activation+1.73e-01

Pos logit contributions
unn+0.187
amb+0.179
ovable+0.168
edly+0.167
sie+0.163
Neg logit contributions
 Sam-0.050
 half-0.047
 Sue-0.044
 Tom-0.044
 Mia-0.043
Token to
Token ID284
Feature activation+2.17e-03

Pos logit contributions
unn+0.400
amb+0.387
ovable+0.386
 lamb+0.386
sie+0.383
Neg logit contributions
 barking-0.183
 biting-0.179
ue-0.169
 Sam-0.161
 mess-0.147
Token do
Token ID466
Feature activation-7.93e-03

Pos logit contributions
unn+0.004
amb+0.004
ovable+0.004
 luggage+0.004
 lamb+0.004
Neg logit contributions
 barking-0.003
 biting-0.003
ue-0.003
 Sam-0.003
 attacking-0.003
Token so
Token ID523
Feature activation+8.03e-02

Pos logit contributions
 barking+0.009
 biting+0.009
ue+0.009
 attacking+0.008
 Sam+0.008
Neg logit contributions
ovable-0.017
 luggage-0.017
unn-0.017
amb-0.017
oll-0.017
Token.
Token ID13
Feature activation+3.28e-02

Pos logit contributions
unn+0.138
amb+0.137
ovable+0.136
oll+0.134
sie+0.131
Neg logit contributions
 barking-0.131
 biting-0.129
ue-0.123
 attacking-0.116
 Sam-0.116
Token
Token ID198
Feature activation+1.03e-02

Pos logit contributions
amb+0.061
ouses+0.061
unn+0.059
 wife+0.057
 luggage+0.057
Neg logit contributions
 biting-0.049
 barking-0.046
 attacking-0.046
 erupt-0.043
 destroying-0.043
Token and
Token ID290
Feature activation+3.46e-02

Pos logit contributions
unn+0.036
ovable+0.034
 lamb+0.034
amb+0.034
oll+0.033
Neg logit contributions
 barking-0.013
 biting-0.013
ue-0.011
 Sam-0.011
 again-0.011
Token her
Token ID607
Feature activation+5.35e-02

Pos logit contributions
unn+0.072
ovable+0.070
amb+0.068
oll+0.068
ouses+0.067
Neg logit contributions
 barking-0.048
 biting-0.047
 Sam-0.047
ue-0.044
 Sue-0.044
Token mum
Token ID25682
Feature activation+2.12e-02

Pos logit contributions
amb+0.088
unn+0.087
ovable+0.086
nam+0.082
ouses+0.082
Neg logit contributions
 biting-0.094
 barking-0.092
 mess-0.088
 Sam-0.087
 mistake-0.086
Token gave
Token ID2921
Feature activation+2.87e-02

Pos logit contributions
unn+0.047
nam+0.047
 luggage+0.045
amb+0.044
ouses+0.044
Neg logit contributions
 biting-0.024
 barking-0.023
 again-0.023
 Sam-0.023
ue-0.022
Token her
Token ID607
Feature activation+5.85e-02

Pos logit contributions
unn+0.064
amb+0.061
ouses+0.061
ovable+0.061
oll+0.060
Neg logit contributions
 barking-0.035
 biting-0.034
 Sam-0.032
ue-0.032
 Tom-0.030
Token permission
Token ID7170
Feature activation+1.70e-01

Pos logit contributions
unn+0.178
amb+0.169
ovable+0.155
ances+0.153
edly+0.151
Neg logit contributions
 Sam-0.036
 Mia-0.034
 again-0.034
 half-0.034
 Sue-0.031
Token to
Token ID284
Feature activation-3.09e-03

Pos logit contributions
unn+0.463
 lamb+0.435
sie+0.428
ovable+0.427
amb+0.421
Neg logit contributions
 barking-0.122
 Sam-0.118
 again-0.115
 biting-0.113
 Tom-0.113
Token start
Token ID923
Feature activation-2.19e-02

Pos logit contributions
 barking+0.004
 biting+0.004
 Sam+0.004
 Tom+0.004
ue+0.004
Neg logit contributions
unn-0.007
amb-0.007
ovable-0.007
ouses-0.007
oll-0.006
Token.
Token ID13
Feature activation+2.10e-02

Pos logit contributions
 barking+0.025
 biting+0.024
ue+0.022
 Sam+0.021
 attacking+0.020
Neg logit contributions
unn-0.050
amb-0.049
ovable-0.048
sie-0.048
 lamb-0.048
Token
Token ID198
Feature activation+9.96e-04

Pos logit contributions
unn+0.046
ovable+0.046
sie+0.045
nam+0.045
 luggage+0.045
Neg logit contributions
 barking-0.025
 biting-0.023
ue-0.023
 Sam-0.022
 Tom-0.020
Token
Token ID198
Feature activation+1.46e-03

Pos logit contributions
unn+0.003
ovable+0.002
amb+0.002
sie+0.002
 luggage+0.002
Neg logit contributions
 barking-0.001
 biting-0.001
ue-0.001
 Sam-0.001
 attacking-0.001
Token King
Token ID2677
Feature activation+3.78e-02

Pos logit contributions
unn+0.030
amb+0.029
ovable+0.028
ouses+0.028
oll+0.028
Neg logit contributions
 barking-0.023
 biting-0.023
 mess-0.021
ue-0.021
 Sam-0.020
Token and
Token ID290
Feature activation+3.10e-02

Pos logit contributions
unn+0.091
amb+0.089
ovable+0.088
 luggage+0.087
oll+0.087
Neg logit contributions
 barking-0.038
 biting-0.036
ue-0.034
 Sam-0.032
 attacking-0.029
Token asked
Token ID1965
Feature activation+5.25e-02

Pos logit contributions
unn+0.069
amb+0.067
oll+0.067
ovable+0.065
uster+0.065
Neg logit contributions
 Sam-0.040
 barking-0.040
 biting-0.038
 Sue-0.037
 Tom-0.037
Token for
Token ID329
Feature activation+2.97e-02

Pos logit contributions
unn+0.129
amb+0.127
ovable+0.122
oll+0.121
ouses+0.121
Neg logit contributions
 biting-0.051
 barking-0.051
 Sam-0.050
 Tom-0.045
ue-0.044
Token his
Token ID465
Feature activation+5.86e-02

Pos logit contributions
amb+0.071
unn+0.069
ouses+0.065
ovable+0.065
oll+0.065
Neg logit contributions
 biting-0.036
 Sam-0.034
 barking-0.034
 Tom-0.031
 Sue-0.030
Token permission
Token ID7170
Feature activation+1.70e-01

Pos logit contributions
amb+0.112
unn+0.110
sie+0.106
uster+0.103
ovable+0.103
Neg logit contributions
 barking-0.093
 biting-0.093
 Sam-0.088
 bite-0.085
 Sue-0.081
Token.
Token ID13
Feature activation+2.92e-02

Pos logit contributions
unn+0.439
 lamb+0.428
amb+0.424
sie+0.423
ovable+0.422
Neg logit contributions
 barking-0.136
 biting-0.130
 Sam-0.119
ue-0.119
 Tom-0.108
Token The
Token ID383
Feature activation-4.06e-03

Pos logit contributions
unn+0.065
amb+0.064
ovable+0.063
 lamb+0.062
oll+0.062
Neg logit contributions
 barking-0.033
 biting-0.032
ue-0.030
 Sam-0.028
 attacking-0.027
Token King
Token ID2677
Feature activation+3.59e-02

Pos logit contributions
 biting+0.007
 barking+0.007
 mess+0.006
 Sam+0.006
ue+0.006
Neg logit contributions
amb-0.007
unn-0.007
ouses-0.007
uster-0.007
ovable-0.007
Token said
Token ID531
Feature activation+2.71e-02

Pos logit contributions
unn+0.075
amb+0.073
 luggage+0.072
ovable+0.072
 lamb+0.071
Neg logit contributions
 barking-0.046
 biting-0.046
ue-0.044
 Sam-0.042
 attacking-0.039
Token,
Token ID11
Feature activation+2.14e-03

Pos logit contributions
unn+0.064
amb+0.063
ovable+0.061
ouses+0.061
sie+0.061
Neg logit contributions
 biting-0.029
 barking-0.028
ue-0.025
 Sam-0.025
 attacking-0.023
Token!
Token ID0
Feature activation+2.01e-02

Pos logit contributions
ue+0.003
 dent+0.003
 collapse+0.003
 biting+0.003
 erupt+0.003
Neg logit contributions
amb-0.003
 wife-0.003
unn-0.002
 lamb-0.002
 husband-0.002
Token Her
Token ID2332
Feature activation+4.79e-02

Pos logit contributions
amb+0.040
unn+0.039
 wife+0.038
ouses+0.038
 lamb+0.037
Neg logit contributions
 biting-0.028
 barking-0.027
 attacking-0.026
 erupt-0.025
ue-0.024
Token mom
Token ID1995
Feature activation-5.62e-03

Pos logit contributions
unn+0.084
amb+0.083
ovable+0.082
 luggage+0.081
oll+0.081
Neg logit contributions
 barking-0.077
 biting-0.076
ue-0.072
 Sam-0.068
 attacking-0.068
Token gave
Token ID2921
Feature activation+3.57e-02

Pos logit contributions
 barking+0.009
 biting+0.009
ue+0.009
 attacking+0.008
 Sam+0.008
Neg logit contributions
unn-0.010
amb-0.010
ovable-0.009
 lamb-0.009
oll-0.009
Token her
Token ID607
Feature activation+6.16e-02

Pos logit contributions
unn+0.079
amb+0.079
ovable+0.077
ouses+0.076
edly+0.075
Neg logit contributions
 barking-0.043
 biting-0.043
 Sam-0.042
 Tom-0.039
ue-0.038
Token permission
Token ID7170
Feature activation+1.69e-01

Pos logit contributions
unn+0.172
amb+0.170
ovable+0.159
 compass+0.157
edly+0.155
Neg logit contributions
 Sam-0.047
 half-0.045
 biting-0.044
 Tom-0.043
 Sue-0.043
Token to
Token ID284
Feature activation-1.66e-03

Pos logit contributions
unn+0.438
 lamb+0.421
ovable+0.420
sie+0.415
 luggage+0.408
Neg logit contributions
 barking-0.136
 Sam-0.133
 biting-0.131
ue-0.128
 Tom-0.125
Token use
Token ID779
Feature activation+2.76e-02

Pos logit contributions
 barking+0.002
 biting+0.002
ue+0.002
 Sam+0.002
 attacking+0.002
Neg logit contributions
unn-0.003
amb-0.003
ovable-0.003
 lamb-0.003
sie-0.003
Token the
Token ID262
Feature activation+2.79e-02

Pos logit contributions
unn+0.061
amb+0.060
ovable+0.059
 luggage+0.058
 lamb+0.058
Neg logit contributions
 barking-0.032
 biting-0.031
ue-0.029
 Sam-0.027
 attacking-0.026
Token oven
Token ID14361
Feature activation-2.09e-02

Pos logit contributions
unn+0.051
 luggage+0.050
 lamb+0.050
 wife+0.050
ovable+0.050
Neg logit contributions
 barking-0.042
 biting-0.041
ue-0.039
 Sam-0.037
 attacking-0.037
Token.
Token ID13
Feature activation+2.40e-02

Pos logit contributions
 barking+0.019
 biting+0.019
ue+0.019
 Sam+0.018
 Tom+0.016
Neg logit contributions
unn-0.051
amb-0.050
ovable-0.050
 lamb-0.050
sie-0.050
Token man
Token ID582
Feature activation+4.77e-03

Pos logit contributions
unn+0.034
amb+0.033
ovable+0.032
ouses+0.031
oll+0.031
Neg logit contributions
 barking-0.028
 biting-0.027
 mess-0.025
ue-0.025
 Sam-0.025
Token who
Token ID508
Feature activation+6.29e-02

Pos logit contributions
unn+0.010
amb+0.009
nam+0.009
 downs+0.009
uster+0.009
Neg logit contributions
 barking-0.007
 biting-0.007
 again-0.007
 Sam-0.006
ue-0.006
Token had
Token ID550
Feature activation-3.76e-03

Pos logit contributions
unn+0.130
amb+0.120
ouses+0.118
uster+0.117
edly+0.117
Neg logit contributions
 barking-0.090
 Sam-0.089
 biting-0.087
 Tom-0.083
 Sue-0.081
Token given
Token ID1813
Feature activation+6.09e-02

Pos logit contributions
 barking+0.005
 biting+0.005
 Sam+0.005
 Tom+0.004
ue+0.004
Neg logit contributions
unn-0.008
amb-0.008
ovable-0.008
oll-0.008
sie-0.008
Token her
Token ID607
Feature activation+6.35e-02

Pos logit contributions
unn+0.129
amb+0.127
ovable+0.123
oll+0.121
 luggage+0.121
Neg logit contributions
 barking-0.078
 biting-0.077
 Sam-0.074
ue-0.072
 Tom-0.066
Token permission
Token ID7170
Feature activation+1.68e-01

Pos logit contributions
unn+0.173
amb+0.163
oll+0.150
sie+0.150
ovable+0.149
Neg logit contributions
 Sam-0.058
 barking-0.056
 again-0.054
 Sue-0.052
 biting-0.052
Token to
Token ID284
Feature activation+2.31e-03

Pos logit contributions
unn+0.438
 lamb+0.414
sie+0.414
amb+0.410
ovable+0.408
Neg logit contributions
 barking-0.139
 biting-0.130
 Sam-0.124
ue-0.122
 Tom-0.111
Token come
Token ID1282
Feature activation-4.71e-02

Pos logit contributions
unn+0.005
amb+0.005
oll+0.005
ouses+0.005
ovable+0.005
Neg logit contributions
 barking-0.003
 biting-0.003
 Sam-0.003
ue-0.003
 Sue-0.003
Token.
Token ID13
Feature activation+2.04e-02

Pos logit contributions
 barking+0.060
 biting+0.060
ue+0.056
 Sam+0.052
 attacking+0.051
Neg logit contributions
unn-0.097
amb-0.097
ovable-0.096
 lamb-0.094
 luggage-0.094
Token
Token ID220
Feature activation-4.16e-03

Pos logit contributions
unn+0.048
amb+0.047
ovable+0.046
 luggage+0.046
 lamb+0.046
Neg logit contributions
 barking-0.021
 biting-0.021
ue-0.020
 Sam-0.019
 attacking-0.017
Token
Token ID198
Feature activation+5.88e-03

Pos logit contributions
 barking+0.004
 biting+0.004
ue+0.004
 attacking+0.003
 mess+0.003
Neg logit contributions
unn-0.010
amb-0.010
ovable-0.010
oll-0.009
 lamb-0.009
Token,
Token ID11
Feature activation-1.93e-02

Pos logit contributions
unn+0.042
amb+0.042
ovable+0.042
 luggage+0.042
sie+0.042
Neg logit contributions
 barking-0.018
 biting-0.018
ue-0.017
 attacking-0.015
 Sam-0.014
Token but
Token ID475
Feature activation+2.30e-02

Pos logit contributions
 barking+0.021
 biting+0.021
ue+0.020
 Sam+0.018
 mess+0.017
Neg logit contributions
unn-0.044
amb-0.044
ovable-0.043
sie-0.042
oll-0.042
Token they
Token ID484
Feature activation+1.83e-02

Pos logit contributions
unn+0.050
amb+0.049
ovable+0.048
oll+0.047
ouses+0.047
Neg logit contributions
 barking-0.029
 biting-0.028
ue-0.027
 Sam-0.025
 Sue-0.024
Token gave
Token ID2921
Feature activation+2.58e-02

Pos logit contributions
unn+0.041
amb+0.039
ouses+0.039
 luggage+0.037
 lamb+0.037
Neg logit contributions
 Sam-0.021
ue-0.021
 barking-0.021
 biting-0.020
 Sue-0.020
Token her
Token ID607
Feature activation+5.63e-02

Pos logit contributions
unn+0.057
amb+0.056
ovable+0.055
oll+0.054
ouses+0.054
Neg logit contributions
 barking-0.031
 biting-0.030
ue-0.029
 Sam-0.028
 Sue-0.027
Token permission
Token ID7170
Feature activation+1.66e-01

Pos logit contributions
unn+0.168
amb+0.160
ovable+0.147
edly+0.145
oll+0.144
Neg logit contributions
 Sam-0.039
 Sue-0.037
 half-0.036
 Mia-0.034
 Tom-0.034
Token to
Token ID284
Feature activation-1.19e-03

Pos logit contributions
unn+0.431
 lamb+0.408
sie+0.405
ovable+0.405
amb+0.401
Neg logit contributions
 barking-0.138
 biting-0.130
 Sam-0.127
ue-0.123
 Tom-0.118
Token go
Token ID467
Feature activation+4.80e-03

Pos logit contributions
 barking+0.002
 biting+0.002
 Sam+0.001
ue+0.001
 Sue+0.001
Neg logit contributions
unn-0.003
amb-0.003
ovable-0.003
oll-0.003
ouses-0.002
Token ahead
Token ID4058
Feature activation-2.68e-02

Pos logit contributions
unn+0.012
ovable+0.012
oll+0.012
amb+0.012
sie+0.011
Neg logit contributions
 barking-0.005
 biting-0.004
ue-0.004
 Sam-0.004
 attacking-0.003
Token.
Token ID13
Feature activation+2.95e-02

Pos logit contributions
 barking+0.022
 biting+0.020
ue+0.019
 Sam+0.018
 Tom+0.016
Neg logit contributions
unn-0.071
amb-0.068
ovable-0.067
oll-0.067
 lamb-0.067
Token
Token ID198
Feature activation+9.86e-03

Pos logit contributions
unn+0.066
amb+0.065
ovable+0.064
 lamb+0.063
 luggage+0.063
Neg logit contributions
 barking-0.033
 biting-0.033
ue-0.031
 Sam-0.028
 attacking-0.028
Token,
Token ID11
Feature activation+5.84e-04

Pos logit contributions
 luggage+0.117
amb+0.116
unn+0.115
sie+0.114
ovable+0.113
Neg logit contributions
 biting-0.049
 barking-0.048
ue-0.041
 erupt-0.041
 attacking-0.040
Token Miss
Token ID4544
Feature activation+1.43e-02

Pos logit contributions
unn+0.002
amb+0.002
ovable+0.002
oll+0.002
 lamb+0.001
Neg logit contributions
 barking-0.000
 biting-0.000
ue-0.000
 Sam-0.000
 attacking-0.000
Token Sarah
Token ID10490
Feature activation+5.50e-04

Pos logit contributions
unn+0.035
amb+0.034
ovable+0.034
sie+0.033
 luggage+0.033
Neg logit contributions
 biting-0.013
 barking-0.013
ue-0.012
 attacking-0.011
 Sam-0.011
Token taught
Token ID7817
Feature activation+8.62e-02

Pos logit contributions
unn+0.001
amb+0.001
ovable+0.001
nam+0.001
oll+0.001
Neg logit contributions
 barking-0.001
 biting-0.001
ue-0.001
 Sam-0.001
 mess-0.001
Token the
Token ID262
Feature activation+5.22e-02

Pos logit contributions
unn+0.201
amb+0.199
ovable+0.196
oll+0.194
sie+0.194
Neg logit contributions
 barking-0.092
 biting-0.089
ue-0.086
 Sam-0.083
 Tom-0.074
Token pupils
Token ID25613
Feature activation+1.64e-01

Pos logit contributions
amb+0.100
unn+0.097
ovable+0.094
ouses+0.094
sie+0.094
Neg logit contributions
 barking-0.082
 biting-0.079
 mess-0.077
 Sam-0.076
ue-0.073
Token about
Token ID546
Feature activation-1.82e-02

Pos logit contributions
unn+0.371
amb+0.359
 lamb+0.341
oll+0.338
sie+0.333
Neg logit contributions
 barking-0.205
 biting-0.202
 Sam-0.189
 Sue-0.183
 mess-0.181
Token the
Token ID262
Feature activation+2.00e-02

Pos logit contributions
 barking+0.022
 biting+0.021
 Sam+0.019
ue+0.017
 attacking+0.017
Neg logit contributions
unn-0.043
amb-0.043
oll-0.041
ovable-0.041
sie-0.040
Token different
Token ID1180
Feature activation+1.21e-03

Pos logit contributions
amb+0.038
unn+0.037
oll+0.037
ovable+0.036
edly+0.036
Neg logit contributions
 barking-0.032
 biting-0.030
 mess-0.030
 Sam-0.028
 mistake-0.027
Token colors
Token ID7577
Feature activation+7.27e-03

Pos logit contributions
oll+0.002
unn+0.002
amb+0.002
ovable+0.002
ppa+0.002
Neg logit contributions
 barking-0.002
 mess-0.002
 biting-0.002
 Sam-0.002
 mistake-0.002
Token.
Token ID13
Feature activation+2.48e-02

Pos logit contributions
unn+0.017
amb+0.017
 lamb+0.016
ovable+0.016
sie+0.016
Neg logit contributions
 barking-0.008
 biting-0.008
ue-0.007
 Sam-0.007
 mess-0.007
Token doctor
Token ID6253
Feature activation+1.88e-02

Pos logit contributions
unn+0.051
amb+0.051
ovable+0.049
ouses+0.048
sie+0.048
Neg logit contributions
 barking-0.039
 biting-0.039
ue-0.035
 mess-0.035
 attacking-0.034
Token and
Token ID290
Feature activation+2.13e-02

Pos logit contributions
unn+0.047
amb+0.045
ovable+0.045
oll+0.044
sie+0.044
Neg logit contributions
 barking-0.019
 biting-0.017
 Sam-0.016
ue-0.015
 attacking-0.014
Token asked
Token ID1965
Feature activation+4.44e-02

Pos logit contributions
unn+0.049
amb+0.047
ovable+0.047
oll+0.046
ouses+0.045
Neg logit contributions
 barking-0.025
 Sam-0.024
 biting-0.024
 Tom-0.022
 Sue-0.022
Token her
Token ID607
Feature activation+6.56e-02

Pos logit contributions
unn+0.102
amb+0.101
ovable+0.099
ouses+0.098
oll+0.097
Neg logit contributions
 biting-0.049
 barking-0.048
 Sam-0.045
 Tom-0.042
ue-0.041
Token for
Token ID329
Feature activation+1.21e-02

Pos logit contributions
amb+0.150
unn+0.147
sie+0.141
edly+0.140
ovable+0.140
Neg logit contributions
 barking-0.074
 biting-0.074
 Sam-0.073
 Sue-0.065
 again-0.064
Token permission
Token ID7170
Feature activation+1.64e-01

Pos logit contributions
amb+0.025
unn+0.025
ouses+0.023
ovable+0.023
sie+0.023
Neg logit contributions
 biting-0.018
 barking-0.017
 Sam-0.016
 attacking-0.015
 Tom-0.015
Token to
Token ID284
Feature activation-7.15e-03

Pos logit contributions
unn+0.412
amb+0.401
 lamb+0.399
sie+0.396
ovable+0.395
Neg logit contributions
 barking-0.146
 biting-0.140
ue-0.127
 Sam-0.126
 Tom-0.111
Token do
Token ID466
Feature activation-1.51e-02

Pos logit contributions
 barking+0.010
 biting+0.010
ue+0.009
 Sam+0.009
 mess+0.008
Neg logit contributions
unn-0.015
amb-0.014
ovable-0.014
ouses-0.014
oll-0.014
Token the
Token ID262
Feature activation+2.04e-02

Pos logit contributions
 biting+0.016
 barking+0.016
 Sam+0.014
ue+0.013
 mess+0.013
Neg logit contributions
amb-0.036
unn-0.036
ovable-0.035
sie-0.035
 lamb-0.034
Token x
Token ID2124
Feature activation-5.05e-02

Pos logit contributions
unn+0.043
amb+0.042
ovable+0.042
 lamb+0.041
oll+0.041
Neg logit contributions
 barking-0.026
 biting-0.026
ue-0.024
 Sam-0.022
 attacking-0.022
Token-
Token ID12
Feature activation+1.34e-02

Pos logit contributions
 barking+0.080
 biting+0.078
ue+0.077
 Sam+0.072
 attacking+0.069
Neg logit contributions
oll-0.087
unn-0.086
ovable-0.086
 luggage-0.085
amb-0.085
Token her
Token ID607
Feature activation+3.67e-02

Pos logit contributions
 barking+0.010
 biting+0.010
ue+0.009
 Sam+0.008
 attacking+0.008
Neg logit contributions
unn-0.031
amb-0.030
ovable-0.030
oll-0.030
sie-0.030
Token mom
Token ID1995
Feature activation-2.21e-02

Pos logit contributions
unn+0.091
amb+0.090
sie+0.085
ovable+0.085
ance+0.084
Neg logit contributions
 mess-0.040
 biting-0.039
 barking-0.037
 Sam-0.036
 mistake-0.035
Tokenmy
Token ID1820
Feature activation-2.11e-03

Pos logit contributions
 biting+0.021
 barking+0.021
ue+0.019
 attacking+0.017
 Sam+0.017
Neg logit contributions
unn-0.052
amb-0.052
ovable-0.051
oll-0.051
 lamb-0.051
Token gave
Token ID2921
Feature activation+2.66e-02

Pos logit contributions
 barking+0.003
 biting+0.003
ue+0.003
 Sam+0.003
 mess+0.003
Neg logit contributions
unn-0.004
amb-0.004
ovable-0.004
 luggage-0.004
sie-0.004
Token her
Token ID607
Feature activation+5.33e-02

Pos logit contributions
unn+0.059
amb+0.058
ovable+0.056
oll+0.055
ouses+0.055
Neg logit contributions
 biting-0.032
 barking-0.032
 Sam-0.030
ue-0.030
 mess-0.029
Token permission
Token ID7170
Feature activation+1.63e-01

Pos logit contributions
unn+0.162
amb+0.155
ovable+0.144
iya+0.143
 compass+0.143
Neg logit contributions
 Sam-0.033
 half-0.033
 again-0.032
 Tom-0.030
 Sue-0.029
Token to
Token ID284
Feature activation-7.44e-03

Pos logit contributions
unn+0.419
 lamb+0.400
sie+0.395
ovable+0.394
 luggage+0.389
Neg logit contributions
 barking-0.141
 biting-0.137
 Sam-0.131
ue-0.128
 Tom-0.123
Token open
Token ID1280
Feature activation-5.24e-02

Pos logit contributions
 barking+0.011
 biting+0.011
ue+0.010
 Sam+0.010
 mess+0.009
Neg logit contributions
unn-0.015
amb-0.014
ovable-0.014
oll-0.014
 lamb-0.014
Token the
Token ID262
Feature activation+2.88e-02

Pos logit contributions
 barking+0.048
 biting+0.047
ue+0.044
 Sam+0.039
 attacking+0.038
Neg logit contributions
unn-0.128
amb-0.126
ovable-0.125
 lamb-0.123
 luggage-0.123
Token envelope
Token ID22878
Feature activation-1.87e-02

Pos logit contributions
unn+0.055
amb+0.055
ovable+0.053
sie+0.052
oll+0.052
Neg logit contributions
 barking-0.043
 biting-0.043
ue-0.040
 mess-0.039
 Sam-0.038
Token.
Token ID13
Feature activation+2.09e-02

Pos logit contributions
 barking+0.017
 biting+0.017
ue+0.016
 Sam+0.014
 mess+0.014
Neg logit contributions
unn-0.046
amb-0.045
ovable-0.045
 lamb-0.044
sie-0.044
Token excited
Token ID6568
Feature activation+8.64e-03

Pos logit contributions
unn+0.016
amb+0.015
ovable+0.015
sie+0.015
 lamb+0.015
Neg logit contributions
 barking-0.016
 biting-0.016
ue-0.015
 Sam-0.015
 attacking-0.014
Token.
Token ID13
Feature activation+8.48e-03

Pos logit contributions
amb+0.017
unn+0.016
 wife+0.016
 luggage+0.016
edly+0.016
Neg logit contributions
 biting-0.011
 barking-0.011
ue-0.011
 attacking-0.010
 dent-0.010
Token She
Token ID1375
Feature activation+1.26e-02

Pos logit contributions
unn+0.020
sie+0.020
 luggage+0.020
ovable+0.020
 lamb+0.020
Neg logit contributions
 barking-0.008
 Sam-0.008
ue-0.008
 biting-0.007
 Tom-0.007
Token completed
Token ID5668
Feature activation+1.08e-02

Pos logit contributions
unn+0.028
 lamb+0.027
amb+0.026
 luggage+0.026
ovable+0.026
Neg logit contributions
ue-0.015
 Sam-0.015
 barking-0.015
 mess-0.014
 biting-0.014
Token the
Token ID262
Feature activation+4.50e-02

Pos logit contributions
 luggage+0.023
nam+0.022
sie+0.022
ovable+0.022
 lamb+0.022
Neg logit contributions
 barking-0.013
 biting-0.013
ue-0.012
 attacking-0.011
 erupt-0.010
Token tasks
Token ID8861
Feature activation+1.63e-01

Pos logit contributions
unn+0.082
amb+0.081
ovable+0.079
oll+0.077
sie+0.077
Neg logit contributions
 barking-0.070
 biting-0.070
ue-0.067
 mess-0.063
 Sam-0.063
Token and
Token ID290
Feature activation+2.62e-02

Pos logit contributions
amb+0.341
unn+0.339
ovable+0.339
sie+0.337
oll+0.336
Neg logit contributions
 barking-0.196
 biting-0.192
ue-0.180
 attacking-0.166
 Sam-0.162
Token opened
Token ID4721
Feature activation-1.98e-02

Pos logit contributions
unn+0.055
amb+0.054
ovable+0.052
 lamb+0.051
oll+0.050
Neg logit contributions
 Sam-0.037
 barking-0.036
 biting-0.036
 Tom-0.035
ue-0.034
Token the
Token ID262
Feature activation+3.19e-02

Pos logit contributions
 barking+0.021
 biting+0.021
ue+0.020
 attacking+0.018
 mess+0.017
Neg logit contributions
 luggage-0.044
amb-0.044
unn-0.044
sie-0.044
 lamb-0.044
Token door
Token ID3420
Feature activation-1.66e-04

Pos logit contributions
amb+0.074
unn+0.074
ovable+0.070
sie+0.069
 lamb+0.069
Neg logit contributions
 barking-0.035
 biting-0.034
ue-0.033
 mess-0.032
 Sam-0.030
Token.
Token ID13
Feature activation+8.12e-03

Pos logit contributions
 biting+0.000
 barking+0.000
ue+0.000
 attacking+0.000
 Sam+0.000
Neg logit contributions
amb-0.000
ovable-0.000
 luggage-0.000
unn-0.000
oll-0.000
Token his
Token ID465
Feature activation+6.10e-02

Pos logit contributions
 wife+0.032
 luggage+0.031
amb+0.030
 servants+0.030
 lamb+0.030
Neg logit contributions
ue-0.022
 biting-0.021
 barking-0.020
 dent-0.019
 mess-0.019
Token mom
Token ID1995
Feature activation+5.47e-03

Pos logit contributions
unn+0.109
amb+0.107
ovable+0.105
sie+0.103
oll+0.103
Neg logit contributions
 barking-0.100
 biting-0.098
ue-0.094
 Sam-0.092
 mess-0.088
Token and
Token ID290
Feature activation+3.78e-02

Pos logit contributions
unn+0.013
amb+0.012
ovable+0.012
 lamb+0.012
oll+0.012
Neg logit contributions
 barking-0.006
 biting-0.006
ue-0.005
 Sam-0.005
 attacking-0.005
Token asked
Token ID1965
Feature activation+4.81e-02

Pos logit contributions
unn+0.077
ovable+0.075
amb+0.074
oll+0.073
 luggage+0.072
Neg logit contributions
 barking-0.054
 biting-0.053
 Sam-0.051
ue-0.048
 Tom-0.047
Token for
Token ID329
Feature activation+2.25e-02

Pos logit contributions
unn+0.106
amb+0.106
ovable+0.106
 luggage+0.104
oll+0.104
Neg logit contributions
 barking-0.053
 biting-0.052
ue-0.050
 mess-0.045
 attacking-0.044
Token permission
Token ID7170
Feature activation+1.62e-01

Pos logit contributions
unn+0.047
amb+0.047
ovable+0.045
oll+0.044
 lamb+0.044
Neg logit contributions
 biting-0.030
 barking-0.030
 Sam-0.028
ue-0.027
 attacking-0.026
Token to
Token ID284
Feature activation+2.18e-03

Pos logit contributions
unn+0.384
amb+0.377
ovable+0.373
 lamb+0.371
sie+0.370
Neg logit contributions
 barking-0.163
 biting-0.160
ue-0.148
 Sam-0.137
 attacking-0.129
Token take
Token ID1011
Feature activation+4.52e-02

Pos logit contributions
unn+0.004
amb+0.004
oll+0.004
ovable+0.004
 lamb+0.004
Neg logit contributions
 barking-0.003
 biting-0.003
ue-0.003
 Sam-0.003
 mess-0.003
Token the
Token ID262
Feature activation+3.20e-02

Pos logit contributions
unn+0.109
amb+0.107
ovable+0.104
 lamb+0.103
oll+0.103
Neg logit contributions
 barking-0.045
 biting-0.045
 Sam-0.042
ue-0.040
 attacking-0.037
Token symbol
Token ID6194
Feature activation+2.91e-02

Pos logit contributions
unn+0.060
amb+0.059
ovable+0.058
 luggage+0.057
 lamb+0.057
Neg logit contributions
 barking-0.048
 biting-0.047
ue-0.045
 Sam-0.042
 attacking-0.042
Token home
Token ID1363
Feature activation+1.43e-02

Pos logit contributions
unn+0.072
amb+0.070
ovable+0.069
 lamb+0.068
 luggage+0.068
Neg logit contributions
 barking-0.026
 biting-0.026
 Sam-0.024
ue-0.024
 erupt-0.021
Token They
Token ID1119
Feature activation+1.49e-02

Pos logit contributions
unn+0.029
ovable+0.029
oll+0.028
amb+0.028
sie+0.028
Neg logit contributions
 barking-0.022
 biting-0.021
ue-0.020
 Sam-0.020
 Tom-0.019
Token packed
Token ID11856
Feature activation+4.90e-02

Pos logit contributions
unn+0.037
oll+0.034
ouses+0.034
edly+0.034
amb+0.033
Neg logit contributions
 Sam-0.017
 barking-0.016
 Tim-0.015
 hit-0.015
 Tom-0.015
Token all
Token ID477
Feature activation+6.18e-02

Pos logit contributions
unn+0.107
amb+0.106
ovable+0.104
 luggage+0.104
 lamb+0.103
Neg logit contributions
 barking-0.057
 biting-0.057
ue-0.054
 Sam-0.049
 attacking-0.048
Token of
Token ID286
Feature activation+2.76e-02

Pos logit contributions
unn+0.161
oll+0.160
ovable+0.159
edly+0.158
amb+0.157
Neg logit contributions
 barking-0.043
 Sam-0.042
 biting-0.042
ue-0.039
 Sue-0.037
Token their
Token ID511
Feature activation+5.49e-02

Pos logit contributions
amb+0.073
unn+0.072
oll+0.071
ovable+0.071
sie+0.069
Neg logit contributions
 barking-0.023
 biting-0.023
 Sam-0.021
ue-0.018
 mess-0.018
Token belongings
Token ID30985
Feature activation+1.62e-01

Pos logit contributions
amb+0.118
oll+0.116
unn+0.110
edly+0.106
ovable+0.105
Neg logit contributions
 barking-0.080
 biting-0.076
 mess-0.072
 Sam-0.072
 hiding-0.069
Token and
Token ID290
Feature activation+3.57e-02

Pos logit contributions
unn+0.429
ovable+0.420
amb+0.420
oll+0.418
 lamb+0.412
Neg logit contributions
 barking-0.127
 biting-0.123
 Sam-0.100
 mess-0.099
ue-0.099
Token moved
Token ID3888
Feature activation+2.21e-02

Pos logit contributions
unn+0.077
ovable+0.077
amb+0.076
oll+0.075
uster+0.072
Neg logit contributions
 Sam-0.051
 barking-0.051
 Tom-0.047
 Tim-0.047
 biting-0.047
Token to
Token ID284
Feature activation+1.45e-03

Pos logit contributions
unn+0.052
oll+0.050
amb+0.049
ovable+0.049
sie+0.047
Neg logit contributions
 barking-0.026
 biting-0.025
 Sam-0.024
ue-0.021
 attacking-0.021
Token a
Token ID257
Feature activation+3.16e-02

Pos logit contributions
unn+0.004
amb+0.004
oll+0.004
ovable+0.004
ouses+0.004
Neg logit contributions
 barking-0.001
 Sam-0.001
 biting-0.001
 Tom-0.001
 mess-0.001
Token new
Token ID649
Feature activation+2.86e-02

Pos logit contributions
unn+0.056
amb+0.055
ovable+0.054
 lamb+0.053
oll+0.053
Neg logit contributions
 barking-0.050
 biting-0.050
ue-0.047
 Sam-0.045
 attacking-0.044
Token As
Token ID1081
Feature activation+9.59e-02

Pos logit contributions
amb+0.042
unn+0.042
ovable+0.041
 luggage+0.040
 lamb+0.040
Neg logit contributions
 biting-0.027
 barking-0.027
ue-0.024
 attacking-0.024
 mess-0.023
Token they
Token ID484
Feature activation+5.48e-02

Pos logit contributions
unn+0.197
amb+0.193
 lamb+0.193
 luggage+0.192
ovable+0.192
Neg logit contributions
 barking-0.122
 biting-0.122
ue-0.116
 attacking-0.106
 Sam-0.104
Token packed
Token ID11856
Feature activation+5.32e-02

Pos logit contributions
ovable+0.112
unn+0.111
ouses+0.111
 wife+0.106
essa+0.106
Neg logit contributions
 barking-0.071
 bite-0.068
 biting-0.067
 Tim-0.065
 Tom-0.064
Token all
Token ID477
Feature activation+6.54e-02

Pos logit contributions
 luggage+0.108
sie+0.104
 wife+0.103
amb+0.103
unn+0.102
Neg logit contributions
 barking-0.070
 biting-0.069
ue-0.066
 attacking-0.060
 erupt-0.058
Token their
Token ID511
Feature activation+5.14e-02

Pos logit contributions
unn+0.175
amb+0.175
oll+0.174
ovable+0.173
edly+0.172
Neg logit contributions
 Sam-0.041
 barking-0.041
 biting-0.040
ue-0.037
 Sue-0.034
Token belongings
Token ID30985
Feature activation+1.62e-01

Pos logit contributions
amb+0.108
la+0.101
unn+0.101
 groom+0.101
oll+0.101
Neg logit contributions
 barking-0.073
 biting-0.073
 Sam-0.069
 mess-0.068
 attacking-0.066
Token,
Token ID11
Feature activation-6.51e-03

Pos logit contributions
unn+0.469
amb+0.466
ovable+0.461
oll+0.457
 lamb+0.453
Neg logit contributions
 barking-0.081
 biting-0.081
 mess-0.062
 Sam-0.060
ue-0.059
Token the
Token ID262
Feature activation+9.27e-04

Pos logit contributions
 barking+0.005
 biting+0.005
 Sam+0.005
ue+0.005
 Tom+0.004
Neg logit contributions
unn-0.017
amb-0.017
oll-0.016
ovable-0.016
ouses-0.016
Token family
Token ID1641
Feature activation+8.43e-02

Pos logit contributions
unn+0.001
amb+0.001
ovable+0.001
oll+0.001
sie+0.001
Neg logit contributions
 barking-0.002
 biting-0.002
 mess-0.002
 Sam-0.002
 mistake-0.002
Token was
Token ID373
Feature activation-1.26e-03

Pos logit contributions
unn+0.170
amb+0.163
ovable+0.159
oll+0.157
nam+0.157
Neg logit contributions
 barking-0.121
 biting-0.118
 Sam-0.110
 mess-0.109
ue-0.105
Token looking
Token ID2045
Feature activation+2.00e-02

Pos logit contributions
 barking+0.002
 biting+0.002
ue+0.002
 attacking+0.002
 Sam+0.002
Neg logit contributions
unn-0.003
amb-0.002
ovable-0.002
sie-0.002
oll-0.002
Token to
Token ID284
Feature activation-9.37e-03

Pos logit contributions
unn+0.104
amb+0.103
ovable+0.101
oll+0.100
sie+0.100
Neg logit contributions
 barking-0.060
 biting-0.059
ue-0.055
 Sam-0.052
 attacking-0.050
Token work
Token ID670
Feature activation+5.71e-02

Pos logit contributions
 biting+0.014
 barking+0.014
ue+0.013
 attacking+0.012
 mess+0.011
Neg logit contributions
 luggage-0.017
 wife-0.017
unn-0.017
amb-0.017
 lamb-0.017
Token,
Token ID11
Feature activation-1.35e-02

Pos logit contributions
 luggage+0.114
amb+0.112
ovable+0.111
unn+0.111
edly+0.110
Neg logit contributions
ue-0.072
 barking-0.071
 biting-0.070
 attacking-0.061
 Sam-0.060
Token attaching
Token ID39550
Feature activation+5.71e-02

Pos logit contributions
 barking+0.016
 biting+0.016
ue+0.014
 Sam+0.014
 mess+0.013
Neg logit contributions
unn-0.030
amb-0.030
ovable-0.029
oll-0.029
sie-0.029
Token all
Token ID477
Feature activation+7.29e-02

Pos logit contributions
unn+0.125
amb+0.124
ovable+0.123
oll+0.121
 lamb+0.121
Neg logit contributions
 barking-0.066
 biting-0.066
ue-0.062
 Sam-0.057
 attacking-0.056
Token kinds
Token ID6982
Feature activation+1.62e-01

Pos logit contributions
unn+0.151
amb+0.149
ovable+0.147
 luggage+0.147
 lamb+0.145
Neg logit contributions
 barking-0.093
 biting-0.092
ue-0.087
 attacking-0.079
 mess-0.077
Token of
Token ID286
Feature activation+4.21e-02

Pos logit contributions
 lamb+0.322
amb+0.321
sie+0.318
unn+0.315
iya+0.312
Neg logit contributions
 biting-0.225
 barking-0.214
 Sam-0.204
ue-0.199
 mess-0.189
Token things
Token ID1243
Feature activation+1.24e-01

Pos logit contributions
amb+0.084
unn+0.082
ovable+0.081
oll+0.080
sie+0.079
Neg logit contributions
 biting-0.059
 barking-0.058
ue-0.054
 mess-0.054
 Sam-0.053
Token to
Token ID284
Feature activation+3.44e-03

Pos logit contributions
unn+0.278
amb+0.274
ovable+0.271
 lamb+0.269
sie+0.268
Neg logit contributions
 barking-0.138
 biting-0.138
ue-0.126
 Sam-0.122
 attacking-0.115
Token the
Token ID262
Feature activation+2.90e-02

Pos logit contributions
unn+0.008
amb+0.008
ovable+0.007
oll+0.007
sie+0.007
Neg logit contributions
 barking-0.004
 biting-0.004
 Sam-0.004
ue-0.004
 mess-0.004
Token walls
Token ID7714
Feature activation+1.60e-02

Pos logit contributions
unn+0.053
ovable+0.052
amb+0.052
 luggage+0.051
sie+0.051
Neg logit contributions
 barking-0.044
 biting-0.044
ue-0.042
 Sam-0.039
 attacking-0.039
Token her
Token ID607
Feature activation+3.77e-02

Pos logit contributions
unn+0.033
amb+0.033
oll+0.032
ouses+0.032
ovable+0.032
Neg logit contributions
 biting-0.019
 barking-0.018
 Sam-0.018
 Sue-0.016
 Tom-0.016
Token family
Token ID1641
Feature activation+9.35e-02

Pos logit contributions
uster+0.078
amb+0.076
year+0.074
unn+0.072
ance+0.072
Neg logit contributions
 biting-0.052
 barking-0.051
 Sam-0.051
 mess-0.050
 Tom-0.048
Token sorted
Token ID23243
Feature activation+5.14e-02

Pos logit contributions
unn+0.203
amb+0.190
ovable+0.185
oll+0.184
 lamb+0.184
Neg logit contributions
 biting-0.109
 Sam-0.107
 barking-0.106
 Tim-0.105
 mess-0.105
Token through
Token ID832
Feature activation+9.95e-03

Pos logit contributions
unn+0.113
amb+0.113
ovable+0.110
oll+0.109
 lamb+0.106
Neg logit contributions
 biting-0.061
 barking-0.060
ue-0.056
 Sam-0.056
 mess-0.054
Token their
Token ID511
Feature activation+5.22e-02

Pos logit contributions
amb+0.024
unn+0.024
ovable+0.023
oll+0.023
ouses+0.022
Neg logit contributions
 biting-0.010
 barking-0.010
 mess-0.009
ue-0.009
 Sam-0.009
Token belongings
Token ID30985
Feature activation+1.62e-01

Pos logit contributions
amb+0.091
unn+0.089
ovable+0.086
oll+0.085
la+0.082
Neg logit contributions
 biting-0.089
 barking-0.088
 mess-0.084
ue-0.083
 attacking-0.079
Token,
Token ID11
Feature activation-1.96e-02

Pos logit contributions
unn+0.414
amb+0.413
ovable+0.404
oll+0.401
 lamb+0.400
Neg logit contributions
 barking-0.134
 biting-0.134
ue-0.116
 Sam-0.112
 mess-0.111
Token picking
Token ID10868
Feature activation+2.26e-02

Pos logit contributions
 barking+0.025
 biting+0.025
 Sam+0.022
ue+0.022
 mess+0.022
Neg logit contributions
unn-0.042
amb-0.042
ovable-0.041
oll-0.040
sie-0.039
Token out
Token ID503
Feature activation-5.89e-03

Pos logit contributions
unn+0.049
amb+0.049
ovable+0.048
oll+0.047
 lamb+0.046
Neg logit contributions
 biting-0.027
 barking-0.027
ue-0.024
 Sam-0.024
 mess-0.024
Token all
Token ID477
Feature activation+5.13e-02

Pos logit contributions
 biting+0.006
 barking+0.006
 mess+0.006
 Sam+0.006
 Sue+0.006
Neg logit contributions
amb-0.014
unn-0.014
ovable-0.013
oll-0.013
edly-0.013
Token of
Token ID286
Feature activation+1.87e-02

Pos logit contributions
unn+0.107
amb+0.105
ovable+0.104
sie+0.104
 luggage+0.104
Neg logit contributions
 barking-0.065
 biting-0.064
ue-0.060
 attacking-0.055
 Sam-0.054
Token,
Token ID11
Feature activation-1.46e-02

Pos logit contributions
 barking+0.014
 biting+0.014
ue+0.013
 attacking+0.012
 mess+0.011
Neg logit contributions
amb-0.036
 lamb-0.036
unn-0.036
 luggage-0.036
sie-0.035
Token they
Token ID484
Feature activation+2.79e-02

Pos logit contributions
 barking+0.024
 biting+0.024
ue+0.023
 Sam+0.022
 mess+0.021
Neg logit contributions
unn-0.025
ovable-0.025
amb-0.025
oll-0.024
sie-0.024
Token packed
Token ID11856
Feature activation+4.42e-02

Pos logit contributions
unn+0.057
amb+0.053
ouses+0.052
ovable+0.052
 downs+0.051
Neg logit contributions
 barking-0.040
 Sam-0.039
 biting-0.038
ue-0.038
 Tom-0.038
Token up
Token ID510
Feature activation-2.85e-02

Pos logit contributions
amb+0.092
 luggage+0.092
unn+0.091
sie+0.090
 lamb+0.090
Neg logit contributions
 barking-0.054
 biting-0.054
ue-0.051
 attacking-0.046
 Sam-0.045
Token their
Token ID511
Feature activation+4.15e-02

Pos logit contributions
 barking+0.029
 biting+0.028
ue+0.026
 Sam+0.026
 Sue+0.024
Neg logit contributions
unn-0.068
ovable-0.067
amb-0.066
oll-0.065
 lamb-0.064
Token belongings
Token ID30985
Feature activation+1.62e-01

Pos logit contributions
amb+0.076
unn+0.075
ovable+0.074
oll+0.072
ibr+0.069
Neg logit contributions
 biting-0.067
 barking-0.067
ue-0.063
 mess-0.061
 bite-0.060
Token and
Token ID290
Feature activation+2.62e-02

Pos logit contributions
unn+0.383
ovable+0.375
amb+0.374
oll+0.367
 lamb+0.367
Neg logit contributions
 barking-0.163
 biting-0.161
ue-0.147
 Sam-0.135
 mess-0.134
Token headed
Token ID9153
Feature activation+8.32e-02

Pos logit contributions
unn+0.048
ovable+0.047
amb+0.045
oll+0.043
sie+0.043
Neg logit contributions
 barking-0.044
 biting-0.043
 Sam-0.043
 Tom-0.041
 Sue-0.041
Token home
Token ID1363
Feature activation+3.06e-02

Pos logit contributions
unn+0.151
amb+0.148
ovable+0.147
oll+0.144
 lamb+0.143
Neg logit contributions
 barking-0.129
 biting-0.128
ue-0.121
 Sam-0.115
 attacking-0.113
Token.
Token ID13
Feature activation+1.68e-02

Pos logit contributions
unn+0.074
amb+0.073
ovable+0.073
 lamb+0.072
oll+0.072
Neg logit contributions
 barking-0.028
 biting-0.028
ue-0.026
 Sam-0.023
 attacking-0.022
Token The
Token ID383
Feature activation-9.46e-03

Pos logit contributions
unn+0.031
ovable+0.030
sie+0.030
 compass+0.030
amb+0.029
Neg logit contributions
 barking-0.026
 biting-0.025
ue-0.024
 Sam-0.024
 Tom-0.023
Tokens
Token ID82
Feature activation-3.93e-02

Pos logit contributions
unn+0.020
amb+0.020
ovable+0.019
oll+0.019
sie+0.019
Neg logit contributions
 barking-0.014
 biting-0.014
ue-0.013
 Sam-0.012
 attacking-0.012
Token important
Token ID1593
Feature activation+4.79e-02

Pos logit contributions
 barking+0.051
 biting+0.051
ue+0.047
 Sam+0.045
 mess+0.045
Neg logit contributions
unn-0.084
amb-0.083
sie-0.080
 lamb-0.079
ovable-0.078
Token to
Token ID284
Feature activation-1.03e-02

Pos logit contributions
unn+0.112
ovable+0.112
amb+0.111
sie+0.111
oll+0.110
Neg logit contributions
 barking-0.051
 biting-0.049
 Sam-0.043
ue-0.043
 mess-0.041
Token ask
Token ID1265
Feature activation+4.94e-02

Pos logit contributions
 barking+0.012
 biting+0.012
 Sam+0.012
ue+0.011
 mess+0.011
Neg logit contributions
unn-0.023
amb-0.022
ovable-0.022
ouses-0.022
oll-0.022
Token for
Token ID329
Feature activation+1.84e-02

Pos logit contributions
unn+0.121
amb+0.120
ouses+0.114
ovable+0.114
sie+0.113
Neg logit contributions
 barking-0.051
 biting-0.050
 Sam-0.050
 Tom-0.045
 Mia-0.042
Token permission
Token ID7170
Feature activation+1.61e-01

Pos logit contributions
amb+0.042
unn+0.041
ovable+0.039
ouses+0.038
oll+0.038
Neg logit contributions
 biting-0.024
 barking-0.023
 Sam-0.022
 attacking-0.020
 catching-0.020
Token before
Token ID878
Feature activation+1.14e-02

Pos logit contributions
unn+0.406
 lamb+0.398
sie+0.397
amb+0.395
ovable+0.393
Neg logit contributions
 barking-0.137
 biting-0.131
 Sam-0.123
ue-0.120
 Tom-0.112
Token taking
Token ID2263
Feature activation+6.32e-02

Pos logit contributions
unn+0.024
amb+0.024
ovable+0.024
oll+0.024
 lamb+0.024
Neg logit contributions
 barking-0.014
 biting-0.014
ue-0.013
 Sam-0.012
 attacking-0.012
Token something
Token ID1223
Feature activation+7.62e-02

Pos logit contributions
unn+0.151
amb+0.148
ovable+0.145
oll+0.142
sie+0.141
Neg logit contributions
 biting-0.068
 barking-0.067
 Sam-0.061
 attacking-0.055
 Tom-0.055
Token that
Token ID326
Feature activation-3.13e-02

Pos logit contributions
amb+0.175
unn+0.165
ovable+0.163
years+0.161
edly+0.159
Neg logit contributions
 biting-0.094
 barking-0.088
 mess-0.087
 Sam-0.077
 catching-0.076
Token doesn
Token ID1595
Feature activation-3.03e-03

Pos logit contributions
 biting+0.041
 barking+0.040
 Sam+0.038
ue+0.037
 mess+0.036
Neg logit contributions
unn-0.065
amb-0.065
ovable-0.063
 lamb-0.063
sie-0.063
Token Miss
Token ID4544
Feature activation+1.99e-02

Pos logit contributions
 barking+0.008
 biting+0.007
ue+0.007
 Sam+0.006
 mess+0.006
Neg logit contributions
unn-0.022
amb-0.022
ovable-0.022
oll-0.022
 lamb-0.021
Token Sarah
Token ID10490
Feature activation+4.41e-03

Pos logit contributions
unn+0.047
amb+0.046
 lamb+0.046
ovable+0.046
oll+0.046
Neg logit contributions
 barking-0.020
ue-0.019
 biting-0.018
 Sam-0.018
 Tom-0.015
Token gave
Token ID2921
Feature activation+4.21e-02

Pos logit contributions
unn+0.009
ovable+0.009
amb+0.008
oll+0.008
nam+0.008
Neg logit contributions
 barking-0.006
ue-0.006
 biting-0.006
 Sam-0.006
 mess-0.006
Token all
Token ID477
Feature activation+6.91e-02

Pos logit contributions
unn+0.106
amb+0.103
oll+0.100
ovable+0.099
edly+0.099
Neg logit contributions
 Sam-0.043
 barking-0.042
 Tom-0.040
 biting-0.040
ue-0.039
Token the
Token ID262
Feature activation+4.13e-02

Pos logit contributions
unn+0.159
amb+0.158
ovable+0.154
oll+0.153
edly+0.153
Neg logit contributions
 barking-0.072
 Sam-0.072
 biting-0.070
ue-0.068
 Sue-0.067
Token pupils
Token ID25613
Feature activation+1.61e-01

Pos logit contributions
amb+0.082
oll+0.079
unn+0.078
ovable+0.078
edly+0.077
Neg logit contributions
 barking-0.061
 biting-0.060
 mess-0.056
 Sam-0.055
ue-0.055
Token a
Token ID257
Feature activation+3.71e-02

Pos logit contributions
unn+0.399
amb+0.376
ovable+0.362
oll+0.357
 lamb+0.356
Neg logit contributions
 barking-0.154
 biting-0.154
 half-0.149
 bite-0.148
 Sam-0.146
Token hug
Token ID16225
Feature activation+8.15e-03

Pos logit contributions
unn+0.075
amb+0.074
ovable+0.071
oll+0.070
edly+0.070
Neg logit contributions
 barking-0.054
 biting-0.052
 mess-0.049
ue-0.049
 bite-0.048
Token and
Token ID290
Feature activation+3.61e-02

Pos logit contributions
unn+0.023
sie+0.022
amb+0.021
ovable+0.021
nam+0.021
Neg logit contributions
 barking-0.006
 biting-0.006
 Sam-0.005
ue-0.005
 Tom-0.004
Token said
Token ID531
Feature activation+1.88e-02

Pos logit contributions
amb+0.086
unn+0.086
uster+0.085
ovable+0.084
sie+0.083
Neg logit contributions
 Sam-0.044
 barking-0.040
 Sue-0.039
 Tom-0.039
 biting-0.038
Token,
Token ID11
Feature activation-8.80e-03

Pos logit contributions
unn+0.052
amb+0.051
ovable+0.050
 lamb+0.050
 luggage+0.050
Neg logit contributions
 biting-0.012
 barking-0.012
ue-0.010
 Sam-0.009
 attacking-0.008
Token He
Token ID679
Feature activation+1.03e-02

Pos logit contributions
unn+0.041
amb+0.041
ovable+0.040
 luggage+0.040
 lamb+0.040
Neg logit contributions
 biting-0.026
 barking-0.026
ue-0.024
 attacking-0.022
 Sam-0.022
Token asked
Token ID1965
Feature activation+4.70e-02

Pos logit contributions
unn+0.023
uster+0.023
oll+0.022
ouses+0.021
nam+0.021
Neg logit contributions
 barking-0.013
 biting-0.012
 hit-0.011
 mess-0.011
 Sam-0.011
Token his
Token ID465
Feature activation+6.51e-02

Pos logit contributions
unn+0.126
amb+0.124
ouses+0.123
oll+0.123
ovable+0.122
Neg logit contributions
 Sam-0.035
 barking-0.035
 biting-0.035
 Tim-0.032
 Tom-0.031
Token mom
Token ID1995
Feature activation+3.98e-03

Pos logit contributions
sie+0.131
amb+0.131
edly+0.130
unn+0.128
ovable+0.128
Neg logit contributions
 barking-0.090
 Sam-0.086
 biting-0.086
 mistake-0.080
 mess-0.076
Token for
Token ID329
Feature activation+1.43e-02

Pos logit contributions
unn+0.009
amb+0.009
ovable+0.009
edly+0.009
 lamb+0.009
Neg logit contributions
 barking-0.004
 biting-0.004
ue-0.004
 Sam-0.004
 Tom-0.003
Token permission
Token ID7170
Feature activation+1.61e-01

Pos logit contributions
unn+0.033
amb+0.033
ouses+0.032
ovable+0.032
oll+0.031
Neg logit contributions
 biting-0.018
 barking-0.017
 Sam-0.016
 catching-0.015
 Tom-0.015
Token and
Token ID290
Feature activation+2.39e-02

Pos logit contributions
unn+0.404
 lamb+0.391
sie+0.391
ovable+0.388
amb+0.387
Neg logit contributions
 barking-0.147
 biting-0.139
 Sam-0.128
ue-0.124
 Tom-0.114
Token she
Token ID673
Feature activation+3.13e-02

Pos logit contributions
unn+0.056
oll+0.055
ovable+0.055
amb+0.054
uster+0.053
Neg logit contributions
 Sam-0.029
 barking-0.028
 biting-0.027
 Sue-0.026
 Tom-0.026
Token agreed
Token ID4987
Feature activation+3.30e-02

Pos logit contributions
unn+0.066
uster+0.063
 luggage+0.062
 lamb+0.061
amb+0.061
Neg logit contributions
ue-0.041
 Sam-0.040
 barking-0.039
 biting-0.038
 Sue-0.037
Token.
Token ID13
Feature activation+2.49e-02

Pos logit contributions
unn+0.074
amb+0.073
ovable+0.073
 luggage+0.072
oll+0.072
Neg logit contributions
 barking-0.036
 biting-0.036
ue-0.034
 Sam-0.030
 attacking-0.030
Token She
Token ID1375
Feature activation+3.49e-02

Pos logit contributions
amb+0.050
unn+0.048
 wife+0.047
 luggage+0.047
ovable+0.047
Neg logit contributions
 biting-0.035
 barking-0.034
 attacking-0.031
 mess-0.031
ue-0.031
Token,
Token ID11
Feature activation-7.94e-02

Pos logit contributions
 barking+0.119
 biting+0.108
ue+0.103
 attacking+0.097
 destroying+0.089
Neg logit contributions
unn-0.258
 luggage-0.254
 wife-0.251
 husband-0.248
 servants-0.248
Token fifteen
Token ID17280
Feature activation-8.67e-02

Pos logit contributions
 barking+0.104
 biting+0.100
ue+0.096
 attacking+0.088
 mess+0.085
Neg logit contributions
unn-0.158
amb-0.158
oll-0.156
 luggage-0.154
ovable-0.154
Token,
Token ID11
Feature activation-7.64e-02

Pos logit contributions
 barking+0.101
 biting+0.091
 attacking+0.086
ue+0.084
 destroying+0.080
Neg logit contributions
unn-0.188
 servants-0.181
oll-0.179
sie-0.179
 wife-0.179
Token sixteen
Token ID27913
Feature activation-1.20e-01

Pos logit contributions
 barking+0.101
 biting+0.096
ue+0.093
 attacking+0.087
 Sam+0.083
Neg logit contributions
amb-0.149
oll-0.146
unn-0.146
 luggage-0.143
 wife-0.143
Token,
Token ID11
Feature activation-7.43e-02

Pos logit contributions
 barking+0.135
 biting+0.125
 attacking+0.116
ue+0.110
 destroying+0.108
Neg logit contributions
unn-0.269
 servants-0.254
 subjects-0.252
oll-0.252
 wife-0.251
Token seventeen
Token ID38741
Feature activation-1.77e-01

Pos logit contributions
 barking+0.101
 biting+0.095
ue+0.091
 attacking+0.088
 Sam+0.083
Neg logit contributions
amb-0.138
oll-0.136
 groom-0.136
 wife-0.135
edly-0.134
Token,
Token ID11
Feature activation-7.56e-02

Pos logit contributions
 barking+0.256
 attacking+0.227
 biting+0.218
 destroying+0.217
 collapse+0.208
Neg logit contributions
oll-0.329
unn-0.324
 husband-0.317
 penn-0.317
 wife-0.310
Token eighteen
Token ID29095
Feature activation-1.29e-01

Pos logit contributions
 barking+0.099
 biting+0.092
ue+0.090
 attacking+0.085
 mess+0.081
Neg logit contributions
amb-0.146
oll-0.142
 groom-0.142
 wife-0.140
 servants-0.139
Token,
Token ID11
Feature activation-7.03e-02

Pos logit contributions
 barking+0.151
 biting+0.135
 attacking+0.131
ue+0.127
 destroying+0.121
Neg logit contributions
unn-0.273
oll-0.266
 servants-0.265
 wife-0.263
 husband-0.263
Token nineteen
Token ID43063
Feature activation-8.02e-02

Pos logit contributions
 barking+0.090
 biting+0.084
ue+0.082
 attacking+0.077
 mess+0.073
Neg logit contributions
amb-0.138
oll-0.135
 groom-0.135
 wife-0.134
 servants-0.133
Token,
Token ID11
Feature activation-4.61e-02

Pos logit contributions
 barking+0.079
 biting+0.075
ue+0.066
 attacking+0.064
 Sam+0.059
Neg logit contributions
unn-0.190
amb-0.187
oll-0.186
ovable-0.186
 wife-0.186
Token,
Token ID11
Feature activation-7.92e-02

Pos logit contributions
 barking+0.145
 biting+0.126
ue+0.125
 attacking+0.124
 destroying+0.107
Neg logit contributions
 servants-0.226
 luggage-0.225
 wife-0.223
 husband-0.223
nam-0.218
Token fifteen
Token ID17280
Feature activation-8.66e-02

Pos logit contributions
 barking+0.114
 biting+0.107
ue+0.106
 attacking+0.099
 mess+0.092
Neg logit contributions
oll-0.143
 servants-0.140
amb-0.139
 groom-0.138
unn-0.138
Token,
Token ID11
Feature activation-7.65e-02

Pos logit contributions
 barking+0.123
 biting+0.108
 attacking+0.106
ue+0.103
 destroying+0.093
Neg logit contributions
 servants-0.168
oll-0.163
 subjects-0.161
unn-0.158
 luggage-0.157
Token sixteen
Token ID27913
Feature activation-1.20e-01

Pos logit contributions
 barking+0.113
 biting+0.104
ue+0.102
 attacking+0.098
 Sam+0.091
Neg logit contributions
oll-0.133
amb-0.130
 servants-0.130
edly-0.130
 groom-0.129
Token,
Token ID11
Feature activation-7.47e-02

Pos logit contributions
 barking+0.169
 biting+0.148
 attacking+0.146
ue+0.136
 destroying+0.126
Neg logit contributions
 servants-0.230
oll-0.224
 subjects-0.223
unn-0.221
 luggage-0.216
Token seventeen
Token ID38741
Feature activation-1.76e-01

Pos logit contributions
 barking+0.112
 biting+0.102
 attacking+0.098
ue+0.098
 Sam+0.090
Neg logit contributions
edly-0.124
 groom-0.124
oll-0.124
 royal-0.123
 servants-0.122
Token,
Token ID11
Feature activation-7.67e-02

Pos logit contributions
 barking+0.265
 attacking+0.237
 biting+0.228
 destroying+0.216
 collapse+0.209
Neg logit contributions
oll-0.323
 husband-0.309
 servants-0.306
 penn-0.303
unn-0.300
Token eighteen
Token ID29095
Feature activation-1.29e-01

Pos logit contributions
 barking+0.113
 biting+0.102
ue+0.099
 attacking+0.099
 Sam+0.090
Neg logit contributions
 groom-0.129
oll-0.128
amb-0.128
edly-0.127
 servants-0.127
Token,
Token ID11
Feature activation-7.16e-02

Pos logit contributions
 barking+0.167
 biting+0.148
 attacking+0.147
ue+0.142
 destroying+0.129
Neg logit contributions
 servants-0.258
oll-0.254
 husband-0.251
 subjects-0.251
 wife-0.250
Token nineteen
Token ID43063
Feature activation-8.03e-02

Pos logit contributions
 barking+0.102
 biting+0.093
ue+0.091
 attacking+0.090
 Sam+0.081
Neg logit contributions
 groom-0.124
amb-0.123
 servants-0.123
oll-0.123
 wife-0.120
Token,
Token ID11
Feature activation-4.74e-02

Pos logit contributions
 barking+0.095
 biting+0.089
 attacking+0.080
ue+0.077
 destroying+0.072
Neg logit contributions
oll-0.173
 wife-0.173
 servants-0.172
unn-0.171
ovable-0.171
Token And
Token ID843
Feature activation+4.00e-02

Pos logit contributions
unn+0.031
amb+0.030
ovable+0.030
sie+0.030
oll+0.030
Neg logit contributions
 barking-0.017
 biting-0.016
ue-0.015
 Sam-0.014
 attacking-0.014
Token it
Token ID340
Feature activation-4.97e-02

Pos logit contributions
unn+0.092
amb+0.090
oll+0.088
ovable+0.088
 luggage+0.087
Neg logit contributions
 barking-0.047
 biting-0.047
 Sam-0.046
 Tom-0.042
 Sue-0.042
Token is
Token ID318
Feature activation-7.49e-02

Pos logit contributions
 barking+0.060
 biting+0.059
ue+0.056
 Sam+0.051
 attacking+0.050
Neg logit contributions
unn-0.106
amb-0.105
ovable-0.104
sie-0.103
oll-0.103
Token not
Token ID407
Feature activation-1.24e-01

Pos logit contributions
 barking+0.107
 biting+0.105
ue+0.103
 Sam+0.093
 attacking+0.093
Neg logit contributions
unn-0.140
ovable-0.139
amb-0.138
 luggage-0.137
oll-0.137
Token too
Token ID1165
Feature activation-9.37e-02

Pos logit contributions
 barking+0.180
 biting+0.179
ue+0.166
 Sam+0.160
 mess+0.156
Neg logit contributions
unn-0.240
amb-0.238
ovable-0.233
sie-0.230
oll-0.229
Token high
Token ID1029
Feature activation-1.72e-01

Pos logit contributions
 barking+0.141
 biting+0.139
ue+0.137
 Sam+0.124
 attacking+0.124
Neg logit contributions
unn-0.171
amb-0.166
ovable-0.165
 lamb-0.164
 luggage-0.164
Token.
Token ID13
Feature activation-4.96e-02

Pos logit contributions
ue+0.240
 barking+0.229
 biting+0.220
 Sam+0.209
 Mia+0.197
Neg logit contributions
amb-0.334
unn-0.326
 luggage-0.322
oll-0.317
edly-0.313
Token I
Token ID314
Feature activation-1.65e-02

Pos logit contributions
 barking+0.062
 biting+0.061
ue+0.057
 Sam+0.053
 attacking+0.052
Neg logit contributions
unn-0.104
amb-0.104
ovable-0.102
oll-0.101
 lamb-0.100
Token will
Token ID481
Feature activation-3.48e-02

Pos logit contributions
 barking+0.027
 biting+0.026
ue+0.025
 Sam+0.024
 attacking+0.024
Neg logit contributions
unn-0.029
amb-0.028
ovable-0.028
 lamb-0.027
oll-0.027
Token go
Token ID467
Feature activation-1.95e-02

Pos logit contributions
 barking+0.044
 biting+0.044
ue+0.041
 Sam+0.039
 mess+0.038
Neg logit contributions
unn-0.073
amb-0.072
ovable-0.071
oll-0.070
sie-0.070
Token first
Token ID717
Feature activation-2.68e-02

Pos logit contributions
 barking+0.025
 biting+0.025
ue+0.023
 Sam+0.022
 attacking+0.021
Neg logit contributions
unn-0.041
amb-0.040
ovable-0.040
oll-0.039
sie-0.039
Token slippery
Token ID32911
Feature activation-1.21e-01

Pos logit contributions
ue+0.113
 barking+0.103
 biting+0.100
 Sam+0.091
 attacking+0.090
Neg logit contributions
 luggage-0.132
 lamb-0.130
oll-0.129
ovable-0.128
nam-0.126
Token and
Token ID290
Feature activation-5.16e-02

Pos logit contributions
 barking+0.147
ue+0.146
 biting+0.142
 Sam+0.126
 attacking+0.124
Neg logit contributions
unn-0.250
amb-0.247
 lamb-0.243
oll-0.242
 luggage-0.242
Token the
Token ID262
Feature activation-4.58e-02

Pos logit contributions
ue+0.070
 barking+0.069
 biting+0.067
 attacking+0.060
 Sam+0.058
Neg logit contributions
 luggage-0.097
unn-0.097
 lamb-0.097
nam-0.095
ovable-0.095
Token hill
Token ID12788
Feature activation-7.24e-02

Pos logit contributions
 barking+0.061
 biting+0.061
ue+0.056
 Sam+0.053
 mess+0.053
Neg logit contributions
unn-0.094
amb-0.093
ovable-0.091
sie-0.090
oll-0.089
Token is
Token ID318
Feature activation-8.53e-02

Pos logit contributions
 barking+0.081
 biting+0.079
ue+0.074
 Sam+0.068
 attacking+0.067
Neg logit contributions
unn-0.159
amb-0.156
ovable-0.156
 luggage-0.154
sie-0.153
Token steep
Token ID14559
Feature activation-1.67e-01

Pos logit contributions
ue+0.117
 barking+0.109
 biting+0.107
 Sam+0.095
 attacking+0.094
Neg logit contributions
 luggage-0.162
 lamb-0.159
oll-0.158
unn-0.157
ovable-0.157
Token.
Token ID13
Feature activation-5.13e-02

Pos logit contributions
ue+0.228
 barking+0.221
 biting+0.215
 Sam+0.198
 attacking+0.190
Neg logit contributions
amb-0.316
unn-0.312
 luggage-0.311
 lamb-0.308
edly-0.305
Token It
Token ID632
Feature activation-7.55e-02

Pos logit contributions
 barking+0.065
 biting+0.065
ue+0.061
 Sam+0.057
 attacking+0.056
Neg logit contributions
unn-0.107
amb-0.105
ovable-0.104
sie-0.103
 lamb-0.102
Token could
Token ID714
Feature activation-3.03e-02

Pos logit contributions
 barking+0.119
ue+0.118
 biting+0.117
 Sam+0.105
 attacking+0.104
Neg logit contributions
unn-0.132
ovable-0.127
amb-0.126
 luggage-0.125
oll-0.125
Token be
Token ID307
Feature activation-4.33e-02

Pos logit contributions
 barking+0.034
 biting+0.033
ue+0.033
 attacking+0.028
 Sam+0.027
Neg logit contributions
 lamb-0.064
 luggage-0.064
unn-0.064
ovable-0.063
oll-0.063
Token dangerous
Token ID4923
Feature activation-6.49e-02

Pos logit contributions
ue+0.059
 barking+0.058
 biting+0.058
 Sam+0.051
 attacking+0.050
Neg logit contributions
unn-0.082
 luggage-0.082
ovable-0.080
 lamb-0.080
oll-0.079
Token shiny
Token ID22441
Feature activation-1.09e-01

Pos logit contributions
 barking+0.093
 biting+0.092
ue+0.089
 Sam+0.081
 attacking+0.081
Neg logit contributions
unn-0.128
amb-0.125
ovable-0.125
 luggage-0.124
 lamb-0.124
Token,
Token ID11
Feature activation-9.68e-02

Pos logit contributions
 biting+0.108
 barking+0.108
ue+0.094
 mess+0.090
 Sam+0.090
Neg logit contributions
unn-0.259
amb-0.256
ovable-0.255
sie-0.252
oll-0.249
Token but
Token ID475
Feature activation-3.17e-02

Pos logit contributions
 barking+0.111
 biting+0.109
ue+0.105
 attacking+0.093
 Sam+0.092
Neg logit contributions
unn-0.208
amb-0.206
ovable-0.205
 luggage-0.204
 lamb-0.204
Token it
Token ID340
Feature activation-8.21e-02

Pos logit contributions
 biting+0.030
 barking+0.029
 Sam+0.028
 Tom+0.024
ue+0.023
Neg logit contributions
amb-0.080
unn-0.079
sie-0.077
ovable-0.076
ouses-0.076
Token is
Token ID318
Feature activation-8.92e-02

Pos logit contributions
 barking+0.081
 biting+0.080
ue+0.074
 Sam+0.067
 attacking+0.065
Neg logit contributions
unn-0.194
amb-0.193
ovable-0.190
 lamb-0.188
 luggage-0.188
Token blue
Token ID4171
Feature activation-1.65e-01

Pos logit contributions
 barking+0.109
 biting+0.108
ue+0.103
 Sam+0.094
 attacking+0.093
Neg logit contributions
unn-0.189
amb-0.186
ovable-0.184
 lamb-0.182
oll-0.182
Token and
Token ID290
Feature activation-6.62e-02

Pos logit contributions
 barking+0.192
 biting+0.191
ue+0.181
 Sam+0.165
 attacking+0.162
Neg logit contributions
unn-0.360
amb-0.356
ovable-0.350
 luggage-0.347
oll-0.347
Token soft
Token ID2705
Feature activation-1.18e-01

Pos logit contributions
 barking+0.098
 biting+0.096
ue+0.096
 attacking+0.086
 Sam+0.085
Neg logit contributions
unn-0.118
 luggage-0.116
ovable-0.116
 lamb-0.116
amb-0.115
Token and
Token ID290
Feature activation-4.96e-02

Pos logit contributions
 barking+0.120
 biting+0.118
ue+0.112
 Sam+0.100
 attacking+0.097
Neg logit contributions
unn-0.273
amb-0.271
ovable-0.267
 luggage-0.265
 lamb-0.265
Token makes
Token ID1838
Feature activation-5.19e-02

Pos logit contributions
 barking+0.070
 biting+0.070
ue+0.067
 Sam+0.061
 attacking+0.061
Neg logit contributions
unn-0.095
amb-0.093
ovable-0.092
 luggage-0.091
 lamb-0.091
Token a
Token ID257
Feature activation-2.43e-02

Pos logit contributions
 barking+0.063
 biting+0.063
ue+0.058
 Sam+0.055
 attacking+0.053
Neg logit contributions
unn-0.112
amb-0.111
ovable-0.109
sie-0.108
 lamb-0.107
Token ball
Token ID2613
Feature activation-2.76e-02

Pos logit contributions
unn+0.063
amb+0.062
 luggage+0.061
ovable+0.061
 lamb+0.061
Neg logit contributions
 barking-0.055
 biting-0.054
ue-0.052
 Sam-0.049
 attacking-0.048
Token.
Token ID13
Feature activation-5.59e-02

Pos logit contributions
 barking+0.030
 biting+0.029
ue+0.028
 attacking+0.025
 Sam+0.024
Neg logit contributions
amb-0.062
unn-0.062
ovable-0.061
sie-0.060
 luggage-0.060
Token It
Token ID632
Feature activation-7.11e-02

Pos logit contributions
 barking+0.078
 biting+0.078
ue+0.074
 Sam+0.068
 attacking+0.068
Neg logit contributions
unn-0.109
amb-0.108
ovable-0.106
 lamb-0.104
oll-0.104
Token is
Token ID318
Feature activation-6.98e-02

Pos logit contributions
 barking+0.083
 biting+0.083
ue+0.077
 Sam+0.071
 attacking+0.070
Neg logit contributions
unn-0.155
amb-0.153
ovable-0.151
 lamb-0.149
oll-0.149
Token not
Token ID407
Feature activation-1.25e-01

Pos logit contributions
 barking+0.092
 biting+0.091
ue+0.087
 Sam+0.080
 attacking+0.078
Neg logit contributions
unn-0.142
amb-0.140
ovable-0.139
 lamb-0.137
 luggage-0.137
Token red
Token ID2266
Feature activation-1.65e-01

Pos logit contributions
 barking+0.171
 biting+0.171
 Sam+0.155
 mess+0.151
 attacking+0.143
Neg logit contributions
amb-0.268
unn-0.263
ovable-0.253
sie-0.252
oll-0.248
Token and
Token ID290
Feature activation-6.66e-02

Pos logit contributions
 barking+0.215
 biting+0.213
ue+0.207
 Sam+0.189
 attacking+0.186
Neg logit contributions
unn-0.337
amb-0.334
 luggage-0.325
oll-0.322
ovable-0.321
Token shiny
Token ID22441
Feature activation-1.09e-01

Pos logit contributions
 barking+0.093
 biting+0.092
ue+0.089
 Sam+0.081
 attacking+0.081
Neg logit contributions
unn-0.128
amb-0.125
ovable-0.125
 luggage-0.124
 lamb-0.124
Token,
Token ID11
Feature activation-9.68e-02

Pos logit contributions
 biting+0.108
 barking+0.108
ue+0.094
 mess+0.090
 Sam+0.090
Neg logit contributions
unn-0.259
amb-0.256
ovable-0.255
sie-0.252
oll-0.249
Token but
Token ID475
Feature activation-3.17e-02

Pos logit contributions
 barking+0.111
 biting+0.109
ue+0.105
 attacking+0.093
 Sam+0.092
Neg logit contributions
unn-0.208
amb-0.206
ovable-0.205
 luggage-0.204
 lamb-0.204
Token it
Token ID340
Feature activation-8.21e-02

Pos logit contributions
 biting+0.030
 barking+0.029
 Sam+0.028
 Tom+0.024
ue+0.023
Neg logit contributions
amb-0.080
unn-0.079
sie-0.077
ovable-0.076
ouses-0.076
Token careful
Token ID8161
Feature activation-3.90e-03

Pos logit contributions
ue+0.028
 barking+0.027
 biting+0.027
 Sam+0.025
 erupt+0.024
Neg logit contributions
ovable-0.023
 wife-0.023
 luggage-0.022
 husband-0.022
 lamb-0.022
Token.
Token ID13
Feature activation-4.16e-02

Pos logit contributions
ue+0.005
 barking+0.005
 biting+0.005
 attacking+0.004
 Sam+0.004
Neg logit contributions
amb-0.008
unn-0.007
edly-0.007
oll-0.007
ovable-0.007
Token The
Token ID383
Feature activation-4.36e-02

Pos logit contributions
 barking+0.052
 biting+0.052
ue+0.049
 mess+0.045
 attacking+0.045
Neg logit contributions
unn-0.087
amb-0.086
ovable-0.085
oll-0.084
 lamb-0.084
Token hill
Token ID12788
Feature activation-5.56e-02

Pos logit contributions
 barking+0.062
 biting+0.062
ue+0.060
 Sam+0.055
 attacking+0.054
Neg logit contributions
unn-0.082
amb-0.081
ovable-0.080
 lamb-0.080
 luggage-0.080
Token is
Token ID318
Feature activation-7.56e-02

Pos logit contributions
 barking+0.055
 biting+0.055
ue+0.050
 Sam+0.046
 attacking+0.044
Neg logit contributions
unn-0.129
amb-0.126
ovable-0.126
 luggage-0.125
sie-0.124
Token high
Token ID1029
Feature activation-1.64e-01

Pos logit contributions
ue+0.106
 biting+0.102
 barking+0.102
 Sam+0.089
 attacking+0.089
Neg logit contributions
 luggage-0.142
 lamb-0.137
ovable-0.135
nam-0.134
 wife-0.134
Token and
Token ID290
Feature activation-4.22e-02

Pos logit contributions
 barking+0.203
 biting+0.199
ue+0.197
 Sam+0.177
 attacking+0.171
Neg logit contributions
amb-0.342
unn-0.342
 lamb-0.333
 luggage-0.332
oll-0.331
Token steep
Token ID14559
Feature activation-1.50e-01

Pos logit contributions
 barking+0.059
ue+0.059
 biting+0.059
 attacking+0.052
 Sam+0.050
Neg logit contributions
 luggage-0.077
unn-0.077
nam-0.076
 lamb-0.076
ovable-0.075
Token."
Token ID526
Feature activation-1.92e-02

Pos logit contributions
ue+0.178
 barking+0.177
 biting+0.173
 Sam+0.156
 attacking+0.148
Neg logit contributions
amb-0.310
unn-0.307
 luggage-0.304
edly-0.300
 lamb-0.299
Token\n
Token ID198
Feature activation-2.93e-02

Pos logit contributions
 biting+0.028
 attacking+0.026
 barking+0.026
 mess+0.026
 destroying+0.025
Neg logit contributions
amb-0.035
 wife-0.035
unn-0.033
 husband-0.033
 lamb-0.033
Token\n
Token ID198
Feature activation-2.79e-02

Pos logit contributions
 biting+0.045
 barking+0.042
 destroying+0.040
 attacking+0.040
 mess+0.039
Neg logit contributions
amb-0.051
unn-0.050
 wife-0.049
ovable-0.049
ouses-0.048
Token bike
Token ID7161
Feature activation-3.16e-02

Pos logit contributions
 barking+0.035
 biting+0.035
ue+0.034
 Sam+0.031
 attacking+0.031
Neg logit contributions
unn-0.044
amb-0.044
ovable-0.043
 luggage-0.043
 lamb-0.043
Token.
Token ID13
Feature activation-5.33e-02

Pos logit contributions
ue+0.043
 biting+0.043
 barking+0.041
 Sam+0.041
 Amy+0.038
Neg logit contributions
 luggage-0.054
oll-0.053
ovable-0.052
 lamb-0.051
 passport-0.051
Token It
Token ID632
Feature activation-7.30e-02

Pos logit contributions
 biting+0.073
 barking+0.073
ue+0.069
 attacking+0.064
 mess+0.062
Neg logit contributions
amb-0.103
ovable-0.103
oll-0.103
unn-0.102
 luggage-0.101
Token was
Token ID373
Feature activation-4.42e-02

Pos logit contributions
 biting+0.106
 barking+0.104
ue+0.104
 erupt+0.096
 attacking+0.095
Neg logit contributions
oll-0.126
 luggage-0.124
ovable-0.124
amb-0.122
 lamb-0.121
Token not
Token ID407
Feature activation-1.24e-01

Pos logit contributions
ue+0.065
 biting+0.059
 barking+0.059
 attacking+0.053
 erupt+0.052
Neg logit contributions
 luggage-0.083
oll-0.081
nam-0.080
 lamb-0.079
ovable-0.078
Token red
Token ID2266
Feature activation-1.63e-01

Pos logit contributions
 barking+0.173
 biting+0.172
 Sam+0.154
ue+0.153
 mess+0.151
Neg logit contributions
unn-0.254
amb-0.250
ovable-0.242
sie-0.241
 lamb-0.237
Token and
Token ID290
Feature activation-6.16e-02

Pos logit contributions
 barking+0.178
 biting+0.176
ue+0.171
 Sam+0.154
 attacking+0.152
Neg logit contributions
unn-0.365
amb-0.361
oll-0.357
 luggage-0.352
ovable-0.350
Token shiny
Token ID22441
Feature activation-1.07e-01

Pos logit contributions
 barking+0.086
 biting+0.085
ue+0.081
 Sam+0.075
 attacking+0.074
Neg logit contributions
unn-0.121
amb-0.119
ovable-0.118
 lamb-0.116
 luggage-0.116
Token and
Token ID290
Feature activation-5.00e-02

Pos logit contributions
 barking+0.103
 biting+0.102
ue+0.095
 Sam+0.085
 attacking+0.083
Neg logit contributions
unn-0.256
amb-0.253
ovable-0.250
oll-0.248
 luggage-0.248
Token fast
Token ID3049
Feature activation-1.00e-01

Pos logit contributions
 barking+0.071
 biting+0.070
ue+0.066
 Sam+0.062
 attacking+0.061
Neg logit contributions
unn-0.097
amb-0.096
ovable-0.094
 lamb-0.093
 luggage-0.093
Token anymore
Token ID7471
Feature activation-1.18e-01

Pos logit contributions
ue+0.108
 barking+0.102
 biting+0.102
 Sam+0.089
 erupt+0.087
Neg logit contributions
unn-0.220
 luggage-0.219
amb-0.217
ovable-0.216
oll-0.216
Token slide
Token ID10649
Feature activation-8.08e-02

Pos logit contributions
 barking+0.047
 biting+0.047
ue+0.044
 Sam+0.042
 mess+0.041
Neg logit contributions
unn-0.059
amb-0.058
ovable-0.057
oll-0.056
sie-0.056
Token.
Token ID13
Feature activation-6.17e-02

Pos logit contributions
 barking+0.072
 biting+0.071
ue+0.066
 Sam+0.059
 mess+0.057
Neg logit contributions
unn-0.198
amb-0.197
ovable-0.194
 lamb-0.192
sie-0.192
Token It
Token ID632
Feature activation-8.33e-02

Pos logit contributions
 barking+0.070
 biting+0.068
 Sam+0.064
ue+0.064
 Tom+0.057
Neg logit contributions
unn-0.139
sie-0.136
amb-0.134
ovable-0.133
 luggage-0.132
Token is
Token ID318
Feature activation-8.15e-02

Pos logit contributions
 barking+0.107
 biting+0.106
ue+0.100
 Sam+0.094
 attacking+0.093
Neg logit contributions
unn-0.163
ovable-0.160
amb-0.157
 lamb-0.157
oll-0.156
Token too
Token ID1165
Feature activation-8.83e-02

Pos logit contributions
ue+0.117
 barking+0.107
 biting+0.107
 attacking+0.097
 Sam+0.089
Neg logit contributions
 luggage-0.149
nam-0.147
oll-0.146
 lamb-0.145
ovable-0.144
Token high
Token ID1029
Feature activation-1.60e-01

Pos logit contributions
 barking+0.140
 biting+0.140
ue+0.134
 attacking+0.131
 destroying+0.123
Neg logit contributions
 luggage-0.144
unn-0.144
oll-0.140
amb-0.138
ovable-0.138
Token and
Token ID290
Feature activation-4.68e-02

Pos logit contributions
 barking+0.195
ue+0.193
 biting+0.190
 Sam+0.173
 attacking+0.165
Neg logit contributions
unn-0.336
amb-0.336
oll-0.323
 luggage-0.321
 lamb-0.319
Token fast
Token ID3049
Feature activation-1.02e-01

Pos logit contributions
ue+0.066
 barking+0.064
 biting+0.063
 attacking+0.057
 Sam+0.055
Neg logit contributions
nam-0.085
unn-0.085
 luggage-0.084
 lamb-0.084
ovable-0.084
Token.
Token ID13
Feature activation-6.09e-02

Pos logit contributions
 barking+0.095
 biting+0.094
ue+0.086
 Sam+0.078
 attacking+0.076
Neg logit contributions
unn-0.247
amb-0.244
ovable-0.242
sie-0.239
 lamb-0.239
Token Can
Token ID1680
Feature activation-3.64e-02

Pos logit contributions
 barking+0.067
 biting+0.066
 Sam+0.064
ue+0.061
 Tom+0.057
Neg logit contributions
unn-0.139
sie-0.137
amb-0.133
 luggage-0.132
ovable-0.132
Token we
Token ID356
Feature activation-1.43e-03

Pos logit contributions
 barking+0.062
 biting+0.060
 mess+0.057
 attacking+0.057
ue+0.055
Neg logit contributions
nam-0.058
unn-0.057
amb-0.057
 wife-0.056
 lamb-0.055
Token milk
Token ID7545
Feature activation-3.45e-02

Pos logit contributions
 barking+0.023
 biting+0.023
ue+0.022
 Sam+0.021
 attacking+0.021
Neg logit contributions
 lamb-0.020
 luggage-0.020
unn-0.020
 wife-0.020
amb-0.019
Token.
Token ID13
Feature activation-5.67e-02

Pos logit contributions
 barking+0.042
 biting+0.040
ue+0.038
 attacking+0.036
 Sam+0.034
Neg logit contributions
oll-0.073
unn-0.072
amb-0.071
 lamb-0.071
sie-0.071
Token It
Token ID632
Feature activation-7.89e-02

Pos logit contributions
 barking+0.068
 biting+0.067
ue+0.063
 Sam+0.061
 Tom+0.056
Neg logit contributions
unn-0.124
amb-0.121
sie-0.120
ovable-0.119
 luggage-0.118
Token is
Token ID318
Feature activation-7.79e-02

Pos logit contributions
 barking+0.105
 biting+0.099
ue+0.097
 Sam+0.092
 attacking+0.089
Neg logit contributions
unn-0.156
oll-0.153
ovable-0.150
amb-0.146
 lamb-0.146
Token too
Token ID1165
Feature activation-8.47e-02

Pos logit contributions
ue+0.115
 barking+0.098
 biting+0.091
 attacking+0.086
 Sam+0.083
Neg logit contributions
nam-0.146
oll-0.145
 luggage-0.144
 lamb-0.141
 married-0.136
Token high
Token ID1029
Feature activation-1.59e-01

Pos logit contributions
 barking+0.141
ue+0.135
 biting+0.134
 attacking+0.127
 Sam+0.126
Neg logit contributions
unn-0.138
 lamb-0.137
sie-0.135
 luggage-0.135
oll-0.134
Token."
Token ID526
Feature activation-3.24e-02

Pos logit contributions
ue+0.215
 barking+0.205
 biting+0.199
 Sam+0.183
 attacking+0.178
Neg logit contributions
amb-0.319
unn-0.314
oll-0.307
edly-0.297
 luggage-0.297
Token\n
Token ID198
Feature activation-4.65e-02

Pos logit contributions
 biting+0.045
 barking+0.044
ue+0.043
 attacking+0.041
 destroying+0.039
Neg logit contributions
amb-0.062
 wife-0.062
 luggage-0.060
unn-0.060
oll-0.060
Token\n
Token ID198
Feature activation-4.40e-02

Pos logit contributions
 biting+0.058
 barking+0.058
ue+0.054
 attacking+0.050
 mess+0.050
Neg logit contributions
amb-0.096
unn-0.094
 wife-0.093
ovable-0.093
oll-0.092
TokenMom
Token ID29252
Feature activation-3.69e-02

Pos logit contributions
 barking+0.062
 biting+0.061
ue+0.058
 Sam+0.054
 attacking+0.053
Neg logit contributions
unn-0.086
amb-0.086
ovable-0.084
 lamb-0.083
oll-0.083
Token hears
Token ID23180
Feature activation-4.46e-02

Pos logit contributions
 barking+0.057
 biting+0.057
 catching+0.055
 erupt+0.053
 Sue+0.051
Neg logit contributions
oll-0.059
unn-0.056
la-0.053
ovable-0.052
 luggage-0.052
Token hard
Token ID1327
Feature activation-8.05e-02

Pos logit contributions
ue+0.117
 biting+0.110
 barking+0.105
 attacking+0.102
 destroying+0.094
Neg logit contributions
 luggage-0.097
nam-0.093
 compass-0.088
 lamb-0.087
 crops-0.087
Token.
Token ID13
Feature activation-4.54e-02

Pos logit contributions
ue+0.095
 biting+0.095
 barking+0.095
 Sam+0.083
 attacking+0.081
Neg logit contributions
unn-0.172
amb-0.168
 luggage-0.167
ovable-0.165
oll-0.164
Token The
Token ID383
Feature activation-4.62e-02

Pos logit contributions
 biting+0.071
 barking+0.068
 attacking+0.063
 mess+0.063
ue+0.061
Neg logit contributions
amb-0.078
ovable-0.077
 wife-0.077
 luggage-0.077
unn-0.076
Token hill
Token ID12788
Feature activation-5.83e-02

Pos logit contributions
 barking+0.064
 biting+0.063
ue+0.061
 Sam+0.056
 attacking+0.055
Neg logit contributions
unn-0.090
 luggage-0.088
amb-0.088
 lamb-0.088
ovable-0.087
Token is
Token ID318
Feature activation-7.63e-02

Pos logit contributions
 barking+0.078
 biting+0.077
ue+0.068
 attacking+0.067
 destroying+0.065
Neg logit contributions
 luggage-0.106
unn-0.104
ovable-0.103
ance-0.103
la-0.101
Token steep
Token ID14559
Feature activation-1.59e-01

Pos logit contributions
ue+0.124
 biting+0.115
 barking+0.113
 attacking+0.105
 destroying+0.097
Neg logit contributions
 luggage-0.123
nam-0.117
 compass-0.117
 lamb-0.113
 wife-0.113
Token and
Token ID290
Feature activation-5.46e-02

Pos logit contributions
ue+0.256
 barking+0.244
 biting+0.243
 Sam+0.225
 attacking+0.219
Neg logit contributions
 luggage-0.257
amb-0.251
edly-0.250
unn-0.249
 lamb-0.240
Token rocky
Token ID26898
Feature activation-7.04e-02

Pos logit contributions
ue+0.084
 barking+0.081
 biting+0.080
 attacking+0.074
 destroying+0.068
Neg logit contributions
nam-0.092
 luggage-0.091
 downs-0.087
 lamb-0.087
 compass-0.085
Token.
Token ID13
Feature activation-4.69e-02

Pos logit contributions
ue+0.086
 barking+0.084
 biting+0.083
 Sam+0.073
 attacking+0.072
Neg logit contributions
unn-0.143
 luggage-0.140
edly-0.139
amb-0.135
ance-0.134
Token Tom
Token ID4186
Feature activation-8.21e-02

Pos logit contributions
 biting+0.066
 barking+0.064
 mess+0.059
 attacking+0.059
ue+0.056
Neg logit contributions
ovable-0.087
 luggage-0.086
amb-0.086
oll-0.086
 wife-0.085
Token and
Token ID290
Feature activation-7.15e-03

Pos logit contributions
 barking+0.108
ue+0.107
 biting+0.107
 attacking+0.093
 Sue+0.091
Neg logit contributions
unn-0.164
amb-0.161
ovable-0.158
oll-0.157
 luggage-0.156
Token.
Token ID13
Feature activation+7.60e-03

Pos logit contributions
 biting+0.080
 barking+0.077
ue+0.076
 attacking+0.070
 Sam+0.066
Neg logit contributions
amb-0.147
edly-0.144
 luggage-0.144
 wife-0.144
 lamb-0.144
Token "
Token ID366
Feature activation+1.35e-02

Pos logit contributions
amb+0.014
 wife+0.014
unn+0.013
 husband+0.012
 downs+0.012
Neg logit contributions
 mess-0.011
 biting-0.011
 mistake-0.011
 attacking-0.011
 erupt-0.010
TokenIt
Token ID1026
Feature activation-7.55e-02

Pos logit contributions
unn+0.028
ovable+0.027
 luggage+0.027
sie+0.027
oll+0.027
Neg logit contributions
 barking-0.018
 biting-0.017
ue-0.017
 Sam-0.016
 erupt-0.015
Token is
Token ID318
Feature activation-6.50e-02

Pos logit contributions
 barking+0.094
 biting+0.093
ue+0.089
 Sam+0.080
 attacking+0.080
Neg logit contributions
unn-0.158
ovable-0.153
amb-0.152
 lamb-0.151
 luggage-0.150
Token not
Token ID407
Feature activation-1.19e-01

Pos logit contributions
ue+0.120
 biting+0.120
 barking+0.119
 attacking+0.110
 Sam+0.107
Neg logit contributions
 luggage-0.090
 lamb-0.088
nam-0.086
 wife-0.085
ovable-0.085
Token blue
Token ID4171
Feature activation-1.58e-01

Pos logit contributions
 barking+0.181
 biting+0.180
ue+0.176
 attacking+0.160
 Sam+0.159
Neg logit contributions
unn-0.210
amb-0.207
 luggage-0.207
ovable-0.206
 lamb-0.206
Token anymore
Token ID7471
Feature activation-1.17e-01

Pos logit contributions
 barking+0.173
 biting+0.172
ue+0.160
 attacking+0.148
 Sam+0.147
Neg logit contributions
unn-0.360
amb-0.354
oll-0.345
ovable-0.343
 luggage-0.342
Token.
Token ID13
Feature activation-2.22e-02

Pos logit contributions
 barking+0.116
 biting+0.115
ue+0.106
 Sam+0.097
 attacking+0.093
Neg logit contributions
unn-0.276
amb-0.274
ovable-0.271
 lamb-0.268
sie-0.268
Token It
Token ID632
Feature activation-7.47e-02

Pos logit contributions
 barking+0.028
 biting+0.027
ue+0.026
 Sam+0.024
 attacking+0.023
Neg logit contributions
unn-0.047
amb-0.046
ovable-0.045
sie-0.045
 luggage-0.045
Token has
Token ID468
Feature activation-6.72e-02

Pos logit contributions
 barking+0.093
 biting+0.092
ue+0.089
 Sam+0.080
 attacking+0.079
Neg logit contributions
unn-0.154
ovable-0.150
amb-0.149
 lamb-0.149
 luggage-0.148
Token many
Token ID867
Feature activation+5.13e-02

Pos logit contributions
 barking+0.090
 biting+0.087
 Sam+0.082
 mess+0.080
 Sue+0.079
Neg logit contributions
amb-0.149
unn-0.143
sie-0.143
ovable-0.141
oll-0.140
Token careful
Token ID8161
Feature activation-6.91e-03

Pos logit contributions
ue+0.033
 barking+0.032
 biting+0.031
 erupt+0.029
 Sam+0.029
Neg logit contributions
ovable-0.024
 wife-0.024
 lamb-0.023
 husband-0.023
sie-0.023
Token.
Token ID13
Feature activation-3.86e-02

Pos logit contributions
 barking+0.008
ue+0.008
 biting+0.008
 attacking+0.007
 Sam+0.007
Neg logit contributions
amb-0.014
unn-0.014
ovable-0.014
edly-0.014
oll-0.013
Token The
Token ID383
Feature activation-3.99e-02

Pos logit contributions
 barking+0.049
 biting+0.048
ue+0.046
 Sam+0.042
 attacking+0.041
Neg logit contributions
unn-0.080
amb-0.080
ovable-0.079
 lamb-0.078
oll-0.077
Token hill
Token ID12788
Feature activation-5.90e-02

Pos logit contributions
 barking+0.058
 biting+0.058
ue+0.054
 Sam+0.051
 attacking+0.050
Neg logit contributions
unn-0.076
amb-0.075
ovable-0.074
sie-0.073
oll-0.073
Token is
Token ID318
Feature activation-7.75e-02

Pos logit contributions
 barking+0.058
 biting+0.057
ue+0.053
 Sam+0.048
 attacking+0.047
Neg logit contributions
unn-0.137
amb-0.134
ovable-0.134
 luggage-0.133
sie-0.132
Token steep
Token ID14559
Feature activation-1.58e-01

Pos logit contributions
ue+0.107
 biting+0.102
 barking+0.102
 attacking+0.089
 Sam+0.088
Neg logit contributions
 luggage-0.146
 lamb-0.142
ovable-0.140
nam-0.139
 wife-0.139
Token and
Token ID290
Feature activation-5.15e-02

Pos logit contributions
ue+0.215
 barking+0.214
 biting+0.211
 Sam+0.190
 attacking+0.184
Neg logit contributions
amb-0.301
unn-0.296
 luggage-0.291
edly-0.287
 lamb-0.287
Token the
Token ID262
Feature activation-4.48e-02

Pos logit contributions
ue+0.074
 barking+0.072
 biting+0.072
 attacking+0.064
 Sam+0.062
Neg logit contributions
nam-0.092
 luggage-0.091
 lamb-0.090
unn-0.089
 downs-0.087
Token wagon
Token ID31860
Feature activation-2.45e-02

Pos logit contributions
 barking+0.066
 biting+0.065
ue+0.060
 Sam+0.058
 mess+0.058
Neg logit contributions
unn-0.087
amb-0.086
ovable-0.084
oll-0.083
sie-0.082
Token is
Token ID318
Feature activation-7.97e-02

Pos logit contributions
 barking+0.023
 biting+0.023
ue+0.021
 Sam+0.019
 mess+0.019
Neg logit contributions
unn-0.059
amb-0.059
ovable-0.058
sie-0.057
 lamb-0.057
Token heavy
Token ID4334
Feature activation-1.12e-01

Pos logit contributions
ue+0.105
 barking+0.100
 biting+0.100
 Sam+0.088
 attacking+0.086
Neg logit contributions
 luggage-0.155
 lamb-0.152
ovable-0.151
unn-0.151
oll-0.149
Token careful
Token ID8161
Feature activation-8.30e-03

Pos logit contributions
 barking+0.023
 biting+0.023
ue+0.023
 Sam+0.021
 attacking+0.021
Neg logit contributions
unn-0.021
ovable-0.021
amb-0.021
 luggage-0.020
 lamb-0.020
Token.
Token ID13
Feature activation-3.70e-02

Pos logit contributions
ue+0.010
 barking+0.010
 biting+0.010
 attacking+0.009
 dent+0.009
Neg logit contributions
amb-0.017
unn-0.017
ovable-0.016
edly-0.016
oll-0.016
Token The
Token ID383
Feature activation-4.29e-02

Pos logit contributions
 barking+0.045
 biting+0.044
ue+0.041
 Sam+0.040
 attacking+0.037
Neg logit contributions
unn-0.080
amb-0.078
sie-0.078
ovable-0.078
oll-0.077
Token hill
Token ID12788
Feature activation-5.64e-02

Pos logit contributions
 barking+0.085
 biting+0.083
ue+0.080
 mess+0.077
 Sam+0.076
Neg logit contributions
unn-0.061
amb-0.060
ovable-0.059
oll-0.058
sie-0.056
Token is
Token ID318
Feature activation-7.39e-02

Pos logit contributions
 barking+0.055
 biting+0.055
ue+0.050
 Sam+0.045
 attacking+0.044
Neg logit contributions
unn-0.132
amb-0.129
ovable-0.129
 luggage-0.128
sie-0.127
Token steep
Token ID14559
Feature activation-1.58e-01

Pos logit contributions
ue+0.106
 biting+0.101
 barking+0.100
 attacking+0.088
 Sam+0.088
Neg logit contributions
 luggage-0.135
 lamb-0.132
ovable-0.129
nam-0.128
 wife-0.127
Token."
Token ID526
Feature activation-2.71e-02

Pos logit contributions
ue+0.207
 barking+0.206
 biting+0.204
 Sam+0.180
 attacking+0.179
Neg logit contributions
amb-0.307
unn-0.306
 luggage-0.300
 lamb-0.297
edly-0.294
Token\n
Token ID198
Feature activation-3.81e-02

Pos logit contributions
 biting+0.036
 barking+0.035
 attacking+0.031
ue+0.031
 mess+0.031
Neg logit contributions
amb-0.054
 wife-0.054
oll-0.053
unn-0.053
ovable-0.053
Token\n
Token ID198
Feature activation-3.59e-02

Pos logit contributions
 biting+0.055
 barking+0.053
 destroying+0.049
 attacking+0.048
ue+0.047
Neg logit contributions
amb-0.071
unn-0.070
 wife-0.068
ovable-0.068
ouses-0.068
TokenThey
Token ID2990
Feature activation-5.18e-03

Pos logit contributions
 biting+0.070
 barking+0.067
 attacking+0.062
 destroying+0.062
 hitting+0.060
Neg logit contributions
amb-0.051
edly-0.050
 lamb-0.049
 wife-0.049
ouses-0.048
Token hold
Token ID1745
Feature activation+1.29e-03

Pos logit contributions
 barking+0.006
 Sam+0.006
 bite+0.006
ue+0.006
 biting+0.005
Neg logit contributions
unn-0.012
ouses-0.012
amb-0.012
ovable-0.011
edly-0.011
Token and
Token ID290
Feature activation-3.64e-02

Pos logit contributions
 barking+0.083
 biting+0.081
ue+0.069
 Sam+0.068
 mess+0.066
Neg logit contributions
amb-0.200
unn-0.200
ovable-0.198
sie-0.197
 lamb-0.195
Token the
Token ID262
Feature activation-3.97e-02

Pos logit contributions
 barking+0.049
 biting+0.049
ue+0.044
 Sam+0.044
 mess+0.042
Neg logit contributions
amb-0.075
unn-0.075
ovable-0.072
sie-0.071
oll-0.071
Token clock
Token ID8801
Feature activation-7.63e-02

Pos logit contributions
 barking+0.059
 biting+0.058
ue+0.056
 Sam+0.052
 attacking+0.051
Neg logit contributions
unn-0.074
amb-0.073
ovable-0.072
 lamb-0.071
 luggage-0.071
Token is
Token ID318
Feature activation-7.77e-02

Pos logit contributions
 barking+0.082
 biting+0.081
ue+0.076
 Sam+0.069
 attacking+0.068
Neg logit contributions
unn-0.173
amb-0.171
ovable-0.169
oll-0.167
 luggage-0.166
Token too
Token ID1165
Feature activation-8.35e-02

Pos logit contributions
ue+0.112
 barking+0.102
 biting+0.100
 attacking+0.090
 Sam+0.089
Neg logit contributions
 luggage-0.142
 wife-0.141
 lamb-0.141
ovable-0.139
oll-0.139
Token high
Token ID1029
Feature activation-1.57e-01

Pos logit contributions
 barking+0.132
ue+0.130
 biting+0.129
 attacking+0.118
 Sam+0.115
Neg logit contributions
unn-0.140
 luggage-0.138
 lamb-0.136
 wife-0.135
ovable-0.134
Token.
Token ID13
Feature activation-5.95e-02

Pos logit contributions
ue+0.178
 barking+0.178
 biting+0.175
 Sam+0.153
 attacking+0.149
Neg logit contributions
amb-0.342
unn-0.340
oll-0.331
 luggage-0.329
ovable-0.328
Token We
Token ID775
Feature activation-1.73e-02

Pos logit contributions
 barking+0.075
 biting+0.074
ue+0.070
 Sam+0.064
 attacking+0.064
Neg logit contributions
unn-0.125
amb-0.123
ovable-0.122
 lamb-0.120
oll-0.120
Token will
Token ID481
Feature activation-3.25e-02

Pos logit contributions
 barking+0.026
 biting+0.026
ue+0.024
 Sam+0.024
 mess+0.023
Neg logit contributions
unn-0.032
amb-0.031
ovable-0.031
 lamb-0.031
sie-0.030
Token fall
Token ID2121
Feature activation-9.65e-02

Pos logit contributions
 barking+0.044
 biting+0.043
ue+0.042
 Sam+0.038
 attacking+0.038
Neg logit contributions
unn-0.065
amb-0.064
ovable-0.063
 luggage-0.062
 lamb-0.062
Token and
Token ID290
Feature activation-2.87e-02

Pos logit contributions
 barking+0.114
 biting+0.113
ue+0.107
 Sam+0.098
 attacking+0.096
Neg logit contributions
unn-0.209
amb-0.207
ovable-0.204
 lamb-0.201
 luggage-0.201
Token.
Token ID13
Feature activation-1.47e-02

Pos logit contributions
 biting+0.049
 barking+0.048
ue+0.045
 attacking+0.043
 Sam+0.041
Neg logit contributions
amb-0.085
ovable-0.085
 luggage-0.084
unn-0.084
edly-0.084
Token It
Token ID632
Feature activation-5.29e-02

Pos logit contributions
 biting+0.026
 barking+0.024
 attacking+0.024
ue+0.024
 mess+0.023
Neg logit contributions
oll-0.022
 wife-0.022
 luggage-0.021
ovable-0.021
amb-0.021
Token was
Token ID373
Feature activation-3.05e-02

Pos logit contributions
 biting+0.079
 barking+0.077
ue+0.072
 attacking+0.070
 mess+0.067
Neg logit contributions
 luggage-0.093
unn-0.093
ovable-0.092
 lamb-0.092
oll-0.091
Token not
Token ID407
Feature activation-1.14e-01

Pos logit contributions
ue+0.054
 biting+0.050
 barking+0.048
 attacking+0.047
 bite+0.044
Neg logit contributions
 luggage-0.048
 lamb-0.045
nam-0.045
 wife-0.044
 husband-0.044
Token too
Token ID1165
Feature activation-8.69e-02

Pos logit contributions
 biting+0.177
 barking+0.177
ue+0.175
 attacking+0.159
 Sam+0.154
Neg logit contributions
 luggage-0.197
 lamb-0.195
unn-0.195
ovable-0.194
amb-0.193
Token steep
Token ID14559
Feature activation-1.57e-01

Pos logit contributions
 biting+0.139
 barking+0.138
ue+0.132
 attacking+0.126
 destroying+0.120
Neg logit contributions
unn-0.146
 luggage-0.146
 lamb-0.144
sie-0.143
 wife-0.142
Token or
Token ID393
Feature activation-7.35e-02

Pos logit contributions
ue+0.190
 biting+0.184
 barking+0.182
 Sam+0.162
 attacking+0.161
Neg logit contributions
unn-0.325
amb-0.320
 luggage-0.319
oll-0.307
 lamb-0.305
Token too
Token ID1165
Feature activation-8.25e-02

Pos logit contributions
ue+0.093
 barking+0.091
 biting+0.090
 attacking+0.079
 Sam+0.078
Neg logit contributions
 luggage-0.146
nam-0.146
 lamb-0.145
unn-0.144
ovable-0.142
Token high
Token ID1029
Feature activation-1.49e-01

Pos logit contributions
 biting+0.131
 barking+0.131
ue+0.127
 attacking+0.121
 destroying+0.115
Neg logit contributions
unn-0.135
 luggage-0.135
 lamb-0.134
 wife-0.133
sie-0.131
Token.
Token ID13
Feature activation-2.86e-02

Pos logit contributions
 barking+0.169
 biting+0.168
ue+0.167
 Sam+0.145
 attacking+0.144
Neg logit contributions
amb-0.323
unn-0.323
 luggage-0.314
oll-0.311
 lamb-0.309
Token She
Token ID1375
Feature activation-1.01e-02

Pos logit contributions
 biting+0.043
 barking+0.042
ue+0.040
 attacking+0.039
 erupt+0.037
Neg logit contributions
ovable-0.051
oll-0.051
amb-0.050
unn-0.050
 luggage-0.050
Token He
Token ID679
Feature activation-2.29e-02

Pos logit contributions
 biting+0.053
 barking+0.052
ue+0.050
 mess+0.049
 attacking+0.047
Neg logit contributions
oll-0.071
ovable-0.070
amb-0.070
unn-0.069
 luggage-0.068
Token thinks
Token ID6834
Feature activation+4.70e-02

Pos logit contributions
 barking+0.031
 biting+0.031
ue+0.030
 Sam+0.027
 attacking+0.027
Neg logit contributions
unn-0.045
amb-0.045
ovable-0.044
oll-0.043
 luggage-0.043
Token it
Token ID340
Feature activation-5.76e-02

Pos logit contributions
unn+0.110
amb+0.109
ovable+0.107
 lamb+0.106
sie+0.106
Neg logit contributions
 barking-0.048
 biting-0.048
ue-0.044
 Sam-0.040
 attacking-0.039
Token is
Token ID318
Feature activation-7.89e-02

Pos logit contributions
 biting+0.069
 barking+0.068
ue+0.066
 Sam+0.059
 attacking+0.058
Neg logit contributions
unn-0.120
ovable-0.118
 luggage-0.118
amb-0.117
oll-0.117
Token too
Token ID1165
Feature activation-8.40e-02

Pos logit contributions
ue+0.114
 biting+0.111
 barking+0.110
 attacking+0.099
 mess+0.094
Neg logit contributions
 luggage-0.144
ovable-0.141
oll-0.140
 wife-0.140
 lamb-0.139
Token high
Token ID1029
Feature activation-1.57e-01

Pos logit contributions
 biting+0.139
 barking+0.137
ue+0.133
 attacking+0.131
 destroying+0.123
Neg logit contributions
unn-0.133
 luggage-0.132
amb-0.127
 wife-0.127
oll-0.127
Token and
Token ID290
Feature activation-3.62e-02

Pos logit contributions
ue+0.186
 barking+0.183
 biting+0.181
 Sam+0.159
 attacking+0.156
Neg logit contributions
amb-0.332
unn-0.326
oll-0.321
 luggage-0.319
edly-0.318
Token too
Token ID1165
Feature activation-7.50e-02

Pos logit contributions
ue+0.048
 barking+0.047
 biting+0.046
 attacking+0.041
 Sam+0.040
Neg logit contributions
unn-0.069
ovable-0.068
nam-0.068
 luggage-0.068
oll-0.068
Token fast
Token ID3049
Feature activation-1.05e-01

Pos logit contributions
 barking+0.117
ue+0.116
 biting+0.115
 attacking+0.108
 destroying+0.103
Neg logit contributions
 luggage-0.120
unn-0.119
oll-0.119
ovable-0.118
 wife-0.117
Token.
Token ID13
Feature activation-3.86e-02

Pos logit contributions
 barking+0.102
 biting+0.102
ue+0.101
 Sam+0.085
 attacking+0.084
Neg logit contributions
unn-0.244
amb-0.243
 luggage-0.238
ovable-0.238
oll-0.237
Token\n
Token ID198
Feature activation-4.43e-02

Pos logit contributions
 biting+0.051
 barking+0.050
ue+0.048
 mess+0.046
 attacking+0.044
Neg logit contributions
amb-0.075
unn-0.075
ovable-0.074
oll-0.074
 wife-0.073
Token But
Token ID887
Feature activation+1.88e-03

Pos logit contributions
 biting+0.036
 barking+0.035
ue+0.034
 mess+0.032
 attacking+0.031
Neg logit contributions
amb-0.060
unn-0.059
ovable-0.058
oll-0.058
 wife-0.057
Token the
Token ID262
Feature activation-2.53e-02

Pos logit contributions
amb+0.004
 wife+0.004
oll+0.004
unn+0.004
 lamb+0.004
Neg logit contributions
 barking-0.002
 biting-0.002
ue-0.002
 attacking-0.002
 mess-0.002
Token tower
Token ID10580
Feature activation-5.56e-02

Pos logit contributions
 barking+0.030
 biting+0.029
ue+0.028
 attacking+0.025
 Sam+0.025
Neg logit contributions
unn-0.053
ovable-0.053
 luggage-0.053
 wife-0.053
oll-0.052
Token is
Token ID318
Feature activation-6.07e-02

Pos logit contributions
 barking+0.072
 biting+0.068
ue+0.063
 attacking+0.060
 Sam+0.059
Neg logit contributions
unn-0.110
oll-0.110
ovable-0.109
amb-0.109
 luggage-0.107
Token too
Token ID1165
Feature activation-6.67e-02

Pos logit contributions
ue+0.085
 barking+0.079
 biting+0.073
 attacking+0.068
 Sam+0.066
Neg logit contributions
nam-0.113
oll-0.113
 wife-0.108
ovable-0.108
 husband-0.107
Token high
Token ID1029
Feature activation-1.56e-01

Pos logit contributions
 barking+0.124
 biting+0.117
ue+0.116
 attacking+0.113
 destroying+0.106
Neg logit contributions
 wife-0.090
unn-0.089
 husband-0.089
oll-0.088
nam-0.088
Token and
Token ID290
Feature activation-2.62e-02

Pos logit contributions
ue+0.213
 barking+0.213
 biting+0.207
 Sam+0.191
 attacking+0.185
Neg logit contributions
amb-0.308
unn-0.292
oll-0.290
edly-0.285
 luggage-0.281
Token it
Token ID340
Feature activation-6.73e-02

Pos logit contributions
ue+0.035
 barking+0.034
 biting+0.033
 attacking+0.030
 Sam+0.027
Neg logit contributions
nam-0.049
oll-0.048
unn-0.048
 lamb-0.047
 luggage-0.047
Token falls
Token ID8953
Feature activation-1.03e-01

Pos logit contributions
 barking+0.086
 biting+0.082
ue+0.081
 attacking+0.072
 Sam+0.072
Neg logit contributions
unn-0.136
oll-0.136
ovable-0.133
amb-0.133
 luggage-0.131
Token down
Token ID866
Feature activation-4.90e-02

Pos logit contributions
ue+0.152
 barking+0.145
 biting+0.144
 attacking+0.129
 Sam+0.126
Neg logit contributions
amb-0.186
 luggage-0.183
unn-0.180
oll-0.179
 lamb-0.176
Token with
Token ID351
Feature activation-3.32e-02

Pos logit contributions
ue+0.065
 barking+0.063
 biting+0.061
 attacking+0.054
 Sam+0.052
Neg logit contributions
 luggage-0.090
edly-0.089
ovable-0.086
oll-0.086
 lamb-0.084
Token slide
Token ID10649
Feature activation-6.23e-02

Pos logit contributions
unn+0.006
amb+0.006
ovable+0.006
sie+0.006
oll+0.006
Neg logit contributions
 barking-0.004
 biting-0.004
ue-0.004
 Sam-0.004
 attacking-0.003
Token.
Token ID13
Feature activation-6.15e-02

Pos logit contributions
 barking+0.054
 biting+0.053
ue+0.049
 Sam+0.046
 mess+0.043
Neg logit contributions
unn-0.157
amb-0.156
ovable-0.152
 lamb-0.152
sie-0.152
Token It
Token ID632
Feature activation-8.16e-02

Pos logit contributions
 barking+0.076
 biting+0.076
ue+0.071
 Sam+0.065
 attacking+0.065
Neg logit contributions
unn-0.129
amb-0.128
ovable-0.127
oll-0.125
 lamb-0.125
Token is
Token ID318
Feature activation-7.93e-02

Pos logit contributions
 barking+0.103
ue+0.100
 biting+0.099
 Sam+0.088
 attacking+0.084
Neg logit contributions
unn-0.170
ovable-0.162
amb-0.160
oll-0.159
 lamb-0.156
Token too
Token ID1165
Feature activation-8.61e-02

Pos logit contributions
ue+0.102
 barking+0.094
 biting+0.092
 attacking+0.081
 Sam+0.076
Neg logit contributions
oll-0.158
 luggage-0.157
 lamb-0.156
nam-0.155
ovable-0.155
Token high
Token ID1029
Feature activation-1.56e-01

Pos logit contributions
 barking+0.122
 biting+0.121
ue+0.118
 attacking+0.108
 Sam+0.106
Neg logit contributions
unn-0.164
amb-0.159
 luggage-0.158
oll-0.157
 lamb-0.157
Token for
Token ID329
Feature activation-5.22e-02

Pos logit contributions
 barking+0.177
 biting+0.175
ue+0.166
 Sam+0.150
 attacking+0.147
Neg logit contributions
unn-0.346
amb-0.343
ovable-0.337
oll-0.334
 luggage-0.333
Token you
Token ID345
Feature activation+1.90e-02

Pos logit contributions
 biting+0.060
 barking+0.060
ue+0.054
 Sam+0.052
 attacking+0.050
Neg logit contributions
amb-0.117
unn-0.117
ovable-0.114
sie-0.112
oll-0.112
Token and
Token ID290
Feature activation-2.94e-02

Pos logit contributions
amb+0.050
unn+0.049
sie+0.049
ovable+0.048
oll+0.048
Neg logit contributions
 barking-0.016
 biting-0.016
 Sam-0.012
 mess-0.012
ue-0.012
Token Max
Token ID5436
Feature activation-5.25e-02

Pos logit contributions
 biting+0.033
 barking+0.033
 Sam+0.030
 mess+0.028
 Tom+0.027
Neg logit contributions
amb-0.069
unn-0.069
sie-0.068
ovable-0.068
oll-0.067
Token."
Token ID526
Feature activation-3.92e-02

Pos logit contributions
 biting+0.043
 barking+0.042
 Sam+0.034
 mess+0.033
ue+0.033
Neg logit contributions
amb-0.136
unn-0.135
sie-0.132
oll-0.131
ovable-0.131
Token sky
Token ID6766
Feature activation-1.08e-01

Pos logit contributions
 barking+0.066
 biting+0.065
ue+0.062
 Sam+0.059
 attacking+0.058
Neg logit contributions
unn-0.073
amb-0.072
ovable-0.070
sie-0.069
oll-0.069
Token.
Token ID13
Feature activation-4.74e-02

Pos logit contributions
 barking+0.123
 biting+0.120
ue+0.115
 attacking+0.105
 Sam+0.101
Neg logit contributions
unn-0.227
ovable-0.225
oll-0.224
amb-0.223
 luggage-0.223
Token The
Token ID383
Feature activation-3.99e-02

Pos logit contributions
 biting+0.060
 barking+0.060
ue+0.056
 mess+0.053
 attacking+0.052
Neg logit contributions
amb-0.098
unn-0.098
ovable-0.096
oll-0.095
 lamb-0.094
Token sky
Token ID6766
Feature activation-8.80e-02

Pos logit contributions
 barking+0.058
 biting+0.058
ue+0.055
 Sam+0.052
 mess+0.051
Neg logit contributions
unn-0.076
amb-0.075
ovable-0.074
sie-0.073
oll-0.072
Token is
Token ID318
Feature activation-6.97e-02

Pos logit contributions
 barking+0.118
 biting+0.111
ue+0.106
 Sam+0.102
 attacking+0.099
Neg logit contributions
ovable-0.161
unn-0.160
 married-0.158
oll-0.157
 luggage-0.157
Token high
Token ID1029
Feature activation-1.56e-01

Pos logit contributions
ue+0.109
 barking+0.100
 biting+0.099
 attacking+0.089
 Sam+0.088
Neg logit contributions
 luggage-0.119
nam-0.117
 lamb-0.115
 wife-0.114
 groom-0.114
Token and
Token ID290
Feature activation-3.50e-02

Pos logit contributions
 barking+0.193
 biting+0.189
ue+0.186
 Sam+0.168
 attacking+0.163
Neg logit contributions
amb-0.325
unn-0.322
 luggage-0.316
 lamb-0.316
oll-0.313
Token wide
Token ID3094
Feature activation-1.13e-01

Pos logit contributions
ue+0.052
 barking+0.051
 biting+0.050
 attacking+0.046
 mistake+0.043
Neg logit contributions
nam-0.062
 luggage-0.060
 lamb-0.059
ovable-0.057
 groom-0.057
Token.
Token ID13
Feature activation-3.88e-02

Pos logit contributions
 barking+0.118
 biting+0.117
ue+0.116
 Sam+0.102
 attacking+0.099
Neg logit contributions
unn-0.249
amb-0.247
 luggage-0.246
oll-0.244
ovable-0.244
Token Your
Token ID3406
Feature activation+4.48e-03

Pos logit contributions
 biting+0.048
 barking+0.048
ue+0.044
 mess+0.041
 attacking+0.041
Neg logit contributions
unn-0.082
amb-0.082
ovable-0.080
oll-0.079
 lamb-0.079
Token block
Token ID2512
Feature activation-6.43e-02

Pos logit contributions
unn+0.009
amb+0.009
ovable+0.009
sie+0.009
 lamb+0.009
Neg logit contributions
 barking-0.006
 biting-0.006
ue-0.005
 Sam-0.005
 mess-0.005
Token Mama
Token ID40084
Feature activation+1.92e-02

Pos logit contributions
unn+0.037
ovable+0.036
oll+0.036
amb+0.035
sie+0.035
Neg logit contributions
 barking-0.027
 biting-0.027
 Sam-0.025
ue-0.025
 Tom-0.023
Token gave
Token ID2921
Feature activation+3.80e-02

Pos logit contributions
unn+0.040
ovable+0.040
amb+0.039
 luggage+0.039
nam+0.039
Neg logit contributions
 barking-0.024
 biting-0.024
ue-0.023
 Sam-0.022
 mess-0.020
Token baby
Token ID5156
Feature activation+4.29e-02

Pos logit contributions
unn+0.071
amb+0.070
ovable+0.070
ouses+0.069
oll+0.069
Neg logit contributions
 biting-0.057
 barking-0.057
ue-0.053
 Sam-0.053
 Sue-0.049
Token some
Token ID617
Feature activation+5.59e-02

Pos logit contributions
ovable+0.089
amb+0.089
oll+0.088
unn+0.088
edly+0.086
Neg logit contributions
 barking-0.054
 Sam-0.052
 biting-0.051
 Sue-0.047
 Tom-0.047
Token delicious
Token ID12625
Feature activation-5.02e-03

Pos logit contributions
amb+0.129
unn+0.120
ovable+0.118
la+0.116
edly+0.116
Neg logit contributions
 bite-0.065
 biting-0.064
 mess-0.064
 pops-0.059
 half-0.058
Token foods
Token ID9013
Feature activation+1.17e-01

Pos logit contributions
 biting+0.007
 barking+0.007
ue+0.007
 mess+0.007
 Sam+0.006
Neg logit contributions
amb-0.010
ovable-0.010
unn-0.010
sie-0.009
nam-0.009
Token to
Token ID284
Feature activation-6.98e-03

Pos logit contributions
unn+0.288
amb+0.280
ovable+0.279
 lamb+0.274
sie+0.273
Neg logit contributions
 barking-0.107
 biting-0.107
ue-0.094
 Sam-0.093
 Tom-0.084
Token eat
Token ID4483
Feature activation-4.71e-03

Pos logit contributions
 barking+0.008
 biting+0.008
ue+0.008
 Sam+0.007
 Tom+0.007
Neg logit contributions
unn-0.016
amb-0.016
ovable-0.015
ouses-0.015
oll-0.015
Token.
Token ID13
Feature activation+2.66e-02

Pos logit contributions
 biting+0.004
 barking+0.004
ue+0.004
 Sam+0.003
 mess+0.003
Neg logit contributions
unn-0.012
amb-0.012
ovable-0.011
 lamb-0.011
oll-0.011
Token Baby
Token ID14801
Feature activation+1.59e-02

Pos logit contributions
unn+0.053
ovable+0.052
oll+0.051
amb+0.051
sie+0.050
Neg logit contributions
 barking-0.037
 biting-0.036
ue-0.035
 Sam-0.034
 Tom-0.031
Token ate
Token ID15063
Feature activation+5.72e-03

Pos logit contributions
unn+0.030
ovable+0.030
amb+0.030
 luggage+0.029
oll+0.029
Neg logit contributions
 barking-0.023
 biting-0.023
ue-0.022
 Sam-0.021
 mess-0.020
Token together
Token ID1978
Feature activation+9.42e-03

Pos logit contributions
amb+0.050
unn+0.049
ovable+0.049
 luggage+0.048
 lamb+0.048
Neg logit contributions
 barking-0.036
 biting-0.035
ue-0.034
 attacking-0.031
 Sam-0.031
Token every
Token ID790
Feature activation+9.08e-02

Pos logit contributions
unn+0.014
amb+0.014
ovable+0.014
sie+0.013
oll+0.013
Neg logit contributions
 barking-0.017
 biting-0.017
ue-0.017
 Sam-0.016
 attacking-0.016
Token day
Token ID1110
Feature activation+4.24e-02

Pos logit contributions
unn+0.182
amb+0.177
ovable+0.176
ouses+0.173
oll+0.172
Neg logit contributions
 barking-0.134
 biting-0.132
ue-0.120
 bite-0.116
 Sam-0.115
Token and
Token ID290
Feature activation+3.47e-02

Pos logit contributions
unn+0.102
amb+0.101
ovable+0.100
 lamb+0.099
sie+0.099
Neg logit contributions
 barking-0.041
 biting-0.040
ue-0.036
 Sam-0.033
 attacking-0.032
Token spent
Token ID3377
Feature activation+4.72e-02

Pos logit contributions
unn+0.063
amb+0.062
ovable+0.061
oll+0.060
 luggage+0.060
Neg logit contributions
 barking-0.055
 biting-0.054
ue-0.051
 Sam-0.050
 attacking-0.047
Token lots
Token ID6041
Feature activation+1.17e-01

Pos logit contributions
unn+0.109
amb+0.107
ovable+0.103
oll+0.102
 lamb+0.101
Neg logit contributions
 biting-0.055
 barking-0.055
 Sam-0.050
ue-0.049
 mess-0.047
Token of
Token ID286
Feature activation+2.58e-02

Pos logit contributions
amb+0.216
 luggage+0.214
unn+0.213
 lamb+0.213
 wife+0.212
Neg logit contributions
 biting-0.176
 barking-0.173
ue-0.166
 mess-0.154
 Sam-0.153
Token time
Token ID640
Feature activation+1.76e-03

Pos logit contributions
unn+0.046
amb+0.046
ovable+0.045
 lamb+0.044
sie+0.044
Neg logit contributions
 barking-0.041
 biting-0.040
ue-0.038
 Sam-0.036
 attacking-0.036
Token together
Token ID1978
Feature activation+1.75e-02

Pos logit contributions
amb+0.004
unn+0.004
ovable+0.004
ouses+0.004
sie+0.004
Neg logit contributions
ue-0.002
 barking-0.002
 biting-0.002
 Sam-0.002
 attacking-0.002
Token.
Token ID13
Feature activation+2.32e-02

Pos logit contributions
unn+0.043
amb+0.043
ovable+0.042
 luggage+0.042
 lamb+0.042
Neg logit contributions
 barking-0.016
 biting-0.016
ue-0.014
 Sam-0.013
 attacking-0.012
Token Sus
Token ID8932
Feature activation+4.39e-04

Pos logit contributions
unn+0.052
amb+0.051
ovable+0.051
 lamb+0.050
sie+0.050
Neg logit contributions
 barking-0.026
 biting-0.026
ue-0.024
 Sam-0.022
 attacking-0.021
Token to
Token ID284
Feature activation-1.80e-02

Pos logit contributions
 barking+0.007
 biting+0.007
 Sam+0.006
ue+0.006
 again+0.006
Neg logit contributions
nam-0.022
ovable-0.022
sie-0.022
unn-0.022
oll-0.021
Token everyone
Token ID2506
Feature activation+6.19e-02

Pos logit contributions
 biting+0.025
 barking+0.025
 Sam+0.024
 Tom+0.022
 mess+0.022
Neg logit contributions
amb-0.040
oll-0.039
unn-0.039
ouses-0.037
edly-0.037
Token,
Token ID11
Feature activation-3.02e-02

Pos logit contributions
amb+0.144
unn+0.141
oll+0.138
 luggage+0.137
 lamb+0.135
Neg logit contributions
 biting-0.070
 barking-0.067
 mess-0.060
 Sam-0.058
 attacking-0.056
Token no
Token ID645
Feature activation-3.07e-02

Pos logit contributions
 barking+0.043
 biting+0.043
 Sam+0.040
ue+0.039
 mess+0.038
Neg logit contributions
unn-0.061
amb-0.060
ovable-0.058
oll-0.058
sie-0.058
Token matter
Token ID2300
Feature activation-1.02e-02

Pos logit contributions
 barking+0.038
 biting+0.038
ue+0.035
 Sam+0.033
 mess+0.032
Neg logit contributions
amb-0.066
unn-0.066
ovable-0.065
oll-0.065
sie-0.064
Token how
Token ID703
Feature activation+1.08e-01

Pos logit contributions
 barking+0.017
 biting+0.017
ue+0.016
 Sam+0.016
 attacking+0.015
Neg logit contributions
amb-0.018
unn-0.018
oll-0.017
ovable-0.017
sie-0.017
Token they
Token ID484
Feature activation+2.83e-02

Pos logit contributions
amb+0.212
unn+0.208
oll+0.205
ovable+0.202
 luggage+0.201
Neg logit contributions
 biting-0.157
 barking-0.156
ue-0.142
 Sam-0.139
 mess-0.138
Token look
Token ID804
Feature activation-3.10e-02

Pos logit contributions
amb+0.057
unn+0.056
oll+0.056
 lamb+0.055
 luggage+0.055
Neg logit contributions
 biting-0.040
 barking-0.039
 mess-0.035
 Sam-0.035
 bite-0.035
Token or
Token ID393
Feature activation-4.32e-02

Pos logit contributions
 barking+0.031
 biting+0.030
 Sam+0.026
 mess+0.026
 attacking+0.024
Neg logit contributions
unn-0.076
amb-0.075
oll-0.073
ovable-0.072
sie-0.071
Token smell
Token ID8508
Feature activation-4.85e-02

Pos logit contributions
 biting+0.060
 barking+0.059
 mess+0.054
 Sam+0.052
 attacking+0.051
Neg logit contributions
amb-0.094
unn-0.091
oll-0.090
ovable-0.088
 luggage-0.085
Token.
Token ID13
Feature activation-3.63e-04

Pos logit contributions
 biting+0.052
 barking+0.051
 mess+0.045
 Sam+0.045
ue+0.043
Neg logit contributions
amb-0.116
unn-0.115
oll-0.110
ovable-0.109
ouses-0.108
Token careful
Token ID8161
Feature activation-7.08e-03

Pos logit contributions
 barking+0.011
 biting+0.011
ue+0.011
 Sam+0.010
 attacking+0.010
Neg logit contributions
unn-0.010
amb-0.010
ovable-0.010
 lamb-0.009
sie-0.009
Token."
Token ID526
Feature activation+3.53e-03

Pos logit contributions
 barking+0.008
 biting+0.008
ue+0.008
 Sam+0.007
 attacking+0.007
Neg logit contributions
unn-0.015
amb-0.015
ovable-0.015
 lamb-0.015
oll-0.015
Token
Token ID198
Feature activation-1.23e-02

Pos logit contributions
unn+0.008
amb+0.008
ovable+0.008
 luggage+0.008
sie+0.008
Neg logit contributions
 barking-0.004
 biting-0.003
ue-0.003
 Sam-0.003
 attacking-0.003
Token
Token ID198
Feature activation-1.19e-02

Pos logit contributions
 barking+0.011
 biting+0.011
ue+0.010
 Sam+0.009
 attacking+0.009
Neg logit contributions
unn-0.030
amb-0.030
ovable-0.030
 luggage-0.029
sie-0.029
TokenM
Token ID44
Feature activation+1.67e-02

Pos logit contributions
ue+0.016
 biting+0.016
 barking+0.016
 Sam+0.015
 attacking+0.014
Neg logit contributions
unn-0.024
 luggage-0.024
ovable-0.023
oll-0.023
amb-0.023
Tokenoral
Token ID6864
Feature activation+1.09e-01

Pos logit contributions
unn+0.028
amb+0.028
ovable+0.027
 lamb+0.027
sie+0.027
Neg logit contributions
 barking-0.027
 biting-0.027
ue-0.027
 Sam-0.025
 mess-0.024
Token:
Token ID25
Feature activation-1.54e-04

Pos logit contributions
unn+0.204
amb+0.202
sie+0.201
 lamb+0.200
ovable+0.198
Neg logit contributions
 barking-0.160
 biting-0.156
ue-0.153
 Sam-0.145
 attacking-0.138
Token It
Token ID632
Feature activation-3.51e-02

Pos logit contributions
 barking+0.000
 biting+0.000
ue+0.000
 mess+0.000
 attacking+0.000
Neg logit contributions
amb-0.000
unn-0.000
ovable-0.000
 lamb-0.000
oll-0.000
Token is
Token ID318
Feature activation-4.11e-02

Pos logit contributions
 barking+0.050
 biting+0.048
ue+0.047
 mess+0.042
 Sam+0.042
Neg logit contributions
unn-0.068
amb-0.066
ovable-0.065
 lamb-0.064
oll-0.062
Token important
Token ID1593
Feature activation+1.36e-02

Pos logit contributions
 barking+0.074
 biting+0.073
ue+0.070
 Sam+0.067
 attacking+0.066
Neg logit contributions
unn-0.064
amb-0.063
ovable-0.062
oll-0.061
 luggage-0.061
Token to
Token ID284
Feature activation-3.98e-02

Pos logit contributions
unn+0.030
amb+0.030
ovable+0.030
sie+0.030
oll+0.029
Neg logit contributions
 barking-0.016
 biting-0.016
ue-0.014
 Sam-0.014
 attacking-0.013
Token get
Token ID651
Feature activation+4.91e-02

Pos logit contributions
amb+0.007
unn+0.006
 luggage+0.006
 wife+0.006
ovable+0.006
Neg logit contributions
 biting-0.005
 barking-0.005
ue-0.005
 attacking-0.004
 mess-0.004
Token the
Token ID262
Feature activation+7.45e-03

Pos logit contributions
unn+0.119
amb+0.117
ovable+0.113
oll+0.111
ouses+0.111
Neg logit contributions
 barking-0.051
 biting-0.051
 Sam-0.048
 Sue-0.044
 Tom-0.043
Token scales
Token ID16252
Feature activation+5.80e-03

Pos logit contributions
unn+0.015
amb+0.015
ovable+0.014
ouses+0.014
oll+0.014
Neg logit contributions
 barking-0.010
 biting-0.010
 mess-0.010
 Sam-0.009
ue-0.009
Token to
Token ID284
Feature activation-1.71e-02

Pos logit contributions
unn+0.014
amb+0.014
ovable+0.014
sie+0.014
ouses+0.014
Neg logit contributions
 barking-0.005
 biting-0.005
ue-0.005
 Sam-0.004
 mess-0.004
Token see
Token ID766
Feature activation-3.49e-02

Pos logit contributions
 barking+0.025
 biting+0.025
ue+0.024
 attacking+0.022
 mess+0.021
Neg logit contributions
amb-0.031
unn-0.031
 wife-0.031
 lamb-0.031
sie-0.031
Token how
Token ID703
Feature activation+1.06e-01

Pos logit contributions
 biting+0.033
 barking+0.033
 Sam+0.030
ue+0.028
 Tom+0.027
Neg logit contributions
unn-0.089
amb-0.085
ouses-0.083
sie-0.083
ovable-0.082
Token much
Token ID881
Feature activation+2.34e-02

Pos logit contributions
unn+0.198
amb+0.194
ovable+0.193
sie+0.189
 lamb+0.189
Neg logit contributions
 barking-0.159
 biting-0.157
ue-0.150
 Sam-0.139
 attacking-0.138
Token it
Token ID340
Feature activation-3.66e-02

Pos logit contributions
unn+0.045
 luggage+0.044
 lamb+0.044
ovable+0.044
amb+0.044
Neg logit contributions
 barking-0.033
 biting-0.032
ue-0.031
 attacking-0.029
 Sam-0.028
Token weighed
Token ID20261
Feature activation-3.08e-02

Pos logit contributions
 barking+0.045
 biting+0.045
ue+0.042
 Sam+0.040
 mess+0.039
Neg logit contributions
unn-0.078
amb-0.077
ovable-0.075
 lamb-0.075
 luggage-0.074
Token.
Token ID13
Feature activation+1.94e-02

Pos logit contributions
 barking+0.028
 biting+0.028
ue+0.025
 Sam+0.023
 attacking+0.022
Neg logit contributions
unn-0.076
amb-0.075
ovable-0.074
 lamb-0.073
oll-0.073
Token Her
Token ID2332
Feature activation+4.08e-02

Pos logit contributions
unn+0.043
ovable+0.042
amb+0.042
 luggage+0.042
sie+0.042
Neg logit contributions
 barking-0.022
 biting-0.022
ue-0.021
 Sam-0.020
 Sue-0.018
Token took
Token ID1718
Feature activation+5.39e-02

Pos logit contributions
unn+0.086
amb+0.084
ovable+0.082
 lamb+0.081
 luggage+0.081
Neg logit contributions
 biting-0.061
 barking-0.061
ue-0.058
 Sam-0.057
 mess-0.054
Token a
Token ID257
Feature activation+4.10e-02

Pos logit contributions
unn+0.107
amb+0.106
ovable+0.105
 luggage+0.104
oll+0.104
Neg logit contributions
 barking-0.074
 biting-0.072
ue-0.069
 attacking-0.063
 mess-0.063
Token spoon
Token ID24556
Feature activation+3.10e-02

Pos logit contributions
sie+0.072
 luggage+0.072
 lamb+0.071
unn+0.070
oll+0.070
Neg logit contributions
 barking-0.062
 biting-0.060
ue-0.060
 Sam-0.056
 attacking-0.055
Token,
Token ID11
Feature activation-2.15e-02

Pos logit contributions
amb+0.051
oll+0.048
unn+0.047
 lamb+0.046
ovable+0.046
Neg logit contributions
 biting-0.054
 barking-0.054
ue-0.053
 attacking-0.050
 mess-0.049
Token scoop
Token ID23529
Feature activation-1.10e-03

Pos logit contributions
 biting+0.029
 barking+0.029
 Sam+0.026
ue+0.026
 mess+0.024
Neg logit contributions
unn-0.045
amb-0.044
ovable-0.043
 luggage-0.042
sie-0.042
Tokened
Token ID276
Feature activation+1.09e-02

Pos logit contributions
 barking+0.003
 biting+0.003
ue+0.003
 attacking+0.003
 erupt+0.003
Neg logit contributions
oll-0.001
amb-0.001
unn-0.001
ovable-0.001
sie-0.001
Token a
Token ID257
Feature activation+2.18e-02

Pos logit contributions
amb+0.023
unn+0.023
 lamb+0.023
 luggage+0.023
sie+0.022
Neg logit contributions
 barking-0.013
 biting-0.013
ue-0.013
 attacking-0.011
 mess-0.011
Token handful
Token ID10089
Feature activation+2.18e-02

Pos logit contributions
 lamb+0.034
 luggage+0.033
oll+0.033
sie+0.033
 wife+0.033
Neg logit contributions
 barking-0.038
ue-0.037
 biting-0.036
 Sam-0.035
 attacking-0.034
Token of
Token ID286
Feature activation+6.40e-03

Pos logit contributions
unn+0.034
amb+0.032
oll+0.032
sie+0.031
edly+0.031
Neg logit contributions
 barking-0.041
 biting-0.039
ue-0.038
 attacking-0.037
 Sam-0.036
Token cereal
Token ID33158
Feature activation+6.15e-03

Pos logit contributions
unn+0.012
 luggage+0.011
amb+0.011
sie+0.011
oll+0.011
Neg logit contributions
 barking-0.010
 biting-0.009
ue-0.009
 attacking-0.009
 Sam-0.008
Token and
Token ID290
Feature activation+1.73e-02

Pos logit contributions
amb+0.011
sie+0.011
oll+0.011
unn+0.011
edly+0.011
Neg logit contributions
 barking-0.009
 biting-0.009
ue-0.008
 attacking-0.008
 Sam-0.008
Token those
Token ID883
Feature activation+8.22e-02

Pos logit contributions
 barking+0.036
 biting+0.036
ue+0.033
 Sam+0.031
 attacking+0.030
Neg logit contributions
unn-0.062
amb-0.061
ovable-0.060
 lamb-0.059
oll-0.059
Token markers
Token ID19736
Feature activation+2.38e-02

Pos logit contributions
unn+0.178
amb+0.177
edly+0.174
 lamb+0.173
 ranc+0.166
Neg logit contributions
 Sam-0.104
 biting-0.100
 mess-0.099
 Tom-0.097
 fighting-0.094
Token",
Token ID1600
Feature activation+8.93e-03

Pos logit contributions
unn+0.060
amb+0.059
ovable+0.059
 lamb+0.059
sie+0.058
Neg logit contributions
 biting-0.020
 barking-0.019
ue-0.018
 Sam-0.016
 mess-0.016
Token said
Token ID531
Feature activation+1.76e-02

Pos logit contributions
amb+0.016
unn+0.016
 wife+0.016
 luggage+0.016
nam+0.016
Neg logit contributions
 biting-0.013
 barking-0.013
ue-0.012
 attacking-0.012
 mess-0.011
Token Mark
Token ID2940
Feature activation-3.85e-02

Pos logit contributions
amb+0.038
ovable+0.038
 wife+0.038
 luggage+0.038
oll+0.037
Neg logit contributions
 barking-0.019
 biting-0.019
 mess-0.018
ue-0.018
 attacking-0.016
Token.
Token ID13
Feature activation+2.00e-02

Pos logit contributions
 barking+0.040
 biting+0.040
ue+0.039
 mess+0.035
 attacking+0.034
Neg logit contributions
ovable-0.085
oll-0.083
amb-0.083
unn-0.083
sie-0.082
Token
Token ID220
Feature activation-2.30e-03

Pos logit contributions
amb+0.036
edly+0.034
 wife+0.034
ibr+0.033
unn+0.033
Neg logit contributions
 biting-0.028
 mess-0.027
 attacking-0.025
 dent-0.025
 bite-0.025
Token
Token ID198
Feature activation-2.13e-03

Pos logit contributions
 biting+0.003
 barking+0.003
 attacking+0.003
ue+0.003
 mess+0.003
Neg logit contributions
amb-0.004
unn-0.004
ovable-0.004
oll-0.004
ouses-0.004
Token
Token ID198
Feature activation-3.69e-04

Pos logit contributions
 biting+0.004
 barking+0.003
 mess+0.003
 dent+0.003
 attacking+0.003
Neg logit contributions
amb-0.004
unn-0.003
ibr-0.003
la-0.003
ouses-0.003
TokenJimmy
Token ID40335
Feature activation-2.17e-02

Pos logit contributions
 biting+0.001
 barking+0.001
 mess+0.000
 attacking+0.000
 bite+0.000
Neg logit contributions
amb-0.001
ibr-0.001
edly-0.001
years-0.001
la-0.001
Token got
Token ID1392
Feature activation+4.40e-02

Pos logit contributions
 barking+0.035
 biting+0.035
ue+0.034
 attacking+0.032
 erupt+0.031
Neg logit contributions
amb-0.036
unn-0.035
 wife-0.035
ovable-0.034
sie-0.034
Token play
Token ID711
Feature activation+1.17e-02

Pos logit contributions
 Sam+0.015
 barking+0.014
 bite+0.013
ue+0.013
 biting+0.013
Neg logit contributions
unn-0.034
oll-0.033
ovable-0.033
edly-0.033
amb-0.032
Token everyday
Token ID10908
Feature activation+7.70e-02

Pos logit contributions
amb+0.022
unn+0.022
ovable+0.021
 luggage+0.021
sie+0.021
Neg logit contributions
 barking-0.017
 biting-0.017
ue-0.016
 attacking-0.015
 Sam-0.015
Token.
Token ID13
Feature activation+2.56e-02

Pos logit contributions
unn+0.221
amb+0.213
ovable+0.211
oll+0.209
 lamb+0.206
Neg logit contributions
 barking-0.047
 biting-0.045
ue-0.038
 Sam-0.038
 mess-0.032
Token They
Token ID1119
Feature activation+2.14e-02

Pos logit contributions
unn+0.050
amb+0.050
ovable+0.049
 luggage+0.049
sie+0.048
Neg logit contributions
 biting-0.035
 barking-0.035
ue-0.032
 attacking-0.031
 mess-0.030
Token both
Token ID1111
Feature activation+8.92e-03

Pos logit contributions
edly+0.049
unn+0.047
amb+0.047
oll+0.046
uster+0.045
Neg logit contributions
 Sam-0.027
 Tim-0.025
 Tom-0.025
 barking-0.024
 bite-0.023
Token liked
Token ID8288
Feature activation+1.87e-02

Pos logit contributions
unn+0.019
 luggage+0.018
 lamb+0.018
amb+0.018
 compass+0.018
Neg logit contributions
 barking-0.011
 Sam-0.011
 Tom-0.011
 Mia-0.010
 biting-0.010
Token to
Token ID284
Feature activation-1.92e-02

Pos logit contributions
unn+0.039
amb+0.039
ovable+0.038
oll+0.038
sie+0.038
Neg logit contributions
 barking-0.024
 biting-0.023
ue-0.022
 Sam-0.020
 attacking-0.020
Token play
Token ID711
Feature activation+2.50e-03

Pos logit contributions
 barking+0.024
 biting+0.023
ue+0.023
 Sam+0.022
 mess+0.020
Neg logit contributions
unn-0.042
amb-0.040
ovable-0.040
oll-0.040
 lamb-0.040
Token outdoors
Token ID24349
Feature activation+2.47e-02

Pos logit contributions
 luggage+0.004
amb+0.004
ovable+0.004
ouses+0.004
sie+0.004
Neg logit contributions
 barking-0.004
 biting-0.004
ue-0.004
 attacking-0.004
 erupt-0.004
Token in
Token ID287
Feature activation-3.86e-02

Pos logit contributions
unn+0.060
amb+0.059
ovable+0.059
oll+0.058
 lamb+0.058
Neg logit contributions
 barking-0.024
 biting-0.023
ue-0.021
 Sam-0.020
 mess-0.019
Token the
Token ID262
Feature activation+1.19e-02

Pos logit contributions
 barking+0.039
 biting+0.038
ue+0.036
 attacking+0.031
 Sam+0.031
Neg logit contributions
unn-0.090
amb-0.089
ovable-0.088
 luggage-0.088
 lamb-0.088
Token the
Token ID262
Feature activation+2.25e-02

Pos logit contributions
unn+0.059
amb+0.059
ovable+0.059
 luggage+0.058
 lamb+0.058
Neg logit contributions
 barking-0.024
 biting-0.024
ue-0.023
 Sam-0.020
 attacking-0.020
Token world
Token ID995
Feature activation+6.10e-02

Pos logit contributions
unn+0.042
ovable+0.041
amb+0.041
oll+0.041
sie+0.040
Neg logit contributions
 barking-0.035
 biting-0.034
ue-0.031
 mess-0.031
 Sam-0.030
Token.
Token ID13
Feature activation+2.25e-02

Pos logit contributions
unn+0.138
amb+0.136
ovable+0.135
sie+0.133
 luggage+0.133
Neg logit contributions
 biting-0.065
 barking-0.064
ue-0.061
 Sam-0.054
 attacking-0.054
Token It
Token ID632
Feature activation-3.90e-02

Pos logit contributions
unn+0.055
sie+0.054
oll+0.053
ovable+0.053
amb+0.053
Neg logit contributions
 barking-0.023
 Sam-0.022
 biting-0.020
ue-0.020
 Tom-0.019
Token hopped
Token ID45230
Feature activation+4.02e-02

Pos logit contributions
 barking+0.049
 biting+0.046
ue+0.044
 Sam+0.043
 mess+0.042
Neg logit contributions
unn-0.085
amb-0.083
oll-0.082
nam-0.081
ovable-0.081
Token off
Token ID572
Feature activation+3.76e-02

Pos logit contributions
amb+0.076
unn+0.075
ovable+0.074
 luggage+0.073
sie+0.073
Neg logit contributions
 biting-0.059
 barking-0.058
ue-0.056
 attacking-0.051
 Sam-0.051
Token the
Token ID262
Feature activation+3.43e-02

Pos logit contributions
amb+0.080
unn+0.080
ovable+0.079
 luggage+0.079
 lamb+0.078
Neg logit contributions
 biting-0.043
 barking-0.043
ue-0.042
 mess-0.037
 attacking-0.037
Token peg
Token ID41350
Feature activation+6.68e-02

Pos logit contributions
ovable+0.068
unn+0.068
oll+0.068
amb+0.067
ouses+0.066
Neg logit contributions
 barking-0.052
 biting-0.049
 mess-0.048
 Sam-0.043
ue-0.043
Token and
Token ID290
Feature activation+2.97e-02

Pos logit contributions
unn+0.182
ovable+0.176
amb+0.175
 lamb+0.175
nam+0.173
Neg logit contributions
 barking-0.048
 biting-0.043
 Sam-0.039
ue-0.036
 mess-0.036
Token began
Token ID2540
Feature activation+3.47e-02

Pos logit contributions
unn+0.056
amb+0.054
ovable+0.054
oll+0.053
sie+0.052
Neg logit contributions
 barking-0.048
 biting-0.047
 Sam-0.045
ue-0.042
 mess-0.041
Token walking
Token ID6155
Feature activation+3.23e-02

Pos logit contributions
unn+0.065
amb+0.065
 luggage+0.065
 lamb+0.064
 wife+0.064
Neg logit contributions
ue-0.049
 biting-0.047
 barking-0.046
 mess-0.042
 Sam-0.042
Token spent
Token ID3377
Feature activation+3.21e-02

Pos logit contributions
unn+0.048
oll+0.045
amb+0.045
uster+0.045
sie+0.045
Neg logit contributions
 barking-0.023
 Sam-0.022
 biting-0.022
 Sue-0.019
 Tom-0.019
Token the
Token ID262
Feature activation+2.01e-02

Pos logit contributions
unn+0.072
amb+0.070
ovable+0.068
oll+0.067
sie+0.067
Neg logit contributions
 biting-0.039
 barking-0.038
ue-0.036
 Sam-0.035
 attacking-0.033
Token rest
Token ID1334
Feature activation+3.32e-02

Pos logit contributions
unn+0.030
amb+0.030
 lamb+0.030
ovable+0.030
sie+0.029
Neg logit contributions
 barking-0.036
 biting-0.035
ue-0.034
 Sam-0.033
 attacking-0.032
Token of
Token ID286
Feature activation+1.18e-02

Pos logit contributions
unn+0.061
amb+0.061
 luggage+0.061
 wife+0.059
 lamb+0.059
Neg logit contributions
 biting-0.051
 barking-0.050
ue-0.046
 Sam-0.045
 attacking-0.043
Token the
Token ID262
Feature activation+3.11e-02

Pos logit contributions
 lamb+0.024
 wife+0.024
 luggage+0.024
unn+0.023
ovable+0.023
Neg logit contributions
ue-0.014
 barking-0.014
 biting-0.013
 mess-0.012
 dent-0.012
Token day
Token ID1110
Feature activation+3.58e-02

Pos logit contributions
unn+0.044
 lamb+0.043
amb+0.043
sie+0.042
 wife+0.042
Neg logit contributions
 barking-0.059
 biting-0.059
ue-0.055
 Sam-0.055
 attacking-0.054
Token listening
Token ID8680
Feature activation+2.57e-02

Pos logit contributions
unn+0.074
amb+0.073
ovable+0.073
sie+0.072
 lamb+0.072
Neg logit contributions
 barking-0.045
 biting-0.044
ue-0.043
 Sam-0.039
 attacking-0.038
Token to
Token ID284
Feature activation-1.50e-02

Pos logit contributions
unn+0.052
amb+0.052
ovable+0.051
 luggage+0.050
 lamb+0.050
Neg logit contributions
 barking-0.033
 biting-0.033
ue-0.031
 Sam-0.029
 attacking-0.028
Token the
Token ID262
Feature activation+2.38e-02

Pos logit contributions
 barking+0.014
 biting+0.014
 Sam+0.013
ue+0.012
 attacking+0.011
Neg logit contributions
unn-0.038
amb-0.037
oll-0.037
ouses-0.036
sie-0.036
Token sun
Token ID4252
Feature activation-7.02e-02

Pos logit contributions
unn+0.036
amb+0.036
 lamb+0.035
ovable+0.035
 wife+0.035
Neg logit contributions
 barking-0.043
 biting-0.043
ue-0.041
 attacking-0.039
 Sam-0.039
Tokenflower
Token ID25547
Feature activation-2.05e-02

Pos logit contributions
 biting+0.115
 barking+0.115
ue+0.109
 mess+0.105
 attacking+0.104
Neg logit contributions
amb-0.118
unn-0.115
 lamb-0.113
ovable-0.112
 wife-0.112
Token asking
Token ID4737
Feature activation+5.52e-02

Pos logit contributions
 barking+0.090
ue+0.089
 biting+0.087
 Sam+0.079
 attacking+0.077
Neg logit contributions
unn-0.118
amb-0.116
 lamb-0.115
ovable-0.115
sie-0.114
Token for
Token ID329
Feature activation-5.22e-03

Pos logit contributions
unn+0.140
amb+0.138
oll+0.134
ovable+0.133
 lamb+0.132
Neg logit contributions
 biting-0.050
 barking-0.048
 Sam-0.044
ue-0.042
 attacking-0.039
Token help
Token ID1037
Feature activation+2.44e-02

Pos logit contributions
 biting+0.008
 barking+0.007
 Sam+0.007
 attacking+0.007
 mess+0.006
Neg logit contributions
amb-0.011
unn-0.011
ovable-0.010
oll-0.010
sie-0.010
Token from
Token ID422
Feature activation+3.95e-02

Pos logit contributions
unn+0.059
amb+0.059
ovable+0.058
oll+0.058
 luggage+0.057
Neg logit contributions
 barking-0.023
 biting-0.023
ue-0.021
 Sam-0.019
 attacking-0.018
Token others
Token ID1854
Feature activation+7.78e-02

Pos logit contributions
unn+0.095
amb+0.094
oll+0.090
ouses+0.090
ovable+0.088
Neg logit contributions
 biting-0.047
 barking-0.045
 Sam-0.040
 attacking-0.039
 catching-0.038
Token.
Token ID13
Feature activation-1.55e-03

Pos logit contributions
amb+0.218
unn+0.218
oll+0.213
 lamb+0.209
 luggage+0.209
Neg logit contributions
 biting-0.046
 barking-0.044
 Sam-0.036
 mess-0.033
 attacking-0.031
Token And
Token ID843
Feature activation+3.61e-02

Pos logit contributions
 biting+0.002
 barking+0.002
 mess+0.002
ue+0.002
 attacking+0.002
Neg logit contributions
amb-0.003
unn-0.003
ovable-0.003
oll-0.003
 lamb-0.003
Token sometimes
Token ID3360
Feature activation-3.71e-03

Pos logit contributions
amb+0.086
unn+0.086
ouses+0.085
ovable+0.085
 luggage+0.085
Neg logit contributions
 biting-0.037
 Sam-0.037
 barking-0.035
 Tom-0.032
 catching-0.031
Token,
Token ID11
Feature activation-2.35e-02

Pos logit contributions
 barking+0.003
 biting+0.003
ue+0.003
 Sam+0.003
 attacking+0.003
Neg logit contributions
unn-0.009
 lamb-0.009
ovable-0.009
sie-0.009
amb-0.009
Token things
Token ID1243
Feature activation+7.72e-02

Pos logit contributions
 biting+0.025
 barking+0.025
ue+0.022
 Sam+0.022
 attacking+0.021
Neg logit contributions
unn-0.055
amb-0.054
ovable-0.053
oll-0.053
sie-0.052
Token that
Token ID326
Feature activation-3.50e-02

Pos logit contributions
amb+0.155
oll+0.152
unn+0.152
edly+0.149
ouses+0.148
Neg logit contributions
 biting-0.108
 barking-0.104
 mess-0.097
ue-0.097
 Sam-0.096
Token Welcome
Token ID19134
Feature activation+2.28e-02

Pos logit contributions
 barking+0.007
 biting+0.007
ue+0.007
 Sam+0.007
 attacking+0.006
Neg logit contributions
unn-0.011
amb-0.011
 luggage-0.011
ovable-0.011
sie-0.011
Token to
Token ID284
Feature activation-3.73e-04

Pos logit contributions
unn+0.052
amb+0.050
ovable+0.050
oll+0.050
sie+0.049
Neg logit contributions
 barking-0.026
 biting-0.025
ue-0.023
 Sam-0.023
 attacking-0.021
Token my
Token ID616
Feature activation+3.33e-02

Pos logit contributions
 barking+0.000
 biting+0.000
 Sam+0.000
ue+0.000
 attacking+0.000
Neg logit contributions
amb-0.001
unn-0.001
ovable-0.001
oll-0.001
edly-0.001
Token office
Token ID2607
Feature activation+4.12e-02

Pos logit contributions
unn+0.062
amb+0.062
sie+0.059
ovable+0.058
ouses+0.057
Neg logit contributions
 barking-0.053
 biting-0.052
 mess-0.049
 mistake-0.047
ue-0.047
Token."
Token ID526
Feature activation+1.20e-02

Pos logit contributions
unn+0.104
amb+0.102
ovable+0.100
 lamb+0.100
sie+0.099
Neg logit contributions
 barking-0.036
 biting-0.035
ue-0.033
 Sam-0.030
 mess-0.029
Token Joe
Token ID5689
Feature activation-1.43e-02

Pos logit contributions
unn+0.024
amb+0.024
ovable+0.024
sie+0.023
oll+0.023
Neg logit contributions
 biting-0.016
 barking-0.015
ue-0.015
 mess-0.014
 attacking-0.014
Token and
Token ID290
Feature activation+2.81e-02

Pos logit contributions
 barking+0.017
 biting+0.017
ue+0.016
 Sam+0.015
 mess+0.014
Neg logit contributions
unn-0.031
ovable-0.031
 luggage-0.030
oll-0.030
ouses-0.030
Token Molly
Token ID30236
Feature activation+1.59e-02

Pos logit contributions
unn+0.066
amb+0.066
ovable+0.065
sie+0.065
 lamb+0.065
Neg logit contributions
 barking-0.027
 biting-0.027
ue-0.025
 attacking-0.022
 mess-0.022
Token were
Token ID547
Feature activation+6.02e-03

Pos logit contributions
unn+0.033
uster+0.032
oll+0.031
ouses+0.031
 lamb+0.031
Neg logit contributions
ue-0.020
 barking-0.019
 Tom-0.019
 Sam-0.019
 bite-0.019
Token impressed
Token ID12617
Feature activation+7.24e-03

Pos logit contributions
unn+0.010
amb+0.010
ovable+0.009
 lamb+0.009
 luggage+0.009
Neg logit contributions
 barking-0.010
 biting-0.010
ue-0.010
 Sam-0.009
 attacking-0.009
Token by
Token ID416
Feature activation-3.92e-02

Pos logit contributions
amb+0.015
unn+0.015
ovable+0.015
edly+0.015
oll+0.015
Neg logit contributions
 biting-0.009
 barking-0.009
ue-0.008
 attacking-0.008
 dent-0.008
Token a
Token ID257
Feature activation-1.61e-02

Pos logit contributions
ue+0.093
 biting+0.087
 barking+0.087
 mess+0.082
 attacking+0.080
Neg logit contributions
 lamb-0.099
nam-0.098
 wife-0.097
 luggage-0.096
amb-0.096
Token blue
Token ID4171
Feature activation-1.00e-01

Pos logit contributions
ue+0.026
 biting+0.026
 barking+0.025
 attacking+0.024
 Sam+0.023
Neg logit contributions
 lamb-0.027
 wife-0.027
 luggage-0.026
 groom-0.026
unn-0.026
Token flower
Token ID15061
Feature activation-4.24e-02

Pos logit contributions
 biting+0.173
 barking+0.172
 attacking+0.163
ue+0.162
 erupt+0.158
Neg logit contributions
 lamb-0.158
amb-0.156
 wife-0.154
 luggage-0.153
 groom-0.151
Token,
Token ID11
Feature activation-6.71e-02

Pos logit contributions
 barking+0.046
 biting+0.046
ue+0.044
 attacking+0.041
 Sam+0.040
Neg logit contributions
unn-0.091
oll-0.091
amb-0.090
 lamb-0.090
ovable-0.089
Token and
Token ID290
Feature activation-1.80e-02

Pos logit contributions
ue+0.095
 barking+0.094
 biting+0.093
 mess+0.085
 attacking+0.084
Neg logit contributions
 lamb-0.122
amb-0.122
 wife-0.120
 luggage-0.120
unn-0.120
Token a
Token ID257
Feature activation-1.17e-02

Pos logit contributions
 biting+0.019
 barking+0.019
 Sam+0.018
 Tom+0.016
 Mia+0.016
Neg logit contributions
unn-0.046
amb-0.046
ovable-0.043
sie-0.043
uster-0.043
Token pink
Token ID11398
Feature activation-7.68e-02

Pos logit contributions
 biting+0.018
 barking+0.018
ue+0.018
 attacking+0.017
 Sam+0.017
Neg logit contributions
 lamb-0.020
unn-0.020
 wife-0.019
 luggage-0.019
amb-0.019
Token flower
Token ID15061
Feature activation-3.85e-02

Pos logit contributions
 barking+0.129
 biting+0.129
ue+0.124
 attacking+0.120
 Sam+0.117
Neg logit contributions
amb-0.123
 lamb-0.123
 wife-0.123
 luggage-0.120
unn-0.120
Token.
Token ID13
Feature activation-3.67e-02

Pos logit contributions
 barking+0.040
 biting+0.039
ue+0.038
 Sam+0.034
 attacking+0.034
Neg logit contributions
unn-0.087
amb-0.086
oll-0.086
ovable-0.085
 lamb-0.085
Token They
Token ID1119
Feature activation-1.44e-02

Pos logit contributions
 biting+0.049
 barking+0.048
ue+0.046
 attacking+0.043
 mess+0.042
Neg logit contributions
amb-0.074
unn-0.073
ovable-0.072
oll-0.071
 luggage-0.071
Token picked
Token ID6497
Feature activation-3.67e-03

Pos logit contributions
 barking+0.020
 biting+0.020
 Sam+0.019
ue+0.019
 Tom+0.018
Neg logit contributions
unn-0.029
amb-0.029
ovable-0.028
 luggage-0.028
oll-0.028
Token were
Token ID547
Feature activation-3.22e-03

Pos logit contributions
unn+0.051
amb+0.051
edly+0.050
oll+0.050
uster+0.049
Neg logit contributions
 barking-0.036
 biting-0.035
 Sam-0.033
 bite-0.033
 fight-0.032
Token little
Token ID1310
Feature activation+1.00e-02

Pos logit contributions
 barking+0.005
 biting+0.005
 Sam+0.004
ue+0.004
 attacking+0.004
Neg logit contributions
unn-0.006
amb-0.006
oll-0.006
 luggage-0.006
ovable-0.006
Token.
Token ID13
Feature activation+1.86e-02

Pos logit contributions
unn+0.019
amb+0.019
ovable+0.018
oll+0.018
sie+0.018
Neg logit contributions
 barking-0.015
 biting-0.015
ue-0.014
 Sam-0.013
 attacking-0.013
Token But
Token ID887
Feature activation+1.65e-02

Pos logit contributions
 luggage+0.033
ouses+0.032
 wife+0.031
ovable+0.031
amb+0.031
Neg logit contributions
 biting-0.028
 barking-0.024
 mess-0.024
 attacking-0.024
 mistake-0.024
Token recently
Token ID2904
Feature activation+4.68e-02

Pos logit contributions
 luggage+0.029
 lamb+0.028
amb+0.028
 wife+0.028
unn+0.028
Neg logit contributions
 biting-0.025
 barking-0.025
ue-0.024
 attacking-0.023
 mess-0.022
Token Max
Token ID5436
Feature activation-1.05e-02

Pos logit contributions
amb+0.133
unn+0.133
ovable+0.130
oll+0.129
sie+0.127
Neg logit contributions
 biting-0.026
 barking-0.026
 Sam-0.023
ue-0.022
 Tom-0.018
Token and
Token ID290
Feature activation+3.45e-02

Pos logit contributions
 biting+0.015
 barking+0.015
ue+0.014
 attacking+0.013
 Sam+0.013
Neg logit contributions
unn-0.020
ovable-0.020
amb-0.020
 lamb-0.019
 luggage-0.019
Token Sarah
Token ID10490
Feature activation-2.11e-03

Pos logit contributions
unn+0.085
sie+0.084
amb+0.084
 lamb+0.083
ovable+0.083
Neg logit contributions
 biting-0.029
 barking-0.028
ue-0.026
 attacking-0.023
 erupt-0.022
Token had
Token ID550
Feature activation+9.20e-04

Pos logit contributions
 barking+0.004
 biting+0.003
ue+0.003
 Sam+0.003
 Tom+0.003
Neg logit contributions
unn-0.004
amb-0.004
oll-0.004
ouses-0.003
 lamb-0.003
Token begun
Token ID9258
Feature activation+1.49e-02

Pos logit contributions
amb+0.002
unn+0.002
oll+0.002
ovable+0.002
sie+0.002
Neg logit contributions
 barking-0.001
 biting-0.001
 Sam-0.001
ue-0.001
 attacking-0.001
Token to
Token ID284
Feature activation-4.68e-03

Pos logit contributions
 luggage+0.027
unn+0.027
ovable+0.026
amb+0.026
 lamb+0.026
Neg logit contributions
ue-0.022
 biting-0.021
 barking-0.021
 dent-0.019
 Sam-0.019
Token they
Token ID484
Feature activation+6.01e-02

Pos logit contributions
unn+0.186
 lamb+0.183
amb+0.181
sie+0.181
 luggage+0.179
Neg logit contributions
ue-0.140
 biting-0.138
 barking-0.137
 mess-0.125
 attacking-0.122
Token arrived
Token ID5284
Feature activation+2.46e-02

Pos logit contributions
ovable+0.125
ouses+0.122
ocket+0.121
nam+0.121
 wife+0.119
Neg logit contributions
 barking-0.077
ue-0.074
 bite-0.071
 Tim-0.071
 Sam-0.068
Token at
Token ID379
Feature activation-1.48e-02

Pos logit contributions
unn+0.052
amb+0.052
ovable+0.050
 luggage+0.050
 lamb+0.050
Neg logit contributions
 barking-0.030
 biting-0.030
ue-0.029
 mess-0.026
 attacking-0.026
Token the
Token ID262
Feature activation+2.01e-02

Pos logit contributions
 barking+0.011
 biting+0.010
 Sam+0.010
 Tom+0.008
 Sue+0.008
Neg logit contributions
unn-0.041
amb-0.040
ovable-0.040
oll-0.040
ouses-0.039
Token airport
Token ID9003
Feature activation+4.03e-02

Pos logit contributions
unn+0.026
amb+0.025
oll+0.024
ovable+0.024
edly+0.024
Neg logit contributions
 barking-0.045
 mess-0.043
ue-0.042
 biting-0.041
 Sam-0.040
Token,
Token ID11
Feature activation-1.29e-02

Pos logit contributions
unn+0.098
amb+0.097
 luggage+0.097
ovable+0.096
 wife+0.096
Neg logit contributions
 biting-0.034
 barking-0.033
ue-0.031
 attacking-0.027
 mess-0.027
Token Benny
Token ID44275
Feature activation+3.63e-03

Pos logit contributions
 barking+0.017
 biting+0.016
ue+0.016
 mess+0.014
 attacking+0.014
Neg logit contributions
unn-0.026
amb-0.026
ovable-0.026
 lamb-0.026
sie-0.026
Token noticed
Token ID6810
Feature activation+7.42e-03

Pos logit contributions
unn+0.008
oll+0.008
ovable+0.008
amb+0.008
sie+0.008
Neg logit contributions
 barking-0.005
 biting-0.004
ue-0.004
 Sam-0.004
 mess-0.004
Token that
Token ID326
Feature activation-2.32e-02

Pos logit contributions
unn+0.019
amb+0.019
sie+0.018
ouses+0.018
ovable+0.018
Neg logit contributions
 barking-0.008
 Sam-0.007
 biting-0.007
 Tim-0.006
 Tom-0.006
Token it
Token ID340
Feature activation-3.73e-02

Pos logit contributions
 barking+0.024
 biting+0.023
 Sam+0.020
ue+0.019
 attacking+0.019
Neg logit contributions
unn-0.056
amb-0.055
ovable-0.055
oll-0.054
sie-0.053
Token was
Token ID373
Feature activation-2.25e-02

Pos logit contributions
 biting+0.052
 barking+0.051
ue+0.050
 Sam+0.046
 attacking+0.045
Neg logit contributions
ovable-0.069
unn-0.067
 luggage-0.067
sie-0.066
oll-0.066
Token.
Token ID13
Feature activation-3.92e-02

Pos logit contributions
 luggage+0.026
 lamb+0.025
edly+0.025
amb+0.025
sie+0.025
Neg logit contributions
 biting-0.016
 barking-0.015
ue-0.015
 attacking-0.014
 mess-0.013
Token The
Token ID383
Feature activation-3.63e-02

Pos logit contributions
 biting+0.051
 barking+0.051
ue+0.049
 attacking+0.045
 mess+0.044
Neg logit contributions
unn-0.079
amb-0.078
ovable-0.077
oll-0.077
 luggage-0.077
Token doors
Token ID8215
Feature activation+4.08e-03

Pos logit contributions
 barking+0.061
 biting+0.060
ue+0.058
 Sam+0.054
 attacking+0.054
Neg logit contributions
unn-0.061
amb-0.060
ovable-0.059
 luggage-0.059
 lamb-0.058
Token did
Token ID750
Feature activation-6.39e-02

Pos logit contributions
unn+0.007
amb+0.007
ovable+0.007
 luggage+0.007
oll+0.007
Neg logit contributions
 barking-0.006
 biting-0.006
ue-0.006
 Sam-0.006
 attacking-0.005
Token not
Token ID407
Feature activation-8.22e-02

Pos logit contributions
 barking+0.085
 biting+0.085
ue+0.078
 attacking+0.073
 catching+0.066
Neg logit contributions
 luggage-0.120
 passports-0.117
 passport-0.116
 wife-0.115
 servants-0.111
Token open
Token ID1280
Feature activation-8.52e-02

Pos logit contributions
 barking+0.114
 biting+0.113
ue+0.107
 Sam+0.099
 attacking+0.098
Neg logit contributions
unn-0.161
amb-0.159
ovable-0.158
 luggage-0.156
 lamb-0.155
Token.
Token ID13
Feature activation-4.89e-02

Pos logit contributions
 biting+0.094
ue+0.092
 barking+0.092
 attacking+0.078
 Sam+0.075
Neg logit contributions
 luggage-0.189
la-0.178
 lamb-0.177
ovable-0.176
amb-0.176
Token Sara
Token ID24799
Feature activation-4.60e-02

Pos logit contributions
 barking+0.061
 biting+0.061
ue+0.057
 Sam+0.052
 attacking+0.052
Neg logit contributions
unn-0.103
amb-0.101
ovable-0.100
oll-0.099
 luggage-0.099
Token and
Token ID290
Feature activation+1.29e-03

Pos logit contributions
 barking+0.053
 biting+0.052
ue+0.049
 Sam+0.045
 attacking+0.044
Neg logit contributions
unn-0.101
amb-0.100
ovable-0.098
oll-0.097
 lamb-0.097
Token Tom
Token ID4186
Feature activation-8.26e-02

Pos logit contributions
 wife+0.003
oll+0.003
 luggage+0.003
 husband+0.003
amb+0.003
Neg logit contributions
 biting-0.001
 barking-0.001
ue-0.001
 mess-0.001
 attacking-0.001
Token screamed
Token ID25421
Feature activation-4.71e-02

Pos logit contributions
 barking+0.135
 biting+0.133
ue+0.126
 Sam+0.121
 mess+0.118
Neg logit contributions
unn-0.147
amb-0.144
ovable-0.139
sie-0.139
 lamb-0.139
Token and
Token ID290
Feature activation-3.65e-02

Pos logit contributions
 biting+0.123
 barking+0.123
ue+0.118
 Sam+0.109
 attacking+0.107
Neg logit contributions
amb-0.199
 lamb-0.197
unn-0.195
 luggage-0.195
 wife-0.191
Token the
Token ID262
Feature activation-3.29e-02

Pos logit contributions
 barking+0.033
 biting+0.031
 Sam+0.027
 mess+0.024
 Tom+0.024
Neg logit contributions
amb-0.099
oll-0.096
unn-0.096
ovable-0.096
ouses-0.092
Token clouds
Token ID15114
Feature activation-6.05e-02

Pos logit contributions
 barking+0.056
 biting+0.055
 mess+0.052
 attacking+0.049
ue+0.049
Neg logit contributions
unn-0.058
amb-0.057
oll-0.056
ovable-0.056
ouses-0.054
Token and
Token ID290
Feature activation-4.43e-02

Pos logit contributions
ue+0.071
 barking+0.071
 biting+0.070
 Sam+0.064
 attacking+0.062
Neg logit contributions
amb-0.123
 lamb-0.121
 luggage-0.121
ovable-0.121
unn-0.121
Token the
Token ID262
Feature activation-3.90e-02

Pos logit contributions
 barking+0.034
 biting+0.032
 Sam+0.028
 Tom+0.024
 Sue+0.024
Neg logit contributions
amb-0.127
unn-0.124
oll-0.123
ovable-0.122
ouses-0.119
Token sky
Token ID6766
Feature activation-9.75e-02

Pos logit contributions
 barking+0.065
 biting+0.065
 mess+0.060
ue+0.059
 attacking+0.058
Neg logit contributions
unn-0.069
amb-0.068
oll-0.067
ovable-0.066
ouses-0.064
Token.
Token ID13
Feature activation-5.81e-02

Pos logit contributions
 biting+0.123
ue+0.121
 barking+0.121
 Sam+0.110
 attacking+0.110
Neg logit contributions
 luggage-0.181
 husband-0.181
 wife-0.180
 married-0.179
ovable-0.178
Token\n
Token ID198
Feature activation-6.15e-02

Pos logit contributions
 biting+0.074
 barking+0.073
ue+0.069
 mess+0.065
 attacking+0.064
Neg logit contributions
unn-0.118
amb-0.118
ovable-0.117
 luggage-0.116
 wife-0.116
Token\n
Token ID198
Feature activation-5.66e-02

Pos logit contributions
 biting+0.084
 barking+0.083
ue+0.077
 attacking+0.073
 destroying+0.071
Neg logit contributions
unn-0.118
amb-0.117
ovable-0.116
ouses-0.113
 luggage-0.113
TokenThey
Token ID2990
Feature activation-1.42e-02

Pos logit contributions
 barking+0.079
 biting+0.078
ue+0.072
 attacking+0.068
 Sam+0.068
Neg logit contributions
unn-0.108
amb-0.108
ovable-0.107
sie-0.106
 lamb-0.106
Token had
Token ID550
Feature activation-2.35e-02

Pos logit contributions
 barking+0.018
 Sam+0.017
 biting+0.017
 Tom+0.017
 mess+0.016
Neg logit contributions
unn-0.032
amb-0.031
ouses-0.030
ovable-0.030
uster-0.030
Token touched
Token ID12615
Feature activation-5.53e-02

Pos logit contributions
unn+0.020
amb+0.020
ovable+0.020
oll+0.019
 luggage+0.019
Neg logit contributions
 barking-0.017
 biting-0.016
ue-0.015
 Sam-0.015
 attacking-0.014
Token it
Token ID340
Feature activation-2.98e-02

Pos logit contributions
 barking+0.048
 biting+0.047
ue+0.044
 attacking+0.038
 mess+0.037
Neg logit contributions
unn-0.134
amb-0.134
ovable-0.133
 lamb-0.133
 luggage-0.132
Token.
Token ID13
Feature activation-2.15e-02

Pos logit contributions
 barking+0.035
ue+0.034
 biting+0.034
 attacking+0.030
 Sam+0.030
Neg logit contributions
oll-0.061
amb-0.061
ovable-0.061
edly-0.060
 lamb-0.059
Token It
Token ID632
Feature activation-5.48e-02

Pos logit contributions
 biting+0.032
 barking+0.032
ue+0.030
 mess+0.030
 attacking+0.029
Neg logit contributions
amb-0.039
unn-0.038
ovable-0.038
oll-0.037
 wife-0.037
Token was
Token ID373
Feature activation-3.27e-02

Pos logit contributions
 barking+0.082
ue+0.080
 biting+0.079
 attacking+0.072
 Sam+0.070
Neg logit contributions
oll-0.095
ovable-0.094
unn-0.092
 lamb-0.091
sie-0.091
Token soft
Token ID2705
Feature activation-1.04e-01

Pos logit contributions
ue+0.052
 barking+0.050
 biting+0.049
 attacking+0.045
 Sam+0.044
Neg logit contributions
oll-0.055
ovable-0.055
 lamb-0.054
 luggage-0.054
unn-0.054
Token and
Token ID290
Feature activation-2.29e-02

Pos logit contributions
ue+0.152
 barking+0.146
 biting+0.143
 Sam+0.132
 attacking+0.132
Neg logit contributions
amb-0.193
oll-0.190
edly-0.189
 lamb-0.187
 luggage-0.185
Token fluffy
Token ID39145
Feature activation-9.28e-02

Pos logit contributions
 barking+0.039
 biting+0.038
ue+0.038
 attacking+0.035
 Sam+0.034
Neg logit contributions
unn-0.036
ovable-0.036
 lamb-0.036
amb-0.036
oll-0.036
Token.
Token ID13
Feature activation-3.48e-02

Pos logit contributions
ue+0.132
 biting+0.117
 barking+0.116
 collapse+0.110
 mistake+0.110
Neg logit contributions
edly-0.176
amb-0.174
 lamb-0.174
oll-0.173
 luggage-0.171
Token She
Token ID1375
Feature activation-7.91e-03

Pos logit contributions
 barking+0.045
 biting+0.044
ue+0.042
 attacking+0.039
 mess+0.038
Neg logit contributions
unn-0.072
amb-0.071
ovable-0.070
 lamb-0.069
oll-0.069
Token said
Token ID531
Feature activation-2.56e-03

Pos logit contributions
 biting+0.012
 barking+0.012
 Sam+0.011
ue+0.011
 mess+0.010
Neg logit contributions
unn-0.015
amb-0.015
 luggage-0.015
ovable-0.015
sie-0.015
Token the
Token ID262
Feature activation-8.33e-03

Pos logit contributions
 barking+0.049
 biting+0.048
 Sam+0.046
 Tom+0.042
ue+0.040
Neg logit contributions
amb-0.111
ovable-0.108
unn-0.108
sie-0.107
oll-0.106
Token garage
Token ID15591
Feature activation+4.00e-03

Pos logit contributions
 barking+0.013
 biting+0.012
ue+0.012
 Sam+0.011
 attacking+0.011
Neg logit contributions
unn-0.015
amb-0.015
 luggage-0.015
 lamb-0.015
ovable-0.015
Token?
Token ID30
Feature activation-4.28e-02

Pos logit contributions
unn+0.009
amb+0.009
ovable+0.009
 lamb+0.009
sie+0.009
Neg logit contributions
 barking-0.004
 biting-0.004
ue-0.004
 Sam-0.004
 mess-0.004
Token It
Token ID632
Feature activation-7.46e-02

Pos logit contributions
 Sam+0.042
 barking+0.042
 biting+0.039
 Tom+0.039
ue+0.036
Neg logit contributions
unn-0.102
sie-0.102
 luggage-0.098
 servants-0.098
amb-0.097
Token is
Token ID318
Feature activation-7.51e-02

Pos logit contributions
 barking+0.071
 biting+0.070
ue+0.065
 Sam+0.060
 mess+0.059
Neg logit contributions
amb-0.180
unn-0.180
ovable-0.176
 lamb-0.174
sie-0.174
Token dry
Token ID5894
Feature activation-8.75e-02

Pos logit contributions
 barking+0.098
 biting+0.097
ue+0.089
 Sam+0.086
 mess+0.084
Neg logit contributions
unn-0.159
amb-0.157
ovable-0.152
sie-0.150
 lamb-0.149
Token and
Token ID290
Feature activation-3.24e-02

Pos logit contributions
 barking+0.079
 biting+0.078
ue+0.067
 Sam+0.066
 mess+0.066
Neg logit contributions
unn-0.218
ovable-0.212
amb-0.210
 wife-0.209
sie-0.208
Token clean
Token ID3424
Feature activation-7.54e-02

Pos logit contributions
 biting+0.048
 barking+0.048
 mess+0.044
 Sam+0.043
ue+0.043
Neg logit contributions
amb-0.064
unn-0.063
ovable-0.060
sie-0.059
 lamb-0.059
Token and
Token ID290
Feature activation-1.24e-02

Pos logit contributions
 barking+0.073
 biting+0.073
ue+0.065
 Sam+0.061
 mess+0.061
Neg logit contributions
unn-0.183
amb-0.179
ovable-0.177
sie-0.174
 lamb-0.174
Token the
Token ID262
Feature activation-1.60e-02

Pos logit contributions
 biting+0.018
 barking+0.018
 mess+0.017
ue+0.016
 Sam+0.016
Neg logit contributions
amb-0.025
unn-0.024
ovable-0.024
sie-0.023
 lamb-0.023
Token dog
Token ID3290
Feature activation-3.34e-02

Pos logit contributions
 barking+0.026
 biting+0.025
ue+0.023
 mess+0.023
 Sam+0.023
Neg logit contributions
unn-0.029
amb-0.029
ovable-0.028
sie-0.028
oll-0.028
Token on
Token ID319
Feature activation-4.91e-02

Pos logit contributions
 biting+0.080
ue+0.079
 barking+0.079
 Sam+0.071
 attacking+0.071
Neg logit contributions
amb-0.096
unn-0.095
 luggage-0.093
 lamb-0.093
ovable-0.092
Token her
Token ID607
Feature activation+4.25e-02

Pos logit contributions
 biting+0.070
 barking+0.069
ue+0.068
 attacking+0.064
 erupt+0.062
Neg logit contributions
 luggage-0.089
 lamb-0.087
sie-0.082
edly-0.079
nam-0.079
Token.
Token ID13
Feature activation+9.84e-03

Pos logit contributions
year+0.096
unn+0.095
years+0.093
ibr+0.092
ovable+0.091
Neg logit contributions
 barking-0.055
 biting-0.053
 mess-0.051
 mistake-0.047
 bite-0.047
Token It
Token ID632
Feature activation-4.65e-02

Pos logit contributions
unn+0.021
amb+0.021
ovable+0.021
sie+0.021
oll+0.020
Neg logit contributions
 barking-0.012
 biting-0.012
ue-0.011
 Sam-0.011
 Sue-0.010
Token was
Token ID373
Feature activation-1.96e-02

Pos logit contributions
 barking+0.069
 biting+0.068
ue+0.067
 Sam+0.062
 attacking+0.061
Neg logit contributions
 lamb-0.078
 luggage-0.077
sie-0.077
ovable-0.077
unn-0.075
Token heavy
Token ID4334
Feature activation-8.50e-02

Pos logit contributions
ue+0.030
 barking+0.028
 biting+0.028
 attacking+0.025
 Sam+0.025
Neg logit contributions
 luggage-0.035
oll-0.034
ovable-0.034
 lamb-0.034
nam-0.033
Token and
Token ID290
Feature activation-1.52e-02

Pos logit contributions
ue+0.106
 barking+0.099
 biting+0.099
 Sam+0.090
 attacking+0.087
Neg logit contributions
unn-0.177
 luggage-0.176
amb-0.172
oll-0.171
 lamb-0.171
Token hurt
Token ID5938
Feature activation-7.70e-02

Pos logit contributions
 barking+0.021
 biting+0.021
ue+0.020
 Sam+0.019
 attacking+0.018
Neg logit contributions
unn-0.030
amb-0.030
ovable-0.029
 lamb-0.029
sie-0.029
Token her
Token ID607
Feature activation+1.49e-02

Pos logit contributions
 biting+0.095
 barking+0.094
ue+0.091
 attacking+0.082
 mess+0.080
Neg logit contributions
unn-0.160
ovable-0.158
 lamb-0.157
amb-0.156
 luggage-0.155
Token a
Token ID257
Feature activation-3.84e-03

Pos logit contributions
unn+0.031
amb+0.030
sie+0.030
ibr+0.030
years+0.029
Neg logit contributions
 barking-0.021
 biting-0.021
 mess-0.019
 Sam-0.019
 mistake-0.018
Token little
Token ID1310
Feature activation+4.90e-03

Pos logit contributions
 barking+0.007
 biting+0.007
ue+0.007
 mess+0.006
 Sam+0.006
Neg logit contributions
unn-0.006
amb-0.006
ovable-0.006
oll-0.006
sie-0.006