TOP ACTIVATIONS
MAX = 0.090

put the piece in the puzzle ?" Sara said .
Can I play with your computer , Ben ?" she asks
won 't you let me use it ? I promise I
" I don 't have nine . I lose a score
" Can I go to school ?" His mom smiled and
can I go in the office ?" He asked .
I put it in the oven ?" Lily asked .
do you have in your backpack ?" Max said . He
. Please , can I take it ?" Ben says ,
my , can I please spend my money on this robot
. Let me get my sewing kit ," said Tim my
" Can I use your er aser , Lily
why don 't you un pack your things and make yourselves
I am done I will spend some time with you ."
" Can I go first ?" Anna asks .
What do you want to build ?" Lily says .
Lily . Now you can unlock the box ."
" Can I go to university one day ?"
" Can we leave our building here , Mom ?" Anna
He says , " I win . I have more scores

MIN ACTIVATIONS
MIN = -0.092

that day on , Tim and Mo were always together ,
friend , a rabbit named Hipp o . He helped G
the fire . The marsh m allows became all warm and
noisy sound . D andy and Joan both looked in surprise
?" The voice belonged to G iz mo 's friend ,
on , Lily and the sweet re ind eer became good
when she saw the pe b ble and she wanted to
Tim and Mr . Sun played together all day
. Max saw the pe b ble in her pocket and
big trees and hearing the birds singing . Suddenly , he
, Becky spotted a big Hipp o standing in the park
wanted to see the pe b ble too . He said
favorite animal , Mr . Hipp o ." From that day
the most beautiful place the m ule had ever seen .
a big hipp o named Hipp y . Hipp y was
. The m ule stayed in the me
was and saw a pe b ble . He thought it
. They put the marsh m allows on sticks and held
he would miss the lovely m ule , but he was
was amazed and asked the hipp o how he knew how

INTERVAL 0.045 - 0.090
CONTAINS 0.517%

" Who will help me build my nest ?"
jealous . " Ben , can I have your
anything else we need to take home ?" Jack 's eyes
have fun . You can take turns . Or you can
and Tommy played together in the park every day and they

INTERVAL -0.001 - 0.045
CONTAINS 51.493%

to make something with your buttons too ?" Ben said .
in her room as a reminder of her competitive spirit .
, she saw a big , red apple . She took
and saw a shiny necklace with a shell . It was
The moral of the story is that it 's okay to

INTERVAL -0.047 - -0.001
CONTAINS 47.607%

went , the monkey was always in the lead . We
hesitant at first . They did not know if it was
, he was a bit scared . He was also very
, about his problem and Tommy suggested he make a special
proud and happy . She never forgot that long and exciting

INTERVAL -0.092 - -0.047
CONTAINS 0.383%

! Be quiet !" Kitty and Bunny felt sad .
. Bob said to himself , " Now that
, " Hello , Mr . Whale ! We won 't
M abel and her mum were both so
small patch where they grow rad ishes . They water them
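
The four INTERVAL blocks above look like equal-width buckets spanning MIN to MAX. The listed boundaries are about 0.0455 apart, which is (0.090 - (-0.092)) / 4, and the four CONTAINS percentages sum to 100%. The sketch below shows one way such shares could be computed under that assumption. It is not the dashboard's actual code: the function name `interval_shares`, the bin count, and the toy activations are all hypothetical.

```python
from typing import List, Tuple


def interval_shares(
    activations: List[float], n_bins: int = 4
) -> List[Tuple[float, float, float]]:
    """Split [min, max] into equal-width bins.

    Returns one (low, high, percent_of_activations) tuple per bin,
    ordered from the lowest bin to the highest.
    """
    if not activations:
        raise ValueError("need at least one activation")
    lo, hi = min(activations), max(activations)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for a in activations:
        if width == 0:
            idx = 0  # all values identical: everything lands in the first bin
        else:
            # Clamp so the maximum value falls in the last bin, not past it.
            idx = min(int((a - lo) / width), n_bins - 1)
        counts[idx] += 1
    total = len(activations)
    return [
        (lo + i * width, lo + (i + 1) * width, 100.0 * c / total)
        for i, c in enumerate(counts)
    ]


# Toy values only (assumption): NOT the activations behind this dashboard.
toy = [-0.092, -0.05, -0.01, 0.0, 0.01, 0.02, 0.05, 0.090]

# Print highest interval first, mirroring the order used above.
for low, high, pct in reversed(interval_shares(toy)):
    print(f"INTERVAL {low:.3f} - {high:.3f}  CONTAINS {pct:.3f}%")
```

In the real data nearly all activations sit in the two middle buckets (51.493% and 47.607%), so the outer buckets hold only the rare strongly positive or strongly negative cases.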

PER-TOKEN DETAILS (TOP ACTIVATIONS)

Token put
Token ID1234
Feature activation+6.27e-02

Pos logit contributions
 breaks+0.048
 hits+0.043
 my+0.042
er+0.041
 makes+0.040
Neg logit contributions
 crashing-0.029
rolling-0.028
free-0.027
 understand-0.027
 echoing-0.026
Token the
Token ID262
Feature activation+3.88e-02

Pos logit contributions
 breaks+0.089
 slams+0.085
 te+0.084
 hits+0.083
ital+0.083
Neg logit contributions
 Joey-0.090
 Marvin-0.089
 him-0.089
free-0.088
 crashing-0.086
Token piece
Token ID3704
Feature activation+5.49e-02

Pos logit contributions
 breaks+0.038
er+0.037
 slams+0.037
 hits+0.034
ara+0.033
Neg logit contributions
free-0.080
 crashing-0.077
 understand-0.075
uttering-0.073
 ent-0.072
Token in
Token ID287
Feature activation+5.42e-02

Pos logit contributions
 breaks+0.098
 slams+0.088
 te+0.087
 hits+0.086
irts+0.084
Neg logit contributions
free-0.073
 crashing-0.068
ous-0.066
 quite-0.065
 Joey-0.064
Token the
Token ID262
Feature activation+6.13e-02

Pos logit contributions
 breaks+0.078
 hits+0.073
 slams+0.071
up+0.070
ital+0.069
Neg logit contributions
 Joey-0.085
 crashing-0.083
 him-0.082
 quite-0.081
 Marvin-0.080
Token puzzle
Token ID15027
Feature activation+9.03e-02

Pos logit contributions
 breaks+0.066
er+0.066
 hits+0.063
 slams+0.060
aj+0.059
Neg logit contributions
 crashing-0.115
free-0.115
 understand-0.105
 embark-0.102
rolling-0.101
Token?"
Token ID1701
Feature activation+6.65e-03

Pos logit contributions
 breaks+0.133
irts+0.113
 hits+0.107
ara+0.105
 slams+0.105
Neg logit contributions
 quite-0.130
 understand-0.119
 Joey-0.118
 weren-0.116
ous-0.112
Token Sara
Token ID24799
Feature activation+1.37e-02

Pos logit contributions
 breaks+0.010
er+0.009
 hits+0.008
 draws+0.008
 my+0.008
Neg logit contributions
 Joey-0.011
 Marvin-0.011
 crashing-0.011
 understand-0.011
 ent-0.011
Token said
Token ID531
Feature activation-4.65e-03

Pos logit contributions
 breaks+0.024
 hits+0.021
 my+0.021
er+0.019
 makes+0.018
Neg logit contributions
 crashing-0.021
rolling-0.021
 ent-0.020
 echoing-0.020
 Marvin-0.020
Token.
Token ID13
Feature activation-2.41e-02

Pos logit contributions
 crashing+0.009
rolling+0.009
free+0.009
ous+0.009
 echoing+0.009
Neg logit contributions
 breaks-0.006
 hits-0.005
 my-0.005
er-0.005
 makes-0.005
Token
Token ID198
Feature activation-1.43e-02

Pos logit contributions
hing+0.051
ous+0.051
 echoing+0.051
rolling+0.050
owing+0.049
Neg logit contributions
 breaks-0.030
 my-0.025
 hits-0.025
ara-0.025
 Anna-0.024
TokenCan
Token ID6090
Feature activation+2.55e-02

Pos logit contributions
 breaks+0.020
 hits+0.018
 my+0.018
er+0.017
 makes+0.016
Neg logit contributions
 crashing-0.018
rolling-0.018
free-0.017
ous-0.017
 echoing-0.017
Token I
Token ID314
Feature activation+1.54e-02

Pos logit contributions
 breaks+0.037
er+0.033
irts+0.033
 hits+0.032
 holes+0.032
Neg logit contributions
 Marvin-0.040
ous-0.040
 Joey-0.039
 weren-0.039
 crashing-0.039
Token play
Token ID711
Feature activation+6.21e-02

Pos logit contributions
 breaks+0.032
 hits+0.029
er+0.028
 holes+0.028
 slams+0.027
Neg logit contributions
 crashing-0.018
 understand-0.017
rolling-0.017
free-0.017
 recognise-0.016
Token with
Token ID351
Feature activation+5.05e-02

Pos logit contributions
 breaks+0.095
 closes+0.086
irts+0.084
ara+0.084
irted+0.082
Neg logit contributions
 understand-0.090
free-0.087
hing-0.087
 quite-0.086
 him-0.085
Token your
Token ID534
Feature activation+1.67e-02

Pos logit contributions
 breaks+0.050
 hits+0.045
 closes+0.040
 learns+0.040
irts+0.040
Neg logit contributions
 him-0.104
 crashing-0.100
 Joey-0.100
 Marvin-0.099
 Bobby-0.095
Token computer
Token ID3644
Feature activation+8.91e-02

Pos logit contributions
 breaks+0.028
 hits+0.025
er+0.025
 my+0.024
 slams+0.023
Neg logit contributions
 crashing-0.024
rolling-0.024
 echoing-0.023
free-0.023
ous-0.023
Token,
Token ID11
Feature activation-2.76e-03

Pos logit contributions
 breaks+0.130
 holes+0.111
irts+0.107
irted+0.104
 hits+0.101
Neg logit contributions
 understand-0.134
 quite-0.134
ous-0.133
 weren-0.128
free-0.127
Token Ben
Token ID3932
Feature activation+5.22e-02

Pos logit contributions
 crashing+0.005
rolling+0.005
free+0.005
ous+0.005
 echoing+0.005
Neg logit contributions
 breaks-0.004
 hits-0.004
 my-0.003
er-0.003
 makes-0.003
Token?"
Token ID1701
Feature activation-3.10e-05

Pos logit contributions
 breaks+0.079
 holes+0.063
 hits+0.063
 slams+0.061
 splits+0.057
Neg logit contributions
 understand-0.092
 Joey-0.084
 Frankie-0.083
 Marvin-0.081
free-0.081
Token she
Token ID673
Feature activation+2.26e-02

Pos logit contributions
 crashing+0.000
rolling+0.000
 ent+0.000
free+0.000
 Marvin+0.000
Neg logit contributions
 breaks-0.000
 hits-0.000
 my-0.000
er-0.000
 makes-0.000
Token asks
Token ID7893
Feature activation-2.05e-02

Pos logit contributions
 breaks+0.034
er+0.031
 my+0.030
 hits+0.029
 pitcher+0.027
Neg logit contributions
 understand-0.034
 ent-0.033
 crashing-0.033
 Marvin-0.033
 echoing-0.032
Token won
Token ID1839
Feature activation+8.60e-03

Pos logit contributions
 crashing+0.001
rolling+0.001
free+0.001
ous+0.001
 echoing+0.001
Neg logit contributions
 breaks-0.001
 hits-0.001
 my-0.001
er-0.001
 makes-0.001
Token't
Token ID470
Feature activation+1.85e-02

Pos logit contributions
 breaks+0.018
 hits+0.016
er+0.016
 holes+0.015
 draws+0.015
Neg logit contributions
 crashing-0.010
rolling-0.010
ous-0.009
free-0.009
 Marvin-0.009
Token you
Token ID345
Feature activation+2.82e-02

Pos logit contributions
 breaks+0.024
 hits+0.021
er+0.019
 slams+0.019
 holes+0.019
Neg logit contributions
 crashing-0.035
ous-0.035
rolling-0.035
 Joey-0.034
 Marvin-0.034
Token let
Token ID1309
Feature activation+3.73e-02

Pos logit contributions
 breaks+0.054
 hits+0.049
 my+0.048
er+0.047
 makes+0.045
Neg logit contributions
 crashing-0.038
rolling-0.037
free-0.036
 understand-0.036
ous-0.035
Token me
Token ID502
Feature activation+5.65e-02

Pos logit contributions
 breaks+0.041
 hits+0.036
er+0.032
 holes+0.032
 te+0.031
Neg logit contributions
 crashing-0.074
rolling-0.072
 Marvin-0.072
 Joey-0.072
ous-0.071
Token use
Token ID779
Feature activation+8.45e-02

Pos logit contributions
 breaks+0.107
 hits+0.095
er+0.094
ara+0.091
 my+0.091
Neg logit contributions
rolling-0.075
 understand-0.074
 crashing-0.071
 recognise-0.069
free-0.068
Token it
Token ID340
Feature activation+6.22e-02

Pos logit contributions
 breaks+0.088
ara+0.088
 slams+0.084
 hits+0.081
ital+0.081
Neg logit contributions
 quite-0.145
 him-0.144
 Joey-0.142
 crashing-0.142
 understand-0.139
Token?
Token ID30
Feature activation+3.28e-02

Pos logit contributions
 breaks+0.087
 te+0.075
 hits+0.073
 slams+0.073
ara+0.072
Neg logit contributions
ous-0.103
free-0.102
 crashing-0.098
 quite-0.097
 Joey-0.097
Token I
Token ID314
Feature activation-2.04e-03

Pos logit contributions
 breaks+0.050
 hits+0.045
er+0.041
 my+0.041
 holes+0.040
Neg logit contributions
free-0.056
rolling-0.055
 crashing-0.054
ous-0.053
 Marvin-0.051
Token promise
Token ID6991
Feature activation+3.89e-02

Pos logit contributions
free+0.002
 crashing+0.002
rolling+0.002
 understand+0.002
ous+0.002
Neg logit contributions
 breaks-0.004
 my-0.004
 hits-0.004
er-0.004
 makes-0.004
Token I
Token ID314
Feature activation+6.00e-03

Pos logit contributions
 breaks+0.055
 te+0.049
 hits+0.049
 holes+0.047
irts+0.046
Neg logit contributions
 crashing-0.067
ous-0.066
 Joey-0.065
 Frankie-0.064
free-0.064
Token "
Token ID366
Feature activation+4.49e-03

Pos logit contributions
rolling+0.048
 crashing+0.046
hing+0.045
free+0.045
 echoing+0.045
Neg logit contributions
 breaks-0.050
 my-0.045
 hits-0.044
 makes-0.042
er-0.041
TokenI
Token ID40
Feature activation+9.44e-03

Pos logit contributions
 breaks+0.009
 hits+0.008
 my+0.008
er+0.007
 makes+0.007
Neg logit contributions
 crashing-0.006
rolling-0.006
free-0.006
ous-0.006
 echoing-0.006
Token don
Token ID836
Feature activation+8.60e-03

Pos logit contributions
 breaks+0.019
 my+0.018
 hits+0.018
er+0.017
 makes+0.017
Neg logit contributions
 crashing-0.012
rolling-0.011
free-0.011
ous-0.011
 echoing-0.011
Token't
Token ID470
Feature activation-1.42e-02

Pos logit contributions
 breaks+0.018
 hits+0.016
 my+0.016
er+0.016
 draws+0.015
Neg logit contributions
 crashing-0.011
rolling-0.010
free-0.010
ous-0.009
 Marvin-0.009
Token have
Token ID423
Feature activation+5.42e-02

Pos logit contributions
 crashing+0.016
rolling+0.015
free+0.015
ous+0.014
 echoing+0.014
Neg logit contributions
 breaks-0.031
 hits-0.029
 my-0.029
er-0.027
 makes-0.027
Token nine
Token ID5193
Feature activation+8.42e-02

Pos logit contributions
er+0.085
 breaks+0.079
 te+0.077
 slams+0.076
ara+0.076
Neg logit contributions
 crashing-0.079
 understand-0.074
 ent-0.072
 quite-0.072
 weren-0.071
Token.
Token ID13
Feature activation+2.19e-02

Pos logit contributions
 breaks+0.103
 slams+0.099
ital+0.096
ara+0.092
 closes+0.089
Neg logit contributions
 crashing-0.135
 understand-0.126
 hesitation-0.125
 echoing-0.124
owing-0.122
Token I
Token ID314
Feature activation+5.04e-03

Pos logit contributions
 breaks+0.033
 hits+0.032
 mess+0.031
er+0.031
 pieces+0.030
Neg logit contributions
free-0.034
ous-0.032
 Mar-0.031
 Marvin-0.031
 crashing-0.031
Token lose
Token ID4425
Feature activation+5.31e-02

Pos logit contributions
 breaks+0.010
 my+0.009
 hits+0.009
er+0.009
 makes+0.008
Neg logit contributions
 crashing-0.007
free-0.006
rolling-0.006
ous-0.006
 echoing-0.006
Token a
Token ID257
Feature activation+4.78e-02

Pos logit contributions
 breaks+0.056
 hits+0.050
 te+0.044
er+0.044
 draws+0.043
Neg logit contributions
 crashing-0.112
free-0.107
 Marvin-0.106
 Joey-0.105
ous-0.104
Token score
Token ID4776
Feature activation+4.64e-02

Pos logit contributions
 breaks+0.071
er+0.064
 hits+0.064
 my+0.062
 slams+0.060
Neg logit contributions
 crashing-0.082
free-0.076
ous-0.074
 understand-0.074
rolling-0.073
Token "
Token ID366
Feature activation-1.73e-03

Pos logit contributions
hing+0.024
rolling+0.023
 streaming+0.023
ric+0.022
 roam+0.022
Neg logit contributions
 my-0.015
 your-0.012
 Lily-0.012
 Anna-0.012
 this-0.011
TokenCan
Token ID6090
Feature activation+5.29e-03

Pos logit contributions
ous+0.003
 crashing+0.003
rolling+0.003
 echoing+0.002
free+0.002
Neg logit contributions
 breaks-0.003
 my-0.003
er-0.003
 hits-0.003
ara-0.003
Token I
Token ID314
Feature activation+7.19e-03

Pos logit contributions
 breaks+0.007
 hits+0.006
er+0.005
 my+0.005
 holes+0.005
Neg logit contributions
rolling-0.011
 crashing-0.011
ous-0.010
 Marvin-0.010
free-0.010
Token go
Token ID467
Feature activation+4.68e-02

Pos logit contributions
 breaks+0.014
 hits+0.012
 my+0.012
er+0.012
 makes+0.011
Neg logit contributions
 crashing-0.010
rolling-0.010
free-0.010
ous-0.009
 echoing-0.009
Token to
Token ID284
Feature activation+6.35e-02

Pos logit contributions
 breaks+0.092
 hits+0.079
 holes+0.077
 splits+0.077
er+0.077
Neg logit contributions
rolling-0.065
 crashing-0.062
free-0.059
 understand-0.057
 recognise-0.056
Token school
Token ID1524
Feature activation+8.38e-02

Pos logit contributions
 breaks+0.121
er+0.114
 hits+0.109
 te+0.104
 splits+0.103
Neg logit contributions
 crashing-0.078
 understand-0.077
rolling-0.075
 Marvin-0.073
 recognise-0.073
Token?"
Token ID1701
Feature activation-3.08e-03

Pos logit contributions
 breaks+0.154
 hits+0.130
 holes+0.129
 splits+0.129
irted+0.122
Neg logit contributions
 quite-0.117
 understand-0.112
rolling-0.110
ous-0.110
free-0.108
Token His
Token ID2399
Feature activation+8.37e-03

Pos logit contributions
 crashing+0.004
rolling+0.004
free+0.004
ous+0.004
 echoing+0.004
Neg logit contributions
 breaks-0.006
 my-0.005
 hits-0.005
er-0.005
 makes-0.005
Token mom
Token ID1995
Feature activation+1.23e-02

Pos logit contributions
 breaks+0.013
 hits+0.012
 my+0.012
er+0.011
 makes+0.011
Neg logit contributions
 crashing-0.014
rolling-0.014
free-0.013
ous-0.013
 echoing-0.013
Token smiled
Token ID13541
Feature activation+1.36e-03

Pos logit contributions
 breaks+0.015
 hits+0.013
 my+0.013
er+0.012
 makes+0.012
Neg logit contributions
 crashing-0.026
rolling-0.025
free-0.024
ous-0.024
 echoing-0.024
Token and
Token ID290
Feature activation+1.54e-02

Pos logit contributions
 breaks+0.001
 hits+0.001
 my+0.001
er+0.001
 makes+0.001
Neg logit contributions
 crashing-0.003
rolling-0.003
free-0.003
ous-0.003
 echoing-0.003
Token can
Token ID460
Feature activation-7.14e-03

Pos logit contributions
 crashing+0.037
rolling+0.036
free+0.035
ous+0.035
 echoing+0.034
Neg logit contributions
 breaks-0.039
 hits-0.035
 my-0.034
er-0.033
 makes-0.032
Token I
Token ID314
Feature activation-9.47e-03

Pos logit contributions
 crashing+0.016
rolling+0.016
free+0.016
ous+0.016
 Marvin+0.015
Neg logit contributions
 breaks-0.007
 hits-0.006
 my-0.006
er-0.005
 makes-0.005
Token go
Token ID467
Feature activation+3.08e-02

Pos logit contributions
 crashing+0.014
free+0.014
rolling+0.014
ous+0.013
 echoing+0.013
Neg logit contributions
 breaks-0.017
 my-0.016
 hits-0.015
 makes-0.014
er-0.014
Token in
Token ID287
Feature activation+5.19e-02

Pos logit contributions
 breaks+0.057
 hits+0.048
er+0.047
ara+0.046
 draws+0.046
Neg logit contributions
rolling-0.047
 crashing-0.046
free-0.043
 Marvin-0.042
 roam-0.042
Token the
Token ID262
Feature activation+5.08e-02

Pos logit contributions
 breaks+0.099
er+0.087
irted+0.087
 draws+0.086
 splits+0.086
Neg logit contributions
 Marvin-0.067
 Mar-0.067
 Joey-0.064
 Frankie-0.064
 crashing-0.062
Token office
Token ID2607
Feature activation+8.31e-02

Pos logit contributions
 breaks+0.061
er+0.056
 hits+0.055
 my+0.052
 teaches+0.052
Neg logit contributions
 crashing-0.095
rolling-0.091
free-0.088
 inspect-0.087
 ent-0.085
Token?"
Token ID1701
Feature activation-4.46e-04

Pos logit contributions
 breaks+0.143
 holes+0.119
 hits+0.113
 splits+0.111
irted+0.111
Neg logit contributions
ous-0.122
owing-0.121
 quite-0.121
 crashing-0.117
 echoing-0.116
Token He
Token ID679
Feature activation+1.44e-02

Pos logit contributions
 crashing+0.001
rolling+0.001
free+0.001
 Marvin+0.001
 echoing+0.001
Neg logit contributions
 breaks-0.001
 hits-0.001
 my-0.001
er-0.001
 makes-0.001
Token asked
Token ID1965
Feature activation-2.59e-02

Pos logit contributions
 breaks+0.029
 hits+0.026
 my+0.025
er+0.024
irts+0.023
Neg logit contributions
rolling-0.019
 Marvin-0.018
 crashing-0.018
 ent-0.018
 echoing-0.017
Token.
Token ID13
Feature activation-3.13e-02

Pos logit contributions
 crashing+0.052
rolling+0.051
ous+0.050
free+0.049
 echoing+0.049
Neg logit contributions
 breaks-0.034
 my-0.029
 hits-0.029
er-0.028
 makes-0.026
Token
Token ID198
Feature activation-2.84e-02

Pos logit contributions
 crashing+0.054
rolling+0.052
 echoing+0.052
free+0.052
ous+0.051
Neg logit contributions
 breaks-0.050
 my-0.046
 hits-0.044
er-0.042
 makes-0.041
Token I
Token ID314
Feature activation-7.59e-03

Pos logit contributions
 breaks+0.026
 hits+0.024
er+0.022
 slams+0.021
irts+0.021
Neg logit contributions
free-0.031
 crashing-0.031
 Joey-0.031
ous-0.031
 weren-0.031
Token put
Token ID1234
Feature activation+3.66e-02

Pos logit contributions
 crashing+0.012
rolling+0.012
 echoing+0.012
ous+0.011
free+0.011
Neg logit contributions
 breaks-0.013
 my-0.012
 hits-0.012
 makes-0.011
er-0.011
Token it
Token ID340
Feature activation+3.13e-02

Pos logit contributions
 breaks+0.055
 hits+0.053
 slams+0.050
 learns+0.047
 draws+0.047
Neg logit contributions
free-0.057
 Joey-0.056
 crashing-0.054
 Mar-0.053
 quite-0.052
Token in
Token ID287
Feature activation+4.92e-02

Pos logit contributions
 breaks+0.056
 hits+0.050
 slams+0.048
 draws+0.046
er+0.046
Neg logit contributions
free-0.045
 crashing-0.043
 Joey-0.041
rolling-0.041
ous-0.041
Token the
Token ID262
Feature activation+2.90e-02

Pos logit contributions
 breaks+0.084
 hits+0.080
 slams+0.073
 falls+0.071
 learns+0.069
Neg logit contributions
 Joey-0.075
 quite-0.072
 Frankie-0.067
 crashing-0.067
 Mar-0.067
Token oven
Token ID14361
Feature activation+8.30e-02

Pos logit contributions
 breaks+0.033
 hits+0.030
er+0.027
 my+0.027
 draws+0.025
Neg logit contributions
 crashing-0.059
free-0.057
rolling-0.054
 ent-0.054
 Mar-0.053
Token?"
Token ID1701
Feature activation+1.95e-02

Pos logit contributions
 breaks+0.131
 hits+0.106
 holes+0.104
 learns+0.101
 slams+0.096
Neg logit contributions
 Joey-0.123
ous-0.121
owing-0.118
 quite-0.117
 crashing-0.115
Token Lily
Token ID20037
Feature activation+2.59e-02

Pos logit contributions
 breaks+0.029
 hits+0.026
er+0.025
 my+0.023
ers+0.023
Neg logit contributions
 Joey-0.032
 Marvin-0.032
 understand-0.032
rolling-0.032
 crashing-0.032
Token asked
Token ID1965
Feature activation-1.19e-02

Pos logit contributions
 breaks+0.051
 hits+0.046
 my+0.044
 makes+0.041
er+0.041
Neg logit contributions
rolling-0.035
 crashing-0.034
 ent-0.034
 echoing-0.033
 Marvin-0.032
Token.
Token ID13
Feature activation-2.28e-02

Pos logit contributions
 crashing+0.026
rolling+0.025
ous+0.025
 echoing+0.025
free+0.024
Neg logit contributions
 breaks-0.014
 my-0.011
 hits-0.011
er-0.011
 makes-0.010
Token
Token ID198
Feature activation-1.05e-02

Pos logit contributions
 crashing+0.046
 echoing+0.045
hing+0.045
owing+0.045
piring+0.045
Neg logit contributions
 breaks-0.032
 hits-0.027
 my-0.027
 makes-0.025
 draws-0.025
Token do
Token ID466
Feature activation+2.81e-02

Pos logit contributions
 breaks+0.008
 hits+0.007
 my+0.007
er+0.007
 te+0.006
Neg logit contributions
 crashing-0.007
free-0.007
ous-0.007
rolling-0.007
 understand-0.007
Token you
Token ID345
Feature activation+1.01e-02

Pos logit contributions
 breaks+0.032
 hits+0.028
er+0.026
 my+0.024
 slams+0.024
Neg logit contributions
 crashing-0.054
free-0.054
ous-0.054
 Marvin-0.053
rolling-0.053
Token have
Token ID423
Feature activation+5.77e-02

Pos logit contributions
 breaks+0.019
 hits+0.017
 my+0.017
er+0.016
 makes+0.016
Neg logit contributions
 crashing-0.014
rolling-0.014
free-0.014
ous-0.014
 echoing-0.013
Token in
Token ID287
Feature activation+4.93e-02

Pos logit contributions
er+0.078
 breaks+0.077
 teaches+0.074
 holes+0.073
 te+0.072
Neg logit contributions
uttering-0.081
 going-0.080
itting-0.077
to-0.077
 streaming-0.076
Token your
Token ID534
Feature activation+5.13e-02

Pos logit contributions
 breaks+0.045
 learns+0.043
er+0.042
up+0.041
 waits+0.039
Neg logit contributions
 Marvin-0.094
 Frankie-0.093
 crashing-0.092
 Mar-0.092
 Joey-0.091
Token backpack
Token ID22526
Feature activation+8.26e-02

Pos logit contributions
er+0.064
ara+0.057
 breaks+0.056
 hits+0.054
up+0.053
Neg logit contributions
rolling-0.076
 roam-0.074
 crashing-0.073
 stroll-0.073
 mis-0.072
Token?"
Token ID1701
Feature activation-1.44e-03

Pos logit contributions
 breaks+0.129
 holes+0.109
 te+0.104
irts+0.103
 jars+0.102
Neg logit contributions
 understand-0.121
ous-0.121
 weren-0.119
rolling-0.117
 crashing-0.117
Token Max
Token ID5436
Feature activation+1.97e-02

Pos logit contributions
 crashing+0.003
rolling+0.003
free+0.003
ous+0.003
 echoing+0.003
Neg logit contributions
 breaks-0.002
 hits-0.001
 my-0.001
er-0.001
 makes-0.001
Token said
Token ID531
Feature activation-8.67e-03

Pos logit contributions
 breaks+0.036
 my+0.031
 hits+0.031
er+0.029
 makes+0.028
Neg logit contributions
 crashing-0.029
 echoing-0.028
rolling-0.028
 Marvin-0.028
 ent-0.027
Token.
Token ID13
Feature activation-2.01e-02

Pos logit contributions
 crashing+0.018
rolling+0.018
free+0.017
ous+0.017
 echoing+0.017
Neg logit contributions
 breaks-0.011
 hits-0.009
 my-0.009
er-0.008
 makes-0.008
Token He
Token ID679
Feature activation+2.51e-02

Pos logit contributions
rolling+0.030
 crashing+0.030
free+0.028
ous+0.028
 echoing+0.028
Neg logit contributions
 breaks-0.037
 hits-0.034
 my-0.033
er-0.031
 makes-0.031
Token.
Token ID13
Feature activation+2.92e-02

Pos logit contributions
 breaks+0.032
 hits+0.029
 draws+0.026
 slams+0.026
ara+0.024
Neg logit contributions
free-0.036
 crashing-0.036
ous-0.035
 echoing-0.034
 quite-0.033
Token Please
Token ID4222
Feature activation+3.38e-02

Pos logit contributions
 breaks+0.052
 hits+0.048
er+0.046
 mess+0.044
 pieces+0.044
Neg logit contributions
 crashing-0.043
free-0.042
 Marvin-0.040
ous-0.039
 Joey-0.039
Token,
Token ID11
Feature activation+6.01e-03

Pos logit contributions
 breaks+0.041
 hits+0.035
 my+0.032
er+0.031
 holes+0.030
Neg logit contributions
 crashing-0.069
free-0.068
rolling-0.068
 understand-0.065
ous-0.064
Token can
Token ID460
Feature activation+2.15e-02

Pos logit contributions
 breaks+0.011
 hits+0.010
er+0.009
 my+0.009
 slams+0.009
Neg logit contributions
 crashing-0.009
free-0.008
 understand-0.008
 Marvin-0.008
rolling-0.008
Token I
Token ID314
Feature activation+1.18e-02

Pos logit contributions
 breaks+0.030
 hits+0.027
er+0.026
 holes+0.026
irts+0.024
Neg logit contributions
 Marvin-0.036
rolling-0.036
 crashing-0.036
 Joey-0.035
ous-0.035
Token take
Token ID1011
Feature activation+8.25e-02

Pos logit contributions
 breaks+0.023
 hits+0.021
er+0.020
 my+0.020
 holes+0.019
Neg logit contributions
 crashing-0.015
rolling-0.015
 understand-0.015
free-0.015
ous-0.014
Token it
Token ID340
Feature activation+4.14e-02

Pos logit contributions
 slams+0.131
aj+0.121
ital+0.118
 hits+0.117
ara+0.117
Neg logit contributions
 him-0.123
free-0.108
 Marvin-0.107
 Joey-0.106
 crashing-0.102
Token?"
Token ID1701
Feature activation+2.14e-02

Pos logit contributions
 breaks+0.080
 slams+0.071
 hits+0.071
 draws+0.069
 learns+0.068
Neg logit contributions
free-0.048
 crashing-0.047
 quite-0.045
ous-0.043
fully-0.043
Token Ben
Token ID3932
Feature activation+2.87e-02

Pos logit contributions
 breaks+0.030
 hits+0.026
 my+0.025
er+0.025
 makes+0.023
Neg logit contributions
 crashing-0.040
rolling-0.039
free-0.038
 Joey-0.038
 Marvin-0.038
Token says
Token ID1139
Feature activation+2.57e-03

Pos logit contributions
 breaks+0.028
 my+0.024
er+0.024
 hits+0.023
 pieces+0.020
Neg logit contributions
 understand-0.059
 crashing-0.057
 echoing-0.057
 Marvin-0.056
 ent-0.056
Token,
Token ID11
Feature activation-2.36e-02

Pos logit contributions
 breaks+0.002
 hits+0.002
 my+0.002
 draws+0.002
 makes+0.002
Neg logit contributions
 crashing-0.006
free-0.006
ous-0.006
rolling-0.006
 echoing-0.006
Tokenmy
Token ID1820
Feature activation+1.65e-02

Pos logit contributions
 breaks+0.029
 hits+0.026
 my+0.025
er+0.024
 makes+0.023
Neg logit contributions
 crashing-0.022
rolling-0.021
ous-0.021
free-0.021
 echoing-0.020
Token,
Token ID11
Feature activation-1.73e-02

Pos logit contributions
 breaks+0.016
 hits+0.014
 my+0.012
er+0.012
 holes+0.011
Neg logit contributions
 crashing-0.037
rolling-0.037
free-0.037
ous-0.037
 Joey-0.036
Token can
Token ID460
Feature activation+4.16e-03

Pos logit contributions
 crashing+0.031
free+0.029
ous+0.029
rolling+0.029
 Marvin+0.029
Neg logit contributions
 breaks-0.026
 hits-0.023
er-0.022
 my-0.021
 te-0.020
Token I
Token ID314
Feature activation+3.76e-03

Pos logit contributions
 breaks+0.005
 hits+0.004
er+0.004
 my+0.004
 holes+0.003
Neg logit contributions
 crashing-0.009
rolling-0.009
ous-0.009
 Marvin-0.009
free-0.008
Token please
Token ID3387
Feature activation+5.59e-02

Pos logit contributions
 breaks+0.007
 hits+0.006
 my+0.006
er+0.006
 makes+0.006
Neg logit contributions
 crashing-0.005
rolling-0.005
free-0.005
ous-0.005
 echoing-0.005
Token spend
Token ID4341
Feature activation+8.22e-02

Pos logit contributions
 breaks+0.114
 hits+0.102
er+0.097
 holes+0.096
 my+0.094
Neg logit contributions
rolling-0.070
 crashing-0.069
 understand-0.067
free-0.065
 Joey-0.062
Token my
Token ID616
Feature activation+2.66e-02

Pos logit contributions
 breaks+0.112
 splits+0.103
 te+0.102
ara+0.102
 pitcher+0.102
Neg logit contributions
 quite-0.123
 him-0.122
 Marvin-0.117
 Joey-0.113
 crashing-0.110
Token money
Token ID1637
Feature activation+6.38e-02

Pos logit contributions
 breaks+0.040
 hits+0.035
 my+0.034
er+0.031
 makes+0.031
Neg logit contributions
 crashing-0.049
rolling-0.048
ous-0.047
free-0.046
 echoing-0.046
Token on
Token ID319
Feature activation+4.57e-02

Pos logit contributions
 breaks+0.116
 hits+0.096
 slams+0.092
 splits+0.091
 draws+0.089
Neg logit contributions
free-0.097
 crashing-0.093
 quite-0.092
owing-0.092
rolling-0.088
Token this
Token ID428
Feature activation+3.18e-02

Pos logit contributions
 breaks+0.067
 hits+0.059
 falls+0.056
er+0.055
 splits+0.055
Neg logit contributions
 Marvin-0.074
 Joey-0.073
 crashing-0.071
 Frankie-0.070
 him-0.070
Token robot
Token ID9379
Feature activation+5.35e-02

Pos logit contributions
 breaks+0.051
 hits+0.044
irts+0.044
ara+0.042
er+0.042
Neg logit contributions
rolling-0.046
 crashing-0.045
ous-0.044
free-0.043
 quite-0.043
Token.
Token ID13
Feature activation+6.53e-03

Pos logit contributions
 breaks+0.031
 hits+0.027
 my+0.026
 slams+0.025
 holes+0.024
Neg logit contributions
free-0.045
 crashing-0.044
 Joey-0.044
ous-0.043
rolling-0.043
Token Let
Token ID3914
Feature activation+2.25e-02

Pos logit contributions
 breaks+0.009
 my+0.008
 hits+0.008
er+0.007
 makes+0.007
Neg logit contributions
 crashing-0.012
rolling-0.012
free-0.012
ous-0.012
 echoing-0.012
Token me
Token ID502
Feature activation+2.60e-02

Pos logit contributions
 breaks+0.023
 slams+0.020
 hits+0.019
 closes+0.017
 jars+0.016
Neg logit contributions
 Joey-0.048
 understand-0.047
 Frankie-0.046
hing-0.046
 weren-0.046
Token get
Token ID651
Feature activation+3.53e-02

Pos logit contributions
 breaks+0.051
 hits+0.046
 my+0.046
er+0.044
 makes+0.043
Neg logit contributions
 crashing-0.034
rolling-0.034
free-0.033
ous-0.032
 echoing-0.032
Token my
Token ID616
Feature activation+9.07e-03

Pos logit contributions
 breaks+0.046
 hits+0.041
er+0.040
ara+0.038
 slams+0.038
Neg logit contributions
free-0.066
 Joey-0.066
 crashing-0.066
 Marvin-0.062
rolling-0.062
Token sewing
Token ID35356
Feature activation+8.22e-02

Pos logit contributions
 breaks+0.012
 hits+0.011
 my+0.011
er+0.010
 makes+0.010
Neg logit contributions
 crashing-0.017
rolling-0.017
free-0.017
ous-0.017
 echoing-0.017
Token kit
Token ID6220
Feature activation+5.41e-02

Pos logit contributions
 breaks+0.104
 slams+0.101
irts+0.096
ocus+0.091
 hits+0.090
Neg logit contributions
free-0.131
fully-0.116
ous-0.111
un-0.109
hing-0.109
Token,"
Token ID553
Feature activation+9.16e-03

Pos logit contributions
 breaks+0.075
 hits+0.067
 slams+0.064
ara+0.058
 draws+0.057
Neg logit contributions
free-0.100
ous-0.094
 understand-0.091
 crashing-0.090
 weren-0.089
Token said
Token ID531
Feature activation-2.03e-02

Pos logit contributions
 breaks+0.014
 my+0.012
 hits+0.012
er+0.011
 makes+0.011
Neg logit contributions
 crashing-0.017
rolling-0.016
free-0.016
ous-0.016
 echoing-0.016
Token Tim
Token ID5045
Feature activation-1.32e-02

Pos logit contributions
 crashing+0.039
rolling+0.038
free+0.036
 echoing+0.036
ous+0.036
Neg logit contributions
 breaks-0.029
 my-0.026
 hits-0.025
er-0.024
 makes-0.022
Tokenmy
Token ID1820
Feature activation-6.62e-04

Pos logit contributions
 crashing+0.025
rolling+0.025
uttering+0.024
 echoing+0.023
 Marvin+0.023
Neg logit contributions
 breaks-0.019
 my-0.017
 hits-0.017
er-0.017
 makes-0.016
Token
Token ID198
Feature activation-5.91e-03

Pos logit contributions
 breaks+0.029
 hits+0.027
er+0.025
 my+0.025
 makes+0.024
Neg logit contributions
 crashing-0.014
free-0.013
 Joey-0.013
rolling-0.013
 Marvin-0.013
Token
Token ID198
Feature activation+8.56e-03

Pos logit contributions
 crashing+0.009
rolling+0.009
 echoing+0.009
ous+0.009
free+0.009
Neg logit contributions
 breaks-0.010
 my-0.009
 hits-0.009
er-0.009
 makes-0.009
Token"
Token ID1
Feature activation+1.90e-02

Pos logit contributions
 breaks+0.015
 hits+0.013
 my+0.012
 makes+0.012
 slams+0.012
Neg logit contributions
 crashing-0.014
free-0.013
rolling-0.013
 understand-0.013
Frank-0.012
TokenCan
Token ID6090
Feature activation+3.18e-02

Pos logit contributions
 breaks+0.038
 hits+0.035
 my+0.034
er+0.032
 makes+0.031
Neg logit contributions
 crashing-0.024
free-0.023
rolling-0.023
 echoing-0.022
 Marvin-0.022
Token I
Token ID314
Feature activation+1.30e-02

Pos logit contributions
 breaks+0.046
irts+0.043
er+0.041
 hits+0.041
 holes+0.041
Neg logit contributions
 Marvin-0.048
 weren-0.048
 Joey-0.048
 Squirrel-0.047
ous-0.046
Token use
Token ID779
Feature activation+8.20e-02

Pos logit contributions
 breaks+0.025
 hits+0.022
er+0.022
 holes+0.021
 slams+0.021
Neg logit contributions
 understand-0.017
free-0.017
 crashing-0.016
rolling-0.016
ous-0.015
Token your
Token ID534
Feature activation+3.09e-02

Pos logit contributions
ital+0.082
 breaks+0.081
 slams+0.079
 closes+0.073
ara+0.073
Neg logit contributions
 him-0.148
 crashing-0.147
 Joey-0.147
 quite-0.143
 Marvin-0.142
Token er
Token ID1931
Feature activation+2.45e-02

Pos logit contributions
 breaks+0.036
 hits+0.031
 my+0.030
er+0.029
 slams+0.027
Neg logit contributions
 crashing-0.061
free-0.060
rolling-0.060
ous-0.059
 Marvin-0.057
Tokenaser
Token ID6005
Feature activation+7.18e-02

Pos logit contributions
 breaks+0.048
 hits+0.044
 my+0.043
 slams+0.041
 makes+0.041
Neg logit contributions
ous-0.030
free-0.029
rolling-0.029
uttering-0.028
owing-0.028
Token,
Token ID11
Feature activation+1.98e-02

Pos logit contributions
 breaks+0.099
ara+0.089
irts+0.086
 te+0.085
 holes+0.083
Neg logit contributions
ous-0.110
 understand-0.109
free-0.107
 quite-0.106
owing-0.103
Token Lily
Token ID20037
Feature activation+3.90e-02

Pos logit contributions
 breaks+0.034
 hits+0.030
er+0.028
 makes+0.026
 draws+0.026
Neg logit contributions
 crashing-0.031
 Joey-0.029
 Marvin-0.029
 echoing-0.029
 understand-0.029
Token why
Token ID1521
Feature activation+1.76e-02

Pos logit contributions
 breaks+0.030
 hits+0.027
er+0.025
 my+0.025
 slams+0.024
Neg logit contributions
 crashing-0.029
free-0.027
rolling-0.027
ous-0.026
 echoing-0.026
Token don
Token ID836
Feature activation+8.85e-04

Pos logit contributions
 breaks+0.034
 hits+0.031
 my+0.031
er+0.030
 makes+0.028
Neg logit contributions
 crashing-0.023
free-0.022
rolling-0.022
ous-0.022
 echoing-0.022
Token't
Token ID470
Feature activation+1.22e-02

Pos logit contributions
 breaks+0.002
 hits+0.002
 my+0.002
er+0.002
 draws+0.002
Neg logit contributions
 crashing-0.001
rolling-0.001
ous-0.001
free-0.001
 Marvin-0.001
Token you
Token ID345
Feature activation+8.05e-03

Pos logit contributions
 breaks+0.013
 hits+0.011
 my+0.010
er+0.009
 makes+0.009
Neg logit contributions
 crashing-0.027
rolling-0.027
ous-0.026
free-0.026
 echoing-0.026
Token un
Token ID555
Feature activation+4.08e-02

Pos logit contributions
 breaks+0.015
 my+0.013
 hits+0.013
er+0.012
 makes+0.012
Neg logit contributions
 crashing-0.012
rolling-0.011
free-0.011
ous-0.011
 echoing-0.011
Tokenpack
Token ID8002
Feature activation+8.19e-02

Pos logit contributions
 breaks+0.077
 hits+0.070
 my+0.067
 makes+0.065
 holes+0.065
Neg logit contributions
rolling-0.058
 crashing-0.053
owing-0.051
free-0.050
uttering-0.049
Token your
Token ID534
Feature activation+3.95e-02

Pos logit contributions
 hits+0.100
 breaks+0.097
 te+0.094
 slams+0.093
 pitcher+0.090
Neg logit contributions
 quite-0.132
 him-0.129
 Marvin-0.128
free-0.128
 crashing-0.125
Token things
Token ID1243
Feature activation+3.48e-02

Pos logit contributions
 breaks+0.050
er+0.047
 hits+0.047
ara+0.044
irts+0.041
Neg logit contributions
free-0.062
rolling-0.062
 Marvin-0.061
 crashing-0.061
 keen-0.060
Token and
Token ID290
Feature activation+2.20e-02

Pos logit contributions
 breaks+0.048
 hits+0.041
 te+0.036
er+0.036
 my+0.035
Neg logit contributions
free-0.062
 crashing-0.062
ous-0.062
 weren-0.059
rolling-0.059
Token make
Token ID787
Feature activation+4.22e-02

Pos logit contributions
 breaks+0.040
 hits+0.037
er+0.036
ara+0.035
 te+0.034
Neg logit contributions
 crashing-0.030
 understand-0.030
free-0.028
rolling-0.028
 roam-0.028
Token yourselves
Token ID27012
Feature activation+2.05e-03

Pos logit contributions
 breaks+0.063
 hits+0.062
 slams+0.055
er+0.054
ara+0.051
Neg logit contributions
 understand-0.067
 crashing-0.067
 ent-0.065
 Marvin-0.064
 Joey-0.064
Token I
Token ID314
Feature activation+8.06e-03

Pos logit contributions
 breaks+0.046
 hits+0.042
er+0.038
 te+0.036
 slams+0.035
Neg logit contributions
 crashing-0.079
 Joey-0.076
rolling-0.076
free-0.076
 Marvin-0.075
Token am
Token ID716
Feature activation+9.49e-03

Pos logit contributions
 breaks+0.014
 hits+0.013
 my+0.013
 makes+0.012
er+0.012
Neg logit contributions
 crashing-0.012
rolling-0.012
ous-0.012
 echoing-0.012
free-0.012
Token done
Token ID1760
Feature activation+2.40e-02

Pos logit contributions
 breaks+0.016
 hits+0.015
 my+0.014
er+0.014
 draws+0.013
Neg logit contributions
 crashing-0.015
rolling-0.015
free-0.015
ous-0.014
 ent-0.014
Token I
Token ID314
Feature activation+1.24e-02

Pos logit contributions
 breaks+0.039
 hits+0.036
 my+0.033
er+0.032
ara+0.032
Neg logit contributions
 crashing-0.039
free-0.038
rolling-0.038
ous-0.036
 Joey-0.036
Token will
Token ID481
Feature activation+1.53e-02

Pos logit contributions
 breaks+0.022
 hits+0.020
 my+0.020
er+0.018
 makes+0.018
Neg logit contributions
 crashing-0.019
rolling-0.019
free-0.018
ous-0.018
 echoing-0.018
Token spend
Token ID4341
Feature activation+8.09e-02

Pos logit contributions
 breaks+0.029
 hits+0.026
 my+0.025
er+0.024
 makes+0.024
Neg logit contributions
 crashing-0.022
rolling-0.021
free-0.021
 understand-0.020
ous-0.020
Token some
Token ID617
Feature activation+4.98e-02

Pos logit contributions
 breaks+0.114
 hits+0.113
ara+0.108
 slams+0.106
 draws+0.105
Neg logit contributions
 quite-0.123
free-0.115
 understand-0.113
 Joey-0.113
 ease-0.109
Token time
Token ID640
Feature activation+3.26e-02

Pos logit contributions
 breaks+0.081
 hits+0.078
ara+0.076
 draws+0.075
 slams+0.073
Neg logit contributions
free-0.071
 crashing-0.065
 understand-0.065
 quite-0.064
 echoing-0.063
Token with
Token ID351
Feature activation+3.98e-02

Pos logit contributions
 breaks+0.053
 hits+0.049
ara+0.044
 slams+0.043
 my+0.043
Neg logit contributions
free-0.053
rolling-0.052
 crashing-0.052
 ent-0.049
 understand-0.048
Token you
Token ID345
Feature activation+2.33e-02

Pos logit contributions
 breaks+0.047
 hits+0.044
er+0.036
 draws+0.036
 makes+0.035
Neg logit contributions
 crashing-0.080
 Joey-0.078
free-0.078
 Marvin-0.077
rolling-0.076
Token."
Token ID526
Feature activation+8.46e-03

Pos logit contributions
 breaks+0.025
 hits+0.021
 my+0.020
er+0.018
 makes+0.017
Neg logit contributions
 crashing-0.051
free-0.050
rolling-0.050
ous-0.049
 echoing-0.048
Token
Token ID198
Feature activation+1.92e-02

Pos logit contributions
 breaks+0.024
 my+0.022
 hits+0.021
 makes+0.020
 draws+0.019
Neg logit contributions
 crashing-0.021
rolling-0.021
ous-0.021
 echoing-0.020
 streaming-0.020
Token"
Token ID1
Feature activation+1.81e-02

Pos logit contributions
 breaks+0.036
 hits+0.034
 my+0.031
er+0.031
 slams+0.030
Neg logit contributions
 crashing-0.027
free-0.026
rolling-0.025
 understand-0.025
 ent-0.025
TokenCan
Token ID6090
Feature activation+2.90e-02

Pos logit contributions
 breaks+0.036
 hits+0.033
 my+0.032
er+0.031
 makes+0.031
Neg logit contributions
 crashing-0.024
rolling-0.024
ous-0.023
free-0.023
 echoing-0.022
Token I
Token ID314
Feature activation+9.73e-03

Pos logit contributions
 breaks+0.041
er+0.037
irts+0.037
 hits+0.037
 holes+0.036
Neg logit contributions
 crashing-0.047
rolling-0.046
 Marvin-0.046
ous-0.046
free-0.045
Token go
Token ID467
Feature activation+4.61e-02

Pos logit contributions
 breaks+0.020
 hits+0.018
 my+0.017
er+0.017
 makes+0.017
Neg logit contributions
 crashing-0.012
rolling-0.012
free-0.011
 understand-0.011
ous-0.011
Token first
Token ID717
Feature activation+8.06e-02

Pos logit contributions
 breaks+0.081
 holes+0.068
 jars+0.065
 splits+0.065
irts+0.065
Neg logit contributions
 crashing-0.074
rolling-0.072
free-0.067
 Marvin-0.063
 Joey-0.063
Token?"
Token ID1701
Feature activation+1.60e-02

Pos logit contributions
 breaks+0.121
 holes+0.098
 slams+0.090
 splits+0.089
 jars+0.089
Neg logit contributions
 crashing-0.135
 understand-0.135
rolling-0.134
 Joey-0.129
ous-0.128
Token Anna
Token ID11735
Feature activation+3.07e-02

Pos logit contributions
 breaks+0.024
 hits+0.021
er+0.021
 my+0.019
 makes+0.018
Neg logit contributions
 crashing-0.028
 Marvin-0.027
 Joey-0.027
rolling-0.026
 ent-0.026
Token asks
Token ID7893
Feature activation-9.97e-03

Pos logit contributions
 breaks+0.052
 hits+0.043
 my+0.042
er+0.041
 holes+0.040
Neg logit contributions
 ent-0.046
 understand-0.046
 Marvin-0.045
 crashing-0.044
rolling-0.042
Token.
Token ID13
Feature activation-1.66e-02

Pos logit contributions
rolling+0.022
 crashing+0.021
 echoing+0.021
ous+0.021
free+0.021
Neg logit contributions
 breaks-0.011
 my-0.010
 hits-0.009
er-0.009
 makes-0.008
Token
Token ID198
Feature activation-4.13e-03

Pos logit contributions
hing+0.032
ous+0.032
owing+0.032
 echoing+0.031
 crashing+0.031
Neg logit contributions
 breaks-0.023
 my-0.021
 hits-0.020
 makes-0.019
ara-0.018
Token What
Token ID1867
Feature activation+1.46e-02

Pos logit contributions
 breaks+0.028
 hits+0.024
 my+0.023
er+0.023
 slams+0.021
Neg logit contributions
 crashing-0.036
rolling-0.035
free-0.035
ous-0.034
 Marvin-0.033
Token do
Token ID466
Feature activation+3.71e-02

Pos logit contributions
 breaks+0.020
 hits+0.017
 my+0.017
er+0.016
 makes+0.015
Neg logit contributions
 crashing-0.028
rolling-0.027
free-0.027
ous-0.027
 echoing-0.027
Token you
Token ID345
Feature activation+4.53e-03

Pos logit contributions
 breaks+0.047
 hits+0.042
er+0.041
 slams+0.040
 draws+0.038
Neg logit contributions
 Marvin-0.066
ous-0.065
 Joey-0.064
 understand-0.064
 crashing-0.064
Token want
Token ID765
Feature activation+5.83e-02

Pos logit contributions
 breaks+0.010
 my+0.009
er+0.009
 hits+0.009
ara+0.009
Neg logit contributions
 crashing-0.005
 understand-0.005
free-0.005
rolling-0.004
 echoing-0.004
Token to
Token ID284
Feature activation+1.31e-02

Pos logit contributions
 breaks+0.079
 slams+0.072
 hits+0.067
ara+0.065
irts+0.064
Neg logit contributions
 Joey-0.100
ous-0.097
 understand-0.097
 crashing-0.095
 Frank-0.092
Token build
Token ID1382
Feature activation+8.05e-02

Pos logit contributions
 breaks+0.027
 hits+0.024
 slams+0.024
er+0.024
ara+0.023
Neg logit contributions
 understand-0.015
 crashing-0.015
rolling-0.014
ous-0.014
 Joey-0.014
Token?"
Token ID1701
Feature activation+1.57e-02

Pos logit contributions
 breaks+0.119
 slams+0.110
 closes+0.104
 learns+0.102
ara+0.102
Neg logit contributions
 him-0.122
 understand-0.117
 quite-0.117
 Joey-0.113
hing-0.112
Token Lily
Token ID20037
Feature activation+2.74e-02

Pos logit contributions
 breaks+0.020
er+0.020
 hits+0.017
ers+0.016
irts+0.016
Neg logit contributions
 Joey-0.029
 understand-0.029
 Marvin-0.027
 crashing-0.027
 ent-0.027
Token says
Token ID1139
Feature activation-2.10e-03

Pos logit contributions
 breaks+0.035
 my+0.028
er+0.028
 hits+0.027
ital+0.026
Neg logit contributions
 understand-0.050
 ent-0.049
 Marvin-0.049
 crashing-0.049
 echoing-0.048
Token.
Token ID13
Feature activation-1.64e-02

Pos logit contributions
 crashing+0.004
rolling+0.004
free+0.004
ous+0.004
 echoing+0.004
Neg logit contributions
 breaks-0.003
 hits-0.002
 my-0.002
er-0.002
 makes-0.002
Token
Token ID198
Feature activation-8.43e-03

Pos logit contributions
rolling+0.030
ous+0.029
 crashing+0.029
owing+0.029
 echoing+0.029
Neg logit contributions
 breaks-0.025
 hits-0.022
 my-0.022
 makes-0.020
er-0.020
Token Lily
Token ID20037
Feature activation+2.63e-02

Pos logit contributions
 breaks+0.004
 hits+0.003
 makes+0.003
er+0.003
 draws+0.003
Neg logit contributions
 crashing-0.005
 Marvin-0.005
free-0.005
 Joey-0.005
ous-0.005
Token.
Token ID13
Feature activation+2.65e-02

Pos logit contributions
 breaks+0.039
 hits+0.032
 slams+0.032
 holes+0.031
 te+0.030
Neg logit contributions
free-0.046
 Frankie-0.044
ous-0.044
 Joey-0.044
 Frank-0.043
Token Now
Token ID2735
Feature activation+3.53e-02

Pos logit contributions
 breaks+0.045
 hits+0.041
 mess+0.039
er+0.039
 my+0.038
Neg logit contributions
free-0.040
 crashing-0.039
rolling-0.039
 Joey-0.038
 Marvin-0.038
Token you
Token ID345
Feature activation+1.29e-02

Pos logit contributions
 breaks+0.049
 hits+0.044
 slams+0.040
er+0.040
 te+0.039
Neg logit contributions
 crashing-0.062
 Joey-0.061
 Marvin-0.060
 Frankie-0.060
 weren-0.059
Token can
Token ID460
Feature activation+1.88e-03

Pos logit contributions
 breaks+0.022
 hits+0.020
 my+0.020
er+0.019
 makes+0.018
Neg logit contributions
 crashing-0.020
rolling-0.020
free-0.019
 understand-0.019
ous-0.019
Token unlock
Token ID12116
Feature activation+8.00e-02

Pos logit contributions
 breaks+0.004
 hits+0.003
 my+0.003
er+0.003
 makes+0.003
Neg logit contributions
rolling-0.003
 crashing-0.003
free-0.002
 Marvin-0.002
ous-0.002
Token the
Token ID262
Feature activation+2.52e-02

Pos logit contributions
 breaks+0.115
 hits+0.109
ital+0.107
 pitcher+0.107
 te+0.106
Neg logit contributions
 Marvin-0.122
 Joey-0.115
 Frankie-0.115
 Mar-0.115
 him-0.113
Token box
Token ID3091
Feature activation+5.29e-02

Pos logit contributions
 breaks+0.032
 hits+0.028
 my+0.027
er+0.027
 makes+0.025
Neg logit contributions
 crashing-0.049
rolling-0.047
 Marvin-0.046
free-0.046
ous-0.046
Token."
Token ID526
Feature activation+1.93e-02

Pos logit contributions
 breaks+0.086
 hits+0.075
irts+0.068
ara+0.067
 slams+0.066
Neg logit contributions
ous-0.079
free-0.077
owing-0.074
fully-0.074
 quite-0.073
Token
Token ID198
Feature activation+4.36e-04

Pos logit contributions
 breaks+0.032
 hits+0.029
 splits+0.027
 my+0.027
er+0.026
Neg logit contributions
 Joey-0.030
 Marvin-0.028
 Frankie-0.028
free-0.028
rolling-0.027
Token
Token ID198
Feature activation+1.29e-02

Pos logit contributions
 breaks+0.001
 my+0.001
 hits+0.001
er+0.001
 makes+0.001
Neg logit contributions
 crashing-0.001
rolling-0.001
ous-0.001
 echoing-0.001
free-0.001
Token "
Token ID366
Feature activation-2.20e-02

Pos logit contributions
rolling+0.058
hing+0.058
uttering+0.057
 roam+0.056
owing+0.056
Neg logit contributions
 my-0.035
 this-0.031
er-0.030
 breaks-0.030
 makes-0.030
TokenCan
Token ID6090
Feature activation+1.53e-03

Pos logit contributions
rolling+0.030
 crashing+0.029
ous+0.029
 echoing+0.029
owing+0.028
Neg logit contributions
 my-0.040
er-0.038
 breaks-0.038
ara-0.037
 draws-0.036
Token I
Token ID314
Feature activation+1.98e-04

Pos logit contributions
 breaks+0.002
 hits+0.001
 my+0.001
er+0.001
 makes+0.001
Neg logit contributions
 crashing-0.003
rolling-0.003
ous-0.003
free-0.003
 Marvin-0.003
Token go
Token ID467
Feature activation+4.23e-02

Pos logit contributions
 breaks+0.000
 my+0.000
 hits+0.000
 makes+0.000
er+0.000
Neg logit contributions
 crashing-0.000
rolling-0.000
free-0.000
 echoing-0.000
ous-0.000
Token to
Token ID284
Feature activation+4.65e-02

Pos logit contributions
 breaks+0.076
 hits+0.066
 my+0.061
er+0.061
 holes+0.060
Neg logit contributions
 crashing-0.066
rolling-0.066
free-0.063
ous-0.060
 Joey-0.060
Token university
Token ID6403
Feature activation+8.00e-02

Pos logit contributions
 breaks+0.073
 hits+0.064
er+0.064
ara+0.059
 makes+0.058
Neg logit contributions
 crashing-0.077
 Marvin-0.077
 understand-0.075
rolling-0.075
 Joey-0.074
Token one
Token ID530
Feature activation+3.16e-02

Pos logit contributions
 breaks+0.130
 hits+0.107
 holes+0.102
irted+0.099
 splits+0.098
Neg logit contributions
ous-0.131
 understand-0.128
 quite-0.127
 crashing-0.126
 ent-0.125
Token day
Token ID1110
Feature activation+5.22e-02

Pos logit contributions
 breaks+0.045
 hits+0.038
 my+0.034
 draws+0.033
er+0.033
Neg logit contributions
 crashing-0.059
ous-0.057
rolling-0.057
free-0.056
 echoing-0.055
Token?"
Token ID1701
Feature activation-1.78e-02

Pos logit contributions
 breaks+0.071
 hits+0.060
 slams+0.053
 splits+0.052
er+0.051
Neg logit contributions
 crashing-0.101
rolling-0.101
 Joey-0.100
ous-0.099
 Marvin-0.097
Token
Token ID198
Feature activation-2.46e-02

Pos logit contributions
 crashing+0.034
uttering+0.033
rolling+0.032
free+0.032
 echoing+0.032
Neg logit contributions
 breaks-0.025
 my-0.023
 hits-0.022
er-0.021
 makes-0.020
Token
Token ID198
Feature activation-1.86e-02

Pos logit contributions
 crashing+0.042
 echoing+0.040
ous+0.040
free+0.040
rolling+0.039
Neg logit contributions
 breaks-0.039
 my-0.037
 hits-0.035
er-0.034
ara-0.031
Token"
Token ID1
Feature activation+7.52e-03

Pos logit contributions
 breaks+0.027
 hits+0.025
 slams+0.022
er+0.022
 makes+0.022
Neg logit contributions
 crashing-0.019
Frank-0.018
rolling-0.018
 ent-0.018
free-0.018
TokenCan
Token ID6090
Feature activation+2.10e-02

Pos logit contributions
 breaks+0.014
 my+0.013
 hits+0.012
 makes+0.012
ara+0.012
Neg logit contributions
rolling-0.011
 crashing-0.011
ous-0.011
owing-0.010
 echoing-0.010
Token we
Token ID356
Feature activation-1.42e-02

Pos logit contributions
 breaks+0.029
er+0.026
 hits+0.026
irts+0.024
 holes+0.024
Neg logit contributions
 crashing-0.036
 Joey-0.035
rolling-0.035
 Marvin-0.035
ous-0.035
Token leave
Token ID2666
Feature activation+4.40e-02

Pos logit contributions
 crashing+0.022
rolling+0.021
free+0.021
ous+0.021
 echoing+0.021
Neg logit contributions
 breaks-0.025
 my-0.023
 hits-0.022
er-0.021
 makes-0.021
Token our
Token ID674
Feature activation+1.16e-03

Pos logit contributions
 breaks+0.059
 hits+0.054
er+0.049
 slams+0.048
 te+0.048
Neg logit contributions
 Joey-0.080
 Frank-0.076
 Marvin-0.075
 crashing-0.074
 Frankie-0.074
Token building
Token ID2615
Feature activation+8.00e-02

Pos logit contributions
 breaks+0.001
er+0.001
 hits+0.001
 my+0.001
ara+0.001
Neg logit contributions
 crashing-0.002
rolling-0.002
free-0.002
 ent-0.002
ous-0.002
Token here
Token ID994
Feature activation+5.50e-02

Pos logit contributions
 breaks+0.124
irts+0.108
ara+0.106
 learns+0.102
 teaches+0.101
Neg logit contributions
 quite-0.102
rolling-0.096
 crashing-0.096
 Joey-0.095
owing-0.095
Token,
Token ID11
Feature activation+7.71e-03

Pos logit contributions
 breaks+0.075
 hits+0.062
er+0.057
 slams+0.054
 makes+0.054
Neg logit contributions
 Joey-0.102
rolling-0.099
 crashing-0.099
 Marvin-0.099
ous-0.097
Token Mom
Token ID11254
Feature activation+4.67e-02

Pos logit contributions
 breaks+0.013
 hits+0.012
er+0.011
 slams+0.011
 draws+0.011
Neg logit contributions
 crashing-0.012
 echoing-0.011
 understand-0.011
 Joey-0.011
 Marvin-0.011
Token?"
Token ID1701
Feature activation+2.07e-02

Pos logit contributions
 breaks+0.063
 holes+0.053
 hits+0.048
 pieces+0.046
 slams+0.046
Neg logit contributions
ous-0.085
owing-0.085
 crashing-0.084
 understand-0.083
 Joey-0.083
Token Anna
Token ID11735
Feature activation+2.52e-02

Pos logit contributions
 breaks+0.029
er+0.026
 hits+0.026
ers+0.022
ital+0.022
Neg logit contributions
 Marvin-0.037
 Joey-0.037
 crashing-0.037
 ent-0.036
 understand-0.036
Token He
Token ID679
Feature activation+5.57e-02

Pos logit contributions
 breaks+0.030
 hits+0.027
 my+0.027
er+0.026
 makes+0.025
Neg logit contributions
 crashing-0.023
rolling-0.022
free-0.022
ous-0.021
 echoing-0.021
Token says
Token ID1139
Feature activation-1.89e-03

Pos logit contributions
er+0.058
 breaks+0.055
 my+0.053
aj+0.052
ara+0.051
Neg logit contributions
 crashing-0.100
 understand-0.099
 ent-0.098
 weren-0.096
 capt-0.095
Token,
Token ID11
Feature activation-3.42e-02

Pos logit contributions
 crashing+0.004
ous+0.004
free+0.004
 Joey+0.004
 understand+0.003
Neg logit contributions
 breaks-0.002
 hits-0.002
irts-0.002
 slams-0.002
 holes-0.002
Token "
Token ID366
Feature activation+1.76e-02

Pos logit contributions
rolling+0.056
uttering+0.053
 crashing+0.053
owing+0.053
 echoing+0.052
Neg logit contributions
 breaks-0.058
 my-0.053
 hits-0.052
 makes-0.050
er-0.049
TokenI
Token ID40
Feature activation+2.96e-02

Pos logit contributions
 breaks+0.037
 hits+0.034
 my+0.033
er+0.032
 makes+0.031
Neg logit contributions
 crashing-0.021
rolling-0.020
free-0.020
 echoing-0.020
ous-0.019
Token win
Token ID1592
Feature activation+7.99e-02

Pos logit contributions
 breaks+0.057
 my+0.056
er+0.055
 hits+0.053
 slams+0.052
Neg logit contributions
 understand-0.036
free-0.035
 crashing-0.033
 weren-0.033
 recognise-0.031
Token.
Token ID13
Feature activation+4.30e-02

Pos logit contributions
 breaks+0.099
 slams+0.091
 te+0.089
 hits+0.087
er+0.086
Neg logit contributions
 crashing-0.143
 quite-0.134
ous-0.133
free-0.132
 understand-0.131
Token I
Token ID314
Feature activation+2.42e-02

Pos logit contributions
 breaks+0.069
 mess+0.065
 hits+0.064
 pieces+0.063
er+0.062
Neg logit contributions
 crashing-0.065
free-0.065
 Joey-0.061
 Marvin-0.061
ous-0.060
Token have
Token ID423
Feature activation+7.76e-02

Pos logit contributions
 breaks+0.042
 my+0.040
er+0.039
 hits+0.038
 holes+0.036
Neg logit contributions
free-0.034
 understand-0.034
 crashing-0.033
 weren-0.032
rolling-0.031
Token more
Token ID517
Feature activation+6.77e-02

Pos logit contributions
 slams+0.114
 breaks+0.112
er+0.112
aj+0.107
ara+0.106
Neg logit contributions
 crashing-0.109
 ent-0.106
 quite-0.106
 understand-0.103
 weren-0.101
Token scores
Token ID8198
Feature activation+6.59e-02

Pos logit contributions
 slams+0.093
 breaks+0.092
ara+0.090
er+0.083
aj+0.081
Neg logit contributions
 understand-0.104
 crashing-0.101
 ent-0.100
 quite-0.095
 echoing-0.094
Token that
Token ID326
Feature activation+6.72e-03

Pos logit contributions
 breaks+0.039
 hits+0.032
 draws+0.031
 te+0.031
 makes+0.030
Neg logit contributions
 Marvin-0.045
 Joey-0.045
 crashing-0.045
 Frank-0.044
 Frankie-0.043
Token day
Token ID1110
Feature activation-6.00e-03

Pos logit contributions
 breaks+0.013
 hits+0.012
 my+0.012
er+0.011
 makes+0.011
Neg logit contributions
 crashing-0.009
rolling-0.009
free-0.009
ous-0.009
 echoing-0.009
Token on
Token ID319
Feature activation-2.85e-03

Pos logit contributions
 crashing+0.012
rolling+0.012
free+0.012
ous+0.012
 echoing+0.012
Neg logit contributions
 breaks-0.007
 hits-0.006
 my-0.006
er-0.006
 makes-0.005
Token,
Token ID11
Feature activation-4.17e-02

Pos logit contributions
 crashing+0.006
 Joey+0.006
rolling+0.006
 Marvin+0.006
 Frankie+0.006
Neg logit contributions
 breaks-0.003
 hits-0.002
 my-0.002
 draws-0.002
er-0.002
Token Tim
Token ID5045
Feature activation+1.32e-02

Pos logit contributions
uttering+0.061
rolling+0.059
 echoing+0.059
 crashing+0.058
 harming+0.057
Neg logit contributions
 breaks-0.075
 my-0.074
 hits-0.066
 makes-0.066
 your-0.064
Token and
Token ID290
Feature activation-9.22e-02

Pos logit contributions
 breaks+0.020
 hits+0.017
 my+0.017
 slams+0.016
 te+0.016
Neg logit contributions
 crashing-0.023
free-0.022
ous-0.022
rolling-0.022
 echoing-0.021
Token Mo
Token ID4270
Feature activation-3.22e-02

Pos logit contributions
uttering+0.130
rolling+0.119
 crashing+0.111
itting+0.110
year+0.108
Neg logit contributions
 my-0.185
 breaks-0.181
 hits-0.161
 your-0.161
 makes-0.161
Token were
Token ID547
Feature activation-2.40e-03

Pos logit contributions
 crashing+0.047
rolling+0.046
free+0.043
 echoing+0.043
ous+0.043
Neg logit contributions
 breaks-0.061
 hits-0.054
 my-0.053
 makes-0.051
er-0.051
Token always
Token ID1464
Feature activation+7.54e-03

Pos logit contributions
 crashing+0.004
rolling+0.004
free+0.004
 echoing+0.004
 ent+0.004
Neg logit contributions
 breaks-0.004
 hits-0.004
 my-0.003
er-0.003
 makes-0.003
Token together
Token ID1978
Feature activation-1.21e-02

Pos logit contributions
 breaks+0.013
 hits+0.011
 my+0.011
er+0.010
 makes+0.010
Neg logit contributions
 crashing-0.013
rolling-0.012
free-0.012
 echoing-0.012
 ent-0.011
Token,
Token ID11
Feature activation-1.09e-02

Pos logit contributions
rolling+0.026
 Marvin+0.025
 crashing+0.025
free+0.025
ous+0.025
Neg logit contributions
 breaks-0.014
 my-0.012
 hits-0.011
 makes-0.011
er-0.011
Token friend
Token ID1545
Feature activation-1.69e-02

Pos logit contributions
free+0.059
hing+0.058
rolling+0.056
 crashing+0.056
uttering+0.054
Neg logit contributions
 breaks-0.041
 my-0.036
 hits-0.035
er-0.034
 room-0.032
Token,
Token ID11
Feature activation-2.25e-02

Pos logit contributions
 crashing+0.036
rolling+0.035
free+0.035
ous+0.034
 echoing+0.034
Neg logit contributions
 breaks-0.019
 my-0.017
 hits-0.016
er-0.016
 makes-0.014
Token a
Token ID257
Feature activation-2.65e-02

Pos logit contributions
rolling+0.040
free+0.039
hing+0.039
uttering+0.039
piring+0.038
Neg logit contributions
 my-0.033
 breaks-0.029
er-0.027
 your-0.027
 hits-0.027
Token rabbit
Token ID22746
Feature activation-3.26e-02

Pos logit contributions
uttering+0.048
owing+0.048
rolling+0.048
hing+0.047
 crashing+0.047
Neg logit contributions
 breaks-0.035
 my-0.034
 te-0.031
 hits-0.030
 jar-0.029
Token named
Token ID3706
Feature activation-3.15e-02

Pos logit contributions
rolling+0.065
 crashing+0.063
free+0.062
uttering+0.061
ous+0.061
Neg logit contributions
 breaks-0.040
 my-0.037
er-0.036
 hits-0.035
 te-0.032
Token Hipp
Token ID33812
Feature activation-9.22e-02

Pos logit contributions
 crashing+0.048
rolling+0.047
uttering+0.047
 echoing+0.046
free+0.045
Neg logit contributions
 my-0.052
 breaks-0.052
 hits-0.048
er-0.047
 makes-0.045
Tokeno
Token ID78
Feature activation-3.04e-02

Pos logit contributions
 echoing+0.178
 continuing+0.170
 Marvin+0.170
rolling+0.169
 Frankie+0.166
Neg logit contributions
er-0.097
 breaks-0.095
ara-0.092
 hits-0.088
 is-0.084
Token.
Token ID13
Feature activation-3.89e-02

Pos logit contributions
 crashing+0.064
rolling+0.063
free+0.062
 echoing+0.061
ous+0.061
Neg logit contributions
 breaks-0.035
 my-0.031
 hits-0.030
er-0.029
 makes-0.026
Token He
Token ID679
Feature activation-6.32e-03

Pos logit contributions
hing+0.059
uttering+0.058
free+0.058
 echoing+0.058
 crashing+0.057
Neg logit contributions
 my-0.063
 breaks-0.062
er-0.057
 hits-0.056
 makes-0.054
Token helped
Token ID4193
Feature activation-6.00e-02

Pos logit contributions
free+0.013
 crashing+0.013
uttering+0.013
rolling+0.013
hing+0.012
Neg logit contributions
 breaks-0.008
 my-0.007
 hits-0.007
er-0.006
 makes-0.006
Token G
Token ID402
Feature activation-7.82e-02

Pos logit contributions
uttering+0.088
 echoing+0.088
rolling+0.086
itting+0.085
ous+0.082
Neg logit contributions
 my-0.110
 breaks-0.104
er-0.103
 your-0.098
 this-0.093
Token the
Token ID262
Feature activation-2.80e-02

Pos logit contributions
rolling+0.006
ous+0.006
 crashing+0.006
 echoing+0.006
free+0.006
Neg logit contributions
 breaks-0.004
 my-0.003
 hits-0.003
 makes-0.003
er-0.003
Token fire
Token ID2046
Feature activation-4.30e-02

Pos logit contributions
rolling+0.054
 crashing+0.053
ous+0.053
 echoing+0.052
free+0.052
Neg logit contributions
 breaks-0.039
 my-0.034
 hits-0.033
 makes-0.031
er-0.030
Token.
Token ID13
Feature activation+1.29e-03

Pos logit contributions
rolling+0.087
 crashing+0.086
free+0.084
 ent+0.083
 echoing+0.083
Neg logit contributions
 breaks-0.053
 hits-0.045
er-0.044
 my-0.044
 makes-0.042
Token The
Token ID383
Feature activation-3.31e-02

Pos logit contributions
 breaks+0.002
 hits+0.002
 my+0.002
 makes+0.002
er+0.002
Neg logit contributions
 crashing-0.002
free-0.002
rolling-0.002
ous-0.002
 echoing-0.002
Token marsh
Token ID22397
Feature activation-4.25e-02

Pos logit contributions
rolling+0.065
 crashing+0.063
ous+0.063
free+0.062
 echoing+0.062
Neg logit contributions
 breaks-0.046
 my-0.040
 hits-0.040
 makes-0.037
er-0.036
Tokenm
Token ID76
Feature activation-9.14e-02

Pos logit contributions
rolling+0.061
 ent+0.056
 crashing+0.055
 roam+0.055
 recognise+0.054
Neg logit contributions
 breaks-0.080
er-0.074
 hits-0.073
 my-0.071
ara-0.070
Tokenallows
Token ID47205
Feature activation-4.84e-02

Pos logit contributions
rolling+0.088
 crashing+0.084
 Marvin+0.081
free+0.080
Nice+0.079
Neg logit contributions
 breaks-0.214
 my-0.195
 makes-0.193
er-0.193
 hits-0.192
Token became
Token ID2627
Feature activation-1.59e-02

Pos logit contributions
rolling+0.070
uttering+0.065
 Marvin+0.063
 harming+0.063
Frank+0.062
Neg logit contributions
 breaks-0.094
 hits-0.083
 makes-0.083
er-0.081
 pieces-0.077
Token all
Token ID477
Feature activation-1.12e-02

Pos logit contributions
rolling+0.028
 crashing+0.028
ous+0.027
free+0.027
 echoing+0.026
Neg logit contributions
 breaks-0.024
 my-0.022
 hits-0.021
 makes-0.020
er-0.020
Token warm
Token ID5814
Feature activation-2.93e-02

Pos logit contributions
 crashing+0.018
free+0.017
 echoing+0.017
rolling+0.017
ous+0.017
Neg logit contributions
 breaks-0.019
 hits-0.017
 my-0.016
er-0.016
ara-0.016
Token and
Token ID290
Feature activation-1.33e-02

Pos logit contributions
 crashing+0.061
free+0.059
rolling+0.059
ous+0.059
 echoing+0.058
Neg logit contributions
 breaks-0.035
 hits-0.031
 my-0.030
 makes-0.026
er-0.026
Token noisy
Token ID31210
Feature activation-3.99e-02

Pos logit contributions
rolling+0.034
 crashing+0.033
uttering+0.033
owing+0.033
free+0.032
Neg logit contributions
 breaks-0.031
 my-0.028
 hits-0.027
 makes-0.025
er-0.024
Token sound
Token ID2128
Feature activation-4.25e-02

Pos logit contributions
rolling+0.087
free+0.084
uttering+0.084
andering+0.083
Anything+0.083
Neg logit contributions
 breaks-0.046
 my-0.037
 hits-0.037
er-0.036
 makes-0.032
Token.
Token ID13
Feature activation-4.15e-02

Pos logit contributions
rolling+0.099
Albert+0.094
ric+0.094
Frank+0.093
 Marvin+0.092
Neg logit contributions
 breaks-0.042
er-0.040
 hits-0.033
 makes-0.031
 holes-0.027
Token D
Token ID360
Feature activation-5.53e-02

Pos logit contributions
uttering+0.065
 crashing+0.065
hing+0.064
owing+0.064
free+0.063
Neg logit contributions
 breaks-0.068
 my-0.059
 makes-0.058
er-0.057
 hits-0.057
Tokenandy
Token ID10757
Feature activation-6.54e-03

Pos logit contributions
 crashing+0.092
rolling+0.087
 harming+0.085
 streaming+0.084
 echoing+0.083
Neg logit contributions
 breaks-0.089
er-0.086
ara-0.083
 my-0.082
 hits-0.077
Token and
Token ID290
Feature activation-8.63e-02

Pos logit contributions
 crashing+0.012
free+0.012
rolling+0.012
uttering+0.012
ous+0.012
Neg logit contributions
 breaks-0.009
 hits-0.008
 my-0.008
er-0.007
 makes-0.007
Token Joan
Token ID26640
Feature activation-5.58e-02

Pos logit contributions
uttering+0.137
rolling+0.128
Nice+0.125
 crashing+0.125
free+0.123
Neg logit contributions
 my-0.154
 breaks-0.142
 your-0.135
 this-0.134
 makes-0.134
Token both
Token ID1111
Feature activation-4.44e-02

Pos logit contributions
uttering+0.106
Nice+0.100
Anything+0.099
free+0.097
ancer+0.097
Neg logit contributions
's-0.079
 breaks-0.071
,-0.070
er-0.063
 hits-0.059
Token looked
Token ID3114
Feature activation-1.87e-02

Pos logit contributions
uttering+0.080
free+0.080
rolling+0.078
 crashing+0.076
 harming+0.076
Neg logit contributions
 breaks-0.066
 hits-0.054
 makes-0.051
 my-0.051
's-0.051
Token in
Token ID287
Feature activation-4.19e-03

Pos logit contributions
rolling+0.035
ous+0.034
 crashing+0.034
free+0.034
Frank+0.033
Neg logit contributions
 breaks-0.026
 my-0.023
 hits-0.023
 makes-0.021
er-0.020
Token surprise
Token ID5975
Feature activation-1.89e-02

Pos logit contributions
 crashing+0.008
rolling+0.008
free+0.007
ous+0.007
 echoing+0.007
Neg logit contributions
 breaks-0.006
 hits-0.005
 my-0.005
er-0.005
 makes-0.005
Token?"
Token ID1701
Feature activation-2.76e-02

Pos logit contributions
 crashing+0.029
rolling+0.028
free+0.028
 echoing+0.028
uttering+0.027
Neg logit contributions
 breaks-0.009
 my-0.009
 hits-0.008
er-0.007
 makes-0.006
Token The
Token ID383
Feature activation-5.66e-02

Pos logit contributions
 crashing+0.052
free+0.050
rolling+0.050
ous+0.050
uttering+0.049
Neg logit contributions
 breaks-0.040
 my-0.036
 hits-0.035
 makes-0.032
er-0.032
Token voice
Token ID3809
Feature activation-2.97e-02

Pos logit contributions
uttering+0.124
itting+0.122
owing+0.122
rolling+0.121
to+0.120
Neg logit contributions
 breaks-0.067
 my-0.062
er-0.056
 hits-0.055
 makes-0.050
Token belonged
Token ID19611
Feature activation-7.53e-03

Pos logit contributions
rolling+0.065
 crashing+0.065
free+0.063
uttering+0.061
ous+0.061
Neg logit contributions
 breaks-0.033
 my-0.028
 hits-0.027
er-0.027
 makes-0.025
Token to
Token ID284
Feature activation-4.56e-02

Pos logit contributions
 crashing+0.013
rolling+0.012
free+0.012
ous+0.012
 Joey+0.012
Neg logit contributions
 breaks-0.013
 hits-0.011
 my-0.010
 makes-0.010
 draws-0.010
Token G
Token ID402
Feature activation-8.50e-02

Pos logit contributions
rolling+0.080
free+0.076
ous+0.076
 echoing+0.076
 crashing+0.076
Neg logit contributions
 my-0.067
 breaks-0.066
er-0.059
 your-0.059
 this-0.058
Tokeniz
Token ID528
Feature activation-6.55e-02

Pos logit contributions
 crashing+0.153
 echoing+0.152
rolling+0.151
free+0.149
 stroll+0.146
Neg logit contributions
 breaks-0.114
er-0.113
 hits-0.103
 my-0.100
ara-0.099
Tokenmo
Token ID5908
Feature activation-3.23e-02

Pos logit contributions
rolling+0.089
uttering+0.084
 crashing+0.081
 echoing+0.077
free+0.077
Neg logit contributions
 breaks-0.125
er-0.122
 my-0.118
 hits-0.117
 makes-0.108
Token's
Token ID338
Feature activation-3.03e-02

Pos logit contributions
 crashing+0.069
rolling+0.068
free+0.068
uttering+0.067
hing+0.066
Neg logit contributions
 breaks-0.035
 my-0.031
 hits-0.030
er-0.030
 makes-0.027
Token friend
Token ID1545
Feature activation-1.69e-02

Pos logit contributions
free+0.059
hing+0.058
rolling+0.056
 crashing+0.056
uttering+0.054
Neg logit contributions
 breaks-0.041
 my-0.036
 hits-0.035
er-0.034
 room-0.032
Token,
Token ID11
Feature activation-2.25e-02

Pos logit contributions
 crashing+0.036
rolling+0.035
free+0.035
ous+0.034
 echoing+0.034
Neg logit contributions
 breaks-0.019
 my-0.017
 hits-0.016
er-0.016
 makes-0.014
Token on
Token ID319
Feature activation+1.09e-02

Pos logit contributions
 breaks+0.008
 hits+0.007
 my+0.007
er+0.006
 draws+0.006
Neg logit contributions
 crashing-0.010
free-0.009
rolling-0.009
 streaming-0.009
ous-0.009
Token,
Token ID11
Feature activation+1.67e-02

Pos logit contributions
 breaks+0.013
 hits+0.012
 draws+0.012
irts+0.011
 cards+0.010
Neg logit contributions
 Frankie-0.021
 Joey-0.021
 Mar-0.021
 Marvin-0.021
 streaming-0.021
Token Lily
Token ID20037
Feature activation+5.64e-04

Pos logit contributions
 breaks+0.020
 hits+0.019
 slams+0.015
er+0.015
 my+0.015
Neg logit contributions
free-0.035
 weren-0.032
 Joey-0.032
 crashing-0.032
 Mar-0.031
Token and
Token ID290
Feature activation-5.07e-02

Pos logit contributions
 breaks+0.001
 hits+0.001
 my+0.001
er+0.001
 te+0.001
Neg logit contributions
 crashing-0.001
free-0.001
rolling-0.001
ous-0.001
 echoing-0.001
Token the
Token ID262
Feature activation-5.01e-02

Pos logit contributions
 crashing+0.119
rolling+0.119
ous+0.115
uttering+0.115
 echoing+0.113
Neg logit contributions
 breaks-0.054
 my-0.047
 hits-0.043
 makes-0.039
er-0.038
Token sweet
Token ID6029
Feature activation-8.42e-02

Pos logit contributions
rolling+0.083
ous+0.079
owing+0.078
 crashing+0.078
uttering+0.077
Neg logit contributions
 breaks-0.085
 my-0.084
 makes-0.076
 hits-0.074
er-0.073
Token re
Token ID302
Feature activation-7.71e-02

Pos logit contributions
rolling+0.148
uttering+0.138
 crashing+0.138
free+0.133
 echoing+0.131
Neg logit contributions
 breaks-0.133
 my-0.130
er-0.128
 makes-0.120
 is-0.116
Tokenind
Token ID521
Feature activation-7.62e-02

Pos logit contributions
 echoing+0.149
 crashing+0.146
rolling+0.141
free+0.139
 Joey+0.138
Neg logit contributions
er-0.096
 my-0.088
 breaks-0.088
 hits-0.077
ara-0.075
Tokeneer
Token ID28153
Feature activation-4.94e-02

Pos logit contributions
rolling+0.149
 crashing+0.132
 echoing+0.129
uttering+0.127
 recognise+0.117
Neg logit contributions
er-0.116
 breaks-0.116
 my-0.103
 makes-0.101
 hits-0.099
Token became
Token ID2627
Feature activation-5.14e-03

Pos logit contributions
rolling+0.072
 crashing+0.070
uttering+0.069
 Marvin+0.068
 Joey+0.066
Neg logit contributions
 breaks-0.093
 hits-0.080
 makes-0.080
er-0.079
 my-0.075
Token good
Token ID922
Feature activation-1.90e-02

Pos logit contributions
 crashing+0.010
rolling+0.010
ous+0.010
free+0.010
 echoing+0.009
Neg logit contributions
 breaks-0.007
 my-0.006
 hits-0.006
er-0.006
 makes-0.005
Token when
Token ID618
Feature activation-6.82e-03

Pos logit contributions
 crashing+0.014
rolling+0.014
free+0.014
ous+0.014
 echoing+0.014
Neg logit contributions
 breaks-0.010
 my-0.009
 hits-0.009
er-0.008
 makes-0.008
Token she
Token ID673
Feature activation+1.47e-02

Pos logit contributions
free+0.012
 crashing+0.012
rolling+0.012
ous+0.012
hing+0.012
Neg logit contributions
 breaks-0.010
 my-0.009
 hits-0.008
 makes-0.008
 your-0.008
Token saw
Token ID2497
Feature activation-3.69e-03

Pos logit contributions
 breaks+0.025
 my+0.023
 hits+0.022
er+0.021
 makes+0.020
Neg logit contributions
rolling-0.023
 crashing-0.023
 echoing-0.021
ous-0.021
 ent-0.021
Token the
Token ID262
Feature activation-2.51e-02

Pos logit contributions
rolling+0.007
 crashing+0.007
free+0.007
ous+0.007
 echoing+0.007
Neg logit contributions
 breaks-0.005
 my-0.004
 hits-0.004
er-0.004
 makes-0.004
Token pe
Token ID613
Feature activation-5.78e-02

Pos logit contributions
rolling+0.046
 echoing+0.044
ous+0.043
 crashing+0.043
free+0.043
Neg logit contributions
 breaks-0.038
 my-0.035
 hits-0.033
er-0.032
 holes-0.031
Tokenb
Token ID65
Feature activation-8.41e-02

Pos logit contributions
 crashing+0.095
rolling+0.092
free+0.091
 echoing+0.090
ous+0.089
Neg logit contributions
 breaks-0.095
 hits-0.086
 my-0.084
er-0.081
 makes-0.078
Tokenble
Token ID903
Feature activation-2.96e-02

Pos logit contributions
 echoing+0.119
 crashing+0.115
rolling+0.113
free+0.111
 Frankie+0.111
Neg logit contributions
 breaks-0.147
er-0.145
 my-0.138
 hits-0.138
 makes-0.130
Token and
Token ID290
Feature activation-3.48e-03

Pos logit contributions
 crashing+0.059
rolling+0.059
free+0.058
ous+0.057
 echoing+0.056
Neg logit contributions
 breaks-0.038
 hits-0.032
 my-0.032
er-0.030
 makes-0.029
Token she
Token ID673
Feature activation+3.35e-03

Pos logit contributions
 crashing+0.006
rolling+0.006
free+0.006
ous+0.005
 echoing+0.005
Neg logit contributions
 breaks-0.006
 my-0.005
 hits-0.005
er-0.005
 makes-0.005
Token wanted
Token ID2227
Feature activation-8.05e-03

Pos logit contributions
 breaks+0.006
 my+0.005
 hits+0.005
er+0.005
 makes+0.005
Neg logit contributions
 crashing-0.005
free-0.005
rolling-0.005
ous-0.005
uttering-0.005
Token to
Token ID284
Feature activation-1.48e-02

Pos logit contributions
rolling+0.016
 crashing+0.015
free+0.015
 echoing+0.015
ous+0.015
Neg logit contributions
 breaks-0.011
 my-0.010
 hits-0.009
er-0.009
 makes-0.009
Token\n
Token ID198
Feature activation-4.91e-03

Pos logit contributions
 crashing+0.006
rolling+0.005
free+0.005
ous+0.005
 echoing+0.005
Neg logit contributions
 breaks-0.006
 my-0.005
 hits-0.005
er-0.005
 makes-0.005
Token\n
Token ID198
Feature activation+5.12e-03

Pos logit contributions
 crashing+0.009
rolling+0.008
 echoing+0.008
free+0.008
ous+0.008
Neg logit contributions
 breaks-0.008
 my-0.007
 hits-0.007
er-0.006
 makes-0.006
TokenTim
Token ID14967
Feature activation+2.36e-02

Pos logit contributions
 breaks+0.009
 hits+0.008
 my+0.008
er+0.007
 makes+0.007
Neg logit contributions
 crashing-0.008
rolling-0.008
free-0.008
ous-0.008
 echoing-0.008
Token and
Token ID290
Feature activation-4.81e-02

Pos logit contributions
 breaks+0.039
 hits+0.034
 my+0.034
 holes+0.030
 slams+0.030
Neg logit contributions
rolling-0.037
 echoing-0.035
 crashing-0.035
 understand-0.035
ous-0.035
Token Mr
Token ID1770
Feature activation-6.19e-02

Pos logit contributions
 crashing+0.094
rolling+0.094
uttering+0.093
ous+0.090
free+0.089
Neg logit contributions
 breaks-0.069
 my-0.067
 hits-0.060
 makes-0.056
 your-0.054
Token.
Token ID13
Feature activation-8.32e-02

Pos logit contributions
rolling+0.171
 crashing+0.165
 capt+0.160
 ent+0.158
 echoing+0.156
Neg logit contributions
 breaks-0.042
er-0.036
 my-0.029
ara-0.029
 makes-0.029
Token Sun
Token ID3825
Feature activation-3.03e-02

Pos logit contributions
 understand+0.122
uttering+0.120
 weren+0.119
rolling+0.116
owing+0.116
Neg logit contributions
 my-0.146
 breaks-0.144
 makes-0.125
 hits-0.122
 room-0.118
Token played
Token ID2826
Feature activation-9.81e-03

Pos logit contributions
 crashing+0.055
rolling+0.053
free+0.052
 Marvin+0.051
 echoing+0.051
Neg logit contributions
 breaks-0.045
 hits-0.039
er-0.038
 my-0.038
 makes-0.038
Token together
Token ID1978
Feature activation-6.38e-03

Pos logit contributions
 crashing+0.018
free+0.018
rolling+0.017
 echoing+0.017
 Marvin+0.017
Neg logit contributions
 breaks-0.014
 hits-0.013
 my-0.012
er-0.011
 makes-0.011
Token all
Token ID477
Feature activation-1.15e-02

Pos logit contributions
rolling+0.013
 crashing+0.013
free+0.012
ous+0.012
 echoing+0.012
Neg logit contributions
 breaks-0.009
 hits-0.007
 my-0.007
er-0.007
 makes-0.007
Token day
Token ID1110
Feature activation+4.46e-03

Pos logit contributions
 crashing+0.014
free+0.013
ous+0.012
 echoing+0.012
rolling+0.012
Neg logit contributions
 breaks-0.024
 hits-0.022
 my-0.021
er-0.021
 draws-0.020
Token.
Token ID13
Feature activation+5.00e-03

Pos logit contributions
rolling+0.052
 crashing+0.052
ous+0.050
free+0.050
uttering+0.050
Neg logit contributions
 breaks-0.015
 my-0.012
 hits-0.011
er-0.011
 makes-0.010
Token Max
Token ID5436
Feature activation-1.63e-03

Pos logit contributions
 breaks+0.007
 my+0.006
 hits+0.006
 makes+0.006
er+0.005
Neg logit contributions
 crashing-0.010
rolling-0.010
uttering-0.009
ous-0.009
free-0.009
Token saw
Token ID2497
Feature activation+1.51e-02

Pos logit contributions
 crashing+0.003
rolling+0.003
free+0.003
ous+0.003
 echoing+0.003
Neg logit contributions
 breaks-0.002
 my-0.002
 hits-0.002
er-0.002
 makes-0.002
Token the
Token ID262
Feature activation-3.39e-02

Pos logit contributions
 breaks+0.022
 my+0.019
 hits+0.019
er+0.017
 makes+0.017
Neg logit contributions
rolling-0.029
 crashing-0.029
free-0.028
ous-0.028
 echoing-0.027
Token pe
Token ID613
Feature activation-5.83e-02

Pos logit contributions
rolling+0.060
 echoing+0.058
 crashing+0.057
ous+0.056
 streaming+0.056
Neg logit contributions
 breaks-0.055
 hits-0.050
 my-0.050
 makes-0.046
er-0.045
Tokenb
Token ID65
Feature activation-8.30e-02

Pos logit contributions
 crashing+0.095
rolling+0.092
free+0.091
 echoing+0.090
ous+0.089
Neg logit contributions
 breaks-0.097
 hits-0.088
 my-0.087
er-0.083
 makes-0.080
Tokenble
Token ID903
Feature activation-2.55e-02

Pos logit contributions
 echoing+0.128
rolling+0.121
 crashing+0.118
 Frankie+0.118
 stroll+0.117
Neg logit contributions
 breaks-0.139
er-0.136
 hits-0.131
 my-0.131
 is-0.121
Token in
Token ID287
Feature activation+1.09e-02

Pos logit contributions
 crashing+0.055
rolling+0.054
free+0.053
ous+0.053
 echoing+0.052
Neg logit contributions
 breaks-0.029
 hits-0.024
 my-0.024
er-0.022
 makes-0.021
Token her
Token ID607
Feature activation+1.46e-02

Pos logit contributions
 breaks+0.017
 hits+0.015
 my+0.014
er+0.014
 makes+0.014
Neg logit contributions
 crashing-0.019
free-0.018
 Marvin-0.018
 echoing-0.017
rolling-0.017
Token pocket
Token ID10000
Feature activation+2.57e-02

Pos logit contributions
 breaks+0.018
 hits+0.015
 my+0.014
er+0.014
 makes+0.013
Neg logit contributions
 crashing-0.030
rolling-0.030
 Marvin-0.028
free-0.028
 echoing-0.028
Token and
Token ID290
Feature activation-3.34e-04

Pos logit contributions
 breaks+0.042
 hits+0.037
 my+0.035
 draws+0.035
 splits+0.032
Neg logit contributions
 crashing-0.043
owing-0.042
rolling-0.041
ous-0.041
 weren-0.041
Token big
Token ID1263
Feature activation-5.59e-03

Pos logit contributions
 crashing+0.014
rolling+0.014
free+0.013
ous+0.013
 echoing+0.013
Neg logit contributions
 breaks-0.014
 hits-0.013
 my-0.012
er-0.012
 makes-0.012
Token trees
Token ID7150
Feature activation-4.31e-02

Pos logit contributions
 crashing+0.009
 Joey+0.008
 Marvin+0.008
rolling+0.008
ous+0.007
Neg logit contributions
 breaks-0.010
 hits-0.008
er-0.008
 my-0.008
 teaches-0.008
Token and
Token ID290
Feature activation-3.87e-02

Pos logit contributions
 crashing+0.100
rolling+0.099
free+0.097
ous+0.097
 echoing+0.097
Neg logit contributions
 breaks-0.040
 my-0.033
 hits-0.033
er-0.031
 makes-0.029
Token hearing
Token ID4854
Feature activation-4.93e-02

Pos logit contributions
ous+0.069
rolling+0.069
 echoing+0.067
uttering+0.067
 streaming+0.066
Neg logit contributions
 breaks-0.057
 my-0.053
 hits-0.052
 makes-0.050
 holes-0.046
Token the
Token ID262
Feature activation-4.06e-02

Pos logit contributions
rolling+0.095
 crashing+0.092
uttering+0.092
free+0.092
andering+0.089
Neg logit contributions
 breaks-0.070
 my-0.062
 hits-0.058
 makes-0.056
er-0.054
Token birds
Token ID10087
Feature activation-8.29e-02

Pos logit contributions
owing+0.082
rolling+0.082
 streaming+0.081
uttering+0.080
andering+0.080
Neg logit contributions
 breaks-0.048
 holes-0.041
 my-0.041
 makes-0.039
 pieces-0.038
Token singing
Token ID13777
Feature activation-4.89e-02

Pos logit contributions
ido+0.166
year+0.165
 Marvin+0.164
Frank+0.164
Albert+0.161
Neg logit contributions
 breaks-0.090
 makes-0.090
's-0.089
er-0.085
,-0.085
Token.
Token ID13
Feature activation-1.09e-02

Pos logit contributions
rolling+0.107
andering+0.101
 Marvin+0.101
 Frankie+0.100
 haven+0.099
Neg logit contributions
 breaks-0.056
er-0.049
 my-0.047
 makes-0.043
 holes-0.041
Token Suddenly
Token ID24975
Feature activation+6.60e-03

Pos logit contributions
ous+0.020
free+0.020
uttering+0.020
 echoing+0.020
piring+0.020
Neg logit contributions
 breaks-0.016
 hits-0.014
 my-0.014
 makes-0.013
ara-0.013
Token,
Token ID11
Feature activation+7.65e-03

Pos logit contributions
 breaks+0.008
 hits+0.007
 my+0.007
 draws+0.006
 teaches+0.006
Neg logit contributions
 crashing-0.013
 Marvin-0.013
 Joey-0.013
rolling-0.013
ous-0.013
Token he
Token ID339
Feature activation+7.72e-03

Pos logit contributions
 breaks+0.012
 hits+0.011
 my+0.011
er+0.010
 makes+0.010
Neg logit contributions
 crashing-0.013
rolling-0.013
free-0.012
ous-0.012
 echoing-0.012
Token,
Token ID11
Feature activation+5.14e-03

Pos logit contributions
 breaks+0.012
 hits+0.011
er+0.010
 my+0.009
 draws+0.009
Neg logit contributions
 Joey-0.017
 Marvin-0.017
 Frankie-0.017
 Frank-0.017
 Mar-0.016
Token Becky
Token ID47882
Feature activation+3.70e-03

Pos logit contributions
 breaks+0.008
 hits+0.007
 my+0.007
er+0.006
 makes+0.006
Neg logit contributions
 crashing-0.009
 Marvin-0.009
rolling-0.009
free-0.009
 Joey-0.009
Token spotted
Token ID13489
Feature activation-8.81e-03

Pos logit contributions
 breaks+0.006
 my+0.005
 hits+0.005
er+0.005
 makes+0.005
Neg logit contributions
 crashing-0.006
rolling-0.006
 echoing-0.006
ous-0.006
free-0.006
Token a
Token ID257
Feature activation-1.52e-02

Pos logit contributions
rolling+0.020
 crashing+0.019
free+0.019
ous+0.018
 echoing+0.018
Neg logit contributions
 breaks-0.010
 my-0.009
er-0.008
 hits-0.008
 makes-0.007
Token big
Token ID1263
Feature activation-4.31e-03

Pos logit contributions
 crashing+0.028
rolling+0.028
free+0.027
ous+0.027
 echoing+0.026
Neg logit contributions
 breaks-0.022
 my-0.019
 hits-0.019
er-0.017
 makes-0.017
Token Hipp
Token ID33812
Feature activation-8.27e-02

Pos logit contributions
 crashing+0.008
 Joey+0.007
 echoing+0.007
ous+0.007
 Marvin+0.007
Neg logit contributions
 breaks-0.006
 hits-0.005
 draws-0.005
 my-0.005
 makes-0.005
Tokeno
Token ID78
Feature activation-4.15e-02

Pos logit contributions
 ent+0.165
Frank+0.164
 understand+0.163
 inspect+0.162
 harming+0.162
Neg logit contributions
er-0.088
 breaks-0.083
ara-0.075
 hits-0.074
's-0.073
Token standing
Token ID5055
Feature activation-3.20e-03

Pos logit contributions
rolling+0.075
 ent+0.072
 crashing+0.071
free+0.071
 Marvin+0.071
Neg logit contributions
 breaks-0.061
 hits-0.054
 my-0.053
er-0.052
 makes-0.049
Token in
Token ID287
Feature activation-3.37e-03

Pos logit contributions
 crashing+0.005
rolling+0.005
free+0.005
uttering+0.005
 echoing+0.005
Neg logit contributions
 breaks-0.006
 hits-0.005
 my-0.005
 draws-0.005
 makes-0.005
Token the
Token ID262
Feature activation+2.31e-02

Pos logit contributions
 crashing+0.007
rolling+0.007
free+0.006
 echoing+0.006
 Marvin+0.006
Neg logit contributions
 breaks-0.004
 hits-0.004
 my-0.004
er-0.003
 makes-0.003
Token park
Token ID3952
Feature activation+2.35e-03

Pos logit contributions
 breaks+0.036
 hits+0.032
 my+0.031
er+0.029
 makes+0.029
Neg logit contributions
 crashing-0.041
free-0.037
rolling-0.037
 echoing-0.037
ous-0.036
Token wanted
Token ID2227
Feature activation+7.00e-03

Pos logit contributions
 breaks+0.055
 hits+0.049
 my+0.049
er+0.046
 makes+0.044
Neg logit contributions
 crashing-0.049
rolling-0.048
 echoing-0.046
free-0.046
ous-0.046
Token to
Token ID284
Feature activation+9.25e-03

Pos logit contributions
 breaks+0.012
 hits+0.011
 my+0.010
er+0.010
 draws+0.010
Neg logit contributions
 crashing-0.011
 Joey-0.010
 Marvin-0.010
 understand-0.010
free-0.009
Token see
Token ID766
Feature activation+4.02e-03

Pos logit contributions
 breaks+0.017
 hits+0.015
 my+0.015
er+0.015
 makes+0.014
Neg logit contributions
 crashing-0.013
rolling-0.012
 understand-0.012
free-0.012
 Joey-0.012
Token the
Token ID262
Feature activation-3.00e-02

Pos logit contributions
 breaks+0.006
 my+0.006
 hits+0.006
er+0.005
 makes+0.005
Neg logit contributions
 crashing-0.007
rolling-0.007
free-0.007
ous-0.007
 echoing-0.007
Token pe
Token ID613
Feature activation-5.80e-02

Pos logit contributions
rolling+0.054
 echoing+0.052
ous+0.051
 crashing+0.051
 streaming+0.050
Neg logit contributions
 breaks-0.048
 my-0.045
 hits-0.043
 makes-0.040
er-0.040
Tokenb
Token ID65
Feature activation-8.26e-02

Pos logit contributions
 crashing+0.095
rolling+0.092
 echoing+0.090
free+0.090
ous+0.088
Neg logit contributions
 breaks-0.097
 hits-0.087
 my-0.087
er-0.082
 makes-0.080
Tokenble
Token ID903
Feature activation-2.03e-02

Pos logit contributions
 echoing+0.120
rolling+0.115
 crashing+0.113
 streaming+0.110
 stroll+0.107
Neg logit contributions
 breaks-0.151
 my-0.145
er-0.143
 hits-0.141
 makes-0.134
Token too
Token ID1165
Feature activation+1.56e-02

Pos logit contributions
 crashing+0.042
rolling+0.041
free+0.041
ous+0.040
 echoing+0.040
Neg logit contributions
 breaks-0.025
 hits-0.021
 my-0.021
er-0.019
 makes-0.018
Token.
Token ID13
Feature activation+8.10e-03

Pos logit contributions
 breaks+0.022
 hits+0.020
 my+0.020
 draws+0.018
er+0.018
Neg logit contributions
 crashing-0.029
rolling-0.027
free-0.027
ous-0.027
 echoing-0.027
Token He
Token ID679
Feature activation+3.16e-02

Pos logit contributions
 breaks+0.019
 hits+0.017
 my+0.017
 draws+0.016
 makes+0.016
Neg logit contributions
 crashing-0.007
free-0.007
 Joey-0.007
 ent-0.006
rolling-0.006
Token said
Token ID531
Feature activation-1.47e-02

Pos logit contributions
 breaks+0.055
 my+0.050
 hits+0.049
er+0.046
 makes+0.044
Neg logit contributions
 crashing-0.045
 Marvin-0.044
rolling-0.044
 echoing-0.043
 ent-0.043
Token favorite
Token ID4004
Feature activation-3.14e-02

Pos logit contributions
rolling+0.062
hing+0.060
 streaming+0.060
 crashing+0.059
ous+0.059
Neg logit contributions
 my-0.035
 breaks-0.033
 te-0.027
 hits-0.027
 your-0.026
Token animal
Token ID5044
Feature activation-9.72e-03

Pos logit contributions
 crashing+0.056
rolling+0.056
free+0.054
 echoing+0.053
ous+0.053
Neg logit contributions
 breaks-0.049
 my-0.044
 hits-0.043
er-0.040
 makes-0.039
Token,
Token ID11
Feature activation-1.85e-02

Pos logit contributions
 crashing+0.019
ous+0.019
free+0.019
 echoing+0.018
 Joey+0.018
Neg logit contributions
 breaks-0.012
 hits-0.011
 my-0.009
 draws-0.009
er-0.009
Token Mr
Token ID1770
Feature activation-3.98e-02

Pos logit contributions
rolling+0.036
 crashing+0.036
free+0.035
ous+0.034
 echoing+0.034
Neg logit contributions
 breaks-0.025
 my-0.023
 hits-0.022
er-0.020
 makes-0.020
Token.
Token ID13
Feature activation-6.54e-02

Pos logit contributions
rolling+0.091
 crashing+0.089
 echoing+0.086
 ent+0.085
 streaming+0.084
Neg logit contributions
 breaks-0.043
 my-0.037
 hits-0.036
er-0.035
 makes-0.032
Token Hipp
Token ID33812
Feature activation-8.26e-02

Pos logit contributions
rolling+0.072
uttering+0.067
 echoing+0.066
free+0.064
 understand+0.064
Neg logit contributions
 breaks-0.142
 my-0.138
 hits-0.132
er-0.128
 te-0.125
Tokeno
Token ID78
Feature activation-4.13e-03

Pos logit contributions
rolling+0.156
 echoing+0.155
 crashing+0.149
 harming+0.147
 streaming+0.147
Neg logit contributions
 breaks-0.103
 my-0.100
er-0.099
 hits-0.095
ara-0.094
Token."
Token ID526
Feature activation+6.41e-03

Pos logit contributions
free+0.009
 crashing+0.009
ous+0.008
rolling+0.008
 echoing+0.008
Neg logit contributions
 breaks-0.005
 hits-0.004
 my-0.004
 slams-0.003
er-0.003
Token From
Token ID3574
Feature activation+2.58e-02

Pos logit contributions
 breaks+0.009
 hits+0.008
 my+0.007
 makes+0.007
er+0.007
Neg logit contributions
 crashing-0.012
rolling-0.012
 Joey-0.012
free-0.012
 Marvin-0.012
Token that
Token ID326
Feature activation-2.56e-03

Pos logit contributions
 breaks+0.038
 hits+0.032
 draws+0.029
 te+0.029
 makes+0.029
Neg logit contributions
 crashing-0.046
 Marvin-0.046
 Joey-0.046
ous-0.045
rolling-0.044
Token day
Token ID1110
Feature activation+1.13e-02

Pos logit contributions
rolling+0.005
 crashing+0.004
 echoing+0.004
ous+0.004
uttering+0.004
Neg logit contributions
 breaks-0.004
 my-0.004
 hits-0.004
 makes-0.003
er-0.003
Token the
Token ID262
Feature activation-1.29e-02

Pos logit contributions
rolling+0.058
free+0.057
ous+0.056
 crashing+0.056
owing+0.055
Neg logit contributions
 breaks-0.042
 my-0.039
 hits-0.036
 makes-0.034
er-0.033
Token most
Token ID749
Feature activation-3.06e-02

Pos logit contributions
rolling+0.026
ous+0.025
free+0.024
 streaming+0.024
 crashing+0.024
Neg logit contributions
 my-0.016
 breaks-0.016
er-0.014
 makes-0.014
 hits-0.013
Token beautiful
Token ID4950
Feature activation-5.94e-02

Pos logit contributions
rolling+0.052
 crashing+0.052
free+0.049
ous+0.048
 Marvin+0.048
Neg logit contributions
 breaks-0.048
 my-0.045
 hits-0.042
er-0.042
 makes-0.040
Token place
Token ID1295
Feature activation-3.36e-02

Pos logit contributions
rolling+0.107
free+0.101
 crashing+0.101
uttering+0.097
 echoing+0.096
Neg logit contributions
 breaks-0.092
 my-0.085
er-0.082
 hits-0.082
 makes-0.078
Token the
Token ID262
Feature activation-3.64e-02

Pos logit contributions
rolling+0.061
 crashing+0.061
free+0.059
 echoing+0.058
ous+0.057
Neg logit contributions
 breaks-0.049
 my-0.043
 hits-0.043
er-0.040
 makes-0.040
Token m
Token ID285
Feature activation-8.25e-02

Pos logit contributions
rolling+0.073
ous+0.071
owing+0.067
uttering+0.067
 streaming+0.066
Neg logit contributions
 my-0.049
 breaks-0.046
 makes-0.041
er-0.041
 holes-0.040
Tokenule
Token ID2261
Feature activation-3.09e-02

Pos logit contributions
 echoing+0.086
free+0.082
 understand+0.081
 streaming+0.081
 crashing+0.081
Neg logit contributions
er-0.166
 breaks-0.157
 hits-0.155
 my-0.149
ara-0.143
Token had
Token ID550
Feature activation-1.04e-02

Pos logit contributions
rolling+0.061
 crashing+0.060
uttering+0.058
free+0.057
 echoing+0.057
Neg logit contributions
 breaks-0.040
 my-0.036
er-0.035
 hits-0.034
 makes-0.033
Token ever
Token ID1683
Feature activation-2.06e-02

Pos logit contributions
 crashing+0.014
rolling+0.014
free+0.013
ous+0.013
 echoing+0.013
Neg logit contributions
 breaks-0.020
 my-0.018
 hits-0.018
er-0.017
 makes-0.017
Token seen
Token ID1775
Feature activation-1.17e-02

Pos logit contributions
 crashing+0.033
rolling+0.032
free+0.031
ous+0.031
 echoing+0.031
Neg logit contributions
 breaks-0.035
 hits-0.031
 my-0.031
er-0.029
 makes-0.029
Token.
Token ID13
Feature activation+2.66e-03

Pos logit contributions
 crashing+0.022
rolling+0.022
free+0.022
ous+0.021
 echoing+0.021
Neg logit contributions
 breaks-0.016
 hits-0.014
 my-0.014
er-0.013
 makes-0.013
Token a
Token ID257
Feature activation-3.85e-03

Pos logit contributions
 crashing+0.008
rolling+0.008
ous+0.008
free+0.008
 echoing+0.008
Neg logit contributions
 breaks-0.005
 my-0.004
 hits-0.004
 makes-0.004
er-0.004
Token big
Token ID1263
Feature activation+1.27e-02

Pos logit contributions
rolling+0.007
 crashing+0.007
ous+0.007
 streaming+0.007
free+0.007
Neg logit contributions
 breaks-0.006
 my-0.005
 hits-0.005
 te-0.005
 makes-0.005
Token hipp
Token ID18568
Feature activation-3.28e-02

Pos logit contributions
 breaks+0.019
 hits+0.017
er+0.016
 makes+0.015
up+0.015
Neg logit contributions
 crashing-0.022
 Joey-0.020
ous-0.020
 Marvin-0.019
 echoing-0.019
Tokeno
Token ID78
Feature activation-2.57e-02

Pos logit contributions
rolling+0.051
 crashing+0.050
 echoing+0.050
 Marvin+0.049
 ent+0.048
Neg logit contributions
 breaks-0.054
er-0.050
 my-0.050
 hits-0.049
 makes-0.046
Token named
Token ID3706
Feature activation-1.92e-02

Pos logit contributions
 crashing+0.051
rolling+0.051
 echoing+0.049
ous+0.049
free+0.049
Neg logit contributions
 breaks-0.033
 my-0.029
 hits-0.028
er-0.027
 makes-0.026
Token Hipp
Token ID33812
Feature activation-8.24e-02

Pos logit contributions
 crashing+0.030
rolling+0.030
 echoing+0.029
free+0.029
uttering+0.029
Neg logit contributions
 breaks-0.033
 my-0.031
 hits-0.029
 makes-0.027
er-0.027
Tokeny
Token ID88
Feature activation-3.35e-02

Pos logit contributions
 echoing+0.183
rolling+0.175
 streaming+0.173
 crashing+0.169
 Frankie+0.169
Neg logit contributions
ara-0.075
."-0.074
 breaks-0.074
 is-0.073
er-0.072
Token.
Token ID13
Feature activation-5.35e-02

Pos logit contributions
rolling+0.078
 crashing+0.076
 echoing+0.074
free+0.074
uttering+0.073
Neg logit contributions
 breaks-0.031
 my-0.029
er-0.029
 hits-0.026
 makes-0.024
Token Hipp
Token ID33812
Feature activation-7.47e-02

Pos logit contributions
 echoing+0.060
rolling+0.057
uttering+0.057
owing+0.056
 harming+0.054
Neg logit contributions
 breaks-0.117
 my-0.112
 hits-0.107
 makes-0.107
 is-0.103
Tokeny
Token ID88
Feature activation-7.84e-03

Pos logit contributions
 echoing+0.161
 crashing+0.157
rolling+0.155
 harming+0.150
 hesitation+0.150
Neg logit contributions
 breaks-0.078
 is-0.075
ara-0.071
 hits-0.068
 makes-0.068
Token was
Token ID373
Feature activation-1.37e-02

Pos logit contributions
 crashing+0.008
rolling+0.008
free+0.008
ous+0.008
 Marvin+0.008
Neg logit contributions
 breaks-0.017
 hits-0.016
 my-0.016
er-0.015
 makes-0.014
Token.
Token ID13
Feature activation+5.70e-04

Pos logit contributions
 crashing+0.012
rolling+0.012
free+0.012
ous+0.011
 echoing+0.011
Neg logit contributions
 breaks-0.007
 my-0.006
 hits-0.006
er-0.006
 makes-0.006
Token
Token ID220
Feature activation-3.59e-02

Pos logit contributions
 breaks+0.001
 my+0.001
 hits+0.001
er+0.001
 makes+0.001
Neg logit contributions
 crashing-0.001
rolling-0.001
ous-0.001
 echoing-0.001
free-0.001
Token\n
Token ID198
Feature activation-7.06e-03

Pos logit contributions
 crashing+0.070
 echoing+0.070
 streaming+0.067
ous+0.067
piring+0.066
Neg logit contributions
 breaks-0.047
 my-0.044
 hits-0.041
 makes-0.038
er-0.036
Token\n
Token ID198
Feature activation-1.65e-03

Pos logit contributions
 streaming+0.014
ous+0.014
 echoing+0.014
 crashing+0.014
owing+0.014
Neg logit contributions
 breaks-0.009
 my-0.008
 hits-0.007
 makes-0.007
er-0.007
TokenThe
Token ID464
Feature activation-4.75e-02

Pos logit contributions
ous+0.003
 crashing+0.003
rolling+0.003
owing+0.003
 streaming+0.003
Neg logit contributions
 breaks-0.003
 my-0.002
 hits-0.002
er-0.002
 makes-0.002
Token m
Token ID285
Feature activation-8.18e-02

Pos logit contributions
rolling+0.094
uttering+0.092
ous+0.091
free+0.090
osity+0.089
Neg logit contributions
 breaks-0.060
 my-0.060
 hits-0.056
 makes-0.055
er-0.054
Tokenule
Token ID2261
Feature activation+1.88e-03

Pos logit contributions
free+0.095
 echoing+0.093
 crashing+0.091
 streaming+0.088
 harming+0.087
Neg logit contributions
er-0.155
 hits-0.142
 breaks-0.141
ara-0.139
 my-0.138
Token stayed
Token ID9658
Feature activation-7.66e-03

Pos logit contributions
 breaks+0.003
 my+0.003
 hits+0.003
 makes+0.003
 draws+0.003
Neg logit contributions
 crashing-0.003
rolling-0.003
 echoing-0.003
 ent-0.003
ous-0.003
Token in
Token ID287
Feature activation-4.58e-03

Pos logit contributions
 crashing+0.012
rolling+0.011
uttering+0.011
 ent+0.011
 echoing+0.011
Neg logit contributions
 breaks-0.014
 hits-0.013
 draws-0.012
ara-0.012
er-0.011
Token the
Token ID262
Feature activation+4.33e-03

Pos logit contributions
 crashing+0.009
rolling+0.008
free+0.008
 Marvin+0.008
 echoing+0.008
Neg logit contributions
 breaks-0.006
 hits-0.006
 my-0.005
er-0.005
 makes-0.005
Token me
Token ID502
Feature activation-1.22e-02

Pos logit contributions
 breaks+0.007
 hits+0.006
 my+0.005
er+0.005
 makes+0.005
Neg logit contributions
 crashing-0.008
rolling-0.008
free-0.007
 echoing-0.007
 Marvin-0.007
Token was
Token ID373
Feature activation-3.22e-02

Pos logit contributions
 crashing+0.090
rolling+0.088
free+0.086
uttering+0.086
ous+0.084
Neg logit contributions
 breaks-0.050
 hits-0.044
 makes-0.042
 my-0.039
er-0.038
Token and
Token ID290
Feature activation-2.03e-02

Pos logit contributions
 crashing+0.067
rolling+0.067
free+0.065
 echoing+0.064
ous+0.064
Neg logit contributions
 breaks-0.038
 my-0.033
 hits-0.032
er-0.030
 makes-0.030
Token saw
Token ID2497
Feature activation-1.43e-02

Pos logit contributions
 crashing+0.037
free+0.037
rolling+0.036
uttering+0.036
ous+0.035
Neg logit contributions
 breaks-0.030
 my-0.026
 hits-0.026
 makes-0.024
er-0.024
Token a
Token ID257
Feature activation-2.87e-02

Pos logit contributions
rolling+0.032
 crashing+0.031
free+0.030
uttering+0.030
ous+0.030
Neg logit contributions
 breaks-0.017
 my-0.014
 hits-0.014
er-0.013
 makes-0.012
Token pe
Token ID613
Feature activation-4.93e-02

Pos logit contributions
owing+0.051
rolling+0.051
free+0.051
 crashing+0.050
 echoing+0.050
Neg logit contributions
 breaks-0.042
 my-0.038
 hits-0.036
 te-0.033
 makes-0.033
Tokenb
Token ID65
Feature activation-8.13e-02

Pos logit contributions
 crashing+0.075
rolling+0.073
free+0.072
ous+0.071
 echoing+0.070
Neg logit contributions
 breaks-0.087
 hits-0.078
 my-0.078
er-0.073
 makes-0.072
Tokenble
Token ID903
Feature activation-3.81e-02

Pos logit contributions
 echoing+0.112
 crashing+0.109
rolling+0.105
 streaming+0.104
free+0.104
Neg logit contributions
 breaks-0.147
er-0.139
 hits-0.139
 my-0.136
 is-0.126
Token.
Token ID13
Feature activation-1.30e-02

Pos logit contributions
rolling+0.075
 crashing+0.074
free+0.073
ous+0.072
 echoing+0.072
Neg logit contributions
 breaks-0.050
 hits-0.043
 my-0.042
er-0.040
 makes-0.038
Token He
Token ID679
Feature activation-1.60e-03

Pos logit contributions
 crashing+0.022
 echoing+0.021
rolling+0.021
free+0.021
ous+0.021
Neg logit contributions
 breaks-0.021
 my-0.019
 hits-0.019
 makes-0.017
er-0.017
Token thought
Token ID1807
Feature activation-3.03e-02

Pos logit contributions
free+0.003
 crashing+0.003
uttering+0.003
owing+0.003
hing+0.003
Neg logit contributions
 breaks-0.002
 my-0.002
 hits-0.002
 makes-0.002
er-0.002
Token it
Token ID340
Feature activation-3.09e-02

Pos logit contributions
rolling+0.064
 crashing+0.063
free+0.061
 echoing+0.061
uttering+0.059
Neg logit contributions
 breaks-0.038
 hits-0.032
 my-0.032
 makes-0.029
er-0.028
Token.
Token ID13
Feature activation+1.52e-02

Pos logit contributions
rolling+0.080
 ent+0.076
 crashing+0.076
 Marvin+0.076
 harming+0.076
Neg logit contributions
 breaks-0.046
 hits-0.039
 my-0.039
er-0.039
 makes-0.037
Token They
Token ID1119
Feature activation-2.32e-02

Pos logit contributions
 breaks+0.027
 my+0.025
 hits+0.025
er+0.023
 makes+0.022
Neg logit contributions
 crashing-0.023
rolling-0.022
 echoing-0.021
free-0.021
ous-0.021
Token put
Token ID1234
Feature activation-1.19e-02

Pos logit contributions
rolling+0.039
 crashing+0.039
uttering+0.038
free+0.038
ous+0.038
Neg logit contributions
 breaks-0.037
 hits-0.032
 my-0.031
 makes-0.031
er-0.030
Token the
Token ID262
Feature activation-2.23e-02

Pos logit contributions
rolling+0.024
 crashing+0.024
ous+0.023
 echoing+0.023
free+0.023
Neg logit contributions
 breaks-0.016
 my-0.014
 hits-0.014
 makes-0.012
er-0.012
Token marsh
Token ID22397
Feature activation-4.09e-02

Pos logit contributions
rolling+0.041
 crashing+0.041
ous+0.040
 echoing+0.039
free+0.039
Neg logit contributions
 breaks-0.033
 my-0.029
 hits-0.029
 makes-0.027
er-0.026
Tokenm
Token ID76
Feature activation-8.10e-02

Pos logit contributions
rolling+0.054
 crashing+0.051
 ent+0.049
 echoing+0.049
 Marvin+0.048
Neg logit contributions
 breaks-0.081
 my-0.074
 hits-0.074
er-0.074
 makes-0.071
Tokenallows
Token ID47205
Feature activation-3.96e-02

Pos logit contributions
rolling+0.078
 Marvin+0.072
 crashing+0.072
Nice+0.068
 echoing+0.067
Neg logit contributions
 breaks-0.192
 my-0.176
 hits-0.174
er-0.173
 makes-0.172
Token on
Token ID319
Feature activation-8.02e-03

Pos logit contributions
rolling+0.078
 crashing+0.076
ous+0.074
 Marvin+0.074
 echoing+0.074
Neg logit contributions
 breaks-0.053
 hits-0.046
 my-0.044
er-0.042
 makes-0.041
Token sticks
Token ID16461
Feature activation-2.53e-02

Pos logit contributions
 crashing+0.015
free+0.014
rolling+0.014
 Marvin+0.014
 echoing+0.014
Neg logit contributions
 breaks-0.011
 hits-0.010
 my-0.009
er-0.009
 makes-0.009
Token and
Token ID290
Feature activation-2.03e-02

Pos logit contributions
 crashing+0.054
rolling+0.053
free+0.052
ous+0.052
 echoing+0.052
Neg logit contributions
 breaks-0.029
 hits-0.024
 my-0.024
er-0.022
 makes-0.021
Token held
Token ID2714
Feature activation-2.38e-02

Pos logit contributions
 crashing+0.033
rolling+0.032
free+0.032
 echoing+0.031
ous+0.031
Neg logit contributions
 breaks-0.033
 hits-0.030
 my-0.030
er-0.028
 makes-0.027
Token he
Token ID339
Feature activation-1.76e-02

Pos logit contributions
 crashing+0.041
rolling+0.041
free+0.040
ous+0.039
 echoing+0.039
Neg logit contributions
 breaks-0.027
 my-0.024
 hits-0.023
er-0.022
 makes-0.021
Token would
Token ID561
Feature activation-1.60e-02

Pos logit contributions
 crashing+0.031
rolling+0.031
uttering+0.030
 echoing+0.029
free+0.029
Neg logit contributions
 breaks-0.028
 makes-0.024
 hits-0.024
 my-0.024
er-0.023
Token miss
Token ID2051
Feature activation-9.38e-03

Pos logit contributions
rolling+0.025
 crashing+0.025
free+0.024
ous+0.024
uttering+0.024
Neg logit contributions
 breaks-0.027
 my-0.025
 hits-0.024
 makes-0.023
er-0.023
Token the
Token ID262
Feature activation-3.48e-02

Pos logit contributions
 crashing+0.016
rolling+0.016
free+0.015
ous+0.015
 echoing+0.015
Neg logit contributions
 breaks-0.015
 my-0.014
 hits-0.013
er-0.012
 makes-0.012
Token lovely
Token ID14081
Feature activation-6.32e-02

Pos logit contributions
rolling+0.067
uttering+0.062
ous+0.062
free+0.062
 crashing+0.061
Neg logit contributions
 breaks-0.048
 my-0.047
 hits-0.044
 makes-0.043
er-0.042
Token m
Token ID285
Feature activation-8.08e-02

Pos logit contributions
rolling+0.114
free+0.112
 crashing+0.107
uttering+0.105
ous+0.104
Neg logit contributions
 breaks-0.095
 my-0.090
 hits-0.088
er-0.087
 makes-0.084
Tokenule
Token ID2261
Feature activation-2.87e-02

Pos logit contributions
free+0.088
 echoing+0.085
 crashing+0.083
 harming+0.081
 streaming+0.081
Neg logit contributions
er-0.157
 hits-0.149
 breaks-0.149
 my-0.145
ara-0.141
Token,
Token ID11
Feature activation-1.16e-02

Pos logit contributions
 crashing+0.056
rolling+0.055
free+0.053
ous+0.053
 echoing+0.053
Neg logit contributions
 breaks-0.039
 hits-0.034
 my-0.033
er-0.031
 makes-0.030
Token but
Token ID475
Feature activation-1.67e-02

Pos logit contributions
 crashing+0.021
rolling+0.021
free+0.020
ous+0.020
 echoing+0.020
Neg logit contributions
 breaks-0.017
 my-0.015
 hits-0.015
er-0.014
 makes-0.014
Token he
Token ID339
Feature activation-1.09e-02

Pos logit contributions
 crashing+0.029
rolling+0.028
free+0.028
 echoing+0.027
ous+0.027
Neg logit contributions
 breaks-0.026
 my-0.024
 hits-0.023
 makes-0.022
er-0.022
Token was
Token ID373
Feature activation-1.46e-02

Pos logit contributions
 crashing+0.017
rolling+0.017
free+0.016
uttering+0.016
ous+0.016
Neg logit contributions
 breaks-0.019
 my-0.017
 hits-0.017
 makes-0.017
er-0.016
Token was
Token ID373
Feature activation+1.26e-03

Pos logit contributions
 crashing+0.035
free+0.034
rolling+0.033
uttering+0.033
 echoing+0.032
Neg logit contributions
 breaks-0.023
 hits-0.019
 my-0.019
er-0.019
 makes-0.018
Token amazed
Token ID24562
Feature activation+5.16e-03

Pos logit contributions
 breaks+0.002
 hits+0.002
 my+0.002
er+0.002
 makes+0.002
Neg logit contributions
 crashing-0.002
rolling-0.002
free-0.002
ous-0.002
 echoing-0.002
Token and
Token ID290
Feature activation-8.70e-05

Pos logit contributions
 breaks+0.006
 my+0.005
 hits+0.005
er+0.005
 makes+0.004
Neg logit contributions
 crashing-0.011
rolling-0.011
free-0.011
ous-0.011
 echoing-0.010
Token asked
Token ID1965
Feature activation-2.92e-02

Pos logit contributions
 crashing+0.000
rolling+0.000
free+0.000
ous+0.000
 echoing+0.000
Neg logit contributions
 breaks-0.000
 my-0.000
 hits-0.000
er-0.000
 makes-0.000
Token the
Token ID262
Feature activation-4.03e-02

Pos logit contributions
rolling+0.066
 crashing+0.065
 echoing+0.062
uttering+0.062
free+0.062
Neg logit contributions
 breaks-0.033
 my-0.030
 hits-0.026
er-0.026
 makes-0.024
Token hipp
Token ID18568
Feature activation-8.03e-02

Pos logit contributions
rolling+0.056
uttering+0.055
 echoing+0.053
 crashing+0.053
ous+0.053
Neg logit contributions
 breaks-0.078
 my-0.073
 hits-0.071
er-0.068
 makes-0.067
Tokeno
Token ID78
Feature activation-4.09e-02

Pos logit contributions
rolling+0.150
 echoing+0.149
 crashing+0.145
 streaming+0.145
uttering+0.139
Neg logit contributions
er-0.102
 breaks-0.097
 my-0.096
 hits-0.090
 is-0.088
Token how
Token ID703
Feature activation-4.27e-02

Pos logit contributions
 crashing+0.079
rolling+0.077
 echoing+0.075
free+0.075
uttering+0.074
Neg logit contributions
 breaks-0.056
 my-0.050
 hits-0.048
er-0.047
 makes-0.046
Token he
Token ID339
Feature activation-7.78e-03

Pos logit contributions
rolling+0.086
 crashing+0.082
 streaming+0.080
uttering+0.079
hing+0.078
Neg logit contributions
 breaks-0.059
 my-0.053
 makes-0.047
 hits-0.046
 your-0.046
Token knew
Token ID2993
Feature activation-2.73e-02

Pos logit contributions
 crashing+0.014
rolling+0.013
free+0.013
uttering+0.013
ous+0.013
Neg logit contributions
 breaks-0.012
 hits-0.011
 makes-0.010
 my-0.010
er-0.010
Token how
Token ID703
Feature activation-2.90e-02

Pos logit contributions
rolling+0.052
 crashing+0.050
free+0.049
uttering+0.048
 echoing+0.048
Neg logit contributions
 breaks-0.040
 my-0.037
 hits-0.034
er-0.034
 makes-0.033
Token "
Token ID366
Feature activation-1.23e-02

Pos logit contributions
hing+0.055
owing+0.054
rolling+0.053
 roam+0.053
uttering+0.052
Neg logit contributions
 breaks-0.036
 my-0.035
 makes-0.031
er-0.031
 hits-0.029
TokenWho
Token ID8241
Feature activation+2.81e-02

Pos logit contributions
 crashing+0.016
rolling+0.016
ous+0.016
 echoing+0.015
free+0.015
Neg logit contributions
 breaks-0.024
 my-0.022
 hits-0.022
er-0.021
 makes-0.021
Token will
Token ID481
Feature activation-3.52e-03

Pos logit contributions
 breaks+0.043
 hits+0.039
 my+0.039
er+0.039
 te+0.037
Neg logit contributions
 understand-0.045
 weren-0.043
ous-0.043
 crashing-0.042
rolling-0.042
Token help
Token ID1037
Feature activation+1.86e-02

Pos logit contributions
 crashing+0.005
rolling+0.005
free+0.005
ous+0.005
 echoing+0.005
Neg logit contributions
 breaks-0.006
 hits-0.006
 my-0.006
er-0.005
 makes-0.005
Token me
Token ID502
Feature activation+2.24e-02

Pos logit contributions
 breaks+0.020
 hits+0.016
 my+0.015
er+0.014
 makes+0.014
Neg logit contributions
 crashing-0.042
rolling-0.041
free-0.040
ous-0.040
 echoing-0.040
Token build
Token ID1382
Feature activation+4.53e-02

Pos logit contributions
 breaks+0.035
 hits+0.030
 my+0.029
er+0.028
 makes+0.027
Neg logit contributions
 crashing-0.039
rolling-0.039
free-0.037
ous-0.037
 echoing-0.037
Token my
Token ID616
Feature activation-1.28e-02

Pos logit contributions
 breaks+0.072
ara+0.071
 slams+0.068
 hits+0.067
 te+0.065
Neg logit contributions
 Joey-0.062
 crashing-0.061
 quite-0.059
 understand-0.058
 roam-0.056
Token nest
Token ID16343
Feature activation+3.85e-02

Pos logit contributions
rolling+0.019
 crashing+0.019
free+0.018
 ent+0.018
 roam+0.018
Neg logit contributions
 breaks-0.022
 hits-0.019
 my-0.018
er-0.018
 teaches-0.018
Token?"
Token ID1701
Feature activation-3.02e-03

Pos logit contributions
 breaks+0.053
 hits+0.044
ara+0.041
 draws+0.040
 slams+0.039
Neg logit contributions
 crashing-0.071
free-0.070
ous-0.069
rolling-0.068
 understand-0.068
Token
Token ID220
Feature activation-2.85e-02

Pos logit contributions
 crashing+0.005
rolling+0.005
ous+0.005
 echoing+0.005
free+0.005
Neg logit contributions
 breaks-0.005
 hits-0.004
 my-0.004
er-0.004
 makes-0.004
Token
Token ID198
Feature activation-8.70e-03

Pos logit contributions
 crashing+0.046
 echoing+0.043
rolling+0.043
free+0.043
ous+0.043
Neg logit contributions
 breaks-0.049
 my-0.044
 hits-0.044
er-0.041
 makes-0.041
Token jealous
Token ID19354
Feature activation+2.20e-02

Pos logit contributions
 breaks+0.009
 hits+0.008
 my+0.008
er+0.007
 makes+0.007
Neg logit contributions
 crashing-0.006
rolling-0.006
free-0.006
 echoing-0.006
ous-0.006
Token.
Token ID13
Feature activation+1.44e-02

Pos logit contributions
 breaks+0.038
 hits+0.034
ital+0.034
 my+0.033
 slams+0.033
Neg logit contributions
 crashing-0.032
ous-0.029
free-0.029
 echoing-0.029
 understand-0.029
Token
Token ID198
Feature activation-2.91e-03

Pos logit contributions
 breaks+0.025
 my+0.022
 hits+0.022
 makes+0.021
er+0.020
Neg logit contributions
 crashing-0.024
rolling-0.023
 echoing-0.023
ous-0.022
free-0.022
Token
Token ID198
Feature activation+1.30e-02

Pos logit contributions
 echoing+0.006
 crashing+0.005
rolling+0.005
 streaming+0.005
ous+0.005
Neg logit contributions
 breaks-0.004
 my-0.004
 hits-0.004
 makes-0.004
ara-0.003
Token"
Token ID1
Feature activation+2.21e-02

Pos logit contributions
 breaks+0.021
 hits+0.020
 my+0.017
er+0.017
 makes+0.017
Neg logit contributions
free-0.021
 crashing-0.020
rolling-0.020
 understand-0.020
uttering-0.019
TokenBen
Token ID11696
Feature activation+6.32e-02

Pos logit contributions
 breaks+0.044
 hits+0.040
 my+0.040
er+0.038
 makes+0.037
Neg logit contributions
 crashing-0.028
free-0.027
rolling-0.027
 echoing-0.026
 Marvin-0.026
Token,
Token ID11
Feature activation-1.06e-03

Pos logit contributions
 holes+0.077
 breaks+0.076
 forms+0.074
ital+0.073
 slams+0.073
Neg logit contributions
owing-0.099
 crashing-0.097
ous-0.096
 understand-0.096
free-0.096
Token can
Token ID460
Feature activation+1.33e-02

Pos logit contributions
 crashing+0.002
free+0.002
ous+0.002
 echoing+0.002
 understand+0.002
Neg logit contributions
 breaks-0.002
 hits-0.002
er-0.001
 my-0.001
 makes-0.001
Token I
Token ID314
Feature activation+1.81e-02

Pos logit contributions
 breaks+0.017
er+0.015
 hits+0.015
 holes+0.014
 pieces+0.013
Neg logit contributions
 crashing-0.024
rolling-0.024
ous-0.024
 Marvin-0.024
 Joey-0.024
Token have
Token ID423
Feature activation+5.80e-02

Pos logit contributions
 breaks+0.034
 hits+0.031
er+0.030
 my+0.030
 holes+0.029
Neg logit contributions
 understand-0.024
 crashing-0.024
free-0.023
rolling-0.023
 recognise-0.022
Token your
Token ID534
Feature activation+2.95e-02

Pos logit contributions
er+0.085
 breaks+0.081
 slams+0.081
ara+0.076
ital+0.076
Neg logit contributions
 crashing-0.081
 Marvin-0.080
 understand-0.080
 ent-0.077
 him-0.077
Token anything
Token ID1997
Feature activation+5.74e-03

Pos logit contributions
 breaks+0.035
 hits+0.031
 my+0.030
er+0.028
 makes+0.028
Neg logit contributions
 crashing-0.044
rolling-0.043
free-0.042
ous-0.042
 echoing-0.042
Token else
Token ID2073
Feature activation-2.30e-03

Pos logit contributions
 breaks+0.007
 hits+0.006
 my+0.006
er+0.006
 draws+0.006
Neg logit contributions
 crashing-0.011
free-0.011
ous-0.011
rolling-0.011
 echoing-0.011
Token we
Token ID356
Feature activation-3.59e-02

Pos logit contributions
 crashing+0.005
rolling+0.005
free+0.005
ous+0.005
 echoing+0.005
Neg logit contributions
 breaks-0.003
 hits-0.002
 my-0.002
er-0.002
 makes-0.002
Token need
Token ID761
Feature activation+4.32e-02

Pos logit contributions
ous+0.068
uttering+0.067
 echoing+0.066
rolling+0.065
 crashing+0.065
Neg logit contributions
 breaks-0.053
 hits-0.046
 my-0.045
 makes-0.045
 gives-0.040
Token to
Token ID284
Feature activation-1.38e-03

Pos logit contributions
 breaks+0.058
 hits+0.050
 slams+0.046
 draws+0.044
 te+0.043
Neg logit contributions
 crashing-0.083
 Joey-0.079
ous-0.079
free-0.079
rolling-0.078
Token take
Token ID1011
Feature activation+5.52e-02

Pos logit contributions
 crashing+0.002
rolling+0.002
free+0.002
ous+0.002
 echoing+0.002
Neg logit contributions
 breaks-0.003
 hits-0.002
 my-0.002
er-0.002
 makes-0.002
Token home
Token ID1363
Feature activation+4.14e-02

Pos logit contributions
 breaks+0.077
 hits+0.075
 slams+0.069
 draws+0.068
ara+0.068
Neg logit contributions
free-0.092
 Joey-0.091
 crashing-0.090
 Marvin-0.090
 Mar-0.087
Token?"
Token ID1701
Feature activation-1.24e-02

Pos logit contributions
 breaks+0.067
 hits+0.061
 draws+0.054
 splits+0.054
 slams+0.053
Neg logit contributions
 Joey-0.070
free-0.069
 quite-0.068
 Mar-0.066
ous-0.066
Token Jack
Token ID3619
Feature activation-5.71e-03

Pos logit contributions
 crashing+0.025
ous+0.024
free+0.024
 echoing+0.023
 harming+0.023
Neg logit contributions
 breaks-0.017
 my-0.016
 hits-0.014
 draws-0.013
 makes-0.013
Token's
Token ID338
Feature activation-4.49e-03

Pos logit contributions
 crashing+0.011
free+0.011
uttering+0.011
rolling+0.011
ous+0.010
Neg logit contributions
 breaks-0.008
 hits-0.007
 my-0.007
er-0.007
 makes-0.007
Token eyes
Token ID2951
Feature activation-3.48e-02

Pos logit contributions
 crashing+0.009
free+0.009
rolling+0.009
ous+0.009
hing+0.008
Neg logit contributions
 breaks-0.006
 hits-0.005
 my-0.005
er-0.005
 makes-0.005
Token have
Token ID423
Feature activation+4.88e-02

Pos logit contributions
 breaks+0.041
 hits+0.036
 my+0.036
er+0.034
 makes+0.033
Neg logit contributions
 crashing-0.037
rolling-0.036
free-0.035
ous-0.035
 echoing-0.035
Token fun
Token ID1257
Feature activation+2.51e-02

Pos logit contributions
 breaks+0.072
er+0.069
 hits+0.066
ara+0.065
 slams+0.064
Neg logit contributions
 crashing-0.078
 understand-0.073
 Joey-0.071
 ent-0.071
 weren-0.071
Token.
Token ID13
Feature activation+4.77e-02

Pos logit contributions
 breaks+0.038
 hits+0.033
ara+0.031
 my+0.031
 slams+0.031
Neg logit contributions
 crashing-0.043
free-0.041
 echoing-0.039
 understand-0.038
ous-0.038
Token You
Token ID921
Feature activation+2.32e-02

Pos logit contributions
 breaks+0.081
 hits+0.075
 mess+0.071
er+0.069
 pieces+0.069
Neg logit contributions
free-0.069
 crashing-0.069
 Marvin-0.067
 understand-0.066
 Joey-0.066
Token can
Token ID460
Feature activation+1.20e-02

Pos logit contributions
 breaks+0.040
 hits+0.036
 my+0.035
er+0.034
ara+0.032
Neg logit contributions
 crashing-0.036
free-0.035
rolling-0.034
ous-0.033
 understand-0.033
Token take
Token ID1011
Feature activation+6.83e-02

Pos logit contributions
 breaks+0.021
 hits+0.019
 my+0.019
er+0.018
 makes+0.018
Neg logit contributions
 crashing-0.018
rolling-0.017
free-0.017
ous-0.016
 understand-0.016
Token turns
Token ID4962
Feature activation+3.54e-02

Pos logit contributions
er+0.069
 breaks+0.067
 hits+0.067
 te+0.065
 slams+0.064
Neg logit contributions
 Marvin-0.124
 crashing-0.123
free-0.123
 understand-0.121
 Joey-0.120
Token.
Token ID13
Feature activation+4.60e-02

Pos logit contributions
 breaks+0.048
 hits+0.041
 my+0.040
ara+0.039
er+0.039
Neg logit contributions
 crashing-0.068
free-0.064
ous-0.062
rolling-0.062
 echoing-0.061
Token Or
Token ID1471
Feature activation+4.42e-02

Pos logit contributions
 breaks+0.077
 hits+0.071
 mess+0.066
er+0.066
 pieces+0.064
Neg logit contributions
free-0.069
 crashing-0.069
 Marvin-0.066
 understand-0.065
 Joey-0.064
Token you
Token ID345
Feature activation+2.96e-02

Pos logit contributions
 breaks+0.058
 hits+0.050
 my+0.048
er+0.046
 makes+0.043
Neg logit contributions
 crashing-0.087
rolling-0.085
free-0.084
ous-0.082
 echoing-0.082
Token can
Token ID460
Feature activation+1.44e-02

Pos logit contributions
 breaks+0.046
 my+0.041
 hits+0.041
er+0.039
 makes+0.036
Neg logit contributions
 crashing-0.050
free-0.049
rolling-0.049
ous-0.047
 understand-0.047
Token and
Token ID290
Feature activation+9.56e-03

Pos logit contributions
 crashing+0.038
rolling+0.037
ous+0.037
 echoing+0.036
free+0.036
Neg logit contributions
 breaks-0.035
 my-0.031
 hits-0.031
er-0.029
 makes-0.028
Token Tommy
Token ID19919
Feature activation-1.37e-02

Pos logit contributions
 breaks+0.015
 hits+0.013
 my+0.013
er+0.013
 makes+0.012
Neg logit contributions
 crashing-0.016
rolling-0.016
free-0.015
 echoing-0.015
ous-0.015
Token played
Token ID2826
Feature activation+3.26e-03

Pos logit contributions
 crashing+0.024
rolling+0.024
free+0.023
ous+0.023
uttering+0.023
Neg logit contributions
 breaks-0.021
 hits-0.019
 my-0.018
 makes-0.017
er-0.017
Token together
Token ID1978
Feature activation+1.73e-03

Pos logit contributions
 breaks+0.004
 hits+0.004
 my+0.004
er+0.003
 makes+0.003
Neg logit contributions
 crashing-0.006
free-0.006
rolling-0.006
 echoing-0.006
 Marvin-0.006
Token in
Token ID287
Feature activation+1.28e-02

Pos logit contributions
 breaks+0.003
 hits+0.002
 my+0.002
er+0.002
 makes+0.002
Neg logit contributions
 crashing-0.003
rolling-0.003
free-0.003
ous-0.003
 echoing-0.003
Token the
Token ID262
Feature activation+4.97e-02

Pos logit contributions
 breaks+0.019
 hits+0.017
er+0.015
 my+0.015
 makes+0.015
Neg logit contributions
 crashing-0.024
free-0.022
 Marvin-0.022
 Joey-0.021
rolling-0.021
Token park
Token ID3952
Feature activation+4.30e-03

Pos logit contributions
 breaks+0.082
er+0.078
 hits+0.077
 my+0.072
 slams+0.070
Neg logit contributions
 crashing-0.075
rolling-0.064
free-0.063
 echoing-0.063
 Joey-0.063
Token every
Token ID790
Feature activation+9.59e-03

Pos logit contributions
 breaks+0.007
 hits+0.006
 my+0.006
 draws+0.005
 makes+0.005
Neg logit contributions
 crashing-0.008
free-0.007
rolling-0.007
 echoing-0.007
ous-0.007
Token day
Token ID1110
Feature activation+7.00e-03

Pos logit contributions
 breaks+0.017
 my+0.015
 hits+0.015
er+0.014
 makes+0.014
Neg logit contributions
rolling-0.015
 crashing-0.015
free-0.015
ous-0.015
 echoing-0.014
Token and
Token ID290
Feature activation+1.20e-02

Pos logit contributions
 breaks+0.009
 hits+0.008
 my+0.008
er+0.007
 makes+0.007
Neg logit contributions
 crashing-0.014
rolling-0.014
free-0.013
ous-0.013
 echoing-0.013
Token they
Token ID484
Feature activation-1.07e-02

Pos logit contributions
 breaks+0.021
 hits+0.019
 my+0.019
er+0.018
 te+0.017
Neg logit contributions
 crashing-0.018
ous-0.017
 Joey-0.017
free-0.017
rolling-0.017
Token to
Token ID284
Feature activation+2.50e-02

Pos logit contributions
 breaks+0.060
 hits+0.049
er+0.047
 slams+0.046
 splits+0.044
Neg logit contributions
 crashing-0.086
rolling-0.082
ous-0.082
 Joey-0.080
 understand-0.079
Token make
Token ID787
Feature activation+4.14e-02

Pos logit contributions
 breaks+0.045
 hits+0.040
 my+0.039
er+0.038
 makes+0.037
Neg logit contributions
 crashing-0.038
rolling-0.037
free-0.035
ous-0.035
 understand-0.035
Token something
Token ID1223
Feature activation+1.85e-02

Pos logit contributions
 breaks+0.058
 hits+0.054
 slams+0.050
ara+0.046
er+0.046
Neg logit contributions
 crashing-0.070
 understand-0.069
 Marvin-0.067
 Joey-0.067
ous-0.066
Token with
Token ID351
Feature activation+4.50e-02

Pos logit contributions
 breaks+0.028
 hits+0.025
er+0.022
ara+0.022
 draws+0.022
Neg logit contributions
 crashing-0.031
ous-0.030
 understand-0.030
 echoing-0.030
rolling-0.030
Token your
Token ID534
Feature activation+1.85e-02

Pos logit contributions
 breaks+0.051
 hits+0.045
 slams+0.041
er+0.040
 te+0.037
Neg logit contributions
 crashing-0.091
 Joey-0.088
 Marvin-0.086
free-0.085
ous-0.084
Token buttons
Token ID12163
Feature activation+3.78e-02

Pos logit contributions
 breaks+0.024
 hits+0.021
er+0.020
 my+0.019
 makes+0.018
Neg logit contributions
 crashing-0.035
free-0.034
rolling-0.034
ous-0.034
 Marvin-0.033
Token too
Token ID1165
Feature activation+4.87e-02

Pos logit contributions
 breaks+0.049
 hits+0.040
 slams+0.036
 holes+0.035
 te+0.035
Neg logit contributions
ous-0.073
free-0.072
 crashing-0.072
 understand-0.070
owing-0.069
Token?"
Token ID1701
Feature activation+1.46e-02

Pos logit contributions
 breaks+0.060
 hits+0.051
 te+0.048
er+0.046
 draws+0.046
Neg logit contributions
ous-0.095
 crashing-0.094
 understand-0.091
 Joey-0.091
 Marvin-0.090
Token Ben
Token ID3932
Feature activation+2.55e-02

Pos logit contributions
 breaks+0.016
 hits+0.014
 my+0.013
er+0.013
 makes+0.012
Neg logit contributions
 crashing-0.031
rolling-0.031
free-0.030
 Marvin-0.030
 echoing-0.030
Token said
Token ID531
Feature activation-1.91e-04

Pos logit contributions
 breaks+0.048
 hits+0.042
 my+0.041
 slams+0.037
 holes+0.037
Neg logit contributions
 understand-0.035
 ent-0.035
 echoing-0.035
 Marvin-0.034
rolling-0.033
Token.
Token ID13
Feature activation-5.93e-03

Pos logit contributions
 crashing+0.000
free+0.000
rolling+0.000
ous+0.000
 echoing+0.000
Neg logit contributions
 breaks-0.000
 hits-0.000
 my-0.000
 makes-0.000
er-0.000
Token in
Token ID287
Feature activation+5.87e-03

Pos logit contributions
 crashing+0.046
rolling+0.044
free+0.044
ous+0.044
 echoing+0.043
Neg logit contributions
 breaks-0.041
 hits-0.036
 my-0.035
er-0.032
 makes-0.032
Token her
Token ID607
Feature activation+2.69e-02

Pos logit contributions
 breaks+0.009
 hits+0.009
 my+0.008
er+0.008
 makes+0.008
Neg logit contributions
 crashing-0.010
rolling-0.009
free-0.009
ous-0.009
 echoing-0.009
Token room
Token ID2119
Feature activation+2.02e-02

Pos logit contributions
 breaks+0.038
 hits+0.035
 my+0.033
er+0.032
ara+0.030
Neg logit contributions
 crashing-0.048
rolling-0.047
free-0.046
 echoing-0.045
 ent-0.045
Token as
Token ID355
Feature activation-1.53e-03

Pos logit contributions
 breaks+0.032
 hits+0.029
 my+0.028
 draws+0.025
ara+0.025
Neg logit contributions
 crashing-0.035
free-0.034
ous-0.033
 echoing-0.033
rolling-0.033
Token a
Token ID257
Feature activation+1.52e-03

Pos logit contributions
rolling+0.003
 crashing+0.003
 echoing+0.003
hing+0.003
owing+0.003
Neg logit contributions
 breaks-0.002
 my-0.002
 hits-0.002
er-0.002
 makes-0.002
Token reminder
Token ID15438
Feature activation+1.45e-02

Pos logit contributions
 breaks+0.002
 my+0.002
 hits+0.002
er+0.002
 te+0.002
Neg logit contributions
rolling-0.003
 crashing-0.002
ous-0.002
owing-0.002
free-0.002
Token of
Token ID286
Feature activation-1.11e-02

Pos logit contributions
 breaks+0.023
 hits+0.021
 my+0.020
 draws+0.018
 makes+0.018
Neg logit contributions
 crashing-0.025
ous-0.024
free-0.024
rolling-0.024
 echoing-0.023
Token her
Token ID607
Feature activation-6.53e-04

Pos logit contributions
rolling+0.023
 crashing+0.021
free+0.021
ous+0.021
 echoing+0.020
Neg logit contributions
 breaks-0.015
 my-0.015
 your-0.013
er-0.013
 makes-0.012
Token competitive
Token ID7606
Feature activation+2.32e-02

Pos logit contributions
 crashing+0.001
rolling+0.001
free+0.001
ous+0.001
 echoing+0.001
Neg logit contributions
 breaks-0.001
 my-0.001
 hits-0.001
 makes-0.001
er-0.001
Token spirit
Token ID4437
Feature activation-1.03e-02

Pos logit contributions
 breaks+0.036
 hits+0.033
 my+0.032
 slams+0.030
er+0.029
Neg logit contributions
 crashing-0.039
rolling-0.037
 echoing-0.037
free-0.036
ous-0.036
Token.
Token ID13
Feature activation+1.29e-03

Pos logit contributions
 crashing+0.020
rolling+0.020
free+0.019
ous+0.019
 echoing+0.019
Neg logit contributions
 breaks-0.014
 hits-0.012
 my-0.012
er-0.011
 makes-0.010
Token,
Token ID11
Feature activation+6.21e-03

Pos logit contributions
 breaks+0.035
 hits+0.032
irts+0.031
 my+0.030
 teaches+0.028
Neg logit contributions
ous-0.035
 weren-0.035
 Joey-0.035
 haven-0.033
 Marvin-0.033
Token she
Token ID673
Feature activation+3.20e-02

Pos logit contributions
 breaks+0.009
 my+0.008
 hits+0.008
 makes+0.007
er+0.007
Neg logit contributions
 crashing-0.012
rolling-0.012
 echoing-0.011
ous-0.011
free-0.011
Token saw
Token ID2497
Feature activation+1.33e-02

Pos logit contributions
 breaks+0.062
 hits+0.057
er+0.056
 my+0.056
 slams+0.051
Neg logit contributions
rolling-0.039
 ent-0.038
 weren-0.037
 haven-0.037
'd-0.037
Token a
Token ID257
Feature activation-2.72e-03

Pos logit contributions
 breaks+0.020
 hits+0.018
 my+0.017
er+0.016
 makes+0.016
Neg logit contributions
 crashing-0.023
free-0.022
ous-0.022
 Marvin-0.022
rolling-0.022
Token big
Token ID1263
Feature activation+1.66e-02

Pos logit contributions
 crashing+0.005
rolling+0.005
free+0.005
ous+0.005
 echoing+0.005
Neg logit contributions
 breaks-0.004
 my-0.003
 hits-0.003
 makes-0.003
er-0.003
Token,
Token ID11
Feature activation+1.70e-03

Pos logit contributions
 breaks+0.022
ups+0.021
up+0.020
er+0.019
 learns+0.019
Neg logit contributions
 crashing-0.029
 Joey-0.027
 Frankie-0.027
ous-0.026
 Marvin-0.026
Token red
Token ID2266
Feature activation-1.92e-02

Pos logit contributions
 breaks+0.003
 hits+0.003
er+0.002
 my+0.002
 makes+0.002
Neg logit contributions
 crashing-0.003
free-0.003
 echoing-0.002
rolling-0.002
 Marvin-0.002
Token apple
Token ID17180
Feature activation+5.92e-03

Pos logit contributions
 crashing+0.035
ous+0.033
rolling+0.033
free+0.033
 Marvin+0.032
Neg logit contributions
 breaks-0.028
 hits-0.025
 my-0.024
er-0.023
 makes-0.023
Token.
Token ID13
Feature activation-1.30e-02

Pos logit contributions
 breaks+0.009
 my+0.008
 hits+0.008
 draws+0.008
aj+0.008
Neg logit contributions
 crashing-0.010
ous-0.009
free-0.009
 streaming-0.009
 echoing-0.008
Token She
Token ID1375
Feature activation+1.66e-02

Pos logit contributions
 crashing+0.025
rolling+0.024
free+0.024
ous+0.023
 echoing+0.023
Neg logit contributions
 breaks-0.019
 hits-0.016
 my-0.016
 makes-0.015
er-0.015
Token took
Token ID1718
Feature activation+1.98e-02

Pos logit contributions
 breaks+0.033
 hits+0.030
 my+0.029
er+0.028
 makes+0.027
Neg logit contributions
 crashing-0.022
rolling-0.021
free-0.021
ous-0.020
 echoing-0.020
Token and
Token ID290
Feature activation+2.84e-02

Pos logit contributions
 breaks+0.004
 hits+0.004
 my+0.004
 slams+0.003
ara+0.003
Neg logit contributions
 crashing-0.005
rolling-0.005
ous-0.005
free-0.005
 Marvin-0.005
Token saw
Token ID2497
Feature activation+6.58e-03

Pos logit contributions
 breaks+0.052
 hits+0.047
er+0.047
 my+0.046
 teaches+0.044
Neg logit contributions
 inspect-0.036
 Joey-0.035
 Marvin-0.035
 Mar-0.034
 ent-0.033
Token a
Token ID257
Feature activation-1.73e-02

Pos logit contributions
 breaks+0.010
 hits+0.009
 my+0.008
er+0.008
 makes+0.008
Neg logit contributions
 crashing-0.012
free-0.011
 Marvin-0.011
ous-0.011
 ent-0.011
Token shiny
Token ID22441
Feature activation-4.08e-03

Pos logit contributions
 crashing+0.031
rolling+0.030
free+0.029
ous+0.029
 echoing+0.029
Neg logit contributions
 breaks-0.026
 hits-0.023
 my-0.023
er-0.021
 makes-0.021
Token necklace
Token ID38987
Feature activation-2.59e-03

Pos logit contributions
 crashing+0.006
ous+0.005
 Joey+0.005
 Marvin+0.005
free+0.005
Neg logit contributions
 breaks-0.007
 hits-0.006
 slams-0.006
 makes-0.006
 learns-0.006
Token with
Token ID351
Feature activation+6.96e-03

Pos logit contributions
 crashing+0.004
ous+0.004
free+0.004
 echoing+0.004
uttering+0.004
Neg logit contributions
 breaks-0.004
 my-0.004
 hits-0.004
 draws-0.003
 makes-0.003
Token a
Token ID257
Feature activation+5.75e-03

Pos logit contributions
 breaks+0.010
 hits+0.009
 my+0.009
er+0.008
 makes+0.008
Neg logit contributions
 crashing-0.012
free-0.012
rolling-0.012
 Marvin-0.012
ous-0.012
Token shell
Token ID7582
Feature activation-7.53e-03

Pos logit contributions
 breaks+0.009
 hits+0.008
er+0.008
 my+0.008
 slams+0.008
Neg logit contributions
 crashing-0.010
rolling-0.009
free-0.009
ous-0.009
 Marvin-0.009
Token.
Token ID13
Feature activation+1.43e-02

Pos logit contributions
 crashing+0.012
ous+0.011
uttering+0.011
 streaming+0.011
free+0.011
Neg logit contributions
 breaks-0.013
 my-0.011
 hits-0.011
 slams-0.010
 teaches-0.010
Token It
Token ID632
Feature activation-1.52e-02

Pos logit contributions
 breaks+0.028
 hits+0.026
 my+0.025
er+0.024
 makes+0.024
Neg logit contributions
 crashing-0.018
rolling-0.018
free-0.018
ous-0.017
 Marvin-0.017
Token was
Token ID373
Feature activation-7.09e-03

Pos logit contributions
 crashing+0.021
rolling+0.021
 Marvin+0.020
ous+0.020
 weren+0.020
Neg logit contributions
 breaks-0.027
 my-0.024
 hits-0.024
er-0.023
 slams-0.022
TokenThe
Token ID464
Feature activation+2.01e-02

Pos logit contributions
 breaks+0.021
 my+0.019
 hits+0.018
er+0.017
 makes+0.016
Neg logit contributions
 crashing-0.024
rolling-0.023
ous-0.023
free-0.023
 echoing-0.023
Token moral
Token ID6573
Feature activation+3.91e-02

Pos logit contributions
 breaks+0.032
 my+0.028
 hits+0.028
er+0.026
 makes+0.025
Neg logit contributions
 crashing-0.035
rolling-0.034
free-0.033
ous-0.033
 echoing-0.033
Token of
Token ID286
Feature activation-5.99e-03

Pos logit contributions
 breaks+0.068
 hits+0.065
 my+0.063
 slams+0.060
 draws+0.060
Neg logit contributions
 understand-0.053
 crashing-0.052
free-0.051
ous-0.050
 Marvin-0.049
Token the
Token ID262
Feature activation+9.51e-03

Pos logit contributions
 crashing+0.013
rolling+0.013
free+0.012
ous+0.012
 echoing+0.012
Neg logit contributions
 breaks-0.007
 hits-0.006
 my-0.006
er-0.005
 makes-0.005
Token story
Token ID1621
Feature activation+3.36e-02

Pos logit contributions
 breaks+0.016
 hits+0.015
 my+0.014
er+0.014
 draws+0.013
Neg logit contributions
 crashing-0.015
rolling-0.014
ous-0.014
free-0.014
 echoing-0.014
Token is
Token ID318
Feature activation+1.63e-02

Pos logit contributions
 breaks+0.029
 hits+0.027
 my+0.024
 te+0.024
irts+0.023
Neg logit contributions
ous-0.076
 weren-0.071
 crashing-0.070
 understand-0.070
free-0.069
Token that
Token ID326
Feature activation+9.22e-03

Pos logit contributions
 breaks+0.024
 hits+0.022
 my+0.020
er+0.019
 draws+0.019
Neg logit contributions
 crashing-0.029
rolling-0.029
ous-0.029
free-0.028
 Joey-0.027
Token it
Token ID340
Feature activation+2.15e-03

Pos logit contributions
 breaks+0.014
 hits+0.012
 my+0.012
er+0.011
ara+0.011
Neg logit contributions
 crashing-0.017
rolling-0.016
free-0.016
ous-0.016
 echoing-0.015
Token's
Token ID338
Feature activation-2.60e-03

Pos logit contributions
 breaks+0.002
 hits+0.001
 my+0.001
er+0.001
ara+0.001
Neg logit contributions
 crashing-0.005
ous-0.005
free-0.005
rolling-0.005
 understand-0.005
Token okay
Token ID8788
Feature activation-1.14e-02

Pos logit contributions
 crashing+0.004
rolling+0.004
free+0.004
ous+0.004
 echoing+0.004
Neg logit contributions
 breaks-0.004
 hits-0.004
 my-0.004
er-0.003
 makes-0.003
Token to
Token ID284
Feature activation-1.88e-02

Pos logit contributions
 crashing+0.019
free+0.018
ous+0.018
rolling+0.018
 echoing+0.017
Neg logit contributions
 breaks-0.019
 hits-0.017
 my-0.016
er-0.015
 te-0.015
Token went
Token ID1816
Feature activation+2.00e-03

Pos logit contributions
 crashing+0.043
ous+0.042
rolling+0.042
uttering+0.042
 echoing+0.041
Neg logit contributions
 breaks-0.039
 hits-0.035
 makes-0.033
 my-0.032
 draws-0.030
Token,
Token ID11
Feature activation-6.81e-03

Pos logit contributions
 breaks+0.003
 hits+0.002
 my+0.002
er+0.002
 makes+0.002
Neg logit contributions
 crashing-0.004
rolling-0.004
free-0.004
ous-0.004
 echoing-0.004
Token the
Token ID262
Feature activation-1.09e-02

Pos logit contributions
 crashing+0.013
rolling+0.012
free+0.012
ous+0.012
 echoing+0.012
Neg logit contributions
 breaks-0.010
 hits-0.009
 my-0.009
er-0.008
 makes-0.008
Token monkey
Token ID21657
Feature activation-8.61e-03

Pos logit contributions
 crashing+0.019
rolling+0.018
free+0.018
 echoing+0.018
ous+0.018
Neg logit contributions
 breaks-0.017
 hits-0.015
 my-0.015
er-0.014
 makes-0.014
Token was
Token ID373
Feature activation-1.98e-02

Pos logit contributions
 crashing+0.013
rolling+0.012
free+0.012
ous+0.012
 echoing+0.012
Neg logit contributions
 breaks-0.016
 hits-0.014
 my-0.014
er-0.013
 makes-0.013
Token always
Token ID1464
Feature activation-1.05e-02

Pos logit contributions
 crashing+0.031
rolling+0.030
free+0.029
ous+0.029
 echoing+0.029
Neg logit contributions
 breaks-0.035
 hits-0.031
 my-0.031
er-0.029
 makes-0.028
Token in
Token ID287
Feature activation-2.67e-03

Pos logit contributions
 crashing+0.017
rolling+0.017
free+0.017
ous+0.016
 echoing+0.016
Neg logit contributions
 breaks-0.017
 hits-0.015
 my-0.015
er-0.014
 makes-0.014
Token the
Token ID262
Feature activation+1.49e-02

Pos logit contributions
 crashing+0.005
rolling+0.005
free+0.005
 echoing+0.005
ous+0.005
Neg logit contributions
 breaks-0.004
 hits-0.003
 my-0.003
er-0.003
 makes-0.003
Token lead
Token ID1085
Feature activation+2.87e-03

Pos logit contributions
 breaks+0.025
 hits+0.022
 my+0.022
er+0.021
 makes+0.020
Neg logit contributions
 crashing-0.025
rolling-0.023
free-0.023
 echoing-0.023
 Marvin-0.023
Token.
Token ID13
Feature activation+1.48e-03

Pos logit contributions
 breaks+0.004
 hits+0.004
 my+0.003
er+0.003
 makes+0.003
Neg logit contributions
 crashing-0.005
free-0.005
rolling-0.005
ous-0.005
 echoing-0.005
Token We
Token ID775
Feature activation-4.25e-02

Pos logit contributions
 breaks+0.002
 my+0.002
 hits+0.002
 makes+0.002
 draws+0.002
Neg logit contributions
 crashing-0.003
 echoing-0.003
rolling-0.003
ous-0.003
uttering-0.003
Token hesitant
Token ID32848
Feature activation-1.43e-02

Pos logit contributions
 crashing+0.007
rolling+0.007
free+0.007
ous+0.007
 echoing+0.006
Neg logit contributions
 breaks-0.009
 my-0.008
 hits-0.008
er-0.007
 makes-0.007
Token at
Token ID379
Feature activation-8.19e-03

Pos logit contributions
 crashing+0.030
rolling+0.030
free+0.029
ous+0.028
 Marvin+0.028
Neg logit contributions
 breaks-0.017
 my-0.015
 hits-0.014
er-0.013
 makes-0.013
Token first
Token ID717
Feature activation+6.61e-03

Pos logit contributions
rolling+0.020
 echoing+0.020
 crashing+0.019
free+0.019
hing+0.019
Neg logit contributions
 my-0.008
 breaks-0.007
 your-0.006
 this-0.005
 makes-0.005
Token.
Token ID13
Feature activation-7.65e-04

Pos logit contributions
 breaks+0.006
 hits+0.005
 my+0.005
er+0.004
 makes+0.004
Neg logit contributions
 crashing-0.015
rolling-0.015
free-0.015
 echoing-0.015
ous-0.015
Token They
Token ID1119
Feature activation-1.95e-02

Pos logit contributions
 crashing+0.001
rolling+0.001
free+0.001
ous+0.001
 echoing+0.001
Neg logit contributions
 breaks-0.001
 my-0.001
 hits-0.001
er-0.001
 makes-0.001
Token did
Token ID750
Feature activation-6.88e-03

Pos logit contributions
 crashing+0.037
rolling+0.037
free+0.037
uttering+0.036
ous+0.036
Neg logit contributions
 breaks-0.026
 my-0.022
 hits-0.022
er-0.021
 makes-0.021
Token not
Token ID407
Feature activation-3.05e-02

Pos logit contributions
 crashing+0.013
rolling+0.013
free+0.013
ous+0.013
 echoing+0.013
Neg logit contributions
 breaks-0.009
 my-0.008
 hits-0.008
er-0.007
 makes-0.007
Token know
Token ID760
Feature activation-1.07e-02

Pos logit contributions
rolling+0.036
 crashing+0.033
ous+0.032
uttering+0.032
 Marvin+0.031
Neg logit contributions
 breaks-0.066
 my-0.061
 hits-0.060
 makes-0.057
er-0.057
Token if
Token ID611
Feature activation-1.78e-02

Pos logit contributions
rolling+0.017
 crashing+0.017
free+0.016
 echoing+0.016
ous+0.016
Neg logit contributions
 breaks-0.018
 my-0.017
 hits-0.016
er-0.016
 makes-0.015
Token it
Token ID340
Feature activation-1.85e-02

Pos logit contributions
 crashing+0.030
rolling+0.029
free+0.029
ous+0.029
 echoing+0.028
Neg logit contributions
 breaks-0.028
 hits-0.025
 my-0.025
er-0.023
 makes-0.023
Token was
Token ID373
Feature activation-1.94e-02

Pos logit contributions
 crashing+0.036
rolling+0.035
uttering+0.034
free+0.034
 Marvin+0.033
Neg logit contributions
 breaks-0.027
 hits-0.023
 makes-0.022
 my-0.022
er-0.022
Token,
Token ID11
Feature activation-1.15e-02

Pos logit contributions
 crashing+0.034
rolling+0.034
free+0.033
ous+0.033
 echoing+0.033
Neg logit contributions
 breaks-0.018
 hits-0.015
 my-0.015
er-0.013
 makes-0.013
Token he
Token ID339
Feature activation+1.37e-02

Pos logit contributions
 crashing+0.021
rolling+0.021
uttering+0.021
 echoing+0.020
free+0.020
Neg logit contributions
 breaks-0.017
 my-0.016
 hits-0.015
 makes-0.014
er-0.013
Token was
Token ID373
Feature activation-6.66e-03

Pos logit contributions
 breaks+0.024
 my+0.022
 hits+0.021
er+0.020
 makes+0.019
Neg logit contributions
 crashing-0.021
rolling-0.021
 ent-0.020
 Marvin-0.020
 echoing-0.020
Token a
Token ID257
Feature activation-9.23e-03

Pos logit contributions
 crashing+0.010
rolling+0.010
free+0.010
ous+0.010
 echoing+0.009
Neg logit contributions
 breaks-0.012
 hits-0.011
 my-0.011
er-0.010
 makes-0.010
Token bit
Token ID1643
Feature activation-9.96e-03

Pos logit contributions
 crashing+0.016
rolling+0.016
ous+0.016
free+0.016
 streaming+0.015
Neg logit contributions
 breaks-0.014
 my-0.013
 hits-0.012
 makes-0.011
er-0.011
Token scared
Token ID12008
Feature activation-2.37e-02

Pos logit contributions
 crashing+0.014
rolling+0.013
free+0.013
 echoing+0.013
 ent+0.012
Neg logit contributions
 breaks-0.019
 hits-0.017
 my-0.017
er-0.016
 makes-0.016
Token.
Token ID13
Feature activation-9.70e-03

Pos logit contributions
rolling+0.053
 crashing+0.051
free+0.050
ous+0.050
 echoing+0.049
Neg logit contributions
 breaks-0.026
 my-0.022
 hits-0.021
er-0.020
 makes-0.019
Token He
Token ID679
Feature activation+4.38e-03

Pos logit contributions
 crashing+0.017
uttering+0.017
rolling+0.016
hing+0.016
free+0.016
Neg logit contributions
 breaks-0.016
 my-0.014
 hits-0.013
 makes-0.013
er-0.012
Token was
Token ID373
Feature activation-1.59e-02

Pos logit contributions
 breaks+0.007
 hits+0.006
 my+0.006
 makes+0.006
er+0.006
Neg logit contributions
 crashing-0.007
free-0.007
rolling-0.007
ous-0.007
uttering-0.007
Token also
Token ID635
Feature activation-6.25e-04

Pos logit contributions
free+0.025
ous+0.025
rolling+0.025
 crashing+0.024
hing+0.024
Neg logit contributions
 breaks-0.027
 my-0.025
 hits-0.024
 makes-0.022
er-0.022
Token very
Token ID845
Feature activation-1.03e-02

Pos logit contributions
 crashing+0.001
rolling+0.001
ous+0.001
free+0.001
 echoing+0.001
Neg logit contributions
 breaks-0.001
 my-0.001
 hits-0.001
 makes-0.001
er-0.001
Token,
Token ID11
Feature activation-2.64e-02

Pos logit contributions
 crashing+0.080
rolling+0.079
free+0.076
 echoing+0.076
uttering+0.076
Neg logit contributions
 breaks-0.020
 my-0.016
 hits-0.014
er-0.013
 makes-0.012
Token about
Token ID546
Feature activation-2.37e-02

Pos logit contributions
rolling+0.057
uttering+0.055
free+0.053
ous+0.052
hing+0.052
Neg logit contributions
 my-0.030
 breaks-0.029
 makes-0.026
er-0.024
 your-0.024
Token his
Token ID465
Feature activation-3.55e-03

Pos logit contributions
rolling+0.053
uttering+0.049
free+0.049
hing+0.049
ous+0.049
Neg logit contributions
 my-0.028
 breaks-0.027
 your-0.024
 this-0.023
 makes-0.022
Token problem
Token ID1917
Feature activation+4.21e-03

Pos logit contributions
 crashing+0.006
free+0.006
ous+0.006
rolling+0.006
hing+0.006
Neg logit contributions
 breaks-0.006
 hits-0.005
 my-0.005
 makes-0.004
er-0.004
Token and
Token ID290
Feature activation-4.31e-03

Pos logit contributions
 breaks+0.005
 hits+0.004
 my+0.004
er+0.004
 makes+0.004
Neg logit contributions
 crashing-0.009
rolling-0.009
free-0.008
ous-0.008
 echoing-0.008
Token Tommy
Token ID19919
Feature activation-2.51e-02

Pos logit contributions
 crashing+0.007
rolling+0.007
free+0.007
ous+0.007
 echoing+0.006
Neg logit contributions
 breaks-0.007
 hits-0.007
 my-0.006
er-0.006
 makes-0.006
Token suggested
Token ID5220
Feature activation-1.88e-02

Pos logit contributions
 crashing+0.047
rolling+0.046
free+0.044
ous+0.044
 echoing+0.044
Neg logit contributions
 breaks-0.037
 hits-0.032
 my-0.031
er-0.030
 makes-0.029
Token he
Token ID339
Feature activation-3.30e-03

Pos logit contributions
 crashing+0.040
rolling+0.039
free+0.038
ous+0.038
 echoing+0.038
Neg logit contributions
 breaks-0.023
 my-0.019
 hits-0.018
er-0.017
 makes-0.017
Token make
Token ID787
Feature activation+7.30e-03

Pos logit contributions
 crashing+0.005
rolling+0.005
free+0.005
ous+0.005
 echoing+0.005
Neg logit contributions
 breaks-0.006
 hits-0.005
 my-0.005
er-0.005
 makes-0.005
Token a
Token ID257
Feature activation+1.02e-02

Pos logit contributions
 breaks+0.012
 hits+0.010
 my+0.010
er+0.009
 makes+0.009
Neg logit contributions
 crashing-0.012
rolling-0.012
free-0.012
 echoing-0.012
ous-0.012
Token special
Token ID2041
Feature activation+3.33e-03

Pos logit contributions
 breaks+0.016
 hits+0.014
 my+0.014
er+0.013
 makes+0.013
Neg logit contributions
 crashing-0.017
rolling-0.017
free-0.016
ous-0.016
 echoing-0.016
Token proud
Token ID6613
Feature activation+2.44e-03

Pos logit contributions
 breaks+0.013
 my+0.012
 hits+0.011
 makes+0.011
 your+0.010
Neg logit contributions
 crashing-0.012
ous-0.012
rolling-0.012
 Marvin-0.011
 harming-0.011
Token and
Token ID290
Feature activation+3.07e-04

Pos logit contributions
 breaks+0.004
 hits+0.003
 my+0.003
 draws+0.003
 makes+0.003
Neg logit contributions
 crashing-0.004
ous-0.004
 echoing-0.004
rolling-0.004
free-0.004
Token happy
Token ID3772
Feature activation-6.99e-03

Pos logit contributions
 breaks+0.001
 hits+0.001
 my+0.001
er+0.000
 makes+0.000
Neg logit contributions
 crashing-0.000
rolling-0.000
free-0.000
ous-0.000
 echoing-0.000
Token.
Token ID13
Feature activation+6.74e-03

Pos logit contributions
 crashing+0.014
rolling+0.014
free+0.014
ous+0.013
 echoing+0.013
Neg logit contributions
 breaks-0.009
 hits-0.008
 my-0.008
er-0.007
 makes-0.007
Token She
Token ID1375
Feature activation+1.79e-02

Pos logit contributions
 breaks+0.012
 my+0.011
 hits+0.011
 makes+0.010
er+0.010
Neg logit contributions
 crashing-0.010
rolling-0.010
free-0.010
 echoing-0.010
ous-0.010
Token never
Token ID1239
Feature activation-4.65e-03

Pos logit contributions
 breaks+0.032
 hits+0.029
 my+0.028
er+0.027
 makes+0.026
Neg logit contributions
 crashing-0.027
rolling-0.026
 echoing-0.026
ous-0.025
 ent-0.025
Token forgot
Token ID16453
Feature activation-2.10e-02

Pos logit contributions
 crashing+0.007
rolling+0.007
ous+0.007
free+0.007
 echoing+0.007
Neg logit contributions
 breaks-0.008
 my-0.007
 hits-0.007
er-0.007
 te-0.006
Token that
Token ID326
Feature activation-1.16e-02

Pos logit contributions
 crashing+0.039
rolling+0.038
free+0.037
ous+0.037
 echoing+0.036
Neg logit contributions
 breaks-0.031
 hits-0.027
 my-0.027
er-0.025
 makes-0.024
Token long
Token ID890
Feature activation-2.48e-02

Pos logit contributions
 crashing+0.018
rolling+0.018
free+0.018
ous+0.017
 echoing+0.017
Neg logit contributions
 breaks-0.020
 hits-0.018
 my-0.018
er-0.017
 makes-0.017
Token and
Token ID290
Feature activation-1.05e-02

Pos logit contributions
 crashing+0.042
rolling+0.041
free+0.040
ous+0.039
 echoing+0.039
Neg logit contributions
 breaks-0.040
 my-0.036
 hits-0.035
er-0.032
 makes-0.032
Token exciting
Token ID7895
Feature activation-2.12e-02

Pos logit contributions
 crashing+0.017
rolling+0.016
free+0.016
ous+0.016
 echoing+0.016
Neg logit contributions
 breaks-0.018
 hits-0.016
 my-0.015
er-0.015
 makes-0.014
Token!
Token ID0
Feature activation+2.34e-03

Pos logit contributions
 breaks+0.030
 hits+0.027
 te+0.024
 my+0.024
 draws+0.024
Neg logit contributions
ous-0.053
free-0.052
owing-0.051
 crashing-0.051
rolling-0.050
Token Be
Token ID1355
Feature activation+8.51e-03

Pos logit contributions
 breaks+0.003
 my+0.003
 hits+0.003
er+0.002
 makes+0.002
Neg logit contributions
 crashing-0.004
rolling-0.004
 echoing-0.004
free-0.004
uttering-0.004
Token quiet
Token ID5897
Feature activation+1.03e-02

Pos logit contributions
 breaks+0.015
 hits+0.014
er+0.013
 my+0.013
 draws+0.012
Neg logit contributions
 crashing-0.013
free-0.012
 echoing-0.012
rolling-0.012
 understand-0.012
Token!"
Token ID2474
Feature activation-3.67e-02

Pos logit contributions
 breaks+0.011
 hits+0.010
 my+0.009
 slams+0.008
 draws+0.008
Neg logit contributions
 crashing-0.023
ous-0.022
free-0.022
 echoing-0.022
rolling-0.021
Token Kitty
Token ID21168
Feature activation+3.95e-03

Pos logit contributions
 crashing+0.070
rolling+0.067
uttering+0.067
ous+0.065
piring+0.064
Neg logit contributions
 breaks-0.054
 my-0.047
er-0.046
 hits-0.046
 makes-0.045
Token and
Token ID290
Feature activation-6.39e-02

Pos logit contributions
 breaks+0.005
 hits+0.005
 my+0.005
er+0.004
 makes+0.004
Neg logit contributions
 crashing-0.008
rolling-0.007
free-0.007
ous-0.007
 echoing-0.007
Token Bunny
Token ID28683
Feature activation-2.98e-02

Pos logit contributions
uttering+0.109
rolling+0.106
year+0.105
ous+0.103
owing+0.101
Neg logit contributions
 my-0.112
 breaks-0.105
 your-0.097
 makes-0.097
 hits-0.091
Token felt
Token ID2936
Feature activation+6.74e-03

Pos logit contributions
 crashing+0.048
rolling+0.047
uttering+0.047
ous+0.046
free+0.045
Neg logit contributions
 breaks-0.050
 hits-0.044
 makes-0.042
er-0.042
 my-0.041
Token sad
Token ID6507
Feature activation-4.81e-03

Pos logit contributions
 breaks+0.012
 my+0.010
 hits+0.010
 makes+0.010
er+0.010
Neg logit contributions
 crashing-0.011
rolling-0.011
ous-0.010
free-0.010
 echoing-0.010
Token.
Token ID13
Feature activation-3.45e-02

Pos logit contributions
 crashing+0.009
rolling+0.009
free+0.009
ous+0.009
 echoing+0.009
Neg logit contributions
 breaks-0.006
 hits-0.006
 my-0.005
er-0.005
 makes-0.005
Token\n
Token ID198
Feature activation-2.04e-02

Pos logit contributions
 crashing+0.070
hing+0.069
 echoing+0.068
owing+0.067
uttering+0.066
Neg logit contributions
 breaks-0.045
 my-0.040
 makes-0.038
 hits-0.037
er-0.035
Token.
Token ID13
Feature activation-2.49e-02

Pos logit contributions
rolling+0.088
 ent+0.085
 Marvin+0.083
 crashing+0.083
Frank+0.081
Neg logit contributions
 breaks-0.041
 hits-0.034
er-0.034
 my-0.032
 makes-0.030
Token\n
Token ID198
Feature activation-2.54e-02

Pos logit contributions
 crashing+0.043
uttering+0.043
rolling+0.043
hing+0.042
free+0.042
Neg logit contributions
 breaks-0.042
 my-0.036
 hits-0.035
 makes-0.033
er-0.032
Token\n
Token ID198
Feature activation-2.43e-02

Pos logit contributions
 echoing+0.050
 crashing+0.050
 streaming+0.049
ous+0.048
owing+0.048
Neg logit contributions
 breaks-0.034
 my-0.032
 hits-0.030
er-0.027
 makes-0.026
TokenBob
Token ID18861
Feature activation+8.81e-03

Pos logit contributions
 crashing+0.043
ous+0.043
rolling+0.043
 streaming+0.043
owing+0.042
Neg logit contributions
 breaks-0.036
 my-0.033
 hits-0.031
er-0.029
 makes-0.028
Token said
Token ID531
Feature activation-2.54e-02

Pos logit contributions
 breaks+0.014
 my+0.013
 hits+0.013
er+0.012
 makes+0.011
Neg logit contributions
rolling-0.014
 crashing-0.014
 ent-0.013
ous-0.013
 echoing-0.013
Token to
Token ID284
Feature activation-5.74e-02

Pos logit contributions
 crashing+0.055
free+0.055
ous+0.054
rolling+0.053
 Marvin+0.052
Neg logit contributions
 breaks-0.027
 hits-0.024
 my-0.022
 draws-0.021
ara-0.020
Token himself
Token ID2241
Feature activation-3.46e-02

Pos logit contributions
rolling+0.135
uttering+0.134
itting+0.133
free+0.129
 echoing+0.126
Neg logit contributions
 my-0.074
 breaks-0.067
 hits-0.059
 your-0.057
 this-0.053
Token,
Token ID11
Feature activation-2.87e-02

Pos logit contributions
 crashing+0.091
rolling+0.090
free+0.088
 echoing+0.087
ous+0.087
Neg logit contributions
 breaks-0.024
 my-0.018
 hits-0.017
er-0.015
 makes-0.013
Token "
Token ID366
Feature activation-1.91e-02

Pos logit contributions
rolling+0.057
uttering+0.057
owing+0.056
hing+0.056
andering+0.055
Neg logit contributions
 my-0.035
 breaks-0.032
er-0.030
 this-0.029
 makes-0.029
TokenNow
Token ID3844
Feature activation+6.91e-03

Pos logit contributions
rolling+0.030
ous+0.029
 crashing+0.029
uttering+0.027
owing+0.027
Neg logit contributions
 breaks-0.033
 my-0.031
 hits-0.030
er-0.030
 makes-0.029
Token that
Token ID326
Feature activation+6.23e-03

Pos logit contributions
 breaks+0.009
 hits+0.007
 my+0.007
er+0.007
 draws+0.006
Neg logit contributions
 crashing-0.014
rolling-0.013
free-0.013
 Marvin-0.013
ous-0.013
Token,
Token ID11
Feature activation-2.82e-02

Pos logit contributions
 crashing+0.056
free+0.055
rolling+0.054
ous+0.054
 echoing+0.053
Neg logit contributions
 breaks-0.021
 hits-0.018
 my-0.017
er-0.015
 makes-0.014
Token "
Token ID366
Feature activation-1.70e-02

Pos logit contributions
hing+0.055
rolling+0.053
 roam+0.053
owing+0.053
uttering+0.052
Neg logit contributions
 breaks-0.036
 my-0.034
er-0.032
 makes-0.032
 hits-0.030
TokenHello
Token ID15496
Feature activation-1.34e-02

Pos logit contributions
 crashing+0.025
rolling+0.025
ous+0.024
free+0.023
 echoing+0.023
Neg logit contributions
 breaks-0.032
 hits-0.028
 my-0.028
er-0.027
 makes-0.026
Token,
Token ID11
Feature activation-3.87e-02

Pos logit contributions
 crashing+0.030
ous+0.029
rolling+0.029
 Marvin+0.028
free+0.028
Neg logit contributions
 breaks-0.015
 hits-0.012
 my-0.011
er-0.011
 makes-0.010
Token Mr
Token ID1770
Feature activation-2.02e-02

Pos logit contributions
rolling+0.060
free+0.057
 streaming+0.055
 crashing+0.055
 echoing+0.054
Neg logit contributions
 breaks-0.070
 my-0.068
 hits-0.062
 makes-0.060
er-0.057
Token.
Token ID13
Feature activation-4.83e-02

Pos logit contributions
 crashing+0.038
free+0.037
ous+0.037
rolling+0.037
 Marvin+0.036
Neg logit contributions
 breaks-0.028
 hits-0.025
 my-0.024
er-0.022
 makes-0.022
Token Whale
Token ID44772
Feature activation-4.37e-03

Pos logit contributions
 crashing+0.048
rolling+0.047
free+0.046
 echoing+0.045
 streaming+0.042
Neg logit contributions
 breaks-0.110
 my-0.103
 hits-0.102
 makes-0.096
er-0.096
Token!
Token ID0
Feature activation+4.21e-04

Pos logit contributions
 crashing+0.009
free+0.009
ous+0.009
 echoing+0.008
rolling+0.008
Neg logit contributions
 breaks-0.005
 hits-0.005
 my-0.004
er-0.004
 draws-0.004
Token We
Token ID775
Feature activation-4.08e-02

Pos logit contributions
 breaks+0.001
 my+0.000
 hits+0.000
 makes+0.000
er+0.000
Neg logit contributions
 crashing-0.001
rolling-0.001
 echoing-0.001
free-0.001
uttering-0.001
Token won
Token ID1839
Feature activation+6.83e-03

Pos logit contributions
 crashing+0.068
rolling+0.067
ous+0.065
 echoing+0.064
 Marvin+0.063
Neg logit contributions
 breaks-0.070
 hits-0.061
 my-0.060
 makes-0.057
er-0.055
Token't
Token ID470
Feature activation-1.92e-02

Pos logit contributions
 breaks+0.013
 hits+0.012
 my+0.012
er+0.012
 makes+0.011
Neg logit contributions
 crashing-0.009
rolling-0.009
free-0.009
ous-0.009
 Marvin-0.008
Token
Token ID220
Feature activation-3.52e-02

Pos logit contributions
 echoing+0.038
hing+0.038
 crashing+0.038
free+0.036
rolling+0.036
Neg logit contributions
 breaks-0.031
 my-0.028
 makes-0.026
 hits-0.025
ara-0.025
Token\n
Token ID198
Feature activation-2.90e-02

Pos logit contributions
 crashing+0.064
 echoing+0.061
ous+0.059
free+0.058
hing+0.058
Neg logit contributions
 breaks-0.054
 my-0.049
 hits-0.046
ara-0.044
 makes-0.044
Token\n
Token ID198
Feature activation-3.03e-02

Pos logit contributions
 echoing+0.063
owing+0.062
 crashing+0.062
ous+0.061
 harming+0.059
Neg logit contributions
 my-0.032
 breaks-0.030
 your-0.026
ara-0.025
 this-0.024
TokenM
Token ID44
Feature activation-2.21e-02

Pos logit contributions
ous+0.060
 crashing+0.059
owing+0.059
 echoing+0.058
rolling+0.057
Neg logit contributions
 breaks-0.038
 my-0.036
ara-0.030
 makes-0.030
er-0.030
Tokenabel
Token ID9608
Feature activation+2.44e-02

Pos logit contributions
 crashing+0.036
rolling+0.035
free+0.034
 echoing+0.033
ous+0.033
Neg logit contributions
 breaks-0.038
 my-0.033
 hits-0.033
er-0.031
 makes-0.031
Token and
Token ID290
Feature activation-4.86e-02

Pos logit contributions
 breaks+0.043
 hits+0.039
 my+0.039
 slams+0.036
 holes+0.035
Neg logit contributions
 crashing-0.035
 echoing-0.034
ous-0.034
rolling-0.033
 ent-0.033
Token her
Token ID607
Feature activation-2.03e-02

Pos logit contributions
rolling+0.100
free+0.098
hing+0.096
ous+0.096
 crashing+0.096
Neg logit contributions
 breaks-0.062
 my-0.061
 makes-0.053
 your-0.052
 gives-0.048
Token mum
Token ID25682
Feature activation-6.91e-03

Pos logit contributions
free+0.037
 crashing+0.034
hing+0.034
rolling+0.033
ous+0.033
Neg logit contributions
 breaks-0.034
 my-0.032
 jar-0.029
 hits-0.029
 makes-0.028
Token were
Token ID547
Feature activation+2.28e-03

Pos logit contributions
 crashing+0.011
free+0.011
rolling+0.010
uttering+0.010
ous+0.010
Neg logit contributions
 breaks-0.012
 hits-0.011
 my-0.010
er-0.010
 makes-0.010
Token both
Token ID1111
Feature activation-1.08e-03

Pos logit contributions
 breaks+0.004
 hits+0.004
 my+0.003
er+0.003
 makes+0.003
Neg logit contributions
 crashing-0.004
rolling-0.004
free-0.003
 echoing-0.003
ous-0.003
Token so
Token ID523
Feature activation+5.53e-03

Pos logit contributions
 crashing+0.002
rolling+0.002
free+0.002
ous+0.002
 Marvin+0.002
Neg logit contributions
 breaks-0.002
 my-0.002
 hits-0.002
er-0.001
 makes-0.001
Token small
Token ID1402
Feature activation-2.20e-03

Pos logit contributions
 breaks+0.008
 my+0.007
 hits+0.007
 makes+0.006
er+0.006
Neg logit contributions
 crashing-0.011
free-0.011
rolling-0.011
ous-0.010
 echoing-0.010
Token patch
Token ID8529
Feature activation-8.13e-03

Pos logit contributions
 crashing+0.004
rolling+0.004
 Joey+0.004
uttering+0.004
free+0.004
Neg logit contributions
 breaks-0.003
 my-0.003
 hits-0.003
er-0.003
 makes-0.003
Token where
Token ID810
Feature activation-2.27e-02

Pos logit contributions
 crashing+0.013
ous+0.012
free+0.012
uttering+0.012
 echoing+0.012
Neg logit contributions
 breaks-0.014
 hits-0.013
 my-0.013
 draws-0.012
 makes-0.011
Token they
Token ID484
Feature activation-1.09e-02

Pos logit contributions
 crashing+0.039
rolling+0.038
free+0.038
ous+0.037
 echoing+0.037
Neg logit contributions
 breaks-0.035
 hits-0.031
 my-0.031
er-0.029
 makes-0.028
Token grow
Token ID1663
Feature activation-2.24e-02

Pos logit contributions
 crashing+0.019
rolling+0.018
ous+0.018
 echoing+0.018
free+0.018
Neg logit contributions
 breaks-0.018
 hits-0.016
 my-0.015
 makes-0.015
 draws-0.014
Token rad
Token ID2511
Feature activation-4.68e-02

Pos logit contributions
 crashing+0.043
rolling+0.042
free+0.041
ous+0.041
 echoing+0.040
Neg logit contributions
 breaks-0.031
 hits-0.027
 my-0.026
er-0.024
 makes-0.023
Tokenishes
Token ID5614
Feature activation-3.48e-02

Pos logit contributions
 crashing+0.083
 echoing+0.081
rolling+0.079
ous+0.078
free+0.077
Neg logit contributions
 breaks-0.073
 my-0.066
 hits-0.065
er-0.061
 makes-0.060
Token.
Token ID13
Feature activation-9.91e-03

Pos logit contributions
rolling+0.076
 crashing+0.075
 echoing+0.073
ous+0.072
 ent+0.072
Neg logit contributions
 breaks-0.040
 my-0.034
 hits-0.033
er-0.032
 makes-0.031
Token They
Token ID1119
Feature activation-2.38e-02

Pos logit contributions
ous+0.019
free+0.019
 echoing+0.019
 crashing+0.019
rolling+0.019
Neg logit contributions
 breaks-0.013
 hits-0.012
 my-0.012
 makes-0.011
er-0.011
Token water
Token ID1660
Feature activation-3.20e-02

Pos logit contributions
ous+0.041
rolling+0.041
 crashing+0.040
uttering+0.039
 echoing+0.039
Neg logit contributions
 breaks-0.039
 my-0.035
 makes-0.034
 hits-0.034
 draws-0.032
Token them
Token ID606
Feature activation-4.14e-02

Pos logit contributions
 crashing+0.060
rolling+0.060
free+0.058
ous+0.058
 echoing+0.058
Neg logit contributions
 breaks-0.046
 hits-0.040
 my-0.039
er-0.036
 makes-0.036