TOP ACTIVATIONS
MAX = 0.177

dark herbs each time in case they wanted to recreate their
full of delicious snacks in case he got hungry .
the store again just in case ." When they
kept them in a special case . One day , her
his thoughts out loud in case anyone was listening . One
medicine with them just in case . They gave her a
kept her eyes open in case the cone reapp eared .
imagine what her great - grand ma was like . She
belonged to his great - grand father , and he used
set up a plan in case something like this happened .
That 's your great - grand ma ," her mom replied
kin nearby to use in case she got cr umbs on
close by , just in case it needed to scare away
a secret key just in case . And that
chest was your great - grand ma 's . She left
it away , just in case someone wanted it back .
parents and their great - grand pa 's picture . They
and have a plan in case of an emergency . He
an old rope just in case . He tied one end
, just with a new case ." L

MIN ACTIVATIONS
MIN = -0.157

the same pain ?". The taller one leaned in and said
lean against it . The taller one was curious and asked
. Max had a w agging tail and a shiny collar
to go , but the taller boy had no choice .
Anna and Ben . It tilted its head and made a
was so happy ! She tilted her head back and laughed
They roared louder than the thunder . The weather
her picture bigger . She tilted her head and continued to
!" The se ag ull s and the fish
Rex stopped and tilted his head . He did
. Ell y makes a louder noise . He does not
brown fur and a w agging tail . She draws Ben
brown spots and a w agging tail . He is not
looked at Sara . It tilted its head and opened its
wet nose and a w agging tail . He bar ks
. L ily tilted the jar slowly and watched
again . It makes a louder sound . Then it breaks
a bit stiff . The taller one thought that was a
started to swing . The higher Lin ny swung , the
fluffy and had a w agging tail . Lily pointed at

INTERVAL 0.094 - 0.177
CONTAINS 0.364%

They were sorry . They wished they had listened to Mom
there was a little girl named Lily . One day ,
fox inside and they had lots of fun together . They
, a doctor , and many other things .
made a special dish with lots of y ummy vegetables and

INTERVAL 0.010 - 0.094
CONTAINS 36.756%

world around him . One day , Benny went on an
run and play outside with his friends . One day ,
. They were scared and didn 't know what to do
. It 's not good for dogs ." Max looked sad
the truck . They put books , balls , dolls ,

INTERVAL -0.073 - 0.010
CONTAINS 60.995%

said , " Of course , you can hug him as
The daughter was very happy and said , " Thank
, Ally smiled and said , " We did it !
smile . So he watched it for a while .
Ok !" Then she took the bald man 's hand and

INTERVAL -0.157 - -0.073
CONTAINS 1.886%

excited because it was so bright and attractive !
big shadows . She heard loud noises , and she smelled
asked , " Are you okay , little bug ?" The
breaks . It makes a loud noise . The
waited for the rain to slow down so she could come
Token dark
Token ID3223
Feature activation-7.53e-02

Pos logit contributions
acey+0.191
yle+0.190
ocket+0.189
ays+0.185
oll+0.176
Neg logit contributions
 nod-0.156
 bark-0.144
 joke-0.144
 be-0.142
 roaring-0.140
Token herbs
Token ID25411
Feature activation+4.32e-02

Pos logit contributions
 nod+0.094
 be+0.085
 bark+0.085
 roaring+0.085
 joke+0.085
Neg logit contributions
acey-0.164
ocket-0.163
yle-0.161
oll-0.156
ays-0.154
Token each
Token ID1123
Feature activation+2.23e-02

Pos logit contributions
yle+0.107
acey+0.106
ays+0.103
ocket+0.101
oll+0.099
Neg logit contributions
 nod-0.041
 be-0.041
 count-0.036
 bark-0.034
 joke-0.033
Token time
Token ID640
Feature activation+1.74e-02

Pos logit contributions
ays+0.058
yle+0.056
acey+0.056
ocket+0.055
oll+0.054
Neg logit contributions
 nod-0.021
 be-0.021
 moment-0.020
 bark-0.020
 joke-0.019
Token in
Token ID287
Feature activation-3.35e-02

Pos logit contributions
ocket+0.040
yle+0.040
acey+0.040
oll+0.039
ays+0.039
Neg logit contributions
 nod-0.019
 be-0.018
 bark-0.017
 joke-0.016
 roaring-0.016
Token case
Token ID1339
Feature activation+1.67e-01

Pos logit contributions
 be+0.038
 nod+0.036
 moment+0.035
 worry+0.033
 quarrel+0.031
Neg logit contributions
oll-0.083
ocket-0.080
ils-0.080
yle-0.080
asers-0.079
Token they
Token ID484
Feature activation+5.84e-02

Pos logit contributions
yle+0.402
oll+0.391
ocket+0.386
acey+0.381
ays+0.373
Neg logit contributions
 be-0.170
 nod-0.162
 bark-0.147
 moment-0.141
 shaking-0.132
Token wanted
Token ID2227
Feature activation+4.70e-02

Pos logit contributions
ays+0.141
yle+0.133
acey+0.125
oll+0.124
ocket+0.124
Neg logit contributions
 nod-0.071
 be-0.070
 bark-0.067
 quarrel-0.062
 joke-0.061
Token to
Token ID284
Feature activation-4.50e-02

Pos logit contributions
yle+0.115
ocket+0.114
acey+0.114
ays+0.113
oll+0.112
Neg logit contributions
 nod-0.044
 be-0.041
 bark-0.039
 joke-0.039
 roaring-0.037
Token recreate
Token ID32049
Feature activation+1.59e-03

Pos logit contributions
 nod+0.047
 joke+0.040
 be+0.040
 bark+0.040
 count+0.037
Neg logit contributions
ocket-0.107
acey-0.107
yle-0.106
ays-0.106
oll-0.104
Token their
Token ID511
Feature activation+5.66e-03

Pos logit contributions
ocket+0.003
yle+0.003
acey+0.003
oll+0.003
ays+0.003
Neg logit contributions
 nod-0.002
 be-0.002
 joke-0.002
 bark-0.002
 roaring-0.002
Token full
Token ID1336
Feature activation+5.76e-02

Pos logit contributions
acey+0.159
yle+0.156
ays+0.153
ocket+0.152
ils+0.149
Neg logit contributions
 nod-0.124
 be-0.117
 bark-0.114
 roaring-0.108
 count-0.107
Token of
Token ID286
Feature activation+3.13e-02

Pos logit contributions
acey+0.086
ays+0.083
yle+0.083
ocket+0.079
 Hood+0.076
Neg logit contributions
 nod-0.118
 be-0.111
 bark-0.108
 joke-0.107
 count-0.102
Token delicious
Token ID12625
Feature activation-2.98e-02

Pos logit contributions
ays+0.068
acey+0.068
yle+0.067
ocket+0.066
aser+0.063
Neg logit contributions
 nod-0.042
 bark-0.039
 joke-0.038
 be-0.037
 laugh-0.035
Token snacks
Token ID28487
Feature activation+4.42e-02

Pos logit contributions
 nod+0.050
 joke+0.045
 bark+0.045
 be+0.044
 roaring+0.044
Neg logit contributions
ocket-0.051
acey-0.050
yle-0.050
ays-0.048
oll-0.048
Token in
Token ID287
Feature activation-3.63e-02

Pos logit contributions
acey+0.118
yle+0.114
ocket+0.111
ays+0.110
asers+0.108
Neg logit contributions
 nod-0.034
 be-0.032
 bark-0.031
 count-0.028
 roaring-0.028
Token case
Token ID1339
Feature activation+1.64e-01

Pos logit contributions
 be+0.032
 nod+0.028
 bark+0.028
 matter+0.027
 moment+0.027
Neg logit contributions
oll-0.097
ocket-0.095
acey-0.095
asers-0.094
aser-0.094
Token he
Token ID339
Feature activation+6.04e-02

Pos logit contributions
acey+0.412
yle+0.407
ocket+0.402
ays+0.394
oll+0.388
Neg logit contributions
 nod-0.135
 bark-0.131
 be-0.129
 roaring-0.125
 barking-0.117
Token got
Token ID1392
Feature activation+5.34e-02

Pos logit contributions
ays+0.158
yle+0.145
ions+0.143
acey+0.140
asers+0.139
Neg logit contributions
 bark-0.066
 be-0.064
 joke-0.057
 nod-0.056
 count-0.056
Token hungry
Token ID14720
Feature activation-5.38e-03

Pos logit contributions
acey+0.126
ays+0.123
ocket+0.123
yle+0.123
oll+0.121
Neg logit contributions
 bark-0.049
 be-0.049
 nod-0.049
 roaring-0.048
 barking-0.045
Token.
Token ID13
Feature activation-8.18e-03

Pos logit contributions
 nod+0.005
 bark+0.005
 be+0.005
 joke+0.004
 roaring+0.004
Neg logit contributions
acey-0.013
yle-0.013
ocket-0.013
ays-0.013
asers-0.012
Token \n
Token ID198
Feature activation-2.15e-02

Pos logit contributions
 nod+0.007
 bark+0.006
 be+0.006
 joke+0.006
 roaring+0.006
Neg logit contributions
ocket-0.021
yle-0.020
ils-0.020
acey-0.020
ays-0.020
Token the
Token ID262
Feature activation+4.91e-03

Pos logit contributions
ocket+0.092
oll+0.088
ils+0.088
 album+0.085
ays+0.085
Neg logit contributions
 be-0.051
 listen-0.045
 nod-0.045
 roaring-0.042
 louder-0.041
Token store
Token ID3650
Feature activation+7.58e-02

Pos logit contributions
ays+0.010
acey+0.010
yle+0.009
ocket+0.009
oll+0.009
Neg logit contributions
 nod-0.007
 joke-0.007
 bark-0.007
 be-0.007
 roaring-0.007
Token again
Token ID757
Feature activation-3.43e-02

Pos logit contributions
ocket+0.198
oll+0.194
acey+0.191
ils+0.189
yle+0.189
Neg logit contributions
 be-0.067
 nod-0.064
 bark-0.057
 count-0.050
 moment-0.050
Token just
Token ID655
Feature activation-5.01e-02

Pos logit contributions
 nod+0.040
 be+0.036
 bark+0.034
 joke+0.033
 roaring+0.033
Neg logit contributions
ocket-0.078
acey-0.077
yle-0.076
ays-0.076
oll-0.075
Token in
Token ID287
Feature activation-3.79e-02

Pos logit contributions
 nod+0.058
 be+0.051
 bark+0.050
 joke+0.049
 roaring+0.047
Neg logit contributions
acey-0.114
ocket-0.113
yle-0.113
ays-0.111
oll-0.109
Token case
Token ID1339
Feature activation+1.63e-01

Pos logit contributions
 be+0.040
 nod+0.038
 moment+0.035
 roaring+0.034
 barking+0.034
Neg logit contributions
oll-0.093
ocket-0.092
asers-0.091
ils-0.091
aser-0.089
Token."
Token ID526
Feature activation-2.19e-02

Pos logit contributions
ocket+0.407
oll+0.392
yle+0.386
acey+0.380
ays+0.372
Neg logit contributions
 be-0.144
 nod-0.128
 bark-0.115
 roaring-0.108
 barking-0.107
Token \n
Token ID198
Feature activation-2.33e-02

Pos logit contributions
 nod+0.021
 bark+0.017
 be+0.017
 joke+0.015
 laugh+0.014
Neg logit contributions
ocket-0.056
ils-0.054
ays-0.054
oll-0.054
acey-0.054
Token \n
Token ID198
Feature activation-2.72e-02

Pos logit contributions
 nod+0.025
 bark+0.022
 joke+0.022
 be+0.021
 roaring+0.020
Neg logit contributions
ocket-0.054
acey-0.053
yle-0.053
ays-0.053
oll-0.052
TokenWhen
Token ID2215
Feature activation+4.28e-02

Pos logit contributions
 nod+0.035
 joke+0.031
 bark+0.031
 be+0.029
 roaring+0.029
Neg logit contributions
ocket-0.057
acey-0.057
yle-0.056
ays-0.055
oll-0.055
Token they
Token ID484
Feature activation+1.87e-02

Pos logit contributions
ocket+0.082
ays+0.081
yle+0.079
oll+0.079
acey+0.078
Neg logit contributions
 nod-0.067
 be-0.064
 bark-0.062
 joke-0.061
 roaring-0.059
Token kept
Token ID4030
Feature activation-1.10e-02

Pos logit contributions
 nod+0.014
 joke+0.012
 bark+0.012
 be+0.012
 roaring+0.012
Neg logit contributions
ocket-0.023
acey-0.023
yle-0.023
ays-0.022
oll-0.022
Token them
Token ID606
Feature activation-3.45e-02

Pos logit contributions
 nod+0.015
 be+0.014
 roaring+0.014
 joke+0.014
 bark+0.014
Neg logit contributions
ocket-0.022
oll-0.021
acey-0.021
ays-0.021
yle-0.021
Token in
Token ID287
Feature activation-4.11e-02

Pos logit contributions
 nod+0.043
 be+0.038
 bark+0.037
 laugh+0.037
 joke+0.036
Neg logit contributions
ocket-0.076
acey-0.076
ays-0.075
yle-0.074
oll-0.073
Token a
Token ID257
Feature activation+3.97e-02

Pos logit contributions
 nod+0.051
 be+0.047
 joke+0.045
 bark+0.044
 roaring+0.044
Neg logit contributions
ocket-0.090
oll-0.089
ays-0.088
acey-0.088
yle-0.088
Token special
Token ID2041
Feature activation+1.98e-02

Pos logit contributions
ocket+0.086
yle+0.085
ays+0.085
acey+0.084
oll+0.083
Neg logit contributions
 nod-0.050
 joke-0.045
 bark-0.045
 be-0.044
 roaring-0.042
Token case
Token ID1339
Feature activation+1.60e-01

Pos logit contributions
yle+0.043
acey+0.043
ocket+0.043
ays+0.042
oll+0.042
Neg logit contributions
 nod-0.025
 joke-0.023
 be-0.022
 bark-0.022
 count-0.021
Token.
Token ID13
Feature activation-2.98e-03

Pos logit contributions
yle+0.420
ocket+0.413
ays+0.410
oll+0.408
acey+0.404
Neg logit contributions
 be-0.124
 nod-0.106
 bark-0.092
 moment-0.089
 count-0.083
Token One
Token ID1881
Feature activation-1.48e-02

Pos logit contributions
 nod+0.003
 be+0.002
 bark+0.002
 joke+0.002
 barking+0.002
Neg logit contributions
ocket-0.008
ils-0.008
yle-0.007
acey-0.007
ays-0.007
Token day
Token ID1110
Feature activation+7.17e-02

Pos logit contributions
 nod+0.015
 be+0.014
 joke+0.013
 bark+0.013
 moment+0.013
Neg logit contributions
ocket-0.037
ays-0.036
yle-0.036
acey-0.036
ils-0.035
Token,
Token ID11
Feature activation-1.39e-02

Pos logit contributions
ocket+0.175
yle+0.172
acey+0.172
ays+0.171
oll+0.169
Neg logit contributions
 nod-0.070
 bark-0.059
 joke-0.058
 be-0.057
 roaring-0.055
Token her
Token ID607
Feature activation+7.05e-03

Pos logit contributions
 nod+0.017
 bark+0.016
 be+0.016
 joke+0.015
 roaring+0.015
Neg logit contributions
ocket-0.031
ays-0.031
oll-0.030
acey-0.030
ils-0.029
Token his
Token ID465
Feature activation+3.06e-02

Pos logit contributions
ocket+0.011
acey+0.011
yle+0.011
oll+0.011
ils+0.011
Neg logit contributions
 nod-0.007
 joke-0.006
 bark-0.006
 be-0.006
appointed-0.006
Token thoughts
Token ID6066
Feature activation+7.66e-03

Pos logit contributions
ocket+0.064
acey+0.063
yle+0.062
oll+0.061
ays+0.060
Neg logit contributions
 nod-0.039
 bark-0.034
 joke-0.034
 roaring-0.032
 be-0.032
Token out
Token ID503
Feature activation-1.44e-02

Pos logit contributions
ocket+0.016
acey+0.016
yle+0.016
ays+0.015
oll+0.015
Neg logit contributions
 nod-0.010
 joke-0.009
 bark-0.009
 be-0.008
 roaring-0.008
Token loud
Token ID7812
Feature activation-5.67e-02

Pos logit contributions
 nod+0.021
 roaring+0.019
 joke+0.019
 bark+0.019
 be+0.019
Neg logit contributions
acey-0.029
yle-0.028
ocket-0.028
ays-0.028
oll-0.027
Token in
Token ID287
Feature activation-3.63e-02

Pos logit contributions
 nod+0.067
 bark+0.060
 joke+0.059
 roaring+0.057
 be+0.057
Neg logit contributions
acey-0.125
ocket-0.125
yle-0.124
ays-0.121
oll-0.120
Token case
Token ID1339
Feature activation+1.60e-01

Pos logit contributions
 be+0.035
 roaring+0.035
 nod+0.034
 bark+0.034
 joke+0.034
Neg logit contributions
acey-0.090
oll-0.089
ocket-0.088
ays-0.088
yle-0.086
Token anyone
Token ID2687
Feature activation+4.49e-02

Pos logit contributions
acey+0.381
yle+0.380
ocket+0.375
ays+0.373
oll+0.362
Neg logit contributions
 nod-0.165
 bark-0.151
 roaring-0.148
 be-0.142
 joke-0.139
Token was
Token ID373
Feature activation+3.19e-02

Pos logit contributions
acey+0.080
ays+0.079
ocket+0.077
yle+0.077
oll+0.076
Neg logit contributions
 nod-0.074
 bark-0.069
 joke-0.069
 be-0.068
 roaring-0.068
Token listening
Token ID8680
Feature activation-4.15e-02

Pos logit contributions
ocket+0.066
acey+0.066
yle+0.066
ays+0.064
oll+0.064
Neg logit contributions
 nod-0.041
 joke-0.036
 bark-0.036
 be-0.035
 roaring-0.035
Token.
Token ID13
Feature activation-7.54e-03

Pos logit contributions
 nod+0.051
 joke+0.044
 bark+0.043
 be+0.042
 roaring+0.041
Neg logit contributions
ocket-0.088
acey-0.087
yle-0.086
ays-0.085
oll-0.084
Token One
Token ID1881
Feature activation-1.52e-02

Pos logit contributions
 nod+0.010
 joke+0.010
sing+0.010
 matter+0.009
 laugh+0.009
Neg logit contributions
acey-0.014
ays-0.014
yle-0.014
oll-0.013
ions-0.013
Token medicine
Token ID9007
Feature activation+5.44e-02

Pos logit contributions
yle+0.041
acey+0.040
ocket+0.040
ays+0.039
oll+0.038
Neg logit contributions
 nod-0.034
 joke-0.031
 be-0.030
 bark-0.030
 count-0.029
Token with
Token ID351
Feature activation+4.71e-02

Pos logit contributions
yle+0.142
acey+0.140
ocket+0.137
ays+0.135
asers+0.129
Neg logit contributions
 nod-0.045
 bark-0.039
 be-0.039
 joke-0.036
 roaring-0.033
Token them
Token ID606
Feature activation-1.81e-02

Pos logit contributions
yle+0.100
ocket+0.100
ays+0.098
acey+0.097
oll+0.096
Neg logit contributions
 nod-0.059
 be-0.056
 bark-0.055
 joke-0.053
 roaring-0.051
Token just
Token ID655
Feature activation-5.73e-02

Pos logit contributions
 nod+0.020
 bark+0.018
 be+0.017
 joke+0.017
 roaring+0.016
Neg logit contributions
yle-0.042
acey-0.042
ocket-0.041
ays-0.041
oll-0.040
Token in
Token ID287
Feature activation-4.09e-02

Pos logit contributions
 nod+0.081
 joke+0.072
 bark+0.072
 be+0.070
 roaring+0.070
Neg logit contributions
ocket-0.112
acey-0.112
yle-0.111
ays-0.109
oll-0.108
Token case
Token ID1339
Feature activation+1.59e-01

Pos logit contributions
 nod+0.053
 be+0.049
 joke+0.046
 bark+0.046
 roaring+0.044
Neg logit contributions
ocket-0.089
oll-0.088
yle-0.087
acey-0.086
ays-0.085
Token.
Token ID13
Feature activation-2.36e-03

Pos logit contributions
yle+0.387
ocket+0.386
acey+0.379
oll+0.376
ays+0.370
Neg logit contributions
 nod-0.141
 be-0.136
 bark-0.129
 barking-0.116
 roaring-0.114
Token They
Token ID1119
Feature activation-1.07e-02

Pos logit contributions
 nod+0.002
 be+0.002
 bark+0.002
 barking+0.002
 ï¿½+0.002
Neg logit contributions
ocket-0.006
ils-0.006
yle-0.006
asers-0.005
oll-0.005
Token gave
Token ID2921
Feature activation+4.71e-02

Pos logit contributions
 nod+0.014
 bark+0.013
 be+0.012
 laugh+0.012
 joke+0.011
Neg logit contributions
ays-0.024
yle-0.023
ocket-0.023
acey-0.023
overs-0.022
Token her
Token ID607
Feature activation+2.07e-02

Pos logit contributions
ocket+0.113
oll+0.110
ays+0.107
yle+0.106
lor+0.105
Neg logit contributions
 be-0.050
 nod-0.047
 bark-0.045
 louder-0.043
 laugh-0.043
Token a
Token ID257
Feature activation+2.62e-02

Pos logit contributions
ays+0.054
yle+0.052
ocket+0.052
och+0.052
angers+0.051
Neg logit contributions
 nod-0.020
 be-0.017
 bark-0.016
 laugh-0.016
 joke-0.015
Token kept
Token ID4030
Feature activation-1.89e-02

Pos logit contributions
ocket+0.029
acey+0.029
yle+0.029
ays+0.028
oll+0.028
Neg logit contributions
 nod-0.018
 joke-0.016
 bark-0.015
 be-0.015
 roaring-0.015
Token her
Token ID607
Feature activation+1.92e-02

Pos logit contributions
 nod+0.025
 bark+0.023
 joke+0.023
 roaring+0.022
 be+0.022
Neg logit contributions
ocket-0.039
acey-0.038
yle-0.038
ays-0.037
oll-0.037
Token eyes
Token ID2951
Feature activation-2.86e-02

Pos logit contributions
acey+0.043
ocket+0.042
yle+0.042
ays+0.042
oll+0.041
Neg logit contributions
 nod-0.023
 joke-0.020
 bark-0.020
 be-0.020
 roaring-0.019
Token open
Token ID1280
Feature activation-5.88e-02

Pos logit contributions
 nod+0.036
 joke+0.035
 bark+0.033
 roaring+0.032
 be+0.031
Neg logit contributions
ocket-0.058
acey-0.057
yle-0.056
oll-0.056
ays-0.054
Token in
Token ID287
Feature activation-4.35e-02

Pos logit contributions
 nod+0.060
 be+0.052
 bark+0.051
 roaring+0.049
 count+0.048
Neg logit contributions
acey-0.139
ocket-0.138
yle-0.136
oll-0.135
ays-0.132
Token case
Token ID1339
Feature activation+1.59e-01

Pos logit contributions
 nod+0.054
 be+0.048
 bark+0.045
 moment+0.044
 joke+0.044
Neg logit contributions
ocket-0.096
oll-0.096
acey-0.096
ays-0.095
yle-0.094
Token the
Token ID262
Feature activation+1.72e-02

Pos logit contributions
ocket+0.393
yle+0.382
ays+0.382
acey+0.382
oll+0.380
Neg logit contributions
 nod-0.141
 be-0.122
 bark-0.117
 louder-0.106
 moment-0.102
Token cone
Token ID27763
Feature activation-3.17e-02

Pos logit contributions
acey+0.035
ays+0.035
yle+0.034
ocket+0.034
ions+0.033
Neg logit contributions
 nod-0.024
 bark-0.022
 joke-0.022
 be-0.021
 roaring-0.020
Token reapp
Token ID24578
Feature activation+5.31e-02

Pos logit contributions
 nod+0.032
 be+0.029
 bark+0.029
 count+0.027
 joke+0.026
Neg logit contributions
acey-0.075
yle-0.074
ocket-0.074
ays-0.074
ils-0.071
Tokeneared
Token ID3380
Feature activation-7.11e-03

Pos logit contributions
ocket+0.085
yle+0.084
acey+0.084
ays+0.082
oll+0.080
Neg logit contributions
 nod-0.092
 be-0.085
 bark-0.085
 joke-0.083
 roaring-0.083
Token.
Token ID13
Feature activation-2.75e-03

Pos logit contributions
 nod+0.006
 be+0.006
 bark+0.006
 roaring+0.006
 joke+0.006
Neg logit contributions
acey-0.017
ocket-0.017
ays-0.017
yle-0.017
oll-0.016
Token imagine
Token ID5967
Feature activation+3.57e-02

Pos logit contributions
appointed+0.055
eeper+0.052
dy+0.051
 reasoning+0.050
 roaring+0.049
Neg logit contributions
yle-0.086
ocket-0.084
acey-0.083
iza-0.078
 album-0.078
Token what
Token ID644
Feature activation+5.94e-02

Pos logit contributions
acey+0.070
yle+0.067
ays+0.067
ocket+0.067
oll+0.065
Neg logit contributions
 nod-0.053
 be-0.050
 bark-0.049
 roaring-0.049
 joke-0.049
Token her
Token ID607
Feature activation+2.88e-02

Pos logit contributions
ays+0.113
ocket+0.111
acey+0.109
oll+0.107
yle+0.107
Neg logit contributions
 nod-0.091
 joke-0.088
 bark-0.087
 be-0.087
 matter-0.081
Token great
Token ID1049
Feature activation+5.72e-03

Pos logit contributions
acey+0.061
ocket+0.061
ays+0.060
yle+0.060
ions+0.059
Neg logit contributions
 nod-0.037
 joke-0.033
 bark-0.033
 be-0.033
 count-0.031
Token-
Token ID12
Feature activation-1.44e-02

Pos logit contributions
ocket+0.013
acey+0.013
yle+0.013
ays+0.013
oll+0.013
Neg logit contributions
 nod-0.007
 joke-0.006
 bark-0.006
 be-0.006
 roaring-0.005
Tokengrand
Token ID23936
Feature activation+1.57e-01

Pos logit contributions
 nod+0.031
 bark+0.030
 joke+0.029
 be+0.029
 count+0.028
Neg logit contributions
ays-0.018
acey-0.017
ocket-0.017
ions-0.016
yle-0.016
Tokenma
Token ID2611
Feature activation+1.28e-02

Pos logit contributions
ocket+0.210
acey+0.208
yle+0.207
ays+0.201
oll+0.199
Neg logit contributions
 nod-0.318
 joke-0.294
 bark-0.293
 be-0.290
 roaring-0.287
Token was
Token ID373
Feature activation+2.53e-02

Pos logit contributions
ays+0.028
acey+0.028
yle+0.028
ocket+0.028
ions+0.027
Neg logit contributions
 nod-0.015
 be-0.014
 bark-0.014
 joke-0.013
 count-0.013
Token like
Token ID588
Feature activation-1.20e-02

Pos logit contributions
acey+0.056
yle+0.056
ocket+0.056
ays+0.055
oll+0.055
Neg logit contributions
 nod-0.029
 joke-0.025
 bark-0.025
 be-0.025
 roaring-0.025
Token.
Token ID13
Feature activation+1.40e-03

Pos logit contributions
 be+0.011
 bark+0.010
 nod+0.010
 roaring+0.010
 joke+0.009
Neg logit contributions
ays-0.030
acey-0.029
yle-0.028
oll-0.028
ocket-0.028
Token She
Token ID1375
Feature activation+1.91e-02

Pos logit contributions
ocket+0.003
yle+0.003
acey+0.003
ils+0.003
ays+0.003
Neg logit contributions
 nod-0.002
 bark-0.002
 be-0.002
 joke-0.001
appointed-0.001
Token belonged
Token ID19611
Feature activation+2.43e-02

Pos logit contributions
yle+0.068
acey+0.068
ocket+0.067
ays+0.067
oll+0.066
Neg logit contributions
 nod-0.044
 be-0.041
 bark-0.041
 joke-0.040
 roaring-0.038
Token to
Token ID284
Feature activation-3.92e-02

Pos logit contributions
acey+0.066
yle+0.065
ays+0.065
oll+0.065
ocket+0.064
Neg logit contributions
 be-0.013
 bark-0.012
 nod-0.011
 count-0.011
 shaking-0.010
Token his
Token ID465
Feature activation+1.23e-02

Pos logit contributions
 nod+0.049
 be+0.048
 joke+0.046
 bark+0.045
 roaring+0.044
Neg logit contributions
ays-0.084
ocket-0.083
acey-0.083
oll-0.082
yle-0.082
Token great
Token ID1049
Feature activation+3.98e-04

Pos logit contributions
ays+0.025
ocket+0.025
acey+0.024
ions+0.024
yle+0.024
Neg logit contributions
 nod-0.017
 joke-0.016
 be-0.016
 bark-0.015
 count-0.015
Token-
Token ID12
Feature activation-1.25e-02

Pos logit contributions
ocket+0.001
acey+0.001
yle+0.001
ays+0.001
oll+0.001
Neg logit contributions
 nod-0.001
 joke-0.001
 bark-0.001
 be-0.001
 roaring-0.000
Tokengrand
Token ID23936
Feature activation+1.57e-01

Pos logit contributions
 nod+0.024
 joke+0.023
 bark+0.023
 be+0.022
 moment+0.022
Neg logit contributions
ays-0.019
acey-0.017
ocket-0.017
yle-0.017
oll-0.016
Tokenfather
Token ID11358
Feature activation+6.01e-02

Pos logit contributions
ocket+0.226
yle+0.225
acey+0.224
ays+0.223
oll+0.216
Neg logit contributions
 nod-0.297
 joke-0.275
 be-0.274
 bark-0.272
 roaring-0.267
Token,
Token ID11
Feature activation+5.56e-04

Pos logit contributions
ays+0.153
acey+0.146
yle+0.144
ocket+0.142
umbers+0.140
Neg logit contributions
 be-0.054
 nod-0.053
 bark-0.048
 count-0.047
 joke-0.046
Token and
Token ID290
Feature activation+1.67e-02

Pos logit contributions
ays+0.001
ocket+0.001
acey+0.001
ils+0.001
yle+0.001
Neg logit contributions
 nod-0.001
 be-0.001
 bark-0.001
 joke-0.001
 roaring-0.001
Token he
Token ID339
Feature activation+3.97e-02

Pos logit contributions
ays+0.038
acey+0.037
yle+0.037
ocket+0.037
oll+0.036
Neg logit contributions
 be-0.021
 nod-0.020
 bark-0.019
 joke-0.019
 count-0.018
Token used
Token ID973
Feature activation+4.16e-02

Pos logit contributions
ays+0.094
yle+0.091
ocket+0.091
ions+0.090
acey+0.090
Neg logit contributions
 nod-0.044
 be-0.042
 bark-0.041
 joke-0.039
 count-0.036
Token set
Token ID900
Feature activation+1.11e-02

Pos logit contributions
acey+0.087
ocket+0.087
yle+0.086
oll+0.084
ays+0.083
Neg logit contributions
 nod-0.062
 joke-0.055
 bark-0.054
 roaring-0.053
 be-0.052
Token up
Token ID510
Feature activation-5.15e-02

Pos logit contributions
ocket+0.022
acey+0.022
yle+0.021
ays+0.021
oll+0.021
Neg logit contributions
 nod-0.016
 bark-0.014
 joke-0.014
 be-0.014
 roaring-0.014
Token a
Token ID257
Feature activation+3.53e-02

Pos logit contributions
 nod+0.069
 bark+0.062
 joke+0.062
 roaring+0.061
 be+0.061
Neg logit contributions
ocket-0.106
acey-0.105
yle-0.104
oll-0.103
ays-0.103
Token plan
Token ID1410
Feature activation+7.76e-02

Pos logit contributions
acey+0.063
ocket+0.061
yle+0.061
 register+0.057
iza+0.056
Neg logit contributions
 nod-0.055
appointed-0.048
 count-0.047
 bark-0.046
 roaring-0.046
Token in
Token ID287
Feature activation-4.03e-02

Pos logit contributions
acey+0.195
ays+0.192
yle+0.192
ocket+0.192
oll+0.187
Neg logit contributions
 nod-0.068
 be-0.062
 bark-0.061
 joke-0.057
 roaring-0.056
Token case
Token ID1339
Feature activation+1.57e-01

Pos logit contributions
 be+0.048
 nod+0.048
 bark+0.043
 barking+0.042
 moment+0.042
Neg logit contributions
oll-0.094
ocket-0.092
ays-0.090
acey-0.090
yle-0.089
Token something
Token ID1223
Feature activation+5.21e-02

Pos logit contributions
ocket+0.368
yle+0.363
oll+0.360
acey+0.358
ays+0.353
Neg logit contributions
 nod-0.153
 be-0.151
 bark-0.144
 barking-0.138
 shaking-0.133
Token like
Token ID588
Feature activation-6.67e-03

Pos logit contributions
ocket+0.107
acey+0.107
yle+0.106
ays+0.106
oll+0.103
Neg logit contributions
 nod-0.070
 joke-0.064
 bark-0.064
 be-0.062
 roaring-0.060
Token this
Token ID428
Feature activation+5.22e-02

Pos logit contributions
 be+0.008
 joke+0.008
 bark+0.007
 roaring+0.007
 shaking+0.007
Neg logit contributions
ays-0.014
acey-0.014
yle-0.014
ocket-0.014
oll-0.013
Token happened
Token ID3022
Feature activation-6.52e-03

Pos logit contributions
ocket+0.121
yle+0.120
ays+0.119
acey+0.119
oll+0.116
Neg logit contributions
 nod-0.059
 joke-0.054
 be-0.053
 bark-0.052
 matter-0.048
Token.
Token ID13
Feature activation-1.11e-03

Pos logit contributions
 nod+0.009
 joke+0.008
 bark+0.008
 be+0.007
 roaring+0.007
Neg logit contributions
ocket-0.013
acey-0.013
yle-0.013
ays-0.013
oll-0.013
TokenThat
Token ID2504
Feature activation+1.22e-02

Pos logit contributions
ocket+0.051
acey+0.049
oll+0.049
ils+0.048
ays+0.048
Neg logit contributions
 nod-0.041
 bark-0.037
 be-0.037
 joke-0.037
 count-0.034
Token's
Token ID338
Feature activation-4.65e-02

Pos logit contributions
ocket+0.028
oll+0.027
yle+0.027
acey+0.027
ays+0.026
Neg logit contributions
 nod-0.014
 be-0.014
 bark-0.014
 joke-0.013
 count-0.012
Token your
Token ID534
Feature activation+1.85e-02

Pos logit contributions
 nod+0.067
 joke+0.060
 bark+0.060
 be+0.060
 roaring+0.058
Neg logit contributions
ocket-0.090
yle-0.090
acey-0.090
ays-0.088
oll-0.087
Token great
Token ID1049
Feature activation+7.97e-03

Pos logit contributions
ays+0.040
oll+0.039
ocket+0.039
ions+0.039
umbers+0.039
Neg logit contributions
 nod-0.024
 be-0.023
 joke-0.022
 bark-0.022
 count-0.021
Token-
Token ID12
Feature activation-1.52e-02

Pos logit contributions
ocket+0.017
acey+0.017
yle+0.017
ays+0.016
oll+0.016
Neg logit contributions
 nod-0.010
 joke-0.009
 bark-0.009
 be-0.009
 count-0.009
Tokengrand
Token ID23936
Feature activation+1.57e-01

Pos logit contributions
 nod+0.032
 bark+0.031
 joke+0.031
 be+0.030
 count+0.029
Neg logit contributions
ays-0.020
acey-0.018
yle-0.018
ocket-0.017
oll-0.017
Tokenma
Token ID2611
Feature activation+9.60e-03

Pos logit contributions
ocket+0.218
acey+0.217
yle+0.216
ays+0.210
oll+0.207
Neg logit contributions
 nod-0.309
 joke-0.285
 bark-0.284
 be-0.281
 roaring-0.278
Token,"
Token ID553
Feature activation+2.61e-02

Pos logit contributions
yle+0.022
ays+0.022
acey+0.022
ocket+0.022
ions+0.021
Neg logit contributions
 be-0.011
 nod-0.010
 count-0.009
 bark-0.009
 roaring-0.008
Token her
Token ID607
Feature activation+4.31e-03

Pos logit contributions
ocket+0.047
acey+0.046
yle+0.046
ays+0.046
oll+0.045
Neg logit contributions
 nod-0.043
 bark-0.039
 be-0.038
 joke-0.038
 roaring-0.036
Token mom
Token ID1995
Feature activation-2.51e-02

Pos logit contributions
ocket+0.009
acey+0.009
yle+0.009
ays+0.008
oll+0.008
Neg logit contributions
 nod-0.006
 joke-0.005
 bark-0.005
 roaring-0.005
 be-0.005
Token replied
Token ID8712
Feature activation+4.21e-02

Pos logit contributions
 nod+0.042
sing+0.042
 joke+0.041
 roaring+0.041
 moment+0.038
Neg logit contributions
acey-0.040
yle-0.039
oll-0.039
ocket-0.038
aser-0.035
Tokenkin
Token ID5116
Feature activation+4.40e-02

Pos logit contributions
ays+0.079
yle+0.075
acey+0.072
 Hood+0.070
asers+0.070
Neg logit contributions
 be-0.071
 nod-0.069
 joke-0.066
 moment-0.065
 bark-0.062
Token nearby
Token ID6716
Feature activation-4.03e-02

Pos logit contributions
yle+0.112
acey+0.111
ays+0.110
ocket+0.106
asers+0.106
Neg logit contributions
 nod-0.040
 be-0.039
 bark-0.034
 joke-0.032
 moment-0.032
Token to
Token ID284
Feature activation-3.71e-02

Pos logit contributions
 nod+0.036
 be+0.034
 bark+0.034
 joke+0.033
 count+0.030
Neg logit contributions
ays-0.102
acey-0.102
yle-0.101
asers-0.099
ocket-0.099
Token use
Token ID779
Feature activation+3.89e-02

Pos logit contributions
 nod+0.040
 joke+0.034
 bark+0.034
 be+0.033
 roaring+0.032
Neg logit contributions
ocket-0.086
acey-0.085
yle-0.085
ays-0.083
oll-0.083
Token in
Token ID287
Feature activation-4.59e-02

Pos logit contributions
yle+0.090
ays+0.089
ocket+0.088
acey+0.088
oll+0.087
Neg logit contributions
 be-0.040
 nod-0.038
 bark-0.035
 joke-0.034
 count-0.034
Token case
Token ID1339
Feature activation+1.57e-01

Pos logit contributions
 be+0.055
 nod+0.052
 joke+0.049
 moment+0.048
 count+0.047
Neg logit contributions
ays-0.101
oll-0.101
ocket-0.100
yle-0.099
asers-0.098
Token she
Token ID673
Feature activation+7.65e-02

Pos logit contributions
ays+0.327
yle+0.324
ocket+0.317
oll+0.313
acey+0.312
Neg logit contributions
 be-0.176
 nod-0.170
 bark-0.160
 pur-0.154
 louder-0.152
Token got
Token ID1392
Feature activation+5.88e-02

Pos logit contributions
ays+0.174
yle+0.168
ions+0.167
acey+0.162
ocket+0.161
Neg logit contributions
 be-0.092
 nod-0.090
 bark-0.089
 joke-0.085
 count-0.085
Token cr
Token ID1067
Feature activation-8.14e-03

Pos logit contributions
ocket+0.137
yle+0.136
ays+0.134
acey+0.133
oll+0.130
Neg logit contributions
 nod-0.059
 be-0.057
 bark-0.054
 joke-0.052
 roaring-0.049
Tokenumbs
Token ID18146
Feature activation+3.71e-02

Pos logit contributions
 nod+0.017
 be+0.015
 joke+0.015
 bark+0.015
 count+0.015
Neg logit contributions
yle-0.010
ocket-0.010
acey-0.010
ils-0.009
iza-0.009
Token on
Token ID319
Feature activation+8.56e-03

Pos logit contributions
yle+0.097
acey+0.095
ocket+0.090
ays+0.089
asers+0.089
Neg logit contributions
 nod-0.034
 be-0.029
 bark-0.028
 laugh-0.026
 shake-0.025
Token close
Token ID1969
Feature activation-5.44e-02

Pos logit contributions
 nod+0.010
 be+0.009
 bark+0.009
 joke+0.009
 roaring+0.008
Neg logit contributions
acey-0.017
ocket-0.017
yle-0.017
oll-0.016
ays-0.016
Token by
Token ID416
Feature activation-5.69e-02

Pos logit contributions
 nod+0.065
 bark+0.056
 be+0.056
 joke+0.055
 roaring+0.054
Neg logit contributions
acey-0.121
ocket-0.120
yle-0.119
oll-0.117
ays-0.115
Token,
Token ID11
Feature activation-8.60e-03

Pos logit contributions
 nod+0.075
 joke+0.066
 bark+0.066
 be+0.063
 roaring+0.063
Neg logit contributions
ocket-0.117
acey-0.116
yle-0.116
ays-0.113
oll-0.112
Token just
Token ID655
Feature activation-6.91e-02

Pos logit contributions
 nod+0.011
 joke+0.010
 bark+0.010
 count+0.009
 be+0.009
Neg logit contributions
ocket-0.017
yle-0.017
acey-0.017
ays-0.017
oll-0.016
Token in
Token ID287
Feature activation-4.23e-02

Pos logit contributions
 nod+0.091
 joke+0.081
 bark+0.081
 roaring+0.077
 be+0.075
Neg logit contributions
ocket-0.138
yle-0.136
acey-0.135
oll-0.132
ays-0.132
Token case
Token ID1339
Feature activation+1.56e-01

Pos logit contributions
 nod+0.049
 be+0.045
 roaring+0.042
 joke+0.042
 bark+0.042
Neg logit contributions
ocket-0.098
acey-0.097
yle-0.097
oll-0.094
ays-0.094
Token it
Token ID340
Feature activation-7.38e-03

Pos logit contributions
ocket+0.327
acey+0.320
yle+0.316
oll+0.314
ays+0.306
Neg logit contributions
 nod-0.210
 be-0.201
 bark-0.185
 roaring-0.182
 count-0.174
Token needed
Token ID2622
Feature activation+9.79e-02

Pos logit contributions
 nod+0.010
 be+0.009
 bark+0.009
 joke+0.009
 count+0.008
Neg logit contributions
ocket-0.016
yle-0.016
acey-0.016
ays-0.015
oll-0.015
Token to
Token ID284
Feature activation-3.74e-02

Pos logit contributions
ocket+0.208
acey+0.207
yle+0.206
ays+0.203
oll+0.201
Neg logit contributions
 nod-0.122
 joke-0.108
 bark-0.107
 be-0.105
 roaring-0.103
Token scare
Token ID19437
Feature activation+9.90e-03

Pos logit contributions
 nod+0.034
 bark+0.028
 joke+0.028
 roaring+0.027
 be+0.026
Neg logit contributions
ocket-0.091
acey-0.091
yle-0.091
ays-0.088
oll-0.088
Token away
Token ID1497
Feature activation+1.80e-02

Pos logit contributions
oll+0.021
acey+0.021
ocket+0.021
ays+0.021
yle+0.020
Neg logit contributions
 nod-0.013
 be-0.013
 bark-0.012
 joke-0.011
 roaring-0.011
Token a
Token ID257
Feature activation+7.97e-03

Pos logit contributions
 nod+0.024
 joke+0.021
 bark+0.021
 be+0.020
 roaring+0.020
Neg logit contributions
ocket-0.034
acey-0.034
yle-0.034
ays-0.033
oll-0.032
Token secret
Token ID3200
Feature activation+2.49e-02

Pos logit contributions
ocket+0.014
acey+0.014
yle+0.014
oll+0.013
 register+0.013
Neg logit contributions
 nod-0.011
appointed-0.011
 roaring-0.010
 bark-0.010
 joke-0.010
Token key
Token ID1994
Feature activation+1.99e-02

Pos logit contributions
acey+0.055
yle+0.054
ays+0.054
ocket+0.052
oll+0.052
Neg logit contributions
 nod-0.031
 be-0.028
 joke-0.027
 bark-0.027
 count-0.026
Token just
Token ID655
Feature activation-6.21e-02

Pos logit contributions
acey+0.049
yle+0.048
ays+0.047
ocket+0.046
ils+0.046
Neg logit contributions
 be-0.019
 nod-0.019
 count-0.017
 bark-0.016
 moment-0.016
Token in
Token ID287
Feature activation-4.00e-02

Pos logit contributions
 nod+0.083
 be+0.073
 bark+0.072
 joke+0.071
 roaring+0.070
Neg logit contributions
acey-0.129
yle-0.128
ocket-0.127
ays-0.126
oll-0.123
Token case
Token ID1339
Feature activation+1.56e-01

Pos logit contributions
 nod+0.048
 be+0.047
 moment+0.043
 count+0.042
 matter+0.041
Neg logit contributions
ays-0.088
ocket-0.088
oll-0.088
acey-0.087
yle-0.087
Token.
Token ID13
Feature activation-3.79e-03

Pos logit contributions
yle+0.365
ays+0.362
acey+0.358
ocket+0.356
oll+0.356
Neg logit contributions
 nod-0.145
 be-0.144
 bark-0.123
 moment-0.122
 count-0.120
Token
Token ID198
Feature activation-1.91e-02

Pos logit contributions
 nod+0.004
 be+0.003
 bark+0.003
 count+0.003
 joke+0.003
Neg logit contributions
ocket-0.009
ils-0.009
yle-0.009
acey-0.009
ays-0.009
Token
Token ID198
Feature activation-2.34e-02

Pos logit contributions
 nod+0.020
 bark+0.017
 joke+0.017
 be+0.017
 roaring+0.016
Neg logit contributions
ocket-0.044
yle-0.044
acey-0.044
ays-0.043
oll-0.043
TokenAnd
Token ID1870
Feature activation+3.15e-02

Pos logit contributions
 nod+0.028
 joke+0.025
 bark+0.025
 be+0.024
 roaring+0.023
Neg logit contributions
ocket-0.051
acey-0.050
yle-0.050
ays-0.050
oll-0.049
Token that
Token ID326
Feature activation+3.08e-02

Pos logit contributions
ays+0.069
acey+0.064
umbers+0.063
ocket+0.063
oll+0.062
Neg logit contributions
 be-0.046
 nod-0.045
 bark-0.041
 laugh-0.040
 count-0.040
Token chest
Token ID7721
Feature activation+1.95e-02

Pos logit contributions
ocket+0.028
acey+0.027
yle+0.027
ays+0.027
oll+0.027
Neg logit contributions
 nod-0.020
 joke-0.018
 bark-0.018
 roaring-0.017
 be-0.017
Token was
Token ID373
Feature activation+1.41e-02

Pos logit contributions
yle+0.043
acey+0.041
ays+0.040
ocket+0.040
ions+0.040
Neg logit contributions
 be-0.026
 nod-0.024
 bark-0.023
 count-0.022
 joke-0.022
Token your
Token ID534
Feature activation+1.51e-02

Pos logit contributions
yle+0.032
ays+0.032
acey+0.032
ocket+0.031
ions+0.030
Neg logit contributions
 be-0.016
 nod-0.015
 bark-0.015
 roaring-0.014
 count-0.014
Token great
Token ID1049
Feature activation+9.83e-03

Pos logit contributions
ays+0.033
acey+0.031
ions+0.030
ils+0.030
umbers+0.030
Neg logit contributions
 be-0.022
 nod-0.021
 joke-0.020
 count-0.020
 bark-0.019
Token-
Token ID12
Feature activation-1.40e-02

Pos logit contributions
acey+0.020
yle+0.020
ays+0.020
ocket+0.020
ions+0.019
Neg logit contributions
 nod-0.014
 joke-0.013
 be-0.012
 bark-0.012
 count-0.012
Tokengrand
Token ID23936
Feature activation+1.56e-01

Pos logit contributions
 nod+0.027
 bark+0.026
 joke+0.026
 be+0.025
 moment+0.025
Neg logit contributions
ays-0.021
acey-0.019
yle-0.019
ions-0.018
oll-0.017
Tokenma
Token ID2611
Feature activation+1.02e-02

Pos logit contributions
yle+0.232
acey+0.230
ocket+0.230
ays+0.228
oll+0.220
Neg logit contributions
 nod-0.292
 be-0.268
 joke-0.267
 bark-0.264
 roaring-0.259
Token's
Token ID338
Feature activation-2.88e-02

Pos logit contributions
yle+0.026
ays+0.025
acey+0.025
umbers+0.025
ions+0.024
Neg logit contributions
 be-0.010
 nod-0.009
 bark-0.008
 count-0.008
 joke-0.007
Token.
Token ID13
Feature activation+2.35e-03

Pos logit contributions
 be+0.041
 nod+0.040
 joke+0.037
 bark+0.037
 count+0.035
Neg logit contributions
ays-0.064
acey-0.060
yle-0.059
ions-0.058
umbers-0.058
Token She
Token ID1375
Feature activation+1.89e-02

Pos logit contributions
ils+0.005
ocket+0.005
yle+0.005
asers+0.005
acey+0.005
Neg logit contributions
 nod-0.003
 be-0.003
 Just-0.002
 count-0.002
 But-0.002
Token left
Token ID1364
Feature activation+3.61e-02

Pos logit contributions
ays+0.044
yle+0.043
ions+0.042
acey+0.040
asers+0.040
Neg logit contributions
 be-0.023
 nod-0.022
 bark-0.022
 count-0.020
 joke-0.020
Token it
Token ID340
Feature activation+1.58e-02

Pos logit contributions
ocket+0.322
oll+0.316
acey+0.306
yle+0.304
ils+0.304
Neg logit contributions
 be-0.103
 nod-0.079
 bark-0.079
 moment-0.072
 count-0.070
Token away
Token ID1497
Feature activation+2.20e-02

Pos logit contributions
yle+0.040
acey+0.039
ocket+0.039
oll+0.038
asers+0.038
Neg logit contributions
 be-0.016
 bark-0.013
 nod-0.013
 count-0.013
 moment-0.012
Token,
Token ID11
Feature activation-7.82e-03

Pos logit contributions
acey+0.053
yle+0.053
ocket+0.053
ays+0.052
oll+0.052
Neg logit contributions
 nod-0.022
 be-0.021
 bark-0.019
 count-0.019
 joke-0.019
Token just
Token ID655
Feature activation-6.55e-02

Pos logit contributions
 nod+0.008
 be+0.007
 bark+0.007
 joke+0.007
 roaring+0.007
Neg logit contributions
ocket-0.019
acey-0.019
ays-0.019
yle-0.019
oll-0.018
Token in
Token ID287
Feature activation-4.12e-02

Pos logit contributions
 nod+0.085
 be+0.077
 bark+0.075
 joke+0.074
 roaring+0.072
Neg logit contributions
acey-0.140
yle-0.138
ocket-0.138
ays-0.137
oll-0.133
Token case
Token ID1339
Feature activation+1.56e-01

Pos logit contributions
 nod+0.045
 be+0.044
 count+0.039
 moment+0.038
 joke+0.038
Neg logit contributions
ocket-0.097
yle-0.097
acey-0.096
ays-0.096
oll-0.095
Token someone
Token ID2130
Feature activation+5.58e-02

Pos logit contributions
ocket+0.349
oll+0.344
yle+0.338
acey+0.336
ays+0.329
Neg logit contributions
 be-0.172
 nod-0.156
 bark-0.137
 shaking-0.134
 twist-0.133
Token wanted
Token ID2227
Feature activation+4.41e-02

Pos logit contributions
ays+0.111
acey+0.106
ocket+0.106
yle+0.105
oll+0.104
Neg logit contributions
 nod-0.083
 be-0.082
 bark-0.078
 count-0.077
 laugh-0.073
Token it
Token ID340
Feature activation-4.51e-03

Pos logit contributions
ays+0.105
yle+0.105
acey+0.105
ocket+0.104
oll+0.102
Neg logit contributions
 nod-0.043
 be-0.042
 bark-0.039
 joke-0.039
 count-0.037
Token back
Token ID736
Feature activation-2.80e-02

Pos logit contributions
 nod+0.004
 be+0.004
 bark+0.004
 joke+0.004
 count+0.004
Neg logit contributions
acey-0.011
yle-0.011
ocket-0.011
ays-0.011
oll-0.010
Token.
Token ID13
Feature activation-8.16e-03

Pos logit contributions
 nod+0.027
 be+0.026
 bark+0.023
 count+0.023
 moment+0.023
Neg logit contributions
acey-0.070
ocket-0.068
yle-0.067
oll-0.066
ays-0.066
Token parents
Token ID3397
Feature activation+5.37e-02

Pos logit contributions
 nod+0.019
 joke+0.018
 be+0.018
 bark+0.018
 roaring+0.017
Neg logit contributions
ays-0.025
ions-0.025
oll-0.025
acey-0.024
umbers-0.024
Token and
Token ID290
Feature activation+2.22e-02

Pos logit contributions
ocket+0.115
acey+0.115
yle+0.114
ays+0.112
oll+0.111
Neg logit contributions
 nod-0.066
 bark-0.058
 joke-0.058
 be-0.056
 roaring-0.056
Token their
Token ID511
Feature activation-8.25e-03

Pos logit contributions
acey+0.055
ays+0.055
yle+0.054
ocket+0.054
oll+0.053
Neg logit contributions
 nod-0.023
 be-0.019
 bark-0.019
 joke-0.019
 laugh-0.018
Token great
Token ID1049
Feature activation+8.80e-03

Pos logit contributions
 nod+0.013
 joke+0.012
 bark+0.012
 be+0.012
 laugh+0.011
Neg logit contributions
ays-0.016
ions-0.016
oll-0.016
acey-0.015
yle-0.015
Token-
Token ID12
Feature activation-1.35e-02

Pos logit contributions
ays+0.018
ions+0.018
acey+0.018
umbers+0.017
yle+0.017
Neg logit contributions
 nod-0.012
 be-0.012
 joke-0.011
 bark-0.011
 matter-0.011
Tokengrand
Token ID23936
Feature activation+1.55e-01

Pos logit contributions
 nod+0.027
 bark+0.026
 joke+0.025
 be+0.025
 listen+0.025
Neg logit contributions
ays-0.020
oll-0.017
acey-0.017
ions-0.016
ocket-0.016
Tokenpa
Token ID8957
Feature activation+4.12e-02

Pos logit contributions
ocket+0.242
yle+0.240
ays+0.237
acey+0.237
oll+0.235
Neg logit contributions
 nod-0.279
 be-0.257
 joke-0.252
 bark-0.251
 roaring-0.246
Token's
Token ID338
Feature activation-2.88e-02

Pos logit contributions
yle+0.104
acey+0.103
ays+0.102
ocket+0.101
oll+0.100
Neg logit contributions
 nod-0.037
 be-0.033
 bark-0.031
 roaring-0.029
 joke-0.028
Token picture
Token ID4286
Feature activation+5.12e-02

Pos logit contributions
 nod+0.042
 be+0.038
 joke+0.038
 bark+0.037
 roaring+0.036
Neg logit contributions
ays-0.060
acey-0.057
ions-0.057
yle-0.057
oll-0.056
Token.
Token ID13
Feature activation+2.82e-03

Pos logit contributions
yle+0.134
acey+0.132
ays+0.128
ocket+0.128
iza+0.125
Neg logit contributions
 nod-0.044
 joke-0.041
 bark-0.041
 be-0.040
 roaring-0.035
Token They
Token ID1119
Feature activation-8.49e-03

Pos logit contributions
ocket+0.007
yle+0.007
ils+0.007
aser+0.007
acey+0.007
Neg logit contributions
 nod-0.003
 bark-0.003
 be-0.002
 joke-0.002
 barking-0.002
Token and
Token ID290
Feature activation+2.67e-02

Pos logit contributions
 nod+0.008
 be+0.007
 laugh+0.006
 bark+0.006
 count+0.006
Neg logit contributions
ocket-0.015
yle-0.015
ays-0.015
oll-0.015
acey-0.015
Token have
Token ID423
Feature activation+3.57e-02

Pos logit contributions
ocket+0.057
yle+0.057
acey+0.056
ions+0.054
ils+0.054
Neg logit contributions
 nod-0.028
 bark-0.025
 joke-0.024
 roaring-0.024
appointed-0.024
Token a
Token ID257
Feature activation+3.38e-02

Pos logit contributions
ocket+0.068
acey+0.067
ils+0.067
yle+0.066
ions+0.066
Neg logit contributions
 nod-0.047
sing-0.044
 joke-0.043
 bark-0.042
 roaring-0.042
Token plan
Token ID1410
Feature activation+7.22e-02

Pos logit contributions
acey+0.064
yle+0.062
ocket+0.061
oll+0.059
 register+0.059
Neg logit contributions
 nod-0.043
appointed-0.041
 roaring-0.040
sing-0.040
 bark-0.040
Token in
Token ID287
Feature activation-4.04e-02

Pos logit contributions
yle+0.185
ays+0.185
acey+0.184
ocket+0.183
oll+0.178
Neg logit contributions
 nod-0.062
 be-0.060
 count-0.051
 joke-0.050
 bark-0.049
Token case
Token ID1339
Feature activation+1.55e-01

Pos logit contributions
 be+0.049
 nod+0.045
 joke+0.042
 roaring+0.042
 count+0.041
Neg logit contributions
oll-0.092
ocket-0.091
ays-0.091
acey-0.090
yle-0.090
Token of
Token ID286
Feature activation+3.96e-02

Pos logit contributions
ocket+0.362
yle+0.354
acey+0.347
oll+0.345
ays+0.337
Neg logit contributions
 be-0.170
 nod-0.167
 bark-0.147
 roaring-0.144
 louder-0.142
Token an
Token ID281
Feature activation+4.03e-02

Pos logit contributions
ocket+0.078
acey+0.078
yle+0.078
ays+0.077
oll+0.076
Neg logit contributions
 nod-0.056
 joke-0.050
 bark-0.049
 be-0.049
 roaring-0.048
Token emergency
Token ID6334
Feature activation-6.53e-03

Pos logit contributions
acey+0.083
yle+0.082
ocket+0.081
ays+0.081
ions+0.078
Neg logit contributions
 nod-0.055
 be-0.049
 joke-0.049
 bark-0.048
 roaring-0.047
Token.
Token ID13
Feature activation-3.34e-03

Pos logit contributions
 nod+0.007
 be+0.006
 bark+0.006
 joke+0.006
 roaring+0.006
Neg logit contributions
ocket-0.016
acey-0.016
yle-0.015
oll-0.015
ays-0.015
Token He
Token ID679
Feature activation+7.36e-03

Pos logit contributions
 nod+0.004
 bark+0.003
 joke+0.003
 be+0.003
 roaring+0.003
Neg logit contributions
ocket-0.008
yle-0.007
acey-0.007
ays-0.007
ils-0.007
Token an
Token ID281
Feature activation+3.64e-02

Pos logit contributions
 nod+0.005
 be+0.004
 bark+0.004
 joke+0.004
 count+0.004
Neg logit contributions
ays-0.009
yle-0.009
acey-0.009
ocket-0.008
oll-0.008
Token old
Token ID1468
Feature activation-9.72e-03

Pos logit contributions
acey+0.072
yle+0.071
ocket+0.071
ays+0.070
ions+0.068
Neg logit contributions
 nod-0.051
 joke-0.046
 bark-0.046
 be-0.045
 roaring-0.044
Token rope
Token ID17182
Feature activation+1.53e-02

Pos logit contributions
 nod+0.016
appointed+0.016
 roaring+0.015
 joke+0.014
sing+0.014
Neg logit contributions
ocket-0.016
acey-0.015
oll-0.015
yle-0.015
aser-0.014
Token just
Token ID655
Feature activation-5.61e-02

Pos logit contributions
acey+0.036
yle+0.035
ays+0.035
ocket+0.035
ils+0.034
Neg logit contributions
 nod-0.017
 be-0.015
 bark-0.015
 joke-0.014
 roaring-0.014
Token in
Token ID287
Feature activation-3.98e-02

Pos logit contributions
 nod+0.079
 bark+0.071
 joke+0.070
 be+0.069
 roaring+0.068
Neg logit contributions
acey-0.111
ocket-0.111
yle-0.110
ays-0.108
oll-0.106
Token case
Token ID1339
Feature activation+1.54e-01

Pos logit contributions
 nod+0.043
 be+0.041
 bark+0.037
 joke+0.037
 roaring+0.036
Neg logit contributions
ocket-0.094
oll-0.094
acey-0.094
ays-0.093
ils-0.092
Token.
Token ID13
Feature activation-6.44e-04

Pos logit contributions
acey+0.383
ocket+0.382
ays+0.379
yle+0.377
oll+0.375
Neg logit contributions
 be-0.129
 nod-0.127
 bark-0.119
 roaring-0.114
 barking-0.109
Token He
Token ID679
Feature activation+4.15e-03

Pos logit contributions
 nod+0.001
 bark+0.001
 be+0.001
 joke+0.001
 roaring+0.000
Neg logit contributions
ocket-0.002
yle-0.002
acey-0.002
ils-0.002
oll-0.002
Token tied
Token ID8165
Feature activation-1.32e-02

Pos logit contributions
ocket+0.009
yle+0.009
acey+0.009
ays+0.009
ions+0.009
Neg logit contributions
 nod-0.005
 bark-0.005
 be-0.005
 joke-0.005
 roaring-0.004
Token one
Token ID530
Feature activation+9.16e-03

Pos logit contributions
 nod+0.018
 bark+0.017
 be+0.016
 roaring+0.016
 joke+0.016
Neg logit contributions
ocket-0.027
acey-0.026
yle-0.026
oll-0.026
ays-0.025
Token end
Token ID886
Feature activation+3.30e-02

Pos logit contributions
acey+0.019
ocket+0.019
yle+0.019
ils+0.018
ays+0.018
Neg logit contributions
 nod-0.012
 be-0.011
 bark-0.011
 joke-0.010
 count-0.010
Token,
Token ID11
Feature activation-7.00e-03

Pos logit contributions
ays+0.069
yle+0.069
acey+0.068
ocket+0.067
asers+0.066
Neg logit contributions
 nod-0.025
 be-0.025
 bark-0.025
 joke-0.022
 count-0.022
Token just
Token ID655
Feature activation-7.08e-02

Pos logit contributions
 nod+0.008
 be+0.007
 bark+0.007
 joke+0.007
 roaring+0.007
Neg logit contributions
ocket-0.016
acey-0.016
ays-0.016
yle-0.016
ils-0.015
Token with
Token ID351
Feature activation+4.17e-02

Pos logit contributions
 nod+0.081
 joke+0.070
 be+0.070
 bark+0.070
 roaring+0.067
Neg logit contributions
acey-0.160
ocket-0.160
yle-0.159
ays-0.157
oll-0.154
Token a
Token ID257
Feature activation+3.20e-02

Pos logit contributions
acey+0.093
ocket+0.093
ays+0.093
yle+0.092
 register+0.091
Neg logit contributions
 be-0.048
 nod-0.046
 bark-0.043
 joke-0.041
 roaring-0.041
Token new
Token ID649
Feature activation-2.55e-02

Pos logit contributions
ocket+0.074
acey+0.073
yle+0.072
oll+0.071
ays+0.069
Neg logit contributions
 nod-0.031
 bark-0.027
 joke-0.027
 roaring-0.026
appointed-0.026
Token case
Token ID1339
Feature activation+1.54e-01

Pos logit contributions
 nod+0.037
 joke+0.032
 bark+0.032
appointed+0.032
 roaring+0.032
Neg logit contributions
ocket-0.047
acey-0.046
yle-0.046
oll-0.045
ays-0.045
Token."
Token ID526
Feature activation-2.18e-02

Pos logit contributions
yle+0.401
acey+0.400
ays+0.390
ocket+0.385
umbers+0.381
Neg logit contributions
 be-0.135
 nod-0.106
 bark-0.104
 count-0.095
 shaking-0.087
Token
Token ID220
Feature activation-1.66e-02

Pos logit contributions
 nod+0.022
 bark+0.018
 be+0.018
 joke+0.016
 laugh+0.015
Neg logit contributions
ocket-0.055
ils-0.053
oll-0.053
ays-0.053
acey-0.052
Token
Token ID198
Feature activation-1.76e-02

Pos logit contributions
 nod+0.016
 be+0.013
 bark+0.012
 joke+0.012
 roaring+0.012
Neg logit contributions
ocket-0.041
yle-0.040
acey-0.040
ils-0.039
oll-0.039
Token
Token ID198
Feature activation-2.12e-02

Pos logit contributions
 nod+0.020
 joke+0.017
 bark+0.017
 be+0.017
 roaring+0.016
Neg logit contributions
ocket-0.039
acey-0.039
yle-0.039
ays-0.038
oll-0.038
TokenL
Token ID43
Feature activation+6.77e-02

Pos logit contributions
 nod+0.028
 joke+0.025
 bark+0.025
 roaring+0.024
 be+0.024
Neg logit contributions
acey-0.043
ocket-0.043
yle-0.043
oll-0.041
ays-0.041
Token the
Token ID262
Feature activation+2.03e-02

Pos logit contributions
ocket+0.054
acey+0.054
yle+0.053
oll+0.052
ays+0.052
Neg logit contributions
 nod-0.040
 bark-0.036
 joke-0.036
 roaring-0.034
 be-0.034
Token same
Token ID976
Feature activation+2.78e-02

Pos logit contributions
oll+0.036
ocket+0.035
 register+0.033
aser+0.033
acey+0.032
Neg logit contributions
 nod-0.028
 argue-0.026
sing-0.025
dy-0.025
 barking-0.025
Token pain
Token ID2356
Feature activation-1.49e-02

Pos logit contributions
ocket+0.052
acey+0.052
yle+0.051
ays+0.050
oll+0.050
Neg logit contributions
 nod-0.042
 joke-0.038
 bark-0.038
 be-0.037
 roaring-0.037
Token?".
Token ID43634
Feature activation-1.59e-02

Pos logit contributions
 nod+0.023
 be+0.022
 joke+0.021
 bark+0.021
 roaring+0.020
Neg logit contributions
ocket-0.028
yle-0.028
acey-0.027
oll-0.026
ays-0.026
Token The
Token ID383
Feature activation-2.32e-02

Pos logit contributions
 nod+0.020
 joke+0.018
 roaring+0.018
 bark+0.017
sing+0.017
Neg logit contributions
acey-0.033
yle-0.032
ocket-0.032
oll-0.031
ays-0.030
Token taller
Token ID25242
Feature activation-1.54e-01

Pos logit contributions
 nod+0.033
sing+0.029
 argue+0.029
 bark+0.029
 joke+0.028
Neg logit contributions
oll-0.041
ocket-0.041
aser-0.040
yle-0.040
acey-0.040
Token one
Token ID530
Feature activation-2.70e-05

Pos logit contributions
 nod+0.208
 joke+0.185
 bark+0.183
 roaring+0.181
 be+0.175
Neg logit contributions
ocket-0.305
yle-0.303
acey-0.300
oll-0.296
ays-0.294
Token leaned
Token ID23831
Feature activation-6.26e-02

Pos logit contributions
 nod+0.000
 joke+0.000
 roaring+0.000
 bark+0.000
appointed+0.000
Neg logit contributions
ocket-0.000
oll-0.000
yle-0.000
acey-0.000
aser-0.000
Token in
Token ID287
Feature activation-5.95e-02

Pos logit contributions
 nod+0.097
 joke+0.088
 bark+0.087
 be+0.085
 roaring+0.084
Neg logit contributions
ocket-0.114
acey-0.113
yle-0.113
ays-0.110
oll-0.109
Token and
Token ID290
Feature activation+2.95e-02

Pos logit contributions
 nod+0.084
 joke+0.080
appointed+0.078
sing+0.078
 bark+0.077
Neg logit contributions
ocket-0.104
yle-0.103
ions-0.102
acey-0.101
ays-0.099
Token said
Token ID531
Feature activation+4.66e-02

Pos logit contributions
ocket+0.058
acey+0.058
yle+0.057
ays+0.056
oll+0.056
Neg logit contributions
 nod-0.041
 joke-0.037
 bark-0.036
 be-0.035
 roaring-0.035
Token lean
Token ID10904
Feature activation-6.17e-02

Pos logit contributions
 nod+0.042
 joke+0.036
 bark+0.035
 roaring+0.035
 be+0.033
Neg logit contributions
ocket-0.105
acey-0.105
yle-0.104
oll-0.102
ays-0.102
Token against
Token ID1028
Feature activation-4.69e-02

Pos logit contributions
 nod+0.104
 joke+0.095
 bark+0.094
 be+0.092
 roaring+0.092
Neg logit contributions
ocket-0.104
acey-0.103
yle-0.103
ays-0.100
oll-0.099
Token it
Token ID340
Feature activation-9.08e-03

Pos logit contributions
 nod+0.065
 bark+0.058
 be+0.058
 joke+0.057
 roaring+0.056
Neg logit contributions
acey-0.096
ocket-0.096
oll-0.093
yle-0.092
ays-0.092
Token.
Token ID13
Feature activation-7.47e-03

Pos logit contributions
 nod+0.011
 joke+0.010
 bark+0.010
 be+0.010
 roaring+0.010
Neg logit contributions
ocket-0.019
acey-0.019
yle-0.019
ays-0.019
oll-0.019
Token The
Token ID383
Feature activation-2.30e-02

Pos logit contributions
 nod+0.008
 bark+0.007
 be+0.006
 joke+0.006
 barking+0.006
Neg logit contributions
ocket-0.019
ils-0.018
yle-0.018
acey-0.018
oll-0.017
Token taller
Token ID25242
Feature activation-1.53e-01

Pos logit contributions
 nod+0.033
 joke+0.029
 bark+0.029
 be+0.028
 roaring+0.028
Neg logit contributions
ocket-0.045
acey-0.045
yle-0.045
ays-0.044
oll-0.043
Token one
Token ID530
Feature activation+8.05e-03

Pos logit contributions
 nod+0.211
 bark+0.191
 joke+0.188
 be+0.186
 roaring+0.179
Neg logit contributions
acey-0.311
ocket-0.310
yle-0.305
ays-0.300
oll-0.297
Token was
Token ID373
Feature activation+8.18e-03

Pos logit contributions
ocket+0.014
yle+0.014
oll+0.014
acey+0.014
ions+0.014
Neg logit contributions
 nod-0.012
 joke-0.011
 roaring-0.011
 bark-0.011
 argue-0.011
Token curious
Token ID11040
Feature activation-5.00e-02

Pos logit contributions
oll+0.015
ions+0.014
ocket+0.014
acey+0.014
ays+0.013
Neg logit contributions
 nod-0.012
 argue-0.011
 joke-0.010
sing-0.010
 bark-0.010
Token and
Token ID290
Feature activation+1.92e-02

Pos logit contributions
 nod+0.061
 joke+0.053
 bark+0.053
 be+0.051
 roaring+0.051
Neg logit contributions
ocket-0.109
acey-0.108
yle-0.108
ays-0.106
oll-0.105
Token asked
Token ID1965
Feature activation+4.87e-04

Pos logit contributions
ocket+0.037
acey+0.037
yle+0.037
ays+0.036
oll+0.036
Neg logit contributions
 nod-0.027
 joke-0.024
 bark-0.024
 be-0.024
 roaring-0.023
Token.
Token ID13
Feature activation-1.77e-03

Pos logit contributions
ocket+0.059
acey+0.058
yle+0.058
ays+0.057
oll+0.057
Neg logit contributions
 nod-0.036
 joke-0.032
 bark-0.031
 be-0.030
 roaring-0.030
Token Max
Token ID5436
Feature activation+7.60e-03

Pos logit contributions
 nod+0.003
 joke+0.003
 roaring+0.003
 wob+0.003
 matter+0.003
Neg logit contributions
acey-0.003
ays-0.003
yle-0.003
 album-0.003
 Hood-0.003
Token had
Token ID550
Feature activation+3.51e-02

Pos logit contributions
acey+0.012
ils+0.012
yle+0.012
 Hood+0.012
ocket+0.011
Neg logit contributions
 nod-0.012
sing-0.012
appointed-0.011
 matter-0.011
 joke-0.011
Token a
Token ID257
Feature activation+2.51e-02

Pos logit contributions
ocket+0.062
acey+0.061
yle+0.061
oll+0.059
ays+0.059
Neg logit contributions
 nod-0.056
 joke-0.050
 bark-0.049
 roaring-0.048
 be-0.048
Token w
Token ID266
Feature activation-5.57e-03

Pos logit contributions
ocket+0.038
acey+0.038
oll+0.036
yle+0.035
 album+0.035
Neg logit contributions
appointed-0.040
 nod-0.038
 violently-0.037
 erupt-0.036
 roaring-0.035
Tokenagging
Token ID16406
Feature activation-1.53e-01

Pos logit contributions
 nod+0.009
 joke+0.009
 bark+0.008
 be+0.008
 roaring+0.008
Neg logit contributions
acey-0.010
yle-0.010
ocket-0.009
ays-0.009
ions-0.009
Token tail
Token ID7894
Feature activation-6.07e-03

Pos logit contributions
 nod+0.179
 count+0.150
 argue+0.147
 joke+0.146
 disagree+0.146
Neg logit contributions
ocket-0.311
yle-0.305
ays-0.295
acey-0.290
overs-0.287
Token and
Token ID290
Feature activation+3.40e-02

Pos logit contributions
 nod+0.008
 joke+0.007
 bark+0.007
 be+0.007
 roaring+0.006
Neg logit contributions
ocket-0.013
acey-0.013
yle-0.013
ays-0.012
oll-0.012
Token a
Token ID257
Feature activation+2.74e-02

Pos logit contributions
ocket+0.066
acey+0.065
yle+0.065
oll+0.064
ays+0.063
Neg logit contributions
 nod-0.048
 joke-0.042
 bark-0.041
 be-0.040
 roaring-0.040
Token shiny
Token ID22441
Feature activation-3.86e-02

Pos logit contributions
ocket+0.043
acey+0.043
 album+0.040
oll+0.040
 suitcase+0.040
Neg logit contributions
appointed-0.042
 erupt-0.037
 argue-0.037
 count-0.037
 nod-0.037
Token collar
Token ID19908
Feature activation+6.17e-02

Pos logit contributions
 nod+0.060
appointed+0.058
 roaring+0.055
 count+0.054
 argue+0.054
Neg logit contributions
ocket-0.065
yle-0.060
oll-0.059
acey-0.059
ays-0.058
Token to
Token ID284
Feature activation-5.12e-02

Pos logit contributions
 sym+0.113
sing+0.111
appointed+0.109
 widening+0.108
 nod+0.102
Neg logit contributions
yle-0.064
ocket-0.063
 register-0.061
aser-0.056
 Hood-0.055
Token go
Token ID467
Feature activation+2.31e-03

Pos logit contributions
sing+0.065
 nod+0.064
appointed+0.062
 sym+0.061
 roaring+0.060
Neg logit contributions
yle-0.088
ocket-0.082
 register-0.082
oll-0.081
iza-0.079
Token,
Token ID11
Feature activation-1.30e-02

Pos logit contributions
acey+0.005
ocket+0.005
yle+0.005
ays+0.005
oll+0.005
Neg logit contributions
 nod-0.003
 be-0.003
 bark-0.003
 roaring-0.003
 joke-0.003
Token but
Token ID475
Feature activation-1.53e-02

Pos logit contributions
 nod+0.012
 joke+0.011
 bark+0.010
 roaring+0.009
 count+0.009
Neg logit contributions
acey-0.031
ocket-0.031
yle-0.031
oll-0.030
ays-0.030
Token the
Token ID262
Feature activation-6.56e-03

Pos logit contributions
 nod+0.024
 joke+0.022
 bark+0.021
 roaring+0.021
 be+0.021
Neg logit contributions
ocket-0.028
yle-0.027
acey-0.027
oll-0.026
ays-0.026
Token taller
Token ID25242
Feature activation-1.53e-01

Pos logit contributions
 nod+0.011
appointed+0.010
 pur+0.010
 chuckle+0.010
 bark+0.010
Neg logit contributions
 register-0.010
aser-0.009
asers-0.009
 Hood-0.009
acey-0.009
Token boy
Token ID2933
Feature activation+6.59e-03

Pos logit contributions
 nod+0.262
appointed+0.243
 roaring+0.241
sing+0.240
 bark+0.236
Neg logit contributions
yle-0.222
aser-0.221
oll-0.221
asers-0.219
ocket-0.218
Token had
Token ID550
Feature activation+4.48e-02

Pos logit contributions
ocket+0.013
acey+0.013
yle+0.013
oll+0.013
ils+0.012
Neg logit contributions
 nod-0.009
 joke-0.008
 roaring-0.008
 bark-0.007
sing-0.007
Token no
Token ID645
Feature activation+3.09e-02

Pos logit contributions
yle+0.073
acey+0.073
ocket+0.072
ays+0.070
oll+0.069
Neg logit contributions
 nod-0.071
sing-0.065
 joke-0.065
 roaring-0.063
 laugh-0.062
Token choice
Token ID3572
Feature activation+1.31e-01

Pos logit contributions
ocket+0.055
oll+0.054
 register+0.052
ils+0.051
asers+0.050
Neg logit contributions
appointed-0.039
 nod-0.038
 bark-0.036
dy-0.035
sing-0.034
Token.
Token ID13
Feature activation-1.75e-03

Pos logit contributions
ocket+0.263
yle+0.260
acey+0.258
oll+0.253
ions+0.251
Neg logit contributions
 nod-0.176
 joke-0.157
 bark-0.156
 be-0.149
 roaring-0.149
Token Anna
Token ID11735
Feature activation-2.49e-02

Pos logit contributions
 be+0.041
 bark+0.040
 nod+0.039
 barking+0.035
 matter+0.033
Neg logit contributions
ocket-0.098
oll-0.095
asers-0.093
aser-0.092
acey-0.091
Token and
Token ID290
Feature activation+2.43e-02

Pos logit contributions
 nod+0.034
 joke+0.031
 bark+0.030
 roaring+0.030
 matter+0.029
Neg logit contributions
ocket-0.049
yle-0.048
acey-0.048
ays-0.047
oll-0.047
Token Ben
Token ID3932
Feature activation+7.05e-03

Pos logit contributions
ocket+0.048
yle+0.047
acey+0.047
ays+0.045
oll+0.045
Neg logit contributions
 nod-0.034
 joke-0.031
 bark-0.030
 roaring-0.029
appointed-0.029
Token.
Token ID13
Feature activation-2.65e-03

Pos logit contributions
yle+0.012
ocket+0.012
ays+0.012
acey+0.012
oll+0.012
Neg logit contributions
 nod-0.010
 joke-0.010
 matter-0.010
 bark-0.009
appointed-0.009
Token It
Token ID632
Feature activation-6.45e-02

Pos logit contributions
 nod+0.003
 bark+0.003
 be+0.003
 joke+0.003
 roaring+0.002
Neg logit contributions
ocket-0.006
yle-0.006
acey-0.006
ils-0.006
oll-0.006
Token tilted
Token ID37229
Feature activation-1.52e-01

Pos logit contributions
 nod+0.102
appointed+0.098
sing+0.094
 argue+0.094
 roaring+0.089
Neg logit contributions
ocket-0.104
oll-0.104
yle-0.098
acey-0.098
ils-0.096
Token its
Token ID663
Feature activation-3.24e-02

Pos logit contributions
 joke+0.249
 nod+0.240
appointed+0.236
 count+0.231
 roaring+0.229
Neg logit contributions
ays-0.230
ocket-0.229
yle-0.221
ions-0.210
oll-0.210
Token head
Token ID1182
Feature activation-2.54e-02

Pos logit contributions
 nod+0.055
 cooperate+0.054
 disagree+0.052
 disl+0.052
 compromise+0.051
Neg logit contributions
ocket-0.043
oll-0.039
yle-0.038
aser-0.038
als-0.037
Token and
Token ID290
Feature activation+2.69e-02

Pos logit contributions
 nod+0.035
 joke+0.033
 bark+0.031
 roaring+0.030
 count+0.030
Neg logit contributions
ocket-0.049
yle-0.049
acey-0.048
ays-0.048
oll-0.048
Token made
Token ID925
Feature activation-6.76e-03

Pos logit contributions
ocket+0.051
yle+0.049
ays+0.049
als+0.049
acey+0.048
Neg logit contributions
 nod-0.033
appointed-0.032
 joke-0.032
 matter-0.029
sing-0.029
Token a
Token ID257
Feature activation+8.75e-03

Pos logit contributions
 nod+0.008
 be+0.008
 bark+0.007
 joke+0.007
 louder+0.007
Neg logit contributions
ocket-0.015
yle-0.015
asers-0.015
ays-0.015
oll-0.015
Token was
Token ID373
Feature activation+9.29e-04

Pos logit contributions
 nod+0.045
 bark+0.040
 joke+0.039
 be+0.039
 roaring+0.038
Neg logit contributions
ocket-0.074
acey-0.073
yle-0.073
ays-0.072
oll-0.072
Token so
Token ID523
Feature activation+1.65e-02

Pos logit contributions
yle+0.001
acey+0.001
ocket+0.001
oll+0.001
ays+0.001
Neg logit contributions
 nod-0.002
 joke-0.002
 bark-0.001
 matter-0.001
 roaring-0.001
Token happy
Token ID3772
Feature activation-4.25e-03

Pos logit contributions
yle+0.037
acey+0.037
ocket+0.037
oll+0.037
ays+0.036
Neg logit contributions
 nod-0.017
 joke-0.015
 roaring-0.015
 bark-0.015
 matter-0.014
Token!
Token ID0
Feature activation+5.40e-03

Pos logit contributions
 nod+0.006
 joke+0.005
 bark+0.005
 roaring+0.005
 be+0.005
Neg logit contributions
ocket-0.008
yle-0.008
acey-0.008
ays-0.008
oll-0.008
Token She
Token ID1375
Feature activation+1.78e-02

Pos logit contributions
ocket+0.012
acey+0.011
ils+0.011
yle+0.011
oll+0.011
Neg logit contributions
 nod-0.007
 bark-0.006
 be-0.006
 joke-0.006
appointed-0.006
Token tilted
Token ID37229
Feature activation-1.52e-01

Pos logit contributions
ocket+0.037
acey+0.037
yle+0.036
ays+0.036
oll+0.036
Neg logit contributions
 nod-0.023
 bark-0.021
 joke-0.021
 be-0.021
 roaring-0.020
Token her
Token ID607
Feature activation-1.34e-03

Pos logit contributions
 nod+0.275
 joke+0.271
 roaring+0.256
 bark+0.255
sing+0.251
Neg logit contributions
yle-0.213
ocket-0.210
ays-0.208
acey-0.204
ions-0.199
Token head
Token ID1182
Feature activation-3.09e-02

Pos logit contributions
 nod+0.002
 joke+0.002
 roaring+0.002
 matter+0.002
 erupt+0.002
Neg logit contributions
yle-0.002
ocket-0.002
oll-0.002
acey-0.002
aser-0.002
Token back
Token ID736
Feature activation-4.44e-02

Pos logit contributions
 nod+0.046
 joke+0.045
 matter+0.042
 roaring+0.041
dy+0.040
Neg logit contributions
yle-0.053
ocket-0.052
ays-0.051
oll-0.051
acey-0.051
Token and
Token ID290
Feature activation+3.08e-02

Pos logit contributions
 nod+0.059
 joke+0.054
 bark+0.051
 roaring+0.051
 be+0.049
Neg logit contributions
yle-0.089
ocket-0.088
ays-0.087
acey-0.087
oll-0.086
Token laughed
Token ID13818
Feature activation-4.39e-02

Pos logit contributions
yle+0.055
acey+0.053
ocket+0.052
ays+0.051
als+0.051
Neg logit contributions
 nod-0.045
 joke-0.043
sing-0.041
 roaring-0.040
appointed-0.040
Token They
Token ID1119
Feature activation-1.99e-02

Pos logit contributions
 nod+0.009
 joke+0.008
 bark+0.008
 be+0.008
 roaring+0.008
Neg logit contributions
ocket-0.015
acey-0.015
yle-0.015
ays-0.015
oll-0.015
Token roared
Token ID45615
Feature activation-4.66e-02

Pos logit contributions
 nod+0.025
 bark+0.023
 joke+0.022
 be+0.022
 roaring+0.021
Neg logit contributions
ocket-0.043
acey-0.042
yle-0.042
ays-0.042
oll-0.041
Token louder
Token ID27089
Feature activation-1.29e-01

Pos logit contributions
 nod+0.070
 joke+0.070
 matter+0.069
appointed+0.068
 count+0.066
Neg logit contributions
ays-0.071
als-0.070
yle-0.070
ocket-0.069
acey-0.069
Token than
Token ID621
Feature activation+3.67e-02

Pos logit contributions
 nod+0.199
appointed+0.183
 joke+0.177
 matter+0.173
 bark+0.169
Neg logit contributions
ocket-0.213
yle-0.210
ays-0.210
ions-0.205
ils-0.204
Token the
Token ID262
Feature activation-6.47e-03

Pos logit contributions
acey+0.074
ays+0.074
ocket+0.074
yle+0.073
oll+0.072
Neg logit contributions
 nod-0.051
 joke-0.047
 bark-0.047
 roaring-0.046
 be-0.046
Token thunder
Token ID18355
Feature activation-1.51e-01

Pos logit contributions
 nod+0.009
 bark+0.008
 joke+0.008
 roaring+0.007
appointed+0.007
Neg logit contributions
ocket-0.012
yle-0.012
acey-0.012
ays-0.012
oll-0.012
Token.
Token ID13
Feature activation-7.77e-03

Pos logit contributions
 nod+0.183
 bark+0.164
 joke+0.163
 be+0.159
 roaring+0.157
Neg logit contributions
ocket-0.329
acey-0.325
yle-0.322
oll-0.315
ays-0.314
Token\n
Token ID198
Feature activation-2.75e-02

Pos logit contributions
 nod+0.010
 joke+0.009
 bark+0.009
 roaring+0.009
 be+0.009
Neg logit contributions
acey-0.016
ocket-0.016
yle-0.016
ays-0.015
oll-0.015
Token\n
Token ID198
Feature activation-3.22e-02

Pos logit contributions
 nod+0.035
 joke+0.031
 bark+0.031
 roaring+0.030
sing+0.029
Neg logit contributions
acey-0.057
yle-0.056
ocket-0.056
ays-0.055
ions-0.055
TokenThe
Token ID464
Feature activation-3.37e-02

Pos logit contributions
 nod+0.047
 joke+0.042
 bark+0.041
sing+0.040
 roaring+0.040
Neg logit contributions
acey-0.062
yle-0.061
ocket-0.060
ions-0.059
ays-0.058
Token weather
Token ID6193
Feature activation-7.50e-02

Pos logit contributions
 nod+0.043
 bark+0.039
 joke+0.038
 count+0.038
appointed+0.037
Neg logit contributions
ocket-0.064
acey-0.064
yle-0.063
oll-0.062
ays-0.061
Token her
Token ID607
Feature activation+9.74e-03

Pos logit contributions
 nod+0.001
 joke+0.001
 be+0.001
 bark+0.001
 roaring+0.001
Neg logit contributions
ocket-0.001
ays-0.001
acey-0.001
yle-0.001
oll-0.001
Token picture
Token ID4286
Feature activation+3.55e-02

Pos logit contributions
ocket+0.020
yle+0.020
acey+0.019
oll+0.019
ays+0.018
Neg logit contributions
 nod-0.013
 bark-0.011
 roaring-0.011
 joke-0.011
appointed-0.011
Token bigger
Token ID5749
Feature activation-9.00e-02

Pos logit contributions
ocket+0.074
acey+0.074
yle+0.073
ays+0.072
oll+0.071
Neg logit contributions
 nod-0.046
 joke-0.041
 bark-0.040
 be-0.040
 roaring-0.039
Token.
Token ID13
Feature activation-8.13e-03

Pos logit contributions
 nod+0.101
 bark+0.087
 joke+0.087
 be+0.086
 roaring+0.084
Neg logit contributions
acey-0.207
ocket-0.204
yle-0.202
ays-0.198
oll-0.196
Token She
Token ID1375
Feature activation+1.63e-02

Pos logit contributions
 nod+0.010
 be+0.009
 bark+0.009
 joke+0.008
appointed+0.008
Neg logit contributions
ocket-0.018
ils-0.017
yle-0.017
acey-0.017
asers-0.017
Token tilted
Token ID37229
Feature activation-1.50e-01

Pos logit contributions
ocket+0.035
acey+0.034
yle+0.034
ays+0.034
oll+0.033
Neg logit contributions
 nod-0.021
 bark-0.018
 joke-0.018
 be-0.018
 roaring-0.017
Token her
Token ID607
Feature activation-1.71e-03

Pos logit contributions
 nod+0.265
 joke+0.252
 bark+0.247
 roaring+0.242
sing+0.237
Neg logit contributions
ocket-0.232
yle-0.230
acey-0.228
ays-0.224
oll-0.217
Token head
Token ID1182
Feature activation-3.33e-02

Pos logit contributions
 nod+0.002
 joke+0.002
 roaring+0.002
 barking+0.002
 bark+0.002
Neg logit contributions
ocket-0.003
yle-0.003
acey-0.003
oll-0.003
aser-0.003
Token and
Token ID290
Feature activation+2.87e-02

Pos logit contributions
 nod+0.047
 joke+0.046
 bark+0.042
 roaring+0.042
 matter+0.041
Neg logit contributions
yle-0.061
ocket-0.061
acey-0.060
oll-0.058
ays-0.058
Token continued
Token ID3767
Feature activation+3.76e-03

Pos logit contributions
yle+0.056
ocket+0.056
acey+0.056
oll+0.053
ays+0.053
Neg logit contributions
 nod-0.038
 joke-0.035
 roaring-0.033
sing-0.033
 bark-0.033
Token to
Token ID284
Feature activation-3.86e-02

Pos logit contributions
acey+0.007
yle+0.007
ocket+0.007
ays+0.007
oll+0.007
Neg logit contributions
 nod-0.005
 joke-0.005
 bark-0.004
 be-0.004
 roaring-0.004
Token!"
Token ID2474
Feature activation-1.42e-02

Pos logit contributions
acey+0.108
ocket+0.108
yle+0.107
ays+0.106
oll+0.105
Neg logit contributions
 be-0.047
 nod-0.047
 bark-0.044
 count-0.042
 roaring-0.041
Token\n
Token ID198
Feature activation-2.47e-02

Pos logit contributions
 nod+0.013
 bark+0.011
 be+0.011
 joke+0.010
 laugh+0.009
Neg logit contributions
ocket-0.037
ils-0.036
oll-0.036
ays-0.036
acey-0.035
Token\n
Token ID198
Feature activation-2.85e-02

Pos logit contributions
 nod+0.024
 bark+0.020
 joke+0.020
 be+0.020
 roaring+0.018
Neg logit contributions
ocket-0.061
ays-0.060
acey-0.059
yle-0.059
oll-0.059
TokenThe
Token ID464
Feature activation-3.21e-02

Pos logit contributions
 nod+0.034
 joke+0.030
 bark+0.030
 be+0.029
 roaring+0.028
Neg logit contributions
ocket-0.063
acey-0.062
yle-0.061
ays-0.061
oll-0.060
Token se
Token ID384
Feature activation-4.00e-02

Pos logit contributions
 nod+0.039
 joke+0.038
 be+0.037
 bark+0.036
 roaring+0.035
Neg logit contributions
ays-0.075
ions-0.070
acey-0.069
oll-0.068
asers-0.068
Tokenag
Token ID363
Feature activation-1.50e-01

Pos logit contributions
 nod+0.059
 bark+0.053
 joke+0.053
 be+0.052
 roaring+0.052
Neg logit contributions
ocket-0.075
acey-0.075
yle-0.074
ays-0.073
oll-0.073
Tokenull
Token ID724
Feature activation-6.08e-02

Pos logit contributions
 nod+0.182
 joke+0.163
 bark+0.161
 be+0.160
 roaring+0.156
Neg logit contributions
ocket-0.322
acey-0.320
yle-0.316
oll-0.315
ays-0.311
Tokens
Token ID82
Feature activation-5.30e-02

Pos logit contributions
 nod+0.083
 joke+0.075
 bark+0.072
 roaring+0.071
 be+0.070
Neg logit contributions
ocket-0.118
acey-0.118
yle-0.117
ays-0.116
oll-0.115
Token and
Token ID290
Feature activation+2.39e-02

Pos logit contributions
 nod+0.074
 joke+0.072
 matter+0.070
 roaring+0.066
 barking+0.065
Neg logit contributions
ocket-0.091
ays-0.090
yle-0.089
oll-0.089
ils-0.088
Token the
Token ID262
Feature activation-6.68e-03

Pos logit contributions
ocket+0.046
acey+0.046
yle+0.046
oll+0.045
ays+0.045
Neg logit contributions
 nod-0.035
 bark-0.032
 joke-0.031
 be-0.031
 roaring-0.030
Token fish
Token ID5916
Feature activation-6.07e-02

Pos logit contributions
 nod+0.009
 bark+0.009
 joke+0.009
 be+0.009
 roaring+0.008
Neg logit contributions
ays-0.014
oll-0.013
acey-0.013
ocket-0.013
yle-0.013
Token\n
Token ID198
Feature activation-2.47e-02

Pos logit contributions
 nod+0.001
 bark+0.001
 joke+0.001
 be+0.001
 roaring+0.001
Neg logit contributions
ocket-0.003
acey-0.003
yle-0.003
ays-0.003
oll-0.003
Token\n
Token ID198
Feature activation-2.85e-02

Pos logit contributions
 nod+0.029
 joke+0.025
 bark+0.025
 roaring+0.024
 be+0.024
Neg logit contributions
ocket-0.054
acey-0.054
yle-0.054
ays-0.053
oll-0.053
TokenRex
Token ID47389
Feature activation-2.81e-02

Pos logit contributions
 nod+0.044
 joke+0.041
 matter+0.040
sing+0.039
 bark+0.038
Neg logit contributions
acey-0.051
yle-0.050
ocket-0.050
ays-0.048
ions-0.048
Token stopped
Token ID5025
Feature activation-5.88e-02

Pos logit contributions
 nod+0.036
appointed+0.035
 matter+0.035
sing+0.033
 moment+0.032
Neg logit contributions
ils-0.052
als-0.051
yle-0.050
acey-0.050
ocket-0.050
Token and
Token ID290
Feature activation+8.66e-03

Pos logit contributions
 nod+0.079
appointed+0.077
 matter+0.072
 phrase+0.068
 joke+0.067
Neg logit contributions
yle-0.113
ays-0.113
acey-0.109
ions-0.108
izards-0.104
Token tilted
Token ID37229
Feature activation-1.50e-01

Pos logit contributions
yle+0.016
acey+0.016
ils+0.016
ocket+0.015
als+0.015
Neg logit contributions
 nod-0.011
appointed-0.011
sing-0.011
 joke-0.010
 matter-0.010
Token his
Token ID465
Feature activation-5.71e-03

Pos logit contributions
 nod+0.209
 joke+0.209
appointed+0.203
 count+0.199
 matter+0.195
Neg logit contributions
ays-0.260
yle-0.255
ocket-0.250
oll-0.244
acey-0.243
Token head
Token ID1182
Feature activation-2.62e-02

Pos logit contributions
 nod+0.007
 joke+0.006
 matter+0.006
 disagree+0.005
 roaring+0.005
Neg logit contributions
ocket-0.012
oll-0.012
acey-0.012
yle-0.011
ays-0.011
Token.
Token ID13
Feature activation-4.78e-03

Pos logit contributions
 nod+0.039
 joke+0.037
 matter+0.035
 count+0.035
appointed+0.034
Neg logit contributions
yle-0.046
ays-0.046
ocket-0.046
oll-0.046
acey-0.045
Token He
Token ID679
Feature activation-1.90e-03

Pos logit contributions
 nod+0.005
 joke+0.005
 bark+0.004
 roaring+0.004
 be+0.004
Neg logit contributions
ocket-0.011
acey-0.011
yle-0.011
ays-0.011
oll-0.011
Token did
Token ID750
Feature activation-4.22e-02

Pos logit contributions
 nod+0.002
 joke+0.002
 roaring+0.002
 bark+0.002
 be+0.002
Neg logit contributions
ocket-0.004
acey-0.004
yle-0.004
oll-0.004
ays-0.004
Token.
Token ID13
Feature activation-5.32e-03

Pos logit contributions
 nod+0.129
appointed+0.117
 joke+0.116
 bark+0.115
sing+0.113
Neg logit contributions
ays-0.141
yle-0.137
ions-0.137
ocket-0.134
oll-0.133
Token Ell
Token ID7122
Feature activation-1.62e-02

Pos logit contributions
 nod+0.007
 joke+0.007
 bark+0.007
 roaring+0.006
 be+0.006
Neg logit contributions
acey-0.011
ocket-0.010
yle-0.010
ays-0.010
oll-0.010
Tokeny
Token ID88
Feature activation-3.71e-02

Pos logit contributions
 nod+0.023
dy+0.022
 be+0.021
 joke+0.021
 bark+0.020
Neg logit contributions
ocket-0.031
yle-0.028
ions-0.028
 register-0.028
oll-0.028
Token makes
Token ID1838
Feature activation-3.65e-02

Pos logit contributions
 nod+0.055
 joke+0.049
 bark+0.049
 roaring+0.048
 be+0.048
Neg logit contributions
ocket-0.070
acey-0.070
yle-0.069
ays-0.068
oll-0.067
Token a
Token ID257
Feature activation+2.24e-03

Pos logit contributions
 nod+0.055
 joke+0.050
 bark+0.050
 be+0.048
 roaring+0.048
Neg logit contributions
ocket-0.069
acey-0.068
yle-0.068
ays-0.066
oll-0.066
Token louder
Token ID27089
Feature activation-1.50e-01

Pos logit contributions
ocket+0.004
yle+0.004
acey+0.004
aser+0.004
oll+0.004
Neg logit contributions
 nod-0.003
appointed-0.003
 count-0.003
 reasoning-0.003
 matter-0.002
Token noise
Token ID7838
Feature activation-3.70e-02

Pos logit contributions
 nod+0.162
appointed+0.148
 count+0.133
sing+0.128
 bark+0.127
Neg logit contributions
yle-0.323
ocket-0.320
acey-0.309
ays-0.308
oll-0.306
Token.
Token ID13
Feature activation+1.66e-03

Pos logit contributions
 nod+0.051
 joke+0.045
 bark+0.044
 be+0.043
 roaring+0.042
Neg logit contributions
ocket-0.072
acey-0.072
yle-0.072
ays-0.071
oll-0.071
Token He
Token ID679
Feature activation+3.94e-05

Pos logit contributions
acey+0.003
yle+0.003
ocket+0.003
ays+0.003
oll+0.003
Neg logit contributions
 nod-0.002
 joke-0.002
 bark-0.002
 roaring-0.002
 be-0.002
Token does
Token ID857
Feature activation-7.89e-02

Pos logit contributions
ocket+0.000
acey+0.000
yle+0.000
ays+0.000
oll+0.000
Neg logit contributions
 nod-0.000
 joke-0.000
 bark-0.000
 roaring-0.000
 be-0.000
Token not
Token ID407
Feature activation-7.61e-02

Pos logit contributions
 nod+0.075
 bark+0.063
appointed+0.060
sing+0.060
 roaring+0.059
Neg logit contributions
yle-0.183
acey-0.182
ocket-0.179
ays-0.174
oll-0.172
Token brown
Token ID7586
Feature activation-8.52e-02

Pos logit contributions
yle+0.055
ocket+0.054
acey+0.054
ays+0.053
oll+0.052
Neg logit contributions
 nod-0.037
 be-0.035
 bark-0.035
 joke-0.035
 roaring-0.033
Token fur
Token ID9230
Feature activation-3.16e-02

Pos logit contributions
 nod+0.128
 bark+0.115
 joke+0.115
 be+0.112
 roaring+0.111
Neg logit contributions
ocket-0.160
acey-0.159
yle-0.158
ays-0.154
oll-0.154
Token and
Token ID290
Feature activation+1.18e-02

Pos logit contributions
 nod+0.026
 bark+0.024
 be+0.024
 joke+0.021
 roaring+0.020
Neg logit contributions
yle-0.083
ocket-0.081
acey-0.081
iza-0.078
ays-0.077
Token a
Token ID257
Feature activation+2.56e-02

Pos logit contributions
ocket+0.024
yle+0.024
acey+0.024
ays+0.023
oll+0.023
Neg logit contributions
 nod-0.016
 bark-0.014
 joke-0.014
 be-0.014
 roaring-0.014
Token w
Token ID266
Feature activation-1.29e-02

Pos logit contributions
ocket+0.049
acey+0.048
yle+0.046
oll+0.046
ays+0.044
Neg logit contributions
 nod-0.035
appointed-0.034
 matter-0.031
 joke-0.031
 bark-0.031
Tokenagging
Token ID16406
Feature activation-1.50e-01

Pos logit contributions
 nod+0.023
 joke+0.022
 be+0.021
 bark+0.021
sing+0.020
Neg logit contributions
yle-0.020
 Hood-0.019
ays-0.019
ions-0.019
ocket-0.018
Token tail
Token ID7894
Feature activation-5.15e-03

Pos logit contributions
 nod+0.181
 joke+0.154
 bark+0.150
 roaring+0.149
 count+0.145
Neg logit contributions
ocket-0.323
acey-0.319
yle-0.319
ays-0.312
oll-0.306
Token.
Token ID13
Feature activation-1.09e-03

Pos logit contributions
 nod+0.006
 bark+0.005
 joke+0.005
 be+0.005
 roaring+0.005
Neg logit contributions
acey-0.011
ocket-0.011
yle-0.011
ays-0.011
oll-0.011
Token She
Token ID1375
Feature activation+9.07e-03

Pos logit contributions
 nod+0.002
 bark+0.001
 joke+0.001
 be+0.001
 roaring+0.001
Neg logit contributions
ocket-0.002
acey-0.002
yle-0.002
oll-0.002
ays-0.002
Token draws
Token ID14293
Feature activation+2.16e-02

Pos logit contributions
ocket+0.016
acey+0.016
yle+0.016
ays+0.016
oll+0.015
Neg logit contributions
 nod-0.014
 joke-0.013
 bark-0.013
 be-0.013
 roaring-0.013
Token Ben
Token ID3932
Feature activation+7.53e-03

Pos logit contributions
yle+0.046
ocket+0.045
acey+0.045
ays+0.044
iza+0.043
Neg logit contributions
 nod-0.027
 be-0.025
 bark-0.025
 roaring-0.024
 joke-0.024
Token brown
Token ID7586
Feature activation-8.07e-02

Pos logit contributions
acey+0.094
yle+0.092
ocket+0.092
ays+0.090
iza+0.089
Neg logit contributions
 nod-0.045
 bark-0.045
 be-0.043
 joke-0.042
 roaring-0.041
Token spots
Token ID10222
Feature activation+2.26e-02

Pos logit contributions
appointed+0.129
 nod+0.126
 matter+0.122
 erupt+0.117
 compromise+0.116
Neg logit contributions
ocket-0.133
als-0.131
ays-0.127
ions-0.123
acey-0.122
Token and
Token ID290
Feature activation+2.59e-02

Pos logit contributions
acey+0.052
yle+0.051
ocket+0.051
ays+0.049
oll+0.049
Neg logit contributions
 nod-0.026
 bark-0.022
 be-0.022
 joke-0.021
 roaring-0.021
Token a
Token ID257
Feature activation+2.99e-02

Pos logit contributions
ocket+0.050
acey+0.049
yle+0.049
ays+0.048
oll+0.048
Neg logit contributions
 nod-0.037
 joke-0.032
 bark-0.032
 roaring-0.031
 be-0.031
Token w
Token ID266
Feature activation-5.37e-03

Pos logit contributions
ocket+0.055
acey+0.053
yle+0.051
aser+0.050
oll+0.050
Neg logit contributions
appointed-0.041
 matter-0.039
 nod-0.039
 argue-0.036
 count-0.036
Tokenagging
Token ID16406
Feature activation-1.50e-01

Pos logit contributions
 nod+0.009
 joke+0.009
 bark+0.008
 be+0.008
 roaring+0.008
Neg logit contributions
ocket-0.009
yle-0.009
acey-0.009
ays-0.008
oll-0.008
Token tail
Token ID7894
Feature activation-2.76e-03

Pos logit contributions
 nod+0.183
 count+0.157
 disagree+0.152
 argue+0.147
 compromise+0.145
Neg logit contributions
ocket-0.294
yle-0.290
ays-0.284
overs-0.282
acey-0.281
Token.
Token ID13
Feature activation+1.03e-03

Pos logit contributions
 nod+0.004
 joke+0.004
appointed+0.004
 matter+0.004
 bark+0.004
Neg logit contributions
ocket-0.005
oll-0.005
ays-0.005
yle-0.005
acey-0.005
Token He
Token ID679
Feature activation+1.65e-03

Pos logit contributions
ocket+0.003
yle+0.003
ils+0.002
acey+0.002
ays+0.002
Neg logit contributions
 nod-0.001
 bark-0.001
 be-0.001
 joke-0.001
 barking-0.001
Token is
Token ID318
Feature activation-2.94e-02

Pos logit contributions
ocket+0.003
acey+0.003
yle+0.003
oll+0.003
ays+0.003
Neg logit contributions
 nod-0.002
 joke-0.002
 bark-0.002
 be-0.002
 roaring-0.002
Token not
Token ID407
Feature activation-7.02e-02

Pos logit contributions
 nod+0.039
 joke+0.034
 bark+0.034
appointed+0.032
 matter+0.032
Neg logit contributions
acey-0.059
ocket-0.059
yle-0.058
oll-0.057
ays-0.055
Token looked
Token ID3114
Feature activation-2.60e-02

Pos logit contributions
yle+0.042
ocket+0.042
als+0.042
acey+0.041
ays+0.041
Neg logit contributions
 nod-0.028
appointed-0.026
 joke-0.025
sing-0.025
 matter-0.024
Token at
Token ID379
Feature activation-3.18e-02

Pos logit contributions
 nod+0.049
 joke+0.047
 bark+0.046
 be+0.046
 count+0.045
Neg logit contributions
ocket-0.037
acey-0.036
ays-0.036
yle-0.035
oll-0.034
Token Sara
Token ID24799
Feature activation-6.80e-03

Pos logit contributions
 nod+0.036
 be+0.035
 bark+0.035
 barking+0.029
 Bob+0.029
Neg logit contributions
ocket-0.076
asers-0.074
acey-0.072
aser-0.072
oll-0.071
Token.
Token ID13
Feature activation-1.26e-03

Pos logit contributions
 nod+0.010
 joke+0.009
 matter+0.009
 bark+0.009
appointed+0.009
Neg logit contributions
ocket-0.012
yle-0.012
acey-0.012
ays-0.012
oll-0.012
Token It
Token ID632
Feature activation-5.97e-02

Pos logit contributions
 nod+0.002
 bark+0.002
 be+0.002
 joke+0.002
 roaring+0.002
Neg logit contributions
ocket-0.002
acey-0.002
yle-0.002
ils-0.002
ays-0.002
Token tilted
Token ID37229
Feature activation-1.50e-01

Pos logit contributions
 nod+0.094
appointed+0.091
sing+0.088
 argue+0.087
 compromise+0.084
Neg logit contributions
ocket-0.095
ils-0.090
als-0.090
yle-0.090
oll-0.088
Token its
Token ID663
Feature activation-2.94e-02

Pos logit contributions
 joke+0.269
 compromise+0.249
 quarrel+0.249
 count+0.247
 argue+0.247
Neg logit contributions
ocket-0.204
ays-0.195
als-0.189
yle-0.182
oll-0.179
Token head
Token ID1182
Feature activation-2.31e-02

Pos logit contributions
 nod+0.047
 disagree+0.046
 cooperate+0.044
 compromise+0.044
 reasoning+0.044
Neg logit contributions
ocket-0.044
als-0.038
aser-0.035
oll-0.034
 album-0.034
Token and
Token ID290
Feature activation+2.69e-02

Pos logit contributions
 nod+0.034
 joke+0.033
 matter+0.031
 bark+0.031
 compromise+0.030
Neg logit contributions
ocket-0.041
ays-0.040
yle-0.040
oll-0.040
acey-0.039
Token opened
Token ID4721
Feature activation-3.81e-02

Pos logit contributions
ocket+0.049
als+0.048
yle+0.047
ays+0.047
ils+0.046
Neg logit contributions
 nod-0.036
appointed-0.035
 joke-0.034
sing-0.032
 argue-0.032
Token its
Token ID663
Feature activation-6.89e-03

Pos logit contributions
 nod+0.059
 joke+0.056
 bark+0.052
 roaring+0.051
sing+0.051
Neg logit contributions
ays-0.069
yle-0.069
ocket-0.068
acey-0.067
ions-0.065
Token wet
Token ID9583
Feature activation-1.07e-01

Pos logit contributions
ocket+0.035
acey+0.034
oll+0.032
yle+0.032
 suitcase+0.031
Neg logit contributions
appointed-0.033
 nod-0.029
 matter-0.028
 count-0.028
 argue-0.027
Token nose
Token ID9686
Feature activation-4.25e-02

Pos logit contributions
 nod+0.164
 joke+0.148
 bark+0.145
 roaring+0.144
 be+0.141
Neg logit contributions
ocket-0.195
acey-0.192
yle-0.191
ays-0.186
oll-0.186
Token and
Token ID290
Feature activation+1.93e-02

Pos logit contributions
 nod+0.047
 bark+0.041
 be+0.040
 joke+0.040
 roaring+0.039
Neg logit contributions
acey-0.099
yle-0.098
ocket-0.097
ays-0.094
ils-0.094
Token a
Token ID257
Feature activation+2.51e-02

Pos logit contributions
yle+0.040
acey+0.039
ocket+0.039
ays+0.038
oll+0.037
Neg logit contributions
 nod-0.026
 bark-0.024
 joke-0.023
 be-0.023
 roaring-0.023
Token w
Token ID266
Feature activation-2.78e-03

Pos logit contributions
ocket+0.043
acey+0.041
oll+0.039
 suitcase+0.038
yle+0.038
Neg logit contributions
appointed-0.038
 nod-0.033
 matter-0.033
 count-0.032
 argue-0.032
Tokenagging
Token ID16406
Feature activation-1.49e-01

Pos logit contributions
 nod+0.005
 joke+0.005
 bark+0.004
 be+0.004
 roaring+0.004
Neg logit contributions
yle-0.004
ocket-0.004
acey-0.004
ays-0.004
ions-0.004
Token tail
Token ID7894
Feature activation-6.86e-03

Pos logit contributions
 nod+0.179
 count+0.152
 joke+0.150
 roaring+0.149
 disagree+0.146
Neg logit contributions
ocket-0.303
yle-0.300
ays-0.290
acey-0.289
overs-0.286
Token.
Token ID13
Feature activation+3.64e-04

Pos logit contributions
 nod+0.009
 joke+0.008
 bark+0.008
 be+0.008
 roaring+0.008
Neg logit contributions
ocket-0.014
acey-0.014
yle-0.014
ays-0.014
oll-0.014
Token He
Token ID679
Feature activation+1.39e-03

Pos logit contributions
ocket+0.001
ils+0.001
yle+0.001
acey+0.001
aser+0.001
Neg logit contributions
 nod-0.000
 bark-0.000
 be-0.000
 joke-0.000
 barking-0.000
Token bar
Token ID2318
Feature activation-2.03e-02

Pos logit contributions
ocket+0.003
yle+0.003
acey+0.003
ays+0.003
ions+0.003
Neg logit contributions
 nod-0.002
 bark-0.002
 joke-0.002
 be-0.002
 roaring-0.002
Tokenks
Token ID591
Feature activation-3.51e-02

Pos logit contributions
 nod+0.035
 joke+0.032
 bark+0.032
 be+0.032
 roaring+0.031
Neg logit contributions
ocket-0.033
acey-0.033
yle-0.033
ays-0.032
oll-0.032
Token.
Token ID13
Feature activation-1.44e-03

Pos logit contributions
 nod+0.056
 bark+0.048
 be+0.047
 joke+0.047
 count+0.045
Neg logit contributions
acey-0.110
yle-0.109
ocket-0.109
ays-0.106
oll-0.104
Token\n
Token ID198
Feature activation-1.93e-02

Pos logit contributions
 nod+0.002
 bark+0.002
 joke+0.002
 be+0.002
 roaring+0.001
Neg logit contributions
ocket-0.003
yle-0.003
acey-0.003
ays-0.003
oll-0.003
Token\n
Token ID198
Feature activation-2.27e-02

Pos logit contributions
 nod+0.019
 bark+0.016
 joke+0.016
 be+0.016
 roaring+0.015
Neg logit contributions
ocket-0.046
yle-0.046
acey-0.045
oll-0.045
ays-0.045
TokenL
Token ID43
Feature activation+6.01e-02

Pos logit contributions
 nod+0.033
 joke+0.030
 bark+0.030
 be+0.029
 roaring+0.029
Neg logit contributions
ocket-0.044
acey-0.043
yle-0.043
ays-0.042
oll-0.042
Tokenily
Token ID813
Feature activation-2.91e-02

Pos logit contributions
ifying+0.083
yle+0.082
als+0.080
ocket+0.079
ays+0.077
Neg logit contributions
 be-0.110
 mistake-0.103
 nod-0.103
 joke-0.099
 count-0.098
Token tilted
Token ID37229
Feature activation-1.49e-01

Pos logit contributions
 nod+0.043
 roaring+0.042
sing+0.042
 matter+0.041
 joke+0.041
Neg logit contributions
acey-0.051
oll-0.050
ils-0.050
ocket-0.049
yle-0.048
Token the
Token ID262
Feature activation-2.70e-02

Pos logit contributions
 joke+0.318
 nod+0.302
 roaring+0.299
appointed+0.294
sing+0.293
Neg logit contributions
ays-0.163
yle-0.161
ocket-0.160
acey-0.150
ions-0.150
Token jar
Token ID17379
Feature activation+2.17e-02

Pos logit contributions
 nod+0.047
 bark+0.044
 joke+0.044
 be+0.043
 laugh+0.041
Neg logit contributions
yle-0.045
ays-0.045
acey-0.045
ions-0.044
ocket-0.043
Token slowly
Token ID6364
Feature activation-7.66e-02

Pos logit contributions
ocket+0.043
oll+0.042
ays+0.042
acey+0.041
yle+0.041
Neg logit contributions
 nod-0.029
 joke-0.027
 bark-0.025
 roaring-0.025
 be-0.024
Token and
Token ID290
Feature activation+2.12e-02

Pos logit contributions
 nod+0.108
 joke+0.105
 roaring+0.097
 bark+0.097
dy+0.096
Neg logit contributions
ocket-0.137
ays-0.136
yle-0.135
oll-0.133
acey-0.133
Token watched
Token ID7342
Feature activation-2.72e-02

Pos logit contributions
ocket+0.039
acey+0.039
yle+0.038
als+0.038
iza+0.037
Neg logit contributions
 joke-0.028
 nod-0.028
appointed-0.026
 roaring-0.026
 matter-0.025
Token again
Token ID757
Feature activation-4.68e-02

Pos logit contributions
 nod+0.101
 bark+0.081
 joke+0.080
appointed+0.080
 chuckle+0.079
Neg logit contributions
yle-0.125
ays-0.124
acey-0.120
ions-0.118
ocket-0.116
Token.
Token ID13
Feature activation+4.70e-03

Pos logit contributions
 nod+0.087
appointed+0.078
sing+0.076
 bark+0.074
 joke+0.074
Neg logit contributions
ays-0.066
yle-0.065
ions-0.061
ocket-0.060
acey-0.059
Token It
Token ID632
Feature activation-6.25e-02

Pos logit contributions
ays+0.008
acey+0.008
ocket+0.008
oll+0.008
ions+0.008
Neg logit contributions
 nod-0.007
 matter-0.006
 joke-0.006
 bark-0.006
 roaring-0.006
Token makes
Token ID1838
Feature activation-3.52e-02

Pos logit contributions
 nod+0.094
 joke+0.083
 bark+0.083
 roaring+0.081
 be+0.080
Neg logit contributions
acey-0.115
ocket-0.115
yle-0.114
ays-0.113
oll-0.111
Token a
Token ID257
Feature activation+2.62e-03

Pos logit contributions
 nod+0.054
 joke+0.048
 bark+0.048
 roaring+0.046
 be+0.046
Neg logit contributions
acey-0.065
ocket-0.065
yle-0.064
ays-0.063
oll-0.062
Token louder
Token ID27089
Feature activation-1.49e-01

Pos logit contributions
acey+0.005
yle+0.005
ocket+0.005
aser+0.004
ays+0.004
Neg logit contributions
 nod-0.003
appointed-0.003
 count-0.003
 reasoning-0.003
 bark-0.003
Token sound
Token ID2128
Feature activation-3.72e-02

Pos logit contributions
 nod+0.150
appointed+0.142
 count+0.122
die+0.120
 bark+0.119
Neg logit contributions
yle-0.314
ays-0.309
ocket-0.309
acey-0.303
overs-0.299
Token.
Token ID13
Feature activation+7.05e-03

Pos logit contributions
 nod+0.051
 bark+0.044
 joke+0.044
 be+0.042
 roaring+0.042
Neg logit contributions
ocket-0.074
acey-0.074
yle-0.074
ays-0.072
oll-0.072
Token Then
Token ID3244
Feature activation-2.68e-02

Pos logit contributions
acey+0.013
ays+0.013
ocket+0.013
ions+0.013
oll+0.013
Neg logit contributions
 nod-0.010
 joke-0.009
 bark-0.009
 matter-0.009
 roaring-0.009
Token it
Token ID340
Feature activation-2.79e-02

Pos logit contributions
 nod+0.042
 be+0.037
 bark+0.037
 joke+0.037
 roaring+0.036
Neg logit contributions
ocket-0.050
acey-0.050
yle-0.050
ays-0.048
oll-0.048
Token breaks
Token ID9457
Feature activation-3.98e-02

Pos logit contributions
 nod+0.046
 joke+0.040
 bark+0.040
 roaring+0.040
appointed+0.039
Neg logit contributions
ays-0.047
acey-0.047
ocket-0.046
yle-0.045
oll-0.045
Token a
Token ID257
Feature activation+3.26e-02

Pos logit contributions
ocket+0.115
acey+0.114
yle+0.114
ays+0.111
oll+0.111
Neg logit contributions
 nod-0.065
 joke-0.056
 bark-0.056
 be-0.055
 roaring-0.054
Token bit
Token ID1643
Feature activation+6.46e-03

Pos logit contributions
ocket+0.048
acey+0.047
yle+0.045
oll+0.045
ifying+0.044
Neg logit contributions
appointed-0.049
 nod-0.049
 argue-0.048
 barking-0.047
 count-0.046
Token stiff
Token ID15175
Feature activation-4.67e-02

Pos logit contributions
acey+0.013
oll+0.013
ocket+0.013
yle+0.013
ays+0.012
Neg logit contributions
 nod-0.008
 joke-0.008
 bark-0.007
sing-0.007
 count-0.007
Token.
Token ID13
Feature activation-3.09e-03

Pos logit contributions
 nod+0.058
 bark+0.050
 joke+0.050
 be+0.049
 roaring+0.048
Neg logit contributions
ocket-0.102
acey-0.102
yle-0.101
ays-0.099
oll-0.097
Token The
Token ID383
Feature activation-1.69e-02

Pos logit contributions
 nod+0.004
 joke+0.004
 bark+0.003
 roaring+0.003
 count+0.003
Neg logit contributions
acey-0.007
yle-0.006
ocket-0.006
ays-0.006
oll-0.006
Token taller
Token ID25242
Feature activation-1.49e-01

Pos logit contributions
 nod+0.023
 bark+0.021
sing+0.021
 barking+0.020
 roaring+0.020
Neg logit contributions
ocket-0.031
oll-0.031
acey-0.030
 register-0.030
yle-0.030
Token one
Token ID530
Feature activation+2.15e-03

Pos logit contributions
 nod+0.209
 joke+0.186
 bark+0.184
 roaring+0.182
 be+0.176
Neg logit contributions
ocket-0.290
yle-0.287
acey-0.286
oll-0.281
ays-0.279
Token thought
Token ID1807
Feature activation+4.55e-02

Pos logit contributions
ocket+0.004
yle+0.004
acey+0.004
oll+0.004
aser+0.004
Neg logit contributions
 nod-0.003
 joke-0.003
 bark-0.003
 roaring-0.003
 argue-0.003
Token that
Token ID326
Feature activation+4.32e-02

Pos logit contributions
ays+0.090
yle+0.090
acey+0.089
ocket+0.088
oll+0.084
Neg logit contributions
 nod-0.066
 be-0.062
 joke-0.059
 bark-0.058
 roaring-0.057
Token was
Token ID373
Feature activation+9.45e-03

Pos logit contributions
ocket+0.085
acey+0.084
yle+0.084
ays+0.082
oll+0.082
Neg logit contributions
 nod-0.061
 joke-0.055
 bark-0.055
 be-0.053
 roaring-0.053
Token a
Token ID257
Feature activation+2.14e-02

Pos logit contributions
ocket+0.018
acey+0.017
yle+0.017
oll+0.017
ions+0.017
Neg logit contributions
 nod-0.014
 bark-0.012
 joke-0.012
 roaring-0.012
 be-0.012
Token started
Token ID2067
Feature activation-3.67e-02

Pos logit contributions
ocket+0.063
acey+0.062
yle+0.061
oll+0.060
ays+0.060
Neg logit contributions
 nod-0.040
 bark-0.037
 joke-0.035
 be-0.035
 roaring-0.034
Token to
Token ID284
Feature activation-4.39e-02

Pos logit contributions
 nod+0.047
 matter+0.041
 joke+0.040
appointed+0.039
sing+0.039
Neg logit contributions
yle-0.073
ays-0.071
acey-0.070
ions-0.069
ocket-0.068
Token swing
Token ID9628
Feature activation-5.36e-04

Pos logit contributions
 nod+0.046
sing+0.043
 roaring+0.043
appointed+0.042
 phrase+0.042
Neg logit contributions
yle-0.084
acey-0.081
iza-0.078
ions-0.077
ocket-0.076
Token.
Token ID13
Feature activation-1.99e-03

Pos logit contributions
 nod+0.001
 bark+0.001
 be+0.001
 joke+0.001
 roaring+0.001
Neg logit contributions
ocket-0.001
acey-0.001
yle-0.001
oll-0.001
ays-0.001
Token The
Token ID383
Feature activation-2.74e-02

Pos logit contributions
 nod+0.002
 bark+0.002
 joke+0.002
 be+0.002
 roaring+0.002
Neg logit contributions
ocket-0.005
yle-0.005
acey-0.004
ils-0.004
oll-0.004
Token higher
Token ID2440
Feature activation-1.49e-01

Pos logit contributions
 nod+0.038
 bark+0.035
 be+0.035
 joke+0.035
 roaring+0.032
Neg logit contributions
ocket-0.057
acey-0.057
ays-0.057
yle-0.056
oll-0.055
Token Lin
Token ID5164
Feature activation-1.49e-03

Pos logit contributions
 nod+0.162
 be+0.159
 bark+0.145
 joke+0.139
 moment+0.139
Neg logit contributions
ocket-0.359
acey-0.355
oll-0.346
yle-0.338
ils-0.326
Tokenny
Token ID3281
Feature activation-3.64e-02

Pos logit contributions
 nod+0.002
 bark+0.002
 be+0.002
 moment+0.002
 barking+0.002
Neg logit contributions
ocket-0.003
oll-0.003
ions-0.003
acey-0.003
asers-0.003
Token swung
Token ID28507
Feature activation-1.55e-02

Pos logit contributions
 nod+0.046
 bark+0.043
 be+0.041
 joke+0.041
 roaring+0.039
Neg logit contributions
ocket-0.078
acey-0.078
yle-0.077
ays-0.076
oll-0.075
Token,
Token ID11
Feature activation-2.16e-03

Pos logit contributions
 nod+0.022
 joke+0.020
 roaring+0.019
sing+0.019
 bark+0.019
Neg logit contributions
yle-0.028
ays-0.028
ocket-0.028
ions-0.028
oll-0.028
Token the
Token ID262
Feature activation-1.40e-02

Pos logit contributions
 nod+0.003
 joke+0.002
 matter+0.002
 compromise+0.002
 argue+0.002
Neg logit contributions
yle-0.004
ays-0.004
acey-0.004
ocket-0.004
als-0.004
Token fluffy
Token ID39145
Feature activation-8.39e-02

Pos logit contributions
ocket+0.026
acey+0.025
yle+0.025
oll+0.025
ays+0.025
Neg logit contributions
 nod-0.015
 joke-0.012
 bark-0.012
 roaring-0.012
 be-0.012
Token and
Token ID290
Feature activation+1.48e-02

Pos logit contributions
 nod+0.123
 joke+0.110
appointed+0.107
 matter+0.107
sing+0.105
Neg logit contributions
ocket-0.156
ays-0.152
oll-0.151
acey-0.151
yle-0.151
Token had
Token ID550
Feature activation+4.79e-02

Pos logit contributions
ocket+0.031
acey+0.031
yle+0.031
ays+0.030
oll+0.030
Neg logit contributions
 nod-0.019
 bark-0.017
 joke-0.017
 be-0.016
 roaring-0.016
Token a
Token ID257
Feature activation+2.82e-02

Pos logit contributions
acey+0.101
ocket+0.101
yle+0.101
ays+0.099
oll+0.097
Neg logit contributions
 nod-0.061
 bark-0.058
 joke-0.056
 be-0.056
 roaring-0.054
Token w
Token ID266
Feature activation-1.20e-02

Pos logit contributions
ocket+0.051
acey+0.050
oll+0.047
yle+0.047
ays+0.046
Neg logit contributions
 nod-0.038
appointed-0.037
 matter-0.035
 count-0.034
 argue-0.033
Tokenagging
Token ID16406
Feature activation-1.49e-01

Pos logit contributions
 nod+0.021
 joke+0.019
 bark+0.019
 be+0.018
 roaring+0.018
Neg logit contributions
yle-0.020
acey-0.019
ocket-0.019
ays-0.019
ions-0.019
Token tail
Token ID7894
Feature activation-1.19e-03

Pos logit contributions
 nod+0.166
 count+0.129
 joke+0.126
 disagree+0.125
 roaring+0.124
Neg logit contributions
ocket-0.326
yle-0.320
ays-0.317
acey-0.313
ils-0.306
Token.
Token ID13
Feature activation-1.23e-04

Pos logit contributions
 nod+0.001
 joke+0.001
 bark+0.001
 be+0.001
 roaring+0.001
Neg logit contributions
ocket-0.003
acey-0.003
yle-0.003
ays-0.002
oll-0.002
Token Lily
Token ID20037
Feature activation+3.66e-03

Pos logit contributions
 nod+0.000
 bark+0.000
 be+0.000
 barking+0.000
 joke+0.000
Neg logit contributions
ils-0.000
ocket-0.000
yle-0.000
 register-0.000
acey-0.000
Token pointed
Token ID6235
Feature activation-5.19e-02

Pos logit contributions
acey+0.006
yle+0.006
ocket+0.006
ils+0.006
als+0.006
Neg logit contributions
 nod-0.006
 matter-0.005
sing-0.005
 joke-0.005
appointed-0.005
Token at
Token ID379
Feature activation-3.69e-02

Pos logit contributions
 nod+0.074
 joke+0.073
 roaring+0.068
sing+0.068
 bark+0.068
Neg logit contributions
ocket-0.095
yle-0.094
ays-0.094
acey-0.091
oll-0.089
Token They
Token ID1119
Feature activation-1.16e-02

Pos logit contributions
 nod+0.000
 joke+0.000
 bark+0.000
 roaring+0.000
 be+0.000
Neg logit contributions
acey-0.001
ocket-0.001
yle-0.001
ays-0.001
oll-0.001
Token were
Token ID547
Feature activation+2.08e-02

Pos logit contributions
 nod+0.017
 bark+0.016
 be+0.015
 joke+0.014
 laugh+0.013
Neg logit contributions
ays-0.024
ocket-0.024
ions-0.024
yle-0.024
acey-0.023
Token sorry
Token ID7926
Feature activation-3.16e-02

Pos logit contributions
yle+0.047
ocket+0.047
acey+0.046
ays+0.045
ions+0.045
Neg logit contributions
 nod-0.024
 bark-0.022
 be-0.021
 roaring-0.021
 joke-0.021
Token.
Token ID13
Feature activation-3.93e-03

Pos logit contributions
 nod+0.048
 bark+0.044
 joke+0.044
 roaring+0.043
 count+0.043
Neg logit contributions
oll-0.055
acey-0.054
ays-0.054
yle-0.053
ocket-0.052
Token They
Token ID1119
Feature activation-1.37e-02

Pos logit contributions
 nod+0.005
 joke+0.004
 be+0.004
 bark+0.004
 roaring+0.004
Neg logit contributions
ocket-0.009
yle-0.009
acey-0.008
oll-0.008
ays-0.008
Token wished
Token ID16555
Feature activation+9.46e-02

Pos logit contributions
 nod+0.020
 bark+0.018
 be+0.018
 joke+0.017
 roaring+0.016
Neg logit contributions
ocket-0.027
ays-0.027
yle-0.027
acey-0.027
ions-0.026
Token they
Token ID484
Feature activation+2.67e-02

Pos logit contributions
ocket+0.148
acey+0.143
yle+0.143
ays+0.139
oll+0.138
Neg logit contributions
 nod-0.177
 joke-0.163
 be-0.163
 bark-0.161
 roaring-0.158
Token had
Token ID550
Feature activation+4.42e-02

Pos logit contributions
ays+0.065
ocket+0.062
yle+0.062
acey+0.062
ions+0.061
Neg logit contributions
 nod-0.028
 be-0.027
 bark-0.026
 joke-0.024
 laugh-0.023
Token listened
Token ID16399
Feature activation-6.86e-02

Pos logit contributions
ils+0.089
acey+0.088
ocket+0.088
yle+0.087
oll+0.086
Neg logit contributions
 nod-0.056
 roaring-0.049
 chuckle-0.047
sing-0.047
appointed-0.046
Token to
Token ID284
Feature activation-3.84e-02

Pos logit contributions
 nod+0.084
 joke+0.075
 bark+0.074
 roaring+0.073
sing+0.070
Neg logit contributions
acey-0.142
ocket-0.142
yle-0.138
oll-0.138
ays-0.137
Token Mom
Token ID11254
Feature activation-2.69e-02

Pos logit contributions
 nod+0.051
 joke+0.045
 bark+0.045
 be+0.044
 roaring+0.043
Neg logit contributions
ocket-0.079
acey-0.079
yle-0.078
ays-0.077
oll-0.077
Token there
Token ID612
Feature activation-1.06e-02

Pos logit contributions
ocket+0.002
acey+0.002
yle+0.002
ays+0.002
oll+0.002
Neg logit contributions
 nod-0.002
 be-0.002
 bark-0.002
 joke-0.002
 moment-0.002
Token was
Token ID373
Feature activation+1.20e-02

Pos logit contributions
 nod+0.014
 joke+0.012
 bark+0.012
 roaring+0.012
 be+0.012
Neg logit contributions
ocket-0.022
acey-0.022
yle-0.022
ays-0.021
oll-0.021
Token a
Token ID257
Feature activation+2.75e-02

Pos logit contributions
ocket+0.028
yle+0.027
acey+0.027
asers+0.027
ays+0.027
Neg logit contributions
 nod-0.013
 bark-0.012
 be-0.012
 joke-0.012
 roaring-0.011
Token little
Token ID1310
Feature activation+3.18e-02

Pos logit contributions
acey+0.047
yle+0.046
ocket+0.045
oll+0.043
iza+0.042
Neg logit contributions
 nod-0.045
appointed-0.042
 roaring-0.040
 joke-0.040
 count-0.040
Token girl
Token ID2576
Feature activation+6.05e-02

Pos logit contributions
ocket+0.057
acey+0.056
yle+0.056
ays+0.055
oll+0.054
Neg logit contributions
 nod-0.052
 bark-0.048
 joke-0.047
 be-0.046
 roaring-0.045
Token named
Token ID3706
Feature activation+9.41e-02

Pos logit contributions
ocket+0.096
acey+0.094
yle+0.094
ays+0.093
oll+0.091
Neg logit contributions
 nod-0.109
 bark-0.102
 be-0.101
 joke-0.100
 roaring-0.098
Token Lily
Token ID20037
Feature activation+3.09e-02

Pos logit contributions
ocket+0.172
yle+0.172
acey+0.168
ays+0.168
oll+0.165
Neg logit contributions
 nod-0.147
 bark-0.136
 be-0.136
 joke-0.133
 count-0.128
Token.
Token ID13
Feature activation+3.76e-03

Pos logit contributions
yle+0.056
ocket+0.055
acey+0.055
oll+0.053
ays+0.052
Neg logit contributions
 nod-0.048
 joke-0.043
 roaring-0.042
appointed-0.041
sing-0.041
Token One
Token ID1881
Feature activation-4.41e-03

Pos logit contributions
acey+0.006
oll+0.005
ays+0.005
ocket+0.005
ions+0.005
Neg logit contributions
 nod-0.007
 roaring-0.007
 joke-0.006
 bark-0.006
 wob-0.006
Token day
Token ID1110
Feature activation+7.21e-02

Pos logit contributions
 nod+0.007
 argue+0.007
 roaring+0.007
 count+0.006
appointed+0.006
Neg logit contributions
oll-0.007
ocket-0.007
acey-0.007
yle-0.007
 album-0.007
Token,
Token ID11
Feature activation-4.42e-03

Pos logit contributions
acey+0.166
ocket+0.165
yle+0.164
ays+0.161
oll+0.161
Neg logit contributions
 nod-0.077
 joke-0.066
 bark-0.065
 roaring-0.063
 be-0.061
Token fox
Token ID21831
Feature activation+2.30e-02

Pos logit contributions
ocket+0.007
acey+0.007
yle+0.007
oll+0.007
ays+0.006
Neg logit contributions
 nod-0.006
 joke-0.006
 bark-0.005
appointed-0.005
sing-0.005
Token inside
Token ID2641
Feature activation-3.44e-02

Pos logit contributions
acey+0.054
yle+0.054
ocket+0.052
ays+0.052
oll+0.052
Neg logit contributions
 nod-0.026
 bark-0.023
 be-0.022
 roaring-0.022
 count-0.021
Token and
Token ID290
Feature activation+2.38e-02

Pos logit contributions
 nod+0.041
 bark+0.036
 joke+0.035
 be+0.035
 roaring+0.034
Neg logit contributions
acey-0.076
ocket-0.076
yle-0.076
oll-0.074
ays-0.074
Token they
Token ID484
Feature activation+3.34e-02

Pos logit contributions
ays+0.050
acey+0.049
ocket+0.049
yle+0.049
oll+0.048
Neg logit contributions
 nod-0.033
 bark-0.032
 be-0.030
 joke-0.030
 roaring-0.029
Token had
Token ID550
Feature activation+4.47e-02

Pos logit contributions
ays+0.068
yle+0.068
acey+0.067
ocket+0.066
asers+0.065
Neg logit contributions
 nod-0.047
 bark-0.046
 be-0.043
 joke-0.042
 laugh-0.041
Token lots
Token ID6041
Feature activation+1.02e-01

Pos logit contributions
ocket+0.080
acey+0.079
yle+0.078
ays+0.076
oll+0.076
Neg logit contributions
 nod-0.071
 joke-0.064
 bark-0.064
 be-0.062
 roaring-0.062
Token of
Token ID286
Feature activation+3.56e-02

Pos logit contributions
acey+0.143
yle+0.140
ocket+0.133
ays+0.131
 Hood+0.121
Neg logit contributions
 nod-0.216
 joke-0.204
 bark-0.199
 laugh-0.196
 roaring-0.194
Token fun
Token ID1257
Feature activation-5.56e-02

Pos logit contributions
ocket+0.091
acey+0.089
oll+0.088
yle+0.088
ils+0.087
Neg logit contributions
 nod-0.029
 bark-0.022
 joke-0.022
 be-0.022
 roaring-0.021
Token together
Token ID1978
Feature activation-9.87e-03

Pos logit contributions
 nod+0.077
 joke+0.069
 bark+0.068
 be+0.066
 roaring+0.066
Neg logit contributions
ocket-0.111
acey-0.111
yle-0.110
ays-0.107
oll-0.107
Token.
Token ID13
Feature activation-2.74e-03

Pos logit contributions
 nod+0.013
 joke+0.012
 bark+0.012
 be+0.011
 roaring+0.011
Neg logit contributions
ocket-0.020
yle-0.020
acey-0.020
ays-0.019
oll-0.019
Token They
Token ID1119
Feature activation-8.94e-03

Pos logit contributions
 nod+0.003
 bark+0.003
 joke+0.003
 be+0.003
 roaring+0.003
Neg logit contributions
ocket-0.006
yle-0.006
acey-0.006
oll-0.006
ays-0.006
Token,
Token ID11
Feature activation+3.31e-03

Pos logit contributions
ocket+0.107
ays+0.106
yle+0.106
acey+0.103
umbers+0.102
Neg logit contributions
 nod-0.032
 be-0.030
 count-0.028
 bark-0.028
 joke-0.025
Token a
Token ID257
Feature activation+3.04e-02

Pos logit contributions
ays+0.009
acey+0.008
yle+0.008
ocket+0.008
oll+0.008
Neg logit contributions
 be-0.003
 nod-0.003
 bark-0.003
 count-0.003
 joke-0.002
Token doctor
Token ID6253
Feature activation+2.78e-02

Pos logit contributions
ays+0.060
ocket+0.056
yle+0.056
ions+0.055
ils+0.055
Neg logit contributions
 nod-0.047
 joke-0.045
 bark-0.044
 be-0.044
 roaring-0.042
Token,
Token ID11
Feature activation+4.00e-03

Pos logit contributions
ocket+0.066
yle+0.066
ays+0.066
acey+0.066
oll+0.064
Neg logit contributions
 nod-0.028
 bark-0.024
 be-0.024
 joke-0.024
 count-0.023
Token and
Token ID290
Feature activation+2.14e-02

Pos logit contributions
yle+0.011
ays+0.011
acey+0.011
ocket+0.011
oll+0.010
Neg logit contributions
 be-0.003
 nod-0.003
 count-0.003
 bark-0.002
 joke-0.002
Token many
Token ID867
Feature activation+9.68e-02

Pos logit contributions
ays+0.059
umbers+0.054
ocket+0.053
yle+0.053
ase+0.053
Neg logit contributions
 be-0.023
 louder-0.020
 count-0.019
 listen-0.019
 bark-0.019
Token other
Token ID584
Feature activation+1.48e-02

Pos logit contributions
ays+0.206
ocket+0.194
yle+0.189
oll+0.188
acey+0.187
Neg logit contributions
 be-0.139
 joke-0.138
 nod-0.138
 laugh-0.134
 bark-0.132
Token things
Token ID1243
Feature activation+6.23e-02

Pos logit contributions
ays+0.027
ocket+0.027
yle+0.026
oll+0.026
acey+0.026
Neg logit contributions
 nod-0.024
 be-0.023
 joke-0.023
 bark-0.023
 count-0.022
Token.
Token ID13
Feature activation+1.19e-03

Pos logit contributions
yle+0.158
acey+0.158
ocket+0.156
ays+0.155
iza+0.150
Neg logit contributions
 nod-0.053
 be-0.050
 joke-0.050
 bark-0.050
 count-0.048
Token
Token ID198
Feature activation-1.64e-02

Pos logit contributions
ocket+0.003
ils+0.003
yle+0.003
iza+0.003
oll+0.003
Neg logit contributions
 nod-0.001
 be-0.001
 bark-0.001
 joke-0.001
 barking-0.001
Token
Token ID198
Feature activation-1.95e-02

Pos logit contributions
 nod+0.014
 joke+0.012
 bark+0.012
 be+0.011
 moment+0.010
Neg logit contributions
ocket-0.042
yle-0.041
oll-0.041
ays-0.040
acey-0.040
Token made
Token ID925
Feature activation-1.21e-03

Pos logit contributions
 nod+0.010
 bark+0.009
 be+0.009
 joke+0.008
 roaring+0.008
Neg logit contributions
ocket-0.019
yle-0.018
ays-0.018
ions-0.018
acey-0.018
Token a
Token ID257
Feature activation+1.67e-02

Pos logit contributions
 nod+0.002
 bark+0.002
 be+0.002
 joke+0.002
 roaring+0.002
Neg logit contributions
ocket-0.002
yle-0.002
ays-0.002
acey-0.002
oll-0.002
Token special
Token ID2041
Feature activation-2.29e-04

Pos logit contributions
acey+0.032
ocket+0.032
yle+0.031
oll+0.031
 register+0.030
Neg logit contributions
 nod-0.022
appointed-0.020
 roaring-0.019
 bark-0.019
 joke-0.019
Token dish
Token ID9433
Feature activation+4.29e-02

Pos logit contributions
 nod+0.000
 bark+0.000
 joke+0.000
 roaring+0.000
 be+0.000
Neg logit contributions
ocket-0.000
acey-0.000
yle-0.000
oll-0.000
ays-0.000
Token with
Token ID351
Feature activation+5.16e-02

Pos logit contributions
yle+0.093
ays+0.091
ocket+0.090
acey+0.090
ils+0.086
Neg logit contributions
 be-0.054
 nod-0.051
 joke-0.047
 count-0.046
 laugh-0.045
Token lots
Token ID6041
Feature activation+1.13e-01

Pos logit contributions
ays+0.114
yle+0.112
oll+0.111
ocket+0.111
acey+0.110
Neg logit contributions
 be-0.062
 bark-0.056
 nod-0.056
 joke-0.054
 count-0.050
Token of
Token ID286
Feature activation+3.96e-02

Pos logit contributions
acey+0.191
yle+0.186
ocket+0.180
ays+0.178
 Hood+0.177
Neg logit contributions
 nod-0.196
 be-0.186
 joke-0.183
 bark-0.179
 laugh-0.175
Token y
Token ID331
Feature activation-2.70e-02

Pos logit contributions
ocket+0.080
ays+0.080
yle+0.079
oll+0.077
acey+0.077
Neg logit contributions
 nod-0.055
 joke-0.052
 be-0.050
 bark-0.049
 matter-0.047
Tokenummy
Token ID13513
Feature activation+1.32e-03

Pos logit contributions
 nod+0.054
 joke+0.050
 bark+0.050
 be+0.049
 roaring+0.049
Neg logit contributions
ocket-0.037
acey-0.037
yle-0.036
ays-0.035
oll-0.035
Token vegetables
Token ID13701
Feature activation+1.28e-02

Pos logit contributions
ocket+0.002
yle+0.002
acey+0.002
ays+0.002
oll+0.002
Neg logit contributions
 nod-0.002
 joke-0.002
 bark-0.002
 be-0.002
 roaring-0.002
Token and
Token ID290
Feature activation+3.26e-02

Pos logit contributions
acey+0.034
yle+0.034
ays+0.032
ocket+0.032
oll+0.032
Neg logit contributions
 be-0.009
 nod-0.008
 bark-0.008
 count-0.007
 roaring-0.007
Token world
Token ID995
Feature activation+3.48e-02

Pos logit contributions
ocket+0.019
acey+0.019
yle+0.019
ays+0.018
oll+0.018
Neg logit contributions
 nod-0.015
 bark-0.013
 joke-0.013
 be-0.013
 roaring-0.013
Token around
Token ID1088
Feature activation-4.24e-02

Pos logit contributions
acey+0.067
ocket+0.066
yle+0.065
oll+0.064
ays+0.063
Neg logit contributions
 nod-0.052
 bark-0.048
 joke-0.048
 be-0.046
 roaring-0.046
Token him
Token ID683
Feature activation-1.39e-02

Pos logit contributions
 nod+0.054
 joke+0.051
 be+0.051
 bark+0.050
 roaring+0.049
Neg logit contributions
ocket-0.091
oll-0.091
acey-0.090
yle-0.087
asers-0.086
Token.
Token ID13
Feature activation-4.87e-03

Pos logit contributions
 nod+0.020
 joke+0.017
 bark+0.017
 roaring+0.016
appointed+0.016
Neg logit contributions
ocket-0.028
yle-0.027
acey-0.027
ays-0.026
oll-0.026
Token One
Token ID1881
Feature activation-1.13e-02

Pos logit contributions
 nod+0.006
 roaring+0.006
 joke+0.005
sing+0.005
 wob+0.005
Neg logit contributions
acey-0.010
ays-0.010
yle-0.010
oll-0.009
ions-0.009
Token day
Token ID1110
Feature activation+6.82e-02

Pos logit contributions
 nod+0.020
 argue+0.018
 roaring+0.018
appointed+0.017
 chuckle+0.017
Neg logit contributions
ocket-0.017
oll-0.017
acey-0.016
 album-0.016
ions-0.016
Token,
Token ID11
Feature activation-1.12e-02

Pos logit contributions
acey+0.144
ocket+0.141
yle+0.140
oll+0.139
ions+0.139
Neg logit contributions
 nod-0.083
 joke-0.073
 bark-0.071
sing-0.071
 roaring-0.070
Token Benny
Token ID44275
Feature activation+1.19e-02

Pos logit contributions
 nod+0.015
 bark+0.014
 joke+0.014
 be+0.013
 roaring+0.013
Neg logit contributions
ocket-0.023
acey-0.023
yle-0.023
ays-0.022
oll-0.022
Token went
Token ID1816
Feature activation+6.31e-02

Pos logit contributions
ocket+0.025
yle+0.025
acey+0.025
oll+0.025
ays+0.025
Neg logit contributions
 nod-0.015
 bark-0.015
 be-0.014
 joke-0.014
 roaring-0.013
Token on
Token ID319
Feature activation+2.61e-03

Pos logit contributions
ocket+0.120
acey+0.118
yle+0.116
ays+0.115
oll+0.115
Neg logit contributions
 nod-0.092
 joke-0.083
 bark-0.080
 be-0.079
 roaring-0.078
Token an
Token ID281
Feature activation+2.54e-02

Pos logit contributions
ocket+0.004
acey+0.003
 Hood+0.003
yle+0.003
ions+0.003
Neg logit contributions
 nod-0.005
 joke-0.005
 phrase-0.004
sing-0.004
appointed-0.004
Token run
Token ID1057
Feature activation+3.82e-03

Pos logit contributions
 nod+0.054
appointed+0.054
 roaring+0.051
 moment+0.050
 louder+0.050
Neg logit contributions
ocket-0.103
acey-0.101
oll-0.101
yle-0.100
ions-0.099
Token and
Token ID290
Feature activation+2.93e-02

Pos logit contributions
ocket+0.009
acey+0.009
yle+0.009
ays+0.009
oll+0.009
Neg logit contributions
 bark-0.004
 nod-0.004
 be-0.004
 joke-0.003
 roaring-0.003
Token play
Token ID711
Feature activation+1.88e-02

Pos logit contributions
 aisle+0.040
 Hood+0.036
ions+0.035
 everywhere+0.034
als+0.034
Neg logit contributions
 phrase-0.050
 violently-0.046
 mistake-0.045
 nod-0.043
 plea-0.042
Token outside
Token ID2354
Feature activation-7.16e-02

Pos logit contributions
ils+0.029
acey+0.029
yle+0.029
ocket+0.029
oll+0.028
Neg logit contributions
 nod-0.033
sing-0.029
 roaring-0.029
 joke-0.029
appointed-0.029
Token with
Token ID351
Feature activation+4.80e-02

Pos logit contributions
 nod+0.100
 joke+0.089
 bark+0.088
 be+0.086
 roaring+0.085
Neg logit contributions
ocket-0.141
acey-0.141
yle-0.140
ays-0.137
oll-0.136
Token his
Token ID465
Feature activation+2.70e-02

Pos logit contributions
ocket+0.105
acey+0.105
yle+0.105
ays+0.103
oll+0.102
Neg logit contributions
 nod-0.057
 joke-0.049
 bark-0.048
 roaring-0.048
 be-0.047
Token friends
Token ID2460
Feature activation+3.75e-02

Pos logit contributions
yle+0.037
oll+0.037
ocket+0.037
ays+0.036
acey+0.036
Neg logit contributions
 nod-0.050
appointed-0.046
 matter-0.045
 joke-0.045
 roaring-0.044
Token.
Token ID13
Feature activation+1.09e-04

Pos logit contributions
acey+0.073
yle+0.072
ocket+0.071
oll+0.070
ays+0.070
Neg logit contributions
 nod-0.053
 joke-0.048
 bark-0.047
 roaring-0.046
sing-0.045
Token One
Token ID1881
Feature activation-1.05e-02

Pos logit contributions
acey+0.000
ays+0.000
oll+0.000
ions+0.000
 Hood+0.000
Neg logit contributions
 nod-0.000
 roaring-0.000
 wob-0.000
 swell-0.000
 chuckle-0.000
Token day
Token ID1110
Feature activation+6.93e-02

Pos logit contributions
 nod+0.018
 argue+0.017
 roaring+0.017
 chuckle+0.017
 phrase+0.016
Neg logit contributions
oll-0.015
 album-0.014
acey-0.014
ocket-0.014
ions-0.014
Token,
Token ID11
Feature activation-1.02e-02

Pos logit contributions
acey+0.144
ions+0.141
ays+0.141
yle+0.141
ocket+0.141
Neg logit contributions
 nod-0.086
 joke-0.075
 bark-0.074
 roaring-0.072
sing-0.071
Token.
Token ID13
Feature activation+1.84e-03

Pos logit contributions
acey+0.092
ocket+0.091
yle+0.090
oll+0.088
ays+0.088
Neg logit contributions
 nod-0.048
 bark-0.046
 roaring-0.044
 joke-0.043
 barking-0.043
Token They
Token ID1119
Feature activation-1.03e-02

Pos logit contributions
ocket+0.004
yle+0.004
acey+0.004
oll+0.004
ils+0.004
Neg logit contributions
 nod-0.002
 bark-0.002
 be-0.002
 joke-0.002
 roaring-0.002
Token were
Token ID547
Feature activation+1.96e-02

Pos logit contributions
 nod+0.014
 bark+0.013
 joke+0.013
 be+0.012
 roaring+0.012
Neg logit contributions
ocket-0.021
acey-0.020
yle-0.020
ays-0.020
oll-0.020
Token scared
Token ID12008
Feature activation-5.10e-02

Pos logit contributions
ocket+0.040
acey+0.040
yle+0.040
oll+0.040
ays+0.039
Neg logit contributions
 nod-0.025
 joke-0.022
 moment-0.020
 bark-0.020
 count-0.020
Token and
Token ID290
Feature activation+7.94e-03

Pos logit contributions
 nod+0.059
 bark+0.053
 be+0.053
 roaring+0.051
 barking+0.051
Neg logit contributions
ocket-0.119
acey-0.116
yle-0.114
oll-0.110
ays-0.110
Token didn
Token ID1422
Feature activation+1.90e-02

Pos logit contributions
ocket+0.019
yle+0.019
acey+0.019
ays+0.019
oll+0.018
Neg logit contributions
 nod-0.009
 bark-0.008
 be-0.008
 joke-0.007
 roaring-0.007
Token't
Token ID470
Feature activation-6.09e-02

Pos logit contributions
ocket+0.039
acey+0.037
yle+0.037
ils+0.036
oll+0.036
Neg logit contributions
 nod-0.026
 be-0.025
 bark-0.024
 joke-0.024
 roaring-0.023
Token know
Token ID760
Feature activation+3.90e-02

Pos logit contributions
sing+0.090
appointed+0.083
 nod+0.080
 reasoning+0.077
 roaring+0.076
Neg logit contributions
ocket-0.099
izards-0.098
oll-0.096
 Hood-0.093
acey-0.093
Token what
Token ID644
Feature activation+6.62e-02

Pos logit contributions
acey+0.075
ocket+0.074
ays+0.073
yle+0.072
oll+0.070
Neg logit contributions
 nod-0.059
 be-0.054
 bark-0.054
 joke-0.054
 roaring-0.054
Token to
Token ID284
Feature activation-3.75e-02

Pos logit contributions
ocket+0.134
acey+0.133
yle+0.131
ions+0.130
ils+0.128
Neg logit contributions
 nod-0.087
appointed-0.074
sing-0.073
 joke-0.071
 roaring-0.071
Token do
Token ID466
Feature activation-4.08e-03

Pos logit contributions
 nod+0.048
 roaring+0.043
appointed+0.042
sing+0.042
 joke+0.040
Neg logit contributions
ocket-0.073
yle-0.073
acey-0.069
ils-0.068
ays-0.068
Token.
Token ID13
Feature activation+2.59e-03

Pos logit contributions
 joke+0.002
 nod+0.002
 be+0.002
 count+0.002
 bark+0.002
Neg logit contributions
ocket-0.005
yle-0.005
acey-0.005
ays-0.005
asers-0.005
Token It
Token ID632
Feature activation-6.03e-02

Pos logit contributions
ocket+0.004
acey+0.004
yle+0.004
ays+0.004
oll+0.004
Neg logit contributions
 nod-0.004
 joke-0.004
 bark-0.004
 roaring-0.004
 be-0.004
Token's
Token ID338
Feature activation-4.56e-02

Pos logit contributions
 nod+0.089
 joke+0.082
 bark+0.081
 be+0.080
 roaring+0.078
Neg logit contributions
ocket-0.116
yle-0.115
acey-0.114
ays-0.111
oll-0.110
Token not
Token ID407
Feature activation-5.67e-02

Pos logit contributions
 nod+0.056
 joke+0.049
 bark+0.049
 be+0.047
 roaring+0.047
Neg logit contributions
ocket-0.098
acey-0.097
yle-0.097
ays-0.095
oll-0.094
Token good
Token ID922
Feature activation+4.44e-03

Pos logit contributions
 nod+0.069
 joke+0.063
 bark+0.062
 be+0.062
 roaring+0.060
Neg logit contributions
yle-0.124
ocket-0.123
acey-0.122
ays-0.121
oll-0.117
Token for
Token ID329
Feature activation+8.37e-02

Pos logit contributions
ocket+0.010
acey+0.010
yle+0.010
ays+0.009
oll+0.009
Neg logit contributions
 nod-0.005
 joke-0.005
 bark-0.005
 be-0.005
 roaring-0.005
Token dogs
Token ID6844
Feature activation-1.96e-02

Pos logit contributions
ocket+0.137
acey+0.135
yle+0.135
ays+0.134
oll+0.131
Neg logit contributions
 nod-0.143
 be-0.142
 joke-0.140
 roaring-0.138
 bark-0.137
Token."
Token ID526
Feature activation-1.94e-02

Pos logit contributions
 nod+0.018
 be+0.017
 bark+0.017
 joke+0.016
 quarrel+0.015
Neg logit contributions
yle-0.049
acey-0.049
ocket-0.049
oll-0.047
ays-0.047
Token Max
Token ID5436
Feature activation-8.71e-04

Pos logit contributions
 nod+0.023
 bark+0.020
 joke+0.019
 be+0.019
 roaring+0.019
Neg logit contributions
ocket-0.043
acey-0.043
yle-0.043
ays-0.042
oll-0.042
Token looked
Token ID3114
Feature activation-2.73e-02

Pos logit contributions
sing+0.001
 nod+0.001
 roaring+0.001
 moment+0.001
appointed+0.001
Neg logit contributions
acey-0.002
ils-0.001
yle-0.001
oll-0.001
ays-0.001
Token sad
Token ID6507
Feature activation-1.37e-03

Pos logit contributions
 nod+0.054
 joke+0.052
 bark+0.050
 be+0.050
 count+0.050
Neg logit contributions
acey-0.036
ocket-0.036
ays-0.036
yle-0.035
oll-0.034
Token the
Token ID262
Feature activation+9.65e-03

Pos logit contributions
 nod+0.069
 be+0.062
 bark+0.061
 joke+0.061
 roaring+0.059
Neg logit contributions
ocket-0.105
acey-0.104
oll-0.104
yle-0.103
ays-0.102
Token truck
Token ID7779
Feature activation-8.93e-03

Pos logit contributions
yle+0.018
acey+0.017
ocket+0.017
oll+0.017
ays+0.017
Neg logit contributions
 nod-0.017
 be-0.015
 joke-0.015
 bark-0.015
 roaring-0.014
Token.
Token ID13
Feature activation-3.01e-03

Pos logit contributions
 nod+0.008
 bark+0.007
 be+0.007
 count+0.006
 joke+0.006
Neg logit contributions
acey-0.023
yle-0.022
ocket-0.022
oll-0.021
ays-0.021
Token They
Token ID1119
Feature activation-1.37e-02

Pos logit contributions
 nod+0.004
 joke+0.003
 bark+0.003
 be+0.003
 roaring+0.003
Neg logit contributions
ocket-0.006
acey-0.006
yle-0.006
ays-0.006
oll-0.006
Token put
Token ID1234
Feature activation+4.64e-02

Pos logit contributions
 nod+0.019
 bark+0.017
 be+0.016
 joke+0.016
 laugh+0.015
Neg logit contributions
ays-0.029
ocket-0.029
yle-0.028
acey-0.028
ions-0.028
Token books
Token ID3835
Feature activation+3.27e-02

Pos logit contributions
ocket+0.107
acey+0.104
yle+0.103
oll+0.103
izards+0.097
Neg logit contributions
 nod-0.057
 be-0.051
 bark-0.049
 moment-0.046
 laugh-0.045
Token,
Token ID11
Feature activation-1.88e-02

Pos logit contributions
yle+0.066
acey+0.065
ocket+0.064
oll+0.060
ays+0.060
Neg logit contributions
 nod-0.049
 be-0.044
 laugh-0.042
 bark-0.041
 moment-0.041
Token balls
Token ID11333
Feature activation+9.10e-03

Pos logit contributions
 nod+0.032
 bark+0.029
 be+0.028
 joke+0.028
 laugh+0.028
Neg logit contributions
yle-0.034
ocket-0.033
acey-0.033
ays-0.033
oll-0.032
Token,
Token ID11
Feature activation-2.42e-02

Pos logit contributions
yle+0.026
acey+0.026
ocket+0.025
ays+0.024
oll+0.024
Neg logit contributions
 nod-0.006
 be-0.006
 bark-0.005
 laugh-0.004
 count-0.004
Token dolls
Token ID36062
Feature activation+2.06e-02

Pos logit contributions
 nod+0.036
 bark+0.033
 be+0.033
 count+0.031
 roaring+0.031
Neg logit contributions
yle-0.049
acey-0.048
ocket-0.048
ays-0.047
oll-0.046
Token,
Token ID11
Feature activation-2.44e-02

Pos logit contributions
yle+0.055
ocket+0.055
acey+0.054
ays+0.054
oll+0.053
Neg logit contributions
 nod-0.015
 be-0.014
 bark-0.012
 count-0.011
 roaring-0.011
Token said
Token ID531
Feature activation+5.03e-02

Pos logit contributions
acey+0.065
ocket+0.064
yle+0.064
oll+0.062
ays+0.062
Neg logit contributions
 nod-0.055
 joke-0.051
 roaring-0.049
 bark-0.048
sing-0.048
Token,
Token ID11
Feature activation-9.91e-03

Pos logit contributions
ocket+0.117
acey+0.117
yle+0.116
oll+0.114
ays+0.113
Neg logit contributions
 nod-0.052
 joke-0.045
 bark-0.044
 roaring-0.043
 be-0.042
Token "
Token ID366
Feature activation+4.07e-03

Pos logit contributions
 matter+0.015
 nod+0.015
 joke+0.014
 wob+0.013
sing+0.013
Neg logit contributions
acey-0.017
yle-0.016
ocket-0.016
oll-0.016
 register-0.015
TokenOf
Token ID5189
Feature activation+3.89e-02

Pos logit contributions
ocket+0.008
acey+0.008
yle+0.008
ays+0.008
oll+0.008
Neg logit contributions
 nod-0.006
 joke-0.005
 bark-0.005
 roaring-0.005
 be-0.005
Token course
Token ID1781
Feature activation+5.11e-02

Pos logit contributions
ocket+0.044
acey+0.043
yle+0.043
oll+0.041
ays+0.041
Neg logit contributions
 nod-0.088
 joke-0.082
 bark-0.082
 roaring-0.080
 be-0.079
Token,
Token ID11
Feature activation+6.91e-03

Pos logit contributions
ocket+0.112
acey+0.112
yle+0.111
ays+0.109
oll+0.108
Neg logit contributions
 nod-0.061
 bark-0.053
 joke-0.053
 be-0.052
 roaring-0.051
Token you
Token ID345
Feature activation+2.94e-02

Pos logit contributions
ays+0.012
ocket+0.012
acey+0.012
yle+0.012
 register+0.011
Neg logit contributions
 nod-0.012
 be-0.012
 joke-0.011
 bark-0.011
 matter-0.011
Token can
Token ID460
Feature activation-5.44e-02

Pos logit contributions
yle+0.050
ays+0.048
ocket+0.048
acey+0.047
ions+0.046
Neg logit contributions
 nod-0.052
 be-0.047
 joke-0.047
 bark-0.046
 count-0.045
Token hug
Token ID16225
Feature activation+6.63e-03

Pos logit contributions
 roaring+0.065
 nod+0.063
sing+0.059
 joke+0.058
 barking+0.057
Neg logit contributions
acey-0.115
yle-0.110
ocket-0.109
oll-0.109
ions-0.105
Token him
Token ID683
Feature activation-7.15e-03

Pos logit contributions
ocket+0.013
ays+0.012
acey+0.012
yle+0.012
oll+0.012
Neg logit contributions
 nod-0.010
 be-0.010
 bark-0.010
 joke-0.009
 roaring-0.009
Token as
Token ID355
Feature activation+4.52e-02

Pos logit contributions
 nod+0.008
 bark+0.008
 be+0.007
 joke+0.007
 roaring+0.007
Neg logit contributions
ocket-0.016
yle-0.016
acey-0.016
ays-0.015
oll-0.015
Token
Token ID198
Feature activation-2.03e-02

Pos logit contributions
 nod+0.018
 joke+0.015
 bark+0.015
 be+0.014
 roaring+0.014
Neg logit contributions
ocket-0.041
yle-0.040
acey-0.040
ays-0.040
oll-0.040
TokenThe
Token ID464
Feature activation-2.45e-02

Pos logit contributions
 nod+0.025
 joke+0.022
 bark+0.022
 be+0.021
 roaring+0.021
Neg logit contributions
ocket-0.044
acey-0.044
yle-0.044
ays-0.043
oll-0.043
Token daughter
Token ID4957
Feature activation+4.01e-02

Pos logit contributions
 nod+0.037
 be+0.034
 joke+0.033
 bark+0.033
 matter+0.032
Neg logit contributions
ays-0.049
ocket-0.048
ions-0.048
yle-0.047
izards-0.046
Token was
Token ID373
Feature activation-9.37e-04

Pos logit contributions
ocket+0.090
ays+0.089
yle+0.088
acey+0.087
ions+0.086
Neg logit contributions
 nod-0.048
 bark-0.043
 be-0.043
 joke-0.040
 count-0.039
Token very
Token ID845
Feature activation+9.61e-03

Pos logit contributions
 nod+0.002
 joke+0.001
 bark+0.001
 roaring+0.001
 be+0.001
Neg logit contributions
acey-0.002
ocket-0.002
yle-0.002
oll-0.002
ays-0.002
Token happy
Token ID3772
Feature activation-1.11e-03

Pos logit contributions
acey+0.022
yle+0.022
ocket+0.022
oll+0.021
ays+0.021
Neg logit contributions
 nod-0.010
 bark-0.009
 joke-0.009
 roaring-0.009
 count-0.008
Token and
Token ID290
Feature activation+2.26e-02

Pos logit contributions
 nod+0.002
 joke+0.001
 bark+0.001
 be+0.001
 roaring+0.001
Neg logit contributions
ocket-0.002
acey-0.002
yle-0.002
ays-0.002
oll-0.002
Token said
Token ID531
Feature activation+4.31e-02

Pos logit contributions
ays+0.048
ocket+0.048
acey+0.047
ions+0.047
yle+0.047
Neg logit contributions
 nod-0.030
 bark-0.029
 be-0.029
 joke-0.026
 laugh-0.026
Token,
Token ID11
Feature activation-1.52e-02

Pos logit contributions
ocket+0.102
acey+0.101
yle+0.101
ays+0.099
oll+0.098
Neg logit contributions
 nod-0.044
 joke-0.038
 bark-0.037
 be-0.036
 roaring-0.036
Token "
Token ID366
Feature activation+1.50e-03

Pos logit contributions
 nod+0.020
 matter+0.019
 joke+0.019
sing+0.019
 roaring+0.018
Neg logit contributions
acey-0.028
yle-0.028
ocket-0.027
 register-0.026
overs-0.026
TokenThank
Token ID10449
Feature activation+3.72e-02

Pos logit contributions
yle+0.003
acey+0.003
ocket+0.003
ays+0.003
oll+0.003
Neg logit contributions
 nod-0.002
 joke-0.002
 bark-0.002
 roaring-0.002
 be-0.002
Token,
Token ID11
Feature activation-6.61e-04

Pos logit contributions
 nod+0.028
 bark+0.025
 joke+0.025
 roaring+0.023
 be+0.023
Neg logit contributions
acey-0.041
ocket-0.040
yle-0.040
ays-0.040
ions-0.039
Token Ally
Token ID43039
Feature activation+3.99e-02

Pos logit contributions
 nod+0.001
 be+0.001
 bark+0.001
 joke+0.001
 laugh+0.001
Neg logit contributions
ocket-0.001
oll-0.001
ays-0.001
asers-0.001
yle-0.001
Token smiled
Token ID13541
Feature activation-3.04e-02

Pos logit contributions
ions+0.094
ocket+0.093
ays+0.093
asers+0.093
oll+0.092
Neg logit contributions
 nod-0.042
 bark-0.042
 be-0.042
 pur-0.035
 laugh-0.033
Token and
Token ID290
Feature activation+3.75e-02

Pos logit contributions
 nod+0.040
 joke+0.038
 bark+0.036
 roaring+0.035
 count+0.035
Neg logit contributions
ocket-0.060
yle-0.059
acey-0.059
ays-0.058
oll-0.058
Token said
Token ID531
Feature activation+5.12e-02

Pos logit contributions
ocket+0.083
ays+0.082
acey+0.081
yle+0.081
oll+0.079
Neg logit contributions
 nod-0.047
 bark-0.044
 be-0.043
 joke-0.040
 laugh-0.039
Token,
Token ID11
Feature activation-3.88e-03

Pos logit contributions
ocket+0.116
acey+0.115
yle+0.115
oll+0.112
ays+0.112
Neg logit contributions
 nod-0.056
 joke-0.048
 bark-0.048
 roaring-0.046
 be-0.046
Token "
Token ID366
Feature activation+8.83e-03

Pos logit contributions
 nod+0.005
 matter+0.005
 joke+0.004
 count+0.004
sing+0.004
Neg logit contributions
acey-0.008
yle-0.008
ocket-0.008
oll-0.007
ions-0.007
TokenWe
Token ID1135
Feature activation-1.61e-02

Pos logit contributions
yle+0.016
acey+0.016
ocket+0.015
ays+0.015
oll+0.015
Neg logit contributions
 nod-0.014
 joke-0.013
 bark-0.013
 roaring-0.013
 be-0.012
Token did
Token ID750
Feature activation-2.98e-02

Pos logit contributions
 nod+0.024
 be+0.021
 joke+0.021
 laugh+0.020
 bark+0.020
Neg logit contributions
yle-0.032
ays-0.032
ocket-0.032
acey-0.031
oll-0.030
Token it
Token ID340
Feature activation-1.62e-02

Pos logit contributions
 nod+0.051
 bark+0.046
 joke+0.045
 roaring+0.045
 be+0.044
Neg logit contributions
acey-0.049
ocket-0.049
yle-0.049
ays-0.047
oll-0.047
Token!
Token ID0
Feature activation+6.21e-03

Pos logit contributions
 nod+0.018
 be+0.017
 joke+0.016
 bark+0.016
 count+0.016
Neg logit contributions
ocket-0.038
yle-0.037
acey-0.036
ays-0.036
oll-0.035
Token smile
Token ID8212
Feature activation-3.00e-02

Pos logit contributions
 nod+0.024
 joke+0.020
 bark+0.019
 roaring+0.019
 matter+0.018
Neg logit contributions
ocket-0.069
yle-0.068
acey-0.067
oll-0.066
ays-0.065
Token.
Token ID13
Feature activation-5.16e-03

Pos logit contributions
 nod+0.043
 joke+0.038
 bark+0.037
 count+0.036
appointed+0.036
Neg logit contributions
ocket-0.057
yle-0.056
acey-0.056
oll-0.055
ays-0.055
Token So
Token ID1406
Feature activation+1.11e-02

Pos logit contributions
 nod+0.006
 joke+0.005
 bark+0.005
 roaring+0.005
 be+0.005
Neg logit contributions
acey-0.011
yle-0.011
ocket-0.011
ays-0.011
oll-0.011
Token he
Token ID339
Feature activation+3.29e-02

Pos logit contributions
ocket+0.022
acey+0.022
yle+0.022
ays+0.021
oll+0.021
Neg logit contributions
 nod-0.016
 joke-0.014
 bark-0.014
 be-0.014
 roaring-0.014
Token watched
Token ID7342
Feature activation-2.70e-02

Pos logit contributions
ocket+0.068
acey+0.067
yle+0.067
ays+0.066
oll+0.065
Neg logit contributions
 nod-0.044
 bark-0.040
 joke-0.039
 be-0.038
 roaring-0.037
Token it
Token ID340
Feature activation-2.46e-02

Pos logit contributions
 nod+0.043
 joke+0.039
 bark+0.038
 be+0.037
 roaring+0.037
Neg logit contributions
ocket-0.048
yle-0.047
acey-0.047
ays-0.046
oll-0.045
Token for
Token ID329
Feature activation+7.01e-02

Pos logit contributions
 nod+0.032
appointed+0.031
 matter+0.030
die+0.030
 reasoning+0.029
Neg logit contributions
ocket-0.042
oll-0.041
als-0.040
ays-0.040
yle-0.040
Token a
Token ID257
Feature activation+1.91e-02

Pos logit contributions
acey+0.131
ocket+0.131
yle+0.131
ays+0.129
oll+0.126
Neg logit contributions
 nod-0.106
 bark-0.096
 joke-0.096
 be-0.093
 roaring-0.093
Token while
Token ID981
Feature activation+4.41e-02

Pos logit contributions
ocket+0.036
acey+0.035
yle+0.034
oll+0.033
 album+0.032
Neg logit contributions
 nod-0.024
appointed-0.022
 roaring-0.021
 reasoning-0.020
 wob-0.020
Token.
Token ID13
Feature activation-8.53e-03

Pos logit contributions
ocket+0.096
acey+0.096
yle+0.095
ays+0.094
oll+0.093
Neg logit contributions
 nod-0.053
 bark-0.046
 joke-0.046
 be-0.045
 roaring-0.044
Token\n
Token ID198
Feature activation-2.10e-02

Pos logit contributions
 nod+0.011
 joke+0.010
 bark+0.009
 roaring+0.009
 be+0.009
Neg logit contributions
acey-0.018
yle-0.017
ays-0.017
ocket-0.017
oll-0.017
TokenOk
Token ID18690
Feature activation-3.96e-02

Pos logit contributions
 nod+0.001
 joke+0.001
 bark+0.001
 roaring+0.001
 be+0.001
Neg logit contributions
yle-0.001
acey-0.001
ocket-0.001
ays-0.001
oll-0.001
Token!"
Token ID2474
Feature activation-7.26e-03

Pos logit contributions
 nod+0.047
 joke+0.043
 bark+0.041
 roaring+0.041
sing+0.039
Neg logit contributions
acey-0.084
yle-0.084
ocket-0.084
oll-0.081
ays-0.080
Token Then
Token ID3244
Feature activation-2.30e-02

Pos logit contributions
 nod+0.008
 bark+0.007
 be+0.007
 joke+0.006
 roaring+0.006
Neg logit contributions
ocket-0.017
yle-0.017
acey-0.017
ils-0.017
oll-0.016
Token she
Token ID673
Feature activation+4.35e-02

Pos logit contributions
 nod+0.031
 be+0.028
 bark+0.026
 count+0.025
 laugh+0.024
Neg logit contributions
ocket-0.052
ays-0.050
yle-0.050
acey-0.049
oll-0.049
Token took
Token ID1718
Feature activation+7.54e-02

Pos logit contributions
ocket+0.092
yle+0.091
ays+0.091
acey+0.090
ions+0.088
Neg logit contributions
 nod-0.058
 bark-0.053
 be-0.051
 joke-0.049
 count-0.048
Token the
Token ID262
Feature activation-7.34e-03

Pos logit contributions
ocket+0.173
ays+0.162
yle+0.161
oll+0.157
asers+0.157
Neg logit contributions
 nod-0.091
 bark-0.080
 be-0.079
 count-0.077
 laugh-0.074
Token bald
Token ID28273
Feature activation-7.59e-02

Pos logit contributions
 nod+0.010
 joke+0.009
 bark+0.009
 be+0.009
 roaring+0.009
Neg logit contributions
ocket-0.015
acey-0.015
ays-0.015
yle-0.015
oll-0.014
Token man
Token ID582
Feature activation+1.51e-02

Pos logit contributions
 nod+0.109
 joke+0.097
 bark+0.096
 roaring+0.095
 be+0.093
Neg logit contributions
ocket-0.145
acey-0.144
yle-0.144
oll-0.139
ays-0.137
Token's
Token ID338
Feature activation-4.55e-02

Pos logit contributions
acey+0.032
ocket+0.032
ays+0.032
yle+0.032
oll+0.031
Neg logit contributions
 nod-0.019
 bark-0.017
 be-0.017
 roaring-0.016
 count-0.016
Token hand
Token ID1021
Feature activation+3.47e-02

Pos logit contributions
 nod+0.064
 bark+0.056
 be+0.056
 joke+0.056
 roaring+0.053
Neg logit contributions
acey-0.094
ays-0.093
ocket-0.093
yle-0.092
oll-0.090
Token and
Token ID290
Feature activation+2.94e-02

Pos logit contributions
ocket+0.078
acey+0.078
yle+0.077
ays+0.075
oll+0.075
Neg logit contributions
 nod-0.040
 bark-0.035
 be-0.034
 joke-0.033
 roaring-0.033
Token excited
Token ID6568
Feature activation-4.73e-03

Pos logit contributions
ays+0.056
ocket+0.055
ions+0.055
acey+0.054
yle+0.054
Neg logit contributions
 nod-0.016
 be-0.014
 laugh-0.013
 bark-0.013
 joke-0.013
Token because
Token ID780
Feature activation+4.09e-02

Pos logit contributions
 nod+0.005
 bark+0.005
 be+0.005
 joke+0.005
 roaring+0.004
Neg logit contributions
acey-0.011
ocket-0.011
yle-0.011
ays-0.010
oll-0.010
Token it
Token ID340
Feature activation-1.51e-02

Pos logit contributions
acey+0.089
ocket+0.086
ays+0.085
ils+0.085
asers+0.083
Neg logit contributions
 be-0.055
 nod-0.050
 matter-0.048
 count-0.047
 twist-0.046
Token was
Token ID373
Feature activation+9.73e-03

Pos logit contributions
 nod+0.016
 bark+0.015
 be+0.015
 joke+0.014
 count+0.013
Neg logit contributions
ays-0.035
ocket-0.035
yle-0.034
acey-0.034
oll-0.034
Token so
Token ID523
Feature activation+2.03e-02

Pos logit contributions
ays+0.021
yle+0.021
acey+0.021
ocket+0.020
asers+0.020
Neg logit contributions
 be-0.012
 nod-0.012
 bark-0.011
 roaring-0.011
 joke-0.011
Token bright
Token ID6016
Feature activation-8.87e-02

Pos logit contributions
ays+0.049
acey+0.048
ocket+0.048
yle+0.047
asers+0.047
Neg logit contributions
 nod-0.020
 be-0.020
 joke-0.018
 bark-0.018
 laugh-0.018
Token and
Token ID290
Feature activation+2.29e-02

Pos logit contributions
 nod+0.082
 bark+0.071
 be+0.070
 roaring+0.067
 joke+0.066
Neg logit contributions
acey-0.222
ocket-0.221
yle-0.216
asers-0.207
umbers-0.205
Token attractive
Token ID10966
Feature activation-3.53e-02

Pos logit contributions
ocket+0.049
acey+0.049
yle+0.049
ays+0.048
oll+0.047
Neg logit contributions
 nod-0.028
 bark-0.025
 joke-0.025
 be-0.024
 roaring-0.024
Token!
Token ID0
Feature activation+4.81e-03

Pos logit contributions
 nod+0.034
 bark+0.030
 be+0.030
 joke+0.030
 roaring+0.029
Neg logit contributions
acey-0.088
ocket-0.085
yle-0.084
ays-0.082
oll-0.082
Token\n
Token ID198
Feature activation-1.54e-02

Pos logit contributions
ocket+0.011
acey+0.011
ils+0.011
yle+0.011
oll+0.011
Neg logit contributions
 nod-0.005
 be-0.005
 bark-0.005
 joke-0.004
appointed-0.004
Token\n
Token ID198
Feature activation-2.00e-02

Pos logit contributions
 nod+0.016
 bark+0.014
 joke+0.013
 be+0.013
 roaring+0.013
Neg logit contributions
ocket-0.036
acey-0.036
yle-0.036
ays-0.035
oll-0.035
Token big
Token ID1263
Feature activation-1.29e-02

Pos logit contributions
acey+0.023
yle+0.022
ays+0.022
ocket+0.022
oll+0.021
Neg logit contributions
 nod-0.015
 bark-0.014
 joke-0.014
 be-0.014
 roaring-0.013
Token shadows
Token ID16187
Feature activation+7.97e-03

Pos logit contributions
 nod+0.021
 joke+0.019
appointed+0.019
 bark+0.019
 be+0.018
Neg logit contributions
ocket-0.022
ays-0.021
yle-0.021
oll-0.021
acey-0.020
Token.
Token ID13
Feature activation-6.27e-03

Pos logit contributions
acey+0.019
yle+0.018
ocket+0.018
ays+0.018
oll+0.017
Neg logit contributions
 nod-0.008
 bark-0.008
 roaring-0.007
 joke-0.007
 be-0.007
Token She
Token ID1375
Feature activation+1.08e-02

Pos logit contributions
 nod+0.009
 bark+0.008
 joke+0.008
 be+0.008
 roaring+0.008
Neg logit contributions
ocket-0.012
acey-0.012
yle-0.012
ays-0.012
oll-0.012
Token heard
Token ID2982
Feature activation-4.80e-03

Pos logit contributions
ocket+0.022
acey+0.022
yle+0.022
oll+0.021
ays+0.021
Neg logit contributions
 nod-0.015
 joke-0.013
 roaring-0.012
 bark-0.012
 be-0.012
Token loud
Token ID7812
Feature activation-7.91e-02

Pos logit contributions
 nod+0.007
 bark+0.006
 joke+0.006
 roaring+0.006
 be+0.006
Neg logit contributions
acey-0.010
ocket-0.010
yle-0.010
ays-0.010
oll-0.010
Token noises
Token ID26782
Feature activation-1.57e-02

Pos logit contributions
 nod+0.116
appointed+0.103
 matter+0.102
 compromise+0.098
 count+0.098
Neg logit contributions
ils-0.138
ocket-0.136
als-0.135
ions-0.135
ays-0.134
Token,
Token ID11
Feature activation-1.50e-02

Pos logit contributions
 nod+0.022
appointed+0.020
 joke+0.020
sing+0.020
 bark+0.019
Neg logit contributions
ocket-0.028
yle-0.028
ions-0.028
acey-0.028
oll-0.028
Token and
Token ID290
Feature activation+8.71e-03

Pos logit contributions
 nod+0.019
 joke+0.018
appointed+0.017
 matter+0.017
 compromise+0.016
Neg logit contributions
ocket-0.028
ays-0.027
yle-0.027
oll-0.027
ions-0.026
Token she
Token ID673
Feature activation+4.07e-02

Pos logit contributions
acey+0.019
yle+0.019
ocket+0.019
ays+0.018
oll+0.018
Neg logit contributions
 nod-0.011
 bark-0.010
 joke-0.010
 be-0.010
 roaring-0.010
Token smelled
Token ID37609
Feature activation-1.64e-02

Pos logit contributions
ocket+0.074
acey+0.074
yle+0.073
oll+0.071
ils+0.071
Neg logit contributions
 nod-0.060
appointed-0.053
 joke-0.052
sing-0.051
 count-0.050
Token asked
Token ID1965
Feature activation-3.48e-03

Pos logit contributions
 nod+0.089
 joke+0.080
sing+0.079
appointed+0.077
 count+0.077
Neg logit contributions
ocket-0.107
acey-0.104
yle-0.103
ays-0.103
oll-0.102
Token,
Token ID11
Feature activation-1.70e-02

Pos logit contributions
 nod+0.006
 joke+0.006
appointed+0.006
sing+0.005
 bark+0.005
Neg logit contributions
ions-0.005
ocket-0.005
acey-0.005
 Hood-0.005
yle-0.005
Token "
Token ID366
Feature activation+5.76e-04

Pos logit contributions
 matter+0.026
 joke+0.026
appointed+0.025
 nod+0.025
 argue+0.024
Neg logit contributions
acey-0.027
ions-0.026
ocket-0.025
overs-0.025
 register-0.025
TokenAre
Token ID8491
Feature activation-1.30e-02

Pos logit contributions
ocket+0.001
acey+0.001
yle+0.001
ays+0.001
oll+0.001
Neg logit contributions
 nod-0.001
 joke-0.001
 bark-0.001
 be-0.001
 roaring-0.001
Token you
Token ID345
Feature activation+1.91e-02

Pos logit contributions
 nod+0.027
 be+0.026
 bark+0.025
 joke+0.025
 roaring+0.025
Neg logit contributions
ocket-0.017
acey-0.017
ays-0.016
yle-0.016
oll-0.016
Token okay
Token ID8788
Feature activation-9.94e-02

Pos logit contributions
oll+0.043
acey+0.043
ocket+0.043
ays+0.041
aser+0.041
Neg logit contributions
 nod-0.018
 phrase-0.017
 joke-0.016
 bark-0.016
 violently-0.015
Token,
Token ID11
Feature activation-1.01e-02

Pos logit contributions
 nod+0.194
appointed+0.188
 bark+0.178
 wob+0.173
 phrase+0.173
Neg logit contributions
acey-0.125
yle-0.122
ays-0.122
ocket-0.118
ions-0.117
Token little
Token ID1310
Feature activation+8.56e-03

Pos logit contributions
 nod+0.015
 joke+0.013
 bark+0.013
 be+0.013
 roaring+0.013
Neg logit contributions
ocket-0.020
acey-0.020
yle-0.019
ays-0.019
oll-0.019
Token bug
Token ID5434
Feature activation-3.57e-02

Pos logit contributions
yle+0.014
ocket+0.014
ays+0.014
acey+0.014
ions+0.014
Neg logit contributions
 nod-0.014
 joke-0.013
 bark-0.013
 roaring-0.013
appointed-0.012
Token?"
Token ID1701
Feature activation-1.25e-02

Pos logit contributions
 nod+0.057
 joke+0.055
 be+0.054
 bark+0.053
 roaring+0.050
Neg logit contributions
yle-0.067
acey-0.065
ocket-0.065
ays-0.064
oll-0.062
Token The
Token ID383
Feature activation-3.05e-02

Pos logit contributions
 nod+0.018
 joke+0.017
 roaring+0.017
sing+0.017
 matter+0.017
Neg logit contributions
yle-0.022
acey-0.022
ocket-0.021
ays-0.021
ions-0.021
Token breaks
Token ID9457
Feature activation-4.11e-02

Pos logit contributions
 nod+0.093
 bark+0.086
 joke+0.086
 be+0.085
 roaring+0.082
Neg logit contributions
ocket-0.117
acey-0.114
yle-0.114
oll-0.110
ions-0.108
Token.
Token ID13
Feature activation+1.02e-03

Pos logit contributions
 nod+0.056
 bark+0.053
 be+0.053
 roaring+0.051
 joke+0.050
Neg logit contributions
ocket-0.083
acey-0.080
yle-0.079
oll-0.078
ils-0.076
Token It
Token ID632
Feature activation-6.18e-02

Pos logit contributions
ocket+0.002
yle+0.002
acey+0.002
ays+0.002
oll+0.002
Neg logit contributions
 nod-0.001
 joke-0.001
 bark-0.001
 be-0.001
 roaring-0.001
Token makes
Token ID1838
Feature activation-3.17e-02

Pos logit contributions
 nod+0.089
 bark+0.084
 be+0.084
 joke+0.084
 wob+0.079
Neg logit contributions
ocket-0.123
yle-0.121
acey-0.119
oll-0.115
asers-0.113
Token a
Token ID257
Feature activation+3.21e-03

Pos logit contributions
 nod+0.040
 be+0.038
 bark+0.037
 joke+0.037
 laugh+0.034
Neg logit contributions
ocket-0.069
yle-0.067
acey-0.066
asers-0.066
oll-0.065
Token loud
Token ID7812
Feature activation-1.06e-01

Pos logit contributions
ocket+0.007
acey+0.007
yle+0.006
oll+0.006
ays+0.006
Neg logit contributions
 nod-0.004
appointed-0.003
 count-0.003
 bark-0.003
 joke-0.003
Token noise
Token ID7838
Feature activation-3.81e-02

Pos logit contributions
 nod+0.124
appointed+0.119
die+0.110
 count+0.110
 reasoning+0.107
Neg logit contributions
acey-0.189
ays-0.186
yle-0.186
ocket-0.183
ions-0.181
Token.
Token ID13
Feature activation+3.93e-03

Pos logit contributions
 nod+0.046
 joke+0.041
 bark+0.041
 be+0.040
 roaring+0.039
Neg logit contributions
ocket-0.083
acey-0.083
yle-0.082
ays-0.080
oll-0.080
Token\n
Token ID198
Feature activation-2.41e-02

Pos logit contributions
ocket+0.008
yle+0.008
acey+0.008
ays+0.008
oll+0.007
Neg logit contributions
 nod-0.006
 joke-0.005
 bark-0.005
 be-0.005
 roaring-0.005
Token\n
Token ID198
Feature activation-2.75e-02

Pos logit contributions
 nod+0.030
 joke+0.027
 bark+0.027
 roaring+0.026
 matter+0.026
Neg logit contributions
acey-0.050
ocket-0.050
yle-0.049
ays-0.048
oll-0.048
TokenThe
Token ID464
Feature activation-3.13e-02

Pos logit contributions
 nod+0.044
 joke+0.040
 bark+0.040
 roaring+0.039
 be+0.039
Neg logit contributions
acey-0.049
ocket-0.049
yle-0.048
ays-0.047
oll-0.047
Token waited
Token ID13488
Feature activation-1.15e-02

Pos logit contributions
acey+0.022
yle+0.022
ays+0.022
ocket+0.022
oll+0.021
Neg logit contributions
 nod-0.010
 be-0.010
 joke-0.009
 bark-0.009
 count-0.009
Token for
Token ID329
Feature activation+8.53e-02

Pos logit contributions
 nod+0.014
 be+0.013
 bark+0.012
 roaring+0.012
 joke+0.012
Neg logit contributions
ocket-0.025
yle-0.025
acey-0.024
ays-0.024
oll-0.024
Token the
Token ID262
Feature activation+9.62e-04

Pos logit contributions
ays+0.186
ocket+0.176
yle+0.175
oll+0.173
acey+0.173
Neg logit contributions
 nod-0.107
 be-0.104
 bark-0.098
 count-0.096
 roaring-0.096
Token rain
Token ID6290
Feature activation-9.20e-02

Pos logit contributions
ays+0.002
yle+0.002
ocket+0.002
acey+0.002
oll+0.002
Neg logit contributions
 nod-0.001
 joke-0.001
 bark-0.001
 be-0.001
 roaring-0.001
Token to
Token ID284
Feature activation-3.76e-02

Pos logit contributions
 nod+0.100
 bark+0.086
 joke+0.085
 be+0.085
 roaring+0.080
Neg logit contributions
ocket-0.215
acey-0.213
yle-0.212
ays-0.207
oll-0.206
Token slow
Token ID3105
Feature activation-8.65e-02

Pos logit contributions
 nod+0.040
 joke+0.034
 bark+0.034
 roaring+0.032
 be+0.032
Neg logit contributions
ocket-0.087
acey-0.086
yle-0.086
oll-0.083
ays-0.083
Token down
Token ID866
Feature activation-1.90e-03

Pos logit contributions
 nod+0.105
 be+0.097
 bark+0.094
 barking+0.087
 moment+0.086
Neg logit contributions
ocket-0.205
acey-0.193
oll-0.193
yle-0.190
asers-0.182
Token so
Token ID523
Feature activation+3.04e-02

Pos logit contributions
 nod+0.002
 be+0.002
 bark+0.002
 roaring+0.002
 moment+0.002
Neg logit contributions
ocket-0.005
yle-0.005
acey-0.005
oll-0.004
ays-0.004
Token she
Token ID673
Feature activation+6.81e-02

Pos logit contributions
ocket+0.055
acey+0.053
ays+0.053
yle+0.052
oll+0.051
Neg logit contributions
 nod-0.050
 be-0.047
 bark-0.045
 roaring-0.044
 joke-0.044
Token could
Token ID714
Feature activation-5.14e-02

Pos logit contributions
ays+0.145
ocket+0.141
yle+0.140
ions+0.138
acey+0.136
Neg logit contributions
 nod-0.091
 be-0.089
 bark-0.087
 joke-0.081
 laugh-0.078
Token come
Token ID1282
Feature activation-8.42e-02

Pos logit contributions
 nod+0.053
 bark+0.045
 joke+0.045
 roaring+0.043
 be+0.042
Neg logit contributions
ocket-0.120
acey-0.119
yle-0.119
ays-0.116
oll-0.116