TOP ACTIVATIONS
MAX = 0.103

The bird handed the thing to Emily . It was a
m oo ed and l icked her face . Sara gigg
you have to explore !" Jordan smiled again .
. He gave the fruit to the old woman and they
â S oph ie used her hands and
. He gives one half to Anna . Anna says ,
he lun ged forward and slur ped it all up .
. They give some cheese to the mouse . Then they
The rat brought the picture to his friend and the friend
good care of him !" Daniel thanked him and
ags his tail and l icks their faces . He is
it , and then you slur p it down your mouth
." He gives some powder to L ila and Ben .
a lot of fish !" Lou ise was so
how special she looked . And rea was so
I am very happy ." J essa smiled ,
penny the bird had given to him . <|endoftext|> Once upon
a huge sand castle !" So , Nan and
a namesake bowl !" Andrew couldn 't believe
go have some fun !" Gray agreed and they

MIN ACTIVATIONS
MIN = -0.096

a time there was a Stub born Old Woman . She
neck that read , " Fore ver Friends ".
just a spider ." The Stub born Old Woman shook her
high they were just tiny spec ks in the sky .
oes were Tim my 's least favorite bugs because they always
one of her dolls was being persistent and wouldn 't stand
. <|endoftext|> " Mom my , what 's that ?" asked
she did so , the Stub born Old Woman said ,
. <|endoftext|> Once upon a time , there was a little
. <|endoftext|> " Mom my , I want to spend all
. <|endoftext|> " Mom my , can I buy a modern
dirty and the trees were being cut down . They knew
" Mom , when can we go outside ?"
. <|endoftext|> " Mom my , can I grab the screen
light . " Mom my , can you bring me a
. <|endoftext|> Once upon a time , there was a team
. <|endoftext|> Once upon a time , there was a little
her . " Mom my , can you help me lift
loud . " Mom my , can you turn down the
. <|endoftext|> " Mom my , can I dress up ?"

INTERVAL 0.053 to 0.103
CONTAINS 3.849%

for the camera !" Ben smiled and said ,
so happy and thankful . The prince and the
Thank you so much !" " This is the
to learn a lesson . " Since you were
smart and helpful . Later that day ,

INTERVAL 0.004 to 0.053
CONTAINS 40.284%

was not sad anymore . They all lived happily in the
touch things that can harm you . <|endoftext|> L ily was
fair child . She walked in the park and saw something
But before the bird left , he told Ben
at our little house !" said Alic y .

INTERVAL -0.046 to 0.004
CONTAINS 51.049%

dog . He wanted it more than anything and asked the
huge that it looked like it touched the sky . Inside
rd person said , " My name is Lucy ."
mom if they can go . " Can we
up the battery . She looked around and saw nowhere to

INTERVAL -0.096 to -0.046
CONTAINS 4.818%

. " Dad , why does the monkey have
upon a time , there was a girl named Grace .
she said . " Yes , it is ," her mom
time , there was a kid named Tim my . Tim
not cooked . They are just cookies ." Mom
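
The four interval edges above are consistent with splitting the range from MIN (-0.096) to MAX (0.103) into four equal-width bins of about 0.04975 each, and the four CONTAINS percentages sum to 100%. A minimal sketch of that binning follows. It assumes equal-width bins, since the dashboard's actual method is not stated here, and the function names are hypothetical. Bins are returned lowest first, whereas the listing above runs highest first.

```python
def equal_width_bins(lo, hi, n_bins=4):
    """Return (lower, upper) edges for n_bins equal-width bins over [lo, hi]."""
    width = (hi - lo) / n_bins
    return [(lo + i * width, lo + (i + 1) * width) for i in range(n_bins)]


def bin_percentages(values, lo, hi, n_bins=4):
    """Return the percentage of values falling in each equal-width bin.

    Values outside [lo, hi] are clamped to the first or last bin
    (an assumption made for this sketch).
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        idx = int((v - lo) / width)
        idx = max(0, min(idx, n_bins - 1))  # keep hi itself in the last bin
        counts[idx] += 1
    total = len(values)
    return [100 * c / total for c in counts]


# Edges from the MIN and MAX reported above.
edges = equal_width_bins(-0.096, 0.103)

# Made-up activations, only to show the call; not taken from the dashboard.
sample = [-0.09, -0.01, 0.0, 0.02, 0.1]
percentages = bin_percentages(sample, -0.096, 0.103)
```

Rounded to three decimals, the edges from this sketch agree with the INTERVAL lines to within rounding.
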
TokenThe
Token ID464
Feature activation+1.30e-02

Pos logit contributions
irs+0.113
ide+0.108
aff+0.097
rones+0.094
 occasions+0.093
Neg logit contributions
 afraid-0.160
 worried-0.157
 sorry-0.155
 scared-0.152
 okay-0.152
Token bird
Token ID6512
Feature activation+1.91e-02

Pos logit contributions
irs+0.022
ide+0.022
 occasions+0.020
rones+0.019
growth+0.019
Neg logit contributions
 afraid-0.024
 sorry-0.023
 worried-0.022
 oh-0.022
 okay-0.021
Token handed
Token ID10158
Feature activation+3.51e-02

Pos logit contributions
irs+0.044
ide+0.042
rones+0.038
 occasions+0.038
growth+0.037
Neg logit contributions
 afraid-0.029
 sorry-0.028
 worried-0.027
 scared-0.026
 okay-0.026
Token the
Token ID262
Feature activation+2.97e-02

Pos logit contributions
irs+0.068
ide+0.064
 occasions+0.058
rones+0.057
aff+0.056
Neg logit contributions
 afraid-0.066
 worried-0.064
 sorry-0.064
 scared-0.062
 okay-0.060
Token thing
Token ID1517
Feature activation-2.87e-02

Pos logit contributions
irs+0.048
ide+0.047
growth+0.044
rones+0.042
 occasions+0.042
Neg logit contributions
 sorry-0.054
 afraid-0.054
 worried-0.051
 oh-0.051
 okay-0.050
Token to
Token ID284
Feature activation+8.95e-02

Pos logit contributions
 afraid+0.052
 sorry+0.052
 worried+0.049
 okay+0.047
 foolish+0.046
Neg logit contributions
ide-0.052
irs-0.051
of-0.047
growth-0.044
 occasions-0.042
Token Emily
Token ID17608
Feature activation+2.53e-02

Pos logit contributions
irs+0.198
ide+0.198
aff+0.187
rones+0.184
 occasions+0.179
Neg logit contributions
 sorry-0.142
 afraid-0.136
 worried-0.133
 very-0.133
 scared-0.131
Token.
Token ID13
Feature activation+4.24e-03

Pos logit contributions
irs+0.074
ide+0.067
 occasions+0.066
rones+0.066
aff+0.064
Neg logit contributions
 afraid-0.025
 worried-0.025
 scared-0.024
 sorry-0.024
 very-0.023
Token It
Token ID632
Feature activation+1.98e-02

Pos logit contributions
irs+0.009
ide+0.008
 occasions+0.007
rones+0.007
aff+0.007
Neg logit contributions
 afraid-0.008
 sorry-0.008
 worried-0.007
 scared-0.007
 okay-0.007
Token was
Token ID373
Feature activation-3.15e-02

Pos logit contributions
irs+0.048
ide+0.045
rones+0.042
 occasions+0.042
aff+0.041
Neg logit contributions
 afraid-0.028
 worried-0.027
 sorry-0.027
 scared-0.026
 okay-0.025
Token a
Token ID257
Feature activation-9.14e-03

Pos logit contributions
 afraid+0.042
 sorry+0.041
 worried+0.040
 oh+0.038
 careless+0.037
Neg logit contributions
irs-0.073
ide-0.068
rones-0.063
growth-0.062
aff-0.060
Token m
Token ID285
Feature activation-1.91e-02

Pos logit contributions
irs+0.009
ide+0.009
rones+0.008
 occasions+0.008
aff+0.008
Neg logit contributions
 afraid-0.006
 sorry-0.006
 worried-0.006
 scared-0.006
 okay-0.006
Tokenoo
Token ID2238
Feature activation-6.92e-03

Pos logit contributions
 afraid+0.025
 sorry+0.025
 worried+0.024
 scared+0.023
 okay+0.022
Neg logit contributions
irs-0.048
ide-0.045
 occasions-0.042
growth-0.042
rones-0.042
Tokened
Token ID276
Feature activation+1.83e-02

Pos logit contributions
 afraid+0.011
 sorry+0.011
 worried+0.011
 scared+0.010
 okay+0.010
Neg logit contributions
irs-0.015
ide-0.014
aff-0.013
rones-0.013
of-0.013
Token and
Token ID290
Feature activation-7.42e-03

Pos logit contributions
ide+0.031
irs+0.030
of+0.027
rones+0.026
aff+0.026
Neg logit contributions
 afraid-0.035
 sorry-0.032
 worried-0.032
 oh-0.032
 careless-0.032
Token l
Token ID300
Feature activation-3.18e-03

Pos logit contributions
 afraid+0.013
 sorry+0.012
 worried+0.012
 okay+0.012
 foolish+0.011
Neg logit contributions
irs-0.015
ide-0.014
rones-0.013
growth-0.013
 occasions-0.013
Tokenicked
Token ID9484
Feature activation+8.80e-02

Pos logit contributions
 afraid+0.006
 sorry+0.006
 worried+0.006
 scared+0.005
 okay+0.005
Neg logit contributions
irs-0.006
ide-0.006
 occasions-0.006
rones-0.005
aff-0.005
Token her
Token ID607
Feature activation+1.83e-02

Pos logit contributions
irs+0.192
 occasions+0.181
ide+0.178
rones+0.175
isk+0.167
Neg logit contributions
 sorry-0.146
 afraid-0.137
 worried-0.137
 very-0.137
 okay-0.137
Token face
Token ID1986
Feature activation+3.07e-02

Pos logit contributions
irs+0.041
ide+0.039
rones+0.035
 occasions+0.035
aff+0.034
Neg logit contributions
 afraid-0.029
 sorry-0.028
 worried-0.028
 scared-0.027
 okay-0.026
Token.
Token ID13
Feature activation+3.77e-03

Pos logit contributions
irs+0.055
ide+0.054
 occasions+0.046
rones+0.046
aff+0.045
Neg logit contributions
 afraid-0.060
 worried-0.058
 sorry-0.058
 scared-0.056
 foolish-0.054
Token Sara
Token ID24799
Feature activation+3.49e-03

Pos logit contributions
irs+0.008
ide+0.008
 occasions+0.007
rones+0.007
aff+0.007
Neg logit contributions
 afraid-0.007
 sorry-0.007
 worried-0.006
 scared-0.006
 okay-0.006
Token gigg
Token ID30442
Feature activation+1.46e-02

Pos logit contributions
irs+0.008
ide+0.008
rones+0.007
 occasions+0.007
of+0.007
Neg logit contributions
 afraid-0.005
 sorry-0.005
 worried-0.005
 okay-0.005
 scared-0.005
Token you
Token ID345
Feature activation-5.33e-03

Pos logit contributions
 afraid+0.014
 sorry+0.014
 worried+0.014
 scared+0.013
 careless+0.013
Neg logit contributions
irs-0.018
ide-0.017
 occasions-0.015
 parties-0.015
rones-0.015
Token have
Token ID423
Feature activation-9.25e-03

Pos logit contributions
 afraid+0.007
 sorry+0.007
 worried+0.007
 scared+0.007
 okay+0.006
Neg logit contributions
irs-0.013
ide-0.012
rones-0.011
 occasions-0.011
growth-0.011
Token to
Token ID284
Feature activation+5.45e-02

Pos logit contributions
 afraid+0.013
 sorry+0.013
 worried+0.013
 scared+0.012
 okay+0.012
Neg logit contributions
irs-0.022
ide-0.021
rones-0.019
 occasions-0.019
aff-0.019
Token explore
Token ID7301
Feature activation+5.02e-02

Pos logit contributions
irs+0.116
ide+0.107
rones+0.104
aff+0.099
growth+0.096
Neg logit contributions
 sorry-0.094
 afraid-0.093
 okay-0.091
 worried-0.091
 scared-0.089
Token!"
Token ID2474
Feature activation+4.86e-02

Pos logit contributions
irs+0.111
ide+0.105
rones+0.096
 occasions+0.095
aff+0.094
Neg logit contributions
 afraid-0.080
 sorry-0.077
 worried-0.077
 scared-0.074
 okay-0.073
Token\n
Token ID198
Feature activation+8.75e-02

Pos logit contributions
irs+0.090
ide+0.085
 occasions+0.075
rones+0.075
aff+0.072
Neg logit contributions
 afraid-0.096
 sorry-0.095
 worried-0.094
 scared-0.090
 okay-0.089
Token\n
Token ID198
Feature activation+8.10e-02

Pos logit contributions
irs+0.186
ide+0.182
 occasions+0.166
rones+0.159
aff+0.155
Neg logit contributions
 sorry-0.149
 afraid-0.145
 worried-0.145
 very-0.139
 okay-0.138
TokenJordan
Token ID34522
Feature activation+1.13e-02

Pos logit contributions
irs+0.145
ide+0.138
rones+0.122
 occasions+0.120
aff+0.116
Neg logit contributions
 afraid-0.161
 worried-0.160
 sorry-0.159
 okay-0.156
 foolish-0.153
Token smiled
Token ID13541
Feature activation+2.55e-02

Pos logit contributions
irs+0.025
ide+0.024
rones+0.022
 occasions+0.022
aff+0.022
Neg logit contributions
 afraid-0.017
 sorry-0.017
 worried-0.016
 scared-0.015
 okay-0.015
Token again
Token ID757
Feature activation+4.03e-03

Pos logit contributions
ide+0.052
irs+0.051
rones+0.046
growth+0.046
aff+0.046
Neg logit contributions
 afraid-0.039
 sorry-0.036
 worried-0.036
 oh-0.035
 careless-0.034
Token.
Token ID13
Feature activation+1.19e-02

Pos logit contributions
irs+0.010
ide+0.009
 occasions+0.009
rones+0.008
growth+0.008
Neg logit contributions
 afraid-0.005
 sorry-0.005
 worried-0.005
 scared-0.005
 okay-0.005
Token.
Token ID13
Feature activation+3.80e-03

Pos logit contributions
 afraid+0.010
 sorry+0.009
 worried+0.009
 scared+0.009
 foolish+0.008
Neg logit contributions
irs-0.014
ide-0.014
 occasions-0.012
aff-0.012
rones-0.012
Token He
Token ID679
Feature activation+1.84e-02

Pos logit contributions
irs+0.009
ide+0.008
 occasions+0.008
rones+0.007
growth+0.007
Neg logit contributions
 sorry-0.006
 afraid-0.005
 worried-0.005
 scared-0.005
 okay-0.005
Token gave
Token ID2921
Feature activation+3.19e-02

Pos logit contributions
irs+0.042
ide+0.041
rones+0.037
 occasions+0.037
aff+0.036
Neg logit contributions
 afraid-0.027
 sorry-0.026
 worried-0.026
 scared-0.024
 okay-0.024
Token the
Token ID262
Feature activation+3.63e-02

Pos logit contributions
irs+0.065
ide+0.061
 occasions+0.056
rones+0.056
aff+0.053
Neg logit contributions
 afraid-0.056
 worried-0.055
 scared-0.054
 sorry-0.054
 very-0.052
Token fruit
Token ID8234
Feature activation+2.57e-03

Pos logit contributions
irs+0.079
ide+0.075
rones+0.068
 occasions+0.068
aff+0.067
Neg logit contributions
 afraid-0.058
 sorry-0.057
 worried-0.055
 scared-0.053
 okay-0.052
Token to
Token ID284
Feature activation+8.72e-02

Pos logit contributions
irs+0.007
ide+0.006
 occasions+0.006
rones+0.006
aff+0.006
Neg logit contributions
 afraid-0.003
 worried-0.003
 sorry-0.003
 scared-0.003
 very-0.003
Token the
Token ID262
Feature activation+3.25e-02

Pos logit contributions
irs+0.187
ide+0.177
rones+0.175
 occasions+0.168
oons+0.159
Neg logit contributions
 sorry-0.141
 very-0.134
 afraid-0.133
 okay-0.132
 worried-0.131
Token old
Token ID1468
Feature activation+3.76e-02

Pos logit contributions
irs+0.067
ide+0.064
rones+0.058
aff+0.058
 occasions+0.058
Neg logit contributions
 afraid-0.054
 sorry-0.052
 worried-0.051
 scared-0.049
 okay-0.049
Token woman
Token ID2415
Feature activation-5.31e-03

Pos logit contributions
irs+0.083
ide+0.081
 occasions+0.074
rones+0.074
aff+0.073
Neg logit contributions
 afraid-0.057
 sorry-0.056
 worried-0.053
 okay-0.052
 scared-0.051
Token and
Token ID290
Feature activation-8.52e-03

Pos logit contributions
 afraid+0.006
 worried+0.006
 sorry+0.006
 scared+0.006
 very+0.006
Neg logit contributions
irs-0.014
ide-0.013
rones-0.013
 occasions-0.013
growth-0.012
Token they
Token ID484
Feature activation+1.79e-02

Pos logit contributions
 afraid+0.013
 sorry+0.013
 worried+0.013
 scared+0.012
 okay+0.012
Neg logit contributions
irs-0.019
ide-0.018
rones-0.017
 occasions-0.017
aff-0.016
Tokenâ
Token ID22940
Feature activation-2.23e-02

Pos logit contributions
irs+0.047
ide+0.044
 occasions+0.040
rones+0.039
growth+0.038
Neg logit contributions
 afraid-0.040
 sorry-0.039
 worried-0.038
 scared-0.037
 okay-0.036
Token
Token ID26391
Feature activation-7.46e-03

Pos logit contributions
 afraid+0.040
 sorry+0.038
 worried+0.038
 scared+0.037
 okay+0.036
Neg logit contributions
irs-0.046
ide-0.043
 occasions-0.039
rones-0.039
aff-0.038
Token
Token ID447
Feature activation-1.94e-02

Pos logit contributions
 afraid+0.014
 worried+0.013
 sorry+0.013
 scared+0.013
 okay+0.013
Neg logit contributions
irs-0.015
ide-0.014
 occasions-0.013
rones-0.012
aff-0.012
Token
Token ID251
Feature activation+4.92e-02

Pos logit contributions
 sorry+0.040
 worried+0.040
 afraid+0.040
 scared+0.039
 okay+0.038
Neg logit contributions
irs-0.034
ide-0.032
 occasions-0.028
rones-0.027
aff-0.027
TokenS
Token ID50
Feature activation+3.55e-02

Pos logit contributions
irs+0.088
ide+0.081
 occasions+0.075
of+0.072
growth+0.072
Neg logit contributions
 afraid-0.098
 worried-0.093
 sorry-0.093
 scared-0.090
 foolish-0.088
Tokenoph
Token ID2522
Feature activation+8.68e-02

Pos logit contributions
irs+0.050
ide+0.039
rones+0.038
growth+0.038
 occasions+0.036
Neg logit contributions
 okay-0.081
 worried-0.079
 foolish-0.078
 afraid-0.078
 sorry-0.076
Tokenie
Token ID494
Feature activation+2.25e-02

Pos logit contributions
irs+0.142
growth+0.109
rones+0.106
 occasions+0.104
 parties+0.102
Neg logit contributions
 afraid-0.192
 very-0.191
 worried-0.189
 sorry-0.184
 scared-0.183
Token used
Token ID973
Feature activation+1.93e-02

Pos logit contributions
irs+0.048
ide+0.048
 occasions+0.043
of+0.042
rones+0.042
Neg logit contributions
 afraid-0.035
 sorry-0.034
 worried-0.033
 okay-0.031
 scared-0.031
Token her
Token ID607
Feature activation+2.02e-02

Pos logit contributions
irs+0.037
ide+0.036
rones+0.032
aff+0.031
 occasions+0.031
Neg logit contributions
 afraid-0.037
 worried-0.037
 sorry-0.036
 scared-0.036
 okay-0.035
Token hands
Token ID2832
Feature activation+2.26e-02

Pos logit contributions
irs+0.036
 occasions+0.033
ide+0.032
growth+0.031
rones+0.029
Neg logit contributions
 afraid-0.038
 worried-0.036
 scared-0.035
 oh-0.034
 sorry-0.034
Token and
Token ID290
Feature activation-1.70e-03

Pos logit contributions
irs+0.048
ide+0.047
of+0.041
 occasions+0.041
rones+0.041
Neg logit contributions
 afraid-0.036
 sorry-0.036
 worried-0.034
 scared-0.033
 foolish-0.032
Token.
Token ID13
Feature activation-8.57e-03

Pos logit contributions
irs+0.117
ide+0.107
isk+0.104
 occasions+0.100
rones+0.100
Neg logit contributions
 worried-0.068
 scared-0.067
 afraid-0.066
 very-0.066
 sorry-0.064
Token He
Token ID679
Feature activation+1.58e-02

Pos logit contributions
 sorry+0.013
 afraid+0.012
 scared+0.012
 worried+0.011
 curious+0.011
Neg logit contributions
irs-0.021
ide-0.020
 occasions-0.019
of-0.018
rones-0.017
Token gives
Token ID3607
Feature activation+2.06e-02

Pos logit contributions
irs+0.037
ide+0.035
 occasions+0.033
rones+0.031
aff+0.030
Neg logit contributions
 afraid-0.025
 scared-0.024
 sorry-0.024
 worried-0.024
 very-0.023
Token one
Token ID530
Feature activation+3.47e-03

Pos logit contributions
irs+0.054
ide+0.050
 occasions+0.048
aff+0.047
rones+0.046
Neg logit contributions
 afraid-0.025
 sorry-0.024
 worried-0.024
 scared-0.024
 very-0.024
Token half
Token ID2063
Feature activation+1.62e-02

Pos logit contributions
irs+0.007
ide+0.006
rones+0.006
 occasions+0.006
growth+0.005
Neg logit contributions
 afraid-0.007
 sorry-0.006
 worried-0.006
 scared-0.006
 okay-0.006
Token to
Token ID284
Feature activation+8.65e-02

Pos logit contributions
irs+0.039
ide+0.036
 occasions+0.034
rones+0.034
growth+0.033
Neg logit contributions
 afraid-0.024
 worried-0.023
 sorry-0.023
 scared-0.022
 okay-0.022
Token Anna
Token ID11735
Feature activation+1.07e-02

Pos logit contributions
irs+0.240
 occasions+0.220
rones+0.214
ide+0.209
ades+0.209
Neg logit contributions
 sorry-0.100
 very-0.092
 oh-0.085
 scared-0.085
 okay-0.084
Token.
Token ID13
Feature activation-1.33e-03

Pos logit contributions
irs+0.027
ide+0.024
 occasions+0.024
rones+0.023
growth+0.023
Neg logit contributions
 afraid-0.016
 sorry-0.015
 worried-0.015
 scared-0.015
 very-0.015
Token Anna
Token ID11735
Feature activation-9.32e-04

Pos logit contributions
 sorry+0.002
 afraid+0.002
 scared+0.002
 very+0.002
 worried+0.002
Neg logit contributions
irs-0.004
ide-0.003
 occasions-0.003
isk-0.003
ations-0.003
Token says
Token ID1139
Feature activation-1.90e-02

Pos logit contributions
 scared+0.001
 sorry+0.001
 worried+0.001
 afraid+0.001
 very+0.001
Neg logit contributions
irs-0.002
ide-0.002
 occasions-0.002
rones-0.002
growth-0.002
Token,
Token ID11
Feature activation-3.84e-02

Pos logit contributions
 afraid+0.026
 worried+0.024
 scared+0.023
 careless+0.022
 sorry+0.022
Neg logit contributions
irs-0.046
ide-0.044
 occasions-0.040
of-0.040
aff-0.039
Token he
Token ID339
Feature activation+1.24e-02

Pos logit contributions
 afraid+0.023
 sorry+0.022
 worried+0.022
 careless+0.020
 okay+0.020
Neg logit contributions
irs-0.024
ide-0.024
 occasions-0.023
growth-0.022
of-0.021
Token lun
Token ID14678
Feature activation-2.84e-02

Pos logit contributions
irs+0.027
ide+0.027
of+0.025
 occasions+0.025
aff+0.024
Neg logit contributions
 sorry-0.018
 afraid-0.018
 worried-0.017
 okay-0.016
 oh-0.015
Tokenged
Token ID2004
Feature activation-4.06e-03

Pos logit contributions
 afraid+0.048
 sorry+0.046
 worried+0.045
 scared+0.044
 foolish+0.043
Neg logit contributions
irs-0.062
ide-0.058
rones-0.052
 occasions-0.052
of-0.051
Token forward
Token ID2651
Feature activation-3.09e-02

Pos logit contributions
 afraid+0.007
 sorry+0.006
 worried+0.006
 foolish+0.006
 oh+0.006
Neg logit contributions
irs-0.008
ide-0.007
of-0.007
pen-0.006
r-0.006
Token and
Token ID290
Feature activation-2.30e-02

Pos logit contributions
 afraid+0.054
 sorry+0.053
 worried+0.052
 foolish+0.049
 scared+0.049
Neg logit contributions
ide-0.061
irs-0.060
 occasions-0.052
aff-0.052
of-0.051
Token slur
Token ID40066
Feature activation+8.58e-02

Pos logit contributions
 afraid+0.040
 sorry+0.039
 worried+0.038
 okay+0.037
 foolish+0.036
Neg logit contributions
irs-0.041
ide-0.038
 occasions-0.037
of-0.035
growth-0.034
Tokenped
Token ID9124
Feature activation+2.50e-02

Pos logit contributions
irs+0.203
ide+0.187
 occasions+0.183
rones+0.179
 parties+0.179
Neg logit contributions
 afraid-0.114
 scared-0.113
 sorry-0.112
 okay-0.109
 worried-0.107
Token it
Token ID340
Feature activation+1.30e-02

Pos logit contributions
irs+0.048
ide+0.046
growth+0.041
of+0.040
 occasions+0.040
Neg logit contributions
 afraid-0.044
 worried-0.042
 sorry-0.041
 foolish-0.039
 careless-0.039
Token all
Token ID477
Feature activation+8.35e-03

Pos logit contributions
irs+0.026
ide+0.025
 occasions+0.022
rones+0.022
growth+0.022
Neg logit contributions
 afraid-0.023
 sorry-0.022
 worried-0.022
 scared-0.021
 oh-0.020
Token up
Token ID510
Feature activation+2.04e-02

Pos logit contributions
irs+0.019
ide+0.018
rones+0.017
 occasions+0.017
growth+0.016
Neg logit contributions
 afraid-0.013
 sorry-0.012
 worried-0.012
 scared-0.012
 okay-0.011
Token.
Token ID13
Feature activation+2.79e-03

Pos logit contributions
irs+0.040
ide+0.038
 occasions+0.033
rones+0.033
growth+0.032
Neg logit contributions
 afraid-0.038
 sorry-0.037
 worried-0.037
 scared-0.035
 okay-0.035
Token.
Token ID13
Feature activation-4.81e-03

Pos logit contributions
irs+0.071
ide+0.068
rones+0.061
 occasions+0.060
aff+0.059
Neg logit contributions
 afraid-0.068
 worried-0.066
 sorry-0.065
 scared-0.063
 oh-0.062
Token They
Token ID1119
Feature activation+2.27e-02

Pos logit contributions
 sorry+0.007
 afraid+0.007
 scared+0.007
 worried+0.007
 curious+0.007
Neg logit contributions
irs-0.011
ide-0.011
 occasions-0.010
of-0.009
rap-0.009
Token give
Token ID1577
Feature activation+2.02e-02

Pos logit contributions
irs+0.056
ide+0.052
 occasions+0.052
rones+0.051
growth+0.048
Neg logit contributions
 sorry-0.034
 scared-0.033
 worried-0.032
 afraid-0.032
 very-0.032
Token some
Token ID617
Feature activation+3.63e-02

Pos logit contributions
irs+0.042
ide+0.039
 occasions+0.035
rones+0.034
aff+0.034
Neg logit contributions
 afraid-0.036
 sorry-0.036
 worried-0.036
 scared-0.035
 okay-0.033
Token cheese
Token ID9891
Feature activation+2.11e-02

Pos logit contributions
irs+0.075
ide+0.070
rones+0.063
 occasions+0.062
growth+0.060
Neg logit contributions
 afraid-0.065
 sorry-0.064
 worried-0.063
 scared-0.062
 okay-0.060
Token to
Token ID284
Feature activation+8.56e-02

Pos logit contributions
irs+0.057
ide+0.052
 occasions+0.049
rones+0.049
growth+0.048
Neg logit contributions
 afraid-0.025
 sorry-0.025
 worried-0.024
 scared-0.023
 very-0.023
Token the
Token ID262
Feature activation+2.86e-02

Pos logit contributions
irs+0.213
rones+0.195
lor+0.192
 occasions+0.191
ide+0.188
Neg logit contributions
 sorry-0.128
 okay-0.116
 very-0.115
 scared-0.106
 sad-0.105
Token mouse
Token ID10211
Feature activation+1.11e-02

Pos logit contributions
irs+0.061
ide+0.058
rones+0.053
 occasions+0.052
growth+0.051
Neg logit contributions
 afraid-0.047
 sorry-0.045
 worried-0.045
 scared-0.042
 okay-0.042
Token.
Token ID13
Feature activation-4.64e-03

Pos logit contributions
irs+0.027
ide+0.024
rones+0.022
 occasions+0.022
growth+0.022
Neg logit contributions
 afraid-0.017
 sorry-0.017
 worried-0.017
 scared-0.016
 very-0.016
Token Then
Token ID3244
Feature activation-4.28e-02

Pos logit contributions
 sorry+0.008
 afraid+0.008
 scared+0.007
 worried+0.007
 okay+0.007
Neg logit contributions
irs-0.010
ide-0.010
 occasions-0.009
rones-0.009
isk-0.008
Token they
Token ID484
Feature activation+1.76e-02

Pos logit contributions
 afraid+0.071
 worried+0.066
 sorry+0.062
 scared+0.062
 careless+0.062
Neg logit contributions
irs-0.089
ide-0.081
 occasions-0.077
 parties-0.076
rones-0.075
TokenThe
Token ID464
Feature activation+1.00e-02

Pos logit contributions
irs+0.134
rones+0.124
ide+0.117
aff+0.116
 occasions+0.113
Neg logit contributions
 determined-0.104
 very-0.103
 afraid-0.101
 careful-0.100
 okay-0.099
Token rat
Token ID4227
Feature activation-2.69e-04

Pos logit contributions
irs+0.017
ide+0.017
 occasions+0.016
growth+0.015
rones+0.015
Neg logit contributions
 afraid-0.019
 sorry-0.017
 worried-0.017
 okay-0.017
 scared-0.017
Token brought
Token ID3181
Feature activation+3.49e-02

Pos logit contributions
 afraid+0.000
 sorry+0.000
 worried+0.000
 scared+0.000
 okay+0.000
Neg logit contributions
irs-0.001
ide-0.001
 occasions-0.001
rones-0.001
aff-0.001
Token the
Token ID262
Feature activation+3.09e-02

Pos logit contributions
irs+0.077
ide+0.073
 occasions+0.066
rones+0.065
aff+0.064
Neg logit contributions
 afraid-0.056
 sorry-0.055
 worried-0.055
 scared-0.055
 very-0.053
Token picture
Token ID4286
Feature activation+9.36e-03

Pos logit contributions
irs+0.053
ide+0.049
 occasions+0.047
growth+0.045
rones+0.044
Neg logit contributions
 afraid-0.058
 sorry-0.057
 worried-0.054
 okay-0.053
 oh-0.052
Token to
Token ID284
Feature activation+8.55e-02

Pos logit contributions
irs+0.022
ide+0.021
rones+0.019
 occasions+0.019
growth+0.019
Neg logit contributions
 afraid-0.014
 sorry-0.013
 worried-0.013
 scared-0.012
 okay-0.012
Token his
Token ID465
Feature activation+1.18e-02

Pos logit contributions
irs+0.197
ide+0.189
rones+0.176
oons+0.173
aff+0.171
Neg logit contributions
 sorry-0.134
 very-0.129
 afraid-0.124
 okay-0.122
 scared-0.120
Token friend
Token ID1545
Feature activation+1.76e-02

Pos logit contributions
irs+0.025
ide+0.023
rones+0.021
 occasions+0.021
aff+0.020
Neg logit contributions
 afraid-0.021
 sorry-0.020
 worried-0.020
 scared-0.020
 foolish-0.019
Token and
Token ID290
Feature activation-8.56e-03

Pos logit contributions
irs+0.046
ide+0.043
rones+0.040
 occasions+0.040
growth+0.040
Neg logit contributions
 afraid-0.022
 sorry-0.021
 worried-0.021
 scared-0.020
 okay-0.019
Token the
Token ID262
Feature activation+1.20e-02

Pos logit contributions
 afraid+0.012
 sorry+0.012
 worried+0.012
 scared+0.011
 okay+0.011
Neg logit contributions
irs-0.021
ide-0.020
rones-0.018
 occasions-0.018
growth-0.018
Token friend
Token ID1545
Feature activation+1.17e-02

Pos logit contributions
irs+0.019
ide+0.018
 occasions+0.017
growth+0.016
rones+0.016
Neg logit contributions
 afraid-0.024
 worried-0.023
 sorry-0.022
 okay-0.022
 scared-0.022
Token good
Token ID922
Feature activation+6.90e-03

Pos logit contributions
irs+0.046
ide+0.041
 occasions+0.039
rones+0.038
growth+0.037
Neg logit contributions
 afraid-0.042
 worried-0.041
 sorry-0.040
 scared-0.039
 okay-0.038
Token care
Token ID1337
Feature activation+1.66e-02

Pos logit contributions
irs+0.009
 occasions+0.009
ide+0.008
rones+0.008
 parties+0.008
Neg logit contributions
 afraid-0.016
 worried-0.016
 sorry-0.016
 okay-0.015
 scared-0.015
Token of
Token ID286
Feature activation+3.84e-02

Pos logit contributions
irs+0.031
ide+0.029
 occasions+0.025
rones+0.025
aff+0.025
Neg logit contributions
 afraid-0.032
 sorry-0.032
 worried-0.031
 scared-0.030
 okay-0.030
Token him
Token ID683
Feature activation+1.77e-02

Pos logit contributions
irs+0.075
ide+0.071
rones+0.065
aff+0.064
 occasions+0.063
Neg logit contributions
 afraid-0.071
 sorry-0.070
 worried-0.068
 scared-0.066
 okay-0.066
Token!"
Token ID2474
Feature activation+4.87e-02

Pos logit contributions
irs+0.039
ide+0.037
 occasions+0.033
rones+0.033
aff+0.032
Neg logit contributions
 afraid-0.029
 sorry-0.028
 worried-0.028
 scared-0.026
 okay-0.026
Token\n
Token ID198
Feature activation+8.55e-02

Pos logit contributions
irs+0.107
ide+0.103
 occasions+0.093
rones+0.093
aff+0.091
Neg logit contributions
 afraid-0.080
 worried-0.079
 sorry-0.077
 scared-0.076
 very-0.073
Token\n
Token ID198
Feature activation+8.23e-02

Pos logit contributions
irs+0.176
ide+0.171
 occasions+0.155
rones+0.151
growth+0.147
Neg logit contributions
 worried-0.145
 afraid-0.144
 very-0.143
 sorry-0.143
 scared-0.137
TokenDaniel
Token ID19962
Feature activation+1.91e-02

Pos logit contributions
irs+0.131
ide+0.122
rones+0.107
 occasions+0.105
aff+0.103
Neg logit contributions
 afraid-0.180
 worried-0.180
 very-0.173
 sorry-0.172
 foolish-0.171
Token thanked
Token ID26280
Feature activation+5.30e-02

Pos logit contributions
irs+0.045
ide+0.043
 occasions+0.039
rones+0.039
growth+0.038
Neg logit contributions
 afraid-0.028
 sorry-0.027
 worried-0.027
 scared-0.026
 okay-0.025
Token him
Token ID683
Feature activation+1.57e-02

Pos logit contributions
irs+0.118
ide+0.116
rones+0.108
aff+0.101
 occasions+0.100
Neg logit contributions
 sorry-0.087
 very-0.083
 worried-0.083
 afraid-0.080
 scared-0.079
Token and
Token ID290
Feature activation-9.47e-03

Pos logit contributions
irs+0.036
ide+0.035
of+0.031
aff+0.031
 occasions+0.031
Neg logit contributions
 afraid-0.023
 worried-0.021
 sorry-0.021
 scared-0.020
 oh-0.020
Tokenags
Token ID3775
Feature activation+2.48e-02

Pos logit contributions
irs+0.012
ide+0.012
growth+0.010
 occasions+0.010
rones+0.010
Neg logit contributions
 sorry-0.019
 afraid-0.018
 worried-0.017
 foolish-0.017
 scared-0.017
Token his
Token ID465
Feature activation-7.29e-04

Pos logit contributions
irs+0.043
ide+0.040
rones+0.035
 occasions+0.035
aff+0.034
Neg logit contributions
 afraid-0.053
 sorry-0.051
 worried-0.051
 scared-0.050
 okay-0.049
Token tail
Token ID7894
Feature activation+8.85e-03

Pos logit contributions
 afraid+0.001
 worried+0.001
 sorry+0.001
 scared+0.001
 okay+0.001
Neg logit contributions
irs-0.002
ide-0.001
rones-0.001
 occasions-0.001
growth-0.001
Token and
Token ID290
Feature activation+4.81e-03

Pos logit contributions
irs+0.020
ide+0.019
 occasions+0.017
rones+0.017
growth+0.017
Neg logit contributions
 afraid-0.014
 sorry-0.013
 worried-0.013
 scared-0.013
 okay-0.012
Token l
Token ID300
Feature activation-4.21e-03

Pos logit contributions
irs+0.010
ide+0.010
 occasions+0.009
rones+0.009
aff+0.008
Neg logit contributions
 afraid-0.008
 sorry-0.008
 worried-0.008
 scared-0.008
 foolish-0.007
Tokenicks
Token ID3378
Feature activation+8.55e-02

Pos logit contributions
 sorry+0.009
 afraid+0.009
 worried+0.008
 scared+0.008
 okay+0.008
Neg logit contributions
irs-0.007
ide-0.007
 occasions-0.007
rones-0.006
growth-0.006
Token their
Token ID511
Feature activation+2.73e-02

Pos logit contributions
irs+0.204
isk+0.192
 occasions+0.192
rones+0.185
ide+0.183
Neg logit contributions
 sorry-0.128
 very-0.117
 scared-0.113
 afraid-0.111
 okay-0.110
Token faces
Token ID6698
Feature activation+4.10e-02

Pos logit contributions
irs+0.056
ide+0.053
rones+0.043
rap+0.043
growth+0.042
Neg logit contributions
 sorry-0.053
 worried-0.050
 afraid-0.050
 scared-0.049
 sad-0.049
Token.
Token ID13
Feature activation-4.09e-03

Pos logit contributions
irs+0.085
ide+0.078
 occasions+0.072
rones+0.072
growth+0.070
Neg logit contributions
 afraid-0.073
 sorry-0.071
 worried-0.070
 scared-0.068
 okay-0.066
Token He
Token ID679
Feature activation+1.65e-02

Pos logit contributions
 sorry+0.007
 afraid+0.007
 worried+0.006
 scared+0.006
 okay+0.006
Neg logit contributions
irs-0.009
ide-0.009
 occasions-0.008
rones-0.008
aff-0.008
Token is
Token ID318
Feature activation-3.26e-02

Pos logit contributions
irs+0.040
ide+0.038
 occasions+0.035
rones+0.034
growth+0.033
Neg logit contributions
 afraid-0.025
 scared-0.024
 worried-0.024
 sorry-0.024
 very-0.022
Token it
Token ID340
Feature activation+7.98e-03

Pos logit contributions
irs+0.123
ide+0.113
rones+0.110
 occasions+0.109
aff+0.105
Neg logit contributions
 very-0.075
 sorry-0.073
 okay-0.071
 oh-0.069
 foolish-0.068
Token,
Token ID11
Feature activation-3.87e-02

Pos logit contributions
irs+0.018
ide+0.016
rones+0.015
 occasions+0.015
growth+0.015
Neg logit contributions
 afraid-0.013
 sorry-0.013
 worried-0.013
 scared-0.012
 okay-0.012
Token and
Token ID290
Feature activation-2.95e-02

Pos logit contributions
 afraid+0.054
 worried+0.050
 scared+0.047
 sorry+0.046
 careless+0.045
Neg logit contributions
irs-0.086
ide-0.081
 occasions-0.076
of-0.074
aff-0.073
Token then
Token ID788
Feature activation-2.48e-02

Pos logit contributions
 afraid+0.047
 worried+0.044
 sorry+0.043
 scared+0.041
 foolish+0.040
Neg logit contributions
irs-0.064
ide-0.059
 occasions-0.056
of-0.054
growth-0.054
Token you
Token ID345
Feature activation-1.89e-02

Pos logit contributions
 afraid+0.039
 worried+0.037
 sorry+0.037
 scared+0.035
 okay+0.034
Neg logit contributions
irs-0.056
ide-0.053
 occasions-0.048
rones-0.048
aff-0.047
Token slur
Token ID40066
Feature activation+8.55e-02

Pos logit contributions
 afraid+0.024
 worried+0.023
 sorry+0.023
 scared+0.022
 okay+0.021
Neg logit contributions
irs-0.049
ide-0.046
rones-0.043
 occasions-0.043
growth-0.042
Tokenp
Token ID79
Feature activation-1.77e-03

Pos logit contributions
irs+0.188
ide+0.171
 occasions+0.166
rones+0.160
 parties+0.159
Neg logit contributions
 sorry-0.122
 afraid-0.120
 okay-0.119
 scared-0.119
 very-0.115
Token it
Token ID340
Feature activation+1.23e-02

Pos logit contributions
 sorry+0.003
 afraid+0.003
 okay+0.003
 scared+0.003
 very+0.003
Neg logit contributions
irs-0.004
 occasions-0.004
ide-0.004
rones-0.004
growth-0.003
Token down
Token ID866
Feature activation+2.83e-02

Pos logit contributions
irs+0.032
rones+0.028
ide+0.028
 occasions+0.028
growth+0.028
Neg logit contributions
 afraid-0.016
 very-0.015
 scared-0.015
 sorry-0.015
 okay-0.015
Token your
Token ID534
Feature activation-7.29e-03

Pos logit contributions
irs+0.059
ide+0.055
rones+0.050
 occasions+0.050
growth+0.049
Neg logit contributions
 afraid-0.050
 sorry-0.048
 worried-0.048
 scared-0.046
 okay-0.045
Token mouth
Token ID5422
Feature activation-5.12e-03

Pos logit contributions
 afraid+0.013
 sorry+0.012
 worried+0.012
 scared+0.012
 okay+0.011
Neg logit contributions
irs-0.015
ide-0.014
rones-0.013
 occasions-0.013
growth-0.013
Token."
Token ID526
Feature activation+3.47e-02

Pos logit contributions
irs+0.037
ide+0.034
rones+0.031
 occasions+0.031
growth+0.030
Neg logit contributions
 afraid-0.034
 sorry-0.033
 worried-0.033
 scared-0.032
 okay-0.031
Token He
Token ID679
Feature activation+2.29e-02

Pos logit contributions
irs+0.082
ide+0.077
 occasions+0.070
rones+0.066
growth+0.066
Neg logit contributions
 afraid-0.056
 sorry-0.055
 worried-0.055
 scared-0.054
 very-0.050
Token gives
Token ID3607
Feature activation+2.86e-02

Pos logit contributions
irs+0.046
ide+0.043
 occasions+0.039
rones+0.038
growth+0.037
Neg logit contributions
 afraid-0.042
 sorry-0.041
 worried-0.041
 scared-0.040
 foolish-0.038
Token some
Token ID617
Feature activation+3.14e-02

Pos logit contributions
irs+0.060
ide+0.058
 occasions+0.053
growth+0.051
aff+0.050
Neg logit contributions
 worried-0.046
 afraid-0.046
 oh-0.045
 very-0.045
 scared-0.045
Token powder
Token ID11913
Feature activation-2.48e-02

Pos logit contributions
irs+0.061
ide+0.056
 occasions+0.051
rones+0.050
growth+0.049
Neg logit contributions
 afraid-0.059
 sorry-0.057
 worried-0.056
 scared-0.054
 okay-0.053
Token to
Token ID284
Feature activation+8.55e-02

Pos logit contributions
 afraid+0.040
 sorry+0.039
 worried+0.038
 scared+0.037
 okay+0.036
Neg logit contributions
irs-0.054
ide-0.051
rones-0.046
 occasions-0.046
aff-0.045
Token L
Token ID406
Feature activation-1.27e-04

Pos logit contributions
irs+0.230
ide+0.220
rones+0.217
 occasions+0.217
growth+0.214
Neg logit contributions
 sorry-0.093
 very-0.088
 oh-0.083
 scared-0.078
 sad-0.078
Tokenila
Token ID10102
Feature activation+1.04e-02

Pos logit contributions
 afraid+0.000
 worried+0.000
 sorry+0.000
 scared+0.000
 foolish+0.000
Neg logit contributions
irs-0.000
ide-0.000
 occasions-0.000
rones-0.000
growth-0.000
Token and
Token ID290
Feature activation+8.68e-04

Pos logit contributions
irs+0.028
ide+0.025
 occasions+0.024
rones+0.024
growth+0.023
Neg logit contributions
 afraid-0.013
 worried-0.013
 sorry-0.013
 scared-0.012
 okay-0.012
Token Ben
Token ID3932
Feature activation+4.29e-03

Pos logit contributions
irs+0.002
ide+0.002
rones+0.002
 occasions+0.002
aff+0.002
Neg logit contributions
 afraid-0.001
 sorry-0.001
 worried-0.001
 scared-0.001
 okay-0.001
Token.
Token ID13
Feature activation+3.74e-03

Pos logit contributions
irs+0.010
ide+0.010
rones+0.009
 occasions+0.009
aff+0.009
Neg logit contributions
 sorry-0.006
 worried-0.006
 afraid-0.006
 scared-0.006
 very-0.006
Token a
Token ID257
Feature activation+2.62e-02

Pos logit contributions
irs+0.094
ide+0.089
rones+0.082
 occasions+0.081
growth+0.079
Neg logit contributions
 afraid-0.063
 sorry-0.061
 worried-0.061
 scared-0.059
 okay-0.057
Token lot
Token ID1256
Feature activation-2.46e-02

Pos logit contributions
irs+0.042
ide+0.041
of+0.037
aff+0.037
 occasions+0.034
Neg logit contributions
 sorry-0.052
 afraid-0.052
 worried-0.051
 oh-0.047
 okay-0.047
Token of
Token ID286
Feature activation+1.85e-02

Pos logit contributions
 afraid+0.045
 sorry+0.043
 worried+0.043
 scared+0.041
 okay+0.040
Neg logit contributions
irs-0.049
ide-0.046
rones-0.042
 occasions-0.042
growth-0.040
Token fish
Token ID5916
Feature activation+9.33e-03

Pos logit contributions
irs+0.036
ide+0.032
 occasions+0.031
rones+0.029
 parties+0.029
Neg logit contributions
 afraid-0.034
 worried-0.033
 sorry-0.033
 okay-0.031
 scared-0.031
Token!"
Token ID2474
Feature activation+4.74e-02

Pos logit contributions
irs+0.019
ide+0.018
 occasions+0.016
rones+0.016
aff+0.016
Neg logit contributions
 afraid-0.016
 sorry-0.016
 worried-0.016
 scared-0.015
 okay-0.014
Token\n
Token ID198
Feature activation+8.53e-02

Pos logit contributions
irs+0.102
ide+0.095
 occasions+0.086
isk+0.083
aff+0.083
Neg logit contributions
 afraid-0.084
 scared-0.081
 sorry-0.081
 worried-0.080
 very-0.077
Token\n
Token ID198
Feature activation+8.11e-02

Pos logit contributions
irs+0.194
ide+0.183
 occasions+0.173
 parties+0.172
aff+0.171
Neg logit contributions
 afraid-0.134
 sorry-0.131
 very-0.130
 scared-0.128
 worried-0.125
TokenLou
Token ID24016
Feature activation+4.20e-02

Pos logit contributions
irs+0.155
ide+0.142
aff+0.132
 parties+0.128
rones+0.126
Neg logit contributions
 afraid-0.161
 worried-0.152
 very-0.151
 scared-0.150
 curious-0.149
Tokenise
Token ID786
Feature activation-1.25e-03

Pos logit contributions
irs+0.074
ide+0.065
rones+0.059
 occasions+0.058
aff+0.057
Neg logit contributions
 afraid-0.088
 worried-0.086
 sorry-0.084
 scared-0.083
 okay-0.081
Token was
Token ID373
Feature activation-4.15e-02

Pos logit contributions
 afraid+0.002
 worried+0.002
 scared+0.002
 sorry+0.002
 very+0.002
Neg logit contributions
irs-0.003
ide-0.003
 occasions-0.003
rones-0.003
growth-0.003
Token so
Token ID523
Feature activation-4.07e-02

Pos logit contributions
 afraid+0.035
 oh+0.034
 sorry+0.034
 foolish+0.033
 worried+0.032
Neg logit contributions
irs-0.100
ide-0.099
rones-0.089
ate-0.089
Jul-0.086
Token how
Token ID703
Feature activation-2.31e-02

Pos logit contributions
irs+0.079
ide+0.074
rones+0.069
aff+0.065
 occasions+0.064
Neg logit contributions
 afraid-0.078
 very-0.077
 sorry-0.077
 worried-0.077
 scared-0.074
Token special
Token ID2041
Feature activation+2.05e-02

Pos logit contributions
 afraid+0.032
 worried+0.030
 sorry+0.030
 scared+0.029
 foolish+0.028
Neg logit contributions
irs-0.057
ide-0.054
rones-0.050
 occasions-0.050
growth-0.049
Token she
Token ID673
Feature activation+2.30e-02

Pos logit contributions
irs+0.044
ide+0.041
 occasions+0.037
rones+0.037
growth+0.036
Neg logit contributions
 afraid-0.035
 sorry-0.033
 worried-0.033
 scared-0.032
 okay-0.031
Token looked
Token ID3114
Feature activation-6.13e-03

Pos logit contributions
irs+0.059
ide+0.055
rones+0.051
 occasions+0.051
aff+0.050
Neg logit contributions
 afraid-0.030
 worried-0.029
 sorry-0.028
 scared-0.027
 okay-0.026
Token.
Token ID13
Feature activation+1.54e-02

Pos logit contributions
 afraid+0.008
 sorry+0.008
 worried+0.008
 scared+0.008
 oh+0.007
Neg logit contributions
irs-0.015
ide-0.014
 occasions-0.013
rones-0.013
aff-0.013
Token\n
Token ID198
Feature activation+8.53e-02

Pos logit contributions
ide+0.033
irs+0.031
 occasions+0.028
rones+0.026
aff+0.026
Neg logit contributions
 afraid-0.026
 sorry-0.025
 worried-0.025
 very-0.024
 scared-0.024
Token\n
Token ID198
Feature activation+7.97e-02

Pos logit contributions
ide+0.199
irs+0.196
 occasions+0.188
aff+0.179
rones+0.176
Neg logit contributions
 very-0.125
 afraid-0.119
 sorry-0.118
 worried-0.115
 scared-0.112
TokenAnd
Token ID1870
Feature activation-2.49e-02

Pos logit contributions
ide+0.142
irs+0.137
rones+0.125
aff+0.123
 occasions+0.120
Neg logit contributions
 very-0.156
 afraid-0.155
 worried-0.150
 determined-0.145
 scared-0.144
Tokenrea
Token ID21468
Feature activation-8.51e-03

Pos logit contributions
 afraid+0.041
 sorry+0.040
 worried+0.040
 scared+0.038
 okay+0.037
Neg logit contributions
irs-0.054
ide-0.051
rones-0.047
 occasions-0.046
growth-0.045
Token was
Token ID373
Feature activation-4.18e-02

Pos logit contributions
 afraid+0.012
 sorry+0.011
 worried+0.011
 scared+0.011
 okay+0.011
Neg logit contributions
irs-0.020
ide-0.019
 occasions-0.018
rones-0.018
aff-0.017
Token so
Token ID523
Feature activation-4.16e-02

Pos logit contributions
 afraid+0.040
 sorry+0.036
 worried+0.036
 oh+0.034
 scared+0.033
Neg logit contributions
irs-0.110
ide-0.106
rones-0.096
 occasions-0.096
ate-0.095
TokenI
Token ID40
Feature activation-6.74e-03

Pos logit contributions
irs+0.015
ide+0.014
rones+0.013
aff+0.012
 occasions+0.012
Neg logit contributions
 afraid-0.008
 worried-0.008
 sorry-0.008
 scared-0.007
 okay-0.007
Token am
Token ID716
Feature activation-4.66e-02

Pos logit contributions
 afraid+0.010
 sorry+0.010
 oh+0.010
 worried+0.010
 careless+0.009
Neg logit contributions
irs-0.014
ide-0.014
 occasions-0.012
of-0.012
rones-0.012
Token very
Token ID845
Feature activation-5.07e-02

Pos logit contributions
 afraid+0.039
 oh+0.038
 worried+0.034
 foolish+0.032
 careless+0.032
Neg logit contributions
irs-0.122
ide-0.111
 parties-0.106
rones-0.103
ate-0.102
Token happy
Token ID3772
Feature activation+1.12e-02

Pos logit contributions
 afraid+0.050
 oh+0.048
 worried+0.044
 okay+0.043
 sorry+0.042
Neg logit contributions
irs-0.123
ide-0.117
ate-0.108
of-0.105
rones-0.103
Token."
Token ID526
Feature activation+3.80e-02

Pos logit contributions
irs+0.025
ide+0.024
 occasions+0.022
rones+0.021
aff+0.021
Neg logit contributions
 afraid-0.017
 sorry-0.017
 worried-0.017
 scared-0.016
 foolish-0.015
Token\n
Token ID198
Feature activation+8.52e-02

Pos logit contributions
irs+0.078
ide+0.074
 occasions+0.068
rones+0.065
isk+0.063
Neg logit contributions
 afraid-0.068
 sorry-0.065
 worried-0.065
 scared-0.064
 very-0.061
Token\n
Token ID198
Feature activation+8.02e-02

Pos logit contributions
irs+0.194
ide+0.186
 occasions+0.181
 parties+0.177
isk+0.173
Neg logit contributions
 sorry-0.113
 very-0.113
 afraid-0.110
 worried-0.109
 scared-0.107
TokenJ
Token ID41
Feature activation+1.70e-02

Pos logit contributions
irs+0.157
ide+0.138
 parties+0.137
rones+0.136
isk+0.129
Neg logit contributions
 afraid-0.141
 worried-0.138
 very-0.136
 curious-0.135
 determined-0.133
Tokenessa
Token ID21411
Feature activation+5.07e-03

Pos logit contributions
irs+0.028
ide+0.026
 occasions+0.025
rones+0.025
 parties+0.023
Neg logit contributions
 afraid-0.033
 worried-0.032
 sorry-0.031
 okay-0.031
 scared-0.031
Token smiled
Token ID13541
Feature activation+1.99e-02

Pos logit contributions
irs+0.013
ide+0.012
 occasions+0.011
rones+0.011
growth+0.011
Neg logit contributions
 afraid-0.007
 worried-0.006
 sorry-0.006
 scared-0.006
 very-0.006
Token,
Token ID11
Feature activation-1.54e-02

Pos logit contributions
irs+0.044
ide+0.043
rones+0.039
aff+0.038
growth+0.038
Neg logit contributions
 afraid-0.030
 sorry-0.028
 worried-0.028
 oh-0.027
 scared-0.026
Token penny
Token ID26369
Feature activation-1.75e-02

Pos logit contributions
irs+0.050
ide+0.049
 occasions+0.049
of+0.044
aff+0.044
Neg logit contributions
 afraid-0.041
 sorry-0.039
 worried-0.038
 scared-0.036
 okay-0.036
Token the
Token ID262
Feature activation+1.95e-02

Pos logit contributions
 afraid+0.027
 sorry+0.027
 worried+0.026
 scared+0.025
 okay+0.024
Neg logit contributions
irs-0.038
ide-0.038
 occasions-0.033
of-0.033
rones-0.033
Token bird
Token ID6512
Feature activation+1.13e-02

Pos logit contributions
irs+0.036
ide+0.035
 occasions+0.035
growth+0.032
rones+0.031
Neg logit contributions
 afraid-0.034
 sorry-0.032
 worried-0.031
 scared-0.030
 oh-0.029
Token had
Token ID550
Feature activation+6.51e-03

Pos logit contributions
irs+0.026
ide+0.025
 occasions+0.022
rones+0.022
aff+0.022
Neg logit contributions
 afraid-0.017
 sorry-0.017
 worried-0.016
 scared-0.016
 okay-0.015
Token given
Token ID1813
Feature activation+3.98e-02

Pos logit contributions
irs+0.016
ide+0.015
rones+0.014
aff+0.013
 occasions+0.013
Neg logit contributions
 afraid-0.009
 worried-0.009
 sorry-0.009
 scared-0.009
 okay-0.009
Token to
Token ID284
Feature activation+8.51e-02

Pos logit contributions
irs+0.072
ide+0.067
rones+0.060
 occasions+0.059
growth+0.058
Neg logit contributions
 afraid-0.080
 sorry-0.078
 worried-0.078
 scared-0.076
 okay-0.074
Token him
Token ID683
Feature activation+2.41e-02

Pos logit contributions
irs+0.182
ide+0.174
rones+0.159
aff+0.158
growth+0.152
Neg logit contributions
 sorry-0.147
 afraid-0.143
 okay-0.139
 very-0.138
 worried-0.136
Token.
Token ID13
Feature activation+8.50e-03

Pos logit contributions
irs+0.056
ide+0.052
rones+0.048
growth+0.047
 occasions+0.047
Neg logit contributions
 afraid-0.038
 sorry-0.037
 worried-0.036
 okay-0.036
 scared-0.036
Token<|endoftext|>
Token ID50256
Feature activation+4.81e-02

Pos logit contributions
irs+0.018
ide+0.017
 occasions+0.015
aff+0.015
rones+0.014
Neg logit contributions
 sorry-0.015
 afraid-0.015
 worried-0.014
 okay-0.014
 scared-0.014
TokenOnce
Token ID7454
Feature activation-1.83e-02

Pos logit contributions
irs+0.094
ide+0.084
 occasions+0.080
aff+0.077
rap+0.073
Neg logit contributions
 afraid-0.090
 scared-0.088
 worried-0.087
 very-0.085
 sorry-0.084
Token upon
Token ID2402
Feature activation+2.51e-02

Pos logit contributions
 afraid+0.036
 worried+0.035
 sorry+0.035
 scared+0.034
 okay+0.033
Neg logit contributions
irs-0.034
 occasions-0.031
growth-0.030
ide-0.030
rones-0.030
Token a
Token ID257
Feature activation+1.44e-02

Pos logit contributions
irs+0.094
ide+0.090
rones+0.083
 occasions+0.082
aff+0.082
Neg logit contributions
 afraid-0.050
 sorry-0.049
 worried-0.048
 scared-0.046
 okay-0.045
Token huge
Token ID3236
Feature activation+1.09e-02

Pos logit contributions
irs+0.025
ide+0.023
 occasions+0.021
 parties+0.020
rones+0.020
Neg logit contributions
 afraid-0.029
 worried-0.028
 sorry-0.027
 okay-0.026
 oh-0.026
Token sand
Token ID6450
Feature activation+2.11e-03

Pos logit contributions
irs+0.016
ide+0.016
 occasions+0.014
 parties+0.014
aff+0.013
Neg logit contributions
 afraid-0.025
 worried-0.023
 sorry-0.022
 okay-0.022
 oh-0.022
Tokencastle
Token ID18676
Feature activation-4.92e-03

Pos logit contributions
irs+0.003
ide+0.003
rones+0.003
 occasions+0.003
growth+0.003
Neg logit contributions
 afraid-0.005
 sorry-0.005
 worried-0.005
 scared-0.004
 okay-0.004
Token!"
Token ID2474
Feature activation+5.49e-02

Pos logit contributions
 afraid+0.008
 sorry+0.008
 worried+0.008
 scared+0.008
 okay+0.007
Neg logit contributions
irs-0.010
ide-0.010
 occasions-0.009
rones-0.009
aff-0.009
Token\n
Token ID198
Feature activation+8.50e-02

Pos logit contributions
irs+0.107
ide+0.103
 occasions+0.090
rones+0.090
isk+0.083
Neg logit contributions
 afraid-0.104
 sorry-0.103
 worried-0.102
 scared-0.100
 very-0.096
Token\n
Token ID198
Feature activation+8.17e-02

Pos logit contributions
irs+0.175
ide+0.171
rones+0.150
 occasions+0.149
isk+0.145
Neg logit contributions
 afraid-0.148
 sorry-0.147
 worried-0.145
 very-0.143
 scared-0.142
TokenSo
Token ID2396
Feature activation-1.21e-02

Pos logit contributions
irs+0.159
ide+0.151
rones+0.135
 occasions+0.127
isk+0.124
Neg logit contributions
 afraid-0.155
 worried-0.153
 sorry-0.148
 very-0.147
 scared-0.147
Token,
Token ID11
Feature activation-1.60e-02

Pos logit contributions
 afraid+0.021
 sorry+0.021
 worried+0.020
 scared+0.020
 okay+0.019
Neg logit contributions
irs-0.024
ide-0.023
 occasions-0.021
aff-0.020
rones-0.020
Token Nan
Token ID18008
Feature activation+3.23e-03

Pos logit contributions
 afraid+0.032
 sorry+0.031
 worried+0.031
 scared+0.030
 okay+0.029
Neg logit contributions
irs-0.029
ide-0.026
aff-0.023
rones-0.023
 occasions-0.023
Token and
Token ID290
Feature activation+1.32e-02

Pos logit contributions
irs+0.007
ide+0.007
aff+0.006
rones+0.006
of+0.006
Neg logit contributions
 afraid-0.005
 sorry-0.005
 worried-0.004
 okay-0.004
 scared-0.004
Token a
Token ID257
Feature activation+6.86e-03

Pos logit contributions
 afraid+0.029
 sorry+0.027
 worried+0.027
 scared+0.025
 oh+0.025
Neg logit contributions
irs-0.045
ide-0.039
 occasions-0.038
rones-0.037
of-0.037
Token namesake
Token ID49829
Feature activation-9.42e-03

Pos logit contributions
irs+0.011
of+0.011
ide+0.011
aff+0.010
 occasions+0.010
Neg logit contributions
 afraid-0.013
 sorry-0.012
 worried-0.012
 scared-0.012
 okay-0.011
Token bowl
Token ID9396
Feature activation-5.71e-03

Pos logit contributions
 afraid+0.020
 sorry+0.019
 worried+0.018
 scared+0.018
 okay+0.017
Neg logit contributions
ide-0.016
irs-0.016
 occasions-0.015
rones-0.014
aff-0.014
Token!"
Token ID2474
Feature activation+4.41e-02

Pos logit contributions
 afraid+0.009
 sorry+0.008
 worried+0.008
 scared+0.008
 foolish+0.008
Neg logit contributions
irs-0.013
ide-0.013
 occasions-0.011
rones-0.011
aff-0.011
Token
Token ID220
Feature activation+3.76e-02

Pos logit contributions
irs+0.087
ide+0.082
rones+0.073
 occasions+0.072
aff+0.071
Neg logit contributions
 afraid-0.083
 worried-0.081
 sorry-0.081
 scared-0.078
 very-0.076
Token\n
Token ID198
Feature activation+8.48e-02

Pos logit contributions
irs+0.084
 occasions+0.082
ide+0.080
rones+0.076
growth+0.075
Neg logit contributions
 sorry-0.055
 afraid-0.054
 okay-0.054
 worried-0.053
 scared-0.052
Token\n
Token ID198
Feature activation+7.88e-02

Pos logit contributions
irs+0.182
ide+0.173
 occasions+0.163
rones+0.154
isk+0.153
Neg logit contributions
 afraid-0.140
 worried-0.137
 sorry-0.135
 very-0.135
 scared-0.132
TokenAndrew
Token ID20508
Feature activation+3.53e-02

Pos logit contributions
irs+0.139
ide+0.129
rones+0.114
 occasions+0.114
aff+0.110
Neg logit contributions
 afraid-0.164
 worried-0.160
 sorry-0.153
 very-0.153
 scared-0.153
Token couldn
Token ID3521
Feature activation-1.66e-02

Pos logit contributions
irs+0.090
ide+0.083
rones+0.078
 occasions+0.077
growth+0.076
Neg logit contributions
 afraid-0.047
 worried-0.045
 sorry-0.044
 scared-0.043
 okay-0.041
Token't
Token ID470
Feature activation+1.18e-02

Pos logit contributions
 sorry+0.028
 afraid+0.027
 worried+0.026
 scared+0.026
 okay+0.026
Neg logit contributions
irs-0.036
ide-0.033
rones-0.032
 occasions-0.031
aff-0.030
Token believe
Token ID1975
Feature activation+2.09e-02

Pos logit contributions
irs+0.027
ide+0.025
rones+0.024
 occasions+0.023
aff+0.023
Neg logit contributions
 afraid-0.018
 sorry-0.018
 worried-0.018
 scared-0.017
 okay-0.017
Token go
Token ID467
Feature activation-7.00e-03

Pos logit contributions
irs+0.013
ide+0.012
rones+0.011
 occasions+0.011
aff+0.011
Neg logit contributions
 afraid-0.012
 sorry-0.012
 worried-0.012
 scared-0.012
 okay-0.011
Token have
Token ID423
Feature activation-8.54e-03

Pos logit contributions
 afraid+0.010
 sorry+0.010
 worried+0.010
 scared+0.010
 very+0.009
Neg logit contributions
irs-0.016
ide-0.016
rones-0.014
 occasions-0.014
growth-0.014
Token some
Token ID617
Feature activation+1.58e-02

Pos logit contributions
 afraid+0.011
 sorry+0.011
 worried+0.011
 scared+0.010
 okay+0.010
Neg logit contributions
irs-0.021
ide-0.020
rones-0.018
 occasions-0.018
aff-0.018
Token fun
Token ID1257
Feature activation+1.06e-02

Pos logit contributions
irs+0.035
ide+0.033
 occasions+0.032
 parties+0.031
rones+0.030
Neg logit contributions
 afraid-0.025
 worried-0.024
 sorry-0.024
 scared-0.022
 oh-0.022
Token!"
Token ID2474
Feature activation+4.12e-02

Pos logit contributions
irs+0.022
ide+0.021
 occasions+0.019
rones+0.019
aff+0.018
Neg logit contributions
 afraid-0.018
 sorry-0.018
 worried-0.018
 scared-0.017
 okay-0.016
Token\n
Token ID198
Feature activation+8.46e-02

Pos logit contributions
irs+0.086
ide+0.082
 occasions+0.072
rones+0.071
aff+0.070
Neg logit contributions
 afraid-0.073
 sorry-0.073
 worried-0.071
 scared-0.070
 very-0.067
Token\n
Token ID198
Feature activation+7.89e-02

Pos logit contributions
irs+0.184
ide+0.179
 occasions+0.161
aff+0.157
rones+0.157
Neg logit contributions
 sorry-0.141
 afraid-0.138
 very-0.137
 worried-0.136
 scared-0.134
TokenGray
Token ID46130
Feature activation-7.59e-03

Pos logit contributions
irs+0.148
ide+0.145
rones+0.126
aff+0.126
 occasions+0.121
Neg logit contributions
 afraid-0.152
 sorry-0.150
 worried-0.150
 very-0.148
 okay-0.146
Token agreed
Token ID4987
Feature activation+2.39e-02

Pos logit contributions
 afraid+0.012
 sorry+0.012
 worried+0.011
 okay+0.011
 scared+0.011
Neg logit contributions
irs-0.016
ide-0.016
 occasions-0.014
rones-0.014
of-0.014
Token and
Token ID290
Feature activation-9.86e-03

Pos logit contributions
irs+0.053
ide+0.053
 occasions+0.048
of+0.047
aff+0.046
Neg logit contributions
 afraid-0.036
 worried-0.033
 sorry-0.032
 scared-0.031
 oh-0.031
Token they
Token ID484
Feature activation+2.08e-02

Pos logit contributions
 afraid+0.017
 worried+0.016
 sorry+0.015
 okay+0.015
 foolish+0.015
Neg logit contributions
irs-0.020
ide-0.019
of-0.018
 occasions-0.017
aff-0.017
Token a
Token ID257
Feature activation-2.09e-02

Pos logit contributions
irs+0.071
ide+0.067
 occasions+0.062
rones+0.062
aff+0.061
Neg logit contributions
 afraid-0.029
 sorry-0.028
 worried-0.028
 scared-0.026
 okay-0.025
Token time
Token ID640
Feature activation-7.67e-02

Pos logit contributions
 afraid+0.030
 sorry+0.029
 worried+0.029
 scared+0.028
 okay+0.027
Neg logit contributions
irs-0.049
ide-0.047
rones-0.043
 occasions-0.043
growth-0.042
Token there
Token ID612
Feature activation+9.42e-03

Pos logit contributions
 afraid+0.099
 worried+0.094
 sorry+0.092
 scared+0.088
 foolish+0.084
Neg logit contributions
irs-0.192
ide-0.188
rones-0.166
of-0.163
 occasions-0.162
Token was
Token ID373
Feature activation-5.94e-02

Pos logit contributions
irs+0.025
ide+0.022
 occasions+0.022
aff+0.022
rones+0.022
Neg logit contributions
 afraid-0.012
 worried-0.011
 scared-0.011
 sorry-0.011
 very-0.010
Token a
Token ID257
Feature activation-1.99e-02

Pos logit contributions
 afraid+0.084
 worried+0.081
 sorry+0.081
 scared+0.074
 foolish+0.072
Neg logit contributions
irs-0.141
ide-0.135
rones-0.126
growth-0.123
 occasions-0.118
Token Stub
Token ID41135
Feature activation-9.56e-02

Pos logit contributions
 afraid+0.037
 sorry+0.035
 okay+0.034
 oh+0.034
 worried+0.034
Neg logit contributions
ide-0.034
irs-0.031
aff-0.029
of-0.028
rones-0.028
Tokenborn
Token ID6286
Feature activation-1.62e-02

Pos logit contributions
 afraid+0.165
 worried+0.165
 sorry+0.163
 oh+0.161
 scared+0.158
Neg logit contributions
irs-0.193
ide-0.186
aff-0.166
of-0.165
rones-0.155
Token Old
Token ID5706
Feature activation-5.07e-03

Pos logit contributions
 afraid+0.030
 sorry+0.030
 worried+0.029
 okay+0.028
 scared+0.028
Neg logit contributions
irs-0.029
ide-0.029
aff-0.026
of-0.026
rones-0.025
Token Woman
Token ID14381
Feature activation-2.92e-02

Pos logit contributions
 afraid+0.009
 worried+0.009
 sorry+0.009
 scared+0.008
 foolish+0.008
Neg logit contributions
irs-0.011
ide-0.010
 occasions-0.009
rones-0.009
growth-0.009
Token.
Token ID13
Feature activation-1.58e-02

Pos logit contributions
 afraid+0.049
 worried+0.046
 sorry+0.045
 scared+0.045
 very+0.044
Neg logit contributions
irs-0.065
ide-0.058
 occasions-0.055
rones-0.054
growth-0.053
Token She
Token ID1375
Feature activation+2.22e-02

Pos logit contributions
 afraid+0.026
 worried+0.025
 scared+0.024
 sorry+0.024
 very+0.024
Neg logit contributions
irs-0.035
ide-0.033
 occasions-0.031
rones-0.029
growth-0.029
Token neck
Token ID7393
Feature activation+2.13e-03

Pos logit contributions
irs+0.042
ide+0.039
rones+0.036
 occasions+0.035
growth+0.035
Neg logit contributions
 afraid-0.033
 sorry-0.031
 worried-0.031
 scared-0.030
 okay-0.029
Token that
Token ID326
Feature activation-6.25e-03

Pos logit contributions
irs+0.004
ide+0.004
rones+0.004
 occasions+0.004
growth+0.004
Neg logit contributions
 afraid-0.004
 sorry-0.004
 worried-0.004
 scared-0.003
 okay-0.003
Token read
Token ID1100
Feature activation+2.93e-02

Pos logit contributions
 afraid+0.009
 sorry+0.008
 worried+0.008
 okay+0.008
 scared+0.008
Neg logit contributions
irs-0.015
ide-0.014
 occasions-0.013
rones-0.013
growth-0.013
Token,
Token ID11
Feature activation-2.62e-02

Pos logit contributions
irs+0.064
ide+0.060
 occasions+0.055
rones+0.055
aff+0.053
Neg logit contributions
 afraid-0.048
 sorry-0.046
 worried-0.046
 scared-0.044
 okay-0.043
Token "
Token ID366
Feature activation-2.55e-03

Pos logit contributions
 afraid+0.035
 sorry+0.034
 worried+0.033
 foolish+0.031
 okay+0.031
Neg logit contributions
irs-0.061
ide-0.057
 occasions-0.053
rones-0.053
aff-0.052
TokenFore
Token ID16351
Feature activation-9.00e-02

Pos logit contributions
 afraid+0.004
 worried+0.004
 oh+0.004
 very+0.004
 scared+0.004
Neg logit contributions
irs-0.005
ide-0.004
rones-0.004
 occasions-0.004
 parties-0.004
Tokenver
Token ID332
Feature activation-7.38e-02

Pos logit contributions
 afraid+0.161
 worried+0.157
 scared+0.154
 sorry+0.152
 foolish+0.147
Neg logit contributions
irs-0.178
ide-0.163
of-0.147
aff-0.141
rap-0.139
Token Friends
Token ID14213
Feature activation-2.49e-02

Pos logit contributions
 afraid+0.128
 sorry+0.128
 worried+0.125
 scared+0.117
 foolish+0.116
Neg logit contributions
irs-0.145
ide-0.142
 occasions-0.130
aff-0.123
rones-0.122
Token".
Token ID1911
Feature activation+1.50e-02

Pos logit contributions
 afraid+0.042
 sorry+0.042
 worried+0.041
 scared+0.040
 okay+0.039
Neg logit contributions
irs-0.053
ide-0.050
rones-0.046
 occasions-0.045
growth-0.045
Token
Token ID220
Feature activation+2.37e-02

Pos logit contributions
irs+0.034
ide+0.032
 occasions+0.030
rones+0.029
ades+0.029
Neg logit contributions
 afraid-0.022
 sorry-0.022
 scared-0.022
 worried-0.022
 very-0.021
Token\n
Token ID198
Feature activation+6.69e-02

Pos logit contributions
 occasions+0.054
irs+0.052
growth+0.050
ide+0.050
rones+0.047
Neg logit contributions
 sorry-0.035
 okay-0.035
 afraid-0.034
 scared-0.033
 worried-0.032
Token just
Token ID655
Feature activation-5.54e-02

Pos logit contributions
 afraid+0.039
 sorry+0.038
 worried+0.037
 guilty+0.035
 ashamed+0.034
Neg logit contributions
irs-0.063
ide-0.062
isk-0.061
rap-0.060
 occasions-0.057
Token a
Token ID257
Feature activation-8.11e-03

Pos logit contributions
 afraid+0.078
 worried+0.076
 sorry+0.073
 scared+0.069
 determined+0.068
Neg logit contributions
irs-0.119
ide-0.108
 occasions-0.104
growth-0.100
isk-0.100
Token spider
Token ID19230
Feature activation-3.49e-02

Pos logit contributions
 afraid+0.014
 sorry+0.014
 okay+0.014
 worried+0.013
 oh+0.013
Neg logit contributions
 occasions-0.012
ide-0.012
irs-0.012
growth-0.011
ware-0.011
Token."
Token ID526
Feature activation+3.27e-02

Pos logit contributions
 afraid+0.063
 sorry+0.061
 worried+0.061
 okay+0.058
 scared+0.057
Neg logit contributions
ide-0.064
irs-0.064
 occasions-0.058
rones-0.055
aff-0.053
Token The
Token ID383
Feature activation+1.15e-02

Pos logit contributions
irs+0.060
ide+0.056
 occasions+0.049
rones+0.048
growth+0.047
Neg logit contributions
 afraid-0.066
 sorry-0.066
 worried-0.066
 scared-0.063
 okay-0.063
Token Stub
Token ID41135
Feature activation-8.88e-02

Pos logit contributions
irs+0.017
ide+0.017
 occasions+0.016
growth+0.015
rones+0.015
Neg logit contributions
 afraid-0.021
 okay-0.021
 sorry-0.021
 worried-0.020
 oh-0.020
Tokenborn
Token ID6286
Feature activation-6.27e-03

Pos logit contributions
 worried+0.156
 afraid+0.155
 oh+0.152
 sorry+0.150
 okay+0.148
Neg logit contributions
irs-0.171
ide-0.158
of-0.150
aff-0.148
 occasions-0.137
Token Old
Token ID5706
Feature activation-1.21e-02

Pos logit contributions
 afraid+0.012
 sorry+0.012
 worried+0.012
 scared+0.011
 okay+0.011
Neg logit contributions
irs-0.011
ide-0.011
aff-0.010
 occasions-0.010
rones-0.010
Token Woman
Token ID14381
Feature activation-2.18e-02

Pos logit contributions
 afraid+0.024
 worried+0.024
 sorry+0.024
 scared+0.023
 okay+0.022
Neg logit contributions
irs-0.022
ide-0.020
rones-0.018
 occasions-0.018
growth-0.018
Token shook
Token ID14682
Feature activation+1.83e-02

Pos logit contributions
 afraid+0.029
 sorry+0.028
 worried+0.027
 scared+0.026
 okay+0.026
Neg logit contributions
irs-0.053
ide-0.052
rones-0.046
 occasions-0.046
aff-0.046
Token her
Token ID607
Feature activation-8.04e-03

Pos logit contributions
irs+0.019
ide+0.019
of+0.014
 occasions+0.014
growth+0.013
Neg logit contributions
 afraid-0.047
 sorry-0.045
 worried-0.044
 careless-0.044
 foolish-0.044
Token high
Token ID1029
Feature activation-7.06e-03

Pos logit contributions
 afraid+0.061
 sorry+0.059
 worried+0.058
 scared+0.054
 okay+0.053
Neg logit contributions
irs-0.141
ide-0.135
rones-0.124
 occasions-0.122
aff-0.121
Token they
Token ID484
Feature activation-7.19e-03

Pos logit contributions
 afraid+0.012
 sorry+0.012
 worried+0.011
 okay+0.011
 scared+0.011
Neg logit contributions
ide-0.014
irs-0.014
aff-0.012
 occasions-0.012
growth-0.012
Token were
Token ID547
Feature activation-4.47e-02

Pos logit contributions
 sorry+0.010
 afraid+0.010
 worried+0.009
 oh+0.009
 okay+0.009
Neg logit contributions
irs-0.017
ide-0.017
of-0.015
aff-0.014
growth-0.014
Token just
Token ID655
Feature activation-5.98e-02

Pos logit contributions
 afraid+0.060
 sorry+0.060
 worried+0.057
 oh+0.056
 foolish+0.053
Neg logit contributions
irs-0.106
ide-0.102
rones-0.090
aff-0.088
 parties-0.087
Token tiny
Token ID7009
Feature activation-7.53e-03

Pos logit contributions
 sorry+0.083
 afraid+0.081
 oh+0.080
 worried+0.078
 foolish+0.076
Neg logit contributions
irs-0.126
ide-0.119
 parties-0.105
growth-0.105
en-0.104
Token spec
Token ID1020
Feature activation-8.88e-02

Pos logit contributions
 afraid+0.014
 sorry+0.014
 worried+0.013
 okay+0.013
 scared+0.013
Neg logit contributions
irs-0.015
ide-0.014
rones-0.012
 occasions-0.012
aff-0.012
Tokenks
Token ID591
Feature activation-6.39e-03

Pos logit contributions
 sorry+0.174
 afraid+0.170
 worried+0.170
 okay+0.163
 scared+0.162
Neg logit contributions
irs-0.160
ide-0.153
rones-0.132
aff-0.131
growth-0.130
Token in
Token ID287
Feature activation+1.32e-02

Pos logit contributions
 afraid+0.011
 sorry+0.011
 worried+0.010
 scared+0.010
 okay+0.010
Neg logit contributions
irs-0.014
ide-0.013
rones-0.011
 occasions-0.011
growth-0.011
Token the
Token ID262
Feature activation+9.80e-03

Pos logit contributions
irs+0.028
ide+0.027
rones+0.025
 occasions+0.025
aff+0.025
Neg logit contributions
 afraid-0.023
 worried-0.022
 scared-0.022
 sorry-0.022
 very-0.021
Token sky
Token ID6766
Feature activation+1.36e-02

Pos logit contributions
irs+0.021
ide+0.020
growth+0.018
rones+0.018
 occasions+0.018
Neg logit contributions
 afraid-0.016
 sorry-0.015
 worried-0.015
 okay-0.014
 oh-0.014
Token.
Token ID13
Feature activation+2.74e-03

Pos logit contributions
irs+0.029
ide+0.026
rones+0.024
 occasions+0.024
growth+0.023
Neg logit contributions
 afraid-0.024
 worried-0.023
 sorry-0.023
 scared-0.023
 okay-0.022
Tokenoes
Token ID3028
Feature activation+6.09e-03

Pos logit contributions
 afraid+0.048
 worried+0.046
 scared+0.046
 sorry+0.046
 okay+0.045
Neg logit contributions
irs-0.040
 occasions-0.036
rones-0.035
growth-0.034
ide-0.034
Token were
Token ID547
Feature activation-3.46e-02

Pos logit contributions
irs+0.013
ide+0.012
rones+0.012
 occasions+0.012
growth+0.011
Neg logit contributions
 afraid-0.010
 worried-0.010
 scared-0.010
 sorry-0.010
 okay-0.009
Token Tim
Token ID5045
Feature activation+1.20e-02

Pos logit contributions
 sorry+0.035
 afraid+0.033
 worried+0.033
 oh+0.032
 okay+0.030
Neg logit contributions
irs-0.089
ide-0.085
 occasions-0.076
rones-0.076
aff-0.076
Tokenmy
Token ID1820
Feature activation-7.03e-03

Pos logit contributions
irs+0.022
ide+0.021
rones+0.018
 occasions+0.018
of+0.018
Neg logit contributions
 afraid-0.023
 sorry-0.023
 worried-0.022
 scared-0.021
 okay-0.021
Token's
Token ID338
Feature activation-9.22e-03

Pos logit contributions
 sorry+0.011
 afraid+0.011
 worried+0.010
 oh+0.010
 okay+0.010
Neg logit contributions
irs-0.015
ide-0.015
rones-0.013
of-0.013
 occasions-0.013
Token least
Token ID1551
Feature activation-8.71e-02

Pos logit contributions
 afraid+0.016
 sorry+0.015
 worried+0.015
 oh+0.014
 scared+0.014
Neg logit contributions
irs-0.018
ide-0.018
 occasions-0.016
rones-0.015
aff-0.015
Token favorite
Token ID4004
Feature activation+2.39e-02

Pos logit contributions
 sorry+0.075
 worried+0.075
 afraid+0.073
 crank+0.067
 seeker+0.064
Neg logit contributions
irs-0.191
ide-0.185
 occasions-0.170
 parties-0.166
of-0.164
Token bugs
Token ID11316
Feature activation-1.36e-02

Pos logit contributions
irs+0.052
ide+0.048
rones+0.044
 occasions+0.043
growth+0.043
Neg logit contributions
 afraid-0.040
 sorry-0.039
 worried-0.038
 scared-0.037
 okay-0.036
Token because
Token ID780
Feature activation-5.84e-02

Pos logit contributions
 afraid+0.021
 worried+0.020
 sorry+0.020
 scared+0.020
 okay+0.019
Neg logit contributions
irs-0.031
ide-0.029
rones-0.027
 occasions-0.026
growth-0.026
Token they
Token ID484
Feature activation+4.05e-03

Pos logit contributions
 afraid+0.086
 worried+0.084
 sorry+0.083
 scared+0.077
 okay+0.077
Neg logit contributions
irs-0.134
ide-0.127
 occasions-0.113
rones-0.111
growth-0.111
Token always
Token ID1464
Feature activation-2.28e-02

Pos logit contributions
irs+0.009
ide+0.008
rones+0.008
 occasions+0.008
aff+0.008
Neg logit contributions
 afraid-0.007
 worried-0.006
 scared-0.006
 sorry-0.006
 foolish-0.006
Token one
Token ID530
Feature activation-2.11e-02

Pos logit contributions
 afraid+0.080
 sorry+0.071
 worried+0.070
 guilty+0.069
 scared+0.069
Neg logit contributions
ide-0.072
irs-0.071
 parties-0.062
 occasions-0.061
rap-0.061
Token of
Token ID286
Feature activation+1.64e-02

Pos logit contributions
 sorry+0.034
 afraid+0.033
 worried+0.030
 okay+0.029
 gonna+0.029
Neg logit contributions
ide-0.043
of-0.037
 occasions-0.035
irs-0.035
berry-0.034
Token her
Token ID607
Feature activation-4.77e-03

Pos logit contributions
irs+0.034
ide+0.034
rones+0.033
 occasions+0.032
ades+0.032
Neg logit contributions
 very-0.024
 sorry-0.024
 afraid-0.023
 worried-0.023
 oh-0.023
Token dolls
Token ID36062
Feature activation-2.51e-02

Pos logit contributions
 afraid+0.008
 sorry+0.008
 worried+0.008
 scared+0.007
 okay+0.007
Neg logit contributions
irs-0.010
ide-0.009
 occasions-0.009
rones-0.009
aff-0.008
Token was
Token ID373
Feature activation-6.52e-02

Pos logit contributions
 afraid+0.043
 sorry+0.042
 worried+0.041
 scared+0.039
 okay+0.039
Neg logit contributions
irs-0.052
ide-0.049
 occasions-0.044
rones-0.043
of-0.042
Token being
Token ID852
Feature activation-8.67e-02

Pos logit contributions
 sorry+0.059
 oh+0.058
 afraid+0.056
 worried+0.047
 guilty+0.045
Neg logit contributions
ide-0.167
irs-0.160
rones-0.149
 parties-0.146
isk-0.146
Token persistent
Token ID16218
Feature activation-3.22e-03

Pos logit contributions
 afraid+0.071
 sorry+0.070
 worried+0.059
 oh+0.059
 okay+0.057
Neg logit contributions
irs-0.219
ide-0.210
rones-0.199
 parties-0.196
 occasions-0.192
Token and
Token ID290
Feature activation-3.69e-02

Pos logit contributions
 sorry+0.006
 afraid+0.006
 worried+0.005
 oh+0.005
 scared+0.005
Neg logit contributions
ide-0.006
irs-0.006
 occasions-0.006
rones-0.005
 parties-0.005
Token wouldn
Token ID3636
Feature activation-4.80e-02

Pos logit contributions
 afraid+0.048
 sorry+0.046
 okay+0.044
 oh+0.043
 worried+0.043
Neg logit contributions
irs-0.077
ide-0.071
 parties-0.068
 occasions-0.068
isk-0.066
Token't
Token ID470
Feature activation-5.57e-03

Pos logit contributions
 afraid+0.081
 sorry+0.080
 worried+0.079
 scared+0.078
 okay+0.076
Neg logit contributions
irs-0.101
ide-0.096
rones-0.092
 occasions-0.092
aff-0.087
Token stand
Token ID1302
Feature activation-7.75e-03

Pos logit contributions
 afraid+0.010
 worried+0.010
 sorry+0.009
 scared+0.009
 okay+0.009
Neg logit contributions
irs-0.012
rones-0.011
ide-0.011
growth-0.011
aff-0.010
Token.
Token ID13
Feature activation+1.13e-03

Pos logit contributions
irs+0.074
ide+0.068
rones+0.063
 occasions+0.062
growth+0.061
Neg logit contributions
 afraid-0.061
 worried-0.059
 sorry-0.059
 scared-0.057
 okay-0.056
Token<|endoftext|>
Token ID50256
Feature activation+4.79e-02

Pos logit contributions
irs+0.002
ide+0.002
 occasions+0.002
rones+0.002
aff+0.002
Neg logit contributions
 afraid-0.002
 sorry-0.002
 worried-0.002
 scared-0.002
 okay-0.002
Token"
Token ID1
Feature activation+1.76e-02

Pos logit contributions
irs+0.095
ide+0.085
 occasions+0.082
aff+0.077
rones+0.075
Neg logit contributions
 afraid-0.087
 scared-0.085
 worried-0.084
 very-0.083
 sorry-0.083
TokenMom
Token ID29252
Feature activation-2.12e-02

Pos logit contributions
irs+0.034
ide+0.031
 occasions+0.029
rones+0.028
growth+0.028
Neg logit contributions
 afraid-0.033
 worried-0.032
 sorry-0.031
 scared-0.031
 okay-0.031
Tokenmy
Token ID1820
Feature activation-4.26e-02

Pos logit contributions
 afraid+0.039
 worried+0.038
 sorry+0.038
 scared+0.036
 okay+0.036
Neg logit contributions
irs-0.042
ide-0.039
 occasions-0.035
rones-0.035
growth-0.035
Token,
Token ID11
Feature activation-8.57e-02

Pos logit contributions
 afraid+0.060
 sorry+0.055
 worried+0.055
 scared+0.053
 okay+0.051
Neg logit contributions
irs-0.100
ide-0.096
rones-0.088
 occasions-0.087
aff-0.086
Token what
Token ID644
Feature activation-2.89e-02

Pos logit contributions
 afraid+0.149
 worried+0.134
 scared+0.132
 determined+0.130
 sorry+0.127
Neg logit contributions
irs-0.155
ide-0.146
 parties-0.130
of-0.130
rones-0.123
Token's
Token ID338
Feature activation-3.14e-02

Pos logit contributions
 afraid+0.051
 sorry+0.049
 worried+0.049
 oh+0.045
 careless+0.045
Neg logit contributions
ide-0.053
irs-0.053
of-0.046
 occasions-0.045
ate-0.045
Token that
Token ID326
Feature activation-1.80e-02

Pos logit contributions
 afraid+0.055
 worried+0.050
 scared+0.048
 sorry+0.046
 careless+0.045
Neg logit contributions
irs-0.064
ide-0.058
 occasions-0.057
 parties-0.055
ate-0.052
Token?"
Token ID1701
Feature activation+1.74e-02

Pos logit contributions
 afraid+0.034
 worried+0.032
 sorry+0.032
 determined+0.031
 careless+0.030
Neg logit contributions
ide-0.030
irs-0.030
 parties-0.025
 occasions-0.025
ate-0.025
Token asked
Token ID1965
Feature activation+2.26e-02

Pos logit contributions
irs+0.033
ide+0.032
 occasions+0.029
rones+0.029
growth+0.028
Neg logit contributions
 afraid-0.030
 sorry-0.028
 worried-0.028
 foolish-0.027
 okay-0.027
Token she
Token ID673
Feature activation+1.56e-02

Pos logit contributions
 afraid+0.059
 sorry+0.057
 worried+0.056
 scared+0.053
 okay+0.052
Neg logit contributions
irs-0.086
ide-0.081
 occasions-0.074
rones-0.074
growth-0.072
Token did
Token ID750
Feature activation-2.40e-02

Pos logit contributions
irs+0.040
ide+0.037
aff+0.034
rones+0.034
 occasions+0.034
Neg logit contributions
 afraid-0.020
 worried-0.020
 scared-0.020
 very-0.018
 sorry-0.018
Token so
Token ID523
Feature activation-2.39e-02

Pos logit contributions
 afraid+0.031
 worried+0.029
 sorry+0.029
 scared+0.027
 okay+0.026
Neg logit contributions
irs-0.060
ide-0.057
 occasions-0.052
growth-0.051
rones-0.051
Token,
Token ID11
Feature activation-2.88e-02

Pos logit contributions
 afraid+0.027
 worried+0.025
 sorry+0.025
 oh+0.024
 scared+0.023
Neg logit contributions
irs-0.062
ide-0.059
rones-0.053
growth-0.053
 occasions-0.053
Token the
Token ID262
Feature activation+5.56e-03

Pos logit contributions
 afraid+0.049
 sorry+0.047
 worried+0.047
 scared+0.045
 okay+0.044
Neg logit contributions
irs-0.060
ide-0.056
 occasions-0.051
rones-0.051
growth-0.050
Token Stub
Token ID41135
Feature activation-8.54e-02

Pos logit contributions
ide+0.008
irs+0.008
 occasions+0.008
growth+0.007
andy+0.007
Neg logit contributions
 sorry-0.010
 afraid-0.010
 okay-0.010
 worried-0.010
 gonna-0.009
Tokenborn
Token ID6286
Feature activation-1.08e-02

Pos logit contributions
 afraid+0.146
 worried+0.144
 sorry+0.143
 scared+0.137
 okay+0.137
Neg logit contributions
irs-0.170
ide-0.156
of-0.149
aff-0.147
 occasions-0.139
Token Old
Token ID5706
Feature activation-1.32e-02

Pos logit contributions
 afraid+0.020
 sorry+0.020
 worried+0.019
 okay+0.018
 scared+0.018
Neg logit contributions
irs-0.019
ide-0.019
 occasions-0.017
aff-0.017
rones-0.017
Token Woman
Token ID14381
Feature activation-2.65e-02

Pos logit contributions
 afraid+0.027
 worried+0.026
 sorry+0.026
 scared+0.025
 okay+0.024
Neg logit contributions
irs-0.024
ide-0.022
rones-0.020
 occasions-0.020
growth-0.019
Token said
Token ID531
Feature activation-2.74e-02

Pos logit contributions
 afraid+0.037
 sorry+0.036
 worried+0.035
 scared+0.033
 okay+0.033
Neg logit contributions
irs-0.063
ide-0.061
rones-0.055
 occasions-0.055
aff-0.054
Token,
Token ID11
Feature activation-4.43e-02

Pos logit contributions
 afraid+0.042
 worried+0.038
 scared+0.037
 crank+0.036
 careless+0.035
Neg logit contributions
ide-0.055
irs-0.055
 occasions-0.049
of-0.048
aff-0.047
Token.
Token ID13
Feature activation-1.76e-03

Pos logit contributions
 afraid+0.019
 sorry+0.019
 worried+0.019
 scared+0.018
 okay+0.018
Neg logit contributions
irs-0.021
ide-0.020
rones-0.018
 occasions-0.018
aff-0.017
Token<|endoftext|>
Token ID50256
Feature activation+4.59e-02

Pos logit contributions
 afraid+0.003
 sorry+0.003
 worried+0.003
 scared+0.003
 very+0.003
Neg logit contributions
irs-0.004
ide-0.003
 occasions-0.003
rones-0.003
of-0.003
TokenOnce
Token ID7454
Feature activation-2.16e-02

Pos logit contributions
irs+0.108
 occasions+0.096
ide+0.093
aff+0.091
growth+0.085
Neg logit contributions
 afraid-0.071
 scared-0.070
 very-0.070
 worried-0.066
 sorry-0.064
Token upon
Token ID2402
Feature activation+2.34e-02

Pos logit contributions
 afraid+0.042
 worried+0.041
 sorry+0.041
 scared+0.040
 okay+0.039
Neg logit contributions
irs-0.042
 occasions-0.038
ide-0.037
growth-0.037
rones-0.036
Token a
Token ID257
Feature activation-2.62e-02

Pos logit contributions
irs+0.064
ide+0.060
 occasions+0.056
rones+0.056
aff+0.056
Neg logit contributions
 afraid-0.026
 sorry-0.025
 worried-0.024
 scared-0.023
 okay-0.022
Token time
Token ID640
Feature activation-8.53e-02

Pos logit contributions
 afraid+0.042
 sorry+0.041
 worried+0.041
 scared+0.039
 oh+0.037
Neg logit contributions
irs-0.057
ide-0.056
rones-0.050
 occasions-0.049
of-0.048
Token,
Token ID11
Feature activation-6.46e-02

Pos logit contributions
 afraid+0.118
 worried+0.114
 sorry+0.110
 scared+0.105
 oh+0.102
Neg logit contributions
irs-0.201
ide-0.200
rones-0.173
of-0.172
 occasions-0.168
Token there
Token ID612
Feature activation+7.91e-03

Pos logit contributions
 afraid+0.090
 worried+0.087
 sorry+0.086
 guilty+0.081
 careless+0.080
Neg logit contributions
ide-0.144
irs-0.134
rones-0.117
of-0.115
ate-0.113
Token was
Token ID373
Feature activation-5.91e-02

Pos logit contributions
irs+0.020
ide+0.018
 occasions+0.018
aff+0.017
rones+0.017
Neg logit contributions
 afraid-0.010
 worried-0.010
 scared-0.010
 sorry-0.010
 very-0.009
Token a
Token ID257
Feature activation-2.44e-02

Pos logit contributions
 afraid+0.088
 worried+0.085
 sorry+0.084
 scared+0.077
 foolish+0.075
Neg logit contributions
irs-0.135
ide-0.131
rones-0.122
growth-0.118
 parties-0.114
Token little
Token ID1310
Feature activation-3.49e-02

Pos logit contributions
 afraid+0.045
 oh+0.042
 okay+0.042
 sorry+0.041
 worried+0.041
Neg logit contributions
ide-0.038
irs-0.034
aff-0.034
rones-0.031
 parties-0.031
Token.
Token ID13
Feature activation+3.32e-03

Pos logit contributions
 afraid+0.001
 worried+0.001
 sorry+0.001
 scared+0.001
 okay+0.001
Neg logit contributions
irs-0.002
ide-0.002
rones-0.002
 occasions-0.002
aff-0.002
Token<|endoftext|>
Token ID50256
Feature activation+4.80e-02

Pos logit contributions
irs+0.007
ide+0.007
 occasions+0.006
rones+0.006
aff+0.006
Neg logit contributions
 afraid-0.006
 sorry-0.006
 worried-0.006
 scared-0.005
 okay-0.005
Token"
Token ID1
Feature activation+1.87e-02

Pos logit contributions
irs+0.102
ide+0.092
 occasions+0.091
aff+0.084
isk+0.082
Neg logit contributions
 scared-0.079
 afraid-0.078
 very-0.077
 worried-0.075
 sorry-0.073
TokenMom
Token ID29252
Feature activation-2.52e-02

Pos logit contributions
irs+0.038
ide+0.034
 occasions+0.032
rones+0.031
growth+0.031
Neg logit contributions
 afraid-0.033
 worried-0.032
 okay-0.032
 sorry-0.031
 scared-0.031
Tokenmy
Token ID1820
Feature activation-4.28e-02

Pos logit contributions
 afraid+0.047
 worried+0.045
 sorry+0.045
 scared+0.044
 okay+0.043
Neg logit contributions
irs-0.050
ide-0.046
 occasions-0.042
rones-0.042
growth-0.041
Token,
Token ID11
Feature activation-8.49e-02

Pos logit contributions
 afraid+0.061
 sorry+0.057
 worried+0.056
 scared+0.053
 okay+0.052
Neg logit contributions
irs-0.100
ide-0.096
rones-0.088
aff-0.087
 occasions-0.087
Token I
Token ID314
Feature activation-8.29e-03

Pos logit contributions
 afraid+0.147
 worried+0.132
 scared+0.129
 determined+0.128
 sorry+0.128
Neg logit contributions
irs-0.154
ide-0.148
of-0.137
isk-0.126
 parties-0.126
Token want
Token ID765
Feature activation-1.04e-02

Pos logit contributions
 sorry+0.014
 afraid+0.013
 oh+0.013
 foolish+0.012
 worried+0.012
Neg logit contributions
irs-0.015
of-0.014
ide-0.014
rones-0.013
growth-0.013
Token to
Token ID284
Feature activation+4.67e-02

Pos logit contributions
 afraid+0.016
 sorry+0.015
 worried+0.015
 scared+0.015
 okay+0.014
Neg logit contributions
irs-0.024
ide-0.023
 occasions-0.021
rones-0.021
growth-0.020
Token spend
Token ID4341
Feature activation+1.12e-02

Pos logit contributions
irs+0.098
ide+0.093
rones+0.086
aff+0.083
 occasions+0.081
Neg logit contributions
 sorry-0.080
 afraid-0.079
 scared-0.078
 worried-0.077
 okay-0.075
Token all
Token ID477
Feature activation+9.13e-03

Pos logit contributions
irs+0.026
ide+0.025
aff+0.023
rones+0.023
growth+0.022
Neg logit contributions
 afraid-0.017
 sorry-0.016
 worried-0.016
 scared-0.016
 very-0.015
Token.
Token ID13
Feature activation+6.05e-03

Pos logit contributions
 afraid+0.055
 sorry+0.055
 worried+0.053
 scared+0.051
 foolish+0.049
Neg logit contributions
irs-0.065
ide-0.063
 occasions-0.057
rones-0.054
of-0.054
Token<|endoftext|>
Token ID50256
Feature activation+4.52e-02

Pos logit contributions
irs+0.014
ide+0.013
 occasions+0.013
aff+0.012
rap+0.012
Neg logit contributions
 afraid-0.009
 sorry-0.009
 worried-0.008
 very-0.008
 scared-0.008
Token"
Token ID1
Feature activation+1.64e-02

Pos logit contributions
irs+0.090
ide+0.079
 occasions+0.078
aff+0.072
rones+0.070
Neg logit contributions
 afraid-0.082
 scared-0.080
 worried-0.079
 very-0.079
 sorry-0.077
TokenMom
Token ID29252
Feature activation-2.35e-02

Pos logit contributions
irs+0.033
ide+0.029
 occasions+0.027
rones+0.027
growth+0.026
Neg logit contributions
 afraid-0.030
 worried-0.029
 sorry-0.029
 scared-0.028
 okay-0.028
Tokenmy
Token ID1820
Feature activation-4.39e-02

Pos logit contributions
 afraid+0.040
 worried+0.039
 sorry+0.039
 scared+0.037
 okay+0.037
Neg logit contributions
irs-0.050
ide-0.046
 occasions-0.042
growth-0.042
rones-0.042
Token,
Token ID11
Feature activation-8.47e-02

Pos logit contributions
 afraid+0.059
 sorry+0.055
 worried+0.055
 scared+0.053
 okay+0.051
Neg logit contributions
irs-0.106
ide-0.102
rones-0.094
 occasions-0.093
aff-0.092
Token can
Token ID460
Feature activation-2.18e-02

Pos logit contributions
 afraid+0.144
 worried+0.132
 scared+0.129
 sorry+0.125
 determined+0.124
Neg logit contributions
irs-0.158
ide-0.148
 parties-0.137
of-0.131
 occasions-0.128
Token I
Token ID314
Feature activation+6.73e-03

Pos logit contributions
 afraid+0.026
 sorry+0.025
 worried+0.025
 scared+0.023
 okay+0.023
Neg logit contributions
irs-0.058
ide-0.055
rones-0.051
 occasions-0.050
aff-0.050
Token buy
Token ID2822
Feature activation+2.86e-02

Pos logit contributions
irs+0.013
ide+0.012
rones+0.011
 occasions+0.011
aff+0.011
Neg logit contributions
 afraid-0.013
 sorry-0.012
 worried-0.012
 scared-0.012
 okay-0.012
Token a
Token ID257
Feature activation+1.89e-02

Pos logit contributions
irs+0.069
ide+0.066
rones+0.060
 occasions+0.059
growth+0.059
Neg logit contributions
 afraid-0.040
 sorry-0.039
 worried-0.038
 scared-0.037
 foolish-0.036
Token modern
Token ID3660
Feature activation+1.04e-02

Pos logit contributions
irs+0.033
ide+0.032
 occasions+0.029
of+0.028
aff+0.028
Neg logit contributions
 afraid-0.037
 sorry-0.035
 worried-0.035
 scared-0.034
 okay-0.033
Token dirty
Token ID11841
Feature activation-1.16e-02

Pos logit contributions
 oh+0.071
 sorry+0.064
 afraid+0.062
ses+0.059
sed+0.058
Neg logit contributions
irs-0.124
ide-0.112
 parties-0.103
rones-0.103
ate-0.101
Token and
Token ID290
Feature activation-4.48e-02

Pos logit contributions
 afraid+0.022
 sorry+0.022
 worried+0.021
 oh+0.021
 determined+0.020
Neg logit contributions
ide-0.021
irs-0.020
 parties-0.017
en-0.017
 occasions-0.016
Token the
Token ID262
Feature activation+1.18e-02

Pos logit contributions
 afraid+0.053
 sorry+0.051
 okay+0.049
 oh+0.049
 worried+0.047
Neg logit contributions
irs-0.106
ide-0.100
 occasions-0.090
rones-0.089
 parties-0.087
Token trees
Token ID7150
Feature activation-8.58e-03

Pos logit contributions
irs+0.018
ide+0.017
 occasions+0.015
growth+0.015
rones+0.015
Neg logit contributions
 afraid-0.025
 sorry-0.025
 worried-0.025
 oh-0.024
 okay-0.024
Token were
Token ID547
Feature activation-4.73e-02

Pos logit contributions
 afraid+0.017
 sorry+0.016
 worried+0.016
 okay+0.015
 oh+0.015
Neg logit contributions
irs-0.015
ide-0.015
 occasions-0.012
rones-0.012
growth-0.012
Token being
Token ID852
Feature activation-8.45e-02

Pos logit contributions
 afraid+0.056
 sorry+0.055
 oh+0.055
 okay+0.050
 worried+0.049
Neg logit contributions
ide-0.108
irs-0.108
rones-0.093
 parties-0.091
 occasions-0.090
Token cut
Token ID2005
Feature activation+3.19e-02

Pos logit contributions
 sorry+0.089
 afraid+0.089
 oh+0.085
 okay+0.078
 worried+0.078
Neg logit contributions
irs-0.194
ide-0.190
rones-0.170
 parties-0.167
 occasions-0.164
Token down
Token ID866
Feature activation+2.05e-02

Pos logit contributions
irs+0.062
ide+0.061
rones+0.052
 occasions+0.052
aff+0.051
Neg logit contributions
 afraid-0.058
 sorry-0.057
 worried-0.056
 scared-0.053
 oh-0.052
Token.
Token ID13
Feature activation-7.27e-03

Pos logit contributions
irs+0.041
ide+0.039
rones+0.035
 occasions+0.035
growth+0.034
Neg logit contributions
 afraid-0.037
 sorry-0.036
 worried-0.035
 scared-0.034
 okay-0.033
Token They
Token ID1119
Feature activation+1.09e-02

Pos logit contributions
 afraid+0.013
 worried+0.012
 sorry+0.012
 scared+0.012
 okay+0.012
Neg logit contributions
irs-0.015
ide-0.014
 occasions-0.013
rones-0.012
growth-0.012
Token knew
Token ID2993
Feature activation-2.51e-02

Pos logit contributions
irs+0.023
ide+0.022
rones+0.019
of+0.019
growth+0.019
Neg logit contributions
 sorry-0.017
 afraid-0.017
 okay-0.016
 oh-0.016
 worried-0.015
Token\n
Token ID198
Feature activation+5.31e-02

Pos logit contributions
 afraid+0.024
 sorry+0.024
 worried+0.023
 scared+0.022
 okay+0.022
Neg logit contributions
irs-0.024
ide-0.023
rones-0.020
 occasions-0.020
aff-0.020
Token\n
Token ID198
Feature activation+4.48e-02

Pos logit contributions
irs+0.116
ide+0.109
 occasions+0.106
aff+0.102
growth+0.101
Neg logit contributions
 afraid-0.086
 worried-0.081
 sorry-0.081
 scared-0.080
 very-0.079
Token"
Token ID1
Feature activation-3.44e-03

Pos logit contributions
irs+0.081
ide+0.075
rones+0.069
aff+0.069
 occasions+0.067
Neg logit contributions
 afraid-0.090
 worried-0.086
 sorry-0.086
 scared-0.083
 okay-0.082
TokenMom
Token ID29252
Feature activation-4.14e-02

Pos logit contributions
 afraid+0.006
 worried+0.006
 sorry+0.006
 scared+0.006
 okay+0.006
Neg logit contributions
irs-0.007
ide-0.007
rones-0.006
aff-0.006
 occasions-0.006
Token,
Token ID11
Feature activation-6.90e-02

Pos logit contributions
 afraid+0.063
 sorry+0.060
 worried+0.057
 oh+0.056
 foolish+0.055
Neg logit contributions
irs-0.088
ide-0.087
rones-0.079
of-0.079
aff-0.077
Token when
Token ID618
Feature activation-8.45e-02

Pos logit contributions
 afraid+0.122
 worried+0.114
 scared+0.111
 determined+0.109
 sorry+0.105
Neg logit contributions
irs-0.127
ide-0.117
 parties-0.112
 occasions-0.110
of-0.107
Token can
Token ID460
Feature activation-1.88e-02

Pos logit contributions
 afraid+0.138
 worried+0.131
 sorry+0.123
 scared+0.122
 determined+0.116
Neg logit contributions
irs-0.173
ide-0.159
 occasions-0.155
 parties-0.147
of-0.143
Token we
Token ID356
Feature activation-3.67e-03

Pos logit contributions
 sorry+0.025
 scared+0.024
 worried+0.023
 afraid+0.022
 okay+0.022
Neg logit contributions
irs-0.049
ide-0.045
aff-0.045
rones-0.044
 occasions-0.043
Token go
Token ID467
Feature activation-2.45e-02

Pos logit contributions
 sorry+0.006
 okay+0.006
 scared+0.006
 worried+0.006
 very+0.006
Neg logit contributions
irs-0.008
ide-0.007
rones-0.007
 occasions-0.007
growth-0.007
Token outside
Token ID2354
Feature activation+1.16e-02

Pos logit contributions
 afraid+0.036
 sorry+0.035
 worried+0.035
 scared+0.033
 okay+0.032
Neg logit contributions
irs-0.058
ide-0.055
rones-0.050
 occasions-0.050
growth-0.049
Token?"
Token ID1701
Feature activation+1.64e-02

Pos logit contributions
irs+0.029
ide+0.028
rones+0.026
 occasions+0.026
growth+0.025
Neg logit contributions
 afraid-0.015
 sorry-0.014
 worried-0.014
 scared-0.014
 okay-0.013
Token.
Token ID13
Feature activation-2.17e-03

Pos logit contributions
 afraid+0.079
 sorry+0.076
 worried+0.076
 scared+0.072
 oh+0.070
Neg logit contributions
irs-0.094
ide-0.092
 occasions-0.081
of-0.080
rones-0.079
Token<|endoftext|>
Token ID50256
Feature activation+4.44e-02

Pos logit contributions
 afraid+0.004
 sorry+0.004
 worried+0.004
 scared+0.004
 okay+0.003
Neg logit contributions
irs-0.005
ide-0.004
 occasions-0.004
rones-0.004
growth-0.004
Token"
Token ID1
Feature activation+1.32e-02

Pos logit contributions
irs+0.095
ide+0.085
 occasions+0.084
aff+0.079
 outfits+0.075
Neg logit contributions
 afraid-0.073
 scared-0.073
 very-0.070
 worried-0.070
 sorry-0.066
TokenMom
Token ID29252
Feature activation-2.64e-02

Pos logit contributions
irs+0.027
ide+0.024
 occasions+0.022
rones+0.021
growth+0.021
Neg logit contributions
 afraid-0.024
 worried-0.023
 okay-0.023
 scared-0.023
 sorry-0.023
Tokenmy
Token ID1820
Feature activation-4.09e-02

Pos logit contributions
 afraid+0.050
 worried+0.048
 sorry+0.048
 scared+0.046
 okay+0.045
Neg logit contributions
irs-0.052
ide-0.048
 occasions-0.043
rones-0.043
growth-0.042
Token,
Token ID11
Feature activation-8.42e-02

Pos logit contributions
 afraid+0.057
 sorry+0.053
 worried+0.052
 scared+0.049
 okay+0.049
Neg logit contributions
irs-0.095
ide-0.092
rones-0.084
aff-0.083
 occasions-0.083
Token can
Token ID460
Feature activation-3.47e-02

Pos logit contributions
 afraid+0.148
 worried+0.133
 determined+0.131
 sorry+0.130
 scared+0.130
Neg logit contributions
ide-0.142
irs-0.142
of-0.125
 parties-0.117
isk-0.113
Token I
Token ID314
Feature activation-2.92e-03

Pos logit contributions
 afraid+0.041
 sorry+0.040
 worried+0.039
 scared+0.038
 okay+0.037
Neg logit contributions
irs-0.092
ide-0.087
rones-0.082
 occasions-0.081
aff-0.080
Token grab
Token ID5552
Feature activation+2.10e-02

Pos logit contributions
 afraid+0.006
 sorry+0.005
 worried+0.005
 scared+0.005
 foolish+0.005
Neg logit contributions
irs-0.006
ide-0.005
 occasions-0.005
of-0.005
rones-0.004
Token the
Token ID262
Feature activation+1.78e-02

Pos logit contributions
irs+0.045
ide+0.043
rones+0.039
 occasions+0.038
aff+0.038
Neg logit contributions
 afraid-0.035
 sorry-0.034
 worried-0.034
 scared-0.033
 foolish-0.032
Token screen
Token ID3159
Feature activation-2.00e-02

Pos logit contributions
irs+0.030
of+0.028
ide+0.028
 occasions+0.028
aff+0.026
Neg logit contributions
 afraid-0.034
 sorry-0.034
 worried-0.032
 okay-0.030
 scared-0.030
Token light
Token ID1657
Feature activation-2.15e-03

Pos logit contributions
 afraid+0.028
 sorry+0.028
 worried+0.027
 oh+0.026
 foolish+0.026
Neg logit contributions
irs-0.028
ide-0.027
 occasions-0.024
rones-0.024
growth-0.023
Token.
Token ID13
Feature activation-3.69e-03

Pos logit contributions
 afraid+0.004
 sorry+0.004
 worried+0.004
 scared+0.004
 okay+0.004
Neg logit contributions
irs-0.004
ide-0.004
 occasions-0.004
rones-0.004
growth-0.003
Token "
Token ID366
Feature activation-6.69e-03

Pos logit contributions
 afraid+0.008
 sorry+0.008
 worried+0.008
 scared+0.007
 okay+0.007
Neg logit contributions
irs-0.007
ide-0.006
 occasions-0.005
rones-0.005
growth-0.005
TokenMom
Token ID29252
Feature activation-3.01e-02

Pos logit contributions
 afraid+0.014
 sorry+0.014
 worried+0.014
 scared+0.013
 foolish+0.013
Neg logit contributions
irs-0.011
ide-0.010
 occasions-0.009
rones-0.009
growth-0.009
Tokenmy
Token ID1820
Feature activation-5.26e-02

Pos logit contributions
 afraid+0.060
 sorry+0.058
 worried+0.057
 scared+0.056
 okay+0.054
Neg logit contributions
irs-0.055
ide-0.052
rones-0.046
 occasions-0.045
aff-0.044
Token,
Token ID11
Feature activation-8.41e-02

Pos logit contributions
 afraid+0.081
 sorry+0.076
 worried+0.073
 oh+0.072
 determined+0.071
Neg logit contributions
ide-0.104
irs-0.101
of-0.097
aff-0.090
rones-0.089
Token can
Token ID460
Feature activation-2.67e-02

Pos logit contributions
 afraid+0.141
 worried+0.130
 scared+0.124
 determined+0.124
 sorry+0.123
Neg logit contributions
irs-0.152
ide-0.149
of-0.143
 parties-0.139
 occasions-0.137
Token you
Token ID345
Feature activation-1.36e-02

Pos logit contributions
 sorry+0.029
 afraid+0.028
 worried+0.027
 scared+0.026
 okay+0.026
Neg logit contributions
irs-0.075
ide-0.072
rones-0.068
aff-0.068
 occasions-0.066
Token bring
Token ID2222
Feature activation+1.23e-02

Pos logit contributions
 afraid+0.025
 sorry+0.024
 worried+0.024
 scared+0.023
 okay+0.023
Neg logit contributions
irs-0.028
ide-0.026
rones-0.023
 occasions-0.023
aff-0.023
Token me
Token ID502
Feature activation+2.03e-03

Pos logit contributions
irs+0.026
ide+0.025
rones+0.022
aff+0.022
 occasions+0.022
Neg logit contributions
 afraid-0.021
 sorry-0.021
 worried-0.020
 scared-0.020
 okay-0.019
Token a
Token ID257
Feature activation+8.61e-03

Pos logit contributions
irs+0.005
ide+0.005
 occasions+0.004
rones+0.004
aff+0.004
Neg logit contributions
 afraid-0.003
 sorry-0.003
 worried-0.003
 scared-0.002
 okay-0.002
Token.
Token ID13
Feature activation-1.16e-04

Pos logit contributions
irs+0.011
ide+0.010
rones+0.010
 occasions+0.009
aff+0.009
Neg logit contributions
 afraid-0.008
 worried-0.008
 sorry-0.007
 scared-0.007
 okay-0.007
Token<|endoftext|>
Token ID50256
Feature activation+4.34e-02

Pos logit contributions
 afraid+0.000
 sorry+0.000
 worried+0.000
 scared+0.000
 very+0.000
Neg logit contributions
irs-0.000
ide-0.000
 occasions-0.000
rones-0.000
aff-0.000
TokenOnce
Token ID7454
Feature activation-2.26e-02

Pos logit contributions
irs+0.102
 occasions+0.090
ide+0.088
aff+0.085
 outfits+0.081
Neg logit contributions
 afraid-0.066
 scared-0.065
 very-0.065
 worried-0.062
 sorry-0.060
Token upon
Token ID2402
Feature activation+1.86e-02

Pos logit contributions
 afraid+0.044
 worried+0.043
 sorry+0.042
 scared+0.042
 sad+0.041
Neg logit contributions
irs-0.043
 occasions-0.041
growth-0.039
ide-0.038
rones-0.037
Token a
Token ID257
Feature activation-2.49e-02

Pos logit contributions
irs+0.052
ide+0.049
 occasions+0.046
rones+0.045
aff+0.045
Neg logit contributions
 afraid-0.020
 sorry-0.019
 worried-0.019
 scared-0.018
 okay-0.017
Token time
Token ID640
Feature activation-8.41e-02

Pos logit contributions
 afraid+0.038
 sorry+0.037
 worried+0.036
 scared+0.035
 okay+0.033
Neg logit contributions
irs-0.057
ide-0.055
rones-0.049
 occasions-0.049
growth-0.048
Token,
Token ID11
Feature activation-6.49e-02

Pos logit contributions
 afraid+0.118
 worried+0.113
 sorry+0.110
 scared+0.104
 careless+0.101
Neg logit contributions
irs-0.198
ide-0.197
of-0.170
rones-0.169
 occasions-0.166
Token there
Token ID612
Feature activation+9.64e-03

Pos logit contributions
 afraid+0.091
 worried+0.089
 sorry+0.089
 guilty+0.083
 careless+0.081
Neg logit contributions
ide-0.144
irs-0.136
of-0.118
rones-0.117
growth-0.115
Token was
Token ID373
Feature activation-5.77e-02

Pos logit contributions
irs+0.024
ide+0.022
 occasions+0.021
rones+0.021
aff+0.021
Neg logit contributions
 afraid-0.013
 worried-0.012
 sorry-0.012
 scared-0.012
 okay-0.011
Token a
Token ID257
Feature activation-2.30e-02

Pos logit contributions
 afraid+0.082
 worried+0.079
 sorry+0.079
 scared+0.073
 foolish+0.070
Neg logit contributions
irs-0.137
ide-0.132
rones-0.123
growth-0.120
 occasions-0.116
Token team
Token ID1074
Feature activation-1.84e-02

Pos logit contributions
 afraid+0.043
 okay+0.040
 sorry+0.040
 oh+0.039
 worried+0.039
Neg logit contributions
ide-0.036
irs-0.033
aff-0.032
rones-0.030
of-0.030
Token.
Token ID13
Feature activation-4.95e-03

Pos logit contributions
 afraid+0.027
 sorry+0.027
 worried+0.026
 scared+0.025
 okay+0.025
Neg logit contributions
irs-0.027
ide-0.026
 occasions-0.023
rones-0.023
aff-0.022
Token<|endoftext|>
Token ID50256
Feature activation+4.57e-02

Pos logit contributions
 afraid+0.010
 sorry+0.009
 worried+0.009
 scared+0.009
 okay+0.009
Neg logit contributions
irs-0.009
ide-0.009
 occasions-0.008
rones-0.008
aff-0.007
TokenOnce
Token ID7454
Feature activation-2.23e-02

Pos logit contributions
irs+0.106
 occasions+0.095
ide+0.093
aff+0.089
 outfits+0.084
Neg logit contributions
 afraid-0.071
 scared-0.070
 very-0.068
 worried-0.067
 sorry-0.064
Token upon
Token ID2402
Feature activation+2.10e-02

Pos logit contributions
 afraid+0.044
 worried+0.043
 sorry+0.042
 scared+0.042
 okay+0.040
Neg logit contributions
irs-0.042
 occasions-0.038
ide-0.037
growth-0.037
rones-0.036
Token a
Token ID257
Feature activation-2.46e-02

Pos logit contributions
irs+0.057
ide+0.054
 occasions+0.050
rones+0.050
aff+0.049
Neg logit contributions
 afraid-0.024
 sorry-0.023
 worried-0.023
 scared-0.021
 okay-0.021
Token time
Token ID640
Feature activation-8.41e-02

Pos logit contributions
 afraid+0.042
 sorry+0.041
 worried+0.041
 scared+0.039
 oh+0.038
Neg logit contributions
ide-0.051
irs-0.050
rones-0.044
of-0.043
 occasions-0.043
Token,
Token ID11
Feature activation-6.29e-02

Pos logit contributions
 afraid+0.123
 worried+0.119
 sorry+0.114
 scared+0.109
 oh+0.107
Neg logit contributions
ide-0.191
irs-0.191
of-0.163
rones-0.162
aff-0.157
Token there
Token ID612
Feature activation+9.02e-03

Pos logit contributions
 afraid+0.091
 worried+0.088
 sorry+0.087
 guilty+0.083
 determined+0.083
Neg logit contributions
ide-0.135
irs-0.127
rones-0.108
ate-0.106
of-0.105
Token was
Token ID373
Feature activation-5.73e-02

Pos logit contributions
irs+0.023
ide+0.021
 occasions+0.020
rones+0.020
aff+0.019
Neg logit contributions
 afraid-0.012
 worried-0.012
 sorry-0.012
 scared-0.011
 okay-0.011
Token a
Token ID257
Feature activation-2.51e-02

Pos logit contributions
 afraid+0.087
 sorry+0.084
 worried+0.084
 scared+0.076
 foolish+0.075
Neg logit contributions
irs-0.128
ide-0.124
rones-0.116
growth-0.112
 parties-0.108
Token little
Token ID1310
Feature activation-3.48e-02

Pos logit contributions
 afraid+0.045
 okay+0.042
 oh+0.042
 sorry+0.041
 worried+0.041
Neg logit contributions
ide-0.040
irs-0.036
aff-0.036
rones-0.033
 parties-0.033
Token her
Token ID607
Feature activation-1.16e-02

Pos logit contributions
ide+0.011
irs+0.010
 occasions+0.009
aff+0.009
rones+0.009
Neg logit contributions
 sorry-0.012
 afraid-0.012
 worried-0.012
 okay-0.011
 scared-0.011
Token.
Token ID13
Feature activation-1.21e-02

Pos logit contributions
 afraid+0.020
 sorry+0.019
 worried+0.019
 scared+0.018
 foolish+0.017
Neg logit contributions
irs-0.023
ide-0.023
 occasions-0.020
rones-0.020
growth-0.020
Token "
Token ID366
Feature activation-9.12e-03

Pos logit contributions
 afraid+0.021
 sorry+0.021
 worried+0.021
 scared+0.020
 okay+0.020
Neg logit contributions
irs-0.027
ide-0.024
 occasions-0.022
growth-0.022
rones-0.022
TokenMom
Token ID29252
Feature activation-4.23e-02

Pos logit contributions
 sorry+0.017
 afraid+0.017
 worried+0.017
 scared+0.016
 okay+0.016
Neg logit contributions
irs-0.017
ide-0.016
growth-0.014
rones-0.014
 occasions-0.014
Tokenmy
Token ID1820
Feature activation-5.49e-02

Pos logit contributions
 afraid+0.088
 sorry+0.086
 worried+0.083
 scared+0.080
 oh+0.079
Neg logit contributions
irs-0.069
ide-0.068
rones-0.058
of-0.058
aff-0.057
Token,
Token ID11
Feature activation-8.39e-02

Pos logit contributions
 afraid+0.088
 sorry+0.082
 worried+0.079
 determined+0.077
 oh+0.077
Neg logit contributions
ide-0.103
irs-0.099
of-0.098
aff-0.090
rap-0.088
Token can
Token ID460
Feature activation-3.27e-02

Pos logit contributions
 afraid+0.140
 worried+0.129
 scared+0.124
 determined+0.123
 sorry+0.122
Neg logit contributions
irs-0.153
of-0.146
ide-0.145
 parties-0.134
 occasions-0.133
Token you
Token ID345
Feature activation-1.46e-02

Pos logit contributions
 afraid+0.040
 sorry+0.040
 worried+0.039
 scared+0.037
 okay+0.036
Neg logit contributions
irs-0.085
ide-0.082
rones-0.076
 occasions-0.075
aff-0.075
Token help
Token ID1037
Feature activation+1.44e-02

Pos logit contributions
 afraid+0.026
 sorry+0.026
 worried+0.026
 scared+0.025
 okay+0.025
Neg logit contributions
irs-0.030
ide-0.028
rones-0.026
 occasions-0.025
aff-0.025
Token me
Token ID502
Feature activation+1.37e-03

Pos logit contributions
irs+0.033
ide+0.030
rones+0.030
growth+0.029
aff+0.028
Neg logit contributions
 worried-0.022
 afraid-0.022
 scared-0.022
 very-0.021
 foolish-0.021
Token lift
Token ID10303
Feature activation+3.12e-02

Pos logit contributions
irs+0.003
ide+0.003
rones+0.003
 occasions+0.003
aff+0.003
Neg logit contributions
 afraid-0.002
 worried-0.002
 sorry-0.002
 scared-0.002
 okay-0.002
Token loud
Token ID7812
Feature activation-1.43e-02

Pos logit contributions
 sorry+0.078
 afraid+0.074
 oh+0.073
 okay+0.069
 worried+0.068
Neg logit contributions
irs-0.136
ide-0.135
rones-0.117
of-0.116
aff-0.115
Token.
Token ID13
Feature activation-2.09e-02

Pos logit contributions
 afraid+0.031
 sorry+0.031
 worried+0.030
 careless+0.029
 foolish+0.029
Neg logit contributions
ide-0.022
irs-0.021
 occasions-0.018
 parties-0.017
aff-0.017
Token "
Token ID366
Feature activation-7.76e-03

Pos logit contributions
 afraid+0.037
 sorry+0.036
 worried+0.036
 scared+0.035
 okay+0.034
Neg logit contributions
irs-0.043
ide-0.041
 occasions-0.037
rones-0.036
growth-0.036
TokenMom
Token ID29252
Feature activation-3.31e-02

Pos logit contributions
 sorry+0.016
 afraid+0.016
 worried+0.015
 scared+0.015
 foolish+0.014
Neg logit contributions
irs-0.014
ide-0.013
 occasions-0.012
rones-0.011
growth-0.011
Tokenmy
Token ID1820
Feature activation-5.17e-02

Pos logit contributions
 afraid+0.069
 sorry+0.068
 worried+0.066
 scared+0.064
 foolish+0.063
Neg logit contributions
irs-0.055
ide-0.054
rones-0.046
 occasions-0.045
of-0.045
Token,
Token ID11
Feature activation-8.39e-02

Pos logit contributions
 afraid+0.083
 sorry+0.078
 worried+0.074
 determined+0.074
 oh+0.073
Neg logit contributions
ide-0.097
irs-0.093
of-0.091
aff-0.084
rap-0.082
Token can
Token ID460
Feature activation-3.29e-02

Pos logit contributions
 afraid+0.151
 worried+0.139
 determined+0.137
 scared+0.135
 guilty+0.130
Neg logit contributions
irs-0.140
ide-0.135
 parties-0.133
of-0.130
 occasions-0.122
Token you
Token ID345
Feature activation-1.75e-02

Pos logit contributions
 afraid+0.042
 sorry+0.041
 worried+0.040
 scared+0.039
 okay+0.037
Neg logit contributions
irs-0.084
ide-0.080
rones-0.075
 occasions-0.074
aff-0.073
Token turn
Token ID1210
Feature activation-2.59e-02

Pos logit contributions
 afraid+0.032
 sorry+0.031
 worried+0.031
 scared+0.030
 okay+0.029
Neg logit contributions
irs-0.036
ide-0.033
rones-0.030
 occasions-0.030
aff-0.029
Token down
Token ID866
Feature activation+6.79e-03

Pos logit contributions
 afraid+0.042
 sorry+0.041
 worried+0.039
 okay+0.038
 oh+0.038
Neg logit contributions
irs-0.056
ide-0.050
of-0.045
 parties-0.045
 occasions-0.045
Token the
Token ID262
Feature activation+9.39e-04

Pos logit contributions
irs+0.015
ide+0.014
rones+0.012
 occasions+0.012
aff+0.012
Neg logit contributions
 afraid-0.011
 sorry-0.011
 worried-0.011
 scared-0.010
 okay-0.010
Token.
Token ID13
Feature activation+7.65e-03

Pos logit contributions
irs+0.027
ide+0.026
rones+0.023
 occasions+0.023
aff+0.023
Neg logit contributions
 afraid-0.019
 sorry-0.018
 worried-0.018
 scared-0.017
 okay-0.017
Token<|endoftext|>
Token ID50256
Feature activation+4.65e-02

Pos logit contributions
irs+0.016
ide+0.016
 occasions+0.015
rones+0.014
 parties+0.014
Neg logit contributions
 afraid-0.012
 sorry-0.012
 worried-0.011
 okay-0.011
 scared-0.011
Token"
Token ID1
Feature activation+1.77e-02

Pos logit contributions
irs+0.100
ide+0.089
 occasions+0.087
aff+0.083
rones+0.079
Neg logit contributions
 afraid-0.077
 scared-0.076
 very-0.075
 worried-0.073
 sorry-0.070
TokenMom
Token ID29252
Feature activation-2.24e-02

Pos logit contributions
irs+0.035
ide+0.031
 occasions+0.029
rones+0.028
growth+0.028
Neg logit contributions
 afraid-0.033
 worried-0.032
 okay-0.031
 sorry-0.031
 scared-0.031
Tokenmy
Token ID1820
Feature activation-4.07e-02

Pos logit contributions
 afraid+0.040
 worried+0.039
 sorry+0.039
 scared+0.038
 okay+0.037
Neg logit contributions
irs-0.046
ide-0.042
 occasions-0.038
growth-0.038
rones-0.038
Token,
Token ID11
Feature activation-8.39e-02

Pos logit contributions
 afraid+0.055
 sorry+0.051
 worried+0.051
 scared+0.048
 okay+0.047
Neg logit contributions
irs-0.098
ide-0.094
rones-0.087
 occasions-0.086
aff-0.085
Token can
Token ID460
Feature activation-2.99e-02

Pos logit contributions
 afraid+0.145
 worried+0.132
 scared+0.128
 sorry+0.125
 determined+0.125
Neg logit contributions
irs-0.146
ide-0.142
 parties-0.126
of-0.125
isk-0.114
Token I
Token ID314
Feature activation+4.77e-03

Pos logit contributions
 afraid+0.038
 sorry+0.036
 worried+0.036
 scared+0.034
 okay+0.033
Neg logit contributions
irs-0.077
ide-0.073
rones-0.067
 occasions-0.067
growth-0.066
Token dress
Token ID6576
Feature activation-4.67e-03

Pos logit contributions
irs+0.009
ide+0.009
rones+0.008
 occasions+0.008
aff+0.007
Neg logit contributions
 afraid-0.009
 sorry-0.009
 worried-0.009
 scared-0.008
 okay-0.008
Token up
Token ID510
Feature activation+9.62e-03

Pos logit contributions
 afraid+0.006
 sorry+0.006
 worried+0.006
 scared+0.005
 okay+0.005
Neg logit contributions
irs-0.012
ide-0.011
rones-0.010
 occasions-0.010
growth-0.010
Token?"
Token ID1701
Feature activation+1.51e-02

Pos logit contributions
irs+0.023
ide+0.022
 occasions+0.020
rones+0.020
aff+0.019
Neg logit contributions
 afraid-0.014
 sorry-0.013
 worried-0.013
 scared-0.012
 okay-0.012
Token for
Token ID329
Feature activation+9.45e-03

Pos logit contributions
 afraid+0.049
 sorry+0.047
 worried+0.047
 scared+0.045
 okay+0.043
Neg logit contributions
irs-0.077
ide-0.073
rones-0.067
 occasions-0.067
growth-0.066
Token the
Token ID262
Feature activation+9.29e-03

Pos logit contributions
irs+0.020
ide+0.019
rones+0.017
aff+0.017
growth+0.017
Neg logit contributions
 afraid-0.016
 sorry-0.016
 worried-0.016
 scared-0.015
 okay-0.015
Token camera
Token ID4676
Feature activation-1.34e-02

Pos logit contributions
irs+0.016
ide+0.015
 occasions+0.014
aff+0.013
of+0.013
Neg logit contributions
 afraid-0.019
 worried-0.018
 sorry-0.017
 scared-0.017
 okay-0.016
Token!"
Token ID2474
Feature activation+3.89e-02

Pos logit contributions
 afraid+0.021
 sorry+0.020
 worried+0.020
 scared+0.020
 okay+0.019
Neg logit contributions
irs-0.030
ide-0.028
rones-0.026
 occasions-0.026
growth-0.025
Token\n
Token ID198
Feature activation+7.35e-02

Pos logit contributions
irs+0.074
ide+0.070
 occasions+0.063
rones+0.061
growth+0.059
Neg logit contributions
 afraid-0.076
 sorry-0.075
 worried-0.074
 scared-0.072
 okay-0.071
Token\n
Token ID198
Feature activation+6.71e-02

Pos logit contributions
irs+0.181
 occasions+0.180
ide+0.174
growth+0.164
isk+0.159
Neg logit contributions
 sorry-0.094
 very-0.090
 afraid-0.089
 okay-0.089
 worried-0.089
TokenBen
Token ID11696
Feature activation+2.72e-03

Pos logit contributions
irs+0.134
ide+0.124
 occasions+0.116
growth+0.114
aff+0.113
Neg logit contributions
 sorry-0.119
 afraid-0.119
 okay-0.118
 worried-0.118
 careless-0.112
Token smiled
Token ID13541
Feature activation+2.00e-02

Pos logit contributions
irs+0.007
ide+0.006
rones+0.006
 occasions+0.006
aff+0.006
Neg logit contributions
 afraid-0.004
 sorry-0.004
 worried-0.004
 scared-0.003
 okay-0.003
Token and
Token ID290
Feature activation-1.08e-02

Pos logit contributions
ide+0.041
irs+0.041
rones+0.038
growth+0.036
aff+0.036
Neg logit contributions
 afraid-0.030
 sorry-0.029
 oh-0.028
 worried-0.028
 foolish-0.027
Token said
Token ID531
Feature activation-8.83e-03

Pos logit contributions
 afraid+0.017
 sorry+0.016
 worried+0.016
 okay+0.016
 foolish+0.015
Neg logit contributions
irs-0.021
ide-0.019
of-0.019
rones-0.019
aff-0.018
Token,
Token ID11
Feature activation-2.98e-02

Pos logit contributions
 afraid+0.012
 worried+0.011
 scared+0.011
 sorry+0.010
 careless+0.010
Neg logit contributions
irs-0.021
ide-0.020
of-0.018
 occasions-0.018
aff-0.018
Token so
Token ID523
Feature activation-4.76e-02

Pos logit contributions
 afraid+0.038
 oh+0.036
 sorry+0.035
 worried+0.032
 foolish+0.032
Neg logit contributions
irs-0.137
ide-0.136
rones-0.125
 parties-0.119
growth-0.119
Token happy
Token ID3772
Feature activation+8.94e-03

Pos logit contributions
 afraid+0.048
 oh+0.045
 sorry+0.044
 worried+0.044
 okay+0.041
Neg logit contributions
irs-0.125
ide-0.121
rones-0.110
of-0.107
aff-0.107
Token and
Token ID290
Feature activation-2.30e-02

Pos logit contributions
irs+0.020
ide+0.020
 occasions+0.018
rones+0.018
aff+0.017
Neg logit contributions
 afraid-0.013
 sorry-0.013
 worried-0.013
 okay-0.012
 scared-0.012
Token thankful
Token ID25535
Feature activation+5.78e-03

Pos logit contributions
 afraid+0.040
 sorry+0.038
 worried+0.038
 okay+0.038
 oh+0.036
Neg logit contributions
irs-0.040
ide-0.038
of-0.034
rones-0.034
 parties-0.034
Token.
Token ID13
Feature activation+6.82e-03

Pos logit contributions
irs+0.012
ide+0.012
 occasions+0.011
aff+0.011
of+0.011
Neg logit contributions
 afraid-0.009
 worried-0.008
 sorry-0.008
 scared-0.008
 okay-0.007
Token\n
Token ID198
Feature activation+7.03e-02

Pos logit contributions
irs+0.016
ide+0.015
 occasions+0.014
rones+0.013
of+0.013
Neg logit contributions
 sorry-0.011
 afraid-0.011
 worried-0.010
 scared-0.010
 very-0.010
Token\n
Token ID198
Feature activation+6.44e-02

Pos logit contributions
irs+0.168
 occasions+0.163
ide+0.155
rones+0.147
aff+0.146
Neg logit contributions
 very-0.097
 afraid-0.094
 scared-0.092
 sorry-0.090
 worried-0.090
TokenThe
Token ID464
Feature activation+1.08e-02

Pos logit contributions
irs+0.144
rones+0.137
 occasions+0.129
ide+0.129
aff+0.128
Neg logit contributions
 afraid-0.094
 very-0.093
 determined-0.093
 worried-0.090
 foolish-0.090
Token prince
Token ID19716
Feature activation+2.23e-02

Pos logit contributions
irs+0.020
ide+0.019
rones+0.017
 occasions+0.016
aff+0.016
Neg logit contributions
 afraid-0.022
 sorry-0.021
 worried-0.021
 scared-0.020
 okay-0.020
Token and
Token ID290
Feature activation+1.38e-03

Pos logit contributions
irs+0.054
ide+0.051
 occasions+0.047
rones+0.047
growth+0.046
Neg logit contributions
 afraid-0.031
 worried-0.030
 sorry-0.030
 scared-0.029
 okay-0.027
Token the
Token ID262
Feature activation+1.05e-02

Pos logit contributions
irs+0.004
ide+0.004
ades+0.003
oons+0.003
 occasions+0.003
Neg logit contributions
 afraid-0.002
 scared-0.002
 worried-0.002
 sorry-0.002
 sad-0.002
TokenThank
Token ID10449
Feature activation+3.20e-02

Pos logit contributions
 afraid+0.004
 sorry+0.004
 worried+0.004
 scared+0.004
 okay+0.004
Neg logit contributions
irs-0.004
ide-0.004
rones-0.004
 occasions-0.004
growth-0.004
Token you
Token ID345
Feature activation-7.19e-03

Pos logit contributions
irs+0.077
ide+0.074
rones+0.069
aff+0.066
 occasions+0.066
Neg logit contributions
 afraid-0.044
 sorry-0.044
 worried-0.042
 scared-0.041
 okay-0.040
Token so
Token ID523
Feature activation-7.00e-03

Pos logit contributions
 afraid+0.011
 worried+0.010
 scared+0.010
 sorry+0.009
 careless+0.009
Neg logit contributions
irs-0.016
ide-0.016
aff-0.014
rones-0.014
of-0.014
Token much
Token ID881
Feature activation-1.69e-02

Pos logit contributions
 afraid+0.007
 sorry+0.006
 worried+0.006
 scared+0.006
 okay+0.006
Neg logit contributions
irs-0.020
ide-0.019
rones-0.018
 occasions-0.018
growth-0.017
Token!"
Token ID2474
Feature activation+3.73e-02

Pos logit contributions
 afraid+0.032
 worried+0.032
 sorry+0.030
 scared+0.030
 careless+0.029
Neg logit contributions
irs-0.032
ide-0.031
aff-0.027
 occasions-0.027
 parties-0.026
Token\n
Token ID198
Feature activation+7.20e-02

Pos logit contributions
irs+0.074
ide+0.068
 occasions+0.062
rones+0.060
aff+0.059
Neg logit contributions
 afraid-0.073
 sorry-0.072
 worried-0.071
 scared-0.069
 very-0.066
Token\n
Token ID198
Feature activation+6.70e-02

Pos logit contributions
irs+0.170
 occasions+0.162
ide+0.158
growth+0.154
 parties+0.151
Neg logit contributions
 sorry-0.104
 afraid-0.102
 very-0.099
 scared-0.098
 worried-0.097
Token"
Token ID1
Feature activation+2.84e-03

Pos logit contributions
irs+0.143
ide+0.131
 occasions+0.122
rones+0.121
aff+0.120
Neg logit contributions
 afraid-0.117
 worried-0.113
 sorry-0.113
 scared-0.109
 curious-0.105
TokenThis
Token ID1212
Feature activation-1.40e-02

Pos logit contributions
irs+0.006
ide+0.006
rones+0.005
 occasions+0.005
growth+0.005
Neg logit contributions
 afraid-0.005
 sorry-0.004
 worried-0.004
 scared-0.004
 okay-0.004
Token is
Token ID318
Feature activation-3.56e-02

Pos logit contributions
 afraid+0.028
 sorry+0.027
 worried+0.026
 okay+0.025
 scared+0.025
Neg logit contributions
irs-0.026
ide-0.024
 occasions-0.023
rones-0.022
 parties-0.021
Token the
Token ID262
Feature activation+9.40e-03

Pos logit contributions
 afraid+0.059
 worried+0.057
 sorry+0.054
 scared+0.053
 guilty+0.051
Neg logit contributions
irs-0.075
ide-0.069
 parties-0.063
ate-0.061
aff-0.060
Token to
Token ID284
Feature activation+5.40e-02

Pos logit contributions
irs+0.006
ide+0.005
rones+0.005
 occasions+0.005
aff+0.005
Neg logit contributions
 afraid-0.004
 sorry-0.003
 worried-0.003
 scared-0.003
 okay-0.003
Token learn
Token ID2193
Feature activation+3.65e-02

Pos logit contributions
irs+0.113
ide+0.105
rones+0.099
growth+0.096
aff+0.094
Neg logit contributions
 sorry-0.098
 afraid-0.095
 worried-0.094
 okay-0.094
 scared-0.091
Token a
Token ID257
Feature activation+3.43e-02

Pos logit contributions
irs+0.083
ide+0.078
rones+0.071
 occasions+0.071
growth+0.070
Neg logit contributions
 afraid-0.057
 sorry-0.055
 worried-0.055
 scared-0.053
 okay-0.051
Token lesson
Token ID11483
Feature activation-8.54e-03

Pos logit contributions
irs+0.073
ide+0.072
 occasions+0.065
aff+0.064
of+0.064
Neg logit contributions
 afraid-0.053
 sorry-0.052
 worried-0.051
 scared-0.049
 oh-0.049
Token.
Token ID13
Feature activation-8.81e-03

Pos logit contributions
 afraid+0.017
 sorry+0.016
 worried+0.016
 scared+0.016
 okay+0.015
Neg logit contributions
irs-0.016
ide-0.015
 occasions-0.013
rones-0.013
aff-0.013
Token\n
Token ID198
Feature activation+6.08e-02

Pos logit contributions
 afraid+0.018
 sorry+0.017
 worried+0.017
 scared+0.017
 okay+0.016
Neg logit contributions
irs-0.016
ide-0.015
 occasions-0.013
rones-0.013
aff-0.013
Token\n
Token ID198
Feature activation+5.42e-02

Pos logit contributions
irs+0.149
ide+0.143
 occasions+0.143
growth+0.135
ware+0.134
Neg logit contributions
 sorry-0.079
 afraid-0.077
 very-0.077
 worried-0.077
 scared-0.074
Token"
Token ID1
Feature activation-1.54e-03

Pos logit contributions
irs+0.119
ide+0.109
growth+0.107
 occasions+0.103
rones+0.102
Neg logit contributions
 afraid-0.089
 worried-0.089
 sorry-0.086
 very-0.084
 scared-0.084
TokenSince
Token ID6385
Feature activation-5.44e-02

Pos logit contributions
 afraid+0.003
 worried+0.003
 sorry+0.002
 scared+0.002
 okay+0.002
Neg logit contributions
irs-0.003
ide-0.003
rones-0.003
 occasions-0.003
growth-0.003
Token you
Token ID345
Feature activation-1.79e-02

Pos logit contributions
 afraid+0.108
 worried+0.099
 sorry+0.099
 okay+0.096
 scared+0.095
Neg logit contributions
irs-0.080
 occasions-0.071
ide-0.071
 parties-0.068
ate-0.061
Token were
Token ID547
Feature activation-3.40e-02

Pos logit contributions
 afraid+0.033
 oh+0.032
 sorry+0.032
 worried+0.031
 scared+0.030
Neg logit contributions
irs-0.034
ide-0.031
of-0.028
aff-0.028
 occasions-0.028
Token smart
Token ID4451
Feature activation-7.75e-03

Pos logit contributions
 afraid+0.036
 worried+0.033
 sorry+0.033
 scared+0.031
 okay+0.030
Neg logit contributions
irs-0.091
ide-0.086
 occasions-0.080
rones-0.078
aff-0.078
Token and
Token ID290
Feature activation-3.46e-02

Pos logit contributions
 afraid+0.012
 sorry+0.012
 worried+0.011
 okay+0.011
 scared+0.011
Neg logit contributions
irs-0.017
ide-0.017
 occasions-0.015
rones-0.015
aff-0.015
Token helpful
Token ID7613
Feature activation+3.12e-03

Pos logit contributions
 afraid+0.048
 worried+0.044
 sorry+0.044
 okay+0.042
 scared+0.042
Neg logit contributions
irs-0.076
ide-0.069
 occasions-0.069
 parties-0.065
ate-0.063
Token.
Token ID13
Feature activation-5.51e-03

Pos logit contributions
irs+0.006
ide+0.006
rones+0.005
 occasions+0.005
growth+0.005
Neg logit contributions
 afraid-0.006
 sorry-0.005
 worried-0.005
 scared-0.005
 okay-0.005
Token
Token ID220
Feature activation+2.69e-02

Pos logit contributions
 sorry+0.009
 afraid+0.009
 worried+0.009
 scared+0.009
 okay+0.008
Neg logit contributions
irs-0.012
ide-0.012
 occasions-0.011
growth-0.010
rones-0.010
Token\n
Token ID198
Feature activation+6.22e-02

Pos logit contributions
 occasions+0.065
irs+0.062
growth+0.062
ide+0.060
 outfits+0.057
Neg logit contributions
 okay-0.037
 sorry-0.036
 afraid-0.034
 scared-0.032
 very-0.032
Token\n
Token ID198
Feature activation+5.38e-02

Pos logit contributions
irs+0.157
 occasions+0.150
ide+0.148
growth+0.146
ware+0.142
Neg logit contributions
 very-0.077
 afraid-0.073
 worried-0.070
 sorry-0.070
 scared-0.070
TokenLater
Token ID18602
Feature activation+1.55e-03

Pos logit contributions
irs+0.107
ide+0.099
rones+0.098
aff+0.096
growth+0.096
Neg logit contributions
 afraid-0.095
 worried-0.094
 very-0.093
 sorry-0.092
 okay-0.091
Token that
Token ID326
Feature activation-2.39e-02

Pos logit contributions
irs+0.003
ide+0.003
 parties+0.003
 occasions+0.003
growth+0.003
Neg logit contributions
 afraid-0.003
 worried-0.002
 sorry-0.002
 okay-0.002
 scared-0.002
Token day
Token ID1110
Feature activation-2.61e-02

Pos logit contributions
 afraid+0.028
 sorry+0.027
 worried+0.026
 scared+0.025
 okay+0.025
Neg logit contributions
irs-0.063
ide-0.061
rones-0.056
 occasions-0.056
of-0.054
Token,
Token ID11
Feature activation-4.30e-02

Pos logit contributions
 afraid+0.034
 worried+0.031
 sorry+0.031
 scared+0.029
 foolish+0.028
Neg logit contributions
irs-0.067
ide-0.063
rones-0.058
 occasions-0.058
aff-0.057
Token was
Token ID373
Feature activation-4.66e-02

Pos logit contributions
irs+0.007
ide+0.007
rones+0.006
 occasions+0.006
aff+0.006
Neg logit contributions
 afraid-0.004
 sorry-0.004
 worried-0.004
 scared-0.004
 okay-0.004
Token not
Token ID407
Feature activation-3.06e-02

Pos logit contributions
 afraid+0.045
 sorry+0.043
 worried+0.042
 oh+0.042
 scared+0.037
Neg logit contributions
irs-0.120
ide-0.118
rones-0.108
 parties-0.106
aff-0.104
Token sad
Token ID6507
Feature activation+9.76e-03

Pos logit contributions
 sorry+0.023
 oh+0.022
 afraid+0.022
 worried+0.022
 okay+0.020
Neg logit contributions
irs-0.083
ide-0.082
aff-0.075
rones-0.075
 parties-0.074
Token anymore
Token ID7471
Feature activation+8.68e-03

Pos logit contributions
irs+0.021
ide+0.019
rones+0.018
 occasions+0.017
growth+0.017
Neg logit contributions
 afraid-0.017
 sorry-0.017
 worried-0.017
 scared-0.016
 okay-0.016
Token.
Token ID13
Feature activation-7.96e-05

Pos logit contributions
irs+0.018
ide+0.017
rones+0.015
 occasions+0.015
growth+0.015
Neg logit contributions
 afraid-0.015
 sorry-0.015
 worried-0.015
 scared-0.014
 okay-0.014
Token They
Token ID1119
Feature activation+1.40e-02

Pos logit contributions
 afraid+0.000
 sorry+0.000
 scared+0.000
 worried+0.000
 very+0.000
Neg logit contributions
irs-0.000
ide-0.000
 occasions-0.000
 outfits-0.000
rap-0.000
Token all
Token ID477
Feature activation-2.20e-03

Pos logit contributions
irs+0.032
ide+0.030
 occasions+0.028
rones+0.028
growth+0.027
Neg logit contributions
 afraid-0.022
 worried-0.021
 sorry-0.021
 scared-0.020
 okay-0.019
Token lived
Token ID5615
Feature activation+1.61e-02

Pos logit contributions
 afraid+0.003
 sorry+0.003
 worried+0.003
 scared+0.003
 okay+0.003
Neg logit contributions
irs-0.005
ide-0.005
rones-0.005
 occasions-0.005
aff-0.005
Token happily
Token ID18177
Feature activation-1.95e-02

Pos logit contributions
irs+0.025
ide+0.024
rones+0.021
 occasions+0.020
aff+0.020
Neg logit contributions
 afraid-0.035
 sorry-0.034
 worried-0.034
 scared-0.033
 okay-0.033
Token in
Token ID287
Feature activation+1.44e-02

Pos logit contributions
 afraid+0.030
 sorry+0.029
 worried+0.029
 oh+0.027
 okay+0.027
Neg logit contributions
irs-0.042
ide-0.041
rones-0.037
 occasions-0.037
aff-0.037
Token the
Token ID262
Feature activation+2.29e-02

Pos logit contributions
irs+0.034
ide+0.033
aff+0.031
 occasions+0.030
of+0.029
Neg logit contributions
 afraid-0.023
 sorry-0.023
 very-0.022
 scared-0.022
 worried-0.022
Token touch
Token ID3638
Feature activation+4.04e-02

Pos logit contributions
irs+0.008
ide+0.008
 occasions+0.007
rones+0.007
aff+0.007
Neg logit contributions
 afraid-0.007
 sorry-0.007
 worried-0.007
 scared-0.007
 okay-0.007
Token things
Token ID1243
Feature activation-2.33e-03

Pos logit contributions
irs+0.081
ide+0.076
rones+0.068
 occasions+0.068
growth+0.066
Neg logit contributions
 afraid-0.073
 sorry-0.071
 worried-0.070
 scared-0.068
 okay-0.066
Token that
Token ID326
Feature activation+5.76e-03

Pos logit contributions
 afraid+0.004
 worried+0.004
 sorry+0.004
 scared+0.004
 foolish+0.003
Neg logit contributions
irs-0.005
ide-0.005
 occasions-0.004
of-0.004
aff-0.004
Token can
Token ID460
Feature activation+1.87e-02

Pos logit contributions
irs+0.011
ide+0.010
 occasions+0.009
rones+0.009
aff+0.009
Neg logit contributions
 afraid-0.011
 sorry-0.011
 worried-0.011
 scared-0.011
 okay-0.010
Token harm
Token ID4419
Feature activation+3.11e-02

Pos logit contributions
irs+0.046
rones+0.045
ide+0.041
aff+0.040
growth+0.040
Neg logit contributions
 mean-0.027
 scared-0.027
 afraid-0.026
 okay-0.026
 worried-0.026
Token you
Token ID345
Feature activation+5.22e-03

Pos logit contributions
irs+0.063
ide+0.059
rones+0.054
 occasions+0.052
aff+0.052
Neg logit contributions
 afraid-0.056
 sorry-0.054
 worried-0.053
 scared-0.052
 okay-0.051
Token.
Token ID13
Feature activation-1.20e-02

Pos logit contributions
irs+0.011
ide+0.010
 occasions+0.009
rones+0.009
growth+0.009
Neg logit contributions
 afraid-0.009
 sorry-0.009
 worried-0.009
 scared-0.009
 okay-0.008
Token<|endoftext|>
Token ID50256
Feature activation+4.73e-02

Pos logit contributions
 afraid+0.020
 sorry+0.020
 okay+0.019
 worried+0.019
 scared+0.018
Neg logit contributions
irs-0.026
ide-0.025
 occasions-0.022
rones-0.022
aff-0.021
TokenL
Token ID43
Feature activation+6.62e-04

Pos logit contributions
irs+0.098
ide+0.088
 occasions+0.083
aff+0.081
isk+0.080
Neg logit contributions
 scared-0.080
 afraid-0.080
 very-0.080
 sorry-0.077
 worried-0.077
Tokenily
Token ID813
Feature activation+8.21e-03

Pos logit contributions
irs+0.001
ide+0.001
growth+0.001
 occasions+0.001
aff+0.001
Neg logit contributions
 worried-0.001
 very-0.001
 sorry-0.001
 afraid-0.001
 foolish-0.001
Token was
Token ID373
Feature activation-4.59e-02

Pos logit contributions
irs+0.018
ide+0.018
rones+0.016
growth+0.016
 occasions+0.016
Neg logit contributions
 afraid-0.012
 sorry-0.012
 worried-0.011
 oh-0.011
 okay-0.011
Token fair
Token ID3148
Feature activation-1.34e-02

Pos logit contributions
 afraid+0.038
 okay+0.035
 oh+0.035
 sorry+0.035
 worried+0.034
Neg logit contributions
ide-0.033
irs-0.030
aff-0.030
rones-0.027
of-0.027
Token child
Token ID1200
Feature activation-2.92e-02

Pos logit contributions
 afraid+0.024
 sorry+0.023
 worried+0.023
 okay+0.022
 scared+0.022
Neg logit contributions
irs-0.027
ide-0.026
aff-0.023
rones-0.023
 occasions-0.023
Token.
Token ID13
Feature activation-1.73e-02

Pos logit contributions
 afraid+0.035
 sorry+0.033
 worried+0.033
 scared+0.031
 okay+0.030
Neg logit contributions
irs-0.077
ide-0.074
 occasions-0.068
rones-0.068
aff-0.066
Token She
Token ID1375
Feature activation+2.41e-02

Pos logit contributions
 afraid+0.024
 scared+0.023
 sorry+0.023
 worried+0.023
 curious+0.022
Neg logit contributions
irs-0.044
ide-0.037
 occasions-0.037
lor-0.036
ades-0.035
Token walked
Token ID6807
Feature activation-1.29e-02

Pos logit contributions
irs+0.054
ide+0.053
rones+0.049
 occasions+0.049
aff+0.047
Neg logit contributions
 afraid-0.035
 sorry-0.035
 worried-0.034
 okay-0.033
 scared-0.031
Token in
Token ID287
Feature activation+8.07e-03

Pos logit contributions
 sorry+0.020
 afraid+0.020
 worried+0.019
 okay+0.018
 foolish+0.017
Neg logit contributions
irs-0.028
ide-0.028
rones-0.026
 occasions-0.025
aff-0.025
Token the
Token ID262
Feature activation+3.64e-03

Pos logit contributions
irs+0.020
ide+0.018
aff+0.018
 occasions+0.017
rones+0.017
Neg logit contributions
 very-0.012
 afraid-0.012
 worried-0.011
 scared-0.011
 curious-0.011
Token park
Token ID3952
Feature activation+2.25e-03

Pos logit contributions
irs+0.007
ide+0.006
 occasions+0.006
rones+0.005
aff+0.005
Neg logit contributions
 afraid-0.007
 worried-0.007
 sorry-0.007
 scared-0.007
 foolish-0.007
Token and
Token ID290
Feature activation-2.04e-02

Pos logit contributions
ide+0.005
irs+0.005
rones+0.004
 occasions+0.004
aff+0.004
Neg logit contributions
 afraid-0.004
 sorry-0.004
 worried-0.004
 okay-0.003
 scared-0.003
Token saw
Token ID2497
Feature activation+1.68e-02

Pos logit contributions
 sorry+0.036
 afraid+0.035
 okay+0.034
 foolish+0.032
 worried+0.032
Neg logit contributions
ide-0.038
irs-0.036
 parties-0.034
rones-0.034
of-0.034
Token something
Token ID1223
Feature activation-1.76e-02

Pos logit contributions
irs+0.040
ide+0.038
rones+0.035
growth+0.034
 occasions+0.034
Neg logit contributions
 afraid-0.024
 sorry-0.022
 worried-0.022
 okay-0.021
 scared-0.021
Token\n
Token ID198
Feature activation+6.71e-02

Pos logit contributions
irs+0.062
ide+0.058
 occasions+0.052
rones+0.051
aff+0.049
Neg logit contributions
 afraid-0.059
 sorry-0.058
 worried-0.057
 scared-0.056
 okay-0.054
Token\n
Token ID198
Feature activation+6.08e-02

Pos logit contributions
irs+0.160
 occasions+0.148
ide+0.144
rones+0.139
 parties+0.137
Neg logit contributions
 sorry-0.090
 very-0.089
 okay-0.087
 afraid-0.087
 worried-0.085
TokenBut
Token ID1537
Feature activation-3.70e-02

Pos logit contributions
irs+0.112
ide+0.096
rones+0.092
 occasions+0.089
aff+0.089
Neg logit contributions
 afraid-0.121
 foolish-0.120
 worried-0.120
 okay-0.119
 sorry-0.119
Token before
Token ID878
Feature activation-3.46e-02

Pos logit contributions
 afraid+0.078
 worried+0.072
 scared+0.069
 sorry+0.068
 guilty+0.067
Neg logit contributions
ide-0.046
irs-0.045
of-0.040
growth-0.039
 parties-0.038
Token the
Token ID262
Feature activation-8.99e-03

Pos logit contributions
 afraid+0.051
 worried+0.047
 sorry+0.047
 scared+0.045
 oh+0.045
Neg logit contributions
irs-0.081
ide-0.075
rones-0.070
growth-0.070
 occasions-0.068
Token bird
Token ID6512
Feature activation+7.33e-03

Pos logit contributions
 afraid+0.018
 oh+0.017
 sorry+0.016
 worried+0.016
 okay+0.016
Neg logit contributions
irs-0.013
ide-0.012
growth-0.012
of-0.011
 occasions-0.011
Token left
Token ID1364
Feature activation+2.66e-03

Pos logit contributions
irs+0.016
ide+0.015
of+0.014
 occasions+0.013
aff+0.013
Neg logit contributions
 afraid-0.011
 sorry-0.011
 worried-0.010
 oh-0.010
 okay-0.010
Token,
Token ID11
Feature activation-3.68e-02

Pos logit contributions
irs+0.006
ide+0.005
growth+0.005
 occasions+0.005
rones+0.005
Neg logit contributions
 afraid-0.004
 worried-0.004
 sorry-0.004
 scared-0.004
 careless-0.004
Token he
Token ID339
Feature activation+2.23e-02

Pos logit contributions
 afraid+0.074
 worried+0.070
 sorry+0.069
 scared+0.065
 oh+0.065
Neg logit contributions
irs-0.057
ide-0.052
growth-0.048
of-0.046
 parties-0.045
Token told
Token ID1297
Feature activation+3.08e-02

Pos logit contributions
irs+0.047
ide+0.046
of+0.043
rones+0.041
growth+0.041
Neg logit contributions
 sorry-0.033
 afraid-0.033
 oh-0.031
 worried-0.030
 okay-0.030
Token Ben
Token ID3932
Feature activation-2.15e-03

Pos logit contributions
irs+0.057
ide+0.053
rones+0.047
 occasions+0.046
aff+0.045
Neg logit contributions
 afraid-0.062
 sorry-0.062
 worried-0.060
 scared-0.059
 okay-0.058
Token at
Token ID379
Feature activation+2.77e-02

Pos logit contributions
 afraid+0.036
 worried+0.034
 sorry+0.033
 scared+0.032
 okay+0.030
Neg logit contributions
irs-0.076
ide-0.072
rones-0.066
 occasions-0.066
aff-0.065
Token our
Token ID674
Feature activation-2.58e-04

Pos logit contributions
irs+0.057
ide+0.054
rones+0.049
 occasions+0.049
growth+0.047
Neg logit contributions
 afraid-0.049
 sorry-0.048
 worried-0.047
 scared-0.046
 foolish-0.045
Token little
Token ID1310
Feature activation-6.00e-03

Pos logit contributions
 sorry+0.001
 afraid+0.001
 worried+0.001
 scared+0.001
 foolish+0.000
Neg logit contributions
irs-0.000
ide-0.000
of-0.000
rones-0.000
 occasions-0.000
Token house
Token ID2156
Feature activation-1.39e-02

Pos logit contributions
 afraid+0.013
 worried+0.013
 sorry+0.013
 scared+0.012
 okay+0.012
Neg logit contributions
irs-0.009
ide-0.009
 occasions-0.008
rones-0.008
aff-0.007
Token!"
Token ID2474
Feature activation+3.60e-02

Pos logit contributions
 afraid+0.023
 sorry+0.022
 worried+0.022
 scared+0.021
 okay+0.021
Neg logit contributions
irs-0.029
ide-0.028
 occasions-0.026
rones-0.025
aff-0.025
Token said
Token ID531
Feature activation+1.82e-02

Pos logit contributions
irs+0.091
ide+0.085
 occasions+0.079
rones+0.079
growth+0.078
Neg logit contributions
 afraid-0.046
 sorry-0.044
 worried-0.043
 okay-0.041
 scared-0.040
Token Alic
Token ID34745
Feature activation+2.14e-02

Pos logit contributions
irs+0.036
ide+0.034
rones+0.030
 occasions+0.030
growth+0.029
Neg logit contributions
 afraid-0.034
 sorry-0.033
 worried-0.033
 scared-0.032
 okay-0.030
Tokeny
Token ID88
Feature activation+1.86e-02

Pos logit contributions
irs+0.046
 occasions+0.039
growth+0.038
ide+0.038
 parties+0.037
Neg logit contributions
 afraid-0.037
 worried-0.034
 scared-0.033
 okay-0.032
 very-0.031
Token.
Token ID13
Feature activation+3.54e-03

Pos logit contributions
irs+0.038
ide+0.036
rones+0.032
 occasions+0.032
aff+0.031
Neg logit contributions
 afraid-0.033
 sorry-0.032
 worried-0.032
 scared-0.031
 okay-0.030
Token
Token ID220
Feature activation+2.53e-02

Pos logit contributions
irs+0.007
ide+0.007
 occasions+0.006
rones+0.006
growth+0.006
Neg logit contributions
 afraid-0.006
 sorry-0.006
 worried-0.006
 scared-0.006
 okay-0.006
Token\n
Token ID198
Feature activation+6.48e-02

Pos logit contributions
 occasions+0.055
irs+0.055
ide+0.052
growth+0.051
rones+0.050
Neg logit contributions
 afraid-0.040
 sorry-0.039
 scared-0.039
 okay-0.039
 worried-0.038
Token dog
Token ID3290
Feature activation-3.17e-02

Pos logit contributions
irs+0.075
ide+0.068
rones+0.063
 occasions+0.062
aff+0.061
Neg logit contributions
 afraid-0.042
 worried-0.042
 scared-0.041
 sorry-0.040
 oh-0.038
Token.
Token ID13
Feature activation+1.50e-03

Pos logit contributions
 afraid+0.061
 sorry+0.059
 worried+0.058
 okay+0.056
 foolish+0.056
Neg logit contributions
irs-0.056
ide-0.056
rones-0.047
 occasions-0.047
of-0.046
Token He
Token ID679
Feature activation+1.72e-02

Pos logit contributions
irs+0.003
ide+0.003
 occasions+0.003
rones+0.003
growth+0.003
Neg logit contributions
 afraid-0.003
 worried-0.003
 sorry-0.003
 scared-0.003
 okay-0.002
Token wanted
Token ID2227
Feature activation+2.81e-03

Pos logit contributions
irs+0.039
ide+0.038
rones+0.034
 occasions+0.034
aff+0.034
Neg logit contributions
 afraid-0.025
 sorry-0.025
 worried-0.024
 okay-0.023
 scared-0.023
Token it
Token ID340
Feature activation+5.01e-03

Pos logit contributions
irs+0.006
ide+0.006
 occasions+0.005
rones+0.005
growth+0.005
Neg logit contributions
 afraid-0.004
 sorry-0.004
 worried-0.004
 scared-0.004
 okay-0.004
Token more
Token ID517
Feature activation-2.26e-02

Pos logit contributions
ide+0.012
irs+0.012
of+0.010
 parties+0.010
 occasions+0.010
Neg logit contributions
 afraid-0.007
 sorry-0.007
 worried-0.007
 scared-0.006
 crank-0.006
Token than
Token ID621
Feature activation-3.73e-02

Pos logit contributions
 sorry+0.037
 afraid+0.036
 worried+0.036
 scared+0.033
 oh+0.032
Neg logit contributions
ide-0.049
irs-0.047
 occasions-0.045
 parties-0.043
of-0.041
Token anything
Token ID1997
Feature activation-1.04e-02

Pos logit contributions
 afraid+0.065
 sorry+0.064
 worried+0.063
 scared+0.061
 okay+0.059
Neg logit contributions
irs-0.077
ide-0.072
 occasions-0.065
rones-0.064
aff-0.063
Token and
Token ID290
Feature activation-1.66e-02

Pos logit contributions
 afraid+0.017
 sorry+0.016
 worried+0.016
 scared+0.015
 okay+0.015
Neg logit contributions
irs-0.021
ide-0.021
 occasions-0.019
of-0.018
aff-0.018
Token asked
Token ID1965
Feature activation+2.37e-02

Pos logit contributions
 afraid+0.023
 sorry+0.022
 worried+0.022
 scared+0.021
 okay+0.020
Neg logit contributions
irs-0.041
ide-0.039
rones-0.036
 occasions-0.035
growth-0.035
Token the
Token ID262
Feature activation+9.61e-03

Pos logit contributions
irs+0.049
ide+0.046
rones+0.042
 occasions+0.041
growth+0.041
Neg logit contributions
 afraid-0.042
 sorry-0.041
 worried-0.041
 scared-0.039
 okay-0.039
Token huge
Token ID3236
Feature activation-2.33e-02

Pos logit contributions
 afraid+0.085
 sorry+0.082
 worried+0.081
 oh+0.079
 okay+0.079
Neg logit contributions
irs-0.126
ide-0.121
rones-0.109
aff-0.106
 occasions-0.105
Token that
Token ID326
Feature activation-3.35e-02

Pos logit contributions
 afraid+0.040
 okay+0.040
 sorry+0.040
 worried+0.040
 sad+0.036
Neg logit contributions
irs-0.044
ide-0.043
 parties-0.040
 occasions-0.040
aff-0.038
Token it
Token ID340
Feature activation-1.13e-02

Pos logit contributions
 afraid+0.056
 worried+0.055
 sorry+0.053
 careless+0.051
 guilty+0.050
Neg logit contributions
irs-0.066
ide-0.064
 parties-0.057
 occasions-0.056
ate-0.053
Token looked
Token ID3114
Feature activation-3.39e-02

Pos logit contributions
 sorry+0.017
 okay+0.016
 afraid+0.015
 oh+0.015
 worried+0.015
Neg logit contributions
ide-0.025
irs-0.024
rones-0.021
 occasions-0.021
growth-0.021
Token like
Token ID588
Feature activation-3.01e-02

Pos logit contributions
 oh+0.044
 afraid+0.041
ses+0.040
 sorry+0.040
 worried+0.039
Neg logit contributions
irs-0.062
 parties-0.062
ide-0.060
 outer-0.053
 ports-0.052
Token it
Token ID340
Feature activation-1.59e-03

Pos logit contributions
 afraid+0.049
 worried+0.047
 sorry+0.045
 scared+0.043
 okay+0.042
Neg logit contributions
irs-0.062
ide-0.056
 parties-0.054
growth-0.051
 occasions-0.051
Token touched
Token ID12615
Feature activation+3.16e-02

Pos logit contributions
 sorry+0.002
 afraid+0.002
 worried+0.002
 okay+0.002
 oh+0.002
Neg logit contributions
irs-0.004
ide-0.004
rones-0.003
 occasions-0.003
growth-0.003
Token the
Token ID262
Feature activation+2.21e-02

Pos logit contributions
irs+0.061
ide+0.060
rones+0.051
 occasions+0.051
aff+0.050
Neg logit contributions
 afraid-0.058
 sorry-0.057
 worried-0.057
 scared-0.053
 careless-0.052
Token sky
Token ID6766
Feature activation+4.84e-03

Pos logit contributions
irs+0.038
ide+0.037
rones+0.032
 occasions+0.032
aff+0.031
Neg logit contributions
 afraid-0.045
 sorry-0.043
 worried-0.043
 scared-0.041
 okay-0.041
Token.
Token ID13
Feature activation-1.20e-02

Pos logit contributions
irs+0.010
ide+0.009
rones+0.008
 occasions+0.008
aff+0.008
Neg logit contributions
 afraid-0.009
 sorry-0.009
 worried-0.009
 scared-0.008
 okay-0.008
Token Inside
Token ID14384
Feature activation+3.27e-02

Pos logit contributions
 afraid+0.023
 sorry+0.023
 worried+0.022
 scared+0.022
 okay+0.021
Neg logit contributions
irs-0.023
ide-0.021
 occasions-0.019
rones-0.019
aff-0.018
Tokenrd
Token ID4372
Feature activation+1.51e-02

Pos logit contributions
irs+0.046
ide+0.041
rones+0.037
growth+0.037
 occasions+0.037
Neg logit contributions
 afraid-0.038
 worried-0.036
 sorry-0.036
 scared-0.035
 foolish-0.034
Token person
Token ID1048
Feature activation-1.51e-02

Pos logit contributions
irs+0.030
ide+0.025
growth+0.023
rones+0.023
 occasions+0.023
Neg logit contributions
 afraid-0.028
 worried-0.027
 sorry-0.027
 scared-0.027
 very-0.026
Token said
Token ID531
Feature activation-2.34e-02

Pos logit contributions
 afraid+0.021
 sorry+0.020
 worried+0.020
 scared+0.019
 okay+0.019
Neg logit contributions
irs-0.037
ide-0.035
rones-0.032
 occasions-0.032
growth-0.031
Token,
Token ID11
Feature activation-3.71e-02

Pos logit contributions
 afraid+0.033
 worried+0.030
 scared+0.028
 sorry+0.028
 careless+0.028
Neg logit contributions
irs-0.054
ide-0.054
 occasions-0.048
aff-0.048
of-0.047
Token "
Token ID366
Feature activation-6.48e-03

Pos logit contributions
 afraid+0.055
 worried+0.051
 sorry+0.050
 scared+0.049
 foolish+0.048
Neg logit contributions
irs-0.084
ide-0.081
aff-0.074
 occasions-0.072
rones-0.072
TokenMy
Token ID3666
Feature activation-1.83e-02

Pos logit contributions
 afraid+0.009
 oh+0.009
 worried+0.009
 very+0.008
 scared+0.008
Neg logit contributions
irs-0.015
ide-0.013
rones-0.013
 occasions-0.012
growth-0.012
Token name
Token ID1438
Feature activation+3.20e-03

Pos logit contributions
 afraid+0.034
 sorry+0.033
 worried+0.033
 scared+0.032
 okay+0.031
Neg logit contributions
irs-0.036
ide-0.033
 occasions-0.030
rones-0.030
aff-0.029
Token is
Token ID318
Feature activation-3.86e-02

Pos logit contributions
irs+0.008
ide+0.007
rones+0.007
 occasions+0.007
growth+0.007
Neg logit contributions
 afraid-0.004
 sorry-0.004
 scared-0.004
 worried-0.004
 very-0.004
Token Lucy
Token ID22162
Feature activation-6.85e-03

Pos logit contributions
 afraid+0.068
 sorry+0.066
 worried+0.065
 scared+0.063
 okay+0.062
Neg logit contributions
irs-0.080
ide-0.074
 occasions-0.069
rones-0.068
growth-0.067
Token."
Token ID526
Feature activation+2.46e-02

Pos logit contributions
 afraid+0.012
 sorry+0.011
 worried+0.011
 scared+0.011
 okay+0.011
Neg logit contributions
irs-0.014
ide-0.013
 occasions-0.012
rones-0.012
growth-0.012
Token
Token ID198
Feature activation+6.78e-02

Pos logit contributions
irs+0.048
ide+0.044
 occasions+0.040
rones+0.039
growth+0.039
Neg logit contributions
 afraid-0.046
 sorry-0.045
 worried-0.044
 scared-0.044
 okay-0.042
Token mom
Token ID1995
Feature activation-5.63e-03

Pos logit contributions
irs+0.015
ide+0.014
rones+0.013
 occasions+0.013
growth+0.012
Neg logit contributions
 afraid-0.010
 sorry-0.010
 worried-0.010
 scared-0.010
 okay-0.009
Token if
Token ID611
Feature activation-5.09e-02

Pos logit contributions
 afraid+0.007
 sorry+0.006
 worried+0.006
 scared+0.006
 okay+0.006
Neg logit contributions
irs-0.015
ide-0.014
rones-0.013
 occasions-0.013
growth-0.013
Token they
Token ID484
Feature activation-9.50e-03

Pos logit contributions
 afraid+0.080
 worried+0.074
 sorry+0.073
 scared+0.072
 foolish+0.068
Neg logit contributions
irs-0.109
ide-0.101
 occasions-0.095
 parties-0.092
rones-0.091
Token can
Token ID460
Feature activation-1.82e-03

Pos logit contributions
 afraid+0.013
 sorry+0.012
 worried+0.012
 scared+0.011
 okay+0.011
Neg logit contributions
irs-0.024
ide-0.023
rones-0.021
 occasions-0.021
aff-0.020
Token go
Token ID467
Feature activation-2.11e-02

Pos logit contributions
 sorry+0.003
 okay+0.003
 very+0.003
 scared+0.003
 worried+0.003
Neg logit contributions
irs-0.004
rones-0.004
growth-0.004
ide-0.004
aff-0.004
Token.
Token ID13
Feature activation-9.65e-03

Pos logit contributions
 afraid+0.033
 sorry+0.032
 worried+0.031
 scared+0.030
 foolish+0.029
Neg logit contributions
irs-0.048
ide-0.045
 occasions-0.041
rones-0.041
aff-0.040
Token
Token ID198
Feature activation+5.55e-02

Pos logit contributions
 sorry+0.017
 afraid+0.017
 worried+0.016
 okay+0.016
 scared+0.016
Neg logit contributions
irs-0.020
ide-0.019
 occasions-0.017
rones-0.017
growth-0.017
Token
Token ID198
Feature activation+4.62e-02

Pos logit contributions
irs+0.127
ide+0.120
growth+0.120
 occasions+0.120
aff+0.116
Neg logit contributions
 afraid-0.070
 very-0.069
 sorry-0.069
 worried-0.069
 scared-0.066
Token"
Token ID1
Feature activation-7.36e-03

Pos logit contributions
irs+0.092
ide+0.085
rones+0.083
aff+0.083
growth+0.082
Neg logit contributions
 afraid-0.080
 worried-0.079
 sorry-0.078
 oh-0.076
 okay-0.075
TokenCan
Token ID6090
Feature activation-3.33e-02

Pos logit contributions
 afraid+0.013
 sorry+0.012
 worried+0.012
 scared+0.012
 okay+0.011
Neg logit contributions
irs-0.016
ide-0.015
 occasions-0.014
rones-0.013
aff-0.013
Token we
Token ID356
Feature activation-5.57e-03

Pos logit contributions
 afraid+0.046
 worried+0.044
 sorry+0.043
 scared+0.041
 okay+0.039
Neg logit contributions
irs-0.081
ide-0.077
 occasions-0.071
rones-0.070
aff-0.069
Token up
Token ID510
Feature activation+2.36e-02

Pos logit contributions
irs+0.129
ide+0.121
rones+0.112
 occasions+0.111
growth+0.110
Neg logit contributions
 afraid-0.067
 sorry-0.064
 worried-0.063
 scared-0.059
 okay-0.058
Token the
Token ID262
Feature activation+9.61e-04

Pos logit contributions
irs+0.055
ide+0.052
 occasions+0.048
rones+0.048
aff+0.047
Neg logit contributions
 afraid-0.036
 worried-0.036
 scared-0.035
 sorry-0.035
 very-0.034
Token battery
Token ID6555
Feature activation-4.64e-02

Pos logit contributions
irs+0.002
ide+0.002
 occasions+0.001
growth+0.001
rones+0.001
Neg logit contributions
 afraid-0.002
 sorry-0.002
 worried-0.002
 oh-0.002
 okay-0.002
Token.
Token ID13
Feature activation+4.74e-03

Pos logit contributions
 afraid+0.090
 sorry+0.087
 worried+0.086
 scared+0.082
 okay+0.082
Neg logit contributions
irs-0.085
ide-0.084
 occasions-0.071
rones-0.070
aff-0.070
Token She
Token ID1375
Feature activation+2.47e-02

Pos logit contributions
irs+0.010
ide+0.010
 occasions+0.009
rones+0.009
aff+0.008
Neg logit contributions
 afraid-0.008
 sorry-0.008
 worried-0.008
 scared-0.007
 okay-0.007
Token looked
Token ID3114
Feature activation-2.26e-02

Pos logit contributions
irs+0.055
ide+0.052
rones+0.048
 occasions+0.047
growth+0.047
Neg logit contributions
 afraid-0.039
 sorry-0.038
 worried-0.037
 scared-0.035
 okay-0.035
Token around
Token ID1088
Feature activation+2.84e-02

Pos logit contributions
 sorry+0.035
 afraid+0.035
 oh+0.034
 okay+0.033
 crank+0.033
Neg logit contributions
irs-0.045
ide-0.042
of-0.039
aff-0.037
 parties-0.037
Token and
Token ID290
Feature activation-2.52e-02

Pos logit contributions
irs+0.064
ide+0.062
 occasions+0.055
growth+0.055
rones+0.055
Neg logit contributions
 afraid-0.042
 sorry-0.041
 worried-0.039
 scared-0.037
 okay-0.037
Token saw
Token ID2497
Feature activation+2.63e-02

Pos logit contributions
 sorry+0.044
 afraid+0.041
 okay+0.039
 foolish+0.038
 worried+0.038
Neg logit contributions
irs-0.048
ide-0.046
of-0.043
rones-0.042
growth-0.041
Token nowhere
Token ID12062
Feature activation+1.79e-04

Pos logit contributions
irs+0.063
ide+0.059
rones+0.054
 occasions+0.054
growth+0.053
Neg logit contributions
 afraid-0.038
 sorry-0.037
 worried-0.036
 scared-0.034
 okay-0.034
Token to
Token ID284
Feature activation+3.32e-02

Pos logit contributions
irs+0.000
ide+0.000
rones+0.000
of+0.000
 occasions+0.000
Neg logit contributions
 sorry-0.000
 afraid-0.000
 worried-0.000
 okay-0.000
 oh-0.000
Token.
Token ID13
Feature activation+3.09e-03

Pos logit contributions
irs+0.081
ide+0.076
rones+0.068
 occasions+0.068
aff+0.066
Neg logit contributions
 afraid-0.078
 sorry-0.075
 worried-0.075
 scared-0.072
 okay-0.071
Token
Token ID198
Feature activation+6.59e-02

Pos logit contributions
irs+0.006
ide+0.006
 occasions+0.005
rones+0.005
growth+0.005
Neg logit contributions
 afraid-0.006
 sorry-0.006
 worried-0.006
 scared-0.005
 okay-0.005
Token
Token ID198
Feature activation+5.81e-02

Pos logit contributions
irs+0.157
 occasions+0.153
ide+0.147
growth+0.147
ware+0.146
Neg logit contributions
 afraid-0.077
 very-0.075
 sorry-0.072
 scared-0.072
 worried-0.072
Token"
Token ID1
Feature activation-1.25e-02

Pos logit contributions
irs+0.123
ide+0.111
rones+0.107
aff+0.106
 occasions+0.106
Neg logit contributions
 afraid-0.096
 worried-0.092
 oh-0.090
 scared-0.088
 careless-0.088
TokenDad
Token ID46270
Feature activation-3.56e-02

Pos logit contributions
 afraid+0.023
 worried+0.022
 sorry+0.022
 scared+0.022
 foolish+0.021
Neg logit contributions
irs-0.025
ide-0.024
rones-0.021
 occasions-0.021
aff-0.021
Token,
Token ID11
Feature activation-5.54e-02

Pos logit contributions
 afraid+0.049
 sorry+0.046
 worried+0.045
 oh+0.045
 scared+0.042
Neg logit contributions
irs-0.082
ide-0.079
of-0.072
aff-0.072
rones-0.071
Token why
Token ID1521
Feature activation-4.19e-02

Pos logit contributions
 afraid+0.088
 worried+0.082
 scared+0.079
 sorry+0.078
 okay+0.075
Neg logit contributions
irs-0.113
ide-0.107
 occasions-0.099
of-0.097
 parties-0.097
Token does
Token ID857
Feature activation-2.96e-02

Pos logit contributions
 afraid+0.058
 worried+0.056
 sorry+0.056
 scared+0.052
 okay+0.051
Neg logit contributions
irs-0.099
ide-0.096
 occasions-0.086
rones-0.085
aff-0.084
Token the
Token ID262
Feature activation+2.31e-02

Pos logit contributions
 afraid+0.049
 worried+0.047
 sorry+0.047
 scared+0.045
 okay+0.045
Neg logit contributions
irs-0.063
ide-0.059
 occasions-0.053
rones-0.053
growth-0.052
Token monkey
Token ID21657
Feature activation+8.87e-03

Pos logit contributions
irs+0.036
ide+0.036
aff+0.033
rones+0.033
 occasions+0.032
Neg logit contributions
 afraid-0.044
 sorry-0.044
 worried-0.043
 okay-0.043
 scared-0.041
Token have
Token ID423
Feature activation-9.66e-03

Pos logit contributions
irs+0.020
ide+0.018
rones+0.017
 occasions+0.017
growth+0.016
Neg logit contributions
 afraid-0.014
 sorry-0.014
 worried-0.014
 scared-0.013
 okay-0.013
Token upon
Token ID2402
Feature activation+2.45e-02

Pos logit contributions
 afraid+0.029
 worried+0.028
 sorry+0.028
 scared+0.027
 sad+0.027
Neg logit contributions
irs-0.027
 occasions-0.025
growth-0.024
ide-0.024
rones-0.023
Token a
Token ID257
Feature activation-1.85e-02

Pos logit contributions
irs+0.066
ide+0.063
 occasions+0.058
rones+0.058
aff+0.057
Neg logit contributions
 afraid-0.028
 sorry-0.026
 worried-0.026
 scared-0.025
 okay-0.024
Token time
Token ID640
Feature activation-7.77e-02

Pos logit contributions
 afraid+0.029
 sorry+0.029
 worried+0.028
 scared+0.027
 oh+0.026
Neg logit contributions
irs-0.040
ide-0.040
rones-0.035
 occasions-0.034
of-0.034
Token,
Token ID11
Feature activation-5.84e-02

Pos logit contributions
 afraid+0.114
 worried+0.110
 sorry+0.106
 scared+0.101
 careless+0.100
Neg logit contributions
ide-0.177
irs-0.174
of-0.149
rones-0.146
 parties-0.142
Token there
Token ID612
Feature activation+1.19e-02

Pos logit contributions
 afraid+0.081
 worried+0.078
 sorry+0.078
 guilty+0.073
 careless+0.073
Neg logit contributions
ide-0.133
irs-0.124
of-0.108
rones-0.107
growth-0.106
Token was
Token ID373
Feature activation-5.44e-02

Pos logit contributions
irs+0.030
ide+0.027
 occasions+0.026
rones+0.026
aff+0.026
Neg logit contributions
 afraid-0.016
 worried-0.015
 sorry-0.015
 scared-0.015
 okay-0.014
Token a
Token ID257
Feature activation-2.19e-02

Pos logit contributions
 afraid+0.082
 sorry+0.079
 worried+0.079
 scared+0.072
 foolish+0.070
Neg logit contributions
irs-0.121
ide-0.118
rones-0.109
growth-0.108
 parties-0.102
Token girl
Token ID2576
Feature activation-2.43e-02

Pos logit contributions
 afraid+0.040
 okay+0.037
 oh+0.037
 sorry+0.036
 worried+0.036
Neg logit contributions
ide-0.034
irs-0.030
aff-0.030
berry-0.028
of-0.028
Token named
Token ID3706
Feature activation-4.12e-02

Pos logit contributions
 afraid+0.033
 sorry+0.032
 worried+0.031
 okay+0.031
 foolish+0.029
Neg logit contributions
ide-0.056
irs-0.055
 occasions-0.051
rones-0.051
of-0.049
Token Grace
Token ID16156
Feature activation+4.08e-04

Pos logit contributions
 afraid+0.063
 sorry+0.061
 worried+0.060
 okay+0.059
 scared+0.059
Neg logit contributions
irs-0.097
ide-0.088
 occasions-0.084
rones-0.083
growth-0.083
Token.
Token ID13
Feature activation-3.68e-03

Pos logit contributions
irs+0.001
ide+0.001
 occasions+0.001
rones+0.001
growth+0.001
Neg logit contributions
 afraid-0.001
 worried-0.001
 scared-0.001
 very-0.001
 sorry-0.001
Token she
Token ID673
Feature activation+2.60e-02

Pos logit contributions
irs+0.048
ide+0.046
rones+0.042
 occasions+0.041
aff+0.040
Neg logit contributions
 afraid-0.034
 sorry-0.033
 worried-0.033
 scared-0.031
 okay-0.031
Token said
Token ID531
Feature activation-3.85e-03

Pos logit contributions
irs+0.057
ide+0.054
rones+0.049
 occasions+0.049
aff+0.048
Neg logit contributions
 afraid-0.040
 sorry-0.039
 worried-0.038
 okay-0.036
 foolish-0.036
Token.
Token ID13
Feature activation-4.41e-03

Pos logit contributions
 afraid+0.007
 worried+0.007
 foolish+0.007
 scared+0.007
 careless+0.007
Neg logit contributions
irs-0.007
ide-0.007
 occasions-0.006
aff-0.006
of-0.005
Token "
Token ID366
Feature activation-1.76e-03

Pos logit contributions
 sorry+0.008
 afraid+0.007
 worried+0.007
 scared+0.007
 okay+0.007
Neg logit contributions
irs-0.010
ide-0.009
 occasions-0.008
rones-0.008
growth-0.008
TokenYes
Token ID5297
Feature activation-5.35e-02

Pos logit contributions
 afraid+0.003
 worried+0.003
 sorry+0.003
 scared+0.003
 okay+0.003
Neg logit contributions
irs-0.004
ide-0.003
rones-0.003
 occasions-0.003
aff-0.003
Token,
Token ID11
Feature activation-5.67e-02

Pos logit contributions
 afraid+0.073
 worried+0.067
 scared+0.064
 sorry+0.064
 foolish+0.059
Neg logit contributions
irs-0.132
ide-0.127
 occasions-0.114
of-0.114
aff-0.114
Token it
Token ID340
Feature activation-5.89e-03

Pos logit contributions
 afraid+0.096
 worried+0.091
 scared+0.086
 sorry+0.085
 careless+0.080
Neg logit contributions
irs-0.118
ide-0.112
 occasions-0.102
of-0.102
 parties-0.099
Token is
Token ID318
Feature activation-3.87e-02

Pos logit contributions
 afraid+0.010
 sorry+0.010
 worried+0.009
 oh+0.009
 okay+0.009
Neg logit contributions
irs-0.012
ide-0.012
rones-0.010
 occasions-0.010
of-0.010
Token,"
Token ID553
Feature activation+3.08e-02

Pos logit contributions
 afraid+0.062
 worried+0.058
 sorry+0.058
 scared+0.054
 careless+0.053
Neg logit contributions
irs-0.083
ide-0.077
 occasions-0.071
rones-0.069
 parties-0.068
Token her
Token ID607
Feature activation-5.53e-03

Pos logit contributions
irs+0.060
ide+0.056
 occasions+0.051
rones+0.051
aff+0.049
Neg logit contributions
 afraid-0.057
 worried-0.054
 sorry-0.054
 scared-0.052
 okay-0.051
Token mom
Token ID1995
Feature activation-1.95e-03

Pos logit contributions
 afraid+0.010
 worried+0.009
 sorry+0.009
 scared+0.009
 oh+0.009
Neg logit contributions
irs-0.011
ide-0.010
rones-0.009
 occasions-0.009
aff-0.009
Token time
Token ID640
Feature activation-7.87e-02

Pos logit contributions
 afraid+0.035
 sorry+0.034
 worried+0.033
 scared+0.032
 oh+0.030
Neg logit contributions
irs-0.051
ide-0.050
rones-0.045
 occasions-0.044
growth-0.043
Token,
Token ID11
Feature activation-6.06e-02

Pos logit contributions
 afraid+0.104
 worried+0.099
 sorry+0.098
 scared+0.092
 foolish+0.089
Neg logit contributions
irs-0.193
ide-0.190
rones-0.166
of-0.165
 occasions-0.163
Token there
Token ID612
Feature activation+6.24e-03

Pos logit contributions
 afraid+0.080
 sorry+0.078
 worried+0.076
 guilty+0.071
 scared+0.070
Neg logit contributions
ide-0.144
irs-0.138
rones-0.121
of-0.118
growth-0.116
Token was
Token ID373
Feature activation-6.11e-02

Pos logit contributions
irs+0.016
ide+0.015
 occasions+0.014
aff+0.014
rones+0.014
Neg logit contributions
 afraid-0.008
 worried-0.008
 scared-0.008
 sorry-0.007
 very-0.007
Token a
Token ID257
Feature activation-2.40e-02

Pos logit contributions
 afraid+0.094
 worried+0.091
 sorry+0.089
 scared+0.083
 foolish+0.081
Neg logit contributions
irs-0.134
ide-0.129
rones-0.121
growth-0.119
 parties-0.112
Token kid
Token ID5141
Feature activation-5.37e-02

Pos logit contributions
 afraid+0.044
 okay+0.041
 sorry+0.041
 oh+0.041
 worried+0.040
Neg logit contributions
ide-0.038
irs-0.034
aff-0.034
rones-0.031
of-0.031
Token named
Token ID3706
Feature activation-4.22e-02

Pos logit contributions
 afraid+0.069
 sorry+0.068
 worried+0.066
 okay+0.064
 scared+0.061
Neg logit contributions
ide-0.130
irs-0.129
 occasions-0.116
rones-0.115
aff-0.113
Token Tim
Token ID5045
Feature activation+3.72e-03

Pos logit contributions
 afraid+0.050
 sorry+0.049
 worried+0.048
 scared+0.046
 okay+0.046
Neg logit contributions
irs-0.111
ide-0.104
 occasions-0.101
rones-0.099
growth-0.098
Tokenmy
Token ID1820
Feature activation-1.31e-02

Pos logit contributions
irs+0.010
 occasions+0.009
growth+0.008
aff+0.008
ide+0.008
Neg logit contributions
 afraid-0.005
 worried-0.005
 very-0.005
 scared-0.005
 mean-0.005
Token.
Token ID13
Feature activation-2.73e-02

Pos logit contributions
 afraid+0.023
 worried+0.022
 sorry+0.022
 scared+0.022
 foolish+0.021
Neg logit contributions
irs-0.028
ide-0.026
 occasions-0.024
rones-0.023
aff-0.023
Token Tim
Token ID5045
Feature activation+1.13e-02

Pos logit contributions
 afraid+0.038
 sorry+0.037
 worried+0.037
 scared+0.035
 okay+0.034
Neg logit contributions
irs-0.067
ide-0.063
 occasions-0.058
rones-0.058
growth-0.056
Token not
Token ID407
Feature activation-3.45e-02

Pos logit contributions
 oh+0.027
 afraid+0.026
 worried+0.024
 sorry+0.023
 foolish+0.022
Neg logit contributions
irs-0.063
ide-0.060
 parties-0.055
rones-0.055
aff-0.054
Token cooked
Token ID15847
Feature activation+3.52e-02

Pos logit contributions
 afraid+0.044
 sorry+0.042
 worried+0.042
 scared+0.039
 oh+0.038
Neg logit contributions
irs-0.084
ide-0.080
rones-0.074
 occasions-0.073
aff-0.072
Token.
Token ID13
Feature activation-1.65e-02

Pos logit contributions
irs+0.077
ide+0.070
rones+0.067
growth+0.066
 occasions+0.066
Neg logit contributions
 afraid-0.057
 sorry-0.056
 worried-0.055
 okay-0.054
 scared-0.054
Token They
Token ID1119
Feature activation+1.35e-02

Pos logit contributions
 afraid+0.029
 sorry+0.029
 worried+0.028
 okay+0.028
 scared+0.027
Neg logit contributions
irs-0.034
ide-0.032
 occasions-0.029
rones-0.028
aff-0.028
Token are
Token ID389
Feature activation-2.46e-02

Pos logit contributions
irs+0.027
ide+0.025
rones+0.022
 occasions+0.022
aff+0.022
Neg logit contributions
 afraid-0.025
 sorry-0.025
 worried-0.024
 scared-0.024
 okay-0.023
Token just
Token ID655
Feature activation-5.19e-02

Pos logit contributions
 oh+0.024
 afraid+0.023
 worried+0.021
 sorry+0.020
 foolish+0.020
Neg logit contributions
irs-0.059
ide-0.056
 parties-0.053
rones-0.052
aff-0.051
Token cookies
Token ID14746
Feature activation-2.18e-02

Pos logit contributions
 afraid+0.063
 sorry+0.060
 worried+0.060
 oh+0.058
 okay+0.056
Neg logit contributions
irs-0.122
ide-0.115
rones-0.107
 occasions-0.106
aff-0.105
Token."
Token ID526
Feature activation+1.92e-02

Pos logit contributions
 afraid+0.041
 worried+0.040
 sorry+0.040
 scared+0.038
 foolish+0.037
Neg logit contributions
irs-0.041
ide-0.039
 occasions-0.035
rones-0.035
aff-0.034
Token
Token ID198
Feature activation+6.08e-02

Pos logit contributions
irs+0.039
ide+0.036
 occasions+0.032
rones+0.032
aff+0.031
Neg logit contributions
 afraid-0.036
 sorry-0.036
 worried-0.035
 scared-0.035
 okay-0.033
Token
Token ID198
Feature activation+5.12e-02

Pos logit contributions
irs+0.148
ide+0.140
 occasions+0.137
ware+0.134
 parties+0.133
Neg logit contributions
 afraid-0.086
 sorry-0.083
 scared-0.082
 worried-0.080
 very-0.080
TokenMom
Token ID29252
Feature activation-3.83e-03

Pos logit contributions
irs+0.102
ide+0.093
rones+0.088
aff+0.088
growth+0.086
Neg logit contributions
 afraid-0.095
 sorry-0.092
 worried-0.091
 scared-0.089
 okay-0.086