TOP ACTIVATIONS
MAX = 0.098

took it out and gave Sue a new , sweet candy
cleans their hand and gives them a band - aid .
mom if they could give him a treat . Her mom
the toy and even gave Tommy a hug .
He was brave and gave Sue some gum . They both
not want to give the chain . They liked the chain
Tim my wanted to make Sally feel better , so he
cuts the lime and gives them each a slice . They
so happy and gave Mitt ens a big hug . From
help the puppy and gave him some food .
. Tim wanted to help Sue . He used his tool
her brother he must give Jane some candy . But her
Tommy said yes and gave her a cup of juice .
girl was happy and gave Max a big hug . Max
ran after him and gave him back his lucky marble .
happy to help and gave Sally a carrot to eat .
promised to always keep his wallet safe . He realized that
for Tim my and gave him a hug . She told
does not want to give Sam any of his meat .
worry and promised to buy him a new one when they

MIN ACTIVATIONS
MIN = -0.128

. She looked around the vast ocean and didn 't see
Finally , they reached a vast stretch of beach . As
in the middle of a vast field . One
colourful flowers , through the lush forests deep in thought .
She loved nature - the lush green grass , the orange
it sh ined in the bright sun . One day ,
fish who lived in the deep blue sea . She sw
. He lived in the deep blue sea and liked to
sorts of adventures in the vast ocean . She was happy
. He lived in the deep blue sea . He had
loved to swim in the deep blue sea . One day
a small dot in the dark night sky . Soon after
. He could see a vast valley below . He decided
tw ink led over the dark town , glowing happily .
was running around on the lush grass , enjoying the sunshine
tw ink led in the dark sky , it was so
colors shimmer ing in the dark . L ola gasped with
a boat ride in the deep lake ". Lucy quickly said
boat , fishing in the deep blue sea . Suddenly ,
, through the lush forests deep in thought . He admired

INTERVAL 0.042 - 0.098
CONTAINS 2.671%

She was so proud of her qu ilt , she showed
are very kind to help your sister . But you should
watching and protecting the c actus . The bee buzz ed
Mom hugged Lily and Ben . She said , "
new toy soldier . He was very sad and started to

INTERVAL -0.015 - 0.042
CONTAINS 75.476%

disturb any of their animal friends . <|endoftext|> Once upon a
They searched the park quietly and finally found the
his story . Ann went home that day feeling
loved to play outside with his friend , Max . One
and blink in the sunshine . One day , when the

INTERVAL -0.071 - -0.015
CONTAINS 21.648%

ran around and around , climbed trees , and laughed and
Once it was all clean , she put a new
very special to L ila , and it seemed like nothing
dragon made a fire too close to the edge of the
time , they found him safe and sound on the other

INTERVAL -0.128 - -0.071
CONTAINS 0.205%

flying by and said , " Why don 't you forgive
proud of him and said " yes , you guessed it
Suddenly , she heard a loud qu ack ! Lucy pe
The queen said Yes ,
: Thanks . I try to
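The four INTERVAL headers above look like equal-width slices of the range between MIN (-0.128) and MAX (0.098), and the CONTAINS figures sum to 100%. The sketch below shows how such edges and shares could be computed. It assumes equal-width binning; the function names and the toy sample are hypothetical and are not taken from the tool that produced this dump.

```python
from bisect import bisect_right


def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges running from lo to hi."""
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins)]
    edges.append(hi)  # use hi itself so the last edge is exact
    return edges


def interval_shares(activations, edges):
    """Percentage of activations falling in each interval between edges."""
    counts = [0] * (len(edges) - 1)
    for a in activations:
        # bisect_right - 1 is the interval index; clamp so that values
        # equal to the top edge (or outside the range) land in an end bin.
        i = min(max(bisect_right(edges, a) - 1, 0), len(counts) - 1)
        counts[i] += 1
    total = len(activations)
    return [100.0 * c / total for c in counts]


# Edges derived from the MIN and MAX values reported in this dump.
edges = interval_edges(-0.128, 0.098)

# Hypothetical toy activations, only to exercise the function.
toy = [-0.128, -0.1, -0.05, 0.0, 0.01, 0.02, 0.05, 0.098]
shares = interval_shares(toy, edges)
```

With MIN and MAX as printed, the interior edges come out near -0.0715, -0.015 and 0.0415, which matches the interval headers up to rounding of the displayed values.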
Token took
Token ID1718
Feature activation+3.33e-02

Pos logit contributions
 chores+0.004
 homework+0.004
ussy+0.004
 worse+0.004
 infected+0.004
Neg logit contributions
 exp-0.004
bye-0.003
 oceans-0.003
ard-0.003
 plains-0.003
Token it
Token ID340
Feature activation+6.54e-02

Pos logit contributions
ussy+0.051
 homework+0.049
ggy+0.048
 infected+0.047
abella+0.045
Neg logit contributions
bye-0.059
 oceans-0.054
 goodness-0.052
 breeze-0.052
 course-0.052
Token out
Token ID503
Feature activation+2.48e-02

Pos logit contributions
ggy+0.109
ussy+0.106
abella+0.104
 infected+0.100
orman+0.096
Neg logit contributions
bye-0.105
 oceans-0.096
 exp-0.096
amb-0.091
 breeze-0.088
Token and
Token ID290
Feature activation-8.12e-03

Pos logit contributions
 not+0.041
ggy+0.040
 infected+0.039
 worse+0.039
ussy+0.038
Neg logit contributions
 exp-0.040
bye-0.040
 oceans-0.039
 plains-0.036
amb-0.035
Token gave
Token ID2921
Feature activation+2.78e-02

Pos logit contributions
 exp+0.014
 oceans+0.014
bye+0.014
 plains+0.012
 lakes+0.012
Neg logit contributions
ggy-0.012
ussy-0.012
 not-0.012
 infected-0.012
 chores-0.011
Token Sue
Token ID26089
Feature activation+9.83e-02

Pos logit contributions
ussy+0.048
 homework+0.047
 infected+0.046
ggy+0.045
 chores+0.044
Neg logit contributions
bye-0.045
 course-0.041
 goodness-0.039
 oceans-0.039
 breeze-0.038
Token a
Token ID257
Feature activation+4.64e-02

Pos logit contributions
ggy+0.171
orman+0.167
ussy+0.163
 losing+0.159
ggies+0.159
Neg logit contributions
bye-0.130
 oceans-0.117
 exp-0.116
 plains-0.113
 course-0.113
Token new
Token ID649
Feature activation+2.97e-02

Pos logit contributions
ussy+0.059
ggy+0.056
 not+0.055
 infected+0.054
abella+0.054
Neg logit contributions
 exp-0.091
bye-0.087
 breeze-0.083
 oceans-0.082
ard-0.080
Token,
Token ID11
Feature activation-7.94e-03

Pos logit contributions
 not+0.040
ggy+0.039
ussy+0.038
 infected+0.038
 worse+0.038
Neg logit contributions
 exp-0.057
bye-0.056
 oceans-0.055
 plains-0.051
amb-0.050
Token sweet
Token ID6029
Feature activation+4.70e-03

Pos logit contributions
bye+0.018
 exp+0.018
amb+0.018
 oceans+0.017
 plains+0.017
Neg logit contributions
 not-0.010
 okay-0.009
 worse-0.009
 bad-0.008
 infected-0.008
Token candy
Token ID18550
Feature activation+4.77e-02

Pos logit contributions
 not+0.007
ggy+0.006
 infected+0.006
 worse+0.006
 bad+0.006
Neg logit contributions
 oceans-0.009
 exp-0.009
bye-0.009
 plains-0.008
 lakes-0.008
Token cleans
Token ID20658
Feature activation+1.91e-02

Pos logit contributions
 not+0.009
ggy+0.009
 infected+0.009
 worse+0.009
ussy+0.009
Neg logit contributions
 exp-0.010
bye-0.010
 oceans-0.010
 plains-0.009
amb-0.009
Token their
Token ID511
Feature activation+4.92e-02

Pos logit contributions
ggy+0.033
 infected+0.032
oll+0.032
 supposed+0.031
ussy+0.031
Neg logit contributions
 oceans-0.028
 exp-0.027
 lakes-0.026
bye-0.026
 rivers-0.025
Token hand
Token ID1021
Feature activation+2.20e-02

Pos logit contributions
ggy+0.075
 not+0.073
ussy+0.073
 worse+0.071
ver+0.071
Neg logit contributions
 exp-0.086
 oceans-0.084
bye-0.079
 plains-0.076
 lakes-0.076
Token and
Token ID290
Feature activation+4.41e-03

Pos logit contributions
 not+0.031
ggy+0.030
 infected+0.030
 worse+0.029
ussy+0.029
Neg logit contributions
bye-0.040
 exp-0.040
 oceans-0.040
 plains-0.037
amb-0.036
Token gives
Token ID3607
Feature activation+2.63e-02

Pos logit contributions
 not+0.006
 worse+0.005
 infected+0.005
 homework+0.005
ggy+0.005
Neg logit contributions
bye-0.009
 exp-0.009
 oceans-0.009
 plains-0.008
amb-0.008
Token them
Token ID606
Feature activation+9.16e-02

Pos logit contributions
ggy+0.047
ussy+0.046
 infected+0.045
 not+0.044
oll+0.043
Neg logit contributions
 oceans-0.038
bye-0.037
 course-0.035
 exp-0.033
 lakes-0.033
Token a
Token ID257
Feature activation+4.56e-02

Pos logit contributions
ggy+0.153
 infected+0.144
orman+0.140
ussy+0.140
 worse+0.135
Neg logit contributions
bye-0.144
 exp-0.141
 oceans-0.139
 lakes-0.130
 course-0.122
Token band
Token ID4097
Feature activation+2.50e-02

Pos logit contributions
ussy+0.062
 not+0.062
ggy+0.061
 infected+0.060
 worse+0.060
Neg logit contributions
 exp-0.085
bye-0.082
 oceans-0.080
 breeze-0.076
 plains-0.075
Token-
Token ID12
Feature activation+7.96e-03

Pos logit contributions
 not+0.038
ggy+0.037
 infected+0.036
 worse+0.036
ussy+0.035
Neg logit contributions
 exp-0.044
 oceans-0.043
bye-0.043
 plains-0.040
amb-0.039
Tokenaid
Token ID1698
Feature activation+4.71e-02

Pos logit contributions
ggy+0.008
 not+0.008
 infected+0.007
 worse+0.007
ussy+0.007
Neg logit contributions
 exp-0.019
 oceans-0.018
 plains-0.017
bye-0.017
 lakes-0.017
Token.
Token ID13
Feature activation+1.35e-02

Pos logit contributions
 not+0.058
ggy+0.056
 infected+0.055
 worse+0.054
ussy+0.053
Neg logit contributions
 exp-0.095
bye-0.095
 oceans-0.094
 plains-0.088
amb-0.087
Token mom
Token ID1995
Feature activation+4.87e-02

Pos logit contributions
 infected+0.103
gh+0.101
ggy+0.097
ggies+0.097
aned+0.095
Neg logit contributions
bye-0.076
 exp-0.074
 sea-0.073
 Thank-0.071
 and-0.070
Token if
Token ID611
Feature activation+2.29e-02

Pos logit contributions
ggy+0.096
 infected+0.086
ggie+0.081
orman+0.081
hes+0.079
Neg logit contributions
 oceans-0.066
 exp-0.066
bye-0.064
 lakes-0.058
ard-0.057
Token they
Token ID484
Feature activation+2.09e-02

Pos logit contributions
ussy+0.036
ggy+0.035
 infected+0.035
 not+0.034
 worse+0.033
Neg logit contributions
bye-0.037
 oceans-0.037
 exp-0.036
 lakes-0.033
 plains-0.032
Token could
Token ID714
Feature activation+2.42e-02

Pos logit contributions
ggy+0.042
 infected+0.039
ussy+0.039
 homework+0.038
 chores+0.037
Neg logit contributions
bye-0.029
 exp-0.027
 oceans-0.024
ard-0.023
 course-0.021
Token give
Token ID1577
Feature activation+4.16e-02

Pos logit contributions
ussy+0.040
abella+0.040
ggy+0.040
 infected+0.039
 worse+0.039
Neg logit contributions
bye-0.039
 exp-0.036
amb-0.032
 oceans-0.031
 course-0.031
Token him
Token ID683
Feature activation+9.06e-02

Pos logit contributions
ussy+0.072
ggy+0.072
 not+0.068
oll+0.067
 infected+0.066
Neg logit contributions
bye-0.065
 oceans-0.059
 exp-0.057
 goodness-0.054
ard-0.053
Token a
Token ID257
Feature activation+5.17e-02

Pos logit contributions
ggy+0.131
ussy+0.129
 infected+0.122
orman+0.120
abella+0.120
Neg logit contributions
bye-0.163
 exp-0.149
 oceans-0.148
 course-0.141
 plains-0.137
Token treat
Token ID2190
Feature activation+4.36e-02

Pos logit contributions
ussy+0.086
ggy+0.085
 not+0.083
abella+0.082
 infected+0.080
Neg logit contributions
bye-0.082
 exp-0.081
 breeze-0.074
 oceans-0.069
ard-0.067
Token.
Token ID13
Feature activation+2.11e-02

Pos logit contributions
 not+0.057
ggy+0.055
 infected+0.055
 worse+0.053
ussy+0.053
Neg logit contributions
bye-0.085
 exp-0.083
 oceans-0.082
amb-0.077
 plains-0.077
Token Her
Token ID2332
Feature activation+6.76e-02

Pos logit contributions
 not+0.029
 mom+0.026
ggy+0.025
 infected+0.025
 worse+0.025
Neg logit contributions
 oceans-0.043
bye-0.040
 exp-0.040
 plains-0.039
 lakes-0.039
Token mom
Token ID1995
Feature activation+1.06e-02

Pos logit contributions
ussy+0.126
abella+0.125
aches+0.113
hes+0.110
illi+0.107
Neg logit contributions
bye-0.101
 exp-0.081
 lake-0.079
 tune-0.079
 goodness-0.078
Token the
Token ID262
Feature activation+4.41e-02

Pos logit contributions
ggy+0.030
 infected+0.028
ussy+0.028
 worse+0.027
 not+0.027
Neg logit contributions
 exp-0.042
bye-0.040
 oceans-0.039
 course-0.039
amb-0.039
Token toy
Token ID13373
Feature activation+5.88e-02

Pos logit contributions
abella+0.068
ver+0.066
ussy+0.065
ggy+0.064
aches+0.059
Neg logit contributions
 exp-0.075
bye-0.071
 goodness-0.070
ard-0.067
 oceans-0.065
Token and
Token ID290
Feature activation+2.02e-02

Pos logit contributions
ggy+0.089
 infected+0.085
 not+0.084
abella+0.082
ussy+0.081
Neg logit contributions
bye-0.099
 exp-0.097
 oceans-0.095
ard-0.089
amb-0.088
Token even
Token ID772
Feature activation+3.79e-02

Pos logit contributions
 not+0.024
ggy+0.023
 infected+0.023
 worse+0.023
ussy+0.022
Neg logit contributions
 exp-0.041
bye-0.041
 oceans-0.041
 plains-0.038
amb-0.038
Token gave
Token ID2921
Feature activation+2.83e-02

Pos logit contributions
ggy+0.055
ussy+0.050
 not+0.050
 infected+0.049
orman+0.047
Neg logit contributions
bye-0.067
 exp-0.066
 oceans-0.064
 course-0.060
 breeze-0.059
Token Tommy
Token ID19919
Feature activation+9.03e-02

Pos logit contributions
ggy+0.055
ussy+0.053
 not+0.050
 infected+0.050
 homework+0.050
Neg logit contributions
bye-0.037
 goodness-0.033
 course-0.033
 oceans-0.031
 exp-0.029
Token a
Token ID257
Feature activation+6.08e-02

Pos logit contributions
ggy+0.142
ussy+0.140
abella+0.136
orman+0.135
 infected+0.130
Neg logit contributions
bye-0.136
 exp-0.120
 oceans-0.118
 course-0.114
ard-0.110
Token hug
Token ID16225
Feature activation+1.02e-02

Pos logit contributions
ussy+0.087
ggy+0.085
abella+0.081
 not+0.078
orman+0.077
Neg logit contributions
 exp-0.108
bye-0.103
 breeze-0.100
 course-0.092
 oceans-0.091
Token.
Token ID13
Feature activation+5.54e-04

Pos logit contributions
 not+0.013
ggy+0.012
 infected+0.012
 worse+0.012
ussy+0.011
Neg logit contributions
 exp-0.021
bye-0.021
 oceans-0.021
 plains-0.019
amb-0.019
Token\n
Token ID198
Feature activation+2.54e-02

Pos logit contributions
 not+0.001
ggy+0.001
 infected+0.001
 worse+0.001
ussy+0.001
Neg logit contributions
 exp-0.001
bye-0.001
 oceans-0.001
 plains-0.001
amb-0.001
Token\n
Token ID198
Feature activation+2.39e-03

Pos logit contributions
 worse+0.052
 infected+0.050
ussy+0.049
 okay+0.048
 chores+0.047
Neg logit contributions
bye-0.031
 exp-0.029
amb-0.029
 plains-0.027
 oceans-0.027
Token He
Token ID679
Feature activation+5.53e-03

Pos logit contributions
bye+0.011
 exp+0.011
 oceans+0.010
 plains+0.010
amb+0.009
Neg logit contributions
 infected-0.009
 worse-0.009
 not-0.009
ggy-0.009
ussy-0.009
Token was
Token ID373
Feature activation+2.66e-02

Pos logit contributions
 infected+0.010
 worse+0.010
ussy+0.010
ggy+0.010
 chores+0.010
Neg logit contributions
 exp-0.008
bye-0.008
 oceans-0.007
 plains-0.007
ard-0.007
Token brave
Token ID14802
Feature activation-2.90e-02

Pos logit contributions
ork+0.047
ussy+0.045
 infected+0.043
orman+0.042
abella+0.042
Neg logit contributions
 oceans-0.037
bye-0.035
 breeze-0.033
 plains-0.033
 exp-0.033
Token and
Token ID290
Feature activation+1.55e-02

Pos logit contributions
 exp+0.058
bye+0.057
 oceans+0.054
amb+0.053
 plains+0.053
Neg logit contributions
 not-0.040
 bad-0.036
 worse-0.035
 okay-0.035
 mom-0.035
Token gave
Token ID2921
Feature activation+2.25e-02

Pos logit contributions
 not+0.021
ggy+0.020
 infected+0.020
 worse+0.020
ussy+0.019
Neg logit contributions
 exp-0.030
 oceans-0.029
bye-0.029
 plains-0.028
amb-0.027
Token Sue
Token ID26089
Feature activation+8.97e-02

Pos logit contributions
 infected+0.033
ussy+0.031
 chores+0.031
 severe+0.030
ggy+0.030
Neg logit contributions
bye-0.044
 oceans-0.038
 course-0.037
 exp-0.036
 plains-0.036
Token some
Token ID617
Feature activation+4.60e-02

Pos logit contributions
 infected+0.140
 worse+0.138
 losing+0.138
ussy+0.136
abella+0.136
Neg logit contributions
bye-0.141
 oceans-0.130
 exp-0.129
 plains-0.128
 course-0.122
Token gum
Token ID27142
Feature activation+6.92e-02

Pos logit contributions
ggy+0.059
 not+0.058
ussy+0.056
 infected+0.054
abella+0.054
Neg logit contributions
bye-0.089
 exp-0.088
 oceans-0.086
ard-0.082
 goodness-0.080
Token.
Token ID13
Feature activation-7.08e-04

Pos logit contributions
 infected+0.114
ussy+0.112
ggy+0.111
 worse+0.110
abella+0.108
Neg logit contributions
bye-0.107
amb-0.100
 exp-0.094
 oceans-0.094
ard-0.093
Token They
Token ID1119
Feature activation+1.06e-02

Pos logit contributions
 exp+0.001
bye+0.001
 oceans+0.001
 plains+0.001
amb+0.001
Neg logit contributions
 not-0.001
ggy-0.001
 infected-0.001
 worse-0.001
ussy-0.001
Token both
Token ID1111
Feature activation+1.13e-04

Pos logit contributions
ggy+0.021
 infected+0.021
 worse+0.020
ussy+0.020
 not+0.020
Neg logit contributions
bye-0.014
 exp-0.014
 oceans-0.014
 plains-0.012
amb-0.012
Token not
Token ID407
Feature activation+1.16e-02

Pos logit contributions
ussy+0.051
oll+0.047
idget+0.040
acey+0.039
aila+0.039
Neg logit contributions
bye-0.079
 exp-0.078
 course-0.078
 oceans-0.076
 goodness-0.074
Token want
Token ID765
Feature activation+1.77e-02

Pos logit contributions
 infected+0.024
ussy+0.024
abella+0.023
ggy+0.023
 homework+0.022
Neg logit contributions
bye-0.013
 exp-0.012
 breeze-0.011
 oceans-0.011
hello-0.010
Token to
Token ID284
Feature activation+1.53e-02

Pos logit contributions
ussy+0.035
 infected+0.034
oll+0.033
ggy+0.033
illi+0.031
Neg logit contributions
bye-0.024
 oceans-0.023
 exp-0.021
 lakes-0.020
 course-0.020
Token give
Token ID1577
Feature activation+4.28e-02

Pos logit contributions
ussy+0.028
oll+0.028
 infected+0.027
abella+0.027
hes+0.026
Neg logit contributions
bye-0.024
 exp-0.021
 breeze-0.018
 course-0.018
amb-0.018
Token the
Token ID262
Feature activation+1.00e-02

Pos logit contributions
oll+0.077
ussy+0.074
 infected+0.073
 severe+0.070
ggy+0.070
Neg logit contributions
bye-0.068
 oceans-0.067
 lakes-0.061
 rivers-0.060
 exp-0.058
Token chain
Token ID6333
Feature activation+8.93e-02

Pos logit contributions
ussy+0.017
 infected+0.017
ggy+0.017
 worse+0.016
 not+0.016
Neg logit contributions
bye-0.016
 oceans-0.015
 exp-0.015
 plains-0.014
ard-0.014
Token.
Token ID13
Feature activation-1.87e-03

Pos logit contributions
abella+0.152
ussy+0.144
 infected+0.139
ggy+0.138
illi+0.136
Neg logit contributions
bye-0.145
 oceans-0.117
ard-0.115
 exp-0.114
 breeze-0.114
Token They
Token ID1119
Feature activation+5.22e-03

Pos logit contributions
 exp+0.003
bye+0.003
 oceans+0.003
 plains+0.003
amb+0.003
Neg logit contributions
 not-0.003
ggy-0.003
 worse-0.003
 okay-0.003
 bad-0.003
Token liked
Token ID8288
Feature activation+4.58e-03

Pos logit contributions
ussy+0.008
ggy+0.008
 infected+0.008
 homework+0.008
 worse+0.008
Neg logit contributions
 exp-0.008
bye-0.008
 oceans-0.008
 plains-0.007
amb-0.007
Token the
Token ID262
Feature activation+6.22e-03

Pos logit contributions
ussy+0.007
ggy+0.006
 infected+0.006
oll+0.006
 worse+0.006
Neg logit contributions
 oceans-0.009
 exp-0.008
bye-0.008
 plains-0.008
 lakes-0.008
Token chain
Token ID6333
Feature activation+5.86e-02

Pos logit contributions
 not+0.009
ggy+0.009
 infected+0.008
 worse+0.008
ussy+0.008
Neg logit contributions
bye-0.011
 exp-0.011
 oceans-0.011
 plains-0.011
amb-0.010
Token Tim
Token ID5045
Feature activation+3.14e-02

Pos logit contributions
 not+0.001
ggy+0.001
 infected+0.001
 worse+0.001
ussy+0.001
Neg logit contributions
 exp-0.002
bye-0.002
 oceans-0.002
 plains-0.002
amb-0.001
Tokenmy
Token ID1820
Feature activation+2.45e-02

Pos logit contributions
ggy+0.060
 not+0.056
 chores+0.055
 worse+0.053
orman+0.053
Neg logit contributions
 exp-0.042
bye-0.042
ard-0.039
 oceans-0.038
amb-0.038
Token wanted
Token ID2227
Feature activation+2.99e-02

Pos logit contributions
ggy+0.036
ussy+0.035
 infected+0.035
 not+0.035
 worse+0.034
Neg logit contributions
 exp-0.042
bye-0.042
 oceans-0.040
ard-0.038
amb-0.038
Token to
Token ID284
Feature activation+1.72e-02

Pos logit contributions
ussy+0.066
ggy+0.064
oll+0.063
bles+0.061
kin+0.060
Neg logit contributions
 oceans-0.027
bye-0.026
 course-0.026
 exp-0.025
 plains-0.022
Token make
Token ID787
Feature activation+4.74e-02

Pos logit contributions
 not+0.025
ggy+0.025
 infected+0.024
 worse+0.024
ussy+0.024
Neg logit contributions
 exp-0.031
bye-0.030
 oceans-0.030
 plains-0.028
amb-0.027
Token Sally
Token ID25737
Feature activation+8.92e-02

Pos logit contributions
ussy+0.079
ver+0.077
abella+0.072
ggy+0.071
 infected+0.069
Neg logit contributions
 goodness-0.069
 course-0.062
 exp-0.059
bye-0.059
 oceans-0.059
Token feel
Token ID1254
Feature activation+4.07e-02

Pos logit contributions
abella+0.165
ggy+0.151
orman+0.147
ussy+0.141
illi+0.137
Neg logit contributions
 exp-0.130
bye-0.122
 breeze-0.115
 oceans-0.115
 tune-0.115
Token better
Token ID1365
Feature activation+1.99e-03

Pos logit contributions
ggy+0.035
ussy+0.031
abella+0.031
orman+0.028
 not+0.027
Neg logit contributions
 exp-0.098
bye-0.097
 oceans-0.094
 breeze-0.091
 course-0.090
Token,
Token ID11
Feature activation-1.42e-02

Pos logit contributions
 not+0.003
ggy+0.003
 infected+0.003
 worse+0.003
ussy+0.003
Neg logit contributions
bye-0.003
 exp-0.003
 oceans-0.003
 plains-0.003
amb-0.003
Token so
Token ID523
Feature activation+2.52e-02

Pos logit contributions
 exp+0.031
bye+0.031
 oceans+0.030
 plains+0.030
amb+0.029
Neg logit contributions
 not-0.020
 worse-0.018
 okay-0.017
 bad-0.016
 infected-0.015
Token he
Token ID339
Feature activation+2.02e-02

Pos logit contributions
ussy+0.055
illi+0.048
 infected+0.047
 homework+0.047
orman+0.047
Neg logit contributions
bye-0.032
 exp-0.025
 oceans-0.025
 course-0.024
amb-0.024
Token cuts
Token ID6630
Feature activation+3.08e-02

Pos logit contributions
ggy+0.039
 not+0.038
 infected+0.038
 worse+0.036
ussy+0.036
Neg logit contributions
 exp-0.037
bye-0.037
 oceans-0.037
 plains-0.033
 course-0.033
Token the
Token ID262
Feature activation+2.65e-02

Pos logit contributions
ggy+0.041
 not+0.039
 infected+0.038
ussy+0.038
 homework+0.036
Neg logit contributions
 oceans-0.058
bye-0.058
 exp-0.057
 plains-0.053
ard-0.052
Token lime
Token ID28738
Feature activation+6.48e-02

Pos logit contributions
ggy+0.037
ussy+0.036
 not+0.036
 worse+0.035
 infected+0.035
Neg logit contributions
 exp-0.049
bye-0.048
 oceans-0.047
ard-0.044
 plains-0.044
Token and
Token ID290
Feature activation+1.14e-05

Pos logit contributions
ggy+0.096
 infected+0.095
ussy+0.094
 not+0.094
abella+0.091
Neg logit contributions
bye-0.108
 oceans-0.107
 exp-0.106
ard-0.098
 plains-0.097
Token gives
Token ID3607
Feature activation+1.98e-02

Pos logit contributions
 not+0.000
 worse+0.000
 infected+0.000
ggy+0.000
 homework+0.000
Neg logit contributions
bye-0.000
 exp-0.000
 oceans-0.000
 plains-0.000
amb-0.000
Token them
Token ID606
Feature activation+8.89e-02

Pos logit contributions
ggy+0.029
 not+0.028
 infected+0.028
ussy+0.028
 worse+0.026
Neg logit contributions
bye-0.035
 oceans-0.035
 exp-0.033
 plains-0.031
 course-0.031
Token each
Token ID1123
Feature activation+3.32e-02

Pos logit contributions
ggy+0.148
 infected+0.137
orman+0.137
ussy+0.136
abella+0.133
Neg logit contributions
bye-0.142
 oceans-0.135
 exp-0.135
 lakes-0.123
 course-0.121
Token a
Token ID257
Feature activation+3.79e-02

Pos logit contributions
ggy+0.037
 not+0.037
 infected+0.036
ussy+0.035
 worse+0.035
Neg logit contributions
bye-0.071
 exp-0.069
 oceans-0.069
 plains-0.065
amb-0.063
Token slice
Token ID16416
Feature activation+3.91e-02

Pos logit contributions
 not+0.049
ggy+0.047
 infected+0.046
 worse+0.046
ussy+0.045
Neg logit contributions
 exp-0.075
bye-0.074
 oceans-0.073
 plains-0.069
amb-0.067
Token.
Token ID13
Feature activation+1.57e-03

Pos logit contributions
 not+0.047
ggy+0.044
 worse+0.043
 infected+0.043
ussy+0.041
Neg logit contributions
 exp-0.083
 oceans-0.082
bye-0.080
 plains-0.077
amb-0.074
Token They
Token ID1119
Feature activation+5.94e-03

Pos logit contributions
 not+0.003
 infected+0.002
 worse+0.002
ggy+0.002
 okay+0.002
Neg logit contributions
bye-0.003
 oceans-0.003
 exp-0.003
 plains-0.003
ard-0.002
Token so
Token ID523
Feature activation+4.83e-02

Pos logit contributions
ussy+0.076
 homework+0.073
abella+0.071
 chores+0.071
ork+0.067
Neg logit contributions
 oceans-0.058
 exp-0.056
bye-0.054
 tune-0.050
 rivers-0.049
Token happy
Token ID3772
Feature activation-5.24e-03

Pos logit contributions
ussy+0.069
abella+0.066
ggy+0.066
 homework+0.066
 infected+0.065
Neg logit contributions
bye-0.087
 exp-0.085
 oceans-0.081
amb-0.077
 breeze-0.074
Token and
Token ID290
Feature activation+1.07e-02

Pos logit contributions
 exp+0.011
bye+0.011
 oceans+0.011
 plains+0.010
amb+0.010
Neg logit contributions
 not-0.008
ggy-0.006
 worse-0.006
 bad-0.006
 mom-0.006
Token gave
Token ID2921
Feature activation+3.89e-02

Pos logit contributions
 not+0.016
ggy+0.015
 infected+0.015
 worse+0.015
ussy+0.015
Neg logit contributions
 exp-0.019
bye-0.019
 oceans-0.019
 plains-0.017
amb-0.017
Token Mitt
Token ID16627
Feature activation+1.86e-02

Pos logit contributions
ggy+0.064
 homework+0.063
orman+0.059
ussy+0.058
aches+0.057
Neg logit contributions
bye-0.063
 course-0.057
 goodness-0.056
 breeze-0.054
 oceans-0.052
Tokenens
Token ID641
Feature activation+8.82e-02

Pos logit contributions
 not+0.025
ggy+0.023
 infected+0.023
 worse+0.022
 homework+0.021
Neg logit contributions
bye-0.037
 exp-0.036
 oceans-0.035
amb-0.033
ard-0.033
Token a
Token ID257
Feature activation+5.15e-02

Pos logit contributions
ussy+0.140
ggy+0.130
abella+0.130
 worse+0.129
 infected+0.126
Neg logit contributions
bye-0.145
 oceans-0.135
 exp-0.131
 plains-0.126
 lakes-0.123
Token big
Token ID1263
Feature activation+1.61e-02

Pos logit contributions
ussy+0.069
 not+0.063
 worse+0.063
 infected+0.062
ggy+0.062
Neg logit contributions
bye-0.099
 exp-0.098
 breeze-0.090
 oceans-0.089
 plains-0.086
Token hug
Token ID16225
Feature activation+5.01e-03

Pos logit contributions
 not+0.023
 infected+0.022
ggy+0.022
 worse+0.021
 bad+0.021
Neg logit contributions
 oceans-0.031
 exp-0.029
bye-0.029
 plains-0.028
 lakes-0.027
Token.
Token ID13
Feature activation+1.87e-02

Pos logit contributions
 not+0.006
ggy+0.006
 worse+0.005
 infected+0.005
ussy+0.005
Neg logit contributions
 exp-0.011
 oceans-0.011
bye-0.010
 plains-0.010
amb-0.010
Token From
Token ID3574
Feature activation+1.78e-02

Pos logit contributions
ggy+0.032
 not+0.032
 worse+0.032
 infected+0.031
ussy+0.031
Neg logit contributions
 exp-0.028
bye-0.028
 oceans-0.027
 plains-0.026
amb-0.025
Token help
Token ID1037
Feature activation+3.13e-02

Pos logit contributions
 exp+0.007
bye+0.007
 oceans+0.006
 plains+0.006
amb+0.006
Neg logit contributions
 infected-0.006
 not-0.006
ggy-0.006
 worse-0.005
ussy-0.005
Token the
Token ID262
Feature activation+1.56e-02

Pos logit contributions
ussy+0.052
oll+0.050
abella+0.049
ggy+0.047
 infected+0.047
Neg logit contributions
bye-0.052
 exp-0.049
 oceans-0.047
ard-0.045
 plains-0.045
Token puppy
Token ID26188
Feature activation+5.77e-02

Pos logit contributions
ussy+0.029
 infected+0.028
abella+0.027
ggy+0.027
aches+0.027
Neg logit contributions
bye-0.024
 exp-0.022
 oceans-0.020
amb-0.019
ard-0.019
Token and
Token ID290
Feature activation+1.67e-02

Pos logit contributions
abella+0.092
ussy+0.091
ggy+0.091
 infected+0.087
oll+0.084
Neg logit contributions
 exp-0.095
bye-0.092
 oceans-0.088
ard-0.084
 plains-0.082
Token gave
Token ID2921
Feature activation+2.99e-02

Pos logit contributions
 not+0.023
ggy+0.021
 worse+0.020
 infected+0.020
 bad+0.019
Neg logit contributions
 oceans-0.033
 exp-0.033
bye-0.033
 plains-0.031
amb-0.031
Token him
Token ID683
Feature activation+8.81e-02

Pos logit contributions
ussy+0.057
ggy+0.055
oll+0.053
 homework+0.052
 infected+0.051
Neg logit contributions
bye-0.047
 oceans-0.042
 exp-0.039
 course-0.038
 plains-0.038
Token some
Token ID617
Feature activation+3.71e-02

Pos logit contributions
ussy+0.138
ggy+0.136
abella+0.129
orman+0.127
 infected+0.125
Neg logit contributions
bye-0.154
 oceans-0.143
 exp-0.140
 lakes-0.132
 plains-0.131
Token food
Token ID2057
Feature activation+5.76e-02

Pos logit contributions
 not+0.046
ggy+0.044
ussy+0.043
 infected+0.042
 worse+0.042
Neg logit contributions
bye-0.075
 exp-0.074
 oceans-0.073
 plains-0.068
amb-0.068
Token.
Token ID13
Feature activation+9.06e-03

Pos logit contributions
 not+0.072
ggy+0.069
 infected+0.068
 worse+0.067
ussy+0.064
Neg logit contributions
 exp-0.117
 oceans-0.116
bye-0.115
 plains-0.109
amb-0.106
Token\n
Token ID198
Feature activation+2.92e-02

Pos logit contributions
 not+0.013
ggy+0.012
 infected+0.012
 worse+0.011
ussy+0.011
Neg logit contributions
 oceans-0.017
 exp-0.017
bye-0.017
 plains-0.016
amb-0.016
Token\n
Token ID198
Feature activation+8.15e-03

Pos logit contributions
 infected+0.056
ussy+0.055
 worse+0.055
 homework+0.053
 chores+0.052
Neg logit contributions
bye-0.038
 exp-0.036
 oceans-0.035
amb-0.034
 plains-0.034
Token.
Token ID13
Feature activation+2.74e-03

Pos logit contributions
 not+0.065
ggy+0.065
 infected+0.063
ussy+0.062
 worse+0.062
Neg logit contributions
bye-0.089
 exp-0.088
 oceans-0.087
 plains-0.082
ard-0.081
Token Tim
Token ID5045
Feature activation+2.90e-02

Pos logit contributions
 not+0.004
ggy+0.003
 mom+0.003
 okay+0.003
 bad+0.003
Neg logit contributions
 oceans-0.006
 exp-0.006
bye-0.005
 plains-0.005
amb-0.005
Token wanted
Token ID2227
Feature activation+3.18e-02

Pos logit contributions
ggy+0.046
 not+0.045
 infected+0.044
 worse+0.044
ussy+0.044
Neg logit contributions
 exp-0.048
bye-0.048
 oceans-0.047
 plains-0.043
ard-0.042
Token to
Token ID284
Feature activation+1.95e-02

Pos logit contributions
ussy+0.069
 infected+0.063
ggy+0.063
oll+0.063
bles+0.063
Neg logit contributions
bye-0.037
 oceans-0.032
 course-0.031
 exp-0.031
 plains-0.027
Token help
Token ID1037
Feature activation+3.53e-02

Pos logit contributions
 not+0.029
ggy+0.029
ussy+0.029
 infected+0.029
 worse+0.028
Neg logit contributions
bye-0.034
 exp-0.033
 oceans-0.032
 plains-0.029
amb-0.029
Token Sue
Token ID26089
Feature activation+8.81e-02

Pos logit contributions
 infected+0.051
ussy+0.050
abella+0.048
oll+0.048
 homework+0.047
Neg logit contributions
bye-0.066
 exp-0.060
 course-0.059
 oceans-0.058
 breeze-0.056
Token.
Token ID13
Feature activation+4.15e-03

Pos logit contributions
abella+0.203
ggy+0.186
ula+0.177
ussy+0.176
oll+0.175
Neg logit contributions
 exp-0.094
ard-0.078
bye-0.076
 oceans-0.074
 breeze-0.067
Token He
Token ID679
Feature activation+1.71e-02

Pos logit contributions
 not+0.006
ggy+0.005
 mom+0.005
 worse+0.005
 bad+0.005
Neg logit contributions
 oceans-0.008
 exp-0.008
bye-0.008
 plains-0.008
 lakes-0.007
Token used
Token ID973
Feature activation+3.73e-02

Pos logit contributions
ggy+0.031
ussy+0.031
 chores+0.030
 worse+0.030
 infected+0.030
Neg logit contributions
 exp-0.024
bye-0.023
 oceans-0.022
 plains-0.021
amb-0.020
Token his
Token ID465
Feature activation+6.28e-02

Pos logit contributions
ussy+0.064
ggy+0.062
abella+0.057
orman+0.056
oll+0.053
Neg logit contributions
 oceans-0.060
 exp-0.056
bye-0.053
 course-0.053
 lakes-0.052
Token tool
Token ID2891
Feature activation+5.24e-02

Pos logit contributions
ussy+0.088
abella+0.084
 worse+0.082
ggy+0.082
 infected+0.079
Neg logit contributions
 exp-0.113
bye-0.109
 oceans-0.108
 plains-0.104
ard-0.104
Token her
Token ID607
Feature activation+4.80e-02

Pos logit contributions
 infected+0.003
ggy+0.003
ussy+0.003
 not+0.003
 worse+0.003
Neg logit contributions
 exp-0.004
bye-0.004
 oceans-0.003
 course-0.003
 plains-0.003
Token brother
Token ID3956
Feature activation+4.26e-02

Pos logit contributions
ussy+0.079
 infected+0.077
idget+0.068
illi+0.068
hes+0.066
Neg logit contributions
bye-0.070
 exp-0.065
 oceans-0.057
 goodness-0.056
 manner-0.054
Token he
Token ID339
Feature activation+1.44e-02

Pos logit contributions
ussy+0.070
 infected+0.070
ggy+0.065
illi+0.062
orman+0.061
Neg logit contributions
bye-0.066
 exp-0.058
 course-0.056
 oceans-0.055
 goodness-0.052
Token must
Token ID1276
Feature activation+5.08e-03

Pos logit contributions
ula+0.024
 chores+0.023
 infected+0.022
ussy+0.022
 homework+0.022
Neg logit contributions
bye-0.021
ard-0.017
 exp-0.016
amb-0.015
 course-0.015
Token give
Token ID1577
Feature activation+3.26e-02

Pos logit contributions
 infected+0.007
ussy+0.007
 worse+0.006
ggy+0.006
abella+0.006
Neg logit contributions
bye-0.010
 exp-0.009
 course-0.008
amb-0.008
 oceans-0.008
Token Jane
Token ID12091
Feature activation+8.81e-02

Pos logit contributions
 infected+0.048
ggy+0.047
ussy+0.046
 homework+0.042
oll+0.042
Neg logit contributions
bye-0.055
 oceans-0.052
 course-0.051
 exp-0.050
 goodness-0.050
Token some
Token ID617
Feature activation+4.77e-02

Pos logit contributions
ggy+0.126
ussy+0.120
 infected+0.118
orman+0.117
abella+0.116
Neg logit contributions
bye-0.145
 oceans-0.138
 exp-0.135
 course-0.133
 plains-0.127
Token candy
Token ID18550
Feature activation+6.58e-02

Pos logit contributions
ggy+0.056
 not+0.055
ussy+0.052
abella+0.050
 infected+0.049
Neg logit contributions
bye-0.098
 exp-0.094
 oceans-0.092
 course-0.089
ard-0.089
Token.
Token ID13
Feature activation-6.41e-03

Pos logit contributions
ggy+0.093
 not+0.091
 infected+0.090
ussy+0.088
 worse+0.086
Neg logit contributions
bye-0.118
 exp-0.113
 oceans-0.112
ard-0.105
amb-0.105
Token But
Token ID887
Feature activation+5.15e-03

Pos logit contributions
 oceans+0.013
bye+0.012
 exp+0.012
 plains+0.012
amb+0.012
Neg logit contributions
 not-0.009
ggy-0.008
 worse-0.008
 mom-0.007
 infected-0.007
Token her
Token ID607
Feature activation+4.81e-02

Pos logit contributions
 not+0.008
ggy+0.007
 infected+0.007
 worse+0.007
ussy+0.007
Neg logit contributions
 exp-0.009
bye-0.009
 oceans-0.009
 plains-0.009
amb-0.008
Token Tommy
Token ID19919
Feature activation+2.70e-02

Pos logit contributions
 exp+0.004
bye+0.004
 oceans+0.004
 plains+0.004
amb+0.004
Neg logit contributions
 not-0.003
ggy-0.003
 infected-0.003
 worse-0.003
ussy-0.003
Token said
Token ID531
Feature activation+2.07e-02

Pos logit contributions
 infected+0.047
 worse+0.046
ussy+0.045
 chores+0.045
ggy+0.045
Neg logit contributions
 exp-0.039
bye-0.038
ard-0.034
amb-0.033
 oceans-0.032
Token yes
Token ID3763
Feature activation-1.83e-02

Pos logit contributions
ussy+0.035
 infected+0.033
ggy+0.032
 homework+0.031
 chores+0.031
Neg logit contributions
bye-0.033
 exp-0.032
 oceans-0.030
 course-0.028
 goodness-0.027
Token and
Token ID290
Feature activation-2.31e-03

Pos logit contributions
 oceans+0.033
 exp+0.032
bye+0.030
 plains+0.030
 lakes+0.029
Neg logit contributions
 not-0.031
ggy-0.029
 worse-0.028
 okay-0.027
 infected-0.027
Token gave
Token ID2921
Feature activation+2.55e-02

Pos logit contributions
bye+0.005
 exp+0.005
 oceans+0.005
amb+0.005
 plains+0.005
Neg logit contributions
 infected-0.002
ussy-0.002
ggy-0.002
 not-0.002
 worse-0.002
Token her
Token ID607
Feature activation+8.80e-02

Pos logit contributions
ggy+0.044
ussy+0.040
 not+0.039
 infected+0.039
istered+0.039
Neg logit contributions
bye-0.042
 course-0.038
 exp-0.036
 oceans-0.036
amb-0.034
Token a
Token ID257
Feature activation+6.31e-02

Pos logit contributions
orman+0.170
abella+0.169
ggy+0.169
ussy+0.163
ggies+0.158
Neg logit contributions
 course-0.098
 exp-0.090
bye-0.083
 goodness-0.082
 oceans-0.081
Token cup
Token ID6508
Feature activation+4.64e-02

Pos logit contributions
ussy+0.089
ggy+0.089
abella+0.087
 not+0.083
 worse+0.082
Neg logit contributions
 exp-0.115
bye-0.107
 breeze-0.106
 course-0.098
 oceans-0.098
Token of
Token ID286
Feature activation+3.66e-02

Pos logit contributions
 not+0.069
ggy+0.068
 infected+0.067
 worse+0.066
ussy+0.065
Neg logit contributions
bye-0.082
 exp-0.079
 oceans-0.077
amb-0.073
ard-0.072
Token juice
Token ID13135
Feature activation+5.47e-02

Pos logit contributions
ggy+0.041
 not+0.040
ussy+0.039
 infected+0.038
 worse+0.038
Neg logit contributions
 exp-0.077
bye-0.076
 oceans-0.075
 plains-0.070
 course-0.070
Token.
Token ID13
Feature activation+1.05e-03

Pos logit contributions
ggy+0.073
 not+0.073
 infected+0.071
 worse+0.070
ussy+0.069
Neg logit contributions
bye-0.103
 exp-0.102
 oceans-0.100
amb-0.093
 plains-0.093
Token girl
Token ID2576
Feature activation+2.99e-02

Pos logit contributions
 exp+0.015
bye+0.015
 oceans+0.015
 plains+0.013
amb+0.013
Neg logit contributions
 not-0.026
ggy-0.026
 infected-0.025
 worse-0.025
ussy-0.025
Token was
Token ID373
Feature activation+4.96e-02

Pos logit contributions
ussy+0.048
 chores+0.048
 homework+0.047
 infected+0.046
 worse+0.045
Neg logit contributions
 exp-0.046
bye-0.046
 oceans-0.043
ard-0.043
 plains-0.041
Token happy
Token ID3772
Feature activation-4.67e-03

Pos logit contributions
ussy+0.082
abella+0.079
 homework+0.077
 chores+0.076
orman+0.074
Neg logit contributions
bye-0.070
 exp-0.069
 oceans-0.069
 tune-0.063
 course-0.061
Token and
Token ID290
Feature activation+1.60e-02

Pos logit contributions
 exp+0.009
bye+0.009
 oceans+0.008
 plains+0.008
amb+0.008
Neg logit contributions
 not-0.007
 worse-0.007
ggy-0.007
 infected-0.006
 okay-0.006
Token gave
Token ID2921
Feature activation+2.83e-02

Pos logit contributions
 not+0.025
ggy+0.024
 infected+0.023
 worse+0.023
ussy+0.022
Neg logit contributions
 oceans-0.028
 exp-0.028
bye-0.028
 plains-0.026
amb-0.025
Token Max
Token ID5436
Feature activation+8.80e-02

Pos logit contributions
ggy+0.043
aches+0.041
 not+0.040
oll+0.039
 infected+0.039
Neg logit contributions
bye-0.050
 oceans-0.048
 course-0.046
 plains-0.045
 exp-0.042
Token a
Token ID257
Feature activation+5.92e-02

Pos logit contributions
ussy+0.145
ggy+0.137
abella+0.135
 infected+0.133
 worse+0.131
Neg logit contributions
bye-0.140
 oceans-0.127
 exp-0.120
 plains-0.120
 lakes-0.114
Token big
Token ID1263
Feature activation+1.87e-02

Pos logit contributions
ussy+0.084
abella+0.079
ggy+0.077
 not+0.075
 worse+0.074
Neg logit contributions
bye-0.111
 exp-0.110
 breeze-0.100
 oceans-0.097
amb-0.094
Token hug
Token ID16225
Feature activation+2.87e-03

Pos logit contributions
 not+0.027
 infected+0.026
ggy+0.026
 worse+0.025
 bad+0.025
Neg logit contributions
 oceans-0.035
 exp-0.033
bye-0.033
 plains-0.032
 lakes-0.031
Token.
Token ID13
Feature activation+3.49e-03

Pos logit contributions
 not+0.003
ggy+0.003
 infected+0.003
 worse+0.003
ussy+0.003
Neg logit contributions
 exp-0.006
 oceans-0.006
bye-0.006
 plains-0.005
amb-0.005
Token Max
Token ID5436
Feature activation+1.66e-02

Pos logit contributions
 not+0.004
ggy+0.004
 infected+0.004
 worse+0.004
ussy+0.003
Neg logit contributions
 exp-0.007
bye-0.007
 oceans-0.007
 plains-0.007
amb-0.007
Token ran
Token ID4966
Feature activation+6.43e-03

Pos logit contributions
ussy+0.020
ggy+0.019
 chores+0.019
 infected+0.019
 homework+0.019
Neg logit contributions
bye-0.018
 exp-0.018
 oceans-0.016
amb-0.016
 plains-0.015
Token after
Token ID706
Feature activation+1.83e-02

Pos logit contributions
ggy+0.011
ussy+0.010
 infected+0.010
 not+0.010
 worse+0.010
Neg logit contributions
bye-0.011
 oceans-0.010
 exp-0.010
 plains-0.009
amb-0.009
Token him
Token ID683
Feature activation+5.22e-02

Pos logit contributions
ussy+0.030
ggy+0.029
 infected+0.028
aches+0.028
 not+0.028
Neg logit contributions
 oceans-0.029
bye-0.027
 exp-0.026
 course-0.026
 plains-0.026
Token and
Token ID290
Feature activation+1.30e-03

Pos logit contributions
ggy+0.096
ussy+0.095
 infected+0.095
abella+0.092
 worse+0.091
Neg logit contributions
bye-0.071
 exp-0.071
 oceans-0.070
 plains-0.064
ard-0.062
Token gave
Token ID2921
Feature activation+2.84e-02

Pos logit contributions
ussy+0.002
 infected+0.002
ggy+0.002
 chores+0.002
 worse+0.002
Neg logit contributions
bye-0.002
 exp-0.002
 oceans-0.002
 plains-0.002
 course-0.002
Token him
Token ID683
Feature activation+8.79e-02

Pos logit contributions
ggy+0.064
 infected+0.061
ussy+0.060
 not+0.059
aches+0.059
Neg logit contributions
 oceans-0.029
bye-0.028
 course-0.026
 plains-0.023
 exp-0.022
Token back
Token ID736
Feature activation+5.13e-02

Pos logit contributions
ggy+0.140
ussy+0.134
orman+0.132
 infected+0.131
abella+0.129
Neg logit contributions
bye-0.146
 oceans-0.136
 course-0.128
 exp-0.128
 lakes-0.125
Token his
Token ID465
Feature activation+5.77e-02

Pos logit contributions
 worse+0.090
 infected+0.090
ussy+0.089
ggy+0.088
hes+0.084
Neg logit contributions
bye-0.078
 oceans-0.072
 exp-0.071
 plains-0.068
amb-0.066
Token lucky
Token ID9670
Feature activation+2.34e-02

Pos logit contributions
ggy+0.091
ussy+0.090
abella+0.088
 worse+0.087
 not+0.087
Neg logit contributions
 exp-0.097
 oceans-0.096
bye-0.095
 course-0.086
 plains-0.086
Token marble
Token ID30623
Feature activation+7.13e-02

Pos logit contributions
 not+0.032
ggy+0.031
 infected+0.030
 worse+0.030
ussy+0.029
Neg logit contributions
 exp-0.045
bye-0.044
 oceans-0.044
 plains-0.041
amb-0.040
Token.
Token ID13
Feature activation+2.18e-02

Pos logit contributions
 infected+0.100
ussy+0.100
 worse+0.097
 not+0.096
abella+0.095
Neg logit contributions
bye-0.129
 exp-0.121
 oceans-0.119
ard-0.114
amb-0.112
Token happy
Token ID3772
Feature activation-9.97e-03

Pos logit contributions
ork+0.045
ussy+0.043
aches+0.041
 homework+0.041
abella+0.041
Neg logit contributions
 oceans-0.037
 exp-0.036
bye-0.035
 plains-0.034
elt-0.034
Token to
Token ID284
Feature activation+1.69e-03

Pos logit contributions
 exp+0.019
 oceans+0.019
bye+0.019
 plains+0.018
amb+0.018
Neg logit contributions
 not-0.014
ggy-0.013
 worse-0.013
 infected-0.012
 okay-0.012
Token help
Token ID1037
Feature activation+2.57e-02

Pos logit contributions
ussy+0.003
 worse+0.003
 infected+0.003
ggy+0.002
abella+0.002
Neg logit contributions
bye-0.003
 exp-0.003
 oceans-0.003
amb-0.003
 course-0.002
Token and
Token ID290
Feature activation+6.69e-03

Pos logit contributions
ussy+0.043
abella+0.042
 infected+0.041
 homework+0.040
hes+0.040
Neg logit contributions
bye-0.038
 exp-0.038
ard-0.035
 oceans-0.034
amb-0.034
Token gave
Token ID2921
Feature activation+2.69e-02

Pos logit contributions
 not+0.009
ggy+0.009
 infected+0.009
ussy+0.009
 worse+0.009
Neg logit contributions
bye-0.012
 exp-0.012
 oceans-0.012
 plains-0.011
amb-0.011
Token Sally
Token ID25737
Feature activation+8.77e-02

Pos logit contributions
 homework+0.047
ggy+0.047
ussy+0.046
 chores+0.043
 infected+0.042
Neg logit contributions
bye-0.042
 course-0.039
 oceans-0.038
 exp-0.037
amb-0.036
Token a
Token ID257
Feature activation+4.30e-02

Pos logit contributions
ggy+0.137
 infected+0.132
ussy+0.131
 worse+0.128
abella+0.126
Neg logit contributions
bye-0.142
 oceans-0.138
 exp-0.136
 plains-0.131
 course-0.126
Token carrot
Token ID40562
Feature activation+5.13e-02

Pos logit contributions
ussy+0.057
ggy+0.054
 not+0.054
 worse+0.053
abella+0.053
Neg logit contributions
 exp-0.084
bye-0.081
 oceans-0.076
 breeze-0.074
 plains-0.073
Token to
Token ID284
Feature activation+1.23e-02

Pos logit contributions
ussy+0.076
ggy+0.074
 worse+0.073
 infected+0.073
 not+0.072
Neg logit contributions
bye-0.090
 exp-0.086
amb-0.083
 oceans-0.083
ard-0.079
Token eat
Token ID4483
Feature activation+3.40e-02

Pos logit contributions
ussy+0.019
 infected+0.018
ggy+0.018
 worse+0.018
 not+0.018
Neg logit contributions
bye-0.021
 exp-0.020
 oceans-0.019
 course-0.019
 plains-0.018
Token.
Token ID13
Feature activation+3.16e-03

Pos logit contributions
 not+0.044
ggy+0.043
 infected+0.043
ussy+0.042
 worse+0.042
Neg logit contributions
 exp-0.065
bye-0.065
 oceans-0.064
 plains-0.060
amb-0.059
Token promised
Token ID8072
Feature activation+9.60e-03

Pos logit contributions
 exp+0.018
 oceans+0.018
bye+0.017
 plains+0.016
amb+0.016
Neg logit contributions
 not-0.014
ggy-0.013
 worse-0.013
 infected-0.013
ussy-0.012
Token to
Token ID284
Feature activation+9.42e-03

Pos logit contributions
ussy+0.016
 infected+0.015
ggy+0.015
 worse+0.014
 not+0.014
Neg logit contributions
bye-0.016
 exp-0.015
 oceans-0.015
 plains-0.014
amb-0.014
Token always
Token ID1464
Feature activation+4.98e-03

Pos logit contributions
 not+0.014
ggy+0.011
 mom+0.010
 okay+0.010
 chores+0.010
Neg logit contributions
 oceans-0.021
 exp-0.020
 plains-0.019
ard-0.019
 lakes-0.019
Token keep
Token ID1394
Feature activation+2.65e-02

Pos logit contributions
 not+0.008
ggy+0.007
 worse+0.006
 okay+0.006
 homework+0.006
Neg logit contributions
 oceans-0.010
 exp-0.010
bye-0.009
 plains-0.009
 lakes-0.009
Token his
Token ID465
Feature activation+4.05e-02

Pos logit contributions
ussy+0.041
ggy+0.038
 infected+0.037
 worse+0.036
oll+0.036
Neg logit contributions
 oceans-0.046
 plains-0.042
bye-0.042
 exp-0.042
 lakes-0.040
Token wallet
Token ID13008
Feature activation+8.77e-02

Pos logit contributions
ussy+0.049
ggy+0.048
 infected+0.047
 worse+0.047
 not+0.046
Neg logit contributions
bye-0.084
 exp-0.082
 oceans-0.081
 plains-0.076
ard-0.075
Token safe
Token ID3338
Feature activation+7.52e-03

Pos logit contributions
abella+0.134
ussy+0.129
ggy+0.124
illi+0.116
 infected+0.115
Neg logit contributions
bye-0.155
ard-0.140
 oceans-0.137
 exp-0.130
 breeze-0.129
Token.
Token ID13
Feature activation+1.16e-02

Pos logit contributions
 not+0.010
ggy+0.009
 infected+0.009
 worse+0.009
ussy+0.009
Neg logit contributions
 exp-0.015
bye-0.015
 oceans-0.015
 plains-0.014
amb-0.014
Token He
Token ID679
Feature activation+6.90e-03

Pos logit contributions
 not+0.021
ggy+0.020
 infected+0.020
 worse+0.020
ussy+0.019
Neg logit contributions
 exp-0.017
 oceans-0.017
bye-0.017
 plains-0.016
amb-0.015
Token realized
Token ID6939
Feature activation+1.61e-02

Pos logit contributions
ussy+0.010
 infected+0.010
ggy+0.010
 worse+0.010
 not+0.010
Neg logit contributions
bye-0.012
 exp-0.012
 oceans-0.011
 plains-0.011
amb-0.011
Token that
Token ID326
Feature activation+2.21e-02

Pos logit contributions
ussy+0.028
ggy+0.027
 infected+0.027
 worse+0.025
 not+0.025
Neg logit contributions
 oceans-0.024
bye-0.023
 exp-0.023
 plains-0.021
 course-0.021
Token for
Token ID329
Feature activation+8.83e-03

Pos logit contributions
 not+0.011
ggy+0.010
 infected+0.010
 worse+0.010
ussy+0.010
Neg logit contributions
 exp-0.015
bye-0.015
 oceans-0.015
 plains-0.014
amb-0.014
Token Tim
Token ID5045
Feature activation+4.54e-02

Pos logit contributions
 not+0.012
ggy+0.012
 infected+0.011
 worse+0.011
ussy+0.011
Neg logit contributions
 exp-0.017
 oceans-0.017
bye-0.017
 plains-0.015
amb-0.015
Tokenmy
Token ID1820
Feature activation+5.63e-02

Pos logit contributions
ggy+0.076
abella+0.074
 infected+0.074
hes+0.072
 not+0.072
Neg logit contributions
bye-0.071
 exp-0.063
ard-0.063
 oceans-0.060
 course-0.059
Token and
Token ID290
Feature activation+4.63e-03

Pos logit contributions
abella+0.098
ussy+0.098
ggy+0.098
 infected+0.095
hes+0.094
Neg logit contributions
bye-0.078
ard-0.075
 exp-0.074
 oceans-0.070
 course-0.067
Token gave
Token ID2921
Feature activation+2.26e-02

Pos logit contributions
 not+0.006
ggy+0.006
 infected+0.006
 worse+0.006
ussy+0.006
Neg logit contributions
 exp-0.009
bye-0.009
 oceans-0.009
 plains-0.008
amb-0.008
Token him
Token ID683
Feature activation+8.76e-02

Pos logit contributions
ggy+0.044
ussy+0.041
 infected+0.041
 not+0.041
 homework+0.040
Neg logit contributions
bye-0.030
 oceans-0.029
 exp-0.028
 course-0.027
 plains-0.025
Token a
Token ID257
Feature activation+4.86e-02

Pos logit contributions
ggy+0.130
abella+0.128
ussy+0.127
orman+0.126
 losing+0.123
Neg logit contributions
bye-0.150
 exp-0.134
 oceans-0.131
 course-0.129
 plains-0.122
Token hug
Token ID16225
Feature activation+6.07e-03

Pos logit contributions
ussy+0.073
ggy+0.072
 not+0.071
abella+0.070
 infected+0.068
Neg logit contributions
 exp-0.082
bye-0.082
 oceans-0.075
 breeze-0.073
 plains-0.071
Token.
Token ID13
Feature activation-5.30e-03

Pos logit contributions
 not+0.008
ggy+0.007
 infected+0.007
 worse+0.007
ussy+0.007
Neg logit contributions
 exp-0.012
bye-0.012
 oceans-0.012
 plains-0.011
amb-0.011
Token She
Token ID1375
Feature activation+1.63e-03

Pos logit contributions
 oceans+0.012
bye+0.012
 exp+0.011
 plains+0.011
amb+0.011
Neg logit contributions
 not-0.006
 worse-0.005
 infected-0.005
ggy-0.005
 mom-0.005
Token told
Token ID1297
Feature activation-6.75e-04

Pos logit contributions
ggy+0.003
ussy+0.002
 chores+0.002
 homework+0.002
 worse+0.002
Neg logit contributions
 exp-0.003
bye-0.003
 oceans-0.002
amb-0.002
ard-0.002
Token does
Token ID857
Feature activation+3.76e-02

Pos logit contributions
 not+0.029
ggy+0.028
 infected+0.028
 worse+0.027
ussy+0.027
Neg logit contributions
 exp-0.030
 oceans-0.029
bye-0.029
 plains-0.027
amb-0.026
Token not
Token ID407
Feature activation+5.20e-03

Pos logit contributions
ussy+0.038
oll+0.033
 infected+0.030
aila+0.027
uffy+0.026
Neg logit contributions
bye-0.081
 exp-0.079
 goodness-0.073
 course-0.073
 applause-0.073
Token want
Token ID765
Feature activation+2.03e-02

Pos logit contributions
 infected+0.010
ggy+0.010
ussy+0.010
abella+0.010
 not+0.010
Neg logit contributions
bye-0.007
 exp-0.006
 breeze-0.006
 oceans-0.006
amb-0.005
Token to
Token ID284
Feature activation+5.60e-03

Pos logit contributions
ussy+0.039
ggy+0.038
 infected+0.038
oll+0.036
illi+0.036
Neg logit contributions
bye-0.027
 oceans-0.025
 exp-0.024
 course-0.023
 plains-0.022
Token give
Token ID1577
Feature activation+4.35e-02

Pos logit contributions
ussy+0.009
 infected+0.009
oll+0.009
abella+0.009
hes+0.009
Neg logit contributions
bye-0.009
 exp-0.009
 oceans-0.008
 course-0.008
 breeze-0.007
Token Sam
Token ID3409
Feature activation+8.75e-02

Pos logit contributions
ggy+0.070
oll+0.069
 severe+0.067
abella+0.067
acey+0.066
Neg logit contributions
bye-0.072
 oceans-0.071
 lakes-0.064
 rivers-0.062
 plains-0.062
Token any
Token ID597
Feature activation+3.38e-02

Pos logit contributions
abella+0.129
ggy+0.124
ussy+0.123
 infected+0.121
aila+0.117
Neg logit contributions
bye-0.151
 exp-0.135
 oceans-0.135
 course-0.131
ard-0.126
Token of
Token ID286
Feature activation+1.18e-02

Pos logit contributions
ggy+0.051
 not+0.051
ussy+0.049
abella+0.049
 infected+0.048
Neg logit contributions
bye-0.056
 oceans-0.054
 exp-0.054
ard-0.052
 plains-0.051
Token his
Token ID465
Feature activation+4.75e-02

Pos logit contributions
 infected+0.016
ussy+0.015
ggy+0.015
 worse+0.015
oll+0.015
Neg logit contributions
 oceans-0.021
 exp-0.021
bye-0.020
 course-0.020
 plains-0.019
Token meat
Token ID6174
Feature activation+4.88e-02

Pos logit contributions
ggy+0.069
 not+0.068
ussy+0.067
 worse+0.066
 infected+0.066
Neg logit contributions
 oceans-0.079
bye-0.079
 exp-0.079
ard-0.075
 plains-0.075
Token.
Token ID13
Feature activation-2.00e-03

Pos logit contributions
 infected+0.065
ggy+0.064
 not+0.063
ussy+0.062
 worse+0.061
Neg logit contributions
bye-0.091
 exp-0.088
 oceans-0.088
 plains-0.082
ard-0.082
Token worry
Token ID5490
Feature activation+1.79e-02

Pos logit contributions
 not+0.013
ggy+0.012
 worse+0.012
 infected+0.012
ussy+0.011
Neg logit contributions
 exp-0.018
 oceans-0.018
bye-0.018
 plains-0.017
amb-0.016
Token and
Token ID290
Feature activation+3.19e-04

Pos logit contributions
 infected+0.035
ggy+0.035
ussy+0.034
orman+0.033
abella+0.033
Neg logit contributions
 exp-0.021
bye-0.021
 oceans-0.020
ard-0.019
amb-0.018
Token promised
Token ID8072
Feature activation+1.79e-02

Pos logit contributions
 not+0.000
 worse+0.000
ggy+0.000
 infected+0.000
 bad+0.000
Neg logit contributions
 oceans-0.001
 exp-0.001
bye-0.001
 plains-0.001
amb-0.001
Token to
Token ID284
Feature activation-2.89e-03

Pos logit contributions
ussy+0.030
 infected+0.028
ggy+0.028
oll+0.027
hes+0.027
Neg logit contributions
bye-0.029
 oceans-0.027
 exp-0.027
 plains-0.025
 lakes-0.024
Token buy
Token ID2822
Feature activation+3.66e-02

Pos logit contributions
 oceans+0.006
 plains+0.006
 exp+0.006
 lakes+0.006
bye+0.005
Neg logit contributions
 not-0.004
 okay-0.004
ggy-0.004
 worse-0.003
 chores-0.003
Token him
Token ID683
Feature activation+8.73e-02

Pos logit contributions
ggy+0.062
ussy+0.062
 not+0.059
 infected+0.059
 worse+0.056
Neg logit contributions
bye-0.055
 oceans-0.055
 exp-0.051
 course-0.050
 plains-0.049
Token a
Token ID257
Feature activation+4.88e-02

Pos logit contributions
ussy+0.128
ggy+0.123
abella+0.121
 infected+0.114
orman+0.114
Neg logit contributions
bye-0.147
 exp-0.144
 oceans-0.140
 plains-0.129
 lakes-0.127
Token new
Token ID649
Feature activation+1.52e-02

Pos logit contributions
ussy+0.058
ggy+0.057
 not+0.054
 infected+0.053
abella+0.052
Neg logit contributions
 exp-0.101
bye-0.099
 oceans-0.096
 plains-0.091
 breeze-0.089
Token one
Token ID530
Feature activation+3.44e-02

Pos logit contributions
 not+0.018
ggy+0.017
 infected+0.017
 worse+0.017
 homework+0.016
Neg logit contributions
 oceans-0.032
 exp-0.032
bye-0.032
 plains-0.030
amb-0.029
Token when
Token ID618
Feature activation+2.85e-02

Pos logit contributions
 not+0.044
ggy+0.043
 infected+0.043
 worse+0.041
ussy+0.041
Neg logit contributions
bye-0.068
 exp-0.067
 oceans-0.066
 plains-0.062
amb-0.061
Token they
Token ID484
Feature activation-3.48e-03

Pos logit contributions
 not+0.042
ggy+0.042
 infected+0.041
ussy+0.041
 worse+0.040
Neg logit contributions
 oceans-0.050
 exp-0.049
bye-0.049
 plains-0.046
 lakes-0.044
Token.
Token ID13
Feature activation-2.92e-03

Pos logit contributions
bye+0.007
 exp+0.007
 oceans+0.007
amb+0.006
 plains+0.006
Neg logit contributions
 not-0.004
 worse-0.003
 infected-0.003
ggy-0.003
 wrong-0.003
Token She
Token ID1375
Feature activation-2.93e-03

Pos logit contributions
 exp+0.004
bye+0.004
 oceans+0.004
 plains+0.004
amb+0.003
Neg logit contributions
 not-0.006
ggy-0.005
 infected-0.005
 worse-0.005
ussy-0.005
Token looked
Token ID3114
Feature activation+5.77e-03

Pos logit contributions
 exp+0.005
bye+0.005
 oceans+0.005
 plains+0.005
amb+0.004
Neg logit contributions
 not-0.004
ggy-0.004
 infected-0.004
 worse-0.004
ussy-0.004
Token around
Token ID1088
Feature activation-1.74e-02

Pos logit contributions
ggy+0.007
 not+0.007
 infected+0.006
ussy+0.006
 worse+0.006
Neg logit contributions
 exp-0.012
 oceans-0.012
bye-0.012
 plains-0.011
amb-0.011
Token the
Token ID262
Feature activation-3.78e-02

Pos logit contributions
bye+0.038
amb+0.035
 exp+0.034
ard+0.033
 plains+0.033
Neg logit contributions
 not-0.025
 bad-0.021
 mom-0.021
 worse-0.021
 wrong-0.020
Token vast
Token ID5909
Feature activation-1.18e-01

Pos logit contributions
bye+0.040
amb+0.037
 exp+0.034
 course+0.031
 oceans+0.030
Neg logit contributions
 not-0.085
 bad-0.083
 worse-0.080
 wrong-0.080
 missing-0.080
Token ocean
Token ID9151
Feature activation-3.66e-02

Pos logit contributions
amb+0.168
bye+0.156
elt+0.138
 course+0.136
 Bye+0.133
Neg logit contributions
 bad-0.211
 mom-0.206
 well-0.199
 missing-0.196
 not-0.196
Token and
Token ID290
Feature activation-2.08e-02

Pos logit contributions
bye+0.079
amb+0.078
 exp+0.077
 oceans+0.069
icol+0.069
Neg logit contributions
 not-0.052
 wrong-0.042
 missing-0.041
 worse-0.040
 bad-0.039
Token didn
Token ID1422
Feature activation+2.32e-02

Pos logit contributions
bye+0.038
amb+0.037
 exp+0.037
 oceans+0.036
 plains+0.034
Neg logit contributions
 not-0.033
 worse-0.029
 bad-0.028
 infected-0.028
 wrong-0.028
Token't
Token ID470
Feature activation-6.13e-03

Pos logit contributions
ussy+0.034
 homework+0.032
 infected+0.032
aches+0.031
ggy+0.031
Neg logit contributions
bye-0.036
 exp-0.036
 oceans-0.034
amb-0.031
 horizon-0.030
Token see
Token ID766
Feature activation+9.97e-03

Pos logit contributions
 exp+0.011
bye+0.011
 oceans+0.010
 plains+0.010
amb+0.009
Neg logit contributions
ggy-0.009
 infected-0.009
 not-0.009
ussy-0.009
 worse-0.009
TokenFinally
Token ID11158
Feature activation-3.62e-03

Pos logit contributions
 not+0.005
ggy+0.005
 infected+0.005
 worse+0.005
ussy+0.005
Neg logit contributions
 exp-0.005
bye-0.005
 oceans-0.005
 plains-0.005
amb-0.005
Token,
Token ID11
Feature activation+3.73e-03

Pos logit contributions
 exp+0.006
bye+0.006
 oceans+0.006
 plains+0.005
amb+0.005
Neg logit contributions
 not-0.006
ggy-0.006
 infected-0.006
ussy-0.006
 worse-0.005
Token they
Token ID484
Feature activation-2.34e-02

Pos logit contributions
ggy+0.005
 not+0.005
 infected+0.005
ussy+0.005
 worse+0.005
Neg logit contributions
bye-0.007
 exp-0.007
 oceans-0.007
 plains-0.006
amb-0.006
Token reached
Token ID4251
Feature activation-6.02e-03

Pos logit contributions
 exp+0.040
bye+0.040
 oceans+0.039
 plains+0.037
amb+0.036
Neg logit contributions
 not-0.037
ggy-0.035
 infected-0.034
 worse-0.034
ussy-0.033
Token a
Token ID257
Feature activation-3.15e-02

Pos logit contributions
bye+0.014
 exp+0.014
 oceans+0.013
 plains+0.013
amb+0.013
Neg logit contributions
 not-0.006
ggy-0.006
 infected-0.006
 worse-0.006
ussy-0.005
Token vast
Token ID5909
Feature activation-1.14e-01

Pos logit contributions
bye+0.064
 exp+0.063
 oceans+0.063
amb+0.062
 plains+0.059
Neg logit contributions
 not-0.039
ggy-0.037
 worse-0.036
 bad-0.035
 infected-0.035
Token stretch
Token ID7539
Feature activation-6.04e-02

Pos logit contributions
amb+0.198
bye+0.185
elt+0.179
thank+0.162
 goodness+0.156
Neg logit contributions
 bad-0.179
 not-0.171
 well-0.171
 mom-0.168
 wrong-0.163
Token of
Token ID286
Feature activation-4.40e-02

Pos logit contributions
 exp+0.124
 oceans+0.121
bye+0.120
icol+0.120
amb+0.114
Neg logit contributions
 not-0.073
ggy-0.068
 wrong-0.064
 worse-0.061
 anywhere-0.061
Token beach
Token ID10481
Feature activation-3.80e-02

Pos logit contributions
bye+0.086
amb+0.082
 exp+0.079
ard+0.076
 oceans+0.075
Neg logit contributions
 not-0.061
 bad-0.060
 chores-0.058
 worse-0.058
 infected-0.056
Token.
Token ID13
Feature activation-4.35e-03

Pos logit contributions
bye+0.088
amb+0.085
 exp+0.085
 oceans+0.079
 course+0.078
Neg logit contributions
 not-0.045
ggy-0.033
 wrong-0.033
 missing-0.032
 anywhere-0.032
Token As
Token ID1081
Feature activation+3.49e-03

Pos logit contributions
bye+0.008
 oceans+0.008
 exp+0.008
 plains+0.008
amb+0.008
Neg logit contributions
 not-0.006
 infected-0.005
 worse-0.005
ggy-0.005
 okay-0.005
Token in
Token ID287
Feature activation-3.61e-02

Pos logit contributions
bye+0.044
 exp+0.043
 oceans+0.042
amb+0.040
 plains+0.040
Neg logit contributions
 not-0.027
 okay-0.023
 wrong-0.023
 worse-0.022
 bad-0.021
Token the
Token ID262
Feature activation-5.20e-02

Pos logit contributions
amb+0.097
bye+0.095
 exp+0.092
ard+0.091
 oceans+0.087
Neg logit contributions
 not-0.030
 bad-0.023
 mom-0.022
 worse-0.020
 infected-0.019
Token middle
Token ID3504
Feature activation-2.48e-02

Pos logit contributions
amb+0.078
bye+0.075
 exp+0.070
 oceans+0.069
ard+0.063
Neg logit contributions
 bad-0.093
 not-0.091
 worse-0.088
 wrong-0.088
 mom-0.085
Token of
Token ID286
Feature activation-2.90e-02

Pos logit contributions
 exp+0.060
amb+0.060
icol+0.059
bye+0.058
 oceans+0.058
Neg logit contributions
 not-0.024
ggy-0.020
 mom-0.019
 worse-0.018
 bad-0.018
Token a
Token ID257
Feature activation-3.03e-02

Pos logit contributions
bye+0.076
amb+0.074
 exp+0.072
ard+0.070
 oceans+0.068
Neg logit contributions
 not-0.025
 bad-0.019
 mom-0.018
 worse-0.018
 infected-0.017
Token vast
Token ID5909
Feature activation-1.12e-01

Pos logit contributions
bye+0.059
amb+0.059
 exp+0.057
 oceans+0.056
 plains+0.053
Neg logit contributions
 not-0.041
ggy-0.040
 bad-0.039
 worse-0.037
 wrong-0.036
Token field
Token ID2214
Feature activation-4.73e-02

Pos logit contributions
amb+0.198
bye+0.189
elt+0.183
Run+0.164
 Bye+0.164
Neg logit contributions
 bad-0.164
 well-0.164
 not-0.153
 wrong-0.152
 mom-0.151
Token.
Token ID13
Feature activation-2.88e-02

Pos logit contributions
icol+0.122
 exp+0.121
amb+0.117
 oceans+0.115
bye+0.110
Neg logit contributions
 not-0.049
 wrong-0.037
ggy-0.031
 missing-0.030
 okay-0.028
Token\n
Token ID198
Feature activation-1.03e-02

Pos logit contributions
bye+0.059
amb+0.055
 exp+0.054
 oceans+0.053
thank+0.051
Neg logit contributions
 not-0.037
 mom-0.033
 okay-0.033
 Lily-0.031
ggy-0.029
Token\n
Token ID198
Feature activation-1.07e-02

Pos logit contributions
 exp+0.019
bye+0.019
 oceans+0.019
 plains+0.017
amb+0.017
Neg logit contributions
 not-0.015
ggy-0.014
 infected-0.014
 worse-0.014
ussy-0.013
TokenOne
Token ID3198
Feature activation-1.94e-02

Pos logit contributions
 exp+0.020
bye+0.020
 oceans+0.019
amb+0.018
 plains+0.018
Neg logit contributions
 not-0.015
ggy-0.015
 infected-0.014
 worse-0.013
 okay-0.013
Token colourful
Token ID44799
Feature activation-4.77e-02

Pos logit contributions
bye+0.055
amb+0.053
 exp+0.052
ard+0.049
 plains+0.048
Neg logit contributions
 not-0.034
 bad-0.034
 worse-0.031
 mom-0.030
 chores-0.030
Token flowers
Token ID12734
Feature activation-1.01e-02

Pos logit contributions
bye+0.101
amb+0.090
ursion+0.085
 Bye+0.083
 exp+0.083
Neg logit contributions
 bad-0.054
 mom-0.052
 chores-0.050
 not-0.049
 good-0.049
Token,
Token ID11
Feature activation-4.67e-02

Pos logit contributions
bye+0.024
 exp+0.024
 oceans+0.022
icol+0.022
 plains+0.021
Neg logit contributions
 not-0.010
 anywhere-0.008
ggy-0.008
 missing-0.008
 okay-0.007
Token through
Token ID832
Feature activation-5.06e-02

Pos logit contributions
bye+0.100
amb+0.093
ursion+0.086
 exp+0.085
umm+0.084
Neg logit contributions
 not-0.073
 okay-0.057
 bad-0.056
 missing-0.055
 mom-0.053
Token the
Token ID262
Feature activation-5.38e-02

Pos logit contributions
bye+0.130
amb+0.125
 exp+0.118
icol+0.112
ard+0.110
Neg logit contributions
 not-0.046
 bad-0.039
 mom-0.038
 worse-0.037
 anywhere-0.035
Token lush
Token ID37408
Feature activation-1.11e-01

Pos logit contributions
amb+0.100
bye+0.089
elt+0.082
 exp+0.081
thank+0.080
Neg logit contributions
 bad-0.086
 wrong-0.077
 well-0.075
 not-0.073
 missing-0.072
Token forests
Token ID17039
Feature activation-4.87e-02

Pos logit contributions
bye+0.216
amb+0.209
elt+0.186
thank+0.185
ursion+0.184
Neg logit contributions
 bad-0.138
 well-0.131
 big-0.127
 not-0.124
 fake-0.123
Token deep
Token ID2769
Feature activation-1.05e-01

Pos logit contributions
 exp+0.110
bye+0.109
amb+0.104
icol+0.100
 Bye+0.095
Neg logit contributions
 not-0.054
ggy-0.044
 wrong-0.042
 anywhere-0.042
 missing-0.041
Token in
Token ID287
Feature activation-6.92e-02

Pos logit contributions
bye+0.180
amb+0.171
 Bye+0.165
ursion+0.148
 sidelines+0.147
Neg logit contributions
 well-0.151
 up-0.146
 bad-0.145
 on-0.144
 inside-0.139
Token thought
Token ID1807
Feature activation-1.08e-02

Pos logit contributions
amb+0.204
bye+0.199
icol+0.182
 exp+0.181
elt+0.180
Neg logit contributions
 not-0.032
 mom-0.026
 bad-0.026
 anywhere-0.020
 worse-0.019
Token.
Token ID13
Feature activation-7.81e-03

Pos logit contributions
 exp+0.023
bye+0.022
 oceans+0.022
amb+0.021
 plains+0.021
Neg logit contributions
 not-0.014
ggy-0.013
 worse-0.012
 infected-0.012
 bad-0.011
Token She
Token ID1375
Feature activation-1.71e-02

Pos logit contributions
 oceans+0.072
bye+0.071
 exp+0.070
 plains+0.066
 lakes+0.064
Neg logit contributions
 not-0.052
 mom-0.046
ggy-0.046
 okay-0.046
 infected-0.043
Token loved
Token ID6151
Feature activation-1.37e-02

Pos logit contributions
bye+0.035
 oceans+0.035
 exp+0.034
 plains+0.032
amb+0.032
Neg logit contributions
 not-0.023
ggy-0.021
 infected-0.020
 worse-0.020
 bad-0.019
Token nature
Token ID3450
Feature activation-2.13e-02

Pos logit contributions
bye+0.030
 exp+0.030
 oceans+0.029
amb+0.028
ard+0.028
Neg logit contributions
 not-0.017
ggy-0.014
 homework-0.014
 mom-0.014
 chores-0.014
Token -
Token ID532
Feature activation-1.86e-02

Pos logit contributions
 exp+0.042
bye+0.040
 oceans+0.039
amb+0.038
icol+0.038
Neg logit contributions
 not-0.029
ggy-0.027
 worse-0.026
 bad-0.025
 wrong-0.024
Token the
Token ID262
Feature activation-2.59e-02

Pos logit contributions
bye+0.039
 exp+0.038
amb+0.037
ard+0.037
 oceans+0.036
Neg logit contributions
 not-0.027
 okay-0.021
 chores-0.020
 bad-0.020
 mom-0.019
Token lush
Token ID37408
Feature activation-1.10e-01

Pos logit contributions
 exp+0.041
bye+0.041
amb+0.040
 oceans+0.039
 plains+0.036
Neg logit contributions
 not-0.044
 bad-0.041
 worse-0.040
 wrong-0.040
 chores-0.040
Token green
Token ID4077
Feature activation-8.04e-02

Pos logit contributions
amb+0.224
bye+0.218
 Bye+0.201
elt+0.201
ursion+0.199
Neg logit contributions
 bad-0.130
 well-0.118
 good-0.116
 big-0.115
 up-0.113
Token grass
Token ID8701
Feature activation-5.20e-02

Pos logit contributions
bye+0.142
icol+0.140
amb+0.139
 Bye+0.133
 exp+0.127
Neg logit contributions
 mom-0.099
 bad-0.097
 best-0.094
 wrong-0.093
 on-0.091
Token,
Token ID11
Feature activation-3.48e-02

Pos logit contributions
 exp+0.117
bye+0.113
icol+0.111
 oceans+0.106
amb+0.103
Neg logit contributions
 not-0.065
ggy-0.050
 wrong-0.049
 missing-0.048
 also-0.046
Token the
Token ID262
Feature activation-3.71e-02

Pos logit contributions
bye+0.072
amb+0.071
 exp+0.068
ard+0.063
 zone+0.063
Neg logit contributions
 not-0.054
 okay-0.040
 bad-0.040
 missing-0.038
 also-0.037
Token orange
Token ID10912
Feature activation-4.47e-02

Pos logit contributions
amb+0.062
 exp+0.061
bye+0.061
 oceans+0.056
icol+0.053
Neg logit contributions
 not-0.060
 bad-0.057
 mom-0.056
 okay-0.054
 wrong-0.053
Token it
Token ID340
Feature activation-1.24e-02

Pos logit contributions
bye+0.066
 exp+0.064
amb+0.063
 oceans+0.061
icol+0.060
Neg logit contributions
 not-0.028
 bad-0.021
 mom-0.020
 okay-0.020
ggy-0.019
Token sh
Token ID427
Feature activation-5.50e-02

Pos logit contributions
bye+0.028
 exp+0.027
 oceans+0.027
amb+0.026
 plains+0.026
Neg logit contributions
 not-0.015
ggy-0.012
 worse-0.012
 bad-0.012
 infected-0.011
Tokenined
Token ID1389
Feature activation-3.36e-02

Pos logit contributions
 exp+0.098
 oceans+0.098
bye+0.095
 plains+0.090
 lakes+0.087
Neg logit contributions
 not-0.083
ggy-0.082
 infected-0.078
 worse-0.076
ussy-0.076
Token in
Token ID287
Feature activation-4.57e-02

Pos logit contributions
 exp+0.068
bye+0.068
 oceans+0.066
 plains+0.064
amb+0.064
Neg logit contributions
 not-0.044
 worse-0.041
 okay-0.039
 bad-0.039
 infected-0.037
Token the
Token ID262
Feature activation-7.21e-02

Pos logit contributions
amb+0.137
bye+0.132
elt+0.126
ard+0.125
ursion+0.123
Neg logit contributions
 not-0.027
 mom-0.021
 bad-0.020
 good-0.017
 what-0.012
Token bright
Token ID6016
Feature activation-1.09e-01

Pos logit contributions
amb+0.117
 MacDonald+0.090
bye+0.090
elt+0.089
icol+0.089
Neg logit contributions
 bad-0.133
 well-0.125
 wrong-0.119
 not-0.117
 good-0.117
Token sun
Token ID4252
Feature activation-5.11e-02

Pos logit contributions
amb+0.205
istence+0.189
 MacDonald+0.188
icol+0.187
rob+0.187
Neg logit contributions
 big-0.130
 mom-0.126
 bad-0.120
 good-0.119
 well-0.117
Token.
Token ID13
Feature activation-1.94e-02

Pos logit contributions
 exp+0.138
 oceans+0.134
 course+0.130
icol+0.129
bye+0.128
Neg logit contributions
 not-0.030
 mom-0.022
 anywhere-0.017
 missing-0.016
 wrong-0.016
Token One
Token ID1881
Feature activation-2.26e-02

Pos logit contributions
bye+0.041
 exp+0.039
 oceans+0.039
amb+0.038
icol+0.036
Neg logit contributions
 not-0.023
 mom-0.020
ggy-0.020
 okay-0.019
 Lily-0.018
Token day
Token ID1110
Feature activation-7.86e-03

Pos logit contributions
 oceans+0.039
 exp+0.039
amb+0.037
icol+0.036
bye+0.036
Neg logit contributions
 not-0.035
ggy-0.034
 bad-0.033
 mom-0.032
 worse-0.031
Token,
Token ID11
Feature activation+2.37e-03

Pos logit contributions
 exp+0.014
 oceans+0.014
bye+0.014
 plains+0.013
amb+0.013
Neg logit contributions
 not-0.012
ggy-0.011
 infected-0.011
 worse-0.011
 bad-0.010
Token fish
Token ID5916
Feature activation-2.11e-02

Pos logit contributions
 exp+0.094
bye+0.089
 oceans+0.087
 plains+0.083
amb+0.081
Neg logit contributions
 bad-0.067
 mom-0.066
 worse-0.064
 wrong-0.063
 not-0.062
Token who
Token ID508
Feature activation-2.86e-02

Pos logit contributions
 exp+0.048
 oceans+0.046
bye+0.046
icol+0.045
 plains+0.042
Neg logit contributions
 not-0.024
ggy-0.023
 wrong-0.020
 bad-0.019
 worse-0.018
Token lived
Token ID5615
Feature activation+1.83e-03

Pos logit contributions
 oceans+0.058
 exp+0.058
bye+0.057
amb+0.053
icol+0.053
Neg logit contributions
 not-0.035
ggy-0.032
 infected-0.032
 bad-0.030
 worse-0.029
Token in
Token ID287
Feature activation-3.15e-02

Pos logit contributions
 not+0.002
ggy+0.002
 infected+0.002
 worse+0.002
ussy+0.002
Neg logit contributions
 exp-0.004
bye-0.004
 oceans-0.004
 plains-0.004
amb-0.004
Token the
Token ID262
Feature activation-5.88e-02

Pos logit contributions
bye+0.077
 exp+0.076
 oceans+0.074
amb+0.073
 plains+0.070
Neg logit contributions
 not-0.030
ggy-0.026
 infected-0.026
 worse-0.025
 bad-0.024
Token deep
Token ID2769
Feature activation-1.08e-01

Pos logit contributions
amb+0.068
bye+0.064
 exp+0.062
icol+0.055
 oceans+0.054
Neg logit contributions
 not-0.128
 bad-0.125
 wrong-0.122
 worse-0.119
 mom-0.118
Token blue
Token ID4171
Feature activation-8.56e-02

Pos logit contributions
amb+0.176
 goodness+0.162
bye+0.161
 MacDonald+0.154
thank+0.152
Neg logit contributions
 bad-0.166
 mom-0.153
 well-0.149
 broken-0.149
 sad-0.148
Token sea
Token ID5417
Feature activation-3.58e-02

Pos logit contributions
icol+0.123
bye+0.119
 la+0.118
 MacDonald+0.115
amb+0.107
Neg logit contributions
 mom-0.145
 friends-0.128
 best-0.128
 chores-0.127
 friend-0.126
Token.
Token ID13
Feature activation-2.42e-02

Pos logit contributions
 exp+0.094
bye+0.093
icol+0.092
amb+0.089
 oceans+0.087
Neg logit contributions
 not-0.029
ggy-0.021
 wrong-0.020
 bad-0.018
 mom-0.017
Token She
Token ID1375
Feature activation-2.31e-02

Pos logit contributions
bye+0.043
 exp+0.041
 oceans+0.040
amb+0.039
 course+0.037
Neg logit contributions
 not-0.037
ussy-0.033
ggy-0.033
 infected-0.032
 okay-0.032
Token sw
Token ID1509
Feature activation-4.48e-02

Pos logit contributions
bye+0.056
 oceans+0.054
 exp+0.054
amb+0.052
umm+0.052
Neg logit contributions
 not-0.024
ggy-0.021
 bad-0.018
 also-0.017
 infected-0.016
Token.
Token ID13
Feature activation-3.75e-02

Pos logit contributions
 oceans+0.086
 exp+0.083
bye+0.078
 lakes+0.078
 plains+0.077
Neg logit contributions
 not-0.054
ggy-0.053
 wrong-0.045
 bad-0.045
 infected-0.044
Token He
Token ID679
Feature activation-2.74e-02

Pos logit contributions
 oceans+0.068
bye+0.067
 exp+0.065
 plains+0.062
 breeze+0.060
Neg logit contributions
 not-0.054
 okay-0.049
 mom-0.049
ggy-0.046
 Lily-0.045
Token lived
Token ID5615
Feature activation+9.53e-04

Pos logit contributions
bye+0.057
 oceans+0.056
 exp+0.054
amb+0.052
 plains+0.051
Neg logit contributions
 not-0.038
ggy-0.033
 bad-0.031
 infected-0.031
 worse-0.031
Token in
Token ID287
Feature activation-2.52e-02

Pos logit contributions
ggy+0.001
 infected+0.001
 worse+0.001
 not+0.001
ussy+0.001
Neg logit contributions
 oceans-0.002
bye-0.002
 exp-0.002
 plains-0.002
amb-0.002
Token the
Token ID262
Feature activation-5.84e-02

Pos logit contributions
bye+0.062
 exp+0.062
 oceans+0.060
amb+0.059
 plains+0.057
Neg logit contributions
 not-0.022
ggy-0.020
 infected-0.019
 worse-0.019
 bad-0.018
Token deep
Token ID2769
Feature activation-1.08e-01

Pos logit contributions
amb+0.074
bye+0.072
 exp+0.068
 oceans+0.065
ard+0.058
Neg logit contributions
 not-0.119
 bad-0.117
 wrong-0.113
 worse-0.113
ggy-0.111
Token blue
Token ID4171
Feature activation-8.27e-02

Pos logit contributions
amb+0.195
bye+0.189
elt+0.174
 Bye+0.173
 MacDonald+0.171
Neg logit contributions
 bad-0.157
 not-0.141
 broken-0.139
 well-0.138
 sad-0.136
Token sea
Token ID5417
Feature activation-4.49e-02

Pos logit contributions
bye+0.130
icol+0.116
amb+0.115
 MacDonald+0.113
thank+0.108
Neg logit contributions
 mom-0.136
 friends-0.120
icky-0.119
 well-0.118
 best-0.116
Token and
Token ID290
Feature activation-2.47e-02

Pos logit contributions
bye+0.109
icol+0.108
 exp+0.107
amb+0.106
 sidelines+0.097
Neg logit contributions
 not-0.051
 wrong-0.033
 well-0.033
 bad-0.032
 okay-0.031
Token liked
Token ID8288
Feature activation-2.16e-02

Pos logit contributions
amb+0.054
bye+0.053
 exp+0.052
ard+0.051
 oceans+0.050
Neg logit contributions
 not-0.034
 bad-0.027
 worse-0.026
 wrong-0.025
ggy-0.025
Token to
Token ID284
Feature activation-2.66e-02

Pos logit contributions
bye+0.051
amb+0.048
 exp+0.048
ard+0.048
 plains+0.045
Neg logit contributions
 not-0.026
 bad-0.019
 worse-0.019
 mom-0.019
ggy-0.019
Token sorts
Token ID10524
Feature activation-1.88e-02

Pos logit contributions
 exp+0.012
bye+0.012
 oceans+0.012
 plains+0.011
amb+0.011
Neg logit contributions
 not-0.010
ggy-0.009
 infected-0.009
 worse-0.009
ussy-0.009
Token of
Token ID286
Feature activation-3.52e-03

Pos logit contributions
 exp+0.039
bye+0.038
 oceans+0.037
 plains+0.036
amb+0.034
Neg logit contributions
 not-0.024
ggy-0.023
 worse-0.022
 infected-0.022
ussy-0.021
Token adventures
Token ID17545
Feature activation-6.83e-03

Pos logit contributions
 exp+0.007
bye+0.007
 oceans+0.007
 plains+0.006
amb+0.006
Neg logit contributions
 not-0.005
ggy-0.004
 infected-0.004
 worse-0.004
ussy-0.004
Token in
Token ID287
Feature activation-1.62e-02

Pos logit contributions
 exp+0.014
bye+0.014
 oceans+0.013
amb+0.013
 plains+0.013
Neg logit contributions
 not-0.009
ggy-0.008
 worse-0.008
 infected-0.008
 homework-0.008
Token the
Token ID262
Feature activation-4.40e-02

Pos logit contributions
 exp+0.035
 oceans+0.035
bye+0.035
 plains+0.033
amb+0.032
Neg logit contributions
 not-0.018
ggy-0.017
 infected-0.017
 worse-0.017
ussy-0.016
Token vast
Token ID5909
Feature activation-1.08e-01

Pos logit contributions
amb+0.076
bye+0.070
 exp+0.065
elt+0.061
 goodness+0.061
Neg logit contributions
 bad-0.079
 wrong-0.072
 worse-0.072
 not-0.069
 chores-0.068
Token ocean
Token ID9151
Feature activation-2.47e-02

Pos logit contributions
amb+0.171
elt+0.159
bye+0.157
 goodness+0.143
thank+0.138
Neg logit contributions
 bad-0.181
 mom-0.171
 well-0.167
 broken-0.163
 wrong-0.160
Token.
Token ID13
Feature activation+1.21e-02

Pos logit contributions
amb+0.062
 exp+0.062
bye+0.062
 oceans+0.058
icol+0.058
Neg logit contributions
 not-0.023
ggy-0.018
 bad-0.018
 worse-0.018
 homework-0.017
Token She
Token ID1375
Feature activation-2.34e-02

Pos logit contributions
 not+0.022
ggy+0.021
 infected+0.021
 worse+0.021
ussy+0.021
Neg logit contributions
 exp-0.017
 oceans-0.017
bye-0.017
 plains-0.016
amb-0.015
Token was
Token ID373
Feature activation+8.03e-03

Pos logit contributions
bye+0.046
 exp+0.046
 oceans+0.045
amb+0.042
 plains+0.042
Neg logit contributions
 not-0.032
ggy-0.029
 worse-0.028
 infected-0.028
 bad-0.027
Token happy
Token ID3772
Feature activation-1.96e-02

Pos logit contributions
 infected+0.010
ggy+0.010
ussy+0.010
 worse+0.010
 homework+0.009
Neg logit contributions
 oceans-0.015
bye-0.015
 exp-0.015
 plains-0.014
 lakes-0.014
Token.
Token ID13
Feature activation-3.40e-02

Pos logit contributions
 exp+0.068
 oceans+0.067
bye+0.064
icol+0.062
 lakes+0.061
Neg logit contributions
ggy-0.028
 not-0.028
 wrong-0.022
 bad-0.020
ggie-0.019
Token He
Token ID679
Feature activation-3.15e-02

Pos logit contributions
bye+0.067
 oceans+0.063
 exp+0.062
 breeze+0.058
icol+0.058
Neg logit contributions
 not-0.045
 mom-0.040
 okay-0.039
 Lily-0.039
ggy-0.037
Token lived
Token ID5615
Feature activation+3.24e-03

Pos logit contributions
bye+0.068
 oceans+0.067
 exp+0.066
amb+0.063
umm+0.061
Neg logit contributions
 not-0.041
ggy-0.037
 bad-0.031
 infected-0.031
 mom-0.030
Token in
Token ID287
Feature activation-2.73e-02

Pos logit contributions
ggy+0.004
 infected+0.004
 worse+0.004
ussy+0.004
 not+0.004
Neg logit contributions
 oceans-0.006
 exp-0.006
bye-0.006
 plains-0.006
amb-0.006
Token the
Token ID262
Feature activation-5.65e-02

Pos logit contributions
bye+0.067
 exp+0.067
 oceans+0.065
amb+0.063
 plains+0.062
Neg logit contributions
 not-0.024
ggy-0.021
 infected-0.021
 worse-0.020
 bad-0.019
Token deep
Token ID2769
Feature activation-1.08e-01

Pos logit contributions
amb+0.059
bye+0.059
 exp+0.055
 oceans+0.051
ard+0.044
Neg logit contributions
 not-0.125
 bad-0.122
 wrong-0.119
 worse-0.118
ggy-0.118
Token blue
Token ID4171
Feature activation-8.22e-02

Pos logit contributions
bye+0.175
amb+0.172
thank+0.154
 Bye+0.154
 goodness+0.153
Neg logit contributions
 bad-0.166
 not-0.151
 mom-0.150
 well-0.149
 broken-0.149
Token sea
Token ID5417
Feature activation-4.29e-02

Pos logit contributions
bye+0.125
thank+0.108
 MacDonald+0.105
icol+0.104
amb+0.103
Neg logit contributions
 mom-0.142
 friends-0.125
icky-0.124
 well-0.122
 case-0.121
Token.
Token ID13
Feature activation-2.17e-02

Pos logit contributions
bye+0.105
icol+0.105
 exp+0.104
amb+0.100
umm+0.094
Neg logit contributions
 not-0.049
 well-0.031
 wrong-0.031
 mom-0.029
 bad-0.029
Token He
Token ID679
Feature activation-3.19e-02

Pos logit contributions
bye+0.041
 exp+0.039
 oceans+0.038
amb+0.036
ard+0.035
Neg logit contributions
 not-0.031
ggy-0.029
ussy-0.028
 infected-0.028
 mom-0.027
Token had
Token ID550
Feature activation-2.17e-02

Pos logit contributions
bye+0.074
amb+0.071
 exp+0.071
umm+0.070
 oceans+0.070
Neg logit contributions
 not-0.038
ggy-0.030
 bad-0.028
 also-0.027
 ruined-0.026
Token loved
Token ID6151
Feature activation-2.85e-02

Pos logit contributions
 oceans+0.035
 exp+0.035
bye+0.033
 plains+0.032
amb+0.032
Neg logit contributions
 not-0.023
ggy-0.022
 bad-0.020
 infected-0.019
 worse-0.019
Token to
Token ID284
Feature activation-3.63e-02

Pos logit contributions
ard+0.073
 exp+0.072
bye+0.071
icol+0.070
amb+0.069
Neg logit contributions
 not-0.026
 mom-0.019
 bad-0.018
 chores-0.015
 wrong-0.014
Token swim
Token ID9422
Feature activation-1.62e-02

Pos logit contributions
umm+0.082
 oceans+0.080
ard+0.078
icol+0.077
ky+0.077
Neg logit contributions
 not-0.046
ggy-0.040
 chores-0.036
 mom-0.036
aches-0.036
Token in
Token ID287
Feature activation-3.24e-02

Pos logit contributions
 exp+0.037
 oceans+0.036
bye+0.036
 plains+0.034
ard+0.034
Neg logit contributions
 not-0.017
ggy-0.015
 worse-0.014
 bad-0.014
 infected-0.013
Token the
Token ID262
Feature activation-6.32e-02

Pos logit contributions
bye+0.093
amb+0.090
elt+0.086
ard+0.086
 exp+0.084
Neg logit contributions
 not-0.022
 bad-0.017
 mom-0.016
 chores-0.015
 worse-0.011
Token deep
Token ID2769
Feature activation-1.07e-01

Pos logit contributions
amb+0.077
bye+0.070
 exp+0.068
icol+0.065
 zone+0.064
Neg logit contributions
 bad-0.131
 not-0.123
 mom-0.122
 wrong-0.122
 worse-0.117
Token blue
Token ID4171
Feature activation-8.95e-02

Pos logit contributions
amb+0.164
 goodness+0.161
bye+0.159
elt+0.153
 Bye+0.151
Neg logit contributions
 bad-0.160
 well-0.153
 big-0.151
 mom-0.146
 not-0.139
Token sea
Token ID5417
Feature activation-3.09e-02

Pos logit contributions
icol+0.135
 la+0.126
istence+0.123
 sidelines+0.122
bye+0.121
Neg logit contributions
 mom-0.145
 best-0.131
 well-0.131
 not-0.130
 chores-0.129
Token.
Token ID13
Feature activation-1.94e-02

Pos logit contributions
icol+0.086
 exp+0.086
bye+0.084
 sidelines+0.081
 course+0.080
Neg logit contributions
 not-0.022
 well-0.012
ggy-0.011
 wrong-0.011
 bad-0.011
Token One
Token ID1881
Feature activation-2.34e-02

Pos logit contributions
bye+0.042
 oceans+0.040
amb+0.040
 goodness+0.039
 exp+0.039
Neg logit contributions
 not-0.022
 mom-0.018
 chores-0.017
 homework-0.016
 Lily-0.016
Token day
Token ID1110
Feature activation+6.80e-03

Pos logit contributions
 exp+0.037
 oceans+0.037
bye+0.035
 plains+0.033
amb+0.033
Neg logit contributions
 not-0.040
ggy-0.039
 worse-0.037
 bad-0.037
 infected-0.036
Token a
Token ID257
Feature activation-2.22e-02

Pos logit contributions
amb+0.084
bye+0.082
 exp+0.080
ard+0.079
 plains+0.075
Neg logit contributions
 not-0.036
 worse-0.030
 okay-0.030
 bad-0.029
 wrong-0.029
Token small
Token ID1402
Feature activation-6.20e-02

Pos logit contributions
amb+0.049
bye+0.044
 exp+0.044
 oceans+0.043
elt+0.042
Neg logit contributions
 bad-0.029
 not-0.027
 mom-0.025
 worse-0.025
 wrong-0.024
Token dot
Token ID16605
Feature activation-3.07e-02

Pos logit contributions
amb+0.122
Now+0.106
 exp+0.105
bye+0.104
 oceans+0.104
Neg logit contributions
 bad-0.090
 mom-0.086
 sad-0.082
 wrong-0.080
 broken-0.080
Token in
Token ID287
Feature activation-5.27e-02

Pos logit contributions
 exp+0.074
 oceans+0.070
bye+0.069
 plains+0.067
amb+0.067
Neg logit contributions
ggy-0.032
 not-0.031
 worse-0.027
 wrong-0.026
 anywhere-0.026
Token the
Token ID262
Feature activation-5.92e-02

Pos logit contributions
bye+0.129
amb+0.129
 exp+0.125
ard+0.122
 oceans+0.119
Neg logit contributions
 not-0.049
 worse-0.042
 bad-0.041
 infected-0.040
 mom-0.038
Token dark
Token ID3223
Feature activation-1.06e-01

Pos logit contributions
amb+0.103
elt+0.077
icol+0.076
 exp+0.075
bye+0.073
Neg logit contributions
 bad-0.111
 wrong-0.109
 broken-0.103
 missing-0.100
 worse-0.099
Token night
Token ID1755
Feature activation-7.28e-02

Pos logit contributions
amb+0.217
 la+0.195
 sidelines+0.191
 goodness+0.190
bye+0.186
Neg logit contributions
 broken-0.117
 either-0.115
 bad-0.113
 not-0.113
 hurt-0.111
Token sky
Token ID6766
Feature activation-4.38e-02

Pos logit contributions
amb+0.138
bye+0.123
 exp+0.118
 course+0.110
 oceans+0.106
Neg logit contributions
 not-0.113
ggy-0.103
 bad-0.101
 worse-0.101
 mom-0.099
Token.
Token ID13
Feature activation-1.23e-03

Pos logit contributions
 exp+0.114
amb+0.113
bye+0.111
 oceans+0.109
 plains+0.107
Neg logit contributions
 not-0.034
ggy-0.027
 worse-0.026
 wrong-0.024
 okay-0.022
Token Soon
Token ID15894
Feature activation-7.68e-03

Pos logit contributions
 exp+0.003
 oceans+0.002
bye+0.002
ard+0.002
 plains+0.002
Neg logit contributions
 not-0.001
 okay-0.001
 infected-0.001
 mom-0.001
ggy-0.001
Token after
Token ID706
Feature activation-3.05e-02

Pos logit contributions
 exp+0.015
 oceans+0.015
bye+0.015
 plains+0.014
 lakes+0.013
Neg logit contributions
 not-0.010
ggy-0.010
 infected-0.010
 worse-0.009
ussy-0.009
Token.
Token ID13
Feature activation-2.96e-02

Pos logit contributions
bye+0.073
amb+0.069
 exp+0.069
 plains+0.066
 oceans+0.066
Neg logit contributions
 not-0.025
 mom-0.020
 bad-0.019
 worse-0.019
 wrong-0.018
Token He
Token ID679
Feature activation-2.08e-02

Pos logit contributions
bye+0.048
 oceans+0.047
 exp+0.047
amb+0.044
 plains+0.042
Neg logit contributions
 not-0.050
ggy-0.046
 infected-0.045
 worse-0.045
ussy-0.044
Token could
Token ID714
Feature activation+4.18e-03

Pos logit contributions
bye+0.041
 exp+0.039
 oceans+0.039
amb+0.038
 plains+0.036
Neg logit contributions
 not-0.030
ggy-0.027
 infected-0.026
 worse-0.026
 bad-0.025
Token see
Token ID766
Feature activation-1.17e-02

Pos logit contributions
 not+0.005
ggy+0.005
 worse+0.004
 infected+0.004
 okay+0.004
Neg logit contributions
 oceans-0.009
bye-0.009
 exp-0.009
 plains-0.009
ard-0.008
Token a
Token ID257
Feature activation-1.56e-02

Pos logit contributions
bye+0.026
 exp+0.026
 oceans+0.026
amb+0.024
 plains+0.024
Neg logit contributions
 not-0.012
ggy-0.011
 worse-0.011
 infected-0.011
ussy-0.010
Token vast
Token ID5909
Feature activation-1.06e-01

Pos logit contributions
bye+0.030
 exp+0.030
 oceans+0.030
 plains+0.028
amb+0.028
Neg logit contributions
 not-0.021
ggy-0.020
 infected-0.020
 worse-0.020
ussy-0.019
Token valley
Token ID19272
Feature activation-6.23e-02

Pos logit contributions
amb+0.188
bye+0.176
elt+0.171
thank+0.150
ursion+0.145
Neg logit contributions
 bad-0.165
 not-0.163
 mom-0.157
 well-0.156
 broken-0.152
Token below
Token ID2174
Feature activation-5.10e-02

Pos logit contributions
amb+0.146
 exp+0.145
bye+0.143
icol+0.139
 oceans+0.134
Neg logit contributions
 not-0.073
 wrong-0.060
 worse-0.053
 anywhere-0.052
ggy-0.052
Token.
Token ID13
Feature activation-2.50e-02

Pos logit contributions
 exp+0.118
bye+0.115
amb+0.114
icol+0.112
 oceans+0.112
Neg logit contributions
 not-0.052
 worse-0.046
ggy-0.045
 wrong-0.044
 bad-0.042
Token He
Token ID679
Feature activation-2.98e-02

Pos logit contributions
 exp+0.042
bye+0.042
 oceans+0.042
amb+0.038
 plains+0.037
Neg logit contributions
 not-0.041
ggy-0.038
 infected-0.036
ussy-0.035
 okay-0.035
Token decided
Token ID3066
Feature activation-2.29e-02

Pos logit contributions
bye+0.064
amb+0.059
 exp+0.059
umm+0.059
 oceans+0.058
Neg logit contributions
 not-0.039
ggy-0.032
 also-0.031
abella-0.031
 mom-0.031
Token tw
Token ID665
Feature activation-4.32e-02

Pos logit contributions
bye+0.032
 exp+0.031
 oceans+0.031
amb+0.030
 plains+0.029
Neg logit contributions
 not-0.016
 worse-0.014
ggy-0.013
 infected-0.013
 bad-0.013
Tokenink
Token ID676
Feature activation-3.31e-02

Pos logit contributions
bye+0.086
 exp+0.083
 oceans+0.082
 plains+0.076
ard+0.076
Neg logit contributions
 not-0.058
ggy-0.056
 infected-0.053
aches-0.050
ussy-0.047
Tokenled
Token ID992
Feature activation-3.95e-02

Pos logit contributions
 exp+0.073
icol+0.070
 oceans+0.070
bye+0.070
umm+0.068
Neg logit contributions
 not-0.036
ggy-0.032
 bad-0.031
 wrong-0.031
 worse-0.030
Token over
Token ID625
Feature activation-3.54e-02

Pos logit contributions
bye+0.101
 exp+0.097
amb+0.096
ard+0.094
 plains+0.092
Neg logit contributions
 not-0.037
 okay-0.029
 worse-0.027
 bad-0.027
 wrong-0.027
Token the
Token ID262
Feature activation-4.67e-02

Pos logit contributions
amb+0.104
bye+0.100
icol+0.097
round+0.095
 MacDonald+0.094
Neg logit contributions
 mom-0.019
 not-0.019
 bad-0.014
 on-0.011
 missing-0.011
Token dark
Token ID3223
Feature activation-1.06e-01

Pos logit contributions
amb+0.064
bye+0.059
 course+0.057
 exp+0.056
 manner+0.054
Neg logit contributions
 mom-0.086
 not-0.086
 bad-0.084
 missing-0.080
 well-0.080
Token town
Token ID3240
Feature activation-5.46e-02

Pos logit contributions
amb+0.181
 sidelines+0.165
ursion+0.161
 course+0.159
Run+0.158
Neg logit contributions
 mom-0.154
 not-0.138
 bad-0.136
 well-0.134
 anymore-0.133
Token,
Token ID11
Feature activation-4.10e-02

Pos logit contributions
icol+0.122
amb+0.122
 exp+0.120
 course+0.118
bye+0.117
Neg logit contributions
 not-0.063
 well-0.047
 mom-0.046
 missing-0.045
 sad-0.045
Token glowing
Token ID21377
Feature activation-4.51e-02

Pos logit contributions
bye+0.100
amb+0.094
ard+0.091
 exp+0.089
 zone+0.086
Neg logit contributions
 not-0.058
 missing-0.042
 sad-0.042
 wrong-0.040
 bad-0.039
Token happily
Token ID18177
Feature activation-2.99e-02

Pos logit contributions
bye+0.091
amb+0.085
 exp+0.084
 sidelines+0.083
ursion+0.082
Neg logit contributions
 not-0.055
 bad-0.051
 worse-0.051
 well-0.048
 mom-0.048
Token.
Token ID13
Feature activation-2.29e-02

Pos logit contributions
 exp+0.078
icol+0.075
bye+0.073
 oceans+0.073
 course+0.072
Neg logit contributions
 not-0.028
 anywhere-0.018
 missing-0.018
 well-0.017
 wrong-0.017
Token was
Token ID373
Feature activation+5.57e-03

Pos logit contributions
 oceans+0.041
bye+0.041
 exp+0.040
 plains+0.038
amb+0.037
Neg logit contributions
 not-0.026
ggy-0.023
 mom-0.021
abella-0.021
 infected-0.021
Token running
Token ID2491
Feature activation-3.55e-02

Pos logit contributions
 infected+0.007
ggy+0.007
 worse+0.007
ussy+0.007
 homework+0.006
Neg logit contributions
 oceans-0.011
bye-0.010
 exp-0.010
 plains-0.010
 lakes-0.009
Token around
Token ID1088
Feature activation-2.64e-02

Pos logit contributions
 exp+0.074
bye+0.072
 oceans+0.071
 plains+0.067
amb+0.067
Neg logit contributions
 not-0.046
ggy-0.041
 worse-0.039
 bad-0.038
 wrong-0.037
Token on
Token ID319
Feature activation-3.73e-02

Pos logit contributions
bye+0.062
 exp+0.059
amb+0.058
ard+0.055
 oceans+0.055
Neg logit contributions
 not-0.032
 mom-0.025
ggy-0.024
 bad-0.024
 wrong-0.024
Token the
Token ID262
Feature activation-4.10e-02

Pos logit contributions
bye+0.095
amb+0.092
 exp+0.089
 oceans+0.085
ard+0.085
Neg logit contributions
 not-0.033
ggy-0.028
 chores-0.027
 bad-0.027
 worse-0.026
Token lush
Token ID37408
Feature activation-1.06e-01

Pos logit contributions
bye+0.073
amb+0.072
 exp+0.070
 oceans+0.069
icol+0.061
Neg logit contributions
 not-0.062
 bad-0.060
ggy-0.060
 wrong-0.058
 chores-0.057
Token grass
Token ID8701
Feature activation-4.41e-02

Pos logit contributions
amb+0.190
bye+0.187
elt+0.173
 zone+0.166
thank+0.162
Neg logit contributions
 bad-0.147
ggy-0.141
 not-0.137
 mom-0.130
 wrong-0.129
Token,
Token ID11
Feature activation-3.99e-02

Pos logit contributions
 exp+0.098
bye+0.093
 oceans+0.091
icol+0.090
amb+0.085
Neg logit contributions
 not-0.058
 wrong-0.045
ggy-0.043
 missing-0.042
 mom-0.041
Token enjoying
Token ID13226
Feature activation-3.16e-02

Pos logit contributions
bye+0.086
 exp+0.081
amb+0.079
 oceans+0.074
umm+0.073
Neg logit contributions
 not-0.062
 missing-0.047
 okay-0.046
 sad-0.044
 wrong-0.044
Token the
Token ID262
Feature activation-3.07e-02

Pos logit contributions
bye+0.085
amb+0.081
 exp+0.081
ard+0.079
 oceans+0.075
Neg logit contributions
 not-0.027
ggy-0.021
 bad-0.019
 chores-0.017
 mom-0.017
Token sunshine
Token ID34488
Feature activation-2.95e-02

Pos logit contributions
amb+0.052
 exp+0.051
bye+0.050
 oceans+0.049
 plains+0.046
Neg logit contributions
 bad-0.047
 chores-0.044
 not-0.044
 best-0.044
 good-0.042
Token tw
Token ID665
Feature activation-5.30e-02

Pos logit contributions
amb+0.139
bye+0.134
 exp+0.128
ard+0.127
 zone+0.121
Neg logit contributions
 not-0.074
 mom-0.059
 bad-0.057
 worse-0.056
 okay-0.052
Tokenink
Token ID676
Feature activation-3.53e-02

Pos logit contributions
bye+0.111
 exp+0.100
 oceans+0.099
 zone+0.095
 course+0.095
Neg logit contributions
 not-0.068
ggy-0.068
 infected-0.063
aches-0.062
ussy-0.059
Tokenled
Token ID992
Feature activation-4.52e-02

Pos logit contributions
 exp+0.075
bye+0.073
 oceans+0.071
amb+0.070
icol+0.068
Neg logit contributions
 not-0.041
ggy-0.039
 worse-0.039
 bad-0.037
 wrong-0.036
Token in
Token ID287
Feature activation-6.14e-02

Pos logit contributions
bye+0.098
 exp+0.093
amb+0.093
 plains+0.088
ard+0.086
Neg logit contributions
 not-0.055
 worse-0.051
 okay-0.049
 bad-0.048
 wrong-0.045
Token the
Token ID262
Feature activation-6.41e-02

Pos logit contributions
amb+0.169
bye+0.164
round+0.153
ard+0.151
 exp+0.148
Neg logit contributions
 mom-0.043
 not-0.041
 bad-0.037
 what-0.030
 good-0.029
Token dark
Token ID3223
Feature activation-1.06e-01

Pos logit contributions
amb+0.116
bye+0.099
 exp+0.097
round+0.093
elt+0.092
Neg logit contributions
 bad-0.096
 mom-0.093
 worse-0.090
 wrong-0.089
 better-0.087
Token sky
Token ID6766
Feature activation-5.39e-02

Pos logit contributions
amb+0.251
bye+0.240
 sidelines+0.236
Run+0.232
 course+0.231
Neg logit contributions
 mom-0.076
 broken-0.061
 bad-0.059
 not-0.058
 either-0.057
Token,
Token ID11
Feature activation-4.94e-02

Pos logit contributions
 exp+0.142
bye+0.138
amb+0.138
 oceans+0.132
 plains+0.129
Neg logit contributions
 not-0.042
 worse-0.032
ggy-0.032
 okay-0.028
 bad-0.027
Token it
Token ID340
Feature activation-2.41e-02

Pos logit contributions
bye+0.112
amb+0.106
 exp+0.099
ursion+0.096
ard+0.094
Neg logit contributions
 not-0.077
 okay-0.055
 missing-0.053
 worse-0.052
 mom-0.051
Token was
Token ID373
Feature activation-1.99e-02

Pos logit contributions
bye+0.055
 exp+0.054
amb+0.053
 oceans+0.053
 plains+0.050
Neg logit contributions
 not-0.027
 worse-0.024
ggy-0.023
 bad-0.022
 infected-0.022
Token so
Token ID523
Feature activation-1.78e-02

Pos logit contributions
bye+0.043
 exp+0.042
amb+0.041
 oceans+0.041
ard+0.040
Neg logit contributions
 not-0.025
 worse-0.022
 infected-0.022
 okay-0.021
ggy-0.021
Token colors
Token ID7577
Feature activation-3.07e-02

Pos logit contributions
 exp+0.007
 oceans+0.007
bye+0.007
amb+0.007
ard+0.006
Neg logit contributions
 not-0.007
ggy-0.006
 worse-0.006
 mom-0.006
 bad-0.006
Token shimmer
Token ID34493
Feature activation-4.39e-02

Pos logit contributions
 exp+0.059
bye+0.059
 oceans+0.057
amb+0.055
 plains+0.055
Neg logit contributions
 not-0.045
 worse-0.040
ggy-0.039
 infected-0.038
 bad-0.036
Tokening
Token ID278
Feature activation-4.70e-02

Pos logit contributions
 exp+0.091
bye+0.089
 oceans+0.085
 plains+0.083
amb+0.081
Neg logit contributions
 worse-0.053
 not-0.050
 okay-0.049
 infected-0.047
ussy-0.047
Token in
Token ID287
Feature activation-5.01e-02

Pos logit contributions
bye+0.105
 exp+0.102
amb+0.102
 plains+0.095
 oceans+0.094
Neg logit contributions
 not-0.052
 worse-0.050
 bad-0.047
 wrong-0.046
 missing-0.043
Token the
Token ID262
Feature activation-5.95e-02

Pos logit contributions
amb+0.129
bye+0.128
ard+0.119
 exp+0.119
 plains+0.113
Neg logit contributions
 not-0.042
 bad-0.038
 mom-0.036
 worse-0.034
 infected-0.032
Token dark
Token ID3223
Feature activation-1.06e-01

Pos logit contributions
amb+0.097
 exp+0.075
bye+0.074
 oceans+0.070
elt+0.067
Neg logit contributions
 bad-0.111
 wrong-0.103
 worse-0.103
 mom-0.100
 better-0.098
Token.
Token ID13
Feature activation-1.15e-02

Pos logit contributions
amb+0.218
 sidelines+0.194
bye+0.191
ursion+0.186
YES+0.185
Neg logit contributions
 mom-0.108
 bad-0.108
 broken-0.105
 either-0.100
 anymore-0.099
Token L
Token ID406
Feature activation+1.57e-02

Pos logit contributions
bye+0.023
 exp+0.023
 oceans+0.023
 plains+0.021
amb+0.021
Neg logit contributions
 not-0.015
ggy-0.014
 infected-0.014
 worse-0.013
ussy-0.013
Tokenola
Token ID5708
Feature activation-6.66e-03

Pos logit contributions
 infected+0.023
 not+0.023
 worse+0.022
ggy+0.022
 chores+0.021
Neg logit contributions
 exp-0.029
 oceans-0.027
bye-0.027
amb-0.026
ard-0.025
Token gasped
Token ID45236
Feature activation-3.43e-02

Pos logit contributions
 oceans+0.015
bye+0.015
 exp+0.014
umm+0.014
 plains+0.014
Neg logit contributions
 not-0.008
ggy-0.007
 mom-0.006
 also-0.006
 else-0.005
Token with
Token ID351
Feature activation-2.05e-02

Pos logit contributions
bye+0.073
 exp+0.070
 oceans+0.069
 plains+0.068
amb+0.065
Neg logit contributions
 not-0.045
 worse-0.040
 okay-0.038
ggy-0.038
 mom-0.037
Token a
Token ID257
Feature activation+1.48e-03

Pos logit contributions
 not+0.024
ggy+0.020
 okay+0.020
 infected+0.019
 bad+0.019
Neg logit contributions
 exp-0.049
bye-0.049
 oceans-0.049
amb-0.047
 plains-0.046
Token boat
Token ID8848
Feature activation-7.03e-03

Pos logit contributions
 not+0.002
ggy+0.002
 worse+0.002
 infected+0.002
 bad+0.002
Neg logit contributions
 oceans-0.003
bye-0.003
 exp-0.003
 plains-0.003
amb-0.003
Token ride
Token ID6594
Feature activation-1.92e-02

Pos logit contributions
 oceans+0.014
 exp+0.014
 plains+0.013
 valleys+0.013
 lakes+0.013
Neg logit contributions
 not-0.010
ggy-0.009
 anywhere-0.009
 okay-0.008
 wrong-0.008
Token in
Token ID287
Feature activation-3.89e-02

Pos logit contributions
 exp+0.038
 oceans+0.037
bye+0.035
icol+0.034
 plains+0.034
Neg logit contributions
 not-0.028
ggy-0.025
 wrong-0.023
 anywhere-0.023
 okay-0.022
Token the
Token ID262
Feature activation-5.93e-02

Pos logit contributions
bye+0.098
amb+0.095
 exp+0.093
elt+0.089
icol+0.089
Neg logit contributions
 not-0.037
 bad-0.028
 worse-0.028
ggy-0.028
 anywhere-0.027
Token deep
Token ID2769
Feature activation-1.06e-01

Pos logit contributions
amb+0.089
bye+0.076
 exp+0.076
elt+0.069
thank+0.066
Neg logit contributions
 bad-0.116
 wrong-0.112
 worse-0.105
 not-0.103
 better-0.102
Token lake
Token ID13546
Feature activation-4.20e-02

Pos logit contributions
amb+0.192
elt+0.184
 goodness+0.177
bye+0.174
owers+0.162
Neg logit contributions
 bad-0.147
 mom-0.131
 wrong-0.128
 well-0.128
 broken-0.128
Token".
Token ID1911
Feature activation+1.45e-03

Pos logit contributions
 exp+0.080
 oceans+0.080
icol+0.079
bye+0.078
amb+0.077
Neg logit contributions
 not-0.060
 wrong-0.049
ggy-0.048
 anywhere-0.047
 okay-0.046
Token Lucy
Token ID22162
Feature activation-1.93e-02

Pos logit contributions
 not+0.002
ggy+0.002
 infected+0.002
 worse+0.002
 okay+0.002
Neg logit contributions
 oceans-0.003
bye-0.003
 exp-0.003
 plains-0.002
 lakes-0.002
Token quickly
Token ID2952
Feature activation-3.39e-02

Pos logit contributions
 oceans+0.044
bye+0.043
 exp+0.041
 plains+0.040
 lakes+0.039
Neg logit contributions
 not-0.023
ggy-0.021
 mom-0.018
 infected-0.016
 okay-0.016
Token said
Token ID531
Feature activation+5.04e-03

Pos logit contributions
 oceans+0.067
bye+0.066
 exp+0.064
 plains+0.062
amb+0.060
Neg logit contributions
 not-0.047
ggy-0.044
 infected-0.040
 bad-0.040
 worse-0.040
Token boat
Token ID8848
Feature activation-2.78e-03

Pos logit contributions
bye+0.042
 exp+0.042
 oceans+0.041
amb+0.040
 plains+0.038
Neg logit contributions
 not-0.035
ggy-0.032
 worse-0.031
 infected-0.031
 bad-0.031
Token,
Token ID11
Feature activation-1.26e-02

Pos logit contributions
 exp+0.006
 oceans+0.006
bye+0.006
 lakes+0.006
 plains+0.006
Neg logit contributions
 not-0.004
ggy-0.003
 worse-0.003
 wrong-0.003
 infected-0.003
Token fishing
Token ID12478
Feature activation-2.36e-02

Pos logit contributions
 exp+0.026
bye+0.026
 oceans+0.026
 plains+0.024
amb+0.024
Neg logit contributions
 not-0.016
ggy-0.015
 worse-0.014
 infected-0.014
 okay-0.013
Token in
Token ID287
Feature activation-3.99e-02

Pos logit contributions
 exp+0.046
 oceans+0.045
bye+0.043
amb+0.041
 plains+0.040
Neg logit contributions
 not-0.035
ggy-0.033
 bad-0.030
 worse-0.030
 wrong-0.030
Token the
Token ID262
Feature activation-5.68e-02

Pos logit contributions
bye+0.095
 exp+0.094
amb+0.091
 oceans+0.091
 plains+0.087
Neg logit contributions
 not-0.039
ggy-0.034
 worse-0.033
 infected-0.033
 bad-0.032
Token deep
Token ID2769
Feature activation-1.05e-01

Pos logit contributions
amb+0.092
bye+0.085
 exp+0.082
 oceans+0.073
icol+0.073
Neg logit contributions
 bad-0.104
 not-0.100
 wrong-0.097
 worse-0.097
 mom-0.091
Token blue
Token ID4171
Feature activation-8.63e-02

Pos logit contributions
amb+0.180
bye+0.169
 goodness+0.164
 la+0.161
elt+0.157
Neg logit contributions
 bad-0.161
 broken-0.140
 mom-0.140
 not-0.137
 big-0.137
Token sea
Token ID5417
Feature activation-3.09e-02

Pos logit contributions
 la+0.141
bye+0.136
icol+0.133
amb+0.130
 tweet+0.126
Neg logit contributions
 mom-0.126
 chores-0.122
 not-0.120
 bad-0.118
 wrong-0.114
Token.
Token ID13
Feature activation-1.38e-02

Pos logit contributions
 exp+0.083
bye+0.082
amb+0.080
 oceans+0.080
icol+0.080
Neg logit contributions
 not-0.025
ggy-0.016
 wrong-0.014
 bad-0.014
 mom-0.014
Token Suddenly
Token ID24975
Feature activation-1.68e-03

Pos logit contributions
bye+0.020
 exp+0.020
 oceans+0.020
 plains+0.018
amb+0.018
Neg logit contributions
 not-0.025
ggy-0.024
 infected-0.024
 worse-0.023
ussy-0.023
Token,
Token ID11
Feature activation-3.86e-04

Pos logit contributions
 exp+0.003
bye+0.003
 oceans+0.003
 plains+0.003
amb+0.002
Neg logit contributions
 not-0.003
ggy-0.003
 infected-0.003
 worse-0.003
ussy-0.003
Token,
Token ID11
Feature activation-4.67e-02

Pos logit contributions
bye+0.024
 exp+0.024
 oceans+0.022
icol+0.022
 plains+0.021
Neg logit contributions
 not-0.010
 anywhere-0.008
ggy-0.008
 missing-0.008
 okay-0.007
Token through
Token ID832
Feature activation-5.06e-02

Pos logit contributions
bye+0.100
amb+0.093
ursion+0.086
 exp+0.085
umm+0.084
Neg logit contributions
 not-0.073
 okay-0.057
 bad-0.056
 missing-0.055
 mom-0.053
Token the
Token ID262
Feature activation-5.38e-02

Pos logit contributions
bye+0.130
amb+0.125
 exp+0.118
icol+0.112
ard+0.110
Neg logit contributions
 not-0.046
 bad-0.039
 mom-0.038
 worse-0.037
 anywhere-0.035
Token lush
Token ID37408
Feature activation-1.11e-01

Pos logit contributions
amb+0.100
bye+0.089
elt+0.082
 exp+0.081
thank+0.080
Neg logit contributions
 bad-0.086
 wrong-0.077
 well-0.075
 not-0.073
 missing-0.072
Token forests
Token ID17039
Feature activation-4.87e-02

Pos logit contributions
bye+0.216
amb+0.209
elt+0.186
thank+0.185
ursion+0.184
Neg logit contributions
 bad-0.138
 well-0.131
 big-0.127
 not-0.124
 fake-0.123
Token deep
Token ID2769
Feature activation-1.05e-01

Pos logit contributions
 exp+0.110
bye+0.109
amb+0.104
icol+0.100
 Bye+0.095
Neg logit contributions
 not-0.054
ggy-0.044
 wrong-0.042
 anywhere-0.042
 missing-0.041
Token in
Token ID287
Feature activation-6.92e-02

Pos logit contributions
bye+0.180
amb+0.171
 Bye+0.165
ursion+0.148
 sidelines+0.147
Neg logit contributions
 well-0.151
 up-0.146
 bad-0.145
 on-0.144
 inside-0.139
Token thought
Token ID1807
Feature activation-1.08e-02

Pos logit contributions
amb+0.204
bye+0.199
icol+0.182
 exp+0.181
elt+0.180
Neg logit contributions
 not-0.032
 mom-0.026
 bad-0.026
 anywhere-0.020
 worse-0.019
Token.
Token ID13
Feature activation-7.81e-03

Pos logit contributions
 exp+0.023
bye+0.022
 oceans+0.022
amb+0.021
 plains+0.021
Neg logit contributions
 not-0.014
ggy-0.013
 worse-0.012
 infected-0.012
 bad-0.011
Token He
Token ID679
Feature activation-2.38e-02

Pos logit contributions
bye+0.014
 exp+0.013
 oceans+0.013
 course+0.012
amb+0.012
Neg logit contributions
 not-0.012
 infected-0.011
ussy-0.011
ggy-0.011
 okay-0.011
Token admired
Token ID29382
Feature activation-6.29e-03

Pos logit contributions
bye+0.047
 exp+0.045
 oceans+0.043
amb+0.043
 goodness+0.041
Neg logit contributions
 not-0.033
ggy-0.030
 worse-0.028
 infected-0.028
abella-0.028
Token She
Token ID1375
Feature activation-6.07e-03

Pos logit contributions
 not+0.007
ggy+0.007
 infected+0.007
 worse+0.007
ussy+0.007
Neg logit contributions
 oceans-0.009
 exp-0.009
bye-0.009
 plains-0.008
 lakes-0.008
Token was
Token ID373
Feature activation+1.81e-02

Pos logit contributions
 oceans+0.013
bye+0.013
 exp+0.013
 plains+0.012
 lakes+0.012
Neg logit contributions
 not-0.007
ggy-0.007
 infected-0.006
 worse-0.006
ussy-0.006
Token so
Token ID523
Feature activation+2.53e-02

Pos logit contributions
 infected+0.029
ggy+0.029
ussy+0.029
 worse+0.028
 homework+0.028
Neg logit contributions
 exp-0.028
bye-0.028
 oceans-0.028
 plains-0.025
amb-0.024
Token proud
Token ID6613
Feature activation-3.33e-02

Pos logit contributions
ggy+0.032
 infected+0.032
 not+0.032
 worse+0.031
ussy+0.031
Neg logit contributions
bye-0.049
 exp-0.049
 oceans-0.048
 plains-0.045
amb-0.044
Token of
Token ID286
Feature activation+5.98e-03

Pos logit contributions
 exp+0.081
bye+0.080
 oceans+0.078
 plains+0.076
amb+0.073
Neg logit contributions
 not-0.033
ggy-0.027
 mom-0.027
 worse-0.026
 bad-0.025
Token her
Token ID607
Feature activation+5.23e-02

Pos logit contributions
 infected+0.008
ussy+0.007
ggy+0.007
 worse+0.007
 homework+0.007
Neg logit contributions
 oceans-0.012
 exp-0.012
 course-0.011
bye-0.011
 goodness-0.011
Token qu
Token ID627
Feature activation-2.20e-02

Pos logit contributions
 not+0.075
ussy+0.074
ggy+0.074
 infected+0.073
 worse+0.072
Neg logit contributions
 exp-0.094
bye-0.093
 oceans-0.091
 plains-0.084
ard-0.083
Tokenilt
Token ID2326
Feature activation+7.68e-03

Pos logit contributions
 exp+0.037
bye+0.036
 oceans+0.036
 plains+0.033
 lakes+0.032
Neg logit contributions
 not-0.035
ggy-0.035
 infected-0.034
ussy-0.034
 worse-0.034
Token,
Token ID11
Feature activation-8.72e-03

Pos logit contributions
 not+0.010
ggy+0.009
 worse+0.009
 bad+0.009
 infected+0.009
Neg logit contributions
 exp-0.015
 oceans-0.015
bye-0.015
 plains-0.015
amb-0.014
Token she
Token ID673
Feature activation-5.69e-03

Pos logit contributions
bye+0.017
 exp+0.017
 oceans+0.017
 plains+0.016
ard+0.016
Neg logit contributions
 not-0.014
 okay-0.012
 mom-0.012
 bad-0.011
 worse-0.011
Token showed
Token ID3751
Feature activation+2.05e-02

Pos logit contributions
 oceans+0.012
bye+0.011
 exp+0.011
 plains+0.011
 lakes+0.011
Neg logit contributions
 not-0.008
ggy-0.007
 worse-0.007
 infected-0.006
 bad-0.006
Token are
Token ID389
Feature activation+4.14e-02

Pos logit contributions
ggy+0.044
ussy+0.042
 infected+0.041
 not+0.041
 worse+0.041
Neg logit contributions
bye-0.054
 exp-0.052
 oceans-0.050
ard-0.047
 plains-0.047
Token very
Token ID845
Feature activation+3.76e-02

Pos logit contributions
ggy+0.054
ork+0.052
abella+0.052
orman+0.050
ves+0.049
Neg logit contributions
 oceans-0.071
 exp-0.068
bye-0.068
 breeze-0.064
 goodness-0.064
Token kind
Token ID1611
Feature activation-1.87e-02

Pos logit contributions
ggy+0.036
 not+0.035
 infected+0.034
ussy+0.034
 worse+0.033
Neg logit contributions
bye-0.084
 exp-0.083
 oceans-0.081
amb-0.077
 plains-0.077
Token to
Token ID284
Feature activation+1.35e-02

Pos logit contributions
 exp+0.044
 oceans+0.043
 plains+0.042
icol+0.040
 lakes+0.040
Neg logit contributions
 not-0.021
 worse-0.017
 okay-0.017
 mom-0.017
 bad-0.017
Token help
Token ID1037
Feature activation+2.51e-02

Pos logit contributions
 not+0.023
ggy+0.023
 infected+0.023
 worse+0.023
ussy+0.022
Neg logit contributions
bye-0.020
 exp-0.020
 oceans-0.020
 plains-0.018
amb-0.017
Token your
Token ID534
Feature activation+5.02e-02

Pos logit contributions
 not+0.043
ggy+0.043
 infected+0.043
 worse+0.042
ussy+0.041
Neg logit contributions
bye-0.038
 exp-0.037
 oceans-0.036
 plains-0.033
ard-0.032
Token sister
Token ID6621
Feature activation+6.01e-02

Pos logit contributions
ggy+0.026
ussy+0.026
 infected+0.025
 not+0.025
abella+0.023
Neg logit contributions
bye-0.134
 exp-0.132
 oceans-0.129
 plains-0.122
amb-0.121
Token.
Token ID13
Feature activation-9.29e-03

Pos logit contributions
 not+0.080
ggy+0.079
 infected+0.077
 worse+0.076
ussy+0.075
Neg logit contributions
 exp-0.113
bye-0.112
 oceans-0.111
 plains-0.103
ard-0.101
Token But
Token ID887
Feature activation+2.30e-02

Pos logit contributions
 oceans+0.015
 exp+0.014
bye+0.014
 plains+0.013
 lakes+0.013
Neg logit contributions
 not-0.017
 infected-0.016
 okay-0.015
ggy-0.015
 worse-0.015
Token you
Token ID345
Feature activation+2.19e-02

Pos logit contributions
 not+0.041
 worse+0.037
ggy+0.037
 infected+0.037
 okay+0.036
Neg logit contributions
 exp-0.037
 oceans-0.037
bye-0.036
 plains-0.034
amb-0.033
Token should
Token ID815
Feature activation+2.77e-02

Pos logit contributions
ggy+0.029
ussy+0.027
 infected+0.027
abella+0.027
 worse+0.027
Neg logit contributions
 exp-0.039
bye-0.039
 oceans-0.038
 course-0.036
ard-0.035
Token watching
Token ID4964
Feature activation-1.90e-03

Pos logit contributions
 exp+0.005
bye+0.005
 oceans+0.004
 plains+0.004
amb+0.004
Neg logit contributions
 not-0.003
ggy-0.002
 infected-0.002
 worse-0.002
ussy-0.002
Token and
Token ID290
Feature activation-1.51e-02

Pos logit contributions
 exp+0.004
bye+0.004
 oceans+0.004
 plains+0.004
amb+0.004
Neg logit contributions
 not-0.002
ggy-0.002
 infected-0.002
 worse-0.002
ussy-0.002
Token protecting
Token ID10192
Feature activation+4.35e-03

Pos logit contributions
bye+0.031
 exp+0.031
amb+0.030
 oceans+0.030
 plains+0.028
Neg logit contributions
 not-0.021
 worse-0.018
 infected-0.018
ggy-0.017
 bad-0.017
Token the
Token ID262
Feature activation-2.98e-03

Pos logit contributions
 infected+0.007
ussy+0.007
ggy+0.006
 homework+0.006
 worse+0.006
Neg logit contributions
 oceans-0.007
bye-0.007
 exp-0.007
 plains-0.007
 lakes-0.007
Token c
Token ID269
Feature activation-7.06e-03

Pos logit contributions
 exp+0.005
bye+0.005
 oceans+0.005
 plains+0.005
amb+0.005
Neg logit contributions
 not-0.004
ggy-0.004
 infected-0.004
 worse-0.004
 bad-0.004
Tokenactus
Token ID34144
Feature activation+4.75e-02

Pos logit contributions
 exp+0.011
bye+0.011
 oceans+0.011
 plains+0.010
amb+0.010
Neg logit contributions
 not-0.012
ggy-0.011
 infected-0.011
 worse-0.011
ussy-0.011
Token.
Token ID13
Feature activation-1.19e-02

Pos logit contributions
 infected+0.065
ggy+0.064
ussy+0.064
abella+0.063
 not+0.062
Neg logit contributions
bye-0.085
 exp-0.083
 oceans-0.083
 plains-0.078
ard-0.077
Token The
Token ID383
Feature activation+1.13e-02

Pos logit contributions
bye+0.023
 oceans+0.021
 exp+0.021
amb+0.020
 lakes+0.019
Neg logit contributions
 not-0.017
 infected-0.016
 mom-0.016
ussy-0.015
 okay-0.015
Token bee
Token ID20697
Feature activation-1.00e-02

Pos logit contributions
 not+0.017
ussy+0.017
 infected+0.017
ggy+0.017
 worse+0.017
Neg logit contributions
bye-0.019
 exp-0.019
 oceans-0.019
 plains-0.017
ard-0.017
Token buzz
Token ID14713
Feature activation-3.44e-02

Pos logit contributions
 oceans+0.021
 exp+0.021
bye+0.021
 plains+0.019
icol+0.019
Neg logit contributions
 not-0.013
ggy-0.012
 worse-0.011
 infected-0.011
 bad-0.011
Tokened
Token ID276
Feature activation-7.29e-03

Pos logit contributions
 exp+0.068
 oceans+0.068
bye+0.066
 plains+0.063
 lakes+0.061
Neg logit contributions
ggy-0.044
 not-0.044
 infected-0.042
 worse-0.041
 bad-0.039
Token\n
Token ID198
Feature activation+2.57e-02

Pos logit contributions
ussy+0.067
 homework+0.066
 worse+0.065
 infected+0.064
 chores+0.064
Neg logit contributions
bye-0.037
 exp-0.035
 oceans-0.034
amb-0.031
 plains-0.031
TokenMom
Token ID29252
Feature activation+1.52e-02

Pos logit contributions
 not+0.040
ggy+0.039
 infected+0.039
 worse+0.039
ussy+0.038
Neg logit contributions
bye-0.043
 exp-0.043
 oceans-0.042
 plains-0.039
amb-0.038
Token hugged
Token ID41130
Feature activation-3.75e-04

Pos logit contributions
 not+0.025
ggy+0.024
 infected+0.024
 worse+0.024
 chores+0.023
Neg logit contributions
 exp-0.024
 oceans-0.024
bye-0.024
 plains-0.022
ard-0.022
Token Lily
Token ID20037
Feature activation+4.63e-02

Pos logit contributions
 oceans+0.001
bye+0.001
 exp+0.001
 plains+0.001
amb+0.001
Neg logit contributions
ggy-0.000
 not-0.000
 infected-0.000
 worse-0.000
ussy-0.000
Token and
Token ID290
Feature activation-1.33e-02

Pos logit contributions
ggy+0.097
 infected+0.096
 not+0.096
 worse+0.094
ussy+0.093
Neg logit contributions
bye-0.051
 exp-0.050
 oceans-0.049
 plains-0.045
amb-0.045
Token Ben
Token ID3932
Feature activation+4.67e-02

Pos logit contributions
 exp+0.028
bye+0.028
 oceans+0.028
 plains+0.027
amb+0.026
Neg logit contributions
 not-0.015
ggy-0.014
 infected-0.014
 worse-0.014
ussy-0.013
Token.
Token ID13
Feature activation+2.27e-04

Pos logit contributions
ggy+0.085
ussy+0.083
 infected+0.083
 not+0.082
 worse+0.081
Neg logit contributions
bye-0.065
 oceans-0.063
 exp-0.063
 plains-0.058
amb-0.058
Token She
Token ID1375
Feature activation-5.61e-03

Pos logit contributions
 not+0.000
ggy+0.000
 infected+0.000
 worse+0.000
ussy+0.000
Neg logit contributions
bye-0.000
 oceans-0.000
 exp-0.000
 plains-0.000
amb-0.000
Token said
Token ID531
Feature activation+2.86e-02

Pos logit contributions
bye+0.008
 exp+0.008
 course+0.007
 oceans+0.007
 plains+0.007
Neg logit contributions
ussy-0.010
 chores-0.010
 homework-0.009
 worse-0.009
 infected-0.009
Token,
Token ID11
Feature activation-5.63e-04

Pos logit contributions
ussy+0.051
 infected+0.051
ggy+0.051
 not+0.049
 worse+0.049
Neg logit contributions
bye-0.040
 exp-0.039
 oceans-0.038
amb-0.034
 lakes-0.034
Token "
Token ID366
Feature activation-5.70e-02

Pos logit contributions
 exp+0.001
bye+0.001
 oceans+0.001
 plains+0.001
amb+0.001
Neg logit contributions
 not-0.001
ggy-0.001
 infected-0.001
 worse-0.001
ussy-0.001
Token new
Token ID649
Feature activation+1.12e-02

Pos logit contributions
 not+0.049
ggy+0.047
 infected+0.047
ussy+0.047
 worse+0.046
Neg logit contributions
 exp-0.081
bye-0.081
 oceans-0.080
 plains-0.076
amb-0.073
Token toy
Token ID13373
Feature activation+3.05e-02

Pos logit contributions
 not+0.011
ggy+0.011
 infected+0.010
 worse+0.010
ussy+0.010
Neg logit contributions
bye-0.025
 exp-0.025
 oceans-0.025
 plains-0.024
amb-0.023
Token soldier
Token ID10686
Feature activation+2.91e-02

Pos logit contributions
 not+0.038
ggy+0.036
 worse+0.036
 infected+0.036
ussy+0.033
Neg logit contributions
 exp-0.063
 oceans-0.063
bye-0.063
 plains-0.058
 lakes-0.057
Token.
Token ID13
Feature activation+5.49e-03

Pos logit contributions
 not+0.041
ggy+0.040
 infected+0.039
 worse+0.039
ussy+0.039
Neg logit contributions
 exp-0.053
bye-0.052
 oceans-0.052
 plains-0.049
ard-0.048
Token He
Token ID679
Feature activation+2.33e-02

Pos logit contributions
 not+0.009
ggy+0.008
 infected+0.008
 worse+0.008
ussy+0.008
Neg logit contributions
 oceans-0.009
bye-0.009
 exp-0.009
 plains-0.009
amb-0.008
Token was
Token ID373
Feature activation+4.42e-02

Pos logit contributions
ussy+0.043
 chores+0.042
 okay+0.040
 homework+0.040
oll+0.038
Neg logit contributions
ard-0.028
 exp-0.027
ky-0.026
 oceans-0.025
 plains-0.025
Token very
Token ID845
Feature activation+4.44e-02

Pos logit contributions
ussy+0.063
abella+0.061
orman+0.060
ggy+0.058
 chores+0.058
Neg logit contributions
 oceans-0.066
 exp-0.063
 tune-0.061
bye-0.061
 plains-0.060
Token sad
Token ID6507
Feature activation-1.03e-02

Pos logit contributions
ussy+0.049
abella+0.049
 anywhere+0.044
ggy+0.043
orman+0.042
Neg logit contributions
bye-0.086
 exp-0.085
 oceans-0.079
amb-0.076
 mast-0.076
Token and
Token ID290
Feature activation+1.55e-02

Pos logit contributions
 exp+0.020
bye+0.019
 oceans+0.019
 plains+0.018
 lakes+0.017
Neg logit contributions
 not-0.017
 worse-0.016
 bad-0.015
 infected-0.015
 okay-0.014
Token started
Token ID2067
Feature activation+9.44e-03

Pos logit contributions
ggy+0.019
ussy+0.019
 infected+0.018
 not+0.018
 chores+0.017
Neg logit contributions
bye-0.031
 exp-0.030
 oceans-0.030
 plains-0.027
amb-0.027
Token to
Token ID284
Feature activation-1.45e-03

Pos logit contributions
ussy+0.014
ggy+0.014
 infected+0.013
 worse+0.013
 not+0.013
Neg logit contributions
 exp-0.016
 oceans-0.016
bye-0.015
 plains-0.014
amb-0.014
Token disturb
Token ID17037
Feature activation+8.26e-03

Pos logit contributions
 not+0.006
ggy+0.005
 worse+0.005
 infected+0.005
 okay+0.005
Neg logit contributions
 oceans-0.007
 exp-0.007
bye-0.007
 plains-0.007
amb-0.007
Token any
Token ID597
Feature activation+3.73e-03

Pos logit contributions
ggy+0.011
 not+0.011
 infected+0.010
ussy+0.010
 worse+0.010
Neg logit contributions
 oceans-0.016
 exp-0.016
bye-0.016
 plains-0.015
 lakes-0.014
Token of
Token ID286
Feature activation+2.57e-03

Pos logit contributions
 not+0.004
ggy+0.004
 infected+0.004
 worse+0.004
ussy+0.004
Neg logit contributions
 exp-0.008
bye-0.008
 oceans-0.008
 plains-0.007
amb-0.007
Token their
Token ID511
Feature activation+2.35e-02

Pos logit contributions
 not+0.003
ggy+0.003
 worse+0.003
 infected+0.003
 bad+0.003
Neg logit contributions
bye-0.006
 exp-0.006
 oceans-0.005
amb-0.005
 plains-0.005
Token animal
Token ID5044
Feature activation+3.70e-02

Pos logit contributions
 not+0.033
 bad+0.031
ggy+0.030
 infected+0.030
 worse+0.030
Neg logit contributions
 exp-0.044
bye-0.044
 oceans-0.043
amb-0.043
 plains-0.041
Token friends
Token ID2460
Feature activation+2.28e-02

Pos logit contributions
 not+0.045
ggy+0.043
 infected+0.043
 worse+0.042
ussy+0.042
Neg logit contributions
 exp-0.075
bye-0.074
 oceans-0.074
 plains-0.069
ard-0.068
Token.
Token ID13
Feature activation+1.90e-02

Pos logit contributions
 not+0.028
ggy+0.026
 worse+0.026
 infected+0.025
ussy+0.024
Neg logit contributions
 exp-0.047
bye-0.047
 oceans-0.047
 plains-0.044
amb-0.043
Token<|endoftext|>
Token ID50256
Feature activation-1.58e-02

Pos logit contributions
 not+0.030
 worse+0.029
ggy+0.029
 infected+0.029
ussy+0.028
Neg logit contributions
bye-0.032
 exp-0.031
 oceans-0.031
 plains-0.029
amb-0.028
TokenOnce
Token ID7454
Feature activation+6.38e-02

Pos logit contributions
 oceans+0.021
 exp+0.021
bye+0.020
 plains+0.020
 lakes+0.018
Neg logit contributions
 infected-0.030
 worse-0.030
ussy-0.029
 not-0.029
 homework-0.028
Token upon
Token ID2402
Feature activation+3.61e-03

Pos logit contributions
 infected+0.124
ggy+0.123
 worse+0.122
ussy+0.118
 homework+0.116
Neg logit contributions
bye-0.076
 exp-0.073
 oceans-0.071
amb-0.064
 course-0.063
Token a
Token ID257
Feature activation+1.32e-02

Pos logit contributions
ussy+0.004
 infected+0.004
ggy+0.004
 not+0.004
 worse+0.004
Neg logit contributions
bye-0.008
 exp-0.008
 oceans-0.008
 plains-0.007
 lakes-0.007
Token\n
Token ID198
Feature activation+2.53e-02

Pos logit contributions
 not+0.014
 mom+0.013
ggy+0.012
 infected+0.012
ussy+0.012
Neg logit contributions
 oceans-0.022
bye-0.021
 exp-0.021
 lakes-0.020
 plains-0.020
Token\n
Token ID198
Feature activation+1.07e-02

Pos logit contributions
 worse+0.043
 infected+0.043
 not+0.042
ussy+0.042
 homework+0.040
Neg logit contributions
bye-0.039
 exp-0.038
 oceans-0.037
 plains-0.035
amb-0.035
TokenThey
Token ID2990
Feature activation+1.17e-02

Pos logit contributions
 not+0.016
ggy+0.015
 worse+0.014
 infected+0.014
ussy+0.014
Neg logit contributions
 exp-0.020
 oceans-0.020
bye-0.020
 plains-0.018
amb-0.018
Token searched
Token ID16499
Feature activation+6.74e-03

Pos logit contributions
ggy+0.023
ussy+0.022
 infected+0.022
 not+0.022
 worse+0.022
Neg logit contributions
 oceans-0.016
 exp-0.015
bye-0.015
 plains-0.014
 lakes-0.013
Token the
Token ID262
Feature activation+2.18e-03

Pos logit contributions
ggy+0.009
 not+0.009
 infected+0.009
 worse+0.009
ussy+0.009
Neg logit contributions
 oceans-0.013
 exp-0.013
bye-0.013
 plains-0.012
 lakes-0.011
Token park
Token ID3952
Feature activation-4.31e-03

Pos logit contributions
 not+0.004
ggy+0.004
 infected+0.004
 worse+0.004
ussy+0.003
Neg logit contributions
 exp-0.003
bye-0.003
 oceans-0.003
 plains-0.003
amb-0.003
Token quietly
Token ID12703
Feature activation-2.78e-02

Pos logit contributions
 exp+0.007
bye+0.007
 oceans+0.007
amb+0.007
 plains+0.007
Neg logit contributions
 not-0.007
ggy-0.006
 worse-0.006
 infected-0.006
 okay-0.006
Token and
Token ID290
Feature activation-5.20e-03

Pos logit contributions
bye+0.061
 exp+0.060
amb+0.056
 MacDonald+0.055
 course+0.054
Neg logit contributions
 not-0.034
 mom-0.030
 worse-0.029
 anywhere-0.029
 missing-0.028
Token finally
Token ID3443
Feature activation-1.80e-03

Pos logit contributions
 oceans+0.010
bye+0.010
 exp+0.009
 plains+0.009
 lakes+0.009
Neg logit contributions
ggy-0.007
 infected-0.007
 not-0.007
ussy-0.007
 worse-0.007
Token found
Token ID1043
Feature activation+3.78e-02

Pos logit contributions
 oceans+0.003
bye+0.003
 exp+0.003
 plains+0.002
ard+0.002
Neg logit contributions
ussy-0.003
 homework-0.003
 chores-0.003
 infected-0.003
ggy-0.003
Token the
Token ID262
Feature activation+2.78e-02

Pos logit contributions
ussy+0.065
 infected+0.062
 homework+0.062
ggy+0.062
 severe+0.058
Neg logit contributions
 oceans-0.059
bye-0.056
 course-0.055
 plains-0.054
 lakes-0.053
Token his
Token ID465
Feature activation+3.17e-02

Pos logit contributions
 not+0.017
ggy+0.016
 infected+0.016
 worse+0.016
ussy+0.015
Neg logit contributions
 exp-0.024
bye-0.024
 oceans-0.024
 plains-0.022
amb-0.022
Token story
Token ID1621
Feature activation+2.56e-02

Pos logit contributions
 not+0.041
ggy+0.038
 worse+0.038
 chores+0.038
 bad+0.037
Neg logit contributions
 oceans-0.066
 exp-0.065
bye-0.063
 plains-0.063
amb-0.062
Token.
Token ID13
Feature activation-6.37e-03

Pos logit contributions
 infected+0.035
 not+0.034
ggy+0.033
ussy+0.033
 worse+0.033
Neg logit contributions
bye-0.048
 exp-0.047
 oceans-0.047
 plains-0.043
ard-0.043
Token\n
Token ID198
Feature activation+1.60e-02

Pos logit contributions
 exp+0.013
 oceans+0.013
bye+0.012
ard+0.011
 plains+0.011
Neg logit contributions
 not-0.008
ggy-0.007
 mom-0.007
ussy-0.007
 infected-0.007
Token\n
Token ID198
Feature activation+5.92e-03

Pos logit contributions
 worse+0.019
 infected+0.018
ussy+0.018
 not+0.017
 homework+0.017
Neg logit contributions
bye-0.033
 exp-0.032
 oceans-0.032
 plains-0.031
amb-0.031
TokenAnn
Token ID18858
Feature activation-9.60e-03

Pos logit contributions
 not+0.010
ggy+0.010
 infected+0.009
 worse+0.009
ussy+0.009
Neg logit contributions
 exp-0.009
 oceans-0.009
bye-0.009
 plains-0.009
amb-0.008
Token went
Token ID1816
Feature activation-3.64e-03

Pos logit contributions
 oceans+0.020
 exp+0.019
 lakes+0.018
bye+0.018
 plains+0.018
Neg logit contributions
 not-0.012
ggy-0.012
 worse-0.010
 mom-0.010
 infected-0.010
Token home
Token ID1363
Feature activation+1.61e-02

Pos logit contributions
 exp+0.008
bye+0.008
 oceans+0.008
 plains+0.007
amb+0.007
Neg logit contributions
 not-0.004
ggy-0.004
 infected-0.004
 worse-0.004
ussy-0.004
Token that
Token ID326
Feature activation+1.03e-02

Pos logit contributions
 not+0.024
ggy+0.022
 infected+0.021
 worse+0.021
 okay+0.020
Neg logit contributions
 exp-0.031
 oceans-0.030
bye-0.030
 plains-0.028
amb-0.027
Token day
Token ID1110
Feature activation-3.73e-03

Pos logit contributions
 not+0.019
ggy+0.018
 infected+0.018
 worse+0.018
ussy+0.017
Neg logit contributions
 exp-0.015
bye-0.015
 oceans-0.015
 plains-0.014
amb-0.014
Token feeling
Token ID4203
Feature activation+2.20e-02

Pos logit contributions
 exp+0.008
 oceans+0.008
bye+0.008
amb+0.007
 plains+0.007
Neg logit contributions
 not-0.005
 okay-0.004
 bad-0.004
ggy-0.004
 worse-0.004
Token loved
Token ID6151
Feature activation-1.20e-02

Pos logit contributions
 not+0.009
ggy+0.008
 infected+0.008
 worse+0.008
ussy+0.008
Neg logit contributions
 oceans-0.014
 exp-0.014
bye-0.013
 plains-0.013
 lakes-0.013
Token to
Token ID284
Feature activation+3.11e-03

Pos logit contributions
bye+0.027
 exp+0.026
 oceans+0.026
amb+0.025
ard+0.025
Neg logit contributions
 not-0.015
 homework-0.012
ggy-0.012
 bad-0.012
 chores-0.012
Token play
Token ID711
Feature activation+1.30e-02

Pos logit contributions
 not+0.005
ggy+0.004
 chores+0.004
 homework+0.004
 mom+0.004
Neg logit contributions
 oceans-0.006
bye-0.006
 plains-0.006
ard-0.006
 exp-0.006
Token outside
Token ID2354
Feature activation-1.67e-02

Pos logit contributions
 not+0.016
ggy+0.016
 worse+0.015
 infected+0.014
 bad+0.014
Neg logit contributions
bye-0.028
 exp-0.027
 oceans-0.027
 plains-0.025
amb-0.025
Token with
Token ID351
Feature activation+7.26e-03

Pos logit contributions
bye+0.042
 exp+0.041
icol+0.040
amb+0.039
 oceans+0.038
Neg logit contributions
 not-0.017
 worse-0.014
 bad-0.013
 mom-0.012
 wrong-0.012
Token his
Token ID465
Feature activation+3.02e-02

Pos logit contributions
 not+0.007
ggy+0.007
 infected+0.007
 worse+0.007
ussy+0.006
Neg logit contributions
 exp-0.016
bye-0.016
 oceans-0.016
 plains-0.015
amb-0.015
Token friend
Token ID1545
Feature activation+2.67e-02

Pos logit contributions
 not+0.031
ggy+0.030
 infected+0.029
 worse+0.028
ussy+0.027
Neg logit contributions
 exp-0.068
bye-0.067
 oceans-0.067
 plains-0.063
amb-0.062
Token,
Token ID11
Feature activation+6.66e-03

Pos logit contributions
 not+0.041
ggy+0.041
 infected+0.040
 worse+0.040
ussy+0.039
Neg logit contributions
 exp-0.044
 oceans-0.044
bye-0.044
 plains-0.041
ard-0.040
Token Max
Token ID5436
Feature activation+2.91e-02

Pos logit contributions
ggy+0.008
 infected+0.008
ussy+0.008
 worse+0.008
 not+0.008
Neg logit contributions
 exp-0.013
 oceans-0.013
bye-0.013
 plains-0.012
 lakes-0.012
Token.
Token ID13
Feature activation+1.79e-03

Pos logit contributions
 not+0.036
ggy+0.036
 infected+0.035
 worse+0.034
ussy+0.034
Neg logit contributions
 exp-0.058
bye-0.058
 oceans-0.057
 plains-0.054
amb-0.053
Token One
Token ID1881
Feature activation-6.89e-03

Pos logit contributions
 not+0.003
ggy+0.003
 infected+0.003
 worse+0.003
ussy+0.003
Neg logit contributions
 exp-0.003
bye-0.003
 oceans-0.003
 plains-0.003
amb-0.003
Token and
Token ID290
Feature activation+6.70e-04

Pos logit contributions
 not+0.025
ggy+0.024
 worse+0.023
 infected+0.023
ussy+0.022
Neg logit contributions
 exp-0.033
bye-0.032
 oceans-0.032
 plains-0.030
amb-0.029
Token blink
Token ID21019
Feature activation-2.96e-03

Pos logit contributions
 not+0.001
ggy+0.001
 worse+0.001
 chores+0.001
 mom+0.001
Neg logit contributions
 oceans-0.001
bye-0.001
 exp-0.001
amb-0.001
ard-0.001
Token in
Token ID287
Feature activation-1.64e-02

Pos logit contributions
 exp+0.006
 oceans+0.006
bye+0.006
 plains+0.005
amb+0.005
Neg logit contributions
 not-0.004
ggy-0.004
 infected-0.004
 worse-0.004
ussy-0.003
Token the
Token ID262
Feature activation-2.97e-02

Pos logit contributions
bye+0.038
 exp+0.037
amb+0.037
 oceans+0.036
 plains+0.034
Neg logit contributions
 not-0.017
 worse-0.015
 infected-0.015
 bad-0.015
ggy-0.015
Token sunshine
Token ID34488
Feature activation-2.46e-02

Pos logit contributions
amb+0.047
 exp+0.043
bye+0.042
 oceans+0.039
icol+0.038
Neg logit contributions
 bad-0.057
 worse-0.054
 wrong-0.052
 infected-0.052
 chores-0.051
Token.
Token ID13
Feature activation-9.73e-03

Pos logit contributions
 exp+0.061
bye+0.059
 oceans+0.058
amb+0.056
 plains+0.056
Neg logit contributions
 not-0.023
 worse-0.019
ggy-0.018
 wrong-0.017
 mom-0.017
Token One
Token ID1881
Feature activation-5.61e-04

Pos logit contributions
bye+0.020
 exp+0.020
 oceans+0.020
amb+0.019
ard+0.018
Neg logit contributions
 not-0.012
 mom-0.010
ggy-0.010
 okay-0.010
 infected-0.009
Token day
Token ID1110
Feature activation+1.73e-04

Pos logit contributions
bye+0.001
 exp+0.001
 oceans+0.001
 plains+0.001
amb+0.001
Neg logit contributions
 not-0.001
 infected-0.001
ggy-0.001
 worse-0.001
ussy-0.001
Token,
Token ID11
Feature activation+1.06e-02

Pos logit contributions
 not+0.000
ggy+0.000
 infected+0.000
 worse+0.000
ussy+0.000
Neg logit contributions
 exp-0.000
bye-0.000
 oceans-0.000
 plains-0.000
amb-0.000
Token when
Token ID618
Feature activation+1.63e-02

Pos logit contributions
 not+0.011
 worse+0.009
ggy+0.009
 bad+0.009
 infected+0.009
Neg logit contributions
 exp-0.025
bye-0.025
 oceans-0.025
amb-0.024
 plains-0.023
Token the
Token ID262
Feature activation+1.93e-02

Pos logit contributions
 not+0.019
ggy+0.018
 infected+0.018
 worse+0.018
ussy+0.017
Neg logit contributions
 exp-0.034
bye-0.034
 oceans-0.034
 plains-0.032
amb-0.031
Token ran
Token ID4966
Feature activation-1.84e-02

Pos logit contributions
bye+0.018
 exp+0.018
 oceans+0.018
 plains+0.017
 lakes+0.016
Neg logit contributions
 infected-0.016
 worse-0.015
ggy-0.015
 not-0.015
ussy-0.015
Token around
Token ID1088
Feature activation-1.29e-02

Pos logit contributions
 exp+0.036
bye+0.036
 oceans+0.035
 plains+0.033
amb+0.032
Neg logit contributions
 not-0.025
ggy-0.023
 worse-0.022
 infected-0.022
 bad-0.021
Token and
Token ID290
Feature activation-2.26e-02

Pos logit contributions
bye+0.031
 exp+0.030
amb+0.029
 oceans+0.028
ard+0.028
Neg logit contributions
 not-0.014
ggy-0.012
 bad-0.011
 mom-0.010
 homework-0.010
Token around
Token ID1088
Feature activation-1.46e-02

Pos logit contributions
 exp+0.040
bye+0.040
 oceans+0.040
 plains+0.037
amb+0.036
Neg logit contributions
 not-0.033
ggy-0.032
 infected-0.032
 worse-0.032
ussy-0.031
Token,
Token ID11
Feature activation-2.17e-02

Pos logit contributions
bye+0.036
amb+0.035
 exp+0.035
 oceans+0.032
ard+0.032
Neg logit contributions
 not-0.015
 bad-0.013
ggy-0.012
 wrong-0.012
 mom-0.012
Token climbed
Token ID19952
Feature activation-1.96e-02

Pos logit contributions
bye+0.044
 exp+0.044
 oceans+0.042
amb+0.040
 plains+0.039
Neg logit contributions
 not-0.031
ggy-0.027
 okay-0.025
 worse-0.025
 wrong-0.024
Token trees
Token ID7150
Feature activation-5.82e-03

Pos logit contributions
bye+0.042
 exp+0.041
 oceans+0.040
amb+0.039
 plains+0.038
Neg logit contributions
 not-0.023
ggy-0.021
 worse-0.021
 infected-0.021
ussy-0.019
Token,
Token ID11
Feature activation-2.97e-02

Pos logit contributions
 exp+0.012
bye+0.011
 oceans+0.011
 plains+0.010
icol+0.010
Neg logit contributions
 not-0.008
ggy-0.007
 worse-0.007
 wrong-0.007
 anywhere-0.006
Token and
Token ID290
Feature activation-3.03e-02

Pos logit contributions
bye+0.056
 exp+0.055
amb+0.051
 zone+0.048
 manner+0.048
Neg logit contributions
 not-0.049
 worse-0.038
 bad-0.038
 missing-0.038
hes-0.037
Token laughed
Token ID13818
Feature activation-2.35e-02

Pos logit contributions
 exp+0.055
bye+0.055
 oceans+0.052
amb+0.051
 plains+0.049
Neg logit contributions
 not-0.045
ggy-0.042
 worse-0.042
 infected-0.042
ussy-0.040
Token and
Token ID290
Feature activation-3.32e-02

Pos logit contributions
 exp+0.049
bye+0.049
 oceans+0.048
 plains+0.045
ard+0.045
Neg logit contributions
 not-0.030
ggy-0.027
 okay-0.026
 worse-0.026
 bad-0.024
Token\n
Token ID198
Feature activation+7.88e-03

Pos logit contributions
 worse+0.020
 not+0.020
 infected+0.019
ussy+0.019
 homework+0.019
Neg logit contributions
bye-0.021
 exp-0.021
 oceans-0.020
amb-0.020
 plains-0.019
TokenOnce
Token ID7454
Feature activation+2.48e-02

Pos logit contributions
 not+0.013
ggy+0.012
 infected+0.012
 worse+0.012
ussy+0.012
Neg logit contributions
 exp-0.013
 oceans-0.013
bye-0.013
 plains-0.012
amb-0.012
Token it
Token ID340
Feature activation+1.64e-02

Pos logit contributions
ggy+0.037
 infected+0.035
 not+0.034
ussy+0.033
 worse+0.033
Neg logit contributions
bye-0.044
 exp-0.044
 oceans-0.043
 plains-0.040
amb-0.039
Token was
Token ID373
Feature activation+4.72e-04

Pos logit contributions
 not+0.020
 infected+0.020
ussy+0.020
ggy+0.019
 worse+0.019
Neg logit contributions
 exp-0.033
bye-0.032
 oceans-0.031
 plains-0.030
ard-0.029
Token all
Token ID477
Feature activation-2.42e-04

Pos logit contributions
ggy+0.001
 not+0.001
 infected+0.001
ussy+0.001
 worse+0.001
Neg logit contributions
 exp-0.001
bye-0.001
 oceans-0.001
 plains-0.001
amb-0.001
Token clean
Token ID3424
Feature activation-2.23e-02

Pos logit contributions
bye+0.001
 oceans+0.001
 exp+0.001
 plains+0.001
amb+0.001
Neg logit contributions
 not-0.000
 worse-0.000
 infected-0.000
 bad-0.000
ggy-0.000
Token,
Token ID11
Feature activation+1.09e-02

Pos logit contributions
bye+0.046
 exp+0.045
 oceans+0.043
amb+0.043
 plains+0.042
Neg logit contributions
 not-0.030
 worse-0.028
 bad-0.026
 okay-0.026
 wrong-0.026
Token she
Token ID673
Feature activation-1.05e-02

Pos logit contributions
 not+0.013
ggy+0.013
 infected+0.013
 worse+0.012
ussy+0.012
Neg logit contributions
 exp-0.022
 oceans-0.022
bye-0.022
 plains-0.021
 lakes-0.020
Token put
Token ID1234
Feature activation+5.20e-03

Pos logit contributions
bye+0.025
 oceans+0.024
 exp+0.023
 plains+0.023
amb+0.022
Neg logit contributions
 not-0.012
ggy-0.011
 infected-0.010
 worse-0.010
 mom-0.009
Token a
Token ID257
Feature activation+2.50e-02

Pos logit contributions
 not+0.005
ggy+0.005
 infected+0.005
 worse+0.005
ussy+0.005
Neg logit contributions
 exp-0.012
bye-0.011
 oceans-0.011
 plains-0.011
amb-0.011
Token new
Token ID649
Feature activation+7.45e-03

Pos logit contributions
 not+0.033
ggy+0.032
 infected+0.031
 worse+0.031
ussy+0.030
Neg logit contributions
 exp-0.049
bye-0.048
 oceans-0.048
 plains-0.045
amb-0.044
Token very
Token ID845
Feature activation+1.06e-03

Pos logit contributions
ggy+0.007
ussy+0.007
 infected+0.007
 worse+0.007
 homework+0.007
Neg logit contributions
 oceans-0.010
 exp-0.010
bye-0.010
 plains-0.010
 lakes-0.009
Token special
Token ID2041
Feature activation-1.29e-02

Pos logit contributions
 not+0.001
ggy+0.001
 infected+0.001
 worse+0.001
ussy+0.001
Neg logit contributions
 exp-0.002
bye-0.002
 oceans-0.002
 plains-0.002
amb-0.002
Token to
Token ID284
Feature activation-4.20e-03

Pos logit contributions
 exp+0.025
 oceans+0.025
bye+0.024
 plains+0.023
amb+0.022
Neg logit contributions
 not-0.019
ggy-0.017
 worse-0.016
 bad-0.016
 homework-0.016
Token L
Token ID406
Feature activation+1.70e-02

Pos logit contributions
 exp+0.009
bye+0.009
 oceans+0.009
 plains+0.009
amb+0.009
Neg logit contributions
 not-0.004
ggy-0.004
 infected-0.004
 worse-0.004
ussy-0.004
Tokenila
Token ID10102
Feature activation+3.46e-02

Pos logit contributions
 not+0.021
 infected+0.021
 worse+0.021
 chores+0.019
ggy+0.019
Neg logit contributions
 exp-0.034
bye-0.033
 oceans-0.032
amb-0.032
ard-0.031
Token,
Token ID11
Feature activation-1.88e-02

Pos logit contributions
 not+0.048
ggy+0.043
 infected+0.041
 homework+0.041
 worse+0.041
Neg logit contributions
 exp-0.069
bye-0.069
 oceans-0.069
 plains-0.063
amb-0.063
Token and
Token ID290
Feature activation+7.47e-03

Pos logit contributions
bye+0.038
 exp+0.037
 oceans+0.036
amb+0.034
 plains+0.033
Neg logit contributions
 not-0.030
 okay-0.025
 worse-0.023
 missing-0.023
ggy-0.023
Token it
Token ID340
Feature activation+1.24e-02

Pos logit contributions
 not+0.010
ggy+0.010
 infected+0.010
 worse+0.010
ussy+0.010
Neg logit contributions
bye-0.014
 exp-0.014
 oceans-0.014
 plains-0.013
amb-0.012
Token seemed
Token ID3947
Feature activation-8.33e-03

Pos logit contributions
 not+0.015
ggy+0.014
 worse+0.014
 infected+0.014
ussy+0.013
Neg logit contributions
 exp-0.026
bye-0.026
 oceans-0.025
 plains-0.024
amb-0.024
Token like
Token ID588
Feature activation-1.08e-02

Pos logit contributions
 exp+0.016
bye+0.016
 oceans+0.016
 plains+0.015
amb+0.015
Neg logit contributions
 not-0.011
ggy-0.010
 worse-0.010
 infected-0.010
ussy-0.010
Token nothing
Token ID2147
Feature activation+1.37e-02

Pos logit contributions
 exp+0.024
bye+0.024
 oceans+0.023
 plains+0.022
amb+0.022
Neg logit contributions
 not-0.012
ggy-0.011
 worse-0.011
 infected-0.011
ussy-0.010
Token dragon
Token ID10441
Feature activation-1.81e-02

Pos logit contributions
 exp+0.071
bye+0.071
 oceans+0.071
 plains+0.066
amb+0.063
Neg logit contributions
 not-0.067
ggy-0.065
 infected-0.065
 worse-0.064
ussy-0.063
Token made
Token ID925
Feature activation+1.63e-02

Pos logit contributions
 oceans+0.037
 exp+0.036
bye+0.035
 plains+0.034
amb+0.034
Neg logit contributions
 not-0.024
ggy-0.022
 worse-0.022
 infected-0.021
 bad-0.021
Token a
Token ID257
Feature activation+1.32e-02

Pos logit contributions
ussy+0.018
 infected+0.018
 not+0.018
ggy+0.018
 worse+0.017
Neg logit contributions
bye-0.034
 oceans-0.033
 exp-0.033
 plains-0.031
 lakes-0.030
Token fire
Token ID2046
Feature activation-2.85e-02

Pos logit contributions
 not+0.013
ggy+0.013
 infected+0.013
ussy+0.013
 worse+0.012
Neg logit contributions
 exp-0.029
bye-0.029
 oceans-0.029
 plains-0.027
 lakes-0.026
Token too
Token ID1165
Feature activation+6.98e-03

Pos logit contributions
 exp+0.060
 oceans+0.059
bye+0.057
 plains+0.056
amb+0.055
Neg logit contributions
 not-0.041
 worse-0.036
 bad-0.033
ggy-0.032
 wrong-0.032
Token close
Token ID1969
Feature activation-2.74e-02

Pos logit contributions
 not+0.008
 bad+0.007
 worse+0.007
ggy+0.007
 infected+0.007
Neg logit contributions
bye-0.016
 exp-0.016
 oceans-0.016
amb-0.015
 plains-0.015
Token to
Token ID284
Feature activation-2.21e-02

Pos logit contributions
 exp+0.062
 oceans+0.062
bye+0.059
 plains+0.059
amb+0.058
Neg logit contributions
 not-0.033
 worse-0.028
 wrong-0.028
 bad-0.025
 infected-0.025
Token the
Token ID262
Feature activation-3.83e-02

Pos logit contributions
 exp+0.049
amb+0.049
bye+0.048
 oceans+0.048
ard+0.045
Neg logit contributions
 not-0.027
 worse-0.024
ggy-0.023
 bad-0.023
 infected-0.022
Token edge
Token ID5743
Feature activation-3.55e-02

Pos logit contributions
amb+0.068
bye+0.065
 exp+0.063
 oceans+0.060
 plains+0.057
Neg logit contributions
 not-0.062
 bad-0.061
 worse-0.060
 wrong-0.059
 infected-0.056
Token of
Token ID286
Feature activation-3.19e-02

Pos logit contributions
 exp+0.085
icol+0.083
 oceans+0.082
amb+0.082
bye+0.079
Neg logit contributions
 not-0.036
 worse-0.032
ggy-0.031
 bad-0.029
 wrong-0.029
Token the
Token ID262
Feature activation-3.79e-02

Pos logit contributions
amb+0.086
bye+0.086
 exp+0.083
ard+0.079
 oceans+0.077
Neg logit contributions
 not-0.027
 bad-0.022
 worse-0.021
 infected-0.020
 mom-0.019
Token time
Token ID640
Feature activation+6.58e-03

Pos logit contributions
 exp+0.008
bye+0.008
 oceans+0.008
amb+0.008
 plains+0.008
Neg logit contributions
 not-0.004
ggy-0.004
 worse-0.004
 infected-0.004
 bad-0.004
Token,
Token ID11
Feature activation+6.87e-03

Pos logit contributions
 not+0.009
ggy+0.009
 infected+0.008
 worse+0.008
 okay+0.008
Neg logit contributions
 exp-0.013
bye-0.012
 oceans-0.012
 plains-0.011
amb-0.011
Token they
Token ID484
Feature activation-8.37e-03

Pos logit contributions
 not+0.010
ggy+0.009
 infected+0.009
 worse+0.009
ussy+0.009
Neg logit contributions
 oceans-0.012
bye-0.012
 exp-0.012
 plains-0.012
amb-0.011
Token found
Token ID1043
Feature activation+2.29e-02

Pos logit contributions
 exp+0.016
bye+0.016
 oceans+0.016
 plains+0.015
amb+0.015
Neg logit contributions
 not-0.011
ggy-0.011
 infected-0.010
 worse-0.010
ussy-0.010
Token him
Token ID683
Feature activation+4.61e-02

Pos logit contributions
ussy+0.029
 not+0.028
 infected+0.028
ggy+0.028
 homework+0.028
Neg logit contributions
 oceans-0.045
bye-0.045
 exp-0.043
 plains-0.042
 lakes-0.041
Token safe
Token ID3338
Feature activation-1.95e-02

Pos logit contributions
ussy+0.064
ggy+0.063
 not+0.062
 infected+0.062
 worse+0.061
Neg logit contributions
bye-0.081
 oceans-0.081
 exp-0.081
 plains-0.076
ard-0.074
Token and
Token ID290
Feature activation-3.92e-03

Pos logit contributions
bye+0.039
 exp+0.038
 oceans+0.037
amb+0.037
 plains+0.036
Neg logit contributions
 not-0.027
 worse-0.023
ggy-0.023
 bad-0.022
 infected-0.022
Token sound
Token ID2128
Feature activation-1.29e-02

Pos logit contributions
 exp+0.008
bye+0.008
 oceans+0.008
 plains+0.007
amb+0.007
Neg logit contributions
 not-0.005
ggy-0.005
 infected-0.005
 worse-0.005
ussy-0.005
Token on
Token ID319
Feature activation-1.68e-02

Pos logit contributions
 exp+0.026
bye+0.026
 oceans+0.026
 plains+0.024
amb+0.024
Neg logit contributions
 not-0.016
ggy-0.016
 worse-0.015
 infected-0.015
ussy-0.014
Token the
Token ID262
Feature activation-3.03e-02

Pos logit contributions
 exp+0.036
bye+0.036
 oceans+0.035
 plains+0.033
amb+0.033
Neg logit contributions
 not-0.019
ggy-0.018
 infected-0.018
 worse-0.018
ussy-0.017
Token other
Token ID584
Feature activation-4.64e-02

Pos logit contributions
bye+0.052
amb+0.051
 exp+0.049
 oceans+0.047
elt+0.046
Neg logit contributions
 not-0.049
 bad-0.049
 wrong-0.046
 worse-0.046
 chores-0.046
Token flying
Token ID7348
Feature activation-2.78e-02

Pos logit contributions
 not+0.002
 infected+0.002
 worse+0.002
 okay+0.002
ggy+0.002
Neg logit contributions
bye-0.005
 exp-0.005
 oceans-0.005
 plains-0.005
ard-0.005
Token by
Token ID416
Feature activation-2.86e-02

Pos logit contributions
 exp+0.065
bye+0.063
amb+0.062
 oceans+0.062
ard+0.061
Neg logit contributions
 not-0.030
 worse-0.027
 wrong-0.026
 bad-0.024
 infected-0.023
Token and
Token ID290
Feature activation-2.52e-02

Pos logit contributions
bye+0.066
 exp+0.063
amb+0.062
 oceans+0.061
ard+0.060
Neg logit contributions
 not-0.034
ggy-0.029
 worse-0.028
 bad-0.027
 infected-0.026
Token said
Token ID531
Feature activation+5.32e-03

Pos logit contributions
bye+0.050
 exp+0.049
amb+0.048
 oceans+0.048
 plains+0.045
Neg logit contributions
 not-0.036
 worse-0.033
 infected-0.032
ggy-0.031
 bad-0.030
Token,
Token ID11
Feature activation-2.70e-02

Pos logit contributions
 not+0.008
ggy+0.008
 worse+0.007
 infected+0.007
ussy+0.007
Neg logit contributions
 exp-0.010
bye-0.009
 oceans-0.009
 plains-0.009
amb-0.009
Token "
Token ID366
Feature activation-7.70e-02

Pos logit contributions
bye+0.065
 oceans+0.062
 exp+0.061
 plains+0.059
amb+0.058
Neg logit contributions
 not-0.033
 okay-0.027
 mom-0.025
 worse-0.024
ggy-0.024
TokenWhy
Token ID5195
Feature activation+1.30e-02

Pos logit contributions
 exp+0.146
icol+0.136
 mast+0.136
 oceans+0.135
 plains+0.130
Neg logit contributions
ggy-0.097
 worse-0.097
 not-0.096
 infected-0.095
abella-0.093
Token don
Token ID836
Feature activation+1.65e-02

Pos logit contributions
 infected+0.020
ussy+0.019
 worse+0.019
ggy+0.019
 not+0.019
Neg logit contributions
 exp-0.022
bye-0.022
 oceans-0.021
 plains-0.020
 lakes-0.019
Token't
Token ID470
Feature activation+3.83e-03

Pos logit contributions
 not+0.017
ggy+0.017
 infected+0.016
ussy+0.016
 worse+0.016
Neg logit contributions
 exp-0.035
bye-0.035
 oceans-0.035
 plains-0.033
ard-0.032
Token you
Token ID345
Feature activation-3.41e-03

Pos logit contributions
 not+0.007
 worse+0.007
ggy+0.007
 bad+0.007
 mom+0.006
Neg logit contributions
 oceans-0.006
 exp-0.006
 plains-0.006
bye-0.006
amb-0.005
Token forgive
Token ID20927
Feature activation-1.72e-02

Pos logit contributions
 oceans+0.007
 exp+0.007
 plains+0.007
bye+0.007
 lakes+0.006
Neg logit contributions
 not-0.005
ggy-0.004
 okay-0.004
 worse-0.004
 bad-0.004
Token proud
Token ID6613
Feature activation-3.79e-02

Pos logit contributions
 infected+0.015
ggy+0.015
 worse+0.015
ussy+0.015
 homework+0.014
Neg logit contributions
 exp-0.021
bye-0.021
 oceans-0.020
amb-0.019
 plains-0.019
Token of
Token ID286
Feature activation-2.26e-02

Pos logit contributions
 exp+0.088
bye+0.087
 oceans+0.085
 plains+0.082
amb+0.080
Neg logit contributions
 not-0.042
ggy-0.037
 worse-0.035
 mom-0.034
 bad-0.034
Token him
Token ID683
Feature activation+3.01e-02

Pos logit contributions
bye+0.043
 exp+0.043
 oceans+0.042
 plains+0.039
amb+0.039
Neg logit contributions
 not-0.033
ggy-0.031
 worse-0.029
 infected-0.029
ussy-0.029
Token and
Token ID290
Feature activation-1.51e-02

Pos logit contributions
 not+0.049
 infected+0.049
ggy+0.049
 worse+0.048
ussy+0.048
Neg logit contributions
bye-0.047
 exp-0.047
 oceans-0.046
 plains-0.043
ard-0.042
Token said
Token ID531
Feature activation+4.48e-03

Pos logit contributions
 exp+0.029
 oceans+0.029
bye+0.029
 plains+0.027
amb+0.027
Neg logit contributions
 not-0.021
ggy-0.020
 worse-0.019
 infected-0.019
 bad-0.018
Token "
Token ID366
Feature activation-7.71e-02

Pos logit contributions
 not+0.006
ggy+0.006
 worse+0.006
 okay+0.006
 infected+0.006
Neg logit contributions
 oceans-0.009
bye-0.009
 exp-0.008
 plains-0.008
amb-0.008
Tokenyes
Token ID8505
Feature activation-1.46e-02

Pos logit contributions
 exp+0.136
 mast+0.120
 oceans+0.119
icol+0.118
 plains+0.110
Neg logit contributions
 not-0.111
ussy-0.111
ggy-0.109
abella-0.107
 worse-0.105
Token,
Token ID11
Feature activation-3.03e-02

Pos logit contributions
bye+0.021
 exp+0.020
 oceans+0.019
ard+0.018
 plains+0.018
Neg logit contributions
 not-0.026
 infected-0.026
 worse-0.026
ggy-0.026
ussy-0.025
Token you
Token ID345
Feature activation-7.94e-03

Pos logit contributions
 oceans+0.055
bye+0.054
 exp+0.054
 plains+0.052
amb+0.051
Neg logit contributions
 not-0.051
 okay-0.044
ggy-0.044
 bad-0.042
 mom-0.042
Token guessed
Token ID25183
Feature activation+7.06e-04

Pos logit contributions
 oceans+0.015
 exp+0.015
bye+0.015
 plains+0.014
amb+0.014
Neg logit contributions
 not-0.011
ggy-0.011
 infected-0.010
 worse-0.010
ussy-0.010
Token it
Token ID340
Feature activation+4.57e-03

Pos logit contributions
 not+0.001
ggy+0.001
 worse+0.001
 okay+0.001
 infected+0.001
Neg logit contributions
bye-0.002
 exp-0.001
 oceans-0.001
amb-0.001
 plains-0.001
Token Suddenly
Token ID24975
Feature activation-6.03e-03

Pos logit contributions
bye+0.006
 oceans+0.006
 exp+0.006
 plains+0.006
 lakes+0.005
Neg logit contributions
 not-0.005
 infected-0.005
ggy-0.005
 okay-0.005
 worse-0.005
Token,
Token ID11
Feature activation-6.23e-03

Pos logit contributions
 exp+0.011
bye+0.011
 oceans+0.010
 plains+0.010
amb+0.010
Neg logit contributions
 not-0.009
ggy-0.009
 infected-0.009
 worse-0.009
ussy-0.008
Token she
Token ID673
Feature activation-1.66e-02

Pos logit contributions
 exp+0.013
bye+0.013
 oceans+0.013
 plains+0.012
amb+0.012
Neg logit contributions
 not-0.007
ggy-0.007
 infected-0.007
 worse-0.007
ussy-0.007
Token heard
Token ID2982
Feature activation-2.27e-02

Pos logit contributions
bye+0.034
 exp+0.033
 oceans+0.032
amb+0.031
 plains+0.030
Neg logit contributions
 not-0.022
ggy-0.021
 infected-0.021
 worse-0.020
 bad-0.019
Token a
Token ID257
Feature activation-3.06e-02

Pos logit contributions
 exp+0.052
bye+0.052
 oceans+0.052
 plains+0.049
amb+0.048
Neg logit contributions
 not-0.022
ggy-0.021
 infected-0.020
 worse-0.020
ussy-0.019
Token loud
Token ID7812
Feature activation-8.18e-02

Pos logit contributions
amb+0.063
bye+0.063
 oceans+0.061
 exp+0.059
 plains+0.058
Neg logit contributions
 not-0.041
 bad-0.041
 worse-0.039
ggy-0.039
 mom-0.035
Token qu
Token ID627
Feature activation-5.56e-02

Pos logit contributions
bye+0.185
 oceans+0.181
amb+0.175
 plains+0.173
 exp+0.170
Neg logit contributions
 not-0.099
 bad-0.092
ggy-0.092
 worse-0.085
 sad-0.080
Tokenack
Token ID441
Feature activation-3.71e-02

Pos logit contributions
bye+0.086
 oceans+0.085
 exp+0.084
 plains+0.076
 lakes+0.074
Neg logit contributions
ggy-0.097
 not-0.097
 worse-0.093
 infected-0.092
ussy-0.092
Token!
Token ID0
Feature activation-8.13e-03

Pos logit contributions
bye+0.082
 exp+0.082
 oceans+0.079
 plains+0.077
amb+0.075
Neg logit contributions
 not-0.045
ggy-0.042
 worse-0.040
 bad-0.036
 infected-0.034
Token Lucy
Token ID22162
Feature activation-7.16e-03

Pos logit contributions
bye+0.014
 exp+0.014
 oceans+0.014
 plains+0.013
amb+0.013
Neg logit contributions
 not-0.013
ggy-0.012
 infected-0.012
 worse-0.012
ussy-0.011
Token pe
Token ID613
Feature activation-2.06e-02

Pos logit contributions
bye+0.016
 oceans+0.015
 exp+0.015
 plains+0.014
amb+0.014
Neg logit contributions
 not-0.009
ggy-0.008
 infected-0.007
 worse-0.007
 mom-0.007
Token\n
Token ID198
Feature activation-2.35e-02

Pos logit contributions
 exp+0.036
bye+0.036
 oceans+0.036
 plains+0.034
amb+0.033
Neg logit contributions
 not-0.031
ggy-0.030
 worse-0.030
 infected-0.030
ussy-0.029
TokenThe
Token ID464
Feature activation-2.45e-02

Pos logit contributions
 exp+0.044
bye+0.043
 oceans+0.043
 plains+0.040
amb+0.039
Neg logit contributions
 not-0.033
ggy-0.032
 infected-0.032
 worse-0.031
ussy-0.030
Token queen
Token ID16599
Feature activation-4.84e-02

Pos logit contributions
 exp+0.046
 oceans+0.045
bye+0.045
amb+0.042
 plains+0.042
Neg logit contributions
 not-0.035
ggy-0.034
 worse-0.033
 bad-0.033
 infected-0.032
Token said
Token ID531
Feature activation-1.04e-02

Pos logit contributions
 exp+0.102
 oceans+0.102
bye+0.099
 plains+0.094
 lakes+0.092
Neg logit contributions
 not-0.060
ggy-0.057
 worse-0.050
 infected-0.050
 mom-0.049
Token
Token ID6184
Feature activation-6.62e-02

Pos logit contributions
 oceans+0.020
 exp+0.020
bye+0.020
 plains+0.019
amb+0.018
Neg logit contributions
 not-0.015
ggy-0.014
 worse-0.014
 infected-0.013
 okay-0.013
Token
Token ID95
Feature activation-7.26e-02

Pos logit contributions
 oceans+0.113
bye+0.108
 exp+0.108
 plains+0.102
 lakes+0.101
Neg logit contributions
ggy-0.105
 not-0.103
ussy-0.098
abella-0.092
 homework-0.092
Token
Token ID26391
Feature activation-7.48e-03

Pos logit contributions
 oceans+0.124
 exp+0.117
 lakes+0.113
bye+0.111
 plains+0.107
Neg logit contributions
ggy-0.119
 not-0.117
orman-0.109
ussy-0.109
abella-0.106
Token
Token ID129
Feature activation-6.80e-02

Pos logit contributions
 oceans+0.013
 exp+0.013
bye+0.013
 plains+0.012
amb+0.012
Neg logit contributions
 not-0.012
ggy-0.012
 infected-0.010
ussy-0.010
 okay-0.010
Token
Token ID241
Feature activation-7.04e-02

Pos logit contributions
 oceans+0.117
 exp+0.114
bye+0.113
 lakes+0.105
 plains+0.105
Neg logit contributions
 not-0.109
ggy-0.108
 infected-0.100
ussy-0.098
 worse-0.096
TokenYes
Token ID5297
Feature activation-1.49e-02

Pos logit contributions
 exp+0.134
 oceans+0.132
bye+0.122
 plains+0.122
 lakes+0.117
Neg logit contributions
 not-0.107
ggy-0.102
abella-0.094
orman-0.089
 okay-0.089
Token,
Token ID11
Feature activation-3.55e-02

Pos logit contributions
 oceans+0.029
 exp+0.028
bye+0.027
 plains+0.027
 lakes+0.026
Neg logit contributions
 not-0.022
ggy-0.021
 okay-0.020
 worse-0.020
 bad-0.019
Token:
Token ID25
Feature activation-3.19e-02

Pos logit contributions
 oceans+0.018
bye+0.018
 exp+0.018
 plains+0.017
amb+0.016
Neg logit contributions
 not-0.014
ggy-0.013
 worse-0.013
 infected-0.013
 okay-0.012
Token
Token ID6184
Feature activation-6.21e-02

Pos logit contributions
bye+0.069
 oceans+0.066
 exp+0.065
 plains+0.063
amb+0.062
Neg logit contributions
 not-0.040
ggy-0.034
 okay-0.033
 infected-0.031
 mom-0.031
Token
Token ID95
Feature activation-7.04e-02

Pos logit contributions
 oceans+0.102
bye+0.100
 exp+0.096
 plains+0.091
 lakes+0.089
Neg logit contributions
ggy-0.100
 not-0.094
 wrong-0.088
ussy-0.087
 infected-0.087
Token
Token ID26391
Feature activation-2.50e-03

Pos logit contributions
 oceans+0.118
 exp+0.112
bye+0.107
 lakes+0.105
amb+0.102
Neg logit contributions
ggy-0.116
 not-0.108
orman-0.103
ggie-0.098
ussy-0.097
Token
Token ID129
Feature activation-6.13e-02

Pos logit contributions
 exp+0.005
bye+0.005
 oceans+0.004
amb+0.004
 breeze+0.004
Neg logit contributions
 not-0.004
ggy-0.004
 infected-0.003
 okay-0.003
 mom-0.003
Token
Token ID241
Feature activation-7.33e-02

Pos logit contributions
 oceans+0.103
bye+0.100
 exp+0.099
 lakes+0.093
 plains+0.090
Neg logit contributions
 not-0.105
ggy-0.100
 infected-0.092
 bad-0.089
 worse-0.088
TokenThanks
Token ID9690
Feature activation-1.73e-02

Pos logit contributions
 oceans+0.141
 exp+0.140
bye+0.129
 plains+0.128
 mast+0.126
Neg logit contributions
 not-0.109
ggy-0.105
abella-0.095
orman-0.088
 okay-0.087
Token.
Token ID13
Feature activation-3.29e-02

Pos logit contributions
 oceans+0.034
 exp+0.034
bye+0.033
 plains+0.032
 lakes+0.030
Neg logit contributions
 not-0.025
ggy-0.023
 okay-0.021
 worse-0.021
 infected-0.021
Token I
Token ID314
Feature activation-1.41e-02

Pos logit contributions
bye+0.058
 oceans+0.056
 exp+0.055
 tune+0.053
 plains+0.051
Neg logit contributions
 not-0.048
 okay-0.047
ussy-0.044
 mom-0.043
abella-0.043
Token try
Token ID1949
Feature activation-4.18e-03

Pos logit contributions
 oceans+0.028
 exp+0.027
bye+0.027
 plains+0.026
 lakes+0.025
Neg logit contributions
 not-0.020
ggy-0.019
 infected-0.018
 worse-0.017
 okay-0.017
Token to
Token ID284
Feature activation-1.29e-02

Pos logit contributions
bye+0.008
 exp+0.008
 oceans+0.008
 plains+0.008
amb+0.008
Neg logit contributions
 not-0.006
ggy-0.005
 worse-0.005
 infected-0.005
 okay-0.005