TOP ACTIVATIONS
MAX = 0.156

a little bit ." Mom squee zes some shampoo on her
safe journey home . The sur fer waved back and said
. Lily loved the big Fer ris wheel the most .
." L ily squee zes his hand . "
Suddenly , the tr amp oline collapsed beneath her
sur fer . Whenever the sur fer balanced on the board
. But Sam squee zes the dinosaur too hard
car . He found a squee gee and some wax .
!" Mia says . She squee zes the bottle and water
ball . The boun cy ball was not at
. The boun cy ball was so scared
around in it !" The boun cy thing was so upset
small for Ben . He squee zes in , but it
that he had used the squee gee to good use !
!" Sam says . He squee zes the dinosaur harder .
uddle ! Splash ! The boun cy thing was having so
could try it . The sur fer said
She takes the orange and squee zes it hard . She
. The boun cy ball was so brave
soon nothing but a small spe ck in the sky .

MIN ACTIVATIONS
MIN = -0.129

key and wanted to show someone . But he was still
and went outside to show everyone his new sword . He
how your actions might affect others . <|endoftext|> Once upon a
bird who wanted to show everyone how beautiful her singing was
take it home to show everyone . He grabbed the trophy
big fancy wedding to show someone you love them . All
to be consider ate of others and not be too noisy
and just had to show someone . She ran inside and
is your chance to show everyone how brave you can be
's very easy to hurt someone 's feelings with a joke
and he wanted to protect everyone in his forest . So
. He wanted to show everyone how silly he was .
sh iver go down her spine . It was very cold
happy and wanted to show everyone how much she loved them
chat tered as he showed everyone things in the jungle .
he was about to show everyone . He jumped with excitement
new toy she had been longing for - a mixing machine
happiness and wanted to show everyone how delicious his steak was
went back home to show everyone what he had found .
that she wanted to show everyone . She smiled and showed

INTERVAL 0.085 - 0.156
CONTAINS 0.259%

full of lovely ducks some white
. It is dangerous . Come with me . I have
said , Next time you are
my : What do you want
on the swings and the slide . They are happy and

INTERVAL 0.013 - 0.085
CONTAINS 28.496%

and cried . But then , they saw Mom
Jane said , " Don 't worry ! Just trust me
laughed a lot . But soon , they heard
pack some food and drinks in a big bag . They
to the park to play soccer . But the other team

INTERVAL -0.058 - 0.013
CONTAINS 69.004%

always listen to her mom 's recommendations . <|endoftext|> Once upon
smiled . He was glad that Anna saw his pedal .
steal something ! <|endoftext|> Once upon a time there was a
said thank you . The driver smiled and said , "
their cries and ran to them . She saw the dog

INTERVAL -0.129 - -0.058
CONTAINS 2.241%

bear 's worst fears were realized - the ribbon was going
them and l icked their faces . They put
sun came back , and everyone was happy again . The
started to talk about all sorts of things . Bob had
his head filled with incredible memories . <|endoftext|> Once upon a
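
The four INTERVAL blocks above look like equal-width bins spanning MIN (-0.129) to MAX (0.156), and their CONTAINS shares add up to 100%. The Python sketch below checks that reading. The equal-width assumption and the helper name `equal_width_edges` are illustrative guesses; the dump itself does not say how the intervals were chosen.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return the n_bins + 1 edges of equal-width bins covering [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]

# MIN and MAX as displayed in the dump (rounded to three decimals there).
edges = equal_width_edges(-0.129, 0.156, 4)

# CONTAINS percentages, listed from the lowest interval to the highest.
shares = [2.241, 69.004, 28.496, 0.259]
total = sum(shares)

# Pair each bin with its share for a quick readout.
for (left, right), share in zip(zip(edges, edges[1:]), shares):
    print(f"{left:+.3f} to {right:+.3f}: {share:.3f}%")
print(f"total: {total:.3f}%")
```

Under this assumption the computed edges land close to the displayed ones (-0.058, 0.013, 0.085). Small differences in the last digit are expected because the dump shows MIN and MAX already rounded.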
Token a
Token ID257
Feature activation+1.97e-02

Pos logit contributions
 coaster+0.121
 Fer+0.116
+0.115
 jugg+0.114
 salon+0.113
Neg logit contributions
lessness-0.046
 of-0.039
 twins-0.037
 privacy-0.028
 about-0.028
Token little
Token ID1310
Feature activation+2.09e-02

Pos logit contributions
 coaster+0.049
 Fer+0.046
+0.046
olog+0.046
icol+0.044
Neg logit contributions
 of-0.022
lessness-0.022
 twins-0.021
 toys-0.017
 who-0.017
Token bit
Token ID1643
Feature activation+1.36e-02

Pos logit contributions
 coaster+0.032
 Fer+0.029
+0.029
 jugg+0.028
icol+0.027
Neg logit contributions
lessness-0.044
 of-0.042
 twins-0.041
 who-0.037
 about-0.036
Token."
Token ID526
Feature activation+3.67e-02

Pos logit contributions
 coaster+0.033
 salon+0.033
 jugg+0.032
 Fer+0.032
 pro+0.032
Neg logit contributions
lessness-0.013
 twins-0.010
 privacy-0.009
ruck-0.009
 of-0.009
Token Mom
Token ID11254
Feature activation+8.84e-03

Pos logit contributions
+0.076
 coaster+0.076
 Fer+0.072
 salon+0.071
icol+0.070
Neg logit contributions
lessness-0.057
 of-0.047
 twins-0.043
 privacy-0.043
ruck-0.043
Token squee
Token ID13170
Feature activation+1.56e-01

Pos logit contributions
 coaster+0.019
+0.019
 Fer+0.018
 som+0.018
icol+0.018
Neg logit contributions
 twins-0.012
 of-0.012
lessness-0.012
 about-0.011
 who-0.010
Tokenzes
Token ID12271
Feature activation+2.30e-02

Pos logit contributions
 Fer+0.171
 blur+0.165
 salon+0.152
+0.147
 pir+0.145
Neg logit contributions
 twins-0.340
ies-0.332
 what-0.330
.-0.320
 who-0.320
Token some
Token ID617
Feature activation-2.14e-02

Pos logit contributions
 coaster+0.053
 Fer+0.050
+0.050
 jugg+0.049
 salon+0.049
Neg logit contributions
lessness-0.031
 of-0.029
 twins-0.027
 about-0.022
 who-0.022
Token shampoo
Token ID47259
Feature activation+3.25e-02

Pos logit contributions
lessness+0.027
 twins+0.022
 of+0.021
 privacy+0.019
ruck+0.018
Neg logit contributions
 coaster-0.049
 Fer-0.048
-0.048
 jugg-0.048
 salon-0.047
Token on
Token ID319
Feature activation+4.76e-02

Pos logit contributions
 coaster+0.086
 Fer+0.083
+0.082
 jugg+0.081
 salon+0.080
Neg logit contributions
lessness-0.032
 of-0.027
 twins-0.025
 privacy-0.019
 about-0.019
Token her
Token ID607
Feature activation-3.72e-02

Pos logit contributions
 coaster+0.129
olog+0.125
 Fer+0.125
 jugg+0.123
icol+0.122
Neg logit contributions
 of-0.040
 twins-0.039
lessness-0.032
.-0.030
 toys-0.028
Token safe
Token ID3338
Feature activation+3.91e-02

Pos logit contributions
 coaster+0.017
+0.017
 Fer+0.016
zes+0.016
 salon+0.016
Neg logit contributions
lessness-0.007
 of-0.007
 twins-0.006
 who-0.005
 important-0.005
Token journey
Token ID7002
Feature activation+2.22e-03

Pos logit contributions
 coaster+0.083
 Fer+0.080
+0.080
olog+0.077
 salon+0.077
Neg logit contributions
lessness-0.058
 of-0.054
 twins-0.050
 about-0.043
 company-0.043
Token home
Token ID1363
Feature activation-2.07e-02

Pos logit contributions
 coaster+0.006
 Fer+0.006
+0.006
 jugg+0.006
 salon+0.006
Neg logit contributions
lessness-0.002
 of-0.002
 twins-0.002
 privacy-0.001
 about-0.001
Token.
Token ID13
Feature activation-1.22e-02

Pos logit contributions
lessness+0.018
 twins+0.015
 of+0.013
 privacy+0.011
 bricks+0.010
Neg logit contributions
 coaster-0.056
 Fer-0.053
-0.053
 jugg-0.052
 salon-0.052
Token The
Token ID383
Feature activation+2.19e-02

Pos logit contributions
lessness+0.014
 of+0.013
 twins+0.012
 about+0.010
 who+0.010
Neg logit contributions
 coaster-0.030
 Fer-0.029
 jugg-0.028
 salon-0.028
-0.027
Token sur
Token ID969
Feature activation+1.51e-01

Pos logit contributions
+0.047
icol+0.046
olog+0.046
 coaster+0.046
 salon+0.045
Neg logit contributions
 twins-0.031
 of-0.030
lessness-0.030
 children-0.026
 friends-0.026
Tokenfer
Token ID2232
Feature activation+4.05e-02

Pos logit contributions
icol+0.177
 som+0.167
 checkout+0.166
 blur+0.161
olog+0.159
Neg logit contributions
 of-0.347
lessness-0.344
 who-0.325
 twins-0.322
ruck-0.298
Token waved
Token ID26834
Feature activation-3.69e-03

Pos logit contributions
 coaster+0.085
 Fer+0.083
+0.083
icol+0.080
 salon+0.080
Neg logit contributions
 of-0.060
lessness-0.060
 twins-0.053
 who-0.048
 about-0.046
Token back
Token ID736
Feature activation+1.83e-02

Pos logit contributions
lessness+0.007
 of+0.006
 twins+0.006
 privacy+0.005
 about+0.005
Neg logit contributions
 coaster-0.006
 Fer-0.006
-0.006
 jugg-0.006
 salon-0.006
Token and
Token ID290
Feature activation+8.71e-03

Pos logit contributions
 coaster+0.049
 Fer+0.048
+0.047
 jugg+0.046
 salon+0.046
Neg logit contributions
lessness-0.017
 of-0.016
 twins-0.014
 about-0.011
 who-0.011
Token said
Token ID531
Feature activation-3.58e-02

Pos logit contributions
+0.020
 coaster+0.019
 salon+0.018
 Fer+0.017
olog+0.017
Neg logit contributions
lessness-0.013
 of-0.012
 twins-0.011
 friends-0.010
 who-0.010
Token.
Token ID13
Feature activation-1.25e-02

Pos logit contributions
 coaster+0.063
olog+0.062
+0.062
 Fer+0.061
icol+0.060
Neg logit contributions
 of-0.029
lessness-0.028
 twins-0.026
 about-0.022
 everywhere-0.022
Token Lily
Token ID20037
Feature activation+1.66e-02

Pos logit contributions
lessness+0.016
 of+0.015
 twins+0.014
 about+0.012
 who+0.012
Neg logit contributions
 coaster-0.029
 Fer-0.028
 jugg-0.027
-0.027
 salon-0.027
Token loved
Token ID6151
Feature activation-3.18e-02

Pos logit contributions
+0.046
 Fer+0.042
 coaster+0.042
icol+0.041
 seventeen+0.041
Neg logit contributions
 twins-0.015
 of-0.015
lessness-0.013
 who-0.013
 about-0.013
Token the
Token ID262
Feature activation+4.95e-03

Pos logit contributions
lessness+0.030
 of+0.028
 twins+0.027
 about+0.021
 who+0.020
Neg logit contributions
 coaster-0.082
-0.081
 Fer-0.079
olog-0.078
 jugg-0.077
Token big
Token ID1263
Feature activation+4.99e-02

Pos logit contributions
 coaster+0.009
 Fer+0.009
+0.009
 jugg+0.009
 salon+0.008
Neg logit contributions
lessness-0.009
 of-0.008
 twins-0.008
 privacy-0.007
 about-0.007
Token Fer
Token ID12880
Feature activation+1.51e-01

Pos logit contributions
 coaster+0.102
+0.099
 Fer+0.096
olog+0.096
 jugg+0.095
Neg logit contributions
lessness-0.076
 of-0.071
 twins-0.070
 toys-0.059
 about-0.058
Tokenris
Token ID2442
Feature activation+4.18e-02

Pos logit contributions
icol+0.268
 coaster+0.266
 som+0.261
olog+0.259
 Fer+0.257
Neg logit contributions
 of-0.264
lessness-0.256
 twins-0.231
aky-0.206
 who-0.205
Token wheel
Token ID7825
Feature activation+5.17e-02

Pos logit contributions
olog+0.060
V+0.058
icol+0.057
Re+0.055
+0.055
Neg logit contributions
 of-0.085
lessness-0.075
 twins-0.075
.-0.073
 who-0.072
Token the
Token ID262
Feature activation+1.21e-02

Pos logit contributions
 coaster+0.133
 Fer+0.127
+0.125
 salon+0.124
 jugg+0.124
Neg logit contributions
lessness-0.054
 of-0.045
 twins-0.044
 privacy-0.035
 company-0.033
Token most
Token ID749
Feature activation-3.34e-02

Pos logit contributions
 coaster+0.024
+0.023
 jugg+0.022
 Fer+0.022
olog+0.022
Neg logit contributions
lessness-0.019
 of-0.018
 twins-0.018
 who-0.015
 everywhere-0.015
Token.
Token ID13
Feature activation-1.77e-02

Pos logit contributions
lessness+0.041
 twins+0.033
 of+0.030
 privacy+0.029
 bricks+0.027
Neg logit contributions
 coaster-0.080
 jugg-0.076
 Fer-0.076
 pro-0.076
 som-0.074
Token."
Token ID526
Feature activation+3.62e-02

Pos logit contributions
 coaster+0.190
 Fer+0.182
+0.180
 jugg+0.179
 salon+0.176
Neg logit contributions
lessness-0.081
 of-0.075
 twins-0.069
 about-0.054
 who-0.053
Token
Token ID198
Feature activation+2.24e-02

Pos logit contributions
+0.072
 coaster+0.069
 Fer+0.065
 jugg+0.065
 pro+0.065
Neg logit contributions
lessness-0.058
 privacy-0.049
 twins-0.047
 of-0.046
ruck-0.046
Token
Token ID198
Feature activation+3.17e-02

Pos logit contributions
 coaster+0.046
+0.044
 jugg+0.043
 Fer+0.043
 som+0.042
Neg logit contributions
lessness-0.034
 twins-0.027
 of-0.027
 privacy-0.027
 company-0.025
TokenL
Token ID43
Feature activation+3.55e-02

Pos logit contributions
 coaster+0.065
 Fer+0.063
+0.061
 jugg+0.059
icol+0.058
Neg logit contributions
lessness-0.053
 of-0.050
 twins-0.048
 who-0.042
aky-0.040
Tokenily
Token ID813
Feature activation+2.84e-02

Pos logit contributions
 coaster+0.089
 salon+0.086
+0.086
 Fer+0.085
 jugg+0.085
Neg logit contributions
lessness-0.035
 twins-0.027
 of-0.027
 privacy-0.024
 company-0.024
Token squee
Token ID13170
Feature activation+1.50e-01

Pos logit contributions
+0.060
 coaster+0.056
 Fer+0.056
icol+0.055
 sidelines+0.055
Neg logit contributions
 twins-0.041
 about-0.039
 of-0.038
 who-0.036
.-0.034
Tokenzes
Token ID12271
Feature activation+8.29e-03

Pos logit contributions
 Fer+0.170
 blur+0.155
 salon+0.147
 coaster+0.141
+0.141
Neg logit contributions
 twins-0.333
ies-0.317
 what-0.315
 feelings-0.310
 friends-0.307
Token his
Token ID465
Feature activation-2.29e-02

Pos logit contributions
 coaster+0.019
 Fer+0.018
+0.018
 jugg+0.018
 salon+0.017
Neg logit contributions
lessness-0.011
 of-0.010
 twins-0.010
 about-0.008
 everywhere-0.008
Token hand
Token ID1021
Feature activation-1.86e-03

Pos logit contributions
lessness+0.037
 twins+0.036
 of+0.035
 toys+0.033
 company+0.031
Neg logit contributions
 coaster-0.043
olog-0.042
-0.041
 jugg-0.040
 Fer-0.040
Token.
Token ID13
Feature activation-5.67e-03

Pos logit contributions
lessness+0.002
 of+0.002
 twins+0.002
 privacy+0.001
 about+0.001
Neg logit contributions
 coaster-0.005
 Fer-0.005
-0.005
 jugg-0.005
 salon-0.004
Token "
Token ID366
Feature activation+6.05e-02

Pos logit contributions
lessness+0.007
 of+0.006
 twins+0.006
 about+0.005
 who+0.005
Neg logit contributions
 coaster-0.014
 Fer-0.013
 jugg-0.013
 salon-0.013
-0.013
Token
Token ID198
Feature activation-1.31e-02

Pos logit contributions
+0.018
 coaster+0.018
 salon+0.016
olog+0.016
 Fer+0.016
Neg logit contributions
lessness-0.012
 twins-0.011
 of-0.010
 privacy-0.010
 company-0.009
Token
Token ID198
Feature activation+1.03e-03

Pos logit contributions
lessness+0.015
 of+0.013
 twins+0.012
 about+0.010
 privacy+0.009
Neg logit contributions
 coaster-0.033
 Fer-0.032
-0.031
 jugg-0.031
 salon-0.031
TokenSuddenly
Token ID38582
Feature activation+2.09e-02

Pos logit contributions
 coaster+0.002
 Fer+0.002
 salon+0.002
 jugg+0.002
+0.002
Neg logit contributions
lessness-0.002
 of-0.002
 twins-0.002
 who-0.001
 about-0.001
Token,
Token ID11
Feature activation+1.74e-02

Pos logit contributions
 coaster+0.045
 Fer+0.043
+0.042
 jugg+0.041
 som+0.041
Neg logit contributions
lessness-0.031
 twins-0.027
 of-0.026
 privacy-0.023
 company-0.022
Token the
Token ID262
Feature activation+3.15e-02

Pos logit contributions
 coaster+0.037
+0.036
 Fer+0.036
 jugg+0.035
 som+0.034
Neg logit contributions
lessness-0.025
 twins-0.022
 of-0.022
 privacy-0.019
ruck-0.018
Token tr
Token ID491
Feature activation+1.48e-01

Pos logit contributions
 coaster+0.063
 Fer+0.060
+0.060
olog+0.059
 jugg+0.059
Neg logit contributions
lessness-0.050
 of-0.046
 twins-0.045
 toys-0.037
 privacy-0.037
Tokenamp
Token ID696
Feature activation+7.48e-02

Pos logit contributions
olog+0.211
 som+0.204
 Fer+0.200
 coaster+0.195
 pro+0.194
Neg logit contributions
lessness-0.305
 of-0.292
 twins-0.265
 bricks-0.258
iring-0.256
Tokenoline
Token ID14453
Feature activation+6.64e-02

Pos logit contributions
 coaster+0.106
 Fer+0.105
olog+0.103
 som+0.099
+0.099
Neg logit contributions
lessness-0.161
 of-0.155
 twins-0.143
 about-0.133
ruck-0.130
Token collapsed
Token ID14707
Feature activation-1.34e-03

Pos logit contributions
 coaster+0.127
 Fer+0.125
+0.120
icol+0.120
 sixty+0.119
Neg logit contributions
 of-0.108
lessness-0.107
 twins-0.094
 about-0.085
 everywhere-0.084
Token beneath
Token ID11061
Feature activation+9.66e-03

Pos logit contributions
lessness+0.001
 of+0.001
 twins+0.001
 about+0.001
 who+0.001
Neg logit contributions
 coaster-0.004
 Fer-0.003
-0.003
 jugg-0.003
 salon-0.003
Token her
Token ID607
Feature activation-5.23e-02

Pos logit contributions
 coaster+0.022
+0.021
 Fer+0.021
 jugg+0.021
 som+0.021
Neg logit contributions
lessness-0.013
 twins-0.011
 of-0.011
 privacy-0.009
aky-0.009
Token sur
Token ID969
Feature activation+1.28e-01

Pos logit contributions
+0.010
 coaster+0.010
olog+0.009
 salon+0.009
icol+0.009
Neg logit contributions
lessness-0.012
 of-0.011
 twins-0.011
 children-0.009
ruck-0.009
Tokenfer
Token ID2232
Feature activation+1.81e-02

Pos logit contributions
 checkout+0.151
icol+0.144
olog+0.142
 som+0.142
 sixty+0.139
Neg logit contributions
lessness-0.286
 of-0.286
 twins-0.268
 who-0.263
ruck-0.248
Token.
Token ID13
Feature activation-1.23e-02

Pos logit contributions
 coaster+0.038
 Fer+0.036
 jugg+0.036
 som+0.035
 salon+0.035
Neg logit contributions
lessness-0.028
 twins-0.024
 of-0.023
 privacy-0.020
ruck-0.020
Token Whenever
Token ID21326
Feature activation-1.38e-02

Pos logit contributions
lessness+0.017
 of+0.015
 twins+0.014
 privacy+0.012
 company+0.012
Neg logit contributions
 coaster-0.027
-0.026
 Fer-0.026
 jugg-0.026
 salon-0.025
Token the
Token ID262
Feature activation+3.14e-02

Pos logit contributions
lessness+0.020
 twins+0.017
 of+0.017
ruck+0.015
aky+0.015
Neg logit contributions
 coaster-0.029
 Fer-0.028
-0.027
 jugg-0.027
 som-0.027
Token sur
Token ID969
Feature activation+1.47e-01

Pos logit contributions
+0.047
olog+0.047
 coaster+0.045
icol+0.044
 salon+0.042
Neg logit contributions
lessness-0.066
 twins-0.065
 of-0.064
 children-0.057
 friends-0.056
Tokenfer
Token ID2232
Feature activation+3.62e-02

Pos logit contributions
 som+0.182
 checkout+0.180
icol+0.179
olog+0.176
 seventeen+0.168
Neg logit contributions
lessness-0.315
 of-0.315
 twins-0.302
 who-0.298
ruck-0.275
Token balanced
Token ID12974
Feature activation+6.16e-02

Pos logit contributions
 coaster+0.073
+0.072
icol+0.071
 Fer+0.071
olog+0.069
Neg logit contributions
 of-0.054
lessness-0.053
 twins-0.049
 who-0.045
 about-0.043
Token on
Token ID319
Feature activation+4.08e-02

Pos logit contributions
 coaster+0.148
+0.148
icol+0.147
olog+0.146
 Fer+0.142
Neg logit contributions
 of-0.070
lessness-0.070
 twins-0.058
 about-0.049
 toys-0.047
Token the
Token ID262
Feature activation+1.06e-03

Pos logit contributions
 coaster+0.096
 Fer+0.092
+0.091
 jugg+0.089
 salon+0.089
Neg logit contributions
lessness-0.051
 of-0.047
 twins-0.044
 about-0.036
 who-0.035
Token board
Token ID3096
Feature activation+5.37e-02

Pos logit contributions
+0.002
 coaster+0.002
olog+0.002
 salon+0.002
 sixty+0.002
Neg logit contributions
lessness-0.002
 of-0.002
 twins-0.002
 toys-0.002
 everywhere-0.002
Token.
Token ID13
Feature activation-1.06e-02

Pos logit contributions
 coaster+0.006
 jugg+0.006
 Fer+0.006
 som+0.006
+0.006
Neg logit contributions
lessness-0.002
 twins-0.002
 of-0.002
 privacy-0.001
 bricks-0.001
Token
Token ID198
Feature activation+8.75e-03

Pos logit contributions
lessness+0.016
 twins+0.014
 of+0.014
 privacy+0.012
 company+0.012
Neg logit contributions
 coaster-0.022
-0.022
 Fer-0.021
 jugg-0.021
 som-0.021
Token
Token ID198
Feature activation+2.08e-02

Pos logit contributions
 coaster+0.016
+0.016
V+0.016
 jugg+0.015
 som+0.015
Neg logit contributions
lessness-0.014
 twins-0.012
 privacy-0.012
 company-0.011
iness-0.011
TokenBut
Token ID1537
Feature activation+2.37e-02

Pos logit contributions
 coaster+0.038
 Fer+0.036
+0.035
 salon+0.035
 jugg+0.034
Neg logit contributions
lessness-0.038
 of-0.036
 twins-0.035
 who-0.030
 bricks-0.029
Token Sam
Token ID3409
Feature activation+6.58e-03

Pos logit contributions
 coaster+0.048
 Fer+0.046
+0.044
 salon+0.044
 jugg+0.043
Neg logit contributions
lessness-0.039
 of-0.038
 twins-0.035
 about-0.032
 who-0.031
Token squee
Token ID13170
Feature activation+1.47e-01

Pos logit contributions
icol+0.015
+0.015
 coaster+0.015
 Fer+0.015
olog+0.014
Neg logit contributions
 of-0.008
 twins-0.008
 about-0.007
lessness-0.007
 who-0.007
Tokenzes
Token ID12271
Feature activation+9.42e-03

Pos logit contributions
 Fer+0.166
 salon+0.163
 blur+0.142
+0.141
 pro+0.129
Neg logit contributions
 twins-0.324
 feelings-0.313
ies-0.311
 toys-0.310
 what-0.310
Token the
Token ID262
Feature activation+1.73e-02

Pos logit contributions
 coaster+0.022
 Fer+0.021
+0.021
 jugg+0.020
 salon+0.020
Neg logit contributions
lessness-0.013
 of-0.012
 twins-0.011
 about-0.009
 who-0.009
Token dinosaur
Token ID28257
Feature activation+1.26e-02

Pos logit contributions
 coaster+0.030
 Fer+0.028
 jugg+0.028
+0.026
 pro+0.026
Neg logit contributions
lessness-0.034
 of-0.030
 twins-0.029
 privacy-0.027
 about-0.027
Token too
Token ID1165
Feature activation-3.73e-03

Pos logit contributions
 coaster+0.029
 Fer+0.028
+0.028
 jugg+0.027
 salon+0.027
Neg logit contributions
lessness-0.017
 of-0.015
 twins-0.014
 about-0.012
 privacy-0.012
Token hard
Token ID1327
Feature activation+1.87e-02

Pos logit contributions
lessness+0.004
 twins+0.004
 of+0.004
 privacy+0.003
 company+0.003
Neg logit contributions
 coaster-0.009
 Fer-0.009
-0.009
 jugg-0.009
 som-0.009
Token car
Token ID1097
Feature activation+1.48e-02

Pos logit contributions
 coaster+0.001
 Fer+0.001
 salon+0.001
 jugg+0.001
+0.001
Neg logit contributions
lessness-0.002
 of-0.001
 twins-0.001
 privacy-0.001
 about-0.001
Token.
Token ID13
Feature activation-2.19e-02

Pos logit contributions
 coaster+0.039
 Fer+0.038
 salon+0.037
+0.037
 jugg+0.037
Neg logit contributions
lessness-0.014
 twins-0.011
 of-0.010
 privacy-0.009
 importantly-0.008
Token He
Token ID679
Feature activation+9.18e-04

Pos logit contributions
lessness+0.027
 of+0.024
 twins+0.023
 about+0.019
 privacy+0.019
Neg logit contributions
 coaster-0.052
 Fer-0.050
-0.049
 jugg-0.049
 salon-0.048
Token found
Token ID1043
Feature activation-3.10e-02

Pos logit contributions
 coaster+0.002
+0.002
 Fer+0.002
icol+0.002
olog+0.002
Neg logit contributions
lessness-0.001
 of-0.001
 twins-0.001
 who-0.001
 toys-0.001
Token a
Token ID257
Feature activation+1.20e-03

Pos logit contributions
lessness+0.047
 twins+0.039
 of+0.037
 privacy+0.036
 company+0.035
Neg logit contributions
 coaster-0.065
 Fer-0.063
 som-0.061
 pro-0.061
 jugg-0.060
Token squee
Token ID13170
Feature activation+1.46e-01

Pos logit contributions
 coaster+0.003
+0.003
 Fer+0.003
olog+0.003
 jugg+0.002
Neg logit contributions
lessness-0.002
 of-0.002
 twins-0.001
 toys-0.001
 who-0.001
Tokengee
Token ID29622
Feature activation+1.50e-02

Pos logit contributions
 salon+0.283
 blur+0.262
 Fer+0.261
+0.257
 sixty+0.255
Neg logit contributions
 toys-0.181
ies-0.179
 company-0.175
 things-0.174
.-0.171
Token and
Token ID290
Feature activation-6.72e-03

Pos logit contributions
 coaster+0.034
 Fer+0.032
+0.032
 jugg+0.031
 salon+0.031
Neg logit contributions
lessness-0.021
 of-0.018
 twins-0.018
 privacy-0.015
 about-0.014
Token some
Token ID617
Feature activation-1.57e-02

Pos logit contributions
 of+0.010
lessness+0.010
 twins+0.008
 who+0.008
 toys+0.008
Neg logit contributions
 coaster-0.015
-0.014
 Fer-0.014
 salon-0.014
olog-0.013
Token wax
Token ID20302
Feature activation+1.34e-02

Pos logit contributions
lessness+0.027
 twins+0.023
 privacy+0.022
 of+0.021
 about+0.020
Neg logit contributions
 coaster-0.029
 Fer-0.029
 jugg-0.028
 som-0.028
 salon-0.027
Token.
Token ID13
Feature activation-2.64e-02

Pos logit contributions
 coaster+0.033
 Fer+0.032
+0.031
 som+0.031
 jugg+0.031
Neg logit contributions
lessness-0.013
 twins-0.011
 privacy-0.011
 of-0.010
 company-0.009
Token!"
Token ID2474
Feature activation+3.95e-02

Pos logit contributions
lessness+0.000
 of+0.000
 twins+0.000
 privacy+0.000
 about+0.000
Neg logit contributions
 coaster-0.000
 Fer-0.000
-0.000
 jugg-0.000
 salon-0.000
Token Mia
Token ID32189
Feature activation+2.12e-02

Pos logit contributions
 coaster+0.061
 jugg+0.058
 Fer+0.058
 salon+0.057
+0.057
Neg logit contributions
lessness-0.081
 twins-0.070
 of-0.069
ruck-0.066
 privacy-0.066
Token says
Token ID1139
Feature activation-1.68e-02

Pos logit contributions
 coaster+0.041
+0.039
 Fer+0.039
 jugg+0.037
 salon+0.037
Neg logit contributions
lessness-0.035
 of-0.033
 twins-0.032
 about-0.028
 who-0.027
Token.
Token ID13
Feature activation+4.74e-03

Pos logit contributions
lessness+0.016
 twins+0.012
 of+0.011
 privacy+0.011
ruck+0.010
Neg logit contributions
 coaster-0.042
-0.042
olog-0.041
 pro-0.041
 jugg-0.041
Token She
Token ID1375
Feature activation+2.15e-02

Pos logit contributions
 coaster+0.012
 Fer+0.011
+0.011
 jugg+0.011
 salon+0.011
Neg logit contributions
lessness-0.005
 of-0.005
 twins-0.005
 about-0.004
 privacy-0.003
Token squee
Token ID13170
Feature activation+1.46e-01

Pos logit contributions
 coaster+0.045
+0.043
 Fer+0.041
 salon+0.041
uren+0.040
Neg logit contributions
 twins-0.034
 of-0.033
 toys-0.029
 who-0.028
 about-0.028
Tokenzes
Token ID12271
Feature activation+8.74e-03

Pos logit contributions
 Fer+0.165
 salon+0.160
 blur+0.146
+0.139
 coaster+0.137
Neg logit contributions
 what-0.321
 twins-0.321
ies-0.310
 toys-0.306
 who-0.298
Token the
Token ID262
Feature activation+2.48e-02

Pos logit contributions
 coaster+0.020
 Fer+0.019
+0.019
 jugg+0.019
 salon+0.019
Neg logit contributions
lessness-0.012
 of-0.011
 twins-0.010
 everywhere-0.009
 about-0.009
Token bottle
Token ID9294
Feature activation+3.59e-02

Pos logit contributions
 coaster+0.043
 Fer+0.041
+0.041
 som+0.039
 jugg+0.039
Neg logit contributions
lessness-0.046
 of-0.041
 twins-0.039
 privacy-0.038
 about-0.037
Token and
Token ID290
Feature activation+2.51e-02

Pos logit contributions
 coaster+0.088
 Fer+0.084
+0.083
 jugg+0.083
 salon+0.082
Neg logit contributions
lessness-0.041
 of-0.035
 twins-0.035
 privacy-0.028
 company-0.027
Token water
Token ID1660
Feature activation+1.53e-02

Pos logit contributions
 coaster+0.058
+0.056
olog+0.054
 salon+0.053
ar+0.053
Neg logit contributions
 twins-0.033
 of-0.033
 what-0.031
 toys-0.029
 who-0.029
Token ball
Token ID2613
Feature activation+5.26e-02

Pos logit contributions
 coaster+0.021
 Fer+0.021
+0.021
 salon+0.020
 jugg+0.020
Neg logit contributions
lessness-0.020
 of-0.018
 twins-0.017
 about-0.016
 privacy-0.016
Token.
Token ID13
Feature activation-2.01e-02

Pos logit contributions
 coaster+0.133
 Fer+0.128
+0.127
 jugg+0.125
 salon+0.124
Neg logit contributions
lessness-0.056
 of-0.048
 twins-0.047
 privacy-0.036
 about-0.035
Token
Token ID198
Feature activation-1.51e-02

Pos logit contributions
lessness+0.025
 of+0.024
 twins+0.023
 who+0.019
 about+0.019
Neg logit contributions
 coaster-0.048
 Fer-0.046
 salon-0.045
 jugg-0.044
 som-0.043
Token
Token ID198
Feature activation-8.86e-04

Pos logit contributions
lessness+0.019
 of+0.016
 twins+0.016
 privacy+0.014
 company+0.013
Neg logit contributions
 coaster-0.035
-0.034
 Fer-0.033
 jugg-0.033
 som-0.033
TokenThe
Token ID464
Feature activation+2.96e-02

Pos logit contributions
 of+0.001
lessness+0.001
 twins+0.001
 who+0.001
aky+0.001
Neg logit contributions
 Fer-0.002
icol-0.002
 coaster-0.002
 salon-0.002
-0.002
Token boun
Token ID31283
Feature activation+1.45e-01

Pos logit contributions
 coaster+0.062
 Fer+0.059
+0.058
 jugg+0.057
 salon+0.057
Neg logit contributions
lessness-0.046
 of-0.042
 twins-0.041
 privacy-0.034
 about-0.034
Tokency
Token ID948
Feature activation+3.56e-02

Pos logit contributions
 coaster+0.312
 Fer+0.304
 som+0.291
olog+0.289
 jugg+0.285
Neg logit contributions
 of-0.218
lessness-0.210
 twins-0.183
able-0.172
 everywhere-0.161
Token ball
Token ID2613
Feature activation+5.46e-02

Pos logit contributions
 coaster+0.067
 Fer+0.065
+0.063
 jugg+0.062
 salon+0.061
Neg logit contributions
lessness-0.060
 of-0.058
 twins-0.055
 who-0.048
 toys-0.047
Token was
Token ID373
Feature activation-1.52e-02

Pos logit contributions
 coaster+0.103
 Fer+0.100
icol+0.098
+0.097
 salon+0.096
Neg logit contributions
 of-0.096
lessness-0.092
 twins-0.081
 everywhere-0.077
 about-0.077
Token not
Token ID407
Feature activation-4.49e-02

Pos logit contributions
lessness+0.018
 of+0.017
 twins+0.016
 about+0.013
 everywhere+0.013
Neg logit contributions
 coaster-0.037
 Fer-0.036
-0.035
 salon-0.035
 jugg-0.034
Token at
Token ID379
Feature activation+3.26e-02

Pos logit contributions
lessness+0.065
 of+0.056
 twins+0.055
 privacy+0.051
 bricks+0.045
Neg logit contributions
 coaster-0.094
 Fer-0.092
 som-0.090
 jugg-0.090
 pro-0.089
Token.
Token ID13
Feature activation-1.40e-02

Pos logit contributions
 coaster+0.081
 Fer+0.078
+0.077
 jugg+0.076
 salon+0.076
Neg logit contributions
lessness-0.035
 of-0.030
 twins-0.029
 privacy-0.022
 about-0.022
Token
Token ID220
Feature activation+9.64e-03

Pos logit contributions
lessness+0.018
 of+0.017
 twins+0.016
 about+0.013
 who+0.013
Neg logit contributions
 coaster-0.033
 Fer-0.031
 salon-0.030
 jugg-0.030
 som-0.030
Token
Token ID198
Feature activation-8.84e-03

Pos logit contributions
 coaster+0.020
+0.019
 Fer+0.018
olog+0.018
 som+0.018
Neg logit contributions
lessness-0.014
 twins-0.012
 of-0.012
 privacy-0.012
 company-0.011
Token
Token ID198
Feature activation+4.56e-03

Pos logit contributions
lessness+0.009
 of+0.008
 twins+0.008
 who+0.006
 about+0.006
Neg logit contributions
 coaster-0.023
 Fer-0.022
 salon-0.021
-0.021
 jugg-0.021
TokenThe
Token ID464
Feature activation+3.57e-02

Pos logit contributions
 coaster+0.010
 Fer+0.009
icol+0.009
 salon+0.009
+0.009
Neg logit contributions
lessness-0.007
 of-0.007
 twins-0.006
 who-0.006
aky-0.005
Token boun
Token ID31283
Feature activation+1.45e-01

Pos logit contributions
 coaster+0.077
 Fer+0.074
+0.073
 jugg+0.072
 salon+0.071
Neg logit contributions
lessness-0.052
 of-0.047
 twins-0.046
 privacy-0.038
 company-0.038
Tokency
Token ID948
Feature activation+3.31e-02

Pos logit contributions
 coaster+0.309
 Fer+0.302
 som+0.294
olog+0.292
 salon+0.288
Neg logit contributions
 of-0.211
lessness-0.208
 twins-0.179
able-0.169
sm-0.156
Token ball
Token ID2613
Feature activation+5.26e-02

Pos logit contributions
 coaster+0.061
 Fer+0.060
+0.060
 salon+0.059
olog+0.059
Neg logit contributions
lessness-0.055
 of-0.053
 twins-0.050
 who-0.044
 toys-0.043
Token was
Token ID373
Feature activation-1.70e-02

Pos logit contributions
 coaster+0.095
 Fer+0.091
+0.089
 salon+0.087
icol+0.086
Neg logit contributions
lessness-0.094
 of-0.091
 twins-0.084
 about-0.076
 who-0.075
Token so
Token ID523
Feature activation-1.41e-02

Pos logit contributions
lessness+0.024
 twins+0.021
 of+0.020
 privacy+0.019
 bricks+0.017
Neg logit contributions
 coaster-0.037
 Fer-0.035
 jugg-0.035
-0.035
 som-0.034
Token scared
Token ID12008
Feature activation-3.42e-02

Pos logit contributions
lessness+0.017
 twins+0.015
 of+0.014
 privacy+0.013
ruck+0.012
Neg logit contributions
 coaster-0.033
 Fer-0.032
 jugg-0.031
-0.031
 som-0.031
Token around
Token ID1088
Feature activation+1.53e-03

Pos logit contributions
 coaster+0.065
 Fer+0.064
+0.064
olog+0.063
 salon+0.063
Neg logit contributions
 of-0.025
lessness-0.022
 twins-0.020
 everywhere-0.019
.-0.018
Token in
Token ID287
Feature activation+2.01e-02

Pos logit contributions
 coaster+0.004
 Fer+0.003
+0.003
olog+0.003
 salon+0.003
Neg logit contributions
lessness-0.002
 of-0.002
 twins-0.002
 who-0.001
 about-0.001
Token it
Token ID340
Feature activation-2.76e-02

Pos logit contributions
 coaster+0.044
 Fer+0.042
+0.041
olog+0.041
 salon+0.041
Neg logit contributions
lessness-0.029
 of-0.027
 twins-0.026
 who-0.021
 about-0.021
Token!"
Token ID2474
Feature activation+2.84e-02

Pos logit contributions
lessness+0.035
 of+0.030
 twins+0.030
 privacy+0.024
 about+0.024
Neg logit contributions
 coaster-0.065
 Fer-0.062
-0.061
 jugg-0.061
 salon-0.060
Token The
Token ID383
Feature activation+2.25e-02

Pos logit contributions
 coaster+0.071
 Fer+0.069
 salon+0.067
icol+0.067
olog+0.066
Neg logit contributions
lessness-0.031
 of-0.030
 twins-0.027
 who-0.023
 about-0.022
Token boun
Token ID31283
Feature activation+1.45e-01

Pos logit contributions
 coaster+0.050
+0.048
 Fer+0.047
olog+0.047
icol+0.046
Neg logit contributions
lessness-0.031
 twins-0.029
 of-0.029
 who-0.023
 privacy-0.023
Tokency
Token ID948
Feature activation+4.46e-02

Pos logit contributions
 coaster+0.307
 Fer+0.297
olog+0.284
 som+0.284
 salon+0.280
Neg logit contributions
 of-0.220
lessness-0.220
 twins-0.187
able-0.177
sm-0.166
Token thing
Token ID1517
Feature activation+1.84e-02

Pos logit contributions
 coaster+0.088
+0.088
 Fer+0.087
 salon+0.084
olog+0.083
Neg logit contributions
lessness-0.070
 of-0.069
 twins-0.064
 who-0.057
 toys-0.054
Token was
Token ID373
Feature activation-5.91e-03

Pos logit contributions
 coaster+0.036
 Fer+0.034
+0.033
 jugg+0.033
 salon+0.033
Neg logit contributions
lessness-0.030
 of-0.027
 twins-0.027
 privacy-0.024
 company-0.023
Token so
Token ID523
Feature activation-6.50e-03

Pos logit contributions
lessness+0.008
 of+0.007
 twins+0.007
 privacy+0.006
 company+0.005
Neg logit contributions
 coaster-0.013
 Fer-0.013
-0.013
 jugg-0.013
 salon-0.012
Token upset
Token ID9247
Feature activation-3.15e-02

Pos logit contributions
lessness+0.008
 of+0.007
 twins+0.006
 about+0.005
 privacy+0.005
Neg logit contributions
 coaster-0.016
 Fer-0.015
-0.015
 jugg-0.015
 salon-0.015
Token small
Token ID1402
Feature activation+3.88e-03

Pos logit contributions
 coaster+0.004
 Fer+0.004
+0.004
 salon+0.004
 jugg+0.004
Neg logit contributions
lessness-0.002
 of-0.002
 twins-0.002
 about-0.002
 who-0.002
Token for
Token ID329
Feature activation-5.11e-03

Pos logit contributions
 coaster+0.008
 Fer+0.008
 pro+0.008
 salon+0.007
 drib+0.007
Neg logit contributions
lessness-0.005
ovable-0.004
 privacy-0.004
 twins-0.004
 importantly-0.004
Token Ben
Token ID3932
Feature activation-1.64e-02

Pos logit contributions
lessness+0.008
 of+0.006
 twins+0.006
 privacy+0.006
 about+0.005
Neg logit contributions
 coaster-0.011
 Fer-0.010
 jugg-0.010
-0.010
 som-0.010
Token.
Token ID13
Feature activation-4.86e-03

Pos logit contributions
lessness+0.021
 twins+0.017
 importantly+0.017
 bricks+0.017
iring+0.016
Neg logit contributions
 coaster-0.036
 drib-0.035
 salon-0.034
 som-0.034
 jugg-0.033
Token He
Token ID679
Feature activation+1.27e-02

Pos logit contributions
lessness+0.006
 of+0.005
 twins+0.005
 about+0.004
 bricks+0.004
Neg logit contributions
 coaster-0.012
 Fer-0.012
 salon-0.011
 jugg-0.011
-0.011
Token squee
Token ID13170
Feature activation+1.44e-01

Pos logit contributions
 coaster+0.024
+0.024
icol+0.023
olog+0.022
 Fer+0.022
Neg logit contributions
 of-0.020
 twins-0.020
lessness-0.018
 who-0.018
 everywhere-0.017
Tokenzes
Token ID12271
Feature activation+9.38e-03

Pos logit contributions
 Fer+0.152
 salon+0.148
+0.140
 blur+0.139
 jugg+0.125
Neg logit contributions
 twins-0.327
 what-0.321
ies-0.317
 toys-0.317
 privacy-0.304
Token in
Token ID287
Feature activation+2.60e-02

Pos logit contributions
 coaster+0.020
 Fer+0.019
+0.019
 salon+0.019
 jugg+0.019
Neg logit contributions
lessness-0.014
 of-0.013
 twins-0.012
 about-0.010
 everywhere-0.010
Token,
Token ID11
Feature activation-5.57e-03

Pos logit contributions
 coaster+0.060
 Fer+0.057
+0.057
 jugg+0.056
 salon+0.056
Neg logit contributions
lessness-0.035
 of-0.031
 twins-0.030
 privacy-0.024
 about-0.024
Token but
Token ID475
Feature activation+4.06e-03

Pos logit contributions
lessness+0.006
 of+0.005
 twins+0.005
 who+0.004
 about+0.004
Neg logit contributions
 coaster-0.014
 Fer-0.014
-0.013
olog-0.013
 salon-0.013
Token it
Token ID340
Feature activation+1.15e-02

Pos logit contributions
 coaster+0.010
+0.010
 Fer+0.009
icol+0.009
 salon+0.009
Neg logit contributions
lessness-0.005
 of-0.005
 twins-0.004
 about-0.004
 everywhere-0.004
Token that
Token ID326
Feature activation-1.17e-02

Pos logit contributions
lessness+0.082
 twins+0.071
 of+0.066
aky+0.061
 bricks+0.059
Neg logit contributions
 coaster-0.133
 Fer-0.127
 som-0.127
 jugg-0.125
 salon-0.124
Token he
Token ID339
Feature activation-1.16e-02

Pos logit contributions
lessness+0.017
 twins+0.015
 of+0.015
aky+0.013
 privacy+0.013
Neg logit contributions
 coaster-0.024
 Fer-0.023
 jugg-0.023
 som-0.023
 salon-0.022
Token had
Token ID550
Feature activation+1.39e-02

Pos logit contributions
lessness+0.017
 twins+0.015
 of+0.015
 bricks+0.013
 privacy+0.013
Neg logit contributions
 coaster-0.025
 Fer-0.023
 jugg-0.023
-0.023
 som-0.022
Token used
Token ID973
Feature activation-1.24e-02

Pos logit contributions
 coaster+0.030
 Fer+0.028
 jugg+0.028
+0.028
 som+0.027
Neg logit contributions
lessness-0.020
 of-0.018
 twins-0.018
 privacy-0.015
 bricks-0.014
Token the
Token ID262
Feature activation-4.71e-03

Pos logit contributions
lessness+0.017
 twins+0.014
 of+0.014
 privacy+0.012
aky+0.012
Neg logit contributions
 coaster-0.027
 Fer-0.026
 som-0.025
-0.025
 jugg-0.025
Token squee
Token ID13170
Feature activation+1.44e-01

Pos logit contributions
lessness+0.007
 of+0.006
 twins+0.006
 privacy+0.005
 about+0.005
Neg logit contributions
 coaster-0.010
 Fer-0.009
-0.009
 jugg-0.009
 salon-0.009
Tokengee
Token ID29622
Feature activation+3.58e-02

Pos logit contributions
 Fer+0.249
 salon+0.236
olog+0.228
 blur+0.223
Re+0.222
Neg logit contributions
lessness-0.222
 twins-0.222
 company-0.219
ies-0.218
 of-0.216
Token to
Token ID284
Feature activation+3.35e-02

Pos logit contributions
 coaster+0.084
 Fer+0.080
+0.079
 jugg+0.078
 salon+0.078
Neg logit contributions
lessness-0.045
 of-0.039
 twins-0.038
 privacy-0.032
 about-0.031
Token good
Token ID922
Feature activation+4.98e-02

Pos logit contributions
 coaster+0.068
 Fer+0.065
+0.063
 salon+0.062
 jugg+0.062
Neg logit contributions
lessness-0.053
 of-0.050
 twins-0.047
 company-0.041
 about-0.041
Token use
Token ID779
Feature activation-2.61e-02

Pos logit contributions
 coaster+0.106
 Fer+0.100
 jugg+0.098
+0.098
 som+0.098
Neg logit contributions
lessness-0.075
 of-0.064
 twins-0.064
 privacy-0.054
 about-0.054
Token!
Token ID0
Feature activation-1.21e-02

Pos logit contributions
lessness+0.030
 of+0.026
 twins+0.025
 privacy+0.020
 about+0.019
Neg logit contributions
 coaster-0.064
 Fer-0.062
-0.061
 jugg-0.060
 salon-0.060
Token!"
Token ID2474
Feature activation+4.32e-02

Pos logit contributions
 coaster+0.003
 Fer+0.003
+0.003
 jugg+0.003
 salon+0.003
Neg logit contributions
lessness-0.002
 of-0.002
 twins-0.002
 privacy-0.001
 about-0.001
Token Sam
Token ID3409
Feature activation+6.18e-03

Pos logit contributions
 coaster+0.082
 Fer+0.078
+0.078
 jugg+0.076
 salon+0.075
Neg logit contributions
lessness-0.075
 of-0.067
 twins-0.066
 privacy-0.058
 company-0.057
Token says
Token ID1139
Feature activation-1.29e-02

Pos logit contributions
+0.015
icol+0.014
 coaster+0.014
 Fer+0.013
 blur+0.013
Neg logit contributions
 of-0.008
 twins-0.008
 about-0.007
 who-0.006
lessness-0.006
Token.
Token ID13
Feature activation+6.81e-03

Pos logit contributions
lessness+0.013
 twins+0.010
 privacy+0.009
 of+0.009
ruck+0.009
Neg logit contributions
 coaster-0.032
-0.031
 pro-0.031
 salon-0.030
olog-0.030
Token He
Token ID679
Feature activation+1.96e-02

Pos logit contributions
+0.015
 coaster+0.015
 Fer+0.014
 jugg+0.014
 pro+0.014
Neg logit contributions
lessness-0.009
 twins-0.007
 of-0.007
 privacy-0.007
 company-0.007
Token squee
Token ID13170
Feature activation+1.43e-01

Pos logit contributions
 coaster+0.037
+0.037
icol+0.036
uren+0.036
 salon+0.035
Neg logit contributions
 twins-0.032
 of-0.030
 who-0.027
 toys-0.027
 about-0.027
Tokenzes
Token ID12271
Feature activation+7.45e-03

Pos logit contributions
 Fer+0.158
 salon+0.156
 blur+0.137
+0.133
 coaster+0.127
Neg logit contributions
 twins-0.321
 what-0.316
ies-0.310
 toys-0.307
 friends-0.298
Token the
Token ID262
Feature activation+1.81e-02

Pos logit contributions
 coaster+0.016
 Fer+0.015
+0.015
 jugg+0.015
 salon+0.015
Neg logit contributions
lessness-0.011
 of-0.010
 twins-0.009
 privacy-0.008
 about-0.008
Token dinosaur
Token ID28257
Feature activation+1.48e-02

Pos logit contributions
 coaster+0.031
 Fer+0.030
 jugg+0.028
+0.027
 pro+0.027
Neg logit contributions
lessness-0.036
 of-0.032
 twins-0.030
 privacy-0.029
 about-0.028
Token harder
Token ID7069
Feature activation+3.15e-02

Pos logit contributions
 coaster+0.035
 Fer+0.033
+0.033
 jugg+0.032
 salon+0.032
Neg logit contributions
lessness-0.019
 of-0.017
 twins-0.016
 privacy-0.013
 about-0.013
Token.
Token ID13
Feature activation-8.95e-03

Pos logit contributions
 coaster+0.085
 jugg+0.080
 Fer+0.080
 salon+0.079
 som+0.079
Neg logit contributions
lessness-0.028
 twins-0.022
 of-0.020
 privacy-0.018
 scientists-0.017
Tokenuddle
Token ID24500
Feature activation+2.30e-02

Pos logit contributions
 coaster+0.167
 jugg+0.160
 Fer+0.158
+0.146
 sixty+0.145
Neg logit contributions
lessness-0.199
 of-0.189
 twins-0.175
ies-0.160
aky-0.157
Token!
Token ID0
Feature activation-1.05e-02

Pos logit contributions
 coaster+0.063
 Fer+0.061
+0.060
 jugg+0.059
 salon+0.059
Neg logit contributions
lessness-0.020
 twins-0.016
 of-0.016
 privacy-0.011
 company-0.010
Token Splash
Token ID45275
Feature activation+6.07e-02

Pos logit contributions
lessness+0.014
 of+0.013
 twins+0.012
 about+0.010
 privacy+0.010
Neg logit contributions
 coaster-0.024
 Fer-0.023
-0.022
 jugg-0.022
 salon-0.022
Token!
Token ID0
Feature activation-1.02e-02

Pos logit contributions
 coaster+0.127
 Fer+0.127
 salon+0.121
+0.119
 jugg+0.117
Neg logit contributions
lessness-0.095
 of-0.089
 twins-0.080
 about-0.070
 who-0.069
Token The
Token ID383
Feature activation+1.44e-02

Pos logit contributions
lessness+0.013
 of+0.012
 twins+0.011
 about+0.009
 who+0.009
Neg logit contributions
 coaster-0.024
 Fer-0.023
 jugg-0.023
 salon-0.022
 som-0.022
Token boun
Token ID31283
Feature activation+1.43e-01

Pos logit contributions
 coaster+0.028
 Fer+0.026
+0.026
 jugg+0.025
olog+0.025
Neg logit contributions
lessness-0.024
 of-0.023
 twins-0.023
 company-0.019
 toys-0.019
Tokency
Token ID948
Feature activation+2.95e-02

Pos logit contributions
 coaster+0.306
 Fer+0.299
olog+0.287
 salon+0.282
 som+0.279
Neg logit contributions
 of-0.219
lessness-0.214
 twins-0.183
 everywhere-0.164
able-0.164
Token thing
Token ID1517
Feature activation+6.03e-03

Pos logit contributions
 coaster+0.055
 Fer+0.053
+0.051
 salon+0.051
olog+0.050
Neg logit contributions
lessness-0.051
 of-0.050
 twins-0.046
 who-0.040
 everywhere-0.040
Token was
Token ID373
Feature activation-1.42e-02

Pos logit contributions
 coaster+0.013
 Fer+0.012
+0.012
 jugg+0.012
 salon+0.012
Neg logit contributions
lessness-0.009
 of-0.008
 twins-0.008
 privacy-0.007
 company-0.006
Token having
Token ID1719
Feature activation-2.66e-02

Pos logit contributions
lessness+0.020
 of+0.018
 twins+0.017
 about+0.014
 who+0.014
Neg logit contributions
 coaster-0.032
 Fer-0.031
-0.030
 jugg-0.030
 salon-0.030
Token so
Token ID523
Feature activation-1.66e-02

Pos logit contributions
lessness+0.034
 twins+0.029
 of+0.029
 privacy+0.025
ruck+0.023
Neg logit contributions
 coaster-0.061
 Fer-0.059
-0.058
 jugg-0.058
 som-0.057
Token could
Token ID714
Feature activation+6.37e-02

Pos logit contributions
lessness+0.010
 of+0.008
 twins+0.008
 privacy+0.007
 company+0.006
Neg logit contributions
 coaster-0.018
 Fer-0.017
-0.017
 jugg-0.017
 salon-0.017
Token try
Token ID1949
Feature activation+5.66e-03

Pos logit contributions
 coaster+0.121
 Fer+0.115
+0.111
 salon+0.109
 jugg+0.109
Neg logit contributions
lessness-0.109
 of-0.104
 twins-0.098
 about-0.086
 who-0.085
Token it
Token ID340
Feature activation-2.82e-02

Pos logit contributions
 coaster+0.014
 Fer+0.013
olog+0.013
+0.013
icol+0.013
Neg logit contributions
lessness-0.006
 of-0.006
 twins-0.006
 about-0.005
 who-0.005
Token.
Token ID13
Feature activation-6.43e-03

Pos logit contributions
lessness+0.019
 of+0.017
 twins+0.014
 about+0.010
 who+0.009
Neg logit contributions
 coaster-0.083
 Fer-0.080
-0.079
 jugg-0.078
 salon-0.078
Token The
Token ID383
Feature activation+1.98e-02

Pos logit contributions
 of+0.007
lessness+0.007
 twins+0.006
 who+0.005
 about+0.005
Neg logit contributions
 coaster-0.017
 Fer-0.016
 som-0.015
 jugg-0.015
 blur-0.015
Token sur
Token ID969
Feature activation+1.43e-01

Pos logit contributions
 coaster+0.041
 Fer+0.039
+0.039
 jugg+0.038
 salon+0.038
Neg logit contributions
lessness-0.030
 of-0.028
 twins-0.027
 privacy-0.023
 company-0.022
Tokenfer
Token ID2232
Feature activation+2.66e-02

Pos logit contributions
 Fer+0.156
icol+0.153
 som+0.150
+0.149
 coaster+0.148
Neg logit contributions
lessness-0.336
 of-0.328
 who-0.310
 twins-0.306
 about-0.290
Token said
Token ID531
Feature activation-3.05e-02

Pos logit contributions
 coaster+0.054
 Fer+0.052
+0.051
 salon+0.050
icol+0.049
Neg logit contributions
lessness-0.042
 of-0.040
 twins-0.037
 who-0.033
 about-0.033
Token
Token ID6184
Feature activation+8.64e-02

Pos logit contributions
lessness+0.051
 of+0.048
 twins+0.045
 about+0.040
 who+0.039
Neg logit contributions
 coaster-0.060
 Fer-0.057
-0.055
 jugg-0.054
 salon-0.054
Token
Token ID95
Feature activation+9.59e-02

Pos logit contributions
 salon+0.110
 blur+0.094
 coaster+0.093
 seventeen+0.088
 checkout+0.088
Neg logit contributions
lessness-0.165
 who-0.164
.-0.161
 of-0.157
 about-0.150
Token
Token ID26391
Feature activation+8.02e-02

Pos logit contributions
 salon+0.158
+0.140
 blur+0.132
 wrists+0.131
irl+0.130
Neg logit contributions
lessness-0.168
 who-0.159
 of-0.156
 about-0.155
.-0.151
Token She
Token ID1375
Feature activation+1.84e-02

Pos logit contributions
lessness+0.016
 of+0.014
 twins+0.013
 privacy+0.012
ruck+0.011
Neg logit contributions
 coaster-0.027
-0.026
 Fer-0.026
 salon-0.025
 jugg-0.025
Token takes
Token ID2753
Feature activation+1.40e-02

Pos logit contributions
 coaster+0.040
+0.040
 Fer+0.036
OM+0.035
 salon+0.035
Neg logit contributions
 twins-0.027
 of-0.026
 about-0.023
 who-0.023
 toys-0.022
Token the
Token ID262
Feature activation+1.36e-02

Pos logit contributions
 coaster+0.035
 Fer+0.033
 salon+0.033
+0.033
 jugg+0.033
Neg logit contributions
lessness-0.015
 of-0.015
 twins-0.014
 everywhere-0.011
 bricks-0.011
Token orange
Token ID10912
Feature activation+4.47e-02

Pos logit contributions
 coaster+0.024
 Fer+0.024
+0.023
 jugg+0.023
 pro+0.022
Neg logit contributions
lessness-0.025
 of-0.021
 twins-0.021
 privacy-0.021
 about-0.020
Token and
Token ID290
Feature activation+1.47e-02

Pos logit contributions
 coaster+0.096
 Fer+0.092
+0.091
 jugg+0.089
 salon+0.088
Neg logit contributions
lessness-0.066
 of-0.058
 twins-0.057
 privacy-0.049
 about-0.048
Token squee
Token ID13170
Feature activation+1.43e-01

Pos logit contributions
 coaster+0.034
OM+0.031
+0.030
 salon+0.030
 destination+0.030
Neg logit contributions
 of-0.021
 twins-0.020
 toys-0.018
 what-0.018
 bricks-0.017
Tokenzes
Token ID12271
Feature activation+3.66e-03

Pos logit contributions
 Fer+0.158
 salon+0.156
 blur+0.148
 coaster+0.133
+0.131
Neg logit contributions
ies-0.323
 twins-0.317
 toys-0.307
 what-0.301
 feelings-0.298
Token it
Token ID340
Feature activation-1.93e-02

Pos logit contributions
 coaster+0.008
 Fer+0.007
+0.007
 jugg+0.007
 salon+0.007
Neg logit contributions
lessness-0.006
 of-0.005
 twins-0.005
 about-0.004
 everywhere-0.004
Token hard
Token ID1327
Feature activation+2.40e-02

Pos logit contributions
lessness+0.021
 of+0.019
 twins+0.017
 about+0.014
 everywhere+0.013
Neg logit contributions
 coaster-0.050
 Fer-0.048
-0.047
 jugg-0.046
 salon-0.046
Token.
Token ID13
Feature activation-1.70e-02

Pos logit contributions
 coaster+0.060
 Fer+0.057
 pro+0.057
+0.057
 jugg+0.056
Neg logit contributions
lessness-0.025
 twins-0.020
 privacy-0.018
 of-0.017
ovable-0.016
Token She
Token ID1375
Feature activation+1.36e-02

Pos logit contributions
lessness+0.019
 of+0.017
 twins+0.016
 about+0.013
 privacy+0.013
Neg logit contributions
 coaster-0.043
 Fer-0.041
-0.040
 jugg-0.040
 salon-0.039
Token.
Token ID13
Feature activation-2.26e-02

Pos logit contributions
lessness+0.057
 twins+0.050
 privacy+0.045
 bricks+0.041
 of+0.040
Neg logit contributions
 coaster-0.120
 jugg-0.115
 blur-0.113
 pro-0.112
 Fer-0.112
Token
Token ID220
Feature activation+9.19e-03

Pos logit contributions
lessness+0.029
 of+0.027
 twins+0.025
 about+0.021
 who+0.020
Neg logit contributions
 coaster-0.053
 Fer-0.051
 jugg-0.050
 salon-0.050
 som-0.049
Token
Token ID198
Feature activation-1.20e-02

Pos logit contributions
 coaster+0.020
+0.020
 Fer+0.019
 salon+0.019
 jugg+0.019
Neg logit contributions
lessness-0.012
 of-0.011
 twins-0.010
 privacy-0.010
 company-0.009
Token
Token ID198
Feature activation+1.08e-03

Pos logit contributions
lessness+0.011
 of+0.011
 twins+0.009
 who+0.007
 about+0.007
Neg logit contributions
 coaster-0.032
 Fer-0.031
 salon-0.030
icol-0.030
 jugg-0.030
TokenThe
Token ID464
Feature activation+2.71e-02

Pos logit contributions
icol+0.002
 coaster+0.002
 Fer+0.002
+0.002
 salon+0.002
Neg logit contributions
 of-0.002
lessness-0.002
 twins-0.001
 who-0.001
aky-0.001
Token boun
Token ID31283
Feature activation+1.43e-01

Pos logit contributions
 coaster+0.057
 Fer+0.055
icol+0.054
olog+0.054
+0.053
Neg logit contributions
lessness-0.039
 twins-0.039
 of-0.038
 toys-0.031
 company-0.030
Tokency
Token ID948
Feature activation+3.49e-02

Pos logit contributions
 coaster+0.315
 Fer+0.311
olog+0.300
 som+0.300
 sixty+0.297
Neg logit contributions
 of-0.198
lessness-0.186
able-0.166
 twins-0.162
sm-0.145
Token ball
Token ID2613
Feature activation+5.08e-02

Pos logit contributions
 coaster+0.067
 Fer+0.067
+0.066
olog+0.064
 salon+0.064
Neg logit contributions
lessness-0.055
 of-0.053
 twins-0.051
 who-0.044
 toys-0.043
Token was
Token ID373
Feature activation-1.45e-02

Pos logit contributions
 coaster+0.094
 Fer+0.090
+0.088
icol+0.085
 salon+0.085
Neg logit contributions
lessness-0.089
 of-0.085
 twins-0.079
 about-0.071
 who-0.070
Token so
Token ID523
Feature activation-1.25e-02

Pos logit contributions
lessness+0.021
 twins+0.017
 of+0.017
 privacy+0.015
 bricks+0.015
Neg logit contributions
 coaster-0.031
 Fer-0.030
 jugg-0.030
 blur-0.029
 som-0.029
Token brave
Token ID14802
Feature activation-4.31e-03

Pos logit contributions
lessness+0.016
 twins+0.014
 of+0.013
 privacy+0.012
ruck+0.012
Neg logit contributions
 coaster-0.028
 Fer-0.027
 jugg-0.027
-0.027
 blur-0.026
Token soon
Token ID2582
Feature activation-2.39e-02

Pos logit contributions
lessness+0.016
 twins+0.014
 of+0.013
 privacy+0.012
 bricks+0.011
Neg logit contributions
 coaster-0.026
 blur-0.025
 Fer-0.025
 som-0.025
 jugg-0.025
Token nothing
Token ID2147
Feature activation-5.27e-02

Pos logit contributions
lessness+0.037
 twins+0.032
 of+0.032
 privacy+0.028
 bricks+0.027
Neg logit contributions
 coaster-0.049
 Fer-0.047
-0.046
 jugg-0.046
 som-0.046
Token but
Token ID475
Feature activation-1.26e-02

Pos logit contributions
lessness+0.073
 twins+0.070
 privacy+0.060
 bricks+0.060
 of+0.059
Neg logit contributions
 coaster-0.109
 pro-0.106
 som-0.106
 Fer-0.105
 jugg-0.104
Token a
Token ID257
Feature activation+8.52e-03

Pos logit contributions
lessness+0.018
 of+0.016
 twins+0.016
 privacy+0.013
 about+0.013
Neg logit contributions
 coaster-0.027
 Fer-0.026
-0.026
 jugg-0.026
 salon-0.025
Token small
Token ID1402
Feature activation+3.85e-02

Pos logit contributions
 coaster+0.022
 Fer+0.021
+0.020
 jugg+0.020
 salon+0.020
Neg logit contributions
lessness-0.009
 of-0.008
 twins-0.008
 who-0.006
 privacy-0.006
Token spe
Token ID693
Feature activation+1.41e-01

Pos logit contributions
 coaster+0.083
 Fer+0.079
+0.078
 jugg+0.077
 salon+0.076
Neg logit contributions
lessness-0.056
 of-0.051
 twins-0.049
 privacy-0.041
 about-0.041
Tokenck
Token ID694
Feature activation-1.11e-03

Pos logit contributions
 coaster+0.238
 Fer+0.230
+0.228
 jugg+0.214
 salon+0.213
Neg logit contributions
lessness-0.274
 of-0.259
 twins-0.239
 company-0.225
 privacy-0.224
Token in
Token ID287
Feature activation+3.92e-02

Pos logit contributions
lessness+0.002
 twins+0.001
 privacy+0.001
 of+0.001
 bricks+0.001
Neg logit contributions
 coaster-0.002
-0.002
 jugg-0.002
 pro-0.002
 blur-0.002
Token the
Token ID262
Feature activation-7.06e-03

Pos logit contributions
 coaster+0.089
 Fer+0.086
+0.084
 jugg+0.083
 salon+0.082
Neg logit contributions
lessness-0.053
 of-0.047
 twins-0.046
 privacy-0.038
 about-0.037
Token sky
Token ID6766
Feature activation-1.16e-02

Pos logit contributions
lessness+0.012
 of+0.011
 twins+0.011
 privacy+0.010
 company+0.009
Neg logit contributions
 coaster-0.013
 Fer-0.013
-0.013
 jugg-0.012
 salon-0.012
Token.
Token ID13
Feature activation-1.79e-02

Pos logit contributions
lessness+0.011
 of+0.008
 twins+0.008
 privacy+0.006
 bricks+0.006
Neg logit contributions
 coaster-0.031
 Fer-0.030
-0.030
 jugg-0.030
 salon-0.029
Token key
Token ID1994
Feature activation-4.56e-04

Pos logit contributions
lessness+0.027
 of+0.024
 twins+0.023
 about+0.018
 privacy+0.018
Neg logit contributions
 coaster-0.056
 Fer-0.054
-0.053
 jugg-0.053
 salon-0.052
Token and
Token ID290
Feature activation-1.25e-02

Pos logit contributions
lessness+0.001
 twins+0.000
 privacy+0.000
 scientists+0.000
 importantly+0.000
Neg logit contributions
 coaster-0.001
 salon-0.001
 Fer-0.001
 pro-0.001
 jugg-0.001
Token wanted
Token ID2227
Feature activation-5.28e-02

Pos logit contributions
lessness+0.020
 twins+0.017
 of+0.017
 company+0.015
 privacy+0.015
Neg logit contributions
 coaster-0.025
 Fer-0.024
 jugg-0.024
-0.023
 salon-0.023
Token to
Token ID284
Feature activation+3.23e-02

Pos logit contributions
lessness+0.085
 twins+0.068
iness+0.064
aky+0.064
itive+0.059
Neg logit contributions
 coaster-0.110
 Fer-0.107
 som-0.105
 pro-0.104
 jugg-0.102
Token show
Token ID905
Feature activation-4.50e-02

Pos logit contributions
 coaster+0.066
 Fer+0.062
+0.062
 salon+0.060
 jugg+0.059
Neg logit contributions
lessness-0.049
 of-0.049
 twins-0.045
 privacy-0.040
 about-0.040
Token someone
Token ID2130
Feature activation-1.24e-01

Pos logit contributions
lessness+0.078
 of+0.070
 twins+0.069
 privacy+0.060
 about+0.059
Neg logit contributions
 coaster-0.086
 Fer-0.081
-0.079
 jugg-0.079
 salon-0.078
Token.
Token ID13
Feature activation-2.19e-02

Pos logit contributions
lessness+0.220
 twins+0.166
aky+0.164
anny+0.160
iness+0.155
Neg logit contributions
 Fer-0.228
 pro-0.225
 coaster-0.222
 jugg-0.221
 som-0.211
Token But
Token ID887
Feature activation-1.10e-02

Pos logit contributions
lessness+0.021
 of+0.019
 twins+0.018
 about+0.014
 who+0.013
Neg logit contributions
 coaster-0.058
 Fer-0.056
 jugg-0.056
 salon-0.055
-0.055
Token he
Token ID339
Feature activation-5.77e-03

Pos logit contributions
lessness+0.017
 of+0.016
 twins+0.015
 about+0.013
 who+0.013
Neg logit contributions
 coaster-0.023
 Fer-0.022
-0.022
 jugg-0.021
 salon-0.021
Token was
Token ID373
Feature activation+9.33e-03

Pos logit contributions
 of+0.007
lessness+0.007
 twins+0.006
 who+0.006
 about+0.006
Neg logit contributions
 Fer-0.014
 coaster-0.014
-0.013
icol-0.013
 salon-0.013
Token still
Token ID991
Feature activation-7.83e-03

Pos logit contributions
 coaster+0.022
+0.021
icol+0.021
 Fer+0.021
olog+0.021
Neg logit contributions
lessness-0.012
 of-0.011
 twins-0.011
 about-0.010
 who-0.009
Token and
Token ID290
Feature activation+9.82e-03

Pos logit contributions
 coaster+0.113
 Fer+0.108
 salon+0.106
+0.105
 pro+0.103
Neg logit contributions
lessness-0.057
 of-0.045
 twins-0.044
 privacy-0.039
 company-0.037
Token went
Token ID1816
Feature activation+3.99e-02

Pos logit contributions
+0.023
 coaster+0.022
 seventeen+0.022
Too+0.021
 salon+0.021
Neg logit contributions
 of-0.012
lessness-0.011
 everywhere-0.011
 twins-0.011
 toys-0.010
Token outside
Token ID2354
Feature activation+1.58e-02

Pos logit contributions
 coaster+0.106
 Fer+0.102
+0.101
 jugg+0.101
 salon+0.099
Neg logit contributions
lessness-0.038
 of-0.035
 twins-0.032
 about-0.024
 everywhere-0.023
Token to
Token ID284
Feature activation+9.65e-03

Pos logit contributions
 coaster+0.039
 jugg+0.038
 Fer+0.038
+0.038
 pro+0.037
Neg logit contributions
lessness-0.017
 twins-0.013
 of-0.012
 privacy-0.012
aky-0.011
Token show
Token ID905
Feature activation-5.13e-02

Pos logit contributions
 coaster+0.022
+0.021
olog+0.021
 salon+0.021
icol+0.020
Neg logit contributions
 of-0.012
 twins-0.012
lessness-0.011
 who-0.010
 what-0.010
Token everyone
Token ID2506
Feature activation-1.21e-01

Pos logit contributions
lessness+0.062
 of+0.051
 twins+0.050
 privacy+0.040
 company+0.039
Neg logit contributions
 coaster-0.125
 Fer-0.121
 jugg-0.117
 som-0.116
 salon-0.115
Token his
Token ID465
Feature activation-6.12e-02

Pos logit contributions
lessness+0.201
 privacy+0.158
aky+0.156
 company+0.152
 twins+0.150
Neg logit contributions
 coaster-0.234
 jugg-0.230
 Fer-0.229
 som-0.220
 pro-0.214
Token new
Token ID649
Feature activation-3.30e-03

Pos logit contributions
lessness+0.108
 of+0.093
 twins+0.083
free+0.077
iness+0.077
Neg logit contributions
 coaster-0.120
 Fer-0.114
 som-0.108
 jugg-0.108
 pro-0.105
Token sword
Token ID8429
Feature activation-4.52e-02

Pos logit contributions
lessness+0.006
 of+0.005
 twins+0.005
 privacy+0.004
 who+0.004
Neg logit contributions
 coaster-0.006
 Fer-0.006
 jugg-0.006
 som-0.006
 pro-0.006
Token.
Token ID13
Feature activation-3.49e-02

Pos logit contributions
lessness+0.050
 twins+0.038
 of+0.036
 privacy+0.034
aky+0.033
Neg logit contributions
 coaster-0.114
 Fer-0.105
 jugg-0.103
 som-0.103
 pro-0.103
Token He
Token ID679
Feature activation-3.84e-03

Pos logit contributions
lessness+0.052
 twins+0.045
 of+0.045
 privacy+0.040
 company+0.039
Neg logit contributions
 coaster-0.075
 Fer-0.071
-0.071
 jugg-0.068
olog-0.067
Token how
Token ID703
Feature activation-2.26e-02

Pos logit contributions
lessness+0.011
 of+0.009
 twins+0.009
aky+0.007
ruck+0.007
Neg logit contributions
 coaster-0.019
 Fer-0.018
 jugg-0.018
 som-0.017
-0.017
Token your
Token ID534
Feature activation-3.86e-03

Pos logit contributions
lessness+0.035
 twins+0.030
ruck+0.029
aky+0.028
 of+0.027
Neg logit contributions
 Fer-0.040
 coaster-0.040
 som-0.040
 pro-0.038
 jugg-0.038
Token actions
Token ID4028
Feature activation-5.69e-02

Pos logit contributions
lessness+0.005
 of+0.004
 twins+0.004
ruck+0.003
aky+0.003
Neg logit contributions
 coaster-0.009
 Fer-0.009
-0.008
 jugg-0.008
 salon-0.008
Token might
Token ID1244
Feature activation+1.01e-03

Pos logit contributions
lessness+0.079
 twins+0.070
 of+0.068
 bricks+0.058
 privacy+0.058
Neg logit contributions
 coaster-0.121
 jugg-0.116
-0.115
 Fer-0.115
 som-0.112
Token affect
Token ID2689
Feature activation-5.71e-02

Pos logit contributions
 coaster+0.002
 jugg+0.002
+0.002
 Fer+0.002
 som+0.002
Neg logit contributions
lessness-0.002
 twins-0.002
 of-0.002
 bricks-0.001
 privacy-0.001
Token others
Token ID1854
Feature activation-1.20e-01

Pos logit contributions
lessness+0.072
 of+0.055
 twins+0.052
 bricks+0.044
ruck+0.043
Neg logit contributions
 coaster-0.131
 jugg-0.128
 som-0.127
 Fer-0.125
-0.124
Token.
Token ID13
Feature activation-1.37e-02

Pos logit contributions
lessness+0.138
ruck+0.107
 bricks+0.103
 twins+0.103
 of+0.101
Neg logit contributions
 coaster-0.258
 pro-0.256
 jugg-0.255
 salon-0.247
-0.245
Token<|endoftext|>
Token ID50256
Feature activation-6.23e-03

Pos logit contributions
lessness+0.018
 of+0.016
 twins+0.015
 privacy+0.013
 about+0.012
Neg logit contributions
 coaster-0.032
 Fer-0.030
-0.030
 jugg-0.029
 salon-0.029
TokenOnce
Token ID7454
Feature activation-2.35e-02

Pos logit contributions
lessness+0.009
 of+0.007
 twins+0.007
 privacy+0.007
ruck+0.006
Neg logit contributions
 coaster-0.014
-0.013
 Fer-0.013
 jugg-0.013
 pro-0.012
Token upon
Token ID2402
Feature activation-1.59e-02

Pos logit contributions
lessness+0.032
 privacy+0.031
 company+0.029
ruck+0.029
 scientists+0.028
Neg logit contributions
 salon-0.045
-0.043
 drib-0.041
 pro-0.041
irl-0.041
Token a
Token ID257
Feature activation-1.00e-03

Pos logit contributions
lessness+0.028
iness+0.022
 privacy+0.021
 twins+0.021
 scientists+0.021
Neg logit contributions
 coaster-0.030
 jugg-0.028
 som-0.027
V-0.027
 Fer-0.027
Token bird
Token ID6512
Feature activation-1.23e-02

Pos logit contributions
lessness+0.010
 twins+0.009
 of+0.009
 who+0.007
 toys+0.005
Neg logit contributions
 coaster-0.044
 Fer-0.043
-0.042
olog-0.042
 salon-0.041
Token who
Token ID508
Feature activation-3.02e-02

Pos logit contributions
lessness+0.019
 scientists+0.016
 twins+0.015
 privacy+0.015
 sons+0.015
Neg logit contributions
 coaster-0.025
 jugg-0.024
 salon-0.024
 som-0.023
 blur-0.023
Token wanted
Token ID2227
Feature activation-6.20e-02

Pos logit contributions
lessness+0.046
 twins+0.041
 of+0.040
 scientists+0.037
 bricks+0.037
Neg logit contributions
 coaster-0.060
 jugg-0.057
 som-0.056
 sav-0.056
-0.056
Token to
Token ID284
Feature activation+2.06e-02

Pos logit contributions
lessness+0.098
 twins+0.074
iness+0.069
aky+0.067
ovable+0.066
Neg logit contributions
 coaster-0.130
 Fer-0.128
 som-0.126
 jugg-0.125
 salon-0.121
Token show
Token ID905
Feature activation-4.28e-02

Pos logit contributions
 coaster+0.035
 Fer+0.033
+0.032
icol+0.031
 pro+0.031
Neg logit contributions
 of-0.039
lessness-0.038
 twins-0.036
 company-0.033
 privacy-0.032
Token everyone
Token ID2506
Feature activation-1.20e-01

Pos logit contributions
lessness+0.070
 twins+0.060
 of+0.059
aky+0.050
 privacy+0.050
Neg logit contributions
 coaster-0.085
 Fer-0.082
 jugg-0.080
 som-0.079
 salon-0.078
Token how
Token ID703
Feature activation-3.30e-02

Pos logit contributions
lessness+0.223
 twins+0.182
 privacy+0.176
aky+0.173
 sons+0.173
Neg logit contributions
 jugg-0.198
 Fer-0.193
 coaster-0.192
 som-0.191
 salon-0.187
Token beautiful
Token ID4950
Feature activation+9.79e-03

Pos logit contributions
lessness+0.043
 twins+0.037
ruck+0.034
 of+0.034
aky+0.032
Neg logit contributions
 Fer-0.069
 som-0.068
 coaster-0.067
 jugg-0.066
 pro-0.065
Token her
Token ID607
Feature activation-6.38e-02

Pos logit contributions
 coaster+0.018
 Fer+0.017
+0.016
 jugg+0.016
 salon+0.016
Neg logit contributions
lessness-0.018
 of-0.016
 twins-0.016
 privacy-0.014
 about-0.014
Token singing
Token ID13777
Feature activation-2.28e-02

Pos logit contributions
lessness+0.074
 twins+0.055
 of+0.054
ruck+0.052
 scientists+0.047
Neg logit contributions
 coaster-0.150
-0.147
 salon-0.144
 Fer-0.144
 jugg-0.143
Token was
Token ID373
Feature activation-4.51e-02

Pos logit contributions
lessness+0.034
 twins+0.031
 of+0.029
 privacy+0.027
 bricks+0.026
Neg logit contributions
 coaster-0.047
 jugg-0.044
 Fer-0.044
-0.044
 pro-0.043
Token take
Token ID1011
Feature activation+4.56e-03

Pos logit contributions
 coaster+0.076
 Fer+0.071
 salon+0.071
+0.071
icol+0.070
Neg logit contributions
 of-0.070
lessness-0.063
 twins-0.059
 about-0.056
 who-0.055
Token it
Token ID340
Feature activation-2.75e-02

Pos logit contributions
 salon+0.012
 coaster+0.011
+0.011
olog+0.011
 jugg+0.011
Neg logit contributions
 of-0.005
lessness-0.005
 twins-0.004
 about-0.004
 what-0.004
Token home
Token ID1363
Feature activation-3.43e-02

Pos logit contributions
lessness+0.047
 twins+0.042
 privacy+0.036
 importantly+0.036
 sons+0.034
Neg logit contributions
 coaster-0.053
 Fer-0.048
 drib-0.048
-0.048
 jugg-0.047
Token to
Token ID284
Feature activation+7.66e-03

Pos logit contributions
lessness+0.039
 of+0.034
 twins+0.032
 about+0.026
 privacy+0.025
Neg logit contributions
 coaster-0.085
 Fer-0.082
-0.081
 jugg-0.080
 salon-0.079
Token show
Token ID905
Feature activation-5.09e-02

Pos logit contributions
 coaster+0.015
 salon+0.015
 Fer+0.014
+0.014
olog+0.014
Neg logit contributions
 of-0.013
lessness-0.012
 twins-0.012
 about-0.011
 what-0.010
Token everyone
Token ID2506
Feature activation-1.19e-01

Pos logit contributions
lessness+0.062
 of+0.054
 twins+0.052
 privacy+0.042
 about+0.042
Neg logit contributions
 coaster-0.123
 Fer-0.119
-0.117
 jugg-0.115
 salon-0.115
Token.
Token ID13
Feature activation-2.72e-02

Pos logit contributions
lessness+0.197
 twins+0.161
 privacy+0.155
aky+0.153
 bricks+0.151
Neg logit contributions
 coaster-0.230
 jugg-0.228
 Fer-0.223
 som-0.218
 pro-0.213
Token He
Token ID679
Feature activation-6.08e-03

Pos logit contributions
lessness+0.036
 of+0.033
 twins+0.031
 about+0.026
 who+0.026
Neg logit contributions
 coaster-0.063
 Fer-0.061
 jugg-0.059
 salon-0.059
-0.059
Token grabbed
Token ID13176
Feature activation+3.81e-04

Pos logit contributions
 of+0.007
 who+0.007
 friends+0.006
 twins+0.006
 about+0.006
Neg logit contributions
-0.015
Whe-0.014
 coaster-0.014
 sidelines-0.014
 Fer-0.014
Token the
Token ID262
Feature activation+5.18e-03

Pos logit contributions
 coaster+0.001
+0.001
 salon+0.001
 Fer+0.001
 jugg+0.001
Neg logit contributions
lessness-0.001
 of-0.000
 twins-0.000
 about-0.000
 everywhere-0.000
Token trophy
Token ID16383
Feature activation+2.57e-02

Pos logit contributions
 coaster+0.006
 Fer+0.005
+0.005
 jugg+0.005
 salon+0.005
Neg logit contributions
lessness-0.013
 of-0.012
 twins-0.012
 privacy-0.011
 about-0.011
Token big
Token ID1263
Feature activation+3.42e-02

Pos logit contributions
lessness+0.019
 of+0.017
 twins+0.016
 privacy+0.014
 who+0.014
Neg logit contributions
 coaster-0.027
 Fer-0.026
-0.026
 jugg-0.025
 salon-0.025
Token fancy
Token ID14996
Feature activation+2.50e-02

Pos logit contributions
 coaster+0.067
 Fer+0.064
+0.063
 jugg+0.062
 salon+0.062
Neg logit contributions
lessness-0.056
 of-0.051
 twins-0.049
 privacy-0.043
 about-0.042
Token wedding
Token ID10614
Feature activation-2.58e-03

Pos logit contributions
 coaster+0.048
 Fer+0.046
 salon+0.045
 jugg+0.044
+0.044
Neg logit contributions
lessness-0.043
 of-0.037
 twins-0.036
 bricks-0.031
aky-0.031
Token to
Token ID284
Feature activation+1.69e-02

Pos logit contributions
lessness+0.003
 twins+0.002
 of+0.002
 privacy+0.002
 bricks+0.002
Neg logit contributions
 coaster-0.007
 Fer-0.006
-0.006
 jugg-0.006
 salon-0.006
Token show
Token ID905
Feature activation-4.28e-02

Pos logit contributions
 coaster+0.032
 Fer+0.031
 jugg+0.030
+0.030
 salon+0.029
Neg logit contributions
lessness-0.029
 of-0.026
 twins-0.025
 privacy-0.022
 bricks-0.022
Token someone
Token ID2130
Feature activation-1.19e-01

Pos logit contributions
lessness+0.064
 twins+0.053
 of+0.053
aky+0.046
 bricks+0.045
Neg logit contributions
 coaster-0.088
 Fer-0.084
 som-0.082
 jugg-0.082
olog-0.082
Token you
Token ID345
Feature activation-6.06e-02

Pos logit contributions
lessness+0.155
 twins+0.118
 bricks+0.107
ruck+0.106
aky+0.104
Neg logit contributions
 coaster-0.258
 Fer-0.256
 pro-0.255
 jugg-0.251
 salon-0.248
Token love
Token ID1842
Feature activation-4.40e-02

Pos logit contributions
lessness+0.082
 twins+0.069
 of+0.067
 bricks+0.063
ruck+0.060
Neg logit contributions
 coaster-0.125
 jugg-0.124
 pro-0.122
 Fer-0.122
 som-0.120
Token them
Token ID606
Feature activation-7.40e-02

Pos logit contributions
lessness+0.052
 of+0.040
 twins+0.040
 bricks+0.032
 privacy+0.031
Neg logit contributions
 coaster-0.106
 Fer-0.100
 jugg-0.100
-0.099
 pro-0.098
Token.
Token ID13
Feature activation-1.20e-02

Pos logit contributions
lessness+0.088
 twins+0.068
 of+0.065
ruck+0.062
 bricks+0.060
Neg logit contributions
 coaster-0.173
 jugg-0.165
 Fer-0.164
 pro-0.164
-0.163
Token All
Token ID1439
Feature activation-2.91e-03

Pos logit contributions
lessness+0.015
 of+0.013
 twins+0.013
 about+0.010
 who+0.010
Neg logit contributions
 coaster-0.029
 Fer-0.028
 jugg-0.027
 salon-0.027
 som-0.026
Token to
Token ID284
Feature activation+3.01e-02

Pos logit contributions
lessness+0.018
 twins+0.015
 of+0.014
aky+0.012
 company+0.012
Neg logit contributions
 coaster-0.034
 som-0.032
-0.032
 pro-0.032
olog-0.032
Token be
Token ID307
Feature activation-4.88e-03

Pos logit contributions
 coaster+0.063
 Fer+0.060
+0.058
 salon+0.058
icol+0.057
Neg logit contributions
lessness-0.046
 of-0.044
 twins-0.043
 company-0.038
 about-0.037
Token consider
Token ID2074
Feature activation-9.22e-03

Pos logit contributions
lessness+0.006
 of+0.005
 twins+0.005
ruck+0.004
iness+0.004
Neg logit contributions
 coaster-0.011
 pro-0.011
 Fer-0.011
 jugg-0.011
 som-0.011
Tokenate
Token ID378
Feature activation-1.89e-02

Pos logit contributions
lessness+0.020
 of+0.017
 twins+0.017
 bricks+0.015
 company+0.015
Neg logit contributions
 coaster-0.013
 Fer-0.013
 jugg-0.012
 som-0.012
olog-0.012
Token of
Token ID286
Feature activation-3.20e-02

Pos logit contributions
lessness+0.020
 twins+0.017
ruck+0.016
 bricks+0.015
 privacy+0.015
Neg logit contributions
 coaster-0.038
 pro-0.038
ar-0.037
olog-0.037
 jugg-0.036
Token others
Token ID1854
Feature activation-1.18e-01

Pos logit contributions
lessness+0.044
 of+0.034
 twins+0.033
aky+0.030
ruck+0.030
Neg logit contributions
 som-0.063
 jugg-0.061
 coaster-0.061
 drib-0.060
 sav-0.060
Token and
Token ID290
Feature activation-1.14e-02

Pos logit contributions
lessness+0.160
 twins+0.133
ruck+0.133
 bricks+0.129
 of+0.124
Neg logit contributions
 coaster-0.230
 pro-0.227
 jugg-0.223
-0.217
 Fer-0.217
Token not
Token ID407
Feature activation-2.83e-02

Pos logit contributions
lessness+0.018
 twins+0.015
 of+0.015
aky+0.013
ruck+0.013
Neg logit contributions
 jugg-0.022
 coaster-0.022
 som-0.021
 Fer-0.021
 drib-0.021
Token be
Token ID307
Feature activation+1.95e-02

Pos logit contributions
lessness+0.054
 twins+0.047
 of+0.047
 bricks+0.041
 privacy+0.041
Neg logit contributions
 coaster-0.047
 jugg-0.047
 Fer-0.046
 som-0.045
 blur-0.044
Token too
Token ID1165
Feature activation+3.47e-03

Pos logit contributions
 coaster+0.042
 Fer+0.042
 pro+0.042
 jugg+0.042
 som+0.041
Neg logit contributions
lessness-0.025
 of-0.020
 twins-0.019
 privacy-0.017
 bricks-0.017
Token noisy
Token ID31210
Feature activation-3.40e-02

Pos logit contributions
 jugg+0.007
 coaster+0.007
 Fer+0.007
 pro+0.007
 blur+0.007
Neg logit contributions
lessness-0.004
 twins-0.004
 of-0.003
 bricks-0.003
 privacy-0.003
Token and
Token ID290
Feature activation-3.35e-03

Pos logit contributions
lessness+0.056
 twins+0.052
 importantly+0.045
 privacy+0.045
ruck+0.044
Neg logit contributions
 coaster-0.062
 pro-0.061
 destination-0.061
-0.060
 som-0.060
Token just
Token ID655
Feature activation-4.44e-03

Pos logit contributions
lessness+0.006
 of+0.005
 twins+0.005
 privacy+0.004
 company+0.004
Neg logit contributions
 coaster-0.006
 Fer-0.006
 jugg-0.006
-0.006
 salon-0.006
Token had
Token ID550
Feature activation+6.98e-03

Pos logit contributions
lessness+0.008
 twins+0.008
 of+0.007
 privacy+0.006
 company+0.006
Neg logit contributions
 coaster-0.008
 Fer-0.007
 jugg-0.007
 som-0.007
 blur-0.007
Token to
Token ID284
Feature activation+4.87e-02

Pos logit contributions
 coaster+0.017
 Fer+0.017
+0.016
 jugg+0.016
 salon+0.016
Neg logit contributions
lessness-0.008
 of-0.007
 twins-0.007
 privacy-0.005
 about-0.005
Token show
Token ID905
Feature activation-3.80e-02

Pos logit contributions
 coaster+0.102
 Fer+0.098
+0.095
 pro+0.095
 salon+0.093
Neg logit contributions
 of-0.070
lessness-0.068
 twins-0.064
 company-0.061
 who-0.059
Token someone
Token ID2130
Feature activation-1.17e-01

Pos logit contributions
lessness+0.068
 of+0.062
 twins+0.061
 privacy+0.053
 about+0.052
Neg logit contributions
 coaster-0.070
 Fer-0.066
-0.065
 jugg-0.064
 salon-0.063
Token.
Token ID13
Feature activation-3.14e-02

Pos logit contributions
lessness+0.190
 twins+0.143
anny+0.136
aky+0.135
 company+0.132
Neg logit contributions
 pro-0.228
 Fer-0.223
 jugg-0.223
 som-0.220
 coaster-0.220
Token She
Token ID1375
Feature activation+5.81e-03

Pos logit contributions
lessness+0.041
 of+0.036
 twins+0.035
 privacy+0.029
 about+0.029
Neg logit contributions
 coaster-0.073
 Fer-0.070
-0.069
 jugg-0.068
 salon-0.067
Token ran
Token ID4966
Feature activation+3.99e-02

Pos logit contributions
+0.014
 coaster+0.014
 Fer+0.014
Whe+0.014
 sidelines+0.013
Neg logit contributions
 of-0.007
 who-0.006
 twins-0.006
lessness-0.006
 about-0.005
Token inside
Token ID2641
Feature activation+7.41e-03

Pos logit contributions
 coaster+0.107
 Fer+0.105
+0.104
 salon+0.103
 jugg+0.103
Neg logit contributions
lessness-0.034
 of-0.034
 twins-0.029
 everywhere-0.025
 about-0.025
Token and
Token ID290
Feature activation+3.41e-03

Pos logit contributions
 coaster+0.016
+0.016
 jugg+0.015
 salon+0.015
 pro+0.015
Neg logit contributions
lessness-0.010
 twins-0.009
 privacy-0.008
aky-0.008
 company-0.007
Token is
Token ID318
Feature activation+1.20e-02

Pos logit contributions
 coaster+0.034
 Fer+0.032
 jugg+0.031
+0.031
 salon+0.030
Neg logit contributions
lessness-0.032
 of-0.028
 twins-0.028
 privacy-0.024
 company-0.024
Token your
Token ID534
Feature activation-1.65e-03

Pos logit contributions
 coaster+0.028
 Fer+0.026
+0.026
 jugg+0.026
 salon+0.025
Neg logit contributions
lessness-0.016
 of-0.014
 twins-0.014
 privacy-0.011
 company-0.011
Token chance
Token ID2863
Feature activation-2.77e-02

Pos logit contributions
lessness+0.003
 of+0.002
 twins+0.002
aky+0.002
ruck+0.002
Neg logit contributions
 coaster-0.003
 Fer-0.003
 jugg-0.003
 salon-0.003
-0.003
Token to
Token ID284
Feature activation+3.77e-02

Pos logit contributions
lessness+0.042
 twins+0.033
aky+0.031
iring+0.029
ruck+0.029
Neg logit contributions
 coaster-0.054
 jugg-0.052
 som-0.051
 pro-0.051
 Fer-0.051
Token show
Token ID905
Feature activation-3.97e-02

Pos logit contributions
 coaster+0.074
 Fer+0.071
+0.069
 salon+0.068
icol+0.067
Neg logit contributions
lessness-0.060
 of-0.059
 twins-0.056
 company-0.050
 about-0.049
Token everyone
Token ID2506
Feature activation-1.17e-01

Pos logit contributions
lessness+0.061
 twins+0.051
 of+0.051
 privacy+0.043
ruck+0.042
Neg logit contributions
 coaster-0.083
 Fer-0.079
 jugg-0.077
 salon-0.076
 som-0.076
Token how
Token ID703
Feature activation-2.40e-02

Pos logit contributions
lessness+0.184
 twins+0.139
aky+0.130
ruck+0.127
free+0.123
Neg logit contributions
 coaster-0.225
 Fer-0.221
 jugg-0.217
 salon-0.211
 pro-0.208
Token brave
Token ID14802
Feature activation+1.12e-02

Pos logit contributions
lessness+0.033
 twins+0.028
aky+0.026
 of+0.025
ruck+0.025
Neg logit contributions
 Fer-0.049
 coaster-0.049
 som-0.047
 pro-0.046
 jugg-0.046
Token you
Token ID345
Feature activation-3.77e-02

Pos logit contributions
 coaster+0.020
 Fer+0.020
 jugg+0.019
 salon+0.018
 pro+0.018
Neg logit contributions
lessness-0.020
 twins-0.016
 privacy-0.015
aky-0.015
 company-0.015
Token can
Token ID460
Feature activation+3.86e-02

Pos logit contributions
lessness+0.051
 of+0.044
 twins+0.044
 privacy+0.036
 bricks+0.035
Neg logit contributions
 coaster-0.085
 Fer-0.081
 jugg-0.080
-0.079
 som-0.078
Token be
Token ID307
Feature activation-1.50e-02

Pos logit contributions
 coaster+0.073
 Fer+0.069
+0.068
 jugg+0.068
 salon+0.066
Neg logit contributions
lessness-0.067
 of-0.059
 twins-0.059
 privacy-0.051
 about-0.050
Token's
Token ID338
Feature activation-1.56e-02

Pos logit contributions
lessness+0.023
 of+0.020
 twins+0.019
ruck+0.018
 privacy+0.018
Neg logit contributions
 coaster-0.025
-0.023
 jugg-0.023
 Fer-0.023
olog-0.023
Token very
Token ID845
Feature activation-2.03e-03

Pos logit contributions
lessness+0.018
 of+0.014
 twins+0.013
 bricks+0.011
ruck+0.011
Neg logit contributions
 coaster-0.037
 som-0.036
 pro-0.035
 jugg-0.035
olog-0.035
Token easy
Token ID2562
Feature activation+5.07e-03

Pos logit contributions
lessness+0.002
 of+0.001
 twins+0.001
ruck+0.001
 bricks+0.001
Neg logit contributions
 coaster-0.005
 Fer-0.005
olog-0.005
 jugg-0.005
 som-0.005
Token to
Token ID284
Feature activation+2.37e-02

Pos logit contributions
 coaster+0.012
 som+0.011
 pro+0.011
 jugg+0.011
 Fer+0.011
Neg logit contributions
lessness-0.006
 twins-0.005
 of-0.005
aky-0.004
 privacy-0.004
Token hurt
Token ID5938
Feature activation-3.40e-02

Pos logit contributions
 coaster+0.048
 Fer+0.047
+0.045
icol+0.044
olog+0.044
Neg logit contributions
 of-0.036
lessness-0.035
 twins-0.034
 who-0.031
 about-0.030
Token someone
Token ID2130
Feature activation-1.17e-01

Pos logit contributions
lessness+0.049
 twins+0.039
 of+0.039
ruck+0.035
aky+0.033
Neg logit contributions
 coaster-0.068
 som-0.065
 jugg-0.065
 blur-0.064
 pro-0.064
Token's
Token ID338
Feature activation-3.46e-02

Pos logit contributions
lessness+0.182
 twins+0.155
ruck+0.145
 of+0.140
 bricks+0.140
Neg logit contributions
 coaster-0.222
 pro-0.216
 salon-0.216
 jugg-0.214
 som-0.211
Token feelings
Token ID7666
Feature activation-1.03e-01

Pos logit contributions
lessness+0.044
 of+0.038
 twins+0.033
ruck+0.032
 scientists+0.032
Neg logit contributions
 coaster-0.074
 som-0.072
 salon-0.071
 drib-0.070
 sav-0.070
Token with
Token ID351
Feature activation-3.04e-02

Pos logit contributions
lessness+0.113
 twins+0.099
 of+0.086
 privacy+0.085
 bricks+0.080
Neg logit contributions
 coaster-0.241
-0.231
 pro-0.231
 jugg-0.231
 blur-0.228
Token a
Token ID257
Feature activation-2.59e-02

Pos logit contributions
lessness+0.043
 twins+0.033
 of+0.032
aky+0.030
anny+0.030
Neg logit contributions
 som-0.060
 jugg-0.059
 drib-0.058
 coaster-0.058
 Fer-0.057
Token joke
Token ID9707
Feature activation-1.01e-03

Pos logit contributions
lessness+0.038
 of+0.034
 twins+0.033
 about+0.028
 privacy+0.028
Neg logit contributions
 coaster-0.055
 Fer-0.053
-0.052
 jugg-0.051
 salon-0.051
Token and
Token ID290
Feature activation-2.34e-02

Pos logit contributions
lessness+0.027
 twins+0.020
 privacy+0.019
 attention+0.018
ruck+0.018
Neg logit contributions
 coaster-0.052
 jugg-0.049
-0.049
 pro-0.048
 Fer-0.048
Token he
Token ID339
Feature activation-8.61e-03

Pos logit contributions
lessness+0.030
 of+0.025
 twins+0.025
 privacy+0.021
ruck+0.021
Neg logit contributions
 coaster-0.054
 Fer-0.052
 jugg-0.052
 som-0.051
-0.050
Token wanted
Token ID2227
Feature activation-6.29e-02

Pos logit contributions
lessness+0.010
 of+0.008
 twins+0.008
 privacy+0.006
ruck+0.006
Neg logit contributions
 coaster-0.021
 Fer-0.020
 jugg-0.020
-0.020
 som-0.020
Token to
Token ID284
Feature activation+2.73e-02

Pos logit contributions
lessness+0.105
iness+0.079
 twins+0.078
aky+0.074
ovable+0.070
Neg logit contributions
 coaster-0.133
 Fer-0.127
 som-0.127
 jugg-0.124
 roller-0.121
Token protect
Token ID1805
Feature activation-5.07e-02

Pos logit contributions
 coaster+0.058
 Fer+0.056
+0.055
 pro+0.054
icol+0.054
Neg logit contributions
 of-0.039
lessness-0.037
 twins-0.036
 company-0.032
 privacy-0.031
Token everyone
Token ID2506
Feature activation-1.17e-01

Pos logit contributions
lessness+0.078
 twins+0.063
 of+0.061
aky+0.054
 importantly+0.053
Neg logit contributions
 coaster-0.106
 Fer-0.101
 jugg-0.099
 som-0.098
 sixty-0.095
Token in
Token ID287
Feature activation+1.80e-02

Pos logit contributions
lessness+0.154
 twins+0.125
 privacy+0.116
ruck+0.110
 scientists+0.108
Neg logit contributions
 coaster-0.249
 jugg-0.247
 Fer-0.242
 salon-0.232
 pro-0.227
Token his
Token ID465
Feature activation-8.13e-02

Pos logit contributions
 Fer+0.032
 coaster+0.032
 jugg+0.032
+0.032
 som+0.031
Neg logit contributions
lessness-0.030
 twins-0.024
ruck-0.023
 of-0.023
iring-0.023
Token forest
Token ID8222
Feature activation-2.26e-02

Pos logit contributions
lessness+0.136
 of+0.116
ruck+0.106
 twins+0.104
iness+0.099
Neg logit contributions
 coaster-0.159
 Fer-0.154
-0.151
 sav-0.146
 salon-0.146
Token.
Token ID13
Feature activation-3.66e-02

Pos logit contributions
lessness+0.024
 twins+0.018
 of+0.016
ruck+0.016
 importantly+0.015
Neg logit contributions
 coaster-0.056
 salon-0.054
 Fer-0.053
 som-0.053
 jugg-0.053
Token So
Token ID1406
Feature activation+1.10e-02

Pos logit contributions
 of+0.038
lessness+0.038
 twins+0.034
 about+0.031
 who+0.028
Neg logit contributions
 coaster-0.091
 Fer-0.088
 salon-0.087
 jugg-0.087
-0.086
Token.
Token ID13
Feature activation-3.39e-02

Pos logit contributions
lessness+0.011
aky+0.008
 twins+0.008
 company+0.007
 privacy+0.007
Neg logit contributions
 coaster-0.023
 som-0.021
 Fer-0.021
 pro-0.021
 jugg-0.020
Token He
Token ID679
Feature activation-4.44e-03

Pos logit contributions
lessness+0.051
 of+0.044
 twins+0.043
 privacy+0.037
ruck+0.036
Neg logit contributions
 coaster-0.073
 Fer-0.070
-0.069
 jugg-0.067
 salon-0.066
Token wanted
Token ID2227
Feature activation-5.48e-02

Pos logit contributions
lessness+0.005
 of+0.005
 twins+0.004
 who+0.004
 about+0.004
Neg logit contributions
 coaster-0.011
-0.011
 Fer-0.011
icol-0.010
olog-0.010
Token to
Token ID284
Feature activation+3.29e-02

Pos logit contributions
lessness+0.069
 twins+0.053
 of+0.050
 privacy+0.044
aky+0.043
Neg logit contributions
 coaster-0.133
 Fer-0.128
 jugg-0.126
 som-0.124
 salon-0.123
Token show
Token ID905
Feature activation-4.15e-02

Pos logit contributions
 coaster+0.066
 Fer+0.064
icol+0.062
+0.061
olog+0.060
Neg logit contributions
 of-0.052
 twins-0.048
lessness-0.048
 company-0.043
 friends-0.042
Token everyone
Token ID2506
Feature activation-1.16e-01

Pos logit contributions
lessness+0.056
 of+0.047
 twins+0.047
 privacy+0.038
 company+0.037
Neg logit contributions
 coaster-0.095
 Fer-0.092
 jugg-0.089
-0.088
 som-0.088
Token how
Token ID703
Feature activation-2.68e-02

Pos logit contributions
lessness+0.200
 twins+0.169
 privacy+0.160
 company+0.158
aky+0.156
Neg logit contributions
 coaster-0.213
 jugg-0.208
 Fer-0.207
 som-0.199
 pro-0.196
Token silly
Token ID14397
Feature activation-2.04e-03

Pos logit contributions
lessness+0.036
 twins+0.032
 of+0.028
ruck+0.028
aky+0.027
Neg logit contributions
 coaster-0.056
 Fer-0.055
 som-0.054
 jugg-0.053
 pro-0.052
Token he
Token ID339
Feature activation-3.39e-02

Pos logit contributions
lessness+0.003
 twins+0.003
 of+0.003
 privacy+0.002
aky+0.002
Neg logit contributions
 coaster-0.004
 Fer-0.004
 jugg-0.004
 som-0.004
 pro-0.004
Token was
Token ID373
Feature activation-2.95e-02

Pos logit contributions
lessness+0.043
 of+0.040
 twins+0.037
 who+0.032
 about+0.031
Neg logit contributions
 coaster-0.078
 Fer-0.075
-0.075
 salon-0.074
icol-0.074
Token.
Token ID13
Feature activation-3.94e-02

Pos logit contributions
lessness+0.039
 twins+0.029
 privacy+0.026
 of+0.025
 importantly+0.024
Neg logit contributions
 coaster-0.067
 jugg-0.066
 pro-0.064
 som-0.064
 drib-0.063
Token sh
Token ID427
Feature activation+5.00e-02

Pos logit contributions
lessness+0.042
 of+0.038
 twins+0.036
 privacy+0.030
 about+0.030
Neg logit contributions
 coaster-0.065
 Fer-0.062
-0.061
 jugg-0.060
 salon-0.059
Tokeniver
Token ID1428
Feature activation-4.83e-02

Pos logit contributions
 coaster+0.101
 Fer+0.096
+0.095
 jugg+0.094
 salon+0.093
Neg logit contributions
lessness-0.079
 of-0.072
 twins-0.071
 privacy-0.060
 about-0.060
Token go
Token ID467
Feature activation+5.25e-02

Pos logit contributions
lessness+0.060
 twins+0.055
 privacy+0.054
 sons+0.050
 importantly+0.048
Neg logit contributions
 coaster-0.102
 blur-0.102
 salon-0.098
 jugg-0.098
 som-0.097
Token down
Token ID866
Feature activation+1.94e-02

Pos logit contributions
 coaster+0.125
 Fer+0.119
+0.118
 jugg+0.115
 salon+0.115
Neg logit contributions
lessness-0.066
 twins-0.055
 of-0.051
 privacy-0.047
 importantly-0.043
Token her
Token ID607
Feature activation-6.89e-02

Pos logit contributions
+0.046
 coaster+0.045
 Fer+0.044
 blur+0.043
 pro+0.043
Neg logit contributions
lessness-0.023
 twins-0.019
 privacy-0.018
aky-0.016
 sons-0.016
Token spine
Token ID19656
Feature activation-1.16e-01

Pos logit contributions
lessness+0.123
 of+0.107
 twins+0.104
 scientists+0.095
 importantly+0.094
Neg logit contributions
 coaster-0.127
 salon-0.122
-0.117
 Fer-0.115
icol-0.113
Token.
Token ID13
Feature activation-3.15e-02

Pos logit contributions
lessness+0.138
 twins+0.111
 privacy+0.111
 scientists+0.109
 files+0.104
Neg logit contributions
 coaster-0.247
-0.243
 som-0.242
 blur-0.240
 sav-0.234
Token It
Token ID632
Feature activation-3.40e-03

Pos logit contributions
lessness+0.038
 of+0.036
 twins+0.034
 about+0.030
 who+0.028
Neg logit contributions
 coaster-0.076
 Fer-0.074
 jugg-0.072
 som-0.070
 salon-0.069
Token was
Token ID373
Feature activation-7.90e-03

Pos logit contributions
lessness+0.005
 of+0.004
 twins+0.004
 about+0.004
 who+0.004
Neg logit contributions
 coaster-0.008
 Fer-0.007
-0.007
 salon-0.007
 jugg-0.007
Token very
Token ID845
Feature activation-1.45e-02

Pos logit contributions
lessness+0.012
 twins+0.009
 privacy+0.009
 of+0.009
iness+0.009
Neg logit contributions
 blur-0.015
 coaster-0.015
 som-0.015
 drib-0.015
 jugg-0.015
Token cold
Token ID4692
Feature activation-3.54e-02

Pos logit contributions
lessness+0.020
 twins+0.018
 of+0.017
ruck+0.016
 privacy+0.015
Neg logit contributions
 coaster-0.028
-0.028
 blur-0.028
 som-0.028
 Fer-0.028
Token happy
Token ID3772
Feature activation-3.30e-02

Pos logit contributions
lessness+0.024
 of+0.021
 twins+0.020
 privacy+0.016
ruck+0.016
Neg logit contributions
 coaster-0.050
 Fer-0.049
-0.049
 jugg-0.048
 salon-0.048
Token and
Token ID290
Feature activation-3.44e-03

Pos logit contributions
lessness+0.047
ovable+0.037
aky+0.037
 twins+0.035
 privacy+0.034
Neg logit contributions
 coaster-0.071
 salon-0.067
 jugg-0.067
-0.067
 pro-0.065
Token wanted
Token ID2227
Feature activation-6.25e-02

Pos logit contributions
lessness+0.003
 of+0.003
 twins+0.002
 about+0.002
 who+0.002
Neg logit contributions
 coaster-0.010
 Fer-0.009
-0.009
 salon-0.009
 jugg-0.009
Token to
Token ID284
Feature activation+2.69e-02

Pos logit contributions
lessness+0.088
 twins+0.065
 of+0.061
iness+0.059
aky+0.058
Neg logit contributions
 coaster-0.143
 Fer-0.140
 jugg-0.137
 salon-0.137
 som-0.136
Token show
Token ID905
Feature activation-4.43e-02

Pos logit contributions
 coaster+0.056
 Fer+0.053
 pro+0.051
+0.051
icol+0.051
Neg logit contributions
 of-0.042
 twins-0.038
lessness-0.038
 toys-0.033
 company-0.033
Token everyone
Token ID2506
Feature activation-1.16e-01

Pos logit contributions
lessness+0.054
 of+0.047
 twins+0.046
 privacy+0.037
 about+0.036
Neg logit contributions
 coaster-0.107
 Fer-0.102
-0.101
 jugg-0.100
 salon-0.099
Token how
Token ID703
Feature activation-2.92e-02

Pos logit contributions
lessness+0.185
 twins+0.150
aky+0.143
 privacy+0.140
 sons+0.136
Neg logit contributions
 salon-0.224
 coaster-0.223
 Fer-0.222
 jugg-0.221
 som-0.216
Token much
Token ID881
Feature activation-9.45e-03

Pos logit contributions
lessness+0.037
 twins+0.031
ruck+0.029
aky+0.028
 of+0.028
Neg logit contributions
 Fer-0.062
 som-0.061
 coaster-0.060
 salon-0.060
 jugg-0.059
Token she
Token ID673
Feature activation-3.99e-02

Pos logit contributions
lessness+0.013
 twins+0.011
 of+0.011
aky+0.010
 privacy+0.009
Neg logit contributions
 coaster-0.021
 Fer-0.020
 jugg-0.020
-0.019
 salon-0.019
Token loved
Token ID6151
Feature activation-4.30e-02

Pos logit contributions
lessness+0.042
 of+0.040
 twins+0.037
 who+0.031
 about+0.029
Neg logit contributions
 coaster-0.101
-0.097
 Fer-0.096
 salon-0.096
icol-0.095
Token them
Token ID606
Feature activation-7.38e-02

Pos logit contributions
lessness+0.057
 of+0.045
 twins+0.044
aky+0.038
ruck+0.036
Neg logit contributions
 coaster-0.100
 Fer-0.097
 jugg-0.094
 salon-0.094
 som-0.093
Token chat
Token ID8537
Feature activation+5.08e-03

Pos logit contributions
+0.005
 salon+0.005
 coaster+0.004
OM+0.004
icol+0.004
Neg logit contributions
lessness-0.003
 of-0.003
 everywhere-0.003
 about-0.002
 who-0.002
Tokentered
Token ID4400
Feature activation-1.41e-02

Pos logit contributions
 coaster+0.009
 Fer+0.008
 jugg+0.008
+0.008
 som+0.008
Neg logit contributions
lessness-0.009
 twins-0.008
 of-0.008
 privacy-0.007
 bricks-0.007
Token as
Token ID355
Feature activation-8.44e-03

Pos logit contributions
lessness+0.017
 of+0.016
 twins+0.015
 about+0.013
 who+0.012
Neg logit contributions
 coaster-0.033
-0.032
 Fer-0.032
 salon-0.031
 jugg-0.031
Token he
Token ID339
Feature activation-1.93e-03

Pos logit contributions
lessness+0.013
 twins+0.011
 of+0.011
 privacy+0.010
 company+0.009
Neg logit contributions
 coaster-0.018
 Fer-0.017
 jugg-0.017
 pro-0.016
 som-0.016
Token showed
Token ID3751
Feature activation-6.47e-02

Pos logit contributions
 of+0.003
lessness+0.002
 who+0.002
 twins+0.002
 friends+0.002
Neg logit contributions
 salon-0.004
icol-0.004
 coaster-0.004
 Fer-0.004
-0.004
Token everyone
Token ID2506
Feature activation-1.16e-01

Pos logit contributions
lessness+0.106
 twins+0.085
aky+0.077
 of+0.077
 importantly+0.073
Neg logit contributions
 coaster-0.131
 Fer-0.126
 jugg-0.122
 som-0.122
 pro-0.119
Token things
Token ID1243
Feature activation-7.66e-02

Pos logit contributions
lessness+0.189
aky+0.147
 privacy+0.145
 twins+0.144
 company+0.142
Neg logit contributions
 coaster-0.227
 Fer-0.221
 jugg-0.218
 som-0.207
 pro-0.207
Token in
Token ID287
Feature activation+2.73e-02

Pos logit contributions
lessness+0.093
 twins+0.083
aky+0.074
 privacy+0.073
anny+0.071
Neg logit contributions
 coaster-0.176
 Fer-0.172
 pro-0.167
 jugg-0.167
 som-0.161
Token the
Token ID262
Feature activation-1.92e-02

Pos logit contributions
 jugg+0.049
 coaster+0.049
 Fer+0.049
+0.048
 blur+0.047
Neg logit contributions
lessness-0.047
 twins-0.038
 of-0.037
 privacy-0.035
ruck-0.035
Token jungle
Token ID20712
Feature activation+1.24e-02

Pos logit contributions
lessness+0.041
 of+0.036
 twins+0.035
iness+0.032
ruck+0.032
Neg logit contributions
 coaster-0.030
 Fer-0.029
 sav-0.026
-0.026
 blur-0.026
Token.
Token ID13
Feature activation-1.99e-02

Pos logit contributions
 coaster+0.034
 Fer+0.033
 jugg+0.032
+0.032
 salon+0.032
Neg logit contributions
lessness-0.010
 of-0.007
 twins-0.007
 privacy-0.006
ruck-0.005
Token he
Token ID339
Feature activation-8.67e-03

Pos logit contributions
lessness+0.015
 of+0.013
 twins+0.013
 company+0.011
aky+0.011
Neg logit contributions
 coaster-0.019
 Fer-0.018
 jugg-0.018
 som-0.018
 pro-0.017
Token was
Token ID373
Feature activation-1.21e-02

Pos logit contributions
lessness+0.010
 of+0.010
 twins+0.009
 who+0.008
 about+0.008
Neg logit contributions
-0.021
 coaster-0.020
 Fer-0.020
 salon-0.019
icol-0.019
Token about
Token ID546
Feature activation-2.31e-02

Pos logit contributions
lessness+0.017
 of+0.014
 twins+0.014
 privacy+0.012
ruck+0.012
Neg logit contributions
 coaster-0.026
 Fer-0.025
 jugg-0.025
 pro-0.025
 som-0.025
Token to
Token ID284
Feature activation+4.61e-02

Pos logit contributions
lessness+0.036
aky+0.027
iness+0.027
 twins+0.027
 of+0.026
Neg logit contributions
 som-0.048
 coaster-0.048
 jugg-0.048
 pro-0.046
 Fer-0.046
Token show
Token ID905
Feature activation-4.50e-02

Pos logit contributions
 coaster+0.094
 Fer+0.088
+0.088
 salon+0.085
olog+0.085
Neg logit contributions
lessness-0.073
 of-0.070
 twins-0.065
 about-0.059
 privacy-0.059
Token everyone
Token ID2506
Feature activation-1.16e-01

Pos logit contributions
lessness+0.075
 twins+0.064
 of+0.062
aky+0.056
 company+0.054
Neg logit contributions
 coaster-0.089
 Fer-0.085
 som-0.082
 jugg-0.082
 pro-0.079
Token.
Token ID13
Feature activation-2.30e-02

Pos logit contributions
lessness+0.198
aky+0.163
 twins+0.157
 privacy+0.153
 company+0.152
Neg logit contributions
 jugg-0.209
 coaster-0.207
 Fer-0.203
 som-0.197
 pro-0.191
Token He
Token ID679
Feature activation-4.90e-04

Pos logit contributions
lessness+0.024
 of+0.023
 twins+0.021
 about+0.018
 who+0.015
Neg logit contributions
 coaster-0.058
 Fer-0.056
 jugg-0.056
 salon-0.056
-0.056
Token jumped
Token ID11687
Feature activation+3.68e-02

Pos logit contributions
 of+0.001
 twins+0.001
lessness+0.001
 who+0.001
 about+0.000
Neg logit contributions
-0.001
 coaster-0.001
 Fer-0.001
Whe-0.001
icol-0.001
Token with
Token ID351
Feature activation-3.01e-02

Pos logit contributions
 coaster+0.096
+0.093
 salon+0.093
 Fer+0.093
 jugg+0.090
Neg logit contributions
lessness-0.036
 of-0.036
 twins-0.030
 everywhere-0.027
 about-0.027
Token excitement
Token ID14067
Feature activation-4.98e-02

Pos logit contributions
lessness+0.054
 twins+0.045
 of+0.044
aky+0.043
 privacy+0.041
Neg logit contributions
 Fer-0.055
 coaster-0.054
 jugg-0.052
 blur-0.050
 som-0.050
Token new
Token ID649
Feature activation+2.58e-02

Pos logit contributions
lessness+0.014
 of+0.013
 twins+0.012
 everywhere+0.010
 who+0.010
Neg logit contributions
 coaster-0.022
 Fer-0.021
-0.020
 jugg-0.020
olog-0.020
Token toy
Token ID13373
Feature activation+4.83e-03

Pos logit contributions
 coaster+0.050
 Fer+0.047
+0.046
 jugg+0.045
olog+0.044
Neg logit contributions
lessness-0.044
 of-0.042
 twins-0.040
 about-0.034
 bricks-0.034
Token she
Token ID673
Feature activation-1.42e-02

Pos logit contributions
 coaster+0.011
 Fer+0.011
+0.010
 salon+0.010
 jugg+0.010
Neg logit contributions
lessness-0.007
 twins-0.005
 of-0.005
 privacy-0.005
 company-0.005
Token had
Token ID550
Feature activation-2.56e-02

Pos logit contributions
lessness+0.020
 of+0.017
 twins+0.016
 privacy+0.014
 company+0.014
Neg logit contributions
 coaster-0.031
 Fer-0.030
-0.030
 jugg-0.030
 pro-0.029
Token been
Token ID587
Feature activation-4.33e-02

Pos logit contributions
lessness+0.042
 twins+0.036
 of+0.036
 privacy+0.032
 company+0.032
Neg logit contributions
 coaster-0.051
 Fer-0.048
-0.048
 jugg-0.047
 salon-0.047
Token longing
Token ID40314
Feature activation-1.16e-01

Pos logit contributions
lessness+0.066
 of+0.058
 twins+0.057
 privacy+0.049
 about+0.048
Neg logit contributions
 coaster-0.091
 Fer-0.087
-0.086
 jugg-0.085
 salon-0.084
Token for
Token ID329
Feature activation-3.82e-02

Pos logit contributions
lessness+0.166
 twins+0.144
aky+0.135
 privacy+0.131
py+0.127
Neg logit contributions
 som-0.207
 pro-0.204
 salon-0.199
 sav-0.195
 drib-0.195
Token -
Token ID532
Feature activation-3.67e-02

Pos logit contributions
lessness+0.048
 twins+0.038
 of+0.035
 privacy+0.032
aky+0.032
Neg logit contributions
 coaster-0.086
-0.085
 som-0.083
 Fer-0.082
 jugg-0.082
Token a
Token ID257
Feature activation+1.71e-03

Pos logit contributions
lessness+0.047
 of+0.044
 twins+0.041
 about+0.034
 who+0.034
Neg logit contributions
 coaster-0.085
 Fer-0.081
-0.080
 jugg-0.079
olog-0.078
Token mixing
Token ID17090
Feature activation+1.90e-02

Pos logit contributions
 coaster+0.004
 Fer+0.004
+0.004
olog+0.004
 jugg+0.004
Neg logit contributions
lessness-0.002
 of-0.002
 twins-0.002
 who-0.001
 toys-0.001
Token machine
Token ID4572
Feature activation+9.79e-03

Pos logit contributions
 coaster+0.040
olog+0.037
 Fer+0.037
icol+0.036
+0.036
Neg logit contributions
 of-0.030
lessness-0.028
 twins-0.026
 bricks-0.023
 toys-0.023
Token happiness
Token ID12157
Feature activation-5.94e-02

Pos logit contributions
lessness+0.068
 twins+0.057
 of+0.055
aky+0.053
 privacy+0.052
Neg logit contributions
 Fer-0.070
 coaster-0.070
 jugg-0.069
 som-0.068
 sav-0.066
Token and
Token ID290
Feature activation-1.42e-02

Pos logit contributions
lessness+0.080
 twins+0.067
 privacy+0.064
 of+0.063
aky+0.060
Neg logit contributions
 coaster-0.134
 Fer-0.124
 jugg-0.124
 pro-0.123
-0.121
Token wanted
Token ID2227
Feature activation-5.23e-02

Pos logit contributions
lessness+0.022
 of+0.020
 twins+0.019
 about+0.016
 privacy+0.016
Neg logit contributions
 coaster-0.030
 Fer-0.028
-0.028
 salon-0.028
 jugg-0.027
Token to
Token ID284
Feature activation+3.16e-02

Pos logit contributions
lessness+0.075
 twins+0.060
 of+0.054
aky+0.051
 privacy+0.050
Neg logit contributions
 coaster-0.118
 Fer-0.114
 jugg-0.112
 som-0.112
 pro-0.110
Token show
Token ID905
Feature activation-4.02e-02

Pos logit contributions
 coaster+0.063
 Fer+0.060
+0.058
olog+0.057
icol+0.057
Neg logit contributions
 of-0.051
lessness-0.049
 twins-0.046
 who-0.041
 company-0.041
Token everyone
Token ID2506
Feature activation-1.16e-01

Pos logit contributions
lessness+0.069
 twins+0.059
 of+0.057
aky+0.051
 privacy+0.050
Neg logit contributions
 coaster-0.077
 Fer-0.074
 jugg-0.071
 som-0.071
 pro-0.069
Token how
Token ID703
Feature activation-2.46e-02

Pos logit contributions
lessness+0.177
 twins+0.149
 privacy+0.135
aky+0.133
 bricks+0.132
Neg logit contributions
 coaster-0.236
 jugg-0.231
 Fer-0.229
 som-0.225
 pro-0.222
Token delicious
Token ID12625
Feature activation+2.32e-02

Pos logit contributions
lessness+0.033
 twins+0.029
 of+0.026
ruck+0.025
aky+0.024
Neg logit contributions
 Fer-0.052
 coaster-0.052
 som-0.052
 jugg-0.050
 pro-0.050
Token his
Token ID465
Feature activation-4.74e-02

Pos logit contributions
 coaster+0.046
 Fer+0.043
+0.042
 jugg+0.042
 som+0.041
Neg logit contributions
lessness-0.038
 of-0.034
 twins-0.034
 privacy-0.029
 about-0.029
Token steak
Token ID26320
Feature activation+7.39e-03

Pos logit contributions
lessness+0.076
 of+0.063
 twins+0.059
ruck+0.054
 bricks+0.053
Neg logit contributions
 coaster-0.098
 Fer-0.093
-0.090
 jugg-0.090
 sund-0.088
Token was
Token ID373
Feature activation-3.53e-02

Pos logit contributions
 coaster+0.015
 Fer+0.015
+0.014
 jugg+0.014
 salon+0.014
Neg logit contributions
lessness-0.011
 of-0.010
 twins-0.010
 privacy-0.009
 about-0.008
Token went
Token ID1816
Feature activation+3.73e-02

Pos logit contributions
lessness+0.011
 of+0.010
 twins+0.009
 about+0.008
 privacy+0.008
Neg logit contributions
 coaster-0.014
 Fer-0.013
-0.013
 salon-0.013
 jugg-0.013
Token back
Token ID736
Feature activation+8.87e-03

Pos logit contributions
 coaster+0.091
 Fer+0.087
+0.086
 jugg+0.086
 salon+0.085
Neg logit contributions
lessness-0.044
 of-0.040
 twins-0.037
 about-0.030
 privacy-0.029
Token home
Token ID1363
Feature activation-3.22e-02

Pos logit contributions
 coaster+0.020
 Fer+0.020
+0.019
 salon+0.019
 jugg+0.019
Neg logit contributions
lessness-0.012
 twins-0.010
 of-0.010
 privacy-0.008
iness-0.008
Token to
Token ID284
Feature activation+9.28e-03

Pos logit contributions
lessness+0.032
 twins+0.025
 of+0.025
 privacy+0.021
 bricks+0.019
Neg logit contributions
 coaster-0.083
 Fer-0.079
-0.079
 salon-0.078
 jugg-0.078
Token show
Token ID905
Feature activation-4.73e-02

Pos logit contributions
 coaster+0.023
 Fer+0.022
+0.022
 salon+0.021
 jugg+0.021
Neg logit contributions
lessness-0.011
 of-0.010
 twins-0.009
 about-0.007
 privacy-0.007
Token everyone
Token ID2506
Feature activation-1.15e-01

Pos logit contributions
lessness+0.062
 twins+0.049
 of+0.049
aky+0.042
 privacy+0.041
Neg logit contributions
 coaster-0.109
 Fer-0.106
 som-0.103
 jugg-0.103
 salon-0.099
Token what
Token ID644
Feature activation-6.57e-02

Pos logit contributions
lessness+0.176
aky+0.143
 twins+0.142
 bricks+0.139
 privacy+0.139
Neg logit contributions
 coaster-0.227
 Fer-0.224
 jugg-0.221
 som-0.217
 pro-0.211
Token he
Token ID339
Feature activation-4.56e-02

Pos logit contributions
lessness+0.105
 twins+0.083
 of+0.082
aky+0.079
 privacy+0.078
Neg logit contributions
 Fer-0.134
 coaster-0.134
 jugg-0.133
 som-0.128
 pro-0.124
Token had
Token ID550
Feature activation-2.63e-02

Pos logit contributions
lessness+0.064
 of+0.056
 twins+0.056
 privacy+0.046
 bricks+0.046
Neg logit contributions
 coaster-0.100
 Fer-0.096
 jugg-0.094
-0.094
 pro-0.092
Token found
Token ID1043
Feature activation-7.69e-02

Pos logit contributions
lessness+0.042
 twins+0.036
 of+0.034
 bricks+0.033
 privacy+0.032
Neg logit contributions
 coaster-0.050
 jugg-0.049
 Fer-0.048
-0.047
 pro-0.047
Token.
Token ID13
Feature activation-1.41e-02

Pos logit contributions
lessness+0.095
 twins+0.080
 privacy+0.072
aky+0.069
 scientists+0.067
Neg logit contributions
 coaster-0.171
 Fer-0.165
 sund-0.161
 pro-0.161
 som-0.156
Token that
Token ID326
Feature activation-1.33e-02

Pos logit contributions
lessness+0.083
 twins+0.068
 privacy+0.061
 of+0.059
ruck+0.058
Neg logit contributions
 coaster-0.145
-0.138
 pro-0.137
 som-0.136
 Fer-0.136
Token she
Token ID673
Feature activation-3.07e-03

Pos logit contributions
lessness+0.017
 twins+0.015
 of+0.015
 privacy+0.012
ruck+0.012
Neg logit contributions
 coaster-0.030
 Fer-0.029
 jugg-0.029
 salon-0.028
-0.028
Token wanted
Token ID2227
Feature activation-5.18e-02

Pos logit contributions
lessness+0.003
 of+0.003
 twins+0.003
 who+0.003
 should+0.002
Neg logit contributions
 coaster-0.008
-0.008
 Fer-0.008
 salon-0.007
icol-0.007
Token to
Token ID284
Feature activation+3.76e-02

Pos logit contributions
lessness+0.066
 twins+0.054
 of+0.049
aky+0.045
 privacy+0.044
Neg logit contributions
 coaster-0.120
 jugg-0.115
 som-0.115
 Fer-0.114
 pro-0.114
Token show
Token ID905
Feature activation-4.37e-02

Pos logit contributions
 coaster+0.083
 Fer+0.079
icol+0.073
+0.072
olog+0.071
Neg logit contributions
 of-0.058
lessness-0.054
 twins-0.051
 about-0.047
 company-0.046
Token everyone
Token ID2506
Feature activation-1.15e-01

Pos logit contributions
lessness+0.062
 of+0.052
 twins+0.052
 privacy+0.043
ruck+0.043
Neg logit contributions
 coaster-0.096
 Fer-0.092
 jugg-0.090
-0.090
 salon-0.089
Token.
Token ID13
Feature activation-2.23e-02

Pos logit contributions
lessness+0.174
 twins+0.147
ruck+0.134
aky+0.133
 privacy+0.130
Neg logit contributions
 coaster-0.225
 jugg-0.224
 pro-0.222
 Fer-0.221
 salon-0.221
Token She
Token ID1375
Feature activation+8.77e-03

Pos logit contributions
lessness+0.028
 of+0.025
 twins+0.024
 about+0.020
 who+0.019
Neg logit contributions
 coaster-0.053
 Fer-0.051
 jugg-0.050
-0.049
 salon-0.049
Token smiled
Token ID13541
Feature activation-1.67e-02

Pos logit contributions
 coaster+0.023
+0.022
 Fer+0.022
Whe+0.021
icol+0.021
Neg logit contributions
 of-0.010
 who-0.009
 twins-0.009
 about-0.008
lessness-0.008
Token and
Token ID290
Feature activation-1.98e-03

Pos logit contributions
lessness+0.022
 twins+0.019
 of+0.018
 privacy+0.017
ruck+0.016
Neg logit contributions
 coaster-0.037
 pro-0.036
 jugg-0.035
 Fer-0.035
 blur-0.034
Token showed
Token ID3751
Feature activation-4.60e-02

Pos logit contributions
 of+0.002
lessness+0.002
 twins+0.002
 who+0.002
 friends+0.002
Neg logit contributions
 coaster-0.005
-0.005
 salon-0.005
 Fer-0.005
 rink-0.004
Token full
Token ID1336
Feature activation-3.99e-02

Pos logit contributions
 coaster+0.002
 Fer+0.002
+0.002
 jugg+0.001
 salon+0.001
Neg logit contributions
lessness-0.001
 of-0.001
 twins-0.001
 privacy-0.001
 about-0.001
Token of
Token ID286
Feature activation-3.25e-02

Pos logit contributions
lessness+0.021
 twins+0.016
 privacy+0.014
aky+0.010
ruck+0.008
Neg logit contributions
 coaster-0.113
 jugg-0.112
 pro-0.112
-0.111
 blur-0.111
Token lovely
Token ID14081
Feature activation-1.67e-02

Pos logit contributions
lessness+0.051
 twins+0.041
 of+0.041
 privacy+0.038
ruck+0.036
Neg logit contributions
 coaster-0.065
 Fer-0.064
 jugg-0.063
 som-0.062
 blur-0.060
Token ducks
Token ID39694
Feature activation-2.45e-02

Pos logit contributions
lessness+0.027
 twins+0.021
 of+0.021
 privacy+0.020
ovable+0.019
Neg logit contributions
 coaster-0.034
 Fer-0.033
 jugg-0.032
 som-0.032
-0.031
Token
Token ID6184
Feature activation+6.10e-02

Pos logit contributions
lessness+0.035
 privacy+0.029
 twins+0.028
anny+0.026
 files+0.026
Neg logit contributions
 coaster-0.050
-0.048
 jugg-0.048
 pro-0.048
 som-0.047
Token
Token ID95
Feature activation+8.95e-02

Pos logit contributions
 salon+0.074
 coaster+0.063
 sixty+0.058
 pro+0.057
 jugg+0.054
Neg logit contributions
 of-0.132
 company-0.126
lessness-0.126
free-0.120
 who-0.118
Token
Token ID26391
Feature activation+7.50e-02

Pos logit contributions
 salon+0.145
 Fer+0.129
 jugg+0.129
 pro+0.127
 sixty+0.120
Neg logit contributions
lessness-0.156
 company-0.151
 twins-0.151
 of-0.147
 about-0.147
Token
Token ID447
Feature activation+1.12e-01

Pos logit contributions
 coaster+0.130
 Fer+0.130
 jugg+0.124
 salon+0.123
 sixty+0.121
Neg logit contributions
lessness-0.135
 of-0.128
 twins-0.127
 who-0.112
 about-0.110
Token
Token ID251
Feature activation+2.80e-02

Pos logit contributions
 salon+0.123
 jugg+0.117
 som+0.115
 coaster+0.115
 instructor+0.112
Neg logit contributions
lessness-0.238
 twins-0.237
 who-0.222
 company-0.219
 of-0.218
Token some
Token ID617
Feature activation-9.97e-03

Pos logit contributions
 coaster+0.060
+0.058
 Fer+0.058
 jugg+0.057
 salon+0.056
Neg logit contributions
lessness-0.040
 twins-0.034
 of-0.034
 privacy-0.030
ruck-0.030
Token white
Token ID2330
Feature activation+2.44e-02

Pos logit contributions
lessness+0.007
 twins+0.005
 privacy+0.005
 bricks+0.004
ruck+0.004
Neg logit contributions
 jugg-0.027
 Fer-0.026
 coaster-0.026
 som-0.026
 pro-0.026
Token.
Token ID13
Feature activation-1.70e-02

Pos logit contributions
lessness+0.025
 twins+0.020
 of+0.018
 privacy+0.017
 importantly+0.016
Neg logit contributions
 coaster-0.056
 Fer-0.053
 salon-0.053
 jugg-0.052
-0.052
Token It
Token ID632
Feature activation+2.25e-02

Pos logit contributions
lessness+0.025
 of+0.023
 twins+0.022
 about+0.018
 who+0.018
Neg logit contributions
 coaster-0.037
 Fer-0.036
 jugg-0.034
-0.034
 salon-0.034
Token is
Token ID318
Feature activation+1.15e-02

Pos logit contributions
 coaster+0.048
+0.047
 Fer+0.045
icol+0.045
olog+0.043
Neg logit contributions
lessness-0.033
 of-0.033
 twins-0.031
 who-0.027
 about-0.026
Token dangerous
Token ID4923
Feature activation-3.26e-02

Pos logit contributions
 coaster+0.026
 Fer+0.025
 jugg+0.024
+0.024
 som+0.024
Neg logit contributions
lessness-0.016
 of-0.014
 twins-0.013
 privacy-0.011
ruck-0.011
Token.
Token ID13
Feature activation-1.89e-02

Pos logit contributions
lessness+0.044
ovable+0.032
 importantly+0.030
aky+0.030
 twins+0.030
Neg logit contributions
 coaster-0.074
 Fer-0.068
 roller-0.064
 pro-0.064
olog-0.063
Token Come
Token ID7911
Feature activation+8.80e-02

Pos logit contributions
lessness+0.029
 of+0.026
 twins+0.025
 privacy+0.022
 about+0.021
Neg logit contributions
 coaster-0.039
 Fer-0.038
-0.037
 jugg-0.036
 salon-0.036
Token with
Token ID351
Feature activation+1.39e-02

Pos logit contributions
 coaster+0.249
 Fer+0.245
 jugg+0.240
 som+0.237
 salon+0.235
Neg logit contributions
 of-0.067
lessness-0.061
 twins-0.060
 who-0.044
 everywhere-0.043
Token me
Token ID502
Feature activation+3.27e-03

Pos logit contributions
 coaster+0.023
+0.023
 Fer+0.022
 salon+0.021
 jugg+0.021
Neg logit contributions
 of-0.027
lessness-0.026
 twins-0.025
 everywhere-0.023
 about-0.023
Token.
Token ID13
Feature activation-1.51e-03

Pos logit contributions
 coaster+0.008
 Fer+0.008
+0.008
 salon+0.007
 jugg+0.007
Neg logit contributions
lessness-0.004
 of-0.003
 twins-0.003
 privacy-0.003
 company-0.002
Token I
Token ID314
Feature activation+6.30e-02

Pos logit contributions
lessness+0.002
 of+0.002
 twins+0.002
 who+0.002
 company+0.002
Neg logit contributions
 coaster-0.003
 Fer-0.003
 salon-0.003
 som-0.003
 jugg-0.003
Token have
Token ID423
Feature activation+1.80e-02

Pos logit contributions
 coaster+0.135
 Fer+0.129
+0.125
 salon+0.123
 jugg+0.123
Neg logit contributions
lessness-0.094
 of-0.090
 twins-0.084
 about-0.072
 who-0.072
Token said
Token ID531
Feature activation-3.20e-02

Pos logit contributions
lessness+0.015
 of+0.014
 twins+0.013
 who+0.011
 about+0.011
Neg logit contributions
 coaster-0.018
 Fer-0.017
-0.017
 salon-0.016
 jugg-0.016
Token,
Token ID11
Feature activation+8.49e-03

Pos logit contributions
lessness+0.050
 twins+0.044
 of+0.044
 privacy+0.038
ruck+0.037
Neg logit contributions
 coaster-0.065
 Fer-0.062
-0.061
 jugg-0.060
 som-0.060
Token
Token ID6184
Feature activation+8.77e-02

Pos logit contributions
 coaster+0.018
 Fer+0.017
+0.017
 jugg+0.016
 salon+0.016
Neg logit contributions
lessness-0.013
 of-0.012
 twins-0.012
 about-0.010
 who-0.010
Token
Token ID95
Feature activation+9.90e-02

Pos logit contributions
 coaster+0.120
 salon+0.120
 jugg+0.112
 Fer+0.110
 blur+0.104
Neg logit contributions
lessness-0.174
 who-0.161
hearted-0.156
 of-0.156
free-0.154
Token
Token ID26391
Feature activation+8.45e-02

Pos logit contributions
 salon+0.177
 Fer+0.169
 jugg+0.168
 coaster+0.164
 blur+0.158
Neg logit contributions
lessness-0.167
 who-0.146
 twins-0.145
hearted-0.145
 company-0.144
Token
Token ID129
Feature activation+1.03e-01

Pos logit contributions
 Fer+0.136
 coaster+0.135
 jugg+0.131
 salon+0.129
 blur+0.124
Neg logit contributions
lessness-0.164
 twins-0.150
 of-0.150
 who-0.136
 about-0.131
Token
Token ID241
Feature activation+2.01e-02

Pos logit contributions
 salon+0.187
 pro+0.177
 Fer+0.177
 jugg+0.176
+0.174
Neg logit contributions
 about-0.176
 important-0.174
 privacy-0.157
 supposed-0.155
 company-0.154
TokenNext
Token ID10019
Feature activation-2.71e-02

Pos logit contributions
 coaster+0.035
 Fer+0.033
+0.032
 salon+0.031
 jugg+0.031
Neg logit contributions
lessness-0.038
 of-0.035
 twins-0.034
 who-0.030
 about-0.030
Token time
Token ID640
Feature activation+2.81e-02

Pos logit contributions
lessness+0.039
 of+0.036
 twins+0.034
 about+0.028
 company+0.028
Neg logit contributions
 coaster-0.060
 Fer-0.057
-0.056
 jugg-0.056
 salon-0.055
Token you
Token ID345
Feature activation-1.09e-03

Pos logit contributions
 coaster+0.064
 Fer+0.061
+0.060
 jugg+0.059
 salon+0.059
Neg logit contributions
lessness-0.038
 of-0.035
 twins-0.033
 about-0.027
 privacy-0.027
Token are
Token ID389
Feature activation+1.50e-03

Pos logit contributions
lessness+0.002
 of+0.001
 twins+0.001
 about+0.001
 who+0.001
Neg logit contributions
 coaster-0.002
 Fer-0.002
-0.002
 salon-0.002
 jugg-0.002
Tokenmy
Token ID1820
Feature activation-1.01e-02

Pos logit contributions
lessness+0.027
 company+0.021
 privacy+0.021
 secrets+0.019
ovable+0.019
Neg logit contributions
 coaster-0.039
 jugg-0.037
 sixty-0.036
 sav-0.036
 salon-0.036
Token:
Token ID25
Feature activation+3.90e-03

Pos logit contributions
lessness+0.015
 of+0.014
 twins+0.013
 privacy+0.011
 about+0.011
Neg logit contributions
 coaster-0.022
 Fer-0.021
-0.020
 jugg-0.020
 salon-0.020
Token
Token ID6184
Feature activation+8.35e-02

Pos logit contributions
 coaster+0.009
 Fer+0.008
 som+0.008
 jugg+0.008
+0.008
Neg logit contributions
lessness-0.005
 of-0.005
 twins-0.005
 who-0.004
 about-0.004
Token
Token ID95
Feature activation+9.19e-02

Pos logit contributions
 coaster+0.101
 blur+0.093
 jugg+0.091
 som+0.091
 pro+0.090
Neg logit contributions
 who-0.166
lessness-0.164
 twins-0.164
 of-0.159
.-0.156
Token
Token ID26391
Feature activation+7.33e-02

Pos logit contributions
 Fer+0.151
 jugg+0.148
 pro+0.146
 coaster+0.145
 som+0.143
Neg logit contributions
lessness-0.162
 twins-0.153
 of-0.146
 who-0.145
 living-0.141
Token
Token ID129
Feature activation+9.67e-02

Pos logit contributions
 coaster+0.110
 Fer+0.110
 jugg+0.104
 sixty+0.101
 som+0.100
Neg logit contributions
lessness-0.149
 of-0.141
 twins-0.140
 who-0.126
 about-0.122
Token
Token ID241
Feature activation+1.07e-02

Pos logit contributions
 pro+0.157
+0.152
 jugg+0.148
 Fer+0.148
 coaster+0.146
Neg logit contributions
 about-0.172
 twins-0.168
lessness-0.164
 supposed-0.163
 important-0.160
TokenWhat
Token ID2061
Feature activation-4.33e-02

Pos logit contributions
 coaster+0.019
 Fer+0.018
+0.018
 jugg+0.017
 som+0.017
Neg logit contributions
lessness-0.019
 of-0.018
 twins-0.017
 privacy-0.016
 bricks-0.015
Token do
Token ID466
Feature activation+3.41e-02

Pos logit contributions
lessness+0.074
 twins+0.055
ruck+0.055
 of+0.054
 privacy+0.052
Neg logit contributions
 coaster-0.083
 jugg-0.083
 Fer-0.081
olog-0.078
 salon-0.077
Token you
Token ID345
Feature activation-3.62e-02

Pos logit contributions
 coaster+0.056
 Fer+0.052
 pro+0.050
 blur+0.049
+0.049
Neg logit contributions
lessness-0.067
 of-0.065
 twins-0.062
 who-0.055
 company-0.054
Token want
Token ID765
Feature activation-4.61e-02

Pos logit contributions
lessness+0.056
 of+0.049
 twins+0.047
ruck+0.041
 bricks+0.041
Neg logit contributions
 coaster-0.071
 jugg-0.070
-0.070
 Fer-0.069
 drib-0.068
Token on
Token ID319
Feature activation+3.60e-02

Pos logit contributions
 coaster+0.075
 Fer+0.073
+0.072
 salon+0.071
icol+0.070
Neg logit contributions
lessness-0.038
 of-0.036
 twins-0.034
 about-0.029
 who-0.028
Token the
Token ID262
Feature activation-2.88e-03

Pos logit contributions
 coaster+0.096
olog+0.096
icol+0.093
+0.093
 jugg+0.092
Neg logit contributions
 of-0.030
 twins-0.029
lessness-0.026
 toys-0.025
 bricks-0.024
Token swings
Token ID26728
Feature activation+6.31e-02

Pos logit contributions
lessness+0.007
 of+0.006
 twins+0.006
 toys+0.006
 who+0.006
Neg logit contributions
 coaster-0.004
-0.003
 Fer-0.003
 jugg-0.003
olog-0.003
Token and
Token ID290
Feature activation+2.42e-02

Pos logit contributions
 coaster+0.179
 Fer+0.174
icol+0.171
+0.170
 jugg+0.170
Neg logit contributions
 of-0.050
lessness-0.049
 twins-0.042
 about-0.033
 who-0.032
Token the
Token ID262
Feature activation+1.95e-02

Pos logit contributions
+0.052
 coaster+0.049
icol+0.048
olog+0.047
 salon+0.044
Neg logit contributions
 of-0.037
 twins-0.035
lessness-0.033
 toys-0.032
 who-0.031
Token slide
Token ID10649
Feature activation+8.91e-02

Pos logit contributions
 coaster+0.022
 Fer+0.021
 jugg+0.019
+0.018
 salon+0.018
Neg logit contributions
lessness-0.049
 of-0.046
 twins-0.044
 about-0.042
 privacy-0.042
Token.
Token ID13
Feature activation-8.30e-03

Pos logit contributions
 coaster+0.282
icol+0.276
+0.275
 Fer+0.273
 jugg+0.270
Neg logit contributions
 of-0.044
lessness-0.039
 twins-0.029
 about-0.018
 everywhere-0.015
Token They
Token ID1119
Feature activation+1.98e-02

Pos logit contributions
lessness+0.008
 of+0.007
 twins+0.007
 about+0.005
 privacy+0.005
Neg logit contributions
 coaster-0.022
 Fer-0.021
-0.020
 jugg-0.020
 salon-0.020
Token are
Token ID389
Feature activation+9.75e-03

Pos logit contributions
 coaster+0.054
icol+0.052
+0.051
 Fer+0.050
olog+0.050
Neg logit contributions
lessness-0.017
 of-0.017
 twins-0.016
 who-0.013
 everywhere-0.012
Token happy
Token ID3772
Feature activation-2.44e-02

Pos logit contributions
 coaster+0.024
 Fer+0.023
 jugg+0.023
 som+0.022
 blur+0.022
Neg logit contributions
lessness-0.011
 of-0.009
 twins-0.008
 privacy-0.008
ruck-0.007
Token and
Token ID290
Feature activation+2.73e-02

Pos logit contributions
lessness+0.033
aky+0.027
ovable+0.026
 privacy+0.025
iness+0.023
Neg logit contributions
 coaster-0.053
 pro-0.052
 jugg-0.051
-0.050
 som-0.049
Token and
Token ID290
Feature activation+3.43e-02

Pos logit contributions
lessness+0.071
 privacy+0.066
ery+0.065
grand+0.065
 files+0.064
Neg logit contributions
 blur-0.063
 pro-0.062
 coaster-0.060
 jugg-0.057
 roller-0.057
Token cried
Token ID16896
Feature activation-4.61e-02

Pos logit contributions
 coaster+0.082
 Fer+0.078
+0.078
 jugg+0.076
 salon+0.076
Neg logit contributions
lessness-0.043
 of-0.038
 twins-0.037
 about-0.030
 privacy-0.030
Token.
Token ID13
Feature activation-4.37e-03

Pos logit contributions
lessness+0.057
 privacy+0.050
 importantly+0.048
 twins+0.048
grand+0.048
Neg logit contributions
 pro-0.090
 blur-0.088
olog-0.086
-0.085
 coaster-0.085
Token
Token ID198
Feature activation+7.94e-03

Pos logit contributions
lessness+0.005
 of+0.005
 twins+0.005
 privacy+0.004
ruck+0.004
Neg logit contributions
 coaster-0.010
-0.010
 Fer-0.010
 jugg-0.010
 salon-0.010
Token
Token ID198
Feature activation+1.98e-02

Pos logit contributions
 coaster+0.016
+0.016
 jugg+0.015
 Fer+0.015
olog+0.015
Neg logit contributions
lessness-0.012
 twins-0.010
 of-0.009
 privacy-0.009
iness-0.009
TokenBut
Token ID1537
Feature activation+2.55e-02

Pos logit contributions
 coaster+0.042
 Fer+0.041
+0.039
 jugg+0.039
icol+0.038
Neg logit contributions
lessness-0.030
 of-0.030
 twins-0.030
 who-0.026
aky-0.024
Token then
Token ID788
Feature activation+4.61e-02

Pos logit contributions
 coaster+0.043
 Fer+0.040
+0.040
 jugg+0.039
icol+0.038
Neg logit contributions
 of-0.051
lessness-0.050
 twins-0.048
 about-0.044
 who-0.044
Token,
Token ID11
Feature activation+2.87e-02

Pos logit contributions
 coaster+0.109
 Fer+0.104
+0.103
 jugg+0.101
 salon+0.100
Neg logit contributions
lessness-0.058
 of-0.053
 twins-0.052
 about-0.042
 who-0.041
Token they
Token ID484
Feature activation+2.07e-02

Pos logit contributions
 coaster+0.067
 Fer+0.064
+0.063
 jugg+0.062
 salon+0.062
Neg logit contributions
lessness-0.037
 of-0.033
 twins-0.032
 privacy-0.026
 about-0.026
Token saw
Token ID2497
Feature activation-3.36e-02

Pos logit contributions
 coaster+0.044
 Fer+0.043
 jugg+0.042
+0.042
 salon+0.041
Neg logit contributions
lessness-0.030
 of-0.027
 twins-0.026
 privacy-0.023
 company-0.022
Token Mom
Token ID11254
Feature activation-1.98e-02

Pos logit contributions
lessness+0.053
 of+0.041
aky+0.039
 company+0.039
 privacy+0.038
Neg logit contributions
 coaster-0.069
 Fer-0.067
 jugg-0.064
 som-0.064
 blur-0.063
TokenJane
Token ID41083
Feature activation+1.24e-02

Pos logit contributions
lessness+0.010
 of+0.009
 twins+0.009
 privacy+0.008
 company+0.008
Neg logit contributions
 coaster-0.009
 Fer-0.009
-0.008
 jugg-0.008
 salon-0.008
Token said
Token ID531
Feature activation-3.61e-02

Pos logit contributions
 coaster+0.028
 Fer+0.027
+0.027
 salon+0.025
 som+0.025
Neg logit contributions
lessness-0.017
 of-0.016
 twins-0.015
 who-0.014
 about-0.013
Token,
Token ID11
Feature activation+3.23e-03

Pos logit contributions
lessness+0.047
 of+0.041
 twins+0.040
 privacy+0.033
 about+0.032
Neg logit contributions
 coaster-0.084
 Fer-0.081
-0.080
 jugg-0.079
 salon-0.078
Token "
Token ID366
Feature activation+5.08e-02

Pos logit contributions
 coaster+0.007
 Fer+0.006
+0.006
 jugg+0.006
 salon+0.006
Neg logit contributions
lessness-0.005
 of-0.005
 twins-0.005
 about-0.004
 who-0.004
TokenDon
Token ID3987
Feature activation+2.84e-02

Pos logit contributions
 coaster+0.095
 Fer+0.091
+0.089
 jugg+0.086
 salon+0.086
Neg logit contributions
lessness-0.090
 of-0.083
 twins-0.080
 privacy-0.070
 who-0.070
Token't
Token ID470
Feature activation+2.43e-02

Pos logit contributions
 coaster+0.056
icol+0.055
 Fer+0.055
 salon+0.054
 wrists+0.053
Neg logit contributions
 of-0.049
lessness-0.043
 twins-0.040
 about-0.038
 scientists-0.034
Token worry
Token ID5490
Feature activation-1.65e-02

Pos logit contributions
 coaster+0.058
 Fer+0.055
+0.054
 jugg+0.054
 salon+0.053
Neg logit contributions
lessness-0.031
 of-0.027
 twins-0.026
 privacy-0.021
 about-0.021
Token!
Token ID0
Feature activation-4.09e-04

Pos logit contributions
lessness+0.028
 twins+0.024
 privacy+0.022
aky+0.022
 of+0.022
Neg logit contributions
 coaster-0.031
 Fer-0.029
 som-0.029
 jugg-0.028
olog-0.028
Token Just
Token ID2329
Feature activation+3.72e-02

Pos logit contributions
lessness+0.001
 of+0.001
 twins+0.001
 who+0.001
 about+0.001
Neg logit contributions
 coaster-0.001
 Fer-0.001
 som-0.001
 jugg-0.001
 salon-0.001
Token trust
Token ID3774
Feature activation-4.66e-03

Pos logit contributions
 coaster+0.074
 Fer+0.070
+0.069
icol+0.069
 salon+0.066
Neg logit contributions
 of-0.061
lessness-0.058
 twins-0.054
 who-0.054
 company-0.052
Token me
Token ID502
Feature activation-7.92e-03

Pos logit contributions
lessness+0.009
 of+0.008
 twins+0.008
 privacy+0.007
 company+0.007
Neg logit contributions
 coaster-0.008
 Fer-0.008
 jugg-0.007
-0.007
 salon-0.007
Token laughed
Token ID13818
Feature activation-2.71e-02

Pos logit contributions
 coaster+0.018
 Fer+0.017
+0.017
 jugg+0.017
 salon+0.017
Neg logit contributions
lessness-0.012
 of-0.011
 twins-0.010
 privacy-0.009
 about-0.009
Token a
Token ID257
Feature activation-8.31e-03

Pos logit contributions
lessness+0.037
 privacy+0.031
 twins+0.028
ruck+0.028
aky+0.027
Neg logit contributions
 coaster-0.056
 pro-0.056
 blur-0.053
 jugg-0.052
 som-0.052
Token lot
Token ID1256
Feature activation-1.38e-02

Pos logit contributions
lessness+0.013
 of+0.012
 twins+0.011
 about+0.009
 company+0.009
Neg logit contributions
 coaster-0.017
 Fer-0.017
 salon-0.016
-0.016
 jugg-0.016
Token.
Token ID13
Feature activation-9.86e-03

Pos logit contributions
lessness+0.012
 twins+0.010
 of+0.008
 privacy+0.008
 bricks+0.008
Neg logit contributions
 coaster-0.036
 Fer-0.035
 pro-0.034
-0.034
 jugg-0.034
Token
Token ID198
Feature activation+5.83e-03

Pos logit contributions
lessness+0.012
 of+0.011
 twins+0.010
 privacy+0.008
 company+0.008
Neg logit contributions
 coaster-0.024
 Fer-0.023
-0.023
 jugg-0.022
 salon-0.022
Token
Token ID198
Feature activation+1.83e-02

Pos logit contributions
 coaster+0.012
+0.011
 jugg+0.011
V+0.011
 som+0.011
Neg logit contributions
lessness-0.009
 twins-0.007
 of-0.007
 privacy-0.007
iness-0.007
TokenBut
Token ID1537
Feature activation+2.12e-02

Pos logit contributions
 coaster+0.037
 Fer+0.036
+0.035
icol+0.035
 salon+0.034
Neg logit contributions
lessness-0.029
 of-0.028
 twins-0.026
 who-0.024
 about-0.023
Token soon
Token ID2582
Feature activation+1.96e-02

Pos logit contributions
 coaster+0.042
 Fer+0.040
+0.040
 jugg+0.039
icol+0.039
Neg logit contributions
 of-0.035
 twins-0.033
lessness-0.032
 about-0.031
 who-0.030
Token,
Token ID11
Feature activation+2.56e-02

Pos logit contributions
 coaster+0.042
 Fer+0.040
+0.039
 jugg+0.038
 som+0.038
Neg logit contributions
lessness-0.030
 of-0.025
 twins-0.025
 privacy-0.022
 company-0.021
Token they
Token ID484
Feature activation+2.00e-02

Pos logit contributions
 coaster+0.059
 Fer+0.057
+0.056
 jugg+0.055
 salon+0.054
Neg logit contributions
lessness-0.033
 of-0.030
 twins-0.029
 about-0.024
 privacy-0.024
Token heard
Token ID2982
Feature activation-4.68e-02

Pos logit contributions
 coaster+0.044
 Fer+0.042
+0.041
 jugg+0.041
 salon+0.041
Neg logit contributions
lessness-0.028
 of-0.025
 twins-0.024
 privacy-0.021
 company-0.020
Token pack
Token ID2353
Feature activation+1.71e-02

Pos logit contributions
 coaster+0.025
 Fer+0.024
+0.023
 jugg+0.023
 salon+0.023
Neg logit contributions
lessness-0.021
 of-0.020
 twins-0.019
 privacy-0.016
 about-0.016
Token some
Token ID617
Feature activation-3.11e-02

Pos logit contributions
 coaster+0.037
 Fer+0.037
 salon+0.034
 som+0.034
 jugg+0.034
Neg logit contributions
lessness-0.025
 of-0.020
 twins-0.019
ruck-0.018
aky-0.018
Token food
Token ID2057
Feature activation-1.90e-02

Pos logit contributions
lessness+0.054
 twins+0.040
iness+0.039
 privacy+0.039
ruck+0.038
Neg logit contributions
 Fer-0.061
 coaster-0.059
 jugg-0.055
-0.054
 roller-0.054
Token and
Token ID290
Feature activation+6.44e-03

Pos logit contributions
lessness+0.026
 twins+0.021
 privacy+0.019
ruck+0.019
 of+0.019
Neg logit contributions
 coaster-0.042
-0.040
 Fer-0.040
 salon-0.039
 sund-0.038
Token drinks
Token ID11758
Feature activation-7.61e-03

Pos logit contributions
 coaster+0.013
 Fer+0.012
+0.012
 jugg+0.012
 salon+0.012
Neg logit contributions
lessness-0.011
 of-0.010
 twins-0.009
 privacy-0.008
 about-0.008
Token in
Token ID287
Feature activation+3.00e-02

Pos logit contributions
lessness+0.008
 twins+0.006
 of+0.006
 privacy+0.006
ruck+0.005
Neg logit contributions
 coaster-0.019
 Fer-0.019
-0.018
 jugg-0.018
 som-0.018
Token a
Token ID257
Feature activation-1.99e-02

Pos logit contributions
 coaster+0.064
 Fer+0.063
 jugg+0.060
+0.060
 som+0.060
Neg logit contributions
lessness-0.046
 of-0.037
 twins-0.035
iness-0.032
 privacy-0.032
Token big
Token ID1263
Feature activation+2.89e-02

Pos logit contributions
lessness+0.027
 of+0.025
 twins+0.024
 who+0.020
 company+0.020
Neg logit contributions
 coaster-0.045
 Fer-0.042
-0.042
 jugg-0.042
 salon-0.041
Token bag
Token ID6131
Feature activation+1.72e-02

Pos logit contributions
 coaster+0.055
+0.052
 jugg+0.052
olog+0.051
 pro+0.050
Neg logit contributions
 of-0.047
lessness-0.045
 twins-0.045
 toys-0.040
 bricks-0.039
Token.
Token ID13
Feature activation-1.87e-02

Pos logit contributions
 coaster+0.041
 salon+0.040
+0.040
 Fer+0.039
 jugg+0.039
Neg logit contributions
lessness-0.018
 privacy-0.014
 twins-0.014
 importantly-0.012
 secrets-0.012
Token They
Token ID1119
Feature activation+1.27e-02

Pos logit contributions
lessness+0.018
 of+0.018
 twins+0.017
 about+0.013
 who+0.013
Neg logit contributions
 coaster-0.050
 Fer-0.048
 jugg-0.046
 som-0.046
 salon-0.046
Token to
Token ID284
Feature activation+2.70e-02

Pos logit contributions
 coaster+0.218
+0.211
 Fer+0.209
 jugg+0.207
 som+0.206
Neg logit contributions
 of-0.048
lessness-0.046
 twins-0.041
 about-0.028
 everywhere-0.027
Token the
Token ID262
Feature activation-1.39e-02

Pos logit contributions
 coaster+0.073
+0.069
olog+0.069
icol+0.068
 Fer+0.067
Neg logit contributions
 twins-0.023
 of-0.022
 who-0.019
lessness-0.018
 toys-0.017
Token park
Token ID3952
Feature activation+5.45e-02

Pos logit contributions
 twins+0.025
 of+0.025
lessness+0.022
 living+0.022
 toys+0.022
Neg logit contributions
 coaster-0.024
-0.023
olog-0.023
chief-0.023
icol-0.022
Token to
Token ID284
Feature activation+2.70e-02

Pos logit contributions
 coaster+0.148
 Fer+0.144
+0.142
icol+0.141
 som+0.138
Neg logit contributions
 of-0.050
lessness-0.046
 twins-0.043
 about-0.036
 who-0.035
Token play
Token ID711
Feature activation+8.20e-03

Pos logit contributions
 coaster+0.070
icol+0.068
olog+0.067
+0.065
 seventeen+0.064
Neg logit contributions
 of-0.028
 twins-0.025
 who-0.024
 what-0.022
 toys-0.021
Token soccer
Token ID11783
Feature activation+4.12e-02

Pos logit contributions
 coaster+0.024
 Fer+0.023
+0.023
icol+0.023
 salon+0.023
Neg logit contributions
lessness-0.005
 of-0.005
 twins-0.005
 who-0.003
 about-0.003
Token.
Token ID13
Feature activation-1.08e-02

Pos logit contributions
 coaster+0.129
icol+0.126
+0.126
 Fer+0.125
olog+0.123
Neg logit contributions
 of-0.020
lessness-0.016
 twins-0.015
 who-0.009
 about-0.007
Token But
Token ID887
Feature activation+2.62e-03

Pos logit contributions
 of+0.010
lessness+0.010
 twins+0.010
 about+0.008
 who+0.007
Neg logit contributions
 coaster-0.028
 jugg-0.026
 Fer-0.026
-0.026
 som-0.026
Token the
Token ID262
Feature activation+2.43e-02

Pos logit contributions
 coaster+0.007
+0.006
 Fer+0.006
icol+0.006
 jugg+0.006
Neg logit contributions
 of-0.003
lessness-0.003
 twins-0.003
 about-0.002
 who-0.002
Token other
Token ID584
Feature activation+2.79e-02

Pos logit contributions
 coaster+0.052
+0.050
icol+0.050
olog+0.049
 Fer+0.049
Neg logit contributions
 of-0.035
 twins-0.035
lessness-0.034
 toys-0.028
 friends-0.027
Token team
Token ID1074
Feature activation+6.38e-03

Pos logit contributions
 coaster+0.063
+0.061
 Fer+0.060
icol+0.060
olog+0.058
Neg logit contributions
lessness-0.037
 of-0.036
 twins-0.035
 toys-0.028
 who-0.028
Token always
Token ID1464
Feature activation+4.72e-03

Pos logit contributions
 coaster+0.102
 Fer+0.097
+0.095
 jugg+0.093
 salon+0.093
Neg logit contributions
lessness-0.073
 of-0.067
 twins-0.065
 privacy-0.056
 about-0.055
Token listen
Token ID6004
Feature activation-1.97e-02

Pos logit contributions
 coaster+0.009
 Fer+0.009
+0.009
 salon+0.008
icol+0.008
Neg logit contributions
lessness-0.008
 of-0.008
 twins-0.007
 who-0.007
 company-0.006
Token to
Token ID284
Feature activation+1.66e-02

Pos logit contributions
lessness+0.024
 twins+0.021
 of+0.018
 bricks+0.018
 privacy+0.017
Neg logit contributions
 coaster-0.043
 som-0.042
 pro-0.042
 jugg-0.042
 salon-0.041
Token her
Token ID607
Feature activation-5.97e-02

Pos logit contributions
 coaster+0.035
 som+0.035
 jugg+0.034
 Fer+0.034
 salon+0.034
Neg logit contributions
lessness-0.024
 of-0.019
 twins-0.018
aky-0.016
ruck-0.015
Token mom
Token ID1995
Feature activation-5.68e-02

Pos logit contributions
lessness+0.076
 of+0.063
 twins+0.059
ruck+0.054
aky+0.050
Neg logit contributions
 coaster-0.137
 salon-0.130
 jugg-0.129
 som-0.129
-0.128
Token's
Token ID338
Feature activation-2.78e-02

Pos logit contributions
lessness+0.071
 twins+0.054
 bricks+0.052
 privacy+0.052
 scientists+0.051
Neg logit contributions
 coaster-0.124
 salon-0.124
-0.117
 jugg-0.117
 pro-0.116
Token recommendations
Token ID10763
Feature activation-5.27e-02

Pos logit contributions
lessness+0.047
 of+0.041
 twins+0.038
aky+0.035
ruck+0.035
Neg logit contributions
 coaster-0.052
 som-0.048
 salon-0.048
 jugg-0.047
olog-0.047
Token.
Token ID13
Feature activation-9.48e-03

Pos logit contributions
lessness+0.047
 twins+0.041
 of+0.034
 privacy+0.031
 bricks+0.031
Neg logit contributions
 coaster-0.134
 salon-0.130
-0.128
 Fer-0.127
 jugg-0.126
Token<|endoftext|>
Token ID50256
Feature activation-6.19e-03

Pos logit contributions
lessness+0.012
 of+0.011
 twins+0.010
 privacy+0.009
ruck+0.009
Neg logit contributions
 coaster-0.022
-0.021
 Fer-0.020
 salon-0.020
 jugg-0.020
TokenOnce
Token ID7454
Feature activation-2.25e-02

Pos logit contributions
lessness+0.009
 twins+0.007
 privacy+0.007
 of+0.007
ruck+0.007
Neg logit contributions
 coaster-0.013
-0.013
 Fer-0.012
 jugg-0.012
 pro-0.012
Token upon
Token ID2402
Feature activation-1.71e-02

Pos logit contributions
 privacy+0.031
lessness+0.031
 scientists+0.029
 company+0.029
ruck+0.027
Neg logit contributions
 salon-0.042
-0.040
 drib-0.038
irl-0.037
 pro-0.037
Token smiled
Token ID13541
Feature activation-1.57e-02

Pos logit contributions
+0.061
 coaster+0.060
 Fer+0.060
 som+0.058
 blur+0.057
Neg logit contributions
 of-0.035
 twins-0.035
lessness-0.034
 who-0.032
 about-0.028
Token.
Token ID13
Feature activation-1.45e-03

Pos logit contributions
lessness+0.020
 bricks+0.017
ruck+0.016
 privacy+0.016
 scientists+0.016
Neg logit contributions
 pro-0.032
 coaster-0.032
 blur-0.030
 jugg-0.030
icol-0.030
Token He
Token ID679
Feature activation+1.61e-02

Pos logit contributions
lessness+0.002
 of+0.001
 twins+0.001
 privacy+0.001
 company+0.001
Neg logit contributions
 coaster-0.004
 Fer-0.004
-0.004
 jugg-0.003
 salon-0.003
Token was
Token ID373
Feature activation+2.13e-03

Pos logit contributions
 coaster+0.039
+0.038
 Fer+0.038
 salon+0.037
 som+0.036
Neg logit contributions
lessness-0.019
 of-0.018
 twins-0.017
 who-0.015
 about-0.015
Token glad
Token ID9675
Feature activation-3.58e-02

Pos logit contributions
 coaster+0.005
 Fer+0.005
 jugg+0.005
+0.005
 salon+0.005
Neg logit contributions
lessness-0.003
 of-0.002
 twins-0.002
 privacy-0.002
ruck-0.002
Token that
Token ID326
Feature activation+6.24e-04

Pos logit contributions
lessness+0.058
aky+0.047
 privacy+0.046
ovable+0.045
 company+0.045
Neg logit contributions
 coaster-0.067
 jugg-0.067
 salon-0.064
 som-0.064
 pro-0.063
Token Anna
Token ID11735
Feature activation+2.51e-02

Pos logit contributions
 coaster+0.001
 jugg+0.001
 Fer+0.001
 salon+0.001
+0.001
Neg logit contributions
lessness-0.001
 of-0.001
 twins-0.001
 privacy-0.001
ruck-0.001
Token saw
Token ID2497
Feature activation-4.61e-02

Pos logit contributions
 coaster+0.053
 Fer+0.051
+0.050
 jugg+0.049
 salon+0.049
Neg logit contributions
lessness-0.038
 of-0.034
 twins-0.033
 about-0.028
 privacy-0.028
Token his
Token ID465
Feature activation-1.48e-02

Pos logit contributions
lessness+0.085
 of+0.070
 twins+0.069
 privacy+0.068
 company+0.066
Neg logit contributions
 coaster-0.082
 salon-0.079
 jugg-0.079
 Fer-0.078
 pro-0.075
Token pedal
Token ID26667
Feature activation+5.64e-02

Pos logit contributions
lessness+0.030
 of+0.028
 twins+0.027
 privacy+0.024
 about+0.024
Neg logit contributions
 coaster-0.024
 Fer-0.022
-0.022
 jugg-0.022
 salon-0.021
Token.
Token ID13
Feature activation-4.75e-03

Pos logit contributions
 coaster+0.151
 Fer+0.147
+0.145
 jugg+0.143
 salon+0.142
Neg logit contributions
lessness-0.052
 of-0.044
 twins-0.041
 about-0.030
 privacy-0.029
Token steal
Token ID8711
Feature activation-2.23e-02

Pos logit contributions
 coaster+0.064
 Fer+0.061
+0.060
 salon+0.059
icol+0.058
Neg logit contributions
 of-0.051
lessness-0.049
 twins-0.048
 who-0.043
 company-0.043
Token something
Token ID1223
Feature activation-8.04e-02

Pos logit contributions
lessness+0.032
 twins+0.027
 of+0.026
 privacy+0.022
ruck+0.022
Neg logit contributions
 coaster-0.049
 Fer-0.047
 jugg-0.045
 som-0.045
 pro-0.045
Token!
Token ID0
Feature activation-1.65e-02

Pos logit contributions
lessness+0.101
 twins+0.086
 of+0.080
 privacy+0.070
ruck+0.069
Neg logit contributions
 coaster-0.189
 Fer-0.184
 jugg-0.179
 som-0.177
 pro-0.177
Token<|endoftext|>
Token ID50256
Feature activation-1.34e-02

Pos logit contributions
lessness+0.019
 of+0.017
 twins+0.016
 about+0.013
 privacy+0.013
Neg logit contributions
 coaster-0.041
 Fer-0.039
-0.038
 jugg-0.038
 salon-0.038
TokenOnce
Token ID7454
Feature activation-2.37e-02

Pos logit contributions
lessness+0.022
 of+0.019
 twins+0.019
 privacy+0.018
 company+0.017
Neg logit contributions
 coaster-0.026
-0.026
 Fer-0.025
 jugg-0.024
 pro-0.023
Token upon
Token ID2402
Feature activation-2.68e-02

Pos logit contributions
lessness+0.032
 privacy+0.031
 company+0.030
 twins+0.029
 scientists+0.028
Neg logit contributions
 salon-0.046
-0.043
 coaster-0.042
 Fer-0.042
 drib-0.042
Token a
Token ID257
Feature activation-1.32e-02

Pos logit contributions
lessness+0.048
iness+0.039
 privacy+0.037
 twins+0.036
 bulbs+0.035
Neg logit contributions
 coaster-0.051
 jugg-0.047
 som-0.046
 sund-0.044
 Fer-0.044
Token time
Token ID640
Feature activation+1.14e-02

Pos logit contributions
lessness+0.018
 of+0.016
 twins+0.015
 privacy+0.013
 about+0.013
Neg logit contributions
 coaster-0.030
 Fer-0.029
-0.028
 jugg-0.028
 salon-0.028
Token there
Token ID612
Feature activation+3.05e-03

Pos logit contributions
 coaster+0.023
+0.022
 Fer+0.022
 jugg+0.021
 salon+0.021
Neg logit contributions
lessness-0.018
 twins-0.015
 of-0.015
 privacy-0.014
ruck-0.013
Token was
Token ID373
Feature activation-9.80e-03

Pos logit contributions
 coaster+0.006
 Fer+0.006
+0.006
 jugg+0.006
 salon+0.005
Neg logit contributions
lessness-0.005
 of-0.004
 twins-0.004
 privacy-0.004
 company-0.004
Token a
Token ID257
Feature activation-1.32e-02

Pos logit contributions
lessness+0.017
 privacy+0.013
iness+0.013
 of+0.012
 twins+0.012
Neg logit contributions
-0.019
 coaster-0.019
 Fer-0.018
 sund-0.018
 jugg-0.018
Token said
Token ID531
Feature activation-3.40e-02

Pos logit contributions
+0.047
 coaster+0.047
 Fer+0.044
icol+0.043
 som+0.043
Neg logit contributions
 of-0.030
lessness-0.029
 twins-0.027
 who-0.026
 about-0.024
Token thank
Token ID5875
Feature activation+5.70e-03

Pos logit contributions
lessness+0.059
 of+0.053
 twins+0.052
 privacy+0.046
 company+0.045
Neg logit contributions
 coaster-0.064
 Fer-0.061
-0.059
 jugg-0.059
 salon-0.058
Token you
Token ID345
Feature activation-2.09e-02

Pos logit contributions
 coaster+0.014
 Fer+0.013
icol+0.013
 blur+0.013
+0.013
Neg logit contributions
 of-0.007
lessness-0.006
 twins-0.006
 who-0.005
 company-0.005
Token.
Token ID13
Feature activation-1.74e-03

Pos logit contributions
lessness+0.024
 twins+0.019
 of+0.018
 privacy+0.016
 importantly+0.016
Neg logit contributions
 coaster-0.051
 Fer-0.047
 salon-0.047
olog-0.047
 pro-0.046
Token The
Token ID383
Feature activation+2.83e-02

Pos logit contributions
lessness+0.002
 of+0.002
 twins+0.002
 about+0.002
 who+0.002
Neg logit contributions
 coaster-0.004
 Fer-0.004
 jugg-0.004
-0.004
 salon-0.004
Token driver
Token ID4639
Feature activation+1.11e-02

Pos logit contributions
 coaster+0.045
 Fer+0.042
+0.040
 salon+0.040
 jugg+0.039
Neg logit contributions
lessness-0.059
 of-0.053
 twins-0.051
 privacy-0.047
 about-0.046
Token smiled
Token ID13541
Feature activation-1.12e-02

Pos logit contributions
 coaster+0.023
+0.023
 Fer+0.022
icol+0.022
 som+0.022
Neg logit contributions
 of-0.017
lessness-0.016
 twins-0.015
 who-0.014
 about-0.013
Token and
Token ID290
Feature activation+8.45e-03

Pos logit contributions
lessness+0.014
 twins+0.012
 of+0.012
 privacy+0.011
 bricks+0.010
Neg logit contributions
 coaster-0.026
 Fer-0.025
 pro-0.024
-0.024
 jugg-0.024
Token said
Token ID531
Feature activation-3.08e-02

Pos logit contributions
+0.021
 coaster+0.021
 salon+0.020
+0.020
 seventeen+0.019
Neg logit contributions
 of-0.009
lessness-0.009
 twins-0.009
 who-0.008
 about-0.007
Token,
Token ID11
Feature activation+1.42e-02

Pos logit contributions
lessness+0.043
 of+0.037
 twins+0.037
 privacy+0.031
 company+0.030
Neg logit contributions
 coaster-0.069
 Fer-0.066
-0.065
 jugg-0.064
 salon-0.064
Token "
Token ID366
Feature activation+6.12e-02

Pos logit contributions
 coaster+0.036
 Fer+0.035
icol+0.033
 salon+0.033
irl+0.033
Neg logit contributions
lessness-0.016
 of-0.015
 twins-0.014
 who-0.013
 about-0.012
Token their
Token ID511
Feature activation-3.74e-03

Pos logit contributions
lessness+0.065
aky+0.055
 privacy+0.054
ovable+0.053
 supposed+0.052
Neg logit contributions
 coaster-0.049
-0.049
 jugg-0.049
 Fer-0.048
 blur-0.046
Token cries
Token ID24691
Feature activation-5.52e-02

Pos logit contributions
lessness+0.008
 of+0.007
 twins+0.006
 company+0.006
ruck+0.006
Neg logit contributions
 coaster-0.006
 Fer-0.005
 jugg-0.005
icol-0.005
 salon-0.005
Token and
Token ID290
Feature activation+2.49e-02

Pos logit contributions
lessness+0.087
 privacy+0.084
grand+0.078
 importantly+0.077
 supposed+0.075
Neg logit contributions
 pro-0.085
-0.084
 blur-0.084
 coaster-0.083
olog-0.081
Token ran
Token ID4966
Feature activation+5.27e-02

Pos logit contributions
 coaster+0.042
 Fer+0.040
+0.039
 jugg+0.039
 salon+0.038
Neg logit contributions
lessness-0.047
 of-0.043
 twins-0.042
 privacy-0.038
 company-0.038
Token to
Token ID284
Feature activation+3.34e-02

Pos logit contributions
 coaster+0.138
 Fer+0.133
+0.131
 jugg+0.130
 salon+0.129
Neg logit contributions
lessness-0.053
 of-0.044
 twins-0.043
 privacy-0.032
 company-0.031
Token them
Token ID606
Feature activation-5.13e-02

Pos logit contributions
 coaster+0.077
 Fer+0.074
+0.072
 jugg+0.072
 salon+0.071
Neg logit contributions
lessness-0.044
 of-0.040
 twins-0.039
 who-0.031
 privacy-0.031
Token.
Token ID13
Feature activation-1.84e-03

Pos logit contributions
lessness+0.071
 importantly+0.057
ruck+0.055
 twins+0.054
 privacy+0.054
Neg logit contributions
 coaster-0.105
-0.104
 drib-0.102
 salon-0.101
olog-0.100
Token She
Token ID1375
Feature activation+2.11e-02

Pos logit contributions
lessness+0.003
ruck+0.003
 of+0.002
 company+0.002
iness+0.002
Neg logit contributions
-0.004
 coaster-0.003
olog-0.003
icol-0.003
 salon-0.003
Token saw
Token ID2497
Feature activation-3.36e-02

Pos logit contributions
 coaster+0.046
 Fer+0.044
+0.043
 jugg+0.042
 salon+0.042
Neg logit contributions
lessness-0.031
 of-0.029
 twins-0.028
 who-0.024
 about-0.023
Token the
Token ID262
Feature activation+2.40e-02

Pos logit contributions
lessness+0.067
 company+0.061
 privacy+0.059
aky+0.057
 importantly+0.053
Neg logit contributions
 salon-0.051
 Fer-0.050
-0.049
 coaster-0.047
 jugg-0.046
Token dog
Token ID3290
Feature activation+1.48e-02

Pos logit contributions
 coaster+0.047
 Fer+0.046
+0.044
 salon+0.043
 jugg+0.043
Neg logit contributions
lessness-0.042
 of-0.036
 twins-0.033
 privacy-0.031
 company-0.031
Token bear
Token ID6842
Feature activation+2.89e-02

Pos logit contributions
 coaster+0.088
 Fer+0.084
+0.083
 jugg+0.082
 salon+0.081
Neg logit contributions
lessness-0.056
 of-0.050
 twins-0.048
 privacy-0.040
 about-0.040
Token's
Token ID338
Feature activation+9.96e-03

Pos logit contributions
 coaster+0.060
 Fer+0.059
+0.057
 salon+0.055
 jugg+0.055
Neg logit contributions
lessness-0.043
 of-0.042
 twins-0.039
 who-0.035
 about-0.034
Token worst
Token ID5290
Feature activation-1.74e-02

Pos logit contributions
 coaster+0.024
 Fer+0.023
+0.022
 jugg+0.022
 salon+0.022
Neg logit contributions
lessness-0.012
 of-0.011
 twins-0.011
 privacy-0.009
 company-0.009
Token fears
Token ID10251
Feature activation-5.32e-02

Pos logit contributions
lessness+0.028
 twins+0.024
 of+0.024
 privacy+0.021
 company+0.021
Neg logit contributions
 coaster-0.035
 Fer-0.033
 jugg-0.033
 som-0.032
 salon-0.032
Token were
Token ID547
Feature activation-1.55e-02

Pos logit contributions
lessness+0.085
 twins+0.080
 of+0.074
 privacy+0.070
 company+0.067
Neg logit contributions
 coaster-0.103
 Fer-0.097
-0.096
 jugg-0.096
 som-0.094
Token realized
Token ID6939
Feature activation-7.21e-02

Pos logit contributions
lessness+0.023
 of+0.020
 twins+0.020
 privacy+0.017
ruck+0.016
Neg logit contributions
 coaster-0.033
 Fer-0.031
-0.031
 jugg-0.031
 som-0.030
Token -
Token ID532
Feature activation-9.88e-03

Pos logit contributions
lessness+0.096
 twins+0.082
 of+0.080
 privacy+0.069
 company+0.067
Neg logit contributions
 coaster-0.165
 Fer-0.157
-0.155
 jugg-0.154
 som-0.152
Token the
Token ID262
Feature activation+2.94e-02

Pos logit contributions
lessness+0.014
 of+0.013
 twins+0.012
 about+0.011
 who+0.011
Neg logit contributions
 coaster-0.022
 Fer-0.021
-0.021
 salon-0.020
 jugg-0.020
Token ribbon
Token ID29092
Feature activation+5.86e-02

Pos logit contributions
 coaster+0.060
 Fer+0.057
+0.056
 jugg+0.055
 salon+0.054
Neg logit contributions
lessness-0.046
 twins-0.042
 of-0.042
 privacy-0.036
 company-0.035
Token was
Token ID373
Feature activation+5.64e-03

Pos logit contributions
 coaster+0.120
 Fer+0.114
+0.113
 jugg+0.111
 salon+0.110
Neg logit contributions
lessness-0.090
 twins-0.081
 of-0.078
 privacy-0.069
 company-0.067
Token going
Token ID1016
Feature activation+5.54e-02

Pos logit contributions
 coaster+0.012
 Fer+0.012
+0.011
 jugg+0.011
 salon+0.011
Neg logit contributions
lessness-0.008
 of-0.007
 twins-0.007
 about-0.006
 privacy-0.006
Token them
Token ID606
Feature activation-5.09e-02

Pos logit contributions
lessness+0.091
aky+0.070
 privacy+0.070
iness+0.069
 twins+0.069
Neg logit contributions
 coaster-0.072
 Fer-0.071
 jugg-0.068
 salon-0.063
 sixty-0.063
Token and
Token ID290
Feature activation+1.25e-02

Pos logit contributions
lessness+0.082
 privacy+0.064
aky+0.062
 importantly+0.060
 twins+0.060
Neg logit contributions
 coaster-0.104
 jugg-0.101
 Fer-0.096
 drib-0.096
 pro-0.095
Token l
Token ID300
Feature activation+4.44e-02

Pos logit contributions
 coaster+0.026
 Fer+0.025
+0.024
 jugg+0.024
 salon+0.024
Neg logit contributions
lessness-0.019
 of-0.018
 twins-0.017
 privacy-0.015
 about-0.015
Tokenicked
Token ID9484
Feature activation-2.26e-03

Pos logit contributions
+0.080
 Fer+0.079
 coaster+0.078
ar+0.076
olog+0.076
Neg logit contributions
lessness-0.066
 privacy-0.061
 twins-0.060
 of-0.059
ruck-0.059
Token their
Token ID511
Feature activation-3.04e-02

Pos logit contributions
lessness+0.004
 twins+0.003
 privacy+0.003
ruck+0.003
aky+0.003
Neg logit contributions
 coaster-0.004
 jugg-0.004
 Fer-0.004
 drib-0.004
 pro-0.004
Token faces
Token ID6698
Feature activation-8.39e-02

Pos logit contributions
lessness+0.050
 of+0.044
 twins+0.039
 about+0.038
ruck+0.038
Neg logit contributions
 coaster-0.061
 Fer-0.058
 jugg-0.055
-0.054
 som-0.054
Token.
Token ID13
Feature activation-1.57e-02

Pos logit contributions
lessness+0.078
 twins+0.059
 privacy+0.054
 of+0.052
 bricks+0.044
Neg logit contributions
 coaster-0.222
 jugg-0.213
 Fer-0.213
 pro-0.210
-0.209
Token\n
Token ID198
Feature activation-3.42e-03

Pos logit contributions
lessness+0.019
 of+0.017
 twins+0.016
 privacy+0.013
 company+0.013
Neg logit contributions
 coaster-0.038
 Fer-0.036
-0.036
 jugg-0.035
 salon-0.035
Token\n
Token ID198
Feature activation+1.14e-02

Pos logit contributions
lessness+0.005
 twins+0.004
 of+0.004
 privacy+0.004
 company+0.004
Neg logit contributions
 coaster-0.007
-0.007
 jugg-0.007
 Fer-0.007
 som-0.007
TokenThey
Token ID2990
Feature activation+1.83e-02

Pos logit contributions
 Fer+0.025
 coaster+0.025
icol+0.024
+0.023
 salon+0.022
Neg logit contributions
 of-0.017
lessness-0.016
 twins-0.015
 who-0.014
aky-0.013
Token put
Token ID1234
Feature activation+3.46e-02

Pos logit contributions
 coaster+0.041
+0.038
 Fer+0.037
icol+0.037
 som+0.036
Neg logit contributions
lessness-0.025
 of-0.025
 twins-0.024
 who-0.021
 everywhere-0.021
Token sun
Token ID4252
Feature activation+2.21e-02

Pos logit contributions
 coaster+0.028
 Fer+0.026
 jugg+0.026
+0.025
 salon+0.025
Neg logit contributions
lessness-0.019
 of-0.017
 twins-0.015
ruck-0.014
 about-0.014
Token came
Token ID1625
Feature activation+8.39e-03

Pos logit contributions
 coaster+0.048
 Fer+0.047
+0.046
olog+0.045
icol+0.044
Neg logit contributions
lessness-0.032
 of-0.032
 twins-0.029
 about-0.024
 everywhere-0.024
Token back
Token ID736
Feature activation+9.21e-03

Pos logit contributions
 coaster+0.017
 blur+0.017
 pro+0.016
+0.016
 surely+0.016
Neg logit contributions
lessness-0.012
 twins-0.011
 privacy-0.010
 importantly-0.010
 of-0.009
Token,
Token ID11
Feature activation-1.16e-02

Pos logit contributions
 coaster+0.021
+0.020
 salon+0.020
 pro+0.020
 blur+0.020
Neg logit contributions
lessness-0.012
 twins-0.010
 of-0.009
 privacy-0.009
 scientists-0.008
Token and
Token ID290
Feature activation+2.20e-02

Pos logit contributions
lessness+0.014
 of+0.012
 twins+0.011
 about+0.009
 privacy+0.009
Neg logit contributions
 coaster-0.029
 Fer-0.027
-0.027
 jugg-0.027
 salon-0.026
Token everyone
Token ID2506
Feature activation-6.31e-02

Pos logit contributions
 coaster+0.050
+0.048
 Fer+0.048
 jugg+0.046
 salon+0.046
Neg logit contributions
lessness-0.030
 of-0.027
 twins-0.026
 about-0.022
 privacy-0.021
Token was
Token ID373
Feature activation-4.64e-03

Pos logit contributions
lessness+0.107
 twins+0.094
 of+0.092
 company+0.087
 privacy+0.086
Neg logit contributions
 coaster-0.115
 jugg-0.114
 Fer-0.111
 pro-0.110
 salon-0.108
Token happy
Token ID3772
Feature activation-2.69e-02

Pos logit contributions
lessness+0.005
 of+0.004
 twins+0.004
 privacy+0.004
 bricks+0.004
Neg logit contributions
 coaster-0.011
 Fer-0.011
 jugg-0.011
-0.011
 salon-0.011
Token again
Token ID757
Feature activation+9.97e-03

Pos logit contributions
lessness+0.035
aky+0.031
 privacy+0.029
ovable+0.029
 files+0.028
Neg logit contributions
 pro-0.057
 coaster-0.055
 jugg-0.054
 sund-0.053
-0.053
Token.
Token ID13
Feature activation-2.24e-02

Pos logit contributions
 coaster+0.026
+0.025
 jugg+0.025
 pro+0.025
 som+0.025
Neg logit contributions
lessness-0.009
 of-0.007
 twins-0.007
 bricks-0.006
 privacy-0.006
Token The
Token ID383
Feature activation+1.62e-02

Pos logit contributions
lessness+0.037
 of+0.032
 twins+0.031
aky+0.030
ruck+0.030
Neg logit contributions
-0.043
 coaster-0.042
 Fer-0.041
olog-0.039
icol-0.039
Token started
Token ID2067
Feature activation+1.32e-03

Pos logit contributions
 coaster+0.010
+0.010
 Fer+0.009
 salon+0.009
olog+0.009
Neg logit contributions
lessness-0.007
 of-0.007
 twins-0.007
 who-0.006
 about-0.006
Token to
Token ID284
Feature activation+3.71e-02

Pos logit contributions
 coaster+0.003
 Fer+0.003
+0.003
 jugg+0.003
 salon+0.003
Neg logit contributions
lessness-0.001
 of-0.001
 twins-0.001
 about-0.001
 who-0.001
Token talk
Token ID1561
Feature activation-5.56e-02

Pos logit contributions
 coaster+0.075
 Fer+0.072
+0.071
olog+0.069
icol+0.068
Neg logit contributions
lessness-0.054
 of-0.053
 twins-0.052
 who-0.047
 company-0.047
Token about
Token ID546
Feature activation-4.48e-02

Pos logit contributions
lessness+0.059
 twins+0.046
aky+0.040
 privacy+0.040
 bricks+0.039
Neg logit contributions
 coaster-0.138
 pro-0.132
 Fer-0.130
 som-0.130
 jugg-0.130
Token all
Token ID477
Feature activation-3.25e-02

Pos logit contributions
lessness+0.069
 of+0.054
 twins+0.051
aky+0.050
iness+0.047
Neg logit contributions
 coaster-0.095
 Fer-0.093
 jugg-0.091
 som-0.090
 pro-0.088
Token sorts
Token ID10524
Feature activation-6.33e-02

Pos logit contributions
lessness+0.059
 privacy+0.049
ovable+0.046
aky+0.045
 company+0.044
Neg logit contributions
 Fer-0.055
 coaster-0.053
 jugg-0.053
 salon-0.050
 som-0.049
Token of
Token ID286
Feature activation-5.78e-02

Pos logit contributions
 privacy+0.037
lessness+0.034
 parent+0.033
 twins+0.033
aky+0.033
Neg logit contributions
 Fer-0.146
 coaster-0.145
 jugg-0.144
overed-0.144
 pro-0.142
Token things
Token ID1243
Feature activation-7.21e-02

Pos logit contributions
lessness+0.087
iness+0.057
 of+0.057
aky+0.054
ruck+0.053
Neg logit contributions
 coaster-0.126
 Fer-0.123
 jugg-0.121
 som-0.120
 roller-0.115
Token.
Token ID13
Feature activation-1.35e-02

Pos logit contributions
lessness+0.078
 twins+0.064
 privacy+0.058
aky+0.055
 of+0.054
Neg logit contributions
 coaster-0.178
 Fer-0.173
 jugg-0.170
 pro-0.169
-0.165
Token Bob
Token ID5811
Feature activation+9.39e-03

Pos logit contributions
lessness+0.018
 of+0.018
 twins+0.017
 about+0.016
 who+0.014
Neg logit contributions
 coaster-0.029
 salon-0.028
 Fer-0.028
 jugg-0.028
-0.027
Token had
Token ID550
Feature activation+7.91e-03

Pos logit contributions
 coaster+0.025
 Fer+0.024
+0.024
 som+0.023
icol+0.023
Neg logit contributions
 of-0.009
lessness-0.008
 twins-0.008
 who-0.007
 about-0.007
Token his
Token ID465
Feature activation-4.96e-02

Pos logit contributions
lessness+0.044
 of+0.033
iness+0.032
 twins+0.032
iring+0.031
Neg logit contributions
 Fer-0.059
 jugg-0.057
 coaster-0.057
 som-0.055
 sund-0.054
Token head
Token ID1182
Feature activation-2.10e-02

Pos logit contributions
lessness+0.069
 of+0.056
 twins+0.053
ruck+0.050
 about+0.046
Neg logit contributions
 coaster-0.110
 Fer-0.110
-0.106
 destination-0.102
 salon-0.102
Token filled
Token ID5901
Feature activation+1.45e-02

Pos logit contributions
lessness+0.021
 twins+0.016
 of+0.015
 privacy+0.013
 scientists+0.012
Neg logit contributions
 coaster-0.055
 Fer-0.052
 salon-0.052
-0.052
 jugg-0.051
Token with
Token ID351
Feature activation-5.03e-02

Pos logit contributions
 coaster+0.033
 Fer+0.032
+0.031
 jugg+0.031
 salon+0.030
Neg logit contributions
lessness-0.020
 of-0.017
 twins-0.017
 privacy-0.014
 about-0.013
Token incredible
Token ID8082
Feature activation-2.60e-02

Pos logit contributions
lessness+0.077
 of+0.057
 twins+0.057
aky+0.055
iness+0.055
Neg logit contributions
 Fer-0.102
 coaster-0.097
 som-0.096
 jugg-0.096
 sav-0.096
Token memories
Token ID9846
Feature activation-8.70e-02

Pos logit contributions
lessness+0.039
 of+0.030
 twins+0.030
aky+0.027
ovable+0.026
Neg logit contributions
 coaster-0.055
 Fer-0.055
 som-0.052
icol-0.052
 jugg-0.051
Token.
Token ID13
Feature activation-3.73e-02

Pos logit contributions
lessness+0.094
 privacy+0.084
 twins+0.079
aky+0.073
ovable+0.067
Neg logit contributions
 coaster-0.206
 Fer-0.200
 jugg-0.192
 pro-0.191
 sund-0.190
Token<|endoftext|>
Token ID50256
Feature activation-2.33e-02

Pos logit contributions
lessness+0.056
 of+0.047
 twins+0.046
 privacy+0.043
 company+0.041
Neg logit contributions
 coaster-0.080
 Fer-0.076
-0.076
 jugg-0.073
icol-0.072
TokenOnce
Token ID7454
Feature activation-2.99e-02

Pos logit contributions
lessness+0.037
 privacy+0.032
 twins+0.030
 of+0.029
ruck+0.028
Neg logit contributions
 coaster-0.047
-0.046
 Fer-0.043
 jugg-0.043
 pro-0.042
Token upon
Token ID2402
Feature activation-2.80e-02

Pos logit contributions
lessness+0.042
 privacy+0.042
 company+0.039
 twins+0.037
 scientists+0.037
Neg logit contributions
 salon-0.055
-0.054
 drib-0.052
irl-0.052
 coaster-0.052
Token a
Token ID257
Feature activation-1.39e-02

Pos logit contributions
lessness+0.050
iness+0.040
 privacy+0.039
 scientists+0.037
 twins+0.037
Neg logit contributions
 coaster-0.053
 jugg-0.050
-0.048
 sund-0.048
 som-0.047