TOP ACTIVATIONS
MAX = 6.705

had a black cat named Midnight . Midnight was very peaceful
was a kind monkey named Coco . Coco lived in a
, a little girl named Coco was outside playing in the
kind monkey named Coco . Coco lived in a big tree
black cat named Midnight . Midnight was very peaceful and liked
a very furry cat called Midnight . Midnight had a friend
loves her par rot , Coco . Coco is a gentle
a small cater pillar named Coco . Coco was sad because
has a small dog named Coco . One day , they
jungle . One day , Coco noticed a bird trapped in
furry cat called Midnight . Midnight had a friend called Lilly
had a black cat named Midnight . One day , Lily
par rot , Coco . Coco is a gentle bird who
cater pillar named Coco . Coco was sad because she wanted
hug her par rot , Coco . Coco was green and
was a tiny mouse called Bruno . He liked to hide
and Tom are at the shop with Mom . They want
there were two friends , Bruno and Sarah . They loved
par rot , Coco . Coco was green and red and
One day , Lily and Midnight went for a jog in

MIN ACTIVATIONS
MIN = -7.759

again and again . J od ie would always remember the
went on with his mis chie vous ways , causing problems
tall walls and a draw bridge . He started
always acted silly and mis chie vous . One
The cat grinned mis chie v ously . " Never
Suddenly , a mis chie vous bird flew by and
at him with a mis chie vous grin . She said
. He was a mis chie vous little baby and always
answer is a fall - bridge !" He laughs . He
so that no more mis chie vous mice could come in
old lady with a mis chie vous tw inkle in her
mo at and a draw bridge ." Mom said
One day , a mis chie vous young boy decided to
<|endoftext|> Peter was a mis chie vous 3 year old who
they admired Sarah 's sand castle . <|endoftext|> Once upon a
With a mis chie vous gl int in her
the squirrel got a mis chie vous look in its eye
The little boy smiled mis chie v ously and said ,
lesson . They never d ived into the pond again .
sweet tune , a mis chie vous fox appeared . "

INTERVAL 3.089 - 6.705
CONTAINS 1.069%

day , a little girl named Grace saw a big yellow
a little brown dog named Max . Max liked to dig
so he decided to climb up the tree . He worked
. They hugged and made up . Tim my learned that
it was safer inside the case . John smiled

INTERVAL -0.527 - 3.089
CONTAINS 68.128%

, but it was too expensive for her . But ,
he was jealous of Johnny 's new basketball . His mom
ling . The seal was scared . The
. " Wow , you are good at knots
rock . It was not nice . A wise

INTERVAL -4.143 - -0.527
CONTAINS 30.589%

" Wow , a monkey !" Tom said . "
red and round and boun cy . They want to play
his hard work was ruined . The little boy
in its be ak . It put it on the windows
his head . " Let 's go away ,

INTERVAL -7.759 - -4.143
CONTAINS 0.214%

excited . She kept climbing until she was at the very
enjoyed the silence for many minutes . When it
loved to play make - bel ieve . Today she was
, the garden was prett ier than ever . It gl
. J ill was persistent and said ,
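
The four INTERVAL ranges above appear to be equal-width bins spanning MIN to MAX: each is 3.616 wide, and the CONTAINS percentages sum to 100%. A minimal sketch that reproduces the listed edges from the two extremes, assuming equal-width binning (the dump does not state the binning rule, and the name `equal_width_edges` is hypothetical):

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 evenly spaced edges from lo to hi, rounded to 3 decimals."""
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


# Values copied from the MIN and MAX lines of this dump.
MIN_ACT = -7.759
MAX_ACT = 6.705

# Assumption: four equal-width bins, matching the four INTERVAL blocks.
edges = equal_width_edges(MIN_ACT, MAX_ACT, 4)

# CONTAINS percentages as listed, from the highest interval to the lowest.
shares = [1.069, 68.128, 30.589, 0.214]
total_share = sum(shares)

print(edges)
print(round(total_share, 3))
```

Under this reading, most activations (about 98.7%) fall in the two middle bins, so the sequences shown for the outer bins are rare tail cases.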
Token had
Token ID550
Feature activation+8.00e-01

Pos logit contributions
bridge+0.053
ruck+0.050
ull+0.049
ill+0.048
icle+0.048
Neg logit contributions
 slams-0.027
 shop-0.025
 Jewel-0.025
 Elsa-0.025
 wrong-0.024
Token a
Token ID257
Feature activation+7.62e-01

Pos logit contributions
 shop+0.025
 wrong+0.024
 Elsa+0.023
 slams+0.022
 old+0.022
Neg logit contributions
bridge-0.083
ill-0.081
ived-0.081
icles-0.079
umbs-0.079
Token black
Token ID2042
Feature activation+3.59e-01

Pos logit contributions
 slams+0.049
 tells+0.044
ience+0.044
annah+0.043
 Jewel+0.043
Neg logit contributions
bridge-0.048
ruck-0.045
castle-0.045
ull-0.044
icle-0.044
Token cat
Token ID3797
Feature activation+6.15e-01

Pos logit contributions
 slams+0.019
 shop+0.018
 needs+0.018
 Elsa+0.018
 wrong+0.018
Neg logit contributions
bridge-0.030
ived-0.027
ruck-0.027
umbs-0.026
icle-0.026
Token named
Token ID3706
Feature activation+3.55e+00

Pos logit contributions
ience+0.031
 tells+0.029
 slams+0.029
elia+0.029
annah+0.029
Neg logit contributions
bridge-0.048
ull-0.044
ruck-0.044
icle-0.043
ill-0.041
Token Midnight
Token ID27960
Feature activation+6.63e+00

Pos logit contributions
 slams+0.183
 shop+0.170
 needs+0.168
 wrong+0.163
 tells+0.160
Neg logit contributions
bridge-0.288
icle-0.281
icles-0.275
ruck-0.272
umbs-0.272
Token.
Token ID13
Feature activation+8.96e-01

Pos logit contributions
 slams+0.445
athy+0.393
annah+0.384
 studies+0.377
ience+0.370
Neg logit contributions
bridge-0.431
ruck-0.408
ull-0.406
paper-0.395
ill-0.392
Token Midnight
Token ID27960
Feature activation+6.53e+00

Pos logit contributions
 slams+0.045
 wrong+0.040
 shop+0.039
 studies+0.039
 needs+0.039
Neg logit contributions
umbs-0.072
bridge-0.072
icle-0.070
icles-0.069
ived-0.066
Token was
Token ID373
Feature activation+1.15e+00

Pos logit contributions
 slams+0.458
 aisle+0.395
 shop+0.394
 Jewel+0.387
 case+0.385
Neg logit contributions
ull-0.340
bridge-0.339
paper-0.337
ruck-0.322
urg-0.308
Token very
Token ID845
Feature activation+2.46e+00

Pos logit contributions
 slams+0.060
 shop+0.057
ience+0.057
 needs+0.056
 studies+0.056
Neg logit contributions
bridge-0.090
ruck-0.084
icle-0.083
ull-0.082
umbs-0.082
Token peaceful
Token ID12309
Feature activation-1.80e-01

Pos logit contributions
 slams+0.150
 shop+0.141
 studies+0.139
ience+0.138
 Jewel+0.137
Neg logit contributions
ived-0.156
ull-0.153
ipples-0.152
ruck-0.151
bridge-0.150
Token was
Token ID373
Feature activation+1.20e+00

Pos logit contributions
 slams+0.031
 shop+0.031
 needs+0.029
 wrong+0.028
 tells+0.028
Neg logit contributions
bridge-0.098
ived-0.091
umbs-0.090
ruck-0.090
icle-0.088
Token a
Token ID257
Feature activation+8.31e-01

Pos logit contributions
 shop+0.046
 slams+0.046
 Elsa+0.044
 wrong+0.043
 Jewel+0.042
Neg logit contributions
bridge-0.116
ived-0.108
ruck-0.105
umbs-0.105
ill-0.104
Token kind
Token ID1611
Feature activation+2.82e+00

Pos logit contributions
 slams+0.050
 tells+0.044
 needs+0.044
 Jewel+0.044
 wrong+0.043
Neg logit contributions
bridge-0.062
ull-0.056
ruck-0.056
ived-0.054
castle-0.053
Token monkey
Token ID21657
Feature activation-9.19e-01

Pos logit contributions
 slams+0.156
 wrong+0.142
 needs+0.140
 studies+0.137
 tells+0.133
Neg logit contributions
bridge-0.226
ruck-0.219
ull-0.216
ill-0.210
icle-0.210
Token named
Token ID3706
Feature activation+3.65e+00

Pos logit contributions
bridge+0.089
ull+0.085
ruck+0.083
ill+0.080
umbs+0.080
Neg logit contributions
 slams-0.034
 Elsa-0.033
 Jewel-0.031
 Hazel-0.031
 Hannah-0.031
Token Coco
Token ID48222
Feature activation+6.62e+00

Pos logit contributions
 slams+0.226
 needs+0.216
 shop+0.215
 wrong+0.209
 tells+0.207
Neg logit contributions
bridge-0.268
ived-0.241
ruck-0.239
ill-0.239
icle-0.238
Token.
Token ID13
Feature activation+1.02e+00

Pos logit contributions
 slams+0.469
 Jewel+0.399
 Hannah+0.397
 Hazel+0.393
 shop+0.392
Neg logit contributions
ull-0.443
ruck-0.440
bridge-0.427
ill-0.419
paper-0.396
Token Coco
Token ID48222
Feature activation+6.56e+00

Pos logit contributions
 slams+0.032
 needs+0.028
 shop+0.026
 tells+0.026
 wrong+0.025
Neg logit contributions
bridge-0.103
ived-0.097
umbs-0.097
paper-0.096
ill-0.094
Token lived
Token ID5615
Feature activation+3.36e+00

Pos logit contributions
 slams+0.448
 Hannah+0.394
 Hazel+0.388
 therapist+0.388
 Elsa+0.381
Neg logit contributions
ull-0.415
bridge-0.412
paper-0.392
ruck-0.380
castle-0.351
Token in
Token ID287
Feature activation+1.81e+00

Pos logit contributions
 slams+0.188
 needs+0.180
 tells+0.175
 shop+0.174
 Elsa+0.172
Neg logit contributions
bridge-0.265
ived-0.242
ruck-0.236
ull-0.235
ill-0.235
Token a
Token ID257
Feature activation+8.25e-01

Pos logit contributions
 slams+0.083
 shop+0.080
 needs+0.076
 Elsa+0.076
 wrong+0.076
Neg logit contributions
bridge-0.164
ived-0.150
ruck-0.148
umbs-0.148
ill-0.147
Token,
Token ID11
Feature activation+9.43e-01

Pos logit contributions
 slams+0.008
 needs+0.007
 shop+0.007
 Jewel+0.007
 tells+0.007
Neg logit contributions
bridge-0.017
ived-0.016
ruck-0.016
umbs-0.015
ill-0.015
Token a
Token ID257
Feature activation+7.36e-01

Pos logit contributions
 shop+0.039
 slams+0.039
 wrong+0.038
 Elsa+0.038
 Jewel+0.036
Neg logit contributions
bridge-0.088
umbs-0.081
ived-0.081
ruck-0.079
icles-0.079
Token little
Token ID1310
Feature activation-4.79e-01

Pos logit contributions
 slams+0.048
 tells+0.042
 Jewel+0.040
 needs+0.040
 Elsa+0.040
Neg logit contributions
bridge-0.052
ruck-0.049
ull-0.049
ived-0.047
icle-0.046
Token girl
Token ID2576
Feature activation-3.37e-01

Pos logit contributions
bridge+0.033
ruck+0.033
ull+0.032
icle+0.032
ived+0.032
Neg logit contributions
 slams-0.030
 tells-0.028
 studies-0.027
 wrong-0.027
 needs-0.026
Token named
Token ID3706
Feature activation+3.55e+00

Pos logit contributions
bridge+0.034
ruck+0.032
ived+0.031
ull+0.031
ill+0.031
Neg logit contributions
 slams-0.011
 wrong-0.010
 Jewel-0.010
 Elsa-0.010
 shop-0.010
Token Coco
Token ID48222
Feature activation+6.59e+00

Pos logit contributions
 slams+0.155
 shop+0.152
 needs+0.142
 tells+0.139
 wrong+0.139
Neg logit contributions
bridge-0.321
ived-0.301
ill-0.299
umbs-0.293
ruck-0.293
Token was
Token ID373
Feature activation+1.10e+00

Pos logit contributions
 slams+0.404
 Jewel+0.383
 wrong+0.380
 Elsa+0.379
 shop+0.374
Neg logit contributions
bridge-0.479
ull-0.451
ill-0.446
ruck-0.443
ived-0.426
Token outside
Token ID2354
Feature activation+1.86e+00

Pos logit contributions
 slams+0.063
 needs+0.061
ience+0.061
 shop+0.060
 owns+0.060
Neg logit contributions
bridge-0.084
ived-0.077
ruck-0.075
ull-0.073
umbs-0.072
Token playing
Token ID2712
Feature activation-3.33e-01

Pos logit contributions
 slams+0.086
 shop+0.084
 needs+0.079
 wrong+0.079
 Elsa+0.078
Neg logit contributions
bridge-0.171
ived-0.154
umbs-0.154
ruck-0.152
ill-0.151
Token in
Token ID287
Feature activation+1.90e+00

Pos logit contributions
bridge+0.029
ived+0.027
ruck+0.027
umbs+0.026
ill+0.026
Neg logit contributions
 slams-0.016
 shop-0.015
 needs-0.015
 Elsa-0.015
 wrong-0.015
Token the
Token ID262
Feature activation+1.18e+00

Pos logit contributions
 slams+0.089
 needs+0.084
 shop+0.084
 wrong+0.083
 tells+0.081
Neg logit contributions
bridge-0.168
ruck-0.153
ill-0.152
ived-0.151
umbs-0.151
Token kind
Token ID1611
Feature activation+2.82e+00

Pos logit contributions
 slams+0.050
 tells+0.044
 needs+0.044
 Jewel+0.044
 wrong+0.043
Neg logit contributions
bridge-0.062
ull-0.056
ruck-0.056
ived-0.054
castle-0.053
Token monkey
Token ID21657
Feature activation-9.19e-01

Pos logit contributions
 slams+0.156
 wrong+0.142
 needs+0.140
 studies+0.137
 tells+0.133
Neg logit contributions
bridge-0.226
ruck-0.219
ull-0.216
ill-0.210
icle-0.210
Token named
Token ID3706
Feature activation+3.65e+00

Pos logit contributions
bridge+0.089
ull+0.085
ruck+0.083
ill+0.080
umbs+0.080
Neg logit contributions
 slams-0.034
 Elsa-0.033
 Jewel-0.031
 Hazel-0.031
 Hannah-0.031
Token Coco
Token ID48222
Feature activation+6.62e+00

Pos logit contributions
 slams+0.226
 needs+0.216
 shop+0.215
 wrong+0.209
 tells+0.207
Neg logit contributions
bridge-0.268
ived-0.241
ruck-0.239
ill-0.239
icle-0.238
Token.
Token ID13
Feature activation+1.02e+00

Pos logit contributions
 slams+0.469
 Jewel+0.399
 Hannah+0.397
 Hazel+0.393
 shop+0.392
Neg logit contributions
ull-0.443
ruck-0.440
bridge-0.427
ill-0.419
paper-0.396
Token Coco
Token ID48222
Feature activation+6.56e+00

Pos logit contributions
 slams+0.032
 needs+0.028
 shop+0.026
 tells+0.026
 wrong+0.025
Neg logit contributions
bridge-0.103
ived-0.097
umbs-0.097
paper-0.096
ill-0.094
Token lived
Token ID5615
Feature activation+3.36e+00

Pos logit contributions
 slams+0.448
 Hannah+0.394
 Hazel+0.388
 therapist+0.388
 Elsa+0.381
Neg logit contributions
ull-0.415
bridge-0.412
paper-0.392
ruck-0.380
castle-0.351
Token in
Token ID287
Feature activation+1.81e+00

Pos logit contributions
 slams+0.188
 needs+0.180
 tells+0.175
 shop+0.174
 Elsa+0.172
Neg logit contributions
bridge-0.265
ived-0.242
ruck-0.236
ull-0.235
ill-0.235
Token a
Token ID257
Feature activation+8.25e-01

Pos logit contributions
 slams+0.083
 shop+0.080
 needs+0.076
 Elsa+0.076
 wrong+0.076
Neg logit contributions
bridge-0.164
ived-0.150
ruck-0.148
umbs-0.148
ill-0.147
Token big
Token ID1263
Feature activation+3.39e-01

Pos logit contributions
 slams+0.046
 tells+0.043
 Jewel+0.043
 needs+0.043
 therapist+0.042
Neg logit contributions
bridge-0.063
ull-0.055
umbs-0.055
ruck-0.054
ived-0.054
Token tree
Token ID5509
Feature activation+2.92e+00

Pos logit contributions
 slams+0.018
 Elsa+0.017
 Jewel+0.017
 wrong+0.016
 tells+0.016
Neg logit contributions
bridge-0.027
ruck-0.024
ill-0.024
umbs-0.024
ived-0.024
Token black
Token ID2042
Feature activation+3.59e-01

Pos logit contributions
 slams+0.049
 tells+0.044
ience+0.044
annah+0.043
 Jewel+0.043
Neg logit contributions
bridge-0.048
ruck-0.045
castle-0.045
ull-0.044
icle-0.044
Token cat
Token ID3797
Feature activation+6.15e-01

Pos logit contributions
 slams+0.019
 shop+0.018
 needs+0.018
 Elsa+0.018
 wrong+0.018
Neg logit contributions
bridge-0.030
ived-0.027
ruck-0.027
umbs-0.026
icle-0.026
Token named
Token ID3706
Feature activation+3.55e+00

Pos logit contributions
ience+0.031
 tells+0.029
 slams+0.029
elia+0.029
annah+0.029
Neg logit contributions
bridge-0.048
ull-0.044
ruck-0.044
icle-0.043
ill-0.041
Token Midnight
Token ID27960
Feature activation+6.63e+00

Pos logit contributions
 slams+0.183
 shop+0.170
 needs+0.168
 wrong+0.163
 tells+0.160
Neg logit contributions
bridge-0.288
icle-0.281
icles-0.275
ruck-0.272
umbs-0.272
Token.
Token ID13
Feature activation+8.96e-01

Pos logit contributions
 slams+0.445
athy+0.393
annah+0.384
 studies+0.377
ience+0.370
Neg logit contributions
bridge-0.431
ruck-0.408
ull-0.406
paper-0.395
ill-0.392
Token Midnight
Token ID27960
Feature activation+6.53e+00

Pos logit contributions
 slams+0.045
 wrong+0.040
 shop+0.039
 studies+0.039
 needs+0.039
Neg logit contributions
umbs-0.072
bridge-0.072
icle-0.070
icles-0.069
ived-0.066
Token was
Token ID373
Feature activation+1.15e+00

Pos logit contributions
 slams+0.458
 aisle+0.395
 shop+0.394
 Jewel+0.387
 case+0.385
Neg logit contributions
ull-0.340
bridge-0.339
paper-0.337
ruck-0.322
urg-0.308
Token very
Token ID845
Feature activation+2.46e+00

Pos logit contributions
 slams+0.060
 shop+0.057
ience+0.057
 needs+0.056
 studies+0.056
Neg logit contributions
bridge-0.090
ruck-0.084
icle-0.083
ull-0.082
umbs-0.082
Token peaceful
Token ID12309
Feature activation-1.80e-01

Pos logit contributions
 slams+0.150
 shop+0.141
 studies+0.139
ience+0.138
 Jewel+0.137
Neg logit contributions
ived-0.156
ull-0.153
ipples-0.152
ruck-0.151
bridge-0.150
Token and
Token ID290
Feature activation+6.98e-01

Pos logit contributions
bridge+0.014
ruck+0.013
ived+0.013
ill+0.013
umbs+0.013
Neg logit contributions
 slams-0.010
 needs-0.009
 shop-0.009
 tells-0.009
 Elsa-0.009
Token liked
Token ID8288
Feature activation+2.30e+00

Pos logit contributions
 slams+0.037
 shop+0.034
 Jewel+0.032
 wrong+0.031
 studies+0.031
Neg logit contributions
bridge-0.057
ruck-0.052
icle-0.051
ull-0.051
icles-0.050
Token a
Token ID257
Feature activation+1.06e+00

Pos logit contributions
 shop+0.051
 slams+0.050
 Elsa+0.048
 wrong+0.048
 Jewel+0.046
Neg logit contributions
bridge-0.136
ived-0.127
umbs-0.124
ill-0.123
ruck-0.123
Token very
Token ID845
Feature activation+2.77e+00

Pos logit contributions
 slams+0.065
 tells+0.058
 wrong+0.057
 needs+0.057
 Jewel+0.056
Neg logit contributions
bridge-0.077
ruck-0.072
ull-0.072
ived-0.069
ill-0.068
Token furry
Token ID40857
Feature activation+2.33e+00

Pos logit contributions
 slams+0.167
 Jewel+0.160
 tells+0.155
 Elsa+0.155
Charl+0.151
Neg logit contributions
bridge-0.196
ived-0.187
ull-0.185
ipples-0.175
castle-0.174
Token cat
Token ID3797
Feature activation+8.32e-01

Pos logit contributions
 slams+0.147
 tells+0.135
 studies+0.132
 wrong+0.132
 needs+0.130
Neg logit contributions
bridge-0.163
ruck-0.159
ull-0.159
icle-0.155
ived-0.148
Token called
Token ID1444
Feature activation+4.67e+00

Pos logit contributions
 slams+0.046
 studies+0.043
 Elsa+0.043
ience+0.043
 Jewel+0.042
Neg logit contributions
ull-0.062
bridge-0.061
ruck-0.059
ill-0.057
icle-0.057
Token Midnight
Token ID27960
Feature activation+6.53e+00

Pos logit contributions
 slams+0.284
 needs+0.277
 studies+0.272
 shop+0.268
 wrong+0.259
Neg logit contributions
icle-0.323
icles-0.316
bridge-0.315
ill-0.305
umbs-0.303
Token.
Token ID13
Feature activation+1.37e+00

Pos logit contributions
 slams+0.441
 studies+0.369
 owns+0.368
athy+0.365
 needs+0.362
Neg logit contributions
bridge-0.430
ruck-0.426
ull-0.410
ill-0.401
icle-0.396
Token Midnight
Token ID27960
Feature activation+6.46e+00

Pos logit contributions
 slams+0.069
 tells+0.060
 wrong+0.058
elia+0.058
 studies+0.057
Neg logit contributions
umbs-0.110
bridge-0.105
icle-0.102
ived-0.098
icles-0.098
Token had
Token ID550
Feature activation+1.00e+00

Pos logit contributions
 slams+0.447
annah+0.391
 aisle+0.389
 wrong+0.387
 shop+0.381
Neg logit contributions
ull-0.370
bridge-0.363
paper-0.357
ruck-0.340
ill-0.329
Token a
Token ID257
Feature activation+9.61e-01

Pos logit contributions
 slams+0.044
 shop+0.043
 wrong+0.041
 Elsa+0.041
 needs+0.041
Neg logit contributions
bridge-0.092
ived-0.085
ill-0.084
umbs-0.084
ruck-0.084
Token friend
Token ID1545
Feature activation-6.62e-01

Pos logit contributions
 slams+0.064
 owns+0.059
ience+0.058
 tells+0.057
annah+0.057
Neg logit contributions
bridge-0.055
icle-0.053
 crashing-0.052
ruck-0.052
paper-0.051
Token loves
Token ID10408
Feature activation+1.85e+00

Pos logit contributions
 slams+0.074
 wrong+0.073
 Jewel+0.072
 Elsa+0.068
 shop+0.067
Neg logit contributions
bridge-0.095
ill-0.091
ruck-0.090
ull-0.087
ived-0.083
Token her
Token ID607
Feature activation+6.65e-01

Pos logit contributions
 slams+0.106
 shop+0.102
 needs+0.099
 wrong+0.099
 Elsa+0.098
Neg logit contributions
bridge-0.145
ived-0.131
ruck-0.130
umbs-0.128
icle-0.127
Token par
Token ID1582
Feature activation-7.76e-02

Pos logit contributions
 slams+0.038
 needs+0.034
 owns+0.034
ience+0.034
 wrong+0.033
Neg logit contributions
bridge-0.051
icle-0.047
ruck-0.045
icles-0.045
ill-0.045
Tokenrot
Token ID10599
Feature activation-4.15e-01

Pos logit contributions
bridge+0.005
ived+0.004
umbs+0.004
ruck+0.004
ill+0.004
Neg logit contributions
 slams-0.006
 needs-0.005
 shop-0.005
 Elsa-0.005
 tells-0.005
Token,
Token ID11
Feature activation+1.58e+00

Pos logit contributions
bridge+0.029
ull+0.026
ill+0.026
icle+0.026
umbs+0.025
Neg logit contributions
 slams-0.027
 studies-0.023
annah-0.023
 tells-0.023
athy-0.023
Token Coco
Token ID48222
Feature activation+6.53e+00

Pos logit contributions
 slams+0.080
 shop+0.074
 needs+0.074
 wrong+0.071
 Jewel+0.071
Neg logit contributions
bridge-0.135
ill-0.123
ruck-0.122
icle-0.121
ived-0.120
Token.
Token ID13
Feature activation+1.22e+00

Pos logit contributions
 slams+0.517
 studies+0.422
 wrong+0.417
becca+0.411
 courtroom+0.411
Neg logit contributions
bridge-0.353
ill-0.351
ruck-0.338
ull-0.337
paper-0.330
Token Coco
Token ID48222
Feature activation+6.46e+00

Pos logit contributions
 slams+0.060
 wrong+0.056
 needs+0.055
 shop+0.054
 tells+0.053
Neg logit contributions
bridge-0.105
umbs-0.097
ived-0.097
ill-0.094
icle-0.093
Token is
Token ID318
Feature activation+2.42e+00

Pos logit contributions
 slams+0.402
annah+0.349
 hospital+0.343
 wrong+0.342
 spa+0.333
Neg logit contributions
paper-0.422
bridge-0.411
ull-0.396
ill-0.380
castle-0.364
Token a
Token ID257
Feature activation+9.52e-01

Pos logit contributions
 slams+0.148
ience+0.138
 needs+0.137
 shop+0.136
 studies+0.136
Neg logit contributions
bridge-0.180
ull-0.162
ill-0.160
ruck-0.159
ived-0.157
Token gentle
Token ID10296
Feature activation+1.28e+00

Pos logit contributions
 slams+0.068
 tells+0.062
ience+0.056
 studies+0.056
 says+0.056
Neg logit contributions
bridge-0.061
ull-0.058
castle-0.055
 crashing-0.053
paper-0.052
Token a
Token ID257
Feature activation+6.86e-01

Pos logit contributions
bridge+0.069
ived+0.063
icle+0.063
ill+0.063
ull+0.063
Neg logit contributions
 slams-0.046
 needs-0.043
 shop-0.043
 Elsa-0.042
 studies-0.041
Token small
Token ID1402
Feature activation+1.51e+00

Pos logit contributions
 slams+0.040
 needs+0.035
 studies+0.034
annah+0.034
ience+0.034
Neg logit contributions
bridge-0.051
ull-0.050
ruck-0.048
 crashing-0.046
paper-0.046
Token cater
Token ID20825
Feature activation+3.97e-01

Pos logit contributions
 slams+0.112
 Elsa+0.104
 studies+0.103
 wrong+0.103
Charl+0.103
Neg logit contributions
ull-0.084
bridge-0.081
ruck-0.079
ill-0.076
icle-0.075
Tokenpillar
Token ID41643
Feature activation+3.82e-01

Pos logit contributions
 slams+0.026
 tells+0.025
 studies+0.024
 needs+0.024
 shop+0.024
Neg logit contributions
bridge-0.029
umbs-0.026
ull-0.025
ill-0.025
ived-0.025
Token named
Token ID3706
Feature activation+3.47e+00

Pos logit contributions
 slams+0.023
annah+0.023
 Hazel+0.022
 Hannah+0.022
 studies+0.022
Neg logit contributions
bridge-0.027
ull-0.027
ived-0.024
ruck-0.024
ill-0.024
Token Coco
Token ID48222
Feature activation+6.51e+00

Pos logit contributions
 slams+0.241
 needs+0.231
 shop+0.231
 studies+0.226
ience+0.224
Neg logit contributions
bridge-0.209
icle-0.204
ill-0.203
icles-0.198
ull-0.194
Token.
Token ID13
Feature activation+1.10e+00

Pos logit contributions
 slams+0.518
becca+0.447
annah+0.438
 studies+0.436
athy+0.435
Neg logit contributions
ill-0.371
ull-0.364
ived-0.344
bridge-0.340
paper-0.339
Token Coco
Token ID48222
Feature activation+6.46e+00

Pos logit contributions
 slams+0.050
 needs+0.048
ience+0.047
issa+0.047
 wrong+0.046
Neg logit contributions
ived-0.086
umbs-0.085
icle-0.085
bridge-0.083
paper-0.083
Token was
Token ID373
Feature activation+1.13e+00

Pos logit contributions
 slams+0.468
 clinic+0.414
 hospital+0.408
 spa+0.395
 therapist+0.394
Neg logit contributions
ull-0.370
paper-0.351
ill-0.340
ived-0.332
bridge-0.317
Token sad
Token ID6507
Feature activation+1.92e+00

Pos logit contributions
ience+0.069
 needs+0.064
 slams+0.064
 studies+0.063
 therapist+0.062
Neg logit contributions
bridge-0.077
ull-0.076
ived-0.075
 crashing-0.072
icle-0.071
Token because
Token ID780
Feature activation+1.37e+00

Pos logit contributions
 slams+0.119
 shop+0.115
 Elsa+0.114
 Jewel+0.114
athy+0.112
Neg logit contributions
ruck-0.129
ived-0.126
ill-0.125
bridge-0.124
icle-0.123
Token has
Token ID468
Feature activation+1.85e+00

Pos logit contributions
 slams+0.026
 Jewel+0.025
athy+0.024
 shop+0.024
 Elsa+0.024
Neg logit contributions
bridge-0.035
ruck-0.032
ived-0.032
ill-0.032
paper-0.031
Token a
Token ID257
Feature activation+3.00e-01

Pos logit contributions
 slams+0.093
 shop+0.089
 needs+0.086
 Elsa+0.085
 Jewel+0.085
Neg logit contributions
bridge-0.159
ived-0.144
ruck-0.143
umbs-0.142
icle-0.140
Token small
Token ID1402
Feature activation+1.14e+00

Pos logit contributions
 slams+0.018
 Jewel+0.017
 owns+0.016
 Elsa+0.016
elia+0.016
Neg logit contributions
bridge-0.021
ruck-0.019
ull-0.019
ived-0.019
 crashing-0.018
Token dog
Token ID3290
Feature activation+1.43e+00

Pos logit contributions
 slams+0.063
 Elsa+0.059
 Jewel+0.058
 owns+0.058
ience+0.056
Neg logit contributions
bridge-0.085
ruck-0.081
ived-0.080
icle-0.080
ull-0.079
Token named
Token ID3706
Feature activation+3.06e+00

Pos logit contributions
 Hazel+0.070
 therapist+0.069
annah+0.069
becca+0.069
 Jewel+0.069
Neg logit contributions
bridge-0.112
ruck-0.102
ived-0.101
ull-0.098
umbs-0.097
Token Coco
Token ID48222
Feature activation+6.48e+00

Pos logit contributions
 slams+0.175
 shop+0.162
 studies+0.159
 needs+0.151
 wardrobe+0.148
Neg logit contributions
bridge-0.216
icle-0.214
umbs-0.212
icles-0.212
ived-0.207
Token.
Token ID13
Feature activation+6.91e-02

Pos logit contributions
 slams+0.447
athy+0.391
becca+0.386
 Jewel+0.380
annah+0.379
Neg logit contributions
bridge-0.418
ill-0.396
ived-0.390
ruck-0.382
umbs-0.380
Token One
Token ID1881
Feature activation+2.32e+00

Pos logit contributions
 slams+0.004
 wrong+0.004
ience+0.004
 needs+0.004
 shop+0.003
Neg logit contributions
bridge-0.005
umbs-0.005
ill-0.005
ived-0.005
icles-0.005
Token day
Token ID1110
Feature activation-8.48e-02

Pos logit contributions
 slams+0.170
 Elsa+0.160
 Jewel+0.158
 studies+0.156
ience+0.156
Neg logit contributions
ull-0.143
bridge-0.137
ruck-0.136
icle-0.136
umbs-0.129
Token,
Token ID11
Feature activation+1.24e+00

Pos logit contributions
ived+0.007
bridge+0.007
umbs+0.007
ruck+0.007
icle+0.006
Neg logit contributions
 slams-0.004
 shop-0.004
 needs-0.004
 studies-0.004
 Jewel-0.004
Token they
Token ID484
Feature activation+6.76e-01

Pos logit contributions
 slams+0.064
 needs+0.059
 shop+0.058
 studies+0.056
 wrong+0.056
Neg logit contributions
bridge-0.105
ived-0.096
ruck-0.094
ill-0.093
umbs-0.093
Token jungle
Token ID20712
Feature activation+2.83e-01

Pos logit contributions
 slams+0.056
 Elsa+0.052
 tells+0.051
 needs+0.051
 Jewel+0.050
Neg logit contributions
bridge-0.115
umbs-0.104
ruck-0.103
ived-0.103
ill-0.102
Token.
Token ID13
Feature activation+9.79e-01

Pos logit contributions
 Elsa+0.016
 Jewel+0.016
 Hazel+0.016
 Hannah+0.016
 slams+0.016
Neg logit contributions
bridge-0.022
ill-0.020
ull-0.020
ipples-0.020
ived-0.020
Token One
Token ID1881
Feature activation+2.58e+00

Pos logit contributions
 slams+0.058
ience+0.056
issa+0.055
 needs+0.054
otte+0.052
Neg logit contributions
bridge-0.068
umbs-0.067
ived-0.063
paper-0.061
ill-0.060
Token day
Token ID1110
Feature activation+3.13e-01

Pos logit contributions
 shop+0.109
 slams+0.109
 needs+0.104
 wrong+0.102
 tells+0.100
Neg logit contributions
bridge-0.244
ived-0.223
ruck-0.218
umbs-0.218
icles-0.216
Token,
Token ID11
Feature activation+1.21e+00

Pos logit contributions
 slams+0.015
 shop+0.014
 needs+0.013
 tells+0.013
 studies+0.013
Neg logit contributions
bridge-0.027
ived-0.026
umbs-0.025
ruck-0.024
icle-0.024
Token Coco
Token ID48222
Feature activation+6.47e+00

Pos logit contributions
 slams+0.051
 needs+0.044
 studies+0.043
 tells+0.042
athy+0.042
Neg logit contributions
bridge-0.114
ived-0.108
paper-0.103
umbs-0.102
icle-0.101
Token noticed
Token ID6810
Feature activation-1.96e+00

Pos logit contributions
 slams+0.405
 Jewel+0.390
 shop+0.378
 Elsa+0.376
 wrong+0.375
Neg logit contributions
bridge-0.464
ull-0.419
ived-0.416
ill-0.412
paper-0.411
Token a
Token ID257
Feature activation+6.47e-01

Pos logit contributions
bridge+0.138
ruck+0.135
icle+0.133
ull+0.130
icles+0.128
Neg logit contributions
 slams-0.122
 needs-0.119
 studies-0.118
annah-0.114
ience-0.110
Token bird
Token ID6512
Feature activation-7.61e-01

Pos logit contributions
 slams+0.037
 Elsa+0.034
 tells+0.034
 needs+0.033
athy+0.032
Neg logit contributions
bridge-0.049
ruck-0.045
ull-0.045
ived-0.044
paper-0.044
Token trapped
Token ID13640
Feature activation+1.66e+00

Pos logit contributions
bridge+0.061
ived+0.055
ruck+0.055
umbs+0.054
ill+0.053
Neg logit contributions
 slams-0.042
 shop-0.041
 Elsa-0.041
 Jewel-0.040
 needs-0.039
Token in
Token ID287
Feature activation+1.74e+00

Pos logit contributions
 slams+0.091
 shop+0.088
 tells+0.087
 Elsa+0.087
 needs+0.086
Neg logit contributions
bridge-0.133
ruck-0.121
ived-0.120
icle-0.118
ill-0.117
Token furry
Token ID40857
Feature activation+2.33e+00

Pos logit contributions
 slams+0.167
 Jewel+0.160
 tells+0.155
 Elsa+0.155
Charl+0.151
Neg logit contributions
bridge-0.196
ived-0.187
ull-0.185
ipples-0.175
castle-0.174
Token cat
Token ID3797
Feature activation+8.32e-01

Pos logit contributions
 slams+0.147
 tells+0.135
 studies+0.132
 wrong+0.132
 needs+0.130
Neg logit contributions
bridge-0.163
ruck-0.159
ull-0.159
icle-0.155
ived-0.148
Token called
Token ID1444
Feature activation+4.67e+00

Pos logit contributions
 slams+0.046
 studies+0.043
 Elsa+0.043
ience+0.043
 Jewel+0.042
Neg logit contributions
ull-0.062
bridge-0.061
ruck-0.059
ill-0.057
icle-0.057
Token Midnight
Token ID27960
Feature activation+6.53e+00

Pos logit contributions
 slams+0.284
 needs+0.277
 studies+0.272
 shop+0.268
 wrong+0.259
Neg logit contributions
icle-0.323
icles-0.316
bridge-0.315
ill-0.305
umbs-0.303
Token.
Token ID13
Feature activation+1.37e+00

Pos logit contributions
 slams+0.441
 studies+0.369
 owns+0.368
athy+0.365
 needs+0.362
Neg logit contributions
bridge-0.430
ruck-0.426
ull-0.410
ill-0.401
icle-0.396
Token Midnight
Token ID27960
Feature activation+6.46e+00

Pos logit contributions
 slams+0.069
 tells+0.060
 wrong+0.058
elia+0.058
 studies+0.057
Neg logit contributions
umbs-0.110
bridge-0.105
icle-0.102
ived-0.098
icles-0.098
Token had
Token ID550
Feature activation+1.00e+00

Pos logit contributions
 slams+0.447
annah+0.391
 aisle+0.389
 wrong+0.387
 shop+0.381
Neg logit contributions
ull-0.370
bridge-0.363
paper-0.357
ruck-0.340
ill-0.329
Token a
Token ID257
Feature activation+9.61e-01

Pos logit contributions
 slams+0.044
 shop+0.043
 wrong+0.041
 Elsa+0.041
 needs+0.041
Neg logit contributions
bridge-0.092
ived-0.085
ill-0.084
umbs-0.084
ruck-0.084
Token friend
Token ID1545
Feature activation-6.62e-01

Pos logit contributions
 slams+0.064
 owns+0.059
ience+0.058
 tells+0.057
annah+0.057
Neg logit contributions
bridge-0.055
icle-0.053
 crashing-0.052
ruck-0.052
paper-0.051
Token called
Token ID1444
Feature activation+4.53e+00

Pos logit contributions
bridge+0.058
ill+0.056
ull+0.056
ruck+0.055
ived+0.055
Neg logit contributions
 slams-0.026
 studies-0.024
 Jewel-0.023
 shop-0.023
ience-0.022
Token Lilly
Token ID35134
Feature activation+1.31e+00

Pos logit contributions
 slams+0.232
 studies+0.212
 needs+0.204
 tells+0.198
ience+0.195
Neg logit contributions
icle-0.363
icles-0.350
umbs-0.349
bridge-0.348
ived-0.344
Token had
Token ID550
Feature activation+9.26e-01

Pos logit contributions
 slams+0.087
 shop+0.079
 Jewel+0.077
 tells+0.074
 Pixie+0.074
Neg logit contributions
bridge-0.105
ull-0.096
ruck-0.095
paper-0.087
ived-0.086
Token a
Token ID257
Feature activation+8.83e-01

Pos logit contributions
 slams+0.040
 shop+0.039
 wrong+0.038
 Elsa+0.038
 needs+0.037
Neg logit contributions
bridge-0.086
ived-0.079
ruck-0.078
umbs-0.078
ill-0.077
Token black
Token ID2042
Feature activation+3.49e-01

Pos logit contributions
 slams+0.055
ience+0.053
annah+0.052
 owns+0.052
 tells+0.050
Neg logit contributions
bridge-0.052
paper-0.048
castle-0.047
ruck-0.047
icle-0.046
Token cat
Token ID3797
Feature activation+7.75e-01

Pos logit contributions
 slams+0.019
 tells+0.017
 needs+0.017
 Elsa+0.017
 wrong+0.017
Neg logit contributions
bridge-0.028
icle-0.026
ruck-0.026
ived-0.026
umbs-0.025
Token named
Token ID3706
Feature activation+3.66e+00

Pos logit contributions
ience+0.041
 studies+0.039
 tells+0.039
annah+0.038
 spa+0.038
Neg logit contributions
bridge-0.055
ull-0.051
ruck-0.050
icle-0.049
ill-0.049
Token Midnight
Token ID27960
Feature activation+6.46e+00

Pos logit contributions
 slams+0.198
 needs+0.185
 shop+0.181
 wrong+0.176
 studies+0.175
Neg logit contributions
icle-0.280
bridge-0.280
icles-0.274
umbs-0.274
ill-0.268
Token.
Token ID13
Feature activation+1.11e+00

Pos logit contributions
 slams+0.435
athy+0.382
annah+0.380
 studies+0.373
 owns+0.370
Neg logit contributions
bridge-0.402
ruck-0.386
ill-0.377
icle-0.374
paper-0.372
Token One
Token ID1881
Feature activation+2.67e+00

Pos logit contributions
 slams+0.063
 studies+0.055
ience+0.055
 owns+0.054
 wrong+0.054
Neg logit contributions
umbs-0.084
bridge-0.079
icle-0.076
icles-0.076
ill-0.074
Token day
Token ID1110
Feature activation+2.48e-01

Pos logit contributions
 slams+0.128
 shop+0.122
 needs+0.118
 Elsa+0.117
 wrong+0.117
Neg logit contributions
bridge-0.237
ived-0.216
ruck-0.214
umbs-0.214
ill-0.212
Token,
Token ID11
Feature activation+1.32e+00

Pos logit contributions
 slams+0.011
 shop+0.010
 needs+0.010
 tells+0.010
 Jewel+0.010
Neg logit contributions
bridge-0.022
ived-0.021
ruck-0.020
umbs-0.020
icle-0.020
Token Lily
Token ID20037
Feature activation+1.08e+00

Pos logit contributions
 slams+0.069
 studies+0.062
 needs+0.059
 tells+0.059
 owns+0.059
Neg logit contributions
bridge-0.108
icle-0.099
ived-0.099
ruck-0.098
umbs-0.098
Token par
Token ID1582
Feature activation-7.76e-02

Pos logit contributions
 slams+0.038
 needs+0.034
 owns+0.034
ience+0.034
 wrong+0.033
Neg logit contributions
bridge-0.051
icle-0.047
ruck-0.045
icles-0.045
ill-0.045
Tokenrot
Token ID10599
Feature activation-4.15e-01

Pos logit contributions
bridge+0.005
ived+0.004
umbs+0.004
ruck+0.004
ill+0.004
Neg logit contributions
 slams-0.006
 needs-0.005
 shop-0.005
 Elsa-0.005
 tells-0.005
Token,
Token ID11
Feature activation+1.58e+00

Pos logit contributions
bridge+0.029
ull+0.026
ill+0.026
icle+0.026
umbs+0.025
Neg logit contributions
 slams-0.027
 studies-0.023
annah-0.023
 tells-0.023
athy-0.023
Token Coco
Token ID48222
Feature activation+6.53e+00

Pos logit contributions
 slams+0.080
 shop+0.074
 needs+0.074
 wrong+0.071
 Jewel+0.071
Neg logit contributions
bridge-0.135
ill-0.123
ruck-0.122
icle-0.121
ived-0.120
Token.
Token ID13
Feature activation+1.22e+00

Pos logit contributions
 slams+0.517
 studies+0.422
 wrong+0.417
becca+0.411
 courtroom+0.411
Neg logit contributions
bridge-0.353
ill-0.351
ruck-0.338
ull-0.337
paper-0.330
Token Coco
Token ID48222
Feature activation+6.46e+00

Pos logit contributions
 slams+0.060
 wrong+0.056
 needs+0.055
 shop+0.054
 tells+0.053
Neg logit contributions
bridge-0.105
umbs-0.097
ived-0.097
ill-0.094
icle-0.093
Token is
Token ID318
Feature activation+2.42e+00

Pos logit contributions
 slams+0.402
annah+0.349
 hospital+0.343
 wrong+0.342
 spa+0.333
Neg logit contributions
paper-0.422
bridge-0.411
ull-0.396
ill-0.380
castle-0.364
Token a
Token ID257
Feature activation+9.52e-01

Pos logit contributions
 slams+0.148
ience+0.138
 needs+0.137
 shop+0.136
 studies+0.136
Neg logit contributions
bridge-0.180
ull-0.162
ill-0.160
ruck-0.159
ived-0.157
Token gentle
Token ID10296
Feature activation+1.28e+00

Pos logit contributions
 slams+0.068
 tells+0.062
ience+0.056
 studies+0.056
 says+0.056
Neg logit contributions
bridge-0.061
ull-0.058
castle-0.055
 crashing-0.053
paper-0.052
Token bird
Token ID6512
Feature activation-1.01e+00

Pos logit contributions
 slams+0.083
 tells+0.073
 wrong+0.071
 needs+0.069
 studies+0.069
Neg logit contributions
bridge-0.095
ruck-0.090
icle-0.087
ull-0.086
ill-0.083
Token who
Token ID508
Feature activation+1.22e+00

Pos logit contributions
bridge+0.071
ull+0.066
ring+0.064
ruck+0.064
ill+0.063
Neg logit contributions
 slams-0.062
 tells-0.056
 studies-0.054
 Hannah-0.054
 Hazel-0.054
Token cater
Token ID20825
Feature activation+3.97e-01

Pos logit contributions
 slams+0.112
 Elsa+0.104
 studies+0.103
 wrong+0.103
Charl+0.103
Neg logit contributions
ull-0.084
bridge-0.081
ruck-0.079
ill-0.076
icle-0.075
Tokenpillar
Token ID41643
Feature activation+3.82e-01

Pos logit contributions
 slams+0.026
 tells+0.025
 studies+0.024
 needs+0.024
 shop+0.024
Neg logit contributions
bridge-0.029
umbs-0.026
ull-0.025
ill-0.025
ived-0.025
Token named
Token ID3706
Feature activation+3.47e+00

Pos logit contributions
 slams+0.023
annah+0.023
 Hazel+0.022
 Hannah+0.022
 studies+0.022
Neg logit contributions
bridge-0.027
ull-0.027
ived-0.024
ruck-0.024
ill-0.024
Token Coco
Token ID48222
Feature activation+6.51e+00

Pos logit contributions
 slams+0.241
 needs+0.231
 shop+0.231
 studies+0.226
ience+0.224
Neg logit contributions
bridge-0.209
icle-0.204
ill-0.203
icles-0.198
ull-0.194
Token.
Token ID13
Feature activation+1.10e+00

Pos logit contributions
 slams+0.518
becca+0.447
annah+0.438
 studies+0.436
athy+0.435
Neg logit contributions
ill-0.371
ull-0.364
ived-0.344
bridge-0.340
paper-0.339
Token Coco
Token ID48222
Feature activation+6.46e+00

Pos logit contributions
 slams+0.050
 needs+0.048
ience+0.047
issa+0.047
 wrong+0.046
Neg logit contributions
ived-0.086
umbs-0.085
icle-0.085
bridge-0.083
paper-0.083
Token was
Token ID373
Feature activation+1.13e+00

Pos logit contributions
 slams+0.468
 clinic+0.414
 hospital+0.408
 spa+0.395
 therapist+0.394
Neg logit contributions
ull-0.370
paper-0.351
ill-0.340
ived-0.332
bridge-0.317
Token sad
Token ID6507
Feature activation+1.92e+00

Pos logit contributions
ience+0.069
 needs+0.064
 slams+0.064
 studies+0.063
 therapist+0.062
Neg logit contributions
bridge-0.077
ull-0.076
ived-0.075
 crashing-0.072
icle-0.071
Token because
Token ID780
Feature activation+1.37e+00

Pos logit contributions
 slams+0.119
 shop+0.115
 Elsa+0.114
 Jewel+0.114
athy+0.112
Neg logit contributions
ruck-0.129
ived-0.126
ill-0.125
bridge-0.124
icle-0.123
Token she
Token ID673
Feature activation+1.07e+00

Pos logit contributions
 slams+0.102
 needs+0.097
athy+0.096
 studies+0.096
ience+0.096
Neg logit contributions
bridge-0.078
ull-0.071
ill-0.069
icle-0.068
ived-0.067
Token wanted
Token ID2227
Feature activation+8.37e-01

Pos logit contributions
 slams+0.072
 shop+0.069
 spa+0.067
 clinic+0.067
 aisle+0.066
Neg logit contributions
bridge-0.061
paper-0.060
ived-0.058
ull-0.058
ruck-0.056
Token hug
Token ID16225
Feature activation+1.22e+00

Pos logit contributions
bridge+0.009
ill+0.008
icle+0.008
ruck+0.008
ived+0.007
Neg logit contributions
 slams-0.006
 wrong-0.006
 shop-0.006
 needs-0.006
 Jewel-0.006
Token her
Token ID607
Feature activation+4.00e-01

Pos logit contributions
 slams+0.064
 shop+0.061
 needs+0.060
 wrong+0.059
 tells+0.058
Neg logit contributions
bridge-0.102
ived-0.093
ruck-0.092
umbs-0.092
icle-0.091
Token par
Token ID1582
Feature activation-3.07e-02

Pos logit contributions
 slams+0.025
 owns+0.023
athy+0.023
 wrong+0.023
ience+0.023
Neg logit contributions
bridge-0.025
ruck-0.025
icle-0.024
icles-0.024
castle-0.022
Tokenrot
Token ID10599
Feature activation-3.70e-01

Pos logit contributions
bridge+0.002
ived+0.002
ruck+0.002
umbs+0.002
ill+0.002
Neg logit contributions
 slams-0.002
 shop-0.002
 wrong-0.002
 needs-0.002
 Elsa-0.002
Token,
Token ID11
Feature activation+1.34e+00

Pos logit contributions
bridge+0.029
umbs+0.026
ill+0.026
icle+0.026
ived+0.026
Neg logit contributions
 slams-0.021
 needs-0.019
 wrong-0.018
 owns-0.018
athy-0.018
Token Coco
Token ID48222
Feature activation+6.45e+00

Pos logit contributions
 slams+0.073
 shop+0.070
 needs+0.068
 wrong+0.067
 Jewel+0.067
Neg logit contributions
bridge-0.109
ruck-0.098
ived-0.098
ill-0.098
umbs-0.097
Token.
Token ID13
Feature activation+1.12e+00

Pos logit contributions
 slams+0.499
 studies+0.412
becca+0.400
athy+0.397
 courtroom+0.395
Neg logit contributions
bridge-0.383
ruck-0.356
ill-0.349
paper-0.330
umbs-0.327
Token Coco
Token ID48222
Feature activation+6.39e+00

Pos logit contributions
 slams+0.045
 needs+0.041
 wrong+0.040
 tells+0.038
 shop+0.038
Neg logit contributions
bridge-0.105
umbs-0.098
ived-0.098
ill-0.098
paper-0.094
Token was
Token ID373
Feature activation+9.64e-01

Pos logit contributions
 slams+0.451
 wrong+0.403
annah+0.399
athy+0.394
 Jewel+0.384
Neg logit contributions
bridge-0.378
paper-0.359
ull-0.356
castle-0.332
ill-0.325
Token green
Token ID4077
Feature activation+3.93e-01

Pos logit contributions
 slams+0.053
 needs+0.051
ience+0.051
 shop+0.051
 studies+0.050
Neg logit contributions
bridge-0.076
ull-0.067
ruck-0.066
ill-0.065
ived-0.065
Token and
Token ID290
Feature activation+3.82e-01

Pos logit contributions
 slams+0.025
 tells+0.025
 owns+0.024
athy+0.024
ience+0.024
Neg logit contributions
bridge-0.026
ill-0.024
icle-0.024
ruck-0.023
icles-0.023
Token was
Token ID373
Feature activation+1.23e+00

Pos logit contributions
 slams+0.034
 shop+0.032
 needs+0.030
 wrong+0.030
 Elsa+0.030
Neg logit contributions
bridge-0.091
ived-0.084
ruck-0.083
umbs-0.083
ill-0.083
Token a
Token ID257
Feature activation+8.97e-01

Pos logit contributions
 shop+0.047
 slams+0.047
 Elsa+0.045
 wrong+0.044
 Jewel+0.043
Neg logit contributions
bridge-0.119
ived-0.110
umbs-0.108
ruck-0.107
ill-0.107
Token tiny
Token ID7009
Feature activation+1.40e-01

Pos logit contributions
 slams+0.060
 tells+0.053
 belongs+0.052
 studies+0.052
 needs+0.051
Neg logit contributions
bridge-0.059
ull-0.055
ruck-0.054
 crashing-0.053
castle-0.052
Token mouse
Token ID10211
Feature activation+1.20e+00

Pos logit contributions
 slams+0.009
 tells+0.009
 wrong+0.009
 studies+0.008
 belongs+0.008
Neg logit contributions
bridge-0.009
ull-0.008
ruck-0.008
icle-0.008
 lava-0.008
Token called
Token ID1444
Feature activation+4.35e+00

Pos logit contributions
 slams+0.063
annah+0.058
 Elsa+0.058
 studies+0.057
 Jewel+0.056
Neg logit contributions
ull-0.098
bridge-0.094
ruck-0.090
ill-0.087
icle-0.085
Token Bruno
Token ID31045
Feature activation+6.45e+00

Pos logit contributions
 slams+0.279
 needs+0.263
 shop+0.260
 studies+0.255
 tells+0.253
Neg logit contributions
bridge-0.294
icle-0.284
icles-0.277
ill-0.274
umbs-0.270
Token.
Token ID13
Feature activation+1.00e+00

Pos logit contributions
 slams+0.395
 shop+0.336
 tells+0.334
 studies+0.330
 Elsa+0.328
Neg logit contributions
bridge-0.472
ruck-0.456
ull-0.447
ill-0.441
icle-0.434
Token He
Token ID679
Feature activation+2.35e-01

Pos logit contributions
 slams+0.067
 wrong+0.061
elia+0.061
 tells+0.060
otte+0.057
Neg logit contributions
umbs-0.058
icle-0.057
 crashing-0.056
bridge-0.055
icles-0.055
Token liked
Token ID8288
Feature activation+2.18e+00

Pos logit contributions
 slams+0.014
 shop+0.012
 wrong+0.012
 Elsa+0.012
 tells+0.012
Neg logit contributions
bridge-0.018
ull-0.017
ruck-0.016
paper-0.016
icle-0.015
Token to
Token ID284
Feature activation-1.08e-01

Pos logit contributions
 shop+0.099
 slams+0.093
 tells+0.093
 wrong+0.093
 old+0.091
Neg logit contributions
bridge-0.207
ill-0.188
umbs-0.186
ived-0.183
ipples-0.179
Token hide
Token ID7808
Feature activation+3.11e+00

Pos logit contributions
bridge+0.009
ived+0.008
ruck+0.008
umbs+0.008
ill+0.008
Neg logit contributions
 slams-0.006
 shop-0.006
 needs-0.005
 Elsa-0.005
 wrong-0.005
Token and
Token ID290
Feature activation+5.29e-01

Pos logit contributions
bridge+0.023
ruck+0.021
ill+0.021
ived+0.021
ull+0.020
Neg logit contributions
 slams-0.018
 Jewel-0.018
 wrong-0.017
 Elsa-0.017
 shop-0.017
Token Tom
Token ID4186
Feature activation+2.12e-01

Pos logit contributions
 slams+0.029
 shop+0.028
 needs+0.027
 wrong+0.027
 Jewel+0.026
Neg logit contributions
bridge-0.042
ived-0.039
ruck-0.038
ill-0.038
ull-0.038
Token are
Token ID389
Feature activation+6.98e-01

Pos logit contributions
 slams+0.014
 shop+0.014
 Elsa+0.013
 needs+0.013
 wrong+0.013
Neg logit contributions
bridge-0.015
ived-0.013
ruck-0.013
ill-0.013
umbs-0.013
Token at
Token ID379
Feature activation-1.13e+00

Pos logit contributions
 slams+0.039
 Jewel+0.038
 shop+0.038
 needs+0.037
 Elsa+0.036
Neg logit contributions
bridge-0.055
ived-0.052
ruck-0.050
ill-0.049
umbs-0.049
Token the
Token ID262
Feature activation+1.28e+00

Pos logit contributions
bridge+0.100
ruck+0.092
ived+0.092
icle+0.091
ill+0.090
Neg logit contributions
 slams-0.051
 shop-0.049
 needs-0.048
 Elsa-0.047
 wrong-0.047
Token shop
Token ID6128
Feature activation+6.44e+00

Pos logit contributions
 slams+0.058
ience+0.058
 needs+0.054
 tells+0.054
 Elsa+0.051
Neg logit contributions
bridge-0.111
icle-0.105
ruck-0.104
castle-0.101
icles-0.101
Token with
Token ID351
Feature activation+2.34e+00

Pos logit contributions
 slams+0.327
 Elsa+0.311
 Jewel+0.310
 needs+0.299
 tells+0.295
Neg logit contributions
bridge-0.538
ill-0.502
ived-0.500
ruck-0.499
umbs-0.486
Token Mom
Token ID11254
Feature activation+1.79e+00

Pos logit contributions
 slams+0.142
 Jewel+0.134
 wrong+0.133
ience+0.132
 needs+0.132
Neg logit contributions
bridge-0.170
icles-0.151
ruck-0.151
icle-0.149
umbs-0.149
Token.
Token ID13
Feature activation+6.58e-01

Pos logit contributions
 slams+0.093
 Jewel+0.086
 Elsa+0.083
 wrong+0.081
 needs+0.080
Neg logit contributions
bridge-0.150
ill-0.140
ruck-0.139
ived-0.137
umbs-0.137
Token They
Token ID1119
Feature activation+1.62e+00

Pos logit contributions
 slams+0.032
 shop+0.031
 Elsa+0.029
 tells+0.029
 needs+0.029
Neg logit contributions
bridge-0.059
ived-0.054
ruck-0.053
umbs-0.052
icle-0.051
Token want
Token ID765
Feature activation-5.80e-02

Pos logit contributions
 slams+0.110
 wrong+0.108
 Jewel+0.105
 Elsa+0.103
ience+0.102
Neg logit contributions
bridge-0.107
ill-0.093
ruck-0.092
umbs-0.088
ived-0.087
Token there
Token ID612
Feature activation+8.81e-01

Pos logit contributions
bridge+0.058
ived+0.054
ill+0.053
ruck+0.053
umbs+0.053
Neg logit contributions
 slams-0.023
 shop-0.022
 needs-0.021
 Elsa-0.021
 tells-0.020
Token were
Token ID547
Feature activation-3.77e-01

Pos logit contributions
 slams+0.038
 shop+0.034
 Elsa+0.034
 wrong+0.033
 needs+0.033
Neg logit contributions
bridge-0.082
ill-0.075
ived-0.075
ruck-0.074
icle-0.074
Token two
Token ID734
Feature activation+2.84e-02

Pos logit contributions
bridge+0.036
ruck+0.032
ived+0.032
icle+0.032
paper+0.031
Neg logit contributions
 shop-0.016
 Elsa-0.015
 slams-0.015
 wrong-0.015
 Jewel-0.014
Token friends
Token ID2460
Feature activation-1.88e-01

Pos logit contributions
 slams+0.002
 shop+0.001
 needs+0.001
 Elsa+0.001
 Jewel+0.001
Neg logit contributions
bridge-0.002
ived-0.002
ruck-0.002
umbs-0.002
ill-0.002
Token,
Token ID11
Feature activation+1.32e+00

Pos logit contributions
bridge+0.018
ruck+0.017
ill+0.017
ived+0.017
ull+0.017
Neg logit contributions
 slams-0.007
 shop-0.006
 needs-0.006
 Jewel-0.006
 Elsa-0.006
Token Bruno
Token ID31045
Feature activation+6.41e+00

Pos logit contributions
 slams+0.068
 shop+0.066
 needs+0.064
 wrong+0.063
 Elsa+0.063
Neg logit contributions
bridge-0.112
ived-0.101
ruck-0.100
umbs-0.100
ill-0.099
Token and
Token ID290
Feature activation+4.29e-01

Pos logit contributions
 slams+0.486
 wrong+0.415
 studies+0.405
 owns+0.403
 Jewel+0.402
Neg logit contributions
ruck-0.400
umbs-0.368
ived-0.364
bridge-0.359
ull-0.358
Token Sarah
Token ID10490
Feature activation+7.72e-03

Pos logit contributions
 slams+0.023
 shop+0.023
 Elsa+0.022
 needs+0.022
 wrong+0.022
Neg logit contributions
bridge-0.035
ived-0.032
umbs-0.032
ruck-0.031
ill-0.031
Token.
Token ID13
Feature activation+1.01e+00

Pos logit contributions
 slams+0.000
 shop+0.000
 needs+0.000
 Elsa+0.000
 wrong+0.000
Neg logit contributions
bridge-0.001
ruck-0.001
ived-0.001
ill-0.001
umbs-0.001
Token They
Token ID1119
Feature activation+1.40e+00

Pos logit contributions
 slams+0.065
 wrong+0.065
 shop+0.064
 needs+0.063
 owns+0.061
Neg logit contributions
bridge-0.063
umbs-0.063
ived-0.061
icles-0.060
ruck-0.059
Token loved
Token ID6151
Feature activation+2.28e+00

Pos logit contributions
 slams+0.072
 shop+0.069
 needs+0.066
 Elsa+0.066
 wrong+0.066
Neg logit contributions
bridge-0.119
ived-0.109
ruck-0.107
umbs-0.107
ill-0.106
Token par
Token ID1582
Feature activation-3.07e-02

Pos logit contributions
 slams+0.025
 owns+0.023
athy+0.023
 wrong+0.023
ience+0.023
Neg logit contributions
bridge-0.025
ruck-0.025
icle-0.024
icles-0.024
castle-0.022
Tokenrot
Token ID10599
Feature activation-3.70e-01

Pos logit contributions
bridge+0.002
ived+0.002
ruck+0.002
umbs+0.002
ill+0.002
Neg logit contributions
 slams-0.002
 shop-0.002
 wrong-0.002
 needs-0.002
 Elsa-0.002
Token,
Token ID11
Feature activation+1.34e+00

Pos logit contributions
bridge+0.029
umbs+0.026
ill+0.026
icle+0.026
ived+0.026
Neg logit contributions
 slams-0.021
 needs-0.019
 wrong-0.018
 owns-0.018
athy-0.018
Token Coco
Token ID48222
Feature activation+6.45e+00

Pos logit contributions
 slams+0.073
 shop+0.070
 needs+0.068
 wrong+0.067
 Jewel+0.067
Neg logit contributions
bridge-0.109
ruck-0.098
ived-0.098
ill-0.098
umbs-0.097
Token.
Token ID13
Feature activation+1.12e+00

Pos logit contributions
 slams+0.499
 studies+0.412
becca+0.400
athy+0.397
 courtroom+0.395
Neg logit contributions
bridge-0.383
ruck-0.356
ill-0.349
paper-0.330
umbs-0.327
Token Coco
Token ID48222
Feature activation+6.39e+00

Pos logit contributions
 slams+0.045
 needs+0.041
 wrong+0.040
 tells+0.038
 shop+0.038
Neg logit contributions
bridge-0.105
umbs-0.098
ived-0.098
ill-0.098
paper-0.094
Token was
Token ID373
Feature activation+9.64e-01

Pos logit contributions
 slams+0.451
 wrong+0.403
annah+0.399
athy+0.394
 Jewel+0.384
Neg logit contributions
bridge-0.378
paper-0.359
ull-0.356
castle-0.332
ill-0.325
Token green
Token ID4077
Feature activation+3.93e-01

Pos logit contributions
 slams+0.053
 needs+0.051
ience+0.051
 shop+0.051
 studies+0.050
Neg logit contributions
bridge-0.076
ull-0.067
ruck-0.066
ill-0.065
ived-0.065
Token and
Token ID290
Feature activation+3.82e-01

Pos logit contributions
 slams+0.025
 tells+0.025
 owns+0.024
athy+0.024
ience+0.024
Neg logit contributions
bridge-0.026
ill-0.024
icle-0.024
ruck-0.023
icles-0.023
Token red
Token ID2266
Feature activation-5.81e-01

Pos logit contributions
 slams+0.024
 shop+0.022
 studies+0.021
 hospital+0.021
athy+0.021
Neg logit contributions
bridge-0.027
paper-0.024
icle-0.024
 crashing-0.024
ull-0.023
Token and
Token ID290
Feature activation+3.62e-01

Pos logit contributions
bridge+0.050
ived+0.046
ruck+0.045
icle+0.045
umbs+0.045
Neg logit contributions
 slams-0.029
 Elsa-0.026
 tells-0.026
 needs-0.026
 shop-0.026
Token One
Token ID1881
Feature activation+2.67e+00

Pos logit contributions
 slams+0.063
 studies+0.055
ience+0.055
 owns+0.054
 wrong+0.054
Neg logit contributions
umbs-0.084
bridge-0.079
icle-0.076
icles-0.076
ill-0.074
Token day
Token ID1110
Feature activation+2.48e-01

Pos logit contributions
 slams+0.128
 shop+0.122
 needs+0.118
 Elsa+0.117
 wrong+0.117
Neg logit contributions
bridge-0.237
ived-0.216
ruck-0.214
umbs-0.214
ill-0.212
Token,
Token ID11
Feature activation+1.32e+00

Pos logit contributions
 slams+0.011
 shop+0.010
 needs+0.010
 tells+0.010
 Jewel+0.010
Neg logit contributions
bridge-0.022
ived-0.021
ruck-0.020
umbs-0.020
icle-0.020
Token Lily
Token ID20037
Feature activation+1.08e+00

Pos logit contributions
 slams+0.069
 studies+0.062
 needs+0.059
 tells+0.059
 owns+0.059
Neg logit contributions
bridge-0.108
icle-0.099
ived-0.099
ruck-0.098
umbs-0.098
Token and
Token ID290
Feature activation+5.26e-01

Pos logit contributions
 Jewel+0.065
 slams+0.064
 wrong+0.063
 shop+0.060
 Hazel+0.058
Neg logit contributions
bridge-0.071
ruck-0.070
ill-0.069
ull-0.068
ived-0.068
Token Midnight
Token ID27960
Feature activation+6.38e+00

Pos logit contributions
 slams+0.026
 studies+0.022
 owns+0.021
ience+0.020
 needs+0.020
Neg logit contributions
icle-0.040
icles-0.039
ruck-0.039
umbs-0.037
bridge-0.037
Token went
Token ID1816
Feature activation-5.05e-01

Pos logit contributions
 slams+0.419
 shop+0.397
 wrong+0.389
 Elsa+0.388
 Jewel+0.386
Neg logit contributions
bridge-0.439
ived-0.402
umbs-0.393
ruck-0.389
ill-0.385
Token for
Token ID329
Feature activation+2.72e-01

Pos logit contributions
bridge+0.028
ived+0.028
ruck+0.027
icle+0.025
paper+0.025
Neg logit contributions
ience-0.035
 slams-0.035
 needs-0.035
 shop-0.035
 owns-0.035
Token a
Token ID257
Feature activation+7.02e-01

Pos logit contributions
 slams+0.019
 belongs+0.017
 needs+0.017
 shop+0.017
 owns+0.017
Neg logit contributions
bridge-0.016
ruck-0.016
umbs-0.016
icle-0.014
icles-0.014
Token jog
Token ID48342
Feature activation+2.98e+00

Pos logit contributions
 slams+0.030
 shop+0.028
 tells+0.027
 needs+0.027
 Elsa+0.027
Neg logit contributions
bridge-0.063
ived-0.060
ruck-0.059
umbs-0.057
paper-0.057
Token in
Token ID287
Feature activation+1.75e+00

Pos logit contributions
 tells+0.183
 slams+0.182
 Elsa+0.181
ience+0.179
 Jewel+0.177
Neg logit contributions
bridge-0.199
ived-0.193
ill-0.187
ruck-0.182
ipples-0.177
Token again
Token ID757
Feature activation-1.46e+00

Pos logit contributions
 slams+0.024
 shop+0.022
 Elsa+0.022
 Jewel+0.022
 tells+0.021
Neg logit contributions
bridge-0.043
ruck-0.041
ill-0.040
ull-0.040
icle-0.039
Token and
Token ID290
Feature activation-1.37e+00

Pos logit contributions
bridge+0.126
ived+0.116
ill+0.115
ruck+0.115
umbs+0.113
Neg logit contributions
 slams-0.071
 shop-0.067
 Jewel-0.065
 Elsa-0.065
 needs-0.063
Token again
Token ID757
Feature activation-1.22e+00

Pos logit contributions
bridge+0.109
ived+0.097
ill+0.095
ruck+0.094
ull+0.093
Neg logit contributions
 Jewel-0.073
 slams-0.071
 shop-0.071
Charl-0.070
athy-0.070
Token.
Token ID13
Feature activation-1.95e+00

Pos logit contributions
bridge+0.113
ived+0.103
ruck+0.102
umbs+0.101
ill+0.101
Neg logit contributions
 slams-0.053
 shop-0.050
 needs-0.048
 Elsa-0.048
 wrong-0.048
Token J
Token ID449
Feature activation-3.24e+00

Pos logit contributions
bridge+0.147
ill+0.140
umbs+0.136
icle+0.133
ived+0.133
Neg logit contributions
 slams-0.112
 wrong-0.110
 needs-0.103
 shop-0.101
 tells-0.099
Tokenod
Token ID375
Feature activation-7.70e+00

Pos logit contributions
bridge+0.221
ived+0.195
umbs+0.192
ruck+0.192
icle+0.188
Neg logit contributions
 slams-0.220
 shop-0.215
 needs-0.209
 wrong-0.208
 Elsa-0.207
Tokenie
Token ID494
Feature activation-4.60e+00

Pos logit contributions
bridge+0.617
ived+0.552
ruck+0.544
ill+0.543
umbs+0.537
Neg logit contributions
 slams-0.436
 shop-0.418
 Elsa-0.408
 Jewel-0.404
 needs-0.396
Token would
Token ID561
Feature activation-6.32e-01

Pos logit contributions
bridge+0.341
ull+0.300
ived+0.297
ill+0.297
castle+0.279
Neg logit contributions
 Jewel-0.264
 slams-0.260
 Elsa-0.251
Charl-0.250
 Hazel-0.245
Token always
Token ID1464
Feature activation-1.25e+00

Pos logit contributions
bridge+0.049
ill+0.044
ived+0.043
ruck+0.043
icle+0.042
Neg logit contributions
 slams-0.036
 Elsa-0.033
 shop-0.033
 Jewel-0.033
 needs-0.032
Token remember
Token ID3505
Feature activation-2.51e+00

Pos logit contributions
bridge+0.095
ruck+0.085
ill+0.084
ived+0.083
ull+0.081
Neg logit contributions
 slams-0.072
 shop-0.068
 Elsa-0.068
 Jewel-0.068
athy-0.064
Token the
Token ID262
Feature activation-5.85e-01

Pos logit contributions
bridge+0.198
ruck+0.183
ill+0.180
ived+0.179
icle+0.178
Neg logit contributions
 slams-0.138
 shop-0.130
 needs-0.129
 Elsa-0.127
 Jewel-0.127
Token went
Token ID1816
Feature activation-1.06e+00

Pos logit contributions
bridge+0.023
ived+0.021
ruck+0.020
ull+0.020
icle+0.019
Neg logit contributions
 Elsa-0.023
 slams-0.022
 Jewel-0.022
 shop-0.021
annah-0.020
Token on
Token ID319
Feature activation+4.61e-01

Pos logit contributions
ived+0.053
bridge+0.052
ruck+0.050
 crashing+0.049
ill+0.045
Neg logit contributions
 needs-0.084
ience-0.083
 owns-0.081
 Jewel-0.078
 belongs-0.078
Token with
Token ID351
Feature activation+1.11e+00

Pos logit contributions
 needs+0.030
 Elsa+0.030
 slams+0.029
ience+0.029
athy+0.029
Neg logit contributions
bridge-0.028
ruck-0.027
icle-0.026
ived-0.025
 crashing-0.025
Token his
Token ID465
Feature activation-7.13e-01

Pos logit contributions
 slams+0.059
 shop+0.059
 wrong+0.056
 Elsa+0.055
 tells+0.055
Neg logit contributions
bridge-0.093
ived-0.085
umbs-0.085
ill-0.084
ruck-0.083
Token mis
Token ID2984
Feature activation-2.84e-01

Pos logit contributions
bridge+0.050
ived+0.047
icle+0.046
paper+0.045
ruck+0.044
Neg logit contributions
 slams-0.043
 Elsa-0.040
 Jewel-0.039
 owns-0.038
 needs-0.038
Tokenchie
Token ID3043
Feature activation-7.70e+00

Pos logit contributions
bridge+0.016
icles+0.014
icle+0.014
ruck+0.013
castle+0.013
Neg logit contributions
 slams-0.022
 shop-0.021
athy-0.021
 Jewel-0.021
 store-0.020
Tokenvous
Token ID31222
Feature activation-4.05e+00

Pos logit contributions
bridge+0.410
ived+0.347
icles+0.347
icle+0.343
umbs+0.341
Neg logit contributions
 wrong-0.625
 shop-0.621
 slams-0.619
 needs-0.618
 tells-0.604
Token ways
Token ID2842
Feature activation-2.06e+00

Pos logit contributions
bridge+0.342
ived+0.306
umbs+0.306
ruck+0.306
ill+0.301
Neg logit contributions
 slams-0.210
 shop-0.206
 needs-0.199
 wrong-0.198
 tells-0.195
Token,
Token ID11
Feature activation+2.96e-01

Pos logit contributions
bridge+0.174
ived+0.161
ruck+0.160
ill+0.159
umbs+0.157
Neg logit contributions
 slams-0.104
 needs-0.097
 Elsa-0.096
 shop-0.096
 tells-0.095
Token causing
Token ID6666
Feature activation-1.25e+00

Pos logit contributions
 slams+0.016
 shop+0.016
 needs+0.015
 wrong+0.015
 Elsa+0.015
Neg logit contributions
bridge-0.024
ived-0.022
umbs-0.022
ruck-0.021
ill-0.021
Token problems
Token ID2761
Feature activation+9.07e-01

Pos logit contributions
bridge+0.108
ived+0.100
ruck+0.099
umbs+0.098
ill+0.097
Neg logit contributions
 slams-0.061
 shop-0.058
 needs-0.056
 Elsa-0.056
 wrong-0.055
Token tall
Token ID7331
Feature activation+8.00e-01

Pos logit contributions
 shop+0.054
 slams+0.054
 old+0.051
 needs+0.051
 wrong+0.050
Neg logit contributions
bridge-0.098
umbs-0.092
ived-0.091
icle-0.090
icles-0.089
Token walls
Token ID7714
Feature activation-1.35e+00

Pos logit contributions
 slams+0.046
 shop+0.045
 needs+0.044
 tells+0.043
 wrong+0.043
Neg logit contributions
bridge-0.061
umbs-0.057
ived-0.056
icle-0.056
ruck-0.055
Token and
Token ID290
Feature activation-2.99e-01

Pos logit contributions
bridge+0.137
umbs+0.127
ived+0.126
icle+0.126
ruck+0.124
Neg logit contributions
 shop-0.044
 slams-0.042
 wrong-0.041
 old-0.039
 needs-0.038
Token a
Token ID257
Feature activation-2.11e-01

Pos logit contributions
bridge+0.025
ived+0.023
umbs+0.023
icle+0.022
ruck+0.022
Neg logit contributions
 slams-0.015
 shop-0.015
 needs-0.015
 wrong-0.014
 tells-0.014
Token draw
Token ID3197
Feature activation-1.13e+00

Pos logit contributions
bridge+0.017
ived+0.016
umbs+0.015
ruck+0.015
icle+0.015
Neg logit contributions
 slams-0.011
 shop-0.011
 needs-0.011
 wrong-0.011
 Elsa-0.011
Tokenbridge
Token ID9458
Feature activation-7.46e+00

Pos logit contributions
umbs+0.031
icle+0.030
ived+0.029
icles+0.029
urg+0.029
Neg logit contributions
 shop-0.112
 wrong-0.110
 store-0.108
 old-0.106
 slams-0.106
Token.
Token ID13
Feature activation-2.10e-01

Pos logit contributions
bridge+0.756
umbs+0.753
ipples+0.738
urg+0.737
ull+0.736
Neg logit contributions
 wrong-0.225
 shop-0.225
 store-0.197
 slams-0.196
 needs-0.186
Token\n
Token ID198
Feature activation-2.50e-01

Pos logit contributions
bridge+0.018
ruck+0.017
ived+0.016
umbs+0.016
icle+0.016
Neg logit contributions
 slams-0.010
 shop-0.010
 tells-0.010
 wrong-0.010
 needs-0.010
Token\n
Token ID198
Feature activation-2.42e-01

Pos logit contributions
bridge+0.025
ived+0.023
umbs+0.023
ruck+0.023
ill+0.023
Neg logit contributions
 slams-0.009
 shop-0.008
 needs-0.008
 Elsa-0.008
 wrong-0.008
TokenHe
Token ID1544
Feature activation-8.32e-01

Pos logit contributions
bridge+0.018
icles+0.016
icle+0.016
urg+0.016
ruck+0.016
Neg logit contributions
 shop-0.015
 store-0.014
 slams-0.014
 wrong-0.014
 needs-0.014
Token started
Token ID2067
Feature activation-6.21e-01

Pos logit contributions
bridge+0.063
ived+0.057
ill+0.055
ull+0.055
ruck+0.055
Neg logit contributions
 slams-0.050
 wrong-0.048
 Elsa-0.048
 Jewel-0.048
 shop-0.047
Token always
Token ID1464
Feature activation+1.62e-01

Pos logit contributions
 slams+0.023
 Jewel+0.022
 Elsa+0.021
 shop+0.021
 Hazel+0.020
Neg logit contributions
bridge-0.026
ived-0.024
ull-0.023
ill-0.023
ruck-0.023
Token acted
Token ID13134
Feature activation+2.86e-01

Pos logit contributions
 slams+0.010
 Jewel+0.009
 Elsa+0.009
 shop+0.009
athy+0.009
Neg logit contributions
bridge-0.011
ruck-0.011
ived-0.010
ill-0.010
ull-0.010
Token silly
Token ID14397
Feature activation-9.29e-01

Pos logit contributions
 slams+0.017
 shop+0.016
 needs+0.016
 Elsa+0.016
 tells+0.016
Neg logit contributions
bridge-0.022
ived-0.020
ruck-0.019
umbs-0.019
ill-0.019
Token and
Token ID290
Feature activation+1.62e-01

Pos logit contributions
bridge+0.085
ived+0.078
umbs+0.077
ruck+0.077
ill+0.076
Neg logit contributions
 slams-0.041
 shop-0.040
 needs-0.038
 wrong-0.038
 Elsa-0.037
Token mis
Token ID2984
Feature activation+8.95e-03

Pos logit contributions
 slams+0.008
 shop+0.008
 Elsa+0.008
 Jewel+0.008
athy+0.008
Neg logit contributions
bridge-0.013
ived-0.012
ruck-0.012
icle-0.012
ipples-0.012
Tokenchie
Token ID3043
Feature activation-7.41e+00

Pos logit contributions
 shop+0.001
 slams+0.001
 Jewel+0.001
 needs+0.001
 wrong+0.001
Neg logit contributions
bridge-0.001
icles-0.000
icle-0.000
ruck-0.000
castle-0.000
Tokenvous
Token ID31222
Feature activation-3.82e+00

Pos logit contributions
bridge+0.443
icles+0.392
ived+0.392
umbs+0.391
urg+0.387
Neg logit contributions
 needs-0.556
 wrong-0.551
 shop-0.541
 slams-0.534
 tells-0.533
Token.
Token ID13
Feature activation+5.72e-01

Pos logit contributions
bridge+0.337
ived+0.305
umbs+0.303
ruck+0.302
ill+0.300
Neg logit contributions
 slams-0.182
 shop-0.178
 needs-0.172
 wrong-0.170
 tells-0.168
Token\n
Token ID198
Feature activation+7.04e-01

Pos logit contributions
 slams+0.029
 Elsa+0.027
ience+0.027
 wrong+0.027
 needs+0.026
Neg logit contributions
bridge-0.047
icle-0.045
icles-0.044
ill-0.044
umbs-0.044
Token\n
Token ID198
Feature activation+6.75e-01

Pos logit contributions
 slams+0.035
 needs+0.032
 Elsa+0.031
ience+0.030
 wrong+0.030
Neg logit contributions
bridge-0.062
ull-0.058
ill-0.057
ruck-0.057
umbs-0.057
TokenOne
Token ID3198
Feature activation+1.04e+00

Pos logit contributions
 slams+0.033
 shop+0.031
 Elsa+0.030
 needs+0.030
 tells+0.030
Neg logit contributions
bridge-0.060
ruck-0.054
ived-0.054
umbs-0.053
ill-0.053
Token\n
Token ID198
Feature activation-9.52e-01

Pos logit contributions
bridge+0.085
umbs+0.081
ull+0.081
ived+0.080
ill+0.079
Neg logit contributions
 slams-0.051
 needs-0.047
 Elsa-0.045
 wrong-0.045
 shop-0.045
TokenThe
Token ID464
Feature activation-4.42e-01

Pos logit contributions
bridge+0.069
ived+0.061
umbs+0.061
ill+0.059
ruck+0.059
Neg logit contributions
 slams-0.062
 needs-0.059
 shop-0.059
 Elsa-0.058
 tells-0.058
Token cat
Token ID3797
Feature activation-1.85e-01

Pos logit contributions
bridge+0.035
ruck+0.031
ived+0.031
umbs+0.031
icle+0.030
Neg logit contributions
 slams-0.025
 Elsa-0.023
 Jewel-0.022
 shop-0.022
 owns-0.022
Token grinned
Token ID36268
Feature activation-1.79e+00

Pos logit contributions
bridge+0.015
ived+0.014
ill+0.013
ull+0.013
umbs+0.013
Neg logit contributions
 slams-0.010
 Elsa-0.010
 Jewel-0.009
 wrong-0.009
 shop-0.009
Token mis
Token ID2984
Feature activation-2.48e-01

Pos logit contributions
bridge+0.169
umbs+0.152
ruck+0.152
ived+0.152
icle+0.150
Neg logit contributions
 slams-0.077
 shop-0.075
 wrong-0.072
 needs-0.071
 Elsa-0.070
Tokenchie
Token ID3043
Feature activation-7.40e+00

Pos logit contributions
bridge+0.013
icles+0.010
icle+0.010
castle+0.010
ruck+0.010
Neg logit contributions
 slams-0.021
 shop-0.021
 Jewel-0.021
 needs-0.020
 wrong-0.020
Tokenv
Token ID85
Feature activation-5.40e+00

Pos logit contributions
bridge+0.421
ived+0.353
icles+0.352
ruck+0.350
umbs+0.349
Neg logit contributions
 slams-0.580
 needs-0.574
 shop-0.573
 wrong-0.569
 tells-0.562
Tokenously
Token ID3481
Feature activation-4.43e+00

Pos logit contributions
bridge+0.358
paper+0.313
ived+0.294
ipples+0.290
ruck+0.290
Neg logit contributions
 shop-0.385
 slams-0.375
 needs-0.375
 tells-0.367
 wrong-0.365
Token.
Token ID13
Feature activation-1.24e+00

Pos logit contributions
bridge+0.358
ived+0.346
ruck+0.336
umbs+0.329
ill+0.323
Neg logit contributions
 slams-0.229
 Elsa-0.218
 shop-0.212
 Jewel-0.211
 needs-0.207
Token "
Token ID366
Feature activation-1.45e+00

Pos logit contributions
bridge+0.087
ived+0.081
ill+0.080
umbs+0.079
icle+0.079
Neg logit contributions
 slams-0.077
 shop-0.073
 needs-0.073
 wrong-0.073
 Elsa-0.071
TokenNever
Token ID12295
Feature activation-8.58e-01

Pos logit contributions
bridge+0.114
ived+0.104
umbs+0.102
ruck+0.102
ill+0.100
Neg logit contributions
 slams-0.082
 shop-0.081
 needs-0.078
 Jewel-0.077
 Elsa-0.077
Token\n
Token ID198
Feature activation-5.28e-02

Pos logit contributions
bridge+0.005
ill+0.005
ull+0.005
umbs+0.005
ived+0.005
Neg logit contributions
 slams-0.004
 needs-0.004
ience-0.003
 Elsa-0.003
 shop-0.003
TokenSuddenly
Token ID38582
Feature activation-1.13e+00

Pos logit contributions
bridge+0.003
ived+0.002
ruck+0.002
ill+0.002
umbs+0.002
Neg logit contributions
 slams-0.004
 tells-0.004
 needs-0.004
 Elsa-0.004
 says-0.004
Token,
Token ID11
Feature activation+4.45e-01

Pos logit contributions
bridge+0.098
ived+0.092
ill+0.088
ruck+0.088
icle+0.087
Neg logit contributions
 slams-0.052
 shop-0.051
 needs-0.050
 tells-0.048
 studies-0.048
Token a
Token ID257
Feature activation-2.46e-01

Pos logit contributions
 slams+0.021
 needs+0.020
 shop+0.020
 tells+0.020
 studies+0.019
Neg logit contributions
bridge-0.039
ived-0.035
ruck-0.035
icle-0.034
ill-0.034
Token mis
Token ID2984
Feature activation+3.24e-02

Pos logit contributions
bridge+0.021
ived+0.018
ruck+0.018
ull+0.018
umbs+0.018
Neg logit contributions
 slams-0.013
 needs-0.012
 Elsa-0.012
 tells-0.012
 shop-0.012
Tokenchie
Token ID3043
Feature activation-7.39e+00

Pos logit contributions
 slams+0.003
 shop+0.003
 Jewel+0.003
athy+0.003
 wrong+0.003
Neg logit contributions
bridge-0.002
icles-0.001
icle-0.001
ruck-0.001
umbs-0.001
Tokenvous
Token ID31222
Feature activation-3.88e+00

Pos logit contributions
bridge+0.432
icles+0.409
icle+0.395
umbs+0.393
urg+0.392
Neg logit contributions
 wrong-0.545
 needs-0.538
 shop-0.528
 slams-0.525
 says-0.524
Token bird
Token ID6512
Feature activation-1.37e+00

Pos logit contributions
bridge+0.305
ruck+0.276
ived+0.276
ull+0.271
ill+0.270
Neg logit contributions
 slams-0.221
 shop-0.206
 needs-0.204
 Elsa-0.202
 tells-0.201
Token flew
Token ID13112
Feature activation-1.12e+00

Pos logit contributions
bridge+0.096
umbs+0.086
icle+0.085
icles+0.085
ived+0.084
Neg logit contributions
 slams-0.090
 shop-0.088
 needs-0.087
 wrong-0.086
 tells-0.085
Token by
Token ID416
Feature activation+1.62e+00

Pos logit contributions
bridge+0.098
umbs+0.087
ruck+0.085
ipples+0.084
icles+0.083
Neg logit contributions
 slams-0.059
 wrong-0.059
 shop-0.056
 old-0.053
 Elsa-0.052
Token and
Token ID290
Feature activation-1.65e-01

Pos logit contributions
 slams+0.066
 shop+0.064
 wrong+0.061
 tells+0.059
 Jewel+0.059
Neg logit contributions
bridge-0.158
umbs-0.143
ived-0.142
ruck-0.140
icles-0.139
Token at
Token ID379
Feature activation-2.04e+00

Pos logit contributions
ived+0.023
bridge+0.020
ill+0.019
 crashing+0.018
ruck+0.018
Neg logit contributions
 Jewel-0.028
 library-0.028
 shop-0.028
athy-0.028
 slams-0.028
Token him
Token ID683
Feature activation-3.72e-01

Pos logit contributions
bridge+0.170
ruck+0.156
ill+0.154
icle+0.154
icles+0.153
Neg logit contributions
 slams-0.105
 shop-0.100
 needs-0.099
 Elsa-0.098
 wrong-0.096
Token with
Token ID351
Feature activation+1.17e+00

Pos logit contributions
bridge+0.024
ill+0.024
ived+0.023
icle+0.023
umbs+0.023
Neg logit contributions
 slams-0.024
athy-0.023
 owns-0.022
 needs-0.022
 Elsa-0.022
Token a
Token ID257
Feature activation-2.57e-01

Pos logit contributions
 slams+0.063
 shop+0.059
 Elsa+0.058
 needs+0.058
 Jewel+0.058
Neg logit contributions
bridge-0.096
ived-0.087
umbs-0.086
ruck-0.085
ill-0.085
Token mis
Token ID2984
Feature activation-1.48e-01

Pos logit contributions
bridge+0.021
ived+0.019
umbs+0.019
ruck+0.019
ill+0.019
Neg logit contributions
 slams-0.014
 shop-0.013
 Elsa-0.013
 Jewel-0.013
 needs-0.012
Tokenchie
Token ID3043
Feature activation-7.39e+00

Pos logit contributions
bridge+0.007
icles+0.006
icle+0.006
ruck+0.006
castle+0.005
Neg logit contributions
 slams-0.013
 shop-0.013
 Jewel-0.013
 needs-0.012
 wrong-0.012
Tokenvous
Token ID31222
Feature activation-3.93e+00

Pos logit contributions
bridge+0.392
icles+0.334
ived+0.331
icle+0.327
ruck+0.326
Neg logit contributions
 shop-0.596
 needs-0.594
 slams-0.592
 wrong-0.591
 tells-0.582
Token grin
Token ID16770
Feature activation-2.76e+00

Pos logit contributions
bridge+0.282
ived+0.254
ruck+0.249
umbs+0.248
ill+0.246
Neg logit contributions
 slams-0.251
 shop-0.241
 Elsa-0.235
 needs-0.235
 wrong-0.233
Token.
Token ID13
Feature activation-1.84e-01

Pos logit contributions
bridge+0.264
ived+0.243
ruck+0.242
umbs+0.241
ill+0.238
Neg logit contributions
 shop-0.108
 slams-0.108
 wrong-0.101
 needs-0.101
 Elsa-0.100
Token She
Token ID1375
Feature activation+1.41e-01

Pos logit contributions
bridge+0.015
ived+0.014
ruck+0.013
ill+0.013
icle+0.013
Neg logit contributions
 slams-0.010
 needs-0.009
 shop-0.009
 wrong-0.009
 Elsa-0.009
Token said
Token ID531
Feature activation+2.65e-01

Pos logit contributions
 slams+0.007
 wrong+0.006
 shop+0.006
 Jewel+0.006
 Elsa+0.006
Neg logit contributions
bridge-0.012
ived-0.012
ill-0.011
ull-0.011
ruck-0.011
Token.
Token ID13
Feature activation+1.11e+00

Pos logit contributions
bridge+0.079
ruck+0.075
ull+0.074
icle+0.071
ill+0.071
Neg logit contributions
 slams-0.053
 studies-0.048
 wrong-0.046
 tells-0.046
 needs-0.045
Token He
Token ID679
Feature activation+2.61e-01

Pos logit contributions
 slams+0.051
 shop+0.050
 Elsa+0.047
 Jewel+0.047
 needs+0.046
Neg logit contributions
bridge-0.101
ived-0.094
ruck-0.091
umbs-0.090
ipples-0.088
Token was
Token ID373
Feature activation+1.12e+00

Pos logit contributions
 slams+0.014
 shop+0.012
 Jewel+0.012
 tells+0.012
 Elsa+0.012
Neg logit contributions
bridge-0.022
ull-0.020
ruck-0.019
ived-0.019
paper-0.019
Token a
Token ID257
Feature activation+8.02e-01

Pos logit contributions
 slams+0.053
 shop+0.050
 needs+0.050
 tells+0.049
ience+0.049
Neg logit contributions
bridge-0.097
ived-0.088
icle-0.088
ruck-0.087
umbs-0.086
Token mis
Token ID2984
Feature activation+1.66e-01

Pos logit contributions
 slams+0.044
 tells+0.040
 Jewel+0.038
 needs+0.038
 says+0.037
Neg logit contributions
bridge-0.063
ull-0.057
ruck-0.057
icle-0.056
ived-0.056
Tokenchie
Token ID3043
Feature activation-7.32e+00

Pos logit contributions
 shop+0.014
 slams+0.014
 Jewel+0.014
 store+0.014
athy+0.013
Neg logit contributions
bridge-0.008
icles-0.007
icle-0.006
ruck-0.006
paper-0.006
Tokenvous
Token ID31222
Feature activation-3.50e+00

Pos logit contributions
bridge+0.399
ived+0.353
icles+0.349
urg+0.339
ill+0.337
Neg logit contributions
 shop-0.577
 needs-0.566
 slams-0.564
 wrong-0.557
 old-0.545
Token little
Token ID1310
Feature activation-2.89e-01

Pos logit contributions
bridge+0.302
ived+0.276
ruck+0.276
umbs+0.272
icle+0.271
Neg logit contributions
 slams-0.172
 shop-0.160
 needs-0.157
 wrong-0.157
 Elsa-0.156
Token baby
Token ID5156
Feature activation-1.03e+00

Pos logit contributions
bridge+0.020
ruck+0.019
icle+0.019
ived+0.019
paper+0.019
Neg logit contributions
 slams-0.017
 tells-0.016
athy-0.016
 owns-0.016
 studies-0.015
Token and
Token ID290
Feature activation+4.73e-01

Pos logit contributions
bridge+0.089
ruck+0.082
ill+0.081
ull+0.080
icle+0.080
Neg logit contributions
 slams-0.051
 tells-0.045
 studies-0.044
 needs-0.044
 wrong-0.044
Token always
Token ID1464
Feature activation+4.84e-01

Pos logit contributions
 slams+0.027
athy+0.026
 shop+0.025
 Jewel+0.025
 Elsa+0.024
Neg logit contributions
bridge-0.037
ruck-0.031
ull-0.031
icle-0.031
paper-0.030
Token answer
Token ID3280
Feature activation+6.33e-01

Pos logit contributions
bridge+0.006
ived+0.005
umbs+0.005
ruck+0.005
ill+0.005
Neg logit contributions
 slams-0.005
 shop-0.005
 needs-0.005
 wrong-0.004
 Elsa-0.004
Token is
Token ID318
Feature activation+1.61e+00

Pos logit contributions
 slams+0.020
 shop+0.018
 Jewel+0.018
 Elsa+0.018
athy+0.017
Neg logit contributions
bridge-0.064
ived-0.060
icle-0.060
ill-0.060
ruck-0.059
Token a
Token ID257
Feature activation-2.97e-01

Pos logit contributions
 shop+0.085
 slams+0.085
athy+0.079
 Jewel+0.079
 aisle+0.078
Neg logit contributions
bridge-0.126
icle-0.117
ived-0.116
ruck-0.114
icles-0.114
Token fall
Token ID2121
Feature activation-1.94e+00

Pos logit contributions
bridge+0.020
ived+0.018
ruck+0.018
icle+0.018
ull+0.017
Neg logit contributions
 slams-0.020
 shop-0.019
 Elsa-0.019
 needs-0.019
 Jewel-0.019
Token-
Token ID12
Feature activation-1.80e+00

Pos logit contributions
bridge+0.201
umbs+0.184
ived+0.181
ill+0.181
ruck+0.180
Neg logit contributions
 wrong-0.064
 needs-0.061
 slams-0.061
 shop-0.060
 tells-0.057
Tokenbridge
Token ID9458
Feature activation-7.31e+00

Pos logit contributions
bridge+0.127
ived+0.112
umbs+0.112
ruck+0.112
ill+0.110
Neg logit contributions
 slams-0.117
 shop-0.115
 wrong-0.113
 needs-0.112
 tells-0.110
Token!"
Token ID2474
Feature activation-1.21e+00

Pos logit contributions
bridge+0.754
ruck+0.705
ived+0.705
umbs+0.703
ull+0.700
Neg logit contributions
 shop-0.237
 wrong-0.229
 slams-0.225
 needs-0.219
 tells-0.214
Token He
Token ID679
Feature activation-6.50e-01

Pos logit contributions
bridge+0.101
umbs+0.092
ived+0.092
ill+0.091
ruck+0.090
Neg logit contributions
 slams-0.063
 shop-0.061
 needs-0.060
 Jewel-0.059
 Elsa-0.058
Token laughs
Token ID22051
Feature activation-5.22e-01

Pos logit contributions
bridge+0.051
ived+0.048
urg+0.043
paper+0.043
icle+0.042
Neg logit contributions
otte-0.034
athy-0.034
 Jewel-0.033
Charl-0.033
 Charlotte-0.032
Token.
Token ID13
Feature activation-7.50e-01

Pos logit contributions
bridge+0.060
umbs+0.056
ruck+0.056
icles+0.056
icle+0.056
Neg logit contributions
 wrong-0.013
 shop-0.011
 old-0.011
 says-0.011
 tells-0.010
Token He
Token ID679
Feature activation-8.12e-01

Pos logit contributions
bridge+0.064
ived+0.059
ruck+0.058
umbs+0.057
ill+0.057
Neg logit contributions
 slams-0.038
 shop-0.038
 Elsa-0.036
 wrong-0.036
 tells-0.036
Token so
Token ID523
Feature activation-6.61e-01

Pos logit contributions
 slams+0.067
 tells+0.062
athy+0.061
 needs+0.061
 Jewel+0.061
Neg logit contributions
bridge-0.104
ruck-0.090
ived-0.088
icle-0.086
ull-0.086
Token that
Token ID326
Feature activation+1.57e+00

Pos logit contributions
bridge+0.044
ived+0.041
ruck+0.040
icle+0.039
icles+0.039
Neg logit contributions
 slams-0.042
 shop-0.041
 Jewel-0.040
 tells-0.040
athy-0.040
Token no
Token ID645
Feature activation-2.01e+00

Pos logit contributions
 slams+0.111
athy+0.099
 Jewel+0.097
 shop+0.096
Charl+0.095
Neg logit contributions
bridge-0.100
ruck-0.091
ull-0.087
icles-0.086
icle-0.086
Token more
Token ID517
Feature activation+1.99e-01

Pos logit contributions
bridge+0.135
ruck+0.132
icle+0.124
icles+0.119
ived+0.115
Neg logit contributions
 slams-0.138
 Elsa-0.128
 tells-0.125
 Jewel-0.122
 shop-0.121
Token mis
Token ID2984
Feature activation-2.34e-02

Pos logit contributions
 slams+0.011
 Elsa+0.010
 Jewel+0.010
 shop+0.010
athy+0.010
Neg logit contributions
bridge-0.016
ruck-0.015
ived-0.014
ull-0.014
ill-0.014
Tokenchie
Token ID3043
Feature activation-7.30e+00

Pos logit contributions
bridge+0.001
icles+0.001
icle+0.001
castle+0.001
ill+0.001
Neg logit contributions
 shop-0.002
 slams-0.002
 needs-0.002
 Jewel-0.002
 store-0.002
Tokenvous
Token ID31222
Feature activation-4.12e+00

Pos logit contributions
bridge+0.395
icles+0.350
ived+0.346
urg+0.346
icle+0.345
Neg logit contributions
 needs-0.574
 shop-0.563
 wrong-0.560
 slams-0.558
 owns-0.546
Token mice
Token ID10693
Feature activation+1.87e-01

Pos logit contributions
bridge+0.333
ived+0.301
ruck+0.300
umbs+0.294
ill+0.292
Neg logit contributions
 slams-0.228
 shop-0.218
 Elsa-0.211
 wrong-0.209
 needs-0.209
Token could
Token ID714
Feature activation+3.22e-01

Pos logit contributions
 slams+0.011
 Elsa+0.010
 Jewel+0.010
 shop+0.010
athy+0.009
Neg logit contributions
bridge-0.015
ived-0.013
ruck-0.013
ill-0.013
umbs-0.013
Token come
Token ID1282
Feature activation-1.63e+00

Pos logit contributions
 slams+0.018
 shop+0.018
 needs+0.017
 Elsa+0.017
 wrong+0.017
Neg logit contributions
bridge-0.025
ived-0.023
ruck-0.023
umbs-0.023
ill-0.022
Token in
Token ID287
Feature activation+9.68e-01

Pos logit contributions
bridge+0.140
ived+0.127
umbs+0.127
icle+0.124
ruck+0.123
Neg logit contributions
 shop-0.080
 slams-0.079
 wrong-0.078
 needs-0.077
 Elsa-0.074
Token old
Token ID1468
Feature activation+4.18e+00

Pos logit contributions
bridge+0.005
icle+0.004
ruck+0.004
 crashing+0.004
ived+0.004
Neg logit contributions
 slams-0.003
 Elsa-0.003
athy-0.003
ience-0.003
 Jewel-0.003
Token lady
Token ID10846
Feature activation+1.14e+00

Pos logit contributions
 slams+0.222
 Elsa+0.209
ience+0.201
Charl+0.199
athy+0.197
Neg logit contributions
bridge-0.335
icle-0.312
ruck-0.308
icles-0.293
ived-0.293
Token with
Token ID351
Feature activation+1.16e+00

Pos logit contributions
 slams+0.057
 shop+0.052
 Elsa+0.051
 Jewel+0.050
 tells+0.050
Neg logit contributions
bridge-0.098
ill-0.090
ived-0.088
icle-0.087
ruck-0.087
Token a
Token ID257
Feature activation-3.67e-01

Pos logit contributions
 slams+0.064
 shop+0.058
 Jewel+0.057
 Elsa+0.057
 needs+0.057
Neg logit contributions
bridge-0.093
ived-0.084
ruck-0.084
icle-0.083
umbs-0.083
Token mis
Token ID2984
Feature activation-3.64e-02

Pos logit contributions
bridge+0.029
ived+0.027
ruck+0.026
icle+0.026
umbs+0.026
Neg logit contributions
 slams-0.020
 Jewel-0.018
 Elsa-0.018
 needs-0.018
 shop-0.018
Tokenchie
Token ID3043
Feature activation-7.28e+00

Pos logit contributions
bridge+0.002
icles+0.001
icle+0.001
ruck+0.001
umbs+0.001
Neg logit contributions
 shop-0.003
 slams-0.003
 Jewel-0.003
 needs-0.003
 wrong-0.003
Tokenvous
Token ID31222
Feature activation-4.03e+00

Pos logit contributions
bridge+0.379
ived+0.328
icles+0.325
umbs+0.323
icle+0.320
Neg logit contributions
 shop-0.590
 needs-0.587
 slams-0.583
 wrong-0.583
 tells-0.573
Token tw
Token ID665
Feature activation-2.05e+00

Pos logit contributions
bridge+0.315
ived+0.286
ruck+0.282
umbs+0.279
icle+0.277
Neg logit contributions
 slams-0.231
 shop-0.221
 Elsa-0.216
 needs-0.214
 Jewel-0.213
Tokeninkle
Token ID19894
Feature activation+2.70e-01

Pos logit contributions
bridge+0.162
ived+0.146
ruck+0.144
umbs+0.144
ill+0.142
Neg logit contributions
 slams-0.117
 shop-0.113
 needs-0.110
 Elsa-0.109
 wrong-0.109
Token in
Token ID287
Feature activation+1.00e+00

Pos logit contributions
 slams+0.013
 shop+0.013
 needs+0.013
 wrong+0.013
 Elsa+0.013
Neg logit contributions
bridge-0.023
ived-0.021
umbs-0.021
ruck-0.021
ill-0.021
Token her
Token ID607
Feature activation-3.96e-01

Pos logit contributions
 slams+0.050
 needs+0.047
 Elsa+0.047
 Jewel+0.047
 shop+0.047
Neg logit contributions
bridge-0.086
ruck-0.076
ived-0.076
ill-0.076
umbs-0.076
Token mo
Token ID6941
Feature activation-4.05e+00

Pos logit contributions
bridge+0.029
ived+0.027
umbs+0.027
ruck+0.026
icles+0.026
Neg logit contributions
 shop-0.022
 slams-0.022
 wrong-0.021
 needs-0.021
 Elsa-0.021
Tokenat
Token ID265
Feature activation-2.63e+00

Pos logit contributions
bridge+0.243
ived+0.218
ruck+0.213
umbs+0.212
ill+0.207
Neg logit contributions
 slams-0.297
 shop-0.286
 tells-0.285
 wrong-0.284
 needs-0.282
Token and
Token ID290
Feature activation-4.31e-01

Pos logit contributions
bridge+0.243
ived+0.235
ruck+0.234
umbs+0.228
ull+0.227
Neg logit contributions
 shop-0.107
 needs-0.098
 store-0.095
 slams-0.095
 tells-0.095
Token a
Token ID257
Feature activation-3.12e-01

Pos logit contributions
bridge+0.036
ived+0.033
umbs+0.032
ruck+0.032
ill+0.032
Neg logit contributions
 slams-0.023
 shop-0.022
 needs-0.021
 Elsa-0.021
 tells-0.021
Token draw
Token ID3197
Feature activation-9.83e-01

Pos logit contributions
bridge+0.024
ived+0.022
umbs+0.022
ruck+0.022
ill+0.022
Neg logit contributions
 slams-0.018
 shop-0.018
 needs-0.017
 wrong-0.017
 Elsa-0.017
Tokenbridge
Token ID9458
Feature activation-7.27e+00

Pos logit contributions
ived+0.048
umbs+0.048
ruck+0.047
icle+0.045
urg+0.044
Neg logit contributions
 shop-0.078
 wrong-0.075
 store-0.073
 is-0.072
 old-0.072
Token."
Token ID526
Feature activation+2.40e-01

Pos logit contributions
bridge+0.734
ived+0.708
umbs+0.698
ruck+0.698
ull+0.694
Neg logit contributions
 shop-0.240
 wrong-0.220
 slams-0.218
 tells-0.209
 needs-0.208
Token\n
Token ID198
Feature activation-6.16e-01

Pos logit contributions
 slams+0.010
 needs+0.009
 wrong+0.009
 shop+0.009
 Jewel+0.008
Neg logit contributions
bridge-0.023
ill-0.021
umbs-0.021
ived-0.021
ruck-0.021
Token\n
Token ID198
Feature activation-5.47e-01

Pos logit contributions
bridge+0.054
ull+0.050
ruck+0.050
ill+0.050
umbs+0.050
Neg logit contributions
 slams-0.030
 needs-0.028
 wrong-0.027
 Jewel-0.026
ience-0.026
TokenMom
Token ID29252
Feature activation-1.52e-02

Pos logit contributions
bridge+0.038
ived+0.034
ruck+0.034
ill+0.034
umbs+0.033
Neg logit contributions
 slams-0.037
 needs-0.035
 wrong-0.035
 shop-0.035
 tells-0.034
Token said
Token ID531
Feature activation+3.21e-01

Pos logit contributions
bridge+0.002
ived+0.001
ruck+0.001
umbs+0.001
ill+0.001
Neg logit contributions
 slams-0.000
 shop-0.000
 needs-0.000
 wrong-0.000
 Elsa-0.000
TokenOne
Token ID3198
Feature activation+1.30e+00

Pos logit contributions
 slams+0.043
 shop+0.043
 wrong+0.039
 needs+0.039
 Jewel+0.039
Neg logit contributions
bridge-0.098
ived-0.091
umbs-0.089
ruck-0.089
ill-0.088
Token day
Token ID1110
Feature activation+1.89e-01

Pos logit contributions
 slams+0.071
 shop+0.067
 Elsa+0.066
 wrong+0.065
 Jewel+0.065
Neg logit contributions
bridge-0.105
ruck-0.096
ived-0.095
ull-0.094
umbs-0.094
Token,
Token ID11
Feature activation+9.94e-01

Pos logit contributions
 slams+0.007
 shop+0.007
 needs+0.006
 Elsa+0.006
 wrong+0.006
Neg logit contributions
bridge-0.019
ived-0.017
ruck-0.017
umbs-0.017
ill-0.017
Token a
Token ID257
Feature activation+5.19e-01

Pos logit contributions
 shop+0.040
 slams+0.039
 wrong+0.037
 Elsa+0.037
 Jewel+0.036
Neg logit contributions
bridge-0.096
ived-0.090
umbs-0.089
ruck-0.088
ill-0.088
Token mis
Token ID2984
Feature activation+1.77e-01

Pos logit contributions
 slams+0.032
 tells+0.029
 needs+0.028
 studies+0.028
 belongs+0.028
Neg logit contributions
bridge-0.038
ull-0.035
ruck-0.034
ived-0.034
icle-0.033
Tokenchie
Token ID3043
Feature activation-7.20e+00

Pos logit contributions
 shop+0.014
 Jewel+0.014
 slams+0.014
athy+0.014
 store+0.014
Neg logit contributions
bridge-0.009
icles-0.008
icle-0.008
ruck-0.007
umbs-0.007
Tokenvous
Token ID31222
Feature activation-3.51e+00

Pos logit contributions
bridge+0.412
icles+0.383
urg+0.373
icle+0.370
umbs+0.368
Neg logit contributions
 needs-0.533
 shop-0.530
 wrong-0.525
 slams-0.519
 tells-0.510
Token young
Token ID1862
Feature activation+1.48e+00

Pos logit contributions
bridge+0.272
ruck+0.249
ull+0.246
ived+0.244
icle+0.243
Neg logit contributions
 slams-0.205
 shop-0.187
 needs-0.185
 Elsa-0.184
 tells-0.183
Token boy
Token ID2933
Feature activation+3.98e-01

Pos logit contributions
 slams+0.091
 studies+0.082
 needs+0.081
 tells+0.081
 wrong+0.079
Neg logit contributions
bridge-0.110
ruck-0.104
ull-0.101
ived-0.097
icle-0.097
Token decided
Token ID3066
Feature activation+2.60e-01

Pos logit contributions
 slams+0.019
 shop+0.018
 Elsa+0.018
 Jewel+0.018
 wrong+0.018
Neg logit contributions
bridge-0.035
ived-0.032
ruck-0.031
ill-0.031
ull-0.031
Token to
Token ID284
Feature activation-2.19e-01

Pos logit contributions
 slams+0.018
 Elsa+0.017
 shop+0.017
 Jewel+0.017
 needs+0.016
Neg logit contributions
bridge-0.017
ruck-0.016
ived-0.016
ull-0.015
paper-0.015
Token<|endoftext|>
Token ID50256
Feature activation+2.55e+00

Pos logit contributions
ill+0.079
umbs+0.077
bridge+0.076
ived+0.072
ull+0.070
Neg logit contributions
 slams-0.060
 wrong-0.059
 needs-0.054
ience-0.053
 shop-0.051
TokenPeter
Token ID19727
Feature activation-6.51e-01

Pos logit contributions
 slams+0.124
 needs+0.109
ience+0.108
 wrong+0.106
 tells+0.105
Neg logit contributions
bridge-0.229
umbs-0.213
ill-0.212
ull-0.206
ived-0.205
Token was
Token ID373
Feature activation+1.07e+00

Pos logit contributions
bridge+0.045
ill+0.043
ruck+0.043
ull+0.042
ived+0.040
Neg logit contributions
 slams-0.040
 Jewel-0.039
 wrong-0.038
 Elsa-0.036
 Iris-0.035
Token a
Token ID257
Feature activation+8.62e-01

Pos logit contributions
 slams+0.055
 needs+0.053
 shop+0.053
 Jewel+0.053
 tells+0.052
Neg logit contributions
bridge-0.090
ived-0.082
ruck-0.080
ill-0.078
ull-0.078
Token mis
Token ID2984
Feature activation+2.30e-01

Pos logit contributions
 slams+0.052
 tells+0.049
 belongs+0.047
 Jewel+0.046
 wrong+0.045
Neg logit contributions
bridge-0.063
ull-0.056
ruck-0.055
 crashing-0.055
ived-0.055
Tokenchie
Token ID3043
Feature activation-7.15e+00

Pos logit contributions
 shop+0.020
 slams+0.020
 Jewel+0.019
 store+0.019
athy+0.019
Neg logit contributions
bridge-0.011
icles-0.009
icle-0.009
ruck-0.008
urg-0.008
Tokenvous
Token ID31222
Feature activation-3.47e+00

Pos logit contributions
bridge+0.376
ived+0.326
icles+0.323
umbs+0.318
icle+0.316
Neg logit contributions
 shop-0.581
 slams-0.573
 needs-0.569
 wrong-0.565
 tells-0.552
Token 3
Token ID513
Feature activation-3.47e-01

Pos logit contributions
bridge+0.267
ruck+0.247
ill+0.242
ived+0.242
icle+0.239
Neg logit contributions
 slams-0.200
 wrong-0.182
 tells-0.182
 needs-0.180
 shop-0.179
Token year
Token ID614
Feature activation+2.76e-01

Pos logit contributions
bridge+0.030
ived+0.027
ruck+0.027
ull+0.027
umbs+0.027
Neg logit contributions
 slams-0.017
 Elsa-0.016
 Jewel-0.016
 wrong-0.016
 needs-0.016
Token old
Token ID1468
Feature activation+4.91e+00

Pos logit contributions
 Elsa+0.009
 Jewel+0.009
 slams+0.009
 Hazel+0.009
 Midnight+0.009
Neg logit contributions
bridge-0.026
ruck-0.026
ived-0.025
castle-0.024
ill-0.024
Token who
Token ID508
Feature activation+1.19e+00

Pos logit contributions
 slams+0.246
 needs+0.220
 tells+0.220
 wrong+0.216
 Elsa+0.215
Neg logit contributions
bridge-0.418
ruck-0.392
ill-0.374
icle-0.372
ull-0.371
Token they
Token ID484
Feature activation-8.60e-01

Pos logit contributions
bridge+0.092
ived+0.084
ruck+0.083
umbs+0.083
ill+0.082
Neg logit contributions
 slams-0.048
 shop-0.046
 needs-0.045
 wrong-0.044
 Elsa-0.044
Token admired
Token ID29382
Feature activation-1.54e+00

Pos logit contributions
bridge+0.070
ived+0.064
umbs+0.064
icle+0.063
icles+0.063
Neg logit contributions
 shop-0.046
 slams-0.045
 tells-0.043
 needs-0.043
 wrong-0.042
Token Sarah
Token ID10490
Feature activation-1.86e+00

Pos logit contributions
bridge+0.136
ived+0.124
ruck+0.123
umbs+0.122
ill+0.121
Neg logit contributions
 slams-0.074
 shop-0.071
 needs-0.069
 wrong-0.068
 Elsa-0.068
Token's
Token ID338
Feature activation-2.21e-01

Pos logit contributions
bridge+0.177
ived+0.163
ruck+0.161
umbs+0.161
ill+0.160
Neg logit contributions
 slams-0.076
 shop-0.071
 needs-0.069
 wrong-0.068
 Elsa-0.067
Token sand
Token ID6450
Feature activation-4.39e+00

Pos logit contributions
umbs+0.018
bridge+0.018
icle+0.017
icles+0.017
ived+0.017
Neg logit contributions
 old-0.011
 shop-0.011
 wrong-0.011
 needs-0.011
 studies-0.011
Tokencastle
Token ID18676
Feature activation-7.15e+00

Pos logit contributions
umbs+0.196
icle+0.189
ived+0.188
bridge+0.183
ruck+0.182
Neg logit contributions
 shop-0.379
 store-0.355
 old-0.350
 slams-0.348
 wrong-0.348
Token.
Token ID13
Feature activation-1.41e+00

Pos logit contributions
bridge+0.673
ived+0.633
umbs+0.625
icle+0.623
ruck+0.621
Neg logit contributions
 shop-0.297
 slams-0.283
 wrong-0.276
 needs-0.264
 store-0.258
Token<|endoftext|>
Token ID50256
Feature activation+2.30e+00

Pos logit contributions
bridge+0.127
ill+0.119
umbs+0.118
ived+0.118
ruck+0.117
Neg logit contributions
 slams-0.063
 shop-0.058
 wrong-0.058
 needs-0.057
 Elsa-0.055
TokenOnce
Token ID7454
Feature activation-3.56e-01

Pos logit contributions
 slams+0.119
 needs+0.105
ience+0.103
 studies+0.101
 tells+0.101
Neg logit contributions
bridge-0.200
ill-0.186
umbs-0.186
ull-0.181
ived-0.179
Token upon
Token ID2402
Feature activation-1.47e+00

Pos logit contributions
bridge+0.022
ill+0.022
ived+0.021
ruck+0.020
ull+0.020
Neg logit contributions
 slams-0.025
 Jewel-0.024
 shop-0.024
 Elsa-0.023
 needs-0.023
Token a
Token ID257
Feature activation+7.89e-01

Pos logit contributions
bridge+0.143
ived+0.136
icle+0.132
icles+0.131
umbs+0.131
Neg logit contributions
 shop-0.053
 slams-0.051
 wrong-0.050
 Elsa-0.049
 needs-0.048
Token\n
Token ID198
Feature activation+5.49e-01

Pos logit contributions
 slams+0.055
 shop+0.051
 needs+0.050
 Elsa+0.048
 wrong+0.048
Neg logit contributions
bridge-0.144
ived-0.133
umbs-0.132
ill-0.132
ruck-0.131
Token\n
Token ID198
Feature activation+5.32e-01

Pos logit contributions
 slams+0.024
 needs+0.022
 Elsa+0.021
 wrong+0.021
 shop+0.021
Neg logit contributions
bridge-0.051
ill-0.047
ruck-0.047
ived-0.047
umbs-0.046
TokenWith
Token ID3152
Feature activation+1.63e+00

Pos logit contributions
 slams+0.036
 shop+0.035
 needs+0.034
 wrong+0.034
 Elsa+0.034
Neg logit contributions
bridge-0.037
ived-0.033
umbs-0.032
ruck-0.032
icle-0.032
Token a
Token ID257
Feature activation+3.19e-01

Pos logit contributions
 slams+0.093
 needs+0.089
 wrong+0.084
 shop+0.083
 belongs+0.082
Neg logit contributions
bridge-0.125
ived-0.115
ruck-0.113
icle-0.113
icles-0.111
Token mis
Token ID2984
Feature activation+1.42e-01

Pos logit contributions
 slams+0.017
 shop+0.016
 needs+0.015
 Elsa+0.015
 wrong+0.015
Neg logit contributions
bridge-0.027
ived-0.025
ruck-0.024
umbs-0.024
ill-0.024
Tokenchie
Token ID3043
Feature activation-7.14e+00

Pos logit contributions
 shop+0.012
 slams+0.012
 Jewel+0.012
 needs+0.012
 studies+0.011
Neg logit contributions
bridge-0.007
icles-0.006
icle-0.006
ruck-0.005
umbs-0.005
Tokenvous
Token ID31222
Feature activation-3.65e+00

Pos logit contributions
bridge+0.384
icles+0.341
umbs+0.335
icle+0.332
ived+0.332
Neg logit contributions
 needs-0.565
 shop-0.564
 wrong-0.562
 slams-0.555
 tells-0.551
Token gl
Token ID1278
Feature activation-7.45e-01

Pos logit contributions
bridge+0.278
ived+0.254
ruck+0.252
icle+0.245
ill+0.244
Neg logit contributions
 slams-0.214
 shop-0.199
 Elsa-0.198
 needs-0.196
 tells-0.195
Tokenint
Token ID600
Feature activation-3.36e+00

Pos logit contributions
bridge+0.047
umbs+0.046
ruck+0.043
icles+0.042
icle+0.041
Neg logit contributions
 slams-0.051
 wrong-0.050
 needs-0.048
 shop-0.047
 tells-0.047
Token in
Token ID287
Feature activation+1.42e+00

Pos logit contributions
bridge+0.234
umbs+0.221
ruck+0.209
ipples+0.205
ived+0.204
Neg logit contributions
 slams-0.213
 shop-0.212
 needs-0.211
 wrong-0.207
 tells-0.206
Token her
Token ID607
Feature activation+1.05e-01

Pos logit contributions
 slams+0.061
 shop+0.059
 needs+0.056
 wrong+0.055
 Elsa+0.055
Neg logit contributions
bridge-0.134
ived-0.123
umbs-0.122
ruck-0.122
icle-0.120
Token the
Token ID262
Feature activation-3.33e-01

Pos logit contributions
bridge+0.023
ruck+0.021
ill+0.020
ived+0.020
icle+0.020
Neg logit contributions
 slams-0.013
 needs-0.012
 shop-0.011
 studies-0.011
 tells-0.011
Token squirrel
Token ID33039
Feature activation+2.45e-01

Pos logit contributions
bridge+0.022
ruck+0.019
icle+0.019
ull+0.019
ived+0.019
Neg logit contributions
 slams-0.024
 Elsa-0.022
 tells-0.022
 Jewel-0.021
 Hazel-0.021
Token got
Token ID1392
Feature activation-1.50e+00

Pos logit contributions
 slams+0.015
 shop+0.014
 Jewel+0.014
 Elsa+0.014
 needs+0.014
Neg logit contributions
bridge-0.018
ived-0.017
umbs-0.016
ill-0.016
ruck-0.016
Token a
Token ID257
Feature activation-6.70e-01

Pos logit contributions
bridge+0.126
ived+0.121
ull+0.116
icle+0.116
icles+0.115
Neg logit contributions
 slams-0.074
 shop-0.071
 needs-0.069
 Elsa-0.068
 studies-0.067
Token mis
Token ID2984
Feature activation+5.17e-02

Pos logit contributions
bridge+0.060
ived+0.055
ruck+0.055
umbs+0.054
icle+0.054
Neg logit contributions
 slams-0.031
 shop-0.029
 Elsa-0.028
 needs-0.028
 Jewel-0.028
Tokenchie
Token ID3043
Feature activation-7.14e+00

Pos logit contributions
 shop+0.004
 slams+0.004
athy+0.004
 store+0.004
 wrong+0.004
Neg logit contributions
bridge-0.003
icles-0.002
icle-0.002
ruck-0.002
castle-0.002
Tokenvous
Token ID31222
Feature activation-4.11e+00

Pos logit contributions
bridge+0.418
icles+0.380
urg+0.372
paper+0.362
icle+0.360
Neg logit contributions
 wrong-0.531
 needs-0.530
 shop-0.519
 says-0.516
 tells-0.509
Token look
Token ID804
Feature activation-8.73e-04

Pos logit contributions
bridge+0.359
umbs+0.320
icles+0.316
ived+0.314
ill+0.312
Neg logit contributions
 shop-0.207
 slams-0.203
 wrong-0.202
 needs-0.198
 tells-0.193
Token in
Token ID287
Feature activation+1.02e+00

Pos logit contributions
bridge+0.000
ived+0.000
ruck+0.000
umbs+0.000
ill+0.000
Neg logit contributions
 slams-0.000
 shop-0.000
 Elsa-0.000
 Jewel-0.000
 needs-0.000
Token its
Token ID663
Feature activation-1.20e+00

Pos logit contributions
 slams+0.055
 shop+0.053
 needs+0.052
 Elsa+0.051
 wrong+0.051
Neg logit contributions
bridge-0.083
ived-0.075
ruck-0.075
umbs-0.074
ill-0.074
Token eye
Token ID4151
Feature activation-1.13e+00

Pos logit contributions
bridge+0.087
ruck+0.079
ived+0.079
icle+0.078
umbs+0.077
Neg logit contributions
 slams-0.073
 Elsa-0.071
 shop-0.069
 Jewel-0.068
 tells-0.068
Token The
Token ID383
Feature activation-6.31e-02

Pos logit contributions
 slams+0.102
 needs+0.097
 wrong+0.093
 belongs+0.090
 tells+0.088
Neg logit contributions
bridge-0.155
ived-0.146
icle-0.146
umbs-0.145
ill-0.145
Token little
Token ID1310
Feature activation-1.33e+00

Pos logit contributions
bridge+0.005
icle+0.004
ruck+0.004
ived+0.004
umbs+0.004
Neg logit contributions
 slams-0.004
athy-0.003
 belongs-0.003
 tells-0.003
azel-0.003
Token boy
Token ID2933
Feature activation-4.10e-01

Pos logit contributions
bridge+0.085
ill+0.078
icle+0.077
ived+0.076
ruck+0.075
Neg logit contributions
 slams-0.086
 studies-0.082
 tells-0.081
 owns-0.080
 wrong-0.079
Token smiled
Token ID13541
Feature activation+6.68e-02

Pos logit contributions
bridge+0.032
ived+0.031
ill+0.031
ull+0.030
icle+0.030
Neg logit contributions
 slams-0.020
 Elsa-0.019
 library-0.019
 wrong-0.018
 Jewel-0.018
Token mis
Token ID2984
Feature activation+8.89e-02

Pos logit contributions
 shop+0.002
 slams+0.002
 wrong+0.002
 Elsa+0.002
 old+0.002
Neg logit contributions
bridge-0.007
umbs-0.006
ruck-0.006
ived-0.006
icles-0.006
Tokenchie
Token ID3043
Feature activation-7.14e+00

Pos logit contributions
 slams+0.007
 shop+0.007
 Jewel+0.007
 needs+0.007
 wrong+0.007
Neg logit contributions
bridge-0.005
icles-0.004
ruck-0.004
icle-0.004
umbs-0.004
Tokenv
Token ID85
Feature activation-5.21e+00

Pos logit contributions
bridge+0.449
ived+0.387
umbs+0.385
icles+0.382
ruck+0.381
Neg logit contributions
 slams-0.513
 shop-0.509
 needs-0.506
 wrong-0.503
 tells-0.498
Tokenously
Token ID3481
Feature activation-4.19e+00

Pos logit contributions
bridge+0.286
paper+0.235
ruck+0.233
umbs+0.226
ived+0.225
Neg logit contributions
 shop-0.429
 slams-0.420
 wrong-0.407
 needs-0.405
 tells-0.405
Token and
Token ID290
Feature activation-5.21e-01

Pos logit contributions
bridge+0.356
ived+0.338
ruck+0.331
umbs+0.323
ill+0.322
Neg logit contributions
 slams-0.207
 Elsa-0.194
 shop-0.190
 Jewel-0.189
 needs-0.188
Token said
Token ID531
Feature activation+2.48e-01

Pos logit contributions
bridge+0.047
ived+0.044
ill+0.043
ruck+0.043
ull+0.042
Neg logit contributions
 slams-0.022
 needs-0.021
 wrong-0.021
 shop-0.021
 Jewel-0.020
Token,
Token ID11
Feature activation-1.24e-01

Pos logit contributions
 slams+0.009
 shop+0.009
 needs+0.009
 Elsa+0.008
 tells+0.008
Neg logit contributions
bridge-0.024
ived-0.022
ruck-0.022
umbs-0.022
ill-0.022
Token lesson
Token ID11483
Feature activation+4.09e-01

Pos logit contributions
 slams+0.078
 shop+0.078
 studies+0.072
athy+0.072
 needs+0.072
Neg logit contributions
bridge-0.098
ruck-0.093
ived-0.092
 crashing-0.091
ull-0.091
Token.
Token ID13
Feature activation-1.65e+00

Pos logit contributions
 slams+0.025
 Elsa+0.024
 shop+0.023
 Jewel+0.023
Charl+0.022
Neg logit contributions
bridge-0.030
ruck-0.029
icles-0.027
ived-0.027
ill-0.027
Token They
Token ID1119
Feature activation-1.51e-01

Pos logit contributions
bridge+0.122
ill+0.116
icle+0.112
ull+0.112
umbs+0.112
Neg logit contributions
 slams-0.098
 wrong-0.094
 needs-0.086
 Elsa-0.085
ience-0.085
Token never
Token ID1239
Feature activation-6.90e-01

Pos logit contributions
bridge+0.011
ill+0.010
ived+0.010
ruck+0.009
ull+0.009
Neg logit contributions
 Elsa-0.009
 Jewel-0.009
 slams-0.008
 spa-0.008
 therapist-0.008
Token d
Token ID288
Feature activation-3.80e+00

Pos logit contributions
bridge+0.053
ived+0.049
ruck+0.048
ill+0.048
umbs+0.047
Neg logit contributions
 slams-0.040
 shop-0.039
 Elsa-0.038
 Jewel-0.038
 wrong-0.037
Tokenived
Token ID1572
Feature activation-7.11e+00

Pos logit contributions
bridge+0.220
ruck+0.175
ull+0.172
ipples+0.161
icle+0.158
Neg logit contributions
 slams-0.309
 wrong-0.297
 shop-0.283
 Coco-0.283
 Elsa-0.281
Token into
Token ID656
Feature activation-7.84e-01

Pos logit contributions
bridge+0.600
ived+0.545
ruck+0.539
umbs+0.537
ill+0.532
Neg logit contributions
 slams-0.366
 shop-0.353
 needs-0.341
 Elsa-0.338
 wrong-0.338
Token the
Token ID262
Feature activation-7.44e-01

Pos logit contributions
bridge+0.070
ived+0.064
ruck+0.063
umbs+0.063
ill+0.062
Neg logit contributions
 slams-0.037
 shop-0.036
 needs-0.034
 wrong-0.034
 Elsa-0.034
Token pond
Token ID16723
Feature activation+8.31e-01

Pos logit contributions
bridge+0.055
ruck+0.049
icle+0.047
ull+0.047
umbs+0.047
Neg logit contributions
 Jewel-0.044
 slams-0.044
 Elsa-0.043
 tells-0.043
 Hazel-0.043
Token again
Token ID757
Feature activation-6.71e-01

Pos logit contributions
 slams+0.043
 shop+0.042
 wrong+0.040
 needs+0.040
 Elsa+0.040
Neg logit contributions
bridge-0.070
ived-0.063
ruck-0.063
umbs-0.062
icle-0.061
Token.
Token ID13
Feature activation-1.52e+00

Pos logit contributions
bridge+0.062
ived+0.057
ruck+0.056
umbs+0.056
ill+0.056
Neg logit contributions
 slams-0.030
 shop-0.028
 Elsa-0.027
 needs-0.027
 wrong-0.027
Token sweet
Token ID6029
Feature activation-1.95e-01

Pos logit contributions
bridge+0.025
ived+0.022
ruck+0.022
umbs+0.022
ill+0.021
Neg logit contributions
 slams-0.020
 shop-0.019
 needs-0.019
 tells-0.018
 Elsa-0.018
Token tune
Token ID14009
Feature activation-4.15e+00

Pos logit contributions
bridge+0.015
ived+0.014
ruck+0.014
icle+0.013
umbs+0.013
Neg logit contributions
 slams-0.012
 shop-0.011
 needs-0.011
 tells-0.011
 wrong-0.011
Token,
Token ID11
Feature activation+2.66e-01

Pos logit contributions
bridge+0.438
umbs+0.413
ruck+0.400
ipples+0.399
icles+0.397
Neg logit contributions
 shop-0.124
 slams-0.122
 wrong-0.117
 Elsa-0.116
 tells-0.108
Token a
Token ID257
Feature activation-2.89e-01

Pos logit contributions
 slams+0.013
 needs+0.012
 shop+0.012
 tells+0.012
 wrong+0.011
Neg logit contributions
bridge-0.023
ived-0.021
ill-0.021
ruck-0.021
icle-0.021
Token mis
Token ID2984
Feature activation+1.97e-01

Pos logit contributions
bridge+0.025
ived+0.022
ruck+0.022
ull+0.022
icle+0.022
Neg logit contributions
 slams-0.015
 needs-0.014
 tells-0.014
 shop-0.014
 Elsa-0.013
Tokenchie
Token ID3043
Feature activation-7.10e+00

Pos logit contributions
 shop+0.016
 Jewel+0.015
athy+0.015
 slams+0.015
 store+0.015
Neg logit contributions
bridge-0.010
icles-0.009
icle-0.008
ruck-0.008
castle-0.008
Tokenvous
Token ID31222
Feature activation-3.88e+00

Pos logit contributions
bridge+0.401
icles+0.375
urg+0.371
icle+0.360
umbs+0.359
Neg logit contributions
 needs-0.520
 wrong-0.519
 shop-0.513
 says-0.503
 tells-0.498
Token fox
Token ID21831
Feature activation+1.46e+00

Pos logit contributions
bridge+0.312
ived+0.284
ruck+0.283
icle+0.277
ill+0.276
Neg logit contributions
 slams-0.215
 shop-0.201
 needs-0.198
 Elsa-0.196
 tells-0.196
Token appeared
Token ID4120
Feature activation-4.32e+00

Pos logit contributions
 slams+0.102
 shop+0.098
 needs+0.097
 Elsa+0.096
 Jewel+0.096
Neg logit contributions
bridge-0.098
ived-0.087
ruck-0.084
ill-0.083
umbs-0.083
Token.
Token ID13
Feature activation-4.81e-01

Pos logit contributions
bridge+0.387
ived+0.351
umbs+0.350
ruck+0.348
icle+0.344
Neg logit contributions
 slams-0.200
 shop-0.195
 wrong-0.187
 needs-0.184
 Elsa-0.184
Token "
Token ID366
Feature activation-1.05e+00

Pos logit contributions
bridge+0.037
ill+0.034
ived+0.034
umbs+0.033
icle+0.033
Neg logit contributions
 slams-0.028
 needs-0.027
 wrong-0.026
 tells-0.025
 shop-0.025
Token day
Token ID1110
Feature activation+4.16e-01

Pos logit contributions
 slams+0.103
 Elsa+0.097
 Jewel+0.096
 shop+0.092
 tells+0.091
Neg logit contributions
bridge-0.126
ill-0.117
ived-0.117
ruck-0.117
ull-0.116
Token,
Token ID11
Feature activation+1.36e+00

Pos logit contributions
 slams+0.019
 shop+0.017
 needs+0.017
 Jewel+0.017
 tells+0.017
Neg logit contributions
bridge-0.038
ived-0.036
ruck-0.035
ill-0.034
icle-0.034
Token a
Token ID257
Feature activation+7.45e-01

Pos logit contributions
 slams+0.064
 shop+0.062
 needs+0.060
 wrong+0.060
 Elsa+0.059
Neg logit contributions
bridge-0.121
ived-0.111
umbs-0.109
ruck-0.109
ill-0.108
Token little
Token ID1310
Feature activation-2.39e-01

Pos logit contributions
 slams+0.049
 tells+0.043
 studies+0.041
 needs+0.041
 belongs+0.041
Neg logit contributions
bridge-0.052
ull-0.049
ruck-0.049
icle-0.047
ived-0.046
Token girl
Token ID2576
Feature activation+1.44e-02

Pos logit contributions
bridge+0.018
ruck+0.018
ived+0.017
icle+0.017
ull+0.017
Neg logit contributions
 slams-0.014
 tells-0.012
 wrong-0.012
 studies-0.012
 needs-0.012
Token named
Token ID3706
Feature activation+3.64e+00

Pos logit contributions
 slams+0.000
 wrong+0.000
 Jewel+0.000
 Elsa+0.000
 shop+0.000
Neg logit contributions
bridge-0.001
ruck-0.001
ull-0.001
ived-0.001
ill-0.001
Token Grace
Token ID16156
Feature activation+2.91e+00

Pos logit contributions
 slams+0.161
 shop+0.155
 needs+0.148
 wrong+0.146
 tells+0.143
Neg logit contributions
bridge-0.324
ived-0.305
ill-0.303
umbs-0.301
ruck-0.296
Token saw
Token ID2497
Feature activation-1.93e+00

Pos logit contributions
 slams+0.185
 wrong+0.178
 Elsa+0.174
 Jewel+0.174
 shop+0.173
Neg logit contributions
bridge-0.204
ruck-0.190
ived-0.187
ull-0.187
ill-0.185
Token a
Token ID257
Feature activation+7.05e-01

Pos logit contributions
bridge+0.138
ruck+0.130
ull+0.127
icle+0.120
ill+0.120
Neg logit contributions
 slams-0.121
 needs-0.119
 studies-0.116
ience-0.114
 Jewel-0.111
Token big
Token ID1263
Feature activation+2.97e-01

Pos logit contributions
 slams+0.045
 tells+0.042
annah+0.040
 needs+0.040
 says+0.040
Neg logit contributions
ruck-0.045
bridge-0.045
ull-0.044
castle-0.043
 crashing-0.042
Token yellow
Token ID7872
Feature activation-6.08e-02

Pos logit contributions
 slams+0.017
 Elsa+0.016
 tells+0.016
ience+0.015
 Jewel+0.015
Neg logit contributions
bridge-0.020
ruck-0.020
castle-0.020
icle-0.020
ull-0.019
Token a
Token ID257
Feature activation+1.10e+00

Pos logit contributions
 slams+0.053
 shop+0.053
 Elsa+0.051
 wrong+0.050
 Jewel+0.049
Neg logit contributions
bridge-0.128
ived-0.120
ruck-0.117
umbs-0.117
icles-0.116
Token little
Token ID1310
Feature activation-1.72e-01

Pos logit contributions
 slams+0.065
 tells+0.057
 needs+0.057
 Jewel+0.056
 studies+0.056
Neg logit contributions
bridge-0.083
ull-0.076
ruck-0.075
ived-0.072
castle-0.071
Token brown
Token ID7586
Feature activation+1.68e+00

Pos logit contributions
bridge+0.012
ruck+0.011
ull+0.011
icle+0.011
ived+0.010
Neg logit contributions
 slams-0.011
 tells-0.010
 studies-0.010
 wrong-0.010
 needs-0.010
Token dog
Token ID3290
Feature activation+1.90e+00

Pos logit contributions
 slams+0.094
 tells+0.086
 needs+0.085
 wrong+0.085
 Elsa+0.084
Neg logit contributions
bridge-0.134
ruck-0.125
icle-0.125
ull-0.122
icles-0.121
Token named
Token ID3706
Feature activation+3.73e+00

Pos logit contributions
ience+0.099
 slams+0.099
annah+0.092
 Hazel+0.091
 studies+0.091
Neg logit contributions
ull-0.143
bridge-0.138
ruck-0.132
icle-0.130
ill-0.125
Token Max
Token ID5436
Feature activation+3.12e+00

Pos logit contributions
 slams+0.192
 needs+0.176
 studies+0.176
 shop+0.174
 wrong+0.170
Neg logit contributions
icle-0.292
bridge-0.288
umbs-0.288
icles-0.287
ived-0.274
Token.
Token ID13
Feature activation+1.09e+00

Pos logit contributions
 slams+0.166
 shop+0.150
 Jewel+0.148
 Elsa+0.148
 needs+0.145
Neg logit contributions
bridge-0.256
ruck-0.241
ull-0.236
ill-0.234
icle-0.230
Token Max
Token ID5436
Feature activation+3.06e+00

Pos logit contributions
 slams+0.032
 shop+0.030
 Elsa+0.029
 needs+0.028
 Jewel+0.027
Neg logit contributions
bridge-0.118
ived-0.110
ruck-0.108
ill-0.107
umbs-0.107
Token liked
Token ID8288
Feature activation+2.38e+00

Pos logit contributions
 slams+0.212
 Jewel+0.189
 therapist+0.188
 spa+0.188
ience+0.186
Neg logit contributions
ull-0.174
bridge-0.172
paper-0.166
ruck-0.162
ill-0.148
Token to
Token ID284
Feature activation+1.95e-01

Pos logit contributions
 shop+0.119
 slams+0.118
 wrong+0.113
 tells+0.112
 needs+0.111
Neg logit contributions
bridge-0.211
ill-0.188
ived-0.187
umbs-0.186
ruck-0.182
Token dig
Token ID3100
Feature activation+3.21e-01

Pos logit contributions
 slams+0.010
 shop+0.009
 needs+0.009
 wrong+0.009
 Elsa+0.009
Neg logit contributions
bridge-0.017
ived-0.015
ruck-0.015
umbs-0.015
icle-0.015
Token so
Token ID523
Feature activation-4.36e-01

Pos logit contributions
 slams+0.008
 shop+0.007
 Elsa+0.007
 Jewel+0.007
 needs+0.007
Neg logit contributions
bridge-0.010
ived-0.009
ruck-0.009
umbs-0.009
ill-0.009
Token he
Token ID339
Feature activation-1.25e-01

Pos logit contributions
bridge+0.030
ived+0.028
icle+0.027
ill+0.027
ruck+0.026
Neg logit contributions
 slams-0.028
 needs-0.027
 Elsa-0.026
 Jewel-0.026
 shop-0.026
Token decided
Token ID3066
Feature activation-3.70e-01

Pos logit contributions
bridge+0.010
ived+0.009
ill+0.008
icle+0.008
ruck+0.008
Neg logit contributions
 slams-0.008
 shop-0.007
 wrong-0.007
 needs-0.007
 Elsa-0.007
Token to
Token ID284
Feature activation-5.28e-01

Pos logit contributions
bridge+0.023
ived+0.021
ruck+0.021
paper+0.021
ill+0.020
Neg logit contributions
 slams-0.027
 Elsa-0.025
 needs-0.025
 Jewel-0.025
 shop-0.024
Token climb
Token ID12080
Feature activation-2.02e-01

Pos logit contributions
bridge+0.038
icle+0.035
ill+0.034
ived+0.034
icles+0.034
Neg logit contributions
 slams-0.033
 Elsa-0.032
 wrong-0.031
 needs-0.031
 shop-0.030
Token up
Token ID510
Feature activation+3.69e+00

Pos logit contributions
bridge+0.017
ived+0.015
ruck+0.015
icle+0.015
icles+0.015
Neg logit contributions
 slams-0.011
 needs-0.010
 shop-0.010
 tells-0.010
 Elsa-0.010
Token the
Token ID262
Feature activation-2.78e-02

Pos logit contributions
 slams+0.168
 shop+0.161
 needs+0.156
 Elsa+0.155
 wrong+0.153
Neg logit contributions
bridge-0.333
ived-0.304
ruck-0.300
umbs-0.300
ill-0.298
Token tree
Token ID5509
Feature activation+2.44e+00

Pos logit contributions
bridge+0.002
umbs+0.002
ruck+0.002
ived+0.002
ill+0.002
Neg logit contributions
 slams-0.001
 Elsa-0.001
 tells-0.001
 shop-0.001
 needs-0.001
Token.
Token ID13
Feature activation-4.06e-01

Pos logit contributions
 slams+0.107
 shop+0.102
 needs+0.098
 Elsa+0.098
 wrong+0.097
Neg logit contributions
bridge-0.225
ived-0.206
umbs-0.204
ruck-0.204
ill-0.202
Token He
Token ID679
Feature activation-6.73e-01

Pos logit contributions
bridge+0.031
umbs+0.029
ived+0.029
ill+0.028
icle+0.028
Neg logit contributions
 slams-0.024
 wrong-0.023
 needs-0.023
 tells-0.022
 shop-0.022
Token worked
Token ID3111
Feature activation-9.03e-02

Pos logit contributions
bridge+0.049
ived+0.044
ruck+0.043
umbs+0.043
ill+0.043
Neg logit contributions
 slams-0.042
 shop-0.041
 needs-0.040
 wrong-0.040
 Elsa-0.040
Token.
Token ID13
Feature activation+1.60e-01

Pos logit contributions
 slams+0.024
 shop+0.022
 Elsa+0.022
 needs+0.022
 Jewel+0.021
Neg logit contributions
bridge-0.045
ived-0.041
ruck-0.041
umbs-0.041
ill-0.040
Token They
Token ID1119
Feature activation+8.17e-01

Pos logit contributions
 slams+0.011
 shop+0.011
 needs+0.011
 wrong+0.011
 Elsa+0.011
Neg logit contributions
bridge-0.011
ived-0.009
ruck-0.009
umbs-0.009
ill-0.009
Token hugged
Token ID41130
Feature activation+2.18e+00

Pos logit contributions
 slams+0.050
 shop+0.048
 needs+0.047
 Elsa+0.047
 Jewel+0.046
Neg logit contributions
bridge-0.060
ived-0.055
ruck-0.053
ill-0.053
umbs-0.052
Token and
Token ID290
Feature activation+2.40e-02

Pos logit contributions
 shop+0.110
 wrong+0.109
 needs+0.106
 tells+0.106
 slams+0.105
Neg logit contributions
bridge-0.202
umbs-0.181
ived-0.178
ipples-0.174
ill-0.174
Token made
Token ID925
Feature activation-1.54e+00

Pos logit contributions
 slams+0.001
 shop+0.001
 Jewel+0.001
 Elsa+0.001
 needs+0.001
Neg logit contributions
bridge-0.002
ruck-0.002
ived-0.002
icles-0.002
icle-0.002
Token up
Token ID510
Feature activation+3.76e+00

Pos logit contributions
bridge+0.139
ill+0.127
umbs+0.123
ived+0.119
icle+0.119
Neg logit contributions
 wrong-0.075
 tells-0.072
 says-0.072
 old-0.071
 shop-0.071
Token.
Token ID13
Feature activation+3.65e-03

Pos logit contributions
 slams+0.158
 wrong+0.156
 tells+0.152
 shop+0.151
 needs+0.145
Neg logit contributions
bridge-0.364
umbs-0.320
ived-0.320
icle-0.318
icles-0.316
Token Tim
Token ID5045
Feature activation-1.33e+00

Pos logit contributions
 slams+0.000
 shop+0.000
 needs+0.000
 wrong+0.000
 tells+0.000
Neg logit contributions
bridge-0.000
ived-0.000
umbs-0.000
ruck-0.000
ill-0.000
Tokenmy
Token ID1820
Feature activation-1.33e+00

Pos logit contributions
bridge+0.109
ived+0.097
umbs+0.096
ruck+0.095
icle+0.094
Neg logit contributions
 slams-0.074
 shop-0.073
 needs-0.070
 wrong-0.070
 tells-0.069
Token learned
Token ID4499
Feature activation+1.29e+00

Pos logit contributions
bridge+0.094
ived+0.090
ull+0.088
ill+0.086
ruck+0.085
Neg logit contributions
 slams-0.078
 Jewel-0.075
 Elsa-0.071
 wrong-0.071
 shop-0.070
Token that
Token ID326
Feature activation+1.59e+00

Pos logit contributions
 slams+0.062
 Elsa+0.061
 needs+0.059
 shop+0.058
athy+0.058
Neg logit contributions
ruck-0.106
icle-0.102
bridge-0.101
ull-0.100
icles-0.099
Token it
Token ID340
Feature activation+3.13e-01

Pos logit contributions
bridge+0.129
ruck+0.127
icle+0.123
ill+0.122
icles+0.122
Neg logit contributions
 slams-0.096
 Elsa-0.091
ience-0.086
athy-0.085
 Jewel-0.085
Token was
Token ID373
Feature activation-1.78e-01

Pos logit contributions
 slams+0.019
 Elsa+0.019
annah+0.018
 library+0.018
 aisle+0.018
Neg logit contributions
bridge-0.021
ull-0.019
icle-0.018
ill-0.018
ived-0.018
Token safer
Token ID14178
Feature activation+2.15e+00

Pos logit contributions
bridge+0.012
ruck+0.011
icle+0.011
ull+0.011
ill+0.011
Neg logit contributions
 slams-0.011
 studies-0.011
 shop-0.011
 aisle-0.011
 owns-0.010
Token inside
Token ID2641
Feature activation+2.05e+00

Pos logit contributions
 slams+0.134
 Elsa+0.126
 Jewel+0.123
 shop+0.121
 owns+0.120
Neg logit contributions
bridge-0.149
ruck-0.142
ill-0.137
paper-0.134
ull-0.132
Token the
Token ID262
Feature activation-6.16e-03

Pos logit contributions
 slams+0.104
 Jewel+0.095
 Elsa+0.095
 shop+0.094
 tells+0.093
Neg logit contributions
bridge-0.174
ruck-0.160
ill-0.158
ived-0.156
icles-0.155
Token case
Token ID1339
Feature activation+3.33e+00

Pos logit contributions
 crashing+0.000
ruck+0.000
paper+0.000
bridge+0.000
ived+0.000
Neg logit contributions
 slams-0.000
 tells-0.000
athy-0.000
azel-0.000
ience-0.000
Token.
Token ID13
Feature activation-1.49e-01

Pos logit contributions
 slams+0.176
 Jewel+0.167
 Elsa+0.163
athy+0.160
 tells+0.159
Neg logit contributions
bridge-0.279
ill-0.257
ived-0.244
ruck-0.241
umbs-0.239
Token
Token ID198
Feature activation-5.84e-02

Pos logit contributions
bridge+0.012
ruck+0.011
umbs+0.011
ived+0.011
ill+0.011
Neg logit contributions
 slams-0.008
 wrong-0.007
 Elsa-0.007
 needs-0.007
 shop-0.007
Token
Token ID198
Feature activation-7.32e-02

Pos logit contributions
bridge+0.005
ruck+0.005
umbs+0.005
ill+0.005
ived+0.005
Neg logit contributions
 slams-0.003
 needs-0.002
 Elsa-0.002
 wrong-0.002
 Jewel-0.002
TokenJohn
Token ID7554
Feature activation-2.30e+00

Pos logit contributions
bridge+0.004
ived+0.004
umbs+0.004
icle+0.004
ruck+0.004
Neg logit contributions
 shop-0.006
 slams-0.006
 wrong-0.005
 needs-0.005
 tells-0.005
Token smiled
Token ID13541
Feature activation-4.46e-01

Pos logit contributions
bridge+0.163
ill+0.158
ived+0.155
ull+0.148
ruck+0.146
Neg logit contributions
 slams-0.137
 Jewel-0.135
 wrong-0.132
 Elsa-0.124
 shop-0.123
Token,
Token ID11
Feature activation+2.65e-01

Pos logit contributions
 slams+0.020
 needs+0.019
 shop+0.018
 Elsa+0.018
 tells+0.018
Neg logit contributions
bridge-0.032
ill-0.029
ruck-0.029
icles-0.029
icle-0.029
Token but
Token ID475
Feature activation-1.55e+00

Pos logit contributions
 slams+0.015
 shop+0.015
 needs+0.014
 wrong+0.014
 Elsa+0.014
Neg logit contributions
bridge-0.021
ived-0.019
ruck-0.019
umbs-0.019
ill-0.018
Token it
Token ID340
Feature activation+4.36e-01

Pos logit contributions
bridge+0.110
ill+0.108
icles+0.103
icle+0.102
ived+0.100
Neg logit contributions
 slams-0.086
 needs-0.082
 studies-0.079
 tells-0.079
Charl-0.078
Token was
Token ID373
Feature activation+2.36e-01

Pos logit contributions
 slams+0.023
 Elsa+0.022
 studies+0.021
 owns+0.021
 shop+0.021
Neg logit contributions
bridge-0.034
ill-0.032
icle-0.031
ull-0.031
ived-0.031
Token too
Token ID1165
Feature activation+2.02e+00

Pos logit contributions
 slams+0.014
 shop+0.014
 owns+0.014
 tells+0.014
 studies+0.013
Neg logit contributions
bridge-0.016
icle-0.015
ill-0.015
ruck-0.015
icles-0.014
Token expensive
Token ID5789
Feature activation+2.60e+00

Pos logit contributions
 slams+0.106
 Jewel+0.105
 Elsa+0.101
 tells+0.097
 Hazel+0.096
Neg logit contributions
bridge-0.154
ill-0.151
ruck-0.149
ull-0.149
umbs-0.146
Token for
Token ID329
Feature activation-1.10e-01

Pos logit contributions
 slams+0.154
 Elsa+0.143
 tells+0.138
 Jewel+0.137
athy+0.136
Neg logit contributions
bridge-0.189
ruck-0.186
icles-0.180
paper-0.178
ill-0.178
Token her
Token ID607
Feature activation-3.63e-01

Pos logit contributions
bridge+0.009
ruck+0.008
ived+0.008
icles+0.008
umbs+0.008
Neg logit contributions
 slams-0.006
 needs-0.006
 shop-0.005
 tells-0.005
 wrong-0.005
Token.
Token ID13
Feature activation-3.60e-01

Pos logit contributions
bridge+0.031
ived+0.028
ruck+0.028
ill+0.028
icle+0.028
Neg logit contributions
 slams-0.018
 Elsa-0.017
 shop-0.017
 Jewel-0.016
 tells-0.016
Token But
Token ID887
Feature activation+8.79e-01

Pos logit contributions
bridge+0.027
ill+0.027
umbs+0.026
ived+0.025
icles+0.025
Neg logit contributions
 slams-0.020
 needs-0.019
 wrong-0.019
ience-0.018
 tells-0.018
Token,
Token ID11
Feature activation+3.39e-01

Pos logit contributions
 slams+0.049
 needs+0.047
 owns+0.046
 tells+0.046
 studies+0.045
Neg logit contributions
bridge-0.060
ill-0.060
icle-0.058
icles-0.058
ived-0.056
Token he
Token ID339
Feature activation+5.12e-01

Pos logit contributions
 slams+0.111
 Elsa+0.100
 Jewel+0.099
 shop+0.099
athy+0.099
Neg logit contributions
bridge-0.132
ruck-0.122
ull-0.117
icle-0.116
ill-0.115
Token was
Token ID373
Feature activation+4.17e-01

Pos logit contributions
 slams+0.032
 Elsa+0.031
 Jewel+0.031
 Hazel+0.031
Charl+0.030
Neg logit contributions
bridge-0.034
ruck-0.033
ill-0.032
ived-0.032
ull-0.032
Token jealous
Token ID19354
Feature activation+4.05e-04

Pos logit contributions
 slams+0.023
 therapist+0.022
athy+0.022
ience+0.022
 owns+0.022
Neg logit contributions
bridge-0.031
ruck-0.030
ull-0.029
ived-0.029
ill-0.028
Token of
Token ID286
Feature activation-2.00e+00

Pos logit contributions
 slams+0.000
 shop+0.000
 Elsa+0.000
 needs+0.000
 Jewel+0.000
Neg logit contributions
bridge-0.000
ived-0.000
ruck-0.000
umbs-0.000
ill-0.000
Token Johnny
Token ID15470
Feature activation-2.64e-01

Pos logit contributions
bridge+0.133
ruck+0.124
icle+0.120
icles+0.120
ill+0.116
Neg logit contributions
 slams-0.135
 shop-0.128
 Elsa-0.128
 needs-0.125
 Jewel-0.124
Token's
Token ID338
Feature activation+8.30e-01

Pos logit contributions
ruck+0.019
bridge+0.019
ill+0.018
paper+0.018
icle+0.018
Neg logit contributions
 slams-0.014
 Hazel-0.013
Charl-0.013
athy-0.013
 Jewel-0.013
Token new
Token ID649
Feature activation+8.42e-01

Pos logit contributions
 slams+0.049
 Elsa+0.046
 shop+0.045
 Jewel+0.045
 Hazel+0.044
Neg logit contributions
bridge-0.062
ived-0.058
ruck-0.057
ill-0.056
paper-0.055
Token basketball
Token ID9669
Feature activation+6.26e-01

Pos logit contributions
 slams+0.050
 Elsa+0.047
 shop+0.047
 Jewel+0.046
 needs+0.046
Neg logit contributions
bridge-0.064
ived-0.058
ruck-0.057
ill-0.057
icle-0.056
Token.
Token ID13
Feature activation+5.67e-01

Pos logit contributions
 slams+0.033
 Elsa+0.031
 Hazel+0.030
athy+0.030
 Jewel+0.029
Neg logit contributions
bridge-0.051
ived-0.046
ill-0.046
ruck-0.046
paper-0.046
Token His
Token ID2399
Feature activation+6.44e-02

Pos logit contributions
 slams+0.034
 shop+0.033
 needs+0.032
 wrong+0.032
 Elsa+0.031
Neg logit contributions
bridge-0.044
ived-0.039
umbs-0.039
ruck-0.039
ill-0.038
Token mom
Token ID1995
Feature activation+1.38e+00

Pos logit contributions
 slams+0.003
 shop+0.003
 Elsa+0.003
 needs+0.003
 Jewel+0.003
Neg logit contributions
bridge-0.006
ived-0.005
ruck-0.005
icle-0.005
ill-0.005
Tokenling
Token ID1359
Feature activation-1.15e+00

Pos logit contributions
 slams+0.083
 shop+0.081
 wrong+0.080
 needs+0.080
 tells+0.078
Neg logit contributions
bridge-0.092
ruck-0.082
ived-0.081
icle-0.081
umbs-0.081
Token.
Token ID13
Feature activation-2.30e-01

Pos logit contributions
bridge+0.104
ived+0.095
ruck+0.094
umbs+0.094
ill+0.093
Neg logit contributions
 slams-0.052
 shop-0.050
 needs-0.048
 wrong-0.047
 Elsa-0.047
Token The
Token ID383
Feature activation-1.21e-01

Pos logit contributions
bridge+0.013
umbs+0.013
icles+0.012
ived+0.012
icle+0.012
Neg logit contributions
 slams-0.016
 wrong-0.015
athy-0.015
Charl-0.015
 Elsa-0.015
Token seal
Token ID13810
Feature activation+1.82e+00

Pos logit contributions
bridge+0.008
ived+0.007
ull+0.007
ruck+0.007
 crashing+0.007
Neg logit contributions
 slams-0.009
annah-0.007
 studies-0.007
athy-0.007
Charl-0.007
Token was
Token ID373
Feature activation+4.55e-02

Pos logit contributions
 slams+0.089
 shop+0.084
 Elsa+0.083
 Jewel+0.082
 needs+0.081
Neg logit contributions
bridge-0.158
ived-0.147
ill-0.142
umbs-0.141
ull-0.141
Token scared
Token ID12008
Feature activation+7.56e-01

Pos logit contributions
 slams+0.002
 shop+0.002
 needs+0.002
 owns+0.002
 Elsa+0.002
Neg logit contributions
bridge-0.004
ived-0.004
ull-0.003
ruck-0.003
 crashing-0.003
Token.
Token ID13
Feature activation-2.92e-01

Pos logit contributions
 slams+0.031
 shop+0.030
 wrong+0.030
 needs+0.029
 Elsa+0.029
Neg logit contributions
bridge-0.073
umbs-0.065
ived-0.065
ruck-0.065
icle-0.065
Token
Token ID220
Feature activation+5.90e-01

Pos logit contributions
bridge+0.018
umbs+0.017
ived+0.017
icles+0.017
ill+0.017
Neg logit contributions
 slams-0.020
athy-0.018
 Elsa-0.018
 wrong-0.018
ience-0.018
Token
Token ID198
Feature activation-3.83e-01

Pos logit contributions
 slams+0.027
 shop+0.024
 needs+0.024
 Elsa+0.024
 Jewel+0.023
Neg logit contributions
bridge-0.053
umbs-0.050
ill-0.050
ived-0.050
icles-0.048
Token
Token ID198
Feature activation-3.69e-01

Pos logit contributions
bridge+0.033
ill+0.031
ived+0.031
umbs+0.031
ull+0.030
Neg logit contributions
 slams-0.019
 needs-0.018
 Elsa-0.017
 wrong-0.017
 Jewel-0.016
TokenThe
Token ID464
Feature activation-1.26e-01

Pos logit contributions
bridge+0.026
ived+0.023
ruck+0.022
umbs+0.022
ill+0.022
Neg logit contributions
 slams-0.025
 Elsa-0.024
 needs-0.023
 tells-0.023
 shop-0.023
Token.
Token ID13
Feature activation-3.80e-01

Pos logit contributions
bridge+0.089
umbs+0.082
ived+0.082
ruck+0.082
ull+0.081
Neg logit contributions
 shop-0.030
 slams-0.030
 wrong-0.029
 needs-0.028
 Elsa-0.026
Token
Token ID198
Feature activation-3.58e-01

Pos logit contributions
bridge+0.033
umbs+0.032
ill+0.031
ived+0.031
icle+0.030
Neg logit contributions
 slams-0.017
 needs-0.017
 wrong-0.016
 shop-0.016
ience-0.016
Token
Token ID198
Feature activation-3.37e-01

Pos logit contributions
bridge+0.030
umbs+0.029
ill+0.029
ived+0.028
ruck+0.028
Neg logit contributions
 slams-0.018
 needs-0.017
 wrong-0.016
 shop-0.016
ience-0.016
Token"
Token ID1
Feature activation-7.98e-01

Pos logit contributions
bridge+0.027
ived+0.024
umbs+0.023
ruck+0.023
ill+0.023
Neg logit contributions
 slams-0.020
 needs-0.019
 shop-0.019
 tells-0.018
 Elsa-0.018
TokenWow
Token ID22017
Feature activation-1.68e+00

Pos logit contributions
bridge+0.060
ived+0.058
urg+0.058
ruck+0.055
umbs+0.055
Neg logit contributions
 shop-0.047
 Jewel-0.045
 Elsa-0.044
 slams-0.044
 store-0.043
Token,
Token ID11
Feature activation+1.80e-01

Pos logit contributions
bridge+0.161
ived+0.147
ruck+0.147
umbs+0.146
icle+0.145
Neg logit contributions
 slams-0.068
 shop-0.066
 wrong-0.063
 needs-0.063
 Elsa-0.062
Token you
Token ID345
Feature activation+6.96e-01

Pos logit contributions
 slams+0.007
 shop+0.007
 needs+0.007
 studies+0.007
 Elsa+0.007
Neg logit contributions
bridge-0.017
ruck-0.015
icle-0.015
ived-0.015
icles-0.015
Token are
Token ID389
Feature activation-8.06e-01

Pos logit contributions
 slams+0.035
 needs+0.035
 shop+0.034
 wrong+0.034
 tells+0.033
Neg logit contributions
bridge-0.059
ived-0.054
ruck-0.054
icles-0.054
icle-0.053
Token good
Token ID922
Feature activation+1.06e-01

Pos logit contributions
bridge+0.073
ived+0.068
umbs+0.067
ruck+0.067
icle+0.066
Neg logit contributions
 slams-0.037
 shop-0.036
 wrong-0.035
 needs-0.034
 Jewel-0.034
Token at
Token ID379
Feature activation-2.12e+00

Pos logit contributions
 slams+0.005
 shop+0.005
 wrong+0.005
 tells+0.005
 needs+0.005
Neg logit contributions
bridge-0.009
umbs-0.008
ill-0.008
icle-0.008
icles-0.008
Token knots
Token ID33620
Feature activation+1.37e+00

Pos logit contributions
bridge+0.145
ived+0.133
icle+0.130
ruck+0.130
ill+0.126
Neg logit contributions
 slams-0.137
 needs-0.131
 shop-0.131
 Elsa-0.130
 tells-0.129
Token rock
Token ID3881
Feature activation+3.40e-01

Pos logit contributions
 slams+0.008
 Elsa+0.008
 tells+0.008
 Jewel+0.008
 needs+0.007
Neg logit contributions
bridge-0.009
ruck-0.009
icle-0.009
icles-0.008
ull-0.008
Token.
Token ID13
Feature activation-6.60e-01

Pos logit contributions
 slams+0.013
 shop+0.013
 wrong+0.012
 needs+0.012
 tells+0.012
Neg logit contributions
bridge-0.033
ived-0.031
ruck-0.031
umbs-0.030
icle-0.030
Token It
Token ID632
Feature activation-1.34e+00

Pos logit contributions
bridge+0.046
ill+0.045
ived+0.044
umbs+0.044
icle+0.043
Neg logit contributions
 slams-0.039
 needs-0.039
 wrong-0.039
ience-0.038
 shop-0.037
Token was
Token ID373
Feature activation-6.88e-02

Pos logit contributions
bridge+0.113
ived+0.103
ruck+0.101
umbs+0.100
icle+0.100
Neg logit contributions
 slams-0.068
 shop-0.066
 Elsa-0.064
 needs-0.063
 Jewel-0.063
Token not
Token ID407
Feature activation+1.82e+00

Pos logit contributions
bridge+0.006
ived+0.006
ruck+0.005
umbs+0.005
icle+0.005
Neg logit contributions
 slams-0.003
 shop-0.003
 needs-0.003
 Elsa-0.003
 Jewel-0.003
Token nice
Token ID3621
Feature activation-4.26e-01

Pos logit contributions
 wrong+0.074
 old+0.073
 slams+0.060
 needs+0.059
 shop+0.058
Neg logit contributions
bridge-0.180
umbs-0.178
ived-0.175
icles-0.169
ill-0.168
Token.
Token ID13
Feature activation-6.06e-01

Pos logit contributions
bridge+0.042
umbs+0.037
ull+0.036
ived+0.036
icle+0.036
Neg logit contributions
 wrong-0.018
 shop-0.017
 slams-0.017
 needs-0.017
 tells-0.016
Token
Token ID198
Feature activation-4.80e-01

Pos logit contributions
bridge+0.045
ill+0.042
umbs+0.041
ull+0.040
ived+0.040
Neg logit contributions
 slams-0.036
 needs-0.035
 wrong-0.035
ience-0.034
 shop-0.034
Token
Token ID198
Feature activation-4.23e-01

Pos logit contributions
bridge+0.038
ull+0.037
ill+0.036
ruck+0.035
umbs+0.035
Neg logit contributions
 slams-0.027
 needs-0.025
 Elsa-0.025
ience-0.024
 Jewel-0.024
TokenA
Token ID32
Feature activation-1.49e-01

Pos logit contributions
bridge+0.027
ill+0.024
ived+0.023
ruck+0.023
ull+0.023
Neg logit contributions
 slams-0.031
 needs-0.030
 Elsa-0.029
 tells-0.029
 wrong-0.029
Token wise
Token ID10787
Feature activation+3.24e+00

Pos logit contributions
bridge+0.013
ived+0.012
ull+0.012
ruck+0.012
icle+0.012
Neg logit contributions
 slams-0.007
 needs-0.007
 Elsa-0.006
 Jewel-0.006
 tells-0.006
Token
Token ID198
Feature activation-4.55e-02

Pos logit contributions
bridge+0.004
ived+0.004
ill+0.004
umbs+0.004
ruck+0.004
Neg logit contributions
 slams-0.002
 needs-0.002
 shop-0.002
 wrong-0.002
 Elsa-0.002
Token"
Token ID1
Feature activation-5.34e-01

Pos logit contributions
bridge+0.004
ived+0.003
ruck+0.003
umbs+0.003
ill+0.003
Neg logit contributions
 slams-0.002
 shop-0.002
 needs-0.002
 Elsa-0.002
 tells-0.002
TokenWow
Token ID22017
Feature activation-1.36e+00

Pos logit contributions
bridge+0.042
ived+0.037
ruck+0.037
umbs+0.037
ill+0.036
Neg logit contributions
 slams-0.032
 shop-0.031
 needs-0.030
 Elsa-0.030
 wrong-0.029
Token,
Token ID11
Feature activation+5.09e-01

Pos logit contributions
bridge+0.129
ived+0.119
ruck+0.118
umbs+0.117
ill+0.116
Neg logit contributions
 slams-0.056
 shop-0.053
 needs-0.051
 Elsa-0.050
 wrong-0.050
Token a
Token ID257
Feature activation-1.60e-01

Pos logit contributions
 slams+0.020
 shop+0.018
 needs+0.018
 Elsa+0.018
 studies+0.018
Neg logit contributions
bridge-0.048
ruck-0.043
icle-0.043
ived-0.043
icles-0.043
Token monkey
Token ID21657
Feature activation-1.44e+00

Pos logit contributions
bridge+0.013
ived+0.012
umbs+0.012
ill+0.012
ruck+0.012
Neg logit contributions
 slams-0.009
 shop-0.009
 needs-0.008
 wrong-0.008
 tells-0.008
Token!"
Token ID2474
Feature activation-1.72e+00

Pos logit contributions
bridge+0.113
ruck+0.098
ived+0.097
umbs+0.095
ill+0.095
Neg logit contributions
 slams-0.083
 Elsa-0.080
 Jewel-0.077
 shop-0.077
athy-0.075
Token Tom
Token ID4186
Feature activation-2.25e-01

Pos logit contributions
bridge+0.134
ived+0.123
ill+0.119
umbs+0.118
ruck+0.118
Neg logit contributions
 slams-0.091
 needs-0.089
 Jewel-0.085
 studies-0.085
 shop-0.085
Token said
Token ID531
Feature activation+3.66e-01

Pos logit contributions
bridge+0.025
ived+0.023
ruck+0.023
umbs+0.023
ill+0.023
Neg logit contributions
 slams-0.006
 shop-0.005
 needs-0.005
 wrong-0.005
 Elsa-0.005
Token.
Token ID13
Feature activation+7.54e-02

Pos logit contributions
 slams+0.013
 shop+0.012
 wrong+0.012
 Elsa+0.011
 tells+0.011
Neg logit contributions
bridge-0.038
ruck-0.035
umbs-0.034
ived-0.034
icles-0.034
Token "
Token ID366
Feature activation-7.20e-01

Pos logit contributions
 shop+0.004
 slams+0.004
 tells+0.003
 Elsa+0.003
 wrong+0.003
Neg logit contributions
bridge-0.007
ruck-0.006
ived-0.006
umbs-0.006
ill-0.006
Token red
Token ID2266
Feature activation-7.03e-01

Pos logit contributions
 slams+0.102
 shop+0.098
 needs+0.095
 Elsa+0.094
 tells+0.093
Neg logit contributions
bridge-0.178
ived-0.162
ruck-0.160
umbs-0.159
icle-0.159
Token and
Token ID290
Feature activation+1.33e-01

Pos logit contributions
bridge+0.067
ived+0.062
umbs+0.061
ruck+0.060
ull+0.060
Neg logit contributions
 slams-0.029
 shop-0.029
 wrong-0.027
 needs-0.027
 Elsa-0.026
Token round
Token ID2835
Feature activation+8.98e-01

Pos logit contributions
 slams+0.009
athy+0.008
 shop+0.008
 studies+0.008
 clinic+0.008
Neg logit contributions
bridge-0.009
paper-0.008
 crashing-0.008
icles-0.008
icle-0.008
Token and
Token ID290
Feature activation+1.30e-01

Pos logit contributions
 slams+0.035
 shop+0.035
 wrong+0.033
 needs+0.033
 Elsa+0.031
Neg logit contributions
bridge-0.087
ived-0.081
umbs-0.080
ruck-0.079
ill-0.078
Token boun
Token ID31283
Feature activation+1.60e+00

Pos logit contributions
 slams+0.007
athy+0.007
 hospital+0.006
 shop+0.006
 clinic+0.006
Neg logit contributions
bridge-0.010
paper-0.009
icles-0.009
 crashing-0.009
icle-0.009
Tokency
Token ID948
Feature activation-1.23e+00

Pos logit contributions
 slams+0.088
 shop+0.085
 needs+0.083
 wrong+0.081
 Elsa+0.080
Neg logit contributions
bridge-0.131
ived-0.118
ruck-0.117
umbs-0.116
icle-0.114
Token.
Token ID13
Feature activation+2.28e-01

Pos logit contributions
bridge+0.120
ruck+0.110
ived+0.109
umbs+0.109
icle+0.109
Neg logit contributions
 slams-0.047
 shop-0.046
 needs-0.044
 wrong-0.044
 Elsa-0.042
Token They
Token ID1119
Feature activation+9.61e-01

Pos logit contributions
 slams+0.012
 needs+0.012
 shop+0.011
 wrong+0.011
 studies+0.011
Neg logit contributions
bridge-0.018
ived-0.018
umbs-0.017
ill-0.017
icle-0.016
Token want
Token ID765
Feature activation-5.34e-01

Pos logit contributions
 slams+0.062
 wrong+0.062
 Jewel+0.061
 shop+0.061
 Elsa+0.061
Neg logit contributions
bridge-0.062
ill-0.059
ived-0.057
paper-0.054
ruck-0.054
Token to
Token ID284
Feature activation-1.97e-01

Pos logit contributions
bridge+0.038
ived+0.035
ruck+0.035
icle+0.034
ill+0.034
Neg logit contributions
 slams-0.034
 shop-0.032
 needs-0.032
 Elsa-0.031
 Jewel-0.031
Token play
Token ID711
Feature activation+9.07e-01

Pos logit contributions
bridge+0.019
ived+0.017
ruck+0.017
umbs+0.017
ull+0.016
Neg logit contributions
 slams-0.009
 shop-0.008
 tells-0.008
 store-0.008
 needs-0.008
Token his
Token ID465
Feature activation-1.05e+00

Pos logit contributions
bridge+0.051
ived+0.046
umbs+0.046
urg+0.044
icles+0.044
Neg logit contributions
 slams-0.030
 shop-0.030
 needs-0.029
 wrong-0.029
 tells-0.028
Token hard
Token ID1327
Feature activation+1.19e+00

Pos logit contributions
bridge+0.085
ived+0.077
ruck+0.076
umbs+0.075
icle+0.074
Neg logit contributions
 slams-0.058
 shop-0.055
 Elsa-0.054
 Jewel-0.053
 wrong-0.053
Token work
Token ID670
Feature activation-1.78e-01

Pos logit contributions
 needs+0.053
 shop+0.049
 slams+0.049
 studies+0.047
 wrong+0.047
Neg logit contributions
bridge-0.111
umbs-0.102
ill-0.100
icle-0.100
ived-0.099
Token was
Token ID373
Feature activation+3.40e-02

Pos logit contributions
bridge+0.016
umbs+0.014
ived+0.014
ruck+0.014
ill+0.014
Neg logit contributions
 shop-0.008
 slams-0.008
 wrong-0.008
 needs-0.008
 Elsa-0.008
Token ruined
Token ID20484
Feature activation-1.58e+00

Pos logit contributions
 wrong+0.002
 slams+0.002
 shop+0.002
 needs+0.002
 old+0.002
Neg logit contributions
bridge-0.003
umbs-0.003
ived-0.003
ill-0.003
ruck-0.003
Token.
Token ID13
Feature activation-8.38e-01

Pos logit contributions
bridge+0.155
ived+0.144
umbs+0.142
ruck+0.139
icle+0.138
Neg logit contributions
 shop-0.060
 slams-0.060
 wrong-0.058
 needs-0.055
 Elsa-0.053
Token\n
Token ID198
Feature activation-7.39e-01

Pos logit contributions
bridge+0.064
umbs+0.061
ill+0.061
icle+0.060
ived+0.059
Neg logit contributions
 slams-0.047
 wrong-0.045
 needs-0.044
 shop-0.042
 tells-0.042
Token\n
Token ID198
Feature activation-6.80e-01

Pos logit contributions
bridge+0.066
ull+0.062
umbs+0.061
ill+0.061
ived+0.060
Neg logit contributions
 slams-0.036
 needs-0.033
 wrong-0.031
 shop-0.031
 Elsa-0.031
TokenThe
Token ID464
Feature activation-5.24e-01

Pos logit contributions
bridge+0.049
ived+0.044
ruck+0.044
umbs+0.043
ill+0.043
Neg logit contributions
 slams-0.043
 shop-0.042
 needs-0.041
 Elsa-0.041
 wrong-0.041
Token little
Token ID1310
Feature activation-1.62e+00

Pos logit contributions
bridge+0.041
ruck+0.036
ived+0.034
ull+0.033
icle+0.033
Neg logit contributions
 slams-0.032
 tells-0.029
 Jewel-0.028
 Hazel-0.028
 Elsa-0.028
Token boy
Token ID2933
Feature activation-6.96e-01

Pos logit contributions
bridge+0.132
ruck+0.119
ived+0.119
ill+0.117
icle+0.116
Neg logit contributions
 slams-0.089
 tells-0.082
 shop-0.081
 wrong-0.081
 Jewel-0.080
Token in
Token ID287
Feature activation+1.08e+00

Pos logit contributions
 slams+0.092
 Elsa+0.086
 shop+0.085
athy+0.084
 Jewel+0.083
Neg logit contributions
bridge-0.148
ived-0.142
ruck-0.136
paper-0.134
ill-0.133
Token its
Token ID663
Feature activation-1.19e+00

Pos logit contributions
 shop+0.065
 slams+0.062
 tells+0.060
 store+0.059
 Jewel+0.059
Neg logit contributions
bridge-0.087
ived-0.079
umbs-0.077
ipples-0.075
icles-0.074
Token be
Token ID307
Feature activation-1.01e+00

Pos logit contributions
bridge+0.090
ived+0.081
ruck+0.080
umbs+0.080
ill+0.079
Neg logit contributions
 slams-0.072
 shop-0.070
 Elsa-0.067
 needs-0.067
 wrong-0.067
Tokenak
Token ID461
Feature activation-1.70e+00

Pos logit contributions
bridge+0.061
ived+0.054
ruck+0.053
umbs+0.053
icle+0.052
Neg logit contributions
 slams-0.075
 shop-0.073
 wrong-0.072
 needs-0.072
 Elsa-0.071
Token.
Token ID13
Feature activation+6.79e-02

Pos logit contributions
bridge+0.157
ived+0.144
ruck+0.142
umbs+0.142
ill+0.141
Neg logit contributions
 slams-0.075
 shop-0.071
 needs-0.069
 Elsa-0.069
 tells-0.068
Token It
Token ID632
Feature activation-1.42e+00

Pos logit contributions
 slams+0.004
 shop+0.004
 needs+0.004
 wrong+0.003
 Elsa+0.003
Neg logit contributions
bridge-0.006
ived-0.005
ruck-0.005
umbs-0.005
ill-0.005
Token put
Token ID1234
Feature activation+3.12e-01

Pos logit contributions
bridge+0.124
ived+0.112
ruck+0.111
umbs+0.110
ill+0.110
Neg logit contributions
 slams-0.070
 shop-0.067
 Elsa-0.065
 wrong-0.064
 needs-0.064
Token it
Token ID340
Feature activation+4.61e-01

Pos logit contributions
 slams+0.016
 shop+0.016
 wrong+0.015
 Elsa+0.015
 needs+0.015
Neg logit contributions
bridge-0.026
ived-0.024
umbs-0.024
ruck-0.024
ill-0.023
Token on
Token ID319
Feature activation+4.10e-01

Pos logit contributions
 wrong+0.020
 slams+0.020
 shop+0.020
 needs+0.019
 Elsa+0.018
Neg logit contributions
bridge-0.041
umbs-0.039
ruck-0.038
icle-0.038
ill-0.038
Token the
Token ID262
Feature activation+1.86e-01

Pos logit contributions
 shop+0.020
 slams+0.019
 wrong+0.019
 needs+0.018
 Jewel+0.018
Neg logit contributions
bridge-0.037
ived-0.034
umbs-0.033
ruck-0.033
ill-0.032
Token windows
Token ID9168
Feature activation-1.14e+00

Pos logit contributions
 slams+0.010
 shop+0.009
 needs+0.009
 Elsa+0.009
 tells+0.009
Neg logit contributions
bridge-0.015
ruck-0.014
ived-0.014
umbs-0.014
icle-0.013
Token his
Token ID465
Feature activation-6.64e-01

Pos logit contributions
bridge+0.062
ived+0.057
ruck+0.056
ill+0.055
icle+0.055
Neg logit contributions
 slams-0.040
 shop-0.039
 Jewel-0.037
 wrong-0.037
 needs-0.037
Token head
Token ID1182
Feature activation+1.28e+00

Pos logit contributions
bridge+0.051
ived+0.046
umbs+0.045
ruck+0.045
ill+0.045
Neg logit contributions
 slams-0.040
 shop-0.039
 needs-0.038
 wrong-0.038
 tells-0.037
Token.
Token ID13
Feature activation-1.76e-01

Pos logit contributions
 shop+0.045
 slams+0.045
 wrong+0.044
 needs+0.044
 tells+0.042
Neg logit contributions
bridge-0.128
ived-0.117
umbs-0.117
ruck-0.116
ill-0.115
Token\n
Token ID198
Feature activation-1.44e-01

Pos logit contributions
bridge+0.014
ill+0.013
umbs+0.012
icle+0.012
ived+0.012
Neg logit contributions
 slams-0.010
 needs-0.009
 shop-0.009
 wrong-0.009
 Elsa-0.009
Token\n
Token ID198
Feature activation-1.07e-01

Pos logit contributions
bridge+0.012
ull+0.011
ill+0.011
ruck+0.011
umbs+0.011
Neg logit contributions
 slams-0.008
 Elsa-0.007
 Jewel-0.007
 needs-0.007
ience-0.007
Token"
Token ID1
Feature activation-6.34e-01

Pos logit contributions
bridge+0.007
ill+0.006
ruck+0.006
ull+0.006
ived+0.005
Neg logit contributions
 slams-0.008
 needs-0.008
 Elsa-0.008
 tells-0.008
 wrong-0.008
TokenLet
Token ID5756
Feature activation-1.17e+00

Pos logit contributions
bridge+0.049
ived+0.048
ruck+0.047
urg+0.047
umbs+0.047
Neg logit contributions
 shop-0.036
 slams-0.034
 Jewel-0.033
 Elsa-0.033
 tells-0.033
Token's
Token ID338
Feature activation+8.62e-01

Pos logit contributions
bridge+0.131
ived+0.122
ruck+0.121
umbs+0.121
ill+0.120
Neg logit contributions
 slams-0.028
 shop-0.026
 needs-0.024
 wrong-0.024
 Elsa-0.024
Token go
Token ID467
Feature activation-3.59e-01

Pos logit contributions
 slams+0.053
 shop+0.051
 needs+0.050
 Elsa+0.050
 wrong+0.049
Neg logit contributions
bridge-0.065
ived-0.058
ruck-0.057
umbs-0.057
ill-0.056
Token away
Token ID1497
Feature activation+4.10e-01

Pos logit contributions
bridge+0.031
ived+0.028
umbs+0.028
ruck+0.028
icle+0.027
Neg logit contributions
 slams-0.018
 shop-0.018
 needs-0.017
 wrong-0.017
 tells-0.017
Token,
Token ID11
Feature activation+2.79e-01

Pos logit contributions
 slams+0.021
 shop+0.020
 needs+0.019
 wrong+0.019
 Elsa+0.019
Neg logit contributions
bridge-0.035
ived-0.032
ruck-0.032
umbs-0.032
ill-0.031
Token excited
Token ID6568
Feature activation-1.02e+00

Pos logit contributions
 slams+0.135
 shop+0.125
 Jewel+0.125
 Elsa+0.121
athy+0.120
Neg logit contributions
bridge-0.227
ived-0.209
ruck-0.205
ill-0.202
ull-0.200
Token.
Token ID13
Feature activation-7.32e-01

Pos logit contributions
bridge+0.086
ived+0.080
ruck+0.080
ill+0.078
icles+0.078
Neg logit contributions
 slams-0.053
 shop-0.049
 Elsa-0.048
 Jewel-0.048
 needs-0.047
Token She
Token ID1375
Feature activation+3.67e-01

Pos logit contributions
bridge+0.056
ill+0.054
umbs+0.053
icles+0.051
ived+0.051
Neg logit contributions
 slams-0.042
 wrong-0.042
 needs-0.039
ience-0.037
 tells-0.037
Token kept
Token ID4030
Feature activation-1.36e-01

Pos logit contributions
 Jewel+0.023
 wrong+0.023
 slams+0.022
 Elsa+0.021
 shop+0.021
Neg logit contributions
bridge-0.025
ill-0.023
ived-0.023
ull-0.021
urg-0.020
Token climbing
Token ID14281
Feature activation-9.69e-01

Pos logit contributions
bridge+0.011
ived+0.010
ruck+0.010
umbs+0.010
ill+0.010
Neg logit contributions
 slams-0.007
 shop-0.007
 needs-0.007
 wrong-0.006
 Elsa-0.006
Token until
Token ID1566
Feature activation-4.34e+00

Pos logit contributions
bridge+0.082
ived+0.074
ruck+0.073
ill+0.073
umbs+0.073
Neg logit contributions
 slams-0.050
 shop-0.048
 needs-0.047
 Elsa-0.046
 tells-0.046
Token she
Token ID673
Feature activation-3.71e-01

Pos logit contributions
bridge+0.412
ived+0.383
ill+0.377
ruck+0.373
icle+0.371
Neg logit contributions
 slams-0.176
 needs-0.164
 shop-0.164
 tells-0.159
 Jewel-0.157
Token was
Token ID373
Feature activation-4.42e-01

Pos logit contributions
bridge+0.030
ruck+0.027
ived+0.027
umbs+0.027
icle+0.026
Neg logit contributions
 slams-0.021
 shop-0.020
 needs-0.020
 wrong-0.019
 Elsa-0.019
Token at
Token ID379
Feature activation-2.40e+00

Pos logit contributions
bridge+0.036
umbs+0.033
ived+0.033
ruck+0.033
icle+0.032
Neg logit contributions
 slams-0.024
 shop-0.023
 wrong-0.023
 Elsa-0.022
 needs-0.022
Token the
Token ID262
Feature activation-2.18e-01

Pos logit contributions
bridge+0.216
ruck+0.199
ived+0.197
ill+0.196
icle+0.195
Neg logit contributions
 slams-0.109
 shop-0.103
 needs-0.101
 Elsa-0.099
 tells-0.098
Token very
Token ID845
Feature activation+1.36e+00

Pos logit contributions
bridge+0.018
ruck+0.016
icle+0.016
ill+0.016
icles+0.015
Neg logit contributions
 slams-0.012
 tells-0.011
 needs-0.011
 Elsa-0.011
 Jewel-0.011
Token enjoyed
Token ID8359
Feature activation-1.26e-01

Pos logit contributions
bridge+0.022
ived+0.021
ull+0.020
ruck+0.020
ill+0.020
Neg logit contributions
 slams-0.019
 Jewel-0.018
 Elsa-0.018
 wrong-0.018
 shop-0.017
Token the
Token ID262
Feature activation-1.35e-01

Pos logit contributions
bridge+0.009
ruck+0.008
ull+0.008
icle+0.007
icles+0.007
Neg logit contributions
 slams-0.008
 needs-0.008
 Elsa-0.008
 shop-0.008
 Jewel-0.008
Token silence
Token ID9550
Feature activation+8.04e-01

Pos logit contributions
bridge+0.009
ruck+0.008
icle+0.007
umbs+0.007
ived+0.007
Neg logit contributions
 Elsa-0.009
 slams-0.009
athy-0.009
 Jewel-0.009
 needs-0.009
Token for
Token ID329
Feature activation-2.90e-01

Pos logit contributions
 shop+0.038
 slams+0.038
 wrong+0.036
 needs+0.036
 Elsa+0.036
Neg logit contributions
bridge-0.070
ived-0.064
umbs-0.064
ruck-0.063
icle-0.062
Token many
Token ID867
Feature activation-1.01e+00

Pos logit contributions
bridge+0.022
ruck+0.020
ived+0.020
umbs+0.020
ill+0.020
Neg logit contributions
 slams-0.017
 shop-0.016
 needs-0.016
 Elsa-0.016
 tells-0.016
Token minutes
Token ID2431
Feature activation-5.46e+00

Pos logit contributions
bridge+0.070
ruck+0.065
ived+0.062
ull+0.060
umbs+0.058
Neg logit contributions
 slams-0.067
 Elsa-0.065
 Jewel-0.061
 shop-0.060
 tells-0.059
Token.
Token ID13
Feature activation-5.68e-01

Pos logit contributions
bridge+0.536
umbs+0.485
ived+0.484
ruck+0.481
icle+0.479
Neg logit contributions
 shop-0.208
 slams-0.203
 wrong-0.192
 needs-0.190
 tells-0.188
Token\n
Token ID198
Feature activation-5.82e-01

Pos logit contributions
bridge+0.045
umbs+0.043
ill+0.042
ived+0.042
ruck+0.042
Neg logit contributions
 slams-0.031
 wrong-0.029
 shop-0.029
 Jewel-0.028
 Elsa-0.028
Token\n
Token ID198
Feature activation-5.90e-01

Pos logit contributions
bridge+0.052
ull+0.050
umbs+0.049
ruck+0.049
ived+0.048
Neg logit contributions
 slams-0.028
 needs-0.025
 Jewel-0.025
 Elsa-0.025
 wrong-0.024
TokenWhen
Token ID2215
Feature activation-5.54e-01

Pos logit contributions
bridge+0.049
ived+0.045
icle+0.045
icles+0.045
umbs+0.044
Neg logit contributions
 shop-0.030
 slams-0.028
 wrong-0.028
 store-0.028
 needs-0.028
Token it
Token ID340
Feature activation+3.59e-01

Pos logit contributions
bridge+0.049
ived+0.044
umbs+0.044
ruck+0.043
ill+0.043
Neg logit contributions
 slams-0.027
 shop-0.027
 needs-0.026
 wrong-0.026
 Elsa-0.025
Token loved
Token ID6151
Feature activation+2.00e+00

Pos logit contributions
bridge+0.223
ull+0.208
ill+0.206
ruck+0.202
ived+0.197
Neg logit contributions
 slams-0.208
 Jewel-0.196
 Elsa-0.191
 Coco-0.185
 wrong-0.185
Token to
Token ID284
Feature activation-2.54e-01

Pos logit contributions
 slams+0.111
 shop+0.108
 needs+0.104
 wrong+0.104
 Elsa+0.104
Neg logit contributions
bridge-0.162
ived-0.146
umbs-0.144
ruck-0.144
ill-0.143
Token play
Token ID711
Feature activation+1.02e+00

Pos logit contributions
bridge+0.021
ruck+0.019
ived+0.019
icle+0.018
ill+0.018
Neg logit contributions
 slams-0.014
 shop-0.013
 wrong-0.013
 needs-0.013
 Elsa-0.013
Token make
Token ID787
Feature activation-4.28e-01

Pos logit contributions
 slams+0.046
 shop+0.045
 wrong+0.043
 Elsa+0.042
 needs+0.042
Neg logit contributions
bridge-0.092
ived-0.085
umbs-0.084
ruck-0.084
ill-0.083
Token-
Token ID12
Feature activation-1.43e+00

Pos logit contributions
bridge+0.037
umbs+0.034
icle+0.034
ived+0.034
ill+0.033
Neg logit contributions
 shop-0.020
 slams-0.020
 wrong-0.020
 Elsa-0.019
 old-0.019
Tokenbel
Token ID6667
Feature activation-5.29e+00

Pos logit contributions
bridge+0.096
icle+0.092
ill+0.090
icles+0.089
urg+0.089
Neg logit contributions
 store-0.083
 shop-0.081
 old-0.078
 wrong-0.078
 slams-0.077
Tokenieve
Token ID12311
Feature activation+1.39e-01

Pos logit contributions
bridge+0.457
umbs+0.444
ill+0.439
ived+0.436
icle+0.436
Neg logit contributions
 shop-0.241
 slams-0.241
 wrong-0.239
 tells-0.230
 Jewel-0.221
Token.
Token ID13
Feature activation-1.89e-01

Pos logit contributions
 slams+0.007
 shop+0.007
 needs+0.007
 wrong+0.007
 Elsa+0.007
Neg logit contributions
bridge-0.012
ived-0.011
umbs-0.010
ruck-0.010
icle-0.010
Token Today
Token ID6288
Feature activation+1.54e-01

Pos logit contributions
bridge+0.018
ill+0.017
ruck+0.017
ived+0.017
icle+0.017
Neg logit contributions
 slams-0.007
 shop-0.007
 wrong-0.007
 needs-0.007
 Elsa-0.007
Token she
Token ID673
Feature activation+5.98e-01

Pos logit contributions
 slams+0.006
 shop+0.006
 Elsa+0.006
 needs+0.006
 Jewel+0.006
Neg logit contributions
bridge-0.014
ived-0.013
ruck-0.013
ill-0.013
icles-0.013
Token was
Token ID373
Feature activation+5.73e-01

Pos logit contributions
 slams+0.036
 wrong+0.034
 Jewel+0.033
 shop+0.033
 Elsa+0.033
Neg logit contributions
bridge-0.045
ill-0.041
ived-0.040
ull-0.039
ruck-0.039
Token,
Token ID11
Feature activation-4.90e-01

Pos logit contributions
bridge+0.207
ived+0.190
ruck+0.189
ill+0.187
umbs+0.186
Neg logit contributions
 slams-0.094
 shop-0.088
 needs-0.086
 Jewel-0.084
 Elsa-0.084
Token the
Token ID262
Feature activation-5.99e-01

Pos logit contributions
bridge+0.044
ill+0.040
ived+0.040
ruck+0.040
icles+0.039
Neg logit contributions
 slams-0.022
 needs-0.021
 wrong-0.020
 shop-0.020
 tells-0.019
Token garden
Token ID11376
Feature activation+5.70e-01

Pos logit contributions
bridge+0.046
ruck+0.040
ived+0.039
ipples+0.039
ill+0.039
Neg logit contributions
 slams-0.037
 Jewel-0.033
 Elsa-0.033
 tells-0.033
 owns-0.033
Token was
Token ID373
Feature activation-4.83e-01

Pos logit contributions
 shop+0.024
 slams+0.024
 needs+0.023
 wrong+0.023
 Elsa+0.022
Neg logit contributions
bridge-0.052
ived-0.048
ruck-0.048
umbs-0.048
icle-0.047
Token prett
Token ID46442
Feature activation-2.66e+00

Pos logit contributions
bridge+0.044
ived+0.042
umbs+0.041
ruck+0.041
paper+0.040
Neg logit contributions
 wrong-0.021
 shop-0.021
 old-0.020
 slams-0.019
 store-0.019
Tokenier
Token ID959
Feature activation-4.69e+00

Pos logit contributions
bridge+0.100
ruck+0.077
icle+0.076
icles+0.076
ull+0.075
Neg logit contributions
 wrong-0.256
 shop-0.251
 needs-0.246
 slams-0.245
 Elsa-0.240
Token than
Token ID621
Feature activation-1.95e+00

Pos logit contributions
bridge+0.412
ived+0.380
umbs+0.376
icle+0.369
icles+0.367
Neg logit contributions
 shop-0.222
 slams-0.221
 needs-0.212
 wrong-0.211
 Elsa-0.208
Token ever
Token ID1683
Feature activation-3.19e+00

Pos logit contributions
bridge+0.151
ived+0.135
ruck+0.134
umbs+0.133
ill+0.133
Neg logit contributions
 slams-0.115
 shop-0.111
 needs-0.108
 wrong-0.106
 Elsa-0.106
Token.
Token ID13
Feature activation-1.08e+00

Pos logit contributions
bridge+0.238
ruck+0.214
ived+0.214
ill+0.213
umbs+0.209
Neg logit contributions
 slams-0.199
 shop-0.188
 needs-0.183
 Elsa-0.183
 Jewel-0.182
Token It
Token ID632
Feature activation-1.73e+00

Pos logit contributions
bridge+0.083
umbs+0.081
ill+0.079
ived+0.076
icles+0.076
Neg logit contributions
 slams-0.059
 wrong-0.059
 needs-0.055
 tells-0.054
ience-0.053
Token gl
Token ID1278
Feature activation-1.40e+00

Pos logit contributions
bridge+0.153
ill+0.139
ived+0.138
umbs+0.136
ull+0.135
Neg logit contributions
 slams-0.083
 shop-0.077
 wrong-0.075
 Elsa-0.074
 Jewel-0.074
Token.
Token ID13
Feature activation-8.90e-01

Pos logit contributions
bridge+0.070
ived+0.064
umbs+0.063
ruck+0.063
ill+0.062
Neg logit contributions
 slams-0.033
 shop-0.032
 needs-0.031
 wrong-0.031
 Elsa-0.030
Token
Token ID220
Feature activation+6.94e-01

Pos logit contributions
bridge+0.075
ill+0.070
umbs+0.070
ived+0.068
icle+0.068
Neg logit contributions
 slams-0.046
 wrong-0.043
 needs-0.042
 shop-0.040
 tells-0.040
Token\n
Token ID198
Feature activation-6.69e-01

Pos logit contributions
 slams+0.028
 shop+0.026
 needs+0.025
 Elsa+0.024
 Jewel+0.024
Neg logit contributions
bridge-0.067
ived-0.062
umbs-0.062
ill-0.062
ruck-0.060
Token\n
Token ID198
Feature activation-6.42e-01

Pos logit contributions
bridge+0.063
ill+0.058
umbs+0.058
ived+0.057
ull+0.057
Neg logit contributions
 slams-0.030
 needs-0.028
 wrong-0.026
 shop-0.026
 tells-0.025
TokenJ
Token ID41
Feature activation-3.45e+00

Pos logit contributions
bridge+0.049
ived+0.044
ruck+0.043
umbs+0.043
ill+0.043
Neg logit contributions
 slams-0.039
 shop-0.037
 needs-0.036
 wrong-0.036
 Elsa-0.036
Tokenill
Token ID359
Feature activation-6.28e+00

Pos logit contributions
bridge+0.198
icle+0.166
ipples+0.166
icles+0.164
umbs+0.162
Neg logit contributions
 slams-0.255
 shop-0.254
 tells-0.249
 needs-0.248
 says-0.247
Token was
Token ID373
Feature activation-1.06e-01

Pos logit contributions
bridge+0.499
ill+0.479
ived+0.462
ull+0.447
ruck+0.439
Neg logit contributions
 slams-0.330
 Jewel-0.321
 Elsa-0.307
 wrong-0.304
 Coco-0.297
Token persistent
Token ID16218
Feature activation+9.46e-01

Pos logit contributions
bridge+0.009
ived+0.008
ill+0.008
ull+0.007
umbs+0.007
Neg logit contributions
 needs-0.006
 slams-0.006
 owns-0.005
 shop-0.005
 Jewel-0.005
Token and
Token ID290
Feature activation-5.08e-01

Pos logit contributions
 slams+0.047
 shop+0.042
 needs+0.042
 Elsa+0.042
 Jewel+0.041
Neg logit contributions
bridge-0.082
ived-0.077
ill-0.076
ruck-0.075
umbs-0.073
Token said
Token ID531
Feature activation+2.02e-01

Pos logit contributions
bridge+0.038
ived+0.036
ill+0.034
ull+0.033
ruck+0.032
Neg logit contributions
 slams-0.028
 Jewel-0.027
 shop-0.027
athy-0.026
Charl-0.026
Token,
Token ID11
Feature activation-1.17e-01

Pos logit contributions
 slams+0.008
 shop+0.008
 needs+0.007
 Elsa+0.007
 tells+0.007
Neg logit contributions
bridge-0.019
ived-0.018
ruck-0.018
ill-0.018
umbs-0.017