Add "silentpayments" module implementing BIP352 (take 4, limited to full-node scanning)#1765

Open
theStack wants to merge 14 commits into
bitcoin-core:masterfrom
theStack:silentpayments_module_fullnode_only

Conversation

@theStack
Contributor

@theStack theStack commented Oct 31, 2025

Description

This PR implements BIP352 with scanning limited to full-nodes. Light-client scanning is planned to be added in a separate PR in the future. The following 7 API functions are currently introduced:

Sender side [BIP description]:

  • secp256k1_silentpayments_sender_create_outputs: given a list of $n$ secret keys $a_1 ... a_n$, a serialized outpoint, and a list of recipients (each consisting of silent payments scan pubkey and spend pubkey), create the corresponding transaction outputs (x-only public keys) for the sending transaction

Receiver side, label creation [BIP description]:

  • secp256k1_silentpayments_recipient_label_create: given a scan secret key and label integer, calculate the corresponding label tweak and label object
  • secp256k1_silentpayments_recipient_label_serialize: given a label object, create the corresponding 33-byte serialization
  • secp256k1_silentpayments_recipient_label_parse: given a 33-byte label representation, create the corresponding label object
  • secp256k1_silentpayments_recipient_create_labeled_spend_pubkey: given a spend public key and a label object, create the corresponding labeled spend public key

Receiver side, scanning [BIP description]:

  • secp256k1_silentpayments_recipient_prevouts_summary_create: given a list of $n$ public keys $A_1 ... A_n$ and a serialized outpoint, create a prevouts_summary object needed for scanning
  • secp256k1_silentpayments_recipient_scan_outputs: given a prevouts_summary object, a recipient's scan secret key and spend public key, and the relevant transaction outputs (x-only public keys), scan for outputs belonging to the recipient and return the tweak(s) needed for spending the output(s). Optionally, a label_lookup callback function can be provided to also scan for labels.
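The label_lookup callback mentioned in the last item can be pictured with the following minimal sketch. It assumes a callback that receives a 33-byte serialized label plus an opaque context pointer and returns either the matching 32-byte label tweak or NULL, which is what the call site `label_lookup(label33, label_context)` in the diffs further down suggests. The struct names, field names and cache size are hypothetical and not part of the module's API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cache layout for this sketch; the library only sees the callback. */
struct sketch_label_entry {
    unsigned char label33[33]; /* serialized label (compressed point) */
    unsigned char tweak32[32]; /* label tweak belonging to that label */
};

struct sketch_label_cache {
    size_t n_entries;
    struct sketch_label_entry entries[4];
};

/* Return the tweak for a known label, or NULL if the label is not in the cache. */
static const unsigned char *sketch_label_lookup(const unsigned char *label33, const void *cache_ptr) {
    const struct sketch_label_cache *cache = (const struct sketch_label_cache *)cache_ptr;
    size_t i;

    assert(label33 != NULL && cache != NULL);
    /* Linear search keeps the sketch simple; a wallet with many labels
     * could swap in any faster lookup structure behind the same callback. */
    for (i = 0; i < cache->n_entries; i++) {
        if (memcmp(cache->entries[i].label33, label33, 33) == 0) {
            return cache->entries[i].tweak32;
        }
    }
    return NULL;
}
```

A wallet that only uses one label for change would fill a single entry and pass the cache as the context argument; passing no callback at all skips label scanning.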

For a higher-level overview of what exactly these functions do, it's suggested to look at a corresponding Python implementation that was created based on the secp256k1lab project (it passes the test vectors, so this "executable pseudo-code" should be correct).

Changes to the previous take

Based on the latest state of the previous PR #1698 (take 3), the following changes have been made:

The scope reduction isn't immediately visible in the commit count (only one commit introduced exclusively light-client functionality and could be removed completely), but the review burden compared to #1698 is still significantly lower in terms of LOC, especially in the receiving commit.

Open questions / TODOs

@w0xlt
Contributor

w0xlt commented Nov 6, 2025

Added the optimized version on top of this PR:
w0xlt@8d16914

For more context:
#1698 (comment)

@theStack
Contributor Author

theStack commented Nov 7, 2025

Small supplementary update: I've created a corresponding Python implementation of the provided API functions based on secp256k1lab (https://github.com/theStack/secp256k1lab/blob/add_bip352_module_review_helper/src/secp256k1lab/bip352.py) (also linked in the PR description). The hope is that this makes reviewing this PR a bit easier by having a less noisy, "executable pseudo-code"-like description of what happens under the hood. The code passes the BIP352 test vectors and hence should be correct.

> Added the optimized version on top of this PR: w0xlt@8d16914
>
> For more context: #1698 (comment)

Thanks for rebasing on top of this PR, much appreciated! I will take a closer look within the next days.

Contributor

@w0xlt w0xlt left a comment


Nit: Not related to optimization, but the diff below removes some redundant public-key serialization code:

diff --git a/src/modules/silentpayments/main_impl.h b/src/modules/silentpayments/main_impl.h
index 106da20..922433d 100644
--- a/src/modules/silentpayments/main_impl.h
+++ b/src/modules/silentpayments/main_impl.h
@@ -21,6 +21,19 @@
 /** magic bytes for ensuring prevouts_summary objects were initialized correctly. */
 static const unsigned char secp256k1_silentpayments_prevouts_summary_magic[4] = { 0xa7, 0x1c, 0xd3, 0x5e };
 
+/* Serialize a ge to compressed 33 bytes. Keeps eckey_pubkey_serialize usage uniform
+ * (expects non-const ge*), and centralizes the VERIFY_CHECK. */
+static SECP256K1_INLINE void secp256k1_sp_ge_serialize33(const secp256k1_ge* in, unsigned char out33[33]) {
+    size_t len = 33;
+    secp256k1_ge tmp = *in;
+    int ok = secp256k1_eckey_pubkey_serialize(&tmp, out33, &len, 1);
+#ifdef VERIFY
+    VERIFY_CHECK(ok && len == 33);
+#else
+    (void)ok;
+#endif
+}
+
 /** Sort an array of silent payment recipients. This is used to group recipients by scan pubkey to
  *  ensure the correct values of k are used when creating multiple outputs for a recipient.
  *
@@ -68,13 +81,11 @@ static int secp256k1_silentpayments_calculate_input_hash_scalar(secp256k1_scalar
     secp256k1_sha256 hash;
     unsigned char pubkey_sum_ser[33];
     unsigned char input_hash[32];
-    size_t len;
     int ret, overflow;
 
     secp256k1_silentpayments_sha256_init_inputs(&hash);
     secp256k1_sha256_write(&hash, outpoint_smallest36, 36);
-    ret = secp256k1_eckey_pubkey_serialize(pubkey_sum, pubkey_sum_ser, &len, 1);
-    VERIFY_CHECK(ret && len == sizeof(pubkey_sum_ser));
+    secp256k1_sp_ge_serialize33(pubkey_sum, pubkey_sum_ser);
     secp256k1_sha256_write(&hash, pubkey_sum_ser, sizeof(pubkey_sum_ser));
     secp256k1_sha256_finalize(&hash, input_hash);
     /* Convert input_hash to a scalar.
@@ -85,15 +96,13 @@ static int secp256k1_silentpayments_calculate_input_hash_scalar(secp256k1_scalar
      * an error to ensure strict compliance with BIP0352.
      */
     secp256k1_scalar_set_b32(input_hash_scalar, input_hash, &overflow);
-    ret &= !secp256k1_scalar_is_zero(input_hash_scalar);
+    ret = !secp256k1_scalar_is_zero(input_hash_scalar);
     return ret & !overflow;
 }
 
 static void secp256k1_silentpayments_create_shared_secret(const secp256k1_context *ctx, unsigned char *shared_secret33, const secp256k1_ge *public_component, const secp256k1_scalar *secret_component) {
     secp256k1_gej ss_j;
     secp256k1_ge ss;
-    size_t len;
-    int ret;
 
     secp256k1_ecmult_const(&ss_j, public_component, secret_component);
     secp256k1_ge_set_gej(&ss, &ss_j);
@@ -103,12 +112,7 @@ static void secp256k1_silentpayments_create_shared_secret(const secp256k1_contex
      * impossible at this point considering we have already validated the public key and
      * the secret key.
      */
-    ret = secp256k1_eckey_pubkey_serialize(&ss, shared_secret33, &len, 1);
-#ifdef VERIFY
-    VERIFY_CHECK(ret && len == 33);
-#else
-    (void)ret;
-#endif
+    secp256k1_sp_ge_serialize33(&ss, shared_secret33);
 
     /* Leaking these values would break indistinguishability of the transaction, so clear them. */
     secp256k1_ge_clear(&ss);
@@ -585,7 +589,6 @@ int secp256k1_silentpayments_recipient_scan_outputs(
                 secp256k1_ge output_negated_ge, tx_output_ge;
                 secp256k1_gej tx_output_gej, label_gej;
                 unsigned char label33[33];
-                size_t len;
 
                 secp256k1_xonly_pubkey_load(ctx, &tx_output_ge, tx_outputs[j]);
                 secp256k1_gej_set_ge(&tx_output_gej, &tx_output_ge);
@@ -595,7 +598,6 @@ int secp256k1_silentpayments_recipient_scan_outputs(
                 secp256k1_ge_neg(&output_negated_ge, &output_ge);
                 secp256k1_gej_add_ge_var(&label_gej, &tx_output_gej, &output_negated_ge, NULL);
                 secp256k1_ge_set_gej_var(&label_ge, &label_gej);
-                ret = secp256k1_eckey_pubkey_serialize(&label_ge, label33, &len, 1);
                 /* Serialize must succeed because the point was just loaded.
                  *
                  * Note: serialize will also fail if label_ge is the point at infinity, but we know
@@ -603,7 +605,7 @@ int secp256k1_silentpayments_recipient_scan_outputs(
                  * Thus, we know that label_ge = tx_output_gej + output_negated_ge cannot be the
                  * point at infinity.
                  */
-                VERIFY_CHECK(ret && len == 33);
+                secp256k1_sp_ge_serialize33(&label_ge, label33);
                 label_tweak = label_lookup(label33, label_context);
                 if (label_tweak != NULL) {
                     found = 1;
@@ -617,7 +619,6 @@ int secp256k1_silentpayments_recipient_scan_outputs(
                 secp256k1_gej_neg(&label_gej, &tx_output_gej);
                 secp256k1_gej_add_ge_var(&label_gej, &label_gej, &output_negated_ge, NULL);
                 secp256k1_ge_set_gej_var(&label_ge, &label_gej);
-                ret = secp256k1_eckey_pubkey_serialize(&label_ge, label33, &len, 1);
                 /* Serialize must succeed because the point was just loaded.
                  *
                  * Note: serialize will also fail if label_ge is the point at infinity, but we know
@@ -625,7 +626,7 @@ int secp256k1_silentpayments_recipient_scan_outputs(
                  * Thus, we know that label_ge = tx_output_gej + output_negated_ge cannot be the
                  * point at infinity.
                  */
-                VERIFY_CHECK(ret && len == 33);
+                secp256k1_sp_ge_serialize33(&label_ge, label33);
                 label_tweak = label_lookup(label33, label_context);
                 if (label_tweak != NULL) {
                     found = 1;

Contributor

@w0xlt w0xlt left a comment


nit: The following diff removes the implicit cast and clarifies that k is 4 bytes

diff --git a/src/modules/silentpayments/main_impl.h b/src/modules/silentpayments/main_impl.h
index 922433d..d94aed6 100644
--- a/src/modules/silentpayments/main_impl.h
+++ b/src/modules/silentpayments/main_impl.h
@@ -512,7 +512,8 @@ int secp256k1_silentpayments_recipient_scan_outputs(
     secp256k1_xonly_pubkey output_xonly;
     unsigned char shared_secret[33];
     const unsigned char *label_tweak = NULL;
-    size_t j, k, found_idx;
+    size_t j, found_idx;
+    uint32_t k;
     int found, combined, valid_scan_key, ret;
 
     /* Sanity check inputs */
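For background on why a 4-byte type fits here: BIP352 feeds the output counter into the shared-secret hash as ser32(k), a 4-byte big-endian encoding. The sketch below shows such an encoding in isolation. The helper name is hypothetical, and how the module serializes k internally is not shown in this thread.

```c
#include <stdint.h>

/* Hypothetical helper: serialize k as 4 bytes, most significant byte first,
 * which is what BIP352 calls ser32(k). */
static void sketch_write_be32(unsigned char out4[4], uint32_t k) {
    out4[0] = (unsigned char)((k >> 24) & 0xff);
    out4[1] = (unsigned char)((k >> 16) & 0xff);
    out4[2] = (unsigned char)((k >> 8) & 0xff);
    out4[3] = (unsigned char)(k & 0xff);
}
```

With `k` declared as `uint32_t`, a call like this needs no narrowing conversion, whereas a `size_t` counter is typically 8 bytes wide on 64-bit platforms and would be truncated implicitly.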

Copy link
Copy Markdown
Contributor

@jonasnick jonasnick left a comment


Thanks @theStack for the new PR. I can confirm that this PR is a rebased version of #1698, with the light client functionality removed and comments addressed, except for:

Comment thread src/modules/silentpayments/main_impl.h Outdated
@jonasnick
Contributor

Not providing prevouts_summary (de)serialization functionality yet in the API poses the risk that users try to do it anyway by treating the opaque object as "serialized". How to cope with that? Is adding a "don't do this" comment in API header sufficient?

Is there a reason for serializing prevouts_summary without light client functionality? If not, I think the don't do this comment is sufficient. Right now, in contrast to the docs of all other opaque objects, this is missing, however:

The exact representation of data inside the opaque data structures is implementation defined and not guaranteed to be portable between different platforms or versions.

@theStack
Contributor Author

@w0xlt, @jonasnick: Thanks for the reviews! I've addressed the suggested changes:

  • in _recipient_scan_outputs: changed the type of k to uint32_t (comment above)
  • in _recipient_create_label: added a scan key validity check (+added a test for that) (#1698 - comment)
  • unified all mentions of "Silent Payments" to title case in the header API and example (#1698 - comment)
  • fixed typo s/elemement/element/ (#1698 - review)
  • in _recipient_scan_outputs: fixed comment in second label candidate (review above)
  • extended the API header comment for the _prevouts_summary opaque data structure, to point out that the data structure is implementation defined (like docs of all other opaque structs) (comment above)

> Nit: Not related to optimization, but the diff below removes some redundant public-key serialization code:

Given that this compressed-pubkey-serialization pattern shows up repeatedly also in other modules (ellswift, musig), I think it would make the most sense to add a general helper (e.g. in eckey{,_impl}.h), which could be done in an independent PR. I've opened issue #1773 to see if there is conceptual support for doing this.

> Not providing prevouts_summary (de)serialization functionality yet in the API poses the risk that users try to do it anyway by treating the opaque object as "serialized". How to cope with that? Is adding a "don't do this" comment in API header sufficient?
>
> Is there a reason for serializing prevouts_summary without light client functionality? If not, I think the don't do this comment is sufficient.

Good point, I can't think of a good reason for full nodes wanting to serialize prevouts_summary.

@nymius

nymius commented Nov 20, 2025

To address the open questions, I’ve reviewed the proposed changes by @w0xlt on 8d16914.

I'm going to focus on the key aspects I extracted from the review and the merits of each change, rather than on the big O improvement claims, because I didn't get that far.

These are multiple different changes rather than a single one, so to make the review easier I suggest breaking it into multiple commits. For each of them, I would state the purpose and the real-world scenario where the change would be relevant.

Also, I would use clearer names for the variables or at least document their purpose.

The changes I've identified so far are the following:

  • Improve label lookup using hash table: I think this is implementation dependent and should be improved by the user rather than by the library itself.
    Examples, as part of the documentation, are usage demonstrations. Although they can point the user to best practices, I prefer clarity over performance in them. For example, in the rust-secp256k1 bindings, I used a HashMap for the example because it is a familiar standard structure for Rust users. If they want more performance there, they have other tools available to replace that structure themselves.
  • On secp256k1_silentpayments_recipient_sort_cmp, I understood that the change (r1->index < r2->index) ? -1 : (r1->index > r2->index) ? 1 : 0 is meant to make the unstable secp256k1_hsort implementation stable. Considering it only affects secp256k1_silentpayments_sender_create_outputs, which didn't receive any changes other than a variable removal, what are the performance improvements there?
  • SECP256K1_SP_SCAN_BATCH: I just learned about it (ge_set_gej_all) during this review, but it seems that Jacobian-to-affine conversion is needed for the comparison against labels, and that it is faster to do it in batches. I think this improvement will only affect the small subset of transactions with a large number of outputs. A good benchmark for this would be coinjoin transactions; although they are not supported by BIP 352, a test case is doable.
  • Double head: I'm not sure about the target of this change. I guess it is to skip already found or scanned inputs, but I couldn't figure out what each head is tracking.
  • Binary tree search for x-only lookups: I think it explains itself: faster lookups in the unlabeled case. This may have its merits, but I need to remove the other changes to have a clear answer.
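To make the sort stabilization point concrete, here is a minimal sketch of how an index tie-break turns any comparison sort into one with a deterministic, input-order-preserving result. It uses the standard `qsort` and an integer `group` field as a stand-in for the scan pubkey comparison; the struct and function names are hypothetical and the real comparator operates on the module's recipient type with `secp256k1_hsort`.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a recipient: `group` plays the role of the scan
 * pubkey, `index` is the position the caller originally assigned. */
struct sketch_recipient {
    int group;
    size_t index;
};

static int sketch_recipient_cmp(const void *a, const void *b) {
    const struct sketch_recipient *r1 = (const struct sketch_recipient *)a;
    const struct sketch_recipient *r2 = (const struct sketch_recipient *)b;

    /* Primary key: group recipients that share the same scan key. */
    if (r1->group != r2->group) {
        return (r1->group < r2->group) ? -1 : 1;
    }
    /* Tie-break on the original index, so entries of one group keep their
     * input order even if the underlying sort algorithm is unstable. */
    return (r1->index < r2->index) ? -1 : (r1->index > r2->index) ? 1 : 0;
}
```

Because every element has a unique index, the comparator never reports two distinct elements as equal, so the sorted order no longer depends on the sorting algorithm.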

In general I agree with @jonasnick that we should define a clear target to benchmark and improve. As I've said before, the base case should be a wallet with a single label for change.
For other improvements, I would try to match up plausible real world scenarios before making complex changes in the code base of the PR.
Finally, by looking at bench_silentpayments_full_tx_scan, the use_labels case is very simple. If we want to test performance improvements on the label lookup, I would start there.

In conclusion, from the proposed commit and the discussion around it, the only changes I've found clear enough to consider are:

  • Tracking of found outputs: simple enough, small performance improvement for the usual case, better for larger transactions.

@w0xlt
Contributor

w0xlt commented Nov 20, 2025

Thanks @nymius for reviewing the changes, addressing the main points, and proposing a simplification.

I’m currently splitting the optimization commit into smaller pieces to make it easier to review.
I’ll also take a closer look at your commit and run it against the benchmark files.

The only part of the discussion that still feels a bit ambiguous is the “base” or “usual” case.
From my understanding of #1698 (review), the concern is not about typical usage, but rather about an attacker crafting malicious transactions with many outputs, causing the scanning process to take hours.

So the goal of this optimization would be to mitigate that scenario, not the collaborative one.

@w0xlt
Contributor

w0xlt commented Nov 21, 2025

I ran the examples/silentpayments_mixed_1.c file with the simplified version suggested by @nymius (jonasnick@311b4eb). It shows slightly worse performance (see below), but the simpler approach may still be worth it.

Without the secp256k1_silentpayments_recipient_sort_cmp stabilization, I got 82s for the complex version vs. 114s for the simpler one. Whether the ~30s difference justifies the additional complexity is up for discussion. I don’t have a strong opinion.

Answering the questions: secp256k1_silentpayments_recipient_sort_cmp stabilization + heads speed up the non-adversarial case. The stable sort ensures that the transaction outputs are ordered sequentially by the index $k$.

The optimized receiver implementation relies on a heuristic (the head pointers) that assumes the next output it is looking for ($k+1$) is located immediately after the previous one ($k$).

  • With Stable Sort: The scanner complexity is roughly O(N) (Linear).
  • Without Stable Sort: The scanner complexity degrades to O(N²) (Quadratic).

@theStack
Contributor Author

@w0xlt, @nymius: Thanks for investigating this deeper. I've now also had a chance to look at the suggested optimizations and came to similar conclusions as stated in #1765 (comment). I particularly agree with the stated points that the changes should not increase complexity significantly and that the most important optimization candidate to consider for mitigating the worst-case scanning attack is "skip outputs that we have already found" (as previously stated by @jonasnick, see #1698 (comment) and jonasnick@311b4eb). I don't think stabilizing the sorting helps at all, since this is something that happens at the sender side, and we can't rely on the attacker using a specific implementation (even if they did, it's trivial for them to shuffle the outputs after creation).

For the proposed target to benchmark, I'm proposing the following modified example that exhibits the worst-case scanning time based on a labels cache with one entry (for change outputs), by creating a tx with 23255 outputs [1] all targeted for Bob: 1df4287

Shower-thought from this morning: what if we treat the tx_outputs input as actual list that we modify, and remove an entry if it is found? This would have a similar effect as the "track found outputs" idea, but without the need of dynamic memory allocation. It's a ~10-lines diff and seems to work fine: theStack@9aba470
It reduces the run-time of the proposed example above from ~10 minutes to roughly ~2 seconds on my machine.
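The effect of removing matched entries can be illustrated with a toy model. In the sketch below, an output "matches" counter k when the integer it points to equals k; in the real module that check is elliptic-curve work, which is why the number of checks is counted. The function names and the integer stand-ins are hypothetical, and the input is assumed to list the matching outputs in increasing k order.

```c
#include <stddef.h>
#include <string.h>

/* Baseline: restart from the first output for every counter value k. */
static size_t sketch_scan_baseline(const int **outputs, size_t n_outputs, size_t *n_checks) {
    size_t n_found = 0, j;
    int k = 0;
    *n_checks = 0;
    for (;;) {
        int found = 0;
        for (j = 0; j < n_outputs; j++) {
            (*n_checks)++;
            if (*outputs[j] == k) { found = 1; break; }
        }
        if (!found) break;
        n_found++;
        k++;
    }
    return n_found;
}

/* Variant: drop a matched entry by shifting the tail down one slot, so
 * later passes never look at it again. Modifies the caller's array. */
static size_t sketch_scan_with_removal(const int **outputs, size_t n_outputs, size_t *n_checks) {
    size_t n_found = 0, j;
    int k = 0;
    *n_checks = 0;
    for (;;) {
        int found = 0;
        for (j = 0; j < n_outputs; j++) {
            (*n_checks)++;
            if (*outputs[j] == k) {
                memmove(&outputs[j], &outputs[j + 1], (n_outputs - j - 1) * sizeof(outputs[0]));
                n_outputs--;
                found = 1;
                break;
            }
        }
        if (!found) break;
        n_found++;
        k++;
    }
    return n_found;
}
```

For N outputs in increasing k order, the baseline performs N(N+1)/2 + N checks while the removal variant performs N. If the same outputs arrive in reverse order, the removal variant still needs N(N+1)/2 checks in this model, so the ordering of the benchmark input matters.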

Any thoughts on this? Maybe I'm still missing something.

[1] that's an upper bound of maximum outputs per block: floor(1000000/43) = 23255
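Spelled out, the footnote's bound presumably takes 1,000,000 bytes of non-witness block space and the 43-byte serialized size of a taproot output (8-byte amount, 1-byte script length, 34-byte scriptPubKey). That breakdown is not stated in the comment and is an assumption here; block header and other transaction overhead are ignored, which is why it is only an upper bound.

```latex
\[
  \underbrace{8}_{\text{amount}} + \underbrace{1}_{\text{script length}} + \underbrace{34}_{\text{scriptPubKey}} = 43 \text{ bytes},
  \qquad
  \left\lfloor \frac{1\,000\,000}{43} \right\rfloor = 23255
\]
```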

@nymius

nymius commented Nov 21, 2025

> Shower-thought from this morning: what if we treat the tx_outputs input as actual list that we modify, and remove an entry if it is found? This would have a similar effect as the "track found outputs" idea, but without the need of dynamic memory allocation. It's a ~10-lines diff and seems to work fine: theStack@9aba470 It reduces the run-time of the proposed example above from ~10 minutes to roughly ~2 seconds on my machine.
>
> Any thoughts on this? Maybe I'm still missing something.

1df4287 is a good target.
These are the runtimes I've obtained testing against this target:

| Branch | Runtime |
|---|---|
| theStack/secp256k1@9aba470 | ~1.41s |
| theStack/secp256k1@9103229 (baseline) | ~9m |
| jonasnick/secp256k1@311b4eb | ~1.49s |

I had to increase stack size to be able to fit all N_OUTPUT size allocations in the example.

Initially I preferred the is_found allocation rather than the element shifts. But your solution seems to be more performant.

@w0xlt
Contributor

w0xlt commented Nov 22, 2025

@theStack Yes — if we want to keep only the adversarial-scenario optimizations, we can drop sort stabilization and the extra heads.

I like your idea of avoiding dynamic memory allocation. That’s a very interesting direction. On my machine, the scan completes in about 0.4s, which feels like a good balance between simplicity and the optimization needed for the labeled case.

Below are the changes I had to make for your example to run on my machine and to record the scan time.

diff --git a/examples/silentpayments.c b/examples/silentpayments.c
index 5e71e73..d43332f 100644
--- a/examples/silentpayments.c
+++ b/examples/silentpayments.c
@@ -10,6 +10,7 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+#include <time.h>
 
 #include <secp256k1_extrakeys.h>
 #include <secp256k1_silentpayments.h>
@@ -112,15 +113,21 @@ const unsigned char* label_lookup(
     return NULL;
 }
 
+static secp256k1_xonly_pubkey tx_inputs[N_INPUTS];
+static const secp256k1_xonly_pubkey *tx_input_ptrs[N_INPUTS];
+static secp256k1_xonly_pubkey tx_outputs[N_OUTPUTS];
+static secp256k1_xonly_pubkey *tx_output_ptrs[N_OUTPUTS];
+static secp256k1_silentpayments_found_output found_outputs[N_OUTPUTS];
+static secp256k1_silentpayments_found_output *found_output_ptrs[N_OUTPUTS];
+static secp256k1_silentpayments_recipient recipients[N_OUTPUTS];
+static const secp256k1_silentpayments_recipient *recipient_ptrs[N_OUTPUTS];
+/* 2D array for holding multiple public key pairs. The second index, i.e., [2],
+ * is to represent the spend and scan public keys. */
+static unsigned char (*sp_addresses[N_OUTPUTS])[2][33];
+
 int main(void) {
     unsigned char randomize[32];
     unsigned char serialized_xonly[32];
-    secp256k1_xonly_pubkey tx_inputs[N_INPUTS];
-    const secp256k1_xonly_pubkey *tx_input_ptrs[N_INPUTS];
-    secp256k1_xonly_pubkey tx_outputs[N_OUTPUTS];
-    secp256k1_xonly_pubkey *tx_output_ptrs[N_OUTPUTS];
-    secp256k1_silentpayments_found_output found_outputs[N_OUTPUTS];
-    secp256k1_silentpayments_found_output *found_output_ptrs[N_OUTPUTS];
     secp256k1_silentpayments_prevouts_summary prevouts_summary;
     secp256k1_pubkey unlabeled_spend_pubkey;
     struct labels_cache bob_labels_cache;
@@ -209,11 +216,6 @@ int main(void) {
     {
         secp256k1_keypair sender_keypairs[N_INPUTS];
         const secp256k1_keypair *sender_keypair_ptrs[N_INPUTS];
-        secp256k1_silentpayments_recipient recipients[N_OUTPUTS];
-        const secp256k1_silentpayments_recipient *recipient_ptrs[N_OUTPUTS];
-        /* 2D array for holding multiple public key pairs. The second index, i.e., [2],
-         * is to represent the spend and scan public keys. */
-        unsigned char (*sp_addresses[N_OUTPUTS])[2][33];
         unsigned char seckey[32];
 
         printf("Sending...\n");
@@ -340,6 +342,9 @@ int main(void) {
              *        `secp256k1_silentpayments_recipient_prevouts_summary_create`
              *     2. Call `secp256k1_silentpayments_recipient_scan_outputs`
              */
+            clock_t start, end;
+            double cpu_time_used;
+
             ret = secp256k1_silentpayments_recipient_prevouts_summary_create(ctx,
                 &prevouts_summary,
                 smallest_outpoint,
@@ -356,14 +361,20 @@ int main(void) {
 
             /* Scan the transaction */
             n_found_outputs = 0;
+            
+            start = clock();
             ret = secp256k1_silentpayments_recipient_scan_outputs(ctx,
                 found_output_ptrs, &n_found_outputs,
-                (const secp256k1_xonly_pubkey * const *)tx_output_ptrs, N_OUTPUTS,
+                (const secp256k1_xonly_pubkey **)tx_output_ptrs, N_OUTPUTS,
                 bob_scan_key,
                 &prevouts_summary,
                 &unlabeled_spend_pubkey,
                 label_lookup, &bob_labels_cache /* NULL, NULL for no labels */
             );
+            end = clock();
+            cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;
+            printf("Bob's scan took %f seconds\n", cpu_time_used);
+            
             if (!ret) {
                 printf("This transaction is not valid for Silent Payments, skipping.\n");
                 return EXIT_SUCCESS;
@@ -435,7 +446,7 @@ int main(void) {
             n_found_outputs = 0;
             ret = secp256k1_silentpayments_recipient_scan_outputs(ctx,
                 found_output_ptrs, &n_found_outputs,
-                (const secp256k1_xonly_pubkey * const *)tx_output_ptrs, 1, /* dummy scan with one output (we only care about Bob) */
+                (const secp256k1_xonly_pubkey **)tx_output_ptrs, 1, /* dummy scan with one output (we only care about Bob) */
                 carol_scan_key,
                 &prevouts_summary,
                 &unlabeled_spend_pubkey,

@theStack
Contributor Author

@nymius, @w0xlt: Thanks once again for the quick feedback and for benchmarking! Shortly after my previous comment, I was notified about yet another approach to tackle the worst-case scanning time attack (kudos to @furszy for bringing up the idea!) that I think is even more elegant: we can use the pointers in the tx_outputs list directly to track outputs, by setting a pointer to NULL once its output has been found and accordingly only treating entries that are non-NULL. With this, it's only a four-line code change: theStack@2087f92. It kind of combines the previous two approaches of jonasnick@311b4eb and theStack@9aba470 (i.e. mark found outputs, not in a newly allocated array but by modifying the tx_outputs input list, in order to avoid dynamic memory allocation), with very similar run-time results.

The only tiny drawback of these non-malloc approaches might be that something that is conceptually an "in" parameter is modified, which might be a bit unsound in a strict API design sense. On the other hand, it shouldn't matter for the user (I doubt that callers would ever reuse the passed-in lists for anything else afterwards), and we already do the same in the sending API for the recipients, so it's probably fine.
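A toy model of the NULL-marking idea, under the same kind of simplification as the discussion above: an output "matches" counter k when the integer it points to equals k, and only those match checks are counted because they stand in for the expensive curve operations. The function name and the integer stand-ins are hypothetical and this is not the code from the linked commit.

```c
#include <stddef.h>

/* Scan for counters k = 0, 1, 2, ... and mark each matched entry by setting
 * its pointer to NULL in the caller's array. Returns the number of matches. */
static size_t sketch_scan_null_marking(const int **outputs, size_t n_outputs, size_t *n_checks) {
    size_t n_found = 0, j;
    int k = 0;
    *n_checks = 0;
    for (;;) {
        int found = 0;
        for (j = 0; j < n_outputs; j++) {
            if (outputs[j] == NULL) {
                continue; /* matched in an earlier pass: skip without a check */
            }
            (*n_checks)++;
            if (*outputs[j] == k) {
                outputs[j] = NULL; /* the "in" array doubles as the found marker */
                found = 1;
                break;
            }
        }
        if (!found) break;
        n_found++;
        k++;
    }
    return n_found;
}
```

Skipped entries still cost a loop iteration and a pointer comparison, just not a match check. Since the array is modified in place, a caller that needs the original output list afterwards has to keep its own copy.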

> @theStack Yes — if we want to keep only the adversarial-scenario optimizations, we can drop sort stabilization and the extra heads.

The way I see it currently, code paths for non-adversarial scenarios with increasing k values would be hit so rarely in practice that I'm sceptical it's worth putting much effort into those optimizations. When scanning, the vast majority of transactions won't have any matches in the first place. Out of those few that do have a match, the vast majority will very likely again not contain any repeated recipient (IMHO it doesn't make that much sense to do that, unless the recipient explicitly asks "I want to receive my payment split up in multiple UTXOs, but still in a single tx"?), so in the bigger picture those optimizations wouldn't matter all that much, and I'd assume that the dominant factor should by far be all the (unavoidable) ECDH computations per transaction. But that's still more of a guess, and it's still good to already have optimization ideas at hand if we need them in the future.

@w0xlt
Contributor

w0xlt commented Nov 23, 2025

@theStack Thanks for continuing to refine the optimization. The deletion approach performs slightly better (0.40 s vs. 0.45 s), likely because deleting items shrinks the array and cuts the number of loop iterations by about 50% compared to nullifying them.

theStack added a commit to theStack/secp256k1 that referenced this pull request Nov 25, 2025
assuming the labels cache has only one entry (for change) for now

includes fixes by w0xlt in order to avoid running into a stack overflow
and time measuring code, see
bitcoin-core#1765 (comment)
@theStack theStack force-pushed the silentpayments_module_fullnode_only branch from 9103229 to 650b2fb Compare November 25, 2025 19:07
@theStack
Contributor Author

theStack commented Nov 25, 2025

To summarize, the following table shows the proposed mitigations for the worst-case scanning attack so far, with benchmark results from my machine. The previous baseline commit with the worst-case example has been updated to include @w0xlt's changes, in order to work without stack size limit changes.
(EDIT: These benchmark results are based on a baseline that doesn't represent the worst-case and are thus not representative, as noticed by w0xlt below.)

| Branch | Approach | Runtime |
|---|---|---|
| theStack@c16252d (Branch baseline) | modified example to exercise worst-case scanning, no fix | 641.391838s |
| theStack@ec27977 (Branch fix1_...) | mark found outputs in calloced array | 0.543969s |
| theStack@135ca0a (Branch fix2_...) | remove matched outputs by shifting remaining entries | 0.514952s |
| theStack@8360150 (Branch fix3_...) | mark found outputs by NULL in tx_outputs input array | 0.544740s |

The run-times of the fixes vary slightly (the removal approach "fix2" being the fastest, confirming #1765 (comment) above), but are all in the same ballpark. I don't think exact performance results matter much here, as the goal of the mitigation should IMHO be to cut the run-time down roughly from "minutes" to "seconds" (and remember, this is already for the absolute worst-case, one giant non-standard transaction filling a whole block, and it can only slow down one specific SP recipient). Thus, I decided to pick the simplest approach that avoids dynamic memory allocation, i.e. fix number 3 using NULL as marker in tx_outputs.

With that tackled, I believe that all of the open questions and TODOs are addressed now (updated the PR description accordingly). The latest force-push also includes a rebase on master (to include the CI fix #1771).

Comment thread src/modules/silentpayments/main_impl.h Outdated
Comment thread src/modules/silentpayments/main_impl.h Outdated
Comment thread src/modules/silentpayments/tests_impl.h Outdated
Comment thread include/secp256k1_silentpayments.h Outdated
const secp256k1_silentpayments_recipient **recipients,
size_t n_recipients,
const unsigned char *outpoint_smallest36,
const secp256k1_keypair * const *taproot_seckeys,

There are several other uses of "taproot" in the PR, though in docs rather than code. Not sure we want to update all of them.

Comment thread src/modules/silentpayments/main_impl.h
Comment thread src/modules/silentpayments/main_impl.h
@Eunovo

Eunovo commented May 14, 2026

dedde95:

There's a 512KB stack limit on some CI jobs on bitcoin-core, and it seems that this limit is exceeded while running test vectors, see https://github.com/bitcoin/bitcoin/actions/runs/25862335393/job/75995923652?pr=32966

Claude thinks the issue is due to the size of the stack allocated arrays for the test_vectors.

@theStack theStack force-pushed the silentpayments_module_fullnode_only branch from dedde95 to 33e2225 Compare May 14, 2026 17:46
@theStack
Contributor Author

Addressed most of @andrewtoth's review comments above and fixed the stack limit issue found by @Eunovo (diff):

> dedde95:
>
> There's a 512KB stack limit on some CI jobs on bitcoin-core, and it seems that this limit is exceeded while running test vectors, see https://github.com/bitcoin/bitcoin/actions/runs/25862335393/job/75995923652?pr=32966
>
> Claude thinks the issue is due to the size of the stack allocated arrays for the test_vectors.

Oh right, with the recent introduction of the K_max test vectors (bitcoin/bips#2106), the MAX_OUTPUTS_PER_TEST_CASE constant was bumped from 4 to 2324, leading to arrays allocated on the stack in run_silentpayments_test_vector_{send,receive} that are way too large. Fixed this by declaring all MAX_{INPUTS,OUTPUTS}_PER_TEST_CASE arrays as static (i.e. moving them to the BSS segment). (Alternatively, we could allocate them on the heap, which would be a slightly larger diff. Not sure what is preferred in the repo; I think both approaches are fine, so I went with the simpler one.)
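A minimal sketch of the `static` approach, assuming a made-up 64-byte `output_record` type and a hypothetical `checksum_outputs` function (the real test code uses different types and names). Only the `MAX_OUTPUTS_PER_TEST_CASE` value of 2324 comes from the comment above.

```c
#include <stddef.h>

#define MAX_OUTPUTS_PER_TEST_CASE 2324

/* Hypothetical per-output record; the real test code uses other types. */
typedef struct {
    unsigned char xonly[32];
    unsigned char tweak[32];
} output_record;

/* Fills and sums the first byte of each record; exists only to use the buffer. */
static unsigned long checksum_outputs(size_t n) {
    /* `static` gives the array static storage duration, so its roughly
     * 145 KiB (2324 * 64 bytes) are not taken from the caller's stack.
     * Trade-off: the function is no longer reentrant or thread-safe. */
    static output_record outputs[MAX_OUTPUTS_PER_TEST_CASE];
    unsigned long sum = 0;
    size_t i;
    if (n > MAX_OUTPUTS_PER_TEST_CASE) n = MAX_OUTPUTS_PER_TEST_CASE;
    for (i = 0; i < n; i++) {
        outputs[i].xonly[0] = (unsigned char)(i & 0xff);
        sum += outputs[i].xonly[0];
    }
    return sum;
}
```

Without the `static` keyword the same array would be an automatic variable, and a few such arrays per function could plausibly exceed a 512KB stack limit like the one mentioned above.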

Thanks a lot for the detailed reviews! 👍

Comment thread include/secp256k1_silentpayments.h Outdated
@theStack theStack force-pushed the silentpayments_module_fullnode_only branch from 33e2225 to 14be7f2 Compare May 15, 2026 14:06

@andrewtoth andrewtoth left a comment


ACK 14be7f2

Comment thread include/secp256k1_silentpayments.h Outdated
Comment thread src/modules/silentpayments/main_impl.h Outdated
@theStack
Contributor Author

Addressed review comments by @nymius (#1765 (comment)) and @andrewtoth (#1765 (comment), #1765 (comment)), improving the consistency for (pub)key list variable naming (s/plain_pubkeys/pubkeys/, s/taproot_seckeys/keypairs/, where it applies) and tackling documentation nits (diff). Thanks!

@theStack theStack force-pushed the silentpayments_module_fullnode_only branch from adddbb7 to 4663ba1 Compare May 18, 2026 22:30
@theStack
Contributor Author

In the course of an out-of-band conversation with @furszy I was made aware that the Jacobian result of the generator point multiplication in the sending function could potentially leak the secret key sum scalar. Fixed now by clearing it out immediately after the conversion to affine coordinates (one-line diff 🤞). This follows a similar pattern for existing functions with generator point multiplications: 765ef53 (@sipa discovered back then that the clearing was missing).

theStack added a commit to theStack/secp256k1 that referenced this pull request May 19, 2026
Scalar multiplication with the generator point frequently involves a
conversion to affine coordinates and clearing out the temporary Jacobian
group element object after to avoid leaking secret key material, i.e.
executing the following three steps:
    - secp256k1_ecmult_gen(ctx, &rj, ...)
    - secp256k1_ge_set_gej(&r, &rj)
    - secp256k1_gej_clear(&rj)

This commit introduces a corresponding helper to deduplicate code
and mitigate the risk that the last step is forgotten (which can easily
happen and is not detected by tests).

The idea came up during a conversation with furszy, see
bitcoin-core#1765 (comment)
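
The three-step pattern from the commit message can be sketched with mock types as follows. Everything prefixed `mock_` is a hypothetical stand-in with placeholder arithmetic, not elliptic-curve math and not the library's API; the real steps are the `secp256k1_ecmult_gen`, `secp256k1_ge_set_gej` and `secp256k1_gej_clear` calls listed above. The point is only that bundling the steps in one helper makes the final clear hard to forget.

```c
#include <stddef.h>

/* Mock stand-ins; the real library uses secp256k1_gej / secp256k1_ge. */
typedef struct { unsigned long x, y, z; } mock_gej; /* "Jacobian" */
typedef struct { unsigned long x, y; } mock_ge;     /* "affine" */

/* Placeholder arithmetic, NOT elliptic-curve math. */
static void mock_ecmult_gen(mock_gej *rj, unsigned long secret) {
    rj->x = secret * 3u;
    rj->y = secret * 5u;
    rj->z = secret | 1u; /* never zero, so the division below is safe */
}

static void mock_ge_set_gej(mock_ge *r, const mock_gej *rj) {
    r->x = rj->x / rj->z;
    r->y = rj->y / rj->z;
}

/* Clear through a volatile pointer, a common way to discourage the
 * compiler from eliding stores to an object that is about to die. */
static void mock_gej_clear(mock_gej *rj) {
    volatile unsigned char *p = (volatile unsigned char *)rj;
    size_t i;
    for (i = 0; i < sizeof(*rj); i++) p[i] = 0;
}

/* Hypothetical helper: multiply, convert to affine, clear the temporary. */
static void mock_ecmult_gen_to_ge(mock_ge *r, unsigned long secret) {
    mock_gej rj;
    mock_ecmult_gen(&rj, secret);
    mock_ge_set_gej(r, &rj);
    mock_gej_clear(&rj); /* the step that is easy to forget at call sites */
}
```

Call sites then only ever see the affine result, so there is no secret-dependent temporary left for them to clean up.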
* the curve order, which is statistically improbable. Returning an error here results in an untestable branch in
* the code, but we do this anyways to ensure strict compliance with BIP0352.
*/
if (!secp256k1_silentpayments_create_output_tweak(ctx, &t_k_scalar, shared_secret33, k)) {
Contributor


_create_output_tweak writes t_k_scalar before returning failure for zero/overflow hash output.
Since that scalar is derived from the shared secret, the caller should clear it before returning, matching the other failure paths.

Suggested change
if (!secp256k1_silentpayments_create_output_tweak(ctx, &t_k_scalar, shared_secret33, k)) {
if (!secp256k1_silentpayments_create_output_tweak(ctx, &t_k_scalar, shared_secret33, k)) {
secp256k1_scalar_clear(&t_k_scalar);

Contributor Author

@theStack theStack May 22, 2026


Good catch, done. It's not clear to me if leaking an invalid $t_k$ value is really a problem and would gain anything for an adversary, but clearing it in this case surely doesn't hurt and is also consistent with what we currently do at the other _create_output_tweak call-site in the scanning function (see https://github.com/theStack/secp256k1/blob/c02e14c4136b0eeaa00b21c09ae729618c190a1f/src/modules/silentpayments/main_impl.h#L663).

theStack and others added 14 commits May 22, 2026 11:49
Add a routine for the entire sending flow which takes a set of private keys,
the smallest outpoint, and list of recipients and returns a list of
x-only public keys by performing the following steps:

1. Sum up the private keys
2. Calculate the input_hash
3. For each recipient group:
    3a. Calculate a shared secret
    3b. Create the requested number of outputs

This function assumes a single sender context in that it requires the
sender to have access to all of the private keys. In the future, this
API may be expanded to allow for multiple senders or for a single
sender who does not have access to all private keys at any given time,
but for now these modes are considered out of scope / unsafe.

Internal to the library, add:

1. A function for creating shared secrets (i.e., a*B or b*A)
2. A function for generating the "SharedSecret" tagged hash
3. A function for creating a single output public key

Co-authored-by: w0xlt <94266259+w0xlt@users.noreply.github.com>
Add function for creating a label tweak. This requires a tagged hash
function for labels. This function is used by the receiver for creating
labels to be used for a) creating labeled addresses and b) to populate
a labels cache when scanning.

Add function for creating a labeled spend pubkey. This involves taking
a label tweak, turning it into a public key and adding it to the spend
public key. This function is used by the receiver to create a labeled
silent payment address.

Add tests for the label API.

Co-authored-by: w0xlt <94266259+w0xlt@users.noreply.github.com>
Add routine for scanning a transaction and returning the necessary
spending data for any found outputs. This function works with labels via
a lookup callback and requires access to the transaction outputs.
Requiring access to the transaction outputs is not suitable for light
clients, but light client support is planned for a future release.

Add an opaque data type for passing around the prevout public key sum
and the input hash tweak (input_hash). This data is passed to the scanner
before the ECDH step as two separate elements so that the scanner can
multiply the scan_key * input_hash before doing ECDH.

Finally, add test coverage for the receiving API.

Co-authored-by: w0xlt <94266259+w0xlt@users.noreply.github.com>
This affects both the sending and scanning API functions:
* Sending fails if any group exceeds the limit.
* Scanning doesn't look beyond the limit.

Also add a recommendation to the API docs to shuffle the
`tx_outputs` input array, which improves the worst-case by ~2x.

Co-authored-by: nymius <155548262+nymius@users.noreply.github.com>
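
As an illustration of the shuffling recommendation, here is a minimal Fisher-Yates sketch over an array of pointers. `shuffle_ptrs` is a hypothetical name, and `rand()` is only a placeholder: it has modulo bias and is predictable, so a caller worried about adversarially ordered outputs would presumably want a properly seeded, unpredictable source instead.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: in-place Fisher-Yates shuffle of a pointer array.
 * rand() is a placeholder RNG, used here only to keep the sketch short. */
static void shuffle_ptrs(const void **items, size_t n) {
    size_t i;
    if (n < 2) return;
    for (i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % (i + 1); /* pick j in [0, i] */
        const void *tmp = items[i];
        items[i] = items[j];
        items[j] = tmp;
    }
}
```

The shuffle only reorders the pointers, so the set of outputs handed to the scanning function is unchanged.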
Demonstrate sending and scanning on full nodes.
Add a benchmark for a full transaction scan, both for the common
case and worst-case (full-block sized tx) scenarios.
Only benchmarks for scanning are added as this is the most
performance critical portion of the protocol.

Co-authored-by: Sebastian Falbesoner <91535+thestack@users.noreply.github.com>
This improves the worst-case scanning scenario performance by ~2.5x
and also helps notably in the common case ("no match") scenario,
especially if the number of outputs is high:

Benchmark results on parent commit (without optimization):

silentpayments_scan_nomatch_N=2         ,       43.0       ,       43.1       ,       43.1
silentpayments_scan_nomatch_N=5         ,       46.8       ,       46.9       ,       46.9
silentpayments_scan_nomatch_N=10        ,       53.5       ,       53.6       ,       53.6
silentpayments_scan_nomatch_N=100       ,      175.0       ,      178.0       ,      186.0
silentpayments_scan_nomatch_N=1000      ,     1425.0       ,     1430.0       ,     1436.0
silentpayments_scan_nomatch_N=2323      ,     3251.0       ,     3260.0       ,     3269.0
silentpayments_scan_nomatch_N=23250     ,    32039.0       ,    32167.0       ,    32294.0
silentpayments_scan_worstcase_K=10      ,   352706.0       ,   352781.0       ,   352872.0
silentpayments_scan_worstcase_K=100     ,  3230214.0       ,  3231089.0       ,  3231796.0
silentpayments_scan_worstcase_K=1000    , 31401430.0       , 31406989.0       , 31416195.0
silentpayments_scan_worstcase_K=2323    , 70759992.0       , 70764742.0       , 70769823.0

Benchmark results on this commit (with optimization):

silentpayments_scan_nomatch_N=2         ,       42.2       ,       42.2       ,       42.2
silentpayments_scan_nomatch_N=5         ,       43.6       ,       43.6       ,       43.7
silentpayments_scan_nomatch_N=10        ,       47.1       ,       47.1       ,       47.1
silentpayments_scan_nomatch_N=100       ,      101.0       ,      104.0       ,      109.0
silentpayments_scan_nomatch_N=1000      ,      654.0       ,      658.0       ,      664.0
silentpayments_scan_nomatch_N=2323      ,     1473.0       ,     1476.0       ,     1479.0
silentpayments_scan_nomatch_N=23250     ,    14400.0       ,    14412.0       ,    14425.0
silentpayments_scan_worstcase_K=10      ,   157215.0       ,   157242.0       ,   157267.0
silentpayments_scan_worstcase_K=100     ,  1440043.0       ,  1440081.0       ,  1440106.0
silentpayments_scan_worstcase_K=1000    , 13995233.0       , 13995646.0       , 13995923.0
silentpayments_scan_worstcase_K=2323    , 31535643.0       , 31538121.0       , 31542702.0
Add the BIP-352 test vectors. The vectors are generated with a Python script
that converts the .json file from the BIP to C code:

$ ./tools/tests_silentpayments_generate.py test_vectors.json > ./src/modules/silentpayments/vectors.h

Co-authored-by: Ron <4712150+macgyver13@users.noreply.github.com>
Co-authored-by: Sebastian Falbesoner <91535+thestack@users.noreply.github.com>
Co-authored-by: Tim Ruffing <1071625+real-or-random@users.noreply.github.com>
Co-authored-by: Jonas Nick <2582071+jonasnick@users.noreply.github.com>
Co-authored-by: Sebastian Falbesoner <91535+thestack@users.noreply.github.com>
Test midstate tags used in silent payments.
The worst-case scanning benchmarks and the common-case scanning
benchmarks with N>10 are relatively slow, leading to significantly
long run times of CI jobs. Avoid this by skipping these if the
iters count is <= 2 (CI jobs run with SECP256K1_BENCH_ITERS=2).
This is the same approach used in the ecmult benchmarks (bench_ecmult).
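
The skip rule described in this commit message could look roughly like the sketch below. `should_run_benchmark` and its parameters are hypothetical names, and `iters` is assumed to have been read from `SECP256K1_BENCH_ITERS` elsewhere; the actual benchmark code may structure this differently.

```c
/* Hypothetical policy mirroring the commit message: expensive benchmarks
 * (worst-case, or common-case with N > 10) are skipped when iters <= 2. */
static int should_run_benchmark(int iters, int is_worst_case, int n_outputs) {
    int expensive = is_worst_case || n_outputs > 10;
    return !expensive || iters > 2;
}
```

With `iters` equal to 2, as in the CI jobs mentioned above, only the cheap common-case benchmarks with N <= 10 would run under this rule.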
@theStack theStack force-pushed the silentpayments_module_fullnode_only branch from 4663ba1 to c02e14c Compare May 22, 2026 09:53