diff --git a/rfcs/0000-arbitrary-multisig/0000-arbitrary-multisig.md b/rfcs/0000-arbitrary-multisig/0000-arbitrary-multisig.md new file mode 100644 index 00000000..aba76b53 --- /dev/null +++ b/rfcs/0000-arbitrary-multisig/0000-arbitrary-multisig.md @@ -0,0 +1,376 @@ +--- +Number: "0000" +Category: Standards Track +Status: Proposal +Author: Xuejie Xiao +Created: 2025-03-17 +--- + +# Arbitrary multisig + +This document defines arbitrary multisig, a new multisig scheme for CKB which allows different participants to use different digital signature algorithms in a single multisig setup. + +# Rationale + +Multi-signature, or multisig, [refers to](https://en.bitcoin.it/wiki/Multi-signature) requiring multiple public keys to authorize a single blockchain transaction or other resources. In recent years it has become a widely used technique in the blockchain world by exchanges, organizations as well as individuals. However, existing usages are limited in that each party has to generate a private/public key pair for a multisig setup built on a single digital signature algorithm. Even though one can vary the multisig thresholds or number of participants (e.g., 1-of-2, 2-of-2, 3-of-5, 7-of-11, etc.), every public key used in the multisig setup uses the same digital signature algorithm, is generated by the same procedure, and is later used for signing in a similar workflow. + +Nervos CKB is different in the sense that multiple different cryptographic algorithms, including digital signature algorithms, can be deployed and used without a single change to CKB itself. That said, as different single-signed signatures have been used in CKB to lock / unlock many different assets, the multisig design in CKB shares the same limit, in that a single multisig setup is limited to one digital signature algorithm. Arbitrary multisig is designed to overcome this problem: what if different digital signature algorithms can participate in a single multisig setup? 
+ +This new design unlocks different paradigms: + +* In a multisig setup with multiple participants, each participant will be free to choose whatever digital signature algorithm they want to use. One could stick to secp256k1, while another might want to use SPHINCS+, yet another participant enjoys the benefit of WebAuthn. Arbitrary multisig design achieves maximum level of flexibility. +* Arbitrary multisig also provides values when an individual wants to HODL tokens for as long as they want: assuming SPHINCS+ is now available, and Jim wants to secure their tokens with SPHINCS+, however as SPHINCS+ is really new in town, no one can be sure that the algorithm is sound, and that the implementation is secure. Now Jim can create a 2-of-2 arbitrary multisig, using a SPHINCS+ public key, and a secp256k1 public key. Now Jim gets the best of both worlds: if secp256k1 is doomed by quantum computers, the SPHINCS+ still guards Jim's tokens; if the SPHINCS+ lock in use is discovered with a vulnerability, Jim still has secp256k1 key which protects their tokens. Jim will be happy in both cases. + +We do believe the arbitrary multisig can be an important building block for future CKB applications. + +## Scope + +It is worth pointing out that the arbitrary multisig design discussed here is merely a workflow for validating multi-signatures. While a multisig lock script might mostly just contain the arbitrary multisig design in its validation workflow, the arbitrary multisig design is architected in a way that it can be used in more cases than lock scripts. Some type scripts might also leverage the arbitrary multisig design for collective voting or more use cases. 
+ +# Terminology + +In this document we will have certain pseudocode included, the following operators might be used by pseudocode: + +* `==`: equal operation +* `<<`: bitwise left shift operation +* `|`: bitwise or operation +* `++`: byte concatenations +* `(X ...)`: multiple items of the same structure `X` concatenated together + +All variables used in pseudocode will be in snake form in full upper case, such as `ALGO_FLAG`. Certain variables might be paired with human explanations, such as `(decoded ALGO_FLAG)`. + +For bigger structure, pseudocode might be far too long, we will use bullet list instead, such as: + +* 2-byte `0x01 0xFF` +* 1-byte `0xAB` +* 3-byte zeros + +The above bullet list represents the series of bytes `0x01 0xFF 0xAB 0x00 0x00 0x00`. + +# Individual Components + +Several individual components will now be introduced in this section. When putting together, they form the complete arbitrary multisig design. + +## `ALGO_ID` + +`ALGO_ID` is a [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) encoded integer defining supported digital signature scheme by the arbitrary multisig. Notice here we use `digital signature scheme` instead of `digital signature algorithm`, since multiple `ALGO_ID`s might use the same digital signature algorithm underneath, but vary slightly in encoding details. Supported values of `ALGO_ID` right now include: + +* `0 - 14(0x0e)`: the scheme of the same value, defined by [ckb-auth spec](https://github.com/nervosnetwork/ckb-auth/blob/68c93a3a07a462eb4e42a293fe94d759ed1b21ee/docs/auth.md#auth-algorithm-id) is employed. +* `48(0x30) - 59(0x3b)`: 12 values reserved for 12 parameter sets in [FIPS 205](https://csrc.nist.gov/pubs/fips/205/final) approved by NIST, see [here](https://github.com/xxuejie/quantum-resistant-lock-script/blob/bf2ab2a7a01a21c48d1151e0c488c66e0e4199c9/crates/ckb-fips205-utils/src/lib.rs#L33-L58) for the parameter set defined for each different value. 
+* `60(0x3c) - 62(0x3e)`: 3 values reserved for 3 parameter sets in [FIPS 204](https://csrc.nist.gov/pubs/fips/204/final) approved by NIST. +* `63(0x3f)`: a special value denoting calling an on-demand CKB script to perform the signature validation. It thus is not tied to a particular digital signature algorithm. The succeeding arguments will provide the details for locating the CKB script. We will explain the detailed format for this value below. + +Note this list is by no means exhaustive. As new digital signature algorithms / schemes come out and become popular, we will expand this list accordingly. It is not required that a single CKB script implement every algorithm / scheme defined in this list. Later we shall see that the arbitrary multisig workflow can call different scripts for signature verification work. + +## `SIGNATURE_FLAG` + +A 1-bit flag, denoting the presence (when the bit is set) or absence (when the bit is cleared) of a signature. We shall later see how this is used by arbitrary multisig. + +## `ALGO_FLAG` + +`ALGO_FLAG` is a [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) encoded integer combining `ALGO_ID` and `SIGNATURE_FLAG`. Specifically, `ALGO_FLAG` is defined as follows: + +``` +(decoded ALGO_FLAG) == ((decoded ALGO_ID) << 1) | SIGNATURE_FLAG +``` + +In other words, one takes the decoded form of `ALGO_ID` (simply an integer), shifts it left by 1 bit, and performs an `or` operation with `SIGNATURE_FLAG`; the result is the decoded form of `ALGO_FLAG`. + +One might notice that at the current moment, the encoded bytes of `ALGO_FLAG` will always be 1 byte no matter which valid `ALGO_ID` value is used: the largest defined `ALGO_ID` is `63`, so the decoded `ALGO_FLAG` is at most `(63 << 1) | 1 = 127`, which fits in a single VLQ byte. This is a deliberate design to minimize encoding bytes. + +## `IDENTITY` + +`IDENTITY` is a series of arbitrary length bytes, representing the identity of a participant of a multisig setup.
An `IDENTITY` typically has the following format: + +``` +(ALGO_FLAG in VLQ encoding, SIGNATURE_FLAG must be zero) ++ Public key or public key hash +``` + +In other words, a typical `IDENTITY` is the concatenation of `ALGO_FLAG`(in VLQ encoding form) with a public key, or the hash of a public key in a digital signature private/public key pair. For an `IDENTITY`, the `SIGNATURE_FLAG` included in `ALGO_FLAG` must be zero. So technically, `ALGO_FLAG` in an `IDENTITY` only contains an `ALGO_ID`. + +The `ALGO_ID` part at the front tells the correct length of a valid `IDENTITY` structure. + +For `ALGO_ID` from `0` to `14`, the length of `IDENTITY` will always be 21 bytes, please refer to [ckb-auth spec](https://github.com/nervosnetwork/ckb-auth/blob/68c93a3a07a462eb4e42a293fe94d759ed1b21ee/docs/auth.md#auth-algorithm-id) for the remaining 20 bytes after the `ALGO_ID` at first. + +For `ALGO_ID` from `48` to `59`, SPHINCS+ public key for each parameter set will follow the `ALGO_ID`. Each parameter set might have a different length depending on the SPHINCS+ public key following the 1-byte `ALGO_ID`. + +We shall come back to define the `IDENTITY` for `ALGO_ID` from `60` to `62` at a later stage. 
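
The `ALGO_FLAG` and `IDENTITY` encodings defined so far can be sketched in a few lines of Python. This is only an illustration, not a reference implementation: the helper names (`vlq_encode`, `algo_flag`, `build_identity`) are made up for this example, the VLQ byte order is assumed to be the most-significant-group-first layout described on the linked VLQ page, and the 20-byte hash is a placeholder.

```python
def vlq_encode(value: int) -> bytes:
    """Encode a non-negative integer as a VLQ, most significant 7-bit group first.

    Assumption: the byte layout follows the linked VLQ description, where every
    byte except the last one has its top (continuation) bit set.
    """
    if value < 0:
        raise ValueError("VLQ only encodes non-negative integers")
    groups = [value & 0x7F]  # last byte: continuation bit cleared
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)  # earlier bytes: continuation bit set
        value >>= 7
    return bytes(reversed(groups))


def algo_flag(algo_id: int, signature_flag: int) -> int:
    """(decoded ALGO_FLAG) == ((decoded ALGO_ID) << 1) | SIGNATURE_FLAG"""
    if algo_id < 0:
        raise ValueError("ALGO_ID must be non-negative")
    if signature_flag not in (0, 1):
        raise ValueError("SIGNATURE_FLAG is a single bit")
    return (algo_id << 1) | signature_flag


def build_identity(algo_id: int, key_material: bytes) -> bytes:
    """IDENTITY = VLQ(ALGO_FLAG with SIGNATURE_FLAG cleared) ++ key material."""
    return vlq_encode(algo_flag(algo_id, 0)) + key_material


# Hypothetical ckb-auth style identity (ALGO_ID 0) with a placeholder 20-byte hash.
identity = build_identity(0, bytes(20))
```

With the currently defined `ALGO_ID` values, `vlq_encode(algo_flag(...))` always yields a single byte, so the example identity above is 21 bytes long.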
+ +For the `ALGO_ID` of `63`, the public key part of an `IDENTITY`, contains more data than just the public key, the following concatenated structure is used: + +* 32-byte value denoting `code hash` +* 1-byte value denoting `hash type` +* 8-byte value denoting `bounds` +* The length of true public key (or public key hash) encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) +* True public key (or public key hash) in variable length + +For the `ALGO_ID` of `63`, the arbitrary multisig is expected to make a `spawn` syscall instantiating another script(leaf script), using provided `code hash`, `hash type`, `bounds`(the first 2 locates a particular cell in current transaction, while the last one locates a slice representing the ELF binary within the cell), then communicate with the leaf script to verify any provide digital signature. + +## `UNLOCKING_SLOT` + +An `UNLOCKING_SLOT` resembles `IDENTITY`, but provides an optional signature as well: + +``` +(ALGO_FLAG in VLQ encoding) ++ (Public key, public key hash or more composite data) ++ (An optional signature) +``` + +In other words, typical `UNLOCKING_SLOT` contains `ALGO_FLAG`, a public key or public key hash, and an optional signature. Unlike `IDENTITY`, the `SIGNATURE_FLAG` in `UNLOCKING_SLOT` can be set or cleared, denoting the presence or absence of a signature at the end. `ALGO_ID` tells the correct lengths of public key, public key hash, and the optional signature. `ALGO_FLAG` in a whole determines the total length of the `UNLOCKING_SLOT`. + +For `ALGO_ID` of `63`, there might be more data than the public key, the same encoding scheme defined for `IDENTITY` is also used for `UNLOCKING_SLOT`. + +It is possible to derive `IDENTITY` from `UNLOCKING_SLOT`: all we need to do, is to remove the final signature(if one exists), then clear the `SIGNATURE_FLAG`. In case a signature is absent, `UNLOCKING_SLOT` will be encoded exactly the same as `IDENTITY`. 
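
The derivation of an `IDENTITY` from an `UNLOCKING_SLOT` described above can be sketched as follows. The function name `derive_identity` is hypothetical, the sketch assumes a 1-byte `ALGO_FLAG` (true for all currently defined `ALGO_ID` values), and the caller is assumed to already know the signature length for the scheme in use. The 65-byte signature in the example is a placeholder chosen for illustration only.

```python
def derive_identity(slot: bytes, signature_len: int) -> bytes:
    """Derive IDENTITY from a single UNLOCKING_SLOT.

    Assumptions for this sketch: ALGO_FLAG occupies exactly one byte, and
    `signature_len` is the signature length implied by the slot's ALGO_ID.
    """
    if not slot:
        raise ValueError("empty UNLOCKING_SLOT")
    flag = slot[0]
    if flag & 0x80:
        raise ValueError("multi-byte ALGO_FLAG is not handled in this sketch")
    if flag & 1 == 0:
        # No signature attached: the slot is already encoded like an IDENTITY.
        return slot
    if len(slot) < 1 + signature_len:
        raise ValueError("UNLOCKING_SLOT is shorter than its signature")
    # Drop the trailing signature, then clear SIGNATURE_FLAG (the lowest bit).
    return bytes([flag & 0xFE]) + slot[1:len(slot) - signature_len]


# Placeholder data: ALGO_ID 0, a 20-byte hash and a 65-byte signature.
pubkey_hash = bytes(range(20))
signature = bytes([0xAA]) * 65
signed_slot = bytes([(0 << 1) | 1]) + pubkey_hash + signature
unsigned_slot = bytes([0 << 1]) + pubkey_hash
```

Both `derive_identity(signed_slot, 65)` and `derive_identity(unsigned_slot, 65)` yield the same 21-byte `IDENTITY`, which is what lets a validator rebuild the multisig configuration from the data provided at unlocking time.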
+ +## Multisig Configuration + +A `multisig configuration` is a data structure defining everything that is needed by a multisig setup. It keeps information such as the number of participants, number of signatures to gather for the multisig to succeed, and identity for each multisig participant. + +A multisig configuration is defined as follows: + +``` +S ++ R ++ M ++ N ++ (IDENTITY ...) +``` + +The multisig configuration designed here, is highly inspired(though differences still exist) from [multisig lock included in the genesis block](https://github.com/nervosnetwork/ckb-system-scripts/blob/934166406fafb33e299f5688a904cadb99b7d518/c/secp256k1_blake160_multisig_all.c#L17-L28): + +* `S`: a 1-byte reserved value, right now `S` must be `0x80`, to differentiate from the reserved value used in multisig lock included in the genesis block. +* `R`: a 1-byte value, of all the `IDENTITY`s in the multisig configuration, the first `R` identities must each provide a signature for the multisig validation to succeed. +* `M`: a 1-byte value, represents the threshold of the multisig design, or the number of signatures required to gather for the multisig validation to succeed. +* `N`: a 1-byte value, represents the total number of `IDENTITY`s in the multisig configuration. The encoding here determines that in current arbitrary multisig design, at most 255 `IDENTITY`s can be used by a single multisig design. However, it is worth mentioning that while it is not directly supported by the arbitrary multisig design, one can leverage `ALGO_ID` of `63` to workaround the limitation. + +`N` `IDENTITY`s follow N, completes the full multisig configuration structure. + +Since `IDENTITY`s in the arbitrary multisig design might have variable length, it will require an `O(N)` operation to learn all the lengths of all `IDENTITY`s. An optimal implementation might run the loop once, and keep starting offset for each `IDENTITY` in a local array for future usage. 
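
As an illustration of the single `O(N)` pass mentioned above, the sketch below walks a multisig configuration once and records the starting offset of each `IDENTITY`. It is deliberately limited: the name `identity_offsets` is hypothetical, and only `ALGO_ID` values `0` to `14` (whose `IDENTITY` is always 21 bytes) are handled, so a real implementation would need a length rule for every other supported `ALGO_ID`.

```python
from typing import List

CKB_AUTH_IDENTITY_LEN = 21  # 1-byte ALGO_FLAG + 20 bytes, for ALGO_ID 0..14


def identity_offsets(config: bytes) -> List[int]:
    """Return the starting offset of each IDENTITY in a multisig configuration."""
    if len(config) < 4:
        raise ValueError("truncated S/R/M/N header")
    s, n = config[0], config[3]
    if s != 0x80:
        raise ValueError("S must be 0x80")
    offsets = []
    pos = 4  # identities start right after S, R, M, N
    for _ in range(n):
        if pos >= len(config):
            raise ValueError("configuration ends before all identities are read")
        flag = config[pos]
        if flag & 1:
            raise ValueError("SIGNATURE_FLAG must be zero inside an IDENTITY")
        if (flag >> 1) > 14:
            raise ValueError("this sketch only knows the length for ALGO_ID 0..14")
        if pos + CKB_AUTH_IDENTITY_LEN > len(config):
            raise ValueError("truncated IDENTITY")
        offsets.append(pos)
        pos += CKB_AUTH_IDENTITY_LEN
    return offsets


# Hypothetical 2-of-3 configuration with R = 1 and placeholder 20-byte hashes.
example_config = bytes([0x80, 1, 2, 3]) + b"".join(
    bytes([algo_id << 1]) + bytes(20) for algo_id in (0, 1, 2)
)
example_offsets = identity_offsets(example_config)
```

Keeping the offsets in a list mirrors the "local array" idea: later steps can jump straight to the `i`th `IDENTITY` without re-parsing the preceding ones.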
+ +It is worth mentioning that the full `multisig configuration` should rarely be included on chain directly. In most cases, it will be a conceptual data structure that only lives off-chain. Frequently only the hash of the full `multisig configuration` will be included on chain, such as the `args` field of a lock script, or inside `cell data` of a particular cell with a sophisticated type script. + +## Unlocking Configuration + +Unlike `multisig configuration`, the `unlocking configuration` will be frequently seen on chain, most likely in one of the witnesses, to provide data for a multisig validation flow to succeed. + +An unlocking configuration is defined as follows: + +``` +S ++ R ++ M ++ N ++ (UNLOCKING_SLOT ...) +``` + +An unlocking configuration uses the same `S`, `R`, `M`, `N` value as the corresponding multisig configuration. The unlocking configuration uses `UNLOCKING_SLOT`s to provide exactly `M` valid signatures for the multisig validation flow to verify. In addition, the first `R` `UNLOCKING_SLOT`s must each have a signature attached. + +Note it is possible to derive the matching `multisig configuration` from an `unlocking configuration`. As we shall see later in the multisig validation workflow, this is also leveraged by arbitrary multisig validation flow. + +# Root Script & Leaf Script + +Arbitrary multisig is designed with the following consideration in mind: + +* No single script performing arbitrary multisig validation shall be able to implement all digital signature algorithms in itself. +* Even if a script manages to implement all digital signature algorithms defined by arbitrary multisig at a time, this very spec might change in the future with more digital signature algorithms. +* The `ALGO_ID` of `63` exists to load arbitrary script for validation. + +This means there might always be new algorithm / scheme/ script coming out, which is not yet available at the time when a script utilizing arbitrary multisig validation flow was built. 
This means arbitrary multisig must be designed for the future, so as to support more potential algorithms / schemes / scripts. + +To cope with this issue, arbitrary multisig also defines a workflow, where the arbitrary multisig flow leverages [exec](https://github.com/nervosnetwork/rfcs/blob/bd5d3ff73969bdd2571f804260a538781b45e996/rfcs/0034-vm-syscalls-2/0034-vm-syscalls-2.md#exec) / [spawn](https://github.com/nervosnetwork/rfcs/blob/bd5d3ff73969bdd2571f804260a538781b45e996/rfcs/0050-vm-syscalls-3/0050-vm-syscalls-3.md#spawn) syscalls to invoke other scripts, and use those scripts for the actual signature verification work. + +To distinguish between different scripts used in this design, in this document, we will name the script doing the arbitrary multisig flow as `root script`, a `root script` will then use exec / spawn syscalls to invoke other scripts for signature verification work, those callee scripts specializing in signature verification work, will be named as `leaf scripts`. Only a root script needs to understand and perform the arbitrary multisig validation flow, the leaf scripts merely need to implement one or more particular digital signature algorithms. In fact, a leaf script don't even need to know CKB's transaction signing process, nor even a hashing function. + +The exact communication protocol used between root scripts and leaf scripts, will be discussed later in `Root & Leaf Communication Protocol` section. + +Depending on the actual lifecycles, a root script might be in one of the following scenarios: + +* When the root script is first built, it shall implement the full arbitrary multisig validation flow, depending on the actual design tradeoffs, it can choose some(or zero) digital signature algorithms / schemes defined in this arbitrary multisig validation flow, and implement those algorithms / schemes directly in the root script itself. 
 It can then keep deployment information of some other scripts on chain, and use exec / spawn syscalls to invoke those other leaf scripts for digital signature algorithms / schemes that it does not support. In addition, it always has the choice to ignore certain algorithms / schemes defined in this specification. This is always the flexibility and also the curse of Nervos CKB: we can aim to define a comprehensive list of algorithms / schemes supported by the arbitrary multisig specification, but in the end it is really up to each individual script to decide to what extent it wants to support the defined specification. +* After the root script was deployed and used on chain, the arbitrary multisig specification might be revisited and updated with new algorithms / schemes. The already deployed root script can then decide for itself if it wants to support the additional algorithms / schemes via an upgrading process, or if it simply wants to ignore the new algorithms / schemes. +* Both the above 2 scenarios talk about the possibility that a root script can leave out certain digital signature algorithms / schemes in its implementation. But this does not mean the user of such root script will not be able to choose those ignored digital signature algorithms / schemes. For example, Jim might want to use the Dilithium digital signature algorithm, but we have yet to finalize the Dilithium digital signature algorithm (which is in fact FIPS 204) in the arbitrary multisig spec, or maybe we have finalized it, but a root script chooses to ignore this algorithm (or it has yet to finish upgrading). Assuming our root script does support the `ALGO_ID` of `63`, Jim can then choose a CKB script on-chain that 1) does implement the Dilithium algorithm and 2) accepts the `Root & Leaf Communication Protocol` defined by arbitrary multisig. Jim can then set `ALGO_ID` to `63`, and fill in the details (`code hash`, `hash type`, `bounds`) of the Dilithium script.
 Now Jim builds an `IDENTITY` (and also `UNLOCKING_SLOT` later) for a root script that does not yet support Dilithium, but only supports the `ALGO_ID` of `63` in arbitrary multisig. Jim is happy now. Jim does use a longer `IDENTITY` and a longer `UNLOCKING_SLOT`, but that might be a price Jim is willing to pay now. Later if the root script does add support for Dilithium, Jim can then migrate to a new `IDENTITY` and a new `UNLOCKING_SLOT` using a defined `ALGO_ID`. +* Assuming the arbitrary multisig specification is finalized, an `indexing script` can then be built independently on chain. The `indexing script` monitors the arbitrary multisig specification, while also monitoring on-chain script deployment. It can then provide implementations for each digital signature algorithm / scheme supported by the arbitrary multisig spec. When the arbitrary multisig spec is updated, so is the `indexing script` with information for verifying signatures from newly added algorithms / workflows. In this way, a root script can then treat the `indexing script` as a leaf script, and rely on the `indexing script` to implement all digital signature algorithms / schemes. The root script does not need to know if the `indexing script` implements an algorithm by itself, or if it relies on yet another external script. The root script simply talks to the `indexing script` via the protocol defined below, fulfilling all signature verification tasks. +* Going one step further, an `indexing script` might even implement the arbitrary multisig validation flow itself, making it `the arbitrary multisig script`. Now any particular script can simply invoke `the arbitrary multisig script` via an exec / spawn syscall, and point `the arbitrary multisig script` at the locations of 1) the hash of the multisig configuration, most likely in a script args or a cell data field; and 2) the unlocking configuration, most likely in one of the witness fields. `The arbitrary multisig script` can then finish the multisig validation flow by itself.
 All future feature implementations can happen as an upgrading process of `the arbitrary multisig script`. A developer's script can then be left as simple as possible. + +# Root Validation Flow + +With all the discussions above, the arbitrary multisig validation flow for the root script can be described as follows: + +1. The outside environment (or the caller) provides 3 pieces of input parameters to the multisig validation flow: + 1. Hash of a `multisig configuration` (and also the hashing function), most likely in the `args` field of a lock script, or the `cell data` part of a cell. + 2. The corresponding full `unlocking configuration`, most likely in one of the witnesses. + 3. A signing message that all participants of the multisig configuration should sign. +2. The arbitrary multisig validation flow now begins as follows. Note in the validation flow below, `halt` means terminating the script with a non-zero exit code. + 1. `S` in the `unlocking configuration` is validated to be `0x80`. + 2. `R`, `M`, `N` are then extracted, the following rules are enforced: + a. `N` > 0 + b. `M` <= `N` + c. `M` > 0 + d. `R` <= `M` + 3. Now the multisig flow knows there must be `N` identities, for each public key indexed at `i` (index starting at 0): + a. Extract the `i`th `UNLOCKING_SLOT` from `unlocking configuration`, if `unlocking configuration` cannot provide a valid one (data ends earlier than expected), halt. + b. Validate that `ALGO_ID` in the `UNLOCKING_SLOT` is a valid and supported one, otherwise halt. + c. If `i < R`, `SIGNATURE_FLAG` in the `UNLOCKING_SLOT` must be set, and a signature must be present, otherwise halt. + d. If a signature is present, `M` must be larger than zero. Otherwise halt. + e. If a signature is present, set `M` to `M - 1`. + f. If a signature is present, verify the signature against the public key and the signing message. The root script might verify the signature by itself, or rely on a leaf script to verify it following communication protocols defined in the next section.
 If the signature is invalid, halt. + 4. Now the multisig validation flow builds `multisig configuration` from the `unlocking configuration`, then runs the specified hashing function on the `multisig configuration`, if the resulting hash matches the expected hash, the validation flow succeeds. Otherwise the validation flow halts. + +As hinted in the above discussions, arbitrary multisig requires the presence of spawn syscalls, meaning CKB Edition Meepo must be activated. + +# Root & Leaf Communication Protocol + +In this section, the communication protocol between root scripts and leaf scripts is defined. As discussed above, a root script implements the arbitrary multisig validation flow, it might use [exec](https://github.com/nervosnetwork/rfcs/blob/bd5d3ff73969bdd2571f804260a538781b45e996/rfcs/0034-vm-syscalls-2/0034-vm-syscalls-2.md#exec) / [spawn](https://github.com/nervosnetwork/rfcs/blob/bd5d3ff73969bdd2571f804260a538781b45e996/rfcs/0050-vm-syscalls-3/0050-vm-syscalls-3.md#spawn) syscalls to invoke leaf scripts, so that leaf scripts perform signature verification tasks instead. + +## Zero Escaped Encoding + +When a root script invokes a leaf script using spawn or exec syscall, `argv[0]` will be used in both cases to pass data from the root script to a leaf script. As a null-terminated C-string, only non-zero bytes can be passed via `argv[0]`. To cope with this problem, `zero escaped encoding` is introduced, it encodes each individual byte using the following rules: + +* `0x00` is encoded as `0xFE 0xFF`. +* `0xFE` is encoded as `0xFE 0xFD`. +* All the other bytes are encoded as they are in a single byte. 
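
The three encoding rules above translate directly into code. The following Python sketch is for illustration only (the C and Rust implementations referenced in this document are the ones actually used by scripts). The function names are hypothetical, and the decoder's strictness about malformed input is a choice made for this sketch rather than something the rules above mandate.

```python
def zero_escape_encode(data: bytes) -> bytes:
    """Encode bytes so the result never contains 0x00."""
    out = bytearray()
    for byte in data:
        if byte == 0x00:
            out += b"\xFE\xFF"  # 0x00 -> 0xFE 0xFF
        elif byte == 0xFE:
            out += b"\xFE\xFD"  # 0xFE -> 0xFE 0xFD
        else:
            out.append(byte)  # every other byte is kept as-is
    return bytes(out)


def zero_escape_decode(encoded: bytes) -> bytes:
    """Reverse zero_escape_encode, rejecting malformed input."""
    out = bytearray()
    i = 0
    while i < len(encoded):
        byte = encoded[i]
        if byte == 0x00:
            raise ValueError("encoded data must not contain zero bytes")
        if byte == 0xFE:
            if i + 1 >= len(encoded):
                raise ValueError("dangling escape byte at end of input")
            follower = encoded[i + 1]
            if follower == 0xFF:
                out.append(0x00)
            elif follower == 0xFD:
                out.append(0xFE)
            else:
                raise ValueError("invalid escape sequence")
            i += 2
        else:
            out.append(byte)
            i += 1
    return bytes(out)


sample = b"\x00\x01\xfe\xff"
encoded_sample = zero_escape_encode(sample)
```

For the sample above, the encoded form is `0xFE 0xFF 0x01 0xFE 0xFD 0xFF`: the trailing `0xFF` of the input is an ordinary byte and passes through unchanged, and decoding restores the original 4 bytes.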
+ +A C implementation of zero escaped encoding can be found at [here](https://github.com/xxuejie/quantum-resistant-lock-script/blob/aa9b37eabb207e7dbcd7cf43af963cce4f84692b/contracts/c-sphincs-all-in-one-lock/utils/zero_escape_encoding.h), while a Rust encoder can be found at [here](https://github.com/xxuejie/quantum-resistant-lock-script/blob/aa9b37eabb207e7dbcd7cf43af963cce4f84692b/contracts/hybrid-sphincs-all-in-one-lock/src/main.rs#L235-L266). + +## Spawn + +When expected, a root script invokes spawn syscall to create a new VM instance running the leaf script. The root script passes the following data to the leaf script. All the data listed here are concatenated together, encoded via `zero escaped encoding` discussed above, then be put in `argv[0]`: + +* The letter `s` in 1-byte ASCII encoding +* The length of signing message, in 32-bit little-endian unsigned integer +* The actual signing message in variable length + +One can deduce that one spawned leaf script will always run signature verifications using the same signing message. + +Before executing spawn syscall, the root script must create 2 pipes via [pipe](https://github.com/nervosnetwork/rfcs/blob/bd5d3ff73969bdd2571f804260a538781b45e996/rfcs/0050-vm-syscalls-3/0050-vm-syscalls-3.md#pipe) syscalls. There will be a total of 4 file descriptors created from 2 pipes. We will name them differently: + +* `root_to_leaf_pipe` is created so root script can pass signature data to leaf script for verification. `root_to_leaf_pipe[0]` is read by the leaf script, while `root_to_leaf_pipe[1]` is written by the root script. +* `leaf_to_root_pipe` is created so leaf script passes verification results back to root script. `leaf_to_root_pipe[0]` is read by the root script, while `leaf_to_root_pipe[1]` is written by the leaf script. + +In spawn syscall, the root script must pass `root_to_leaf_pipe[0]` and `leaf_to_root_pipe[1]` in this exact order to the invoked leaf script. 
 The leaf script uses [Inherited File Descriptors](https://github.com/nervosnetwork/rfcs/blob/bd5d3ff73969bdd2571f804260a538781b45e996/rfcs/0050-vm-syscalls-3/0050-vm-syscalls-3.md#inherited-file-descriptors) syscall to gather 2 file descriptors for the leaf script to use. + +When the spawn syscall succeeds, the root script would continuously send certain packets described below via `root_to_leaf_pipe[1]` to the leaf script. The leaf script loops to read such packets from `root_to_leaf_pipe[0]`. Right now, 2 formats exist for verification: + +### Signature Packet Format 1 + +Signature Packet Format 1 contains the following data concatenated together: + +* 0 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. +* 1 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. +* 20 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. +* witness source in 64-bit little-endian unsigned integer +* witness index in 32-bit little-endian unsigned integer +* witness offset in 32-bit little-endian unsigned integer +* witness length in 32-bit little-endian unsigned integer + +Note that pipe is capable of accepting binary data with zeros, the data above do not need zero escape encoding. + +The 4 variables (witness source / index / offset / length) included in the above signature data packet jointly locate a slice of data in a particular witness field from the current CKB transaction. The leaf script then treats the located slice of data as a series of `UNLOCKING_SLOT` structures, then performs the following verification: + +* If any `UNLOCKING_SLOT` uses a value of `ALGO_ID` which is not supported by the current leaf script, the leaf script halts. +* If any `UNLOCKING_SLOT` contains a signature, the leaf script verifies the signature against the public key included in this particular `UNLOCKING_SLOT`, and the signing message passed via `argv[0]`.
 If the signature verification process fails, the leaf script halts. +* If garbage data exist in the slice of data after the last `UNLOCKING_SLOT`, in other words, if there are still some data left in the slice, but they do not form a valid `UNLOCKING_SLOT` structure, the leaf script halts. + +When halting, the leaf script sends the following response packet via `leaf_to_root_pipe[1]` to the root script: + +* 0 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. +* 1 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. +* 0 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. + +The leaf script then terminates with a non-zero exit code. + +If all verification work succeeds, the leaf script sends the following response packet via `leaf_to_root_pipe[1]` to the root script: + +* 0 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. +* 0 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. +* 0 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. + +The leaf script then waits to read more packets for verification from `root_to_leaf_pipe[0]`. + +### Signature Packet Format 2 + +Signature Packet Format 2 contains the following data concatenated together: + +* 0 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. +* 2 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. +* 20 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. +* witness source in 64-bit little-endian unsigned integer +* witness index in 32-bit little-endian unsigned integer +* witness offset in 32-bit little-endian unsigned integer +* witness length in 32-bit little-endian unsigned integer + +Note that pipe is capable of accepting binary data with zeros, the data above do not need zero escape encoding. 
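
Both signature packet formats share the same byte layout and differ only in the second VLQ value. The Python sketch below assembles such a packet and interprets the 3-byte response packets described for the success and halting cases. The function names are hypothetical, the witness source value used in the example is an arbitrary placeholder, and each VLQ field is written as a single byte because all values involved are below 128.

```python
import struct


def signature_packet(packet_format: int, source: int, index: int,
                     offset: int, length: int) -> bytes:
    """Build a signature packet in format 1 or 2.

    Layout: VLQ(0) ++ VLQ(format) ++ VLQ(20) ++ u64 LE source
            ++ u32 LE index ++ u32 LE offset ++ u32 LE length
    """
    if packet_format not in (1, 2):
        raise ValueError("only signature packet formats 1 and 2 are defined")
    # "<QIII": little-endian, one 64-bit and three 32-bit unsigned integers.
    payload = struct.pack("<QIII", source, index, offset, length)
    assert len(payload) == 20  # matches the VLQ-encoded 20 in the packet
    return bytes([0, packet_format, len(payload)]) + payload


def response_is_success(response: bytes) -> bool:
    """Interpret a response packet: (0, 0, 0) is success, (0, 1, 0) is failure."""
    if len(response) != 3 or response[0] != 0 or response[2] != 0:
        raise ValueError("unexpected response packet")
    if response[1] == 0:
        return True
    if response[1] == 1:
        return False
    raise ValueError("unknown response status")


# Placeholder values: source 1, witness index 0, slice at offset 16 of 100 bytes.
example_packet = signature_packet(1, 1, 0, 16, 100)
```

The resulting packet is always 23 bytes: 3 single-byte VLQ values followed by the 20-byte witness locator. Unlike `argv[0]`, it is written to the pipe as raw bytes without zero escaped encoding.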
+ +In this case, the leaf script also locates a slice of data based on the 4 witness locating variables. Different from the previous version, the leaf script expects the located slice of data to be of the following format: + +* Public key of a given digital signature scheme. +* Signature of a given digital signature scheme. + +No trailing data should be present after the signature. + +The digital signature scheme to use here must be pre-determined for a leaf script. The leaf script then verifies the signature against the public key and the signing message in `argv[0]`. + +When the signature verification process fails, the leaf script sends the following response packet via `leaf_to_root_pipe[1]` to the root script: + +* 0 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. +* 1 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. +* 0 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. + +The leaf script then terminates with a non-zero exit code. + +If all verification work succeeds, the leaf script sends the following response packet via `leaf_to_root_pipe[1]` to the root script: + +* 0 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. +* 0 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. +* 0 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. + +The leaf script then waits to read more packets for verification from `root_to_leaf_pipe[0]`. + +This signature packet format is in fact designed for `ALGO_ID` of `63`, when an arbitrary script is loaded for signature verification. Signature packet format 2 can be used to exchange signatures to validate between the root script and the arbitrary leaf script. + +### Terminating Packet Format + +A terminating packet might be sent by the root script to terminate a leaf script.
 The format is as follows: + +* 0 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. +* 3 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. +* 0 encoded in [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity) format. + +Upon receiving this packet, the leaf script immediately terminates with zero as the return code. It does not send any response packet. + +Normally, a root script does not need to terminate a leaf script. When the VM instance with VM ID 0 terminates, all child VMs terminate automatically. This packet is only suitable when too many child VM instances have been created by the current running script group. In this case, the root script might choose to terminate some leaf scripts first, then spawn new ones. + +## Exec + +A root script might use exec syscall to invoke a leaf script, when the following conditions are met: + +* The remaining one or more signatures to verify can all be verified via a single leaf script. +* The arbitrary multisig validation flow is the last thing to verify in current root script. No other validation work is required. + +In other words, certain root scripts might be designed in a way that the arbitrary multisig validation flow is the last thing to verify. Exec syscalls can help optimize resource utilization in this case. One example of such workflow is that a multisig lock script might also consider the single-sign feature as a special case of the multisig script: one just sets `R`, `M` and `N` all to 1, and the multisig script can then work in a single-sign fashion. In this case, the multisig lock script might leverage exec syscall when it finds out that the multisig configuration in use only has one public key, or that all the public keys share the same digital signature algorithm / scheme. The multisig lock script can then safely invoke a leaf script for all signature verification.
The return code of the leaf script can simply be the final return code of the multisig lock script. + +It is worth mentioning that an arbitrary multisig implementation might choose to only use spawn syscalls to invoke leaf scripts. The whole workflow is doable with the help of the spawn syscall alone. + +To invoke a leaf script using the exec syscall, the root script concatenates the following data, encodes the result via `zero escaped encoding`, and puts it in `argv[0]`: +* The letter `s` in 1-byte ASCII encoding +* The length of signing message, in 32-bit little-endian unsigned integer +* The actual signing message in variable length +* witness source in 64-bit little-endian unsigned integer +* witness index in 32-bit little-endian unsigned integer +* witness offset in 32-bit little-endian unsigned integer +* witness length in 32-bit little-endian unsigned integer + +Due to the exec syscall's design, upon execution, the leaf script will replace the root script. The leaf script then locates a slice of data in a witness using the 4 witness locating variables, similar to `Signature Packet Format 1` in the spawn syscall's case. The leaf script also treats the slice of data as a series of `UNLOCKING_SLOT`s and performs the verification work: + +* If any `UNLOCKING_SLOT` uses a value of `ALGO_ID` which is not supported by the current leaf script, the leaf script halts. +* If any `UNLOCKING_SLOT` contains a signature, the leaf script verifies the signature against the public key included in this particular `UNLOCKING_SLOT`, and the signing message passed via `argv[0]`. If the signature verification process fails, the leaf script halts. +* If garbage data exist in the slice of data after the last `UNLOCKING_SLOT`, in other words, if there are still some data left in the slice, but they do not form a valid `UNLOCKING_SLOT` structure, the leaf script halts. + +When the leaf script halts, it terminates with a non-zero return code. 
When all the verifications succeed, the leaf script terminates with zero as the return code. In either case, the return code of the leaf script also becomes the final return code of the root script. + +## Security Concerns + +Whether to use the exec syscall depends on individual perspective: some would consider the exec syscall path too risky to implement, while others would be happy to use it for cycle reductions. The arbitrary multisig spec is designed so that one can rely on spawn syscalls alone to invoke leaf scripts. Exec syscalls, on the other hand, are left to the more adventurous mind for resource optimizations in the extreme case. Depending on different security assumptions, there is a spectrum of implementation plans regarding exec syscalls: + +1. Exec syscalls are not used at all. The root script should only rely on spawn syscalls to invoke leaf scripts. +2. The exec syscall is used only when the multisig configuration to validate is in fact a single-sign case, in other words, there is only one signature to verify. +3. Exec syscalls are used whenever possible. + +It is up to each root script implementing the arbitrary multisig specification to decide where on the above spectrum it falls. + +# Examples + +This [Rust function](https://github.com/xxuejie/quantum-resistant-lock-script/blob/bf2ab2a7a01a21c48d1151e0c488c66e0e4199c9/crates/ckb-fips205-utils/src/lib.rs#L184) iterates over all `UNLOCKING_SLOT`s in an unlocking configuration, performing validations as needed. It exposes a callback function where a contract can plug in the actual signature verification work; however, this particular function only supports a subset of all `ALGO_ID`s defined in this specification. This [Rust script](https://github.com/xxuejie/quantum-resistant-lock-script/blob/bf2ab2a7a01a21c48d1151e0c488c66e0e4199c9/contracts/hybrid-sphincs-all-in-one-lock/src/main.rs#L60) provides a complete example using the above function. 
The script first validates the unlocking configuration, then proceeds along one of 2 code paths: if all the public keys are generated using the same digital signature scheme, it invokes the exec syscall on a leaf script for verification; otherwise it iterates over the `unlocking configuration` again and, for each `UNLOCKING_SLOT`, uses the spawn syscall to create a child VM instance for a leaf script if needed, then communicates with the leaf script for signature verification work. + +This [C script](https://github.com/xxuejie/quantum-resistant-lock-script/blob/bf2ab2a7a01a21c48d1151e0c488c66e0e4199c9/contracts/c-sphincs-all-in-one-lock/ckb-sphincsplus-root-lock.c) implements a root script following the arbitrary multisig specification (though only a subset of `ALGO_ID`s is supported). This [C script](https://github.com/xxuejie/quantum-resistant-lock-script/blob/bf2ab2a7a01a21c48d1151e0c488c66e0e4199c9/contracts/c-sphincs-all-in-one-lock/ckb-sphincsplus-leaf-lock.c) implements a leaf script that accepts both exec and spawn syscalls, though it lacks support for `Signature Packet Format 2`. + +A fully flexible implementation of the arbitrary multisig will be added later to this specification.
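
To make the byte layouts described in this specification more concrete, here is a minimal Rust sketch of two of them: the three-value VLQ packets (failure response, success response, terminating packet) and the exec `argv[0]` payload before `zero escaped encoding` is applied. The function names `vlq_encode`, `encode_packet` and `exec_payload` are hypothetical and are not taken from the linked repositories. The sketch assumes the big-endian, 7-bits-per-byte VLQ variant described on the linked Wikipedia page, and it deliberately leaves out the `zero escaped encoding` step and any syscall interaction.

```rust
/// Encode `value` as a VLQ (assumed variant: 7 bits per byte, most
/// significant group first, high bit set on every byte except the last).
fn vlq_encode(mut value: u64) -> Vec<u8> {
    // Collect 7-bit groups from least to most significant, then reverse.
    let mut bytes = vec![(value & 0x7f) as u8];
    value >>= 7;
    while value > 0 {
        bytes.push(((value & 0x7f) as u8) | 0x80);
        value >>= 7;
    }
    bytes.reverse();
    bytes
}

/// Serialize a packet made of three VLQ-encoded values, e.g. `[0, 1, 0]`
/// for the failure response or `[0, 3, 0]` for the terminating packet.
fn encode_packet(fields: [u64; 3]) -> Vec<u8> {
    fields.iter().flat_map(|f| vlq_encode(*f)).collect()
}

/// Build the exec payload in the order listed in the `Exec` section.
/// NOTE: the result must still go through `zero escaped encoding` before
/// being placed in `argv[0]`; that step is intentionally omitted here.
fn exec_payload(message: &[u8], source: u64, index: u32, offset: u32, length: u32) -> Vec<u8> {
    let mut out = Vec::with_capacity(1 + 4 + message.len() + 8 + 4 + 4 + 4);
    out.push(b's'); // the letter `s` in 1-byte ASCII encoding
    out.extend_from_slice(&(message.len() as u32).to_le_bytes()); // message length
    out.extend_from_slice(message); // signing message
    out.extend_from_slice(&source.to_le_bytes()); // witness source (64-bit)
    out.extend_from_slice(&index.to_le_bytes()); // witness index
    out.extend_from_slice(&offset.to_le_bytes()); // witness offset
    out.extend_from_slice(&length.to_le_bytes()); // witness length
    out
}
```

Because every value in the three packets above is smaller than 128, each packet serializes to exactly 3 bytes under this VLQ variant; for instance, `encode_packet([0, 3, 0])` yields the bytes `0x00 0x03 0x00`. A root script would write such bytes to `root_to_leaf_pipe[1]`, while a leaf script would write its responses to `leaf_to_root_pipe[1]`.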