Skip to content

DensePauli class in Rust#16137

Open
alexanderivrii wants to merge 6 commits intoQiskit:mainfrom
alexanderivrii:rust-dense-pauli
Open

DensePauli class in Rust#16137
alexanderivrii wants to merge 6 commits intoQiskit:mainfrom
alexanderivrii:rust-dense-pauli

Conversation

@alexanderivrii
Copy link
Copy Markdown
Member

@alexanderivrii alexanderivrii commented May 5, 2026

Summary

This PR adds a DensePauli class in Rust.

The DensePauli class is required for extending the Litinski transform to support Pauli rotations (see #15974), and in fact the current implementation is initially based on the implementation there. The plan is to rebase #15974 on top of this PR when this merges. The DensePauli class is also required for the other current work of extending the Rust-based StabilizerTableau (Clifford) simulation to support projective Pauli measurements. It can also eventually become a viable alternative to the Python Pauli class, allowing to replace the python Pauli class by the much faster rust Pauli class.

Implementation-wise, DensePauli is as follows:

pub struct DensePauli {
    /// z-component
    pub pauli_z: FixedBitSet,
    /// x-component
    pub pauli_x: FixedBitSet,
    /// zx-phase
    pub zx_phase: u8,
}

The Z- and X- Pauli components are stored using FixedBitSet, allowing bit-packing and word-level operations, and leading to very fast commutativity and multiplication methods.

Additionally, this implements a function evolve_pauli_by_clifford to evolve a Clifford by a dense Pauli.

Co-authored with Shelly Garion.

AI/LLM disclosure

  • I didn't use LLM tooling, or only used it privately.
  • I used the following tool to help write this PR description:
  • I used the following tool to generate or modify code:

The dense Pauli class is implemented using FixedBitSet, allowing bit-packing and
word-level operations.

Co-authored-by: Shelly Garion <shelly@il.ibm.com>
@alexanderivrii alexanderivrii added this to the 2.5.0 milestone May 5, 2026
@alexanderivrii alexanderivrii requested a review from a team as a code owner May 5, 2026 08:53
@alexanderivrii alexanderivrii requested a review from Cryoris May 5, 2026 08:53
@alexanderivrii alexanderivrii added Changelog: None Do not include in the GitHub Release changelog. fault tolerance related to fault tolerance compilation labels May 5, 2026
@github-project-automation github-project-automation Bot moved this to Ready in Qiskit 2.5 May 5, 2026
@qiskit-bot
Copy link
Copy Markdown
Collaborator

One or more of the following people are relevant to this code:

  • @Qiskit/terra-core

Comment thread crates/quantum_info/src/dense_pauli.rs Outdated
///
/// This function will panic if the two Paulis are of different length.
pub fn commutes(&self, other: &DensePauli) -> bool {
debug_assert!(self.num_qubits() == other.num_qubits());
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I improve error-handling from debug asserts to returning a Result? Though ideally I want this particular function to be as fast as possible and would prefer to avoid doing additional checks.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a comment on this below, that should answer this question 🙂

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the three main methods commute, compose and compose_with (compose in-place) I have added two variants of the function: the unchecked version which does not perform the check and the checked version that returns a proper Result.

From my local experiments on MacOS, the commute_unchecked is about 2x faster than commute, and so the calling application should use that when it can guarantee that both Paulis have the same length.

Very surprisingly to me, the compose_with_unchecked is slower than compose_with. I expect that the compiler is able to perform additional optimizations with FixedBitSet if it can detect that both paulis have the same length.

For compose_with vs. compose both functions run in a very similar time.

Comment on lines +302 to +304
for i in 0..num_qubits {
parity ^= (self.pauli_z[i] & other.pauli_x[i]) ^ (self.pauli_x[i] & other.pauli_z[i]);
}
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if there might be a better way to iterate over blocks of the FixedBitSet than over individual bits. I am planning to explore this next.

One thing that I already tried was something like this

parity = (&(&self.pauli_z & &other.pauli_x) ^ &(&self.pauli_x & &other.pauli_z)).count_ones(..)

however that was significantly slower because this creates new FixedBitSets in the process.

Comment thread crates/quantum_info/src/dense_pauli.rs Outdated
Comment on lines +210 to +214
/// .. note::
///
/// Unlike Python-space Qiskit convention, the label is represented left-to-right,
/// for example "-iXIZY" is interpreted as `'X'` on qubit `0`, followed by `'I'` on qubit `1`,
/// and so on.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I am always getting confused with the Qiskit python convention of reading the pauli terms right-to-left, so I have made this and the from_label functions to treat labels as left-to-right. However, if you think that it's better to keep the notation consistent throughout Qiskit, I can change this.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After an offline discussion, it was decided to keep the existing Qiskit convention. So the Paulis are now labeled right-to-left.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I think that it's better that rust and python provide the same output

Comment on lines 387 to 391
fn compute_phase_product_pauli(
clifford: &Clifford,
pauli_indices: &[usize],
pauli_y_count: u32,
pauli_y_count: u8,
) -> bool {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This question is Independent of this PR, but I might as well ask it here. This function is the hot-spot for evolving Paulis by Cliffords and in particular for the LitinskiTransformation pass. I have locally tried multiple ways to reimplement it.

The following version replaces the match statement by a static table lookup.

static PHASE_TABLE: [u8; 16] = [0, 0, 0, 0, 0, 0, 3, 1, 0, 1, 0, 3, 0, 3, 1, 0];
...
let idx = (x1 as u8) | ((z1 as u8) << 1) | ((x as u8) << 2) | ((z as u8) << 3);
ifact += PHASE_TABLE[idx as usize];

It works very well for large densely populated Cliffords (resulting in about 5x performance improvement) but is apparently slightly worse than the current implementation on the existing ASV benchmarks.

I have also tried replacing the match or the table lookup by an explicit arithmetic computation, but this was consistently worse than the table lookup on all of the benchmarks.

I am wondering if there is a clever way to vectorize this computation - or any additional ideas.

Copy link
Copy Markdown
Collaborator

@Cryoris Cryoris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a dedicated Pauli class sounds like it can be very useful, and the implementation looks good to me overall. I've left some questions below and have some other, higher-level questions on top:

  • Why was the final choice FixedBitSet? Do we have benchmarking data backing this up?
  • It would be good if we actually use these new features somewhere. E.g. the existing Litinski transform could already use it, which would help identifying any performance regressions.
  • In several places we've documented that a function would panic, e.g. if the Pauli and Clifford don't have the same size, but there's only a debug_assert in the function. Afaik, this doesn't actually panic if compiled in release mode. We should either replace this by a proper error, which the user can handle, or, if you're worried about the overhead, provide unsafe variants that require the sizes to be correct.

z: &[bool],
indices: &[u32],
phase: u8,
is_group_phase: bool,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand the is_group_phase -- what exactly does it do and what do we use it for?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two standard conventions to implement the phase of a single-qubit Pauli.

The "group phase" convention:

  • the (x, z) pair (false, false) corresponds to I
  • the (x, z) pair (true, false) corresponds to X
  • the (x, z) pair (false, true) corresponds to Z
  • the (x, z) pair (true, true) corresponds to Y

The "xz-phase" convention

  • the (x, z) pair (false, false) corresponds to I
  • the (x, z) pair (true, false) corresponds to X
  • the (x, z) pair (false, true) corresponds to Z
  • the (x, z) pair (true, true) corresponds to XZ = -iY

It is faster to check whether two Paulis commute or to multiply two Paulis using XZ-phase convention. However, when we call this function we might want to pass the group phase instead of the xz-phase.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we also have this convention of zx_phase and group_phase in the python code

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Shelly. Regarding XZ vz. ZX: ZX is a much more standard convention, so I have renamed all XZs to ZXs and made sure that z arguments are always passed/returned first and x second. See d55e917.

Comment thread crates/quantum_info/src/dense_pauli.rs Outdated
Comment on lines +60 to +61
x: &[bool],
z: &[bool],
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use consistent z-x order of arguments throughout? E.g. evolve_by_... takes (z, x) but this is (x, z). Since both have the same type it seems one could easily mix up the orders. I don't have a strong preference on the order as long as its consistent, but the quantum_info module seems to consistently use (z, x).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I will double check whether the standard name is "xz-phase" or "zx-phase" (I have probably seen both used in different packages).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in d55e917.

/// Evolving a Pauli with a single non-identity Z-term on qubit `qbit` by the given Clifford.
///
/// Return the evolved Pauli in a sparse ZX format: (sign, z, x, indices).
pub fn get_inverse_z(&self, qbit: usize) -> (bool, Vec<bool>, Vec<bool>, Vec<u32>) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this implementation, given that we now have evolve_single_qubit_pauli_dense?

It would be good to update the existing code paths to use the new functionality, also from a testing POV (since the new function is not used anywhere yet, right?).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the "sparse" variant of the same function. From my local experiments and an offline discussion with you, we might need both variants. Applying Litinski to a single-qubit RZ-rotation is faster with the sparse format. Applying on a multi-qubit PPR is faster with a dense format.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so perhaps a better name is needed?
maybe: evolve_single_qubit_pauli_sparse ?

Comment thread crates/quantum_info/src/clifford.rs Outdated
Comment thread crates/quantum_info/src/dense_pauli.rs Outdated
Comment thread crates/quantum_info/src/dense_pauli.rs Outdated
Comment on lines +295 to +297
/// Panics
///
/// This function will panic if the two Paulis are of different length.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is actually correct -- if we compile in release mode, then the debug_asserts are not included and this would not panic but pass silently if self.num_qubits < other.num_qubits. Could we replace the debug assert with an actual check and an error that we can handle?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you have seen from one of the comments, I am not happy with the debug asserts either. Possibly the additional check will not be too costly. Looking at the existing sparse pauli class we can provide both checked and unchecked method if we think this is performance-critical.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, see also this comment.

Comment thread crates/quantum_info/src/dense_pauli.rs Outdated
Comment thread crates/quantum_info/src/dense_pauli.rs Outdated
Comment thread crates/quantum_info/src/dense_pauli.rs Outdated
s = String::from("-i") + &s;
}
_ => {
panic!("we should never get this")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a bit more descriptive would be nice

Suggested change
panic!("we should never get this")
panic!("DensePauli phases are constraint in {0, 1, 2, 3}.")

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see this was updated

Comment thread crates/quantum_info/src/dense_pauli.rs Outdated
@coveralls
Copy link
Copy Markdown

coveralls commented May 5, 2026

Coverage Report for CI Build 25670356190

Warning

Build has drifted: This PR's base is out of sync with its target branch, so coverage data may include unrelated changes.
Quick fix: rebase this PR. Learn more →

Coverage increased (+0.05%) to 87.63%

Details

  • Coverage increased (+0.05%) from the base build.
  • Patch coverage: 60 uncovered changes across 2 files (410 of 470 lines covered, 87.23%).
  • 690 coverage regressions across 15 files.

Uncovered Changes

File Changed Covered %
crates/quantum_info/src/dense_pauli.rs 407 349 85.75%
crates/quantum_info/src/clifford.rs 63 61 96.83%

Coverage Regressions

690 previously-covered lines in 15 files lost coverage.

Top 10 Files by Coverage Loss Lines Losing Coverage Coverage
crates/circuit/src/dag_circuit.rs 398 84.69%
crates/circuit/src/circuit_drawer.rs 73 95.26%
crates/circuit/src/circuit_data.rs 52 87.17%
crates/circuit/src/parameter/parameter_expression.rs 51 91.04%
crates/circuit/src/register_data.rs 42 69.74%
crates/transpiler/src/passes/basis_translator/compose_transforms.rs 16 84.67%
crates/transpiler/src/commutation_checker.rs 14 88.79%
crates/transpiler/src/passes/commutation_cancellation.rs 10 91.38%
crates/circuit/src/interner.rs 9 97.02%
crates/transpiler/src/passes/commutation_analysis.rs 8 93.6%

Coverage Stats

Coverage Status
Relevant Lines: 122584
Covered Lines: 107420
Line Coverage: 87.63%
Coverage Strength: 960538.33 hits per line

💛 - Coveralls

@alexanderivrii
Copy link
Copy Markdown
Member Author

Julien, your additional questions:

Why was the final choice FixedBitSet? Do we have benchmarking data backing this up?

There are multiple questions here. First, do we need a "dense" Pauli class in addition to the sparse Pauli class we already have? I think we do. Second, what is the best way to implement the "dense" class? Checking commutativity/multiplying two Paulis represented using FixedBitSet is much faster than when represented using Vec<bool>, I can pull some older benchmarking data.

It would be good if we actually use these new features somewhere. E.g. the existing Litinski transform could already use it, which would help identifying any performance regressions.

I am not sure I fully agree with this. Having a single PR that adds a DensePauli class, changes the LitinskiTransform to use this class, and further extends the LitinskiTransform to handle Clifford-like PPR gates (as per discussion on #15974) is just too much of a change. I think a more appropriate division is to having a working DensePauli class as a separate PR. But I am adding more Rust tests to make sure the functionality works as expected.

@Cryoris
Copy link
Copy Markdown
Collaborator

Cryoris commented May 5, 2026

It would be good if we actually use these new features somewhere. E.g. the existing Litinski transform could already use it, which would help identifying any performance regressions.

I am not sure I fully agree with this. Having a single PR that adds a DensePauli class, changes the LitinskiTransform to use this class, and further extends the LitinskiTransform to handle Clifford-like PPR gates (as per discussion on #15974) is just too much of a change. I think a more appropriate division is to having a working DensePauli class as a separate PR. But I am adding more Rust tests to make sure the functionality works as expected.

The main reason I'm suggesting this is to track the performance impact. If you integrated it locally or, even better, could open a follow-up PR that uses this, and could run asv or some benchmarks, that would also be enough 🙂

Comment thread crates/quantum_info/src/dense_pauli.rs
z: &[bool],
indices: &[u32],
phase: u8,
is_group_phase: bool,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we also have this convention of zx_phase and group_phase in the python code

Comment thread crates/quantum_info/src/dense_pauli.rs
Comment thread crates/quantum_info/src/dense_pauli.rs
Comment thread crates/quantum_info/src/dense_pauli.rs Outdated
Comment thread crates/quantum_info/src/clifford.rs Outdated
/// * `indices`: Qubit indices corresponding to `x` and `z`.
/// * `phase`: The phase of the Pauli operator, encoded modulo 4.
/// * `is_group_phase`: If `true`, `phase` is interpreted as a group phase
/// and is converted to the internal ZX-phase representation.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps it's worth to explain the difference between group_phase and zx_phase in the comment

Comment thread crates/quantum_info/src/dense_pauli.rs Outdated
s = String::from("-i") + &s;
}
_ => {
panic!("we should never get this")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see this was updated

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Changelog: None Do not include in the GitHub Release changelog. fault tolerance related to fault tolerance compilation

Projects

Status: Ready

Development

Successfully merging this pull request may close these issues.

5 participants