DensePauli class in Rust #16137

Open
alexanderivrii wants to merge 1 commit into Qiskit:main from alexanderivrii:rust-dense-pauli

Conversation

@alexanderivrii
Member

Summary

This PR adds a DensePauli class in Rust.

The DensePauli class is required for extending the Litinski transform to support Pauli rotations (see #15974); in fact, the current implementation is initially based on the implementation there. The plan is to rebase #15974 on top of this PR once this merges. The DensePauli class is also required for other ongoing work extending the Rust-based StabilizerTableau (Clifford) simulation to support projective Pauli measurements. Eventually, it could also become a viable alternative to the Python Pauli class, replacing it with a much faster Rust implementation.

Implementation-wise, DensePauli is as follows:

pub struct DensePauli {
    /// x-component
    pub pauli_x: FixedBitSet,
    /// z-component
    pub pauli_z: FixedBitSet,
    /// xz-phase
    pub xz_phase: u8,
}

The X- and Z-Pauli components are stored as FixedBitSets, allowing bit-packing and word-level operations and leading to very fast commutativity and multiplication methods.

Additionally, this PR implements a function evolve_pauli_by_clifford to evolve a dense Pauli by a Clifford.

Co-authored with Shelly Garion.

AI/LLM disclosure

  • I didn't use LLM tooling, or only used it privately.
  • I used the following tool to help write this PR description:
  • I used the following tool to generate or modify code:

The dense Pauli class is implemented using FixedBitSet, allowing bit-packing and
word-level operations.

Co-authored-by: Shelly Garion <shelly@il.ibm.com>
@alexanderivrii alexanderivrii added this to the 2.5.0 milestone May 5, 2026
@alexanderivrii alexanderivrii requested a review from a team as a code owner May 5, 2026 08:53
@alexanderivrii alexanderivrii requested a review from Cryoris May 5, 2026 08:53
@alexanderivrii alexanderivrii added Changelog: None Do not include in the GitHub Release changelog. fault tolerance related to fault tolerance compilation labels May 5, 2026
@github-project-automation github-project-automation Bot moved this to Ready in Qiskit 2.5 May 5, 2026
@qiskit-bot
Collaborator

One or more of the following people are relevant to this code:

  • @Qiskit/terra-core

///
/// This function will panic if the two Paulis are of different length.
pub fn commutes(&self, other: &DensePauli) -> bool {
debug_assert!(self.num_qubits() == other.num_qubits());
Member Author

Should I improve error-handling from debug asserts to returning a Result? Though ideally I want this particular function to be as fast as possible and would prefer to avoid doing additional checks.

Collaborator

I left a comment on this below, that should answer this question 🙂

Comment on lines +302 to +304
for i in 0..num_qubits {
parity ^= (self.pauli_z[i] & other.pauli_x[i]) ^ (self.pauli_x[i] & other.pauli_z[i]);
}
Member Author

I am wondering if there might be a better way to iterate over blocks of the FixedBitSet than over individual bits. I am planning to explore this next.

One thing that I already tried was something like this

parity = (&(&self.pauli_z & &other.pauli_x) ^ &(&self.pauli_x & &other.pauli_z)).count_ones(..)

However, that was significantly slower because it creates new FixedBitSets in the process.
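For what it's worth, a block-wise version of this parity loop can be written without allocating temporaries by operating on packed words directly. The sketch below uses plain u64 slices rather than FixedBitSet (whose block-access API I'm not assuming here), so it only illustrates the shape such a loop could take:

```rust
/// Symplectic-parity check over packed 64-bit words. Two Paulis commute
/// iff the number of qubits where they locally anticommute is even.
fn commutes_packed(x1: &[u64], z1: &[u64], x2: &[u64], z2: &[u64]) -> bool {
    let mut parity = 0u32;
    for i in 0..x1.len() {
        // One AND/XOR per 64 qubits; no temporary bitsets are allocated,
        // unlike the whole-bitset `&`/`^` operators.
        parity ^= ((z1[i] & x2[i]) ^ (x1[i] & z2[i])).count_ones() & 1;
    }
    parity == 0
}

fn main() {
    // X and Z on the same qubit anticommute...
    assert!(!commutes_packed(&[0b01], &[0b00], &[0b00], &[0b01]));
    // ...but X on qubit 0 and Z on qubit 1 commute.
    assert!(commutes_packed(&[0b01], &[0b00], &[0b00], &[0b10]));
    println!("ok");
}
```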

Comment on lines +210 to +214
/// .. note::
///
/// Unlike Python-space Qiskit convention, the label is represented left-to-right,
/// for example "-iXIZY" is interpreted as `'X'` on qubit `0`, followed by `'I'` on qubit `1`,
/// and so on.
Member Author

Personally, I always get confused by the Qiskit Python convention of reading Pauli terms right-to-left, so I have made this and the from_label functions treat labels as left-to-right. However, if you think it is better to keep the notation consistent throughout Qiskit, I can change this.

Comment on lines 387 to 391
fn compute_phase_product_pauli(
    clifford: &Clifford,
    pauli_indices: &[usize],
-    pauli_y_count: u32,
+    pauli_y_count: u8,
) -> bool {
Member Author

This question is independent of this PR, but I might as well ask it here. This function is the hot-spot for evolving Paulis by Cliffords, and in particular for the LitinskiTransformation pass. I have locally tried multiple ways to reimplement it.

The following version replaces the match statement by a static table lookup.

static PHASE_TABLE: [u8; 16] = [0, 0, 0, 0, 0, 0, 3, 1, 0, 1, 0, 3, 0, 3, 1, 0];
...
let idx = (x1 as u8) | ((z1 as u8) << 1) | ((x as u8) << 2) | ((z as u8) << 3);
ifact += PHASE_TABLE[idx as usize];

It works very well for large densely populated Cliffords (resulting in about 5x performance improvement) but is apparently slightly worse than the current implementation on the existing ASV benchmarks.

I have also tried replacing the match or the table lookup by an explicit arithmetic computation, but this was consistently worse than the table lookup on all of the benchmarks.

I am wondering if there is a clever way to vectorize this computation - or any additional ideas.
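As a self-contained illustration of the match-versus-table pattern (not the PR's actual phase rule, whose semantics aren't shown here), below is the analogous table for the phase exponent k, with P1·P2 = i^k·P3, of the product of two single-qubit Paulis given as (x, z) pairs, using the same bit-packed indexing as the snippet above:

```rust
/// Phase exponent k (in i^k) for the product of two single-qubit Paulis,
/// written as an explicit match. (x, z): (0,0)=I, (1,0)=X, (0,1)=Z, (1,1)=Y.
fn product_phase_match(x1: bool, z1: bool, x2: bool, z2: bool) -> u8 {
    match (x1, z1, x2, z2) {
        (false, true, true, false) => 1, // Z * X = iY
        (true, true, true, false) => 3,  // Y * X = -iZ
        (true, false, false, true) => 3, // X * Z = -iY
        (true, true, false, true) => 1,  // Y * Z = iX
        (true, false, true, true) => 1,  // X * Y = iZ
        (false, true, true, true) => 3,  // Z * Y = -iX
        _ => 0,                          // identity factors or equal Paulis
    }
}

/// The same function as a static table, indexed exactly like the PR snippet:
/// idx = x1 | z1 << 1 | x2 << 2 | z2 << 3.
static PRODUCT_PHASE: [u8; 16] = [0, 0, 0, 0, 0, 0, 1, 3, 0, 3, 0, 1, 0, 1, 3, 0];

fn main() {
    for idx in 0..16usize {
        let (x1, z1) = (idx & 1 != 0, idx & 2 != 0);
        let (x2, z2) = (idx & 4 != 0, idx & 8 != 0);
        assert_eq!(product_phase_match(x1, z1, x2, z2), PRODUCT_PHASE[idx]);
    }
    println!("ok");
}
```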

Collaborator

@Cryoris Cryoris left a comment

Having a dedicated Pauli class sounds like it can be very useful, and the implementation looks good to me overall. I've left some questions below and have some other, higher-level questions on top:

  • Why was the final choice FixedBitSet? Do we have benchmarking data backing this up?
  • It would be good if we actually use these new features somewhere. E.g. the existing Litinski transform could already use it, which would help identifying any performance regressions.
  • In several places we've documented that a function would panic, e.g. if the Pauli and Clifford don't have the same size, but there's only a debug_assert in the function. Afaik, this doesn't actually panic if compiled in release mode. We should either replace this by a proper error, which the user can handle, or, if you're worried about the overhead, provide unsafe variants that require the sizes to be correct.

z: &[bool],
indices: &[u32],
phase: u8,
is_group_phase: bool,
Collaborator

Not sure I understand the is_group_phase -- what exactly does it do and what do we use it for?

Member Author

There are two standard conventions for representing the phase of a single-qubit Pauli.

The "group phase" convention:

  • the (x, z) pair (false, false) corresponds to I
  • the (x, z) pair (true, false) corresponds to X
  • the (x, z) pair (false, true) corresponds to Z
  • the (x, z) pair (true, true) corresponds to Y

The "xz-phase" convention:

  • the (x, z) pair (false, false) corresponds to I
  • the (x, z) pair (true, false) corresponds to X
  • the (x, z) pair (false, true) corresponds to Z
  • the (x, z) pair (true, true) corresponds to XZ = -iY

It is faster to check whether two Paulis commute or to multiply two Paulis using XZ-phase convention. However, when we call this function we might want to pass the group phase instead of the xz-phase.
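Assuming the phase exponent counts powers of -i (one common choice; the PR's exact convention isn't stated in this thread), the two conventions differ only by the number of Y terms, since each Y = i·XZ contributes one factor when converting. A hypothetical conversion sketch:

```rust
/// Hypothetical conversion between the two conventions, assuming the phase
/// is stored as an exponent of (-i) and `y_count` is the number of qubits
/// with (x, z) = (true, true). Since XZ = -iY, the stored product of
/// X^x Z^z terms carries one extra factor of (-i) per Y.
fn xz_to_group(xz_phase: u8, y_count: u32) -> u8 {
    ((xz_phase as u32 + y_count) % 4) as u8
}

fn group_to_xz(group_phase: u8, y_count: u32) -> u8 {
    ((group_phase as u32 + 4 - y_count % 4) % 4) as u8
}

fn main() {
    // Plain Y (group phase 0) is stored as (-i)^3 * XZ, i.e. xz_phase = 3.
    assert_eq!(group_to_xz(0, 1), 3);
    assert_eq!(xz_to_group(3, 1), 0);
    // The two conversions are inverses for any phase and Y count.
    for g in 0u8..4 {
        for y in 0u32..8 {
            assert_eq!(xz_to_group(group_to_xz(g, y), y), g);
        }
    }
    println!("ok");
}
```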

Member

we also have this convention of zx_phase and group_phase in the python code

Comment on lines +60 to +61
x: &[bool],
z: &[bool],
Collaborator

Could we use a consistent z-x order of arguments throughout? E.g. evolve_by_... takes (z, x) but this is (x, z). Since both have the same type, it seems one could easily mix up the orders. I don't have a strong preference on the order as long as it's consistent, but the quantum_info module seems to consistently use (z, x).

Member Author

That's a good point. I will double check whether the standard name is "xz-phase" or "zx-phase" (I have probably seen both used in different packages).

/// Evolving a Pauli with a single non-identity Z-term on qubit `qbit` by the given Clifford.
///
/// Return the evolved Pauli in a sparse ZX format: (sign, z, x, indices).
pub fn get_inverse_z(&self, qbit: usize) -> (bool, Vec<bool>, Vec<bool>, Vec<u32>) {
Collaborator

Do we still need this implementation, given that we now have evolve_single_qubit_pauli_dense?

It would be good to update the existing code paths to use the new functionality, also from a testing POV (since the new function is not used anywhere yet, right?).

Member Author

This is the "sparse" variant of the same function. From my local experiments and an offline discussion with you, we might need both variants. Applying Litinski to a single-qubit RZ-rotation is faster with the sparse format. Applying on a multi-qubit PPR is faster with a dense format.

self.tableau[qbit + num_qubits][i + num_qubits]
^ self.tableau[qbit][i + num_qubits],
),
_ => unreachable!("This is only called for RX/RZ/RY gates."),
Collaborator

This is not unreachable in the current form. It's a pub function, and calling it with pauli_z=false, pauli_x=false is a valid input.

Comment on lines +239 to +241
0 => {
s = String::from("") + &s;
}
Collaborator

We can avoid the copy here by just return s; directly in this case

Member Author

Oh, silly me :).

Comment on lines +295 to +297
/// Panics
///
/// This function will panic if the two Paulis are of different length.
Collaborator

I don't think this is actually correct -- if we compile in release mode, then the debug_asserts are not included and this would not panic but pass silently if self.num_qubits < other.num_qubits. Could we replace the debug assert with an actual check and an error that we can handle?

Member Author

As you have seen from one of the comments, I am not happy with the debug asserts either. Possibly the additional check will not be too costly. Following the existing sparse Pauli class, we can provide both checked and unchecked methods if we think this is performance-critical.
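One possible shape for that checked/unchecked split, sketched on a hypothetical stand-in type (ToyPauli and the String error are placeholders, not the PR's actual types):

```rust
/// Hypothetical stand-in for DensePauli, just enough to show the split
/// between a checked API and an unchecked hot-path variant.
struct ToyPauli {
    x: Vec<u64>,
    z: Vec<u64>,
    num_qubits: usize,
}

impl ToyPauli {
    /// Checked variant: validates sizes and returns an error the caller can
    /// handle, in release builds too. (A real implementation would use
    /// Qiskit's error types rather than String.)
    fn commutes(&self, other: &ToyPauli) -> Result<bool, String> {
        if self.num_qubits != other.num_qubits {
            return Err(format!(
                "size mismatch: {} vs {} qubits",
                self.num_qubits, other.num_qubits
            ));
        }
        Ok(self.commutes_unchecked(other))
    }

    /// Unchecked variant for hot loops: the caller guarantees equal sizes,
    /// and the debug_assert only fires in debug builds.
    fn commutes_unchecked(&self, other: &ToyPauli) -> bool {
        debug_assert_eq!(self.num_qubits, other.num_qubits);
        let mut parity = 0u32;
        for i in 0..self.x.len() {
            parity ^=
                ((self.z[i] & other.x[i]) ^ (self.x[i] & other.z[i])).count_ones() & 1;
        }
        parity == 0
    }
}

fn main() {
    let a = ToyPauli { x: vec![0b01], z: vec![0b00], num_qubits: 2 }; // X on qubit 0
    let b = ToyPauli { x: vec![0b00], z: vec![0b01], num_qubits: 2 }; // Z on qubit 0
    assert_eq!(a.commutes(&b), Ok(false)); // X and Z on the same qubit anticommute
    let c = ToyPauli { x: vec![0, 0], z: vec![0, 0], num_qubits: 128 };
    assert!(a.commutes(&c).is_err()); // mismatch surfaces as a real error
    println!("ok");
}
```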

Comment on lines +66 to +73
debug_assert!(
x.len() == indices.len(),
"x and indices must have the same length"
);
debug_assert!(
z.len() == indices.len(),
"z and indices must have the same length"
);
Collaborator

Can we replace this with a proper error that is returned?

xz_phase = xz_phase.wrapping_add(1) & 3;
}
_ => {
panic!("Incorrect label");
Collaborator

Also here, could we use an error? 🙂

Member Author

You convinced me, I should provide proper error handling throughout the code.

s = String::from("-i") + &s;
}
_ => {
panic!("we should never get this")
Collaborator

Maybe a bit more descriptive would be nice

Suggested change
panic!("we should never get this")
panic!("DensePauli phases are constrained to {0, 1, 2, 3}.")

/// This function will panic if the Clifford and the Pauli do not have the
/// same number of qubits.
pub fn evolve_pauli_by_clifford(pauli: &DensePauli, cliff: &Clifford) -> DensePauli {
debug_assert!(pauli.num_qubits() == cliff.num_qubits);
Collaborator

Same here, this does not get compiled into the program in release mode. If we want an actual check then we should just return an error if this fails.

@coveralls

Coverage Report for CI Build 25366950547

Coverage increased (+0.03%) to 87.604%

Details

  • Coverage increased (+0.03%) from the base build.
  • Patch coverage: 40 uncovered changes across 2 files (405 of 445 lines covered, 91.01%).
  • 7 coverage regressions across 3 files.

Uncovered Changes

File Changed Covered %
crates/quantum_info/src/dense_pauli.rs 385 346 89.87%
crates/quantum_info/src/clifford.rs 60 59 98.33%

Coverage Regressions

7 previously-covered lines in 3 files lost coverage.

File Lines Losing Coverage Coverage
crates/qasm2/src/lex.rs 4 92.54%
crates/circuit/src/parameter/symbol_expr.rs 2 74.17%
crates/circuit/src/parameter/parameter_expression.rs 1 90.53%

Coverage Stats

Coverage Status
Relevant Lines: 122237
Covered Lines: 107084
Line Coverage: 87.6%
Coverage Strength: 956486.23 hits per line

💛 - Coveralls

@alexanderivrii
Member Author

Julien, your additional questions:

Why was the final choice FixedBitSet? Do we have benchmarking data backing this up?

There are multiple questions here. First, do we need a "dense" Pauli class in addition to the sparse Pauli class we already have? I think we do. Second, what is the best way to implement the "dense" class? Checking commutativity/multiplying two Paulis represented using FixedBitSet is much faster than when represented using Vec<bool>, I can pull some older benchmarking data.

It would be good if we actually use these new features somewhere. E.g. the existing Litinski transform could already use it, which would help identifying any performance regressions.

I am not sure I fully agree with this. Having a single PR that adds a DensePauli class, changes the LitinskiTransform to use this class, and further extends the LitinskiTransform to handle Clifford-like PPR gates (as per the discussion on #15974) is just too much of a change. I think a more appropriate division is to have a working DensePauli class as a separate PR. But I am adding more Rust tests to make sure the functionality works as expected.

@Cryoris
Collaborator

Cryoris commented May 5, 2026

It would be good if we actually use these new features somewhere. E.g. the existing Litinski transform could already use it, which would help identifying any performance regressions.

I am not sure I fully agree with this. Having a single PR that adds a DensePauli class, changes the LitinskiTransform to use this class, and further extends the LitinskiTransform to handle Clifford-like PPR gates (as per the discussion on #15974) is just too much of a change. I think a more appropriate division is to have a working DensePauli class as a separate PR. But I am adding more Rust tests to make sure the functionality works as expected.

The main reason I'm suggesting this is to track the performance impact. If you integrated it locally or, even better, could open a follow-up PR that uses this, and could run asv or some benchmarks, that would also be enough 🙂

pub fn count_y(&self) -> u8 {
let num_qubits = self.num_qubits();
let mut cnt_y = 0;
for i in 0..num_qubits {
Member

why did you change it to a for loop instead of an iterator?

Member Author

I have experimented with multiple versions of each function, so this is possibly just the last version I tried (but I don't think that this had any effect on performance). As I mentioned somewhere, I am planning to see if we can speed this up by iterating over blocks of the FixedBitSet rather than individual bits: this may give a performance benefit if the compiler does not already optimize this for us.


debug_assert!(self.num_qubits() == other.num_qubits());
let mut xz_phase = self.xz_phase + other.xz_phase;
let num_qubits = self.num_qubits();
for i in 0..num_qubits {
Member

why is there a for loop here and not an iterator?

debug_assert!(self.num_qubits() == other.num_qubits());
let num_qubits = self.num_qubits();
self.xz_phase += other.xz_phase;
for i in 0..num_qubits {
Member

same question on the for loop

fn test_evolve_2_qubits() {
use ndarray::Array2;

// Random Clifford created from Python (with seed=1234).
Member

generated using the Python code:
random_clifford(2, seed=1234)

5 participants