Use TwoQubitPeepholeOptimization in preset pass managers#16136
mtreinish wants to merge 89 commits into Qiskit:main
Conversation
This commit adds a new transpiler pass for physical optimization, TwoQubitPeepholeOptimization. It replaces the use of Collect2qBlocks, ConsolidateBlocks, and UnitarySynthesis in the optimization stage of a default pass manager setup. The pass logically works the same way: it analyzes the DAG to get a list of 2q runs, calculates the matrix of each run, then synthesizes the matrix and substitutes it in place. The distinction is that this pass does all of this in a single pass, and it also parallelizes the matrix calculation and synthesis steps because there is no data dependency between runs. This new pass is not meant to fully replace the Collect2qBlocks, ConsolidateBlocks, or UnitarySynthesis passes, as those also run in contexts where we don't have a physical circuit; it is meant to replace their usage in the optimization stage only. Accordingly, this new pass also changes the logic for selecting which synthesis to use and when to make a substitution. Previously this logic was primarily handled by the ConsolidateBlocks pass, which only consolidated to a UnitaryGate if the number of basis gates needed (based on the Weyl chamber coordinates) was less than the number of 2q gates in the block (see Qiskit#11659 for discussion). Since this new pass skips the explicit consolidation stage, it instead tries all the available synthesizers. Right now this commit has a number of limitations, the largest being:
- It only supports the target.
- It doesn't support any synthesizers besides the TwoQubitBasisDecomposer, because it's the only one currently in Rust.
For plugin handling I left the logic as running the three-pass series, but I'm not sure this is the behavior we want. We could keep the synthesis plugins for `UnitarySynthesis` only and then rely on our built-in methods for physical optimization.
But this also seems less than ideal, because the plugin mechanism is how we support synthesizing to custom basis gates as well as more advanced approximate synthesis methods, both of which are things we need to do as part of the synthesis here. Additionally, this is currently missing tests and documentation, and while running it manually "works" in that it returns a circuit that looks valid, I've not done any validation yet. This will also likely need several rounds of performance optimization and tuning. At this point this is just a rough proof of concept and will need a lot of refinement, along with larger changes to Qiskit's Rust code, before it is ready to merge. Fixes Qiskit#12007 Fixes Qiskit#11659
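The flow the commit describes (collect 2q runs, fold each run into a single matrix, synthesize the matrices in parallel, and substitute only where a result was found) can be sketched roughly as follows. This is a simplified, stdlib-only illustration of the shape of the algorithm, not the actual Rust implementation; `toy_synthesize` is a stand-in for a real two-qubit decomposer.

```python
from concurrent.futures import ThreadPoolExecutor

def matmul4(a, b):
    # 4x4 matrix product over row-major nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
CX = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

def block_matrix(run):
    # Fold a run of 2q gate matrices into one 4x4 block matrix.
    acc = IDENTITY
    for gate in run:
        acc = matmul4(gate, acc)
    return acc

def toy_synthesize(matrix):
    # Toy decomposer: if the block multiplies to identity, the best
    # "resynthesis" is no gates at all.  A real decomposer would emit a
    # KAK-style decomposition here instead.
    if all(abs(matrix[i][j] - IDENTITY[i][j]) < 1e-9
           for i in range(4) for j in range(4)):
        return []
    return None  # no improvement found; keep the original run

def peephole(runs):
    # Matrix calculation and synthesis have no data dependency between
    # runs, so they can run in parallel; substitution stays serial.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda r: toy_synthesize(block_matrix(r)), runs))
    return [new if new is not None else old for old, new in zip(runs, results)]

runs = [[CX, CX], [CX]]       # two back-to-back CX gates cancel to identity
out = peephole(runs)
print([len(r) for r in out])  # → [0, 1]
```

The real pass additionally scores each synthesis result against the original block before substituting, rather than substituting unconditionally.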
Since Qiskit#13139 merged, we have another two qubit decomposer available to run in Rust, the TwoQubitControlledUDecomposer. This commit updates the new TwoQubitPeepholeOptimization to call this decomposer if the target supports appropriate 2q gates.
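A per-edge decomposer selection like the one described might look like the sketch below. The gate-name sets and the preference ordering are assumptions for illustration (the commits elsewhere note that the basis decomposer requires supercontrolled gates, while the controlled-U decomposer works with RZZ, RXX, RYY, or RZX); the real selection lives in the Rust pass.

```python
# Hypothetical gate sets; which gates each decomposer accepts is an
# assumption modeled on the commit messages, not the actual Rust logic.
CONTROLLED_U_GATES = {"rzz", "rxx", "ryy", "rzx"}
SUPERCONTROLLED_GATES = {"cx", "cz", "ecr", "iswap"}

def pick_decomposer(edge_gates):
    # Prefer the basis decomposer when a supercontrolled gate is present;
    # fall back to the controlled-U decomposer for parameterized 2q gates.
    names = set(edge_gates)
    if names & SUPERCONTROLLED_GATES:
        return "TwoQubitBasisDecomposer"
    if names & CONTROLLED_U_GATES:
        return "TwoQubitControlledUDecomposer"
    return None  # no supported 2q basis on this edge

print(pick_decomposer(["cx", "rz", "sx"]))   # → TwoQubitBasisDecomposer
print(pick_decomposer(["rzz", "rz", "sx"]))  # → TwoQubitControlledUDecomposer
```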
Clippy is correctly warning that the size difference between the two decomposer types in the TwoQubitDecomposer enum is large. TwoQubitBasisDecomposer is 1640 bytes while TwoQubitControlledUDecomposer is only 24 bytes, which means each ControlledU element wastes more than 1600 bytes. However, in this case that is acceptable in order to avoid a layer of pointer indirection, as these are stored temporarily in a Vec inside a thread to decompose a unitary. A trait would be a more natural way to define a common interface between all the two qubit decomposers, but since we keep them instantiated for each edge in a Vec they need to be sized, and working around that with something like `Box<dyn TwoQubitDecomposer>` (assuming a trait `TwoQubitDecomposer` instead of an enum) would add runtime overhead. This also weighs that TwoQubitControlledUDecomposer is far less likely in practice, as it only works with targets that have RZZ, RXX, RYY, or RZX gates on an edge, which is less common.
Also don't run scoring more than needed.
The priority for the two qubit peephole pass should be decreasing the 2q gate count; the error-rate heuristic should only matter if the 2q counts are the same. This commit flips the heuristic to check the 2q gate count first, so that reducing the 2q gate count is the first priority.
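The flipped heuristic is naturally expressed as a lexicographic tuple comparison: the 2q count dominates, and estimated error only breaks ties. A minimal sketch (field names are illustrative):

```python
def score(candidate):
    # Lexicographic priority: fewer 2q gates wins outright; estimated
    # error only matters when the 2q counts are equal.
    return (candidate["num_2q"], candidate["error"])

candidates = [
    {"name": "a", "num_2q": 3, "error": 0.001},  # lower error, more 2q gates
    {"name": "b", "num_2q": 2, "error": 0.010},  # fewer 2q gates: wins
]
best = min(candidates, key=score)
print(best["name"])  # → b
```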
This commit removes the unitary synthesis plugin mechanism from the pass. Supporting it was a layering violation, since the pass logic doesn't actually support the plugin interface. If plugin usage is desired, it is easier and clearer to handle that during pass manager construction than to have this pass internally build a pass manager and execute other passes to emulate behavior it doesn't have.
There were two issues identified by the testing which required fixing, as well as adjusting the tests for limitations of the pass. The first issue was that the parameters for the target gate were not handled correctly: when using the controlled-U decomposer we were not passing the computed parameter value correctly to the output circuit, and instead the ParameterExpression from the target was being used. Second, controlled (but not supercontrolled) gates with a fixed angle, which are normally intended for the XX decomposer, were incorrectly being passed to the TwoQubitBasisDecomposer, which can't work with them. This was producing invalid circuit outputs. The TwoQubitBasisDecomposer is now correctly filtered to run only with supercontrolled gates. The tests were adjusted for this limitation because they were mostly copied from the UnitarySynthesis tests, which support the XX decomposer.
…re locking is needed
We don't want to spend time reconstructing an exact copy of the DAG if no substitutions are needed. Prior to using a Vec for tracking the run indices that nodes are part of, we would check whether that map was empty. The Vec is always populated, and determining whether any entries are set would require a worst-case O(n) scan. To avoid that overhead while keeping the check, this adds an atomic bool that tracks whether we've substituted any blocks. If it is never set to true, we can exit early since there are no substitutions to make.
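The shape of that early-exit check, sketched with Python's `threading.Event` standing in for the Rust `AtomicBool` (the block contents and the "improved" condition are placeholders):

```python
from concurrent.futures import ThreadPoolExecutor
import threading

# Workers flip a shared flag whenever they find a substitution; if the
# flag is still unset afterwards, the (serial, potentially expensive)
# DAG rebuild is skipped entirely.
made_substitution = threading.Event()

def try_substitute(block):
    improved = len(block) > 1  # placeholder for "synthesis beat the original"
    if improved:
        made_substitution.set()  # safe to set from any worker thread
    return improved

blocks = [[1], [2], [3]]  # single-gate blocks: nothing to improve
with ThreadPoolExecutor() as pool:
    list(pool.map(try_substitute, blocks))

if not made_substitution.is_set():
    print("no substitutions; returning the original DAG unchanged")
```

Checking one flag is O(1) regardless of how many runs were analyzed, which is the point of the change.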
Previously there was a mismatch between the scoring of synthesis results and the peephole pass's comparison with the original block. The pass is documented as using the tuple (num_2q_gates, error, num_gates) and picking the minimum over all the choices. But when we called the unitary synthesis function that selects the best synthesis outcome, it maximized the estimated fidelity without considering the gate counts as the pass is documented to do. This commit corrects the mismatch by making the synthesis function generic over the score type and taking a scorer callback, which lets the peephole pass control the heuristic used for selecting the best result.
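The callback-based design can be sketched as follows; names and candidate fields are illustrative, not the actual Rust signatures. The two scorers show how the old fidelity-only selection and the documented tuple heuristic can disagree on the same candidates:

```python
def best_synthesis(candidates, scorer):
    # Generic over the score type: the caller supplies the heuristic
    # and selection is simply the minimum under it.
    return min(candidates, key=scorer)

candidates = [
    {"num_2q": 2, "num_gates": 9, "fidelity": 0.990},
    {"num_2q": 3, "num_gates": 7, "fidelity": 0.992},
]

# Documented peephole heuristic: (num_2q_gates, error, num_gates).
peephole_scorer = lambda c: (c["num_2q"], 1.0 - c["fidelity"], c["num_gates"])
# Old behavior: maximize estimated fidelity only (minimize error).
fidelity_scorer = lambda c: 1.0 - c["fidelity"]

print(best_synthesis(candidates, peephole_scorer)["num_2q"])  # → 2
print(best_synthesis(candidates, fidelity_scorer)["num_2q"])  # → 3
```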
In testing the pass in the full pass manager, there is an underlying issue with Python-defined gates in the circuit and GIL handling. In the presence of those gates, the thread doing the synthesis and analysis needs the GIL to get the matrix of the Python gate. But in the previous version of this pass, the parent thread retained the GIL while the parallel workers ran. This caused a deadlock because the worker threads could never acquire the GIL when they tried to do so during synthesis. This commit attempts to fix this by splitting the parallel portion of the function from the serial portion. The serial portion rebuilds the DAG from the analysis results and needs the GIL to copy any Python operations in the circuit as it rebuilds, so we re-attach to the GIL prior to running the serial portion. The unitary synthesis decomposer handling needed to be updated as well, because there was implicit usage of the GIL, via the py-clone feature, around handling custom RXX-equivalent gates for the controlled-U decomposer. This was not correct in a threaded context where the GIL might be released, and it would cause a panic. This is updated to explicitly attach to the Python interpreter and handle the Python copy with the py token explicitly. There were places in the decomposer handling code already doing this, but because of the py-clone feature the clone() calls were missed.
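The deadlock and its fix are a generic hold-a-lock-across-a-parallel-section problem, which can be illustrated with an ordinary `threading.Lock` standing in for the GIL (this is an analogy, not the pyo3 code):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# If the parent held `lock` across pool.map, workers that need it would
# block forever -- the deadlock described above.  The fix has the same
# shape as in the pass: release before the parallel section, re-acquire
# only for the serial rebuild.
lock = threading.Lock()

def worker(i):
    with lock:  # workers briefly need the "interpreter" lock
        return i * i

with lock:
    pass        # serial setup while holding the lock
# lock is released here, BEFORE the parallel section starts
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(worker, range(4)))
with lock:      # re-acquire for the serial rebuild portion
    print(results)  # → [0, 1, 4, 9]
```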
This follows on from Qiskit#13419, which added a new optimization pass, TwoQubitPeepholeOptimization, designed to replace the pair of ConsolidateBlocks and UnitarySynthesis in the optimization stage once we have a physical circuit. That PR did not update the preset pass managers, to concentrate the review on just adding the new pass. This PR continues from there by updating the preset pass managers to use the new pass in optimization levels 2 and 3, replacing those levels' previous usage of ConsolidateBlocks and UnitarySynthesis in the optimization stage. This should result in both a runtime performance and a transpilation quality improvement, as the new pass is faster and should produce better-fidelity circuits than the previous peephole optimization. The test updates in this PR are needed because the peephole optimization changes the transpilation output of various test circuits. These were all verified to be valid outputs, and in all cases a "better" output than before. Specifically, for the tests updated, these were the changes in output and why they occurred:
* In the two tests in test.python.circuit.test_scheduled_circuit.TestScheduledCircuit, the single CX gate in the output circuit was flipped from (0, 1) to (1, 0) because in the target the error rate for the (0, 1) direction was higher than the extra error cost of 3 sx gates (the rz gates have 0 error).
* In test_unroll_only_if_not_gates_in_basis from test.python.transpiler.test_preset_passmanagers.TestPresetPassManager, we no longer run ConsolidateBlocks in the optimization loop, so we no longer need to add the 2 executions from the init and translation stages. The test is updated to count the new peephole pass, which is the intent of the count check: to check the pass in the optimization loop.
* test_2q_circuit_5q_backend_v2 from test.python.transpiler.test_vf2_post_layout.TestVF2PostLayoutUndirected had the same cx gate flipping, because the error rate in the original layout for the reverse direction was 0.000779905 vs 0.00163587 in the original direction. The new pass was correctly flipping the cx gate, resulting in a different circuit that vf2 couldn't place anywhere better. To fix this, the test sets a fixed layout on worse qubits so that vf2 will have to place it somewhere better.
* For test_layout_tokyo_fully_connected_cx_4_3 from test.python.transpiler.test_preset_passmanagers.TestFinalLayouts, the output circuit has a better estimated fidelity (although more gates overall). The transpiler output goes from an estimated fidelity of 0.9526614226294913 before the new pass to 0.961996188569715 after. This new circuit with a better fidelity now has a different initial layout set, so the test is updated to use the new layout.
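The trade-offs above (a gate flipped despite adding 3 sx gates; a higher-fidelity circuit with more gates) follow from estimating circuit fidelity as the product of per-gate success probabilities. A sketch under that assumption, with made-up error rates for illustration:

```python
from math import prod

def estimated_fidelity(gate_errors):
    # Fidelity estimate: product of (1 - error) over every gate.
    return prod(1.0 - e for e in gate_errors)

# A circuit with one high-error gate can score worse than a circuit with
# more gates that each have lower error (numbers here are illustrative).
few_bad = [0.02]            # one high-error two-qubit gate
more_good = [0.005] * 3     # three lower-error gates
print(estimated_fidelity(few_bad) < estimated_fidelity(more_good))  # → True
```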
Hmm, I ran the asv transpile runtime benchmarks on this PR and it is showing a regression on the hwb benchmarks, which is unexpected. I'll have to profile why the pass is slower; it does potentially do more work to get the better quality results (it always synthesizes the 2q block's unitary), but in my earlier testing this was always offset by the use of parallelism, which resulted in a ~2x speedup for transpilation. It's also showing a speedup in places I wouldn't have expected, because they're smaller circuits and I'd expect the extra overhead of the parallelism to limit the speedup.
Interesting, looking at that example in particular (with a different seed and backends; I modified the pgo script instead of using the asv benchmark directly) the pass is significantly faster. It takes about 600ms to run while the pair of
This PR is based on top of #13419 and will need to be rebased after that merges. In the meantime you can view the contents of just this PR by looking at the HEAD commit:
471d199