qNoise

Quantum hardware noise characterization & compiler benchmarking

ibm_fez (156-qubit Heron processor) • Qiskit vs Superstaq

Hardware Runs: 21
T1 Measurements: 45
T2 Measurements: 45
Qubits Tracked: 3
Benchmark Results: 24
2026-03-25 — 2026-03-30  •  Last updated: March 30, 2026 at 00:03
Every data point on this page was measured on IBM's 156-qubit Heron processor in Washington DC. The project measures three fundamental noise properties of individual qubits: T1 (how long a qubit holds its excited state before decaying), T2 (how long a superposition survives once an echo pulse has cancelled slowly varying noise), and readout assignment error (how often the measurement electronics misidentify the qubit's state). T1 and T2 values are extracted by fitting exponential decay curves to the measured state probabilities across delay times using scipy's curve_fit.
Quantum circuits are written in terms of abstract gates (H, CNOT, RZ) that must be compiled down to the gate set native to the processor (CZ, SX, RZ on ibm_fez) and routed across the chip's qubit connectivity graph. This project compiles identical circuits through Qiskit's transpiler (optimization level 3) and Infleqtion's Superstaq compiler, then runs both versions on the same hardware in the same job to compare fidelity.
Built in Python with Qiskit 2.3 for circuit construction and transpilation, qiskit-ibm-runtime (SamplerV2 primitives) for hardware access, qiskit-superstaq for Infleqtion's compiler, SQLAlchemy 2.0 for structured data storage, scipy for curve fitting, and Plotly for visualization. All hardware calls are managed through a job scheduler that submits circuits, tracks jobs, and displays results when they finish.

Qubit Coherence Over Time

T1 (energy relaxation) and T2 (dephasing) measured across multiple runs. Each point is one hardware measurement. Drift between runs reveals noise instability.

How this is measured

T1 experiment: Prepare the qubit in |1⟩ with an X gate, wait a variable delay, then measure. The probability of remaining in |1⟩ decays exponentially: P(1) = A·exp(-t/T1) + C. The curve is fit with scipy.optimize.curve_fit across 10–20 delay values.

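from qiskit import QuantumCircuit

delay_sec = 50e-6                  # illustrative value; the experiment sweeps 10–20 delays
qc = QuantumCircuit(1, 1)
qc.x(0)                            # prepare |1⟩
qc.delay(delay_sec, 0, unit="s")   # wait for the chosen delay
qc.measure(0, 0)                   # P(1) drops as the qubit relaxes to |0⟩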
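
The fitting step described above can be sketched as follows. This is a minimal illustration rather than the project's actual analysis code: the data are synthetic, and the names `t1_decay`, `true_t1_us`, the noise level, and the initial guess are all assumptions made for the example. Delays are expressed in microseconds so that the fit parameters are of similar magnitude.

```python
import numpy as np
from scipy.optimize import curve_fit

def t1_decay(t, amplitude, t1, offset):
    """Model from the text: P(1) = A * exp(-t / T1) + C."""
    return amplitude * np.exp(-t / t1) + offset

# Synthetic stand-in for hardware data (assumed values, not measurements).
rng = np.random.default_rng(seed=0)
true_t1_us = 150.0
delays_us = np.linspace(0.0, 600.0, 15)            # 15 delay values
p1_measured = t1_decay(delays_us, 0.95, true_t1_us, 0.03)
p1_measured = p1_measured + rng.normal(0.0, 0.01, delays_us.size)

# Rough starting point: full contrast, T1 about a third of the sweep, no offset.
initial_guess = (1.0, delays_us.max() / 3.0, 0.0)
params, covariance = curve_fit(t1_decay, delays_us, p1_measured, p0=initial_guess)

amplitude_fit, t1_fit_us, offset_fit = params
t1_stderr_us = float(np.sqrt(covariance[1, 1]))    # 1-sigma uncertainty on T1
print(f"T1 = {t1_fit_us:.1f} ± {t1_stderr_us:.1f} us")
```

The diagonal of the covariance matrix gives a per-parameter uncertainty, which is useful for deciding whether run-to-run drift exceeds the fit error.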

T2 experiment (Hahn echo): Create a superposition with H, wait half the delay, apply an X refocusing pulse to cancel static noise, wait the other half, then H and measure. The echo isolates true T2 from T2* (which includes inhomogeneous broadening).

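half_delay = delay_sec / 2          # each wait is half of the total delay
qc = QuantumCircuit(1, 1)
qc.h(0)                             # create the superposition
qc.delay(half_delay, 0, unit="s")
qc.x(0)                             # refocusing pulse
qc.delay(half_delay, 0, unit="s")
qc.h(0)                             # interfere back to the measurement basis
qc.measure(0, 0)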

Each qubit's circuits are transpiled with initial_layout=[qubit] so that the intended physical qubit is measured rather than whichever one the transpiler would pick.

T1 vs T2 Relationship

Physics constrains T2 ≤ 2×T1. Points near the dashed line indicate energy relaxation dominates.

Qubit 1 (green) sits above the T2 = T1 line with a mean T2/T1 ratio of 1.07, meaning its average T2 slightly exceeds T1. This is physically valid (the hard bound is T2 ≤ 2×T1), but unusual. It indicates that pure dephasing is comparatively weak for this qubit, though not negligible: using 1/T2 = 1/(2×T1) + 1/Tφ, a ratio of 1.07 corresponds to Tφ ≈ 2.3×T1, so energy relaxation accounts for a little over half of the total decoherence rate. By contrast, Qubit 2 (orange) has T2/T1 = 0.71 (Tφ ≈ 1.1×T1), meaning pure dephasing adds significant phase noise beyond energy relaxation.

Why T2 ≤ 2×T1

Decoherence has two components: energy relaxation (T1 — the qubit loses energy to its environment) and pure dephasing (Tφ — the phase relationship randomizes without energy exchange). The total coherence time satisfies:

1/T2 = 1/(2*T1) + 1/T_phi

Since the pure dephasing rate 1/Tφ cannot be negative, 1/T2 ≥ 1/(2×T1), so T2 can never exceed 2×T1. When T2 ≈ 2×T1, pure dephasing is negligible and energy relaxation is the dominant noise channel. When T2 ≪ 2×T1, the qubit suffers additional phase noise (e.g., flux fluctuations, TLS coupling).
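
As a worked example of the relation above, the sketch below solves it for Tφ given T1 and T2. The function name `pure_dephasing_time` is hypothetical, and T1 is set to 1 so the results read as multiples of T1; the two ratios are the mean T2/T1 values quoted earlier on this page.

```python
import math

def pure_dephasing_time(t1, t2):
    """Solve 1/T2 = 1/(2*T1) + 1/T_phi for T_phi (same units as the inputs)."""
    dephasing_rate = 1.0 / t2 - 1.0 / (2.0 * t1)
    if dephasing_rate < 0:
        # Noisy fits can land here; the physical bound is T2 <= 2*T1.
        raise ValueError("T2 > 2*T1 violates the relaxation bound")
    if dephasing_rate == 0:
        return math.inf                 # T2 = 2*T1: no pure dephasing at all
    return 1.0 / dephasing_rate

t1 = 1.0                                          # work in units of T1
tphi_q1 = pure_dephasing_time(t1, 1.07 * t1)      # Qubit 1, T2/T1 = 1.07
tphi_q2 = pure_dephasing_time(t1, 0.71 * t1)      # Qubit 2, T2/T1 = 0.71
tphi_limit = pure_dephasing_time(t1, 2.0 * t1)    # relaxation-limited case

print(f"Qubit 1: T_phi = {tphi_q1:.2f} x T1")
print(f"Qubit 2: T_phi = {tphi_q2:.2f} x T1")
```

A larger Tφ means weaker pure dephasing, so the comparison between the two qubits can be read directly off these numbers.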

Readout Error

Assignment error per qubit. False negatives (1→0) dominate due to T1 decay during measurement.

How readout error is measured

For each qubit, two circuits are run:

# Prepare |0⟩ and measure
qc0 = QuantumCircuit(1, 1)
qc0.measure(0, 0)

# Prepare |1⟩ and measure
qc1 = QuantumCircuit(1, 1)
qc1.x(0)
qc1.measure(0, 0)

P(1|prep 0) = fraction of |0⟩ preparations that are misread as |1⟩. This is typically very low.

P(0|prep 1) = fraction of |1⟩ preparations that are misread as |0⟩. This is higher because the qubit can undergo T1 decay during the measurement process itself, falling from |1⟩ to |0⟩ before the readout completes.
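
Turning the two sets of counts into error rates might look like the sketch below. The counts are invented for illustration, the function name `readout_errors` is hypothetical, and averaging the two conditional errors into a single assignment error is a common convention that is assumed here rather than taken from the project code.

```python
def readout_errors(counts_prep0, counts_prep1):
    """Return (P(1|prep 0), P(0|prep 1), mean assignment error) from count dicts."""
    shots0 = sum(counts_prep0.values())
    shots1 = sum(counts_prep1.values())
    p1_given_0 = counts_prep0.get("1", 0) / shots0   # false positive rate
    p0_given_1 = counts_prep1.get("0", 0) / shots1   # false negative rate
    assignment_error = 0.5 * (p1_given_0 + p0_given_1)
    return p1_given_0, p0_given_1, assignment_error

# Made-up counts for 4096 shots per circuit (not hardware data).
example_prep0 = {"0": 4060, "1": 36}
example_prep1 = {"0": 102, "1": 3994}

p1_given_0, p0_given_1, assignment_error = readout_errors(example_prep0, example_prep1)
print(f"P(1|prep 0) = {p1_given_0:.4f}")
print(f"P(0|prep 1) = {p0_given_1:.4f}")
print(f"assignment error = {assignment_error:.4f}")
```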

Compilation Benchmark — Qiskit vs Superstaq

Identical circuits compiled through two optimizers, then executed on the same hardware.

How the benchmark works

Three benchmark circuits (Bell state, GHZ-4q, QFT-4q) are compiled through two paths:

Qiskit path: transpile(circuit, backend=ibm_fez, optimization_level=3) — Qiskit's most aggressive optimization. Decomposes to the native gate set (CZ, SX, RZ), optimizes single-qubit chains, and routes qubits across the hardware topology.

Superstaq path: provider.ibmq_compile(circuit, target="ibmq_fez_qpu") — Infleqtion's cloud compiler. Uses proprietary optimizations including hardware-aware noise models and custom decomposition strategies.

Both compiled circuits are then submitted to ibm_fez in the same job, ensuring identical hardware conditions for a fair comparison.

Hardware Fidelity — The Real Test

Hellinger fidelity between measured output and ideal (noiseless) distribution. Higher is better.

How fidelity is calculated

I use Hellinger fidelity to compare the measured output distribution against the ideal (noiseless) distribution:

F_H = (sum_x  sqrt(p_measured(x) * p_ideal(x)))^2

F_H = 1.0 means perfect agreement with the ideal circuit. For the Bell state, the ideal distribution is 50% |00⟩ + 50% |11⟩. For GHZ-4q, it's 50% |0000⟩ + 50% |1111⟩. For QFT on |0000⟩, the ideal output is a uniform distribution over all 16 bitstrings.
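
The formula above can be written out in a few lines of plain Python. This is a sketch with a hypothetical function name and made-up counts; the three ideal distributions are the ones described in the text. Qiskit also ships a `hellinger_fidelity` helper in `qiskit.quantum_info`, which may be what a real pipeline would use.

```python
import math

def hellinger_fidelity_from_counts(counts, ideal_probs):
    """F_H = (sum_x sqrt(p_measured(x) * p_ideal(x)))**2."""
    shots = sum(counts.values())
    overlap = 0.0
    # Bitstrings with zero ideal probability contribute nothing to the sum.
    for bitstring, p_ideal in ideal_probs.items():
        p_measured = counts.get(bitstring, 0) / shots
        overlap += math.sqrt(p_measured * p_ideal)
    return overlap ** 2

# Ideal distributions for the three benchmark circuits.
bell_ideal = {"00": 0.5, "11": 0.5}
ghz4_ideal = {"0000": 0.5, "1111": 0.5}
qft4_ideal = {format(i, "04b"): 1 / 16 for i in range(16)}

# Made-up noisy Bell-state counts (1000 shots), not hardware data.
noisy_bell_counts = {"00": 480, "11": 470, "01": 30, "10": 20}
f_bell = hellinger_fidelity_from_counts(noisy_bell_counts, bell_ideal)
print(f"Bell fidelity = {f_bell:.4f}")
```

In this example the 5% of shots that land on 01 or 10 pull the fidelity down to roughly 0.95, which shows how leakage into bitstrings outside the ideal support is penalized.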

This metric captures the combined effect of every noise source that alters the measured bitstring distribution: gate errors, decoherence during the circuit, readout errors, and crosstalk. That makes it a direct end-to-end measure of real-world compiler performance.

3-run average findings: Across 3 hardware runs, QFT-4q shows the most consistent directional pattern: Superstaq averaged 0.9934 vs Qiskit's 0.9886, a difference of about 0.5 percentage points, with lower run-to-run variance (±0.003 vs ±0.007). The gap is small and three runs are not enough to call it statistically significant, but the direction is consistent across every run. Notably, Superstaq achieves this with a deeper circuit, suggesting the advantage comes from fewer 2-qubit gates (16 vs 18 CZ) rather than shallower routing. Bell and GHZ results are within each other's standard deviation and show no reliable winner at this sample size.
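
To put a rough number on "not statistically significant", the sketch below computes Welch's t statistic and its effective degrees of freedom from the QFT-4q summary values quoted above. It assumes the ± figures are sample standard deviations over the 3 runs, which the page does not state explicitly; the function name is hypothetical.

```python
import math

def welch_t_from_summary(mean_a, std_a, n_a, mean_b, std_b, n_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    var_a = std_a ** 2 / n_a            # squared standard error of mean A
    var_b = std_b ** 2 / n_b            # squared standard error of mean B
    t_stat = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    dof = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t_stat, dof

# QFT-4q: Superstaq 0.9934 ± 0.003 vs Qiskit 0.9886 ± 0.007, 3 runs each.
t_stat, dof = welch_t_from_summary(0.9934, 0.003, 3, 0.9886, 0.007, 3)
print(f"t = {t_stat:.2f}, degrees of freedom = {dof:.1f}")
```

Under that assumption the statistic comes out near 1.1 with fewer than 3 effective degrees of freedom, well below the value of roughly 3 to 4 that a two-sided 5% test requires at so few degrees of freedom. More runs would be needed before the gap could be separated from noise.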