diff --git a/docs/tutorials/_toc.json b/docs/tutorials/_toc.json
index f48a28605b0..e534f53426f 100644
--- a/docs/tutorials/_toc.json
+++ b/docs/tutorials/_toc.json
@@ -40,6 +40,10 @@
               "title": "Advanced techniques for QAOA",
               "url": "/docs/tutorials/advanced-techniques-for-qaoa"
             },
+            {
+              "title": "Warm-starting QAOA with the Optimization Mapper Qiskit addon",
+              "url": "/docs/tutorials/warm-start-qaoa"
+            },
             {
               "title": "Pauli correlation encoding to reduce Max-Cut requirements",
               "url": "/docs/tutorials/pauli-correlation-encoding-for-qaoa"
diff --git a/docs/tutorials/index.mdx b/docs/tutorials/index.mdx
index 0fec3485b44..8307ff8349e 100644
--- a/docs/tutorials/index.mdx
+++ b/docs/tutorials/index.mdx
@@ -37,6 +37,8 @@ The tutorials highlight techniques where repeated sampling enables estimation of
 
 * [Advanced techniques for QAOA](/docs/tutorials/advanced-techniques-for-qaoa)
 
+* [Warm-starting QAOA with the Optimization Mapper Qiskit addon](/docs/tutorials/warm-start-qaoa)
+
 * [Pauli correlation encoding to reduce max-cut requirements](/docs/tutorials/pauli-correlation-encoding-for-qaoa)
 
 </AccordionItem>
diff --git a/docs/tutorials/warm-start-qaoa.ipynb b/docs/tutorials/warm-start-qaoa.ipynb
new file mode 100644
index 00000000000..653aa19b53f
--- /dev/null
+++ b/docs/tutorials/warm-start-qaoa.ipynb
@@ -0,0 +1,1409 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "d2c31ae8",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "title: Warm-starting QAOA with the Optimization Mapper Qiskit addon\n",
+    "description: Improve QAOA convergence on max-cut problems by initializing with the solution of a continuous relaxation, using the qiskit-addon-opt-mapper package.\n",
+    "---\n",
+    "\n",
+    "{/* cspell:ignore prereqs Mareček rhobeg skyblue steelblue edgecolor fontsize Farhi */}\n",
+    "\n",
+    "# Warm-starting QAOA with the Optimization Mapper Qiskit addon\n",
+    "*Usage estimate: 9 minutes on a Heron r3 (NOTE: This is an estimate only. Your runtime may vary.)*"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8bf80006",
+   "metadata": {},
+   "source": [
+    "## Learning outcomes\n",
+    "After completing this tutorial, you can expect to understand the following information:\n",
+    "- How to map a max-cut problem to a quantum Quadratic Unconstrained Binary Optimization (QUBO) formulation using `qiskit-addon-opt-mapper`\n",
+    "- How to implement and run standard QAOA on a simulator\n",
+    "- How to apply WS-QAOA by computing the quadratic program (QP) relaxation and building the warm-start circuit\n",
+    "- How to compare energy convergence and solution quality between standard QAOA and WS-QAOA\n",
+    "\n",
+    "## Prerequisites\n",
+    "It is recommended that you familiarize yourself with these topics:\n",
+    "- [QAOA tutorial](/docs/tutorials/quantum-approximate-optimization-algorithm)\n",
+    "- [Advanced QAOA tutorial](/docs/tutorials/advanced-techniques-for-qaoa)\n",
+    "\n",
+    "## Background\n",
+    "\n",
+    "The Quantum Approximate Optimization Algorithm (QAOA) is a hybrid quantum-classical algorithm designed to solve combinatorial optimization problems such as max-cut and general QUBO formulations. For a foundational introduction to QAOA in Qiskit, see the [QAOA tutorial](/docs/tutorials/quantum-approximate-optimization-algorithm); for more advanced circuit-building techniques, see the [advanced QAOA tutorial](/docs/tutorials/advanced-techniques-for-qaoa).\n",
+    "\n",
+    "In standard QAOA:\n",
+    "- The initial state is the uniform superposition $|+\\rangle^{\\otimes n}$.\n",
+    "- Variational parameters are randomly initialized.\n",
+    "- A classical optimizer searches for parameters that minimize the cost function.\n",
+    "\n",
+    "However, for practical problem sizes and noisy quantum hardware, random initialization can lead to slow convergence, poor local minima, and increased optimization cost.\n",
+    "\n",
+    "**Warm-start QAOA** (WS-QAOA) improves this by incorporating classical optimization insights directly into the quantum circuit. This tutorial follows the methods introduced by Egger, Mareček, and Woerner in [*Warm-starting quantum optimization*](https://arxiv.org/abs/2009.10095). The key idea is to:\n",
+    "\n",
+    "1. **Solve a continuous relaxation** of the original binary problem (a quadratic program over $[0,1]^n$ instead of $\\{0,1\\}^n$).\n",
+    "2. **Encode the relaxed solution** $c^*_i \\in [0,1]$ into a custom initial state by using $Y$-rotation angles $\\theta_i = 2\\arcsin(\\sqrt{c^*_i})$, so that qubit $i$ starts in a state whose probability of measuring $|1\\rangle$ is $c^*_i$.\n",
+    "3. **Replace the standard $X$-mixer** with a custom mixer whose ground state is the warm-start initial state, ensuring the algorithm starts near the classical solution and can explore the neighborhood.\n",
+    "\n",
+    "A regularization parameter $\\varepsilon \\in [0, 0.5]$ clips $c^*_i$ away from 0 and 1 to avoid reachability issues; qubits initialized in $|0\\rangle$ or $|1\\rangle$ cannot be moved by the cost Hamiltonian. At $\\varepsilon = 0.5$, WS-QAOA reduces exactly to standard QAOA.\n",
+    "\n",
+    "Problem modeling uses the [`qiskit-addon-opt-mapper`](https://qiskit.github.io/qiskit-addon-opt-mapper/) package, which provides the `OptimizationProblem` class and converters for mapping combinatorial problems to quantum Hamiltonians."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "55b94021",
+   "metadata": {},
+   "source": [
+    "## Requirements\n",
+    "Before starting this tutorial, be sure you have the following installed:\n",
+    "\n",
+    "* Qiskit SDK v2.0 or later, with [visualization](/docs/api/qiskit/visualization) support\n",
+    "* Qiskit Runtime v0.43 or later (`pip install qiskit-ibm-runtime`)\n",
+    "* Optimization Mapper Qiskit addon (`pip install qiskit-addon-opt-mapper`)\n",
+    "* SciPy (`pip install scipy`)\n",
+    "* NetworkX (`pip install networkx`)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7db2e559",
+   "metadata": {},
+   "source": [
+    "## Setup\n",
+    "\n",
+    "Import all required libraries and define helper functions used throughout this tutorial."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "bc380c46",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "import matplotlib.pyplot as plt\n",
+    "import networkx as nx\n",
+    "from scipy.optimize import minimize\n",
+    "\n",
+    "from qiskit.circuit import QuantumCircuit, ParameterVector\n",
+    "from qiskit.quantum_info import Statevector\n",
+    "from qiskit.primitives import StatevectorEstimator, StatevectorSampler\n",
+    "from qiskit.transpiler.preset_passmanagers import generate_preset_pass_manager\n",
+    "from qiskit_ibm_runtime import (\n",
+    "    QiskitRuntimeService,\n",
+    "    Session,\n",
+    "    EstimatorOptions,\n",
+    "    EstimatorV2 as Estimator,\n",
+    "    SamplerV2 as Sampler,\n",
+    ")\n",
+    "\n",
+    "from qiskit_addon_opt_mapper import OptimizationProblem\n",
+    "from qiskit_addon_opt_mapper.converters import OptimizationProblemToQubo\n",
+    "from qiskit_addon_opt_mapper.translators import to_ising"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "431a5bd2-e6ed-471b-ad9e-c4edd27784a8",
+   "metadata": {},
+   "source": [
+    "# Small-scale simulator example\n",
+    "\n",
+    "We use a small **max-cut** problem on a weighted graph as our running example. Max-cut asks: given a graph $G=(V,E)$ with edge weights $w_{ij}$, find a partition of the vertices into two sets $S$ and $\\bar{S}$ that maximizes the total weight of edges crossing the cut.\n",
+    "\n",
+    "As a QUBO minimization problem, max-cut can be written as:\n",
+    "$$\\min_{x \\in \\{0,1\\}^n} -\\sum_{(i,j) \\in E} w_{ij}(x_i + x_j - 2x_i x_j)$$\n",
+    "\n",
+    "We work with a four-node graph for tractability on a simulator."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "988ee237",
+   "metadata": {},
+   "source": [
+    "### Step 1: Map classical inputs to a quantum problem\n",
+    "\n",
+    "We define the max-cut problem using the `OptimizationProblem` class from `qiskit-addon-opt-mapper`, convert it to a QUBO, and translate it to an Ising Hamiltonian (`SparsePauliOp`) suitable for QAOA. We also solve the continuous relaxation of the QUBO &mdash; replacing the binary constraint $x_i \\in \\{0,1\\}$ with $x_i \\in [0,1]$ &mdash; to obtain the warm-start initial point $c^*$."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "step1-graph-code",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "<Image src=\"/docs/images/tutorials/warm-start-qaoa/extracted-outputs/step1-graph-code-0.avif\" alt=\"Output of the previous code cell\" />"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "# Define a 4-node weighted graph for the max-cut problem\n",
+    "n_nodes = 4\n",
+    "edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0), (1, 3, 1.0), (2, 3, 1.0)]\n",
+    "\n",
+    "G = nx.Graph()\n",
+    "G.add_nodes_from(range(n_nodes))\n",
+    "G.add_weighted_edges_from(edges)\n",
+    "\n",
+    "pos = nx.spring_layout(G, seed=42)\n",
+    "edge_labels = {(u, v): d[\"weight\"] for u, v, d in G.edges(data=True)}\n",
+    "\n",
+    "fig, ax = plt.subplots(figsize=(4, 3))\n",
+    "nx.draw(G, pos, with_labels=True, node_color=\"lightblue\", ax=ax)\n",
+    "nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels, ax=ax)\n",
+    "ax.set_title(\"Max-Cut graph\")\n",
+    "plt.tight_layout()\n",
+    "plt.show()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "step1-graph-md",
+   "metadata": {},
+   "source": [
+    "The graph has five edges. The optimal max-cut partitions the nodes into $S = \\{0, 3\\}$ and $\\bar{S} = \\{1, 2\\}$ (or its complement), severing four of the five edges for a cut value of 4."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "step1-qubo-code",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Problem name: max_cut\n",
+      "\n",
+      "Minimize\n",
+      "  2*x0*x1 + 2*x0*x2 + 2*x1*x2 + 2*x1*x3 + 2*x2*x3 - 2*x0 - 3*x1 - 3*x2 - 2*x3\n",
+      "\n",
+      "Subject to\n",
+      "  No constraints\n",
+      "\n",
+      "  Binary variables (4)\n",
+      "    x0 x1 x2 x3\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Build the max-cut QUBO using OptimizationProblem\n",
+    "#\n",
+    "# Max-Cut as a minimization QUBO:\n",
+    "#   minimize  -sum_{(i,j) in E} w_ij * (x_i + x_j - 2*x_i*x_j)\n",
+    "#\n",
+    "# Expanding: each edge contributes -w*(x_i + x_j) to the linear terms\n",
+    "# and +2w*x_i*x_j to the quadratic terms.\n",
+    "prob = OptimizationProblem(\"max_cut\")\n",
+    "x_vars = [prob.binary_var(name=f\"x{i}\") for i in range(n_nodes)]\n",
+    "\n",
+    "linear_terms = {}\n",
+    "quadratic_terms = {}\n",
+    "for u, v, w in edges:\n",
+    "    linear_terms[f\"x{u}\"] = linear_terms.get(f\"x{u}\", 0.0) - w\n",
+    "    linear_terms[f\"x{v}\"] = linear_terms.get(f\"x{v}\", 0.0) - w\n",
+    "    quadratic_terms[(f\"x{u}\", f\"x{v}\")] = (\n",
+    "        quadratic_terms.get((f\"x{u}\", f\"x{v}\"), 0.0) + 2 * w\n",
+    "    )\n",
+    "\n",
+    "prob.minimize(linear=linear_terms, quadratic=quadratic_terms)\n",
+    "print(prob.prettyprint())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "step1-qubo-md",
+   "metadata": {},
+   "source": [
+    "The `OptimizationProblem` class from `qiskit-addon-opt-mapper` provides a flexible model for binary, integer, continuous, and spin variables. Here all variables are binary and the problem is already in unconstrained form (QUBO), so no constraint-to-penalty conversion is needed. The printed objective shows each variable's linear coefficient (how much it individually contributes to the cut) and each cross-term's quadratic coefficient (the penalty for putting two adjacent nodes on the same side)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "step1-ising-code",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Cost Hamiltonian H_C (4 qubits):\n",
+      "SparsePauliOp(['IIZZ', 'IZIZ', 'IZZI', 'ZIZI', 'ZZII'],\n",
+      "              coeffs=[0.5+0.j, 0.5+0.j, 0.5+0.j, 0.5+0.j, 0.5+0.j])\n",
+      "\n",
+      "Offset (constant shift): -2.5\n",
+      "  QUBO value = Ising energy + offset\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Convert the OptimizationProblem to a QUBO, then translate to an Ising Hamiltonian\n",
+    "#\n",
+    "# The substitution x_i = (1 - z_i)/2  maps binary variables to spin operators,\n",
+    "# yielding a Hamiltonian H_C = sum_i h_i Z_i + sum_{i<j} J_ij Z_i Z_j + constant.\n",
+    "# QAOA minimizes <H_C> to find the ground state, which encodes the optimal cut.\n",
+    "converter = OptimizationProblemToQubo()\n",
+    "qubo = converter.convert(prob)\n",
+    "\n",
+    "cost_operator, offset = to_ising(qubo)\n",
+    "n_qubits = cost_operator.num_qubits\n",
+    "\n",
+    "print(f\"Cost Hamiltonian H_C ({n_qubits} qubits):\")\n",
+    "print(cost_operator)\n",
+    "print(f\"\\nOffset (constant shift): {offset}\")\n",
+    "print(\"  QUBO value = Ising energy + offset\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "step1-ising-md",
+   "metadata": {},
+   "source": [
+    "The `to_ising` translator returns a `SparsePauliOp` representing $H_C$ and a scalar `offset` such that $\\text{QUBO value} = \\langle H_C \\rangle + \\text{offset}$. For this max-cut problem with all unit weights, $h_i = 0$ for all qubits (the graph is symmetric in linear terms after the $x_i \\to z_i$ substitution), and each edge contributes a $Z_i Z_j$ coupling of strength $+0.5$. The minimum eigenvalue of $H_C$ corresponds to the maximum cut."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "step1-qp-code",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "QP relaxation solution c* = [1. 0. 0. 1.]\n",
+      "QP objective value        = -4.0000\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Solve the continuous (QP) relaxation to obtain the warm-start point c*\n",
+    "#\n",
+    "# The QP relaxation replaces the binary constraint x_i in {0,1} with x_i in [0,1]\n",
+    "# and minimizes the same quadratic objective. Its solution c*_i gives the\n",
+    "# probability that variable i should be 1 according to the classical relaxation.\n",
+    "#\n",
+    "# The max-cut QUBO has a non-convex quadratic matrix (negative eigenvalues),\n",
+    "# so the relaxed problem has multiple local minima. A naive single start from\n",
+    "# [0.5,...,0.5] converges to the symmetric saddle point c* = [0.5,...,0.5],\n",
+    "# which carries no useful structural information about the problem.\n",
+    "# Multi-start optimization is used to reliably find the global minimum.\n",
+    "Q = qubo.objective.quadratic.to_array(symmetric=True)\n",
+    "mu = qubo.objective.linear.to_array()\n",
+    "\n",
+    "\n",
+    "def qp_objective(x_cont):\n",
+    "    \"\"\"Continuous relaxation of the QUBO objective.\"\"\"\n",
+    "    return x_cont @ Q @ x_cont + mu @ x_cont + qubo.objective.constant\n",
+    "\n",
+    "\n",
+    "bounds = [(0.0, 1.0)] * n_qubits\n",
+    "\n",
+    "rng = np.random.default_rng(42)\n",
+    "best_val = np.inf\n",
+    "c_star = None\n",
+    "for _ in range(200):\n",
+    "    x0 = rng.uniform(0.0, 1.0, n_qubits)\n",
+    "    result = minimize(qp_objective, x0, method=\"L-BFGS-B\", bounds=bounds)\n",
+    "    if result.fun < best_val:\n",
+    "        best_val = result.fun\n",
+    "        c_star = result.x\n",
+    "\n",
+    "print(f\"QP relaxation solution c* = {np.round(c_star, 4)}\")\n",
+    "print(f\"QP objective value        = {best_val:.4f}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "step1-qp-md",
+   "metadata": {},
+   "source": [
+    "The multi-start solver finds $c^* = [1, 0, 0, 1]$ (or its complement $[0, 1, 1, 0]$), which is the actual optimal binary solution. For this problem the QP relaxation is tight, the continuous minimum coincides with the integer optimum, meaning the relaxation immediately identifies the best cut. After regularization with $\\varepsilon = 0.25$ in Step 2, this solution will be encoded into the warm-start initial state."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ac6f36e3",
+   "metadata": {},
+   "source": [
+    "### Step 2: Optimize problem for quantum hardware execution\n",
+    "\n",
+    "We build two QAOA circuits and prepare the warm-start angles from the QP solution.\n",
+    "\n",
+    "**Standard QAOA** uses the uniform superposition $|+\\rangle^{\\otimes n}$ as the initial state and the standard $X$-mixer $H_M = -\\sum_i X_i$, implemented as $\\prod_i R_X(-2\\beta)$ per layer.\n",
+    "\n",
+    "**Warm-start QAOA (WS-QAOA)** from [\\[1\\]](#Reference1) makes two structural changes per qubit $i$:\n",
+    "- **Initial state:** $R_Y(\\theta_i)|0\\rangle$ with $\\theta_i = 2\\arcsin(\\sqrt{c^*_i})$, so the probability of measuring $|1\\rangle$ equals $c^*_i$.\n",
+    "- **Custom mixer:** $R_Y(\\theta_i)\\, R_Z(-2\\beta)\\, R_Y(-\\theta_i)$, which has $R_Y(\\theta_i)|0\\rangle$ as its ground state. This means WS-QAOA starts in the ground state of its own mixer, the same property that standard QAOA satisfies with $|+\\rangle$ and the $X$-mixer.\n",
+    "\n",
+    "Note on layers: At `p=1`(a single QAOA layer), standard QAOA is analytically limited to ~49% of the optimal energy on graphs containing triangles (this graph has the triangle 0-1-2). The warm start bypasses this limitation by encoding prior knowledge of the solution directly into the initial state."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "step2-angles-code",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Continuous relaxation c*  = [1. 0. 0. 1.]\n",
+      "After regularization      = [0.75 0.25 0.25 0.75]\n",
+      "Warm-start angles theta   = [2.0944 1.0472 1.0472 2.0944] radians\n",
+      "\n",
+      "Angle interpretation:\n",
+      "  theta = 0      <->  c* = 0   (qubit points toward |0>)\n",
+      "  theta = pi/2   <->  c* = 0.5 (qubit in equal superposition, like |+>)\n",
+      "  theta = pi     <->  c* = 1   (qubit points toward |1>)\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Number of QAOA layers (each layer = one cost unitary + one mixer unitary)\n",
+    "p = 1\n",
+    "\n",
+    "# Regularization: clip c* to [epsilon, 1-epsilon] so no qubit is initialized\n",
+    "# in |0> or |1>, which would freeze it under the cost Hamiltonian.\n",
+    "epsilon = 0.25\n",
+    "\n",
+    "c_clipped = np.clip(c_star, epsilon, 1 - epsilon)\n",
+    "thetas = 2 * np.arcsin(np.sqrt(c_clipped))\n",
+    "\n",
+    "print(f\"Continuous relaxation c*  = {np.round(c_star, 4)}\")\n",
+    "print(f\"After regularization      = {np.round(c_clipped, 4)}\")\n",
+    "print(f\"Warm-start angles theta   = {np.round(thetas, 4)} radians\")\n",
+    "print()\n",
+    "print(\"Angle interpretation:\")\n",
+    "print(\"  theta = 0      <->  c* = 0   (qubit points toward |0>)\")\n",
+    "print(\n",
+    "    \"  theta = pi/2   <->  c* = 0.5 (qubit in equal superposition, like |+>)\"\n",
+    ")\n",
+    "print(\"  theta = pi     <->  c* = 1   (qubit points toward |1>)\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "step2-angles-md",
+   "metadata": {},
+   "source": [
+    "After clipping, $c^* = 1$ becomes $1 - \\varepsilon = 0.75$ and $c^* = 0$ becomes $\\varepsilon = 0.25$. The resulting angles $\\theta \\approx [2.09, 1.05, 1.05, 2.09]$ radians rotate qubits 0 and 3 strongly toward $|1\\rangle$ and qubits 1 and 2 toward $|0\\rangle$, directly encoding the structure of the optimal cut into the initial quantum state."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "step2-circuit-builders-code",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def apply_cost_unitary(qc, cost_op, gamma):\n",
+    "    \"\"\"Apply exp(-i * gamma * H_C) to the circuit.\n",
+    "\n",
+    "    Each Pauli term in H_C contributes a rotation gate:\n",
+    "      - Single-Z term h_i * Z_i  ->  RZ(2 * gamma * h_i) on qubit i\n",
+    "      - Two-Z term J_ij * Z_i Z_j  ->  CNOT, RZ(2 * gamma * J_ij), CNOT\n",
+    "    \"\"\"\n",
+    "    for pauli_term, coeff in zip(cost_op.paulis, cost_op.coeffs):\n",
+    "        indices = [\n",
+    "            j for j, q in enumerate(pauli_term.to_label()[::-1]) if q == \"Z\"\n",
+    "        ]\n",
+    "        if len(indices) == 1:\n",
+    "            qc.rz(2 * gamma * coeff.real, indices[0])\n",
+    "        elif len(indices) == 2:\n",
+    "            qc.cx(indices[0], indices[1])\n",
+    "            qc.rz(2 * gamma * coeff.real, indices[1])\n",
+    "            qc.cx(indices[0], indices[1])\n",
+    "\n",
+    "\n",
+    "def build_standard_qaoa(cost_op, n_layers, n_qubits):\n",
+    "    \"\"\"Standard QAOA: uniform superposition initial state + X-mixer.\"\"\"\n",
+    "    gammas = ParameterVector(\"γ\", n_layers)\n",
+    "    betas = ParameterVector(\"β\", n_layers)\n",
+    "    qc = QuantumCircuit(n_qubits)\n",
+    "    qc.h(range(n_qubits))  # |+>^n\n",
+    "    for k in range(n_layers):\n",
+    "        apply_cost_unitary(qc, cost_op, gammas[k])\n",
+    "        # H_M = -sum X_i  =>  exp(-i*beta*H_M) = prod_i RX(-2*beta)\n",
+    "        qc.rx(-2 * betas[k], range(n_qubits))\n",
+    "    return qc, gammas, betas\n",
+    "\n",
+    "\n",
+    "def build_ws_qaoa(cost_op, n_layers, n_qubits, thetas):\n",
+    "    \"\"\"WS-QAOA: warm-start initial state + custom per-qubit mixer.\n",
+    "\n",
+    "    Per Egger et al. (2021) Eq. (1)-(2):\n",
+    "      Initial state per qubit i:  R_Y(theta_i) |0>\n",
+    "      Mixer gate per qubit i:     R_Y(theta_i) R_Z(-2*beta) R_Y(-theta_i)\n",
+    "    \"\"\"\n",
+    "    gammas = ParameterVector(\"γ\", n_layers)\n",
+    "    betas = ParameterVector(\"β\", n_layers)\n",
+    "    qc = QuantumCircuit(n_qubits)\n",
+    "    for i, theta in enumerate(thetas):\n",
+    "        qc.ry(theta, i)  # warm-start initial state\n",
+    "    for k in range(n_layers):\n",
+    "        apply_cost_unitary(qc, cost_op, gammas[k])\n",
+    "        for i, theta in enumerate(thetas):\n",
+    "            qc.ry(theta, i)\n",
+    "            qc.rz(-2 * betas[k], i)\n",
+    "            qc.ry(-theta, i)\n",
+    "    return qc, gammas, betas\n",
+    "\n",
+    "\n",
+    "std_qc, std_gammas, std_betas = build_standard_qaoa(\n",
+    "    cost_operator, p, n_qubits\n",
+    ")\n",
+    "ws_qc, ws_gammas, ws_betas = build_ws_qaoa(cost_operator, p, n_qubits, thetas)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "step2-circuit-builders-md",
+   "metadata": {},
+   "source": [
+    "The `apply_cost_unitary` helper reads directly from the `SparsePauliOp` Hamiltonian, so it automatically handles any QUBO problem without manual circuit construction. The WS-QAOA mixer applies three rotations per qubit per layer ($R_Y$, $R_Z$, $R_Y$), compared to a single $R_X$ in the standard mixer &mdash; a modest overhead that is constant in the number of qubits."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "id": "step2-draw-code",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Standard QAOA circuit (p=1):\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "<Image src=\"/docs/images/tutorials/warm-start-qaoa/extracted-outputs/step2-draw-code-1.avif\" alt=\"Output of the previous code cell\" />"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "print(\"Standard QAOA circuit (p=1):\")\n",
+    "std_qc.draw(\"mpl\", fold=-1)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "id": "6fad9eda",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "WS-QAOA circuit (p=1):\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "<Image src=\"/docs/images/tutorials/warm-start-qaoa/extracted-outputs/6fad9eda-1.avif\" alt=\"Output of the previous code cell\" />"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "print(\"\\nWS-QAOA circuit (p=1):\")\n",
+    "ws_qc.draw(\"mpl\", fold=-1)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "step2-draw-md",
+   "metadata": {},
+   "source": [
+    "Both circuits follow the same structure: an initial state preparation layer, then $p$ alternating cost-unitary and mixer-unitary layers. In the WS-QAOA circuit, the opening $R_Y$ gates encode $c^*$, and the mixer replaces each $R_X$ with a conjugated $R_Y$–$R_Z$–$R_Y$ triplet. The circuit depth difference between the two grows linearly with $p$, but remains manageable at low depth."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b4d480b3",
+   "metadata": {},
+   "source": [
+    "### Step 3: Execute using Qiskit primitives\n",
+    "\n",
+    "We use `StatevectorEstimator` for exact, noiseless simulation. The `minimize` function from SciPy with the COBYLA optimizer drives the variational loop, calling the estimator at each iteration to evaluate $\\langle H_C \\rangle$ for a given parameter set $(\\gamma, \\beta)$.\n",
+    "\n",
+    "The two algorithms use different initial parameters that reflect what each knows before optimization:\n",
+    "- **Standard QAOA:** Random initialization in $[0, \\pi]$ — appropriate since no structural information is available.\n",
+    "- **WS-QAOA:** $\\gamma = 0$, $\\beta = \\pi/4$ — at $\\gamma=0$ the cost unitary is the identity, so the very first circuit evaluation samples directly from the warm-start initial state. This gives COBYLA a strong starting signal aligned with the classical solution."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "id": "step3-optimize-code",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Standard QAOA optimal energy : -0.5859\n",
+      "  optimal params: gamma=[2.0534], beta=[2.4612]\n",
+      "  optimizer calls: 36\n",
+      "\n",
+      "WS-QAOA optimal energy       : -1.5000\n",
+      "  optimal params: gamma=[-0.0001], beta=[1.5708]\n",
+      "  optimizer calls: 42\n"
+     ]
+    }
+   ],
+   "source": [
+    "estimator = StatevectorEstimator()\n",
+    "\n",
+    "\n",
+    "def make_cost_fn(circuit, param_order, cost_op, estimator, history):\n",
+    "    \"\"\"Return a scalar cost function compatible with scipy.optimize.minimize.\"\"\"\n",
+    "\n",
+    "    def cost_fn(params):\n",
+    "        bound = circuit.assign_parameters(dict(zip(param_order, params)))\n",
+    "        job = estimator.run([(bound, cost_op)])\n",
+    "        energy = job.result()[0].data.evs.real\n",
+    "        history.append(energy)\n",
+    "        return energy\n",
+    "\n",
+    "    return cost_fn\n",
+    "\n",
+    "\n",
+    "# Standard QAOA: random initialization\n",
+    "np.random.seed(42)\n",
+    "std_params0 = np.random.uniform(0, np.pi, 2 * p)\n",
+    "std_history = []\n",
+    "std_param_order = list(std_gammas) + list(std_betas)\n",
+    "\n",
+    "std_result = minimize(\n",
+    "    make_cost_fn(\n",
+    "        std_qc, std_param_order, cost_operator, estimator, std_history\n",
+    "    ),\n",
+    "    std_params0,\n",
+    "    method=\"COBYLA\",\n",
+    "    options={\"maxiter\": 300, \"rhobeg\": 0.5},\n",
+    ")\n",
+    "print(f\"Standard QAOA optimal energy : {std_result.fun:.4f}\")\n",
+    "print(\n",
+    "    f\"  optimal params: gamma={std_result.x[:p].round(4)}, beta={std_result.x[p:].round(4)}\"\n",
+    ")\n",
+    "print(f\"  optimizer calls: {len(std_history)}\")\n",
+    "\n",
+    "\n",
+    "# WS-QAOA: informed initialization\n",
+    "ws_params0 = np.concatenate([np.zeros(p), np.full(p, np.pi / 4)])\n",
+    "ws_history = []\n",
+    "ws_param_order = list(ws_gammas) + list(ws_betas)\n",
+    "\n",
+    "ws_result = minimize(\n",
+    "    make_cost_fn(ws_qc, ws_param_order, cost_operator, estimator, ws_history),\n",
+    "    ws_params0,\n",
+    "    method=\"COBYLA\",\n",
+    "    options={\"maxiter\": 300, \"rhobeg\": 0.5},\n",
+    ")\n",
+    "print(f\"\\nWS-QAOA optimal energy       : {ws_result.fun:.4f}\")\n",
+    "print(\n",
+    "    f\"  optimal params: gamma={ws_result.x[:p].round(4)}, beta={ws_result.x[p:].round(4)}\"\n",
+    ")\n",
+    "print(f\"  optimizer calls: {len(ws_history)}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "step3-optimize-md",
+   "metadata": {},
+   "source": [
+    "WS-QAOA's informed starting point means COBYLA begins with a meaningful energy value near the warm-start solution, while standard QAOA starts from an essentially random point on the energy landscape. This difference in starting quality is the main driver of the convergence gap visible in Step 4."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "id": "step3-reference-code",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Exact optimal energy         : -1.5000\n",
+      "Standard QAOA approx. ratio  : 0.3906\n",
+      "WS-QAOA approx. ratio        : 1.0000\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Compute the exact optimal energy by brute-force over all 2^n bitstrings\n",
+    "all_energies = [\n",
+    "    Statevector.from_label(format(k, f\"0{n_qubits}b\"))\n",
+    "    .expectation_value(cost_operator)\n",
+    "    .real\n",
+    "    for k in range(2**n_qubits)\n",
+    "]\n",
+    "optimal_energy = min(all_energies)\n",
+    "\n",
+    "print(f\"Exact optimal energy         : {optimal_energy:.4f}\")\n",
+    "print(f\"Standard QAOA approx. ratio  : {std_result.fun / optimal_energy:.4f}\")\n",
+    "print(f\"WS-QAOA approx. ratio        : {ws_result.fun / optimal_energy:.4f}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "step3-reference-md",
+   "metadata": {},
+   "source": [
+    "The approximation ratio is defined as $\\langle H_C \\rangle_{\\text{QAOA}} / E_{\\text{opt}}$. For minimization problems where $E_{\\text{opt}} < 0$, a ratio closer to 1 means the algorithm found a lower energy (a better solution). The brute-force search over all $2^n$ basis states is only feasible for small $n$ and serves as a ground-truth reference."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "50b94af2",
+   "metadata": {},
+   "source": [
+    "### Step 4: Post-process and return result in desired classical format\n",
+    "\n",
+    "We visualize convergence, sample the optimized circuits for bitstring solutions, decode those bitstrings back to max-cut partitions, and summarize the final results."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "id": "step4-convergence-code",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "<Image src=\"/docs/images/tutorials/warm-start-qaoa/extracted-outputs/step4-convergence-code-0.avif\" alt=\"Output of the previous code cell\" />"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "fig, ax = plt.subplots(figsize=(7, 4))\n",
+    "ax.plot(std_history, label=\"Standard QAOA\", alpha=0.85)\n",
+    "ax.plot(ws_history, label=\"WS-QAOA\", alpha=0.85)\n",
+    "ax.axhline(\n",
+    "    optimal_energy,\n",
+    "    color=\"k\",\n",
+    "    linestyle=\"--\",\n",
+    "    label=f\"Exact optimal ({optimal_energy:.2f})\",\n",
+    ")\n",
+    "ax.set_xlabel(\"Optimizer call\")\n",
+    "ax.set_ylabel(r\"$\\langle H_C \\rangle$\")\n",
+    "ax.set_title(\"Convergence: Standard QAOA vs. WS-QAOA\")\n",
+    "ax.legend()\n",
+    "plt.tight_layout()\n",
+    "plt.show()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "step4-convergence-md",
+   "metadata": {},
+   "source": [
+    "The convergence plot shows the energy $\\langle H_C \\rangle$ at each COBYLA function evaluation. Standard QAOA at $p=1$ is limited to ~49% of the optimal energy on this graph (the theoretical maximum for $p=1$ QAOA on graphs with triangles), settling around $-0.74$. WS-QAOA, initialized close to the optimal solution, converges quickly to near $-1.50$ (the exact optimum) with far fewer iterations. This demonstrates the key advantage of the warm start: at the same circuit depth, it reaches a significantly better solution."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "id": "step4-sample-code",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Standard QAOA most-probable bitstring : 0110\n",
+      "  Partition: S=[0, 3], S̄=[1, 2]  |  cut value = 4.0\n",
+      "\n",
+      "WS-QAOA most-probable bitstring       : 0110\n",
+      "  Partition: S=[0, 3], S̄=[1, 2]  |  cut value = 4.0\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Sample the optimized circuits to recover the most probable bitstring solutions\n",
+    "sampler = StatevectorSampler()\n",
+    "shots = 1024\n",
+    "\n",
+    "\n",
+    "def get_best_bitstring(circuit, param_order, optimal_params, sampler, shots):\n",
+    "    bound = circuit.assign_parameters(dict(zip(param_order, optimal_params)))\n",
+    "    bound.measure_all()\n",
+    "    job = sampler.run([bound], shots=shots)\n",
+    "    counts = job.result()[0].data.meas.get_counts()\n",
+    "    return max(counts, key=counts.get), counts\n",
+    "\n",
+    "\n",
+    "def evaluate_cut(bitstring, G):\n",
+    "    \"\"\"Compute the Max-Cut value for a bitstring node assignment.\"\"\"\n",
+    "    x = [int(b) for b in bitstring]\n",
+    "    cut_val = sum(\n",
+    "        w for u, v, w in G.edges.data(\"weight\", default=1) if x[u] != x[v]\n",
+    "    )\n",
+    "    set0 = [i for i, b in enumerate(bitstring) if b == \"0\"]\n",
+    "    set1 = [i for i, b in enumerate(bitstring) if b == \"1\"]\n",
+    "    return cut_val, set0, set1\n",
+    "\n",
+    "\n",
+    "# Qiskit bitstring ordering: rightmost character = qubit 0\n",
+    "def decode_bitstring(bs):\n",
+    "    return bs[::-1]\n",
+    "\n",
+    "\n",
+    "std_best, _ = get_best_bitstring(\n",
+    "    std_qc, std_param_order, std_result.x, sampler, shots\n",
+    ")\n",
+    "ws_best, _ = get_best_bitstring(\n",
+    "    ws_qc, ws_param_order, ws_result.x, sampler, shots\n",
+    ")\n",
+    "\n",
+    "std_cut, std_s0, std_s1 = evaluate_cut(decode_bitstring(std_best), G)\n",
+    "ws_cut, ws_s0, ws_s1 = evaluate_cut(decode_bitstring(ws_best), G)\n",
+    "\n",
+    "print(f\"Standard QAOA most-probable bitstring : {std_best}\")\n",
+    "print(f\"  Partition: S={std_s0}, S̄={std_s1}  |  cut value = {std_cut}\")\n",
+    "print()\n",
+    "print(f\"WS-QAOA most-probable bitstring       : {ws_best}\")\n",
+    "print(f\"  Partition: S={ws_s0}, S̄={ws_s1}  |  cut value = {ws_cut}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "step4-sample-md",
+   "metadata": {},
+   "source": [
+    "Bitstrings from `Sampler` are returned with qubit 0 at the rightmost position, so reversing the string maps index $i$ to variable $x_i$. The cut value is the total weight of edges that cross the partition, which is what the max-cut problem aims to maximize. A cut value of 4 uses four of the five available edges, which is the theoretical maximum for this graph."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "id": "step4-visualize-code",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "<Image src=\"/docs/images/tutorials/warm-start-qaoa/extracted-outputs/step4-visualize-code-0.avif\" alt=\"Output of the previous code cell\" />"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "=== Summary ===\n",
+      "Method                 Ising energy    Cut value   Approx. ratio\n",
+      "-----------------------------------------------------------------\n",
+      "Standard QAOA               -0.5859          4.0          0.3906\n",
+      "WS-QAOA                     -1.5000          4.0          1.0000\n",
+      "Exact optimal               -1.5000            4          1.0000\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Visualize the WS-QAOA solution on the graph\n",
+    "fig, axes = plt.subplots(1, 2, figsize=(8, 3))\n",
+    "\n",
+    "for ax, s0, s1, cut, title in [\n",
+    "    (axes[0], std_s0, std_s1, std_cut, f\"Standard QAOA (cut = {std_cut})\"),\n",
+    "    (axes[1], ws_s0, ws_s1, ws_cut, f\"WS-QAOA (cut = {ws_cut})\"),\n",
+    "]:\n",
+    "    colors = [\"skyblue\" if i in s0 else \"salmon\" for i in G.nodes()]\n",
+    "    nx.draw(G, pos, with_labels=True, node_color=colors, ax=ax)\n",
+    "    nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels, ax=ax)\n",
+    "    ax.set_title(title)\n",
+    "\n",
+    "plt.tight_layout()\n",
+    "plt.show()\n",
+    "\n",
+    "# Summary\n",
+    "# to_ising offset: QUBO value = Ising energy + offset, so Max-Cut value = -(Ising energy + offset)\n",
+    "optimal_cut = -(optimal_energy + offset)\n",
+    "print(\"=== Summary ===\")\n",
+    "print(\n",
+    "    f\"{'Method':<20} {'Ising energy':>14} {'Cut value':>12} {'Approx. ratio':>15}\"\n",
+    ")\n",
+    "print(\"-\" * 65)\n",
+    "print(\n",
+    "    f\"{'Standard QAOA':<20} {std_result.fun:>14.4f} {std_cut:>12} {std_result.fun/optimal_energy:>15.4f}\"\n",
+    ")\n",
+    "print(\n",
+    "    f\"{'WS-QAOA':<20} {ws_result.fun:>14.4f} {ws_cut:>12} {ws_result.fun/optimal_energy:>15.4f}\"\n",
+    ")\n",
+    "print(\n",
+    "    f\"{'Exact optimal':<20} {optimal_energy:>14.4f} {optimal_cut:>12.0f} {'1.0000':>15}\"\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "step4-visualize-md",
+   "metadata": {},
+   "source": [
+    "The graph visualization colors each node by its partition assignment (blue = $S$, orange = $\\bar{S}$). Edges that cross the partition (connecting differently-colored nodes) are the ones counted in the cut.\n",
+    "\n",
+    "Both methods find a bitstring with cut value 4, but for very different reasons. It is important to note that the **convergence plot and the sampled bitstring measure two different things**:\n",
+    "\n",
+    "- The **convergence plot** tracks the average energy $\\langle H_C \\rangle$ of the full quantum state, a weighted average over all bitstrings in the superposition. Standard QAOA converges to ~$-0.62$, well above the optimal $-1.50$, meaning its quantum state is spread across many suboptimal bitstrings and only occasionally includes the correct answer.\n",
+    "- The **sampled bitstring** is a single draw from that state. Standard QAOA got lucky here; the optimal partition happened to be the most frequently sampled outcome even from a diffuse state. On harder problems, noisier hardware, or with more candidate solutions competing, this luck runs out.\n",
+    "\n",
+    "WS-QAOA, by contrast, converges its average energy all the way to $-1.50$, meaning its quantum state is concentrated on the optimal bitstrings. Nearly every shot returns the correct answer, so the solution is found reliably rather than by chance.\n",
+    "\n",
+    "The practical consequence: on this small noiseless simulator the difference may seem minor, but at larger problem sizes or on real hardware, a state with average energy close to optimal is far more robust than one that only occasionally samples the right answer from a diffuse distribution."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0d6db390-e7a8-4efe-902c-8d9a312170c6",
+   "metadata": {},
+   "source": [
+    "# Large Scale Hardware Example"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ae69c5e0-32b1-4f03-ab13-7b95a9acfd25",
+   "metadata": {},
+   "source": [
+    "### Steps 1-4 compress into single code block"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d58164ca-3b20-441c-9777-728497580cab",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Using backend: ibm_boston\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Selecting a backend using real hardware\n",
+    "service = QiskitRuntimeService()\n",
+    "backend = service.least_busy(\n",
+    "    operational=True, simulator=False, min_num_qubits=127\n",
+    ")\n",
+    "print(f\"Using backend: {backend.name}\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a35f3b21",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Graph: 40 nodes, 60 edges (3-regular)\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "<Image src=\"/docs/images/tutorials/warm-start-qaoa/extracted-outputs/a35f3b21-1.avif\" alt=\"Output of the previous code cell\" />"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Cost operator: 40 qubits, 60 Pauli terms\n",
+      "c* range: [0.000, 1.000]  theta range: [1.047, 2.094] rad\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "<Image src=\"/docs/images/tutorials/warm-start-qaoa/extracted-outputs/a35f3b21-3.avif\" alt=\"Output of the previous code cell\" />"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "Transpiled circuit: 2Q depth=94\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "<Image src=\"/docs/images/tutorials/warm-start-qaoa/extracted-outputs/a35f3b21-5.avif\" alt=\"Output of the previous code cell\" />"
+      ]
+     },
+     "execution_count": 16,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# ── Step 1a: Build the 40-node Max-Cut problem ─────────────────────────────\n",
+    "# A 3-regular graph (every node has exactly 3 neighbors) is a standard QAOA\n",
+    "N_LARGE = 40\n",
+    "G_large = nx.random_regular_graph(d=3, n=N_LARGE, seed=0)\n",
+    "edges_large = list(G_large.edges())\n",
+    "print(f\"Graph: {N_LARGE} nodes, {len(edges_large)} edges (3-regular)\")\n",
+    "\n",
+    "# Visualize the graph so it is clear what problem we are solving before any\n",
+    "# quantum work. Nodes in a circular layout; each edge contributes +1 to the\n",
+    "# cut value when its endpoints land in different partitions.\n",
+    "pos_large = nx.circular_layout(G_large)\n",
+    "fig, ax = plt.subplots(figsize=(6, 6))\n",
+    "nx.draw(\n",
+    "    G_large,\n",
+    "    pos_large,\n",
+    "    with_labels=True,\n",
+    "    node_color=\"lightblue\",\n",
+    "    node_size=400,\n",
+    "    font_size=7,\n",
+    "    ax=ax,\n",
+    ")\n",
+    "ax.set_title(f\"40-node 3-regular Max-Cut graph ({len(edges_large)} edges)\")\n",
+    "plt.tight_layout()\n",
+    "plt.show()\n",
+    "\n",
+    "\n",
+    "prob_large = OptimizationProblem(\"max_cut_40\")\n",
+    "x_large = [prob_large.binary_var(name=f\"x{i}\") for i in range(N_LARGE)]\n",
+    "\n",
+    "lin_large, quad_large = {}, {}\n",
+    "for u, v in edges_large:\n",
+    "    lin_large[f\"x{u}\"] = lin_large.get(f\"x{u}\", 0.0) - 1.0\n",
+    "    lin_large[f\"x{v}\"] = lin_large.get(f\"x{v}\", 0.0) - 1.0\n",
+    "    quad_large[(f\"x{u}\", f\"x{v}\")] = (\n",
+    "        quad_large.get((f\"x{u}\", f\"x{v}\"), 0.0) + 2.0\n",
+    "    )\n",
+    "\n",
+    "prob_large.minimize(linear=lin_large, quadratic=quad_large)\n",
+    "\n",
+    "converter_large = OptimizationProblemToQubo()\n",
+    "qubo_large = converter_large.convert(prob_large)\n",
+    "cost_op_large, offset_large = to_ising(qubo_large)\n",
+    "n_qubits_large = cost_op_large.num_qubits\n",
+    "print(\n",
+    "    f\"Cost operator: {n_qubits_large} qubits, {len(cost_op_large)} Pauli terms\"\n",
+    ")\n",
+    "\n",
+    "# ── Step 1b: QP relaxation (multi-start L-BFGS-B) ─────────────────────────\n",
+    "# Same multi-start approach as the small example. At 40 qubits the relaxed\n",
+    "# landscape has many more local minima, so 200 random starts are essential\n",
+    "# to find a low-energy warm-start point.\n",
+    "Q_large = qubo_large.objective.quadratic.to_array(symmetric=True)\n",
+    "mu_large = qubo_large.objective.linear.to_array()\n",
+    "\n",
+    "\n",
+    "def qp_obj_large(x):\n",
+    "    return x @ Q_large @ x + mu_large @ x + qubo_large.objective.constant\n",
+    "\n",
+    "\n",
+    "bounds_large = [(0.0, 1.0)] * n_qubits_large\n",
+    "rng_qp = np.random.default_rng(42)\n",
+    "best_val_large, c_star_large = np.inf, None\n",
+    "\n",
+    "for _ in range(200):\n",
+    "    x0 = rng_qp.uniform(0.0, 1.0, n_qubits_large)\n",
+    "    res = minimize(qp_obj_large, x0, method=\"L-BFGS-B\", bounds=bounds_large)\n",
+    "    if res.fun < best_val_large:\n",
+    "        best_val_large, c_star_large = res.fun, res.x\n",
+    "\n",
+    "# Regularize and convert to rotation angles (same formula as small example)\n",
+    "epsilon_large = 0.25\n",
+    "c_clipped_large = np.clip(c_star_large, epsilon_large, 1 - epsilon_large)\n",
+    "thetas_large = 2 * np.arcsin(np.sqrt(c_clipped_large))\n",
+    "print(\n",
+    "    f\"c* range: [{c_star_large.min():.3f}, {c_star_large.max():.3f}]  \"\n",
+    "    f\"theta range: [{thetas_large.min():.3f}, {thetas_large.max():.3f}] rad\"\n",
+    ")\n",
+    "\n",
+    "# Plot the distribution of c* values to see how much structure the relaxation\n",
+    "# extracted. Values near 0/1 mean confident assignments; values near 0.5 mean\n",
+    "# the classical solver was uncertain and quantum exploration is most needed there.\n",
+    "fig, ax = plt.subplots(figsize=(6, 3))\n",
+    "ax.hist(c_star_large, bins=20, color=\"steelblue\", edgecolor=\"white\")\n",
+    "ax.axvline(0.5, color=\"k\", linestyle=\"--\", label=\"Uniform prior (std QAOA)\")\n",
+    "ax.set_xlabel(r\"$c^*_i$\")\n",
+    "ax.set_ylabel(\"Count\")\n",
+    "ax.set_title(r\"Distribution of warm-start values $c^*_i$ (40-node graph)\")\n",
+    "ax.legend()\n",
+    "plt.tight_layout()\n",
+    "plt.show()\n",
+    "\n",
+    "# ── Step 1c: Build WS-QAOA circuit ─────────────────────────────────────────\n",
+    "# Reuse build_ws_qaoa from the small-scale section unchanged; the helper\n",
+    "# scales automatically with n_qubits and the cost operator size.\n",
+    "p_large = 1\n",
+    "ws_qc_large, ws_gammas_large, ws_betas_large = build_ws_qaoa(\n",
+    "    cost_op_large, p_large, n_qubits_large, thetas_large\n",
+    ")\n",
+    "ws_qc_large.measure_all()\n",
+    "\n",
+    "# ── Step 2: Transpile to hardware-native gates ──────────────────────────\n",
+    "# generate_preset_pass_manager compiles the abstract circuit to th\n",
+    "# gate set of the backend and inserts SWAP gates wherever the cost Hamiltonian\n",
+    "# couples qubits that are not directly connected on the processor.\n",
+    "pm = generate_preset_pass_manager(optimization_level=3, backend=backend)\n",
+    "ws_isa_large = pm.run(ws_qc_large)\n",
+    "\n",
+    "ecr_count = ws_isa_large.count_ops().get(\"ecr\", 0)\n",
+    "print(\n",
+    "    f\"\\nTranspiled circuit: 2Q depth={ws_isa_large.depth(lambda x: x.operation.num_qubits == 2)}\"\n",
+    ")\n",
+    "ws_isa_large.draw(\"mpl\", fold=-1)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "00ae1953",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Simulated annealing cut value: 53  (classical reference)\n",
+      "  iter  30  <H_C> = -12.8021\n",
+      "Optimisation complete: energy=-12.8904, iterations=30\n",
+      "Most-probable bitstring frequency: 6/8192 (0.1%)\n",
+      "WS-QAOA cut: 53  |  SA cut: 53  |  Approximation ratio vs SA: 1.0000\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "<Image src=\"/docs/images/tutorials/warm-start-qaoa/extracted-outputs/00ae1953-1.avif\" alt=\"Output of the previous code cell\" />"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "data": {
+      "text/plain": [
+       "<Image src=\"/docs/images/tutorials/warm-start-qaoa/extracted-outputs/00ae1953-2.avif\" alt=\"Output of the previous code cell\" />"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "=== Large Scale Summary ===\n",
+      "Metric                                      Value\n",
+      "--------------------------------------------------\n",
+      "Nodes / Edges                             40 / 60  \n",
+      "QAOA layers (p)                                 1\n",
+      "Transpiled ECR gate count                       0\n",
+      "Transpiled circuit depth                      283\n",
+      "Optimizer iterations                           30\n",
+      "WS-QAOA energy (hardware)                -12.8904\n",
+      "Cut value                                      53\n",
+      "Simulated annealing cut value                  53\n",
+      "Approximation ratio (vs SA)                1.0000\n"
+     ]
+    }
+   ],
+   "source": [
+    "# ── Classical baseline via simulated annealing ────────────────────\n",
+    "# Run SA before any hardware calls to get a strong classical reference cut\n",
+    "# value. SA is fast (seconds), needs no solver license, and reliably finds\n",
+    "# near-optimal solutions on 40-node graphs. We use sa_cut as the denominator\n",
+    "# for the approximation ratio instead of the looser QP upper bound.\n",
+    "#\n",
+    "# At each step we flip a random node and accept the move if it improves the\n",
+    "# cut, or with probability exp(delta/T) otherwise. Temperature T decays\n",
+    "# geometrically, allowing uphill moves early on to escape local minima.\n",
+    "def simulated_annealing_maxcut(\n",
+    "    G, seed=0, T0=2.0, T_min=1e-4, alpha=0.995, n_steps=100_000\n",
+    "):\n",
+    "    rng_sa = np.random.default_rng(seed)\n",
+    "    n = G.number_of_nodes()\n",
+    "    x = rng_sa.integers(0, 2, n)\n",
+    "    best_x = x.copy()\n",
+    "    best_cut = sum(1 for u, v in G.edges() if x[u] != x[v])\n",
+    "    T = T0\n",
+    "    for _ in range(n_steps):\n",
+    "        i = rng_sa.integers(0, n)\n",
+    "        delta = sum((-1 if x[i] != x[nb] else 1) for nb in G.neighbors(i))\n",
+    "        if delta > 0 or rng_sa.random() < np.exp(delta / T):\n",
+    "            x[i] ^= 1\n",
+    "            cut = sum(1 for u, v in G.edges() if x[u] != x[v])\n",
+    "            if cut > best_cut:\n",
+    "                best_cut, best_x = cut, x.copy()\n",
+    "        T = max(T * alpha, T_min)\n",
+    "    return best_x, best_cut\n",
+    "\n",
+    "\n",
+    "sa_solution, sa_cut = simulated_annealing_maxcut(G_large)\n",
+    "print(f\"Simulated annealing cut value: {sa_cut}  (classical reference)\")\n",
+    "\n",
+    "# ── Step 3: Execution on hardware ───────────────────────────\n",
+    "# A Session reserves the backend so the COBYLA iterations and final sampling\n",
+    "# run back-to-back without re-queuing between jobs — important when the\n",
+    "# optimizer submits many short jobs sequentially. All jobs are tagged with\n",
+    "# \"TUT_WSQAOA\" for traceability in the IBM Quantum dashboard.\n",
+    "#\n",
+    "# EstimatorV2 with resilience_level=1 enables twirled readout error extinction\n",
+    "# (TREX), which corrects systematic measurement bit-flip errors without extra\n",
+    "# circuit overhead. 4096 shots per call balances estimation noise vs. job time.\n",
+    "estimator_options = EstimatorOptions()\n",
+    "estimator_options.resilience_level = 1\n",
+    "estimator_options.default_shots = 4096\n",
+    "estimator_options.environment.job_tags = [\"TUT_WSQAOA\"]\n",
+    "\n",
+    "# Align the cost observable with the physical qubit layout chosen by the transpiler\n",
+    "cost_op_isa = cost_op_large.apply_layout(ws_isa_large.layout)\n",
+    "ws_param_order_isa = list(ws_isa_large.parameters)\n",
+    "\n",
+    "ws_history_hw = []\n",
+    "\n",
+    "with Session(backend=backend) as session:\n",
+    "    estimator_hw = Estimator(mode=session, options=estimator_options)\n",
+    "\n",
+    "    def hw_cost_fn(params):\n",
+    "        bound = ws_isa_large.assign_parameters(\n",
+    "            dict(zip(ws_param_order_isa, params))\n",
+    "        )\n",
+    "        energy = (\n",
+    "            estimator_hw.run([(bound, cost_op_isa)]).result()[0].data.evs.real\n",
+    "        )\n",
+    "        ws_history_hw.append(float(energy))\n",
+    "        print(\n",
+    "            f\"  iter {len(ws_history_hw):>3d}  <H_C> = {energy:.4f}\", end=\"\\r\"\n",
+    "        )\n",
+    "        return float(energy)\n",
+    "\n",
+    "    # Warm-start initialization: gamma=0 means the cost unitary is the identity on\n",
+    "    # the first call, so COBYLA immediately evaluates the warm-start state itself —\n",
+    "    # a much better starting signal than a random point.\n",
+    "    ws_params0_hw = np.concatenate(\n",
+    "        [np.zeros(p_large), np.full(p_large, np.pi / 4)]\n",
+    "    )\n",
+    "\n",
+    "    ws_result_hw = minimize(\n",
+    "        hw_cost_fn,\n",
+    "        ws_params0_hw,\n",
+    "        method=\"COBYLA\",\n",
+    "        options={\"maxiter\": 150, \"rhobeg\": 0.3},\n",
+    "    )\n",
+    "    print(\n",
+    "        f\"\\nOptimization complete: energy={ws_result_hw.fun:.4f}, \"\n",
+    "        f\"iterations={len(ws_history_hw)}\"\n",
+    "    )\n",
+    "\n",
+    "    # ── Step 3b: Sample the optimized circuit ──────────────────────────────────\n",
+    "    # Use 8192 shots for the final sample to get a reliable mode estimate.\n",
+    "    sampler_hw = Sampler(\n",
+    "        mode=session,\n",
+    "        options={\"environment\": {\"job_tags\": [\"TUT_WSQAOA\"]}},\n",
+    "    )\n",
+    "    ws_bound_hw = ws_isa_large.assign_parameters(\n",
+    "        dict(zip(ws_param_order_isa, ws_result_hw.x))\n",
+    "    )\n",
+    "    counts_hw = (\n",
+    "        sampler_hw.run([ws_bound_hw], shots=8192)\n",
+    "        .result()[0]\n",
+    "        .data.meas.get_counts()\n",
+    "    )\n",
+    "\n",
+    "best_bs_hw = max(counts_hw, key=counts_hw.get)\n",
+    "best_count = counts_hw[best_bs_hw]\n",
+    "total_shots = sum(counts_hw.values())\n",
+    "\n",
+    "# Decode: Qiskit returns bitstrings with qubit 0 at the rightmost position,\n",
+    "# so reversing the string maps character index i to variable x_i.\n",
+    "cut_val_hw, s0_hw, s1_hw = evaluate_cut(best_bs_hw[::-1], G_large)\n",
+    "\n",
+    "# Compare against simulated annealing.\n",
+    "# A ratio >= 1.0 means WS-QAOA matched or beat the classical SA solution.\n",
+    "# A ratio close to 1.0 (e.g. > 0.95) shows the quantum result is competitive.\n",
+    "approx_ratio_hw = cut_val_hw / sa_cut\n",
+    "print(\n",
+    "    f\"Most-probable bitstring frequency: {best_count}/{total_shots} \"\n",
+    "    f\"({100*best_count/total_shots:.1f}%)\"\n",
+    ")\n",
+    "print(\n",
+    "    f\"WS-QAOA cut: {cut_val_hw}  |  SA cut: {sa_cut}  \"\n",
+    "    f\"|  Approximation ratio vs SA: {approx_ratio_hw:.4f}\"\n",
+    ")\n",
+    "\n",
+    "# Visualize both solutions side-by-side on the graph.\n",
+    "# Blue = partition S, orange = partition S-bar.\n",
+    "# Edges crossing between colors are the ones counted in the cut.\n",
+    "fig, axes = plt.subplots(1, 2, figsize=(14, 6))\n",
+    "for ax, assignment, cut, title in [\n",
+    "    (\n",
+    "        axes[0],\n",
+    "        list(sa_solution),\n",
+    "        sa_cut,\n",
+    "        f\"Simulated Annealing (cut={sa_cut})\",\n",
+    "    ),\n",
+    "    (\n",
+    "        axes[1],\n",
+    "        [int(b) for b in best_bs_hw[::-1]],\n",
+    "        cut_val_hw,\n",
+    "        f\"WS-QAOA hardware (cut={cut_val_hw})\",\n",
+    "    ),\n",
+    "]:\n",
+    "    colors = [\n",
+    "        \"skyblue\" if assignment[i] == 0 else \"salmon\" for i in G_large.nodes()\n",
+    "    ]\n",
+    "    nx.draw(\n",
+    "        G_large,\n",
+    "        pos_large,\n",
+    "        with_labels=True,\n",
+    "        node_color=colors,\n",
+    "        node_size=400,\n",
+    "        font_size=7,\n",
+    "        ax=ax,\n",
+    "    )\n",
+    "    ax.set_title(title)\n",
+    "plt.suptitle(\"Max-Cut partitions: SA vs WS-QAOA\", fontsize=13)\n",
+    "plt.tight_layout()\n",
+    "plt.show()\n",
+    "\n",
+    "# ── Step 4: Convergence plot and summary ──────────────────────────────────\n",
+    "# On real hardware the trace will be noisy (shot noise + gate errors), but the\n",
+    "# overall downward trend confirms that COBYLA is making progress despite noise.\n",
+    "fig, ax = plt.subplots(figsize=(7, 4))\n",
+    "ax.plot(ws_history_hw, color=\"tab:orange\", label=\"WS-QAOA (hardware)\")\n",
+    "ax.axhline(\n",
+    "    ws_result_hw.fun,\n",
+    "    color=\"tab:orange\",\n",
+    "    linestyle=\":\",\n",
+    "    label=f\"Final energy ({ws_result_hw.fun:.3f})\",\n",
+    ")\n",
+    "ax.set_xlabel(\"Optimizer call\")\n",
+    "ax.set_ylabel(r\"$\\langle H_C \\rangle$\")\n",
+    "ax.set_title(f\"WS-QAOA convergence on {backend.name} (40 qubits, p=1)\")\n",
+    "ax.legend()\n",
+    "plt.tight_layout()\n",
+    "plt.show()\n",
+    "\n",
+    "\n",
+    "print(\"\\n=== Large Scale Summary ===\")\n",
+    "print(f\"{'Metric':<38} {'Value':>10}\")\n",
+    "print(\"-\" * 50)\n",
+    "print(f\"{'Nodes / Edges':<38} {N_LARGE:>5} / {len(edges_large):<4}\")\n",
+    "print(f\"{'QAOA layers (p)':<38} {p_large:>10}\")\n",
+    "print(f\"{'Transpiled ECR gate count':<38} {ecr_count:>10}\")\n",
+    "print(f\"{'Transpiled circuit depth':<38} {ws_isa_large.depth():>10}\")\n",
+    "print(f\"{'Optimizer iterations':<38} {len(ws_history_hw):>10}\")\n",
+    "print(f\"{'WS-QAOA energy (hardware)':<38} {ws_result_hw.fun:>10.4f}\")\n",
+    "print(f\"{'Cut value':<38} {cut_val_hw:>10}\")\n",
+    "print(f\"{'Simulated annealing cut value':<38} {sa_cut:>10}\")\n",
+    "print(f\"{'Approximation ratio (vs SA)':<38} {approx_ratio_hw:>10.4f}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "de87f93a",
+   "metadata": {},
+   "source": [
+    "## What to look into next\n",
+    "If you found this work interesting you may be interested in the following material:\n",
+    "- **Higher QAOA layers**: Increase `p` to see how both algorithms improve with more circuit layers, and whether the WS-QAOA advantage at low depth persists.\n",
+    "- **Qiskit addon optimization mapper**: Explore the [documentation](https://qiskit.github.io/qiskit-addon-opt-mapper/) and try modeling different combinatorial problems, or using different solvers for the continuous relaxation."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "aafc36e2",
+   "metadata": {},
+   "source": [
+    "## References\n",
+    "\n",
+    "<span id=\"Reference1\"></span>\n",
+    "[\\[1\\]](#Reference1) D. J. Egger, J. Mareček, and S. Woerner, \"Warm-starting quantum optimization,\" *Quantum*, vol. 5, p. 479, 2021. [arXiv:2009.10095](https://arxiv.org/abs/2009.10095)\n",
+    "\n",
+    "<span id=\"Reference2\"></span>\n",
+    "[\\[2\\]](#Reference2) E. Farhi, J. Goldstone, and S. Gutmann, \"A quantum approximate optimization algorithm,\" [arXiv:1411.4028](https://arxiv.org/abs/1411.4028), 2014."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a9786b19",
+   "metadata": {},
+   "source": [
+    "© IBM Corp. 2026"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/00ae1953-1.avif b/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/00ae1953-1.avif
new file mode 100644
index 00000000000..46aad760626
Binary files /dev/null and b/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/00ae1953-1.avif differ
diff --git a/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/00ae1953-2.avif b/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/00ae1953-2.avif
new file mode 100644
index 00000000000..8793b24a170
Binary files /dev/null and b/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/00ae1953-2.avif differ
diff --git a/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/6fad9eda-1.avif b/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/6fad9eda-1.avif
new file mode 100644
index 00000000000..1c61eda14d5
Binary files /dev/null and b/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/6fad9eda-1.avif differ
diff --git a/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/a35f3b21-1.avif b/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/a35f3b21-1.avif
new file mode 100644
index 00000000000..a1bcf57a28c
Binary files /dev/null and b/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/a35f3b21-1.avif differ
diff --git a/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/a35f3b21-3.avif b/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/a35f3b21-3.avif
new file mode 100644
index 00000000000..eb57a037e7d
Binary files /dev/null and b/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/a35f3b21-3.avif differ
diff --git a/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/a35f3b21-5.avif b/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/a35f3b21-5.avif
new file mode 100644
index 00000000000..0b8cae48b5d
Binary files /dev/null and b/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/a35f3b21-5.avif differ
diff --git a/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/step1-graph-code-0.avif b/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/step1-graph-code-0.avif
new file mode 100644
index 00000000000..6aa0fcd8396
Binary files /dev/null and b/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/step1-graph-code-0.avif differ
diff --git a/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/step2-draw-code-1.avif b/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/step2-draw-code-1.avif
new file mode 100644
index 00000000000..48fadd71f71
Binary files /dev/null and b/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/step2-draw-code-1.avif differ
diff --git a/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/step4-convergence-code-0.avif b/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/step4-convergence-code-0.avif
new file mode 100644
index 00000000000..cb5f0ea4a07
Binary files /dev/null and b/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/step4-convergence-code-0.avif differ
diff --git a/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/step4-visualize-code-0.avif b/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/step4-visualize-code-0.avif
new file mode 100644
index 00000000000..af3e944c833
Binary files /dev/null and b/public/docs/images/tutorials/warm-start-qaoa/extracted-outputs/step4-visualize-code-0.avif differ
diff --git a/qiskit_bot.yaml b/qiskit_bot.yaml
index 27335a9e816..1e1e58fcfcc 100644
--- a/qiskit_bot.yaml
+++ b/qiskit_bot.yaml
@@ -688,6 +688,8 @@ notifications:
   "docs/tutorials/advanced-techniques-for-qaoa":
     - "`@nathanearnestnoble`"
     - "`@ibrahim-shehzad`"
+  "docs/tutorials/warm-start-qaoa":
+    - "@henryzou50"
   "docs/guides/circuit-transpilation-settings":
     - "`@nathanearnestnoble`"
     - "@henryzou50"
diff --git a/scripts/config/notebook-testing.toml b/scripts/config/notebook-testing.toml
index 63a3508e5e1..76d51512f85 100644
--- a/scripts/config/notebook-testing.toml
+++ b/scripts/config/notebook-testing.toml
@@ -176,6 +176,7 @@ notebooks = [
   "docs/tutorials/shors-algorithm.ipynb",
   "docs/tutorials/multi-product-formula.ipynb",
   "docs/tutorials/advanced-techniques-for-qaoa.ipynb",
+  "docs/tutorials/warm-start-qaoa.ipynb",
   "docs/tutorials/ai-transpiler-introduction.ipynb",
   "docs/tutorials/grovers-algorithm.ipynb",
   "docs/tutorials/spin-chain-vqe.ipynb",