TELCODOCS#2637: Adding custom MCP upgrade content by sr1kar99 · Pull Request #111682 · openshift/openshift-docs

sr1kar99 · 2026-05-14T17:30:02Z

Version(s):
4.21+

Issue:
https://redhat.atlassian.net/browse/TELCODOCS-2637

Link to docs preview:

Configuring custom machine config pools for parallel upgrades

QE review:

QE has approved this change.

ocpdocs-previewbot · 2026-05-14T17:38:08Z

🤖 Sun May 17 20:52:25 - Prow CI generated the docs preview:

https://111682--ocpdocs-pr.netlify.app/openshift-enterprise/latest/updating/preparing_for_updates/updating-cluster-prepare.html

sr1kar99 · 2026-05-14T19:49:14Z

@r3v5
Could you please review this PR?
Thanks!

alosadagrande · 2026-05-15T07:40:17Z

+
+* *nodeSelector*: Define a label to identify the nodes that belong to this pool (for example, `node-role.kubernetes.io/worker-0`).
+
+. Apply the `topology.kubernetes.io/zone` label to identify the KFD for the Kubernetes scheduler, and the custom node role label (for example, `worker-0`) to assign the node to the MCP by running the following command:


Apply the.... to identify each KFD for the Kubernetes scheduler.

It looks like only one topology label needs to be added, but a different one must be added for each MCP (worker-0, worker-1, worker-2, worker-3, ...).

Updated as follows:

. Apply the topology.kubernetes.io/zone label to each node to identify its KFD for the scheduler. You must apply the corresponding custom node role label (for example, worker-0, worker-1, worker-2) to assign each node to its custom MCP by running the following command:
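For illustration, the labeling step in the updated wording could look like the sketch below. It only prints the `oc label node` commands instead of running them. The node names (`node-a`, `node-b`, and so on), the zone values (`kfd0`, `kfd1`), and the helper name `label_cmd` are placeholders chosen for this example, not values taken from the PR.

```shell
#!/bin/sh
# Sketch only: print the labeling command for one node instead of running it.
# Arguments: node name, zone value, custom role name (all hypothetical here).
label_cmd() {
  node="$1"; zone="$2"; role="$3"
  # The trailing "=" sets the node role label with an empty value.
  printf 'oc label node %s topology.kubernetes.io/zone=%s node-role.kubernetes.io/%s=\n' \
    "$node" "$zone" "$role"
}

# Two nodes per failure domain; each pair shares one custom role label.
label_cmd node-a kfd0 worker-0
label_cmd node-b kfd0 worker-0
label_cmd node-c kfd1 worker-1
label_cmd node-d kfd1 worker-1
```

Printing the commands first makes it possible to review the node-to-zone and node-to-MCP mapping before any label is applied.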

alosadagrande · 2026-05-15T07:47:23Z

+
+* *nodeSelector*: Define a label to identify the nodes that belong to this pool (for example, `node-role.kubernetes.io/worker-0`).
+
+. Apply the `topology.kubernetes.io/zone` label to identify the KFD for the Kubernetes scheduler, and the custom node role label (for example, `worker-0`) to assign the node to the MCP by running the following command:


I would recommend including the node role label on each node during installation so you avoid any disruption on the node once the cluster is installed. The label, then, will already be included when the cluster is ready.

It can also be done later on an already-installed cluster, but a warning should be included that moving a node from one MCP to another can cause disruption.

Updated the IMPORTANT NOTE to include this new info:

"Apply the topology.kubernetes.io/zone label and custom node role labels during cluster installation or node scaling whenever possible. Applying these labels before scheduling workloads ensures that the Kubernetes scheduler can distribute application replicas correctly across failure domains.

Applying or changing node role labels after installation can move nodes between MCPs, which might temporarily disrupt workloads on those nodes. If workloads are already running when you apply or modify these labels, you might need to reschedule workloads to achieve the intended HA distribution."

openshift-ci · 2026-05-17T20:53:50Z

@sr1kar99: all tests passed!


imiller0 · 2026-05-18T22:50:36Z

+worker-0.topology.kubernetes.io/zone=kfd0   Ready    worker,worker-0        27h   v1.31.13  kfd0
+worker-1.topology.kubernetes.io/zone=kfd1   Ready    worker,worker-1        27h   v1.31.13  kfd0
+worker-2.topology.kubernetes.io/zone=kfd2   Ready    worker,worker-2        27h   v1.31.13  kfd2
+worker-3.topology.kubernetes.io/zone=kfd3   Ready    worker,worker-3        27h   v1.31.13  kfd3


As currently shown, this example has one worker in each MCP and thus one worker in each zone. I think it would be more instructive to show two workers per MCP and zone, even if we use only two MCPs to keep the example short.
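A quick way to see how many nodes land in each zone is to tally the zone column of the node listing. The sketch below assumes input in the column order NAME, STATUS, ROLES, AGE, VERSION, ZONE, as produced by `oc get nodes --no-headers -L topology.kubernetes.io/zone`. The helper name `count_per_zone` and the sample node lines are hypothetical.

```shell
#!/bin/sh
# Sketch only: read node lines on stdin and print "<zone> <node count>" per zone.
# Nodes without a zone label have an empty sixth column and are reported
# under "<unlabeled>".
count_per_zone() {
  awk '{ zone = ($6 == "" ? "<unlabeled>" : $6); n[zone]++ }
       END { for (z in n) print z, n[z] }' | sort
}

# Hypothetical sample: two MCPs and zones with two workers each.
count_per_zone <<'EOF'
node-a   Ready   worker,worker-0   27h   v1.31.13   kfd0
node-b   Ready   worker,worker-0   27h   v1.31.13   kfd0
node-c   Ready   worker,worker-1   27h   v1.31.13   kfd1
node-d   Ready   worker,worker-1   27h   v1.31.13   kfd1
EOF

# Against a live cluster the same helper could be fed from:
#   oc get nodes --no-headers -L topology.kubernetes.io/zone | count_per_zone
```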

imiller0 · 2026-05-18T22:51:24Z

+worker-3.topology.kubernetes.io/zone=kfd3   Ready    worker,worker-3        27h   v1.31.13  kfd3
+----
+
+. Schedule workloads on the cluster only after you verify that nodes are labeled correctly and distributed across Kubernetes failure domains.


This step should advise that the consequence of scheduling workloads before the nodes have zone labels is that the scheduler won't distribute them across zones.

imiller0 · 2026-05-18T22:52:25Z

+      - key: machineconfiguration.openshift.io/role
+        operator: In
+        values: [ worker, worker-0 ]
+  paused: true


The normal state of `paused` is `false`. Set it to `true` only when the upgrade process is starting, before you start the control plane upgrade.
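As a sketch of that workflow, the helper below prints the `oc patch` command that pauses or unpauses one pool. It does not run the command. The helper name `pause_cmd` and the pool name `worker-0` are illustrative assumptions.

```shell
#!/bin/sh
# Sketch only: print the command that sets spec.paused on one MCP.
# Arguments: pool name, desired state ("true" or "false").
pause_cmd() {
  pool="$1"; state="$2"
  case "$state" in
    true|false) ;;
    *) echo "state must be true or false" >&2; return 1 ;;
  esac
  printf "oc patch mcp/%s --type merge --patch '{\"spec\":{\"paused\":%s}}'\n" \
    "$pool" "$state"
}

# Pause the pool just before the control plane upgrade starts ...
pause_cmd worker-0 true
# ... and unpause it when that pool is ready to update.
pause_cmd worker-0 false
```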

imiller0 · 2026-05-18T22:53:15Z

+
+. Schedule workloads on the cluster only after you verify that nodes are labeled correctly and distributed across Kubernetes failure domains.
+
+. Keep each custom MCP paused until you are ready to upgrade the cluster.


As noted above, the normal state for `paused` should be `false`. The wording here should be: pause the MCPs when you are ready to upgrade the cluster.

imiller0 · 2026-05-18T22:59:37Z

+Configure each custom MCP with `maxUnavailable: 100%` so that all nodes in that pool to update at the same time. This setting applies only to the nodes in the selected MCP, not to the entire cluster.
+
+Plan the number and size of custom MCPs based on your cluster topology, workload distribution, and application availability requirements, including Pod Disruption Budgets (PDBs).
+


I believe we need some additional context here. Things not covered are:

* Use of zones does not guarantee that the scheduler will spread replicas across zones. It is a soft constraint, but it is beneficial to have the scheduler take failure domains (and thus upgrade domains) into account.
* The ability to take an entire failure domain offline depends on several criteria:
  * The application/workload is designed to be highly available.
  * The application/workload has PodDisruptionBudgets protecting the HA replicas.
  * The application is able to meet minimum service level requirements while the failure domain is offline.
* The recommendation is to have sufficient spare capacity in the cluster to accommodate pods disrupted when a failure domain is taken offline.
* If application-level constraints require that less than a full failure domain be taken offline concurrently, the maxUnavailable setting can be reduced, for example to 50% of the failure domain, or to whatever capacity allows the application to meet service requirements.
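To make the maxUnavailable trade-off concrete, the sketch below prints a MachineConfigPool manifest for one custom pool with a configurable `maxUnavailable` value. The selector fields mirror the hunks quoted in this review. The helper name `mcp_manifest`, the pool name, and the choice of `paused: false` as the default (following the review comments above) are assumptions for illustration, not the PR's final content.

```shell
#!/bin/sh
# Sketch only: print a MachineConfigPool manifest for one custom pool.
# Arguments: pool name (for example worker-0), maxUnavailable value
# (for example 100% or 50%).
mcp_manifest() {
  pool="$1"; max_unavailable="$2"
  cat <<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: ${pool}
spec:
  machineConfigSelector:
    matchExpressions:
      - key: machineconfiguration.openshift.io/role
        operator: In
        values: [ worker, ${pool} ]
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/${pool}: ""
  maxUnavailable: ${max_unavailable}
  paused: false
EOF
}

# Full failure domain at once:
mcp_manifest worker-0 100%
# Reduced setting when the application cannot lose a whole failure domain:
#   mcp_manifest worker-0 50% | oc apply -f -
```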

openshift-ci Bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 14, 2026

ocpdocs-vale-bot reviewed May 14, 2026

Comment thread modules/configuring-custom-machine-config-pools-parallel-upgrades.adoc

ocpdocs-vale-bot reviewed May 14, 2026

Comment thread modules/configuring-custom-machine-config-pools-parallel-upgrades.adoc Outdated

alosadagrande reviewed May 15, 2026

TELCODOCS#2637: Adding custom MCP upgrade content

bb43ada

sr1kar99 force-pushed the 2637-adding-custom-mcp-upgrade branch from 93d93c5 to bb43ada Compare May 17, 2026 20:44

imiller0 reviewed May 18, 2026
