OCPNODE-4535: Automate OCP-44820 change container registry config#31182
OCPNODE-4535: Automate OCP-44820 change container registry config#31182Chandan9112 wants to merge 1 commit into
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: automatic mode |
|
@Chandan9112: This pull request references OCPNODE-4535 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Skipping CI for Draft Pull Request. |
|
@Chandan9112: GitHub didn't allow me to request PR reviews from the following users: openshift/sig-node. Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds imports for conflict-aware retries and an operator test helper, and a disruptive serial e2e that patches cluster ImageConfig (allowed/search registries) with RetryOnConflict, waits for MCP rollout, verifies node registry config, and restores the original ImageConfig. ChangesNode e2e registry config
sequenceDiagram
participant Test as ImageRegistryConfigTest
participant API as ImageConfig API
participant MCP as MachineConfigPool
participant Node as Worker node
Test->>API: GET current image.config.openshift.io/cluster
Test->>API: RetryOnConflict PATCH update registry sources
API->>MCP: apply updated spec
MCP->>Node: rollout MachineConfig
Test->>Node: check /etc/containers/registries.conf.d/01-image-searchRegistries.conf
Test->>API: RetryOnConflict PATCH restore original image.config
MCP->>Node: converge to restored spec
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 11 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (11 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: Chandan9112 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
ef40051 to
6e92bbc
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
test/extended/node/node_e2e/node.go (1)
357-360: 💤 Low valueEarly return on restore failure prevents MCP convergence wait.
If restoring the original
image.configfails, the cleanup returns immediately without waiting for MCP rollout or operator stabilization. This may leave the cluster in an intermediate state. However, since the restore itself failed, there's little value in waiting for MCP convergence of a potentially incomplete update.Consider adding a log message indicating that MCP waits are being skipped due to restore failure, to make the behavior more explicit.
📝 Proposed improvement for clarity
if restoreErr != nil { - e2e.Logf("WARNING: failed to restore original image.config: %v", restoreErr) + e2e.Logf("WARNING: failed to restore original image.config: %v; skipping MCP convergence wait", restoreErr) return }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/extended/node/node_e2e/node.go` around lines 357 - 360, The cleanup path currently returns immediately when restoreErr != nil in the block that attempts to restore the original image.config; update that block (the if restoreErr != nil branch) to log an explicit message that MCP rollout/operator stabilization waits are being skipped because the restore failed (mention restoreErr in the log for context) before returning so readers know why MCP waits are omitted; modify the code around the restore of image.config (the restoreErr check) to add this log entry referencing restoreErr and then return as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/extended/node/node_e2e/node.go`:
- Around line 373-389: The code assigns to imageConfig.Spec.RegistrySources
fields without nil-checking; inside the retry.RetryOnConflict block (where you
call oc.AdminConfigClient().ConfigV1().Images().Get and later Update), check if
imageConfig.Spec.RegistrySources is nil and if so initialize it (e.g., set
imageConfig.Spec.RegistrySources = &configv1.RegistrySources{} ) before setting
AllowedRegistries and ContainerRuntimeSearchRegistries to avoid a runtime panic
when RegistrySources is nil.
---
Nitpick comments:
In `@test/extended/node/node_e2e/node.go`:
- Around line 357-360: The cleanup path currently returns immediately when
restoreErr != nil in the block that attempts to restore the original
image.config; update that block (the if restoreErr != nil branch) to log an
explicit message that MCP rollout/operator stabilization waits are being skipped
because the restore failed (mention restoreErr in the log for context) before
returning so readers know why MCP waits are omitted; modify the code around the
restore of image.config (the restoreErr check) to add this log entry referencing
restoreErr and then return as before.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: f2d72291-5f8c-48e4-9b6b-2790534d5327
📒 Files selected for processing (1)
test/extended/node/node_e2e/node.go
| g.By("Update image.config to add search registry and allowed registries") | ||
| err = retry.RetryOnConflict(retry.DefaultBackoff, func() error { | ||
| imageConfig, getErr := oc.AdminConfigClient().ConfigV1().Images().Get(ctx, "cluster", metav1.GetOptions{}) | ||
| if getErr != nil { | ||
| return getErr | ||
| } | ||
| imageConfig.Spec.RegistrySources.AllowedRegistries = []string{ | ||
| "registry.access.redhat.com", "docker.io", "quay.io", searchRegistry, | ||
| "image-registry.openshift-image-registry.svc:5000", | ||
| } | ||
| imageConfig.Spec.RegistrySources.ContainerRuntimeSearchRegistries = []string{ | ||
| "registry.access.redhat.com", "docker.io", "quay.io", searchRegistry, | ||
| } | ||
| _, updateErr := oc.AdminConfigClient().ConfigV1().Images().Update(ctx, imageConfig, metav1.UpdateOptions{}) | ||
| return updateErr | ||
| }) | ||
| o.Expect(err).NotTo(o.HaveOccurred(), "failed to update image.config.openshift.io/cluster") |
There was a problem hiding this comment.
Guard against nil RegistrySources to prevent panic.
Lines 379 and 383 set fields on imageConfig.Spec.RegistrySources without checking if it's nil. In the OpenShift configv1 API, ImageSpec.RegistrySources is a pointer and may be nil on a fresh cluster or after certain operations. Accessing fields on a nil pointer will cause a runtime panic.
🛡️ Proposed fix to initialize RegistrySources if nil
g.By("Update image.config to add search registry and allowed registries")
err = retry.RetryOnConflict(retry.DefaultBackoff, func() error {
imageConfig, getErr := oc.AdminConfigClient().ConfigV1().Images().Get(ctx, "cluster", metav1.GetOptions{})
if getErr != nil {
return getErr
}
+ if imageConfig.Spec.RegistrySources == nil {
+ imageConfig.Spec.RegistrySources = &configv1.RegistrySources{}
+ }
imageConfig.Spec.RegistrySources.AllowedRegistries = []string{
"registry.access.redhat.com", "docker.io", "quay.io", searchRegistry,
"image-registry.openshift-image-registry.svc:5000",
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/extended/node/node_e2e/node.go` around lines 373 - 389, The code assigns
to imageConfig.Spec.RegistrySources fields without nil-checking; inside the
retry.RetryOnConflict block (where you call
oc.AdminConfigClient().ConfigV1().Images().Get and later Update), check if
imageConfig.Spec.RegistrySources is nil and if so initialize it (e.g., set
imageConfig.Spec.RegistrySources = &configv1.RegistrySources{} ) before setting
AllowedRegistries and ContainerRuntimeSearchRegistries to avoid a runtime panic
when RegistrySources is nil.
a59ed7a to
e7848ff
Compare
Adds e2e test that verifies updating image.config.openshift.io/cluster with containerRuntimeSearchRegistries and allowedRegistries triggers an MCO rollout and the change is reflected on worker nodes. Co-authored-by: Cursor <cursoragent@cursor.com>
e7848ff to
e37496d
Compare
|
Scheduling required tests: |
|
@Chandan9112: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary
image.config.openshift.io/clusterwithcontainerRuntimeSearchRegistriesandallowedRegistriestriggers an MCO rollout and the change is reflected in/etc/containers/registries.conf.d/01-image-searchRegistries.confon worker nodes.Here is the test case link: Polarian-44820
Test Results
It passed successfully while executing on a live OCP 4.22 cluster:
Summary by CodeRabbit