Skip to content

feat: scoped multi-namespace RBAC mode (Role per namespace, no ClusterRole)#1162

Open
michal-marszalek-h2oai wants to merge 2 commits into
stakater:masterfrom
michal-marszalek-h2oai:951-namespace-scope-rbac
Open

feat: scoped multi-namespace RBAC mode (Role per namespace, no ClusterRole)#1162
michal-marszalek-h2oai wants to merge 2 commits into
stakater:masterfrom
michal-marszalek-h2oai:951-namespace-scope-rbac

Conversation

@michal-marszalek-h2oai

Copy link
Copy Markdown

What

Adds a scoped multi-namespace mode: a third RBAC posture between the existing watchGlobally: true (cluster-wide, needs a ClusterRole) and single-namespace mode. You give Reloader an explicit list of namespaces, and it watches exactly those — with a namespace-scoped Role + RoleBinding created in each, no ClusterRole, from a single install.

Why

Today watching N namespaces means either granting cluster-wide RBAC (watchGlobally: true) or running N separate Reloader installs. Many clusters (multi-tenant / restricted RBAC) forbid ClusterRole grants, and one-install-per-namespace is wasteful. This closes that gap: one install, scoped permissions, no cluster-scoped objects.

How

Helm

  • New reloader.namespaces value, active when watchGlobally: false. Accepts either a YAML list (["team-a","team-b"]) or a comma-separated string ("team-a,team-b") for consistency with ignoreNamespaces / namespaceSelector.
  • The release namespace is always auto-included (needed for the meta-info ConfigMap, HA leases, and events).
  • role.yaml / rolebinding.yaml range over the deduped namespace set, emitting one Role + RoleBinding per namespace. The rule set is factored into a shared reloader-namespaced-rules helper to avoid duplication. The RoleBinding subject is the ServiceAccount in the release namespace (standard cross-namespace binding).
  • deployment.yaml passes --namespaces=<csv> and omits KUBERNETES_NAMESPACE in scoped mode.
  • A fail guard rejects the contradictory watchGlobally: true + namespaces combination.

Binary

  • New --namespaces flag (StringSlice, consistent with the other namespace flags).
  • resolveWatchNamespaces() picks the watched set with precedence: --namespacesKUBERNETES_NAMESPACE → all namespaces.
  • Controller creation loops over the watched namespaces (each informer is already namespace-scoped, so no informer/handler changes were needed).
  • --namespaces-to-ignore is now only honored in global mode (watchGlobally: true); in single-namespace and scoped modes the watched set is already explicit, mirroring how --namespace-selector is already gated.

Backward compatibility

Fully backward compatible. With reloader.namespaces empty (the default), rendered output is identical to today for both watchGlobally: true and watchGlobally: false (verified by diffing rendered manifests against the previous templates — the only delta is one cosmetic blank line removed by a cleaner apiVersion block).

Testing

  • make build, go vet ./..., gofmt — clean.
  • Unit test for resolveWatchNamespaces (all precedence cases).
  • New scoped-namespaces e2e case in test/e2e/flags/watch_globally_test.go: reloads in a listed namespace and in the auto-included release namespace; does not reload in an unlisted namespace.
  • helm template / helm lint verified across modes: scoped mode produces a Role+RoleBinding per namespace, zero ClusterRole, correct --namespaces arg; list and string inputs render identically; the fail guard fires on the bad combo.

Example:

helm upgrade --install reloader stakater/reloader \
  --namespace reloader-system \
  --set reloader.watchGlobally=false \
  --set "reloader.namespaces={team-a,team-b}"
# Roles in reloader-system, team-a, team-b — no ClusterRole.

Add a third RBAC posture between watch-globally (ClusterRole) and single
namespace: give Reloader an explicit list of namespaces to watch. The chart
creates a namespace-scoped Role + RoleBinding in each listed namespace (no
ClusterRole), and one install covers them all.

Go:
- new --namespaces flag / options.Namespaces
- resolveWatchNamespaces() picks list -> KUBERNETES_NAMESPACE -> all
- controller creation loops over the watched namespaces
- namespaces-to-ignore is now only honored in global mode (watchGlobally=true);
  in single-namespace and scoped modes the watched set is already explicit

Helm:
- new reloader.namespaces value (active when watchGlobally=false); accepts either
  a YAML list or a comma-separated string for consistency with the sibling
  namespace options
- reloader-watchNamespaces helper (release ns always auto-included, deduped)
- shared reloader-namespaced-rules template reused per namespace
- role.yaml/rolebinding.yaml range over the list; deployment passes --namespaces
- --namespaces-to-ignore only rendered when watchGlobally=true
- fail guard for watchGlobally=true + namespaces set

Tests: unit test for resolveWatchNamespaces; scoped-namespaces e2e case.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…pe-rbac

# Conflicts:
#	internal/pkg/cmd/reloader.go
#	internal/pkg/cmd/reloader_test.go
@michal-marszalek-h2oai

michal-marszalek-h2oai commented Jul 2, 2026

Copy link
Copy Markdown
Author

Hi @msafwankarim @Felix-Stakater 👋

Could a maintainer please click "Approve and run workflows" so CI can execute? Happy to address any review feedback once the checks are green. Thanks! 🙏

@msafwankarim

Copy link
Copy Markdown
Contributor

/loadtest

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown

Load Test Results (Full A/B Comparison)

Comparing: master951-namespace-scope-rbac
Triggered by: @msafwankarim


Artifacts: Download

@michal-marszalek-h2oai

Copy link
Copy Markdown
Author

👋 Heads-up on the failing E2E tests check in the latest run — I dug into it and I'm confident it's a pre-existing flaky test, not a regression from this PR.

The failure is the CSI spec test/e2e/advanced/multi_container_test.go:216 ("should reload with auto annotation when init container CSI volume changes"). A couple of points:

  • This PR only touches Helm/RBAC, cmd, flags, and util — the reload engine (internal/pkg/handler, pkg/common, internal/pkg/callbacks, internal/pkg/controller) is unchanged from master.
  • The Reloader logs from the failed run actually show the deployment was reloaded (deploy-jrmnq from spc-zsers, type SECRETPROVIDERCLASS, at 06:57:31 — exactly when the test began waiting). So the operator did its job; the test just didn't observe it.
  • Root cause is a TOCTOU race in the test helper: WaitReloaded reads its "before" annotation value after the blocking WaitForSPCPSVersionChange step. Since Reloader watches the same SecretProviderClassPodStatus, it can finish the reload before the baseline is captured, so the watch then waits forever for a second change. The sibling Job CSI test avoids this by capturing its baseline before the trigger.

Happy to open a separate PR to de-flake the CSI deployment specs (capture the reload baseline before triggering the change). A re-run should get this green in the meantime.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants