Add Gateway API support for SolrCloud external addressability#815

Open
chinmoysahu wants to merge 11 commits into apache:main from wwgrainger:gtw-ingress-support

@chinmoysahu

Summary

This PR adds support for the Kubernetes Gateway API as a new external addressability method for SolrCloud instances. Gateway API is the successor to the Ingress API and provides a more flexible, vendor-neutral way to manage ingress traffic in Kubernetes.

Features

Gateway API Integration

  • New addressability method: spec.solrAddressability.external.method: Gateway
  • Automatic HTTPRoute management for common and per-node services
  • Cross-namespace Gateway references with optional listener targeting via sectionName
  • Custom labels and annotations for HTTPRoute resources

BackendTLSPolicy Support

  • Automatic TLS policy creation for secure backend connections when spec.solrTLS is enabled
  • Flexible CA configuration: CA certificate references (ConfigMap/Secret) or well-known CAs
  • Per-service policies for common and individual node services

API Changes

New Types (api/v1beta1/solrcloud_types.go):

  • SolrGatewayOptions, GatewayParentReference, SolrBackendTLSPolicy, GatewayCertificateReference

New Utility Functions (controllers/util/):

  • gateway_util.go: HTTPRoute generation and management
  • gateway_util_backendtls.go: BackendTLSPolicy generation and management

RBAC: Added permissions for httproutes and backendtlspolicies in gateway.networking.k8s.io API group

Documentation

  • docs/solr-cloud/gateway-api.md: Comprehensive usage guide with configuration examples, BackendTLSPolicy setup, and Gateway implementation support matrix (Envoy Gateway, kgateway, NGINX Gateway Fabric, etc.)
  • docs/solr-cloud/README.md: Added Gateway API reference

Dependency Updates

Gateway API v1.4.0+ is required to use the stable v1 API for BackendTLSPolicy (GA). This upgrade forced Go 1.24.0+ (required by Gateway API v1.4.0), which cascaded to Kubernetes libraries (v0.34.1) and controller-runtime (v0.22.1).

CRD Changes: Extensive changes in config/crd/bases/*.yaml include new Gateway API fields plus upstream schema updates from Kubernetes library upgrades (deprecation notices, field descriptions, etc.). These are auto-generated by controller-gen.

References:

Example Configuration

apiVersion: solr.apache.org/v1beta1
kind: SolrCloud
metadata:
  name: example
  namespace: solr-ns
spec:
  replicas: 3
  solrImage:
    tag: "9.7.0"
  solrTLS:
    pkcs12Secret:
      name: solr-tls-cert
      key: keystore.p12
  solrAddressability:
    external:
      method: Gateway
      domainName: solr.example.com
      useExternalAddress: true
      gateway:
        parentRefs:
        - name: my-gateway
          namespace: gateway-ns
          sectionName: https
        backendTLSPolicy:
          caCertificateRefs:
          - name: solr-ca-cert
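
A rough sketch of how the gateway block above might map onto the new Go types. The type names come from the PR's list of new types; the field names and JSON tags are inferred from the YAML keys in the example and are assumptions, not the definitions in api/v1beta1/solrcloud_types.go. The helper exampleGatewayOptions is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Assumed shapes, inferred from the YAML keys in the example configuration.
// The real types in api/v1beta1/solrcloud_types.go may carry more fields.
type GatewayParentReference struct {
	Name        string `json:"name"`
	Namespace   string `json:"namespace,omitempty"`
	SectionName string `json:"sectionName,omitempty"`
}

type GatewayCertificateReference struct {
	Name string `json:"name"`
}

type SolrBackendTLSPolicy struct {
	CACertificateRefs []GatewayCertificateReference `json:"caCertificateRefs,omitempty"`
}

type SolrGatewayOptions struct {
	ParentRefs       []GatewayParentReference `json:"parentRefs"`
	BackendTLSPolicy *SolrBackendTLSPolicy    `json:"backendTLSPolicy,omitempty"`
}

// exampleGatewayOptions (hypothetical helper) mirrors the gateway block
// of the example SolrCloud resource.
func exampleGatewayOptions() SolrGatewayOptions {
	return SolrGatewayOptions{
		ParentRefs: []GatewayParentReference{{
			Name:        "my-gateway",
			Namespace:   "gateway-ns",
			SectionName: "https",
		}},
		BackendTLSPolicy: &SolrBackendTLSPolicy{
			CACertificateRefs: []GatewayCertificateReference{{Name: "solr-ca-cert"}},
		},
	}
}

func main() {
	// Print the options as JSON to show the serialized field names.
	out, err := json.MarshalIndent(exampleGatewayOptions(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```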

Testing

E2E Tests (tests/e2e/solrcloud_gateway_test.go):

  • HTTPRoute and BackendTLSPolicy lifecycle management
  • CA certificate configuration switching
  • Resource cleanup and orphan handling

Manual Testing:

  • ✅ Tested with kgateway on Kubernetes 1.32
  • ✅ Verified with both NGINX Ingress and Gateway modes to ensure backward compatibility
  • ✅ Verified cross-namespace Gateway references
  • ✅ Confirmed TLS backend connections with BackendTLSPolicy

Compatibility

  • Gateway API: v1.4.0+ required (BackendTLSPolicy GA support)
  • Kubernetes: 1.23+ (Gateway API CRDs must be installed)
  • Backward compatible: Existing Ingress and other addressability methods unchanged
  • Breaking changes: None

Migration Path

  1. Install Gateway API CRDs (v1.4.0+)
  2. Deploy a Gateway resource
  3. Update SolrCloud spec to use method: Gateway
  4. Operator automatically creates HTTPRoute resources

@janhoy
Contributor

janhoy commented Jan 26, 2026

Thanks for a thorough contribution, with docs and tests. I tagged Houston and Copilot for review as I'm not fluent in Go. But I intend to test the feature in a customer environment at some point.

Should we perhaps have a way to publish a -nightly version of the operator for early testing?

Copilot AI left a comment

Pull request overview

This PR adds comprehensive support for the Kubernetes Gateway API as a new external addressability method for SolrCloud instances. Gateway API is positioned as the successor to the Ingress API, providing a more flexible and vendor-neutral approach to managing ingress traffic in Kubernetes.

Changes:

  • Added Gateway API integration with automatic HTTPRoute and BackendTLSPolicy resource management
  • Upgraded to Go 1.24+ and Kubernetes libraries v0.34.1 to support Gateway API v1.4.0+ (required for stable BackendTLSPolicy)
  • Added comprehensive E2E tests for Gateway functionality including resource lifecycle and TLS policy management
  • Added detailed documentation covering configuration, BackendTLSPolicy setup, and Gateway implementation compatibility

Reviewed changes

Copilot reviewed 14 out of 18 changed files in this pull request and generated 10 comments.

Show a summary per file
| File | Description |
| --- | --- |
| tests/e2e/solrcloud_gateway_test.go | New E2E test suite for Gateway API functionality including HTTPRoute and BackendTLSPolicy lifecycle |
| tests/e2e/resource_utils_test.go | Helper functions for HTTPRoute resource assertions in tests |
| tests/e2e/resource_utils_backendtls_test.go | Helper functions for BackendTLSPolicy resource assertions in tests |
| main.go | Registers Gateway API v1 types with the operator's scheme |
| controllers/solrcloud_controller.go | Core reconciliation logic for HTTPRoute and BackendTLSPolicy resources |
| controllers/util/gateway_util.go | Utility functions for generating and managing HTTPRoute resources |
| controllers/util/gateway_util_backendtls.go | Utility functions for generating and managing BackendTLSPolicy resources |
| api/v1beta1/solrcloud_types.go | New Gateway API types and helper methods for SolrCloud resources |
| api/v1beta1/zz_generated.deepcopy.go | Auto-generated deep copy methods for new Gateway types |
| helm/solr-operator/templates/role.yaml | RBAC permissions for httproutes and backendtlspolicies |
| config/rbac/role.yaml | RBAC permissions for httproutes and backendtlspolicies |
| docs/solr-cloud/gateway-api.md | Comprehensive documentation for Gateway API usage and configuration |
| docs/solr-cloud/README.md | Updated table of contents with Gateway API reference |
| go.mod / go.sum | Dependency updates for Gateway API v1.4.0+ and associated upgrades |
| config/crd/bases/*.yaml | Auto-generated CRD updates from Kubernetes library upgrades |

Comment thread tests/e2e/solrcloud_gateway_test.go Outdated
Comment thread tests/e2e/solrcloud_gateway_test.go Outdated
Comment thread tests/e2e/solrcloud_gateway_test.go Outdated
Comment thread tests/e2e/solrcloud_gateway_test.go Outdated
Comment thread controllers/solrcloud_controller.go
Comment thread controllers/solrcloud_controller.go
Comment thread tests/e2e/solrcloud_gateway_test.go Outdated
Comment thread controllers/util/gateway_util_backendtls.go Outdated
Comment thread api/v1beta1/solrcloud_types.go
Comment thread tests/e2e/solrcloud_gateway_test.go Outdated
@chinmoysahu
Author

chinmoysahu commented Jan 26, 2026

Thanks @janhoy for the feedback and for enabling Copilot review. I have addressed all the comments and also fixed the "Build & Check" workflow failure by adding the missing import.
Happy to address any further improvement suggestions or fixes.

Also "-nightly version" publish of the operator for early testing would be great!

@janhoy
Contributor

janhoy commented Jan 26, 2026

Thanks. You may resolve the conversation threads that are dealt with.

We’re a bit short on cycles for attending to the operator these days, if you continue contributing you may end up being nominated for committership.

I suppose that with the Solr 10 release imminent, and with this great new feature along with other fixes, we should consider a new operator release, which also skips support for Solr 8.

@HoustonPutman
Contributor

Yeah I'll try to take a look by the end of the week.

And yeah less operator cycles nowadays unfortunately. But there will be a good amount of work required for Solr 10 I think. So that should make a push for improvements in the short-term!

@janhoy janhoy added this to the v1.0.0 milestone Mar 6, 2026
@janhoy janhoy added the networking Related to Services or Ingresses label Mar 6, 2026
@janhoy janhoy requested a review from thelabdude March 6, 2026 10:47
@janhoy
Contributor

janhoy commented Mar 6, 2026

@chinmoysahu There are some linting and test failures that can be fixed while waiting for additional review.

@janhoy
Contributor

janhoy commented Jun 1, 2026

Can this be salvaged and targeted for merge before the 1.0 release? If necessary, we could label it as beta in the first release.

This option is only available when Method=Gateway.
The referenced Gateway must already exist and be managed by your platform team.
The Solr Operator only manages the HTTPRoute resources.
properties:
Contributor

Why is hostnames missing from the HTTPRouteSpec? It is useful for filtering listeners of the Gateway.

Suggestion:

Suggested change
properties:
properties:
  hostnames:
    description: |-
      Hostnames defines a set of hostnames that should match against the HTTP Host
      header to select a HTTPRoute used to process the request.
    type: array
    maxItems: 16
    items:
      description: Hostname is the fully qualified domain name of a network host.
        Hostname can be “precise” which is a domain name without the terminating dot of a network host (e.g. “foo.example.com”) or “wildcard”, which is a domain name prefixed with a single wildcard label (e.g. *.example.com).
      type: string
      minLength: 1
      maxLength: 253
      pattern: '^(\*\.)?[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$'
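
To see what the suggested schema would accept, here is a small Go sketch that applies the same length limits, item count, and pattern to candidate hostnames. It only mirrors the constraints quoted in the suggestion; the function names validHostname and validHostnameList are hypothetical, and in a real cluster the API server enforces the CRD schema itself.

```go
package main

import (
	"fmt"
	"regexp"
)

// Pattern copied from the suggested CRD schema.
var hostnamePattern = regexp.MustCompile(
	`^(\*\.)?[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$`)

// validHostname applies minLength 1, maxLength 253, and the pattern.
func validHostname(h string) bool {
	return len(h) >= 1 && len(h) <= 253 && hostnamePattern.MatchString(h)
}

// validHostnameList applies maxItems 16 plus the per-item checks.
func validHostnameList(hosts []string) bool {
	if len(hosts) > 16 {
		return false
	}
	for _, h := range hosts {
		if !validHostname(h) {
			return false
		}
	}
	return true
}

func main() {
	for _, h := range []string{"foo.example.com", "*.example.com", "Foo.example.com", "foo.example.com."} {
		fmt.Printf("%-20s valid=%v\n", h, validHostname(h))
	}
}
```

Under this pattern, uppercase letters and a trailing dot are rejected, and a wildcard is accepted only as the leading label.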

Contributor

Now I see that routes are created for the common service and per node. So this could be additionalCommonHostnames, used as aliases for the common service. Or, if set, these hostnames are the only ones used for the HTTPRoute, and if unset, the names are generated with a pattern as they are now.

Author

@DanielRaapDev

The hostnames field on the HTTPRoute is actually already being populated, auto-generated from spec.solrAddressability.external.domainName and additionalDomainNames (GenerateCommonHTTPRoute in gateway_util.go). This produces a hostname pattern like {prefix}-{cloud}-solrcloud.{domain}, which is also used for status.externalCommonAddress, the BackendTLSPolicy fqdn, and what Solr uses as its external URL.

I've added an additionalHostnames field to SolrGatewayOptions that would allow users to append extra alias hostnames to the common HTTPRoute so that we cover the use case of routing additional alias names to the same service.

I did not make this an override because the auto-generated hostnames are tied to status.externalCommonAddress and BackendTLSPolicy and so replacing them would create a mismatch between what Solr thinks its address is and what the Gateway actually routes to. Hope it made sense. Thx!
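
The additive behaviour described here can be sketched roughly as follows. This is not the code from gateway_util.go: commonHostname and commonRouteHostnames are hypothetical names, the prefix value used in main is an assumption, and the de-duplication step is an illustrative choice rather than something the PR states.

```go
package main

import "fmt"

// commonHostname builds one auto-generated hostname following the
// {prefix}-{cloud}-solrcloud.{domain} pattern from the comment above.
func commonHostname(prefix, cloud, domain string) string {
	return fmt.Sprintf("%s-%s-solrcloud.%s", prefix, cloud, domain)
}

// commonRouteHostnames lists the generated hostnames first (one per domain),
// then appends the alias hostnames. Aliases never replace generated names,
// so the first entry stays consistent with the external common address.
func commonRouteHostnames(prefix, cloud string, domains, additionalHostnames []string) []string {
	seen := map[string]bool{}
	var out []string
	add := func(h string) {
		if h != "" && !seen[h] {
			seen[h] = true
			out = append(out, h)
		}
	}
	for _, d := range domains {
		add(commonHostname(prefix, cloud, d))
	}
	for _, h := range additionalHostnames {
		add(h)
	}
	return out
}

func main() {
	// "solr-ns" as the prefix is an assumption made for this example.
	hosts := commonRouteHostnames("solr-ns", "example",
		[]string{"solr.example.com"},
		[]string{"search.example.com"})
	fmt.Println(hosts)
}
```

Keeping the generated names first and only appending aliases matches the reasoning above: whatever is added, the hostname Solr advertises is still routed.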

Contributor

Thx for the quick response. You definitely got a better understanding of Solr addressability than I have ;)

@janhoy
Contributor

janhoy commented Jun 3, 2026

@chinmoysahu I tried to bring the branch up to date with main and push, but you have not allowed maintainers to push to your fork. Can you please either allow push or update the branch yourself? Hoping to get the last round of reviews and land this.

Implements Kubernetes Gateway API as a new external addressability method,
enabling HTTPRoute-based routing for SolrCloud services.

- Add Gateway API types and controller logic
- Generate HTTPRoutes for common and per-node services
- Add RBAC for HTTPRoute and BackendTLSPolicy
- Update CRDs, Helm charts, and documentation
- Add E2E tests following existing Ingress test patterns
- Add comprehensive cleanup logic for BackendTLSPolicy resources:
  * Delete common policy when hideCommon=true
  * Delete node policies when hideNodes=true
  * Delete all policies when BackendTLSPolicy config removed
  * Delete all policies when method changes from Gateway
- Remove focused test markers (FIt, FContext, FDescribe)
- Improve variable naming: hostname -> fqdn for clarity
- Enhance SolrBackendTLSPolicy documentation with validation constraints
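
The four BackendTLSPolicy cleanup cases listed in this commit message reduce to a small decision function. The sketch below is illustrative only: gatewayState and policiesToDelete are hypothetical names, and the controller's actual reconciliation code may be organised differently.

```go
package main

import "fmt"

// gatewayState captures the inputs named in the commit message.
type gatewayState struct {
	MethodIsGateway  bool // external method is Gateway
	PolicyConfigured bool // a BackendTLSPolicy config is present
	HideCommon       bool
	HideNodes        bool
}

// policiesToDelete reports which groups of BackendTLSPolicy resources
// should be removed for the given state.
func policiesToDelete(s gatewayState) (deleteCommon, deleteNodes bool) {
	// Method changed away from Gateway, or policy config removed:
	// delete all policies.
	if !s.MethodIsGateway || !s.PolicyConfigured {
		return true, true
	}
	// Otherwise delete only what is hidden.
	return s.HideCommon, s.HideNodes
}

func main() {
	states := []gatewayState{
		{MethodIsGateway: true, PolicyConfigured: true},
		{MethodIsGateway: true, PolicyConfigured: true, HideCommon: true},
		{MethodIsGateway: true, PolicyConfigured: false},
		{MethodIsGateway: false, PolicyConfigured: true},
	}
	for _, s := range states {
		c, n := policiesToDelete(s)
		fmt.Printf("%+v -> deleteCommon=%v deleteNodes=%v\n", s, c, n)
	}
}
```
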
@chinmoysahu chinmoysahu force-pushed the gtw-ingress-support branch from 3527dc1 to 9f543b3 Compare June 3, 2026 20:04
@chinmoysahu
Author

@janhoy Since this fork is under my organization, I wasn't able to enable maintainer edits as I could on a personal fork. I have updated the branch from main and pushed the changes to this PR.
I will also respond to the other comments shortly. Thanks for the follow-up.

Allows users to append extra alias hostnames to the common HTTPRoute
beyond the auto-generated ones from domainName/additionalDomainNames.
This is additive (not an override) to maintain consistency with
status.externalCommonAddress and BackendTLSPolicy references.
@janhoy
Contributor

janhoy commented Jun 3, 2026

Yea that’s a limitation for orgs. Thanks for following up. I’ll trigger another copilot review as well.

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 18 changed files in this pull request and generated 8 comments.

Files not reviewed (1)
  • api/v1beta1/zz_generated.deepcopy.go: Language not supported

Comment thread controllers/solrcloud_controller.go
Comment thread controllers/solrcloud_controller.go
Comment thread controllers/solrcloud_controller.go
Comment thread controllers/util/gateway_util_backendtls.go
Comment thread controllers/util/gateway_util_backendtls.go
Comment thread docs/solr-cloud/gateway-api.md Outdated
Comment thread docs/solr-cloud/gateway-api.md Outdated
Comment thread docs/solr-cloud/gateway-api.md Outdated
…BackendTLSPolicy validation, doc fixes

- Make Gateway API HTTPRoute watch conditional (--gateway-api flag) to avoid
  envtest failures when Gateway API CRDs are not installed
- Fix hideNodes cleanup: delete all node HTTPRoutes when hideNodes=true,
  not just orphaned ones from scale-down
- Add validation: BackendTLSPolicy requires spec.solrTLS to be configured
- Fix license header typo in gateway-api.md
- Fix BackendTLSPolicy version reference (v1alpha3 -> v1)
- Clarify headless service documentation
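
The validation rule added in this commit (BackendTLSPolicy requires spec.solrTLS) might look something like the sketch below. The struct, its boolean fields, and the error text are all hypothetical stand-ins for the real SolrCloud spec and whatever message the operator actually returns.

```go
package main

import (
	"errors"
	"fmt"
)

// solrCloudSpec is a reduced, hypothetical view of the SolrCloud spec that
// only records whether the two relevant blocks are set.
type solrCloudSpec struct {
	SolrTLSSet          bool // spec.solrTLS is configured
	BackendTLSPolicySet bool // the gateway backendTLSPolicy block is configured
}

// Hypothetical error value; the operator's real message may differ.
var errBackendTLSRequiresSolrTLS = errors.New(
	"backendTLSPolicy requires spec.solrTLS to be configured")

// validateBackendTLS rejects a backend TLS policy on a cloud that does not
// serve TLS, since the Gateway would have no TLS backend to validate.
func validateBackendTLS(spec solrCloudSpec) error {
	if spec.BackendTLSPolicySet && !spec.SolrTLSSet {
		return errBackendTLSRequiresSolrTLS
	}
	return nil
}

func main() {
	specs := []solrCloudSpec{
		{SolrTLSSet: true, BackendTLSPolicySet: true},
		{SolrTLSSet: false, BackendTLSPolicySet: true},
		{SolrTLSSet: false, BackendTLSPolicySet: false},
	}
	for _, s := range specs {
		fmt.Printf("%+v -> %v\n", s, validateBackendTLS(s))
	}
}
```
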
@chinmoysahu
Author

Addressed Copilot comments and also the Lint/Smoke test issues - Lint and Smoke tests pass locally.

Labels

networking Related to Services or Ingresses

5 participants