Resolve virtual-chunk S3 endpoints from container config #27

Merged: Shane98c merged 4 commits into main from fix/s3-regional-redirect on Jun 24, 2026

@Shane98c Shane98c commented Jun 24, 2026

Collaborator

Virtual chunks in dotted-name S3 buckets must use path-style addressing, which icechunk-js sent to the global s3.amazonaws.com endpoint. For buckets outside us-east-1, that endpoint returns a 301 with no Location header, which is thrown as an error in Node and is CORS-opaque in the browser, so every such read failed.

The fix resolves S3 endpoints from the repo's virtual-chunk-container config, mirroring Rust's VirtualChunkContainer / S3Options. The parser keeps each container's store config (region, endpoint_url, force_path_style) and matches absolute locations by longest url_prefix (boundary-aware, so s3://bucket can't claim s3://bucket2). The URL is then built from that config: a known region hits the regional endpoint directly (partition-aware — cn-* -> amazonaws.com.cn), endpoint_url routes S3-compatible/Tigris stores, dotted names use path-style. Icechunk-written repos record the region, so dotted-bucket reads are zero-config and work in the browser (the regional endpoint serves CORS).
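The matching and URL-building rules above can be sketched roughly as follows. This is an illustration only, not the code merged in this PR: the `Container` and `S3StoreConfig` shapes, the camelCase field names, and the helpers `prefixMatches`, `findContainer`, and `buildS3Url` are all hypothetical. It also assumes that a custom endpoint is always addressed path-style and that keys need no percent-encoding.

```typescript
// Hypothetical shapes: field names follow the PR description, not the real icechunk-js types.
interface S3StoreConfig {
  region?: string;
  endpointUrl?: string;
  forcePathStyle?: boolean;
}

interface Container {
  urlPrefix: string;
  store: S3StoreConfig;
}

// Boundary-aware prefix test: "s3://bucket" matches "s3://bucket/key"
// but must not claim "s3://bucket2/key".
function prefixMatches(prefix: string, location: string): boolean {
  if (!location.startsWith(prefix)) return false;
  return (
    prefix.endsWith("/") ||
    location.length === prefix.length ||
    location[prefix.length] === "/"
  );
}

// Pick the container with the longest matching url_prefix.
function findContainer(containers: Container[], location: string): Container | undefined {
  let best: Container | undefined;
  for (const c of containers) {
    if (prefixMatches(c.urlPrefix, location) && (!best || c.urlPrefix.length > best.urlPrefix.length)) {
      best = c;
    }
  }
  return best;
}

// Partition-aware host suffix: cn-* regions live under amazonaws.com.cn.
function s3HostSuffix(region: string): string {
  return region.startsWith("cn-") ? "amazonaws.com.cn" : "amazonaws.com";
}

// Build an HTTPS URL for an s3://bucket/key location from the container's store config.
// Simplification (assumption): keys are used verbatim, without percent-encoding.
function buildS3Url(location: string, store: S3StoreConfig): string {
  const rest = location.slice("s3://".length);
  const slash = rest.indexOf("/");
  const bucket = slash === -1 ? rest : rest.slice(0, slash);
  const key = slash === -1 ? "" : rest.slice(slash + 1);

  // Assumption: a custom endpoint (S3-compatible store) is always addressed path-style here.
  if (store.endpointUrl) {
    const base = store.endpointUrl.replace(/\/+$/, "");
    return `${base}/${bucket}/${key}`;
  }

  // Known region: go straight to the regional endpoint; otherwise fall back to the global one.
  const host = store.region
    ? `s3.${store.region}.${s3HostSuffix(store.region)}`
    : "s3.amazonaws.com";

  // Dotted bucket names break virtual-hosted TLS wildcards, so they use path-style.
  const pathStyle = store.forcePathStyle === true || bucket.includes(".");
  return pathStyle ? `https://${host}/${bucket}/${key}` : `https://${bucket}.${host}/${key}`;
}
```

With these assumptions, a dotted bucket with a recorded region resolves directly to the regional path-style URL, so no redirect is involved.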

When no region is known, dotted buckets fall back to the global endpoint and makeUrlStore resolves the 301 at fetch time via x-amz-bucket-region (Node only: concurrent reads use a per-request URL copy, and the discarded redirect body is cancelled).
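As a rough illustration of the fallback, the regional URL can be derived from the redirect response like this. The function name `regionalUrlFromRedirect` and the `HeadersLike` type are hypothetical, and the sketch assumes only the global `s3.amazonaws.com` host is rewritten and that the bucket is in the standard `amazonaws.com` partition. It returns a new string instead of mutating shared state, which is one way to give each concurrent read its own URL copy.

```typescript
// Minimal header accessor so the sketch does not depend on a particular fetch implementation.
interface HeadersLike {
  get(name: string): string | null;
}

// Returns the regional path-style URL for a redirect from the global endpoint,
// or null when the response is not an S3 regional redirect.
function regionalUrlFromRedirect(
  url: string,
  status: number,
  headers: HeadersLike,
): string | null {
  if (status !== 301 && status !== 307) return null;

  // S3 reports the bucket's actual region in this header even when Location is absent.
  const region = headers.get("x-amz-bucket-region");
  if (!region) return null;

  const parsed = new URL(url);
  // Assumption: only requests sent to the global endpoint are rewritten.
  if (parsed.hostname !== "s3.amazonaws.com") return null;

  parsed.hostname = `s3.${region}.amazonaws.com`;
  return parsed.toString();
}
```

A caller would retry the request against the returned URL and can remember it for later reads of the same store.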

Verified live against #25's us-west-2.opendata.source.coop store with no fetchClient or options, in Node and real headless Chrome: reads go straight to the regional endpoint (zero global hits, zero redirects) and return real data.

Breaking change: ReadSession.open's virtualChunkContainers option is now VirtualChunkContainer[] instead of Map<string, string>. It's normally populated internally by Repository / IcechunkStore; direct callers must switch to the array form.

Closes #25

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces automatic resolution and retrying of S3 regional redirects when using path-style addressing on the global endpoint (which is common for S3 buckets containing dots). It updates makeUrlStore to detect 301/307 redirects, extract the regional endpoint from the x-amz-bucket-region header, retry the request against the regional URL, and pin that regional URL for subsequent requests. The feedback suggests two improvements: using optional chaining when accessing response.headers.get to handle custom or poorly-mocked fetch clients, and canceling the discarded redirect response body to prevent potential socket or resource leaks.

Comment on lines +66 to +69

      function regionalS3RedirectUrl(url: string, response: Response): string | null {
        if (response.status !== 301 && response.status !== 307) return null;
        const region = response.headers.get("x-amz-bucket-region");
        if (!region) return null;

medium

To make the helper more robust against custom or poorly-mocked fetchClient implementations, it is safer to use optional chaining when accessing response.headers.get.

Suggested change

       function regionalS3RedirectUrl(url: string, response: Response): string | null {
         if (response.status !== 301 && response.status !== 307) return null;
      -  const region = response.headers.get("x-amz-bucket-region");
      +  const region = response.headers?.get?.("x-amz-bucket-region");
         if (!region) return null;

Comment on lines +119 to +127

      const regionalUrl = regionalS3RedirectUrl(requestUrl, response);
      if (regionalUrl) {
        requestUrl = regionalUrl;
        pinnedUrl = regionalUrl;
        response = await doFetch(requestUrl, {
          headers,
          signal: options?.signal,
        });
      }

medium

When a 301/307 redirect is received, the original response is discarded and overwritten with the regional fetch response. In some environments (like Node.js), discarding a response without consuming or canceling its body can lead to socket/resource leaks and prevent connection reuse (keep-alive).

To prevent this, we should cancel the response body of the redirect before performing the retry.

      const regionalUrl = regionalS3RedirectUrl(requestUrl, response);
      if (regionalUrl) {
        if (response.body && typeof response.body.cancel === "function") {
          response.body.cancel().catch(() => {});
        }
        requestUrl = regionalUrl;
        pinnedUrl = regionalUrl;
        response = await doFetch(requestUrl, {
          headers,
          signal: options?.signal,
        });
      }

@Shane98c Shane98c changed the title from "Resolve S3 regional redirect for dotted virtual-chunk buckets" to "Resolve virtual-chunk S3 endpoints from container config" on Jun 24, 2026
@Shane98c Shane98c merged commit 0ff04c0 into main Jun 24, 2026
12 checks passed
Linked issue: translateToHttpUrl uses the global S3 endpoint for dotted buckets → 301 (not followed) for non-us-east-1, breaking virtual chunk reads
