Resolve virtual-chunk S3 endpoints from container config#27
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces automatic resolution and retrying of S3 regional redirects when using path-style addressing on the global endpoint (which is common for S3 buckets containing dots). It updates makeUrlStore to detect 301/307 redirects, extract the regional endpoint from the x-amz-bucket-region header, retry the request against the regional URL, and pin that regional URL for subsequent requests. The feedback suggests two improvements: using optional chaining when accessing response.headers.get to handle custom or poorly-mocked fetch clients, and canceling the discarded redirect response body to prevent potential socket or resource leaks.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| function regionalS3RedirectUrl(url: string, response: Response): string | null { | ||
| if (response.status !== 301 && response.status !== 307) return null; | ||
| const region = response.headers.get("x-amz-bucket-region"); | ||
| if (!region) return null; |
There was a problem hiding this comment.
To make the helper more robust against custom or poorly-mocked fetchClient implementations, it is safer to use optional chaining when accessing response.headers.get.
| function regionalS3RedirectUrl(url: string, response: Response): string | null { | |
| if (response.status !== 301 && response.status !== 307) return null; | |
| const region = response.headers.get("x-amz-bucket-region"); | |
| if (!region) return null; | |
| function regionalS3RedirectUrl(url: string, response: Response): string | null { | |
| if (response.status !== 301 && response.status !== 307) return null; | |
| const region = response.headers?.get?.("x-amz-bucket-region"); | |
| if (!region) return null; |
| const regionalUrl = regionalS3RedirectUrl(requestUrl, response); | ||
| if (regionalUrl) { | ||
| requestUrl = regionalUrl; | ||
| pinnedUrl = regionalUrl; | ||
| response = await doFetch(requestUrl, { | ||
| headers, | ||
| signal: options?.signal, | ||
| }); | ||
| } |
There was a problem hiding this comment.
When a 301/307 redirect is received, the original response is discarded and overwritten with the regional fetch response. In some environments (like Node.js), discarding a response without consuming or canceling its body can lead to socket/resource leaks and prevent connection reuse (keep-alive).
To prevent this, we should cancel the response body of the redirect before performing the retry.
const regionalUrl = regionalS3RedirectUrl(requestUrl, response);
if (regionalUrl) {
if (response.body && typeof response.body.cancel === "function") {
response.body.cancel().catch(() => {});
}
requestUrl = regionalUrl;
pinnedUrl = regionalUrl;
response = await doFetch(requestUrl, {
headers,
signal: options?.signal,
});
}
Virtual chunks in dotted-name S3 buckets must use path-style addressing, which icechunk-js sent to the global
s3.amazonaws.comendpoint. For buckets outsideus-east-1that returns aLocation-less301thrown as an error in Node, CORS-opaque in the browser so every such read failed.The fix resolves S3 endpoints from the repo's virtual-chunk-container config, mirroring Rust's
VirtualChunkContainer/S3Options. The parser keeps each container'sstoreconfig (region,endpoint_url,force_path_style) and matches absolute locations by longesturl_prefix(boundary-aware, sos3://bucketcan't claims3://bucket2). The URL is then built from that config: a knownregionhits the regional endpoint directly (partition-aware —cn-*->amazonaws.com.cn),endpoint_urlroutes S3-compatible/Tigris stores, dotted names use path-style. Icechunk-written repos record the region, so dotted-bucket reads are zero-config and work in the browser (the regional endpoint serves CORS).When no region is known, dotted buckets fall back to the global endpoint and
makeUrlStoreresolves the301at fetch time viax-amz-bucket-region(Node only: concurrent reads use a per-request URL copy, and the discarded redirect body is cancelled).Verified live against #25's
us-west-2.opendata.source.coopstore with nofetchClientor options, in Node and real headless Chrome: reads go straight to the regional endpoint (zero global hits, zero redirects) and return real data.Breaking change:
ReadSession.open'svirtualChunkContainersoption is nowVirtualChunkContainer[]instead ofMap<string, string>. It's normally populated internally byRepository/IcechunkStore; direct callers must switch to the array form.Closes #25