Add Docker caching in deployments #4831
Merged
Host Test Results: 1 files, 1 suites, 1h 40m 28s ⏱️. Results for commit 3fcf018.
Realm Server Test Results: 1 files ±0, 1 suites ±0, 9m 31s ⏱️ (+2m 5s). Results for commit 3fcf018. ± Comparison against earlier commit ae708a3.
Reorder the dep-install steps so the `pnpm fetch` layer only invalidates when the lockfile changes, not on every source edit. Previously the lockfile COPY was followed immediately by `ADD . ./`, which meant any file change blew away the fetch and forced ~2-3 min of re-downloading into the pnpm store. With `cache-from/cache-to: type=gha,mode=max` already wired through docker-ecr, this should cut several minutes off each Docker build on the common no-dep-change path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
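The invalidation argument in this commit message can be sketched with a toy model of chained layer cache keys. This is only an illustration, not how BuildKit computes its keys: the step lists, the `pnpm-lock.yaml` file name, and the digest strings are assumptions standing in for the real Dockerfiles, and `layer_keys` and `fetch_layer_reused` are hypothetical helpers.

```python
import hashlib

def layer_keys(steps, file_digests):
    """Return one cache key per step. Each key chains the parent key,
    the instruction text, and (for COPY/ADD) a digest of the copied input."""
    keys = []
    parent = ""
    for instruction, source in steps:
        content = file_digests.get(source, "") if source else ""
        parent = hashlib.sha256(f"{parent}|{instruction}|{content}".encode()).hexdigest()
        keys.append(parent)
    return keys

# Hypothetical step lists standing in for the old and new orderings.
OLD_ORDER = [
    ("COPY pnpm-lock.yaml ./", "lockfile"),
    ("ADD . ./", "source"),
    ("RUN pnpm fetch", None),
]
NEW_ORDER = [
    ("COPY pnpm-lock.yaml ./", "lockfile"),
    ("RUN pnpm fetch", None),
    ("ADD . ./", "source"),
]

def fetch_layer_reused(steps, before, after):
    """True if the `pnpm fetch` layer gets the same key before and after a change."""
    idx = [i for i, (ins, _) in enumerate(steps) if ins == "RUN pnpm fetch"][0]
    return layer_keys(steps, before)[idx] == layer_keys(steps, after)[idx]

# Made-up digests: one baseline, one source-only edit, one lockfile-only edit.
before = {"lockfile": "lock-v1", "source": "src-v1"}
source_edit = {"lockfile": "lock-v1", "source": "src-v2"}
lockfile_edit = {"lockfile": "lock-v2", "source": "src-v1"}

print(fetch_layer_reused(OLD_ORDER, before, source_edit))    # False
print(fetch_layer_reused(NEW_ORDER, before, source_edit))    # True
print(fetch_layer_reused(NEW_ORDER, before, lockfile_edit))  # False
```

In this model a source edit changes the key of every layer after `ADD . ./`, so the fetch layer survives only when it sits before that step.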
Two consecutive staging deploys on this branch showed the GHA cache backend was making things worse: a "cache hit" on the pnpm fetch layer took ~84s to download from GHA, plus ~144s to push the updated cache back — totaling ~230s of cache-only network I/O for an image whose actual `pnpm fetch` step takes ~30s when run fresh. Temporarily replace the call to `cardstack/gh-actions/.../docker-ecr.yml` with an in-repo composite action (`.github/actions/docker-build-ecr`) that does the same auth + build + push but caches to the same ECR repository (`<repo>:buildcache`) using `type=registry,mode=max` with the OCI-manifest flags ECR requires. ECR pulls inside us-east-1 are much faster than GHA cache traffic, so the layer ordering in the pnpm Dockerfiles should finally produce a net win. Inlining (rather than updating cardstack/gh-actions in place) lets us iterate on the cache config without round-tripping through that repo. We can fold it back once we're confident the approach holds. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
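The trade-off described in this commit message comes down to simple arithmetic, sketched below using the approximate timings quoted there (~30s fresh fetch, ~84s cache download, ~144s cache upload). The helper `cache_net_saving` is a hypothetical name used only for illustration.

```python
def cache_net_saving(rebuild_s, download_s, upload_s):
    """Seconds saved by a remote cache for one layer.
    A negative result means the cache costs more time than rebuilding."""
    return rebuild_s - (download_s + upload_s)

# Approximate figures quoted in the commit message for the GHA cache backend.
gha = cache_net_saving(rebuild_s=30, download_s=84, upload_s=144)
print(gha)  # -198
```

A cache backend only pays off when its transfer time is below the rebuild time it replaces, which is why the faster in-region ECR pulls matter here.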
adfed76 to b7aecb1
The SHA we initially pinned (v2.0.1) still uses Node.js 20, which is deprecated on GitHub Actions runners — every build job in the latest deploy run flagged "Node.js 20 actions are deprecated" against this one action. v2.1.5 switched to node24 (verified by inspecting the tagged commit's action.yml). All other pinned actions in the local docker-build-ecr composite (`actions/checkout`, `setup-buildx-action`, `configure-aws-credentials`, `build-push-action`) are already on node24, so this single bump clears the warnings. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
backspace commented May 14, 2026
`WORKDIR /boxel/packages/postgres`
`CMD ./node_modules/.bin/ts-node --transpileOnly ./scripts/fix-migration-names.ts && ./node_modules/.bin/node-pg-migrate --check-order false --migrations-table migrations up && sleep infinity`
This is an existing issue, but CS-11154 tracks it.
richardhjtan approved these changes May 15, 2026
This temporarily inlines the `gh-actions` repository's `docker-build-ecr` action to use ECR's caching, which results in the mild speed gains described below. It won't hugely speed up deployments because the Docker images are built in parallel, but it'll at least reduce concurrency.

If this works well with real-world usage, I'll follow up to upstream it into `gh-actions`.

Claude:
Build-time impact

Compared 8 successful staging deploys from `main` against 4 warm-cache deploys on this branch. Times are seconds.

[Table of per-image build times; only the row labels survived: `prerender`, `realm-server`, `worker`, `pg-migration`, `prerender-mgr`, `bot-runner`, `ai-bot`, `host` (control).]

Critical-path Docker build (slowest of the parallel image builds) drops 305s → 258s (−47s). Total Docker CI-minutes (sum across the 7 image builds) drop 1,563s → 1,054s (−33%).
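As a quick check of the two headline figures in the build-time summary above, the deltas and percentages can be recomputed from the quoted before/after values. The helper name `change` is hypothetical.

```python
def change(before_s, after_s):
    """Return (absolute delta in seconds, percent change rounded to 0.1)."""
    delta = after_s - before_s
    return delta, round(100 * delta / before_s, 1)

print(change(305, 258))    # critical path: (-47, -15.4)
print(change(1563, 1054))  # total across image builds: (-509, -32.6)
```

Both results agree with the rounded −47s and −33% quoted in the summary.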
Notes

- […] (`prerender`, `realm-server`, `worker`, `pg-migration`, `prerender-manager`). Those show the biggest drops.
- `bot-runner` and `ai-bot` use unchanged Dockerfiles and only gain from the GHA→ECR cache-backend swap.
- `host` is a control: built by a separate workflow this PR doesn't touch. The 215s warm median exactly matches the baseline median, confirming the savings above aren't runner-side variance.
- […] `prerender` and tripled `pg-migration` build times despite confirmed cache hits in the buildx logs. Medians absorb it.
- […] (`bot-runner` 184–468s) because runner cold-start variance routinely swings GHA build times by ±100s. p25–p75 is more useful than min–max.

Data:
25653386425, 25653379386, 25521761356,
25498389255, 25423013789
25881823129
The new datapoint stabilizes the picture nicely: host control is now exactly 215 vs 215 baseline (a clean 0%), and the variance-prone medians settled. prerender critical-path savings firmed up to −15% / −47s. The 321s pg-migration outlier from the third warm run is balanced out by the 110s here.
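To illustrate why the notes prefer medians and p25–p75 over min–max, here is a minimal sketch using Python's `statistics` module. The durations in `sample` are made-up values chosen to include one slow outlier. They are not measurements from this PR, and `spread_summary` is a hypothetical helper.

```python
import statistics

def spread_summary(durations_s):
    """Summarize build durations by min-max, interquartile range, and median."""
    q1, median, q3 = statistics.quantiles(durations_s, n=4, method="inclusive")
    return {
        "min_max": (min(durations_s), max(durations_s)),
        "p25_p75": (q1, q3),
        "median": median,
    }

# Made-up durations in seconds with a single slow outlier (not data from this PR).
sample = [180, 200, 210, 220, 230, 240, 470]
summary = spread_summary(sample)
print(summary)
# {'min_max': (180, 470), 'p25_p75': (205.0, 235.0), 'median': 220.0}
```

The single outlier stretches the min–max range to 290s while the p25–p75 band stays 30s wide, which is the sense in which medians absorb a slow run.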