realm-server: HTTPS+HTTP/2 in local dev #4797
Heavy aggregator-card renders (cohort, dashboards) fan out 80+
federated-search requests per render inside one Chromium tab. Chrome's
HTTP/1.1 6-per-origin connection ceiling serializes them and turns a
single render into multiple minutes; HTTP/2 multiplexes them over one
connection, and the same render finishes in seconds. Browsers only speak
HTTP/2 over TLS, so the local realm-server now terminates TLS with a dev cert.
Single-origin design: the realm-server listens on
`https://localhost:4201` (and `https://localhost:4202` for test-realms)
when the dev cert is provisioned. There is no parallel HTTP listener
and no h2 alias port; the wire protocol and the canonical realm URL
agree. In-process tests and any environment without a cert keep getting
plain HTTP/1.1 via the same `listen(port)` entry point — `RealmServer`
picks the protocol from `REALM_SERVER_TLS_CERT_FILE`/`_KEY_FILE` rather
than exposing two separate methods.
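A minimal sketch of the single-entry-point idea described above, assuming Node's built-in `http` and `http2` modules. The helper names `tlsPathsFromEnv` and `createServerFor` are hypothetical, and `allowHTTP1: true` is an assumption rather than something this description states; only the two env var names come from the PR.

```typescript
import { readFileSync } from "node:fs";
import * as http from "node:http";
import * as http2 from "node:http2";

type Env = Record<string, string | undefined>;
type Handler = (
  req: http.IncomingMessage | http2.Http2ServerRequest,
  res: http.ServerResponse | http2.Http2ServerResponse,
) => void;

// Hypothetical helper: TLS is considered enabled only when BOTH env vars are set.
function tlsPathsFromEnv(env: Env): { cert: string; key: string } | undefined {
  const cert = env.REALM_SERVER_TLS_CERT_FILE;
  const key = env.REALM_SERVER_TLS_KEY_FILE;
  return cert && key ? { cert, key } : undefined;
}

// Hypothetical single entry point: the caller never picks a protocol explicitly.
function createServerFor(
  env: Env,
  handler: Handler,
): http.Server | http2.Http2SecureServer {
  const paths = tlsPathsFromEnv(env);
  if (!paths) {
    // No cert provisioned: plain HTTP/1.1, as in tests and cert-less environments.
    return http.createServer(handler);
  }
  return http2.createSecureServer(
    {
      cert: readFileSync(paths.cert),
      key: readFileSync(paths.key),
      allowHTTP1: true, // assumption: let HTTP/1.1 clients negotiate via ALPN
    },
    handler,
  );
}
```

Calling `createServerFor(process.env, handler).listen(4201)` would then yield either protocol from the same call site, which is the property the description is after.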
Cert provisioning is opt-in via `mise run infra:ensure-dev-cert`:
- Requires `mkcert` (single-origin HTTPS has no HTTP fallback in
dev, so a missing prereq is a hard error with install hints).
- Attempts `mkcert -install` once for system trust; declining the
sudo prompt is non-fatal — the cert still gets generated and
indexing keeps working via puppeteer's `--ignore-certificate-errors`
flag and `NODE_EXTRA_CA_CERTS` for Node clients.
- Idempotent: re-runs are a no-op until the cert is within 7 days of
expiry.
`env-vars.sh` flips `REALM_BASE_URL`/`REALM_TEST_URL` defaults to
`https://localhost:4201`/`4202`, exports the cert paths when files
exist, and points `NODE_EXTRA_CA_CERTS` at mkcert's root CA so Node-
side fetches (worker, scripts, prerender Node) trust the cert without
requiring `mkcert -install` to have run. `dev-common.sh` switches
wait-on's readiness probes to `https-get://` when the realm URL is
HTTPS. The host's `config/environment.js` defaults flip to
`https://localhost:4201` for `realmServerURL`, `baseRealmURL`,
`catalogRealmURL`, `legacyCatalogRealmURL`, `skillsRealmURL`, and
`openRouterRealmURL`. `middleware/index.ts#fullRequestURL` now detects
`ctx.req.socket.encrypted` so URL-keyed realm lookup matches the wire
protocol — combined with the canonical-URL flip, both halves agree.
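To illustrate why the socket flag matters for URL-keyed realm lookup, here is a rough sketch of deriving the scheme from the connection itself. `RequestLike` and `sketchFullRequestURL` are hypothetical names; the real `fullRequestURL` likely handles more cases (for example forwarded-proto headers), which are left out here.

```typescript
// Hypothetical stand-in for the parts of a request context the sketch needs.
interface RequestLike {
  socketEncrypted: boolean; // stands in for ctx.req.socket.encrypted
  host: string; // Host header value, e.g. "localhost:4201"
  path: string; // request target, e.g. "/base/card-api"
}

// Build the full request URL so its scheme matches what is on the wire.
function sketchFullRequestURL(req: RequestLike): URL {
  const scheme = req.socketEncrypted ? "https" : "http";
  return new URL(`${scheme}://${req.host}${req.path}`);
}
```

With a TLS socket the result starts with `https://localhost:4201/`, so it compares equal to the flipped canonical realm URL; without the flag the same request would be keyed under `http://` and miss.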
CI / hermetic test harness path stays HTTP-only: if no cert is
provisioned, `env-vars.sh` leaves the TLS env vars unset and the
realm-server boots `http.createServer`, exactly as before.
Migration after pulling: any local card data created under the old
`http://localhost:4201/...` canonical references is stale and needs to
be re-indexed. README documents the one-time `mise run
infra:full-reset` step.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1655a6f2df
Host Test Results: 1 files ±0, 1 suites ±0, 1h 46m 45s ⏱️ (-7m 36s). Results for commit 27035c6. ± Comparison against earlier commit 7832e3e.
Realm Server Test Results: 1 files ±0, 1 suites ±0, 8m 26s ⏱️ (+43s). Results for commit 27035c6. ± Comparison against earlier commit 7832e3e.
Adds the two missing pieces from the initial HTTPS+HTTP/2 flip:
1. Same-port HTTP→HTTPS dispatcher in `server.ts`. When the realm-server speaks TLS, `listen(port)` now binds a `net.Server` that peeks the first byte off every connection: 0x16 (TLS ClientHello) routes to the http2 secure server; anything else is treated as plain HTTP and handed to a tiny 301-redirect handler that rewrites the URL to `https://<inbound-host><path>`. So `http://localhost:4201/…` in a browser bar or a `curl` invocation gets a clean 301 instead of a TLS handshake failure. Same listener, no extra port.
2. A node-pg-migrate migration that rewrites every URL-bearing text/varchar/jsonb column on every public table (except `modules`, which the realm-server truncates on startup) from `http://localhost:42XX` to `https://localhost:42XX`. Columns are auto-discovered via `information_schema.columns`, which covers `boxel_index`, `boxel_index_working`, `realm_registry`, `realm_meta`, `realm_metadata`, `realm_user_permissions`, `realm_versions`, `realm_file_meta`, `module_transpile_cache`, plus any URL-bearing column added later (the discovery picks it up). The updates are WHERE-filtered so they only touch rows still containing the old URL: idempotent, and a no-op in production.
`mise run dev` already passes `--migrateDB` to the realm-server, so the migration runs automatically on the first post-pull boot. README's "Local HTTPS dev access" section is rewritten to describe the new auto-migration flow (no more `mise run infra:full-reset` callout).
Schema file renamed from `1779100257123_schema.sql` to `1779200000000_schema.sql` so host/config/environment.js's migration-vs-schema-name sentinel matches the new latest migration. Content is unchanged (the new migration is data-only).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
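The first-byte dispatch described in item 1 can be sketched roughly as follows, assuming Node's `net` module. `createDispatcher` and `isTlsClientHello` are hypothetical names, and the hand-off via `emit("connection", socket)` is a commonly used pattern rather than a confirmed detail of `server.ts`.

```typescript
import * as net from "node:net";

// 0x16 is the TLS record type "handshake", which is what a ClientHello starts with.
const TLS_HANDSHAKE_RECORD = 0x16;

function isTlsClientHello(firstByte: number): boolean {
  return firstByte === TLS_HANDSHAKE_RECORD;
}

// Hypothetical dispatcher: `secure` is the TLS/HTTP2 server and `redirect` is a
// plain HTTP server that answers 301. Neither listens on a port of its own;
// only the returned net.Server is bound.
function createDispatcher(secure: net.Server, redirect: net.Server): net.Server {
  return net.createServer((socket) => {
    // Keep peer-side socket errors from surfacing as uncaught exceptions.
    socket.on("error", () => socket.destroy());
    socket.once("readable", () => {
      const chunk: Buffer | null = socket.read(1);
      if (chunk === null) {
        // Connection ended before sending anything.
        socket.destroy();
        return;
      }
      socket.unshift(chunk); // put the peeked byte back for the real server
      const target = isTlsClientHello(chunk[0]) ? secure : redirect;
      target.emit("connection", socket);
    });
  });
}
```

An ASCII request line such as `GET / HTTP/1.1` starts with 0x47, so it can never be mistaken for a TLS handshake record; that is what makes a one-byte peek sufficient.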
CI was failing across host/realm-server/matrix test suites for two reasons. First, `ensure-dev-cert` exited non-zero when mkcert was missing, killing the mise dep chain before any service started. Second, `env-vars.sh` flipped `REALM_BASE_URL` to https unconditionally, so even when the realm-server fell back to plain HTTP, every consumer was still asked to fetch against https. The host config defaults had the same problem: hardcoded https meant the in-browser `realmServerURL` didn't match the wire scheme.
Three fixes, gated on cert presence:
1. `ensure-dev-cert` now exits 0 with a soft warning when mkcert is missing. The realm-server's `listen()` already falls back to plain `http.createServer` when the TLS env vars are unset, so this is the honest behavior for CI / hermetic-test environments.
2. `env-vars.sh` defaults `REALM_BASE_URL`/`REALM_TEST_URL` to http and only upgrades them to https inside the cert-detected block alongside the existing TLS env var exports.
3. `packages/host/config/environment.js` derives its scheme from `process.env.REALM_BASE_URL`, so the host config follows the same cert-presence-driven flip rather than baking https into the JS defaults.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pull request overview
Enables local-dev realm-server to serve a single canonical HTTPS origin with HTTP/2 (plus same-port HTTP→HTTPS redirect) to remove Chrome’s HTTP/1.1 per-origin connection bottleneck during heavy prerender/search fan-outs, and migrates local indexed data from http://localhost:42xx to https://localhost:42xx.
Changes:
- Add TLS-capable listener that multiplexes HTTPS/HTTP2 and HTTP redirect on the same port; update URL construction to recognize TLS sockets.
- Default local dev URLs/config/docs to `https://localhost:4201` (plus `:4202` for test realms) and add mkcert-based cert provisioning.
- Add a Postgres migration to rewrite persisted localhost canonical URLs from http→https.
Reviewed changes
Copilot reviewed 45 out of 46 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| README.md | Document local HTTPS/HTTP2 setup, migration, and updated local URLs. |
| QUICKSTART.md | Update quickstart URLs to https://localhost:4201. |
| packages/realm-server/tests/types-endpoint-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/search-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/search-prerendered-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/info-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/index-responses-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/helpers.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/federated-types-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/server-endpoints/authentication-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/request-forward-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/realm-endpoints/user-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/realm-endpoints/reindex-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/realm-endpoints/markdown-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/realm-endpoints/info-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/realm-endpoints/dependencies-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/realm-endpoints-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/publish-unpublish-realm-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/prerender-manager-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/openrouter-passthrough-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/module-cache-race-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/helpers/index.ts | Update close helpers/types to tolerate non-http.Server server handles. |
| packages/realm-server/tests/get-boxel-claimed-domain-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/file-watcher-events-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/delete-boxel-claimed-domain-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/claim-boxel-domain-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/card-source-endpoints-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/card-endpoints-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/card-dependencies-endpoint-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/boxel-domain-availability-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/tests/atomic-endpoints-test.ts | Use RealmHttpServer type alias for server handle. |
| packages/realm-server/server.ts | Add TLS/http2+redirect dispatcher and export RealmHttpServer type; update listen logging. |
| packages/realm-server/prerender/browser-manager.ts | Add --ignore-certificate-errors for prerender Chromium when using https. |
| packages/realm-server/middleware/index.ts | Treat TLS sockets as https for fullRequestURL() computation. |
| packages/realm-server/main.ts | Make shutdown tolerant of non-http.Server handles lacking closeAllConnections(). |
| packages/realm-server/lib/dev-service-registry.ts | Broaden registry typing to net.Server. |
| packages/postgres/migrations/1779200000000_canonical-url-http-to-https.js | Add migration to rewrite localhost canonical URLs from http→https. |
| packages/host/config/schema/1779200000000_schema.sql | Add regenerated host sqlite schema snapshot. |
| packages/host/config/environment.js | Flip local default realm URLs to https. |
| mise-tasks/services/test-realms | Ensure dev cert task runs before test realms. |
| mise-tasks/services/realm-server-base | Ensure dev cert task runs before base realm server. |
| mise-tasks/services/realm-server | Ensure dev cert task runs before realm server. |
| mise-tasks/lib/env-vars.sh | Flip default realm URLs to https and export TLS cert/CA env vars. |
| mise-tasks/lib/dev-common.sh | Use https readiness probes when realm URLs are https. |
| mise-tasks/infra/ensure-dev-cert | New task to provision mkcert leaf cert for local HTTPS/HTTP2. |
| .claude/skills/indexing-diagnostics/SKILL.md | Update localhost URLs and markdown formatting in diagnostics skill doc. |
Local realm-server speaks HTTPS+HTTP/2 in every environment; there is no HTTP fallback or opt-in. The dev cert is a hard prereq:
- `ensure-dev-cert` exits non-zero when mkcert is missing.
- `env-vars.sh` defaults `REALM_BASE_URL`/`REALM_TEST_URL` to https unconditionally and no longer flips schemes based on cert presence.
- `host/config/environment.js` defaults to `https://localhost:4201` unconditionally; the previous scheme-from-env-var branch is gone.
- The new `.github/actions/init` step installs mkcert via apt and runs `mise run infra:ensure-dev-cert` before any downstream job, so CI realm-servers boot HTTPS+HTTP/2 too. Test harnesses that launch Chromium already pass `--ignore-certificate-errors`; Node clients pick up the cert via `NODE_EXTRA_CA_CERTS`.
- README's CI/harness paragraph is rewritten to describe the cert provisioning in the init action (no more "boots HTTP/1.1 in CI" line).
Carries over the Copilot-flagged fixes:
- Migration renamed to `1779100257124_canonical-url-http-to-https.js` (one greater than the existing latest, no 6+ consecutive zeros so it passes `lint:migrations`) and the matching schema dump renamed.
- Migration body adds a `realm_registry` LIKE pre-check that short-circuits the full-column scans on production/staging databases where the canonical URLs never reference localhost.
- Drops the unused `/* eslint-disable camelcase */` line that `lint:js` flagged.
- `redirectToHttps()` parses the inbound `Host` via `new URL()` so bracketed IPv6 authorities (`[::1]:4201`) round-trip cleanly instead of the regex producing an invalid `https://::1:4201/...`.
- `env-vars.sh` no longer concatenates `NODE_EXTRA_CA_CERTS` with `:` separators; Node accepts a single PEM path, not a list. If the dev already has it set, leave it alone; otherwise point at mkcert's CA.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
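The `new URL()` fix for bracketed IPv6 authorities can be illustrated with a small sketch. `httpsRedirectTarget` is a hypothetical name, not the actual `redirectToHttps()` implementation; it only shows how parsing the `Host` value through the WHATWG URL parser keeps the brackets.

```typescript
// Hypothetical helper: compute the Location for the 301 from the inbound Host
// header and the request path.
function httpsRedirectTarget(hostHeader: string, path: string): string {
  // Parsing through URL preserves IPv6 brackets and the port, so "[::1]:4201"
  // comes back as "[::1]:4201" rather than a bare "::1:4201".
  const { host } = new URL(`http://${hostHeader}`);
  return `https://${host}${path}`;
}
```

For example, `httpsRedirectTarget("[::1]:4201", "/base/")` produces `https://[::1]:4201/base/`, which is a valid URL, whereas splitting the host on `:` would lose the brackets.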
Per Copilot 3230386975 — the previous QUICKSTART pointed users at https://localhost:4201 without telling them how to provision the cert that makes that origin work. Adds mkcert to the system dependencies list at step 1 with platform-specific install hints and the `mise run infra:ensure-dev-cert` one-liner, linking back to the README's "Local HTTPS dev access" section for the full story. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three task scripts under `mise-tasks/test-services/` were stuck on the
old `http-get://${REALM_BASE_URL#http://}/base/...` readiness probe
shape that strips a hardcoded `http://`. After env-vars.sh flipped
REALM_BASE_URL to https, that strip becomes a no-op and the probe URL
turns into the malformed `http-get://https://localhost:4201/...`,
which wait-on can't reach — every CI suite that drives `mise run
test-services:*` would hang on phase-1 readiness instead of starting
the next phase.
Same fix as `mise-tasks/lib/dev-common.sh`: detect the scheme from
`$REALM_BASE_URL` / `$REALM_TEST_URL` and pick `http-get://` or
`https-get://` accordingly; strip `*://` to leave just the authority.
Also wires `infra:ensure-dev-cert` into each script's depends list so
local invocations of `mise run test-services:*` (outside CI's init
action) provision the cert before the realm-server starts.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
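The scheme-detection fix is implemented in shell, but the transformation itself is small enough to sketch in TypeScript for clarity. `toWaitOnResource` is a hypothetical name; the `http-get://` and `https-get://` prefixes are wait-on's resource prefixes as used above.

```typescript
// Hypothetical helper mirroring the shell fix: pick the wait-on prefix from the
// realm URL's scheme, then strip whatever scheme was there to leave the
// authority and path.
function toWaitOnResource(realmUrl: string): string {
  const prefix = realmUrl.startsWith("https://") ? "https-get" : "http-get";
  const withoutScheme = realmUrl.replace(/^[a-z][a-z0-9+.-]*:\/\//i, "");
  return `${prefix}://${withoutScheme}`;
}
```

The old shape stripped only a literal `http://`, so an https URL passed through unchanged and ended up nested inside the `http-get://` prefix; stripping any scheme avoids that regardless of which one is configured.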
Blockers (B1–B3):
- tests/index.ts deletes REALM_SERVER_TLS_CERT_FILE/_KEY_FILE before any
fixture realm-server is spun up; without this CI's globally-provisioned
cert leaks into supertest-driven in-process servers, the dispatcher
binds TLS on 127.0.0.1:444X, and the plain-HTTP-from-supertest path is
301-redirected, breaking every assertion that expects 200/4xx.
- realm-server/package.json `test:wait-for-servers` now uses
`https-get://` to match the new wire scheme; the previous `http-get://`
hit the dispatcher's 301 path and never reported ready.
- server.ts attaches a per-socket `error` handler before the readable
callback so an RST mid-handshake (or any peer-side socket error)
doesn't escalate to an uncaught exception — dispatcher is the only
inbound listener for the realm-server, can't be allowed to crash.
- `null` reads on the dispatcher socket now `destroy()` instead of just
resuming so half-open accumulators (port scanners, eager load
balancers) don't tie up file descriptors.
Major (M1, M3–M5):
- README's auto-migration callout pointed at the wrong migration filename
(1779200000000_… → 1779100257124_…).
- pg-adapter.ts env-mode regex now matches `^https?://localhost:42XX/`
so the post-flip https canonicals get rewritten to Traefik hostnames
when a dev switches the same DB into BOXEL_ENVIRONMENT mode.
- server.ts's serveIndex / serveFromRealm URL constructions now go
through `fullRequestURL(ctxt)` instead of `${ctxt.protocol}//${ctxt.host}`;
`ctxt.protocol` only honors x-forwarded-proto when `app.proxy = true`,
while `fullRequestURL` also reads the TLS socket flag. Pre-existing
inconsistency that the https flip would have made load-bearing.
- migration's information_schema walk filters on `is_generated = 'NEVER'`
so a future generated column on any public table is skipped and doesn't abort the DO
block with "column can only be updated to DEFAULT".
Copilot's second pass:
- ensure-dev-cert checks for mkcert BEFORE the idempotent-skip — env-vars.sh
needs `mkcert -CAROOT` to populate NODE_EXTRA_CA_CERTS even when an
old cert already exists, and the previous ordering let a stale cert
slip past with the trust path half-wired.
- middleware/index.ts `fullRequestURL` falls back to `:authority` when
`headers.host` is absent — HTTP/2's compat layer normally populates
host from :authority but the pseudo-header is the canonical source.
- middleware/index.ts `fetchRequestFromContext` strips `:`-prefixed
pseudo-headers (`:method`, `:scheme`, `:path`, `:authority`) before
feeding them into `new Request(headers)`, which WHATWG Headers rejects.
- QUICKSTART mkcert bullet's continuation line is properly indented now
so markdown renders it inside the bullet instead of as a new paragraph.
- indexing-diagnostics SKILL.md two table rows now have the missing third
cell so the table renders correctly.
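The pseudo-header strip mentioned above can be sketched as follows, assuming a runtime with the global WHATWG `Headers` class (Node 18+). `withoutPseudoHeaders` is a hypothetical name and not the actual `fetchRequestFromContext` code.

```typescript
type RawHeaders = Record<string, string | string[] | undefined>;

// Hypothetical helper: copy inbound headers into a WHATWG Headers object,
// skipping HTTP/2 pseudo-headers, whose ":" prefix is not a valid header name.
function withoutPseudoHeaders(raw: RawHeaders): Headers {
  const headers = new Headers();
  for (const [name, value] of Object.entries(raw)) {
    if (name.startsWith(":") || value === undefined) {
      continue;
    }
    const values = Array.isArray(value) ? value : [value];
    for (const v of values) {
      headers.append(name, v);
    }
  }
  return headers;
}
```

The resulting object can be passed as the `headers` option of `new Request(...)` without tripping the header-name validation.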
Minor (m2, m6, n3) + Option A:
- redirectToHttps falls back to `socket.localAddress:localPort` when the
Host header is absent (HTTP/1.0 client), instead of bare `localhost`
that would route to port 443.
- scripts/full-reindex.sh and register-bot.sh flip to `https://` with
`-k` (curl doesn't pick up NODE_EXTRA_CA_CERTS, and the local mkcert
CA isn't necessarily in the system trust store).
- prerender/browser-manager.ts comment references only REALM_BASE_URL
(REALM_SERVER_DOMAIN was stale — never exported by env-vars.sh).
- QUICKSTART step 10/11 and README's "view a realm's app" paragraph
redirect manual-browser navigation to `http://localhost:4200/` (the
vite host), with a note that visiting `https://localhost:4201` directly
surfaces mixed-content warnings because vite + icons + synapse still
speak http. Realm-server's https origin is reached only via fetches
inside the vite-served page, which is where the federated-search h2
win lands. README's "view example" output also flipped the realm log
line to `https://localhost:4202/test/` to match the new canonical.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- README list item 3's wrapped continuation line is now indented under the bullet so markdown doesn't break it into a separate paragraph.
- server.ts dispatcher tracks every accepted socket in a Set and mirrors http.Server's `closeAllConnections()` API. main.ts's existing typeof feature-detect picks this up; shutdown no longer hangs on long-lived h2 sessions or keep-alive sockets.
- tests/listener-dispatcher-test.ts is new coverage for the dispatcher: generates a self-signed cert via openssl into a tmp dir, then exercises TLS+h2, ALPN HTTP/1.1 fallback, plain-HTTP→https 301 redirect, the no-Host-header path that uses `socket.localAddress`, malformed-cert downgrade to plain HTTP, and the no-cert-env-vars path. `createListener` is now exported from server.ts so the test can drive it without spinning up a full realm-server fixture (and the test bootstrap's global TLS-env-var delete doesn't interfere, since each test restores its own env around `startListener`).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
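A rough sketch of the socket-tracking idea behind `closeAllConnections()`, assuming Node's `net` module. The `TrackedSockets` class is hypothetical; the real dispatcher presumably folds this into the listener object itself.

```typescript
import * as net from "node:net";

// Hypothetical tracker: remembers every accepted socket so shutdown can
// force-close long-lived connections instead of waiting for them to end.
class TrackedSockets {
  private readonly sockets = new Set<net.Socket>();

  track(socket: net.Socket): void {
    this.sockets.add(socket);
    // Forget the socket once it has fully closed on its own.
    socket.once("close", () => this.sockets.delete(socket));
  }

  // Same name as http.Server's method so a typeof feature-detect finds it.
  closeAllConnections(): void {
    this.sockets.forEach((socket) => socket.destroy());
    this.sockets.clear();
  }

  get size(): number {
    return this.sockets.size;
  }
}
```

A dispatcher would call `track(socket)` in its connection callback and expose `closeAllConnections()` on the object it returns from `listen`.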
`qunit/no-assert-logical-expression` was failing on three assertions that combined multiple conditions via `&&` / `||`. Splitting them into discrete `assert.true(...)` calls makes the failure point obvious when a test breaks and clears the lint. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both `packages/workspace-sync-cli/tests/helpers/start-test-realm.ts` and `packages/realm-test-harness/src/isolated-realm-stack.ts` spawn a realm-server subprocess that inherits `process.env`. After CI's init action provisions the dev cert and `env-vars.sh` exports `REALM_SERVER_TLS_CERT_FILE/_KEY_FILE`, those env vars leak into the spawned realm-server, which binds the HTTPS+HTTP/2 dispatcher on the harness's chosen port. The integration tests and the realm-perf bench both drive plain `http://localhost:<port>/...` URLs against that server, hit the dispatcher's 301 path, and break: workspace-sync's CLI fails its session handshake with "expected 'Authorization' header" (it doesn't follow the redirect through the auth flow), and the bench fails its first GET with `404` because the realm route is behind https now. Same shape of fix as `realm-server/tests/index.ts` for the in-process qunit suite: destructure the two TLS env-var keys out of the spawn env so the child inherits everything except those. Plain `http.createServer` path, no redirect, harness HTTP URLs work as written. Production realm-servers and local dev are unaffected because they don't go through these harnesses. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
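The "destructure the two TLS env-var keys out of the spawn env" fix can be sketched like this. The helper name `envWithoutTLS` is used for illustration and is not necessarily how either harness spells it.

```typescript
type Env = Record<string, string | undefined>;

// Illustrative helper: return a copy of the environment with the two TLS
// variables removed, so a spawned realm-server takes the plain-HTTP path.
function envWithoutTLS(env: Env): Env {
  const {
    REALM_SERVER_TLS_CERT_FILE: _certFile,
    REALM_SERVER_TLS_KEY_FILE: _keyFile,
    ...rest
  } = env;
  return rest;
}
```

A harness would then pass `env: envWithoutTLS(process.env)` in the spawn options. The parent's own environment is left untouched because the rest pattern builds a new object.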
`packages/host/testem-live.js` was hardcoding `http://localhost:4201/catalog/` as the realm URL and launching Chrome with the default trust policy. After the HTTPS flip, the live-test runner's `discoverTestModules` fetched against `https://localhost:4201/catalog/...` (via the host's `realmServerURL` default) but the browser navigated to `http://localhost:4201/...`, getting a 301 to https and then failing the cert check. `mkcert -install` in CI's init action is best-effort and the headless Chrome in CI doesn't always pick up the system trust store anyway.
Two fixes paired:
- Default realm URL flips to `https://localhost:4201/catalog/` so the navigation target matches the wire.
- Chrome's CI launch args get `--ignore-certificate-errors` so the live test runner accepts the mkcert leaf without depending on system trust. Safe: the URL is fixed by REALM_URL and the connection is loopback. Dev (`launch_in_dev`) doesn't add the flag because local devs typically have run `mkcert -install` successfully and the cert is trusted normally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…registry
The pre-check needs to fire on a fresh install too. `realm_registry` is populated by the realm-server's runtime bootstrap (registry backfill + reconciler), not by migrations, so it's empty when this migration runs against a freshly-created DB. The migration short-circuited and the `http://localhost:42XX` permission rows seeded by the earlier `1726671342065_backfill-realm-owners.js` migration stayed un-rewritten. The realm-server then matches incoming requests against the new `https://localhost:42XX/…` canonical and the permission rows fail to join → world-readable catalog returns 401 → Live Tests fail with "Cannot access realm https://localhost:4201/catalog/ (HTTP 401)".
Switch the pre-check to `realm_user_permissions.realm_url`, which is reliably populated with the localhost canonicals by the earlier seed-style migrations. The rest of the migration body is unchanged: the per-column WHERE clauses still restrict the touch set to rows that actually contain the old URL, so production/staging DBs (real hostnames, never localhost) still no-op.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
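The migration itself is SQL run through node-pg-migrate, but the rewrite and its idempotence can be illustrated on plain strings. This is only a model of the semantics: `rewriteLocalhostCanonical` and `needsRewrite` are hypothetical names, and reading `42XX` as "42 followed by two digits" is an assumption.

```typescript
// Assumption: "42XX" means ports 4200 through 4299.
const OLD_LOCALHOST_ORIGIN = /http:\/\/localhost:(42\d\d)/g;

// Analogue of the per-column WHERE filter: only values that still contain the
// old origin are touched.
function needsRewrite(value: string): boolean {
  return /http:\/\/localhost:42\d\d/.test(value);
}

// Analogue of the UPDATE: swap the scheme and keep the port and path.
function rewriteLocalhostCanonical(value: string): string {
  return value.replace(OLD_LOCALHOST_ORIGIN, "https://localhost:$1");
}
```

Once a value has been rewritten, `needsRewrite` returns false for it, which is why a second run touches nothing and why databases that never stored localhost URLs are unaffected.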
Test mode runs against the host-internal `http://test-realm/...` virtual origin via VirtualNetwork; there is no real realm-server on the wire. Many host test fixtures hardcode the `http://localhost:4201/...` canonicals in mock setups, VirtualNetwork mappings, and JSON test data, so flipping the default URLs to https caused every fetch in the test suite to fail with `TypeError: Failed to fetch` — the host's VirtualNetwork was wired with https URL mappings the test mocks didn't recognize. `environmentDefaults(environment)` now reads the ember env and picks http for `environment === 'test'`, https otherwise. Dev gets the HTTPS+HTTP/2 flip exactly as designed; test stays where it always was. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous test-mode-on-http revert was wrong: in Host Tests the realm-server actually IS running (via mise run test-services:host), and that realm-server speaks HTTPS+HTTP/2. The host bundle's defaults need to match the wire so module/data fetches over the wire (like GET /base/card-api during warmup) reach the live realm-server. The http defaults were producing failed http→https mismatches.
So:
- environment.js test mode reverts to https defaults (same as dev).
- test-wait-for-servers.sh + live-test-wait-for-servers.sh default their readiness probe URLs to `https-get://` to match. live-test-wait-for-servers.sh also gets the same scheme-detection helper (`to_wait_scheme`) the other scripts use so an explicit REALM_URL with either scheme works.
`http://test-realm/...` URLs in tests (used by the in-memory test realm registry) are still intercepted by `getRealmInfoForURL` before any wire fetch. That path is unrelated to the wire defaults, and any remaining failures there are a separate concern from the HTTPS flip.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sweep of every place `http://localhost:4201`/`4202` appears with runtime impact:
Runtime / wire-touching:
- `package.json` `openrouter:sync` default REALM_URL → https
- `mise-tasks/lib/test-dev-common.sh` stub env defaults → https
- `packages/host/app/services/host-mode-service.ts` `originIsNotMatrixTests` accepts both http and https origins on the matrix-tests realm ports (https is the new default; http stays recognized so older snapshots still detect the test mode).
- `packages/observability/scripts/apply.sh` / `diff.sh` default `REALM_SERVER_URL` → https.
Cache import:
- `scripts/import-cached-index.sh` env-mode sed remap now matches both `http://localhost:4201` and `https://localhost:4201`: older cache snapshots have http canonicals, post-flip dumps have https. Either prefix gets rewritten to the env-mode Traefik hostname.
In-tree realm fixture data (cards served by dev realm-server):
- `packages/experiments-realm/**/*.json` and `packages/catalog-realm/**/*.json` `id` / `relationships` URLs flipped from http to https. Without this every cross-card fetch inside a render paid a wire-level 301 redirect from the dispatcher.
Docs:
- `README.md`, `QUICKSTART.md`, `packages/host/docs/live-tests.md`, `packages/software-factory/README.md`, `packages/bot-runner/README.md`, `docs/commands-in-headless-chrome.md`: example URLs updated.
Not flipped (intentional):
- Test fixture JSONs under `packages/host/tests/cards/`, `packages/realm-server/tests/cards/`, ai-bot resource chats, and bench-realm snapshot fixtures. Those URLs match test-side mount points (`http://test-realm/...`, `http://127.0.0.1:4444/test/`, bench-stack http://localhost:4201) where the test infrastructure spawns the realm-server with TLS env vars cleared and listens plain HTTP. Flipping them would diverge from what the test code registers and break the in-process fixtures.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Host Tests load the host bundle in a headless Chrome on testem (port 7357). The bundle's `realmServerURL` / `resolvedBaseRealmURL` defaults now point at `https://localhost:4201` to match the wire, but `mkcert -install` in CI's init action is best-effort and doesn't reliably land mkcert's root CA in headless Chrome's NSS trust store. Without `--ignore-certificate-errors`, every realm fetch made during shard warmup fails with `TypeError: Failed to fetch` against the self-signed cert and the rest of the shard never starts. Same fix already shipped in `testem-live.js`. Loopback only, fixed origin via host config — safe to relax cert trust. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Boxel-cli's vitest suite (and any other non-qunit caller of these helpers) doesn't share `packages/realm-server/tests/index.ts`'s bootstrap, so the global TLS env var delete that protects in-process qunit fixtures didn't apply to it. The CI init action provisions the cert, env-vars.sh exports the paths, and the test process inherits them — the spawned realm-server then binds HTTPS+HTTP/2 on its fixture port (`127.0.0.1:4446` for boxel-cli) and the CLI's plain-HTTP session calls fail with `404 Not Found` from the dispatcher's 301 path. Moving the env-var strip into the two `runTestRealmServer*` helpers themselves makes it defense-in-depth: every caller (qunit, vitest, software-factory harness) now goes through the same kill switch when spinning a fixture realm-server. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…p2-v2
# Conflicts:
# .claude/skills/indexing-diagnostics/SKILL.md
# packages/realm-server/scripts/full-reindex.sh
# packages/realm-server/tests/realm-endpoints/info-test.ts
# packages/realm-server/tests/realm-endpoints/user-test.ts
Matrix client tests timed out waiting for `http-get://localhost:4201/base/_readiness-check` because the realm-server now speaks HTTPS+HTTP/2 only. Wait-on's plain http-get probe never resolves against the https listener. Same fix for start-without-matrix.sh (dev convenience script used to bring up the stack without Synapse). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Card fixture data hardcoded http://localhost:4202 in adoptsFrom.module. With the realm-server now on HTTPS, the page is served over https and Chrome blocks mixed-content fetches of the http module URL. Flipping to https keeps the canonical realm URL consistent with the actual listener scheme. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…low-insecure-localhost
Chrome 144+ silently demotes `--ignore-certificate-errors` to a dev-only flag and won't accept self-signed certs unless it's paired with `--allow-insecure-localhost`. Without that pairing, every TLS connection to https://localhost:4200 from puppeteer's chrome terminates the handshake with ERR_CONNECTION_CLOSED, which is what was blocking the prerender's wait-for-host-standby in CI (and, downstream, every Host / Matrix test job because realm-server boot depends on prerender being ready). curl over the same URL worked fine, hiding the cert trust nature of the problem under what looked like a generic TCP close.
Pair the flags in both the prerender's BrowserManager and the standby-warmup script (scripts/wait-for-host-standby.ts).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
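A small sketch of keeping the two flags paired in one place so they cannot drift apart between launch sites. `chromiumTlsArgs` is a hypothetical helper, and gating on the target URL's scheme is an assumption based on the earlier "when using https" note for the prerender browser manager.

```typescript
// Hypothetical helper: return the cert-relaxing Chromium flags as a pair, and
// only when the target origin is actually https.
function chromiumTlsArgs(targetUrl: string): string[] {
  if (!targetUrl.startsWith("https://")) {
    return [];
  }
  return ["--ignore-certificate-errors", "--allow-insecure-localhost"];
}
```

Both launch sites could then spread the result into their existing `args` array, for example `args: [...baseArgs, ...chromiumTlsArgs(hostUrl)]`, where `baseArgs` and `hostUrl` are placeholders for whatever each script already has.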
The matrix Playwright suite was the lone holdout from this PR's HTTPS-everywhere design: the harness's spawned realm-server + worker-manager + prerender-server stripped REALM_SERVER_TLS_CERT_FILE / _KEY_FILE and ran plain HTTP on :4205, with the test fixtures / helpers / Playwright `baseURL` all hardcoding `http://`. With the production realm-server now HTTPS+h2, that mismatch meant the matrix suite stopped being a regression guard on every h2-framing change in this PR (the HEAD-stream `writable` patch, the pseudo-header strip in `fetchRequestFromContext`, the h1-only-header filter in `setContextResponse`, the hand-rolled `proxyAsset` forwarder).
Concrete changes:
- `helpers/isolated-realm-server.ts`: drop `envWithoutTLS()`, let the cert env vars flow to the spawned children, flip every `--toUrl=` + `realm_metadata` + `appURL` to `https://localhost:4205/...`, override `HOST_URL` to `https://localhost:4200` on the realm-server spawn so the boot fetch doesn't pick up a stale http leak from a shell that mise-activated before infra:ensure-dev-cert ran.
- `helpers/index.ts`, `playwright.config.ts`, every matrix `*.spec.ts`: flip the `:4205` URL literals to https. `playwright.config.ts` also adds `ignoreHTTPSErrors: true` on the playwright context, pairs `--ignore-certificate-errors` with `--allow-insecure-localhost` on the chrome launch args, and flips the `published.realm` `--unsafely-treat-insecure-origin-as-secure` entry to https.
- `infra/ensure-dev-cert` + `mkcert`: mint the dev leaf with `*.localhost` + `published.realm` SANs in addition to `localhost` so the publish-realm subdomain fixtures (`https://publish-realm-XXX.localhost:4205/...`, `https://published.realm/...`) actually validate. The idempotent-skip path now regenerates the cert if the SAN block is missing the `*.localhost` entry, so devs don't need to manually rm the cached cert.
- `helpers/isolated-realm-server.ts`: spawned children also get `NODE_TLS_REJECT_UNAUTHORIZED=0`. Node's `tls.checkServerIdentity` has a hardcoded refusal to match `*.localhost`-style top-level wildcard SANs even when the cert covers them (mkcert warns about this), so worker fetches to `https://publish-realm-XXX.localhost:4205/...` fail with ERR_TLS_CERT_ALTNAME_INVALID. The cert is still being validated end-to-end against the mkcert root via NODE_EXTRA_CA_CERTS, just without the strict subdomain SAN check; the wire is loopback-only.
- `packages/host/app/components/operator-mode/publish-realm-modal.gts`: `getProtocol()` was returning `http` for `development`/`test` envs, which the publish-realm flow used to construct the `publishedRealmURL` body it POSTs to `/_publish-realm`. The resulting URL leaked into the realm registry, the JWT claims, and the worker's from-scratch-index fetch, and once the wire was https, the http URL went nowhere. Always return `https`.
Local shape: matrix shard 1 now runs 30/36 pass (was 21/35 before this commit). The remaining 6 failures (`commands.spec.ts:226`, `correctness-checks.spec.ts:30`, four `login.spec.ts` cases) are pre-existing flakes: the same set fails across before/after runs and CI's `retries: 2` papers over them.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The matrix harness boots its isolated realm-server on :4205 and the older `1726671342065_backfill-realm-owners` migration seeds owner permissions for those realms keyed under `http://localhost:4205/...`. The previous version of this canonical-url migration only rewrote :4201 and :4202, so after the matrix harness switched to HTTPS on :4205 (previous commit) the realm-server registered itself as `https://localhost:4205/...` while `realm_user_permissions` rows stayed on `http://`. Every authenticated request from the worker / host bundle then 403'd with `for user @test_realm:localhost permissions insufficient. requires read, but user permissions: []`, which manifested as login.spec.ts:177 (and three siblings) timing out waiting for `[data-test-stack-item-content]` — a card that could never load because its realm was unreadable. Extend both the pattern array and the gating EXISTS pre-check to include :4205. The down migration uses the same helper, so it stays symmetric automatically. Local: matrix shard 1 goes from 30 passed / 6 failed to 36 passed / 0 failed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
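The symmetric rewrite over the three localhost ports can be illustrated with a plain string transform. This is a sketch under stated assumptions: `rewriteCanonicalURLs` and `LOCAL_PORTS` are hypothetical names, matching on a trailing slash after the port is a simplification, and the real migration runs SQL `REPLACE(...)` updates rather than this code.

```typescript
// Ports named in this commit message: dev realms, test realms, matrix harness.
const LOCAL_PORTS = [4201, 4202, 4205];

type Direction = "up" | "down";

// Rewrites every localhost canonical URL prefix in `text`.
// "up" flips http to https; "down" flips https back to http.
function rewriteCanonicalURLs(text: string, direction: Direction): string {
  const [from, to] =
    direction === "up" ? ["http", "https"] : ["https", "http"];
  let result = text;
  for (const port of LOCAL_PORTS) {
    result = result
      .split(`${from}://localhost:${port}/`)
      .join(`${to}://localhost:${port}/`);
  }
  return result;
}
```

Because `https://localhost:4205/` does not contain the substring `http://localhost:4205/`, re-running the "up" direction leaves already-migrated values alone, which mirrors the idempotence the migration relies on.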
The WSC integration tests had their own copy of the matrix isolated-
realm-server spawn pattern, with the same TLS-env-var strip and
hardcoded `http://localhost:4205/test/` URL maps. After the matrix
harness converted to HTTPS and the canonical-url migration started
rewriting `:4205` permissions to https, the WSC harness was the
only remaining caller still booting realm-server on http — but with
the migration now rewriting the seeded permissions to https, every
CLI command failed with `Authentication failed (403): Cannot access
workspace`.
Mirror the matrix conversion:
- `tests/helpers/start-test-realm.ts`: drop the
`REALM_SERVER_TLS_CERT_FILE` / `_KEY_FILE` strip; the spawned
realm-server inherits the mkcert leaf via env-vars.sh and binds
HTTPS+h2 on :4205 like production. Flip every `--toUrl=` and
`--fromUrl=` to `https://`. Default `--distURL` to
`https://localhost:4200`. Add `NODE_TLS_REJECT_UNAUTHORIZED=0` on
the spawn env so the WSC CLI under test doesn't depend on
NODE_EXTRA_CA_CERTS being in the test shell.
- `tests/integration-test.ts`: flip every
`http://localhost:${REALM_PORT}/test/` literal to https.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…CURE=1 everywhere
Backfill the wait-on TLS-relaxation flag onto every
start-server-and-test invocation that probes
https-get://localhost:42XX. I'd flipped the test-services/* and
ci/serve-test-assets tasks earlier but missed the wrapper scripts that
the test runners actually invoke:
- packages/host/scripts/test-wait-for-servers.sh (the host suite's
`test:wait-for-servers`)
- packages/host/scripts/live-test-wait-for-servers.sh
- packages/realm-server/scripts/start-without-matrix.sh
- packages/realm-server/package.json#test:wait-for-servers
- packages/matrix/scripts/test.sh
Without the flag, start-server-and-test forces strictSSL:true on the
in-process axios that wait-on uses, which overrides the global
NODE_TLS_REJECT_UNAUTHORIZED and even NODE_EXTRA_CA_CERTS when those
don't propagate uniformly to the readiness-probe subprocess under CI
load. Result: ~15% of host-test shards flaked at the readiness gate
trying to TLS-handshake the mkcert leaf. The INSECURE flag is the
documented escape hatch and scopes to the probe — the test runner
itself still validates TLS normally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…detached flake
The host CI suite is failing on ~35% of shards because the prerender's wait-for-host-standby probe ends with `attempt 1 failed after 38s: waitForFunction failed: frame got detached` and the subsequent retry loop apparently never produces another log line (output buffering in run-p + tee makes it look like the retry stopped, but it's more likely the chrome browser process is in a wedged state).
Hook every puppeteer event we have line-of-sight on so the next failure flushes a complete trace:
- `console` / `pageerror` show host-bundle errors thrown during boot.
- `requestfailed` surfaces TLS / network errors per resource (the prime suspect, given the cert handling in this PR).
- `response` (status >= 400) flags HTTP-level failures.
- `framedetached` confirms exactly which frame got destroyed.
Also bracket each attempt with `attempt N: page.goto(...)` and `attempt N: waiting for #standby-ready` so the buffered output makes the retry-loop's actual progress legible. Enabled by default while we hunt the issue; flip `WAIT_FOR_HOST_STANDBY_VERBOSE=0` to mute when the flake is closed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Verbose puppeteer logging in wait-for-host-standby (previous commit) caught the actual failure mode: every chrome asset request for `https://localhost:4200/assets/*.js` aborted with `net::ERR_NETWORK_CHANGED` mid-fetch, so the host bundle never finished booting and #standby-ready never appeared.
Root cause is the workflow order:
1. Start test services (vite preview + realm-server + prerender), backgrounded with `&`
2. Register realm users
3. Install dbus + `sudo service dbus restart` + `sudo service upower restart`
4. Run host tests
Step 3 restarts the system message bus while the prerender's chromium is already mid-flight loading the host bundle. Chrome's NetworkChangeNotifier reads system signals (over dbus) and reacts to the bounce by tearing down every in-flight HTTP/2 stream with ERR_NETWORK_CHANGED. The HTTPS+h2 wire makes this more visible than the pre-PR plain-HTTP setup because h2 multiplexes ~100 asset fetches on one connection: when that connection dies, all of them die at once and the page can't finish loading.
Move the dbus install / restart in front of "Start test services" in both the live-host and host-tests jobs so the network churn happens before any chromium is spawned.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The verbose puppeteer logging from the previous commit also surfaced the SECOND class of CI failure: even on the shards where the prerender standby probe succeeded, the ember test suite itself fired hundreds of `TypeError: Failed to fetch` errors against `https://localhost:4201/_search` and the test runner exited with Testem code 1. Same root cause as the wait-for-host-standby fix: testem's chromium had `--ignore-certificate-errors` but not `--allow-insecure-localhost`, and Chrome 144+ silently demotes the former to a dev-only flag unless paired with the latter. Every fetch from the test page (loaded over HTTP at testem's local server) to `https://localhost:4201/...` failed strict cert validation against the mkcert leaf and was reported back to the test as a `Failed to fetch`. Apply the same pair on both `testem.js` (CI host suite) and `testem-live.js` (live host tests). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The publish-realm-modal flow now constructs `https://<subdomain>.localhost:4201/...` unconditionally (publish-realm-modal.gts:getProtocol → https). This test still asserts the http form, e.g.:
Publishing to: https://testuser.localhost:4201/test/ (actual)
Publishing to: http://testuser.localhost:4201/test/ (expected, stale)
Flip every `http://(testuser|custom-site-name|my-boxel-site|my-custom-site).localhost:4201` reference (18 occurrences) to https so the assertions match what the modal/UI produces.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1dc733a225
Eight should-fix items from the rigorous PR review on 4797:
- packages/matrix/scripts/migrate-account-data-http-to-https.ts: add
:4205 to URL_PREFIXES_TO_FLIP for parity with the postgres migration
(which covers 4201/4202/4205). The matrix isolated harness now runs
HTTPS on :4205, so account_data referencing http://localhost:4205/
needs to flip too.
- packages/realm-server/scripts/wait-for-host-standby.ts: flip
WAIT_FOR_HOST_STANDBY_VERBOSE default off. It was on while hunting
the CI flake (chrome NetworkChangeNotifier + missing
--allow-insecure-localhost), both of which landed; healthy CI logs
don't need the chrome console / requestfailed firehose.
- packages/realm-server/main.ts: shutdown comment claimed the TLS-mode
http2 server can't `closeAllConnections()`. The new dispatcher
(server.ts) explicitly mirrors that method, so the comment is stale.
Rewrite to describe what the call actually does today.
- packages/realm-server/middleware/index.ts: hoist
H2_FORBIDDEN_RESPONSE_HEADERS to module scope so both
`setContextResponse` and `proxyAsset` filter the same set
(previously proxyAsset's filter was missing `proxy-connection` and
`http2-settings`). Also document that proxyAsset is GET-only and
note the empty-string fallback on `assetsURL.port`.
- packages/host/scripts/vite-with-traefik.js: delete the unreachable
empty defensive block at the end of the 301-redirect path
(`if (headerEnd === -1 && length >= 8192) { /* … */ }` after the
socket had already been closed). Also drop the now-unused
`headerEnd` computation.
- packages/host/tests/cards/{fadhlan,mango,type-examples,van-gogh}.json:
flip `"id": "http://localhost:4202/test/..."` to https. The realm-
server canonicalizes ids on read so this didn't break runtime, but
pre-flip URLs in committed fixtures invited confusion when reading
the diff.
Also addresses the matching PR-description claim that
`packages/matrix/helpers/isolated-realm-server.ts` strips TLS env vars
— that hasn't been true since the matrix harness moved to HTTPS on
:4205. PR body updated separately.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
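A sketch of the shared filter this commit hoists to module scope. The header names come from the PR description; `stripH1OnlyHeaders` and `stripPseudoHeaders` are hypothetical helper names, and the real `setContextResponse` / `proxyAsset` code may apply the set differently.

```typescript
// HTTP/1-only response headers that Node's http2 compat layer rejects.
const H2_FORBIDDEN_RESPONSE_HEADERS = new Set([
  "connection",
  "keep-alive",
  "transfer-encoding",
  "upgrade",
  "proxy-connection",
  "http2-settings",
]);

type HeaderMap = Record<string, string>;

// Drops connection-level headers before a response is written to an h2 stream.
function stripH1OnlyHeaders(headers: HeaderMap): HeaderMap {
  const out: HeaderMap = {};
  for (const [name, value] of Object.entries(headers)) {
    if (!H2_FORBIDDEN_RESPONSE_HEADERS.has(name.toLowerCase())) {
      out[name] = value;
    }
  }
  return out;
}

// Drops `:`-prefixed HTTP/2 pseudo-headers before building a fetch Request.
function stripPseudoHeaders(headers: HeaderMap): HeaderMap {
  const out: HeaderMap = {};
  for (const [name, value] of Object.entries(headers)) {
    if (!name.startsWith(":")) {
      out[name] = value;
    }
  }
  return out;
}
```

Keeping the set in one module-level constant means both call sites filter the same names, which is the drift this commit fixes.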
Two Codex findings on commit 1dc733a, plus a tightening of the puppeteer cert-relax gate that fell out of reviewing them:
1. mise-tasks/dev{,-all}: re-source env-vars.sh after ensure-dev-cert
On a first `mise run dev{,-all}` after `infra:trust-dev-cert`, the leaf cert is created AFTER env-vars.sh was sourced on shell entry, so REALM_SERVER_TLS_CERT_FILE / _KEY_FILE are never exported into the current shell, vite starts plain-HTTP while HOST_URL is already `https://localhost:4200`, and the readiness curl probe times out. Re-source after the cert preflight so the TLS vars get picked up. Idempotent in the steady-state case.
2. packages/realm-server/server.ts + packages/host/scripts/vite-with-traefik.js: switch the plain-HTTP redirect from 301 to 308
301 silently downgrades POST/PUT/PATCH to GET and drops the body when fetch() follows it. Matrix-registration scripts that POST to `http://localhost:4201/{_server-session,_user,...}` were broken by this. 308 preserves method + body, and is semantically correct: the redirect is a permanent property of the wire protocol.
3. wait-for-host-standby + BrowserManager: gate cert-relax on https + loopback hostname, not just https
Previously the `--ignore-certificate-errors` + `--allow-insecure-localhost` flags fired for any `https://...` URL. Production realm-server runs against real hostnames with CA-signed certs; we want strict validation there. Tighten the condition to https + (`localhost` | `127.0.0.1` | `[::1]`), which matches the mkcert leaf's SAN, the only origin where relaxation is justified. Extracted the `isHttpsLoopback(url)` predicate to a new `packages/realm-server/lib/is-https-loopback.ts` so the prerender `BrowserManager` and the `wait-for-host-standby` script share one implementation.
PR description updated to match.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
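The loopback gate from this commit could look roughly like the following. The predicate name `isHttpsLoopback` comes from the commit message; the body and the `certRelaxArgs` wrapper are assumptions for illustration, not the contents of `is-https-loopback.ts`.

```typescript
// Hostnames as serialized by the WHATWG URL parser (IPv6 keeps its brackets).
const LOOPBACK_HOSTNAMES = new Set(["localhost", "127.0.0.1", "[::1]"]);

// True only for https URLs whose host is a loopback name covered by the dev cert.
function isHttpsLoopback(url: string): boolean {
  let parsed: URL;
  try {
    parsed = new URL(url);
  } catch {
    return false; // unparseable input never relaxes TLS validation
  }
  return parsed.protocol === "https:" && LOOPBACK_HOSTNAMES.has(parsed.hostname);
}

// Hypothetical wrapper: the Chrome launch args to add for a given target URL.
function certRelaxArgs(url: string): string[] {
  return isHttpsLoopback(url)
    ? ["--ignore-certificate-errors", "--allow-insecure-localhost"]
    : [];
}
```

Returning an empty list for everything else keeps strict certificate validation as the default, so a production hostname can never pick up the relaxed flags by accident.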
The redirect status moved from 301→308 in 3e1acdf so POST/PUT/PATCH bodies aren't dropped on http→https. The dispatcher test still expected 301 (caught by Realm Server Tests shard 2). Update the two assertions plus three stale 301-mentions in adjacent doc comments. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
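For illustration, the plain-HTTP branch's response might be assembled like this. `buildHttpsRedirect` is a hypothetical helper, and the exact header set the dispatcher writes (beyond the 308 status and the `Location` target) is an assumption.

```typescript
// Builds the raw HTTP/1.1 response that points a plain-HTTP client at the
// same host and path over https. 308 tells clients to repeat the request
// with the same method and body.
function buildHttpsRedirect(host: string, path: string): string {
  const location = `https://${host}${path}`;
  return [
    "HTTP/1.1 308 Permanent Redirect",
    `Location: ${location}`,
    "Content-Length: 0",
    "Connection: close",
    "",
    "",
  ].join("\r\n");
}
```

An assertion on a response like this checks the status line for `308` and the `Location` header for the https form of the requested URL.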
For I needed to run
good point--it assumes synapse is already running
@backspace also you'll probably need to run the reverse migration steps if you want to get back to using your env without http/2
Here’s the fix that worked for me but it only gets
So I’ll go for another round with Claude, maybe it’s a similar problem. I did clear
the fact that you got 401 unauthorized makes me wonder if there is an auth token mismatch somewhere. but wait, card-api is public.... weird....
…4847) The same-port dispatcher peeks the first byte off each connection, then opens a `net.connect(internalPort, '127.0.0.1')` upstream socket and pipes bytes through. But `runVite` was invoking vite with `--port <p>` and no `--host`, so vite default-binds to `localhost`. On macOS Sonoma+ (and any Node 17+ host with IPv6 ahead of IPv4 in /etc/hosts), that resolves to `::1` first and vite ends up listening on `[::1]:<p>` only. The dispatcher's IPv4 upstream connect then fails, the error handler destroys the client socket mid-TLS-handshake, and the browser gets `ERR_CONNECTION_CLOSED` on https://localhost:4200 even though the dispatcher logs "Listening on ... → vite at 127.0.0.1:<p>". Plain http://localhost:4200 still works because the 308 response is written directly to the accepted socket without ever touching upstream — so testem and curl-against-http didn't trip this. Add a `host` option to `runVite` and pass `host: '127.0.0.1'` from `runViteBehindRedirectDispatcher` so vite explicitly binds the same loopback family the dispatcher connects on. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
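The first-byte peek that the dispatcher performs can be sketched as a pure classifier. `classifyConnection` and the `Route` type are hypothetical names; the `0x16` check matches the TLS handshake record type that the PR description says the dispatcher looks for.

```typescript
// A TLS connection opens with a handshake record, whose content type is 0x16.
const TLS_HANDSHAKE_RECORD = 0x16;

type Route = "tls" | "plain-http";

// Decides where a freshly accepted connection goes, from its first bytes.
// "tls" connections are piped upstream; "plain-http" ones get the redirect.
function classifyConnection(firstChunk: Uint8Array): Route {
  return firstChunk.length > 0 && firstChunk[0] === TLS_HANDSHAKE_RECORD
    ? "tls"
    : "plain-http";
}
```

Only the "tls" route touches the upstream socket, which is why the IPv4/IPv6 bind mismatch described above broke https while plain-http redirects kept working.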
backspace left a comment
After ironing out my idiosyncratic environment situation and the fix in #4847, this works for me locally. I haven’t tried with environment mode, but I’m probably the only person that uses (needs?) it and it has no CI enforcement, so I can address that separately.
This is understandably large so I only skimmed, but it’s running locally at least!
| - A `@field` getter that accesses `undefined.property` because an upstream link didn't materialize. | ||
| - A template-level `{{#if (someHelper ...)}}` where `someHelper` was renamed or removed. | ||
|
|
||
| **False-positive profile.** The detector has four gates that all have to hold simultaneously: `isReady=true`, `model.status='ready'`, DOM attribute === `loading` specifically, and the state persists across a backoff-poll grace window (a microtask drain followed by macrotask hops at 50ms → 200 → 500 → 1000 → 2000, re-checking after each — total ~3.75s of cumulative slack so Backburner's flush has real wallclock time to land even under heavy parallel CI load). The fast path exits at the first hop; only renders that stay desynced through the full series are declared failures. In-flight loads are filtered upstream by `#waitForRenderLoadStability` — by the time the detector runs the loader is quiescent. The one residual scenario is a card whose template runs a multi-second *synchronous* getter that starves the microtask queue beyond the full grace budget; when the getter finishes, the microtask queue drains, the binding flips to `ready`, and on the next hop the detector exits cleanly. So in practice false-positives require Backburner, Glimmer, and the entire JS thread to all be blocked for >3.75s — a state the route can't be in while logically `ready`. |
There’s a lot of diff noise just from things like *synchronous* becoming _synchronous_! I wonder if we need a house style, I find myself using the save-without-formatting shortcut sometimes to avoid this kind of autofix, though this was more likely a Claude rewrite thing
| `brew install mkcert nss` on macOS. After install, run | ||
| `mise run infra:ensure-dev-cert` once before the first | ||
| `mise run dev` / `pnpm start:all`; subsequent runs are a no-op. See | ||
| the repo-root [README](README.md#local-https-dev-access) for details. |
Has anyone actually tried following the steps in this document in a while? I too updated it with the mise changes but I’m not convinced it actually works. It doesn’t have to be addressed as part of this PR but I think we should consider removing it, or committing to making sure it works and stays working.
I have not. I think @tintinthong made this originally, perhaps he has thoughts?






The local stack is now HTTPS+HTTP/2 end-to-end (realm-server on `:4201`, vite on `:4200`). Three one-time manual steps are required before `mise run dev` / `mise run dev-all` will work; skipping any of them leaves the browser stuck in CORS / mixed-content / cert errors that look like the app is broken.
1. Trust the dev TLS cert (one-time, prompts sudo)
The realm-server and vite both terminate TLS with a mkcert leaf. The browser-trust path differs by OS:
- macOS: `mkcert -install` adds the root CA to the system Keychain, which Chrome and Safari read directly. Firefox uses its own NSS DB; `brew install nss` provides the `certutil` mkcert needs to seed it.
- Linux: Chromium reads `~/.pki/nssdb`, not `/etc/ssl/certs`, so `libnss3-tools` (Debian) / `nss-tools` (RHEL) is mandatory. Without it mkcert silently skips the NSS DB and the browser keeps showing cert warnings.
Install the tools and run the trust task:
`mise run infra:trust-dev-cert` runs `mkcert -install` (sudo prompt on Linux; macOS will prompt for the user's Keychain password). It seeds the system Keychain on macOS and `~/.pki/nssdb` on Linux; either way Chromium picks up the root CA on next launch. `mise run dev` / `dev-all` invoke `infra:ensure-dev-cert` as a preflight that fails fast with a copy-paste message if you skipped this. CI does the equivalent via passwordless sudo in `.github/actions/init`, so devs don't need to think about it there.
2. Migrate your matrix account_data (one-time)
Every Boxel user keeps their realm workspace list in matrix `app.boxel.realms` account_data. Existing entries still point at `http://localhost:42XX/...`. The realm-server's HTTP→HTTPS dispatcher 308-redirects those requests, but CORS preflight forbids following redirects, so the host bundle's first realm fetch will fail with:
Run the migration once after `mise run infra:trust-dev-cert`:
The script logs in as the local synapse admin, walks every user, impersonates each via `/_synapse/admin/v1/users/<id>/login`, reads `app.boxel.realms`, rewrites every `http://localhost:4201/...` / `http://localhost:4202/...` to `https://...`, and PUTs the result back. Safe to re-run: users already on https are skipped.
3. Clear your browser's localStorage
The host bundle caches realm-scoped JWTs and session metadata keyed by realm URL. Those entries are still keyed on the old `http://localhost:42XX/` form, and there is no server-side migration path; the only way to evict them is in your browser:
After that, log in fresh. New tokens will be keyed under the https origin.
When in doubt: full-reindex a realm
If something still looks broken after the three steps above (stale rows, missing types, a card that won't render), kick off a from-scratch index for the affected realm. The realm-server exposes the same `_grafana-reindex` endpoint we use from Grafana in production. In local dev the shared secret is literally `shhh! it's a secret`:
The request blocks until the from-scratch job finishes and prints the result stats. Swap the `realm=` query param for whichever realm you need (`/base/`, `/test/`, `/user/<your-realm>/`, etc.); the path is relative to the realm-server origin.
And when even that doesn't help, the nuclear option is still available:
That drops and recreates the Postgres DB, clears dynamic realms, and restarts matrix, i.e. you start over from a clean slate.
Rolling back
If you ever need to go back to the previous http-canonical state (e.g. you're bisecting against this branch, or you `pnpm migrate down` the postgres migration), the rewrite is symmetric in both directions:
Both migrations gate on `realm_user_permissions` containing localhost canonicals in the relevant scheme, so they're no-ops on staging / production.
Auto-applied: postgres data migration
The first `mise run dev` after pulling runs a Postgres migration (`1779100257124_canonical-url-http-to-https.js`) that rewrites every text/varchar/jsonb column on every public table from `http://localhost:42XX/…` to `https://localhost:42XX/…` in place: index rows, realm registry, permissions, JSONB documents inside `pristine_doc` / `search_doc` / etc. The migration is idempotent and gated on a cheap `realm_registry` pre-check, so re-runs and production environments are no-ops.
If you have stale `http://localhost:42XX/…` URLs in personal-realm card `.json` files (in `realms/localhost_4201/**`), the dispatcher's 308-redirect resolves them at runtime so cards still work; no on-disk rewrite is required. To clean the data anyway:
Navigation
Visit `https://localhost:4200/` (vite host) as the manual-browser entry point. Both vite and the realm-server speak HTTPS+HTTP/2 now, so the host bundle's realm fetches multiplex over a single h2 connection with no mixed-content warnings.
Summary
Local dev's realm-server now speaks HTTPS+HTTP/2 on a single canonical origin (`https://localhost:4201`, plus `https://localhost:4202` for test-realms). This unblocks the heavy aggregator-card prerender bottleneck described in CS-11114: cohort and dashboard renders today fan out 80+ federated-search requests inside one Chromium tab, get throttled by Chrome's HTTP/1.1 6-per-origin connection ceiling, and take minutes; HTTP/2 multiplexes them over one connection and the same render finishes in seconds.
Design
Realm-server: same-port HTTPS+HTTP/2 dispatcher
- `RealmServer.listen(port)`: when `REALM_SERVER_TLS_CERT_FILE` / `_KEY_FILE` are set, binds a single `net.Server` that peeks the first byte of every connection. `0x16` (TLS ClientHello) routes to an `http2.createSecureServer` for h2; anything else routes to a plain `http.Server` that 308-redirects to `https://<host><path>`. Same listener, no extra port. When the cert is absent (in-process test fixtures), falls back to plain `http.createServer`, which is unchanged behavior.
- Mirrors `closeAllConnections()` so shutdown can force-close in-flight TLS / HTTP/2 / keep-alive sessions rather than waiting for peers. `main.ts`'s existing typeof feature-detect picks it up unchanged.
- `readFileSync` + `createSecureServer` wrapped in try/catch so a malformed cert downgrades to plain HTTP with a warning rather than killing boot.
- `patchKoaResponseForH2Head()` on the Koa response prototype: Node's http2 compat layer leaves `Http2Stream.writable === false` on HEAD streams, which short-circuits Koa's `respond()` and hangs every HEAD request indefinitely. The patch returns `true` for HEAD streams so the response actually flushes.
- `middleware/index.ts`:
  - `fullRequestURL` detects `ctx.req.socket.encrypted` for the scheme and falls back to the HTTP/2 `:authority` pseudo-header when `headers.host` is absent, so URL-keyed realm lookup matches the HTTPS canonical.
  - `fetchRequestFromContext` strips `:`-prefixed pseudo-headers before constructing `new Request(...)`, because WHATWG `Headers` rejects them.
  - `setContextResponse` filters HTTP/1-only `connection` / `keep-alive` / `transfer-encoding` / `upgrade` / `proxy-connection` / `http2-settings` response headers that Node's h2 compat layer would otherwise reject.
  - `proxyAsset` was reimplemented as a hand-rolled forwarder (replacing `koa-proxies` + `http-proxy`) so pseudo-headers and the request `host` get filtered before the upstream call; `http-proxy.setHeader(':path', …)` throws `ERR_INVALID_HTTP_TOKEN`.
- `main.ts` defaults `--serverURL` to `https://localhost:${port}` (was `http://`). The realm-server stamps `serverURL` into the `realmServerURL` claim of every JWT it mints, so an http default leaks into tokens and the host's `assertOwnRealmServer` rejects them as a "different realm server".
- `prerender/browser-manager.ts` and `scripts/wait-for-host-standby.ts` launch puppeteer with `--ignore-certificate-errors` + `--allow-insecure-localhost` (Chrome 144+ silently demotes the former without the latter). Gated on `BOXEL_HOST_URL` / `REALM_BASE_URL` being https + a loopback hostname (`localhost`, `127.0.0.1`, `[::1]`) so the relaxation only fires in local dev / CI; production hits real hostnames with real CA-signed certs and keeps strict TLS validation.
Vite host (`packages/host`)
- `vite.config.mjs` reads `REALM_SERVER_TLS_CERT_FILE` / `_KEY_FILE` and sets both `server.https` (dev) and `preview.https` (built) so vite terminates TLS on `:4200`. Browsers refuse HTTP/2 over cleartext, so vite has to speak HTTPS for the h2 connection-pool win to apply on the host origin.
- `packages/host/scripts/vite-with-traefik.js` adds a same-port http→https redirect dispatcher for `vite` (dev) only: vite binds an internal port, the dispatcher owns `:4200`, peeks the first byte, and either pipes raw bytes to vite (TLS) or 308-redirects (plain HTTP). `vite preview` skips the dispatcher and binds `:4200` directly (the byte-peek + cross-process TCP pipe pattern doesn't survive chrome's TLS+h2 handshake under load in CI; preview doesn't need browser-bar UX anyway).
- `config/environment.js` defaults flip to `https://localhost:4201` for `realmServerURL` / `baseRealmURL` / `catalogRealmURL` / `legacyCatalogRealmURL` / `skillsRealmURL` / `openRouterRealmURL`.
Mise tasks and env-vars
- `mise run infra:ensure-dev-cert` provisions the mkcert leaf at `$HOME/.local/share/boxel/dev-certs/`. Idempotent; auto-runs `infra:trust-dev-cert` when passwordless sudo is available (CI), otherwise fails fast with a copy-paste install message. `mise run dev` / `dev-all` invoke it as a preflight.
- `mise run infra:trust-dev-cert` runs `mkcert -install` and, on Linux, verifies `libnss3-tools` is installed so Chromium picks up the root CA from `~/.pki/nssdb`. On macOS mkcert seeds the system Keychain directly (and Firefox's NSS DB if `nss` is brewed), so no extra precheck is needed.
- `mise-tasks/lib/env-vars.sh`: defaults `REALM_BASE_URL` / `REALM_TEST_URL` / `HOST_URL` to https; exports `REALM_SERVER_TLS_CERT_FILE` / `_KEY_FILE` + `NODE_EXTRA_CA_CERTS` when the mkcert leaf is present. Also auto-detects system chrome (`/usr/bin/google-chrome`, `Chromium.app`, etc.) and sets `PUPPETEER_EXECUTABLE_PATH`: puppeteer's bundled Chrome 143 has an h2 stream-window bug that hangs the prerender on cold vite optimizer; Chrome 148+ is fine.
- `services/{realm-server,realm-server-base,worker-base,prerender,test-realms}` were updated to use the https canonical URLs.
- `helpers/isolated-realm-server.ts` and `packages/workspace-sync-cli/tests/helpers/start-test-realm.ts` inherit `REALM_SERVER_TLS_CERT_FILE` / `_KEY_FILE` from `env-vars.sh`, so the isolated realm-server on `:4205` matches the production wire (HTTPS+HTTP/2). The matrix Playwright suite acts as a regression guard on every h2-framing change in this PR (HEAD-stream `writable` patch, pseudo-header strip, hand-rolled `proxyAsset`). The realm-test-harness, used by software-factory's Playwright suite, still strips the TLS env vars from its child spawns: its host is `vite preview` on a dynamic port and the harness is intentionally HTTP-only.
CI
- `.github/actions/init/action.yml` installs mkcert + `libnss3-tools` via apt and runs `mise run infra:ensure-dev-cert` so realm-servers in CI come up HTTPS+HTTP/2 the same as local.
- `tests/index.ts` (realm-server test bootstrap) deletes the TLS env vars before any in-process fixture realm-server is spun up; supertest connects plain HTTP to those fixtures on random `127.0.0.1:444X` ports.
- Readiness probes against `https://localhost:42XX` (in `ci/serve-test-assets`, `ci/cache-index`, `test-services/{host,realm-server,matrix}`) use `https-get://` (start-server-and-test's default `https://` is HEAD, which vite preview behind h2 doesn't reliably answer) and pass `START_SERVER_AND_TEST_INSECURE=1` to disable wait-on's strictSSL check.
- `ci-software-factory.yaml` no longer starts host-dist on `:4200`: the realm-test-harness is hermetic and brings up its own vite preview on dynamic ports. Only `services:icons` (port 4206) is started externally.
- `packages/postgres/scripts/ensure-db-exists.sh` forces `-h localhost -p 5432` (TCP) inside `docker exec`, since the postgres:16.3 image doesn't reliably create `/var/run/postgresql/.s.PGSQL.5432`. `set -e` makes a failed `CREATE DATABASE` actually exit non-zero instead of fabricating a success line.
Data migrations
Two migrations cover the http→https flip and are both reversible:
Postgres: `packages/postgres/migrations/1779100257124_canonical-url-http-to-https.js`
- Walks `information_schema.columns` for every text/varchar/jsonb column on every public table (excludes `modules`, `pgmigrations` / `migrations`, generated columns).
- `REPLACE(...)`-based `UPDATE`s for `http://localhost:4201` → `https://localhost:4201` and `http://localhost:4202` → `https://localhost:4202`.
- A `WHERE` filter restricts the touch set to rows that still contain the old URL, which makes the rewrite idempotent.
- `down` is symmetric (https → http), with the same `realm_user_permissions` pre-check on the source scheme.
- Gated on `realm_user_permissions` containing localhost URLs, so production / staging (real hostnames, never `localhost`) is a no-op either way.
- `mise run dev` passes `--migrateDB` to the realm-server, so the migration fires on the first post-pull boot.
Matrix account_data: `packages/matrix/scripts/migrate-account-data-http-to-https.ts`
- Rewrites `app.boxel.realms` account_data entries from `http://localhost:42XX/...` to `https://...`, then PUTs them back.
- `--reverse` flips the direction (`pnpm migrate-account-data-https-to-http` / `mise run infra:migrate-matrix-account-data-https-to-http`) for symmetry with the postgres migrate-down.
Tests
- `packages/realm-server/tests/listener-dispatcher-test.ts` covers the dispatcher: TLS h2, ALPN HTTP/1.1 fallback, TLS h2 HEAD (the patched-`writable` path), plain-HTTP 301, no-Host-header raw-socket path, malformed-cert downgrade, and no-cert-env-vars plain HTTP.
- `card-endpoints-test.ts`, `types-endpoint-test.ts`, and `module-syntax-test.ts` were updated to https where they reference port 4202.
- `realm-indexing-test.gts` and `realm-test.gts` were updated for the new https canonical URL and the alphanumeric URL sort order that follows from it.

Test plan
- `mise run infra:ensure-dev-cert` succeeds with mkcert installed; emits clean install hints + exits 1 when missing.
- `curl -kI --http2 https://localhost:4201/_alive` returns `HTTP/2 200`.
- `curl -kI --http1.1 https://localhost:4201/_alive` returns `HTTP/1.1 200` (ALPN fallback for h1 clients).
- `curl -sI http://localhost:4201/_alive` returns `HTTP/1.1 301` with `Location: https://localhost:4201/_alive`.
- `curl -skI -X HEAD --http2 https://localhost:4201/_alive` returns `HTTP/2 200` (HEAD over h2 doesn't hang).
- `mise run dev` runs the URL-rewrite migration → realm-server boots clean on https.
- `pnpm --filter @cardstack/postgres migrate down 1 && pnpm --filter @cardstack/postgres migrate up` round-trips cleanly (the Postgres Migration CI job validates this).
- `curl -X POST -H "Authorization: Bearer shhh! it's a secret" 'https://localhost:4201/_grafana-reindex?realm=/base/'` completes without errors.
- After `mise run infra:trust-dev-cert` + matrix account_data migration + localStorage clear, open `https://localhost:4200/`, log in, click a workspace: index card populates and DevTools shows `h2` for realm fetches.
- `mise run dev` shutdown closes the listener cleanly.
- `pnpm lint` passes on `packages/{realm-server,host,matrix,postgres,realm-test-harness}` (`lint:js` + prettier; pre-existing `lint:types` errors in `../base/*.gts` are unrelated).

Closes #4787 (dual-listen approach abandoned in favor of this single-origin design).
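For reviewers who want a mental model of the rewrite both data migrations perform, here is a minimal sketch, not the code in this PR. `rewriteOrigins`, `Json`, and `DEV_PORTS` are hypothetical names chosen for illustration; the only details taken from the description are the two ports, the http ↔ https direction, and the idempotent, reversible behavior.

```typescript
// Hypothetical helper (not from the PR): swaps the scheme of the two local
// dev origins inside any JSON-like value, e.g. an account_data payload.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Ports named in the PR description (realm-server and test-realms).
const DEV_PORTS = [4201, 4202];

function rewriteOrigins(value: Json, reverse = false): Json {
  const from = reverse ? 'https' : 'http';
  const to = reverse ? 'http' : 'https';

  if (typeof value === 'string') {
    let result = value;
    for (const port of DEV_PORTS) {
      // split/join replaces every occurrence without any regex escaping.
      result = result
        .split(`${from}://localhost:${port}`)
        .join(`${to}://localhost:${port}`);
    }
    return result;
  }
  if (Array.isArray(value)) {
    return value.map((item) => rewriteOrigins(item, reverse));
  }
  if (value !== null && typeof value === 'object') {
    // Only values are rewritten in this sketch; keys are left untouched.
    return Object.fromEntries(
      Object.entries(value).map(([key, item]) => [key, rewriteOrigins(item, reverse)]),
    );
  }
  return value; // numbers, booleans, null pass through unchanged
}

// Usage: non-localhost URLs are untouched, and a second pass changes nothing.
const sample: Json = {
  realms: ['http://localhost:4201/user/a/', 'https://example.com/realm/'],
};
const migrated = rewriteOrigins(sample);
const restored = rewriteOrigins(migrated, true);
console.log(JSON.stringify(migrated), JSON.stringify(restored));
```

Because `http://localhost:4201` is not a substring of `https://localhost:4201`, re-running the forward pass is a no-op, which mirrors the idempotence the `WHERE` filter gives the Postgres migration.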
🤖 Generated with Claude Code