packer: move to Reflog's Option B (lantern-box installs via cloud-init) #248
Open
myleshorton wants to merge 3 commits into main
Reflog in #infrastructure-and-services thread ts=1776197690.140869 (2026-04-16) noted that rebuilding the full N-provider × M-region packer image matrix on every lantern-box release is expensive (30 min of CI per release, plus traffic/API usage from pushing to every region), and proposed moving the lantern-box .deb install to cloud-init.

This commit takes that step on the packer side. The lantern-box binary is no longer downloaded or installed during `packer build`; the image contributes only:

- runtime deps (ca-certificates, tzdata, nftables)
- /etc/lantern-box/ and /var/lib/lantern-box/ dirs
- otelcol-contrib + systemd drop-in for host metrics
- systemd drop-in for lantern-box's OTel env (still applies when lantern-box.service appears on disk via cloud-init's apt install)
- /etc/cron.d/lantern-box-update fallback cron (to be removed in a follow-up once central orchestration is stable)

Cloud-init now owns the .deb install via the ReleaseTag field on PackerCloudInitConfig, landed in `getlantern/lantern-cloud` a6f92260f. See `docs/design/central-vps-updates.md` for the full rollout plan.

**Deploy caveat**: before rolling out a new image built from this code, bandit_vps_default_release_tag (or a per-track override in bandit_vps_image_targets) MUST be set in lantern-cloud settings. Otherwise cloud-init's apt-install step is skipped, the box boots without lantern-box installed, and the provision worker's `systemctl enable --now lantern-box` will fail.

Revert path: re-merge this commit's reverse diff.

VERSION env var is kept because the image name still includes it as a label; the script itself no longer uses it, comment updated to match.
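To make the deploy caveat concrete, here is a minimal sketch of the first-boot decision it describes. The helper name `apt_pin_for_tag` and the `RELEASE_TAG` variable are hypothetical, and the mapping from tag `v0.0.73` to .deb version `0.0.73` is an assumption based on the values used in this PR's test plan. It is not the actual cloud-init template in lantern-cloud.

```shell
# Hypothetical helper: turn a release tag (e.g. v0.0.73) into an apt pin.
# Prints nothing and returns 1 when the tag is empty, which models the
# "apt-install step is skipped" case from the deploy caveat.
apt_pin_for_tag() {
  tag="$1"
  if [ -z "$tag" ]; then
    return 1
  fi
  # Assumption: the .deb version is the tag without its leading "v".
  printf 'lantern-box=%s\n' "${tag#v}"
}

# Example first-boot usage (not executed here):
#   if pin="$(apt_pin_for_tag "$RELEASE_TAG")"; then
#     apt-get install -y "$pin"
#   fi
```

With an empty tag nothing gets installed, so a later `systemctl enable --now lantern-box` would have no unit to enable.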
…ery release

Follow-ups to the Option B strip (91b027f). Both requested by Reflog in the same thread ts=1776197690.140869 in #infrastructure-and-services.

Cron removal:

The /etc/cron.d/lantern-box-update + /usr/local/bin/lantern-box-update cron was the thing that was silently failing 266 times/hour at peak with "install failed: expected 0.0.70 but got 0.0.68" and no host.name on the log lines. Under central orchestration (landed in lantern-cloud PR), BanditVPSHotSwapWorker SSHes in, apt-installs the target tag, and writes current_release_tag, with per-route success/failure observability. If SSH keeps failing, BanditVPSAutoreplaceWorker drains the route via destroy+pool rebuild. So the cron is strictly redundant.

Drops ~100 lines from provision.sh. Also drops /var/log/lantern-box dir creation and the logrotate config (only the cron wrote there).

CI optimization:

- build-images.yaml: trigger on push to main when deploy/packer/** or the workflow itself changes, not on every lantern-box release. Under Option B the binary lives in the .deb published to GitHub Releases, not in the image, so a plain Go release needs no rebuild.
- release.yaml: drop the "Trigger Packer image builds" step. Releases still produce .debs via goreleaser, but images stay put.
- auto-tag.yaml: drop deploy/packer/** from the version-bump path filter. Packer-only changes no longer cause a version bump → release cycle; build-images.yaml picks them up directly.
- prepare job in build-images.yaml: on push events, derive version from the latest git tag (image is version-agnostic content-wise under Option B, but the name is still used as a prefix by the per-provider latestImage() helpers in lantern-cloud).

Net effect: a typical Go release goes goreleaser → .deb → GitHub Releases → hot-swap. No image rebuild, no 36-region OCI fan-out, no cross-region image push. Reflog's 30 minutes of CI per release + the API/traffic cost goes away. Packer images rebuild only when the image itself actually changes.
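The "derive version from the latest git tag" step in the prepare job might look roughly like the sketch below. The function name `latest_tag` is hypothetical, tags are assumed to follow the `vX.Y.Z` pattern seen elsewhere in this PR, and `sort -V` (version sort) is assumed to be available on the runner. The real workflow may select the tag differently.

```shell
# Hypothetical helper: read tag names on stdin, one per line, and print the
# highest vX.Y.Z tag. Anything that is not a plain vX.Y.Z tag is ignored.
latest_tag() {
  grep -E '^v[0-9]+\.[0-9]+\.[0-9]+$' | sort -V | tail -n 1
}

# Example usage inside a checkout with tags fetched (not executed here):
#   version="$(git tag --list | latest_tag)"
```

Version sort matters here because a plain lexical sort would rank `v0.0.9` above `v0.0.73`.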
Pull request overview
Updates the Packer image build/deploy flow to implement “Option B”: stop baking a specific lantern-box version into VM images and instead install the desired version at first boot via cloud-init, reducing image rebuild frequency and CI cost.
Changes:
- Removes the pre-baked `lantern-box` `.deb` install and the per-host auto-update cron from the Packer provisioning script.
- Adjusts GitHub Actions workflows so Packer image builds trigger on `deploy/packer/**` (and manual dispatch) rather than on every release.
- Updates Packer README/docs to describe the new “version-agnostic image” approach.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `deploy/packer/provision.sh` | Removes baked-in install/cron and documents cloud-init installation; keeps systemd drop-ins and base deps. |
| `deploy/packer/README.md` | Documents that lantern-box is no longer included in the image and adds operator guidance. |
| `.github/workflows/release.yaml` | Removes the step that triggered Packer builds on every release. |
| `.github/workflows/build-images.yaml` | Changes triggers to push-path based rebuilds and makes version input optional with “latest tag” fallback. |
| `.github/workflows/auto-tag.yaml` | Removes `deploy/packer/**` from the auto-tag path filter to avoid version bumps for image-only changes. |
- provision.sh: stale comment said the packer image still contributes
an "auto-update fallback cron" — the cron was removed in an earlier
commit. Drop the phrase.
- provision.sh: add wireguard-tools to match the Dockerfile + README
package list (per the "keep in sync with Dockerfile" comment above
the apt-get install). The lantern-box .deb doesn't declare it as a
runtime dep in goreleaser's nfpms section, so apt won't pull it in
on first boot — we need to install it here.
- provision.sh: the old "command -v lantern-box" verification at the
tail of the script was stranded when we stripped the pre-baked
install. Under Option B, lantern-box is expected to be absent until
cloud-init runs. Replace with a verification that checks the things
the packer image actually contributes: the systemd drop-ins, the
/etc/lantern-box + /var/lib/lantern-box dirs, the OTel config, the
Lanternet CA cert, and the tailscale + otelcol-contrib sidecars.
- README.md: reconciled the intro line with the "Not in the image"
section below. The old "Pre-baked VM images with lantern-box
installed" claim contradicted Option B and would have confused
future readers.
- build-images.yaml: switched `${{ inputs.version }}` and
`${{ inputs.builders }}` to `github.event.inputs.*`. The `inputs`
context is only populated for workflow_dispatch + workflow_call,
not push. Evaluators have historically been permissive here but
`github.event.inputs.*` is strictly more defensive on a push
trigger without any downside (this workflow doesn't use
workflow_call).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
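The replacement verification described in the third bullet above could be sketched along these lines. This is an illustration, not the code in `provision.sh`: the function name `verify_image` and the root-prefix argument are hypothetical, only the two directories named in the commit message are hard-coded, and any drop-in, config, or certificate paths would have to be passed in by the caller because their exact locations are not stated here.

```shell
# verify_image ROOT [EXTRA_PATH...]
# Check that the paths the image is expected to contribute exist under ROOT.
# lantern-box itself is deliberately not checked: under Option B it is
# absent until cloud-init runs.
verify_image() {
  root="$1"
  shift
  missing=0
  for path in /etc/lantern-box /var/lib/lantern-box "$@"; do
    if [ ! -e "${root}${path}" ]; then
      echo "missing: ${path}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage at the tail of a provisioning script (not executed here):
#   verify_image "" /path/to/some-drop-in.conf || exit 1
```

Reporting every missing path before failing, rather than stopping at the first, makes a broken image build easier to diagnose from a single CI log.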
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
Comment on lines +96 to 101

    # daemon-reload is a no-op here for the (not-yet-installed) lantern-box
    # service, but the otelcol-contrib service below needs it to pick up its
    # env drop-in. The apt install that runs under cloud-init will
    # daemon-reload again after the service unit appears on disk.
    systemctl daemon-reload
Summary

Implements Reflog's Option B from #infrastructure-and-services thread `ts=1776197690.140869` (2026-04-16): the lantern-box binary stops being baked into the packer image; cloud-init apt-installs it on first boot, parameterized by release tag. Packer images become mostly static per provider and rebuild only when the image itself changes.

Paired with lantern-cloud PR that lands the control plane: https://github.com/getlantern/lantern-cloud/pull/new/af/central-vps-updates-schema (design doc at `docs/design/central-vps-updates.md`).

Commits

- `packer: strip pre-baked lantern-box install`: remove the `.deb` download/install block from `deploy/packer/provision.sh`. Keep systemd drop-ins and env file scaffolding (they apply when cloud-init's apt install creates `lantern-box.service`).
- `packer: remove lantern-box-update cron + stop rebuilding on every release`:
  - Removes `/etc/cron.d/lantern-box-update` and `/usr/local/bin/lantern-box-update` from `provision.sh` (~100 lines). The cron was producing 266 silent errors/hour with no `host.name`; central-orchestration hot-swap replaces it with per-route observability.
  - `build-images.yaml`: trigger on push to main when `deploy/packer/**` or the workflow itself changes, not on every release. Removes the 30-min CI hit + cross-region image push that Reflog originally flagged.
  - `release.yaml`: drop the `Trigger Packer image builds` step.
  - `auto-tag.yaml`: drop `deploy/packer/**` from the version-bump path filter.

Deploy order

Before merging + rolling out new images built from this code: set `bandit_vps_default_release_tag` (or a per-track override in `bandit_vps_image_targets`) in the lantern-cloud settings. Otherwise cloud-init's apt-install step is skipped, new VMs boot without lantern-box installed, and `systemctl enable --now lantern-box` during config push fails.

The lantern-cloud PR adds the schema + plumbing for setting those values. Merge the lantern-cloud PR first, set the default via `psql` or `lc` CLI, then merge this PR and trigger a fresh packer build.

Test plan

- Set the default tag: `UPDATE settings SET value = 'v0.0.73' WHERE key = 'bandit_vps_default_release_tag'` (or equivalent via `lc`)
- Trigger `build-images.yaml` in this repo for linode+alicloud (skip OCI for speed on first run)
- Confirm `apt-get install lantern-box=0.0.73` runs on first boot
- Confirm `dpkg-query -W -f='${Version}' lantern-box` returns `0.0.73`
- Set `bandit_vps_default_release_tag` to a newer tag; watch BanditVPSHotSwapWorker converge the route
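The version check in the test plan can be scripted. The sketch below is a hypothetical helper, not part of this PR: it assumes the release tag carries a leading `v` that the .deb version does not (as with `v0.0.73` and `0.0.73` above), and it reuses the "expected ... but got ..." wording of the old cron's error message.

```shell
# check_installed_version TAG INSTALLED
# Compare a release tag against an installed package version string.
check_installed_version() {
  expected="${1#v}"   # assumption: tag v0.0.73 maps to .deb version 0.0.73
  installed="$2"
  if [ "$expected" != "$installed" ]; then
    echo "expected ${expected} but got ${installed}" >&2
    return 1
  fi
  echo "ok: lantern-box ${installed}"
}

# Example usage on a freshly booted VM (not executed here):
#   check_installed_version v0.0.73 \
#     "$(dpkg-query -W -f='${Version}' lantern-box)"
```

The same check could be rerun after changing `bandit_vps_default_release_tag` to confirm that the hot-swap converged.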