Enhance CI commands with repo-init and repo-update functionality #585

Draft
tristanpoland wants to merge 104 commits into genesis-community:v3.2.x-dev from tristanpoland:v3.2.x-dev

@tristanpoland

Member

This pull request introduces support for inline CI configuration via a new ci: section in .genesis/config, alongside various improvements to configuration validation, pattern matching, and pipeline status reporting. It refactors the configuration source detection to prioritize the new inline format, implements a delegation pattern for config section validation, and makes pattern matching more robust and consistent across the codebase. Additionally, it improves the ordering and display of pipeline jobs in status outputs.

Inline CI configuration and validation

  • Added support for defining CI configuration inline in the .genesis/config file via a ci: section, including detection, parsing, and validation logic. Genesis::CI::Compiler is registered as the owner/validator of the ci: section, and validation is delegated to it. (lib/Genesis/CI/Compiler.pm, lib/Genesis/CI/Compiler/Parser.pm, lib/Genesis/Top.pm, lib/Genesis/Config.pm, lib/Genesis/Commands/Pipelines.pm, lib/Genesis/Commands/Pipeline.pm)

  • Implemented a delegation registry in Genesis::Top for config section validation, allowing modules to register themselves as owners of specific top-level keys and handle their own schema/validation. (lib/Genesis/Top.pm, lib/Genesis/Config.pm)
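
The delegation registry described above can be sketched as follows. This is a minimal Python illustration, not the Perl code in Genesis::Top; `register_section_owner`, `validate_config`, and `validate_ci_section` are hypothetical names, and the toy `ci` rule (a mapping containing `provider`) is an assumption made only for the example.

```python
from typing import Callable, Dict, List

# Hypothetical registry: top-level config key -> validator callable.
_SECTION_OWNERS: Dict[str, Callable[[object], List[str]]] = {}


def register_section_owner(key, validator):
    """Record `validator` as the owner of the top-level `key`."""
    _SECTION_OWNERS[key] = validator


def validate_config(config):
    """Return error strings; owned sections are delegated to their owner."""
    errors = []
    for key, value in config.items():
        owner = _SECTION_OWNERS.get(key)
        if owner is not None:
            errors.extend(f"{key}: {msg}" for msg in owner(value))
        # Keys without an owner would go through the regular schema (omitted).
    return errors


def validate_ci_section(section):
    """Toy validator (assumption): ci must be a mapping with a provider."""
    if not isinstance(section, dict):
        return ["must be a mapping"]
    return [] if "provider" in section else ["missing provider"]


register_section_owner("ci", validate_ci_section)
```

The point of the pattern is that the central schema only needs to know that a section is owned by someone else; the owner decides what is valid inside it.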

Pattern matching improvements

  • Refactored all pattern/glob-to-regex conversions (for resources, targets, triggers, and job names) to consistently and safely escape metacharacters, improving reliability of pattern matching throughout the CI pipeline code. (lib/Genesis/CI/Compiler/AST.pm, lib/Genesis/CI/Compiler/PipelineProvider.pm, lib/Genesis/CI/Compiler/Validator.pm, lib/Genesis/CI/Layout.pm)

Pipeline parsing and topology

  • Enhanced pipeline parsing logic to prioritize configuration sources in the following order: .genesis/ci/ directory, ci: section in .genesis/config, then legacy ci.yml file. The parser now normalizes the structure regardless of source, and AST building logic correctly handles environments referenced as upstream even if they lack their own pipeline data. (lib/Genesis/CI/Compiler/Parser.pm, lib/Genesis/CI/Compiler/ASTBuilder.pm)
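
The source priority above can be sketched as a simple first-match check. This is an illustrative Python version under stated assumptions: `detect_ci_source` is a hypothetical name, the parsed .genesis/config is passed in as a dict, and the legacy ci.yml is assumed to sit at the repository root.

```python
from pathlib import Path


def detect_ci_source(repo_root, config):
    """Return which CI config source wins, or None if there is none.

    Order follows the PR description: .genesis/ci/ directory first,
    then an inline ci: section, then the legacy ci.yml file.
    """
    root = Path(repo_root)
    if (root / ".genesis" / "ci").is_dir():
        return "directory"
    if isinstance(config, dict) and "ci" in config:
        return "inline"
    # Assumption: the legacy file lives at the repo root.
    if (root / "ci.yml").is_file():
        return "legacy"
    return None
```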

Pipeline status and display

  • Improved the pipeline status command to display jobs in the order defined by the AST pipeline (reflecting workflow topology), with any extra jobs appended alphabetically. This provides a more accurate and user-friendly status overview. (lib/Genesis/Commands/Pipeline.pm)
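
The ordering rule above (workflow order first, then any remaining jobs alphabetically) can be sketched like this. `order_jobs` is a hypothetical helper written in Python for illustration; it is not the code in lib/Genesis/Commands/Pipeline.pm.

```python
def order_jobs(job_names, workflow_order):
    """Order jobs by workflow position, then append extras alphabetically."""
    present = set(job_names)
    # dict.fromkeys drops duplicates while keeping the workflow order.
    ordered = [n for n in dict.fromkeys(workflow_order) if n in present]
    seen = set(ordered)
    extras = sorted(n for n in present if n not in seen)
    return ordered + extras
```

Jobs named in the workflow but absent from the status data are simply skipped, so a stale AST entry does not produce an empty row.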

Additional fixes and code quality

  • Fixed minor issues and improved code clarity, such as proper handling of vault configuration options, capitalization consistency in output, and removal of unreachable code. (lib/Genesis/Commands/Pipeline.pm, lib/Genesis/Commands/Pipelines.pm, lib/Genesis/CI/Legacy.pm, lib/Genesis/CI/Compiler/Providers/Concourse.pm, lib/Genesis/Top.pm)

tristanpoland and others added 30 commits April 14, 2026 13:15
Introduce repo-init and repo-update commands to manage repository CI configuration. repo-init provides a one-time interactive wizard (or flags) to write ci.provider into .genesis/config and generate .genesis/ci scaffold files (integrations.yml, targets.yml, resources.yml, ci-overrides). repo-update is an idempotent updater that can run non-interactively (apply only provided flags) or launch a pre-populated wizard; it can patch integrations.yml in-place and preserves existing scaffold edits. Implements helper routines for defaults, prompts, file writing, and scaffold templates, and adds unit tests covering config writes and scaffold behavior.
Add two passthrough options for the init command (--commit and --no-commit) and update repo initialization flow to show a summary of staged files (git diff --cached --stat). If --commit is passed the initial state is committed automatically; if --no-commit is passed the commit is skipped; otherwise the user is prompted (default yes). Preserves existing failure handling and the original commit message (Initial Genesis Repo).
Pipeline: fix color tag in pause-pipeline error message and change status listing to follow the pipeline AST workflow order when available (index jobs by name, build ordered list from workflow stage order, append any missing jobs alphabetically).

Repo: add warnings when CI provider is not configured (repo_update and _apply_ci_flags), include all provided flags for non-interactive updates, and improve parsing of integrations.yml to extract vault.url only from the vault: block to avoid false matches. Also simplify _write_ci_config signature. These changes tighten CI config detection and make pipeline status output more predictable.
Make ASTBuilder._build_from_env_files two-pass: collect genesis.pipeline data, include env files that are referenced as prior_env even if they lack a pipeline block, and build edges from the collected prior_env map. Add tests covering inclusion/exclusion of referenced envs and default pipeline flags. Improve Layout pattern handling by constructing a safe regex via quotemeta and splitting on '*' for proper glob semantics. Remove a redundant self->{layout} assignment in Concourse parse, normalize pipeline pause bail messages to use #C, remove an unnecessary update => 1 flag when writing CI config, and simplify a YAML key detection conditional.
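
The two-pass build described in the commit above can be sketched as follows. This is a Python illustration only: `build_topology` is a hypothetical name, and the data layout (a `genesis.pipeline` mapping with a `prior_env` key per env file) is an assumption based on the wording of the commit message.

```python
def build_topology(env_files):
    """env_files maps env name -> parsed env data (dict).

    Returns (nodes, edges) where each edge is a (prior_env, env) pair.
    """
    # Pass 1: collect pipeline blocks and the prior_env map.
    pipeline_data = {}
    prior_of = {}
    for name, data in env_files.items():
        pipeline = (data.get("genesis") or {}).get("pipeline")
        if pipeline is not None:
            pipeline_data[name] = pipeline
            if pipeline.get("prior_env"):
                prior_of[name] = pipeline["prior_env"]

    # Pass 2: nodes are envs with a pipeline block, plus existing env
    # files that are only referenced as someone's prior_env.
    nodes = set(pipeline_data)
    nodes.update(p for p in prior_of.values() if p in env_files)
    edges = sorted((p, e) for e, p in prior_of.items() if p in nodes)
    return sorted(nodes), edges
```

Collecting first and wiring edges second means the result does not depend on the order in which env files are read.
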
Replace naive glob-to-regex substitutions with a split/quotemeta approach so '*' and '?' become '.*' and '.' while other characters are escaped. Apply this safer pattern building in AST (resources_matching/targets_matching), PipelineProvider::matches_pattern, and Validator cross-reference checks to avoid false positives from regex metacharacters (e.g. dots). Treat an empty pipeline hash as env-file-topology mode and skip pipeline validation. Remove a redundant empty else in Legacy and add tests covering glob behavior and the empty-pipeline validation case.
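
A minimal sketch of the split-and-escape approach described above, written in Python with `re.escape` standing in for Perl's quotemeta. The helper names are hypothetical, and matching the whole name (rather than a substring) is an assumption.

```python
import re


def glob_to_regex(pattern):
    """Convert a glob to a regex: '*' -> '.*', '?' -> '.', rest escaped."""
    parts = re.split(r"([*?])", pattern)  # keep the wildcards as tokens
    out = []
    for part in parts:
        if part == "*":
            out.append(".*")
        elif part == "?":
            out.append(".")
        else:
            out.append(re.escape(part))  # literal text, metacharacters escaped
    return re.compile("".join(out))


def glob_matches(pattern, name):
    """True if the whole name matches the glob (anchoring is an assumption)."""
    return glob_to_regex(pattern).fullmatch(name) is not None
```

With a naive substitution, the dot in a pattern such as `us-east.*` would match any character; here it only matches a literal dot.
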
Add support for embedding CI configuration in the repo config (ci: in .genesis/config) and wire it into the compiler pipeline. Genesis::CI::Compiler now registers itself as owner of the ci section, exposes can_compile_from_genesis_config(), and validates the ci: structure via validate_config_section(). Parser selection order was updated to prefer .genesis/ci/, then inline ci:, then legacy ci.yml; a new _parse_genesis_config() normalizes inline data to the same shape as multi-file parsing. Top.pm gains a registry for delegated config-section handlers and delegates validation to registered owners; the repo schema marks ci as opaque. Genesis::Config accepts opaque types (passthrough). Commands updated to detect inline config and preserve backward compatibility. New tests exercise detection, parsing, validation, and the registration mechanism.
New repo-init replaces init with three enhancements:
- Subrepo detection: auto-detects git repo, creates
  subdirectory without .git. --sub/--no-sub flags.
- Vault selection: --skip-vault defers config, or
  interactive selection from safe targets.
- CI provider: --ci-provider flag (scaffold TBD).
Also adds --force to replace existing directories,
skip_vault support in Top.pm, and phased execution
pattern (_parse, _validate, _execute, _report).
Includes 41 integration tests and 10 validation
tests covering all option combinations.
--ci-provider writes ci: section to .genesis/config
and creates empty .genesis/ci/ directory. No scaffold
files generated -- targets resolved from vault exodus
data at pipeline compile time. Tests for all three
providers (concourse, github-actions, manual) and
absence case. Total 55 integration tests.
Provider-specific fields live under ci.provider hash
(type, target, team, etc.) instead of flat ci.provider
string. Avoids key collisions and groups provider
config naturally.
Pipeline needs bundled genesis binary; repos without
CI provider do not. Also updated test to verify
.genesis/bin absent without provider, present with.
Merge parse+validate into single validation phase
with proper ordering: check options, validate kit
source, check git config, detect git repo, check
existing directory, prompt for vault, summarize.
Directory deletion deferred to execute phase.
Use absolute paths for directory checks. Kit
provider resolution in validation for early fail.
Tests use local dev-kit links and kit tarballs
from t/ fixtures instead of network downloads.
Filed FWT-921 for pre-existing provider check bug.
Move directory existence check before kit provider
network calls. Pass pre-built kit_provider through
to Top->create to avoid re-fetching version list.
Add "Removing existing directory" message in execute.
Validate specific kit version against provider list.
Introduce DEFAULT_CONTROL_BRANCH ('control') and a ci_control_branch
method on Genesis::Top that reads ci.control_branch from
.genesis/config with the constant as a fallback.  Not exposed as a
user-facing option at this time; provides a single place to change
the name later.
Drop --[no-]sub (subdir mode is now auto-detected) and --commit
(auto-commit is the default).  Add --no-commit to stage without
committing, and --reason to override the commit message.  --force
also bypasses the "clean enclosing repo" preflight in subdir mode.

Guardrail against nested Genesis repos via a walk-up helper that
reuses Top->is_repo.  In subdir mode, require the enclosing repo to
be on the CI control branch (no bypass) and to have no tracked
changes (bypassable with --force).

Refactor the git flow: git init only in standalone mode; git add .
in both; pathspec-scoped commit in subdir mode so unrelated parent
index entries are not bundled.  Standalone init forces the initial
branch via git symbolic-ref so the first commit lands on 'control'
regardless of init.defaultBranch.  Always record ci.control_branch
in .genesis/config.

Move error detection, cleanup (rmtree, subdir index reset), and the
final bail message from execute into report.  Treat post-commit
housekeeping failures (popd, vault URL resolution) as non-fatal: the
repo is already created; warn instead of failing.
Park Tristan's duplicate repo-init definitions: rename his handler
to repo_configure_ci and comment out his define_command block so
our phased version dispatches correctly.

Move Kit::Provider->parse_opts to the top of validate so
passthrough flags are stripped before reading $args[0] as the
deployment name.

Gate the control-branch check and git symbolic-ref on
--ci-provider; without it no pipeline topology is established.
Drop ci.control_branch config write and control-branch plan line.
Report commit hash and message after the initial commit.
Introduce a unified CI provider abstraction and provider options system. Adds a PipelineProvider registry and CLI parsing/helpers for provider-specific flags and help text; extends the Compiler to accept provider_opts and validate ci.provider config against provider schemas. Adds a Genesis::CI::Provider factory/base class and concrete Provider implementations for Concourse, GitHub Actions, and Manual (including config/interactive helpers and CLI specs). Concourse provider updated with option defaults, provider_option accessors, describe_provider, and deploy logic using three-tier option resolution (CLI > ci.provider config > defaults). Integrates provider parsing and deployment into Commands/Pipelines and repo-init flow, updates validation in Validator/Compiler, and adds tests for the provider option system.
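
The three-tier option resolution mentioned above (CLI > ci.provider config > defaults) can be sketched as a first-defined lookup. `resolve_option` and the option names in the usage note are illustrative, not the provider's actual accessors.

```python
def resolve_option(name, cli_opts, provider_config, defaults):
    """Return the first defined value: CLI, then config, then defaults."""
    for source in (cli_opts, provider_config, defaults):
        value = source.get(name)
        if value is not None:  # False and 0 are legitimate values
            return value
    return None
```

Testing for `None` rather than truthiness matters for boolean options: an explicit `False` on the command line should still override a `True` in the config.
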
Introduce --prior-env, --require-pr and --manual CLI options and interactive prompts for pipeline configuration when a CI provider is present. Validate --prior-env against existing environments, provide a numbered choice menu in interactive mode, and support non-interactive scripted use. Only write a pipeline: section if there is data to record; avoid overwriting an existing pipeline section and inject metadata after the env: line using flexible indentation, with warnings if injection fails. Also update bin/genesis help text and display the configured CI provider type.
Introduce a check_prereqs() hook on Provider and PipelineProvider to allow providers to verify required external tooling before performing operations. Concourse implementations now check for the fly CLI in PATH and optionally enforce a minimum fly version (min_fly_version), emitting user-friendly error messages when unmet. Commands that invoke provider actions (repo init and pipeline deploy) now call check_prereqs() and bail on failure to avoid runtime errors. Tests updated/added to cover presence/absence of fly and version checks.
Replace bail("CI provider prerequisite check failed") unless check_prereqs() with check_prereqs() or exit 86 in Pipelines.pm and Repo.pm. This makes the process terminate with status 86 when a CI provider prerequisite check fails instead of calling bail().
Allow CI provider interactive_wizard methods to accept preset CLI flags so wizards can be pre-filled when invoked non-interactively. Concourse and GitHub Actions wizards now take %opts (supporting ci-target, ci-team, ci-insecure, ci-github-repo, ci-github-branch) and fall back to prompts when options are absent. Repo command now forwards %ci_provider_opts into the wizard when running interactively. Also: simplify the check_prereqs comment, switch Concourse's fly lookup to use run('type -p fly') (with safe chomp), remove the unused Manual::opts stub, and preserve boolean handling for insecure. These changes improve scripting and automation of CI setup.
Replace shell backtick `which fly` with run({ stderr => 0 }, 'type -p fly') in Concourse provider to more reliably detect the fly CLI and avoid relying on external which; ensure $fly_path is defined and remove an unused $ok variable. Also simplify and reword prerequisite comment in PipelineProvider, and remove several redundant/comment-only lines in Env command (no functional behavior changes). These edits improve portability and clean up documentation in the codebase.
Replace the ad-hoc extended_usage closure pattern with a structured
extended_handlers registration in define_command.  Each handler class
implements parse_opts, opts_help, and opts_slot; the framework
validates and loads handlers in parse_options, calls each in declared
order to consume passthrough flags, merges results into
COMMAND_OPTIONS under the handler's declared slot name, and rejects
any unclaimed option-like tokens left in COMMAND_ARGS.

extended_handlers implies option_passthrough so commands no longer
need to declare both.  Legacy extended_usage is preserved as a
fallback for unmigrated commands.

Migrate repo-init and kit-provider to extended_handlers.  Simplify
_repo_init_validate to read kit provider opts from get_options()
instead of calling parse_opts directly.
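
The handler flow described above (each handler consumes its flags in declared order, results land under the handler's slot, leftover option-like tokens are rejected) can be sketched as follows. This is a Python illustration; `SingleFlagHandler` and `parse_passthrough` are hypothetical, and real handlers also provide opts_help, which is omitted here.

```python
class SingleFlagHandler:
    """Hypothetical handler that claims one `--flag value` pair."""

    def __init__(self, flag, slot):
        self.flag = flag
        self.slot = slot

    def opts_slot(self):
        return self.slot

    def parse_opts(self, args):
        """Return (claimed options, remaining args)."""
        opts, rest = {}, []
        tokens = iter(args)
        for token in tokens:
            if token == self.flag:
                opts[self.flag.lstrip("-")] = next(tokens, None)
            else:
                rest.append(token)
        return opts, rest


def parse_passthrough(args, handlers):
    """Run handlers in declared order, then reject unclaimed options."""
    command_options = {}
    remaining = list(args)
    for handler in handlers:
        opts, remaining = handler.parse_opts(remaining)
        command_options[handler.opts_slot()] = opts
    unclaimed = [t for t in remaining if t.startswith("-") and len(t) > 1]
    if unclaimed:
        raise ValueError("unrecognized option(s): " + ", ".join(unclaimed))
    return command_options, remaining
```
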
Concourse interactive_wizard reads fly targets output via
parse_fixed_width_table for name/url/team/expiry, merges with
flyrc for the insecure flag.  Columnar display with aligned
secure/insecure and EXPIRED! flags.  Adds "Create a new target"
option with separator.  Runs fly login inline for new targets.
Fix bless-into-reference error (use ref($self) in wizard).

Reorder _repo_init_validate so the CI provider wizard runs after
directory and kit validation, not before.

Fix new_prompt_for_choice selecting first item when no default
specified (undef == 0 in numeric context).
Add use v5.20, use warnings, use strict to Genesis::UI. Fix all
violations: undeclared $section_offset (was a leaking package
global that corrupted numbering across calls), $in predeclaration,
$sections hash vs hashref, $terminal_width missing parens,
$default_choice/$selection_map scoping.

Fix undef == 0 bug where no default silently selected the first
item (add defined() guard on $default_idx comparison).

Add three regression tests for new_prompt_for_choice: no-default
re-prompt, separator numbering, and sequential call independence.
Wizard reads fly targets + flyrc for columnar display with
name/url/team/secure/expired flags.  User picks an existing
target (re-authenticates if expired) or creates a new one by
entering url/team/insecure and running fly login inline.

Update init() to support two modes: existing target by name
(resolved from flyrc, rejects conflicting flags) or new target
from --ci-url + --ci-team + optional --ci-target override.
Add --ci-url to opts().  Store url in new() and config().
Derive target names from url subdomain + team.
Re-add check_prereqs call and %ci_provider_opts pre-fill to the
CI provider init block at step 7 of _repo_init_validate.  These
were in Tristan's commits but lost when the block was moved from
step 2 to step 7 during the validation reorder.
After staging, check if the only diff is the "Last updated"
comment and/or updater/creator_version in .genesis/config.
If so, roll back the config file and report no meaningful
changes instead of creating an empty-content commit.
Add control-branch validation, git commit, and environment branch
creation to genesis new when CI is configured.  Add --no-commit
and --reason flags.  Migrate prior_env prompt to
new_prompt_for_choice.  Standardize --force to -f everywhere.

Fix iaas() for non-OCFP kits: derive IaaS from kit.features when
kit.iaas is absent, using lookup to avoid is_ocfp/features hook
recursion.  Fix create-env default for BOSH director kits with
use_create_env: allow.

Fix remove_secrets purge mode (all => 'purge') to skip
secrets_plan when env file doesn't exist.  Fix --force to move
existing env file to .old before running the new hook.  Fix
credhub_connection_env to skip empty bosh_env for create-env
environments.
tristanpoland and others added 30 commits April 24, 2026 14:27
Adds a generic "deployment status signal" abstraction so pipelines can emit and consume deploy success/failure/abort/error events across environments and external tools.
On pipeline-managed envs, `genesis deploy` without --reason now
derives the deploy reason from the `[pipeline] control@<sha>`
commits between the last deployed commit and the env-branch HEAD,
using the control-side subjects as the reason text.  exodus records
the reason plus git.commit / git.control_commit; the reason itself
no longer repeats SHAs.

On successful deploy under a manual provider, deploy auto-invokes
`genesis propagate <env>` to cascade the certified state to
children.  --no-propagate skips the cascade; non-manual providers
are unaffected (they own their own cascade wiring).

Cascade propagation now sources from the ancestor's certified
`git.control_commit` rather than control HEAD.  Root propagation
still uses HEAD.  This keeps commits travelling as a unit through
the chain: lab receives exactly what mgmt deployed, regardless of
later commits landing on control.  The entry-point algorithm gains
an `env_undeployed` input so descendants are blocked from cascading
past ancestors that have received a change on-branch but not yet
deployed it.

ASTBuilder now filters to `Top::has_env` so shared parent config
files (e.g. lmelt.yml with only required_files) aren't treated as
pipeline nodes.  pipeline-status shows control SHAs in two columns
(branch / deploy) — what's sitting on the env branch vs. what was
certified by the last deploy — with underlined headers.
is_ocfp now reads kit.features directly instead of via has_feature
to break a recursion cycle that fired during the bosh director
bootstrap chain (bosh -> user_provided_bosh_creds_policy ->
ocfp_config -> is_ocfp -> features hook -> get_environment_variables
-> scale -> bosh).  Skipping iaas/scale env vars during the features
hook avoids the cycle while preserving them for every other hook.

CPI config management and cloud-config generation are gated on
is_ocfp.  Non-OCFP envs have those managed externally regardless
of whether the kit happens to provide cpi-config or cloud-config
hooks; surfacing warnings about missing hooks for non-OCFP envs
is misleading noise.

Service::BOSH::Director::stemcells now treats empty-string CPI as
'<default>' (BOSH reports empty cpi for default-cpi stemcells; the
defined-or operator was leaving the empty string and the truthy
filter then dropped it).  Default-cpi stemcell lookups for non-OCFP
envs now resolve correctly.

Other small fixes:
- _get_stemcell_status: guard alt_existing_cpis when no matching
  stemcells exist for the requested OS ($newest is undef)
- deploy: scalar-context director_exodus_lookup so the second
  return value (key) doesn't leak into notify's argument list
- Cloud-config 'no hook' warning: include the kit id in the message
Cascade propagate now skips envs whose last propagation marker
already references a commit at-or-descended-from the cascade
source — the env is ahead, and propagating would regress its
state to an older commit.  Skipped envs are reported with their
current control sha so the operator sees they were intentionally
left alone, not silently rolled back.

Service::Git::is_ancestor wraps git merge-base --is-ancestor for
this check.

Propagate's uncommitted-changes guard now bails on --dry-run too.
Previously dry-run skipped the check, so an operator with local
edits could see a misleading preview that didn't reflect what
the next real run would do.

Header line in pipeline-status shows underlined column labels
(branch / deploy / status) using #u{}.

ASCII hyphen instead of em-dash in the cascade-skip notice line —
the em-dash was getting mangled in some terminal encodings.
- Reorder post-deploy: write exodus before running reactions, so
  reaction scripts using `genesis bosh --self` find fresh
  connection details
- Move auto-propagation from genesis deploy command into
  Genesis::Env::_post_deploy, between reactions and the kit
  post-deploy hook, so propagation output sits in a natural place
  in the deploy stream
- Failure path runs reactions before bailing, so cleanup scripts
  still fire (with GENESIS_DEPLOY_RC != 0)
- propagate <env> with no downstream environments is now a noop
  notice rather than a fatal bail
- Standardize user-facing language on 'propagate'/'propagation'
  instead of 'cascade'
- derived deploy reason output uses $env->notify so it carries the
  env prefix consistent with other notifications
Newer credhub CLIs (2.9+) return "value": "<redacted>" in the
JSON response of `credhub set`.  The credential is stored
correctly, but the response no longer echoes the actual value.

The Service::Credhub set() method was unconditionally caching
$result->{value}, so any subsequent get() returned the literal
string "<redacted>" instead of the secret -- which broke the
entombment readback verification (every secret was flagged as
"failed" even though it was successfully written).

If credhub returns the actual value, cache it as before.  If it
returns the "<redacted>" sentinel, drop the cache entry instead
so the next get() pulls the live value from credhub and the
readback verification has something real to compare.
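
A minimal sketch of the cache rule described above: cache the echoed value when the backend returns it, and drop the entry when it returns the `<redacted>` sentinel. This is Python for illustration; `SecretCache` and its injected `backend_set`/`backend_get` callables are hypothetical stand-ins for the credhub CLI calls.

```python
REDACTED = "<redacted>"


class SecretCache:
    """Write-through cache that never stores the redaction sentinel."""

    def __init__(self, backend_set, backend_get):
        self._backend_set = backend_set  # returns the parsed JSON response
        self._backend_get = backend_get  # returns the live secret value
        self._cache = {}

    def set(self, path, value):
        result = self._backend_set(path, value)
        returned = result.get("value")
        if returned is None or returned == REDACTED:
            # Nothing trustworthy to cache; force the next get() to go live.
            self._cache.pop(path, None)
        else:
            self._cache[path] = returned

    def get(self, path):
        if path not in self._cache:
            self._cache[path] = self._backend_get(path)
        return self._cache[path]
```
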
The original mkfifo construct in _validate_value occasionally
deadlocked under load.  An attempted fix using stdin pipes via
/dev/stdin tripped over OpenSSH's permission check (pipes default
to mode 0660; ssh-keygen refuses to load private keys from world-
or group-readable paths).

Switch to a properly-managed named pipe in a private 0700 temp
directory:
  - mkfifo -m 600 keeps the fifo path at owner-only perms
  - ssh-keygen's permission check is skipped for non-regular files
    (S_ISFIFO != S_ISREG), so the fifo is accepted
  - the writer is backgrounded with { ... & } and a wait clause
    keeps the open()/open() rendezvous robust without deadlocks
  - File::Temp tempdir auto-cleans on scope exit, so even on
    abnormal exit the fifos are removed

No plaintext is written to disk: the secret flows through the
fifo's in-kernel buffer, never landing on a block device.
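
The named-pipe arrangement above can be sketched in Python as follows. This is an illustration of the technique, not the shell construct used in _validate_value: a writer thread replaces the backgrounded `{ ... & }` block, and `pass_secret_via_fifo` and `consumer` are hypothetical names. It assumes a POSIX system.

```python
import os
import tempfile
import threading


def pass_secret_via_fifo(secret, consumer):
    """Hand `secret` to consumer(path) through an owner-only named pipe."""
    # TemporaryDirectory uses mkdtemp, which creates the dir with mode 0700,
    # and removes it (and the fifo) when the block exits.
    with tempfile.TemporaryDirectory() as tmpdir:
        fifo = os.path.join(tmpdir, "secret.fifo")
        os.mkfifo(fifo, 0o600)

        def writer():
            # open() blocks until the consumer opens the read end.
            with open(fifo, "w") as fh:
                fh.write(secret)

        thread = threading.Thread(target=writer, daemon=True)
        thread.start()
        try:
            return consumer(fifo)
        finally:
            # Bounded wait so a consumer that never reads cannot hang us.
            thread.join(timeout=5)
```

The consumer must actually open the path for reading; the bounded join only prevents the caller from hanging if it does not.
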
Both surfaced when creating a non-mgmt env (lmelt-vsphere-canwest-1-qa)
in a multi-kit repo where the deployment lives in a subdirectory of
the git root.

git add was passing $env->file (a bare basename like
'lmelt-vsphere-canwest-1-qa.yml') but Service::Git runs from the
git root, so the path needs the kit prefix (bosh/<name>.yml).
Without it, git silently treated the path as non-matching and the
subsequent commit failed with 'nothing added to commit but
untracked files present'.  Pass it through $git->prefixed.

Service::Git was also being created without track_branch, so when
prune_branch checked out the new env branch and called
restore_branch, restore_branch was a no-op (it bails when
_original_branch is undef).  The user was left sitting on the new
env branch instead of being returned to control.  Adding
track_branch => 1 records control as the original and lets
restore_branch put us back.
Each status row now begins with a Unicode glyph that mirrors its
color so state is recognizable at a glance:

  ✔  deployed              (green)
  ⚠  pending / blocked     (yellow)
  ◇  synced, pending deploy (yellow)
  •  not propagated        (muted)
  ✘  load error            (red)

Also relabel the no-branch case from "no branch" (red) to "not
propagated" (muted) — semantically it is a state, not an error.
When the configured ci.provider.type is 'manual', there is no CI
pipeline to compile or apply -- Genesis itself is the deployer.
Refuse early with a message that names the valid alternatives,
instead of falling through to the schema validator which rejects
'manual' as an unknown provider.
Introduce opt-in BOSH config drift tracking for generated pipelines via a new configuration option track_bosh_configs. Adds documentation (docs/workflows/bosh-config-tracking.md), unit tests, and compiler support:

- ASTBuilder: capture per-env track_bosh_configs from env files.
- PipelineDescriptor: emit bosh-config resources (cloud/runtime/cpi) per-env when configured; wire get steps into notify, deploy and redeploy jobs with correct trigger semantics; add helper methods to normalize and resolve configured types.
- New unit tests: t/unit-tests/genesis_ci_pipeline_descriptor-bosh-configs.t covering resource emission, sourcing, icons, per-env overrides, and job wiring.

Also includes a small UX and correctness cleanup:
- Env.pm: warn and prompt when manually deploying a pipeline-managed env outside CI.
- Env.pm version validation: use semver comparison to detect envs requiring newer repo versions.
- DeploymentManager: prefer pipeline-recorded control_commit and fall back to the CI control branch HEAD when missing.

These changes enable tracking of cloud/runtime/cpi director configs and automatic triggering of CI jobs when out-of-band config changes are detected.
Previously a row whose env failed to load showed only "load error"
with no SHAs and no clue why -- the user had to re-run the env
manually to see the cause.  Now the row keeps the (git-only) branch
SHA and the status text reads e.g.:

  load error: Unable to locate v3.0.3 of bosh kit for ...

A small _summarize_load_error helper trims the captured $@ down to
the first substantive line, strips ANSI/[FATAL]/decorative prose,
and caps at 80 chars so the message fits the row.
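
A rough Python sketch of the trimming described above: strip ANSI sequences and the `[FATAL]` tag, take the first non-empty line, and cap it at 80 characters. `summarize_load_error` is a hypothetical stand-in for the Perl helper; the `...` truncation marker and the fallback text are assumptions, and the "decorative prose" filtering is not reproduced.

```python
import re

# Matches common ANSI CSI sequences such as colour codes.
ANSI = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")


def summarize_load_error(message, limit=80):
    """Return the first substantive line of an error, capped at `limit`."""
    for raw in message.splitlines():
        line = ANSI.sub("", raw).replace("[FATAL]", "").strip()
        if line:
            if len(line) <= limit:
                return line
            return line[: limit - 3] + "..."  # marker is an assumption
    return "unknown error"  # fallback text is an assumption
```
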
Kit hooks converted from bash to Perl call helpers like
$self->env->genesis_config_block as if Genesis::Env had Perl twins
of the bash helpers in Genesis::Helpers.pm -- but the conversion
left those methods unimplemented, so the converted hooks die with
"Can't locate object method" the moment they run.

genesis_config_block mirrors the bash helper's output: env,
bosh_env, vault, min_version, secrets/exodus/ci mounts when
overridden, root_ca_path, credhub_env -- with key-aligned padding.
write_manifest is a thin mkfile_or_fail wrapper so hooks can ship
their generated yaml without hand-rolling path resolution.
Replaces remove-only prune_branch with bidirectional prepare_branch:

  - creates the branch if missing
  - computes both add and remove sets against propagation_files
  - lands them in a single commit
  - uses pushd/popd around the branch swap so a deployment subdir
    that doesn't exist on the target branch doesn't fatal during
    cwd restoration

Lets a second deployment in the same repo (e.g. vault alongside
bosh) land its files on an existing env branch instead of being
silently dropped because the branch was created when only the
first deployment existed.  Files outside our git prefix are left
alone -- multi-deploy repos own their own corners of the branch.
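
The add/remove computation above can be sketched with set differences restricted to the deployment's own prefix. This Python sketch tracks file presence only (content changes are out of scope here); `plan_branch_changes` and the example paths are hypothetical.

```python
def plan_branch_changes(propagation_files, branch_files, prefix):
    """Return (to_add, to_remove) for files under `prefix` only.

    Files on the branch outside `prefix` belong to other deployments
    and are never scheduled for removal.
    """
    wanted = set(propagation_files)
    owned = {f for f in branch_files if f.startswith(prefix)}
    to_add = sorted(wanted - owned)
    to_remove = sorted(owned - wanted)
    return to_add, to_remove
```

For example, with `prefix="bosh/"`, a `vault/` file already on the env branch appears in neither list, so a second deployment can share the branch without its files being pruned.
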
_derive_deploy_reason walked the entire env-branch range and surfaced
the subject of every [pipeline] control@<sha> marker it found.  In
multi-deploy repos where bosh/ and vault/ share an env branch, that
meant a bosh deploy's reason ended up listing vault-related commit
subjects (and vice versa) -- accurate to "what's on the branch", but
misleading about what's actually being deployed.

Filter the git log via the env's propagation_files pathspec so only
commits that touched files this env depends on contribute to the
reason.  Single-deploy repos are unaffected (every env-branch commit
touches that env's files by construction).

Adds an optional `paths` parameter to Service::Git::log_subjects.
Two-part change to the CI-configured deploy path:

1. Pull the env branch (--ff-only) right after the auto-checkout, so
   another operator's or a CI run's push doesn't get clobbered by a
   stale-state deploy.  Bails on divergence instead of continuing.

2. Snapshot the working tree under the deployment prefix before the
   deploy, then re-snapshot after.  Files genesis newly touched under
   .genesis/manifests/ are added, committed as
   "[deploy] <env> @ <sha>", pull --rebase'd against the remote, and
   pushed.  Files newly touched outside that path are reported via
   warning() so the operator can review -- they're not committed.

When manifest_store is 'repository' or 'hybrid' (the default), this
ensures the rendered manifest, state, and creds files end up on the
env branch instead of orphaned in the working tree, lost on the next
checkout.  Pure 'exodus' configs are unaffected -- nothing local to
commit -- as are non-CI repos.
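
The before/after snapshot comparison in step 2 can be sketched as follows. This Python illustration assumes each snapshot is a `{path: status-code}` mapping and that paths carry the deployment prefix; `classify_changes` is a hypothetical name.

```python
def classify_changes(before, after, prefix=""):
    """Split newly touched paths into (to_commit, to_warn).

    A path counts as touched when its status code differs from the
    baseline snapshot or when it was absent from the baseline.
    """
    manifest_root = prefix + ".genesis/manifests/"
    touched = sorted(p for p, code in after.items() if before.get(p) != code)
    to_commit = [p for p in touched if p.startswith(manifest_root)]
    to_warn = [p for p in touched if not p.startswith(manifest_root)]
    return to_commit, to_warn
```

Files that were already dirty before the deploy have the same code in both snapshots, so they are neither committed nor reported.
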
The pull-before-deploy and commit-and-push-after-deploy logic added
in 8122ecd lived in Genesis::Commands::Env::deploy, which meant the
auto-cascade inside Genesis::Env::_post_deploy ran first and tried to
checkout control with the freshly-rendered manifest still uncommitted
in the working tree -- "your local changes would be overwritten by
checkout", aborting the deploy.

Move the whole git lifecycle next to the deploy itself:

  - Env::_pre_deploy  - checkout to env branch + ff-only pull from
                        remote + baseline status snapshot, stash in
                        deployment_state
  - Env::_post_deploy - status diff against the baseline, warn on
                        unexpected changes outside .genesis/manifests,
                        add + commit + pull --rebase + push manifest
                        artifacts, then run the existing auto-cascade
                        (now finds a clean tree)

Both are inside the same ci_configured block so non-pipeline repos
are unaffected.

Three new Service::Git methods replace the bare `run('git', ...)`
calls so the abstraction stays clean:

  - status(@pathspecs)            -> {path => XY-code} hashref
  - pull_ff_only($branch, $remote)
  - pull_rebase($branch, $remote)

Commands::Env::deploy is stripped of the inline branch logic; it
keeps only a read-only Service::Git handle for the derived-reason
lookup that runs before $env->deploy().
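
A `{path => XY-code}` mapping like the one returned by status() can be built from `git status --porcelain` (v1) output. The parser below is a Python sketch under that assumption; it is not the Service::Git implementation and does not handle quoted paths.

```python
def parse_porcelain_status(output):
    """Parse `git status --porcelain` v1 text into {path: XY-code}."""
    status = {}
    for line in output.splitlines():
        if len(line) < 4:
            continue
        code, path = line[:2], line[3:]
        # Renames and copies are reported as "ORIG -> NEW"; keep the new path.
        if code[0] in "RC" and " -> " in path:
            path = path.split(" -> ", 1)[1]
        status[path] = code
    return status
```

Keeping the two-character code intact (including a leading space) preserves the distinction between staged and unstaged changes.
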
Introduce a --pull option (implied by -F/--fix-checks, opt-out via --no-pull) to pull propagated files from a pipeline predecessor (or control HEAD for entry points) onto the environment branch before deploying. Parse and remove the pull option from deploy options, run the pull only when CI is configured and pipeline git is available, and bail if the env branch is missing or working tree is dirty. Add helper routines: _assert_prior_env_deployed (ensure predecessor had a successful deploy), _get_source_sha_for_pull (determine control SHA to pull), and _apply_pull_propagation (diff, copy/remove files, and commit a [pipeline] control@<sha> marker). Also refactor CI/manual-deploy checks to enforce the prior-env invariant and preserve the existing manual warning/confirmation flow.
Follow-ups to 04e5242's lifecycle refactor:

  - Move the env-branch checkout + ff-only pull back to the top of
    Genesis::Commands::Env::deploy (before load_env), so all the
    preflight that runs there -- cloud-config download, manifest
    viability, secret checks, stemcell checks -- happens against
    the env-branch view rather than whatever branch the operator
    happened to be on.  _pre_deploy keeps only the baseline status
    snapshot (which is meaningless before the checkout but trivial
    afterward).

  - Suppress the "manually deploying a pipeline-managed env" warning
    for the manual CI provider -- in that mode the operator is the
    pipeline, so the warning is just noise.

  - Fix prefixed() scalar-vs-list bug in _post_deploy: assigning the
    one-element list to a scalar gave the count (1), causing the
    "outside 1/" warning and missing the actual manifest commits.
    Use list-context assignment.

  - Add "- from branch <name>" to the "Preparing to deploy" info
    block (CI-configured envs only) and a trailing blank line for
    consistency with surrounding blocks.
Both commit paths in `genesis propagate` (PR-based and direct)
reported only a count -- "propagated N files" -- with no clue as to
which files actually moved.  The dry-run path already lists each
file with M/D markers; bring the same listing to the real-deploy
paths so operators can verify that the change they expected (e.g. a
.genesis/config tweak) actually crossed onto the env branch.
Replace per-branch fetches in pipeline status with a single fetch_branches call and a separate fast-forward pull for the current branch. Adds Service::Git::fetch_branches which builds refspecs for the given branches (skipping the checked-out branch), runs one git fetch to avoid repeated credential prompts, and updates the branch cache. Keeps eval wrappers to avoid breaking status when the remote is unreachable.
Introduce a --no-fetch option to several commands (create, deploy, pipeline-status, propagate) to skip the pre-command refresh of pipeline environment branches for offline or unreachable-remote use. Implement _fetch_pipeline_envs in Genesis::Commands::Pipelines to walk the pipeline DAG, collect env branch names, and perform a single bulk git fetch (errors are swallowed so commands still run offline). Update Env.pm to call the bulk fetch during create and deploy (and skip it when --no-fetch is set). Adjust Service::Git::fetch_branches to accept an array of branch names (and optional remote), build refspecs, skip the current branch, perform one fetch, and update the branch cache.
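
The single bulk fetch can be sketched as building one `git fetch` command line. This Python sketch assumes `branch:branch` refspecs and a default remote of `origin`; the exact refspec form used by Service::Git::fetch_branches is not stated in the commit message, and `build_fetch_command` is a hypothetical name.

```python
def build_fetch_command(branches, current_branch, remote="origin"):
    """Return one `git fetch` argv covering all branches, or None.

    The checked-out branch is skipped because git refuses to fetch
    directly into the branch that is currently checked out.
    """
    refspecs = [
        f"{branch}:{branch}"
        for branch in dict.fromkeys(branches)  # de-duplicate, keep order
        if branch != current_branch
    ]
    if not refspecs:
        return None
    return ["git", "fetch", remote, *refspecs]
```

Issuing one command for all branches means at most one credential prompt, which is the motivation given in the commit message.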