fix(ci/build): CI reliability — pre-push git-env, per-test timeouts, TriageBotCore double-link#1053
Open
mauricecarrier7 wants to merge 4 commits into
Open
fix(ci/build): CI reliability — pre-push git-env, per-test timeouts, TriageBotCore double-link#1053mauricecarrier7 wants to merge 4 commits into
mauricecarrier7 wants to merge 4 commits into
Conversation
🧪 Unit Test Results📊 View Full Interactive Report ❌ 3 TESTS FAILED7126 tests | 7023 passed | 3 failed | 100 skipped | ⏱️ 17m 21s | 📊 98.6% | 📈 47.8% coverage Tests by Class — 820 classes, 3 with failures
Failed Tests (click to expand)📊 Testing Coverage BreakdownUnit Test Line Coverage (testable surfaces): 47.8% Total coverage incl. UI/lifecycle: 46.0% (18 files excluded from testable denominator — see
🔗 Interactive HTML Report | CI Run Details 📦 Downloadable Artifacts
|
mauricecarrier7
added a commit
that referenced
this pull request
Jun 10, 2026
…between tests (#1056) `AccountsManager.deferInitialLoadCatalogsForTesting` is a process-wide mutable static flag gating whether `AccountsManager.init` spawns its background `loadCatalogs` Task. The test-safe value is `true` (no background work); `PalaceTestSetup.bootstrap()` pins it `true` at bundle load. Tests that need the background load opt IN by setting it `false` in their own setUp. Two sites left it `false` after running, leaking the unsafe value to the next test class — which then constructs a real `AccountsManager`, spawning the 1100+-library registry crawl that outlives the test and pollutes whatever runs next: - `AppContainer._resetForTesting()` — runs after EVERY test via the singleton-reset observer; its final line set the flag false. - `AppContainerResetTests.tearDown()` — explicitly set it false "for production semantics." Both now leave the flag at the test-safe `true`. This is the root cause behind recurring CI flakes that blocked PRs #1052/#1053/#1054 (none of whose own diffs were at fault): - AccountsManagerCancellationTests.testCancelBackgroundWork_onOptOutInstance _isSafeNoOp — opt-out handle non-nil (failed all 3 iterations) - CatalogCacheKeyAndIsolationTests — layout-engine-off-main crash - BookReturnCleverReauthTests — crash → ** TEST FAILED ** with 0 reported failures - TokenRefreshOnForeground / TokenRefreshAndRetryQueue — token/network bleed Regression: AppContainerResetTests.testResetForTesting_disablesBackgroundLoad Catalogs now asserts the flag is `true` post-reset (was asserting false) — the mutation that reintroduces the leak fails it. No production runtime change: the flag is only ever non-default inside an XCTest process (initializer keys off XCTestConfigurationFilePath); `_resetForTesting()` already early-returns outside XCTest. Verified: AppContainerResetTests + AccountsManagerCancellationTests + CatalogCacheKeyAndIsolationTests + TokenRefreshOnForegroundTests + TokenRefreshAndRetryQueueTests + BookReturnCleverReauthTests run together → 41 tests, 0 failures, ** TEST SUCCEEDED **. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
git invokes pre-push hooks with GIT_DIR/GIT_WORK_TREE/GIT_INDEX_FILE exported. xcodebuild's SwiftPM resolution shells out to git internally, and an inherited GIT_WORK_TREE makes local-package paths resolve to "/", failing with: "the package manifest at '/Package.swift' doesn't exist". This blocked EVERY push that touches a Palace/*.swift file from a linked git worktree (and is latent on the main checkout under some git versions) — the same xcodebuild command run from a plain shell succeeds. Fix: wrap the gate's xcodebuild in `env -u GIT_DIR -u GIT_WORK_TREE -u GIT_INDEX_FILE -u GIT_PREFIX -u GIT_EXEC_PATH`. Also scope a per-checkout DerivedData/SourcePackages dir for linked-worktree pushes so they don't collide with the main checkout's warm cache. Reproduced: `-resolvePackageDependencies` succeeds with the git env cleared and fails (/Package.swift) with GIT_DIR+GIT_WORK_TREE set. Verified end-to-end: PR #1052's worktree push now passes the gate green (`PASS — 4 class(es) green`) instead of the false /Package.swift failure. **Scope:** single hook script. **Not done:** a scripts/tests/ fixture asserting the env-clear (follow-up per green-board contract #4 — tooling under CI).
…est fails fast + retries A single deadlocked test currently hangs the ENTIRE build-and-test job until the 60-minute job timeout kills it — the `-retry-tests-on-failure -test-iterations 3` net never engages because the process is stuck, not failed. This is how the pre-existing CatalogPreloaderTests order/timing hang (passes in isolation, in focused pairs, and in a full LOCAL suite run — i.e. CI-runner-timing-specific) reddens every PR's board with a 1h+ job kill. Fix: add `-test-timeouts-enabled YES -default-test-execution-time-allowance 120 -maximum-test-execution-time-allowance 300` (the unit runner never adopted the pattern ui-testing.yml already uses). A hung test now fails at 120s and the existing retry re-runs it — a timing-flaky hang passes on retry (board green); a deterministic hang fails all 3 (board stays red, correctly). Caps worst-case single-test cost at ~360s vs 60min. This does NOT mask pollution (green-board contract #2): the underlying CatalogPreloader/loadCatalogs polluter still needs a root fix and is tracked separately. This change only ensures a hang fails-fast into the retry net instead of killing the job before the net can act (contract #1). **Scope:** single CI runner script. **Not done:** root-cause of the loadCatalogs hang (not locally reproducible; tracked as a test-pollution item).
…cate ObjC class registration) PalaceTests linked the TriageBotCore SPM product directly even though its test-host app (Palace) already links it. At runtime both copies loaded into the test host, registering TriageBotCore's ObjC classes twice: Class _TtC13TriageBotCore...BundleFinder is implemented in both PackageFrameworks/TriageBotCore_...PackageProduct.framework AND Palace.app/Palace.debug.dylib — "may cause spurious casting failures and mysterious crashes." That duplicate registration is a prime suspect for the intermittent CI-runner-timing test instability (hangs whose victim varies by run). PalaceTests imports TriageBot in ZERO files — the link was pure redundancy; the bundle resolves the symbols from its host app at load time. Removed the TriageBotCore product dependency + framework build-file from the PalaceTests target (via the xcodeproj gem; 8-line surgical pbxproj delete). The `.dynamic` library-type fix is blocked by SPM's product-name==target-name rule, so de-linking the test target is the correct fix. Verified: TPPSettingsTests + CatalogPreloaderTests build, link (resolving TriageBotCore from the host), and pass — and the "implemented in both" runtime warning is GONE (0 occurrences, was 2 per launch). testUseBetaLibraries_- publishesViaCombine (a local hang victim) passes in 0.68s. **Scope:** PalaceTests build-config only; no production or test source change.
…spot check, not "green" Codifies the rule whose violation caused a false "verified" claim during PP-4542: a `-only-testing:PalaceTests` run (2359 tests) was reported as the full local suite passing, when (a) CI runs the whole Palace scheme across all test targets with `-test-iterations 3` (~7121 executions), and (b) that subset run had actually hung + `** TEST FAILED **`. Rule: validate via `scripts/xcode-test-optimized.sh` (CI parity) or `scripts/verify-pr.sh --quick` (full scheme); treat `-only-testing` as a scoped spot-check only; never report a subset (or a timeout/restart run) as green; read the top-level rollup, not summed per-suite counts. **Scope:** CLAUDE.md guidance only. Companion to the CI-runner fixes in this PR.
09637e1 to
a37ba9e
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Two CI/test-runner reliability fixes (both make the board trustworthy per the green-board contract).
1. pre-push-test-gate: clear leaked git env before xcodebuild
git invokes pre-push hooks with
GIT_DIR/GIT_WORK_TREEexported; xcodebuild's SwiftPM resolution shells out to git and an inheritedGIT_WORK_TREEmakes local-package paths resolve to/→the package manifest at '/Package.swift' doesn't exist. Blocked every push touching aPalace/*.swiftfile from a linked worktree. Fixed withenv -u GIT_DIR -u GIT_WORK_TREE …+ worktree-scoped DerivedData. Reproduced:-resolvePackageDependenciessucceeds with the env cleared, fails with it set.2. xcode-test-optimized.sh: enable per-test timeouts
A single deadlocked test hangs the entire build-and-test job until the 60-min job timeout — the
-retry-tests-on-failure -test-iterations 3net never engages (process stuck, not failed). This is how the pre-existingCatalogPreloaderTestsorder/timing hang (passes in isolation, in focused pairs, and in a full local suite run — i.e. CI-runner-timing-specific) reddened every PR with a 1h+ job kill. Added-test-timeouts-enabled YES -default-test-execution-time-allowance 120 -maximum-test-execution-time-allowance 300(the patternui-testing.ymlalready uses). A hung test now fails at 120s and the existing retry re-runs it: a timing-flaky hang passes on retry (green); a deterministic hang fails all 3 (red, correctly).Does NOT mask pollution (contract #2) — it ensures a hang fails-fast INTO the retry net instead of killing the job before the net can act (contract #1). The underlying loadCatalogs polluter is tracked separately for a root fix.