-
Notifications
You must be signed in to change notification settings - Fork 311
fix(ci): ignore stale and resolved Codex inline findings #981
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
Komzpa
wants to merge
6
commits into
main
Choose a base branch
from
fix/codex-label-original-commit
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
6 commits
Select commit
Hold shift + click to select a range
e8c1d68
fix(ci): ignore stale rebased codex inline comments
Komzpa c91b513
fix(ci): ignore superseded duplicate check runs
Komzpa 681bfca
docs(openspec): document codex label evidence
Komzpa d71cb64
fix(ci): ignore resolved Codex inline findings
Komzpa 706eff1
fix(ci): keep current commit Codex evidence
Komzpa 6e24755
ci: resync codex labels after thread resolution
Komzpa File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
26 changes: 26 additions & 0 deletions
26
openspec/changes/fix-codex-label-current-head-evidence/proposal.md
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| # Fix Codex label current-head evidence | ||
|
|
||
| ## Why | ||
|
|
||
| The Codex label synchronizer can leave `needs work` on rebased pull requests | ||
| when a stale inline review thread remains unresolved but no longer belongs to | ||
| the current head. It can also misclassify duplicate check runs when an older | ||
| run completes after a newer rerun starts. | ||
|
|
||
| Both cases make PR readiness depend on stale GitHub evidence instead of the | ||
| current head. | ||
|
|
||
| ## What Changes | ||
|
|
||
| - Filter unresolved Codex inline review threads with the same current-head and | ||
| original-commit evidence rules used for REST review comments. | ||
| - Treat unresolved threads without current-head evidence as stale rather than | ||
| blocking. | ||
| - Deduplicate check runs by run start/creation recency instead of completion | ||
| time so late-finishing superseded runs cannot override newer reruns. | ||
|
|
||
| ## Impact | ||
|
|
||
| - **Spec**: `github-automation` | ||
| - **Code**: `.github/scripts/sync_codex_ok_labels.py` | ||
| - **Tests**: `tests/unit/test_sync_codex_ok_labels.py` |
65 changes: 65 additions & 0 deletions
65
...c/changes/fix-codex-label-current-head-evidence/specs/github-automation/spec.md
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,65 @@ | ||
| ## MODIFIED Requirements | ||
|
|
||
| ### Requirement: Codex review label sync uses current-head evidence | ||
|
|
||
| The Codex label synchronization script MUST grant `🤖 codex: ok` only when the | ||
| current pull-request head has green required checks, a clean Codex review for | ||
| that head, and no unresolved current-head Codex finding threads. It MUST treat | ||
| stale unresolved Codex inline threads as non-blocking when neither their | ||
| current commit, original commit, nor body text ties them to the current head. | ||
|
|
||
| #### Scenario: stale rebased inline thread remains unresolved | ||
|
|
||
| - **GIVEN** a pull request was rebased after a Codex inline finding | ||
| - **AND** the unresolved GraphQL review thread still reports `isOutdated=false` | ||
| - **AND** the thread's current commit is not the current head | ||
| - **AND** the thread's original commit is not the current head | ||
| - **AND** the thread body does not mention the current head | ||
| - **WHEN** the label synchronizer evaluates the pull request | ||
| - **THEN** that thread does not force `🤖 codex: needs work` | ||
|
|
||
| #### Scenario: reanchored unresolved inline thread belongs to the current head | ||
|
|
||
| - **GIVEN** an unresolved Codex inline finding thread | ||
| - **AND** the thread's current commit is the pull request head | ||
| - **AND** the thread's original commit is older than the pull request head | ||
| - **WHEN** the label synchronizer evaluates the pull request | ||
| - **THEN** that thread blocks `🤖 codex: ok` | ||
| - **AND** the synchronizer records a needs-work reason that links to the thread | ||
|
|
||
| #### Scenario: unresolved inline thread belongs to the current head | ||
|
|
||
| - **GIVEN** an unresolved Codex inline finding thread | ||
| - **AND** the thread's original commit is the pull request head | ||
| - **WHEN** the label synchronizer evaluates the pull request | ||
| - **THEN** that thread blocks `🤖 codex: ok` | ||
| - **AND** the synchronizer records a needs-work reason that links to the thread | ||
|
|
||
| #### Scenario: unresolved inline thread mentions the current head explicitly | ||
|
|
||
| - **GIVEN** an unresolved Codex inline finding thread | ||
| - **AND** the thread body mentions the current pull request head | ||
| - **WHEN** the label synchronizer evaluates the pull request | ||
| - **THEN** that thread blocks `🤖 codex: ok` | ||
| - **AND** the synchronizer records a needs-work reason that links to the thread | ||
|
|
||
| #### Scenario: resolved inline thread triggers label resynchronization | ||
|
|
||
| - **GIVEN** a pull request has a `🤖 codex: needs work` label from an unresolved Codex inline finding | ||
| - **WHEN** that review thread is resolved | ||
| - **THEN** the Codex label synchronization workflow runs for that pull request | ||
| - **AND** a scheduled fallback also resynchronizes open pull requests when no review-thread event is delivered | ||
|
|
||
| ### Requirement: Codex label sync MUST use check-run recency evidence | ||
|
|
||
| When multiple check runs have the same context name on a pull-request head, the label synchronizer MUST classify the current context from the newest run by | ||
| start or creation time. Completion time MUST NOT let an older superseded run | ||
| override a newer rerun that has already started. | ||
|
|
||
| #### Scenario: older duplicate run completes after a newer rerun starts | ||
|
|
||
| - **GIVEN** two check runs share the same name | ||
| - **AND** the older run started first but completes after the newer run starts | ||
| - **WHEN** the label synchronizer deduplicates check runs | ||
| - **THEN** it keeps the newer run | ||
| - **AND** a pending newer run keeps the pull request check state pending instead of failed |
16 changes: 16 additions & 0 deletions
16
openspec/changes/fix-codex-label-current-head-evidence/tasks.md
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,16 @@ | ||
| ## 1. Codex Review Evidence | ||
|
|
||
| - [x] 1.1 Filter unresolved Codex inline review threads by current-head evidence. | ||
| - [x] 1.2 Preserve body-head fallback only when the review comment body mentions the current head. | ||
| - [x] 1.3 Treat threads with neither current-head commit metadata nor current-head body evidence as stale. | ||
|
|
||
| ## 2. Check Run Evidence | ||
|
|
||
| - [x] 2.1 Deduplicate same-named check runs by start/creation recency. | ||
| - [x] 2.2 Keep completion time as a fallback when start/creation metadata is absent. | ||
| - [x] 2.3 Cover late-finishing superseded runs with regression tests. | ||
|
|
||
| ## 3. Verification | ||
|
|
||
| - [x] 3.1 Run focused unit tests for the synchronizer. | ||
| - [x] 3.2 Run ruff format/check and ty on the synchronizer and tests. |
19 changes: 19 additions & 0 deletions
19
openspec/changes/ignore-resolved-codex-inline-findings/proposal.md
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,19 @@ | ||
| ## Why | ||
|
|
||
| The Codex review label synchronizer can keep `🤖 codex: needs work` on a PR | ||
| after a maintainer resolves an inline Codex finding. The unresolved-thread check | ||
| already knows resolved review threads are no longer blockers, but the timeline | ||
| classifier still merges the old inline review comment and treats it as current | ||
| needs-work evidence. | ||
|
|
||
| ## What Changes | ||
|
|
||
| - Ignore inline Codex finding comments from resolved or outdated review threads | ||
| when computing current-head review state. | ||
| - Keep unresolved inline Codex finding comments as needs-work evidence. | ||
| - Add regression coverage for both resolved and unresolved inline findings. | ||
|
|
||
| ## Impact | ||
|
|
||
| - GitHub automation only. | ||
| - No application runtime behavior changes. |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.