docs: consolidate agent guidance in AGENTS.md and improve mistake-reflection workflow #1059
Open: sokoliva wants to merge 4 commits into a2aproject:main from sokoliva:skills
+192
−69
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,109 @@ | ||
| --- | ||
| name: mistake-reflection | ||
| description: Use when you discover you made a mistake — caught by the user, by a tool result, by your own re-reading, or by a failed check. Appends a structured entry to docs/ai/ai_learnings.md and re-reads recent entries to avoid repeats. | ||
| --- | ||
|
|
||
| # Mistake Reflection | ||
|
|
||
| Implements the mistake-handling step of `AGENTS.md` §"Mandatory | ||
| workflow". | ||
|
|
||
| ## When to load this skill | ||
|
|
||
| Trigger on ANY of these, without waiting for the user to ask: | ||
|
|
||
| - The user corrects a factual claim, code change, or assumption. | ||
| - A tool result contradicts something you just stated or did | ||
| (lint failure, test failure, type-check failure, file not found, | ||
| a command exiting non-zero after you said it would succeed). | ||
| - You re-read a file or doc and realize a prior statement was wrong | ||
| or unverified. | ||
| - You realize mid-task that you skipped a required step | ||
| (e.g. didn't read `docs/ai/coding_conventions.md` / | ||
| `docs/ai/mandatory_checks.md` at task start). | ||
| - You stated an inference as a fact without a `file:line` citation | ||
| and later had to walk it back. | ||
|
|
||
| If unsure whether something counts: it counts. False positives are | ||
| cheap; false negatives are how the same mistake recurs. | ||
|
|
||
| ## Procedure | ||
|
|
||
| Do these in order. Do NOT defer to the end of the task. | ||
|
|
||
| 1. **Acknowledge the mistake to the user explicitly** in the current | ||
| response. One or two sentences. No hedging, no minimization. | ||
| 2. **Read recent entries** in `docs/ai/ai_learnings.md` (at minimum | ||
| the last 5 entries, or the whole file if shorter). If the current | ||
| mistake is a recurrence of an existing rule, say so explicitly and | ||
| reference the prior entry's date — do not silently duplicate. | ||
| 3. **Append a new entry** to `docs/ai/ai_learnings.md` using the | ||
| template below. Append; do not rewrite existing entries. | ||
| 4. **Continue the original task** only after steps 1–3 are done. | ||
|
|
||
| ## Entry template | ||
|
|
||
| Copy this verbatim, fill in each field, append to the end of the file | ||
| (after the existing `---` separator): | ||
|
|
||
| ```markdown | ||
| ## YYYY-MM-DD — <one-line summary> | ||
|
|
||
| - **Mistake**: What went wrong. Be concrete. Quote the wrong claim or | ||
| describe the wrong action. Include `file:line` references where | ||
| applicable. | ||
| - **Trigger**: How the mistake surfaced (user correction, tool output, | ||
| self-review). Include the specific signal if it was a tool result. | ||
| - **Root cause**: Why it happened. Distinguish between (a) missing | ||
| knowledge, (b) skipped verification step, (c) false assumption from | ||
| pattern-matching, (d) workflow gap. Avoid generic "I didn't think | ||
| carefully" — name the specific failure mode. | ||
| - **Recurrence of**: If this matches an existing rule, link to the | ||
| prior entry's date. Otherwise write "new". | ||
| - **Rule**: A concrete, checkable rule that would have prevented this. | ||
| Phrase as an imperative ("Before X, do Y"). If the rule already | ||
| exists and was violated, the rule should be about *enforcement* | ||
| (e.g. a check to add to a skill, a step to add to AGENTS.md), not a | ||
| restatement of the existing rule. | ||
| ``` | ||
|
|
||
| ## Anti-patterns to avoid | ||
|
|
||
| - **Don't restate the same lesson with new wording.** If | ||
| you'd write essentially the same rule again, the real fix is to | ||
| make the rule self-enforcing (update a skill or `AGENTS.md`), not | ||
| to add a third entry. | ||
| - **Don't let rules go stale.** When you read prior entries, flag stale | ||
| tooling references and either update them or note the staleness in your | ||
| new entry. | ||
| - **Don't write rules that depend on you remembering to follow them.** | ||
| If a rule is "remember to do X at the start of every task", it will | ||
| be skipped. Prefer rules that bind to a tool, a skill trigger, or a | ||
| CI check. | ||
| - **Don't bury the acknowledgement.** Tell the user up front in the | ||
| response that you got it wrong, before describing the fix. | ||
|
|
||
| ## Cleanup ritual | ||
|
|
||
| Before appending, check the file's length: | ||
|
|
||
| - **≥ 10 entries**: pause and propose to the user that one or more | ||
| entries be either (a) deleted (if obsolete or one-off), or (b) | ||
| promoted into the workflow somewhere it will actually be read. If | ||
| the candidate rule is about claims/citations/evidence specifically, | ||
| `docs/ai/evidence_rules.md` is a natural target — otherwise leave | ||
| the choice of destination to the user. Do this *before* adding the | ||
| new entry, so the file doesn't grow monotonically and stop being read. | ||
|
|
||
| This ritual is the only mechanism preventing `ai_learnings.md` from | ||
| becoming a write-only graveyard. | ||
|
|
||
| ## Repo-specific notes | ||
|
|
||
| - `docs/ai/ai_learnings.md` is **gitignored** (`.gitignore:15`). | ||
| Entries are local to the developer's checkout and will not be seen | ||
| by other agents or in CI. The file is for the human developer to | ||
| improve `AGENTS.md` / skills based on patterns. | ||
| - The protocol source and trigger pointer both live in `AGENTS.md` | ||
| §"Mandatory workflow". `GEMINI.md` is a deprecated stub. | ||
| - Date format is `YYYY-MM-DD` to match existing entries. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,3 +1,53 @@ | ||
| Always check @./GEMINI.md for the full instruction list. | ||
| # AGENTS.md | ||
|
|
||
| Python SDK for the [Agent2Agent (A2A) Protocol](https://a2a-protocol.org/latest/specification/) | ||
| (`a2a` module, `a2a-sdk` distribution). It handles complex messaging, task management, | ||
| and communication across different transports (REST, gRPC, JSON-RPC). | ||
|
|
||
| ## Technology Stack & Architecture | ||
|
|
||
| - **Language**: Python 3.10+ | ||
| - **Package Manager**: `uv` | ||
| - **Lead Transports**: Starlette (REST/JSON-RPC), gRPC | ||
| - **Data Layer**: SQLAlchemy (SQL), Pydantic (Logic/Legacy), Protobuf (Modern Messaging) | ||
| - **Key Directories**: | ||
| - `/src`: Core implementation logic. | ||
| - `/tests`: Comprehensive test suite. | ||
| - `/docs`: AI guides and migration documentation. | ||
|
|
||
| ## Mandatory workflow | ||
|
|
||
| You MUST do all of the following: | ||
|
|
||
| 1. **At the start of every task that touches files**, read | ||
| `docs/ai/coding_conventions.md`, `docs/ai/mandatory_checks.md`, | ||
| and `docs/ai/evidence_rules.md`. | ||
| 2. **Before declaring any task done**, run the full check sequence | ||
| in `docs/ai/mandatory_checks.md` — including for | ||
| markdown/comment/whitespace-only changes. | ||
| 3. **On any mistake**, load the `mistake-reflection` skill at | ||
| `.agents/skills/mistake-reflection/SKILL.md` **before** continuing | ||
| your response. The skill appends a structured entry to | ||
| `docs/ai/ai_learnings.md` (gitignored local journal) so the user | ||
| can use those findings to improve the workflow. | ||
|
|
||
| When unsure: load the skill. False positives are free; false | ||
| negatives are how the same mistake recurs. | ||
|
|
||
| ## Layout footguns | ||
|
|
||
| - `src/a2a/types/` and `src/a2a/compat/v0_3/*_pb2*` — generated | ||
| protobuf, **do not hand-edit**. Excluded from `ty`, `ruff`, coverage. | ||
| Regenerate via `scripts/gen_proto.sh` / | ||
| `scripts/gen_proto_compat.sh`. | ||
| - `tck/`, `itk/`, `tests/` — subprojects with their own | ||
| `pyproject.toml`; not part of the main test run. | ||
| - `samples/` is minimal; real samples live in `a2aproject/a2a-samples`. | ||
|
|
||
| ## Optional extras | ||
|
|
||
| `pyproject.toml` defines extras (`grpc`, `telemetry`, `postgresql`, | ||
| etc.). The dev group installs `a2a-sdk[all]`, so anything gated behind | ||
| an extra must still **import lazily** at runtime — the install-smoke | ||
| harness verifies this per profile. | ||
|
|
||
| This file exists for compatibility with tools that look for AGENTS.md. | ||
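
The lazy-import requirement for optional extras can be illustrated with a minimal sketch. The helper `require_extra`, the function `make_grpc_channel`, and the error message are hypothetical and not taken from the SDK; only the distribution name `a2a-sdk` and the extra name `grpc` come from the text above.

```python
import importlib
from types import ModuleType


def require_extra(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency at call time rather than at import time.

    Hypothetical helper: the SDK may structure this differently.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this feature; "
            f"install the a2a-sdk[{extra}] extra"
        ) from exc


def make_grpc_channel(target: str):
    # The import happens inside the function, so importing this module
    # succeeds even when the `grpc` extra is not installed.
    grpc = require_extra("grpc", "grpc")
    return grpc.insecure_channel(target)
```

Because the dev group installs every extra, a top-level `import grpc` would pass locally and only fail for users on a minimal install, which is presumably the case the install-smoke harness is there to catch.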
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,48 +1,7 @@ | ||
| # Agent Command Center | ||
| # GEMINI.md | ||
|
|
||
| ## 1. Project Overview & Purpose | ||
| **Primary Goal**: This is the Python SDK for the Agent2Agent (A2A) Protocol. It allows developers to build and run agentic applications as A2A-compliant servers. It handles complex messaging, task management, and communication across different transports (REST, gRPC, JSON-RPC). | ||
| **Specification**: [A2A-Protocol](https://a2a-protocol.org/latest/specification/) | ||
| > This file exists because Gemini auto-loads it. | ||
| > The source of truth for agent guidance is | ||
| > [`AGENTS.md`](./AGENTS.md). Please read that file. | ||
|
|
||
| ## 2. Technology Stack & Architecture | ||
|
|
||
| - **Language**: Python 3.10+ | ||
| - **Package Manager**: `uv` | ||
| - **Lead Transports**: Starlette (REST/JSON-RPC), gRPC | ||
| - **Data Layer**: SQLAlchemy (SQL), Pydantic (Logic/Legacy), Protobuf (Modern Messaging) | ||
| - **Key Directories**: | ||
| - `/src`: Core implementation logic. | ||
| - `/tests`: Comprehensive test suite. | ||
| - `/docs`: AI guides. | ||
|
|
||
| ## 3. Style Guidelines & Mandatory Checks | ||
| - **Style Guidelines**: Follow the rules in @./docs/ai/coding_conventions.md for every response involving code. | ||
| - **Mandatory Checks**: Run the commands in @./docs/ai/mandatory_checks.md after making any changes to the code and before committing. | ||
|
|
||
| ## 4. Mandatory AI Workflow for Coding Tasks | ||
| 1. **Required Reading**: You MUST read the contents of @./docs/ai/coding_conventions.md and @./docs/ai/mandatory_checks.md at the very beginning of EVERY coding task. | ||
| 2. **Initial Checklist**: Every `task.md` you create MUST include a section for **Mandatory Checks** from @./docs/ai/mandatory_checks.md. | ||
| 3. **Verification Requirement**: You MUST run all mandatory checks before declaring any task finished. | ||
|
|
||
| ## 5. Mistake Reflection Protocol | ||
|
|
||
| > [!NOTE] for Users: | ||
| > `docs/ai/ai_learnings.md` is a local-only file (excluded from git) meant to be | ||
| > read by the developer to improve AI assistant behavior on this project. Use its | ||
| > findings to improve the GEMINI.md setup. | ||
|
|
||
| When you realise you have made a mistake — whether caught by the user, | ||
| by a tool, or by your own reasoning — you MUST: | ||
|
|
||
| 1. **Acknowledge the mistake explicitly** and explain what went wrong. | ||
| 2. **Reflect on the root cause**: was it a missing check, a false assumption, skipped verification, or a gap in the workflow? | ||
| 3. **Immediately append a new entry to `docs/ai/ai_learnings.md`** — this is not optional and does not require user confirmation. Do it before continuing, then update the user about the workflow change. | ||
|
|
||
| **Entry format:** | ||
| - **Mistake**: What went wrong. | ||
| - **Root cause**: Why it happened. | ||
| - **Rule**: The concrete rule added to prevent recurrence. | ||
|
|
||
| The goal is to treat every mistake as a signal that the workflow is | ||
| incomplete, and to improve it in place so the same mistake cannot | ||
| happen again. | ||
| See [`AGENTS.md`](./AGENTS.md). |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,24 @@ | ||
| # Evidence Rules | ||
|
|
||
| Rules for what counts as adequate evidence when making claims about | ||
| this codebase. These are graduated learnings — promoted from | ||
| `docs/ai/ai_learnings.md` (local journal) once a rule has earned a | ||
| permanent home. | ||
|
|
||
| When in doubt, the bar is: **a future agent reading your response | ||
| should be able to verify the claim from the citations alone, without | ||
| re-doing your investigation.** | ||
|
|
||
| ## Claims about runtime behavior | ||
|
|
||
| Back any claim about how code behaves at runtime with a `file:line` | ||
| reference from a tool call in the same response, or with a runnable | ||
| demonstration. | ||
|
|
||
| The citation must support the specific claim. The *existence* of code | ||
| is not evidence of its *behavior*: a function being defined doesn't | ||
| mean it's called; an exception being raised doesn't mean it | ||
| propagates; a parameter being declared doesn't mean it's honored; a | ||
| config option existing doesn't mean it takes effect. Behavior claims | ||
| require control-flow evidence (call chain, test output, log) — not | ||
| just a definition site. |
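
The distinction between a definition site and control-flow evidence can be shown with a short runnable demonstration. The function names here are hypothetical and exist only for illustration.

```python
def parse_port(raw: str) -> int:
    # Definition site: int() raises ValueError here for non-numeric input.
    return int(raw)


def load_port(raw: str, default: int = 8080) -> int:
    # Call site: the exception is caught, so it never reaches the caller.
    try:
        return parse_port(raw)
    except ValueError:
        return default


# Reading parse_port alone suggests "bad input raises ValueError".
# Running the actual call chain shows that callers of load_port never see it.
print(load_port("not-a-number"))  # prints 8080
```

A citation of `parse_port` alone would not support the claim "invalid ports raise an error"; the `try`/`except` in the caller, or the output of a run like this one, is the evidence that settles it.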
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,25 +1,14 @@ | ||
| ### Test and Fix Commands | ||
| # Mandatory Checks | ||
|
|
||
| Exact shell commands required to test the project and fix formatting issues. | ||
| Run in this order before declaring any task done — including for | ||
| markdown/comment/whitespace-only changes: | ||
|
|
||
| 1. **Formatting & Linting**: | ||
| ```bash | ||
| uv run ruff check --fix | ||
| uv run ruff format | ||
| ``` | ||
| ```bash | ||
| ./scripts/lint.sh # ruff check --fix, ruff format, ty check | ||
| uv run pytest | ||
|
|
||
| 2. **Type Checking**: | ||
| ```bash | ||
| uv run ty check | ||
| ``` | ||
| # Only before commit, when src/ changed: | ||
| uv run pytest --cov=src --cov-report=term-missing | ||
| ``` | ||
|
|
||
| 3. **Testing**: | ||
| ```bash | ||
| uv run pytest | ||
| ``` | ||
|
|
||
| 4. **Coverage**: | ||
| Only run this command after adding new source code and before committing. | ||
| ```bash | ||
| uv run pytest --cov=src --cov-report=term-missing | ||
| ``` | ||
| CI enforces `--cov-fail-under=88` on the `a2a` package. |
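
The check order above can be sketched as a small runner that stops at the first failing command. This is an illustrative sketch, not a script in the repository: `run_checks`, the `src_changed` flag, and the injectable `run` parameter are hypothetical, and only the commands themselves are taken from the text above.

```python
import subprocess
from collections.abc import Callable, Sequence

# Commands as listed in the mandatory checks, in order.
BASE_CHECKS = [
    ["./scripts/lint.sh"],
    ["uv", "run", "pytest"],
]
# Only before commit, when src/ changed.
COVERAGE_CHECK = ["uv", "run", "pytest", "--cov=src", "--cov-report=term-missing"]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(cmd, check=False).returncode


def run_checks(
    src_changed: bool = False,
    run: Callable[[Sequence[str]], int] = _run,
) -> bool:
    """Run the checks in order; return False at the first non-zero exit."""
    checks = list(BASE_CHECKS)
    if src_changed:
        checks.append(COVERAGE_CHECK)
    for cmd in checks:
        if run(cmd) != 0:
            print(f"FAILED: {' '.join(cmd)}")
            return False
    return True
```

Passing `run` as a parameter keeps the ordering logic testable without invoking `uv`; in normal use `run_checks(src_changed=True)` would be called from the repository root.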