🔍 Skill Validator Results
Summary
Full validator output

```text
Found 1 skill(s)
[eval-driven-dev] 📊 eval-driven-dev: 3,768 BPE tokens [chars/4: 4,311] (standard ~), 16 sections, 1 code blocks
[eval-driven-dev] ⚠ Skill is 3,768 BPE tokens (chars/4 estimate: 4,311) — approaching "comprehensive" range where gains diminish.
[eval-driven-dev] ⚠ No numbered workflow steps — agents follow sequenced procedures more reliably.
✅ All checks passed (1 skill(s))
```
Pull request overview
Updates the eval-driven-dev skill to align with newer pixie-qa concepts (notably input_data, agent evaluators, and structured post-run analysis) and expands the skill’s references to include a dedicated Step 6 analysis workflow and runnable implementation examples.
Changes:
- Updates the skill metadata and setup workflow to target `pixie-qa>=0.8.1,<0.9.0` and revises setup/error-handling guidance.
- Refactors the skill’s step-by-step reference docs (new Step 1a project analysis, split Step 2, new Step 6 “Analyze Outcomes”, removal of older combined/iteration docs).
- Adds runnable examples (standalone function, FastAPI, CLI) and updates API reference docs to reflect the newer dataset shapes (`input_data` etc.).
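
For readers unsure what the pinned range admits, here is a minimal sketch of the check implied by `>=0.8.1,<0.9.0`. It is not taken from the PR or from pixie-qa: the helper names `parse_version` and `in_pinned_range` are hypothetical, and the sketch assumes plain dotted numeric releases with no pre-release or local tags.

```python
def parse_version(text: str, width: int = 3) -> tuple[int, ...]:
    """Turn a dotted release string such as '0.8.1' into a tuple of ints.

    Short versions are zero-padded to `width` components so that '0.9'
    compares equal to '0.9.0'. Pre-release tags are NOT handled.
    """
    parts = [int(piece) for piece in text.split(".")]
    parts += [0] * (width - len(parts))  # no-op when already long enough
    return tuple(parts)


def in_pinned_range(version: str, lower: str = "0.8.1", upper: str = "0.9.0") -> bool:
    """Return True when lower <= version < upper (hypothetical helper)."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


if __name__ == "__main__":
    for candidate in ("0.8.0", "0.8.1", "0.8.12", "0.9.0"):
        print(candidate, in_pinned_range(candidate))
```

In other words, the pin accepts any 0.8.x patch release from 0.8.1 onward and rejects 0.9.0 and later. For real dependency checks, the `packaging` library's `SpecifierSet` implements the full version-specifier rules and is the safer choice.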
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| skills/eval-driven-dev/resources/setup.sh | Updates install/upgrade logic and adds stricter failure handling for pixie init/start. |
| skills/eval-driven-dev/references/wrap-api.md | Updates wrap API reference and CLI wording (including dataset field naming). |
| skills/eval-driven-dev/references/testing-api.md | Updates testing API reference to match new dataset schema and runner behavior. |
| skills/eval-driven-dev/references/evaluators.md | Adds create_agent_evaluator reference and updates evaluator selection guidance. |
| skills/eval-driven-dev/references/6-investigate.md | Removes prior Step 6 “investigate/iterate” reference. |
| skills/eval-driven-dev/references/6-analyze-outcomes.md | Adds new structured, multi-phase Step 6 analysis workflow and required outputs. |
| skills/eval-driven-dev/references/5-run-tests.md | Reframes Step 5 as “run tests and fix mechanical issues” and updates commands/content. |
| skills/eval-driven-dev/references/4-build-dataset.md | Updates dataset schema (input_data), adds realism audits, and expands capture guidance. |
| skills/eval-driven-dev/references/3-define-evaluators.md | Shifts evaluator strategy toward agent evaluators and updates mapping guidance. |
| skills/eval-driven-dev/references/2c-capture-and-verify-trace.md | Adds a dedicated sub-step doc for trace capture and verification. |
| skills/eval-driven-dev/references/2b-implement-runnable.md | Adds a dedicated sub-step doc for Runnable implementation and placement. |
| skills/eval-driven-dev/references/2a-instrumentation.md | Adds a dedicated sub-step doc for wrap() instrumentation practices. |
| skills/eval-driven-dev/references/2-wrap-and-trace.md | Removes older combined Step 2 reference. |
| skills/eval-driven-dev/references/1-c-eval-criteria.md | Adds updated eval criteria guidance tied to project analysis and failure modes. |
| skills/eval-driven-dev/references/1-b-eval-criteria.md | Removes older Step 1b eval criteria reference. |
| skills/eval-driven-dev/references/1-b-entry-point.md | Renumbers/updates entry-point documentation and emphasizes capability prioritization. |
| skills/eval-driven-dev/references/1-a-project-analysis.md | Adds new Step 1a project analysis reference and required outputs. |
| skills/eval-driven-dev/references/runnable-examples/standalone-function.md | Adds runnable example for direct function invocation. |
| skills/eval-driven-dev/references/runnable-examples/fastapi-web-server.md | Adds runnable example for FastAPI/ASGI in-process evaluation. |
| skills/eval-driven-dev/references/runnable-examples/cli-app.md | Adds runnable example for CLI subprocess execution. |
| skills/eval-driven-dev/SKILL.md | Updates skill description/versioning and rewrites the step flow to include analysis. |
| docs/README.skills.md | Updates the skills index entry for eval-driven-dev to match the new references list. |
Pull Request Checklist
- Ran `npm start` and verified that `README.md` is up to date.
- Targeting the `staged` branch for this pull request.

Description
Update the eval-driven-dev skill: add a comprehensive analysis step after evaluation runs.
Type of Contribution
Additional Notes
By submitting this pull request, I confirm that my contribution abides by the Code of Conduct and will be licensed under the MIT License.