15 changes: 15 additions & 0 deletions AGENTS.md
@@ -44,3 +44,18 @@ pnpm lint
Write for developers discovering MartinLoop for the first time.
Explain what the tool does, how to install it, how to run it,
and how to verify results.

## Proof Receipt Design Lock

MartinLoop proof-card SVGs must stay in the CLI receipt style:

- dark terminal canvas
- line-based layout
- monospaced evidence rows
- semantic green for verified/pass states
- semantic red for failed, missing, or boundary states

Do not change proof receipts into rounded cards, blue palettes, gradients,
certificate layouts, dashboard cards, or decorative marketing graphics unless
the maintainer explicitly asks for that change and receives side-by-side
visual renders before approval.
38 changes: 29 additions & 9 deletions README.md
@@ -79,7 +79,7 @@ npx -y martin-loop@latest preflight "Summarize the demo workspace and prove test

`share --latest` writes three files into the selected run directory under `share/`: `run-receipt.json`, `run-receipt.md`, and `proof-card.svg`.

-Release notes for the current root package: [MartinLoop 0.3.4](./docs/release/OSS-0.3.4-RELEASE-NOTES.md).
+Release notes for the current root package: [MartinLoop 0.3.5](./docs/release/OSS-0.3.5-RELEASE-NOTES.md).

## Visual Proof

@@ -95,22 +95,42 @@ Ungoverned agents can retry until cost and scope drift. MartinLoop adds budget c
<img src="./docs/assets/side-by-side.svg" alt="MartinLoop governed run compared with an unbounded retry loop" width="720" height="1080">
</div>

## Proof Receipts

Proof receipts are local share bundles for governed AI coding runs. They show the task, spend, budget, verifier result, receipt integrity, and any evidence boundary that should not be rounded into confidence.

This real governed run spent `$0.51` against a `$3.00` budget. The verifier passed and the receipt integrity was signed, but the proof stayed at `EVIDENCE_BOUNDARY` because rollback evidence was not recorded.

<div align="center">
<img src="./docs/assets/proof-receipt-live-governed.png" alt="MartinLoop CLI proof receipt for a governed run with spend, budget, verifier, integrity, and evidence boundary" width="720">
</div>

Generate your own receipt after a governed run:

```sh
npx -y martin-loop@latest run "Summarize the demo workspace and prove tests still pass" --proof --verify "npm test"
npx -y martin-loop@latest runs verify --latest
npx -y martin-loop@latest share --latest
```

Example receipt files: [Markdown](./docs/examples/proof-receipts/live-governed-run-receipt.md) and [JSON](./docs/examples/proof-receipts/live-governed-run-receipt.json).

## Run This Audit Yourself

Use this lane from a clean temp directory to verify the public CLI flow exactly as shipped:

```sh
-npx -y martin-loop@0.3.4 --version
-npx -y martin-loop@0.3.4 start
-npx -y martin-loop@0.3.4 demo
+npx -y martin-loop@0.3.5 --version
+npx -y martin-loop@0.3.5 start
+npx -y martin-loop@0.3.5 demo
 cd martin-loop-demo
 npm install
-npx -y martin-loop@0.3.4 run "Summarize the demo workspace and prove tests still pass" --proof --verify "npm test" --json
-npx -y martin-loop@0.3.4 dossier --latest --json
-npx -y martin-loop@0.3.4 share --latest --json
+npx -y martin-loop@0.3.5 run "Summarize the demo workspace and prove tests still pass" --proof --verify "npm test" --json
+npx -y martin-loop@0.3.5 dossier --latest --json
+npx -y martin-loop@0.3.5 share --latest --json
```

-For deterministic installs, pin the package line (`martin-loop@0.3.4`) or use `martin-loop@latest`. Plain `npx martin-loop` can resolve a stale local cache on some machines.
+For deterministic installs, pin the package line (`martin-loop@0.3.5`) or use `martin-loop@latest`. Plain `npx martin-loop` can resolve a stale local cache on some machines.

Expected share bundle outputs:

@@ -229,7 +249,7 @@ npx martin-loop mcp print-config --host gemini --transport stdio --profile full-
npx martin-loop mcp print-config --host generic --transport stdio --profile github-review
```

-The root `martin-loop` package and the standalone `@martinloop/mcp` package move on separate version lines. The root package line here is `0.3.4`; the current standalone MCP package is `0.3.1`.
+The root `martin-loop` package and the standalone `@martinloop/mcp` package move on separate version lines. The root package line here is `0.3.5`; the current standalone MCP package is `0.3.1`.

The public MCP release train labels are:

Binary file added docs/assets/proof-receipt-live-governed.png
17 changes: 17 additions & 0 deletions docs/examples/proof-receipts/live-governed-run-receipt.json
@@ -0,0 +1,17 @@
{
  "title": "Martin Loop Proof Receipt",
  "loopId": "loop_82emkgkf",
  "proofVerdict": "EVIDENCE_BOUNDARY",
  "evidenceLine": "Incomplete Martin proof: missing budget, rollback, or verifier evidence.",
  "verifier": "passed",
  "costSpend": "$0.51",
  "budget": "$3.00",
  "remainingBudget": "$2.49",
  "overspendRatio": "0.17x",
  "attempts": "1",
  "rollback": "not-recorded",
  "receiptIntegrity": "signed",
  "verificationSteps": "1",
  "runtime": "claude / claude-sonnet-4-6 / agent-cli:claude",
  "generatedAt": "2026-06-10T20:01:03.635Z"
}
Comment on lines +1 to +17


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Find the JSON serialization logic for proof receipts and verify field naming

# Search for JSON receipt generation/serialization
rg -nP --type=ts -C5 'run-receipt\.json|JSON\.stringify.*proof|toJSON.*receipt'

# Search for the mapping between interface fields and output JSON
ast-grep --pattern 'receiptIntegrity'

Repository: Keesan12/martin-loop

Length of output: 787


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Find the MartinProofCardInput interface (or equivalent) and its field names
rg -n --type=ts 'interface\s+MartinProofCardInput|type\s+MartinProofCardInput|MartinProofCardInput\s*<' .

# 2) Find where the receipt JSON fields are defined/serialized
rg -n --type=ts 'function\s+buildShareBundle|buildShareBundle\(|shareBundle\.receipt|receipt:\s*' packages/cli/src

# 3) Find any usage of these specific field keys in codegen/rendering
rg -n --type=ts '"receiptIntegrityState"|"verificationStepCount"|"receiptIntegrity"|"verificationSteps"|"costSpend"|"remainingBudget"' packages

# 4) Find the MD example table that the JSON might be mirroring
fd -e md -a . | head -n 200
fd -e md . | rg -n 'proof-receipts|live-governed-run-receipt|Martin'

Repository: Keesan12/martin-loop

Length of output: 1314


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show MartinProofCardInput interface
echo "---- packages/cli/src/proof-card.ts (MartinProofCardInput) ----"
sed -n '1,220p' packages/cli/src/proof-card.ts | nl -ba | sed -n '1,220p'

# Show receipt construction path (buildShareBundle) around the receipt assignment
echo
echo "---- packages/cli/src/index.ts (buildShareBundle receipt) ----"
sed -n '3080,3245p' packages/cli/src/index.ts | nl -ba

# Show CliWorkflowReceipt type / receipt shape (workflow-state.ts)
echo
echo "---- packages/cli/src/workflow-state.ts (CliWorkflowReceipt) ----"
rg -n 'type\s+CliWorkflowReceipt|interface\s+CliWorkflowReceipt|export\s+type\s+CliWorkflowReceipt|export\s+interface\s+CliWorkflowReceipt' packages/cli/src/workflow-state.ts
sed -n '1,220p' packages/cli/src/workflow-state.ts | nl -ba | sed -n '1,220p'

# Locate the MD example file for proof receipts and show the relevant table/field names
echo
echo "---- Find proof receipt docs examples ----"
fd -a docs -e md | rg -n 'proof-receipts|live-governed-run-receipt|Martin Loop Proof Receipt|Martin proof receipt' || true

# Specifically print the MD file if present
mdfile="$(fd -a docs -e md -i | rg -n 'proof-receipts|live-governed-run-receipt' | head -n 1 | cut -d: -f1 || true)"
if [ -n "${mdfile:-}" ]; then
  echo
  echo "---- $mdfile (excerpt) ----"
  sed -n '1,220p' "$mdfile" | nl -ba
fi

# Finally: inspect the specific JSON file in question
echo
echo "---- docs/examples/proof-receipts/live-governed-run-receipt.json ----"
sed -n '1,120p' docs/examples/proof-receipts/live-governed-run-receipt.json | nl -ba

Repository: Keesan12/martin-loop

Length of output: 189


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- packages/cli/src/proof-card.ts (MartinProofCardInput) ----"
cat -n packages/cli/src/proof-card.ts | sed -n '1,240p'

echo
echo "---- packages/cli/src/index.ts (buildShareBundle receipt) ----"
cat -n packages/cli/src/index.ts | sed -n '3060,3225p'

echo
echo "---- packages/cli/src/workflow-state.ts (CliWorkflowReceipt) ----"
rg -n 'CliWorkflowReceipt' packages/cli/src/workflow-state.ts
echo
# show the definition region if present
start="$(rg -n 'CliWorkflowReceipt' packages/cli/src/workflow-state.ts | head -n 1 | cut -d: -f1 | awk '{print $1}')"
if [ -n "${start:-}" ]; then
  s=$((start-40))
  if [ $s -lt 1 ]; then s=1; fi
  cat -n packages/cli/src/workflow-state.ts | sed -n "${s},$((s+220))p"
fi

echo
echo "---- docs/examples/proof-receipts (find relevant MD/JSON) ----"
fd -a docs/examples/proof-receipts -e md -e json
echo
mdfile="$(fd -a docs -e md | rg -n 'proof-receipts|live-governed-run-receipt|Martin Loop Proof Receipt' | head -n 1 | cut -d: -f1 || true)"
if [ -n "${mdfile:-}" ]; then
  echo "---- $mdfile (excerpt) ----"
  cat -n "$mdfile" | sed -n '1,260p'
fi

echo
echo "---- docs/examples/proof-receipts/live-governed-run-receipt.json ----"
cat -n docs/examples/proof-receipts/live-governed-run-receipt.json | sed -n '1,140p'

Repository: Keesan12/martin-loop

Length of output: 28217


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Find and print proofCardInputFromLoop (or similarly named function)
echo "---- Search for proofCardInputFromLoop ----"
rg -n --type=ts 'proofCardInputFromLoop' packages/cli/src

func_file="$(rg -n --type=ts 'proofCardInputFromLoop' packages/cli/src | head -n1 | cut -d: -f1 || true)"
if [ -n "${func_file:-}" ]; then
  echo "---- snippet around ${func_file} ----"
  # print a reasonable region around the first match line
  match_line="$(rg -n --type=ts 'proofCardInputFromLoop' packages/cli/src | head -n1 | cut -d: -f2)"
  start=$((match_line-80)); if [ $start -lt 1 ]; then start=1; fi
  end=$((match_line+120))
  sed -n "${start},${end}p" "$func_file" | cat -n
fi

# 2) Read the JSON example in question (with line numbers)
echo
echo "---- docs/examples/proof-receipts/live-governed-run-receipt.json ----"
cat -n docs/examples/proof-receipts/live-governed-run-receipt.json | sed -n '1,200p'

# 3) Find the corresponding Markdown file/table (if it exists)
echo
echo "---- Find docs markdown references to 'Martin Loop Proof Receipt' and 'live-governed-run-receipt' ----"
rg -n --type=md 'Martin Loop Proof Receipt|live-governed-run-receipt|proof-receipts' docs || true

# If any MD file exists under docs/examples/proof-receipts, list them
echo
echo "---- List docs/examples/proof-receipts directory (if present) ----"
ls -la docs/examples/proof-receipts 2>/dev/null || true

# Print all md files under that folder (small output only)
for f in $(ls docs/examples/proof-receipts 2>/dev/null | rg '\.md$' || true); do
  echo
  echo "---- docs/examples/proof-receipts/$f (excerpt) ----"
  cat -n "docs/examples/proof-receipts/$f" | sed -n '1,240p'
done

Repository: Keesan12/martin-loop

Length of output: 11988


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n --hidden --type=ts '"verificationSteps"\s*:' . || true
rg -n --hidden --type=ts '"receiptIntegrity"\s*:' . || true
rg -n --hidden --type=ts '"verifier"\s*:' . || true
rg -n --hidden --type=ts '"rollback"\s*:' . || true
rg -n --hidden --type=ts 'verificationStepCount|receiptIntegrityState' packages/cli/src || true

# also check for a renderer that flattens MartinProofCard fields into JSON-like objects
rg -n --hidden --type=ts 'render.*Proof.*json|Proof.*json|toJSON.*receipt|share-receipt|run-receipt\.json' packages/cli/src || true

Repository: Keesan12/martin-loop

Length of output: 1985


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Find the ReceiptIntegrityState definition/value set
rg -n --hidden --type=ts 'ReceiptIntegrityState' . | head -n 200

# 2) Find where CLI outputs JSON for proof cards (flattened vs structured)
rg -n --hidden --type=ts 'proofVerdict|EVIDENCE_BOUNDARY|verificationSteps|receiptIntegrity"\s*:|\"receiptIntegrity\"|\"verificationSteps\"' packages/cli/src || true

# 3) Locate any renderer that produces the flat keys seen in the docs example
rg -n --hidden --type=ts 'verifier"\s*:?|rollback"\s*:?|verificationSteps|receiptIntegrity' packages/cli/src || true

# 4) Find docs example generator references (if any)
rg -n --hidden --type=md 'live-governed-run-receipt' docs || true

Repository: Keesan12/martin-loop

Length of output: 3645


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Flattened key strings as they appear in docs example JSON
rg -n --hidden --type=ts '"receiptIntegrity"\s*:|receiptIntegrity"\s*\)|receiptIntegrity'\''|\"receiptIntegrity\"' packages/cli/src || true
rg -n --hidden --type=ts '"verificationSteps"\s*:|\"verificationSteps\"' packages/cli/src || true
rg -n --hidden --type=ts '"runtime"\s*:|\"runtime\"' packages/cli/src | head -n 50 || true
rg -n --hidden --type=ts '"rollback"\s*:|\"rollback\"' packages/cli/src || true
rg -n --hidden --type=ts '"verifier"\s*:|\"verifier\"' packages/cli/src || true

# Also search for docs file generator / mapping from card.fields -> flat object
rg -n --hidden --type=ts 'fields.*reduce|getFieldValue|card\.fields|buildMartinProofCard|Proof Receipt' packages/cli/src || true

Repository: Keesan12/martin-loop

Length of output: 1617


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Fixed-string searches for the flattened keys present in docs example JSON
rg -n --hidden -F '"receiptIntegrity"' packages/cli/src || true
rg -n --hidden -F '"verificationSteps"' packages/cli/src || true
rg -n --hidden -F '"rollback"' packages/cli/src || true
rg -n --hidden -F '"verifier"' packages/cli/src || true
rg -n --hidden -F '"runtime"' packages/cli/src || true

# Also search for outputting "Martin Loop Proof Receipt" JSON
rg -n --hidden -S 'Martin Loop Proof Receipt' packages/cli/src || true
rg -n --hidden -S 'run-receipt.json' packages/cli/src || true
rg -n --hidden -S 'receiptJson|receiptMarkdown|proofCardSvg' packages/cli/src || true

# Search for docs markdown generator that might produce flat receipt JSON
rg -n --hidden -S 'renderShareReceiptMarkdown|render.*Receipt' packages/cli/src || true

Repository: Keesan12/martin-loop

Length of output: 1645


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find where proofVerdict / evidenceLine are used in JSON-ish outputs
rg -n --hidden --type=ts '"proofVerdict"' packages/cli/src || true
rg -n --hidden --type=ts '"evidenceLine"' packages/cli/src || true
rg -n --hidden --type=ts 'renderMartinProofCardMarkdown|renderMartinProofCardSvg' packages/cli/src || true

# Find CLI command handlers that return JSON for proofs/cards
rg -n --hidden --type=ts 'execute.*Command|command\.format|format === "json"|outputMode' packages/cli/src || true

# Specifically check for "runs verify" / "verify" command implementations
rg -n --hidden --type=ts 'runs verify|execute.*verify|verify\b' packages/cli/src || true

# Search docs generator or tests for snapshot of this JSON
rg -n --hidden --type=ts 'live-governed-run-receipt|run-receipt\.json|proof-receipts' packages/cli/tests packages/cli/src docs || true

Repository: Keesan12/martin-loop

Length of output: 11381


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for flattened doc-example keys/values across repo
rg -n --hidden -S 'verificationSteps' . || true
rg -n --hidden -S 'receiptIntegrity' . || true
rg -n --hidden -S '"signed"' . || true

# See how the doc example JSON is referenced/produced
rg -n --hidden -S 'live-governed-run-receipt\.json' . || true
rg -n --hidden -S 'live-governed-run-receipt\.md' . || true

Repository: Keesan12/martin-loop

Length of output: 6023


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- docs/oss/AGENT-RUN-RECEIPTS.md (excerpt around receiptIntegrity / example schema) ----"
rg -n 'receiptIntegrity|verificationSteps|verificationStepCount|run-receipt\.json|live-governed-run-receipt\.json' docs/oss/AGENT-RUN-RECEIPTS.md
sed -n '1,120p' docs/oss/AGENT-RUN-RECEIPTS.md | cat -n
sed -n '120,220p' docs/oss/AGENT-RUN-RECEIPTS.md | cat -n

echo
echo "---- docs/examples/proof-receipts/live-governed-run-receipt.md (whole file) ----"
cat -n docs/examples/proof-receipts/live-governed-run-receipt.md

echo
echo "---- README.md (receipt example section) ----"
sed -n '90,140p' README.md | cat -n

Repository: Keesan12/martin-loop

Length of output: 10223


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show executeRunsVerifyCommand
echo "---- executeRunsVerifyCommand (packages/cli/src/index.ts) ----"
sed -n '1840,1925p' packages/cli/src/index.ts | cat -n

echo
echo "---- executeShareCommand around JSON output (packages/cli/src/index.ts) ----"
sed -n '3060,3188p' packages/cli/src/index.ts | cat -n

echo
echo "---- executeBadgeCommand around json formatting (packages/cli/src/index.ts) ----"
sed -n '3280,3360p' packages/cli/src/index.ts | cat -n

Repository: Keesan12/martin-loop

Length of output: 13288


Fix live-governed-run-receipt.json field names/value to match the proof-card schema

  • docs/examples/proof-receipts/live-governed-run-receipt.json uses receiptIntegrity, verificationSteps, rollback, and verifier, but the proof-card input schema uses receiptIntegrityState, verificationStepCount, rollbackStatus, and verifierStatus.
  • The example sets receiptIntegrity to "signed", but ReceiptIntegrityState is only "verified" | "unsigned" | "tamper_detected".
  • The JSON example omits fields that are present in docs/examples/proof-receipts/live-governed-run-receipt.md (Objective, Status, Lifecycle, Halt reason, Evidence boundary, Run mode), so the JSON doesn’t mirror the documented proof card.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/examples/proof-receipts/live-governed-run-receipt.json` around lines 1 -
17, Replace the mismatched fields to match the proof-card schema: rename
receiptIntegrity -> receiptIntegrityState, verificationSteps ->
verificationStepCount (use a number), rollback -> rollbackStatus, and verifier
-> verifierStatus; set receiptIntegrityState to one of the allowed values
("verified" | "unsigned" | "tamper_detected") (e.g., "verified" instead of
"signed"); update verifierStatus values to the schema's expected values; add the
missing proof-card fields present in the markdown example (Objective, Status,
Lifecycle, Halt reason, Evidence boundary, Run mode) with appropriate
schema-compliant keys and values, ensure types match the schema (e.g.,
verificationStepCount as an integer) and keep existing metadata like generatedAt
and budget fields unchanged.
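
The key rename requested above can be sketched as a small migration helper. This is a minimal sketch, not the CLI's serializer: the four schema key names and the three `ReceiptIntegrityState` values are taken from the finding, while `LEGACY_KEY_MAP`, `migrateReceiptKeys`, and `MigrationResult` are hypothetical names introduced for illustration. Because the finding does not say which allowed state `"signed"` should become, the sketch reports it as a problem rather than guessing a replacement.

```typescript
// Allowed values as quoted in the review finding.
type ReceiptIntegrityState = "verified" | "unsigned" | "tamper_detected";

const INTEGRITY_STATES: readonly ReceiptIntegrityState[] = ["verified", "unsigned", "tamper_detected"];

// Hypothetical map from the flat example keys to the schema keys named in the finding.
const LEGACY_KEY_MAP: Record<string, string> = {
  receiptIntegrity: "receiptIntegrityState",
  verificationSteps: "verificationStepCount",
  rollback: "rollbackStatus",
  verifier: "verifierStatus",
};

interface MigrationResult {
  receipt: Record<string, unknown>;
  problems: string[];
}

function isIntegrityState(value: unknown): value is ReceiptIntegrityState {
  return typeof value === "string" && (INTEGRITY_STATES as readonly string[]).indexOf(value) !== -1;
}

function migrateReceiptKeys(flat: Record<string, unknown>): MigrationResult {
  const receipt: Record<string, unknown> = {};
  const problems: string[] = [];

  // Rename known legacy keys; copy every other key unchanged.
  for (const key of Object.keys(flat)) {
    const renamed = LEGACY_KEY_MAP[key];
    receipt[renamed === undefined ? key : renamed] = flat[key];
  }

  // The example stores the step count as a string; coerce digit-only strings to a number.
  const count = receipt["verificationStepCount"];
  if (typeof count === "string" && /^\d+$/.test(count)) {
    receipt["verificationStepCount"] = Number(count);
  }

  // Flag integrity values outside the allowed set instead of silently rewriting them.
  if (!isIntegrityState(receipt["receiptIntegrityState"])) {
    problems.push(`unsupported receiptIntegrityState: ${String(receipt["receiptIntegrityState"])}`);
  }
  return { receipt, problems };
}
```

Run against the example file, this would rename the four keys, turn `"1"` into `1`, and return one problem for the `"signed"` value, leaving the choice of a valid state to the maintainer.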

25 changes: 25 additions & 0 deletions docs/examples/proof-receipts/live-governed-run-receipt.md
@@ -0,0 +1,25 @@
# Martin Loop Proof Receipt

Incomplete Martin proof: missing budget, rollback, or verifier evidence.

| Field | Evidence |
| --- | --- |
| Loop ID | loop_82emkgkf |
| Objective | Audit the MartinLoop CLI proof receipt guard for a shareable governed run receipt. |
| Status | exited |
| Lifecycle | budget_exit |
| Verifier | passed |
| Cost / spend | $0.51 |
| Budget | $3.00 |
| Attempts | 1 |
| Rollback | not-recorded |
| Halt reason | Martin exited because the budget governor hit a hard limit. |
| Evidence boundary | Generated from a local Martin Loop run record.; Hosted dashboards and private team telemetry are intentionally excluded from OSS proof cards. |
| Remaining budget | $2.49 |
| Overspend ratio | 0.17x |
| Verification steps | 1 |
| Run mode | not recorded |


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Inconsistent value format with JSON example.

Line 21 shows "not recorded" (space-separated) while the JSON example uses "not-recorded" (hyphenated) for the rollback field. Ensure consistent formatting across example formats.

🔧 Proposed fix for consistency
-| Run mode | not recorded |
+| Run mode | not-recorded |
📝 Committable suggestion


Suggested change
-| Run mode | not recorded |
+| Run mode | not-recorded |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/examples/proof-receipts/live-governed-run-receipt.md` at line 21, The
"Run mode" table row uses "not recorded" but the JSON example uses
"not-recorded" for the rollback value; update the table cell (the Run mode /
rollback entry) to use "not-recorded" to match the JSON example and maintain
consistent formatting across examples (refer to the rollback field and the Run
mode table row in the document).

| Runtime | claude / claude-sonnet-4-6 / agent-cli:claude |
| Receipt integrity | signed |
| Generated at | 2026-06-10T20:01:03.635Z |

12 changes: 12 additions & 0 deletions docs/oss/AGENT-RUN-RECEIPTS.md
@@ -85,6 +85,8 @@ Expected bundle output under the selected run directory in `share/`:
- `run-receipt.md` (human-readable recap)
- `proof-card.svg` (portable visual card)

The proof card is intentionally a terminal-style receipt, not a marketing card. It uses a dark CLI layout, line rules, monospaced evidence rows, green only for verified/pass states, and red only for failed, missing, or boundary states. Do not restyle it into rounded boxes, blue palettes, gradients, certificate layouts, or dashboard cards without an explicit visual review.

4. Optional custom output directory:

```sh
@@ -126,3 +128,13 @@ If exact replay is not possible because the workspace changed, the `warnings` an
- usage is presented with provenance (`actual`, `estimated`, or `unavailable`)
- verifier failures are explicit and not reinterpreted as success
- inspection remains read-only

## Public proof receipt example

This repository includes a public-safe proof receipt generated from a real governed run:

- [visual proof card](../assets/proof-receipt-live-governed.png)
- [Markdown receipt](../examples/proof-receipts/live-governed-run-receipt.md)
- [JSON receipt](../examples/proof-receipts/live-governed-run-receipt.json)

The example shows a verifier-passed run with signed receipt integrity and an explicit evidence boundary. The boundary is kept visible because rollback evidence was not recorded.
11 changes: 11 additions & 0 deletions docs/oss/AGENT-START-HERE.md
@@ -28,6 +28,17 @@ npx martin-loop dossier --latest

Expected value: the dossier tells you what happened, what Martin prevented, verifier status, rollback/artifact evidence, clearly labeled token/cost estimates, and the next safe action.

## Proof Receipts

After a governed run, create a share bundle:

```sh
npx martin-loop runs verify --latest
npx martin-loop share --latest
```

The bundle includes `run-receipt.json`, `run-receipt.md`, and `proof-card.svg`. The proof card should look like a terminal receipt: dark canvas, rows, divider lines, monospaced evidence, green pass states, and red boundary states. Keep uncertainty visible. If rollback, integrity, cost, or verifier evidence is missing, render it as missing instead of turning the run into a success claim.
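
The "keep uncertainty visible" rule can be sketched as follows. This is an illustration under stated assumptions, not the CLI's verdict logic: `EvidenceSnapshot`, `ABSENT_VALUES`, `displayValue`, and `evidenceReport` are hypothetical names, and the set of strings treated as "no evidence" is an assumption.

```typescript
// Hypothetical evidence snapshot; field names are illustrative, not the CLI's schema.
interface EvidenceSnapshot {
  rollback?: string;
  integrity?: string;
  cost?: string;
  verifier?: string;
}

// Strings treated as "no evidence" in this sketch (assumption).
const ABSENT_VALUES = ["", "not-recorded", "not recorded", "unknown"];

// Render absent evidence as an explicit MISSING marker instead of hiding it.
function displayValue(value: string | undefined): string {
  if (value === undefined) return "MISSING";
  return ABSENT_VALUES.indexOf(value.trim().toLowerCase()) !== -1 ? "MISSING" : value;
}

function evidenceReport(snapshot: EvidenceSnapshot): { boundary: boolean; rows: string[] } {
  const fields: Array<[string, string | undefined]> = [
    ["rollback", snapshot.rollback],
    ["integrity", snapshot.integrity],
    ["cost", snapshot.cost],
    ["verifier", snapshot.verifier],
  ];
  const shown = fields.map(([label, value]): [string, string] => [label, displayValue(value)]);
  return {
    // Any missing row keeps the receipt at an evidence boundary, even if the verifier passed.
    boundary: shown.some(([, value]) => value === "MISSING"),
    rows: shown.map(([label, value]) => `${label}: ${value}`),
  };
}
```

With a passing verifier but `rollback: "not-recorded"`, the report has `boundary` set to `true` and a `rollback: MISSING` row, which mirrors the rule that a verifier pass alone does not make a success claim.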

## MCP Profile Defaults

- `minimal` is the default: `martin_doctor`, `martin_preflight`, `martin_list_runs`, `martin_triage_runs`, and `martin_run_dossier`.
29 changes: 29 additions & 0 deletions docs/release/OSS-0.3.5-RELEASE-NOTES.md
@@ -0,0 +1,29 @@
# MartinLoop 0.3.5 Proof Receipt Release

`0.3.5` upgrades MartinLoop share receipts so governed runs produce a sharper CLI-style proof card and clearer public documentation.

## What Changed

- Proof cards now render as dark terminal receipts with line rules, monospaced evidence rows, and explicit pass/boundary coloring.
- Share receipts include stronger visible context: task class, spend, budget, remaining budget, overspend ratio, verifier status, integrity state, runtime, and event rail when present in the local run record.
- Missing rollback, verifier, budget, or integrity evidence stays visible as an evidence boundary instead of being softened into a success claim.
- README and agent docs now show how to create and inspect share bundles with `runs verify --latest` and `share --latest`.
- Public tests now block rounded-card, blue-palette, gradient, and typography regressions in proof-card SVG output.
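
A regression check of the kind the last bullet describes might look like the sketch below. It is not the project's test suite: `StyleRule`, `FORBIDDEN_STYLE_RULES`, and `proofCardStyleViolations` are hypothetical names, and the two patterns are assumptions about how rounded corners and gradients would appear in SVG markup. Palette and typography checks are left out.

```typescript
// Hypothetical style rules; the patterns are assumptions, not the real public tests.
interface StyleRule {
  name: string;
  pattern: RegExp;
}

const FORBIDDEN_STYLE_RULES: StyleRule[] = [
  // Any rx attribute with a non-zero value indicates rounded rectangle corners.
  { name: "rounded corners", pattern: /\brx\s*=\s*"(?!0(?:\.0+)?")[^"]*"/i },
  // Gradient definitions are SVG <linearGradient> or <radialGradient> elements.
  { name: "gradient", pattern: /<(?:linear|radial)Gradient\b/i },
];

// Returns the names of every forbidden style found in the SVG source text.
function proofCardStyleViolations(svg: string): string[] {
  return FORBIDDEN_STYLE_RULES.filter((rule) => rule.pattern.test(svg)).map((rule) => rule.name);
}
```

A test could assert that `proofCardStyleViolations(renderedSvg)` returns an empty array, where `renderedSvg` is whatever string the proof-card renderer produces.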

## Why This Matters

AI coding work needs evidence that can be checked after the run. A verifier pass is useful, but it is not the whole proof. The receipt should also show what it cost, what evidence exists, and what evidence is missing.

## Quick Check

```sh
npx -y martin-loop@0.3.5 run "Summarize the demo workspace and prove tests still pass" --proof --verify "npm test"
npx -y martin-loop@0.3.5 runs verify --latest
npx -y martin-loop@0.3.5 share --latest
```

Expected share bundle outputs:

- `share/run-receipt.json`
- `share/run-receipt.md`
- `share/proof-card.svg`
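
To confirm the bundle is complete, the expected file names above can be compared with a directory listing. A minimal sketch, assuming the caller supplies the listing; `EXPECTED_SHARE_FILES` and `missingShareFiles` are hypothetical names, not part of the CLI.

```typescript
// File names taken from the expected share bundle outputs listed above.
const EXPECTED_SHARE_FILES = ["run-receipt.json", "run-receipt.md", "proof-card.svg"];

// Returns the expected files that are absent from the given directory listing.
function missingShareFiles(present: readonly string[]): string[] {
  return EXPECTED_SHARE_FILES.filter((name) => present.indexOf(name) === -1);
}
```

In Node, the listing could come from `fs.readdirSync` on the run's `share/` directory; an empty result means all three files were written.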
8 changes: 4 additions & 4 deletions docs/release/VERSION-LEDGER.md
@@ -4,16 +4,16 @@ This file is the release source of truth for package/version mapping in this rep

## Root package: `martin-loop`

-- live npm dist-tag `latest`: `0.3.4`
-- live public GitHub release: `v0.3.4`
+- live npm dist-tag `latest`: `0.3.4` before the `0.3.5` proof receipt release publishes
+- live public GitHub release: `v0.3.4` before the `v0.3.5` release workflow completes
- live public baseline in this train: `0.3.4`
- root public baseline: `0.3.4`
- releases consumed since the original `0.2.8` launch:
- `0.2.9` fixed proof-run classification, Windows `.cmd` resolution, and public provider defaults
- `0.2.10` tightened verifier evidence, `--runs-dir` consistency, and public help output
- `0.2.11` fixed `runs verify --latest` selector parity in the public CLI
-- current in-repo root release line: `0.3.4` for governed integrity hardening across path-policy, selector, and receipt verification surfaces
-- next planned root follow-on: `0.3.5` for additional cross-host reliability follow-ups
+- current in-repo root release line: `0.3.5` for CLI-style proof receipts and share-bundle documentation
+- next planned root follow-on: `0.3.6` for additional cross-host reliability follow-ups

## Standalone package: `@martinloop/mcp`

2 changes: 1 addition & 1 deletion package.json
@@ -1,7 +1,7 @@
{
"name": "martin-loop",
"private": false,
-"version": "0.3.4",
+"version": "0.3.5",
"type": "module",
"description": "Open-source command center for governed AI coding agents with built-in onboarding, hard gates, MCP, and shareable run receipts.",
"packageManager": "pnpm@10.33.0",
28 changes: 28 additions & 0 deletions packages/cli/src/index.ts
@@ -3004,6 +3004,22 @@ function proofCardInputFromLoop(loop: LoopRecord): MartinProofCardInput {
   )
     ? "captured"
     : "not-recorded";
+  const remainingBudget = Math.max(0, loop.budget.maxUsd - loop.cost.actualUsd);
+  const overspendRatio =
+    loop.budget.maxUsd > 0 ? `${(loop.cost.actualUsd / loop.budget.maxUsd).toFixed(2)}x` : "unknown";
Comment on lines +3007 to +3009


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Rename or recompute the new "overspend ratio" metric.

This is actualUsd / maxUsd, so in-budget runs now emit values like 0.77x. The receipt later renders that under the "Overspend ratio" label, which reads as though the run exceeded budget when it did not. Either rename the field to budget utilization/spend ratio, or only emit an overspend metric once actualUsd > maxUsd.

Also applies to: 3059-3060

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/cli/src/index.ts` around lines 3007 - 3009, The computed metric
named overspendRatio is actually actualUsd/maxUsd (so values <1 indicate under
budget); update the logic to either (A) rename the variable to spendRatio (e.g.,
spendRatio = loop.budget.maxUsd > 0 ? `${(loop.cost.actualUsd /
loop.budget.maxUsd).toFixed(2)}x` : "unknown" and use that where you mean
"budget utilization"), or (B) keep overspendRatio but only compute/emit it when
loop.cost.actualUsd > loop.budget.maxUsd (e.g., overspendRatio =
loop.cost.actualUsd > loop.budget.maxUsd ? `${(loop.cost.actualUsd /
loop.budget.maxUsd).toFixed(2)}x` : undefined/"0x"). Apply the same change for
the second occurrence using the same symbols (remainingBudget, overspendRatio,
loop.cost.actualUsd, loop.budget.maxUsd).
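
The two options in the finding can be sketched side by side. This is a minimal sketch under assumptions: the function names `spendRatio` and `overspendRatio` are illustrative and take plain numbers rather than the `loop` record, and returning `undefined` for in-budget runs is one of the two fallbacks the prompt mentions.

```typescript
// Option A: report budget utilization under a neutral name.
function spendRatio(actualUsd: number, maxUsd: number): string {
  return maxUsd > 0 ? `${(actualUsd / maxUsd).toFixed(2)}x` : "unknown";
}

// Option B: keep the overspend label, but only emit a value when spend exceeded the budget.
function overspendRatio(actualUsd: number, maxUsd: number): string | undefined {
  if (maxUsd <= 0) return "unknown";
  return actualUsd > maxUsd ? `${(actualUsd / maxUsd).toFixed(2)}x` : undefined;
}
```

For the receipt in this PR (`$0.51` spent against `$3.00`), option A yields `0.17x` as a spend ratio, while option B yields no overspend value because the run stayed within budget.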

+  const verificationStepCount = loop.events.filter((event) => event.type === "verification.completed").length;
+  const latestAttempt = loop.attempts.at(-1);
+  const runtime = latestAttempt
+    ? `${latestAttempt.adapterId} / ${latestAttempt.model}`
+    : loop.events
+        .map((event) => event.payload)
+        .find((payload) => typeof payload["adapterId"] === "string" || typeof payload["model"] === "string");
Comment on lines +3014 to +3016


Severity: high

If loop.events contains any event with an undefined or null payload, loop.events.map(event => event.payload) will include undefined or null in the resulting array. When .find() iterates over this array, it will pass undefined or null to the callback, causing a TypeError: Cannot read properties of undefined (reading 'adapterId') when attempting to access payload['adapterId']. Adding a truthiness check for payload prevents this potential runtime crash.

Suggested change
-    : loop.events
-        .map((event) => event.payload)
-        .find((payload) => typeof payload["adapterId"] === "string" || typeof payload["model"] === "string");
+    : loop.events
+        .map((event) => event.payload)
+        .find((payload) => payload && (typeof payload["adapterId"] === "string" || typeof payload["model"] === "string"));
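
The guard can also be shown in isolation. This is a self-contained sketch, not the PR's code: `LoopEvent` and `findRuntimePayload` are hypothetical stand-ins, and whether the real `LoopRecord` event type allows a missing payload is an assumption. It uses a plain loop instead of `map` and `find`, with the same truthiness check as the suggestion.

```typescript
// Hypothetical event shape; assumes payload may be absent or null at runtime.
interface LoopEvent {
  type: string;
  payload?: Record<string, unknown> | null;
}

// Returns the first payload that names an adapter or model, skipping empty payloads.
function findRuntimePayload(events: readonly LoopEvent[]): Record<string, unknown> | undefined {
  for (const event of events) {
    const payload = event.payload;
    // The truthiness check prevents a TypeError when payload is null or undefined.
    if (payload && (typeof payload["adapterId"] === "string" || typeof payload["model"] === "string")) {
      return payload;
    }
  }
  return undefined;
}
```

Events with no payload are skipped rather than crashing the receipt build, so the caller still falls through to the `"not recorded"` label when nothing matches.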

+  const runtimeLabel =
+    typeof runtime === "string"
+      ? runtime
+      : runtime
+        ? `${String(runtime["adapterId"] ?? "unknown")} / ${String(runtime["model"] ?? "unknown")}`
+        : "not recorded";

   return {
     loopId: loop.loopId,
@@ -3013,8 +3029,14 @@ function proofCardInputFromLoop(loop: LoopRecord): MartinProofCardInput {
     verifierStatus: verification.status,
     costSpend: `$${loop.cost.actualUsd.toFixed(2)}`,
     budget: `$${loop.budget.maxUsd.toFixed(2)}`,
+    remainingBudget: `$${remainingBudget.toFixed(2)}`,
+    overspendRatio,
     attempts: loop.attempts.length,
     rollbackStatus,
+    verificationStepCount,
+    runMode: loop.task.mutationMode ?? "not recorded",
+    runtime: runtimeLabel,
+    timelineEvents: loop.events.map((event) => event.type),
     haltReason: latestExitReason(loop),
     evidenceBoundaryNotes: [
       "Generated from a local Martin Loop run record.",
@@ -3034,8 +3056,14 @@ function defaultChallengeProofCardInput(): MartinProofCardInput {
     verifierStatus: "passed",
     costSpend: "$2.30",
     budget: "$3.00",
+    remainingBudget: "$0.70",
+    overspendRatio: "0.77x",
     attempts: 2,
     rollbackStatus: "captured",
+    verificationStepCount: 1,
+    runMode: "mutating",
+    runtime: "demo / local-fixture",
+    timelineEvents: ["run.started", "attempt.started", "verification.completed", "budget.updated", "run.completed"],
     haltReason: "verifier_passed",
     evidenceBoundaryNotes: [
       "Generated from a local Martin Loop run record.",