feat(dRAG): Decentralized GraphRAG V1 — verifiable, cross-node, monetizable answering (OT-RFC-55) #1314

Draft
branarakic wants to merge 5 commits into main from feat/drag-v1

@branarakic (Contributor)

feat: Decentralized GraphRAG (dRAG) V1 — verifiable, cross-node, monetizable answering (OT-RFC-55)

Implements OT-RFC-55 as a working V1 on main (v10.0.0): ask one natural-language
question and get a grounded answer whose every fact is independently auditable
against the chain. The question is answered locally, or fanned out across the
nodes serving a public Context Graph and re-verified by the asker, with an x402
payment seam wired for monetization.

Built in four layered, independently-shippable increments. Each was validated on a
live 4-node devnet.

What's in it

P1 — Verifiable citation core (core/crypto/citation.ts, agent/drag/citation.ts)
Composes the already-shipped V10 primitives into one auditable citation object
{UAL, node, author-sig, merkle, on-chain}:

  • Merkle inclusion over the keccak V10 structured tree (buildV10ProofMaterial),
    re-anchored to the live on-chain getLatestMerkleRoot(kaId) — so a citation that
    verifies offline verifies on-chain by construction.
  • Content-binding: the proof binds to the exact cited triple (tripleContentV10).
  • EIP-712 author-seal recovery (recoverCitationAuthor), with the chain-verified
    on-chain author as the authoritative fallback when the _meta seal is absent.
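
To make the Merkle-inclusion and content-binding bullets concrete, here is a minimal, dependency-free sketch of the general technique. It is not the PR's code: every name in it is hypothetical, the toy string hash merely stands in for keccak256, and the sorted-pair hashing and odd-node promotion are assumptions rather than the documented V10 tree layout. In the real flow the trusted root would come from the chain (getLatestMerkleRoot), not from a locally built tree.

```typescript
// Illustrative only. toyHash is NOT cryptographic; it stands in for keccak256.
type Hex = string;

function toyHash(input: string): Hex {
  let h = 7;
  for (let i = 0; i < input.length; i++) {
    h = (h * 31 + input.charCodeAt(i)) >>> 0; // stays within exact integer range
  }
  return ("0000000" + h.toString(16)).slice(-8);
}

// Assumption: sorted-pair hashing, so a proof needs no left/right flags.
function hashPair(a: Hex, b: Hex): Hex {
  return a <= b ? toyHash(a + b) : toyHash(b + a);
}

// Assumption: an unpaired node is promoted unchanged to the next level.
function nextLevel(level: Hex[]): Hex[] {
  const out: Hex[] = [];
  for (let i = 0; i < level.length; i += 2) {
    out.push(i + 1 < level.length ? hashPair(level[i], level[i + 1]) : level[i]);
  }
  return out;
}

function buildRoot(leaves: Hex[]): Hex {
  if (leaves.length === 0) throw new Error("empty tree");
  let level = leaves;
  while (level.length > 1) level = nextLevel(level);
  return level[0];
}

// Collect the sibling at each level on the path from leaf `index` to the root.
function buildProof(leaves: Hex[], index: number): Hex[] {
  const proof: Hex[] = [];
  let level = leaves;
  let idx = index;
  while (level.length > 1) {
    const sibling = idx ^ 1;
    if (sibling < level.length) proof.push(level[sibling]);
    level = nextLevel(level);
    idx = idx >> 1;
  }
  return proof;
}

// Content binding: the leaf is derived from the cited triple itself, so a
// proof only verifies for exactly that triple against the trusted root.
function verifyInclusion(tripleContent: string, proof: Hex[], trustedRoot: Hex): boolean {
  let node = toyHash(tripleContent);
  for (const sibling of proof) node = hashPair(node, sibling);
  return node === trustedRoot;
}
```

Because the leaf is recomputed from the triple text, changing a single character of the cited fact makes the same proof fail against the same root.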

P2 — dkg_answer single-node (agent/dkg-agent-drag.ts, cli/routes/drag.ts, mcp-dkg)
question → keyword retrieval over per-KA verifiable-memory graphs → canonical triples → a citation per fact. No node-side LLM required (keyword/structural
baseline; LLM synthesis is a future enhancement). New POST /api/answer,
DkgClient.answer(), and the dkg_answer MCP tool.
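
The keyword baseline can be pictured roughly as follows. This is a hedged sketch rather than the PR's retrieval code: the Triple and CitedFact shapes, the stopword list, and the overlap-count scoring are all assumptions made for illustration, and the real path attaches a verifiable citation to each returned fact.

```typescript
// Hypothetical shapes; the real DragAnswerResult types live in mcp-dkg.
interface Triple { subject: string; predicate: string; object: string; kaUal: string }
interface CitedFact { triple: Triple; score: number }

// Assumed stopword list, chosen only for this example.
const STOPWORDS = ["the", "and", "for", "does", "did", "what", "where", "who", "when", "which", "how", "are", "was"];

function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 2 && STOPWORDS.indexOf(t) === -1);
}

// Score each triple by how many question keywords appear among its tokens,
// keep only matches, and cap the result at maxCitations.
function retrieveFacts(question: string, triples: Triple[], maxCitations: number): CitedFact[] {
  const keywords = tokenize(question);
  const scored: CitedFact[] = [];
  for (const triple of triples) {
    const tokens = tokenize(triple.subject + " " + triple.predicate + " " + triple.object);
    let score = 0;
    for (const k of keywords) {
      if (tokens.indexOf(k) !== -1) score++;
    }
    if (score > 0) scored.push({ triple, score });
  }
  scored.sort((a, b) => b.score - a.score);
  return scored.slice(0, maxCitations);
}
```

No language model is involved at any step, which matches the "no node-side LLM required" claim; the trade-off is that matching is purely lexical.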

P3 — Cross-node fan-out (scope:"network")
findNodesServingCG reads the contextGraphsServed phonebook (the §5.1
context-oracle index); a new PROTOCOL_DRAG_ANSWER libp2p protocol lets a peer
answer over a public CG (ACL = the same isContextGraphPublicOnChain
fail-closed gate as query-remote). dragAnswerNetwork fans out, dedups, and —
crucially — re-verifies every citation against the asker's own chain, so the
asker trusts no serving node's self-reported verdict and can answer over knowledge
it does not hold. Optional explicit peers override for not-yet-gossiped CGs.
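
The trust model in this step (ignore each peer's self-reported verdict, re-verify locally, then dedup) might look something like the sketch below. It is illustrative only: the types and function names are hypothetical, the re-verifier is passed in as a synchronous callback for brevity, and the real dragAnswerNetwork performs chain reads and is bounded in ways not shown here.

```typescript
// Hypothetical wire shape for a citation returned by a serving node.
interface RemoteCitation {
  ual: string;
  triple: string;
  proof: string;
  claimedVerified: boolean; // the peer's own verdict, deliberately never trusted
  servedBy: string;
}
interface CheckedCitation extends RemoteCitation { verified: boolean }

// Stand-in for re-verification against the asker's own chain.
type Reverify = (c: RemoteCitation) => boolean;

function mergePeerAnswers(responses: RemoteCitation[][], reverify: Reverify): CheckedCitation[] {
  const byFact: { [key: string]: CheckedCitation } = {};
  const order: string[] = [];
  for (const peerCitations of responses) {
    for (const c of peerCitations) {
      // The asker's verdict comes only from its own check.
      const checked: CheckedCitation = {
        ual: c.ual,
        triple: c.triple,
        proof: c.proof,
        claimedVerified: c.claimedVerified,
        servedBy: c.servedBy,
        verified: reverify(c),
      };
      const key = c.ual + "\n" + c.triple;
      const existing = byFact[key];
      if (existing === undefined) {
        byFact[key] = checked;
        order.push(key);
      } else if (!existing.verified && checked.verified) {
        // Dedup prefers a verified citation over an unverified duplicate.
        byFact[key] = checked;
      }
    }
  }
  return order.map((k) => byFact[k]);
}
```

With this shape, a peer that answers first with a bad proof cannot suppress a later honest citation for the same fact.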

P4 — x402 payment seam (cli/daemon/payment.ts)
Public CGs are free in V1, but the HTTP-402 challenge + X-PAYMENT wire format
and a pluggable PaymentVerifier (MockPaymentVerifier for dev/CI) are in place so
the real Coinbase/USDC facilitator drops in behind one interface. simulatePrice
exercises the full 402 → pay → 200 + receipt flow. Localized in the drag route —
does not touch the central request chain.
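
As a rough illustration of the three-way gate (free, challenge, paid), consider the sketch below. It is an assumption-laden simplification and not the contents of cli/daemon/payment.ts: the type names, the 402 body fields, and the accept-any verifier are hypothetical, and the X-PAYMENT value is treated as an opaque string rather than parsed according to the x402 wire format.

```typescript
// Hypothetical outcome type for the payment gate.
type PaymentOutcome =
  | { kind: "free" }
  | { kind: "challenge"; status: 402; body: { error: string; price: string } }
  | { kind: "paid"; receipt: string };

// Hypothetical verifier interface: returns a receipt, or null if rejected.
interface SketchVerifier {
  verify(xPayment: string, price: string): string | null;
}

// Dev/CI-style verifier that accepts any non-empty payment (assumed behaviour).
const acceptAnyVerifier: SketchVerifier = {
  verify: (xPayment, price) => (xPayment.length > 0 ? "mock-receipt:" + price : null),
};

// Pure gate: no price means free; a price without a valid payment yields a
// 402 challenge; a price with an accepted payment yields a receipt.
function gatePayment(
  price: string | undefined,
  xPayment: string | undefined,
  verifier: SketchVerifier,
): PaymentOutcome {
  if (price === undefined) return { kind: "free" };
  const challenge: PaymentOutcome = {
    kind: "challenge",
    status: 402,
    body: { error: "payment required", price },
  };
  if (xPayment === undefined) return challenge;
  const receipt = verifier.verify(xPayment, price);
  return receipt === null ? challenge : { kind: "paid", receipt };
}
```

Keeping the gate pure means a real facilitator only has to satisfy the verifier interface; the route logic that maps outcomes to HTTP responses stays unchanged.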

Validation

  • Unit: 18 citation tests (core+agent) + 14 payment tests (cli). Full suites
    green for every touched package (core 1092, mcp-dkg 325, agent ~1740, cli ~2120).
  • Live devnet (4 nodes): combined run, 9/9 —
    • local cited answer, every citation author-sig ✓ merkle ✓ on-chain ✓;
    • priced request returns 402, then 200 + settlement receipt with a valid
      X-PAYMENT, paid answer still fully verified;
    • an asker holding no copy of the CG assembled a multi-fact answer from
      remote serving nodes, every citation re-verified against its own chain.

Security hardening (adversarial review)

A multi-dimension adversarial review of the diff confirmed 7 findings (3 high,
2 medium, 2 low), all addressed in the hardening commit:

  • Remote handler DoS — the unauthenticated PROTOCOL_DRAG_ANSWER verb is now
    per-peer rate-limited and clamped to a small remote cost ceiling.
  • Asker fan-out bounds — peer fan-out is capped + concurrency-bounded (no
    caller-driven reflection); per-peer citation re-verification is bounded.
  • Malformed-peer isolation — each peer response is shape-validated and each
    citation is guarded, so one bad/version-skewed peer can't fail the whole answer.
  • CG-scope binding — re-verification confirms each citation's KA belongs to
    the asked context graph (getKAContextGraphId); a peer can't pass off a
    verifiable fact from a different KA, and the asker never trusts the remote's
    scope fields.
  • Verdict integrity — the verdict cache is keyed on the proof and dedup
    prefers a verified citation, so a bad proof can't poison/suppress an honest one.
  • Defense-in-depth: validateContextGraphId before SPARQL interpolation;
    the leaf count is re-anchored against the chain (the pure check was tautological).
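
The first bullet (per-peer rate limiting plus a clamped cost ceiling) could be approximated as below. This is a sketch under stated assumptions: the sliding-window strategy, the function names, and the injected clock are illustrative choices, while the figures used in the usage note (30 requests per minute, a ceiling of 15) are the ones quoted in the hardening commit message.

```typescript
// Hypothetical sliding-window limiter keyed by peer id. The clock is injected
// (nowMs) so the behaviour is deterministic and easy to test.
function createPerPeerLimiter(limit: number, windowMs: number) {
  const hits: { [peerId: string]: number[] } = {};
  return function allow(peerId: string, nowMs: number): boolean {
    // Keep only timestamps that still fall inside the window.
    const recent = (hits[peerId] || []).filter((t) => nowMs - t < windowMs);
    if (recent.length >= limit) {
      hits[peerId] = recent;
      return false;
    }
    recent.push(nowMs);
    hits[peerId] = recent;
    return true;
  };
}

// Clamp a caller-requested limit to a small ceiling on the remote path.
// Missing or nonsensical values fall back to the ceiling (an assumption).
function clampRemoteLimit(requested: number | undefined, ceiling: number): number {
  if (requested === undefined || !isFinite(requested) || requested < 1) return ceiling;
  return Math.min(Math.floor(requested), ceiling);
}
```

For example, `createPerPeerLimiter(30, 60000)` admits 30 requests from one peer within a minute and rejects the 31st, while `clampRemoteLimit(100, 15)` reduces an oversized request to 15.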

The two refuted findings (a "second-order IRI injection" and a "dedup delimiter
collision") were verified non-exploitable: the store's serialization boundary
blocks the former, and the terminal-field key layout rules out the latter.

Reasoned, not unit-tested (validated by code review + the live happy path, but
without a dedicated adversarial test): the off-scope citation drop (a malicious
peer returning a fact from a different KA — the CG-scope binding is confirmed
active on the non-subscriber path, since ontology id-mappings sync network-wide),
the verdict-cache collision fix, and the dkg_answer MCP stdio tool path (the
underlying /api/answer endpoint is live-validated). The CG-scope check fails
open (facts stay cryptographically verified, scope unconfirmed) only when the
asker cannot resolve the CG's on-chain id — surfaced explicitly in the answer.
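
The scope behaviour described here (drop off-scope facts, but fail open with an explicit flag when the asked CG's on-chain id cannot be resolved) can be sketched as follows. All names are hypothetical, and the KA-to-CG lookup is modelled as a plain callback standing in for a chain read such as getKAContextGraphId.

```typescript
// Hypothetical shapes for an already cryptographically verified fact.
interface VerifiedFact { kaId: string; triple: string }
interface ScopedFact extends VerifiedFact { scope: "in-scope" | "unconfirmed" }

function bindScope(
  facts: VerifiedFact[],
  askedCgOnChainId: string | null, // null: the asker could not resolve the CG's on-chain id
  kaToCg: (kaId: string) => string, // stand-in for the on-chain KA -> CG lookup
): ScopedFact[] {
  const kept: ScopedFact[] = [];
  for (const fact of facts) {
    if (askedCgOnChainId === null) {
      // Fail open: the fact stays verified, but its scope is flagged unconfirmed.
      kept.push({ kaId: fact.kaId, triple: fact.triple, scope: "unconfirmed" });
    } else if (kaToCg(fact.kaId) === askedCgOnChainId) {
      kept.push({ kaId: fact.kaId, triple: fact.triple, scope: "in-scope" });
    }
    // Otherwise the KA belongs to a different CG and the fact is dropped.
  }
  return kept;
}
```

Surfacing "unconfirmed" in the result, rather than silently passing the fact through, is what keeps the fail-open case visible to the caller.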

Honest scope / deferred (NOT in V1)

  • Confidential / ZK answering (RFC §5.3) — citations are over public (VM)
    facts; private-data answering and the ZK layers are future work.
  • Private-CG routing — fan-out is public-CG only (the phonebook deliberately
    advertises only public, subscribed CGs).
  • Real x402 settlement: MockPaymentVerifier only; no live facilitator / USDC
    (the node has no USDC and is ethers-only; not CI-runnable).
  • LLM synthesis + question→CG auto-routing — the keyword baseline is the V1
    headline; LLM is a clean seam (no key on devnet). The caller names the CG.
  • Phonebook propagation latency: contextGraphsServed advertisements integrate
    into peers' agents-CG on the profile-heartbeat cadence, so discovery can lag a
    just-created CG; the explicit peers override is the deterministic fast-path.

How to try it

./scripts/devnet.sh start 4
# create a public CG, publish a KA, then:
curl -s localhost:9201/api/answer -H "Authorization: Bearer $TOKEN" \
  -d '{"question":"...","contextGraphId":"<cg>","scope":"network"}'

🤖 Generated with Claude Code

Branimir Rakic and others added 5 commits June 24, 2026 02:49
Compose the SHIPPED V10 primitives into one auditable citation object
{UAL, node, author-sig, merkle, on-chain}:
- core/crypto/citation.ts: wire type + pure Merkle/content-binding verify
  (reuses verifyV10ProofMaterial; keccak V10 tree, not the sha256 oracle
  path, so a citation that verifies here verifies on-chain by construction).
- agent/drag/citation.ts: producer (extractV10KCFromStore -> buildV10ProofMaterial
  re-anchored to getLatestMerkleRoot) + EIP-712 author-seal recovery (ethers),
  with on-chain author as the authoritative fallback when the _meta seal is absent.

17 unit tests: pure proof + content-binding + tamper detection (core),
real-wallet EIP-712 recovery + live re-anchor + authorSig fallback (agent).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…55 P2)

question -> keyword retrieval over per-KA verifiable-memory graphs ->
canonical triples -> a VerifiableCitation per cited fact. No node-side LLM
required (keyword/structural baseline; LLM synthesis is a future enhancement).

- agent: DragMethods.dragAnswerLocal mixin; citation producer split into
  prepareKaCitation (extract + chain reads + seal, once per KA) + citeTriple
  (pure proof per fact). Author seal loaded from the name-scoped _meta graph.
- cli: POST /api/answer route (routes/drag.ts) wired into the dispatch chain.
- mcp-dkg: dkg_answer tool + DkgClient.answer() + DragAnswerResult types.

Validated on a live 4-node devnet: every cited fact returns
author-sig OK + merkle OK + on-chain OK (re-anchored to the live root).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
scope:"network" answers one question across the nodes serving a PUBLIC
context graph, then RE-VERIFIES every citation against the asker's own chain
— so the asker trusts no serving node's self-reported verdict, and can answer
over knowledge it does not hold.

- core: PROTOCOL_DRAG_ANSWER libp2p protocol constant.
- agent: DiscoveryClient.findNodesServingCG (reads the contextGraphsServed
  phonebook = the §5.1 context-oracle index); PROTOCOL_DRAG_ANSWER handler
  (public-CG only, fail-closed via isContextGraphPublicOnChain);
  dragAnswerRemote (one peer) + dragAnswerNetwork (fan out, dedup, per-key
  re-verify, per-node trust breakdown). Optional explicit "peers" override
  for when an advertisement has not yet gossiped into the local phonebook.
- cli/mcp: scope + peers plumbed through /api/answer, DkgClient.answer, dkg_answer.

Validated on a live 4-node devnet: an asker holding NO copy of the CG
assembled a 3-fact answer from 2 remote serving nodes, every citation
re-verified (author-sig + merkle + on-chain) against the chain.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Wire the monetization rail without a real facilitator: public CGs stay FREE
in V1, but the HTTP 402 challenge + X-PAYMENT format + a pluggable
PaymentVerifier are in place so the Coinbase/USDC facilitator drops in behind
one interface.

- cli/daemon/payment.ts: x402 wire types, PaymentVerifier interface,
  MockPaymentVerifier (accept-any dev/CI verifier returning a synthetic
  receipt), parsePrice / parseXPaymentHeader / build402Body, and the pure
  resolvePayment gate (free | challenge | paid).
- routes/drag.ts: localized payment gate (does NOT touch the central request
  chain). simulatePrice exercises 402 -> pay -> 200+receipt with the mock;
  real per-CG pricing + facilitator deferred.
- mcp: settlement receipt surfaced on DragAnswerResult + the dkg_answer summary.

14 unit tests (parse/verify/gate). Live devnet: priced request 402s without
payment, returns 200 + receipt with a valid X-PAYMENT, and the paid answer is
still fully verifiable.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… binding)

Fixes the confirmed findings from the multi-dimension review of P1-P4.

HIGH:
- Remote PROTOCOL_DRAG_ANSWER handler is unauthenticated (libp2p peers carry
  no token): add a per-peer rate limit (30/min) + clamp the remote path to a
  small cost ceiling (maxKas/maxCitations=15, vs 100/50 locally) so one cheap
  request can't trigger a heavy store scan + hundreds of chain reads.
- dragAnswerNetwork: cap the peer fan-out (<=24) + bound concurrency (8) so a
  caller-supplied peers[] can't make this node a reflector; cap citations
  re-verified per peer (<=64) so one peer's large response can't exhaust the
  asker's RPC budget.
- Validate each peer response shape (isValidDragAnswerResult) before use +
  guard each citation in a try/catch, so one malformed/version-skewed peer
  becomes a per-node error instead of failing the whole answer.

MEDIUM:
- CG-scope binding: re-verification now confirms each citation's KA belongs to
  the asked context graph (getKAContextGraphId) and stamps the asker-derived
  CG id — a peer can no longer pass off a verifiable fact from a DIFFERENT KA.
- Verdict cache keyed on the proof (not just the fact) + dedup prefers a
  verified citation, so a bad proof can't poison or suppress an honest one.

LOW:
- validateContextGraphId before any SPARQL interpolation (parity + defense-in-depth).
- verifyVerifiableCitation re-anchors the leaf count against getMerkleLeafCount
  (the pure check was tautological).

Re-validated on a live 4-node devnet (P2 local + P4 402-pay + P3 network all
green); +1 unit test for the leaf-count re-anchor.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>