feat(dRAG): Decentralized GraphRAG V1 — verifiable, cross-node, monetizable answering (OT-RFC-55)#1314
Draft
branarakic wants to merge 5 commits into
Compose the SHIPPED V10 primitives into one auditable citation object
{UAL, node, author-sig, merkle, on-chain}:
- core/crypto/citation.ts: wire type + pure Merkle/content-binding verify
(reuses verifyV10ProofMaterial; keccak V10 tree, not the sha256 oracle
path, so a citation that verifies here verifies on-chain by construction).
- agent/drag/citation.ts: producer (extractV10KCFromStore -> buildV10ProofMaterial
re-anchored to getLatestMerkleRoot) + EIP-712 author-seal recovery (ethers),
with on-chain author as the authoritative fallback when the _meta seal is absent.
17 unit tests: pure proof + content-binding + tamper detection (core),
real-wallet EIP-712 recovery + live re-anchor + authorSig fallback (agent).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…55 P2)
question -> keyword retrieval over per-KA verifiable-memory graphs -> canonical triples -> a VerifiableCitation per cited fact. No node-side LLM required (keyword/structural baseline; LLM synthesis is a future enhancement).
- agent: DragMethods.dragAnswerLocal mixin; citation producer split into prepareKaCitation (extract + chain reads + seal, once per KA) + citeTriple (pure proof per fact). Author seal loaded from the name-scoped _meta graph.
- cli: POST /api/answer route (routes/drag.ts) wired into the dispatch chain.
- mcp-dkg: dkg_answer tool + DkgClient.answer() + DragAnswerResult types.
Validated on a live 4-node devnet: every cited fact returns author-sig OK + merkle OK + on-chain OK (re-anchored to the live root).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
scope:"network" answers one question across the nodes serving a PUBLIC context graph, then RE-VERIFIES every citation against the asker's own chain, so the asker trusts no serving node's self-reported verdict and can answer over knowledge it does not hold.
- core: PROTOCOL_DRAG_ANSWER libp2p protocol constant.
- agent: DiscoveryClient.findNodesServingCG (reads the contextGraphsServed phonebook = the §5.1 context-oracle index); PROTOCOL_DRAG_ANSWER handler (public-CG only, fail-closed via isContextGraphPublicOnChain); dragAnswerRemote (one peer) + dragAnswerNetwork (fan out, dedup, per-key re-verify, per-node trust breakdown). Optional explicit "peers" override for when an advertisement has not yet gossiped into the local phonebook.
- cli/mcp: scope + peers plumbed through /api/answer, DkgClient.answer, dkg_answer.
Validated on a live 4-node devnet: an asker holding NO copy of the CG assembled a 3-fact answer from 2 remote serving nodes, every citation re-verified (author-sig + merkle + on-chain) against the chain.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Wire the monetization rail without a real facilitator: public CGs stay FREE in V1, but the HTTP 402 challenge + X-PAYMENT format + a pluggable PaymentVerifier are in place so the Coinbase/USDC facilitator drops in behind one interface.
- cli/daemon/payment.ts: x402 wire types, PaymentVerifier interface, MockPaymentVerifier (accept-any dev/CI verifier returning a synthetic receipt), parsePrice / parseXPaymentHeader / build402Body, and the pure resolvePayment gate (free | challenge | paid).
- routes/drag.ts: localized payment gate (does NOT touch the central request chain). simulatePrice exercises 402 -> pay -> 200+receipt with the mock; real per-CG pricing + facilitator deferred.
- mcp: settlement receipt surfaced on DragAnswerResult + the dkg_answer summary.
14 unit tests (parse/verify/gate). Live devnet: priced request 402s without payment, returns 200 + receipt with a valid X-PAYMENT, and the paid answer is still fully verifiable.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… binding)
Fixes the confirmed findings from the multi-dimension review of P1-P4.
HIGH:
- Remote PROTOCOL_DRAG_ANSWER handler is unauthenticated (libp2p peers carry no token): add a per-peer rate limit (30/min) + clamp the remote path to a small cost ceiling (maxKas/maxCitations=15, vs 100/50 locally) so one cheap request can't trigger a heavy store scan + hundreds of chain reads.
- dragAnswerNetwork: cap the peer fan-out (<=24) + bound concurrency (8) so a caller-supplied peers[] can't make this node a reflector; cap citations re-verified per peer (<=64) so one peer's large response can't exhaust the asker's RPC budget.
- Validate each peer response shape (isValidDragAnswerResult) before use + guard each citation in a try/catch, so one malformed/version-skewed peer becomes a per-node error instead of failing the whole answer.
MEDIUM:
- CG-scope binding: re-verification now confirms each citation's KA belongs to the asked context graph (getKAContextGraphId) and stamps the asker-derived CG id; a peer can no longer pass off a verifiable fact from a DIFFERENT KA.
- Verdict cache keyed on the proof (not just the fact) + dedup prefers a verified citation, so a bad proof can't poison or suppress an honest one.
LOW:
- validateContextGraphId before any SPARQL interpolation (parity + defense-in-depth).
- verifyVerifiableCitation re-anchors the leaf count against getMerkleLeafCount (the pure check was tautological).
Re-validated on a live 4-node devnet (P2 local + P4 402-pay + P3 network all green); +1 unit test for the leaf-count re-anchor.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
feat: Decentralized GraphRAG (dRAG) V1 — verifiable, cross-node, monetizable answering (OT-RFC-55)
Implements OT-RFC-55 as a working V1 on
main (v10.0.0): ask one natural-language question and get a grounded answer whose every fact is independently auditable
against the chain — answered locally, or fanned out across the nodes serving a
public Context Graph and re-verified by the asker, with an x402 payment seam
wired for monetization.
Built in four layered, independently-shippable increments. Each was validated on a
live 4-node devnet.
What's in it
P1: Verifiable citation core (core/crypto/citation.ts, agent/drag/citation.ts)
Composes the already-shipped V10 primitives into one auditable citation object {UAL, node, author-sig, merkle, on-chain}:
- … (buildV10ProofMaterial), re-anchored to the live on-chain getLatestMerkleRoot(kaId), so a citation that verifies offline verifies on-chain by construction.
- … (tripleContentV10).
- … (recoverCitationAuthor), with the chain-verified on-chain author as the authoritative fallback when the _meta seal is absent.
P2: dkg_answer single-node (agent/dkg-agent-drag.ts, cli/routes/drag.ts, mcp-dkg)
question → keyword retrieval over per-KA verifiable-memory graphs → canonical triples → a citation per fact. No node-side LLM required (keyword/structural baseline; LLM synthesis is a future enhancement). New POST /api/answer, DkgClient.answer(), and the dkg_answer MCP tool.
P3: Cross-node fan-out (scope:"network")
findNodesServingCG reads the contextGraphsServed phonebook (the §5.1 context-oracle index); a new PROTOCOL_DRAG_ANSWER libp2p protocol lets a peer answer over a public CG (ACL = the same isContextGraphPublicOnChain fail-closed gate as query-remote). dragAnswerNetwork fans out, dedups, and, crucially, re-verifies every citation against the asker's own chain, so the asker trusts no serving node's self-reported verdict and can answer over knowledge it does not hold. Optional explicit peers override for not-yet-gossiped CGs.
P4: x402 payment seam (cli/daemon/payment.ts)
Public CGs are free in V1, but the HTTP-402 challenge + X-PAYMENT wire format and a pluggable PaymentVerifier (MockPaymentVerifier for dev/CI) are in place so the real Coinbase/USDC facilitator drops in behind one interface. simulatePrice exercises the full 402 → pay → 200 + receipt flow. Localized in the drag route; does not touch the central request chain.
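As a rough illustration of the P4 seam, the three-way gate (free, challenge, or paid) might look like the sketch below. The names PaymentVerifier and resolvePayment come from this PR, but the signatures, the Receipt and Gate shapes, and the synchronous verify call are assumptions made so the example stays small; a real facilitator call would almost certainly be asynchronous.

```typescript
// Hypothetical shapes: only the names PaymentVerifier / MockPaymentVerifier /
// resolvePayment appear in the PR; everything else here is illustrative.
type Receipt = { id: string; amount: string };

interface PaymentVerifier {
  // Returns a receipt when the X-PAYMENT payload is acceptable, else null.
  verify(xPayment: string, price: string): Receipt | null;
}

// Accept-any verifier for dev/CI: always returns a synthetic receipt.
const mockVerifier: PaymentVerifier = {
  verify(_xPayment: string, price: string): Receipt {
    return { id: "mock-receipt", amount: price };
  },
};

type Gate =
  | { kind: "free" }
  | { kind: "challenge"; status: 402; price: string }
  | { kind: "paid"; receipt: Receipt };

function resolvePaymentSketch(
  price: string | undefined,    // undefined: the context graph is unpriced
  xPayment: string | undefined, // raw X-PAYMENT header value, if any
  verifier: PaymentVerifier,
): Gate {
  if (price === undefined) return { kind: "free" };
  if (xPayment === undefined) return { kind: "challenge", status: 402, price };
  const receipt = verifier.verify(xPayment, price);
  // A rejected payment falls back to the 402 challenge rather than an error.
  return receipt
    ? { kind: "paid", receipt }
    : { kind: "challenge", status: 402, price };
}
```

Keeping the gate a pure function of (price, header, verifier) is what lets the route stay localized: the handler only has to map "challenge" to a 402 response and attach the receipt on "paid".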
Validation
- … green for every touched package (core 1092, mcp-dkg 325, agent ~1740, cli ~2120).
- … author-sig ✓ merkle ✓ on-chain ✓;
- … X-PAYMENT, paid answer still fully verified;
- … remote serving nodes, every citation re-verified against its own chain.
Security hardening (adversarial review)
A multi-dimension adversarial review of the diff confirmed 10 findings (3 high,
3 medium, 3 low), all addressed in the hardening commit:
- … PROTOCOL_DRAG_ANSWER verb is now per-peer rate-limited and clamped to a small remote cost ceiling.
- … caller-driven reflection); per-peer citation re-verification is bounded.
- … citation is guarded, so one bad/version-skewed peer can't fail the whole answer.
- … the asked context graph (getKAContextGraphId); a peer can't pass off a verifiable fact from a different KA, and the asker never trusts the remote's scope fields.
- … prefers a verified citation, so a bad proof can't poison/suppress an honest one.
- … validateContextGraphId before SPARQL interpolation; the leaf count is re-anchored against chain (the pure check was tautological).
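The first hardening item (a per-peer rate limit plus a clamp on the remote cost ceiling) could be sketched roughly as below. The 30/min limit and the ceiling of 15 are the figures quoted in the hardening commit; the class name, the sliding-window approach, and the clampRemote helper are hypothetical and not taken from the diff.

```typescript
// Sliding-window limiter: at most `limit` requests per peer within `windowMs`.
// PeerRateLimiter is an illustrative name, not the PR's implementation.
class PeerRateLimiter {
  private hits = new Map<string, number[]>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(peerId: string, now: number): boolean {
    // Keep only timestamps that are still inside the window.
    const recent = (this.hits.get(peerId) ?? []).filter(
      (t) => now - t < this.windowMs,
    );
    const ok = recent.length < this.limit;
    if (ok) recent.push(now);
    this.hits.set(peerId, recent);
    return ok;
  }
}

// Remote requests are clamped to a small ceiling regardless of what the
// peer asked for (15 per the hardening commit; locally the caps are higher).
const REMOTE_CEILING = 15;

function clampRemote(requested: number | undefined): number {
  const n = Number.isFinite(requested)
    ? Math.floor(requested as number)
    : REMOTE_CEILING;
  return Math.min(Math.max(n, 1), REMOTE_CEILING);
}
```

A handler would call allow(peerId, Date.now()) before doing any store or chain work, then pass the requested maxKas and maxCitations through clampRemote.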
The two refuted findings (a "second-order IRI injection" and a "dedup delimiter
collision") were verified non-exploitable (the store's serialization boundary and
the terminal-field key layout, respectively).
Reasoned, not unit-tested (validated by code review + the live happy path, but
without a dedicated adversarial test): the off-scope citation drop (a malicious
peer returning a fact from a different KA — the CG-scope binding is confirmed
active on the non-subscriber path, since ontology id-mappings sync network-wide),
the verdict-cache collision fix, and the
dkg_answer MCP stdio tool path (the underlying
/api/answer endpoint is live-validated). The CG-scope check fails open (facts stay cryptographically verified, scope unconfirmed) only when the
asker cannot resolve the CG's on-chain id — surfaced explicitly in the answer.
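To make the scope-binding and dedup behaviour concrete, here is a minimal sketch under stated assumptions. The Citation and Verdict shapes, the function name, and both callbacks are hypothetical: cgOfKa stands in for an on-chain lookup such as getKAContextGraphId, and verifyProof stands in for the asker-side re-verification. The proof-keyed verdict cache is not modeled.

```typescript
// Illustrative shapes only; the PR's real citation type is not shown here.
type Citation = { factKey: string; kaId: string; proofId: string };
type Verdict = {
  citation: Citation;
  verified: boolean;
  scope: "confirmed" | "unconfirmed";
};

function reverifyAndDedup(
  citations: Citation[],
  askedCgId: string | undefined, // undefined: asker could not resolve the CG's on-chain id
  cgOfKa: (kaId: string) => string,
  verifyProof: (c: Citation) => boolean,
): Verdict[] {
  const best = new Map<string, Verdict>();
  for (const c of citations) {
    let scope: Verdict["scope"] = "unconfirmed";
    if (askedCgId !== undefined) {
      // Off-scope citation (KA belongs to a different CG): drop it.
      if (cgOfKa(c.kaId) !== askedCgId) continue;
      scope = "confirmed";
    }
    let verified = false;
    try {
      verified = verifyProof(c);
    } catch {
      verified = false; // a malformed citation becomes an unverified one
    }
    const prev = best.get(c.factKey);
    // Dedup per fact, preferring a verified citation over an unverified one.
    if (!prev || (!prev.verified && verified)) {
      best.set(c.factKey, { citation: c, verified, scope });
    }
  }
  return Array.from(best.values());
}
```

In this sketch the fail-open case shows up as scope: "unconfirmed" on every returned verdict, so the caller can surface it explicitly in the answer.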
Honest scope / deferred (NOT in V1)
- … facts; private-data answering and the ZK layers are future work.
- … advertises only public, subscribed CGs).
- … MockPaymentVerifier only; no live facilitator / USDC (the node has no USDC and is ethers-only; not CI-runnable).
- … headline; LLM is a clean seam (no key on devnet). The caller names the CG.
- … contextGraphsServed advertisements integrate into peers' agents-CG on the profile-heartbeat cadence, so discovery can lag a just-created CG; the explicit peers override is the deterministic fast-path.
How to try it
🤖 Generated with Claude Code