2 changes: 2 additions & 0 deletions packages/cli/package.json
@@ -38,6 +38,8 @@
"@origintrail-official/dkg-core": "workspace:*",
"@origintrail-official/dkg-mcp": "workspace:*",
"@origintrail-official/dkg-epcis": "workspace:*",
"@origintrail-official/dkg-okf": "workspace:*",
"@origintrail-official/dkg-ip-oracle": "workspace:*",
"@origintrail-official/dkg-node-ui": "workspace:*",
"@origintrail-official/dkg-publisher": "workspace:*",
"@origintrail-official/dkg-storage": "workspace:*",
4 changes: 4 additions & 0 deletions packages/cli/src/cli.ts
@@ -26,6 +26,8 @@ import { registerNodeOpsCommands } from './commands/node-ops.js';
import { registerQueryCatalogCommand } from './commands/query-catalog.js';
import { registerMaintenanceCommands } from './commands/maintenance.js';
import { registerRandomSamplingCommand } from './commands/random-sampling.js';
import { registerOkfCommand } from './commands/okf.js';
import { registerIpOracleCommand } from './commands/ip-oracle.js';

const program = new Command();
program
@@ -56,6 +58,8 @@ registerNodeOpsCommands(program);
registerQueryCatalogCommand(program);
registerMaintenanceCommands(program);
registerRandomSamplingCommand(program);
registerOkfCommand(program);
registerIpOracleCommand(program);

// ─── dkg integration ─────────────────────────────────────────────────

107 changes: 107 additions & 0 deletions packages/cli/src/commands/ip-oracle.ts
@@ -0,0 +1,107 @@
import { Command } from 'commander';
import { toErrorMessage } from '@origintrail-official/dkg-core';
import {
  writePatentBundle,
  ingestPatentExport,
  type PatentGenOptions,
} from '@origintrail-official/dkg-ip-oracle';

/**
 * `dkg ip-oracle` — engineering harness for the IP / Patent Context Oracle.
 *
 * `generate` emits a deterministic, **synthetic** Google-Patents-shaped OKF
 * bundle to disk (no BigQuery / GCP dependency), which is then ingested into a
 * PRIVATE Context Graph via `dkg okf import --private`. The data is SIMULATED —
 * every concept stamps `source: … [SIMULATED]` and a CC BY 4.0 licence so the
 * downstream redaction guard and the public article stay honest about what is
 * real vs. generated.
 *
 * This command writes files only; it never touches the node and spends nothing.
 */
export function registerIpOracleCommand(program: Command): void {
  const cmd = program
    .command('ip-oracle')
    .description('IP / Patent Context Oracle tooling (synthetic OKF patent corpora)');

  cmd
    .command('generate <outDir>')

🟡 Issue: ip-oracle generate has only library tests, not command tests

What's wrong
The generator library is tested, but the new user-facing command that parses options, validates --count, calls the generator, and emits the JSON summary is not. A broken CLI registration, bad commander parser, or wrong exit code could ship while the package unit tests still pass.

Example
`dkg ip-oracle generate /tmp/out --count 2 --seed 7` should exit 0, print JSON with `mode: "generate"`, and create the OKF files. `dkg ip-oracle generate /tmp/out --count 0` should exit 2. Those user-facing behaviors are not currently asserted.

Suggested direction
Cover the registered CLI path, not just the underlying generator library, so option parsing and exit behavior are locked down.

For Agents
Add a CLI-level test in packages/cli/test that runs the compiled CLI with a temp output directory. Assert success output and generated files for a small count, and assert the invalid-count exit path. This can run without a daemon because the command only writes files.
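
As a rough illustration of why the `--count` path deserves its own assertions, the sketch below replays the two steps visible in the diff (the `parseInt(v, 10)` option parser, then the `Number.isInteger(count) && count > 0` guard) against a handful of raw strings. The helper names `parseCountOption` and `isValidCount` and the case table are hypothetical, introduced only for this example; they are not part of the PR.

```typescript
// Mirrors the --count handling shown in the diff. Both function names are
// hypothetical stand-ins for the inline parser and the guard in the action.
function parseCountOption(raw: string): number {
  return parseInt(raw, 10);
}

function isValidCount(count: number): boolean {
  return Number.isInteger(count) && count > 0;
}

// Hypothetical inputs a CLI-level test might pass to --count.
const cases: Array<{ raw: string; accepted: boolean }> = [
  { raw: '2', accepted: true },
  { raw: '0', accepted: false },   // integer, but not positive
  { raw: 'abc', accepted: false }, // parseInt yields NaN
  { raw: '2.5', accepted: true },  // parseInt truncates to 2
  { raw: '7abc', accepted: true }, // parseInt stops at the first non-digit
];

for (const c of cases) {
  const ok = isValidCount(parseCountOption(c.raw));
  console.log(`${JSON.stringify(c.raw)} -> ${ok ? 'accepted' : 'rejected (exit 2)'}`);
}
```

The last two rows are the kind of behavior a library-only test would not surface: the guard sees an already-truncated integer, so inputs such as `2.5` pass. Whether that is acceptable is a call for the PR author; a CLI-level test would at least pin it down.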

🟡 Issue: The new ip-oracle CLI surface has no end-to-end CLI coverage

What's wrong
The package-level tests validate the generator and ingest mapper, but they do not verify the behavior users will run from the dkg binary. That leaves the new command wrapper and option plumbing unverified.

Example
A CLI regression such as a broken command registration, wrong option name, bad --count handling, or malformed JSON summary would not be caught by patent-generator.test.ts or patent-ingest.test.ts because those bypass Commander and call the library directly.

Suggested direction
Mirror the OKF CLI tests with small temp-dir fixtures so the parser, registration, option plumbing, output summary, and disk writes are covered together.

For Agents
Add packages/cli subcommand tests for ip-oracle generate and ip-oracle ingest: run the compiled CLI, assert exit codes and JSON summaries, verify expected files/shards are written, and confirm these commands do not contact the node.
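
One possible shape for the "assert JSON summaries" part, sketched under assumptions: the helper below checks only the fields the `generate` action visibly sets in this diff (`mode`, `synthetic`, and `files` as `conceptCount + 3`). The name `checkGenerateSummary` and the sample stdout are hypothetical; a real test would feed it the stdout captured from the compiled CLI.

```typescript
// Fields taken from the `generate` action in this diff. Anything else that
// writePatentBundle returns is treated as opaque here.
interface GenerateSummary {
  mode: string;
  conceptCount: number;
  files: number;
  synthetic: boolean;
}

// Hypothetical helper: returns a list of problems, empty when stdout looks right.
function checkGenerateSummary(stdout: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(stdout);
  } catch (_err) {
    return ['stdout is not valid JSON'];
  }
  if (typeof parsed !== 'object' || parsed === null) {
    return ['stdout is not a JSON object'];
  }
  const s = parsed as Partial<GenerateSummary>;
  const problems: string[] = [];
  if (s.mode !== 'generate') problems.push('mode should be "generate"');
  if (s.synthetic !== true) problems.push('synthetic should be true');
  if (typeof s.conceptCount !== 'number') {
    problems.push('conceptCount should be a number');
  } else if (s.files !== s.conceptCount + 3) {
    // The action computes files as conceptCount + 3.
    problems.push('files should equal conceptCount + 3');
  }
  return problems;
}

// Hypothetical stdout for `--count 2`; not captured from a real run.
const sampleStdout = JSON.stringify(
  { mode: 'generate', conceptCount: 2, files: 5, synthetic: true },
  null,
  2,
);
console.log(checkGenerateSummary(sampleStdout));
```

Returning a list of problems rather than throwing on the first mismatch keeps a failing test message informative when several fields drift at once.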

    .description('Generate a deterministic synthetic patent OKF bundle (no BigQuery needed)')
    .requiredOption('--count <n>', 'Number of patent concepts to emit', (v: string) => parseInt(v, 10))
    .option('--cpc-class <class>', 'CPC subclass tag, e.g. H04L', 'H04L')
    .option('--seed <n>', 'PRNG seed (same seed ⇒ identical corpus)', (v: string) => parseInt(v, 10), 42)
    .option('--citations-per-patent <n>', 'Max backward citations per patent', (v: string) => parseInt(v, 10))
    .option('--family-size <n>', 'Patents per simulated family', (v: string) => parseInt(v, 10))
    .option('--retrieval-date <iso>', 'Stamped retrieval / modified date (YYYY-MM-DD)')
    .action((outDir: string, opts: Record<string, unknown>) => {
      try {
        const count = Number(opts.count);
        if (!Number.isInteger(count) || count <= 0) {
          console.error('--count must be a positive integer.');
          process.exit(2);
        }
        const genOpts: PatentGenOptions = {
          cpcClass: String(opts.cpcClass ?? 'H04L'),
          count,
          seed: Number(opts.seed ?? 42),
          ...(opts.citationsPerPatent != null
            ? { citationsPerPatent: Number(opts.citationsPerPatent) }
            : {}),
          ...(opts.familySize != null ? { familySize: Number(opts.familySize) } : {}),
          ...(opts.retrievalDate ? { retrievalDate: String(opts.retrievalDate) } : {}),
        };
        const summary = writePatentBundle(genOpts, outDir);
        console.log(
          JSON.stringify(
            {
              mode: 'generate',
              ...summary,
              files: summary.conceptCount + 3, // patents + patents/index + index + log
              synthetic: true,
              note:
                'Synthetic SIMULATED corpus. Next: dkg okf import <outDir> ' +
                '--context-graph-id <cg> --private --create-context-graph',
            },
            null,
            2,
          ),
        );
      } catch (err) {
        console.error(toErrorMessage(err));
        process.exit(1);
      }
    });

  cmd
    .command('ingest <exportFile> <outDir>')
    .description(
      'Map a Google Patents Public Data NDJSON export (real data, run the BigQuery ' +
        'query yourself) into OKF bundle(s). Offline, deterministic, no GCP SDK.',
    )
    .option('--shard-by-cpc', 'Write one self-contained OKF bundle per CPC subclass (recommended at scale)')
    .option('--retrieval-date <iso>', 'Stamped retrieval / modified date (YYYY-MM-DD)')
    .action(async (exportFile: string, outDir: string, opts: Record<string, unknown>) => {
      try {
        const summary = await ingestPatentExport(exportFile, outDir, {
          shardByCpc: Boolean(opts.shardByCpc),
          ...(opts.retrievalDate ? { retrievalDate: String(opts.retrievalDate) } : {}),
        });
        console.log(
          JSON.stringify(
            {
              mode: 'ingest',
              ...summary,
              synthetic: false,
              note:
                'Real Google Patents Public Data (CC BY 4.0). Next, per shard: dkg okf ' +
                'import <shardDir> --context-graph-id <cg> --private --create-context-graph',
            },
            null,
            2,
          ),
        );
      } catch (err) {
        console.error(toErrorMessage(err));
        process.exit(1);
      }
    });
}