-
Notifications
You must be signed in to change notification settings - Fork 8
feat(telemetry): core-node observability foundation — OTLP logs + traces + metrics #1317
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
Bojan131
wants to merge
20
commits into
main
Choose a base branch
from
feat/core-node-log-collection
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from 2 commits
Commits
Show all changes
20 commits
Select commit
Hold shift + click to select a range
6cc97d0
feat(telemetry): opt-in OTLP log collection for core nodes (PoC)
Bojan131 ab56bf7
feat(telemetry): per-node Grafana selection + dashboard
Bojan131 4cdeef6
feat(telemetry): production ingest path for existing Loki (Alloy bridge)
Bojan131 242c54f
docs(telemetry): production deploy bundle + manager handoff
Bojan131 a3c5257
docs(telemetry): exact Cloudflare WAF expression + real-Grafana verif…
Bojan131 9a76a4a
feat(telemetry): fleet dashboard + operator guide + example alerts
Bojan131 d9b557b
chore: drop accidentally-committed localhost hardhat deploy artifacts
Bojan131 10da035
chore: restore tracked localhost_contracts.json (only the untracked l…
Bojan131 a30318d
feat(telemetry): OTel SDK foundation — traces+metrics providers, log …
Bojan131 80e536d
feat(telemetry): instrument spans + metrics across agent/publisher/ch…
Bojan131 a841d00
chore(telemetry): collector traces+metrics pipelines + mock OTLP coll…
Bojan131 2aa31f7
feat(telemetry): add protocol_router.send span + P2P send-duration me…
Bojan131 414936b
fix(telemetry): address bug-bot review (1 bug + 3 issues + 2 nits)
Bojan131 9f3625d
Merge remote-tracking branch 'origin/main' into feat/core-node-log-co…
Bojan131 f78d06f
fix(telemetry): two CI regressions from instrumentation
Bojan131 908e298
Merge remote-tracking branch 'origin/main' into feat/core-node-log-co…
Bojan131 0bfdadd
chore: drop accidentally-committed localhost Hardhat deploy artifacts…
Bojan131 6a8c971
feat(telemetry): address #1317 review — unify log resource attrs + fu…
Bojan131 66ef5b7
Merge remote-tracking branch 'origin/main' into feat/core-node-log-co…
Bojan131 2771d2e
fix(telemetry): tie OTel traces/metrics to the master gate + prove in…
Bojan131 File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,126 @@ | ||
| /** | ||
| * At-source redaction of secrets from log records before they leave the node. | ||
| * | ||
| * Why this exists: V10 nodes are run by independent operators, and once a | ||
| * secret (a wallet private key, a mnemonic, an API token) is shipped to a | ||
| * remote collector it is irreversibly leaked. Redaction therefore runs on the | ||
| * node, on the copy of every log record that is about to be FORWARDED. The | ||
| * local dashboard DB keeps full-fidelity records for the operator's own | ||
| * debugging — redaction only protects data that crosses the trust boundary. | ||
| * | ||
| * Design choices (deliberately conservative to avoid mangling useful logs): | ||
| * - Structured "key: value" / key=value / "key":"value" shapes are redacted | ||
| * by KEY NAME (high precision). This is how DKG actually logs secrets | ||
| * (e.g. operationalWalletPrivateKey, mnemonic). | ||
| * - JWTs are redacted by shape (eyJ….….…) — effectively zero false positives. | ||
| * - We deliberately do NOT blanket-redact 0x-prefixed 64-hex strings: in DKG | ||
| * those are overwhelmingly Merkle roots, KC roots and tx hashes (public, | ||
| * non-secret) and nuking them would destroy debuggability. A bare private | ||
| * key with no key-name context is a residual gap best closed with a | ||
| * collector-side OTTL/regex backstop (see the PoC stack). | ||
| */ | ||
|
|
||
| import type { LogRecord } from './logger.js'; | ||
|
|
||
| /** | ||
| * Default sensitive key names whose values are scrubbed from log messages | ||
| * before forwarding. Matched case-insensitively. | ||
| */ | ||
| export const DEFAULT_SENSITIVE_KEYS: readonly string[] = [ | ||
| 'privateKey', | ||
| 'private_key', | ||
| 'privKey', | ||
| 'operationalWalletPrivateKey', | ||
| 'managementWalletPrivateKey', | ||
| 'mnemonic', | ||
| 'seedPhrase', | ||
| 'seed_phrase', | ||
| 'seed', | ||
| 'secret', | ||
| 'secretKey', | ||
| 'clientSecret', | ||
| 'password', | ||
| 'passphrase', | ||
| 'passwd', | ||
| 'pwd', | ||
| 'apiKey', | ||
| 'api_key', | ||
| 'apiToken', | ||
| 'accessToken', | ||
| 'access_token', | ||
| 'refreshToken', | ||
| 'refresh_token', | ||
| 'token', | ||
| 'authorization', | ||
| 'bearer', | ||
| 'sessionKey', | ||
| 'encryptionKey', | ||
| ]; | ||
|
|
||
| export const REDACTED = '[REDACTED]'; | ||
|
|
||
| function escapeRegExp(s: string): string { | ||
| return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); | ||
| } | ||
|
|
||
| /** | ||
| * Redact the free-form `message` of a log record. The two patterns are: | ||
| * 1. JWT-shaped tokens (header.payload.signature, base64url) — by shape. | ||
| * 2. `<sensitiveKey><sep><value>` — the value is replaced, the key kept. | ||
| * Quoted values (single/double/backtick) are redacted whole (so a quoted | ||
| * mnemonic with spaces is fully removed); bare values up to the next | ||
| * delimiter otherwise. | ||
| */ | ||
| export function redactMessage(message: string, keyRegex: RegExp, jwtRegex: RegExp): string { | ||
| if (!message) return message; | ||
| // Reset lastIndex defensively (these are global regexes reused across calls). | ||
| jwtRegex.lastIndex = 0; | ||
| keyRegex.lastIndex = 0; | ||
| let out = message.replace(jwtRegex, REDACTED); | ||
| out = out.replace(keyRegex, (_full, keyAndSep: string) => `${keyAndSep}${REDACTED}`); | ||
| return out; | ||
| } | ||
|
|
||
| function buildKeyRegex(keys: readonly string[]): RegExp { | ||
| const alt = keys.map(escapeRegExp).join('|'); | ||
| // group 1 = key (optionally quoted) + separator (kept verbatim) | ||
| // group 2 = value (redacted): a quoted run, or a bare token up to a delimiter | ||
| return new RegExp( | ||
| '(' + | ||
| '["\'`]?\\b(?:' + alt + ')\\b["\'`]?' + // key, optionally quoted | ||
| '\\s*[:=]\\s*' + // : or = | ||
| ')' + | ||
| '(' + | ||
| '"[^"]*"' + '|' + | ||
| "'[^']*'" + '|' + | ||
| '`[^`]*`' + '|' + | ||
| '[^\\s,;}\\]\\)]+' + // bare token | ||
|
Bojan131 marked this conversation as resolved.
|
||
| ')', | ||
| 'gi', | ||
| ); | ||
| } | ||
|
|
||
| // JWT: three base64url segments separated by dots, starting with the | ||
| // canonical `eyJ` ('{"' base64url-encoded). Conservative min lengths. | ||
| const JWT_SOURCE = '\\beyJ[A-Za-z0-9_-]{6,}\\.[A-Za-z0-9_-]{6,}\\.[A-Za-z0-9_-]{6,}\\b'; | ||
|
|
||
| /** | ||
| * Compile a redactor once, then reuse it on the hot path (one per shipper). | ||
| * `extraKeys` are operator-configured additional sensitive key names. | ||
| */ | ||
| export function createLogRedactor(extraKeys: readonly string[] = []): (record: LogRecord) => LogRecord { | ||
| const keys = extraKeys.length ? [...DEFAULT_SENSITIVE_KEYS, ...extraKeys] : DEFAULT_SENSITIVE_KEYS; | ||
| const keyRegex = buildKeyRegex(keys); | ||
| const jwtRegex = new RegExp(JWT_SOURCE, 'g'); | ||
| return (record: LogRecord): LogRecord => { | ||
| if (!record || !record.message) return record; | ||
| const redacted = redactMessage(record.message, keyRegex, jwtRegex); | ||
| if (redacted === record.message) return record; // no change → no alloc | ||
| return { ...record, message: redacted }; | ||
| }; | ||
| } | ||
|
|
||
| /** One-shot convenience (recompiles each call — do not use on the hot path). */ | ||
| export function redactLogEntry(record: LogRecord, extraKeys: readonly string[] = []): LogRecord { | ||
| return createLogRedactor(extraKeys)(record); | ||
| } | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🟡 Issue: Remote log redaction wiring is not verified
What's wrong
The PR tests the redaction helper and the OTLP worker separately, but not the daemon wiring that decides which copy leaves the node. This is the privacy-critical behavior: local logs should remain full-fidelity, while syslog/OTLP receive only the redacted record.
Example
A future edit that accidentally changes the fan-out to
otlpExporter?.push(entry)would leakoperationalWalletPrivateKey=...remotely. The redactor tests andOtlpLogWorkertests would still pass because neither test provesLogger.setSinksends the redacted copy to the remote shippers while keeping the local DB record unchanged.Suggested direction
Add a lifecycle-level or extracted-helper test that exercises the actual
Logger.setSinkfan-out path with sensitive log content and configured telemetry modes.For Agents
Add a focused daemon-sink wiring test, or extract the sink fan-out into a small helper that can be unit-tested. Mock/stub
dashDb.insertLog,LogPushWorker.push, andOtlpLogWorker.push; emit a log containing a private key; assert the DB gets the original message, both remote shippers get[REDACTED], andlogs.exporter: 'none'leaves remote pushers inactive.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🟡 Issue: Remote redaction wiring is not verified at the daemon sink
What's wrong
The existing tests prove the redactor can scrub a standalone record and that
OtlpLogWorkercan send records, but they do not exercise the trust-boundary code that actually applies redaction before forwarding daemon logs. Because this path handles secrets leaving the node, a regression that accidentally pushesentryinstead ofsafe, or fails to apply configured extra redact keys, would not be caught.Example
A daemon-level test should log
operationalWalletPrivateKey=0x...with remote telemetry enabled and assert thatdashDb.insertLog(...)receives the original message while bothlogPusher.push(...)andotlpExporter.push(...)receive[REDACTED], including an extra key fromtelemetry.logs.redact.Suggested direction
Add a focused test around the daemon log sink fan-out, not just the standalone redactor and standalone OTLP worker.
For Agents
Look at the
Logger.setSinkblock inpackages/cli/src/daemon/lifecycle.ts. Extract the fan-out/redaction logic or add a focused daemon unit with fake DB, syslog worker, and OTLP worker. Preserve full-fidelity local logs and prove every remote shipper gets only the redacted copy.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🟡 Issue: Remote redaction is not verified at the daemon fan-out boundary
What's wrong
The PR adds at-source redaction before remote log shipping, but the tests only verify the pure redactor and the OTLP worker separately. They do not prove that daemon log forwarding applies the redactor before either remote shipper receives a record.
Example
A regression that changed line 2012 to
const safe = entry, or pushedentrydirectly tootlpExporter, would still leavepackages/core/test/log-redaction.test.tsgreen because those tests only call the redactor directly; they never exercise the daemon sink that forwards records off-node.Suggested direction
Cover the actual
Logger.setSinkfan-out behavior, not onlycreateLogRedactorin isolation, because this is the trust boundary where secrets either stay local or leave the node.For Agents
Add a focused test around the daemon log fan-out path, or extract a small helper from
lifecycle.ts, using fake dashboard/remote shippers. Prove local storage receives the original message while remote shippers receive a redacted message forprivateKey=...and configured extra keys.