Add OpenTelemetry tracing across the backbeat pipeline#2733
Conversation
Hello delthas,My role is to assist you with the merge of this Available options
Available commands
Status report is not available. |
Request integration branchesWaiting for integration branch creation to be requested by the user. To request integration branches, please comment on this pull request with the following command: Alternatively, the |
Codecov Report❌ Patch coverage is Additional details and impacted files
... and 5 files with indirect coverage changes
@@ Coverage Diff @@
## development/9.4 #2733 +/- ##
===================================================
+ Coverage 74.73% 74.97% +0.23%
===================================================
Files 199 202 +3
Lines 13650 13824 +174
===================================================
+ Hits 10201 10364 +163
- Misses 3439 3450 +11
Partials 10 10
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
9d08f7b to
2f7afb0
Compare
2f7afb0 to
d562a0a
Compare
d562a0a to
51a9f61
Compare
51a9f61 to
b9d3528
Compare
970a811 to
849d6b0
Compare
ea6bda3 to
078df44
Compare
078df44 to
6357120
Compare
|
Well-structured PR — clean OTEL facade, correct trust-boundary handling, proper span lifecycle management with sync-throw guards, and good test coverage. No correctness bugs found. A few observations: |
| logger.info('received SIGTERM, exiting'); | ||
| garbageCollector.close(() => { | ||
| process.exit(0); | ||
| require('../../lib/tracing').close().finally(() => process.exit(0)); |
| @@ -0,0 +1,114 @@ | |||
| 'use strict'; | |||
There was a problem hiding this comment.
I have the feeling that some code are duplicated across project and should be define in a lib
| resource: resourceFromAttributes({ | ||
| 'service.name': process.env.OTEL_SERVICE_NAME || 'backbeat', | ||
| 'service.version': process.env.OTEL_SERVICE_VERSION || version, | ||
| 'service.namespace': process.env.OTEL_SERVICE_NAMESPACE || 'scality', |
There was a problem hiding this comment.
should it be scality or s3 as our metrics ?
| links.push({ context: remoteSpanCtx }); | ||
| } | ||
|
|
||
| // ROOT_CONTEXT forces a new trace even if a span is somehow active here. |
| // When OTEL is off, skip the span machinery entirely (no OTEL API on | ||
| // the per-message hot path) and keep the original processing shape. |
There was a problem hiding this comment.
| // When OTEL is off, skip the span machinery entirely (no OTEL API on | |
| // the per-message hot path) and keep the original processing shape. |
The condition is pretty explicit ?
| message: action.toKafkaMessage(), | ||
| }; | ||
| const traceHeaders = traceHeadersFromCurrentContext(); | ||
| if (traceHeaders) kafkaEntry.headers = traceHeaders; |
There was a problem hiding this comment.
Quick up. I think we don't allow that in our codebase ?
| const safeMessages = Array.isArray(messages) | ||
| ? messages.map(m => { | ||
| if (m && m.headers) { | ||
| // eslint-disable-next-line no-unused-vars |
| this._notificationProducer.send(messages, error => { | ||
| // Trust boundary: drop trace headers before producing to the external | ||
| // customer Kafka (no protocol-level ingress strip like nginx for HTTP). | ||
| const safeMessages = Array.isArray(messages) |
| details: {}, | ||
| }), | ||
| }; | ||
| if (headers) {kafkaEntry.headers = headers;} |
There was a problem hiding this comment.
| if (headers) {kafkaEntry.headers = headers;} | |
| if (headers) { | |
| kafkaEntry.headers = headers; | |
| } |
🙏
| }); | ||
| } catch (err) { | ||
| // sync throw: end span (don't leak), then rethrow | ||
| span.recordException(err); |
There was a problem hiding this comment.
We have the same logic somewhere else, we should factorise
Summary
Add OpenTelemetry tracing to backbeat so async work (replication, lifecycle,
GC, notifications) can be traced back to the original S3 request in Jaeger.
Arsenal already stamps
value.traceContext.{traceparent,tracestate}onto everyMongoDB metadata write; the oplog carries it; this PR extracts it and
propagates it across backbeat's Kafka pipeline.
Gated by
ENABLE_OTEL=true— when unset,init()returns before loading any@opentelemetry/*package (zero overhead off the OTEL path).Design and structure mirror the cloudserver OTEL PR
(scality/cloudserver#6140,
CLDSRV-884), which went through full human review.
Commits
Reviewable in order — each loads on its own and builds on the previous:
chore: add OpenTelemetry dependencies— explicit@opentelemetry/*packages, noauto-instrumentations-nodebundle.feat: add OTEL trust-boundary host filter—lib/tracing/trustedHosts.js(operator-suppliedOTEL_TRUSTED_HOSTS).feat: add OTEL SDK bootstrap and tracing facade—lib/tracing/index.js(init/close/isEnabled), sampler, span limits, traces-only, outbound-only HTTP, mongodb/ioredis/aws-sdk instr, bounded shutdown flush.feat: propagate trace context across the Kafka pipeline—kafkaTraceContext.js, producer 7th-arg headers,publish()headers param, consumer span (links, never parent-child).feat: instrument replication and oplog-populator pods.feat: instrument lifecycle, GC, and notification pods.What it does
lib/tracing/facade — public surface isinit()/close()/isEnabled(); OTEL internals are hidden.init()fails fast (assert) ifENABLE_OTEL=trueandOTEL_EXPORTER_OTLP_TRACES_ENDPOINTis unset, and onan out-of-range
OTEL_SAMPLING_RATIO. No baked-in endpoint default.Traces-only NodeSDK (
logRecordProcessors:[]/metricReaders:[]),explicit span limits,
ParentBasedSampler({ root: TraceIdRatioBased(ratio) })(default 1%) so a sampled upstream trace is always honored.
Explicit instrumentations —
instrumentation-http+instrumentation-aws-sdk+instrumentation-ioredis(withrequireParentSpan) +instrumentation-mongodb(withenhancedDatabaseReporting:falsefor PII masking). Noauto-instrumentations-nodebundle (avoids ~36 unused instrumentations,version skew, and a transitive
@types/pgconflict). MongoDB isinstrumented because several pods use the
mongodbdriver directly —oplog-populator (
MongoLogReadertailing the oplog), lifecycle conductor(scan-path collection queries), and notification (
MongoConfigManager).HTTP: outbound only —
instrumentation-httpruns withdisableIncomingRequestInstrumentation: true. No instrumented pod servesapplication HTTP — they are Kafka producers (oplog-populator, conductor) and
consumers (replication, lifecycle, GC, notification processors); the only
inbound HTTP is k8s probes / Prometheus scrapes, never a useful trace entry.
So server spans are never created, which also removes the need to maintain a
health-path ignore list. (The backbeat API server in
bin/backbeat.jshasreal routes and is not instrumented here; instrumenting it later must
re-enable server spans.)
Trust boundary —
lib/tracing/trustedHosts.js. Trusted hosts come froman operator-supplied
OTEL_TRUSTED_HOSTSenv var (comma-joined lowercasebare hostnames); loopback is always trusted. Outbound calls to untrusted
hosts (replication destinations, AWS/Azure/GCP, remote Artesca) have
traceparent/tracestatestripped and the client span taggedscality.trace.suppressed. Unset env → loopback-only (safe default).Emitting the var for backbeat pods is an operator follow-up (cf. cloudserver's
ZKOP-551).
Kafka propagation = span Links — every consumer read starts a new
trace (via
ROOT_CONTEXT) and adds an OTEL Link to the upstream span (neverparent-child). Async work fires minutes/hours after the S3 request; links
keep each trace small and navigable via Jaeger's link UI instead of
producing million-span, multi-hour waterfalls. Producer/consumer header
plumbing rides node-rdkafka's
MessageHeader[](array of single-keyobjects). Consumer spans set the OTEL messaging semantic-convention
attributes (
messaging.system,messaging.destination.name,messaging.destination.partition.id,messaging.consumer.group.name) andare marked ERROR on task failure.
Graceful flush — each pod's SIGTERM path calls
tracing.close(), whichis race-safe and bounded at 5s (
Promise.race+.unref()'d timer) so anunreachable collector can't block past Kubernetes' 30s grace period.
Pods instrumented
oplog-populator, replication (data + status), lifecycle (conductor, bucket,
object/transition), GC, notification. Entry points call
require('../lib/tracing').init()before any HTTP/aws-sdk/ioredis/mongodbmodule loads.
Compatibility note
Rebased onto
development/9.4(node-rdkafka^2.12.0 → ^3.6.0). Verified theproduce()7th-argheaderssignature andMessageHeader[]format areunchanged in v3.
Tests
Unit tests under
tests/unit/lib/tracing/(indexboot/close/isEnabled +fail-fast asserts,
trustedHostsenv parsing + request-hook strip/IPv6,kafkaTraceContext) plus notification populator trace-header propagation.Status
Reworked to mirror the reviewed design of cloudserver #6140; earlier automated
review comments triaged and resolved; split into 6 reviewable commits. Cluster
end-to-end verification with Jaeger + OTEL collector is the remaining open item.
Behavioral note:
replicationStatusProcessor's SIGTERM handler nowforce-exits with
process.exit(0)after the trace flush. The original had nosuccess-path exit (it relied on the event loop draining naturally); this aligns
it with the other seven pods and makes shutdown deterministic. Intentional —
flagged explicitly so reviewers don't mistake it for an accidental change.
Issue: BB-764