[Metrics SDK] Implement configurable cardinality limit#4188
Conversation
|
Hi @dbarker Summary of Changes:
This is my first contribution to OpenTelemetry C++. I'd appreciate guidance on:
Looking forward to your feedback! cc: @lalitb |
marcalff
left a comment
There was a problem hiding this comment.
Thanks for the contribution.
There is something wrong the way clang-format was applied, causing spurious changes in many files.
This:
- is most likely incorrect, and will fail CI on clang-format
- makes it difficult to review the real changes from this patch.
When formatting code, a precise version of clang-format must be used. Applying a different version will cause this. The best is to use the dev container to perform formatting, as it will use the exact same version as the github CI.
In any cases, adding a file in a commit that was not changed voluntarily (i.e., with a real fix) is a red flag, and should not happen.
Please rework the patch to remove the noise, so it can be reviewed.
a385928 to
7758926
Compare
|
@marcalff Thank you for the feedback! I've reworked the PR to remove all spurious formatting changes. Current state metric_reader.h
Regarding CI: the only failing check is W3C Distributed Tracing Validation V1 (test_tracestate_key_illegal_vendor_format and test_tracestate_key_length_limit). This failure is unrelated to this PR — the changes here are entirely in the Metrics SDK and do not touch any trace context code. The W3C V1 test suite checks out the latest [w3c/trace-context]HEAD at runtime, and these two tests appear to be failing intermittently across PRs. The PR is now clean and ready for review. Please let me know if you need any clarifications on the implementation approach! |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4188 +/- ##
==========================================
- Coverage 82.89% 82.69% -0.19%
==========================================
Files 405 405
Lines 17292 17335 +43
==========================================
+ Hits 14332 14334 +2
- Misses 2960 3001 +41
🚀 New features to boost your workflow:
|
65c13b7 to
1f771a3
Compare
1f771a3 to
8d2967e
Compare
| auto ctx_ptr = meter_context_.lock(); | ||
| if (ctx_ptr) | ||
| { | ||
| size_t reader_limit = ctx_ptr->GetReaderCardinalityLimit(instrument_descriptor.type_); |
There was a problem hiding this comment.
I think this breaks per-reader semantics. Example: reader A has limit 10, reader B has limit 1000. The storage uses 1000, so reader A can export far more series than configured. This is also a heap/memory regression for low-limit readers.
There was a problem hiding this comment.
One possible fix is to avoid resolving reader-level limits into a single storage-level config. The view-level limit can stay on the shared stream/storage, but the MetricReader fallback should be applied per collector/reader, since each reader may have a different configured limit. So instead of taking the max across readers, the collection path likely needs to use the current CollectorHandle's GetCardinalityLimit(instrument_type) when no view-level limit exists.
There was a problem hiding this comment.
Good catch @lalitb, thanks! You're right, resolving the reader-level fallback into shared storage breaks the per-reader semantics you described.
I've removed that logic here and reverted the storage constructors. For now, this PR only enforces view-level aggregation_cardinality_limit (plus the existing SDK default of 2000). Reader-level limits are still parsed, but not enforced.
I agree the right place to apply the MetricReader fallback is in the collection path on a per-CollectorHandle basis when no view-level limit exists. Since that's a more involved change, I'd rather tackle it in a follow-up PR.
Also fixed the IWYU warning in the latest commit.
…er semantics As noted in review, the reader-level fallback breaks per-reader semantics because all readers share the same storage with a single cardinality limit. This commit simplifies the implementation to support only: 1. View-level: aggregation_cardinality_limit in YAML view configuration 2. SDK default: 2000 (existing behavior via AggregationConfig::GetOrDefault) Reader-level limits are kept in the API for future implementation but are not applied during storage creation. This avoids the heap/memory regression for low-limit readers and maintains correct per-reader semantics. The view-level implementation still provides significant value as it's the primary configuration mechanism for users who need custom cardinality limits.
Include What You Use (IWYU) reported that sdk_builder.cc was missing an explicit include for cardinality_limits_configuration.h even though it uses CardinalityLimitOptions types defined in that header. Added the missing include to fix the IWYU warning.
Fixes #3292
Changes
This PR implements configurable aggregation cardinality limits at three priority levels as specified by the OpenTelemetry Metrics SDK specification:
Implementation Details
MetricReader Changes:
CardinalityLimitOptionsstruct with per-instrument-type limitsGetCardinalityLimit(InstrumentType)methodSetCardinalityLimitOptions()methodMetricCollector Changes:
GetCardinalityLimit()toCollectorHandleinterfaceMetricReaderMeterContext Changes:
GetReaderCardinalityLimit()methodStorage Classes Changes:
SyncMetricStorageandAsyncMetricStorageconstructorsowned_aggregation_configparameter for lifetime managementMeter Changes:
RegisterSyncMetricStorage()andRegisterAsyncMetricStorage()AggregationConfigwith reader limit when neededConfiguration Changes:
aggregation_cardinality_limitfrom YAML configcardinality_limitsfrom YAML configPriority Order
The implementation follows the specification's priority order:
AggregationConfig::GetOrDefault())Testing
The existing YAML parsing tests already verify that configuration is parsed correctly. Additional integration tests are recommended to verify:
For significant contributions please make sure you have completed the following items:
CHANGELOG.mdupdated for non-trivial changesRelated Issues
Fixes #3292
Note: This is a spec-compliance feature that enables users to configure cardinality limits at multiple levels, preventing memory exhaustion in high-cardinality scenarios while maintaining flexibility.