Skip to content

fix: missing break statements in CounterCollection::logToTraceEvent#13166

Open
fmartinez09 wants to merge 1 commit intoapple:mainfrom
fmartinez09:fix/counter-collection-switch-fallthrough
Open

fix: missing break statements in CounterCollection::logToTraceEvent#13166
fmartinez09 wants to merge 1 commit intoapple:mainfrom
fmartinez09:fix/counter-collection-switch-fallthrough

Conversation

@fmartinez09
Copy link
Copy Markdown

Problem

CounterCollection::logToTraceEvent has a switch on c->model with no
break statements between the OTLP and STATSD cases. When a counter
is configured with model=OTLP, the OTLP branch correctly populates
MetricCollection::sumMap, but control then falls through to the STATSD
branch and silently pushes to MetricCollection::statsd_message. A
deployment configured for OTLP metrics is also emitting StatsD-format
data on every counter log cycle, regardless of the operator's intent.

I encountered this while studying the metrics path referenced in #12679.

Solution

Add explicit break statements after the OTLP and STATSD cases.
The NONE/default case is last and does not require a break.

Testing

Added unit test /fdbrpc/Stats/CounterCollectionFallthrough in
Stats.cpp that forces METRICS_DATA_MODEL=otel, logs a counter,
and asserts that sumMap is populated and statsd_message is empty.

Verified:

  • Without the fix: test fails on ASSERT_EQ(statsd_message.size(), 0) (Test Cases Failed = 1)
  • With the fix: test passes (Test Cases Failed = 0)

Run with:
./bin/fdbserver -r unittests -f /fdbrpc/Stats/CounterCollectionFallthrough

Copy link
Copy Markdown
Collaborator

@gxglass gxglass left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for having a look. AI review (see below) finds another similar bug which would be worth fixing while here.

Regarding the new test case, I feel it is too limited and specific to justify its cost of 25 lines of code. A more general and lighter weight solution would be to enable the compiler flag that the AI suggests. I assume there are ways to override that as needed if fall-through is actually desired.

Also there might be broader-coverage ways of improving test coverage here that you could look into. I haven't thought about it other than noting that the current number of TEST_CASE in this file is zero, which seems like it has a high probability of not being the right number. What those missing test cases (good ones) might be I don't know.

AI review:

Image

@fmartinez09 fmartinez09 force-pushed the fix/counter-collection-switch-fallthrough branch from 6db51c5 to 48b3bb9 Compare May 7, 2026 22:51
@fmartinez09
Copy link
Copy Markdown
Author

Thanks for the review!

Applied:

  • Added the missing breaks in LatencySample::logSample (same pattern as logToTraceEvent)
  • Removed the TEST_CASE per your suggestion
  • Added trailing newline

On enabling -Wimplicit-fallthrough: I tried it and ran into more friction than expected. The flag triggers on third-party headers transitively included from FDB code:

  • contrib/libb64/{cencode,cdecode}.c — base64 implementation, intentional fallthroughs
  • flow/Hash3.c — hash implementation, intentional fallthroughs
  • flow/include/flow/xxhash.h — xxHash header, ~20 unannotated fallthroughs

I also found 4 intentional fallthroughs in flow/include/flow/CoroutinesImpl.h (the ACTOR_WAIT_STATE_CANCELLEDthrow actor_cancelled()) which would need [[fallthrough]] annotations.

Scoping the flag to fdbrpc via target_compile_options doesn't help because xxhash.h is included transitively. Cleanly enabling the flag would require annotating the third-party headers or excluding them via per-file flags. Seemed out of scope for this fix, I think. Happy to tackle it in a follow-up if desired.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants