fix(mesh): poll TX_DONE and stuck-TX timeout from main loop #10436
Open
DatanoiseTV wants to merge 2 commits into
The 60s stuck-TX detector in canSendImmediately only fires when more packets enter the TX queue, because that's the only path that calls it. If the queue empties after the radio wedges, no IRQ ever fires, no poll runs, and the device sits in busyTx forever. pollMissedIrqs is already invoked unconditionally from the main loop — extend it to poll TX_DONE when sendingPacket is set, and trip the same reboot path independently of queue activity.
Pull request overview
This PR addresses a reliability gap in the RadioLib-based radio driver where the existing “stuck TX” watchdog only ran when new packets entered the TX queue. If the radio wedges after the queue drains and no TX IRQ fires, the node could remain in busyTx indefinitely. The change extends the already-main-loop-driven pollMissedIrqs() fallback to also poll for TX_DONE and to trigger the same stuck-TX reboot path even when the TX queue is empty.
Changes:
- Add a TX_DONE polling fallback (checkTxDoneIrqFlag) alongside the existing RX_DONE missed-IRQ polling.
- Extend pollMissedIrqs() to (a) poll TX_DONE while sendingPacket is set and (b) enforce the existing 60s stuck-TX reboot guard independent of TX queue activity.
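The control flow described in the two changes above can be sketched as follows. This is a simplified, hardware-free model and not the code from the PR: the `RadioSketch` struct, its fields `txStartMs`, `txDoneFlag`, and `rebootRequested`, and the helpers `startSend()` and `completeSend()` are hypothetical names introduced for illustration. Only `pollMissedIrqs`, `checkTxDoneIrqFlag`, `sendingPacket`, and the 60s timeout come from the PR text, and their real signatures may differ.

```cpp
#include <cstdint>

// Hypothetical stand-in for the radio driver state. Names mirror the PR text
// where possible, but this is NOT the firmware's actual class.
struct RadioSketch {
    // 60s stuck-TX guard, as stated in the PR description.
    static constexpr uint32_t kStuckTxTimeoutMs = 60u * 1000u;

    bool sendingPacket = false;   // a transmission is currently in flight
    uint32_t txStartMs = 0;       // (assumed) timestamp when that transmission began
    bool txDoneFlag = false;      // (assumed) simulated TX_DONE bit in the IRQ status
    bool rebootRequested = false; // (assumed) stands in for the existing reboot path

    // Begin a transmission at time nowMs.
    void startSend(uint32_t nowMs) {
        sendingPacket = true;
        txStartMs = nowMs;
        txDoneFlag = false;
    }

    // Assumed shape of the new helper: read the TX_DONE status directly
    // instead of waiting for the interrupt to be delivered.
    bool checkTxDoneIrqFlag() const { return txDoneFlag; }

    // Finish the in-flight transmission and clear the simulated flag.
    void completeSend() {
        sendingPacket = false;
        txDoneFlag = false;
    }

    // Called unconditionally from the main loop, so it runs even when the
    // TX queue is empty.
    void pollMissedIrqs(uint32_t nowMs) {
        if (!sendingPacket) {
            return; // nothing in flight, nothing to recover
        }
        if (checkTxDoneIrqFlag()) {
            completeSend(); // the IRQ was missed, but the radio did finish
            return;
        }
        // Unsigned subtraction stays correct across millisecond-counter wraparound.
        if (nowMs - txStartMs > kStuckTxTimeoutMs) {
            rebootRequested = true; // same recovery path as the queue-driven check
        }
    }
};
```

Because the timeout check lives in the polled function rather than in the enqueue path, a wedged transmission is detected even if no further packet is ever queued.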
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/mesh/RadioLibInterface.h | Declares checkTxDoneIrqFlag() for missed TX_DONE polling support. |
| src/mesh/RadioLibInterface.cpp | Implements TX_DONE polling in pollMissedIrqs() and adds a queue-independent stuck-TX timeout/reboot trigger. |
Split out from #10424 per @thebentern's request — single-concern PR.
Build verification
pio run -e t-deck-tft succeeds, no new warnings.
Attestations
t-deck-tft only.