fix(power): sleep/battery reliability hardening#10425
Closed
DatanoiseTV wants to merge 6 commits into
Closed
Conversation
doDeepSleep called gpio_hold_en for BUTTON_PIN and LORA_CS, but the ESP-IDF API contract is that gpio_hold_en alone only covers light sleep — deep sleep additionally requires gpio_deep_sleep_hold_en once before esp_deep_sleep_start to arm the holds across the deep- sleep boundary. Without it the pins float in DSLP, which on some boards leaks current through LORA_CS and on others triggers spurious wake from a noisy button line.
GPS only subscribed to notifyDeepSleep, so during light-sleep the receiver kept drawing ~25 mA on a non-router device — the dominant drain on devices that spend most of their time in LIGHT_SLEEP. Subscribe to notifyLightSleep/notifyLightSleepEnd as well, but use GPS_SOFTSLEEP instead of disable()+enable() since the state machine survives across the short light-sleep window and full power-cycle re-acquisition would cost more than it saves.
…eresis The previous low-voltage handler counted 11 consecutive readings under OCV[min] and triggered EVENT_LOW_BATTERY, but on SDS wake the counter reset to 0. With the cell still flat, the count climbed back to 11 and deep-slept again — wake-thrash burning boot energy each cycle. There was also no recovery hysteresis: a brief spike across the threshold would clear the count even with the battery still effectively flat. Persist the latch in RTC memory so SDS doesn't lose it. Once latched, clearing requires K consecutive readings above OCV[min] + 100 mV. If the device wakes and is still latched-flat, immediately re-enter SDS without running a normal boot.
waitEnterSleep panicked with assert(0) when an observer vetoed sleep for >30s (FIXME meshtastic#167). On a field device that turns into a noisy crash log every 30s rather than a clean reboot. Use the same per-arch restart primitives Power::reboot uses, fall back to assert only for archs we don't have a primitive for.
getBattVoltage on the BQ27220+BQ25896 combo returned the fuel-gauge reading while isBatteryConnect derived presence from the BQ25896 charger ADC. The two could disagree on a freshly-attached cell before the fuel gauge had learned: hasBattery would say false but voltage 4.0V, and the upstream PowerFSM would treat the unit as externally powered. Read voltage from the same source as presence so both views move together.
Contributor
There was a problem hiding this comment.
Pull request overview
This PR hardens sleep and battery-handling behavior to reduce power drain and avoid unstable boot/sleep loops, primarily by improving sleep-state transitions and making low-battery behavior more deterministic across resets.
Changes:
- Replace the “stuck sleep preflight” panic with a controlled platform reset, and enable ESP32 deep-sleep GPIO hold globally so per-pin holds actually apply in deep sleep.
- Add ESP32 light-sleep hooks to soft-sleep the GPS during CPU light sleep and restore it on wake.
- Add a persistent low-battery latch with hysteresis to prevent repeated wake→detect-low→sleep thrash; align BQ25896 voltage reads with presence detection.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
src/sleep.cpp |
Controlled restart when sleep preflight is vetoed too long; arm ESP32 deep-sleep hold before entering deep sleep. |
src/Power.cpp |
Add RTC-persisted low-battery latch + recovery hysteresis; use BQ25896 ADC for battery voltage in the BQ25896/BQ27220 path. |
src/gps/GPS.h |
Add ESP32-only light-sleep observer state and callbacks. |
src/gps/GPS.cpp |
Register/unregister light-sleep observers; soft-sleep GPS during CPU light sleep and restore on wake. |
- Scale OCV by NUM_CELLS in the low-battery threshold check. getBattVoltage() returns pack voltage but OCV[] is per-cell, so the comparison was wrong on multi-cell variants (chatter2, station-g1). Scale both the latch and the recovery-hysteresis threshold so behaviour is correct across all NUM_CELLS configurations. - Correct the getBattVoltage docstring on the BQ27220+BQ25896 path: the function returns uint16_t with 0 as the unknown sentinel, not NAN as the comment claimed. - Notify rebootObservers from waitEnterSleep before the platform reset primitive, matching Power::reboot's contract so modules like InkHUD that register a notifyReboot observer get a last-second cleanup pass.
Contributor
Author
|
Thanks for the review — all three points addressed in
Build still clean on |
This was referenced May 9, 2026
Contributor
Author
|
Splitting this proactively per the same atomic-PR feedback @thebentern gave on #10424. Re-opened as 5 separate PRs, one fix per PR (all incorporate the Copilot review feedback from this thread):
Each is independent and revertable. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
A bundle of small power-management fixes uncovered while auditing the sleep state machine and battery-presence logic. All cross-cutting (no board-specific pin changes); each commit independently revertable.
Fixes
sleep: arm deep-sleep hold so per-pin gpio_hold_en actually holds—doDeepSleepcallsgpio_hold_enonBUTTON_PINandLORA_CS, but the ESP-IDF API contract is thatgpio_hold_enalone only retains pin state during light sleep. Deep sleep additionally needs a one-timegpio_deep_sleep_hold_en()beforeesp_deep_sleep_startto arm the holds across the boundary. Without it the pins float in DSLP, which on some boards leaks current throughLORA_CSand on others triggers spurious wake from a noisy button line.gps: soft-sleep the receiver during CPU light-sleep—GPSonly subscribed tonotifyDeepSleep. During light sleep (the default non-router path) the receiver kept drawing ~25 mA the entire time the MCU was asleep — the dominant drain on devices that spend most of their time in LIGHT_SLEEP. Subscribe tonotifyLightSleep/notifyLightSleepEndand useGPS_SOFTSLEEPinstead of fulldisable()so the state machine survives across the short window without paying re-init / re-acquire on every cycle.power: latch low-battery decision in RTC memory and add recovery hysteresis— the previous handler counted 11 consecutive sub-OCV[min] readings and triggeredEVENT_LOW_BATTERY, but on SDS wakelow_voltage_counterreset to 0. With the cell still flat, the count climbed back to 11 and re-deep-slept — wake-thrash burning boot energy each cycle. There was also no recovery hysteresis: a brief spike across threshold cleared the count even with the cell effectively flat. Persist the latch inRTC_DATA_ATTR(plainstaticon non-ESP32 — same effective semantics since SDS doesn't reset there). Once latched, clearing requires K consecutive readings above OCV[min] + 100 mV. If the device wakes still latched-flat, it immediately re-enters SDS.sleep: replace assert(0) on stuck preflight with controlled restart—waitEnterSleeppanicked withassert(0)when an observer vetoed sleep for >30 s (FIXME wait to sleep bug - "wait to sleep" is starving the router transmit timer handler #167). On a field device that turns into a noisy crash log every 30 s rather than a clean reboot. Use the same per-arch restart primitivesPower::rebootalready uses (ESP.restart/NVIC_SystemReset/rp2040.reboot/HAL_NVIC_SystemReset).power: use BQ25896 ADC for both battery voltage and presence— on the BQ27220+BQ25896 combo,getBattVoltagereturned the fuel-gauge reading whileisBatteryConnectderived presence from the BQ25896 charger ADC. The two could disagree on a freshly-attached cell before the gauge had learned: hasBattery would say false but voltage 4.0 V, and the upstream PowerFSM would treat the unit as externally powered. Read voltage from the same source as presence so both views always agree.Build verification
pio run -e t-deck-tft— SUCCESS, 38.4% RAM / 55.3% flash. No new warnings.Deferred (scope kept tight)
kbI2cBase, didn't want to land a half-fix here.setCPUFastcarve-outs on T-Deck — risks TFT/WiFi instability without bench testing.Attestations
t-deck-tftonlyNotes for reviewers