BLE OTA: keepalive + longer supervision timeout to prevent mid-flash disconnect#113
Open
PaulDWhite wants to merge 6 commits into
…disconnect

During OTA the firmware suppresses the ~50 Hz FastLink telemetry stream to give the flash full bandwidth. With no liveness signal a BLE central can tear down the link mid-flash (HCI 0x13 / disconnect reason 531), aborting the update partway through (~20-30%).

- fastlink_service.cpp: emit a ~1 Hz FastLink keepalive notify while OTA is in progress so the central keeps the link up. The keepalive ships the packet already setValue()'d, whose advancing packet_id/uptime_ms the app counts as telemetry progress (no app changes needed). At 1 Hz vs the 15 ms OTA interval it does not meaningfully slow the flash.
- ble_core.cpp (requestFastConnParams): lengthen the OTA-time supervision timeout from 2 s to 8 s so a multi-second flash-erase stall or a sluggish phone cannot drop the link at the link-layer level. OTA_TIMEOUT_MS (30 s) remains the dead-link backstop.

Reliability over speed: trades negligible flash throughput for a link that stays up across the whole flash.
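The keepalive gating described in this commit could look roughly like the sketch below. It is not the code from fastlink_service.cpp: `OtaKeepalive`, `due()` and `kIntervalMs` are hypothetical names, and the caller is assumed to pass in a millisecond uptime counter and to issue the actual notify itself when `due()` returns true.

```cpp
#include <cstdint>

// Hypothetical helper (not from the PR): decides when a keepalive notify
// is due while an OTA transfer is in progress.
struct OtaKeepalive {
    static constexpr uint32_t kIntervalMs = 1000;  // ~1 Hz, per the commit message

    uint32_t lastSentMs = 0;
    bool sentOnce = false;

    // Returns true when the caller should send one keepalive notify now.
    bool due(bool otaInProgress, uint32_t nowMs) {
        if (!otaInProgress) {
            sentOnce = false;  // re-arm so the next OTA sends a keepalive immediately
            return false;
        }
        // Unsigned subtraction stays correct when the millisecond counter wraps.
        if (sentOnce && (nowMs - lastSentMs) < kIntervalMs) {
            return false;
        }
        lastSentMs = nowMs;
        sentOnce = true;
        return true;
    }
};
```

Keeping the decision separate from the notify call makes the rate limit easy to unit-test without a BLE stack.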
Looks like this won't be required, since with the app changes alone the longer timeout and keepalive should not be needed. Will need a little more testing.
Replace hard clamp and magic numbers in the climb-rate vario display with named constants. Introduce kVarioSegment (0.5 m/s per segment) and kVarioDeadzone (0.25 m/s) and compute sectionsToFill from kVarioSegment, capping at 6. Remove the previous ±0.6 m/s clamp and use the deadzone as the neutral threshold so values beyond ±3 m/s pin the gauge to full deflection rather than being clamped.
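A minimal sketch of the segment calculation this commit describes. `kVarioSegment`, `kVarioDeadzone` and `sectionsToFill` come from the commit message. `kVarioMaxSections`, `VarioGauge` and `varioGauge()` are hypothetical names, and truncating toward zero is an assumption, since the message does not say how partial segments are rounded.

```cpp
#include <algorithm>
#include <cmath>

constexpr float kVarioSegment = 0.5f;    // m/s per gauge segment (from the commit)
constexpr float kVarioDeadzone = 0.25f;  // m/s neutral threshold (from the commit)
constexpr float kVarioMaxSections = 6.0f;  // hypothetical name for the cap of 6

struct VarioGauge {
    int direction;       // +1 climbing, -1 sinking, 0 neutral
    int sectionsToFill;  // 0..6
};

inline VarioGauge varioGauge(float climbRate) {
    const float mag = std::fabs(climbRate);
    // Written as a negated >= so that NaN input also falls into the neutral case.
    if (!(mag >= kVarioDeadzone)) {
        return {0, 0};
    }
    // Cap in floating point first so huge inputs never overflow the int cast.
    // Assumption: partial segments are truncated, not rounded.
    const float capped = std::min(mag / kVarioSegment, kVarioMaxSections);
    return {climbRate > 0.0f ? 1 : -1, static_cast<int>(capped)};
}
```

Under these assumptions, any climb or sink rate at or beyond 3 m/s fills all six segments instead of being clamped beforehand.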
Reduce display redraws and SPI contention by introducing change-detecting LVGL setters and safer SPI handling.

Key changes:
- Add resetLvglUpdateCache() and change-detecting helpers (setLabelText, setBgColor, etc.) to avoid redundant LVGL invalidations and per-frame full redraws.
- Rework many main-screen update paths (battery, power, altitude, climb rate, temps, icons) to compute desired state then diff-apply only changed widgets/styles.
- Add flushSkipped flag: if a display flush is skipped due to SPI busy, mark it and force a full invalidate on next refresh to recover stale pixels.
- Let LVGL read time directly (lv_tick_set_cb) and remove ad-hoc lv_tick_handler/lvgl_last_update; call lv_timer_handler() from updateLvgl().
- Move BMS SPI CS toggling to occur only after acquiring the shared SPI mutex and release the mutex immediately after the CAN library's update() (getters are read-only), preventing mid-transfer deselects and reducing wait times for display flushes.
- Increase UI task frequency to ~30 Hz (33 ms) to match LVGL refresh period.
- Update headers and tests to reflect removed variables/functions and new reset API.

These changes reduce unnecessary rendering, avoid lost mid-transfer display updates, and improve responsiveness under SPI contention.
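The change-detecting setter idea can be illustrated without LVGL. In the sketch below, `FakeLabel` and `LabelUpdateCache` are hypothetical stand-ins: the real helpers presumably operate on LVGL objects, and only the names `setLabelText` and `resetLvglUpdateCache()` come from the commit message.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a label widget; counts how often it was redrawn.
struct FakeLabel {
    std::string text;
    int invalidations = 0;
    void setText(const std::string& t) {
        text = t;
        ++invalidations;  // a real widget would be invalidated and redrawn here
    }
};

class LabelUpdateCache {
public:
    // Pushes the text to the widget only if it differs from the last value
    // sent through this cache. Returns true when the widget was touched.
    bool setLabelText(FakeLabel& label, const std::string& text) {
        auto it = last_.find(&label);
        if (it != last_.end() && it->second == text) {
            return false;  // unchanged: skip the redundant invalidation
        }
        label.setText(text);
        last_[&label] = text;
        return true;
    }

    // Forget all cached values, e.g. after the screen is rebuilt
    // (the role the commit gives to resetLvglUpdateCache()).
    void reset() { last_.clear(); }

private:
    std::unordered_map<const FakeLabel*, std::string> last_;
};
```

Because the cache is keyed by widget address, it must be reset whenever widgets are destroyed and recreated. Otherwise a new widget at a reused address could be skipped.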
Problem
BLE OTA firmware updates abort partway through (~20–30%). The controller logs `Device disconnected reason=531` mid-flash: HCI error `0x13`, a central-initiated graceful disconnect (not RF dropout, not a supervision timeout, not a controller crash). Root cause: during OTA the firmware intentionally stops sending FastLink telemetry to give the flash full bandwidth, so the phone sees the link as idle and tears it down (its telemetry-stall watchdog, or an OS-level GATT idle teardown).

The primary fix lives in the phone app (it now suspends its telemetry-stall watchdog for the duration of the flash). These firmware changes are defense-in-depth so the link stays healthy regardless of the central's behaviour, protecting older app builds and guarding against OS-level teardown, which the app can't prevent.
Changes
- Keepalive (`fastlink_service.cpp`): emit a ~1 Hz keepalive notify while OTA is in progress. The packet was already `setValue()`'d, so its advancing `packet_id`/`uptime_ms` register as telemetry progress on the app side with zero app changes. At 1 Hz against the 15 ms OTA connection interval it does not meaningfully slow the flash.
- Supervision timeout (`ble_core.cpp` → `requestFastConnParams`): 2 s → 8 s, so a multi-second flash-erase stall or a sluggish phone can't drop the link at the link-layer level. `OTA_TIMEOUT_MS` (30 s) remains the dead-link backstop.

Verification
- `pio run -e OpenPPG-CESP32S3-CAN-SP140` (flash 36.9%, 1.23 MB image).
- `OTA Success` → reboot → `Boot complete`, with no mid-flash disconnect.

Reliability over speed: the keepalive and longer timeout trade a negligible amount of flash throughput for a link that stays up across the whole flash.
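To put the supervision-timeout change in link-layer terms: BLE expresses the connection interval in 1.25 ms units and the supervision timeout in 10 ms units, and the timeout must exceed `(1 + latency) × interval × 2`. The sketch below does that arithmetic for the 15 ms interval and the 2 s / 8 s timeouts mentioned above. All function names are hypothetical, and this is not the code in `requestFastConnParams`.

```cpp
#include <cstdint>

// Connection interval in 1.25 ms units (ms * 4 / 5 avoids floating point).
constexpr uint16_t intervalUnits(uint32_t ms) {
    return static_cast<uint16_t>(ms * 4 / 5);
}

// Supervision timeout in 10 ms units.
constexpr uint16_t supervisionUnits(uint32_t ms) {
    return static_cast<uint16_t>(ms / 10);
}

// Checks the spec constraints: timeout between 100 ms and 32 s, and
// timeout > (1 + latency) * interval * 2 (compared in microseconds).
constexpr bool supervisionTimeoutValid(uint16_t interval, uint16_t latency,
                                       uint16_t timeout) {
    return timeout >= 10 && timeout <= 3200 &&
           static_cast<uint64_t>(timeout) * 10000 >
               (1ull + latency) * interval * 1250 * 2;
}

// Whole connection events that fit inside the timeout; a rough measure of
// how many consecutive events can be missed before the link drops.
constexpr uint32_t eventsWithinTimeout(uint32_t intervalMs, uint32_t timeoutMs) {
    return timeoutMs / intervalMs;
}
```

With a 15 ms interval, 2 s covers about 133 connection events and 8 s about 533, so both values are valid. The longer timeout simply tolerates a roughly four times longer stall.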
🤖 Generated with Claude Code