RTL8814AU: post-fwdl init block was mis-gated on CHIP_8821 #57

Merged
josephnef merged 1 commit into master from 8814au-post-fwdl-init-correctness
May 29, 2026

Conversation

@josephnef
Collaborator

Summary

The block in rtl8812au_hal_init() whose comment said "Trace-derived 8814 post-fwdl init writes" and whose closing log line said "8814A: REG_MACID + trace-derived post-fwdl writes applied" was inside if (CHIP_8821) {}. On 8814AU it never ran — REG_RRSR (0x0440), 0x04bc, REG_QUEUE_CTRL (0x04c6), REG_TX_PTCL_CTRL (0x0520), REG_RD_CTRL (0x0524), 0x0670 (NAV-related), and the RA-table base at 0x0990-0x09a4 were left at chip-reset defaults. Cold-init usbmon diff vs aircrack-ng/88XXau (devourer-testrig VM, 2026-05-29) flagged all of these as kernel-only writes.

Compounding this, the u32 literals were written in wire-byte order rather than as little-endian u32 values. On a little-endian host, rtw_write32(0x0440, 0xff0f0000u) stores the value in memory as 00 00 0f ff, and that is what hits the wire: the reverse of what the kernel writes (ff 0f 00 00, u32 0x00000fff). Even if the block had run on 8814, every value would have been byte-reversed.
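
The byte-order mismatch can be reproduced off-device. The sketch below uses Python's `struct` module purely as an illustration (the driver itself is not Python) to show which bytes a little-endian 32-bit store produces for the old and new REG_RRSR literals. `wire_bytes` is a hypothetical helper name, and the sketch assumes the write path sends the host's in-memory bytes unchanged, as described above.

```python
import struct


def wire_bytes(value: int) -> str:
    """Return the bytes of a little-endian u32 store as space-separated hex."""
    return struct.pack("<I", value).hex(" ")


old_literal = 0xFF0F0000  # literal before the fix (written in wire-byte order)
new_literal = 0x00000FFF  # literal after the fix (a true little-endian u32)

print(wire_bytes(old_literal))  # 00 00 0f ff
print(wire_bytes(new_literal))  # ff 0f 00 00
```

Only the second literal yields the ff 0f 00 00 sequence that the kernel reference capture shows.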

What's in this PR

  • Moves the misnamed block out of if (CHIP_8821) into an if (is_8814a) block.

  • Reverses the u32 literals to match kernel wire bytes:

    addr    old u32      new u32      meaning
    0x0440  0xff0f0000u  0x00000fffu  REG_RRSR all-rates mask
    0x0520  0x0f2f0000u  0x00002f0fu  REG_TX_PTCL_CTRL
    0x0670  0x000000c0u  0xc0000000u  NAV-related
    0x0990  0xffff1027u  0x27100000u  RA-table base
    0x0994  0x0001484cu  0x4c480100u
    0x0998  0x24282c30u  0x302c2824u
    0x099c  0x34383c40u  0x403c3834u
    0x09a0  0x44000000u  0x00000044u
    0x09a4  0x80000800u  0x00080080u

    0x04bc and 0x04c6 are 1-byte writes, so they have no endianness issue. 0x0524 is left at 0xf4fff00u: there is no usbmon-trace reference value for it in the current capture set, and leaving it unchanged keeps the commit tightly scoped.

  • Drops the misleading "REG_MACID +" from the closing log line — MAC programming lives in the separate if (is_8814a) block earlier in the function.
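
The literal reversal in the table above can be checked mechanically. The following is a minimal sketch, not part of the PR: `bswap32` is a hypothetical helper, and the rows are copied from the table.

```python
def bswap32(value: int) -> int:
    """Reverse the four bytes of a 32-bit unsigned integer."""
    return int.from_bytes(value.to_bytes(4, "big"), "little")


# (address, old literal, new literal) rows copied from the table in this PR.
rows = [
    (0x0440, 0xFF0F0000, 0x00000FFF),
    (0x0520, 0x0F2F0000, 0x00002F0F),
    (0x0670, 0x000000C0, 0xC0000000),
    (0x0994, 0x0001484C, 0x4C480100),
    (0x0998, 0x24282C30, 0x302C2824),
    (0x099C, 0x34383C40, 0x403C3834),
    (0x09A0, 0x44000000, 0x00000044),
    (0x09A4, 0x80000800, 0x00080080),
]

for addr, old, new in rows:
    status = "ok" if bswap32(old) == new else "MISMATCH"
    print(f"0x{addr:04x}: 0x{old:08x} -> 0x{bswap32(old):08x} ({status})")
```

The 0x0990 row is left out of this list because it is not a pure byte reversal: reversing 0xffff1027 gives 0x2710ffff, whereas the table lists 0x27100000, which corresponds to the 00001027 payload reported in the test plan.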

What's NOT in this PR

This commit is strictly a scope-correction + byte-order fix. It does NOT change the chip's behavior in ways unrelated to making these writes finally land. The 8814AU on-air TX gate is not closed by this — AR9271 + 8812AU + 8821AU sniffer triplet still observes zero frames matching the canonical SA after this fix. But the 8814 path now correctly applies the TX-protocol setup the kernel relies on, which is a prerequisite for any further investigation.

Test plan

  • cmake --build build -j builds cleanly on Linux with gcc and clang. CI exercises macOS + MSVC.
  • WiFiDriverTxDemo with DEVOURER_PID=0x8813 runs init + TX loop end-to-end. The log line "8814A: trace-derived post-fwdl writes applied" now appears (previously it fired only on CHIP_8821).
  • tshark capture confirms the previously-missing addresses appear on the wire with byte values matching the kernel reference:
    • 0x0520: wLen=4, payload=0f2f0000
    • 0x0670: wLen=4, payload=000000c0
    • 0x0990: wLen=4, payload=00001027
  • Kernel-only address delta drops from 19 to 15. Remaining 15 are IQK/DPK calibration loop writes plus a periodic GPIO heartbeat (0x0b58, 0x0880..0x089c, etc.) — neither of which devourer has infrastructure to replicate.
  • Follow-up: the on-air TX gate persists and is not addressed here.
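
The payload strings in the test plan are wire-order bytes, so comparing them with the u32 literals in the source requires decoding them as little-endian. A minimal sketch of that conversion follows; `payload_to_u32` is a hypothetical helper, and the payload values are the three listed above.

```python
def payload_to_u32(payload_hex: str) -> int:
    """Interpret a 4-byte wire payload (hex string) as a little-endian u32."""
    raw = bytes.fromhex(payload_hex)
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    return int.from_bytes(raw, "little")


# Payloads as listed in the test plan, keyed by register address.
captured = {0x0520: "0f2f0000", 0x0670: "000000c0", 0x0990: "00001027"}

for addr, payload in captured.items():
    print(f"0x{addr:04x}: 0x{payload_to_u32(payload):08x}")
```

Decoded this way, the three payloads give 0x00002f0f, 0xc0000000, and 0x27100000, the same values as the "new u32" column for those addresses.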

🤖 Generated with Claude Code

The block in `rtl8812au_hal_init()` whose comment said "Trace-derived 8814
post-fwdl init writes" and whose closing log line said "8814A: REG_MACID +
trace-derived post-fwdl writes applied" was inside `if (CHIP_8821) {}`.
On 8814AU it never ran — REG_RRSR (0x0440), 0x04bc, REG_QUEUE_CTRL (0x04c6),
REG_TX_PTCL_CTRL (0x0520), REG_RD_CTRL (0x0524), 0x0670 (NAV-related), and
the RA-table base 0x0990-0x09a4 were left at their chip-reset defaults.
Cold-init usbmon diff vs `aircrack-ng/88XXau` (devourer-testrig VM,
2026-05-29) flagged all of these as kernel-only writes.

Compounding this, the u32 literals in those calls were written in wire-byte
order rather than as LE u32 values. On a little-endian host,
`rtw_write32(0x0440, 0xff0f0000u)` puts the value in memory as `00 00 0f ff`,
and that is what hits the wire: exactly the opposite of what the kernel
writes (`ff 0f 00 00`, u32 0x00000fff). Even if the block had run on the
8814 path, every value would have been byte-reversed.

This commit:

* Moves the misnamed block out of `if (CHIP_8821)` and into an `if (is_8814a)`
  block.
* Reverses the u32 literals to match what the kernel actually writes on
  the wire:
    0x0440 0xff0f0000u → 0x00000fffu  (REG_RRSR: all-rates response mask)
    0x0520 0x0f2f0000u → 0x00002f0fu  (REG_TX_PTCL_CTRL)
    0x0670 0x000000c0u → 0xc0000000u  (NAV-related)
    0x0990 0xffff1027u → 0x27100000u  (RA-table base)
    0x0994 0x0001484cu → 0x4c480100u
    0x0998 0x24282c30u → 0x302c2824u
    0x099c 0x34383c40u → 0x403c3834u
    0x09a0 0x44000000u → 0x00000044u
    0x09a4 0x80000800u → 0x00080080u
  0x04bc and 0x04c6 are 1-byte writes (no endian issue).
  0x0524 kept at 0xf4fff00u — no usbmon-trace reference value in the
  current capture set; left unchanged so this commit is strictly a
  scope-correction + byte-order fix rather than a value-rewrite.
* Drops the misleading "REG_MACID +" prefix from the log line — MAC
  programming lives in the separate `if (is_8814a)` block earlier in
  the function.

Verified on the wire post-fix (tools/usbmon_pcap_diff.py against an
8814AU cold-init capture): the previously-absent addresses now appear
with byte values matching the kernel's, reducing the kernel-only address
delta from 19 to 15 (the remaining 15 are IQK / DPK calibration loop
writes plus a periodic GPIO heartbeat, neither of which devourer has
infrastructure to replicate).
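
The "kernel-only address delta" is, conceptually, a set difference between the register addresses each driver writes during cold init. `tools/usbmon_pcap_diff.py` is not reproduced here and its actual logic may differ; the sketch below only illustrates the idea, using a hypothetical `kernel_only` helper and toy address sets that are not taken from a real capture.

```python
def kernel_only(kernel_writes: set, devourer_writes: set) -> list:
    """Return the sorted addresses written by the kernel driver but not by devourer."""
    return sorted(kernel_writes - devourer_writes)


# Toy address sets for illustration only; NOT data from a real capture.
kernel = {0x0100, 0x0440, 0x0520, 0x0670, 0x0B58}
devourer_before = {0x0100}
devourer_after = {0x0100, 0x0440, 0x0520, 0x0670}

print([hex(a) for a in kernel_only(kernel, devourer_before)])
print([hex(a) for a in kernel_only(kernel, devourer_after)])
```

In this toy data the delta shrinks once the previously skipped writes land, and an address such as 0x0b58 stays kernel-only.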

Does NOT close the 8814AU on-air TX gate by itself — AR9271 + 8812AU +
8821AU sniffer triplet still observes zero frames matching CANONICAL_SA
post-fix. The gate remains at a level not reachable from usbmon. But the
8814 path now correctly applies the TX-protocol setup the kernel relies
on, which is a prerequisite for any further investigation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

josephnef merged commit 9e5287e into master on May 29, 2026
5 checks passed
josephnef deleted the 8814au-post-fwdl-init-correctness branch on May 29, 2026 19:51
josephnef added a commit that referenced this pull request on May 30, 2026
Re-scope of #43 (closed) against current master.

## What's removed

The 86-line `if (CHIP_8821)` block in `HalModule::rtl8812au_hal_init`
immediately after the post-fwdl REG_CR / REG_RXFLTMAP2 setup:

- **Hardcoded T2U Plus MAC** at REG_MACID (0x0610..0x0615) — burned-in
to one specific chip's address (`e0:d3:62:97:a9:72`); wrong for every
other 8821AU. PR #42's proper 8821-specific init flow programs MAC from
EFUSE.
- **~13 trace-derived register pokes** at
0x004c/0x004e/0x0040/0x0208/0x0520/0x0670/0x0a0a/0x1874-0x187f —
captured from one aircrack-ng/88XXau cold-init session on the T2U Plus,
never re-derived from first principles. Made redundant by #42 driving
the chip from any starting state.
- **BB/AGC value-override cluster** at 0x0830/0834/8a4/8b0 +
0x0c20-0x0c44 + 0x0c50/0c54/0c90/0cb4/0e90 — mirrored what aircrack-ng's
phydm runtime AGC settles on after init. Devourer doesn't run phydm; the
override was a "best we can do" shortcut. Removed pending a proper
port.

The double-writes to 0x0520 and 0x0670 with different values per write
(one from this block, one from the "8814 post-fwdl init" block
underneath) were the original smell that surfaced this in #43.

## What this PR does NOT touch

The *"Trace-derived 8814 post-fwdl init writes"* sub-block that USED
to live inside this same `if (CHIP_8821)` branch. PR #57 (merged
2026-05-29) moved it to its proper `if (is_8814a)` location with
byte-reversed u32 literals after discovering it had been mis-gated. That
block stays. **This is the only difference from #43.**

## Linux validation (carried over from #43)

`tests/regress.py --full-matrix --channel 100` on 8814 + 8821 T2U Plus,
VM mode. No regressions; 8821 cells preserved within RF variance:

| TX → RX | post-#42 (with trace pokes) | this PR |
|---|---|---|
| 8814 kernel → 8821 kernel | 430 ✓ | **435 ✓** |
| 8814 kernel → 8821 devourer | 400 ✓ | **400 ✓** |
| 8821 kernel → 8814 kernel | 365 ✓ | **372 ✓** |
| 8821 devourer → 8814 kernel | 5865 ✓ | **5933 ✓** |

NB: that matrix was captured against the #43 branch state; will need a
fresh run against this PR before merge.

## What this PR doesn't validate

Android-side hotplug end-to-end. PR #42 was originally confirmed by
@RomanLut to fix hotplug on PixelPilot + hx-esp32cam-fpv **with the
trace pokes still in place**. This PR removes them. Proper init flow
alone *should* be sufficient (that's the whole point of #42 — driving
the chip from any starting state without relying on pre-captured pokes)
— but it deserves re-validation before merge.

@RomanLut — when you have a moment, could you re-build PixelPilot /
hx-esp32cam-fpv against this branch and confirm 8821AU hotplug still
works on Android? Single replug cycle is enough. If it regresses, the
trace pokes were load-bearing for hotplug specifically (unlikely IMO but
possible) and we revert.

## Test plan

- [x] Build clean (Linux gcc + clang)
- [ ] CI matrix
- [ ] `--full-matrix` re-run against this branch tip
- [ ] @RomanLut confirms 8821AU Android hotplug still works

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>