rdma_topo: Extend support for PCIe switch-based GPUDirect platforms #1725

Open
EdwardSro wants to merge 6 commits into linux-rdma:master from EdwardSro:pr-rdma-topo

@EdwardSro
Member

This series extends rdma_topo to support platforms where a ConnectX NIC and GPU share a PCIe switch without a separate DMA Direct function, in addition to the existing DMA-based topologies.

The first patches prepare the ground by replacing lspci/setpci with direct sysfs and config space parsing, adding a dump/replay mechanism for offline analysis and testing, refactoring the complex representation into an abstract base with per-topology implementations, and improving the check command to report all failures instead of stopping at the first.

The final patch adds inline topology detection, ACS computation, and a --virt/--no-virt flag that controls whether ACS is configured for bare-metal or virtualized environments. When unset, the tool auto-detects based on the NIC's ATS capability.

No behavioral changes for existing Direct NIC (DMA-based) platforms.

vladum added 6 commits April 28, 2026 12:35
Add a 'supported' property to PCITopo which is True when the detected
topology is supported by the tool. The logic that computes this property
will evolve as support for other topologies is added.

Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Introduce SysfsDevice class to store flat unparsed sysfs device data
separately from the parsed and hierarchical representations (PCIDevice,
PCITopo).

Replace lspci/setpci with direct sysfs parsing. While the config space
could have been parsed using 'lspci -F', which ingests 'lspci -x'
output, VPD could not. To handle both uniformly, add a simple parser
for each.

Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
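To illustrate what parsing config space directly involves, here is a minimal sketch of walking the PCIe extended capability list in a raw config space buffer, such as the bytes read from a device's sysfs `config` file. The function name `walk_ext_caps` is hypothetical and not taken from the patch; the layout (list starting at offset 0x100, 16-bit ID, next pointer in bits 31:20) and the ACS/ATS capability IDs follow the PCIe specification.

```python
import struct

PCI_EXT_CAP_START = 0x100    # extended capabilities start here in PCIe config space
PCI_EXT_CAP_ID_ACS = 0x000D  # Access Control Services
PCI_EXT_CAP_ID_ATS = 0x000F  # Address Translation Services


def walk_ext_caps(config: bytes) -> dict[int, int]:
    """Return {capability ID: offset} for the extended capability list.

    Hypothetical helper, not the parser added by the patch.
    """
    caps = {}
    seen = set()
    offset = PCI_EXT_CAP_START
    # Stop on a null pointer, a truncated buffer, or a looping list.
    while offset and offset + 4 <= len(config) and offset not in seen:
        seen.add(offset)
        (header,) = struct.unpack_from("<I", config, offset)
        if header in (0, 0xFFFFFFFF):
            break  # empty list or device not responding
        caps.setdefault(header & 0xFFFF, offset)
        # Next pointer lives in bits 31:20; the low two bits are reserved.
        offset = (header >> 20) & 0xFFC
    return caps
```

On a live system the buffer could come from something like `open("/sys/bus/pci/devices/<bdf>/config", "rb").read()`; unprivileged reads typically return only the first 64 bytes, in which case this sketch returns an empty dict.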
Add support for dumping and ingesting a system's state for offline
analysis. The 'dump' command serializes the raw sysfs state to JSON,
while the -F option, added to the 'topo' and 'check' commands, consumes
previously captured dumps.

This enables debugging, testing and extending support for new platforms.

The 'config' and 'vpd' binary data is zlib-compressed and base64-encoded,
both for brevity and so that it can be stored safely in JSON.

Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
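The encoding described above can be sketched as a round trip through the standard library. The helper names and the dump layout (a dict keyed by PCI address) are assumptions made for illustration; the actual JSON schema used by the 'dump' command is not shown in this thread.

```python
import base64
import json
import zlib


def encode_blob(raw: bytes) -> str:
    """Compress binary sysfs data and make it JSON-safe (hypothetical helper)."""
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def decode_blob(text: str) -> bytes:
    """Inverse of encode_blob()."""
    return zlib.decompress(base64.b64decode(text))


# Assumed layout for illustration only: one entry per PCI address.
config = bytes(4096)
dump = {"0000:01:00.0": {"config": encode_blob(config)}}
restored = json.loads(json.dumps(dump))
assert decode_blob(restored["0000:01:00.0"]["config"]) == config
```

Config space is mostly zeros on many devices, so compression keeps the dump small, and base64 is needed because JSON strings cannot carry arbitrary bytes.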
Introduce an abstract NVCX_Complex base (using ABC). Move the existing
implementation into NVCX_DMA_Complex, which implements the abstract
methods for the existing DMA PF-based platform.

This allows adding other complex types that share the same interface but
differ in how they compute ACS and run additional checks (e.g., IOMMU
groups).

Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
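The shape of this refactor can be sketched as follows. Only the class names `NVCX_Complex` and `NVCX_DMA_Complex` come from the commit message; the constructor arguments, the method names `compute_acs` and `extra_checks`, and the placeholder return values are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class NVCX_Complex(ABC):
    """Shared interface for a NIC/GPU complex (method names are hypothetical)."""

    def __init__(self, nic_bdf: str, gpu_bdf: str):
        self.nic_bdf = nic_bdf
        self.gpu_bdf = gpu_bdf

    @abstractmethod
    def compute_acs(self) -> dict:
        """Return the desired ACS setting per PCI address."""

    @abstractmethod
    def extra_checks(self) -> list:
        """Return descriptions of failed topology-specific checks."""


class NVCX_DMA_Complex(NVCX_Complex):
    """Existing DMA PF-based platform; values below are placeholders."""

    def compute_acs(self) -> dict:
        return {self.nic_bdf: "placeholder"}

    def extra_checks(self) -> list:
        return []
```

Because the base class declares abstract methods, instantiating `NVCX_Complex` directly raises `TypeError`, so each new topology has to provide its own ACS computation and checks.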
Today the 'check' command exits on the first check_fail(), so one bad
ACS or additional check (e.g., iommu_group) hides later failures. Move
the exit call from check_fail() to cmd_check() and print all results
before deciding to exit with error.

Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
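A minimal sketch of the pattern described above: record every failure, print all results, and only then decide the exit status. The function signatures and the representation of a check as a `(name, callable)` pair are assumptions for illustration, not the tool's real interface.

```python
import sys


def run_checks(checks):
    """Run every check and return the names of the ones that failed.

    `checks` is a list of (name, callable returning bool); hypothetical shape.
    """
    failures = []
    for name, check in checks:
        ok = check()
        print(f"{'OK' if ok else 'FAIL'}: {name}")
        if not ok:
            failures.append(name)  # record and keep going
    return failures


def cmd_check(checks) -> int:
    """Return the process exit status after all results were printed."""
    return 1 if run_checks(checks) else 0


if __name__ == "__main__":
    sys.exit(cmd_check([
        ("ACS on switch port", lambda: False),
        ("iommu_group", lambda: True),
    ]))
```

Keeping the exit decision in one place means a failing ACS check no longer hides a later iommu_group failure.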
Add NVCX_Inline_Complex to model CX/GPU topologies that lack a separate
DMA function and rely on a shared switch to communicate directly.

Recognize additional device types (e.g. bridge, generic_rp) and a
missing CX switch device ID so inline topologies are classified and
checked correctly.

This topology can be used with or without virtualization. Add a --virt /
--no-virt flag to the check, write-grub-acs and setpci-acs commands to
select the intended mode. When the flag is not passed, auto-detect
virtualization based on the availability of the ATS capability on the CX
NICs. Topologies where the CX NICs have mixed ATS capabilities are not
supported.

Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
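The tri-state flag can be sketched with `argparse.BooleanOptionalAction` (Python 3.9+), which generates both `--virt` and `--no-virt` and leaves the value as `None` when neither is given. The `resolve_virt` helper is hypothetical, and the mapping "ATS present on all CX NICs means virtualization" is an assumption about the direction of the auto-detection, which the commit message does not spell out.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="rdma_topo-sketch")
    # default=None distinguishes "not passed" from an explicit --no-virt.
    parser.add_argument("--virt", action=argparse.BooleanOptionalAction,
                        default=None,
                        help="configure ACS for virtualized (or bare-metal) use")
    return parser


def resolve_virt(flag, nic_has_ats):
    """Pick the mode from the flag, else from per-NIC ATS availability.

    ASSUMPTION: ATS on every CX NIC implies virtualization, ATS on none
    implies bare metal. Mixed capabilities are rejected, as in the commit.
    """
    if flag is not None:
        return flag
    if not nic_has_ats:
        raise ValueError("no CX NICs to auto-detect from")
    if all(nic_has_ats):
        return True
    if not any(nic_has_ats):
        return False
    raise ValueError("mixed ATS capabilities across CX NICs are not supported")
```

For example, `resolve_virt(build_parser().parse_args([]).virt, [True, True])` takes the auto-detection path, while passing `--no-virt` overrides it.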
@EdwardSro
Member Author

Pushed v2, which fixes the 'topo' command to always display the topology, even when virtualization cannot be auto-detected (regardless of whether the user intends to use virtualization).
