rdma_topo: Extend support for PCIe switch-based GPUDirect platforms #1725
Open
EdwardSro wants to merge 6 commits into linux-rdma:master
Add a 'supported' property to PCITopo which is True when the detected topology is supported by the tool. The logic that computes this property will evolve as support for other topologies is added. Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com> Signed-off-by: Edward Srouji <edwards@nvidia.com>
Introduce SysfsDevice class to store flat unparsed sysfs device data separately from the parsed and hierarchical representations (PCIDevice, PCITopo). Replace lspci/setpci with direct sysfs parsing. While the config space could have been parsed using 'lspci -F', which ingests 'lspci -x' output, VPD could not. To handle both uniformly, add a simple parser for each. Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com> Signed-off-by: Edward Srouji <edwards@nvidia.com>
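As a rough illustration of what parsing config space straight from sysfs can involve, the sketch below decodes a few fixed header fields from a raw config blob. The names `parse_config_header` and `read_sysfs_config` and the returned dictionary keys are hypothetical and not taken from the patch; only the field offsets (standard PCI configuration header layout) and the sysfs `config` file path are established facts.

```python
import struct


def parse_config_header(config: bytes) -> dict:
    """Decode a few fixed fields from the start of a raw PCI config space blob.

    Hypothetical helper for illustration; not the parser added by this series.
    """
    if len(config) < 0x10:
        raise ValueError("need at least the first 16 bytes of config space")
    # Vendor ID at 0x00 and device ID at 0x02, both little-endian 16-bit.
    vendor_id, device_id = struct.unpack_from("<HH", config, 0x00)
    # Class code bytes at 0x09..0x0B: programming interface, subclass, base class.
    prog_if, subclass, base_class = struct.unpack_from("<BBB", config, 0x09)
    header_byte = config[0x0E]
    return {
        "vendor_id": vendor_id,
        "device_id": device_id,
        "class_code": (base_class << 16) | (subclass << 8) | prog_if,
        "header_type": header_byte & 0x7F,  # 0 = endpoint, 1 = PCI-to-PCI bridge
        "multifunction": bool(header_byte & 0x80),
    }


def read_sysfs_config(bdf: str) -> bytes:
    """Read the config file the kernel exposes for a device such as '0000:03:00.0'."""
    with open(f"/sys/bus/pci/devices/{bdf}/config", "rb") as f:
        return f.read()
```

Keeping the decoder separate from the file read means the same function can be fed bytes from a live system or from a previously captured blob.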
Add support for dumping and ingesting a system's state for offline analysis. The 'dump' command serializes the raw sysfs state to JSON, while the -F option, added to the 'topo' and 'check' commands, consumes previously captured dumps. This enables debugging, testing and extending support for new platforms. The 'config' and 'vpd' binary data is zlib-compressed and base64-encoded, both for brevity and so that it can be stored safely in JSON. Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com> Signed-off-by: Edward Srouji <edwards@nvidia.com>
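The compress-then-encode step described in this commit can be sketched with the Python standard library alone. The helper names `encode_blob` and `decode_blob` and the one-key record are hypothetical; the actual dump schema is not shown in this PR text.

```python
import base64
import json
import zlib


def encode_blob(data: bytes) -> str:
    """Compress binary data and wrap it as ASCII text that JSON can carry."""
    return base64.b64encode(zlib.compress(data)).decode("ascii")


def decode_blob(text: str) -> bytes:
    """Reverse encode_blob: base64-decode, then decompress."""
    return zlib.decompress(base64.b64decode(text))


# Hypothetical dump record, used only to show the round trip through JSON.
config = bytes(4096)  # an all-zero 4 KiB config space compresses very well
record = {"config": encode_blob(config)}
restored = decode_blob(json.loads(json.dumps(record))["config"])
assert restored == config
```

Compressing before base64 matters for brevity: base64 alone grows the data by about a third, while config space is often mostly zeros and shrinks considerably under zlib.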
Introduce an abstract NVCX_Complex base (using ABC). Move the existing implementation into NVCX_DMA_Complex, which implements the abstract methods for the existing DMA PF-based platform. This allows adding other complex types that share the same interface but differ in how they compute ACS and run additional checks (e.g., IOMMU groups). Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com> Signed-off-by: Edward Srouji <edwards@nvidia.com>
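A minimal sketch of the pattern this commit describes, an abstract base with per-topology subclasses, might look like the following. The class names `ComplexBase` and `DmaComplexSketch`, the method names, and the placeholder return values are all hypothetical stand-ins; the real NVCX_Complex interface is not shown in this PR text.

```python
from abc import ABC, abstractmethod


class ComplexBase(ABC):
    """Shared interface; stands in for the abstract base described above."""

    def __init__(self, name: str):
        self.name = name

    @abstractmethod
    def compute_acs(self) -> dict:
        """Return the desired ACS setting per device (hypothetical shape)."""

    @abstractmethod
    def extra_checks(self) -> list:
        """Return (description, passed) tuples for topology-specific checks."""

    def report(self) -> list:
        # Common code is written once against the abstract interface.
        return [f"{self.name}: {desc}: {'OK' if ok else 'FAIL'}"
                for desc, ok in self.extra_checks()]


class DmaComplexSketch(ComplexBase):
    """Placeholder subclass; real ACS values and checks are not reproduced here."""

    def compute_acs(self) -> dict:
        return {"placeholder-device": "placeholder-acs-value"}

    def extra_checks(self) -> list:
        return [("placeholder check", True)]
```

With this shape, a new topology only has to supply its own `compute_acs` and `extra_checks`, and `ABC` refuses to instantiate a subclass that forgets either one.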
Today the 'check' command exits on the first check_fail(), so one bad ACS or additional check (e.g., iommu_group) hides later failures. Move the exit call from check_fail() to cmd_check() and print all results before deciding to exit with error. Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com> Signed-off-by: Edward Srouji <edwards@nvidia.com>
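The change in control flow can be illustrated with a small sketch: record each failure, keep going, and decide on the exit status only at the end. The function `run_checks` and the check names are hypothetical and do not reflect the tool's actual internals.

```python
def run_checks(checks):
    """Run every check, print each result, and return a process exit code.

    `checks` is a list of (name, callable) pairs; each callable returns True
    on success. Hypothetical structure, for illustration only.
    """
    failures = 0
    for name, check in checks:
        ok = bool(check())
        print(f"{'OK  ' if ok else 'FAIL'} {name}")
        if not ok:
            failures += 1  # record the failure but keep going
    return 1 if failures else 0


# Both checks run and are printed even though the first one fails.
exit_code = run_checks([
    ("acs placeholder check", lambda: False),
    ("iommu_group placeholder check", lambda: True),
])
```

A caller would pass the returned code to `sys.exit()` once, after all results have been printed.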
Add NVCX_Inline_Complex to model CX/GPU topologies that lack a separate DMA function and rely on a shared switch to communicate directly. Recognize additional device types (e.g. bridge, generic_rp) and a missing CX switch device ID so inline topologies are classified and checked correctly. This topology can be used with or without virtualization. Add --virt / --no-virt flag for the check / write-grub-acs / setpci-acs commands, to indicate which mode the user is interested in. When the flag is not passed, auto-detect virtualization based on the availability of ATS capability on the CX NICs. Topologies with mixed ATS capabilities for the CX NICs are not supported. Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com> Signed-off-by: Edward Srouji <edwards@nvidia.com>
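One way to model a three-state --virt / --no-virt flag with auto-detection is sketched below. It assumes, without confirmation from the source, that ATS present on every NIC maps to virtualized mode and ATS absent on every NIC maps to bare metal; `build_parser`, `resolve_virt`, and the `nic_has_ats` list are hypothetical names. In this sketch an explicit flag bypasses detection entirely, which may differ from how the tool treats mixed topologies.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="check-sketch")
    # BooleanOptionalAction (Python 3.9+) creates --virt and --no-virt together.
    # default=None distinguishes "flag not passed" from an explicit choice.
    parser.add_argument("--virt", action=argparse.BooleanOptionalAction,
                        default=None)
    return parser


def resolve_virt(flag, nic_has_ats):
    """Pick the mode: an explicit flag wins, otherwise infer from NIC ATS.

    Assumption: ATS on all NICs means virtualized, ATS on none means bare
    metal. Mixed capabilities are rejected during auto-detection.
    """
    if flag is not None:
        return flag
    if not nic_has_ats:
        raise ValueError("no NICs to auto-detect from")
    if all(nic_has_ats):
        return True
    if not any(nic_has_ats):
        return False
    raise ValueError("mixed ATS capabilities across NICs are not supported")
```

For example, `resolve_virt(build_parser().parse_args([]).virt, [True, True])` falls through to detection because the flag is `None`.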
Pushed v2, fixing the 'topo' command to always display the topology, even if virtualization could not be auto-detected (regardless of whether the user is going to use virtualization).
This series extends rdma_topo to support platforms where a ConnectX NIC and GPU share a PCIe switch without a separate DMA Direct function, in addition to the existing DMA-based topologies.
The first patches prepare the ground by replacing lspci/setpci with direct sysfs and config space parsing, adding a dump/replay mechanism for offline analysis and testing, refactoring the complex representation into an abstract base with per-topology implementations, and improving the check command to report all failures instead of stopping at the first.
The final patch adds inline topology detection, ACS computation, and a --virt/--no-virt flag that controls whether ACS is configured for bare-metal or virtualized environments. When unset, the tool auto-detects based on the NIC's ATS capability.
No behavioral changes for existing Direct NIC (DMA-based) platforms.