
Avoid unordered_map for runtime datatype mapping #3223

Open
LwhJesse wants to merge 1 commit into NVIDIA:main from LwhJesse:perf/runtime-datatype-switch

LwhJesse commented May 11, 2026

Summary

Replace the per-call construction of a std::unordered_map, used in the GEMM operation wrappers to convert RuntimeDatatype to cute::UMMA::MXF8F6F4Format, with a switch-based local helper.

The mapping is fixed and small, so this avoids constructing a temporary hash map during argument update while preserving the supported mappings for:

  • RuntimeDatatype::kE4M3
  • RuntimeDatatype::kE5M2
  • RuntimeDatatype::kE3M2
  • RuntimeDatatype::kE2M1
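
For illustration, the four mappings listed above could be expressed as a switch roughly like the following. This is a minimal sketch, not the PR's actual diff: both enums are local stand-ins for RuntimeDatatype and cute::UMMA::MXF8F6F4Format, their numeric values are placeholders, the extra kOtherPlaceholder enumerator exists only to exercise the unsupported branch, and the helper name to_mxf8f6f4_format is hypothetical.

```cpp
#include <cstdint>

// Stand-in for RuntimeDatatype (assumption: only the four enumerators named
// in this PR are modeled, plus one placeholder for an unsupported value).
enum class RuntimeDatatype : std::uint8_t {
  kE4M3, kE5M2, kE3M2, kE2M1, kOtherPlaceholder
};

// Stand-in for cute::UMMA::MXF8F6F4Format; the numeric values are placeholders.
enum class MXF8F6F4Format : std::uint8_t { E4M3, E5M2, E3M2, E2M1 };

// Hypothetical helper: writes the converted format to `out` and returns true
// for a supported input; returns false and leaves `out` untouched otherwise.
// No container is constructed, so there is no per-call allocation or hashing.
inline bool to_mxf8f6f4_format(RuntimeDatatype in, MXF8F6F4Format& out) {
  switch (in) {
    case RuntimeDatatype::kE4M3: out = MXF8F6F4Format::E4M3; return true;
    case RuntimeDatatype::kE5M2: out = MXF8F6F4Format::E5M2; return true;
    case RuntimeDatatype::kE3M2: out = MXF8F6F4Format::E3M2; return true;
    case RuntimeDatatype::kE2M1: out = MXF8F6F4Format::E2M1; return true;
    default: return false;  // unsupported runtime datatype
  }
}
```

A caller would invoke the helper once for operand A and once for operand B, and treat a false return as the unsupported path.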

This also fixes the unsupported runtime datatype path: the previous assert on a string expression is replaced with a real debug assertion and an explicit Status::kErrorInvalidProblem return.
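
To make the difference concrete: if the earlier code asserted on a bare string literal, as the wording above suggests, that assertion could never fail, because a string literal converts to a non-null pointer. The sketch below contrasts that with an assertion that can fire, followed by an explicit error return. It is an illustration under stated assumptions: Status is a two-value stand-in for the library's status enum, and string_literal_is_truthy and check_supported are hypothetical names.

```cpp
#include <cassert>

// Stand-in for the library's Status enum (assumption: two values modeled).
enum class Status { kSuccess, kErrorInvalidProblem };

// A string literal decays to a non-null pointer, so it always tests true.
// An assert whose whole expression is such a literal therefore never aborts.
inline bool string_literal_is_truthy() {
  const char* msg = "Unsupported runtime datatype";
  return msg != nullptr;
}

// Hypothetical check for the unsupported path.
inline Status check_supported(bool supported) {
  if (!supported) {
    // `false && "text"` evaluates to false, so this aborts in debug builds.
    // Under NDEBUG the assert is compiled out and the return below applies.
    assert(false && "Unsupported runtime datatype");
    return Status::kErrorInvalidProblem;
  }
  return Status::kSuccess;
}
```

The explicit return matters for release builds, where assert expands to nothing and the caller would otherwise continue with an unset format.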

Changed files

  • tools/library/src/gemm_operation_3x.hpp
  • tools/library/src/sparse_gemm_operation_3x.hpp
  • tools/library/src/blockwise_gemm_operation_3x.hpp

Local validation

  • git diff --check HEAD~1..HEAD
  • Verified no remaining RuntimeDatatype std::unordered_map mapping in the touched files
  • Verified the corrected debug assertion path appears once in each touched file

Full local CUTLASS builds are not practical on my machine, so I am relying on project CI and maintainer review for full validation.

LwhJesse marked this pull request as ready for review May 11, 2026 08:06