hydra: refactor proxy exec list #7802
Open
hzhou wants to merge 5 commits into
Since MPL_hash is effectively a memory storage device, add a memory class to track it. Use bool instead of int or char values of 0 and 1. Fix MPL_hash_has: it needs to check whether the hash is empty.
The default usage of mpl_hash is a string to string hash, but it can be used as a string set or a string store.
The previous code assumes a round-robin rank assignment. This may be incorrect now that we use a rank table. Directly use the rank info stored in the exec struct in the proxy to avoid recalculating it. Also use MPL_hash to simplify the string storage.
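To illustrate why recomputing ranks can go wrong, here is a minimal sketch, not the hydra code. The names `round_robin_rank`, `struct exec_info`, and `stored_rank` are hypothetical. It assumes a block-style rank table in which proxy 1 of 2 owns ranks 2 and 3; the round-robin formula would report ranks 1 and 3 for the same proxy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: the rank a proxy would compute for its i-th local process
 * if ranks were dealt out round-robin, one per proxy per pass. */
static int round_robin_rank(int proxy_id, int num_proxies, int local_idx)
{
    return proxy_id + local_idx * num_proxies;
}

/* Hypothetical exec record that carries the ranks taken from the rank
 * table, so the proxy never has to recompute them. */
struct exec_info {
    const int *ranks;    /* ranks assigned to this proxy, in launch order */
    size_t num_ranks;
};

/* Look up the stored rank instead of deriving it from an assumed layout. */
static int stored_rank(const struct exec_info *e, size_t local_idx)
{
    assert(local_idx < e->num_ranks);
    return e->ranks[local_idx];
}
```

With a block layout the two functions disagree for the first local process, which is the case the commit guards against by reading the stored value.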
Rather than unnecessarily duplicating HYD_exec, which in turn duplicates strings and environments, use a separate struct that only holds a pointer to HYD_exec.
Do not repeat exec info just because multiple processes are not consecutive on a proxy. Instead, use a separate launch_list for launch groups. Add the utility function HYDU_free_launch_list for freeing a linked list of struct HYD_proxy_exec. Clean up struct HYD_exec by removing the unused fields start_rank and ref_count.
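The idea in the last two commits can be sketched as follows. This is an illustration under assumptions, not the actual hydra definitions: `struct demo_exec`, `struct demo_proxy_exec`, its `start_rank` and `proc_count` fields, `demo_add_launch_group`, and `demo_free_launch_list` are all hypothetical stand-ins. Each launch group borrows a pointer to one shared exec record, and freeing the list releases only the list nodes.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the heavyweight exec description (arguments, environment). */
struct demo_exec {
    const char *const *argv;
};

/* One launch group: a run of ranks on a proxy that share one exec. */
struct demo_proxy_exec {
    const struct demo_exec *exec;    /* borrowed pointer, never freed here */
    int start_rank;
    int proc_count;
    struct demo_proxy_exec *next;
};

/* Append a launch group to the list.
 * Returns 0 on success, -1 if allocation fails. */
static int demo_add_launch_group(struct demo_proxy_exec **list,
                                 const struct demo_exec *exec,
                                 int start_rank, int proc_count)
{
    assert(proc_count > 0);
    struct demo_proxy_exec *node = malloc(sizeof(*node));
    if (node == NULL)
        return -1;
    node->exec = exec;
    node->start_rank = start_rank;
    node->proc_count = proc_count;
    node->next = NULL;
    while (*list != NULL)            /* walk to the tail link */
        list = &(*list)->next;
    *list = node;
    return 0;
}

/* Free only the list nodes; the shared exec is owned elsewhere. */
static void demo_free_launch_list(struct demo_proxy_exec *list)
{
    while (list != NULL) {
        struct demo_proxy_exec *next = list->next;
        free(list);
        list = next;
    }
}
```

Two non-consecutive ranks on the same proxy then cost two small nodes rather than two full copies of the argument and environment strings.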
test:mpich/ch3/most
Pull Request Description
In a round-robin rank assignment, the launch list for each proxy does not cover consecutive ranks. This resulted in duplicated exec arguments, which can be very long because of environment strings. Avoid the duplication by using a separate HYD_proxy_exec struct.
[skip warnings]
MPL_hash to MPIR_proctable is questionable because the strings may get reallocated. What we could do is a two-round approach: first insert all the strings into MPL_hash, then freeze the hash table, and set proctable.
Author Checklist
Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
Commits are self-contained and do not do two things at once.
Commit message is of the form:
module: short description
Commit message explains what's in the commit.
Whitespace checker. Warnings test. Additional tests via comments.
For non-Argonne authors, check contribution agreement.
If necessary, request an explicit comment from your company's PR approval manager.
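On the pointer-stability concern raised in the description, the two-round approach can be sketched as below. This is a stand-in that does not use the MPL_hash API: `struct str_pool` and the `pool_*` functions are hypothetical, and the sketch assumes storage that may move its strings when it grows. Round one records only offsets; pointers are handed out in round two, after the pool is frozen and can no longer reallocate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string store. */
struct str_pool {
    char *buf;
    size_t len, cap;
    bool frozen;
};

/* Round one: copy s into the pool and return its offset, or (size_t)-1
 * on failure. This may realloc buf, so pointers into buf taken earlier
 * would become invalid; that is why only offsets are returned. */
static size_t pool_add(struct str_pool *p, const char *s)
{
    size_t n = strlen(s) + 1;
    if (p->frozen)
        return (size_t) -1;
    if (p->len + n > p->cap) {
        size_t cap = p->cap ? p->cap : 16;
        while (cap < p->len + n)
            cap *= 2;
        char *nb = realloc(p->buf, cap);
        if (nb == NULL)
            return (size_t) -1;
        p->buf = nb;
        p->cap = cap;
    }
    memcpy(p->buf + p->len, s, n);
    p->len += n;
    return p->len - n;
}

/* After this call the buffer never moves again. */
static void pool_freeze(struct str_pool *p)
{
    p->frozen = true;
}

/* Round two: resolve an offset to a pointer that stays valid. */
static const char *pool_get(const struct str_pool *p, size_t off)
{
    assert(p->frozen && off < p->len);
    return p->buf + off;
}

static void pool_free(struct str_pool *p)
{
    free(p->buf);
    memset(p, 0, sizeof(*p));
}
```

A caller would add every string first, call `pool_freeze`, and only then fill a table of `const char *` entries from `pool_get`.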