fix model name delivery and log more info #1745
Open
Harold-lkk wants to merge 7 commits into InternLM:agent_dev from
YanhuiDua
reviewed
May 1, 2026
if self.lmdeploy_actor is None:
    self.lmdeploy_actor = ray.get_actor(SHARED_STORE, namespace=SHARED_STORE_NAMESPACE)
    assert self.lmdeploy_actor is not None, "LMDeploy actor should be available in the shared store."
routed_experts_ref = self.lmdeploy_actor.get.remote(routed_experts)
Collaborator
routed_experts = ray.get(self.lmdeploy_actor.get.remote(routed_experts))
routed_experts_ref = ray.put(routed_experts)
YanhuiDua
reviewed
May 8, 2026
extra_info["routed_experts"] = routed_experts
# Turn 1: materialize tensor and hand ownership to the store.
decoded = self._decode_routed_experts(routed_experts)
if isinstance(decoded, ObjectRef):
Collaborator
In _decode_routed_experts, add `await _LMDEPLOY_ACTOR.get.remote(routed_experts)` and free the obj_ref there. What comes back at that point is already the tensor.
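To make the suggestion concrete, here is a minimal Ray-free sketch of the "fetch, then free" shape inside the decode helper. `InMemoryStore` and its `put`/`get`/`free` methods are hypothetical stand-ins for the shared store actor, not the actual LMDeploy or xtuner API; in the real code the awaited call would be the actor's remote `get`.

```python
import asyncio


class InMemoryStore:
    """Hypothetical stand-in for the shared store actor (not the real API)."""

    def __init__(self):
        self._data = {}

    async def put(self, key, value):
        self._data[key] = value

    async def get(self, key):
        return self._data[key]

    async def free(self, key):
        # Dropping a missing key is treated as a no-op.
        self._data.pop(key, None)

    def __len__(self):
        return len(self._data)


async def decode_routed_experts(store, key):
    """Materialize the value behind `key`, then drop the store's copy."""
    value = await store.get(key)
    await store.free(key)
    # The caller receives the concrete value, never a reference.
    return value
```

With this shape the caller no longer needs an `isinstance(decoded, ObjectRef)` branch, since the helper only ever returns the materialized value.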
YanhuiDua
reviewed
May 8, 2026
    # same history key. Store TTL GC handles cleanup.
    history_ref = await store.get_ref.remote(history_routed_experts_key)
    history_routed_experts = await history_ref
elif isinstance(history_routed_experts_key, ObjectRef):
Collaborator
Then this branch can probably be removed, right? history_routed_experts_key can no longer be an ObjectRef.
YanhuiDua
reviewed
May 8, 2026
elif isinstance(history_routed_experts_key, ObjectRef):
    history_routed_experts = await history_routed_experts_key
    ray.internal.free([history_routed_experts_key], local_only=False)
elif isinstance(history_routed_experts_key, (bytes, bytearray)):
YanhuiDua
reviewed
May 8, 2026
    history_routed_experts = await history_ref
elif isinstance(history_routed_experts_key, ObjectRef):
    history_routed_experts = await history_routed_experts_key
    ray.internal.free([history_routed_experts_key], local_only=False)
Collaborator
Why does this still use the ray.internal.free interface rather than the release interface of routed_expert_store?
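For illustration, a small sketch of why routing every free through one release method can be preferable to calling a low-level free at each call site: the store keeps the bookkeeping, and a double release is harmless. The class and its `put`/`acquire`/`release` methods are hypothetical and only assumed to resemble the real routed_expert_store; the actual interface may differ.

```python
class RoutedExpertStore:
    """Hypothetical reference-counted store; names are illustrative only."""

    def __init__(self):
        self._entries = {}  # key -> [value, refcount]

    def put(self, key, value):
        # The producer holds the first reference.
        self._entries[key] = [value, 1]

    def acquire(self, key):
        entry = self._entries[key]
        entry[1] += 1
        return entry[0]

    def release(self, key):
        """Drop one reference; return True if the entry was evicted."""
        entry = self._entries.get(key)
        if entry is None:
            return False  # already gone, so a repeated release is a no-op
        entry[1] -= 1
        if entry[1] <= 0:
            del self._entries[key]
            return True
        return False

    def __contains__(self, key):
        return key in self._entries
```

Callers then only ever pair `acquire` with `release`, and the store decides when the underlying object is actually dropped.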
YanhuiDua
reviewed
May 8, 2026
# tokenize.py base64-decoded to bytes.
history_ref = cloudpickle.loads(history_routed_experts_key)
history_routed_experts = await history_ref
ray.internal.free([history_ref], local_only=False)
YanhuiDua
reviewed
May 8, 2026
# rather than an ObjectRef. Legacy ObjectRef path kept as a
# defensive fallback and encoded the same way as before so older
# clients still decode correctly.
if isinstance(response.extra_info.get("routed_experts"), ray.ObjectRef):
Collaborator
Suggestion: in xtuner/v1/ray/environment/install_agent_env.py, when [...], and then, once training has finished, run one more check on routed_expert_store. I don't seem to see that code at the moment.
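As a rough idea of what such an end-of-training check could look like, here is a minimal sketch. The function name, its arguments, and the assumption that the store can list its remaining keys as strings are all hypothetical; the real routed_expert_store may expose this differently.

```python
def check_store_drained(remaining_keys, max_report=5):
    """Summarize what is still held in the store after training.

    `remaining_keys` is assumed to be an iterable of string keys obtained
    from the store; how to list them is left to the real implementation.
    """
    keys = sorted(remaining_keys)
    if not keys:
        return "routed_expert_store is empty"
    # Report only a few keys so the log line stays readable.
    sample = ", ".join(keys[:max_report])
    return f"{len(keys)} entries left in routed_expert_store: {sample}"
```

Logging this message (or failing loudly on a non-empty store) at shutdown would surface entries that neither the release path nor the TTL GC cleaned up.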
No description provided.