
[Docs] Add supported model tables to pretrain_sft advanced tutorial #1728

Open
CyCle1024 wants to merge 3 commits into
InternLM:mainfrom
CyCle1024:docs/add-supported-models-table

@CyCle1024
Collaborator

Summary

This PR adds comprehensive supported model documentation to the pretrain/sft advanced tutorial, replacing the previous "Coming soon..." placeholder.

Changes

  1. English model doc ()

    • Base Config Classes table (5 families)
    • Concrete Model Configs table (15 configs)
    • Compose Models section with base & concrete tables (10 configs)
    • Complete Inheritance Hierarchy tree covering both and branches
  2. Chinese model doc ()

    • Fully synchronized with the English version
  3. Skill for auto-sync ()

    • : workflow guide for AI to update docs when new model configs are added
    • : script to scan and discover all Config classes with inheritance info
    • Symlinked from for skill discovery
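A minimal sketch of how such a scan could work, assuming the script parses source text with Python's standard `ast` module and keeps classes whose names end in `Config`. The function name `scan_source` and the sample classes are hypothetical and are not taken from the PR's script.

```python
import ast
from typing import Dict, List


def scan_source(source: str) -> Dict[str, List[str]]:
    """Map each class whose name ends in 'Config' to its direct base names."""
    found: Dict[str, List[str]] = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef) and node.name.endswith("Config"):
            # ast.unparse (Python 3.9+) turns a base expression back into text.
            found[node.name] = [ast.unparse(base) for base in node.bases]
    return found


# Hypothetical sample input; the class bodies are placeholders.
SAMPLE = '''
class TransformerConfig: ...
class Qwen3DenseConfig(TransformerConfig): ...
class Qwen3Dense4BConfig(Qwen3DenseConfig): ...
class Helper: ...
'''

configs = scan_source(SAMPLE)
for name, bases in configs.items():
    print(name, "<-", bases or ["(no base)"])
```

Parsing instead of importing means the scan does not need the heavy training dependencies installed, at the cost of seeing only the base names as written in each file.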

Supported models covered

  • Dense: Qwen2 (7B), Qwen3 (0.6B / 4B / 8B)
  • MoE: DeepSeek-V3, GPT-OSS (21B / 117B), Qwen3-MoE (30B / 235B / FoPE)
  • VL text backbones: Qwen3 VL Dense (4B / 8B), Qwen3 VL MoE (30B / 235B), Qwen3.5 VL MoE (35B)
  • Compose models: Qwen3-VL, Qwen3.5-VL, InternVL, InternS1

@CyCle1024
Collaborator Author

@claude review

Comment thread docs/en/pretrain_sft/advanced_tutorial/model.md
Comment thread docs/en/pretrain_sft/advanced_tutorial/model.md
Comment thread .claude/skills/xtuner-sync-supported-models/scripts/scan_model_configs.py Outdated
@claude
Contributor

claude Bot commented Apr 28, 2026

Summary

Docs PR that replaces "Coming soon…" placeholders in the EN/ZH model advanced tutorial with comprehensive supported-model tables, inheritance hierarchy, and a Claude skill + scan script for keeping them in sync.

Issues

Critical

  • docs/en/pretrain_sft/advanced_tutorial/model.md & inheritance tree: Four VL text backbone configs have incorrect parent classes in both the Concrete Model table and the inheritance hierarchy tree. The same error is repeated in the Chinese model.md and in SKILL.md. Specifically:
    • Qwen3VLTextDense4BConfig inherits from Qwen3Dense4BConfig, not Qwen3DenseConfig
    • Qwen3VLTextDense8BConfig inherits from Qwen3Dense8BConfig, not Qwen3DenseConfig
    • Qwen3VLTextMoE30BA3Config inherits from Qwen3MoE30BA3Config, not Qwen3MoEConfig
    • Qwen3VLTextMoE235BA22Config inherits from Qwen3MoE235BA22Config, not Qwen3MoEConfig
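The distinction between a direct parent and an ancestor can be checked mechanically. The sketch below uses empty stand-in classes that only mirror the names in this review (the real classes live in the project and carry actual fields), and the helper `direct_parents` is hypothetical.

```python
from typing import List


# Stand-in classes mirroring the names in the review; bodies are placeholders.
class Qwen3DenseConfig: ...
class Qwen3Dense4BConfig(Qwen3DenseConfig): ...
class Qwen3VLTextDense4BConfig(Qwen3Dense4BConfig): ...


def direct_parents(cls: type) -> List[str]:
    """Return only the immediate bases, not the full ancestry."""
    return [base.__name__ for base in cls.__bases__]


print(direct_parents(Qwen3VLTextDense4BConfig))  # ['Qwen3Dense4BConfig']
# The family base is still an ancestor, just not the direct parent.
print(issubclass(Qwen3VLTextDense4BConfig, Qwen3DenseConfig))  # True
```

This is why listing `Qwen3DenseConfig` as the parent is imprecise rather than entirely unrelated: it is an ancestor, but `__bases__` reports the size-specific config.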

Warning

Nit

Verdict

REQUEST_CHANGES — the inheritance hierarchy errors affect 4 classes across 3 files (EN doc, ZH doc, SKILL.md) and would mislead users about the actual model config structure.

@CyCle1024 CyCle1024 requested review from HAOCHENYE and jayhenry April 28, 2026 14:26
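## Inheritance Hierarchy

The diagram below shows the complete inheritance hierarchy of all config classes supported by `TrainEngine`, covering the two main branches, `TransformerConfig` and `BaseComposeConfig`.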

Collaborator


LGTM

@CyCle1024 CyCle1024 force-pushed the docs/add-supported-models-table branch from cfb8ec1 to 3bf9def Compare May 8, 2026 08:00
…ted-models skill

- Add model support tables to en/zh pretrain_sft advanced tutorial model.md
- Include base configs, concrete configs, compose models and inheritance hierarchy
- Add xtuner-sync-supported-models skill under .claude/skills/ with scan script
@CyCle1024 CyCle1024 force-pushed the docs/add-supported-models-table branch from 3bf9def to e0c7546 Compare May 12, 2026 10:09
CyCle1024 added 2 commits May 12, 2026 20:53
- Fix missing comma after 'scipy' in both en/zh conf.py
- Deduplicate 'torchvision' entries in autodoc_mock_imports
- Add 'fla' to autodoc_mock_imports to fix sphinx autosummary TypeError
- Add type hints to scan_file and main in scan_model_configs.py
- Remove redundant RELEVANT_BASES set (covered by p.endswith('Config'))
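For context on the first two fixes: in Python, two adjacent string literals are joined into one, so a missing comma in a list silently merges two entries. The sketch below illustrates this with a made-up excerpt; which entry actually followed 'scipy' in conf.py is an assumption, and `dict.fromkeys` is just one way to deduplicate while keeping order.

```python
# Hypothetical excerpt: the entry after "scipy" is made up for illustration.
broken = [
    "scipy"  # missing comma: Python joins this literal with the next one
    "torchvision",
    "torchvision",
]

# With the comma restored, duplicates removed in order, and "fla" added.
fixed = list(dict.fromkeys(["scipy", "torchvision", "torchvision", "fla"]))

print(broken)  # ['scipytorchvision', 'torchvision']
print(fixed)   # ['scipy', 'torchvision', 'fla']
```

In the broken list neither `scipy` nor the merged neighbor is mocked under its real name, which is the kind of error that only surfaces when Sphinx tries to import the module.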
Claude review suggested that VL text backbone configs should be nested
under their actual direct parent configs in the inheritance hierarchy
tree, rather than as siblings under the family base class.

- Qwen3VLTextDense4BConfig -> under Qwen3Dense4BConfig
- Qwen3VLTextDense8BConfig -> under Qwen3Dense8BConfig
- Qwen3VLTextMoE30BA3Config -> under Qwen3MoE30BA3Config
- Qwen3VLTextMoE235BA22Config -> under Qwen3MoE235BA22Config

The concrete model table is intentionally kept unchanged as it lists
the base class / family for readability, which is a documentation
choice separate from the technical inheritance hierarchy.

Files updated: EN model.md, ZH model.md, SKILL.md
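
A rough sketch of how the nesting described in this commit message could be rendered from a child-to-parent map. The two VL entries come from the list above; the assumption that `Qwen3Dense4BConfig` and `Qwen3Dense8BConfig` sit directly under `Qwen3DenseConfig`, as well as the helper `render`, are illustrative and not taken from the PR.

```python
from typing import Dict, List

# Child -> direct parent. The size-specific -> family links are assumed.
PARENT: Dict[str, str] = {
    "Qwen3Dense4BConfig": "Qwen3DenseConfig",
    "Qwen3Dense8BConfig": "Qwen3DenseConfig",
    "Qwen3VLTextDense4BConfig": "Qwen3Dense4BConfig",
    "Qwen3VLTextDense8BConfig": "Qwen3Dense8BConfig",
}


def render(root: str, parent: Dict[str, str]) -> List[str]:
    """Return indented lines for the subtree rooted at `root`."""
    children: Dict[str, List[str]] = {}
    for child, par in parent.items():
        children.setdefault(par, []).append(child)

    lines: List[str] = []

    def walk(name: str, depth: int) -> None:
        lines.append("    " * depth + name)
        for child in sorted(children.get(name, [])):
            walk(child, depth + 1)

    walk(root, 0)
    return lines


tree = render("Qwen3DenseConfig", PARENT)
print("\n".join(tree))
```

Each VL text config then appears one level below its size-specific parent rather than as a sibling of it.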