docs(devnotes): add Nemotron-Personas dev note #611
Signed-off-by: Yev Meyer <ymeyer@nvidia.com>
MkDocs preview: https://219428f5.dd-docs-preview.pages.dev
Fern preview: https://nvidia-preview-pr-611.docs.buildwithfern.com/nemo/datadesigner
danecor
left a comment
Looks good! Some possible issues / suggestions attached.
…_dev_note # Conflicts: # docs/scripts/generate_colab_notebooks.py
Code Review: PR #611 —
Greptile Summary

This PR adds the Designing Nemotron-Personas dev note and an accompanying Tutorial 7 notebook, documenting and reproducing the four-stage compound-AI pipeline behind the Nemotron-Personas HF collection (OCEAN sampling → PGM demographics → persona attributes → persona descriptions).

| Filename | Overview |
|---|---|
| docs/scripts/generate_colab_notebooks.py | Adds ADDITIONAL_SETUP_CELLS and ADDITIONAL_API_KEY_BLOCKS maps; NGC_API_KEY block appended to the existing COLAB_API_KEY_CELL which already imports os, getpass, and userdata — no import gap. |
| docs/notebook_source/7-nemotron-personas.py | New tutorial notebook reproducing the Nemotron-Personas pipeline; two pre-existing flagged issues: SAMPLE_FROM_SDG_PGM=True raises NotImplementedError while Next Steps prose suggests flipping it, and age conditionals (>= 6, >= 16) are always true given age_range=[18, 114]. |
| fern/components/Figure.tsx | New React component using dangerouslySetInnerHTML for a fully static CSS string literal — no user input involved, so safe. |
| fern/versions/latest/pages/devnotes/posts/nemotron-personas.mdx | New Fern devnote for Nemotron-Personas; uses the new Figure component, Authors component, and matches the redirect in docs.yml pointing to /dev-notes/designing-nemotron-personas. |
| fern/docs.yml | Adds two redirects: nemotron-personas → designing-nemotron-personas and retrieval-sdg-toolkit → retriever-sdg-toolkit, correctly handling legacy mkdocs URL shapes that differ from the Fern nav-entry slugs. |
| fern/versions/latest/pages/devnotes/posts/retriever-sdg-toolkit.mdx | Removes explicit slug frontmatter and moves title from body H1 to frontmatter for consistency with other devnotes; slug still auto-generates to retriever-sdg-toolkit from the nav entry, matching the new redirect. |
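
For readers unfamiliar with the registry pattern named in the first row, here is a minimal sketch. Only the name `ADDITIONAL_SETUP_CELLS` comes from the PR; the map's shape, the notebook key, the placeholder cell text, and the helper `build_cells` are assumptions for illustration, not the script's actual code.

```python
# Hypothetical shape: notebook stem -> extra setup cells (as source strings).
# The real map in generate_colab_notebooks.py may be structured differently.
ADDITIONAL_SETUP_CELLS: dict[str, list[str]] = {
    "7-nemotron-personas": ["# placeholder: NGC CLI install cell"],
}


def build_cells(notebook_stem: str, base_cells: list[str]) -> list[str]:
    """Prepend any registered setup cells to a notebook's own cells.

    Placing the extras first is an assumption; notebooks without an
    entry in the map are returned unchanged.
    """
    extra = ADDITIONAL_SETUP_CELLS.get(notebook_stem, [])
    return list(extra) + list(base_cells)
```

Under this shape, a future tutorial would opt in by adding one key to the map, which is the "one-line entry" behaviour the PR description mentions.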
Reviews (9): Last reviewed commit: "Merge remote-tracking branch 'origin/mai..."
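
The age-conditional flag in the summary table can be checked mechanically. A minimal sketch, assuming `age_range` is an inclusive `[min, max]` pair and the conditionals compare with `>=`, as the summary quotes; the helper name is hypothetical and not part of the notebook.

```python
AGE_RANGE = (18, 114)  # inclusive bounds, as quoted in the review
THRESHOLDS = (6, 16)   # the `>= 6` and `>= 16` conditionals


def is_always_true(threshold: int, age_range: tuple[int, int]) -> bool:
    """An `age >= threshold` test is vacuous if even the minimum age passes."""
    minimum_age, _maximum_age = age_range
    return minimum_age >= threshold


vacuous = {t: is_always_true(t, AGE_RANGE) for t in THRESHOLDS}
print(vacuous)  # {6: True, 16: True}
```

Both thresholds sit below the minimum sampled age of 18, so neither branch can ever be skipped with this range.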
…navs Signed-off-by: Yev Meyer <ymeyer@nvidia.com>
Hey Yev, leaving a few flags from Codex review here so they are visible before the human review comes through. A human review is still coming.
Narratively, the post reads well: the flow from why personas matter, to how they are used, to how Data Designer builds and customizes them is strong. These are mostly accuracy / maintenance flags rather than a request for a structural rewrite.

Re: review from Codex:
I updated the language in the note to make this a bit clearer. Rebased to bring in the prompt sensitivity and updated mkdocs/fern. Should be good to go.
johnnygreco
left a comment
this is an awesome post @3mei!!! thanks!
Note that I think the blog card is missing. Up to you if you want to add it now or in a follow-up.
…and Fern routing Signed-off-by: Yev Meyer <ymeyer@nvidia.com>
…toolkit and have-it-your-way
📋 Summary
Adds the Designing Nemotron-Personas dev note covering how the multi-locale Nemotron-Personas HF collection is built (4-stage compound-AI pipeline) and how it's used as a seeding primitive across Nemotron training (long-context, tool-use, formal logic, safety refusals, instruction-following). Ships alongside a runnable Tutorial 7 demonstrating reproduction + customization, plus a Colab variant.
🔗 Related Issue
N/A
🔄 Changes
✨ Added
- `docs/devnotes/posts/nemotron-personas.md`: new dev note
- `docs/devnotes/posts/assets/nemotron-personas/`: four images (three pipeline-stage diagrams from the partner repo plus a black-background `Nemotron-Personas` world-map hero)
- `docs/notebook_source/7-nemotron-personas.py`: jupytext source for the Reproducing & Customizing Nemotron-Personas tutorial; `docs/colab_notebooks/7-nemotron-personas.ipynb`: committed Colab variant; i

🔧 Changed
- `docs/scripts/generate_colab_notebooks.py`: adds an `ADDITIONAL_SETUP_CELLS` map paralleling `ADDITIONAL_DEPENDENCIES`; injects NGC CLI install + `NGC_API_KEY` cells. Future devnote-paired tutorials needing extra Colab bootstrap can register one-line entries in the same map.
- `mkdocs.yml`: adds Reproducing & Customizing Nemotron-Personas under the Tutorials nav

🧪 Testing
- `make test` passes
- `jupytext --to ipynb --execute`
- `make generate-colab-notebooks` regenerates the Colab `.ipynb` cleanly with the NGC setup cells in the expected position
- `make convert-execute-notebooks` and gated on `NVIDIA_API_KEY` + on-disk NGC dataset, matching how Tutorials 5/6 are gated on `OPENROUTER_API_KEY`)

✅ Checklist