Merged
1,069 changes: 1,069 additions & 0 deletions docs/colab_notebooks/7-nemotron-personas.ipynb

Large diffs are not rendered by default.

408 changes: 408 additions & 0 deletions docs/devnotes/posts/nemotron-personas.md

Large diffs are not rendered by default.

737 changes: 737 additions & 0 deletions docs/notebook_source/7-nemotron-personas.py

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/notebook_source/README.md
@@ -6,7 +6,7 @@ In this folder you can find all our tutorial notebooks in `.py` format. They can
 make convert-execute-notebooks
 ```

-from the root of the repository. This will not only convert but also execute all of the notebooks -- for that to work, make sure you went through our [Quick Start](https://nvidia-nemo.github.io/DataDesigner/quick-start/) and have API keys set. A new folder `docs/notebooks` will be created, including `README.md` and `pyproject.toml` files.
+from the root of the repository. This will not only convert but also execute all of the notebooks -- for that to work, make sure you went through our [Quick Start](https://nvidia-nemo.github.io/DataDesigner/latest/quick-start/) and have API keys set. A new folder `docs/notebooks` will be created, including `README.md` and `pyproject.toml` files.

 Alternatively, you can use Jupytext directly

42 changes: 39 additions & 3 deletions docs/scripts/generate_colab_notebooks.py
@@ -57,7 +57,30 @@ def mark_colab_injected(cell: NotebookNode) -> NotebookNode:
     return cell


-def create_colab_setup_cells(additional_dependencies: str) -> list[NotebookNode]:
+# Per-file try/except snippets appended into the standard NVIDIA_API_KEY cell
+# so additional API keys share the same imports rather than producing a
+# duplicate-imports cell. The snippets are joined with blank lines.
+NGC_API_KEY_BLOCK = """\
+try:
+    os.environ["NGC_API_KEY"] = userdata.get("NGC_API_KEY")
+except userdata.SecretNotFoundError:
+    os.environ["NGC_API_KEY"] = getpass.getpass("Enter your NGC API key: ")"""
+
+# Optional per-file Colab setup cells, injected immediately after the standard
+# install + NVIDIA_API_KEY cells. Currently unused; left in place so future
+# tutorials can register additional one-shot Colab bootstrap cells.
+ADDITIONAL_SETUP_CELLS: dict[str, list[str]] = {}
+
+ADDITIONAL_API_KEY_BLOCKS: dict[str, list[str]] = {
+    "7-nemotron-personas.py": [NGC_API_KEY_BLOCK],
+}
+
+
+def create_colab_setup_cells(
+    additional_dependencies: str,
+    additional_setup_cell_sources: list[str] | None = None,
+    additional_api_key_blocks: list[str] | None = None,
+) -> list[NotebookNode]:
     """Create the Colab-specific setup cells to inject before imports."""
     cells = []
     cells += [mark_colab_injected(new_markdown_cell(source=COLAB_SETUP_MARKDOWN))]
@@ -67,7 +90,14 @@ def create_colab_setup_cells(additional_dependencies: str) -> list[NotebookNode]
         install_cell += f" {additional_dependencies}"
     cells += [mark_colab_injected(new_code_cell(source=install_cell))]

-    cells += [mark_colab_injected(new_code_cell(source=COLAB_API_KEY_CELL))]
+    api_key_cell = COLAB_API_KEY_CELL
+    if additional_api_key_blocks:
+        api_key_cell = "\n\n".join([api_key_cell, *additional_api_key_blocks])
+    cells += [mark_colab_injected(new_code_cell(source=api_key_cell))]
+
+    if additional_setup_cell_sources:
+        cells += [mark_colab_injected(new_code_cell(source=src)) for src in additional_setup_cell_sources]
+
     return cells


@@ -97,6 +127,8 @@ def process_notebook(notebook: NotebookNode, source_path: Path) -> NotebookNode:
     cells = notebook.cells

     additional_dependencies = ADDITIONAL_DEPENDENCIES.get(source_path.name, "")
+    additional_setup_cells = ADDITIONAL_SETUP_CELLS.get(source_path.name)
+    additional_api_key_blocks = ADDITIONAL_API_KEY_BLOCKS.get(source_path.name)

     # Find where to insert Colab setup (before "Import the essentials")
     import_idx = find_import_section_index(cells)
@@ -106,7 +138,11 @@ def process_notebook(notebook: NotebookNode, source_path: Path) -> NotebookNode:
         import_idx = 1

     # Insert Colab setup cells before the import section
-    colab_cells = create_colab_setup_cells(additional_dependencies)
+    colab_cells = create_colab_setup_cells(
+        additional_dependencies,
+        additional_setup_cells,
+        additional_api_key_blocks,
+    )
     processed_cells = cells[:import_idx] + colab_cells + cells[import_idx:]

     badge_source = COLAB_BADGE_TEMPLATE.format(filename=f"{source_path.stem}.ipynb")
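
The central change in `create_colab_setup_cells` is that extra API-key snippets are appended to the existing key cell rather than emitted as cells of their own, so the imports at the top of that cell appear only once. The sketch below illustrates that composition in isolation. `BASE_CELL` is a hypothetical stand-in for `COLAB_API_KEY_CELL` (its real contents are not shown in this diff), and `build_api_key_cell` is an illustrative helper, not a function in the script.

```python
from __future__ import annotations

# Hypothetical stand-in for COLAB_API_KEY_CELL; the real constant is not
# visible in this diff, so the imports and prompt text here are assumptions.
BASE_CELL = """\
import getpass
import os

from google.colab import userdata

try:
    os.environ["NVIDIA_API_KEY"] = userdata.get("NVIDIA_API_KEY")
except userdata.SecretNotFoundError:
    os.environ["NVIDIA_API_KEY"] = getpass.getpass("Enter your NVIDIA API key: ")"""

# Same shape as NGC_API_KEY_BLOCK in the diff: no imports of its own.
NGC_BLOCK = """\
try:
    os.environ["NGC_API_KEY"] = userdata.get("NGC_API_KEY")
except userdata.SecretNotFoundError:
    os.environ["NGC_API_KEY"] = getpass.getpass("Enter your NGC API key: ")"""


def build_api_key_cell(base: str, extra_blocks: list[str] | None = None) -> str:
    """Append extra key blocks to the base cell, separated by blank lines."""
    if not extra_blocks:
        return base
    return "\n\n".join([base, *extra_blocks])


cell_source = build_api_key_cell(BASE_CELL, [NGC_BLOCK])

# Syntax check only: compile() parses the cell source without executing it,
# so this runs outside Colab even though google.colab is not installed.
compile(cell_source, "<colab-cell>", "exec")
```

Because the appended block relies on names imported by the base cell, it only works when joined into that cell; run as a separate cell before the imports, it would fail with a `NameError`.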
62 changes: 62 additions & 0 deletions fern/components/Figure.tsx
@@ -0,0 +1,62 @@
+/**
+ * SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-License-Identifier: Apache-2.0
+ */
+
+/**
+ * Figure - Centered image with an optional NVIDIA-green italic caption.
+ *
+ * NOTE: Fern's custom component pipeline uses the automatic JSX runtime.
+ *
+ * Usage in MDX:
+ *   import { Figure } from "@/components/Figure";
+ *
+ *   <Figure src="/assets/foo.png" alt="..." width={600}>
+ *     Caption text with **inline markdown** if helpful.
+ *   </Figure>
+ */
+
+/**
+ * Figure styles, injected by the component rather than loaded via docs.yml `css:`.
+ * `css` is theme-owned, so under `global-theme: nvidia` a local `css:` list is
+ * dropped at publish; styling has to ship with the component. See fern/docs.yml
+ * and the same pattern in Authors.tsx.
+ */
+const FIGURE_CSS = `
+.devnote-figure {
+  text-align: center;
+  margin: 1.25rem 0;
+}
+.devnote-figure__img {
+  max-width: 100%;
+  height: auto;
+}
+.devnote-figure__caption {
+  display: block;
+  margin-top: 0.5rem;
+  color: #76B900;
+  font-size: 0.85em;
+  font-style: italic;
+  line-height: 1.4;
+}
+`;
+
+export interface FigureProps {
+  /** Image source path (e.g. "/assets/<slug>/foo.png"). */
+  src: string;
+  /** Alt text for accessibility. */
+  alt: string;
+  /** Optional explicit width in pixels (or any CSS length). */
+  width?: number | string;
+  /** Caption content; rendered only if children are provided. */
+  children?: React.ReactNode;
+}
+
+export const Figure = ({ src, alt, width, children }: FigureProps) => (
+  <div className="devnote-figure">
+    {/* static CSS string literal (no user input), so safe to inject as raw HTML */}
+    <style dangerouslySetInnerHTML={{ __html: FIGURE_CSS }} />
+    <img className="devnote-figure__img" src={src} alt={alt} width={width} />
+    {children && <em className="devnote-figure__caption">{children}</em>}
+  </div>
+);
6 changes: 5 additions & 1 deletion fern/docs.yml
@@ -307,12 +307,16 @@ redirects:
     destination: "/nemo/datadesigner/plugins/example-plugin"
   # Dev Notes: mkdocs-material blog plugin URL shape.
   # Section title "Dev Notes" -> /dev-notes; intermediate posts/ directory dropped.
-  # Most posts kept their filename slug, but two were retitled during migration
+  # Most posts kept their filename slug, but four were retitled during migration
   # so their Fern page-title slug differs from the legacy filename:
   - source: "/nemo/datadesigner/devnotes/posts/text-to-sql"
     destination: "/nemo/datadesigner/dev-notes/text-to-sql-for-nemotron-super"
   - source: "/nemo/datadesigner/devnotes/posts/rqa"
     destination: "/nemo/datadesigner/dev-notes/rqa-dataset"
+  - source: "/nemo/datadesigner/devnotes/posts/nemotron-personas"
+    destination: "/nemo/datadesigner/dev-notes/designing-nemotron-personas"
+  - source: "/nemo/datadesigner/devnotes/posts/retrieval-sdg-toolkit"
+    destination: "/nemo/datadesigner/dev-notes/retriever-sdg-toolkit"
   - source: "/nemo/datadesigner/devnotes/posts/:slug"
     destination: "/nemo/datadesigner/dev-notes/:slug"
   - source: "/nemo/datadesigner/devnotes/:slug"
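
The retitled-post redirects are listed above the `:slug` catch-all, which matters if rules are evaluated top to bottom and the first match wins. That evaluation order is an assumption here; this diff does not confirm how Fern resolves overlapping redirects. Under that assumption, a small resolver shows why the order matters. The `compile_source` and `resolve` helpers are hypothetical and exist only for illustration.

```python
from __future__ import annotations

import re

# Two of the rules from the hunk above, in the same order.
REDIRECTS = [
    (
        "/nemo/datadesigner/devnotes/posts/nemotron-personas",
        "/nemo/datadesigner/dev-notes/designing-nemotron-personas",
    ),
    (
        "/nemo/datadesigner/devnotes/posts/:slug",
        "/nemo/datadesigner/dev-notes/:slug",
    ),
]


def compile_source(source: str) -> re.Pattern[str]:
    """Turn a source path into a regex; ':name' segments match one path segment."""
    parts = []
    for segment in source.split("/"):
        if segment.startswith(":"):
            parts.append(f"(?P<{segment[1:]}>[^/]+)")
        else:
            parts.append(re.escape(segment))
    return re.compile("/".join(parts))


def resolve(path: str, rules: list[tuple[str, str]]) -> str | None:
    """Return the destination of the first matching rule (assumed semantics)."""
    for source, destination in rules:
        match = compile_source(source).fullmatch(path)
        if match is None:
            continue
        # Substitute captured parameters such as :slug into the destination.
        for name, value in match.groupdict().items():
            destination = destination.replace(f":{name}", value)
        return destination
    return None


retitled = resolve("/nemo/datadesigner/devnotes/posts/nemotron-personas", REDIRECTS)
unchanged = resolve("/nemo/datadesigner/devnotes/posts/prompt-sensitivity", REDIRECTS)
shadowed = resolve(
    "/nemo/datadesigner/devnotes/posts/nemotron-personas", list(reversed(REDIRECTS))
)
```

With the catch-all listed first, the retitled post would resolve to its legacy slug rather than to the new page title slug, which is the case the explicit entries guard against.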
2 changes: 2 additions & 0 deletions fern/versions/latest.yml
@@ -145,6 +145,8 @@ navigation:
     contents:
       - page: Overview
         path: ./latest/pages/devnotes/index.mdx
+      - page: Designing Nemotron-Personas
+        path: ./latest/pages/devnotes/posts/nemotron-personas.mdx
       - page: Prompt Sensitivity
         path: ./latest/pages/devnotes/posts/prompt-sensitivity.mdx
       - page: Retriever SDG Toolkit
8 changes: 8 additions & 0 deletions fern/versions/latest/pages/devnotes/index.mdx
@@ -9,6 +9,14 @@ import { BlogCard, BlogGrid } from "@/components/BlogCard";
 Welcome to NeMo Data Designer Dev Notes: in-depth guides, benchmark write-ups, and insights from the team building NeMo Data Designer.

 <BlogGrid>
+  <BlogCard
+    href="/dev-notes/designing-nemotron-personas"
+    title="Designing Nemotron-Personas"
+    description="The compound-AI pipeline behind Nemotron-Personas seeding Nemotron training. Now open-sourced: fork it, add your data, ship personas for any use case."
+    date="Jun 1, 2026"
+    authors={["ymeyer", "dcorneil"]}
+    image={<img src="/assets/nemotron-personas/nemotron-personas-world-map.png" alt="" loading="lazy" style={{ objectPosition: "top" }} />}
+  />
   <BlogCard
     href="/dev-notes/prompt-sensitivity"
     title="Mitigating Prompt Sensitivity"
@@ -1,12 +1,10 @@
 ---
-title: "Have It Your Way"
+title: "Have It Your Way: Customizing Data Designer with Plugins"
 description: "A plugin framework for the custom pieces every real project ends up needing."
 ---

 import { Authors } from "@/components/Authors";

-# Have It Your Way: Customizing Data Designer with Plugins
-
 <Authors ids={["jgreco", "etramel"]} />

 *A plugin framework for the custom pieces every real project ends up needing*