Skip to content

Fixing pdf flavours#27

Draft
ElieHammou wants to merge 12 commits into
mainfrom
fixing_PDF_flavours
Draft

Fixing pdf flavours#27
ElieHammou wants to merge 12 commits into
mainfrom
fixing_PDF_flavours

Conversation

@ElieHammou

@ElieHammou ElieHammou commented Feb 11, 2026

Copy link
Copy Markdown
Collaborator

This PR aims to allow fixing some PDF flavours on values loaded from an external set. The goal is to only perform the POD on the non-fixed flavours. We start by fixing everything except the gluon in the n3fit_pdf_grid function which builds the function space on which the POD is performed.

To do at first:

  • add loading of external PDF set (hardcoded)
  • build pdf_array from the loaded values
  • overwrite replicas of non-fixed flavour with NN generated functions
  • Substract gluon reference PDF
  • impose MSR on non-fixed flavours before the POD

To do afterwards:

  • Develop quantitative criterion to cut basis elements when directions become irrelevant.

To do with no urgency:

  • Allow external PDF choosing from runcard
  • Allow fixed flavour choice from runcard

Copilot AI review requested due to automatic review settings February 11, 2026 09:40

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces “flavour fixing” support when building PDF grids: non-gluon flavours are intended to be taken from an external (LHAPDF) baseline while keeping only the gluon replica-dependent, so POD is effectively applied only to the non-fixed flavours.

Changes:

  • Updates n3fit_pdf_model to construct replicas via ReplicaSettings and pass them into _pdfNN_layer_generator.
  • Extends n3fit_pdf_grid with an option to overwrite all non-gluon flavours from an LHAPDF baseline (keeping gluon from the NN per replica).
  • Adjusts write_exportgrid calls to pass Q0 instead of Q.
Comments suppressed due to low confidence (1)

wmin/basis.py:87

  • n3fit_pdf_grid adds the new overwrite_non_gluon_from_lhapdf parameter, but the docstring's Parameters section does not document it (and still references an xgrid argument that isn't in the signature). Please update the docstring so callers understand how the overwrite behaves and what LHAPDF set/member/Q it uses (or how to configure them).
def n3fit_pdf_grid(
    n3fit_pdf_model,
    filter_arclength_outliers: bool = True,
    overwrite_non_gluon_from_lhapdf: bool = True,
):
    """
    Returns the PDF grid for the n3fit model evaluated on the LHAPDF_XGRID.
    Also filters out the arclength outliers which can occurr when normalising the random
    PDF replicas for the sum rules.

    Parameters
    ----------
    n3fit_pdf_model: n3fit.model_gen._pdfNN_layer_generator
        The n3fit model to use.
    xgrid: array, default is LHAPDF_XGRID
        The xgrid to use.
    filter_arclength_outliers: bool, default is True
        Whether to filter out the arclength outliers from the PDF grid.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread wmin/basis.py Outdated
Comment on lines +189 to +199
flav_key = evol_to_colibri.get(flav_name, flav_name)
if flav_key not in FLAVOUR_TO_ID_MAPPING:
raise KeyError(
f"Flavour '{flav_name}' (mapped to '{flav_key}') not in colibri.constants.FLAVOUR_TO_ID_MAPPING"
)

lha_id = FLAVOUR_TO_ID_MAPPING[flav_key]
baseline[flav_index, :] = np.asarray(
[lha_pdf.xfxQ(lha_id, float(x), Q) for x in xgrid],
dtype=pdf_array_nn.dtype,
)

Copilot AI Feb 11, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FLAVOUR_TO_ID_MAPPING is used here as the argument to lha_pdf.xfxQ(...), but within this repo it is used as an index mapping into the internal PDF array (e.g. wmin/utils.py:292-296 indexes pdf_array[..., FLAVOUR_TO_ID_MAPPING["V"]]). Those IDs are unlikely to be valid LHAPDF parton IDs/PDG codes, especially for EVOL combinations like V, T3, Σ. This will produce incorrect values (or query the wrong partons). Use a mapping of LHAPDF parton IDs appropriate for the LHAPDF set, or reconstruct EVOL combinations from the standard quark/gluon PDFs queried via PDG IDs, rather than reusing FLAVOUR_TO_ID_MAPPING.

Copilot uses AI. Check for mistakes.
Comment thread wmin/basis.py
Comment on lines +112 to +121
if overwrite_non_gluon_from_lhapdf:
try:
import lhapdf # type: ignore
except Exception as exc:
log.warning(
"LHAPDF not available (%s); using NN-generated grid for all flavours.",
exc,
)
pdf_array = pdf_array_nn
else:

Copilot AI Feb 11, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New behavior branches on overwrite_non_gluon_from_lhapdf and has multiple error/fallback paths (LHAPDF missing, set load failure, flavour mapping mismatch). There are existing tests for wmin.basis, but none cover n3fit_pdf_grid or the new LHAPDF overwrite logic. Add unit tests that patch lhapdf.mkPDF / xfxQ and assert that only the gluon channel varies across replicas when overwrite is enabled, and that it falls back to NN-only grids on failures.

Copilot uses AI. Check for mistakes.
Comment thread wmin/basis.py
Comment on lines 56 to 62
pdf_model = _pdfNN_layer_generator(
nodes=nodes,
activations=activations,
initializer_name=initializer_name,
layer_type=layer_type,
seed=range(
replica_range_settings["min_replica"],
replica_range_settings["max_replica"] + 1,
),
impose_sumrule=True, # sum-rules are also imposed later-on in a more accurate way.
replicas_settings=replicas_settings,
flav_info=flav_info,
fitbasis=fitbasis,
num_replicas=replica_range_settings["max_replica"]
- replica_range_settings["min_replica"]
+ 1,
# leave impose_sumrule=None here so it defaults to "All"
# and still produces xgrid_integration (x_in) as before
)

Copilot AI Feb 11, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

n3fit_pdf_model no longer calls _pdfNN_layer_generator with the kwargs expected by the current unit test suite (e.g. nodes, activations, impose_sumrule, num_replicas). As written, wmin/tests/test_wmin_basis.py::test_n3fit_pdf_model will fail because it asserts those kwargs are passed. Either update the tests to match the new replicas_settings API, or preserve backwards compatibility by passing the legacy parameters when supported.

Copilot uses AI. Check for mistakes.
Comment thread wmin/basis.py
Comment on lines +60 to +61
# leave impose_sumrule=None here so it defaults to "All"
# and still produces xgrid_integration (x_in) as before

Copilot AI Feb 11, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring states the n3fit model is "always generated with the sum rules already imposed", but this implementation intentionally leaves impose_sumrule unset (relying on defaults). Please either (a) pass an explicit impose_sumrule value to ensure the documented behavior, or (b) update the docstring/comments to reflect the new behavior and confirm x_in is still produced for n3fit_pdf_grid.

Suggested change
# leave impose_sumrule=None here so it defaults to "All"
# and still produces xgrid_integration (x_in) as before
impose_sumrule="All",

Copilot uses AI. Check for mistakes.
Comment thread wmin/basis.py Outdated
Comment on lines +122 to +125
lhapdf_set = "NNPDF31_nnlo_as_0118" # arbitrary PDF set
lhapdf_member = 0 # selecting central member
Q = 1.65

Copilot AI Feb 11, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The LHAPDF baseline is hard-coded to set NNPDF31_nnlo_as_0118, member 0, and Q=1.65. This makes the "external set" effectively non-configurable and can silently produce mismatched grids when the caller expects a different set/member/scale. Consider accepting these as function parameters (or reading them from existing config/runcard) and including them in the function docstring.

Suggested change
lhapdf_set = "NNPDF31_nnlo_as_0118" # arbitrary PDF set
lhapdf_member = 0 # selecting central member
Q = 1.65
# Allow overriding the default LHAPDF configuration via environment
# variables while preserving the original hard-coded values as
# defaults for backwards compatibility.
lhapdf_set = os.environ.get("WMN_LHAPDF_SET", "NNPDF31_nnlo_as_0118")
lhapdf_member_env = os.environ.get("WMN_LHAPDF_MEMBER")
if lhapdf_member_env is not None:
try:
lhapdf_member = int(lhapdf_member_env)
except ValueError:
log.warning(
"Invalid WMN_LHAPDF_MEMBER=%r; falling back to member 0.",
lhapdf_member_env,
)
lhapdf_member = 0
else:
lhapdf_member = 0 # selecting central member by default
Q_env = os.environ.get("WMN_LHAPDF_Q")
if Q_env is not None:
try:
Q = float(Q_env)
except ValueError:
log.warning(
"Invalid WMN_LHAPDF_Q=%r; falling back to Q=1.65.",
Q_env,
)
Q = 1.65
else:
Q = 1.65

Copilot uses AI. Check for mistakes.
@ElieHammou

Copy link
Copy Markdown
Collaborator Author

At the moment, the feature does exactly the opposite of what I was expecting:
in the written basis, the non-gluon flavours have varying values between replicas while all the gluon ones are identical...

@LucaMantani LucaMantani changed the base branch from main to fix-n3fit-NNgeneration-compat February 11, 2026 21:00
Base automatically changed from fix-n3fit-NNgeneration-compat to main February 18, 2026 16:42
@comane

comane commented Feb 24, 2026

Copy link
Copy Markdown
Member

Hi @ElieHammou and @LucaMantani, I haven't really looked at what the code does, however
in general I would say that it might be good to evaluate whether some types of modifications which are specific to a
single project are better to be done in a separate fork.

@ElieHammou ElieHammou marked this pull request as draft May 1, 2026 14:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants