Fixing pdf flavours#27
Conversation
There was a problem hiding this comment.
Pull request overview
This PR introduces “flavour fixing” support when building PDF grids: non-gluon flavours are intended to be taken from an external (LHAPDF) baseline while keeping only the gluon replica-dependent, so POD is effectively applied only to the non-fixed flavours.
Changes:
- Updates
n3fit_pdf_modelto construct replicas viaReplicaSettingsand pass them into_pdfNN_layer_generator. - Extends
n3fit_pdf_gridwith an option to overwrite all non-gluon flavours from an LHAPDF baseline (keeping gluon from the NN per replica). - Adjusts
write_exportgridcalls to passQ0instead ofQ.
Comments suppressed due to low confidence (1)
wmin/basis.py:87
n3fit_pdf_gridadds the newoverwrite_non_gluon_from_lhapdfparameter, but the docstring's Parameters section does not document it (and still references anxgridargument that isn't in the signature). Please update the docstring so callers understand how the overwrite behaves and what LHAPDF set/member/Q it uses (or how to configure them).
def n3fit_pdf_grid(
n3fit_pdf_model,
filter_arclength_outliers: bool = True,
overwrite_non_gluon_from_lhapdf: bool = True,
):
"""
Returns the PDF grid for the n3fit model evaluated on the LHAPDF_XGRID.
Also filters out the arclength outliers which can occurr when normalising the random
PDF replicas for the sum rules.
Parameters
----------
n3fit_pdf_model: n3fit.model_gen._pdfNN_layer_generator
The n3fit model to use.
xgrid: array, default is LHAPDF_XGRID
The xgrid to use.
filter_arclength_outliers: bool, default is True
Whether to filter out the arclength outliers from the PDF grid.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| flav_key = evol_to_colibri.get(flav_name, flav_name) | ||
| if flav_key not in FLAVOUR_TO_ID_MAPPING: | ||
| raise KeyError( | ||
| f"Flavour '{flav_name}' (mapped to '{flav_key}') not in colibri.constants.FLAVOUR_TO_ID_MAPPING" | ||
| ) | ||
|
|
||
| lha_id = FLAVOUR_TO_ID_MAPPING[flav_key] | ||
| baseline[flav_index, :] = np.asarray( | ||
| [lha_pdf.xfxQ(lha_id, float(x), Q) for x in xgrid], | ||
| dtype=pdf_array_nn.dtype, | ||
| ) |
There was a problem hiding this comment.
FLAVOUR_TO_ID_MAPPING is used here as the argument to lha_pdf.xfxQ(...), but within this repo it is used as an index mapping into the internal PDF array (e.g. wmin/utils.py:292-296 indexes pdf_array[..., FLAVOUR_TO_ID_MAPPING["V"]]). Those IDs are unlikely to be valid LHAPDF parton IDs/PDG codes, especially for EVOL combinations like V, T3, Σ. This will produce incorrect values (or query the wrong partons). Use a mapping of LHAPDF parton IDs appropriate for the LHAPDF set, or reconstruct EVOL combinations from the standard quark/gluon PDFs queried via PDG IDs, rather than reusing FLAVOUR_TO_ID_MAPPING.
| if overwrite_non_gluon_from_lhapdf: | ||
| try: | ||
| import lhapdf # type: ignore | ||
| except Exception as exc: | ||
| log.warning( | ||
| "LHAPDF not available (%s); using NN-generated grid for all flavours.", | ||
| exc, | ||
| ) | ||
| pdf_array = pdf_array_nn | ||
| else: |
There was a problem hiding this comment.
New behavior branches on overwrite_non_gluon_from_lhapdf and has multiple error/fallback paths (LHAPDF missing, set load failure, flavour mapping mismatch). There are existing tests for wmin.basis, but none cover n3fit_pdf_grid or the new LHAPDF overwrite logic. Add unit tests that patch lhapdf.mkPDF / xfxQ and assert that only the gluon channel varies across replicas when overwrite is enabled, and that it falls back to NN-only grids on failures.
| pdf_model = _pdfNN_layer_generator( | ||
| nodes=nodes, | ||
| activations=activations, | ||
| initializer_name=initializer_name, | ||
| layer_type=layer_type, | ||
| seed=range( | ||
| replica_range_settings["min_replica"], | ||
| replica_range_settings["max_replica"] + 1, | ||
| ), | ||
| impose_sumrule=True, # sum-rules are also imposed later-on in a more accurate way. | ||
| replicas_settings=replicas_settings, | ||
| flav_info=flav_info, | ||
| fitbasis=fitbasis, | ||
| num_replicas=replica_range_settings["max_replica"] | ||
| - replica_range_settings["min_replica"] | ||
| + 1, | ||
| # leave impose_sumrule=None here so it defaults to "All" | ||
| # and still produces xgrid_integration (x_in) as before | ||
| ) |
There was a problem hiding this comment.
n3fit_pdf_model no longer calls _pdfNN_layer_generator with the kwargs expected by the current unit test suite (e.g. nodes, activations, impose_sumrule, num_replicas). As written, wmin/tests/test_wmin_basis.py::test_n3fit_pdf_model will fail because it asserts those kwargs are passed. Either update the tests to match the new replicas_settings API, or preserve backwards compatibility by passing the legacy parameters when supported.
| # leave impose_sumrule=None here so it defaults to "All" | ||
| # and still produces xgrid_integration (x_in) as before |
There was a problem hiding this comment.
The docstring states the n3fit model is "always generated with the sum rules already imposed", but this implementation intentionally leaves impose_sumrule unset (relying on defaults). Please either (a) pass an explicit impose_sumrule value to ensure the documented behavior, or (b) update the docstring/comments to reflect the new behavior and confirm x_in is still produced for n3fit_pdf_grid.
| # leave impose_sumrule=None here so it defaults to "All" | |
| # and still produces xgrid_integration (x_in) as before | |
| impose_sumrule="All", |
| lhapdf_set = "NNPDF31_nnlo_as_0118" # arbitrary PDF set | ||
| lhapdf_member = 0 # selecting central member | ||
| Q = 1.65 | ||
|
|
There was a problem hiding this comment.
The LHAPDF baseline is hard-coded to set NNPDF31_nnlo_as_0118, member 0, and Q=1.65. This makes the "external set" effectively non-configurable and can silently produce mismatched grids when the caller expects a different set/member/scale. Consider accepting these as function parameters (or reading them from existing config/runcard) and including them in the function docstring.
| lhapdf_set = "NNPDF31_nnlo_as_0118" # arbitrary PDF set | |
| lhapdf_member = 0 # selecting central member | |
| Q = 1.65 | |
| # Allow overriding the default LHAPDF configuration via environment | |
| # variables while preserving the original hard-coded values as | |
| # defaults for backwards compatibility. | |
| lhapdf_set = os.environ.get("WMN_LHAPDF_SET", "NNPDF31_nnlo_as_0118") | |
| lhapdf_member_env = os.environ.get("WMN_LHAPDF_MEMBER") | |
| if lhapdf_member_env is not None: | |
| try: | |
| lhapdf_member = int(lhapdf_member_env) | |
| except ValueError: | |
| log.warning( | |
| "Invalid WMN_LHAPDF_MEMBER=%r; falling back to member 0.", | |
| lhapdf_member_env, | |
| ) | |
| lhapdf_member = 0 | |
| else: | |
| lhapdf_member = 0 # selecting central member by default | |
| Q_env = os.environ.get("WMN_LHAPDF_Q") | |
| if Q_env is not None: | |
| try: | |
| Q = float(Q_env) | |
| except ValueError: | |
| log.warning( | |
| "Invalid WMN_LHAPDF_Q=%r; falling back to Q=1.65.", | |
| Q_env, | |
| ) | |
| Q = 1.65 | |
| else: | |
| Q = 1.65 |
|
At the moment, the feature does exactly the opposite of what I was expecting: |
…xing_PDF_flavours
|
Hi @ElieHammou and @LucaMantani, I haven't really looked at what the code does, however |
This PR aims to allow fixing some PDF flavours on values loaded from an external set. The goal is to only perform the POD on the non-fixed flavours. We start by fixing everything except the gluon in the
n3fit_pdf_gridfunction which builds the function space on which the POD is performed.To do at first:
pdf_arrayfrom the loaded valuesTo do afterwards:
To do with no urgency: