Skip to content

Better handle the pre-time path population distribution#1073

Draft
jdebacker wants to merge 19 commits into
PSLmodels:masterfrom
jdebacker:prepop
Draft

Better handle the pre-time path population distribution#1073
jdebacker wants to merge 19 commits into
PSLmodels:masterfrom
jdebacker:prepop

Conversation

@jdebacker

Copy link
Copy Markdown
Member

The population distribution before the start year of the model is necessary to compute savings and bequests in period 0.

This PR adds explicitly the parameters for mortality, immigration, and population growth rates from period 0 to period 1 in order to make for a consistent transition from this "pre model period" to period 0.

cc @rickecon

@codecov-commenter

codecov-commenter commented Dec 16, 2025

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 76.47059% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.19%. Comparing base (df883b4) to head (59ae3e5).
⚠️ Report is 72 commits behind head on master.

Files with missing lines Patch % Lines
ogcore/parameters.py 0.00% 3 Missing ⚠️
ogcore/demographics.py 90.00% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1073      +/-   ##
==========================================
+ Coverage   73.09%   73.19%   +0.09%     
==========================================
  Files          21       21              
  Lines        5208     5186      -22     
==========================================
- Hits         3807     3796      -11     
+ Misses       1401     1390      -11     
Flag Coverage Δ
unittests 73.19% <76.47%> (+0.09%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
ogcore/aggregates.py 100.00% <100.00%> (ø)
ogcore/demographics.py 54.42% <90.00%> (+0.78%) ⬆️
ogcore/parameters.py 81.59% <0.00%> (-1.53%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jdebacker

Copy link
Copy Markdown
Member Author

All local tests pass:

tests/test_SS.py ...........................                             [100%]
tests/test_TPI.py ...........................                            [100%]
tests/test_aggregates.py .........................................       [100%]
tests/test_basic.py ....                                                 [100%]
tests/test_demographics.py ................                              [100%]
tests/test_elliptical_u_est.py .......                                   [100%]
tests/test_execute.py .                                                  [100%]
tests/test_firm.py ..................................................... [ 76%]
................                                                         [100%]
tests/test_fiscal.py ......................                              [100%]
tests/test_household.py ................................................ [ 96%]
..                                                                       [100%]
tests/test_output_plots.py ............................................. [ 95%]
..                                                                       [100%]
tests/test_output_tables.py ..............                               [100%]
tests/test_parameter_plots.py ........................................   [100%]
tests/test_parameter_tables.py .......                                   [100%]
tests/test_parameters.py ..............                                  [100%]
tests/test_pensions.py .................................                 [100%]
tests/test_run_example.py .                                          [100%]
tests/test_run_ogcore.py .                                               [100%]
tests/test_tax.py ...................................                    [100%]
tests/test_txfunc.py ..............................                      [100%]
tests/test_user_inputs.py .........                                      [100%]
tests/test_utils.py .................................................... [ 62%]
...............................                                          [100%]

@jdebacker

Copy link
Copy Markdown
Member Author

When running the OG-USA example (and updating the demographic parameters), I find the following resource constraint issues:

OG-Core built from the master branch:

array([[ 6.69628260e-04],
       [ 1.77852171e-07],
       [ 1.92564580e-07],
       [ 1.95435097e-07],
       [ 1.93660425e-07],
       [ 1.89483083e-07],
...

Here we see small resource constraint errors, except in the first period. The changes in the prepop branch are intended to fix that by using the actual population parameters (immigration rates, mortality rates) from the period before the time path begins.

OG-Core built from the prepop branch:

array([[ 8.58124249e-06],
       [-9.61640895e-04],
       [ 8.94519232e-04],
       [ 1.81341695e-03],
       [ 2.19727225e-03],
       [ 1.99185083e-03],
       [ 1.50901924e-03],
...

Here we see the initial period resource constraint error become smaller, which is what we wanted to correct here. However, the errors after that are larger and remain in the 1e-3 to 1e-4 range for more than 80 periods. Only until the "fix" period (when immigration rates are forced to be constant) do we get down to errors in the 1e-6 range.

I'm unsure what is going on here. All tests pass on this branch, but those all use the assumption that the pre-TP mortality, immigration, population distribution, and population growth rate as the same as in the first period of the time path. When running OG-USA we are using the empirical values which have a difference. Still, it's odd that the errors come from periods after the first period and persist so long.

cc @rickecon

@jdebacker

jdebacker commented Apr 1, 2026

Copy link
Copy Markdown
Member Author

@rickecon Can you review these changes and see if you can pinpoint where the new RC errors are coming from? I've gone over these changes manually several time and have had Codex and Claude Code agents trying to diagnose, but not luck.

Here's what my run of OG-USA with updated demographics looks like (snippet of how set the baseline, which is the only thing of interest here):

"""
    ---------------------------------------------------------------------------
    Run baseline policy
    ---------------------------------------------------------------------------
    """
    # Set up baseline parameterization
    p = Specifications(
        baseline=True,
        num_workers=num_workers,
        baseline_dir=base_dir,
        output_base=base_dir,
    )
    # Update parameters for baseline from default json file
    with importlib.resources.open_text(
        "ogusa", "ogusa_default_parameters.json"
    ) as file:
        defaults = json.load(file)
    p.update_specifications(defaults)
    p.tax_func_type = "HSV"
    p.age_specific = True
    c = Calibration(p, estimate_tax_functions=False, estimate_pop=True, client=client)
    d = c.get_dict()
    # # additional parameters to change
    updated_params = {
        "omega_S_preTP": d["omega_S_preTP"],
        "omega_SS": d["omega_SS"],
        "omega": d["omega"],
        "g_n_ss": d["g_n_ss"],
        "rho": d["rho"],
        "g_n": d["g_n"],
        "imm_rates": d["imm_rates"],
        "g_n_preTP": d["g_n_preTP"],
        "imm_rates_preTP": d["imm_rates_preTP"],
        "rho_preTP": d["rho_preTP"],
    }
    p.update_specifications(updated_params)
    # Run model
    start_time = time.time()
    runner(p, time_path=True, client=client)
    print("run time = ", time.time() - start_time)

@SeaCelo

SeaCelo commented Apr 28, 2026

Copy link
Copy Markdown
Contributor

@jdebacker @rickecon: I dug into the period-1+ drift you flagged. The boundary fix idea is right — master does have a real period-0 inconsistency — but the drift across the rest of the path is coming from how growth rates are indexed in this branch.

In master, g_n[t] is the growth from t-1 to t. In this branch it's the growth from t to t+1. The aggregate formulas in aggregates.py weren't updated for the new convention, so they read the wrong period's growth rate everywhere.

You can confirm this without running the model. The population accounting itself fails the consistency check across t=1 to fixper, in the exact shape you reported.

The reindex is a one-line revert. But this PR also bundles other changes that aren't strictly part of the boundary fix — the timeline extension, the slicing on returned arrays, the new fixper formula, the pre_pop_dist removal. Rather than untangle them, I'd rather propose a smaller PR with just the boundary fix. I'll open it shortly and link back.

@SeaCelo

SeaCelo commented Apr 28, 2026

Copy link
Copy Markdown
Contributor

Opened #1117 with the smaller version.

@SeaCelo

SeaCelo commented May 11, 2026

Copy link
Copy Markdown
Contributor

@jdebacker — here's what I think is going on and why you're seeing that drift.

This PR changes one line in get_pop_objs that stores the population growth rates. Master writes them starting at position 1 of the array; this PR writes them starting at position 0. That moves every growth rate one slot earlier in the array, and the boundary value — the growth rate from the year before the model starts to the first model year — that used to sit at the front of the array now lives in the new g_n_preTP field instead.

The drift is caused by aggregates.py not being fully updated to match. The steady-state code was updated to read from g_n_preTP (lines 137 and 192). The transition-path code wasn't. It still pulls the growth rate from the same positions in the array as master, but those positions now hold a different year's value, and the new g_n_preTP field is never read on the transition-path side. So every transition-path year picks up a growth rate that's one year off, and the boundary value never reaches the formula at all.

The transition-path updates would be:

  1. aggregates.py line 160 (get_B TPI). The growth-rate slice currently starts at position 1, which was the right place under master's storage. Under this PR's storage, the same slice now starts a year too late, so shift it back to position 0:

    • Replace np.hstack((p.g_n[1 : p.T], p.g_n_ss)) with np.hstack((p.g_n[: p.T - 1], p.g_n_ss)).
  2. aggregates.py lines 215 and 222 (get_BQ TPI). The growth-rate slice currently starts at position 0 to pick up the boundary value. Under this PR's storage, the boundary value isn't in the array anymore — it's in g_n_preTP. So prepend g_n_preTP and shift the rest of the slice back by one:

    • Replace 1.0 + p.g_n[: p.T] with 1.0 + np.append(p.g_n_preTP, p.g_n[: p.T - 1]).

get_I reads the growth rate the same way at lines 98, 100, and 105 — those would need the same shift to match the new storage layout.

(The test expected values in test_aggregates.py also encode master's indexing, so they'd need updating to use g_n_preTP at the boundary too.)

cc @rickecon

@SeaCelo

SeaCelo commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Following up on our offline discussion. Ran the boundary fix end-to-end in all 5 country repos — full TPI baselines, each with its own <og>_default_parameters.json + real UN demographics through update_specifications.

Three branches tested per country: upstream/master (current code), this PR rebased onto current upstream/master, and the fix (PR #1117 or this PR with the May 11 aggregates.py edits applied — same math).

Country Master t=0 (bug) PR #1073 t=0 / max drift t≥1 Fix t=0 (clean)
OG-USA -3.46e-03 4.5e-05 / 2.0e-03 +3.62e-08
OG-ETH -1.22e-03 2.7e-05 / 3.1e-03 +8.87e-07
OG-ZAF -3.39e-03 1.2e-04 / 6.6e-03 -5.58e-09
OG-PHL +2.05e-03 9.4e-05 / 1.9e-02 -1.49e-06
OG-IDN +7.04e-04 9.1e-06 / 2.6e-03 -2.54e-09

Master (red) has the t=0 boundary inconsistency and is clean at t≥1. PR #1073 as-is (orange) fixes t=0 but introduces sustained drift at t≥1 — trading one error for another, which is the issue you reported on OG-USA in December. With the aggregates.py edits from my May 11 comment (blue), t=0 is fixed and the drift is avoided.

How to apply the fix in this PR. Six line edits in aggregates.py:

  • Lines 98, 100, 160: np.hstack((p.g_n[1 : p.T], p.g_n_ss))np.hstack((p.g_n[: p.T - 1], p.g_n_ss))
  • Lines 215, 222: 1.0 + p.g_n[: p.T]1.0 + np.append(p.g_n_preTP, p.g_n[: p.T - 1])
  • Line 105: p.g_n[1 : p.T + 1]p.g_n[: p.T]

Plus the expected-value calculations in test_aggregates.py lines 183 and 239 need the same indexing update — they currently use master's pattern and would mismatch the corrected get_B/get_BQ output, breaking the existing tests.

These changes are all in OG-Core — no per-country PRs needed. Every downstream country repo runs Calibration → get_dict → update_specifications, so once get_pop_objs returns the new preTP fields and aggregates.py reads them correctly, the boundary fix takes effect automatically across all of them.

Test setup per country. Loaded each country's <og>_default_parameters.json, then called ogcore.demographics.get_pop_objs(country_id=<UN_M49>) directly to pull real year-(-1) demographics. Same outcome as Calibration(update_from_api=True) but skips the macro/income APIs that aren't relevant to the boundary test (the PHL World Bank/IMF path returned values that broke SS on master; isolating to just the demographic path sidestepped that). For OG-ETH I used country_id="231"ogeth/calibrate.py hardcodes "710" (ZAF M49), a separate bug worth flagging.

This PR (#1073) is the cleaner shape — pulling the boundary value out at the source rather than layering it through update_specifications like #1117 does.

I don't have push access to this branch, so #1117 is my fallback with the same math. I'll post the same results there.

cc @rickecon @jdebacker

SeaCelo added a commit to SeaCelo/OG-Core that referenced this pull request Jun 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants