Prune by either int or interval for all retention policies#8775
Prune by either int or interval for all retention policies#8775Goddesen wants to merge 22 commits into
Conversation
9beb1b7 to
9b828f5
Compare
|
This is loosely related to #8715 as well. I like the idea very much (I didn't and can't review the code, but I'm happy to help with the docs if desired). Borg's current retention policy IMHO is rather hard to understand for beginners (but very safe, because it usually keeps more than what users expect) and an interval-based approach feels more intuitive to me. However, special care is necessary to not cause unexpected behaviour, especially with frequent backups (e.g. daily), combined with overlapping rules (e.g. including all of within, daily, weekly, monthly, and yearly), and some missing backups (and thus the need to sometimes keep the "next best" to later fulfil less frequent rules). As far as I remember there were some issues with this in the early days of Borg that at least caused multiple major docs overhauls (not sure whether the behaviour actually changed substantially, but the docs definitely did because many users understood things wrong). Maybe Thomas can give some insights about the challenges back then? |
82c9e4e to
da77550
Compare
ThomasWaldmann
left a comment
There was a problem hiding this comment.
Thanks for the PR!
Some minor stuff I found...
64c48ed to
38fbd43
Compare
|
It seems the GHA test runner did not like the previous |
I'd vote for removing
IMHO this must be consistent with #8776. Question is what we agree on. I honestly believe differently: If it is |
|
If you want to be able to backport this to 1.4-maint, you must not break compatibility in this PR, but deprecating some options that can be replaced by a better new option can be done here. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #8775 +/- ##
==========================================
+ Coverage 83.62% 83.68% +0.05%
==========================================
Files 90 93 +3
Lines 15539 15776 +237
Branches 2337 2364 +27
==========================================
+ Hits 12995 13202 +207
- Misses 1806 1832 +26
- Partials 738 742 +4 ☔ View full report in Codecov by Harness. |
How do you do separation of functionality like this when a breaking 2.0 change is to be backported with deprecations instead? The code might turn out very different, especially with changes introduced on master. |
I don't think #8776 is relevant, as it matches absolute timestamps based on a pattern (like how the I had a whole comment here written out defending the inclusive check based on comparing timestamps only using seconds-granularity. In that scenario this makes sense. Having taken a closer look I see that |
|
Comments fixed, inclusive timestamp change reverted with tests made slightly more robust. If that's it for implementation comments I'll extend the documentation with an involved example like the retention policy I have described in the PR description and write a test for it. I am still not sure how to go about doing breaking change on |
|
Easiest is to backport a non-breaking change and then do the breaking change in a separate PR afterwards (and not backport that). There are some code structure differences between master and 1.4-maint though, so even backporting the first PR might not be trivial. As 1.4.x is a stable release, we need to be very careful to not break anything there. That sometimes can mean "no backport" if the change is too big or too risky. |
|
Last explicit default removed.
The changiest change introduced here is allowing intervals to be of length Am I correct in assuming I just put entries for |
|
In regards to backporting this to I kinda feel like I accidentally caused some confusion, so: Do we even have breaking changes right now? The only "breaking" change is that |
Ah I forgot there is one more thing: But if you want, it's pretty easy to just not introduce that change now, and do it in the breaking PR instead. |
There's only a difference if someone actually managed to create multiple archives within the same second, right? I'd say that's similar to Anyway, I don't think we really have a problem here. If we decide on dropping Not for me to decide, but I suggest dealing with that stuff in the backport PR.
It does: It can also match all archives within e.g. a full month relative to "now" (and added just recently, also relative to arbitrary other timestamps). However, I agree that there's no practical difference due to the microseconds scale. I didn't consider that. Yet I'd still vote for consistency. I just skimmed through @c-herz's work in #8776 and noticed that matching exclusively causes some problems there as well (minor, yet there are some). IMHO the consistency question still somewhat matters, so how about we simply decide on always matching inclusively? I'm thus also tagging @c-herz, WDYT? |
|
After some consideration I think this new retention interval should also account for the oldest rule that was implemented for the retention count flags. I'll get back to this. |
|
before doing further changes, please rebase on current master branch to get the cython workaround. otherwise, builds will fail. |
|
Guess I'ld like to merge this for next beta, can we finish this until then? |
|
I'll endeavor to get something done soon then. Factorio has been a distraction 😄 |
|
btw, i dissected the |
|
ping? shall I help here a bit by rebasing this on current master branch and resolve the conflicts? |
a87a0d9 to
dc6f878
Compare
|
If all tests pass now I think I'm done on my end and ready for review. I'll double check early this week. |
|
The failing test does not seem to be related to my changes, so I say this is now good for new review. The final implementation is now the simple interval version discussed earlier. I've added a new full-scale example of a rolling backup scheme using interval retention. Fixed up relevant pruning docs/help a bit such that it all feels a bit more cohesive. For a while I had introduced a new dependency to ease testing with exact timestamps, but that is now replaced with a new flag The One thing I think left to decide on for this PR and not a follow-up: Is this the time to deprecate Also: Want me to update the PR description for prosperity? Various AI assistance was used throughout, but large scale text generation only used (and subsequently refined and rewritten) for tests and docs. |
ThomasWaldmann
left a comment
There was a problem hiding this comment.
Quite a lot of changes. Curious how users will cope with the new possibilities.
If you do another rebase onto master, the windows test should also work.
| candidate_archives = archives | ||
|
|
||
| if since is not None: | ||
| # Prefilter: Archives from _after_ the `prune_since` time are skipped entirely. |
There was a problem hiding this comment.
you mean they are KEPT, right?
There was a problem hiding this comment.
Yes. I'll improve the wording.
There was a problem hiding this comment.
I added a test for the prefiltered archives ignored for pruning checks.
| that time window. When ``--since`` is given together with an interval | ||
| retention, the interval is measured backwards from that timestamp | ||
| instead of from the current time. See ``Date and Time`` docs for exact | ||
| INTERVAL format. |
There was a problem hiding this comment.
Somehow I find it confusing using "since" to give a timestamp that marks the end of a period. Isn't that rather "until" usually?
There was a problem hiding this comment.
I think there's probably good points to be made for both of these. "since" matches my own intuition better, with it acting as a base timestamp off of which the retention intervals are applied. My intuition is that the intervals go backwards in time, so "since" is the common starting point and they end at arbitrary different points in the past. With the count-based retention rules also starting their count from "right now" and scanning archives in reverse chronological order I felt this was the right way to go.
Yes, guess that would be good. |
So just outright remove them here and mark them deprecated in a backport? |
|
Deprecation:
|
|
@Goddesen, great work ❤️ Just a quick note that I haven't forgotten about this PR after all the great and productive discussions we've had. Unfortunately, I've been a bit swamped with work lately, so I haven't had the chance to review it from an user's perspective yet. I'm aiming to get to it by the middle of next week. Hold me accountable if I don't 😉 Sorry for the delay! |
For the benefit of reviewers I'll hold off on rebasing until ready for merge. |
|
Review by Claude Opus 4.8: Took a detailed pass over the code and the discussion. Overall this is a really nice improvement to a long-standing Bug — uncaught
|
| if arg in prune_keys and (isinstance(val, timedelta) or val == -1) | ||
| ] | ||
| for (lo_arg, lo_val), (hi_arg, hi_val) in combinations(interval_args, 2): | ||
| if lo_val == -1 or lo_val >= hi_val: |
There was a problem hiding this comment.
Here's a minimal fix for the TypeError described above. -1/all should only ever be handled by the equality branch; the magnitude comparison must be skipped whenever either side is the -1 sentinel:
| if lo_val == -1 or lo_val >= hi_val: | |
| if lo_val == -1 or (hi_val != -1 and lo_val >= hi_val): |
This keeps the intended semantics:
lo == -1(finer rule unlimited) → still flagged, since it covers any coarser rule.- finite finer + unlimited coarser (
hi == -1) → allowed, as it should be (not redundant). - both finite → compared as before.
And a regression test (in src/borg/testsuite/archiver/prune_cmd_test.py) covering the previously-uncovered orientation — these combos should succeed, not error:
@pytest.mark.parametrize(
"keep_args",
[
("--keep-daily=7d", "--keep-yearly=all"),
("--keep-daily=7d", "--keep-yearly=-1"),
("--keep-hourly=14d", "--keep-daily=all"),
],
)
def test_prune_interval_with_unlimited_coarser(archivers, request, backup_files, keep_args):
# A finite finer interval combined with an unlimited ("all"/-1) coarser
# count is a valid, non-redundant combination and must not raise.
# Previously crashed with:
# TypeError: '>=' not supported between instances of 'datetime.timedelta' and 'int'
archiver = request.getfixturevalue(archivers)
cmd(archiver, "repo-create", RK_ENCRYPTION)
dt = datetime(2023, 12, 31, tzinfo=timezone.utc)
_create_archive_dt(archiver, backup_files, "test-1", dt)
output = cmd(archiver, "prune", "--list", "--dry-run", "--since", dt.isoformat(), *keep_args)
assert re.search(r"Keeping archive .*test-1", output)There was a problem hiding this comment.
Aha. This is an oversight after a simplification midway through. Fixing.
|
Int vs. timedelta comparison in granularity ordering check fixed.
Reworded.
With the rework this is only applicable in a very specific case, and only for new behavior: A fine-grained count-based rule is unfulfilled while a coarser-grained interval rule has an interval that doesn't cover the archive that would be kept by the fine-grained oldest rule. Example: I don't think this is worth spending much more time on -- only new behavior is affected here, the existing oldest-rule rolling backup test is green, and I cannot think of any test that could challenge this assertion for any behavior from Borg 1.
Reworded.
Non-issue.
Outdated comment after refactor. Removed that final point.
Fixed.
Not really needed; removed.
I still think Up to you.
All relevant tests are green, I think this is fine. |
Seems like there is some markup issue in the prune help (guess epilog). |
You're right! I assumed workflow hiccups. Fixed. |
|
Claude review: Did another pass on the current revision (
|
This PR adds optional interval handling for all retention filter flags of the
prunecommand, previously only available on--keep-within. E.g.prune --keep-hourly=7dwill keep hourly archives for the last 7 days regardless of count. Adds--keepwhich acts as a combination of--keep-lastand--keep-within.Implements #8637.
I opted to make the existing flags handle both ints and intervals instead of adding new flags as there are already so many flags for this command. Simplified some prune filtering: With the default filter function now also handling intervals,
prune_withinis no longer needed as a special case.I added a library to freeze time in testing, let me know if that's not wanted and I'll figure out something else. It's such a hassle to deal with timestamps relative to
now()in test. The tests should be fairly comprehensive in checking both their timely filter (hourly/yearly, etc.) and the new inclusive timestamp check. I did not add new helper tests forprune_splitas this function is not used anywhere other thanprune_cmdand isn't really a helper.TODOs:
TODOs from comments:
Notes:
--keep: Merges functionality of--keep-withinand--keep-lastwith new int-or-interval handling.--keep-lastis no longer an alias of--keep-secondlyand thus keeps archives made on the same second. It now fits together with--keep-withinand new flag--keep.int_or_intervaland creatingtimedeltaobjects it seemed weird to restrict input like this. Not extremely useful unless you want to prune on archives in the same second as they were created, but it seemed logical when setting up tests to verify new--keep-lastbehavior.. Technically breaking, but will likely not meaningfully affect any real scenarios.xx:xx:10and there's an archive atxx:xx:05then--keep-within 5Sshould cover that archive. Easy change to make when already altering the filtering logic. Technically breaking, but will likely not meaningfully affect any real scenarios.This has been a pet peeve of mine in the pruning command for a long time. In my mind the most clean backup regime keeps all backups for a short time (allows catching small errors quickly), then hourly backups for a reasonable time (say, 7 days), then daily backups for a little longer and then finally weekly/monthly as storage permits. This was not previously possible, requiring for example
--keep-within 7d --keep-hourly 168 --keep-daily 14 --keep-weekly -1for an approximation. But keeping around 168 archives for a machine that's only running a few hours a day seems mighty excessive. So here we are :)With this implemented my ideal retention for my primary laptop with archives every 15m looks something like
--keep 3d --keep-hourly 7d --keep-daily 30d --keep-weekly -1.