Feature: Add DockerImage classes for docker.io, nvcr.io, and quay.io#697
Feature: Add DockerImage classes for docker.io, nvcr.io, and quay.io#697georgiastuart wants to merge 19 commits into
Conversation
Signed-off-by: Georgia Stuart <georgia.stuart@gmail.com>
Signed-off-by: Georgia Stuart <georgia.stuart@gmail.com>
Signed-off-by: Georgia Stuart <georgia.stuart@gmail.com>
Signed-off-by: Georgia Stuart <georgia.stuart@gmail.com>
Signed-off-by: Georgia Stuart <georgia.stuart@gmail.com>
Signed-off-by: Georgia Stuart <georgia.stuart@gmail.com>
Signed-off-by: Georgia Stuart <georgia.stuart@gmail.com>
Signed-off-by: Georgia Stuart <georgia.stuart@gmail.com>
Signed-off-by: Georgia Stuart <georgia.stuart@gmail.com>
Signed-off-by: Georgia Stuart <georgia.stuart@gmail.com>
Signed-off-by: Georgia Stuart <georgia.stuart@gmail.com>
Signed-off-by: Georgia Stuart <georgia.stuart@gmail.com>
Signed-off-by: Georgia Stuart <georgia.stuart@gmail.com>
Signed-off-by: Georgia Stuart <georgia.stuart@gmail.com>
|
Shockingly, the tests passed the first time. Ready for review @vsoch ! |
vsoch
left a comment
There was a problem hiding this comment.
This looks great! A few comments, and then we should have a test for the registry classes. We will also want a strategy to clean up the current registry repository.
|
|
||
| if len(tags) == 0: | ||
| logger.error( | ||
| f"The tag {tag} you provided is not known. Check that it and the container both exist." |
There was a problem hiding this comment.
Suggested for here would be to raise the ValueError with the message currently given to logger.error.
There was a problem hiding this comment.
Refactored error handling a bit, but it still seems messy. Going to take a fresh look at it tomorrow!
| ) | ||
| raise ValueError | ||
| new_tags = [x for x in tags if x.get("name")] | ||
| tags.extend(new_tags) |
There was a problem hiding this comment.
Sanity check here we will not have duplicates in this list. And what about order? I think different registries may do alphabetical vs. by date. If the user asks for some specific number of tags, we likely want to give them latest.
There was a problem hiding this comment.
Question about this - I mainly took inspiration from the previous tag_quay method where it pulls all the tags and feeds it into the filtering, value extraction, and sorting functions in the update main function. DockerHub does appear to reliably return tags in order of latest update. Given that this class can handle a rate limit, is it better to continue pulling all the tags, or to cap it at the first X tags? I'm guessing we could comfortably restrict it to 5 pages or so and still be able to get the N most recent tags for even tag-spammy projects.
There was a problem hiding this comment.
We should likely be conservative and cap at some X tags, especially if you are hitting a limit with requests.
| break | ||
| return tag_responses | ||
|
|
||
| def tags(self): |
There was a problem hiding this comment.
You are going in the right direction by using DockerHubImage for NGC, and it looks like there is still some redundancy in these classes (e.g., the tags filter for the sbom and similar). The main difference tends to be in the query tag API function. Could we put more logic in the DockerHubImage base class that the others inherit from with the shared logic?
There was a problem hiding this comment.
Refactored to minimize redundancy. All class specialization now occurs in _query_tag_api. I marked the manifest and config methods to raise NotImplementedError in the subclasses. It appears that they aren't currently used elsewhere. I intend to implement, but throwing the error in for now just in case!
| max_length (int) : the max number to return (latest) | ||
| """ | ||
|
|
||
| filtered_tags = [x for x in tags] |
There was a problem hiding this comment.
Sanity check - order and repeats?
| __copyright__ = "Copyright 2021-2025, Vanessa Sochat" | ||
| __license__ = "MPL 2.0" | ||
|
|
||
| __version__ = "0.1.33" |
There was a problem hiding this comment.
Let's bump up to 0.2.0 for this merge. If you have other changes / PRs to get in we can encompass them with that release too.
There was a problem hiding this comment.
Great, can do. I want to add gitlab.com and self hosted gitlab support in before merge. I'm also looking at a strategy to handle multiple "series" of tags, like "X.X.X-alpine" vs "X.X.X-trixie" or whatever.
|
Thanks for the review!! I'll implement tests and the other suggestions (and check out the 2 line digest printing). |
Signed-off-by: Georgia Stuart <georgia.stuart@gmail.com>
Signed-off-by: Georgia Stuart <georgia.stuart@gmail.com>
Signed-off-by: Georgia Stuart <georgia.stuart@gmail.com>
|
To do before merge:
|
Signed-off-by: Georgia Stuart <georgia.stuart@gmail.com>
|
Hi @georgiastuart . I'd be happy to help with "Add gitlab.com / self hosted support". We started working on it with @vsoch in #585 but never got to complete it. |
|
Thank you @muffato ! |
This is intended to resolve singularityhub/shpc-registry#480 and singularityhub/shpc-registry#481 . The driving motivations are rate limiting from dockerhub and odd output from NGC containers at times. The updated registry files in singularityhub/shpc-registry#483 use this singularity-hpc PR. This is a moderately sized PR, so I'm anticipating a couple rounds of iteration after the PR tests run.
Subclass Design
DockerHubImage,QuayDockerImageandNGCImageSubclassDockerImageand override thetagsanddigestmethods. It's possible there's a more elegant way to simplify, but all these APIs have bespoke needs so I didn't try to unify further. All query metadata APIs rather than usecrane.NCG Compatibility
Querying the NCG API requires an API token. These are free to get for this purpose, but I kept
ngcsdkas an optional dependency. Ifngcsdkand the environment variableSHPC_NGC_API_KEYare present, shpc will usengcsdkto update. Otherwise, it will fall back tocraneand the normalDockerImageclass.Added CLI arguments for
updateWhile cleaning up some mangled
container.yamlfiles, I added two CLI arguments toupdate:--max-tags=N: Overrides the default 5 tags to add--purge: wipes the existing tags in thecontainer.yamlfile to start fresh.