Index API: make implicit ordering of responses explicit? #17258

Open
woodruffw opened this issue Dec 9, 2024 · 2 comments

woodruffw commented Dec 9, 2024

I'm opening this for discussion, based on a (non-breaking!) divergence between PyPI's observed behavior and the behavior specified in PEPs 503 and 691.

This comes from the context of Homebrew, which periodically pulls from the Index/JSON APIs as part of checking for releases. CC @samford for viz/more context.

TL;DR: right now, the response to /simple/<name>/ is ordered by version, ascending (files are sorted by parsed version, then by filename):

    def _simple_detail(project, request):
        # Get all of the files for this project.
        files = sorted(
            request.db.query(File)
            .options(joinedload(File.release))
            .join(Release)
            .filter(Release.project == project)
            # Exclude projects that are in the `quarantine-enter` lifecycle status.
            .join(Project)
            .filter(
                Project.lifecycle_status.is_distinct_from(LifecycleStatus.QuarantineEnter)
            )
            .all(),
            key=lambda f: (packaging_legacy.version.parse(f.release.version), f.filename),
        )
        versions = sorted(
            {f.release.version for f in files}, key=packaging_legacy.version.parse
        )

As a result of this ordering, higher releases (by version) come last, and lower (by version) releases come first.

Meanwhile, PEP 691 says the following about ordering:

While the files key is an array, and thus is required to be in some kind of an order, neither PEP 503 nor this PEP requires any specific ordering nor that the ordering is consistent from one request to the next. Mentally this is best thought of as a set, but both JSON and HTML lack the functionality to have sets.

In other words, this ordering is an implementation detail within PyPI. However, it's likely that consumers of the Index APIs are depending on this observed behavior (per Hyrum's Law), and are using the current ordering as an optimization to more rapidly find the latest release(s).
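To illustrate the alternative to relying on position, here is a minimal sketch of a consumer that treats the response as an unordered set and computes the highest version itself. The sample response body is made up, it assumes the response carries a `versions` list, and `version_key` is a hypothetical helper that only handles dotted numeric versions; a real consumer would want a full PEP 440 parser such as `packaging.version.parse`.

```python
import json


def version_key(version: str) -> tuple[int, ...]:
    """Hypothetical, simplified sort key for dotted numeric versions only.

    This is NOT a PEP 440 parser; it exists to keep the sketch stdlib-only.
    """
    return tuple(int(part) for part in version.split("."))


def highest_version(response: dict) -> str:
    # Do not assume the last element is the highest: per PEP 691 the
    # ordering is unspecified, so compare every entry explicitly.
    return max(response["versions"], key=version_key)


# Made-up response body, deliberately out of order.
sample = json.loads(
    """
    {
      "meta": {"api-version": "1.1"},
      "name": "example",
      "versions": ["1.10.0", "1.2.0", "1.9.3"],
      "files": []
    }
    """
)

print(highest_version(sample))
```

This costs one pass over the list, which is the work the current ordering lets consumers skip.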

Given the above, I wonder if it makes sense for PyPI to make this implementation detail explicit. Some options:

  • Document it in the Index API docs, as an implementation-specific detail of PyPI's Index APIs

  • Add some kind of additional field to the meta component of the JSON response, signaling how the index orders the files, e.g.:

     "meta": {
       "api-version": "1.0",
       "order": "version"
     },

    (This might be overkill though.)
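As a rough sketch of how a client might consume such a signal: the `order` field below is the hypothetical one proposed above, not something any index currently sends, and the `versions` list, the sample responses, and the `numeric_key` helper are likewise assumptions made for illustration.

```python
def numeric_key(version: str) -> tuple[int, ...]:
    # Hypothetical simplified key: dotted numeric versions only, not PEP 440.
    return tuple(int(part) for part in version.split("."))


def pick_highest(response: dict) -> str:
    versions = response["versions"]
    if response.get("meta", {}).get("order") == "version":
        # The index has declared ascending version order, so the last
        # entry is the highest and no sorting is needed.
        return versions[-1]
    # No declared order: fall back to comparing every entry.
    return max(versions, key=numeric_key)


# Made-up responses: one declares its ordering, one does not.
ordered = {
    "meta": {"api-version": "1.1", "order": "version"},
    "versions": ["1.2.0", "1.9.3", "1.10.0"],
}
unordered = {
    "meta": {"api-version": "1.1"},
    "versions": ["1.10.0", "1.2.0", "1.9.3"],
}

print(pick_highest(ordered), pick_highest(unordered))
```

The fallback branch keeps the client correct against indexes that never adopt the field.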

OTOH, maybe this doesn't need to be documented or formalized. But I figured I'd open this up for consideration 🙂

di commented Dec 9, 2024

and are using the current ordering as an optimization to more rapidly find the latest release(s).

Just to be clear here, what the end user wants in this example is the largest release (in terms of version) and not the latest release (in terms of publication date), correct? Because they are not always the same 🙂.

@woodruffw
Member Author

and are using the current ordering as an optimization to more rapidly find the latest release(s).

Just to be clear here, what the end user wants in this example is the largest release (in terms of version) and not the latest release (in terms of publication date), correct? Because they are not always the same 🙂.

Correct! Sorry, bad phrasing on my part: I was using "latest" to mean "highest version" not "most recently published distribution." I'll update the issue body 🙂
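The distinction can be shown with a small sketch. The release data is invented, and `version_key` is a hypothetical helper that only understands dotted numeric versions:

```python
from datetime import datetime


def version_key(version: str) -> tuple[int, ...]:
    # Hypothetical simplified key: dotted numeric versions only, not PEP 440.
    return tuple(int(part) for part in version.split("."))


# Invented data: a 1.9.1 backport is published after 2.0.0.
releases = [
    {"version": "2.0.0", "uploaded": datetime(2024, 1, 10)},
    {"version": "1.9.1", "uploaded": datetime(2024, 3, 5)},
]

# "Highest" compares version numbers; "latest" compares publication time.
highest = max(releases, key=lambda r: version_key(r["version"]))
latest = max(releases, key=lambda r: r["uploaded"])

print("highest version:", highest["version"])
print("most recently published:", latest["version"])
```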
