Duplicate alert when uploading wheels #155

Closed
icfly2 opened this issue May 12, 2023 · 7 comments
Labels
invalid, question, wontfix

Comments

@icfly2

icfly2 commented May 12, 2023

I have the issue that I build in a matrix of Python 3.8 - 3.11, as I use mypyc to compile some slow bits.

When it comes to uploading, one runner finishes first and uploads the tar.gz file and its own .whl, say 311. Then 310, 39 and 38 all report that there are duplicates and fail.

@webknjaz
Member

This means that they have already been uploaded. You can't upload files with the same name twice. Even if you remove them from PyPI, it won't let you do this — these things are immutable.
There's an action input for skipping duplicates but it's not recommended to use it — first, inspect if you're attempting to upload them from multiple jobs. If you are, then restructure your workflow to avoid race conditions.

@icfly2
Author

icfly2 commented May 12, 2023

No, the files don't already exist:

https://pypi.org/project/simstring-fast/0.2.0/#files Only the 3.11 wheel is uploaded.

I'm running multiple build jobs (Windows / Linux, versions 3.8 - 3.11) and obviously they all should upload.

@webknjaz
Member

> no, the files don't already exist:

This doesn't mean anything. If somebody else had the project, deleted it and you registered it again, you would still be unable to reuse the filenames.

@webknjaz
Member

> obviously they all should upload.

Here's some explanation of why it's obviously a bad idea: #15 (comment). Multiple disconnected async uploads are more prone to creating race conditions or partial uploads, as you've witnessed.

The recommendation is to upload all the dists as GHA artifacts from multiple jobs and then download them all in a single job that is dedicated to uploading the dists. That single job can then also be protected via a GitHub Environment with mandatory approval, for example. And you can enable secretless publishing there too.
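
A minimal sketch of that layout, under stated assumptions: the workflow name, job names, artifact names, the tag trigger, and the `pypi` environment name are all hypothetical, and the `@v4`/`@v5` action versions are newer than the ones available when this thread was written. With `actions/upload-artifact@v4` every upload needs a unique artifact name, so the sketch merges them on download; with the older v3 artifact actions, several jobs could upload into one shared artifact name instead. The sdist is built in exactly one job so that no two jobs produce the same `.tar.gz`.

```yaml
# Sketch only: names, trigger, and environment are assumptions, not the project's real workflow.
name: release

on:
  push:
    tags: ["v*"]

jobs:
  sdist:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: python -m pip install build
      # Build the source distribution once, in this job only.
      - run: python -m build --sdist --outdir dist/
      - uses: actions/upload-artifact@v4
        with:
          name: dist-sdist
          path: dist/

  wheels:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest]
        python-version: ["3.8", "3.9", "3.10", "3.11"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - run: python -m pip install build
      # Wheels only here, never the sdist. Note that a plain build on Linux
      # produces a linux_x86_64 tag that PyPI rejects; this sketch assumes
      # the project's own build handles manylinux tagging.
      - run: python -m build --wheel --outdir dist/
      - uses: actions/upload-artifact@v4
        with:
          # Unique name per matrix cell (required by upload-artifact v4).
          name: dist-wheel-${{ matrix.os }}-${{ matrix.python-version }}
          path: dist/

  publish:
    needs: [sdist, wheels]
    # The publish action is Linux-only, even when wheels were built on Windows.
    runs-on: ubuntu-latest
    # Hypothetical environment name; approval rules are configured in the repo settings.
    environment: pypi
    permissions:
      id-token: write  # needed for secretless (trusted) publishing
    steps:
      - uses: actions/download-artifact@v4
        with:
          pattern: dist-*
          path: dist/
          merge-multiple: true  # collect every artifact's files into one dist/
      - uses: pypa/gh-action-pypi-publish@release/v1
```

Secretless publishing also requires registering the repository and workflow as a trusted publisher on the PyPI side; the workflow alone is not enough.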

@webknjaz
Member

I checked your GHA job that is failing and there are several issues with it:

  1. Several of your jobs create dist/simstring_fast-0.2.4.tar.gz and try to upload it which causes the conflicts you are observing. Use the pointers from my previous message to split building dists and uploading them from a single job.
  2. You attempted to use skip-existing, which is a valid action input in the modern version, but you pinned your action version to an older commit SHA, and that version doesn't recognize it. Update to @release/v1 to get the up-to-date action version, or configure Dependabot to do the bumping for you.
  3. You create manylinux wheels that are compatible with very new systems but wouldn't be picked up on the older ones (which might be fine for you).
  4. You use a long-lived API token. Instead, follow the README instructions to configure the more secure secretless publishing. For this, you'll need to update the action to a modern version, make small changes to your GHA uploading job and configure trust on the PyPI side.
  5. Uploading from Windows jobs was never supported; it will not work and is not supposed to. Make sure to study the README and use the issue and discussion search for details.
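
For point 2, a minimal Dependabot configuration along these lines would keep pinned action versions bumped automatically. This is a sketch, not taken from the project: the file is assumed to live at `.github/dependabot.yml`, and the weekly interval is an arbitrary choice.

```yaml
# .github/dependabot.yml (assumed location; the schedule is illustrative)
version: 2
updates:
  # Open pull requests that bump the actions referenced in workflow files,
  # including ones pinned to a commit SHA.
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"
```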

As there's nothing else to do here, I'm closing the issue.

webknjaz closed this as not planned on May 12, 2023
webknjaz added the invalid, question, and wontfix labels on May 12, 2023

@icfly2
Author

icfly2 commented May 13, 2023

Thanks a lot for your extensive help. It is much appreciated.

I actually faffed about with many different things, including the whole artifact upload and download in separate steps, but something always broke.

Is there an example of this recommended setup, where you build on a matrix and then upload in one go? If not, I’m happy to contribute it to the docs when I have it working.

@webknjaz
Member

I've got multiple places where this works, but these examples are rather complicated since they have a test matrix in the middle. For example:

They also integrate my other inventions like https://github.com/marketplace/actions/alls-green#why and lately https://github.com/marketplace/actions/checkout-python-sdist.

Having realized the complexities of maintaining YAML files of this size, I've been experimenting with making the setup more modular. Here's one example of employing reusable workflows for this: https://github.com/pypa/build/pull/618/files — though, I haven't migrated many of my projects to using this approach yet.

P.S. The key to making upload+download work is simply to point both the build and publish jobs at the dist/ directory and to use the same GHA artifact name.
P.P.S. Since your case requires platform-specific wheels, I recommend looking into https://cibuildwheel.rtfd.io which should take care of many build matrix aspects for you.
P.P.P.S. The right place for a more complete example would be my PyPUG guide that I've been putting off for quite a while. It's being tracked in #29.
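
To illustrate the cibuildwheel suggestion, here is a hedged sketch of a wheel-building job fragment. It is not from the project in question: the job and artifact names are hypothetical, the pinned `v2.16.5` tag is only an example version, and the `CIBW_BUILD` selector is assumed from the Python versions mentioned in this thread. One job per operating system builds wheels for every selected Python version, so no per-version matrix is needed.

```yaml
# Fragment of a workflow's jobs section; names and versions are assumptions.
jobs:
  wheels:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest]
    steps:
      - uses: actions/checkout@v4
      - uses: pypa/cibuildwheel@v2.16.5
        with:
          # Write wheels to dist/ instead of the default wheelhouse/,
          # so build and publish jobs agree on one directory.
          output-dir: dist
        env:
          # Build CPython 3.8 through 3.11 wheels on each OS.
          CIBW_BUILD: "cp38-* cp39-* cp310-* cp311-*"
      - uses: actions/upload-artifact@v4
        with:
          name: dist-wheel-${{ matrix.os }}
          path: dist/
```

On Linux, cibuildwheel builds inside manylinux containers, which also bears on the manylinux compatibility point raised earlier in the thread.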
