Error downloading dockerfile artifact #21696

Open
StealthBadger747 opened this issue Nov 27, 2024 · 4 comments

@StealthBadger747

Describe the bug

This started happening very recently. As far as I know, not much has changed in our build environment, and the build process has stayed pretty much the same. Pants seems unable to download the dockerfile dependency for some reason. I ssh'd into the build runner and ran curl on the URL, which works, so it doesn't look like a connectivity issue. The failure has persisted across many runs over the past two days, with a 100% failure rate.

I have found a workaround: enabling the experimental rust_parser, which builds the dockerfile properly.

Pants version

Which version of Pants are you using?

I have tried 2.21.0, 2.22.1, and 2.23.0.

The logs are the same for all versions.

OS
Are you encountering the bug on MacOS, Linux, or both?

Only on Linux, in a self-hosted GHA runner (runs-on)

runner@hostname:~/_work/healthleap/healthleap$ uname -a
Linux hostname 6.8.0-1009-aws #9-Ubuntu SMP Fri May 17 14:39:23 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
runner@hostname:~/_work/healthleap/healthleap$ cat /etc/os-release
PRETTY_NAME="Ubuntu 24.04 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04 LTS (Noble Numbat)"
VERSION_CODENAME=noble
....

Additional info
Add any other information about the problem here, such as attachments or links to gists, if relevant.

runner@hostname:~/_work/healthleap/healthleap$ pants --no-remote-cache-read --no-remote-cache-write --no-local-cache package --docker-build-verbose packages/hl-api/hl_api:hl-api
21:37:17.62 [INFO] Initializing scheduler...
21:37:17.67 [INFO] Initializing Nailgun pool for 16 processes...
21:37:19.46 [INFO] Scheduler initialized.
21:37:19.66 [WARN] Unmatched globs from packages/hl-hl7-consumer/hl_hl7_consumer_tests:hl_hl7_consumer_tests's `sources` field: ["packages/hl-hl7-consumer/hl_hl7_consumer_tests/*.py", "packages/hl-hl7-consumer/hl_hl7_consumer_tests/*.pyi"], excludes: ["packages/hl-hl7-consumer/hl_hl7_consumer_tests/*_test.py", "packages/hl-hl7-consumer/hl_hl7_consumer_tests/*_test.pyi", "packages/hl-hl7-consumer/hl_hl7_consumer_tests/conftest.py", "packages/hl-hl7-consumer/hl_hl7_consumer_tests/test_*.py", "packages/hl-hl7-consumer/hl_hl7_consumer_tests/test_*.pyi", "packages/hl-hl7-consumer/hl_hl7_consumer_tests/tests.py", "packages/hl-hl7-consumer/hl_hl7_consumer_tests/tests.pyi"]

Do the file(s) exist? If so, check if the file(s) are in your `.gitignore` or the global `pants_ignore` option, which may result in Pants not being able to see the file(s) even though they exist on disk. Refer to https://www.pantsbuild.org/troubleshooting#pants-cannot-find-a-file-in-your-project.
21:37:19.67 [WARN] Unmatched globs from packages/hl-core/hl_core_tests/hl7:test_utils's `sources` field: ["packages/hl-core/hl_core_tests/hl7/*_test.pyi", "packages/hl-core/hl_core_tests/hl7/conftest.py", "packages/hl-core/hl_core_tests/hl7/test_*.pyi", "packages/hl-core/hl_core_tests/hl7/tests.pyi"]

Do the file(s) exist? If so, check if the file(s) are in your `.gitignore` or the global `pants_ignore` option, which may result in Pants not being able to see the file(s) even though they exist on disk. Refer to https://www.pantsbuild.org/troubleshooting#pants-cannot-find-a-file-in-your-project.
21:37:20.72 [INFO] Starting: Building dockerfile_parser.pex from resource://pants.backend.docker.subsystems/dockerfile.lock
21:37:21.81 [INFO] Completed: Building dockerfile_parser.pex from resource://pants.backend.docker.subsystems/dockerfile.lock
21:37:21.81 [ERROR] 1 Exception encountered:

Engine traceback:
  in `package` goal

ProcessExecutionFailure: Process 'Building dockerfile_parser.pex from resource://pants.backend.docker.subsystems/dockerfile.lock' failed with exit code 1.
stdout:

stderr:
There was 1 error downloading required artifacts:
1. dockerfile 3.2 from https://files.pythonhosted.org/packages/0e/de/00149a416148c609c71c8a94e5e4df14a9f62bf2fa41aeda021b76388623/dockerfile-3.2.0-cp36-abi3-manylinux_2_5_x86_64.manylinux1_x86_64.whl
    pip: Executing /home/runner/.cache/pants/named_caches/pex_root/venvs/34c8697f20cbc61a130d2863b492cdb77bb19979/d56f87eee1e7cb14eca0b0968944a6f58d9e642e/bin/python -sE /home/runner/.cache/pants/named_caches/pex_root/venvs/34c8697f20cbc61a130d2863b492cdb77bb19979/d56f87eee1e7cb14eca0b0968944a6f58d9e642e/pex --disable-pip-version-check --no-python-version-warning --exists-action a --no-input --isolated -q --cache-dir /home/runner/.cache/pants/named_caches/pex_root/pip/24.0/pip_cache --log /tmp/pants-sandbox-zfg86T/.tmp/pex-pip-log.podu2hn0/pip.log download --dest /home/runner/.cache/pants/named_caches/pex_root/downloads/e6bd64408386b7ba2259d85820e0fe90de1b6b8269f560f18aba100c6aa40b7d.lck.work --no-deps https://files.pythonhosted.org/packages/0e/de/00149a416148c609c71c8a94e5e4df14a9f62bf2fa41aeda021b76388623/dockerfile-3.2.0-cp36-abi3-manylinux_2_5_x86_64.manylinux1_x86_64.whl --index-url https://pypi.org/simple/ --retries 5 --timeout 15 failed with -11

Use `--keep-sandboxes=on_failure` to preserve the process chroot for inspection.
@jsirois
Contributor

jsirois commented Nov 28, 2024

Your very last line tells a whole story: ... failed with -11. A negative exit code means the process was killed by a signal, and signal 11 is SIGSEGV, so Python in that venv segfaulted. You should be able to self-serve from there; i.e. investigate the underlying Python binary used by that venv and see if it segfaults on its own. Then potentially clear caches if that Python venv is very old (older than libc upgrades, etc.).
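
For readers unfamiliar with the convention, here is a minimal Python sketch of how a return code like -11 can be decoded. It relies on the POSIX behaviour of Python's `subprocess` module, where a return code of -N means the child was terminated by signal N. The helper name `describe_exit` is made up for illustration and is not part of Pants or Pex.

```python
import signal


def describe_exit(returncode: int) -> str:
    """Translate a subprocess-style return code into a readable description.

    Hypothetical helper for illustration only; not a Pants or Pex API.
    """
    if returncode < 0:
        # On POSIX, subprocess reports "killed by signal N" as -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            # Signal number unknown on this platform.
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    # The pip invocation in the log above "failed with -11".
    print(describe_exit(-11))
```

On Linux, signal 11 is SIGSEGV, which is why the log line points at a crashing interpreter (or a native extension it loaded) rather than at a network problem.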

@huonw
Contributor

huonw commented Dec 2, 2024

Sorry for the trouble.

A few questions:

  1. As @jsirois suggests, can you reproduce this outside of pants?
  2. Did something change recently that might've caused this to start crashing?

@StealthBadger747
Author

Sorry for the late reply; I had a lot of fires to put out the past two weeks.

  1. I have not been able to reproduce this outside of pants.
  2. Nothing changed recently as far as I'm aware. We are still using the same CI image and the pants version did not change. It just started overnight, without any code changes.

@jsirois
Contributor

jsirois commented Dec 11, 2024

@StealthBadger747 hopefully my highlighting that -11 means SEGFAULT helps here. Almost surely this means a venv created by Pex (for Pants, these are in ~/.cache/pants/named_caches/...) has a Python executable that segfaults. By "reproduce outside of Pants", I mean try to run that Python and see if it segfaults. You'll get a segfault when, for example, glibc is upgraded and there are old venvs lying around with Pythons that link to the older glibc. You might just try moving aside ~/.cache/pants/named_caches/ as a quick way to check if the problem goes away.
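
One way to try this on the runner is sketched below, under some assumptions. It starts each venv interpreter with a trivial command and reports any that are killed by a signal. The `venvs/<hash>/<hash>/bin/python` layout is taken from the paths in the log above and may differ on other setups; the function names are made up for illustration. A clean exit here does not prove the venv is healthy, since the crash could also happen later, for instance inside a native extension loaded by pip.

```python
import subprocess
from pathlib import Path


def check_interpreter(python: Path, timeout: float = 10.0) -> int:
    """Run `python -c pass` and return its return code.

    A negative value means the process was killed by a signal
    (for example -11 for SIGSEGV on Linux).
    """
    proc = subprocess.run(
        [str(python), "-c", "pass"],
        capture_output=True,
        timeout=timeout,
    )
    return proc.returncode


def find_crashing_interpreters(pex_root: Path) -> list:
    """Return (path, returncode) pairs for venv Pythons killed by a signal.

    Assumes the venvs/<hash>/<hash>/bin/python layout seen in the log.
    """
    crashed = []
    if not pex_root.is_dir():
        return crashed
    for python in sorted(pex_root.glob("venvs/*/*/bin/python")):
        try:
            rc = check_interpreter(python)
        except (OSError, subprocess.TimeoutExpired):
            # Broken symlink, missing binary, or a hang: skip in this sketch.
            continue
        if rc < 0:
            crashed.append((python, rc))
    return crashed


if __name__ == "__main__":
    # Location assumed from the log; adjust if your cache lives elsewhere.
    root = Path.home() / ".cache" / "pants" / "named_caches" / "pex_root"
    for python, rc in find_crashing_interpreters(root):
        print(f"{python}: return code {rc}")
```

If any interpreter shows up in the output, moving aside the named caches as suggested above lets Pants rebuild those venvs against the current system libraries.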
