Nix 2.21.x -> 2.22.x download buffer is full on CentOS 7 #10630

Open
MatthewCroughan opened this issue Apr 30, 2024 · 17 comments
Labels
bug; fetching (Networking with the outside (non-Nix) world, input locking); other-linux (Nix on a Linux distro that is not a NixOS-derivative)

Comments

@MatthewCroughan
Contributor

MatthewCroughan commented Apr 30, 2024

Describe the bug

When using CentOS 7, I ran into an issue with Nix versions 2.21.x through 2.22.x, where built-in fetchers, such as the one
triggered by running nix build nixpkgs#hello, fill up a download buffer and fail to fetch.

Downgrading to Nix 2.18.1 or 2.18.2 makes this issue stop.

When running nix build nixpkgs#hello, Nix fails to download the GitHub tarball. Running with -vvv shows the following:

download thread waiting for 100 ms
download buffer is full; going to sleep

Eventually, the download fails and the following error is emitted:

error:
       … while fetching the input 'github:NixOS/nixpkgs/nixpkgs-unstable'

       error: cannot get archive member name: truncated gzip input

Steps To Reproduce

  1. Get CentOS 7
  2. Install Nix via the nixos.org installer
  3. Get nix 2.21.x or 2.22.x
  4. nix build nixpkgs#hello

Expected behavior

For Nix to successfully download and unpack the tarball.

nix-env --version output

$ nix-env --version
nix-env (Nix) 2.21.2

Additional context

I'm sorry I can't provide more helpful steps to reproduce the bug, though I'm happy to help if anyone can instruct me on what to run.

Priorities

Add 👍 to issues you find important.

@roberth added the fetching and other-linux labels on Jul 4, 2024
@bodokaiser

@MatthewCroughan have you found a workaround?

@MatthewCroughan
Contributor Author

MatthewCroughan commented Jul 8, 2024

@bodokaiser Other than using a Nix release prior to this, no. I wanted to perform a time-consuming git bisect, but wouldn't do that unless paid to do so, due to the esoteric nature of the regression affecting ancient Linuxes that I don't have much motivation to touch otherwise.

@bodokaiser

@MatthewCroughan How did you perform the downgrade? Did you uninstall Nix and then install an older version, or could you use nix upgrade-nix?

Did you disable SELinux? Are you using the multi-user install? Is there a proxy in your network?

(I also have multiple problems including the cache's SSL due to a proxy - trying to disentangle them)

@lentilus

lentilus commented Aug 7, 2024

I am having similar issues but on a different setup. I am running Docker on a Fedora host. Everything is fine on the host, but in the Docker container (devpod) nix builds are really slow. They don't fail, they just take really long. The logs (-vvvvv) show hundreds of lines like this:

download thread waiting for 100 ms
download thread waiting for 100 ms
download thread waiting for 100 ms
download buffer is full; going to sleep
download buffer is full; going to sleep
download thread waiting for 100 ms
download buffer is full; going to sleep

I have not verified that downgrading makes things better; right now I am on 2.23. I am not sure, but the issue may be related to #11249.
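
A quick way to quantify the stalls described above is to count the two messages in a captured debug log. The sketch below is only an illustration: the function name count_stalls and the log file name are hypothetical, and it assumes the verbose output was saved to a file beforehand.

```shell
# Count download-stall messages in a captured Nix debug log.
# Usage: count_stalls LOGFILE
count_stalls() {
    # grep -c prints the number of matching lines, but exits non-zero
    # when there are none; "|| true" keeps this usable under "set -e".
    printf 'buffer-full: %s, thread-waits: %s\n' \
        "$(grep -c 'download buffer is full' "$1" || true)" \
        "$(grep -c 'download thread waiting for' "$1" || true)"
}
```

For example, after capturing the output with something like nix build nixpkgs#hello -vvvvv 2> nix-debug.log (assuming the debug messages go to stderr), count_stalls nix-debug.log prints both counts on one line, which makes it easier to compare two Nix versions.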

@bodokaiser

bodokaiser commented Aug 7, 2024 via email

@lentilus

lentilus commented Aug 7, 2024

Can you elaborate on that? Do you mean the store directory or is there an additional cache that I don't know about? On my host I have about 100GiB of free disk.
The logs from devpod tell me that the container is started using the following command

19:49:05 debug Running docker command: docker run --sig-proxy=false --mount type=bind,src=/home/lentilus/git/2ndpod,dst=/workspaces/2ndpod -u root -e DEVPOD=true -e REMOTE_CONTAINERS=true -l dev.containers.id=2ndpod-def-e4a1c -l devcontainer.metadata=[{"id":"ghcr.io/devcontainers/features/common-utils:2"},{"id":"ghcr.io/devcontainers/features/git:1"},{"remoteUser":"vscode"},{"entrypoint":"/usr/local/share/nix-entrypoint.sh"},{"onCreateCommand":{"":["sudo chsh -s /usr/bin/zsh $USER"]}}] -l devpod.user=root -d --entrypoint /bin/sh vsc-2ndpod-1e65f:devpod-1a149c9eba5e1e523404f67734dea86e -c echo Container started
trap "exit 0" 15
/usr/local/share/nix-entrypoint.sh
exec "$@"
while sleep 1 & wait $!; do :; done -

I think this means that the container should have access to the full 100GiB, because there is no flag that suggests otherwise...
And nixpkgs#hello should not be all that big after all.

@bodokaiser

bodokaiser commented Aug 7, 2024 via email

@roberth
Member

roberth commented Aug 8, 2024

> They don't fail, they just take really long. The logs (-vvvvv) show hundreds of lines like this:
>
> download thread waiting for 100 ms
> download thread waiting for 100 ms
> download thread waiting for 100 ms
> download buffer is full; going to sleep

@lentilus This sounds like it could be solved by #11171, i.e. master or one of the backports.

> And nixpkgs#hello should not be all that big after all.

Nixpkgs itself is pretty big though, and it's written to a cache that's implemented by means of a git repo in ~/.cache/nix/tarball-cache/.
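
To see how much space that cache takes, something like the following could be used. This is a minimal sketch: the function name tarball_cache_kib is made up for illustration, and the default path is simply the one quoted above (it may differ if XDG_CACHE_HOME is set).

```shell
# Report the on-disk size of the Nix tarball cache in KiB.
# Usage: tarball_cache_kib [DIRECTORY]
tarball_cache_kib() {
    # Default to the path mentioned in the comment above (an assumption
    # for setups that do not override the cache location).
    dir=${1:-"$HOME/.cache/nix/tarball-cache"}
    if [ -d "$dir" ]; then
        # du -sk prints "<size-in-KiB><TAB><path>"; keep only the size.
        du -sk "$dir" | cut -f1
    else
        echo "no cache directory at $dir" >&2
        return 1
    fi
}
```

Running tarball_cache_kib with no arguments checks the default location. Comparing the result with the free space reported by df -k inside the container would show whether the cache could plausibly be filling the disk.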

@lentilus

Thanks @bodokaiser for the details! Unfortunately the size of the cache dir did not seem to be the issue. This may be a stupid question, but is there any way around building nix from source to check if #11171 fixes it? The commit is quite recent and not present in any of the releases, right?

@roberth
Member

roberth commented Aug 11, 2024

It's in Nix 2.24, available as nixVersions.latest in the nixos-unstable channel.
2.23-maintenance has not been tagged for this yet.

@lentilus

#11171 fixed it. Thank you so much @bodokaiser and @roberth !

@roberth
Member

roberth commented Aug 12, 2024

@MatthewCroughan could you also give it a try?

@MatthewCroughan
Contributor Author

@roberth Although downloading seems to be solved, and the log line "download buffer is full" is no longer spammed, another issue has occurred. Nix claims the download is finished and then begins extracting the tarball, but this process hangs indefinitely and does not respond to ^C signals.

[nix-shell:~]$ nix shell github:nixos/nixpkgs#hello -vvvv
evaluating file '<nix/derivation-internal.nix>'
evaluating derivation 'github:nixos/nixpkgs#hello'...
using cache entry 'file:{"name":"source","store":"/nix/store","url":"https://api.github.com/repos/nixos/nixpkgs/commits/HEAD"}' -> '{"etag":"W/\"414dbf039c1e0d4a25053b1d34fdbe18369df6ea0068f219bf6ffbd1c10f25b5\"","storePath":"0h76h2iw2l6y92xzmrrsb5mkvb4z26nc-source","url":"https://api.github.com/repos/nixos/nixpkgs/commits/HEAD"}'
ignoring the client-specified setting 'extra-platforms', because it is a restricted setting and you are not a trusted user
ignoring the client-specified setting 'system-features', because it is a restricted setting and you are not a trusted user
performing daemon worker op: 11
acquiring write lock on '/nix/var/nix/temproots/4762'
performing daemon worker op: 1
using cache entry 'file:{"name":"source","store":"/nix/store","url":"https://api.github.com/repos/nixos/nixpkgs/commits/HEAD"}' -> '{"etag":"W/\"414dbf039c1e0d4a25053b1d34fdbe18369df6ea0068f219bf6ffbd1c10f25b5\"","url":"https://api.github.com/repos/nixos/nixpkgs/commits/HEAD"}', '/nix/store/0h76h2iw2l6y92xzmrrsb5mkvb4z26nc-source'
HEAD revision for 'github:nixos/nixpkgs/HEAD' is 0f1d78c2761069a83de99581ed24533d930f1232
did not find cache entry for 'gitRevToTreeHash:{"rev":"0f1d78c2761069a83de99581ed24533d930f1232"}'
unpacking 'github:nixos/nixpkgs/0f1d78c2761069a83de99581ed24533d930f1232' into the Git cache...
downloading 'https://github.com/nixos/nixpkgs/archive/0f1d78c2761069a83de99581ed24533d930f1232.tar.gz'...
starting download of https://github.com/nixos/nixpkgs/archive/0f1d78c2761069a83de99581ed24533d930f1232.tar.gz
finished download of 'https://github.com/nixos/nixpkgs/archive/0f1d78c2761069a83de99581ed24533d930f1232.tar.gz'; curl status = 0, HTTP status = 200, body = 44453545 bytes, duration = 0.28 s
download thread shutting down

@roberth
Member

roberth commented Aug 22, 2024

@MatthewCroughan Maybe extraction was just very very slow - could you try #11330? It speeds up extraction significantly on hosts with limited I/O operations per second.

If that doesn't solve the problem, I think we'll need a stack trace from the running process using gdb.

@MatthewCroughan
Contributor Author

@roberth Sadly it doesn't seem to. It would be nice to reproduce this in a VM test or something. The issue is the same as before: unpacking hangs indefinitely and doesn't respond to ^C. I'll ping you about the stack trace.

@MatthewCroughan
Contributor Author

@roberth I'm now experiencing this same behavior on my personal NixOS machine(s) after updating to 2.25.x, where a nix-shell -p nixVersions.nix_2_24 is enough to roll it back and no longer observe the behavior.

@MatthewCroughan
Contributor Author

I had GC_INITIAL_HEAP_SIZE=8m set, which had worked fine for a while, but was no longer tenable for some reason. Unsetting this fixed my variation of this issue.
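
For anyone checking whether the same setting is present in their environment, a small sketch follows. The function name report_gc_heap_setting is hypothetical. GC_INITIAL_HEAP_SIZE itself is an environment variable read by the Boehm garbage collector that Nix's evaluator uses.

```shell
# Print whether GC_INITIAL_HEAP_SIZE is set in the current environment.
report_gc_heap_setting() {
    # ${VAR+x} expands to "x" only when VAR is set (even if it is empty).
    if [ -n "${GC_INITIAL_HEAP_SIZE+x}" ]; then
        echo "GC_INITIAL_HEAP_SIZE is set to '${GC_INITIAL_HEAP_SIZE}'"
    else
        echo "GC_INITIAL_HEAP_SIZE is not set"
    fi
}
```

If the variable turns out to be set, running unset GC_INITIAL_HEAP_SIZE in the same shell before retrying the nix command reproduces the fix described above. Whether it helps in other variations of this issue is not established in this thread.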


4 participants