Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libgit2, GitRepo: Write (thin) packfiles #11330

Merged
merged 8 commits into from
Sep 16, 2024
Merged

libgit2, GitRepo: Write (thin) packfiles #11330

merged 8 commits into from
Sep 16, 2024

Conversation

roberth
Copy link
Member

@roberth roberth commented Aug 19, 2024

Motivation

This improves performance on systems with weak I/O in ~/.cache, especially in terms of operations per second, or where system calls are slower. (macOS, VMs?)

May address

Context

TODO

  • measurements:
    • faster on emilazy's machine
    • similar on 2019 high end linux laptop with ZFS on NVMe
    • ...?
  • review memory management
  • packfile threads setting?
  • make it interruptible

Priorities and Process

Add 👍 to pull requests you find important.

The Nix maintainer team uses a GitHub project board to schedule and track reviews.

@github-actions github-actions bot added the fetching Networking with the outside (non-Nix) world, input locking label Aug 19, 2024
@roberth
Copy link
Member Author

roberth commented Aug 19, 2024

@emilazy Could you test this on your mac?

  • nix shell nix/packfile -c ./maintainers/measure-fetch-to-cache.sh
  • nix shell nix/1c5ad159d6ff412cb57f13bf9d2aacbacd258a90 -c ./maintainers/measure-fetch-to-cache.sh
    current merge base
measure-fetch-to-cache.sh
#!/usr/bin/env bash

# Test fetch performance
#
# locations:
#   - working directory to store tarballs
#   - tempdir for cache
#   - Nix store for output

temp="$(mktemp -d)"
trap 'rm -rf "$temp"' EXIT

cache="$temp/cache"

export XDG_CACHE_HOME="$temp/cache"

# Fetch a Nixpkgs tarball (and persist it), to remove the variability of github and the network.

persist=$PWD

# Master 2024-08-19, 14:00 ish
if [[ ! -f "$persist/nixpkgs-master.tar.gz" ]]; then
  curl --fail -L https://github.com/NixOS/nixpkgs/archive/34a8b3aa250af1e447651347eaff2af05783b5ad.tar.gz > "$persist/nixpkgs-master.tar.gz"
fi

# The merge before 34a8b
if [[ ! -f "$persist/nixpkgs-master-minus-one.tar.gz" ]]; then
  curl --fail -L https://github.com/NixOS/nixpkgs/archive/c0b21d95cbef08ad861587042456321a4853a6a7.tar.gz > "$persist/nixpkgs-master-minus-one.tar.gz"
fi

# Staging 2024-08-19, 14:00 ish

if [[ ! -f "$persist/nixpkgs-staging.tar.gz" ]]; then
  curl --fail -L https://github.com/NixOS/nixpkgs/archive/7110a438af1d17f302c4f9c73e86d5e1e5d3fa8a.tar.gz > "$persist/nixpkgs-staging.tar.gz"
fi

# Release instead of unstable (~4 months in)
if [[ ! -f "$persist/nixpkgs-release.tar.gz" ]]; then
  curl --fail -L https://github.com/NixOS/nixpkgs/archive/c42fcfbdfeae23e68fc520f9182dde9f38ad1890.tar.gz > "$persist/nixpkgs-release.tar.gz"
fi

echo "nixpkgs-master.tar.gz"
time nix eval --impure --json --store dummy:// \
  --expr 'builtins.fetchTree { type = "tarball"; url = "file://'"$persist"'/nixpkgs-master.tar.gz"; }'

echo "nixpkgs-maste-minus-one.tar.gz"
time nix eval --impure --json --store dummy:// \
  --expr 'builtins.fetchTree { type = "tarball"; url = "file://'"$persist"'/nixpkgs-master-minus-one.tar.gz"; }'

echo "nixpkgs-staging.tar.gz"
time nix eval --impure --json --store dummy:// \
  --expr 'builtins.fetchTree { type = "tarball"; url = "file://'"$persist"'/nixpkgs-staging.tar.gz"; }'

echo "nixpkgs-release.tar.gz"
time nix eval --impure --json --store dummy:// \
  --expr 'builtins.fetchTree { type = "tarball"; url = "file://'"$persist"'/nixpkgs-release.tar.gz"; }'

The script errors at addToStore, but that's useful for measuring the part where it's writing to the cache.

It'd also be interesting to measure the difference between that and the whole process including adding to the store. You can do that by removing the --store dummy:// flags.

@emilazy
Copy link
Member

emilazy commented Aug 19, 2024

Here you go:

emily@yuyuko ~/D/nix-benchmark> nix shell nix/packfile -c ./maintainers/measure-fetch-to-cache.sh
nixpkgs-master.tar.gz
error:
       … while calling the 'fetchTree' builtin
         at «string»:1:1:
            1| builtins.fetchTree { type = "tarball"; url = "file:///Users/emily/Downloads/nix-benchmark/nixpkgs-master.tar.gz"; }
             | ^

       … while fetching the input 'file:///Users/emily/Downloads/nix-benchmark/nixpkgs-master.tar.gz'

       error: operation 'addToStore' is not supported by store 'dummy'

real	0m45.118s
user	0m29.597s
sys	0m14.555s
nixpkgs-maste-minus-one.tar.gz
error:
       … while calling the 'fetchTree' builtin
         at «string»:1:1:
            1| builtins.fetchTree { type = "tarball"; url = "file:///Users/emily/Downloads/nix-benchmark/nixpkgs-master-minus-one.tar.gz"; }
             | ^

       … while fetching the input 'file:///Users/emily/Downloads/nix-benchmark/nixpkgs-master-minus-one.tar.gz'

       error: operation 'addToStore' is not supported by store 'dummy'

real	0m19.352s
user	0m4.905s
sys	0m13.654s
nixpkgs-staging.tar.gz
error:
       … while calling the 'fetchTree' builtin
         at «string»:1:1:
            1| builtins.fetchTree { type = "tarball"; url = "file:///Users/emily/Downloads/nix-benchmark/nixpkgs-staging.tar.gz"; }
             | ^

       … while fetching the input 'file:///Users/emily/Downloads/nix-benchmark/nixpkgs-staging.tar.gz'

       error: operation 'addToStore' is not supported by store 'dummy'

real	0m20.079s
user	0m5.086s
sys	0m14.325s
nixpkgs-release.tar.gz
error:
       … while calling the 'fetchTree' builtin
         at «string»:1:1:
            1| builtins.fetchTree { type = "tarball"; url = "file:///Users/emily/Downloads/nix-benchmark/nixpkgs-release.tar.gz"; }
             | ^

       … while fetching the input 'file:///Users/emily/Downloads/nix-benchmark/nixpkgs-release.tar.gz'

       error: operation 'addToStore' is not supported by store 'dummy'

real	0m32.328s
user	0m18.233s
sys	0m13.165s
emily@yuyuko ~/D/nix-benchmark [1]> nix shell nix/1c5ad159d6ff412cb57f13bf9d2aacbacd258a90 -c ./maintainers/measure-fetch-to-cache.sh
nixpkgs-master.tar.gz
error:
       … while calling the 'fetchTree' builtin
         at «string»:1:1:
            1| builtins.fetchTree { type = "tarball"; url = "file:///Users/emily/Downloads/nix-benchmark/nixpkgs-master.tar.gz"; }
             | ^

       … while fetching the input 'file:///Users/emily/Downloads/nix-benchmark/nixpkgs-master.tar.gz'

       error: operation 'addToStore' is not supported by store 'dummy'

real	1m32.812s
user	0m11.381s
sys	1m16.923s
nixpkgs-maste-minus-one.tar.gz
error:
       … while calling the 'fetchTree' builtin
         at «string»:1:1:
            1| builtins.fetchTree { type = "tarball"; url = "file:///Users/emily/Downloads/nix-benchmark/nixpkgs-master-minus-one.tar.gz"; }
             | ^

       … while fetching the input 'file:///Users/emily/Downloads/nix-benchmark/nixpkgs-master-minus-one.tar.gz'

       error: operation 'addToStore' is not supported by store 'dummy'

real	0m43.040s
user	0m7.745s
sys	0m33.242s
nixpkgs-staging.tar.gz
error:
       … while calling the 'fetchTree' builtin
         at «string»:1:1:
            1| builtins.fetchTree { type = "tarball"; url = "file:///Users/emily/Downloads/nix-benchmark/nixpkgs-staging.tar.gz"; }
             | ^

       … while fetching the input 'file:///Users/emily/Downloads/nix-benchmark/nixpkgs-staging.tar.gz'

       error: operation 'addToStore' is not supported by store 'dummy'

real	0m45.920s
user	0m7.962s
sys	0m36.226s
nixpkgs-release.tar.gz
error:
       … while calling the 'fetchTree' builtin
         at «string»:1:1:
            1| builtins.fetchTree { type = "tarball"; url = "file:///Users/emily/Downloads/nix-benchmark/nixpkgs-release.tar.gz"; }
             | ^

       … while fetching the input 'file:///Users/emily/Downloads/nix-benchmark/nixpkgs-release.tar.gz'

       error: operation 'addToStore' is not supported by store 'dummy'

real	1m10.783s
user	0m9.575s
sys	0m56.576s

Looks like a big improvement! Going to run without the dummy store now too.

cc @cigrainger who seemed to have it much worse than me

@emilazy
Copy link
Member

emilazy commented Aug 19, 2024

Without the dummy store:

emily@yuyuko ~/D/nix-benchmark> nix shell nix/packfile -c ./maintainers/measure-fetch-to-cache.sh
nixpkgs-master.tar.gz
"/nix/store/40i95p3z643n6ywplvb3fk2v0z5vk5wa-source"

real	1m11.682s
user	0m32.191s
sys	0m15.840s
nixpkgs-maste-minus-one.tar.gz
"/nix/store/8d6a34yjar9hyczjrb9wjkqg6hlxn88w-source"

real	0m44.578s
user	0m7.398s
sys	0m14.293s
nixpkgs-staging.tar.gz
"/nix/store/nzn3afs8c7857nb478qksyvalc93mn87-source"

real	0m47.004s
user	0m7.231s
sys	0m16.353s
nixpkgs-release.tar.gz
"/nix/store/wbqb2mw5kmh9bf04aqbd84c3j6a280h8-source"

real	0m52.336s
user	0m17.942s
sys	0m13.248s
emily@yuyuko ~/D/nix-benchmark> nix shell nix/1c5ad159d6ff412cb57f13bf9d2aacbacd258a90 -c ./maintainers/measure-fetch-to-cache.sh
nixpkgs-master.tar.gz
"/nix/store/40i95p3z643n6ywplvb3fk2v0z5vk5wa-source"

real	2m5.856s
user	0m14.874s
sys	1m23.417s
nixpkgs-maste-minus-one.tar.gz
"/nix/store/8d6a34yjar9hyczjrb9wjkqg6hlxn88w-source"

real	1m43.383s
user	0m11.603s
sys	0m42.745s
nixpkgs-staging.tar.gz
"/nix/store/nzn3afs8c7857nb478qksyvalc93mn87-source"

real	1m19.343s
user	0m10.971s
sys	0m38.417s
nixpkgs-release.tar.gz
"/nix/store/wbqb2mw5kmh9bf04aqbd84c3j6a280h8-source"

real	1m33.060s
user	0m11.853s
sys	0m58.422s

Is the packfile stuff pure overhead over whatever previous Nix versions were doing? i.e., are the measurements in my previous comment how much slower current Nix is over 2.18? Or are they accomplishing part of the work that 2.18 was? These results still feel slow to me, though I never really measured before. (Still, clearly a huge improvement over the status quo!)

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2024-08-19-nix-team-meeting-minutes-170/50942/1

@Mic92
Copy link
Member

Mic92 commented Aug 25, 2024

Is the packfile stuff pure overhead over whatever previous Nix versions were doing? i.e., are the measurements in my previous comment how much slower current Nix is over 2.18? Or are they accomplishing part of the work that 2.18 was? These results still feel slow to me, though I never really measured before. (Still, clearly a huge improvement over the status quo!)

I think the benefit would only come once we would stop copying nixpkgs to the nix store which is also slow and use lazy-trees instead. However just now this is not the case.

@emilazy
Copy link
Member

emilazy commented Aug 25, 2024

measure-fetch-to-cache.sh seems to be based entirely around fetching Nixpkgs tarballs into the Nix store and this branch improved performance for it significantly per my results, so I’m not sure what you mean?

@emilazy
Copy link
Member

emilazy commented Aug 25, 2024

Or I guess you just mean, there’s still overhead over 2.18 that currently brings no benefit, even after this change?

@Mic92
Copy link
Member

Mic92 commented Aug 25, 2024

The reason the performance in current nix is different than in nix 2.18 is than in 2.24 is that nix now creates a git repository of all tarballs in ~/.cache/nix/tarball-cache and creates the nar hash from there only to copy the nixpkgs afterwards to the /nix/store again. I believe this was done to save disk space? A the moment it doesn't but we pay more compute and storage for maintaining the cache without getting the benefits for it by not having to copy the nixpkgs to the nix store.

@emilazy
Copy link
Member

emilazy commented Aug 25, 2024

I see. That definitely seems like something that should be turned off until it can actually pay dividends.

@Mic92
Copy link
Member

Mic92 commented Aug 25, 2024

Agreed. The premise back than was that we would have the complete implementation sooner. But this did not work out.

@roberth
Copy link
Member Author

roberth commented Aug 25, 2024

But this did not work out.

... yet.

With a combination of this, #11367 and one or two edits, nix build nixpkgs#hello --no-eval-cache becomes about as fast as nix-build <nixpkgs> -A hello (and 2× faster in one out of a dozen runs somehow? wat?)

As we're finishing up the meson migration, I think we can focus on stabilizing that. It's a weaker version of lazy-trees #6530 that doesn't sacrifice correctness.

It's "weaker" because toString path still requires a copy for now, and that's currently required by the module system for modules that are loaded by path - the recommended method. Perhaps this could be reduced on the Nix side to a hashing operation instead of a copy. The Nixpkgs package set is unaffected if we apply the following patch, which I used for my measurement of #11367+this.

Nixpkgs patch for loading module pkgs/top-level/config.nix
--- a/pkgs/top-level/default.nix
+++ b/pkgs/top-level/default.nix
@@ -88,7 +88,15 @@ in let
 
   configEval = lib.evalModules {
     modules = [
-      ./config.nix
+      { # restore a file path for use in error messages
+        _file = "nixpkgs/pkgs/top-level/config.nix";
+        imports = [
+          # import: Intentionally drop the file path, so we don't need to
+          #         to materialize it in the store if fetched lazily. This
+          #         improves the performance of flake:nix/lazy-paths
+          (import ./config.nix)
+        ];
+      }
       ({ options, ... }: {
         _file = "nixpkgs.config";
         config = config1;

@roberth
Copy link
Member Author

roberth commented Aug 25, 2024

That definitely seems like something that should be turned off until it can actually pay dividends.

I agree, and I am sorry that we overlooked this situation. I don't know what else to say except we're a volunteer team under a lot of pressure, but I'll stop complaining.

I've talked to the team about this and while there's concern about "making a step back", we should consider this more so as a step sideways to avoid a regression.

Personally I'd prefer to deliver #11367 over reinventing a solution to use the store as a cache again; it's not a simple revert unfortunately, but probably the old code is still useful. If anyone wants to try this, I'd still welcome such a PR.

@Mic92
Copy link
Member

Mic92 commented Aug 25, 2024

That sounds great. However do you think as a short term solution we should to re-introduce the the old code as the default code branch until both pull requests are fleshed out, so we don't have to rush an implementation? I would look into this, if we I get green light on this.

@emilazy
Copy link
Member

emilazy commented Aug 25, 2024

With a combination of this, #11367 and one or two edits, nix build nixpkgs#hello --no-eval-cache becomes about as fast as nix-build <nixpkgs> -A hello (and 2× faster in one out of a dozen runs somehow? wat?)

Nice! That’s very exciting.

I agree, and I am sorry that we overlooked this situation. I don't know what else to say except we're a volunteer team under a lot of pressure, but I'll stop complaining.

Don’t get me wrong; I appreciate the work. Copying is one of the most painful parts of flakes and you’re clearly taking it seriously. If this was behind an experimental flag or whatever I’d have no complaint, but in the context where Nixpkgs is trying to bump to a newer Nix version, it does seem pretty bad for some users to get significant flake copying performance regressions for what is apparently no benefit in the version Nixpkgs would be bumping to. I assumed that things had become slower for “complicated” reasons; if it really can just be switched off with no downside in the 2.24 branch that seems like a clear win to me.

@roberth roberth marked this pull request as ready for review August 28, 2024 00:44
@roberth roberth requested a review from edolstra as a code owner August 28, 2024 00:44
@roberth roberth changed the title libgit2, GitRepo: Write thin packfiles libgit2, GitRepo: Write (thin) packfiles Aug 28, 2024
@Mic92
Copy link
Member

Mic92 commented Aug 28, 2024

Another edge-case is NFS-baked home that can be found in universities quite often.
I think I want to mention this in the release notes with a work around (environment variables / symlinks): TUM-DSE/doctor-cluster-config@8eb9941

@roberth roberth force-pushed the packfile branch 2 times, most recently from a7192e3 to 30a2b4a Compare August 28, 2024 15:49
libgit2 didn't write thin ones, hence the patch.

This should improve performance on systems with weak I/O in ~/.cache,
especially in terms of operations per second, or where system calls
are slower. (macOS, VMs?)
The latter is not used for memory synchronization things.
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2024-08-26-nix-team-meeting-minutes-172/51300/1

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2024-08-02-nix-team-meeting-minutes-174/51512/1

Author: Robert Hensing <robert@roberthensing.nl>
Date: Sun Aug 18 20:20:36 2024 +0200

Add unoptimized git_mempack_write_thin_pack
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unoptimized

After reading libgit2 again, this optimization is performed by the later call to git_packbuilder_write_buf instead. It is not a responsibility.

I've removed this from the PR I've submitted upstream

panic("PackBuilderContext::handleException: user error, but exception not set");

std::rethrow_exception(exception);
panic("PackBuilderContext::handleException: exception not thrown");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
panic("PackBuilderContext::handleException: exception not thrown");

git_packbuilder_set_callbacks(packBuilder.get(), PACKBUILDER_PROGRESS_CHECK_INTERRUPT, &packBuilderContext);
git_packbuilder_set_threads(packBuilder.get(), 0 /* autodetect */);

// TODO: For an optimal pack it's mandatory to insert objects in recency order, commits followed by trees and blobs.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// TODO: For an optimal pack it's mandatory to insert objects in recency order, commits followed by trees and blobs.

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2024-09-09-nix-team-meeting-minutes-176-175/51852/1

@Mic92
Copy link
Member

Mic92 commented Sep 16, 2024

I am daily driving this pull request on my machines for some time now btw.

@roberth roberth enabled auto-merge September 16, 2024 12:16
@roberth roberth merged commit 799abea into master Sep 16, 2024
18 checks passed
@roberth roberth deleted the packfile branch September 16, 2024 12:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fetching Networking with the outside (non-Nix) world, input locking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants