Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend NixGL compat #1

Merged
merged 2 commits into from
Oct 20, 2024
Merged

Extend NixGL compat #1

merged 2 commits into from
Oct 20, 2024

Conversation

exzombie
Copy link

@exzombie exzombie commented May 19, 2024

As promised in my comment on nix-community/pull/5355, here are the changes that make my use case work. I intend to dogfood this in the coming week.

I added a section to the user manual which explains why separate wrappers are needed. And, because there are two wrappers, I opted for removing default wrapping of firefox and kitty, at least for now. I believe this approach is simple enough for the user while being sufficiently flexible to cover both single-GPU and dual-GPU systems. Support for offloading to AMD dGPU is missing because I don't have such a system, but I see nothing that would prevent adding another wrapper if someone expresses the need and can test.

In the limited time I had, I focused on the functionality and user interface, but haven't checked conformance to the contributor guidelines. Sorry. I think we can fix that later. I'd like to get comments on the functionality first :)

@Smona Smona self-requested a review May 20, 2024 04:19
@Smona
Copy link
Owner

Smona commented May 23, 2024

@exzombie I've force-pushed to remove the default wrappers, if you wouldn't mind rebasing your branch that would make it a bit easier to review :)

Copy link
Owner

@Smona Smona left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, and for adding docs!

Excuse me if I don't fully understand the intricacies of hybrid graphics laptop setup -- I tried playing around with it when I was a kid but it's been a long time. IIRC that was before vulkan.

I think this config schema might be a little difficult to use for someone who is sharing a module between multiple system configurations. One benefit of my original API that this doesn't have is that the nixGL package used to wrap all of your wrapped packages can be customized via a single per-system setting in flake.nix (example).

What would you think of keeping that layer of indirection, but adding an option for a secondary GPU? the final settings would look something like:

  • packages: like you have it now.
  • graphics: one of "mesa", "nvidia", or "bumblebee". This would determine the nixGL wrapper lib.nixGL.wrap uses
  • offloadGraphics: one of "mesa", "nvidia", or "bumblebee". This would be the same as graphics, but used for a secondary wrapper function, like config.nixGL.wrapOffloaded. Maybe secondary would be preferable to offload. This could default to the value of graphics. I think with "bumblebee" this would simply wrap the exe in optirun, but I'm not sure, so maybe bumblebee could be excluded for a first pass.
  • useVulkan: bool, would switch nixGLNvidia->nixVulkanNvidia and nixGLIntel->nixVulkanIntel
  • nvidiaOffload: I believe could stay the same as how you have it now, but I really don't know much about what these env vars do. Maybe we could do away with this option and just use the offloading env vars inside wrapOffloaded when the vendor is "nvidia"?
  • wrapperScripts: could be a list of "mesa", "nvidia", and or "bumblebee" and work like it does now.

This seems like it would be a better compromise between our two use cases, and like it would avoid introducing any limitations that would require custom user settings & conditionals to work around. For example, someone could have a single configuration module for graphical apps that includes both firefox and blender. They nixGL.wrap firefox and nixGL.wrapOffloaded blender. This module could be well supported on a variety of systems just by tweaking options:

systems.nixos = myHmConfig {};
systems.ubuntuNotebook = myHmConfig {
  nixGL.packages = nixGL.packages;
  nixGL.graphics = "mesa";
};
systems.gamingLaptop = myHmConfig {
  nixGL.packages = nixGL.packages;
  nixGL.graphics = "mesa";
  nixGL.offloadGraphics = "nvidia";
};
systems.nvidiaDesktop = myHmConfig {
  nixGL.packages = nixGL.packages;
  nixGL.graphics = "nvidia";
  nixGL.useVulkan = true;
};

modules/misc/nixgl.nix Outdated Show resolved Hide resolved
modules/misc/nixgl.nix Outdated Show resolved Hide resolved
modules/misc/nixgl.nix Outdated Show resolved Hide resolved
modules/misc/nixgl.nix Outdated Show resolved Hide resolved
@exzombie
Copy link
Author

This seems like it would be a better compromise between our two use cases, and like it would avoid introducing any limitations that would require custom user settings & conditionals to work around. For example, someone could have a single configuration module for graphical apps that includes both firefox and blender. They nixGL.wrap firefox and nixGL.wrapOffloaded blender.

I, too, maintain a single HM repo for multiple machines, both NixOS and non-NixOs. So, I'm well aware that there needs to be a way to configure things mostly the same but slightly differently. That's actually why I wanted to avoid defining a specific mechanism for this, as it seems to me that every user will have their own way of doing it. My intent was to provide wrappers, and allow the user to decide which wrapper to use in which situation.

That said, I can see the appeal of having a ready-made higher-level abstraction. I guess there's no harm in providing a primary+secondary wrapper mechanism, it would make it clear to users what the intended use is. But I'm pretty sure we should also give direct access to wrapper functions so that they can be used without indirection.

Speaking of wrapper functions, given the complexity of this stuff, there will have to be many. I thought we could provide just Mesa and Nvidia to start with and add others later as people ask for them, but I found that the offloading for Nvidia is complicated by the fact that there are different topologies, and what I provided will not work on all laptops. I'll see what I can come up with. I looked at the AMD offloading situation, and it seems it should be more straightforward.

@exzombie exzombie force-pushed the nixgl-compat branch 2 times, most recently from 13923b5 to 2b3674b Compare May 28, 2024 14:09
@exzombie
Copy link
Author

Please tell me what you think of the new interface. I hope I managed to hit the sweet spot. For your case (single GPU, Mesa), it should work without any configuration apart from setting the packages option.

I looked into providing offloading on AMD and it's quite a bit simpler than for Nvidia, so I added it. I also made offloading a bit more flexible. Not that I can test, but at least the stuff is there so others can do so. I didn't want to include Bumblebee, though, because the way it's done in NixGL doesn't follow the same pattern as other wrappers. I don't know if it's worth it because I don't think the Bumblebee method is supported nowadays. If users appear, it can be added later.

@Smona
Copy link
Owner

Smona commented May 31, 2024

Awesome!! Thanks for being willing to work with me on the interface. I love where it's at now, really does feel like the best combination of convenience and flexibility.

I tried testing your branch with my configuration, and I got this error when building emacs:

Warning, nixVulkanIntel overwriting existing LD_LIBRARY_PATH
/nix/store/ynz405a4ngxijdmdz1qjmfb0ffgigx98-emacs-pgtk-29.1/bin/emacs: /nix/store/xpxln7rqi3pq4m0xpnawhxb2gs0mn1s0-gcc-12.3.0-lib/lib/libstdc++.so.6: version `GLIBCXX_3.4.31' not found (required by /nix/store/7byfnpsk04l3q8rvvwc06acwjbkiy0j2-webkitgtk-2.42.4+abi=4.0/lib/libwebkit2gtk-4.0.so.37)
/nix/store/ynz405a4ngxijdmdz1qjmfb0ffgigx98-emacs-pgtk-29.1/bin/emacs: /nix/store/xpxln7rqi3pq4m0xpnawhxb2gs0mn1s0-gcc-12.3.0-lib/lib/libstdc++.so.6: version `GLIBCXX_3.4.32' not found (required by /nix/store/7byfnpsk04l3q8rvvwc06acwjbkiy0j2-webkitgtk-2.42.4+abi=4.0/lib/libwebkit2gtk-4.0.so.37)
/nix/store/ynz405a4ngxijdmdz1qjmfb0ffgigx98-emacs-pgtk-29.1/bin/emacs: /nix/store/xpxln7rqi3pq4m0xpnawhxb2gs0mn1s0-gcc-12.3.0-lib/lib/libstdc++.so.6: version `GLIBCXX_3.4.31' not found (required by /nix/store/7byfnpsk04l3q8rvvwc06acwjbkiy0j2-webkitgtk-2.42.4+abi=4.0/lib/libjavascriptcoregtk-4.0.so.18)

It looks like the vulkan wrapper is overriding LD_LIBRARY_PATH with an incompatible version of GLIBC? I'll try and dig in to nixGL to get some more useful info, but I wonder if this is a sign that maybe wrapping with vulkan by default could make the wrappers work a little less consistently.

@exzombie
Copy link
Author

exzombie commented Jun 1, 2024

You're welcome! I'm glad nixGL is finally moving towards inclusion in HM.

Despite the fact that the warning uses the word "overwriting", I'm pretty sure it's just prepending, and not really throwing away the library path. It's also done by all wrappers, it's just the the Mesa Vulkan wrapper is the only one printing a warning.

If it's really the wrapper who pulls in an incompatible glibc (does the unwrapped package work?), it would mean that the libraries used by nixGL come from a different version of nixpkgs. Did you ensure that the nixGL input in your flake uses the same nixpkgs as HM? I can't reproduce this with 23.11

BTW, why are you wrapping emacs? Not that it matters for this problem, a wrapped package should always work, I'm just curious because I thought emacs didn't use a GPU.

@Smona
Copy link
Owner

Smona commented Jul 1, 2024

Hey @exzombie, I wish I hadn't taken so long to get back to you but life's been very busy. I'm still interested in merging your improvements before the nixgl module gets into home-manager, I just didn't want to add any blockers to the functionality getting merged. If we can get this ready before then, that would be great!

I wrapped emacs just out of an abundance of caution, trying to get as much performance out of it as possible. I'm using PGTK and I wasn't sure whether that could hardware accelerate rendering or not. Maybe it isn't actually helping, but I agree that the wrapper should avoid breaking packages as much as possible. I can confirm that my configuration builds and emacs runs on your branch when i unwrap emacs, all else being equal.

I believe everything is coming from the same version of nixpkgs, so I'm not sure what's causing the version mismatch. Here's my setup:

    # relevant flake inputs
    nixpkgs-ubuntu.url =
      "github:nixos/nixpkgs?rev=5ba549eafcf3e33405e5f66decd1a72356632b96";
 
    hm-ubuntu.url =
      "github:exzombie/home-manager-smona?rev=2b3674b62286dde05d0fcd816ee90b9f30003559";
    hm-ubuntu.inputs.nixpkgs.follows = "nixpkgs-ubuntu";

    nixGL.url = "github:guibou/nixGL";
    nixGL.inputs.nixpkgs.follows = "nixpkgs-ubuntu";

  # emacs.nix
  my-emacs = (config.lib.nixGL.wrap
    ((pkgs.emacsPackagesFor emacs).emacsWithPackages
      (epkgs: [ epkgs.vterm epkgs.pdf-tools ]))).overrideAttrs (oldAttrs: {
        # Temporarily working around this issue: https://github.com/NixOS/nixpkgs/issues/66706
        # TODO: submit upstream fix
        buildCommand = oldAttrs.buildCommand + ''
          ln -s $emacs/share/emacs $out/share/emacs
        '';
      });

I also tried updating nixGL to the latest version, which unsurprisingly didn't help. I have to pin a relatively old nixpkgs commit to maintain compatibility with Ubuntu's version of gnome, but I'm not sure where the incompatible GLIBC version could be sneaking in from. I will try to find some time to investigate further, but perhaps another option would be to omit the vulkan wrapper for now, or make it opt-in.

Let me know if I can get you any other info to try and reproduce my issue. My full configuration is public if that helps.

This PR also needs to be rebased/merged with the updates to the wrapper script from other reviews on the base branch.

@exzombie
Copy link
Author

exzombie commented Jul 3, 2024

Today, I finally found the time to update my systems to 24.05. Now, I'm encountering the same problem. In my case, it manifests when a wrapped program tries to run executables from the host system, which fails because of missing glibc symbols. This is most unfortunate. I don't mind making Vulkan optional for now in order to not hold back this PR. However, I fear that OpenGL wrapping could cause similar issues when the stuff from nixpkgs deviates further from the stuff on the host system. In fact, I wonder why it doesn't cause trouble already ...

@exzombie
Copy link
Author

I squashed when rebasing, and applied ./format. Vulkan is now optional and disabled by default. I'm not entirely happy with the makeWrapper thing: it requires a single executable as wrapper, so, I added an intermediate wrapper. I don't think that's any nicer than what we had before. But let's see what the reviewers say.

@AkechiShiro
Copy link

AkechiShiro commented Aug 13, 2024

I take it that this PR would needs to be merged first ideally and then in turn we would be in a good place to merge upstream ? Maybe we should ask more people to test this branch, as I believe most are focused on the main branch that would land upstream

Thanks to you both, @exzombie and @Smona for the improvements of this very important feature for non-NixOS machine, this module may need to be activated by default on non-NixOS machine, at least I believe a working experience by default should be done instead of having a broken by default experience (which we currently have).

modules/misc/nixgl.nix Outdated Show resolved Hide resolved
Copy link
Owner

@Smona Smona left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very impressive & exciting work here @exzombie! I tested on my Ubuntu 24 intel system, setting only nxGL.packages and everything seems to be working just right.

I think this is ready to merge, just want to think it over for another day or so since there's a lot going on here. would you mind if I revise the docs a bit post-merge?

Comment on lines +29 to +30
Note that using any Nvidia wrapper requires building the configuration
with the `--impure` option.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: link to the manual page here, trim down redundant info.

shorthand.

The following wrappers are available:
${wrapperListMarkdown}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice :)

@exzombie
Copy link
Author

I think this is ready to merge, just want to think it over for another day or so since there's a lot going on here. would you mind if I revise the docs a bit post-merge?

Go right ahead!

@@ -53,11 +193,17 @@ in {
separateDebugInfo = false;
nativeBuildInputs = old.nativeBuildInputs or [ ]
++ [ pkgs.makeWrapper ];
buildCommand = ''
buildCommand = let
# We need an intermediate wrapper package because makeWrapper
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: I imagine that most people will leave Vulkan disabled, so we could probably avoid the double wrapper for most people with a little refactor. it would be a pretty small resource usage improvement though, probably not important

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is true, and I did consider it. But it seemed to me that the improvement would be too small compared to the effort needed for refactoring, especially because I was hoping at the time that the Vulkan situation could be improved and the default restored in the future. I have found in the meantime that it can't be improved, at least with the current approach used by NixGL that relies on environment variables. But this does not change the gain vs. effort equation.

@Smona Smona merged commit dd25d79 into Smona:nixgl-compat Oct 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants