
Flux1 nf4 #1

Open
freecode-ai opened this issue Aug 29, 2024 · 2 comments

@freecode-ai

Does this support the nf4 model?
https://huggingface.co/lllyasviel/flux1-dev-bnb-nf4

@SplittyDev
Owner

Currently, it only supports the official models from Black Forest Labs. The main reason is that this is my first project running inference, and I don't yet know how to load models manually.

I'll try to look into it, because I'm interested in getting the quantized models to run myself, but can't promise that I'll get it to work.

@SplittyDev
Owner

@freecode-ai I've pushed a few commits with significant changes to model loading, which should in theory allow loading quantized models.

Experimental quantized model support

An fp8 version of dev and schnell is now available for selection, but I can't test it myself, because fp8 isn't supported on MPS, which is currently the only device I have access to. If you or anyone else reading this could test it, please let me know if it works correctly!
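
If anyone wants to check whether their device can even hold fp8 tensors before running the full pipeline, a tiny probe like this should be enough (a rough sketch; `torch.float8_e4m3fn` only exists in recent PyTorch builds):

```python
import torch

# torch.float8_e4m3fn is the fp8 dtype typically used for inference weights;
# it's only available in recent PyTorch builds.
assert hasattr(torch, "float8_e4m3fn"), "this PyTorch build has no fp8 dtype"

device = "cuda" if torch.cuda.is_available() else "cpu"

try:
    # Allocating a small tensor and casting it is enough to see whether
    # the backend supports fp8 storage on this device.
    x = torch.randn(4, 4, device=device).to(torch.float8_e4m3fn)
    print(f"fp8 storage works on {device}")
except Exception as e:
    print(f"fp8 unsupported on {device}: {e}")
```

On MPS the cast should raise, which is exactly the limitation I'm running into.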

Issues regarding NF4 support

Honestly, I just don't know how to load them properly.

After the recent refactoring, there's support for loading a FluxTransformer2DModel manually, which allows for loading other kinds of models. But I still don't know how I'd go about loading nf4 models, because as far as I can tell, torch doesn't come with an nf4 dtype.

I've been trying to figure out how to load these models, and so far I'm looking into the bitsandbytes package. However, I'm not very experienced with this stuff, and I don't know whether bitsandbytes can simply be used for NF4 support together with FluxTransformer2DModel, or whether I'd have to break down the pipeline even further and do more of the work manually.

If anyone knows how to do this, please let me know, or even better, submit a PR :)
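
For reference, the direction I'm exploring looks roughly like the sketch below. This is a minimal, untested sketch: it assumes a diffusers version that exposes BitsAndBytesConfig, and a CUDA device, since bitsandbytes doesn't support MPS. Note that it re-quantizes the official bf16 weights on load rather than reading lllyasviel's pre-quantized checkpoint directly:

```python
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, BitsAndBytesConfig

# NF4 quantization config from bitsandbytes, exposed through diffusers.
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Quantize the official transformer weights to NF4 while loading.
transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    quantization_config=nf4_config,
    torch_dtype=torch.bfloat16,
)

# Plug the quantized transformer into the regular pipeline.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # keeps VRAM usage manageable
```

If this works, it would sidestep the missing torch nf4 dtype entirely, since bitsandbytes handles the 4-bit storage internally.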

Future GGUF support

Now that the codebase is a bit cleaner and there's better support for manual model loading, maybe we could support GGUF too and get access to all the nice integer GGUF quants?

My issues here are basically the same as with nf4: how can I load these models, and are they even compatible with the diffusers library? I can't seem to find any usable examples, or maybe I'm just really bad at googling this stuff.
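
If diffusers grows native GGUF loading, I'd expect it to look roughly like the sketch below. This is purely hypothetical on my part: the GGUFQuantizationConfig API and the city96/FLUX.1-dev-gguf checkpoint are assumptions, not something I've tested:

```python
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, GGUFQuantizationConfig

# Load a community GGUF quant of the Flux transformer from a single .gguf
# file; the quantized weights are dequantized to compute_dtype on the fly.
transformer = FluxTransformer2DModel.from_single_file(
    "https://huggingface.co/city96/FLUX.1-dev-gguf/blob/main/flux1-dev-Q4_0.gguf",
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)

# The rest of the pipeline stays unquantized.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()
```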
