Stuff to save VRAM to less than 7GB #29

Open
Manni1000 wants to merge 8 commits into main

Conversation

Manni1000

I added dynamic unloading and loading of the main model and the VAE to save VRAM.
I also added Int8Quantized to save VRAM.
I also added a button in the Gradio UI to select the low VRAM mode (Int8Quantized).

When this is active, the model runs with under 7GB of VRAM, so a lot more people will be able to use it.

But note that this is not completely done: the button does not work yet! Whether the mode is active is currently controlled by a variable in pipeline.py (Quantization = True). Maybe someone else can connect the button logic to that variable; I am currently not able to do it, but I am publishing this so others can play around with it.
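
For readers who just want the general idea, here is a minimal sketch of the two tricks (8-bit weight storage and dynamic load/unload of the heavy modules) as they might look in PyTorch. The names quantize_int8, generate_low_vram, model, and vae are illustrative assumptions, not the actual OmniGen code:

import torch

# Illustrative per-tensor symmetric weight-only int8 quantization:
# keep an int8 copy plus a scale and dequantize on the fly.
def quantize_int8(weight: torch.Tensor):
    scale = weight.abs().max().clamp_min(1e-8) / 127.0
    q = torch.clamp((weight / scale).round(), -128, 127).to(torch.int8)
    return q, scale

def dequantize_int8(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.to(torch.float16) * scale.to(torch.float16)

# Illustrative dynamic offloading: keep modules on the CPU and move each
# one to the GPU only for the step that needs it.
def generate_low_vram(model, vae, inputs, device="cuda"):
    model.to(device)
    latents = model(**inputs)
    model.to("cpu")              # unload the main model
    torch.cuda.empty_cache()     # release the cached blocks so the VRAM is actually freed

    vae.to(device)
    images = vae.decode(latents)
    vae.to("cpu")
    torch.cuda.empty_cache()
    return images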

added the quantization button to the UI
added dynamic unloading of the main model and VAE to save VRAM
added Int8Quantized to save VRAM
@Manni1000
Author

Oh, I have to fix something with the VAE encoding, but I will do that tomorrow.

@cocktailpeanut

@Manni1000 sir there are thousands of us waiting for this to get merged....

@Manni1000
Author

I will fix the VAE bug today :)

made the low VRAM button work; only starts a new pipeline if the value is changed
added code so the quantizing button from the UI works
added code so that the int8 button in the UI works
@Manni1000
Author

OK, now everything works, but because of the changes in the main branch it's not compatible.
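
For anyone following along, here is a rough sketch of how the UI wiring can work: a Gradio checkbox drives the quantization flag, and a new pipeline is only built when the value actually changes. The function names (build_pipeline, generate) are assumptions for illustration, not the real app.py:

import gradio as gr

pipeline = None
current_low_vram = None

def get_pipeline(low_vram: bool):
    # Rebuild the pipeline only when the Low VRAM setting changes.
    global pipeline, current_low_vram
    if pipeline is None or low_vram != current_low_vram:
        pipeline = build_pipeline(quantization=low_vram)  # hypothetical factory
        current_low_vram = low_vram
    return pipeline

def generate(prompt, low_vram):
    return get_pipeline(low_vram)(prompt)

with gr.Blocks() as demo:
    prompt = gr.Textbox(label="Prompt")
    low_vram = gr.Checkbox(label="Low VRAM (8-bit Quantization)", value=False)
    output = gr.Image()
    gr.Button("Generate").click(generate, inputs=[prompt, low_vram], outputs=output)

demo.launch()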

change to match the original repository
@Manni1000 changed the title from "Stuff to save VRAM to less than 7GB (not done)" to "Stuff to save VRAM to less than 7GB" on Oct 26, 2024
@nitinmukesh

nitinmukesh commented Oct 26, 2024

I tested from your fork,
https://github.com/Manni1000/OmniGen/

OOM issue :(

(screenshot of the out-of-memory error)

@Manni1000
Author

Maybe first try generating an image without an input image, just as a test.

@nitinmukesh

Maybe first try generating an image without an input image, just as a test.

This was without an image. I tried the first example, where there is only a prompt and no image, at 512 x 512.

@Manni1000
Author

Strange. If it's only text-to-image it takes less than 7GB for me, and with an input image it's slightly above 8GB. Maybe try the version ignoring the newest two commits; those were just changes to make a merge easier.
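
One way to try that older state (assuming the two commits to skip are the newest ones on the fork's main branch) would be:

git clone https://github.com/Manni1000/OmniGen/
cd OmniGen
git checkout HEAD~2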

@nitinmukesh

nitinmukesh commented Oct 26, 2024

I am using your repository only, not merging any code.
git clone https://github.com/Manni1000/OmniGen/

Could you guide me on how to ignore a commit, or give me the command to clone? (BTW, not a developer.)

@codingdudecom

I can't get this to work either. I tried it on Kaggle, and it says:

Loading safetensors
Quantizing weights to 8-bit...
0%| | 0/50 [00:00<?, ?it/s]
0%| | 0/50 [00:01<?, ?it/s]
OutOfMemoryError

OutOfMemoryError: CUDA out of memory. Tried to allocate 2.12 GiB. GPU 0 has a total capacity of 14.74 GiB of which 848.12 MiB is free. Process 2413 has 13.91 GiB memory in use. Of the allocated memory 11.64 GiB is allocated by PyTorch, and 2.15 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management
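
For what it's worth, the setting suggested in that traceback can be tried by defining the environment variable before PyTorch initializes CUDA; a minimal way to do it from Python (the variable name and value come straight from the error message):

import os
# Must be set before the first CUDA allocation, ideally before importing torch.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import torch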

@SunDeveloper777

SunDeveloper777 commented Oct 27, 2024

Install with Pinokio and replace these files. Works with 5-9 GB VRAM. Check "Low VRAM (8-bit Quantization)" after loading the app.
27-10-2024.zip

@nitinmukesh

nitinmukesh commented Oct 27, 2024

@SunDeveloper777

Thank you. Working fine.

@codingdudecom

not for me though :-(

I'm still trying to get this working on Kaggle

First I'm getting "The error AttributeError: module 'torch.library' has no attribute 'register_fake' typically arises when using PyTorch, particularly in versions where this function is either not available or incorrectly referenced." (torch version 2.3.1)

I thought I needed to upgrade, so I did, but now I get this error:

"AttributeError: partially initialized module 'torchvision' has no attribute 'extension' (most likely due to a circular import)"

Why is this so complicated to run? The Space on Hugging Face gives an error 3 out of 4 tries as well... I'm finding it very difficult to evaluate this model, to be honest.
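
(For reference, both of those errors typically come from a torch/torchvision version mismatch in the environment; reinstalling a matching pair, e.g. something like pip install torch==2.4.1 torchvision==0.19.1, is the usual fix.)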

@cocktailpeanut

@Manni1000 please check the PR I sent to your fork. I guess this PR will work once you merge my PR to your fork: Manni1000#2

@cocktailpeanut

cocktailpeanut commented Oct 27, 2024

Also, I confirm it works (with my fix applied). It took 1 min 56 sec on a 4090, with 8GB VRAM usage.

BTW, shouldn't we make the Low VRAM option checked by default? I mean, this thing is practically unusable without the low VRAM option.

@able2608 mentioned this pull request on Oct 28, 2024
@ayttop

ayttop commented Oct 28, 2024

Does not work on Colab T4.

@iwr-redmond

iwr-redmond commented Oct 30, 2024

The @cocktailpeanut PR is working on Ubuntu 22.04:

remote="https://github.com/peanutcocktail/OmniGen"
local=/opt/local-venvs/omnigen
git clone $remote $local
cd $local/
python -m venv venv
source venv/bin/activate
pip install -e .
pip install gradio spaces
python app.py

Tick the new 'Low VRAM (8-bit quantization)' button. Basic t2i maxes out at 5.8GB VRAM. Done right quick!

@Qarqor5555555

RAM = ???

@Qarqor5555555

It needs more than 12 GB of RAM. Can you convert it and store it? Can you upload just the conversion code? Can you upload the converted model? Any solution you can come up with, please. I want to use it with 12 GB of RAM, like the Colab T4.

@iwr-redmond

15.9GB system RAM. Possibly a scooch too much for Colab.
