[Sana] Add Sana, including SanaPipeline, SanaPAGPipeline, LinearAttentionProcessor, Flow-based DPM-Solver and so on.
#9982
# Conflicts: # src/diffusers/models/normalization.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>
@lawrence-cj Awesome, tysm! I will complete the remaining docs and tests and merge soon!
Actually, we support BF16 here: https://huggingface.co/Efficient-Large-Model/Sana_1600M_1024px_BF16_diffusers
yes i am pointing this out merely for anyone who wants something to compare against. for my use case it is because i point the default model for simpletuner to this repo path and can more readily adjust config parameters (eg. scheduler) without impacting other use cases or requiring downstream users remember to pass the custom values in.
i think in the bf16 repository you have the fp32 weights as well; are those a fp32 copy of the bf16 compatible weights? if so that makes sense but otherwise it may confuse users that don't know to pass
Yes. It's just an FP32 copy of the BF16 weights, and I ran it successfully.
without complex human instruction: [image] with: [image] is it possible there is something wrong with the CHI implementation here? it makes all images worse. for example with CHI enabled it's putting 508 tokens of input through the model instead of just 300 (206 from CHI plus the 300 prompt tokens (padded)) and i don't know why we need this many tokens. is it supposed to be 300 total?
What’s your inference code? @bghira
we use
What's your prompt? @bghira
@hlky Would you like to give the changes to schedulers here a review? I'm preparing to merge it shortly after I add the integration tests in the next hour since YiYi has approved and confirmed on Slack. I've tested all the normal models (not the multilingual ones) and they seem to work well (I did the conversions myself when testing, but for the integration tests, I will be using the remote checkpoints and match slices). I have not exhaustively tested all scheduler changes though - only DPMSolverMultistep and FlowMatchEulerDiscrete, but I think that should be okay since it is copied logic (from
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@a-r-r-o-w Scheduler changes look good, thanks
Thank you @lawrence-cj and team! The paper was very insightful and it was very cool to come across the ideas developed.
Thanks for bearing with our reviews too! Will merge the PR once the CI passes
Thank you so much for your effort! Love you guys. I was tied up with other things, sorry for the late reply!
What does this PR do?
This PR adds the official Sana (SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer) to the diffusers library. Sana is the first to make text-to-image generation available in a 32x compressed latent space, powered by DC-AE (https://arxiv.org/abs/2410.10733v1), without performance degradation. Sana also includes several popular efficiency-related techniques, such as a DiT with a linear attention processor, and it uses a decoder-only LLM (Gemma-2B-IT) for low GPU requirements and fast speed.
Paper: https://arxiv.org/abs/2410.10629
Original code repo: https://github.com/NVlabs/Sana
Project: https://nvlabs.github.io/Sana
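To make the two efficiency claims above concrete, here is a minimal pure-Python sketch. It is not the LinearAttentionProcessor code from this PR. It assumes the ReLU-kernel form of linear attention, and all function names are hypothetical. The 8x autoencoder with patch size 2 is only an assumed baseline for comparison. The sketch shows that reordering the product gives the same output as the explicit N x N form, and how 32x compression reduces the token count.

```python
def relu(x):
    return x if x > 0.0 else 0.0


def quadratic_relu_attention(Q, K, V, eps=1e-15):
    """Reference form: builds every query-key weight explicitly, O(N^2)."""
    out = []
    for q in Q:
        weights = [sum(relu(a) * relu(b) for a, b in zip(q, k)) for k in K]
        denom = sum(weights) + eps
        out.append([sum(w * v[j] for w, v in zip(weights, V)) / denom
                    for j in range(len(V[0]))])
    return out


def relu_linear_attention(Q, K, V, eps=1e-15):
    """Linear form: accumulate sum_k relu(k) v^T and sum_k relu(k) once, O(N)."""
    d, dv = len(K[0]), len(V[0])
    kv = [[0.0] * dv for _ in range(d)]   # d x dv summary of keys and values
    k_sum = [0.0] * d                     # normalizer summary
    for k, v in zip(K, V):
        fk = [relu(x) for x in k]
        for i in range(d):
            k_sum[i] += fk[i]
            for j in range(dv):
                kv[i][j] += fk[i] * v[j]
    out = []
    for q in Q:
        fq = [relu(x) for x in q]
        denom = sum(a * b for a, b in zip(fq, k_sum)) + eps
        out.append([sum(fq[i] * kv[i][j] for i in range(d)) / denom
                    for j in range(dv)])
    return out


def latent_tokens(image_size, ae_downsample, patch_size):
    """Number of transformer tokens for a square image (hypothetical helper)."""
    side = image_size // (ae_downsample * patch_size)
    return side * side


if __name__ == "__main__":
    Q = [[1.0, -2.0], [0.5, 3.0]]
    K = [[2.0, 1.0], [-1.0, 4.0]]
    V = [[1.0, 0.0], [0.0, 1.0]]
    print(quadratic_relu_attention(Q, K, V))
    print(relu_linear_attention(Q, K, V))
    # 32x autoencoder with patch size 1 vs. an assumed 8x autoencoder with patch size 2
    print(latent_tokens(1024, 32, 1), latent_tokens(1024, 8, 2))
```

The linear form never materializes the N x N weight matrix, so its cost grows linearly with the number of tokens, which matters most at high resolutions.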
Core contributor of DC-AE:
work with @johnny_ez@163.com
Core library:
We want to collaborate on this PR together with friends from HF. Feel free to contact me here. Cc: @sayakpaul, @yiyixuxu
Images are generated by SanaPAGPipeline with FlowDPMSolverMultistepScheduler.