This repository contains SoTA algorithms, models, and interesting projects in the area of multimodal understanding and content generation.
ONE is short for "ONE for all".
- [2024.11.06] MindONE v0.2.0 is released
To install MindONE v0.2.0, please install MindSpore 2.3.1 and then run:

```bash
pip install mindone
```

Alternatively, to install the latest version from the master branch, please run:

```bash
git clone https://github.com/mindspore-lab/mindone.git
cd mindone
pip install -e .
```
We support state-of-the-art diffusion models for generating images, audio, and video. Let's get started using Stable Diffusion 3 as an example.
Hello MindSpore from Stable Diffusion 3!
```python
import mindspore
from mindone.diffusers import StableDiffusion3Pipeline

# Load the Stable Diffusion 3 medium checkpoint in float16.
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    mindspore_dtype=mindspore.float16,
)

prompt = "A cat holding a sign that says 'Hello MindSpore'"
# Take the first generated image from the pipeline output.
image = pipe(prompt)[0][0]
image.save("sd3.png")
```
model | features |
---|---|
cambrian | working on it |
minicpm-v | working on v2.6 |
internvl | working on v1.0, v1.5, and v2.0 |
llava | working on LLaVA 1.5 & 1.6 |
vila | working on it |
pllava | working on it |
hpcai open sora | supports v1.0/1.1/1.2 large-scale training with DP/SP/ZeRO |
open sora plan | supports v1.0/1.1/1.2 large-scale training with DP/SP/ZeRO |
stable diffusion | supports SD 1.5/2.0/2.1, vanilla fine-tuning, LoRA, DreamBooth, and Textual Inversion |
stable diffusion xl | supports SAI-style (Stability AI) vanilla fine-tuning, LoRA, and DreamBooth |
dit | supports text-to-image fine-tuning |
latte | supports unconditional text-to-image fine-tuning |
animate diff | supports motion module and LoRA training |
video composer | supports conditional video generation with motion transfer, etc. |
ip adapter | refactoring |
t2i-adapter | refactoring |
dynamicrafter | supports image-to-video generation |
hunyuan_dit | supports text-to-image fine-tuning |
pixart_sigma | supports text-to-image fine-tuning at different aspect ratios |
MindONE diffusers is under active development; most tasks have been tested with MindSpore 2.3.1 on Ascend 910 hardware. See the usage sketch below for how the components fit together.
component | features |
---|---|
pipeline | supports 30+ text-to-image, text-to-video, and text-to-audio tasks |
models | supports autoencoder & transformer base models, same as HF diffusers |
schedulers | supports 10+ schedulers such as DDPM & DPM-Solver, same as HF diffusers |
- HF diffusers 0.30.0 version adaptation
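
Since `mindone.diffusers` keeps the same pipeline/model/scheduler interfaces as HF diffusers, components can be swapped in the familiar way. The sketch below replaces a pipeline's default scheduler with DPM-Solver; the class names (`StableDiffusionPipeline`, `DPMSolverMultistepScheduler`) and the `runwayml/stable-diffusion-v1-5` checkpoint are illustrative choices and may need adjusting to your environment.

```python
# A minimal sketch, assuming mindone.diffusers mirrors the HF diffusers API;
# the pipeline/scheduler classes and checkpoint name below are illustrative.
import mindspore
from mindone.diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

# Load a Stable Diffusion 1.5 pipeline in float16.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    mindspore_dtype=mindspore.float16,
)

# Swap the default scheduler for DPM-Solver, reusing its configuration,
# in the same way one would with HF diffusers.
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

# Generate and save an image; the output indexing follows the quickstart example above.
image = pipe("A mountain lake at sunrise")[0][0]
image.save("sd15_dpmsolver.png")
```

The same pattern applies to the other pipelines and schedulers listed in the table above.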