MindONE

This repository contains SoTA algorithms, models, and interesting projects in the area of multimodal understanding and content generation.

ONE is short for "ONE for all"

News

  • [2024.11.06] MindONE v0.2.0 is released

Quick tour

To install MindONE v0.2.0, first install MindSpore 2.3.1, then run: pip install mindone

Alternatively, to install the latest version from the master branch, run:

git clone https://github.com/mindspore-lab/mindone.git
cd mindone
pip install -e .
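
After either install route, it can help to confirm which versions actually ended up in the environment. The sketch below uses only the Python standard library. The `EXPECTED` table and both helper names are illustrative and not part of MindONE; the MindSpore pin is the one named in the v0.2.0 instructions above.

```python
from importlib import metadata

# Versions named in the install instructions; None means "any version".
# This table is an assumption for illustration, adjust it to your setup.
EXPECTED = {"mindspore": "2.3.1", "mindone": None}


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_versions(expected=EXPECTED):
    """Map each package name to (installed_version, expected_version, ok)."""
    report = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        ok = found is not None and (wanted is None or found == wanted)
        report[name] = (found, wanted, ok)
    return report


for name, (found, wanted, ok) in check_versions().items():
    status = "ok" if ok else "check"
    print(f"{name}: installed={found}, expected={wanted or 'any'} [{status}]")
```

A package that is missing is reported with `installed=None`, so the script can run before installation as well as after it.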

We support state-of-the-art diffusion models for generating images, audio, and video. Let's get started using Stable Diffusion 3 as an example.

Hello MindSpore from Stable Diffusion 3!

import mindspore
from mindone.diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    mindspore_dtype=mindspore.float16,
)
prompt = "A cat holding a sign that says 'Hello MindSpore'"
image = pipe(prompt)[0][0]
image.save("sd3.png")
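
To generate several images with the same pipeline, the call above can be wrapped in a small loop. This is a minimal sketch, not part of MindONE: `generate_images` is a hypothetical helper, and it relies only on the behavior shown above, namely that `pipe(prompt)[0][0]` is an image object with a `save` method. Any extra keyword arguments are forwarded to the pipeline untouched, so which ones are accepted depends on the pipeline itself.

```python
from pathlib import Path


def generate_images(pipe, prompts, out_dir, **call_kwargs):
    """Run `pipe` once per prompt and save each result as a PNG file.

    `pipe` is any callable that behaves like the pipeline in the example
    above: pipe(prompt, ...)[0][0] must be an object with a .save(path) method.
    Extra keyword arguments are passed through to the pipeline call unchanged.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, prompt in enumerate(prompts):
        image = pipe(prompt, **call_kwargs)[0][0]
        path = out_dir / f"sd3_{index:03d}.png"
        image.save(path)
        saved.append(path)
    return saved


# Hypothetical usage with the `pipe` object created above.
# `num_inference_steps` is assumed to be accepted as in hf diffusers.
# paths = generate_images(
#     pipe,
#     ["A cat holding a sign that says 'Hello MindSpore'"],
#     "outputs",
#     num_inference_steps=28,
# )
```

Taking the pipeline as an argument means the weights are loaded once and reused for every prompt.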

supported models under mindone/examples

| model | features |
| --- | --- |
| cambrian | working on it |
| minicpm-v | working on v2.6 |
| internvl | working on v1.0, v1.5, v2.0 |
| llava | working on llava 1.5 & 1.6 |
| vila | working on it |
| pllava | working on it |
| hpcai open sora | support v1.0/1.1/1.2 large scale training with dp/sp/zero |
| open sora plan | support v1.0/1.1/1.2 large scale training with dp/sp/zero |
| stable diffusion | support sd 1.5/2.0/2.1, vanilla fine-tune, lora, dreambooth, textual inversion |
| stable diffusion xl | support sai style (Stability AI) vanilla fine-tune, lora, dreambooth |
| dit | support text to image fine-tune |
| latte | support unconditional text to image fine-tune |
| animate diff | support motion module and lora training |
| video composer | support conditional video generation with motion transfer, etc. |
| ip adapter | refactoring |
| t2i-adapter | refactoring |
| dynamicrafter | support image to video generation |
| hunyuan_dit | support text to image fine-tune |
| pixart_sigma | support text to image fine-tune at different aspect ratios |

run hf diffusers on mindspore

mindone diffusers is under active development. Most tasks were tested with MindSpore 2.3.1 on Ascend 910 hardware.

| component | features |
| --- | --- |
| pipeline | support 30+ text2image, text2video, text2audio tasks |
| models | support autoencoder & transformer base models, same as hf diffusers |
| schedulers | support 10+ schedulers such as ddpm & dpm solver, same as hf diffusers |
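
Since the components are described as the same as hf diffusers, the usual diffusers pattern of rebuilding a scheduler from the current scheduler's config should carry over. The sketch below shows that pattern in isolation. `swap_scheduler` is a hypothetical helper, not a MindONE function. It assumes `scheduler.config` and `from_config` behave as they do in hf diffusers, and the class name in the usage comment is assumed to be exported by mindone.diffusers.

```python
def swap_scheduler(pipe, scheduler_cls, **overrides):
    """Replace `pipe.scheduler` with a `scheduler_cls` instance.

    The new scheduler is built from the current scheduler's config, so
    shared settings carry over; `overrides` are forwarded to `from_config`.
    Returns the same pipeline object for convenience.
    """
    pipe.scheduler = scheduler_cls.from_config(pipe.scheduler.config, **overrides)
    return pipe


# Hypothetical usage (assumes this class exists in mindone.diffusers and
# that the pipeline's model is compatible with the chosen scheduler):
# from mindone.diffusers import DPMSolverMultistepScheduler
# pipe = swap_scheduler(pipe, DPMSolverMultistepScheduler)
```

Not every scheduler suits every model, so check the original model card before swapping.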

TODO

  • adapt to hf diffusers version 0.30.0