Unofficial implementation of JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models (https://arxiv.org/abs/2308.04729)
git clone https://github.com/0417keito/JEN-1-pytorch.git
cd JEN-1-pytorch
pip install -r requirements.txt
import torch
from generation import Jen1
# Path to a trained JEN-1 checkpoint
ckpt_path = 'your ckpt path'
jen1 = Jen1(ckpt_path)
# Generate audio samples from a text prompt
prompt = 'a beautiful song'
samples = jen1.generate(prompt)
torchrun train.py
Metadata is stored in JSON format. Each JSON file must have the same base name as its target music file (for example, music1.json for music1.wav).
{"prompt": "a beautiful song"}
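As a minimal sketch of the naming rule above, the following helper writes a prompt file whose base name matches the audio file. The function name `write_prompt_metadata` and its arguments are hypothetical and not part of this repo; it only assumes the `{"prompt": ...}` format shown above.

```python
import json
from pathlib import Path


def write_prompt_metadata(audio_path, prompt, metadata_dir):
    """Write {"prompt": ...} to <metadata_dir>/<audio base name>.json.

    Hypothetical helper (not part of this repo). Returns the path written.
    """
    metadata_dir = Path(metadata_dir)
    metadata_dir.mkdir(parents=True, exist_ok=True)
    # Reuse the audio file's base name so the pair can be matched later.
    json_path = metadata_dir / (Path(audio_path).stem + ".json")
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump({"prompt": prompt}, f, ensure_ascii=False)
    return json_path
```

For example, `write_prompt_metadata("audios/music1.wav", "a beautiful song", "metadata")` would create `metadata/music1.json`.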
How should the dataset directory (data_dir) be structured?
'''
dataset_dir
├── audios
│   ├── music1.wav
│   ├── music2.wav
│   ├── ...
│   └── music{n}.wav
└── metadata
    ├── music1.json
    ├── music2.json
    ├── ...
    └── music{n}.json
'''
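Before training, it can help to confirm that every audio file has a matching metadata file. The sketch below checks the layout described above; the function name `find_dataset_problems` is hypothetical and not part of this repo, and it assumes `.wav` audio files and a string `"prompt"` field, as in the examples here.

```python
import json
from pathlib import Path


def find_dataset_problems(dataset_dir):
    """Return a list of problems found in the audios/ and metadata/ layout.

    Hypothetical helper (not part of this repo). An empty list means every
    .wav file has a same-named .json file with a string "prompt" field.
    """
    root = Path(dataset_dir)
    audio_stems = {p.stem for p in (root / "audios").glob("*.wav")}
    meta_paths = {p.stem: p for p in (root / "metadata").glob("*.json")}
    meta_stems = set(meta_paths)

    problems = []
    for stem in sorted(audio_stems - meta_stems):
        problems.append(f"{stem}.wav has no matching metadata/{stem}.json")
    for stem in sorted(meta_stems - audio_stems):
        problems.append(f"{stem}.json has no matching audios/{stem}.wav")
    for stem in sorted(audio_stems & meta_stems):
        try:
            data = json.loads(meta_paths[stem].read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            problems.append(f"{stem}.json is not valid JSON")
            continue
        if not isinstance(data, dict) or not isinstance(data.get("prompt"), str):
            problems.append(f'{stem}.json lacks a string "prompt" field')
    return problems
```

Calling `find_dataset_problems("dataset_dir")` and printing the result gives a quick list of files to fix.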
For configuration options, please see config.py and conditioner_config.py.
- Extension to JEN-1-Composer
- Extension to music generation with singing voice
- Adaptation of Consistency Model
- The paper uses a Diffusion Autoencoder, but because my computing resources were limited, I used EnCodec instead. If resources allow, I will implement the Diffusion Autoencoder.
Coming soon!
Dr Adam Fils - Thank you for providing the GPU. I really appreciate Adam giving me this opportunity.
If you find this repo interesting and useful, give us a ⭐️ on GitHub! It encourages us to keep improving the model and adding exciting features. Please report any problems by opening an issue.
Contributions are always welcome.