How to use

Python 3.7 is recommended.

Clone this repository

git clone https://github.com/CjangCjengh/vits.git

Choose cleaners

Fill in "text_cleaners" in config.json
Edit text/symbols.py
Remove unnecessary imports from text/cleaners.py
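
As a rough illustration of the first step, the sketch below sets "text_cleaners" on a config that has already been loaded as a Python dict. It assumes the key lives under a "data" section and holds a list of cleaner names; both are assumptions about the layout of config.json, so check your own file. The helper `set_text_cleaners` and the cleaner name `japanese_cleaners` are placeholders: use a name that matches a function defined in text/cleaners.py.

```python
import json


def set_text_cleaners(config, cleaners):
    """Return config with data.text_cleaners set to the given cleaner names.

    Assumes (not confirmed by this README) that "text_cleaners" sits in a
    "data" section and is a list of function names from text/cleaners.py.
    """
    data = config.setdefault("data", {})
    data["text_cleaners"] = list(cleaners)
    return config


# Hypothetical minimal config; a real config.json has many more keys.
config = {"data": {"text_cleaners": [], "n_speakers": 0, "cleaned_text": False}}
set_text_cleaners(config, ["japanese_cleaners"])  # placeholder cleaner name
print(json.dumps(config, indent=2))
```

To apply this to a real file, load it with `json.load`, call the helper, and write the result back with `json.dump`.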

Install requirements

pip install -r requirements.txt

Create datasets

Single speaker

Set "n_speakers" to 0 in config.json. Each line of the filelist has the format:

path/to/XXX.wav|transcript

Example

dataset/001.wav|こんにちは。
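
If you already have the audio paths and transcripts in Python, a filelist in this format can be produced with a few lines. This is a minimal sketch, not part of the repository: `single_speaker_lines` is a hypothetical helper, and the rejection of "|" and newlines in transcripts is an extra precaution of the sketch, on the assumption that "|" is the only field separator.

```python
def single_speaker_lines(pairs):
    """Format (wav_path, transcript) pairs as 'path|transcript' lines."""
    lines = []
    for wav_path, transcript in pairs:
        # A "|" or newline inside a field would break the one-line-per-clip format.
        if "|" in wav_path or "|" in transcript or "\n" in transcript:
            raise ValueError(f"field contains '|' or a newline: {wav_path!r}")
        lines.append(f"{wav_path}|{transcript}")
    return lines


lines = single_speaker_lines([("dataset/001.wav", "こんにちは。")])
print("\n".join(lines))
```

Write the resulting lines to the training and validation filelists with UTF-8 encoding so that non-ASCII transcripts are preserved.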

Multiple speakers

Speaker ids should start from 0. Each line of the filelist has the format:

path/to/XXX.wav|speaker id|transcript

Example

dataset/001.wav|0|こんにちは。
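
One way to get ids that start from 0 is to number speakers in order of first appearance. The sketch below assumes you start from (wav path, speaker name, transcript) triples; `multi_speaker_lines` and the speaker names are hypothetical and only illustrate the numbering.

```python
def multi_speaker_lines(items):
    """Format (wav_path, speaker_name, transcript) triples as
    'path|speaker id|transcript' lines.

    Ids are assigned from 0 in order of first appearance. Returns the
    lines and the name-to-id mapping.
    """
    speaker_ids = {}
    lines = []
    for wav_path, speaker, transcript in items:
        # Reuse the id of a known speaker, otherwise take the next free one.
        sid = speaker_ids.setdefault(speaker, len(speaker_ids))
        lines.append(f"{wav_path}|{sid}|{transcript}")
    return lines, speaker_ids


lines, ids = multi_speaker_lines([
    ("dataset/001.wav", "speaker_a", "こんにちは。"),
    ("dataset/002.wav", "speaker_b", "こんばんは。"),
    ("dataset/003.wav", "speaker_a", "おはよう。"),
])
print("\n".join(lines))
print(ids)
```

Keeping the returned mapping is useful later, since inference needs the same id for each speaker that was used in training.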

Preprocess

Once you have run this step, set "cleaned_text" to true in config.json

# Single speaker
python preprocess.py --text_index 1 --filelists path/to/filelist_train.txt path/to/filelist_val.txt

# Multiple speakers
python preprocess.py --text_index 2 --filelists path/to/filelist_train.txt path/to/filelist_val.txt
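
The two --text_index values match the zero-based position of the transcript when a filelist line is split on "|": field 1 for the single-speaker format, field 2 for the multi-speaker format. The sketch below only illustrates that correspondence. It is inferred from the filelist formats above, is not taken from preprocess.py, and `transcript_field` is a hypothetical helper.

```python
def transcript_field(line, text_index):
    """Return the field at text_index from a '|'-separated filelist line."""
    fields = line.rstrip("\n").split("|")
    return fields[text_index]


# Single speaker: path|transcript, so the transcript is field 1.
print(transcript_field("dataset/001.wav|こんにちは。", 1))

# Multiple speakers: path|speaker id|transcript, so the transcript is field 2.
print(transcript_field("dataset/001.wav|0|こんにちは。", 2))
```

Passing the wrong index for your format would select the speaker id or raise an IndexError in this sketch, so it is worth checking that the value matches the filelists you created.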

Build monotonic alignment search

cd monotonic_align
python setup.py build_ext --inplace
cd ..

Train

# Single speaker
python train.py -c <config> -m <folder>

# Multiple speakers
python train_ms.py -c <config> -m <folder>

Inference

Online

See inference.ipynb

Offline

See MoeGoe

Running in Docker

docker run -itd --gpus all --name "Container name" -e NVIDIA_DRIVER_CAPABILITIES=compute,utility -e NVIDIA_VISIBLE_DEVICES=all "Image name"
