[Help]: Clarification on VALL-E and VALL-E V2 in Amphion Repository #370

Open
nngocson2002 opened this issue Dec 13, 2024 · 4 comments

Comments

@nngocson2002

Upon reviewing the repository, I noticed it contains two versions: VALL-E and VALL-E V2. Could you kindly clarify if both implementations correspond to the same paper (paper link)? Or does VALL-E correspond to this paper (paper link) and VALL-E V2 to this one (paper link)?

Additionally, I am attempting to train VALL-E from scratch using the egs/tts/VALLE directory. Specifically, during the AR training step, I noticed that the train_stage and ar_model_ckpt_dir arguments mentioned in the egs/tts/VALLE/run.sh file are not reflected in the bins/tts/train.py script. Could you provide guidance on resolving this mismatch?

@lmxue
Collaborator

lmxue commented Dec 14, 2024

Hi @nngocson2002, thanks for your email and for bringing up this issue.
As mentioned in the READMEs of VALL-E and VALL-E V2, VALL-E is the implementation of the paper, while VALLE_V2 includes improvements on top of vanilla VALL-E.
Additionally, train_stage and ar_model_ckpt_dir are defined in the valle_train.py file.

@nngocson2002
Author

So, if I want to reproduce VALL-E, I just need to choose either VALL-E or VALL-E V2, right? I initially thought these repositories were based on different papers, so I assumed I would need to reproduce both.

@jiaqili3
Collaborator

So, if I want to reproduce VALL-E, I just need to choose either VALL-E or VALL-E V2, right? I initially thought these repositories were based on different papers, so I assumed I would need to reproduce both.

Yes. If you want to reproduce VALL-E, I suggest using VALL-E (v1), which is the vanilla implementation.

@nngocson2002
Author

Thank you for your clarification!

One point I’d like to discuss: as I mentioned, when I attempt to run the training from scratch (as shown in the image below), it executes the bins/tts/train.py script. This results in the train_stage and ar_model_ckpt_dir arguments being unrecognized.
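To make the symptom concrete, here is a minimal sketch of how the same flags can be rejected by one entry point and accepted by another. It assumes both scripts parse their options with Python's argparse, which I have not verified; the two builder functions, the --config and --exp_name options, and the argument values are hypothetical stand-ins, not the actual contents of bins/tts/train.py or valle_train.py.

```python
import argparse


def build_generic_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for a shared entry point that knows
    # nothing about VALL-E's staged training.
    parser = argparse.ArgumentParser(prog="train.py")
    parser.add_argument("--config", type=str, required=True)
    parser.add_argument("--exp_name", type=str, default="exp")
    return parser


def build_stage_aware_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for a model-specific entry point that also
    # declares the two stage-related options (types are assumptions).
    parser = build_generic_parser()
    parser.prog = "valle_train.py"
    parser.add_argument("--train_stage", type=int, default=1)
    parser.add_argument("--ar_model_ckpt_dir", type=str, default=None)
    return parser


# Illustrative command line; the values are made up.
argv = [
    "--config", "exp_config.json",
    "--train_stage", "1",
    "--ar_model_ckpt_dir", "ckpts/ar",
]

# parse_known_args() returns the leftovers instead of exiting, which
# shows exactly which flags the generic parser does not recognize.
known, unknown = build_generic_parser().parse_known_args(argv)
print("generic parser leaves unparsed:", unknown)

# The stage-aware parser accepts the same command line.
args = build_stage_aware_parser().parse_args(argv)
print("stage-aware parser:", args.train_stage, args.ar_model_ckpt_dir)
```

If this is what is happening, the generic parser would report the two flags as unrecognized under plain parse_args(), which matches the error I am seeing.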

Additionally, could you provide the structure of the JSON file referenced in egs/tts/VALLE/exp_config.json? I noticed that the preprocessing script reads this JSON file, but I couldn't determine its structure.

I would greatly appreciate your guidance.

Image
