[WIP] Unofficial Implementation of Microsoft's PromptTTS2
Contributions to help complete this project are warmly welcomed! I am currently working on it alone, so I would be very grateful to anyone willing to help.
For future project ideas, please check out:
- PromptTTS2 unofficial implementation
- UTAUTAI: singing voice generation with accompanying music, similar to Suno AI. Inspired by how accurately Suno generates singing voices with accompaniment, we want to work towards an open-source equivalent. The plan combines control via PromptTTS2, automated prompt generation with LLaSA, and VALL-E as a backbone for its strong context awareness, to enable the generation of singing voices with accompanying music.
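This system can be connected to any TTS (Text-to-Speech) backbone. The output representations of the variation network are fed into the TTS backbone through cross-attention: the backbone's hidden states act as queries, and the variation network outputs act as keys and values.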
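
The cross-attention hookup described above can be sketched roughly as follows. This is a minimal, framework-agnostic illustration in NumPy, not this repository's actual code: the function name `cross_attention`, the projection matrices `w_q`, `w_k`, `w_v`, the single attention head, the residual connection, and all tensor sizes are assumptions made for the example, and the weights are random rather than learned.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(hidden, prompt_repr, w_q, w_k, w_v):
    """Single-head cross-attention (hypothetical sketch).

    hidden:      (T, d_model)  hidden states of the TTS backbone (queries)
    prompt_repr: (N, d_prompt) output representations of the variation
                               network (keys and values)
    """
    q = hidden @ w_q        # (T, d_attn)
    k = prompt_repr @ w_k   # (N, d_attn)
    v = prompt_repr @ w_v   # (N, d_model)

    # Scaled dot-product scores between each backbone frame and each
    # prompt representation, normalized over the prompt axis.
    scores = q @ k.T / np.sqrt(q.shape[-1])  # (T, N)
    weights = softmax(scores, axis=-1)

    # Residual connection (an assumption): the backbone states are kept
    # and the attended prompt information is added on top.
    return hidden + weights @ v, weights


# Toy sizes, chosen only for illustration.
rng = np.random.default_rng(0)
T, N, d_model, d_prompt, d_attn = 5, 3, 8, 4, 8

hidden = rng.standard_normal((T, d_model))
prompt_repr = rng.standard_normal((N, d_prompt))
w_q = rng.standard_normal((d_model, d_attn))
w_k = rng.standard_normal((d_prompt, d_attn))
w_v = rng.standard_normal((d_prompt, d_model))

out, weights = cross_attention(hidden, prompt_repr, w_q, w_k, w_v)
print(out.shape, weights.shape)
```

Because the prompt information enters only through the keys and values, the backbone's own sequence length and hidden size are unchanged, which is what makes this kind of conditioning attachable to different backbones.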