With trained model weights, we can now run text-to-speech synthesis with our generated voice model. The synthesis script takes the following parameters:
- model_path: The path to your generated model
- vocoder_model_path: The path to your vocoder model (default hifigan model found here)
- hifigan_config_path: The path to your hifigan config (default hifigan config found here)
- text: Text you wish to synthesize
- graph_output_path (optional): Path to save alignment graph to
- audio_output_path (optional): Path to save generated audio to
- silence_padding (optional): Seconds of silence inserted between clips in multi-line synthesis (default is 0.15)
- sample_rate (optional): Audio sample rate (default is 22050)
- max_decoder_steps (optional): Maximum number of decoder steps, which limits sequence length and memory usage during inference. Increasing it uses more memory but may allow longer sentences (default is 1000)

Example:

python synthesize.py -m checkpoint_500000 -vm g_02500000 -hc config.json -t "Hello everyone, how are you?" -g graph.png -a audio.wav
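
To make the interaction between silence_padding and sample_rate concrete, here is a minimal sketch of how clips from multi-line synthesis could be joined with a fixed gap of silence. It is an illustration only, not the code in synthesize.py: the function name `join_clips_with_silence` is hypothetical, and clips are assumed to be plain lists of float samples.

```python
def join_clips_with_silence(clips, sample_rate=22050, silence_padding=0.15):
    """Concatenate clips, inserting silence between consecutive clips.

    Hypothetical helper for illustration; clips are lists of float samples.
    """
    # Number of zero-valued samples that make up one gap of silence.
    gap = [0.0] * int(sample_rate * silence_padding)

    joined = []
    for index, clip in enumerate(clips):
        if index > 0:
            # Silence goes between clips only, not before the first one.
            joined.extend(gap)
        joined.extend(clip)
    return joined


if __name__ == "__main__":
    # Two tiny stand-in "clips" at a toy sample rate of 4 Hz.
    audio = join_clips_with_silence([[1.0, 1.0], [2.0]], sample_rate=4, silence_padding=0.5)
    print(audio)
```

The gap length scales with both values, so raising the sample rate without changing silence_padding keeps the pause the same length in seconds.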