Japanese sounds unnatural #214

Open
michaellin99999 opened this issue Nov 19, 2024 · 3 comments

Comments

@michaellin99999

I have combined the phoneme sets for all three languages (English, Chinese, and Japanese) and started fine-tuning on a dataset comprising speech in all three languages.
The base model I use is the Chinese and English base.
However, after 500 epochs, Chinese is good and English is good, but Japanese sounds unnatural.
My understanding is that the phonemes are correct, but the tone is just not how Japanese is spoken.
What can I do to improve this?

Here is a sample of the Japanese output: https://soundcloud.com/michael-lin-674069136/japanese-test
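
For illustration, combining the phoneme sets might look roughly like the sketch below. The symbol lists and the `merge_phoneme_sets` helper are hypothetical placeholders, not the project's actual tables or API. The sketch assumes new symbols are appended after the base set, so the IDs the Chinese and English base model already knows stay unchanged.

```python
def merge_phoneme_sets(base_symbols, *extra_sets):
    """Append symbols not already present, preserving the base order."""
    merged = list(base_symbols)
    seen = set(merged)
    for symbols in extra_sets:
        for symbol in symbols:
            if symbol not in seen:
                seen.add(symbol)
                merged.append(symbol)
    return merged


# Hypothetical, heavily truncated symbol lists, for illustration only.
base_zh_en = ["_", "a", "i", "u", "sh", "ng"]
japanese = ["a", "i", "u", "e", "o", "ts", "N"]

symbols = merge_phoneme_sets(base_zh_en, japanese)
symbol_to_id = {symbol: index for index, symbol in enumerate(symbols)}
```

Shared symbols such as `a` keep their base ID here, while Japanese-only symbols get new IDs at the end of the table.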

@eliteexod

Are you using it on Docker?

@michaellin99999
Author

I have tried both Docker and ONNX Runtime; both sound like this.

@baishouwujianfei

Hello, may I ask which pre-trained model you used for fine-tuning? How long did you train? How is the config set up? The model I trained cannot produce complete sentences, and the speech is very strange.


3 participants