Japanese sounds unnatural #214

michaellin99999 · 2024-11-19T07:31:16Z

I have combined the phoneme sets for all three languages (English, Chinese, and Japanese) and started fine-tuning on a dataset comprising speech in all three languages.
The base model I use is the Chinese and English base.
However, after 500 epochs, Chinese sounds good and English sounds good, but Japanese sounds unnatural.
My understanding is that the phonemes are correct, but the tone is just not how Japanese is spoken.
What can I do to improve this?

Here is a sample of the Japanese output: https://soundcloud.com/michael-lin-674069136/japanese-test

eliteexod · 2024-11-19T17:14:38Z

Are you using it on Docker?

michaellin99999 · 2024-11-19T17:38:17Z

I have tried it on Docker and also on ONNX Runtime; both sound like this.

baishouwujianfei · 2024-12-06T06:03:06Z

Hello, may I ask which pre-trained model you used for fine-tuning? How long did you train? How is the config set up? The model I trained cannot produce complete sentences, and the speech is very strange.
