mels_mode generation #36

Biyani404198 · 2024-03-04T06:13:15Z

Hi,
I have created TextGrid files in the subfolder textgrids using MFA.
I'm having trouble getting the average voice mel-spectrograms into the subfolder mels_mode.
I'm using the get_avg_mels.ipynb Jupyter notebook to compute them.
It generates a mels_mode dictionary with phonemes as keys, but there are no further instructions on how to map these to speakers and create the mels_mode subfolder from this dictionary.
@ivanvovk @ytyeung @wenyong-h @huawei-noah-admin @zhangjiajin2 Please help.

for p in phoneme_list:
    mels_mode[p] = mode(np.asarray(mels_mode_dict[p]), 0).mode[0]
    lens[p] = np.mean(np.asarray(lens_dict[p]))


li1jkdaw · 2024-08-23T16:53:18Z

Basically, for each .wav audio file you know which frame corresponds to which phoneme: you can extract this from the TextGrid file by calculating start_frame and end_frame, as in get_avg_mels.ipynb. Then, for each frame, replace the mel feature in the _mel.npy file with the average feature of the corresponding phoneme. The mels_mode dictionary contains the mapping {phoneme: its average mel feature}.
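The procedure described above could be sketched roughly as follows. This is only an illustration, not the repository's code: the helper names `intervals_to_frames` and `build_avg_mel` are hypothetical, the mel array is assumed to have shape (n_mels, n_frames), and the sample rate and hop length defaults are assumptions that should be matched to whatever get_avg_mels.ipynb uses. The phoneme intervals are assumed to have already been read from the TextGrid as (phoneme, start_sec, end_sec) tuples.

```python
import numpy as np


def intervals_to_frames(intervals, n_frames, sample_rate=22050, hop_length=256):
    """Convert (phoneme, start_sec, end_sec) into (phoneme, start_frame, end_frame).

    sample_rate and hop_length are assumed defaults; use the values from
    your own feature extraction.
    """
    spans = []
    for phoneme, start_sec, end_sec in intervals:
        start_frame = int(np.floor(start_sec * sample_rate / hop_length))
        end_frame = int(np.ceil(end_sec * sample_rate / hop_length))
        # Clip so the span never runs past the end of the spectrogram.
        spans.append((phoneme, start_frame, min(end_frame, n_frames)))
    return spans


def build_avg_mel(mel, spans, mels_mode):
    """Return a copy of mel where every frame inside a phoneme span is
    replaced by that phoneme's average feature from mels_mode.

    mel is assumed to be (n_mels, n_frames); mels_mode maps
    phoneme -> vector of length n_mels. Frames not covered by any span,
    or whose phoneme is missing from mels_mode, are left unchanged.
    """
    avg_mel = np.array(mel, dtype=float, copy=True)
    for phoneme, start, end in spans:
        if phoneme in mels_mode:
            # Broadcast the (n_mels,) vector across all frames of the span.
            avg_mel[:, start:end] = np.asarray(mels_mode[phoneme])[:, None]
    return avg_mel


# Toy usage with synthetic data (no real audio or TextGrid involved).
toy_mel = np.zeros((80, 100))
toy_mels_mode = {"AH": np.full(80, -1.0), "T": np.full(80, -2.0)}
toy_intervals = [("AH", 0.0, 0.5), ("T", 0.5, 1.0)]
toy_spans = intervals_to_frames(toy_intervals, n_frames=toy_mel.shape[1])
toy_avg_mel = build_avg_mel(toy_mel, toy_spans, toy_mels_mode)
```

The resulting array can then be written with `np.save` under the mels_mode subfolder, one file per utterance. The exact file naming and per-speaker layout expected by the data loader is not stated in this thread, so check it against the repository's dataset code.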
