How can I change the duration of the output audio? #21

ChloeL19 · 2023-06-13T00:19:19Z

Hi, this is a really lovely repository. But how can I change the duration of the generated audio?
Thanks!!

ChloeL19 · 2023-06-13T01:51:23Z

Okay, so I just kind of forced a different latent embedding size. I wanted one second of output, so I divided the original latent dimension (256) by 10 and rounded up, which gives 26.

def prepare_latents(self, batch_size, inference_scheduler, num_channels_latents, dtype, device):
    # EDIT: the latent size is hardcoded to 256 here! I want to change this!
    # Original: shape = (batch_size, num_channels_latents, 256, 16)
    shape = (batch_size, num_channels_latents, 26, 16)  # ceil(256 / 10), scaled to one second???

Indeed, the inference script now outputs audio files that are 1 second long. Is this... okay??
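
For reference, the divide-and-round-up arithmetic can be written as a small helper. This is only a sketch: it assumes, as dividing by 10 implies, that the hardcoded 256 latent frames correspond to roughly 10 seconds of audio, and that the latent length scales linearly with duration. The names `BASE_LATENT_FRAMES`, `BASE_DURATION_S`, `latent_frames_for_duration`, and `latent_shape` are hypothetical and not part of the repository.

```python
import math

BASE_LATENT_FRAMES = 256  # hardcoded time dimension seen in prepare_latents
BASE_DURATION_S = 10.0    # ASSUMPTION: 256 frames ~ 10 s, implied by the divide-by-10
LATENT_FREQ_BINS = 16     # last dimension of the shape in prepare_latents


def latent_frames_for_duration(duration_s: float) -> int:
    """Scale the latent time dimension linearly with duration, rounding up."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return math.ceil(BASE_LATENT_FRAMES * duration_s / BASE_DURATION_S)


def latent_shape(batch_size: int, num_channels_latents: int, duration_s: float) -> tuple:
    """Build the latent shape for a requested duration (hypothetical helper)."""
    frames = latent_frames_for_duration(duration_s)
    return (batch_size, num_channels_latents, frames, LATENT_FREQ_BINS)
```

With these assumptions, `latent_frames_for_duration(1.0)` gives the same 26 used above. Whether the model produces good audio at lengths it was not trained on is a separate question that this helper does not answer.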

ChloeL19 · 2023-06-13T02:05:01Z

I suppose duration could be introduced as a training argument, saved as part of the training config, and then used at inference to set the length of the generated audio...
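
A rough sketch of that idea, not the repository's actual config handling: a config object carries a duration field, is serialized alongside the other training settings, and is read back at inference to size the latents. `TrainConfig`, `duration_s`, `config_to_json`, `config_from_json`, and `latent_frames_from_config` are all hypothetical names, and the 256-frames-per-10-seconds ratio is an assumption taken from the workaround above.

```python
import json
import math
from dataclasses import asdict, dataclass


@dataclass
class TrainConfig:
    # Hypothetical field: target clip length in seconds, fixed at training time.
    duration_s: float = 10.0


def config_to_json(cfg: TrainConfig) -> str:
    """Serialize the config so the duration is stored with the checkpoint."""
    return json.dumps(asdict(cfg))


def config_from_json(text: str) -> TrainConfig:
    """Restore the config at inference time."""
    return TrainConfig(**json.loads(text))


def latent_frames_from_config(cfg: TrainConfig) -> int:
    # ASSUMPTION: 256 latent frames correspond to about 10 seconds of audio.
    return math.ceil(256 * cfg.duration_s / 10.0)
```

At inference, the script would load the saved config and pass `latent_frames_from_config(cfg)` in place of the hardcoded 256.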

cvillela · 2023-07-03T21:15:09Z

Would really like an audio sample duration feature as well!
