Add Keras 3 example for "Audio track separation" #2003

johacks · 2024-12-10T15:44:15Z

Hi, I saw "Audio track separation" in the call for contributions, so I implemented an example that separates the vocal track from songs in the MUSDB18 dataset.

Some notes:

- The code has been tested on all three backends (JAX, Torch, TensorFlow).
- I have also uploaded the notebook, run for 3 epochs on each backend with an A100 GPU:
  - JAX: 405 s per epoch on average.
  - Torch: 615 s per epoch on average.
  - TensorFlow: 261 s per epoch on average.
- I can confirm that the convert script runs without errors and the Keras page generates correctly.
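
To put the reported timings side by side, here is a small arithmetic sketch. The per-epoch averages and the epoch count are taken from the notes above; the derived totals cover only the training epochs and say nothing about data loading or compilation overhead, which the notes do not report.

```python
# Average seconds per epoch, as reported in the notes above (A100 GPU).
seconds_per_epoch = {"JAX": 405, "Torch": 615, "TensorFlow": 261}
epochs = 3  # the uploaded notebooks were run for 3 epochs

fastest = min(seconds_per_epoch.values())

# Derived figures: total epoch time and slowdown relative to the fastest backend.
summary = {
    backend: {
        "total_seconds": seconds * epochs,
        "relative_to_fastest": round(seconds / fastest, 2),
    }
    for backend, seconds in seconds_per_epoch.items()
}

for backend, stats in summary.items():
    print(
        f"{backend}: {stats['total_seconds']} s total, "
        f"{stats['relative_to_fastest']}x the fastest backend"
    )
```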

Please let me know if there is anything wrong with the provided script I may have missed.

Thanks!

fchollet

Thanks for the PR! This is excellent work, I enjoyed reading through it 👍

fchollet · 2024-12-15T16:57:14Z

examples/audio/vocal_track_separation.py

+import numpy as np
+import soundfile as sf
+from IPython import display
+from keras import callbacks, layers, ops, optimizers, saving, utils


Only keep layers, ops, callbacks -- they are used many times. However, optimizers, saving, and utils are only used once or twice, so you can just write e.g. keras.optimizers.Adam.

I ended up needing saving many more times in the refactor, so I kept that import and removed the rest.

fchollet · 2024-12-15T16:59:16Z

examples/audio/vocal_track_separation.py

+
+
+@saving.register_keras_serializable()
+class TDF(layers.Layer):


Can there be a more explicit name?

fchollet · 2024-12-15T16:59:21Z

examples/audio/vocal_track_separation.py

+
+
+@saving.register_keras_serializable()
+class TFC(layers.Layer):


fchollet · 2024-12-15T17:00:17Z

examples/audio/vocal_track_separation.py

+else:
+    model = tfc_tdf_net(keras.Input(sample_batch_x.shape[1:]), name="tfc_tdf_net")
+
+model.summary()


Would it be useful to plot the model, or is the result too busy? e.g. keras.utils.plot_model

This is how it would look after a refactoring I've made that groups the TFC_TDF blocks.
It's a little long, but actually simple in structure. It could be refactored further to group the layers into encoder and decoder blocks, but perhaps it won't be as informative.

Ok -- your call on whether to include it.
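
For readers unfamiliar with the utility mentioned here, a rough sketch of a keras.utils.plot_model call follows. The small model is a hypothetical stand-in, not the network from the example, and the output file name is arbitrary. The call depends on the optional pydot and graphviz packages, so the sketch tolerates their absence.

```python
import os
import tempfile

import keras
from keras import layers

# Hypothetical stand-in model; the real example builds a much deeper network.
inputs = keras.Input((64, 64, 2))
x = layers.Conv2D(8, 3, padding="same", activation="relu")(inputs)
outputs = layers.Conv2D(2, 1)(x)
model = keras.Model(inputs, outputs, name="toy_model")

with tempfile.TemporaryDirectory() as tmp_dir:
    try:
        # Writes a PNG diagram; show_shapes adds tensor shapes to each node.
        keras.utils.plot_model(
            model,
            to_file=os.path.join(tmp_dir, "toy_model.png"),
            show_shapes=True,
        )
        plotted = True
    except ImportError:
        # plot_model needs the optional pydot and graphviz dependencies.
        plotted = False
```

Grouping repeated blocks into custom layers, as described in the reply above, keeps such a diagram short because each custom layer is drawn as a single node.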

johacks · 2024-12-15T23:13:37Z

Hi @fchollet,

Thanks for the feedback!

I did a small refactor to group the TFC_TDF and Downsample/Upsample blocks into custom layers so they can be visualized better in a plot. I also renamed symbols to avoid abbreviations and updated some docstrings to reflect the changes.

The code still works on all frameworks after these changes, and the convert script still runs correctly.

fchollet

Thanks for the update -- the changes are looking good! Please add the generated files.

johacks · 2024-12-16T03:25:39Z

Hi again,

I have just pushed the generated files. I also updated the TimeFrequencyTransformBlock layer to use a flat list instead of a list of tuples, because with the latter the weights were not properly tracked when saving the model.

fchollet

LGTM - Thank you for the excellent contribution!

Add example for vocal track separation

b780f4e

github-actions bot assigned sachinprasadhs Dec 10, 2024

fchollet reviewed Dec 15, 2024

johacks added 2 commits December 15, 2024 21:43

Merge branch 'master' into dev_johacks_audio_track_separation

4f6ed4e

Refactor vocal track separation example: model_plot, less abbreviations, better imports

ae60da2

fchollet reviewed Dec 16, 2024

Fix weight saving in refactor. Add autoconvert results

4b09993

fchollet approved these changes Dec 16, 2024

fchollet merged commit 0ddb810 into keras-team:master Dec 16, 2024
1 check passed
