Riffusion v0.3 #48

hmartiro · 2022-12-27T17:14:16Z

Maintainer

🖊️ Full Rewrite

This release contains a full rewrite of the Riffusion codebase to go from a hack to a quality software project.

Rename the repository from riffusion-inference to riffusion.
Add a SpectrogramParams class that contains all conversion parameters, with sane defaults.
Add a SpectrogramConverter class that converts between spectrogram tensors and audio.
Add a SpectrogramImageConverter class that converts between spectrogram images and audio.
Leverage pydub AudioSegment in more places rather than raw numpy arrays.
Move common code into the util package.
Cache more computation and be more careful about error checking.
Move third party integrations into the integrations package. They now share most of their code, which greatly simplifies them.
Add pyproject.toml for tool configuration.
Overhaul the README with more descriptive instructions.

🚨 This release is API-compatible with the web app, but code that used this repository directly will need to be updated.

👩‍💻 Riffusion CLI

Extensible command line interface for performing common tasks. See the README for details.

$ python -m riffusion.cli -h
usage: cli.py [-h] {audio-to-image,image-to-audio,sample-clips,print-exif} ...

positional arguments:
  {audio-to-image,image-to-audio,sample-clips,print-exif}
    audio-to-image      Compute a spectrogram image from a waveform.
    image-to-audio      Reconstruct an audio clip from a spectrogram image.
    sample-clips        Slice an audio file into clips of the given duration.
    print-exif          Print the params of a spectrogram image as saved in the exif data.

options:
  -h, --help            show this help message and exit
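
As a rough illustration of how a subcommand-based CLI like this can be laid out, here is a minimal argparse sketch. It is not the project's actual cli.py: the function name build_parser and the COMMANDS table are hypothetical, and only the subcommand names, their help strings, and the --image option of print-exif are taken from these release notes.

```python
import argparse

# Subcommand names and help strings copied from the help output above.
COMMANDS = {
    "audio-to-image": "Compute a spectrogram image from a waveform.",
    "image-to-audio": "Reconstruct an audio clip from a spectrogram image.",
    "sample-clips": "Slice an audio file into clips of the given duration.",
    "print-exif": "Print the params of a spectrogram image as saved in the exif data.",
}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subparser per command (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="cli.py")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name, help_text in COMMANDS.items():
        sub = subparsers.add_parser(name, help=help_text)
        if name == "print-exif":
            # --image appears in the print-exif example in these notes;
            # options for the other commands are deliberately left out.
            sub.add_argument("--image", required=True)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.command)
```

With this layout, adding a command amounts to adding one entry to the table plus its own arguments, which is one way to keep a CLI extensible.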

🤾‍♂️ Riffusion Playground

Extensible Streamlit app for interactive exploration of Riffusion. See the README for details.

🔥 MPS and CPU Backends

Riffusion can now run on MPS and CPU backends in addition to CUDA. See the README for details.

This release also adds graceful device detection and fallback.
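
The fallback idea can be sketched as a small pure function. This is an assumption about the general approach, not Riffusion's code: pick_device is a hypothetical name and the fallback order (cuda, then mps, then cpu) is chosen for illustration. In a PyTorch program the two availability flags would typically come from torch.cuda.is_available() and torch.backends.mps.is_available().

```python
def pick_device(requested: str, cuda_available: bool, mps_available: bool) -> str:
    """Return the requested device if usable, else fall back (hypothetical helper)."""
    available = {"cuda": cuda_available, "mps": mps_available, "cpu": True}
    if requested not in available:
        raise ValueError(f"Unknown device: {requested!r}")
    if available[requested]:
        return requested
    # Assumed fallback order: prefer an accelerator, end on CPU.
    for candidate in ("cuda", "mps"):
        if available[candidate]:
            return candidate
    return "cpu"


if __name__ == "__main__":
    # A machine with no CUDA but with MPS falls back to "mps".
    print(pick_device("cuda", cuda_available=False, mps_available=True))
```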

Closes: #15

👓 Stereo Spectrograms

Add tools to encode and decode stereo audio as spectrograms, using the G and B channels for left and right.
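
To make the channel layout concrete, here is a minimal numpy sketch that packs two single-channel spectrogram images into the G and B channels of an RGB array and unpacks them again. The function names are hypothetical, and leaving the R channel at zero is an assumption; the notes only state that G and B carry left and right.

```python
import numpy as np


def pack_stereo(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Pack two (H, W) uint8 spectrogram images into one (H, W, 3) RGB array."""
    if left.shape != right.shape:
        raise ValueError("left and right must have the same shape")
    image = np.zeros((*left.shape, 3), dtype=np.uint8)
    image[..., 1] = left   # G channel: left
    image[..., 2] = right  # B channel: right
    # R channel stays zero here (an assumption, not taken from the notes).
    return image


def unpack_stereo(image: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Recover the (left, right) images from the G and B channels."""
    return image[..., 1].copy(), image[..., 2].copy()


if __name__ == "__main__":
    left = np.full((4, 6), 10, dtype=np.uint8)
    right = np.full((4, 6), 200, dtype=np.uint8)
    print(pack_stereo(left, right).shape)
```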

🖼️ Encode Spectrogram Params in Image EXIF

Add the ability to store spectrogram conversion parameters in EXIF metadata of the images, and the ability to decode back to audio from those params. This allows more flexibility for usage without assuming default parameters.

The SpectrogramParams class has methods to convert to and from EXIF.

$ python -m riffusion.cli print-exif --image spectrogram.jpg
NUM_FREQUENCIES      =             512
STEP_SIZE_MS         =              10
MAX_VALUE            =      46801012.0
MIN_FREQUENCY        =               0
WINDOW_DURATION_MS   =             100
MAX_FREQUENCY        =           10000
PADDED_DURATION_MS   =             400
SAMPLE_RATE          =           44100
STEREO               =               1
POWER_FOR_IMAGE      =            0.25
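
The round trip between a params object and image metadata can be sketched with a dataclass and a plain dict standing in for the EXIF tags. This is an illustration only: ParamsSketch and its method names are hypothetical and are not the SpectrogramParams API, actual EXIF reading and writing is omitted, and the default values are copied from the sample output above purely as example data.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ParamsSketch:
    """Hypothetical stand-in for a spectrogram params class."""

    num_frequencies: int = 512
    step_size_ms: int = 10
    max_value: float = 46801012.0
    min_frequency: int = 0
    window_duration_ms: int = 100
    max_frequency: int = 10000
    padded_duration_ms: int = 400
    sample_rate: int = 44100
    stereo: bool = True
    power_for_image: float = 0.25

    def to_tags(self) -> dict:
        """Serialize to upper-case tag names, as in the print-exif output."""
        return {key.upper(): value for key, value in asdict(self).items()}

    @classmethod
    def from_tags(cls, tags: dict) -> "ParamsSketch":
        """Rebuild the params from a tag dict produced by to_tags."""
        return cls(**{key.lower(): value for key, value in tags.items()})

    @property
    def hop_length(self) -> int:
        """Step size in samples: 10 ms at 44100 Hz is 441 samples."""
        return self.step_size_ms * self.sample_rate // 1000


if __name__ == "__main__":
    params = ParamsSketch()
    print(params.to_tags()["SAMPLE_RATE"], params.hop_length)
```

Because the parameters travel with the image, a decoder can rebuild them from the tags instead of assuming defaults.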

🔉 Post-Processing Filters

Add the ability to apply normalization and compression to audio using pydub.

🟢 Test Suite

Add a suite of tests in the test/ package, and check in some test data.

The tests run automatically on pull requests, as configured in ci.yml.

audio_to_image_test.py
image_to_audio_test.py
image_util_test.py
linter_test.py
print_exif_test.py
sample_clips_test.py
spectrogram_converter_test.py
spectrogram_image_converter_test.py

🧹 Lint Tools

These tools run in CI and must pass cleanly to merge.

ruff for linting (ruff --fix .)
black for formatting (black .)
mypy for typing (mypy .)

PRs

Rewrite the codebase to be high quality by @hmartiro in #36
Enable ruff import sorting by @hmartiro in #38
Add CI with github actions by @hmartiro in #37
Streamlit app for interactive use of the model by @hmartiro in #40
Add detail to readme by @hmartiro in #46
Disable compression by default, too slow by @hmartiro in #47
Improve interpolation playground by @hmartiro in #45

Full Changelog: v0.2.0...v0.3.0

This discussion was created from the release Riffusion v0.3.

hypertexthero · 2023-01-18T03:25:34Z

This is the most exciting project I have seen on the web in a long time!

Thank you for making it!

1 reply

FalseGenius Jan 22, 2023

Facts
