
---
title: VidTune
colorFrom: indigo
colorTo: pink
sdk: streamlit
python_version: 3.9.19
sdk_version: 1.36.0
suggested_hardware: t4-medium
suggested_storage: small
app_file: main.py
pinned: true
preload_from_hub:
  - facebook/musicgen-small
  - facebook/musicgen-medium
  - facebook/musicgen-large
short_description: Generate tailored soundtracks for your videos.
---




VidTune

Tailored soundtracks for your videos

Report Bug · Request Feature

Table of Contents
  1. About The Project
  2. Hardware Requirements
  3. See VidTune in action!
  4. Running VidTune
  5. Roadmap
  6. Contributing
  7. License
  8. Contact
  9. Acknowledgments

About The Project

(Screenshot: VidTune startup screen)

VidTune is a generative AI application designed to create custom music tailored to your video content. By leveraging advanced AI models for video analysis and music creation, VidTune provides an intuitive and seamless experience for generating and integrating music into your videos.

VidTune employs two state-of-the-art models for video understanding and music generation:

  1. Google Gemini - Google's largest and most capable multimodal AI model.
  2. MusicGen - Meta's text-to-music model, capable of generating high-quality music conditioned on text or audio prompts.
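
The exact prompts, Gemini model, and generation settings VidTune uses are defined in the repository source; the sketch below only illustrates the general two-stage idea using the public google-generativeai and transformers APIs. The input file name, the gemini-1.5-flash model choice, the prompt wording, and the API key placeholder are illustrative assumptions, not VidTune's actual code.

import time
import scipy.io.wavfile
import google.generativeai as genai
from transformers import AutoProcessor, MusicgenForConditionalGeneration

# Stage 1: ask Gemini to summarize the video as a short music prompt.
genai.configure(api_key="YOUR_GOOGLE_API_KEY")   # placeholder key (assumption)
video = genai.upload_file("example_video.mp4")   # hypothetical input video
while video.state.name == "PROCESSING":          # wait until the upload is processed
    time.sleep(2)
    video = genai.get_file(video.name)
gemini = genai.GenerativeModel("gemini-1.5-flash")  # assumed model; VidTune may use another
reply = gemini.generate_content(
    [video, "Describe the mood of this video as a one-sentence music prompt."]
)
music_prompt = reply.text

# Stage 2: generate audio with MusicGen conditioned on that prompt.
processor = AutoProcessor.from_pretrained("facebook/musicgen-small")
musicgen = MusicgenForConditionalGeneration.from_pretrained("facebook/musicgen-small")
inputs = processor(text=[music_prompt], padding=True, return_tensors="pt")
audio = musicgen.generate(**inputs, max_new_tokens=1024)  # roughly 20 seconds of audio

rate = musicgen.config.audio_encoder.sampling_rate
scipy.io.wavfile.write("soundtrack.wav", rate=rate, data=audio[0, 0].numpy())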

(back to top)

Built With

  • Transformers
  • Google Gemini
  • AudioCraft / MusicGen
  • Streamlit

(back to top)

Hardware Requirements

Hardware used for Development and Testing

  • CPU: AMD Ryzen 7 3700X - 8 Cores 16 Threads
  • GPU: Nvidia GeForce RTX 4060 Ti 16 GB
  • RAM: 64 GB DDR4 @ 3200 MHz
  • OS: Linux (WSL | Ubuntu 22.04)

The hardware above was used for development and testing; it is by no means required to run the application. Minimum requirements are described below.

While VidTune runs on CPU-only machines, we recommend a GPU with at least 16 GB of memory for faster results.
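
If you are unsure whether a CUDA-capable GPU (and how much memory) is visible to PyTorch, which MusicGen runs on, a quick check like the following can help; the fallback message is only a hint, not VidTune's actual log output:

import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, {props.total_memory / 1024**3:.1f} GiB")
else:
    print("No CUDA GPU detected; expect slower, CPU-only generation.")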

See VidTune in action!

Watch the video

Running VidTune

First, clone the repository:

git clone https://github.com/sandesh-bharadwaj/VidTune.git
cd VidTune

Using conda

If you're using conda as your virtual environment manager, do the following:

conda env create -f environment.yml
conda activate vidtune

streamlit run main.py

Using python / pip

pip install -r requirements.txt
streamlit run main.py

Using Docker

Docker Hub Image: https://hub.docker.com/r/animikhaich/vidtune

docker run --rm -it --gpus all -p 8003:8003 animikhaich/vidtune
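
The --gpus all flag assumes the NVIDIA Container Toolkit is installed on the host. Since VidTune also supports CPU-only machines, dropping the flag should still start the app (generation will simply be slower), assuming the published image bundles the CPU dependencies:

docker run --rm -it -p 8003:8003 animikhaich/vidtune

The app should then be reachable at http://localhost:8003 in your browser.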

Roadmap

  • Customized Prompt for Gemini Prompting
  • Flutter version of app for proof-of-concept
  • MusicGen integration
  • Audio Mixing
  • Streamlit app
  • Docker image
  • OpenVINO-optimized versions of MusicGen for CPU-only use.
  • Support for music generation duration > 30 seconds.
  • Add more settings for controlling generation.
  • Option to edit music prompts before music generation.

See the open issues for a full list of proposed features (and known issues).

(back to top)

Contributing

If you have a suggestion that would improve VidTune, please open an issue with the tag "enhancement". You can also fork the repo and create a pull request. Your feedback is greatly appreciated! Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

(back to top)

License

Distributed under the CC BY-NC 4.0 License. See LICENSE for more information.

(back to top)

Contact

Sandesh Bharadwaj - sandesh.bharadwaj97@gmail.com

Animikh Aich - animikhaich@gmail.com

Project Link: https://github.com/sandesh-bharadwaj/VidTune

(back to top)

Acknowledgments

  • Google.
  • Meta.

(back to top)