Robots in Musical Improvisation: Learning Algorithms Respecting Motion Constraints
Master's Thesis in Robotics, Cognition, Intelligence - Technical University of Munich (TUM)
Humanoid robots are being exposed to increasingly challenging tasks as they steadily grow in resemblance to humans, particularly in their ability to interact intelligently and socially with their environment. Recent developments in machine learning and in the control of musculoskeletal systems allow humanoid robots to perform creative tasks, e.g., drawing or playing instruments. For humans, it takes years of practice to develop creative skills in arts or music. Musical improvisation, the act of creating a spontaneous response to a currently presented musical sequence, demands high proficiency on an instrument. How would a humanoid robot learn to elaborately improvise on an instrument?
This work proposes an end-to-end framework for robotic improvisation, with a focus on machine learning algorithms interacting with the control of a musculoskeletal robot. Owing to their sophisticated sampling methods, variational autoencoders were chosen as the basis for musical improvisation. A new framework that eases the control of musculoskeletal robots was used to play the generated musical sequences on the marimba.
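For orientation, here is a minimal sketch of such a sequence VAE in PyTorch. The class name `SequenceVAE`, the layer sizes, and the input dimensions are illustrative assumptions and are not taken from the actual implementation in this repository.

```python
import torch
import torch.nn as nn

class SequenceVAE(nn.Module):
    """Illustrative VAE over piano-roll-like note matrices (all dimensions are assumptions)."""
    def __init__(self, input_dim=96 * 61, hidden_dim=512, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.fc_mu = nn.Linear(hidden_dim, latent_dim)      # mean of q(z|x)
        self.fc_logvar = nn.Linear(hidden_dim, latent_dim)  # log-variance of q(z|x)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, input_dim), nn.Sigmoid(),
        )

    def encode(self, x):
        h = self.encoder(x.flatten(start_dim=1))
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps, the standard reparameterization trick
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decoder(z), mu, logvar
```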
The system's input sequence is played on the W-101 MIDI controller or any MIDI keyboard. The sequence is then parsed into a matrix M, which serves as the input to the encoder of the variational autoencoder (VAE). Next, it can either be
- reconstructed (Path 1) --> No changes in latent space
- modified (Path 2) --> Latent space modifier GUI
- processed by an LSTM network --> LSTM
Irrespective of the path, the latent code z is decoded by the VAE decoder. After that, the new sequence can be smoothed by a note smoother and then sent to the robot or to a software synthesizer, as sketched below.
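The following is a rough, hypothetical sketch of that processing chain, reusing the illustrative `SequenceVAE` from above. The function `improvise` and its arguments are assumptions for illustration only; the note smoother and the MIDI/robot output stage are indicated by comments rather than implemented.

```python
import torch

def improvise(note_matrix, vae, lstm=None, latent_offset=None):
    """Hypothetical end-to-end pass: a parsed note matrix M in, a new note matrix out."""
    mu, logvar = vae.encode(note_matrix)
    z = vae.reparameterize(mu, logvar)

    if latent_offset is not None:   # Path 2: latent space modifier (GUI sliders shift z)
        z = z + latent_offset
    elif lstm is not None:          # LSTM path: a model assumed to map z to a follow-up latent code
        z = lstm(z)
    # Path 1: otherwise z stays unchanged and the input is simply reconstructed

    new_matrix = vae.decoder(z)     # decode z back to a note matrix
    # A note smoother and the MIDI-out / robot interface would follow here.
    return new_matrix

# Usage sketch with random stand-in data:
vae = SequenceVAE()
x = torch.rand(1, 96 * 61)                                             # stand-in for matrix M
reconstruction = improvise(x, vae)                                     # Path 1
shifted = improvise(x, vae, latent_offset=0.5 * torch.randn(1, 32))    # Path 2
```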
Variational Autoencoder with Latent space modifier (GUI) - "Generate Mode"
Variational Autoencoder with Latent space modifier (GUI) - "Interact Mode"
Variational Autoencoder with Latent space modifier (GUI) - "Endless Mode"
Variational Autoencoder (GUI) with Roboy robot - "Interact Mode"
Variational Autoencoder (GUI) with Roboy robot - "Endless Mode"
To set up the simulation, please visit Roboy Control
Get the pretrained models by following this link and move the "pretrained_models" folder to /path/to/tss18-robotsinmusicalimprovisation/utils
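Assuming the pretrained models are ordinary PyTorch checkpoints, loading one would look roughly like this; the file name below is only a placeholder, not the actual name shipped in "pretrained_models":

```python
import torch

# "vae_model.pt" is a placeholder; use the actual file from the "pretrained_models" folder
checkpoint = torch.load(
    "/path/to/tss18-robotsinmusicalimprovisation/utils/pretrained_models/vae_model.pt",
    map_location="cpu",  # also works on machines without CUDA
)
# Depending on how the checkpoint was saved, this is either a full model object
# or a state dict to be passed to model.load_state_dict(...).
```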
- Create a virtual environment one level above the root folder of this project and activate it:
virtualenv ../.rimi -p python3 --no-site-packages
source ../.rimi/bin/activate
- Set PYTHONPATH to the root directory of this project, or add the export to your ~/.bashrc:
export PYTHONPATH=$PYTHONPATH:/path/to/dir
- OPTIONAL (for Ubuntu): You will need these system libraries for python-rtmidi:
sudo apt-get install libasound-dev
sudo apt-get install libjack-dev
- Install all required packages with pip:
pip3 install -r requirements.txt
- Install PyTorch
If you are working on Ubuntu with CUDA 9.0, try:
pip3 install torch torchvision
For other systems or CUDA versions, please visit https://pytorch.org/
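Once everything is installed, a quick sanity check (a minimal sketch assuming python-rtmidi and PyTorch from the steps above) confirms that both MIDI and PyTorch are usable:

```python
import rtmidi
import torch

print("PyTorch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())

midi_out = rtmidi.MidiOut()  # from python-rtmidi
print("Available MIDI output ports:", midi_out.get_ports())
```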
I added a clean repository containing the VAE with an updated GUI (resized to fit smaller screens such as 720x480, e.g. Raspberry Pi touchscreens); please follow this link to the repository.