mjmpc

A collection of sampling based Model Predictive Control algorithms.

If you use this repository as part of your research please cite the following publication::

@inproceedings{
bhardwaj2021blending,
title={Blending {\{}MPC{\}} {\&} Value Function Approximation for Efficient Reinforcement Learning},
author={Mohak Bhardwaj and Sanjiban Choudhury and Byron Boots},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=RqCC_00Bg7V}
}

Installation

You first need to download MuJoCO and obtain a licence key from here

[1] Create conda environment

conda create --name mjmpc python=3.7
conda activate mjmpc

[2] Install mujoco_py and gym

[3] Clone mjmpc

git clone git@github.com:mohakbhardwaj/mjmpc.git && cd mjmpc
conda env update -f setup/environment.yml

[4] Clone and install mjrl

[5] (Optional) Clone and install mj_envs (only required if you want to run hand_manipulation_suite, sawyer or classic_control environments)

[6] Install mjmpc

cd mjmpc
pip install -e .

Examples

Take a look at the examples directory.

cd examples

To run a single instance of a controller by loading parameters from a config file run

python example_mpc.py --config_file <config_file> --controller_type  <controller_name> --save_dir .

For example, to run MPPI for a reaching task with Sawyer robot, run the following

python example_mpc.py --config_file configs/reacher_7d0f-v0 --controller_type  mppi --save_dir .

After running MPPI the results will be stored in ./reacher_7dof-v0/ /mppi/. Use the flag --dump_vids to dump videos of all the trajectories.

We have provided example config files in examples/configs folder. The parameters for individual algorithms are explained below.

Controllers

Following parameters are common for all controllers

Parameter
`horizon`	rollout horizon
`num_particles`	number of particles to rollout
`n_iters`	number of iterations of optimization per timestep
`gamma`	discount factor
`filter_coeffs`	coefficients for autoregressive filtering (generate correlated noise)
`base_action`	action to append at the end after shifting distribution for next step

Additionally, each controller has it's own specific parameters

Gaussian Controllers

These controllers use a Gaussian control distribution and have the following common parameters

Parameter
`init_cov`	initial covariance of Gaussian
`step_size`	step size for updating distribution at every timestep

Random Shooting

Samples particles from a Gaussian with fixed covariance and selects next mean to be the rollout with minimum cost. Has no additional parameters.

Model Predictive Path Integral Control (MPPI)

Based on Williams et al., it samples particles from a Gaussian with fixed covariance and updates the mean using a softmax of rollouts. Has the followig additional parameters:

Parameter
`lam`	temperature for softmax
`alpha`	flag to enable control costs in rollouts (0: enable, 1: disable)

Cross Entropy Method (CEM)

Samples particles from a Gaussian control distribution and updates the mean and covariance using sample estimates from a set of elite samples based on cost.

Parameter
`cov_type`	'diag' means covariance is forced to be diagonal, and 'full' allows actions to be correlated to each other.
`elite_frac`	fraction of total samples considered elite
`beta`	`beta * I` is added to covariance to grow it at each timestep

Gaussian DMD-MPC

From Wagener et al.. We use the exponentiated utility function and allow it to adapt the covariance as well. Has the same parameters as MPPI with the addition of cov_type and beta.

Non-Gaussian Controllers

Particle Filter MPC

Uses a non-parametric distribution represented by particles and updates it using a particle filtering approach, where particles weighted using an exponential of running cost with temperature lam

Parameter
`cov_shift`	noise added to particles when shifting to next step
`cov_resample`	noise for resampling particles

Tuning Controllers and Running Parameter Sweeps

We have also provided a job_script for tuning and benchmarking various MPC algorithms.

python job_script.py --config_file <config_file> --controller_type  <controller_name>

For example, to run MPPI on trajopt_reacher-v0 we can use

python job_script.py --config_file configs/sawyer_reacher-v0.yml --controller_type mppi

Replace mppi with random_shooting, cem or dmd to run different controllers using parameters provided in the config files. The working of the job script are explained below.

Using Your Own Environments

As such the control framework is agnostic to the environment definition and only expects as input two functions

[1] rollout_fn: Takes as input a batch of actions and returns a vector of costs encountered

[2] set_sim_state_fn: sets the state of simulation environments.

However, if you wish to use our GymEnvWrapper, it expects the environment to have a few additional functions implemented such as

[1] set_env_state: Sets the state of the environment

[2] get_env_state: Returns the current state of the environment

Please look at hre Currently can be seen from envs/reacher_env.py for an example environment.

Control Parameters

(TO BE UPDATED!!)

These parameters are currently manually tuned.

Env Name	Episode Length	Horizon	Num Particles	Lambda	Covariance	Step Size	Gamma	Num Iters
SimplePendulum-v0	200	32	24	0.01	3.5	0.55	1.0	1
Swimmer-v0	500	32	36	0.01	3.0	0.55	1.0	1
HalfCheetah-v0	500	32	36	0.01	3.0	0.55	1.0	1
trajopt_reacher-v0	200	32	36	0.01	3.0	0.55	1.0	1

TODO

Batch version of controllers
Environment render function must have functionality to save frames/videos and work for batch size > 1
Implement rollout function for mujoco_env in Cython.
~~Grid search for tuning controller hyperparameters.~~

Name		Name	Last commit message	Last commit date
Latest commit History 179 Commits
examples		examples
mjmpc		mjmpc
setup		setup
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

mjmpc

Installation

Examples

Controllers

Gaussian Controllers

Random Shooting

Model Predictive Path Integral Control (MPPI)

Cross Entropy Method (CEM)

Gaussian DMD-MPC

Non-Gaussian Controllers

Particle Filter MPC

Tuning Controllers and Running Parameter Sweeps

Using Your Own Environments

Control Parameters

TODO

About

Releases

Packages

Languages

License

mohakbhardwaj/mjmpc

Folders and files

Latest commit

History

Repository files navigation

mjmpc

Installation

Examples

Controllers

Gaussian Controllers

Random Shooting

Model Predictive Path Integral Control (MPPI)

Cross Entropy Method (CEM)

Gaussian DMD-MPC

Non-Gaussian Controllers

Particle Filter MPC

Tuning Controllers and Running Parameter Sweeps

Using Your Own Environments

Control Parameters

TODO

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages