This is the source code for the paper Task-Driven Hybrid Model Reduction for Dexterous Manipulation, by Wanxin Jin and Michael Posa, IEEE Transactions on Robotics, 2022.
Preprint: https://arxiv.org/abs/2211.16657
Webpage: https://wanxinjin.github.io/td_hybridreduction/
- planning: optimal control solvers
  - MPC_LCS_R.py: LCS-based MPC solver
- models: dynamics models
  - LCS.py: generic linear complementarity system (LCS)
  - Linear.py: generic linear models (not used)
  - NN.py: generic neural network models (not used)
- env: environments
  - gym_env: Three-Finger Manipulation MuJoCo Environment
    - mujoco_core: core gym API modules (independent of mujoco-py)
    - trifinger_continuous.py: full Three-Finger Manipulation environment
    - trifinger_quasistatic_ground_continuous.py: environment for cube moving (manipulation task 1 in the paper)
    - trifinger_quasistatic_ground_rotate_continuous.py: environment for cube turning (manipulation task 2 in the paper)
  - util: some utility functions for the env
- diagnostics: visualizer of models or debugger
  - lcs_analysis.py: utilities for analyzing LCS models
  - vis_mode.py: utilities for plotting and visualizing learned results, trajectories, etc.
- util: saver, loader, and logger APIs
  - buffer.py: class for handling the rollout buffer
  - logger.py: APIs for saving and loading data
  - optim_gd.py: implementations of different gradient descent algorithms
  - trajectory_loss.py: loss functions that operate on trajectories
- examples: ready-to-run scripts (detailed below). Each script corresponds closely to an experiment presented in the paper.
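For readers unfamiliar with the model class listed under models, the following is a minimal sketch of one step of a linear complementarity system in its standard form, x_next = A x + B u + C lam + d with 0 <= lam, D x + E u + F lam + c >= 0, and lam complementary to that expression. It is not the implementation in LCS.py: the function names, the matrix names, and the projected Gauss-Seidel LCP solver are all assumptions chosen for illustration, and the solver is only expected to converge when F is symmetric positive definite.

```python
import numpy as np


def solve_lcp_pgs(M, q, iters=200):
    """Projected Gauss-Seidel for the LCP: lam >= 0, M lam + q >= 0, lam^T (M lam + q) = 0.

    Illustrative only; assumes M is symmetric positive definite.
    """
    lam = np.zeros_like(q, dtype=float)
    for _ in range(iters):
        for i in range(len(q)):
            # Residual of row i with the contribution of lam[i] removed.
            r = q[i] + M[i] @ lam - M[i, i] * lam[i]
            lam[i] = max(0.0, -r / M[i, i])
    return lam


def lcs_step(x, u, A, B, C, d, D, E, F, c):
    """One LCS step: solve the complementarity condition for lam, then update x."""
    q = D @ x + E @ u + c
    lam = solve_lcp_pgs(F, q)
    x_next = A @ x + B @ u + C @ lam + d
    return x_next, lam


# Hypothetical scalar example: lam acts like a contact force that keeps x from going negative.
one = np.array([[1.0]])
zero_mat = np.array([[0.0]])
zero_vec = np.array([0.0])
x_next, lam = lcs_step(
    x=np.array([-1.0]), u=np.array([0.0]),
    A=one, B=one, C=one, d=zero_vec,
    D=one, E=zero_mat, F=one, c=zero_vec,
)
```

In this toy case the state starts at -1, the complementarity condition activates lam, and the update pushes the state back to the boundary. Each pattern of active and inactive entries of lam corresponds to one mode of the hybrid system.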
The code has been tested and runs smoothly with Python 3.9 on a MacBook Pro (Apple M1 Pro).
Before running the examples, you may want to add the project directory to your PYTHONPATH:
$ export PYTHONPATH=/path/to/this/repo:$PYTHONPATH
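As an alternative to exporting PYTHONPATH in the shell, the project directory can be added from inside Python before importing the project modules. The sketch below is only one way to do this; the helper name add_repo_to_path is hypothetical, and /path/to/this/repo is the same placeholder as in the shell command.

```python
import sys
from pathlib import Path


def add_repo_to_path(repo_root):
    """Prepend the repository root to sys.path if it is not already there."""
    root = str(Path(repo_root).expanduser().resolve())
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


# Replace the placeholder with the actual location of the repository.
repo_root = add_repo_to_path("/path/to/this/repo")
```

This only affects the current Python process, so it needs to run at the top of each script or notebook that imports from the repository.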
examples/lcs/lcs_example1: Illustration of Learning Progress (see Section VI.C.1)
Run
$ python3 examples/lcs/lcs_example1/lcs2d_***.py
where lcs2d_***.py is one of:
- lcs2d_run.py: the main learning script
- lcs2d_plot_loss***.py: plot the learning curves from the saved data
- lcs2d_analysis.py: generate and save the phase portrait data for each learning iteration
- lcs2d_plot_phase_***.py: plot the phase portrait at each learning iteration (see Fig. 2 in the paper)
- lcs2d_plot_anlaysis_rand***.py: analyze the full-order hybrid system with a random policy
examples/lcs/lcs_example2: High Dimensional Examples (see Section VI.C.2)
Run
$ python3 examples/lcs/lcs_example2/***.py
where ***.py is one of:
- single_run.py or multiple_run.py: the main learning script for a single trial or multiple trials
- ***_plot_loss.py: plot the learning curves from the saved data
- single_plot_loss.py: plot the reduced-order MPC policy rollout (see Fig. 3 in the paper)
- multiple_analysis_comp_***.py: analyze and compare the learned reduced-order MPC policy versus a random policy (see Table II in the paper)
examples/lcs/lcs_example3: Effect of Hyperparameter Settings (see Section VI.D)
Run
$ python3 examples/lcs/lcs_example3/***.py
where ***.py is one of:
- run_buffer_size.py: learn while varying the buffer size
- run_mpc_horizon.py: learn while varying the MPC horizon
- run_new_rollout.py: learn while varying the number of new rollouts
- run_trustregion.py: learn while varying the trust region parameter
- plot_param.py: plot the learned results (see Fig. 4 in the paper)
4.1 examples/trifinger_task1: Cube Turning Manipulation Task (see Section VII.C)
Run the main learning script:
$ python3 examples/trifinger_task1/run_training.py
If you want to render the environment during its on-policy rollout, go to Line 180:
$ rollout = rollout_mpcReceding(env=env, rollout_horizon=rollout_horizon, mpc=mpc, mpc_aux=dyn_aux_guess, mpc_param=mpc_param, render=False)
and set the argument render=True.
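To illustrate what an on-policy receding-horizon rollout with a render switch looks like in general, here is a minimal self-contained sketch. It is not the repository's rollout_mpcReceding: it swaps the LCS-based MPC and the MuJoCo environment for a toy linear system controlled by a finite-horizon LQR problem that is re-solved at every step, and all names (first_step_gain, rollout_receding, the double-integrator matrices) are assumptions made for illustration. Here render=True simply prints each step instead of opening a viewer.

```python
import numpy as np


def first_step_gain(A, B, Q, R, mpc_horizon):
    """Backward Riccati recursion; returns the feedback gain for the first step of the horizon."""
    P = Q.copy()  # terminal cost taken equal to Q (an arbitrary choice for this sketch)
    K = np.zeros((B.shape[1], A.shape[0]))
    for _ in range(mpc_horizon):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K


def rollout_receding(A, B, Q, R, x0, rollout_horizon, mpc_horizon, render=False):
    """Re-solve the finite-horizon problem at every step and apply only the first control."""
    x = np.asarray(x0, dtype=float)
    states, controls = [x.copy()], []
    for t in range(rollout_horizon):
        # For this time-invariant toy system the gain is the same at every step;
        # it is recomputed here only to mirror the receding-horizon structure.
        K = first_step_gain(A, B, Q, R, mpc_horizon)
        u = -K @ x
        x = A @ x + B @ u
        states.append(x.copy())
        controls.append(u)
        if render:
            print(f"t={t:3d}  x={x}  u={u}")
    return np.array(states), np.array(controls)


# Hypothetical double integrator with a 0.1 s time step.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = 0.1 * np.eye(1)
states, controls = rollout_receding(
    A, B, Q, R, x0=[1.0, 0.0], rollout_horizon=100, mpc_horizon=20, render=False
)
```

Keeping render off by default, as the training script does, avoids the cost of visualization during learning; it is usually switched on only to inspect individual rollouts.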
After learning, run the other scripts
$ python3 examples/trifinger_task1/***.py
where ***.py is one of:
- run_vis_trained.py: test the learned reduced-order LCS-based MPC controller on the Three-Finger Manipulation system for the Cube Turning task
- show_curves.py: plot the learning curves and print some other statistics (see Fig. 6 and Table III in the paper)
- show_disturbance.py: test the robustness of the learned reduced-order MPC controller (see Table III in the paper)
- show_hybrid_details.py: show the correspondence between mode activation in the LCS and physical interaction (see Section VII.C.2 in the paper)
- show_comp_lam.py: learn the reduced-order LCS with different dimensions of lambda (see Section VII.E.1)
- show_comp_curve.py: plot the learned results for the reduced-order LCS with different dimensions of lambda (see Fig. 12 in the paper)
4.2 examples/trifinger_task2: Cube Moving Manipulation Task (see Section VII.D)
Run the main learning script:
$ python3 examples/trifinger_task2/run_training.py
If you want to render the environment during its on-policy rollout, go to Line 190:
$ rollout = rollout_mpcReceding(env=env, rollout_horizon=rollout_horizon, mpc=mpc, mpc_aux=dyn_aux_guess, mpc_param=mpc_param, render=False)
and set the argument render=True.
After learning, run the other scripts
$ python3 examples/trifinger_task2/***.py
where ***.py is one of:
- run_vis_trained.py: test the learned reduced-order LCS-based MPC controller on the Three-Finger Manipulation system for the Cube Moving task
- show_curves.py: plot the learning curves and print some other learning statistics (see Fig. 8 in the paper)
- show_stats.py: print some key results for the learned reduced-order LCS (see Table VII in the paper)
- show_disturbance.py: test the robustness of the learned reduced-order MPC controller (see Table VII in the paper)
- show_hybrid_details.py: show the correspondence between mode activation in the LCS and physical interaction (see Section VII.D.2 in the paper)
- show_strategies.py: show different manipulation strategies generated by the learned reduced-order LCS (see Section VII.D.3 in the paper)
If you find this project helpful in your publications, please consider citing our paper.
@article{jin2022hybrid,
title={Task-Driven Hybrid Model Reduction for Dexterous Manipulation},
author={Jin, Wanxin and Posa, Michael},
journal={arXiv preprint arXiv:2211.16657},
year={2022}
}