This is a PyTorch implementation of OUTPACE from our paper: "Outcome-directed Reinforcement Learning by Uncertainty & Temporal Distance-Aware Curriculum Goal Generation" (ICLR 2023 Spotlight)
By Daesol Cho*, Seungjae Lee* (*equal contribution), and H. Jin Kim
Our paper can be found on arXiv, and our project website can be found here.
- Create a conda environment:
conda env create -f outpace.yml
conda activate outpace
- Add the necessary paths:
conda develop meta-nml
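Note: conda develop is provided by the conda-build package. If the command is not found, either of the following (a sketch, equivalent in effect) should work:
conda install -n outpace conda-build   # makes `conda develop` available
export PYTHONPATH=$PWD/meta-nml:$PYTHONPATH   # or put meta-nml on the import path manually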
- Install subfolder dependencies:
cd meta-nml && pip install -r requirements.txt
cd ..
chmod +x install.sh
./install.sh
- Install PyTorch (we tested on PyTorch 1.12.1 with CUDA 11.3):
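For example, the matching command from the PyTorch previous-versions page (double-check against pytorch.org for your platform and CUDA setup) is:
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch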
- Set config_path: see config/paths/template.yaml
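For instance, a minimal setup (the copied filename is illustrative; the actual fields are whatever template.yaml defines) is:
cp config/paths/template.yaml config/paths/my_paths.yaml
# edit my_paths.yaml so each entry points to a directory on your machine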
- To run the robot arm environments, install Metaworld:
pip install git+https://github.com/rlworkgroup/metaworld.git@master#egg=metaworld
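To sanity-check the Metaworld install (a plain import check, nothing repo-specific):
python -c "import metaworld"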
PointUMaze-v0
CUDA_VISIBLE_DEVICES=0 python outpace_train.py env=PointUMaze-v0 aim_disc_replay_buffer_capacity=10000 save_buffer=true adam_eps=0.01
PointNMaze-v0
CUDA_VISIBLE_DEVICES=0 python outpace_train.py env=PointNMaze-v0 aim_disc_replay_buffer_capacity=10000 adam_eps=0.01
PointSpiralMaze-v0
CUDA_VISIBLE_DEVICES=0 python outpace_train.py env=PointSpiralMaze-v0 aim_disc_replay_buffer_capacity=20000 save_buffer=true aim_discriminator_cfg.lambda_coef=50
AntMazeSmall-v0
CUDA_VISIBLE_DEVICES=0 python outpace_train.py env=AntMazeSmall-v0 aim_disc_replay_buffer_capacity=50000
sawyer_peg_pick_and_place
CUDA_VISIBLE_DEVICES=0 python outpace_train.py env=sawyer_peg_pick_and_place aim_disc_replay_buffer_capacity=30000 normalize_nml_obs=true normalize_f_obs=false normalize_rl_obs=false adam_eps=0.01
sawyer_peg_push
CUDA_VISIBLE_DEVICES=0 python outpace_train.py env=sawyer_peg_push aim_disc_replay_buffer_capacity=30000 normalize_nml_obs=true normalize_f_obs=false normalize_rl_obs=false adam_eps=0.01 hgg_kwargs.match_sampler_kwargs.hgg_L=0.5
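The key=value arguments above appear to be Hydra-style overrides, so any listed parameter can be changed from the command line. For example, to rerun the PointUMaze experiment on a second GPU with a larger discriminator buffer (values illustrative, reusing only flags shown above):
CUDA_VISIBLE_DEVICES=1 python outpace_train.py env=PointUMaze-v0 aim_disc_replay_buffer_capacity=20000 adam_eps=0.01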
Our code is sourced and modified from the official implementations of the MURAL, AIM, and HGG algorithms. We also utilize mujoco-maze and Metaworld to validate our proposed method.
If you use this repo in your research, please consider citing the paper as follows.
@inproceedings{choandlee2023outcome,
title={Outcome-directed Reinforcement Learning by Uncertainty \& Temporal Distance-Aware Curriculum Goal Generation},
author={Cho, Daesol and Lee, Seungjae and Kim, H Jin},
booktitle={Proceedings of International Conference on Learning Representations},
year={2023}
}