RIME: Robust Preference-based Reinforcement Learning
with Noisy Preferences

Jie Cheng¹,², Gang Xiong¹,², Xingyuan Dai¹,², Qinghai Miao², Yisheng Lv¹,², Fei-Yue Wang¹,²
¹State Key Laboratory of Multimodal Artificial Intelligence Systems, CASIA
²School of Artificial Intelligence, the University of Chinese Academy of Sciences

ICML 2024 Spotlight

[Paper] [Code]

Requirements

Install MuJoCo 2.1

sudo apt update
sudo apt install -y unzip gcc libosmesa6-dev libgl1-mesa-glx libglfw3 patchelf libegl1 libopengl0
sudo ln -s /usr/lib/x86_64-linux-gnu/libGL.so.1 /usr/lib/x86_64-linux-gnu/libGL.so
mkdir -p ~/.mujoco
cd ~/.mujoco
wget https://mujoco.org/download/mujoco210-linux-x86_64.tar.gz
tar -zxvf mujoco210-linux-x86_64.tar.gz
rm -f mujoco210-linux-x86_64.tar.gz

Include the following lines in the ~/.bashrc file:

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$HOME/.mujoco/mujoco210/bin"
export PATH="$HOME/.mujoco/mujoco210/bin:$PATH"

Then run source ~/.bashrc to apply the changes to the current shell.
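As a quick sanity check after the steps above, a small shell function can confirm that the MuJoCo directory exists and that its bin folder is listed in LD_LIBRARY_PATH. This is a minimal sketch: the function name check_mujoco is hypothetical and not part of this repo, and the default path assumes MuJoCo was extracted to ~/.mujoco/mujoco210 as described above.

```shell
# Hypothetical helper (not part of the repo): returns 0 if the MuJoCo
# directory has a bin/ folder and that folder is on LD_LIBRARY_PATH.
check_mujoco() {
    mj_root="${1:-$HOME/.mujoco/mujoco210}"   # assumed install location
    if [ ! -d "$mj_root/bin" ]; then
        echo "missing: $mj_root/bin"
        return 1
    fi
    # Wrap the variable in colons so the first and last entries match too.
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$mj_root/bin:"*)
            echo "ok: $mj_root/bin is on LD_LIBRARY_PATH"
            ;;
        *)
            echo "not on LD_LIBRARY_PATH: $mj_root/bin"
            return 1
            ;;
    esac
}
```

Calling check_mujoco with no argument checks the default location; pass a different directory as the first argument if MuJoCo lives elsewhere.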

Install dependencies

conda env create -f conda_env.yaml
conda activate rime
pip install -e ".[docs,tests,extra]"
cd custom_dmc2gym
pip install -e .
cd ..
pip install git+https://github.com/rlworkgroup/metaworld.git@04be337a12305e393c0caf0cbf5ec7755c7c8feb
pip install torch==1.9.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html

To check that mujoco-py is installed properly, run python -c "import mujoco_py; print(mujoco_py.__version__)". If the command fails, see the FAQ.

Get Started

Configs

Set all options in the all-in-one script run_parallel.sh: the name of the algorithm, the hyperparameters of the algorithm and the environment, the GPU index for each random seed, and so on.

Running

For simulated (scripted) teachers:

bash run_parallel.sh

This runs the experiments for multiple random seeds simultaneously.
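The general pattern of running several seeds at once, each pinned to a GPU, can be sketched in shell as follows. This is not the content of run_parallel.sh: run_seed is a placeholder for the real training command, and the seed and GPU lists are made-up values chosen only for illustration.

```shell
# Sketch only: run_seed stands in for the real training command.
run_seed() {
    # $1 = random seed, $2 = GPU index
    echo "seed=$1 gpu=$2"
}

launch_all() {
    seeds="12345 23451 34512"   # hypothetical random seeds
    gpus="0 1 0"                # hypothetical GPU index for each seed
    set -- $gpus                # positional parameters now hold the GPU list
    for seed in $seeds; do
        gpu="$1"
        shift
        # Start each run in the background so all seeds run concurrently.
        CUDA_VISIBLE_DEVICES="$gpu" run_seed "$seed" "$gpu" &
    done
    wait                        # block until every background job finishes
}
```

Because the jobs run concurrently, the order of their output lines is not fixed. CUDA_VISIBLE_DEVICES is the standard CUDA environment variable that restricts which GPUs a process can see.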

For real human teachers (requires online annotation):

bash run_human_labeller.sh

When training enters the annotation phase, run label_program.ipynb to label preferences by hand. The result of RIME trained with annotations from non-robotics students (detailed in Section 5.3 of the paper) can be seen in this GIF.

Acknowledgement

This repo builds on BPref, SURF, RUNE, and MRN. We thank the authors for their wonderful work.

Citation

@InProceedings{cheng2024rime,
  title     = {{RIME}: Robust Preference-based Reinforcement Learning with Noisy Preferences},
  author    = {Cheng, Jie and Xiong, Gang and Dai, Xingyuan and Miao, Qinghai and Lv, Yisheng and Wang, Fei-Yue},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {8229--8247},
  year      = {2024},
  volume    = {235},
  publisher = {PMLR}
}

FAQ

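If you see the error "GLIBCXX_3.4.30 not found", install a newer GCC in the conda environment and add the environment's lib directory to the library path. The path below assumes Miniconda is installed in $HOME/miniconda3; adjust it if conda lives elsewhere.

conda install gcc=12.1.0
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/miniconda3/envs/rime/lib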
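To confirm that the fix took effect, one option is to check whether the environment's libstdc++ actually contains the required version string. The sketch below assumes the library file is named libstdc++.so.6 and sits in the lib directory of the active conda environment; the helper name has_glibcxx is hypothetical.

```shell
# Hypothetical helper: succeeds if the given library file contains the
# requested version string (GLIBCXX_3.4.30 by default).
has_glibcxx() {
    lib="$1"
    version="${2:-GLIBCXX_3.4.30}"
    # -a treats the binary as text, -q suppresses output, -F matches literally.
    [ -f "$lib" ] && grep -aqF "$version" "$lib"
}
```

With the rime environment activated, has_glibcxx "$CONDA_PREFIX/lib/libstdc++.so.6" && echo found prints "found" when the symbol version is present. CONDA_PREFIX is set by conda when an environment is activated.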

Repository: CJReinforce/RIME_ICML2024