GitHub - x35f/unstable_baselines: Re-implementations of SOTA RL algorithms.

Unstable Baselines (USB) is designed to serve as a quick-start guide for Reinforcement Learning beginners and as a codebase for agile algorithm development. The algorithms strictly follow the original implementations, and their performance matches that of the originals. USB is currently maintained by researchers from lamda-rl.

Features

Novice-friendly: USB is written in simple Python code. The RL training procedures are highly decoupled, waiting to be your first RL playground.
Stick to the original implementations: USB serves as a benchmark framework for RL, so the re-implementations strictly follow the original implementations. Tricks to achieve higher performance are not implemented.
Customized environments: You can customize your own environment as long as it has a Gym-like interface.
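As a rough illustration of what a Gym-like interface means, the sketch below defines a toy environment with `reset` and `step` methods following the classic Gym convention, where `step` returns `(observation, reward, done, info)`. The class name `LineWorldEnv`, its dynamics, the `num_actions` and `observation_shape` attributes, and the `run_episode` helper are all hypothetical and not taken from USB. A real environment would typically expose Gym `observation_space` and `action_space` objects, and the exact attributes USB expects are not stated here.

```python
import random


class LineWorldEnv:
    """Toy 1-D environment with a classic Gym-style reset/step API (hypothetical)."""

    def __init__(self, size=5, max_steps=20, seed=None):
        self.size = size              # goal position on the line
        self.max_steps = max_steps    # episode length limit
        self.num_actions = 2          # 0 = move left, 1 = move right
        self.observation_shape = (1,)
        self._rng = random.Random(seed)
        self._pos = 0
        self._t = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self._pos = 0
        self._t = 0
        return [float(self._pos)]

    def step(self, action):
        """Apply an action; return (observation, reward, done, info)."""
        if action not in (0, 1):
            raise ValueError(f"invalid action: {action}")
        self._pos += 1 if action == 1 else -1
        self._pos = max(0, self._pos)  # wall at position 0
        self._t += 1
        reached_goal = self._pos >= self.size
        done = reached_goal or self._t >= self.max_steps
        reward = 1.0 if reached_goal else 0.0
        return [float(self._pos)], reward, done, {"steps": self._t}

    def sample_action(self):
        """Return a uniformly random action (stand-in for action_space.sample())."""
        return self._rng.randrange(self.num_actions)


def run_episode(env, policy):
    """Roll out one episode with policy(obs) -> action; return the total reward."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total
```

Any agent loop that only calls `reset` and `step` in this way can be pointed at such an environment without further changes, which is the sense in which the interface is "Gym-like".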

Implementation Details

Baseline RL	Continuous Action Space	Discrete Action Space	Image Input	Status
DQN	✕	✔	✔	Stable
VPG	✔	✕	✔	Stable
DDPG	✔	✕	✔	Stable
TD3	✔	✕	✔	Stable
TRPO	✔	✕	✔	Stable
PPO	✔	✔	✔	Stable
SAC	✔	✔	✔	Stable
REDQ	✔	✔	✔	Stable
Option Critic	-	-	-	Developing

Model Based RL	Continuous Action Space	Discrete Action Space	Image Input	Status
MBPO	✔	✕	✔	Updating

Meta RL	Continuous Action Space	Discrete Action Space	Image Input	Status
PEARL	✔	✕	✕	Updating
MAML	-	-	-	Developing

*Updating: the algorithm is being adapted to the latest USB version and will be marked "Stable" soon.

*Developing: the algorithm is being implemented and will appear in the project soon.
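When choosing a baseline for a given task, the support table can be read as a simple lookup. The sketch below transcribes the action-space columns of the "Baseline RL" table for the entries marked Stable. The `SUPPORT` dictionary and the `algorithms_for` helper are illustrative only and are not part of USB.

```python
# Action-space support transcribed from the "Baseline RL" table
# (entries with status "Stable" only). Illustrative, not part of USB.
SUPPORT = {
    "DQN":  {"continuous": False, "discrete": True},
    "VPG":  {"continuous": True,  "discrete": False},
    "DDPG": {"continuous": True,  "discrete": False},
    "TD3":  {"continuous": True,  "discrete": False},
    "TRPO": {"continuous": True,  "discrete": False},
    "PPO":  {"continuous": True,  "discrete": True},
    "SAC":  {"continuous": True,  "discrete": True},
    "REDQ": {"continuous": True,  "discrete": True},
}


def algorithms_for(action_space):
    """Return the stable baselines that support the given action-space type."""
    if action_space not in ("continuous", "discrete"):
        raise ValueError(f"unknown action space type: {action_space}")
    return [name for name, caps in SUPPORT.items() if caps[action_space]]
```

For example, `algorithms_for("discrete")` lists the baselines usable on tasks with a discrete action space.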

Supported environment benchmarks

Gym ("Classic Control" and "Box2D")
MuJoCo
Atari
dm_control
metaworld

Performance

MuJoCo

Quick Start

Install

git clone --recurse-submodules https://github.com/x35f/unstable_baselines.git
cd unstable_baselines
conda env create -f env.yaml 
conda activate usb
pip install -e .

To run an algorithm

In the directory of the algorithm

python3 /path/to/algorithm/main.py /path/to/algorithm/configs/some-config.py args(optional)

For example

cd unstable_baselines/baselines/sac
python3 main.py configs/Ant-v3.py --gpu 0

Or, for ease of aggregating logs:

python3 unstable_baselines/baselines/sac/main.py unstable_baselines/baselines/sac/configs/Ant-v3.py --gpu 0
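To launch several runs with the same command shape, the invocation can be assembled programmatically. The sketch below builds the argument list shown in the example from an algorithm name and a config file name. The `build_command` helper is hypothetical, and it assumes that other baselines follow the same `main.py` and `configs/` layout as `sac`, which only the SAC example here confirms.

```python
from pathlib import PurePosixPath

# Directory layout taken from the SAC example; assumed (not confirmed)
# to be the same for the other baselines.
BASELINES_DIR = PurePosixPath("unstable_baselines/baselines")


def build_command(algorithm, config_name, gpu=None):
    """Build the argv list for one run, relative to the repository root."""
    algo_dir = BASELINES_DIR / algorithm
    cmd = [
        "python3",
        str(algo_dir / "main.py"),
        str(algo_dir / "configs" / config_name),
    ]
    if gpu is not None:
        cmd += ["--gpu", str(gpu)]  # the only flag shown in the examples
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, so that every run is started from the same working directory.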

Install environments (optional)

# install metaworld for the meta_rl benchmark
cd envs/metaworld
pip install -e .

TODO List

Add Documentation
