This is the repo for my summer research project using the MineRL dataset and environments together with the algorithms from the stable-baselines repo. It also includes some wrappers adapted from ChainerRL.
- OpenAI Gym
- Stable-baselines
- MineRL
Before diving into MineRL, you can play around with OpenAI Gym to get a feel for RL environments. Check the stable-baselines and MineRL repos for the prerequisites each library needs. Once those are installed, use pip to install the libraries:
pip3 install gym
pip3 install stable-baselines[mpi]
pip3 install --upgrade minerl
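As a quick sanity check that Gym is installed correctly (this snippet is illustrative, not part of the project code), you can run a random agent on a classic environment such as CartPole:

```python
import gym

# Run a random agent on CartPole for one episode as a quick install check.
env = gym.make("CartPole-v1")
obs = env.reset()
done = False
total_reward = 0.0
while not done:
    action = env.action_space.sample()          # pick a random action
    obs, reward, done, info = env.step(action)  # old (pre-0.26) Gym step API
    total_reward += reward
env.close()
print("episode reward:", total_reward)
```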
Stable-baselines provides various reinforcement learning algorithms to train with. However, it is not compatible with dictionary observation and action spaces, so wrappers are needed to flatten the observations and discretize the actions. The wrappers are included in this repo under the folder "wrappers" and were based on this.
Once wrapped, the stable-baselines algorithms can train on the MineRL environments. We also made use of vectorized environments so we could multiprocess and train multiple instances of the MineRL environment simultaneously.

In the stable-baselines helper function make_vec_env, add the following wrappers right after the environment is created:
env = gym.make(env_id)
if env_id.startswith("MineRLNavigate"):
    # Navigate tasks also expose a compass angle, so merge it into the PoV image
    env = PoVWithCompassAngleWrapper(env)
else:
    # other tasks only need the point-of-view image observation
    env = ObtainPoVWrapper(env)
# flatten the dictionary action space into a single discrete action space
env = SerialDiscreteActionWrapper(env)
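As a rough sketch of how the pieces fit together (the environment id and hyperparameters below are illustrative, not the exact settings used in this project), the patched make_vec_env can then feed a vectorized MineRL environment to a stable-baselines algorithm such as PPO2:

```python
from stable_baselines import PPO2
from stable_baselines.common.cmd_util import make_vec_env
from stable_baselines.common.vec_env import SubprocVecEnv

if __name__ == "__main__":
    # Assumes make_vec_env has been patched (as above) to apply the MineRL
    # wrappers right after gym.make(env_id).
    env = make_vec_env("MineRLNavigateDense-v0", n_envs=4,
                       vec_env_cls=SubprocVecEnv)

    model = PPO2("CnnPolicy", env, verbose=1)
    model.learn(total_timesteps=100000)
    model.save("ppo2_minerl_navigate")
```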
For some reason, adding the wrappers inside the make_vec_env helper that stable-baselines provides is what worked for me; creating and wrapping the environments directly in my own code did not. If someone could help clean this up, pull requests are welcome.
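A possibly cleaner alternative (untested here, just a sketch based on the make_vec_env API, which accepts a wrapper_class callable) is to pass the wrapping logic in without editing the helper itself. The wrap_minerl function and the wrappers import path below are assumptions you would adapt to this repo's layout:

```python
from stable_baselines.common.cmd_util import make_vec_env
from stable_baselines.common.vec_env import SubprocVecEnv
# wrappers live in this repo's "wrappers" folder; adjust the import to your layout
from wrappers import (PoVWithCompassAngleWrapper, ObtainPoVWrapper,
                      SerialDiscreteActionWrapper)

def wrap_minerl(env):
    # Same wrapping logic as above, expressed as a wrapper_class callable.
    if env.spec.id.startswith("MineRLNavigate"):
        env = PoVWithCompassAngleWrapper(env)
    else:
        env = ObtainPoVWrapper(env)
    return SerialDiscreteActionWrapper(env)

env = make_vec_env("MineRLNavigateDense-v0", n_envs=4,
                   wrapper_class=wrap_minerl, vec_env_cls=SubprocVecEnv)
```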
If you're running headless, that is, without a physical display (for example over an SSH connection, as I was), you can write a Dockerfile to run it or use xvfb-run. Note that xvfb-run isn't compatible with NVIDIA drivers, so you can also use a VNC server or just go the Docker route like I did. I've provided an example Dockerfile that I wrote.
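For the xvfb-run route, a typical invocation looks like the following (train.py is a placeholder for whatever training script you run):

```bash
# run the training script inside a virtual framebuffer so Minecraft can open a window
xvfb-run -a python3 train.py
```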