added gif; updated docs

CN-UPB · Jun 24, 2020 · 78e579d
1 parent 8c91f42
Showing 4 changed files with 14 additions and 3 deletions.
diff --git a/README.md b/README.md
@@ -2,7 +2,7 @@
 
 Using deep RL for mobility management.
 
-![example](docs/gifs/v03.gif)
+![example](docs/gifs/v04.gif)
 
 The latest version uses the [RLlib](https://docs.ray.io/en/latest/rllib.html) library for multi-agent RL.
 There is also an older version using [stable_baselines](https://stable-baselines.readthedocs.io/en/master/) for single-agent RL

diff --git a/docs/gifs/v04.gif b/docs/gifs/v04.gif
diff --git a/docs/mdp.md b/docs/mdp.md
@@ -1,5 +1,16 @@
 # MDP Formulation 
 
+## [v0.4](https://github.com/CN-UPB/deep-rl-mobility-management/releases/tag/v0.4): Replaced stable_baselines with ray's RLlib (week 26)
+
+* Replaced the RL framework: [RLlib](https://docs.ray.io/en/latest/rllib.html) instead of [stable_baselines](https://stable-baselines.readthedocs.io/en/master/)
+* Benefit: RLlib is more powerful and supports multi-agent environments
+* Refactored most parts of the existing code base to adjust to the new framework's API
+* Radio model and MDP remained unchanged
+
+Example: Centralized PPO agent controlling two UEs after 20k training with RLlib
+
+![v0.4 example](gifs/v04.gif)
+
 ## [v0.3](https://github.com/CN-UPB/deep-rl-mobility-management/releases/tag/v0.3): Centralized, single-agent, multi-UE-BS selection, basic radio model (week 25)
 
 * Simple but improved radio load model: 

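The v0.4 notes name multi-agent support as a benefit of the switch to RLlib. In the RLlib releases of that period, a multi-agent environment exchanges observations, rewards, and done flags as dicts keyed by agent ID, with a special `"__all__"` key in the done dict that ends the episode for everyone. The following is a framework-free sketch of that convention only. The class `TwoUEEnv`, its agent IDs, and its toy reward are made up for illustration and are not the repository's environment or radio model.

```python
class TwoUEEnv:
    """Toy environment using per-agent dicts (hypothetical, not from the repo)."""

    def __init__(self, eps_length=3):
        self.eps_length = eps_length
        self.agent_ids = ["ue1", "ue2"]
        self.t = 0
        self.connected = {}

    def reset(self):
        # Every UE starts connected to base station 0
        self.t = 0
        self.connected = {agent: 0 for agent in self.agent_ids}
        return dict(self.connected)

    def step(self, action_dict):
        # Each action is the index of the base station the UE selects (0 or 1)
        self.t += 1
        for agent, action in action_dict.items():
            self.connected[agent] = action
        obs = dict(self.connected)
        # Toy reward (assumption): 1 if the UEs use different base stations, else 0
        shared = len(set(self.connected.values())) == 1
        rewards = {agent: 0.0 if shared else 1.0 for agent in self.agent_ids}
        # Per-agent done flags plus the "__all__" key that ends the whole episode
        done = self.t >= self.eps_length
        dones = {agent: done for agent in self.agent_ids}
        dones["__all__"] = done
        return obs, rewards, dones, {}


env = TwoUEEnv(eps_length=2)
obs = env.reset()
obs, rewards, dones, infos = env.step({"ue1": 0, "ue2": 1})
```

A centralized agent, as in the example GIF, would instead see one combined observation and emit one joint action; the dict form matters once each UE is controlled by its own policy.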
diff --git a/drl_mobile/main.py b/drl_mobile/main.py
@@ -85,6 +85,6 @@ def create_env_config(env, eps_length, train_batch_size=1000, seed=None):
     else:
         sim.load_agent(path=agent_path, seed=seed)
         # simulate one run
-        sim.run(render='video', log_steps=True)
+        sim.run(render='gif', log_steps=True)
         # evaluate
-        sim.run(num_episodes=10, log_steps=False)
+        # sim.run(num_episodes=10, log_steps=False)
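The hunk above switches `sim.run` from `render='video'` to `render='gif'`, but the diff does not show how `run` handles the mode. A minimal sketch of one way such a dispatch could look, assuming the simulation produces an object with matplotlib's `Animation` interface (`save(filename, writer=...)` and `to_html5_video()`). The helper name `export_animation` and the file name `episode.gif` are hypothetical and not taken from the repository.

```python
from pathlib import Path


def export_animation(anim, render, out_dir="."):
    """Hypothetical helper: export an animation according to the render mode.

    `anim` is assumed to behave like a matplotlib Animation object.
    """
    if render == "gif":
        path = Path(out_dir) / "episode.gif"
        # matplotlib's 'pillow' writer saves GIFs without external tools
        anim.save(str(path), writer="pillow")
        return path
    if render == "video":
        # Returns an HTML5 <video> string (requires ffmpeg in matplotlib)
        return anim.to_html5_video()
    if render is None:
        # No rendering requested
        return None
    raise ValueError(f"Unknown render mode: {render!r}")
```

Raising on unknown modes keeps a typo such as `render='giff'` from silently skipping the export.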