Update README.md
vwxyzjn authored Feb 20, 2023
1 parent af0d1b4 commit 5f266c7
Showing 1 changed file: README.md (2 additions, 1 deletion)
@@ -19,6 +19,7 @@ Cleanba is CleanRL's implementation of DeepMind's Sebulba distributed training f

**Scalable**: Cleanba can scale to as many GPUs as `jax.distributed` and available memory allow (e.g., it runs with 16 GPUs), making it well suited for large-scale distributed training tasks such as RLHF. A minimal `jax.distributed` sketch follows this hunk.

+**Understandable**: We adopt the single-file implementation philosophy used in CleanRL, making our core codebase succinct and easy to understand. For example, our `cleanba/cleanba_ppo_envpool_impala_atari_wrapper.py` is ~800 lines of code.



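The scalability note above leans on `jax.distributed`. As a hedged illustration (not Cleanba's actual launch code), here is a minimal sketch of how a script opts into multi-process JAX; the coordinator address and two-process layout are assumptions for demonstration:

```python
# Hypothetical sketch: enabling multi-process JAX with jax.distributed.
# The coordinator address and process layout are illustrative assumptions,
# not Cleanba's actual configuration.
import jax

# Every participating process (e.g., one per node) calls initialize()
# with the same coordinator address; process_id identifies each one.
jax.distributed.initialize(
    coordinator_address="10.0.0.1:6379",  # assumed address of process 0
    num_processes=2,                      # assumed two-node job
    process_id=0,                         # set to 0 or 1 per process
)

# After initialization, JAX sees accelerators across all processes,
# so pmap/pjit can shard training over every GPU in the job.
print(jax.device_count())        # global number of devices
print(jax.local_device_count())  # devices attached to this process
```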
@@ -139,4 +140,4 @@ To improve the efficiency of Cleanba, we use JAX and EnvPool, both of which are des

[Espeholt et al., 2018](https://arxiv.org/abs/1802.01561) did not disclose hardware usage or runtime for its Atari experiments. We did our best to estimate the runtime by interpolating from the results in the [R2D2 paper](https://openreview.net/pdf?id=r1lyTjAqYX), and found that IMPALA (deep) takes ~2 hours.

-![](static/r2d2_impala.png)
+![](static/r2d2_impala.png)
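To make the JAX + EnvPool efficiency claim in the hunk above concrete, here is a hedged sketch of a batched rollout loop; the environment id, batch size, and the uniform-random placeholder policy are assumptions for illustration, not Cleanba's actual setup:

```python
# Illustrative sketch of batched rollouts with EnvPool and JAX.
# Environment id, num_envs, and the random "policy" are assumptions
# for demonstration only.
import envpool
import jax
import jax.numpy as jnp

# EnvPool steps many Atari environments in parallel C++ threads and
# returns batched numpy arrays that JAX can consume directly.
envs = envpool.make("Breakout-v5", env_type="gymnasium", num_envs=8)
num_actions = envs.action_space.n

@jax.jit
def act(key, obs):
    # Placeholder policy: uniform-random actions. A real agent would
    # evaluate its network here; jit amortizes dispatch overhead.
    return jax.random.randint(key, (obs.shape[0],), 0, num_actions)

key = jax.random.PRNGKey(0)
obs, info = envs.reset()
for _ in range(100):
    key, subkey = jax.random.split(key)
    actions = act(subkey, jnp.asarray(obs))
    obs, rew, term, trunc, info = envs.step(jax.device_get(actions))
```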
