Currently the "ReplayBuffer" object uses "jnp.roll" to make room for new data. This works fine if the data collected during each iteration is small enough. However, if we think about the tasks where the episode length is 1000, we will see that each data will only be kept for very few iterations before replaced by more recent ones.
For example, with 256 evaluations per iteration and an episode length of 1000, each iteration produces 256,000 transitions, so a replay buffer of size 1 million can only hold the 4 most recent iterations of data. Since the data are episodic, collected iteratively, and therefore highly correlated, keeping only the latest 4 iterations is not enough to approximate the i.i.d. assumption of neural-network training, even with random sampling.
Hence, I recommend adding a new method to the "ReplayBuffer" object, "random_insert(self, key: RNGKey, transitions: Transition)", that randomly selects a subset of existing entries to be replaced by the new data. This ensures, first, that the data from the latest evaluation always appear in the replay buffer; second, that historical data are replaced in an exponential manner (discount factor = 1 - amount_of_data_collected_each_iteration / replay_buffer_size per iteration); and, most importantly, it does so without enlarging the replay buffer, which saves VRAM. A sketch of the idea follows.
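Here is a minimal sketch of what "random_insert" could look like, under the same simplifying assumptions as the snippet above (a flat (buffer_size, transition_dim) data array standing in for the actual "Transition" pytree; names are hypothetical):

```python
import jax
import jax.numpy as jnp

def random_insert(
    key: jax.Array, data: jnp.ndarray, transitions: jnp.ndarray
) -> jnp.ndarray:
    """Overwrite a uniformly random subset of buffer slots with the new batch.

    Every new transition is guaranteed to be stored, while each existing
    entry survives the insertion with probability 1 - num_new / buffer_size,
    so old data decays geometrically instead of being evicted wholesale as
    in the FIFO scheme above.
    """
    buffer_size = data.shape[0]
    num_new = transitions.shape[0]
    # Sample distinct slots so collisions never silently drop part of the
    # new batch (requires num_new <= buffer_size).
    slots = jax.random.choice(key, buffer_size, shape=(num_new,), replace=False)
    return data.at[slots].set(transitions)
```

With this scheme, a transition collected k iterations ago is still present with probability (1 - b/N)^k, where b is the amount of data collected per iteration and N the buffer size. With the numbers above (b = 256,000, N = 1,000,000) that is roughly 0.744^k, so data from about 10 iterations back still survives with around 5% probability, instead of being completely flushed after 4 iterations.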