Policy Gradients to land on the Moon

“That's one small step for your gradient ascent, one giant leap for your ML career.”

-- Pau quoting Neil Armstrong

Welcome 🤗

Today we will learn about Policy Gradient methods, and use them to land on the Moon.

Ready, set, go!

Lecture transcripts

📝 1. Policy gradients

Quick setup

Make sure you have Python >= 3.7. Otherwise, update it.
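If you are not sure which interpreter you are running, a quick check from Python itself can help. The snippet below is only a convenience sketch: the helper name `python_is_recent_enough` is made up for illustration and is not part of the repository.

```python
import sys

# Minimum version required by this lesson, as stated above.
MIN_VERSION = (3, 7)


def python_is_recent_enough(version_info=sys.version_info, minimum=MIN_VERSION) -> bool:
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if python_is_recent_enough():
    print("Python version OK:", sys.version.split()[0])
else:
    print("Please update Python to 3.7 or later")
```

Run it with the same `python3` you plan to use for the virtual environment.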

Pull the code from GitHub and cd into the 04_lunar_lander folder:

$ git clone https://github.com/Paulescu/hands-on-rl.git
$ cd hands-on-rl/04_lunar_lander

Make sure you have the virtualenv tool in your Python installation:
```
$ pip3 install virtualenv
```
Create a virtual environment and activate it.
```
$ virtualenv -p python3 venv
$ source venv/bin/activate
```
From this point onwards, commands run inside the virtual environment.
Install the dependencies and add the project folder to PYTHONPATH, so the code in the src folder can be imported and you can experiment with it.
```
$ (venv) pip install -r requirements.txt
$ (venv) export PYTHONPATH="."
```

Open the notebooks, either with good old Jupyter or with JupyterLab:

$ (venv) jupyter notebook

$ (venv) jupyter lab

If both launch commands fail, try these:

$ (venv) jupyter notebook --NotebookApp.use_redirect_file=False

$ (venv) jupyter lab --NotebookApp.use_redirect_file=False

Play and learn. And do the homework 😉.

Notebooks

Random agent baseline
Policy gradients with rewards as weights
Policy gradients with rewards-to-go as weights
Homework
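
To make the difference between the two policy gradient notebooks concrete, here is a minimal sketch of how the per-step weights could be computed for one episode. It assumes undiscounted rewards, and the function names `total_reward_weights` and `rewards_to_go_weights` are hypothetical; the notebooks may implement this differently.

```python
from typing import List


def total_reward_weights(rewards: List[float]) -> List[float]:
    """Weight every step of an episode by the episode's total reward."""
    total = sum(rewards)
    return [total] * len(rewards)


def rewards_to_go_weights(rewards: List[float]) -> List[float]:
    """Weight each step by the sum of rewards from that step to the end."""
    weights = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step accumulates only rewards that come after it.
    for t in reversed(range(len(rewards))):
        running += rewards[t]
        weights[t] = running
    return weights


# Toy episode with made-up rewards, for illustration only.
episode_rewards = [1.0, 0.0, -2.0, 5.0]
print(total_reward_weights(episode_rewards))   # [4.0, 4.0, 4.0, 4.0]
print(rewards_to_go_weights(episode_rewards))  # [4.0, 3.0, 3.0, 5.0]
```

With rewards-to-go, an action is only credited with rewards collected after it was taken, which is the usual motivation for preferring it over the total episode reward.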

Let's connect!

Do you wanna become a PRO in Machine Learning?

👉🏽 Subscribe to the datamachines newsletter 🧠

👉🏽 Follow me on Twitter and LinkedIn 💡
