An implementation of the following methods: Deep Q-Network (DQN), Double Deep Q-Network (DDQN), the Dueling architecture, Deep Quality-Value (DQV) learning, and DQV-Max. The implementation is based on the following papers:
- Playing Atari with Deep Reinforcement Learning
- Deep Reinforcement Learning with Double Q-learning
- Dueling Network Architectures for Deep Reinforcement Learning
- Deep Quality-Value (DQV) Learning
- Approximating two value functions instead of one: towards characterizing a new family of Deep Reinforcement Learning algorithms
The methods are trained and evaluated on the Catch game.
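These algorithms differ mainly in the bootstrap target each network regresses on. The snippet below is a minimal, illustrative PyTorch sketch of those targets, not the repository's code; the function names and tensor shapes are assumptions, and which copy of each network (online vs. target) provides the bootstrap follows the papers above rather than this codebase.

```python
import torch

def dqn_target(reward, next_q_target, done, gamma):
    # DQN: y = r + gamma * max_a Q_target(s', a)
    return reward + gamma * (1.0 - done) * next_q_target.max(dim=1).values

def double_dqn_target(reward, next_q_online, next_q_target, done, gamma):
    # Double DQN: the online network selects the action, the target network evaluates it.
    best_action = next_q_online.argmax(dim=1, keepdim=True)      # (batch, 1)
    evaluated = next_q_target.gather(1, best_action).squeeze(1)  # (batch,)
    return reward + gamma * (1.0 - done) * evaluated

def dqv_target(reward, next_v, done, gamma):
    # DQV: both the Q-network and the V-network regress on y = r + gamma * V(s').
    return reward + gamma * (1.0 - done) * next_v.squeeze(-1)

def dqv_max_v_target(reward, next_q, done, gamma):
    # DQV-Max: the V-network instead regresses on y = r + gamma * max_a Q(s', a).
    return reward + gamma * (1.0 - done) * next_q.max(dim=1).values
```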
To install all dependencies, run the following command:
pip install -r requirements.txt
To train the agent, run the following command (an example invocation is shown after the list of options):
python source/train_agent.py [Training Options]
- --run_name (str): Name of the run.
- --algorithm ({DQN,Dueling_architecture,DQV,DQV_max}): Type of algorithm to use for training.
- --log_video: Whether to log a video of the agent's performance.
- --max_epochs (int): Maximum number of steps to train for.
- --batch_size (int): Batch size for training.
- --batches_per_step (int): Number of batches to sample from replay buffer per agent step.
- --optimizer ({Adam,RMSprop,SGD}): Optimizer to use for training.
- --learning_rate (float): Learning rate for training.
- --gamma (float): Discount factor.
- --epsilon_start (float): Initial epsilon.
- --epsilon_end (float): Final epsilon.
- --epsilon_decay_rate (int): Number of steps to decay epsilon over.
- --buffer_capacity (int): Capacity of replay buffer.
- --replay_warmup_steps (int): Number of steps to warm up the replay buffer.
- --target_net_update_freq (int): Number of steps between target network updates.
- --soft_update_tau (float): Tau for soft target network updates.
- --double_q_learning: Whether to use double Q-learning.
- --hidden_size (int): Number of hidden units in the feedforward network.
- --n_filters (int): Number of filters in the convolutional network.
- --prioritized_replay: Whether to use prioritized replay.
- --prioritized_replay_alpha (float): Alpha parameter for prioritized replay.
- --prioritized_replay_beta (float): Beta parameter for prioritized replay.
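For example, a DQN run on Catch might be launched as follows (the flag values are illustrative, not the repository's defaults):

python source/train_agent.py --run_name dqn_catch --algorithm DQN --max_epochs 5000 --batch_size 32 --optimizer Adam --learning_rate 0.001 --gamma 0.99 --epsilon_start 1.0 --epsilon_end 0.05 --epsilon_decay_rate 1000 --buffer_capacity 10000 --target_net_update_freq 100

Adding --double_q_learning to such a run enables the double Q-learning variant (DDQN).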