The maze environment is a 6x6 grid world containing walls, rewards, and penalties. Green squares give a reward of +1, orange/red squares give a penalty of -1, and white squares give a reward of -0.04. Grey squares are walls and are unreachable states, so their reward need not be defined; it is set to 0 in this experiment.
The transition model is as follows: the intended outcome occurs with probability 0.8, and with probability 0.1 each the agent moves at one of the two right angles to the intended direction. If the move would take the agent into a wall, the agent stays where it is. There are no terminal states, so the agent's state sequence is infinite; for some reinforcement learning algorithms, the episode is therefore cut off after a fixed number of steps.
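Below is a minimal sketch of this environment dynamics. The grid layout, the cell symbols, and the `step` helper are illustrative assumptions for exposition, not the repository's actual code; only the reward values and transition probabilities come from the description above.

```python
import random

# Illustrative 6x6 layout (an assumption, not the project's actual maze):
# 'W' = grey wall, 'G' = green (+1), 'R' = orange/red (-1), '.' = white (-0.04).
GRID = [
    "......",
    ".W..R.",
    ".W..G.",
    "......",
    ".WW.R.",
    "......",
]
REWARD = {".": -0.04, "G": 1.0, "R": -1.0}
MOVES = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}
RIGHT_ANGLES = {"N": ("E", "W"), "S": ("E", "W"), "E": ("N", "S"), "W": ("N", "S")}

def step(state, action):
    """Stochastic transition: 0.8 intended move, 0.1 each right angle."""
    r = random.random()
    if r < 0.8:
        actual = action
    elif r < 0.9:
        actual = RIGHT_ANGLES[action][0]
    else:
        actual = RIGHT_ANGLES[action][1]
    dr, dc = MOVES[actual]
    nr, nc = state[0] + dr, state[1] + dc
    # Bumping into a wall or the grid boundary leaves the agent in place.
    if not (0 <= nr < 6 and 0 <= nc < 6) or GRID[nr][nc] == "W":
        nr, nc = state
    return (nr, nc), REWARD[GRID[nr][nc]]
```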
```bash
pip install -r requirements.txt
python __main__.py --algorithm=value_iteration --display_policy=True --display_utilities=True
```
The following algorithms can be passed to `--algorithm` (a minimal sketch of value iteration follows the list):

- value_iteration
- policy_iteration
- sarsa
- expected_sarsa
- q_learning
- monte_carlo
- dyna_q
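As an illustration of the first option, here is a minimal value-iteration sketch for a grid world with the transition model above. The function signatures, the discount factor `gamma`, and the convergence threshold `theta` are assumptions for exposition, not the project's actual parameters or implementation.

```python
def value_iteration(states, actions, transitions, reward, gamma=0.9, theta=1e-6):
    """Compute state utilities via Bellman updates.

    transitions(s, a) -> list of (probability, next_state) pairs,
    reward(s) -> immediate reward for being in state s.
    """
    U = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Bellman optimality update: best expected successor utility.
            best = max(
                sum(p * U[s2] for p, s2 in transitions(s, a)) for a in actions
            )
            new_u = reward(s) + gamma * best
            delta = max(delta, abs(new_u - U[s]))
            U[s] = new_u
        if delta < theta:
            return U
```

Once the utilities have converged, a greedy policy can be extracted by choosing, in each state, the action that maximizes the expected utility of the successor states.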