Consider adding a no-op action to all environments. #16

Open
matomatical opened this issue Jul 23, 2024 · 0 comments
Labels
enhancement New feature or request wontfix This will not be worked on

Comments

@matomatical
Owner

At the moment, the environments' action space consists only of the four cardinal directions. This complicates solving some of the environments analytically, for example lava land, where the mouse can spawn surrounded by lava, and cases where the mouse spawns on the same square as the cheese (though we usually try to avoid the latter).

Consider adding a no-op action, which would simplify these corner cases. The maze-solving code already supports the possibility of no-op actions.
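To illustrate the lava corner case, here is a minimal sketch in plain Python. It is not the project's code: the action indices, the `step` and `best_actions` helpers, and the reward of -1 for entering lava are all assumptions made for illustration.

```python
# Hypothetical action encoding; the indices and ordering are assumptions.
MOVES = {
    0: (-1, 0),  # up
    1: (0, -1),  # left
    2: (1, 0),   # down
    3: (0, 1),   # right
    4: (0, 0),   # proposed no-op: stay in place
}

def step(pos, action, lava):
    """Apply one action; return (new_pos, reward), with -1 for entering lava."""
    dr, dc = MOVES[action]
    new_pos = (pos[0] + dr, pos[1] + dc)
    reward = -1 if new_pos in lava else 0
    return new_pos, reward

def best_actions(pos, lava, actions):
    """Return the actions with the highest one-step reward."""
    rewards = {a: step(pos, a, lava)[1] for a in actions}
    top = max(rewards.values())
    return [a for a in actions if rewards[a] == top]

# Mouse at (1, 1), with lava on all four neighbouring squares.
lava = {(0, 1), (1, 0), (2, 1), (1, 2)}
print(best_actions((1, 1), lava, [0, 1, 2, 3]))     # every move enters lava
print(best_actions((1, 1), lava, [0, 1, 2, 3, 4]))  # only the no-op avoids it
```

With only cardinal moves, every action is equally bad and the optimum is a four-way tie; with the no-op there is a single clear best action.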

Aside from changing the environments and level solvers themselves, some other changes would be required, for example to the policy heatmap plotting (thankfully the diamond plots can still work, with the central square used to represent the no-op action). Some of the environment demos, such as interactive mode, would also need updating.
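As a rough sketch of how the diamond layout could accommodate a fifth action, the snippet below arranges five action probabilities into a 3x3 block with the no-op in the centre. The `diamond_cell` helper and the ordering (up, left, down, right, no-op) are assumptions, not the existing plotting code.

```python
def diamond_cell(probs):
    """Arrange five action probabilities in a 3x3 block (None = unused corner)."""
    up, left, down, right, noop = probs  # assumed ordering
    return [
        [None, up, None],
        [left, noop, right],
        [None, down, None],
    ]

cell = diamond_cell([0.1, 0.2, 0.3, 0.1, 0.3])
for row in cell:
    print(row)
```

The four cardinal entries stay where they are in a four-action diamond, so only the centre square is new.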

The main negative side effect would be that existing baselines would no longer be compatible with the new environments, because the architecture's type signature would change. This also means old checkpoints would no longer be loadable.
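The incompatibility can be sketched as a shape mismatch, assuming the policy network's output layer has one unit per action (an assumption, since the issue does not describe the architecture). The parameter names, layer sizes, and `shape_mismatches` helper below are hypothetical.

```python
def shape_mismatches(checkpoint_shapes, model_shapes):
    """Return parameter names whose shapes differ between checkpoint and model."""
    return sorted(
        name
        for name in checkpoint_shapes.keys() | model_shapes.keys()
        if checkpoint_shapes.get(name) != model_shapes.get(name)
    )

# Hypothetical parameter names and sizes, for illustration only.
old_checkpoint = {
    "encoder/w": (64, 128),
    "policy_head/w": (128, 4),  # four cardinal actions
    "policy_head/b": (4,),
}
new_model = {
    "encoder/w": (64, 128),
    "policy_head/w": (128, 5),  # four cardinal actions plus no-op
    "policy_head/b": (5,),
}

print(shape_mismatches(old_checkpoint, new_model))
```

Under this assumption only the policy head differs, but that is enough to prevent a strict checkpoint load.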

@matomatical added the enhancement and wontfix labels on Jul 23, 2024