Consider adding a no-op action to all environments. #16

matomatical · 2024-07-23T07:18:50Z

At the moment, the environments' action space consists only of the four cardinal directions. This complicates solving some environments analytically: for example lava land, where the mouse can spawn surrounded by lava, or cases where a mouse spawns on the same square as the cheese (though we usually try to avoid the latter).

Consider adding a no-op action, which would simplify these corner cases. The maze-solving code already supports the possibility of no-op actions.
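As a rough illustration of the idea, here is a minimal sketch assuming actions index into a table of (row, column) offsets. The names `ACTION_DELTAS`, `NOOP` and `step_position`, and the action ordering, are hypothetical and not taken from the repository.

```python
# Hypothetical action table: four cardinal moves plus a no-op.
# The ordering is an assumption, not the repository's convention.
ACTION_DELTAS = (
    (-1, 0),  # 0: up
    (0, -1),  # 1: left
    (1, 0),   # 2: down
    (0, 1),   # 3: right
    (0, 0),   # 4: no-op (stay in place)
)
NOOP = 4


def step_position(pos, action, walls):
    """Return the new (row, col) after taking `action` from `pos`.

    `walls` is a set of blocked (row, col) cells; moving into a
    blocked cell leaves the position unchanged.
    """
    dr, dc = ACTION_DELTAS[action]
    target = (pos[0] + dr, pos[1] + dc)
    return pos if target in walls else target
```

With a table like this, a mouse that spawns surrounded by lava, or on top of the cheese, has an action that keeps it where it is, so a solver would not need to special-case those states.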

Aside from changing the environments and level solvers themselves, some other changes would be required, for example to policy heatmap plotting (thankfully, the diamond plots can still work, with the central square used to represent the no-op action) and to some of the environment demos, such as interactive mode.

The main negative side effect would be that existing baselines would no longer be compatible with the new environments, because the architecture's type signature would change. This also means old checkpoints would no longer be loadable.
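To make the incompatibility concrete, here is a small sketch under the assumption that the policy head has one output per action, so going from four to five actions changes the shape of its final layer. The function name `head_is_compatible` and the `(hidden_dim, num_outputs)` shape layout are hypothetical, not the repository's checkpoint format.

```python
def head_is_compatible(checkpoint_head_shape, num_actions):
    """Check a saved policy head against an environment's action count.

    `checkpoint_head_shape` is assumed to be (hidden_dim, num_outputs)
    for the final policy layer; this layout is an assumption.
    """
    _, num_outputs = checkpoint_head_shape
    return num_outputs == num_actions


# A head trained with four cardinal actions versus an environment
# that also offers a no-op (five actions in total).
old_head_shape = (256, 4)
print(head_is_compatible(old_head_shape, 4))
print(head_is_compatible(old_head_shape, 5))
```

Under this assumption, a checkpoint trained on four actions would fail a check like this against the five-action environments unless the final layer were extended or retrained.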

matomatical added the enhancement and wontfix labels Jul 23, 2024

matomatical mentioned this issue Jul 23, 2024

37 Implementation details of PPO #17

Open
