My current research focuses on optimality criteria and policy optimization in reinforcement learning. My PhD thesis is about discounting-free policy gradient reinforcement learning from transient states.

My publications (joint work with my collaborators) can be found on arXiv, dblp, and Semantic Scholar.
