Does the line search method conflict with a "trust region" policy gradient algorithm? #15

nuomizai · 2019-02-24T14:36:11Z

Hi, I am a newcomer to DRL. While reading trpo_step in trpo.py, I noticed that you use a line search method instead of a trust region method for the numerical optimization. Why did you choose that method, and does it conflict with a "trust region" policy gradient algorithm?

ArtificialIntelligenceRobot · 2019-07-09T12:51:16Z

I think you can read the paper for more information; your question is explained in Appendix C.
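For context, Appendix C of the TRPO paper describes a two-part update: first compute a search direction (with conjugate gradient), then run a line search along that direction so that the surrogate objective improves while the KL constraint still holds. Read that way, the line search is how the trust region gets enforced, not an alternative to it. Below is a minimal sketch of that second part on a toy problem. It is not the repository's trpo_step; the names `backtracking_line_search`, `surrogate`, `kl`, `max_kl`, and the toy Gaussian setup are all assumptions made for illustration.

```python
def backtracking_line_search(theta, full_step, surrogate, kl, max_kl,
                             max_backtracks=10, shrink=0.5):
    """Shrink full_step until the surrogate improves and the KL limit holds.

    Hypothetical helper for illustration; not taken from trpo.py.
    """
    old_value = surrogate(theta)
    for i in range(max_backtracks):
        frac = shrink ** i  # 1, 0.5, 0.25, ...
        candidate = [t + frac * s for t, s in zip(theta, full_step)]
        improved = surrogate(candidate) > old_value
        inside_region = kl(theta, candidate) <= max_kl
        if improved and inside_region:
            return candidate, True
    # No acceptable step found: keep the old parameters.
    return list(theta), False


# Toy setup (assumption): a unit-variance Gaussian "policy" whose mean is theta,
# so the KL divergence between two policies is 0.5 * squared distance of means.
target = [1.0, -2.0]


def surrogate(theta):
    # Stand-in objective, maximised when theta equals target.
    return -sum((t - g) ** 2 for t, g in zip(theta, target))


def kl(old, new):
    return 0.5 * sum((a - b) ** 2 for a, b in zip(old, new))


theta = [0.0, 0.0]
full_step = [1.0, -2.0]  # pretend this came from conjugate gradient; deliberately too long
max_kl = 0.01

new_theta, accepted = backtracking_line_search(theta, full_step, surrogate, kl, max_kl)
print(accepted, new_theta, kl(theta, new_theta))
```

In this toy run the full step violates the KL limit, so the search keeps halving it and accepts the first fraction that both improves the surrogate and stays inside the region.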
