milestone results
Marin Bukov edited this page Dec 14, 2016 · 8 revisions
- The RL state does not know about the physical state: loops in the trajectory are potentially dangerous for convergence.
- Replays/Forced Learning induce overfitting: non-best (s,a) pairs are also updated through the tilings. This would not occur in a tabular algorithm, but it is also the main reason why tabular methods learn more slowly.
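The generalization effect described in the second point can be illustrated with a minimal sketch. It assumes a hypothetical 1D tile coder over states in [0, 1) with a single action; the names `active_tiles`, `q_value`, and `update`, the number of tilings, and the step size are illustrative choices and not the code used for these results.

```python
# Hypothetical 1D tile coder: N_TILINGS shifted grids over the interval [0, 1).
N_TILINGS = 4
TILES_PER_TILING = 5  # tile width = 1 / TILES_PER_TILING


def active_tiles(s):
    """Return one active tile index per tiling for a state s in [0, 1)."""
    width = 1.0 / TILES_PER_TILING
    tiles = []
    for k in range(N_TILINGS):
        offset = k * width / N_TILINGS  # each tiling is shifted by a fraction of a tile
        idx = int((s + offset) / width)  # can reach TILES_PER_TILING, hence +1 slots below
        tiles.append(k * (TILES_PER_TILING + 1) + idx)
    return tiles


def q_value(weights, s):
    """Linear value estimate: sum of the weights of all active tiles."""
    return sum(weights[i] for i in active_tiles(s))


def update(weights, s, target, alpha=0.5):
    """Move the estimate at s towards target by changing every active tile."""
    delta = target - q_value(weights, s)
    for i in active_tiles(s):
        weights[i] += alpha / N_TILINGS * delta


weights = [0.0] * (N_TILINGS * (TILES_PER_TILING + 1))
s_replayed, s_neighbour, s_far = 0.51, 0.56, 0.07

# A single replayed update at s_replayed ...
update(weights, s_replayed, target=1.0)

# ... also changes the estimate of any state that shares tiles with it.
shared = set(active_tiles(s_replayed)) & set(active_tiles(s_neighbour))
q_replayed = q_value(weights, s_replayed)
q_neighbour = q_value(weights, s_neighbour)
q_far = q_value(weights, s_far)

# Tabular comparison: every state has its own entry, so neighbours stay untouched.
table = {s_replayed: 0.0, s_neighbour: 0.0, s_far: 0.0}
table[s_replayed] += 0.5 * (1.0 - table[s_replayed])
```

In this sketch the neighbouring state shares three of its four tiles with the replayed state, so its estimate moves even though it was never visited, while the distant state and the tabular neighbour remain at zero. Repeated replays of the same trajectory compound this spill-over.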
bang-bang and continuous protocols. Studied the fidelity, the energy increase above the instantaneous energy, the energy fluctuations, the diagonal entropy (basis of final state), and the Bloch sphere evolution of the states.
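As a rough illustration of the quantities listed above, the following sketch evaluates them for a single qubit with NumPy. The Hamiltonian in `hamiltonian`, the field value, and the helper names are assumptions made for this example only. "Energy increase" is interpreted here as the excess over the ground-state energy of the Hamiltonian passed in, and the diagonal entropy is taken in the eigenbasis of that same Hamiltonian; the study itself may define these differently.

```python
import numpy as np

# Pauli matrices
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)


def hamiltonian(h):
    """Hypothetical single-qubit Hamiltonian with control field h (an assumption)."""
    return -sz - h * sx


def observables(psi, target, H):
    """Fidelity, excess energy, energy fluctuations and diagonal entropy of psi."""
    energies, vecs = np.linalg.eigh(H)  # eigenvalues in ascending order
    fidelity = abs(np.vdot(target, psi)) ** 2
    e_mean = np.vdot(psi, H @ psi).real
    e2_mean = np.vdot(psi, H @ (H @ psi)).real
    excess = e_mean - energies[0]  # energy above the ground state of H
    fluct = np.sqrt(max(e2_mean - e_mean**2, 0.0))
    p = np.abs(vecs.conj().T @ psi) ** 2  # populations in the eigenbasis of H
    p = p[p > 1e-12]  # drop zero populations before taking the log
    s_diag = float(-np.sum(p * np.log(p)))
    return fidelity, excess, fluct, s_diag


def bloch_vector(psi):
    """Expectation values of (sx, sy, sz), i.e. the point on the Bloch sphere."""
    return np.array([np.vdot(psi, s @ psi).real for s in (sx, sy, sz)])


H_final = hamiltonian(2.0)
_, vecs = np.linalg.eigh(H_final)
target = vecs[:, 0]  # ground state of the final Hamiltonian

# The target state itself, and an equal superposition of both eigenstates.
at_target = observables(target, target, H_final)
mixed = observables((vecs[:, 0] + vecs[:, 1]) / np.sqrt(2), target, H_final)
bloch = bloch_vector(target)
```

For the target state the fidelity is 1 and the other three quantities vanish. For the equal superposition the fidelity drops to 1/2 and the diagonal entropy reaches its single-qubit maximum of ln 2. Evaluating `bloch_vector` at each time step of a protocol would trace the evolution on the Bloch sphere.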