Variational Policy Search via Trajectory Optimization
Paper summary
The paper introduces a new approach that combines and improves classical policy search with trajectory optimization methods serving as an exploration strategy. The optimization criterion for finding optimal policy parameters is decomposed with a variational approach. The variational distribution is approximated as a Gaussian, which allows it to be computed with the iterative LQR algorithm. The overall algorithm uses expectation maximization to alternate between minimizing the KL divergence in the variational decomposition and maximizing the resulting lower bound with respect to the policy parameters.
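
To make the decomposition concrete, the sketch below shows the standard variational identity that this kind of derivation rests on; the notation (τ for a trajectory, θ for the policy parameters, O for the event that the trajectory is optimal, q for the variational distribution) is assumed here and may differ from the paper's exact formulation:

    \log p(\mathcal{O} \mid \theta)
      = \underbrace{\mathbb{E}_{q(\tau)}\!\big[\log p(\mathcal{O}, \tau \mid \theta) - \log q(\tau)\big]}_{\text{lower bound } \mathcal{L}(q,\theta)}
      + D_{\mathrm{KL}}\!\big(q(\tau) \,\|\, p(\tau \mid \mathcal{O}, \theta)\big)

Since the KL term is non-negative, L(q, θ) is a lower bound on the objective. The E-step drives the KL term toward zero by fitting the Gaussian q(τ) with iterative LQR, and the M-step maximizes L(q, θ) with respect to the policy parameters θ.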
Levine, Sergey and Koltun, Vladlen. Variational Policy Search via Trajectory Optimization. Neural Information Processing Systems (NIPS), 2013.