Inverse Reward Design
Paper summary

The method the authors use basically tells the robot to reason as follows:

1. The human gave me a reward function $\tilde{r}$, selected in order to get me to behave the way they wanted.
2. So I should favor reward functions which produce that kind of behavior.

This amounts to doing RL (step 1) followed by IRL on the learned policy (step 2); see the final paragraph of Section 4.
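
The two-step reasoning can be sketched on a toy problem. This is a minimal illustration, not the authors' implementation. It assumes a finite set of trajectories, each summarized by a feature-count vector, a linear reward, a uniform prior, and a space of possible proxy rewards equal to the candidate set. The feature names, the numbers, and the function names `plan` and `ird_posterior` are all hypothetical.

```python
import math

# Hypothetical training environment: each trajectory is a feature-count
# vector (grass, dirt, lava). Lava never appears during training.
TRAIN_TRAJECTORIES = [
    (3.0, 1.0, 0.0),  # mostly grass
    (1.0, 3.0, 0.0),  # mostly dirt
    (2.0, 2.0, 0.0),  # even mix
]

# Hypothetical candidate "true" reward weights (also used as the proxy space).
CANDIDATES = [
    (1.0, 0.0, 0.0),   # likes grass, indifferent to lava
    (1.0, 0.0, -1.0),  # likes grass, dislikes lava
    (1.0, 0.0, 1.0),   # likes grass, likes lava
    (0.0, 1.0, 0.0),   # likes dirt
]


def dot(w, phi):
    """Linear reward: weights times feature counts."""
    return sum(wi * pi for wi, pi in zip(w, phi))


def plan(w, trajectories):
    """Step 1 ("RL"): the behavior a reward w induces in the training environment."""
    return max(trajectories, key=lambda phi: dot(w, phi))


def ird_posterior(proxy, candidates, trajectories, beta=1.0):
    """Step 2 ("IRL"): score each candidate true reward by how well the
    proxy-induced behavior does under it, relative to the behavior that
    every other possible proxy would have induced."""
    phi_proxy = plan(proxy, trajectories)
    phi_by_proxy = [plan(c, trajectories) for c in candidates]
    unnormalized = []
    for w in candidates:
        z = sum(math.exp(beta * dot(w, phi)) for phi in phi_by_proxy)
        unnormalized.append(math.exp(beta * dot(w, phi_proxy)) / z)
    total = sum(unnormalized)  # uniform prior, so just normalize
    return [u / total for u in unnormalized]


if __name__ == "__main__":
    posterior = ird_posterior((1.0, 0.0, 0.0), CANDIDATES, TRAIN_TRAJECTORIES)
    for w, p in zip(CANDIDATES, posterior):
        print(w, round(p, 3))
```

In this sketch the three grass-loving candidates receive identical posterior mass, because the training environment contains no lava and so the proxy-induced behavior cannot distinguish them. The robot stays uncertain about features the designer never had to think about.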
Dylan Hadfield-Menell and Smitha Milli and Pieter Abbeel and Stuart Russell and Anca Dragan
arXiv e-Print archive - 2017 via Local arXiv
Keywords: cs.AI, cs.LG

Summary by CodyWild 1 year ago
Summary by capybaralet 4 months ago