Inverse Reinforcement Learning
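Instead of learning directly from a given reward signal, an agent can also learn the reward function itself from demonstrations. This is inverse reinforcement learning (IRL).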
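To make the idea concrete, here is a minimal sketch of reward learning from demonstrations. It is restricted to a single-step decision problem and is written in the spirit of maximum-entropy IRL. It is not the full causal-entropy algorithm from the papers listed below. The assumptions are loud ones: the reward is linear in hand-picked features, the demonstrator chooses action `a` with probability proportional to `exp(reward(a))`, and the feature table, the demonstrations, and all function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def action_rewards(w, features):
    """Linear reward r(a) = w . phi(a) for every action."""
    return [sum(wk * fk for wk, fk in zip(w, f)) for f in features]


def expected_features(w, features):
    """Feature expectation under the model p(a) proportional to exp(r(a))."""
    probs = softmax(action_rewards(w, features))
    dim = len(features[0])
    return [sum(p * f[k] for p, f in zip(probs, features)) for k in range(dim)]


def fit_reward(features, demos, lr=0.5, steps=2000):
    """Gradient ascent on the demonstration log-likelihood.

    The gradient is (empirical feature mean) - (model feature mean),
    so the fit stops moving once the two feature expectations match.
    """
    dim = len(features[0])
    empirical = [sum(features[a][k] for a in demos) / len(demos) for k in range(dim)]
    w = [0.0] * dim
    for _ in range(steps):
        model = expected_features(w, features)
        w = [wk + lr * (e - m) for wk, e, m in zip(w, empirical, model)]
    return w, empirical


# Hypothetical data: three actions with 2-D features, and eight demonstrations
# in which the expert mostly picks action 0.
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
demos = [0, 0, 0, 1, 0, 2, 0, 0]

w, empirical = fit_reward(features, demos)
rewards = action_rewards(w, features)
expected = expected_features(w, features)
print("learned weights:", w)
print("learned rewards per action:", rewards)
```

No reward is ever observed here. The weights are recovered only from which actions the demonstrator chose, and the learned reward ranks the most frequently demonstrated action highest. Sequential problems additionally require computing state visitation frequencies under the current reward, which this sketch omits.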
Course materials
- Waterloo CS885 RL slides
- Berkeley CS285 Lec 20: Inverse Reinforcement Learning, slides, YouTube video
- Berkeley Deep RL Bootcamp 2017, Lecture 10b Inverse RL – Chelsea Finn (video, slides)
Papers
Waterloo Inverse RL
- Ziebart, B. D., Bagnell, J. A., & Dey, A. K. (2010). Modeling interaction via the principle of maximum causal entropy. In ICML.
- Finn, C., Levine, S., & Abbeel, P. (2016). Guided cost learning: Deep inverse optimal control via policy optimization. In ICML (pp. 49-58).
Textbook materials
N/A
Exercises
N/A