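Partially Observable Sequence Learning
When the object being modeled is a sequence, the task is sequence learning. A special case is a sequence that is only partially observable, where each observation reveals just part of the underlying state; learning from such sequences is partially observable sequence learning.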
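To make the idea of partial observability concrete, here is a minimal sketch in Python. It is a toy illustration and is not taken from the course materials: the hidden state is a (position, velocity) pair, the observation exposes only the position, and the names `step`, `observe`, `rollout`, and `estimate_velocity` are hypothetical helpers. A single observation cannot distinguish two different hidden states, while a short history of observations can.

```python
from typing import List, Tuple

State = Tuple[int, int]  # (position, velocity): the full hidden state


def step(state: State) -> State:
    """Advance the toy dynamics: position moves by velocity, velocity is constant."""
    pos, vel = state
    return (pos + vel, vel)


def observe(state: State) -> int:
    """Partial observation: only the position is visible, velocity is hidden."""
    return state[0]


def rollout(state: State, n: int) -> List[int]:
    """Collect n consecutive observations starting from the given hidden state."""
    observations = []
    for _ in range(n):
        observations.append(observe(state))
        state = step(state)
    return observations


def estimate_velocity(history: List[int]) -> int:
    """Recover the hidden velocity from the last two observations."""
    return history[-1] - history[-2]


# Two hidden states that look identical in a single observation.
moving_right = rollout((0, 1), 3)
moving_left = rollout((0, -1), 3)
```

The first observation of both rollouts is the same, so a model that sees only the current observation cannot tell the two cases apart. A model that conditions on the observation history, as recurrent and Transformer-based agents do, can.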
Course materials
- Waterloo CS885 RL slides: RL with Sequence Modeling
- Berkeley CS285 Lec 21: RL with Sequence Models, slides, Youtube Video
- DeepMind UCL Hado van Hasselt RL 2021 Lec 11: Off-policy and multi-step
- Waterloo CS885 RL slides: Partially observable RL, DRQN
Papers
Waterloo: Partially observable RL, DRQN
- Hausknecht & Stone (2015). Deep recurrent Q-learning for partially observable MDPs. AAAI Fall Symposium Series.
Waterloo: RL with Sequence Modeling
- Esslinger, Platt & Amato (2022). Deep Transformer Q-Networks for Partially Observable Reinforcement Learning. arXiv.
- Chen et al. (2021). Decision transformer: Reinforcement learning via sequence modeling. NeurIPS, 34, 15084-15097.
- Gu, Goel & Ré (2022). Efficiently modeling long sequences with structured state spaces. ICLR.
- Gu, Dao, Ermon, Rudra & Ré (2020). HiPPO: Recurrent memory with optimal polynomial projections. NeurIPS, 33, 1474-1487.
Exercises
N/A