Programming

RL Model Training Tips

Berkeley Deep RL Bootcamp 2017, Lecture 6: Nuts and Bolts of Deep RL Experimentation, John Schulman (video, slides)

Implementations

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms, GitHub
cleanrl: high-quality single-file implementations of deep reinforcement learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG), GitHub
Facebook Pearl: a production-ready reinforcement learning library that supports recommendation, advertising, and policy selection