Development of RL Library with High-throughput and Scalable Learning Architecture

Photo by Yang Guan

This project aims to develop a highly modularized and extensible RL library that scales to hundreds of CPU cores for high-throughput sampling, storage, and updating. My contributions:

1) Summarized the procedures common to different RL algorithms and, based on them, abstracted the library into Worker, Learner, Buffer, Optimizer, Evaluator, Tester, and Trainer components, each with a single responsibility behind a clearly designed interface.

2) Proposed a general high-throughput and scalable learning architecture that runs arbitrary numbers of Workers, Learners, Buffers, and Evaluators in parallel, each on its own CPU core, to improve sampling, learning, and replay efficiency (a sketch of this organization follows the list).

3) Implemented the library with TensorFlow and Ray; it contains a collection of state-of-the-art algorithm implementations, including MPG, DSAC, DDPG, ADP, TD3, SAC, PPO, and TRPO.
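The sketch below illustrates how Ray actors can realize this kind of Worker/Buffer/Learner decomposition, with each component pinned to one CPU core and communicating asynchronously. It is a minimal illustration under assumed names, not the library's actual API: the method names (Worker.sample, Buffer.add/replay, Learner.update) and the placeholder bodies are hypothetical.

```python
# Minimal sketch of a parallel Worker/Buffer/Learner layout with Ray.
# All class and method names here are illustrative assumptions, not the
# project's real interface; rollouts and gradient steps are stubbed out.
import random
import ray

ray.init()

@ray.remote(num_cpus=1)
class Worker:
    """Collects transitions from its own environment copy."""
    def sample(self, batch_size=32):
        # Placeholder rollout: real code would step an env with a policy.
        return [random.random() for _ in range(batch_size)]

@ray.remote(num_cpus=1)
class Buffer:
    """Stores transitions and serves replay batches."""
    def __init__(self):
        self.storage = []
    def add(self, transitions):
        self.storage.extend(transitions)
    def replay(self, batch_size=32):
        return random.sample(self.storage, min(batch_size, len(self.storage)))

@ray.remote(num_cpus=1)
class Learner:
    """Computes updates from replayed batches."""
    def update(self, batch):
        # Placeholder gradient step: return a dummy "loss".
        return sum(batch) / max(len(batch), 1)

# Arbitrary numbers of each component run in parallel, one CPU core each.
workers = [Worker.remote() for _ in range(4)]
buffers = [Buffer.remote() for _ in range(2)]
learner = Learner.remote()

# One asynchronous iteration: sample in parallel, store, then update.
batches = ray.get([w.sample.remote() for w in workers])
for i, batch in enumerate(batches):
    buffers[i % len(buffers)].add.remote(batch)
loss = ray.get(learner.update.remote(ray.get(buffers[0].replay.remote())))
print("dummy loss:", loss)
```

Because each actor holds its own state and mailbox, sampling, storing, and updating proceed concurrently rather than in lockstep, which is what allows throughput to grow as more Workers, Buffers, and Learners are added.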