Development of RL Library with High-throughput and Scalable Learning Architecture

Photo by Yang Guan

This project aims to develop a highly modularized and extensible RL library that scales to hundreds of CPU cores for high-throughput sampling, storage, and updating. My contributions:

1) Summarized the procedures common to different RL algorithms and, based on them, abstracted the library into Worker, Learner, Buffer, Optimizer, Evaluator, Tester, and Trainer components, each with a single responsibility behind a clearly designed interface.

2) Proposed a general high-throughput and scalable learning architecture that runs arbitrary numbers of Workers, Learners, Buffers, and Evaluators in parallel, each on its own CPU core, to improve sampling, learning, and replay efficiency (a sketch of this organization follows the list).

3) Implemented the library with TensorFlow and Ray; it contains a collection of state-of-the-art algorithm implementations, including MPG, DSAC, DDPG, ADP, TD3, SAC, PPO, and TRPO.
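The sketch below illustrates how Ray actors can realize this kind of Worker/Buffer/Learner decomposition, with each component pinned to one CPU core and communicating asynchronously. It is a minimal illustration under assumed names, not the library's actual API: the method names (Worker.sample, Buffer.add/replay, Learner.update) and the placeholder bodies are hypothetical.

```python
# Minimal sketch of a parallel Worker/Buffer/Learner layout with Ray.
# All class and method names here are illustrative assumptions, not the
# project's real interface; rollouts and gradient steps are stubbed out.
import random
import ray

ray.init()

@ray.remote(num_cpus=1)
class Worker:
    """Collects transitions from its own environment copy."""
    def sample(self, batch_size=32):
        # Placeholder rollout: real code would step an env with a policy.
        return [random.random() for _ in range(batch_size)]

@ray.remote(num_cpus=1)
class Buffer:
    """Stores transitions and serves replay batches."""
    def __init__(self):
        self.storage = []
    def add(self, transitions):
        self.storage.extend(transitions)
    def replay(self, batch_size=32):
        return random.sample(self.storage, min(batch_size, len(self.storage)))

@ray.remote(num_cpus=1)
class Learner:
    """Computes updates from replayed batches."""
    def update(self, batch):
        # Placeholder gradient step: return a dummy "loss".
        return sum(batch) / max(len(batch), 1)

# Arbitrary numbers of each component run in parallel, one CPU core each.
workers = [Worker.remote() for _ in range(4)]
buffers = [Buffer.remote() for _ in range(2)]
learner = Learner.remote()

# One asynchronous iteration: sample in parallel, store, then update.
batches = ray.get([w.sample.remote() for w in workers])
for i, batch in enumerate(batches):
    buffers[i % len(buffers)].add.remote(batch)
loss = ray.get(learner.update.remote(ray.get(buffers[0].replay.remote())))
print("dummy loss:", loss)
```

Because each actor holds its own state and mailbox, sampling, storing, and updating proceed concurrently rather than in lockstep, which is what allows throughput to grow as more Workers, Buffers, and Learners are added.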