Decision-Making under On-Ramp Merge Scenarios by Distributional Soft Actor-Critic Algorithm

Image credit: original paper

Abstract

Merging onto the highway from an on-ramp is an essential scenario for automated driving. Decision-making in this scenario must balance safety and efficiency to optimize a long-term objective, which is challenging because the environment is dynamic, stochastic, and adversarial. Existing rule-based methods often lead to conservative driving on this task, while learning-based methods have difficulty meeting the safety requirements. In this paper, we propose a reinforcement-learning-based end-to-end decision-making method under a framework of offline training and online correction, called the Shielded Distributional Soft Actor-Critic (SDSAC). The SDSAC adopts policy evaluation with safety consideration in offline training and a safety shield parameterized with a barrier function in online correction. These two measures support each other in achieving better safety without sacrificing efficiency. We verify the SDSAC on an on-ramp merge scenario in simulation. The results show that the SDSAC achieves the best safety performance among the compared baseline algorithms while still driving efficiently.
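
To illustrate the general idea of an online safety shield built on a barrier function, the following is a minimal sketch and not the paper's formulation. It assumes a simplified one-dimensional car-following model, a time-headway barrier h = gap - d_min - tau * v_ego, a constant lead-vehicle speed over one step, and a discrete-time barrier condition h(next) >= (1 - lam) * h(now). All names and parameter values (`ShieldParams`, `barrier`, `shield`, `tau`, `d_min`, `lam`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ShieldParams:
    # All values are illustrative assumptions, not taken from the paper.
    dt: float = 0.1      # control step [s]
    tau: float = 1.0     # desired time headway [s]
    d_min: float = 5.0   # minimum standstill gap [m]
    lam: float = 0.5     # barrier decay rate, 0 < lam <= 1
    a_min: float = -4.0  # strongest braking [m/s^2]
    a_max: float = 2.0   # strongest acceleration [m/s^2]


def barrier(gap: float, v_ego: float, p: ShieldParams) -> float:
    """Barrier value h; h >= 0 means the headway constraint holds."""
    return gap - p.d_min - p.tau * v_ego


def shield(a_rl: float, gap: float, v_ego: float, v_lead: float,
           p: ShieldParams) -> float:
    """Return the acceleration closest to a_rl that satisfies the
    discrete-time barrier condition h(next) >= (1 - lam) * h(now).

    With gap' = gap + (v_lead - v_ego) * dt and v' = v_ego + a * dt,
    the condition is linear in a and reduces to an upper bound on a.
    """
    h = barrier(gap, v_ego, p)
    a_bound = (p.lam * h + (v_lead - v_ego) * p.dt) / (p.tau * p.dt)
    a_safe = min(a_rl, a_bound)
    # If the bound lies below a_min, fall back to the strongest braking.
    return max(p.a_min, min(p.a_max, a_safe))


if __name__ == "__main__":
    p = ShieldParams()
    # Large gap: the learned action passes through unchanged.
    print(shield(1.0, gap=30.0, v_ego=20.0, v_lead=20.0, p=p))
    # Small gap with a slower lead vehicle: the action is overridden.
    print(shield(1.0, gap=25.2, v_ego=20.0, v_lead=18.0, p=p))
```

In this one-dimensional case the correction is a simple clamp; with several constraints or multi-dimensional actions, the same condition would typically be enforced by solving a small constrained optimization for the action nearest to the policy output.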

Publication
In IEEE Transactions on Intelligent Vehicles (under review)

(* indicates equal contribution.)