Deep Reinforcement Learning Approach to CPS Vehicle Re-envisioning

Date:


Focus Area 1.1 - Virtual Prototyping of Autonomy-Enabled Ground Systems, real-time controls group

Ajinkya Joglekar, Ph.D. Candidate, Automotive Engineering; Alexander Krolicki, MS, Mechanical Engineering; Dr. Venkat Krovi (PI), Michelin Endowed Chair Professor of Vehicle Automation, Clemson University

90-minute research presentation Link

Abstract: Deep-learning-based methods hold enormous promise for overcoming the challenges of complexity and real-time performance in autonomous vehicle deployments. As exemplars of systems-of-systems, autonomous vehicles (AVs) engender numerous interconnected component-, subsystem-, and system-level interactions. Our work explores the creation of operational frameworks for end-to-end deep-learning-based autonomy deployments that also support verification and validation. In particular, we are exploring three alternative autonomy frameworks: imitation learning, Deep Reinforcement Learning (DRL), and Deep-Koopman Reinforcement Learning (DKRL), with an eye to performance generalizability, scalability, and robustness. Behavior cloning is a form of supervised imitation learning whose main motivation is to build a policy that mimics the human operator's actions and supports subsequent inference. While end-to-end Deep Neural Network (DNN) based policy encapsulation (e.g., mapping raw camera pixels to steering commands) can minimize dependence on feature engineering, it suffers from limited generalizability to environment changes. Deep Reinforcement Learning frameworks for learning complex model-based or model-free policies in high-dimensional environments can leverage simulation and real-world data, but they require careful parameter, reward, and algorithm selection to improve scalability, learning speed, and performance. Deep-Koopman Reinforcement Learning approaches blend the computational approximation benefits of DNNs with theoretical guarantees from dynamical systems and control to realize data-driven control. In particular, physics-aware yet data-driven approximation of the Koopman lifting functions via DNNs facilitates systematic deployment of foundational linear system theory tools, from optimal control to robust design, with performance guarantees.
We will also highlight the operationalization of these methods within a scaled-vehicle framework (the F1/10 vehicle ecosystem) that permits transitions between test environments (simulated vs. real) during the verification and validation of the deep-learning-based autonomy algorithms, while also engaging a diverse group of autonomy researchers.
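
To make the Koopman lifting idea in the abstract concrete, the following is a minimal sketch, not the presenters' implementation. It assumes a hypothetical two-state toy system and a hand-picked lifting function `lift` in place of the DNN-learned lifting functions described above, and it fits the linear operator by ordinary least squares (the swapped-in technique here is a fixed-dictionary, EDMD-style fit rather than deep learning). All names and coefficients are illustrative assumptions.

```python
import numpy as np

# Hypothetical toy dynamics (an assumption for illustration, not from the talk):
#   x1[k+1] = a * x1[k]
#   x2[k+1] = b * x2[k] + c * x1[k]**2
A_COEF, B_COEF, C_COEF = 0.9, 0.5, 0.3


def step(x):
    """Advance the nonlinear toy system by one time step."""
    x1, x2 = x
    return np.array([A_COEF * x1, B_COEF * x2 + C_COEF * x1**2])


def lift(x):
    """Hand-picked lifting functions; in DKRL a DNN would learn these."""
    x1, x2 = x
    return np.array([x1, x2, x1**2])


def fit_koopman(states, next_states):
    """Least-squares fit of K such that lift(x_next) ~= K @ lift(x)."""
    Z = np.array([lift(x) for x in states])            # shape (N, 3)
    Z_next = np.array([lift(x) for x in next_states])  # shape (N, 3)
    K_T, *_ = np.linalg.lstsq(Z, Z_next, rcond=None)   # solves Z @ K.T ~= Z_next
    return K_T.T


# Collect one-step transition data from random initial states.
rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=(200, 2))
next_states = np.array([step(x) for x in states])
K = fit_koopman(states, next_states)

# Roll the linear lifted model and the true nonlinear system forward together.
x = np.array([0.8, -0.4])
z = lift(x)
for _ in range(10):
    z = K @ z      # linear prediction in lifted coordinates
    x = step(x)    # ground-truth nonlinear update
prediction_error = float(np.max(np.abs(z[:2] - x)))
```

For this particular toy system the three lifted coordinates evolve exactly linearly, so the fitted `K` reproduces the nonlinear trajectory up to numerical precision. That exactness is a property of the chosen example; for general vehicle dynamics the lifted model is only an approximation, which is where learned lifting functions become relevant. Once such a `K` is available, standard linear tools (for example, LQR design) can in principle be applied in the lifted space.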