
基于滚动时域强化学习的智能车辆侧向控制算法

张兴龙 陆阳 李文璋 徐昕

张兴龙, 陆阳, 李文璋, 徐昕. 基于滚动时域强化学习的智能车辆侧向控制算法. 自动化学报, 2022, 45(x): 1−12 doi: 10.16383/j.aas.c210555
Zhang Xing-Long, Lu Yang, Li Wen-Zhang, Xu Xin. Receding horizon reinforcement learning algorithm for lateral control of intelligent vehicles. Acta Automatica Sinica, 2022, 45(x): 1−12 doi: 10.16383/j.aas.c210555


doi: 10.16383/j.aas.c210555
基金项目: 国家重点研究发展计划(2018YFB1305105), 国家自然科学基金(62003361, 61825305)资助
    作者简介:

    张兴龙:国防科技大学智能科学学院副研究员. 2018年获得意大利米兰理工大学博士学位. 在Automatica, IEEE Transactions汇刊等国际期刊和会议发表论文30余篇.主要研究方向为滚动时域强化学习及其在无人系统中的应用. E-mail: zhangxinglong18@nudt.edu.cn

    陆阳:国防科技大学智能科学学院博士研究生.在2020年获得国防科技大学硕士学位, 2018年获得山东大学学士学位.主要研究方向为强化学习及其在无人系统中的应用. E-mail: luyang18@nudt.edu.cn

    李文璋:在2020年获得国防科技大学硕士学位, 2018年获得北京理工大学学士学位. 主要研究方向为智能车学习控制. E-mail: 15624953231@163.com

    徐昕:国防科技大学智能科学学院研究员. 2002年获得国防科技大学机电与自动化学院博士学位. 在国际期刊和会议发表学术论文160多篇, 出版书籍4本. 他的研究兴趣包括智能控制, 强化学习, 近似动态规划, 机器学习, 机器人和智能驾驶. 本文通信作者. E-mail: xinxu@nudt.edu.cn

Receding Horizon Reinforcement Learning Algorithm for Lateral Control of Intelligent Vehicles

Funds: Supported by National Key R&D Program of China (2018YFB1305105) and National Natural Science Foundation of China (62003361, 61825305)
    Author Bio:

    ZHANG Xing-Long Associate professor with the College of Intelligence Science and Technology, National University of Defense Technology. He received his Ph.D. degree from Politecnico di Milano in 2018. He has co-authored more than 30 papers in international journals and conferences, including Automatica and the IEEE Transactions journals. His research interests include receding horizon reinforcement learning and its applications in unmanned systems

    LU Yang Ph.D. candidate at the College of Intelligence Science and Technology, National University of Defense Technology. He received his master's degree from National University of Defense Technology in 2020 and his bachelor's degree from Shandong University in 2018. His research interests include reinforcement learning and its applications in unmanned systems

    LI Wen-Zhang He received his master's degree from National University of Defense Technology in 2020 and his bachelor's degree from Beijing Institute of Technology in 2018. His research interest includes learning control of intelligent vehicles

    XU Xin Professor with the College of Intelligence Science and Technology, National University of Defense Technology. He received the Ph.D. degree in control science and engineering from the College of Mechatronics and Automation, NUDT, in 2002. He has co-authored more than 160 papers in international journals and conferences, and co-authored four books. His research interests include intelligent control, reinforcement learning, approximate dynamic programming, machine learning, robotics, and autonomous vehicles. He is the corresponding author of this paper

  • Abstract: This paper proposes a lateral control method based on receding horizon reinforcement learning (RHRL) for high-precision lateral control of intelligent vehicles. The lateral control input consists of a feedforward term and a feedback term: the feedforward term is computed directly from the curvature of the reference path and the vehicle dynamics model, while the feedback term is obtained by solving an optimal tracking control problem with the receding horizon reinforcement learning algorithm. By incorporating a receding horizon optimization mechanism, the proposed method decomposes the infinite-horizon optimal control problem into a sequence of finite-horizon control problems. Unlike existing finite-horizon actor-critic learning schemes, a time-independent actor-critic network structure is adopted in each prediction horizon to learn the optimal value function and control policy. In contrast to model predictive control (MPC), which solves for an open-loop control sequence, the RHRL controller outputs an explicit state-feedback control law and can be deployed either directly offline or with online learning. Furthermore, the convergence of the RHRL algorithm in each prediction horizon is proved theoretically, and the stability of the closed-loop system is analyzed. Lateral control tests on structured roads in a simulation environment show that the proposed RHRL method outperforms a preview controller and heuristic dynamic programming (HDP) in control performance, and outperforms MPC in computational efficiency; compared with the recently popular soft actor-critic (SAC) and deep deterministic policy gradient (DDPG) algorithms, it achieves better control performance with lower sample complexity and higher learning efficiency. Finally, lateral control experiments were carried out on a Hongqi E-HS3 electric vehicle on a closed structured urban test road and an undulating rural gravel road. The experimental results show that RHRL outperforms preview control on the structured urban road, and exhibits strong road adaptability and good control performance on the rural road.
    1)  Manuscript received June 20, 2021; accepted November 2, 2021. Supported by National Key R&D Program of China (2018YFB1305105) and National Natural Science Foundation of China (62003361, 61825305)
    2)  Recommended by Associate Editor. 1. College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China
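The feedforward part of the control law described in the abstract — computed from the reference-path curvature and the vehicle dynamics model — can be sketched with the standard steady-state bicycle-model relation. This is a minimal illustration using the parameters later listed in Table 1; the function name, the exact formula used by the authors, and the reading of $C_f$, $C_r$ as axle cornering stiffnesses in N/rad are assumptions, not taken from the paper:

```python
def feedforward_steer(curvature, vx, m=1723.0, lf=1.232, lr=1.468,
                      Cf=66900.0, Cr=62700.0):
    """Steady-state feedforward steering angle [rad] for a 2-DOF bicycle model.

    delta_ff = L * kappa + Kv * ay, where Kv is the understeer gradient
    and ay = vx^2 * kappa is the steady-state lateral acceleration.
    Parameter defaults follow Table 1 (assumed axle values, N/rad).
    """
    L = lf + lr                        # wheelbase [m]
    Kv = m / L * (lr / Cf - lf / Cr)   # understeer gradient [rad/(m/s^2)]
    ay = vx ** 2 * curvature           # lateral acceleration [m/s^2]
    return L * curvature + Kv * ay
```

On a straight segment (zero curvature) the feedforward vanishes, so the learned RHRL feedback term alone corrects the tracking error.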
  • 图  1  车辆二自由度侧向模型

    Fig.  1  Two-degree-of-freedom lateral model of intelligent vehicle

    图  2  侧向误差模型

    Fig.  2  Lateral error model

    图  3  智能车侧向控制框图

    Fig.  3  Lateral control diagram of intelligent vehicle

    图  4  参考路径

    Fig.  4  Reference path

    图  5  30 ${\rm{km/h}}$下智能车跟踪控制侧向偏差对比

    Fig.  5  Comparison of lateral tracking error of intelligent vehicles under $v_x = 30 \; {\rm{km/h}}$

    图  6  50 ${\rm{km/h}}$下智能车跟踪控制侧向偏差对比

    Fig.  6  Comparison of lateral tracking error of intelligent vehicles under $v_x = 50 \; {\rm{km/h}}$

    图  7  红旗E-HS3智能驾驶平台

    Fig.  7  Hongqi E-HS3 Intelligent Driving Platform

    图  8  基于RHRL和纯点预瞄方法的红旗E-HS3行驶路径

    Fig.  8  Path of the E-HS3 vehicle controlled by RHRL and pure pursuit method

    图  9  RHRL和纯点预瞄方法的车辆实测侧向偏差对比

    Fig.  9  Comparison of experimental lateral tracking error of the RHRL and pure pursuit methods

    图  10  乡村砂石道路地图和车辆行驶中各阶段状态

    Fig.  10  Route map of the rural gravel road and the vehicle status at different stages of the control process

    图  11  侧向误差曲线

    Fig.  11  Curve of the lateral error

    表  1  车辆动力学参数表

    Table  1  The parameters of the vehicle dynamics

    Symbol  Physical meaning  Value  Unit
    $m$  vehicle mass  1723  ${\rm{kg}}$
    $I_z$  yaw moment of inertia  4175  ${\rm{kg}}\cdot {\rm{m}}^2$
    $l_f$  distance from center of mass to front axle  1.232  ${\rm{m}}$
    $l_r$  distance from center of mass to rear axle  1.468  ${\rm{m}}$
    $C_f$  front-tire cornering stiffness  66900  ${\rm{N/rad}}$
    $C_r$  rear-tire cornering stiffness  62700  ${\rm{N/rad}}$
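The parameters in Table 1 fully determine the lateral tracking-error state-space model commonly used with controllers of this kind (state $[e_y, \dot e_y, e_\varphi, \dot e_\varphi]$, input: front steering angle). The sketch below follows the textbook form in Rajamani's Vehicle Dynamics and Control and assumes $C_f$, $C_r$ are axle cornering stiffnesses; it is an illustration, not the authors' exact model code:

```python
import numpy as np

def lateral_error_model(vx, m=1723.0, Iz=4175.0, lf=1.232, lr=1.468,
                        Cf=66900.0, Cr=62700.0):
    """Continuous-time error dynamics x_dot = A x + B * delta for the
    2-DOF lateral model, state x = [e_y, e_y_dot, e_phi, e_phi_dot].
    Defaults are the Table 1 parameters; vx is longitudinal speed [m/s]."""
    A = np.array([
        [0.0, 1.0, 0.0, 0.0],
        [0.0, -(Cf + Cr) / (m * vx), (Cf + Cr) / m,
         (Cr * lr - Cf * lf) / (m * vx)],
        [0.0, 0.0, 0.0, 1.0],
        [0.0, (Cr * lr - Cf * lf) / (Iz * vx), (Cf * lf - Cr * lr) / Iz,
         -(Cf * lf ** 2 + Cr * lr ** 2) / (Iz * vx)],
    ])
    B = np.array([[0.0], [Cf / m], [0.0], [Cf * lf / Iz]])
    return A, B
```

Note that $A$ depends on the longitudinal speed $v_x$, which is why the 30 km/h and 50 km/h experiments effectively test the controller on two different plants.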

    表  2  各控制器的均方根误差对比

    Table  2  The root mean square error (RMSE) comparison among all the controllers

    RMSE  30 ${\rm{km/h}}$  50 ${\rm{km/h}}$
    $e_y$ (${\rm{m}}$)  $e_{\varphi}$ (${\rm{rad}}$)  $e_y$ (${\rm{m}}$)  $e_{\varphi}$ (${\rm{rad}}$)
    RHRL  $\boldsymbol{0.156}$  0.03  0.246  0.02
    HDP  0.165  0.03  0.315  0.019
    SAC  0.189  0.029  0.283  0.017
    DDPG  0.172  0.037  0.319  0.017
    MPC  0.212  0.025  0.278  0.015
    Pure pursuit  0.159  0.036  0.286  0.03
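The RMSE figures in Table 2 are the standard root mean square of the recorded error signals $e_y$ and $e_\varphi$. For completeness, a minimal computation (a generic sketch, not the authors' evaluation script):

```python
import numpy as np

def rmse(errors):
    """Root mean square of a recorded tracking-error sequence."""
    e = np.asarray(errors, dtype=float)
    return float(np.sqrt(np.mean(e ** 2)))
```

Applying this to the logged lateral offsets of one test run yields the corresponding $e_y$ entry of the table.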
  • [1] Kabzan J, Hewing L, Liniger A, et al. Learning-based Model Predictive Control for Autonomous Racing. IEEE Robotics and Automation Letters, 2019, 4(4): 3363-3370 doi: 10.1109/LRA.2019.2926677
    [2] Lian C, Xu X, Chen H, et al. Near-optimal Tracking Control of Mobile Robots via Receding-horizon Dual Heuristic Programming. IEEE Transactions on Systems, Man, and Cybernetics, 2016, 46(11): 2484-2496
    [3] Dong L, Yan J, Yuan X, et al. Functional Nonlinear Model Predictive Control Based on Adaptive Dynamic Programming. IEEE Transactions on Systems, Man, and Cybernetics, 2019, 49(12): 4206-4218
    [4] Ahmed A A, Alshandoli A F S. Using of Neural Network Controller and Fuzzy PID Control to Improve Electric Vehicle Stability Based on a 14-DOF Model. In: Proceedings of the 2020 International Conference on Electrical Engineering (ICEE). Takamatsu, Japan: IEEE, 2020. 1−6
    [5] Meng J, Liu A, Yang Y, et al. Two-wheeled robot platform based on PID control. In: Proceedings of the 5th International Conference on Information Science and Control Engineering (ICISCE). Zhengzhou, China: IEEE, 2018. 1011−1014
    [6] Farag W. Complex trajectory tracking using PID control for autonomous driving. International Journal of Intelligent Transportation Systems Research, 2020, 18(2): 356-366. doi: 10.1007/s13177-019-00204-2
    [7] Zhao P, Chen J, Song Y, et al. Design of a Control System for an Autonomous Vehicle Based on Adaptive-PID. International Journal of Advanced Robotic Systems. 2012, 9(44): 44
    [8] Han G, Fu W, Wang W, et al. The Lateral Tracking Control for the Intelligent Vehicle Based on Adaptive PID Neural Network. Sensors. 2017, 17(6): 1244 doi: 10.3390/s17061244
    [9] Fraichard T, Garnier P. Fuzzy Control to Drive Car-like Vehicles. Robotics and Autonomous Systems. 2001, 34(1):1-22 doi: 10.1016/S0921-8890(00)00096-8
    [10] Perez J, Milanes V, Onieva E. Cascade Architecture for Lateral Control in Autonomous Vehicles. IEEE Transactions on Intelligent Transportation Systems, 2011, 12(1): 73-82 doi: 10.1109/TITS.2010.2060722
    [11] Li H, Wang X, Song S, et al. Vehicle Control Strategies Analysis Based on PID and Fuzzy Logic Control. Procedia Engineering. 2016, 137: 234-243 doi: 10.1016/j.proeng.2016.01.255
    [12] Park M, Lee S, Han W. Development of Lateral Control System for Autonomous Vehicle Based on Adaptive Pure Pursuit Algorithm. In: Proceedings of the 14th International Conference on Control, Automation and Systems (ICCAS 2014). KINTEX, Korea: IEEE, 2014. 1443−1447
    [13] Lie G. Study on Lateral Fuzzy Control of Unmanned Vehicles via Genetic Algorithms. Journal of Mechanical Engineering. 2012, 48(06): 76 doi: 10.3901/JME.2012.06.076
    [14] Leonard J J, How J P, Teller S, et al. A Perception-driven Autonomous Urban Vehicle. Journal of Field Robotics. 2008, 25(10): 727-774 doi: 10.1002/rob.20262
    [15] Rajamani R, Zhu C, Alexander L. Lateral Control of a Backward Driven Front steering Vehicle. Control Engineering Practice. 2003, 11(5): 531-540 doi: 10.1016/S0967-0661(02)00143-0
    [16] Thrun S, Montemerlo M, Dahlkamp H, et al. Stanley: The Robot That Won the DARPA Grand Challenge. Journal of Field Robotics. 2006, 23(9): 661-692 doi: 10.1002/rob.20147
    [17] 龚建伟, 姜岩, 徐威. 无人驾驶车辆模型预测控制. 北京理工大学出版社, 2014

    Gong Jian-Wei, Jiang Yan, Xu Wei. Model Predictive Control for Self-driving Vehicles. Beijing Institute of Technology Press, 2014
    [18] Falcone P, Borrelli F, Asgari J, et al. Predictive Active Steering Control for Autonomous Vehicle Systems. IEEE Transactions on Control Systems and Technology. 2007, 15(3): 566-580 doi: 10.1109/TCST.2007.894653
    [19] Carvalho A, Gao Y, Gray A, et al. Predictive Control of an Autonomous Ground Vehicle Using an Iterative Linearization Approach [C]. In: Proceedings of 16th International IEEE conference on intelligent transportation systems (ITSC 2013). The Hague, The Netherlands: IEEE, 2013. 2335−2340
    [20] Beal C E, Gerdes J C. Model Predictive Control for Vehicle Stabilization at the Limits of Handling. IEEE Transactions on Control Systems and Technology. 2013, 21(4): 1258-1269 doi: 10.1109/TCST.2012.2200826
    [21] Liniger A, Domahidi A, Morari M. Optimization‐based Autonomous Racing of 1:43 Scale RC Cars. Optimal Control Applications and Methods. 2015, 36(5): 628-647 doi: 10.1002/oca.2123
    [22] Ostafew C J, Schoellig A P, Barfoot T D. Robust Constrained Learning-based NMPC Enabling Reliable Mobile Robot Path Tracking. The International Journal of Robotics Research. 2016, 35(13): 1547−1563
    [23] Oh S, Lee J, Choi D. A New Reinforcement Learning Vehicle Control Architecture for Vision-based Road Following. IEEE Transactions on Vehicular Technology. 2000, 49(3): 997-1005 doi: 10.1109/25.845116
    [24] 杨慧媛. 基于增强学习的优化控制方法及其在移动机器人中的应用 [硕士学位论文], 国防科学技术大学, 中国, 2014

    Yang H Y. Reinforcement Learning-based Optimal Control Methods with Applications to Mobile Robots [Master dissertation], National University of Defense Technology, 2014
    [25] 连传强. 基于近似动态规划的优化控制方法及在自主驾驶车辆中的应用 [博士学位论文], 国防科学技术大学, 中国, 2016

    Lian C Q. Optimization Control Methods Based on Approximate Dynamic Programming and Its Applications in Autonomous Land Vehicles [Ph. D. dissertation], National University of Defense Technology, 2016
    [26] 黄振华. 智能车辆自评价学习控制方法研究 [博士学位论文], 国防科学技术大学, 中国, 2017

    Huang Z H. Researches on Adaptive Critic Learning Control Approaches for Intelligent Driving Vehicles [Ph. D. dissertation], National University of Defense Technology, 2017
    [27] Snider J M. Automatic Steering Methods for Autonomous Automobile Path Tracking. CMU-RI-TR-09-08 [R], 2009.
    [28] 熊璐, 杨兴, 卓桂荣, 等. 无人驾驶车辆的运动控制发展现状综述. 机械工程学报, 2020, 56(10): 127-143 doi: 10.3901/JME.2020.10.127

    Xiong Lu, Yang Xing, Zhuo Gui-Rong, et al. Review on Motion Control of Autonomous Vehicles. Journal of Mechanical Engineering, 2020, 56(10): 127-143 doi: 10.3901/JME.2020.10.127
    [29] 由智恒. 基于MPC算法的无人驾驶车辆轨迹跟踪控制研究 [硕士学位论文], 吉林大学, 中国, 2018

    You Zhi-Heng. Research on Model Predictive Control-based Trajectory Tracking for Unmanned Vehicles [Master dissertation], Jilin University, 2018
    [30] Xu X, Chen H, Lian C, et al. Learning-based predictive control for discrete-time nonlinear systems with stochastic disturbances. IEEE Transactions on Neural Networks and Learning Systems, 2018, 29(12): 6202-6213 doi: 10.1109/TNNLS.2018.2820019
    [31] Rawlings J, Mayne D, Diehl M. Model Predictive Control: Theory, Computation, and Design (2nd edition). Madison, WI: Nob Hill Publishing, 2017
    [32] Chmielewski D, Manousiouthakis V. On constrained infinite-time linear quadratic optimal control. Systems and Control Letters, 1996, 29(3): 121-129 doi: 10.1016/S0167-6911(96)00057-6
    [33] Wang D, Ha M, Qiao J. Data-driven iterative adaptive critic control toward an urban wastewater treatment plant. IEEE Transactions on Industrial Electronics, 2020, 68(8): 7362-7369.
    [34] 王鼎. 基于学习的鲁棒自适应评判控制研究进展. 自动化学报, 2019, 45(6): 1031-1043

    Wang Ding. Research Progress on Learning-based Robust Adaptive Critic Control. Acta Automatica Sinica, 2019, 45(6): 1031-1043
    [35] 陈虹, 郭露露, 宫洵, 等. 智能时代的汽车控制. 自动化学报, 2020, 46(7): 1313-1332

    Chen Hong, Guo Lu-Lu, Gong Xun, et al. Automotive Control in Intelligent Era. Acta Automatica Sinica, 2020, 46(7): 1313-1332
    [36] 田涛涛, 侯忠生, 刘世达, 邓志东. 基于无模型自适应控制的无人驾驶汽车横向控制方法. 自动化学报, 2017, 43(11): 1931-1940

    Tian Tao-Tao, Hou Zhong-Sheng, Liu Shi-Da, Deng Zhi-Dong. Model-free Adaptive Control Based Lateral Control of Self-driving Car. Acta Automatica Sinica, 2017, 43(11): 1931-1940
    [37] Rajamani R. Vehicle dynamics and control. Springer Science & Business Media, 2011.
    [38] Haarnoja T, Zhou A, Abbeel P, et al. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In: Proceedings of the 35th International Conference on Machine Learning. PMLR 80, 2018. 1861−1870
    [39] Lillicrap T P, Hunt J J, Pritzel A, et al. Continuous control with deep reinforcement learning. arXiv preprint arXiv: 1509.02971, 2015
    [40] Kuutti S, Bowden R, Jin Y, et al. A survey of deep learning applications to autonomous vehicle control. IEEE Transactions on Intelligent Transportation Systems, 2020, 22(2): 712-733
    [41] Li D, Zhao D, Zhang Q, et al. Reinforcement learning and deep learning based lateral control for autonomous driving [application notes]. IEEE Computational Intelligence Magazine, 2019, 14(2): 83-98 doi: 10.1109/MCI.2019.2901089
    [42] Chen Y, Hereid A, Peng H, et al. Enhancing the performance of a safe controller via supervised learning for truck lateral control. Journal of Dynamic Systems, Measurement, and Control, 2019, 141(10)
    [43] Mayne D Q, Kerrigan E C, Van Wyk E J, et al. Tube‐based robust nonlinear model predictive control. International Journal of Robust and Nonlinear Control, 2011, 21(11): 1341-1353 doi: 10.1002/rnc.1758
    [44] Zhang X, Pan W, Scattolini R, et al. Robust Tube-based Model Predictive Control with Koopman Operators. Automatica, to be published
Publication history
  • Received:  2021-06-20
  • Accepted:  2021-11-02
  • Available online:  2022-03-07
