Zhou Zhi-Guo, Yu Si-Yu, Yu Jia-Bao, Duan Jun-Wei, Chen Long, Chen Jun-Long

Zhou Zhi-Guo, Yu Si-Yu, Yu Jia-Bao, Duan Jun-Wei, Chen Long, Chen Jun-Long. Research on T-DQN intelligent obstacle avoidance algorithm of unmanned surface vehicle. Acta Automatica Sinica, 2023, 49(8): 1645−1655. doi: 10.16383/j.aas.c210080
Research on T-DQN Intelligent Obstacle Avoidance Algorithm of Unmanned Surface Vehicle

Funds: Supported by the Equipment Pre-research Field Fund of the 13th Five-Year Plan (61403120109) and the Fundamental Research Funds for the Central Universities, Jinan University (21619412)

    ZHOU Zhi-Guo  Associate professor at the School of Information and Electronics, Beijing Institute of Technology. His research interests cover intelligent unmanned systems, information perception and navigation, and machine learning. Corresponding author of this paper. E-mail: zhiguozhou@bit.edu.cn

    YU Si-Yu  Master's student at the School of Information and Electronics, Beijing Institute of Technology. Her main research interest is information perception and navigation of intelligent unmanned systems. E-mail: yusiyu3408@163.com

    YU Jia-Bao  Master's student at the School of Information and Electronics, Beijing Institute of Technology. Her main research interest is information perception and navigation of intelligent unmanned systems. E-mail: 3120200722@bit.edu.cn

    DUAN Jun-Wei  Lecturer at the College of Information Science and Technology, Jinan University. His research interests cover image fusion, machine learning, and computational intelligence. E-mail: jwduan@jnu.edu.cn

    CHEN Long  Associate professor at the Faculty of Science and Technology, University of Macau. His research interests cover computational intelligence, Bayesian methods, and machine learning. E-mail: longchen@um.edu.mo

    CHEN Jun-Long  Professor at the School of Computer Science and Engineering, South China University of Technology. His research interests cover cybernetics, intelligent systems, and computational intelligence. E-mail: philipchen@scut.edu.cn

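  • Abstract: As unmanned systems with broad application prospects, unmanned surface vehicles (USVs) depend critically on their capacity for autonomous decision-making. Because the water-surface environment is relatively open, traditional obstacle-avoidance decision algorithms struggle to plan an optimal route autonomously under quantified rules, while general reinforcement learning methods struggle to converge quickly in large, complex environments. To address these problems, a threshold-based deep Q network obstacle avoidance algorithm (threshold deep Q network, T-DQN) is proposed. It adds a long short-term memory (LSTM) network to the deep Q network (DQN) to retain training information, and sets a threshold on the experience replay pool to accelerate convergence. Simulation experiments in grid environments of different scales show that T-DQN converges quickly to the optimal path: its overall number of steps to convergence is 69.1% lower than that of Q-learning and 24.8% lower than that of DQN, and the threshold filtering mechanism reduces the overall number of convergence steps by 41.1%. On a Unity 3D reinforcement learning simulation platform, completion of the obstacle avoidance task was verified in complex map scenarios; the results show that the algorithm achieves fine-grained obstacle avoidance and intelligent, safe navigation for the USV.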
  • Fig. 1  T-DQN algorithm architecture

    Fig. 2  LSTM network structure

    Fig. 3  Network layer structure after adding LSTM

    Fig. 4  Flow chart of USV path planning

    Fig. 5  Actual parameters of the USV

    Fig. 6  Path obtained after T-DQN training on the 10 × 10 grid map

    Fig. 7  Path obtained after T-DQN training on the 20 × 20 grid map

    Fig. 8  Path obtained after T-DQN training on the 30 × 30 grid map

    Fig. 9  Comparison of the average return of the four algorithms on the 10 × 10, 20 × 20, and 30 × 30 grid maps

    Fig. 10  Spaitlab-unity simulation experiment platform

    Fig. 11  Simulated trajectory of the USV under global path planning

    Fig. 12  Global path planning in the gridded water space

    Fig. 13  Comparison of simulated USV trajectories under global and local path planning
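
The abstract states that T-DQN sets a threshold on the experience replay pool to speed up convergence, but it does not give the filtering rule. The following Python sketch illustrates one possible reading only: a transition is stored when a caller-supplied score meets a fixed threshold. The class name `ThresholdReplayPool`, the `score` argument, the transition layout, and the comparison rule are all assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import deque
from typing import Deque, List, Tuple

# (state, action, reward, next_state, done) on a grid map.
# This layout is an assumption for the sketch, not the paper's definition.
Transition = Tuple[Tuple[int, int], int, float, Tuple[int, int], bool]


class ThresholdReplayPool:
    """Replay pool that stores a transition only if its score passes a threshold.

    Hypothetical illustration: the criterion (caller-supplied score compared
    with a fixed threshold) is assumed, not the rule used by T-DQN.
    """

    def __init__(self, capacity: int, threshold: float) -> None:
        # A bounded deque drops the oldest transition once capacity is reached.
        self.buffer: Deque[Transition] = deque(maxlen=capacity)
        self.threshold = threshold

    def push(self, transition: Transition, score: float) -> bool:
        """Store the transition if score >= threshold; return True if stored."""
        if score >= self.threshold:
            self.buffer.append(transition)
            return True
        return False

    def sample(self, batch_size: int) -> List[Transition]:
        """Uniformly sample up to batch_size stored transitions, without replacement."""
        k = min(batch_size, len(self.buffer))
        return random.sample(list(self.buffer), k)

    def __len__(self) -> int:
        return len(self.buffer)
```

In a training loop, the score could be, for example, the reward or the magnitude of the temporal-difference error of the transition; which quantity the authors actually threshold is not stated in the abstract, so that choice is left to the caller here.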

