
Complex Scene Segmentation Based on Visible and Thermal Images in Driving Environment

Chen Wu-Yang, Zhao Yu-Qian, Yang Chun-Hua, Zhang Fan, Yu Ling-Li, Chen Bai-Fan

Citation: Chen Wu-Yang, Zhao Yu-Qian, Yang Chun-Hua, Zhang Fan, Yu Ling-Li, Chen Bai-Fan. Complex scene segmentation based on visible and thermal images in driving environment. Acta Automatica Sinica, 2022, 48(2): 460−469. doi: 10.16383/j.aas.c210029

doi: 10.16383/j.aas.c210029
Funds: Supported by National Natural Science Foundation of China (62076256) and Graduate School-enterprise Joint Innovation Project of Central South University (2021XQLH048)
More Information
    Author Bio:

    CHEN Wu-Yang Master student at the School of Automation and the School of Computer Science and Engineering, Central South University. Her research interest covers computer vision and intelligent perception

    ZHAO Yu-Qian Professor at the School of Automation, Central South University. His research interest covers computer vision, intelligent perception, machine learning, and precision medicine. Corresponding author of this paper

    YANG Chun-Hua Professor at the School of Automation, Central South University. Her research interest covers modeling and optimal control of complex industrial processes, intelligent automation control systems, and automatic measurement technology and instruments

    ZHANG Fan Lecturer at the School of Automation, Central South University. His research interest covers image processing and laser manufacturing

    YU Ling-Li Professor at the School of Automation, Central South University. Her research interest covers intelligent land vehicle path planning and navigation control

    CHEN Bai-Fan Associate professor at the School of Automation, Central South University. Her research interest covers intelligent driving, environment perception, and computer vision

  • Abstract: Complex scene segmentation is an important intelligent-perception task in autonomous driving, with high demands on both stability and efficiency. General scene segmentation methods are designed mainly for visible-light images, so their performance depends heavily on the lighting and weather conditions at capture time, and most of them focus only on segmentation accuracy while ignoring computational cost. This paper proposes a lightweight dual-modal segmentation network (DMSNet) based on visible and thermal images, which obtains the final segmentation result by extracting and fusing the features of the two modalities. Since the feature spaces of different modalities differ considerably and fusing them directly would reduce feature utilization, we propose a dual-path feature space adaptation (DPFSA) module that automatically learns the differences between features and transforms them into a common space. Experimental results show that the proposed method makes better use of images from both modalities, is more robust to illumination changes, and achieves good segmentation performance with a small number of parameters. (An illustrative sketch of such a dual-branch design follows the manuscript notes below.)
    1)  Manuscript received January 9, 2021; accepted April 16, 2021. Supported by National Natural Science Foundation of China (62076256) and Graduate School-enterprise Joint Innovation Project of Central South University (2021XQLH048). Recommended by Associate Editor ZHANG Xiang-Rong.
    2)  Affiliations: 1. School of Automation, Central South University, Changsha 410083; 2. School of Computer Science and Engineering, Central South University, Changsha 410083; 3. Hunan Engineering & Technology Research Center of High Strength Fastener Intelligent Manufacturing, Changde 415701; 4. Hunan Xiangjiang Artificial Intelligence Academy, Changsha 410005
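    The paper itself ships no code. As a rough illustration of the dual-branch design the abstract describes (two modality-specific encoders, a feature-space adaptation step on the thermal branch, then fusion and decoding), a minimal PyTorch sketch follows. All class names, channel widths, layer counts, and the residual two-path adaptation are illustrative assumptions, not the authors' DMSNet/DPFSA implementation.

        # Hypothetical sketch of a dual-branch RGB-thermal segmentation network.
        # Channel widths, depths, and the adaptation design are assumptions;
        # this is NOT the authors' DMSNet/DPFSA code.
        import torch
        import torch.nn as nn

        class FeatureSpaceAdaptation(nn.Module):
            """Two parallel paths learn residual corrections that map thermal
            features toward the visible-light feature space (assumed design)."""
            def __init__(self, channels):
                super().__init__()
                self.path1 = nn.Sequential(
                    nn.Conv2d(channels, channels, 3, padding=1),
                    nn.BatchNorm2d(channels), nn.LeakyReLU(0.1))
                self.path2 = nn.Sequential(
                    nn.Conv2d(channels, channels, 1),
                    nn.BatchNorm2d(channels), nn.LeakyReLU(0.1))

            def forward(self, x):
                # Residual adaptation: input plus two learned corrections.
                return x + self.path1(x) + self.path2(x)

        class DualModalSegNet(nn.Module):
            def __init__(self, num_classes=9, base=32):
                super().__init__()
                def enc(cin, cout):
                    return nn.Sequential(
                        nn.Conv2d(cin, cout, 3, stride=2, padding=1),
                        nn.BatchNorm2d(cout), nn.LeakyReLU(0.1))
                self.rgb_enc = enc(3, base)   # visible-light branch
                self.th_enc = enc(1, base)    # thermal branch
                self.adapt = FeatureSpaceAdaptation(base)
                self.decoder = nn.Sequential(
                    nn.Conv2d(2 * base, base, 3, padding=1),
                    nn.BatchNorm2d(base), nn.LeakyReLU(0.1),
                    nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
                    nn.Conv2d(base, num_classes, 1))

            def forward(self, rgb, thermal):
                f_rgb = self.rgb_enc(rgb)
                # Align the thermal features with the RGB feature space before fusion.
                f_th = self.adapt(self.th_enc(thermal))
                return self.decoder(torch.cat([f_rgb, f_th], dim=1))

        # A 480x640 RGB-thermal pair yields a same-size 9-class score map:
        net = DualModalSegNet()
        out = net(torch.randn(1, 3, 480, 640), torch.randn(1, 1, 480, 640))
        print(out.shape)  # torch.Size([1, 9, 480, 640])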
  • Fig. 1  The architecture of DMSNet

    Fig. 2  The architecture of the dual-path feature space adaptation module (DPFSA)

    Fig. 3  The other two modules obtained by adjusting the internal structure of DPFSA

    Fig. 4  Comparison of the segmentation results of DMSNet, FuseNet and MFNet on dataset A

    Table 1  Comparison of mAcc, mIoU and parameter counts of different modules on dataset A

    Models             mAcc  mIoU  Parameters
    MFNet              63.5  64.9  2.81 MB
    FuseNet            61.9  63.8  46.4 MB
    DMSNet (DPFSA-1)   65.6  68.1  5.45 MB
    DMSNet (DPFSA-2)   68.9  65.1  5.54 MB
    DMSNet (DPFSA)     69.7  69.6  5.63 MB
    Note: Parameters is the parameter count of the whole segmentation model, not of the module alone.
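    For reference, the mAcc and mIoU figures reported throughout Tables 1–5 are the usual class-averaged accuracy and intersection-over-union. The sketch below shows the standard way these are computed from a confusion matrix; it uses the generic textbook definitions, not code from the paper.

        # Class-averaged accuracy (mAcc) and IoU (mIoU) from a confusion matrix.
        # Generic definitions, not taken from the paper.
        import numpy as np

        def segmentation_metrics(conf):
            """conf[i, j] = number of pixels of true class i predicted as class j."""
            tp = np.diag(conf).astype(float)
            acc = tp / np.maximum(conf.sum(axis=1), 1)        # per-class accuracy (recall)
            union = conf.sum(axis=1) + conf.sum(axis=0) - tp  # |pred ∪ true| per class
            iou = tp / np.maximum(union, 1)
            return acc.mean(), iou.mean()                     # mAcc, mIoU

        # Toy 3-class example:
        conf = np.array([[50, 2, 3],
                         [4, 40, 6],
                         [1, 5, 44]])
        macc, miou = segmentation_metrics(conf)
        print(f"mAcc = {macc:.3f}, mIoU = {miou:.3f}")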

    Table 2  Acc results and mAcc and mIoU values of different loss functions on dataset A

    Losses     Acc per class                                         mAcc  mIoU
               1     2     3     4     5     6     7     8     9
    CE         97.6  86.5  84.9  77.8  69.5  53.3  0.0   79.8  77.4  69.7  69.6
    Focal      97.3  78.7  80.5  67.8  55.1  41.6  0.0   63.5  50.8  59.5  65.6
    Dice       96.8  77.7  83.8  0.0   0.0   0.0   0.0   36.6  0.0   32.8  25.3
    CE+Dice    97.6  87.6  83.5  79.5  73.2  47.5  0.0   74.7  92.1  70.7  70.3
    Note: Columns 1–9 are segmentation class labels: 1: Unlabeled, 2: Car, 3: Pedestrian, 4: Bike, 5: Curve, 6: Car stop, 7: Guardrail, 8: Color cone, 9: Bump.
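    Table 2 shows the combined CE+Dice objective performing best overall. Below is a minimal sketch of one common way to combine the two terms (a cross-entropy term plus a V-Net-style soft multi-class Dice term); the equal 1:1 weighting and the smoothing constant are assumptions, and the paper's exact formulation may differ.

        # Cross-entropy plus soft multi-class Dice loss.
        # The 1:1 weighting and smoothing constant are illustrative assumptions.
        import torch
        import torch.nn.functional as F

        def ce_dice_loss(logits, target, smooth=1.0):
            """logits: (N, C, H, W) raw scores; target: (N, H, W) class indices."""
            ce = F.cross_entropy(logits, target)
            probs = logits.softmax(dim=1)
            onehot = F.one_hot(target, logits.shape[1]).permute(0, 3, 1, 2).float()
            inter = (probs * onehot).sum(dim=(0, 2, 3))   # per-class soft overlap
            denom = probs.sum(dim=(0, 2, 3)) + onehot.sum(dim=(0, 2, 3))
            dice = (2 * inter + smooth) / (denom + smooth)
            return ce + (1 - dice.mean())

        # Usage: loss = ce_dice_loss(model_output, labels); loss.backward()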

    Table 3  Comparison of Acc and IoU results of different models on dataset A

    Models          2           3           4           5           6           7           8           9          mAcc  mIoU
                    Acc   IoU   Acc   IoU   Acc   IoU   Acc   IoU   Acc   IoU   Acc   IoU   Acc   IoU   Acc   IoU
    SegNet (3ch)    82.6  94.1  67.7  75.6  73.7  80.8  55.9  97.1  39.1  43.5  0.0   0.0   0.0   0.0   48.9  86.8  51.7  59.7
    SegNet (4ch)    84.4  93.1  85.5  84.7  76.0  74.7  58.2  96.5  44.2  43.6  0.0   0.0   0.0   0.0   74.4  95.6  57.8  60.9
    ENet (3ch)      85.3  92.3  53.8  68.4  67.7  71.7  52.2  95.7  16.9  24.2  0.0   0.0   0.0   0.0   0.0   0.0   41.5  43.8
    ENet (4ch)      75.5  89.6  68.1  71.7  66.8  67.6  63.2  88.5  41.5  34.1  0.0   0.0   0.0   0.0   93.2  78.1  56.2  53.6
    FuseNet         76.8  91.2  69.3  80.5  71.2  78.6  60.1  95.8  30.8  28.1  0.0   0.0   68.4  37.9  83.1  98.5  61.9  63.8
    MFNet           78.9  92.9  82.7  84.8  68.1  75.7  64.4  97.2  31.6  29.7  0.0   0.0   71.8  40.6  77.1  98.4  63.5  64.9
    DMSNet          87.6  95.8  83.5  88.7  79.5  82.5  73.2  97.9  47.5  35.7  0.0   0.0   74.7  62.0  92.1  99.8  70.7  70.3
    Note: Columns 2–9 are segmentation class labels, using the same notation as Table 2; "3ch" and "4ch" presumably denote RGB-only input and four-channel RGB-plus-thermal input, respectively.

    Table 4  Comparison of Acc and IoU results of different models on dataset B

    Models          2           3           4           5          mAcc  mIoU
                    Acc   IoU   Acc   IoU   Acc   IoU   Acc   IoU
    SegNet (3ch)    0.0   0.0   71.2  79.3  0.0   0.0   21.6  47.1  38.4  31.6
    SegNet (4ch)    0.0   0.0   62.9  70.1  0.0   0.0   30.5  46.8  38.5  29.2
    ENet (3ch)      0.0   0.0   77.6  85.5  0.0   0.0   73.4  90.9  49.9  44.1
    ENet (4ch)      0.0   0.0   72.9  74.9  0.0   0.0   74.8  89.6  49.1  41.1
    FuseNet         72.7  43.1  91.4  92.3  74.4  78.9  99.9  99.8  87.4  78.5
    MFNet           66.7  47.0  88.7  91.0  95.2  90.1  96.3  99.8  89.1  81.9
    DMSNet          67.8  43.5  89.1  90.4  96.3  97.5  99.3  99.9  90.2  82.8
    Note: Columns 2–5 are segmentation class labels: 2: Fire-Extinguisher, 3: Backpack, 4: Hand-Drill, 5: Survivor.

    Table 5  Comparison of mAcc and mIoU results of different models on dataset A in daytime and nighttime

    Models          Daytime       Nighttime
                    mAcc  mIoU    mAcc  mIoU
    SegNet (3ch)    47.8  55.5    52.6  61.3
    SegNet (4ch)    45.4  49.3    58.2  62.9
    ENet (3ch)      42.1  40.8    38.6  39.1
    ENet (4ch)      44.1  45.9    57.1  54.3
    FuseNet         50.6  61.2    63.4  64.7
    MFNet           49.0  63.3    65.8  65.1
    DMSNet          57.7  69.1    71.8  71.3
Publication History
  • Received: January 9, 2021
  • Revised: April 16, 2021
  • Available online: May 27, 2021
  • Issue published: February 18, 2022
