Feng Cheng, Zhang Cong-Xuan, Chen Zhen, Li Bing, Li Ming

Citation: Feng Cheng, Zhang Cong-Xuan, Chen Zhen, Li Bing, Li Ming. Occlusion detection based on optical flow and multiscale context. Acta Automatica Sinica, 2024, 50(9): 1854-1865. doi: 10.16383/j.aas.c210324


doi: 10.16383/j.aas.c210324 cstr: 32138.14.j.aas.c210324
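Funds: Supported by the National Key Research and Development Program of China (2020YFC2003800), the National Natural Science Foundation of China (61866026, 61772255, 62222206), the Outstanding Young Scientist Project of Jiangxi Province (20192BCB23011), the Key Project of the Natural Science Foundation of Jiangxi Province (20202ACB214007), and the Advantage Subject Team of Jiangxi Province (20165BCB19007)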

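    FENG Cheng  Master student at the School of Measuring and Optical Engineering, Nanchang Hangkong University. His main research interest is computer vision. E-mail: fengcheng00016@163.com

    ZHANG Cong-Xuan  Professor at the School of Measuring and Optical Engineering, Nanchang Hangkong University. He received his Ph.D. degree from Nanjing University of Aeronautics and Astronautics in 2014. His research interest covers image processing and computer vision. Corresponding author of this paper. E-mail: zcxdsg@163.com

    CHEN Zhen  Professor at the School of Measuring and Optical Engineering, Nanchang Hangkong University. He received his Ph.D. degree from Northwestern Polytechnical University in 2003. His research interest covers image processing and computer vision. E-mail: dr_chenzhen@163.com

    LI Bing  Professor at the National Key Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences. He received his Ph.D. degree from Beijing Jiaotong University in 2009. His research interest covers video content understanding and multimedia content security. E-mail: bli@nlpr.ia.ac.cn

    LI Ming  Professor at the School of Information Engineering, Nanchang Hangkong University. He received his Ph.D. degree from Nanjing University of Aeronautics and Astronautics in 1997. His research interest covers image processing and artificial intelligence. E-mail: liming@nchu.edu.com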

Occlusion Detection Based on Optical Flow and Multiscale Context


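  • Abstract: To address the accuracy and robustness of motion occlusion detection under non-rigid motion and large displacement, this paper proposes a motion occlusion detection method for image sequences based on optical flow and multiscale context. First, a multiscale context information aggregation network based on dilated convolution is designed, which uses the multiscale context information of the image sequence to capture image features over a larger range. Then, a feature pyramid is used to build an end-to-end motion occlusion detection network based on multiscale context and optical flow, in which optical flow refines the occlusion detection in regions of non-rigid motion and large displacement. Finally, a training loss function based on motion edges is constructed to obtain accurate motion occlusion boundaries. The proposed method is compared with existing representative methods on the MPI-Sintel and KITTI test datasets. The experimental results show that the proposed method effectively improves the accuracy and robustness of motion occlusion detection, and is notably more robust in difficult scenes such as non-rigid motion and large displacement.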
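The abstract motivates dilated convolution as a way to gather image features over a larger range. As a rough illustration only (not the network described in the paper), the sketch below computes the receptive field of a stack of stride-1 convolutions. The kernel size of 3 and the dilation rates (1, 2, 4, 8) are assumed values chosen for the example, and `receptive_field` is a hypothetical helper name.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in pixels along one axis, of stacked stride-1 convolutions.

    kernel_size: number of taps per layer (assumed identical for every layer).
    dilations:   one dilation rate per layer (assumed values, not from the paper).
    """
    rf = 1
    for d in dilations:
        # A k-tap kernel with dilation d spans (k - 1) * d + 1 pixels,
        # so each layer widens the receptive field by (k - 1) * d.
        rf += (kernel_size - 1) * d
    return rf


# Four ordinary 3x3 layers versus four dilated 3x3 layers.
plain = receptive_field(3, [1, 1, 1, 1])
dilated = receptive_field(3, [1, 2, 4, 8])
print("plain:", plain, "dilated:", dilated)
```

With the same number of layers and weights, the dilated stack covers a much wider neighbourhood, which is the general reason dilated convolutions are used for context aggregation.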
  • Fig. 1  Structure diagram of the context network

    Fig. 2  Structure diagrams of common receptive field expansion networks

    Fig. 3  Structure diagram of the multiscale context information aggregation network

    Fig. 4  Structure diagram of the occlusion detection network

    Fig. 5  Structure of the occlusion detection model based on optical flow and multiscale context information

    Fig. 6  Comparison of occlusion detection results between our method and the IRR-PWC method

    Fig. 7  Comparison of occlusion detection results on non-rigid motion and large displacement sequences of the MPI-Sintel dataset. From left to right: the alley_2, ambush_2, market_6, and temple_2 sequences

    Fig. 8  Comparison of the occlusion detection results of each method on the KITTI dataset. From left to right: the input image and the motion occlusion maps of Unflow, Back2Future, MaskFlownet, IRR-PWC, and our method

    Fig. 9  Examples of motion occlusion masks generated from the optical flow ground truth $(N=3)$

    Fig. 10  Comparison of visualization results of each ablation model

    Table 1  Comparison of average F1 scores on the MPI-Sintel dataset

    Table 2  Comparison of average omission rate and false detection rate on the MPI-Sintel dataset (%)

    Table 3  Comparison of average F1 scores for motion occlusion detection on non-rigid motion and large displacement image sequences

    Method            clean                           final
    Unflow[24]        0.4149  0.4313  0.4330  0.3243  0.4057  0.3920  0.4499  0.3120
    Back2Future[25]   0.6816  0.5888  0.6290  0.2712  0.6756  0.5199  0.6239  0.2683
    MaskFlownet[27]   0.5057  0.5403  0.4660  0.3838  0.5039  0.4085  0.4735  0.3508
    IRR-PWC[26]       0.8709  0.9172  0.8155  0.7404  0.8770  0.7809  0.8023  0.6905
    Ours              0.8811  0.9216  0.8304  0.7747  0.8764  0.7959  0.8106  0.7103
      Note: Bold font indicates the best result in each column.
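Tables 1 to 3 compare methods by F1 score, omission rate, and false rate. A minimal sketch of how such scores can be computed from a predicted and a ground-truth binary occlusion mask is given below. The F1 score follows its standard definition. The formulas used here for the omission rate (missed occluded pixels over all truly occluded pixels) and the false rate (false alarms over all truly non-occluded pixels) are assumptions, since this page does not give the paper's exact definitions. `occlusion_metrics` is a hypothetical helper name.

```python
def occlusion_metrics(pred, truth):
    """Compare two binary occlusion masks (2-D lists, 1 = occluded)."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # occluded pixel correctly detected
            elif p and not t:
                fp += 1          # false alarm
            elif t and not p:
                fn += 1          # missed occluded pixel
            else:
                tn += 1          # non-occluded pixel correctly rejected

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    # Assumed definitions, not confirmed by the source:
    omission_rate = fn / (tp + fn) if tp + fn else 0.0
    false_rate = fp / (fp + tn) if fp + tn else 0.0
    return {"f1": f1, "omission_rate": omission_rate, "false_rate": false_rate}


# Toy 2x3 masks, made up for illustration.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
print(occlusion_metrics(pred, truth))
```

Scores over a whole dataset would then be averaged across frames. Whether the paper averages per frame or pools pixels is not stated on this page.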
    Table 4  Comparison of time consumption of different methods

      Note: Bold font indicates the best value.
    Table 5  Comparison of average F1 scores over all image sequences on MPI-Sintel

    MPI-Sintel training dataset
      Note: Bold font indicates the best value.
    Table 6  Comparison of average F1 scores over all image sequences in different motion boundary regions on MPI-Sintel

    Model type    MPI-Sintel training dataset
                  clean                               final
                  $N=1$    $N=3$    $N=5$    $N=10$   $N=1$    $N=3$    $N=5$    $N=10$
    Full model    0.63     0.67     0.69     0.71     0.59     0.62     0.64     0.67
      Note: Bold font indicates the best value.
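Table 6 reports F1 scores restricted to motion boundary regions indexed by $N$. This page does not state how those regions are built, so the sketch below rests on an assumption: the region is every pixel within $N$ pixels (Chebyshev distance) of a motion boundary pixel, and the F1 score is computed only inside it. The helper names `dilate` and `f1_in_region` are hypothetical, and the toy masks are made up for illustration.

```python
def dilate(mask, n):
    """Mark every pixel within n pixels (Chebyshev distance) of a set pixel."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for yy in range(max(0, y - n), min(h, y + n + 1)):
                    for xx in range(max(0, x - n), min(w, x + n + 1)):
                        out[yy][xx] = 1
    return out


def f1_in_region(pred, truth, region):
    """F1 score of binary masks, counted only where region is 1."""
    tp = fp = fn = 0
    for y, row in enumerate(region):
        for x, inside in enumerate(row):
            if not inside:
                continue
            p, t = pred[y][x], truth[y][x]
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# One-row toy example: a single motion boundary pixel at column 3.
boundary = [[0, 0, 0, 1, 0, 0, 0]]
truth = [[0, 0, 1, 1, 0, 0, 0]]
pred = [[1, 0, 1, 1, 0, 0, 0]]

for n in (1, 3):
    region = dilate(boundary, n)
    print("N =", n, "F1 =", f1_in_region(pred, truth, region))
```

Evaluating inside a band around motion boundaries focuses the score on the pixels where occlusions actually arise, instead of letting large non-occluded areas dominate.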
