Octant Convolutional Neural Network for 3D Point Cloud Analysis

XU Xiang, SHUAI Hui, LIU Qing-Shan

Citation: XU Xiang, SHUAI Hui, LIU Qing-Shan. Octant Convolutional Neural Network for 3D Point Cloud Analysis. Acta Automatica Sinica, 2020, 46(x): 1−10 doi: 10.16383/j.aas.c200080

doi: 10.16383/j.aas.c200080
Funds: Supported by National Natural Science Foundation of China (61825601, 61532009)
More Information
    Author biographies:

    XU Xiang: Master's student at the School of Automation, Nanjing University of Information Science and Technology. Received a bachelor's degree from the School of Information and Control, Nanjing University of Information Science and Technology in 2018. Research interest: 3D point cloud scene perception. E-mail: xuxiang0103@gmail.com

    SHUAI Hui: Ph.D. candidate at Nanjing University of Information Science and Technology. Received a master's degree from the School of Information and Engineering, Nanjing University of Information Science and Technology in 2018. Research interests: object detection and 3D point cloud scene perception. E-mail: huishuai13@163.com

    LIU Qing-Shan: Dean and professor of the School of Automation, Nanjing University of Information Science and Technology. He received his Ph.D. degree from the Institute of Automation, Chinese Academy of Sciences in 2003. His research interests cover image understanding, pattern recognition, and machine learning. Corresponding author of this paper. E-mail: qsliu@nuist.edu.cn

    Corresponding author: LIU Qing-Shan. E-mail: qsliu@nuist.edu.cn
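  • Abstract: Deep-learning-based analysis of 3D point cloud data has attracted increasing attention, but the irregularity of point clouds makes efficient extraction of local structural information a major research challenge. This paper proposes an octant convolutional neural network (Octant-CNN) that operates on local spatial neighborhoods and consists of octant convolution modules and down-sampling modules. For an input point cloud, the octant convolution module locates, in the neighborhood of each point, the nearest neighbor within each of the eight octants. It then abstracts the geometric features of the eight octants into semantic features through multi-layer convolution and fuses the low-level geometric features with the high-level semantic features, so that convolution can efficiently extract local structural information in the 3D neighborhood. The down-sampling module groups the original point set and aggregates features, which enlarges the receptive field and reduces the computational complexity of the network. By stacking octant convolution modules and down-sampling modules hierarchically, Octant-CNN builds a feature representation of 3D point clouds that goes from low-level to abstract and from local to global. Experimental results show that Octant-CNN performs well on four tasks: object classification, part segmentation, semantic segmentation, and object detection.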
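The octant search described in the abstract can be illustrated with a brute-force NumPy sketch. This is not the authors' implementation: the function name `octant_nearest_neighbors`, the octant numbering, the O(N²) distance matrix, and the fallback to the center point when an octant is empty are all assumptions made for illustration.

```python
import numpy as np

def octant_nearest_neighbors(points):
    """For each point, return the index of its nearest neighbor in each of
    the 8 octants around it. points: (N, 3) array. Returns (N, 8) indices.
    Assumption: an empty octant falls back to the center point itself."""
    n = points.shape[0]
    # offsets[i, j] = points[j] - points[i], position of j relative to i
    offsets = points[None, :, :] - points[:, None, :]
    dist2 = np.sum(offsets ** 2, axis=-1)
    # Encode the sign pattern of (dx, dy, dz) as an octant id in 0..7
    # (hypothetical numbering: bit set when the offset is >= 0).
    bits = (offsets >= 0).astype(np.int64)
    octant = bits[..., 0] * 4 + bits[..., 1] * 2 + bits[..., 2]
    idx = np.tile(np.arange(n)[:, None], (1, 8))  # default: the point itself
    for o in range(8):
        # Keep only candidates in octant o, and never the center point.
        masked = np.where(octant == o, dist2, np.inf)
        np.fill_diagonal(masked, np.inf)
        nearest = masked.argmin(axis=1)
        found = np.isfinite(masked[np.arange(n), nearest])
        idx[found, o] = nearest[found]
    return idx

pts = np.array([[0, 0, 0], [1, 1, 1], [-1, -1, -1], [2, 2, 2]], dtype=float)
neighbors = octant_nearest_neighbors(pts)  # shape (4, 8)
```

Unlike a plain K-nearest-neighbor query, each of the eight returned slots is tied to one spatial direction, so the neighbors have a fixed order that a convolution can consume.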
  • Fig. 1  Illustration of network architecture

    Fig. 2  Comparison of three-stage and one-stage 2D convolution

    Fig. 3  Octant convolution module

    Fig. 4  Visualization of results on S3DIS

    Fig. 5  Visualization of detection results on KITTI

    Fig. 6  Comparison of KNN and eight-octant search

    Table 1  Classification results on ModelNet40

    | Method | oAcc (%) | mAcc (%) |
    |---|---|---|
    | PointNet[12] | 89.2 | 86.2 |
    | PointNet++[13] | 90.7 | - |
    | PointSIFT[14] | 90.2 | 86.9 |
    | SFCNN[15] | 91.4 | - |
    | ConvPoint[17] | 91.8 | 88.5 |
    | ECC[18] | 87.4 | 83.2 |
    | RGCNN[19] | 90.5 | 87.3 |
    | PAT[22] | 91.7 | - |
    | SCN[23] | 90.0 | 87.6 |
    | SRN-PointNet++[24] | 91.5 | - |
    | JUSTLOOKUP[25] | 89.5 | 86.4 |
    | Kd-Net[26] | 91.8 | 88.5 |
    | SO-Net[27] | 90.9 | 87.2 |
    | Octant-CNN | 91.9 | 88.7 |
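The two column headers in Table 1 are not defined on this page. Assuming they follow the usual ModelNet40 convention, oAcc is the fraction of all test samples classified correctly and mAcc is the unweighted mean of the per-class accuracies. A minimal sketch under that assumption (the function name is hypothetical):

```python
import numpy as np

def overall_and_mean_class_accuracy(y_true, y_pred):
    """Return (oAcc, mAcc) as fractions in [0, 1]."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    correct = (y_true == y_pred)
    # Overall accuracy: every sample has the same weight.
    o_acc = float(correct.mean())
    # Mean class accuracy: every class has the same weight.
    per_class = [correct[y_true == c].mean() for c in np.unique(y_true)]
    m_acc = float(np.mean(per_class))
    return o_acc, m_acc

# Three samples of class 0 and one of class 1, all predicted as class 0.
o_acc, m_acc = overall_and_mean_class_accuracy([0, 0, 0, 1], [0, 0, 0, 0])
```

With unbalanced classes the two numbers diverge, which is why mAcc is typically lower than oAcc in the table.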

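    Table 2  Part segmentation results on ShapeNet

    | Method | mIoU | aero | bag | cap | car | chair | earphone | guitar | knife | lamp | laptop | motor | mug | pistol | rocket | skateboard | table |
    |---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
    | PointNet[12] | 83.7 | 83.4 | 78.7 | 82.5 | 74.9 | 89.6 | 73.0 | 91.5 | 85.9 | 80.8 | 95.3 | 65.2 | 93.0 | 81.2 | 57.9 | 72.8 | 80.6 |
    | PointNet++[13] | 85.1 | 82.4 | 79.0 | 87.7 | 77.3 | 90.8 | 71.8 | 91.0 | 85.9 | 83.7 | 95.3 | 71.6 | 94.1 | 81.3 | 58.7 | 76.4 | 82.6 |
    | PointSIFT[14] | 79.0 | 75.1 | 78.4 | 81.8 | 74.5 | 85.2 | 64.3 | 89.6 | 81.9 | 77.5 | 95.1 | 64.0 | 93.5 | 77.1 | 54.2 | 70.6 | 74.3 |
    | RGCNN[19] | 84.3 | 80.2 | 82.8 | 92.6 | 75.3 | 89.2 | 73.7 | 91.3 | 88.4 | 83.3 | 96.0 | 63.9 | 95.7 | 60.9 | 44.6 | 72.9 | 80.4 |
    | DGCNN[20] | 85.1 | 84.2 | 83.7 | 84.4 | 77.1 | 90.9 | 78.5 | 91.5 | 87.3 | 82.9 | 96.0 | 67.8 | 93.3 | 82.6 | 59.7 | 75.5 | 82.0 |
    | SCN[23] | 84.6 | 83.8 | 80.8 | 83.5 | 79.3 | 90.5 | 69.8 | 91.7 | 86.5 | 82.9 | 96.0 | 69.2 | 93.8 | 82.5 | 62.9 | 74.4 | 80.8 |
    | Kd-Net[26] | 82.3 | 80.1 | 74.6 | 74.3 | 70.3 | 88.6 | 73.5 | 90.2 | 87.2 | 81.0 | 94.9 | 57.4 | 86.7 | 78.1 | 51.8 | 69.9 | 80.3 |
    | SO-Net[27] | 84.6 | 81.9 | 83.5 | 84.8 | 78.1 | 90.8 | 72.2 | 90.1 | 83.6 | 82.3 | 95.2 | 69.3 | 94.2 | 80.0 | 51.6 | 72.1 | 82.6 |
    | RS-Net[29] | 84.9 | 82.7 | 86.4 | 84.1 | 78.2 | 90.4 | 69.3 | 91.4 | 87.0 | 83.5 | 95.4 | 66.0 | 92.6 | 81.8 | 56.1 | 75.8 | 82.2 |
    | Octant-CNN | 85.3 | 83.9 | 83.6 | 88.3 | 79.2 | 91.1 | 70.8 | 91.8 | 87.5 | 82.9 | 95.7 | 72.2 | 94.5 | 83.6 | 60.0 | 75.5 | 81.9 |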

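    Table 3  Semantic segmentation results on S3DIS

    | Method | mIoU | OA | ceiling | floor | wall | beam | column | window | door | chair | table | bookcase | sofa | board | clutter |
    |---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
    | PointNet[12] | 47.7 | 78.6 | 88.0 | 88.7 | 69.3 | 42.4 | 23.1 | 47.5 | 51.6 | 42.0 | 54.1 | 38.2 | 9.6 | 29.4 | 35.2 |
    | PointNet++[13] | 57.3 | 83.8 | 91.5 | 92.8 | 74.6 | 41.3 | 28.1 | 54.5 | 59.6 | 64.6 | 58.9 | 27.1 | 52.0 | 52.3 | 48.0 |
    | PointSIFT[14] | 55.5 | 83.5 | 91.1 | 91.3 | 75.5 | 42.0 | 24.0 | 51.4 | 56.6 | 60.2 | 55.8 | 17.0 | 50.2 | 57.1 | 49.9 |
    | RS-Net[29] | 56.5 | - | 92.5 | 92.8 | 78.6 | 32.8 | 34.4 | 51.6 | 68.1 | 59.7 | 60.1 | 16.4 | 50.2 | 44.9 | 52.0 |
    | Octant-CNN | 58.3 | 84.6 | 92.1 | 94.5 | 76.3 | 48.9 | 30.8 | 56.9 | 62.9 | 65.8 | 55.5 | 28.0 | 48.1 | 50.3 | 48.4 |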

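    Table 4  Performance comparison in 3D object detection

    | Method | Cars (Easy) | Cars (Moderate) | Cars (Hard) | Pedestrians (Easy) | Pedestrians (Moderate) | Pedestrians (Hard) | Cyclists (Easy) | Cyclists (Moderate) | Cyclists (Hard) |
    |---|---|---|---|---|---|---|---|---|---|
    | F-PointNet v1[32] | 83.75 | 69.37 | 62.83 | 65.39 | 55.32 | 48.62 | 70.17 | 52.87 | 48.27 |
    | F-PointNet v2[32] | 83.93 | 71.23 | 63.72 | 64.23 | 56.95 | 50.15 | 74.04 | 54.92 | 50.53 |
    | Frustum PointSIFT[14] | 71.56 | 66.17 | 58.97 | 63.13 | 55.08 | 49.05 | 70.36 | 52.56 | 48.53 |
    | Frustum Geo-CNN[33] | 85.09 | 71.02 | 63.38 | 69.64 | 60.50 | 52.88 | 75.64 | 56.25 | 52.54 |
    | Frustum Octant-CNN | 85.10 | 72.31 | 64.46 | 67.90 | 59.73 | 52.44 | 76.56 | 57.50 | 54.26 |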

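    Table 5  Analysis of the structure design

    | Model | Multi-layer fusion | Residual | Voting | oAcc (%) |
    |---|---|---|---|---|
    | A | | | | 90.7 |
    | B | $\checkmark$ | | | 91.2 |
    | C | $\checkmark$ | $\checkmark$ | | 91.5 |
    | D | $\checkmark$ | $\checkmark$ | $\checkmark$ | 91.9 |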

    Table 6  Comparison of 2D CNN and MLP

    | Model | Operation | oAcc (%) |
    |---|---|---|
    | A | MLP | 90.8 |
    | B | 2D CNN | 91.9 |

    Table 7  Results with different neighbor points

    | Model | Neighbor points | Accuracy (%) |
    |---|---|---|
    | A | K-nearest neighbors | 90.2 |
    | B | Eight-octant search | 91.9 |

    Table 8  Comparison of different search radii

    | Model | Search radius | oAcc (%) |
    |---|---|---|
    | A | (0.25, 0.5, 1.0) | 88.0 |
    | B | (0.4, 0.8, 1.0) | 89.2 |
    | C | (0.5, 1.0, 1.0) | 89.9 |
    | D | None | 91.9 |

    Table 9  Results with different input channels

    | Model | Input channels | oAcc (%) |
    |---|---|---|
    | A | ($f_{ij}$) | 90.1 |
    | B | ($x_i-x_{ij}, f_{ij}$) | 90.3 |
    | C | ($x_i, f_{ij}$) | 90.8 |
    | D | ($x_i, x_i-x_{ij}, f_{ij}$) | 91.9 |
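Setting D in Table 9 can be pictured as a per-neighbor concatenation. The symbols are not defined on this page, so the sketch below assumes that $x_i$ is the coordinate of the center point, $x_{ij}$ the coordinate of its $j$-th octant neighbor, and $f_{ij}$ the feature of that neighbor. The function name `build_octant_inputs` and the array layout are hypothetical.

```python
import numpy as np

def build_octant_inputs(points, feats, idx):
    """Assemble (x_i, x_i - x_ij, f_ij) for every center i and neighbor j.
    points: (N, 3) coordinates, feats: (N, C) features,
    idx: (N, 8) neighbor indices. Returns an (N, 8, 3 + 3 + C) array."""
    n, k = idx.shape
    center = np.broadcast_to(points[:, None, :], (n, k, 3))  # x_i, repeated
    neighbor_xyz = points[idx]                                # x_ij
    neighbor_feat = feats[idx]                                # f_ij
    # Absolute position, relative offset, and neighbor feature side by side.
    return np.concatenate([center, center - neighbor_xyz, neighbor_feat], axis=-1)

points = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]])
feats = np.array([[10.0], [20.0]])
idx = np.array([[1] * 8, [0] * 8])   # each point uses the other as neighbor
inputs = build_octant_inputs(points, feats, idx)  # shape (2, 8, 7)
```

Dropping the first or second group of channels from the concatenation gives settings B and C of the table under the same assumed notation.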

    Table 10  Comparison of robustness to point cloud rotation

    | Method | $0^\circ$ | $30^\circ$ | $60^\circ$ | $90^\circ$ | $180^\circ$ | Mean | Variance |
    |---|---|---|---|---|---|---|---|
    | PointSIFT[14] | 88.2 | 89.2 | 88.9 | 88.7 | 88.5 | 88.7 | 0.124 |
    | PointSIFT+T-Net | 89.1 | 89.4 | 89.4 | 88.6 | 88.6 | 89.04 | 0.114 |
    | Octant-CNN | 91.5 | 91.7 | 91.9 | 91.5 | 91.8 | 91.68 | 0.025 |
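A rotation test like the one in Table 10 applies a fixed rotation to each input cloud before evaluation. The page does not say which axis the clouds were rotated about, so the sketch below assumes the vertical z axis; the function name `rotate_about_z` is hypothetical.

```python
import numpy as np

def rotate_about_z(points, angle_deg):
    """Rotate an (N, 3) point cloud counterclockwise about the z axis.
    Assumption: z is the up axis; the source does not state the axis."""
    theta = np.deg2rad(angle_deg)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0],
                    [s, c, 0.0],
                    [0.0, 0.0, 1.0]])
    # Points are row vectors, so multiply by the transpose.
    return points @ rot.T

cloud = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 5.0]])
rotated = [rotate_about_z(cloud, a) for a in (0, 30, 60, 90, 180)]
```

A rotation preserves pairwise distances but changes which octant each neighbor falls into, which is what the per-angle accuracies in the table probe.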

    Table 11  Complexity in point cloud semantic segmentation

    | Method | Parameters | FLOPs |
    |---|---|---|
    | PointNet[12] | 1.17M | 7.22B |
    | PointNet++[13] | 0.97M | 1.96B |
    | PointSIFT[14] | 13.53M | 24.32B |
    | Octant-CNN | 4.31M | 2.44B |
  • [1] Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems. Nevada, USA, 2012. 1097−1105
    [2] He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016. 770−778
    [3] Girshick R. Fast r-cnn. In: Proceedings of the IEEE International Conference on Computer Vision. Santiago, Chile: IEEE, 2015. 1440−1448
    [4] Redmon J, Divvala S, Girshick R, Farhadi A. You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016. 779−788
    [5] Zhu Z, Xu M, Bai S, Huang T, Bai X. Asymmetric non-local neural networks for semantic segmentation. In: Proceedings of the IEEE International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 593−602
    [6] Li Y, Qi H, Dai J, Ji X, Wei Y. Fully convolutional instance-aware semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Hawaii, USA: IEEE, 2017. 2359−2367
    [7] Peng Xiu-Ping, Tong Qi-Sheng, Lin Hong-Bin, Feng Chao, Zheng Wu. A deep residual-feature pyramid network for scattered point cloud semantic segmentation. Acta Automatica Sinica, 2019, 45(x): 1−10
    [8] Maturana D, Scherer S. Voxnet: a 3d convolutional neural network for real-time object recognition. In: Proceedings of 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems. Hamburg, Germany: IEEE, 2015. 922−928
    [9] Wu Z, Song S, Khosla A, et al. 3d shapenets: a deep representation for volumetric shapes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015. 1912−1920
    [10] Su H, Maji S, Kalogerakis E, Learned-Miller E. Multi-view convolutional neural networks for 3d shape recognition. In: Proceedings of the IEEE International Conference on Computer Vision. Santiago, Chile: IEEE, 2015. 945−953
    [11] Yang Z, Wang L. Learning relationships for multi-view 3d object recognition. In: Proceedings of the IEEE International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 7505−7514
    [12] Qi C R, Su H, Mo K, Guibas L J. Pointnet: deep learning on point sets for 3d classification and segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Hawaii, USA: IEEE, 2017. 652−660
    [13] Qi C R, Yi L, Su H, Guibas L J. Pointnet++: deep hierarchical feature learning on point sets in a metric space. In: Advances in Neural Information Processing Systems. Long Beach, USA, 2017. 5099−5108
    [14] Jiang M, Wu Y, Zhao T, Zhao Z, Lu C. Pointsift: a sift-like network module for 3d point cloud semantic segmentation[Online], available: https://arxiv.org/abs/1807.00652, July 22, 2020
    [15] Rao Y, Lu J, Zhou J. Spherical fractal convolutional neural networks for point cloud recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Long Beach, CA, USA, 2019. 452−460
    [16] Liu Y, Fan B, Xiang S, Pan C. Relation-shape convolutional neural network for point cloud analysis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Long Beach, CA, USA, 2019. 8895−8904
    [17] Boulch A. Convpoint: continuous convolutions for point cloud processing. Computers & Graphics, 2020, 88: 24−34
    [18] Simonovsky M, Komodakis N. Dynamic edge-conditioned filters in convolutional neural networks on graphs. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Hawaii, USA: IEEE, 2017. 3693−3702
    [19] Te G, Hu W, Zheng A, Guo Z. Rgcnn: regularized graph cnn for point cloud segmentation. In: Proceedings of the 26th ACM International Conference on Multimedia. Seoul, South Korea: ACM, 2018. 746−754
    [20] Wang Y, Sun Y, Liu Z, Sarma S E, Bronstein M M, Solomon J M. Dynamic graph cnn for learning on point clouds. ACM Transactions on Graphics (TOG), 2019, 38(5): 1−12
    [21] Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 2014, 15(1): 1929−1958
    [22] Yang J, Zhang Q, Ni B, et al. Modeling point clouds with self-attention and gumbel subset sampling. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Long Beach, CA, USA, 2019. 3323−3332
    [23] Xie S, Liu S, Chen Z, Tu Z. Attentional shapecontextnet for point cloud recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 4606−4615
    [24] Duan Y, Zheng Y, Lu J, Zhou J, Tian Q. Structural relational reasoning of point clouds. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Long Beach, CA, USA: IEEE, 2019. 949−958
    [25] Lin H, Xiao Z, Tan Y, Chao H, Ding S. Justlookup: one millisecond deep feature extraction for point clouds by lookup tables. In: Proceedings of 2019 IEEE International Conference on Multimedia and Expo. Shanghai, China: IEEE, 2019. 326−331
    Wang P, Liu Y, Guo Y, Sun C, Tong X. O-cnn: octree-based convolutional neural networks for 3d shape analysis. ACM Transactions on Graphics (TOG), 2017, 36(4): 1−11
    [26] Klokov R, Lempitsky V. Escape from cells: deep kd-networks for the recognition of 3d point cloud models. In: Proceedings of the IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017. 863−872
    [27] Li J, Chen B M, Lee G H. So-net: self-organizing network for point cloud analysis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 9397−9406
    [28] Yi L, Kim V G, Ceylan D, et al. A scalable active framework for region annotation in 3d shape collections. ACM Transactions on Graphics (TOG), 2016, 35(6): 1−12
    [29] Huang Q, Wang W, Neumann U. Recurrent slice networks for 3d segmentation of point clouds. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 2626−2635
    [30] Armeni I, Sener O, Zamir A R, et al. 3d semantic parsing of large-scale indoor spaces. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016. 1534−1543
    [31] Geiger A, Lenz P, Urtasun R. Are we ready for autonomous driving? The KITTI vision benchmark suite. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Rhode Island, USA: IEEE, 2012. 3354−3361
    [32] Qi C R, Liu W, Wu C, Su H, Guibas L J. Frustum pointnets for 3d object detection from rgb-d data. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 918−927
    [33] Lan S, Yu R, Yu G, Davis L S. Modeling local geometric structure of 3d point clouds using geo-cnn. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 998−1008
Publication history
  • Received: 2020-02-25
  • Accepted: 2020-07-21
