• Chinese Core Journals
  • EI
  • China Science and Technology Core Journals
  • Scopus
  • CSCD
  • Science Abstracts (UK)


Efficient Transfer Learning for Dynamic Point Cloud with Progressive Adapters

Sun Yi-Ding, Zhu Ji-Hua, Wang Yao-Nan, Cheng Hao-Zhe, Lu Chao-Yi, Chen Lin

Citation: Sun Yi-Ding, Zhu Ji-Hua, Wang Yao-Nan, Cheng Hao-Zhe, Lu Chao-Yi, Chen Lin. Efficient transfer learning for dynamic point cloud with progressive adapters. Acta Automatica Sinica, xxxx, xx(x): x−xx doi: 10.16383/j.aas.c250666

doi: 10.16383/j.aas.c250666 cstr: 32138.14.j.aas.c250666

Efficient Transfer Learning for Dynamic Point Cloud with Progressive Adapters

Funds: Supported by the National Natural Science Foundation of China (62125305), the Natural Science Basic Research Plan in Shaanxi Province of China (2025JC-JCQN-091), and the Technology Innovation Leading Program of Shaanxi (2024QY-SZX-23)
More Information
    Author Bio:

    SUN Yi-Ding  Master's student at the School of Software Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University. He received his bachelor's degree from the School of Computer Science and Technology, Xi'an Technological University, in 2024. His research interests include point cloud understanding.

    ZHU Ji-Hua  Professor at the School of Software Engineering, Xi'an Jiaotong University. He received his bachelor's degree in automation from Central South University in 2004 and his Ph.D. degree in control science and engineering from Xi'an Jiaotong University in 2011. His research interests include computer vision and machine learning. Corresponding author of this paper.

    WANG Yao-Nan  Academician of the Chinese Academy of Engineering and professor at the School of Artificial Intelligence and Robotics, Hunan University. He received his Ph.D. degree from Hunan University in 1995. His research interests include robotics, intelligent control, and image processing.

    CHENG Hao-Zhe  Ph.D. candidate at the School of Software Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University. He received his master's degree from the School of Electronic Information, Xi'an Polytechnic University, in 2022. His main research interest is point cloud processing.

    LU Chao-Yi  Master's student at the School of Software Engineering, Xi'an Jiaotong University. He received his bachelor's degree from the School of Computer Science and Technology, Zhejiang University of Science and Technology, in 2024. His main research interest is computer vision.

    CHEN Lin  Assistant professor at the School of Software Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University. He received his bachelor's degree in electronic information engineering from Northwest Normal University in 2018, his master's degree in computer technology from the University of Chinese Academy of Sciences in 2021, and his Ph.D. degree in control science and engineering from Hunan University in 2025. His research interests include multi-robot systems, deep reinforcement learning, and visual navigation.

  • Abstract: Dynamic point cloud videos exhibit complex spatio-temporal coupling, and existing end-to-end methods are validated only on specific datasets and tasks: they generalize poorly across scenes and tasks and must be retrained repeatedly. To mine transferable general-purpose priors, this paper refocuses on static point cloud foundation models. We first analyze the two main bottlenecks of existing transfer learning methods, namely dense computation and redundant design. To obtain a lightweight and compact model, we propose a cross-modal efficient transfer learning method based on progressive adapters. The method reduces the complexity of self-attention from quadratic to linear through non-overlapping 3D sliding-window attention. A shuffled Hilbert-Z-order bidirectional scanning curve serializes dynamic point cloud videos into Mamba-compatible feature sequences, and progressive gated fusion enables efficient cooperation between the foundation model and the newly added adapters. The overall method leaves the foundation model weights unchanged and requires no external auxiliary designs. Experimental results on MSR-Action3D, HOI4D, and SHREC'17 show that fine-tuning only 1.8% of the parameters outperforms existing baseline models, verifying the superiority of the proposed method.
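The abstract's key efficiency claim — replacing global self-attention with non-overlapping window attention to drop from quadratic to linear cost — can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the window size, feature layout, and single-head form here are placeholders for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def windowed_attention(x, window):
    """Non-overlapping window attention: each token attends only to tokens
    in its own window, so the total cost is O(N * window * d) — linear in
    the sequence length N — instead of O(N^2 * d) for global attention."""
    n, d = x.shape
    assert n % window == 0, "sequence length must be divisible by window size"
    xw = x.reshape(n // window, window, d)             # (num_windows, w, d)
    scores = xw @ xw.transpose(0, 2, 1) / np.sqrt(d)   # (num_windows, w, w)
    return (softmax(scores) @ xw).reshape(n, d)

x = np.random.default_rng(1).standard_normal((64, 8))
y = windowed_attention(x, window=16)   # 4 independent windows of 16 tokens
```

Within one window the result is identical to full attention restricted to that window, which is why accuracy can be preserved while the attention cost no longer grows quadratically with the number of points.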
  • Fig. 1  Comparison between our method and existing methods on performance and learnable parameter ratio

    Fig. 2  Schematic diagram of the PointM4 model architecture

    Fig. 3  Schematic diagram of the bidirectional scan and shuffle strategy

    Fig. 4  Visualization comparison of action segmentation between PointM4 and baseline models

    Fig. 5  t-SNE visualization of the two adaptation methods

    Fig. 6  Comparison of training time and GPU memory usage between PointM4 and existing methods

    Table 1  Action recognition accuracy on the MSR-Action3D dataset (%)

    | Method | Reference | 12 frames | 16 frames | 24 frames | 32 frames |
    |---|---|---|---|---|---|
    | *Supervised training* | | | | | |
    | MeteorNet[27] | ICCV2019 | 86.53 | 88.21 | 88.50 | N/A |
    | PSTNet[4] | ICLR2021 | 87.88 | 89.90 | 91.20 | N/A |
    | P4Transformer[3] | CVPR2021 | 87.54 | 89.56 | 90.94 | 87.93 |
    | Kinet[28] | CVPR2022 | 88.53 | 91.92 | 93.27 | N/A |
    | PPTr[29] | ECCV2022 | 89.89 | 90.31 | 92.33 | N/A |
    | LeaF[30] | ICCV2023 | N/A | 91.50 | 93.84 | N/A |
    | PST-Transformer[19] | TPAMI2023 | 88.15 | 91.98 | 93.73 | N/A |
    | X4D-SceneFormer[31] | AAAI2024 | N/A | 92.56 | 93.90 | N/A |
    | MAMBA4D[20] | CVPR2025 | N/A | N/A | 93.38 | 93.10 |
    | *Self-supervised training (end-to-end fine-tuning)* | | | | | |
    | CPR[5] | AAAI2023 | 91.00 | 92.15 | 93.03 | N/A |
    | C2P[32] | CVPR2023 | N/A | 91.89 | 94.76 | N/A |
    | PointCMP[33] | CVPR2023 | 91.58 | 92.26 | 93.27 | N/A |
    | PointCPSC[34] | ICCV2023 | 90.24 | 92.26 | 92.68 | N/A |
    | MaST-Pre[6] | ICCV2023 | N/A | N/A | 94.08 | N/A |
    | *Cross-modal transfer learning (3D pre-trained model adaptation)* | | | | | |
    | PointCSA[9] | CVPR2025 | 92.04 | 93.73 | 95.12 | 95.47 |
    | PointBERT+M4 | - | $ 92.33_{(+0.29)} $ | $ {\bf{94.77}}_{(+1.04)} $ | $ 95.47_{(+0.35)} $ | $ 96.17_{(+0.70)} $ |
    | PointMAE+M4 | - | $ 93.37_{(+1.33)} $ | $ 94.08_{(+0.35)} $ | $ 95.12_{(+0.00)} $ | $ {\bf{96.86}}_{(+1.39)} $ |
    | PointGPT-S+M4 | - | $ {\bf{93.72}}_{(+1.68)} $ | $ 94.43_{(+0.70)} $ | $ {\bf{95.62}}_{(+0.50)} $ | $ 96.52_{(+1.05)} $ |

    Table 2  Action segmentation accuracy on the HOI4D dataset (%)

    | Method | Reference | Accuracy | Edit distance | F1@50 |
    |---|---|---|---|---|
    | *Supervised training* | | | | |
    | P4Trans[3] | CVPR2021 | 71.2 | 73.1 | 58.2 |
    | PPTr[29] | ECCV2022 | 77.4 | 80.1 | 69.5 |
    | MAMBA4D[20] | CVPR2025 | 85.5 | 91.3 | 85.5 |
    | *Self-supervised training (end-to-end fine-tuning)* | | | | |
    | P4Trans+C2P[32] | CVPR2023 | 73.5 | 76.8 | 62.4 |
    | PPTr+C2P[32] | CVPR2023 | 81.1 | 84.0 | 74.1 |
    | X4D[31] | AAAI2024 | 84.1 | 91.1 | 84.8 |
    | CrossVideo[36] | ICRA2024 | 83.7 | 86.0 | 76.0 |
    | *Cross-modal transfer learning (3D pre-trained model adaptation)* | | | | |
    | P4Trans+M4 | - | $ {\bf{80.1}}_{(+8.9)} $ | $ {\bf{74.7}}_{(+1.6)} $ | $ {\bf{71.7}}_{(+13.5)} $ |

    Table 3  Semantic segmentation on the HOI4D dataset

    | Method | Reference | Input frames | mIoU (%) |
    |---|---|---|---|
    | P4Trans[3] | CVPR2021 | 3 | 40.1 |
    | P4Trans+C2P[32] | CVPR2023 | 3 | 41.4 |
    | P4Trans+CrossVideo[36] | ICRA2024 | 3 | 42.1 |
    | PPTr[29] | ECCV2022 | 3 | 41.0 |
    | PPTr+C2P[32] | CVPR2023 | 3 | 42.3 |
    | P4Trans+M4 | - | 3 | $ {\bf{42.8}}_{(+2.7)} $ |

    Table 4  Gesture recognition accuracy on the SHREC'17 dataset (%)

    | Method | Reference | Accuracy |
    |---|---|---|
    | *Supervised training* | | |
    | PLSTM-base[38] | CVPR2020 | 87.6 |
    | PLSTM-early[38] | CVPR2020 | 93.5 |
    | PLSTM-PSS[38] | CVPR2020 | 93.1 |
    | PLSTM-middle[38] | CVPR2020 | 94.7 |
    | PLSTM-late[38] | CVPR2020 | 93.5 |
    | Kinet[28] | CVPR2022 | 95.2 |
    | *Self-supervised training (end-to-end fine-tuning)* | | |
    | PointCMP[33] | CVPR2023 | 93.3 |
    | MaST-Pre[6] | ICCV2023 | 92.4 |
    | *Cross-modal transfer learning (3D pre-trained model adaptation)* | | |
    | PointBERT+CSA[9] | CVPR2025 | 96.2 |
    | PointMAE+CSA[9] | CVPR2025 | 95.2 |
    | PointGPT-S+CSA[9] | CVPR2025 | 96.5 |
    | PointBERT+M4 | - | $ {\bf{96.4}}_{(+0.2)} $ |
    | PointMAE+M4 | - | $ {\bf{96.3}}_{(+1.1)} $ |
    | PointGPT-S+M4 | - | $ {\bf{96.7}}_{(+0.2)} $ |

    Table 5  Effect of sliding window size on attention efficiency and model performance

    | Model | Window size (points) | Frames/s | Attention share (%) | Accuracy (%) |
    |---|---|---|---|---|
    | A0 | $ 640 \times 240 \times 3 $ | 102.9 | 54.5 | 94.4 |
    | A1 | $ 16 \times 16 \times 3 $ | 110.7 | 36.9 | 95.4 |
    | A2 | $ 8 \times 8 \times 3 $ | 112.5 | 36.6 | 96.8 |
    | A3 | $ 4 \times 4 \times 3 $ | 113.2 | 36.5 | 96.2 |

    Table 6  Impact of the Mamba scanning strategy and number of variants on model performance (%)

    | Model | Bidirectional scan | Shuffle strategy | xyz | yxz | xzy | zxy | yzx | zyx | Accuracy |
    |---|---|---|---|---|---|---|---|---|---|
    | B0 | $ \times $ | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | 94.9 |
    | B1 | $ \checkmark $ | $ \times $ | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | 94.2 |
    | B2 | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | 96.8 |
    | B3 | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | $ \times $ | $ \times $ | $ \checkmark $ | $ \checkmark $ | 95.8 |
    | B4 | $ \checkmark $ | $ \checkmark $ | $ \times $ | $ \times $ | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | 95.1 |
    | B5 | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | $ \checkmark $ | $ \times $ | $ \times $ | $ \times $ | $ \times $ | 94.9 |
    | B6 | $ \checkmark $ | $ \checkmark $ | $ \times $ | $ \times $ | $ \checkmark $ | $ \checkmark $ | $ \times $ | $ \times $ | 94.7 |
    | B7 | $ \checkmark $ | $ \checkmark $ | $ \times $ | $ \times $ | $ \times $ | $ \times $ | $ \checkmark $ | $ \checkmark $ | 94.7 |
    | B8 | $ \times $ | $ \times $ | $ \times $ | $ \times $ | $ \times $ | $ \times $ | $ \times $ | $ \times $ | 92.3 |
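The six scan variants ablated above are the six permutations of the x, y, z axis order, each traversed forward and backward when bidirectional scanning is enabled. The sketch below shows how such serializations could be produced in principle, using plain lexicographic sorting; the paper's shuffled Hilbert-Z curve is more elaborate, and its exact construction is not given on this page.

```python
import itertools
import numpy as np

def serialize(points, order="xyz", reverse=False):
    """Return indices that order `points` lexicographically by the given
    axis permutation (e.g. "zyx" sorts primarily by z). `reverse=True`
    gives the backward pass of a bidirectional scan."""
    axes = ["xyz".index(c) for c in order]
    # np.lexsort treats the LAST key as primary, so feed axes reversed.
    idx = np.lexsort(tuple(points[:, a] for a in reversed(axes)))
    return idx[::-1] if reverse else idx

pts = np.random.default_rng(0).random((100, 3))
variants = ["".join(p) for p in itertools.permutations("xyz")]  # 6 orderings
# 6 axis orderings x 2 directions = 12 scan sequences
scans = [serialize(pts, v, rev) for v in variants for rev in (False, True)]
```

Each scan turns an unordered point set into a 1-D sequence, which is what a Mamba-style state-space model consumes; using several orderings (and both directions) exposes different spatial neighborhoods to the sequential model.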

    Table 7  Impact of adaptation methods on model performance

    | Model | Adaptation method | Operation | Accuracy (%) |
    |---|---|---|---|
    | C0 | Skip | $ X $ | 92.4 |
    | C1 | Add | $ X+Y $ | 92.3 |
    | C2 | Max | $ \max (X,\; Y) $ | 90.2 |
    | C3 | Concatenate | $ {\rm{Linear}}([X,\; Y]) $ | 95.5 |
    | C4 | Progressive adaptation | $ {\rm{ProAda}}\left(X,\; Y_1,\; Y_2\right) $ | 96.8 |
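The ablated fusion operations can be written out directly. The sketch below implements C0-C3 exactly as listed; the gating form of ProAda is an assumption (a simple two-stage gated residual), since this page does not define the operation beyond its signature $ {\rm{ProAda}}(X, Y_1, Y_2) $.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fuse(x, y, how, w=None):
    """Fusion of a frozen backbone feature X with an adapter feature Y,
    as ablated in Table 7 (C0-C3)."""
    if how == "skip":    # C0: X — adapter branch ignored
        return x
    if how == "add":     # C1: X + Y
        return x + y
    if how == "max":     # C2: max(X, Y), element-wise
        return np.maximum(x, y)
    if how == "concat":  # C3: Linear([X, Y]); w stands in for the weight
        return np.concatenate([x, y], axis=-1) @ w
    raise ValueError(how)

def pro_ada(x, y1, y2, g1, g2):
    """Hypothetical sketch of ProAda(X, Y1, Y2): two adapter stages are
    blended into X in sequence through learned scalar gates g1, g2.
    NOTE: this gating form is an assumption, not the paper's definition."""
    h = x + sigmoid(g1) * y1      # first (shallow) adapter stage
    return h + sigmoid(g2) * y2   # second (deeper) adapter stage

x = np.ones((2, 8))
y1 = y2 = 0.1 * np.ones((2, 8))
z = pro_ada(x, y1, y2, g1=0.0, g2=0.0)  # gates sit at 0.5 each
```

The residual form keeps the frozen feature X as the default path, which is consistent with the abstract's statement that the foundation model weights are never modified.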

    Table 8  Impact of cascade order on model performance

    | Model | Adapted | Cascade order | Accuracy (%) |
    |---|---|---|---|
    | D0 | $ \checkmark $ | [TM]$ \times $12 | 96.8 |
    | D1 | $ \times $ | [T]$ \times $12 [M]$ \times $12 | 90.6 |
    | D2 | $ \times $ | [T]$ \times $6 [TMM]$ \times $6 | 92.5 |
    | D3 | $ \checkmark $ | [TTMM]$ \times $6 | 93.8 |

    Table 9  Time analysis of the overall inference pipeline (ms)

    | Method | Frames | Preprocessing | Model inference | Postprocessing | Accuracy (%) |
    |---|---|---|---|---|---|
    | PointCSA | 24 | 3.3 | 6.7 | 0.9 | 95.1 |
    | PointCSA | 36 | N/A (out of memory, OOM) | | | |
    | PointM4 | 24 | 4.8 | 3.2 | 0.9 | 95.1 |
    | PointM4 | 36 | 5.0 | 3.8 | 1.0 | 96.8 |
  • [1] Wang Yao-Nan, Hua He-An, Zhang Hui, Zhong Hang, Fan Ye-Xin, Liang Hong-Tao, Chang Hao, Fang Yong-Chun. Performance function-guided deep reinforcement learning control for UAV swarm. Acta Automatica Sinica, 2025, 51(5): 905−916 doi: 10.16383/j.aas.c240519
    [2] Tian Yong-Lin, Shen Yu, Li Qiang, Wang Fei-Yue. Parallel point clouds: Point clouds generation and 3D model evolution via virtual-real interaction. Acta Automatica Sinica, 2020, 46(12): 2572−2582 doi: 10.16383/j.aas.c200800
    [3] H. Fan, Y. Yang, and M. Kankanhalli. Point 4d transformer networks for spatio-temporal modeling in point cloud videos. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021.
    [4] H. Fan, X. Yu, Y. Ding, Y. Yang, and M. Kankanhalli. PSTNet: Point spatio-temporal convolution on point cloud sequences. Proceedings of the International Conference on Learning Representations, 2021.
    [5] X. Sheng, Z. Shen, and G. Xiao. Contrastive predictive autoencoders for dynamic point cloud self-supervised learning. AAAI Conference on Artificial Intelligence, 2023, 37(8): 9802−9810
    [6] Z. Shen, X. Sheng, H. Fan, L. Wang, Y. Guo, Q. Liu, H. Wen, and X. Zhou. Masked spatio-temporal structure prediction for self-supervised learning on point cloud videos. Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 16580-16589.
    [7] Y. Sun, H. Cheng, C. Lu, Z. Li, M. Wu, H. Lu, and J. Zhu. HyperPoint: Multimodal 3D foundation model in hyperbolic space. Pattern Recognit., 2026, 173: 112800 doi: 10.1016/j.patcog.2025.112800
    [8] Y. Sun, J. Zhu, H. Cheng, C. Lu, Z. Yang, L. Chen, and Y. Wang. Align then Adapt: Rethinking Parameter-Efficient Transfer Learning in 4D Perception. IEEE Trans. Multimedia, online, 2026.
    [9] B. Lv, Y. Zha, T. Dai, X. Yuerong, K. Chen, and S.-T. Xia. Adapting pre-trained 3d models for point cloud video understanding via cross-frame spatio-temporal perception. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2025, pp. 12413-12422.
    [10] Y. Pang, W. Wang, F. E. Tay, W. Liu, Y. Tian, and L. Yuan. Masked autoencoders for point cloud self-supervised learning. European Conference on Computer Vision, Springer, 2022, pp. 604-621.
    [11] G. Chen, M. Wang, Y. Yang, K. Yu, L. Yuan, and Y. Yue. PointGPT: Auto-regressively generative pre-training from point clouds. Proceedings of the Advances in Neural Information Processing Systems, vol. 36, 2024.
    [12] A. Gu and T. Dao. Mamba: Linear-time sequence modeling with selective state spaces. Conference on Language Modeling, 2024.
    [13] C. R. Qi, H. Su, K. Mo, and L. J. Guibas. PointNet: Deep learning on point sets for 3d classification and segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 652-660.
    [14] S. Xie, J. Gu, D. Guo, C. R. Qi, L. Guibas, and O. Litany. PointContrast: Unsupervised pre-training for 3D point cloud understanding. European Conference on Computer Vision, 2020.
    [15] M. Afham, I. Dissanayake, D. Dissanayake, A. Dharmasiri, K. Thilakarathna, and R. Rodrigo. Crosspoint: Self-supervised cross-modal contrastive learning for 3d point cloud understanding. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2022, pp. 9902-9912.
    [16] R. Zhang, Z. Guo, P. Gao, R. Fang, B. Zhao, D. Wang, Y. Qiao, and H. Li. Point-m2ae: Multi-scale masked autoencoders for hierarchical point cloud pre-training. Proceedings of the Advances in Neural Information Processing Systems, 2022.
    [17] C. Choy, J. Gwak, and S. Savarese. 4d spatio-temporal convnets: Minkowski convolutional neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 3075-3084.
    [18] Z. Deng, X. Li, X. Li, Y. Tong, S. Zhao, and M. Liu. Vg4d: Vision-language model goes 4d video recognition. Proceedings of the IEEE International Conference on Robotics and Automation, 2024, pp. 5014-5020.
    [19] H. Fan, Y. Yang, and M. Kankanhalli. Point spatio-temporal transformer networks for point cloud video modeling. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(2): 2181−2192 doi: 10.1109/TPAMI.2022.3161735
    [20] J. Liu, J. Han, L. Liu, A. I. Aviles-Rivero, C. Jiang, Z. Liu, and H. Wang. Mamba4d: Efficient 4d point cloud video understanding with disentangled spatial-temporal state space models. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2025, pp. 17626-17636.
    [21] Z. Wang, Z. Chen, Y. Wu, Z. Zhao, L. Zhou, and D. Xu. PointMamba: A hybrid transformer-mamba framework for point cloud analysis. arXiv preprint arXiv: 2405.15463, 2024.
    [22] Y. Li, R. Bu, M. Sun, W. Wu, X. Di, and B. Chen. PointCNN: Convolution on x-transformed points. Proceedings of the Advances in Neural Information Processing Systems, vol. 31, 2018.
    [23] X. Han, Y. Tang, Z. Wang, and X. Li. Mamba3d: Enhancing local features for 3d point cloud analysis via state space model. Proceedings of the 32nd ACM International Conference on Multimedia, 2024, pp. 4995-5004.
    [24] X. Yu, L. Tang, Y. Rao, T. Huang, J. Zhou, and J. Lu. PointBERT: Pre-training 3d point cloud transformers with masked point modeling. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2022.
    [25] A. X. Chang, T. Funkhouser, L. Guibas, P. Hanrahan, Q. Huang, Z. Li, S. Savarese, M. Savva, S. Song, H. Su et al. ShapeNet: An information-rich 3D model repository. arXiv preprint arXiv: 1512.03012, 2015.
    [26] W. Li, Z. Zhang, and Z. Liu. Action recognition based on a bag of 3D points. IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2010, pp. 9-14.
    [27] X. Liu, M. Yan, and J. Bohg. MeteorNet: Deep learning on dynamic 3D point cloud sequences. Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019.
    [28] J.-X. Zhong, K. Zhou, Q. Hu, B. Wang, N. Trigoni, and A. Markham. No pain, big gain: Classify dynamic point cloud sequences with static models by fitting feature-level space-time surfaces. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2022, pp. 8510-8520.
    [29] H. Wen, Y. Liu, J. Huang, B. Duan, and L. Yi. Point primitive transformer for long-term 4D point cloud video understanding. European Conference on Computer Vision, Springer, 2022, pp. 19-35.
    [30] Y. Liu, J. Chen, Z. Zhang, J. Huang, and L. Yi. LEAF: Learning frames for 4D point cloud sequence understanding. Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 604-613.
    [31] L. Jing, Y. Xue, X. Yan, C. Zheng, D. Wang, R. Zhang, Z. Wang, H. Fang, B. Zhao, and Z. Li. X4D-SceneFormer: Enhanced scene understanding on 4D point cloud videos through cross-modal knowledge transfer. AAAI Conference on Artificial Intelligence, 2024, 38(3): 2670−2678 doi: 10.1609/aaai.v38i3.28045
    [32] Z. Zhang, Y. Dong, Y. Liu, and L. Yi. Complete-to-partial 4D distillation for self-supervised point cloud sequence representation learning. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2023, pp. 17661-17670.
    [33] Z. Shen, X. Sheng, L. Wang, Y. Guo, Q. Liu, and X. Zhou. PointCMP: Contrastive mask prediction for self-supervised learning on point cloud videos. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2023, pp. 1212-1222.
    [34] X. Sheng, Z. Shen, G. Xiao, L. Wang, Y. Guo, and H. Fan. Point contrastive prediction with semantic clustering for self-supervised learning on point cloud videos. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2023, pp. 16515-16524.
    [35] Y. Liu, Y. Liu, C. Jiang, K. Lyu, W. Wan, H. Shen, B. Liang, Z. Fu, H. Wang, and L. Yi. HOI4D: A 4D egocentric dataset for category-level human-object interaction. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, June 2022, pp. 21013-21022.
    [36] Y. Liu, C. Chen, Z. Wang, and L. Yi. CrossVideo: Self-supervised cross-modal contrastive learning for point cloud video understanding. IEEE International Conference on Robotics and Automation, IEEE, 2024, pp. 12436-12442.
    [37] Q. de Smedt, H. Wannous, J.-P. Vandeborre, J. Guerry, B. Le Saux, and D. Filliat. SHREC'17 Track: 3D hand gesture recognition using a depth and skeletal dataset. 3DOR–10th Eurographics Workshop on 3D Object Retrieval, Apr. 2017, pp. 1-6.
    [38] Y. Min, Y. Zhang, X. Chai, and X. Chen. An efficient PointLSTM for point clouds based gesture recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2020, pp. 5760-5769.
Publication history
  • Received:  2025-11-24
  • Accepted:  2026-02-13
  • Published online:  2026-04-27
