Recognizing Action Using Multi-center Subspace Learning-based Spatial-temporal Information Fusion

Yang Tian-Jin, Hou Zhen-Jie, Li Xing, Liang Jiu-Zhen, Huan Juan, Zheng Ji-Xiang

Citation: Yang Tian-Jin, Hou Zhen-Jie, Li Xing, Liang Jiu-Zhen, Huan Juan, Zheng Ji-Xiang. Recognizing action using multi-center subspace learning-based spatial-temporal information fusion. Acta Automatica Sinica, 2022, 48(11): 2823−2835 doi: 10.16383/j.aas.c190327


doi: 10.16383/j.aas.c190327 cstr: 32138.14.j.aas.c190327
Author e-mails:

    Yang Tian-Jin: yangtianjin128@163.com
    Hou Zhen-Jie (corresponding author): houzj@cczu.edu.cn
    Li Xing: lixing03201012@163.com
    Liang Jiu-Zhen: jzliang@cczu.edu.cn
    Huan Juan: huanjuan@cczu.edu.cn
    Zheng Ji-Xiang: zjx991031@163.com


Funds: Supported by National Natural Science Foundation of China (61803050, 61063021), Jiangsu Province Networking and Mobile Internet Technology Engineering Key Laboratory Open Research Fund Project (JSWLW-2017-013), and Zhejiang Public Welfare Technology Research Social Development Project (2017C33223)
    Author Bio:

    YANG Tian-Jin: Master's student at the School of Information Science and Engineering, Changzhou University. His research interests include behavior recognition and machine learning.

    HOU Zhen-Jie: Professor at the School of Information Science and Engineering, Changzhou University. He received his Ph.D. degree in mechanical engineering from Inner Mongolia Agricultural University in 2015. His research interests include behavior recognition and machine learning. Corresponding author of this paper.

    LI Xing: Master's student at the School of Information Science and Engineering, Changzhou University. His research interests include behavior recognition and machine learning.

    LIANG Jiu-Zhen: Professor at the School of Information Science and Engineering, Changzhou University. He received his Ph.D. degree in computer software and theory from Beijing University of Aeronautics and Astronautics in 2001. His research interests include behavior recognition and machine learning.

    HUAN Juan: Associate professor at the School of Information Science and Engineering, Changzhou University. She received her Ph.D. degree in agricultural electrification and automation from Jiangsu University in 2019. Her main research interest is intelligent information processing.

    ZHENG Ji-Xiang: Received his bachelor's degree from the School of Information Science and Engineering, Changzhou University in 2020. His research interests include behavior recognition and machine learning.

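  • Abstract: Human action recognition from depth sequences generally relies on extracted feature maps to improve recognition accuracy, but such feature maps usually lose temporal information. To address this problem, this paper proposes a new representation of depth map sequences, the depth space time map (DSTM). DSTM reduces the redundancy of the feature maps and compensates for the missing temporal information. The paper fuses depth motion maps (DMM), in which spatial information dominates, with DSTM, in which temporal information dominates, to achieve high-accuracy human action recognition, and proposes a multimodal data fusion algorithm called multi-center subspace learning (MCSL). The algorithm constructs multiple projection centers for each class of data, which increases the inter-class distance of the samples and reduces the dimensionality of the projection target region. Human action recognition experiments are carried out on the MSR-Action3D and UTD-MHAD datasets. The experimental results show that the proposed method achieves a higher recognition rate than existing human action recognition methods.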
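
For orientation, the depth motion map (DMM) named in the abstract is commonly computed in the DMM literature by accumulating absolute differences between consecutive projected depth frames. The sketch below illustrates only the front-view case in Python with NumPy. The function name `front_view_dmm`, the toy input, and the omission of side and top projections and of any motion threshold are assumptions made for illustration; this is not the authors' implementation, and the abstract does not give enough detail to sketch DSTM or MCSL.

```python
import numpy as np


def front_view_dmm(depth_frames: np.ndarray) -> np.ndarray:
    """Accumulate absolute differences between consecutive depth frames.

    depth_frames: array of shape (T, H, W), one depth image per time step.
    Returns an (H, W) map; larger values mark pixels with more motion.
    (Hypothetical helper: front view only, no thresholding.)
    """
    frames = np.asarray(depth_frames, dtype=np.float64)
    if frames.ndim != 3 or frames.shape[0] < 2:
        raise ValueError("expected at least two frames with shape (T, H, W)")
    # |frame[t+1] - frame[t]| summed over all consecutive pairs
    return np.abs(np.diff(frames, axis=0)).sum(axis=0)


# Toy sequence: one "object" pixel moving one step to the right per frame.
toy = np.zeros((3, 2, 4))
toy[0, 0, 0] = 1.0
toy[1, 0, 1] = 1.0
toy[2, 0, 2] = 1.0
dmm = front_view_dmm(toy)
```

Because the differences are summed over the whole sequence, the resulting map records where motion occurred but not in which order, which is the loss of temporal information that the abstract attributes to such feature maps.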
  • Fig. 1  DSTM flowchart

    Fig. 2  Single-center subspace learning

    Fig. 3  Multi-center subspace learning

    Fig. 4  Forward and reversed high-throw actions

    Fig. 5  Parameter selection

    Fig. 6  Recognition results of DSTM with different classifiers

    Fig. 7  Confusion matrix

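    Table 1  Human actions in MSR

    | Action | Samples | Action | Samples |
    |---|---|---|---|
    | High wave (A01) | 27 | Two-hand wave (A11) | 30 |
    | Horizontal wave (A02) | 26 | Side boxing (A12) | 30 |
    | Hammer (A03) | 27 | Bend (A13) | 27 |
    | Hand catch (A04) | 25 | Forward kick (A14) | 29 |
    | Punch (A05) | 26 | Side kick (A15) | 20 |
    | High throw (A06) | 26 | Jog (A16) | 30 |
    | Draw X (A07) | 27 | Tennis swing (A17) | 30 |
    | Draw tick (A08) | 30 | Tennis serve (A18) | 30 |
    | Draw circle (A09) | 30 | Golf swing (A19) | 30 |
    | Clap hands (A10) | 30 | Pick up and throw (A20) | 27 |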

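    Table 2  Human actions in UTD

    | Action | Samples | Action | Samples |
    |---|---|---|---|
    | Swipe left (B01) | 32 | Tennis swing (B15) | 32 |
    | Swipe right (B02) | 32 | Arm curl (B16) | 32 |
    | Wave (B03) | 32 | Tennis serve (B17) | 32 |
    | Clap (B04) | 32 | Push (B18) | 32 |
    | Throw (B05) | 32 | Knock (B19) | 32 |
    | Cross arms (B06) | 32 | Catch (B20) | 32 |
    | Bounce basketball (B07) | 32 | Pick up and throw (B21) | 32 |
    | Draw X (B08) | 31 | Jog (B22) | 31 |
    | Draw circle (B09) | 32 | Walk (B23) | 32 |
    | Draw circle continuously (B10) | 32 | Sit down (B24) | 32 |
    | Draw triangle (B11) | 32 | Stand up (B25) | 32 |
    | Bowling (B12) | 32 | Lunge (B26) | 32 |
    | Punch (B13) | 32 | Squat (B27) | 32 |
    | Badminton swing (B14) | 32 | | |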

    Table 3  MSR-Action3D data grouping

    | AS1 | AS2 | AS3 |
    |---|---|---|
    | A02 | A01 | A06 |
    | A03 | A04 | A14 |
    | A05 | A07 | A15 |
    | A06 | A08 | A16 |
    | A10 | A09 | A17 |
    | A13 | A11 | A18 |
    | A18 | A14 | A19 |
    | A20 | A12 | A20 |

    Table 4  Recognition rates of different features on MSR (%)

    | Method | Test 1 AS1 | Test 1 AS2 | Test 1 AS3 | Test 1 mean | Test 2 AS1 | Test 2 AS2 | Test 2 AS3 | Test 2 mean | Test 3 AS1 | Test 3 AS2 | Test 3 AS3 | Test 3 mean |
    |---|---|---|---|---|---|---|---|---|---|---|---|---|
    | MEI-HOG | 69.79 | 77.63 | 79.72 | 75.71 | 84.00 | 89.58 | 93.24 | 88.94 | 86.95 | 86.95 | 95.45 | 89.78 |
    | MEI-LBP | 57.05 | 56.58 | 64.19 | 59.27 | 66.66 | 69.79 | 78.37 | 71.61 | 69.56 | 73.91 | 77.27 | 73.58 |
    | DSTM-HOG | 83.22 | 71.71 | 87.83 | 80.92 | 94.66 | 84.37 | 88.23 | 89.80 | 91.30 | 82.61 | 95.95 | 89.95 |
    | DSTM-LBP | 84.56 | 71.71 | 87.83 | 81.37 | 88.00 | 82.29 | 95.94 | 88.74 | 86.96 | 82.61 | 95.45 | 88.34 |
    | MHI-HOG | 69.79 | 72.36 | 70.95 | 71.03 | 88.00 | 84.37 | 89.19 | 87.19 | 95.65 | 82.60 | 95.45 | 91.23 |
    | MHI-LBP | 51.67 | 60.52 | 54.05 | 55.41 | 73.33 | 70.83 | 78.37 | 74.18 | 82.60 | 65.21 | 72.72 | 73.51 |
    | DMM-HOG | 88.00 | 87.78 | 87.16 | 87.65 | 94.66 | 87.78 | 100.00 | 94.15 | 100.00 | 88.23 | 95.45 | 94.56 |
    | DMM-LBP | 89.52 | 87.78 | 93.20 | 90.17 | 93.11 | 85.19 | 100.00 | 92.77 | 94.03 | 88.98 | 92.38 | 91.80 |

    Table 5  Recognition rates of different features on UTD (%)

    | Method | Test 1 | Test 2 | Test 3 |
    |---|---|---|---|
    | MEI-HOG | 69.51 | 65.42 | 68.20 |
    | MEI-LBP | 45.12 | 51.97 | 52.61 |
    | DSTM-HOG | 71.08 | 80.28 | 89.54 |
    | DSTM-LBP | 68.81 | 80.97 | 86.06 |
    | MHI-HOG | 56.44 | 66.58 | 73.14 |
    | MHI-LBP | 49.82 | 53.82 | 57.40 |
    | DMM-HOG | 78.39 | 75.40 | 87.94 |
    | DMM-LBP | 68.98 | 74.94 | 86.75 |

    Table 6  Comparative experimental results of DMM and DSTM (%)

    | Method | D1 | D2 |
    |---|---|---|
    | DSTM | 62.83 | 81.53 |
    | DMM | 32.17 | 63.93 |

    Table 7  Average processing time of DMM and DSTM (s)

    | Method | D1 | D2 |
    |---|---|---|
    | DSTM | 2.1059 | 3.4376 |
    | DMM | 5.6014 | 8.6583 |

    Table 8  Experimental results on MSR-Action3D$^{1}$

    | Method | Recognition rate (%) |
    |---|---|
    | Ref. [12] | 86.50 |
    | Ref. [34] | 91.45 |
    | Ref. [35] | 90.01 |
    | Ref. [36] | 89.40 |
    | Ref. [37] | 77.47 |
    | Ref. [38] | 81.7 |
    | Ref. [39] | 90.01 |
    | Ref. [40] | 89.48 |
    | Proposed method | 90.32 |

    Note: MSR-Action3D$^{1}$ uses setting 2, test 2.

    Table 9  Experimental results on MSR-Action3D$^{2}$

    | Method | Recognition rate (%) |
    |---|---|
    | MHI-LBP | 68.75 |
    | MEI-LBP | 71.43 |
    | DCA [22] | 94.64 |
    | DSTM-LBP | 87.50 |
    | DSTM-HOG | 89.28 |
    | MCSL+DMM | 89.28 |
    | MCSL+DSTM | 91.96 |
    | CCA [21] | 83.05 |
    | Subspace learning | 92.85 |
    | Proposed method | 98.21 |

    Note: MSR-Action3D$^{2}$ uses setting 2, test 4; MCSL denotes multi-center subspace learning.

    Table 10  Experimental results on UTD-MHAD (setting 2, test 4)

    | Method | Recognition rate (%) |
    |---|---|
    | MHI-LBP | 62.40 |
    | MEI-LBP | 57.80 |
    | DCA [22] | 92.48 |
    | DSTM-LBP | 89.59 |
    | DSTM-HOG | 91.90 |
    | MCSL+DMM | 93.64 |
    | MCSL+DSTM | 95.37 |
    | CCA [21] | 87.28 |
    | Subspace learning | 93.64 |
    | Proposed method | 98.84 |
  • [1] Yousefi S, Narui H, Dayal S, Ermon S, Valaee S. A survey on behavior recognition using WiFi channel state information. IEEE Communications Magazine, 2017, 55(10): 98−104 doi: 10.1109/MCOM.2017.1700082
    [2] Ben Mabrouk A, Zagrouba E. Abnormal behavior recognition for intelligent video surveillance systems: A review. Expert Systems with Applications, 2018, 91: 480−491 doi: 10.1016/j.eswa.2017.09.029
    [3] Fang C C, Mou T C, Sun S W, Chang P C. Machine-learning based fitness behavior recognition from camera and sensor modalities. In: Proceedings of the 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR). Taichung, China: IEEE, 2018. 249−250
    [4] Chen C, Liu K, Jafari R, Kehtarnavaz N. Home-based senior fitness test measurement system using collaborative inertial and depth sensors. In: Proceedings of the 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society. Chicago, USA: IEEE, 2014. 4135−4138
    [5] Laver K E, Lange B, George S, Deutsch J E, Saposnik G, Crotty M. Virtual reality for stroke rehabilitation. Cochrane Database of Systematic Reviews, 2017, 11(11): Article No. CD008349
    [6] Sun J, Wu X, Yan S C, Cheong L F, Chua T S, Li J T. Hierarchical spatio-temporal context modeling for action recognition. In: Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, USA: IEEE, 2009. 2004−2011
    [7] Hu Jian-Fang, Wang Xiong-Hui, Zheng Wei-Shi, Lai Jian-Huang. RGB-D action recognition: Recent advances and future perspectives. Acta Automatica Sinica, 2019, 45(5): 829−840 (in Chinese) doi: 10.16383/j.aas.c180436
    [8] Bobick A F, Davis J W. The recognition of human movement using temporal templates. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001, 23(3): 257−267 doi: 10.1109/34.910878
    [9] Su Ben-Yue, Jiang Jing, Tang Qing-Feng, Sheng Min. Human dynamic action recognition based on functional data analysis. Acta Automatica Sinica, 2017, 43(5): 866−876 (in Chinese) doi: 10.16383/j.aas.2017.c160120
    [10] Anderson D, Luke R H, Keller J M, Skubic M, Rantz M J, Aud M A. Modeling human activity from voxel person using fuzzy logic. IEEE Transactions on Fuzzy Systems, 2009, 17(1): 39−49 doi: 10.1109/TFUZZ.2008.2004498
    [11] Wu Y X, Jia Z, Ming Y, Sun J J, Cao L J. Human behavior recognition based on 3D features and hidden Markov models. Signal, Image and Video Processing, 2016, 10(3): 495−502 doi: 10.1007/s11760-015-0756-6
    [12] Wang J, Liu Z C, Chorowski J, Chen Z Y, Wu Y. Robust 3D action recognition with random occupancy patterns. In: Proceedings of the 12th European Conference on Computer Vision. Florence, Italy: Springer, 2012. 872−885
    [13] Zhang H L, Zhong P, He J L, Xia C X. Combining depth-skeleton feature with sparse coding for action recognition. Neurocomputing, 2017, 230: 417−426 doi: 10.1016/j.neucom.2016.12.041
    [14] Zhang S C, Chen E Q, Qi C, Liang C W. Action recognition based on sub-action motion history image and static history image. MATEC Web of Conferences, 2016, 56: Article No. 02006
    [15] Liu Z, Zhang C Y, Tian Y L. 3D-based Deep Convolutional Neural Network for action recognition with depth sequences. Image and Vision Computing, 2016, 55: 93−100 doi: 10.1016/j.imavis.2016.04.004
    [16] Xu Y, Hou Z J, Liang J Z, Chen C, Jia L, Song Y. Action recognition using weighted fusion of depth images and skeleton's key frames. Multimedia Tools and Applications, 2019, 78(17): 25063−25078 doi: 10.1007/s11042-019-7593-5
    [17] Wang P C, Li W Q, Li C K, Hou Y H. Action recognition based on joint trajectory maps with convolutional neural networks. Knowledge-Based Systems, 2018, 158: 43−53 doi: 10.1016/j.knosys.2018.05.029
    [18] Kamel A, Sheng B, Yang P, Li P, Shen R M, Feng D D. Deep convolutional neural networks for human action recognition using depth maps and postures. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2019, 49(9): 1806−1819 doi: 10.1109/TSMC.2018.2850149
    [19] Li C K, Hou Y H, Wang P C, Li W Q. Joint distance maps based action recognition with convolutional neural networks. IEEE Signal Processing Letters, 2017, 24(5): 624−628 doi: 10.1109/LSP.2017.2678539
    [20] Yang X D, Zhang C Y, Tian Y L. Recognizing actions using depth motion maps-based histograms of oriented gradients. In: Proceedings of the 20th ACM International Conference on Multimedia. Nara, Japan: ACM, 2012. 1057−1060
    [21] Li A N, Shan S G, Chen X L, Gao W. Face recognition based on non-corresponding region matching. In: Proceedings of the 2011 International Conference on Computer Vision. Barcelona, Spain: IEEE, 2011. 1060−1067
    [22] Haghighat M, Abdel-Mottaleb M, Alhalabi W. Discriminant correlation analysis: Real-time feature level fusion for multimodal biometric recognition. IEEE Transactions on Information Forensics and Security, 2016, 11(9): 1984−1996 doi: 10.1109/TIFS.2016.2569061
    [23] Rosipal R, Krämer N. Overview and recent advances in partial least squares. In: Proceedings of the 2006 International Statistical and Optimization Perspectives Workshop “Subspace, Latent Structure and Feature Selection”. Bohinj, Slovenia: Springer, 2006. 34−51
    [24] Liu H P, Sun F C. Material identification using tactile perception: A semantics-regularized dictionary learning method. IEEE/ASME Transactions on Mechatronics, 2018, 23(3): 1050−1058 doi: 10.1109/TMECH.2017.2775208
    [25] Zhuang Y T, Yang Y, Wu F. Mining semantic correlation of heterogeneous multimedia data for cross-media retrieval. IEEE Transactions on Multimedia, 2008, 10(2): 221−229 doi: 10.1109/TMM.2007.911822
    [26] Sharma A, Kumar A, Daume H, Jacobs D W. Generalized multiview analysis: A discriminative latent space. In: Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, USA: IEEE, 2012. 2160−2167
    [27] Wang K Y, He R, Wang L, Wang W, Tan T N. Joint feature selection and subspace learning for cross-modal retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(10): 2010−2023 doi: 10.1109/TPAMI.2015.2505311
    [28] Nie F, Huang H, Cai X, Ding C. Efficient and robust feature selection via joint $\ell_{2,1}$-norms minimization. In: Proceedings of the 23rd International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates Inc., 2010. 1813−1821
    [29] He R, Tan T N, Wang L, Zheng W S. $\ell_{2,1}$ regularized correntropy for robust feature selection. In: Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, USA: IEEE, 2012. 2504−2511
    [30] Zhu Hong-Lei, Zhu Chang-Sheng, Xu Zhi-Gang. Research advances on human activity recognition datasets. Acta Automatica Sinica, 2018, 44(6): 978−1004 (in Chinese) doi: 10.16383/j.aas.2018.c170043
    [31] Shotton J, Fitzgibbon A, Cook M, Sharp T, Finocchio M, Moore R, et al. Real-time human pose recognition in parts from single depth images. In: Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Colorado Springs, USA: IEEE, 2011. 1297−1304
    [32] Chen C, Jafari R, Kehtarnavaz N. UTD-MHAD: A multimodal dataset for human action recognition utilizing a depth camera and a wearable inertial sensor. In: Proceedings of the 2015 IEEE International Conference on Image Processing (ICIP). Quebec City, Canada: IEEE, 2015. 168−172
    [33] Chen C, Jafari R, Kehtarnavaz N. Action recognition from depth sequences using depth motion maps-based local binary patterns. In: Proceedings of the 2015 IEEE Winter Conference on Applications of Computer Vision. Waikoloa, USA: IEEE, 2015. 1092−1099
    [34] Koniusz P, Cherian A, Porikli F. Tensor representations via kernel linearization for action recognition from 3D skeletons. In: Proceedings of the 14th European Conference on Computer Vision. Amsterdam, The Netherlands: Springer, 2016. 37−53
    [35] Ben Tanfous A, Drira H, Ben Amor B. Coding Kendall's shape trajectories for 3D action recognition. In: Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 2840−2849
    [36] Vemulapalli R, Chellappa R. Rolling rotations for recognizing human actions from 3D skeletal data. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016. 4471−4479
    [37] Wang L, Huynh D Q, Koniusz P. A comparative review of recent kinect-based action recognition algorithms. IEEE Transactions on Image Processing, 2019, 29: 15−28
    [38] Rahmani H, Mian A. 3D action recognition from novel viewpoints. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, USA: IEEE, 2016. 1506−1515
    [39] Ben Tanfous A, Drira H, Ben Amor B. Sparse coding of shape trajectories for facial expression and action recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(10): 2594−2607 doi: 10.1109/TPAMI.2019.2932979
    [40] Ben Amor B, Su J Y, Srivastava A. Action recognition using rate-invariant analysis of skeletal shape trajectories. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(1): 1−13
Publication history
  • Received: 2019-04-30
  • Accepted: 2019-11-15
  • Published online: 2022-09-14
  • Issue date: 2022-11-22
