Recognizing Realistic Human Actions Using Accumulative Edge Image
Abstract: This paper investigates how to extract features from unconstrained videos to represent human actions, with the goal of recognizing actions in realistic, complex environments. First, a morphological gradient operation is applied to the unconstrained video to eliminate part of the background and obtain the contour shape of the human body. Second, the edge features of the shape are extracted from every frame of a video segment and accumulated into a single image, called the accumulative edge image (AEI). Then, grid-based histograms of oriented gradients (HOG) are computed on the AEI to form a feature vector that represents the human action in the sequence, and this vector is fed to a support vector machine (SVM) classifier. Experiments on the YouTube action dataset show that the method is more effective than the other methods it was compared with.
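The feature-extraction steps named in the abstract (morphological gradient, edge accumulation, grid-based HOG) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The 3x3 structuring element, the edge threshold, the counting form of accumulation, the 2x2 grid, the 8 unsigned orientation bins, and the per-cell L2 normalization are all assumptions chosen for illustration, and the function names `morph_gradient`, `accumulate_edges`, `grid_hog`, and `make_frame` are hypothetical.

```python
import math

def morph_gradient(img):
    """Morphological gradient (dilation minus erosion), assumed 3x3 square element."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # 3x3 neighbourhood, clipped at the image border.
            nb = [img[j][i]
                  for j in range(max(0, y - 1), min(h, y + 2))
                  for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = max(nb) - min(nb)
    return out

def accumulate_edges(frames, threshold=30):
    """Count, per pixel, in how many frames an edge is present (assumed form of the AEI)."""
    h, w = len(frames[0]), len(frames[0][0])
    aei = [[0] * w for _ in range(h)]
    for frame in frames:
        grad = morph_gradient(frame)
        for y in range(h):
            for x in range(w):
                if grad[y][x] >= threshold:  # assumed edge threshold
                    aei[y][x] += 1
    return aei

def grid_hog(img, grid=(2, 2), bins=8):
    """Concatenate one magnitude-weighted orientation histogram per grid cell."""
    h, w = len(img), len(img[0])
    rows, cols = grid
    hist = [[0.0] * bins for _ in range(rows * cols)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]  # central differences
            gy = img[y + 1][x] - img[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.atan2(gy, gx) % math.pi  # unsigned orientation
            b = min(int(angle / math.pi * bins), bins - 1)
            cell = (y * rows // h) * cols + (x * cols // w)
            hist[cell][b] += mag
    feature = []
    for cell_hist in hist:
        # L2-normalize each cell; empty cells stay all-zero.
        norm = math.sqrt(sum(v * v for v in cell_hist)) or 1.0
        feature.extend(v / norm for v in cell_hist)
    return feature

def make_frame(offset, size=16):
    """Synthetic frame: a bright 4x4 square shifted right by `offset` pixels."""
    frame = [[0] * size for _ in range(size)]
    for y in range(6, 10):
        for x in range(2 + offset, 6 + offset):
            frame[y][x] = 255
    return frame

frames = [make_frame(t) for t in range(4)]  # square moving to the right
aei = accumulate_edges(frames)
feature = grid_hog(aei)
print(len(feature))
```

In a full system the resulting vector would be passed to a classifier such as an SVM, as the abstract describes; that step is omitted here. Real footage would also call for an image library rather than nested lists, which are used only to keep the sketch dependency-free.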