Action Recognition System with Analog Model of Neurons in Primate Visual Cortex
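Abstract (Chinese version, translated): The brain regions devoted to processing motion information are the primary visual cortex (V1) and the middle temporal area (MT). There are differing hypotheses about which of these areas computes pattern motion. To date, most research on action recognition has centered on the MT stage. This paper investigates whether the information obtained at the V1 stage is sufficient for action recognition, and proposes an action recognition system that models spiking neurons in the primary visual cortex (V1). The system first uses 3D Gabor filters and combinations of them to model the receptive fields of simple and complex cells in the primary visual cortex, respectively, and applies them to the video frames to obtain motion energy that is sensitive to the speed and direction of motion; surround inhibition at the V1 stage is used to enhance the motion energy and reduce the influence of noise. Next, an integrate-and-fire spiking neuron model is used to model neurons in the primary visual cortex, converting the motion information into spike trains of neuronal responses. Finally, motion feature vectors are extracted from the mean firing rates of the spike trains, and a support vector machine (SVM) is used as the classifier. Tests on the Weizmann dataset show that the information obtained at the V1 stage can be used for action recognition.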
Keywords (Chinese version, translated):
- Action recognition /
- 3D Gabor filter /
- Surround inhibition /
- Spiking neuron model
Abstract: Several theories speculate on how and where pattern motion is computed among the areas dedicated to motion processing, namely the primary visual cortex (V1) and the middle temporal area (MT). So far, most research on action recognition has remained rooted in MT. This paper proposes a method of human action recognition that models V1 neurons, in order to examine whether the information obtained in V1 can support action recognition. The method first simulates the classical receptive fields (CRFs) of simple and complex cells in the primary visual cortex with 3D Gabor filters and their combinations, and uses them to process the video sequence to obtain motion energy that is sensitive to the speed and direction of motion. It also enhances the motion energy and reduces the influence of noise through surround inhibition in V1. Second, a conductance-driven integrate-and-fire neuron model is used to simulate primary visual cortex neurons, converting the motion information into spike trains. Finally, the mean firing rates of the spike trains form a feature vector that captures the characteristics of the human action in the video sequence. Using a support vector machine (SVM) as the classifier, the method is tested on the Weizmann action dataset. The results show that the information obtained at the V1 stage can support action recognition.
Key words:
- Action recognition /
- 3D Gabor filter /
- Surround inhibition /
- Spiking neuron model
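The spike-encoding step described in the abstract (motion energy driving an integrate-and-fire neuron, with the mean firing rate of each spike train used as one feature) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the normalized membrane potential, and all parameter values (`g_leak`, `e_exc`, `v_thresh`, `dt`) are assumptions chosen for illustration, and the motion energy is replaced by a constant excitatory conductance.

```python
import numpy as np

def lif_spikes(g_exc, dt=1e-3, g_leak=50.0, e_exc=4.0, v_thresh=1.0, v_reset=0.0):
    """Simulate one conductance-driven integrate-and-fire neuron.

    g_exc: 1-D array of excitatory conductance per time step (1/s),
           standing in for the motion energy at one position and tuning.
    All parameter values are hypothetical, in normalized units
    (resting potential 0, threshold 1).
    Returns a boolean array that is True where the neuron fired.
    """
    v = v_reset
    spikes = np.zeros(len(g_exc), dtype=bool)
    for t, g in enumerate(g_exc):
        # Forward-Euler step of dv/dt = -g_leak * v - g * (v - e_exc)
        v += dt * (-g_leak * v - g * (v - e_exc))
        if v >= v_thresh:
            spikes[t] = True   # emit a spike
            v = v_reset        # and reset the membrane potential
    return spikes

def mean_rate(spikes, dt=1e-3):
    """Mean firing rate in spikes per second over the whole train."""
    return spikes.sum() / (len(spikes) * dt)

def rate_features(energies, dt=1e-3):
    """Map an (n_neurons, n_steps) drive array to one mean rate per neuron."""
    return np.array([mean_rate(lif_spikes(e, dt=dt), dt=dt) for e in energies])

if __name__ == "__main__":
    steps = 1000  # one second at dt = 1 ms
    # Three hypothetical neurons with weak, medium and strong constant drive.
    energies = np.stack([np.full(steps, g) for g in (10.0, 30.0, 100.0)])
    print(rate_features(energies))
```

In this sketch a weak drive never reaches threshold and contributes a zero rate, while stronger drives give higher rates, so the resulting vector orders neurons by how strongly their preferred motion is present. A vector of this kind is what would then be passed to a classifier such as an SVM.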