Weighted Disparity Energy Model
-
Abstract: Because the two eyes are horizontally separated, the same object in space projects to different positions on the left and right retinas. This positional difference is called binocular disparity. Information from the two retinas is first combined in the primary visual cortex (V1), which contains many disparity-sensitive neurons. The disparity tuning of these neurons is widely described by the disparity energy model. However, that model cannot explain the neurophysiological finding that V1 neurons respond more weakly to anti-correlated random dot stereograms (aRDS) than to correlated random dot stereograms. To address this, this paper proposes a weighted disparity energy model. First, the response energy of each neuron is modulated by the signal difference between its left and right receptive fields. Then the population response is computed from the individual responses together with the interactions between neurons, and the image disparity is read out from it. The paper aims to develop a disparity computation method grounded in neurophysiology. Its main contributions are: 1) the weighted disparity energy model adequately accounts for the weaker responses of V1 neurons to aRDS than to random dot stereograms; 2) the disparities it computes are more accurate than those of existing neurophysiologically based models, and even more accurate than those of some classical computer vision methods.
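The two steps described in the abstract can be illustrated with a minimal one-dimensional sketch. It is not the paper's implementation: the abstract does not state the weighting function or the form of the neuronal interactions, so the Gaussian weight on the mean squared left/right patch difference, the plain summation over positions, the names `gabor_pair`, `cell_response`, `population_tuning` and `weight_sigma`, and every parameter value below are assumptions made for illustration. The underlying cell is a standard position-shift disparity energy unit built from a quadrature pair of Gabor receptive fields.

```python
import math
import random

def gabor_pair(half_width, sigma, freq):
    """Even (cosine) and odd (sine) Gabor profiles forming a quadrature pair."""
    ks = range(-half_width, half_width + 1)
    env = [math.exp(-k * k / (2.0 * sigma ** 2)) for k in ks]
    even = [e * math.cos(2.0 * math.pi * freq * k) for e, k in zip(env, ks)]
    odd = [e * math.sin(2.0 * math.pi * freq * k) for e, k in zip(env, ks)]
    return even, odd

def patch(signal, center, half_width):
    """Samples of `signal` inside a receptive field centred at `center`."""
    return signal[center - half_width:center + half_width + 1]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cell_response(left, right, center, d, even, odd, weight_sigma=None):
    """Energy of one position-shift complex cell tuned to disparity d.

    With weight_sigma set, the energy is scaled by a HYPOTHETICAL weight
    exp(-msd / (2 * weight_sigma**2)), where msd is the mean squared
    difference between the left and right receptive-field patches.
    """
    h = len(even) // 2
    pl = patch(left, center, h)
    pr = patch(right, center + d, h)      # right RF is shifted by d
    s_even = dot(even, pl) + dot(even, pr)  # binocular simple cells
    s_odd = dot(odd, pl) + dot(odd, pr)
    energy = s_even ** 2 + s_odd ** 2       # complex-cell energy
    if weight_sigma is None:
        return energy
    msd = sum((a - b) ** 2 for a, b in zip(pl, pr)) / len(pl)
    return math.exp(-msd / (2.0 * weight_sigma ** 2)) * energy

def population_tuning(left, right, centers, disparities, even, odd,
                      weight_sigma=None):
    """Sum cell responses over positions for each candidate disparity.
    Plain summation stands in for the unspecified neuronal interactions."""
    return [sum(cell_response(left, right, c, d, even, odd, weight_sigma)
                for c in centers)
            for d in disparities]

# Synthetic 1-D random-dot stimuli (all values are illustrative choices).
random.seed(0)
N, TRUE_D = 200, 3
left = [random.choice((-1.0, 1.0)) for _ in range(N)]
right_corr = [left[(x - TRUE_D) % N] for x in range(N)]  # shifted copy
right_anti = [-v for v in right_corr]                    # contrast-inverted

even, odd = gabor_pair(half_width=10, sigma=4.0, freq=0.1)
centers = range(30, 170, 2)
disparities = list(range(-6, 7))

plain_anti = population_tuning(left, right_anti, centers, disparities, even, odd)
w_corr = population_tuning(left, right_corr, centers, disparities, even, odd,
                           weight_sigma=0.7)
w_anti = population_tuning(left, right_anti, centers, disparities, even, odd,
                           weight_sigma=0.7)

# Read out disparity as the peak of the weighted population response.
estimated = disparities[w_corr.index(max(w_corr))]
amp_corr = max(w_corr) - min(w_corr)
amp_anti = max(w_anti) - min(w_anti)
print(estimated, amp_corr, amp_anti)
```

In this sketch the weight equals 1 when the two patches match and falls off as they differ, so anti-correlated patches, which never match, are attenuated at every candidate disparity. Comparing `amp_anti` with `amp_corr` shows how such a modulation could reduce the tuning amplitude for aRDS relative to correlated stimuli, whereas the unweighted energy model predicts an inverted tuning curve of comparable amplitude.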