Image Annotation with Bayesian Universal Background Model
Abstract: In Gaussian map feature extraction, the universal background model (UBM) is commonly used to estimate, from the overall feature distribution, the parameters of the Gaussian mixture model (GMM) that describes the feature points of each image. However, many of the GMM weight coefficients estimated by UBM are close to zero; the corresponding Gaussian components contribute little to the estimated distribution yet still take part in the computation, so UBM has a high time complexity. To address this problem, we propose a method called Bayes UBM. A constrained symmetric Dirichlet distribution is introduced as the prior over the GMM weight coefficients, and the GMM parameter set is estimated by Bayesian maximum a posteriori (MAP) estimation. Experiments show that Bayes UBM not only reduces the time complexity effectively but also improves image annotation precision on the Corel dataset.
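The effect of a symmetric Dirichlet prior on mixture weights can be illustrated with a minimal sketch. It uses the textbook MAP estimate for mixture weights, where each weight is proportional to N_k + alpha - 1, and clips negative numerators to zero. The function name `map_mixture_weights`, the clipping rule, the value alpha = 0.5, and the example counts are all assumptions made for illustration; the abstract does not specify the constraint placed on the Dirichlet prior, so this is not necessarily the update used in Bayes UBM.

```python
import numpy as np

def map_mixture_weights(soft_counts, alpha):
    """MAP mixture weights under a symmetric Dirichlet(alpha) prior.

    soft_counts[k] is the summed posterior responsibility of component k.
    Negative numerators are clipped to zero, which prunes the component
    (an illustrative heuristic, not taken from the paper).
    """
    soft_counts = np.asarray(soft_counts, dtype=float)
    numer = np.maximum(soft_counts + alpha - 1.0, 0.0)
    total = numer.sum()
    if total <= 0.0:
        raise ValueError("all components were pruned; use a larger alpha")
    return numer / total

# Hypothetical soft counts for a 5-component GMM on one image.
counts = np.array([50.0, 30.0, 19.6, 0.3, 0.1])

# alpha < 1 favors sparse weights: weakly supported components drop to zero.
weights = map_mixture_weights(counts, alpha=0.5)
active = np.flatnonzero(weights)  # indices of components that survive

# alpha = 1 is a flat prior and recovers the maximum-likelihood weights.
ml_weights = map_mixture_weights(counts, alpha=1.0)
```

Under these assumptions, components with zero weight can be skipped in every later likelihood evaluation, which is the kind of saving the abstract attributes to replacing near-zero weights.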