Automatic Semantic Image Annotation with Granular Analysis Method
-
Abstract: Bridging the semantic gap between low-level visual features and high-level semantic concepts, so as to improve the accuracy of automatic image annotation and meet users' needs for fast image retrieval, has long been a key problem in research on automatic semantic image annotation. Granular analysis is an important hierarchical method of data analysis that offers a new approach to solving complex problems. The accuracy of image annotation, and with it the efficiency and accuracy of retrieval, varies with the granularity at which images are understood and analyzed. This paper surveys and analyzes current models for automatic semantic image annotation, describes the ideas and models of granular analysis and their application in the annotation process, explores annotation methods based on granular analysis, and outlines directions for further research.