Online Learning a Binary Classifier for Improving Google Image Search Results

Wan Yuchai; Liu Xiabi; Han Feifei; Tong Kunqi; Liu Yu

doi:10.1004/SP.J.1004.2014.01699

1.
School of Computer Science, Beijing Institute of Technology, Beijing 100081

Publication history
- Received: 2012-09-24
- Revised: 2013-12-11
- Published: 2014-08-20

Online Learning a Binary Classifier for Improving Google Image Search Results

1.
Beijing Laboratory of Intelligent Information Technology, School of Computer Science, Beijing Institute of Technology, Beijing 100081, China

Funds:

Supported by the National Natural Science Foundation of China (60973059, 81171407) and the Program for New Century Excellent Talents in University of China (NCET-10-0044)

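Abstract

Abstract (from the Chinese version): For keyword-based image retrieval, learning a binary classifier from the visual similarity of the retrieved results is expected to become one of the most effective ways to improve those results. To improve search engine results, this paper proposes an algorithm framework and, within that framework, focuses on the key problem of training data selection. The training data selection process consists of two stages: 1) initialization of the training data to start classifier learning; 2) dynamic data selection during the iterations of classifier learning. For initial training data selection, we examine a clustering-based method and a ranking-based method, and compare automatic training data selection with manual annotation. For dynamic data selection, we compare the classification performance of support vector machines and a Bayesian classifier based on max-min posterior pseudo-probabilities. Combining the different methods for the two stages yields eight algorithms, which we apply to keyword-based image retrieval with the Google search engine. The experimental results confirm that how to select training data from noisy search results is the key problem in improving search results. The experiments show that our method effectively improves Google search results, especially the top-ranked ones. Presenting more relevant results earlier reduces the effort users spend paging through results one by one. How to make automatic training data selection comparable to manual annotation remains a problem for further study.
- Image search engine /
- content-based image retrieval /
- search results improvement /
- image classifier learning /
- training data selection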
Abstract: A promising way to improve web image search results is to exploit the results' visual content to learn a binary classifier, which is then used to refine each result's degree of relevance to the given query. This paper proposes an algorithm framework as a solution to this problem and investigates the key issue of training data selection under the framework. The training data selection process is divided into two stages: initial selection, which triggers classifier learning, and dynamic selection during the iterations of classifier learning. We investigate two main approaches to initial training data selection, clustering-based and ranking-based, and compare automatic training data selection schemes with manual selection. Furthermore, support vector machines and the max-min posterior pseudo-probabilities (MMP) based Bayesian classifier are each employed for image classification. By varying these factors in the framework, we implement eight algorithms and test them on keyword-based image search results from the Google search engine. The experimental results confirm that selecting training data from noisy search results is a key issue in the problem considered here, and show that the proposed algorithm effectively improves Google search results, especially at top ranks, which helps reduce the user's effort in finding desired images by browsing deep into the ranking. Even so, making automatic training data selection approach the quality of perfect human annotation remains worth further study.
- Image search engine /
- content-based image retrieval (CBIR) /
- search results improvement /
- image classifier learning /
- training data selection
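
The two-stage training data selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes ranking-based initialization (top-ranked results as positives, bottom-ranked results as negatives), and a diagonal-Gaussian Bayes classifier stands in for the SVM and MMP classifiers used in the paper. All function names (`fit_gaussian`, `log_likelihood`, `rerank`, `make_toy_results`, `precision_at`), the parameter values, and the toy data are hypothetical.

```python
import numpy as np


def fit_gaussian(X):
    """Fit a diagonal Gaussian to the rows of X; returns (mean, variance)."""
    return X.mean(axis=0), X.var(axis=0) + 1e-6  # small floor avoids division by zero


def log_likelihood(X, mean, var):
    """Log-density of each row of X under a diagonal Gaussian."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (X - mean) ** 2 / var, axis=1)


def rerank(features, n_init=5, n_add=5, n_iter=3):
    """Re-rank results whose feature rows are ordered by the engine's original rank.

    Returns (order, scores), where order lists indices by decreasing score.
    """
    n = len(features)
    # Stage 1, ranking-based initialization (an assumption of this sketch):
    # top results are taken as positives, bottom results as negatives.
    pos = set(range(n_init))
    neg = set(range(n - n_init, n))
    scores = np.zeros(n)
    for _ in range(n_iter):
        pos_model = fit_gaussian(features[sorted(pos)])
        neg_model = fit_gaussian(features[sorted(neg)])
        # Relevance score: log-likelihood ratio of the two class models.
        scores = log_likelihood(features, *pos_model) - log_likelihood(features, *neg_model)
        # Stage 2, dynamic selection: add the most confidently scored
        # unlabeled results to the training sets for the next iteration.
        unlabeled = [int(i) for i in np.argsort(-scores) if i not in pos and i not in neg]
        new_pos = unlabeled[:n_add]
        new_neg = [i for i in unlabeled[::-1][:n_add] if i not in new_pos]
        pos.update(new_pos)
        neg.update(new_neg)
    return np.argsort(-scores), scores


def make_toy_results(seed=0):
    """Hypothetical toy data: 40 results in 2-D with a noisy original ranking."""
    rng = np.random.default_rng(seed)
    relevant = np.zeros(40, dtype=bool)
    relevant[:20] = True
    relevant[[3, 12]] = False  # irrelevant results that the engine ranked high
    relevant[[25, 33]] = True  # relevant results that the engine ranked low
    centers = np.where(relevant, 2.0, -2.0)[:, None]
    features = centers + rng.normal(0.0, 0.5, size=(40, 2))
    return features, relevant


def precision_at(order, relevant, k=10):
    """Fraction of relevant results among the first k entries of a ranking."""
    return float(np.mean(relevant[order[:k]]))
```

Calling `rerank(features)` on the toy data and comparing `precision_at` for the original order (`np.arange(40)`) with the returned order shows how noisy initial labels can still yield a usable classifier when the two classes are visually well separated. Real image features are far less cleanly separated, which is why the abstract treats training data selection as the key issue.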

