A Visual Semantic Concept Detection Algorithm Based on E2LSH-MKL
Keywords:
- Visual semantic concept
- Multiple kernel learning (MKL)
- Exact Euclidean locality-sensitive hashing (E2LSH)
- Hadamard product
Abstract: Multiple kernel learning (MKL) is widely used in visual semantic concept detection. Most conventional MKL approaches, however, combine kernels in a linear, stationary fashion and therefore cannot accurately describe complex data distributions. This paper applies the exact Euclidean locality-sensitive hashing (E2LSH) algorithm to clustering and, building on the advantages of nonlinear multiple kernel combination, proposes a nonlinear, non-stationary multiple kernel learning method, E2LSH-MKL. The method uses the Hadamard product to weight different kernel functions nonlinearly, which exploits the information produced by interactions among the kernels. It also uses an E2LSH-based clustering algorithm to hash the original image set into several image subsets, and then assigns each subset its own kernel weights according to the relative contribution of each kernel to that subset. This yields a non-stationary weighting of the kernels and improves the performance of the learner. Finally, E2LSH-MKL is applied to visual semantic concept detection. Experiments on the Caltech-256 and TRECVID 2005 datasets show that the proposed method outperforms several existing multiple kernel learning methods.
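The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy points, the choice of a linear and an RBF kernel, and the parameters `k`, `w`, and `gamma` are all assumptions made for illustration. The hash follows the standard p-stable E2LSH form h(x) = floor((a · x + b) / w) with Gaussian a, and points that share the same k-tuple of hash values are treated as one cluster. How the paper learns and applies the per-cluster kernel weights is not described in the abstract, so that step is left out.

```python
import math
import random
from collections import defaultdict


def make_e2lsh_hash(dim, k, w, rng):
    # Draw k Gaussian (2-stable) projection vectors a_i and offsets b_i
    # uniformly from [0, w]. All parameter values are illustrative.
    a = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]
    b = [rng.uniform(0.0, w) for _ in range(k)]

    def g(x):
        # g(x) = (h_1(x), ..., h_k(x)), h_i(x) = floor((a_i . x + b_i) / w)
        return tuple(
            math.floor((sum(p * q for p, q in zip(a_i, x)) + b_i) / w)
            for a_i, b_i in zip(a, b)
        )

    return g


def hash_cluster(points, g):
    # Points with the same hash key fall into the same bucket (cluster).
    buckets = defaultdict(list)
    for idx, x in enumerate(points):
        buckets[g(x)].append(idx)
    return list(buckets.values())


def linear_kernel(x, y):
    return sum(p * q for p, q in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(x, y)))


def gram(points, kernel):
    # Full Gram matrix as a list of rows.
    return [[kernel(x, y) for y in points] for x in points]


def hadamard_product(gram_a, gram_b):
    # Element-wise product of two Gram matrices of the same shape.
    return [
        [u * v for u, v in zip(row_a, row_b)]
        for row_a, row_b in zip(gram_a, gram_b)
    ]


# Toy data (hypothetical): two well-separated groups, with one duplicate point.
points = [(0.0, 0.1), (0.0, 0.1), (0.2, 0.0), (9.0, 9.1), (9.2, 8.9)]
g = make_e2lsh_hash(dim=2, k=3, w=4.0, rng=random.Random(0))
clusters = hash_cluster(points, g)
combined = hadamard_product(gram(points, linear_kernel), gram(points, rbf_kernel))
```

By the Schur product theorem, the element-wise product of two positive semidefinite Gram matrices is itself positive semidefinite, so `combined` remains a valid kernel matrix. Nearby points are likely, but not guaranteed, to share a bucket. Larger `w` or smaller `k` gives coarser clusters.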