Cluster Ensemble Based on Spectral Clustering
-
摘要: 谱聚类是近年来出现的一类性能优越的聚类算法,能对任意形状的数据进行聚类, 但算法对尺度参数比较敏感,利用聚类集成良好的鲁棒性和泛化能力,本文提出了基于谱聚类的聚类集成算法.该算法首先利用谱聚类算法的内在特性构造多样性的聚类成员; 然后,采用连接三元组算法计算相似度矩阵,扩充了数据点之间的相似性信息;最后,对相似度矩阵使用谱聚类算法得到最终的集成结果. 为了使算法能扩展到大规模应用,利用Nystrm采样算法只计算随机采样数据点之间以及随机采样数据点与剩余数据点之间的相似度矩阵,从而有效降低了算法的计算复杂度. 本文算法既利用了谱聚类算法的优越性能,同时又避免了精确选择尺度参数的问题.实验结果表明:较之其他常见的聚类集成算法,本文算法更优越、更有效,能较好地解决数据聚类、图像分割等问题.Abstract: Spectral clustering has become increasingly popular in recent years. It can deal with arbitrary distribution dataset, however, it is sensitive to the scaling parameter. Cluster ensemble based on spectral clustering is proposed which utilizes the good robustness and generalization ability of cluster ensemble. Multiform clustering components are generated by exploiting the property of spectral clustering, and the connected triple algorithm which can expand the similarity information among data is used to compute the affinity matrix, then the affinity matrix is used by spectral clustering algorithm to produce ensemble results. In order to make the algorithm extensible to large scale applications, only the similarity among the rand sampling data and the similarity between the random sampling data and the rest data are computed by adopting the Nystrm sampling method. The proposed algorithm makes full use of the excellent performance of spectral clustering as well as avoids the selection of the accurate parameter in spectral clustering. Experiments show that compared with other common cluster ensemble techniques, the proposed algorithm is more excellent and efficient, and that it can provide a good way to solve data clustering and image segmentation problem.
-
Key words:
- Spectral clustering /
- cluster ensemble /
- connected triple /
- image segmentation
-
[1] Jiao Li-Cheng, Zhang Xiang-Rong, Hou Biao, Wang Shuang, Liu Fang. Intelligent SAR Image Processing and Interpretation. Beijing: Science Press, 2008. 398-435(焦李成, 张向荣, 侯彪, 王爽, 刘芳. 智能SAR图像处理及解译. 北京: 科学出版社, 2008. 398-435)[2] Cai Xiao-Yan, Dai Guan-Zhong, Yang Li-Bin. Survey on spectral clustering algorithms. Computer Science, 2008, 35(7): 14-18(蔡晓妍, 戴冠中, 杨黎斌. 谱聚类算法综述. 计算机科学, 2008, 35(7): 14-18)[3] Wu Rui, Huang Jian-Hua, Tang Xiang-Long, Liu Jia-Feng. Method of text image binarization processing using histogram and spectral clustering. Journal of Electronics and Information Technology, 2009, 31(10): 2460-2464(吴锐, 黄剑华, 唐降龙, 刘家锋. 基于灰度直方图和谱聚类的文本图像二值化方法. 电子与信息学报, 2009, 31(10): 2460-2464)[4] Wang Na, Li Xia. Active semi-supervised spectral clustering based on pairwise constraints. Acta Electronica Sinica, 2010, 38(1): 172-176(王娜, 李霞. 基于监督信息特性的主动半监督谱聚类算法. 电子学报, 2010, 38(1): 172-176)[5] Jia Jian-Hua, Jiao Li-Cheng. Image segmentation by spectral clustering algorithm with spatial coherence constraints. Journal of Infrared and Millimeter Waves, 2010, 29(1): 69-75(贾建华, 焦李成. 空间一致性约束谱聚类算法用于图像分割. 红外和毫米波学报, 2010, 29(1): 69-75)[6] Ersahin K, Cumming I G, Ward R K. Segmentation and classification of polarimetric SAR data using spectral graph partitioning. IEEE Transactions on Geoscience and Remote Sensing, 2010, 48(1): 164-167[7] Alzate C, Suykens J A K. Multiway spectral clustering with out-of-sample extensions through weighted kernel PCA. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010, 32(2): 335-347[8] Lauer F, Schnorr C. Spectral clustering of linear subspaces for motion segmentation. In: Proceedings of the 12th IEEE International Conference of Computer Vision. Kyoto, Japan: IEEE, 2009. 678-685[9] Xu Sen, Lu Zhi-Mao, Gu Guo-Chang. Two spectral algorithms for ensembling document clusters. Acta Automatica Sinica, 2009, 35(7): 997-1002(徐森, 卢志茂, 顾国昌. 解决文本聚类集成问题的两个谱算法. 自动化学报, 2009, 35(7): 997-1002)[10] Cheng Y, Tong Q. Spectral clustering on manifolds with statistical and geometrical similarity. Lecture Notes in Computer Science. Berlin: Springer, 2010. 422-429[11] Zhang Y L, Zhuang J, Wang S A. Fusion of manifold learning and spectral clustering algorithm with application to fault diagnosis. In: Proceedings of the 2nd International Conference on Machine Learning and Computing. Bangalore, India: IEEE, 2010. 155-160[12] Zhao F, Jiao L C, Liu H Q, Gao X B, Gong M G. Spectral clustering with eigenvector selection based on entropy ranking. Neurocomputing, 2010, 73(10-12): 1704-1717[13] Zhang Xiang-Rong, Qian Xiao-Xue, Jiao Li-Cheng. Immune spectral clustering algorithm for image segmentation. Journal of Software, 2010, 21(9): 2196-2205(张向荣, 骞晓雪, 焦李成. 基于免疫谱聚类的图像分割. 软件学报, 2010, 21(9): 2196-2205)[14] Xu Hai-Xia, Tian Zheng, Ding Ming-Tao. Multiscale segmentation for SAR image based on spectral clustering and mixture model. Journal of Image and Graphics, 2010, 15(3): 450-454(徐海霞, 田铮, 丁明涛. 基于谱聚类与混合模型的SAR图像多尺度分割. 中国图象图形学报, 2010, 15(3): 450-454)[15] Fred A L, Jain A K. Combining multiple clusterings using evidence accumulation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(6): 835-850[16] Strehl A, Ghosh J. Cluster ensembles —a knowledge reuse framework for combining partitions. Journal of Machine Learning Research, 2002, 3: 583-617[17] Zhou Z H, Tang W. Clusterer ensemble. Knowledge-Based Systems, 2006, 19(1): 77-83[18] Iam-On N, Boongoen T, Garrett S. Refining pairwise similarity matrix for cluster ensemble problem with cluster relations. In: Proceedings of the 11th International Conference on Discovery Science. Budapest, Hungary: Springer, 2008. 222-233[19] Ayad H, Basir O, Kamel M. A probabilistic model using information theoretic measures for cluster ensemble. In: Proceedings of the 5th International Workshop on Multiple Classifier Systems. Cagliari, Italy: Springer, 2004. 144-153[20] Topchy A, Jain A K, Punch W. A mixture model for clustering ensembles. In: Proceedings of the 4th SIAM International Conference on Data Mining. Florida, USA: SIAM, 2004. 379-390[21] Tang Wei, Zhou Zhi-Hua. Bagging-based selective clusterer ensemble. Journal of Software, 2005, 16(4): 496-502(唐伟, 周志华. 基于Bagging的选择性聚类集成. 软件学报, 2005, 16(4): 496-502)[22] Fern X Z, Brodley C E. Random projection for high dimensional data clustering: a cluster ensemble approach. In: Proceedings of the 20th International Conference on Machine Learning. Washington D.C., USA: AAAI Press, 2003. 186-193[23] Nguyen N, Caruana R. Consensus clusterings. In: Proceedings of the 7th IEEE International Conference on Data Mining. Omaha, USA: IEEE, 2007. 607-612[24] Fern X Z, Brodley C E. Solving cluster ensemble problems by bipartite graph partitioning. In: Proceedings of the 21st International Conference on Machine Learning. Banff, Canada: ACM, 2004. 1-8[25] Fowlkes C, Belongie S, Chung F, Mailk J. Spectral grouping using the Nystrm method. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004, 26(2): 214-225[26] Wang Kai-Jun, Zhang Jun-Ying, Li Dan, Zhang Xin-Na, Guo Tao. Adaptive affinity propagation clustering. Acta Automatica Sinica, 2007, 33(12): 1242-1246(王开军, 张军英, 李丹, 张新娜, 郭涛. 自适应仿射传播聚类. 自动化学报, 2007, 33(12): 1242-1246)[27] Cunha A L, Zhou J P, Do N M. The nonsubsampled contourlet transform: theory, design, and application. IEEE Transactions on Image Processing, 2006, 15(10): 3089-3101
点击查看大图
计量
- 文章访问数: 2711
- HTML全文浏览量: 106
- PDF下载量: 2395
- 被引次数: 0