Classifier Ensemble with Diversity: Effectiveness Analysis and Ensemble Optimization
Abstract: Diversity is a necessary condition for high generalization capability in classifier ensembles. However, there is no unified method for analyzing and handling diversity measures, their effectiveness, or classifier ensemble optimization. To address these issues, on the one hand, this paper comprehensively surveys and analyzes classifier ensembles with diversity from three aspects: diversity measurement methods, effectiveness analysis of diversity measures, and the corresponding optimization techniques for classifier ensembles. The effectiveness of diversity measures is also demonstrated intuitively via a vector space model. On the other hand, comparative experiments and performance analysis are conducted on the UCI data sets and the USPS data set with a variety of typical diversity-based classifier ensemble methods (Bagging, boosting, GA-based, quadratic programming (QP), semi-definite programming (SDP), and regularized selective ensemble (RSE)). Finally, practical suggestions are given on how to select diversity measurement methods and specific ensemble optimization techniques.
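As background for the diversity measures the survey compares, the sketch below (not code from the paper) computes two standard pairwise measures from the correctness of two classifiers' predictions on a shared test set: the disagreement measure and Yule's Q-statistic. The variable names and the toy data are illustrative assumptions.

```python
# Minimal sketch of two pairwise diversity measures for classifier
# ensembles: the disagreement measure and Yule's Q-statistic.
# Inputs are booleans marking whether each classifier was correct
# on each test sample.

def pairwise_diversity(correct_i, correct_k):
    """Return (disagreement, Q) for two classifiers' correctness vectors."""
    n11 = n00 = n01 = n10 = 0
    for ci, ck in zip(correct_i, correct_k):
        if ci and ck:
            n11 += 1          # both correct
        elif ci and not ck:
            n10 += 1          # only classifier i correct
        elif ck:
            n01 += 1          # only classifier k correct
        else:
            n00 += 1          # both wrong
    n = n11 + n00 + n01 + n10
    disagreement = (n01 + n10) / n   # higher value = more diverse pair
    denom = n11 * n00 + n01 * n10
    # Q in [-1, 1]; values near 0 or negative indicate more diversity
    q = (n11 * n00 - n01 * n10) / denom if denom else 0.0
    return disagreement, q

# Toy example: two classifiers judged on 8 samples (True = correct)
c1 = [True, True, True, False, True, False, True, True]
c2 = [True, False, True, True, True, False, False, True]
d, q = pairwise_diversity(c1, c2)   # d = 0.375, q = 1/3
```

Ensemble-level diversity is then typically summarized by averaging such pairwise values over all classifier pairs, which is one of the quantities the surveyed optimization techniques trade off against individual accuracy.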
Key words: classifier ensemble / diversity / effectiveness analysis / optimization