Disagreement-based Semi-supervised Learning
Keywords:
- machine learning
- semi-supervised learning
- disagreement-based semi-supervised learning
- unlabeled data
Abstract: Traditional supervised learning generally requires a large amount of labeled data as training examples; in many real tasks, however, although it is usually easy to acquire a lot of data, it is often expensive to obtain the label information. Can we improve learning performance with a limited amount of labeled data by exploiting the large amount of unlabeled data? For this purpose, semi-supervised learning has become a hot topic in machine learning over the past decade. In one of its mainstream paradigms, disagreement-based semi-supervised learning, multiple learners are trained to exploit the unlabeled data, and the "disagreement" among the learners is crucial to the learning performance. This article briefly surveys some research advances in this paradigm.
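The co-training idea at the heart of this paradigm can be sketched in a few lines: two learners, each trained on its own "view" (feature subset), repeatedly label their most confident unlabeled examples for each other. The sketch below is a minimal toy illustration, not the algorithm from any specific paper surveyed here; the `CentroidClassifier`, `co_train`, and all parameter names are hypothetical, and a simple nearest-centroid learner stands in for the real base classifiers.

```python
class CentroidClassifier:
    """Toy base learner: assigns a point to the nearest class centroid,
    measured only on the feature indices of its own view."""

    def __init__(self, view):
        self.view = view  # feature indices this learner is allowed to see

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            pts = [[x[i] for i in self.view] for x, lab in zip(X, y) if lab == label]
            self.centroids[label] = [sum(col) / len(pts) for col in zip(*pts)]
        return self

    def _dist(self, x, c):
        return sum((a - b) ** 2 for a, b in zip((x[i] for i in self.view), c))

    def predict(self, x):
        return min(self.centroids, key=lambda lab: self._dist(x, self.centroids[lab]))

    def confidence(self, x):
        # margin between the two nearest centroids: larger = more confident
        d = sorted(self._dist(x, c) for c in self.centroids.values())
        return d[1] - d[0]


def co_train(X_lab, y_lab, X_unlab, views, rounds=5, per_round=2):
    """Co-training sketch: in each round, each learner pseudo-labels its
    most confident unlabeled points and hands them to the other learner."""
    data = [(list(X_lab), list(y_lab)) for _ in views]  # per-learner training sets
    pool = list(X_unlab)
    clfs = [CentroidClassifier(v) for v in views]
    for _ in range(rounds):
        if not pool:
            break
        for j, clf in enumerate(clfs):
            clf.fit(*data[j])
        for j, clf in enumerate(clfs):
            ranked = sorted(pool, key=clf.confidence, reverse=True)[:per_round]
            other = 1 - j  # the "disagreeing" peer receives the pseudo-labels
            for x in ranked:
                data[other][0].append(x)
                data[other][1].append(clf.predict(x))
                pool.remove(x)
    for j, clf in enumerate(clfs):  # final refit on the enlarged training sets
        clf.fit(*data[j])
    return clfs
```

Because each learner sees a different view, the two make different mistakes, and the confident pseudo-labels one passes to the other carry information the recipient could not extract on its own; when the learners stop disagreeing, the unlabeled pool stops helping, which is why the surveyed work treats disagreement as the key resource.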