A Local Weighted Mean Based Domain Adaptation Learning Framework
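Abstract (translated from Chinese): Maximum mean discrepancy (MMD) has been applied successfully as a criterion that effectively measures the distribution discrepancy between source and target domains. However, as a global measure, MMD reflects to some extent only the differences in global distribution and global structure between domains. To address this, this paper introduces the method and theory of local weighted means into MMD and proposes a projected maximum local weighted mean discrepancy (PMLWD) measure with locality-preserving ability, so that PMLWD can, to some extent, more effectively measure the differences in distribution and structure between local patches of the source and target domains. Combining this measure with classical learning theory, we propose a local weighted mean based domain adaptation learning framework (LDAF), from which two domain adaptation learning methods are derived: LDAF_MLC and LDAF_SVM. Finally, experiments on artificial data sets, high-dimensional text data sets, and face data sets indicate that LDAF has advantages over other domain adaptation learning methods.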
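Keywords:
- Transfer learning
- Domain adaptation learning
- Local weighted mean
- Projected maximum local weighted mean discrepancy
- Local weighted mean based domain adaptation learning framework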
Abstract: Maximum mean discrepancy (MMD) has been used successfully as a criterion that effectively and efficiently measures the distribution discrepancy between source and target domains. However, it is a global measure and, to some extent, reflects only the global distribution discrepancy and the global structural difference between domains. We therefore propose a projected maximum local weighted mean discrepancy (PMLWD) measure with locality-preserving ability, obtained by integrating the theory and method of local weighted means into MMD. We also show theoretically that PMLWD is a generalization of MMD. Furthermore, on the basis of PMLWD and by integrating classical learning theories, we present a local weighted mean based domain adaptation learning framework (LDAF). Within LDAF, we propose a local weighted mean based multi-label classification domain adaptation learning algorithm (LDAF_MLC) and a local weighted mean based domain adaptation support vector machine (LDAF_SVM). Finally, tests on artificial data sets, high-dimensional text data sets, and face data sets show that the LDAF methods are superior to other domain adaptation methods.
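For readers unfamiliar with the global criterion that the abstract starts from, the following is a minimal sketch of the standard biased empirical estimate of squared MMD between a source sample and a target sample. It is not the PMLWD measure proposed in the paper, whose definition is not given in this abstract. The Gaussian kernel, the bandwidth parameter `gamma`, the synthetic data, and all function names are assumptions chosen for illustration.

```python
import math
import random


def rbf(x, y, gamma):
    # Gaussian (RBF) kernel between two points given as sequences of floats.
    # The kernel choice is an assumption; the abstract does not name one.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mean_kernel(xs, ys, gamma):
    # Average kernel value over all pairs (x, y) with x in xs and y in ys.
    return sum(rbf(x, y, gamma) for x in xs for y in ys) / (len(xs) * len(ys))


def mmd2(source, target, gamma=1.0):
    # Biased empirical estimate of squared MMD:
    # mean k(s, s') + mean k(t, t') - 2 * mean k(s, t).
    return (mean_kernel(source, source, gamma)
            + mean_kernel(target, target, gamma)
            - 2.0 * mean_kernel(source, target, gamma))


def sample(center, n, rng):
    # Hypothetical 2-D Gaussian data centred at (center, center).
    return [(rng.gauss(center, 1.0), rng.gauss(center, 1.0)) for _ in range(n)]


rng = random.Random(0)
source = sample(0.0, 100, rng)
same_dist = sample(0.0, 100, rng)   # drawn from the same distribution as source
shifted = sample(2.0, 100, rng)     # drawn from a shifted distribution

print("same distribution:   ", mmd2(source, same_dist))
print("shifted distribution:", mmd2(source, shifted))
```

The estimate compares the two samples only through averages over all points, which is the sense in which the abstract calls MMD a global measure: two domains whose local patches differ can still produce a small value if their overall averages agree.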