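A Fast Neighbor Prototype Selection Algorithm Based on Local Mean and Class Global Information

LI Juan^1,2, WANG Yu-Ping^1

doi: 10.3724/SP.J.1004.2014.01116

1. School of Computer Science and Technology, Xidian University, Xi'an 710071
2. School of Distance Education, Shaanxi Normal University, Xi'an 710062

Funding:

Supported by the National Natural Science Foundation of China (61272119)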

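Author biography:
LI Juan  Ph.D. candidate at the School of Computer Science and Technology, Xidian University, and lecturer at the School of Distance Education, Shaanxi Normal University. Main research interests: data mining and pattern recognition. E-mail: ally 2004@126.com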

Publication history
- Received: 2013-06-19
- Revised: 2013-11-11
- Published: 2014-06-20

Abstract

The condensed nearest neighbor (CNN) algorithm is a simple non-parametric prototype selection method, but its prototype selection is sensitive to the order in which patterns are read and to abnormal patterns. To address these problems, a nearest neighbor prototype selection method based on local means and class global information is proposed. During prototype selection, the method makes full use of the local means of the k same-class and different-class nearest neighbors, within the prototype set, of each pattern to be learned, together with class global information. It also defines update strategies that dynamically update the prototype set. The method lessens the influence of the pattern reading order and of abnormal patterns on prototype selection and reduces the size of the prototype set, achieving high compression of the data set while maintaining high classification accuracy. Experiments on two image recognition data sets and on University of California Irvine (UCI) benchmark data sets show that the proposed method achieves more effective classification performance than the compared algorithms.

Keywords:
- data classification
- prototype selection
- local mean
- class global information
- adaptive learning
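
To make the abstract's starting point concrete, the following is a minimal sketch of the classical condensed nearest neighbor procedure and of a generic local-mean decision rule. It is not the method proposed in the paper, whose selection and update rules are not given on this page. The function names (`nearest_label`, `condense`, `local_mean_predict`), the Euclidean distance, and the toy 1-D data are all assumptions chosen for illustration.

```python
import math

def nearest_label(prototypes, x):
    """1-NN rule: label of the stored (point, label) prototype closest to x."""
    return min(prototypes, key=lambda p: math.dist(p[0], x))[1]

def condense(samples):
    """Classical CNN over (point, label) pairs, processed in reading order."""
    prototypes = [samples[0]]            # seed with the first pattern read
    changed = True
    while changed:                       # repeat passes until nothing is added
        changed = False
        for s in samples:
            if s in prototypes:
                continue
            if nearest_label(prototypes, s[0]) != s[1]:
                prototypes.append(s)     # misclassified by current prototypes: keep it
                changed = True
    return prototypes

def local_mean_predict(prototypes, x, k=2):
    """Generic local-mean rule (illustrative, not the paper's exact criterion):
    choose the class whose k nearest prototypes have the mean closest to x."""
    best_label, best_dist = None, math.inf
    for label in sorted({lab for _, lab in prototypes}):
        pts = sorted((p for p, lab in prototypes if lab == label),
                     key=lambda p: math.dist(p, x))[:k]
        mean = tuple(sum(c) / len(pts) for c in zip(*pts))
        d = math.dist(mean, x)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

# Hypothetical toy data: class 0 lies near 0, class 1 lies near 1.
data = [((0.0,), 0), ((0.2,), 0), ((0.4,), 0),
        ((1.0,), 1), ((1.2,), 1), ((1.4,), 1)]

forward = condense(data)            # read in the given order
backward = condense(data[::-1])     # same data, reversed reading order
print(forward)
print(backward)
print(local_mean_predict(data, (0.6,), k=2))
```

On this toy data the two reading orders select different prototypes, which is the order sensitivity the abstract attributes to CNN. The local-mean rule averages several neighbors per class before comparing distances, which tends to dampen the effect of a single abnormal pattern.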
