A Deep Belief Networks Training Strategy Based on Multi-hidden Layer Gibbs Sampling

SHI Ke, LU Yang, LIU Guang-Liang, BI Xiang, WANG Hui

Citation: SHI Ke, LU Yang, LIU Guang-Liang, BI Xiang, WANG Hui. A Deep Belief Networks Training Strategy Based on Multi-hidden Layer Gibbs Sampling. ACTA AUTOMATICA SINICA, 2019, 45(5): 975-984. doi: 10.16383/j.aas.c170669

doi: 10.16383/j.aas.c170669
Funds:

National Natural Science Foundation of China 61572167

National Key Research and Development Program of China 2016YFC0801405

National Key Research and Development Program of China 2016YFC0801804

More Information
    Author Bio:

    SHI Ke  Ph.D. candidate at the School of Computer and Information, Hefei University of Technology. His research interest covers natural language processing, information retrieval, and machine learning. E-mail: shike@mail.hfut.edu.cn

    LIU Guang-Liang  Ph.D. candidate at the School of Computer and Information, Hefei University of Technology. His research interest covers data mining and machine learning. E-mail: homecs@126.com

    BI Xiang  Lecturer at the School of Computer and Information, Hefei University of Technology. His research interest covers modeling and control of fuzzy discrete event systems, and reliability of complex software. E-mail: bixiang@hfut.edu.cn

    WANG Hui  Senior engineer at Hefei University of Technology. His research interest covers complex networks and neural networks. E-mail: wanghui@hfut.edu.cn

    Corresponding author:

    LU Yang  Professor at the School of Computer and Information, Hefei University of Technology. His research interest covers artificial intelligence, computer control, and sensor networks. Corresponding author of this paper. E-mail: luyang.hf@126.com

  • Abstract: Deep belief networks (DBN), an important class of probabilistic generative models, are widely used in many fields. Existing DBN training consists of two stages: first, the restricted Boltzmann machine (RBM) layers are pretrained greedily, bottom-up and layer by layer, so that the reconstruction error of each layer is minimized; this stage is unsupervised. The weights of the whole network are then fine-tuned with supervised back-propagation. This paper proposes a new DBN training method: through Gibbs sampling across multiple hidden layers, neighboring RBM layers are combined, and an additional pretraining stage is inserted between the original layer-wise pretraining and the global fine-tuning, which effectively improves the accuracy of the DBN. Several ways of grouping the hidden layers are also compared; experiments on the MNIST, ShapeSet, and Cifar10 datasets show that the pairwise nested grouping yields a lower error rate than the traditional method. The new training method achieves better accuracy with fewer neurons than previous training methods and thus has higher algorithmic efficiency. A minimal code sketch of this three-stage pipeline is given below.
    1)  Recommended by Associate Editor WANG Zhan-Shan
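
To make the pipeline above concrete, the following is a minimal NumPy sketch of the three stages, under stated assumptions: the RBM class and its CD-1 update follow the standard formulation from the DBN literature, while gibbs_refine_pair is only an illustrative guess at the extra pretraining stage (a DBM-style Gibbs sweep over the three layers spanned by two stacked RBMs), not the paper's exact algorithm; every name (RBM, pretrain_layerwise, gibbs_refine_pair) and every hyperparameter here is hypothetical.

    # Minimal sketch; names and hyperparameters are illustrative, not the paper's.
    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sample(p):
        # Bernoulli samples with success probabilities p
        return (rng.random(p.shape) < p).astype(float)

    class RBM:
        def __init__(self, n_vis, n_hid, lr=0.1):
            self.W = rng.normal(0.0, 0.01, (n_vis, n_hid))
            self.b = np.zeros(n_vis)   # visible bias
            self.c = np.zeros(n_hid)   # hidden bias
            self.lr = lr

        def up(self, v):               # p(h = 1 | v)
            return sigmoid(v @ self.W + self.c)

        def down(self, h):             # p(v = 1 | h)
            return sigmoid(h @ self.W.T + self.b)

        def cd1(self, v0):
            # one contrastive-divergence (CD-1) update on a mini-batch
            ph0 = self.up(v0)
            v1 = self.down(sample(ph0))        # one Gibbs step down
            ph1 = self.up(v1)
            n = v0.shape[0]
            self.W += self.lr * (v0.T @ ph0 - v1.T @ ph1) / n
            self.b += self.lr * (v0 - v1).mean(axis=0)
            self.c += self.lr * (ph0 - ph1).mean(axis=0)

    def pretrain_layerwise(rbms, data, epochs=10):
        # Stage 1: greedy, unsupervised, bottom-up layer-wise pretraining
        x = data
        for rbm in rbms:
            for _ in range(epochs):
                rbm.cd1(x)
            x = rbm.up(x)              # hidden activations feed the next RBM
        return x

    def gibbs_refine_pair(low, high, v, sweeps=10, lr=0.05):
        # Stage 2 (assumed form): join two stacked RBMs into a three-layer
        # unit and run Gibbs sampling across it, updating both weight matrices.
        for _ in range(sweeps):
            h1 = sample(low.up(v))             # v    -> h_m
            h2 = sample(high.up(h1))           # h_m  -> h_{m+1}
            # middle layer resampled given BOTH neighbours (cf. Fig. 3)
            ph1 = sigmoid(v @ low.W + h2 @ high.W.T + low.c)
            v1 = low.down(sample(ph1))         # one sweep back down
            ph1n = sigmoid(v1 @ low.W + h2 @ high.W.T + low.c)
            ph2n = high.up(ph1n)
            n = v.shape[0]
            low.W += lr * (v.T @ ph1 - v1.T @ ph1n) / n
            high.W += lr * (ph1.T @ h2 - ph1n.T @ ph2n) / n

    # Example: a 784-500-200-60 stack on random binary "images"
    data = (rng.random((128, 784)) < 0.5).astype(float)
    rbms = [RBM(784, 500), RBM(500, 200), RBM(200, 60)]
    pretrain_layerwise(rbms, data)
    for m in range(len(rbms) - 1):     # pairwise nested grouping: (1,2), (2,3)
        v = data
        for rbm in rbms[:m]:           # propagate the input up to layer m
            v = rbm.up(v)
        gibbs_refine_pair(rbms[m], rbms[m + 1], v)
    # Stage 3 (not shown): supervised back-propagation fine-tuning of all weights.

The final loop mirrors the pairwise nested grouping that the abstract reports as the best-performing scheme; other groupings of the hidden layers would slot into the same loop.
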
  • Fig.  1  Restricted Boltzmann machine

    Fig.  2  Deep belief networks

    Fig.  3  Sampling for $\pmb{h}_{m+1}$ (a plausible form of this conditional is sketched after this list)

    Fig.  4  The error rate of 4 hidden layers model on MNIST

    Fig.  5  The error rate of 3 hidden layers model on MNIST

    Fig.  6  The error rate of 3 hidden layers model on ShapeSet

    Fig.  7  The error rate of 3 hidden layers model on Cifar10

    Fig.  8  The error rate comparison with CD1 and CD10 on 3 hidden layers model

    Fig.  9  The error rate comparison with CD1 and CD10 on 4 hidden layers model

    Fig.  10  The error rate comparison with CD1 and PCD on 4 hidden layers model

    Fig.  11  The training time consumption comparison on 4 hidden layers model

    Fig.  12  AE comparison on 4 hidden layers model
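
Fig. 3 concerns sampling the intermediate hidden layer $\pmb{h}_{m+1}$ when neighbouring RBM layers are combined. As a hedged reading only (the paper's exact expression is not reproduced here), if $\pmb{W}_{m+1}$ connects $\pmb{h}_m$ to $\pmb{h}_{m+1}$ and $\pmb{W}_{m+2}$ connects $\pmb{h}_{m+1}$ to $\pmb{h}_{m+2}$, the usual DBM-style conditional for unit $j$ of the middle layer is

$p(h_{m+1,j}=1 \mid \pmb{h}_m, \pmb{h}_{m+2}) = \sigma\left((\pmb{W}_{m+1}^{\top}\pmb{h}_m)_j + (\pmb{W}_{m+2}\,\pmb{h}_{m+2})_j + c_{m+1,j}\right)$

where $\sigma(\cdot)$ is the logistic sigmoid and $c_{m+1,j}$ is the bias of unit $j$: the middle layer receives bottom-up input from $\pmb{h}_m$ and top-down input from $\pmb{h}_{m+2}$ in the same Gibbs sweep.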

Publication History
  • Received Date: 2017-11-22
  • Accepted Date: 2018-03-24
  • Published Date: 2019-05-20
