[1] |
Erhan D, Bengio Y, Courville A, Manzagol P A, Vincent P, Bengio S. Why does unsupervised pre-training help deep learning? Journal of Machine Learning Research, 2010, 11:625-660 http://research.google.com/pubs/archive/35536.pdf |
[2] |
Sun Zhi-Jun, Xue Lei, Xu Yang-Ming, Wang Zheng. Overview of deep learning. Application Research of Computers, 2012, 29(8):2806-2810 http://www.cnki.com.cn/Article/CJFDTOTAL-BJGD201501011.htm |
[3] |
Bengio Y. Learning deep architectures for AI. Foundations and Trends® in Machine Learning, 2009, 2(1):1-127 doi: 10.1561/2200000006 |
[4] |
Hinton G E, Osindero S, Teh Y W. A fast learning algorithm for deep belief nets. Neural Computation, 2006, 18(7):1527-1554 doi: 10.1162/neco.2006.18.7.1527 |
[5] |
Hinton G E, Salakhutdinov R R. Reducing the dimensionality of data with neural networks. Science, 2006, 313(5786):504-507 doi: 10.1126/science.1127647 |
[6] |
Bengio Y, Lamblin P, Popovici D, Larochelle H. Greedy layer-wise training of deep networks. In:Proceedings of the 2007 Advances in Neural Information Processing Systems 19(NIPS'06). Vancouver, Canada:MIT Press, 2007. 153-160 |
[7] |
Ranzato M A, Poultney C, Chopra S, LeCun Y. Efficient learning of sparse representations with an energy-based model. In:Proceedings of the 2007 Advances in Neural Information Processing Systems 19(NIPS'06). Vancouver, Canada:MIT Press, 2007. 1137-1144 |
[8] |
Weston J, Ratle F, Collobert R. Deep learning via semi-supervised embedding. In:Proceedings of the 25th International Conference on Machine Learning (ICML'08). New York, USA:ACM Press, 2008. 1168-1175 |
[9] |
Srivastava N, Mansimov E, Salakhutdinov R. Unsupervised learning of video representations using LSTMs. In:Proceedings of the 32nd International Conference on Machine Learning (ICML'15). Lille, France:Omni Press, 2015. 843-852 |
[10] |
Jia K, Sun L, Gao S H, Song Z, Shi B E. Laplacian auto-encoders:an explicit learning of nonlinear data manifold. Neurocomputing, 2015, 160:250-260 doi: 10.1016/j.neucom.2015.02.023 |
[11] |
Chan T H, Jia K, Gao S H, Lu J W, Zeng Z N, Ma Y. PCANet:a simple deep learning baseline for image classification? IEEE Transactions on Image Processing, 2015, 24(12):5017-5032 doi: 10.1109/TIP.2015.2475625 |
[12] |
Alain G, Bengio Y. What regularized auto-encoders learn from the data-generating distribution. The Journal of Machine Learning Research, 2014, 15(1):3563-3593 http://www.taodocs.com/p-61696734.html |
[13] |
Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout:a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 2014, 15(1):1929-1958 http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf |
[14] |
Dosovitskiy A, Springenberg J T, Riedmiller M, Brox T. Discriminative unsupervised feature learning with convolutional neural networks. In:Proceedings of the 2014 Advances in Neural Information Processing Systems 27(NIPS'14). Montréal, Quebec, Canada:MIT Press, 2014. 766-774 |
[15] |
Sun Y, Wang X G, Tang X O. Deep learning face representation from predicting 10000 classes. In:Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, Ohio, USA:IEEE, 2014. 1891-1898 |
[16] |
Qiao Jun-Fei, Pan Guang-Yuan, Han Hong-Gui. Design and application of continuous deep belief network. Acta Automatica Sinica, 2015, 41(12):2138-2146 http://www.aas.net.cn/CN/abstract/abstract18786.shtml |
[17] |
Längkvist M, Karlsson L, Loutfi A. A review of unsupervised feature learning and deep learning for time-series modeling. Pattern Recognition Letters, 2014, 42:11-24 doi: 10.1016/j.patrec.2014.01.008 |
[18] |
Han X F, Leung T, Jia Y Q, Sukthankar R, Berg A C. MatchNet:unifying feature and metric learning for patch-based matching. In:Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR'15). Boston, Massachusetts, USA:IEEE Press, 2015. 3279-3286 |
[19] |
Szegedy C, Liu W, Jia Y Q, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A. Going deeper with convolutions. In:Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR'15). Boston, Massachusetts, USA:IEEE, 2015. 1-9 |
[20] |
Denton E L, Chintala S, Szlam A, Fergus R. Deep generative image models using a Laplacian pyramid of adversarial networks. In:Proceedings of the 2015 Advances in Neural Information Processing Systems 28(NIPS'15). Montreal, Canada:MIT Press, 2015. 1486-1494 |
[21] |
Dong C, Loy C C, He K M, Tang X O. Learning a deep convolutional network for image super-resolution. In:Proceedings of the 13th European Conference on Computer Vision (ECCV'14). Zurich, Switzerland:Springer International Publishing, 2014. 184-199 |
[22] |
Nie S Q, Wang Z H, Ji Q. A generative restricted Boltzmann machine based method for high-dimensional motion data modeling. Computer Vision and Image Understanding, 2015, 136:14-22 doi: 10.1016/j.cviu.2014.12.005 |
[23] |
Jain A, Tompson J, LeCun Y, Bregler C. MoDeep:a deep learning framework using motion features for human pose estimation. In:Proceedings of the 12th Asian Conference on Computer Vision (ACCV'2014). Singapore:Springer International Publishing, 2015. 302-315 |
[24] |
Geng Jie, Fan Jian-Chao, Chu Jia-Lan, Wang Hong-Yu. Research on marine floating raft aquaculture SAR image target recognition based on deep collaborative sparse coding network. Acta Automatica Sinica, 2016, 42(4):593-604 http://www.aas.net.cn/CN/abstract/abstract18846.shtml |
[25] |
Erhan D, Szegedy C, Toshev A, Anguelov D. Scalable object detection using deep neural networks. In:Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR'14). Columbus, Ohio, USA:IEEE, 2014. 2155-2162 |
[26] |
Qi Y J, Das S G, Collobert R, Weston J. Deep learning for character-based information extraction. In:Proceedings of the 36th European Conference on IR Research on Advances in Information Retrieval. Amsterdam, The Netherlands:Springer International Publishing, 2014. 668-674 |
[27] |
Nie L Q, Wang M, Zhang L M, Yan S C, Zhang B, Chua T S. Disease inference from health-related questions via sparse deep learning. IEEE Transactions on Knowledge and Data Engineering, 2015, 27(8):2107-2119 doi: 10.1109/TKDE.2015.2399298 |
[28] |
Collobert R, Weston J, Bottou L, Karlen M, Kavukcuoglu K, Kuksa P. Natural language processing (almost) from scratch. The Journal of Machine Learning Research, 2011, 12:2493-2537 http://jmlr.org/papers/volume12/collobert11a/collobert11a.pdf |
[29] |
Mnih A, Hinton G E. A scalable hierarchical distributed language model. In:Proceedings of the 2009 Advances in Neural Information Processing Systems 21(NIPS'08). Vancouver, Canada:MIT Press, 2009. 1081-1088 |
[30] |
Collobert R, Weston J. A unified architecture for natural language processing:deep neural networks with multitask learning. In:Proceedings of the 25th International Conference on Machine Learning (ICML'08). Helsinki, Finland:ACM Press, 2008. 160-167 |
[31] |
Olshausen B A, Field D J. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 1996, 381(6583):607-609 doi: 10.1038/381607a0 |
[32] |
Overview of deep learning and parallel implementation, available:http://djt.qq.com/article/view/1245, June 20, 2016 |
[33] |
Hastad J. Computational Limitations for Small Depth Circuits. Cambridge, MA, USA:Massachusetts Institute of Technology, 1987 |
[34] |
Serre C, Mellot-Draznieks C, Surblé S, Audebrand N, Filinchuk Y, Férey G. Role of solvent-host interactions that lead to very large swelling of hybrid frameworks. Science, 2007, 315(5820):1828-1831 doi: 10.1126/science.1137975 |
[35] |
Salakhutdinov R R, Hinton G. Deep Boltzmann machines. In:Proceedings of the 12th International Conference on Artificial Intelligence and Statistics (AISTATS'09). Florida, USA:Omni Press, 2009. 448-455 |
[36] |
Bengio Y, Ducharme R, Vincent P, Jauvin C. A neural probabilistic language model. The Journal of Machine Learning Research, 2003, 3:1137-1155 http://www.academia.edu/7327284/A_Neural_Probabilistic_Language_Model |
[37] |
Mikolov T, Deoras A, Kombrink S, Burget L, Černocký J H. Empirical evaluation and combination of advanced language modeling techniques. In:Proceedings of the 2011 Conference of the International Speech Communication Association (INTERSPEECH'2011). Florence, Italy:ISCA Press, 2011. 605-608 |
[38] |
Schwenk H, Rousseau A, Attik M. Large, pruned or continuous space language models on a GPU for statistical machine translation. In:Proceedings of the NAACL-HLT 2012 Workshop:Will We ever Really Replace the N-gram Model? on the Future of Language Modeling for HLT. Montréal, Canada:ACL Press, 2012. 11-19 |
[39] |
Socher R, Huang E H, Pennington J, Ng A Y, Manning C D. Dynamic pooling and unfolding recursive autoencoders for paraphrase detection. In:Proceedings of the 2011 Advances in Neural Information Processing Systems 24(NIPS'11). Granada, Spain:MIT Press, 2011. 801-809 |
[40] |
Socher R, Huval B, Manning C D, Ng A Y. Semantic compositionality through recursive matrix-vector spaces. In:Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Jeju Island, Korea:ACL Press, 2012. 1201-1211 |
[41] |
Le Q, Mikolov T. Distributed representations of sentences and documents. In:Proceedings of the 31st International Conference on Machine Learning (ICML'14). Beijing, China:ACM Press, 2014. 1188-1196 |
[42] |
Kim Y. Convolutional neural networks for sentence classification. In:Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP'2014). Doha, Qatar:ACL Press, 2014. 1746-1751 |
[43] |
Dahl G E, Yu D, Deng L, Acero A. Context-dependent pre-trained deep neural networks for large vocabulary speech recognition. IEEE Transactions on Audio, Speech, and Language Processing, 2012, 20(1):30-42 doi: 10.1109/TASL.2011.2134090 |
[44] |
Mohamed A R, Dahl G E, Hinton G. Acoustic modeling using deep belief networks. IEEE Transactions on Audio, Speech, and Language Processing, 2012, 20(1):14-22 doi: 10.1109/TASL.2011.2109382 |
[45] |
Mikolov T, Yih W T, Zweig G. Linguistic regularities in continuous space word representations. In:Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT'2013). Atlanta, Georgia:ACL Press, 2013. 746-751 |
[46] |
Mikolov T, Sutskever I, Chen K, Corrado G S, Dean J. Distributed representations of words and phrases and their compositionality. In:Proceedings of the 2013 Advances in Neural Information Processing Systems 26(NIPS'13). Nevada, USA:MIT Press, 2013. 3111-3119 |
[47] |
Mikolov T, Karafiát M, Burget L, Černocký J H, Khudanpur S. Recurrent neural network based language model. In:Proceedings of the 2010 International Conference on Spoken Language Processing (ICSLP'2010). Chiba, Japan:Speech Communication Press, 2010. 1045-1048 |
[48] |
Mikolov T, Kombrink S, Burget L, Černocký J H, Khudanpur S. Extensions of recurrent neural network language model. In:Proceedings of the 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Prague, Czech Republic:IEEE, 2011. 5528-5531 |
[49] |
Mikolov T, Deoras A, Povey D, Burget L, Černocký J H. Strategies for training large scale neural network language models. In:Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU). Waikoloa, Hawaii, USA:IEEE Press, 2011. 196-201 |
[50] |
Mikolov T, Zweig G. Context dependent recurrent neural network language model. In:Proceedings of the 2012 IEEE Conference on Spoken Language Technology (SLT). Miami, Florida, USA:IEEE, 2012. 234-239 |
[51] |
Socher R, Perelygin A, Wu J Y, Chuang J, Manning C D, Ng A Y, Potts C. Recursive deep models for semantic compositionality over a sentiment treebank. In:Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP'2013). Seattle, USA:ACL Press, 2013. 1631-1642 |
[52] |
Turian J, Ratinov L, Bengio Y. Word representations:a simple and general method for semi-supervised learning. In:Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL'2010). Uppsala, Sweden:ACL Press, 2010. 384-394 |
[53] |
Firth J R. A synopsis of linguistic theory 1930-55. Studies in Linguistic Analysis. Oxford:Philological Society, 1957. 1-32 |
[54] |
Hinton G E. Learning distributed representations of concepts. In:Proceedings of the 8th Annual Conference of the Cognitive Science Society. Amherst, Massachusetts:Cognitive Science Society Press, 1986. 1-12 |
[55] |
Salton G. Automatic processing of foreign language documents. Journal of the American Society for Information Science, 1970, 21(3):187-194 doi: 10.1002/(ISSN)1097-4571 |
[56] |
Rapp R. Word sense discovery based on sense descriptor dissimilarity. In:Proceedings of the 9th Conference on Machine Translation Summit. New Orleans, USA:IAMT Press, 2003. 315-322 |
[57] |
Turney P D. Expressing implicit semantic relations without supervision. In:Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics (COLING and ACL 2006). Sydney, Australia:ACL Press, 2006. 313-320 |
[58] |
Manning C D, Raghavan P, Schütze H. Introduction to Information Retrieval. Cambridge:Cambridge University Press, 2008. |
[59] |
Zheng X Q, Chen H Y, Xu T Y. Deep learning for Chinese word segmentation and POS tagging. In:Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP'2013). Seattle, Washington, USA:ACL Press, 2013. 647-657 |
[60] |
Xu W, Rudnicky A I. Can artificial neural networks learn language models? In:Proceedings of 2000 International Conference on Spoken Language Processing (ICSLP'2000). Beijing, China:Speech Communication Press, 2000. 202-205 |
[61] |
Mnih A, Hinton G. Three new graphical models for statistical language modelling. In:Proceedings of the 24th International Conference on Machine Learning (ICML'07). Corvallis, Oregon:ACM Press, 2007. 641-648 |
[62] |
Morin F, Bengio Y. Hierarchical probabilistic neural network language model. In:Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics (AISTATS'2005). Barbados:Omni Press, 2005. 246-252 |
[63] |
Bordes A, Usunier N, Garcia-Durán A, Weston J, Yakhnenko O. Translating embeddings for modeling multi-relational data. In:Proceedings of the 2013 Advances in Neural Information Processing Systems 26(NIPS'13). Nevada, USA:MIT Press, 2013. 2787-2795 |
[64] |
Bengio Y. Deep learning of representations for unsupervised and transfer learning. In:Proceedings of the ICML2011 Unsupervised and Transfer Learning Workshop. Bellevue, Washington, USA:ACM Press, 2012. 17-37 |
[65] |
Le Q V, Ngiam J, Coates A, Lahiri A, Prochnow B, Ng A Y. On optimization methods for deep learning. In:Proceedings of the 28th International Conference on Machine Learning (ICML'11). Bellevue, Washington, USA:ACM Press, 2011. 67-105 |
[66] |
Henderson J. Neural network probability estimation for broad coverage parsing. In:Proceedings of the 10th Conference on European Chapter of the Association for Computational Linguistics (EACL'03). Budapest, Hungary:ACL Press, 2003. 131-138 |
[67] |
Henderson J. Discriminative training of a neural network statistical parser. In:Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics (ACL'2004). Barcelona, Spain:ACL Press, 2004. 95-102 |
[68] |
Titov I, Henderson J. Porting statistical parsers with data-defined kernels. In:Proceedings of the 10th Conference on Computational Natural Language Learning (CoNLL-2006). New York, USA:ACL Press, 2006. 6-13 |
[69] |
Titov I, Henderson J. Constituent parsing with incremental sigmoid belief networks. In:Proceedings of the 45th Annual Meeting on Association for Computational Linguistics (ACL'2007). Prague, Czech Republic:ACL Press, 2007. 632-639 |
[70] |
Collobert R. Deep learning for efficient discriminative parsing. In:Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS'2011). Fort Lauderdale, Florida, USA:Omni Press, 2011. 224-232 |
[71] |
Costa F, Frasconi P, Lombardo V, Soda G. Towards incremental parsing of natural language using recursive neural networks. Applied Intelligence, 2003, 19(1-2):9-25 https://pdfs.semanticscholar.org/f570/6d576037dcf6d412c65373e9c787060cd64f.pdf |
[72] |
Menchetti S, Costa F, Frasconi P, Pontil M. Wide coverage natural language processing using kernel methods and neural networks for structured data. Pattern Recognition Letters, 2005, 26(12):1896-1906 doi: 10.1016/j.patrec.2005.03.011 |
[73] |
Collins M. Head-driven statistical models for natural language parsing. Computational Linguistics, 2003, 29(4):589-637 doi: 10.1162/089120103322753356 |
[74] |
Socher R, Bauer J, Manning C D, Ng A Y. Parsing with compositional vector grammars. In:Proceedings of the 51st Annual Meeting on Association for Computational Linguistics (ACL'2013). Sofia, Bulgaria:ACL Press, 2013. 455-465 |
[75] |
Legrand J, Collobert R. Recurrent greedy parsing with neural networks. In:Proceedings of the 2014 European Conference on Machine Learning and Knowledge Discovery in Databases. Nancy, France:Springer Press, 2014. 130-144 |
[76] |
Huang E H, Socher R, Manning C D, Ng A Y. Improving word representations via global context and multiple word prototypes. In:Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL'2012). Jeju Island, Korea:ACL Press, 2012. 873-882 |
[77] |
Zhou S S, Chen Q C, Wang X L. Active deep networks for semi-supervised sentiment classification. In:Proceedings of the 23rd International Conference on Computational Linguistics (COLING'2010). Beijing, China:ACL Press, 2010. 1515-1523 |
[78] |
Glorot X, Bordes A, Bengio Y. Domain adaptation for large-scale sentiment classification:a deep learning approach. In:Proceedings of the 28th International Conference on Machine Learning (ICML'11). Bellevue, Washington, USA:Omni Press, 2011. 513-520 |
[79] |
Socher R, Pennington J, Huang E H, Ng A Y, Manning C D. Semi-supervised recursive autoencoders for predicting sentiment distributions. In:Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP'2011). Edinburgh, UK:ACL Press, 2011. 151-161 |
[80] |
Liu L M, Watanabe T, Sumita E, Zhao T J. Additive neural networks for statistical machine translation. In:Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL'2013). Sofia, Bulgaria:ACL Press, 2013. 791-801 |
[81] |
Yang N, Liu S J, Li M, Zhou M, Yu N H. Word alignment modeling with context dependent deep neural network. In:Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL'2013). Sofia, Bulgaria:ACL Press, 2013. 166-175 |
[82] |
Kalchbrenner N, Blunsom P. Recurrent continuous translation models. In:Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP'2013). Seattle, Washington, USA:ACL Press, 2013. 1700-1709 |
[83] |
Sutskever I, Vinyals O, Le Q V. Sequence to sequence learning with neural networks. In:Proceedings of the 2014 Advances in Neural Information Processing Systems 27(NIPS'14). Montréal, Quebec, Canada:MIT Press, 2014. 3104-3112 |
[84] |
Cho K, van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In:Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP'2014). Doha, Qatar:ACL Press, 2014. 1724-1734 |
[85] |
Cho K, van Merriënboer B, Bahdanau D, Bengio Y. On the properties of neural machine translation:encoder-decoder approaches. In:Proceedings of the 8th Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-8). Doha, Qatar:ACL Press, 2014. 103-111 |
[86] |
Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate. In:Proceedings of the 3rd International Conference on Learning Representations (ICLR'2015). San Diego, California, USA:arXiv Press, 2015. 1409.0473V7 |
[87] |
Dong D X, Wu H, He W, Yu D H, Wang H F. Multi-task learning for multiple language translation. In:Proceedings of the 53rd Annual Meeting on Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. Beijing, China:ACL Press, 2015. 1723-1732 |
[88] |
Pinheiro P O, Collobert R. Recurrent convolutional neural networks for scene labeling. In:Proceedings of the 31st International Conference on Machine Learning (ICML'14). Beijing, China, 2014. 82-90 http://wenku.baidu.com/view/b6cc3becccbff121dc368336.html |
[89] |
Le Q V. Building high-level features using large scale unsupervised learning. In:Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. Vancouver, BC:IEEE, 2013. 8595-8598 |
[90] |
Tian Yuan-Dong. A simple analysis of AlphaGo. Acta Automatica Sinica, 2016, 42(5):671-675 http://www.aas.net.cn/CN/abstract/abstract18856.shtml |