[1] Loizou P C. Speech Enhancement: Theory and Practice. Boca Raton, Florida: CRC Press, 2013.
[2] Ephraim Y, Malah D. Speech enhancement using a minimum mean-square error log-spectral amplitude estimator. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1985, 33(2):443-445
[3] Cohen I. Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging. IEEE Transactions on Speech and Audio Processing, 2003, 11(5):466-475
[4] Mohammadiha N, Smaragdis P, Leijon A. Supervised and unsupervised speech enhancement using nonnegative matrix factorization. IEEE Transactions on Audio, Speech, and Language Processing, 2013, 21(10):2140-2151 doi: 10.1109/TASL.2013.2270369
[5] 刘文举, 聂帅, 梁山, 张学良. 基于深度学习语音分离技术的研究现状与进展. 自动化学报, 2016, 42(6):819-833
Liu Wen-Ju, Nie Shuai, Liang Shan, Zhang Xue-Liang. Deep learning based speech separation technology and its developments. Acta Automatica Sinica, 2016, 42(6):819-833
[6] Wang Y X, Wang D L. Towards scaling up classification-based speech separation. IEEE Transactions on Audio, Speech, and Language Processing, 2013, 21(7):1381-1390 doi: 10.1109/TASL.2013.2250961
[7] Wang Y X, Narayanan A, Wang D L. On training targets for supervised speech separation. IEEE Transactions on Audio, Speech, and Language Processing, 2014, 22(12):1849-1858 doi: 10.1109/TASLP.2014.2352935
[8] Xu Y, Du J, Dai L R, Lee C H. An experimental study on speech enhancement based on deep neural networks. IEEE Signal Processing Letters, 2014, 21(1):65-68 doi: 10.1109/LSP.2013.2291240
[9] Xu Y, Du J, Dai L R, Lee C H. A regression approach to speech enhancement based on deep neural networks. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2015, 23(1):7-19
[10] Williamson D S, Wang Y X, Wang D L. Complex ratio masking for monaural speech separation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2016, 24(3):483-492 doi: 10.1109/TASLP.2015.2512042
[11] Xu Y, Du J, Huang Z, Dai L R, Lee C H. Multi-objective learning and mask-based post-processing for deep neural network based speech enhancement. In: Proceedings of the 16th Annual Conference of the International Speech Communication Association. Dresden, Germany: ISCA, 2015. 1508-1512
[12] Wang Y X, Chen J T, Wang D L. Deep Neural Network Based Supervised Speech Segregation Generalizes to Novel Noises Through Large-scale Training, Technical Report OSU-CISRC-3/15-TR02, Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio, USA, 2015
[13] Chen J T, Wang Y X, Yoho S E, Wang D L, Healy E W. Large-scale training to increase speech intelligibility for hearing-impaired listeners in novel noises. The Journal of the Acoustical Society of America, 2016, 139(5):2604-2612 doi: 10.1121/1.4948445
[14] Chen J T, Wang Y X, Wang D L. Noise perturbation for supervised speech separation. Speech Communication, 2016, 78:1-10
[15] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks. In: Proceedings of the International Conference on Neural Information Processing Systems. Nevada, USA: Curran Associates Inc., 2012. 1097-1105
[16] Abdel-Hamid O, Mohamed A, Jiang H, Penn G. Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition. In: Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing. Kyoto, Japan: IEEE, 2012. 4277-4280
[17] Abdel-Hamid O, Deng L, Yu D. Exploring convolutional neural network structures and optimization techniques for speech recognition. In: Proceedings of the 14th Annual Conference of the International Speech Communication Association. Lyon, France: ISCA, 2013. 3366-3370
[18] Sainath T N, Kingsbury B, Saon G, Soltau H, Mohamed A R, Dahl G, Ramabhadran B. Deep convolutional neural networks for large-scale speech tasks. Neural Networks, 2015, 64:39-48
[19] Qian Y M, Bi M X, Tan T, Yu K. Very deep convolutional neural networks for noise robust speech recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2016, 24(12):2263-2276
[20] Bi M X, Qian Y M, Yu K. Very deep convolutional neural networks for LVCSR. In: Proceedings of the 16th Annual Conference of the International Speech Communication Association. Dresden, Germany: ISCA, 2015. 3259-3263
[21] Qian Y M, Woodland P C. Very deep convolutional neural networks for robust speech recognition. In: Proceedings of the 2016 IEEE Spoken Language Technology Workshop. San Diego, USA: IEEE, 2016. 481-488
[22] Sercu T, Puhrsch C, Kingsbury B, LeCun Y. Very deep multilingual convolutional neural networks for LVCSR. In: Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing. Shanghai, China: IEEE, 2016. 4955-4959
[23] Sercu T, Goel V. Advances in very deep convolutional neural networks for LVCSR. In: Proceedings of the 17th Annual Conference of the International Speech Communication Association. San Francisco, USA: ISCA, 2016. 3429-3433
[24] Park S R, Lee J. A fully convolutional neural network for speech enhancement. arXiv:1609.07132, 2016.
[25] Fu S W, Tsao Y, Lu X. SNR-aware convolutional neural network modeling for speech enhancement. In: Proceedings of the 17th Annual Conference of the International Speech Communication Association. San Francisco, USA: ISCA, 2016. 8-12
[26] Garofolo J S, Lamel L F, Fisher W M, Fiscus J G, Pallett D S, Dahlgren N L, Zue V. TIMIT acoustic-phonetic continuous speech corpus. Linguistic Data Consortium, Philadelphia, 1993.
[27] Hu G N. 100 nonspeech sounds [Online], available: http://web.cse.ohio-state.edu/pnl/corpus/HuNonspeech/HuCorpus.html, April 20, 2004
[28] Varga A, Steeneken H J M. Assessment for automatic speech recognition: II. NOISEX-92: a database and an experiment to study the effect of additive noise on speech recognition systems. Speech Communication, 1993, 12(3):247-251 doi: 10.1016/0167-6393(93)90095-3
[29] Rix A W, Beerends J G, Hollier M P, Hekstra A P. Perceptual evaluation of speech quality (PESQ) - a new method for speech quality assessment of telephone networks and codecs. In: Proceedings of the 2001 IEEE International Conference on Acoustics, Speech and Signal Processing. Utah, USA: IEEE, 2001. 749-752
[30] Taal C H, Hendriks R C, Heusdens R, Jensen J. An algorithm for intelligibility prediction of time-frequency weighted noisy speech. IEEE Transactions on Audio, Speech, and Language Processing, 2011, 19(7):2125-2136 doi: 10.1109/TASL.2011.2114881
[31] Yu D, Eversole A, Seltzer M L, Yao K S, Huang Z H, Guenter B, Kuchaiev O, Zhang Y, Seide F, Wang H M, Droppo J, Zweig G, Rossbach C, Currey J, Gao J, May A, Peng B L, Stolcke A, Slaney M. An Introduction to Computational Networks and the Computational Network Toolkit, Technical Report, Microsoft Research, 2014.