Automatic Text Corpus Generation Algorithm towards Oral Statistical Language Modeling
Abstract: In low-resource automatic speech recognition (ASR) domains, such as recognition of telephone conversational speech, statistical language models (LMs) suffer from severe data sparseness. This paper proposes a sampled-corpus generation algorithm based on equal-probability events, which automatically generates domain-relevant text to strengthen statistical language model training. Experimental results show that adding the generated corpus alleviates the sparseness of the language model and thereby improves the performance of the whole speech recognition system. On the development set, perplexity is reduced by 7.5% relative and the character error rate (CER) by 0.2% absolute; on the test set, perplexity is reduced by 6% relative and CER by 0.4% absolute.
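The abstract reports perplexity as a relative reduction and CER as an absolute reduction, and the two are easy to confuse. The following is a minimal sketch of how these quantities are conventionally computed. It is not taken from the paper: the function names and all numeric inputs are hypothetical and chosen only for illustration.

```python
import math


def perplexity(log_probs):
    """Perplexity from per-token natural-log probabilities.

    PPL = exp(-(1/N) * sum(log p_i)); lower is better.
    """
    return math.exp(-sum(log_probs) / len(log_probs))


def relative_reduction(baseline, improved):
    """Fraction of the baseline value that was removed."""
    return (baseline - improved) / baseline


def absolute_reduction(baseline, improved):
    """Plain difference, in the same unit as the inputs (e.g. CER points)."""
    return baseline - improved


# Hypothetical values, NOT results from the paper.
baseline_ppl, improved_ppl = 200.0, 185.0
baseline_cer, improved_cer = 30.0, 29.8  # in percent

print(f"relative PPL reduction: {relative_reduction(baseline_ppl, improved_ppl):.1%}")
print(f"absolute CER reduction: {absolute_reduction(baseline_cer, improved_cer):.1f} points")
```

Under this convention, a 7.5% relative reduction means the new perplexity is 92.5% of the baseline, whereas a 0.2% absolute reduction means the CER itself dropped by 0.2 percentage points.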