Research on Continuous Speech Recognition Methods for Languages with Scarce Corpus Resources

I·Dawa (伊·达瓦); Yoshinori Sagisaka (匂坂芳典); Satoshi Nakamura (中村哲)

doi:10.3724/SP.J.1004.2010.00550


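1.
National Institute of Information and Communications Technology (NICT), Kyoto 619-0288, Japan
2.
Global Information and Telecommunication Institute (GITI), Waseda University, Tokyo 169-8552, Japan
3.
Advanced Telecommunications Research Institute International (ATR), Kyoto 619-0288, Japan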

Details

Corresponding author:
I·Dawa (伊·达瓦)

Metrics
- Article views: 2421
- HTML full-text views: 82
- PDF downloads: 970
- Citations: 0
Publication history
- Received: 2009-02-06
- Revised: 2009-05-04
- Published: 2010-04-20

Investigation of ASR Systems for Resource-deficient Languages

1.
National Institute of Information and Communications Technology (NICT), Kyoto 619-0288, Japan;
2.
Global Information and Telecommunication Institute (GITI), Waseda University, Tokyo 169-8552, Japan;
3.
Advanced Telecommunications Research Institute International (ATR), Kyoto 619-0288, Japan

More Information

Corresponding author: I·Dawa

Abstract

Abstract: Minority languages in China have characteristics of their own, so the automatic speech recognition (ASR) methods developed for major languages such as Chinese, English, and Japanese cannot be applied to them directly. Taking Mongolian, a resource-deficient language, as an example, this paper discusses how to build the acoustic and language models and implements a Mongolian speech recognition system on ATRASR, the continuous speech recognizer of the Advanced Telecommunications Research Institute International (ATR). The focus is on language modeling: exploiting the agglutinative nature of Mongolian, we propose training a multi-class N-gram language model based on similar word clustering. In experiments, the proposed language model improved recognition accuracy by 5.5% over the conventional word N-gram model.
- Mongolian language /
- agglutinative language /
- similar word clustering /
- continuous speech recognition /
- multi-class N-gram model