Speaker Recognition with Kernel Based IVEC-SVM

LI Zhi-Yi, ZHANG Wei-Qiang, HE Liang, LIU Jia

Citation: LI Zhi-Yi, ZHANG Wei-Qiang, HE Liang, LIU Jia. Speaker Recognition with Kernel Based IVEC-SVM. ACTA AUTOMATICA SINICA, 2014, 40(4): 780-784. doi: 10.3724/SP.J.1004.2014.00780


doi: 10.3724/SP.J.1004.2014.00780 cstr: 32138.14.SP.J.1004.2014.00780
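Funds: Supported by National Natural Science Foundation of China (61005019, 61273268, 90920302, 61370034)

Author biography:

    ZHANG Wei-Qiang  Assistant researcher at the Department of Electronic Engineering, Tsinghua University. Main research interests: speaker recognition and language recognition. E-mail: wqzhang@tsinghua.edu.cn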


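  • Abstract: In speaker recognition, modeling based on the identity vector (IVEC) extracts speaker information effectively and is currently a state-of-the-art approach. This paper studies a speaker recognition system in which the identity vector is followed by a support vector machine (IVEC-SVM), compares the recognition performance of this system under ten different kernel functions, and compares it with the identity vector followed by cosine distance scoring (IVEC-CDS) system from the literature. Experiments on the telephone-telephone core evaluation data set of the 2010 speaker recognition evaluation organized by the U.S. National Institute of Standards and Technology (NIST) show that the kernel-based IVEC-SVM system clearly outperforms the IVEC-CDS system. The IVEC-SVM system with the Spline kernel achieves the best recognition performance: compared with the IVEC-CDS system, its equal error rate (EER) is reduced by 10% before score normalization and by 3% after it.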
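The two quantities named in the abstract, cosine distance scoring and the equal error rate, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy vectors, and the simple threshold sweep used to approximate the EER are assumptions made for illustration only.

```python
import math


def cosine_score(w1, w2):
    """Cosine similarity between two identity vectors (plain lists of floats)."""
    dot = sum(a * b for a, b in zip(w1, w2))
    norm1 = math.sqrt(sum(a * a for a in w1))
    norm2 = math.sqrt(sum(b * b for b in w2))
    return dot / (norm1 * norm2)


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    A trial is accepted when its score is >= the threshold. The function
    returns the average of the false acceptance rate (FAR) and the false
    rejection rate (FRR) at the threshold where the two are closest.
    """
    best_gap, eer = float("inf"), None
    for thr in sorted(set(target_scores) | set(nontarget_scores)):
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


if __name__ == "__main__":
    # Hypothetical 3-dimensional "i-vectors"; real ones have hundreds of dimensions.
    enrolled = [0.2, -0.5, 0.8]
    test_utterance = [0.1, -0.4, 0.9]
    print("cosine score:", cosine_score(enrolled, test_utterance))

    # Hypothetical trial scores, not results from the paper.
    print("EER:", equal_error_rate([0.9, 0.8, 0.4], [0.5, 0.3, 0.1]))
```

In an IVEC-SVM system the cosine score would be replaced by the output of a support vector machine trained with the chosen kernel, while the EER would be measured in the same way on the resulting trial scores.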
Publication history
  • Received:  2012-09-12
  • Revised:  2013-01-18
  • Published:  2014-04-20
