An Orthogonal Laplacian Language Recognition Approach
Abstract: An orthogonal Laplacian language recognition approach is proposed. After the i-vector of an utterance is extracted, an orthogonal locality preserving projection maps it from the total signal space into a subspace containing language and channel information. Channel compensation is then applied to the mapped vector, and recognition is performed with a support vector machine. Although the i-vector retains as much of the acoustic information in speech as possible, it does not reveal the intrinsic structure of that information. The orthogonal locality preserving projection removes acoustically irrelevant information and further uncovers the intrinsic structure of the acoustic features, which effectively improves their discriminability. Experiments on the NIST LRE 2003 evaluation corpus show that the new approach reduces the average detection cost by 28.91% compared with the baseline system.
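The projection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the neighbourhood graph, the kernel, or the solver, so the k-nearest-neighbour graph, the heat-kernel weights, the regularizer `reg`, and the function names `knn_heat_weights` and `olpp` are all assumptions made for illustration. Each direction is found by minimizing a locality preserving objective restricted to the orthogonal complement of the directions already found, which keeps the resulting basis orthonormal. Channel compensation and the support vector machine back end are omitted.

```python
import numpy as np


def knn_heat_weights(X, k=5, t=None):
    """Symmetric k-nearest-neighbour affinity matrix with heat-kernel weights.

    X holds one vector per row. k and t are illustrative choices, not values
    taken from the paper.
    """
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    if t is None:
        t = d2.mean()  # assumed default kernel width
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(d2[i])
        nbrs = order[order != i][:k]  # k closest points, excluding i itself
        W[i, nbrs] = np.exp(-d2[i, nbrs] / t)
    return np.maximum(W, W.T)  # make the graph undirected


def olpp(X, n_components, k=5, t=None, reg=1e-6):
    """Return an orthonormal projection matrix of shape (dim, n_components).

    Hypothetical helper: solves min a^T (X^T L X) a subject to
    a^T (X^T D X) a = 1, one direction at a time, with each new direction
    constrained to be orthogonal to the previous ones.
    """
    Xc = X - X.mean(axis=0)
    dim = Xc.shape[1]
    W = knn_heat_weights(Xc, k, t)
    D = np.diag(W.sum(axis=1))
    L = D - W  # graph Laplacian
    XLX = Xc.T @ L @ Xc
    XDX = Xc.T @ D @ Xc + reg * np.eye(dim)  # regularized to stay positive definite
    basis = np.zeros((dim, 0))
    for _ in range(n_components):
        # Orthonormal basis Q of the complement of the directions found so far.
        if basis.shape[1] == 0:
            Q = np.eye(dim)
        else:
            Q_full, _ = np.linalg.qr(basis, mode="complete")
            Q = Q_full[:, basis.shape[1]:]
        # Generalized symmetric eigenproblem inside that complement,
        # reduced to a standard one by Cholesky whitening.
        C = np.linalg.cholesky(Q.T @ XDX @ Q)
        C_inv = np.linalg.inv(C)
        _, vecs = np.linalg.eigh(C_inv @ (Q.T @ XLX @ Q) @ C_inv.T)
        a = Q @ (C_inv.T @ vecs[:, 0])  # eigenvector of the smallest eigenvalue
        basis = np.column_stack([basis, a / np.linalg.norm(a)])
    return basis
```

Given a matrix `ivectors` with one i-vector per row, `A = olpp(ivectors, n_components=3)` returns the projection matrix, and `(ivectors - ivectors.mean(axis=0)) @ A` gives the mapped vectors that would then go on to channel compensation and classification.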