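Sound Source Localization Algorithm Based on Cepstral BRIR Binaural Cross-correlation in Reverberant Environment

ZHANG Yi^1, YAN Bo^2 (corresponding author), WANG Ke-Jia^2

doi: 10.16383/j.aas.2016.c150828

1. School of Advanced Manufacturing Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065
2. School of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065

Funding:

Chongqing Science and Technology Commission Project cstc2015jcyjBX0066

Author bios:
ZHANG Yi. Professor at the School of Advanced Manufacturing Engineering, Chongqing University of Posts and Telecommunications. Research interests: robots and their applications, speech signal processing, and sound source localization. E-mail: zhangyi@cqupt.edu.cn

WANG Ke-Jia. Master's student at the School of Automation, Chongqing University of Posts and Telecommunications. Research interests: speech signal processing, speech recognition, and voiceprint recognition. E-mail: qw.123woaini@foxmail.com

Corresponding author:
YAN Bo. Master's student at the School of Automation, Chongqing University of Posts and Telecommunications. Research interests: speech signal processing and sound source localization. Corresponding author of this paper. E-mail: yanbo19921102@sina.com

Publication history
- Received: 2015-12-09
- Accepted: 2016-05-17
- Published: 2016-10-01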
Abstract: Reverberation in real enclosed environments degrades sound source localization performance. To address this problem, a binaural cross-correlation sound source localization method based on the cepstral binaural room impulse response (BRIR) is proposed. The method subtracts the reverberation component from the cepstral BRIR and inverse-transforms the result to the time domain to obtain an estimated impulse response. This estimate is then cross-correlated with the head-related impulse responses (HRIRs) in a database, and the position corresponding to the maximum cross-correlation value is taken as the estimated source position. Simulation results show that the proposed algorithm reduces the localization error caused by reverberation and improves the accuracy of sound source localization.

Keywords:
- Sound source localization
- Binaural cross-correlation
- Cepstrum
- Robustness
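The pipeline described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the reverberant cepstral component is estimated, so `reverb_cepstrum` is taken as a given input; reusing the BRIR's original phase when returning to the time domain is a simplification; and scoring each candidate by the peak of the summed left-ear and right-ear cross-correlations is one plausible reading of "binaural cross-correlation". The names `dereverb_via_cepstrum`, `binaural_score`, `localize`, and `hrir_db` are all hypothetical.

```python
import numpy as np


def dereverb_via_cepstrum(brir, reverb_cepstrum, eps=1e-12):
    """Subtract a reverberant component from the real cepstrum of a BRIR.

    Assumptions: `reverb_cepstrum` comes from some external estimator, and the
    phase of the original BRIR is reused when returning to the time domain.
    """
    spectrum = np.fft.fft(brir)
    log_mag = np.log(np.abs(spectrum) + eps)
    cepstrum = np.fft.ifft(log_mag).real          # real cepstrum of the BRIR
    cleaned = cepstrum - reverb_cepstrum          # remove the reverberant part
    cleaned_log_mag = np.fft.fft(cleaned).real    # back to log-magnitude
    est_spectrum = np.exp(cleaned_log_mag) * np.exp(1j * np.angle(spectrum))
    return np.fft.ifft(est_spectrum).real         # estimated impulse response


def binaural_score(est_left, est_right, hrir_left, hrir_right):
    """Peak of the summed left/right cross-correlation, at most 1.

    Both estimates must share one length, and both HRIRs must share one
    length, so the two correlation sequences can be added lag by lag.
    Using a common lag for both ears keeps the interaural delay meaningful.
    """
    xcorr = (np.correlate(est_left, hrir_left, mode="full")
             + np.correlate(est_right, hrir_right, mode="full"))
    est_energy = np.sum(est_left ** 2) + np.sum(est_right ** 2)
    ref_energy = np.sum(hrir_left ** 2) + np.sum(hrir_right ** 2)
    denom = np.sqrt(est_energy * ref_energy)
    return float(np.max(xcorr) / denom) if denom > 0 else 0.0


def localize(est_left, est_right, hrir_db):
    """Return the database azimuth with the highest cross-correlation score.

    `hrir_db` maps azimuth (degrees) -> (hrir_left, hrir_right).
    """
    return max(
        hrir_db,
        key=lambda az: binaural_score(est_left, est_right, *hrir_db[az]),
    )
```

With a database keyed by azimuth, `localize(est_left, est_right, hrir_db)` returns the key whose HRIR pair best matches the dereverberated estimate. Normalising by the joint energy of both ears keeps the score comparable across database entries with different gains.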

Fig. 1 Localization of the $15^\circ$ azimuth by the three algorithms at RT = 0 s

Fig. 2 Localization of the $15^\circ$ azimuth by the three algorithms at RT = 0.30 s

Fig. 3 Localization of the $15^\circ$ azimuth by the three algorithms at RT = 0.50 s

Fig. 4 Localization of the $15^\circ$ azimuth by the three algorithms at RT = 0.70 s

Fig. 5 Localization of the $15^\circ$ azimuth by the three algorithms at RT = 0.90 s

Fig. 6 RMSE comparison for the $15^\circ$ azimuth at different reverberation times

Fig. 7 Schematic diagram of the experimental environment

Table 1 Sound source azimuth estimates of the three localization methods at different reverberation times

Method	RT (s)	Actual angle (°)	0	10	15	20	30	35
CEP-BRIR-CC	0	Estimated angle (°)	0.08	10.24	15.06	20.23	30.15	35.23
CEP-BRIR-CC	0	Absolute error (°)	0.08	0.24	0.06	0.23	0.15	0.23
CEP-BRIR-CC	0.3	Estimated angle (°)	0.17	9.03	14.82	21.09	30.25	36.39
CEP-BRIR-CC	0.3	Absolute error (°)	0.17	0.97	1.18	1.09	0.25	1.39
CEP-BRIR-CC	0.5	Estimated angle (°)	-0.29	8.79	13.67	18.69	30.69	36.87
CEP-BRIR-CC	0.5	Absolute error (°)	0.29	1.21	1.33	1.31	0.69	1.87
CEP-GCC-ITD	0	Estimated angle (°)	-0.08	10.67	15.92	20.86	30.42	35.37
CEP-GCC-ITD	0	Absolute error (°)	0.08	0.67	0.92	0.86	0.42	0.37
CEP-GCC-ITD	0.3	Estimated angle (°)	0.39	8.11	12.81	17.23	28.85	33.14
CEP-GCC-ITD	0.3	Absolute error (°)	0.39	1.89	2.19	2.77	1.14	1.86
CEP-GCC-ITD	0.5	Estimated angle (°)	-1.69	7.06	11.91	16.14	28.15	32.06
CEP-GCC-ITD	0.5	Absolute error (°)	1.69	2.94	3.09	3.86	1.85	2.94
CEP-CC-ITD	0	Estimated angle (°)	0.07	10.73	15.95	21.46	30.85	35.62
CEP-CC-ITD	0	Absolute error (°)	0.07	0.73	0.95	1.46	0.85	0.62
CEP-CC-ITD	0.3	Estimated angle (°)	0.63	8.68	12.78	23.06	27.62	32.97
CEP-CC-ITD	0.3	Absolute error (°)	0.63	1.32	2.22	3.06	2.38	2.03
CEP-CC-ITD	0.5	Estimated angle (°)	-2.06	6.12	11.66	15.89	26.85	38.77
CEP-CC-ITD	0.5	Absolute error (°)	2.06	3.88	3.34	4.11	3.15	3.77
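As a small worked example of the error measures used here, the snippet below recomputes the absolute errors for the CEP-BRIR-CC row at RT = 0 s and then pools them into a single RMSE. Pooling across the six test angles is an assumption made for illustration only; the RMSE in Fig. 6 is reported for the 15° azimuth and may have been computed over repeated trials instead.

```python
import math

# Actual angles and CEP-BRIR-CC estimates at RT = 0 s, copied from Table 1.
actual = [0, 10, 15, 20, 30, 35]
estimated = [0.08, 10.24, 15.06, 20.23, 30.15, 35.23]

# Absolute error per angle, rounded to the two decimals used in the table.
abs_errors = [round(abs(e - a), 2) for e, a in zip(estimated, actual)]

# Root-mean-square error pooled over the six angles (illustrative choice).
rmse = math.sqrt(
    sum((e - a) ** 2 for e, a in zip(estimated, actual)) / len(actual)
)

print(abs_errors)
print(f"RMSE across angles: {rmse:.3f} deg")
```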

Table 2 Statistical results of the three localization methods

Actual angle	-60°		-15°		0°		30°		45°	
Method	Estimate	Error	Estimate	Error	Estimate	Error	Estimate	Error	Estimate	Error
CEP-BRIR-CC	-54.8°	5.2°	-19.6°	4.6°	-3°	3°	35.2°	5.2°	41.1°	3.9°
CEP-GCC-ITD	-67.6°	7.6°	-22.3°	7.3°	7.5°	7.5°	36.9°	6.9°	52.8°	7.8°
CEP-CC-ITD	-50.9°	9.1°	-23.5°	8.5°	8.8°	8.8°	22.0°	8.0°	54.2°	9.2°
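To compare the three methods in Table 2 at a glance, one option is to average each method's absolute error over the five test angles. The mean absolute error is a summary chosen here for illustration; the source reports only the per-angle values.

```python
# Actual angles and per-method estimates (degrees), copied from Table 2.
actual = [-60, -15, 0, 30, 45]
estimates = {
    "CEP-BRIR-CC": [-54.8, -19.6, -3.0, 35.2, 41.1],
    "CEP-GCC-ITD": [-67.6, -22.3, 7.5, 36.9, 52.8],
    "CEP-CC-ITD": [-50.9, -23.5, 8.8, 22.0, 54.2],
}

# Mean absolute error per method over the five angles.
mean_abs_error = {
    method: sum(abs(e - a) for e, a in zip(est, actual)) / len(actual)
    for method, est in estimates.items()
}

# Methods ordered from smallest to largest mean error.
ranking = sorted(mean_abs_error, key=mean_abs_error.get)

for method in ranking:
    print(f"{method}: {mean_abs_error[method]:.2f} deg")
```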

