Study on Computational Model of Auditory Selective Attention with Orientation Feature

LV Fei, XIA Xiu-Yu

Citation: LV Fei, XIA Xiu-Yu. Study on Computational Model of Auditory Selective Attention with Orientation Feature. ACTA AUTOMATICA SINICA, 2017, 43(4): 634-644. doi: 10.16383/j.aas.2017.c160277

doi: 10.16383/j.aas.2017.c160277
More Information
    Author Bio:

    LV Fei   Master student at the College of Electronics and Information Engineering, Sichuan University. She received her bachelor degree in communication engineering from Wenzhou University in 2013. Her research interest covers computational models of auditory selective attention. E-mail: lvfei47@163.com

    Corresponding author: XIA Xiu-Yu   Associate professor at the College of Electronics and Information Engineering, Sichuan University. Her research interest covers adaptive acoustic echo cancellation, speech enhancement, speech separation, computational auditory scene analysis, and auditory computational models. Corresponding author of this paper. E-mail: xiaxxy@163.com
Abstract: Classical computational models of auditory attention focus on primary auditory features such as intensity, frequency, and time. These features cannot adequately capture the directionality of auditory attention, so higher-level auditory features are needed to distinguish different sounds. Motivated by auditory perception mechanisms, this paper proposes a bottom-up computational model of auditory selective attention with dual-pathway information processing, based on sound-source orientation features and neural networks. The model first preprocesses the binaural signals and performs spectral analysis. The results are then fed into a "where" pathway and a "what" pathway: the where pathway extracts orientation feature parameters, uses a neural network to obtain local orientation features of the sound sources, and derives an orientation-feature saliency map through local feature aggregation and global optimization. Finally, the dominant orientation extracted from the saliency map drives the what pathway, where time-frequency masking separates the corresponding dominant sound. Simulation results show that, by introducing orientation features as clustering cues, the model uses multi-level neural networks to automatically select the sound objects worth attending to and extracts the dominant sound from complex acoustic environments in real time, providing a good simulation of the orientation classification, attention selection, and attention shifting mechanisms of human hearing.
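To make the processing chain described in the abstract concrete, the sketch below walks the same bottom-up where/what pipeline on a toy binaural mixture. It is only an illustration under simplifying assumptions, not the authors' implementation: an STFT stands in for the auditory filterbank, the multi-level neural-network azimuth classifier and the global optimization step are replaced by an energy-weighted histogram vote over interaural level differences, and the helper names (`tf_analysis`, `where_pathway`, `what_pathway`) are hypothetical.

```python
# Minimal sketch of the bottom-up dual-pathway pipeline from the abstract.
# NOT the authors' implementation: the neural-network azimuth classifier and
# the global optimization are replaced by a simple energy-weighted histogram
# vote, purely for illustration.
import numpy as np
from scipy.signal import stft, istft

FS = 16000        # sampling rate (assumption)
N_FFT = 512       # STFT frame length (assumption)


def tf_analysis(left, right):
    """Preprocessing / frequency analysis of the two ear signals
    (an STFT stands in for a gammatone-style auditory filterbank)."""
    _, _, L = stft(left, FS, nperseg=N_FFT)
    _, _, R = stft(right, FS, nperseg=N_FFT)
    return L, R


def where_pathway(L, R, n_bins=37):
    """'Where' pathway: per time-frequency unit interaural cues ->
    local azimuth estimate -> pooled azimuth saliency profile."""
    eps = 1e-12
    ild = 20.0 * np.log10((np.abs(L) + eps) / (np.abs(R) + eps))  # level cue
    ipd = np.angle(L * np.conj(R))      # phase cue (unused in this toy vote)
    # Crude local azimuth index from the level cue alone, assuming the ILD
    # grows roughly monotonically with azimuth over a +/-20 dB range.
    azimuth_idx = np.clip(((ild + 20.0) / 40.0 * (n_bins - 1)).astype(int),
                          0, n_bins - 1)
    # Pool the local estimates, weighting by local energy ("saliency map").
    energy = np.abs(L) ** 2 + np.abs(R) ** 2
    saliency = np.bincount(azimuth_idx.ravel(), weights=energy.ravel(),
                           minlength=n_bins)
    return azimuth_idx, saliency, ipd


def what_pathway(L, R, azimuth_idx, dominant_bin, width=1):
    """'What' pathway: binary time-frequency mask that keeps only the units
    attributed to the dominant azimuth, followed by resynthesis."""
    mask = (np.abs(azimuth_idx - dominant_bin) <= width).astype(float)
    _, dominant = istft(0.5 * mask * (L + R), FS, nperseg=N_FFT)
    return dominant


if __name__ == "__main__":
    t = np.arange(FS) / FS
    # Toy binaural mixture: a 440 Hz tone lateralized left and a 1 kHz tone
    # lateralized right (pure level differences, no time differences).
    left = 1.0 * np.sin(2 * np.pi * 440 * t) + 0.3 * np.sin(2 * np.pi * 1000 * t)
    right = 0.3 * np.sin(2 * np.pi * 440 * t) + 1.0 * np.sin(2 * np.pi * 1000 * t)

    L, R = tf_analysis(left, right)
    azimuth_idx, saliency, _ = where_pathway(L, R)
    dominant_bin = int(np.argmax(saliency))   # attention selects the peak
    dominant = what_pathway(L, R, azimuth_idx, dominant_bin)
    print("dominant azimuth bin:", dominant_bin, "output samples:", dominant.size)
```

In the paper's model the local orientation features and the saliency map are obtained with multi-level neural networks rather than a histogram vote, and a forgetting factor $\beta$ influences how the dominant sound is extracted over time (cf. Fig. 6); the sketch above only mirrors the overall division of labor between the where and what pathways.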
Figures

    Fig. 1  Neural information processing system of binaural hearing

    Fig. 2  Illustration of the binaural auditory system

    Fig. 3  Schematic diagram of the auditory selective attention computational model

    Fig. 4  Diagram of orientation-feature saliency computation

    Fig. 5  Simulation results of the artificial signal

    Fig. 6  Effect of the forgetting factor $\beta$ on dominant-sound extraction

    Fig. 7  Simulation results of natural acoustic signals

    Fig. 8  Simulation results of the mixed signal in noise (SNR = 10 dB)

    Fig. 9  Simulation results of the mixed signal in a reverberant environment

Publication history
  • Received:  2016-03-18
  • Accepted:  2016-08-15
  • Published:  2017-04-01
