A Survey of Multimodal Dimensional Emotion Prediction

李霞; 卢官明; 闫静杰; 张正言

doi:10.16383/j.aas.2018.c170644

[1]

刘烨, 付秋芳, 傅小兰.认知与情绪的交互作用.科学通报, 2009, 54(18):2783-2796 http://d.old.wanfangdata.com.cn/Periodical/xlkx200603052

Liu Ye, Fu Qiu-Fang, Fu Xiao-Lan. The interaction between cognition and emotion. Chinese Science Bulletin, 2009, 54(22):4102-4116 http://d.old.wanfangdata.com.cn/Periodical/xlkx200603052

[2]

D'Mello S K, Kory J. A review and meta-analysis of multimodal affect detection systems. ACM Computing Surveys, 2015, 47(3):Article No. 43 http://dl.acm.org/citation.cfm?id=2682899

[3]

Zeng Z H, Pantic M, Roisman G I, Huang T S. A survey of affect recognition methods:audio, visual, and spontaneous expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009, 31(1):39-58 doi: 10.1109/TPAMI.2008.52

[4]

Poria S, Cambria E, Bajpai R, Hussain A. A review of affective computing:from unimodal analysis to multimodal fusion. Information Fusion, 2017, 37:98-125 doi: 10.1016/j.inffus.2017.02.003

[5]

乐国安, 董颖红.情绪的基本结构:争论、应用及其前瞻.南开学报(哲学社会科学版), 2013, (1):140-150 http://d.old.wanfangdata.com.cn/Periodical/yejy-jykx201305009

Yue Guo-An, Dong Ying-Hong. On the categorical and dimensional approaches of the theories of the basic structure of emotions. Nankai Journal (Literature and Social Science Edition), 2013, (1):140-150 http://d.old.wanfangdata.com.cn/Periodical/yejy-jykx201305009

[6]

Arifin S, Cheung P Y K. Affective level video segmentation by utilizing the pleasure-arousal-dominance information. IEEE Transactions on Multimedia, 2008, 10(7):1325-1341 doi: 10.1109/TMM.2008.2004911

[7]

Cowie R, Douglas-Cowie E, Savvidou S, McMahon E, Sawey M, Schröder M. "FEELTRACE": an instrument for recording perceived emotion in real time. In: Proceedings of the 2000 ISCA Tutorial and Research Workshop on Speech and Emotion. Northern Ireland: ISCA, 2000. 19-24

[8]

韩文静, 李海峰, 阮华斌, 马琳.语音情感识别研究进展综述.软件学报, 2014, 25(1):37-50 http://d.old.wanfangdata.com.cn/Periodical/rjxb201401004

Han Wen-Jing, Li Hai-Feng, Ruan Hua-Bin, Ma Lin. Review on speech emotion recognition. Journal of Software, 2014, 25(1):37-50 http://d.old.wanfangdata.com.cn/Periodical/rjxb201401004

[9]

Gunes H, Schuller B. Categorical and dimensional affect analysis in continuous input:current trends and future directions. Image and Vision Computing, 2013, 31(2):120-136 doi: 10.1016/j.imavis.2012.06.016

[10]

Fontaine J R J, Scherer K R, Roesch E B, Ellsworth P C. The world of emotions is not two-dimensional. Psychological Science, 2007, 18(12):1050-1057 doi: 10.1111/j.1467-9280.2007.02024.x

[11]

邹吉林, 张小聪, 张环, 于靓, 周仁来.超越效价和唤醒-情绪的动机维度模型述评.心理科学进展, 2011, 19(9):1339-1346 http://d.old.wanfangdata.com.cn/Conference/8110806

Zou Ji-Lin, Zhang Xiao-Cong, Zhang Huan, Yu Liang, Zhou Ren-Lai. Beyond dichotomy of valence and arousal:review of the motivational dimensional model of affect. Advances in Psychological Science, 2011, 19(9):1339-1346 http://d.old.wanfangdata.com.cn/Conference/8110806

[12]

Morris J D. Observations:SAM:the self-assessment manikin-an efficient cross-cultural measurement of emotional response. Journal of Advertising Research, 1995, 35:63-68 http://d.old.wanfangdata.com.cn/Periodical/kjkxxb201702001

[13]

Koelstra S, Muhl C, Soleymani M, Lee J S, Yazdani A, Ebrahimi T, et al. DEAP:a database for emotion analysis using physiological signals. IEEE Transactions on Affective Computing, 2012, 3(1):18-31 doi: 10.1109/T-AFFC.2011.15

[14]

Busso C, Bulut M, Lee C C, Kazemzadeh A, Mower E, Kim S, et al. IEMOCAP:interactive emotional dyadic motion capture database. Language Resources and Evaluation, 2008, 42(4):335-359 doi: 10.1007/s10579-008-9076-6

[15]

Ringeval F, Sonderegger A, Sauer J, Lalanne D. Introducing the RECOLA multimodal corpus of remote collaborative and affective interactions. In: Proceedings of the 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition. Shanghai, China: IEEE, 2013. 1-8 http://www.mendeley.com/catalog/introducing-recola-multimodal-corpus-remote-collaborative-affective-interactions/

[16]

Schuller B, Vlasenko B, Eyben F, Rigoll G, Wendemuth A. Acoustic emotion recognition: a benchmark comparison of performances. In: Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition and Understanding. Merano, Italy: IEEE, 2009. 552-557 http://www.mendeley.com/catalog/acoustic-emotion-recognition-benchmark-comparison-performances/

[17]

Tarasov A, Delany S J. Benchmarking classification models for emotion recognition in natural speech: a multi-corporal study. In: Proceedings of the 2011 IEEE International Conference on Automatic Face and Gesture Recognition and Workshops. Santa Barbara, CA, USA: IEEE, 2011. 841-846 http://www.mendeley.com/catalog/benchmarking-classification-models-emotion-recognition-natural-speech-multicorporal-study/

[18]

Wöllmer M, Schuller B, Eyben F, Rigoll G. Combining long short-term memory and dynamic Bayesian networks for incremental emotion-sensitive artificial listening. IEEE Journal of Selected Topics in Signal Processing, 2010, 4(5):867-881 doi: 10.1109/JSTSP.2010.2057200

[19]

Espinosa H P, García C A R, Pineda L V. Features selection for primitives estimation on emotional speech. In: Proceedings of the 2010 IEEE International Conference on Acoustics Speech and Signal Processing. Dallas, TX, USA: IEEE, 2010. 5138-5141 http://www.mendeley.com/research/features-selection-primitives-estimation-emotional-speech/

[20]

Yin Z, Zhao M Y, Wang Y X, Yang J D, Zhang J H. Recognition of emotions using multimodal physiological signals and an ensemble deep learning model. Computer Methods and Programs in Biomedicine, 2017, 140:93-110 doi: 10.1016/j.cmpb.2016.12.005

[21]

Nicolaou M A, Gunes H, Pantic M. Continuous prediction of spontaneous affect from multiple cues and modalities in valence-arousal space. IEEE Transactions on Affective Computing, 2011, 2(2):92-105 doi: 10.1109/T-AFFC.2011.9

[22]

Ringeval F, Schuller B, Valstar M, Jaiswal S, Marchi E, Lalanne D, et al. AV+EC 2015: the first affect recognition challenge bridging across audio, video, and physiological data. In: Proceedings of the 5th International Workshop on Audio/Visual Emotion Challenge. Brisbane, Australia: ACM, 2015. 3-8 doi: 10.1145/2808196.2811642

[23]

Kächele M, Schels M, Thiam P, Schwenker F. Fusion mappings for multimodal affect recognition. In: Proceedings of the 2015 IEEE Symposium Series on Computational Intelligence. Cape Town, South Africa: IEEE, 2015. 307-313 http://www.deepdyve.com/lp/institute-of-electrical-and-electronics-engineers/fusion-mappings-for-multimodal-affect-recognition-oJNU0tN0pn

[24]

孙晓, 潘汀, 任福继.基于ROI-KNN卷积神经网络的面部表情识别.自动化学报, 2016, 42(6):883-891 http://www.aas.net.cn/CN/abstract/abstract18879.shtml

Sun Xiao, Pan Ting, Ren Fu-Ji. Facial expression recognition using ROI-KNN deep convolutional neural networks. Acta Automatica Sinica, 2016, 42(6):883-891 http://www.aas.net.cn/CN/abstract/abstract18879.shtml

[25]

徐峰, 张军平.人脸微表情识别综述.自动化学报, 2017, 43(3):333-348 http://www.aas.net.cn/CN/abstract/abstract19013.shtml

Xu Feng, Zhang Jun-Ping. Facial microexpression recognition:a survey. Acta Automatica Sinica, 2017, 43(3):333-348 http://www.aas.net.cn/CN/abstract/abstract19013.shtml

[26]

Ekman P. Universal facial expressions of emotion. California Mental Health Research Digest, 1970, 8(4):151-158 http://d.old.wanfangdata.com.cn/OAPaper/oai_pubmedcentral.nih.gov_3358835

[27]

Kleinsmith A, Bianchi-Berthouze N. Affective body expression perception and recognition:a survey. IEEE Transactions on Affective Computing, 2013, 4(1):15-33 doi: 10.1109/T-AFFC.2012.16

[28]

Gunes H, Pantic M. Dimensional emotion prediction from spontaneous head gestures for interaction with sensitive artificial listeners. In: Proceedings of the 10th International Conference on Intelligent Virtual Agents. Berlin, Heidelberg, Germany: Springer-Verlag, 2010. 371-377 doi: 10.1007/978-3-642-15892-6_39

[29]

Metallinou A, Yang Z J, Lee C C, Busso C, Carnicke S, Narayanan S. The USC CreativeIT database of multimodal dyadic interactions:from speech and full body motion capture to continuous emotional annotations. Language Resources and Evaluation, 2016, 50(3):497-521 doi: 10.1007/s10579-015-9300-0

[30]

王科, 夏睿.情感词典自动构建方法综述.自动化学报, 2016, 42(4):495-511 http://www.aas.net.cn/CN/abstract/abstract18838.shtml

Wang Ke, Xia Rui. A survey on automatical construction methods of sentiment lexicons. Acta Automatica Sinica, 2016, 42(4):495-511 http://www.aas.net.cn/CN/abstract/abstract18838.shtml

[31]

Wöllmer M, Kaiser M, Eyben F, Schuller B, Rigoll G. LSTM-Modeling of continuous emotions in an audiovisual affect recognition framework. Image and Vision Computing, 2013, 31(2):153-163 http://dl.acm.org/citation.cfm?id=2438270

[32]

Eyben F, Wöllmer M, Valstar M F, Gunes H, Schuller B, Pantic M. String-based audiovisual fusion of behavioural events for the assessment of dimensional affect. In: Proceedings of the 2011 IEEE International Conference on Automatic Face and Gesture Recognition and Workshops. Santa Barbara, CA, USA: IEEE, 2011. 322-329 http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.364.3773

[33]

彭聃龄.普通心理学.北京:北京师范大学出版社, 2001.

Peng Dan-Ling. General Psychology. Beijing:Beijing Normal University Press, 2001.

[34]

Calvo R A, D'Mello S. Affect detection:an interdisciplinary review of models, methods, and their applications. IEEE Transactions on Affective Computing, 2010, 1(1):18-37 doi: 10.1109/T-AFFC.2010.1

[35]

McKeown G, Valstar M, Cowie R, Pantic M, Schröder M. The SEMAINE database:annotated multimodal records of emotionally colored conversations between a person and a limited agent. IEEE Transactions on Affective Computing, 2012, 3(1):5-17 doi: 10.1109/T-AFFC.2011.20

[36]

Grimm M, Kroschel K, Narayanan S. The Vera am Mittag German audio-visual emotional speech database. In: Proceedings of the 2008 IEEE International Conference on Multimedia and Expo. Hannover, Germany: IEEE, 2008. 865-868 http://www.mendeley.com/catalog/vera-mittag-german-audiovisual-emotional-speech-database/

[37]

Lades M, Vorbruggen J C, Buhmann J, Lang J, von der Malsburg C, Wurtz R P, et al. Distortion invariant object recognition in the dynamic link architecture. IEEE Transactions on Computers, 1993, 42(3):300-311 doi: 10.1109/12.210173

[38]

Ahonen T, Hadid A, Pietikäinen M. Face description with local binary patterns:application to face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(12):2037-2041 doi: 10.1109/TPAMI.2006.244

[39]

Dalal N, Triggs B. Histograms of oriented gradients for human detection. In: Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Diego, USA: IEEE, 2005. 886-893 http://www.mendeley.com/catalog/histogram-oriented-gradients-human-detection/

[40]

Viola P, Jones M. Rapid object detection using a boosted cascade of simple features. In: Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Kauai, USA: IEEE, 2001. I-511-I-518

[41]

Zhao G Y, Pietikäinen M. Dynamic texture recognition using local binary patterns with an application to facial expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(6):915-928 doi: 10.1109/TPAMI.2007.1110

[42]

Jiang B H, Valstar M, Martinez B, Pantic M. A dynamic appearance descriptor approach to facial actions temporal modeling. IEEE Transactions on Cybernetics, 2014, 44(2):161-174 doi: 10.1109/TCYB.2013.2249063

[43]

Almaev T R, Valstar M F. Local Gabor binary patterns from three orthogonal planes for automatic facial expression recognition. In: Proceedings of the 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction. Geneva, Switzerland: IEEE, 2013. 356-361 http://www.mendeley.com/research/local-gabor-binary-patterns-three-orthogonal-planes-automatic-facial-expression-recognition/

[44]

Yang P, Liu Q, Metaxas D N. Boosting coded dynamic features for facial action units and facial expression recognition. In: Proceedings of the 2007 IEEE Conference on Computer Vision and Pattern Recognition. Minneapolis, USA: IEEE, 2007. 1-6 http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.581.9566

[45]

Schuller B. Recognizing affect from linguistic information in 3D continuous space. IEEE Transactions on Affective Computing, 2011, 2(4):192-205 doi: 10.1109/T-AFFC.2011.17

[46]

Jenke R, Peer A, Buss M. Feature extraction and selection for emotion recognition from EEG. IEEE Transactions on Affective Computing, 2014, 5(3):327-339 doi: 10.1109/TAFFC.2014.2339834

[47]

Valstar M, Gratch J, Schuller B, Ringeval F, Lalanne D, Torres M T, et al. AVEC 2016: depression, mood, and emotion recognition workshop and challenge. In: Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge. Amsterdam, The Netherlands: ACM, 2016. 3-10 http://www.deepdyve.com/lp/association-for-computing-machinery/avec-2016-depression-mood-and-emotion-recognition-workshop-and-bJsOVdX1yf

[48]

Sayedelahl A, Araujo R, Kamel M S. Audio-visual feature-decision level fusion for spontaneous emotion estimation in speech conversations. In: Proceedings of the 2013 IEEE International Conference on Multimedia and Expo Workshops. San Jose, CA, USA: IEEE, 2013. 1-6 http://www.deepdyve.com/lp/institute-of-electrical-and-electronics-engineers/audio-visual-feature-decision-level-fusion-for-spontaneous-emotion-qFjpwOTK0Y

[49]

Wöllmer M, Eyben F, Reiter S, Schuller B, Cox C, Douglas-Cowie E, et al. Abandoning emotion classes-towards continuous emotion recognition with modelling of long-range dependencies. In: Proceedings of the 2008 Interspeech. Brisbane, Australia: DBLP, 2008. 597-600

[50]

Karg M, Kuhnlenz K, Buss M. Recognition of affect based on gait patterns. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 2010, 40(4):1050-1061 doi: 10.1109/TSMCB.2010.2044040

[51]

Eyben F, Wöllmer M, Graves A, Schuller B, Douglas-Cowie E, Cowie R. On-line emotion recognition in a 3-D activation-valence-time continuum using acoustic and linguistic cues. Journal on Multimodal User Interfaces, 2010, 3(1-2):7-19 doi: 10.1007/s12193-009-0032-6

[52]

Wöllmer M, Metallinou A, Eyben F, Schuller B, Narayanan S. Context-sensitive multimodal emotion recognition from speech and facial expression using bidirectional LSTM modeling. In: Proceedings of the 11th Annual Conference of the International Speech Communication Association. Makuhari, Chiba, Japan: DBLP, 2010. 2362-2365

[53]

Metallinou A, Katsamanis A, Wang Y, Narayanan S. Tracking changes in continuous emotion states using body language and prosodic cues. In: Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing. Prague, Czech Republic: IEEE, 2011. 2288-2291 http://www.mendeley.com/catalog/tracking-changes-continuous-emotion-states-using-body-language-prosodic-cues/

[54]

Lin J C, Wu C H, Wei W L. Error weighted semi-coupled hidden Markov model for audio-visual emotion recognition. IEEE Transactions on Multimedia, 2012, 14(1):142-156 doi: 10.1109/TMM.2011.2171334

[55]

Soladié C, Salam H, Pelachaud C, Stoiber N, Séguier R. A multimodal fuzzy inference system using a continuous facial expression representation for emotion detection. In: Proceedings of the 14th ACM International Conference on Multimodal Interaction. Santa Monica, California, USA: ACM, 2012. 493-500 http://www.mendeley.com/research/multimodal-fuzzy-inference-system-using-continuous-facial-expression-representation-emotion-detectio/

[56]

Nicolle J, Rapp V, Bailly K, Prevost L, Chetouani M. Robust continuous prediction of human emotions using multiscale dynamic cues. In: Proceedings of the 14th ACM International Conference on Multimodal Interaction. Santa Monica, California, USA: ACM, 2012. 501-508 http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.433.2139

[57]

Nicolaou M A, Gunes H, Pantic M. Output-associative RVM regression for dimensional and continuous emotion prediction. In: Proceedings of the 2012 IEEE International Conference on Automatic Face and Gesture Recognition and Workshops. Santa Barbara, CA, USA: IEEE, 2012. 16-23 http://www.mendeley.com/catalog/outputassociative-rvm-regression-dimensional-continuous-emotion-prediction/

[58]

Song Y, Morency L P, Davis R. Learning a sparse codebook of facial and body microexpressions for emotion recognition. In: Proceedings of the 15th ACM on International Conference on Multimodal Interaction. Sydney, Australia: ACM, 2013. 237-244 http://www.mendeley.com/research/learning-sparse-codebook-facial-body-microexpressions-emotion-recognition/

[59]

Nicolaou M A, Zafeiriou S, Pantic M. Correlated-spaces regression for learning continuous emotion dimensions. In: Proceedings of the 21st ACM International Conference on Multimedia. Barcelona, Spain: ACM, 2013. 773-776 http://www.mendeley.com/research/correlatedspaces-regression-learning-continuous-emotion-dimensions/

[60]

Gaus Y F A, Meng H Y, Jan A, Zhang F, Turabzadeh S. Automatic affective dimension recognition from naturalistic facial expressions based on wavelet filtering and PLS regression. In: Proceedings of the 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition. Ljubljana, Slovenia: IEEE, 2015. 1-6 http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7284859

[61]

Huang Z, Dang T, Cummins N, Stasak B, Le P, Sethu V, et al. An investigation of annotation delay compensation and output-associative fusion for multimodal continuous emotion prediction. In: Proceedings of the 5th International Workshop on Audio/Visual Emotion Challenge. Brisbane, Australia: ACM, 2015. 41-48 doi: 10.1145/2808196.2811640

[62]

He L, Jiang D M, Yang L, Pei E C, Wu P, Sahli H. Multimodal affective dimension prediction using deep bidirectional long short-term memory recurrent neural networks. In: Proceedings of the 5th International Workshop on Audio/Visual Emotion Challenge. Brisbane, Australia: ACM, 2015. 73-80

[63]

Chen S Z, Jin Q. Multi-modal dimensional emotion recognition using recurrent neural network. In: Proceedings of the 5th International Workshop on Audio/Visual Emotion Challenge. Brisbane, Australia: ACM, 2015. 49-56 http://dl.acm.org/citation.cfm?id=2811638

[64]

Li X X, Xianyu H, Tian J S, Chen W X, Meng F H, Xu M X, et al. A deep bidirectional long short-term memory based multi-scale approach for music dynamic emotion prediction. In: Proceedings of the 2016 IEEE International Conference on Acoustics, Speech, and Signal Processing. Shanghai, China: IEEE, 2016. 544-548 http://ieeexplore.ieee.org/document/7471734/

[65]

Zhang Z X, Ringeval F, Han J, Deng J, Marchi E, Schuller B. Facing realism in spontaneous emotion recognition from speech: feature enhancement by autoencoder with LSTM neural networks. In: Proceedings of the 2016 Conference of the International Speech Communication Association. San Francisco, USA: ISCA, 2016. 3593-3597

[66]

Pei E C, Xia X H, Yang L, Jiang D M, Sahli H. Deep neural network and switching Kalman filter based continuous affect recognition. In: Proceedings of the 2016 IEEE International Conference on Multimedia and Expo Workshops. Seattle, WA, USA: IEEE, 2016. 1-6 http://www.deepdyve.com/lp/institute-of-electrical-and-electronics-engineers/deep-neural-network-and-switching-kalman-filter-based-continuous-Gi0nak0reF

[67]

Brady K, Gwon Y, Khorrami P, Godoy E, Campbell W, Dagli C, et al. Multi-modal audio, video and physiological sensor learning for continuous emotion prediction. In: Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge. Amsterdam, The Netherlands: ACM, 2016. 97-104 http://experts.illinois.edu/en/publications/multi-modal-audio-video-and-physiological-sensor-learning-for-con

[68]

Trigeorgis G, Ringeval F, Brueckner R, Marchi E, Nicolaou M A, Schuller B, et al. Adieu features? End-to-end speech emotion recognition using a deep convolutional recurrent network. In: Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing. Shanghai, China: IEEE, 2016. 5200-5204 http://ieeexplore.ieee.org/document/7472669/

[69]

Chao L L, Tao J H, Yang M H, Li Y, Wen Z Q. Long short term memory recurrent neural network based multimodal dimensional emotion recognition. In: Proceedings of the 5th International Workshop on Audio/Visual Emotion Challenge. Brisbane, Australia: ACM, 2015. 65-72 http://www.deepdyve.com/lp/association-for-computing-machinery/long-short-term-memory-recurrent-neural-network-based-multimodal-PD8TcEBEm5

[70]

Sariyanidi E, Gunes H, Cavallaro A. Automatic analysis of facial affect:a survey of registration, representation, and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(6):1113-1133 doi: 10.1109/TPAMI.2014.2366127

[71]

LeCun Y, Bengio Y, Hinton G. Deep learning. Nature, 2015, 521(7553):436-444 doi: 10.1038/nature14539

[72]

尹宝才, 王文通, 王立春.深度学习研究综述.北京工业大学学报, 2015, 41(1):48-59 http://d.old.wanfangdata.com.cn/Periodical/jsjyyyj201208002

Yin Bao-Cai, Wang Wen-Tong, Wang Li-Chun. Review of deep learning. Journal of Beijing University of Technology, 2015, 41(1):48-59 http://d.old.wanfangdata.com.cn/Periodical/jsjyyyj201208002

[73]

Zheng W Q, Yu J S, Zou Y X. An experimental study of speech emotion recognition based on deep convolutional neural networks. In: Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction. Xi'an, China: IEEE, 2015. 827-831 http://ieeexplore.ieee.org/document/7344669/

[74]

Poria S, Chaturvedi I, Cambria E, Hussain A. Convolutional MKL based multimodal emotion recognition and sentiment analysis. In: Proceedings of the 16th IEEE International Conference on Data Mining. Barcelona, Spain: IEEE, 2016. 439-448 http://ieeexplore.ieee.org/document/7837868

[75]

Weninger F, Ringeval F, Marchi E, Schuller B. Discriminatively trained recurrent neural networks for continuous dimensional emotion recognition from audio. In: Proceedings of the 25th International Joint Conference on Artificial Intelligence. New York, USA: AAAI Press, 2016. 2196-2202 http://www.sewaproject.eu/files/338a1be0-486e-43ae-c09f-4e88236e62df.pdf

[76]

Banda N, Engelbrecht A, Robinson P. Continuous emotion recognition using a particle swarm optimized NARX neural network. In: Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction. Xi'an, China: IEEE, 2015. 380-386 http://ieeexplore.ieee.org/document/7344599/

[77]

Glodek M, Tschechne S, Layher G, Schels M, Brosch T, Scherer S, et al. Multiple classifier systems for the classification of audio-visual emotional states. In: Proceedings of the 2011 International Conference on Affective Computing and Intelligent Interaction. Berlin, Heidelberg, Germany: Springer-Verlag, 2011. 359-368 http://www.springerlink.com/content/77345u71p6x76pg3/

[78]

Schuller B, Valstar M, Cowie R, Pantic M. AVEC 2012: the continuous audio/visual emotion challenge-an introduction. In: Proceedings of the 14th ACM International Conference on Multimodal Interaction. Santa Monica, California, USA: ACM, 2012. 361-362 http://www.mendeley.com/research/avec-2012-continuous-audiovisual-emotion-challenge/

[79]

Valstar M, Schuller B, Smith K, Almaev T, Eyben F, Krajewski J, et al. AVEC 2014: 3D dimensional affect and depression recognition challenge. In: Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge. Orlando, Florida, USA: ACM, 2014. 3-10 http://dl.acm.org/citation.cfm?id=2661807

[80]

Wu C H, Lin J C, Wei W L. Two-level hierarchical alignment for semi-coupled HMM-based audiovisual emotion recognition with temporal course. IEEE Transactions on Multimedia, 2013, 15(8):1880-1895 doi: 10.1109/TMM.2013.2269314

[81]

Mariooryad S, Busso C. Correcting time-continuous emotional labels by modeling the reaction lag of evaluators. IEEE Transactions on Affective Computing, 2015, 6(2):97-108 doi: 10.1109/TAFFC.2014.2334294

[82]

Mariooryad S, Busso C. Analysis and compensation of the reaction lag of evaluators in continuous emotional annotations. In: Proceedings of the 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction. Geneva, Switzerland: IEEE, 2013. 85-90 http://dl.acm.org/citation.cfm?id=2544966

[83]

Schuller B, Valstar M, Eyben F, McKeown G, Cowie R, Pantic M. AVEC 2011-the first international audio/visual emotion challenge. In: Proceedings of the 2011 International Conference on Affective Computing and Intelligent Interaction. Berlin, Germany: Springer-Verlag, 2011. 415-424