Sparse Gaussian Process With Input Noise for Human Pose Estimation

XIA Jia-Xin^{1,2}, CHEN Xi^{3}, LIN Jin-Xing^{4}, LI Wei-Peng^{3}, WU Qi^{1,2}

doi: 10.16383/j.aas.2018.c170397

1. Department of Automation, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240
2. Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240
3. School of Aeronautics and Astronautics, Shanghai Jiao Tong University, Shanghai 200240
4. College of Automation, Nanjing University of Posts and Telecommunications, Nanjing 210000

Funds:

National Natural Science Foundation of China 51705242

Natural Science Foundation of Jiangsu Province BK20141430

Shanghai Pujiang Program 15PJ1404300

National Natural Science Foundation of China 61473158

Open Project Program of the State Key Laboratory of CAD and CG, Zhejiang University A1713

National Natural Science Foundation of China 61671293

Author Bio:
XIA Jia-Xin  Master student in the Department of Automation, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University. She received her bachelor's degree from Shanghai Jiao Tong University in 2015. Her research interests cover image processing and machine learning. E-mail: jessicax 1993@163.com

CHEN Xi  Lecturer at the School of Aeronautics and Astronautics, Shanghai Jiao Tong University. He received his Ph.D. degree in aerospace engineering from Royal Melbourne Institute of Technology University, Australia in 2014. His research interests cover prognostics and health management, machine learning, and structural health monitoring. E-mail: chenxi1@comac.cc

LIN Jin-Xing  Associate professor at the School of Automation, Nanjing University of Posts and Telecommunications. His research interests cover intelligent modeling and control of complex systems, and switched singular systems. E-mail: jxlin2004@126.com

LI Wei-Peng  Professor at the School of Aeronautics and Astronautics, Shanghai Jiao Tong University. He received his master's degree from Harbin Institute of Technology in 2008 and his Ph.D. degree in aerospace engineering from University of Tokyo, Japan in 2011. His research interests cover data mining for turbulence, drag reduction, and noise control. E-mail: liweipeng@sjtu.edu.cn

Corresponding author:
WU Qi  Associate professor in the Department of Automation, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University. He received his Ph.D. degree in control theory and control engineering from the School of Automation, Southeast University in 2006 (2009 according to the Chinese-language version of this bio). His research interests cover deep multi-layer network modeling and learning algorithms, machine learning, and pattern recognition. Corresponding author of this paper. E-mail: wuqi7812@sjtu.edu.cn

Publication history
- Received: 2017-07-20
- Accepted: 2017-10-30
- Published: 2019-04-20

Abstract: Gaussian process regression (GPR) is a widely used regression method that can handle human pose estimation, a problem in which both input and output are multivariate. Computational complexity is a major concern for GPR, and sparse approximations are the usual way to reduce it. Among sparse methods, the fully independent training conditional (FITC) approximation is a relatively advanced one, mostly used for regression problems whose input variables are fully independent of one another. Input noise is another important concern for GPR. Noise on test inputs can be handled by moment matching, while noise on training inputs can be handled by converting it into output noise, which yields higher accuracy. Building on these algorithms, this paper proposes a sparse Gaussian process with input noise (SGPIN) and applies it to human pose estimation. The datasets used in the experiments come from earlier studies; the input is an image taken from a video sequence or an image descriptor obtained by feature extraction, and the output is a three-dimensional human pose. Compared with other algorithms, the proposed algorithm gives satisfactory accuracy, runtime, and stability.

Key words:
- human pose estimation
- regression analysis
- sparse Gaussian process (GP)
- noisy input
- video processing
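
The idea of converting training-input noise into output noise can be illustrated with a small sketch. The code below is not the paper's SGPIN algorithm. It is a minimal dense GP on 1-D inputs that assumes a squared-exponential kernel, fixed hyperparameters, and a first-order (slope-based) correction in which the output-noise variance at each training point is inflated by the squared slope of the posterior mean times the input-noise variance. All function names, parameter values, and the synthetic data are hypothetical.

```python
import numpy as np

def rbf(a, b, ell=1.0, sf2=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return sf2 * np.exp(-0.5 * (d / ell) ** 2)

def fit_alpha(x, y, noise_var, ell=1.0, sf2=1.0):
    """Solve (K + diag(noise)) alpha = y; noise_var is a scalar or per-point vector."""
    noise = np.broadcast_to(np.asarray(noise_var, dtype=float), x.shape)
    K = rbf(x, x, ell, sf2) + np.diag(noise)
    return np.linalg.solve(K, y)

def post_mean(xq, x, alpha, ell=1.0, sf2=1.0):
    """GP posterior mean at query points xq."""
    return rbf(xq, x, ell, sf2) @ alpha

def post_mean_slope(xq, x, alpha, ell=1.0, sf2=1.0):
    """Derivative of the posterior mean with respect to the input at xq."""
    d = xq[:, None] - x[None, :]
    return (-(d / ell**2) * rbf(xq, x, ell, sf2)) @ alpha

def fit_with_input_noise(x, y, sy2, sx2, ell=1.0, sf2=1.0, n_iter=2):
    """Refit the GP with output noise inflated by slope^2 * input-noise variance."""
    alpha = fit_alpha(x, y, sy2, ell, sf2)
    noise = np.full(x.shape, sy2)
    for _ in range(n_iter):
        slope = post_mean_slope(x, x, alpha, ell, sf2)
        noise = sy2 + sx2 * slope**2  # input noise expressed as extra output noise
        alpha = fit_alpha(x, y, noise, ell, sf2)
    return alpha, noise

# Synthetic example: noisy observations of the inputs and of sin(x).
rng = np.random.default_rng(0)
x_true = np.linspace(0.0, 6.0, 40)
x_obs = x_true + rng.normal(0.0, 0.2, x_true.shape)
y = np.sin(x_true) + rng.normal(0.0, 0.05, x_true.shape)

alpha, noise = fit_with_input_noise(x_obs, y, sy2=0.05**2, sx2=0.2**2)
xs = np.linspace(0.0, 6.0, 5)
pred = post_mean(xs, x_obs, alpha)
```

Under this assumed correction, training points in steep regions of the function receive a larger effective noise variance than points in flat regions, so they are trusted less during the refit. A sparse variant would replace the dense solve with an inducing-point approximation such as FITC.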
Note:

1) Recommended by Associate Editor HUANG Qing-Ming

Fig. 1 Prediction results of GP, FITC, NIGP and SGPIN

Fig. 2 Error comparison of TGP, TGPKNN and SGPIN

Fig. 3 Error comparison of GP, KTA, HSICKNN and SGPIN

Table 1 Comparison of GP, FITC, NIGP and SGPIN

Algorithm	Number of training points	MSE ($10^{-3}$)	Runtime (s)
GP	200	31.1326	1.876034
FITC	800	18.6279	0.062001
NIGP	200	18.6279	13.630882
SGPIN	800	8.6265	0.003087
SGPIN	200	18.4946	0.002612

Table 2 Experimental dataset

Feature	Action	Subject 1	Subject 2	Subject 3	Total
HoG	Walking	1 176	876	895	2 947
	Jogging	439	795	831	2 065
	Throw/Catch	217	806	0	1 023
	Gestures	801	681	214	1 696
	Box	502	464	933	1 899
	Total	3 135	3 622	2 873	9 630
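
The totals in Table 2 can be re-derived from the per-subject counts. The short check below copies the per-subject numbers from the table and recomputes the per-action and per-subject sums. The variable names are illustrative only.

```python
# Per-subject sample counts copied from Table 2 (HoG features),
# in the order (Subject 1, Subject 2, Subject 3).
counts = {
    "Walking": (1176, 876, 895),
    "Jogging": (439, 795, 831),
    "Throw/Catch": (217, 806, 0),
    "Gestures": (801, 681, 214),
    "Box": (502, 464, 933),
}

# Sum across subjects for each action (the "Total" column).
row_totals = {action: sum(v) for action, v in counts.items()}

# Sum across actions for each subject (the "Total" row).
column_totals = tuple(sum(col) for col in zip(*counts.values()))

# Both ways of summing must give the same overall sample count.
grand_total = sum(column_totals)
assert grand_total == sum(row_totals.values())

for action, total in row_totals.items():
    print(f"{action}: {total}")
print("Per subject:", column_totals, "Overall:", grand_total)
```

The per-action sums computed this way add up to the 9 630 samples given in the Total row.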

Table 3 Average error of different algorithms based on HoG features of HumanEva-I

Subject	Action	Number of samples	GP	TGP	TGPKNN	KTA	HSICKNN	SGPIN
S1	Walking	1 176	398.5823	197.1179	193.9949	213.5265	218.6241	161.2112
	Jogging	439	383.7747	212.3234	212.2018	188.6683	196.0839	154.5919
	Throw/Catch	217	414.5873	174.2834	/	/	/	100.7592
	Gestures	801	415.3106	98.6237	102.5520	92.1541	156.6464	20.1770
	Box	502	426.6358	162.6801	163.3203	118.0500	149.5003	82.3949
S2	Walking	876	398.5817	197.1496	195.5694	206.7040	211.9735	160.4342
	Jogging	795	405.1201	213.0572	207.2430	227.3562	231.1777	176.1768
	Throw/Catch	806	421.5898	210.1543	199.3265	173.2717	189.7417	92.6742
	Gestures	681	410.0671	201.1053	201.7576	153.9103	173.0548	63.2473
	Box	464	421.3947	171.6007	109.1912	137.1031	159.5833	98.3920
S3	Walking	895	412.0019	219.2579	214.8589	236.1566	239.6487	177.3461
	Jogging	831	441.7053	211.1343	206.1400	233.5746	236.5287	184.2251
	Throw/Catch	0	/	/	/	/	/	/
	Gestures	214	473.7616	159.7482	/	/	/	40.3100
	Box	933	483.6534	214.1621	207.7578	186.5170	195.9815	120.6541
Total		9 630	284.0985	160.1196	162.0768	/	/	155.3066

Table 4 Runtime of different algorithms based on HoG features of HumanEva-I

Subject	Action	Number of samples	GP	TGP	TGPKNN	KTA	HSICKNN	SGPIN
S1	Walking	1 176	0.11	26.77	24.67	28.16	27.87	18.02
	Jogging	439	0.03	8.47	10.43	10.18	10.26	21.65
	Throw/Catch	217	0.01	3.77	/	/	/	22.44
	Gestures	801	0.07	27.15	27.31	18.64	19.42	19.78
	Box	502	0.03	10.11	11.19	11.75	11.84	21.90
S2	Walking	876	0.08	20.86	25.83	20.03	20.26	22.04
	Jogging	795	0.07	18.06	17.86	17.64	17.74	23.32
	Throw/Catch	806	0.02	18.56	26.59	20.13	20.02	21.69
	Gestures	681	0.04	14.32	15.52	15.91	16.64	18.38
	Box	464	0.03	9.02	10.35	10.71	11.43	23.69
S3	Walking	895	0.09	22.78	22.63	20.75	20.95	21.13
	Jogging	831	0.08	21.83	20.13	18.51	19.01	20.36
	Throw/Catch	0	/	/	/	/	/	/
	Gestures	214	0.04	6.13	/	/	/	22.62
	Box	933	0.10	23.70	23.67	22.68	23.57	21.69
Total		9 630	11	1928	442	491	495	41

Table 5 Prediction errors for subject 3 walking

Trial	GP	TGP	TGPKNN	HSICKNN	KTA	SGPIN
1	412.0	219.1	214.8	236.3	239.6	177.6
2	412.0	218.0	214.9	232.5	235.7	176.9
3	412.0	220.2	214.6	237.1	240.9	177.7
4	412.0	220.4	215.6	241.6	244.8	177.5
5	412.0	218.7	214.5	233.4	237.3	177.0
Variance	0.00	1.04	0.18	12.89	12.38	0.14
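
The variance row of Table 5 can be approximately reproduced from the five listed errors per algorithm. The sketch below assumes the row is the sample variance (denominator n - 1). That is an inference from the rounded table values, not something the source states, so small differences in the second decimal are expected.

```python
import statistics

# Prediction errors of the five trials, copied from Table 5.
errors = {
    "GP": [412.0, 412.0, 412.0, 412.0, 412.0],
    "TGP": [219.1, 218.0, 220.2, 220.4, 218.7],
    "TGPKNN": [214.8, 214.9, 214.6, 215.6, 214.5],
    "HSICKNN": [236.3, 232.5, 237.1, 241.6, 233.4],
    "KTA": [239.6, 235.7, 240.9, 244.8, 237.3],
    "SGPIN": [177.6, 176.9, 177.7, 177.5, 177.0],
}

# Sample variance (n - 1 denominator); this choice is an assumption.
variances = {name: statistics.variance(vals) for name, vals in errors.items()}

for name, var in variances.items():
    print(f"{name}: {var:.2f}")
```

A lower variance across trials indicates that an algorithm's error changes less from run to run, which is the sense of stability the table is used to compare.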
