Higher-order Markov Random Fields and Their Applications in Scene Understanding

YU Miao; HU Zhan-Yi

doi:10.16383/j.aas.2015.c140684


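YU Miao^1,2, HU Zhan-Yi^1

1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190;
2. School of Electronic and Information Engineering, Zhongyuan University of Technology, Zhengzhou 450007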

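Funds:

Supported by the National High Technology Research and Development Program of China (863 Program) (2013AA122301) and the National Natural Science Foundation of China (61273280, 61333015)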

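Author biography:
YU Miao: Lecturer at Zhongyuan University of Technology and Ph.D. candidate at the Institute of Automation, Chinese Academy of Sciences. Received a bachelor's degree in management in 2004 and a master's degree in engineering in 2007, both from Southwest Jiaotong University. Main research interests are scene understanding and 3D reconstruction. E-mail: myu@nlpr.ia.ac.cn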

Publication history
- Received: 2014-09-24
- Revised: 2015-03-20
- Published: 2015-07-20


Abstract

Compared with traditional first-order Markov random fields (MRFs), higher-order Markov random fields can express more sophisticated qualitative and statistical priors and therefore have greater modeling power. However, their energy functions are harder to minimize, and the explosive growth in the number of model parameters makes choosing suitable parameter values very difficult. In recent years, researchers have studied the modeling, optimization, and parameter learning of higher-order MRF energy models in depth and obtained many meaningful results. This paper first summarizes the main results in these three areas and then reviews the current applications of higher-order Markov random fields to image understanding and 3D scene understanding.
- Higher-order Markov random fields
- energy modeling
- energy minimization
- parameter learning
- scene understanding

