“Deep part induction from articulated object pairs” – ACM SIGGRAPH HISTORY ARCHIVES

  • 2018 SA Technical Papers_Yi_Deep part induction from articulated object pairs

Title:

    Deep part induction from articulated object pairs

Session/Category Title:   Learning to compose & decompose


Abstract:


    Object functionality is often expressed through part articulation, as when the two rigid parts of a pair of scissors pivot against each other to perform the cutting function. Such articulations are often similar across objects within the same functional category. In this paper we explore how the observation of different articulation states provides evidence for the part structure and motion of 3D objects. Our method takes as input a pair of unsegmented shapes representing two different articulation states of two functionally related objects, and induces their common parts along with their underlying rigid motion. This is a challenging setting: we assume no prior shape structure, no prior shape category information, and no consistent shape orientation; the articulation states may belong to objects of different geometry; and we allow the inputs to be noisy and partial scans, or point clouds lifted from RGB images. Our method trains a neural network architecture with three modules that, respectively, propose correspondences, estimate 3D deformation flows, and perform segmentation. To achieve the best performance, our architecture alternates between correspondence, deformation flow, and segmentation prediction iteratively in an ICP-like fashion. Our results demonstrate that our method significantly outperforms state-of-the-art techniques in the task of discovering articulated parts of objects. In addition, our part induction is object-class agnostic and successfully generalizes to new and unseen objects.
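
    The ICP-like alternation described in the abstract can be illustrated with a purely classical stand-in. The sketch below is not the paper's network: nearest-neighbour matching replaces the learned correspondence module, a point-wise difference replaces the learned deformation flow, and a rigid-motion residual test replaces the learned segmentation. All function names (fit_rigid, nearest, induce_parts), the fixed number of parts, and the need for an initial labelling are assumptions made for illustration only.

```python
import numpy as np


def fit_rigid(src, dst):
    """Least-squares R, t with dst ~ src @ R.T + t (Kabsch algorithm)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)            # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, dst_c - R @ src_c


def nearest(src, dst):
    """Index of the closest point in dst for every point in src (brute force)."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def induce_parts(P, Q, init_labels, n_parts=2, n_iters=5):
    """Alternate correspondence, flow, and segmentation on point sets P and Q.

    Hypothetical helper: P and Q are (N, 3) and (M, 3) arrays, init_labels is
    a rough initial part label in [0, n_parts) for every point of P.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    labels = np.asarray(init_labels).copy()
    motions = [(np.eye(3), np.zeros(3)) for _ in range(n_parts)]
    flow = np.zeros_like(P)
    for _ in range(n_iters):
        # 1. Correspondence: move each point by its part's current motion,
        #    then match it to the nearest point of Q.
        moved = np.empty_like(P)
        for k, (R, t) in enumerate(motions):
            mask = labels == k
            moved[mask] = P[mask] @ R.T + t
        corr = nearest(moved, Q)

        # 2. Deformation flow: displacement from each point to its match.
        flow = Q[corr] - P
        target = P + flow

        # 3. Re-fit one rigid motion per part from the flow of its points.
        for k in range(n_parts):
            mask = labels == k
            if mask.sum() >= 3:                    # need at least 3 points
                motions[k] = fit_rigid(P[mask], target[mask])

        # 4. Segmentation: give each point to the motion that best
        #    explains its flow.
        residuals = np.stack(
            [np.linalg.norm(P @ R.T + t - target, axis=1) for R, t in motions]
        )
        labels = residuals.argmin(axis=0)
    return labels, motions, flow
```

    Calling induce_parts(P, Q, init_labels) returns a part label per point of P, one (R, t) pair per part, and the last flow estimate. Like ICP, this sketch depends on a reasonable starting point and on small motions between the two states; the abstract's setting (no consistent orientation, different geometry, partial scans) is exactly where such a hand-crafted loop would be expected to struggle.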


