Shape2Pose: human-centric shape analysis

As 3D acquisition devices and modeling tools become widely available, there is a growing need for automatic algorithms that analyze the semantics and functionality of digitized shapes. Most recent research has focused on analyzing geometric structures of shapes. Our work is motivated by the observation that a majority of man-made shapes are designed to be used by people. Thus, in order to fully understand their semantics, one needs to answer a fundamental question: “how do people interact with these objects?” As an initial step towards this goal, we offer a novel algorithm for automatically predicting a static pose that a person would need to adopt in order to use an object. Specifically, given an input 3D shape, the goal of our analysis is to predict a corresponding human pose, including contact points and kinematic parameters. This is especially challenging for man-made objects, which commonly exhibit a lot of variance in their geometric structure. We address this challenge by observing that contact points usually share consistent local geometric features related to the anthropometric properties of the corresponding body parts, and that the human body is subject to kinematic constraints and priors. Accordingly, our method combines local region classification with a global, kinematically constrained search to predict poses for various objects. We evaluate our algorithm on six diverse collections of 3D polygonal models (chairs, gym equipment, cockpits, carts, bicycles, and bipedal devices) containing a total of 147 models. Finally, we demonstrate that the poses predicted by our algorithm can be used in several shape analysis problems, such as establishing correspondences between objects, detecting salient regions, finding informative viewpoints, and retrieving functionally similar shapes.
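The combination of local classification and kinematically constrained search described in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: the candidate points, the per-part scores (stand-ins for a local region classifier's output), the `MAX_REACH` limits, and the function names are all hypothetical, and a brute-force enumeration replaces whatever search the paper actually uses.

```python
from itertools import product
from math import dist, log

# Hypothetical candidate contact points on a shape, per body part.
# Each entry is (xyz position, classifier score in (0, 1]); the scores
# stand in for the output of a local region classifier.
candidates = {
    "pelvis": [((0.0, 0.5, 0.0), 0.9), ((0.0, 1.0, 0.3), 0.2)],
    "hand":   [((0.3, 0.7, 0.2), 0.8), ((1.5, 0.7, 0.0), 0.95)],
    "foot":   [((0.2, 0.0, 0.4), 0.7), ((0.0, 0.0, 2.0), 0.9)],
}

# Hypothetical kinematic limits: maximum distance from the pelvis
# contact to each limb contact (arbitrary units).
MAX_REACH = {"hand": 0.9, "foot": 1.0}


def is_feasible(assignment):
    """Check that every limb contact is within reach of the pelvis."""
    pelvis_pos = assignment["pelvis"][0]
    return all(
        dist(pelvis_pos, assignment[part][0]) <= reach
        for part, reach in MAX_REACH.items()
    )


def best_pose(candidates):
    """Enumerate all contact assignments and keep the feasible one
    with the highest summed log score."""
    parts = list(candidates)
    best, best_score = None, float("-inf")
    for combo in product(*(candidates[p] for p in parts)):
        assignment = dict(zip(parts, combo))
        if not is_feasible(assignment):
            continue
        score = sum(log(s) for _, s in combo)
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score


pose, score = best_pose(candidates)
for part, (position, part_score) in pose.items():
    print(part, position, part_score)
```

In this toy data the hand and foot candidates with the highest individual scores lie out of reach of any pelvis candidate, so the search rejects them in favor of lower-scoring but kinematically consistent contacts. That is the intuition behind pairing a local classifier with a global constraint.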


SIGGRAPH 2014: Technical Papers

“Shape2Pose: human-centric shape analysis” by Kim, Chaudhuri, Guibas and Funkhouser

Session/Category Title: Shape Analysis
