“Learning how objects function via co-analysis of interactions” by Hu, Kaick, Wu, Huang, Shamir, et al. …

  • ©Ruizhen Hu, Oliver van Kaick, Bojian Wu, Hui Huang, Ariel Shamir, and Hao Zhang




    Learning how objects function via co-analysis of interactions

Session/Category Title:   SHAPE ANALYSIS




    We introduce a co-analysis method which learns a functionality model for an object category, e.g., strollers or backpacks. Like previous works on functionality, we analyze object-to-object interactions and intra-object properties and relations. Unlike previous works, our model goes beyond providing a functionality-oriented descriptor for a single object; it prototypes the functionality of a category of 3D objects by co-analyzing typical interactions involving objects from the category. Furthermore, our co-analysis localizes the studied properties to the specific locations, or surface patches, that support specific functionalities, and then integrates the patch-level properties into a category functionality model. Thus our model focuses on the how, via common interactions, and the where, via patch localization, of functionality analysis.

    Given a collection of 3D objects belonging to the same category, with each object provided within a scene context, our co-analysis yields a set of proto-patches, each of which is a patch prototype supporting a specific type of interaction, e.g., a stroller handle held by a hand. The learned category functionality model is composed of proto-patches, along with their pairwise relations, which together summarize the functional properties of all the patches that appear in the input object category. With the learned functionality models for various object categories serving as a knowledge base, we are able to form a functional understanding of an individual 3D object, without a scene context. With patch localization in the model, functionality-aware modeling, e.g., functional object enhancement and the creation of functional object hybrids, is made possible.
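    The structure described above, a category model made of proto-patches and their pairwise relations, can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the class names `ProtoPatch` and `FunctionalityModel`, the function `build_model`, the two-element feature vectors, and the "wheels on ground" interaction do not come from the paper, and plain per-interaction averaging only stands in for the actual co-analysis and learning steps.

```python
from dataclasses import dataclass, field
from statistics import fmean


@dataclass
class ProtoPatch:
    """Hypothetical patch prototype tied to one interaction type."""
    interaction: str            # e.g. "handle held by hand"
    mean_features: list[float]  # averaged per-patch properties (illustrative)


@dataclass
class FunctionalityModel:
    """Hypothetical category model: proto-patches plus pairwise relations."""
    category: str
    proto_patches: dict[str, ProtoPatch] = field(default_factory=dict)
    # Keyed by a pair of interaction names; the value is an illustrative
    # scalar relation. Left empty in this sketch.
    pairwise: dict[tuple[str, str], float] = field(default_factory=dict)


def build_model(category, observations):
    """Average patch features per interaction type across a collection.

    `observations` is a list of (interaction, feature_vector) pairs gathered
    from all objects in the category. Averaging is an assumed stand-in for
    the learning step, not the paper's procedure.
    """
    grouped = {}
    for interaction, features in observations:
        grouped.setdefault(interaction, []).append(features)

    model = FunctionalityModel(category)
    for interaction, rows in grouped.items():
        # zip(*rows) walks the feature vectors column by column.
        means = [fmean(column) for column in zip(*rows)]
        model.proto_patches[interaction] = ProtoPatch(interaction, means)
    return model


# Made-up observations for two stroller interactions.
observations = [
    ("handle held by hand", [1.0, 0.0]),
    ("handle held by hand", [3.0, 2.0]),
    ("wheels on ground", [0.0, 4.0]),
]
model = build_model("stroller", observations)
```

    In this toy setup each interaction type ends up with one prototype, which mirrors the idea that a proto-patch summarizes all patches supporting the same interaction across the category. The `pairwise` field is only a placeholder for the relations between proto-patches mentioned in the abstract.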


