“Interaction context (ICON): towards a geometric functionality descriptor”

  • ©Ruizhen Hu, Chenyang Zhu, Oliver van Kaick, Ligang Liu, Ariel Shamir, and Hao (Richard) Zhang





Session/Category Title:   Shape Analysis




    We introduce a contextual descriptor which aims to provide a geometric description of the functionality of a 3D object in the context of a given scene. Unlike previous work, we do not regard functionality as an abstract label or represent it implicitly through an agent. Our descriptor, called interaction context or ICON for short, explicitly represents the geometry of object-to-object interactions. Our approach to object functionality analysis is based on the key premise that functionality should mainly be derived from interactions between objects and not from objects in isolation. Specifically, ICON collects geometric and structural features to encode interactions between a central object in a 3D scene and its surrounding objects. These interactions are then grouped based on feature similarity, leading to a hierarchical structure. By focusing on interactions and their organization, ICON is insensitive to the number of objects that appear in a scene, the specific disposition of objects around the central object, and the objects' fine-grained geometry. With a series of experiments, we demonstrate the potential of ICON in functionality-oriented shape processing, including shape retrieval (either directly or by complementing existing shape descriptors), segmentation, and synthesis.
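
    The grouping step mentioned in the abstract (interactions grouped by feature similarity into a hierarchy) can be illustrated with a generic sketch. This is not the authors' implementation: the abstract does not specify the features, the distance, or the clustering rule, so the code below assumes plain average-linkage agglomerative clustering with Euclidean distance, and the interaction names and 2D feature vectors are hypothetical placeholders.

```python
import math
from itertools import combinations


def leaves(node):
    """Return the interaction names stored under a hierarchy node."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return leaves(left) + leaves(right)


def linkage(a, b, features):
    """Average Euclidean distance between all leaf pairs of two nodes (assumed metric)."""
    la, lb = leaves(a), leaves(b)
    total = sum(math.dist(features[x], features[y]) for x in la for y in lb)
    return total / (len(la) * len(lb))


def build_hierarchy(features):
    """Greedily merge the two most similar groups until one binary tree remains."""
    clusters = list(features)  # start with one leaf per interaction
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]], features),
        )
        merged = (clusters[i], clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0]


# Hypothetical interactions of a central object, each with a toy feature vector.
toy_features = {
    "floor_support": (0.0, 1.0),
    "table_support": (0.1, 0.9),
    "hand_grasp": (1.0, 0.0),
}
tree = build_hierarchy(toy_features)
print(tree)
```

    In this toy input the two support-like interactions have nearby feature vectors, so they are merged first and the grasp-like interaction joins at the root. Because the tree depends on feature similarity rather than on how many objects contribute each interaction, a sketch like this also hints at why such a hierarchy can be insensitive to object counts.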


