“Interactive images: cuboid proxies for smart image manipulation” by Zheng, Chen, Cheng, Zhou, Hu, et al. …

  • ©Youyi Zheng, Xiang Anthony Chen, Ming-Ming Cheng, Kun Zhou, Shi-Min Hu, and Niloy J. Mitra




    Interactive images: cuboid proxies for smart image manipulation



    Images are static and lack important depth information about the underlying 3D scenes. We introduce interactive images in the context of man-made environments wherein objects are simple and regular, share various non-local relations (e.g., coplanarity, parallelism, etc.), and are often repeated. Our interactive framework creates partial scene reconstructions based on cuboid-proxies with minimal user interaction. It subsequently allows a range of intuitive image edits mimicking real-world behavior, which are otherwise difficult to achieve. Effectively, the user simply provides high-level semantic hints, while our system ensures plausible operations by conforming to the extracted non-local relations. We demonstrate our system on a range of real-world images and validate the plausibility of the results using a user study.


