Interactive images: cuboid proxies for smart image manipulation

Images are static and lack important depth information about the underlying 3D scenes. We introduce interactive images in the context of man-made environments wherein objects are simple and regular, share various non-local relations (e.g., coplanarity and parallelism), and are often repeated. Our interactive framework creates partial scene reconstructions based on cuboid proxies with minimal user interaction. It subsequently allows a range of intuitive image edits mimicking real-world behavior, which are otherwise difficult to achieve. Effectively, the user simply provides high-level semantic hints, while our system ensures plausible operations by conforming to the extracted non-local relations. We demonstrate our system on a range of real-world images and validate the plausibility of the results using a user study.

SIGGRAPH 2012: Technical Papers

“Interactive images: cuboid proxies for smart image manipulation” by Zheng, Chen, Cheng, Zhou, Hu, et al. …