Photo clip art

We present a system for inserting new objects into existing photographs by querying a vast image-based object library, pre-computed using a publicly available Internet object database. The central goal is to shield the user from all of the arduous tasks typically involved in image compositing. The user is asked to do only two simple things: (1) pick a 3D location in the scene at which to place a new object, and (2) select an object to insert from a hierarchical menu. We pose the problem of object insertion as a data-driven, 3D-based, context-sensitive object retrieval task. Instead of manipulating the object to change its orientation, color distribution, etc., to fit the new image, we simply retrieve an object of a specified class that has all the required properties (camera pose, lighting, resolution, etc.) from our large object library. We present new automatic algorithms for improving object segmentation and blending, estimating true 3D object size and orientation, and estimating scene lighting conditions. We also present an intuitive user interface that makes object insertion fast and simple even for the artistically challenged.
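The retrieval idea in the abstract (pick an existing library object whose properties already match the scene, rather than altering an object to fit) can be illustrated with a minimal sketch. This is not the authors' implementation: the descriptors (`camera_height_m`, `sun_azimuth_deg`, `pixel_height`), the unweighted sum in `mismatch`, and all names below are hypothetical stand-ins for whatever camera pose, lighting, and resolution measures the actual system uses.

```python
from dataclasses import dataclass


@dataclass
class LibraryObject:
    name: str
    category: str
    camera_height_m: float  # hypothetical camera-pose descriptor
    sun_azimuth_deg: float  # hypothetical lighting descriptor
    pixel_height: int       # height of the stored object cutout, in pixels


@dataclass
class Query:
    category: str
    camera_height_m: float
    sun_azimuth_deg: float
    required_pixel_height: int  # size the object needs at the chosen 3D location


def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def mismatch(obj: LibraryObject, q: Query) -> float:
    """Illustrative score: lower means the object fits the scene better."""
    pose = abs(obj.camera_height_m - q.camera_height_m)
    light = angle_diff(obj.sun_azimuth_deg, q.sun_azimuth_deg) / 180.0
    # Penalize only objects that would have to be upsampled.
    resolution = max(0.0, q.required_pixel_height / obj.pixel_height - 1.0)
    return pose + light + resolution


def retrieve(library: list[LibraryObject], q: Query, k: int = 3) -> list[LibraryObject]:
    """Keep objects of the requested class, then rank them by mismatch."""
    candidates = [o for o in library if o.category == q.category]
    return sorted(candidates, key=lambda o: mismatch(o, q))[:k]


library = [
    LibraryObject("car_a", "car", 1.6, 90.0, 200),
    LibraryObject("car_b", "car", 5.0, 270.0, 80),
    LibraryObject("person_a", "person", 1.6, 90.0, 300),
]
query = Query("car", camera_height_m=1.7, sun_azimuth_deg=100.0, required_pixel_height=150)
best = retrieve(library, query)
print([o.name for o in best])
```

In this toy library, `car_a` ranks ahead of `car_b` because its camera height and lighting direction are close to the query and it needs no upsampling; `person_a` is excluded by the class filter. A real system would need calibrated descriptors and weights, which this sketch does not attempt to model.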

ACM Digital Library Publication:

Photo clip art

Overview Page:

SIGGRAPH 2007: Technical Papers

“Photo clip art” by Lalonde, Hoiem, Efros, Rother, Winn, et al. …
