“Photo clip art” by Lalonde, Hoiem, Efros, Rother, Winn, et al. …

  • ©Jean-Francois Lalonde, Derek Hoiem, Alexei A. Efros, Carsten Rother, John Winn, and Antonio Criminisi


Abstract:


    We present a system for inserting new objects into existing photographs by querying a vast image-based object library, pre-computed using a publicly available Internet object database. The central goal is to shield the user from all of the arduous tasks typically involved in image compositing. The user is only asked to do two simple things: 1) pick a 3D location in the scene to place a new object; 2) select an object to insert using a hierarchical menu. We pose the problem of object insertion as a data-driven, 3D-based, context-sensitive object retrieval task. Instead of trying to manipulate the object to change its orientation, color distribution, etc. to fit the new image, we simply retrieve an object of a specified class that has all the required properties (camera pose, lighting, resolution, etc.) from our large object library. We present new automatic algorithms for improving object segmentation and blending, estimating true 3D object size and orientation, and estimating scene lighting conditions. We also present an intuitive user interface that makes object insertion fast and simple even for the artistically challenged.
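    The retrieval idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the record fields (`camera_height_m`, `pixel_height`, `sun_azimuth_deg`), the cost function, and the sample library are all hypothetical stand-ins for "camera pose, lighting, resolution". It only shows the general pattern of filtering a library by class and resolution, then ranking the remaining candidates by how well they match the target scene.

```python
from dataclasses import dataclass


@dataclass
class LibraryObject:
    """One pre-segmented object in a hypothetical object library."""
    name: str
    object_class: str
    camera_height_m: float   # assumed proxy for camera pose
    pixel_height: int        # object height in its source image (resolution)
    sun_azimuth_deg: float   # assumed proxy for scene lighting


def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def retrieve(library, object_class, camera_height_m,
             needed_pixel_height, sun_azimuth_deg, top_k=3):
    """Return up to top_k library objects of the requested class that are
    large enough for the target location, best scene match first."""
    candidates = [
        o for o in library
        if o.object_class == object_class
        and o.pixel_height >= needed_pixel_height  # avoid upsampling
    ]

    def cost(o):
        # Hypothetical, unweighted mix of pose and lighting mismatch.
        pose = abs(o.camera_height_m - camera_height_m)
        light = angle_diff(o.sun_azimuth_deg, sun_azimuth_deg) / 180.0
        return pose + light

    return sorted(candidates, key=cost)[:top_k]


# Made-up sample library for illustration only.
library = [
    LibraryObject("car_a", "car", 1.6, 200, 90.0),
    LibraryObject("car_b", "car", 1.5, 80, 100.0),    # too small, filtered out
    LibraryObject("car_c", "car", 3.0, 300, 270.0),
    LibraryObject("person_a", "person", 1.6, 250, 90.0),
]

matches = retrieve(library, "car", camera_height_m=1.6,
                   needed_pixel_height=150, sun_azimuth_deg=95.0)
print([o.name for o in matches])
```

    The sketch selects among existing objects rather than transforming one to fit, which mirrors the abstract's stated preference for retrieval over manipulation; the actual matching criteria used by the system are not given in this listing.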

References:


    1. Agarwala, A., Dontcheva, M., Agrawala, M., Drucker, S., Colburn, A., Curless, B., Salesin, D., and Cohen, M. 2004. Interactive digital photomontage. ACM Trans. Graph. (SIGGRAPH 04) 23, 3, 294–302.
    2. Berg, T. L., Berg, A. C., Edwards, J., Maire, M., White, R., Teh, Y.-W., Learned-Miller, E., and Forsyth, D. A. 2004. Names and faces in the news. In IEEE Computer Vision and Pattern Recognition (CVPR).
    3. Boykov, Y., Veksler, O., and Zabih, R. 2001. Fast approximate energy minimization via graph cuts. IEEE Trans. Pattern Analysis and Machine Intelligence 23, 11.
    4. Boykov, Y., Kolmogorov, V., Cremers, D., and Delong, A. 2006. An integral solution to surface evolution PDEs via Geo-Cuts. In European Conf. on Computer Vision (ECCV).
    5. Cavanagh, P. 2005. The artist as neuroscientist. Nature 434 (March), 301–307.
    6. Chuang, Y.-Y., Goldman, D. B., Curless, B., Salesin, D. H., and Szeliski, R. 2003. Shadow matting and compositing. ACM Transactions on Graphics (SIGGRAPH 03) 22, 3 (July), 494–500.
    7. Criminisi, A., Reid, I., and Zisserman, A. 2000. Single view metrology. International Journal of Computer Vision 40, 2, 123–148.
    8. Debevec, P. 1998. Rendering synthetic objects into real scenes: Bridging traditional and image-based graphics with global illumination and high dynamic range photography. In Proceedings of SIGGRAPH 98, 189–198.
    9. Diakopoulos, N., Essa, I., and Jain, R. 2004. Content based image synthesis. In Conference on Image and Video Retrieval (CIVR).
    10. Everingham, M., Zisserman, A., Williams, C., and Gool, L. V. 2006. The pascal visual object classes challenge 2006 results. Tech. rep., Oxford University.
    11. Fei-Fei, L., Fergus, R., and Perona, P. 2004. Learning generative visual models from few training examples: An incremental bayesian approach tested on 101 object categories. In IEEE CVPR Workshop of Generative Model Based Vision.
    12. Finlayson, G. D., Hordley, S. D., Lu, C., and Drew, M. S. 2006. On the removal of shadows from images. IEEE Trans. Pattern Analysis and Machine Intelligence 28, 1, 59–68.
    13. Hoiem, D., Efros, A. A., and Hebert, M. 2005. Geometric context from a single image. In International Conference on Computer Vision (ICCV).
    14. Hoiem, D., Efros, A. A., and Hebert, M. 2006. Putting objects in perspective. In IEEE Computer Vision and Pattern Recognition (CVPR).
    15. Jia, J., Sun, J., Tang, C.-K., and Shum, H.-Y. 2006. Drag-and-drop pasting. ACM Transactions on Graphics (SIGGRAPH 06) 25, 3 (July), 631–637.
    16. Johnson, M., Brostow, G. J., Shotton, J., Arandjelović, O., Kwatra, V., and Cipolla, R. 2006. Semantic photo synthesis. Computer Graphics Forum (Proc. Eurographics) 25, 3, 407–413.
    17. Kersten, D., Knill, D., Mamassian, P., and Bulthoff, I. 1996. Illusory motion from shadows. Nature 379, 6560, 31–31.
    18. Khan, E. A., Reinhard, E., Fleming, R. W., and Bülthoff, H. H. 2006. Image-based material editing. ACM Transactions on Graphics (SIGGRAPH 06) 25, 3 (July), 654–663.
    19. Kolmogorov, V., and Boykov, Y. 2005. What metrics can be approximated by Geo-Cuts, or global optimization of length/area and flux. In International Conference on Computer Vision (ICCV).
    20. Levin, A., Lischinski, D., and Weiss, Y. 2006. A closed form solution to natural image matting. In Proc IEEE Computer Vision and Pattern Recognition (extended Tech. Rep.).
    21. Li, Y., Sun, J., Tang, C.-K., and Shum, H.-Y. 2004. Lazy snapping. ACM Transactions on Graphics (SIGGRAPH 04) 23, 3 (Aug.), 303–308.
    22. Perez, P., Gangnet, M., and Blake, A. 2003. Poisson image editing. ACM Trans. Graph. (SIGGRAPH 03) 22, 3, 313–318.
    23. Porter, T., and Duff, T. 1984. Compositing digital images. In Computer Graphics (Proceedings of SIGGRAPH 84), 253–259.
    24. Quinlan, J. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, Inc.
    25. Rother, C., Kolmogorov, V., and Blake, A. 2004. Grab-Cut: interactive foreground extraction using iterated graph cuts. ACM Transactions on Graphics (SIGGRAPH 04) 23, 3 (Aug.), 309–314.
    26. Rother, C., Bordeaux, L., Hamadi, Y., and Blake, A. 2006. Autocollage. ACM Transactions on Graphics (SIGGRAPH 06) 25, 3 (July), 847–852.
    27. Rother, C. 2007. Cut-and-paste for photo clip art. Tech. Rep. MSR-TR-2007-45, Microsoft Research.
    28. Russell, B. C., Torralba, A., Murphy, K. P., and Freeman, W. T. 2005. LabelMe: a database and web-based tool for image annotation. Tech. rep., MIT.
    29. Russell, B. C., Efros, A. A., Sivic, J., Freeman, W. T., and Zisserman, A. 2006. Using multiple segmentations to discover objects and their extent in image collections. In IEEE Computer Vision and Pattern Recognition (CVPR).
    30. Shotton, J., Winn, J., Rother, C., and Criminisi, A. 2006. Textonboost: Joint appearance, shape and context modeling for multi-class object recognition and segmentation. In European Conf. on Computer Vision (ECCV).
    31. Snavely, N., Seitz, S. M., and Szeliski, R. 2006. Photo tourism: exploring photo collections in 3D. ACM Trans. Graph. (SIGGRAPH 06) 25, 3, 835–846.
    32. Torralba, A., and Oliva, A. 2003. Statistics of natural image categories. Network: Computation in Neural Systems 14, 3 (August), 391–412.
    33. von Ahn, L., Liu, R., and Blum, M. 2006. Peekaboom: A game for locating objects in images. In ACM CHI.
    34. Wang, J., and Cohen, M. 2006. Simultaneous matting and compositing. Tech. Rep. MSR-TR-2006-63.

