“Scene completion using millions of photographs” by Hays and Efros

  • ©James Hays and Alexei A. Efros







    What can you do with a million images? In this paper we present a new image completion algorithm powered by a huge database of photographs gathered from the Web. The algorithm patches up holes in images by finding similar image regions in the database that are not only seamless but also semantically valid. Our chief insight is that while the space of images is effectively infinite, the space of semantically differentiable scenes is actually not that large. For many image completion tasks we are able to find similar scenes which contain image fragments that will convincingly complete the image. Our algorithm is entirely data-driven, requiring no annotations or labelling by the user. Unlike existing image completion methods, our algorithm can generate a diverse set of results for each input image and we allow users to select among them. We demonstrate the superiority of our algorithm over existing image completion approaches.
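    The retrieve-then-fill idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes tiny grayscale images stored as nested lists, a plain sum-of-squared-differences score computed only over the known pixels, and a direct copy of the candidate's pixels into the hole, with no seam finding or blending. All names (`masked_ssd`, `rank_candidates`, `fill_hole`, `complete`) are hypothetical. Returning the top `k` completions mirrors the abstract's point that several results are offered for the user to choose from.

```python
from typing import List, Tuple

Image = List[List[float]]  # grayscale image as rows of floats (assumed format)
Mask = List[List[bool]]    # True where the pixel is missing (the hole)


def masked_ssd(query: Image, candidate: Image, hole: Mask) -> float:
    """Sum of squared differences over the known (non-hole) pixels only."""
    total = 0.0
    for q_row, c_row, m_row in zip(query, candidate, hole):
        for q, c, missing in zip(q_row, c_row, m_row):
            if not missing:
                total += (q - c) ** 2
    return total


def rank_candidates(
    query: Image, database: List[Image], hole: Mask, k: int = 3
) -> List[Tuple[float, int]]:
    """Return the k best (score, database index) pairs, lowest score first."""
    scored = [(masked_ssd(query, img, hole), idx) for idx, img in enumerate(database)]
    scored.sort()
    return scored[:k]


def fill_hole(query: Image, candidate: Image, hole: Mask) -> Image:
    """Copy candidate pixels into the hole; keep the query's pixels elsewhere."""
    return [
        [c if missing else q for q, c, missing in zip(q_row, c_row, m_row)]
        for q_row, c_row, m_row in zip(query, candidate, hole)
    ]


def complete(query: Image, database: List[Image], hole: Mask, k: int = 3) -> List[Image]:
    """Produce up to k alternative completions, one per retrieved scene."""
    return [
        fill_hole(query, database[idx], hole)
        for _, idx in rank_candidates(query, database, hole, k)
    ]
```

    In this sketch the matching score ignores the hole entirely, so a candidate is judged only on how well it agrees with the part of the scene that is actually known. A realistic system would replace the raw pixel comparison with a scene-level descriptor and would blend the copied region rather than paste it.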


