“Scene reconstruction from high spatio-angular resolution light fields” by Kim, Zimmer, Pritch, Sorkine-Hornung and Gross

  • ©Changil Kim, Henning Zimmer, Yael Pritch, Alexander Sorkine-Hornung, and Markus Gross



Session Title:

    Image-Based Reconstruction

Title:
    Scene reconstruction from high spatio-angular resolution light fields


Abstract:

    This paper describes a method for scene reconstruction of complex, detailed environments from 3D light fields. Densely sampled light fields on the order of 10^9 light rays allow us to capture the real world in unparalleled detail, but efficiently processing this amount of data to generate an equally detailed reconstruction represents a significant challenge to existing algorithms. We propose an algorithm that leverages coherence in massive light fields by breaking with a number of established practices in image-based reconstruction. Our algorithm first computes reliable depth estimates specifically around object boundaries instead of interior regions, by operating on individual light rays instead of image patches. More homogeneous interior regions are then processed in a fine-to-coarse procedure rather than the standard coarse-to-fine approaches. At no point in our method is any form of global optimization performed. This allows our algorithm to retain precise object contours while still ensuring smooth reconstructions in less detailed areas. While the core reconstruction method handles general unstructured input, we also introduce a sparse representation and a propagation scheme for reliable depth estimates which make our algorithm particularly effective for 3D input, enabling fast and memory-efficient processing of “Gigaray light fields” on a standard GPU. We show dense 3D reconstructions of highly detailed scenes, enabling applications such as automatic segmentation and image-based rendering, and provide an extensive evaluation and comparison to existing image-based reconstruction techniques.
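
The first step named in the abstract (per-ray depth estimates computed only near object boundaries) can be illustrated with a small toy sketch. This is not the authors' implementation. The epipolar-plane image (EPI) layout, the Epanechnikov-style scoring kernel, the bandwidth, the confidence threshold, the wrap-around sampling and all function names are assumptions chosen for illustration. The fine-to-coarse step and the propagation scheme are not shown.

```python
import numpy as np

def synth_epi(texture, num_views, disparity):
    """Toy EPI (assumption): row s is the 1D texture shifted by (s - s_ref) * disparity pixels."""
    s_ref = num_views // 2
    return np.stack([np.roll(texture, (s - s_ref) * disparity) for s in range(num_views)])

def edge_confidence(row, radius=1):
    """Sum of squared differences to horizontal neighbours; large near intensity edges."""
    conf = np.zeros_like(row)
    for k in range(1, radius + 1):
        conf += (row - np.roll(row, k)) ** 2 + (row - np.roll(row, -k)) ** 2
    return conf

def ray_depth_scores(epi, u, candidates, bandwidth=0.1):
    """Score each candidate disparity for the single ray at (s_ref, u).

    For a candidate d, the ray's scene point would appear at column
    u + (s - s_ref) * d in view s. The score is the mean kernel response
    of the sampled radiances against the reference radiance.
    """
    num_views, width = epi.shape
    s_ref = num_views // 2
    offsets = np.arange(num_views) - s_ref
    ref = epi[s_ref, u]
    scores = []
    for d in candidates:
        cols = (u + offsets * d) % width          # wrap-around is a toy simplification
        samples = epi[np.arange(num_views), cols]
        x = (samples - ref) / bandwidth
        kernel = np.where(np.abs(x) < 1.0, 1.0 - x ** 2, 0.0)
        scores.append(kernel.mean())
    return np.array(scores)

def estimate_disparity(epi, candidates, conf_threshold=0.02):
    """Estimate disparity only where edge confidence is high; other pixels stay NaN."""
    s_ref = epi.shape[0] // 2
    conf = edge_confidence(epi[s_ref])
    disparity = np.full(epi.shape[1], np.nan)
    for u in np.flatnonzero(conf > conf_threshold):
        scores = ray_depth_scores(epi, u, candidates)
        disparity[u] = candidates[int(np.argmax(scores))]
    return disparity, conf

# Usage on synthetic data: 9 views of a random texture with true disparity 2.
rng = np.random.default_rng(0)
texture = rng.random(64)
epi = synth_epi(texture, num_views=9, disparity=2)
candidates = np.arange(0, 5)
disparity, conf = estimate_disparity(epi, candidates)
```

Pixels left as NaN are the low-confidence (homogeneous) ones. In the pipeline the abstract describes, such regions would be handled afterwards at coarser resolutions rather than by a global optimization.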

References:
    1. Adelson, E. H., and Wang, J. Y. A. 1992. Single lens stereo with a plenoptic camera. IEEE PAMI 14, 2.
    2. Ayvaci, A., Raptis, M., and Soatto, S. 2012. Sparse occlusion detection with optical flow. IJCV 97, 3.
    3. Basha, T., Avidan, S., Hornung, A., and Matusik, W. 2012. Structure and motion from scene registration. In CVPR.
    4. Beeler, T., Bickel, B., Beardsley, P. A., Sumner, B., and Gross, M. H. 2010. High-quality single-shot capture of facial geometry. ACM Trans. Graph. 29, 4.
    5. Bishop, T. E., and Favaro, P. 2010. Full-resolution depth map estimation from an aliased plenoptic light field. In ACCV.
    6. Bishop, T., Zanetti, S., and Favaro, P. 2009. Light field superresolution. In ICCP.
    7. Bleyer, M., Rother, C., Kohli, P., Scharstein, D., and Sinha, S. 2011. Object stereo – joint stereo matching and object segmentation. In CVPR.
    8. Bolles, R. C., Baker, H. H., and Marimont, D. H. 1987. Epipolar-plane image analysis: An approach to determining structure from motion. IJCV 1, 1.
    9. Buehler, C., Bosse, M., McMillan, L., Gortler, S. J., and Cohen, M. F. 2001. Unstructured lumigraph rendering. In SIGGRAPH.
    10. Čech, J., and Šára, R. 2007. Efficient sampling of disparity space for fast and accurate matching. In CVPR.
    11. Chai, J., Chan, S.-C., Shum, H.-Y., and Tong, X. 2000. Plenoptic sampling. In SIGGRAPH.
    12. Chen, W.-C., Bouguet, J.-Y., Chu, M. H., and Grzeszczuk, R. 2002. Light field mapping: Efficient representation and hardware rendering of surface light fields. In SIGGRAPH.
    13. Comaniciu, D., and Meer, P. 2002. Mean shift: A robust approach toward feature space analysis. IEEE PAMI 24, 5.
    14. Criminisi, A., Kang, S. B., Swaminathan, R., Szeliski, R., and Anandan, P. 2005. Extracting layers and analyzing their specular properties using epipolar-plane-image analysis. CVIU 97, 1.
    15. Davis, A., Levoy, M., and Durand, F. 2012. Unstructured light fields. Comput. Graph. Forum 31, 2.
    16. Duda, R., Hart, P., and Stork, D. 1995. Pattern Classification and Scene Analysis, 2nd ed.
    17. Fitzgibbon, A., Wexler, Y., and Zisserman, A. 2005. Image-based rendering using image-based priors. IJCV 63, 2.
    18. Furukawa, Y., and Ponce, J. 2010. Accurate, dense, and robust multi-view stereopsis. IEEE PAMI 32, 8.
    19. Furukawa, Y., Curless, B., Seitz, S. M., and Szeliski, R. 2010. Towards Internet-scale multi-view stereo. In CVPR.
    20. Fusiello, A., Trucco, E., and Verri, A. 2000. A compact algorithm for rectification of stereo pairs. Mach. Vis. Appl. 12, 1.
    21. Geiger, A., Roser, M., and Urtasun, R. 2010. Efficient large-scale stereo matching. In ACCV.
    22. Georgiev, T., and Lumsdaine, A. 2010. Reducing plenoptic camera artifacts. Comput. Graph. Forum 29, 6.
    23. Goldlücke, B., and Magnor, M. 2003. Joint 3D-reconstruction and background separation in multiple views using graph cuts. In CVPR.
    24. Gortler, S. J., Grzeszczuk, R., Szeliski, R., and Cohen, M. F. 1996. The Lumigraph. In SIGGRAPH.
    25. Hirschmüller, H. 2005. Accurate and efficient stereo processing by semi-global matching and mutual information. In CVPR.
    26. Humayun, A., Mac Aodha, O., and Brostow, G. 2011. Learning to find occlusion regions. In CVPR.
    27. Isaksen, A., McMillan, L., and Gortler, S. J. 2000. Dynamically reparameterized light fields. In SIGGRAPH.
    28. Kang, S. B., and Szeliski, R. 2004. Extracting view-dependent depth maps from a collection of images. IJCV 58, 2.
    29. Kolmogorov, V., and Zabih, R. 2001. Computing visual correspondence with occlusions via graph cuts. In ICCV.
    30. Levoy, M., and Hanrahan, P. 1996. Light field rendering. In SIGGRAPH.
    31. Liang, C.-K., Lin, T.-H., Wong, B.-Y., Liu, C., and Chen, H. H. 2008. Programmable aperture photography: multiplexed light field acquisition. ACM Trans. Graph. 27, 3.
    32. Ng, R., Levoy, M., Brédif, M., Duval, G., Horowitz, M., and Hanrahan, P. 2005. Light field photography with a hand-held plenoptic camera. Comp. Sci. Techn. Rep. CSTR 2.
    33. Rav-Acha, A., Shor, Y., and Peleg, S. 2004. Mosaicing with parallax using time warping. In IVR.
    34. Rhemann, C., Hosni, A., Bleyer, M., Rother, C., and Gelautz, M. 2011. Fast cost-volume filtering for visual correspondence and beyond. In CVPR.
    35. Scharstein, D., and Szeliski, R. 2002. A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. IJCV 47, 1–3.
    36. Schechner, Y. Y., and Kiryati, N. 2000. Depth from defocus vs. stereo: How different really are they? IJCV 39, 2.
    37. Seitz, S. M., and Dyer, C. R. 1999. Photorealistic scene reconstruction by voxel coloring. IJCV 35, 2.
    38. Seitz, S., Curless, B., Diebel, J., Scharstein, D., and Szeliski, R. 2006. A comparison and evaluation of multi-view stereo reconstruction algorithms. In CVPR.
    39. Snavely, N., Seitz, S. M., and Szeliski, R. 2008. Modeling the world from Internet photo collections. IJCV 80, 2.
    40. Stich, T., Tevs, A., and Magnor, M. A. 2006. Global depth from epipolar volumes – a general framework for reconstructing non-lambertian surfaces. In 3DPVT.
    41. Sun, X., Mei, X., Jiao, S., Zhou, M., and Wang, H. 2011. Stereo matching with reliable disparity propagation. In 3DIMPVT.
    42. Sylwan, S. 2010. The application of vision algorithms to visual effects production. In ACCV.
    43. Szeliski, R., and Scharstein, D. 2002. Symmetric sub-pixel stereo matching. In ECCV.
    44. Vaish, V., Levoy, M., Szeliski, R., Zitnick, C., and Kang, S. 2006. Reconstructing occluded surfaces using synthetic apertures: Stereo, focus and robust measures. In CVPR.
    45. Veeraraghavan, A., Raskar, R., Agrawal, A. K., Mohan, A., and Tumblin, J. 2007. Dappled photography: mask enhanced cameras for heterodyned light fields and coded aperture refocusing. ACM Trans. Graph. 26, 3.
    46. Vu, H.-H., Keriven, R., Labatut, P., and Pons, J.-P. 2009. Towards high-resolution large-scale multi-view stereo. In CVPR.
    47. Wanner, S., and Goldlücke, B. 2012. Globally consistent depth labeling of 4D light fields. In CVPR.
    48. Wanner, S., Fehr, J., and Jaehne, B. 2011. Generating EPI representations of 4D light fields with a single lens focused plenoptic camera. In IISVC.
    49. Wilburn, B., Joshi, N., Vaish, V., Talvala, E.-V., Antúnez, E. R., Barth, A., Adams, A., Horowitz, M., and Levoy, M. 2005. High performance imaging using large camera arrays. ACM Trans. Graph. 24, 3.
    50. Wood, D. N., Azuma, D. I., Aldinger, K., Curless, B., Duchamp, T., Salesin, D. H., and Stuetzle, W. 2000. Surface light fields for 3D photography. In SIGGRAPH.
    51. Yu, Y., Ferencz, A., and Malik, J. 2001. Extracting objects from range and radiance images. IEEE TVCG 7, 4.
    52. Zhang, C., and Chen, T. 2004. A self-reconfigurable camera array. In EGSR.
    53. Zhu, Z., Xu, G., and Lin, X. 1999. Panoramic EPI generation and analysis of video from a moving platform with vibration. In CVPR.
    54. Ziegler, R., Bucheli, S., Ahrenberg, L., Magnor, M. A., and Gross, M. H. 2007. A bidirectional light field – hologram transform. Comput. Graph. Forum 26, 3.
    55. Zitnick, C. L., and Kang, S. B. 2007. Stereo for image-based rendering using image over-segmentation. IJCV 75, 1.
    56. Zitnick, C. L., Kang, S. B., Uyttendaele, M., Winder, S., and Szeliski, R. 2004. High-quality video view interpolation using a layered representation. ACM Trans. Graph. 23, 3.

ACM Digital Library Publication: