Scene reconstruction from high spatio-angular resolution light fields

This paper describes a method for scene reconstruction of complex, detailed environments from 3D light fields. Densely sampled light fields on the order of 10^9 light rays allow us to capture the real world in unparalleled detail, but efficiently processing this amount of data to generate an equally detailed reconstruction represents a significant challenge to existing algorithms. We propose an algorithm that leverages coherence in massive light fields by breaking with a number of established practices in image-based reconstruction. Our algorithm first computes reliable depth estimates specifically around object boundaries instead of interior regions, by operating on individual light rays instead of image patches. More homogeneous interior regions are then processed in a fine-to-coarse procedure rather than the standard coarse-to-fine approaches. At no point in our method is any form of global optimization performed. This allows our algorithm to retain precise object contours while still ensuring smooth reconstructions in less detailed areas. While the core reconstruction method handles general unstructured input, we also introduce a sparse representation and a propagation scheme for reliable depth estimates which make our algorithm particularly effective for 3D input, enabling fast and memory-efficient processing of “Gigaray light fields” on a standard GPU. We show dense 3D reconstructions of highly detailed scenes, enabling applications such as automatic segmentation and image-based rendering, and provide an extensive evaluation and comparison to existing image-based reconstruction techniques.
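The control flow described in the abstract (per-ray depth estimates at high-contrast pixels first, then progressively coarser levels for the remaining homogeneous regions, with no global optimization) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single 2D epipolar-plane image `epi[s, u]` (view index `s`, pixel column `u`) taken from a 3D light field, and it substitutes a plain intensity-variance cost and a neighbour-difference edge measure for whatever scores the paper actually uses. All function names, parameters, and the synthetic test data are hypothetical.

```python
import numpy as np

def make_synthetic_epi(n_views=9, width=64, disparity=2, seed=0):
    """Hypothetical test data: one textured fronto-parallel plane, so every
    scene point shifts by `disparity` pixels between neighbouring views."""
    rng = np.random.default_rng(seed)
    offset = disparity * (n_views - 1)
    texture = rng.random(width + offset)
    rows = [texture[offset - disparity * s: offset - disparity * s + width]
            for s in range(n_views)]
    return np.stack(rows)

def edge_confidence(row, radius=2):
    """Sum of squared differences to neighbours within `radius` pixels.
    Large values mark high-contrast pixels (assumed stand-in edge measure)."""
    padded = np.pad(row, radius, mode="edge")
    conf = np.zeros(row.shape, dtype=float)
    for off in range(2 * radius + 1):
        conf += (padded[off:off + row.size] - row) ** 2
    return conf

def best_disparity(epi, s_ref, u, candidates):
    """Per-ray estimate: pick the candidate whose line through (s_ref, u)
    collects the most consistent intensities (lowest variance)."""
    n_views, width = epi.shape
    s = np.arange(n_views)
    best, best_cost = None, np.inf
    for d in candidates:
        cols = np.rint(u + (s - s_ref) * d).astype(int)
        valid = (cols >= 0) & (cols < width)
        if valid.sum() < 2:          # too few samples to judge consistency
            continue
        cost = np.var(epi[s[valid], cols[valid]])
        if cost < best_cost:
            best, best_cost = d, cost
    return best

def fine_to_coarse(epi, s_ref, candidates, threshold, levels=3):
    """Estimate at full resolution where edge confidence is high, then
    halve the resolution and fill only the pixels that are still unknown."""
    disparity = np.full(epi.shape[1], np.nan)   # NaN marks "no estimate yet"
    scale = 1
    for _ in range(levels):
        conf = edge_confidence(epi[s_ref])
        for u in np.flatnonzero(conf > threshold):
            block = disparity[u * scale:(u + 1) * scale]  # view into result
            if not np.isnan(block).any():
                continue             # finer levels already covered this
            d = best_disparity(epi, s_ref, u, candidates / scale)
            if d is not None:
                block[np.isnan(block)] = d * scale  # back to full-res pixels
        if epi.shape[1] < 4:
            break
        even = epi.shape[1] // 2 * 2
        epi = 0.5 * (epi[:, 0:even:2] + epi[:, 1:even:2])  # halve width
        scale *= 2
    return disparity
```

For example, `fine_to_coarse(make_synthetic_epi(), 4, np.array([0., 1., 2., 3.]), threshold=0.0)` processes the reference view `s = 4` of the synthetic plane. Estimates made at finer levels are never overwritten by coarser ones, which is the property the abstract credits for preserving precise object contours.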

SIGGRAPH 2013: Technical Papers

“Scene reconstruction from high spatio-angular resolution light fields” by Kim, Zimmer, Pritch, Sorkine-Hornung and Gross

Session/Category Title: Image-Based Reconstruction
