Instant 3D photography

We present an algorithm for constructing 3D panoramas from a sequence of aligned color-and-depth image pairs. Such sequences can be conveniently captured using dual-lens cell phone cameras that reconstruct depth maps from synchronized stereo image capture. Because of the small baseline and the resulting triangulation error, the depth maps are considerably degraded and contain low-frequency error, which prevents alignment using simple global transformations. We propose a novel optimization that jointly estimates the camera poses and spatially-varying adjustment maps, which are applied to deform the depth maps and bring them into good alignment. When fusing the aligned images into a seamless mosaic, we use a carefully designed data term and the high quality of our depth alignment to remove the need for label smoothness optimization, achieving a speedup of two orders of magnitude over previous solutions that rely on discrete optimization. Our algorithm processes about one input image per second, resulting in an end-to-end runtime of about one minute for mid-sized panoramas. The final 3D panoramas are highly detailed and can be viewed with binocular and head-motion parallax in VR.
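The link between a small baseline and degraded depth can be illustrated with the standard pinhole stereo relation z = f·b/d, whose first-order error is δz ≈ z²/(f·b)·δd. The sketch below is not taken from the paper. The focal length, disparity error, baselines, and function names are all assumed values chosen only for illustration.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Pinhole stereo triangulation: z = f * b / d."""
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px, baseline_m, depth_m, disparity_error_px):
    """First-order depth uncertainty: dz ~= z^2 / (f * b) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


if __name__ == "__main__":
    # All numbers below are assumptions for illustration, not measurements.
    focal_px = 1500.0          # assumed focal length in pixels
    disparity_error_px = 0.25  # assumed stereo matching error in pixels
    for baseline_m in (0.01, 0.10):  # assumed: 1 cm phone-like vs. 10 cm rig
        for depth_m in (1.0, 3.0):
            err = depth_error(focal_px, baseline_m, depth_m, disparity_error_px)
            print(f"baseline={baseline_m:.2f} m  depth={depth_m:.1f} m  "
                  f"error~{err:.3f} m")
```

Under these assumptions, shrinking the baseline by a factor of ten enlarges the depth error by the same factor, and the error grows quadratically with distance. That behavior is consistent with the abstract's remark that simple global transformations cannot align such depth maps.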

SIGGRAPH 2018: Technical Papers

“Instant 3D photography” by Hedman and Kopf

Entry Number: 101

Session/Category Title: 3D Capture
