“Deep3DLayout: 3D reconstruction of an indoor layout from a spherical panoramic image” by Pintore, Almansa, Agus and Gobbetti – ACM SIGGRAPH HISTORY ARCHIVES

  • 2021 SA Technical Papers_Pintore_Deep3DLayout: 3D reconstruction of an indoor layout from a spherical panoramic image

Title:

    Deep3DLayout: 3D reconstruction of an indoor layout from a spherical panoramic image

Session/Category Title: Reconstruction


Presenter(s)/Author(s):

    Pintore, Almansa, Agus and Gobbetti

Abstract:


    Recovering the 3D shape of the bounding permanent surfaces of a room from a single image is a key component of indoor reconstruction pipelines. In this article, we introduce a novel deep learning technique capable of producing, at interactive rates, a tessellated bounding 3D surface from a single 360° image. Unlike prior solutions, we address the problem fully in 3D, significantly expanding the reconstruction space. A graph convolutional network directly infers the room structure as a 3D mesh by progressively deforming a graph-encoded tessellated sphere mapped to the spherical panorama, leveraging perceptual features extracted from the input image. Our design exploits important 3D properties of indoor environments. In particular, gravity-aligned features are actively incorporated into the graph by a projection layer that exploits the recent concept of multi-head self-attention, and specialized losses guide the network towards plausible solutions even in the presence of massive clutter and occlusions. Extensive experiments demonstrate that our approach outperforms current state-of-the-art methods in terms of accuracy and of the capability to reconstruct more complex environments.
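
The starting point described in the abstract, a tessellated sphere whose vertices are mapped to the spherical panorama, can be illustrated with a minimal sketch. This is not the authors' implementation: the octahedron-based tessellation, the function names, the z-up (gravity-aligned) axis convention, and the equirectangular pixel layout are all assumptions chosen for illustration, and the graph convolutional network, feature extraction, and losses are not shown.

```python
import math

def octahedron():
    """Return (vertices, faces) of a unit octahedron, a coarse sphere mesh."""
    verts = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
             (0.0, -1.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
    faces = [(0, 2, 4), (2, 1, 4), (1, 3, 4), (3, 0, 4),
             (2, 0, 5), (1, 2, 5), (3, 1, 5), (0, 3, 5)]
    return verts, faces

def normalize(v):
    """Scale a 3D vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def subdivide(verts, faces):
    """Split each triangle into four; new midpoints are pushed onto the unit sphere."""
    verts = list(verts)
    cache = {}  # edge (i, j) -> index of its midpoint vertex, shared by both faces

    def midpoint(i, j):
        key = (min(i, j), max(i, j))
        if key not in cache:
            mid = tuple((a + b) / 2.0 for a, b in zip(verts[i], verts[j]))
            cache[key] = len(verts)
            verts.append(normalize(mid))
        return cache[key]

    new_faces = []
    for a, b, c in faces:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        new_faces += [(a, ab, ca), (b, bc, ab), (c, ca, bc), (ab, bc, ca)]
    return verts, new_faces

def direction_to_equirect(v, width, height):
    """Map a 3D direction to (column, row) in an equirectangular panorama.

    Assumed convention (hypothetical): z is up, longitude is measured from +x,
    the image center looks along +x, and row 0 is the zenith.
    """
    x, y, z = normalize(v)
    lon = math.atan2(y, x)                      # in [-pi, pi]
    lat = math.asin(max(-1.0, min(1.0, z)))     # in [-pi/2, pi/2]
    col = (lon / (2.0 * math.pi) + 0.5) * width
    row = (0.5 - lat / math.pi) * height
    return col, row

if __name__ == "__main__":
    verts, faces = octahedron()
    for _ in range(2):
        verts, faces = subdivide(verts, faces)
    # Each sphere vertex now has a panorama location where image features
    # could be sampled before a network predicts per-vertex displacements.
    pixels = [direction_to_equirect(v, 1024, 512) for v in verts]
    print(len(verts), len(faces), pixels[0])
```

In a mesh-deformation setting of this kind, the per-vertex panorama coordinates would typically be used to look up image features for each graph node; the sketch stops at that mapping.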
