Learning to reconstruct shape and spatially-varying reflectance from a single image

Reconstructing shape and reflectance properties from images is a highly under-constrained problem that has previously been addressed by using specialized hardware to capture calibrated data or by assuming known (or highly constrained) shape or reflectance. In contrast, we demonstrate that we can recover non-Lambertian, spatially-varying BRDFs and complex geometry belonging to an arbitrary shape class from a single RGB image captured under a combination of unknown environment illumination and flash lighting. We achieve this by training a deep neural network to regress shape and reflectance from the image. Our network is able to address this problem because of three novel contributions. First, we build a large-scale dataset of procedurally generated shapes and real-world complex SVBRDFs that approximate real-world appearance well. Second, single-image inverse rendering requires reasoning at multiple scales, and we propose a cascade network structure that allows this in a tractable manner. Finally, we incorporate an in-network rendering layer that aids the reconstruction task by handling global illumination effects that are important for real-world scenes. Together, these contributions allow us to tackle the entire inverse rendering problem in a holistic manner and produce state-of-the-art results on both synthetic and real data.

Overview Page:

SIGGRAPH Asia 2018: Technical Papers
