“Single-image SVBRDF capture with a rendering-aware deep network” by Deschaintre, Aittala, Durand, Drettakis and Bousseau

  • ©Valentin Deschaintre, Miika Aittala, Frédo Durand, George Drettakis, and Adrien Bousseau

Conference:


Type(s):


Entry Number: 128

Title:

    Single-image SVBRDF capture with a rendering-aware deep network

Session/Category Title:   Learning for Rendering and Material Acquisition


Presenter(s)/Author(s):


Moderator(s):



Abstract:


    Texture, highlights, and shading are some of many visual cues that allow humans to perceive material appearance in single pictures. Yet, recovering spatially-varying bi-directional reflectance distribution functions (SVBRDFs) from a single image based on such cues has challenged researchers in computer graphics for decades. We tackle lightweight appearance capture by training a deep neural network to automatically extract and make sense of these visual cues. Once trained, our network is capable of recovering per-pixel normal, diffuse albedo, specular albedo and specular roughness from a single picture of a flat surface lit by a hand-held flash. We achieve this goal by introducing several innovations on training data acquisition and network design. For training, we leverage a large dataset of artist-created, procedural SVBRDFs which we sample and render under multiple lighting directions. We further amplify the data by material mixing to cover a wide diversity of shading effects, which allows our network to work across many material classes. Motivated by the observation that distant regions of a material sample often offer complementary visual cues, we design a network that combines an encoder-decoder convolutional track for local feature extraction with a fully-connected track for global feature extraction and propagation. Many important material effects are view-dependent, and as such ambiguous when observed in a single image. We tackle this challenge by defining the loss as a differentiable SVBRDF similarity metric that compares the renderings of the predicted maps against renderings of the ground truth from several lighting and viewing directions. Combined together, these novel ingredients bring clear improvement over state of the art methods for single-shot capture of spatially varying BRDFs.

References:


    1. Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. (2015). https://www.tensorflow.org/ Software available from tensorflow.org.Google Scholar
    2. Miika Aittala, Timo Aila, and Jaakko Lehtinen. 2016. Reflectance Modeling by Neural Texture Synthesis. ACM Transactions on Graphics (Proc. SIGGRAPH) 35, 4 (2016). Google ScholarDigital Library
    3. Miika Aittala, Tim Weyrich, and Jaakko Lehtinen. 2015. Two-shot SVBRDF Capture for Stationary Materials. ACM Transactions on Graphics (Proc. SIGGRAPH) 34, 4 (2015). Google ScholarDigital Library
    4. Allegorithmic. 2018. Substance Share. (2018). https://share.allegorithmic.com/Google Scholar
    5. Michael Ashikhmin and Simon Premoze. 2007. Distribution-based BRDFs. Technical Report. University of Utah.Google Scholar
    6. Qifeng Chen and Vladlen Koltun. 2017. Photographic Image Synthesis with Cascaded Refinement Networks. In International Conference on Computer Vision (ICCV).Google Scholar
    7. R. L. Cook and K. E. Torrance. 1982. A Reflectance Model for Computer Graphics. ACM Transactions on Graphics 1, 1 (1982), 7–24. Google ScholarDigital Library
    8. Yue Dong, Guojun Chen, Pieter Peers, Jiawan Zhang, and Xin Tong. 2014. Appearance-from-motion: Recovering Spatially Varying Surface Reflectance Under Unknown Lighting. ACM Transactions on Graphics (Proc. SIGGRAPH Asia) 33, 6 (2014). Google ScholarDigital Library
    9. Yue Dong, Xin Tong, Fabio Pellacini, and Baining Guo. 2011. AppGen: Interactive Material Modeling from a Single Image. ACM Transactions on Graphics (Proc. SIGGRAPH Asia) 30, 6 (2011), 146:1–146:10. Google ScholarDigital Library
    10. Yue Dong, Jinpeng Wang, Xin Tong, John Snyder, Moshe Ben-Ezra, Yanxiang Lan, and Baining Guo. 2010. Manifold Bootstrapping for SVBRDF Capture. ACM Transactions on Graphics (Proc. SIGGRAPH) 29, 4 (2010). Google ScholarDigital Library
    11. Ron O. Dror, Edward H. Adelson, and Alan S. Willsky. 2001. Recognition of Surface Reflectance Properties from a Single Image under Unknown Real-World Illumination. Proc. IEEE Workshop on Identifying Objects Across Variations in Lighting: Psychophysics and Computation (2001).Google Scholar
    12. Dar’ya Guarnera, Giuseppe Claudio Guarnera, Abhijeet Ghosh, Cornelia Denk, and Mashhuda Glencross. 2016. BRDF Representation and Acquisition. Computer Graphics Forum (2016).Google Scholar
    13. Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep Residual Learning for Image Recognition. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).Google Scholar
    14. Z. Hui, K. Sunkavalli, J. Y. Lee, S. Hadap, J. Wang, and A. C. Sankaranarayanan. 2017. Reflectance Capture Using Univariate Sampling of BRDFs. In IEEE International Conference on Computer Vision (ICCV).Google Scholar
    15. Satoshi Iizuka, Edgar Simo-Serra, and Hiroshi Ishikawa. 2016. Let there be Color!: Joint End-to-end Learning of Global and Local Image Priors for Automatic Image Colorization with Simultaneous Classification. ACM Transactions on Graphics (Proc. SIGGRAPH) 35, 4 (2016). Google ScholarDigital Library
    16. C. Innamorati, T. Ritschel, T. Weyrich, and N. Mitra. 2017. Decomposing Single Images for Layered Photo Retouching. Computer Graphics Forum (Proc. EGSR) 36, 4 (2017). Google ScholarDigital Library
    17. Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros. 2017. Image-to-image Translation with Conditional Adversarial Networks. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).Google Scholar
    18. Wenzel Jakob. 2010. Mitsuba renderer. (2010). http://www.mitsuba-renderer.org.Google Scholar
    19. Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. 2018. Progressive Growing of GANs for Improved Quality, Stability, and Variation. In International Conference on Learning Representations (ICLR).Google Scholar
    20. Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In International Conference on Learning Representations (ICLR).Google Scholar
    21. Günter Klambauer, Thomas Unterthiner, Andreas Mayr, and Sepp Hochreiter. 2017. Self-Normalizing Neural Networks. In Advances in Neural Information Processing Systems (NIPS). 972–981.Google Scholar
    22. Hendrik P. A. Lensch, Jan Kautz, Michael Goesele, Wolfgang Heidrich, and Hans-Peter Seidel. 2003. Image-based Reconstruction of Spatial Appearance and Geometric Detail. ACM Transactions on Graphics 22, 2 (2003), 234–257. Google ScholarDigital Library
    23. Xiao Li, Yue Dong, Pieter Peers, and Xin Tong. 2017. Modeling Surface Appearance from a Single Photograph using Self-augmented Convolutional Neural Networks. ACM Transactions on Graphics (Proc. SIGGRAPH) 36, 4 (2017). Google ScholarDigital Library
    24. Guilin Liu, Duygu Ceylan, Ersin Yumer, Jimei Yang, and Jyh-Ming Lien. 2017. Material Editing Using a Physically Based Rendering Network. In IEEE International Conference on Computer Vision (ICCV). 2261–2269.Google ScholarCross Ref
    25. Stephen Lombardi and Ko Nishino. 2016. Reflectance and Illumination Recovery in the Wild. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) 38 (2016), 129–141. Google ScholarDigital Library
    26. Takuya Narihira, Michael Maire, and Stella X. Yu. 2015. Direct Intrinsics: Learning Albedo-Shading Decomposition by Convolutional Regression. In IEEE International Conference on Computer Vision (ICCV). Google ScholarDigital Library
    27. K. Rematas, S. Georgoulis, T. Ritschel, E. Gavves, M. Fritz, L. Van Gool, and T. Tuytelaars. 2017. Reflectance and Natural Illumination from Single-Material Specular Objects Using Deep Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) (2017).Google Scholar
    28. Peiran Ren, Jinpeng Wang, John Snyder, Xin Tong, and Baining Guo. 2011. Pocket Reflectometry. ACM Transactions on Graphics (Proc. SIGGRAPH) 30, 4 (2011). Google ScholarDigital Library
    29. Stephan R. Richter, Vibhav Vineet, Stefan Roth, and Vladlen Koltun. 2016. Playing for Data: Ground Truth from Computer Games. In Proc. European Conference on Computer Vision (ECCV).Google ScholarCross Ref
    30. J. Riviere, P. Peers, and A. Ghosh. 2016. Mobile Surface Reflectometry. Computer Graphics Forum 35, 1 (2016). Google ScholarDigital Library
    31. Jérémy Riviere, Ilya Reshetouski, Luka Filipi, and Abhijeet Ghosh. 2017. Polarization imaging reflectometry in the wild. ACM Transactions on Graphics (Proc. SIGGRAPH) (2017). Google ScholarDigital Library
    32. O. Ronneberger, P.Fischer, and T. Brox. 2015. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention (MICCAI) (LNCS), Vol. 9351. 234–241.Google Scholar
    33. Hao Su, Charles R. Qi, Yangyan Li, and Leonidas J. Guibas. 2015. Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views. In The IEEE International Conference on Computer Vision (ICCV). Google ScholarDigital Library
    34. Ayush Tewari, Michael Zollöfer, Hyeongwoo Kim, Pablo Garrido, Florian Bernard, Patrick Perez, and Theobalt Christian. 2017. MoFA: Model-based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction. In IEEE International Conference on Computer Vision (ICCV).Google Scholar
    35. Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. 2017. Improved Texture Networks: Maximizing Quality and Diversity in Feed-Forward Stylization and Texture Synthesis. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).Google Scholar
    36. Bruce Walter, Stephen R. Marschner, Hongsong Li, and Kenneth E. Torrance. 2007. Microfacet Models for Refraction Through Rough Surfaces. In Proc. of Eurographics Conference on Rendering Techniques (EGSR). Google ScholarDigital Library
    37. Chun-Po Wang, Noah Snavely, and Steve Marschner. 2011. Estimating Dual-scale Properties of Glossy Surfaces from Step-edge Lighting. ACM Transactions on Graphics (Proc. SIGGRAPH Asia) 30, 6 (2011). Google ScholarDigital Library
    38. Xiaolong Wang, Ross B. Girshick, Abhinav Gupta, and Kaiming He. 2018. Non-local Neural Networks. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).Google Scholar
    39. Michael Weinmann, Juergen Gall, and Reinhard Klein. 2014. Material Classification Based on Training Data Synthesized Using a BTF Database. In European Conference on Computer Vision (ECCV). 156–171.Google ScholarCross Ref
    40. Zexiang Xu, Jannik Boll Nielsen, Jiyang Yu, Henrik Wann Jensen, and Ravi Ramamoorthi. 2016. Minimal BRDF Sampling for Two-shot Near-field Reflectance Acquisition. ACM Transactions on Graphics (Proc. SIGGRAPH Asia) 35, 6 (2016). Google ScholarDigital Library
    41. Richard Zhang, Jun-Yan Zhu, Phillip Isola, Xinyang Geng, Angela S Lin, Tianhe Yu, and Alexei A Efros. 2017b. Real-Time User-Guided Image Colorization with Learned Deep Priors. ACM Transactions on Graphics (Proc. SIGGRAPH) 9, 4 (2017). Google ScholarDigital Library
    42. Yinda Zhang, Shuran Song, Ersin Yumer, Manolis Savva, Joon-Young Lee, Hailin Jin. and Thomas A. Funkhouser. 2017a. Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).Google Scholar
    43. Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, and Jiaya Jia. 2017. Pyramid Scene Parsing Network. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).Google Scholar
    44. T. Zickler, R. Ramamoorthi, S. Enrique, and P. N. Belhumeur. 2006. Reflectance sharing: predicting appearance from a sparse set of images of a known shape. IEEE Transactions on Pattern Analysis and Machine Intelligence 28, 8 (2006). Google ScholarDigital Library


ACM Digital Library Publication:



Overview Page: