“Modeling surface appearance from a single photograph using self-augmented convolutional neural networks” by Li, Dong, Peers and Tong

  • © Xiao Li, Yue Dong, Pieter Peers, and Xin Tong


Session/Category Title: Get More Out of Your Photo




    We present a convolutional neural network (CNN) based solution for modeling physically plausible spatially varying surface reflectance functions (SVBRDF) from a single photograph of a planar material sample under unknown natural illumination. Gathering a sufficiently large set of labeled training pairs, each consisting of a photograph of an SVBRDF sample and the corresponding reflectance parameters, is a difficult and arduous process. To reduce the amount of labeled training data required, we propose to leverage the appearance information embedded in unlabeled images of spatially varying materials to self-augment the training process. Starting from an initial approximate network obtained from a small set of labeled training pairs, we estimate provisional model parameters for each unlabeled training exemplar. Given this provisional reflectance estimate, we then synthesize a novel temporary labeled training pair by rendering the exact corresponding image under a new lighting condition. After refining the network using these additional training samples, we re-estimate the provisional model parameters for the unlabeled data and repeat the self-augmentation process until convergence. We demonstrate the efficacy of the proposed network structure on spatially varying wood, metals, and plastics, and thoroughly validate the effectiveness of the self-augmentation training process.
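The self-augmentation loop described in the abstract can be sketched at toy scale. Everything below is an illustrative assumption, not the paper's implementation: a scalar stands in for an image, a cubic polynomial regressor stands in for the CNN, the hypothetical `render` function stands in for the physically based renderer, and the "new lighting condition" step is collapsed into simple re-rendering. The key mechanic survives the simplification: a provisional estimate, once re-rendered, yields an exactly self-consistent labeled pair.

```python
import numpy as np

def render(p):
    # Hypothetical forward model standing in for the physically based
    # renderer: reflectance parameter -> observed "image" value.
    return p + 0.25 * p ** 3

def fit(images, params):
    # "Network training": least-squares cubic regressor, image -> parameter.
    return np.polyfit(images, params, 3)

def predict(model, images):
    # "Network inference": estimate reflectance parameters from images.
    return np.polyval(model, images)

def self_augment(x_lab, p_lab, x_unl, rounds=4):
    model = fit(x_lab, p_lab)              # initial approximate network
    for _ in range(rounds):
        p_prov = predict(model, x_unl)     # provisional parameter estimates
        x_synth = render(p_prov)           # re-render: an exactly consistent
                                           # temporary labeled pair
        model = fit(np.concatenate([x_lab, x_synth]),
                    np.concatenate([p_lab, p_prov]))  # refine, then repeat
    return model

rng = np.random.default_rng(0)
p_lab = np.linspace(0.0, 2.0, 10)
x_lab = render(p_lab)                                # small labeled set,
p_lab_noisy = p_lab + 0.1 * rng.standard_normal(10)  # with imperfect labels
p_hidden = rng.uniform(0.0, 2.0, 300)                # true params, never seen
x_unl = render(p_hidden)                             # unlabeled "photographs"

model = self_augment(x_lab, p_lab_noisy, x_unl)
```

Because every synthesized pair lies exactly on the graph of the renderer, the abundant unlabeled data pulls the regressor toward a consistent inverse of `render`, even though the few labeled examples carry noisy labels.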
