“Deep view synthesis from sparse photometric images” by Xu, Bi, Sunkavalli, Hadap, Su, et al. …

  • Zexiang Xu, Sai Bi, Kalyan Sunkavalli, Sunil Hadap, Hao Su, and Ravi Ramamoorthi

Title:

    Deep view synthesis from sparse photometric images

Session/Category Title: Relighting and View Synthesis


Abstract:


    The goal of light transport acquisition is to take images from a sparse set of lighting and viewing directions, and combine them to enable arbitrary relighting with changing view. While relighting from sparse images has received significant attention, there has been relatively less progress on view synthesis from a sparse set of “photometric” images—images captured under controlled conditions, lit by a single directional source; we use a spherical gantry to position the camera on a sphere surrounding the object. In this paper, we synthesize novel viewpoints across a wide range of viewing directions (covering a 60° cone) from a sparse set of just six viewing directions. While our approach relates to previous view synthesis and image-based rendering techniques, those methods are usually restricted to much smaller baselines and operate on images captured under environment illumination. At our baselines, input images have few correspondences and large occlusions; however, we benefit from structured photometric images. Our method is based on a deep convolutional network trained to directly synthesize new views from the six input views. This network combines 3D convolutions on a plane sweep volume with a novel per-view, per-depth-plane attention map prediction network to effectively aggregate multi-view appearance. We train our network on a large-scale synthetic dataset of 1000 scenes with complex geometry and material properties. In practice, it synthesizes novel viewpoints for captured real data and reproduces complex appearance effects such as occlusions, view-dependent specularities, and hard shadows. Moreover, the method can be combined with previous relighting techniques to enable changing both lighting and view, and applied to computer vision problems such as multiview stereo from sparse image sets.
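
    The aggregation idea described above (3D convolutions over a plane sweep volume, plus per-view, per-depth-plane attention maps) can be illustrated with a short, hypothetical PyTorch sketch. Everything below is an assumption for illustration only: the module name, layer counts, channel sizes, and the softmax-weighted fusion are a minimal rendition of the idea, not the authors' exact architecture.

    # Minimal sketch (assumed layout): each of the 6 input views is warped onto
    # D depth planes of the target camera to form a plane sweep volume (PSV) of
    # shape (B, V, C, D, H, W). A shared 3D conv tower extracts features per view,
    # a 3D conv head predicts one attention logit per view/depth plane/pixel, and
    # a softmax across views fuses the warped view colors.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AttentionAggregation(nn.Module):
        def __init__(self, in_ch=3, feat_ch=16):
            super().__init__()
            # Shared 3D convolutions; depth planes act as the third spatial axis.
            self.encoder = nn.Sequential(
                nn.Conv3d(in_ch, feat_ch, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv3d(feat_ch, feat_ch, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
            )
            # One attention logit per view, per depth plane, per pixel.
            self.attention = nn.Conv3d(feat_ch, 1, kernel_size=3, padding=1)

        def forward(self, psv):
            b, v, c, d, h, w = psv.shape
            feat = self.encoder(psv.reshape(b * v, c, d, h, w))   # (B*V, F, D, H, W)
            logits = self.attention(feat).reshape(b, v, 1, d, h, w)
            # Softmax over the view axis: occluded or unreliable views get low weight.
            weights = F.softmax(logits, dim=1)                    # (B, V, 1, D, H, W)
            fused = (weights * psv).sum(dim=1)                    # (B, C, D, H, W)
            return fused, weights

    # Toy usage: 6 views, 32 depth planes, 64x64 crops.
    psv = torch.randn(1, 6, 3, 32, 64, 64)
    fused, weights = AttentionAggregation()(psv)
    print(fused.shape, weights.shape)

    A softmax across the view dimension is one plausible way to realize per-view, per-depth-plane attention: it lets the network down-weight views in which a surface point is occluded or dominated by a view-dependent specularity before later layers synthesize the final image.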

References:


    1. Jonathan T Barron and Jitendra Malik. 2015. Shape, illumination, and reflectance from shading. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 37, 8 (2015), 1670–1687.
    2. Sai Bi, Nima Khademi Kalantari, and Ravi Ramamoorthi. 2017. Patch-based optimization for image-based texture mapping. ACM Transactions on Graphics (TOG) 36, 4 (2017).
    3. Chris Buehler, Michael Bosse, Leonard McMillan, Steven Gortler, and Michael Cohen. 2001. Unstructured lumigraph rendering. In Proceedings of the 28th annual conference on Computer graphics and interactive techniques. ACM, 425–432.
    4. Angel X Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, et al. 2015. ShapeNet: An information-rich 3D model repository. arXiv preprint arXiv:1512.03012 (2015).
    5. Gaurav Chaurasia, Sylvain Duchene, Olga Sorkine-Hornung, and George Drettakis. 2013. Depth synthesis and local warps for plausible image-based navigation. ACM Transactions on Graphics (TOG) 32, 3 (2013), 30.
    6. Gaurav Chaurasia, Olga Sorkine, and George Drettakis. 2011. Silhouette-Aware Warping for Image-Based Rendering. In Computer Graphics Forum, Vol. 30. Wiley Online Library, 1223–1232.
    7. Anpei Chen, Minye Wu, Yingliang Zhang, Nianyi Li, Jie Lu, Shenghua Gao, and Jingyi Yu. 2018. Deep Surface Light Fields. Proc. ACM Comput. Graph. Interact. Tech. 1, 1, Article 14 (July 2018), 17 pages.
    8. Shenchang Eric Chen and Lance Williams. 1993. View Interpolation for Image Synthesis. In Proceedings of SIGGRAPH. 279–288.
    9. Lukasz Dąbała, Matthias Ziegler, Piotr Didyk, Frederik Zilly, Joachim Keinert, Karol Myszkowski, H-P Seidel, Przemyslaw Rokita, and Tobias Ritschel. 2016. Efficient Multi-image Correspondences for On-line Light Field Video Processing. In Computer Graphics Forum, Vol. 35. Wiley Online Library, 401–410.
    10. James Davis, Diego Nehab, Ravi Ramamoorthi, and Szymon Rusinkiewicz. 2005. Spacetime Stereo: A Unifying Framework for Depth from Triangulation. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 27, 2 (Feb. 2005), 296–302.
    11. Paul Debevec, Tim Hawkins, Chris Tchou, Haarm-Pieter Duiker, Westley Sarokin, and Mark Sagar. 2000. Acquiring the reflectance field of a human face. In Proceedings of the 27th annual conference on Computer graphics and interactive techniques. ACM Press/Addison-Wesley Publishing Co., 145–156.
    12. Paul E Debevec, Camillo J Taylor, and Jitendra Malik. 1996. Modeling and rendering architecture from photographs: A hybrid geometry- and image-based approach. In Proceedings of the 23rd annual conference on Computer graphics and interactive techniques. ACM, 11–20.
    13. Valentin Deschaintre, Miika Aittala, Fredo Durand, George Drettakis, and Adrien Bousseau. 2018. Single-image SVBRDF capture with a rendering-aware deep network. ACM Transactions on Graphics (TOG) 37, 4 (2018), 128.
    14. David Eigen and Rob Fergus. 2015. Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture. In Proceedings of the IEEE International Conference on Computer Vision (ICCV). 2650–2658.
    15. Martin Eisemann, Bert De Decker, Marcus Magnor, Philippe Bekaert, Edilson De Aguiar, Naveed Ahmed, Christian Theobalt, and Anita Sellent. 2008. Floating textures. In Computer Graphics Forum, Vol. 27. Wiley Online Library, 409–418.
    16. John Flynn, Ivan Neulander, James Philbin, and Noah Snavely. 2016. DeepStereo: Learning to predict new views from the world’s imagery. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 5515–5524.
    17. Ryo Furukawa, Hiroshi Kawasaki, Katsushi Ikeuchi, and Masao Sakauchi. 2002. Appearance Based Object Modeling using Texture Database: Acquisition Compression and Rendering. In Rendering Techniques. 257–266.
    18. Andreas Geiger, Philip Lenz, and Raquel Urtasun. 2012. Are we ready for autonomous driving? The KITTI vision benchmark suite. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 3354–3361.
    19. Steven J Gortler, Radek Grzeszczuk, Richard Szeliski, and Michael F Cohen. 1996. The lumigraph. In Proceedings of the 23rd annual conference on Computer graphics and interactive techniques. ACM, 43–54.
    20. Peter Hedman, Julien Philip, True Price, Jan-Michael Frahm, George Drettakis, and Gabriel Brostow. 2018. Deep blending for free-viewpoint image-based rendering. In SIGGRAPH Asia 2018 Technical Papers. ACM, 257.
    21. Michael Holroyd, Jason Lawrence, and Todd Zickler. 2010. A Coaxial Optical Scanner for Synchronous Acquisition of 3D Geometry and Surface Reflectance. ACM Trans. Graph. 29, 4, Article 99 (July 2010), 99:1–99:12 pages.
    22. Po-Han Huang, Kevin Matzen, Johannes Kopf, Narendra Ahuja, and Jia-Bin Huang. 2018. DeepMVS: Learning Multi-View Stereopsis. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
    23. Zhuo Hui, Kalyan Sunkavalli, Joon-Young Lee, Sunil Hadap, Jian Wang, and Aswin C Sankaranarayanan. 2017. Reflectance capture using univariate sampling of BRDFs. In The IEEE International Conference on Computer Vision (ICCV), Vol. 2.
    24. Nima Khademi Kalantari, Ting-Chun Wang, and Ravi Ramamoorthi. 2016. Learning-based view synthesis for light field cameras. ACM Transactions on Graphics (TOG) 35, 6 (2016), 193.
    25. Marc Levoy and Pat Hanrahan. 1996. Light field rendering. In Proceedings of the 23rd annual conference on Computer graphics and interactive techniques. ACM, 31–42.
    26. Xiao Li, Yue Dong, Pieter Peers, and Xin Tong. 2017. Modeling surface appearance from a single photograph using self-augmented convolutional neural networks. ACM Transactions on Graphics (TOG) 36, 4 (2017), 45.
    27. Zhengqin Li, Kalyan Sunkavalli, and Manmohan Chandraker. 2018a. Materials for Masses: SVBRDF Acquisition with a Single Mobile Phone Image. In ECCV.
    28. Zhengqin Li, Zexiang Xu, Ravi Ramamoorthi, Kalyan Sunkavalli, and Manmohan Chandraker. 2018b. Learning to reconstruct shape and spatially-varying reflectance from a single image. In SIGGRAPH Asia 2018 Technical Papers. ACM, 269.
    29. Tom Malzbender, Dan Gelb, and Hans Wolters. 2001. Polynomial Texture Maps. In Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH ’01). 519–528.
    30. Giljoo Nam, Joo Ho Lee, Diego Gutierrez, and Min H Kim. 2018. Practical SVBRDF acquisition of 3D objects with unstructured flash photography. In SIGGRAPH Asia 2018 Technical Papers. ACM, 267.
    31. Eunbyung Park, Jimei Yang, Ersin Yumer, Duygu Ceylan, and Alexander C Berg. 2017. Transformation-grounded image generation network for novel 3D view synthesis. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 702–711.
    32. Pieter Peers, Dhruv K Mahajan, Bruce Lamond, Abhijeet Ghosh, Wojciech Matusik, Ravi Ramamoorthi, and Paul Debevec. 2009. Compressive light transport sensing. ACM Transactions on Graphics (TOG) 28, 1 (2009), 3.
    33. Eric Penner and Li Zhang. 2017. Soft 3D reconstruction for view synthesis. ACM Transactions on Graphics (TOG) 36, 6 (2017), 235.
    34. Konstantinos Rematas, Tobias Ritschel, Mario Fritz, Efstratios Gavves, and Tinne Tuytelaars. 2016. Deep reflectance maps. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 4508–4516.
    35. Johannes L Schönberger, Enliang Zheng, Jan-Michael Frahm, and Marc Pollefeys. 2016. Pixelwise view selection for unstructured multi-view stereo. In Proceedings of the European Conference on Computer Vision (ECCV). Springer, 501–518.
    36. Christopher Schwartz, Michael Weinmann, Roland Ruiters, and Reinhard Klein. 2011. Integrated High-Quality Acquisition of Geometry and Appearance for Cultural Heritage. In VAST, Vol. 2011. 25–32.
    37. Sudipta Sinha, Drew Steedly, and Rick Szeliski. 2009. Piecewise planar stereo for image-based rendering. (2009).
    38. Pratul P Srinivasan, Tongzhou Wang, Ashwin Sreelal, Ravi Ramamoorthi, and Ren Ng. 2017. Learning to synthesize a 4D RGBD light field from a single image. In IEEE International Conference on Computer Vision (ICCV). 2262–2270.
    39. Shao-Hua Sun, Minyoung Huh, Yuan-Hong Liao, Ning Zhang, and Joseph J Lim. 2018. Multi-view to Novel View: Synthesizing Novel Views with Self-Learned Confidence. In Proceedings of the European Conference on Computer Vision (ECCV).
    40. Maxim Tatarchenko, Alexey Dosovitskiy, and Thomas Brox. 2015. Single-view to multi-view: Reconstructing unseen views with a convolutional network. CoRR abs/1511.06702 (2015).
    41. Suren Vagharshakyan, Robert Bregovic, and Atanas Gotchev. 2018. Light field reconstruction using shearlet transform. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 40, 1 (2018), 133–147.
    42. Michael Weinmann and Reinhard Klein. 2015. Advances in Geometry and Reflectance Acquisition (Course Notes). In SIGGRAPH Asia 2015 Courses. Article 1, 1:1–1:71 pages.
    43. Tim Weyrich, Jason Lawrence, Hendrik P. A. Lensch, Szymon Rusinkiewicz, and Todd Zickler. 2009. Principles of Appearance Acquisition and Representation. Found. Trends. Comput. Graph. Vis. 4, 2 (Feb. 2009), 75–191.
    44. Tim Weyrich, Wojciech Matusik, Hanspeter Pfister, Bernd Bickel, Craig Donner, Chien Tu, Janet McAndless, Jinho Lee, Addy Ngan, Henrik Wann Jensen, and Markus Gross. 2006. Analysis of Human Faces Using a Measurement-based Skin Reflectance Model. ACM Trans. Graph. 25, 3 (July 2006), 1013–1024.
    45. Daniel N Wood, Daniel I Azuma, Ken Aldinger, Brian Curless, Tom Duchamp, David H Salesin, and Werner Stuetzle. 2000. Surface light fields for 3D photography. In Proceedings of the 27th annual conference on Computer graphics and interactive techniques. ACM Press/Addison-Wesley Publishing Co., 287–296.
    46. Robert J Woodham. 1980. Photometric method for determining surface orientation from multiple images. Optical Engineering 19, 1 (1980), 191139.
    47. Yuxin Wu and Kaiming He. 2018. Group normalization. In Proceedings of the European Conference on Computer Vision (ECCV). 3–19.
    48. Rui Xia, Yue Dong, Pieter Peers, and Xin Tong. 2016. Recovering shape and spatially-varying surface reflectance under unknown illumination. ACM Transactions on Graphics (TOG) 35, 6 (2016), 187.
    49. Zexiang Xu, Jannik Boll Nielsen, Jiyang Yu, Henrik Wann Jensen, and Ravi Ramamoorthi. 2016. Minimal BRDF sampling for two-shot near-field reflectance acquisition. ACM Transactions on Graphics (TOG) 35, 6 (2016), 188.
    50. Zexiang Xu, Kalyan Sunkavalli, Sunil Hadap, and Ravi Ramamoorthi. 2018. Deep image-based relighting from optimal sparse samples. ACM Transactions on Graphics (TOG) 37, 4 (2018), 126.
    51. Jimei Yang, Scott E Reed, Ming-Hsuan Yang, and Honglak Lee. 2015. Weakly-supervised disentangling with recurrent transformations for 3D view synthesis. In Advances in Neural Information Processing Systems. 1099–1107.
    52. Li Yao, Yunjian Liu, and Weixin Xu. 2016. Real-time virtual view synthesis using light field. EURASIP Journal on Image and Video Processing 2016, 1 (2016), 25.
    53. Yao Yao, Zixin Luo, Shiwei Li, Tian Fang, and Long Quan. 2018. MVSNet: Depth Inference for Unstructured Multi-view Stereo. In Proceedings of the European Conference on Computer Vision (ECCV).
    54. Qian-Yi Zhou and Vladlen Koltun. 2014. Color map optimization for 3D reconstruction with consumer depth cameras. ACM Transactions on Graphics (TOG) 33, 4 (2014), 155.
    55. Tinghui Zhou, Richard Tucker, John Flynn, Graham Fyffe, and Noah Snavely. 2018. Stereo magnification: learning view synthesis using multiplane images. ACM Transactions on Graphics (TOG) 37, 4 (2018), 65.
    56. Tinghui Zhou, Shubham Tulsiani, Weilun Sun, Jitendra Malik, and Alexei A Efros. 2016b. View synthesis by appearance flow. In European Conference on Computer Vision (ECCV). Springer, 286–301.
    57. Zhiming Zhou, Guojun Chen, Yue Dong, David Wipf, Yong Yu, John Snyder, and Xin Tong. 2016a. Sparse-as-possible SVBRDF acquisition. ACM Transactions on Graphics (TOG) 35, 6 (2016), 189.
    58. Zhenglong Zhou, Zhe Wu, and Ping Tan. 2013. Multi-view photometric stereo with spatially varying isotropic materials. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1482–1489.

