“Multi-view relighting using a geometry-aware network” by Philip, Gharbi, Zhou, Efros and Drettakis
Title:
- Multi-view relighting using a geometry-aware network
Session/Category Title: Relighting and View Synthesis
Abstract:
We propose the first learning-based algorithm that can relight images in a plausible and controllable manner given multiple views of an outdoor scene. In particular, we introduce a geometry-aware neural network that utilizes multiple geometry cues (normal maps, specular direction, etc.) and source and target shadow masks computed from a noisy proxy geometry obtained by multi-view stereo. Our model is a three-stage pipeline: two subnetworks refine the source and target shadow masks, and a third performs the final relighting. Furthermore, we introduce a novel representation for the shadow masks, which we call RGB shadow images. They reproject the colors from all views into the shadowed pixels and enable our network to cope with inaccuracies in the proxy and the non-locality of the shadow-casting interactions. Acquiring large-scale multi-view relighting datasets for real scenes is challenging, so we train our network on photorealistic synthetic data. At training time, we also compute a noisy stereo-based geometric proxy, this time from the synthetic renderings. This allows us to bridge the gap between the real and synthetic domains. Our model generalizes well to real scenes. It can alter the illumination of drone footage, image-based renderings, textured mesh reconstructions, and even internet photo collections.
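The three-stage data flow described in the abstract can be sketched roughly as follows. This is only an illustration of the wiring, not the authors' implementation: the function names, the channel counts, the choice of which inputs each subnetwork receives, and the stub "networks" are all assumptions made for the example.

```python
import numpy as np

def relight(image, cues, src_shadow_rgb, tgt_shadow_rgb,
            refine_src, refine_tgt, relighter):
    """Wire three stages together. The three callables are hypothetical
    placeholders for trained subnetworks; their inputs are assumed."""
    # Stages 1 and 2: each subnetwork refines one noisy shadow mask.
    src_mask = refine_src(np.concatenate([image, cues, src_shadow_rgb], axis=-1))
    tgt_mask = refine_tgt(np.concatenate([image, cues, tgt_shadow_rgb], axis=-1))
    # Stage 3: final relighting from the image, cues and both refined masks.
    return relighter(np.concatenate([image, cues, src_mask, tgt_mask], axis=-1))

def stub_mask_net(x):
    # Hypothetical stand-in: collapse channels to a one-channel mask in [0, 1].
    return np.clip(x.mean(axis=-1, keepdims=True), 0.0, 1.0)

def stub_relighter(x):
    # Hypothetical stand-in: pass the first three (RGB) channels through.
    return np.clip(x[..., :3], 0.0, 1.0)

rng = np.random.default_rng(0)
h, w = 4, 6
image = rng.random((h, w, 3))           # input view (RGB)
cues = rng.random((h, w, 6))            # e.g. normals + specular direction (assumed layout)
src_shadow_rgb = rng.random((h, w, 3))  # RGB shadow image, source lighting
tgt_shadow_rgb = rng.random((h, w, 3))  # RGB shadow image, target lighting

out = relight(image, cues, src_shadow_rgb, tgt_shadow_rgb,
              stub_mask_net, stub_mask_net, stub_relighter)
```

In a real system the stubs would be replaced by trained networks; the sketch only shows that the two mask-refinement stages run independently and that both refined masks feed the final relighting stage.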