Deep reverse tone mapping

Inferring a high dynamic range (HDR) image from a single low dynamic range (LDR) input is an ill-posed problem in which we must compensate for data lost to under-/over-exposure and color quantization. To tackle this, we propose the first deep-learning-based approach for fully automatic inference using convolutional neural networks. Because directly inferring a 32-bit HDR image from an 8-bit LDR image is intractable due to the difficulty of training, we take an indirect approach: the key idea of our method is to synthesize LDR images taken with different exposures (i.e., bracketed images) using supervised learning, and then to reconstruct an HDR image by merging them. By learning the relative changes of pixel values due to increased/decreased exposures with 3D deconvolutional networks, our method can reproduce not only natural tones without introducing visible noise but also the colors of saturated pixels. We demonstrate the effectiveness of our method by comparing our results both with those of conventional methods and with ground-truth HDR images.
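
The merging step mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the learned exposure synthesis is replaced here by a hypothetical `simulate_ldr` function that renders exposures from known radiance, the camera response is assumed to be a plain gamma curve (`GAMMA = 2.2`), and the merge is a generic hat-weighted average in the spirit of classic bracketed-exposure HDR recovery. All function names and constants are assumptions made for illustration.

```python
GAMMA = 2.2  # assumed camera response: pixel = clip(radiance * time) ** (1 / GAMMA)


def simulate_ldr(radiance, exposure_time):
    """Render one 8-bit exposure from linear radiance values.

    Hypothetical stand-in for the network that synthesizes bracketed images.
    """
    pixels = []
    for e in radiance:
        linear = min(1.0, max(0.0, e * exposure_time))  # sensor clipping
        pixels.append(int(round(255 * linear ** (1.0 / GAMMA))))  # quantization
    return pixels


def hat_weight(z):
    """Trust mid-tones most; nearly ignore values close to 0 or 255."""
    return min(z, 255 - z) / 127.5 + 1e-3  # small floor avoids division by zero


def merge_hdr(ldr_stack, exposure_times):
    """Merge bracketed 8-bit images into per-pixel linear radiance estimates."""
    hdr = []
    for i in range(len(ldr_stack[0])):
        num = 0.0
        den = 0.0
        for image, t in zip(ldr_stack, exposure_times):
            z = image[i]
            w = hat_weight(z)
            # Invert the assumed response, then divide out the exposure time.
            num += w * ((z / 255.0) ** GAMMA) / t
            den += w
        hdr.append(num / den)
    return hdr


if __name__ == "__main__":
    radiance = [0.01, 0.2, 1.0, 4.0]   # made-up scene values
    times = [0.125, 0.5, 2.0, 8.0]     # made-up exposure bracket
    stack = [simulate_ldr(radiance, t) for t in times]
    print(merge_hdr(stack, times))
```

In this toy setting a single exposure at time 0.5 clips the brightest value to 255, while the merged stack recovers it from the shorter exposure, which is the reason for going through bracketed images instead of expanding one LDR image directly.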

Overview Page:

SIGGRAPH Asia 2017: Technical Papers

“Deep reverse tone mapping” by Endo, Kanamori and Mitani
