“HDR image reconstruction from a single exposure using deep CNNs” by Eilertsen, Kronander, Denes, Mantiuk and Unger
Title:
- HDR image reconstruction from a single exposure using deep CNNs
Session/Category Title: HDR and Image Manipulation
Presenter(s)/Author(s):
- Eilertsen, Kronander, Denes, Mantiuk and Unger
Abstract:
Camera sensors can only capture a limited range of luminance simultaneously, and in order to create high dynamic range (HDR) images a set of different exposures is typically combined. In this paper we address the problem of predicting information that has been lost in saturated image areas, in order to enable HDR reconstruction from a single exposure. We show that this problem is well-suited for deep learning algorithms, and propose a deep convolutional neural network (CNN) that is specifically designed to take into account the challenges in predicting HDR values. To train the CNN we gather a large dataset of HDR images, which we augment by simulating sensor saturation for a range of cameras. To further boost robustness, we pre-train the CNN on a simulated HDR dataset created from a subset of the MIT Places database. We demonstrate that our approach can reconstruct high-resolution, visually convincing HDR results in a wide range of situations, and that it generalizes well to reconstruction of images captured with arbitrary and low-end cameras that use unknown camera response functions and post-processing. Furthermore, we compare to existing methods for HDR expansion, and also show high-quality results for image-based lighting. Finally, we evaluate the results in a subjective experiment performed on an HDR display. This shows that the reconstructed HDR images are visually convincing, with large improvements compared to existing methods.
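The abstract mentions augmenting HDR training data by simulating sensor saturation. A minimal sketch of that idea is given below, under stated assumptions: the function name `simulate_ldr_capture`, the plain gamma curve standing in for a camera response function, and the 8-bit quantization are illustrative choices, not the authors' actual pipeline or parameters.

```python
import numpy as np

def simulate_ldr_capture(hdr, exposure=1.0, gamma=2.2, bits=8):
    """Simulate a single-exposure LDR capture of a linear HDR image.

    hdr: array of linear scene radiance values (any shape).
    Returns (ldr, saturated): the quantized image in [0, 1] and a boolean
    mask of values that reached the simulated sensor's maximum.

    Assumption: a simple gamma curve stands in for the camera response
    function; real cameras use other, often unknown, curves.
    """
    hdr = np.asarray(hdr, dtype=np.float64)
    scaled = hdr * exposure               # exposure scales incoming light
    saturated = scaled >= 1.0             # values at or above sensor maximum
    clipped = np.clip(scaled, 0.0, 1.0)   # saturation discards information here
    encoded = clipped ** (1.0 / gamma)    # hypothetical response curve
    levels = 2 ** bits - 1
    ldr = np.round(encoded * levels) / levels  # quantize to the bit depth
    return ldr, saturated

# Usage: the two brightest values both map to 1.0, so the difference
# between them is lost in the simulated capture.
hdr = np.array([0.0, 0.25, 1.0, 8.0])
ldr, mask = simulate_ldr_capture(hdr, exposure=1.0)
```

Pairs of `ldr` and the original `hdr` could then serve as input and target for a reconstruction model, with `mask` marking the regions where values must be predicted rather than recovered.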
References:
1. A. O. Akyüz, R. Fleming, B. E. Riecke, E. Reinhard, and H. H. Bülthoff. 2007. Do HDR Displays Support LDR Content?: A Psychophysical Evaluation. ACM Trans. Graph. 26, 3, Article 38 (2007).
2. M. Azimi, A. Banitalebi-Dehkordi, Y. Dong, M. T. Pourazad, and P. Nasiopoulos. 2014. Evaluating the Performance of Existing Full-Reference Quality Metrics on High Dynamic Range (HDR) Video Content. In International Conference on Multimedia Signal Processing (ICMSP ’14), Vol. 1. 789.
3. F. Banterle, A. Artusi, K. Debattista, and A. Chalmers. 2011. Advanced High Dynamic Range Imaging: Theory and Practice. AK Peters (CRC Press).
4. F. Banterle, K. Debattista, A. Artusi, S. Pattanaik, K. Myszkowski, P. Ledda, and A. Chalmers. 2009. High Dynamic Range Imaging and Low Dynamic Range Expansion for Generating HDR Content. Computer Graphics Forum 28, 8 (2009), 2343–2367.
5. F. Banterle, P. Ledda, K. Debattista, M. Bloj, A. Artusi, and A. Chalmers. 2009. A Psychophysical Evaluation of Inverse Tone Mapping Techniques. Computer Graphics Forum 28, 1 (2009), 13–25.
6. F. Banterle, P. Ledda, K. Debattista, and A. Chalmers. 2006. Inverse Tone Mapping. In Proceedings of the 4th International Conference on Computer Graphics and Interactive Techniques in Australasia and Southeast Asia (GRAPHITE ’06). ACM, 349–356.
7. F. Banterle, P. Ledda, K. Debattista, and A. Chalmers. 2008. Expanding Low Dynamic Range Videos for High Dynamic Range Applications. In Proceedings of the 24th Spring Conference on Computer Graphics (SCCG ’08). ACM, 33–41.
8. S. Bhagavathy, J. Llach, and J. f. Zhai. 2007. Multi-Scale Probabilistic Dithering for Suppressing Banding Artifacts in Digital Images. In The IEEE International Conference on Image Processing (ICIP ’07), Vol. 4. IV-397–IV-400.
9. R. Boitard, R. Cozot, D. Thoreau, and K. Bouatouch. 2014. Survey of Temporal Brightness Artifacts in Video Tone Mapping. In Second International Conference and SME Workshop on HDR imaging (HDRi2014).
10. S. J. Daly and X. Feng. 2003. Bit-depth extension using spatiotemporal microdither based on models of the equivalent input noise of the visual system. In Proceedings of SPIE, Vol. 5008. 455–466.
11. S. J. Daly and X. Feng. 2004. Decontouring: prevention and removal of false contour artifacts. In Proceedings of SPIE, Vol. 5292. 130–149.
12. P. E. Debevec and J. Malik. 1997. Recovering High Dynamic Range Radiance Maps from Photographs. In Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH ’97). 369–378.
13. P. Didyk, R. Mantiuk, M. Hein, and H.-P. Seidel. 2008. Enhancement of Bright Video Features for HDR Displays. Computer Graphics Forum 27, 4 (2008), 1265–1274.
14. C. Dong, Y. Deng, C. C. Loy, and X. Tang. 2015. Compression Artifacts Reduction by a Deep Convolutional Network. In The IEEE International Conference on Computer Vision (ICCV ’15).
15. F. Dufaux, P. Le Callet, R. K. Mantiuk, and M. Mrak (Eds.). 2016. High Dynamic Range Video: From Acquisition, to Display and Applications. Vol. 1. Academic Press.
16. Y. Endo, Y. Kanamori, and J. Mitani. 2017. Deep Reverse Tone Mapping. ACM Trans. Graph. 36, 6, Article 177 (2017).
17. G. Fechner. 1965. Elements of psychophysics. Holt, Rinehart & Winston.
18. J. Froehlich, S. Grandinetti, B. Eberhardt, S. Walter, A. Schilling, and H. Brendel. 2014. Creating Cinematic Wide Gamut HDR-Video for the Evaluation of Tone Mapping Operators and HDR-Displays. In Proceedings of SPIE 9023, Digital Photography X. 90230X–90230X–10.
19. A. Gilchrist and A. Jacobsen. 1984. Perception of lightness and illumination in a world of one reflectance. Perception 13, 1 (1984), 5–19.
20. X. Glorot and Y. Bengio. 2010. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS ’10), Vol. 9. 249–256.
21. I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. 2014. Generative Adversarial Nets. In Advances in Neural Information Processing Systems 27. 2672–2680.
22. M. D. Grossberg and S. K. Nayar. 2003. What is the space of camera response functions? In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’03), Vol. 2. II–602–9.
23. S. Hajisharif, J. Kronander, and J. Unger. 2015. Adaptive dualISO HDR reconstruction. EURASIP Journal on Image and Video Processing 2015, 1 (2015), 41.
24. K. He, X. Zhang, S. Ren, and J. Sun. 2016. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’16).
25. G. E. Hinton and R. R. Salakhutdinov. 2006. Reducing the dimensionality of data with neural networks. Science 313, 5786 (2006), 504–507.
26. S. Iizuka, E. Simo-Serra, and H. Ishikawa. 2016. Let There Be Color!: Joint End-to-end Learning of Global and Local Image Priors for Automatic Image Colorization with Simultaneous Classification. ACM Trans. Graph. 35, 4 (2016), 110:1–110:11.
27. S. Ioffe and C. Szegedy. 2015. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In Proceedings of the 32nd International Conference on Machine Learning (PMLR ’15), Vol. 37. 448–456.
28. N. K. Kalantari and R. Ramamoorthi. 2017. Deep High Dynamic Range Imaging of Dynamic Scenes. ACM Trans. Graph. 36, 4 (2017), 144:1–144:12.
29. D. P. Kingma and J. Ba. 2014. Adam: A Method for Stochastic Optimization. CoRR abs/1412.6980 (2014). http://arxiv.org/abs/1412.6980
30. R. P. Kovaleski and M. M. Oliveira. 2009. High-quality brightness enhancement functions for real-time reverse tone mapping. The Visual Computer 25, 5 (2009), 539–547.
31. R. P. Kovaleski and M. M. Oliveira. 2014. High-Quality Reverse Tone Mapping for a Wide Range of Exposures. In 27th Conference on Graphics, Patterns and Images (SIBGRAPI ’14). 49–56.
32. J. Kronander, S. Gustavson, G. Bonnet, A. Ynnerman, and J. Unger. 2014. A Unified Framework for Multi-Sensor HDR Video Reconstruction. Signal Processing: Image Communication 29, 2 (2014), 203–215.
33. C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, et al. 2016. Photo-realistic single image super-resolution using a generative adversarial network. arXiv preprint arXiv:1609.04802 (2016).
34. J. Long, E. Shelhamer, and T. Darrell. 2015. Fully Convolutional Networks for Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’15).
35. S. Mann and R.W. Picard. 1994. Being ‘undigital’ with cameras: Extending Dynamic Range by Combining Differently Exposed Pictures. Technical Report 323. M.I.T. Media Lab Perceptual Computing Section. 422–428 pages.
36. B. Masia, S. Agustin, R. W. Fleming, O. Sorkine, and D. Gutierrez. 2009. Evaluation of Reverse Tone Mapping Through Varying Exposure Conditions. ACM Trans. Graph. 28, 5 (2009), 160:1–160:8.
37. B. Masia, A. Serrano, and D. Gutierrez. 2017. Dynamic range expansion based on image statistics. Multimedia Tools and Applications 76, 1 (2017), 631–648.
38. L. Meylan, S. Daly, and S. Süsstrunk. 2006. The Reproduction of Specular Highlights on High Dynamic Range Displays. Color and Imaging Conference 2006, 1 (2006), 333–338.
39. S. K. Nayar and T. Mitsunaga. 2000. High dynamic range imaging: spatially varying pixel exposures. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’00), Vol. 1. 472–479.
40. A. Odena, V. Dumoulin, and C. Olah. 2016. Deconvolution and Checkerboard Artifacts. Distill (2016). http://distill.pub/2016/deconv-checkerboard.
41. D. Pathak, P. Krähenbühl, J. Donahue, T. Darrell, and A. Efros. 2016. Context Encoders: Feature Learning by Inpainting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’16). 2536–2544.
42. T. Pohlen, A. Hermans, M. Mathias, and B. Leibe. 2017. Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’17).
43. A. Radford, L. Metz, and S. Chintala. 2015. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434 (2015).
44. E. Reinhard, G. Ward, S. N. Pattanaik, P. E. Debevec, W. Heidrich, and K. Myszkowski. 2010. High dynamic range imaging: acquisition, display, and image-based lighting (2nd ed.). Morgan Kaufmann.
45. A. G. Rempel, M. Trentacoste, H. Seetzen, H. D. Young, W. Heidrich, L. Whitehead, and G. Ward. 2007. Ldr2Hdr: On-the-fly Reverse Tone Mapping of Legacy Video and Photographs. ACM Trans. Graph. 26, 3, Article 39 (2007).
46. S. Ren, K. He, R. Girshick, and J. Sun. 2015. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. In Advances in Neural Information Processing Systems 28. 91–99.
47. O. Ronneberger, P. Fischer, and T. Brox. 2015. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of Medical Image Computing and Computer-Assisted Intervention (MICCAI ’15). 234–241.
48. M. Rouf, R. Mantiuk, W. Heidrich, M. Trentacoste, and C. Lau. 2011. Glare encoding of high dynamic range images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’11). 289–296.
49. H. Seetzen, W. Heidrich, W. Stuerzlinger, G. Ward, L. Whitehead, M. Trentacoste, A. Ghosh, and A. Vorozcovs. 2004. High Dynamic Range Display Systems. ACM Trans. Graph. 23, 3 (2004), 760–768.
50. A. Serrano, F. Heide, D. Gutierrez, G. Wetzstein, and B. Masia. 2016. Convolutional Sparse Coding for High Dynamic Range Imaging. Computer Graphics Forum 35, 2 (2016), 153–163.
51. K. Simonyan and A. Zisserman. 2014. Very Deep Convolutional Networks for Large-Scale Image Recognition. CoRR abs/1409.1556 (2014).
52. P. Svoboda, M. Hradis, D. Barina, and P. Zemcik. 2016. Compression Artifacts Removal Using Convolutional Neural Networks. In Journal of WSCG, Vol. 24. 63–72.
53. M. D. Tocci, C. Kiser, N. Tocci, and P. Sen. 2011. A Versatile HDR Video Production System. ACM Trans. Graph. 30, 4 (2011), 41:1–41:10.
54. J. Unger and S. Gustavson. 2007. High-dynamic-range video for photometric measurement of illumination. In Proceedings of SPIE, Vol. 6501.
55. P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol. 2008. Extracting and Composing Robust Features with Denoising Autoencoders. In Proceedings of the 25th International Conference on Machine Learning (ICML ’08). ACM, 1096–1103.
56. L. Wang, L.-Y. Wei, K. Zhou, B. Guo, and H.-Y. Shum. 2007. High Dynamic Range Image Hallucination. In Proceedings of the 18th Eurographics Conference on Rendering Techniques (EGSR ’07). 321–326.
57. C. Yang, X. Lu, Z. Lin, E. Shechtman, O. Wang, and H. Li. 2016. High-Resolution Image Inpainting using Multi-Scale Neural Patch Synthesis. arXiv preprint arXiv:1611.09969 (2016).
58. J. Zhang and J.-F. Lalonde. 2017. Learning High Dynamic Range from Outdoor Panoramas. arXiv preprint arXiv:1703.10200 (2017).
59. B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva. 2014. Learning Deep Features for Scene Recognition using Places Database. In Advances in Neural Information Processing Systems 27. 487–495.


