“End-to-end optimization of optics and image processing for achromatic extended depth of field and super-resolution imaging” by Sitzmann, Diamond, Peng, Dun, Boyd, et al. …

  • ©Vincent Sitzmann, Steven Diamond, Yifan (Evan) Peng, Xiong Dun, Stephen Boyd, Wolfgang Heidrich, Felix Heide, and Gordon Wetzstein



Entry Number: 114

Session Title:

    Computational Cameras


    End-to-end optimization of optics and image processing for achromatic extended depth of field and super-resolution imaging




    In typical cameras the optical system is designed first; once it is fixed, the parameters of the image processing algorithm are tuned to get good image reproduction. In contrast to this sequential design approach, we consider joint optimization of an optical system (for example, the physical shape of the lens) together with the parameters of the reconstruction algorithm. We build a fully differentiable simulation model that maps the true source image to the reconstructed one. The model includes diffractive light propagation, depth- and wavelength-dependent effects, noise and nonlinearities, and the image post-processing. We jointly optimize the optical parameters and the image processing algorithm parameters so as to minimize the deviation between the true and reconstructed images over a large set of images. We implement our joint optimization method using autodifferentiation to efficiently compute parameter gradients in a stochastic optimization algorithm. We demonstrate the efficacy of this approach by applying it to achromatic extended depth of field and snapshot super-resolution imaging.
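The joint optimization described in the abstract can be illustrated with a deliberately tiny toy problem. The sketch below is not the authors' implementation: it assumes a 1D signal, a hypothetical three-tap point spread function [t, 1 - 2t, t] standing in for the optics, and a hypothetical three-tap reconstruction filter [w1, w0, w1] standing in for the image processing. Central finite differences stand in for autodifferentiation so that the example runs with the Python standard library alone. All function and parameter names are assumptions made for illustration.

```python
import random


def circ_conv3(signal, taps):
    """Circularly convolve a 1D signal with a symmetric 3-tap kernel (side, center, side)."""
    side, center = taps
    n = len(signal)
    return [side * signal[i - 1] + center * signal[i] + side * signal[(i + 1) % n]
            for i in range(n)]


def loss(params, batch, noises):
    """Mean squared error between true and reconstructed signals."""
    t, w0, w1 = params
    total, count = 0.0, 0
    for x, noise in zip(batch, noises):
        # Toy "optics": blur with PSF [t, 1 - 2t, t], then add sensor noise.
        blurred = circ_conv3(x, (t, 1.0 - 2.0 * t))
        y = [b + e for b, e in zip(blurred, noise)]
        # Toy "image processing": symmetric filter [w1, w0, w1].
        x_hat = circ_conv3(y, (w1, w0))
        total += sum((a - b) ** 2 for a, b in zip(x_hat, x))
        count += len(x)
    return total / count


def sample(rng, n_signals=8, length=16, sigma=0.1):
    """Draw a minibatch of random signals and matching sensor noise."""
    batch = [[rng.gauss(0.0, 1.0) for _ in range(length)] for _ in range(n_signals)]
    noises = [[rng.gauss(0.0, sigma) for _ in range(length)] for _ in range(n_signals)]
    return batch, noises


def grad(params, batch, noises, eps=1e-4):
    """Central finite-difference gradient (a stand-in for autodifferentiation)."""
    g = []
    for i in range(len(params)):
        hi, lo = list(params), list(params)
        hi[i] += eps
        lo[i] -= eps
        g.append((loss(hi, batch, noises) - loss(lo, batch, noises)) / (2.0 * eps))
    return g


def train(steps=300, lr=0.05, seed=0):
    """Stochastic gradient descent over optical (t) and processing (w0, w1) parameters."""
    rng = random.Random(seed)
    params = [0.3, 1.0, 0.0]  # start with a blurry PSF and an identity filter
    for _ in range(steps):
        batch, noises = sample(rng)
        g = grad(params, batch, noises)
        params = [p - lr * gi for p, gi in zip(params, g)]
        # Project t back to [0, 0.5] so the toy PSF stays non-negative.
        params[0] = min(max(params[0], 0.0), 0.5)
    return params
```

Calling `train()` returns the three learned parameters, and evaluating `loss` on a held-out batch from `sample` before and after training shows whether updating the optical and processing parameters together reduced the reconstruction error. In a realistic setting the simulation would be far richer (wave optics, wavelength and depth dependence, sensor nonlinearities), which is why the abstract relies on autodifferentiation rather than finite differences.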


